About

I have not really mastered this concept yet. mXSS is my introduction to reading the HTML specs. I will consider myself to somewhat understand it once I find an mXSS bug.
This page is mainly to park some important research in mXSS that I find very interesting.

Research

S1r1us mXSS Explained series: Part 1 and Part 2. His GitHub MXSS repository also contains a lot of insights.
mXSS cheatsheet (to save some sanity reading HTML specs)
SecurityMB DOMPurify 2.0.17 bypass
Yaniv Nizry mXSS introduction research
Yaniv Nizry DOMPurify 3.2.1 Bypass (Non-Default Config)
Jorian Woltjer mXSS: Covers some of the basics, including the content below, as well as the is attribute trick.
Kevin Mizu’s analysis on DOMPurify Bypasses: Part 1, Part 2
Ensy DOMPurify 3.2.3 Bypass (Non-Default Config)
Helping secure DOMPurify

Analysis tools

Nuggets

Chrome now escapes < and > characters in attribute values when serializing HTML (source). From the PR, it seems like this feature has not been rolled out to all users yet, so we can still enjoy mXSS for some time.

<svg><style><a alt="</style>">

With the change, the alt attribute value is HTML-encoded during serialization, which nerfs some attacks:

<svg><style><a alt="&lt;/style&gt;">
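To make the difference concrete for myself, here is a minimal Python sketch of attribute-value escaping during serialization, before and after the change. The names escape_attr, serialize_a and the escape_angle_brackets flag are made up for this note. It is a simplified model of the serialization rule, not Chrome's actual code.

```python
def escape_attr(value: str, escape_angle_brackets: bool) -> str:
    """Simplified model of attribute-value escaping when serializing HTML."""
    # Long-standing behavior: escape "&", the non-breaking space, and '"'.
    out = (
        value.replace("&", "&amp;")
        .replace("\u00a0", "&nbsp;")
        .replace('"', "&quot;")
    )
    # New behavior (hypothetical flag): also escape "<" and ">".
    if escape_angle_brackets:
        out = out.replace("<", "&lt;").replace(">", "&gt;")
    return out


def serialize_a(alt: str, escape_angle_brackets: bool) -> str:
    """Serialize a single <a> start tag with an alt attribute."""
    return f'<a alt="{escape_attr(alt, escape_angle_brackets)}">'


old = serialize_a("</style>", escape_angle_brackets=False)
new = serialize_a("</style>", escape_angle_brackets=True)
print(old)  # <a alt="</style>">
print(new)  # <a alt="&lt;/style&gt;">
```

With the old output, a re-parse in the HTML namespace can still find a literal </style> inside the attribute. With the new output, that string is gone.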

The HTML spec gives the payload below as an example of a parsing differential. Here is how this snippet is parsed:
- When the parser sees a <form> start tag, it records the new element in the form element pointer (that’s what it’s called in the spec). If the pointer is already non-null, the start tag is ignored and no form element is created.
- When the parser sees a </form> end tag, the form element pointer is always set to null, even though the <div> is still open. The second <form> is therefore created inside that <div>, giving nested forms in the DOM that cannot survive a serialize and re-parse round trip.

<form id="outer"><div></form><form id="inner"><input>
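The two rules above can be sketched as a toy tree builder in Python. This is only my own model of the form element pointer. Node, build_tree and render are hypothetical names, the tokens are written by hand instead of coming from a real tokenizer, and everything else in the spec (scoping checks, implied end tags, templates) is left out.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by content
class Node:
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def build_tree(tokens):
    """Toy tree builder that models only the form element pointer."""
    body = Node("body")
    stack = [body]        # stack of open elements
    form_pointer = None   # the spec's "form element pointer"
    for kind, name, attrs in tokens:
        if kind == "start":
            if name == "form" and form_pointer is not None:
                continue  # pointer already set: the start tag is ignored
            node = Node(name, attrs)
            stack[-1].children.append(node)
            if name == "input":
                continue  # void element: never pushed on the stack
            stack.append(node)
            if name == "form":
                form_pointer = node
        elif name == "form":
            node, form_pointer = form_pointer, None  # always reset
            if node in stack:
                stack.remove(node)  # only the form leaves; <div> stays open
        else:
            # Any other end tag: pop up to the nearest matching open element.
            for i in range(len(stack) - 1, 0, -1):
                if stack[i].name == name:
                    del stack[i:]
                    break
    return body


def render(node):
    """Serialize a node back to markup (no escaping, toy only)."""
    attrs = "".join(f' {k}="{v}"' for k, v in node.attrs.items())
    if node.name == "input":
        return f"<input{attrs}>"
    inner = "".join(render(child) for child in node.children)
    return f"<{node.name}{attrs}>{inner}</{node.name}>"


# <form id="outer"><div></form><form id="inner"><input>
payload = [
    ("start", "form", {"id": "outer"}),
    ("start", "div", {}),
    ("end", "form", {}),
    ("start", "form", {"id": "inner"}),
    ("start", "input", {}),
]
first = render(build_tree(payload).children[0])

# Tokens of the serialized first result, fed back in.
reparsed = [
    ("start", "form", {"id": "outer"}),
    ("start", "div", {}),
    ("start", "form", {"id": "inner"}),
    ("start", "input", {}),
    ("end", "form", {}),
    ("end", "div", {}),
    ("end", "form", {}),
]
second = render(build_tree(reparsed).children[0])
print(first)   # <form id="outer"><div><form id="inner"><input></form></div></form>
print(second)  # <form id="outer"><div><input></div></form>
```

In this model the first parse produces a nested form, and parsing its serialization again drops the inner form. That change between two parses is the mutation.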

Parsing in different namespaces

This section is about the <style> element, but the same explanation applies to other elements as well: <title>, <textarea>, and <noscript> (if scripting is enabled). See the HTML spec for more details.
The <style> element is widely used in mXSS payloads. I guess this is because it is “valid” in all 3 namespaces (?).

HTML namespace

In HTML (when served as text/html), the <style> element is defined as a raw text element. That means:
- Raw text elements do not treat their content as HTML markup.
- The parser does not look for nested tags inside them; it simply looks for the literal string that starts the closing tag (i.e. </style>).
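As a rough sketch of that rule, the Python snippet below finds where the raw text of a <style> element ends. split_style_rawtext is a hypothetical helper written for this note, not a real parser API. It only models the search for the end tag, using one of the payloads from later on this page as input.

```python
import re

# Raw text ends at "</style" (any letter case) followed by whitespace, "/" or ">".
_STYLE_END = re.compile(r"</style(?=[\t\n\f />])", re.IGNORECASE)


def split_style_rawtext(after_open_tag: str):
    """Split the input that follows a <style> start tag.

    Returns (raw_text, remainder), where remainder starts at the end tag.
    """
    match = _STYLE_END.search(after_open_tag)
    if match is None:
        return after_open_tag, ""  # never closed: everything is raw text
    return after_open_tag[: match.start()], after_open_tag[match.start():]


raw_text, remainder = split_style_rawtext(
    '<a id="</style><img src=x onerror=alert()>"></a></style>'
)
print(repr(raw_text))   # '<a id="'
print(repr(remainder))  # '</style><img src=x onerror=alert()>"></a></style>'
```

The quote that opens the id attribute means nothing here. The raw text stops at the first </style>, and the <img> ends up in the remainder, where it is parsed as markup.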

SVG/MathML namespace

SVG and MathML come from XML, where there is NO raw text mode. When they appear inside a text/html document, the HTML parser handles them with its rules for foreign content, and these rules never switch the tokenizer to raw text.
SVG does define a <style> element and MathML does not, but either way <style> in these two namespaces is treated like any other tag, and its content is parsed as normal markup (in other words, a nested <a> becomes a real element).
This means that nested tags become real elements and attribute values are parsed as ordinary quoted strings, so a </style> inside an attribute value does not close anything.

Resulting quirks

These are some quirks that leverage the behavior above to deliver an mXSS payload. This is usually the final step, after we have figured out a good mutation to use.
Comments are interpreted differently inside <style> in the MathML namespace and in the HTML namespace:

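MathML namespace: The parser sees the opening comment tag <!-- and treats everything up to --> as a comment, including the </style> and the start of the <foo-bar> tag. The <img> that follows is not allowed in foreign content, hence the parser breaks out of the MathML namespace to the HTML namespace and creates a real <img> element.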

<math><style><!--</style>a<foo-bar is="--><img src=x onerror=alert(1)>">

HTML namespace: Here it is slightly different. The <style> content is treated as raw text, so the opening comment tag <!-- is just text, not the start of a comment. The raw text ends at the closing </style> tag, so <foo-bar> is parsed as a normal element whose is attribute holds the rest of the payload, up to the closing ".

<style><!--</style>a<foo-bar is="--><img src=x onerror=alert(1)>">
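The two readings of this payload can be put side by side with a small Python sketch. html_view and mathml_view are hypothetical helpers for this note. Each one only models the first decision the parser makes on the text that follows the <style> start tag, under the simplifying assumption that the comment and the end tag are written exactly as <!--, --> and </style>.

```python
AFTER_STYLE = '<!--</style>a<foo-bar is="--><img src=x onerror=alert(1)>">'


def html_view(after_style: str):
    """HTML namespace: raw text runs up to the first literal </style>."""
    raw_text, _, rest = after_style.partition("</style>")
    return raw_text, rest


def mathml_view(after_style: str):
    """MathML namespace: <!-- opens a comment that runs up to the first -->."""
    if not after_style.startswith("<!--"):
        raise ValueError("this sketch only handles input starting with <!--")
    comment, _, rest = after_style[len("<!--"):].partition("-->")
    return comment, rest


html_raw, html_rest = html_view(AFTER_STYLE)
math_comment, math_rest = mathml_view(AFTER_STYLE)
print(repr(html_raw))      # '<!--'
print(repr(html_rest))     # 'a<foo-bar is="--><img src=x onerror=alert(1)>">'
print(repr(math_comment))  # '</style>a<foo-bar is="'
print(repr(math_rest))     # '<img src=x onerror=alert(1)>">'
```

In the HTML view the <img> is still inside the quoted is attribute of <foo-bar>. In the MathML view the quote was swallowed by the comment, so the <img> tag is the next thing the parser sees.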

Similarly, the SVG namespace and the HTML namespace parse attributes inside <style> differently.

HTML namespace: Same behavior as above. <a id=" is treated as raw text, and the first </style> closes the element, which “breaks” the <a> tag. Hence, the <img> tag is parsed as a normal HTML element.

<style><a id="</style><img src=x onerror=alert()>"></a></style>

SVG namespace: Here, the parser sees the <style> tag, then a real nested <a> element inside it, with its id attribute set to </style><img src=x onerror=alert()>. No <img> element is created.

<svg><style><a id="</style><img src=x onerror=alert()>"></a></style>
