2.5 URLs
  2.5.1 Terminology
  2.5.2 Parsing URLs
  2.5.3 Dynamic changes to base URLs
2.6 Fetching resources
  2.6.1 Terminology
  2.6.2 Determining the type of a resource
  2.6.3 Extracting character encodings from meta elements
  2.6.4 CORS settings attributes
  2.6.5 Referrer policy attributes
  2.6.6 Nonce attributes
  2.6.7 Lazy loading attributes

2.5 URLs

2.5.1 Terminology

A string is a valid non-empty URL if it is a valid URL string but it is not the empty string.

A string is a valid URL potentially surrounded by spaces if, after stripping leading and trailing ASCII whitespace from it, it is a valid URL string.

A string is a valid non-empty URL potentially surrounded by spaces if, after stripping leading and trailing ASCII whitespace from it, it is a valid non-empty URL.

This specification defines the URL about:legacy-compat as a reserved, though unresolvable, about: URL, for use in DOCTYPEs in HTML documents when needed for compatibility with XML tools. [ABOUT]

This specification defines the URL about:html-kind as a reserved, though unresolvable, about: URL, that is used as an identifier for kinds of media tracks. [ABOUT]

This specification defines the URL about:srcdoc as a reserved, though unresolvable, about: URL, that is used as the URL of iframe srcdoc documents. [ABOUT]

The fallback base URL of a Document object document is the URL record obtained by running these steps:

  1. If document is an iframe srcdoc document, then return the document base URL of document's browsing context's browsing context container's node document.

  2. If document's URL is about:blank, and document's browsing context has a creator browsing context, then return the creator base URL.

  3. Return document's URL.

The document base URL of a Document object is the absolute URL obtained by running these steps:

  1. If there is no base element that has an href attribute in the Document, then return the Document's fallback base URL.

  2. Otherwise, return the frozen base URL of the first base element in the Document that has an href attribute, in tree order.
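
The two algorithms above can be sketched as follows. This is a minimal illustration only: the Document and BaseElement classes and all of their field names are hypothetical stand-ins invented for the sketch, and URLs are kept as plain strings rather than URL records.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BaseElement:
    # Hypothetical model: None means the base element has no href attribute.
    frozen_base_url: Optional[str] = None


@dataclass
class Document:
    url: str
    is_iframe_srcdoc: bool = False
    # Node document of the browsing context container (hypothetical field).
    container_document: Optional["Document"] = None
    # Set only when the browsing context has a creator browsing context.
    creator_base_url: Optional[str] = None
    # base elements of the document, in tree order.
    base_elements: List[BaseElement] = field(default_factory=list)


def fallback_base_url(doc: Document) -> str:
    # Step 1: an iframe srcdoc document inherits from its container's document.
    if doc.is_iframe_srcdoc and doc.container_document is not None:
        return document_base_url(doc.container_document)
    # Step 2: about:blank with a creator browsing context uses the creator base URL.
    if doc.url == "about:blank" and doc.creator_base_url is not None:
        return doc.creator_base_url
    # Step 3: otherwise the document's own URL.
    return doc.url


def document_base_url(doc: Document) -> str:
    # The first base element with an href attribute wins, in tree order.
    for base in doc.base_elements:
        if base.frozen_base_url is not None:
            return base.frozen_base_url
    return fallback_base_url(doc)
```

In this model, a srcdoc document nested in a page whose first base element with an href points at another origin ends up with that same base URL.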

2.5.2 Parsing URLs

Parsing a URL is the process of taking a string and obtaining the URL record that it represents. While this process is defined in the WHATWG URL standard, the HTML standard defines a wrapper for convenience. [URL]

This wrapper is only useful when the character encoding for the URL parser has to match that of the document or environment settings object for legacy reasons. When that is not the case the URL parser can be used directly.

To parse a URL url, relative to either a document or environment settings object, the user agent must use the following steps. Parsing a URL either results in failure or a resulting URL string and resulting URL record.

  1. Let encoding be document's character encoding, if document was given, and environment settings object's API URL character encoding otherwise.

  2. Let baseURL be document's base URL, if document was given, and environment settings object's API base URL otherwise.

  3. Let urlRecord be the result of applying the URL parser to url, with baseURL and encoding.

  4. If urlRecord is failure, then return failure.

  5. Let urlString be the result of applying the URL serializer to urlRecord.

  6. Return urlString as the resulting URL string and urlRecord as the resulting URL record.
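
The shape of this wrapper can be sketched as below. Note the assumptions: Python's urllib.parse is used purely as a stand-in and is not the WHATWG URL parser, so results can differ on edge cases; the function name parse_url is hypothetical; the caller is assumed to have already picked baseURL and encoding as in steps 1 and 2; and the stand-in ignores the encoding argument entirely.

```python
from typing import Optional, Tuple
from urllib.parse import SplitResult, urljoin, urlsplit


def parse_url(url: str, base_url: str,
              encoding: str = "utf-8") -> Optional[Tuple[str, SplitResult]]:
    """Return (resulting URL string, resulting URL record), or None for failure.

    `encoding` is accepted only to mirror the wrapper's inputs; this stand-in
    parser does not use it.
    """
    try:
        # Step 3: apply the (stand-in) parser to url, relative to base_url.
        url_record = urlsplit(urljoin(base_url, url))
    except ValueError:
        # Step 4: propagate failure.
        return None
    # Step 5: serialize the record.
    url_string = url_record.geturl()
    # Step 6: return both the string and the record.
    return url_string, url_record
```

Returning both values mirrors the algorithm's two results, so callers that only need the serialized string do not have to serialize the record again.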

2.5.3 Dynamic changes to base URLs

When a document's document base URL changes, all elements in that document are affected by a base URL change.

The following are base URL change steps, which run when an element is affected by a base URL change (as defined by the DOM specification):

If the element creates a hyperlink

If the URL identified by the hyperlink is being shown to the user, or if any data derived from that URL is affecting the display, then the href attribute should be reparsed relative to the element's node document and the UI updated appropriately.

For example, the CSS :link / :visited pseudo-classes might have been affected.

If the hyperlink has a ping attribute and its URL(s) are being shown to the user, then the ping attribute's tokens should be reparsed relative to the element's node document and the UI updated appropriately.

If the element is a q, blockquote, ins, or del element with a cite attribute

If the URL identified by the cite attribute is being shown to the user, or if any data derived from that URL is affecting the display, then the URL should be reparsed relative to the element's node document and the UI updated appropriately.

Otherwise

The element is not directly affected.

For instance, changing the base URL doesn't affect the image displayed by img elements, although subsequent accesses of the src IDL attribute from script will return a new absolute URL that might no longer correspond to the image being shown.

2.6 Fetching resources

2.6.1 Terminology

A response whose type is "basic", "cors", or "default" is CORS-same-origin. [FETCH]

A response whose type is "opaque" or "opaqueredirect" is CORS-cross-origin.

A response's unsafe response is its internal response if it has one, and the response itself otherwise.
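
These three definitions amount to a small classification, sketched here. The Response class is a hypothetical stand-in that models only the two fields the definitions mention.

```python
from dataclasses import dataclass
from typing import Optional

CORS_SAME_ORIGIN_TYPES = {"basic", "cors", "default"}
CORS_CROSS_ORIGIN_TYPES = {"opaque", "opaqueredirect"}


@dataclass
class Response:
    # Hypothetical model of a response: only its type and internal response.
    type: str
    internal_response: Optional["Response"] = None


def is_cors_same_origin(response: Response) -> bool:
    return response.type in CORS_SAME_ORIGIN_TYPES


def is_cors_cross_origin(response: Response) -> bool:
    return response.type in CORS_CROSS_ORIGIN_TYPES


def unsafe_response(response: Response) -> Response:
    # The internal response if there is one, otherwise the response itself.
    if response.internal_response is not None:
        return response.internal_response
    return response
```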

To create a potential-CORS request, given a url, destination, corsAttributeState, and an optional same-origin fallback flag, run these steps:

  1. Let mode be "no-cors" if corsAttributeState is No CORS, and "cors" otherwise.

  2. If same-origin fallback flag is set and mode is "no-cors", set mode to "same-origin".

  3. Let credentialsMode be "include".

  4. If corsAttributeState is Anonymous, set credentialsMode to "same-origin".

  5. Let request be a new request whose url is url, destination is destination, mode is mode, credentials mode is credentialsMode, and whose use-URL-credentials flag is set.
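
A minimal sketch of these steps follows. The CorsState enum and the Request dataclass are hypothetical names introduced for illustration, and the sketch assumes the algorithm hands the constructed request back to its caller.

```python
from dataclasses import dataclass
from enum import Enum


class CorsState(Enum):
    NO_CORS = "No CORS"
    ANONYMOUS = "Anonymous"
    USE_CREDENTIALS = "Use Credentials"


@dataclass
class Request:
    # Hypothetical model holding only the fields set by the algorithm.
    url: str
    destination: str
    mode: str
    credentials_mode: str
    use_url_credentials: bool = True


def create_potential_cors_request(url: str, destination: str,
                                  cors_state: CorsState,
                                  same_origin_fallback: bool = False) -> Request:
    # Step 1: pick the mode from the attribute state.
    mode = "no-cors" if cors_state is CorsState.NO_CORS else "cors"
    # Step 2: the fallback flag only matters for "no-cors".
    if same_origin_fallback and mode == "no-cors":
        mode = "same-origin"
    # Steps 3 and 4: credentials default to "include" unless Anonymous.
    credentials_mode = "include"
    if cors_state is CorsState.ANONYMOUS:
        credentials_mode = "same-origin"
    # Step 5: build the request (returned here, as an assumption).
    return Request(url, destination, mode, credentials_mode,
                   use_url_credentials=True)
```

One consequence visible in the sketch: the No CORS state still yields the "include" credentials mode, because only the Anonymous state overrides the default.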

2.6.2 Determining the type of a resource

The Content-Type metadata of a resource must be obtained and interpreted in a manner consistent with the requirements of the WHATWG MIME Sniffing standard. [MIMESNIFF]

The computed MIME type of a resource must be found in a manner consistent with the requirements given in the WHATWG MIME Sniffing standard. [MIMESNIFF]

The rules for sniffing images specifically, the rules for distinguishing if a resource is text or binary, and the rules for sniffing audio and video specifically are also defined in the WHATWG MIME Sniffing standard. These rules return a MIME type as their result. [MIMESNIFF]

It is imperative that the rules in the WHATWG MIME Sniffing standard be followed exactly. When a user agent uses different heuristics for content type detection than the server expects, security problems can occur. For more details, see the WHATWG MIME Sniffing standard. [MIMESNIFF]

2.6.3 Extracting character encodings from meta elements

The algorithm for extracting a character encoding from a meta element, given a string s, is as follows. It either returns a character encoding or nothing.

  1. Let position be a pointer into s, initially pointing at the start of the string.

  2. Loop: Find the first seven characters in s after position that are an ASCII case-insensitive match for the word "charset". If no such match is found, return nothing.

  3. Skip any ASCII whitespace that immediately follows the word "charset" (there might not be any).

  4. If the next character is not a U+003D EQUALS SIGN (=), then move position to point just before that next character, and jump back to the step labeled loop.

  5. Skip any ASCII whitespace that immediately follows the equals sign (there might not be any).

  6. Process the next character as follows:

    If it is a U+0022 QUOTATION MARK character (") and there is a later U+0022 QUOTATION MARK character (") in s
    If it is a U+0027 APOSTROPHE character (') and there is a later U+0027 APOSTROPHE character (') in s
      Return the result of getting an encoding from the substring that is between this character and the next earliest occurrence of this character.

    If it is an unmatched U+0022 QUOTATION MARK character (")
    If it is an unmatched U+0027 APOSTROPHE character (')
    If there is no next character
      Return nothing.

    Otherwise
      Return the result of getting an encoding from the substring that consists of this character up to but not including the first ASCII whitespace or U+003B SEMICOLON character (;), or the end of s, whichever comes first.
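
The scanning part of this algorithm can be sketched as follows. One loud assumption: the "getting an encoding" step belongs to the Encoding standard and is not reproduced here, so the hypothetical helper label_to_encoding merely trims and ASCII-lowercases the label instead of mapping it to a real encoding. The function names are likewise invented for the sketch.

```python
import re
import string
from typing import Optional

ASCII_WHITESPACE = "\t\n\x0c\r "
# re.ASCII keeps the case-insensitive match restricted to ASCII letters.
CHARSET = re.compile("charset", re.IGNORECASE | re.ASCII)
ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def label_to_encoding(label: str) -> Optional[str]:
    # Stand-in for "getting an encoding": normalizes the label only.
    label = label.strip(ASCII_WHITESPACE).translate(ASCII_LOWER)
    return label or None


def skip_whitespace(s: str, i: int) -> int:
    while i < len(s) and s[i] in ASCII_WHITESPACE:
        i += 1
    return i


def extract_charset(s: str) -> Optional[str]:
    position = 0                                 # step 1
    while True:
        match = CHARSET.search(s, position)      # step 2 (loop)
        if match is None:
            return None
        i = skip_whitespace(s, match.end())      # step 3
        if i >= len(s) or s[i] != "=":           # step 4
            position = i
            continue
        i = skip_whitespace(s, i + 1)            # step 5
        if i >= len(s):                          # step 6: no next character
            return None
        if s[i] in "\"'":
            end = s.find(s[i], i + 1)
            if end == -1:                        # unmatched quote
                return None
            return label_to_encoding(s[i + 1:end])
        j = i                                    # unquoted value
        while j < len(s) and s[j] not in ASCII_WHITESPACE and s[j] != ";":
            j += 1
        return label_to_encoding(s[i:j])
```

For example, extract_charset("text/html; charset=utf-8") returns "utf-8" in this sketch, while an input with an unmatched opening quote returns None.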

This algorithm is distinct from those in the HTTP specification (for example, HTTP doesn't allow the use of single quotes and requires supporting a backslash-escape mechanism that is not supported by this algorithm). While the algorithm is used in contexts that, historically, were related to HTTP, the syntax as supported by implementations diverged some time ago. [HTTP]

2.6.4 CORS settings attributes

A CORS settings attribute is an enumerated attribute. The following table lists the keywords and states for the attribute; the keywords in the left column map to the states in the second column of the same row.

Keyword | State | Brief description
anonymous | Anonymous | Requests for the element will have their mode set to "cors" and their credentials mode set to "same-origin".
use-credentials | Use Credentials | Requests for the element will have their mode set to "cors" and their credentials mode set to "include".

The empty string is also a valid keyword, and maps to the Anonymous state. The attribute's invalid value default is the Anonymous state. For the purposes of reflection, the canonical case for the Anonymous state is the anonymous keyword. The missing value default, used when the attribute is omitted, is the No CORS state.
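
The keyword-to-state mapping, including both defaults, can be sketched like this. The function name cors_attribute_state is hypothetical, a missing attribute is modeled as None, and keywords are compared ASCII case-insensitively, following the general rule for enumerated attributes.

```python
import string
from typing import Optional

ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

CORS_KEYWORDS = {
    "anonymous": "Anonymous",
    "": "Anonymous",              # the empty string is also a valid keyword
    "use-credentials": "Use Credentials",
}


def cors_attribute_state(value: Optional[str]) -> str:
    if value is None:
        return "No CORS"          # missing value default
    # Unknown keywords fall back to the invalid value default, Anonymous.
    return CORS_KEYWORDS.get(value.translate(ASCII_LOWER), "Anonymous")
```

The asymmetry is the point of the sketch: omitting the attribute gives No CORS, whereas any present value, even a misspelled one, yields a CORS state.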

The majority of fetches governed by CORS settings attributes will be done via the create a potential-CORS request algorithm.

For module scripts, certain CORS settings attributes have been repurposed to have a slightly different meaning, wherein they only impact the request's credentials mode (since the mode is always "cors"). To perform this translation, we define the module script credentials mode for a given CORS settings attribute to be determined by switching on the attribute's state:

No CORS
Anonymous
  "same-origin"
Use Credentials
  "include"

2.6.5 Referrer policy attributes

A referrer policy attribute is an enumerated attribute. Each referrer policy, including the empty string, is a keyword for this attribute, mapping to a state of the same name.

The attribute's invalid value default and missing value default are both the empty string state.

The impact of these states on the processing model of various fetches is defined in more detail throughout this specification, in the WHATWG Fetch standard, and in Referrer Policy. [FETCH] [REFERRERPOLICY]

Several signals can contribute to which processing model is used for a given fetch; a referrer policy attribute is only one of them. In general, the order in which these signals are processed is:

  1. First, the presence of a noreferrer link type;

  2. Then, the value of a referrer policy attribute;

  3. Then, the presence of any meta element with name attribute set to referrer;

  4. Finally, the `Referrer-Policy` HTTP header.
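
As a rough illustration of this ordering only, the sketch below picks the first signal that is present. It rests on several assumptions that the list above does not state: an empty-string policy is treated as "no opinion" that defers to the next signal, a noreferrer link type is treated as yielding "no-referrer", and the function and parameter names are hypothetical. The actual processing is spread across this specification, Fetch, and Referrer Policy.

```python
from typing import Optional, Tuple


def first_applicable_referrer_signal(
        has_noreferrer: bool,
        attribute_policy: str = "",
        meta_policy: str = "",
        header_policy: str = "") -> Optional[Tuple[str, str]]:
    """Return (signal name, policy) for the first signal present, else None."""
    if has_noreferrer:
        # Assumption: noreferrer behaves like the "no-referrer" policy.
        return ("noreferrer link type", "no-referrer")
    if attribute_policy:
        return ("referrer policy attribute", attribute_policy)
    if meta_policy:
        return ("meta referrer", meta_policy)
    if header_policy:
        return ("Referrer-Policy header", header_policy)
    return None
```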

2.6.6 Nonce attributes

A nonce content attribute represents a cryptographic nonce ("number used once") which can be used by Content Security Policy to determine whether or not a given fetch will be allowed to proceed. The value is text. [CSP]

Elements that have a nonce content attribute ensure that the cryptographic nonce is only exposed to script (and not to side-channels like CSS attribute selectors) by extracting the value from the content attribute, moving it into an internal slot named [[CryptographicNonce]], and exposing it to script via the HTMLOrSVGElement interface mixin. Unless otherwise specified, the slot's value is the empty string.

element.nonce

Returns the value of the element's [[CryptographicNonce]] internal slot.

Can be set, to update that slot's value.

The nonce IDL attribute must, on getting, return the value of this element's [[CryptographicNonce]]; and on setting, set this element's [[CryptographicNonce]] to the given value.

Note how the setter for the nonce IDL attribute does not update the corresponding content attribute. This, as well as the below setting of the nonce content attribute to the empty string when an element becomes browsing-context connected, is meant to prevent exfiltration of the nonce value through mechanisms that can easily read content attributes, such as selectors. Learn more in issue #2369, where this behavior was introduced.

Whenever an element including HTMLOrSVGElement has its nonce attribute set or changed, set this element's [[CryptographicNonce]] to the given value.

Whenever an element including HTMLOrSVGElement becomes browsing-context connected, the user agent must execute the following steps on the element:

  1. Let CSP list be element's shadow-including root's CSP list.

  2. If CSP list contains a header-delivered Content Security Policy, and element has a nonce content attribute attr whose value is not the empty string, then:

    1. Set an attribute value for element using "nonce" and the empty string.

As each Document's CSP list is append-only, user agents can optimize away the contains a header-delivered Content Security Policy check by, for example, holding a flag on the Document, set during Document initialization.

The cloning steps for elements that include HTMLOrSVGElement must set the [[CryptographicNonce]] slot on the copy to the value of the slot on the element being cloned.
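
The interplay between the content attribute and the internal slot can be modeled with a toy class. Everything here is hypothetical: the Element class, its method names, and the has_header_delivered_csp flag standing in for the CSP list check. The sketch also assumes that the slot keeps its value when the content attribute is blanked on connection; that matches the stated goal of hiding the nonce from content-attribute readers, but the save-and-restore is not spelled out in the steps above.

```python
from typing import Dict


class Element:
    def __init__(self) -> None:
        self.attributes: Dict[str, str] = {}
        self._cryptographic_nonce = ""     # slot defaults to the empty string

    @property
    def nonce(self) -> str:
        return self._cryptographic_nonce

    @nonce.setter
    def nonce(self, value: str) -> None:
        # The IDL setter updates the slot only, never the content attribute.
        self._cryptographic_nonce = value

    def set_attribute(self, name: str, value: str) -> None:
        self.attributes[name] = value
        if name == "nonce":
            # Setting or changing the content attribute updates the slot.
            self._cryptographic_nonce = value

    def become_connected(self, has_header_delivered_csp: bool) -> None:
        if has_header_delivered_csp and self.attributes.get("nonce", "") != "":
            saved = self._cryptographic_nonce
            self.set_attribute("nonce", "")
            # Assumption: the slot survives the blanking of the attribute.
            self._cryptographic_nonce = saved

    def clone(self) -> "Element":
        copy = Element()
        copy.attributes = dict(self.attributes)
        # Cloning steps: carry the slot value over to the copy.
        copy._cryptographic_nonce = self._cryptographic_nonce
        return copy
```

After connection under a header-delivered policy, the content attribute in this model reads as the empty string while element.nonce still returns the original value.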

2.6.7 Lazy loading attributes

A lazy loading attribute is an enumerated attribute. The following table lists the keywords and states for the attribute; the keywords in the left column map to the states in the second column of the same row.

The attribute provides a hint to the user agent to aid in deciding whether to load an element immediately or to defer loading until the element will be viewable, according to the attribute's current state.

Keyword | State | Description
on | On | Indicates a strong preference to defer fetching the element's resource until it will be viewable.
off | Off | Indicates the element's resource must be fetched immediately, regardless of viewability.
auto | Auto | Indicates that the user agent may determine the fetching strategy (the default).

The attribute's missing value default and invalid value default are both the Auto state.
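
A short sketch of the state resolution for this attribute follows. The function name lazy_loading_state is hypothetical, a missing attribute is modeled as None, and keywords are compared ASCII case-insensitively, following the general rule for enumerated attributes.

```python
import string
from typing import Optional

ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

LAZY_LOADING_KEYWORDS = {"on": "On", "off": "Off", "auto": "Auto"}


def lazy_loading_state(value: Optional[str]) -> str:
    if value is None:
        return "Auto"             # missing value default
    # Unrecognized keywords use the invalid value default, also Auto.
    return LAZY_LOADING_KEYWORDS.get(value.translate(ASCII_LOWER), "Auto")
```

Because both defaults are Auto, a missing or misspelled value leaves the fetching strategy up to the user agent.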