<isindex> is a strange feature from the dawn of HTML. It predates
proper <form> functionality and provides a single search field that
maps to the URL query string in a way that differs from <form> fields.

When Hixie specced the HTML parsing algorithm, he adopted the Trident
approach to <isindex>, which is to treat the tag as a parser macro
that expands to multiple elements, including a <form> and <input>.
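
To make the macro expansion concrete, here is a rough Python sketch of
the markup the parser effectively produced for an <isindex> tag. The
function name expand_isindex and the exact output shape (a <form>
wrapping <hr>, a <label> with prompt text, the <input>, and another
<hr>) are assumptions for illustration, based on the spec text of the
time, and not a copy of any browser's code. The English default
prompt is the one from that spec text; real browsers substituted a
string from their own UI localization.

```python
from html import escape

# Assumed default prompt (English wording from the spec text of the time).
DEFAULT_PROMPT = "This is a searchable index. Enter search keywords: "


def expand_isindex(attrs, localized_prompt=DEFAULT_PROMPT):
    """Return markup roughly equivalent to the parser macro's output.

    attrs is a dict of the attributes found on the <isindex> tag.
    """
    attrs = dict(attrs)  # work on a copy so the caller's dict is untouched
    action = attrs.pop("action", None)  # moves to the generated <form>
    prompt = attrs.pop("prompt", localized_prompt)  # becomes label text
    attrs.pop("name", None)  # the generated <input> is always "isindex"

    form_open = "<form>" if action is None else f'<form action="{escape(action)}">'
    # Any remaining attributes are carried over to the generated <input>.
    extra = "".join(f' {key}="{escape(value)}"' for key, value in attrs.items())
    return (
        f"{form_open}<hr><label>{escape(prompt)}"
        f'<input name="isindex"{extra}></label><hr></form>'
    )
```

The point of the sketch is that one tag in the source turns into
several elements in the DOM, and that the label text comes from the
browser when the page does not supply a prompt.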

Since <isindex> maps to the query string in a different way than
normal <form> fields, the form submission code has a special case that
recognizes the naming of the <input> generated by the parser macro and
handles the <input> in an abnormal way.
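
As a hedged illustration of that special case, the Python sketch below
assumes the rule is: if the first entry of the form data set is a text
field named "isindex", only its value is written to the query string,
without a "name=" prefix. The function name encode_query and the
(name, value, type) tuple layout are hypothetical, and quote_plus only
approximates the real URL-encoded serialization.

```python
from urllib.parse import quote_plus


def encode_query(entries):
    """Serialize form entries into a query string.

    entries is a list of (name, value, input_type) tuples in form order.
    """
    parts = []
    for index, (name, value, input_type) in enumerate(entries):
        if index == 0 and name == "isindex" and input_type == "text":
            # Assumed special case: emit the bare value, no "isindex=".
            parts.append(quote_plus(value))
        else:
            # Normal form field: name=value.
            parts.append(f"{quote_plus(name)}={quote_plus(value)}")
    return "&".join(parts)
```

Under these assumptions, a search for "cats dogs" via <isindex>
submits "?cats+dogs", whereas an ordinary text field named "q" submits
"?q=cats+dogs".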

In December 2013, Blink decided to remove <isindex>, citing it as a
bypass vector for XSS filters and citing minimal usage. (Such a
filter would have to be a blacklist-based filter, and those are
fundamentally broken anyway: if you want a filter that actually works,
you must have a whitelist-based filter.)

EdgeHTML has since followed Blink. At this point, sites that are still
maintained and that used <isindex> at the time of removal from Blink
have had to adapt. (Existence proof seems to include e.g.
http://cdcvs.fnal.gov/cgi-bin/searchaddproduct.cgi whose admin
complained on blink-dev, but now the page no longer uses <isindex>.)

Therefore, even though removing <isindex> is a violation of the
Support Existing Content design principle, the fact that Blink (and
EdgeHTML) broke <isindex> and have kept it broken for a couple of
years has made the feature even less valuable than it used to be,
because the still-maintained sites have had to adapt. (The notion of
unmaintained sites that accept user input is pretty scary.)

The general ugliness of the implementation of <isindex> is sunk cost
at this point, but the implementation does impose an ongoing privacy
weirdness: <isindex> is a case where a Web site can make the browser
provide a string supplied by the browser UI localization as part of
the DOM. That is, even if the user tries to conceal their locale by
e.g. making Accept-Language look like vanilla U.S. English, a site
that wants to target users whose browser UI is in a particular
language can still identify these users by making the browser parse
<isindex>.

Thus, removing this feature makes the portability layer of the HTML
parser a bit smaller and makes the HTML parsing algorithm independent
of localization. It is also possible that the removal of the matching
oddity from the form submission algorithm ends up allowing a slight
cleanup of the URL Standard.

Bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1266495
Spec bug: https://github.com/whatwg/html/issues/1088
Blink thread: 
https://groups.google.com/a/chromium.org/forum/#!msg/blink-dev/14q_I06gwg8/52oBtr2VCAAJ

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
