https://bugzilla.wikimedia.org/show_bug.cgi?id=24918

--- Comment #8 from entli...@gmx-topmail.de 2010-08-26 23:44:05 UTC ---
> So the only remaining problem is that the validator complains about things 
> like
> <a href="#&nbsp;"></a>?

Not quite. It's unclear why the HTML5 validator complains about Unicode
whitespace such as nbsp; the RFCs give no clue. But unencoded "[" and "]" are
clearly non-compliant. RFC 3987 says "... square bracket characters ... MUST
NOT be converted" and then RFC 3986 says "A host identified by an Internet
Protocol literal address, version 6 [RFC3513] or later, is distinguished by
enclosing the IP literal within square brackets ("[" and "]").  This is the
only place where square bracket characters are allowed in the URI syntax."
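
To make the bracket rule concrete, here is a minimal sketch that checks a
fragment against the RFC 3986 grammar (fragment = *( pchar / "/" / "?" )).
The function name is_rfc3986_fragment is made up for illustration; it is not
part of MediaWiki or of the validator, and it only covers the ASCII URI
grammar, not the extra characters RFC 3987 allows in IRIs.

```python
import re

# RFC 3986: fragment = *( pchar / "/" / "?" )
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
# Note that "[" and "]" are absent from every one of these sets.
_FRAGMENT_RE = re.compile(
    r"(?:[A-Za-z0-9\-._~"      # unreserved
    r"!$&'()*+,;="             # sub-delims
    r":@/?]"                   # extra characters allowed in a fragment
    r"|%[0-9A-Fa-f]{2})*"      # pct-encoded
)


def is_rfc3986_fragment(fragment: str) -> bool:
    """Return True if the fragment (without the leading '#') is a valid
    RFC 3986 URI fragment. Hypothetical helper for illustration only."""
    return _FRAGMENT_RE.fullmatch(fragment) is not None


if __name__ == "__main__":
    for candidate in ["Section_1", "a[1]", "a%5B1%5D", "<"]:
        print(repr(candidate), is_rfc3986_fragment(candidate))
```

Under this grammar "a[1]" is rejected while its percent-encoded form
"a%5B1%5D" is accepted, which matches the reading of the RFC quoted above.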

> I don't know.  I glanced at the code but didn't see an obvious reason.  It's a
> separate bug.

Separate, but related. There is apparently no way to write links to sections
with "[" and "]" in the title as external links (this includes permalinks)
without getting them percent-encoded (more than that, it's hard to write them
at all, as they clash with wiki markup).

Other than that, I'm not sure that stripping the most problematic characters is
the right approach at all. It doesn't solve all compatibility issues. I've just
noticed the following: Paste http://example.com/#< into Firefox's address bar.
Copy from there and paste into an arbitrary text editor. You'll get
http://example.com/#%3C (tested in a current Firefox 4.0 nightly), which
doesn't work in IE. This happens with some unusual ASCII characters like "<"
and ">", but also (and that's far worse) with non-ASCII characters that occur
in natural language.
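
For reference, this is what the percent-encoded form of such fragments looks
like. The sketch below uses Python's urllib.parse.quote and leaves unescaped
only the characters RFC 3986 permits in a fragment; encode_fragment is a
made-up name, and this is not a claim about the exact algorithm Firefox
applies when copying from the address bar.

```python
from urllib.parse import quote

# Characters RFC 3986 allows unescaped in a fragment, besides the
# unreserved set (letters, digits, "-", ".", "_", "~"), which quote()
# never escapes anyway.
_FRAGMENT_SAFE = "!$&'()*+,;=:@/?"


def encode_fragment(fragment: str) -> str:
    """Percent-encode a fragment as UTF-8 bytes (hypothetical helper)."""
    return quote(fragment, safe=_FRAGMENT_SAFE)


if __name__ == "__main__":
    print("http://example.com/#" + encode_fragment("<"))
    print("http://example.com/#" + encode_fragment("Überblick"))
```

The first line printed is http://example.com/#%3C, the same string as in the
Firefox test above; the second shows that a non-ASCII letter turns into a
multi-byte UTF-8 escape sequence.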

So Firefox users will create links that don't work in IE as long as IE doesn't
understand percent encoding. Maybe we should therefore allow all characters in
IDs, percent-encode where necessary (that is, just 4 ASCII characters which
rarely occur in natural language anyway) and accept that this minor detail
doesn't work in IE. That's at least compliant; allowing arbitrary unencoded
Unicode characters can't be interoperable anyway while Firefox enforces
percent encoding and IE doesn't support it.
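
The interoperability gap can be sketched as the difference between two ways
of matching a fragment against an element ID. Both functions below are
hypothetical illustrations, not actual browser code: one compares the raw
strings only, the other also tries the percent-decoded fragment.

```python
from urllib.parse import unquote


def strict_match(fragment: str, element_id: str) -> bool:
    """Match only if the fragment is literally identical to the ID
    (a stand-in for a client that does not understand percent encoding)."""
    return fragment == element_id


def decoding_match(fragment: str, element_id: str) -> bool:
    """Match the raw fragment, or its percent-decoded (UTF-8) form."""
    return fragment == element_id or unquote(fragment) == element_id


if __name__ == "__main__":
    # A link copied in percent-encoded form, pointing at an ID "<".
    print(strict_match("%3C", "<"))
    print(decoding_match("%3C", "<"))
```

With the percent-encoded link, only the decoding comparison finds the
target; a link written with the raw character works with both.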
