https://bugzilla.wikimedia.org/show_bug.cgi?id=21429


Philippe Verdy <verd...@wanadoo.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |verd...@wanadoo.fr




--- Comment #3 from Philippe Verdy <verd...@wanadoo.fr>  2009-11-19 19:48:24 UTC ---
Isn't U+FC61 a compatibility character, one that the canonical normalization
forms NFD and NFC neither decompose nor recompose?

If some Arabic fonts do not support two successive diacritics as recommended by
Unicode, and only support the decomposable compatibility characters, these
fonts are really bogus and should be avoided. But that is not the problem here;
see below.

If the character is not canonically equivalent to the two diacritics, it must
not be altered (even if its use is not recommended).
In other words, MediaWiki must apply only the NFC normalization, and NOT the
NFKC normalization.

When I look at the UCD, it reveals that U+FC61 decomposes as "[isolated] U+0020
U+064F U+0651"

This means that it is only a compatibility decomposition, not a canonical
decomposition. Note also that the decomposition adds an extra space, which in
newer documents should rather be a non-breaking space than a regular space, to
avoid the side effects that whitespace collapsing can have in HTML and XML.
Note also that the space still prohibits reordering.
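
The UCD entry and its consequences can be checked with a short sketch like the
one below. It assumes Python's standard unicodedata module as the checker;
MediaWiki has its own normalizer, so this only illustrates the Unicode rules
and not MediaWiki's code. The names LIGATURE, mapping, is_compat_only and
codepoints are made up for the illustration.

```python
import unicodedata

LIGATURE = "\uFC61"  # the compatibility character discussed in this bug

# Raw decomposition mapping from the UCD. A leading "<tag>" marks the
# mapping as a compatibility mapping rather than a canonical one.
mapping = unicodedata.decomposition(LIGATURE)
is_compat_only = mapping.startswith("<")


def codepoints(text):
    """Render a string as a list of U+XXXX labels (illustrative helper)."""
    return ["U+%04X" % ord(ch) for ch in text]


print("UCD mapping:", mapping)
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    # The canonical forms (NFC, NFD) should leave the character alone;
    # only the compatibility forms (NFKC, NFKD) should expand it.
    print(form, codepoints(unicodedata.normalize(form, LIGATURE)))
```

With a conforming normalizer, only the two "K" forms expand the character, and
the expansion keeps the leading U+0020.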

I therefore see no reason why MediaWiki would incorrectly convert U+FC61 to
U+064F U+0651 (stripping the "[isolated]" compatibility specifier and the
space).

Nor do I see any reason why it would recombine U+064F U+0651 into U+FC61 in the
editor (adding the leading space and a nonexistent [isolated] form).
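
The recombination direction can be sketched the same way. Again this assumes
Python's unicodedata module as a stand-in for a conforming normalizer, and
forms_producing is a hypothetical helper written for this comment, not
anything taken from MediaWiki.

```python
import unicodedata

DAMMA_SHADDA = "\u064F\u0651"  # the two combining diacritics on their own
LIGATURE = "\uFC61"            # the isolated-form compatibility character


def forms_producing(target, source):
    """Return the normalization forms that turn `source` into `target`."""
    return [
        form
        for form in ("NFC", "NFD", "NFKC", "NFKD")
        if unicodedata.normalize(form, source) == target
    ]


# Which forms, if any, recombine the diacritics into the ligature,
# with or without the leading space from the compatibility mapping?
recombining_forms = forms_producing(LIGATURE, DAMMA_SHADDA)
recombining_forms_with_space = forms_producing(LIGATURE, " " + DAMMA_SHADDA)

print(recombining_forms, recombining_forms_with_space)
```

Both lists should come out empty, since compatibility mappings are never used
for recomposition, so whatever produces U+FC61 in the editor is not a standard
normalization step.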

The same reasoning applies to all the other Arabic compatibility characters
(with implicit letter forms), which should be avoided in actual Arabic text
unless there is a strong reason to display the character in isolation
with a specific form distinct from the normal Arabic presentation rules.


-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
