On 10 Jul 2014, at 1:04, John MacFarlane <j...@berkeley.edu> wrote:

> +++ Michel Fortin [Jul 09 14 18:07 ]:
>
>> Fun fact: PHP Markdown is mostly encoding agnostic. It understands UTF-8
>> sequences but any byte that is not a valid UTF-8 sequence is treated as a
>> character in itself. It's only relevant when converting tabs into spaces
>> however, and only if you have non-ASCII characters before the tab.
>
> Small amendment: There are at least two places where the difference
> between utf-8 and latin1 matters: tab expansion (as you note) and
> reference links, since these are stipulated to be case insensitive.
> (Case conversion is sensitive to the encoding.)

Like Markdown.pl, PHP Markdown will just treat non-ASCII characters in a case-sensitive way, so in my case it doesn't matter.

Also, if you want to compare characters in a case-insensitive manner, the most correct way to do it is to use the Unicode Collation Algorithm, not case conversion to lower or uppercase, because some characters can't round-trip (see [german ß]). Then you'll notice that unfortunately Unicode collation is locale dependent (because equivalent characters aren't the same in all locales, see the [turkish ı]). And then you'll realize there's no correct way to do it universally.

 [GERMAN SS]: https://en.wikipedia.org/wiki/ß
 [TURKISH I]: https://en.wikipedia.org/wiki/Turkish_dotted_and_dotless_I

On Babelmark I see that cheapskate 0.1.0.1 understands the first link above -- good job! -- and no one understands the second one.

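To make the round-trip and locale points concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: it uses Python's built-in `str.lower` and `str.casefold` (Unicode default case folding), not the Unicode Collation Algorithm, and the two helper names are hypothetical, not taken from any Markdown implementation.

```python
# Hypothetical helpers for normalizing reference-link labels.
# Neither one is the Unicode Collation Algorithm; both are locale-independent.

def normalize_by_lower(label: str) -> str:
    """Simple lowercasing, the naive way to get case-insensitive labels."""
    return label.lower()


def normalize_by_casefold(label: str) -> str:
    """Unicode default case folding, as provided by Python's str.casefold."""
    return label.casefold()


# German sharp s: uppercasing then lowercasing does not round-trip.
assert "ß".upper() == "SS"
assert "ß".upper().lower() == "ss"

# With plain lowercasing the two labels stay different...
assert normalize_by_lower("german ß") != normalize_by_lower("GERMAN SS")
# ...while default case folding maps both to "german ss".
assert normalize_by_casefold("german ß") == normalize_by_casefold("GERMAN SS")

# Turkish dotless i: its uppercase is plain "I", but "I" lowercases to
# dotted "i" outside a Turkish locale, so neither approach matches the labels.
assert "ı".upper() == "I"
assert "I".lower() == "i"
assert normalize_by_lower("turkish ı") != normalize_by_lower("TURKISH I")
assert normalize_by_casefold("turkish ı") != normalize_by_casefold("TURKISH I")
```

Matching the second pair would need Turkish-specific case rules, which in turn would break the usual I/i equivalence for everyone else.
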
http://johnmacfarlane.net/babelmark2/?normalize=1&text=Also%2C+if+you+want+to+compare+characters+in+a+case-sensitive+manner%2C+the+most+correct+way+to+do+it+is+to+use+the+Unicode+Collation+Algorithm+--+not+case+conversion+to+lower+or+uppercase+--+because+some+characters+can't+round-trip+(see+%5Bgerman+ß%5D).+Then+you'll+notice+that+unfortunately+Unicode+collation+is+locale+dependent+(because+equivalent+characters+aren't+the+same+in+all+locales%2C+see+the+%5Bturkish+ı%5D).+And+then+you'll+realize+there's+not+really+a+correct+way+to+do+it.%0A%0A+%5BGERMAN+SS%5D%3A+https%3A%2F%2Fen.wikipedia.org%2Fwiki%2Fß%0A+%5BTURKISH+I%5D%3A+https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FTurkish_dotted_and_dotless_I%0A

--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

_______________________________________________
Markdown-Discuss mailing list
Markdown-Discuss@six.pairlist.net
http://six.pairlist.net/mailman/listinfo/markdown-discuss