+++ Michel Fortin [Jul 10 14 07:53 ]:
On Jul 10, 2014, at 1:04, John MacFarlane <j...@berkeley.edu> wrote:

+++ Michel Fortin [Jul 09 14 18:07 ]:

Fun fact: PHP Markdown is mostly encoding agnostic. It understands UTF-8 
sequences but any byte that is not a valid UTF-8 sequence is treated as a 
character in itself. It's only relevant when converting tabs into spaces 
however, and only if you have non-ASCII characters before the tab.
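
A minimal sketch of why the encoding matters for tab expansion, in Python
rather than PHP. This is not PHP Markdown's code: `expand_tabs` and the
4-column tab width are assumptions made for illustration. It counts one
column per character, so the same bytes give a different result when they
are read as UTF-8 than when they are read as Latin-1.

```python
def expand_tabs(line: str, tab_width: int = 4) -> str:
    """Replace each tab with spaces up to the next multiple of tab_width."""
    out = []
    col = 0  # current column, counted in characters (code points)
    for ch in line:
        if ch == "\t":
            pad = tab_width - (col % tab_width)
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)


raw = "é\tx".encode("utf-8")  # 'é' is two bytes in UTF-8

# Read as UTF-8, 'é' is one character, so the tab needs 3 spaces.
as_utf8 = expand_tabs(raw.decode("utf-8"))

# Read as Latin-1, the same two bytes are two characters, so the tab needs 2.
as_latin1 = expand_tabs(raw.decode("latin-1"))
```

With only ASCII before the tab, both readings give the same padding, which
is why the difference shows up only when non-ASCII characters precede it.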

Small amendment: There are at least two places where the difference
between UTF-8 and Latin-1 matters:  tab expansion (as you note) and
reference links, since these are stipulated to be case insensitive.
(Case conversion is sensitive to the encoding.)

Like Markdown.pl, PHP Markdown will just treat non-ASCII characters in a 
case-sensitive way, so in my case it doesn't matter.
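
The behaviour described here can be modelled roughly as follows. This is
an illustrative Python sketch, not the code of Markdown.pl or PHP
Markdown; `ascii_only_key` is a hypothetical helper. It lowercases only
the ASCII letters of the encoded label and leaves every other byte alone.

```python
def ascii_only_key(label: str) -> bytes:
    # bytes.lower() changes only the ASCII letters A-Z;
    # the bytes of non-ASCII characters pass through unchanged.
    return label.encode("utf-8").lower()


# ASCII labels match regardless of case.
ascii_match = ascii_only_key("Foo") == ascii_only_key("foo")

# Accented letters stay case-sensitive, so these two labels differ.
accented_match = ascii_only_key("Índice") == ascii_only_key("índice")
```

Because non-ASCII bytes are never altered, the result is the same whatever
encoding the input happens to be in.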

I think this is a deficiency in Markdown.pl.  The syntax description
says that reference links are case-insensitive, and it doesn't say
anything about this just applying to ASCII references.  I think someone
who writes in, say, Spanish, would quite naturally expect words with
accents to behave the same as words without accents in reference links.

By the way, I'm not sure what the motivation for making the reference
links case-insensitive was.  I conjecture that it was to allow the
following sort of thing:

   [Foo][] is better than [bar][].  And [Bar][] is worse than [foo][].

   [foo]: /url1
   [bar]: /url2

This is a good motivation:  it would be a burden to have to define
separate references for capitalized and uncapitalized versions of a
phrase, or to use the longer form `[Foo][foo]` for capitalized
versions.  But this motivation extends naturally beyond ASCII.

Hence, I think Markdown processors *should* do a proper Unicode
case fold in determining when references match.

Unfortunately, as you point out, this becomes very complex, and
brings in locale dependence for a few cases (e.g. Turkish).  Still,
I think it's the ideal we should aspire to.
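
As a rough sketch of what such matching could look like, here is a Python
version built on `str.casefold()`, which applies Unicode case folding
without consulting the locale. The helper names, the `índice` label, and
`/url3` are hypothetical and only serve the illustration. No existing
Markdown processor is claimed to work this way.

```python
def normalize_label(label: str) -> str:
    # Locale-independent Unicode case folding of the reference label.
    return label.casefold()


# Hypothetical reference definitions, extending the example above.
definitions = {"foo": "/url1", "bar": "/url2", "índice": "/url3"}
references = {normalize_label(k): v for k, v in definitions.items()}


def resolve(label: str):
    """Return the URL for a reference label, or None if it is undefined."""
    return references.get(normalize_label(label))


foo_url = resolve("Foo")        # matches [foo]
indice_url = resolve("ÍNDICE")  # accented capital matches too

# The default fold is not locale-aware: in Turkish, capital I pairs with
# dotless ı, but the default fold maps I to i and leaves ı unchanged.
turkish_match = normalize_label("ISTANBUL") == normalize_label("ıstanbul")
```

The last comparison is false, which is the kind of locale-dependent case
that a default fold cannot cover on its own.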

_______________________________________________
Markdown-Discuss mailing list
Markdown-Discuss@six.pairlist.net
http://six.pairlist.net/mailman/listinfo/markdown-discuss