On 10 Oct 2006, at 3:17, A. Pagaltzis wrote:
* John Gruber <[EMAIL PROTECTED]> [2006-10-10 05:55]:
I think it's simpler and better to just say "use UTF-8".
+1
UTF-8 is in fact deliberately constructed such that the chance of
arbitrary text accidentally being valid UTF-8 approaches zero wit
* John Gruber <[EMAIL PROTECTED]> [2006-10-10 05:55]:
> I think it's simpler and better to just say "use UTF-8".
+1
UTF-8 is in fact deliberately constructed such that the chance of
arbitrary text accidentally being valid UTF-8 approaches zero with
increasing length of the text.
Regards,
--
Ari
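The point about accidental validity can be checked with a small sketch. This is Python rather than anything taken from Markdown.pl or PHP Markdown, and the helper name `is_valid_utf8` is hypothetical. It shows that the same word encoded as ISO Latin 1 or Mac Roman is rejected by a strict UTF-8 decoder, which is what makes "just use UTF-8" a checkable rule.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")  # strict mode raises on any malformed sequence
    except UnicodeDecodeError:
        return False
    return True


sample = "résumé"

# The same text in three encodings; only the first is well-formed UTF-8.
for codec in ("utf-8", "latin-1", "mac_roman"):
    print(codec, is_valid_utf8(sample.encode(codec)))
```

A single short word is enough here because a lone accented byte from a legacy 8-bit encoding rarely forms a complete UTF-8 sequence with its neighbours; longer text only makes an accidental match less likely.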
Michel Fortin <[EMAIL PROTECTED]> wrote on 10/9/06 at
9:33 PM:
I haven't tried it inside PHP Markdown yet, but I've tested
`mb_strlen` and it seems to treat any invalid UTF-8 byte
sequence as individual characters. So the neat result is that
text in ISO Latin, Windows Latin, or Mac Roman will w
On 10. Oct 2006, at 03:33, Michel Fortin wrote:
[...] I'm not sure how high that risk is for all character
combinations, but it is obviously less problematic than the current
behaviour is for UTF-8.
This report http://www.ifi.unizh.ch/mml/mduerst/papers/PDF/IUC11-UTF-8.pdf
talks about prob
On 9 Oct 2006, at 20:34, John Gruber wrote:
Michel Fortin <[EMAIL PROTECTED]> wrote on 10/9/06 at 8:26 PM:
If anyone is interested in a fix for PHP Markdown, just change
the call to the `strlen` function within detab to a call to
`mb_strlen($line, 'utf-8')`. I'll fix this for the next
versio
Michel Fortin <[EMAIL PROTECTED]> wrote on 10/9/06 at
8:26 PM:
If anyone is interested in a fix for PHP Markdown, just change
the call to the `strlen` function within detab to a call to
`mb_strlen($line, 'utf-8')`. I'll fix this for the next
version.
Will that still work if people pass in Win
On 9 Oct 2006, at 19:43, Allan Odgaard wrote:
As you can see, expand is able to correctly convert tabs to spaces,
where Markdown.pl counts the é as occupying two columns.
Ah! Now I see what you mean. It makes perfect sense and is super-easy
to reproduce. Thank you for that clear example.
On 10. Oct 2006, at 00:19, John Gruber wrote:
[...] If Markdown.pl ever gains explicit support for text encodings,
the rules will be simple: UTF-8 in, UTF-8 out, no exceptions.
Or you could check the user's locale (LC_CTYPE). Though hardcoding it
to UTF-8 works for me.
You can also verify
On 10. Oct 2006, at 00:52, Michel Fortin wrote:
[...] From your description of the problem, I believe you're not
using UTF-8.
No, here is an example showing the problem:
% Markdown.pl <<< $'Test:\nresume\tbar\nrésumé\tbar\n'
Test:
resume bar
résumébar
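The misalignment can be reproduced outside Markdown with a minimal sketch. This is Python, not the actual Markdown.pl or PHP Markdown code. The function names `detab` and `byte_len` and the tab width of 4 are assumptions made for illustration. The only difference between the two calls is whether the text before each tab is measured in characters or in UTF-8 bytes.

```python
TAB_WIDTH = 4  # assumed tab stop width, for illustration only


def byte_len(s: str) -> int:
    """Length in UTF-8 bytes, mimicking a byte-oriented length function."""
    return len(s.encode("utf-8"))


def detab(line: str, length=len, tab_width: int = TAB_WIDTH) -> str:
    """Expand tabs to spaces, measuring columns with the given length function."""
    parts = line.split("\t")
    out = ""
    for chunk in parts[:-1]:
        out += chunk
        # Pad to the next tab stop, based on how long `out` is believed to be.
        out += " " * (tab_width - length(out) % tab_width)
    return out + parts[-1]


for word in ("resume", "résumé"):
    line = word + "\tbar"
    print(repr(detab(line)))                   # columns counted in characters
    print(repr(detab(line, length=byte_len)))  # columns counted in bytes
```

Counting characters pads both words to the same column. Counting bytes treats each é as two columns, so the accented line receives a different amount of padding. The sketch ignores combining and double-width characters, which would need more than a character count.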
On 9 Oct 2006, at 17:02, Allan Odgaard wrote:
As for #2, Markdown doesn’t know the encoding of the source
document, so that would mean it can’t really be aware of things
such as UTF-8 mb sequences, OTOH if it changes my pre-formatted
text, I would like to have it do the right thing.
Cur
Allan Odgaard <[EMAIL PROTECTED]> wrote on 10/9/06 at
11:02 PM:
This raises two questions:
1. Should Markdown convert tabs to spaces in pre-formatted text?
2. If yes, should Markdown be aware of multi-byte characters?
I’d say yes to #1 -- Markdown converts to (X)HTML which
does not define
A user has table-formatted data which contains accents and finds it
problematic that his tables misalign after going through Markdown.
This is because he aligned them using tab characters, and Markdown
converts these to spaces even in pre-formatted text, and Markdown
is not multi-byte