Re: Detab should be multi-byte aware?

2006-10-09 Thread John Gruber
Michel Fortin <[EMAIL PROTECTED]> wrote on 10/9/06 at 9:33 PM: I haven't tried it inside PHP Markdown yet, but I've tested `mb_strlen` and it seems to treat any invalid UTF-8 byte sequence as individual characters. So the neat result is that text in ISO Latin, Windows Latin, or Mac Roman will w
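
A rough illustration of the behaviour described in this message, written in Python rather than PHP. Decoding with `errors="replace"` substitutes a placeholder for bytes that are not valid UTF-8, so a Latin-1 line still gets a sensible character count. This is only an analogue under that assumption: it is not how `mb_strlen` is implemented, the two may differ on truncated multi-byte sequences, and the helper name `char_count` is hypothetical.

```python
def char_count(raw: bytes) -> int:
    """Count characters in a line that is probably, but not certainly, UTF-8."""
    # Invalid byte sequences become U+FFFD instead of raising an error.
    return len(raw.decode("utf-8", errors="replace"))


# The same word in two encodings (escapes used so the source stays ASCII).
utf8_line = "r\u00e9sum\u00e9".encode("utf-8")      # 8 bytes, valid UTF-8
latin1_line = "r\u00e9sum\u00e9".encode("latin-1")  # 6 bytes, invalid as UTF-8

print(len(utf8_line), char_count(utf8_line))
print(len(latin1_line), char_count(latin1_line))
```

In both cases the character count comes out as six, which is the property that matters for tab expansion.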

Re: Detab should be multi-byte aware?

2006-10-09 Thread Allan Odgaard
On 10. Oct 2006, at 03:33, Michel Fortin wrote: [...] I'm not sure how high that risk is for all character combinations, but it is obviously less problematic than the current behaviour is for UTF-8. This report http://www.ifi.unizh.ch/mml/mduerst/papers/PDF/IUC11-UTF-8.pdf talks about prob

Re: Detab should be multi-byte aware?

2006-10-09 Thread Michel Fortin
On 9 Oct 2006, at 20:34, John Gruber wrote: Michel Fortin <[EMAIL PROTECTED]> wrote on 10/9/06 at 8:26 PM: If anyone is interested in a fix for PHP Markdown, just change the call to the `strlen` function within detab to a call to `mb_strlen($line, 'utf-8')`. I'll fix this for the next versio

Re: Markdown file extension

2006-10-09 Thread Lou Quillio
John Gruber wrote: > I don't like it at all, and certainly won't use or endorse it. Amen. How does a cryptic or tech-schmancy extension square with the Markdown ethic? It don't. But if ya absolutely hadda have one sometimes, what about, uhh, `filename.markdown`? Too confusing? LQ ___

Re: Markdown file extension

2006-10-09 Thread Fletcher T. Penney
On Oct 9, 2006, at 7:32 PM, Michel Fortin wrote: On 9 Oct 2006, at 18:50, John Gruber wrote: If you're going to go to six characters for ".mdtext", then why not go to eight characters and use ".markdown"? That's what I would use. Because ".mdtext" better conveys that the document is prima

Re: Detab should be multi-byte aware?

2006-10-09 Thread John Gruber
Michel Fortin <[EMAIL PROTECTED]> wrote on 10/9/06 at 8:26 PM: If anyone is interested in a fix for PHP Markdown, just change the call to the `strlen` function within detab to a call to `mb_strlen($line, 'utf-8')`. I'll fix this for the next version. Will that still work if people pass in Win

Re: Detab should be multi-byte aware?

2006-10-09 Thread Michel Fortin
On 9 Oct 2006, at 19:43, Allan Odgaard wrote: As you can see, expand is able to correctly convert tabs to spaces, where Markdown.pl counts the é as occupying two columns. Ah! Now I see what you mean. It makes perfect sense and is super-easy to reproduce. Thank you for that clear example.

Re: Markdown file extension

2006-10-09 Thread Robert Ullrey
Tobias Gruetzmacher wrote this pithy remark on 10/10/06 >That is only legacy. Since the web is older than Windows 95, which >brought LFN to FAT and ISO9660 (Joliet), there were some file extensions >shortened. I don't think there are any modern operating systems which >don't allow longer file exte

Re: Markdown file extension

2006-10-09 Thread Tobias Gruetzmacher
Hi, On Mon, Oct 09, 2006 at 04:40:27PM -0700, Robert Ullrey wrote: > >Why wouldn't the extension be the same on all platforms? > > I seem to recall that Windows only supports three letter extensions. I > am welcome to being corrected though, but that was also my understanding > why we have .htm (w

Re: Detab should be multi-byte aware?

2006-10-09 Thread Allan Odgaard
On 10. Oct 2006, at 00:19, John Gruber wrote: [...] If Markdown.pl ever gains explicit support for text encodings, the rules will be simple: UTF-8 in, UTF-8 out, no exceptions. Or you could check the users locale (LC_CTYPE). Though hardcoding it to UTF-8 works for me. You can also verify

Re: Detab should be multi-byte aware?

2006-10-09 Thread Allan Odgaard
On 10. Oct 2006, at 00:52, Michel Fortin wrote: [...] From your description of the problem, I believe you're not using UTF-8. No, here is an example showing the problem: % Markdown.pl <<< $'Test:\nresume\tbar\nrésumé\tbar\n' Test: resume bar résumébar
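
The effect in this example can be reproduced with a minimal sketch in Python. It is not the Markdown.pl or PHP Markdown code: the function name `detab`, the `count_bytes` switch, and the tab width of 4 are assumptions made for illustration. The sketch pads each tab to the next tab stop, measuring the preceding text either in UTF-8 bytes (the problematic behaviour) or in characters.

```python
TAB_WIDTH = 4  # assumed tab stop width


def detab(line: str, tab_width: int = TAB_WIDTH, count_bytes: bool = False) -> str:
    """Expand tabs to spaces, padding each chunk to the next tab stop."""

    def width(text: str) -> int:
        # Byte counting treats each "é" as two columns in UTF-8.
        return len(text.encode("utf-8")) if count_bytes else len(text)

    parts = line.split("\t")
    out = []
    col = 0
    for part in parts[:-1]:
        col += width(part)
        pad = tab_width - (col % tab_width)
        out.append(part + " " * pad)
        col += pad
    out.append(parts[-1])  # text after the last tab needs no padding
    return "".join(out)


accented = "r\u00e9sum\u00e9\tbar"  # "résumé<TAB>bar"

print(repr(detab("resume\tbar")))
print(repr(detab(accented)))                    # counted in characters
print(repr(detab(accented, count_bytes=True)))  # counted in bytes
```

Counting characters puts "bar" in the same column on both lines; counting bytes pushes it two columns further right on the accented line. Note that `len` counts code points, which is still not the same as display width for combining or wide characters.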

Re: Markdown file extension

2006-10-09 Thread Robert Ullrey
Fletcher T. Penney wrote this pithy remark on 10/9/06 >markdown text: >> .mdt (windows) .mdtx (unix?) > > >Why wouldn't the extension be the same on all platforms? I seem to recall that Windows only supports three letter extensions. I am welcome to being corrected though, but that was also my und

Re: Markdown file extension

2006-10-09 Thread Michel Fortin
On 9 Oct 2006, at 18:50, John Gruber wrote: If you're going to go to six characters for ".mdtext", then why not go to eight characters and use ".markdown"? That's what I would use. Because ".mdtext" better conveys that the document is primarily in text format, it should be less intimidatin

Re: Minor regexp oversight for setext headings

2006-10-09 Thread Michel Fortin
On 9 Oct 2006, at 17:48, John Gruber wrote: But what about this: #this# that Right now that gets turned into this that and I think that's just plain wrong. The only reason this works is by accident; so I plan to require a blank line here. I totally agree that the syntax

Re: Detab should be multi-byte aware?

2006-10-09 Thread Michel Fortin
On 9 Oct 2006, at 17:02, Allan Odgaard wrote: As for #2, Markdown doesn’t know the encoding of the source document, so that would mean it can’t really be aware of things such as UTF-8 mb sequences. OTOH, if it changes my pre-formatted text, I would like to have it do the right thing. Cur

Re: Markdown file extension

2006-10-09 Thread John Gruber
Fletcher T. Penney <[EMAIL PROTECTED]> wrote on 10/9/06 at 6:20 PM: I was not too keen on ".mdml", but did like the ".mdtext" that someone else pointed out. All in all, I don't much care what a consensus extension is, but would happily use it if there was one. If you're going to go to

Re: Markdown file extension

2006-10-09 Thread Allan Odgaard
On 10. Oct 2006, at 00:20, Fletcher T. Penney wrote: But wouldn't it make sense to try and have a consistent file extension for Markdown files for those who want something more specific than .txt or .text? Definitely! In addition to the “out of the box” support for the file opened in var

Re: Markdown file extension

2006-10-09 Thread A. Pagaltzis
* Fletcher T. Penney <[EMAIL PROTECTED]> [2006-10-10 00:25]: > But wouldn't it make sense to try and have a consistent file > extension for Markdown files for those who want something more > specific than .txt or .text? More and more applications are > supporting them in various a consistent file e

Re: Markdown file extension

2006-10-09 Thread Fletcher T. Penney
On Oct 9, 2006, at 6:26 PM, Robert Ullrey wrote: That being said, it seems to me that if a file extension was to be declared, it should be similar on all platforms, e.g., markdown text: .mdt (windows) .mdtx (unix?) Why wouldn't the extension be the same on all platforms? Fletcher -- Fletch

Re: Markdown file extension

2006-10-09 Thread Robert Ullrey
It seems to me though that the advantage to using a uniform file extension other than .text is that applications like TextMate, BBEdit, SubEthaEdit, Emacs, and their Windows/Linux equivalents can then have specific grammars and modes to support syntax highlighting, auto-complete etc. Using somethin

Re: Markdown file extension

2006-10-09 Thread Fletcher T. Penney
On Oct 9, 2006, at 6:03 PM, John Gruber wrote: Allan Odgaard <[EMAIL PROTECTED]> wrote on 10/9/06 at 6:33 PM: Now, it’s no problem to add another, but I’ve just never come across mdml, nor was it mentioned when I asked JG for which extensions are in use. I don't like it at all, and certainl

Re: Detab should be multi-byte aware?

2006-10-09 Thread John Gruber
Allan Odgaard <[EMAIL PROTECTED]> wrote on 10/9/06 at 11:02 PM: This raises two questions: 1. Should Markdown convert tabs to spaces in pre-formatted text? 2. If yes, should Markdown be aware of multi-byte characters? I’d say yes to #1 -- Markdown converts to (X)HTML which does not define

Re: Markdown file extension

2006-10-09 Thread John Gruber
Allan Odgaard <[EMAIL PROTECTED]> wrote on 10/9/06 at 7:12 PM: I also wonder if JG views Markdown as a markup language seeing how a lot of constructs are not exactly based on explicit markup. Funny, but true. I don't really have a strong opinion about what "markup language" means, but when

Re: Markdown file extension

2006-10-09 Thread John Gruber
Allan Odgaard <[EMAIL PROTECTED]> wrote on 10/9/06 at 6:33 PM: Now, it’s no problem to add another, but I’ve just never come across mdml, nor was it mentioned when I asked JG for which extensions are in use. I don't like it at all, and certainly won't use or endorse it. What's the "ml" part?

Re: Minor regexp oversight for setext headings

2006-10-09 Thread John Gruber
Michel Fortin <[EMAIL PROTECTED]> wrote on 10/9/06 at 12:47 AM: As much as I agree with you, I tend to believe it'll break backward compatibility for a couple of people. I've seen this a couple of times: Paragraph... Header == Paragraph... although I don't remember

Detab should be multi-byte aware?

2006-10-09 Thread Allan Odgaard
A user has table-formatted data which contains accents and finds it problematic that his tables misalign after going through Markdown. This is because he aligned them using tab characters; Markdown converts these to spaces even in pre-formatted text, and Markdown is not multi-byte

Re: Markdown file extension

2006-10-09 Thread A. Pagaltzis
* Robert Ullrey <[EMAIL PROTECTED]> [2006-10-09 18:15]: > Just to let everyone know, markdown now has its own official > file extension; “mdml”. > > I have never heard of any of that extension, file-extensions.org or an official

Re: Minor regexp oversight for setext headings

2006-10-09 Thread A. Pagaltzis
* Angie Ahl <[EMAIL PROTECTED]> [2006-10-09 16:10]: > And I've read the book about 5 times by now so I'd really > appreciate you qualifying what you say to the list for the > benefit of the list as I'm pretty sure you're mistaken. Ask yourself: if non-greedy is always faster, why is it not the def

Re: Markdown file extension

2006-10-09 Thread Allan Odgaard
On 9. Oct 2006, at 19:09, James Bennett wrote: On 10/9/06, Robert Ullrey <[EMAIL PROTECTED]> wrote: Not sure. I was searching to see if there was an official one. I have been using md for a while now. It looked official enough that I thought JG put it in. Even weirder is that there is a mar

Re: Markdown file extension

2006-10-09 Thread Michel Fortin
On 9 Oct 2006, at 12:14, Robert Ullrey wrote: Just to let everyone know, markdown now has its own official file extension; “mdml”. That's the first time I see that extension. If I were to invent an unambiguous extension

Re: Re: Markdown file extension

2006-10-09 Thread James Bennett
On 10/9/06, Robert Ullrey <[EMAIL PROTECTED]> wrote: Not sure. I was searching to see if there was an official one. I have been using md for a while now. It looked official enough that I thought JG put it in. Even weirder is that there is a markup language called MDML -- the "Market Data Markup

Greedy versus non-greedy repeat (was: Minor regexp oversight for setext headings)

2006-10-09 Thread Allan Odgaard
Greedy matching is generally more efficient (so it’s a good idea to use it when possible), but the reason it is faster is because of changed semantics. Take this string: foo...«lots of text»...bar...«lots more text»...end If we match it against ‘foo.*bar’ the regexp engine will first mat
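
The change in semantics mentioned in this message can be seen in a small Python sketch. The sample strings are assumptions made for illustration (shortened stand-ins for the string in the message), and Python's `re` module stands in for the Perl and PHP regexp engines discussed on the list.

```python
import re

# One "bar": greedy and non-greedy find the same match, but get there
# differently (greedy runs to the end and backtracks, non-greedy advances
# one character at a time).
one_bar = "foo...A...bar...B...end"
greedy_one = re.search(r"foo.*bar", one_bar).group(0)
lazy_one = re.search(r"foo.*?bar", one_bar).group(0)

# Two "bar"s: the two forms no longer mean the same thing.
two_bars = "foo...A...bar...B...bar...end"
greedy_two = re.search(r"foo.*bar", two_bars).group(0)   # up to the LAST bar
lazy_two = re.search(r"foo.*?bar", two_bars).group(0)    # up to the FIRST bar

print(greedy_one == lazy_one)
print(greedy_two)
print(lazy_two)
```

Which form is faster depends on where the terminating text sits in the input, so the choice is best made on which match is actually wanted.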

Re: Markdown file extension

2006-10-09 Thread Robert Ullrey
Allan Odgaard wrote this pithy remark on 10/9/06 >Who came up with that? Not sure. I was searching to see if there was an official one. I have been using md for a while now. It looked official enough that I thought JG put it in. Robert __

Re: Markdown file extension

2006-10-09 Thread Fletcher T. Penney
What do you mean by "official"? Has this been endorsed by John? I can't find any information on the file-extensions.org site to describe what the "organization" is. The site seems heavily advertising focused and doesn't provide a lot of information. Perhaps I am being overly skeptical...

Re: Markdown file extension

2006-10-09 Thread Allan Odgaard
On 9. Oct 2006, at 18:14, Robert Ullrey robert_ullrey-at-mac.com | Markdown| wrote: Just to let everyone know, markdown now has its own official file extension; “mdml”. Who came up with that? In TextMate we associate mar

Markdown file extension

2006-10-09 Thread Robert Ullrey
Just to let everyone know, markdown now has its own official file extension; “mdml”.

Re: Minor regexp oversight for setext headings

2006-10-09 Thread Angie Ahl
Sorry list I did mean that to go to the sender not the list. Trying to avoid flame wars not start them.. my bad. ___ Markdown-Discuss mailing list Markdown-Discuss@six.pairlist.net http://six.pairlist.net/mailman/listinfo/markdown-discuss

Re: Minor regexp oversight for setext headings

2006-10-09 Thread Angie Ahl
Oi. All due respect, if you're going to correct/flame people publicly at least have the politeness to point out/qualify what you're saying, e.g. chapter or page number, otherwise you come across as an ignorant troll with no social skills. Unless of course you are an ignorant troll in which ca