from:"Michel Fortin"

Re: whitespace at the start of code in a blockquote trimmed

2017-08-14 Thread Michel Fortin

Looks like a bug in Markdown.pl. Pretty much all the other implementations of 
Markdown handle this correctly. So you can use another Markdown parser, or 
alternatively you can write the whole blockquote in HTML as a workaround. I 
think you can add spaces to all the lines too (so it looks better in the text 
document). Or you take some time to fix Markdown.pl.
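
The HTML workaround can be generated rather than typed by hand. A minimal Python sketch, assuming the extract should end up as a `<pre><code>` block inside a `<blockquote>`; `quote_code_as_html` is a hypothetical helper name, not part of any Markdown implementation:

```python
import html


def quote_code_as_html(code: str) -> str:
    """Wrap a code extract in literal <blockquote><pre><code> HTML.

    The text is escaped and emitted verbatim, so leading spaces never go
    through Markdown's blockquote and code-block rules.
    """
    escaped = html.escape(code, quote=False)  # escapes &, < and >
    return "<blockquote>\n<pre><code>" + escaped + "\n</code></pre>\n</blockquote>"


print(quote_code_as_html("  #  #\n ####  <banner>"))
```

The result has to be pasted as a block-level HTML element: separated from the surrounding text by blank lines, with the opening tag not indented.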

> On 14 August 2017 at 9:03, Bram Mertens <mertensb.ma...@gmail.com> wrote:
> 
> Hi,
> 
> In a document I'm writing I wanted to include an extract from an email 
> message as a blockquote.
> The message contains a code extract which contains both a piece of ascii art 
> (output from banner) as well as some lines that start with spaces.
> 
> After converting the markdown to XHTML I noticed that the ascii art does not 
> look as expected and some lines that start with spaces are displayed 
> left-aligned. (Meaning the leading spaces are removed).
> 
> The problem occurs only when the blockquote (> ) and code (4 spaces) are 
> combined.
> Adding two more spaces somehow resolves the problem as well but that 
> obviously makes the Markdown text look different.
> 
> I've created a small sample and uploaded it to pastebin at 
> https://pastebin.com/iCb3ESYZ.
> It has the original text, the same text without blockquote and the version 
> with the additional spaces added.
> Converting that in dingus 
> (https://daringfireball.net/projects/markdown/dingus) demonstrates the 
> problem.
> 
> Is this by design?
> Is there another way to avoid this?
> 
> Thanks in advance
> 
> Bram
> ___
> Markdown-Discuss mailing list
> Markdown-Discuss@six.pairlist.net
> https://pairlist6.pair.net/mailman/listinfo/markdown-discuss

-- 
Michel Fortin
https://michelf.ca


Re: HTML Tags = Markdown Quick Reference @ Write Kit

2015-08-04 Thread Michel Fortin

On 2015-08-04 at 10:26, Gerald Bauer gerald.ba...@gmail.com wrote:

> Thanks, great comments. I've updated the quick reference and it reads now
>
>    i or emphasis and
>    b or strong

Shouldn't it be em? There's no such thing as emphasis in HTML.

-- 
Michel Fortin
michel.for...@michelf.ca
https://michelf.ca




Re: Opinions on in-band variation signaling

2014-11-11 Thread Michel Fortin

Before this discussion derails into syntax bikeshedding, I think I should 
highlight the important part:

Sean Leonard wrote:

> Do any implementers have experience with picking the type of Markdown based 
> on some info at the top of the Markdown content?

To my knowledge, no one is doing that. But it'd be interesting to hear about it 
if someone is doing it.


As for this:

> By in-band, I mean a Markdown file with content like this:
>
> !pandoc
> % This is a Title
> % Sean Leonard
> % November 2014
>
> Blah blah *blah*.
>  (fortin’s suggestion)

I suggested privately to you that using `!pandoc` would be better as a header 
than `Content-Type: text/markdown; variant=pandoc` if you wished people to 
actually write something in a file. But for me this is an enclosure format for 
Markdown (similar to how mp4 is an enclosure format for many codecs). The 
actual Markdown content starts after the metadata header. 

Also, lines prefixed by a `%` are ugly, I would never suggest such a thing.
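
To make the enclosure idea concrete, here is a minimal Python sketch of how a reader could peel such a header off before handing the rest to a parser. The exact header grammar (a `!` followed by a name made of letters, digits, `_`, `.` or `-`, alone on the first line) is an assumption for illustration; the thread only shows `!pandoc`. `split_variant_header` is a hypothetical name.

```python
import re

# Assumed header form: "!name" alone on the first line.
_HEADER = re.compile(r"!([A-Za-z0-9_.-]+)[ \t]*\r?\n")


def split_variant_header(text):
    """Return (variant, content); variant is None when there is no header line."""
    match = _HEADER.match(text)
    if match is None:
        return None, text
    # Everything after the header line is the actual Markdown content.
    return match.group(1), text[match.end():]


print(split_variant_header("!pandoc\nBlah blah *blah*.\n"))
```

Restricting the name to a small character set keeps an ordinary first line such as an image (`![alt](url)`) from being mistaken for a header.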

-- 
Michel Fortin
michel.for...@michelf.ca
https://michelf.ca




Re: (text/markdown) link label vs. link identifier and last-one-wins

2014-10-10 Thread Michel Fortin

On 10 Oct 2014 at 11:56, Sean Leonard dev+i...@seantek.com wrote:

> On this basis, I am going to call it link identifier. Questions?

It is called a label, an identifier, and a name under the [link][] 
section of the syntax description on Daring Fireball. Choose as you like.

 [link]: http://daringfireball.net/projects/markdown/syntax#link


> Also, Markdown.pl seems to define last-wins behavior: the last link 
> definition is indexed as the definition in $g_urls. (See 
> _StripLinkDefinitions.) Older ones get overwritten by newer ones. Is this 
> common or normative behavior? How do other implementations do it?

Well, it is not specified in the official documentation. You can always check 
on Babelmark 2 to see how various implementations are doing it:
http://johnmacfarlane.net/babelmark2/?normalize=1&text=%5Blink%5D%5B%5D%0A%0A%5Blink%5D%3A+1%0A%5Blink%5D%3A+2
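
For reference, last-wins falls out naturally when definitions are stored in a hash keyed by identifier. A minimal Python sketch of that behavior; the pattern is a simplification for illustration (real parsers also handle titles and angle-bracketed URLs), and `collect_link_definitions` is a hypothetical name:

```python
import re

# Simplified "[id]: url" definition line, indented by at most three spaces.
_LINK_DEF = re.compile(r"^[ ]{0,3}\[([^\]]+)\]:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def collect_link_definitions(text):
    """Index definitions by lowercased id; a later definition overwrites an earlier one."""
    urls = {}
    for match in _LINK_DEF.finditer(text):
        urls[match.group(1).lower()] = match.group(2)
    return urls


print(collect_link_definitions("[link][]\n\n[link]: 1\n[link]: 2\n"))
```

A first-wins implementation would only need to skip identifiers that are already in the dictionary, which is why the two behaviors are equally easy to end up with.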


> It's important that I keep the original reference list short; I would rather 
> not refer normatively to documents other than Gruber's own Markdown rules.

I don't understand what all this has to do with a spec for a MIME type.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca




Re: Markdown within block-level elements

2014-09-17 Thread Michel Fortin

On 17 Sep 2014 at 21:07, Alan Hogan cont...@alanhogan.com wrote:

> On Sep 17, 2014, at 4:57 PM, Michel Fortin michel.for...@michelf.ca wrote:
>
> > I'll just point out that the markdown=1 trick should be credited to John 
> > Gruber.
>
> Thanks. Sorry for getting that wrong.

It's normal to get it wrong. I wrote the first implementation in an 
experimental branch of PHP Markdown; John [suggested a syntax][1] different 
from mine, and I adopted it. The experimental branch later became PHP Markdown 
Extra.

[1]: http://six.pairlist.net/pipermail/markdown-discuss/2004-August/000669.html 

Syntax: Markdown processing within block-level HTML


> > CommonMark interprets it right according to your intent, but the Markdown 
> > spec by John Gruber is very explicit about block-level HTML elements:
> >
> > I think the spec makes it clear that the content of `header` should not be 
> > parsed with the Markdown syntax. As for whether the spec is right or wrong 
> > in that choice, that is another debate entirely.
>
> Indeed. To be clear, I am aware of Gruber’s rule (although I forget it from 
> time-to-time as an author), and was hoping to provoke some discussion with 
> the aim of, yes, rallying around the Common Mark decision here. 
>
> I do not understand why the rule existed in the first place.
>
> Apart from historical reasons, are there other good reasons to avoid Markdown 
> processing within block-level elements?

I can't find the reference, but if I remember right the idea was that HTML 
blocks would often be copy-pasted snippets for embedded videos or similar 
things. In other words, if you paste some HTML in the middle of your text, you 
likely don't want Markdown to interfere with it; it should just work.


> And as far as those historical reasons go, I hope it’s abundantly clear how 
> very silly it is for a hundred non-interoperable implementations to claim the 
> motivation of “compatibility” to shun the change needed for consensus.
>
> Break things. Bump the major version. Be part of an ecosystem that actually 
> works.

PHP Markdown adheres as much as possible to what John Gruber intended Markdown 
to be, based on the spec as well as the code and comments in Markdown.pl, plus 
all the posts John has made while he was still participating on this list and a 
few private emails here and there.

The Markdown Extra variant adds some features and makes only one very small 
breaking change (disallowing middle-word underscore emphasis). Even though the 
most notable features in it have been praised by John (on this list), I feel 
Extra needs to have a distinct name to signal that it's not purely Markdown.

If I write another parser that breaks a few more things (for instance if it 
implemented CommonMark), I'll use a new name to avoid the confusion while 
keeping the old parser around for those who need it. That however wouldn't 
solve the problem of non-interoperable Markdown implementations, unfortunately; 
it'd just create one more parser.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: Flexible Markdown Parser

2014-09-14 Thread Michel Fortin

On 14 Sep 2014 at 6:34, Andrei Fangli andrei_fan...@hotmail.com wrote:

> Hello again,
>
> I’m happy to announce that I have a working parser fully based on regex and a 
> default implementation for Markdown based on Gruber’s Markdown 1.0.1 (Dec 
> 2004) specification. The only thing I could not figure out is the blank line 
> between list items to generate a paragraph in the list item instead of plain 
> text (that is very ambiguous to me and sounds a bit odd) and instead I 
> decided that all list items cannot contain plain text (a very simple 
> approach, I know).

The rule used in Markdown.pl is that if there is a blank line separating two 
list items, both the item above and the item below that blank line will be 
parsed as block content (and have paragraphs). That said, sublists are parsed 
anyway regardless of blank lines.
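
That rule can be sketched in a few lines of Python. This follows the rule as stated here, not Markdown.pl's actual code; `block_items` and its input representation (one flag per gap between consecutive items) are assumptions for illustration:

```python
from typing import List


def block_items(blank_between: List[bool]) -> List[bool]:
    """blank_between[i] is True when a blank line separates item i and item i+1.

    Returns one flag per item: True means the item is parsed as block
    content and gets paragraph tags.
    """
    flags = [False] * (len(blank_between) + 1)
    for i, blank in enumerate(blank_between):
        if blank:
            flags[i] = True      # the item above the blank line
            flags[i + 1] = True  # the item below the blank line
    return flags


print(block_items([False, True, False]))
```

With four items and a single blank line between the second and third, only those two items get paragraphs; the first and last stay plain.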

> A preview site is up and running: 
> http://www.markdownparser.development.andrei15193.ro/
>
> Criticism and comments are welcome.
>
> PS: can someone point me to some test suites you use? I would like more test 
> cases (the ones I wrote are really rudimentary). Also big documents for 
> performance testing would be nice.

You can use [MDTest][], which includes the original MarkdownTest test suite 
from John Gruber as well as my own test suite for PHP Markdown and PHP Markdown 
Extra.

 [MDTest]: https://github.com/michelf/MDTest

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: when rational discussion was still a possibility

2014-09-06 Thread Michel Fortin

On 6 Sep 2014 at 0:16, John MacFarlane j...@berkeley.edu wrote:

> Michel,
>
> What you did at the beginning, I gather, was to port (and then extend)
> an existing implementation, Markdown.pl. The same will be possible with
> CommonMark, which provides two implementations that use the same parsing
> algorithm, one in portable C and one in 1540 lines of javascript (with
> no library dependencies). The javascript implementation doesn't use any
> unusual javascript features and should be straightforward to
> port to other dynamic languages: perl, python, ruby, PHP. (Or you could
> just use the javascript library client-side and skip the server-side
> rendering.) Those who work with compiled languages will be able to use
> the C library directly.
>
> The parsers are both fast and accurate. The original C parser I wrote
> was about as fast as discount. An expert C coder is now working on
> optimizing it and, without changing the algorithm, has managed to make it
> about as fast as sundown, which is very fast indeed (0.01 seconds to
> parse a 1MB document, for example). When optimization is complete, it
> should be even faster. The javascript parser is also very fast (0.28
> seconds for the above-mentioned 1MB document, running in the Chrome
> browser). By comparison, Markdown.pl takes 250 seconds on the same
> input, and pandoc takes 3.19 seconds.

I have no doubt a parser written in C, or even JavaScript (which nowadays gets
executed with JIT compilers) will beat PHP Markdown. I also have no doubt that
your algorithm can be ported to PHP. I have some doubt it'll be fast enough in
PHP.

But regardless of performance, I can't swap my algorithm with your algorithm
and still call it PHP Markdown if it gives significantly different results.
CommonMark does not pass the PHP Markdown test suite, nor does it pass the
original test suite made by John Gruber.

Failing tests from the original test suite:
https://github.com/michelf/mdtest/blob/master/Markdown.mdtest/Hard-wrapped%20paragraphs%20with%20list-like%20lines.text
https://github.com/michelf/mdtest/blob/master/Markdown.mdtest/Literal%20quotes%20in%20titles.text

Failing tests from PHP Markdown test suite:
https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Backslash%20escapes.text
https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Code%20block%20in%20a%20list%20item.text
https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Email%20auto%20links.text
https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Headers.text
https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Ins%20%26%20del.text
https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Tight%20blocks.text

Some of these are obviously bugs on your side you'll likely fix. Some of these
are degenerate cases where I don't really care about the result as long as it
produces valid HTML. But for some there is an obvious intent to produce
something different (and there are probably more of these than the test suite
can catch).

My understanding is that CommonMark is a different flavor of Markdown that
chose to diverge in a couple of small ways from the original. I could obviously
fork it and fix things so they can pass my test suite and John Gruber's test
suite and behave more like the original Markdown behaves, but that's going to
take a lot of time and it'll just create one more flavor situated in between
PHP Markdown and CommonMark. That's not a worthy goal to me.

- - -

With all that said, if I do port CommonMark to PHP I'd probably call it PHP
CommonMark and promote it as an alternative, better defined, Markdown-like
syntax.

--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: when rational discussion was still a possibility

2014-09-06 Thread Michel Fortin

> #email-autolink
> It seems not to allow the international example or the crazier ones
> (with strange symbols and quotes). Probably this should be fixed in our
> spec.

Here is my regex if you're interested:
https://github.com/michelf/php-markdown/blob/lib/Michelf/Markdown.php#L1321

> https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Ins%20%26%20del.text
>
> The spec does not include ins or del among the list of HTML block tags.
> I can't recall where we got this list, and it now seems a mistake.
> Adding these to the list would still yield different output from PHP
> Markdown, because of differences in treatment of HTML blocks,
> but more reasonable output.

PHP Markdown treats them as hybrid block/inline depending on the context. If
alone on its line, ins or del is a block-level tag, otherwise it's a
span-level tag. This is because they can be both in HTML.
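
As a rough illustration of the hybrid rule, here is a Python sketch that classifies a single line. It reads "alone on its line" as "the opening tag is the only thing on the line", which is an assumption; PHP Markdown's actual block parser is more involved, and `ins_del_level` is a hypothetical name.

```python
import re

# An <ins> or <del> opening tag with nothing else on its line.
_ALONE = re.compile(r"^[ \t]*<(ins|del)\b[^>]*>[ \t]*$")
_ANYWHERE = re.compile(r"<(ins|del)\b")


def ins_del_level(line):
    """Return "block", "span", or None when the line has no <ins>/<del> tag."""
    if _ALONE.match(line):
        return "block"
    if _ANYWHERE.search(line):
        return "span"
    return None


print(ins_del_level("<ins>"), ins_del_level("Some <del>old</del> text"))
```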

> https://github.com/michelf/mdtest/blob/master/PHP%20Markdown.mdtest/Code%20block%20in%20a%20list%20item.text
>
> Here I just need to refer you to the extensive discussion in the spec
> of the motivation for the list rules we chose.
> http://jgm.github.io/stmd/spec.html#motivation
> This was one of the hardest things to work out in a (to me) satisfactory
> way. NO choices here will be perfectly backwards compatible with every
> implementation, since they go in so many directions. But I'm pretty confident
> that the choices we've made are better than any of the alternatives I've
> considered. I would be interested to hear your feedback on this!

Are you sure we're talking about the same thing? The issue is that in the
middle item your code block content has a two-space indent while it is indented
by four spaces (minus one for the list marker) in the original. I'd have
expected a four-character indent to produce a code block with no leading space
in its content. One thing that might be confusing here is that the input mixes
tabs and spaces, and tabs often get rendered as an 8-space indent. Is this really
what stmd should be outputting?

http://johnmacfarlane.net/babelmark2/?normalize=1&text=*%09List+Item%3A%0A%0A%09%09code+block%0A%0A%09%09with+a+blank+line%0A%0A%09within+a+list+item.%0A%0A*+++code+block%0A%0Aas+first+element+of+a+list+item%0A%0A*%09List+Item%3A%0A%0A%09%09code+block+with+whitespace+on+preceding+line

--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: On simplifying table syntax in any future markdown extension. (Use CSV)

2014-09-05 Thread Michel Fortin

On 5 Sep 2014 at 5:28, mofo syne mofos...@gmail.com wrote:

> |:- Year -|:- Make -|:- Model  -:|
> 1997, Ford, E350
> 1999, Chevy, Venture Extended Edition 
> 1999, Chevy, Venture Extended Edition 
> 1996, Jeep, Grand Cherokee

Doesn't make much sense to me. I mean, it doesn't look too bad until you get to 
the quoted text part and have to escape quotes using a CSV-style double-quote 
escape instead of the Markdown-style backslash. Also, quoting the whole value 
is required in CSV anytime you have a comma in one of your cells, which isn't a 
rare occurrence.

I think PSV (pipe-separated-value) is better because you're much less likely to 
have pipes in your text. And also it's better to reuse our current escape 
mechanism in the unlikely event you have pipes in your cells. And that brings 
us back to what everyone is already using for tables in Markdown, which aren't 
harder to maintain if you don't care about making the text form pretty.

This is also a perfectly valid Markdown Extra table:

Year|Make|Model
----|----|-----
1997|Ford|E350
1999|Chevy|Venture Extended Edition
1999|Chevy|Venture Extended Edition
1996|Jeep|Grand Cherokee
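
To show why pipes plus the existing backslash escape are simple to handle, here is a minimal Python sketch that splits one such row into cells. It is an illustration only, not PHP Markdown Extra's code; `split_row` is a hypothetical name, and leading or trailing pipes are deliberately not stripped.

```python
def split_row(line):
    """Split a pipe-separated row into trimmed cells; a backslash escapes a pipe."""
    cells, current = [], []
    chars = iter(line.strip())
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            # An escaped pipe becomes a literal pipe; other escapes are kept as-is.
            current.append("|" if nxt == "|" else ch + nxt)
        elif ch == "|":
            cells.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    cells.append("".join(current).strip())
    return cells


print(split_row("1999|Chevy|Venture, Extended Edition"))
```

Commas need no special treatment at all, and the rare literal pipe is written `\|`, so no cell ever has to be quoted as a whole.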


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: On simplifying table syntax in any future markdown extension. (Use CSV)

2014-09-05 Thread Michel Fortin

On 5 Sep 2014 at 13:34, mofo syne mofos...@gmail.com wrote:

> I see. So it's a bit too idealistic in terms of the practicality of 
> implementing csv inspired tables.
>
> How about Tom's other second and third examples? Both use pipes and I preferred 
> the second version. However the - is mistaken for a h1 header 
> according to babelmark. Second examples of a single |-| means only the 
> first row of the cell data is displayed or pandoc.
>
> I'm guessing the biggest issue about implementing with -- is that 
> parsing such tables requires processing the context.

Actually, if you start with a pipe character on each line, you don't need to 
add more pipes on the separator line with Markdown Extra. Tom's last suggestion 
already works, and so does this one:

| header | header | header
| ------------------------
| cell | cell | cell

Well, for PHP Markdown Extra and a couple more at least.

http://johnmacfarlane.net/babelmark2/?normalize=1&text=%7C+header+%7C+header+%7C+header%0A%7C+%0A%7C+cell+%7C+cell+%7C+cell

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: when rational discussion was still a possibility

2014-09-05 Thread Michel Fortin

On 5 Sep 2014 at 15:07, bowerbird via Markdown-Discuss 
markdown-discuss@six.pairlist.net wrote:

> i am also aware that even if it _could_ happen here,
> it'll never be the case that _i_ would be permitted to
> start it. that's not something an outsider is allowed.

I'm not sure where you get that impression, but if you're allowed to write 
poetry here then you're certainly allowed to start a discussion about something 
Markdown related.


> (but hey, if you want to play, chew on this for a bit:
> could we eliminate ambiguities and inconsistencies,

After years of thinking about it, I don't think this is actually possible.


> while still retaining the flexibility that gruber desires?
> and if so, how exactly would we go about doing that?
> here's a good hint: do not let atwood lead the effort.)

Irrelevant since the first part is impossible.


A lot of people want to have a Markdown parser in their own programming 
language, tailored to their own needs, and they really often want to be able to 
extend it to cover their own use case.

Because of differences between the tools available in each language and 
programming environment, different implementations use different regular 
expression engines, or formal grammars, or they put all the parsing logic in 
code. This variety of tools makes the tradeoffs different for every implementer.

Often, something that's easy to implement one way with one set of tools will be 
hard to implement similarly with another set of tools, and vice versa. Getting 
all the edge cases to parse the same way across dissimilar implementations is a 
noble goal, but it has to be weighed against other goals such as having good 
parsing performance, ease of maintainability, and ease of adding extra features 
tailored to some specific needs. A parser that sacrifices those last three 
things to conform to some standard will only fill the niche of those seeking 
everything to be parsed the same everywhere.

From an implementer's point of view, unless a detailed standard is written as a 
description of what your own parser does, you'll have to spend a lot of time 
tweaking things to match that standard. By a lot of time I mean more than 
what it took you to implement the parser in the first place. Remember, it's 
always getting the last 20% that takes 80% of the time. So you'll spend a lot 
of time to achieve some dubious tradeoff. Implementers have better things to 
do with their time.

So my conclusion is that if you want one or another Markdown flavor to become 
the standard, you need to find a way for its implementation to be included 
everywhere. But with all the diverse language ecosystems we have, and with the 
varied needs of different communities using Markdown, that seems quite 
difficult to achieve. I'd call that impossible.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Talk about Markdown in The Talk Show

2014-07-21 Thread Michel Fortin

There's a short discussion about Markdown between Marco Arment and John Gruber 
in The Talk Show, episode 88: 'Cat Pictures', With Marco Arment (Side 1) 
starting at time 1:15:13. I don't have anything particular to say about it, but 
it's probably the closest thing to an official response you can get on some 
things that were discussed on this list so I felt it was worth a link:

https://soundcloud.com/thetalkshow/ep-88-cat-pictures-side-1#t=1:15:13

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: Punchline: variants and processor (text/markdown)

2014-07-15 Thread Michel Fortin

On 15 Jul 2014 at 3:20, Sean Leonard dev+i...@seantek.com wrote:

> IANA would create a sub-registry of processors. Each registry entry must 
> contain the processor name (identifier), the full name of the tool (if it 
> differs from the processor name), the authors or maintainers, and any URL or 
> other address at which to locate the processor tool and documentation. 
> Optionally, versions and processor-specific arguments can be documented in 
> the registry entry.

...

> IANA would create a sub-registry of rulesets for the variants parameter. Each 
> registry entry must include the ruleset identifier, a formal description of 
> the rules, and identification of included rulesets. Optionally the entry may 
> describe processors (including versions and arguments) that are known to 
> implement the ruleset.
>
> Each ruleset identifier shall uniquely identify that set of rules. I.e., if 
> fenced_code_blocks is registered, guarded_code_blocks cannot be 
> registered if the effective rules in guarded_code_blocks are the same as 
> fenced_code_blocks.

But how does a document get annotated with the attributes in the first place? 
Who chooses the processor and variant attributes of a document and based on 
what? And where is it stored? Do you have any specific example of how that 
could work in any given setup?

My impression is that all this is going to do is define some metadata flags 
that no one will use.

What is the goal here? Is the goal to have most Markdown documents on the 
internet be annotated in this way so some browser software can pick 
automatically a sort-of compatible implementation for a given document? Or is 
it a way to have inside a given system (a CMS for instance) a way to annotate 
which Markdown implementation to use internally to parse a specific document?


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: Markdown validity Re: Agreeing on Historical Markdown

2014-07-12 Thread Michel Fortin

On 12 Jul 2014 at 10:32, Sean Leonard dev+i...@seantek.com wrote:

> Markdown may have a concept of HTML validity. A Markdown processor that 
> identifies HTML in Markdown content may determine that the HTML is valid or 
> invalid. For example, it may identify <div> ... [end of document] as HTML 
> that is invalid because it lacks a closing </div> tag. Then, it has five 
> choices:
> 1. treat the invalid HTML as text--pass the text-as-text to the markup (i.e., 
> turn & into &amp; , < into &lt; , etc.)
> 2. treat the invalid HTML as Markdown--keep on processing the input and look 
> for markdown inside of it (thus *hello* inside the invalid HTML will get 
> marked up...and <div><a href="http://www.example.com/">hello</a>[end of 
> document] will become a real link with the literal text '<div>' preceding it)
>  -- this is the same behavior as not identifying the text as HTML in the 
> first place
> 3. pass the invalid HTML as HTML
> 4. attempt to fix the HTML...thus <div><a 
> href="http://www.example.com/">hello</a>[end of document] might become 
> <div><a href="http://www.example.com/">hello</a></div>
> 5. fail due to HTML invalidity
>
> ?

Is that really a question?

1. Turning `&` and `<` into `&amp;` and `&lt;` is part of the official syntax 
rules. Hopefully every Markdown parser does that.

2. 3. 4. 5. We have implementations doing all of that, probably mixing a few of 
those solutions depending on the exact error.
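
Point 1 is the easy part to pin down. A minimal Python sketch of the rule (escape `&` unless it starts an entity, escape `<` unless it looks like the start of a tag), modeled on the substitutions in Markdown.pl; the exact patterns here are an approximation, not a verified copy, and `encode_amps_and_angles` is just an illustrative name:

```python
import re


def encode_amps_and_angles(text):
    # Escape "&" unless it already starts an entity such as &copy; or &#169;.
    text = re.sub(r"&(?!#?[xX]?(?:[0-9a-fA-F]+|\w+);)", "&amp;", text)
    # Escape "<" unless it looks like the start of a tag, comment, or instruction.
    text = re.sub(r"<(?![a-z/?$!])", "&lt;", text, flags=re.IGNORECASE)
    return text


print(encode_amps_and_angles("AT&T, 4 < 5, &copy; and <div>"))
```

Note that an unclosed `<div>` passes through untouched here, which is exactly where choices 2 to 5 start to diverge between implementations.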

When you have a question like this, just try it on Babelmark 2:
http://johnmacfarlane.net/babelmark2/?normalize=1&text=%3Cdiv%3E


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: Agreeing on Historical Markdown (was: Re: text/markdown effort in IETF (invite))

2014-07-11 Thread Michel Fortin

On 11 Jul 2014 at 4:54, Sean Leonard dev+i...@seantek.com wrote:

> Since we cannot reach consensus on what ought to be Standard Markdown 
> today, can the community reach consensus on Historical Markdown--of which I 
> propose three working definitions?
>
> * Classic Markdown: The Markdown syntax or Markdown.pl implementation, as 
> implemented by John Gruber, in 1.0.1, with all ambiguities, bugs, 
> frustrations, and contradictions. [In cases that the syntax and the tool 
> contradict, we come up with a way to resolve the contradictions.]
>
> * Original Markdown: The Markdown syntax or Markdown.pl implementation, as 
> implemented by John Gruber, in 1.0.2b7, with as many of the ambiguities, 
> bugs, frustrations, and contradictions fixed as he actually fixed (or failed 
> to fix) them. Aka Markdown Web Dingus.
>
> * Idealized Markdown (aka Historical Standard Markdown): The Markdown that 
> everyone can agree is the way Markdown should have been back when there was 
> One True Markdown. Basically this is Original Markdown with its faults duly 
> recognized and corrected...many of these faults having been corrected in 
> practice in divergent implementations (Markdown Extra etc.) but never 
> officially recognized in Original Markdown.
>
>
> I cannot say which of these three is better...but by recognizing these three 
> as common points, we can then start to compare on the same page.

You might also call the first two Markdown 1.0.1 and Markdown 1.0.2b7 for 
simplicity's sake. As for the idealized version, that's what I call Markdown 
personally, or plain Markdown when I need to disambiguate.

Wasn't 1.0.2b8 the last one though? Why is the Dingus running 1.0.2b7? 
Babelmark 2 has 1.0.2b8.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: Agreeing on Historical Markdown

2014-07-11 Thread Michel Fortin

On 11 Jul 2014 at 6:08, Sean Leonard dev+i...@seantek.com wrote:

> On 7/11/2014 3:04 AM, Michel Fortin wrote:
> > You might also call the first two Markdown 1.0.1 and Markdown 1.0.2b7
> > for simplicity's sake. As for the idealized version, that's what I call
> > Markdown personally, or plain Markdown when I need to disambiguate.
>
> Ok; however, I understand that there are some differences between the syntax
> http://daringfireball.net/projects/markdown/syntax and the 1.0.1
> implementation. Maybe also the 1.0.2b[x] implementation(s). Right?

In the 1.0.2 beta branch the HTML block parser supports the markdown=1
attribute, but also introduces some regressions; the shortcut reference links
were added; there has been some hacky bug fixing regarding code span-like
things in the attributes of HTML tags (but I'll argue it's just shifting
the errors to somewhere else). The version history is right there if you want
the differences between 1.0.1 and 1.0.2b[x] (looks like someone posted 1.0.2b8
on Github for convenience):
https://github.com/mayoff/Mathdown/blob/master/Markdown.pl#L1529

The syntax page is documenting the 1.0.1 features. Parsing of list indentation
doesn't work exactly as described in that document however. The first point of
this answer in the Babelmark 2 FAQ gives more details:
http://johnmacfarlane.net/babelmark2/faq.html#what-are-some-big-questions-that-the-markdown-spec-does-not-answer

Besides that, the document makes many simplifications to make it easier to
understand from a user perspective. It is not really an implementer's guide.

--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: text/markdown effort in IETF (invite)

2014-07-10 Thread Michel Fortin

On 10 Jul 2014 at 5:00, Sean Leonard dev+i...@seantek.com wrote:

> I haven't tried it yet, but I suspect PHP Markdown is mostly encoding 
> agnostic only for most encodings that preserve the US-ASCII range.

That's what I meant by mostly encoding agnostic. It'll work with ASCII and 
most European encoding schemes because they are ASCII-compatible, but anything 
more fancy than that will have to be UTF-8.
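
Tab expansion is where this shows up in practice, since the column count depends on what counts as one character. A minimal Python sketch of the idea, working on raw bytes: a valid UTF-8 sequence counts as one column, and any other byte counts as one column by itself. This is an illustration, not PHP Markdown's code; the tab width of 4 and the name `expand_tabs` are assumptions, and the pattern does not reject overlong or surrogate encodings.

```python
import re

# One UTF-8 sequence if the bytes form one, otherwise a single byte.
_CHAR = re.compile(
    rb"[\x00-\x7F]|[\xC2-\xDF][\x80-\xBF]|[\xE0-\xEF][\x80-\xBF]{2}"
    rb"|[\xF0-\xF4][\x80-\xBF]{3}|[\x80-\xFF]"
)


def expand_tabs(line: bytes, tab_width: int = 4) -> bytes:
    """Replace each tab with spaces up to the next multiple of tab_width."""
    out = bytearray()
    column = 0
    for match in _CHAR.finditer(line):
        char = match.group()
        if char == b"\t":
            pad = tab_width - column % tab_width
            out += b" " * pad
            column += pad
        else:
            out += char
            column += 1  # one column per character, however many bytes it has
    return bytes(out)


print(expand_tabs("é\tx".encode("utf-8")))
print(expand_tabs("é\tx".encode("latin-1")))
```

Both encodings of "é" end up occupying one column, so the tab is padded the same way in either case; a parser counting bytes would pad the UTF-8 version one space short.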

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: text/markdown effort in IETF (invite)

2014-07-10 Thread Michel Fortin

On 10 Jul 2014 at 1:04, John MacFarlane j...@berkeley.edu wrote:

> +++ Michel Fortin [Jul 09 14 18:07 ]:
>
> > Fun fact: PHP Markdown is mostly encoding agnostic. It understands UTF-8 
> > sequences but any byte that is not a valid UTF-8 sequence is treated as a 
> > character in itself. It's only relevant when converting tabs into spaces 
> > however, and only if you have non-ASCII characters before the tab.
>
> Small amendment: There are at least two places where the difference
> between utf-8 and latin1 matters:  tab expansion (as you note) and
> reference links, since these are stipulated to be case insensitive.
> (Case conversion is sensitive to the encoding.)

Like Markdown.pl, PHP Markdown will just treat non-ASCII characters in a 
case-sensitive way so in my case it doesn't matter.

Also, if you want to compare characters in a case-insensitive manner, the most
correct way to do it is to use the Unicode Collation Algorithm, not case
conversion to lower or uppercase, because some characters can't round-trip (see
[german ß]). Then you'll notice that unfortunately Unicode collation is locale
dependent (because equivalent characters aren't the same in all locales, see
the [turkish ı]). And then you'll realize there's no correct way to do it
universally.

 [GERMAN SS]: https://en.wikipedia.org/wiki/ß
 [TURKISH I]: https://en.wikipedia.org/wiki/Turkish_dotted_and_dotless_I

On Babelmark I see that cheapskate 0.1.0.1 understands the first link above --
good job! -- and no one understands the second one.

http://johnmacfarlane.net/babelmark2/?normalize=1text=Also%2C+if+you+want+to+compare+characters+in+a+case-sensitive+manner%2C+the+most+correct+way+to+do+it+is+to+use+the+Unicode+Collation+Algorithm+--+not+case+conversion+to+lower+or+uppercase+--+because+some+characters+can't+round-trip+(see+%5Bgerman+ß%5D).+Then+you'll+notice+that+unfortunately+Unicode+collation+is+locale+dependent+(because+equivalent+characters+aren't+the+same+in+all+locales%2C+see+the+%5Bturkish+ı%5D).+And+then+you'll+realize+there's+not+really+a+correct+way+to+do+it.%0A%0A+%5BGERMAN+SS%5D%3A+https%3A%2F%2Fen.wikipedia.org%2Fwiki%2Fß%0A+%5BTURKISH+I%5D%3A+https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FTurkish_dotted_and_dotless_I%0A
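
To make the round-trip problem concrete, here is a small sketch in Python
(chosen only for illustration; it is not what Markdown.pl or PHP Markdown do).
It uses nothing but the built-in `str.upper`, `str.lower` and `str.casefold`,
which apply the default, locale-independent Unicode case mappings. The two
helper names are made up for this example.

```python
# Illustration only: default (locale-independent) Unicode case mappings.

def same_label_lower(a: str, b: str) -> bool:
    """Naive case-insensitive comparison via lowercasing."""
    return a.lower() == b.lower()

def same_label_casefold(a: str, b: str) -> bool:
    """Comparison via Unicode case folding, still locale-independent."""
    return a.casefold() == b.casefold()

# German sharp s: uppercasing gives "SS", which lowercases to "ss", not "ß".
print("ß".upper(), "ß".upper().lower())
print(same_label_lower("german ß", "GERMAN SS"))      # False
print(same_label_casefold("german ß", "GERMAN SS"))   # True

# Turkish dotless i: it uppercases to "I", but "I" lowercases to dotted "i".
print("ı".upper(), "ı".upper().lower())
print(same_label_lower("turkish ı", "TURKISH I"))     # False
print(same_label_casefold("turkish ı", "TURKISH I"))  # False
```

Case folding rescues the first label but not the second; matching the dotless
i to "I" needs locale-specific rules, which is exactly the problem described
above.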

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: text/markdown effort in IETF (invite)

2014-07-09 Thread Michel Fortin

On 9 Jul 2014, at 11:49, Sean Leonard dev+i...@seantek.com wrote:

 Hi markdown-discuss Folks:
 
 I am working on a Markdown effort in the Internet Engineering Task Force, to 
 standardize on text/markdown as the Internet media type for all variations 
 of Markdown content. You can read my draft here: 
 http://tools.ietf.org/html/draft-seantek-text-markdown-media-type-00.
 
 The proposal is already getting traction. Is there anyone on this list that 
 is interested in participating or helping this effort? In particular we need 
 to better understand and document what versions of Markdown exist, so that 
 either Markdown as a family of informal syntaxes will start to converge, or 
 if not, that Markdown variations have an easy way to be distinguished from 
 one another. (See the flavor parameter discussed in the draft.)

The flavor parameter is a good idea in theory. I'm not sure it'll be very 
useful in general though. Nobody is going to annotate their file with the right 
flavor unless there's a tangible benefit, and I don't see what the benefit 
could be. Software that could do something useful with markdown-identified 
content will likely ignore the flavor part when parsing because no one wants to 
see incompatible flavor errors, especially when commonly used parts of the 
syntax are compatible anyway.

Markdown is in the spot where HTML was before HTML5 with each implementation 
doing its own thing. I don't know if Markdown will get out of there anytime 
soon. I'll point out however that HTML never got anything like a flavor 
parameter in its MIME type, and even if it did it'd not have helped clear the 
mess in any way.

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: text/markdown effort in IETF (invite)

2014-07-09 Thread Michel Fortin

On 9 Jul 2014, at 16:08, Sean Leonard dev+i...@seantek.com wrote:

The operating question is: What metadata (companion data) is /necessary/ to
reflect the creator's intent with respect to the data?

With Markdown, I think the answer is: you need the character set, and you
need to know how to turn the text into HTML (or XHTML, PDF, RTF, MS
Word/Office Open XML, or whatever).

Indeed.

Markdown has no way to communicate the character set in the document (other
than the Unicode Byte Order Marks, which is a generalized property about text
streams, not specific to Markdown)--and it would be counterproductive to
invent one. So that is a perfect example of relevant metadata.

Fun fact: PHP Markdown is mostly encoding agnostic. It understands UTF-8
sequences but any byte that is not a valid UTF-8 sequence is treated as a
character in itself. It's only relevant when converting tabs into spaces
however, and only if you have non-ASCII characters before the tab.

So whatever the input encoding is becomes the output's encoding (this works for
HTML). Naturally, it's good to know the input's encoding if you want to know
the output's. So obviously it's a good idea to specify the text encoding even
though the parser itself doesn't need it, so you know the resulting document's
encoding.
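
As a rough illustration of that behaviour (a Python sketch, not PHP Markdown's
actual code), the `surrogateescape` error handler gives a similar effect: valid
UTF-8 sequences count as one character, any other byte counts as one character
by itself, and the bytes come out unchanged. The function name and the tab
width of 4 are assumptions made for this example.

```python
TAB_WIDTH = 4  # assumed tab width for this sketch

def expand_tabs_bytes(line: bytes) -> bytes:
    # Valid UTF-8 sequences decode to one character each; every byte that is
    # not part of a valid sequence becomes one placeholder character.
    text = line.decode("utf-8", errors="surrogateescape")
    expanded = text.expandtabs(TAB_WIDTH)
    # Encoding with the same error handler restores the original bytes.
    return expanded.encode("utf-8", errors="surrogateescape")

utf8_line = "café\tx".encode("utf-8")      # é is two bytes
latin1_line = "café\tx".encode("latin-1")  # é is one byte, invalid as UTF-8

print(expand_tabs_bytes(utf8_line))    # four columns before the tab
print(expand_tabs_bytes(latin1_line))  # also four columns before the tab
```

In both cases the tab expands to the same number of spaces, and the output is
in whatever encoding the input was in.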

That's not really relevant though.

And the second one, is how to turn it into something else that the author
wants. If it's not communicated, it's going to be implied. Implied means
guessing and likely guessing wrong.

Ideally you'd use the exact same version of the same parser the author used to
interpret the document in the first place.

Or you could be loose and use another version of the same parser.
Or you could be loose and use another parser claiming to be of the same flavor.
Or you could be loose and use another parser claiming to be of a superset of
the given flavor.
Or you could be loose and use another Markdown parser.

It's a spectrum. Each step down will increase the likeliness of something going
wrong.

Hopefully this makes sense. I want to be more educated about this.

This makes perfect sense, but I fear there's no good answer to your second
question. Since you want to know more, here's some insight.

It's important to understand that there is no notion of invalid Markdown input.
As an implementer every time you fix what looks like a parsing bug to you or
add a feature you're also breaking some valid input that was producing
something else before. The implementer will usually only choose to break valid
input that was deemed very unlikely to ever have been used before, but there's
no way to know for sure (and no reliable way to measure impact either). So if
you really really want to be sure things are parsed in the intended way, you
should use the closest version possible of the same parser as the creator of
the document was using.

Also, subtle changes can make things technically incompatible. For instance,
Markdown Extra is mostly a superset of the original Markdown feature-wise,
except for one small incompatible change: underscore emphasis within a word is
disallowed. This was a deliberate change to fix some problems users were having
with words that contained underscore. So even though most people would consider
Markdown Extra as a superset of Markdown, it technically isn't. Other
implementers might do the same thing but consider it a bug fix instead and
tell their users that their implementation implements the original syntax.

Babelmark 2 will tell you that implementations are pretty much evenly split on
this:
http://johnmacfarlane.net/babelmark2/?normalize=1text=word_with_emphasis

You'll even see that Pandoc implements both behaviours depending on whether
you're in strict mode or not.
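
A minimal sketch of the two behaviours, in Python. Both regular expressions
are hypothetical and written only for this example; they are not taken from
Markdown.pl, PHP Markdown Extra or Pandoc.

```python
import re

# Hypothetical patterns, written for this example only.
# 1. Underscore emphasis allowed anywhere, including inside a word.
ORIGINAL_EM = re.compile(r"_(?=\S)(.+?)(?<=\S)_")
# 2. Same, but the underscores may not touch a letter or digit on the outside.
NO_INTRAWORD_EM = re.compile(
    r"(?<![A-Za-z0-9])_(?=\S)(.+?)(?<=\S)_(?![A-Za-z0-9])"
)

def render(pattern, text):
    """Replace every emphasis match with an <em> element."""
    return pattern.sub(r"<em>\1</em>", text)

for sample in ["word_with_emphasis", "an _emphasized_ word"]:
    print(render(ORIGINAL_EM, sample))
    print(render(NO_INTRAWORD_EM, sample))
```

Both patterns agree on the second sample; only the first one is rendered
differently, which is the kind of input that makes one flavor technically
incompatible with the other.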

Something stranger happens with the shortcut reference syntax:
http://johnmacfarlane.net/babelmark2/?normalize=1text=%5Blink%3F%5D%0A%0A%5Blink%3F%5D%3A+http%3A%2F%2Flink.x%2F

It's pretty much universally supported. It comes from a Markdown.pl beta that
was never formally released but which is widely in use. If you were to go to
the Markdown website and use the download link, you won't get the beta and it
won't work. And while the second one is feature-wise a superset of the first,
technically it could in some rare situations break documents, turning square
bracketed text into links where it shouldn't:

Someone on [street Ivanhoe Carol][sIC] told me this:

This is bad [sic].

[sIC]: http://sic.sickdomain
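
Here is a toy resolver in Python showing why the example above changes
meaning. It is only a sketch under simplifying assumptions (no escapes, no
nested brackets, labels matched case-insensitively), and the function name and
`allow_shortcut` flag are invented for the illustration.

```python
import re

def render_links(text, refs, allow_shortcut):
    """Resolve [text][label] links and, optionally, shortcut [label] links."""
    refs = {label.lower(): url for label, url in refs.items()}

    def full(m):
        # An empty second pair of brackets means "use the link text as label".
        label = (m.group(2) or m.group(1)).lower()
        url = refs.get(label)
        return f'<a href="{url}">{m.group(1)}</a>' if url else m.group(0)

    text = re.sub(r"\[([^\]\[]+)\]\[([^\]\[]*)\]", full, text)

    if allow_shortcut:
        def shortcut(m):
            url = refs.get(m.group(1).lower())
            return f'<a href="{url}">{m.group(1)}</a>' if url else m.group(0)

        # A lone [label] not attached to another bracket pair or a "(".
        text = re.sub(r"(?<!\])\[([^\]\[]+)\](?![\[(])", shortcut, text)
    return text

doc = "Someone on [street Ivanhoe Carol][sIC] told me this:\n\nThis is bad [sic]."
refs = {"sIC": "http://sic.sickdomain"}

print(render_links(doc, refs, allow_shortcut=False))
print(render_links(doc, refs, allow_shortcut=True))
```

Without the shortcut syntax `[sic]` stays as plain text; with it, the same
input silently gains a second link.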

I sure wish things would be simpler. But as things are now, I have a hard time
identifying what flavor could mean. Should Markdown.pl-1.0.1 be a flavor on
its own?

--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

[ANN] PHP Markdown Lib 1.4.1

2014-05-04 Thread Michel Fortin

This is a bug fix release for PHP Markdown Lib. You can download it from here:
http://michelf.ca/projects/php-markdown

PHP Markdown Lib 1.4.1 (4 May 2014)

*   The HTML block parser will now treat `figure` as a block-level element
(as it should) and no longer wrap it in `p` or parse its content as
Markdown syntax (although with Extra you can use `markdown=1`
if you wish to use the Markdown syntax inside it).

*   The content of `style` elements will now be left alone; it
won't be interpreted as Markdown.

*   Corrected a bug where some inline links with spaces in them would not
work even when surrounded with angle brackets:

[link](<s p a c e s>)

*   Fixed an issue where email addresses with quotes in them would not always
have the quotes escaped in the link attribute, causing broken links (and
invalid HTML).

*   Fixed the case where a link definition following a footnote definition would
be swallowed by the footnote unless it was separated by a blank line.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: inline html - usemap

2014-02-10 Thread Michel Fortin

On 9 Feb 2014, at 21:56, Peter Watts pe...@nguyenwatts.com wrote:

 The href is being removed

Probably some kind of over-zealous filter for bad HTML. You should report this 
to Github; I'm pretty sure they aren't listening here.

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

[ANN] PHP Markdown Lib 1.4.0, PHP Markdown Extra 1.2.8, PHP Markdown 1.0.2

2013-11-29 Thread Michel Fortin

This is a new update to PHP Markdown. The most notable feature is that Markdown 
Extra now supports Github-flavored backtick fenced code blocks. You can 
download it from here:
http://michelf.ca/projects/php-markdown/


Changes since last version
--

In PHP Markdown Lib 1.4.0, PHP Markdown Extra 1.2.8, and PHP Markdown 1.0.2:

*   Added support for the `tel:` URL scheme in automatic links.

<tel:+1-111-111->

It gets converted to this (note the `tel:` prefix becomes invisible):

<a href="tel:+1-111-111-">+1-111-111-</a>


In PHP Markdown Lib 1.4.0 and PHP Markdown Extra 1.2.8:

*   Added backtick fenced code blocks to MarkdownExtra, originally from
Github-Flavored Markdown.


In PHP Markdown Lib 1.4.0:

*   Added an interface called MarkdownInterface implemented by both 
the Markdown and MarkdownExtra parsers. You can use the interface if
you want to create a mockup parser object for unit testing.

*   For those of you who cannot use class autoloading, you can now
include `Michelf/Markdown.inc.php` or `Michelf/MarkdownExtra.inc.php`
(note the `.inc.php` extension) to automatically include other files required
by the parser.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: [ANN] vfmd

2013-10-05 Thread Michel Fortin

On 5 Oct 2013, at 2:45, Roopesh Chander r...@forwardbias.in wrote:

 Let us consider the example I gave in my last mail, which to me looks like
 a practical use case. Let me elaborate on it here:
 
 1. Consider a user who sets his text editor to keep tabs as is (and not
 expand it to spaces)
 2. This chap wants a code block inside a blockquote in his Markdown
 document
 3. This is the first time he's tried to write a code block inside a
 blockquote, so he consults the user doc
 4. The user doc says:
 (a) For code blocks, indent each line with 4 spaces or 1 tab
 (b) For blockquoting, start the line with '>' followed by an optional
 space..
 (c) The doc gives examples of blockquotes containing code blocks that
 use 4 spaces after the '>' of a blockquote
 5. After reading the doc, the user naturally writes
 (.=space;_=tab;tabstop=4):
 .__code block
 (or)
 ___code block
 
 Would you consider the above example hypothetical - something that cannot
 happen practically? If yes, which step(s) above would you term as
 impractical?
 

 That is:
 1. Is it impractical to assume that a user would set noexpandtab? (Y/N)
 2. Is it impractical that the user would want to have a code block inside
 a blockquote? (Y/N)
 and so on.

What would be quite surprising is a user who wrote this without checking the 
HTML result. Especially since visually in the editor it doesn't look like there 
is enough indentation. If someone wanted it to work that way, I'd probably have
received a complaint by now. Instead, what happens is that users in this
situation fix their documents so the Markdown parser would convert it to HTML 
correctly (adding the necessary spaces).

What should be learned from web browsers and HTML specs is that people don't 
write against the spec, they write by looking at the result produced by their 
tools. For Markdown, many don't even look at the produced HTML, they just check 
that it looks right in their browser.


 Frankly, I don't know whether it's biting anyone out there. But there is an
 inherent value in being self-consistent. The parser should behave in a way
 that is consistent with the user doc. This is violated in the example given
 above.

Fix the doc then! It'll be self-consistent.

It's a much less risky move to fix the user doc to match what the parsers have 
been doing for 10 years than change the parsers to accommodate the doc (and 
potentially have everyone go back and fix 10 years worth of documents).


 Are you talking about copying code from a code editor (say XCode), where
 we're using tab-indentation in some lines and space-indentation in some
 other lines?

That, and perhaps also stray spaces within tab indentation. It looks right so 
nobody sees it, and then it gets copy-pasted and Markdown would try to be smart 
and break everything apart? No thanks.


 In your earlier mail, you had said that since browsers don't all show tabs
 in a consistent way inside a `pre`, it's much better if they get
 converted to spaces. My response was browsers among themselves are
 consistent.

Actually, I think browsers were not always that consistent with regard to tabs. 
But if they are now, then that's good. I guess I'm not up to date with that.


 I agree that browsers use 8 column tabs and many Markdown implementations
 use 4 column tabs to convert to spaces. But if we leave tabs intact in the
 HTML output, then Markdown and the browsers will be consistent, so all will
 be well? Maybe explaining using an example use case would help here (maybe
 we are copying from XCode and pasting it into our doc, or maybe we are
 copying code from a browser)?

But most users don't want to know about the HTML output! People care about what 
they see in their browsers, and if the browser converts their 4-space indents 
to 8-space indents they'll complain. Even more so if it worked correctly before 
and they're now upgrading to a newer parser that changes the behaviour and
breaks old documents.


 As I said earlier, I hadn't understood what you'd meant when you said I
 didn't think tabs inside code blocks were into question here, are they?.
 Please explain (if you think it's worth discussing, that is).
 
 Also, I had requested that you provide a few examples of inputs that would
 break if we change handling of tabs. If you have such examples, please
 provide them.

Probably a good portion of the pages on my website. I could dig them out if
necessary, but right now I don't see the necessity. I have yet to see why your 
changes are needed. In fact, you've just admitted that those are all 
hypothetical problems, so I feel like you're wasting my time here.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Github-style fenced code blocks

2013-09-27 Thread Michel Fortin

I get a lot of requests for Github-style fenced code blocks in PHP Markdown 
Extra. While I despise the syntax -- it also happens to be a valid code span!
-- I wonder whether I should relent on this. It seems to be bothering a lot of 
people (even those who know about the tilde-based fenced code block syntax).

In Github-Flavored Markdown, a code block works like this:

```php
some php code
```

Replace those backticks with tildes and you get a valid fenced code block in 
PHP Markdown Extra:

~~~php
some php code
~~~

Of course, now Github also supports tildes for fenced code blocks. But their
documentation only mentions the backtick-based syntax.
https://help.github.com/articles/github-flavored-markdown

If you take a look at Babelmark 2, it seems that most implementations supporting
one also support the other.
http://johnmacfarlane.net/babelmark2/?normalize=1text=```php%0Asome+php+code%0A```%0A
http://johnmacfarlane.net/babelmark2/?normalize=1text=~~~php%0Asome+php+code%0A~~~%0A
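
For what it's worth, accepting both fence styles in one pass is not much work.
The Python sketch below is only an assumption about how one could do it; the
pattern and function name are invented here and ignore details such as
indentation or attributes, so it is not how PHP Markdown Extra or Github
actually parse fences.

```python
import re

# Hypothetical pattern: a fence is 3+ backticks or 3+ tildes, and the closing
# fence must repeat the opening one exactly.
FENCE = re.compile(
    r"^(`{3,}|~{3,})[ ]*(\w+)?[ ]*\n"   # opening fence and optional language
    r"(.*?)\n"                           # block content (non-greedy)
    r"\1[ ]*$",                          # closing fence identical to the opening
    re.MULTILINE | re.DOTALL,
)

def find_fenced_blocks(text):
    """Return (language, code) pairs for every fenced block found."""
    return [(m.group(2), m.group(3)) for m in FENCE.finditer(text)]

print(find_fenced_blocks("```php\nsome php code\n```\n"))
print(find_fenced_blocks("~~~php\nsome php code\n~~~\n"))
```

Because the closing fence has to match the opening one, a block opened with
backticks is not closed by a line of tildes, and vice versa.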

I wondered if some of you have any opinion to share on this.

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: [ANN] vfmd

2013-09-27 Thread Michel Fortin

On 27 Sep 2013, at 2:54, Roopesh Chander r...@forwardbias.in wrote:

 A U+0009 (TAB) character in the input shall be treated as four
 consecutive U+0020 (SPACE) characters.
 
 No. That's often not the case. If I write * followed by tab to begin a 
 line,
 that tab represents three spaces, not four. The number of spaces represented
 by a tab is 4-(column_number modulo 4). But you probably knew that. ;-)
 
 In vfmd, a TAB character is always 4 spaces. It's not like pressing TAB in
 MS Word. For the same reason, from a user perspective, it's better to use
 spaces instead of TABs to separate the list bullet char (`*`) from the list
 item.

Ok, so that's deliberate. I don't understand the motivation though. From the 
user perspective, that's not how tab works, nor how Markdown has done things 
for the last decade.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Islands of talk (was [ANN] vfmd)

2013-09-27 Thread Michel Fortin

On 27 Sep 2013, at 2:54, Roopesh Chander r...@forwardbias.in wrote:

 I think it's a good idea to track problems found using GitHub issues
 instead of mails - it's easier that way to (1) stay focused on the issue
 and (2) locate the discussion in the future.

You're probably right. Those two points are true.

I'm going to make a more general comment though. This list is followed by many 
Markdown implementers and users. It is a good place to discuss the Markdown 
syntax and have people raise a flag whenever something conflicting pops up or 
to have many eyes review an issue.

But I can't help but wonder if every implementation having its own separate
issue tracker with separate discussions is healthy for Markdown. Of course, all
implementations cannot share the same issue tracker, but it seems to me
that this is moving the talk about the syntax to multiple islands scattered all
around the Internet. At least that's my experience with PHP Markdown having its
own issue tracker. I fear that this is reducing awareness among implementers of
what is happening with other implementations, and this might be contributing to
fragmentation.

On the flip side, having too many people discuss pointless details of the 
syntax makes it easy for the discussion here to fall into irrelevance. Perhaps 
that's why syntax discussions here are rare now.

I'm not exactly sure what to ask for though. Should everyone subscribe to 
everyone else's issue tracker to stay aware of what's happening? That's 
probably too much noise and not very practical.

Or perhaps the lack of talk here reflects a lack of anything happening. I don't 
believe this. It seems that half the implementations added support for 
Github-style fenced code blocks without me noticing. Isn't that newsworthy? It 
should be for any Markdown implementer.

Am I the only one who feels uninformed about what's happening with Markdown 
(outside of my own implementation)? And if so, what could be done to improve 
this?


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: [ANN] vfmd

2013-09-27 Thread Michel Fortin

On 27 Sep 2013, at 9:04, Roopesh Chander r...@forwardbias.in wrote:

 Because this is how the syntax is defined (which is not hard or unintuitive
 to follow for a user), there's no need to worry about a TAB character being
 interpreted as 1-4 spaces based on its position. If the user inserted a
 TAB immediately after the bullet character, he is expected to do that for
 all the list items anyway.
 
   *\tlist 1 item 1
   * list 1 item 2
 
 The above too shall be interpreted as two lists.

Ok, but what about this:

*\tlist 1 item 1
*list 1 item 2

They will both look unaligned in your editor (unless you set your editor to 5
spaces per tab (who does this?)), but they'll be in the same list because
you're interpreting spaces differently from the editor.

And what about this:

*\titem 1 paragraph 1

\titem 1 paragraph 2

Also, what happens within code blocks? (I haven't checked your algorithm for
code blocks, but if you change tabs to four spaces you're going to get strange
results for any code block with tabs in them not at the beginning of the line.)
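
To show what I mean by strange results, here is a tiny Python comparison of
the two interpretations on lines that have a tab in the middle. The helper
names are made up; `str.expandtabs` is the standard library's tab-stop
expansion.

```python
def fixed_width(line: str) -> str:
    """Every tab becomes exactly four spaces, wherever it appears."""
    return line.replace("\t", " " * 4)

def tab_stops(line: str) -> str:
    """A tab advances to the next column that is a multiple of four."""
    return line.expandtabs(4)

rows = ["a\t= 1", "abc\t= 2"]

for row in rows:
    print(repr(fixed_width(row)), repr(tab_stops(row)))
```

With tab stops the two `=` signs end up in the same column, as they do in an
editor set to 4-column tabs; with the fixed replacement they drift apart.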

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: Islands of talk

2013-09-27 Thread Michel Fortin

On 27 Sep 2013, at 8:44, Roopesh Chander r...@forwardbias.in wrote:

 I think that discussion about issues in a *particular implementation* is
 best done in the issue-tracker (or some such thing) specific to that
 implementation.

Obviously.

 For syntax discussions (stuff like what's the best syntax to describe, I
 don't know, citations to blockquotes), this mailing list is a good place,
 but they seem to have died down nowadays. (I wonder if that can be
 correlated to Gruber's reduced participation nowadays.)

That's what I pointed out. This list was the place to know what was happening. 
It isn't anymore.

 About what's happening in other implementations, there are still a good
 number of ANN mails to this list, I guess, describing what new syntax got
 added, et al.

There are very few announcement mails on this list. Half of the announcements 
on this list in 2013 were by me. And announcements other than mine generally 
don't describe syntax much. (Not all announcements have something newsworthy to 
say about the syntax either, so that's fine.)

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: [ANN] vfmd

2013-09-27 Thread Michel Fortin

On 27 Sep 2013, at 11:15, Roopesh Chander r...@forwardbias.in wrote:

 So to summarize:
  - It's not possible to solve this correctly without giving the tabstop
 number as an input to the parser
  - We don't want to get the tabstop as input, therefore we need a way
 around it
  - If the user is not mixing tabs and spaces, he should be fine
  - If the user is mixing tabs and spaces, but has read the syntax guide
 and follows it, he should be fine
  - If the user hasn't read the syntax and is also mixing spaces and tabs,
 sorry, I'm afraid I'm unable to help him

I still don't see how it is an improvement over current Markdown. It's true that you
can't solve the issue of editors having different lengths for tabs, but you're 
already picking four-space-per-tab so why do it differently from everyone else?

Note that if I were to use your algorithm, it'd probably break a lot of my own 
existing documents. I almost always use tabs to indent multi-paragraph lists, 
which includes a tab after the list marker. Maybe I'm the only one...

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: [ANN] vfmd

2013-09-26 Thread Michel Fortin

On 26 Sep 2013, at 12:17, Roopesh Chander r...@forwardbias.in wrote:

 I'm pleased to announce the vfmd project to you all.
 
 vfmd is a variant of Markdown with a well-defined specification. The spec
 is ready - I'm working on a testsuite at present and intend to start making
 a working implementation after that.
 
 http://vfmd.github.io/
 
 I'm looking forward to knowing what you think. I'd especially like to hear
 from Michel Fortin and John MacFarlane, who have tried to define the
 Markdown syntax unambiguously in a spec or grammar before.


Great. You've written a parser in prose. I'm impressed that you managed to
capture a lot of the subtleties of the language within a formal spec. Seriously,
how much time did you put into that?

Ok, commenting as I read http://vfmd.github.io/vfmd-spec/specification/:

 A U+0009 (TAB) character in the input shall be treated as four consecutive 
 U+0020 (SPACE) characters.

No. That's often not the case. If I write * followed by tab to begin a line, 
that tab represents three spaces, not four. The number of spaces represented by 
a tab is 4-(column_number modulo 4). But you probably knew that. ;-)
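
Spelled out as code, the rule looks like this. It is a Python sketch with a
made-up function name, assuming 4-column tab stops and a zero-based column
count (the number of characters already on the line before the tab).

```python
TAB_WIDTH = 4  # assumed tab stop interval

def expand_tabs(line: str) -> str:
    """Replace each tab with 4 - (column modulo 4) spaces."""
    out = []
    column = 0  # characters emitted so far on this line
    for ch in line:
        if ch == "\t":
            spaces = TAB_WIDTH - (column % TAB_WIDTH)
            out.append(" " * spaces)
            column += spaces
        else:
            out.append(ch)
            column += 1
    return "".join(out)

print(repr(expand_tabs("*\titem")))   # tab after "*" becomes three spaces
print(repr(expand_tabs("\titem")))    # tab at line start becomes four spaces
```

For single lines this gives the same result as Python's built-in
`str.expandtabs(4)`.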

About paragraphs:

 10. If none of the above conditions apply, then the block-element start line 
 marks the start of a block-element of type paragraph.
 

 In order to find the block-element end line, we need to make use of a 
 HTML parser. [...]

Oh oh... have you thought about code spans? If I write something like this:

This is `<pre>`.

Now the end tag: `</pre>`.

with your block-level algorithm that does not take code spans into account, the
HTML parser would make the above a single paragraph instead of two. The same
could happen if you had tag-like things in the title of a link, or with partial
tags:

This is `<sp`.

Now ending the tag: `an>`.

Or a comment.

You could either instruct the HTML parser to ignore span-level tags while 
you're trying to delimit blocks, or you need the parser to be aware of 
span-level Markdown syntax (basically parsing span-level syntax at the same 
time you're delimiting your paragraphs).

... or you could try to be less clever and ask the user to start the paragraph 
with a block-level tag if he wants to use block-level HTML elements (like 
Markdown.pl and PHP Markdown do).
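
As a rough sketch of the second option (making the block-level scan aware of
code spans), one could blank out code spans before looking for tags. This is
Python written for illustration only; the patterns and function names are
assumptions, not part of the vfmd spec or of any existing parser, and only the
`pre` tag is checked to keep it short.

```python
import re

# Simplified code span: a run of backticks, some text, the same run again.
CODE_SPAN = re.compile(r"(`+)(.+?)\1", re.DOTALL)
PRE_OPEN = re.compile(r"<pre\b", re.IGNORECASE)

def opens_pre_naive(paragraph: str) -> bool:
    """Look for an opening pre tag anywhere, code spans included."""
    return bool(PRE_OPEN.search(paragraph))

def opens_pre_span_aware(paragraph: str) -> bool:
    """Blank out code spans first, then look for an opening pre tag."""
    masked = CODE_SPAN.sub(lambda m: " " * len(m.group(0)), paragraph)
    return bool(PRE_OPEN.search(masked))

line = "This is `<pre>`."
print(opens_pre_naive(line), opens_pre_span_aware(line))
```

The naive scan thinks the paragraph opens a block that must be closed later;
the span-aware one does not, so the paragraph can end at the blank line.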

 - - -

I don't have more time to read the rest of your spec right now. But I like the 
way it goes. I'll post more comments later.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: ordered lists: type lower alpha?

2013-07-24 Thread Michel Fortin

On 24 Jul 2013, at 8:19, Sherwood Botsford sgbotsf...@gmail.com wrote:

 DIVs are natural containers for CSS style elements, where you want to
 change the way something is presented.
 
 (I'm not wedded to the colon marker.  It probably conflicts with something,
 but it seems fairly unobtrusive, and hence in line with the transparency
 goal of MD)

Wouldn't it be simpler to have a syntax like this one?

<div class="FOO" markdown="1">

blah blah

</div>

<div class="BAR" markdown="1">

blah blah

</div>

Also, this syntax is easily extensible to `section` or `aside` or 
`article` or `footer` or `p` or ... Someone should really implement that.

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: ordered lists: type lower alpha?

2013-07-24 Thread Michel Fortin

On 23 Jul 2013, at 15:57, John MacFarlane j...@berkeley.edu wrote:

 +++ Klaus Mueller [Jul 23 13 18:40 ]:
 Hi folks,
 
 atm we have in markdown [ordered lists] with the [type] 'arabic
 numbers' working:
 
 1. one
 1. two
 
 would it be possible to have also the type 'lower alpha'?
 
 a. first
 a. second
 
 The type attribute is deprecated, but it is allowed to support it
 or just add an inline css (e.g.  style='list-style-type:
 lower-alpha;' ).
 
 It's not standard markdown, but pandoc's extended markdown supports this:
 http://johnmacfarlane.net/pandoc/try/?text=a.+item%0Ab.+item%0Afrom=markdownto=html

Just curious, has anyone reported a bug where a sentence would end with a 
letter alone and would trigger a list on the next item? Also, what is a valid 
list marker in pandoc, is aa. a valid list marker (just like 11. is)?

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: Problem with links in Markdown

2013-06-21 Thread Michel Fortin

On 21 Jun 2013, at 10:40, sendi...@gmx.net wrote:

 On 21.06.2013, at 05:44, Aristotle Pagaltzis pagalt...@gmx.de wrote:
 
 I guess a more elegant workaround would be to use reference style since there 
 is no need for [brackets][1]
 
 [1]: example.com/wiki/(brackets)

Another idea is to backslash-escape those parentheses:

[example](example.com/wiki/\(brackets\) "title")

Seems to work for most implementations, including Markdown.pl: 
http://johnmacfarlane.net/babelmark2/?normalize=1text=%5Bexample%5D(example.com%2Fwiki%2F%5C(brackets%5C)+%22title%22)


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca

Re: Problem with links in Markdown

2013-06-20 Thread Michel Fortin

On 20 Jun 2013, at 14:48, Alexander Veit lbl...@gmx.net wrote:

The book of [Life](https://en.wikipedia.org/wiki/Life_(textbook) "Life
textbook").

which is converted to

<p>The book of <a
href="https://en.wikipedia.org/wiki/Life_(textbook">Life</a> "Life
textbook").</p>

Most Markdown implementations out there do the right thing: matching opening
and closing parentheses. It's just sad that Markdown.pl doesn't.
http://johnmacfarlane.net/babelmark2/?normalize=1text=The+book+of+%5BLife%5D(https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FLife_(textbook)+%22Life+textbook%22).%0A
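
For the curious, matching parentheses is not hard to do. Below is a
simplified Python sketch of the idea; the function name is invented, it only
handles a link at the very start of the string with an optional double-quoted
title, and it is not taken from any particular implementation.

```python
import re

def parse_inline_link(src: str):
    """Parse [text](url "title") at the start of src, balancing parentheses."""
    m = re.match(r"\[([^\]]*)\]\(", src)
    if not m:
        return None
    text, start = m.group(1), m.end()
    i, depth = start, 0
    while i < len(src):
        ch = src[i]
        if ch == "\\" and i + 1 < len(src):
            i += 2            # a backslash-escaped character never counts
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                break         # this one closes the link itself
            depth -= 1
        elif ch.isspace() and depth == 0:
            break             # whitespace ends the URL; a title may follow
        i += 1
    url = re.sub(r"\\([()])", r"\1", src[start:i])
    rest = re.match(r'\s*(?:"(.*?)")?\s*\)', src[i:])
    if not rest:
        return None
    return text, url, rest.group(1)

print(parse_inline_link(
    '[Life](https://en.wikipedia.org/wiki/Life_(textbook) "Life textbook").'
))
```

The same loop also accepts backslash-escaped parentheses in the URL, so both
ways of writing the link end up with the same destination.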

However, there's probably a solution to this problem. RFC 2396, 2.4.3 defines
the left and right angle brackets as delimiters and explains that

The angle-bracket "<" and ">" and double-quote (") characters are
excluded because they are often used as the delimiters around URI in
text documents and protocol fields.

This means that left and right angle brackets will never occur as part of an
URI. So deprecating parentheses, and replacing the parentheses with angle
brackets in Markdown's inline link syntax would resolve the problem
described. As an additional benefit, this would unify Markdown inline link
syntax and reference-style link syntax, and simplify link parsing.

Angle brackets surrounding the URL are supported by most Markdown parsers, but
the URL must be kept inside the parens. Unfortunately, only some parsers
correctly use them to disambiguate:
http://johnmacfarlane.net/babelmark2/?normalize=1&text=%5BBrackets%5D(%3Chttps%3A%2F%2Fen.wikipedia.org%2Fwiki%2F)%3E+%22Brackets%22).%0A

--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca


Re: Styling Markdown approaches

2013-04-19 Thread Michel Fortin

On 19 Apr 2013 at 17:00, Sherwood Botsford <sgbotsf...@gmail.com> wrote:

> That particular cat is out of the bag, however, and we have a score of
> implementations.  From all apparent discussion here, there is no particular
> urge for the writers to get together to reduce the implementations.  So we
> have 20 document formats already.  And not all the implementers are
> concerned with backward compatibility.
>
> The same can be said of html and CSS.  CSS configures how the html is
> rendered.  So CMD could configure the way MD is rendered.

CSS doesn't change how the HTML is parsed, only how it looks (and sometimes how
it behaves). Similarly, configuration options in a Markdown parser that let you
adjust *the output* to your liking are very welcome.

As for all the implementations, they mostly vary in edge cases and in their 
extensions to the core syntax. The core Markdown syntax (as defined by John 
Gruber) is pretty much the same everywhere, and this includes how HTML blocks 
are parsed. Implementations doing things differently than core Markdown are 
doing it mostly by adding restrictions out of security concerns with 
user-generated content.

By the way, if you really feel like it you should go ahead and hack your 
preferred implementation to do what you want. Just keep in mind that your 
documents using this tweaked syntax feature won't work right with other 
implementations. This might or might not come to bite you in the future 
depending on what you intend to do with those documents.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca/


PHP Markdown Lib 1.3

2013-04-11 Thread Michel Fortin

This is the first release of PHP Markdown Lib. This package requires PHP
version 5.3 or later and is designed to work with PSR-0 autoloading and,
optionally, with Composer. You can download it from the PHP Markdown website:
http://michelf.ca/projects/php-markdown/

You can also read the official announcement here:
http://michelf.ca/blog/2013/php-markdown-lib/

Also note that new versions of PHP Markdown 1.0.1q & Extra 1.2.7 were just
released too. They're now labeled as the classic version, and as previously
mentioned on this list I will stop updating those next year, focusing on the
Lib version.
http://michelf.ca/projects/php-markdown/classic/

Here is a list of the changes in Lib since the previous classic release (some 
of the changes are also included in the classic version, see the website for 
details):

PHP Markdown Lib 1.3 (11 Apr 2013):

*   Plugin interface for Wordpress and other systems is no longer present in
    the Lib package. The classic package is still available if you need it:
    http://michelf.ca/projects/php-markdown/classic/

*   Added `public` and `protected` protection attributes, plus a section about
    what is public API and what isn't in the Readme file.

*   Changed HTML output for footnotes: now instead of adding `rel` and `rev`
    attributes, footnotes links have the class name `footnote-ref` and
    backlinks `footnote-backref`.

*   Fixed some regular expressions to make PCRE not shout warnings about POSIX
    collation classes (dependent on your version of PCRE).

*   Added optional class and id attributes to images and links using the same
    syntax as for headers:

        [link](url){#id .class}
        ![img](url){#id .class}

    It works too for reference-style links and images. In this case you need
    to put those attributes at the reference definition:

        [link][linkref] or [linkref]
        ![img][linkref]

        [linkref]: url "optional title" {#id .class}

*   Fixed a PHP notice message triggered when some table column separator
    markers are missing on the separator line below column headers.

*   Fixed a small mistake that could cause the parser to retain an invalid
    state related to parsing links across multiple runs. This was never
    observed (that I know of), but it's still worth fixing.
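
For readers unfamiliar with the `{#id .class}` attribute syntax listed in the changelog above, here is a rough Python sketch of how such a trailing block could be split off and interpreted. It is not the PHP Markdown code; `split_attributes` and its regular expression are assumptions made for illustration.

```python
import re

# A trailing {...} block at the end of the text (illustrative pattern only).
ATTR_BLOCK = re.compile(r"\{([^{}]*)\}\s*$")


def split_attributes(text: str):
    """Split a trailing {#id .class} block off a link, image, or reference line.

    Returns (remaining_text, id_or_None, list_of_classes).
    """
    match = ATTR_BLOCK.search(text)
    if not match:
        return text, None, []
    element_id = None
    classes = []
    for token in match.group(1).split():
        if token.startswith("#"):
            element_id = token[1:]
        elif token.startswith("."):
            classes.append(token[1:])
    return text[:match.start()].rstrip(), element_id, classes


rest, element_id, classes = split_attributes("[link](url){#intro .external .wide}")
print(rest, element_id, classes)
```

Tokens that start with neither `#` nor `.` are silently ignored in this sketch; a real parser may treat them differently.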



-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca/


footer and co.

2013-02-08 Thread Michel Fortin

Question: what should be the output for this:

> for me the point of having my blog as a static site is mainly easy deployment<br>
> I don't have to worry about even configuring PHP<br>
> I just put html on a web server and boom! instant win
>
> <footer>— [Igor Wiedler wins](https://igor.io)</footer>

The real question is how to treat `footer`. Is it a block-level element like 
`div`? That should mean that the content would be passed to the output 
literally. One thing for sure, it's not a span element, as the HTML spec 
forbids putting it inside a `p`. Perhaps it should be something in between.
Many of the new HTML5 elements are in that same situation.

In PHP Markdown 1.0.1p I started treating them as block-level elements and they 
get the same treatment as `div`, but implementations don't seem to agree on 
this (and yes PHP Markdown Extra gives something that makes no sense):
http://johnmacfarlane.net/babelmark2/?normalize=1&text=%3E+for+me+the+point+of+having+my+blog+as+a+static+site+is+mainly+easy+deployment%3Cbr%3E%0A%3E+I+don't+have+to+worry+about+even+configuring+PHP%3Cbr%3E%0A%3E+I+just+put+html+on+a+web+server+and+boom!+instant+win%0A%3E%0A%3E+%3Cfooter%3E—+%5BIgor+Wiedler+wins%5D(https%3A%2F%2Figor.io)%3C%2Ffooter%3E%0A
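
The disagreement comes down to which tag names a parser keeps in its block-level list. The Python sketch below shows the general idea only; the two tag sets and the `starts_html_block` helper are hypothetical and do not reproduce the list used by PHP Markdown or any other implementation.

```python
import re

# Hypothetical tag lists for illustration; real implementations differ.
BLOCK_TAGS = {"div", "p", "pre", "blockquote", "table", "ul", "ol", "form"}
HTML5_BLOCK_TAGS = {"footer", "header", "section", "article", "aside", "nav", "figure"}

OPENING_TAG = re.compile(r"^<([A-Za-z][A-Za-z0-9]*)\b")


def starts_html_block(line: str, html5: bool = True) -> bool:
    """Return True if the line opens with a tag treated as block-level."""
    match = OPENING_TAG.match(line)
    if not match:
        return False
    tag = match.group(1).lower()
    return tag in BLOCK_TAGS or (html5 and tag in HTML5_BLOCK_TAGS)


print(starts_html_block("<footer>text</footer>"))
print(starts_html_block("<footer>text</footer>", html5=False))
```

With `html5=False` the footer line falls through to span-level handling and gets wrapped in a paragraph, which is roughly the difference visible between implementations in Babelmark.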

I find it somewhat worrying that the reference tool (Markdown.pl) treats it as 
a span element. Well, I don't care that much about what old Markdown.pl does, 
but I do care about people trying it there on the dingus and finding that it 
works as they intended (if they only look at the rendered output, they won't 
see the mess in HTML tags), and then getting an unexpected and undesired result 
with other implementations.

Reference:
https://github.com/michelf/php-markdown/issues/67

-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca/


[ANN] PHP Markdown Lib 1.3 (Beta 4)

2013-01-21 Thread Michel Fortin

This is basically a pre-announcement. I'll make the announcement more official 
by posting it on my blog when it goes out of beta.

There's a new branch to PHP Markdown. It contains the two exact same parsers
found in PHP Markdown and PHP Markdown Extra, but without all the cruft. In
other words: it is packaged to be used as a library, and only as a library. The
two parser classes are in the Michelf namespace:

\Michelf\Markdown
\Michelf\MarkdownExtra

Yep, a namespace, so it requires PHP 5.3 or later.

The Lib branch is [PSR-0] compliant, allowing easy autoloading of its parser 
classes. It is also a [Composer] package made [available on Packagist][aop].

It's currently in beta. Now's a good time to make breaking changes, if you have 
any to suggest.

[PSR-0]: https://github.com/php-fig/fig-standards/blob/master/accepted/PSR-0.md
[Composer]: http://getcomposer.org
[aop]: https://packagist.org/packages/michelf/php-markdown
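
As a reminder of what PSR-0 compliance implies for file layout, the Python sketch below maps a fully qualified class name to the relative path an autoloader would look for. It follows the PSR-0 rules as I understand them (namespace separators become directory separators, underscores in the class name do too, `.php` is appended); `psr0_path` is a made-up name and this is not the autoloader shipped with the package.

```python
def psr0_path(class_name: str) -> str:
    """Map a fully qualified PHP class name to its PSR-0 relative file path."""
    class_name = class_name.lstrip("\\")
    namespace, _, short_name = class_name.rpartition("\\")
    path = ""
    if namespace:
        # Namespace separators become directory separators.
        path = namespace.replace("\\", "/") + "/"
    # Underscores in the class name itself also become directory separators.
    return path + short_name.replace("_", "/") + ".php"


print(psr0_path("\\Michelf\\Markdown"))
print(psr0_path("\\Michelf\\MarkdownExtra"))
```

Under that convention both parser classes are expected to live in a `Michelf/` directory below whatever root the autoloader is configured with.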


## Moving forward

Here's the plan: next year, in 2014, only the Lib branch will continue to be 
updated.

The original PHP Markdown and PHP Markdown Extra distributions will continue to 
be available and they certainly won't stop working overnight, but starting next 
year updates containing bug fixes and new Extra features will go exclusively to 
the Lib branch.

Ditching support for older versions of PHP will simplify the maintenance work 
and will enable usage of newer PHP constructs in the code. Getting rid of the 
Wordpress plugin part will let me worry about things which are more related to 
the parser side of things.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca/


Re: Community Group for Markdown Standardization

2012-11-21 Thread Michel Fortin

On 2012-11-20 at 20:36, marbux <mar...@gmail.com> wrote:

>> Every MD implementation would have to have two behaviours, set either
>> by a command line flag, a configuration file, or a preference if used
>> with a GUI. One behaviour would be the individual behavior so that
>> the followers of that implementation wouldn't be left in the lurch.
>> One would be the standard behavior.
>
> I think the behavioral switch could be handled automatically if the
> standardized version has its own doctype declaration and profile
> header. If the doc has the doctype declaration, then process the doc
> as the standardized version of markdown; if not, then apply the
> implementation's unique default processing.

If your idea of an improved Markdown is one that starts with a doctype, I'm 
afraid it won't get very far (with users and implementers alike).


> Bridging the A11Y gap is in my view a major
> incentive for MD implementers to participate in the working group and
> to implement its deliverables. This is a legal requirement for web
> sites at least in the U.S. and E.U. Although enforcement has been lax
> so far, there is no guarantee that enforcement won't be ramped up
> later.)

I think you're mistaken about exactly who the implementers are. Most of us
implemented Markdown for our own needs, then shared the code so others could
use it. Then we did some maintenance on the code in our spare time (at least
some did). If any of us had this legal requirement to satisfy, or too many
users nagging for this, it'd already have been done. Changing the output is a
piece of cake by the way.

So feel free to suggest improvements to the output. But I doubt very much it'll 
have any effect beyond changing the output for some willing implementers.

The reason most people want a spec is to help various implementations
interpret Markdown text the same way, and that is the hard problem. That's the 
problem that would require long discussions -- if not negotiations -- 
especially because it is likely to require near complete reimplementation from 
many of us (because various implementations have very dissimilar parsing models 
currently).

I, personally, don't have the time for this. As it is now, I barely have the 
time to do maintenance work on PHP Markdown and PHP Markdown Extra. PHP 
Markdown and its Extra counterpart probably have thousands of users, if not 
more (I have no way to measure that), but my spare time doesn't scale with the 
number of users and I have plenty of more interesting things to do in my spare 
time.


>> The exception to that is support for the one feature that is likely to be
>> added which has no direct support in HTML, precisely because of that lack of
>> direct expressibility in HTML, namely footnotes. (Or has HTML 5 provided a
>> solution here (and one that isn’t still evolving)?)
>
> Kinda'/Sorta'. HTML 5 has the aside element that was originally
> stuck in with footnotes in mind.
> http://dev.w3.org/html5/spec/single-page.html#the-aside-element. But
> it's really just a container that can be positioned on the page with
> CSS. No footnote/endnote-specific markup. (I'll omit my long rant
> about browser developers and their mindset when it comes to HTML spec
> footnote proposals. Let it suffice to observe that repurposing of
> content never enters their minds when the topic of footnotes comes
> up.)

Some things of interest:

Footnotes were discussed at length in 2008 at the WHATWG.
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-April/014485.html

I started to write a detailed spec in 2008 for a new Markdown Extra parsing 
algorithm and document model (which includes a Markdown subset). Despite being 
fun to work on, I stopped because it was too time-consuming.
http://michelf.ca/specs/markdown-extra/

Pretty much at the same time, I made Babelmark for comparing various Markdown 
implementations. Ian Hickson wrote a similar tool to see how browsers were 
parsing HTML snippets to help build the parsing spec for HTML. I figured it'd 
help to have something similar for Markdown.
http://babelmark.bobtfish.net

John MacFarlane rewrote it as Babelmark 2 a few months ago, with more 
up-to-date versions of the implementations. It's really great.
http://johnmacfarlane.net/babelmark2/?text=aasdf

 - - -

If I had to fix Markdown today, I'd radically change to a cheap approach. I'd
take the few worst cases from the Babelmark 2 FAQ, try to come up with a
"right" way to parse these, put them in a test suite and try to convince other
implementations to conform to that test suite. Even that would probably be a
hard sell to me, and probably others. I'm pretty picky about what's "right" and
"wrong" in Markdown.


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca/


Re: Multidingus

2012-10-19 Thread Michel Fortin

On 2012-10-18 at 15:24, John MacFarlane <j...@berkeley.edu> wrote:

> I've implemented a version of what I described below.
> http://johnmacfarlane.net/pandoc/dingus.html

I see you've now added Markdown.pl 1.0.2b8, which is the latest non-public
release made by John Gruber. Maybe you should add 1.0.1 too, like I've done for
Babelmark, because it's the latest public release and it is probably the one in
more widespread use. 1.0.2b8 and 1.0.1 do exhibit significant differences in
some areas.

For instance, we were just talking about parens inside URLs and I just noticed
that 1.0.2b8 does balance them correctly:
http://babelmark.bobtfish.net/?markdown=%28%5Btest%5D%28url.with%28paren%29%29%29&src=1&dest=2

The HTML block parser was also completely redone in 1.0.2b4. Look at the
changelog for details (reproduced at the end of this email).

> Also, perhaps "dingus" isn't the best word for it, since it
> just displays the HTML source, not the formatted output. I take
> it that's what we want, since this is intended primarily for
> comparing the output of different implementations on corner
> cases, not for users to get a feel for markdown.

If you want to call it Babelmark 2, you have my permission. I think it'll be a
worthy successor. Also, you really ought to have a title tag on your page.

## Changelog found in 1.0.2b8 ##

1.0.2b8 - Wed 09 May 2007

+ Fixed bug with nested raw HTML tags that contained
attributes. The problem is that it uses a backreference in
the expression that it passes to gen_extract_tagged, which
is broken when Text::Balanced wraps it in parentheses.

Thanks to Matt Kraai for the patch.

+ Now supports URLs containing literal parentheses, such as:

http://en.wikipedia.org/wiki/WIMP_(computing)

Such parentheses may be arbitrarily nested, but must be
balanced.

1.0.2b7

+ Changed shebang line from /usr/bin/perl to /usr/bin/env perl

+ Now only trim trailing newlines from code blocks, instead of trimming
all trailing whitespace characters.

1.0.2b6 - Mon 03 Apr 2006

+ Fixed bad performance bug in new `Text::Balanced`-based
block-level parser.

1.0.2b5 - Thu 08 Dec 2005

+ Fixed bug where this:

[text](http://m.com "title" )

wasn't working as expected, because the parser wasn't allowing for spaces
before the closing paren.

1.0.2b4 - Thu 08 Sep 2005

+ Filthy hack to support markdown='1' in div tags, because I need it
to write today's fireball.

+ First crack at a new, smarter, block-level HTML parser.

1.0.2b3 - Thu 28 Apr 2005

+ _DoAutoLinks() now supports the 'dict://' URL scheme.

+ PHP- and ASP-style processor instructions are now protected as
raw HTML blocks.

<? ... ?>
<% ... %>

+ Workarounds for regressions introduced with fix for "backticks
within tags" bug in 1.0.2b1. The fix is to allow `...` to be turned
into <code>...</code> within an HTML tag attribute, and then to turn
these spurious `<code>` tags back into literal backtick characters
in _EscapeSpecialCharsWithinTagAttributes().

The regression was caused because in the fix, we moved
_EscapeSpecialCharsWithinTagAttributes() ahead of _DoCodeSpans()
in _RunSpanGamut(), but that's no good. We need to process code
spans first, otherwise we can get tripped up by something like
this:

`<test a="` content of attribute `">`

1.0.2b2 - 20 Mar 2005

+ Fix for nested sub-lists in list-paragraph mode. Previously we got
a spurious extra level of `<p>` tags for something like this:

    *   this

        *   sub

        that

+ Experimental support for [this] as a synonym for [this][].
(Note to self: No test yet for this.)
Be sure to test, e.g.: [permutations of this sort of [thing][].]

1.0.2b1 - 28 Feb 2005

+ Fix for backticks within HTML tag: <span attr='`ticks`'>like
this</span>

+ Fix for escaped backticks still triggering code spans:

There are two raw backticks here: \` and here: \`, not
a code span

1.0.1 - 14 Dec 2004

1.0 - 28 Aug 2004

--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca/


Re: Trouble with parentheses in Markdown hyperlinks

2012-10-19 Thread Michel Fortin

On 2012-10-18 at 22:20, John Gruber <gru...@fedora.net> wrote:

> I believe Markdown has thrived and continues to grow because I haven't fucked
> around with it. The canonical docs are those on Daring Fireball.

The syntax document on Daring Fireball is a good definition of what Markdown
is, and of the underlying philosophy. It's a good thing too that the syntax
isn't defined in a too restrictive way: this leaves room for extensions and
experimentation without breaking the claim that the modified version is still a
Markdown implementation. Which means of course that Markdown is everywhere,
even though it's not the same Markdown.

Markdown is not defined by its edge cases. But nevertheless those edge cases
which differ between implementations are a problem for many people. I doubt
very much that you being here on this list from time to time to give an opinion
on whether something is a feature or a bug in Markdown.pl would impede
Markdown's growth in any way. But of course, it's your time and you can spend
it elsewhere if you want, and I respect that.

The case that spawned this thread seems to be a bug in your view. I say that
because you fixed it in the unreleased 1.0.2b8 version of Markdown.pl, which
added support for properly nested parens inside URLs. Which is great, except
that 1.0.2b8 is an unreleased version from 2007 that not everyone can easily
find.

There are in fact two reference versions of Markdown right now -- Markdown.pl
1.0.1 and Markdown.pl 1.0.2b8 -- exhibiting some differences in behaviours and
in features. This is confusing, both for users and for implementers. It's also
hard not to think that Markdown.pl is abandonware at this point because 1.0.1
was released 8 years ago, and because the beta was left in an unfinished state.

Despite its unfinished state, I know that 1.0.2b8 is indeed in widespread use.
I had a lot of pressure to enable [shortcut-style] links in PHP Markdown a
while ago, all that for interoperability with Markdown.pl (how ironic!). It's
now enabled by default in PHP Markdown [and many other implementations][1].

[1]:
http://johnmacfarlane.net/babelmark2/?text=a+%5Bshortcut-style%5D+link.%0A%0A%5Bshortcut-style%5D%3A+http%3A%2F%2Fmichelf.ca%2F
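
For anyone unsure what shortcut-style links mean in practice, here is a small Python sketch of reference resolution where `[text]` alone is treated as `[text][]`. It is a toy with made-up names (`render_reference_links` and the two patterns), not how PHP Markdown or Markdown.pl implement it, and it ignores titles, nesting and escaping.

```python
import re

DEFINITION = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$", re.MULTILINE)
# Matches [text][ref], [text][] and the shortcut form [text].
REFERENCE = re.compile(r"\[([^\]]+)\](?:\[([^\]]*)\])?")


def render_reference_links(source: str) -> str:
    """Resolve reference-style links, including the [shortcut] form."""
    urls = {m.group(1).lower(): m.group(2) for m in DEFINITION.finditer(source)}
    body = DEFINITION.sub("", source).strip()

    def replace(match):
        text = match.group(1)
        ref = (match.group(2) or text).lower()
        if ref not in urls:
            return match.group(0)  # leave unknown brackets untouched
        return f'<a href="{urls[ref]}">{text}</a>'

    return REFERENCE.sub(replace, body)


html = render_reference_links(
    "a [shortcut-style] link.\n\n[shortcut-style]: http://michelf.ca/"
)
print(html)
```

Brackets with no matching definition are left as plain text, which is what keeps the shortcut form from swallowing ordinary bracketed words.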

Still, Markdown is stronger than ever. So perhaps we don't need to change any
of that. I guess it's fine if Markdown.pl is left at an 8-year-old version with
unfixed bugs and lacking a feature you introduced yourself in an unfinished
beta and that everyone is now using regardless. And I'm not being sarcastic:
Markdown's growth doesn't seem impeded at all by that, so why change any of
that?

I think Markdown (and Markdown.pl) is in need of an update. A small one. Like
adding shortcut-style links and fixing small bugs like parens in URLs... that'd
be useful to many people and it wouldn't impede Markdown's growth in any way.
It'll also make the reference implementation a slightly more reliable
reference, and somewhat more credible. One update every 8 years isn't too many.

But that's not needed for Markdown to thrive, as the past years
have shown. Feel free to continue not doing anything and passively watch it
take over the world on its own. ;-)

For edge cases and interoperability issues, it probably ought to be someone
else's spec. For one thing, you don't seem to have much time or interest for
these matters. And also, it's probably better for growth if the official
Markdown spec isn't screwed too tightly. This leaves more room for new
implementations and experimentation, and more things can thus be called
Markdown.

It's good to hear from you on this list John. Please keep doing that.

--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca/


Re: [question] special characters

2012-07-27 Thread Michel Fortin

On 2012-07-27 at 14:06, Fugo wrote:

> I'm new to this list and have a question to format special characters.
>
> On the website http://daringfireball.net I don't find anything.
> Is it possible to format characters like m² or ©?
>
> I hope someone can help.


If you copy-paste your email above (in particular the paragraph with m² and ©) 
to the Markdown dingus you'll see that it works fine.
http://daringfireball.net/projects/markdown/dingus/


-- 
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca/


Re: Paragraphs and html integration.

2012-06-18 Thread Michel Fortin

On 2012-06-18 at 6:26, Boris Le Ninivin wrote:

> Now, to step forward to the problem I have :
>
> In a website, parts of the pages (essentially headers and footers) are often
> the same. Hence I've added a functionality to my toolkit : inclusion. It is
> performed when the parser finds "@include filename".
>
> The problem I have had is that these instructions are wrapped between <p>
> tags. Indeed I've tried to bypass the problem by many ideas, but since
> EVERYTHING is wrapped between <p> tags (including doctypes and all!), I get
> non-compliant html documents (my header defines the doctype and <html> <head>
> <body> tags too; and my footer closes the body and html tags; but these are
> wrapped into paragraphs...).
>
> Since the markdown language is aimed to be "a format for /writing/ for the
> web." and not "a replacement for HTML, or even close to it.", I think the md
> language should allow a strong usage of html tags, and even, to have .md
> files containing 99% of html tags.

Except "@include" is not an HTML tag at all.

You could instead use XML-style processing instructions, such as
`<?include blah blah ?>`. PHP Markdown should handle them fine regardless of
where you put them.
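
A processing instruction of that kind is easy to expand with a small pre- or post-processing step. The Python sketch below is one possible shape for it, under stated assumptions: the instruction is spelled `<?include name ?>`, and the included content comes from a dictionary (a real tool would read files from disk). `expand_includes` is a hypothetical helper, not something PHP Markdown provides.

```python
import re

INCLUDE_PI = re.compile(r"<\?include\s+(.+?)\s*\?>")


def expand_includes(source: str, files: dict) -> str:
    """Replace each <?include name ?> instruction with the named content.

    `files` maps names to text here; a real tool would read from disk.
    Unknown names are left in place so the problem stays visible.
    """
    def replace(match):
        name = match.group(1)
        return files.get(name, match.group(0))

    return INCLUDE_PI.sub(replace, source)


files = {"header.html": "<!DOCTYPE html>\n<html><body>"}
page = expand_includes("<?include header.html ?>\n\nHello *world*.", files)
print(page)
```

Whether to expand before or after the Markdown conversion is a design choice: expanding afterwards keeps the included doctype and `<html>` tags out of the Markdown parser's way entirely.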


> In the end, on the df website, it is said that "Markdown is smart enough not
> to add extra (unwanted) |<p>| tags around HTML block-level tags.". So I don't
> know if it's an implementation problem (related to the PHP port, maybe?), or
> if it's a design problem, but as far as I know, Markdown is not smart enough
> to not add unwanted <p> tags.

That's only true for known HTML tags, and only the block-level ones.


> [1] I'm not really delighted to see that a GOOGLE email address is required
> to be able to post to this list. It might be a more or less effective way to
> reduce spam, but it's clearly not the correct one. Google uses the data from
> your emails to build profiles on you, and to [identify](http://donttrack.us/)
> and [bubble](http://dontbubble.us/) you. Therefore, I use a personal email
> address from a domain I own. And that one was rejected. I just wanted to
> point all that out while I'm at it. Oh and in case I'm wrong and that it was
> my domain which is blacklisted or anything else, do not pay attention to this
> complaint. :)


As far as I know, the only requirement is that you need to post using the 
address you subscribed with.


-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/


PHP Markdown 1.0.1o & Extra 1.2.5

2012-01-08 Thread Michel Fortin

This is a small update to PHP Markdown and PHP Markdown Extra fixing three bugs.
http://michelf.com/projects/php-markdown/

Also of note: PHP Markdown is now available on Github (has been for a few
months), so if any of you wants to file bug reports, or fix those bugs and send
pull requests, you're most welcome.
https://github.com/michelf/php-markdown/


PHP Markdown 1.0.1o (8 Jan 2012):

*   Silenced a new warning introduced around PHP 5.3 complaining about
    POSIX characters classes not being implemented. PHP Markdown does not
    use POSIX character classes, but it nevertheless triggered that warning.


PHP Markdown Extra 1.2.5 (8 Jan 2012):

*   Fixed an issue preventing fenced code blocks indented inside list items
    and elsewhere from being interpreted correctly.

*   Fixed an issue where HTML tags inside fenced code blocks were sometimes
    not encoded with entities.




-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: Help with Error: POSIX collating elements are not supported

2011-12-09 Thread Michel Fortin

On 2011-12-09 at 16:27, Doug Heavrin-Brown wrote:

> Hello,
>
> I hope this is the right place to get help fixing an error when using
> Markdown on our servers.
>
> I'm not sure what our webmasters have done, but a recent change (perhaps to
> PHP?) made our Markdown text  disappear from the page because of an error.
>
> PHP Warning:  preg_replace_callback() [<a
> href='function.preg-replace-callback'>function.preg-replace-callback</a>]:
> Compilation failed: POSIX collating elements are not supported at offset 116
> in /[PATH]/markdown.php on line 974
>
> Searching for help turned out to be mostly fruitless, except for a page
> pointing to rows 977-981, where one poster says commenting out those lines
> stops the error.
>
> http://forum.theturninggate.net/viewtopic.php?f=8&t=1738
>
> That is correct, but it also stops unordered lists from being formatted.
> While we can live with that for now (because Markdown has worked so well so
> far) that won't do in the long run.
>
> There were other pages that mention PCRE, but that seemed to be with a
> different version of PHP (our servers are running 5.2.9 -- and we are not
> able to change that).
>
> We have not found anyone who has the proper skills and is also able to help
> us with this problem.
>
> Has anyone encountered this before and is there a way to fix it?


I guess it's my fault for not updating PHP Markdown fast enough. Sorry. :-)

I was made aware of it in September, but I haven't looked into it yet. It might
be that a PHP update is causing it, although it seems I would be getting more
emails if that was the case. Perhaps not that many people have warnings
enabled, or perhaps they just silently fill up everyone's server logs and
nobody looks at those logs and so nobody tells me.

The warning message could be clearer, but I suspect PCRE gives special
treatment to character classes in regular expressions that start with a dot,
warning that POSIX collating elements (which start with `[.`) are not
implemented. The solution should be to replace any instance of `[.` in any
regular expression with `[\.`.
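
The suggested rewrite is purely textual, so it can be sketched in a few lines. The Python below is an illustration only (the affected code is PHP, and `escape_leading_dot` is a made-up helper); it assumes the pattern contains no already-escaped `\[.` sequence, which this naive replacement would mangle.

```python
import re


def escape_leading_dot(pattern: str) -> str:
    """Rewrite '[.' as '[\\.' so the class no longer looks like a POSIX collating element."""
    return pattern.replace("[.", "[\\.")


original = r"[.)]"
fixed = escape_leading_dot(original)
print(fixed)

# The escaped form still matches the same characters.
print(bool(re.fullmatch(fixed, ".")), bool(re.fullmatch(fixed, ")")))
```

Inside a character class an escaped dot means the same thing as a bare one, so the change affects only how the regex engine reads the opening of the class, not what it matches.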

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: paragraph inconsistency

2011-09-16 Thread Michel Fortin


On 2011-09-16 at 12:50, Koen H wrote:

> I was looking at the PHP Markdown tests and noticed that
>
>     paragraph
>     > some quote
>
> was rendered as a paragraph with a blockquote following it. And
>
>     paragraph
>     #header
>
> should be rendered as a paragraph with a header following it. But then
>
>     paragraph
>     * ciao
>
> should be rendered as one paragraph? From the first two I would expect
> it to be a paragraph and a list.
>
> Is there a rule or some reasoning behind this?


You see, I could be writing something about the iPad
2. And if Markdown treated all lines that begin with
a list marker as a list item, then this text would
be inside a list item.

John Gruber decided a long time ago that to solve
this problem, lists (both ordered and unordered ones)
needed a blank line before them. This only applies
to the root level:

1.  This is a list item
    *   This is a nested list item
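
Here is a small Python sketch of that rule as I have described it, for root-level lines only. The names `LIST_MARKER` and `starts_root_list` are invented for the example, and nested items (which do not need the blank line) are deliberately left out.

```python
import re

LIST_MARKER = re.compile(r"^(?:[*+-]|\d+\.)\s+")


def starts_root_list(lines, index):
    """Return True if lines[index] may open a root-level list.

    A marker line only counts when it is the first line of the document
    or follows a blank line, so "2. And if..." inside a paragraph stays text.
    """
    if not LIST_MARKER.match(lines[index]):
        return False
    return index == 0 or lines[index - 1].strip() == ""


paragraph = ["something about the iPad", "2. And if Markdown treated..."]
spaced = ["paragraph", "", "* ciao"]
print(starts_root_list(paragraph, 1), starts_root_list(spaced, 2))
```

The first call reports no list because the marker line directly follows paragraph text; the second reports a list because a blank line separates them.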


-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: Universal syntax for Markdown

2011-08-14 Thread Michel Fortin

On 2011-08-13 at 18:43, David Chambers wrote:

> Perhaps Babelmark could switch to making API calls to obtain translations as
> developers make endpoints available. What are your thoughts, Michel? I'd love
> to contribute to this effort if you consider it worthwhile.

Tomas Doran is currently hosting the best Babelmark site at
http://babelmark.bobtfish.net/.

My original code for Babelmark can be found at
http://michelf.com/docs/projets/babelmark.zip, but it misses some
implementations and the comparison option that was added later.

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: `time` element syntax

2011-06-05 Thread Michel Fortin

On 2011-06-05 at 18:23, David Chambers wrote:

> Hi folks,
>
> Thanks for your input. I'm pleased to have raised this issue for discussion.
>
> [...]
>
> The problem with the approach suggested by Rob and Michel is that it works
> for only a subset of cases. A very large subset perhaps, but a subset
> nonetheless. The displayed text may be "Christmas Day" or "7pm tomorrow".
> While in many cases it may seem a violation of DRY to include both forms,
> it's clearly not possible to translate from human to machine in all
> cases, nor is it necessarily possible to translate in the other direction.

If I put my mouse pointer over "7pm tomorrow" in your email, Apple Mail circles
it with a dotted line with a popup menu offering to add a new event to my
calendar. Clearly the algorithm can be quite clever. But you're right it will
never catch all cases. But for the cases it won't catch, there's HTML.

I'm not saying it's the ideal solution.


> Heck, Waylan, you've done it again. This is extremely readable and allows
> the `pubdate` attribute to be included if desired.
>
>     Some text [30 May 2011] more text.
>
>     [30 May 2011]: datetime: 2011-05-30T15:00-07:00, pubdate
>
> would become…
>
>     <time datetime="2011-05-30T15:00-07:00" pubdate="pubdate">30 May 2011
>     </time>

I'd tend to go for something even simpler:

Some text 30 May 2011 more text.

*[30 May 2011]: 2011-05-30 15:00 -07:00

Basically, why do we need to force brackets in the text at all? Also, why force
the writer to use 'T' as a time separator and strictly follow the rules of
HTML date syntax? It's much more readable without the 'T'. Reformatting it to
HTML's liking should be pretty trivial.

Also, note that what I have proposed above is simply an extension to PHP 
Markdown Extra's abbreviation syntax. I'd propose that if the abbreviation 
looks like an ISO date/time, the Markdown parser would just emit a `time` or 
`date` element, whatever is most appropriate, instead of `abbr`. I'd rather 
not add a new Markdown syntax for each and every element in HTML.
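
A minimal Python sketch of the idea above, assuming a definition line of the form `*[label]: value` as in PHP Markdown Extra's abbreviation syntax. The names `to_html_datetime` and `convert`, and the exact date formats accepted, are illustrative assumptions and not part of any existing parser.

```python
import re

# "*[label]: value" definition lines (abbreviation-style).
DEFINITION = re.compile(r"^\*\[(.+?)\]:[ \t]*(.+?)[ \t]*$\n?", re.M)

# Relaxed ISO-like date: "2011-05-30", "2011-05-30 15:00", "2011-05-30 15:00 -07:00".
ISO_LIKE = re.compile(
    r"^(\d{4}-\d\d-\d\d)(?:[ T](\d\d:\d\d(?::\d\d)?)(?: ?(Z|[-+]\d\d:\d\d))?)?$"
)


def to_html_datetime(value):
    """Return the value reformatted for a datetime attribute, or None."""
    match = ISO_LIKE.match(value)
    if match is None:
        return None
    date, time, zone = match.groups()
    if time is None:
        return date
    return date + "T" + time + (zone or "")


def convert(text):
    """Emit <time> for date-like definitions and <abbr> for the others."""
    definitions = {}

    def collect(match):
        definitions[match.group(1)] = match.group(2)
        return ""

    body = DEFINITION.sub(collect, text)
    for label, value in definitions.items():
        machine = to_html_datetime(value)
        if machine is None:
            markup = '<abbr title="%s">%s</abbr>' % (value, label)
        else:
            markup = '<time datetime="%s">%s</time>' % (machine, label)
        body = body.replace(label, markup)
    return body.strip()


source = "Some text 30 May 2011 more text.\n\n*[30 May 2011]: 2011-05-30 15:00 -07:00\n"
result = convert(source)
```

The sketch skips HTML escaping and does a plain string replacement per label, so it would need more care with overlapping labels; it is only meant to show that the reformatting step is small.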


> The resemblance to links is actually a *good* thing in my opinion. It allows
> readers to guess (correctly) that there is accompanying data and that
> it likely resides after the current paragraph or at the end of the document.

I disagree. Someone reading the HTML output in the browser is unlikely to 
notice there's a date/time element here or there on the page with a 
computer-readable date. And even if you made date time elements flashing red, 
what would be the point?

It has some value if, like in Apple Mail, you can move your pointer to it and 
do some action. But that's not going to be true in the Markdown text version 
(unless it gets scanned for common date patterns like Mail does). So why again 
should the reader know there's a computer-readable version of the date 
somewhere further in the document?


-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: `time` element syntax

2011-06-02 Thread Michel Fortin

On 2011-06-02 at 5:08, David Chambers wrote:

> Hi folks,
>
> I expect that the response to this post will be "we don't need such a
> thing", but humour me for a moment by pretending that in fact we do.
>
> HTML5 added a number of new tags to the mix, but arguably the most
> significant is the `time` element. It associates a machine-readable
> timestamp with a human-readable string (e.g. `<time
> datetime="2011-05-30T15:00-07:00">30 May 2011</time>`).
>
> I would love to be able to write something like `[30 May
> 2011]{2011-05-30T15:00-07:00}`.
>
> `/^(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d)(?::(\d\d)(?:[.](\d+))?)?([-+]\d\d:\d\d|Z)$/`
> could be used to ensure that only valid `datetime` attribute values are
> matched. This would avoid false positives and would keep `[foo]{bar}`
> available for other functions, potentially.
>
> Are there any reasons not to use `[human]{computer}`? Can anyone suggest a
> better syntax?

Personally, I think if you're going to write a lot of dates like this the best 
syntax would be to auto-detect "30 May 2011" as a date. But this might need to 
be done at a layer other than Markdown, since Markdown doesn't know about your 
time zone and the date format might depend on your language and locale.
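
For reference, the `[human]{computer}` form proposed in the quoted message can be sketched in a few lines of Python. The regular expression is the one from the quoted message; the function name `convert_time_spans` is a hypothetical helper, not an existing Markdown API.

```python
import re

# Validation pattern taken from the quoted proposal.
DATETIME = re.compile(
    r"^(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d)(?::(\d\d)(?:[.](\d+))?)?([-+]\d\d:\d\d|Z)$"
)
TIME_SPAN = re.compile(r"\[([^\]]+)\]\{([^}]+)\}")


def convert_time_spans(text):
    """Turn [human]{computer} into a <time> element when the value validates."""

    def replace(match):
        human, computer = match.group(1), match.group(2)
        if DATETIME.match(computer) is None:
            return match.group(0)  # leave other [foo]{bar} uses untouched
        return '<time datetime="%s">%s</time>' % (computer, human)

    return TIME_SPAN.sub(replace, text)
```

Anything inside the braces that does not validate is left as written, which is what keeps `[foo]{bar}` free for other uses.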

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: Proposed table specification (long!)

2011-05-17 Thread Michel Fortin

On 2011-05-17 at 17:24, bowerb...@aol.com wrote:

> when it comes to tables, you put people in a catch-22...
> it has to be something that can work with existing tools,
> and it has to be something that can work with fonts that
> are proportional and nonproportional.   surely you are all
> smart enough to know that the clauses are contradictory.
> the only reason to put people in a double-bind is if you
> actively want to stall 'em out, in a passive way.   that's it.
>
> edge-cases are not as clear of a catch-22, but even there,
> you all have historic content you prefer not to break by
> changing your implementation, so you are at an impasse.
>
> but hey, if it makes you feel better, i would be happy to add
> that all of this is only in my own humble opinion, and that
> maybe i'm just not smart enough to see your solution, and
> whatever other evasive language i would need to say so as to
> protect you from feeling challenged or threatened, because
> that's really _not_ my intention.   i have my own fish to fry...

What we really need is an effort in the style of HTML5's HTML parsing algorithm, 
which provides an unambiguous definition of how things should be parsed. Heck, 
I started one a while ago for Markdown Extra, first by creating a tool to 
evaluate what each implementation does when encountering an edge case 
(Babelmark, which is now hosted at http://babelmark.bobtfish.net/), then by 
starting to write such a specification (see 
http://michelf.com/specs/markdown-extra/). Then I stopped because I realized 
it'd take too long and that I had many more interesting projects I could do in 
that free time.

HTML5 had some companies backing its development; Markdown doesn't. And it's 
not like any of us is making money selling our implementation of Markdown (or 
at least I'm not), so I'm not sure how it'll happen.

Still, thanks for your analysis. It's refreshing to have an outsider's opinion 
once in a while.

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: An Observation

2011-05-14 Thread Michel Fortin


On 2011-05-14 at 15:55, Aristotle Pagaltzis wrote:

> * Fletcher T. Penney <fletc...@fletcherpenney.net> [2011-05-14 16:50]:
> > I was looking at the original Markdown syntax specification,
> > and became curious as to the last time Gruber posted anything
> > to this discussion list.  According to some searches of the
> > archives using some Google tricks, the last post I could find
> > from him was in 2009.
> >
> > Just thought that was interesting.
>
> An anatomy of a loss of interest:
>
>     2004  279
>     2005  153
>     2006   78
>     2007   25  (21 in the first half of the year)
>     2008    6  (2 in Feb; 4 in a row in mid-March)
>     2009    4  (in a row in late Feb)
>     2010    0
>     2011    0
>
> It’s not uncommon. Most of my own projects look like this…
>
> I just wish we’d get one last release out of him that takes the
> beta label off the last version and fixes the couple of small
> things Gruber himself already agreed years ago should be changed
> in certain ways. (Eg. not triggering emphasis on word-internal
> underscores.)

You should add a new column for the total number of posts.

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: Proposed table specification (long!)

2011-05-11 Thread Michel Fortin

On 2011-05-10 at 23:54, Simon Bull wrote:

> If the proposed syntax is overly complicated, I am very happy to simplify it.
> The question is whether or not the following is really complicated?
>
> ~
>
>     -------------------------------------------
>      THE PEOPLE OF MIDDLE-EARTH
>     -------------------------------------------
>
>      People      Homeland      Tongue
>     ===========================================
>      Elves       Rivendell,    Quenya,
>                  Mirkwood,     Sindarin,
>                  Lorien        Nandorin
>
>      Dwarves     Erebor        Khuzdul
>
>      Hobbits     The Shire,    Westron
>                  Breeland
>
> ~

I agree with most of Fletcher's points. This is complicated. I made a parser 
that can parse something relatively similar to the above before settling on PHP 
Markdown Extra's current table syntax. I decided against it for a couple of 
reasons.

First, it relies on spacing too much. With most syntaxes in Markdown, you can 
be lazy and not indent everything perfectly. This table syntax relies entirely 
on perfect spacing, which runs contrary to this principle. It also only works 
with monospace fonts, which can be a problem in some cases.

Second, editing its content is a real pain. Try to add a new elven tongue 
between Quenya and Sindarin and tell me how much time it takes. Now compare 
with editing the same table in HTML.

I'll concede that the table is more readable than in HTML, but I think the 
ratio between usefulness and implementation effort is rather weak.

And did I miss it, or does it lack one feature PHP Markdown Extra has: 
per-column left/right/center alignment?


-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: Should leading and trailing spaces between backticks be preserved?

2011-02-13 Thread Michel Fortin

On 2011-02-13 at 15:31, David Chambers wrote:

> > Well, yes, it might be wrong, but that's how the language works ("one after
> > the opening, one before the closing" is what
> > http://daringfireball.net/projects/markdown/syntax#code says). And it
> > gives an example (`` `foo` ``) as well.
>
> John's examples suggest that this stripping should apply only within ``
> double-backticked `` contexts. I imagine his intention was to avoid the
> leading and trailing spaces in `` `foo` `` (required by the syntax) from
> being included in the output. I can't imagine any reason to strip whitespace
> in regular ` single-backticked ` contexts.

Then what about ` ``foo`` ` ?
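
To make the two cases concrete, here is a small Python sketch of a code-span rule that trims whitespace next to the delimiters regardless of how many backticks are used. It is loosely modeled on the regex approach used by Perl-style implementations; the function name `do_code_spans` and the exact pattern are assumptions for illustration.

```python
import html
import re

# One or more backticks, the shortest content, then the same run of backticks.
CODE_SPAN = re.compile(r"(?<!\\)(`+)(.+?)(?<!`)\1(?!`)", re.S)


def do_code_spans(text):
    """Wrap backtick-delimited spans in <code>, trimming edge whitespace."""

    def replace(match):
        content = match.group(2).strip(" \t")
        return "<code>%s</code>" % html.escape(content, quote=False)

    return CODE_SPAN.sub(replace, text)


double_delimited = do_code_spans("`` `foo` ``")
single_delimited = do_code_spans("` ``foo`` `")
```

With this rule both `` `foo` `` and ` ``foo`` ` work, because the padding spaces are dropped in either context. Restricting the trimming to double-backticked spans would leave the padding in the second case.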


-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: my scala markdown implementation

2010-12-15 Thread Michel Fortin

On 2010-12-15 at 15:41, Christoph Henkelmann wrote:

> 3. What is the current official test suite? I found one here:
> http://michelf.com/docs/projets/mdtest-1.1.zip, however that one is
> GPL Licensed so I cannot include it in my project if I want to release
> it under the BSD license. What is the best BSD-licensed test-suite I
> could use?

The only reason MDTest is GPL is because the test driver (mdtest.php) contains 
some GPL code from another source (the PHP Diff code). If you remove or replace 
the PHPDiff function at the bottom of the file, I give you permission to 
relicense it under BSD. It'd be better, however, if you got permission from 
John Gruber too, as this code is based on his older MarkdownTest implementation.

As for the test files included with MDTest, with this email I release the PHP 
Markdown.mdtest and PHP Markdown Extra.mdtest test suites from MDTest to the 
public domain (I'm the author of these files), so feel free to do what you want 
with them. The Markdown.mdtest test suite comes from Gruber's older 
MarkdownTest and is unmodified since then (except for some file extensions), 
so all you need is his permission.

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: PHP Markdown Extra fork for MathJax

2010-09-28 Thread Michel Fortin

On 2010-09-28 at 10:08, Dr. Drang wrote:

> My fork of Michel Fortin's PHP Markdown Extra has been updated to
> include support for MathJax as well as jsMath. Both of these are
> JavaScript libraries for using LaTeX syntax, e.g., \(E = mc^2\), to
> include high quality, scalable equations (not just bitmapped images of
> equations) in your HTML pages.

If I read that well, it's more an extension than a fork.

I took a look at your changes, and it seems that most of the modifications 
could easily be done in a subclass of the MarkdownExtra_Parser class (adding 
functions, adding entries in the span/block gamut arrays), the one notable 
exception being the `parseSpan` function.

I'm not complaining or suggesting anything, only making the observation that 
`parseSpan` in PHP Markdown and PHP Markdown Extra is somewhat in the way of 
making extensions by subclassing.
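
As an illustration of the "easy" part of extending by subclassing, here is a Python analogy of a gamut table: a mapping from method names to priorities that a subclass can add entries to. This is not PHP Markdown's code; the class names, method names, and priorities are all hypothetical.

```python
import re


class Parser:
    # Method name -> priority; lower numbers run first.
    span_gamut = {"do_code_spans": 10, "do_emphasis": 20}

    def run_span_gamut(self, text):
        for name in sorted(self.span_gamut, key=self.span_gamut.get):
            text = getattr(self, name)(text)
        return text

    def do_code_spans(self, text):
        return re.sub(r"`([^`]+)`", r"<code>\1</code>", text)

    def do_emphasis(self, text):
        return re.sub(r"\*([^*]+)\*", r"<em>\1</em>", text)


class MathParser(Parser):
    # The subclass only adds a gamut entry and the method it names.
    span_gamut = dict(Parser.span_gamut, do_math=5)

    def do_math(self, text):
        return re.sub(r"\\\((.+?)\\\)", r'<span class="math">\1</span>', text)


output = MathParser().run_span_gamut(r"*Energy*: \(E = mc^2\)")
```

A change that has to happen inside one monolithic span-parsing method cannot be expressed this way, which is the difficulty described above.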


-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: RFC: Lazy syntax for paragraphs, blockquotes and lists

2010-09-03 Thread Michel Fortin

On 2010-09-03 at 5:43, Thomas Leitner wrote:

> Note that kramdown would generate two separate blockquotes if they are
> separated by a blank line (Markdown.pl merges the blockquotes):
>
>     >  This is one blockquote with
>        a long line.
>
>     >  This is another blockquote
>        with a long line.
>
> If you run the example BQ1 to BQ5 through Markdown.pl, you will find
> that it produces the expected output (as defined above). This is no
> coincidence, I think, since Markdown.pl has been designed with email
> messages in mind. However, the requirements as stated above
> haven't been written down anywhere (at least I don't know of it) and
> with those the behaviour of Markdown.pl is easily explained.

This last example is quite similar to the example of a lazy blockquote in the
Markdown syntax page. It states:

> Markdown allows you to be lazy and only put the `>` before the first line of a
> hard-wrapped paragraph:
>
>     > This is a blockquote with two paragraphs. Lorem ipsum dolor sit amet,
>     consectetuer adipiscing elit. Aliquam hendrerit mi posuere lectus.
>     Vestibulum enim wisi, viverra nec, fringilla in, laoreet vitae, risus.
>
>     > Donec sit amet nisl. Aliquam semper ipsum sit amet velit. Suspendisse
>     id sem consectetuer libero luctus adipiscing.

http://daringfireball.net/projects/markdown/syntax#blockquote

Considering the first line of this example in the spec says "this is a
blockquote with two paragraphs" and that it follows a non-lazy example where
all this text is part of a single blockquote, I wouldn't consider the spec
ambiguous. The Markdown syntax description is ambiguous about a lot of things,
but not about this one.

Most Markdown implementations, but not all, do it as Markdown.pl does:
http://babelmark.bobtfish.net/?markdown=%3E++This+is+one+blockquote+with%0D%0A+++a+long+line.%0D%0A%0D%0A%3E++This+is+another+blockquote%0D%0A+++with+a+long+line.%0D%0A&normalize=on&src=1

This isn't a disapproval of how you're planning to do things. I just wanted to
make it clear that your last example with two consecutive blockquoted
paragraphs is clearly a single blockquote per the Markdown syntax description.
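
The merging behaviour can be sketched with a single regular expression in Python. The pattern is loosely modeled on the way Perl-style implementations gather blockquote lines (a `>` line, lazy continuation lines, then trailing blank lines, repeated); the name `find_blockquotes` is a hypothetical helper.

```python
import re

# A ">" line, any lazy continuation lines, optional blank lines; repeated.
BLOCKQUOTE = re.compile(r"((?:^[ \t]*>[ \t]?.+\n(?:.+\n)*\n*)+)", re.M)


def find_blockquotes(text):
    """Return the source of each blockquote found in the text."""
    return [match.group(1) for match in BLOCKQUOTE.finditer(text)]


sample = (
    ">  This is one blockquote with\n"
    "   a long line.\n"
    "\n"
    ">  This is another blockquote\n"
    "   with a long line.\n"
)
quotes = find_blockquotes(sample)
```

Because the blank line is consumed inside the repeated group, the two quoted paragraphs end up in one match. A plain paragraph between them would end the first blockquote and start a second one.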

--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/


Re: HTML::StripScripts and markdown incompatibilities

2010-08-24 Thread Michel Fortin

On 2010-08-24 at 8:27, Louis-David Mitterrand wrote:

> Hi,
>
> I'm using perl's HTML::StripScripts to clean out unwanted/broken html
> from forum post on my web site but it also removes <http://example.com>
> or <u...@example.com> markdown constructs.
>
> Any idea how to make these two live together in harmony?

Are you calling StripScripts before or after Markdown? You should always filter 
tags after converting to HTML, as it seems StripScripts was designed to filter 
HTML, not Markdown-formatted text.

Long explanation:
http://michelf.com/weblog/2010/markdown-and-xss/
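
A toy Python illustration of why the order matters. Both functions are deliberately simplified stand-ins written for this example: `toy_markdown` only handles `<http://...>` autolinks and `toy_sanitize` only drops tags that are not on a small allow-list. Neither reflects the real behaviour of Markdown.pl or HTML::StripScripts in detail.

```python
import re

ALLOWED_TAGS = {"a", "p", "em", "strong", "code"}


def toy_markdown(text):
    """Stand-in for a Markdown converter: handles only <http://...> autolinks."""
    return re.sub(r"<(https?://[^>\s]+)>", r'<a href="\1">\1</a>', text)


def toy_sanitize(html):
    """Stand-in for an HTML filter: drops every tag not in ALLOWED_TAGS."""

    def keep_or_drop(match):
        name = match.group(1).lower()
        return match.group(0) if name in ALLOWED_TAGS else ""

    return re.sub(r"</?([A-Za-z][A-Za-z0-9]*)[^>]*>", keep_or_drop, html)


post = "See <http://example.com> <script>alert(1)</script>"

filtered_after = toy_sanitize(toy_markdown(post))   # Markdown first, then filter
filtered_before = toy_markdown(toy_sanitize(post))  # filter first: autolink is lost
```

Run before the conversion, the filter sees `<http://example.com>` as an unknown tag and removes it, so the converter never gets a chance to turn it into a link.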

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: HTML::StripScripts and markdown incompatibilities

2010-08-24 Thread Michel Fortin

On 2010-08-24 at 8:49, Louis-David Mitterrand wrote:

> On Tue, Aug 24, 2010 at 08:41:05AM -0400, Michel Fortin wrote:
> > On 2010-08-24 at 8:27, Louis-David Mitterrand wrote:
> >
> > > I'm using perl's HTML::StripScripts to clean out unwanted/broken html
> > > from forum post on my web site but it also removes <http://example.com>
> > > or <u...@example.com> markdown constructs.
> > >
> > > Any idea how to make these two live together in harmony?
> >
> > Are you calling StripScripts before or after Markdown? You should
> > always filter tags after converting to HTML, as it seems StripScripts
> > was designed to filter HTML, not Markdown-formatted text.
> >
> > Long explanation:
> > http://michelf.com/weblog/2010/markdown-and-xss/
>
> Actually I save the forum posts to the DB in non-converted markdown and
> filtered of any unwanted html.
>
> Should I save the raw unfiltered post to DB and then (1) expand markdown
> and (2) filter with StripScripts only when _displaying_ the post? That
> would entail keeping some potentially unclean posts in the DB and
> having to StripScripts them repeatedly.

The only important thing for correctness of the output is to apply the Markdown 
filter before StripScripts. The rest is just optimization.

For performance reasons it might be a good idea to save the 
(Markdown+StripScripts)-processed text in the DB, but if you allow users to 
edit their posts once published it'd be more convenient for them to start 
from the original unprocessed Markdown source. So you might want to save either 
one, or both.

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: New parser-based Markdown implementation for Java

2010-05-03 Thread Michel Fortin

On 2010-05-03 at 4:29, Allan Odgaard wrote:

> Not aware of one but I wrote the TextMate manual in Markdown and it is 
> available here: http://svn.textmate.org/trunk/Manual/pages/en/

For benchmarking I'm using your manual too. :-)

I have concatenated all the files into one big file, and I have made a second 
file which just contains the content of the first repeated twice. I expect the 
second to take about twice the time, or less. If the second one takes more than 
twice the time, it means that the time it takes does not grow linearly with the 
input, which is a bad sign of further problems as input files grow.
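
The doubling check described above could be scripted along these lines in Python. `convert` stands for whatever Markdown function is being measured; `best_time` and `doubling_ratio` are hypothetical helper names.

```python
import time


def best_time(convert, text, repeats=5):
    """Best wall-clock time of convert(text) over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        convert(text)
        best = min(best, time.perf_counter() - start)
    return best


def doubling_ratio(convert, text, repeats=5):
    """Time for the doubled input divided by time for the original input."""
    single = best_time(convert, text, repeats)
    double = best_time(convert, text + text, repeats)
    return double / max(single, 1e-9)  # guard against a zero measurement
```

A ratio around 2 or below suggests roughly linear behaviour; a ratio well above 2 on a large input is the warning sign. Timings are noisy, so the number is only meaningful on inputs large enough to take a measurable amount of time.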

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: New parser-based Markdown implementation for Java

2010-05-03 Thread Michel Fortin

On 2010-05-03 at 3:42, Mathias wrote:

> 1. Test Suite: Is there anything more than John Gruber's test suite that you 
> know of? Maybe a good, large file for benchmarking? I'd like to compare 
> pegdown's performance to some other implementations. I could also run the 
> original test suite a few times in a row for benchmarking but maybe there is 
> another more accepted benchmark available somewhere?

I've made a test suite for PHP Markdown called MDTest, which includes a couple 
more tests and a system to normalize the HTML output, to avoid flagging 
insignificant whitespace differences as errors.

http://michelf.com/docs/projects/mdtest-1.1.zip

It's lacking a couple of other tests I added recently.
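
To give an idea of what such a normalization step might look like, here is a rough Python sketch. It is not MDTest's actual algorithm; `normalize_html` and `same_output` are hypothetical names, and the rules are deliberately crude.

```python
import re


def normalize_html(html):
    """Collapse whitespace so that formatting-only differences disappear."""
    html = re.sub(r"\s+", " ", html)     # collapse runs of whitespace
    html = re.sub(r">\s+<", "><", html)  # drop whitespace between tags
    return html.strip()


def same_output(expected, actual):
    """Compare two HTML strings after normalization."""
    return normalize_html(expected) == normalize_html(actual)
```

A real normalizer has to leave the content of `pre` elements alone and be careful with whitespace between inline elements, where it is significant.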

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: New parser-based Markdown implementation for Java

2010-05-03 Thread Michel Fortin

On 2010-05-03 at 7:03, Michel Fortin wrote:

> http://michelf.com/docs/projects/mdtest-1.1.zip

Sorry. Bad URL. Try this one:

http://michelf.com/docs/projets/mdtest-1.1.zip

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: A Modest Definition List Proposal

2010-04-13 Thread Michel Fortin

On 2010-04-13 at 15:21, Tom Humiston wrote:

> Is my use of DL appropriate?

When in doubt, check HTML5 (which tends to make everything unambiguous). Here it 
says:

> The dl element represents an association list consisting of zero or more 
> name-value groups (a description list). Each group must consist of one or 
> more names (dt elements) followed by one or more values (dd elements). Within 
> a single dl element, there should not be more than one dt element for each 
> name.
>
> Name-value groups may be terms and definitions, metadata topics and values, 
> or any other groups of name-value data.
>
> The values within a group are alternatives; multiple paragraphs forming part 
> of the same value must all be given within the same dd element.
>
> The order of the list of groups, and of the names and values within each 
> group, may be significant.

http://www.whatwg.org/specs/web-apps/current-work/multipage/grouping-content.html#the-dl-element

Note that it renames "definition list" to "description list", in support of 
the idea that it's not only for definitions.

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: Markdown development

2010-03-06 Thread Michel Fortin

On 2010-03-06 at 11:56, Aristotle Pagaltzis wrote:

> If Markdown is to evolve, this is the problem that needs to be
> solved: it needs a principal designer with a good enough sense
> for its spirit and enough of a voice to gain the authority to
> have his or her mandates followed.

I'm of the same opinion, although I'll say the problem goes a little deeper.

Personally, I see little to be gained by improving Markdown. This applies to 
myself, but probably to John Gruber too. There's not much more recognition I 
can get by improving PHP Markdown and very little money to make with it.

I'm still answering my emails about PHP Markdown, giving tips to whoever needs 
them, and I'm fixing bugs from time to time. But all this I do out of charity: I 
don't gain anything personally or professionally by doing it except perhaps the 
goodwill of a few people. That could change if I were to return to web 
development, but right now I'm working on other things to pay the bills. I 
think John is in a similar situation.

Is there anyone interested in sponsoring Markdown development? I think you 
first need a solution to this problem if you want to have a lead designer.

-- 
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: Order of Markdown and SmartyPants filters (was: Re: Markdown Support in Drupal6.14?)

2009-10-22 Thread Michel Fortin


On 2009-10-22 at 13:29, Joonas Pulakka wrote:

> 2009/10/20 Lou Quillio <pub...@quillio.com>
>
> > On Mon, Oct 19, 2009 at 3:20 PM, AJG Baeumel
> > <a...@st-maurices.n-lanark.sch.uk> wrote:
> > > Why is Marksmarty no longer supported in Drupal 6.14?
> >
> > It's been a while since I've worked with Drupal, but I remember that
> > MarkSmarty was really just a hybrid convenience filter. It's better to
> > apply the SmartyPants and Markdown filters (in that order) to your
> > content, and update them by dropping-in Michel's latest PHP libs if
> > the Drupal-supplied ones fall out of date.
>
> I'm now puzzled which is the correct order to apply the filters. SmartyPants
> first and Markdown then (as above), or the other way round (as you suggested
> a couple of months ago :-)?
> http://six.pairlist.net/pipermail/markdown-discuss/2009-September/001647.html


Markdown, then SmartyPants. If you run SmartyPants first, it'll convert
characters like quotes to curly ones in link definitions and code samples,
and the result won't be processed correctly by the Markdown filter.
SmartyPants wants HTML-formatted input, not Markdown-formatted input.

This is documented in the PHP Markdown Readme:

> If you wish to use PHP Markdown with another text filter function
> built to parse HTML, you should filter the text *after* the Markdown
> function call. This is an example with PHP SmartyPants:
>
>     $my_html = SmartyPants(Markdown($my_text));
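
The effect of the wrong order can be reproduced with two toy filters in Python. These are simplified stand-ins written for this example, not the real Markdown or SmartyPants: `toy_markdown` only converts backtick code spans, and `toy_smartypants` curls double quotes everywhere except inside `<code>` elements.

```python
import re


def toy_markdown(text):
    """Stand-in for Markdown: only converts `code` spans."""
    return re.sub(r"`([^`]+)`", r"<code>\1</code>", text)


def toy_smartypants(html):
    """Stand-in for SmartyPants: curls double quotes outside <code> elements."""
    parts = re.split(r"(<code>.*?</code>)", html)
    curled = []
    for part in parts:
        if part.startswith("<code>"):
            curled.append(part)  # HTML-aware: code is left alone
        else:
            curled.append(re.sub(r'"([^"]*)"', "\u201c\\1\u201d", part))
    return "".join(curled)


text = 'Say "hi" with `print("hi")`'

right_order = toy_smartypants(toy_markdown(text))
wrong_order = toy_markdown(toy_smartypants(text))
```

In the right order the prose quotes are curled and the code sample keeps its straight quotes. In the wrong order the quote filter cannot recognize the backticks as code, so the code sample is altered before Markdown ever sees it.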


--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




PHP Markdown 1.0.1n Extra 1.2.4

2009-10-10 Thread Michel Fortin

Hello all.

This is an update to PHP Markdown and PHP Markdown Extra correcting a
couple of bugs that got reported since last release.

I also decided to enable shortcut-style reference links in this release
(as present in Markdown.pl 1.0.2 beta), as most implementations
of Markdown accept them by default. It's also because waiting for
John to officially release version 1.0.2 (instead of a beta)
before enabling this feature looks like a dead end at the
moment.

So you can download it from here:
http://michelf.com/projects/php-markdown/

As usual, you can also grab the latest release from the PEAR channel
at http://pear.michelf.com/ or the Git repository mirror at http://git.michelf.com/php-markdown/

Here is the detailed list of changes.

1.0.1n (10 Oct 2009):

* Enabled reference-style shortcut links. Now you can write reference-style
  links with fewer brackets:

      This is [my website].

      [my website]: http://example.com/

  This was added in the 1.0.2 betas, but commented out in the 1.0.1 branch,
  waiting for the feature to be officialized. [But half of the other Markdown
  implementations are supporting this syntax][half], so it makes sense for
  compatibility's sake to allow it in PHP Markdown too.

  [half]: http://babelmark.bobtfish.net/?markdown=This+is+%5Bmy+website%5D.%0D%0A%09%09%0D%0A%5Bmy+website%5D%3A+http%3A%2F%2Fexample.com%2F%0D%0A&src=1&dest=2

* Now accepting many valid email addresses in autolinks that were
  previously rejected, such as:

      abc+mailbox/department=shipp...@example.com
      !#$%'*+-/=?^_`.{|}...@example.com
      a...@def@example.com
      Fred Bloggs@example.com
      jsm...@[192.0.2.1]

* Now accepting spaces in URLs for inline and reference-style links. Such
  URLs need to be surrounded by angle brackets. For instance:

      [link text](<http://url/with space> "optional title")

      [link text][ref]
      [ref]: <http://url/with space> "optional title"

  There is still a quirk which may prevent this from working correctly with
  relative URLs in inline-style links however.

* Fix for adjacent lists of different kinds where the second list could
  end up as a sublist of the first when not separated by an empty line.

* Fixed a bug where inline-style links wouldn't be recognized when the link
  definition contains a line break between the url and the title.

* Fixed a bug where tags whose name contains an underscore weren't parsed
  correctly.

* Fixed some corner cases mixing underscore-emphasis and asterisk-emphasis.

Extra 1.2.4 (10 Oct 2009):

* Fixed a problem where unterminated tags in indented code blocks could
  prevent proper escaping of characters in the code block.
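
As an illustration of the shortcut reference links mentioned in the 1.0.1n notes, here is a minimal Python sketch of how such links can be resolved. It is not PHP Markdown's implementation; `shortcut_links` is a hypothetical helper that ignores titles, escaping, and every other Markdown construct.

```python
import re

# "[id]: url" definition lines, indented by at most three spaces.
DEFINITION = re.compile(r"^ {0,3}\[([^\]]+)\]:[ \t]*(\S+)[ \t]*\n?", re.M)
# "[label]" not followed by "[" or "(" (those are the regular link forms).
SHORTCUT = re.compile(r"\[([^\]]+)\](?![\[(])")


def shortcut_links(text):
    """Resolve [label] against [label]: url definitions found in the text."""
    definitions = {}

    def collect(match):
        definitions[match.group(1).lower()] = match.group(2)
        return ""

    body = DEFINITION.sub(collect, text)

    def link(match):
        label = match.group(1)
        url = definitions.get(label.lower())
        if url is None:
            return match.group(0)  # no definition: leave the brackets alone
        return '<a href="%s">%s</a>' % (url, label)

    return SHORTCUT.sub(link, body).strip()


html_output = shortcut_links("This is [my website].\n\n[my website]: http://example.com/\n")
```

Bracketed text without a matching definition is left untouched, which is what keeps the shortcut form from interfering with ordinary uses of square brackets.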

--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/


Re: No way to indent text?

2009-08-05 Thread Michel Fortin


On 2009-08-05 at 7:42, Tim Visher wrote:

> As others said, this is non-standard for (X)HTML except for certain
> elements like `blockquote`.  You could use inline styles but then
> you'd have to drop back to straight (X)HTML rather than the simpler
> Markdown syntax.
>
> For instance, were you to desire to indent a paragraph, markdown will
> correctly allow you to do the following:
>
>     <p style="margin-left: 2em;">Lorem ipsum dolor sit amet, consectetur
>     adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna
>     aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris
>     nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
>     reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
>     pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
>     officia deserunt mollit anim id est laborum.</p>
>
> I think that's your only option directly within Markdown.



Indeed it is.

The big drawback here is that Markdown formatting isn't applied in an
HTML block like the example above. Some parsers, like PHP Markdown
Extra, allow you to activate Markdown formatting within the paragraph
by adding a `markdown="1"` attribute to the `p` tag.


--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: md2html.awk and a question

2009-07-17 Thread Michel Fortin


On 2009-07-16 at 14:05, yy wrote:

> There are many questions I have, I know some test suites and am trying
> to pass those tests. When I don't know how to handle a corner case I
> use to check with Dingus. I would really appreciate if somebody could
> explain me the output of this text:
>
>    this paragraph is outside of list blocks
>
>    * #this is not a h1
>    * ##this is not a h2
>    * ###and this is not a h3
>    * but the next one is
>
>  #an h1!
>  inside a list item, ok, but...
>
>    * ###wtf is this?
>
>    bad, bad, bad...


The algorithm in Markdown.pl boils down to this:

1. The content of a list item which contains a blank line is treated
   as block-level content.

2. The content of a list item preceded by a blank line, when it's not
   the first item of a list, is treated as block-level content.

3. The content of a list item followed by a blank line, when it's not
   the last item of a list, is treated as block-level content.

4. Otherwise, the content is parsed for sublists and the rest is
   span-level content.
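
The four rules above can be written down as a small Python function. The input format is an assumption made for this sketch: each list item is a `(text, followed_by_blank_line)` pair, as if the list had already been split into items. `is_block_level` is a hypothetical name.

```python
def is_block_level(items, index):
    """Apply rules 1-4 to the item at `index`.

    `items` is a list of (text, followed_by_blank_line) pairs, a
    hypothetical pre-parsed form of the list source.
    """
    text, blank_after = items[index]
    if "\n\n" in text:                          # rule 1: blank line inside
        return True
    if index > 0 and items[index - 1][1]:       # rule 2: blank line before
        return True
    if index < len(items) - 1 and blank_after:  # rule 3: blank line after
        return True
    return False                                # rule 4: span-level content


# The list from the quoted question, split into items by hand.
items = [
    ("#this is not a h1", False),
    ("##this is not a h2", False),
    ("###and this is not a h3", False),
    ("but the next one is\n\n#an h1!\ninside a list item, ok, but...", True),
    ("###wtf is this?", True),
]
results = [is_block_level(items, i) for i in range(len(items))]
```

The last item is block-level because of rule 2, which is why its `###` is turned into a header while the first three items stay plain text.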



   - btw, this an h1, not a list item
   ===

   - but indenting...
 

   that's better


The algorithm in Markdown.pl boils down to this:

1. Any non-blank line followed by a non-indented line of three or more
   consecutive `=` is treated as a h1 header. Inside a h1 header only
   span-level content is allowed. Oh, and Markdown.pl searches for headers
   before checking for lists, so it won't see a list here.
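
The header rule as stated above can be sketched in Python with one regular expression. The "three or more" threshold follows the description in this message; the pattern and the name `do_setext_h1` are assumptions for illustration, and span-level processing of the header text is left out.

```python
import re

# A non-blank line followed by a non-indented line of three or more "=".
SETEXT_H1 = re.compile(r"^(.*\S.*)\n={3,}[ \t]*\n+", re.M)


def do_setext_h1(text):
    """Convert underlined lines to <h1>; runs before any list detection."""
    return SETEXT_H1.sub(lambda m: "<h1>%s</h1>\n\n" % m.group(1).strip(), text)


as_header = do_setext_h1("- btw, this an h1, not a list item\n===\n")
indented = do_setext_h1("- but indenting...\n   ===\n")
```

Since this step runs first, the leading `- ` simply becomes part of the header text. Indenting the underline keeps the pattern from matching, which leaves the line available to the list step.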


I'm not claiming any of this makes any sense. It's just how Markdown.pl
(and PHP Markdown) works.



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: Table of contents

2009-03-06 Thread Michel Fortin


On 2009-03-05 at 4:27, Daniel Winterstein wrote:

> I'm using Markdown in an app and would like to provide support for including
> a table of contents.
> Any suggestions for a syntax? Has anyone done this before?
> My first thoughts are:
>
> 1. Have a special header item (using markdown extra's header syntax), e.g.
>
>     generate-contents: yes


Just to make things clear: PHP Markdown Extra doesn't have this kind
of metadata syntax. Other implementations do, however.




> 2. Have a special xml tag with optional alternative text inside, e.g.
>
>     <contents>
>     1. First thingy
>     2. Second thingy
>     3. Other stuff
>     </contents>


HTML and XML tags shouldn't be part of the Markdown syntax.


> 3. Detect that a set of list items matches the first few headers. E.g. if
> the document has headers
>
>     # Monkeys
>     ## Chimps
>     ## Humans
>     ## Proboscis monkeys
>     ## Other monkeys
>     ## Do Lemur's count?
>
> Then a list that ran:
>
>     1. Monkeys
>       1. Chimps
>       2. Humans
>
> Would be detected as the start of a contents list, and the other entries
> would automatically be added to it. This seems the nicest approach in some
> respects, but also the one likely to cause confusion and annoyance.


I'd rather have a single tag, or another indicator saying "insert TOC
here". Sample TOC content in your `<contents>` element is going to fall out
of sync with the rest of the document some day when you edit the
document; I see no use for it.

That said, I believe the best way to generate a TOC is to implement
HTML5's outline algorithm, working directly on the HTML or XHTML output.
http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#outlines
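
A much simplified Python sketch of the "work on the HTML output" approach. It is not the HTML5 outline algorithm: it only collects `h1` to `h6` elements in document order and indents them by level. `HeadingCollector` and `build_toc` are hypothetical names.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) for every h1..h6 element in the HTML."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == "h%d" % self._level:
            self.headings.append((self._level, "".join(self._buffer).strip()))
            self._level = None


def build_toc(html):
    """Return an indented plain-text list of the headings found in `html`."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return "\n".join(
        "  " * (level - 1) + "- " + text for level, text in collector.headings
    )


toc = build_toc("<h1>Monkeys</h1><p>x</p><h2>Chimps</h2><h2>Humans</h2>")
```

Because the list is generated from the converted document, it cannot fall out of sync with the headers the way hand-written TOC content can.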




--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: A Modest Definition List Proposal (David E. Wheeler)

2009-02-26 Thread Michel Fortin


On 2009-02-25 at 22:00, Dr. Drang wrote:

> 1. Regular Markdown--by which I mean Gruber's Markdown.pl--looks
> good[^1] regardless of whether you're using proportional or
> monospaced fonts. I can't think of any construct in which the width
> of the characters makes a difference.

Well, if you want list items spanning multiple lines to align
properly, it's pretty hard to do with spaces using a proportional font.

    1.  List item
        spanning on more
        than one line.

> 2. Plain text tables almost always look like crap *unless* you're
> using a monospaced font, because columns always include a mixture of
> visible characters and spaces. I suspect this is one of the reasons
> Gruber hasn't put tables into Markdown.pl.


You can sometimes use tabs with proportional fonts. They help
align things correctly, not only in tables but also
in the list item example above.


Although if you later view it with a different font it'll probably  
look like crap too.



3. Markdown was not intended to cover every situation; it's meant to  
be a simple, readable substitute for simple (X)HTML. In this spirit,  
we shouldn't expect table additions to Markdown to be able to handle  
every type of table, just the simpler types. I like the table syntax  
of PHP Markdown Extra and MultiMarkdown for this reason.


That's in part why I'm reluctant to add a new type of table with cells
spanning multiple rows to PHP Markdown Extra.



4. Using a script[^2] to align the pipes of a plain text table is  
very practical if you're writing in a monospaced font. You can write  
and edit the table quickly without regard to alignment, then make it  
readable by applying the script to it. It's much easier to make  
tables this way than to type them out in HTML.


Indeed. I never thought of that, but it's a good idea.

That said, it's pretty easy to write a properly-aligned table without  
a script too. I've been doing that for some time.
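
For illustration, a pipe-aligning script of the kind described in point 4 could look roughly like the Python sketch below. This is not Dr. Drang's script; `align_table` is a made-up name, and the sketch assumes a monospaced font, one column per character, and rule rows made only of dashes.

```python
def align_table(text):
    """Pad every cell so that the pipes of a plain-text table line up."""
    rows = [[cell.strip() for cell in line.split("|")]
            for line in text.splitlines() if line.strip()]
    ncols = max(len(row) for row in rows)
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]

    lines = []
    for row in rows:
        cells = []
        for cell, width in zip(row, widths):
            if cell and set(cell) <= {"-"}:
                cells.append("-" * width)       # stretch header rules
            else:
                cells.append(cell.ljust(width))
        lines.append(" | ".join(cells).rstrip())
    return "\n".join(lines)


sloppy = "id | name\n--|--\n6 | Inset\n10 | Stories"
print(align_table(sloppy))
```

The table can then be typed without regard to alignment and cleaned up afterwards, which is the workflow the quoted point describes.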



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: RFC: Markdown Table Syntax

2009-02-26 Thread Michel Fortin


On 2009-02-26 at 19:54, David E. Wheeler wrote:


http://www.justatheory.com/computers/markup/markdown-table-rfc.html

[...]

Those refinements, in a nutshell, are, simply:

* Implicit cell alignment using space characters, rather than
 the explicit formatting hints in the header lines
* Cell content continuation using : for succeeding lines
* Stricter use of space, for proper alignment in plain text (which
 all of the MultiMarkdown examples I’ve seen tend to do anyway)
* Allow + to separate columns in the header-demarking lines
* A table does not have to start right at the beginning of a line


Perhaps you should mention that you're now forcing table cells to be
properly aligned using a monospace font. This in fact makes the syntax
impossible to use in a proportional font context.


While it's true that many people use a monospace font and that in a  
text editor you can change the font, I'd like to mention that Markdown  
is also used for writing comments and blog posts in textareas not  
formatted with a monospace font, and for which it may not be easy to  
change the font. Forcing proper alignment makes the table syntax  
unusable in these situations.


Now, given that cell continuation using colons relies on that proper
alignment in a monospace font (or else you risk mistaking
colons in the text for column separators), I don't find that syntax
very satisfying.


Allowing `+` as column separator in the header underline looks like a  
good idea though.



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: A Modest Definition List Proposal (David E. Wheeler)

2009-02-25 Thread Michel Fortin



On 2009-02-25 at 1:04, Benoit Perdu - TransMékong wrote:


Taking David's example further, here is a first try:

 id |  name   |    description    |  more info
----+---------+-------------------+-------------------------
  6 | Inset   | An inset element  | just one element
  8 | Stories | Another element   | another element
                                  : with 2 lines, without
                                  : colons on the left.
  9 | Other   | Another element   | another element
    :         :                   : with 2 lines, with
    :         :                   : colons on the left.
  5 | Illust. | An illustration   | new line, would this do?
    :         : and I think you   : Is it parseable?
    :         : know what I mean.


I doubt the no-colon-on-the-left lines will work. I mean, how do you
know you're writing in the fourth column without counting whitespace? (And
if you count whitespace it's unusable with proportional fonts.)



The colon at each empty cell looks like vertical ellipsis, that makes it
pretty legible


Colons are a nice way to do it, but I doubt the table will be readable  
by anyone not already aware of the syntax. To see what I mean, just  
take a look at the last column and forget for a moment that you know  
the difference in meaning between : and |. It then looks like one  
big paragraph of continuous text. You can disambiguate by reading the  
text, then figure out the meaning of | and :, but that's removing  
the usefulness of a table which should be scannable at first glance.


If you want multiple lines per cell, I'd suggest using a more explicit  
grid, something like this:


 id |  name   |  description     |  more info
    |=========|==================|=====================
  6 | Inset   | An inset element | just one element
    |---------|------------------|---------------------
  8 | Stories | Another element  | another element
    |         |                  | with 2 lines
    |---------|------------------|---------------------
  9 | Other   | Another element  | another element
    |         |                  | with 2 lines
    |---------|------------------|---------------------
  5 | Illust. | An illustration  | line breaks are not
    |         | and I think you  | possible in a table
    |         | know what I mean.|

This is totally unambiguous and easy to scan for the reader. The  
problem is that, even though it's easy to read, it's also more tedious  
to write.


--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: A Modest Definition List Proposal (David E. Wheeler)

2009-02-25 Thread Michel Fortin


On 2009-02-25 at 9:57, Sherwood Botsford wrote:


 id |  name   |  description     |  more info
    |=========|==================|=====================
  6 | Inset   | An inset element | just one element
    |---------|------------------|---------------------
  8 | Stories | Another element  | another element
    |         |                  | with 2 lines
    |---------|------------------|---------------------
  9 | Other   | Another element  | another element
    |         |                  | with 2 lines
    |---------|------------------|---------------------
  5 | Illust. | An illustration  | line breaks are not
    |         | and I think you  | possible in a table
    |         | know what I mean.|

This is totally unambiguous and easy to scan for the reader. The problem is
that, even though it's easy to read, it's also more tedious to write.



One minor change.  You don't need pipes in the horizontal separator  
lines.  E.g:

   id |  name   |  description |  more info
   


My idea was to follow the way it currently works for tables in PHP  
Markdown Extra, which was made this way to make the difference clearer  
from the header syntax and to allow specifying text alignment for  
columns.

http://michelf.com/projects/php-markdown/extra/#table



I suspect that if you were going to write a lot of tables, you'd write
a perl program that would take an existing table that you slopped
together, and would fix the spacing of pipes on all the lines to match
the pipe spacing on the first line.


If you're going to write a Perl program for creating Markdown tables,  
why not create tables in HTML directly? It'd be a lot simpler to write  
a converter for that, and it would avoid cluttering the syntax with  
something you need a special program to make anyway.



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: A Modest Definition List Proposal

2009-02-19 Thread Michel Fortin


On 2009-02-19 at 1:52, Yuri Takhteyev wrote:


Correct. Or one could say there is no way to get any standard blessed
as official here period. The closest you can get is to convince Michel
to add it to Markdown Extra, in which case there is a reasonable
chance that some of the other implementations will add optional
support for it. In this particular case, I find it hard to see this
proposal going very far, though. (Of course, Michel can prove me
wrong.)


It seems I've been called into this discussion. :-)

Well, that whole thread left me wondering. I'm of the opinion that using a
tilde is as good as a colon, perhaps even better visually. It has the
advantage of looking better when you add a colon at the end of your
term (which, I think, happens often enough):


Term:
   : Definition

Term:
   ~ Definition

The current syntax using a colon has been there... since PHP Markdown
Extra's debut (July 2005). There are obviously a lot of documents out
there using it, so there's no way I'll just change it to something
else.


As for accepting the tilde as an alternate marker, there's obviously a
precedent for that: unordered lists can be made with `*`, `+`, or `-`,
so it wouldn't be unnatural to add one for definition lists.
On the other hand, that precedent was meant to allow other syntaxes in
common use to work with Markdown; I'm not sure any syntax we can come
up with for definition lists is already in common use. Definition lists
are already some sort of specialized niche syntax within Markdown and
HTML: useful when you need one, but not something a lot of people care
for or even know exists.


So in the end I'm pretty much ambivalent about this proposal: I like it
but I don't think it's worth wasting another marker character and
dealing with possible clashes (legitimate text with a tilde in it).


That said, perhaps I'll change my mind the next time I have to write a  
definition list.



--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




[ANN] PHP Markdown Extra 1.2.3

2008-12-31 Thread Michel Fortin

Small update to PHP Markdown Extra to fix a problematic bug with
inline HTML introduced in version 1.2. You can download the update here:

http://michelf.com/projects/php-markdown/

Extra 1.2.3 (31 Dec 2008):

*   In WordPress pages featuring more than one post, footnote id prefixes
    are now automatically applied with the current post ID to avoid clashes
    between footnotes belonging to different posts.

*   Fix for a bug introduced in Extra 1.2 where block-level HTML tags were
    not detected correctly, resulting in the addition of erroneous `<p>` tags
    and interpretation of their content as Markdown-formatted instead of
    HTML-formatted.

--
Michel Fortin
michel.for...@michelf.com
http://michelf.com/




Re: Extra markdown suggestions

2008-11-28 Thread Michel Fortin


On 2008-11-28 at 2:34, Simon Bull wrote:


I have recently started using Michel Fortin's PHP Markdown Extra
implementation to programmatically transform my markdown text files into
HTML.  Firstly I'd like to say markdown is very cool -- thanks to everyone
involved :)

I'd also like to suggest two additions to markdown:

1)  I very often use /this/ markdown to indicate emphasis since I find it
much easier to type and read than _this_ or *this*.


Note that you can add it for yourself, but I think Markdown already has
enough syntaxes for emphasis.




2)  I also use additional setext style headers like this:

Header 1


Header 2


Header 3


Header 4


Header 5


Whether or not these suggestions would be a worthwhile addition to the
markdown syntax is one topic.


One problem with your header syntax is that it's not an addition but a  
change from what Markdown currently does. Nothing prevents you from  
doing it yourself though.



Another topic is about how to go about changing Michel Fortin's PHP code to
implement these changes.


Look for the doHeaders function.



Is this the right forum to discuss such code
changes?


Sure.


--
Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/




Re: Link syntax (was: definition lists?)

2008-11-21 Thread Michel Fortin


On 2008-11-21 at 11:36, Jason Blevins wrote:

On Babelmark Markdown.pl 1.0.2b8 honors this implicit reference syntax but
1.0.1 doesn't.  I was aware of this syntax so my expected output is that
of the beta version.  I haven't used it myself though because I've often
wondered how 'standard' it is.  It looks like the majority of implementations
do support it though.



For PHP Markdown and PHP Markdown Extra, I've kept that syntax
disabled on purpose, waiting for a non-beta release of Markdown.pl to
make it official (at which point it'll be in the Markdown
documentation). The problem is that Markdown.pl 1.0.2 doesn't seem
like it'll go out of beta one day...


--
Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/




Re: Link syntax (was: definition lists?)

2008-11-21 Thread Michel Fortin


On 2008-11-21 at 12:26, Jason Blevins wrote:

Superscripts are nice in print, but there you don't have to click on them.


With a proper stylesheet you can make them as easy to click as  
anything else. Look at footnotes on Daring Fireball for instance:


http://daringfireball.net/2008/11/google_mobile_uses_private_iphone_apis#fnr1-2008-11-19 



The clickable region is big enough.


--
Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/




Re: list corner case

2008-09-10 Thread Michel Fortin


On 2008-09-09 at 22:05, Michel Fortin wrote:

That said, about the situation where there is no space between the  
two lists, I'm not sure why it should be treated differently than  
with Dhruba's report. If you take the following:


   * one
   * two

   * three
   * four

you only get one list. With my fix, this doesn't change; the only  
change is that it now stops the list when it can't find another list  
marker *matching the current list type*, plain and simple.


Oops, I got this all wrong. PHP Markdown in fact creates a sublist for  
three and four with this input:


* one
* two
1. three
2. four

Sounds like a bug to me. This should be the same as:

* one
* two

1. three
2. four

which means: an unordered list followed by an ordered one. I need to
add a test for that to MDTest...
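
The intended behaviour (a list ends when the marker no longer matches the current list type) can be sketched in a few lines of Python. This is only an illustration of the rule, not the PHP Markdown code; `split_lists` and `MARKER` are names invented here, and nesting and indentation are ignored.

```python
import re

# Bullet markers (*, +, -) or ordered markers (digits followed by a dot).
MARKER = re.compile(r"^(?:([*+-])|(\d+\.))\s+(.*)$")


def split_lists(lines):
    """Group list items, starting a new list when the marker type changes."""
    lists = []
    for line in lines:
        m = MARKER.match(line.strip())
        if not m:
            continue  # blank lines and other text are ignored in this sketch
        kind = "ul" if m.group(1) else "ol"
        if not lists or lists[-1][0] != kind:
            lists.append((kind, []))
        lists[-1][1].append(m.group(3))
    return lists


print(split_lists(["* one", "* two", "1. three", "2. four"]))
```

With or without a blank line between "two" and "three", the sketch yields one unordered list followed by one ordered list, which is the equivalence described above.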


--
Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/






Re: + in email addresses - edge case with a lot of variation between implementations.

2008-07-25 Thread Michel Fortin


On 2008-07-25 at 3:52, Tomas Doran wrote:


http://babelmark.bobtfish.net/?markdown=%3Ca%2Bb%40c.org%3E&normalize=on

Only Python markdown, Pandoc, discount and PEG markdown seem to get  
this 'right'.


As a + is perfectly valid in email addresses, I'm going to fix this  
in my modules.


This was reported to me via the cpan.org RT (37909), and I thought  
I'd share as it's a good one in Babelmark.


I'm looking at [some examples of valid email addresses on Wikipedia][1]
and wondering if we should support all these...


[EMAIL PROTECTED]

[EMAIL PROTECTED]

[EMAIL PROTECTED]

abc+mailbox/[EMAIL PROTECTED]

!#$%'*+-/=?^_`.{|[EMAIL PROTECTED] (all of these characters are  
allowed)


[EMAIL PROTECTED]@example.com (anything goes inside quotation marks)

Fred Bloggs@example.com
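
As a point of comparison, here is a hedged Python sketch of an autolink pattern that accepts `+` and the other unquoted local-part characters listed above. It deliberately leaves out quoted local parts (the "anything goes inside quotation marks" cases), and it is not the pattern used by any of the implementations on Babelmark; `AUTOLINK_EMAIL` and `find_email_autolinks` are invented names.

```python
import re

# <local@domain> where local may use the unquoted special characters,
# and domain is dot-separated labels of letters, digits and hyphens.
AUTOLINK_EMAIL = re.compile(
    r"<([A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)>"
)


def find_email_autolinks(text):
    """Return the addresses found inside <...> autolinks."""
    return AUTOLINK_EMAIL.findall(text)


print(find_email_autolinks("Mail <a+b@c.org> now"))
```

An address containing a space, such as the last example above, is not matched by this sketch because it would need the quoted form.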


Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: standard-izing extended markdown

2008-07-23 Thread Michel Fortin


On 2008-07-22 at 15:18, Gour wrote:


That's why I'm interested to know what is the plan in regard to
standardizing some of those 'extra' features which put markdown markup
in the category of (more) serious authoring solutions?


Well, I plan to specify and define unambiguously the syntax for  
Markdown Extra, with an eye on keeping that spec usable to implement a  
plain Markdown parser too. I hope to convince some other Markdown  
implementers to follow the parsing model in that spec to help improve  
interoperability and compatibility both with standard and extra  
features.


You can read the current draft here:
http://michelf.com/specs/markdown-extra/

Development of that spec is somewhat stalled right now, as I'm working  
on other things. Note that these other things include an experimental  
incremental parser for PHP Markdown and PHP Markdown Extra... which  
could count as advancing the spec anyway by looking at what is  
practical and what is not. (Some problems with the current approach in  
the spec convinced me to experiment a little before writing further.)



Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Syntax Questions

2008-07-22 Thread Michel Fortin


On 2008-07-22 at 2:47, Jurgens du Toit wrote:


At the end of the day I probably will maintain my own copy, with some
changes, of Markdown. I also don't want to break the syntax. One of my
previous mails I mentioned a way that makes the Markdown more useable (by
being able to usefully use nl2br on the Markdown'ed string) without breaking
the syntax or HTML and plain text presentation.


Have you considered what will happen to code blocks with `nl2br`?  
Won't this:


<pre><code>function a() {
    return 1;
}</code></pre>

be turned into this:

<pre><code>function a() {<br />
    return 1;<br />
}</code></pre>

effectively doubling the newlines?
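
The effect is easy to reproduce. The sketch below uses a rough Python stand-in for PHP's `nl2br` (the function here is written for this example; it inserts `<br />` before each newline, as the PHP function does) and applies it to the code block above:

```python
import re


def nl2br(text):
    """Rough Python analogue of PHP's nl2br: put <br /> before each newline."""
    return re.sub(r"(\r\n|\n|\r)", r"<br />\1", text)


html = "<pre><code>function a() {\n    return 1;\n}</code></pre>"
print(nl2br(html))
```

Inside `<pre>` the original newline is already rendered as a line break, so each added `<br />` shows up as an extra one.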


Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Syntax Questions

2008-07-21 Thread Michel Fortin


On 2008-07-21 at 6:32, Jurgens du Toit wrote:

I mean that difficulty to test must not impair the development process.
Yes, sure, don't roll out software that hasn't been tested, but, as Markdown
is issued under an open source license, there's who knows how many people
who might want the untested functionality, and who will be willing to test
it, and probably improve on it as well. Me included.


No doubt about that: testing shouldn't impair development. As you've  
seen, I'm not against experiments; I've even told you what to change  
to get what you requested.


But I'm not interested in *publishing* this as a feature of PHP  
Markdown because I don't want to test and maintain a new optional  
feature. Not to mention that I think it breaks the syntax. If you wish  
to do the maintenance and testing it requires and handle the bug  
reports that will come (or ignore them), feel free to fork PHP  
Markdown and publish that; the license allows it.



Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Syntax Questions

2008-07-18 Thread Michel Fortin


On 2008-07-18 at 5:13, Jurgens du Toit wrote:


Kewl.

If you look at a formatter like tidy, it's got a lot of options where you
can turn certain behaviour on and off, making it much more useable for a lot
of people. Wouldn't it improve the usability of Markdown if these kind of
options were present?


The more options, the more difficult to test, because each input can  
have more than one output. There are some configurable things in PHP  
Markdown, but I can attest they are under-tested compared to the  
regular syntax.


Moreover, with each option affecting how the Markdown source is
parsed, you multiply by two the number of variants of the language in
the wild. Currently, if I encounter a text box on a web page claiming
to be Markdown-formatted I can be pretty sure of the output I'll get
for what I write. If Markdown had one option turning each newline into
an HTML line break, then writing in that text box would be guesswork.
Hopefully, the form author will tell which options are on and which
are off -- something like "Markdown + automatic line breaks" in our
case -- but the more options, the less practical it is for authors to
write this extra info, or for users to read it, because the length of
the description would become intimidating.


Which means that if you modify Markdown to change some of its
behaviour, please don't call it plainly "Markdown". "Markdown +
automatic line breaks" explains clearly what your text field does
differently from Markdown and will avoid surprises for your visitors.


 - - -

Now, if you still want to do a hard break at each newline with PHP  
Markdown, go to the `doHardBreaks` function and change this expression:


/ {2,}\n/

for this one:

/\n/

and I expect it should do the trick. This is totally untested however.  
And I don't plan to add an option like this to future versions.



Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Syntax Questions

2008-07-17 Thread Michel Fortin


On 2008-07-17 at 10:11, John Gabriele wrote:


On Thu, Jul 17, 2008 at 9:49 AM, Jurgens du Toit
[EMAIL PROTECTED] wrote:

Ok.

Is it possible to modify the code to do that?


Very probably, but you may not want to. My impression is that there's
a lot of tradeoffs in Markdown between it trying to do what you mean
and it requiring non-ambiguous markup. For example, if you've got a
paragraph with a plus sign in it, and you have your editor re-wrap it,
you might end up with a line starting with that plus sign. You
wouldn't want Markdown to think it's a one-item list.


Well, that's almost the exact reason this was changed, a long time ago
now. The problem was with sentences finishing with a number in the
middle of a hard-wrapped paragraph. For instance, I could say: "I want
2. He wants 3.", and then you'd have a strange list popping up in the
middle of your otherwise normal paragraph.
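
The hazard is easy to demonstrate. This Python sketch hard-wraps a paragraph and reports the lines that would look like the start of a list item if no blank line were required; `accidental_list_lines` and `LIST_MARKER` are names chosen for this example, and the marker pattern is a simplification.

```python
import re
import textwrap

# A bullet or an ordered marker at the start of a line, followed by whitespace.
LIST_MARKER = re.compile(r"^(?:[*+-]|\d+\.)\s")


def accidental_list_lines(paragraph, width):
    """Return wrapped lines that happen to begin with a list marker."""
    return [line for line in textwrap.wrap(paragraph, width)
            if LIST_MARKER.match(line)]


print(accidental_list_lines("I want 2. He wants 3.", width=7))
```

Wrapped at that width, the second line begins with "2." and would start a list in the middle of the paragraph.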

The requirement of a blank line goes away when a list is nested in
another, so you can write nice-looking hierarchical lists:

1.  Test
    *   Test
    *   Test


Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Syntax Questions

2008-07-17 Thread Michel Fortin


On 2008-07-17 at 16:41, Jan Erik Moström wrote:

Well, there is a good reason why Markdown doesn't do this. Many
prefer to use a plain text editor which doesn't wrap text (I for
example prefer my text files this way) and we insert hard new lines
to keep the lines from becoming too long. If those hard newlines
were translated into <br /> Markdown would be useless for a lot of
people.


That's one reason. Personally, I generally don't write hard-wrapped
paragraphs... except inside lists and blockquotes where I want things
aligned properly in the source text.


For instance, I don't want my text editor to automatically wrap my
list item like this:


1.  First item of a list with two lines worth wasted text
for your reading pleasure.

So I indent correctly the second line to make it better looking and  
easier to write:


1.  First item of a list with two lines worth wasted text
    for your reading pleasure.

By doing this, I'm inserting a newline character at the end of each  
line. If Markdown was adding a line break there, then I'd be forced to  
write the bad-looking version, reducing readability of the source text.


The same applies to blockquotes:

> This and that and this and that and this
> and that and this.

You couldn't indent each line with a `>` if Markdown was to convert
every newline to a `<br />`.



(and I don't think there is an option to get it to behave the way  
you want)



No there isn't one.


Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



[ANN] PHP Markdown 1.0.1m Extra 1.2.2

2008-06-21 Thread Michel Fortin

Here's another update to PHP Markdown and PHP Markdown Extra featuring  
some bug fixes.


See the download link on the project's page:
http://michelf.com/projects/php-markdown/

This update also features a new emphasis parser which processes regular
and strong emphasis in one pass instead of two and four for previous
versions of PHP Markdown and PHP Markdown Extra, respectively. The new
parser is also faster and easier to change if the logic needs to be
adjusted. It works by using context-dependent composition of a regular
expression for matching allowed emphasis tokens, and PHP code to generate
proper emphasis elements from the tokens.



1.0.1m:

*   Lists can now have empty items.

*   Rewrote the emphasis and strong emphasis parser to fix some issues
    with oddly placed and overlong markers.


Extra 1.2.2:

*   Fixed a problem where abbreviation definitions, footnote
    definitions and link references were stripped inside
    fenced code blocks.

*   Fixed a bug where characters such as `"` in abbreviation
    definitions weren't properly encoded to HTML entities.

*   Fixed a bug where double quotes `"` were not correctly encoded
    as HTML entities when used inside a footnote reference id.


Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Optional features (was: Markdown Extra Specification (First Draft))

2008-05-22 Thread Michel Fortin


On 2008-05-22 at 2:10, Aristotle Pagaltzis wrote:


It is, mind, perfectly fine to have two (or maybe three?) specs
of which one is a superset of the other, as seems to be Michel’s
current thrust with Markdown vs Markdown Extra. Assuming that no
feature in either spec is optional, that means you would be able
to expect Markdown Extra documents to work in all Markdown Extra
processors, and all Markdown documents to work in all Markdown
and Markdown Extra processors. The scope of the problem is much
smaller in such a scenario, enough so to be perfectly tractable.


I perfectly agree with this by the way: optional features should be
kept to a minimum. It may be interesting to note there are currently
two configurable parsing-related[^1] settings in PHP Markdown:


Tab width (default = 4)

:   This one comes from a similar configuration option in
    Markdown.pl and is essentially the size in spaces for one
    indent through a Markdown document. When John Gruber says
    "four spaces or one tab" in his syntax description document,
    he really means "tab-width spaces or one tab", where
    tab-width is a configurable parameter defaulting to 4.

    I'm not aware of anyone changing this parameter, and I'm not
    even sure of how well it works, but it is clear that changing
    this will break many documents written with a different tab
    width in mind.

No markup (default = false)
No entities (default = false)

:   This one prevents the parser from skipping over HTML tags
    and/or HTML character entities. I was originally opposed to
    it, and in some way I still am. I decided to add it because
    there were too many people attempting to disable HTML by
    preprocessing the input with strip_tags or a substitution
    regular expression without realizing they were breaking
    automatic links, code spans and code blocks with HTML in
    them, and sometimes blockquotes.

    I'm no fan of this mode, but I feel it was the best way to
    avoid people breaking the syntax by accident, so I've added
    it in.

I'm not sure those features should be formally part of the spec. I  
believe however that if the spec is well written it should be pretty  
trivial to see what must be changed to achieve them.
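
The breakage caused by preprocessing can be shown with a naive tag-stripping substitution. The Python sketch below stands in for the kind of regular expression people were using; it is not PHP's `strip_tags`, and `naive_strip_tags` is an invented name.

```python
import re


def naive_strip_tags(text):
    """Remove anything that looks like <...>, as a naive preprocessor would."""
    return re.sub(r"<[^>]*>", "", text)


source = "See <http://example.com/> and `<em>` here"
print(repr(naive_strip_tags(source)))
```

Both the automatic link and the content of the code span disappear before the Markdown parser ever sees them, which is the accident the "no markup" mode is meant to avoid.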


[^1]:
    A "parsing-related" setting is a setting that changes the
    interpretation of the document given in output. The opposite
    is an "output-related" setting, which changes the HTML
    output but does not affect the interpretation the parser
    makes of the document.


Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Parsing Code Blocks

2008-05-22 Thread Michel Fortin


On 2008-05-16 at 0:31, Yuri Takhteyev wrote:


Your first two examples are not treated as the same by any
implementation.  It seems that all implementations interpret this:

~~~

   One

  Two


   Three

  Four

  Five
~~~

as meaning that One is in a code block, but Two is not.

Or did you mean to put a few more spaces in front of Two?


Hum, yes I did, and in fact I had. It just looks like my email client
(Mac OS X's Mail) ate the first space on each line that begins with a
space... I really wish it wasn't using WebKit as its text editor when
in text-only mode.



[spec]: http://michelf.com/specs/markdown-extra/#block-element-generator 



I think it would help if the spec made it clearer what part of
each line of the blockquote is consumed before we go looking for
sub-elements, especially as far as consuming initial whitespace goes.


Quoting item 2 of blockquote (at the moment you wrote the above):

 A run of the [block element generator](#block-element-generator) by
 pushing the following sequence to the <var>context-line-prefix</var>
 stack:
 1.  Zero or one [insignificant-indent](#insignificant-indent)
 2.  `>`
 3.  Zero or one [space](#space)

This means that the block element generator is used as a grammar rule
at this point. It matches if it can generate one or more block
elements. Since each rule in the block generator first checks for a
hard-block-content-line-prefix, you could check for yourself that you
can match a hard-block-content-line-prefix prior to calling the generator
(this *could* be more performant).


I've added this to the block element generator section:

 The block element generator is used as a parsing rule in the grammar of
 the document element generator and the block element generator. The block
 element generator matches if one of the following rules matches and creates
 an element.

That said, I decided to revamp the blockquote rule to no longer use
the block element generator directly. Everything now passes through a
rule named block-element-run, matching one or more block elements
(using the block-element generator), and the blockquote's first `>` is
parsed separately in the blockquote rule instead of indirectly from
attempting to parse block elements.


Does this make it clearer?

By the way, I agree things are not optimal at the moment. They are
also way off the tracks of what PHP Markdown and Markdown.pl actually
do in many cases. The plan is to start by making something that mostly
works. Then I'll compare with the actual regular expressions used in
the code and make adjustments as necessary. After that, I'll compare
with test cases in MDTest, and with the output given by other
implementations in Babelmark. And I might mix the order a bit.



Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Parsing Code Blocks

2008-05-15 Thread Michel Fortin

I've rewritten the code block grammar in the Markdown Extra [spec][]  
to match what Markdown.pl and PHP Markdown do. It should now handle  
things such as this:


~~~
 One
Two

 Three
Four

Five
~~~

as one blockquote containing only one code block with five lines,  
equivalent to this one (using fenced code blocks instead for clarity):


~~~
 One
 Two

 Three
 Four

Five
~~~

I'm wondering though if code blocks shouldn't force a non-lazy  
syntax, which would mean yielding a result identical to this instead:


~~~
 One

Two

 Three

Four

Five
~~~

Thoughts?


[spec]: http://michelf.com/specs/markdown-extra/#block-element-generator 




Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Markdown Extra Spec: Parsing Section

2008-05-13 Thread Michel Fortin


Le 2008-05-13 à 2:20, John MacFarlane a écrit :

PS. Here's all you have to learn in order to write or read a PEG
grammar.


A B C      A followed by B followed by C
A | B      A or B (ordered choice)
A+         one or more As
A*         zero or more As
A?         optional A
!A         not followed by A
&A         followed by A (but does not consume A)
(A B)      grouping
.          matches any character
'x'        matches the character 'x'
"string"   matches the string "string"
[a-z]      matches a character from 'a' to 'z'
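
To make the operators in that table concrete, here is a minimal sketch of them as Python parser combinators. This is only an illustration: the function names (lit, seq, choice, star, plus, opt, not_, and_) are made up for this example and do not come from peg-markdown or any PEG tool. Each parser takes the text and a position and returns the new position, or None when it fails.

```python
def lit(s):
    """'x' or "string": match the literal text s."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None

def any_char():
    """. : match any single character."""
    return lambda text, pos: pos + 1 if pos < len(text) else None

def char_range(lo, hi):
    """[a-z]: match one character between lo and hi inclusive."""
    return lambda text, pos: (
        pos + 1 if pos < len(text) and lo <= text[pos] <= hi else None)

def seq(*parsers):
    """A B C: every parser must match, one after the other."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """A | B: ordered choice, the first alternative that matches wins."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def star(p):
    """A*: zero or more, greedy; stops if p matches without consuming."""
    def parse(text, pos):
        while True:
            nxt = p(text, pos)
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    return parse

def plus(p):
    """A+: one or more."""
    return seq(p, star(p))

def opt(p):
    """A?: optional."""
    return choice(p, lambda text, pos: pos)

def not_(p):
    """!A: succeed without consuming only if p fails here."""
    return lambda text, pos: pos if p(text, pos) is None else None

def and_(p):
    """&A: succeed without consuming only if p matches here."""
    return lambda text, pos: pos if p(text, pos) is not None else None

word = plus(char_range("a", "z"))
```

With these definitions, word("hello world", 0) stops at position 5, and choice(lit("a"), lit("ab")) consumes only one character of "ab" because the first alternative already matched. That second case is what "ordered choice" means in practice.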


It's certainly true that many parts could be converted to this and be
less verbose, and I find this idea appealing. I doubt the whole  
Markdown Extra ruleset can be expressed in this format though. Can a  
PEG grammar have parametrized rules?


I've just added nested block element support in the spec. This is done
by having the block element generator (formerly the block element
pass) keep a stack of rules to match at the start of each line. The
idea comes straight from Allan Odgaard's explanation of his lost
Markdown parser:
http://six.pairlist.net/pipermail/markdown-discuss/2008-March/001107.html
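
As a rough illustration of the stack idea, here is a small Python sketch. It is not the spec's algorithm: the rule names and the regular expressions in PREFIX_RULES are assumptions made up for this example. Each entry on the stack is a prefix that must match, outermost first, at the start of every line belonging to the nested block.

```python
import re

# Hypothetical prefix rules; names and patterns are illustrative only.
PREFIX_RULES = {
    "blockquote": re.compile(r" {0,3}> ?"),
    "list-item": re.compile(r" {4}|\t"),
}

def strip_prefixes(line, stack):
    """Match each rule on the stack in order at the start of `line`.

    Returns the text left after all prefixes, or None if a prefix
    fails to match (the line does not belong to the nested block).
    """
    pos = 0
    for name in stack:
        match = PREFIX_RULES[name].match(line, pos)
        if match is None:
            return None
        pos = match.end()
    return line[pos:]
```

For instance, strip_prefixes("> > text", ["blockquote", "blockquote"]) leaves "text", while a line with a blockquote marker followed by five spaces keeps four leading spaces after the prefix is removed, so an indented code block inside the quote is still recognizable.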




Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Markdown Extra Spec: Parsing Section

2008-05-12 Thread Michel Fortin


Le 2008-05-12 à 21:55, John MacFarlane a écrit :


I am assuming that there will be a different type to deal with link
text.


There will.


Is there any good reason for having two different types here?


The link text can contain other span-level elements, such as emphasis,  
code blocks, etc. This *has* to be taken into account while parsing.  
On the other hand, text in the reference part is just plain text.




As far as I can see, allowing anything that can serve as link text to be a
refname would not contradict anything in the official Markdown syntax
specification. In addition, it is hard to imagine a realistic case where
allowing brackets and newlines in refnames would break an existing
document. Why make users remember extra restrictions? (I didn't even
know about them until a few days ago, and I've used markdown for years.)
And why expose users to the risk that their documents will break if they
hard-wrap a long refname?


I'm in favor of allowing hard-wrapped reference names where the line  
break is not significant, so that will probably end up in the spec  
when I write the part about parsing the link span element.


Please keep in mind that the current refname construct is for the  
reference name in link definitions, and may be different from the one  
used in the link span element.




I think the current behavior of phpmarkdown and Markdown.pl is very
confusing. This produces a link:

   [[hi]][]

   [[hi]]: /url

But this doesn't produce a link:

   [hello][[hi]]

   [[hi]]: /url

So either (a) not all link references begin with a refname, or (b)
refnames can sometimes (but not always!) contain embedded brackets.
Either option would conflict with Michel's syntax specification
as it now stands.



This situation is indeed inconsistent. I'd be in favor of allowing
balanced square brackets in link references, even though John Gruber
seems (or seemed in 2006) to think they should be disallowed completely.
http://six.pairlist.net/pipermail/markdown-discuss/2006-September/000257.html 
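
To show what "balanced" would mean here, a minimal Python sketch of the check follows. The function name is hypothetical and this is not taken from PHP Markdown or Markdown.pl; it also ignores backslash-escaped brackets, which a real implementation would have to consider.

```python
def has_balanced_brackets(refname):
    """Return True if every '[' in refname is closed by a later ']'."""
    depth = 0
    for ch in refname:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                # A closing bracket with nothing open before it.
                return False
    return depth == 0
```

Under this rule the refname [hi] (as in [hello][[hi]]) would be accepted, while a refname with a stray opening or closing bracket would not.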




Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Markdown Extra Spec: Parsing Section

2008-05-12 Thread Michel Fortin


Le 2008-05-12 à 18:14, John MacFarlane a écrit :


The PEG representation is concise, precise, and readable.


Readable, hum... if I look at this rule from PEG Markdown:

ListContinuationBlock = a:StartList
    ( BlankLines
      { if (strlen($$.contents.str) == 0)
            $$.contents.str = strdup("\001"); /* block separator */
        pushelt($$, a); } )
    ( Indent ListBlock { pushelt($$, a); } )+
    { $$ = mk_str(concat_string_list(reverse(a.children))); }

it looks a lot like code to me, and I don't understand half of it. If
we're going this way, there's going to be a learning curve: for me,
and for everyone trying to understand the syntax. I'd prefer to avoid
forcing people to learn a new language only to understand the
specification.



Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Markdown Extra Spec: Parsing Section

2008-05-11 Thread Michel Fortin

Le 2008-05-11 à 20:55, Jacob Rus a écrit :

You should write it in something closer to a BNF-like format. The
current version is about 10x more verbose than necessary, and it
makes reading the spec considerably more difficult.

The reason I'm doing it like this is that I doubt everything will be
expressible in a BNF format. Using plain English descriptions allows
me to not bother about fitting things to a specific grammar and just
write what I feel is most natural and easiest to understand.

Shopping for a more formal and less verbose grammar, if we need one,
will be much easier once we know what we need, once we can compare
existing grammars against a checklist of what is necessary to
implement the given parsing algorithm.

If you remember the timetable I've given, you'll see that I've booked
about half a year for polishing things. This includes rephrasing
sentences, refactoring the syntax, and reformatting the spec to make
it easier to understand. This *could* include switching to a new
grammar format if it makes things more intuitive and readable.

Also, you're still going to have quite a few sticky edge cases with
your current parsing model. What happens when we have a `<>`-delimited
URL inside a blockquote? For instance:

> what about this <http://
> google.com/> case?

Well, currently newlines aren't allowed inside automatic links in
Markdown.pl, PHP Markdown and some others. Implementations that see an
automatic link there see it as a link to `http:// google.com/`
(notice the space) or `http://` (notice what's missing).

http://babelmark.bobtfish.net/?markdown=%0D%0A%3E+what+about+this+%3Chttp%3A%2F%2F%0D%0A%3E+google.com%2F%3E+case%3F&normalize=on&src=1&dest=2

Anyway, with the three-pass parsing model I'm currently defining,
this is pretty trivial to handle correctly: the block elements pass
extracts the text of the blockquote, leaving this for the span element
pass to parse:

what about this <http://
google.com/> case?

The span element pass would then see an autolink and just ignore any
newline it finds in the URL.
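
Here is a minimal Python sketch of those two steps, just to make the idea concrete. The function names and regular expressions are assumptions for this example, not the spec's rules; it handles a single level of blockquote and does no HTML escaping of the URL.

```python
import re

def extract_blockquote_text(lines):
    """Block pass (simplified): strip one level of '>' prefix per line."""
    return "\n".join(re.sub(r"^ {0,3}> ?", "", line) for line in lines)

# An autolink is an absolute URL between angle brackets. The negated
# character class also matches newlines, so the URL may span lines.
AUTOLINK = re.compile(r"<((?:https?|ftp)://[^<>]*)>")

def render_autolinks(text):
    """Span pass (simplified): emit a link, ignoring newlines in the URL."""
    def replace(match):
        url = re.sub(r"\s*\n\s*", "", match.group(1))
        return '<a href="{0}">{0}</a>'.format(url)
    return AUTOLINK.sub(replace, text)

source = ["> what about this <http://", "> google.com/> case?"]
html_out = render_autolinks(extract_blockquote_text(source))
```

Because the blockquote markers are gone before the span pass runs, the autolink rule only has to deal with a plain newline inside the URL.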

Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/


Re: Markdown Extra Spec: Parsing Section

2008-05-09 Thread Michel Fortin

I've clarified and changed a few things about some parsing rules and  
started defining new rules for the block elements pass.


Of note is the flat code block in the block elements pass, which
is going to be part of the next version of PHP Markdown Extra.



Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: a JavaScript front-end for Babelmark

2008-03-31 Thread Michel Fortin


Le 2008-03-31 à 3:35, Tomas Doran a écrit :

I've just stolen your code and plugged it into my mirror of  
Babelmark, looks really awesome!


I've also just re-jigged servers, so ruby, Java and Pandoc have been  
broken on my mirror over most of the weekend. I've fixed this now.


http://babelmark.bobtfish.net/?markdown=*This+**is+a+test*.&normalize=on


Great.

I think it'd be a little better if the compare checkbox was off by  
default however. And that sentence may need revisiting:


With Javascript diffing Copyright © 20048 by John Fraser


Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: [ANN] Babelmark

2008-03-23 Thread Michel Fortin


Le 2008-03-22 à 17:27, Michel Fortin a écrit :


2-tier list indented by three spaces:
http://michelf.com/projects/babelmark/?markdown=*+what%27s+up%3F%0D%0A+++*+ok

Now, on this one, I must say I have mixed feelings, since
python-markdown is the only implementation that follows Markdown
Syntax and treats the item indented by three spaces as being at the
same level. Makes me feel like a naive fool for following the spec.

:)


Well, you've been following the official spec; no one should call  
you a fool for that. But it certainly doesn't give much leverage to  
the idea of keeping the spec as it is.


It should be mentioned that, in addition to Python Markdown, both  
markdown.lua and Pandoc seem to follow the spec regarding list  
indentation.



Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



[ANN] Babelmark

2008-03-22 Thread Michel Fortin

I'm currently attempting to write a spec for parsing Markdown Extra,  
and since one goal is to minimize the differences in output between  
implementations, I've made a tool allowing me to compare who does what  
for any given input. I hope this can also facilitate future  
discussions about the syntax.


So here's Babelmark, a testbed for various Markdown implementations  
where you give the input and get the HTML output for Markdown.pl
1.0.1 and 1.0.2b8, PHP Markdown and PHP Markdown Extra, Python  
Markdown, Text::Markdown and Text::MultiMarkdown, and finally Showdown.


Unfortunately, my web host doesn't do Ruby, nor Java, C# or Lua, so the
online version is missing a couple of interesting implementations.
Locally on my computer Babelmark also does BlueCloth, Maruku, MarkdownJ,
markdown.lua, and Pandoc. I'm very sorry if your Markdown
implementation can't be part of Babelmark online, but if anyone has a
better host to offer for Babelmark, ideally with support for all of
these, I'd gladly send them the scripts.


You can find Babelmark here:

http://michelf.com/projects/babelmark/

Also note that, unlike the regular Markdown or PHP Markdown
dingus, you can bookmark specific tests since the input string is
encoded in the URI. Here are some interesting cases for instance:


http://michelf.com/projects/babelmark/?markdown=_test_test_test_
http://michelf.com/projects/babelmark/?markdown=*test+%5Btest*+test%5D%28%23%29 

http://michelf.com/projects/babelmark/?markdown=%5Btest%5D%0D%0A%0D%0A%5Btest%5D%3A+%23 
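
For anyone who wants to build such links from a script, here is a small Python sketch using only the standard library. The helper names are made up for this example; the only thing taken from the URLs above is that the input goes in a query parameter named markdown.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BABELMARK = "http://michelf.com/projects/babelmark/"

def babelmark_url(markdown_text):
    """Build a bookmarkable URL with the Markdown input in the query string."""
    return BABELMARK + "?" + urlencode({"markdown": markdown_text})

def markdown_from_url(url):
    """Recover the Markdown input from such a URL."""
    return parse_qs(urlsplit(url).query)["markdown"][0]

url = babelmark_url("[test]\r\n\r\n[test]: #")
```

Note that urlencode escapes a few more characters than a browser form does (an asterisk becomes %2A, for example), so some generated URLs will look different from the ones above while decoding to the same input.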



So that's it. Have fun with it.


Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Text::Markdown vs MDTest (Was: Re: forking Markdown.pl?)

2008-03-21 Thread Michel Fortin


Le 2008-03-21 à 16:39, Tomas Doran a écrit :

So, the *only* things that Text::Markdown currently fails on are  
small whitespace changes..


Hum, have you written your own test script?

I encourage you to use the mdtest.php script if you have PHP 5  
installed on your computer. It'll normalize the whitespace for you  
before comparing the output, ensuring that insignificant whitespace  
differences don't make any test fail. All you need is an executable  
you can invoke that will parse the standard input and put the result  
on the standard output and you can use MDTest like this:


./mdtest.php -n -s Markdown.pl

-n for normalize (and ignore insignificant whitespace), -s to tell  
mdtest to use the given script (such as Markdown.pl). You can add -d  
to see a diff for failing tests.
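
If PHP isn't available, the same idea can be approximated in a few lines of Python. To be clear, this is a rough sketch and not what mdtest.php actually does: the normalization rule below is an assumption, and it will wrongly flatten whitespace that is significant inside pre elements.

```python
import re
import subprocess

def normalize(html_text):
    """Collapse runs of whitespace and drop whitespace between tags."""
    text = re.sub(r"\s+", " ", html_text.strip())
    return re.sub(r">\s+<", "><", text)

def run_parser(command, markdown_text):
    """Feed Markdown on standard input, return the parser's standard output."""
    result = subprocess.run(command, input=markdown_text,
                            capture_output=True, text=True, check=True)
    return result.stdout

def same_output(command, markdown_text, expected_html):
    """Compare parser output with the expected HTML, ignoring whitespace."""
    actual = run_parser(command, markdown_text)
    return normalize(actual) == normalize(expected_html)
```

A call such as same_output(["perl", "Markdown.pl"], text, expected) would then treat outputs that differ only in blank lines or indentation as equal.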



Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/



Re: Javascript in URLs (was: Markdown doesn't always generate XHTML)

2008-03-15 Thread Michel Fortin


Le 2008-03-15 à 0:39, Waylan Limberg a écrit :


On Fri, Mar 14, 2008 at 11:22 PM, Michel Fortin


PHP Markdown also has a no-markup mode which would filter script tags
and any other HTML tags. But this doesn't prevent anyone from
inserting their own script on the page. Do you know you can inject a
script in a URL? Guess what this does:

[link](javascript:alert%28'Hello%20world!'%29)


This is a good point, and something I hadn't thought about myself. I
would think that markdown should *not* allow that regardless of any
safe/no-markup/whatever-you-call-it mode. If someone legitimately
wants javascript in their links/images/etc then they should be writing
raw html. What do you think?


Well, if you want your safe mode to be really safe, then sure, you
should not allow `javascript:` URIs.


But in general I believe Markdown should work with any URI. Markdown
is a means of writing web documents of all kinds, not only content from
external untrusted sources, and there are many legitimate reasons one
would want to write a `javascript:` URI.


Why would you want a non-safe Markdown to disallow such URIs in its  
link syntax if we're going to be able to add them using HTML tags  
anyway?




Of course, then how do we do that? Some possibilities I came up with
without much thought:

1. Truncate a URL at javascript:
2. Completely remove the entire URL (perhaps replace with blank string or #)
3. Leave the markup for the entire link as plain text (in other words,
   it's not considered a match)
4. Do some kind of escaping (not sure what at this point) and leave it
   in the URL


Whatever you do, you first have to detect script URIs, all of them;
this is no trivial matter. Most of these will run a script in IE or
some other browser (based on the [XSS cheat sheet][1]):


[link](vbscript:msgbox%28%22Hello%20world!%22%29)
[link](livescript:alert%28'Hello%20world!'%29)
[link](mocha:[code])
[link](jAvAsCrIpT:alert%28'Hello%20world!'%29)
[link](ja&#32;vas&#32;cr&#32;ipt:alert%28'Hello%20world!'%29)
[link](ja&#00032;vas&#32;cr&#32;ipt:alert%28'Hello%20world!'%29)
[link](ja&#x00020;vas&#32;cr&#32;ipt:alert%28'Hello%20world!'%29)
[link](ja%09&#x20;%0Avas&#32;cr&#x0a;ipt:alert%28'Hello%20world!'%29)
[link](ja%20vas%20cr%20ipt:alert%28'Hello%20world!'%29)
[link](live%20script:alert%28'Hello%20world!'%29)

I can't claim this is an exhaustive list, nor that they're all going  
to work, but it should give an idea of the problem at hand.


I think blacklisting known dangerous schemes is always going to leave
holes. A better approach is to have a white list of known safe URI
schemes and disallow any scheme not in that list. But that would be far
too restrictive for any non-safe Markdown.


Security filters already exist to do that (like kses); I'd say it's
much simpler *and* safer to use such a specialized filter on
Markdown's output than trying to come up with our own integrated within
Markdown.
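
For illustration only, here is roughly what a white list check looks like in Python. The list of schemes and the function name are assumptions made up for this example, it decodes only one layer of escaping, and it is in no way a substitute for a dedicated filter like the ones mentioned above.

```python
import html
import re
from urllib.parse import unquote

# Hypothetical white list; a real deployment would choose its own.
SAFE_SCHEMES = {"http", "https", "ftp", "mailto"}

def is_safe_url(url):
    """Return True if the URL is relative or uses a white-listed scheme."""
    # Undo HTML entities and percent-escapes so a disguised scheme surfaces.
    decoded = unquote(html.unescape(url))
    # Drop whitespace and control characters that some browsers ignore.
    decoded = re.sub(r"[\x00-\x20\x7f]", "", decoded)
    match = re.match(r"([A-Za-z][A-Za-z0-9+.\-]*):", decoded)
    if match is None:
        # No scheme at all: a relative URL such as "/page" or "#anchor".
        return True
    return match.group(1).lower() in SAFE_SCHEMES
```

The point of checking against allowed schemes rather than forbidden ones is that an unknown scheme is rejected by default, so obfuscated variants like those listed above fail the test without each one having to be anticipated.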


 [1]: http://ha.ckers.org/xss.html


Michel Fortin
[EMAIL PROTECTED]
http://michelf.com/


