RE: Moving Markdown towards a standard syntax

2014-08-14 Thread Waylan Limberg
-Original Message-
From: Markdown-Discuss [mailto:markdown-discuss-boun...@six.pairlist.net] On
Behalf Of John MacFarlane
Sent: Thursday, August 14, 2014 2:10 PM
To: Discussion related to Markdown.
Subject: Re: Moving Markdown towards a standard syntax

 Here's an attempt of my own, which I've been working on for some time:
http://jgm.github.io/stmd/


Interesting. You appear to mostly be tightening up the existing rules, with
only a few deviations. I have little of value to say about most of it, so
only a few observations follow.

"Tabs in lines are expanded to spaces, with a tab stop of 4 characters." We
strictly enforce this rule in Python-Markdown and I like it, but we have
received bug reports from time to time that certain languages require tabs
(spaces would be a syntax error). I think makefiles would be the most
well-known example. For example, how would you expect someone to be able to
copy from a code block in a blog post and paste into a makefile without
needing to go back and edit all the whitespace? If they are copying and
pasting, they are not likely to be an advanced user, and significant
whitespace is already one of the most non-obvious gotchas. Just an
observation here. The answer isn't clear to me either.
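
To make the makefile concern concrete, here is a minimal Python sketch of tab expansion at a tab stop of 4, using the standard `str.expandtabs`. The helper name `expand_tabs` and the sample recipe line are made up for illustration; this is not taken from Python-Markdown or from the stmd implementations.

```python
def expand_tabs(line: str, tab_stop: int = 4) -> str:
    """Replace each tab with spaces up to the next multiple of tab_stop."""
    return line.expandtabs(tab_stop)


# A makefile recipe line must begin with a literal tab character.
recipe = "\tcc -o hello hello.c"
expanded = expand_tabs(recipe)

print(repr(expanded))             # the leading tab has become four spaces
print(expanded.startswith("\t"))  # False: the tab that make requires is gone
```

Anyone copying `expanded` out of rendered output gets spaces, which is exactly the case the bug reports describe.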

I notice that you state that an HTML block ends with a blank line. I have
always wished that it worked that way for the very reasons you cited. As you
observe, things like raw `pre` blocks can't have blank lines (you might want
to add comments, processing instructions, and CDATA to that list, along with
workarounds?). Either way is a compromise and it is not clear to me which
is the better way to go.
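
A rough Python sketch of the blank-line rule, to show why a raw `pre` block with a blank line in it is a problem. The function name `take_html_block` is hypothetical, and the sketch assumes the caller has already decided that the first line starts an HTML block.

```python
def take_html_block(lines):
    """Return (block, rest): the block runs from the first line up to,
    but not including, the first blank line."""
    for i, line in enumerate(lines):
        if line.strip() == "":
            return lines[:i], lines[i:]
    return lines, []


source = [
    "<pre>",
    "first paragraph",
    "",
    "second paragraph",
    "</pre>",
]
block, rest = take_html_block(source)
print(block)  # the HTML block stops before the blank line
print(rest)   # the closing tag ends up outside the block
```

Under this rule the closing `</pre>` is left for the normal Markdown rules to handle, which is the compromise being discussed.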

"A blank line always separates block quotes." Brilliant!

I absolutely love what you did with how much indentation indicates nesting
within a list (for all block elements, not just nested lists). However, I
expect most people will have trouble getting it right in practice. And I
wouldn't want to write a parser for that. But I sure would enjoy writing
lists with it.

"Two blank lines will end a list." Really? What about a code block nested
within a list item that contains multiple blank lines? If it wasn't for that
corner case, I would love this too. Or is there an exception for that?
Example 198 seems to indicate so, but I don't see it explicitly stated
anywhere. Is it for fenced code blocks only (because you can look for the
closing fence -- if so, makes sense to me) or does it work with indented
code blocks also?

"Changing the bullet or ordered list delimiter starts a new list." As it
should. I also like the start number being set on ordered lists.
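
As an illustration of the delimiter rule, here is a small Python sketch that groups list-item lines into separate lists whenever the bullet character or the ordered delimiter changes, and records the start number of each ordered list. The regular expression and the `split_lists` helper are assumptions made for this example, not the spec's grammar; the sketch looks only at top-level items and skips every other line.

```python
import re

# A bullet marker (-, + or *), or digits followed by "." or ")".
MARKER = re.compile(r"^(?:([-+*])|(\d+)([.)]))\s+")


def split_lists(lines):
    """Group list-item lines; a change of bullet character or ordered
    delimiter starts a new list."""
    lists, current, current_key = [], None, None
    for line in lines:
        m = MARKER.match(line)
        if not m:
            continue  # this sketch ignores anything that is not a list item
        bullet, number, delim = m.groups()
        key = bullet if bullet else delim
        if key != current_key:
            start = int(number) if number else None
            current = {"type": key, "start": start, "items": []}
            lists.append(current)
            current_key = key
        current["items"].append(line[m.end():])
    return lists


sample = ["- a", "- b", "+ c", "3. d", "4. e", "5) f"]
for found in split_lists(sample):
    print(found)
```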

"A backslash at the end of the line is a hard line break." Brilliant! I see
you preserved the 'two spaces' rule. You changed some other things (like
list nesting) enough that backward compatibility with existing docs
shouldn't be a concern. Therefore, I'm not sure we need both.
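
For comparison, both hard-break markers can be detected with a few lines of Python. This is only a sketch: `has_hard_break` is a hypothetical helper, and it deliberately ignores details a real parser must handle, such as code spans, escaped backslashes, and the last line of a paragraph.

```python
def has_hard_break(line: str) -> bool:
    """True if the line ends with a backslash or with two or more spaces."""
    body = line.rstrip("\n")
    return body.endswith("\\") or body.endswith("  ")


for candidate in ["foo\\\n", "foo  \n", "foo \n", "foo\n"]:
    # repr() makes the trailing spaces visible, which plain text does not.
    print(repr(candidate), has_hard_break(candidate))
```

The `repr` output shows the practical difference between the two: the backslash is visible in the source, while trailing spaces are not.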

Every implementation should follow your strong/emphasis spec. All
implementers go change your implementations now... oh wait, that means more
work for me...

If I understand you correctly, all autolinks must be surrounded by angle
brackets (the right call, btw). Perhaps you should include a URL **without**
angle brackets in your list of non-autolinks to make that clear.
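
The kind of test case meant here can be sketched in Python. The `AUTOLINK` pattern below is a simplified assumption (a scheme, `://`, then no whitespace or angle brackets), not the spec's actual autolink grammar, and the example.com URLs are placeholders.

```python
import re

# Assumed, simplified pattern: <scheme://anything-without-spaces-or-brackets>
AUTOLINK = re.compile(r"<([A-Za-z][A-Za-z0-9+.-]*://[^\s<>]*)>")


def find_autolinks(text):
    """Return the URLs that appear inside angle brackets."""
    return AUTOLINK.findall(text)


text = "See <http://example.com/a> and http://example.com/b for details."
print(find_autolinks(text))  # only the bracketed URL is reported
```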

That must have been an incredible amount of work. I'm only concerned that
it's not really markdown anymore. I'd be tempted to take it that extra step
and get rid of all the other things that could be slightly better and call
it something else. Of course, I don't see some new plain text markup
language taking off now that markdown is so widespread. But it is fun to
think about.

Waylan Limberg

___
Markdown-Discuss mailing list
Markdown-Discuss@six.pairlist.net
http://six.pairlist.net/mailman/listinfo/markdown-discuss


Re: Moving Markdown towards a standard syntax

2014-08-14 Thread John MacFarlane

Waylan,

Thanks for your comments!  I'm glad it looks generally useful.


"Tabs in lines are expanded to spaces, with a tab stop of 4 characters." We
strictly enforce this rule in Python-Markdown and I like it, but we have
received bug reports from time to time that certain languages require tabs
(spaces would be a syntax error). I think makefiles would be the most
well-known example. For example, how would you expect someone to be able to
copy from a code block in a blog post and paste into a makefile without
needing to go back and edit all the whitespace? If they are copying and
pasting, they are not likely to be an advanced user, and significant
whitespace is already one of the most non-obvious gotchas. Just an
observation here. The answer isn't clear to me either.


Yes, this is something that has always bothered me about the markdown
rules.  In pandoc I have an option to leave tabs intact, and the parser
knows how to handle tabs, so it can be done.  But it would certainly
complicate the spec.


I notice that you state that an HTML block ends with a blank line. I have
always wished that it worked that way for the very reasons you cited. As you
observe, things like raw `pre` blocks can't have blank lines (you might want
to add comments, processing instructions, and CDATA to that list, along with
workarounds?). Either way is a compromise and it is not clear to me which
is the better way to go.


Agreed.  There are pros and cons either way.  I could be persuaded to
go with something more standard (looking for matching closing tags),
but this approach has an appealing simplicity and flexibility.


"A blank line always separates block quotes." Brilliant!

I absolutely love what you did with how much indentation indicates nesting
within a list (for all block elements, not just nested lists). However, I
expect most people will have trouble getting it right in practice. And I
wouldn't want to write a parser for that. But I sure would enjoy writing
lists with it.


I think the way the rules for lists are written is currently pretty hard
to understand, but this is a writing issue that can be improved.  For
most people, the kind of informal presentation given in John Gruber's
syntax document should be enough to get them up and running.  I designed
these rules so that lists that look natural should normally be parsed in
the way their authors expect, so authors shouldn't need to think too
hard about the rules.

As for writing a parser:  I believe the algorithm used in my JavaScript
implementation could be easily ported over to other dynamic languages.


"Two blank lines will end a list." Really? What about a code block nested
within a list item that contains multiple blank lines? If it wasn't for that
corner case, I would love this too. Or is there an exception for that?
Example 198 seems to indicate so, but I don't see it explicitly stated
anywhere. Is it for fenced code blocks only (because you can look for the
closing fence -- if so, makes sense to me) or does it work with indented
code blocks also?


This is something that needs clarification, thanks.  The C
implementation allows multiple blank lines in fenced code blocks, and
the js implementation should too (but currently doesn't).  But I see
there's nothing about it in the spec.

It would probably not be good to make indented code blocks behave the
same way, since one of the reasons for the two-blanks rule is to deal
with cases involving indented code.

I'd also be open to the idea of dropping the two-blanks rule, which
adds additional complexity.  But it is something that seems to come up
a lot, and without it you sometimes need artifices like HTML comments
to split up lists or separate lists from indented code blocks.
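
The behaviour described above (two blank lines end a list, except inside a fenced code block) can be sketched in Python as follows. This is an illustration only, not the algorithm of the C or js implementation: `list_end` is a hypothetical helper, it recognizes only backtick fences, and it treats any fence line as a simple open/close toggle. Indented code gets no special treatment, in line with the reasoning above.

```python
FENCE = "`" * 3  # a backtick fence, built this way to keep the example tidy


def list_end(lines):
    """Return the index of the first of two consecutive blank lines that
    lie outside a fenced code block, or len(lines) if there is none."""
    in_fence = False
    blanks = 0
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith(FENCE):
            in_fence = not in_fence
            blanks = 0
        elif stripped == "" and not in_fence:
            blanks += 1
            if blanks == 2:
                return i - 1
        else:
            blanks = 0  # content, or a blank line inside a fence
    return len(lines)


doc = [
    "- item",
    "",
    "  " + FENCE,
    "  a",
    "",
    "",
    "  b",
    "  " + FENCE,
    "",
    "",
    "next paragraph",
]
print(list_end(doc))  # the blank lines inside the fence do not end the list
```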


"Changing the bullet or ordered list delimiter starts a new list." As it
should. I also like the start number being set on ordered lists.

"A backslash at the end of the line is a hard line break." Brilliant! I see
you preserved the 'two spaces' rule. You changed some other things (like
list nesting) enough that backward compatibility with existing docs
shouldn't be a concern. Therefore, I'm not sure we need both.


Backwards compatibility was actually a big concern for me.  With list
nesting, it is impossible to be fully backwards compatible with every
existing implementation, because they are incompatible with each other.
But the rules I've given should make most normal looking lists (that is,
lists that aren't TRYING to break things) work in a large range of
different implementations.

For that reason, I favor keeping the two-spaces line break, which is
also nice in documents that you want to look pretty in plain text (a
big goal of markdown).


Every implementation should follow your strong/emphasis spec. All
implementers go change your implementations now... oh wait, that means more
work for me...

If I understand you correctly, all autolinks must be surrounded by angle
brackets (the right call, btw). Perhaps you should include a URL **without**