On Jul 09, 2014, at 11:49 AM, Sean Leonard <dev+i...@seantek.com> wrote:

Hi markdown-discuss Folks:

I am working on a Markdown effort in the Internet Engineering Task Force, to standardize on "text/markdown" as the Internet media type for all variations of Markdown content. You can read my draft here: <http://tools.ietf.org/html/draft-seantek-text-markdown-media-type-00>.

My response below is lengthy but covers a number of different points including 
some raised later in the discussion by others. 

Sean, have you reached out to Mr. Gruber specifically? I mention this because 
in the past I have CCed him directly on a response I sent to this list which 
prompted him to respond (admittedly that happened some years ago). I suspect he 
might be amenable to the general idea though. A search of the list archives 
turned up a previous discussion [1] where he indicated a willingness to put in 
some work to obtain a mime type for markdown. Of course, that was back when he 
was still actively involved. Your mileage may vary.

[1]: http://article.gmane.org/gmane.text.markdown.general/1179

In any event, I have some thoughts about your proposal. I like it for the most 
part. But a few comments on some specifics:

Why do we need a Mime Type?
----------------------------------------

First of all, when is this necessary? In other words, when is plain markdown being sent 
around such that it needs a mime type? In my experience, REST APIs (for example) use 
JSON or XML which may contain some Markdown text among other data. That other data may 
identify that the text is "markdown", but the mime type for the file is JSON or 
XML (or at least the appropriate mime type for that file type). Or are you proposing that 
everyone standardize on a way to identify the markdown text within JSON and XML documents 
as Markdown text? What am I missing here?

Encodings
--------------

To shed a different light on the encoding issue, consider Python-Markdown 
(disclosure: I'm the primary developer). Just as in Python 3 (where all strings 
are Unicode), Python-Markdown only works with Unicode. You pass Unicode text 
in, and you get Unicode text out. It is up to the user of the library to 
concern themselves with encoding and decoding a file to/from a specific 
encoding. As Python provides the libraries to do that, it is not a big problem 
-- although for those used to working with byte strings it may be a little 
jarring (I'm seeing that reaction from people who are experimenting with 
Apple's new Swift language -- which also supports Unicode-only strings).

The point is, the Python-Markdown implementation has no use for the encoding 
(except for the included wrapping commandline script). Of course, the user 
(user of the library) will care about that and will need some way to identify 
the encoding before converting and passing the input on to the Python-Markdown 
library. So yes, encoding is very much a real, needed piece of meta-data.
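
To make that division of labor concrete, here is a minimal sketch. It assumes 
the caller already knows the encoding (from a charset parameter, say). The 
names `render_bytes` and `convert` are hypothetical; `convert` stands in for 
any Unicode-in, Unicode-out function such as Python-Markdown's 
`markdown.markdown`.

```python
from typing import Callable


def render_bytes(raw: bytes, encoding: str, convert: Callable[[str], str]) -> bytes:
    """Decode raw Markdown bytes, convert the text, and re-encode the result.

    Hypothetical helper: `convert` is any Unicode-in/Unicode-out function
    (for instance markdown.markdown). It never sees the encoding.
    """
    text = raw.decode(encoding)    # bytes -> Unicode: the caller's job
    html = convert(text)           # the converter only ever handles Unicode
    return html.encode(encoding)   # Unicode -> bytes: again the caller's job


if __name__ == "__main__":
    # Stand-in converter so the sketch runs without third-party packages.
    def fake_convert(text: str) -> str:
        return "<p>" + text.strip() + "</p>"

    source = "Grüße, *Welt*".encode("latin-1")
    print(render_bytes(source, "latin-1", fake_convert))
```

The only place the encoding appears is in the wrapper, which is exactly where 
a charset value would be consumed.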

However, if the markdown text is included in a JSON file (see my previous point 
above), then wouldn't the encoding be defined for the JSON file, not the 
markdown text specifically? The JSON parsing library would just spit out a 
Unicode string -- in which case, why do we need this?
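
A small sketch of that situation, using only Python's standard `json` module. 
The payload layout and the field names "format" and "body" are assumptions 
made up for illustration, as is the helper name `extract_markdown`.

```python
import json


def extract_markdown(raw: bytes) -> str:
    """Pull the Markdown text out of a JSON document received as bytes.

    The "format" and "body" field names are hypothetical.
    """
    document = json.loads(raw)   # json.loads accepts UTF-8 bytes (Python 3.6+)
    if document.get("format") != "markdown":
        raise ValueError("payload does not claim to contain markdown")
    return document["body"]      # already a Unicode str at this point


if __name__ == "__main__":
    payload = json.dumps(
        {"format": "markdown", "body": "Café *menu*"}, ensure_ascii=False
    ).encode("utf-8")
    print(extract_markdown(payload))
```

By the time the Markdown text reaches the converter, the byte encoding has 
already been dealt with at the JSON layer.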

Flavors
---------

To me, "flavors" seems like a disaster waiting to happen. Sean, I realize you 
have specifically stated a lack of understanding here, so let's go back in time. The 
following may not be an all-inclusive (or in proper order of events) history of Markdown, 
but provides enough (I hope) to make a point.

Way back when, the "flavor" of markdown you used depended almost entirely on which language (Perl, 
PHP, Python...) you were using to code your project (blog, wiki, CMS, etc.). If you were using PHP, then 
your flavor was PHP Markdown... There was only one implementation per language and they (mostly?) agreed with 
each other. In those days "flavor" was completely pointless. I suspect a number of us resistant to 
the "flavors" part of your proposal are from that period in Markdown's history.

Of course, then Ruby came along. I don't remember which library was which, but when the first 
library came out, it was not very good (lots of bugs and slow). Then a second library came out 
which also wasn't very good, but in different ways (except for the slow part). Some people wrote 
their markdown documents with the bugs of the first implementation in mind, while others wrote 
their documents with the second in mind. Then a few projects started offering users the option to 
pick which Ruby implementation of Markdown to use for each individual document - and 
"flavors" were born. Then other people started making ports of those projects to other 
languages and the "flavors" followed -- even though the other languages didn't really 
have any choices. As a reminder, Github came out of that Ruby culture, which might explain why 
Github-Flavored-Markdown ever existed in the first place (interesting side note: Gruber appears to 
like GFM [2] -- or at least the original release -- it has grown to include more features since 
then).

[2]: http://daringfireball.net/linked/2009/10/23/github-flavored-markdown

Then someone wrote a PEG grammar for Markdown. Once the hard work was done, a 
few people ported that grammar to other languages. And then a few people wrote 
C implementations (one of which used a PEG Grammar IIRC). Then, people wrote 
wrappers around the C libraries for any number of scripting languages (Perl, 
PHP, Python, Ruby...) and now there are a multitude of choices regardless of 
which language your project is coded in. Some time ago I started an incomplete 
[list] -- incomplete because those are the implementations I am aware of -- I'm 
sure there are some others.

[list]: https://github.com/markdown/markdown.github.com/wiki/Implementations

But for those of us that remember the pre-Ruby days there is only "one true implementation" per 
language and all the rest is just a bunch of noise (Okay, perhaps I exaggerate a bit -- just trying to make a 
point). For us "flavors" means something else entirely. Because before all this Ruby and C mess, we 
also had Multimarkdown and PHP Markdown Extra, more-or-less extending the same basic markdown syntax. Of 
course, those extensions are not identical, but given that each was implemented in a different language, it 
didn't matter. The "flavor" depended on which language your project was implemented in and that was 
it.

Of course, many of the extensions created in Multimarkdown and PHP Markdown Extra were then ported to other 
implementations in other languages. Consider Python-Markdown for instance. Python-Markdown provides an extension API so 
that any user of the library can write an extension which modifies the syntax in any way they wish -- to the point that 
it may not be Markdown any more. And a number of extensions ship with the Python-Markdown library [3]. Of those (at 
current count) 17 extensions, 7 of them also come under the umbrella of an 8th -- Extra. In other words, each 
individual feature of PHP Markdown Extra was implemented as its own extension, then when we had all of them, a wrapping 
extension (called "extra") was created as a shortcut. Some users use "extra", but others only use 
"footnotes" (for example). Any number of "flavors" are possible with the various combinations of 
extensions that ship with just this one library. And many of those extensions also accept user-defined configuration 
settings which alter that extension's behavior (see footnotes [4] for an example). Then, there is a fairly extensive 
list of third party extensions [5] (which is always changing). I don't imagine that there is any sensible way to define 
all those possibilities in a way that is also understandable by other markdown implementations.

[3]: https://pythonhosted.org/Markdown/extensions/index.html
[4]: https://pythonhosted.org/Markdown/extensions/footnotes.html
[5]: https://github.com/waylan/Python-Markdown/wiki/Third-Party-Extensions
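
As a rough back-of-the-envelope illustration of the combinations point: if each 
of the 17 bundled extensions is treated as an independent on/off switch, the 
number of possible combinations is 2^17. That is a simplifying assumption; it 
ignores configuration settings and third-party extensions, and it ignores that 
"extra" is a shortcut for seven of the others. The extension names in the 
small cross-check are only examples.

```python
from itertools import combinations

BUNDLED_EXTENSIONS = 17  # count quoted above, treated as independent switches


def count_combinations(n: int) -> int:
    """Number of on/off combinations for n independent extensions."""
    return 2 ** n


def enumerate_combinations(names):
    """List every subset of `names`, including the empty one."""
    return [
        subset
        for size in range(len(names) + 1)
        for subset in combinations(names, size)
    ]


if __name__ == "__main__":
    sample = ["footnotes", "tables", "toc"]  # illustrative names only
    # Cross-check the formula against brute-force enumeration on a tiny list.
    assert len(enumerate_combinations(sample)) == count_combinations(len(sample))
    print(count_combinations(BUNDLED_EXTENSIONS))
```

Even under those generous simplifications, a single library yields far more 
combinations than any registry of named flavors could reasonably list.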

The great thing about Markdown is that any (decent) parser will simply pass over markup 
it doesn't understand. The text will just get passed through as (mostly) plain text. 
Given that one of the guiding principles behind Markdown is that it is human readable, if 
a particular implementation does not support a certain extension, the reader of the 
output could still understand the intended meaning and formatting (or at least "view 
source" as others have mentioned). Of course, this depends on a number of factors 
(overridden tokens, HTML's whitespace collapsing considerations, etc.). There are 
certainly many examples for which that does not hold true. But overall, I don't see that 
as a large concern.

So, the point (finally) is that "flavors" seem like an impossible-to-get-right part of your proposal and really won't matter in the real world. For example, if 
you send me some markdown text with a flavor of "markdown.pl", but I'm using Awk as my programming language, then I'm not going to use markdown.pl anyway. Or, 
if you send me a flavor of "extra", Awk doesn't have an implementation that supports "extra" (AFAIK), so, that is useless to me as well. On the other 
hand, if I'm using Python, I can account for "extra" easily. Or for "markdown.pl" (just turn off smart_emphasis [6]). But "multimarkdown" 
is a different matter (I'm not exactly sure which features are supported by Multimarkdown or whether Python-Markdown's extensions implement them in the same way). And 
then there's "gfm" and "pandoc" and ... so many variations to account for. I think I'll just ignore this flavor stuff and use the implementation of 
*my* choice which may or may not support the flavor sent my way.

[6]: https://pythonhosted.org/Markdown/reference.html#smart_emphasis
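
To sketch what a receiver might actually do with a declared flavor: look the 
label up in a local table and fall back to defaults when it is not recognized. 
Everything here is hypothetical, including the function name `settings_for` 
and the table itself. The values mirror the two cases mentioned above 
("extra" and turning off smart_emphasis) and are illustrative settings, not a 
verified API for any particular library version.

```python
# Hypothetical mapping from a declared "flavor" label to local converter
# settings. Keys come from the examples in this message; values are
# illustrative keyword arguments only.
KNOWN_FLAVORS = {
    "extra": {"extensions": ["extra"]},
    "markdown.pl": {"smart_emphasis": False},
}


def settings_for(flavor):
    """Return (settings, recognized) for a declared flavor label.

    Unrecognized or missing labels yield empty settings, meaning "use
    whatever the local implementation does by default".
    """
    if flavor is None:
        return {}, False
    key = flavor.strip().lower()
    if key in KNOWN_FLAVORS:
        return dict(KNOWN_FLAVORS[key]), True  # shallow copy of the entry
    return {}, False


if __name__ == "__main__":
    for label in ("extra", "markdown.pl", "multimarkdown", "gfm", None):
        print(label, settings_for(label))
```

Note how short the table is compared with the list of labels a sender might 
use; everything else silently falls through to the local default.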

I hope that helps.

Waylan Limberg
_______________________________________________
Markdown-Discuss mailing list
Markdown-Discuss@six.pairlist.net
http://six.pairlist.net/mailman/listinfo/markdown-discuss
