Re: text/markdown effort in IETF (invite)

Waylan Limberg Thu, 10 Jul 2014 08:57:34 -0700

On Jul 09, 2014, at 11:49 AM, Sean Leonard <dev+i...@seantek.com> wrote:


Hi markdown-discuss Folks:

I am working on a Markdown effort in the Internet Engineering Task Force, to standardize on "text/markdown" as the Internet media type for all variations of Markdown content. You can read my draft here: <http://tools.ietf.org/html/draft-seantek-text-markdown-media-type-00>.

My response below is lengthy but covers a number of different points including
some raised later in the discussion by others.

Sean, have you reached out to Mr. Gruber specifically? I mention this because
in the past I have CCed him directly on a response I sent to this list which
prompted him to respond (admittedly that happened some years ago). I suspect he
might be amenable to the general idea though. A search of the list archives
turned up a previous discussion [1] where he indicated a willingness to put in
some work to obtain a mime type for markdown. Of course, that was back when he
was still actively involved. Your mileage may vary.

[1]: http://article.gmane.org/gmane.text.markdown.general/1179

In any event, I have some thoughts about your proposal. I like it for the most
part. But a few comments on some specifics:

Why do we need a Mime Type?
----------------------------------------

First of all, when is this necessary? In other words, when is plain markdown being sent
around such that it needs a mime type? In my experience, REST APIs (for example) use
JSON or XML which may contain some Markdown text among other data. That other data may
identify that the text is "markdown", but the mime type for the file is JSON or
XML (or at least the appropriate mime type for that file type). Or are you proposing that
everyone standardize on a way to identify the markdown text within JSON and XML documents
as Markdown text? What am I missing here?

Encodings
--------------

To shed a different light on the encoding issue, consider Python-Markdown
(disclosure: I'm the primary developer). Just as in Python 3 (where all strings
are Unicode), Python-Markdown only works with Unicode. You pass Unicode text
in, and you get Unicode text out. It is up to the user of the library to
concern themselves with encoding and decoding a file to/from a specific
encoding. As Python provides the libraries to do that, it is not a big problem
-- although for those used to working with byte strings it may be a little
jarring (I'm seeing that reaction from people who are experimenting with
Apple's new Swift language -- which also supports Unicode-only strings).

The point is, the Python-Markdown implementation has no use for the encoding
(except for the included wrapping command-line script). Of course, the user
(user of the library) will care about that and will need some way to identify
the encoding before converting and passing the input on to the Python-Markdown
library. So yes, encoding is very much a real, needed piece of meta-data.
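
To make that division of labor concrete, here is a minimal sketch of the
decode / convert / encode round trip. The names `convert_bytes` and
`fake_convert` are made up for illustration; `fake_convert` is only a stand-in
for any Unicode-in, Unicode-out converter (such as `markdown.markdown`), and
the encoding is assumed to be known to the caller from somewhere else.

```python
from typing import Callable


def convert_bytes(raw: bytes, encoding: str, convert: Callable[[str], str]) -> bytes:
    """Decode raw bytes, run a Unicode-only converter, re-encode the result."""
    text = raw.decode(encoding)    # bytes -> Unicode; the caller supplies the encoding
    html = convert(text)           # Unicode in, Unicode out; no encoding involved here
    return html.encode(encoding)   # Unicode -> bytes, ready to write back out


def fake_convert(text: str) -> str:
    # Hypothetical stand-in for a real Markdown converter.
    return "<p>" + text.strip() + "</p>"


raw = "Caf\u00e9 *au lait*\n".encode("latin-1")
result = convert_bytes(raw, "latin-1", fake_convert)
print(result)
```

The converter never sees the encoding; only the code around it does, which is
why the encoding has to travel with the bytes as meta-data.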

However, if the markdown text is included in a JSON file (see my previous point
above), then wouldn't the encoding be defined for the JSON file, not the
markdown text specifically? The JSON parsing library would just spit out a
Unicode string -- in which case, why do we need this?
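
A small sketch of what I mean, using Python's standard `json` module. The
`format` and `body` field names are invented for this example and are not
taken from any particular API.

```python
import json

# Bytes as they might arrive over the wire; the encoding belongs to the
# JSON document as a whole, not to the Markdown text inside it.
payload = json.dumps(
    {"format": "markdown", "body": "Caf\u00e9 *au lait*"},
    ensure_ascii=False,
).encode("utf-8")

doc = json.loads(payload.decode("utf-8"))
body = doc["body"]

# The Markdown text is already a Unicode string at this point.
print(type(body).__name__, doc["format"])
```

By the time the Markdown text reaches the converter, the JSON layer has
already dealt with the encoding.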

Flavors
---------

To me, "flavors" seems like a disaster waiting to happen. Sean, I realize you
have specifically stated a lack of understanding here, so let's go back in time. The
following may not be an all-inclusive (or in proper order of events) history of Markdown,
but provides enough (I hope) to make a point.

Way back when, the "flavor" of markdown you used depended almost entirely on which language (Perl,
PHP, Python...) you were using to code your project (blog, wiki, CMS, etc.). If you were using PHP, then
your flavor was PHP Markdown... There was only one implementation per language and they (mostly?) agreed with
each other. In that day "flavor" was completely pointless. I suspect a number of us resistant to
the "flavors" part of your proposal are from that period in Markdown's history.

Of course, then Ruby came along. I don't remember which library was which, but when the first
library came out, it was not very good (lots of bugs and slow). Then a second library came out
which also wasn't very good, but in different ways (except for the slow part). Some people wrote
their markdown documents with the bugs of the first implementation in mind, while others wrote
their documents with the second in mind. Then a few projects started offering users the option to
pick which Ruby implementation of Markdown to use for each individual document - and
"flavors" were born. Then other people started making ports of those projects to other
languages and the "flavors" followed -- even though the other languages didn't really
have any choices. As a reminder, Github came out of that Ruby culture, which might explain why
Github-Flavored-Markdown ever existed in the first place (interesting side note: Gruber appears to
like GFM [2] -- or at least the original release -- it has grown to include more features since
then).

[2]: http://daringfireball.net/linked/2009/10/23/github-flavored-markdown

Then someone wrote a PEG grammar for Markdown. Once the hard work was done, a
few people ported that grammar to other languages. And then a few people wrote
C implementations (one of which used a PEG Grammar IIRC). Then, people wrote
wrappers around the C libraries for any number of scripting languages (Perl,
PHP, Python, Ruby...) and now there are a multitude of choices regardless of
which language your project is coded in. Some time ago I started an incomplete
[list] -- incomplete because those are the implementations I am aware of -- I'm
sure there are some others.

[list]: https://github.com/markdown/markdown.github.com/wiki/Implementations

But for those of us that remember the pre-Ruby days there is only "one true implementation" per
language and all the rest is just a bunch of noise (Okay, perhaps I exaggerate a bit -- just trying to make a
point). For us "flavors" means something else entirely. Because before all this Ruby and C mess, we
also had Multimarkdown and PHP Markdown Extra, more-or-less extending the same basic markdown syntax. Of
course, those extensions are not identical, but given that each was implemented in a different language, it
didn't matter. The "flavor" depended on which language your project was implemented in and that was
it.

Of course, many of the extensions created in Multimarkdown and PHP Markdown Extra were then ported to other
implementations in other languages. Consider Python-Markdown for instance. Python-Markdown provides an extension API so
that any user of the library can write an extension which modifies the syntax in any way they wish -- to the point that
it may not be Markdown any more. And a number of extensions ship with the Python-Markdown library [3]. Of those (at
current count) 17 extensions, 7 of them also come under the umbrella of an 8th -- Extra. In other words, each
individual feature of PHP Markdown Extra was implemented as its own extension, then when we had all of them, a wrapping
extension (called "extra") was created as a shortcut. Some users use "extra", but others only use
"footnotes" (for example). Any number of "flavors" are possible with the various combinations of
extensions that ship with just this one library. And many of those extensions also accept user defined configuration
settings which alters that extension's behavior (see footnotes [4] for an example). Then, there is a fairly extensive
list of third party extensions [5] (which is always changing). I don't imagine that there is any sensible way to define
all those possibilities in a way that is also understandable by other markdown implementations.

[3]: https://pythonhosted.org/Markdown/extensions/index.html
[4]: https://pythonhosted.org/Markdown/extensions/footnotes.html
[5]: https://github.com/waylan/Python-Markdown/wiki/Third-Party-Extensions
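
As a rough back-of-the-envelope count (a sketch only): assume each shipped
extension can be switched on or off independently, treat "extra" as a pure
shortcut that adds no behavior of its own, and ignore configuration settings
and third-party extensions entirely. The variable names are mine.

```python
TOTAL_SHIPPED = 17   # extensions shipped with the library, per the count above
WRAPPERS = 1         # "extra" only bundles seven of the others

independent = TOTAL_SHIPPED - WRAPPERS
combinations = 2 ** independent   # every on/off choice of the independent ones

print(independent, combinations)
```

That is already 65536 possible "flavors" from one library before a single
configuration setting or third-party extension is counted.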

The great thing about Markdown is that any (decent) parser will simply pass over markup
it doesn't understand. The text will just get passed through as (mostly) plain text.
Given that one of the guiding principles behind Markdown is that it is human readable, if
a particular implementation does not support a certain extension, the reader of the
output could still understand the intended meaning and formatting (or at least "view
source" as others have mentioned). Of course, this depends on a number of factors
(overridden tokens, HTML's whitespace collapsing considerations, etc). There are
certainly many examples for which that does not hold true. But overall, I don't see that
as a large concern.

So, the point (finally) is that "flavors" seem like an impossible-to-get-right part of your proposal and really won't matter in the real world. For example, if
you send me some markdown text with a flavor of "markdown.pl", but I'm using Awk as my programming language, then I'm not going to use markdown.pl anyway. Or,
if you send me a flavor of "extra", Awk doesn't have an implementation that supports "extra" (AFAIK), so, that is useless to me as well. On the other
hand, if I'm using Python, I can account for "extra" easily. Or for "markdown.pl" (just turn off smart_emphasis [6]). But "multimarkdown"
is a different matter (I'm not exactly sure which features are supported by Multimarkdown or whether Python-Markdown's extensions implement them in the same way). And
then there's "gfm" and "pandoc" and ... so many variations to account for. I think I'll just ignore this flavor stuff and use the implementation of
*my* choice which may or may not support the flavor sent my way.

[6]: https://pythonhosted.org/Markdown/reference.html#smart_emphasis
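
In practice, I would expect a receiver to end up with something like the
sketch below. Everything in it is hypothetical: the `KNOWN_FLAVORS` table, the
`settings_for` helper, and the settings stored in it are illustrative only.
The `smart_emphasis` name is taken from the text above; nothing here checks
that any particular library version accepts it.

```python
# Flavors this (hypothetical) receiver knows how to approximate, mapped to
# settings for the one implementation it has chosen to use.
KNOWN_FLAVORS = {
    "extra": {"extensions": ["extra"]},
    "markdown.pl": {"smart_emphasis": False},
}


def settings_for(flavor):
    """Return (settings, honored) for a declared flavor.

    Unknown or missing flavors fall back to the implementation's defaults.
    """
    key = (flavor or "").strip().lower()
    if key in KNOWN_FLAVORS:
        return dict(KNOWN_FLAVORS[key]), True
    return {}, False


for declared in ("extra", "markdown.pl", "gfm", None):
    settings, honored = settings_for(declared)
    print(declared, settings, honored)
```

Anything not in the table is rendered with the defaults, which is another way
of saying the declared flavor is ignored.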

I hope that helps.

Waylan Limberg

_______________________________________________
Markdown-Discuss mailing list
Markdown-Discuss@six.pairlist.net
http://six.pairlist.net/mailman/listinfo/markdown-discuss
