Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-22 Thread Seumas Mac Uilleachan

Oh My God I actually agree with Big Bird here.

This whole discussion is getting far beyond what I think of as a 
lightweight markup system. Personally I think metadata should be 
processed separately from markdown data. Keep it unixy - one tool, one 
job.


On 09/20/2011 01:03 PM, bowerb...@aol.com wrote:

sherwood said:
   Well if your dogs are like mine,
   they will eat practically anything.
   Lately in addition to their kibble they've
   been catching pocket gophers and mice.
   A border collie is much less lovable with
   'mouse breath'

gophers and mice taste _great_ to a dog --
a dietary delicacy for many millennia now...

it's your kibble they don't really care for.
its redeeming feature is that it's so _easy_.
but i bet there are several brands of kibble
which your dogs still turn up their noses at.

as the ad man replied, when asked why his
costly campaign hadn't moved more units
of the client's dogfood: dogs hate its taste.

people are the same way.  they'll eat a _lot_
of things, including some that you consider
to taste _dreadful_ (e.g., ms-word), but that
does not mean that they will eat _anything_.

***

anyway, this conversation sounds confused...
aside from questions of philosophy, it seems
to me that there is confusion about just what
sort of metadata we're all talking about, and
how it's used, by whom, for what purposes...
and so on and so forth, and hmm baby swing.

but maybe i'm the only one confused...:+)

you all seem like bright competent fellows,
so i'm sure you'll get it all worked out, so
i'm gonna go back to my sandbox and play.

have a nice day...

-bowerbird


___
Markdown-Discuss mailing list
Markdown-Discuss@six.pairlist.net
http://six.pairlist.net/mailman/listinfo/markdown-discuss




Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-20 Thread Tao Klerks
On Tue, Sep 20, 2011 at 1:30 AM, David Sanson dsan...@gmail.com wrote:


 On Sep 19, 2011, at 4:02 PM, Rob McBroom mailingli...@skurfer.com wrote:

  Those sound like reasons for the metadata to *identify* the abstract, but
 I see no requirement that it must be literally *stored* there. If the
 metadata contained something like
 
 abstract: relative/path/to/abstract.mdown
 
  That would allow for all of the above scenarios while keeping the
 metadata syntax/section simple.

 But that makes the document far less portable, and I'm liable to lose the
 extra file at some point. I'd much rather have it be self-contained---not,
 of course, if that means that the document suddenly becomes weirdly ugly and
 complicated, but I don't see anyone proposing a solution that makes
 documents weird and ugly and complicated.


Given that the abstract is actually part of the content (it is generally
printed as part of the document, right?) it would seem more sensible to have
the meta-data refer to a section name/path within the document. We can
probably assume any markdown parser is capable of identifying the content
between a heading and its next same-or-higher-depth sibling. "Abstract"
could be the default section name, which supports the simplest case first,
as in the example Fletcher T. Penney provided above.

This way content is in the right place (in the document, and appearing where
you would expect it to with a simple abstract-unaware markdown converter),
english speakers just write their document, and others can provide the
abstract header, without needing to know anything about parsing or
serialization rules.
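
As a rough illustration of the section-lookup idea above, here is a minimal
Python sketch. It assumes ATX-style (`#`) headings only and ignores Setext
headings and code blocks. The function name `extract_section` and the default
name "Abstract" are illustrative choices, not part of any existing markdown
implementation.

```python
import re

# ATX heading: one to six '#', whitespace, the title, optional closing '#'s.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*#*\s*$")


def extract_section(markdown_text, name="Abstract"):
    """Return the text between the heading called `name` and the next
    heading of the same or a higher level, or None if there is no such heading."""
    lines = markdown_text.splitlines()
    start = None  # index of the first line after the matching heading
    level = 0     # depth of the matching heading
    for i, line in enumerate(lines):
        m = HEADING.match(line)
        if not m:
            continue
        depth = len(m.group(1))
        if start is None:
            if m.group(2).strip().lower() == name.lower():
                start, level = i + 1, depth
        elif depth <= level:
            # A sibling or higher-level heading closes the section.
            return "\n".join(lines[start:i]).strip()
    if start is None:
        return None
    # The section runs to the end of the document.
    return "\n".join(lines[start:]).strip()
```

A converter that knows nothing about abstracts would still render the section
in place, which is the point of keeping it in the body.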

I realize I'm following up on the least-important aspect of this
conversation, but I do wonder: what are genuine use cases where meta-data
really does contain structured/formattable content that should not be
considered part of the document content? It doesn't look like the abstract
is really a valid case.


Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-20 Thread John MacFarlane
+++ Tao Klerks [Sep 20 11 03:15 ]:
I realize I'm following up on the least-important aspect of this
conversation, but I do wonder: what are genuine use cases where
meta-data really does contain structured/formattable content that
should not be considered part of the document content? It doesn't look
like the abstract is really a valid case.

I think that the abstract is a fine case. Although one *could* handle
it the way you suggest, by having the metadata specify a section
of the document to use as the abstract, I don't see the advantage of
that. It is natural to distinguish between the body text, which is *always* part
of the produced document, whether a fragment or a standalone document is being
produced, and regardless of the format or template used, and the metadata,
which sometimes appear in the produced document, depending on one's purposes,
and which appear differently in different formats. Once you make this
distinction, the abstract clearly falls on the side of the metadata.

Other cases:

* bibliographic data for the document itself, which you might want
  to print in some presentations but not others
* revision history
* tags
* bibliography entries used in the document
* settings for things like default stylesheets

On the last item:  pandoc includes a powerful citation formatting
system, citeproc.  So you can use plain text citations in your
document, like this [@smith99, p. 30; @barney04], and pandoc will
format them according to a style sheet you select and include
a bibliography (if the style sheet calls for that). This is a huge
convenience, as you can write the document once, and change the
citation style (even from author-date to footnotes) by selecting
a different CSL stylesheet on the command line.
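
To make the citation syntax concrete, here is a small Python sketch that
collects the citation keys from bracketed groups such as
[@smith99, p. 30; @barney04]. It is a simplified approximation written for
illustration, not pandoc's actual parser. The regular expressions and the
function name `cited_keys` are assumptions.

```python
import re

# A bracketed group that contains at least one "@key".
CITATION_GROUP = re.compile(r"\[([^\[\]]*@[^\[\]]*)\]")
# A key is assumed here to be letters, digits, '_' or '-'.
CITATION_KEY = re.compile(r"@([A-Za-z0-9_-]+)")


def cited_keys(text):
    """Return the citation keys used in `text`, in order of first use."""
    keys = []
    for group in CITATION_GROUP.findall(text):
        for key in CITATION_KEY.findall(group):
            if key not in keys:
                keys.append(key)
    return keys
```

A tool could use such a list to decide which bibliography entries a document
actually needs.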

Currently you need to specify the bibliography database on the
command line as well (it can be bibtex, endnote, or any number
of other formats).  Ideally, though, the document itself should
specify where its bibliographical entries are coming from.
This could just be a file path, but if you want the document to
be truly portable, it would be nice to be able to include the structured
bibliography entries themselves in metadata at the end of the document.
This could be done easily with a data description language as
powerful as lua/yaml/json.

John


Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-20 Thread Tao Klerks
On Tue, Sep 20, 2011 at 9:56 AM, John MacFarlane j...@berkeley.edu wrote:


 I think that the abstract is a fine case. Although one *could* handle
 it the way you suggest, by having the metadata specify a section
 of the document to use as the abstract, I don't see the advantage of
 that. It is natural to distinguish between the body text, which is *always*
 part
 of the produced document, whether a fragment or a standalone document is
 being
 produced, and regardless of the format or template used, and the metadata,
 which sometimes appear in the produced document, depending on one's
 purposes,
 and which appear differently in different formats. Once you make this
 distinction, the abstract clearly falls on the side of the metadata.


In that case, you're talking about metadata in the more general sense - like
link definitions, footnotes, and other constructs that are currently treated
as a special case in markdown. I'm all for having a special syntax for
defining the abstract, as long as the author doesn't have to worry about any
escaping conventions and can just write it like he/she would any other
regular markdown content.



 Other cases:

 * bibliographic data for the document itself, which you might want
  to print in some presentations but not others
 * revision history
 * tags
 * bibliography entries used in the document
 * settings for things like default stylesheets


Point taken, most of these are good cases for supporting structured content,
but not formattable/markdown content, right?



 Currently you need to specify the bibliography database on the
 command line as well (it can be bibtex, endnote, or any number
 of other formats).  Ideally, though, the document itself should
 specify where its bibliographical entries are coming from.
 This could just be a file path, but if you want the document to
 be truly portable, it would be nice to be able to include the structured
 bibliography entries themselves in metadata at the end of the document.
 This could be done easily with a data description language as
 powerful as lua/yaml/json.


Absolutely - but the (possibly unattainable) ideal would be a situation
where tools and experts can specify complex structured metadata, and regular
joe can change his title, author, and other basic/simple values and lists,
specifying values that contain apostrophes, commas and other natural
punctuation, without blowing anything up in the process. As soon as he needs
to specify/modify something that contains structure (or even something
multi-line?) it seems fair that he should have to use a tool or do some
research on the standard (esp. as most if not all of the structured-data use
cases relate to tools already).

My concern with a pure-lua/yaml/json metadata format is that it requires
specialized knowledge (not related to the existing markdown
standards/experience) on the part of the user for even the most trivial
changes to the simplest fields - *especially* if structured/markdown content
such as the abstract is placed in a metadata field!


Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-20 Thread John MacFarlane
+++ Tao Klerks [Sep 20 11 10:34 ]:
On Tue, Sep 20, 2011 at 9:56 AM, John MacFarlane j...@berkeley.edu
wrote:
 
  I think that the abstract is a fine case. Although one *could*
  handle
  it the way you suggest, by having the metadata specify a section
  of the document to use as the abstract, I don't see the advantage of
  that. It is natural to distinguish between the body text, which is
  *always* part
  of the produced document, whether a fragment or a standalone
  document is being
  produced, and regardless of the format or template used, and the
  metadata,
  which sometimes appear in the produced document, depending on one's
  purposes,
  and which appear differently in different formats. Once you make
  this
  distinction, the abstract clearly falls on the side of the metadata.
 
In that case, you're talking about metadata in the more general sense -
like link definitions, footnotes, and other constructs that are
currently treated as a special case in markdown. I'm all for having a
special syntax for defining the abstract, as long as the author doesn't
have to worry about any escaping conventions and can just write it like
he/she would any other regular markdown content.

Yes, absolutely.  There are two ways to approach this while keeping
'abstract' a metadata field:

(1) There could be a special syntax for designating metadata fields
as markdown (or alternatively markdown could be the default, and there
could be a special syntax for designating them plain strings).
I showed in my original post how lunamark implements this:

  abstract = m[[
    Here's the abstract.  You can put anything you want
    here, including blank lines. No special escaping is
    needed.  It can be flush left, but I've left a small
    indent because it looks nice.

    * item 1
    * item 2
  ]]

The 'm' indicates that the content is markdown. If you left it
out, you'd have a plain string.

(2) It could just be conventional that certain fields ('abstract',
'title', etc.) are interpreted as markdown.

  Other cases:
  * bibliographic data for the document itself, which you might want
   to print in some presentations but not others
  * revision history
  * tags
  * bibliography entries used in the document
  * settings for things like default stylesheets
 
Point taken, most of these are good cases for supporting structured
content, but not formattable/markdown content, right?

Right in most cases, but one might want a free-form revision history
that is just markdown, and bibliographic entries might include
abstracts etc.

  Currently you need to specify the bibliography database on the
  command line as well (it can be bibtex, endnote, or any number
  of other formats).  Ideally, though, the document itself should
  specify where its bibliographical entries are coming from.
  This could just be a file path, but if you want the document to
  be truly portable, it would be nice to be able to include the
  structured
  bibliography entries themselves in metadata at the end of the
  document.
  This could be done easily with a data description language as
  powerful as lua/yaml/json.
 
Absolutely - but the (possibly unattainable) ideal would be a situation
where tools and experts can specify complex structured metadata, and
regular joe can change his title, author, and other basic/simple values
and lists, specifying values that contain apostrophes, commas and other
natural punctuation, without blowing anything up in the process. As soon
as he needs to specify/modify something that contains structure (or
even something multi-line?) it seems fair that he should have to use a
tool or do some research on the standard (esp. as most if not all of
the structured-data use cases relate to tools already).
My concern with a pure-lua/yaml/json metadata format is that it
requires specialized knowledge (not related to the existing markdown
standards/experience) on the part of the user for even the most trivial
changes to the simplest fields - *especially* if structured/markdown
content such as the abstract is placed in a metadata field!

I understand the concern. YAML is particularly bad this way, because you
get used to not quoting or escaping things, but then your document blows up
when you have a colon in the field. I think lua is a nice compromise--more
regular and predictable, but you don't have to quote the fields as in json,
and you have a really nice multiline string syntax that eliminates the need
for escaping.[^1] But my lua-based proposal is compatible with also having a
simpler way of specifying title, author, and date -- e.g. pandoc's, or Michael
Thompson's proposal involving centering, or MMD's (though I think the Hamlet
problem is serious).

[^1]:  What if your abstract contains `]]`, you might ask?

re: Metadata syntax (was Universal syntax for Markdown)

2011-09-20 Thread Bowerbird
sherwood said:
Well if your dogs are like mine, 
they will eat practically anything. 
Lately in addition to their kibble they've 
been catching pocket gophers and mice. 
A border collie is much less lovable with 
'mouse breath' 

gophers and mice taste _great_ to a dog --
a dietary delicacy for many millennia now...

it's your kibble they don't really care for.
its redeeming feature is that it's so _easy_.
but i bet there are several brands of kibble
which your dogs still turn up their noses at.

as the ad man replied, when asked why his
costly campaign hadn't moved more units
of the client's dogfood: dogs hate its taste.

people are the same way.   they'll eat a _lot_
of things, including some that you consider
to taste _dreadful_ (e.g., ms-word), but that
does not mean that they will eat _anything_.

***

anyway, this conversation sounds confused...
aside from questions of philosophy, it seems
to me that there is confusion about just what
sort of metadata we're all talking about, and
how it's used, by whom, for what purposes...
and so on and so forth, and hmm baby swing.

but maybe i'm the only one confused... :+)

you all seem like bright competent fellows,
so i'm sure you'll get it all worked out, so
i'm gonna go back to my sandbox and play.

have a nice day...

-bowerbird


Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread David Sanson
The reStructuredText field list syntax might be a reasonable compromise.

I like the fact that the metadata can occur anywhere in the document.

The [RST spec][] itself says that the leading colon can be dropped in
well-defined contexts such as when a field list invariably occurs at
the beginning of a document (PEPs and email messages). As John's
example from Hamlet shows, this isn't the case for markdown documents.
But it would be possible to grandfather in existing MMD documents by
insisting, as MMD does, that first lines that contain a colon are
unambiguously metadata, with apologies to Hamlet. Or one could
introduce a delimiter, like `--`, and say that lines before that
delimiter are unambiguously metadata, and so don't require the leading
colon.

Frankly, I don't think it would be a terrible thing if
implementations disagreed on this one detail---when is a leading colon
required?---but agreed on everything else. Those who wish to trade
flexibility for beauty could leave off the leading colon; those who
value inter-operability over beauty could leave it on.
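
The trade-off can be sketched in a few lines of Python. This is a minimal
approximation of a single-line field-list entry, not the full reStructuredText
grammar. The function `parse_field_line` and its `require_leading_colon` flag
are hypothetical names introduced for illustration.

```python
import re

# ":name: value" with the leading colon required.
WITH_COLON = re.compile(r"^:([^:\s][^:]*):(?:\s+(.*))?$")
# "name: value" or ":name: value", leading colon optional.
OPTIONAL_COLON = re.compile(r"^:?([^:\s][^:]*):(?:\s+(.*))?$")


def parse_field_line(line, require_leading_colon=True):
    """Return (name, value) if `line` looks like a field, else None."""
    pattern = WITH_COLON if require_leading_colon else OPTIONAL_COLON
    m = pattern.match(line)
    if m is None:
        return None
    return m.group(1).strip(), (m.group(2) or "").strip()
```

With the leading colon required, `author: David` is left alone as ordinary
text. With it optional, a prose line such as `Note: this is ordinary prose.`
is captured as a field, which is the ambiguity discussed above.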

David

[RST spec]: 
http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#field-lists


Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread John MacFarlane
+++ David Chambers [Sep 18 11 15:08 ]:
If we want to avoid defining our
own serialization format, we have two options: we can adopt an existing
format (such as JSON or YAML), or we can hand off the responsibility to
application developers.

Yes, I agree, and I certainly agree that we shouldn't go down the path
of reinventing YAML. My proposal was to use lua as a data description
language, as it is more texty than json, less quirky than YAML, and
more flexible than either.  But I don't really expect to get consensus
on that.

It seems to me that there are three levels at which we might hope
to achieve consensus about metadata in markdown:

1.  Agreement about which bits of the document are metadata, so
these won't be processed as part of the document's text.

2.  Agreement about a key-value format, so that all implementations
can extract metadata into key/value pairs, with literal string
values, in the same way.

3.  Agreement about how the values are to be parsed into structured
data, which bits are to be parsed as markdown, etc.

Consensus on 1 would be useful, because it would prevent your metadata
from turning into displayed garbage when processed with another markdown
implementation.

My own proposal on 1 was to put metadata inside specially marked HTML
comments.  An advantage is that there is *already* agreement among
implementations not to make this part of the displayed document, so
no agreement is needed. In effect, my proposal already achieves
consensus on 1.

Another possibility would be to put metadata inside '---' and '---'.
This would solve two problems with MMD metadata:  it would allow it
to occur anywhere in the document, and it would avoid unwanted results
when you happen to have a colon in the first line of your text.

As for 2, a minimal modification from MMD style metadata would be
to allow blank lines in fields, by requiring continuation lines
to be indented four spaces.

---
field1:  Value one.
    Continued here.

    Another paragraph.

field2:  Next field.
---

This would work best if we had something like the '---' '---'
delimiters, since otherwise you have even more opportunities for
unwanted captures (a blank line doesn't end metadata).
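
A minimal Python sketch of levels 1 and 2 under this proposal follows. It
assumes the block is delimited by lines consisting only of '---', that a field
starts at the left margin with `key:`, and that continuation lines are
indented four spaces. The function name `parse_metadata_block` is
hypothetical, and values are kept as literal strings (level 3 is left out).

```python
def parse_metadata_block(text):
    """Parse the first '---' ... '---' block into a dict of literal strings."""
    lines = text.splitlines()
    fences = [i for i, line in enumerate(lines) if line.strip() == "---"]
    if len(fences) < 2:
        return {}  # no complete metadata block
    start, end = fences[0], fences[1]

    fields = {}
    current = None  # key of the field currently being continued
    for line in lines[start + 1:end]:
        if not line.strip():
            # A blank line does not end the field; keep it as a paragraph break.
            if current is not None:
                fields[current].append("")
        elif line.startswith("    "):
            # Four-space indent: continuation of the current field.
            if current is not None:
                fields[current].append(line[4:])
        elif ":" in line and not line[0].isspace():
            key, _, value = line.partition(":")
            current = key.strip()
            fields[current] = [value.strip()]
        # Anything else is ignored in this sketch.
    return {key: "\n".join(parts).strip() for key, parts in fields.items()}
```

Because only text between the delimiters is examined, a first body line that
happens to contain a colon is never captured.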

John



Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread Fletcher Penney
Side note - the actual Hamlet line has a colon at the end of the line.
So it would be fine in MMD.

;)

F

Sent from my iPhone

On Sep 19, 2011, at 10:30 AM, David Sanson dsan...@gmail.com wrote:

 The reStructuredText field list syntax might be a reasonable compromise.

 I like the fact that the metadata can occur anywhere in the document.

 The [RST spec][] itself says that the leading colon can be dropped in
 well-defined contexts such as when a field list invariably occurs at
 the beginning of a document (PEPs and email messages). As John's
 example from Hamlet shows, this isn't the case for markdown documents.
 But it would be possible to grandfather in existing MMD documents by
 insisting, as MMD does, that first lines that contain a colon are
 unambiguously metadata, with apologies to Hamlet. Or one could
 introduce a delimiter, like `--`, and say that lines before that
 delimiter are unambiguously metadata, and so don't require the leading
 colon.

 Frankly, I don't think it would be a terrible thing if
 implementations disagreed on this one detail---when is a leading colon
 required?---but agreed on everything else. Those who wish to trade
 flexibility for beauty could leave off the leading colon; those who
 value inter-operability over beauty could leave it on.

 David

 [RST spec]: 
 http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#field-lists


Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread David Sanson
On Mon, Sep 19, 2011 at 11:31 AM, Fletcher Penney
fletc...@fletcherpenney.net wrote:
 Side note - the actual Hamlet line has a colon at the end of the line.
 So it would be fine in MMD.

Clever use of a dash there ;)


Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread Fletcher T. Penney
I wonder how often multi-paragraph metadata comes up in real world use?  
Something like an abstract, IMO, shouldn't be metadata - it should be part of 
the document.  How to *display* that section graphically (e.g. smaller font?, 
narrower width, etc.) is a problem for CSS/LaTeX/ODF/whatever - *not* for 
Markdown itself.

I published a book via Lulu (a friend's PhD thesis) using MMD - it had an 
abstract, dedication, acknowledgements, preface, ToC, lists of figures, etc.  
Each of which was formatted appropriately, without using metadata for any of 
it.  I customized an XSLT to generate the LaTeX I desired, and the result was 
fantastic.

The only problem I have run into with MMD's metadata is that it would be nice 
to support markup inside some fields but not all, and that has rarely been a 
problem for me.  This was easily remedied in MMD 2, but trickier in MMD 3.


I think the best shot at consensus is a basic syntax for general metadata 
(obviously I'm partial to MMD ;) that covers 90% of what people do.  Then a 
more complicated shared syntax for those variants that want to support the 
kitchen sink approach to metadata.


Not to repeat myself, but I again think we're approaching this from the wrong 
end.  If there's going to be a consensus, I think it's going to have to start 
with a shared philosophy for the standards.  Each variant may end up with its 
own philosophy outside of that, but there has to be a common vision for the 
purpose of the standard.  

Until that happens, I don't think we'll get anywhere trying to sort out 
specific implementations for specific features - we don't have a shared 
understanding of the problem we're trying to solve.


F-


On Sep 19, 2011, at 11:12 AM, John MacFarlane wrote:

 +++ David Chambers [Sep 18 11 15:08 ]:
   If we want to avoid defining our
   own serialization format, we have two options: we can adopt an existing
   format (such as JSON or YAML), or we can hand off the responsibility to
   application developers.
 
 Yes, I agree, and I certainly agree that we shouldn't go down the path
 of reinventing YAML. My proposal was to use lua as a data description
 language, as it is more texty than json, less quirky than YAML, and
 more flexible than either.  But I don't really expect to get consensus
 on that.
 
 It seems to me that there are three levels at which we might hope
 to achieve consensus about metadata in markdown:
 
 1.  Agreement about which bits of the document are metadata, so
these won't be processed as part of the document's text.
 
 2.  Agreement about a key-value format, so that all implementations
can extract metadata into key/value pairs, with literal string
values, in the same way.
 
 3.  Agreement about how the values are to be parsed into structured
data, which bits are to be parsed as markdown, etc.
 
 Consensus on 1 would be useful, because it would prevent your metadata
 from turning into displayed garbage when processed with another markdown
 implementation.
 
 My own proposal on 1 was to put metadata inside specially marked HTML
 comments.  An advantage is that there is *already* agreement among
 implementations not to make this part of the displayed document, so
 no agreement is needed. In effect, my proposal already achieves
 consensus on 1.
 
 Another possibility would be to put metadata inside '---' and '---'.
 This would solve two problems with MMD metadata:  it would allow it
 to occur anywhere in the document, and it would avoid unwanted results
 when you happen to have a colon in the first line of your text.
 
 As for 2, a minimal modification from MMD style metadata would be
 to allow blank lines in fields, by requiring continuation lines
 to be indented four spaces.
 
 ---
 field1:  Value one.
     Continued here.
 
     Another paragraph.
 
 field2:  Next field.
 ---
 
 This would work best if we had something like the '---' '---'
 delimiters, since otherwise you have even more opportunities for
 unwanted captures (a blank line doesn't end metadata).
 
 John
 

--  
Fletcher T. Penney
fletc...@fletcherpenney.net






Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread Fletcher T. Penney

On Sep 19, 2011, at 3:28 PM, John MacFarlane wrote:

 I can think of many reasons for putting an abstract into metadata.
 The treatment of the abstract (like that of author and title) varies
 quite a bit depending on the output format.  In LaTeX, it goes in
 a special environment; in HTML, it may go in a special DIV; for some
 purposes, you may want to omit it entirely and just store it for
 bibliographic purposes.  If the markdown processor pulls it out
 as metadata, then a templating system can put it where it needs to
 go in the final document.
 
 Now of course, you can always postprocess the output of your markdown
 processor, locate the abstract, and mess around with the result. But that's
 uglier and much harder for end users than the approach above, which lunamark
 takes. Users are more likely to be able to modify a default template
 than write their own XSLT transformations.
 
 John


I think it is somewhat of an academic debate whether it is better for a 
templating system to look for an abstract in the metadata, or to check the 
first h1 to see if it is called Abstract.  I guarantee the computer doesn't 
care where it's located...  I personally think that the raw markdown document 
looks much better if the abstract is part of the text than if it is part of a 
complicated metadata markup scheme.  

In any case, my primary point stands.  For any consensus to come about, I think 
we need to agree on the fundamental purpose and philosophy of the consensus we 
claim to be interested in. Otherwise many of these discussions will continue to 
occur without much hope of moving forward to any actual outcome/resolution.  
Then it's just a bunch of us sitting around explaining why we each think our 
own dog food is the best.


But in the meantime, MMD will continue to march forward --- platform 
independent processor code (Mac, Windows, *nix, and presumably iOS/Android??) 
without any external requirements that is lightning fast (Thanks to John's 
fabulous peg-markdown as a starting point, and Daniel's work on getting rid of 
the glib2 requirement), a Mac OS X text editor with built-in MMD syntax 
highlighting, exporting, and editing that I hope to release in the next month 
or two, and I hope to put together a proof of concept native MMD parser for iOS 
(built using the same code as the desktop version) if no one else out there 
beats me to it (which would be a welcome turn of events!)


F-

--  
Fletcher T. Penney
fletc...@fletcherpenney.net






Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread Rob McBroom
On Sep 19, 2011, at 3:28 PM, John MacFarlane wrote:

 I can think of many reasons for putting an abstract into metadata.
 The treatment of the abstract (like that of author and title) varies
 quite a bit depending on the output format.  In LaTeX, it goes in
 a special environment; in HTML, it may go in a special DIV; for some
 purposes, you may want to omit it entirely and just store it for
 bibliographic purposes.  If the markdown processor pulls it out
 as metadata, then a templating system can put it where it needs to
 go in the final document.

Those sound like reasons for the metadata to *identify* the abstract, but I see 
no requirement that it must be literally *stored* there. If the metadata 
contained something like

abstract: relative/path/to/abstract.mdown

That would allow for all of the above scenarios while keeping the metadata 
syntax/section simple.
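
A minimal Python sketch of how a processor might follow such a reference,
assuming the metadata has already been parsed into a dict and that the path is
resolved relative to the document's own directory. The function name
`load_abstract` and the `abstract` key handling are illustrative assumptions.

```python
from pathlib import Path


def load_abstract(document_path, metadata):
    """Return the text of the file named by metadata['abstract'], or None
    if the metadata has no 'abstract' entry."""
    value = metadata.get("abstract")
    if value is None:
        return None
    # Resolve the reference relative to the directory containing the document.
    abstract_path = Path(document_path).parent / value
    # Raises FileNotFoundError if the referenced file has gone missing.
    return abstract_path.read_text(encoding="utf-8")
```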

(Obviously, I lean toward Fletcher’s philosophy #2 on this.)

-- 
Rob McBroom
http://www.skurfer.com/



re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread Bowerbird
fletcher said:
For any consensus to come about, 
I think we need to agree on the 
fundamental purpose and philosophy of 
the consensus we claim to be interested in. 

it would be nice.


Otherwise many of these discussions will 
continue to occur without much hope of 
moving forward to any actual outcome/resolution. 

yep.


it's just a bunch of us sitting around 
explaining why we each think 
our own dog food is the best. 

uh-huh.   and what's really ironic is that you're
not even polling the general-public dogs you
hope will eat the food that you're putting out...

y'all seem to believe that they'll eat _anything_.


But in the meantime, MMD will continue to 
march forward

great to hear it!


a Mac OS X text editor with built-in MMD 
   syntax highlighting, exporting, and editing 
that I hope to release in the next month or two

wha...   the next month or two?   what's the hold-up?


and I hope to put together a proof of concept 
native MMD parser for iOS (built using the same code 
as the desktop version) if no one else out there 
beats me to it (which would be a welcome turn of events!)

i don't understand why a parser has to be so hard...

-bowerbird


Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread Sam Angove
On Mon, Sep 19, 2011 at 3:47 AM, John MacFarlane j...@berkeley.edu wrote:
 Another major problem, in my view, is that if a document starts
 with a phrase followed by a colon, it gets swallowed into metadata:
 [...]
 Also, because this is recognizable as metadata wherever it occurs
 in the document, one could then drop the requirement that the
 metadata occur at the top of the document, which I think is
 undesirable.

Another alternative is to re-use the syntax that Markdown already has
for document-level metadata:

   [1]: http://example.com/
   [^f1]: A footnote here

Perhaps:

   [title]:    Here is the title.
   [abstract]: The abstract here.

       As with footnotes, lists etc., indented lines continue the block.
   [author]:   John


Not quite as natural as the unbracketed version, but more consistent
with Markdown conventions and less likely to cause unpleasant
surprises. (The obvious risk is the potential for collision with
reference links, but I think it is quite minor, and could be minimized
by special-casing metadata at the beginning of a document.)

From a syntax perspective, the idea would be that reference link
definitions, footnotes, MMD-format references etc. are all removed as
metadata. Keys starting with ^ are treated as footnotes, values
matching the URI/title form may be re-inserted as reference links,
etc.
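
A rough Python sketch of that idea, under stated assumptions: continuation lines are indented by four spaces, and a value counts as a reference link when it starts with a URI scheme. The names `extract_definitions` and `classify` are made up for illustration and do not come from any existing implementation:

```python
import re

# A definition line: up to three spaces, "[label]:", then the value.
DEFINITION = re.compile(r"^ {0,3}\[([^\]]+)\]:[ \t]*(.*)$")


def extract_definitions(text):
    """Return (definitions, body_lines).

    Blank lines and lines indented by four spaces that follow a
    definition are treated as a continuation of its value.
    """
    definitions = {}
    body = []
    current = None
    for line in text.splitlines():
        match = DEFINITION.match(line)
        if match:
            current = match.group(1)
            definitions[current] = [match.group(2)]
        elif current is not None and (line.startswith("    ") or not line.strip()):
            definitions[current].append(line.strip())
        else:
            current = None
            body.append(line)
    return {k: "\n".join(v).strip() for k, v in definitions.items()}, body


def classify(label, value):
    """Decide what a removed definition should be re-inserted as."""
    if label.startswith("^"):
        return "footnote"
    if re.match(r"^<?[A-Za-z][A-Za-z0-9+.-]*:\S+>?(\s+.*)?$", value):
        return "reference link"
    return "metadata"
```

With this split, `[1]: http://example.com/` is classified as a reference link, `[^f1]: ...` as a footnote, and `[title]: ...` as metadata.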


Sam


Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread John MacFarlane
+++ Sam Angove [Sep 20 11 10:33 ]:
 On Mon, Sep 19, 2011 at 3:47 AM, John MacFarlane j...@berkeley.edu wrote:
  Another major problem, in my view, is that if a document starts
  with a phrase followed by a colon, it gets swallowed into metadata:
  [...]
  Also, because this is recognizable as metadata wherever it occurs
  in the document, one could then drop the requirement that the
  metadata occur at the top of the document, which I think is
  undesirable.
 
 Another alternative is to re-use the syntax that Markdown already has
 for document-level metadata:
 
[1]: http://example.com/
[^f1]: A footnote here
 
 Perhaps:
 
[title]:    Here is the title.
[abstract]: The abstract here.
 
As with footnotes, lists etc., indented lines continue the block.
[author]:   John
 
 
 Not quite as natural as the unbracketed version, but more consistent
 with Markdown conventions and less likely to cause unpleasant
 surprises. (The obvious risk is the potential for collision with
 reference links, but I think it is quite minor, and could be minimized
 by special-casing metadata at the beginning of a document.)
 
 From a syntax perspective, the idea would be that reference link
 definitions, footnotes, MMD-format references etc. are all removed as
 metadata. Keys starting with ^ are treated as footnotes, values
 matching the URI/title form may be re-inserted as reference links,
 etc.

I think this is a very nice idea.  Authors would have to be careful
not to use the same label for a reference link and a piece of
metadata, but I don't see that being a big problem.

If people didn't like the brackets, then I think the next best
idea would be to require a delimiter of some kind, but keep
the capacity for multiple paragraphs as with footnotes:

---
title:    Here is the title.
author:   John
abstract: The abstract here.

    As with footnotes, lists etc., indented lines continue the block.
---
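
A possible reading of the delimited form, sketched in Python. This assumes the block must open on the first line, that field names start in column one, and that anything else inside the delimiters continues the previous field; `split_delimited_metadata` is a hypothetical name:

```python
import re

# A field starts in column one: "name: value".
FIELD = re.compile(r"^([A-Za-z][\w -]*):[ \t]*(.*)$")


def split_delimited_metadata(text, fence="---"):
    """Return (metadata, body) for text that may open with a fenced block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != fence:
        return {}, text
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == fence), None)
    if end is None:
        # No closing delimiter: treat the whole text as body.
        return {}, text
    fields = {}
    key = None
    for line in lines[1:end]:
        match = FIELD.match(line)
        if match:
            key = match.group(1).strip().lower()
            fields[key] = [match.group(2)]
        elif key is not None:
            # Blank or indented lines continue the current field.
            fields[key].append(line.strip())
    metadata = {k: "\n".join(v).strip() for k, v in fields.items()}
    return metadata, "\n".join(lines[end + 1:])
```

Because nothing is read as metadata unless the first line is the delimiter, a document that merely starts with "To be or not to be: ..." comes back untouched.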

John



Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread Richard Caldwell

 From: Fletcher T. Penney fletc...@fletcherpenney.net
 
 I think the idea of metadata boils down to three perspectives:
 
 1) I don't want it/need it/care about it  --- get rid of it
 
 2) I want something easy to write, easy to read, and fits with the
 Markdown philosophy of as little markup as possible to accomplish
 the job ---even if not quite as powerful (e.g. MultiMarkdown)
 
 3) I want something powerful/flexible, even if it looks like computer
 code at the top of my document (e.g. lunamark)
 
 
 Before there can be a unified standard, there has to be a unified
 philosophy (just like the rest of the standards debate on the
 list).

After some initial excitement that it might be possible to brew up a standard 
for Markdown extensions I have become disheartened.  Metadata is one of the 
most commonly implemented extensions for Markdown.  If we cannot agree that 
including metadata is important and that any standards should adhere to the 
fundamental philosophy of Markdown, then there is little hope for consensus.

I suppose having Gruber as the absentee landlord of Markdown is better than 
turning Markdown into something completely different than what has worked so 
well for so many.  Many of the proposals that I'm seeing try to solve problems 
that go far beyond the scope of what Markdown is or should ever be.

Here is what I believe to be the appropriate solution for Markdown metadata.

* Metadata is specified at the top of the document similar to RFC822 headers.  
The keys and values may be arbitrary.
  Multiple lines may be folded as in RFC822.

* Metadata lines may be enclosed in an HTML comment to hide metadata if 
original Markdown is used.

* Metadata is omitted from the output except that:
   
  * Keys matching standard header elements are included as the appropriate 
header element.
  * Keys that do not match standard header elements are included as standard 
HTML meta element.

* If metadata is present in the file then full HTML files are generated by 
default.  This could be suppressed by a switch.

* Everything else is left up to the extension or whatever is processing the 
HTML.

For example:

Title: Markdown for Dummies
Tags: markdown, text, markup

<head>
<title>Markdown for Dummies</title>
<meta name="tags" content="markdown, text, markup" />
</head>
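
One way to prototype this proposal is with Python's standard `email.parser.HeaderParser`, which already understands RFC822-style headers and folded continuation lines. The sketch below is only an illustration: `metadata_to_head` is a hypothetical name, and it treats `title` as the only "standard header element", sending every other key to a `meta` element:

```python
from email.parser import HeaderParser
from html import escape


def metadata_to_head(source):
    """Return (head_html, body_text) for a document with RFC822-style headers."""
    message = HeaderParser().parsestr(source)
    parts = ["<head>"]
    for key, value in message.items():
        # Unfold continuation lines into one space-separated value.
        value = " ".join(value.split())
        if key.lower() == "title":
            parts.append("<title>%s</title>" % escape(value))
        else:
            parts.append('<meta name="%s" content="%s" />'
                         % (escape(key.lower()), escape(value)))
    parts.append("</head>")
    # Everything after the first blank line is the Markdown body.
    return "\n".join(parts), message.get_payload()
```

The body returned here would still need to go through a Markdown processor; only the header handling is shown.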


Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread David Sanson
On Mon, Sep 19, 2011 at 2:34 PM, Fletcher T. Penney
fletc...@fletcherpenney.net wrote:
 Not to repeat myself, but I again think we're approaching this from the wrong 
 end.  If there's going to be a consensus, I think it's going to have to start 
 with a shared philosophy for the standards.  Each variant may end up with 
 its own philosophy outside of that, but there has to be a common vision for 
 the purpose of the standard.

It seems to me that many visions have been expressed here. I'm not
sure what more can be done to generate consensus. But I'm happy to try
to express my own. And, for what it is worth, bowerbird, this is the
vision of a user, not a developer :-)

I have two visions, and I think they are compatible. One is an agreed
upon text-y format for title, author, and date. The other is an agreed
upon text-y-as-possible-but-no-doubt-more code-y format for arbitrary
metadata. If I were going to push for consensus on one of these rather
than the other, it would be the second, but I'd like to see both, and
I think, as I've suggested before, that some of the reasons for
resisting code-y metadata (elegance and aesthetics, not assuming that
all documents are in English) are better thought of as reasons for
developing a text-y format for title, author, and date.

As for the code-y metadata, I think it is a mistake to think that we
can imagine ahead of time all the ways this arbitrary metadata might
be used, so I'd like it to be as flexible and powerful as possible.
I've already mentioned one vision---the ability to embed bibliographic
data in academic papers---but that's just something I think about
because I am an academic who often uses markdown to write papers with
lots of citations. Markdown is used in so many ways by so many
different people---bloggers writing posts, academics writing research
papers, scriveners writing novels, developers writing readme's. I
say: make it as powerful as feasible and let the users discover new
uses.

There has been some discussion of whether or not there is any real
need for multi-paragraph metadata, focusing on the example of
abstracts. I currently use Jekyll for my website. By far the easiest
way to generate a blurb for a given page---the sort of thing that on
a blog gets shown before the fold---is to toss it into a metadata
field and adjust Jekyll's templates to use the content of the blurb.
There are no doubt other ways to do this---filters and scripts and
pre- or post-processors. But that doesn't take away from the fact that
using metadata is one very easy way to do this. So multi-paragraph
metadata is something I use regularly in this context.

There has been some discussion of whether or not markdown
implementations should be responsible for parsing this code-y
metadata. I suppose it is part of my vision that markdown
implementations do parse this code, and pass it along as appropriate
to templates and the like. But John's first point of possible
consensus,

1. Agreement about which bits of the document are metadata, so
   these won't be processed as part of the document's text.

would be of great value on its own. I've spent time converting
documents from Scrivener or Mellel to MMD, and then to Pandoc's
extended markdown. A MMD document with lots of metadata---even with
hard line breaks---is, when used with other processors, a markdown
file with a bunch of junk at the top that has to be trimmed away.
Likewise, I've written documents using Pandoc's title-author-date
blocks, and then needed to use those documents with other processors,
and that stuff at the top was just so much junk that had to be trimmed
away. So if everyone could just agree on what to ignore, that would be
a serious improvement.

But if markdown implementations are not themselves going to be
responsible for parsing the code-y metadata, I would strongly prefer
that the metadata be in a format that has existing wide support. I
doubt that some decree by the markdown community will have the power
to move all the developers who have developed all the various tools
that use markdown and rely on metadata. And I think the whole thing is
likely to be a nonstarter if it requires that these developers all
write parsers for some newfangled format. Even if markdown
implementations are going to handle the parsing, I guess someone is
also going to need to write tools for translating existing data
formats into the new format---unless we are assuming that nobody would
ever want to use existing data as metadata in a markdown document?

So if there were a standard out there for human writable/machine
readable plaintext data that shares the values of markdown, I would
think it made more sense to use that, and let the markdown community
focus their intellectual energy on markdown. I had naively mentioned
YAML in an earlier post just because among us naive users, it has the
reputation of being such a standard. But I really don't know anything
about plaintext data formats, and have no special affection for 

Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-19 Thread Sherwood Botsford
Well if your dogs are like mine, they will eat practically anything.  Lately
in addition to their kibble they've been catching pocket gophers and mice.
A border collie is much less lovable with 'mouse breath'


Respectfully,

Sherwood of Sherwood's Forests

Sherwood Botsford
Sherwood's Forests --  http://Sherwoods-Forests.com
780-848-2548
50042 Range Rd 31
Warburg, Alberta T0C 2T0




On Mon, Sep 19, 2011 at 4:39 PM, bowerb...@aol.com wrote:

 fletcher said:
For any consensus to come about,
I think we need to agree on the
fundamental purpose and philosophy of
the consensus we claim to be interested in.

 it would be nice.


Otherwise many of these discussions will
continue to occur without much hope of
moving forward to any actual outcome/resolution.

 yep.


it's just a bunch of us sitting around
explaining why we each think
our own dog food is the best.

 uh-huh.  and what's really ironic is that you're
 not even polling the general-public dogs you
 hope will eat the food that you're putting out...

 y'all seem to believe that they'll eat _anything_.


But in the meantime, MMD will continue to
march forward

 great to hear it!


a Mac OS X text editor with built-in MMD
   syntax highlighting, exporting, and editing
that I hope to release in the next month or two

 wha...  the next month or two?  what's the hold-up?


and I hope to put together a proof of concept
native MMD parser for iOS (built using the same code
as the desktop version) if no one else out there
beats me to it (which would be a welcome turn of events!)

 i don't understand why a parser has to be so hard...

 -bowerbird



Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-18 Thread John MacFarlane
+++ David Sanson [Aug 17 11 23:09 ]:
 First time posting here as well. I've been watching this discussion
 with interest. As a user of (extended) markdown, I have long hoped for
 a unified standard for (most or all) markdown extensions and a unified
 handling of metadata.

Thanks for the thoughtful post.  A few comments below, describing
a metadata experiment I've done in implementing lunamark.

 It seems to me that one of the issues that arises when we start
 thinking about metadata is that there really are two different kinds
 of metadata: some metadata (title, author, date) is---at least in many
 cases---also part of the *content* of the document. This is the kind
 of metadata for which I feel the force of the demand for an elegant
 plaintext solution. For some bold suggestions in this direction, see
 this [old post by Michael Thompson][1] to the pandoc-discuss list.
 Here is one of his examples from that post:
 
  A Good Man Is Hard To Find
 
Flannery O'Connor
   Spring 1952
 
 
 The grandmother didn't want to go to Florida. She wanted to visit
 some of her connections in east Tennessee and she was seizing at
 every chance to change Bailey's mind.
 
 Isn't that so much *prettier* than any of the options currently in
 play? Email someone a document like that, and they will know exactly
 what you mean, and see no distracting markup. No doubt this presents
 challenges when it comes to parsing, and I have no idea whether or not
 those challenges are surmountable. Clearly some rules would have to be
 laid down (Does it have to be centered? Indented? Can I underline the
 title ala setext? Do I have to have two blank lines after the date?
 Can I leave the date out? etc.) And it raises issues for backwards
 compatibility too. But I think it's worth having in view a solution
 that achieves a certain degree of perfection along this one dimension.
 
 But then there is the other kind of metadata. Tags, keywords,
 baseurls, paths to associated files, directives for webpage templating
 software, and so on and so on. This sort of stuff is definitely not
 content. It is a bunch of data that I want to associate with the file
 for some reason or other. It needs to be indefinitely extensible. It
 is frequently tied directly to some specific output format or context.
 In other contexts, it probably just needs to be ignored. Blosxom taught
 us that it should all be at the top of the document (and successors,
 like Jekyll, follow this tradition), but much of it is ugly enough
 that it could just as well be banished to the bottom of the document,
 where nobody but the author would ever have to look at it.
 
 When it comes to this sort of metadata, I don't see any reason to look
 for something elegant, language-independent, and plaintext-y. This is
 where it feels like I just want a way of embedding a block of data
 within a markdown file, knowing that it won't be treated as content
 (and, depending on my processor and the context, knowing that it may
 be sucked up and used in various ways). It is here that I agree with
 the sentiment that metadata shouldn't be part of the markdown spec,
 *but* I think markdown should be smart enough to ignore the metadata,
 so that I don't have to strip it out before feeding the document to a
 markdown processor.

One way to achieve this is to put metadata inside specially
marked HTML comments.  Then existing markdown parsers will all
ignore it (at any rate, it won't display).

That's what I did in lunamark's experimental 'lua_metadata' feature.
Here's an example:

<!--@
catalog_number   = "23423423A"
category         = "fish"
tags             = { "Arctic", "fish", "char" }
bib              = { title     = "Fishing for Arctic char",
                     author    = "Samuel Smith",
                     publisher = "Alaska Press",
                     year      = 2008 }
-->

Inside the comment we just have lua declarations (they're processed
in a sandbox, so metadata can't do anything nasty).  This makes the
metadata slightly less textual looking, but it gives you the ability
to have metadata of various types: string, number, array, key-value
table. And it's actually pretty readable -- note that lua's table
syntax was itself influenced by bibtex's format.
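
For a processor that does not evaluate Lua at all, the comment wrapper still makes the metadata easy to set aside. A small Python sketch, assuming the marker is an HTML comment that opens with `<!--@` and closes with `-->`; the function name is hypothetical and the Lua text is returned unevaluated:

```python
import re

# Non-greedy match so that several metadata comments can coexist.
COMMENT = re.compile(r"<!--@(.*?)-->", re.DOTALL)


def split_comment_metadata(text):
    """Return (raw_metadata_blocks, text_without_those_comments)."""
    blocks = [block.strip() for block in COMMENT.findall(text)]
    body = COMMENT.sub("", text)
    return blocks, body
```

Since the search covers the whole text, the comment can sit at the top or the bottom of the document.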

One thing that needs to be considered in a metadata format is that
some metadata entries need to be parsed as markdown, while others
should remain literal (suppose you have a product number with
lots of '*' and '[' in it).  I handle this by providing a function
'markdown' or 'm' that you can use:

<!--@
title  = m"Reading *Hamlet*",
author = m"[Sally Cho](http://sallycho.net)"
-->

It doesn't matter whether you write

markdown("foo")
m("foo")
markdown "foo"
m"foo"
etc.

They all work.

It would be possible to define other functions as well, even
ones that do IO, and expose them individually without giving access
to general IO functions. So 

Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-18 Thread Fletcher T. Penney
I think the idea of metadata boils down to three perspectives:


1) I don't want it/need it/care about it  --- get rid of it

2) I want something easy to write, easy to read, and fits with the Markdown 
philosophy of as little markup as possible to accomplish the job ---even if not 
quite as powerful (e.g. MultiMarkdown)

3) I want something powerful/flexible, even if it looks like computer code at 
the top of my document (e.g. lunamark)


Before there can be a unified standard, there has to be a unified philosophy 
(just like the rest of the standards debate on the list).


My philosophy, and therefore that of MMD, is #2 above.  I obviously have more 
markup than plain Markdown, but I feel that my feature to markup ratio is as 
good or better than Markdown (obviously a personal opinion, not a fact).  The 
metadata functionality is pretty powerful, fails gracefully when run through 
plain markdown (if you remember the extra two spaces at the end of lines), but 
does have some limitations.

I reiterate one of my previous posts - if we want to have any sort of consensus 
for the Markdown derivatives, the first step is agreeing on a philosophy for 
those standards.  Individual variants can still have their own features, but we 
would need agreement on the core.

F-


On Sep 18, 2011, at 11:53 AM, John MacFarlane wrote:

 +++ David Sanson [Aug 17 11 23:09 ]:

snipped for brevity - please see original posts

--  
Fletcher T. Penney
fletc...@fletcherpenney.net






Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-18 Thread John MacFarlane
Yes, the key question is: what's the right balance of flexibility vs.
textiness?

To my mind, multimarkdown comments just aren't flexible enough:

* There's no way to have multiline metadata fields that contain
  blank lines, e.g. an abstract with two paragraphs.

* There's no provision for structured data (e.g. key/value
  tables or lists), or for boolean or numerical fields.

* Metadata fields are interpreted as raw strings, not markdown.
  That's sometimes what you want, but not always.  Titles
  often contain emphasis and other formatting, for example,
  and sometimes even footnotes (for acknowledgements).  If
  these are just going into an html meta field, it doesn't much
  matter, but if you're using the metadata fields in templates,
  it does.  (And sure, you could always run a raw string through
  your markdown processor again, before passing it to the template engine,
  but that creates problems for things like reference links and
  footnotes.)

Another major problem, in my view, is that if a document starts
with a phrase followed by a colon, it gets swallowed into metadata:

% multimarkdown
To be or not to be: that is the question.
^D
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="tobeornottobe" content="that is the question."/>
</head>
<body>

</body>
</html>

That's not what most authors would expect!

For this reason, I would favor something more like reStructuredText
field lists, which marks the fields explicitly as fields:

:title:    Here is the title.
:author:   John
:abstract: The abstract here.
  It can span multiple lines.

  As long as the indentation is maintained.

This is not part of the metadata.

This is slightly less texty because of the leading colon, but less likely to
capture regular text.
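
A Python sketch of how such explicitly marked fields could be pulled out of a document. It is a simplification for illustration, not reStructuredText's actual field-list grammar, and `extract_field_list` is a hypothetical name. A field ends at the first non-blank line that is not indented:

```python
import re

# ":name: value" starting in column one.
FIELD = re.compile(r"^:([^:\s][^:]*):[ \t]*(.*)$")


def extract_field_list(text):
    """Return (fields, body_lines); fields may appear anywhere in the text."""
    fields, body = {}, []
    key = None
    blanks = []  # blank lines seen since the last field content
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            key = match.group(1).strip().lower()
            fields[key] = [match.group(2)]
            blanks = []
        elif key is not None and not line.strip():
            # Undecided: the blank line belongs to the field only if an
            # indented line follows it.
            blanks.append(line)
        elif key is not None and line[:1] in (" ", "\t"):
            fields[key].extend([""] * len(blanks))
            fields[key].append(line.strip())
            blanks = []
        else:
            body.extend(blanks)
            body.append(line)
            key, blanks = None, []
    body.extend(blanks)
    return {k: "\n".join(v) for k, v in fields.items()}, body
```

A line such as "To be or not to be: that is the question." has no leading colon, so it stays in the body.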

Also, because this is recognizable as metadata wherever it occurs
in the document, one could then drop the requirement that the
metadata occur at the top of the document, which I think is
undesirable.  When there's lots of metadata, it's nicer to put
it at the bottom (or at least to put some of it at the bottom),
so it doesn't interfere with reading the article. lunamark's
lua_metadata allows that, by the way -- so you don't have to
start the document with something that doesn't look like plain
text.

One nice point that David Sanson made is that one could combine
a simple, texty metadata format for common things like titles
and authors with a flexible, more cody format for everything else.
One should keep this in mind in thinking about how to balance flexibility
vs. textiness.

John

+++ Fletcher T. Penney [Sep 18 11 12:06 ]:
 I think the idea of metadata boils down to three perspectives:
 
 
 1) I don't want it/need it/care about it  --- get rid of it
 
 2) I want something easy to write, easy to read, and fits with the Markdown 
 philosophy of as little markup as possible to accomplish the job ---even if 
 not quite as powerful (e.g. MultiMarkdown)
 
 3) I want something powerful/flexible, even if it looks like computer code at 
 the top of my document (e.g. lunamark)
 
 
 Before there can be a unified standard, there has to be a unified philosophy 
 (just like the rest of the standards debate on the list).
 
 
 My philosophy, and therefore that of MMD, is #2 above.  I obviously have more 
 markup than plain Markdown, but I feel that my feature to markup ratio is 
 as good or better than Markdown (obviously a personal opinion, not a fact).  
 The metadata functionality is pretty powerful, fails gracefully when run 
 through plain markdown (if you remember the extra two spaces at the end of 
 lines), but does have some limitations.
 
 I reiterate one of my previous posts - if we want to have any sort of 
 consensus for the Markdown derivatives, the first step is agreeing on a 
 philosophy for those standards.  Individual variants can still have their own 
 features, but we would need agreement on the core.
 
 F-
 
 
 On Sep 18, 2011, at 11:53 AM, John MacFarlane wrote:
 
  +++ David Sanson [Aug 17 11 23:09 ]:
 
 snipped for brevity - please see original posts
 
 --  
 Fletcher T. Penney
 fletc...@fletcherpenney.net
 
 
 
 


Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-18 Thread Fletcher T. Penney

On Sep 18, 2011, at 1:47 PM, John MacFarlane wrote:
snipped

 To my mind, multimarkdown comments just aren't flexible enough:
 
 * There's no way to have multiline metadata fields that contain
  blank lines, e.g. an abstract with two paragraphs.

True - but in MMD an abstract would be included in the document with a separate 
header, not as metadata.  But you're correct that blank lines are not allowed.  
I've never needed them, but they aren't allowed.

 * There's no provision for structured data (e.g. key/value
  tables or lists), or for boolean or numerical fields.

True.  I've never needed them, and have never had them requested.  But there is 
no provision for that.

 * Metadata fields are interpreted as raw strings, not markdown.
  That's sometimes what you want, but not always.  Titles
  often contain emphasis and other formatting, for example,
  and sometimes even footnotes (for acknowledgements).  If
  these are just going into an html meta field, it doesn't much
  matter, but if you're using the metadata fields in templates,
  it does.  (And sure, you could always run a raw string through
  your markdown processor again, before passing it to the template engine,
  but that creates problems for things like reference links and
  footnotes.)

This is a slight difference in behavior from MMD 2.  I'm considering approaches 
to allow processing the contents of the metadata, as this can be an issue 
occasionally.

 Another major problem, in my view, is that if a document starts
 with a phrase followed by a colon, it gets swallowed into metadata:
 
% multimarkdown
To be or not to be: that is the question.
^D
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="tobeornottobe" content="that is the question."/>
</head>
<body>
 
</body>
</html>
 
 That's not what most authors would expect!

This is true.  But a blank line at the top of the document solves the problem.  
And it doesn't match a URL on the first line as metadata, so I'm not sure how 
often this really happens in real life.
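
The behaviour under discussion can be imitated with a few lines of Python. This is a toy rule written for illustration, not MultiMarkdown's actual implementation: the regular expression, the `(?!//)` guard for URLs, and the name `leading_metadata` are all assumptions:

```python
import re

# Toy rule: a first line of the form "key: value" opens the metadata.
# The "(?!//)" guard keeps "http://..." from being read as a key "http".
FIRST_LINE = re.compile(r"^([^\s:][^:]*):(?!//)\s*(.*)$")


def leading_metadata(text):
    """Return the metadata pair found on the first line, if any."""
    lines = text.splitlines()
    if not lines:
        return {}
    match = FIRST_LINE.match(lines[0])
    if not match:
        return {}
    # Normalise the key the way the sample output does: no spaces, lowercase.
    key = re.sub(r"\s+", "", match.group(1)).lower()
    return {key: match.group(2)}
```

Under this rule the Hamlet line is swallowed as a key named `tobeornottobe`, while the same line preceded by a blank line, or a first line that begins with a URL, yields no metadata.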

 For this reason, I would favor something more like reStructuredText
 field lists, which marks the fields explicitly as fields:
 
:title:    Here is the title.
:author:   John
:abstract: The abstract here.
  It can span multiple lines.
 
  As long as the indentation is maintained.
 
This is not part of the metadata.
 
 This is slightly less texty because of the leading colon, but less likely to
 capture regular text.

This becomes a matter of values.  To me, the ugliness of this approach 
outweighs the virtually negligible chance that I will have a document 
triggering metadata when I don't mean it.  But it's certainly not as bad as 
some other alternatives.  If it was proposed as a standard, I would try to vote 
against it, but would not necessarily boycott it within MultiMarkdown.


 Also, because this is recognizable as metadata wherever it occurs
 in the document, one could then drop the requirement that the
 metadata occur at the top of the document, which I think is
 undesirable.  When there's lots of metadata, it's nicer to put
 it at the bottom (or at least to put some of it at the bottom),
 so it doesn't interfere with reading the article. lunamark's
 lua_metadata allows that, by the way -- so you don't have to
 start the document with something that doesn't look like plain
 text.

I don't view metadata as necessarily belonging at the bottom, but the 
flexibility is a bonus.

 One nice point that David Sanson made is that one could combine
 a simple, texty metadata format for common things like titles
 and authors with a flexible, more cody format for everything else.
 One should keep this in mind in thinking about how to balance flexibility
 vs. textiness.
 
 John

My vote would be for something more akin to MMD's metadata as the first option, 
and then for something more robust as the optional variant for those who need 
it.  The cody alternative could allow lists, key value pairs, multiple 
paragraphs, etc.  I suspect it would be used by only a minority of users, but 
that the minority is going to be over-represented on this discussion list.


F-

--  
Fletcher T. Penney
fletc...@fletcherpenney.net






Re: Metadata syntax (was Universal syntax for Markdown)

2011-09-18 Thread David Chambers
On Sep 18, 2011, at 10:47 AM, John MacFarlane wrote:

 * There's no provision for structured data (e.g. key/value
  tables or lists), or for boolean or numerical fields.

I'm not convinced that Markdown should have any say as to which data structure 
a particular value should be transformed into.

These are the things I believe Markdown certainly should define:

* delimiters for metadata blocks (whitespace or otherwise)
* syntax for key–value pairs
* valid keys
* valid values

Perhaps Markdown's responsibilities should be limited to the following:

* ensuring that metadata are omitted from the HTML output
* storing the key–value pairs (as strings) in a dictionary-like object

The reason I lean towards this approach is that the alternative (defining 
syntax for lists, numbers, etc.) would impose extra syntax in common cases. 
Take the following, for example:

date: Sunday, 22 May 2011
time: 6:30pm
zone: America/Los_Angeles
tags: JavaScript, regex, regular expressions

To a human reader, tags is clearly a list. How, though, would a parser know 
that tags is a list but date—which also contains a comma—is not? Resolving 
this ambiguity would require that the tags be wrapped in square brackets (or 
the addition of some other syntax):

date: Sunday, 22 May 2011
time: 6:30pm
zone: America/Los_Angeles
tags: [JavaScript, regex, regular expressions]

What if list items are allowed to contain commas? Perhaps an item may be quoted 
to resolve this ambiguity. What happens, then, if one wishes to include a 
quoted item:

tags: [foo, bar, "baz!"]

If quotation marks are optional, would this necessitate wrapping "baz!" in an 
extra pair?

These are certainly edge cases, but as we've agreed defining correct behaviour 
in such cases is important. If we want to avoid defining our own serialization 
format, we have two options: we can adopt an existing format (such as JSON or 
YAML), or we can hand off the responsibility to application developers.

I favour the latter, because serialization formats, by necessity, contain quite 
a bit of punctuation. Transforming strings from a metadata dictionary into 
appropriate values is something with which I have first-hand experience. Mango 
provides a META_LISTS setting which determines which keys' (string) values 
should be transformed into lists. Sure, this required a bit of work on my part, 
but the end result is pleasing (no extra punctuation in my Markdown files).
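
The division of labour described above can be sketched in Python. This is not Mango's code: `parse_pairs`, `apply_list_keys`, and the `list_keys` argument are hypothetical names loosely modelled on the idea of a META_LISTS setting:

```python
def parse_pairs(block):
    """Markdown's side: store each "key: value" line as a pair of strings."""
    pairs = {}
    for line in block.splitlines():
        # Split on the first colon only, so "6:30pm" survives intact.
        key, sep, value = line.partition(":")
        if sep:
            pairs[key.strip().lower()] = value.strip()
    return pairs


def apply_list_keys(pairs, list_keys=("tags",)):
    """Application's side: turn the named keys into lists, leave the rest."""
    converted = dict(pairs)
    for key in list_keys:
        if key in converted:
            converted[key] = [item.strip()
                              for item in converted[key].split(",")
                              if item.strip()]
    return converted
```

Here `tags` becomes a list while `date`, which also contains a comma, remains a plain string, because the application rather than the syntax says which keys are lists.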

Won't this lead to a situation where one application cannot correctly process 
another application's metadata? Yes. If we're unwilling to accept this I fear 
we'll end up reinventing YAML. ;)

David


On Sep 18, 2011, at 11:07 AM, Fletcher T. Penney wrote:

 
 On Sep 18, 2011, at 1:47 PM, John MacFarlane wrote:
 snipped
 
 To my mind, multimarkdown comments just aren't flexible enough:
 
 * There's no way to have multiline metadata fields that contain
 blank lines, e.g. an abstract with two paragraphs.
 
 True - but in MMD an abstract would be included in the document with a 
 separate header, not as metadata.  But you're correct that blank lines are 
 not allowed.  I've never needed them, but they aren't allowed.
 
 * There's no provision for structured data (e.g. key/value
 tables or lists), or for boolean or numerical fields.
 
 True.  I've never needed them, and have never had them requested.  But there 
 is no provision for that.
 
 * Metadata fields are interpreted as raw strings, not markdown.
 That's sometimes what you want, but not always.  Titles
 often contain emphasis and other formatting, for example,
 and sometimes even footnotes (for acknowledgements).  If
 these are just going into an html meta field, it doesn't much
 matter, but if you're using the metadata fields in templates,
 it does.  (And sure, you could always run a raw string through
 your markdown processor again, before passing it to the template engine,
 but that creates problems for things like reference links and
 footnotes.)
 
 This is a slight difference in behavior from MMD 2.  I'm considering 
 approaches to allow processing the contents of the metadata, as this can be 
 an issue occasionally.
 
 Another major problem, in my view, is that if a document starts
 with a phrase followed by a colon, it gets swallowed into metadata:
 
   % multimarkdown
   To be or not to be: that is the question.
   ^D
   <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
   <!DOCTYPE html>
   <html xmlns="http://www.w3.org/1999/xhtml">
   <head>
   <meta name="tobeornottobe" content="that is the question."/>
   </head>
   <body>
 
   </body>
   </html>
 
 That's not what most authors would expect!
 
 This is true.  But a blank line at the top of the document solves the 
 problem.  And it doesn't match a URL on the first line as metadata, so I'm 
 not sure how often this really happens in real life.
 
 For this reason, I would favor something more like reStructuredText
 field lists, which marks the fields explicitly as fields:
 
   :title:Here is the title.
   :author:   John
   :abstract: 

Re: Metadata syntax (was Universal syntax for Markdown)

2011-08-29 Thread Alan Hogan

On Aug 26, 2011, at 11:01 AM, bowerb...@aol.com wrote:

 
 there's something else that i generally put under metadata
 -- which other people do not -- which are the specifications
 used to create the output-formats.  these include things like
 straight-quotes vs. curly, indented paragraphs vs. block, and
 the pagesize (for .pdf), the font, fontsize, leading, and so on.
 this allows the end-user who receives the z.m.l. file to create
 outputs matching what the author intended them to look like.
 in accordance with the all-text-in-one-file mandate of z.m.l.,
 these specifications should be included in the text-file itself,
 and can fall in the metadata section, the colophon section,
 or in their own output specifications section, as you desire...
 and, of course, end-users can also change the specifications,
 so as to create output that is formatted to their own desires...
 
 -bowerbird

I think the definition of such a section, for similar reasons (such metadata 
would only be considered in certain contexts such as publishing or CMS 
extensions), was a motivation for the metadata discussion.

Alan Hogan


re: Metadata syntax (was Universal syntax for Markdown)

2011-08-26 Thread Bowerbird
all pooped out, are you?

oh well, the conversation this time lasted longer
than it ever has before, in my memory, so maybe
you're just working up your stamina for next time...

so let me finish off this round...

***

christoph said:
A Markdown document may contain metadata
in a human readable form that the parser converts
to a machine readable form of metadata automatically. 
A casual reader will understand the content directly
and without distraction.   Bowerbird will love this.

indeed, christoph...   because you've begun to describe
the very system that i use, for the very reason i use it.

i'll describe it more fully below, but first other stuff

***

i'm not sure i fully understand the mentality that says
implementations of markdown 2.0 can toss metadata.

isn't the objective to dispense with implementations that
act differently from each other?   ok, sure, i'm not naive;
i realize that once a standard for markup 2.0 is made,
someone will come along and tweak it for their benefit,
and then we are once again on the path toward fracture.
but still, the goal for here and now is to unify all.   right?

i feel the same way about command-line switches that
turn on different modes, like quirks and extensions.
isn't it our zeitgeist to gather everyone under one roof?
you'll just ignore (or never learn) features you don't need.

so everyone gets what they want.   and if it's not possible,
if you want to use the system you have been using which
is tweaked the way you want it, just continue to do that...
it's not like those scripts will stop working or something.

but manufacturing a situation where all of the differences
are _blessed_ (rather than removed) is counterproductive.

***

now on to metadata...

as for the color of the metadata bikeshed, we have one
shade of paint -- simple -- so that's what it must be...

you've probably over-discussed it already, without even
getting to the meat of the matter.   for _most_ purposes,
the metadata is relatively unimportant, which you'll see
quite clearly if you only begin to concentrate on specifics.

in a .pdf, for example, the metadata consists merely of
title, author, subject, creator, and keywords.   that's it...

in an .epub or a .mobi, you can specify a ton of metadata,
if you want, but there's no standardized way of getting it,
so you're basically whistling at a noisy construction site...
(or doing pantomime in the dark, if you prefer that image.)

unless/until the microformat people get an upper-hand
-- and lord help us if that kind of bureaucracy wins out --
metadata in .html continues to be a rather iffy thing, so
at least for now, i think this issue needs little attention...

as for the matter of tags or keywords, they're _lame_,
to a large degree, because they can be gleaned from the
text itself in most cases.   and perhaps more importantly,
such descriptive judgments need to be accumulated over
the input from hundreds or thousands of objective users,
rather than plugged in by a document's author or publisher,
or the specter of gaming the system makes it all worthless...
i'm not telling people not to use tags, but i think it's obvious
that any worthwhile recommendation system will ignore 'em.
your metadata often tries to tell lies; google knows the truth.

there are a lot of consultants selling metadata as a cure-all.
it's more like snake-oil.

***

as for my system...

as i said, my focus is on _books_, so for me, the concept of
the title-page (plus the cover) is the one that rules here.

the first section or chapter in a .zml file is the title-page,
and _everything_ on that page is considered as metadata.

remember that my first pass consists of separating chunks
-- a sequence of non-blank lines bordered by blank lines --
so the top chunk (of one or more lines) is defined as the title.

the second chunk is considered to be the subtitle, and the
third is considered to be the author.   the author chunk is
required to start with the word "by", so if the second chunk
starts with "by" and the third chunk does not, my routines
assume that the book has no subtitle, so the second chunk
is considered to be the author chunk.   subsequent chunks
are required to be labeled appropriately, such as "edited by"
or "illustrations by" or "plus additional contributions by" or
"with preface by", and so on.   you get the picture; it's clear.
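
a rough sketch of the chunk rule just described, assuming plain text input. the function names `split_chunks` and `read_title_page` are invented for illustration and are not taken from any actual z.m.l. tool; only the title, subtitle, and author chunks are handled.

```python
import re


def split_chunks(text):
    """Split text into chunks: runs of non-blank lines bordered by blank lines."""
    return [c.strip() for c in re.split(r"\n\s*\n", text.strip()) if c.strip()]


def starts_with_by(chunk):
    """True when a chunk begins with the word 'by'."""
    return chunk.lower().startswith("by ")


def read_title_page(text):
    """Assign title, optional subtitle, and author from the leading chunks."""
    chunks = split_chunks(text)
    page = {"title": chunks[0] if chunks else None, "subtitle": None, "author": None}
    second = chunks[1] if len(chunks) > 1 else ""
    third = chunks[2] if len(chunks) > 2 else ""
    if starts_with_by(second) and not starts_with_by(third):
        # No subtitle: the second chunk is the author chunk.
        page["author"] = second[3:].strip()
    elif starts_with_by(third):
        page["subtitle"] = second
        page["author"] = third[3:].strip()
    return page
```

later chunks ("edited by", "illustrations by", and so on) could be matched the same way against a list of known labels.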

other things which commonly appear on the title-page are
the publisher's name and often the city where it is located,
publication date, contact information for the author(s), etc.

none of this is particularly difficult to parse.

nor does it sacrifice any power _or_ flexibility.

other info about the document is obtained in the course of
analyzing it, like the number of chapters and illustrations,
the size of the file, the number of references, and so forth.

you also have to acknowledge, at some point in time, that
no matter what you do, you ain't gonna make a professional
book-cataloger happy...   and one of my close 

Re: Metadata syntax (was Universal syntax for Markdown)

2011-08-18 Thread John MacFarlane
+++ Sam Angove [Aug 18 11 12:26 ]:
 On Thu, Aug 18, 2011 at 7:29 AM, Fletcher T. Penney
 fletc...@fletcherpenney.net wrote:
 
  The MMD format for metadata was actually taken from the Blosxom software
  that you mention.
 
 And before that, almost certainly taken from the Internet Message Format [1].
 
 MultiMarkdown improves on the IETF version from a user's point of view
 (and becomes more Markdownish) by making it legal to do lazy
 line-folding. The result is something that's simple to write, simple
 to parse, and exactly what a normal person would come up with if you
 asked them to put some metadata at the top of a text file.
 
 I think it's a perfect fit for Markdown.
 
 YAML, by contrast, is complicated and outrageously heavy to include as
 a dependency -- data merging, references, different types of folding,
 user-defined data-types... you've got to be kidding.

I have to agree that YAML is overkill for our purposes, and adds
a lot of complexity.



Metadata syntax (was Universal syntax for Markdown)

2011-08-17 Thread Fletcher T. Penney
1) This is a totally separate issue from the discussion at hand I was 
responding to, which is how to try and converge the Markdown derivatives, so I 
renamed the thread

2) That said, there are certainly multiple ways of including metadata 
information, and I'm happy to discuss.  But I do want to be clear that I think 
this is a secondary discussion, and do not wish to distract from the bigger 
picture of how to develop a plan for unifying the core features of the Markdown 
family.  How to merge metadata syntaxes should be a secondary or tertiary 
concern for such an effort.



For those who care about metadata syntaxes, read on.  If you don't, feel free 
to skip to the next email that interests you.  ;)


The MMD format for metadata was actually taken from the Blosxom software that 
you mention.  As you may recall, the first line could be used as a title, but 
beyond that a syntax basically identical to that of MMD was the most common way 
of including metadata, and I believe that the plugin responsible was in fact, 
called metadata.  This was necessary to allow information such as dates, 
categories, etc to be included in the document itself.  The ability to include 
metadata using arbitrary keys created a blossoming (pun intended) of plugins 
that added many useful features to the blosxom package.

Your suggested syntax certainly requires less markup than that used by MMD 
currently, but at the cost of a great deal of flexibility, and would require 
more complexity in programming the parser.

You mention the English-centric nature of MMD metadata.  This is certainly 
true, but no more so than HTML itself.  One could certainly localize MMD to use 
any language you like (the beauty of open source), but to match your proposal 
in multiple languages would be quite complicated.

For example, the following are valid MMD metadata dates, and easily used:

date:   8/17/2011
date:   August 17th, 2011
date:   2011-08-17
date:   17/8/2011
date:   14. Juni 2001
date:   8 avril 2000

Writing a parser that would correctly catch all of these dates in any language 
would be quite difficult, and prone to error.
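
As a small illustration of why this is error-prone, here is a sketch that tries a few numeric formats and reports every date a value could denote. The format list and the function name `possible_dates` are assumptions for this example, not an MMD feature; month names in any language are deliberately left out.

```python
from datetime import datetime

# Candidate formats are an assumption for illustration, not part of MMD.
CANDIDATE_FORMATS = ("%m/%d/%Y", "%d/%m/%Y", "%Y-%m-%d")


def possible_dates(value, formats=CANDIDATE_FORMATS):
    """Return every distinct date that `value` could denote under the given formats."""
    found = []
    for fmt in formats:
        try:
            parsed = datetime.strptime(value.strip(), fmt).date()
        except ValueError:
            continue  # this format does not fit the value
        if parsed not in found:
            found.append(parsed)
    return found
```

A value such as 8/17/2011 or 17/8/2011 yields a single date, but 8/7/2011 yields two (7 August and 8 July), and "14. Juni 2001" yields none. An automatic detector would have to guess in the second case and know German month names in the third.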

You mention tags as being easily recognized, but this is not always true:

A sample document

by John Smith, MD
Director of Palliative Care, Division of General Medicine, Medical 
University of Somewhere

While perhaps not the best example of potential problems, this would be 
incorrectly interpreted as tags, when the author probably implies that this 
represents his academic affiliation and would like it to be properly placed 
after his name on the title page, or on the slide deck if generating via beamer.


So your example would work for simple metadata that relies only on numerical 
dates.  For documents that fit your desired model, this syntax would be great 
and would involve less markup --- which is good.  However, I suspect that for 
those who want metadata in their document, it would be too limiting --- which 
is not good.  Many of my users, myself included, would end up right back where 
we started with needing another way to include metadata.


To help give you perspective on the power of the current metadata model, by 
properly including the right metadata, a single MMD document can be processed 
into a web page, a pdf slide show (aka powerpoint), and a pdf handout.   
Another document can be processed into letterhead, complete with logo, return 
address, recipient information, graphical signature, and even a properly 
addressed envelope.  Another can be output as a properly formatted manuscript 
for submission to a publisher.  

I don't expect all users to use the full power of metadata.  Many users can 
simply ignore it altogether.  But it is an incredibly useful feature that is 
one of the primary ways that I integrate MMD into my own personal workflow.  It 
does take a bit of willingness to dig around and experiment in order to 
understand how metadata works.  So while I am certainly interested in ways to 
improve it, metadata will not be removed from MMD. 


That doesn't mean I expect all variants to use metadata, just because MMD does. 
Nor do I expect them to follow the MMD syntax if they do.  Other than yours, I 
haven't seen any proposals for a metadata syntax that had *less* markup than 
mine, nor did they seem any more human friendly than this syntax.  And for my 
purposes, your proposal doesn't offer the flexibility that I would need for the 
ways I use MMD.


I've tried to throw a few things into your example, to show how it wouldn't 
work as well for my own use cases:

---
Test Document for Automatic Metadata Detection
Is this a subtitle, or a continuation of the title from above?

by Christoph Freitag  
date: August 17, 2001
Markdown, Standardization, MMD, Metadata
affiliation:University of Somewhere
comment:This looks funny aligned 

Re: Metadata syntax (was Universal syntax for Markdown)

2011-08-17 Thread David Chambers
It is true that certain metadata (author and date, to provide two examples)
are used far more frequently than return addresses or URIs for graphical
signatures. That said, it would be foolish to try to imagine every way in
which metadata might be used, nor do I see much value in doing so.

If Markdown is to process metadata, the syntax should support arbitrary
key–value pairs.

For example:

author: Jesper Nøhr
date: 17 August 2011
tags: lol, omg, lulz

Formatted differently:

author: Jesper Nøhr
date: 17 August 2011
tag: lol
tag: omg
tag: lulz

If — again, if — Markdown is to be charged with parsing metadata, my opinion
is that its role should be limited to returning a dictionary-like metadata
object (in addition to the HTML string generated from the remainder of the
document's contents).

For the first example:

{"date": "17 August 2011", "tags": "lol, omg, lulz", "author": "Jesper Nøhr"}

For the second example:

{"date": "17 August 2011", "author": "Jesper Nøhr",
 "tag": ["lol", "omg", "lulz"]}

In my opinion, Markdown should *not* be responsible for any of the
following:

   - splitting lists (note that "lol, omg, lulz" is a string in the first
   example)
   - converting date strings into date objects
   - any other manipulation of values

In other words, every value should be either a string, or an ordered,
list-like object containing two or more strings (in the case of a repeated
key).

In addition to converting strings into appropriate objects, applications
making use of Markdown's metadata feature would also be responsible for
handling the fact that the value for a particular key may be a string for
one document and a list of strings for another.
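
A minimal sketch of this behaviour, assuming one "key: value" pair per line. The function names `parse_metadata` and `as_list` are invented for illustration and do not belong to any existing Markdown implementation.

```python
def parse_metadata(lines):
    """Collect 'key: value' lines into a dict; repeated keys become lists of strings."""
    meta = {}
    for line in lines:
        if ":" not in line:
            continue  # this sketch simply skips lines without a key
        key, value = line.split(":", 1)
        key, value = key.strip(), value.strip()
        if key not in meta:
            meta[key] = value                 # first occurrence: plain string
        elif isinstance(meta[key], list):
            meta[key].append(value)           # third or later occurrence
        else:
            meta[key] = [meta[key], value]    # second occurrence: promote to list
    return meta


def as_list(value):
    """Application-side helper: treat a lone string as a one-item list."""
    return value if isinstance(value, list) else [value]
```

The `as_list` helper is the sort of thing an application would need in order to cope with a key that is a string in one document and a list in another.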

Fletcher touched on another question that should be discussed: should
multiline values be accommodated and if so, how?

I think it'd be great to support multiline strings. I imagine the formatting
looking something like this:

author:
  Jesper Nøhr
date:
  17 August 2011
lol:
  Irony keffiyeh pitchfork, mustache letterpress tofu cred twee scenester
  thundercats gluten-free yr chambray sartorial stumptown. Homo cosby sweater
  gentrify banh mi letterpress, vinyl beard hoodie terry richardson. Art party
  whatever banksy, readymade skateboard you probably haven't heard of them
  tumblr tattooed PBR letterpress photo booth carles vegan organic.
omg:
  VHS carles photo booth food truck synth craft beer, wes anderson tofu banksy
  fanny pack stumptown.

This strikes me as being in the spirit of Markdown, as it's how one might
structure this content if one were to produce it on a typewriter.
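
One way such indented values might be read is sketched below. The function name `parse_indented_metadata` is hypothetical, and the sketch makes two simplifying assumptions: continuation lines are joined with single spaces, and blank lines inside a value are not supported.

```python
def parse_indented_metadata(text):
    """Parse 'key:' lines followed by indented value lines into a dict of strings."""
    meta = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines are ignored in this sketch
        if line[0] in " \t":
            if current is not None:
                # Indented line: continue the value of the most recent key.
                meta[current] = (meta[current] + " " + line.strip()).strip()
        elif ":" in line:
            # Unindented line with a colon: start a new key.
            current, _, rest = line.partition(":")
            current = current.strip()
            meta[current] = rest.strip()
    return meta
```

Because a value may also start on the key's own line, the single-line form (author: Jesper Nøhr) is handled by the same loop.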

I'm interested to hear people's thoughts on multiline values and on the
unfancy approach to metadata parsing that I (currently) favour.

David


On 17 August 2011 15:17, M Harris m...@2011.n0b.org wrote:

 So, hi all. First time commenting on the list.

 I personally think having tags (whether of type "author:" or type "by")
 is useful for two reasons.
 One: It allows multiple tags to be entered. Two, it clears up the
 potential problem listed by Fletcher regarding tags.

 by Christoph Freitag
 Affiliation: XYZ
 by Fletcher T. Penney
 Affiliation: ABC
 tags: Markdown, Standardization, MMD, Metadata
 desc: An interesting discussion of how metadata could be included
 usefully in Markdown, whilst being readable etc.


 Regarding the localisation problem then, I thought that this was a
 solved problem when it came to computing? (At least in the cases of the
 major world languages.) A parser could have a table of equivalent words,
 so in English "by", en français "de" (pardon my French*).

 * By which I mean, I'm not sure that's correct, because I'm only a
 learner.

  From: Christoph Freitag m...@christoph-freitag.de
  Fletcher, sorry, but personally -- despite loving MMD (and even having
 used MMD CMS for a diary) -- I have never liked the way MMD handles
 metadata. Partly this is because, not being a native English speaker, I
 dislike English meta descriptors. A localization could resolve this -- but I
 still think it looks ugly. However, do you actually need descriptors at all?
 I doubt it:
 
  *   The title could be anything at the start of the document. Blosxom
 is a good example. Anything up to the first blank line is the title.
  *   After that, anything between the first blank line and the second
 blank line would be treated as additional metadata.
  *   Instead of the "Author:" descriptor, explicitly stated, it should
 suffice to write "by". What follows is the name of the author. (Localization
 would be easier as only this keyword would have to be known to the parser
 in a number of languages.)
  *   Dates would be self-explanatory, to a clever parser.
  *   Any list of words separated by commas on a single line would be
 treated as tags.
  *   Any more fanciful meta descriptors might be given explicitly just as
 in MMD before. This could be left to non-standard, personalized variants of
 Markdown.
 
  Thus the following would be a valid document:
 
  ---
  Test 

Re: Metadata syntax (was Universal syntax for Markdown)

2011-08-17 Thread Alan J. Hogan


On Aug 17, 2011, at 6:17 PM, David Chambers david.chambers...@gmail.com wrote:

 I'm interested to hear people's thoughts on multiline values and on the 
 unfancy approach to metadata parsing that I (currently) favour.

I agree that:

- multiline values are a must
- arbitrary key/value pairs are a must

When you describe the syntax you envision, I am just thinking, why redefine 
YAML? (In that case, if you'll allow me a moment of glibness, let's call that 
syntax YAYAML.)

AFAIK both YAML and JSON allow for representation of the same data types 
(numbers, strings, arrays, objects/dictionaries). 

If we pick a format as preferred metadata syntax, my vote is for YAML. It's 
already defined, already implemented, already proven, and fairly natural. Hell, 
in TextMate, for example, the YAML bundle would simply apply to the 
appropriate section of the Markdown 2 doc (like JavaScript or PHP among HTML)!

Alan


Re: Metadata syntax (was Universal syntax for Markdown)

2011-08-17 Thread Sam Angove
On Thu, Aug 18, 2011 at 7:29 AM, Fletcher T. Penney
fletc...@fletcherpenney.net wrote:

 The MMD format for metadata was actually taken from the Blosxom software
 that you mention.

And before that, almost certainly taken from the Internet Message Format [1].

MultiMarkdown improves on the IETF version from a user's point of view
(and becomes more Markdownish) by making it legal to do lazy
line-folding. The result is something that's simple to write, simple
to parse, and exactly what a normal person would come up with if you
asked them to put some metadata at the top of a text file.
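
To illustrate the shared heritage, here is a sketch showing that an ordinary message parser can already read a header block of this shape. It uses Python's standard `email` package; the sample document is made up, and this is not how MultiMarkdown itself parses metadata.

```python
from email import message_from_string

# A made-up document: headers, a blank line, then the body.
document = (
    "Title: A Sample Document\n"
    "Author: John Smith\n"
    "Tag: lol\n"
    "Tag: omg\n"
    "\n"
    "The body of the document starts here.\n"
)

message = message_from_string(document)
title = message["title"]        # header lookup is case-insensitive
tags = message.get_all("tag")   # repeated headers come back as a list
body = message.get_payload()    # everything after the first blank line
```

The Internet Message Format only folds a header onto lines that begin with whitespace, so MultiMarkdown's lazy continuation lines would need handling of their own.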

I think it's a perfect fit for Markdown.

YAML, by contrast, is complicated and outrageously heavy to include as
a dependency -- data merging, references, different types of folding,
user-defined data-types... you've got to be kidding.

[1]: http://tools.ietf.org/html/rfc2822#section-2.2


Re: Metadata syntax (was Universal syntax for Markdown)

2011-08-17 Thread Alan J. Hogan


On Aug 17, 2011, at 7:26 PM, Sam Angove peas...@gmail.com wrote:

 YAML, by contrast, is complicated and outrageously heavy to include as
 a dependency -- data merging, references, different types of folding,
 user-defined data-types... you've got to be kidding.

Wow, I seem to have a vastly over-simplified conception of YAML.

Alan



Re: Metadata syntax (was Universal syntax for Markdown)

2011-08-17 Thread David Sanson
First time posting here as well. I've been watching this discussion
with interest. As a user of (extended) markdown, I have long hoped for
a unified standard for (most or all) markdown extensions and a unified
handling of metadata.

It seems to me that one of the issues that arises when we start
thinking about metadata is that there really are two different kinds
of metadata: some metadata (title, author, date) is---at least in many
cases---also part of the *content* of the document. This is the kind
of metadata for which I feel the force of the demand for an elegant
plaintext solution. For some bold suggestions in this direction, see
this [old post by Michael Thompson][1] to the pandoc-discuss list.
Here is one of his examples from that post:

 A Good Man Is Hard To Find

   Flannery O'Connor
  Spring 1952


The grandmother didn't want to go to Florida. She wanted to visit
some of her connections in east Tennessee and she was seizing at
every chance to change Bailey's mind.

Isn't that so much *prettier* than any of the options currently in
play? Email someone a document like that, and they will know exactly
what you mean, and see no distracting markup. No doubt this presents
challenges when it comes to parsing, and I have no idea whether or not
those challenges are surmountable. Clearly some rules would have to be
laid down (Does it have to be centered? Indented? Can I underline the
title ala setext? Do I have to have two blank lines after the date?
Can I leave the date out? etc.) And it raises issues for backwards
compatibility too. But I think it's worth having in view a solution
that achieves a certain degree of perfection along this one dimension.

But then there is the other kind of metadata. Tags, keywords,
baseurls, paths to associated files, directives for webpage templating
software, and so on and so on. This sort of stuff is definitely not
content. It is a bunch of data that I want to associate with the file
for some reason or other. It needs to be indefinitely extensible. It
is frequently tied directly to some specific output format or context.
In other contexts, probably just needs to be ignored. Blosxom taught
us that it should all be at the top of the document (and successors,
like Jekyll, follow this tradition), but much of it is ugly enough
that it could just as well be banished to the bottom of the document,
where nobody but the author would ever have to look at it.

When it comes to this sort of metadata, I don't see any reason to look
for something elegant, language-independent, and plaintext-y. This is
where it feels like I just want a way of embedding a block of data
within a markdown file, knowing that it won't be treated as content
(and, depending on my processor and the context, knowing that it may
be sucked up and used in various ways). It is here that I agree with
the sentiment that metadata shouldn't be part of the markdown spec,
*but* I think markdown should be smart enough to ignore the metadata,
so that I don't have to strip it out before feeding the document to a
markdown processor.

Here is an extreme version of this: extant implementations of citeproc
support JSON as a bibliography format. Imagine they supported YAML.
Then imagine being able to stick something like this at the *end* of
your markdown file,

---
story:
title: A Good Man is Hard to Find
author: Flannery O'Connor
date: Spring 1952
key: oconnor1952
story:
title: The Old Man and the Sea
author: Ernest Hemingway
date: Sep 1952
key: hemingway1952
...

and then being able to treat the same file as both your markdown file
and your bibliography database, knowing that, when you run it through
the markdown parser, that chunk of metadata will be ignored, and when
you feed it as a database to your citeproc implementation, the
markdown will be ignored. This is just one example of the sort of
flexibility and power that you might get from supporting arbitrary
blocks of data within markdown files.

So, here is my *pipe dream* implementation of metadata in markdown:

1. A syntax for clean, language independent title, author, date (and
?) that looks the way you would have done it on a typewriter or in a
plaintext email.

2. Support for embedding arbitrary metadata inside of appropriate
delimiters (e.g., YAML's '---' and '...') *anywhere* within the
document.
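
Item 2 might be sketched roughly as follows: a pre-pass that pulls delimited blocks out of the text and hands the rest to the markdown processor. The function name `split_metadata_blocks` is hypothetical, and the sketch ignores the fact that '---' on a line of its own can also be a markdown horizontal rule, which a real rule would have to resolve.

```python
def split_metadata_blocks(text):
    """Separate lines inside '---' ... '...' blocks from the remaining content."""
    content, blocks, current = [], [], None
    for line in text.splitlines():
        if current is None:
            if line.strip() == "---":
                current = []          # open a new metadata block
            else:
                content.append(line)
        elif line.strip() == "...":
            blocks.append("\n".join(current))
            current = None            # close the block
        else:
            current.append(line)
    if current is not None:
        # Unterminated block: treat it as ordinary content in this sketch.
        content.append("---")
        content.extend(current)
    return "\n".join(content), blocks
```

The first return value would go to the markdown parser, and the blocks could be handed to whatever tool understands them, such as a YAML library or a citeproc front end.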

I would then add, that, for simplicity, all markdown processors should
look into the arbitrary metadata for a few common bits of metadata,
namely, title, date, and author (perhaps with proper localizations).
That way, I could write beautiful plaintext markdown, providing title,
author, date as part of the content, if I wanted to, but if I was
lazy, or was using a bunch of metadata and preferred to keep it all in
one place, I could instead just specify that as metadata along with
all the rest. I guess this means that I 

Re: Metadata syntax (was Universal syntax for Markdown)

2011-08-17 Thread David Sanson
As for the heaviness of YAML as a dependency, I think it would be
reasonable to expect markdown itself to handle only the simplest YAML
constructs when trying to find the few bits of metadata---title,
author, date---that it should be responsible for.