Re: on the philosophical aspects of a specification

2008-03-07 Thread James Grimmelmann

On Mar 7, 2008, at 5:17 AM, Aristotle Pagaltzis wrote:


Hi Yuri, Weylan and Seumas,

* Yuri Takhteyev <[EMAIL PROTECTED]> [2008-03-07 08:50]:

   *hello **dear* boy**


That's a very good question. Here's a counterquestion: what
does a human reader see in that text?


When I try to look at this with my normal-person eye, what I
see here is incorrect markup


Sorry, but if you see “markup” (much less “incorrect markup”)
you’re not looking at it with a normal-person eye. :-)


So, the user will type in something like this and get
"hello **dear boy**". Not much of a tragedy. They will
say, oh, silly me, must have screwed something up. (They did!)
Then they'll go and fix it. I am all for flexibility, but not
to the point of trying to divine the meaning of ambiguous or
ill-formed markup.


Only a small minority will do that. Most people most of the time
don’t care enough about that particular piece of text to actually
fix any small nits in it, any more than they’ll care to fix all
of their small spelling and grammar mistakes. (Less, actually.)
That has certainly been my experience on wikis and weblogs that
use shorthand markups like Markdown.


Given that, I would take advantage of the fact that Markdown source is  
highly readable.  If an input is too ambiguous, leave it unparsed.   
The source will be reasonably clear.  As the rules get more complex  
and try to make assumptions about what authors intended, there will be  
more cases in which the rules get it wrong and the output contains  
something that's both unintended and harder to puzzle out than the  
source would have been.
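
As a rough illustration of the "leave it unparsed" policy, here is a minimal Python sketch. The function name render_emphasis and the token pattern are assumptions made for this example and are not taken from any existing Markdown implementation. It handles only * and ** delimiters, and falls back to the (escaped) source whenever they fail to nest cleanly.

```python
import re
from html import escape

# Split the text into "**", "*", or runs of ordinary characters.
TOKEN = re.compile(r"\*\*|\*|[^*]+")
TAGS = {"*": "em", "**": "strong"}


def render_emphasis(source: str) -> str:
    """Render *em* and **strong**; if the delimiters do not nest cleanly,
    give up and return the source text (HTML-escaped) instead of guessing."""
    stack = []  # delimiters that are currently open
    out = []    # output fragments
    for tok in TOKEN.findall(source):
        if tok not in TAGS:
            out.append(escape(tok))
        elif stack and stack[-1] == tok:
            # Closes the innermost open delimiter: well nested.
            stack.pop()
            out.append(f"</{TAGS[tok]}>")
        elif tok in stack:
            # Would close an outer delimiter across an inner one: ambiguous.
            return escape(source)
        else:
            stack.append(tok)
            out.append(f"<{TAGS[tok]}>")
    if stack:
        # Something was opened and never closed: ambiguous.
        return escape(source)
    return "".join(out)


print(render_emphasis("*hello **dear** boy*"))
print(render_emphasis("*hello **dear* boy**"))
```

Under these assumptions the well-nested input comes out as nested em and strong elements, while the crossed input is passed through unchanged, so the reader sees exactly what the author typed.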


James
___
Markdown-Discuss mailing list
Markdown-Discuss@six.pairlist.net
http://six.pairlist.net/mailman/listinfo/markdown-discuss


Re: on the philosophical aspects of a specification

2008-03-07 Thread James Grimmelmann

On Mar 7, 2008, at 2:45 AM, Yuri Takhteyev wrote:


   *hello **dear* boy**


That's a very good question. Here's a counterquestion: what does
a human reader see in that text?


When I try to look at this with my normal-person eye, what I see here
is incorrect markup, which I then want to leave as is and move on.
When I look at it with my formalistic left-parsing eye, I see
"hello **dear boy**".   When I look at it with my reg-exps in
a loop eye, I see "*hello dear* boy".  Either one of
those is ok with me.  Let's just pick one.  Everything else is from
the devil, I say.  Please, let's keep it simple.

So, the user will type in something like this and get "hello
**dear boy**".  Not much of a tragedy.  They will say, oh, silly
me, must have screwed something up.  (They did!)  Then they'll go and
fix it.  I am all for flexibility, but not to the point of trying to
divine the meaning of ambiguous or ill-formed markup.


I strongly agree.  (Or is that emphatically agree?)

In this context, if the output is
hello **dear boy**
or
*hello dear* boy
and the user never notices the problem, the output is still readable.   
If the user does notice that something's wrong, it's easy to look at  
the output and realize roughly what happened.  ("Some of the *s must not
have matched up correctly, because they made it through into the  
output.")  Either of these solutions is a healthy response by the  
parser.


I would prefer "formalistic left-parsing" to "reg-exps in a loop"  
because (a) working from left to right is closer to the informal  
intuitive model that most users are likely to have as to how the text  
transformations work, and (b) if we actually specify a grammar, the  
more we stick to "formalistic left-parsing," the cleaner it will be.
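
One possible reading of "formalistic left-parsing" can be sketched in Python as follows. This is an assumption about what such a rule might look like, not a proposed spec, and the name left_parse is hypothetical. The scanner walks the text once from left to right; a delimiter closes the nearest earlier opener of the same kind, and any openers stranded in between are left in the output as literal characters. HTML escaping is omitted to keep the sketch short.

```python
import re

# Split the text into "**", "*", or runs of ordinary characters.
TOKEN = re.compile(r"\*\*|\*|[^*]+")
TAGS = {"*": "em", "**": "strong"}


def left_parse(source: str) -> str:
    """Single left-to-right pass over * and ** delimiters."""
    out = []      # output fragments; an opener starts out as literal text
    open_at = []  # stack of (delimiter, index of its fragment in out)
    for tok in TOKEN.findall(source):
        if tok not in TAGS:
            out.append(tok)
            continue
        # Look for the nearest earlier opener of the same kind.
        match = next(
            (i for i in range(len(open_at) - 1, -1, -1) if open_at[i][0] == tok),
            None,
        )
        if match is None:
            # No opener to close, so this delimiter becomes an opener.
            open_at.append((tok, len(out)))
            out.append(tok)
        else:
            # Turn the opener into a start tag and emit the end tag.
            _, pos = open_at[match]
            out[pos] = f"<{TAGS[tok]}>"
            # Openers stranded inside the closed span stay literal.
            del open_at[match:]
            out.append(f"</{TAGS[tok]}>")
    return "".join(out)


print(left_parse("*hello **dear* boy**"))
```

For the disputed input this prints <em>hello **dear</em> boy**: the tags are properly nested, and the stray ** pairs survive as visible asterisks that hint at what went wrong.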



I don't think it really matters what we output for cases like this.  I
think any rule would be ok, as long as it satisfies the following
criteria:

1. It's _simple_
2. It always produces valid XHTML (unless input has HTML tags)
3. It should produce appropriate HTML for "normal" markdown.



I agree, though I might have said it ought to be **simple**.

James


My reg-exp eye says:  "strong" before "em" (longer pattern first),
starting from the right for each.  I am pretty sure this rule
satisfies 1, 2, and 3.

Let's stop this nonsense and get back to defining a spec for the
_normal_ markdown.

 - yuri

--
http://sputnik.freewisdom.org/


Re: on the philosophical aspects of a specification

2008-03-05 Thread James Grimmelmann

On Mar 5, 2008, at 2:38 PM, david parsons wrote:


   When I write a really long list,

   *  sometimes, after a particularly long and
      detailed list item, I'll lose track of the
      exact indentation and
    * add one too many spaces to the leading
      indent.

   so it would be bad if that broke nesting.


It'd be nice to get near-miss list indentation right.

BUT . . . if I make this mistake and Markdown mis-nests, the mistake  
will be obvious when I look over the output.  What's more, it'll be  
obvious how to fix it.


One of the advantages of Markdown syntax is that when something weird  
happens, it's usually very easy to spot and to debug.  I'd rather have  
clear and intuitive syntax that produces predictable outputs than get  
all of the near-misses and edge cases right.  It's good to be  
forgiving of user goofs, but it's also good to provide good implicit  
feedback on how to clean them up.  Enforcing a rule that items at the  
same level of intended indentation should start with the same number  
of spaces seems like a good case for being rigid, because a user who  
messes it up (as I've often done) can easily spot the problem and  
recover gracefully.
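
A small Python sketch of what such a rigid rule could look like is given below. The function name nesting_depths and the item pattern are assumptions made for illustration only. Each list item's leading-space count is compared exactly against the currently open levels: an equal count is a sibling, a larger count opens a new nested level, and a smaller count closes levels.

```python
import re

# A list item: leading spaces, a bullet character, at least one space, then text.
ITEM = re.compile(r"^( *)[*+-] +(.*)$")


def nesting_depths(lines):
    """Return (depth, text) for every list item, comparing indents exactly."""
    levels = []  # indentation widths of the currently open list levels
    result = []
    for line in lines:
        m = ITEM.match(line)
        if not m:
            continue  # continuation lines and non-list lines are ignored
        indent = len(m.group(1))
        # Close any levels that are indented deeper than this item.
        while levels and indent < levels[-1]:
            levels.pop()
        # A strictly deeper indent (or the first item) opens a new level.
        if not levels or indent > levels[-1]:
            levels.append(indent)
        result.append((len(levels) - 1, m.group(2)))
    return result


slip = [
    "   *  first item",
    "      continued text",
    "    * second item, one space too many",
]
print(nesting_depths(slip))
```

With the one-space slip, the second item lands at depth 1 instead of depth 0. The extra nesting shows up plainly in the rendered list, and removing the stray space puts both items back at the same level.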


James