https://github.com/mxcl/homebrew/wiki/Tips-N%27-Tricks/ has a lot of text lost 
during conversion, and the <code>```</code> blocks don't get converted right.  
I wonder if this is due to the substitutions being simple regexes and not part 
of a parser?  (see end of this comment)

https://github.com/mxcl/homebrew/wiki/Formula-Cookbook gets converted pretty 
well, but the formatting specifier e.g. "ruby" doesn't get converted to 
":::ruby"

The `~~(.*)~~` regex is too greedy.  Based on how GitHub does strikeout, I
think `~~(\S*?)~~` is better.
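
To illustrate the greediness problem, here is a small sketch comparing the two patterns on a line with two strikeouts.  The sample text and the `<s>` replacement string are my own assumptions, not taken from the patch:

```python
import re

# The pattern currently in the patch and the suggested replacement.
GREEDY = re.compile(r'~~(.*)~~')
NON_SPACE_LAZY = re.compile(r'~~(\S*?)~~')

# Hypothetical sample line with two separate strikeouts.
text = 'keep ~~one~~ and ~~two~~ here'

# The greedy pattern swallows everything between the first and last "~~".
greedy_result = GREEDY.sub(r'<s>\1</s>', text)

# The lazy, non-whitespace pattern converts each strikeout on its own.
lazy_result = NON_SPACE_LAZY.sub(r'<s>\1</s>', text)

# Caveat: because of \S, a strikeout containing a space is left untouched.
multi_word_result = NON_SPACE_LAZY.sub(r'<s>\1</s>', '~~two words~~')
```

Note that the `\S` version leaves multi-word strikeouts alone, so it trades one kind of miss for another.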

Minor: can `\b` replace `(\s|^)` and `(\s|$)`?  I think that'd be cleaner.
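
One thing to check before swapping: `\b` only matches between a word character and a non-word character, so it behaves differently when the token starts with something like `#`.  A quick sketch, using a hypothetical `#123`-style pattern (not necessarily the exact regex in the patch):

```python
import re

# Hypothetical ticket-reference patterns; only the anchoring differs.
WITH_GROUPS = re.compile(r'(\s|^)(#\d+)(\s|$)')
WITH_BOUNDARY = re.compile(r'\b(#\d+)\b')

samples = ['see #12 now', '#12', 'abc#12']

# Whitespace/line anchors: match when "#12" stands alone.
groups_hits = [bool(WITH_GROUPS.search(s)) for s in samples]

# \b before "#": there is only a boundary when a word character precedes it.
boundary_hits = [bool(WITH_BOUNDARY.search(s)) for s in samples]
```

So `\b` looks like a safe replacement only for patterns that begin and end with word characters.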

Whitespace handling at the beginning & end of `(\S+\s+)(#\d+)` is different from
all the other regexes.

I think you should try a simple parser to handle pre-formatted blocks (indented
and `~~~~`) and escaped inline text with \`\`.  Here is a stub sample parser:
https://sourceforge.net/p/allura/pastebin/526ac1030910d42342ac23ef as a
starting point.  It's untested and undebugged, so it will need fixing.  It also
needs to be expanded to handle backticks.  I'm thinking the regexes could be run
within the `handle_non_code` method.  The code block regex will probably need a
different hook to run in.  This seems quite complex to me, but it is the right
way to do it.  The Markdown package itself might also have a parser at the right
level that can be hooked into to generate modified markdown again (though I'm
afraid it might only be suitable for generating HTML).  Anyway, I think you
should give this approach a shot and see how it goes and whether it's going to
work or not.
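
To make the idea concrete, here is a minimal sketch of the split-then-substitute approach.  It is not the pastebin stub: only the `handle_non_code` name comes from that discussion, while `convert`, `FENCED_BLOCK` and `STRIKE` are names made up for illustration.  It only recognises `~~~~` blocks; indented blocks and backticks would still need to be added:

```python
import re

# A whole ~~~~ block, fence lines included (assumes fences start at column 0).
FENCED_BLOCK = re.compile(
    r'^~~~~[ \t]*\n.*?^~~~~[ \t]*$',
    re.MULTILINE | re.DOTALL,
)
# Stand-in for the full set of inline substitutions.
STRIKE = re.compile(r'~~(\S*?)~~')


def handle_non_code(text):
    """Run the inline regexes; only ever called on text outside code blocks."""
    return STRIKE.sub(r'<s>\1</s>', text)


def convert(text):
    """Rewrite text outside ~~~~ blocks and copy the blocks through verbatim."""
    out = []
    pos = 0
    for match in FENCED_BLOCK.finditer(text):
        out.append(handle_non_code(text[pos:match.start()]))
        out.append(match.group(0))  # code block: left untouched
        pos = match.end()
    out.append(handle_non_code(text[pos:]))
    return ''.join(out)


sample = 'a ~~x~~ b\n~~~~\ncode ~~y~~\n~~~~\nc ~~z~~\n'
converted = convert(sample)
```

The point of the split is that the substitutions never see the contents of a code block, so `~~y~~` inside the block survives as-is.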


---

**[tickets:#6622] Convert or handle Github markdown extensions**

**Status:** in-progress
**Labels:** import github 42cc 
**Created:** Fri Aug 30, 2013 01:55 PM UTC by Dave Brondsema
**Last Updated:** Fri Oct 25, 2013 07:47 PM UTC
**Owner:** nobody

When importing github content (tickets, wiki, comments) we should deal with 
their special markup.  For example, code blocks with optional language 
specification:

~~~~
```javascript
function fancyAlert(arg) {
  if(arg) {
    $.facebox({div:'#foo'})
  }
}
```
~~~~

should be converted to:

<pre>
~~~~
function fancyAlert(arg) {
  if(arg) {
    $.facebox({div:'#foo'})
  }
}
~~~~
</pre>
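
As a rough sketch of that conversion (assuming fences start at the beginning of a line and are not nested), something like the following could work.  The `convert_fences` name and the `keep_language` flag are hypothetical; emitting a `:::lang` line as the first line of the block is an assumption based on the `:::ruby` form mentioned in the review comment:

```python
import re

# Opening fence with optional language, body, closing fence.
GITHUB_FENCE = re.compile(
    r'^```[ \t]*([\w+-]*)[ \t]*\n(.*?)^```[ \t]*$',
    re.MULTILINE | re.DOTALL,
)


def convert_fences(text, keep_language=False):
    """Turn GitHub-style fenced blocks into ~~~~ blocks."""
    def replace(match):
        lang, body = match.group(1), match.group(2)
        # Optionally carry the language over as a ":::lang" first line.
        header = ':::%s\n' % lang if (keep_language and lang) else ''
        return '~~~~\n' + header + body + '~~~~'
    return GITHUB_FENCE.sub(replace, text)


source = "```javascript\nfunction fancyAlert(arg) {\n}\n```\n"
plain = convert_fences(source)
highlighted = convert_fences(source, keep_language=True)
```

By default the language specifier is dropped, matching the example above; `keep_language=True` shows the variant that preserves it.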

And strikethrough `~~example~~` should be converted to `<s>example</s>`.  This 
we could possibly support directly in our Markdown renderer if we wanted to.  
That would also allow it to work for Markdown files in git repos (since we 
won't modify those during import).

I don't think we should handle emoji (yet?).

Cross-reference syntax 
https://help.github.com/articles/github-flavored-markdown#references we may 
want to consider handling.  See also Trac syntax [#6140] handling.

Converting markdown can be tricky to get right, so we have to be careful that
we only convert the right content.  Watch out for nested markup, escaped
markup, etc.

