I have some concerns about relying on regexes alone, since a lot of false substitutions could be made (e.g. within code blocks or escaped text).

`pandoc` can convert a lot of formats and does real parsing. It has GitHub support too. `pandoc -f markdown_github -t markdown_strict ~/dev/github-test.md` works but loses some info, like the language for a code block. Using `-t markdown` preserves it, but with a different syntax than Allura uses. It is the syntax supported via this extension: http://pythonhosted.org/Markdown/extensions/fenced_code_blocks.html (which also supports GitHub syntax). Also, `pandoc` is GPL and would have to be run from the command line (it's written in Haskell), and it doesn't handle GitHub's special shortlink syntaxes.

I want to explore applying the regexes only within the right context, using a simple parser, something like https://sourceforge.net/p/allura/pastebin/526ac1030910d42342ac23ef

---

** [tickets:#6622] Convert or handle Github markdown extensions**

**Status:** code-review
**Labels:** import github 42cc
**Created:** Fri Aug 30, 2013 01:55 PM UTC by Dave Brondsema
**Last Updated:** Mon Oct 21, 2013 09:23 PM UTC
**Owner:** nobody

When importing GitHub content (tickets, wiki, comments) we should deal with their special markup. For example, code blocks with optional language specification:

~~~~
```javascript
function fancyAlert(arg) {
  if(arg) {
    $.facebox({div:'#foo'})
  }
}
```
~~~~

should be converted to:

<pre>
~~~~
function fancyAlert(arg) {
  if(arg) {
    $.facebox({div:'#foo'})
  }
}
~~~~
</pre>

And strikethrough `~~example~~` should be converted to `<s>example</s>`. We could possibly support this directly in our Markdown renderer if we wanted to. That would also allow it to work for Markdown files in git repos (since we won't modify those during import).

I don't think we should handle emoji (yet?).

We may want to consider handling the cross-reference syntax: https://help.github.com/articles/github-flavored-markdown#references. See also Trac syntax handling in [#6140].

Converting markdown can be tricky to get right, so we have to be careful that we only convert the right content.
Nested markup, escaped markup, etc.
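
To make the "regexes only within the right context" idea concrete, here is a rough sketch in Python. It is not the pastebin code and not Allura's importer. The names `convert_github_markdown` and `convert_strikethrough` are made up for illustration, and the rules are deliberately conservative: anything that looks like code is left alone. It walks the text line by line, rewrites GitHub backtick fences to `~~~~` (dropping the language, as in the ticket's example), and applies the strikethrough regex only outside fenced blocks, indented blocks, inline code spans, and backslash-escaped tildes.

```python
import re

# Hypothetical helper names; an illustrative sketch, not Allura's actual code.

# Opening GitHub fence: three backticks plus an optional language name.
FENCE_OPEN_RE = re.compile(r'^```[ \t]*([\w+-]*)[ \t]*$')
# Inline code spans delimited by single backticks (a simplification).
INLINE_CODE_RE = re.compile(r'(`[^`]*`)')
# ~~text~~ that is not preceded by a backslash escape.
STRIKE_RE = re.compile(r'(?<!\\)~~(?=\S)(.+?)(?<=\S)~~')


def convert_strikethrough(line):
    """Replace ~~text~~ with <s>text</s>, leaving inline code untouched."""
    parts = INLINE_CODE_RE.split(line)
    # With a capturing group, odd indexes hold the inline code spans.
    for i in range(0, len(parts), 2):
        parts[i] = STRIKE_RE.sub(r'<s>\1</s>', parts[i])
    return ''.join(parts)


def convert_github_markdown(text):
    """Convert fences and strikethrough, skipping anything inside code."""
    out = []
    in_fence = False
    for line in text.split('\n'):
        if in_fence:
            if line.strip() == '```':
                out.append('~~~~')   # closing fence
                in_fence = False
            else:
                out.append(line)     # code content: never substituted
        elif FENCE_OPEN_RE.match(line):
            out.append('~~~~')       # opening fence; language is dropped
            in_fence = True
        elif line.startswith('    ') or line.startswith('\t'):
            out.append(line)         # treated as an indented code block
        else:
            out.append(convert_strikethrough(line))
    return '\n'.join(out)


if __name__ == '__main__':
    sample = "a ~~b~~ c\n```javascript\nx = '~~y~~'\n```\n`~~z~~` stays"
    print(convert_github_markdown(sample))
```

Treating every indented line as code will skip some lines that are really list continuations, so this errs on the side of converting too little. Nested markup and the cross-reference syntax are not handled at all here.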
