Lennart Borgman <lennart.borg...@gmail.com> writes:

> On Thu, Nov 4, 2010 at 3:37 PM, Carsten Dominik
> <carsten.domi...@gmail.com> wrote:
>>
>> On Nov 3, 2010, at 1:34 PM, Lennart Borgman wrote:
>>
>>> On Wed, Nov 3, 2010 at 1:15 PM, Eric M. Ludlam <e...@siege-engine.com>
>>> wrote:
>>>>
>>>> On 10/30/2010 03:45 AM, Konrad Scorciapino wrote:
>>>>>
>>>>> Hey!
>>>>>
>>>>> Is anybody working on Org-mode? My main interest is to build a parser to
>>>>> manipulate the nodes of the resulting tree and save them back. Tips on
>>>>> how to get started are also welcome.
>>>>>
>>>>
>>>> I know of no one doing that.  I don't know what org-mode's code
>>>> structure is like, but I'd assume it already has a parser, and you could
>>>> adapt the output to Semantic tag format.
>>>>
>>>> The HTML parser also handles arbitrary text, so you could look in
>>>> semantic-html to see what sort of things that does.
>>>
>>> There are different exporters for org-mode.
>>>
>>> Currently we are trying to make an exporter to ODT files. I think a
>>> parser would come in handy.
>>
>>
>> org-html.el is probably the best starting point to make a complete parser.
>>  It does a very detailed analysis of the text.
>>
>> We should have built all the exporters on the same parser - unfortunately we
>> did not.  One of the hard to correct mistakes we made in early development.
>
> Then perhaps the best we can do now is to start by breaking up
> org-html.el into a parser and a callback function for writing the
> export. After that we can add new exports by adding new callback
> functions.

The suggested refactoring could be a side effect of org-odt.el, which is
taking shape from org-html.el. Note that the refactoring happens, or
at least is visible, in org-odt.el (which I control) and *not* in
org-html.el (which is in the field).
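
To make the parser-plus-callbacks split concrete, here is a minimal
sketch. It is written in Python purely for illustration; it is not
code from org-html.el or org-odt.el, and the names parse, HtmlBackend
and PlainTextBackend are hypothetical. It only recognizes headlines
and paragraphs, which is far less than the real exporter analyzes.

```python
import html
import re


def parse(lines, backend):
    """Walk Org-like lines once and hand each recognized element to the backend."""
    out = []
    for line in lines:
        m = re.match(r"^(\*+) (.*)$", line)
        if m:
            # Headline: the number of stars is the nesting level.
            out.append(backend.headline(len(m.group(1)), m.group(2)))
        elif line.strip():
            # Any other non-blank line is treated as a paragraph (a simplification).
            out.append(backend.paragraph(line.strip()))
    return "\n".join(out)


class HtmlBackend:
    """Hypothetical callback set that writes HTML."""

    def headline(self, level, title):
        return f"<h{level}>{html.escape(title)}</h{level}>"

    def paragraph(self, text):
        return f"<p>{html.escape(text)}</p>"


class PlainTextBackend:
    """Hypothetical second callback set; it reuses the same parser unchanged."""

    def headline(self, level, title):
        return title.upper()

    def paragraph(self, text):
        return text


sample = ["* Intro", "Some text.", "** Details", "More text."]
html_out = parse(sample, HtmlBackend())
text_out = parse(sample, PlainTextBackend())
```

The point of the sketch is that adding a new export format means
writing one more backend class, while parse stays untouched.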

The main challenges with breaking up org-html.el first and then, say,
plugging org-odt.el into that later are:

1. Code-churn that it would create in org-html.el
2. Proving that *nothing* in HTML export actually breaks.

Carsten would like to avoid (1) - he might want to go with one big
commit and (naturally) shift the responsibility for (2) to the committer.

So a committer has one more thing he needs to be concerned about.

I do see some regression tests for the HTML exporter, but I am unsure
how *complete* they are. What would really have made things easier is
the following:

1. One Org file in the repo that has *all* the Org-specific markup.
2. One HTML file that is exported from this Org file and is
   re-checked in as and when org-html.el changes its markup.

Every time something changes in the exporter, one just diffs the *new*
HTML file against the one in the repo and can say with confidence
whether something has improved or broken.
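
The diff step can be sketched as follows. This is an illustration in
Python using the standard difflib module, not an existing Org test
harness; the function name diff_against_golden and the file labels
golden.html and new.html are made up for the example, and the HTML
strings stand in for the checked-in and freshly exported files.

```python
import difflib


def diff_against_golden(golden_text, new_text):
    """Return unified-diff lines; an empty list means the export is unchanged."""
    return list(
        difflib.unified_diff(
            golden_text.splitlines(),
            new_text.splitlines(),
            fromfile="golden.html",  # hypothetical label for the checked-in file
            tofile="new.html",       # hypothetical label for the fresh export
            lineterm="",
        )
    )


golden = "<h1>Intro</h1>\n<p>Some text.</p>"
unchanged = "<h1>Intro</h1>\n<p>Some text.</p>"
changed = "<h1>Intro</h1>\n<p>Some  text.</p>"  # note the extra space

no_diff = diff_against_golden(golden, unchanged)
some_diff = diff_against_golden(golden, changed)
```

An empty result means the exporter change did not alter the output. A
non-empty result still needs a human to decide whether the change is
an improvement (and the reference file should be re-checked in) or a
regression.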

In some sense, test.org in my repo does this today:

- Base URL: http://repo.or.cz/w/org-mode/oo.git/blob_plain/HEAD
  Rel URL:  :/contrib/odt/files/test.org

What I believe I am recording here is that org-odt.el moves the Org
export engine in the "right" direction without me having to sell to
Carsten the need for a complete refactoring of org-html.

From my experience, refactoring is all good. But at the end of the day,
if it is going to delay something useful or place too much of a burden
on regression testing and validation, it is better avoided.

Then there is also the question of how many more export formats Org
could possibly have in the future ... If there aren't many that we
foresee, maybe the prudent thing to do is not to be overly concerned
about refactoring the parsing engine.

Lennart 

Btw, can you summarize what use case in cedet-devel triggered this line
of thought ...

Just my 2 cents here. Sorry if I have gone overboard or ventured into
the speculative realm.

Jambunathan K.








_______________________________________________
Emacs-orgmode mailing list
Please use `Reply All' to send replies to the list.
Emacs-orgmode@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-orgmode
