ODF_to_HTML_keys (Branch "odf-filter-attempt2" review)

Peter Kelly Tue, 12 May 2015 00:04:47 -0700

For mapping between ODF and HTML keys, I would suggest starting off with a more 
traditional/direct approach, where you have either a series of if statements 
or, equivalently, a switch statement, which checks what the current ODF element 
in the traversal is, and then goes ahead and creates the appropriate HTML node. 
So in traverseContent in ODFText.c you could have a switch statement there 
(remember the breaks at the end of each case! - and also { } inside each case for 
scoping purposes).
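
As a rough sketch of what that switch-based traversal could look like: the node 
type, the helper functions and the SKETCH_ constants below are all made up for 
illustration, and are not the real DFDOM API or the real names from 
DFXMLNames.h. Only the shape of the switch (one braced case per ODF element, 
each ending in a break) is the point.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real tag constants (illustration only). */
enum {
    DOM_TEXT = 2,
    SKETCH_ODF_P = 100, SKETCH_ODF_H,
    SKETCH_HTML_P, SKETCH_HTML_H1
};

/* Hypothetical, minimal node type; the real DOM node has many more fields. */
typedef struct SketchNode {
    int tag;
    struct SketchNode *firstChild;
    struct SketchNode *next;
} SketchNode;

/* Allocate a childless node (freeing is omitted to keep the sketch short). */
SketchNode *sketchNodeNew(int tag)
{
    SketchNode *node = calloc(1, sizeof *node);
    assert(node != NULL);
    node->tag = tag;
    return node;
}

/* Append child as the last child of parent. */
void sketchAppend(SketchNode *parent, SketchNode *child)
{
    SketchNode **slot = &parent->firstChild;
    while (*slot != NULL)
        slot = &(*slot)->next;
    *slot = child;
}

/* For each ODF child, create the matching HTML node under htmlParent. */
void sketchTraverse(const SketchNode *odfParent, SketchNode *htmlParent)
{
    for (const SketchNode *c = odfParent->firstChild; c != NULL; c = c->next) {
        switch (c->tag) {
        case SKETCH_ODF_P: {
            SketchNode *p = sketchNodeNew(SKETCH_HTML_P);
            sketchAppend(htmlParent, p);
            sketchTraverse(c, p);
            break;
        }
        case SKETCH_ODF_H: {
            SketchNode *h = sketchNodeNew(SKETCH_HTML_H1);
            sketchAppend(htmlParent, h);
            sketchTraverse(c, h);
            break;
        }
        case DOM_TEXT: {
            SketchNode *text = sketchNodeNew(DOM_TEXT);
            sketchAppend(htmlParent, text);
            break;
        }
        default: {
            /* Unknown element: skip the wrapper but keep its content. */
            sketchTraverse(c, htmlParent);
            break;
        }
        }
    }
}
```

The braces inside each case give the local variables (p, h, text) their own 
scope, so each case can declare what it needs without clashing with the others.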

I noticed also that in traverseContent there’s this line:

    if (odfChild->tag == 2) { // we have some text here.

I advise against using “magic numbers” like this, because it’s not at all clear 
what the 2 means (well, actually your comment makes it clear). But whenever 
you’re about to write a specific number, the question to ask is whether you can 
define a macro or constant whose name matches what the number means.

In fact in DFDOM.h there are the following macros defined:

    #define DOM_DOCUMENT                 1
    #define DOM_TEXT                     2
    #define DOM_COMMENT                  3
    #define DOM_CDATA                    4
    #define DOM_PROCESSING_INSTRUCTION   5

So you could change the line above to:

    if (odfChild->tag == DOM_TEXT) {

and then that makes the code self-describing, removing the need for the 
comment. Also, if for some reason the specific integer value used for text nodes 
was ever changed, this code would still work correctly as long as the 
macro was updated. While the DOM_ numbers above are extremely unlikely to ever 
change, the other pre-defined constants (actually enums) like HTML_H1 defined 
in DFXMLNames.h are almost certain to change (when the file is re-generated 
from the script that assigns these numbers when someone adds some new names). 
So you should always use the symbolic names rather than writing out the numbers 
directly.

Despite my suggestion of starting with if statements or a switch statement in 
the traversal to begin with, I like where you’re going conceptually with the 
idea of representing the information necessary for translation in a data 
structure rather than code. In fact, whether or not this was a conscious 
decision, you’ve taken the very first step in designing your own domain-specific 
programming language for expressing transformations on XML data. Code that uses 
the data structure you define constitutes an interpreter for this language, and 
the sophistication of both the data structure and the interpreter can be 
expanded over time to cater for more complex needs. This is something I hope we 
can explore a lot further, and I’ve got a lot of ideas on this I’ve been 
thinking about for quite a while now.
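
To make the data-structure idea a bit more concrete, here is a minimal sketch 
of a mapping table plus a tiny "interpreter" that walks it. Everything here is 
an assumption for illustration: TableKey is not the actual ODF_to_HTML_key 
struct from the branch, the TBL_ constants are placeholders for the real names, 
and the optional function pointer is just one possible way of handling the 
special cases mentioned in the original message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag constants (placeholders for the generated names). */
enum {
    TBL_NONE = -1,
    TBL_ODF_P = 100, TBL_ODF_H, TBL_ODF_SPAN,
    TBL_HTML_P, TBL_HTML_H1, TBL_HTML_H2, TBL_HTML_SPAN
};

/* Optional hook for elements whose HTML tag depends on extra information. */
typedef int (*TableSpecialFn)(int outlineLevel);

/* One row of the "program": which ODF tag maps to which HTML tag. */
typedef struct {
    int odfTag;
    int htmlTag;            /* used when special is NULL */
    TableSpecialFn special; /* overrides htmlTag when non-NULL */
} TableKey;

/* Special case: pick the heading tag from the ODF outline level. */
int tableHeadingForLevel(int outlineLevel)
{
    assert(outlineLevel >= 1);
    return outlineLevel == 1 ? TBL_HTML_H1 : TBL_HTML_H2;
}

/* The translation rules expressed as data rather than code. */
const TableKey tableKeys[] = {
    { TBL_ODF_P,    TBL_HTML_P,    NULL },
    { TBL_ODF_H,    TBL_HTML_H1,   tableHeadingForLevel },
    { TBL_ODF_SPAN, TBL_HTML_SPAN, NULL },
};

/* The "interpreter": find the row for odfTag and apply it. */
int tableLookup(int odfTag, int outlineLevel)
{
    size_t count = sizeof tableKeys / sizeof tableKeys[0];
    for (size_t i = 0; i < count; i++) {
        if (tableKeys[i].odfTag != odfTag)
            continue;
        if (tableKeys[i].special != NULL)
            return tableKeys[i].special(outlineLevel);
        return tableKeys[i].htmlTag;
    }
    return TBL_NONE; /* no rule for this element */
}
```

Adding support for a new element then means adding a row to tableKeys rather 
than another case to a switch, and the lookup function stays unchanged until 
the rules themselves need to become more expressive.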

-- 
Dr Peter M. Kelly
[email protected]

PGP key: http://www.kellypmk.net/pgp-key
(fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)

> On 10 May 2015, at 8:52 am, Gabriela Gibson <[email protected]> wrote:
> 
> Hi,
> 
> So far I got my branch to produce a list of html nodes (and report on
> still missing stuff).
> 
> This is probably a good point to have a look if the approach I'm using
> here is any good.
> 
> It of course has quite a few warts still, and I think I will need to
> add function pointers to the ODF_to_HTML_key struct to deal with some
> special cases.  If that struct is a good idea that is.
> 
> The branch can be found here:
> 
> https://github.com/apache/incubator-corinthia/commit/c81e68626489b9515e7e8f3a5ce5d38ac8f59af0
> 
> I added the test odt file I was using, plus the current output of the program.
> 
> thanks for looking,
> 
> G
> 
> -- 
> Visit my Coding Diary: http://gabriela-gibson.blogspot.com/
