For mapping between ODF and HTML keys, I would suggest starting off with a more
traditional/direct approach, where you have either a series of if statements
or, equivalently, a switch statement, which checks what the current ODF element
in the traversal is, and then goes ahead and creates the appropriate HTML node.
So in traverseContent in ODFText.c you could have a switch statement there
(remember the break at the end of each case, and also { } inside each case for
scoping purposes).
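To make that concrete, here is a rough sketch of the shape I have in mind. It
is not code from the tree: the EX_ enums and the exHtmlTagFor function are
made-up stand-ins so the example compiles on its own, and in the real
traversal you would switch on the node's tag using the generated constants and
create the HTML node inside each case.

```c
/* Hypothetical stand-ins for the generated constants in DFXMLNames.h.
   The EX_ names and values are made up for this sketch. */
typedef enum {
    EX_ODF_TEXT_P = 1,
    EX_ODF_TEXT_H,
    EX_ODF_TEXT_SPAN,
    EX_ODF_OTHER
} ExOdfTag;

typedef enum {
    EX_HTML_NONE = 0,
    EX_HTML_P,
    EX_HTML_H1,
    EX_HTML_SPAN
} ExHtmlTag;

/* Pick the HTML element to create for the ODF element being visited.
   EX_HTML_NONE means "nothing mapped yet". */
ExHtmlTag exHtmlTagFor(ExOdfTag odfTag)
{
    ExHtmlTag result = EX_HTML_NONE;
    switch (odfTag) {
        case EX_ODF_TEXT_P: {
            result = EX_HTML_P;
            break;
        }
        case EX_ODF_TEXT_H: {
            /* Simplification: a fuller version would read the heading's
               outline level and choose between H1 and H6. */
            result = EX_HTML_H1;
            break;
        }
        case EX_ODF_TEXT_SPAN: {
            result = EX_HTML_SPAN;
            break;
        }
        default: {
            /* Unhandled element: leave result as EX_HTML_NONE. */
            break;
        }
    }
    return result;
}
```

The braces give each case its own scope, so local variables declared in one
case cannot clash with those in another, and the default case is a natural
place to report elements that are not handled yet.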
I noticed also that in traverseContent there’s this line:
if (odfChild->tag == 2) { // we have some text here.
I advise against using “magic numbers” like this, because it’s not clear from
the code alone what the 2 means (here only your comment explains it). Whenever
you’re about to write a specific number, the question to ask is whether you can
define a macro or constant whose name says what the number means.
In fact in DFDOM.h there are the following macros defined:
#define DOM_DOCUMENT 1
#define DOM_TEXT 2
#define DOM_COMMENT 3
#define DOM_CDATA 4
#define DOM_PROCESSING_INSTRUCTION 5
So you could change the line above to:
if (odfChild->tag == DOM_TEXT) {
and then that makes the code self-describing, removing the need for the
comment. Also, if for some reason the specific integer value used for text nodes
was ever changed, then this code would still work correctly as long as the
macro was updated. While the DOM_ numbers above are extremely unlikely to ever
change, the other pre-defined constants (actually enums) like HTML_H1 defined
in DFXMLNames.h are almost certain to change (when the file is re-generated
from the script that assigns these numbers when someone adds some new names).
So you should always use the symbolic names rather than writing out the numbers
directly.
Despite my suggestion of starting with if statements or a switch statement in
the traversal to begin with, I like where you’re going conceptually with the
idea of representing the information necessary for translation in a data
structure rather than code. In fact, whether or not this was a conscious
choice, you’ve taken the very first step in designing your own domain-specific
programming language for expressing transformations on XML data. Code that uses
the data structure you define constitutes an interpreter for this language, and
the sophistication of both the data structure and the interpreter can be
expanded over time to cater for more complex needs. This is something I hope we
can explore a lot further, and I’ve got a lot of ideas on this I’ve been
thinking about for quite a while now.
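As a rough illustration of the data-plus-interpreter idea, here is a small
self-contained sketch. I have not based it on the actual fields of your
ODF_to_HTML_key struct; every TBL_ and tbl name below is hypothetical. The
table is the "program", tblTranslate is the "interpreter", and the optional
function pointer is one way to handle the special cases you mentioned.

```c
#include <stddef.h>

/* Hypothetical tags for this sketch; the real ones come from DFXMLNames.h. */
typedef enum { TBL_ODF_TEXT_P = 1, TBL_ODF_TEXT_H, TBL_ODF_TEXT_SPAN,
               TBL_ODF_OTHER } TblOdfTag;

/* H1..H6 are deliberately contiguous so a level can be added to H1. */
typedef enum { TBL_HTML_NONE = 0, TBL_HTML_P, TBL_HTML_SPAN,
               TBL_HTML_H1, TBL_HTML_H2, TBL_HTML_H3,
               TBL_HTML_H4, TBL_HTML_H5, TBL_HTML_H6 } TblHtmlTag;

/* Optional hook for entries that need more than a one-to-one mapping. */
typedef TblHtmlTag (*TblSpecialCase)(int outlineLevel);

typedef struct {
    TblOdfTag odfTag;        /* ODF element this entry applies to */
    TblHtmlTag htmlTag;      /* result when special is NULL */
    TblSpecialCase special;  /* if non-NULL, computes the result instead */
} TblMapping;

/* Special case: map a heading's outline level to H1..H6, clamping
   out-of-range levels. */
TblHtmlTag tblHeadingForLevel(int outlineLevel)
{
    if (outlineLevel < 1)
        outlineLevel = 1;
    if (outlineLevel > 6)
        outlineLevel = 6;
    return (TblHtmlTag)(TBL_HTML_H1 + (outlineLevel - 1));
}

/* The "program": one row per supported ODF element. */
const TblMapping tblMappings[] = {
    { TBL_ODF_TEXT_P,    TBL_HTML_P,    NULL },
    { TBL_ODF_TEXT_SPAN, TBL_HTML_SPAN, NULL },
    { TBL_ODF_TEXT_H,    TBL_HTML_NONE, tblHeadingForLevel },
};

/* The "interpreter": find the row for odfTag and apply it.
   Returns TBL_HTML_NONE when no row matches. */
TblHtmlTag tblTranslate(TblOdfTag odfTag, int outlineLevel)
{
    size_t count = sizeof(tblMappings) / sizeof(tblMappings[0]);
    for (size_t i = 0; i < count; i++) {
        const TblMapping *m = &tblMappings[i];
        if (m->odfTag != odfTag)
            continue;
        return (m->special != NULL) ? m->special(outlineLevel) : m->htmlTag;
    }
    return TBL_HTML_NONE;
}
```

With something like this, supporting a new element means adding a row rather
than another branch, and the interpreter only has to grow when a new kind of
rule is needed.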
--
Dr Peter M. Kelly
[email protected]
PGP key: http://www.kellypmk.net/pgp-key
(fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
> On 10 May 2015, at 8:52 am, Gabriela Gibson <[email protected]> wrote:
>
> Hi,
>
> So far I got my branch to produce a list of html nodes (and report on
> still missing stuff).
>
> This is probably a good point to have a look if the approach I'm using
> here is any good.
>
> It of course has quite a few warts still, and I think I will need to
> add function pointers to the ODF_to_HTML_key struct to deal with some
> special cases. If that struct is a good idea that is.
>
> The branch can be found here:
>
> https://github.com/apache/incubator-corinthia/commit/c81e68626489b9515e7e8f3a5ce5d38ac8f59af0
>
> I added the test odt file I was using, plus the current output of the program.
>
> thanks for looking,
>
> G
>
> --
> Visit my Coding Diary: http://gabriela-gibson.blogspot.com/