Hi Thomas,
Why don't you take a look at the OpenOffice export function? I saw that
it's possible to convert a document to XHTML, and this could be a start
for you.
Wolfgang
On 14.02.2009 at 18:40, Thomas A. Schmitz wrote:
Hi all,
this is not a question about direct technical details, but more of a
conceptual problem, and I would love to have your input and ideas on
this. I will be editing several edited volumes in my field
(humanities, classics). From experience, I know that it's impossible
to make scholars in the humanities adhere to standards. Each and
every one of them will turn in a paper (most of them written in half
a dozen different versions of Word) with its own idiosyncrasies. At
my last conference, I asked them to please use Unicode for their
Greek passages, and I got blank looks and the question "What the
hell is Unicode?"
So: I want to extract the content of these papers and process it
with ConTeXt. I thought the easiest route might be to convert them to
OpenOffice odt and then use the content.xml as a starting point.
Since the formatting will be unusable anyway, it doesn't make sense
to process the odt directly; instead, I want to transform the xml
via xslt to a simplified format and then process that with ConTeXt.
I have just discovered the tool xalan (http://xml.apache.org/xalan-c/index.html),
which allows me to use an xslt style sheet and direct the output
to a new file. I will then need to clean up these xml files and
write a mkiv xml setup for them.
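
To make the workflow above concrete, here is a minimal sketch of the extraction step. Note that it swaps in Python's standard library (zipfile and xml.etree.ElementTree) in place of xalan and xslt, purely for illustration. It relies on an odt file being a zip archive that contains content.xml. The output element names (document, heading, paragraph) and the function names are hypothetical, not an established format, and the sketch only looks at top-level headings and paragraphs.

```python
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespaces used in content.xml
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def _tag(prefix, local):
    """Build a namespace-qualified tag or attribute name for ElementTree."""
    return "{%s}%s" % (NS[prefix], local)


def simplify_content(content_xml):
    """Reduce content.xml to a simplified tree (hypothetical element names).

    Only top-level text:h and text:p elements are kept; all formatting is
    dropped. Lists, tables, footnote structure, and whitespace elements
    such as text:s or text:tab are not handled in this sketch.
    """
    root = ET.fromstring(content_xml)
    body = root.find("office:body/office:text", NS)
    out = ET.Element("document")
    if body is None:
        return out
    for child in body:
        if child.tag == _tag("text", "h"):
            el = ET.SubElement(out, "heading")
            el.set("level", child.get(_tag("text", "outline-level"), "1"))
        elif child.tag == _tag("text", "p"):
            el = ET.SubElement(out, "paragraph")
        else:
            continue  # anything else is skipped here
        # Flatten nested spans into plain text
        el.text = "".join(child.itertext())
    return out


def odt_to_simple_xml(odt_path, out_path):
    """Read content.xml from an odt file and write the simplified XML."""
    with zipfile.ZipFile(odt_path) as archive:
        content = archive.read("content.xml")
    tree = ET.ElementTree(simplify_content(content))
    tree.write(out_path, encoding="utf-8", xml_declaration=True)
```

Calling odt_to_simple_xml("paper.odt", "paper.xml") (both file names are placeholders) would give one small xml file per paper, which could then be cleaned up by hand before writing the mkiv xml setup for it.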
So for those who know much more about this sort of workflow: does
that make sense? Is there any better way to achieve these results,
i.e., have the content of a couple of papers in Word and/or rtf
format and typeset it in a consistent ConTeXt environment? Is there
any tool better than xslt to convert the OpenOffice xml
(anything in lua that can parse xml)? Anything better than xalan to
convert xml -> xml? I'm just beginning to plan this, so I'd be most
grateful for any pointers.
Thanks for reading this long message, all best
Thomas
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the
Wiki!
maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage : http://www.pragma-ade.nl / http://tex.aanhet.net
archive : https://foundry.supelec.fr/projects/contextrev/
wiki : http://contextgarden.net
___________________________________________________________________________________