> (First, sorry to Norman Walsh: this should have gone to the list,
> not directly to you ;-)
> / Galen Boyer <[EMAIL PROTECTED]> was heard to say:
> | Oh God, I'll probably get killed for this question.
> | 
> | Is there some tag which can be used to include a word doc or
> | excel file or other element?
> I suppose that this would be extremely difficult.  I guess that you
> would want to convert the doc into XML.  The following may help
> you only if you need a one-off conversion of a Word document.
> I am very new to XML/SGML and DocBook, but I did convert a Word
> document of about 150 pages into XML.  I did it by exporting the
> doc to HTML, and then I did a lot of Perl fiddling... Now I have
> well-formed XML, but not the DocBook markup yet.
> The process was rather painful, because I did not know about the
> HTML Tidy program at the time!  (My thanks to Dave Raggett,
> who wrote it, and to Jirka Kosek, who mentioned it in his book.)
> So, if I were forced to do it again, I would do it this way:
>   1. Export the Word document to HTML (manually).
>   2. Use HTML Tidy (off line) to convert <font ...> and similar
>      tags into markup that uses CSS (automatically) and to
>      output the result as XML.
>   3. Use ImageMagick to convert the images into the desired
>      format (off line).
>   4. Use some XSLT processor and write an XSL stylesheet that
>      describes the conversion of that XML to DocBook XML (off line).
>   5. Perl may still be needed.
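
Steps 2 to 4 above could perhaps be scripted roughly as follows.  This is
only a sketch, not a tested recipe: the file names and the helper names
build_commands and run_all are made up for illustration, xsltproc is just
one possible choice of XSLT processor (the post does not name one), and
PNG as the target image format is an assumption.  The stylesheet for
step 4 still has to be written by hand.

```python
import subprocess
from pathlib import Path


def build_commands(html_file, xsl_file, images=()):
    """Return the command lines for steps 2-4 as argv lists (nothing is run)."""
    html = Path(html_file)
    xml = html.with_suffix(".xml")                      # Tidy output
    docbook = html.with_name(html.stem + "-docbook.xml")  # final result

    commands = [
        # Step 2: -clean replaces <font> and similar tags with CSS rules,
        # -asxml writes well-formed XHTML, -numeric avoids named entities.
        ["tidy", "-quiet", "-clean", "-asxml", "-numeric",
         "-output", str(xml), str(html)],
    ]
    # Step 3: ImageMagick picks the output format from the file extension.
    # (Newer ImageMagick releases call the program "magick".)
    for image in images:
        src = Path(image)
        commands.append(["convert", str(src), str(src.with_suffix(".png"))])
    # Step 4: apply the hand-written stylesheet (assumed to exist).
    commands.append(["xsltproc", "-o", str(docbook), str(xsl_file), str(xml)])
    return commands


def run_all(commands):
    """Run each command in order and stop at the first real failure."""
    for argv in commands:
        code = subprocess.run(argv).returncode
        # Tidy exits with 1 when it only printed warnings; 2 means errors.
        allowed = (0, 1) if argv[0] == "tidy" else (0,)
        if code not in allowed:
            raise RuntimeError(f"{argv[0]} failed with exit code {code}")
```

For example, run_all(build_commands("report.html", "to-docbook.xsl",
images=["fig1.gif"])) would run Tidy, then ImageMagick, then the XSLT
processor.  Step 1 (the export from Word) and any Perl clean-up from
step 5 are left out on purpose.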
> Well, I never did the fourth step (being very new to XSL), nor do I know
> whether this is the best approach.  I guess that there could be some
> easier way.  Anyway, I think that "Word to HTML" is the first step
> to follow, and I do not think that it can be done off line.
> Any comments?  (I want to learn something better ;-)
> Petr
> -- 
> Petr Prikryl, SKIL, spol. s r.o., [EMAIL PROTECTED]
