2008/3/4, Lukas Theussl <[EMAIL PROTECTED]>:
> Ehm, yes, sorry, I spoke before I thought. Of course, the parser
> is an xml parser so it will choke on any tags that are not properly
> closed. So it has to be xhtml. You can use tools like htmltidy [1] to
> convert html to xhtml.
>
> Btw, Vincent just added a simple tool to do document translations with
> doxia: http://svn.apache.org/viewvc?view=rev&revision=633328
> Feel free to test and comment! :)

You need to use the entire trunk for this. I guess it will be easy to
patch the converter with jtidy to support html as an input format.
Patches are welcome :)

Cheers,

Vincent

> Cheers,
> -Lukas
>
> [1] http://tidy.sourceforge.net/
>
>
> Cristóbal Fandiño wrote:
> > latex2html does not produce XHTML output. For example:
> >
> > HTML
> > ==========
> > <LINK REL="STYLESHEET" HREF="embebidos.css">
> >
> > XhtmlParser
> > ==========
> > org.apache.maven.doxia.parser.ParseException: Error parsing the model: end
> > tag name </HEAD> must be the same as start tag <LINK> from line 19
> > (position: TEXT seen ...<LINK REL="STYLESHEET"
> > HREF="embebidos.css">\n\n</HEAD>...
> > @21:8)
> > at org.apache.maven.doxia.parser.AbstractXmlParser.parse(
> > AbstractXmlParser.java:57)
> >
> >
> > HTML
> > ==========
> > <H2><A NAME="SECTION00221000000000000000"></A>
> > <A NAME="74"></A>
> > <BR>
> > Grupos de usuarios
> > </H2>
> >
> > XhtmlParser
> > ==========
> > org.apache.maven.doxia.parser.ParseException: Error parsing the model: end
> > tag name </H2> must be the same as start tag <BR> from line 119 (position:
> > TEXT seen ...<BR>\nGrupos de usuarios\n</H2>... @121:6)
> > at org.apache.maven.doxia.parser.AbstractXmlParser.parse(
> > AbstractXmlParser.java:57)
> >
> >
> > XhtmlParser
> > ==========
> > org.apache.maven.doxia.parser.ParseException: Error parsing the model:
> > attribute value must start with quotation or apostrophe not 3 (position:
> > TEXT seen ...<A NAME="91"></A>\n<TABLE CELLPADDING=3... @171:21)
> > at org.apache.maven.doxia.parser.AbstractXmlParser.parse(
> > AbstractXmlParser.java:57)
> >
> > ... and many more
> >
> >
> > 2008/3/3, Lukas Theussl <[EMAIL PROTECTED]>:
> >
> >>doxia doesn't have a latex parser (I'd like to have one too!),
> >>latex2html is the only solution I can think of (there exist other latex
> >>translators though but that's the only one I know).
> >>I am not sure what kind of output latex2html produces, however, the
> >>difference HTML - xhtml shouldn't matter here. What kind of exceptions
> >>do you get? Maybe you could attach an example file at jira [1] with a
> >>snippet of your code so we can try to reproduce the problem?
> >>
> >>-Lukas
> >>
> >>[1] http://jira.codehaus.org/browse/DOXIA
> >>
> >>
> >>krycho fandino wrote:
> >>
> >>>Thanks for your help, however my HTML files aren't XHTML and XhtmlParser
> >>>throws a lot of exceptions. Perhaps I should convert these HTML files to
> >>>XHTML format, but I have a lot of pages and it would be a hard task.
> >>>
> >>>Actually, I generated these HTML files using the latex2html conversion
> >>>tool. I don't know how I could transform latex files to one of the markup
> >>>languages supported by doxia (apt or xdoc). Could you give me some advice?
> >>>
> >>>
> >>>2008/3/2, Lukas Theussl <[EMAIL PROTECTED]>:
> >>>
> >>>
> >>>>If you use the current development branch of doxia (beta-1-SNAPSHOT)
> >>>>then this should work rather well for simple html files. However, you
> >>>>will probably lose a lot of information if you have anything fancy (eg
> >>>>special layout, tables, figures are not well supported), don't expect it
> >>>>to be perfect. In particular if you have figures you might try to
> >>>>translate to xdoc instead of apt (use XdocSink), that should work better.
> >>>>
> >>>>Cheers,
> >>>>
> >>>>-Lukas
> >>>>
> >>>>
> >>>>
> >>>>Vincent Siveton wrote:
> >>>>
> >>>>
> >>>>>Hi,
> >>>>>
> >>>>>Frankly, I have never tested your use case.
> >>>>>
> >>>>>But I guess that you need to have an XHTML file as input with no
> >>>>>header, footer or navbar, something like the div bodyColumn in [1].
> >>>>>
> >>>>>The snippet should be something like the following:
> >>>>>
> >>>>>File f = new File( "blabla.html" );
> >>>>>XhtmlParser parser = new XhtmlParser();
> >>>>>StringWriter output = new StringWriter();
> >>>>>Sink sink = new AptSink( output );
> >>>>>// pass the sink (not the writer): the parser emits its events into it
> >>>>>parser.parse( new FileReader( f ), sink );
> >>>>>
> >>>>>output will then contain the APT markup.
> >>>>>
> >>>>>HTH,
> >>>>>
> >>>>>Vincent
> >>>>>
> >>>>>[1] http://maven.apache.org/doxia/
> >>>>>
> >>>>>2008/3/1, krycho fandino <[EMAIL PROTECTED]>:
> >>>>>
> >>>>>
> >>>>>>I'm a newbie using doxia. I have a lot of documentation in HTML format
> >>>>>>and I'd like to convert these files to apt format. Is there some way to
> >>>>>>transform them easily? I want to create a maven site for my project and,
> >>>>>>right now, I only have this documentation in HTML format, without css
> >>>>>>styles or a menu.
> >>>>>>
> >>>>>>Could you help me? Many thanks
> >>>>>>Cristóbal
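
The parse errors quoted in this thread all come down to XML well-formedness: an unclosed <BR> or <LINK>, and an unquoted attribute value such as CELLPADDING=3. A minimal sketch of how one might pre-check files before handing them to XhtmlParser is given here. It is an illustration only: it uses the JDK's built-in JAXP parser rather than the parser Doxia uses, so the error messages will differ from the ones above, and the class and method names (WellFormedCheck, firstError, isWellFormed) are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of Doxia: checks XML well-formedness with the JDK parser.
public class WellFormedCheck {

    /** Returns null if the markup is well-formed XML, otherwise the parser's message. */
    public static String firstError(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler, which prints fatal errors to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored */ }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(markup)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static boolean isWellFormed(String markup) {
        return firstError(markup) == null;
    }

    public static void main(String[] args) {
        String[] samples = {
            // HTML as produced by latex2html (fragments taken from the thread)
            "<H2><BR>Grupos de usuarios</H2>",
            "<HEAD><LINK REL=\"STYLESHEET\" HREF=\"embebidos.css\"></HEAD>",
            "<TABLE CELLPADDING=3></TABLE>",
            // The same fragments written as XHTML
            "<h2><br/>Grupos de usuarios</h2>",
            "<head><link rel=\"stylesheet\" href=\"embebidos.css\"/></head>",
            "<table cellpadding=\"3\"></table>"
        };
        for (String s : samples) {
            System.out.println((isWellFormed(s) ? "well-formed:     " : "not well-formed: ") + s);
        }
    }
}
```

Running a check like this over a directory of latex2html output would show which files need a pass through htmltidy or jtidy first.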