Ehm, yes, sorry, I spoke faster than I thought. Of course, the parser is an XML parser, so it will choke on any tag that is not properly closed. The input therefore has to be XHTML. You can use a tool like htmltidy [1] to convert HTML to XHTML.

Btw, Vincent just added a simple tool to do document translations with doxia: http://svn.apache.org/viewvc?view=rev&revision=633328
Feel free to test and comment! :)

Cheers,
-Lukas

[1] http://tidy.sourceforge.net/
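
For a large set of pages, the tidy conversion can be scripted. The following is only a rough sketch, assuming a `tidy` executable is installed and on the PATH. The class and method names (`TidyToXhtml`, `tidyCommand`, `convert`) are made up for illustration and are not part of doxia or tidy. The `-numeric` flag is included on the assumption that numeric entities are safer for an XML parser than named ones such as `&nbsp;`.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of doxia: wraps the tidy command line.
final class TidyToXhtml {

    // Builds the command: tidy -q -asxhtml -numeric -o <xhtml> <html>
    static List<String> tidyCommand(Path html, Path xhtml) {
        return List.of(
                "tidy",          // assumption: tidy is on the PATH
                "-q",            // suppress non-essential output
                "-asxhtml",      // convert HTML to well-formed XHTML
                "-numeric",      // write numeric rather than named entities
                "-o", xhtml.toString(),
                html.toString());
    }

    // Runs tidy on one file. tidy exits with 0 on success, 1 if there were
    // warnings and 2 if there were errors, so only 2 and above count as failure.
    static boolean convert(Path html, Path xhtml)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(tidyCommand(html, xhtml))
                .inheritIO()
                .start();
        return process.waitFor() < 2;
    }
}
```

Calling `TidyToXhtml.convert(...)` once per latex2html page in a loop should give files that the XhtmlParser has a better chance of accepting, although tidy's output may still need manual cleanup.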


Cristóbal Fandiño wrote:
The output of latex2html is not XHTML. For example:

HTML
==========
<LINK REL="STYLESHEET" HREF="embebidos.css">

XhtmlParser
==========
org.apache.maven.doxia.parser.ParseException: Error parsing the model: end
tag name </HEAD> must be the same as start tag <LINK> from line 19
(position: TEXT seen ...<LINK REL="STYLESHEET"
HREF="embebidos.css">\n\n</HEAD>...
@21:8)
    at org.apache.maven.doxia.parser.AbstractXmlParser.parse(
AbstractXmlParser.java:57)


HTML
==========
<H2><A NAME="SECTION00221000000000000000"></A>
<A NAME="74"></A>
<BR>
Grupos de usuarios
</H2>

XhtmlParser
==========
org.apache.maven.doxia.parser.ParseException: Error parsing the model: end
tag name </H2> must be the same as start tag <BR> from line 119 (position:
TEXT seen ...<BR>\nGrupos de usuarios\n</H2>... @121:6)
    at org.apache.maven.doxia.parser.AbstractXmlParser.parse(
AbstractXmlParser.java:57)


XhtmlParser
==========
org.apache.maven.doxia.parser.ParseException: Error parsing the model:
attribute value must start with quotation or apostrophe not 3 (position:
TEXT seen ...<A NAME="91"></A>\n<TABLE CELLPADDING=3... @171:21)
    at org.apache.maven.doxia.parser.AbstractXmlParser.parse(
AbstractXmlParser.java:57)

... and many more
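
All three failures are XML well-formedness errors: an unclosed <LINK>, an unclosed <BR>, and an unquoted attribute value. They can be reproduced without doxia. The sketch below uses the JDK's built-in JAXP parser as a stand-in (doxia itself uses a different XML parser, so the error messages will differ), and the class name `WellFormedCheck` is made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: checks whether a snippet is well-formed XML.
final class WellFormedCheck {

    // Returns true if the markup parses as XML, false on a parse error.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed
        } catch (Exception e) {
            throw new IllegalStateException(e); // configuration or I/O problem
        }
    }

    public static void main(String[] args) {
        // latex2html style: <BR> is never closed.
        System.out.println(isWellFormed("<H2><BR>Grupos de usuarios</H2>"));
        // XHTML style: the empty element is closed.
        System.out.println(isWellFormed("<h2><br />Grupos de usuarios</h2>"));
        // Unquoted attribute value, as in <TABLE CELLPADDING=3>.
        System.out.println(isWellFormed("<TABLE CELLPADDING=3></TABLE>"));
    }
}
```

A check like this can be run over all generated pages to see which ones still need fixing before they are handed to the XhtmlParser.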


2008/3/3, Lukas Theussl <[EMAIL PROTECTED]>:

doxia doesn't have a LaTeX parser (I'd like to have one too!), so
latex2html is the only solution I can think of (other LaTeX translators
exist, but that's the only one I know). I am not sure what kind of output
latex2html produces; however, the difference between HTML and XHTML
shouldn't matter here. What kind of exceptions do you get? Maybe you
could attach an example file at jira [1] with a snippet of your code so
we can try to reproduce the problem?

-Lukas

[1] http://jira.codehaus.org/browse/DOXIA


krycho fandino wrote:

Thanks for your help. However, my HTML files aren't XHTML, and XhtmlParser
throws a lot of exceptions. Perhaps I should convert these HTML files to
XHTML, but I have a lot of pages and that would be a hard task.

Actually, I generated these HTML files with the latex2html conversion
tool. I don't know how I could transform LaTeX files into one of the
markup languages supported by doxia (apt or xdoc). Could you give me some
advice?


2008/3/2, Lukas Theussl <[EMAIL PROTECTED]>:


If you use the current development branch of doxia (beta-1-SNAPSHOT),
this should work rather well for simple HTML files. However, you will
probably lose a lot of information if you have anything fancy (e.g.
special layout, tables and figures are not well supported), so don't
expect it to be perfect. In particular, if you have figures you might try
translating to xdoc instead of apt (use XdocSink); that should work
better.

Cheers,

-Lukas



Vincent Siveton wrote:


Hi,

Frankly, I have never tested your use case.

But I guess that you need an XHTML file as input with no header, footer
or navbar, something like the div bodyColumn in [1].

The snippet should be something like the following:

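File f = new File( "blabla.html" );
XhtmlParser parser = new XhtmlParser();
StringWriter output = new StringWriter();
// The sink writes the generated APT markup into the StringWriter.
Sink sink = new AptSink( output );
// Pass the sink (not the writer) to the parser.
parser.parse( new FileReader( f ), sink );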

output will then contain the APT markup.

HTH,

Vincent

[1] http://maven.apache.org/doxia/

2008/3/1, krycho fandino <[EMAIL PROTECTED]>:



I'm a newbie using doxia. I have a lot of documentation in HTML format
and I'd like to convert these files to apt format. Is there an easy way
to do the transformation? I want to create a maven site for my project
and, right now, I only have this documentation in HTML format, without
CSS styles or a menu.

Could you help me? Many thanks,
Cristóbal


