[
https://issues.apache.org/jira/browse/CAMEL-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155825#comment-13155825
]
Babak Vahdat commented on CAMEL-3774:
-------------------------------------
Had a quick chance to look at this after the 2.8.3 release today. Be warned,
I'm not an Apache Maven expert, so maybe I'm completely *wrong* ;-)
Looking at [1] it says:
{code}
private String downloadContent() throws MalformedURLException, MojoExecutionException {
    String contentTag = "<div class=\"" + contentDivClass + "\"";
    getLog().info("Downloading: " + page);
    URL url = new URL(page);
    try {
        TidyMarkupDataFormat dataFormat = new TidyMarkupDataFormat();
        dataFormat.setMethod("html");
        Node doc = dataFormat.asNodeTidyMarkup(new BufferedInputStream(url.openStream()));
        XPath xpath = XPathFactory.newInstance().newXPath();
        Node nd = (Node)xpath.evaluate("//div[@class='" + contentDivClass + "']", doc, XPathConstants.NODE);
        if (nd != null) {
            return new XmlConverter().toString(nd, null);
        }
    } catch (Throwable e) {
        if (errorOnDownloadFailure) {
            throw new MojoExecutionException("Download or validation of '" + page + "' failed: " + e);
        } else {
            getLog().error("Download or validation of '" + page + "' failed: " + e);
            return null;
        }
    }
    throw new MojoExecutionException("The '" + page + "' page did not have a " + contentTag + " element.");
}
{code}
On the other hand, the default value of contentDivClass in this Mojo is:
{code}
/**
* The first div with who's class matches the contentDivClass will be
* assumed to be the content section of the HTML and is what will be used as
* the content in the PDF.
*
* @parameter default-value="wiki-content"
*/
private String contentDivClass = "wiki-content";
{code}
But looking at [2], the actual class attribute of that div is:
{code}
<DIV class="wiki-content maincontent">
{code}
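The XPath test @class='wiki-content' compares against the whole attribute
value, so it cannot match this div. So the exception message of the
TidyMarkupDataFormat doesn't surprise me too much: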
{code}
[INFO] ERROR: 'NOT_FOUND_ERR: An attempt is made to reference a node in a
context where it does not exist.'
{code}
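To illustrate the class mismatch described above, here is a minimal, self-contained sketch. It is not taken from HtmlToPdfMojo: the class ContentDivLookup and its helper methods are hypothetical names, and the page is reduced to a small well-formed snippet parsed with the JDK's DOM parser instead of TidyMarkupDataFormat. It compares the exact attribute test with a token-based test that accepts any div whose class list contains the given name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
public class ContentDivLookup {

    // Exact comparison, as in downloadContent(): the whole class attribute must equal cssClass.
    static String exactMatch(String cssClass) {
        return "//div[@class='" + cssClass + "']";
    }

    // Token comparison: cssClass only has to be one of the space-separated class names.
    static String tokenMatch(String cssClass) {
        return "//div[contains(concat(' ', normalize-space(@class), ' '), ' " + cssClass + " ')]";
    }

    // Evaluates the expression and returns the first matching node, or null if there is none.
    static Node find(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        // Assumed stand-in for the tidied page; the real page is far larger.
        String page = "<html><body><div class=\"wiki-content maincontent\">manual</div></body></html>";
        System.out.println("exact match found: " + (find(page, exactMatch("wiki-content")) != null));
        System.out.println("token match found: " + (find(page, tokenMatch("wiki-content")) != null));
    }
}
```

With the sample snippet, the exact test finds nothing while the token test finds the div. Whether the Mojo should switch to a token test or simply change the default value of contentDivClass is a separate design decision.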
Apart from that, the method storeDummyFile() of [1] is:
{code}
private void storeDummyFile() throws FileNotFoundException {
    PrintWriter out = new PrintWriter(new BufferedOutputStream(new FileOutputStream(getHTMLFileName())));
    out.println("<html>");
    out.println("<body>Download of " + page + " failed</body>");
    out.close();
    getLog().info("Stored dummy file: " + getHTMLFileName() + " since download of " + page + " failed.");
}
{code}
which doesn't generate valid HTML (the closing html tag is missing). IMHO we
should generate something like:
{code}
<html>
<body>
<a href="http://camel.apache.org/book-in-one-page.html">Generation of the
offline PDF version of the manual failed, however you could try the online HTML
version here.</a>
</body>
</html>
{code}
That way, whatever goes wrong, we would still have proper HTML pointing to
the online version of the manual.
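As a rough sketch of what such a method could look like (this is not a tested patch against HtmlToPdfMojo): the class name DummyFileSketch and the explicit htmlFile and page parameters are assumptions made to keep the example standalone, and it uses try-with-resources, which needs Java 7 or later.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical standalone version of storeDummyFile(), for illustration only.
public class DummyFileSketch {

    // Writes a small, well-formed HTML page that links to the online manual.
    static void storeDummyFile(Path htmlFile, String page) throws IOException {
        // try-with-resources closes the writer even if a println fails
        try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(htmlFile, StandardCharsets.UTF_8))) {
            out.println("<html>");
            out.println("<body>");
            out.println("<a href=\"" + page + "\">Generation of the offline PDF version of the manual failed, "
                    + "however you could try the online HTML version here.</a>");
            out.println("</body>");
            out.println("</html>");
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("camel-manual", ".html");
        storeDummyFile(target, "http://camel.apache.org/book-in-one-page.html");
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

In the Mojo itself the file name would presumably still come from getHTMLFileName() and the link target from the page field.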
I would love to provide a patch for this issue (I know, you love contributions
:-)), however I'm afraid I will not be able to verify it: as I'm not a
committer, release:prepare would fail early on (I assume). So maybe some
Rider (Hadrian?) could check/verify the patch locally before applying it to
the trunk...
Any thoughts?
[1] https://svn.apache.org/repos/asf/camel/trunk/tooling/maven/maven-html-to-pdf/src/main/java/org/apache/camel/maven/HtmlToPdfMojo.java
[2] http://camel.apache.org/book-in-one-page.html
> camel-manual generation fails during release process
> ----------------------------------------------------
>
> Key: CAMEL-3774
> URL: https://issues.apache.org/jira/browse/CAMEL-3774
> Project: Camel
> Issue Type: Task
> Affects Versions: 2.6.0
> Reporter: Hadrian Zbarcea
> Assignee: Hadrian Zbarcea
> Fix For: Future
>
>
> It works during a regular mvn install, but it fails during release:prepare.
> The bug is somewhere in the maven-html-to-pdf plugin. I hope a mock release
> using mvn -X will reveal the problem, but would take a frustrating amount of
> time. For now the available info is the output:
> {code}
> [INFO] [INFO] [html-to-pdf:compile {execution: default}]
> [INFO] [INFO] Downloading: http://camel.apache.org/book-in-one-page.html
> [INFO] ERROR: 'NOT_FOUND_ERR: An attempt is made to reference a node in a
> context where it does not exist.'
> [INFO] [ERROR] Download or validation of
> 'http://camel.apache.org/book-in-one-page.html' failed:
> org.apache.camel.CamelException: Failed to convert the HTML to tidy Markup
> [INFO] [INFO] Stored dummy file:
> /w1/apache/release/camel270/tooling/camel-manual/target/site/manual/camel-manual-2.7.0.html
> since download of http://camel.apache.org/book-in-one-page.html failed.
> {code}
> The error points to a DOM related issue, but since the downloaded manual gets
> overwritten with the dummy file, I have no idea if the download got
> interrupted, or in what way the source gets corrupted.