[ 
https://issues.apache.org/jira/browse/CAMEL-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155825#comment-13155825
 ] 

Babak Vahdat commented on CAMEL-3774:
-------------------------------------

Had a quick chance to look at this after the 2.8.3 release today. Be warned, 
I'm not an Apache Maven expert, so maybe I'm completely *wrong* ;-)

Looking at [1] it says:

{code}
    private String downloadContent() throws MalformedURLException, MojoExecutionException {
        String contentTag = "<div class=\"" + contentDivClass + "\"";

        getLog().info("Downloading: " + page);
        URL url = new URL(page);

        try {
            TidyMarkupDataFormat dataFormat = new TidyMarkupDataFormat();
            dataFormat.setMethod("html");
            Node doc = dataFormat.asNodeTidyMarkup(new BufferedInputStream(url.openStream()));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node nd = (Node)xpath.evaluate("//div[@class='" + contentDivClass + "']", doc, XPathConstants.NODE);
            if (nd != null) {
                return new XmlConverter().toString(nd, null);
            }
        } catch (Throwable e) {
            if (errorOnDownloadFailure) {
                throw new MojoExecutionException("Download or validation of '" + page + "' failed: " + e);
            } else {
                getLog().error("Download or validation of '" + page + "' failed: " + e);
                return null;
            }
        }
        throw new MojoExecutionException("The '" + page + "' page did not have a " + contentTag + " element.");
    }
{code}

On the other hand, the default value of contentDivClass in this Mojo is:

{code}
/**
  * The first div with who's class matches the contentDivClass will be
  * assumed to be the content section of the HTML and is what will be used as
  * the content in the PDF.
  *
  * @parameter default-value="wiki-content"
  */
  private String contentDivClass = "wiki-content";
{code}

But looking at [2], the actual value of the div's class attribute is:

{code}
<DIV class="wiki-content maincontent">
{code}

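The XPath above compares @class with the exact string 'wiki-content', so it 
will not match a div whose class attribute is 'wiki-content maincontent'. So 
the error message of the TidyMarkupDataFormat doesn't surprise me too much: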

{code}
[INFO] ERROR:  'NOT_FOUND_ERR: An attempt is made to reference a node in a context where it does not exist.'
{code}
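
To make the class mismatch concrete, here is a minimal, self-contained sketch. It is not the Mojo's code: the class name ClassMatchDemo, its helper methods, and the sample markup are made up for illustration, and it uses the plain JDK XML parser instead of TidyMarkupDataFormat. It only shows that an exact @class comparison misses a div carrying two class names, while a token-based XPath test finds it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Camel or the html-to-pdf plugin.
class ClassMatchDemo {

    // Same shape as the expression built in downloadContent(): exact string equality.
    static String exact(String cssClass) {
        return "//div[@class='" + cssClass + "']";
    }

    // Token-aware alternative: matches when cssClass is one of the
    // whitespace-separated names in the class attribute.
    static String token(String cssClass) {
        return "//div[contains(concat(' ', normalize-space(@class), ' '), ' " + cssClass + " ')]";
    }

    // Parses the given well-formed markup and returns the first matching node, or null.
    static Node find(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new RuntimeException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Sample markup (an assumption) mirroring the class attribute seen on the page.
        String xml = "<html><body><div class=\"wiki-content maincontent\">manual</div></body></html>";
        System.out.println("exact match found: " + (find(xml, exact("wiki-content")) != null));
        System.out.println("token match found: " + (find(xml, token("wiki-content")) != null));
    }
}
```

Whether the Mojo should switch to a token-based expression or simply change the default value of contentDivClass is a design choice; the token-based form would also keep working if further class names are added to the div later on.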

Apart from that, the storeDummyFile() method in [1] is:

{code}
    private void storeDummyFile() throws FileNotFoundException {
        PrintWriter out = new PrintWriter(new BufferedOutputStream(new FileOutputStream(getHTMLFileName())));
        out.println("<html>");
        out.println("<body>Download of " + page + " failed</body>");
        out.close();
        getLog().info("Stored dummy file: " + getHTMLFileName() + " since download of " + page + " failed.");
    }
{code}

which doesn't generate valid HTML (the closing html tag is missing). IMHO we 
should generate something like:

{code}
<html>
 <body>
  <a href="http://camel.apache.org/book-in-one-page.html">Generation of the offline PDF version of the manual failed, however you could try the online HTML version here.</a>
 </body>
</html>
{code}

That way, whatever goes wrong, we would still have a proper HTML file 
pointing to the online version of the manual.
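
A rough sketch of how such a fallback page could be written is shown below. It is only an illustration under a few assumptions: the class DummyPageSketch and its methods fallbackHtml and store are hypothetical names, the target path is passed in rather than taken from getHTMLFileName(), and it uses try-with-resources, so it needs Java 7 or later.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not the actual HtmlToPdfMojo code.
class DummyPageSketch {

    // Builds a complete HTML document (all tags closed) that links to the online manual.
    // Note: the URL is inserted as-is; a URL containing '&' or '"' would need escaping.
    static String fallbackHtml(String page) {
        return "<html>\n"
             + " <body>\n"
             + "  <a href=\"" + page + "\">Generation of the offline PDF version of the manual failed, "
             + "however you could try the online HTML version here.</a>\n"
             + " </body>\n"
             + "</html>\n";
    }

    // Writes the fallback page; try-with-resources closes the writer even if writing fails.
    static void store(Path target, String page) throws IOException {
        try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(target, StandardCharsets.UTF_8))) {
            out.print(fallbackHtml(page));
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the real output location.
        Path target = Files.createTempFile("camel-manual", ".html");
        store(target, "http://camel.apache.org/book-in-one-page.html");
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

Keeping the markup in a separate method makes it easy to check the generated page without touching the file system.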

I would love to provide a patch for this issue (I know, you love contributions 
:-)). However, I'm afraid I won't be able to verify it: as I'm not a 
committer, I assume release:prepare would fail for me well before reaching 
this step. So maybe some Rider (Hadrian?) could check/verify the patch locally 
before applying it to the trunk...

Any thoughts?

[1] 
https://svn.apache.org/repos/asf/camel/trunk/tooling/maven/maven-html-to-pdf/src/main/java/org/apache/camel/maven/HtmlToPdfMojo.java
[2] http://camel.apache.org/book-in-one-page.html
                
> camel-manual generation fails during release process
> ----------------------------------------------------
>
>                 Key: CAMEL-3774
>                 URL: https://issues.apache.org/jira/browse/CAMEL-3774
>             Project: Camel
>          Issue Type: Task
>    Affects Versions: 2.6.0
>            Reporter: Hadrian Zbarcea
>            Assignee: Hadrian Zbarcea
>             Fix For: Future
>
>
> It works during a regular mvn install, but it fails during release:prepare. 
> The bug is somewhere in the maven-html-to-pdf plugin. I hope a mock release 
> using mvn -X will reveal the problem, but would take a frustrating amount of 
> time. For now the available info is the output:
> {code}
> [INFO] [INFO] [html-to-pdf:compile {execution: default}]
> [INFO] [INFO] Downloading: http://camel.apache.org/book-in-one-page.html
> [INFO] ERROR:  'NOT_FOUND_ERR: An attempt is made to reference a node in a 
> context where it does not exist.'
> [INFO] [ERROR] Download or validation of 
> 'http://camel.apache.org/book-in-one-page.html' failed: 
> org.apache.camel.CamelException: Failed to convert the HTML to tidy Markup
> [INFO] [INFO] Stored dummy file: 
> /w1/apache/release/camel270/tooling/camel-manual/target/site/manual/camel-manual-2.7.0.html
>  since download of http://camel.apache.org/book-in-one-page.html failed.
> {code}
> The error points to a DOM related issue, but since the downloaded manual gets 
> overwritten with the dummy file, I have no idea if the download got 
> interrupted, or in what way the source gets corrupted.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
