[GitHub] [maven-doxia] michael-o commented on a change in pull request #49: [DOXIA-616]

GitBox Tue, 29 Dec 2020 11:06:42 -0800


michael-o commented on a change in pull request #49:
URL: https://github.com/apache/maven-doxia/pull/49#discussion_r549815437




##########
File path: 
doxia-modules/doxia-module-markdown/src/main/java/org/apache/maven/doxia/module/markdown/MarkdownParser.java
##########
@@ -130,133 +191,98 @@ public void parse( Reader source, Sink sink )
      * @return HTML content generated by flexmark-java
      * @throws IOException passed through
      */
-    String toHtml( Reader source )
+    CharSequence toHtml( Reader source )
         throws IOException
     {
+        // Read the source
         String text = IOUtil.toString( source );
-        MutableDataHolder flexmarkOptions = PegdownOptionsAdapter.flexmarkOptions(
-                Extensions.ALL & ~( Extensions.HARDWRAPS | Extensions.ANCHORLINKS ) ).toMutable();
-        ArrayList<Extension> extensions = new ArrayList<>();
-        for ( Extension extension : flexmarkOptions.get( com.vladsch.flexmark.parser.Parser.EXTENSIONS ) )
-        {
-            extensions.add( extension );
-        }
-
-        extensions.add( FlexmarkDoxiaExtension.create() );
-        flexmarkOptions.set( com.vladsch.flexmark.parser.Parser.EXTENSIONS, extensions );
-        flexmarkOptions.set( HtmlRenderer.HTML_BLOCK_OPEN_TAG_EOL, false );
-        flexmarkOptions.set( HtmlRenderer.HTML_BLOCK_CLOSE_TAG_EOL, false );
-        flexmarkOptions.set( HtmlRenderer.MAX_TRAILING_BLANK_LINES, -1 );
-
-        com.vladsch.flexmark.parser.Parser parser = com.vladsch.flexmark.parser.Parser.builder( flexmarkOptions )
-                .build();
-        HtmlRenderer renderer = HtmlRenderer.builder( flexmarkOptions )
-                                    .linkResolverFactory( new FlexmarkDoxiaLinkResolver.Factory() )
-                                    .build();
-
 
+        // Now, build the HTML document
         StringBuilder html = new StringBuilder( 1000 );
         html.append( "<html>" );
         html.append( "<head>" );
-        Pattern metadataPattern = Pattern.compile( MULTI_MARKDOWN_METADATA_SECTION, Pattern.MULTILINE );
-        Matcher metadataMatcher = metadataPattern.matcher( text );
+
+        // First, we interpret the "metadata" section of the document and add the corresponding HTML headers
+        Matcher metadataMatcher = METADATA_SECTION_PATTERN.matcher( text );
         boolean haveTitle = false;
         if ( metadataMatcher.find() )
         {
-            metadataPattern = Pattern.compile( MULTI_MARKDOWN_METADATA_ENTRY, Pattern.MULTILINE );
-            Matcher lineMatcher = metadataPattern.matcher( metadataMatcher.group( 1 ) );
-            boolean first = true;
-            while ( lineMatcher.find() )
+            Matcher entryMatcher = METADATA_ENTRY_PATTERN.matcher( metadataMatcher.group( 0 ) );
+            while ( entryMatcher.find() )
             {
-                String key = StringUtils.trimToEmpty( lineMatcher.group( 1 ) );
-                if ( first )
-                {
-                    boolean found = false;
-                    for ( String k : STANDARD_METADATA_KEYS )
-                    {
-                        if ( k.equalsIgnoreCase( key ) )
-                        {
-                            found = true;
-                            break;
-                        }
-                    }
-                    if ( !found )
-                    {
-                        break;
-                    }
-                    first = false;
-                }
-                String value = StringUtils.trimToEmpty( lineMatcher.group( 2 ) );
+                String key = entryMatcher.group( 1 );
+                String value = entryMatcher.group( 2 );
                 if ( "title".equalsIgnoreCase( key ) )
                 {
                     haveTitle = true;
                     html.append( "<title>" );
-                    html.append( StringEscapeUtils.escapeXml( value ) );
+                    html.append( HtmlTools.escapeHTML( value, false ) );
                     html.append( "</title>" );
                 }
-                else if ( "author".equalsIgnoreCase( key ) )
-                {
-                    html.append( "<meta name=\'author\' content=\'" );
-                    html.append( StringEscapeUtils.escapeXml( value ) );
-                    html.append( "\' />" );
-                }
-                else if ( "date".equalsIgnoreCase( key ) )
-                {
-                    html.append( "<meta name=\'date\' content=\'" );
-                    html.append( StringEscapeUtils.escapeXml( value ) );
-                    html.append( "\' />" );
-                }
                 else
                 {
-                    html.append( "<meta name=\'" );
-                    html.append( StringEscapeUtils.escapeXml( key ) );
-                    html.append( "\' content=\'" );
-                    html.append( StringEscapeUtils.escapeXml( value ) );
-                    html.append( "\' />" );
+                    html.append( "<meta name='" );
+                    html.append( HtmlTools.escapeHTML( key, true ) );

Review comment:
       I understand that, but the Sink says XHTML, so we need at least XML escapes. I just checked the code and all is fine: it runs in XML mode. You can drop the `true`, since this is the default value.
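
       To illustrate why XML-level escaping matters for the single-quoted `<meta>` attributes built in this hunk, here is a minimal sketch. `MetaTagSketch`, `escapeXml` and `metaTag` are hypothetical names used only for illustration; they are not Doxia's `HtmlTools` and make no claim about how `escapeHTML` is implemented.

```java
public class MetaTagSketch {

    // Hypothetical helper (an assumption, not the HtmlTools API):
    // escapes the five characters that are special in XML text and attributes.
    public static String escapeXml(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Builds a <meta> element with single-quoted attributes, as the diff does.
    public static String metaTag(String key, String value) {
        return "<meta name='" + escapeXml(key) + "' content='" + escapeXml(value) + "' />";
    }

    public static void main(String[] args) {
        // The apostrophe would otherwise terminate the single-quoted attribute.
        System.out.println(metaTag("author", "O'Brien & Sons"));
    }
}
```

       With the apostrophe and ampersand escaped, the attribute stays well-formed for an XHTML consumer; an HTML-only escape that leaves `'` untouched would break the single-quoted value.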

##########
File path: 
doxia-modules/doxia-module-markdown/src/main/java/org/apache/maven/doxia/module/markdown/MarkdownParser.java
##########
@@ -130,133 +191,98 @@ public void parse( Reader source, Sink sink )
      * @return HTML content generated by flexmark-java
      * @throws IOException passed through
      */
-    String toHtml( Reader source )
+    CharSequence toHtml( Reader source )
         throws IOException
     {
+        // Read the source
         String text = IOUtil.toString( source );
-        MutableDataHolder flexmarkOptions = PegdownOptionsAdapter.flexmarkOptions(
-                Extensions.ALL & ~( Extensions.HARDWRAPS | Extensions.ANCHORLINKS ) ).toMutable();
-        ArrayList<Extension> extensions = new ArrayList<>();
-        for ( Extension extension : flexmarkOptions.get( com.vladsch.flexmark.parser.Parser.EXTENSIONS ) )
-        {
-            extensions.add( extension );
-        }
-
-        extensions.add( FlexmarkDoxiaExtension.create() );
-        flexmarkOptions.set( com.vladsch.flexmark.parser.Parser.EXTENSIONS, extensions );
-        flexmarkOptions.set( HtmlRenderer.HTML_BLOCK_OPEN_TAG_EOL, false );
-        flexmarkOptions.set( HtmlRenderer.HTML_BLOCK_CLOSE_TAG_EOL, false );
-        flexmarkOptions.set( HtmlRenderer.MAX_TRAILING_BLANK_LINES, -1 );
-
-        com.vladsch.flexmark.parser.Parser parser = com.vladsch.flexmark.parser.Parser.builder( flexmarkOptions )
-                .build();
-        HtmlRenderer renderer = HtmlRenderer.builder( flexmarkOptions )
-                                    .linkResolverFactory( new FlexmarkDoxiaLinkResolver.Factory() )
-                                    .build();
-
 
+        // Now, build the HTML document
         StringBuilder html = new StringBuilder( 1000 );
         html.append( "<html>" );
         html.append( "<head>" );
-        Pattern metadataPattern = Pattern.compile( MULTI_MARKDOWN_METADATA_SECTION, Pattern.MULTILINE );
-        Matcher metadataMatcher = metadataPattern.matcher( text );
+
+        // First, we interpret the "metadata" section of the document and add the corresponding HTML headers
+        Matcher metadataMatcher = METADATA_SECTION_PATTERN.matcher( text );
         boolean haveTitle = false;
         if ( metadataMatcher.find() )
         {
-            metadataPattern = Pattern.compile( MULTI_MARKDOWN_METADATA_ENTRY, Pattern.MULTILINE );
-            Matcher lineMatcher = metadataPattern.matcher( metadataMatcher.group( 1 ) );
-            boolean first = true;
-            while ( lineMatcher.find() )
+            Matcher entryMatcher = METADATA_ENTRY_PATTERN.matcher( metadataMatcher.group( 0 ) );
+            while ( entryMatcher.find() )
             {
-                String key = StringUtils.trimToEmpty( lineMatcher.group( 1 ) );
-                if ( first )
-                {
-                    boolean found = false;
-                    for ( String k : STANDARD_METADATA_KEYS )
-                    {
-                        if ( k.equalsIgnoreCase( key ) )
-                        {
-                            found = true;
-                            break;
-                        }
-                    }
-                    if ( !found )
-                    {
-                        break;
-                    }
-                    first = false;
-                }
-                String value = StringUtils.trimToEmpty( lineMatcher.group( 2 ) );
+                String key = entryMatcher.group( 1 );
+                String value = entryMatcher.group( 2 );
                 if ( "title".equalsIgnoreCase( key ) )
                 {
                     haveTitle = true;
                     html.append( "<title>" );
-                    html.append( StringEscapeUtils.escapeXml( value ) );
+                    html.append( HtmlTools.escapeHTML( value, false ) );
                     html.append( "</title>" );
                 }
-                else if ( "author".equalsIgnoreCase( key ) )
-                {
-                    html.append( "<meta name=\'author\' content=\'" );
-                    html.append( StringEscapeUtils.escapeXml( value ) );
-                    html.append( "\' />" );
-                }
-                else if ( "date".equalsIgnoreCase( key ) )
-                {
-                    html.append( "<meta name=\'date\' content=\'" );
-                    html.append( StringEscapeUtils.escapeXml( value ) );
-                    html.append( "\' />" );
-                }
                 else
                 {
-                    html.append( "<meta name=\'" );
-                    html.append( StringEscapeUtils.escapeXml( key ) );
-                    html.append( "\' content=\'" );
-                    html.append( StringEscapeUtils.escapeXml( value ) );
-                    html.append( "\' />" );
+                    html.append( "<meta name='" );
+                    html.append( HtmlTools.escapeHTML( key, true ) );
+                    html.append( "' content='" );
+                    html.append( HtmlTools.escapeHTML( value, true ) );
+                    html.append( "' />" );
                 }
             }
-            if ( !first )
-            {
-                text = text.substring( metadataMatcher.end() );
-            }
+
+            // Trim the metadata from the source
+            text = text.substring( metadataMatcher.end( 0 ) );
+
         }
 
-        Node rootNode = parser.parse( text );
-        String markdownHtml = renderer.render( rootNode );
+        // Now is the time to parse the Markdown document
+        // (after we've trimmed out the metadatas, and before we check for its headings)
+        Node documentRoot = FLEXMARK_PARSER.parse( text );
 
-        if ( !haveTitle && rootNode.hasChildren() )
+        // Special trick: if there is no title specified as a metadata in the header, we will use the first
+        // heading as the document title
+        if ( !haveTitle && documentRoot.hasChildren() )
         {
-            // use the first (non-comment) node only if it is a heading
-            Node firstNode = rootNode.getFirstChild();
-            while ( firstNode != null && !( firstNode instanceof Heading ) )
+            // Skip the comment nodes
+            Node firstNode = documentRoot.getFirstChild();
+            while ( firstNode != null && firstNode instanceof HtmlCommentBlock )

Review comment:
       OK.
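
       For readers following the hunk, the title fallback described in the code comments (skip leading comment blocks, then use the first node as the title only if it is a heading) can be sketched without flexmark. The `Kind` and `Block` types below are hypothetical stand-ins for flexmark's node classes, and the step after the comment-skipping loop is an assumption based on the old and new code comments, since the quoted diff ends at the loop header.

```java
import java.util.Arrays;
import java.util.List;

public class TitleFallbackSketch {

    // Hypothetical, simplified node model (not the flexmark API).
    public enum Kind { COMMENT, HEADING, PARAGRAPH }

    public static final class Block {
        public final Kind kind;
        public final String text;

        public Block(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Returns the text of the first non-comment block if it is a heading,
    // or null when the document does not start with a heading.
    public static String fallbackTitle(List<Block> blocks) {
        int i = 0;
        // Skip the comment nodes, as the new loop in the diff does.
        while (i < blocks.size() && blocks.get(i).kind == Kind.COMMENT) {
            i++;
        }
        if (i < blocks.size() && blocks.get(i).kind == Kind.HEADING) {
            return blocks.get(i).text;
        }
        return null;
    }

    public static void main(String[] args) {
        List<Block> doc = Arrays.asList(
                new Block(Kind.COMMENT, "<!-- license header -->"),
                new Block(Kind.HEADING, "Introduction"),
                new Block(Kind.PARAGRAPH, "Some text."));
        System.out.println(fallbackTitle(doc));
    }
}
```

       Compared with the removed loop, which skipped every node that was not a heading, skipping only comment blocks means a document that opens with a paragraph gets no fallback title in this sketch.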




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
