Hi Antonio,

Welcome back :)

I think I get what you mean.
The purpose of this commit was explicitly to NOT use the configured Excalibur's SAXParser, due to security reason. But we certainly could implement it otherwise by declaring another specific poolable SAXFactory in the cocoon.xconf

Cédric


Le 02/11/2021 à 05:33, Antonio Gallardo a écrit :
Hi all,

It has been long time since I wrote an email in a Cocoon mail list. First I want to say hello to all my cocoon fellows.

I am writing, because reviewing this commit, I see we are moving apart from the cocoon pools that allows us to control how many instances of a given components are being created. Please see my comments between lines below.

Best Regards,

Antonio Gallardo.

El 16/6/20 a las 17:04, cdami...@apache.org escribió:
Author: cdamioli
Date: Tue Jun 16 23:04:30 2020
New Revision: 1878905

URL: http://svn.apache.org/viewvc?rev=1878905&view=rev
Log:
Make StreamGenerator more robust

Modified:
cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/generation/StreamGenerator.java

Modified: cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/generation/StreamGenerator.java URL: http://svn.apache.org/viewvc/cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/generation/StreamGenerator.java?rev=1878905&r1=1878904&r2=1878905&view=diff ============================================================================== --- cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/generation/StreamGenerator.java (original) +++ cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/generation/StreamGenerator.java Tue Jun 16 23:04:30 2020
@@ -16,6 +16,16 @@
   */
  package org.apache.cocoon.generation;
  +import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.Reader;
+import java.io.StringReader;
+
+import javax.servlet.http.HttpServletRequest;
+import javax.xml.parsers.SAXParser;
+import javax.xml.parsers.SAXParserFactory;
+
+import org.apache.avalon.framework.activity.Initializable;
  import org.apache.cocoon.ProcessingException;
  import org.apache.cocoon.ResourceNotFoundException;
  import org.apache.cocoon.environment.ObjectModelHelper;
@@ -23,16 +33,11 @@ import org.apache.cocoon.environment.Req
  import org.apache.cocoon.environment.http.HttpEnvironment;
  import org.apache.cocoon.servlet.multipart.Part;
  import org.apache.cocoon.util.PostInputStream;
-import org.apache.excalibur.xml.sax.SAXParser;
+import org.xml.sax.ErrorHandler;
  import org.xml.sax.InputSource;
  import org.xml.sax.SAXException;
-
-import javax.servlet.http.HttpServletRequest;
-
-import java.io.IOException;
-import java.io.InputStreamReader;
-import java.io.Reader;
-import java.io.StringReader;
+import org.xml.sax.SAXParseException;
+import org.xml.sax.XMLReader;
    /**
   * @cocoon.sitemap.component.documentation
@@ -64,7 +69,7 @@ import java.io.StringReader;
   * @author <a href="mailto:kinga_dziembow...@hp.com";>Kinga Dziembowski</a>
   * @version CVS $Id$
   */
-public class StreamGenerator extends ServiceableGenerator
+public class StreamGenerator extends ServiceableGenerator implements Initializable
  {
        /** The parameter holding the name associated with the xml data  **/
@@ -73,6 +78,9 @@ public class StreamGenerator extends Ser
      /** The input source */
      private InputSource inputSource;
  +    // don't use Excalibur's SAXParser to prevent XXE injection
+    private SAXParserFactory factory;
+
      /**
       * Recycle this component.
       * All instance variables are set to <code>null</code>.
@@ -81,13 +89,20 @@ public class StreamGenerator extends Ser
          super.recycle();
          this.inputSource = null;
      }
+
+    public void initialize() throws Exception {
+        factory = SAXParserFactory.newInstance();
+        factory.setNamespaceAware(true);
+        factory.setXIncludeAware(false);
+ factory.setFeature("http://xml.org/sax/features/external-general-entities";, false); + factory.setFeature("http://xml.org/sax/features/external-parameter-entities";, false);
+    }
        /**
       * Generate XML data out of request InputStream.
       */
      public void generate()
      throws IOException, SAXException, ProcessingException {
-        SAXParser parser = null;
          int len = 0;
          String contentType = null;
  @@ -149,9 +164,16 @@ public class StreamGenerator extends Ser
              String charset =  getCharacterEncoding(request, contentType) ;
              if( charset != null) {
                  this.inputSource.setEncoding(charset);
-            }
-            parser = (SAXParser)this.manager.lookup(SAXParser.ROLE);
-            parser.parse(this.inputSource, super.xmlConsumer);


The above lines removes the part that was controlled by Cocoon. The manager allowed to defined how many instances of the Parser can be created. Very useful to handle how much memory the webapp will use.

Please see the place in cocoon.xconf when we define the parsers:
http://svn.apache.org/viewvc/cocoon/branches/BRANCH_2_1_X/src/webapp/WEB-INF/cocoon.xconf?view=markup#l363

Also check the place when we add to the manager the parser that is set in cocoon.xconf: http://svn.apache.org/viewvc/cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/Cocoon.java?view=markup#l298

I think, there should be a way to configure the new parser in cocoon.xconf instead of hard coding it in the stream generator. Please note the same code using the line:

parser = (SAXParser)this.manager.lookup(SAXParser.ROLE);

is used in many place in the coocon code.

+            }
+
+            SAXParser parser = factory.newSAXParser();
+            XMLReader xmlReader = parser.getXMLReader();
+            xmlReader.setContentHandler(super.xmlConsumer);
+            xmlReader.setProperty( "http://xml.org/sax/properties/lexical-handler";, super.xmlConsumer ); +            xmlReader.setFeature( "http://xml.org/sax/features/namespaces";, true );
+
+            xmlReader.parse(this.inputSource);
+

I am afraid that with the above code, we loose this control and perhaps we are creating a vector to shutdown the server by sending too many instances that create a lot of parsers.


          } catch (IOException e) {
              getLogger().error("StreamGenerator.generate()", e);
              throw new ResourceNotFoundException("StreamGenerator could not find resource", e);
@@ -161,9 +183,7 @@ public class StreamGenerator extends Ser
          } catch (Exception e) {
              getLogger().error("Could not get parser", e);
              throw new ProcessingException("Exception in StreamGenerator.generate()", e);
-        } finally {
-            this.manager.release( parser);
-        }
+        }
      }
        /**
@@ -208,7 +228,7 @@ public class StreamGenerator extends Ser
               return extractCharset( contentType, idx );
          }
      }
-
+
        protected String extractCharset(String contentType, int idx) {
          String charencoding = null;



--
Cédric Damioli
CMS - Java - Open Source
www.ametys.org

Reply via email to