Poor performance transforming from SAX to DOM with large text content
---------------------------------------------------------------------
Key: XALANJ-2530
URL: https://issues.apache.org/jira/browse/XALANJ-2530
Project: XalanJ2
Issue Type: Improvement
Security Level: No security risk; visible to anyone (Ordinary problems in
Xalan projects. Anybody can view the issue.)
Components: JAXP
Affects Versions: 2.7.1
Environment: java version "1.6.0_23"
Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)
Linux 2.6.34.7-66.fc13.x86_64 #1 SMP Wed Dec 15 07:04:30 UTC 2010 x86_64 x86_64
x86_64 GNU/Linux
Reporter: Steve Jones
Xalan performs poorly when transforming a SAX source to a DOM result when the
input contains large amounts of contiguous text.
The following test shows that Xalan takes 45 seconds to process the test
document, but the "Sun" JDK transformer takes under half a second.
// Generate XML with large text content
final int bufferSize = 1024*1024*5;
final StringBuilder stringBuilder = new StringBuilder(bufferSize);
stringBuilder.append( "<test-document>" );
for ( int i=0; i< 1000000; i++ ) { stringBuilder.append( "text " ); }
stringBuilder.append( "</test-document>" );
final String testDocument = stringBuilder.toString();
System.out.println( "Test document size : " + testDocument.length() +
"/" + bufferSize );
// Process it
// Uncomment to select the JDK's built-in ("Sun") transformer for comparison:
//System.setProperty( "javax.xml.transform.TransformerFactory",
//    "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl" );
final javax.xml.transform.Transformer transformer =
    javax.xml.transform.TransformerFactory.newInstance().newTransformer( );
final javax.xml.transform.sax.SAXSource source =
    new javax.xml.transform.sax.SAXSource(
        new org.xml.sax.InputSource( new java.io.StringReader( testDocument ) ) );
final javax.xml.transform.dom.DOMResult result =
    new javax.xml.transform.dom.DOMResult();
final long startTime = System.currentTimeMillis();
transformer.transform( source, result );
System.out.println( ( System.currentTimeMillis() - startTime ) + "ms" );
It could be argued that this is a DOM implementation issue (due to the poor
performance of CharacterData.appendData), but it seems easy to fix within Xalan.
The "Sun" JDK solution to this issue can be seen in the class
com.sun.org.apache.xalan.internal.xsltc.trax.SAX2DOM,
which uses a StringBuilder to buffer the character data.
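
To illustrate the buffering idea, here is a minimal sketch of a SAX handler that collects character data in a StringBuilder and creates one DOM text node per contiguous run of text, instead of appending to the DOM on every characters() call. The class name BufferingDomBuilder and its parse() helper are hypothetical; this is not the Xalan or JDK SAX2DOM code, and it ignores namespaces, comments and processing instructions.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only: builds a DOM from SAX events and
// buffers character data until the next element boundary.
final class BufferingDomBuilder extends DefaultHandler {
    private final Document document;
    private Node current;
    private final StringBuilder pendingText = new StringBuilder();

    BufferingDomBuilder() throws ParserConfigurationException {
        document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        current = document;
    }

    Document getDocument() {
        return document;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Accumulate only; the DOM is not touched for each parser chunk.
        pendingText.append(ch, start, length);
    }

    // Turn the buffered text into a single text node under the current element.
    private void flushText() {
        if (pendingText.length() > 0) {
            current.appendChild(document.createTextNode(pendingText.toString()));
            pendingText.setLength(0);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        Element element = document.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            element.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        current.appendChild(element);
        current = element;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        current = current.getParentNode();
    }

    // Convenience entry point: parse an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            BufferingDomBuilder handler = new BufferingDomBuilder();
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.getDocument();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not build DOM from SAX events", e);
        }
    }
}
```

With this approach the cost of handling text grows with the amount of text rather than with the number of characters() callbacks multiplied by the size of the text node so far, which is the behaviour the appendData-based path appears to suffer from.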