G'Day,

When doing some profiling of Jena on another matter, I noticed that the IRI code seemed to consume a lot of time when parsing RDF/XML. On further investigation, a minor modification to com.hp.hpl.jena.iri.impl.AbsLexer results in a 15% speedup in IRI creation and a 10% speedup in parsing RDF/XML, in my tests (YMMV).


How:

In com.hp.hpl.jena.iri.impl.AbsLexer

Replace
[[
    final protected void rule(int rule) {
            parser.matchedRule(range,rule,yytext());
    }
]]

with

[[
    private static final boolean DEBUG = false;
    final protected void rule(int rule) {
        if (DEBUG) {
            parser.matchedRule(range,rule,yytext());
        }
    }
]]

Explanation:

This is debug code. As far as I can see, yytext() returns a copy of part of the input buffer as a String. Parser.matchedRule prints some debug information if DEBUG is true, which it is not, so the copy is made on every call and then thrown away.

The method rule(int) is called rather a lot, so compiling out this code in AbsLexer gives a noticeable performance improvement.
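
To illustrate the effect, here is a minimal self-contained sketch. It is not the Jena code: the class DebugGuardSketch, the counter and the helper methods are made up for illustration, and the yytext() and matchedRule() stand-ins only mimic what I believe the real methods do. It counts how often the token text gets copied with and without the guard.

```java
public class DebugGuardSketch {
    // Compile-time constant, as in the proposed patch.
    private static final boolean DEBUG = false;

    // Hypothetical stand-in for the lexer's input buffer.
    private final char[] buffer =
        "http://www.example.com/foo/bar/bas#12345".toCharArray();

    // Counts how many times the token text was copied out of the buffer.
    private int copies = 0;

    // Stand-in for yytext(): allocates a new String on every call.
    String yytext() {
        copies++;
        return new String(buffer, 0, buffer.length);
    }

    // Stand-in for Parser.matchedRule: does nothing unless DEBUG is set.
    void matchedRule(int rule, String text) {
        if (DEBUG) {
            System.err.println("rule " + rule + ": " + text);
        }
    }

    // Original shape: the argument is evaluated before the call,
    // so the copy happens even though nothing is printed.
    void ruleUnguarded(int rule) {
        matchedRule(rule, yytext());
    }

    // Patched shape: with DEBUG false, yytext() is never reached.
    void ruleGuarded(int rule) {
        if (DEBUG) {
            matchedRule(rule, yytext());
        }
    }

    // Number of buffer copies made by `calls` unguarded invocations.
    static int copiesUnguarded(int calls) {
        DebugGuardSketch lexer = new DebugGuardSketch();
        for (int i = 0; i < calls; i++) {
            lexer.ruleUnguarded(i);
        }
        return lexer.copies;
    }

    // Number of buffer copies made by `calls` guarded invocations.
    static int copiesGuarded(int calls) {
        DebugGuardSketch lexer = new DebugGuardSketch();
        for (int i = 0; i < calls; i++) {
            lexer.ruleGuarded(i);
        }
        return lexer.copies;
    }

    public static void main(String[] args) {
        System.out.println("unguarded copies: " + copiesUnguarded(1000));
        System.out.println("guarded copies:   " + copiesGuarded(1000));
    }
}
```

The unguarded version makes one copy per call, the guarded version makes none. Because DEBUG is a static final constant, the compiler is also free to drop the guarded block altogether.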

I have run a test creating 10 million IRIs of the form "http://www.example.com/foo/bar/bas#nnnnn". On my machine I see approx 15% performance improvement.

I have taken the test graphs from the test in [1] and read them into a memory model. This runs about 10% faster with the mod I am suggesting than without the mod.

com.hp.hpl.jena.iri.test.TestPackage still green-lines (all tests pass) with the change.

I have attached a patch file rooted in the IRI project.

Brian

[1] https://jena.svn.sourceforge.net/svnroot/jena/TDB/trunk/src-dev/reports/ReportOutOfMemoryManyGraphsTDB.java





Index: src/com/hp/hpl/jena/iri/impl/AbsLexer.java
===================================================================
RCS file: /cvsroot/jena/iri/src/com/hp/hpl/jena/iri/impl/AbsLexer.java,v
retrieving revision 1.4
diff -u -r1.4 AbsLexer.java
--- src/com/hp/hpl/jena/iri/impl/AbsLexer.java  28 Feb 2009 17:44:46 -0000      1.4
+++ src/com/hp/hpl/jena/iri/impl/AbsLexer.java  7 Apr 2011 16:48:19 -0000
@@ -72,8 +72,11 @@
         parser.recordError(range,e);
     }
     
+    private static final boolean DEBUG = false;
     final protected void rule(int rule) {
-        parser.matchedRule(range,rule,yytext());
+       if (DEBUG) {
+            parser.matchedRule(range,rule,yytext());
+       }
     }
     abstract String yytext();
     protected void surrogatePair() {
