Hi,

I have rewritten a large part of the source code of the gnu.regexp package.

The important points of this change are:

  (1) A new method REToken#matchThis. This method tries to match
      the input string against the REToken itself and does not
      try to match the next RETokens chained to it. The currently
      used REToken#match method should be defined using REToken#matchThis.
      This is useful for (3).

  (2) A new method REToken#findMatch. This is almost the same as
      the current REToken#match but returns a resulting REMatch
      instead of a boolean value.  This is useful for the depth-first
      search with backtracking.

  (3) New methods REToken#returnsFixedLengthMatches and
      REToken#findFixedLengthMatches. These will speed up the
      search for repeated matches if the matched string is
      supposed to have a fixed length.

  (4) RETokenOneOf and RETokenRepeated perform a depth-first
      search with backtracking.
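
To illustrate points (1) and (4), here is a minimal sketch of the idea,
not the code in the patch. All class and method names below (Tok, CharTok,
OneOfTok, TokDemo) are made up for this example and use plain int
positions instead of REMatch. It shows match defined in terms of
matchThis, and an alternation that backtracks to the next alternative
when the rest of the chain fails.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified token model; these are NOT the gnu.regexp classes.
abstract class Tok {
    Tok next; // following token in the chain, or null at the end

    // Match only this token at pos; return the end position, or -1 on failure.
    abstract int matchThis(String in, int pos);

    // Match this token and then the rest of the chain.
    int match(String in, int pos) {
        int p = matchThis(in, pos);
        if (p < 0) return -1;
        return next == null ? p : next.match(in, p);
    }
}

class CharTok extends Tok {
    final char c;
    CharTok(char c) { this.c = c; }

    int matchThis(String in, int pos) {
        return pos < in.length() && in.charAt(pos) == c ? pos + 1 : -1;
    }
}

// Alternation searched depth-first with backtracking.
class OneOfTok extends Tok {
    final List<Tok> options = new ArrayList<>();

    // Local match: end position of the first alternative that matches.
    int matchThis(String in, int pos) {
        for (Tok o : options) {
            int p = o.match(in, pos);
            if (p >= 0) return p;
        }
        return -1;
    }

    @Override
    int match(String in, int pos) {
        for (Tok o : options) {
            int p = o.match(in, pos);
            if (p < 0) continue;
            int q = next == null ? p : next.match(in, p);
            if (q >= 0) return q; // the whole chain matched
            // otherwise backtrack and try the next alternative
        }
        return -1;
    }
}

final class TokDemo {
    // Build a chain of CharToks for a non-empty literal string.
    static Tok chain(String s) {
        Tok head = null, tail = null;
        for (char c : s.toCharArray()) {
            Tok t = new CharTok(c);
            if (head == null) head = t; else tail.next = t;
            tail = t;
        }
        return head;
    }

    // Match the pattern (a|ab)c at position 0; return the end position or -1.
    static int run(String input) {
        OneOfTok alt = new OneOfTok();
        alt.options.add(chain("a"));
        alt.options.add(chain("ab"));
        alt.next = chain("c");
        return alt.match(input, 0);
    }

    public static void main(String[] args) {
        System.out.println(run("abc")); // prints 3
    }
}
```

For the input "abc", the alternative "a" matches first but the following
"c" fails, so the search backtracks to "ab" and then succeeds.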

After this change, the test attached below shows a 400% performance
improvement over the current CVS version.  The improvement comes
mainly from change (3).  To my regret, change (4) had a negative
effect on performance.
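
As a rough picture of why fixed-length matches can help, here is a small
sketch under my own assumptions; it is not the implementation in the
patch. The names FixedLengthRepeat, FixedTok, countMatches and
matchRepeatThen are hypothetical. When every match of the repeated token
has the same length, the possible end positions are pos + k * length, so
the repeat count can be found with a simple loop and backtracking is just
a matter of decreasing k, with no per-iteration match state to keep.

```java
// Hypothetical sketch; names and signatures are illustrative only.
final class FixedLengthRepeat {

    // Stand-in for a token whose every match has the same length.
    interface FixedTok {
        int length();
        boolean matchesAt(String in, int pos);
    }

    // Example token: a single decimal digit (length 1).
    static final FixedTok DIGIT = new FixedTok() {
        public int length() { return 1; }
        public boolean matchesAt(String in, int pos) {
            return pos < in.length() && Character.isDigit(in.charAt(pos));
        }
    };

    // Count consecutive matches of tok starting at pos, up to max.
    static int countMatches(FixedTok tok, String in, int pos, int max) {
        int n = 0;
        while (n < max && tok.matchesAt(in, pos + n * tok.length())) n++;
        return n;
    }

    // Greedy tok{min,max} followed by a literal suffix.
    // Returns the end position of the whole match, or -1.
    static int matchRepeatThen(FixedTok tok, int min, int max,
                               String suffix, String in, int pos) {
        int count = countMatches(tok, in, pos, max);
        // Backtrack by giving up one repetition at a time.
        for (int k = count; k >= min; k--) {
            int end = pos + k * tok.length();
            if (in.startsWith(suffix, end)) return end + suffix.length();
        }
        return -1;
    }

    public static void main(String[] args) {
        // Behaves like \d+5x against "12345x": the repeat gives back one digit.
        System.out.println(
            matchRepeatThen(DIGIT, 1, Integer.MAX_VALUE, "5x", "12345x", 0)); // prints 6
    }
}
```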

ChangeLog
2006-03-01  Ito Kazumitsu  <[EMAIL PROTECTED]>

        * gnu/regexp/BacktrackStack.java: New file.
        * gnu/regexp/RE.java(findMatch): New method.
        * gnu/regexp/REMatch.java(next,matchFlags,MF_FIND_ALL,
        REMatchList): Removed. (backtrackStack): New field.
        * gnu/regexp/REToken.java(match): Changed from an abstract
        method to an ordinary method defined with the new method
        matchThis. (matchThis, getNext, findMatch, returnsFixedLengthMatches,
        findFixedLengthMatches, backtrack, toString): New methods.
        * gnu/regexp/RETokenAny.java: Implemented new methods of REToken.
        * gnu/regexp/RETokenBackRef.java: Likewise.
        * gnu/regexp/RETokenChar.java: Likewise.
        * gnu/regexp/RETokenEnd.java: Likewise.
        * gnu/regexp/RETokenEndSub.java: Likewise.
        * gnu/regexp/RETokenIndependent.java: Likewise.
        * gnu/regexp/RETokenLookAhead.java: Likewise.
        * gnu/regexp/RETokenLookBehind.java: Likewise.
        * gnu/regexp/RETokenNamedProperty.java: Likewise.
        * gnu/regexp/RETokenPOSIX.java: Likewise.
        * gnu/regexp/RETokenRange.java: Likewise.
        * gnu/regexp/RETokenStart.java: Likewise.
        * gnu/regexp/RETokenWordBoundary.java: Likewise.
        * gnu/regexp/RETokenOneOf.java: Rewritten.
        * gnu/regexp/RETokenRepeated.java: Rewritten.

The performance test follows:
import java.io.*;
import java.util.*;
import java.util.regex.*;
 
public class RegExTestCase {
    
    static final Pattern dynPatString =
        Pattern.compile("(.*?)(<dynstr\\s+)(property)\\s*=\\s*\"(.*?)\"\\s*(/>)");

 // | 1 ||<-   2 ->||<-  3 ->|           |<4>|      |5 |


    long time = 0;
    public RegExTestCase () {
        for (int i = 0; i < 10000; i++) {
            String s = replaceDynamicStringAll(testString, "Foo");
        }
        System.out.println("Elapsed time = " + time);
    }
 
    public String replaceDynamicStringAll(String inStr, String replaceStr) {
        StringBuffer sb = new StringBuffer();
        Matcher m = dynPatString.matcher(inStr);
        time -= System.currentTimeMillis();
        boolean b = m.find();
        time += System.currentTimeMillis();
        while (b) {
            sb.append(m.group(1));
            m.appendReplacement(sb, replaceStr);
            b = m.find();
        }
        m.appendTail(sb);
        return sb.toString();
    }
    
    private final static String testString =
        "ABC<dynstr property =             \"X\"/>";
    
    public final static void main(String[] args) {
        new RegExTestCase();
    }
}

--- gnu/java/nio/charset/iconv/IconvProvider.java.orig  Sat Jul 16 00:12:48 2005
+++ gnu/java/nio/charset/iconv/IconvProvider.java       Thu Oct 20 23:41:57 2005
@@ -62,7 +62,11 @@
         }
     }
 
-  private IconvProvider()
+  // Declaring the constructor public may violate the use of singleton.
+  // But it must be public so that an instance of this class can be
+  // created by Class.newInstance(), which is the case when this provider is
+  // defined in META-INF/services/java.nio.charset.spi.CharsetProvider.
+  public IconvProvider()
   {
     IconvMetaData.setup();
   }
