DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=38331>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=38331

           Summary: ArrayIndexOutOfBoundsException under certain conditions
           Product: Regexp
           Version: unspecified
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Other
        AssignedTo: regexp-dev@jakarta.apache.org
        ReportedBy: [EMAIL PROTECTED]


This code generates an exception when running with jdk1.3.1_17:

RE r123 = new RE("((a|b){1637})");
r123.match("a");

This code works properly:

RE r123 = new RE("((a|b){1638})");
r123.match("a");

This code shows that depending on the number requested, regexp switches between 
working and not working:

boolean lastvalue = true;
for (int i = 1; i < 3650; i++) {
    try {
        RE r = new RE("((a|b){" + i + "})");
        r.match("a");
        // Report the first count that works after a run of failures.
        if (!lastvalue) {
            System.out.println("Switching from NOT to WORKING at " + i
                    + " (" + i + " works) " + lastvalue);
        }
        lastvalue = true;
    } catch (Exception ex) {
        // Report the first count that fails after a run of successes.
        if (lastvalue) {
            System.out.println("Switching from WORKING to NOT at " + i
                    + " (" + i + " doesn't work) " + lastvalue);
        }
        lastvalue = false;
    }
}

If "i" were allowed to run past 3650, the behavior would switch back and forth 
a couple more times before 10000; I have seen it happen above 7000, which is as 
far as I let the test run. In RE.java, look under the following signature:

protected int matchNodes(int firstNode, int lastNode, int idxStart)

Look for this line:

next   = node + (short)instruction[node + offsetNext];

Change it to say:

next   = node + (int)instruction[node + offsetNext];

After recompiling and retesting, the problem appears to go away; however, I 
cannot confirm that the change doesn't break something else. I'm not sure why 
"short" would have been chosen over "int". Maybe there is a hidden reason.
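
One possible reason, sketched here under an assumption the report does not 
confirm: if the instruction array is a char[] holding 16-bit relative offsets, 
the "short" cast would be what lets a stored offset be negative (a backward 
jump), and it would also turn any forward offset above 32767 into a negative 
number. The class and method names below are hypothetical and are not part of 
RE.java; the sketch only shows how the two casts decode the same stored value.

```java
public class OffsetCastDemo {
    // Decodes a stored offset the way the original line does:
    // char -> short, i.e. as a signed 16-bit value.
    static int decodeAsShort(char stored) {
        return (short) stored;
    }

    // Decodes a stored offset the way the proposed change does:
    // char -> int, i.e. as an unsigned value in 0..65535.
    static int decodeAsInt(char stored) {
        return (int) stored;
    }

    public static void main(String[] args) {
        char forward = (char) 40000; // forward jump larger than Short.MAX_VALUE
        char backward = (char) -5;   // backward jump squeezed into 16 bits

        System.out.println(decodeAsShort(forward));  // -25536
        System.out.println(decodeAsInt(forward));    // 40000
        System.out.println(decodeAsShort(backward)); // -5
        System.out.println(decodeAsInt(backward));   // 65531
    }
}
```

If that assumption holds, the "int" cast fixes large forward offsets but would 
misread any backward offset as a large forward one, so patterns that rely on 
backward jumps would need to be retested before adopting the change.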

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
