Hello

I need to match strings from the beginning of an input string, and I
don't want to load this string completely (from a file or URL) unless
that is necessary. That should be possible, because the match can be
abandoned as soon as the first characters don't match.
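
To illustrate the idea only: a minimal sketch, assuming a plain literal
prefix instead of a real regexp. The class LazyPrefixSketch and its
startsWith method are hypothetical and not part of the regexp library.
It reads one character at a time from a Reader and stops at the first
mismatch, so the rest of the input is never loaded.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of org.apache.regexp.
class LazyPrefixSketch {

    // Number of characters pulled from the reader by the last call.
    static int charsRead = 0;

    // Returns true if the input starts with the given literal prefix.
    // Reads one character at a time and stops at the first mismatch.
    static boolean startsWith(Reader in, String prefix) {
        charsRead = 0;
        try {
            for (int i = 0; i < prefix.length(); i++) {
                int c = in.read();
                if (c != -1) {
                    charsRead++;
                }
                if (c != prefix.charAt(i)) {
                    return false; // mismatch or end of input
                }
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Reader in = new StringReader("abcdefghijklmnopqrstuvwxyz");
        boolean matched = startsWith(in, "xyz");
        System.out.println(matched + " after reading " + charsRead + " character(s)");
    }
}
```

With a file or URL stream wrapped in a Reader, the same loop would give
up after the first character that differs.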

Regexp doesn't do this, even if I start the regexp with a '^' and
activate the flag MATCH_SINGLELINE. So I added a new flag that
matches only from the beginning of the string and stops if the first
characters do not match. That also improves performance when the
regexp doesn't match on a very long input string.
To demonstrate this, I added Performance.java, which runs some
matches on very long strings.
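
The effect of the flag on the scan loop can be sketched outside the
library. This is a simplified stand-in, assuming a literal pattern and
using String.regionMatches in place of the real matcher; the class
AnchoredScanSketch, its find method, and the attempts counter are
hypothetical names for illustration. The loop condition has the same
shape as the one in the patch.

```java
// Hypothetical stand-in for the scan loop in RE.match(); literal patterns only.
class AnchoredScanSketch {

    // Number of start positions tried by the last call to find().
    static int attempts = 0;

    // Returns the index of the first match, or -1 if there is none.
    // With fromBeginningOnly set, only start position 0 is tried.
    static int find(String input, String literal, boolean fromBeginningOnly) {
        attempts = 0;
        boolean matchEverywhere = !fromBeginningOnly;
        for (int i = 0;
             (matchEverywhere || i == 0) && i + literal.length() <= input.length();
             i++) {
            attempts++;
            if (input.regionMatches(i, literal, 0, literal.length())) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("abcdefghijklmnopqrstuvwxyz");
        }
        String input = sb.toString();

        // "xyz!" never occurs, so the unanchored scan walks the whole input.
        find(input, "xyz!", false);
        System.out.println("everywhere:     " + attempts + " attempts");
        find(input, "xyz!", true);
        System.out.println("from beginning: " + attempts + " attempts");
    }
}
```

For a pattern that never matches, the unanchored scan tries every start
position in the input, while the anchored scan tries only the first.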

The patchtxt file contains the patch for RE.java itself, as created by
'cvs diff'.

I hope this patch will be helpful not only for me. Keep up the good work
on the regexp library. Excuse my bad English; I'm not a native
English speaker.

Ciao, Jörgen Kosche
-- 
Jörgen 'Mnementh' Kosche
[EMAIL PROTECTED]
GPG: http://www.mnementh.de/public_key
Website: http://www.mnementh.de/
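import org.apache.regexp.*;

/**
 * Rough timing demo: repeatedly matches a pattern against a very long
 * input using the MATCH_FROMBEGINNING flag added by the accompanying patch.
 */
public class Performance
{

	public static void main(String[] args)
	{
		// Build a 26,000-character input string.
		StringBuffer input = new StringBuffer();
		for (int i = 0; i < 1000; i++)
		{
			input.append("abcdefghijklmnopqrstuvwxyz");
		} // for
		// "xyz" does not occur at position 0, so with MATCH_FROMBEGINNING
		// each match should fail right after the first position is tried.
		for (int j = 0; j < 10000; j++)
		{
			RE re = new RE("xyz", RE.MATCH_FROMBEGINNING);
			re.match(input.toString());
		} // for
	}

}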
Index: src/java/org/apache/regexp/RE.java
===================================================================
RCS file: /home/cvspublic/jakarta-regexp/src/java/org/apache/regexp/RE.java,v
retrieving revision 1.23
diff -r1.23 RE.java
316a317,321
>     /**
>      * Flag to indicate that matching should start from the beginning of input
>      */
>     public static final int MATCH_FROMBEGINNING = 0x0008;
> 
1413a1419,1420
>         boolean matchEverywhere = ((matchFlags & MATCH_FROMBEGINNING) == 0);
> 
1421c1428
<             for ( ;! search.isEnd(i - 1); i++)
---
>             for ( ;(matchEverywhere || (i == 0)) && (!search.isEnd(i - 1)); i++)
1436c1443
<             for ( ; !search.isEnd(i + prefix.length - 1); i++)
---
>             for ( ;(matchEverywhere || (i == 0)) && (!search.isEnd(i + prefix.length - 1)); i++)

Attachment: pgp115krBcgDI.pgp
Description: signature
