Hi.

Jon Stevens > Q: There are two tests that don't pass. Why?
Having got home and looked at the tests, I think I did not answer the
question properly earlier. I assume you mean: why should these two not match?

Jon Stevens > #159
If so: in #159 the www is preceded by a period.
The re requires that the first character of the domain name be
alphanumeric or a hyphen, so this input does not match:
http://.www.test.com

Jon Stevens > #161
The re only matches the ftp and http protocols (in any letter case), so
FtTP does not match.
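
To make the two NO cases easy to reproduce, here is a minimal sketch. Two assumptions: it uses java.util.regex as a stand-in for org.apache.regexp.RE (the constructs in this pattern should behave the same in both, but I have not verified that here), and the class name UrlReCheck is made up for illustration. The pattern is the anchored one from #160-#162; #159 differs only in which groups capture.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of jakarta-regexp.
class UrlReCheck {
    // Anchored pattern from tests #160-#162, copied verbatim
    // (backslashes doubled for the Java string literal).
    static final Pattern URL = Pattern.compile(
        "^(?:(?:[hH][tT]{2}[pP]|[fF][tT][pP]):\\/\\/)?"
        + "[a-zA-Z0-9\\-]+(?:\\.[a-zA-Z0-9\\-]+)*$");

    // True if the whole input is accepted by the pattern.
    static boolean matches(String input) {
        return URL.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "http://.www.test.com", // #159: domain starts with a period
            "FtP://www.test.com",   // #160: ftp in mixed case
            "FtTP://www.test.com",  // #161: neither http nor ftp
            "www.test.com"          // #162: protocol is optional
        };
        for (String s : inputs) {
            System.out.println(s + " -> " + (matches(s) ? "YES" : "NO"));
        }
    }
}
```

In both NO cases the optional protocol part cannot help: the only remaining way to match is to read the leading letters as a domain name, and then the ':' that follows is not allowed before the '$'.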

NOTE:  "Match: NO" means the re correctly did not match, i.e. the test
passed.

A big...
*********************************************
************Failure*************************
*********************************************
type message appears only if one of the tests actually fails.


As to test(); as opposed to new RETest( args );
I have included another patch to clean this up (it repeats the previous
patch, with additions).
I think someone intended to clean it up earlier but did not finish or was
distracted, as the javadoc says one thing and the code does another.
I think it is on track now...

Michael
Index: docs/RETest.txt
===================================================================
RCS file: /home/cvspublic/jakarta-regexp/docs/RETest.txt,v
retrieving revision 1.1
diff -r1.1 RETest.txt
886a887,980
> 
> #149
> (?:a)
> a
> YES
> a
> 
> #150
> (?:a)
> aa
> YES
> a
> 
> #151
> (?:\w)
> abc
> YES
> a
> 
> #152
> (?:\w\s\w)+
> a b c
> YES
> a b
> 
> #153
> (a\w)(?:,(a\w))+
> ab,ac,ad
> YES
> ab,ac,ad
> ab
> ad
> 
> #154
> z(\w\s+(?:\w\s+\w)+)z
> za   b bc   cd     dz
> YES
> za   b bc   cd     dz
> a   b bc   cd     d
> 
> #155
> (([hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]+)*
> http://www.test.com
> YES
> http://www.test.com
> http://
> http
> .com
> 
> #156
> ((?:[hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]+)*
> ftp://www.test.com
> YES
> ftp://www.test.com
> ftp://
> .com
> 
> #157
> (([hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*
> htTp://www.test.com
> YES
> htTp://www.test.com
> htTp://
> htTp
> 
> #158
> (?:([hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]+)*
> FTP://www.test.com
> YES
> FTP://www.test.com
> FTP
> .com
> 
> #159
> ^(?:([hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]+)*$
> http://.www.test.com
> NO
> 
> #160
> ^(?:(?:[hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*$
> FtP://www.test.com
> YES
> FtP://www.test.com
> 
> #161
> ^(?:(?:[hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*$
> FtTP://www.test.com
> NO
> 
> #162
> ^(?:(?:[hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*$
> www.test.com
> YES
> www.test.com
Index: src/java/org/apache/regexp/RE.java
===================================================================
RCS file: /home/cvspublic/jakarta-regexp/src/java/org/apache/regexp/RE.java,v
retrieving revision 1.6
diff -r1.6 RE.java
176,186c176,186
<  *    [:alnum:]            Alphanumeric characters. 
<  *    [:alpha:]            Alphabetic characters. 
<  *    [:blank:]            Space and tab characters. 
<  *    [:cntrl:]            Control characters. 
<  *    [:digit:]            Numeric characters. 
<  *    [:graph:]            Characters that are printable and are also visible. (A space is printable, but not visible, while an `a' is both.) 
<  *    [:lower:]            Lower-case alphabetic characters. 
<  *    [:print:]            Printable characters (characters that are not control characters.) 
<  *    [:punct:]            Punctuation characters (characters that are not letter, digits, control characters, or space characters). 
<  *    [:space:]            Space characters (such as space, tab, and formfeed, to name a few). 
<  *    [:upper:]            Upper-case alphabetic characters. 
---
>  *    [:alnum:]            Alphanumeric characters.
>  *    [:alpha:]            Alphabetic characters.
>  *    [:blank:]            Space and tab characters.
>  *    [:cntrl:]            Control characters.
>  *    [:digit:]            Numeric characters.
>  *    [:graph:]            Characters that are printable and are also visible. (A space is printable, but not visible, while an `a' is both.)
>  *    [:lower:]            Lower-case alphabetic characters.
>  *    [:print:]            Printable characters (characters that are not control characters.)
>  *    [:punct:]            Punctuation characters (characters that are not letter, digits, control characters, or space characters).
>  *    [:space:]            Space characters (such as space, tab, and formfeed, to name a few).
>  *    [:upper:]            Upper-case alphabetic characters.
188c188
<  *         
---
>  *
199c199
<  *         
---
>  *
254a255
>  *   (?:A)                 Used for subexpression clustering (just like grouping but no backrefs)
399a401
>     static final char OP_OPEN_CLUSTER     = '<';  //                 opening cluster
400a403
>     static final char OP_CLOSE_CLUSTER    = '>';  //                 closing cluster
421c424
<     static final char POSIX_CLASS_ALPHA   = 'a';  // Alphabetics 
---
>     static final char POSIX_CLASS_ALPHA   = 'a';  // Alphabetics
947a951,955
> 
>                 case OP_OPEN_CLUSTER:
>                 case OP_CLOSE_CLUSTER:
>                     // starting or ending the matching of a subexpression which has no backref.
>                     return matchNodes( next, maxNode, idx );
Index: src/java/org/apache/regexp/RECompiler.java
===================================================================
RCS file: /home/cvspublic/jakarta-regexp/src/java/org/apache/regexp/RECompiler.java,v
retrieving revision 1.2
diff -r1.2 RECompiler.java
1191c1191
<         boolean paren = false;
---
>         int paren = -1;
1196,1198c1196,1208
<             idx++;
<             paren = true;
<             ret = node(RE.OP_OPEN, parens++);
---
>             // if its a cluster ( rather than a proper subexpression ie with backrefs )
>             if ( idx + 2 < len && pattern.charAt( idx + 1 ) == '?' && pattern.charAt( idx + 2 ) == ':' )
>             {
>                 paren = 2;
>                 idx += 3;
>                 ret = node( RE.OP_OPEN_CLUSTER, 0 );
>             }
>             else
>             {
>                 paren = 1;
>                 idx++;
>                 ret = node(RE.OP_OPEN, parens++);
>             }
1223c1233
<         if (paren)
---
>         if ( paren > 0 )
1233c1243,1250
<             end = node(RE.OP_CLOSE, closeParens);
---
>             if ( paren == 1 )
>             {
>                 end = node(RE.OP_CLOSE, closeParens);
>             }
>             else
>             {
>                 end = node( RE.OP_CLOSE_CLUSTER, 0 );
>             }
Index: src/java/org/apache/regexp/RETest.java
===================================================================
RCS file: /home/cvspublic/jakarta-regexp/src/java/org/apache/regexp/RETest.java,v
retrieving revision 1.2
diff -r1.2 RETest.java
58c58
<  */ 
---
>  */
85c85
<     public static void main(String[] arg)
---
>     public static void main(String[] args)
89,90c89
<             //new RETest(arg);
<             test();
---
>             test( args );
103c102
<     public static boolean test() throws Exception
---
>     public static boolean test( String[] args ) throws Exception
106c105,121
<         test.runAutomatedTests("docs/RETest.txt");
---
>         // Run interactive tests against a single regexp
>         if (args.length == 2)
>         {
>             test.runInteractiveTests(args[1]);
>         }
>         else if (args.length == 1)
>         {
>             // Run automated tests
>             test.runAutomatedTests(args[0]);
>         }
>         else
>         {
>             System.out.println( "Usage: RETest ([-i] [regex]) ([/path/to/testfile.txt])" );
>             System.out.println( "By Default will run automated tests from file 'docs/RETest.txt' ..." );
>             System.out.println();
>             test.runAutomatedTests("docs/RETest.txt");
>         }
118,146d132
<      * Constructor for test
<      * @param arg Command line arguments
<     */
<     public RETest(String[] arg)
<     {
<         try
<         {
<             // Run interactive tests against a single regexp
<             if (arg.length == 2)
<             {
<                 runInteractiveTests(arg[1]);
<             }
<             else if (arg.length == 1)
<             {
<                 // Run automated tests
<                 runAutomatedTests(arg[0]);
<             }
<             else
<             {
<                 System.out.println ( "Usage: RETest ([-i] [regex]) ([/path/to/testfile.txt])" );
<             }
<         }
<         catch (Exception e)
<         {
<             e.printStackTrace();
<         }       
<     }
< 
<     /**
162c148
<             
---
> 
Index: xdocs/RETest.txt
===================================================================
RCS file: /home/cvspublic/jakarta-regexp/xdocs/RETest.txt,v
retrieving revision 1.1
diff -r1.1 RETest.txt
886a887,980
> 
> #149
> (?:a)
> a
> YES
> a
> 
> #150
> (?:a)
> aa
> YES
> a
> 
> #151
> (?:\w)
> abc
> YES
> a
> 
> #152
> (?:\w\s\w)+
> a b c
> YES
> a b
> 
> #153
> (a\w)(?:,(a\w))+
> ab,ac,ad
> YES
> ab,ac,ad
> ab
> ad
> 
> #154
> z(\w\s+(?:\w\s+\w)+)z
> za   b bc   cd     dz
> YES
> za   b bc   cd     dz
> a   b bc   cd     d
> 
> #155
> (([hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]+)*
> http://www.test.com
> YES
> http://www.test.com
> http://
> http
> .com
> 
> #156
> ((?:[hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]+)*
> ftp://www.test.com
> YES
> ftp://www.test.com
> ftp://
> .com
> 
> #157
> (([hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*
> htTp://www.test.com
> YES
> htTp://www.test.com
> htTp://
> htTp
> 
> #158
> (?:([hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]+)*
> FTP://www.test.com
> YES
> FTP://www.test.com
> FTP
> .com
> 
> #159
> ^(?:([hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(\.[a-zA-Z0-9\-]+)*$
> http://.www.test.com
> NO
> 
> #160
> ^(?:(?:[hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*$
> FtP://www.test.com
> YES
> FtP://www.test.com
> 
> #161
> ^(?:(?:[hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*$
> FtTP://www.test.com
> NO
> 
> #162
> ^(?:(?:[hH][tT]{2}[pP]|[fF][tT][pP]):\/\/)?[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*$
> www.test.com
> YES
> www.test.com
