Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Daniel wrote:
> The Java regex syntax is almost a superset of Perl, which is why I don't see the impact of using a Perl engine for JDK 1.3 and java.util.regex for J2SE 1.4 as being major. The expression Rami gave was straight Perl 5.005. jakarta-oro's Perl5Compiler/Perl5Matcher implements zero-width look-ahead assertions from Perl 5.003 but does not implement the zero-width look-behind assertions from 5.005 and future versions (if you don't ask for it ...). This can be added. The other difference is that in Perl \Q and \E are not part of the regex syntax. They are part of Perl string handling, so we didn't implement them in Perl5Compiler (instead quotemeta() is provided), but support them in the Perl5Util convenience class. This can be moved into Perl5Compiler if desired. There has to be a user driver for these small things to happen.
Very true. It is also obvious that Java has followed in the footsteps of Perl, which has a much longer history with regexes. The reason they are not compatible is the lack of standardisation on the Perl side. Since the Java folks have always put much effort into internationalization, I think Java regexes have made an extra effort with the handling of Unicode. If regexes were to be standardized, then Perl would deserve the biggest say in that committee. However, for that standard I feel that all aspects of the language should be encoded inside the language rather than outside it (like embedded SQL, or quotemeta() in regexes). Otherwise the language will never be defined exactly but will have loose boundaries.
> In general, most regular expressions you see in the wild can be simplified and don't require unusual constructs. For example, why write \\Q**\\E when \\*\\* will do (you would usually want to use \Q and \E for longer sequences or for dynamically generated strings you want to escape; but quotemeta works equally well)?
I am using quoting with dynamic input, so I need the feature. Now I have been told that I need to support the Java, Perl5 and POSIX syntaxes. So in the case of Java I have to use \\Q and \\E, in the case of Perl5 I have to use quotemeta(), and in the case of POSIX I have no clue!
> Why use a negative look-behind assertion in ((?!^)|[^/]) when [^/] will suffice (the negative look-behind assertion is redundant because if there's a character present that's not a slash, then it's not the start of the input)?
Thanks for the tip! I am an occasional regex user :=)
> Of course, you can't always simplify your expressions and I think Rami's point is that you shouldn't be bothered with the finer points and stuff should just work.
Thank you for understanding my intention so well!
> I think the answer is that as long as you stick to Perl5 syntax (which most people using java.util.regex are unknowingly doing), you'll rarely run into differences; but that oro doesn't implement most of the stuff added after Perl 5.003 for lack of demand (there's not that much stuff).
(And from above)
> There has to be a user driver for these small things to happen.
I think there is a user driver: the fact that users could read one well-written piece of documentation about regexes and then use them worry-free. Don't you think?
- rami
- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
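To make the \Q...\E versus quotemeta() exchange above concrete, here is a minimal sketch of two ways to quote dynamic input using only java.util.regex (J2SE 1.4). The names QuoteDemo, quoteLiteral and escapeMeta are illustrative assumptions, not part of the JDK or of jakarta-oro. escapeMeta only imitates the idea behind Perl's quotemeta by backslash-escaping every character outside [A-Za-z0-9_], and it assumes plain ASCII input.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, written for this illustration only.
final class QuoteDemo {

    // Wraps dynamic text in \Q...\E so java.util.regex treats it literally.
    // A literal "\E" inside the text would close the quote early, so each
    // occurrence is emitted as: end quote, escaped backslash, 'E', reopen quote.
    static String quoteLiteral(String text) {
        StringBuffer sb = new StringBuffer("\\Q");
        int from = 0;
        int idx;
        while ((idx = text.indexOf("\\E", from)) >= 0) {
            sb.append(text.substring(from, idx));
            sb.append("\\E\\\\E\\Q");
            from = idx + 2;
        }
        sb.append(text.substring(from));
        sb.append("\\E");
        return sb.toString();
    }

    // Backslash-escapes every character that is not an ASCII letter, digit
    // or underscore. Assumption: the input is plain ASCII text.
    static String escapeMeta(String text) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean word = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '_';
            if (!word) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dynamic = "**";
        System.out.println(quoteLiteral(dynamic));
        System.out.println(escapeMeta(dynamic));
        System.out.println(Pattern.matches(escapeMeta(dynamic), "**"));
    }
}
```

The backslash-escaped form relies on nothing beyond escaped literals, so it is the more likely of the two to carry over to another Perl5-style engine; the \Q...\E form depends on the engine understanding those markers.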
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
In message [EMAIL PROTECTED], Mario Ivankovits writes:
> It looks like Perl and Java are very (very) similar. So asking ORO to
The Java regex syntax is almost a superset of Perl, which is why I don't see the impact of using a Perl engine for JDK 1.3 and java.util.regex for J2SE 1.4 as being major. The expression Rami gave was straight Perl 5.005. jakarta-oro's Perl5Compiler/Perl5Matcher implements zero-width look-ahead assertions from Perl 5.003 but does not implement the zero-width look-behind assertions from 5.005 and future versions (if you don't ask for it ...). This can be added.
The other difference is that in Perl \Q and \E are not part of the regex syntax. They are part of Perl string handling, so we didn't implement them in Perl5Compiler (instead quotemeta() is provided), but support them in the Perl5Util convenience class. This can be moved into Perl5Compiler if desired. There has to be a user driver for these small things to happen.
In general, most regular expressions you see in the wild can be simplified and don't require unusual constructs. For example, why write \\Q**\\E when \\*\\* will do (you would usually want to use \Q and \E for longer sequences or for dynamically generated strings you want to escape; but quotemeta works equally well)? Why use a negative look-behind assertion in ((?!^)|[^/]) when [^/] will suffice (the negative look-behind assertion is redundant because if there's a character present that's not a slash, then it's not the start of the input)?
Of course, you can't always simplify your expressions and I think Rami's point is that you shouldn't be bothered with the finer points and stuff should just work. I think the answer is that as long as you stick to Perl5 syntax (which most people using java.util.regex are unknowingly doing), you'll rarely run into differences; but that oro doesn't implement most of the stuff added after Perl 5.003 for lack of demand (there's not that much stuff).
daniel
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
I wrote:
> Why use a negative look-behind assertion in ((?!^)|[^/]) when [^/] will suffice (the negative look-behind assertion is redundant because if there's a character present that's not a slash, then it's not the start of the input)?
I forgot to add that that's assuming single line mode.
daniel
RE: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
> Again, you use a common syntax, which is translated to the appropriate syntax for the implementation. If some specific feature of the common syntax is not supported in the implementation, then your RE will fail to translate, and should throw some sort of exception. Doesn't seem too difficult to deal with IMHO.
What is the common syntax? Could someone point me to the documentation? And what is this talk about translating? I thought there were only separate syntaxes (= dialects) and corresponding interpreters (= implementations).
- rami
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
In message [EMAIL PROTECTED], Rami Ojares writes:
> The problem with having a generic interface for different regex implementations is that the syntax and semantics of regexes are different. I want to know EXACTLY what my regexes match and what constructs/syntax I can use.
Somehow I missed this message. Sorry for the belated response. There are different use cases. What you say is absolutely right for the case where you're coding to a regex API and using those expressions directly in your code. But when you are dynamically fetching expressions, for example from a user interface dialog, it doesn't matter. You can specify what syntax is required for the input.
Also, when you're writing generic/reusable code it's of great help. For example, all of the split and substitute methods in org.apache.oro.text.regex.Util will work independent of the regex syntax used. org.apache.oro.io.RegexFilenameFilter will work with any regex engine. There are plenty of cases where you're writing regular expression code that is not dependent on the specific syntax. For those cases, having generic engines is very useful.
> The best candidate in my humble opinion for regex language is the one defined in jdk 1.4. What would be needed is a separate package that would implement jdk 1.4 regex lang and could be used together with older jdk's.
That would be a waste of effort in my opinion. Other than glob expressions, there is already a set of syntax common to most pattern matching languages. Since the whole point of the VFS discussion appears to be to support users who aren't using J2SE 1.4, all you have to do is use the syntax subset shared by Perl5 and java.util.regex, which is rather rich and useful. Anyway, that's my take given my understanding of what's being discussed.
daniel
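As an illustration of the "generic/reusable code" argument, the sketch below shows a filename filter that depends only on a small matching interface, so the regex engine behind it can be swapped. NameMatcher, JdkNameMatcher and MatcherFilenameFilter are hypothetical names invented for this example; they are not the ORO classes mentioned above, whose actual APIs are not reproduced here. Only the java.util.regex backend is shown.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.regex.Pattern;

// Hypothetical engine-neutral abstraction (an assumption, not an ORO or VFS type).
interface NameMatcher {
    boolean matches(String name);
}

// One possible backend: java.util.regex, which needs J2SE 1.4 or later.
final class JdkNameMatcher implements NameMatcher {
    private final Pattern pattern;

    JdkNameMatcher(String regex) {
        pattern = Pattern.compile(regex);
    }

    public boolean matches(String name) {
        return pattern.matcher(name).matches();
    }
}

// Generic code: it never sees the regex syntax, only the NameMatcher contract.
final class MatcherFilenameFilter implements FilenameFilter {
    private final NameMatcher matcher;

    MatcherFilenameFilter(NameMatcher matcher) {
        this.matcher = matcher;
    }

    public boolean accept(File dir, String name) {
        return matcher.matches(name);
    }
}

final class MatcherFilterDemo {
    public static void main(String[] args) {
        FilenameFilter filter = new MatcherFilenameFilter(new JdkNameMatcher(".*\\.txt"));
        String[] names = new File(".").list(filter);
        if (names != null) {
            for (int i = 0; i < names.length; i++) {
                System.out.println(names[i]);
            }
        }
    }
}
```

The caller who constructs the matcher still has to know which dialect the expression is written in; only the filter itself is syntax-independent, which is the distinction drawn in the message above.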
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
> Since the whole point of the VFS discussion appears to be to support users who aren't using J2SE 1.4, all you have to do is use the syntax subset shared by Perl5 and java.util.regex, which is rather rich and useful. Anyway, that's my take given my understanding of what's being discussed.
Ok. I am using \Q \E for quoting segments that should not be considered regexes. Then I use the ((?!^)|[^/])\\Q**\\E((?!$)|[^/]) construct when looking for illegal patterns. This means you can only have 2 stars in a row if they are delimited by
- start of pattern OR slash
AND
- end of pattern OR slash
Do these things work in all the different regex flavors? And if not, how can I do it otherwise?
- rami
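One possible answer, following the suggestion elsewhere in this thread that [^/] suffices, is sketched below. It flags a "**" that touches a non-slash character on either side and uses only character classes, escaped literals and alternation, which should fall inside the subset shared by Perl5 and java.util.regex. The class and method names are hypothetical, and the rule it encodes is only the one described in the message above. As a side note, (?!^) and (?!$) are written with negative look-ahead syntax, although the thread refers to them as look-behind.

```java
import java.util.regex.Pattern;

// Hypothetical checker, written for this illustration only.
final class DoubleStarCheck {

    // "**" is treated as illegal when a character other than '/' sits
    // directly before it or directly after it.
    private static final Pattern ILLEGAL =
            Pattern.compile("[^/]\\*\\*|\\*\\*[^/]");

    static boolean hasIllegalDoubleStar(String antPattern) {
        return ILLEGAL.matcher(antPattern).find();
    }

    public static void main(String[] args) {
        String[] samples = { "/dir/**/file", "**/file", "dir**/file", "dir/**file" };
        for (int i = 0; i < samples.length; i++) {
            System.out.println(samples[i] + " illegal? " + hasIllegalDoubleStar(samples[i]));
        }
    }
}
```

Because no anchors are used, the start and end of the pattern are accepted as delimiters simply because there is no character there for [^/] to consume.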
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Mario Ivankovits wrote:
> Hello! A contribution of Rami Ojares brings in a PatternSelector to handle ant-style patterns (/dir/**/file) to select files. This class currently uses the jdk1.4 regular expression library. Now there are some questions how to handle the regexp thing:
> [ ] Avoid dependency to jdk1.4.
> [ ] ... use jakarta-regexp
> [ ] ... use jakarta-oro
> [ ] ... use the regexp bundled with ant. But then we could not use the PatternSelector without the ant.jar and have to move it to the vfs.ant package. This powerful thing might then not be usable for projects without it.
> [X] Don't bother and use jdk1.4 as minimum requirement (and its regexp)
Anthony
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
In message [EMAIL PROTECTED], Noel J. Bergman writes:
> Daniel, are you still interested/trying to move ORO into Commons? What is
I'm interested in doing whatever it will take to get people using the library or who should be using the library more involved in development. At first I thought that moving into Jakarta Commons might do that, then when Apache Commons started up I thought maybe that was the right place. Now it looks like Jakarta Commons is the right place again, although it would be nice not to have to change the package names. As far as trying, the best I've been able to do is make some noises, but I haven't taken much action under the theory that a couple of other people would lead the charge like happened with Commons Net.
> happening in the ORO project in terms of developers?
Several important contributions were made by some users, but the contributions didn't keep on coming, so I've never called for a vote on granting committership despite their having expressed interest in becoming committers. So despite the contents of the avail file, we're back into a situation where I'm probably the only active committer (and I only make contributions in widely spaced bursts).
I think the committer deficit problem would be solved easily if some existing Apache committers who have needs that can be satisfied with jakarta-oro can be convinced to hack the code a little. I've also thought that jakarta-oro and jakarta-regexp ought to interact more and have the same committer base, but I've never started a campaign for that. Asking jakarta-regexp if they're interested in implementing the regexp wrapper engine for oro might get that started. For now though, if I can squeeze enough time out to do the couple of simple things to meet VFS's needs, that may be enough to get more involvement and work things out (whether oro sits where it is or moves partially or completely into jakarta commons).
daniel
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Hi,
The problem with having a generic interface for different regex implementations is that the syntax and semantics of regexes are different. I want to know EXACTLY what my regexes match and what constructs/syntax I can use. The implementations are not only implementations; they also define the form and meaning of the regexes that they use. Even though most of the constructs are the same, there are differences. Many packages let themselves be configured to understand different dialects, which proves my point.
Let's call the set of allowed regexes the regex language. The best candidate in my humble opinion for regex language is the one defined in jdk 1.4. What would be needed is a separate package that would implement jdk 1.4 regex lang and could be used together with older jdk's. Could ORO do this?
Therefore one can never avoid a dependency on the regex language one uses.
[X] Don't bother and use jdk1.4 as minimum requirement OR write a separate package that works exactly like jdk 1.4 on earlier jdk's
- Rami Ojares
RE: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Hello,
Our server product is and has been stuck on Java 1.3.1 for quite a while. This is a customer requirement, not ours. What I would like to see is ORO either evolve to provide a thin 1.4 bridge or have a new commons-regexp as we now have a commons-logging. I do not really care which one happens, but it would be nice to be able to automatically use the 1.4 libraries if they are there and fall back on ORO if not. I am assuming that the 1.4 RE support is better, for a vague enough definition of better.
Thank you,
Gary
-----Original Message-----
From: Daniel F. Savarese [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 15, 2004 13:07
To: Jakarta Commons Developers List
Subject: Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
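The "use the 1.4 libraries if they are there and fall back on ORO if not" idea could be approached with a runtime class lookup, roughly as sketched below. This is only the detection step under stated assumptions: RegexBackend is a hypothetical name, the two class names probed are the JDK's java.util.regex.Pattern and ORO's org.apache.oro.text.regex.Perl5Compiler, and the adapter code that would actually wrap each engine is deliberately left out.

```java
// Hypothetical detection helper; no such class exists in ORO, VFS or the JDK.
final class RegexBackend {
    static final String JDK = "java.util.regex.Pattern";
    static final String ORO = "org.apache.oro.text.regex.Perl5Compiler";

    // Returns true if the named class can be loaded in this JVM.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            return false;
        }
    }

    // Prefers the JDK engine, falls back to ORO, returns null if neither is present.
    static String choose() {
        if (isAvailable(JDK)) {
            return JDK;
        }
        if (isAvailable(ORO)) {
            return ORO;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("Selected regex backend: " + choose());
    }
}
```

For this to work on a 1.3 JVM, any code that refers to java.util.regex directly would have to live in a class that is only loaded after the check succeeds.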
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Rami Ojares wrote:
> The problem with having a generic interface for different regex implementations is that the syntax and semantics of regexes are different. I want to know EXACTLY what my regexes match and what constructs/syntax I can use.
The developer has to tell the ORO factory which regexp language he would like to use and therefore, once you have an instance, you know which language you have to use. This is not nice, and maybe we have to deal with 2 (or more) languages, but it might not be worth the time to bring in a meta-regexp-language.
-- Mario
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
The real question becomes: who is still using JDK 1.3, and does this population match the intended target for [vfs]? I was planning to integrate [vfs] into [configuration], but I consider JDK 1.3 compatibility a requirement since I'm deploying on WebSphere 4, and most of the time changing the application server is not an option :(
Emmanuel Bourg
Mario Ivankovits wrote:
> Now there are some questions how to handle the regexp thing:
> [X] Avoid dependency to jdk1.4.
> [ ] ... use jakarta-regexp
> [ ] ... use jakarta-oro
> [ ] ... use the regexp bundled with ant. But then we could not use the PatternSelector without the ant.jar and have to move it to the vfs.ant package. This powerful thing might then not be usable for projects without it.
> [ ] Don't bother and use jdk1.4 as minimum requirement (and its regexp)
> Here is my vote:
> [X] Don't bother and use jdk1.4 as minimum requirement
> But I don't know which jdk version will be the most used and this is why I started this poll. IMHO at least for a sandbox component it should be permitted to use this bleeding edge ;-) jdk-version.
RE: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
[X] Avoid dependency to jdk1.4.
[X] ... use jakarta-oro
I strongly disagree with making 1.4 the base requirement. JDK reqs have been discussed over and over on this list so I will not reiterate the arguments again, but in general I would say lower is better. From my POV, our product is stuck running on our /customers'/ lowest common denominator web servers, which are for the most part Java 1.2 and 1.3 based.
Thank you,
Gary
-----Original Message-----
From: Mario Ivankovits [mailto:[EMAIL PROTECTED]
Sent: Monday, June 14, 2004 02:23
To: Jakarta Commons Developers List
Subject: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Hello! A contribution of Rami Ojares brings in a PatternSelector to handle ant-style patterns (/dir/**/file) to select files. This class currently uses the jdk1.4 regular expression library. Now there are some questions how to handle the regexp thing:
[ ] Avoid dependency to jdk1.4.
[ ] ... use jakarta-regexp
[ ] ... use jakarta-oro
[ ] ... use the regexp bundled with ant. But then we could not use the PatternSelector without the ant.jar and have to move it to the vfs.ant package. This powerful thing might then not be usable for projects without it.
[ ] Don't bother and use jdk1.4 as minimum requirement (and its regexp)
Here is my vote:
[X] Don't bother and use jdk1.4 as minimum requirement
But I don't know which jdk version will be the most used and this is why I started this poll. IMHO at least for a sandbox component it should be permitted to use this bleeding edge ;-) jdk-version.
PS: While writing this poll, I noticed that there are already two dependencies on the jdk1.4, but those can be easily reverted.
-- Mario
RE: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
> Now there are some questions how to handle the regexp thing:
[X] Avoid dependency to jdk1.4.
JDK 1.3 should be the highest minimum standard unless the code absolutely positively needs java.nio.
--- Noel
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
On Mon, 14 Jun 2004, Mario Ivankovits wrote:
> Hello! A contribution of Rami Ojares brings in a PatternSelector to handle ant-style patterns (/dir/**/file) to select files. This class currently uses the jdk1.4 regular expression library. Now there are some questions how to handle the regexp thing:
> [X] Avoid dependency to jdk1.4.
> [ ] ... use jakarta-regexp
> [X] ... use jakarta-oro
> [ ] ... use the regexp bundled with ant.
IIRC, DFS changed ORO recently so that JDK 1.4 could be used, if desired, so having VFS use ORO makes sense to me.
-- Martin Cooper
> But then we could not use the PatternSelector without the ant.jar and have to move it to the vfs.ant package. This powerful thing might then not be usable for projects without it.
> [ ] Don't bother and use jdk1.4 as minimum requirement (and its regexp)
> Here is my vote:
> [X] Don't bother and use jdk1.4 as minimum requirement
> But I don't know which jdk version will be the most used and this is why I started this poll. IMHO at least for a sandbox component it should be permitted to use this bleeding edge ;-) jdk-version.
> PS: While writing this poll, I noticed that there are already two dependencies on the jdk1.4, but those can be easily reverted.
> -- Mario
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Mario Ivankovits wrote:
> Hello! A contribution of Rami Ojares brings in a PatternSelector to handle ant-style patterns (/dir/**/file) to select files. This class currently uses the jdk1.4 regular expression library. Now there are some questions how to handle the regexp thing:
> [ ] Avoid dependency to jdk1.4.
> [ ] ... use jakarta-regexp
> [ ] ... use jakarta-oro
> [ ] ... use the regexp bundled with ant. But then we could not use the PatternSelector without the ant.jar and have to move it to the vfs.ant package. This powerful thing might then not be usable for projects without it.
> [X] Don't bother and use jdk1.4 as minimum requirement (and its regexp)
The best would be if the Ant pattern system, with only **, * and ?, could easily be extracted into one or two classes. If I understand correctly, if jdk 1.4 is the minimum requirement and someone uses VFS with jdk 1.2 or 1.3, only the PatternSelector will throw a ClassNotFoundException, right?
Anthony
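To give a feel for how small a matcher limited to **, * and ? can be, here is a regex-free sketch that sticks to APIs available before J2SE 1.4. It is not Ant's code and makes simplifying assumptions: AntStyleMatcher is a hypothetical name, "**" is only special when it forms a whole path segment, and leading, trailing or doubled slashes are ignored because the path is split with StringTokenizer.

```java
import java.util.StringTokenizer;

// Hypothetical matcher written for this illustration; not taken from Ant or VFS.
final class AntStyleMatcher {

    // Returns true if the path matches the Ant-style pattern.
    static boolean matches(String pattern, String path) {
        return matchSegments(split(pattern), 0, split(path), 0);
    }

    // Splits on '/' and drops empty segments (so a leading slash is ignored).
    private static String[] split(String s) {
        StringTokenizer st = new StringTokenizer(s, "/");
        String[] parts = new String[st.countTokens()];
        for (int i = 0; i < parts.length; i++) {
            parts[i] = st.nextToken();
        }
        return parts;
    }

    private static boolean matchSegments(String[] pat, int p, String[] path, int s) {
        if (p == pat.length) {
            return s == path.length;
        }
        if (pat[p].equals("**")) {
            // "**" may absorb zero or more whole path segments.
            for (int next = s; next <= path.length; next++) {
                if (matchSegments(pat, p + 1, path, next)) {
                    return true;
                }
            }
            return false;
        }
        return s < path.length
                && matchSegment(pat[p], 0, path[s], 0)
                && matchSegments(pat, p + 1, path, s + 1);
    }

    // Within one segment: '*' matches any run of characters, '?' exactly one.
    private static boolean matchSegment(String pat, int p, String str, int s) {
        if (p == pat.length()) {
            return s == str.length();
        }
        char c = pat.charAt(p);
        if (c == '*') {
            for (int next = s; next <= str.length(); next++) {
                if (matchSegment(pat, p + 1, str, next)) {
                    return true;
                }
            }
            return false;
        }
        return s < str.length()
                && (c == '?' || c == str.charAt(s))
                && matchSegment(pat, p + 1, str, s + 1);
    }

    public static void main(String[] args) {
        System.out.println(matches("/dir/**/file", "/dir/a/b/file"));
        System.out.println(matches("/dir/*.txt", "/dir/sub/a.txt"));
    }
}
```

The recursive backtracking keeps the code short but can be slow for patterns with many wildcards; a production version would likely want an iterative matcher.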