Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-24 Thread Rami Ojares
Daniel wrote:

 The Java regex syntax is almost a superset of Perl, which is why I don't
 see the impact of using a Perl engine for JDK 1.3 and java.util.regex
 for J2SE 1.4 as being major.  The expression Rami gave was straight
 Perl 5.005.  jakarta-oro's Perl5Compiler/Perl5Matcher implements
 zero-width look-ahead assertions from Perl 5.003 but does not implement
 the zero-width look-behind assertions from 5.005 and future versions (if
 you don't ask for it ...).  This can be added.  The other difference is
 that in Perl \Q and \E are not part of the regex syntax.  They are part
 of Perl string handling, so we didn't implement them in Perl5Compiler
 (instead quotemeta() is provided), but support them in the Perl5Util
 convenience class.  This can be moved into Perl5Compiler if desired.
 There has to be a user driver for these small things to happen.

Very true. It is also obvious that Java has followed in the footsteps
of Perl, which has a much longer history with regexes. The reason they are not
compatible is the lack of standardisation on the Perl side.
Since Java folks have always put much effort into internationalization,
I think Java regexes have made an extra effort with the handling of Unicode.

If regexes were ever standardized, Perl would deserve the biggest say
in that committee.

However, for that standard I feel that all aspects of the language should be
encoded inside the language rather than outside it (like embedded SQL, or quotemeta()
for regexes). Otherwise the language will never be defined exactly but will have loose
boundaries.

 In general, most regular expressions you see in the wild can be
 simplified and don't require unusual constructs.  For example, why
 write \\Q**\\E when \\*\\* will do (you would usually want to use
 \Q and \E for longer sequences or for dynamically generated strings you
 want to escape; but quotemeta works equally well)?

I am using quoting with dynamic input so I need the feature.
Now I have been told that I need to support JAVA, PERL5 and POSIX syntaxes.
So in case of Java I have to use \\Q and \\E
In case of PERL5 I have to use quotemeta()
And in case of POSIX I have no clue !
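
To make the quoting options concrete, here is a minimal Java sketch. It compares the \Q...\E form accepted by java.util.regex with a hand-written escaper in the spirit of Perl's quotemeta(). The class name QuoteDemo and the helper quoteMeta are made up for illustration and are not part of ORO or the JDK. The helper only handles ASCII input; whether backslash-escaping every punctuation character is acceptable to a given POSIX engine is an assumption that would need checking against that engine.

```java
import java.util.regex.Pattern;

final class QuoteDemo {
    // Hypothetical helper: backslash-escape every ASCII character that is not
    // a letter, digit or underscore, similar in spirit to Perl's quotemeta().
    // Non-ASCII characters are passed through unchanged.
    static String quoteMeta(String literal) {
        StringBuilder sb = new StringBuilder(literal.length() * 2);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            boolean word = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '_';
            if (c < 128 && !word) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "a**b"; // stands in for dynamic user input

        // java.util.regex form; breaks if the input itself contains \E
        Pattern quoted = Pattern.compile("\\Q" + input + "\\E");
        // Escaped form; uses only backslash escapes
        Pattern escaped = Pattern.compile(quoteMeta(input));

        System.out.println(quoted.matcher("xa**by").find());
        System.out.println(escaped.matcher("xa**by").find());
    }
}
```

The escaped form avoids \Q and \E entirely, so the same generated string could be handed to an engine that does not treat them as regex syntax.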

 Why use a negative
 look-behind assertion in ((?!^)|[^/]) when [^/] will suffice (the
 negative look-behind assertion is redundant because if there's a character
 present that's not a slash, then it's not the start of the input)?

Thanks for the tip! I am an occasional regex user :=)

  Of
 course, you can't always simplify your expressions and I think Rami's point
 is that you shouldn't be bothered with the finer points and stuff should
 just work.

Thank you for understanding my intention so well !

 I think the answer is that as long as you stick to Perl5 syntax
 (which most people using java.util.regex are unknowingly doing), you'll
 rarely run into differences; but that oro doesn't implement most of the
 stuff added after Perl 5.003 for lack of demand (there's not that much stuff).
(And from above)
 There has to be a user driver for these small things to happen.

I think there is a user driver: users could then read one piece of
well-written documentation about regexes and use them worry-free.
Don't you think?

- rami

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-21 Thread Daniel F. Savarese

In message [EMAIL PROTECTED], Mario Ivankovits writes:
It looks like Perl and Java are very (very) similar. So asking ORO to

The Java regex syntax is almost a superset of Perl, which is why I don't
see the impact of using a Perl engine for JDK 1.3 and java.util.regex
for J2SE 1.4 as being major.  The expression Rami gave was straight
Perl 5.005.  jakarta-oro's Perl5Compiler/Perl5Matcher implements
zero-width look-ahead assertions from Perl 5.003 but does not implement
the zero-width look-behind assertions from 5.005 and future versions (if
you don't ask for it ...).  This can be added.  The other difference is
that in Perl \Q and \E are not part of the regex syntax.  They are part
of Perl string handling, so we didn't implement them in Perl5Compiler
(instead quotemeta() is provided), but support them in the Perl5Util
convenience class.  This can be moved into Perl5Compiler if desired.
There has to be a user driver for these small things to happen.

In general, most regular expressions you see in the wild can be
simplified and don't require unusual constructs.  For example, why
write \\Q**\\E when \\*\\* will do (you would usually want to use
\Q and \E for longer sequences or for dynamically generated strings you
want to escape; but quotemeta works equally well)?  Why use a negative
look-behind assertion in ((?!^)|[^/]) when [^/] will suffice (the
negative look-behind assertion is redundant because if there's a character
present that's not a slash, then it's not the start of the input)?  Of
course, you can't always simplify your expressions and I think Rami's point
is that you shouldn't be bothered with the finer points and stuff should
just work.  I think the answer is that as long as you stick to Perl5 syntax
(which most people using java.util.regex are unknowingly doing), you'll
rarely run into differences; but that oro doesn't implement most of the
stuff added after Perl 5.003 for lack of demand (there's not that much stuff).

daniel






Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-21 Thread Daniel F. Savarese

I wrote:
want to escape; but quotemeta works equally well)?  Why use a negative
look-behind assertion in ((?!^)|[^/]) when [^/] will suffice (the
negative look-behind assertion is redundant because if there's a character
present that's not a slash, then it's not the start of the input)?  Of

I forgot to add that that's assuming single line mode.

daniel






RE: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-18 Thread Rami Ojares
 Again, you use a common syntax, which is translated to the appropriate
 syntax for the implementation.  If some specific feature of the common
 syntax is not supported in the implementation, then your RE will fail to
 translate, and should throw some sort of exception.  Doesn't seem too
 difficult to deal with IMHO.
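
The translate-or-fail idea described above can be sketched roughly as follows. Everything here is hypothetical: the class CommonSyntaxTranslator, the Dialect names, and the rule that only look-behind is rejected are assumptions for illustration, not an existing API. The substring check is deliberately naive and would misfire on an escaped "\(?<=".

```java
final class CommonSyntaxTranslator {
    /** Hypothetical target dialects; the names are illustrative only. */
    enum Dialect { PERL5_003, JAVA_1_4 }

    /**
     * Returns the expression to hand to the target engine, or throws if the
     * expression uses a construct the target is assumed not to support.
     */
    static String translate(String common, Dialect target) {
        // Naive detection of look-behind assertions by substring search.
        boolean usesLookBehind =
                common.indexOf("(?<=") >= 0 || common.indexOf("(?<!") >= 0;
        if (usesLookBehind && target == Dialect.PERL5_003) {
            throw new IllegalArgumentException(
                    "look-behind is not available in " + target + ": " + common);
        }
        // Constructs shared by both dialects pass through unchanged.
        return common;
    }

    public static void main(String[] args) {
        System.out.println(translate("[^/]\\*\\*", Dialect.PERL5_003));
        try {
            translate("(?<!a)b", Dialect.PERL5_003);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```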

What is common syntax?
Could someone point me to the documentation.
And what is this talk about translating?
I thought there are only separate syntaxes (= dialect)
and corresponding interpreters (= implementations).

- rami




Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-17 Thread Daniel F. Savarese

In message [EMAIL PROTECTED], Rami Ojares writes:
The problem with having a generic interface for different regex implementations
is that the syntax and semantics of regexes are different. I want to know
EXACTLY what my regexes match and what constructs/syntax I can use.

Somehow I missed this message.  Sorry for the belated response.
There are different use cases.  What you say is absolutely right
for the case where you're coding to a regex API and using those
expressions directly in your code.  But when you are dynamically
fetching expressions, for example from a user interface dialog,
it doesn't matter.  You can specify what syntax is required for
the input.  Also, when you're writing generic/reusable code it's
of great help.  For example, all of the split and substitute methods
in the org.apache.oro.text.regex.Util will work independent of the
regex syntax used.  org.apache.oro.io.RegexFilenameFilter will work
with any regex engine.  There are plenty of cases where you're writing
regular expression code that is not dependent on the specific syntax.
For those cases, having generic engines is very useful.
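
As a rough illustration of syntax-independent code, the sketch below builds a java.io.FilenameFilter on top of a tiny engine-neutral interface. The interface CompiledPattern and the class GenericFilterSketch are invented for this example and are not ORO's actual types; the java.util.regex adapter is just one possible backend.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.regex.Pattern;

final class GenericFilterSketch {
    /** Hypothetical engine-neutral abstraction: something that can test a string. */
    interface CompiledPattern {
        boolean matches(String input);
    }

    /** One possible adapter, backed by java.util.regex. */
    static CompiledPattern javaRegex(String expression) {
        final Pattern p = Pattern.compile(expression);
        return new CompiledPattern() {
            public boolean matches(String input) {
                return p.matcher(input).matches();
            }
        };
    }

    /** The filter never learns which engine or syntax produced the pattern. */
    static FilenameFilter filterFor(final CompiledPattern pattern) {
        return new FilenameFilter() {
            public boolean accept(File dir, String name) {
                return pattern.matches(name);
            }
        };
    }

    public static void main(String[] args) {
        FilenameFilter f = filterFor(javaRegex("[^/]*\\.txt"));
        System.out.println(f.accept(new File("."), "notes.txt"));
        System.out.println(f.accept(new File("."), "notes.java"));
    }
}
```

A second adapter wrapping another engine could be swapped in without touching filterFor.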

The best candidate in my humble opinion for regex language is the one defined in
jdk 1.4. What would be needed is a separate package that would implement jdk 1.4
regex lang and could be used together with older jdk's.

That would be a waste of effort in my opinion.  Other than glob expressions,
there is already a set of syntax common to most pattern matching languages.
Since the whole point of the VFS discussion appears to be to support
users who aren't using J2SE 1.4, all you have to do is use the syntax
subset shared by Perl5 and java.util.regex, which is rather rich and
useful.  Anyway, that's my take given my understanding of what's being
discussed.

daniel






Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-17 Thread Rami Ojares

 Since the whole point of the VFS discussion appears to be to support
 users who aren't using J2SE 1.4, all you have to do is use the syntax
 subset shared by Perl5 and java.util.regex, which is rather rich and
 useful.  Anyway, that's my take given my understanding of what's being
 discussed.

Ok.
I am using \Q \E for quoting segments that should not be considered regexes.
Then I use the ((?!^)|[^/])\\Q**\\E((?!$)|[^/]) construct when looking for
illegal patterns (this means you can only have 2 stars in a row if they
are delimited by
- start of pattern OR slash
AND
- end of pattern OR slash).
Do these things work in all different regex flavors?
And if not then how can I do it otherwise?
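
One way to express the rule above using only constructs shared by Perl5 and java.util.regex (no look-around, no \Q...\E) is sketched below. It is based on a reading of the prose rule, not on the original PatternSelector code: "**" is treated as illegal whenever a non-slash character sits directly before or directly after it. The class and method names are made up for the example.

```java
import java.util.regex.Pattern;

final class DoubleStarCheck {
    // Either a non-slash character followed by "**",
    // or "**" followed by a non-slash character.
    private static final Pattern ILLEGAL =
            Pattern.compile("[^/]\\*\\*|\\*\\*[^/]");

    /** True if the ant-style pattern contains a "**" that is not delimited correctly. */
    static boolean hasIllegalDoubleStar(String antPattern) {
        return ILLEGAL.matcher(antPattern).find();
    }

    public static void main(String[] args) {
        System.out.println(hasIllegalDoubleStar("/dir/**/file")); // legal use
        System.out.println(hasIllegalDoubleStar("/dir/a**/file")); // illegal use
    }
}
```

Because the start and end of the pattern have no character next to the stars, they are accepted without an explicit anchor. Under this reading, three stars in a row count as illegal.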

- rami





Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-15 Thread Anthony Goubard
Mario Ivankovits wrote:
Hello!
A contribution of Rami Ojares brings in a PatternSelector to handle
ant-style patterns (/dir/**/file) to select files.
This class currently uses the jdk1.4 regular expression library.

Now there are some questions how to handle the regexp thing:
[ ] Avoid dependency to jdk1.4.
[ ] ... use jakarta-regexp
[ ] ... use jakarta-oro
[ ] ... use the regexp bundled with ant. But then we could not use the
PatternSelector without the ant.jar and have to move it to the vfs.ant
package. This powerful thing might then not be usable for projects
without it.

[X] Don't bother and use jdk1.4 as minimum requirement (and its regexp)
Anthony


Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-15 Thread Daniel F. Savarese

In message [EMAIL PROTECTED], Noel J. Bergman writes:
Daniel, are you still interested/trying to move ORO into Commons?  What is

I'm interested in doing whatever it will take to get people using the library
or who should be using the library more involved in development.  At first
I thought that moving into Jakarta Commons might do that, then when Apache
Commons started up I thought maybe that was the right place.  Now it looks
like Jakarta Commons is the right place again, although it would be nice
not to have to change the package names.  As far as trying, the best I've
been able to do is make some noises, but I haven't taken much action under
the theory that a couple of other people would lead the charge like happened
with Commons Net.

happening in the ORO project in terms of developers?

Several important contributions were made by some users, but the contributions
didn't keep on coming, so I've never called for a vote on granting
committership despite their having expressed interest in becoming
committers.  So despite the contents of the avail file, we're back into
a situation where I'm probably the only active committer (and I only make
contributions in widely spaced bursts).  I think the committer deficit
problem would be solved easily if some existing Apache committers who have
needs that can be satisfied with jakarta-oro can be convinced to hack the
code a little.  I've also thought that jakarta-oro and jakarta-regexp ought
to interact more and have the same committer base, but I've never started
a campaign for that.  Asking jakarta-regexp if they're interested in
implementing the regexp wrapper engine for oro might get that started.
For now though, if I can squeeze enough time out to do the couple of
simple things to meet VFS's needs, that may be enough to get more
involvement and work things out (whether oro sits where it is or
moves partially or completely into jakarta commons).

daniel






Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-15 Thread Rami Ojares
Hi,

The problem with having a generic interface for different regex implementations
is that the syntax and semantics of regexes are different. I want to know
EXACTLY what my regexes match and what constructs/syntax I can use.

The implementations are not only implementations but they define also the form
and meaning of the regexes that they use.

Even though most of the constructs are the same there are differences. Many
packages let themselves be configured to understand different dialects which
proves my point.

Let's call the set of allowed regexes the regex language.

The best candidate in my humble opinion for regex language is the one defined in
jdk 1.4. What would be needed is a separate package that would implement jdk 1.4
regex lang and could be used together with older jdk's.

Could ORO do this?

Therefore one can never avoid a dependency on the regex language one uses.

[X] Don't bother and use jdk1.4 as minimum requirement OR write a separate package
that works exactly like jdk 1.4 on earlier jdk's

- Rami Ojares




RE: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-15 Thread Gary Gregory
Hello,

Our server product is and has been stuck on Java 1.3.1 for quite a
while. This is a customer requirement, not ours. What I would like to
see is ORO either evolve to provide a thin 1.4 bridge or have a new
commons-regexp as we now have a commons-logging. I do not really care
which one happens but it would be nice to be able to automatically use
the 1.4 libraries if they are there and fall back on ORO if not. I am
assuming that the 1.4 RE support is better for a vague enough
definition of better.
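
The "use 1.4 if it is there, otherwise fall back on ORO" idea could be probed at runtime roughly like this. It is a minimal sketch: the class RegexBackendProbe and the returned backend labels are invented for illustration, and a real bridge would go on to instantiate an adapter for the chosen engine rather than return a string.

```java
final class RegexBackendProbe {
    /** True if the named class can be loaded by the current class loader. */
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but could not be linked; treat as unavailable.
            return false;
        }
    }

    /** Prefer the JDK engine when present, otherwise look for jakarta-oro. */
    static String chooseBackend() {
        if (isClassAvailable("java.util.regex.Pattern")) {
            return "java.util.regex";
        }
        if (isClassAvailable("org.apache.oro.text.regex.Perl5Compiler")) {
            return "jakarta-oro";
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(chooseBackend());
    }
}
```

Loading by name keeps the probing class itself free of compile-time references to either engine, which is what would let it load on a 1.3 runtime.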

Thank you,
Gary 

 -Original Message-
 From: Daniel F. Savarese [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, June 15, 2004 13:07
 To: Jakarta Commons Developers List
 Subject: Re: [vfs][all][poll]regular expression library or jdk1.4 as
 minimum requirement
 
 
 In message [EMAIL PROTECTED], Noel J. Bergman writes:
 Daniel, are you still interested/trying to move ORO into Commons?  What is
 
 I'm interested in doing whatever it will take to get people using the library
 or who should be using the library more involved in development.  At first
 I thought that moving into Jakarta Commons might do that, then when Apache
 Commons started up I thought maybe that was the right place.  Now it looks
 like Jakarta Commons is the right place again, although it would be nice
 not to have to change the package names.  As far as trying, the best I've
 been able to do is make some noises, but I haven't taken much action under
 the theory that a couple of other people would lead the charge like happened
 with Commons Net.
 
 happening in the ORO project in terms of developers?
 
 Several important contributions were made by some users, but the contributions
 didn't keep on coming, so I've never called for a vote on granting
 committership despite their having expressed interest in becoming
 committers.  So despite the contents of the avail file, we're back into
 a situation where I'm probably the only active committer (and I only make
 contributions in widely spaced bursts).  I think the committer deficit
 problem would be solved easily if some existing Apache committers who have
 needs that can be satisfied with jakarta-oro can be convinced to hack the
 code a little.  I've also thought that jakarta-oro and jakarta-regexp ought
 to interact more and have the same committer base, but I've never started
 a campaign for that.  Asking jakarta-regexp if they're interested in
 implementing the regexp wrapper engine for oro might get that started.
 For now though, if I can squeeze enough time out to do the couple of
 simple things to meet VFS's needs, that may be enough to get more
 involvement and work things out (whether oro sits where it is or
 moves partially or completely into jakarta commons).
 
 daniel
 
 
 



Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-15 Thread Mario Ivankovits
Rami Ojares wrote:
The problem with having a generic interface for different regex implementations
is that the syntax and semantics of regexes are different. I want to know
EXACTLY what my regexes match and what constructs/syntax I can use.
 

The developer has to tell the ORO factory which regexp language he would
like to use; therefore, once you have an instance, you know which
language you have to use.
This is not nice, and maybe we have to deal with 2 (or more) languages,
but it might not be worth the time to bring in a meta-regexp-language.

--
Mario


Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-14 Thread Emmanuel Bourg
The real question turns to: who is still using the jdk 1.3, and does 
this population match the intended target for [vfs] ? I was planning to 
integrate [vfs] into [configuration] but I consider the jdk 1.3 
compatibility as a requirement since I'm deploying on WebSphere 4, and 
most of the time changing the application server is not an option :(

Emmanuel Bourg
Mario Ivankovits wrote:
Now there are some questions how to handle the regexp thing:
[X] Avoid dependency to jdk1.4.
[ ] ... use jakarta-regexp
[ ] ... use jakarta-oro
[ ] ... use the regexp bundled with ant. But then we could not use the
PatternSelector without the ant.jar and have to move it to the vfs.ant
package. This powerful thing might then not be usable for projects
without it.

[ ] Don't bother and use jdk1.4 as minimum requirement (and its regexp)
Here is my vote:
[X] Don't bother and use jdk1.4 as minimum requirement
But I don't know which jdk version will be the most used and this is why
I started this poll.
IMHO at least for a sandbox component it should be permitted to use this
bleeding edge ;-) jdk-version.


RE: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-14 Thread Gary Gregory
 [X] Avoid dependency to jdk1.4.
 [X] ... use jakarta-oro

I strongly disagree with making 1.4 the base requirement. 

JDK reqs have been discussed over and over on this list so I will not
reiterate the arguments again but in general I would say lower is
better. 

From my POV, our product is stuck running on our /customers'/ lowest
common denominator web servers, which are for the most part Java 1.2 and
1.3 based.

Thank you,
Gary 
 -Original Message-
 From: Mario Ivankovits [mailto:[EMAIL PROTECTED]
 Sent: Monday, June 14, 2004 02:23
 To: Jakarta Commons Developers List
 Subject: [vfs][all][poll]regular expression library or jdk1.4 as minimum
 requirement
 
 Hello!
 
 A contribution of Rami Ojares brings in a PatternSelector to handle
 ant-style patterns (/dir/**/file) to select files.
 This class currently uses the jdk1.4 regular expression library.
 
 Now there are some questions how to handle the regexp thing:
 [ ] Avoid dependency to jdk1.4.
 [ ] ... use jakarta-regexp
 [ ] ... use jakarta-oro
 [ ] ... use the regexp bundled with ant. But then we could not use the
 PatternSelector without the ant.jar and have to move it to the vfs.ant
 package. This powerful thing might then not be usable for projects
 without it.
 
 [ ] Don't bother and use jdk1.4 as minimum requirement (and its regexp)
 
 
 Here is my vote:
 [X] Don't bother and use jdk1.4 as minimum requirement
 But I don't know which jdk version will be the most used and this is why
 I started this poll.
 IMHO at least for a sandbox component it should be permitted to use this
 bleeding edge ;-) jdk-version.
 
 
 PS: During writing of this poll, I noticed that there are already two
 dependencies to the jdk1.4, but those can be easily reverted.
 
 --
 Mario
 
 



RE: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-14 Thread Noel J. Bergman
Now there are some questions how to handle the regexp thing:
[X] Avoid dependency to jdk1.4.

JDK 1.3 should be the highest minimum standard unless the code absolutely
positively needs java.nio.

--- Noel





Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-14 Thread Martin Cooper

On Mon, 14 Jun 2004, Mario Ivankovits wrote:
Hello!
A contribution of Rami Ojares brings in a PatternSelector to handle
ant-style patterns (/dir/**/file) to select files.
This class currently uses the jdk1.4 regular expression library.

Now there are some questions how to handle the regexp thing:
[X] Avoid dependency to jdk1.4.
[ ] ... use jakarta-regexp
[X] ... use jakarta-oro
[ ] ... use the regexp bundled with ant.
IIRC, DFS changed ORO recently so that JDK 1.4 could be used, if desired, 
so having VFS use ORO makes sense to me.

--
Martin Cooper

But then we could not use the
PatternSelector without the ant.jar and have to move it to the vfs.ant
package. This powerful thing might then not be usable for projects without
it.

[ ] Don't bother and use jdk1.4 as minimum requirement (and its regexp)
Here is my vote:
[X] Don't bother and use jdk1.4 as minimum requirement
But I don't know which jdk version will be the most used and this is why I
started this poll.
IMHO at least for a sandbox component it should be permitted to use this
bleeding edge ;-) jdk-version.

PS: During writing of this poll, I noticed that there are already two
dependencies to the jdk1.4, but those can be easily reverted.

--
Mario


Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement

2004-06-14 Thread Anthony Goubard
Mario Ivankovits wrote:
Hello!
A contribution of Rami Ojares brings in a PatternSelector to handle
ant-style patterns (/dir/**/file) to select files.
This class currently uses the jdk1.4 regular expression library.

Now there are some questions how to handle the regexp thing:
[ ] Avoid dependency to jdk1.4.
[ ] ... use jakarta-regexp
[ ] ... use jakarta-oro
[ ] ... use the regexp bundled with ant. But then we could not use the
PatternSelector without the ant.jar and have to move it to the vfs.ant
package. This powerful thing might then not be usable for projects
without it.

[X] Don't bother and use jdk1.4 as minimum requirement (and its regexp)
The best would be if the Ant pattern system, with only **, * and ?, could
be easily extracted into one or two classes.
If I understand correctly, if jdk 1.4 is the minimum requirement and
someone uses VFS with jdk 1.2 or 1.3, only the PatternSelector will throw
a ClassNotFoundException, right?
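
To give a feel for how small a regex-free matcher for **, * and ? could be, here is a minimal recursive sketch. It is not Ant's implementation, and the class name AntGlobSketch is invented. The semantics are simplified assumptions: ? matches one non-slash character, * matches any run of non-slash characters, and ** matches any run of characters including slashes. Unlike Ant, /dir/**/file does not match /dir/file here, and the backtracking can be slow on pathological patterns.

```java
final class AntGlobSketch {
    /** True if the whole path matches the whole pattern. */
    static boolean matches(String pattern, String path) {
        return match(pattern, 0, path, 0);
    }

    private static boolean match(String p, int pi, String s, int si) {
        if (pi == p.length()) {
            return si == s.length();
        }
        char c = p.charAt(pi);
        if (c == '*') {
            boolean doubleStar = pi + 1 < p.length() && p.charAt(pi + 1) == '*';
            int next = doubleStar ? pi + 2 : pi + 1;
            // Let the star consume s[si..k) for growing k and try the rest each time.
            for (int k = si; k <= s.length(); k++) {
                if (match(p, next, s, k)) {
                    return true;
                }
                // A single star must not consume a slash.
                if (k < s.length() && !doubleStar && s.charAt(k) == '/') {
                    break;
                }
            }
            return false;
        }
        if (si == s.length()) {
            return false;
        }
        if (c == '?') {
            return s.charAt(si) != '/' && match(p, pi + 1, s, si + 1);
        }
        return c == s.charAt(si) && match(p, pi + 1, s, si + 1);
    }

    public static void main(String[] args) {
        System.out.println(matches("/dir/**/file", "/dir/a/b/file"));
        System.out.println(matches("*.java", "a/Foo.java"));
    }
}
```

The sketch uses only String methods, so nothing in it depends on java.util.regex.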

Anthony