Fantastic! Thanks for the analysis. Looks like we should be able to
compile the JavaScript regexp to a Java regexp and then use
java.util.regex. I'll put that on my queue :-)

--Norris

On Apr 22, 12:32 pm, "John Cowan" <[EMAIL PROTECTED]> wrote:
> On Tue, Apr 22, 2008 at 9:52 AM, Norris Boyd <[EMAIL PROTECTED]> wrote:
> >  We welcome contributions and contributors; see
> >  http://developer.mozilla.org/en/docs/Rhino_Wish_List.
>
> Your list asks about ECMAscript regular expressions.  As far as I can
> tell by closely comparing the 3rd Edition with the Javadoc for
> java.util.regex.Pattern (supplemented by a few experiments), they are
> a proper subset of Java regular expressions with the following three
> exceptions:
>
> Java does not support the \v escape:  use \cK instead.
>
> Java does not support the \0 escape: use \x00 instead.
>
> Java does not support the \b escape within character classes: for
> [...\b...] read [...\cH...].
>
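A rough sketch of what that rewrite step could look like, covering only the three exceptions listed above. The class and method names are hypothetical (nothing like this exists in Rhino as far as this thread shows), and it uses \x0B and \x08 rather than control escapes so the output does not depend on how Java treats the letter case after \c. It does not attempt to handle other differences such as a literal [ inside a character class.

```java
import java.util.regex.Pattern;

public class EcmaToJavaRegex {
    // Hypothetical helper (an assumption for illustration, not Rhino code):
    // rewrites \v, \0 and in-class \b into forms java.util.regex accepts.
    static String translate(String js) {
        StringBuilder out = new StringBuilder();
        boolean inClass = false; // true while between an unescaped '[' and ']'
        for (int i = 0; i < js.length(); i++) {
            char c = js.charAt(i);
            if (c == '\\' && i + 1 < js.length()) {
                char next = js.charAt(++i);
                boolean digitFollows = i + 1 < js.length()
                        && Character.isDigit(js.charAt(i + 1));
                if (next == 'v') {
                    out.append("\\x0B");          // vertical tab
                } else if (next == '0' && !digitFollows) {
                    out.append("\\x00");          // NUL
                } else if (next == 'b' && inClass) {
                    out.append("\\x08");          // backspace, only inside [...]
                } else {
                    out.append('\\').append(next); // leave every other escape alone
                }
            } else {
                if (c == '[' && !inClass) {
                    inClass = true;
                } else if (c == ']' && inClass) {
                    inClass = false;
                }
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints [\x08]\x0B\x00
        System.out.println(translate("[\\b]\\v\\0"));
        // \b outside a class is still a word boundary, so it is untouched
        System.out.println(translate("\\bfoo\\b"));
        // the translated pattern compiles and matches a backspace character
        System.out.println(Pattern.matches(translate("[\\b]"), "\b"));
    }
}
```

The translated string is then handed to Pattern.compile as usual; flags such as g, i and m would need to be mapped separately.
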
> Java also provides the following extensions over ECMAscript:
>
> Octal escapes (\0n, \0nn, \0mnn with m in 0-3)
> \a (same as \cG) and \e (same as \x1b)
> Posix, Unicode, and Java-specific character classes with \p and \P
> \A (beginning of input), \z  (end of input), and \Z (end of input
> except for final line terminator)
> Possessive quantifiers ?+, *+, ++ (match as much as possible even if
> other parts fail as a result)
> \Q and \E (force all characters in between to be escaped)
> (?<=X) and (?<!X) for positive and negative lookbehind
> (?idmsux) Turn on special matching flags
> (?idmsux:X) Turn on special matching flags in this group
> Character class union (by concatenation) and intersection (with &&)
>
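To make a few of those extensions concrete, here is a small demonstration against java.util.regex. The class name and the found helper are made up for this example; the patterns are ordinary Java regexps.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JavaRegexExtensions {
    // Illustrative helper: returns the first substring the regex finds, or null.
    static String found(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Greedy a* gives one 'a' back so the final 'a' can match.
        System.out.println(Pattern.matches("a*a", "aaa"));        // true
        // Possessive a*+ keeps all three and never backtracks.
        System.out.println(Pattern.matches("a*+a", "aaa"));       // false

        // \Q...\E quotes everything in between, so '.' is a literal dot.
        System.out.println(Pattern.matches("\\Qa.b\\E", "a.b"));  // true
        System.out.println(Pattern.matches("\\Qa.b\\E", "axb"));  // false

        // Positive lookbehind: digits only when preceded by '$'.
        System.out.println(found("(?<=\\$)\\d+", "cost: $42"));   // 42

        // \Z tolerates a final line terminator, \z does not.
        System.out.println(found("foo\\Z", "foo\n") != null);     // true
        System.out.println(found("foo\\z", "foo\n") != null);     // false

        // Embedded flag: (?i) turns on case-insensitive matching.
        System.out.println(Pattern.matches("(?i)rhino", "RHINO")); // true
    }
}
```

None of these patterns are valid ECMAScript 3rd Edition syntax, so a translator never has to produce them; they only matter if Java-only syntax is allowed to leak through.
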
> The Java syntax for character class union and intersection provokes
> incompatible interpretations in certain cases: for example,
> [a-z&&[^d-f]] is the same as [a-cg-z] in Java (modulo locale issues),
> but in ECMAscript it should match any of a-z&^[ followed by ].
> However, this is a very improbable way of writing that regular
> expression in ECMAscript (or any non-Java regular expression
> language), so the syntax is *in effect* backward compatible.
> Likewise, [a-z[] is invalid in Java (erroneous nested character class)
> but should match any of a-z or [ in ECMAscript.
>
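The two character-class cases above are easy to check from the Java side. A minimal sketch (class and method names are invented for the example): it shows Java's reading of [a-z&&[^d-f]], shows that [a-z[] is rejected, and shows that backslash-escaping & and [ inside the class recovers the ECMAScript reading.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class CharClassDifferences {
    // Returns true if java.util.regex accepts the pattern at all.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Java reads this as "a-z intersected with not d-f", i.e. [a-cg-z].
        System.out.println(Pattern.matches("[a-z&&[^d-f]]", "g")); // true
        System.out.println(Pattern.matches("[a-z&&[^d-f]]", "e")); // false

        // Escaping & and [ gives the ECMAScript reading:
        // one of a-z, &, [, ^ followed by a literal ].
        String ecmaReading = "[a-z\\&\\&\\[^d-f]\\]";
        System.out.println(Pattern.matches(ecmaReading, "e]"));    // true
        System.out.println(Pattern.matches(ecmaReading, "&]"));    // true
        System.out.println(Pattern.matches(ecmaReading, "e"));     // false

        // An unescaped [ opens a nested class that is never closed.
        System.out.println(compiles("[a-z[]"));                    // false
        // Escaped, it is an ordinary member of the class.
        System.out.println(Pattern.matches("[a-z\\[]", "["));      // true
    }
}
```

So a translator that wants strict ECMAScript behaviour would escape [ and & whenever they appear inside a character class.
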
> --
> GMail doesn't have rotating .sigs, but you can see mine
> at http://www.ccil.org/~cowan/signatures