Is the email regex validator in Struts validation incorrect?

2014-08-25 Thread Miguel Almeida
This is the regex for email validation in Struts:

\\b^['_a-z0-9-\\+]+(\\.['_a-z0-9-\\+]+)*@[a-z0-9-]+(\\.[a-z0-9-]+)*
\\.([a-z]{2}|aero|arpa|asia|biz|com|coop|edu|gov|info|int|jobs|mil|mobi|
museum|name|nato|net|org|pro|tel|travel|xxx)$\\b

I had a report of this failing for a user whose email address contains
an umlaut ( shläg...@example.com ). My regex skills are not very good,
but the regex above doesn't seem to allow such characters.

However, international characters above U+007F are permitted by
RFC 6531:

http://sphinx.mythic-beasts.com/~pdw/cgi-bin/emailvalidate



What is your view on this? Could this regex be incorrect and be
rejecting valid special characters?
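
To make the report easy to reproduce, here is a minimal sketch that runs the pattern quoted above against an ASCII address and an address with an umlaut. The class name, the sample addresses, and the case-insensitive flag are assumptions made for illustration; this is not the Struts validator itself.

```java
import java.util.regex.Pattern;

final class StrutsEmailRegexDemo {

    // The pattern quoted above, written as a Java string literal.
    // CASE_INSENSITIVE is an assumption made for this sketch.
    private static final Pattern EMAIL = Pattern.compile(
        "\\b^['_a-z0-9-\\+]+(\\.['_a-z0-9-\\+]+)*@[a-z0-9-]+(\\.[a-z0-9-]+)*"
        + "\\.([a-z]{2}|aero|arpa|asia|biz|com|coop|edu|gov|info|int|jobs|mil|mobi|"
        + "museum|name|nato|net|org|pro|tel|travel|xxx)$\\b",
        Pattern.CASE_INSENSITIVE);

    // Returns true only if the whole string matches the pattern.
    static boolean matches(String address) {
        return EMAIL.matcher(address).matches();
    }

    public static void main(String[] args) {
        // Plain ASCII address: accepted.
        System.out.println(matches("user@example.com"));        // true
        // Hypothetical address with U+00FC in the local part: rejected,
        // because the character classes only list ASCII characters.
        System.out.println(matches("j\u00fcrgen@example.com")); // false
    }
}
```

The second call fails because every character class in the pattern is limited to ASCII letters, digits, and a few punctuation characters.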

Miguel Almeida



Re: Is the email regex validator in Struts validation incorrect?

2014-08-25 Thread Paul Benedict
I looked up the RFC. The document lists itself as a proposed standard [1],
so it's not really available for general use yet (correct me if I'm wrong).
I propose filing an enhancement request in JIRA to handle this.

[1] http://tools.ietf.org/html/rfc6531



Cheers,
Paul




Re: Is the email regex validator in Struts validation incorrect?

2014-08-25 Thread Miguel Almeida
I have filed it in JIRA:
https://issues.apache.org/jira/browse/WW-4389


I can't seem to find the actual standard, though (i.e., the one currently
in effect that disallows these characters). For documentation purposes,
does anyone know which standard in effect disallows these characters?

Cheers!
Miguel



Re: Is the email regex validator in Struts validation incorrect?

2014-08-25 Thread Dave Newton
http://tools.ietf.org/html/rfc2822

IIRC, http://tools.ietf.org/html/rfc2047 discusses characters outside the
0-127 range in headers; I'm not sure whether that extends to addresses.

The bottom line is that any realistic email regex will miss a lot of edge
cases, and some fairly normal use cases as well. Email regexes are
generally good enough, and that's about it. A regex isn't the right
solution for completely spec-compliant email address validation.

Note that other email validators can be plugged in fairly easily.
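
As one illustration of a check that does not rely on a single large regex, here is a rough sketch that only looks at the overall shape of the address and lets java.net.IDN judge the domain part. The class and method names are hypothetical, the rules are deliberately loose, and this is neither spec-compliant validation nor a Struts API.

```java
import java.net.IDN;

final class LooseEmailCheck {

    // Loose structural check: exactly what it says, not RFC validation.
    static boolean looksLikeEmail(String address) {
        if (address == null || address.chars().anyMatch(Character::isWhitespace)) {
            return false;
        }
        // Split at the last '@'; both sides must be non-empty.
        int at = address.lastIndexOf('@');
        if (at <= 0 || at == address.length() - 1) {
            return false;
        }
        String domain = address.substring(at + 1);
        try {
            // Converts an internationalized domain to its ASCII form;
            // throws IllegalArgumentException if it cannot be converted.
            String ascii = IDN.toASCII(domain);
            // Require at least one dot that is not the first character.
            return ascii.indexOf('.') > 0;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEmail("user@example.com"));
        System.out.println(looksLikeEmail("j\u00fcrgen@example.com"));
        System.out.println(looksLikeEmail("no-at-sign"));
    }
}
```

A check like this accepts far more than the stock regex does, including non-ASCII local parts, so it trades strictness for fewer false rejections.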

Dave









Re: Is the email regex validator in Struts validation incorrect?

2014-08-25 Thread Miguel Almeida
Note: I pasted the wrong JIRA issue. The correct one is:
https://issues.apache.org/jira/browse/WW-4395



Re: Is the email regex validator in Struts validation incorrect?

2014-08-25 Thread Paul Benedict
It's impossible to generalize whether a developer wants ASCII or Unicode
characters in email validation, so a switch is clearly needed. For Unicode
characters, though, I believe this can be solved easily by using the \w
character class (word characters) as a replacement for the typical
[A-Za-z0-9_\-] pattern.
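
One caveat worth checking before relying on \w: in java.util.regex, \w is ASCII-only by default and becomes Unicode-aware only when the Pattern.UNICODE_CHARACTER_CLASS flag (Java 7 and later) is set. The sketch below compares the two behaviours, plus an explicit \p{L} class as an alternative. The class name, method names, and sample word are made up for illustration.

```java
import java.util.regex.Pattern;

final class UnicodeWordClassDemo {

    // Default \w: equivalent to [a-zA-Z_0-9].
    private static final Pattern ASCII_WORD = Pattern.compile("\\w+");

    // \w with Unicode semantics (letters and digits from any script).
    private static final Pattern UNICODE_WORD =
        Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

    // Explicit alternative: Unicode letters, Unicode digits, underscore, hyphen.
    // Note that \w itself does not include the hyphen.
    private static final Pattern LETTERS = Pattern.compile("[\\p{L}\\p{N}_-]+");

    static boolean asciiWord(String s)   { return ASCII_WORD.matcher(s).matches(); }
    static boolean unicodeWord(String s) { return UNICODE_WORD.matcher(s).matches(); }
    static boolean letters(String s)     { return LETTERS.matcher(s).matches(); }

    public static void main(String[] args) {
        String sample = "j\u00fcrgen"; // hypothetical local part with U+00FC
        System.out.println(asciiWord(sample));   // false
        System.out.println(unicodeWord(sample)); // true
        System.out.println(letters(sample));     // true
    }
}
```

So a plain swap to \w would not change the outcome for the umlaut case unless the flag is also set or an explicit Unicode property class is used.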


Cheers,
Paul





Re: Is the email regex validator in Struts validation incorrect?

2014-08-25 Thread Lukasz Lenart
2014-08-25 18:27 GMT+02:00 Miguel Almeida mig...@almeida.at:
 Note: I pasted the wrong JIRA issue. The correct one is:
 https://issues.apache.org/jira/browse/WW-4395

But you can simply override the default pattern with the regex or
regexExpression param (don't use both):

<validator type="regex">
  <param name="regex">*.</param>
  <param name="regexExpression"></param>
</validator>

http://struts.apache.org/release/2.3.x/docs/email-validator.html
http://struts.apache.org/release/2.3.x/docs/regex-validator.html


Regards
-- 
Łukasz
+ 48 606 323 122 http://www.lenart.org.pl/
