Is the email regex validator in Struts validation incorrect?
This is the regex for email validation in Struts:

\\b^['_a-z0-9-\\+]+(\\.['_a-z0-9-\\+]+)*@[a-z0-9-]+(\\.[a-z0-9-]+)*\\.([a-z]{2}|aero|arpa|asia|biz|com|coop|edu|gov|info|int|jobs|mil|mobi|museum|name|nato|net|org|pro|tel|travel|xxx)$\\b

I had a report of this failing for a user with an umlaut in their email address (shläg...@example.com). I'm not very good with regexes, but the one above doesn't seem to allow such characters. However, international characters above U+007F are permitted by RFC 6531: http://sphinx.mythic-beasts.com/~pdw/cgi-bin/emailvalidate

What is your view on this? Could this regex be incorrect and reject valid special characters?

Miguel Almeida
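The reported failure can be reproduced outside Struts with a few lines of plain Java. This is only a sketch under stated assumptions: the pattern is the one pasted above with the line-wrap spaces and stray escapes taken out, case-insensitive matching is assumed rather than confirmed from the Struts source, and the umlaut address is a made-up stand-in because the real one is elided in the report.

```java
import java.util.regex.Pattern;

public class StrutsEmailRegexDemo {
    // Pattern as pasted in the message above, written as a Java string literal.
    static final String EMAIL_REGEX =
        "\\b^['_a-z0-9-\\+]+(\\.['_a-z0-9-\\+]+)*@[a-z0-9-]+(\\.[a-z0-9-]+)*\\."
        + "([a-z]{2}|aero|arpa|asia|biz|com|coop|edu|gov|info|int|jobs|mil|mobi|"
        + "museum|name|nato|net|org|pro|tel|travel|xxx)$\\b";

    // Assumption: the validator matches case-insensitively.
    static final Pattern EMAIL = Pattern.compile(EMAIL_REGEX, Pattern.CASE_INSENSITIVE);

    // True when the whole address matches the pattern.
    static boolean isAccepted(String address) {
        return EMAIL.matcher(address).matches();
    }

    public static void main(String[] args) {
        // Plain ASCII address.
        System.out.println(isAccepted("user@example.com"));        // prints "true"
        // Hypothetical address with U+00FC (u with diaeresis) in the local part.
        System.out.println(isAccepted("j\u00fcrgen@example.com")); // prints "false"
    }
}
```

The second address is rejected because the local-part class ['_a-z0-9-\+] only lists ASCII letters, digits and a few punctuation characters, so matching stops at the first character above U+007F.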
Re: Is the email regex validator in Struts validation incorrect?
I looked up the RFC. The document lists itself as a proposed standard [1], so it's not really available yet for general use (but correct me if I'm wrong). I propose filing an enhancement request in JIRA to handle this.

[1] http://tools.ietf.org/html/rfc6531

Cheers, Paul

On Mon, Aug 25, 2014 at 10:46 AM, Miguel Almeida mig...@almeida.at wrote: However, International characters above U+007F are permitted by RFC 6531
Re: Is the email regex validator in Struts validation incorrect?
I have added it to JIRA: https://issues.apache.org/jira/browse/WW-4389

I can't seem to find the actual standard, though (i.e., the one currently in force that essentially doesn't allow these characters). For documentation purposes, does anyone know which effective standard disallows these characters?

Cheers! Miguel

On Mon, 2014-08-25 at 10:51 -0500, Paul Benedict wrote: I propose that an enhancement should be made in JIRA to handle this.
Re: Is the email regex validator in Struts validation incorrect?
http://tools.ietf.org/html/rfc2822

IIRC http://tools.ietf.org/html/rfc2047 discusses non-0-127 chars in headers; I'm not sure if that extends to addresses.

The bottom line is that any realistic email regex will miss a lot of edge cases, and some fairly normal use cases as well. Email regexes are generally good enough and that's about it. A regex isn't the right solution for completely spec-compliant email address validation.

Note that other email validators can be plugged in fairly easily.

Dave

On Mon, Aug 25, 2014 at 12:11 PM, Miguel Almeida mig...@almeida.at wrote: For documentation purposes, does anyone know what effective standard disallows these characters?
Re: Is the email regex validator in Struts validation incorrect?
Note: I pasted the wrong JIRA issue. The correct one is: https://issues.apache.org/jira/browse/WW-4395
Re: Is the email regex validator in Struts validation incorrect?
We can't assume whether a developer wants ASCII or Unicode characters in email validation, so a switch is clearly needed. For Unicode characters, though, I believe this can be easily solved by using \w (word characters) as a replacement for the typical [A-Za-z0-9_\-] pattern.

Cheers, Paul

On Mon, Aug 25, 2014 at 11:22 AM, Dave Newton davelnew...@gmail.com wrote: The bottom line is that any realistic email regex will miss a lot of edge cases, and some fairly normal use cases as well.
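One caveat on the \w idea, sketched in plain Java: in java.util.regex, \w is ASCII-only by default ([a-zA-Z_0-9]) and only covers Unicode letters when the pattern is compiled with Pattern.UNICODE_CHARACTER_CLASS (available since Java 7). The class and method names below are made up for illustration, and the sample string is hypothetical.

```java
import java.util.regex.Pattern;

public class WordClassDemo {
    // Default behaviour: \w is limited to [a-zA-Z_0-9].
    static final Pattern ASCII_WORD = Pattern.compile("\\w+");

    // With the flag, \w follows the Unicode definition of a word character.
    static final Pattern UNICODE_WORD =
        Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean asciiWord(String s) {
        return ASCII_WORD.matcher(s).matches();
    }

    static boolean unicodeWord(String s) {
        return UNICODE_WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        String localPart = "schl\u00e4ger"; // hypothetical local part containing U+00E4

        System.out.println(asciiWord(localPart));   // prints "false"
        System.out.println(unicodeWord(localPart)); // prints "true"
        System.out.println(unicodeWord("a-b"));     // prints "false": \w never matches '-'
    }
}
```

So \w alone is not a drop-in replacement for [A-Za-z0-9_\-]: the hyphen still has to be listed explicitly, and the Unicode behaviour depends on how the pattern is compiled.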
Re: Is the email regex validator in Struts validation incorrect?
2014-08-25 18:27 GMT+02:00 Miguel Almeida mig...@almeida.at: Note: I pasted the wrong JIRA issue. The correct one is: https://issues.apache.org/jira/browse/WW-4395

But you can simply override the default pattern with the regex or regexExpression param (don't use both):

    <validator type="regex">
        <param name="regex">*.</param>
        <param name="regexExpression"></param>
    </validator>

http://struts.apache.org/release/2.3.x/docs/email-validator.html
http://struts.apache.org/release/2.3.x/docs/regex-validator.html

Regards
--
Łukasz
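For anyone who goes the override route, a candidate expression can be tried in plain Java first. The pattern below is a hypothetical, deliberately relaxed example, not the Struts default and not a spec-compliant validator: it swaps the ASCII ranges for the Unicode categories \p{L} (letters) and \p{N} (numbers) and drops the TLD whitelist. It assumes the validator passes the expression to java.util.regex unchanged.

```java
import java.util.regex.Pattern;

public class RelaxedEmailPatternDemo {
    // Hypothetical override pattern: Unicode letters and digits in both parts,
    // dot-separated labels, and a final label of at least two letters.
    static final String RELAXED_EMAIL_REGEX =
        "^[\\p{L}\\p{N}'_+-]+(\\.[\\p{L}\\p{N}'_+-]+)*"
        + "@[\\p{L}\\p{N}-]+(\\.[\\p{L}\\p{N}-]+)*\\.\\p{L}{2,}$";

    static final Pattern RELAXED_EMAIL = Pattern.compile(RELAXED_EMAIL_REGEX);

    static boolean isAccepted(String address) {
        return RELAXED_EMAIL.matcher(address).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAccepted("user@example.com"));        // prints "true"
        // Hypothetical address with a precomposed U+00FC in the local part.
        System.out.println(isAccepted("j\u00fcrgen@example.com")); // prints "true"
        System.out.println(isAccepted("user@localhost"));          // prints "false": no dot in the domain
    }
}
```

Note that \p{L} matches precomposed letters such as U+00FC but not a base letter followed by a combining mark, so input normalization is still something to decide separately.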