It's impossible to generalize whether a developer wants ASCII or Unicode characters in email validation, so a switch is clearly needed. For Unicode characters, though, I believe this can be solved easily by using \w (word characters), with Unicode-aware matching enabled, as a replacement for the typical [A-Za-z0-9_\-] pattern.
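To illustrate the \w idea, here is a minimal sketch in Java. It is not the Struts validator: the class name, the three patterns, and the sample local part are hypothetical and only cover the part before the @. One caveat worth showing is that Java's \w is ASCII-only by default and matches Unicode letters only when the pattern is compiled with Pattern.UNICODE_CHARACTER_CLASS (available since Java 7).

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Struts.
final class UnicodeWordDemo {

    // ASCII-only character class, as in the "typical" pattern mentioned above.
    static final Pattern ASCII_LOCAL = Pattern.compile("^[A-Za-z0-9_\\-]+$");

    // \w without flags: in Java this is equivalent to [a-zA-Z_0-9].
    static final Pattern DEFAULT_WORD = Pattern.compile("^[\\w\\-]+$");

    // \w with UNICODE_CHARACTER_CLASS: also matches non-ASCII letters and digits.
    static final Pattern UNICODE_WORD =
            Pattern.compile("^[\\w\\-]+$", Pattern.UNICODE_CHARACTER_CLASS);

    // True only if the whole input matches the pattern.
    static boolean matches(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Hypothetical local part containing U+00FC (u with diaeresis).
        String local = "m\u00fcller";

        System.out.println("ASCII class:       " + matches(ASCII_LOCAL, local));  // false
        System.out.println("\\w (default):      " + matches(DEFAULT_WORD, local)); // false
        System.out.println("\\w (Unicode flag): " + matches(UNICODE_WORD, local)); // true
    }
}
```

So a plain swap to \w would not change anything in Java on its own; the flag (or an explicit class such as \p{L}) is what lets the umlaut through, and that flag could serve as the ASCII/Unicode switch.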
Cheers,
Paul

On Mon, Aug 25, 2014 at 11:22 AM, Dave Newton <davelnew...@gmail.com> wrote:

> http://tools.ietf.org/html/rfc2822
>
> IIRC http://tools.ietf.org/html/rfc2047 discusses non-0-127 chars in
> headers, I'm not sure if that extends to addresses.
>
> The bottom line is that any realistic email regex will miss a lot of edge
> cases, and some fairly normal use cases as well. Email regexes are
> generally "good enough" and that's about it. Regexes aren't the right
> solution for completely-spec-compliant email address validation.
>
> Note that other email validators can be plugged in fairly easily.
>
> Dave
>
> On Mon, Aug 25, 2014 at 12:11 PM, Miguel Almeida <mig...@almeida.at>
> wrote:
>
> > I have added it to the JIRA -
> > https://issues.apache.org/jira/browse/WW-4389
> >
> > I can't seem to find the actual standard though (i.e., the one in place
> > that essentially doesn't allow these characters). For documentation
> > purposes, does anyone know what effective standard disallows these
> > characters?
> >
> > Cheers!
> > Miguel
> >
> > On Mon, 2014-08-25 at 10:51 -0500, Paul Benedict wrote:
> >
> > > I looked up the RFC. The document lists itself as a "proposed
> > > standard" [1] so it's not really available yet for general use (but
> > > correct me if wrong). I propose that an enhancement should be made
> > > in JIRA to handle this.
> > >
> > > [1] http://tools.ietf.org/html/rfc6531
> > >
> > > Cheers,
> > > Paul
> > >
> > > On Mon, Aug 25, 2014 at 10:46 AM, Miguel Almeida <mig...@almeida.at>
> > > wrote:
> > >
> > > > This is the regex for email validation in Struts:
> > > >
> > > > \\b^['_a-z0-9-\\+]+(\\.['_a-z0-9-\\+]+)*@[a-z0-9-]+(\\.[a-z0-9-]+)*
> > > > \\.([a-z]{2}|aero|arpa|asia|biz|com|coop|edu|gov|info|int|jobs|mil|mobi|
> > > > museum|name|nato|net|org|pro|tel|travel|xxx)$\\b
> > > >
> > > > I had a report of this failing for a user with an umlaut email
> > > > ( shläg...@example.com ). My regex is not very good, but the
> > > > above-mentioned regex doesn't seem to allow said characters.
> > > >
> > > > However, international characters above U+007F are permitted by
> > > > RFC 6531:
> > > >
> > > > http://sphinx.mythic-beasts.com/~pdw/cgi-bin/emailvalidate
> > > >
> > > > What is your view on this? Could this regex be incorrect and miss
> > > > out any special characters?
> > > >
> > > > Miguel Almeida
>
> --
> e: davelnew...@gmail.com
> m: 908-380-8699
> s: davelnewton_skype
> t: @dave_newton <https://twitter.com/dave_newton>
> b: Bucky Bits <http://buckybits.blogspot.com/>
> g: davelnewton <https://github.com/davelnewton>
> so: Dave Newton <http://stackoverflow.com/users/438992/dave-newton>