Casey T. Deccio wrote:
| On Thu, 2004-01-22 at 09:51, Von Fugal wrote:
|>"^[a-zA-Z0-9._-]+@([a-zA-Z0-9-]+\.)+[a-zA-Z.]{2,5}$"
|
| "^[a-zA-Z0-9._\-]+@([a-zA-Z0-9\-]+\.)+[a-zA-Z]{2,5}$"
|
| I don't think you want the dot in the last [a-zA-Z].  And if you want
| the hyphen to work in the brackets, you'll need to escape it.  Other
| than that the regexp looks pretty good, I think.

Actually[1], the hyphen is only special inside the brackets if it occurs
between characters. If it is at the front or end of the list it doesn't
need to be escaped. Agreed about the '.' in the last character class,
though. And as long as the goal is to keep the same pattern as the
original, the rest of the regex does look good.
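
A quick way to check both points is to run the variants side by side.
This is only a sketch in Python (not the PHP or Perl discussed in this
thread, though its character class rules agree on these points); the
names UNESCAPED, ESCAPED, WITH_DOT and accepts are mine, made up for
the illustration.

```python
import re

# Hyphen placed last in each bracket expression, left unescaped.
UNESCAPED = re.compile(r"^[a-zA-Z0-9._-]+@([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,5}$")
# Same pattern with the hyphens escaped.
ESCAPED = re.compile(r"^[a-zA-Z0-9._\-]+@([a-zA-Z0-9\-]+\.)+[a-zA-Z]{2,5}$")
# The first pattern from the thread, with the dot still in the last class.
WITH_DOT = re.compile(r"^[a-zA-Z0-9._-]+@([a-zA-Z0-9-]+\.)+[a-zA-Z.]{2,5}$")


def accepts(pattern, address):
    """Return True if the compiled pattern matches the address."""
    return pattern.match(address) is not None


# The hyphen is literal either way, so both forms accept hyphenated names.
print(accepts(UNESCAPED, "first-last@my-host.org"))
print(accepts(ESCAPED, "first-last@my-host.org"))

# The dot in the last class lets a run of dots stand in for the TLD.
print(accepts(WITH_DOT, "user@example...."))
print(accepts(UNESCAPED, "user@example...."))
```

The first three calls print True and the last prints False, which is
why the dot should come out of the final class.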

For an even better regex we can look at the address specification from
the RFC[2]. If I grok it correctly I get this:

  /^word(\.word)*@subdomain(\.subdomain)*$/

Where

   word      = atom | quoted-string
   subdomain = atom | domain-literal
   atom[3]   = [a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+

   quoted-string  =
      "([\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"

   domain-literal =
      \[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\]

So the end result (wrapped hard at 70 chars, take out the new lines :)
would be:

/^([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D
-\x7F]|\\[\x00-\x7F])*")(\.([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|"([\x00-\
x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"))*@([a-zA-Z0-9&_?\/`!
|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\])(\.
([a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+|\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[
\x00-\x7F])*\]))*$/
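
Rather than unwrapping that by hand, the same expression can be put
together from the named pieces. The following is a rough sketch in
Python, on the assumption that its re module reads these character
classes the same way Perl does. The names ATOM, QUOTED_STRING,
DOMAIN_LITERAL, WORD, SUBDOMAIN, ADDR_SPEC and is_addr_spec are my own,
and fullmatch stands in for the ^ and $ anchors.

```python
import re

# The building blocks, copied from the definitions above.
ATOM = r"[a-zA-Z0-9&_?\/`!|#*$^%=~{}+'-]+"
QUOTED_STRING = r'"([\x00-\x0C\x0E-\x21\x23-\x5B\x5D-\x7F]|\\[\x00-\x7F])*"'
DOMAIN_LITERAL = r"\[([\x00-\x0C\x0E-\x5A\x5E-\x7F]|\\[\x00-\x7F])*\]"

# word = atom | quoted-string, subdomain = atom | domain-literal
WORD = "(" + ATOM + "|" + QUOTED_STRING + ")"
SUBDOMAIN = "(" + ATOM + "|" + DOMAIN_LITERAL + ")"

# word(\.word)*@subdomain(\.subdomain)*
ADDR_SPEC = re.compile(
    WORD + r"(\." + WORD + ")*@" + SUBDOMAIN + r"(\." + SUBDOMAIN + ")*"
)


def is_addr_spec(text):
    """Return True if the whole string matches the assembled pattern."""
    return ADDR_SPEC.fullmatch(text) is not None


for candidate in (
    "user@example.com",
    '"John Doe"@example.com',
    "user@[127.0.0.1]",
    "user@example",
    "user.@example.com",
    "user@@example.com",
):
    print(candidate, is_addr_spec(candidate))
```

Note that "user@example" passes: nothing in this grammar requires a
top-level domain, unlike the simpler pattern earlier in the thread.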

And this doesn't even take into account the "John Doe <[EMAIL PROTECTED]>"
type of syntax. Don't let it hurt your head too much.
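
If that display-name form does have to be handled, one option is to
split it off before validating rather than growing the regex further.
A minimal sketch, assuming Python and its standard email.utils.parseaddr
function; the address jdoe@example.com is a made-up placeholder.

```python
from email.utils import parseaddr

# parseaddr splits a header-style address into (display name, address).
display_name, address = parseaddr("John Doe <jdoe@example.com>")

print(display_name)  # John Doe
print(address)       # jdoe@example.com
```

The bare address that comes back is what would then be fed to the
regex.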

Jacob Fugal

[1] All my comments about regex syntax are perlish. I think the same is
true of POSIX regular expressions, and thus hopefully of PHP, but no
guarantees.

[2] http://www.faqs.org/rfcs/rfc822.html
Specifically section 6.1, page 26, <addr-spec>. All the rules used in
the grammar are defined in Appendix D, starting at page 43.

[3] In Perl, I also had to escape the $ in this character class so that
Perl wouldn't try to interpolate $^ into the regex. This may or may not
be necessary for other languages.
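
As one data point for another language: in a Python raw string nothing
gets interpolated, so the $ can stay unescaped. A tiny sketch (the name
DOLLAR_CARET is just for illustration):

```python
import re

# Inside a character class, $ is literal, and ^ is literal unless it
# comes first. A raw string performs no variable interpolation.
DOLLAR_CARET = re.compile(r"[$^]+")

print(DOLLAR_CARET.fullmatch("$^$") is not None)  # True
print(DOLLAR_CARET.fullmatch("abc") is not None)  # False
```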

BYU Unix Users Group http://uug.byu.edu/
List Info: http://uug.byu.edu/cgi-bin/mailman/listinfo/uug-list