On Sun, Dec 17, 2017, at 10:46, Chris Angelico wrote:
> But if you're trying to *validate* an email address - for instance, if
> you receive a form submission and want to know if there was an email
> address included - then my recommendation is simply DON'T. You can't
> get all the edge cases right; it is actually impossible for a regex to
> perfectly match every valid email address and no invalid addresses.
That's not actually true. The thing that notoriously can't be matched by a regex, RFC 822 "address", is basically most of the syntax of the To: header. The part that is *the address* as we normally speak of it is "addr-spec", which is in fact a regular language, though a regex to match it goes on for a few hundred characters.

The formal syntax also has some surprising corners that might not reflect real-world implementations: for example, a local-part may not begin or end with a dot or contain two dots in a row (unless quoted; the suggestion someone else made that a local-part may contain an @ sign also requires quoting). It's also unfortunate that a domain-part may not end with a dot, since that would provide a way to specify TLD-only addresses without allowing the error of mistakenly leaving the TLD off of an address.

> And that's only counting *syntactically* valid - it doesn't take into
> account the fact that "b...@junk.example.com" is not going to get
> anywhere. So if you're trying to do validation, basically just don't.

The recommendation still stands, of course - this script is probably not the place to explore these obscure corners. If the email address is important, you can send a link to it and wait for the recipient to click it to confirm the address. If it's not, don't bother at all.
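
To make the dot rule for unquoted local-parts concrete, here is a minimal sketch. It is not a full addr-spec matcher: it assumes the "atext" character set from RFC 5322 (the successor to RFC 822), ignores quoted local-parts, comments and folding whitespace, and the helper name is_dot_atom_local_part is made up for illustration.

```python
import re

# Assumption: the "atext" characters of RFC 5322, i.e. what may appear
# in an unquoted atom. Quoted strings and comments are not handled.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# dot-atom: one or more atoms joined by single dots. This shape rules out
# a leading dot, a trailing dot, and two dots in a row.
DOT_ATOM = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")


def is_dot_atom_local_part(local_part: str) -> bool:
    """Return True if local_part is a valid *unquoted* local-part (hypothetical helper)."""
    return DOT_ATOM.fullmatch(local_part) is not None


# Illustrative inputs only.
for candidate in ["john.smith", ".john", "john.", "jo..hn", "a@b"]:
    print(repr(candidate), is_dot_atom_local_part(candidate))
```

Only the first candidate passes. The others would each need a quoted local-part to be syntactically valid, which this sketch deliberately does not attempt.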