Thanks, Chris. I am not actually up to date on such messaging issues, but I am not shocked by what you wrote. I recall that years ago most messages going out of my workplace looked like machine!machine2!ihnp4!more!evenmore!user with no @ in sight. And, as you mention, you may want to send to a domain and have it forward to a subdomain, so multiple @ signs may make sense, and so on. I note that some places, like groups.io, disguise the @ in your original email address. You can still see who a message is from, even though it is in some sense from them, but to actually use the address in your own mailer you need to substitute the @ back in.
I think we all agree that, unless there is further standardization, an address that is otherwise usable can easily be rejected in some context, and one in proper format (by some definition) can still fail in that context. The original question actually focused more narrowly on a good way to find whether a character exists in a string, for which regular expressions need not apply. Most email addresses are short enough that techniques to speed up the search may not be useful unless all the program does is search millions of email addresses for the presence of that character.

Dropping out, ...

-----Original Message-----
From: Python-list <python-list-bounces+avigross=verizon....@python.org> On Behalf Of Chris Angelico
Sent: Monday, December 28, 2020 8:02 PM
To: Python <python-list@python.org>
Subject: Re: Which method to check if string index is queal to character.

On Tue, Dec 29, 2020 at 10:08 AM Avi Gross via Python-list <python-list@python.org> wrote:
>
> This may be a nit, but can we agree all valid email addresses as used
> today have more than an @ symbol?
>
> I see it as requiring at least one character before the @ that come
> from a list of allowed characters (perhaps not ASCII) but does not
> include the symbol @ again. It is normally followed by some minimal
> number of characters and maybe a period and one of the currently
> valid domains like .com or .it but the latter gets tricky as it can
> look like u...@abd.def.att.com or other long variations where only the
> final component must be testable in the program.

There can be an @ in the first part of the address, and the domain may well not have a dot.

> The lack of an at-sign suggests it is not an email address. The lack
> of anything before or after also seems to disqualify it. You may be
> able to add more conditions but as noted, having more than one at-sign
> may also disqualify it.

Lack of an at sign means it's a local address that can't be routed over the internet, and in many contexts, it's reasonable to exclude those. But two isn't illegal.

> I am sure someone has some complex regular expressions that they think
> matches only potentially valid strings but, of course, as noted by
> Chris, to really validate that an address works might require sending
> something and validating a human replied and that can be quite task.
>

Yes, many such regexes exist, and they are *all wrong*. Without exception. I don't think it's actually possible for a regex to perfectly match all (syntactically) valid email addresses and nothing else.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
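
For the narrower question in this thread (is a given character present in a string), here is a minimal Python sketch. It assumes the goal is only a loose sanity check, not validation, and the helper name looks_like_routable_address is hypothetical. It uses the in operator rather than a regex, splits at the last @ because the first part of an address may itself contain one, and deliberately does not require a dot in the domain.

```python
def looks_like_routable_address(address: str) -> bool:
    """Loose sanity check only (hypothetical helper); this is NOT validation."""
    # A plain membership test is enough to see whether "@" occurs at all.
    if "@" not in address:
        return False
    # Split at the LAST "@", since the local part may itself contain an "@".
    local, _, domain = address.rpartition("@")
    # Require something on both sides; do not insist on a dot in the domain.
    return bool(local) and bool(domain)


if __name__ == "__main__":
    for candidate in ["user@example.com", "user", "@example.com",
                      "user@", "a@b@example.com", "user@localhost"]:
        print(candidate, looks_like_routable_address(candidate))
```

A string that passes this check may still be undeliverable, and as noted in the thread, really confirming that an address works might require sending something to it.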