[web2py:28175] Re: IS_EMAIL validator problem

Yarko Tymciurak Fri, 07 Aug 2009 12:50:29 -0700

On Fri, Aug 7, 2009 at 2:14 PM, Jonathan Lundell <jlund...@pobox.com> wrote:


> On Aug 7, 2009, at 10:04 AM, Yarko Tymciurak wrote:
>
> Whoever writes this patch: since the regex is complicated enough,
> could I ask that you follow the commented regex style (re.X)
> that is now used to validate paths?
>
> See the example starting on line 74 of main.py:
>
> http://bazaar.launchpad.net/~mdipierro/web2py/devel/annotate/head%3A/gluon/main.py
>
>
> That's my plan (I'm the one who did the main.py re.X patch).
>

+1  (great!; thanks!)
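
For anyone who hasn't seen the style: here is a minimal sketch of what a commented (re.X) email pattern could look like. It is not the planned patch. The pattern itself is an assumption for illustration (RFC-style local part, no quoted names, no fixed TLD list), and the name EMAIL_RE is made up here.

```python
import re

# Illustrative only: EMAIL_RE is a made-up name, and this pattern is an
# assumption (RFC-style local part, no quoted names, no fixed TLD list).
EMAIL_RE = re.compile(r"""
    ^
    [a-z0-9!#$%&'*+/=?^_`{|}~-]+            # first atom of the local part
    (?: \. [a-z0-9!#$%&'*+/=?^_`{|}~-]+ )*  # more atoms, separated by single dots
    @
    (?: [a-z0-9]                            # a domain label starts alphanumeric,
        (?: [a-z0-9-]* [a-z0-9] )?          #   may contain hyphens inside,
        \.                                  #   and is followed by a dot
    )+
    [a-z]{2,}                               # last label: letters only, any TLD
    \Z                                      # end of string
    """, re.X | re.I)

# Print whether each sample address matches the sketch pattern.
for address in ("a@b.com", "a@domain.co.uk", ".a@b.com", "a..b@c.com"):
    print(address, bool(EMAIL_RE.match(address)))
```

With this sketch the first two addresses match and the last two (leading dot, doubled dot) do not. Note that inside re.X a "#" only starts a comment outside a character class, which is why the "#" in the local-part class is safe.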

>
>
> Thanks,
> - Yarko
>
> On Fri, Aug 7, 2009 at 10:56 AM, Carl <carl.ro...@gmail.com> wrote:
>
>>
>> You've convinced me that staying close to RFC is a "best choice" even
>> though we lose the opportunity for users to correct addresses at the
>> point of data entry.
>>
>> N.B. the regex I suggested in my last posting doesn't work well enough;
>> e.g., a...@domain.co.uk isn't matched.
>>
>> C
>>
>>
>>
>> On Aug 7, 4:48 pm, Jonathan Lundell <jlund...@pobox.com> wrote:
>> > On Aug 7, 2009, at 8:13 AM, Carl wrote:
>> >
>> >
>> >
>> > > This is an excellent article on the traps to beware of when regex'ing
>> > > email address formats
>> >
>> > > http://www.regular-expressions.info/email.html
>> >
>> > > This may ignite a debate though :)
>> >
>> > A discussion, maybe. In the abstract, I like the idea of verifying the
>> > RFC verbatim, but we *should* be clear on what we're trying to do.
>> > Guard against typos? Prevent some kind of attack? How much do we care
>> > about false positives?
>> >
>> > The article objects (to RFC-style checking) that j...@aol.com.nospam,
>> > for example, will validate. I'm not too concerned about that, in that
>> > there are lots of ways that a user can enter a wrong but
>> > (syntactically) valid address. We deal with that through active
>> > validation, not a syntax check.
>> >
>> > Might there be a security concern? The quoted variation of the RFC
>> > checker is very permissive:
>> >
>> >         "([^"\r\\]|\\["\r\\])*"
>> >
>> > Could that open the door to some kind of injection attack? Presumably
>> > we sanitize it for display; how about when we actually use it to send
>> > mail? Any consumer that doesn't understand quoted names could end up
>> > very confused.
>> >
>> > I take false positives as a very bad thing: if a user enters a real and
>> > valid address, I do not want to reject it. So I don't much like the
>> > explicit list of TLDs (below), on the grounds that it's bound to
>> > expand, and at some point it'll break. From the Wikipedia TLD article:
>> >
>> > > During the 32nd International Public ICANN Meeting in Paris in 2008,
>> > > ICANN started a new process of TLD naming policy to take a
>> > > "significant step forward on the introduction of new generic top-
>> > > level domains." This program envisions the availability of many new
>> > > or already proposed domains, as well as a new application and
>> > > implementation process. Observers believed that the new rules could
>> > > result in hundreds of new gTLDs to be registered. Proposed TLDs
>> > > include music, berlin and nyc.
>> >
>> > I think I'd favor the RFC-style pattern without the quoted-name
>> > alternation.
>> >
>> > One thing we could do is to give the developer an option:
>> > IS_EMAIL(something or other) that lets them select one of a small
>> > number of regexes. And of course the developer can always use IS_MATCH
>> > if they don't like our choice of email filters.
>> >
>> > If we permitted a choice, I'd suggest:
>> >
>> >         1. default to the RFC regex, but without quoted names
>> >         2. RFC including quoted names
>> >         3. something like the pattern below, including the TLD filter (maybe)
>> >
>> >
>> >
>> >
>> >
>> > > I favour this variation...
>> > > [a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?:[A-Z]{2}|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum)\b
>> >
>> > > C
>> >
>> > > On Aug 7, 8:25 am, Jonathan Lundell <jlund...@pobox.com> wrote:
>> > >> On Aug 7, 2009, at 12:22 AM, mdipierro wrote:
>> >
>> > >>> I will take a patch for this.
>> >
>> > >> If nobody else gets to it first, I'll work up a patch over the
>> > >> weekend.
>> >
>> > >>> Massimo
>> >
>> > >>> On Aug 7, 1:33 am, Jonathan Lundell <jlund...@pobox.com> wrote:
>> > >>>> On Aug 6, 2009, at 9:32 PM, DenesL wrote:
>> >
>> > >>>>> IS_EMAIL does not follow the RFC specs for valid email addresses
>> > >>>>> (see http://en.wikipedia.org/wiki/E-mail_address)
>> >
>> > >>>>> even a simple a...@b.com fails
>> >
>> > >>>>> it is kinda late to work on the regex now, maybe tomorrow.
>> >
>> > >>>> The RFC is fairly hard to validate. If that's what we really
>> > >>>> want, I found this one on the web that looks about right:
>> >
>> > >>>> ^(?!\.)("([^"\r\\]|\\["\r\\])*"|([-a-z0-9!#$%&'*+/=?^_`{|}~]|(?...@[a-
>> > >>>> z0-9][\w\.-]*[a-z0-9]\.[a-z][a-z\.]*[a-z]$
>> >
>> > >>>> It assumes the case-insensitive flag.
>> >
>> > >>>> http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email...
>> >
>> > >>>> Overkill? Or, what the heck?
>>
>
>
>
> >
>
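
The three choices Jonathan lists in the quoted message could be sketched roughly as below. This is only an illustration under stated assumptions: EMAIL_PATTERNS, is_email and the mode names are hypothetical and are not web2py's IS_EMAIL API. The quoted-name alternation and the TLD list are taken from the patterns quoted in the thread, the TLD-list mode is compiled case-insensitively here, and re.fullmatch stands in for explicit anchors.

```python
import re

# Hypothetical building blocks; names are made up for this sketch.
_ATOM = r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+"
_DOT_ATOM = _ATOM + r"(?:\." + _ATOM + r")*"
_QUOTED = r'"(?:[^"\r\\]|\\["\r\\])*"'   # quoted-name form from the thread
_LABELS = r"(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+"
_ANY_TLD = r"[a-z]{2,}"
_LISTED_TLD = (r"(?:[a-z]{2}|com|org|net|gov|mil|biz|"
               r"info|mobi|name|aero|jobs|museum)")

# One compiled pattern per selectable mode (all case-insensitive).
EMAIL_PATTERNS = {
    # 1. RFC-style, without quoted names (the suggested default)
    "rfc": re.compile(_DOT_ATOM + "@" + _LABELS + _ANY_TLD, re.I),
    # 2. RFC-style, including quoted names
    "rfc_quoted": re.compile(
        "(?:" + _DOT_ATOM + "|" + _QUOTED + ")@" + _LABELS + _ANY_TLD, re.I),
    # 3. explicit TLD filter
    "tld_list": re.compile(_DOT_ATOM + "@" + _LABELS + _LISTED_TLD, re.I),
}


def is_email(value, mode="rfc"):
    """Return True if the whole string matches the pattern for `mode`."""
    return EMAIL_PATTERNS[mode].fullmatch(value) is not None


# A proposed gTLD such as "berlin" passes the default but not the TLD list.
print(is_email("joe@example.berlin"))
print(is_email("joe@example.berlin", mode="tld_list"))
```

The last two lines show the concern about the explicit TLD list: an address under a newly introduced TLD is accepted by the default mode but rejected by the TLD-list mode until the list is updated.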

