Re: Proposal for 1.2: Dumber email validation
> Russell raises my biggest concern with this proposal. There are
> a lot of smart folks in the Django-Developers end of things that
> can cobble together a pretty legit regexp that covers the
> majority of cases with no horrific DOS cases (e.g. last security
> issue).
> ...
> My initial candidate is ticket #12005, though it merely
> re.VERBOSE's the original and tweaks the domain portion to meet
> an internal need. Some changes on the stuff before the "@" might
> make it more "relaxed" (if not RFC-compliant-ish) while keeping
> out some of the badness.

Of course it would also be possible to use a non-regex approach.
There are libraries by Dominic Sayers [1] and Cal Henderson [2] with
a ton of tests. Unfortunately they are written in PHP, but they
shouldn't be too hard to translate to Python. As a bonus, the authors
claim that these libraries validate addresses in full compliance with
the RFC.

Ulrich

[1] http://www.dominicsayers.com/isemail/
[2] http://code.iamcal.com/php/rfc822/

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to django-developers+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---
Re: Proposal for 1.2: Dumber email validation
> 1) If we encourage people to write their own regex if they want
> tighter email validation, we run the risk that users will
> inadvertently introduce the same bug that we have just fixed.

Russell raises my biggest concern with this proposal. There are a
lot of smart folks in the Django-Developers end of things that can
cobble together a pretty legit regexp that covers the majority of
cases with no horrific DOS cases (e.g. last security issue). I've
seen the regexps created by people who don't comprehend them and
it's UGLY. This proposal basically throws those people to the
wolves.

I'd much rather Django provided an email field that got most of the
way and let regexp-understanding users tweak if needed. But I'd hate
to see somebody opening themselves to email addresses like

  bad_stuff()@wherever.space space.&

or

  f...@domain.tld\x0a\0x0dfrom: s...@spammer.spam\x0a\0x0dto: s...@spimmer.spim\x0a\x0d\x0a\x0dspam, spam, spam!

which can (without added caution) inject headers into sent-mail.

My initial candidate is ticket #12005, though it merely
re.VERBOSE's the original and tweaks the domain portion to meet an
internal need. Some changes on the stuff before the "@" might make
it more "relaxed" (if not RFC-compliant-ish) while keeping out some
of the badness.

-tim
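
The header-injection concern above can be illustrated with a small check. This is only a sketch, not code from Django or from ticket #12005; the function name `has_header_injection` is hypothetical, and it assumes that rejecting any carriage return or line feed is an acceptable policy for an address field.

```python
def has_header_injection(value):
    """Return True if value contains a CR or LF character.

    Hypothetical helper for illustration. Mail headers are separated by
    line breaks, so an address that carries one could start a new header
    when it is written into an outgoing message without further care.
    """
    return "\r" in value or "\n" in value


if __name__ == "__main__":
    plain = "someone@example.com"
    # The second string mimics the shape of the example in the message above.
    crafted = "someone@example.com\n\rfrom: other@example.com"
    print(has_header_injection(plain))
    print(has_header_injection(crafted))
```

A check like this is independent of how strict or relaxed the rest of the address validation is, so it could sit in front of either a regex or a non-regex validator.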
Re: Proposal for 1.2: Dumber email validation
On Oct 10, 9:35 am, James Bennett wrote:
> So what I'd like to propose is that EmailField essentially check that
> the value contains an '@', and a '.' somewhere after it. This will
> cover most addresses that are likely to be in actual use, and various
> confirmation processes can be used to rule out any invalid addresses
> which happen to slip through that.

Good idea - every real project I've had where this became an issue
had to switch to some sort of actual mail-based validation system
(confirmation, live MX connection, etc.). Adding doc links about
email validation tools would be better because it'd establish that
this is something which deserves some thought if you rely on email
addresses.

Chris
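
For reference, the check described in the quoted proposal can be sketched without any regex. This is one possible reading of "contains an '@', and a '.' somewhere after it", not an actual Django patch; the name `looks_like_email` is hypothetical, and using the first '@' as the split point is an assumption.

```python
def looks_like_email(value):
    """Return True if value has an '@' with a '.' somewhere after it.

    Hypothetical helper: it performs exactly the two checks named in the
    proposal and nothing else. It does not look at the local part, at
    whitespace, or at control characters.
    """
    at = value.find("@")  # index of the first '@', or -1 if absent
    if at == -1:
        return False
    return "." in value[at + 1:]


if __name__ == "__main__":
    for candidate in ("user@example.com", "user@localhost", "user.name"):
        print(candidate, looks_like_email(candidate))
```

Because the check is this permissive, it also accepts strings containing line breaks, so it would need to be paired with a confirmation step and with care when the address is written into mail headers, as discussed elsewhere in the thread.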
Re: Proposal for 1.2: Dumber email validation
Ned,

You really ought to show us all how to use that time machine. :)

On Oct 10, 2009, at 8:49 AM, Ned Batchelder wrote:
>
> +1
>
> http://nedbatchelder.com/blog/200908/humane_email_validation.html
>
> I was going to kibbitz on the fix (removing a single * would have
> sufficed), and realized we were once again in the quagmire of email
> regex validation.
>
> --Ned.
>
> James Bennett wrote:
>> In light of yesterday's security issue, I'd like to propose that we
>> significantly dumb down the regex Django uses to validate email
>> addresses.
>>
>> Currently, the regex we use covers many common cases, but comes
>> nowhere near covering the entire spectrum of addresses allowed by the
>> RFC; several tickets are open regarding this. Trying to cover more of
>> the RFC is possible, although supporting all valid email addresses is
>> not (various regexes claim to do this, but full coverage is impossible
>> -- the RFC is flexible enough WRT things like nested comments that I'm
>> fairly certain no single regex can handle them all), and -- as we've
>> seen -- attempts to cover a broader chunk of the RFC can introduce
>> issues with performance.
>>
>> So what I'd like to propose is that EmailField essentially check that
>> the value contains an '@', and a '.' somewhere after it. This will
>> cover most addresses that are likely to be in actual use, and various
>> confirmation processes can be used to rule out any invalid addresses
>> which happen to slip through that. Meanwhile, people who want to
>> support comments inside a bang path or other such exotic beasts can
>> simply write their own regex for it and tell a form to use that
>> instead.
Re: Proposal for 1.2: Dumber email validation
On Sat, Oct 10, 2009 at 9:35 PM, James Bennett wrote:
>
> In light of yesterday's security issue, I'd like to propose that we
> significantly dumb down the regex Django uses to validate email
> addresses.
>
> Currently, the regex we use covers many common cases, but comes
> nowhere near covering the entire spectrum of addresses allowed by the
> RFC; several tickets are open regarding this. Trying to cover more of
> the RFC is possible, although supporting all valid email addresses is
> not (various regexes claim to do this, but full coverage is impossible
> -- the RFC is flexible enough WRT things like nested comments that I'm
> fairly certain no single regex can handle them all), and -- as we've
> seen -- attempts to cover a broader chunk of the RFC can introduce
> issues with performance.
>
> So what I'd like to propose is that EmailField essentially check that
> the value contains an '@', and a '.' somewhere after it. This will
> cover most addresses that are likely to be in actual use, and various
> confirmation processes can be used to rule out any invalid addresses
> which happen to slip through that. Meanwhile, people who want to
> support comments inside a bang path or other such exotic beasts can
> simply write their own regex for it and tell a form to use that
> instead.

+1, with two additions:

1) If we encourage people to write their own regex if they want
tighter email validation, we run the risk that users will
inadvertently introduce the same bug that we have just fixed. We
should probably beef up the documentation of RegexField to highlight
the potential problem, give a few examples of how it can be
triggered, and give some links to useful resources.

2) I think we should relax the analogous regex check on URLField.
This one is slightly self-serving - one of the customers has an
internal network in which machines are named mymachine.foowhiz -
which is a violation of the RFC because of the 7 character TLD, but
that doesn't change the fact that it works fine on their internal
network.

A quick survey of tickets affected by this:

#9764 - Validation on internationalized domain names
#9202 - URLField validation
#7334 - non-ASCII domains (possible dupe of #9764)
#6092 - Allow custom validator for URL and Email fields

Russ %-)
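
As a sketch of the kind of example the RegexField documentation could carry for point 1, the snippet below compares a pattern with a quantifier nested inside a quantified group against a flat pattern that accepts the same strings. Both patterns and the `timed_match` helper are hypothetical and chosen only for illustration; neither is Django's email regex. The assumption is that the backtracking behaviour of Python's `re` module on nested quantifiers is the class of problem being discussed.

```python
import re
import time

# Hypothetical patterns for illustration; neither is Django's email regex.
NESTED = re.compile(r"^(a+)+$")  # quantifier inside a quantified group
FLAT = re.compile(r"^a+$")       # accepts the same strings, one quantifier


def timed_match(pattern, text):
    """Return (matched, seconds) for a single match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Keep n small. Inputs that almost match but fail on the last
    # character force the nested pattern to try many ways of splitting
    # the run of 'a' characters before it can give up.
    for n in (8, 12, 16):
        text = "a" * n + "!"
        _, nested_seconds = timed_match(NESTED, text)
        _, flat_seconds = timed_match(FLAT, text)
        print(f"n={n}: nested {nested_seconds:.6f}s, flat {flat_seconds:.6f}s")
```

Running it with gradually larger values of n is a simple way to see whether a custom pattern degrades on near-miss input before it is put behind a form field.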
Re: Proposal for 1.2: Dumber email validation
+1

http://nedbatchelder.com/blog/200908/humane_email_validation.html

I was going to kibbitz on the fix (removing a single * would have
sufficed), and realized we were once again in the quagmire of email
regex validation.

--Ned.

James Bennett wrote:
> In light of yesterday's security issue, I'd like to propose that we
> significantly dumb down the regex Django uses to validate email
> addresses.
>
> Currently, the regex we use covers many common cases, but comes
> nowhere near covering the entire spectrum of addresses allowed by the
> RFC; several tickets are open regarding this. Trying to cover more of
> the RFC is possible, although supporting all valid email addresses is
> not (various regexes claim to do this, but full coverage is impossible
> -- the RFC is flexible enough WRT things like nested comments that I'm
> fairly certain no single regex can handle them all), and -- as we've
> seen -- attempts to cover a broader chunk of the RFC can introduce
> issues with performance.
>
> So what I'd like to propose is that EmailField essentially check that
> the value contains an '@', and a '.' somewhere after it. This will
> cover most addresses that are likely to be in actual use, and various
> confirmation processes can be used to rule out any invalid addresses
> which happen to slip through that. Meanwhile, people who want to
> support comments inside a bang path or other such exotic beasts can
> simply write their own regex for it and tell a form to use that
> instead.