Re: [web2py] Re: Bug? Invalid url puts python into a tight loop - 100% CPU

Jonathan Lundell Mon, 19 Nov 2012 12:38:13 -0800

On 19 Nov 2012, at 12:33 PM, Niphlod <niph...@gmail.com> wrote:
> it's only for those who use the parametric router, not for all the 
> web2py installations out there.


You can relax the pattern in routes.py, too.


> 
> On Monday, 19 November 2012 17:17:54 UTC+1, jc wrote:
> I have been thinking a little about this. Niphlod's suggestion solves the 
> problem for me at the moment, but isn't there an enormous problem? It seems 
> that any web2py installation can be taken down accidentally or maliciously 
> just by somebody requesting an invalid argument string in the url of the form 
> 'xxxxxxX' where the 'x's are valid characters and there are enough of them, 
> and the 'X' is invalid? There must be a lot of vulnerable sites out there.
> 
> It seems to me there is one easy fix which is to just strip out invalid 
> characters before the regex match. You will get collisions, but since the url 
> is invalid anyway, who cares? Or the string could be urlencoded first so that 
> the invalid characters become % encoded?
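A minimal sketch of the "strip invalid characters first" idea, using the pattern and the sample string quoted further down in this thread. The names INVALID_CHARS and strip_invalid are made up for illustration and are not part of web2py.

```python
import re

# Pattern quoted later in this thread as the default args check.
ARGS_PATTERN = re.compile(r'([\w@ -]+[=.]?)*$')

# Hypothetical helper pattern: any character the args pattern can never accept.
# The hyphen is last in the class so it is read as a literal.
INVALID_CHARS = re.compile(r'[^\w@ =.-]')


def strip_invalid(arg):
    """Drop characters that could never match, before running the real check."""
    return INVALID_CHARS.sub('', arg)


raw = 'Abbbbbbbb Lccc - Pddddddd GA Deeeeee (ffff ffff A).pdf'
cleaned = strip_invalid(raw)          # the parentheses are removed
matched = ARGS_PATTERN.match(cleaned) is not None
```

Note that stripping only removes bad characters. A long run of valid characters followed by something like '..' is still made of valid characters but has an invalid structure, so the original pattern would still backtrack heavily on it.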
> 
> 
> On Tuesday, November 13, 2012 7:33:26 PM UTC, Jonathan Lundell wrote:
> On 13 Nov 2012, at 11:20 AM, Niphlod <nip...@gmail.com> wrote:
>> I'm definitely not a regex master, but what's the [=.]? part required for ?
> 
> The idea (not mine, fwiw) is that you can have multiple strings of [\w@ -]+ 
> separated or ended (but not begun) with a single . or = (but not multiple 
> ones). My workaround would allow leading or multiple . or =. I think we 
> probably should anyway, since we shouldn't be assuming that args are necessarily 
> a file path, which seems to be what's going on there.
> 
> It's trying to prevent stuff like foo/../../../bar.
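To make the description above concrete, here is a small sketch that runs the quoted pattern against a few short strings. The sample strings and the helper name accepted are only illustrative; the inputs are kept short so backtracking is not an issue.

```python
import re

# The args pattern as quoted in this thread.
ARGS_PATTERN = re.compile(r'([\w@ -]+[=.]?)*$')


def accepted(arg):
    """True if the whole string satisfies the args pattern."""
    return ARGS_PATTERN.match(arg) is not None


samples = [
    'report.pdf',   # accepted: one '.' separating two runs
    'key=value',    # accepted: one '=' separating two runs
    'trailing.',    # accepted: a run may end with a single '.'
    '.hidden',      # rejected: a leading '.' is not allowed
    'a..b',         # rejected: two separators in a row
    '..',           # rejected: no run of [\w@ -] before the separator
    'foo/../bar',   # rejected: '/' is not in the character class
]
results = {s: accepted(s) for s in samples}
```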
> 
>> 
>> On Tuesday, November 13, 2012 7:00:32 PM UTC+1, Jonathan Lundell wrote:
>> On 13 Nov 2012, at 9:04 AM, Niphlod <nip...@gmail.com> wrote:
>>> seems a problem with the default regex checking for args.... Let's wait for 
>>> Jonathan
>>> 
>>> >>> import re
>>> >>> mymatch = re.compile(r'([\w@ -]+[=.]?)*$')
>>> >>> mymatch.match('a')
>>> <_sre.SRE_Match object at 0x02A61020>
>>> >>> mymatch.match('Abbbbbbbb Lccc - Pddddddd GA Deeeeee (ffff ffff A).pdf')
>>> 
>>> endless loop of regex backtracking
>> 
>> I don't have a quick fix. The easy solutions involve re elements not 
>> available in Python re (or at least not until 3.1).
>> 
>> A workaround would be to make the pattern a little more lenient: [\w@ =.-]+
>> (with the hyphen last, so it is a literal rather than a range)
>> 
>> If we really want to exclude successive dots or equals, we could make a 
>> separate check for that.
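A rough sketch of how the lenient pattern plus a separate check might look. This is not what web2py ships; valid_arg, LENIENT and REPEATED_SEPARATOR are hypothetical names, and the trailing `$` is my assumption that the whole string has to match. The hyphen is placed last in the character class so it is read literally rather than as a range.

```python
import re

# Lenient check: a single quantifier, so there is no nested repetition
# for the engine to backtrack through.
LENIENT = re.compile(r'[\w@ =.-]+$')

# Separate check for two separators in a row ('..', '==', '.=' or '=.').
REPEATED_SEPARATOR = re.compile(r'[=.]{2}')


def valid_arg(arg):
    """Accept only allowed characters, then reject successive separators."""
    if LENIENT.match(arg) is None:
        return False
    return REPEATED_SEPARATOR.search(arg) is None


problem = 'Abbbbbbbb Lccc - Pddddddd GA Deeeeee (ffff ffff A).pdf'
checks = {
    'file.pdf': valid_arg('file.pdf'),
    'a..b': valid_arg('a..b'),
    '.hidden': valid_arg('.hidden'),
    'problem': valid_arg(problem),
}
```

With this arrangement the string that hung the original pattern is rejected right away because of the parentheses, and a leading '.' or '=' is allowed, as the workaround implies.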
>> 
>> 
> 
> 
> 