Luke Plant <l.plant...@cantab.net> added the comment:

@ David Murray:

Thanks for taking the time to look at this - can I trouble you to keep going 
and read my response?

Thanks. 

You wrote:

> IMO the thing that needs to be fixed here is that receiving an invalid cookie 
> makes it difficult to receive the valid cookies.

I agree absolutely, and my patch implements exactly that aim. So I don't 
understand why the rest of your reply goes on to assume a very different goal - 
handling RFC-invalid cookies such that we can read their values. 

> I'd love to accept your patch, but "silently ignore" sounds like a bad 
> idea and is something we try to avoid in Python where practical.

"silently ignore" is what the current BaseCookie implementation does for 
**every other** type of invalid input, with the only exception (I can find) 
being the case where everything is correct apart from the name:

>>> from Cookie import SimpleCookie
>>> c = SimpleCookie()
>>> c.load('rubbish')
>>> c
<SimpleCookie: >
>>> c.output()
''
>>> c.load('more:rubbish')
>>> c
<SimpleCookie: >
>>> c.load('name=value')
>>> c
<SimpleCookie: name='value'>
>>> c.load('name=value; invalidattribute;')
>>> c.output()
'Set-Cookie: name=value'
>>> c.load('xyz123"sfs;;=-abc')
>>> c
<SimpleCookie: name='value'>
>>> c.load('namewith:colon=value')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/Cookie.py", line 632, in load
    self.__ParseString(rawdata)
  File "/usr/lib/python2.7/Cookie.py", line 665, in __ParseString
    self.__set(K, rval, cval)
  File "/usr/lib/python2.7/Cookie.py", line 585, in __set
    M.set(key, real_value, coded_value)
  File "/usr/lib/python2.7/Cookie.py", line 460, in set
    raise CookieError("Illegal key value: %s" % key)
Cookie.CookieError: Illegal key value: namewith:colon

As you said, I think this ticket is about fixing that last one, which is the 
gross exception to the rule, and the only thing that is causing major problems 
for people.

I agree that handling cookies with invalid names so that we can read them would 
be nice, but I think it is **very** difficult to find a rationale for saying 
that this bug fix should have to wait for that change.

In addition, I do not think there is any sensible way to implement that 
suggestion given the API of BaseCookie at the moment - it's not just the 
implementation I baulked at, it is the API of BaseCookie that works against you 
every step of the way.

The API of BaseCookie works like this in the typical case:

Consuming:

input -> load()        -> store in BaseCookie -> __getitem__()

Generating:

input -> __setitem__() -> store in BaseCookie -> output()

(Of course you don't have to do it that way, but looking at the docs and 
examples, http://docs.python.org/library/cookie.html, it's very clear that it's 
meant to be used that way).

The fact that both modes involves storing in BaseCookie really messes up any 
attempt to make these two work differently, and means you would have a really 
surprising and strange API whichever way you do it.

So, option (1): you could allow BaseCookie to store invalid cookie names if 
they come via load, but not via __setitem__(). This means that you've got an 
easy work-around for BaseCookie refusing to store your value - just use 
'load()' instead. It also means that simple code like this fails:

>>> c['name:val'] = c['name:val']

which can be a problem if you are trying to do some generalised processing of 
the contents of a BaseCookie.

This leads us to option (2): allow RFC-invalid cookies to be stored, but then 
quietly sanitise (i.e. discard) in output().

But this becomes a much worse example of "silently ignoring" - the former is 
tolerant handling of data that is outside the programmers control - Postel's 
law. But this would be accepting incorrect data from a programmer, and then 
silently discarding it, is what we should do our utmost to avoid.

Neither of these options is any good, and it is because the API of BaseCookie 
was just not designed for it. The only sensible things to do, given our API, is 
sanitise on input. We could have an 'RFC-invalid' mode, but it would have to be 
turned on for the whole Cookie instance, and it would be a feature addition. An 
alternative implementation would be a 'badmorsels' attribute which preserves 
the rubbish where possible, in case you want it - but again, it's a feature 
addition.

Finally, I think the behaviour you are aiming at is unreasonable to ask for, 
especially in this ticket. You are essentially asking for a tolerant kind of 
cookie parsing which does its best to do 'do-what-I-mean'. But:

1) You haven't specified further than that (and there are no RFCs of use)
2) in general that kind of thing is notoriously hard to get right
3) the job is never finished - there are always more cases of invalid cookies 
that you *could* handle
4) and in fact it is impossible to provide an implementation that pleases 
everyone - there will always be invalid cookies that were 'meant' to be one 
thing, but the implementation interprets in a different way.

Do we really have to wait for all of that to be sorted out before we fix the 
glaring inconsistency in the way that BaseCookie.load() deals with some kinds 
of invalid data?

Thanks!

Luke

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2193>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to