[issue42967] [security] urllib.parse.parse_qsl(): Web cache poisoning - `; ` as a query args separator

2021-01-24 Thread Adam Goldschmidt

Adam Goldschmidt  added the comment:

> The difference is that semicolon is defined in a previous specification.

I understand, but this will limit us in the future if the spec changes - though 
I don't have strong feelings regarding this one.

> Dear all, now that Adam has signed the CLA, I have closed my PR in favor of 
> Adam's because I think 2 open PRs might split everyone's attention. Instead, 
> I'll focus on reviewing Adam's PR. Sorry for any inconvenience caused.

❤

--

___
Python tracker 
<https://bugs.python.org/issue42967>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42967] [security] urllib.parse.parse_qsl(): Web cache poisoning - `; ` as a query args separator

2021-01-23 Thread Adam Goldschmidt

Adam Goldschmidt  added the comment:

> That doesn’t feel necessary to me.   I suspect most links use &, some use ;, 
> nothing else is valid at the moment and I don’t expect a new separator to 
> suddenly appear.  IMO the boolean parameter to also recognize ; was better.

That's reasonable. However, I think that we are making this change in order to 
treat the semicolon as a "custom" separator. In that case, why not let the 
developer decide on a different custom separator for their own use cases? 
What's the difference between a semicolon and something else?

--

___
Python tracker 
<https://bugs.python.org/issue42967>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42967] [security] urllib.parse.parse_qsl(): Web cache poisoning - `; ` as a query args separator

2021-01-23 Thread Adam Goldschmidt


Adam Goldschmidt  added the comment:

> I _didn't_ change the default - it will allow both '&' and ';' still. Eric 
> showed a link above that still uses semicolon. So I feel that it's strange to 
> break backwards compatibility in a patch update. Maybe we can make just '&' 
> the default in Python 3.10, while backporting the ability to specify 
> separators to older versions so it's up to users?

I like this implementation. I definitely think we should not break backwards 
compatibility and only change the default in Python 3.10.

--

___
Python tracker 
<https://bugs.python.org/issue42967>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42967] [security] urllib.parse.parse_qsl(): Web cache poisoning - `; ` as a query args separator

2021-01-22 Thread Adam Goldschmidt


Adam Goldschmidt  added the comment:

I haven't noticed, I'm sorry. I don't mind closing mine, just thought it could 
be a nice first contribution. Our PRs are different though - I feel like if we 
are to implement this, we should let the developer choose the separator and not 
limit to just `&` and `;` - but that discussion probably belongs in the PR.

--

___
Python tracker 
<https://bugs.python.org/issue42967>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42967] [security] urllib.parse.parse_qsl(): Web cache poisoning - `; ` as a query args separator

2021-01-22 Thread Adam Goldschmidt


Change by Adam Goldschmidt :


--
pull_requests: +23120
pull_request: https://github.com/python/cpython/pull/24297

___
Python tracker 
<https://bugs.python.org/issue42967>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42967] Web cache poisoning - `;` as a query args separator

2021-01-19 Thread Adam Goldschmidt

New submission from Adam Goldschmidt :

The urlparse module treats semicolon as a separator 
(https://github.com/python/cpython/blob/master/Lib/urllib/parse.py#L739) - 
whereas most proxies today only take ampersands as separators. Link to a blog 
post explaining this vulnerability: 
https://snyk.io/blog/cache-poisoning-in-popular-open-source-packages/

When the attacker can separate query parameters using a semicolon (;), they can 
cause a difference in the interpretation of the request between the proxy 
(running with default configuration) and the server. This can result in 
malicious requests being cached as completely safe ones, as the proxy would 
usually not see the semicolon as a separator, and therefore would not include 
it in a cache key of an unkeyed parameter - such as `utm_*` parameters, which 
are usually unkeyed. Let’s take the following example of a malicious request:   
  

```
GET /?link=http://google.com_content=1;link='>alert(1) HTTP/1.1

Host: somesite.com

Upgrade-Insecure-Requests: 1

User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 
(KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,imag 
e/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9 
Accept-Encoding: gzip, deflate 

Accept-Language: en-US,en;q=0.9 Connection: close   
```

urlparse sees 3 parameters here: `link`, `utm_content` and then `link` again. 
On the other hand, the proxy considers this full string: 
`1;link='>alert(1)` as the value of `utm_content`, which is why the 
cache key would only contain `somesite.com/?link=http://google.com`. 

A possible solution could be to allow developers to specify a separator, like 
werkzeug does:

https://github.com/pallets/werkzeug/blob/6784c44673d25c91613c6bf2e614c84465ad135b/src/werkzeug/urls.py#L833

--
components: C API
messages: 385266
nosy: AdamGold
priority: normal
severity: normal
status: open
title: Web cache poisoning - `;` as a query args separator
type: security
versions: Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue42967>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com