Steven D'Aprano <steve+pyt...@pearwood.info> added the comment:

I believe that Python's behaviour here is correct. You are supplying a netloc 
which includes a username "www.google.com\" with no password. That might be 
what you intend to do, or it might be malicious data. That depends on context, 
and the urlparse module can't tell what the context is and has no reason to 
assume malice.

If I am reading this correctly:

https://tools.ietf.org/html/rfc1738#section-3.1

the colon and password after the username can be omitted, so the URL is legal 
and Python has returned the correct value for the netloc.
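
As a rough illustration of that parse, here is a minimal sketch. The URL is a hypothetical one of the same shape as the case under discussion (example.org is a stand-in host, not taken from the report); only the standard urllib.parse module is used.

```python
from urllib.parse import urlparse

# Hypothetical URL: a backslash followed by "@" inside the netloc.
# example.org is a stand-in host chosen for this sketch.
url = "https://www.google.com\\@example.org/path"
parts = urlparse(url)

# The netloc runs up to the first "/", "?" or "#"; a backslash does not end it.
print(repr(parts.netloc))

# Everything before the last "@" is treated as userinfo. With no colon
# present, all of it is the username and the password is None.
print(repr(parts.username))
print(repr(parts.password))

# The host is whatever follows the "@".
print(repr(parts.hostname))
```

The point of the sketch is only that the backslash stays inside the userinfo component and that the hostname is taken from the text after the @ sign.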

As Christian says, Python is not an end-user application like a browser. It is 
right and proper for a browser to expect that the user is non-technical and may 
not have noticed the @ sign, to expect malicious behaviour, or to assume that a 
backslash \ is a typo for a forward slash /. Python programmers, by definition, 
are technical users, and it is their responsibility to validate their data.
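
For callers who do need to treat such input as suspicious (for example, before following a redirect), one possible shape for that validation is sketched below. The function name is_safe_redirect, the allowed_hosts parameter and the policy itself are assumptions made for illustration, not something the library provides or recommends; rejecting userinfo and backslashes outright is a choice that only suits applications which never expect them.

```python
from urllib.parse import urlsplit

def is_safe_redirect(url, allowed_hosts):
    """Hypothetical caller-side check; the policy here is an assumption."""
    parts = urlsplit(url)
    # Only accept web URLs.
    if parts.scheme not in ("http", "https"):
        return False
    # This application never expects a backslash in the netloc.
    if "\\" in parts.netloc:
        return False
    # This application never expects a userinfo component.
    if parts.username is not None or parts.password is not None:
        return False
    # Compare the parsed host against an explicit allow-list.
    return parts.hostname in allowed_hosts

allowed = {"www.google.com"}
print(is_safe_redirect("https://www.google.com/search", allowed))
print(is_safe_redirect("https://www.google.com\\@example.org/", allowed))
```

The check is deliberately made by the application, which knows its own context, rather than by the parser.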

There are legitimate uses for the userinfo component (user:password@hostname) 
and it is not the library's responsibility to assume that backslashes are typos 
for forward slashes.

So I think that the behaviour here is correct, and this should be closed. But 
if you disagree, please explain what you think the library should do, and why. 
When you do, remember that:

* there are legitimate uses for user:password@hostname;
* either the user name or the password can contain backslashes.

----------
nosy: +steven.daprano

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35748>
_______________________________________