[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-04-03 Thread Matt Eaton

Matt Eaton  added the comment:

Wanted to check in on this to see if there was any feedback on this topic?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-30 Thread Matt Eaton

Matt Eaton  added the comment:

I was able to get some time together today and created a rough draft for the 
idea that you had Berker on introducing a new API with more strict parsing 
rules.

This will allow the ValueError to be raised when the URL is parsed rather than 
when the computed property is used.  

Jonathan, this will help address your concern on the consistency of when the 
error message is raised.

I added the rough draft on my Github account here:
https://github.com/agnosticdev/cpython-apis/tree/master/url
https://github.com/agnosticdev/cpython-apis/blob/master/url/url.py

Please take a look and let me know what you think.
You can contact me directly to continue the conversation at my email: 
agnostic...@gmail.com

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-20 Thread Matt Eaton

Matt Eaton  added the comment:

Berker and Eric, thank you very much.  I really like the idea of introducing a 
new API with more strict parsing rules for this situation. I would be willing 
to put some ideas down on a first pass at this.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-20 Thread Berker Peksag

Berker Peksag  added the comment:

The problem with adding a port_validation argument is that the port attribute 
is not the only thing that is computed lazily. There is also hostname, 
username, password attributes etc.

I think the best way would be introducing a new API with more strict parsing 
rules. For example:

from urllib.parse URL

url = URL('http://Server=sde; Service=sde:oracle').parse()

would raise a ValueError earlier than urlparse() and return a immutable 
namedtuple.

Such an API can be designed as a standalone module first and then can be added 
into the existing urllib.parse module. I'd personally happy to review and 
discuss such a modern and user friendly API in the standard library.

I think Eric has explained the current situation perfectly and we can now close 
this issue. We can create a new issue or a new python-ideas thread when or if 
we have a prototype of a new API.

Thank you, everyone!

--
components: +Library (Lib)
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed
type:  -> enhancement
versions: +Python 3.8 -Python 2.7, Python 3.5, Python 3.6

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-20 Thread Berker Peksag

Berker Peksag  added the comment:


New changeset 2cb4661707818cfd92556e7fdf9068a993577002 by Berker Peksag (Matt 
Eaton) in branch 'master':
bpo-33034: Improve exception message when cast fails for 
{Parse,Split}Result.port (GH-6078)
https://github.com/python/cpython/commit/2cb4661707818cfd92556e7fdf9068a993577002


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-17 Thread Matt Eaton

Matt Eaton  added the comment:

Yes, my goal for the patch was to provide a more explicit error message for 
this situation and to provide a low surface area change to the overall source, 
knowing that there are future development goals and backward compatibility to 
keep in mind.  That way the patch would be a first step in providing more 
explicit information when a developer would run into this situation and 
hopefully improving the current situation at hand.

The new ValueError message raised in this patch does work for both urlparse and 
urlsplit.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-17 Thread Jonathan

Jonathan  added the comment:

Thanks for the thoughts.
If only the exception message changes, that doesn't really address the issue 
that caused me to raise this - namely that it seems to be inconsistent with how 
almost every other Python builtin function I've used works. I have to defer to 
you folks who know how feasible changing any of that is - I can see your 
reasoning.

One quick observation - on glancing over the patch, it seems to only be for 
urlparse, but this happens to urlsplit too. Or does urlsplit import from that 
function (as I said, I only glanced).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-17 Thread Matt Eaton

Matt Eaton  added the comment:

This is a very good point, Eric.  For backwards compatibility we would have to 
set the default parameter to false, so we be in the same state we are today by 
default.  Knowing this my vote would be to go with the improvements to the 
ValueError message when using .port as the current PR does now.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-17 Thread Eric V. Smith

Eric V. Smith  added the comment:

For backward comparability, option C would have to default to not raising an 
exception. Which would limit it's usefulness, but that's the way it goes.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-17 Thread Matt Eaton

Matt Eaton  added the comment:

One more question I would raise based upon a point made earlier by Eric is if 
option C would be too large of a change for 3.8?  Any thoughts on this?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-16 Thread Matt Eaton

Matt Eaton  added the comment:

Jonathan, thank you very much for your thoughts I appreciate the pros and cons 
of each option.

In regards to your option C, if we provided a flag to optionally raise the 
error in urlsplit and urlparse were you thinking the default flag to be set to 
True?  

For example:

def urlparse(url, scheme='', allow_fragments=True, port_validation=True):


def urlsplit(url, scheme='', allow_fragments=True, port_validation=True):

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-16 Thread Jonathan

Jonathan  added the comment:

Interesting conversation

As I see it, there are two ways to solve this, both discussed above:
A) Python can compute and ValueError at parse-time
B) Python can ValueError at property-call time. (Current method)

But both have Advantages/Disadvantages.
A - Pros) - The function is more consistent with all the other Python builtins 
(well, the one's I've dealt with).
A - Pros) - Not carrying around a "bad" port.
A - Cons) - As Matt suggests, we could be "taking something from the user" 
because they may want the other values. (although in that case, the current 
semi-related behaviour: "Unmatched square brackets in the netloc attribute will 
raise a ValueError" is also potentially taking something from the user).

B - Pros) - User gets the other values even if they don't get the port.
B - Cons) - User needs to have more Try/Excepts in the code (whereever you may 
access the property), or to write their own wrapper function.


Given the above, of the those options I still think option (A) is better. The 
other values may have a reduced worth if the port is invalid (depending on the 
user's goal).

May I suggest a third option:
C) A flag for urlsplit/urlparse to indicate we don't want to do port 
validation, and to just return whatever it thinks is there. 
(example.com:3293849038 would return 3293849038. example.com:gibberish would 
return "gibberish" etc). 

This way the user can choose what behaviour they want with the bad port and 
test the value of the port themselves if they care (something I was going to do 
anyway before I discovered it was included in the builtin). Some of the url 
quoting functions have a similar flag ("errors") for handling bad data 
transparently or not.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-15 Thread Eric V. Smith

Eric V. Smith  added the comment:

Yes, you're right, it can't be changed. And .port is documented as raising 
ValueError, so it's all working correctly. I think improving the ValueError 
message, as the PR does, is as good as we can do.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-15 Thread Matt Eaton

Matt Eaton  added the comment:

"Wouldn't we be better off to catch this error at parse time, instead of just 
improve the error message when .port is called?"

I think there could be a case to be made about catching and dealing with this 
error in urlparse() / urlsplit() instead of displaying an error when port 
property is used.  I think that approaching it this way would cut right to the 
problem and alleviate carrying around a potentially bad port value.  However, 
if the port error was caught during parsing but the url, scheme, etc.. values 
were still valid, are we taking away something from the user by raising the 
error too soon?  Just a thought.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-15 Thread Eric V. Smith

Eric V. Smith  added the comment:

Wouldn't we be better off to catch this error at parse time, instead of just 
improve the error message when .port is called?

I'm not sure why port is a property and not just computed. Maybe the least 
intrusive way of doing this would be to just evaluate self.port after the 
parsing is completed?

--
nosy: +berker.peksag, eric.smith

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-15 Thread Eric V. Smith

Eric V. Smith  added the comment:

Although having said that, we probably can't make that change for 2.7. And 
maybe it's even too big a change for 3.8.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-11 Thread Matt Eaton

Matt Eaton  added the comment:

I agree.  I think an explicit exception message would be appropriate when the 
cast fails to cast from string to int in int(post, 10).

Putting in a PR to fix this now.
https://github.com/python/cpython/pull/6078

--
nosy: +agnosticdev

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-11 Thread Matt Eaton

Change by Matt Eaton :


--
pull_requests: +5838

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-11 Thread Matt Eaton

Change by Matt Eaton :


--
keywords: +patch
pull_requests: +5837
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-10 Thread bbayles

Change by bbayles :


--
nosy: +bbayles

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-09 Thread Jonathan

Jonathan  added the comment:

Arguably the same logic applies to out-of-range ports:

> a = urlsplit('http://example.com:444')
> a.port
Traceback (most recent call last):
  File "", line 1, in 
  File "E:\Software\_libs\Python36\lib\urllib\parse.py", line 169, in port
raise ValueError("Port out of range 0-65535")
ValueError: Port out of range 0-65535

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-09 Thread Jonathan

New submission from Jonathan :

(Confirmed in 2.7.14, 3.5.4, and 3.6.3)

I have this really bad URL from a crawl:
"http://Server=sde; Service=sde:oracle$sde:oracle11g:geopp; User=bodem; 
Version=SDE.DEFAULT"

if I try and parse it with wither urlparse or urlsplit it works - no errors. 
But when I try and get the port, I get a ValueError.

> from urllib.parse import urlparse
> r = urlparse('http://Server=sde; Service=sde:oracle$sde:oracle11g:geopp; 
> User=bodem; Version=SDE.DEFAULT')
ParseResult(scheme='http', netloc='Server=sde; 
Service=sde:oracle$sde:oracle11g:geopp; User=bodem; Version=SDE.DEFAULT', 
path='', params='', query='', fragment='')

Ok, great, now to use the result:
> print(r.port)
Traceback (most recent call last):
  File "", line 1, in 
  File "E:\Software\_libs\Python36\lib\urllib\parse.py", line 167, in port
port = int(port, 10)
ValueError: invalid literal for int() with base 10: 
'oracle$sde:oracle11g:geopp; User=bodem; Version=SDE.DEFAULT'


I'm not a Python Guru, but to me at least it's inconsistent with how every 
other Python Function works. In all other builtin functions I've used it would 
fail with the exception when I ran the function, not when I try and get the 
results. This caused a good few minutes of head-scratching while I tried to 
debug why my try/except wasn't catching it.

This inconsistency makes the results more difficult to use. Now a user needs to 
wrap all calls to the *results* in a try/except, or write an entire function 
just to "read" the results into a won't-except tuple/dict. Seems sub-optimal.


(May relate to: https://bugs.python.org/issue20059)

--
messages: 313475
nosy: jonathan-lp
priority: normal
severity: normal
status: open
title: urllib.parse.urlparse and urlsplit not raising ValueError for bad port
versions: Python 2.7, Python 3.6

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33034] urllib.parse.urlparse and urlsplit not raising ValueError for bad port

2018-03-09 Thread Jonathan

Change by Jonathan :


--
versions: +Python 3.5

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com