[issue37475] What is urllib.request.urlcleanup() function?

2021-09-21 Thread STINNER Victor


Change by STINNER Victor :


--
resolution:  -> out of date
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37475] What is urllib.request.urlcleanup() function?

2019-07-02 Thread STINNER Victor


STINNER Victor  added the comment:

Another weirdness of test_urllib.py: it redefines urlopen() and actually tests 
its test function, rather than testing the urllib.request module.

https://github.com/python/cpython/pull/14529#issuecomment-507646868

"def urlopen(url, data=None, proxies=None):" in test_urllib.py is as old as the 
urllib module itself!

commit 1afc1696167547a5fa101c53e5a3ab4717f8852c
Author: Jeremy Hylton 
Date:   Wed Jun 18 20:49:58 2008 +

Make a new urllib package .

It consists of code from urllib, urllib2, urlparse, and robotparser.
The old modules have all been removed.  The new package has five
submodules: urllib.parse, urllib.request, urllib.response,
urllib.error, and urllib.robotparser.  The urllib.request.urlopen()
function uses the url opener from urllib2.

Note that the unittests have not been renamed for the
beta, but they will be renamed in the future.

Joint work with Senthil Kumaran.

At least, the test function should have a different name, no?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37475] What is urllib.request.urlcleanup() function?

2019-07-02 Thread STINNER Victor


STINNER Victor  added the comment:

> urlcleanup also sets the global variable _opener to None so it does the extra 
> work of uninstalling the global variable opener set by install_opener which 
> is also not documented.

Oh wait, urlopen() also sets this private _opener global variable and so has an 
effect on next calls to urlopen()... This API is really strange, I dislike it 
:-(

I modified my PR 14529 to call urlcleanup() is way more tests, but I also 
modified regrtest to test that urllib.requests._url_tempfiles and 
urllib.requests._opener are not modified.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37475] What is urllib.request.urlcleanup() function?

2019-07-01 Thread Senthil Kumaran


Senthil Kumaran  added the comment:

Hi Victor and Karthikeyan,

Both your analysis are correct.

- This is a legacy interface, present purely for satisfying the old code, when 
urlretrieve was advertised as the first/easy to way to use urllib and download 
something.

- At this moment, I think, we should remove those legacy interfaces including 
urlcleanup. 

- I think, users can easily using urllib to write to the file using context 
manager and if desired urlretrieve can be modernized over it's use of 
urlcleanup.

+1 to deprecation and removal.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37475] What is urllib.request.urlcleanup() function?

2019-07-01 Thread Karthikeyan Singaravelan


Karthikeyan Singaravelan  added the comment:

I have added Senthil for thoughts. The temporary files is also stored in a 
module level variable. Looking into the git history there were some changes in 
e24f96a05973ddbb59d88c03570aef8545c5ef10 . The function is also marked as 
legacy in __all__ with compat code in 2to3. urlcleanup also sets the global 
variable _opener to None so it does the extra work of uninstalling the global 
variable opener set by install_opener which is also not documented.

urlretrieve enables retrieving and storing the content in a temporary file to 
return (tempfilename, headers) to be read later. In case the user doesn't give 
a filename it implicitly creates a temporary file. There is similar code in 
urllib.request.URLopener().retrieve [0] too where the temporary files are 
created implicitly but __del__ is overridden where the temp files are deleted 
as the program exits. I think it's better to ask the user to give filename and 
so that the user is responsible for the file but since the behavior is present 
for a long time there is backwards compatibility in changing this and there 
might be some code depending on the implicit temporary file created as a 
feature. 

One possible way would be to have a wrapper class that creates the temporary 
file when not given and then deletes it or calls urlcleanup on __del__ to clean 
it up as the GC is called like URLopener.retrieve? This would be done only when 
user doesn't give a file and for the temporary files generated by urlretrieve.


[0] 
https://github.com/python/cpython/blob/67310023f299b5a2fad71fca449b46d280036690/Lib/urllib/request.py#L1702

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37475] What is urllib.request.urlcleanup() function?

2019-07-01 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
nosy: +orsenthil

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue37475] What is urllib.request.urlcleanup() function?

2019-07-01 Thread STINNER Victor


New submission from STINNER Victor :

While working on bpo-37421 "Some tests leak temporary files", I had to write PR 
14529 fix: test_urllib urlretrieve() tests now explicitly call urlcleanup() to 
remove temporary files.

If urlretrieve() caller forgets to remove the temporary file, the "temporary" 
file is left in the temporary directory, until this directory is cleared which 
may happen soon (next reboot?) or not.

When ContentTooShortError is raised, the temporary file is left in the 
temporary directory: the caller must call urlcleanup() explicitly. It sounds 
like a bug to me.

Is it really a good API if urlcleanup() must be called explicitly? I dislike 
having a "separated" function for the cleanup, whereas Python could rely on 
object lifetime to do the cleanup.

Should we deprecate this API in favor of a better urllib API?

--
components: Library (Lib)
messages: 347056
nosy: vstinner, xtreak
priority: normal
severity: normal
status: open
title: What is urllib.request.urlcleanup() function?
versions: Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com