Aron Stansvik wrote:
> 2007/2/20, Phil Knirsch <[EMAIL PROTECTED]>:
>> Aron Stansvik wrote:
>>> Hello fellow urlgrabbers, is this intentional behavior?
>>>
>>> $ urlgrabber ftp://user:[EMAIL PROTECTED]/non_existing_file
>>> non_existing_file 0 B
>>> 00:00
>>> file written to non_existing_file
>>>
>>> It happily saves a 0 byte file when I try to fetch a file that isn't
>>> there. Shouldn't it fail on me? How can I make it fail when the file
>>> is not present on the server?
>>>
>>> Best regards,
>>> Aron Stansvik
>>
>> Funny enough, I stumbled across the exact same problem yesterday. I'm
>> currently debugging the cause and I'm down to urllib.py, which makes the
>> same mistake, and ftplib.py, which works correctly and fails with an
>> error 550.
>
> Yep. I'm guessing bullet point number 7 under "Restrictions" at the
> bottom of http://docs.python.org/lib/module-urllib.html might be the
> culprit, but I'm not sure.
>
> For now I'm using ftplib directly in my script, which works but is not
> as nice as urllib or urlgrabber when all you want is to download a
> file.
>
>> Hopefully I'll have a patch ready tonight. I'll send it upstream to
>> python then, so it might take some more time until that finds its way
>> down again.
>
> Great! Thanks for the fast response.
>
> Regards,
> Aron
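
As a stopgap, using ftplib directly could look roughly like the sketch
below. The helper name fetch_or_fail and its arguments are made up for
illustration, it assumes an already connected and logged-in ftplib.FTP
object, and it is written in present-day Python syntax. A missing file
makes RETR fail with ftplib.error_perm, and the empty local file is
removed before the error is passed on.

```python
import ftplib
import os


def fetch_or_fail(ftp, remote_name, local_path):
    """Download remote_name to local_path; raise if the server refuses.

    Hypothetical helper: `ftp` is assumed to be a connected, logged-in
    ftplib.FTP instance (or any object with a compatible retrbinary()).
    """
    try:
        with open(local_path, "wb") as out:
            # retrbinary() raises ftplib.error_perm on a 5xx reply,
            # for example a 550 for a file that does not exist.
            ftp.retrbinary("RETR " + remote_name, out.write)
    except ftplib.error_perm:
        # Do not leave a misleading 0 byte file behind.
        if os.path.exists(local_path):
            os.remove(local_path)
        raise
```

The caller then sees the 550 as an exception instead of a silent empty
download.
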
OK, fix is done and submitted to python upstream. The problem came from
the urllib.ftpwrapper.retrfile() method. Here are my comments on the
python entry:
When trying to retrieve a nonexistent file using the
urllib.ftpwrapper.retrfile() method, the behaviour is that instead of an
error message a valid hook is returned and you will receive a 0 byte file.
The current behaviour tries to emulate what one typically sees with http
servers and DirIndexes, which means:
1) Try to RETR the file.
2) If that fails, assume it is a directory and LIST it.
Unfortunately it doesn't check whether the directory actually exists.
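
A rough sketch of that two-step logic, for illustration only: this is
not the urllib code itself, the helper name open_data_connection is
hypothetical, and it uses present-day Python syntax. `ftp` is assumed to
behave like an ftplib.FTP object.

```python
import ftplib


def open_data_connection(ftp, path):
    """Mimic the unpatched two-step logic: RETR first, then LIST.

    Hypothetical helper. Returns a tuple of the command that was
    accepted and whatever ftp.ntransfercmd() returned for it.
    """
    try:
        cmd = "RETR " + path
        return cmd, ftp.ntransfercmd(cmd)
    except ftplib.error_perm as reason:
        # Only a 550 reply triggers the fallback; anything else is fatal.
        if str(reason)[:3] != "550":
            raise
    # Step 2: assume `path` is a directory. Nothing here checks that it
    # exists, so on a server that answers LIST for a missing path with an
    # empty listing the caller ends up with 0 bytes and no error.
    cmd = "LIST " + path
    return cmd, ftp.ntransfercmd(cmd)
```
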
The attached patch fixes this by remembering the current directory using
the PWD command, then temporarily changing to the requested directory and
switching back to the previous working directory if that succeeded.
If not, we raise an IOError, as the file could neither be opened (RETR)
nor was it a directory.
That way the behaviour is even closer to what happens with http servers,
where we get a 404 when we try to access a nonexistent file or directory.
Storing the current directory and switching back to it in case of no
error will also put the connection back in the proper state and
directory, so no unexpected behaviour happens here.
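
Pulled out of urllib, the check described above boils down to something
like the following sketch. The helper name ensure_directory is
hypothetical, `ftp` is assumed to behave like an ftplib.FTP object, and
the syntax is present-day Python rather than what the patch uses.

```python
import ftplib


def ensure_directory(ftp, path):
    """Raise IOError unless `path` is a directory we can change into.

    Hypothetical helper illustrating the PWD / CWD / CWD-back sequence.
    """
    previous = ftp.pwd()      # remember where we are (PWD)
    try:
        ftp.cwd(path)         # raises error_perm if the server refuses
    except ftplib.error_perm as reason:
        # Neither RETR nor CWD worked, so report the path as missing.
        raise IOError("ftp error: %s" % reason)
    ftp.cwd(previous)         # success: restore the old directory
```

On failure the working directory has not changed, so only the success
path needs to switch back.
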
The patch is against the current SVN repository at revision 53833.
Read ya, Phil
PS: Attached said patch.
--
Philipp Knirsch | Tel.: +49-711-96437-470
Development | Fax.: +49-711-96437-111
Red Hat GmbH | Email: Phil Knirsch <[EMAIL PROTECTED]>
Hauptstaetterstr. 58 | Web: http://www.redhat.de/
D-70178 Stuttgart
Motd: You're only jealous cos the little penguins are talking to me.
--- urllib.py.old 2007-02-20 18:16:34.000000000 +0100
+++ urllib.py 2007-02-20 18:16:13.000000000 +0100
@@ -866,8 +866,15 @@
         if not conn:
             # Set transfer mode to ASCII!
             self.ftp.voidcmd('TYPE A')
-            # Try a directory listing
-            if file: cmd = 'LIST ' + file
+            # Try a directory listing. Verify that directory exists.
+            if file:
+                pwd = self.ftp.pwd()
+                try:
+                    self.ftp.cwd(file)
+                except ftplib.error_perm, reason:
+                    raise IOError, reason, sys.exc_info()[2]
+                self.ftp.cwd(pwd)
+                cmd = 'LIST ' + file
             else: cmd = 'LIST'
             conn = self.ftp.ntransfercmd(cmd)
         self.busy = 1