Re: Archives and magic bytes

2005-03-26 Thread andrea crotti
> Perhaps this is mostly a reflection on me as a programmer :-} but I
> found the job surprisingly tricky.
No, I think you're right...
It's not very important for me to identify exactly what kind of file it
is; it would just be an extra feature in my little program (an organizer
that puts files in the right directories using file extensions,
categories and regular expressions).

A really good file identifier is this one (the author is a friend of mine):
http://mark0.net/soft-trid-e.html

It doesn't work with Mono yet, but maybe one day I'll try to use it
with Python...

Thanks everybody
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Archives and magic bytes

2005-03-25 Thread Jim
This is something I've recently thought about; perhaps you wouldn't
mind some points?

1) I've been running 'file' via os.popen, and I've had trouble with it
misidentifying file types (Fedora Core 1).  I can name a specific
example where it thinks a plain text README file is HTML (even though
the configuration file for 'file' at least looks right).  That makes me
suspicious of its ability to spot more obscure types.

(No, I haven't tried to get the latest 'file'; the days are long but
they are filled with negative time and in the end I don't always get
done what I should.)

2) Watch out for someone giving you, say, a bogus /bin/ls in a .zip
file.  You may want to look into chroot (which I believe requires you
to run as root), or at least examine the output of unzip -l before
unpacking.

3) You might also have to worry about the possibility that unpacking a
bundle will fill up your disk's partition.  At least for a while you
hold both the bundle and the unpacked bundle.

4) Using os.popen to unpack the bundle has a lot of advantages,
including that during debugging you can test the stuff from the command
line and feel that you completely understand which steps are working (I
think I use popen2, IIRC, and capture stderr for error messages).
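
Points 2 and 3 can be checked from the archive listing before anything is
unpacked. The following is only a rough sketch in present-day Python 3 (not
the Python of this thread), using the standard zipfile module rather than
unzip -l. The function name check_zip_members and its max_total_bytes
parameter are made up for illustration, and the sizes it adds up are the
ones the archive declares, which a hostile archive can misreport.

```python
import os
import zipfile


def check_zip_members(archive_path, dest_dir, max_total_bytes):
    """Return a list of problems found in the zip listing (empty if none).

    Hypothetical helper: it only inspects the listing and extracts nothing.
    """
    problems = []
    dest_root = os.path.realpath(dest_dir)
    total = 0
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            # Work out where this member would land if joined naively.
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            try:
                inside = os.path.commonpath([dest_root, target]) == dest_root
            except ValueError:
                # Raised on Windows when the paths are on different drives.
                inside = False
            if not inside:
                problems.append("escapes destination: " + info.filename)
            # Declared uncompressed size; not a guarantee of the real size.
            total += info.file_size
    if total > max_total_bytes:
        problems.append(
            "declared size %d exceeds limit %d" % (total, max_total_bytes)
        )
    return problems
```

An empty result only means the listing looks sane; it does not replace
running the unpacking step with limited privileges.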

Perhaps this is mostly a reflection on me as a programmer :-} but I
found the job surprisingly tricky.

Jim



Re: Archives and magic bytes

2005-03-24 Thread andrea
Chris Rebert (cybercobra) wrote:
> Have you tried the tarfile or zipfile modules? You might need to upgrade
> your Python if you don't have them. They look pretty easy and should
> make this a snap.
> You can grab the output from the *nix file command using the new
> subprocess module.
> Good Luck
> - Chris
> ===
> PYTHON POWERs all!
> All your code are belong to Python!
 

I've got them (I'm still using Python 2.3 because I use Gentoo), but they
are not as easy to use as they seem...
I'll try again, thanks!


Archives and magic bytes

2005-03-23 Thread andrea
Hi everybody,
this is my first post, but I've already read many of your interesting
posts... (sorry for my bad English)

Anyway, for my little project I need a module that, given an archive (zip,
bz2, tar ...), gives me back the decompressed contents.

I looked at the modules in the library reference and used some of them,
but the problem is that they all behave differently, and only one of them,
bz2, has a handy function for decompressing files easily:

decompress(...)
    decompress(data) -> decompressed data

    Decompress data in one shot. If you want to decompress data sequentially,
    use an instance of BZ2Decompressor instead.

I can't even get that one working. What does "data" mean here? A file?
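
For readers of the archive: in the docstring above, "data" is the
compressed content itself, held in memory as a byte string, and not a file
name or a file object. A minimal sketch, assuming present-day Python 3
rather than the 2.3 used in this thread; decompress_bz2_file is a
hypothetical helper name, not part of the bz2 module.

```python
import bz2

# "data" means the compressed bytes themselves.
original = b"hello, archive\n" * 3
compressed = bz2.compress(original)

# One-shot decompression of a bytes object held in memory.
restored = bz2.decompress(compressed)


def decompress_bz2_file(path):
    """Read a whole .bz2 file from disk and return its decompressed bytes."""
    with open(path, "rb") as handle:  # binary mode is required
        return bz2.decompress(handle.read())
```

This reads the whole file into memory, which is fine for small files; for
large ones the docstring's suggestion of BZ2Decompressor applies.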

Maybe I could implement the compression algorithm myself (or copy it
from the modules) and write my own compress/decompress functions;
what do you think?
Do you know if something similar already exists?

Another thing: I work on Linux (Gentoo) and I would like to use the
file command to retrieve information about a file's type instead of
relying on extensions. Do you think this can be done?

Thanks, and sorry if I've asked some stupid questions; I'm still a Python
novice (but it's a great language).

Andrea


Re: Archives and magic bytes

2005-03-23 Thread Maxim Krikun
> Another thing: I work on Linux (Gentoo) and I would like to use the
> file command to retrieve information about a file's type instead of
> relying on extensions. Do you think this can be done?

This is trivial:

>>> import os
>>> os.popen("file /etc/passwd").read()
'/etc/passwd: ASCII text\n'
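
The same idea can be written with the subprocess module mentioned elsewhere
in this thread. This is a sketch under assumptions: it targets present-day
Python 3, describe_file is a made-up name, and it returns None when no
'file' executable is found on the PATH. Passing the arguments as a list
means the path is never interpreted by a shell.

```python
import shutil
import subprocess


def describe_file(path):
    """Return what `file -b path` prints, or None if `file` is not installed."""
    if shutil.which("file") is None:
        return None
    # -b (brief) omits the leading "path: " prefix; "--" ends option parsing
    # so a path beginning with "-" is not mistaken for an option.
    result = subprocess.run(
        ["file", "-b", "--", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

The returned text is whatever the installed 'file' database says, so it
inherits any misidentifications that 'file' itself makes.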



Re: Archives and magic bytes

2005-03-23 Thread Chris Rebert (cybercobra)
Have you tried the tarfile or zipfile modules? You might need to upgrade
your Python if you don't have them. They look pretty easy and should
make this a snap.
You can grab the output from the *nix file command using the new
subprocess module.
Good Luck
- Chris
===
PYTHON POWERs all!
All your code are belong to Python!
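
One way the suggestion above might be put together, as a sketch only: a
single function that picks the right standard-library module by looking at
the file's content instead of its extension. It assumes present-day
Python 3; the name unpack and its return values are invented for
illustration. The zip and tar checks use the library's own is_zipfile and
is_tarfile helpers, and the bz2 check compares the first three bytes with
the "BZh" magic bytes.

```python
import bz2
import os
import shutil
import tarfile
import zipfile


def unpack(archive_path, dest_dir):
    """Unpack a zip, tar (plain or compressed) or single-file bz2 archive.

    Returns "zip", "tar" or "bz2"; raises ValueError for anything else.
    """
    os.makedirs(dest_dir, exist_ok=True)
    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(dest_dir)
        return "zip"
    # Checked before the bz2 magic so that .tar.bz2 is treated as a tar.
    if tarfile.is_tarfile(archive_path):
        # Use the safer "data" extraction filter where this Python has it.
        extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        with tarfile.open(archive_path) as archive:
            archive.extractall(dest_dir, **extra)
        return "tar"
    with open(archive_path, "rb") as handle:
        magic = handle.read(3)
    if magic == b"BZh":
        # A bare .bz2 holds one compressed stream, so choose an output name.
        name = os.path.basename(archive_path)
        name = name[:-4] if name.endswith(".bz2") else name + ".out"
        target = os.path.join(dest_dir, name)
        with bz2.open(archive_path, "rb") as source, open(target, "wb") as sink:
            shutil.copyfileobj(source, sink)
        return "bz2"
    raise ValueError("unrecognised archive format: " + archive_path)
```

Archives from untrusted sources still deserve the precautions raised
elsewhere in the thread before extractall is called on them.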
