On Sat, 4 Dec 2010, Jorge Biquez wrote:

What would you suggest I take a look at? If possible, something available on all 3 platforms.

I would second the use of SQLite. It's built into Python now, on all platforms.

But you specified "non SQL", so one other thing I'd suggest is to just create the data structure you need in Python and use pickle to save it.
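As a rough sketch of what that looks like: the data below and the save/load helper names are made up for illustration, but pickle.dump and pickle.load are the standard-library calls doing the work.

```python
import pickle

# Hypothetical example data; any picklable Python structure works the same way.
contacts = {
    "alice": {"phone": "555-0100", "tags": ["friend"]},
    "bob": {"phone": "555-0101", "tags": ["work", "gym"]},
}


def save(data, path):
    """Write the whole structure to disk in one call."""
    with open(path, "wb") as f:
        pickle.dump(data, f)


def load(path):
    """Read the structure back. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Calling save(contacts, "contacts.pickle") and later load("contacts.pickle") gives back an equal dictionary. The whole structure is read and written at once, so this suits data that fits comfortably in memory.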

I recently had an exercise of recovering files from a damaged hard drive. The problem was that the recovery brought back a lot of legitimately deleted files along with the "live" files. All the files had generic names like "028561846.avi" instead of descriptive names, with only the file types to guide me as to content.

I wrote a program to read every single one of these files and determine its MD5 checksum, and I stored the results in a dictionary. The key to the dictionary was the checksum, and the value was a list of files that had that checksum; the list usually, but not always, had only one element.

Then I pickled that dictionary.
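The indexing step described above could be sketched roughly as follows. This is not the original program; the function names, the chunk size, and the use of os.walk to find the rescued files are my assumptions for the example.

```python
import hashlib
import os
import pickle
from collections import defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(root):
    """Map each checksum to the list of files under root that have it."""
    index = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            index[md5_of_file(path)].append(path)
    return dict(index)


def save_index(index, pickle_path):
    """Pickle the checksum dictionary for use by a later program."""
    with open(pickle_path, "wb") as f:
        pickle.dump(index, f)
```

Reading in chunks keeps memory use flat even for large video files. Any list with more than one entry marks rescued files whose contents are identical.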

In another program, I ran os.walk against my archive CD-ROMs/DVD-ROMs, or some other directories on my hard drive, finding the MD5 of each file; if it corresponded to a "rescued" file, the program deleted the rescued file.
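A minimal sketch of that second program might look like this. The function names and the dry_run flag are my additions for illustration, not part of the original; the sketch also assumes the archive directory and the rescued-files directory do not overlap.

```python
import hashlib
import os
import pickle


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_index(pickle_path):
    """Load the pickled {checksum: [rescued paths]} dictionary."""
    with open(pickle_path, "rb") as f:
        return pickle.load(f)


def remove_rescued_duplicates(index, archive_root, dry_run=True):
    """Find rescued files whose checksum matches a file under archive_root.

    Returns the matching rescued paths. Deletes them only when dry_run
    is False (dry_run is a safety switch assumed for this sketch).
    """
    matched = []
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(archive_root):
        for name in filenames:
            checksum = md5_of_file(os.path.join(dirpath, name))
            for rescued in index.get(checksum, []):
                if rescued in seen or not os.path.exists(rescued):
                    continue  # already handled, or already deleted
                seen.add(rescued)
                if not dry_run:
                    try:
                        os.remove(rescued)
                    except FileNotFoundError:
                        continue  # another instance got there first
                matched.append(rescued)
    return matched
```

Checking that the rescued file still exists, and tolerating FileNotFoundError, lets several instances run against the same unmodified pickle without tripping over each other's deletions.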

Ideally, I would have also updated the dictionary to drop the files I'd cleaned up and, at the end of processing, re-pickled the edited dictionary; but that wasn't an option, as I usually had 2 or 3 instances of the program running simultaneously, each processing a different directory or CD/DVD.

