On Thu, Jul 2, 2009 at 8:42 AM, Alexander Shulgin<alex.shul...@gmail.com> wrote:
> I fail to see how is this true for normal tar files (vs. data read
> from pipe). Can you elaborate please?

Yep, of course ;)

A tar archive does not store the byte positions of the files inside it.
To access a file in the archive you have to go through everything
before it and work out where each file ends (you check whether you have
reached the desired file by reading its header). It simply lacks a TOC
(table of contents), so accessing the last file in the archive requires
reading through the whole archive. You can read about it here:
http://en.wikipedia.org/wiki/Tar_(file_format)#Format_details

Simplified layout of a tar archive:

[1. file header][1. file][2. file header][2. file][3. file header][3. file]

So how do you read the third file from the archive? You read the
archive up to [3. file header], your test succeeds (is it the right
file?), and then you read the file itself. You see? You have gone
through the whole archive just to access the last item in it.

>> Zip support accessing each files in the archive, although
>> it compress the file by default.
>
> Pardon my ignorance, but wouldn't zip -0 do the trick for your purpose? :)

It will do, more or less; however, there are three main problems with it:

1. You can only obtain the whole file from the archive, so you cannot
   read just a part of it. If you pack, let's say, a 700 MB file into a
   zip, you run out of memory on the Neo. At least this is the case
   with the standard Python zipfile module.
2. There is no random access feature, at least not in the standard
   Python modules.
3. Significant processor time is wasted when accessing a file (a lot of
   computation is required). Btw, it needs to be benchmarked on the Neo
   to see how much worse it is.

Best regards,
Laszlo

_______________________________________________
Openmoko community mailing list
community@lists.openmoko.org
http://lists.openmoko.org/mailman/listinfo/community
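
To illustrate the tar point made above, here is a minimal Python sketch
of locating a member when there is no table of contents. It is only an
illustration under stated assumptions, not how any particular tool does
it: the helper names (build_archive, find_member) and the file names are
made up for the example, and the archive is written in plain ustar
format so that every member has exactly one 512-byte header. Note that
on a seekable file the data blocks can be skipped with seek(), but every
preceding header still has to be visited; on a pipe the data would have
to be read as well.

```python
import io
import tarfile

BLOCK = 512  # tar stores headers and file data in 512-byte blocks


def build_archive():
    """Build a small in-memory ustar archive (hypothetical test data)."""
    buf = io.BytesIO()
    members = [("first.txt", b"a" * 700),
               ("second.txt", b"bb"),
               ("third.txt", b"last one")]
    with tarfile.open(fileobj=buf, mode="w",
                      format=tarfile.USTAR_FORMAT) as tar:
        for name, payload in members:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


def find_member(fileobj, wanted):
    """Walk the headers in order; return (data_offset, size, headers_read)."""
    headers_read = 0
    while True:
        header = fileobj.read(BLOCK)
        if len(header) < BLOCK or header == b"\0" * BLOCK:
            raise KeyError(wanted)  # hit the end-of-archive marker
        headers_read += 1
        # Name: bytes 0..99, NUL-terminated. Size: bytes 124..135, octal.
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8")
        size = int(header[124:136].rstrip(b"\0 ").decode("ascii") or "0", 8)
        if name == wanted:
            return fileobj.tell(), size, headers_read
        # Not the file we want: skip its data, padded to a full block.
        padded = (size + BLOCK - 1) // BLOCK * BLOCK
        fileobj.seek(padded, io.SEEK_CUR)


archive = build_archive()
offset, size, headers_read = find_member(archive, "third.txt")
archive.seek(offset)
data = archive.read(size)
print(offset, size, headers_read, data)
```

In this sketch, reaching third.txt means visiting all three headers; the
position of a member is only known after every header in front of it
has been parsed.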