I'm working on a parallel unzip. I started with Phobos std.zip, but found it too monolithic. I needed to separate the tasks: reading the directory entries, creating the directory tree, reading the compressed data, expanding the data, and creating the uncompressed files on disk. It currently unzips a 2 GB directory structure in about 18 seconds, while 7-Zip takes around 55 seconds. Only about 4 seconds of this is creating the directory structure and expanding the data; the other 14 seconds go to writing the regular files.

The subtasks needed to be separated not only so they can run in parallel, but also because the current std.zip implementation is a memory hog: it keeps the whole compressed and expanded data sections in memory. I was running out of memory in a 32-bit application just attempting to unzip the test file with the std.zip operations. The parallel version peaks at around 150 MB of memory during the operation.
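For illustration only, here is a rough sketch of how the task split described above could look. It is not my actual implementation: it still uses std.zip, so it still holds the whole archive in memory, and it assumes that expanding distinct members from several threads is safe, which I have not verified. The command-line layout (archive path, then output directory) is made up for the example.

```d
import std.algorithm.searching : endsWith;
import std.file : mkdirRecurse, read, write;
import std.parallelism : parallel;
import std.path : buildPath, dirName;
import std.zip : ArchiveMember, ZipArchive;

// Hypothetical usage: punzip archive.zip outdir
void main(string[] args)
{
    // Step 1: read the archive and its directory entries.
    // Note: std.zip loads the whole file into memory here.
    auto zip = new ZipArchive(read(args[1]));
    string outDir = args[2];

    // Step 2: create the directory tree serially, so the parallel
    // writers below never race on directory creation.
    // No sanitising of member names (e.g. "..") is done in this sketch.
    ArchiveMember[] files;
    foreach (name, member; zip.directory)
    {
        if (name.endsWith("/"))
        {
            mkdirRecurse(buildPath(outDir, name));
        }
        else
        {
            mkdirRecurse(dirName(buildPath(outDir, name)));
            files ~= member;
        }
    }

    // Steps 3-5: expand each regular file and write it out in parallel.
    // Assumption: calling expand on different members concurrently is safe.
    foreach (member; parallel(files))
    {
        auto data = zip.expand(member);
        write(buildPath(outDir, member.name), data);
    }
}
```

Creating the directories up front keeps the parallel loop down to expanding and writing, which is where most of the time goes.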


The parallel version still doesn't restore the original file attributes, and I see no example in the documentation of how this would normally be done. Am I missing this somewhere? I'll have to dig around...
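One possible approach, as a sketch only: it assumes a Phobos release that provides `ArchiveMember.fileAttributes` and `std.file.setAttributes`. The helper name `restoreMember` and the command-line layout are made up for the example. As I understand the documentation, `fileAttributes` returns 0 when the attributes were recorded for an incompatible OS, so the sketch skips that case.

```d
import std.file : read, setAttributes, write;
import std.zip : ArchiveMember, ZipArchive;

/// Hypothetical helper: write one expanded member to `path`, then apply
/// the attributes stored for it in the archive, if any are usable.
void restoreMember(ZipArchive zip, ArchiveMember member, string path)
{
    write(path, zip.expand(member));

    // 0 means no attributes usable on this OS were recorded.
    auto attrs = member.fileAttributes;
    if (attrs != 0)
        setAttributes(path, attrs);
}

// Hypothetical usage: restore archive.zip member/name output/path
void main(string[] args)
{
    auto zip = new ZipArchive(read(args[1]));
    auto member = zip.directory[args[2]];
    restoreMember(zip, member, args[3]);
}
```

The attributes are applied after the write, so a read-only member can still be written.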

