Hi,

putting the exact representation of an archive entry aside, I've put
down an idea of the API for reading and writing archives, together with
a POC port of the AR classes to this API. Everything is in
http://svn.apache.org/repos/asf/commons/proper/compress/branches/compress-2.0/

The port doesn't look pretty, but I wanted to get there quickly and
change as little as possible, partly to see how much effort porting the
existing code base would be. In particular I copied IOUtils into the AR
package so I don't have to think about a proper package right now. I
also haven't cared about Java < 7 so far.

Please have a look (more at the interfaces than the actual
implementation) and show me how wrong I am :-)

Some points I'd like to highlight and discuss:

* ArchiveInput and ArchiveOutput are not Streams (or Channels)
  themselves

  This is unlike Archive*Stream in 1.x. Emmanuel brought this up in a
  chat between the two of us and I agreed with him. You don't really
  use them as a stream but rather as a stream per entry.

  For Compressor* I'd still wrap streams/channels, different issue.

* Using Channels rather than Streams

  I'm a bit torn about this. I did so because I'd prefer to base
  ZipFile and friends on SeekableByteChannel rather than
  RandomAccessFile, so it would make the API look more symmetric.

  Drawbacks I've already found:

  - no skip in ReadableByteChannel, so you are forced to read data
    even if something more efficient could be done. This smells like
    another IOUtils method.

  - worse, no mark/reset or pushback. This is going to make format
    detection uglier as we have to rewind the channel in a different
    way.

  Another concern might be that Compress 2.0 gets delayed because the
  porting effort turns out to be bigger. I've deliberately taken the
  Channels.new* route to wrap the existing stream-based API in
  ArArchiveInput and it seems to work (although it is likely
  suboptimal). Going all-in on Channels in ArArchiveOutput didn't look
  much more difficult either, but the I/O part of output is simpler
  anyway.

* Checked vs Unchecked exceptions

  I would love to make ArchiveInput an Iterator over the entries but
  can't do so, as the things we'd need to do in next() might throw an
  IOException.
  One option may be to introduce an unchecked ArchiveException and
  wrap all checked exceptions (and do so throughout the API).

* RandomAccessArchiveInput as a generalization of ZipFile

  This extends ArchiveInput, so if you ask for an ArchiveInput to a
  file and the format doesn't support a stream-like interface (like
  7z) you can still obtain one. This is helped a lot by the fact that
  ArchiveInput is not a stream itself.

* I'm not sure about ArchiveInput#getChannel

  Should next return a Pair of ArchiveEntry and Channel instead?

* tiny change to the contract of ArchiveOutput's finish

  finish used to throw an exception if you didn't call closeEntry for
  the last entry, while putEntry closes the previous entry. This
  looked inconsistent, and finish now silently closes the entry as
  well.

Stefan
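
To make the "no skip in ReadableByteChannel" drawback above a bit more
concrete, here is a minimal sketch of what such an IOUtils-style helper
might look like. The class name ChannelSkipSketch and the method
signature are assumptions for illustration only, not code from the
branch. It assumes a blocking channel and simply reads and discards
bytes, which is exactly the inefficiency described above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

/** Hypothetical helper (not part of the branch): skip for channels that cannot seek. */
final class ChannelSkipSketch {

    private static final int SCRATCH_SIZE = 8192;

    private ChannelSkipSketch() {
    }

    /**
     * Skips up to toSkip bytes by reading and discarding them.
     * Assumes a blocking channel. Returns the number of bytes actually
     * skipped, which is smaller than toSkip only if the channel reached
     * end-of-stream first.
     */
    static long skip(ReadableByteChannel channel, long toSkip) throws IOException {
        if (toSkip <= 0) {
            return 0;
        }
        // Generic fallback: a SeekableByteChannel could move position() instead.
        ByteBuffer scratch = ByteBuffer.allocate((int) Math.min(SCRATCH_SIZE, toSkip));
        long skipped = 0;
        while (skipped < toSkip) {
            scratch.clear();
            // Never read past the requested amount.
            scratch.limit((int) Math.min(scratch.capacity(), toSkip - skipped));
            int read = channel.read(scratch);
            if (read < 0) {
                break; // end of stream
            }
            skipped += read;
        }
        return skipped;
    }

    public static void main(String[] args) throws IOException {
        // Channels.newChannel wraps a plain InputStream, as in the Channels.new* route.
        ReadableByteChannel ch = Channels.newChannel(new ByteArrayInputStream(new byte[100]));
        System.out.println("skipped " + skip(ch, 40) + " bytes");
    }
}
```

A seekable channel would not need this loop at all; the helper is only
the fallback for sources that can do nothing better than read.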
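
The checked vs unchecked question above can be sketched in a similar
way. The following is only one possible shape, under stated
assumptions: ArchiveException is the unchecked wrapper floated in the
mail, while EntryReader and EntryIterator are hypothetical names made
up for this example and do not exist in the branch. The adapter turns
a checked "read the next entry" call into an Iterator by rethrowing
IOException as ArchiveException.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Unchecked wrapper as proposed in the mail; its exact shape is an assumption. */
class ArchiveException extends RuntimeException {
    ArchiveException(IOException cause) {
        super(cause);
    }
}

/** Hypothetical stand-in for the checked call; returns null at the end of the archive. */
interface EntryReader<E> {
    E readNext() throws IOException;
}

/** Adapts an EntryReader to Iterator, wrapping IOException in ArchiveException. */
final class EntryIterator<E> implements Iterator<E> {
    private final EntryReader<E> reader;
    private E lookahead;
    private boolean fetched;

    EntryIterator(EntryReader<E> reader) {
        this.reader = reader;
    }

    @Override
    public boolean hasNext() {
        if (!fetched) {
            try {
                // The I/O happens here, so hasNext() may throw ArchiveException too.
                lookahead = reader.readNext();
            } catch (IOException e) {
                throw new ArchiveException(e);
            }
            fetched = true;
        }
        return lookahead != null;
    }

    @Override
    public E next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        E result = lookahead;
        lookahead = null;
        fetched = false;
        return result;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }
}
```

The trade-off is visible in hasNext(): callers get a for-each friendly
API, but an I/O failure now surfaces as a runtime exception they are no
longer forced to handle.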