On Aug 7, 2010, at 11:12 PM, Paul Eggert wrote:

> On 08/06/10 22:42, Dustin J. Mitchell wrote:
>> My understanding is that tar should keep running until it gets EOF on
>> its input - that there's no explicit indication within a tarfile that
>> would cause tar to exit normally without reading to EOF.
>
> That's incorrect.  A tar file ends with two blocks full of zeros; tar
> doesn't need to keep reading after that. ...

To be pedantic, the end of a tar archive is marked by two 512-byte
"records" of all zeros.  The term "block" is generally used to refer
to the I/O size (which traditionally defaults to 20 records).

Historically, there's been a lot of variation in how readers handle
the end-of-archive:
 * Some implementations stop reading when they see the first
   all-zero record.  If this happens at the end of a block, they
   won't read the next block.
 * Some readers will try to drain pipes to avoid sending SIGPIPE
   to the writer.
 * Some readers aggressively read ahead to try to maintain high
   throughput.

I believe GNU tar has changed its behavior a couple of times.
star uses a FIFO and aggressively reads ahead, but I don't think
it deliberately tries to drain pipes.  bsdtar does deliberately
drain pipes.  It sounds like Solaris tar also deliberately drains
pipes.

> example, in the POSIX standard for tar format
> <http://www.opengroup.org/onlinepubs/9699919799/utilities/pax.html>.

Thanks for pointing this out; most people don't know that there
is a POSIX standard for the tar format, largely because it's hidden
away under "pax".

> You must be using Solaris /bin/tar for that.  I believe it reads past
> the two zero blocks.  GNU tar doesn't do that. ...
> arguably GNU tar's behavior is more useful.

I'm curious about why you think so.  I've always thought that
draining the pipe was the more useful behavior.

Tim
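
For illustration only, here is a minimal Python sketch of the two
reader behaviors discussed above: stop at the two all-zero 512-byte
records, or keep reading to EOF so a writer on the other end of a pipe
never sees SIGPIPE.  It is not how GNU tar, bsdtar, star, or Solaris
tar are implemented.  The names read_exact and read_to_end_of_archive
are made up for this example, and the sketch assumes plain ustar-style
headers whose size field is octal text at bytes 124-135 (it does not
handle GNU base-256 sizes or validate checksums).

```python
RECORD = 512            # size of one tar record in bytes
BLOCK = 20 * RECORD     # traditional default I/O block size


def read_exact(stream, n):
    """Read up to n bytes, looping because pipes may return short reads."""
    chunks = []
    while n > 0:
        chunk = stream.read(n)
        if not chunk:
            break
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_to_end_of_archive(stream, drain=True):
    """Walk a tar stream; return (member_count, bytes_drained).

    Hypothetical helper: stops after two consecutive all-zero records.
    With drain=True it then reads and discards everything up to EOF.
    """
    members = 0
    zero_records = 0
    while zero_records < 2:
        header = read_exact(stream, RECORD)
        if len(header) < RECORD:
            raise EOFError("input ended before the end-of-archive marker")
        if header == bytes(RECORD):
            zero_records += 1
            continue
        zero_records = 0
        members += 1
        # Assumption: size is octal text, NUL- or space-terminated.
        size_field = header[124:136].rstrip(b"\0 ").decode("ascii")
        size = int(size_field, 8) if size_field else 0
        # Member data is padded out to a whole number of records.
        padded = (size + RECORD - 1) // RECORD * RECORD
        if len(read_exact(stream, padded)) < padded:
            raise EOFError("input ended inside member data")

    drained = 0
    if drain:
        # Discard trailing padding so the writer can finish cleanly.
        while True:
            chunk = stream.read(BLOCK)
            if not chunk:
                break
            drained += len(chunk)
    return members, drained
```

With drain=False the function returns as soon as it has seen the
marker, leaving any remaining padding in the block unread; with
drain=True it consumes that padding too, which is the pipe-draining
behavior.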
