Hi, I apologize in advance for the length of the message.

I recently ran into a problem where pax deltas appeared corrupted. This was a bit of a chore to diagnose at first, because it seemed to happen only on larger archives over 100 GB, where file sizes tended to range between 100 MB and 1 GB. Smaller archives do not manifest the problem. I'm uncertain whether the bug is in the writing or the reading of the delta archive, but it seems to be in the writing (or perhaps both). The bug also appears in pax deltas created with the FreeBSD 6 binaries formerly available at the AT&T release site, not just with binaries recently built from source.

In addition to pax simply being unable to process the entire delta archive, another symptom is that lines like "delete 0" are printed over and over when pax is run with -v (one per file deleted, it seems). I think this points to a problem in the creation of the delta archive, and not necessarily in its reading, because I get the same results from older tar. Likewise, when a delta archive is written successfully, I can process it with tar as well (of course, I can't *do* anything with the contents because they're vdelta, but still, the archive itself appears intact). When the "delete 0" issue hits, both pax and tar fail to unpack the file. Once the glitch hits, the rest of the archive cannot be unpacked, with or without the --base option. I also believe, but can't confirm, that I've seen "create 0" lines from pax when the bug hits.

    : rtfm; pax -rv --base=../foo.base <../foo.delta
    ...
    delete 0
    delete 0
    delete 0
    delete 0
    ...

The first thing I did was disable compression, but that made no difference. Straight pax-format files still manifested the problem. Creating large archives, well over 100 GB in size, of nothing but random data didn't manifest the problem at first, either. Random binary data, in files randomly sized up to 1 GB, spread across trees with ten files to a directory, could not reproduce the problem.
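
For anyone who wants to repeat the random-data test, a tree of that shape can be sketched roughly as follows. This is not the script actually used; it is a minimal Python 3 sketch, and the function name, its parameters, and the directory and file naming are all hypothetical.

```python
import os
import random


def build_random_tree(root, n_files, max_size, files_per_dir=10, seed=None):
    """Create n_files files of random bytes under root.

    Hypothetical helper: files are grouped files_per_dir to a directory,
    and each file gets a random size between 1 and max_size bytes.
    Returns the list of paths created.
    """
    rng = random.Random(seed)  # seeds the sizes only, not the file contents
    paths = []
    for i in range(n_files):
        # Start a new subdirectory every files_per_dir files.
        subdir = os.path.join(root, "d%05d" % (i // files_per_dir))
        os.makedirs(subdir, exist_ok=True)
        path = os.path.join(subdir, "f%05d.bin" % i)
        remaining = rng.randint(1, max_size)
        with open(path, "wb") as fh:
            # Write in 1 MiB chunks so large files do not sit in memory.
            while remaining > 0:
                chunk = min(remaining, 1 << 20)
                fh.write(os.urandom(chunk))
                remaining -= chunk
        paths.append(path)
    return paths
```

Calling build_random_tree("testtree", n_files=200, max_size=1 << 30) would approximate the sizes described above. Start with much smaller values to check that the layout is what you expect.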

The fact that random data does not reproduce the problem still has me worried, because it implies that the contents of the data trigger the bug. I've finally been able to recreate it with about 10-20 GB of data. What's very interesting is that the bug only hits when the changes since the base archive include both deletions and new files. Simply adding new files didn't trigger the bug at first.

And, the final oddity: exactly one set of data produced the following error. The share3.base.000 file had been created just minutes earlier, so it should match. I only saw this condition once; no other data triggered it while I was trying to recreate this bug.

    : rtfm; pax -rv --base=../share3.base.000 <../share3.delta.001
    /dev/stdin base ../share3.base.000 in pax format
    /dev/stdin in delta pax format
    ...[2 directories and 12 new files created]...
    pax: 0: base archive mismatch [/usr/local/ast.working/src/cmd/pax/copy.c#278]

What confuses me is that pax should have read the base archive during its verification step, before it even touches the delta archive. It got past that, and didn't report the mismatch until after it had processed all of the new files in the *delta* archive and had just seen its first deleted file in the delta.

So, the criteria _seem_ to be

  - Large archives (>10 GB)
  - Specific data? (100+ GB of /dev/urandom won't trigger it)
  - Both creations and deletions in the delta archive

Next week, I'll try to whittle this down further, from the 10-20 GB range to something more manageable that still generates the problem. In the meantime, I wanted to report this before too much more time passes. Certainly, it could be that this issue is a symptom of the platform in some way (though not of this specific build) and that there's more work to be done in the FreeBSD port of libast.

Also, just to be clear: I have never had a problem with traditional pax archives; I only see this in ones where deltas are being read or written.

Best regards,
Bob

--
Bob Krzaczek, Chester F. Carlson Center for Imaging Science, RIT
phone +1-585-4757196, email [email protected], icbm 43.08586N 77.67744W
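
The third criterion in the list above (a delta containing both creations and deletions) can be set up between the base and delta runs with something like the following. Again, this is only a sketch under assumptions: it is Python 3, the helper name, its parameters, and the "added" directory are hypothetical, and it only changes the tree on disk. It does not invoke pax.

```python
import os
import random


def mutate_tree(root, n_delete, n_create, new_size=4096, seed=None):
    """Delete n_delete existing files under root and add n_create new ones.

    Hypothetical helper: run it after writing the base archive and before
    writing the delta, so the delta sees both deletions and creations.
    Returns (deleted_paths, created_paths).
    """
    rng = random.Random(seed)
    # Sort the walk results so a given seed picks the same files each time.
    existing = sorted(
        os.path.join(dirpath, name)
        for dirpath, _dirs, names in os.walk(root)
        for name in names
    )
    deleted = rng.sample(existing, min(n_delete, len(existing)))
    for path in deleted:
        os.remove(path)

    # New files go in their own directory, filled with random bytes.
    new_dir = os.path.join(root, "added")
    os.makedirs(new_dir, exist_ok=True)
    created = []
    for i in range(n_create):
        path = os.path.join(new_dir, "new%05d.bin" % i)
        with open(path, "wb") as fh:
            fh.write(os.urandom(new_size))
        created.append(path)
    return deleted, created
```

Setting n_delete=0 gives the additions-only case, which did not trigger the bug at first, so the two cases can be compared on the same data.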
_______________________________________________ ast-developers mailing list [email protected] http://lists.research.att.com/mailman/listinfo/ast-developers
