Toralf Lund wrote:

Paul Bijnens wrote:

Toralf Lund wrote:

Other possible error sources that I think I have eliminated:

  1. tar version issues - since gzip complains even if I just uncompress
     and send the data to /dev/null, or use the -t option (see the
     example commands after this list).
  2. Network transfer issues. I get errors even with server-side
     compression, and I assume gzip would still produce valid output
     even if its input data were garbled due to network problems.
  3. Problems with a specific amanda version. I've tried 2.4.4p1 and
     2.4.4p3. Results are the same.
  4. Problems with a specific disk. I've tested more than one, both as
     the target for "file" dumps and as the holding disk.



5. Hardware errors, e.g. bad RAM (on a computer without ECC), a faulty disk controller, or bad cables.

If one single bit is flipped, then gzip produces complete garbage from
that point on.
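
To illustrate the point (a hypothetical sketch; the input file and byte
offset are arbitrary examples), overwriting a single byte in the middle
of a gzip stream is normally enough to make decompression fail from
there on:

    gzip -c /etc/services > test.gz
    # overwrite one byte in the middle of the compressed stream
    printf '\377' | dd of=test.gz bs=1 seek=500 conv=notrunc
    gzip -t test.gz    # now reports invalid compressed data / a CRC error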


Good point. The data isn't completely garbled, though; closer inspection reveals that the uncompressed data actually contains valid tar file entries after the failure point. In other words, it looks like only limited sections within the file are corrupted.

Also, I believe it's not the disk controller, since I've even tried dumping to NFS volumes (though maybe that raises other issues).

Maybe you're only seeing it in such large backups with gzip, but it happens (less often) in other cases too.
Any tools available to test the hardware?


I have one of those stand-alone test software packages... Yes, maybe I should run it. I can't just take the server down right now, though ;-(

Yes, the problem was most probably caused by a memory error. A thorough test of the RAM reported faults, and we have not been able to reproduce the gzip issues since replacing the memory!



- Toralf




