Greetings! After upgrading from 5.4 to 5.6.1 I am no longer able to mirror two of my HAMMER PFSs to their respective slaves. Of course it could be just a coincidence, but the problems started right after the upgrade.
- In my PC I have a hard disk (spinning disk) with 8 master PFSs. Each of them is mirrored to a second disk in the same PC and to a third disk in another PC (so I have two identical copies of the main disk).

- Both PCs run DragonFlyBSD 5.6.1 and HAMMER is version 1.

- Two of these 8 PFSs can no longer be mirrored, no matter whether I use the first backup disk or the second. This is the `hammer mirror-copy` invocation and its output:

root# hammer mirror-copy /home/comm /mhome/pfs/comm
Prescan to break up bulk transfer
Prescan 2 chunks, total 4659 MBytes (4295704808, 590357072)
hammer: Mirror-read /home/comm failed: Numerical argument out of domain
hammer: Mirror-write /mhome/pfs/comm: Did not get termination sync record, or rec_size is wrong rt=-1

- And these are the errors in /var/log/messages:

Jun 28 15:49:17 copernico kernel: hammer_btree_extract: CRC DATA @ a00003b000390000/16384 FAILED
Jun 28 15:49:17 copernico kernel: hammer_btree_extract: CRC DATA @ a00003b1be8c8000/16384 FAILED
Jun 28 15:49:17 copernico kernel: hammer_btree_extract: CRC DATA @ a00003abdd0dc000/16384 FAILED
Jun 28 15:49:17 copernico kernel: hammer_btree_extract: CRC DATA @ a00003acfd444000/16384 FAILED
Jun 28 15:49:17 copernico kernel: hammer_btree_extract: CRC DATA @ a00003b001320000/16384 FAILED
Jun 28 15:49:17 copernico kernel: hammer_btree_extract: CRC DATA @ a00003b012b90000/16384 FAILED
Jun 28 15:49:17 copernico kernel: hammer_btree_extract: CRC DATA @ a00003b0d78d4000/16384 FAILED
Jun 28 15:49:17 copernico kernel: hammer_btree_extract: CRC DATA @ a00003b1be1d0000/16384 FAILED
Jun 28 15:49:17 copernico kernel: hammer_btree_extract: CRC DATA @ a00003b1be8cc000/16384 FAILED

Some more info:

- During cleanups (snapshots/prune/rebalance/reblock; I didn't try to force a recopy) there are no errors, neither on the command line nor in /var/log/messages. This is true for all three disks.

- I computed the md5sum of every file in the problematic PFSs and in both mirrors, again without any error (maybe a dumb test, but I thought that forcing all the data to be read might expose the problem). A rough sketch of the comparison is at the end of this mail.

- So only `hammer mirror-copy` raises errors.

- After many retries (10-20) the mirror operation on one of the problematic PFSs succeeded on both mirrors; a scripted version of the retry is also sketched at the end of this mail. The contents of the copies were identical (but I don't know whether they are both identically corrupted!). After a few successful mirror-copy invocations the errors started again.

Can somebody give me a hint on how to proceed with the investigation? I could copy the data out of these PFSs and then destroy them together with their slaves, but I'm afraid that would simply hide some disk problem. My main fear is that some disk corruption is going on, but if that were the case I would think DragonFly would refuse to mount the disk.

Thank you!
Andrea
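
P.S. The per-file checksum comparison mentioned above was done roughly along these lines (a minimal sketch; /tmp/master.md5 and /tmp/slave.md5 are just example output files, and it assumes the master PFS did not change between the two runs):

# Compare per-file checksums of the master PFS against one slave.
# Repeat with the mount point of the second slave on the other PC.
( cd /home/comm && find . -type f -exec md5sum {} + | sort -k 2 ) > /tmp/master.md5
( cd /mhome/pfs/comm && find . -type f -exec md5sum {} + | sort -k 2 ) > /tmp/slave.md5
diff /tmp/master.md5 /tmp/slave.md5   # no output means every checksum matched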

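P.P.S. Scripted, the retries would look roughly like this (a sketch; it assumes `hammer mirror-copy` exits with a nonzero status when the transfer fails, which the rt=-1 above suggests, and the 30-second pause is arbitrary):

# Retry mirror-copy until a pass succeeds, showing any fresh CRC
# failures from the kernel between attempts.
while ! hammer mirror-copy /home/comm /mhome/pfs/comm; do
    dmesg | grep 'CRC DATA' | tail -5   # most recent CRC failures
    sleep 30                            # arbitrary pause before retrying
done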