Any chance you know the LBA or byte offset of the corruption so I can
compare it against the log?

On Wed, Sep 12, 2018 at 8:32 PM <patrick.mcl...@sony.com> wrote:
>
> Hi Jason,
>
> On 2018-09-10 11:15:45-07:00 ceph-users wrote:
>
> On 2018-09-10 11:04:20-07:00 Jason Dillaman wrote:
>
> > In addition to this, we are seeing a similar type of corruption in
> > another use case when we migrate RBDs and snapshots across pools. In
> > this case we clone a version of an RBD (e.g. HEAD-3) to a new pool and
> > rely on 'rbd export-diff/import-diff' to restore the last 3 snapshots
> > on top. Here too we see cases of fsck and RBD checksum failures.
> >
> > We maintain various metrics and logs. Looking back at our data, we
> > have seen the issue at a small scale for a while on Jewel, but the
> > frequency increased recently. The timing may have coincided with a
> > move to Luminous, but this may be coincidence. We are currently on
> > Ceph 12.2.5.
> >
> > We are wondering if people are experiencing similar issues with 'rbd
> > export-diff / import-diff'. I'm sure many people use it to keep
> > backups in sync. Since these are backups, many people may not inspect
> > the data often. In our use case, we use this mechanism to keep data in
> > sync and actually need the data in the other location often. We are
> > wondering if anyone else has encountered any issues. It's quite
> > possible that many people have this issue but simply don't realize
> > it. We are likely hitting it much more frequently due to the scale of
> > our operation (tens of thousands of syncs a day).
>
> If you are able to recreate this reliably without tiering, it would
> assist in debugging if you could capture RBD debug logs during the
> export along w/ the LBA of the filesystem corruption to compare
> against.
>
> We haven't been able to reproduce this reliably yet, and we haven't
> figured out the exact conditions that cause it. We have just been
> seeing it happen on some percentage of export/import-diff operations.
>
> Logs from both export-diff and import-diff in a case where the result
> gets corrupted are attached. Please let me know if you need any more
> information.
>
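For comparing a corruption location against the RBD debug logs, the arithmetic of turning an LBA into a byte offset and an object number can be sketched as below. This is only an illustration under stated assumptions: 512-byte sectors, the default 4 MiB RBD object size with no custom striping, and a hypothetical `partition_offset` argument for cases where the filesystem does not start at byte 0 of the image. The function names are made up for this example and are not part of any Ceph tool.

```python
SECTOR_SIZE = 512            # assumption: LBAs are counted in 512-byte sectors
OBJECT_SIZE = 4 * 1024 ** 2  # assumption: default RBD object size (4 MiB, order 22)


def locate_lba(lba, sector_size=SECTOR_SIZE, object_size=OBJECT_SIZE,
               partition_offset=0):
    """Map a block-device LBA to a position in the RBD image.

    Returns (byte_offset, object_no, offset_in_object).
    partition_offset is a hypothetical parameter: the byte offset of the
    filesystem within the image, 0 if it sits directly on the device.
    """
    byte_offset = partition_offset + lba * sector_size
    object_no, offset_in_object = divmod(byte_offset, object_size)
    return byte_offset, object_no, offset_in_object


def object_suffix(object_no):
    """Format the object number as a 16-digit hex string, the form used in
    format 2 data object names (rbd_data.<image id>.<suffix>)."""
    return f"{object_no:016x}"


if __name__ == "__main__":
    # Example LBA chosen for illustration only, not taken from the logs.
    offset, obj, within = locate_lba(8193)
    print(offset, obj, within, object_suffix(obj))
```

With a byte offset and object number in hand, the matching read or write extents can be searched for in the export-diff and import-diff logs. If the image was created with a non-default object size or striping, the constants above would need to be adjusted accordingly.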
--
Jason