Any chance you know the LBA or byte offset of the corruption so I can
compare it against the log?
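
For reference, a minimal sketch of how an LBA could be mapped to a byte
offset and then to the RBD data object that holds it. This is not taken
from the thread. It assumes 512-byte sectors, the default 4 MiB object size,
a format 2 image without custom striping, and a hypothetical image id
("abc123"); adjust those to match the actual image.

```python
SECTOR_SIZE = 512            # assumption: 512-byte logical sectors
OBJECT_SIZE = 4 * 1024 ** 2  # assumption: default 4 MiB RBD object size


def locate(lba, partition_start_lba=0):
    """Map an LBA to (image byte offset, object number, offset in object).

    partition_start_lba is only needed when the LBA is relative to a
    partition rather than to the start of the RBD image.
    """
    byte_offset = (partition_start_lba + lba) * SECTOR_SIZE
    object_no, object_off = divmod(byte_offset, OBJECT_SIZE)
    return byte_offset, object_no, object_off


def object_name(image_id, object_no):
    # Format 2 data objects are named rbd_data.<image id>.<16 hex digits>.
    return f"rbd_data.{image_id}.{object_no:016x}"


if __name__ == "__main__":
    offset, obj, within = locate(lba=1_000_000)
    print(offset, obj, within, object_name("abc123", obj))
```

The resulting byte offset and object number are the values one would look
for in the export-diff and import-diff debug logs.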
On Wed, Sep 12, 2018 at 8:32 PM <patrick.mcl...@sony.com> wrote:
>
> Hi Jason,
>
> On 2018-09-10 11:15:45-07:00 ceph-users wrote:
>
> On 2018-09-10 11:04:20-07:00 Jason Dillaman wrote:
>
>
> > In addition to this, we are seeing a similar type of corruption in
> another use case when we migrate RBDs and snapshots across pools. In this 
> case we clone a version of an RBD (e.g. HEAD-3) to a new pool and rely on 
> 'rbd export-diff/import-diff' to restore the last 3 snapshots on top. Here 
> too we see cases of fsck and RBD checksum failures.
> > We maintain various metrics and logs. Looking back at our data we
> have seen the issue at a small scale for a while on Jewel, but the frequency 
> increased recently. The timing may have coincided with a move to Luminous, 
> but this may be coincidence. We are currently on Ceph 12.2.5.
> > We are wondering if people are experiencing similar issues with 'rbd
> export-diff / import-diff'. I'm sure many people use it to keep backups in 
> sync. Since it is backups, many people may not inspect the data often. In our 
> use case, we use this mechanism to keep data in sync and actually need the 
> data in the other location often. We are wondering if anyone else has
> encountered any issues. It's quite possible that many people have this
> issue but simply don't realize it. We are likely hitting it much more
> frequently due to the scale of our operation (tens of thousands of syncs a 
> day).
>
> If you are able to recreate this reliably without tiering, it would
> assist in debugging if you could capture RBD debug logs during the
> export along w/ the LBA of the filesystem corruption to compare
> against.
>
> We haven't been able to reproduce this reliably yet, and we haven't
> figured out the exact conditions that cause it. We have just been seeing
> it happen on some percentage of export/import-diff operations.
>
>
> Logs from both export-diff and import-diff in a case where the result gets 
> corrupted are attached. Please let me know if you need any more information.
>



-- 
Jason
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
