[ceph-users] Re: 3 OSDs can not be started after a server reboot - rocksdb Corruption

2022-03-17 Thread Igor Fedotov
Hi Sebastian, actually it's hard to tell what's happening with this osd... Maybe it's less fragmented and hence benefits from sequential reading. IIRC you're using spinning drives, which are very susceptible to access patterns. Thanks, Igor On 3/17/2022 11:54 PM, Sebastian Mazza

[ceph-users] Re: 3 OSDs can not be started after a server reboot - rocksdb Corruption

2022-03-17 Thread Sebastian Mazza
Hi Igor, thank you very much for your explanation. I very much appreciate it. You were right, as always. :-) There was not a single corrupted object. I did run `time ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-$X` and `time ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-$X --deep
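
For reference, the two checks boil down to something like the following (a minimal sketch; $X stands for the OSD id, the OSD must be stopped while the tool runs, and the exact form of the --deep flag can vary between releases):

    systemctl stop ceph-osd@$X
    # metadata-only consistency check
    time ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-$X
    # deep check: also reads all object data and verifies checksums (much slower)
    time ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-$X --deep 1
    systemctl start ceph-osd@$X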

[ceph-users] Re: CephFS snaptrim bug?

2022-03-17 Thread Arnaud M
Hello Linkriver, I might have an issue close to yours. Can you tell us if your stray dirs are full? What does this command output for you? ceph tell mds.0 perf dump | grep strays Does the value change over time? All the best Arnaud On Wed, 16 Mar 2022 at 15:35, Linkriver Technology <
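
For anyone following along, a sketch of what that check can look like (counter names are from the mds_cache perf section; the values below are made up purely for illustration):

    # run repeatedly and watch whether the counters keep growing
    ceph tell mds.0 perf dump | grep strays
        "num_strays": 120000,
        "num_strays_delayed": 0,
        "strays_created": 2400000,
        "strays_enqueued": 2280000,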

[ceph-users] Re: Quincy: mClock config propagation does not work properly

2022-03-17 Thread Neha Ojha
Hi Luis, Thanks for testing the Quincy rc and trying out the mClock settings! Sridhar is looking into this issue and will provide his feedback as soon as possible. Thanks, Neha On Thu, Mar 3, 2022 at 5:05 AM Luis Domingues wrote: > > Hi all, > > As we are doing some tests on our lab cluster,
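
For context, the kind of check involved here is comparing what was set in the config database with what a running OSD actually reports (a minimal sketch; option and profile names as in the Quincy mClock documentation, osd.0 is just an example daemon):

    # set the profile cluster-wide for OSDs
    ceph config set osd osd_mclock_profile high_recovery_ops
    # value stored in the config database
    ceph config get osd osd_mclock_profile
    # value the running daemon actually uses
    ceph tell osd.0 config get osd_mclock_profile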

[ceph-users] Re: RGW/S3 losing multipart upload objects

2022-03-17 Thread Ulrich Klein
Yo, that's one of the threads that look very similar to my problem, just with no resolution for me. I have a multi-site setup, so no resharding. Tried it: set up RGW from scratch due to all the “funny” errors. So, hoping for 16.2.7 or the “in the works” fix :) Ciao, Uli > On 17. 03

[ceph-users] Re: RGW/S3 losing multipart upload objects

2022-03-17 Thread Ulrich Klein
Ok, I’ll try again on 16.2.7. The only downside is that then I can’t use the dashboard on Safari, i.e. on iPads, for monitoring anymore. And to make sure: from a user’s/client’s perspective the objects do disappear. Only on the Ceph/RGW side, including accounting, they are still there and can’t be

[ceph-users] Re: RGW/S3 losing multipart upload objects

2022-03-17 Thread Matt Benjamin
Thanks, Soumya. It's also possible that what's reproducing is the known (space) leak during re-upload of multipart parts, described here: https://tracker.ceph.com/issues/44660. A fix for this is being worked on; it's taking a while. Matt On Thu, Mar 17, 2022 at 10:31 AM Soumya Koduri wrote: >

[ceph-users] Re: RGW/S3 losing multipart upload objects

2022-03-17 Thread Soumya Koduri
On 3/17/22 17:16, Ulrich Klein wrote: Hi, My second attempt to get help with a problem I'm trying to solve for about 6 months now. I have a Ceph 16.2.6 test cluster, used almost exclusively for providing RGW/S3 service, similar to a production cluster. The problem I have is this: A client

[ceph-users] Re: How often should I scrub the filesystem ?

2022-03-17 Thread Chris Palmer
Hi Milind, I've done some more tests and updated the tracker https://tracker.ceph.com/issues/54557 Essentially the "rm" works, but after a restart the problem reappears. Thanks, Chris On 17/03/2022 06:15, Milind Changire wrote: Chris, After you run "scrub repair" followed by a "scrub" without

[ceph-users] RGW/S3 losing multipart upload objects

2022-03-17 Thread Ulrich Klein
Hi, My second attempt to get help with a problem I'm trying to solve for about 6 months now. I have a Ceph 16.2.6 test cluster, used almost exclusively for providing RGW/S3 service, similar to a production cluster. The problem I have is this: A client uploads (via S3) a bunch of large files
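
A sketch of how such leftovers can be inspected from both sides (bucket name and endpoint are placeholders; radosgw-admin bucket radoslist needs a reasonably recent release):

    # what the S3 client side still sees as incomplete multipart uploads
    aws --endpoint-url http://rgw.example.com s3api list-multipart-uploads --bucket mybucket

    # the rados objects RGW keeps for the bucket; multipart/shadow pieces show up here
    radosgw-admin bucket radoslist --bucket mybucket | grep -E 'multipart|shadow'

    # size/accounting as RGW reports it
    radosgw-admin bucket stats --bucket mybucket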

[ceph-users] Re: How often should I scrub the filesystem ?

2022-03-17 Thread Milind Changire
Chris, After you run "scrub repair" followed by a "scrub" without any issues, and if the "damage ls" still shows you an error, try running "damage rm" and re-running "scrub" to see if the system still reports any damage. Please update the upstream tracker with your findings if possible. -- Milind On
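
For completeness, that sequence maps to roughly these commands (mds.0 as used earlier in the thread; <damage_id> comes from the damage ls output):

    # recursive scrub with repair from the filesystem root, then check progress
    ceph tell mds.0 scrub start / recursive,repair
    ceph tell mds.0 scrub status

    # list recorded damage entries and remove one once it has been repaired
    ceph tell mds.0 damage ls
    ceph tell mds.0 damage rm <damage_id>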