[ceph-users] Re: How to clear "Too many repaired reads on 1 OSDs" on pacific

2022-03-01 Thread Christian Rohmann

On 28/02/2022 20:54, Sascha Vogt wrote:
Is there a way to clear the error counter on pacific? If so, how? 


No, not anymore. See https://tracker.ceph.com/issues/54182


Regards


Christian

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Multisite sync issue

2022-03-01 Thread Poß, Julian
Thanks a ton for pointing this out.
Just verified this with an RGW user without a tenant; it works perfectly as you would 
expect.
I guess I could have suspected that tenants have something to do with it, since 
I spotted issues with them in the past, too.
Anyway, I got my “solution”. Thanks again!

Best, Julian

From: Mule Te (TWL007) 
Sent: Friday, 25 February 2022 19:45
To: Poß, Julian 
Cc: Eugen Block ; ceph-users@ceph.io
Subject: Re: [ceph-users] Multisite sync issue

We have the same issue on Ceph 15.2.15.

In the testing cluster, it seems like Ceph 16 solved this issue. The PR 
https://github.com/ceph/ceph/pull/41316 seems to fix this issue, but I do not 
know why it was not merged back to Ceph 15.

Also, here is a new issue in the Ceph tracker that describes the same issue you have: 
https://tracker.ceph.com/issues/53737

Thanks


On Feb 25, 2022, at 10:07 PM, Poß, Julian  wrote:

As far as I can tell, it can be reproduced every time, yes.

That statement was actually about two RGWs in one zone. That is also something 
that I tested, because I felt like Ceph should be able to handle that HA-like 
on its own.

But for the main issue, there is indeed only one RGW running in each zone.
As far as I can tell, I see no issues other than what I posted in my 
initial mail.

Best, Julian

-Original Message-
From: Eugen Block 
Sent: Friday, 25 February 2022 12:57
To: ceph-users@ceph.io
Subject: [ceph-users] Re: WG: Multisite sync issue

I see, then I misread your statement about multiple RGWs:


It also worries me that replication won't work with multiple RGWs in
one zone when one of them is unavailable, for instance during
maintenance.

Is there anything other than the RGW logs pointing to any issues? I find it 
strange that a restart of the RGW fixes it. Is this always reproducible?

Quoting "Poß, Julian" :


Hi Eugen,

there is currently only one RGW installed for each region+realm.
So the places to look at are already pretty much limited.

As of now, the RGWs themselves are the endpoints. So far no load balancer
has been put into place there.

Best, Julian

-Original Message-
From: Eugen Block 
Sent: Friday, 25 February 2022 10:52
To: ceph-users@ceph.io
Subject: [ceph-users] Re: WG: Multisite sync issue



Hi,

I would stop all RGWs except one in each cluster to limit the places
and logs to look at. Do you have a load balancer as the endpoint, or do you
have a list of all RGWs as endpoints?


Quoting "Poß, Julian" :


Hi,

I set up multisite with 2 Ceph clusters and multiple RGWs and
realms/zonegroups.
This setup was installed using the ceph-ansible branch "stable-5.0", with
focal+octopus.
During some testing, I noticed that the replication somehow does not
work as expected.

With s3cmd, I put a small file of 1.9 kB into a bucket on the master
zone: s3cmd put /etc/hosts s3://test/

Then I can see in the output of "radosgw-admin sync status
--rgw_realm internal" that the cluster does indeed have something to sync,
switching back to "nothing to sync" after a couple of seconds.
"radosgw-admin sync error list --rgw_realm internal" is empty, too.
However, if I look via s3cmd on the secondary zone, I can't see the
file. Even if I look at the Ceph pools directly, the data didn't get
replicated.
If I upload the file again with the same command and
without a change (basically just updating it), or restart the RGW
daemon of the secondary zone, the affected file gets replicated.

I spotted this issue with all my realms/zonegroups. But even with
"debug_rgw = 20" and debug_rgw_sync = "20" i can't spot any obvious
errors in the logs.

It also worries me that replication won't work with multiple rgws in
one zone, but one of them being unavailable, for instance during
maintenance.
I did somehow expect ceph to work it's way though the list of
available endpoints, and only fail if none are available.
...Or am I missing something here?

Any help whatsoever is very much appreciated.
I am pretty new to multisite and have been stuck on this for a couple of days
now.

Thanks, Julian


Here is some additional information, including some log snippets:

# On the master site, I can see the file in the bilog right away
radosgw-admin bilog list --bucket test/test --rgw_realm internal
   {
   "op_id": "3#001.445.5",
   "op_tag": "b9794e07-8f6c-4c45-a981-a73c3a4dc863.8360.106",
   "op": "write",
   "object": "hosts",
   "instance": "",
   "state": "complete",
   "index_ver": 1,
   "timestamp": "2022-02-24T09:14:41.957638774Z",
   "ver": {
   "pool": 7,
   "epoch": 2
   },
   "bilog_flags": 0

[ceph-users] Re: Multisite sync issue

2022-03-01 Thread Te Mule
Hi Julian:

Could you share your solution for this? We are also trying to find out a 
solution for this. 

Thanks

> On 1 March 2022, at 17:18, Poß, Julian  wrote:
> 
> 
> Thanks a ton for pointing this out.
> Just verified this with an RGW user without a tenant; it works perfectly as you 
> would expect.
> I guess I could have suspected that tenants have something to do with it, 
> since I spotted issues with them in the past, too.
> Anyway, I got my “solution”. Thanks again!
>  
> Best, Julian
>  
> From: Mule Te (TWL007)  
> Sent: Friday, 25 February 2022 19:45
> To: Poß, Julian 
> Cc: Eugen Block ; ceph-users@ceph.io
> Subject: Re: [ceph-users] Multisite sync issue
>  
> We have the same issue on Ceph 15.2.15.
>  
> In the testing cluster, it seems like Ceph 16 solved this issue. The PR 
> https://github.com/ceph/ceph/pull/41316 seems to fix this issue, but I do 
> not know why it was not merged back to Ceph 15. 
>  
> Also, here is a new issue in the Ceph tracker that describes the same issue you have: 
> https://tracker.ceph.com/issues/53737
>  
> Thanks
> 
> 
> On Feb 25, 2022, at 10:07 PM, Poß, Julian  wrote:
>  
> As far as I can tell, it can be reproduced every time, yes.
> 
> That statement was actually about two RGWs in one zone. That is also something 
> that I tested, because I felt like Ceph should be able to handle that HA-like 
> on its own.
> 
> But for the main issue, there is indeed only one RGW running in each zone.
> As far as I can tell, I see no issues other than what I posted in my 
> initial mail.
> 
> Best, Julian
> 
> -Original Message-
> From: Eugen Block  
> Sent: Friday, 25 February 2022 12:57
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: WG: Multisite sync issue
> 
> I see, then I misread your statement about multiple RGWs:
> 
> 
> It also worries me that replication won't work with multiple RGWs in 
> one zone when one of them is unavailable, for instance during 
> maintenance.
> 
> Is there anything other than the RGW logs pointing to any issues? I find it 
> strange that a restart of the RGW fixes it. Is this always reproducible?
> 
> Quoting "Poß, Julian" :
> 
> 
> Hi Eugen,
> 
> there is currently only one RGW installed for each region+realm.
> So the places to look at are already pretty much limited.
> 
> As of now, the RGWs themselves are the endpoints. So far no load balancer 
> has been put into place there.
> 
> Best, Julian
> 
> -Original Message-
> From: Eugen Block 
> Sent: Friday, 25 February 2022 10:52
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: WG: Multisite sync issue
> 
> 
> 
> Hi,
> 
> I would stop all RGWs except one in each cluster to limit the places 
> and logs to look at. Do you have a load balancer as the endpoint, or do you 
> have a list of all RGWs as endpoints?
> 
> 
> Quoting "Poß, Julian" :
> 
> 
> Hi,
> 
> I set up multisite with 2 Ceph clusters and multiple RGWs and 
> realms/zonegroups.
> This setup was installed using the ceph-ansible branch "stable-5.0", with
> focal+octopus.
> During some testing, I noticed that the replication somehow does not 
> work as expected.
> 
> With s3cmd, I put a small file of 1.9 kB into a bucket on the master 
> zone: s3cmd put /etc/hosts s3://test/
> 
> Then I can see in the output of "radosgw-admin sync status 
> --rgw_realm internal" that the cluster does indeed have something to sync, 
> switching back to "nothing to sync" after a couple of seconds.
> "radosgw-admin sync error list --rgw_realm internal" is empty, too.
> However, if I look via s3cmd on the secondary zone, I can't see the 
> file. Even if I look at the Ceph pools directly, the data didn't get 
> replicated.
> If I upload the file again with the same command and 
> without a change (basically just updating it), or restart the RGW 
> daemon of the secondary zone, the affected file gets replicated.
> 
> I spotted this issue with all my realms/zonegroups. But even with 
> "debug_rgw = 20" and "debug_rgw_sync = 20" I can't spot any obvious 
> errors in the logs.
> 
> It also worries me that replication won't work with multiple RGWs in 
> one zone when one of them is unavailable, for instance during 
> maintenance.
> I somehow expected Ceph to work its way through the list of 
> available endpoints, and only fail if none are available.
> ...Or am I missing something here?
> 
> Any help whatsoever is very much appreciated.
> I am pretty new to multisite and have been stuck on this for a couple of days 
> now.
> 
> Thanks, Julian
> 
> 
> Here is some additional information, including some log snippets:
> 
> # On the master site, I can see the file in the bilog right away 
> radosgw-admin bilog list --bucket test/test --rgw_realm internal
>{
>"op_id": "3#001.445.5",
>"op_tag": "b9794e07-8f6c-4c45-a981-a73c3

[ceph-users] Re: Understanding RGW multi zonegroup replication topology

2022-03-01 Thread Ulrich Klein
Hi,

Disclaimer: I'm in no way a Ceph expert. Have just been tinkering with Ceph/RGW 
for a larger installation for a while.

My understanding is that the data between zones in a zonegroup is synced by 
default. And that works well, most of the time.
If you, as I had to, want to restrict what data is synced, because e.g. some 
users' data should not be copied to a second site/zone, then there are a couple 
of ways to achieve that. I discarded sync policies because I couldn't get them 
to work without "sync status" reporting all non-synced data as "not yet" 
synced, and there wasn't any per-user setting anyway.
The sync settings on zones and buckets work fine. But the "sync_from..." 
setting on a zone is all or nothing, and the setting on buckets can only 
be applied once the bucket exists. There is no per-user setting.

As an alternative I used two zonegroups, A and B. On the primary cluster both 
had a zone (A1 and B1 respectively); on the secondary cluster only zonegroup A 
had a zone (A2).
That way, what's written to A1 is replicated to A2. What's written to B1 is 
not replicated/synced.
The good and bad part is that metadata (?) is synced across the zonegroups. 
That's good (for me) because the same user credentials work in both zonegroups, 
i.e. the user selects the "replication" by using different S3 URLs. On the other 
hand it's annoying because bucket names - if I remember correctly - are also 
considered metadata and thus synced across all zones in the realm, which can 
create "interesting" effects.

As I said, I'm not the expert. Just my experience so far.

Ciao, Uli

> On 01. 03 2022, at 01:54, Mark Selby  wrote:
> 
> I am designing a Ceph RGW multisite configuration. I have done a fair bit of 
> reading but still am having trouble grokking the utility of having multiple 
> zonegroups within a realm. I know that all metadata is replicated between 
> zonegroups and that replication can be set up between zones across zonegroups. 
> I am having trouble understanding the benefits that a multi-zonegroup 
> topology would provide. I really want to understand this as I want to make 
> sure that I design a topology that best meets the company’s needs.
> 
> 
> 
> Any and all help is greatly appreciated. Thanks!
> 
> 
> 
> -- 
> 
> Mark Selby
> 
> mse...@voleon.com 
> 
> 
> 
> 
> 
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Multisite sync issue

2022-03-01 Thread Poß, Julian
Hey,

my cluster is only a test installation, to generally verify an RGW multisite 
design, so there is no production data on it.
Therefore my “solution” was to create an RGW S3 user without a tenant, so 
instead of
radosgw-admin user create --tenant=test --uid=test --display-name=test 
--access_key=123 --secret_key=123 --rgw_realm internal

I created a user like this:
radosgw-admin user create --uid=test2 --display-name=test2 --access_key=1234 
--secret_key=1234 --rgw_realm internal


That worked for me to verify the problem. Unfortunately this is most likely 
not going to be a solution for you.
And it isn’t for me either. But knowing this in my test setup, I can take 
precautions for the production installation, and install that with the v16 release 
instead.
I’ll probably verify that this is fixed in the latest v16 release, too, before 
installing production clusters.
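
For completeness, a minimal sketch of what that verification looks like, 
reusing the commands, bucket and realm names from the original report:

   # upload a small object to the master zone with the tenant-less user
   s3cmd put /etc/hosts s3://test/
   # the object should now show up on the secondary zone without a second
   # upload or an RGW restart; the status should settle back to "nothing to sync"
   radosgw-admin sync status --rgw_realm internal
   radosgw-admin sync error list --rgw_realm internal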

Best, Julian

From: Te Mule 
Sent: Tuesday, 1 March 2022 10:26
To: Poß, Julian 
Cc: Eugen Block ; ceph-users@ceph.io
Subject: Re: [ceph-users] Multisite sync issue

Hi Julian:

Could you share your solution for this? We are also trying to find out a 
solution for this.

Thanks


On 1 March 2022, at 17:18, Poß, Julian  wrote:

Thanks a ton for pointing this out.
Just verified this with an RGW user without a tenant; it works perfectly as you would 
expect.
I guess I could have suspected that tenants have something to do with it, since 
I spotted issues with them in the past, too.
Anyway, I got my “solution”. Thanks again!

Best, Julian

From: Mule Te (TWL007) 
Sent: Friday, 25 February 2022 19:45
To: Poß, Julian 
Cc: Eugen Block ; 
ceph-users@ceph.io
Subject: Re: [ceph-users] Multisite sync issue

We have the same issue on Ceph 15.2.15.

In the testing cluster, it seems like Ceph 16 solved this issue. The PR 
https://github.com/ceph/ceph/pull/41316 seems to fix this issue, but I do not 
know why it was not merged back to Ceph 15.

Also, here is a new issue in the Ceph tracker that describes the same issue you have: 
https://tracker.ceph.com/issues/53737

Thanks



On Feb 25, 2022, at 10:07 PM, Poß, Julian  wrote:

As far as I can tell, it can be reproduced every time, yes.

That statement was actually about two RGWs in one zone. That is also something 
that I tested, because I felt like Ceph should be able to handle that HA-like 
on its own.

But for the main issue, there is indeed only one RGW running in each zone.
As far as I can tell, I see no issues other than what I posted in my 
initial mail.

Best, Julian

-Original Message-
From: Eugen Block 
Sent: Friday, 25 February 2022 12:57
To: ceph-users@ceph.io
Subject: [ceph-users] Re: WG: Multisite sync issue

I see, then I misread your statement about multiple RGWs:



It also worries me that replication won't work with multiple RGWs in
one zone when one of them is unavailable, for instance during
maintenance.

Is there anything other than the RGW logs pointing to any issues? I find it 
strange that a restart of the RGW fixes it. Is this always reproducible?

Quoting "Poß, Julian" :



Hi Eugen,

there is currently only one RGW installed for each region+realm.
So the places to look at are already pretty much limited.

As of now, the RGWs themselves are the endpoints. So far no load balancer
has been put into place there.

Best, Julian

-Original Message-
From: Eugen Block 
Sent: Friday, 25 February 2022 10:52
To: ceph-users@ceph.io
Subject: [ceph-users] Re: WG: Multisite sync issue



Hi,

I would stop all RGWs except one in each cluster to limit the places
and logs to look at. Do you have a load balancer as the endpoint, or do you
have a list of all RGWs as endpoints?


Quoting "Poß, Julian" :



Hi,

I set up multisite with 2 Ceph clusters and multiple RGWs and
realms/zonegroups.
This setup was installed using the ceph-ansible branch "stable-5.0", with
focal+octopus.
During some testing, I noticed that the replication somehow does not
work as expected.

With s3cmd, I put a small file of 1.9 kB into a bucket on the master
zone: s3cmd put /etc/hosts s3://test/

Then I can see in the output of "radosgw-admin sync status
--rgw_realm internal" that the cluster does indeed have something to sync,
switching back to "nothing to sync" after a couple of seconds.
"radosgw-admin sync error list --rgw_realm internal" is empty, too.
However, if I look via s3cmd on the secondary zone, I can't see the
file. Even if I look at the Ceph pools directly, the data didn't get
replicated.
If I upload the file again with the

[ceph-users] Re: Errors when scrub ~mdsdir and lots of num_strays

2022-03-01 Thread Arnaud M
I am using Ceph Pacific (16.2.5).

Does anyone have an idea about my issues?

Thanks again to everyone

All the best

Arnaud

On Tue, 1 Mar 2022 at 01:04, Arnaud M  wrote:

> Hello to everyone
>
> Our Ceph cluster is healthy and everything seems to go well, but we have a
> lot of num_strays
>
> ceph tell mds.0 perf dump | grep stray
> "num_strays": 1990574,
> "num_strays_delayed": 0,
> "num_strays_enqueuing": 0,
> "strays_created": 3,
> "strays_enqueued": 17,
> "strays_reintegrated": 0,
> "strays_migrated": 0,
>
> And num_strays doesn't seem to reduce whatever we do (scrub / or scrub
> ~mdsdir).
> And when we scrub ~mdsdir (force,recursive,repair) we get those errors:
>
> {
> "damage_type": "dir_frag",
> "id": 3775653237,
> "ino": 1099569233128,
> "frag": "*",
> "path": "~mds0/stray3/100036efce8"
> },
> {
> "damage_type": "dir_frag",
> "id": 3776355973,
> "ino": 1099567262916,
> "frag": "*",
> "path": "~mds0/stray3/1000350ecc4"
> },
> {
> "damage_type": "dir_frag",
> "id": 3776485071,
> "ino": 1099559071399,
> "frag": "*",
> "path": "~mds0/stray4/10002d3eea7"
> },
>
> And just before the end of the ~mdsdir scrub the MDS crashes and I have to
> run
>
> ceph mds repaired 0 to get the filesystem back online.
>
> There are a lot of them. Do you have any idea what those errors are and how
> I should handle them?
>
> We have a lot of data in our CephFS cluster (350 TB+) and we take a snapshot
> of / every day and keep them for 1 month (rolling).
>
> here is our cluster state
>
> ceph -s
>   cluster:
> id: 817b5736-84ae-11eb-bf7b-c9513f2d60a9
> health: HEALTH_WARN
> 78 pgs not deep-scrubbed in time
> 70 pgs not scrubbed in time
>
>   services:
> mon: 3 daemons, quorum ceph-r-112-1,ceph-g-112-3,ceph-g-112-2 (age 10d)
> mgr: ceph-g-112-2.ghcodb(active, since 4d), standbys:
> ceph-g-112-1.ksojnh
> mds: 1/1 daemons up, 1 standby
> osd: 67 osds: 67 up (since 14m), 67 in (since 7d)
>
>   data:
> volumes: 1/1 healthy
> pools:   5 pools, 609 pgs
> objects: 186.86M objects, 231 TiB
> usage:   351 TiB used, 465 TiB / 816 TiB avail
> pgs: 502 active+clean
>  82  active+clean+snaptrim_wait
>  20  active+clean+snaptrim
>  4   active+clean+scrubbing+deep
>  1   active+clean+scrubbing+deep+snaptrim_wait
>
>   io:
> client:   8.8 MiB/s rd, 39 MiB/s wr, 25 op/s rd, 54 op/s wr
>
> My questions are about the damage found during the ~mdsdir scrub: should I
> worry about it? What does it mean? It seems to be linked to my issue of
> the high number of strays, is that right? How do I fix it, and how do I reduce
> num_strays?
>
> Thanks for everything
>
> All the best
>
> Arnaud
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Multisite sync issue

2022-03-01 Thread 特木勒
Hi Julian:

Thanks for your reply.

We are using tenant-enabled RGW for now. :(

I will try to use Ceph 16 as the secondary cluster to do the testing. If it
works, I will upgrade the master cluster to Ceph v16 too.

Have a good day.

Poß, Julian  wrote on Tuesday, 1 March 2022 at 17:42:

> Hey,
>
>
>
> my cluster is only a test installation, to generally verify an RGW
> multisite design, so there is no production data on it.
>
> Therefore my “solution” was to create an RGW S3 user without a tenant, so
> instead of
>
> radosgw-admin user create --tenant=test --uid=test --display-name=test
> --access_key=123 --secret_key=123 --rgw_realm internal
>
>
>
> I created a user like this:
> radosgw-admin user create --uid=test2 --display-name=test2
> --access_key=1234 --secret_key=1234 --rgw_realm internal
>
>
>
>
>
> That worked for me to verify the problem. Unfortunately this is most
> likely not going to be a solution for you.
>
> And it isn’t for me either. But knowing this in my test setup, I can take
> precautions for the production installation, and install that with the v16
> release instead.
>
> I’ll probably verify that this is fixed in the latest v16 release, too,
> before installing production clusters.
>
>
>
> Best, Julian
>
>
>
> *From:* Te Mule 
> *Sent:* Tuesday, 1 March 2022 10:26
> *To:* Poß, Julian 
> *Cc:* Eugen Block ; ceph-users@ceph.io
> *Subject:* Re: [ceph-users] Multisite sync issue
>
>
>
> Hi Julian:
>
>
>
> Could you share your solution for this? We are also trying to find out a
> solution for this.
>
>
>
> Thanks
>
>
>
> On 1 March 2022, at 17:18, Poß, Julian  wrote:
>
> 
>
> Thanks a ton for pointing this out.
>
> Just verified this with an RGW user without a tenant; it works perfectly as you
> would expect.
>
> I guess I could have suspected that tenants have something to do with it,
> since I spotted issues with them in the past, too.
>
> Anyway, I got my “solution”. Thanks again!
>
>
>
> Best, Julian
>
>
>
> *From:* Mule Te (TWL007) 
> *Sent:* Friday, 25 February 2022 19:45
> *To:* Poß, Julian 
> *Cc:* Eugen Block ; ceph-users@ceph.io
> *Subject:* Re: [ceph-users] Multisite sync issue
>
>
>
> We have the same issue on Ceph 15.2.15.
>
>
>
> In the testing cluster, it seems like Ceph 16 solved this issue. The PR
> https://github.com/ceph/ceph/pull/41316 seems to fix this issue, but I
> do not know why it was not merged back to Ceph 15.
>
>
>
> Also, here is a new issue in the Ceph tracker that describes the same issue you
> have: https://tracker.ceph.com/issues/53737
>
>
>
> Thanks
>
>
>
>
> On Feb 25, 2022, at 10:07 PM, Poß, Julian  wrote:
>
>
>
> As far as I can tell, it can be reproduced every time, yes.
>
> That statement was actually about two RGWs in one zone. That is also
> something that I tested, because I felt like Ceph should be able to handle
> that HA-like on its own.
>
> But for the main issue, there is indeed only one RGW running in each zone.
> As far as I can tell, I see no issues other than what I posted in my
> initial mail.
>
> Best, Julian
>
> -Original Message-
> From: Eugen Block 
> Sent: Friday, 25 February 2022 12:57
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: WG: Multisite sync issue
>
> I see, then I misread your statement about multiple RGWs:
>
>
>
> It also worries me that replication won't work with multiple RGWs in
> one zone when one of them is unavailable, for instance during
> maintenance.
>
>
> Is there anything other than the RGW logs pointing to any issues? I find it
> strange that a restart of the RGW fixes it. Is this always
> reproducible?
>
> Quoting "Poß, Julian" :
>
>
>
> Hi Eugen,
>
> there is currently only one RGW installed for each region+realm.
> So the places to look at are already pretty much limited.
>
> As of now, the RGWs themselves are the endpoints. So far no load balancer
> has been put into place there.
>
> Best, Julian
>
> -Original Message-
> From: Eugen Block 
> Sent: Friday, 25 February 2022 10:52
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: WG: Multisite sync issue
>
>
>
> Hi,
>
> I would stop all RGWs except one in each cluster to limit the places
> and logs to look at. Do you have a load balancer as the endpoint, or do you
> have a list of all RGWs as endpoints?
>
>
> Quoting "Poß, Julian" :
>
>
>
> Hi,
>
> I set up multisite with 2 Ceph clusters and multiple RGWs and
> realms/zonegroups.
> This setup was installed using the ceph-ansible branch "stable-5.0", with
> focal+octopus.
> During some testing, I noticed that the replication somehow does not
> work as expected.
>
> With s3cmd, I put a small file of 1.9 kB into a bucket on the master
> zone: s3cmd put /etc/hosts s3://test/
>
> Then I can see in the output of "radosgw-admin sync status
> --rgw_realm internal" that the cluster does indeed have something to sync,
> switching back to "nothing to sync"

[ceph-users] Re: Errors when scrub ~mdsdir and lots of num_strays

2022-03-01 Thread Dan van der Ster
Hi,

stray files are created when you have hardlinks to deleted files, or
snapshots of deleted files.
You need to delete the snapshots, or "reintegrate" the hardlinks by
recursively listing the relevant files.
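
A minimal sketch of what that reintegration looks like in practice (the mount
point and directory below are placeholders for wherever the hardlinked files
live):

   # recursively list the directories that reference the deleted/hardlinked
   # files, which gives the MDS a chance to reintegrate the stray entries
   find /mnt/cephfs/path/to/dir -ls > /dev/null
   # then check whether the stray counters went down
   ceph tell mds.0 perf dump | grep -E 'num_strays|strays_reintegrated'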

BTW, in pacific there isn't a big problem with accumulating lots of
stray files. (Before pacific there was a default limit of 1M strays,
but that is now removed).

Cheers, dan

On Tue, Mar 1, 2022 at 1:04 AM Arnaud M  wrote:
>
> Hello to everyone
>
> Our ceph cluster is healthy and everything seems to go well but we have a
> lot of num_strays
>
> ceph tell mds.0 perf dump | grep stray
> "num_strays": 1990574,
> "num_strays_delayed": 0,
> "num_strays_enqueuing": 0,
> "strays_created": 3,
> "strays_enqueued": 17,
> "strays_reintegrated": 0,
> "strays_migrated": 0,
>
> And num_strays doesn't seem to reduce whatever we do (scrub / or scrub
> ~mdsdir).
> And when we scrub ~mdsdir (force,recursive,repair) we get those errors:
>
> {
> "damage_type": "dir_frag",
> "id": 3775653237,
> "ino": 1099569233128,
> "frag": "*",
> "path": "~mds0/stray3/100036efce8"
> },
> {
> "damage_type": "dir_frag",
> "id": 3776355973,
> "ino": 1099567262916,
> "frag": "*",
> "path": "~mds0/stray3/1000350ecc4"
> },
> {
> "damage_type": "dir_frag",
> "id": 3776485071,
> "ino": 1099559071399,
> "frag": "*",
> "path": "~mds0/stray4/10002d3eea7"
> },
>
> And just before the end of the ~mdsdir scrub the MDS crashes and I have to
> run
>
> ceph mds repaired 0 to get the filesystem back online.
>
> There are a lot of them. Do you have any idea what those errors are and how
> I should handle them?
>
> We have a lot of data in our CephFS cluster (350 TB+) and we take a snapshot
> of / every day and keep them for 1 month (rolling).
>
> here is our cluster state
>
> ceph -s
>   cluster:
> id: 817b5736-84ae-11eb-bf7b-c9513f2d60a9
> health: HEALTH_WARN
> 78 pgs not deep-scrubbed in time
> 70 pgs not scrubbed in time
>
>   services:
> mon: 3 daemons, quorum ceph-r-112-1,ceph-g-112-3,ceph-g-112-2 (age 10d)
> mgr: ceph-g-112-2.ghcodb(active, since 4d), standbys:
> ceph-g-112-1.ksojnh
> mds: 1/1 daemons up, 1 standby
> osd: 67 osds: 67 up (since 14m), 67 in (since 7d)
>
>   data:
> volumes: 1/1 healthy
> pools:   5 pools, 609 pgs
> objects: 186.86M objects, 231 TiB
> usage:   351 TiB used, 465 TiB / 816 TiB avail
> pgs: 502 active+clean
>  82  active+clean+snaptrim_wait
>  20  active+clean+snaptrim
>  4   active+clean+scrubbing+deep
>  1   active+clean+scrubbing+deep+snaptrim_wait
>
>   io:
> client:   8.8 MiB/s rd, 39 MiB/s wr, 25 op/s rd, 54 op/s wr
>
> My questions are about the damage found during the ~mdsdir scrub: should I
> worry about it? What does it mean? It seems to be linked to my issue of
> the high number of strays, is that right? How do I fix it, and how do I reduce
> num_strays?
>
> Thanks for everything
>
> All the best
>
> Arnaud
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Errors when scrub ~mdsdir and lots of num_strays

2022-03-01 Thread Arnaud M
Hello Dan

Thanks a lot for the answer

I do remove the snaps every day (I keep them for one month),
but the "num_strays" never seems to reduce.

I know I can do a listing of the folder with "find . -ls".

So my question is: is there a way to find the directories causing the strays
so I can "find . -ls" them? I would prefer not to do it on my whole cluster,
as it will take time (several days, and more if I need to do it on
every snap as well) and will certainly overload the MDS.

Please let me know if there is a way to spot the source of strays, so I
can find the folder/snap with the most strays.

And what about the scrub of ~mdsdir, which crashes every time with the error:

{
"damage_type": "dir_frag",
"id": 3776355973,
"ino": 1099567262916,
"frag": "*",
"path": "~mds0/stray3/1000350ecc4"
},

Again, thanks for your help, that is really appreciated

All the best

Arnaud

On Tue, 1 Mar 2022 at 11:02, Dan van der Ster  wrote:

> Hi,
>
> stray files are created when you have hardlinks to deleted files, or
> snapshots of deleted files.
> You need to delete the snapshots, or "reintegrate" the hardlinks by
> recursively listing the relevant files.
>
> BTW, in pacific there isn't a big problem with accumulating lots of
> stray files. (Before pacific there was a default limit of 1M strays,
> but that is now removed).
>
> Cheers, dan
>
> On Tue, Mar 1, 2022 at 1:04 AM Arnaud M 
> wrote:
> >
> > Hello to everyone
> >
> > Our ceph cluster is healthy and everything seems to go well but we have a
> > lot of num_strays
> >
> > ceph tell mds.0 perf dump | grep stray
> > "num_strays": 1990574,
> > "num_strays_delayed": 0,
> > "num_strays_enqueuing": 0,
> > "strays_created": 3,
> > "strays_enqueued": 17,
> > "strays_reintegrated": 0,
> > "strays_migrated": 0,
> >
> > And num_strays doesn't seem to reduce whatever we do (scrub / or scrub
> > ~mdsdir).
> > And when we scrub ~mdsdir (force,recursive,repair) we get those errors:
> >
> > {
> > "damage_type": "dir_frag",
> > "id": 3775653237,
> > "ino": 1099569233128,
> > "frag": "*",
> > "path": "~mds0/stray3/100036efce8"
> > },
> > {
> > "damage_type": "dir_frag",
> > "id": 3776355973,
> > "ino": 1099567262916,
> > "frag": "*",
> > "path": "~mds0/stray3/1000350ecc4"
> > },
> > {
> > "damage_type": "dir_frag",
> > "id": 3776485071,
> > "ino": 1099559071399,
> > "frag": "*",
> > "path": "~mds0/stray4/10002d3eea7"
> > },
> >
> > And just before the end of the ~mdsdir scrub the MDS crashes and I have
> to
> > run
> >
> > ceph mds repaired 0 to get the filesystem back online.
> >
> > There are a lot of them. Do you have any idea what those errors are and how
> > I should handle them?
> >
> > We have a lot of data in our CephFS cluster (350 TB+) and we take a snapshot
> > of / every day and keep them for 1 month (rolling).
> >
> > here is our cluster state
> >
> > ceph -s
> >   cluster:
> > id: 817b5736-84ae-11eb-bf7b-c9513f2d60a9
> > health: HEALTH_WARN
> > 78 pgs not deep-scrubbed in time
> > 70 pgs not scrubbed in time
> >
> >   services:
> > mon: 3 daemons, quorum ceph-r-112-1,ceph-g-112-3,ceph-g-112-2 (age
> 10d)
> > mgr: ceph-g-112-2.ghcodb(active, since 4d), standbys:
> > ceph-g-112-1.ksojnh
> > mds: 1/1 daemons up, 1 standby
> > osd: 67 osds: 67 up (since 14m), 67 in (since 7d)
> >
> >   data:
> > volumes: 1/1 healthy
> > pools:   5 pools, 609 pgs
> > objects: 186.86M objects, 231 TiB
> > usage:   351 TiB used, 465 TiB / 816 TiB avail
> > pgs: 502 active+clean
> >  82  active+clean+snaptrim_wait
> >  20  active+clean+snaptrim
> >  4   active+clean+scrubbing+deep
> >  1   active+clean+scrubbing+deep+snaptrim_wait
> >
> >   io:
> > client:   8.8 MiB/s rd, 39 MiB/s wr, 25 op/s rd, 54 op/s wr
> >
> > My questions are about the damage found during the ~mdsdir scrub: should I
> > worry about it? What does it mean? It seems to be linked to my issue
> of
> > the high number of strays, is that right? How do I fix it, and how do I reduce
> > num_strays?
> >
> > Thanks for everything
> >
> > All the best
> >
> > Arnaud
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Errors when scrub ~mdsdir and lots of num_strays

2022-03-01 Thread Dan van der Ster
Hi,

There was a recent (long) thread about this. It might give you some hints:
   
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/2NT55RUMD33KLGQCDZ74WINPPQ6WN6CW/
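
One thing that comes up in threads like that is peeking at the stray
directories directly; a heavily hedged sketch (as far as I know the rank-0
strays live in the objects 600.00000000 through 609.00000000 of the CephFS
metadata pool, and the pool name below is only an assumption):

   # count the entries in the first of the ten stray directories of rank 0
   rados -p cephfs_metadata listomapkeys 600.00000000 | wc -l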

And about the crash, it could be related to
https://tracker.ceph.com/issues/51824

Cheers, dan


On Tue, Mar 1, 2022 at 11:30 AM Arnaud M  wrote:
>
> Hello Dan
>
> Thanks a lot for the answer
>
> I do remove the snaps every day (I keep them for one month),
> but the "num_strays" never seems to reduce.
>
> I know I can do a listing of the folder with "find . -ls".
>
> So my question is: is there a way to find the directories causing the strays so 
> I can "find . -ls" them? I would prefer not to do it on my whole cluster, as 
> it will take time (several days, and more if I need to do it on every 
> snap as well) and will certainly overload the MDS.
>
> Please let me know if there is a way to spot the source of strays, so I can 
> find the folder/snap with the most strays.
>
> And what about the scrub of ~mdsdir, which crashes every time with the error:
>
> {
> "damage_type": "dir_frag",
> "id": 3776355973,
> "ino": 1099567262916,
> "frag": "*",
> "path": "~mds0/stray3/1000350ecc4"
> },
>
> Again, thanks for your help, that is really appreciated
>
> All the best
>
> Arnaud
>
> On Tue, 1 Mar 2022 at 11:02, Dan van der Ster  wrote:
>>
>> Hi,
>>
>> stray files are created when you have hardlinks to deleted files, or
>> snapshots of deleted files.
>> You need to delete the snapshots, or "reintegrate" the hardlinks by
>> recursively listing the relevant files.
>>
>> BTW, in pacific there isn't a big problem with accumulating lots of
>> stray files. (Before pacific there was a default limit of 1M strays,
>> but that is now removed).
>>
>> Cheers, dan
>>
>> On Tue, Mar 1, 2022 at 1:04 AM Arnaud M  wrote:
>> >
>> > Hello to everyone
>> >
>> > Our ceph cluster is healthy and everything seems to go well but we have a
>> > lot of num_strays
>> >
>> > ceph tell mds.0 perf dump | grep stray
>> > "num_strays": 1990574,
>> > "num_strays_delayed": 0,
>> > "num_strays_enqueuing": 0,
>> > "strays_created": 3,
>> > "strays_enqueued": 17,
>> > "strays_reintegrated": 0,
>> > "strays_migrated": 0,
>> >
>> > And num_strays doesn't seem to reduce whatever we do (scrub / or scrub
>> > ~mdsdir).
>> > And when we scrub ~mdsdir (force,recursive,repair) we get those errors:
>> >
>> > {
>> > "damage_type": "dir_frag",
>> > "id": 3775653237,
>> > "ino": 1099569233128,
>> > "frag": "*",
>> > "path": "~mds0/stray3/100036efce8"
>> > },
>> > {
>> > "damage_type": "dir_frag",
>> > "id": 3776355973,
>> > "ino": 1099567262916,
>> > "frag": "*",
>> > "path": "~mds0/stray3/1000350ecc4"
>> > },
>> > {
>> > "damage_type": "dir_frag",
>> > "id": 3776485071,
>> > "ino": 1099559071399,
>> > "frag": "*",
>> > "path": "~mds0/stray4/10002d3eea7"
>> > },
>> >
>> > And just before the end of the ~mdsdir scrub the MDS crashes and I have to
>> > run
>> >
>> > ceph mds repaired 0 to get the filesystem back online.
>> >
>> > There are a lot of them. Do you have any idea what those errors are and how
>> > I should handle them?
>> >
>> > We have a lot of data in our CephFS cluster (350 TB+) and we take a snapshot
>> > of / every day and keep them for 1 month (rolling).
>> >
>> > here is our cluster state
>> >
>> > ceph -s
>> >   cluster:
>> > id: 817b5736-84ae-11eb-bf7b-c9513f2d60a9
>> > health: HEALTH_WARN
>> > 78 pgs not deep-scrubbed in time
>> > 70 pgs not scrubbed in time
>> >
>> >   services:
>> > mon: 3 daemons, quorum ceph-r-112-1,ceph-g-112-3,ceph-g-112-2 (age 10d)
>> > mgr: ceph-g-112-2.ghcodb(active, since 4d), standbys:
>> > ceph-g-112-1.ksojnh
>> > mds: 1/1 daemons up, 1 standby
>> > osd: 67 osds: 67 up (since 14m), 67 in (since 7d)
>> >
>> >   data:
>> > volumes: 1/1 healthy
>> > pools:   5 pools, 609 pgs
>> > objects: 186.86M objects, 231 TiB
>> > usage:   351 TiB used, 465 TiB / 816 TiB avail
>> > pgs: 502 active+clean
>> >  82  active+clean+snaptrim_wait
>> >  20  active+clean+snaptrim
>> >  4   active+clean+scrubbing+deep
>> >  1   active+clean+scrubbing+deep+snaptrim_wait
>> >
>> >   io:
>> > client:   8.8 MiB/s rd, 39 MiB/s wr, 25 op/s rd, 54 op/s wr
>> >
>> > My questions are about the damage found during the ~mdsdir scrub: should I
>> > worry about it? What does it mean? It seems to be linked to my issue of
>> > the high number of strays, is that right? How do I fix it, and how do I reduce
>> > num_strays?
>> >
>> > Thanks for everything
>> >
>> > All the best
>> >
>> > Arnaud
>> > ___
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-us

[ceph-users] Journal size recommendations

2022-03-01 Thread Michel Niyoyita
Dear ceph-users ,

I would like to ask which journal size would perform well for a cluster
that has 36 OSDs, each 8 TB. I would like to set 5012; is this
helpful? Your input is highly appreciated.

Thank you.

Michel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to clear "Too many repaired reads on 1 OSDs" on pacific

2022-03-01 Thread Sascha Vogt

Hi,

On 01.03.2022 at 02:04, Szabo, Istvan (Agoda) wrote:

Restart osd.


Many thanks! That worked. I wonder why I didn't try that myself :)

Greetings
-Sascha-

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to clear "Too many repaired reads on 1 OSDs" on pacific

2022-03-01 Thread Sascha Vogt

Hi Christian,
On 01.03.2022 at 09:01, Christian Rohmann wrote:

On 28/02/2022 20:54, Sascha Vogt wrote:
Is there a way to clear the error counter on pacific? If so, how? 


No, not anymore. See https://tracker.ceph.com/issues/54182


Thanks for the link. Restarting the OSD seems to clear the counter, or at 
least the warning in "ceph -s".
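
For the archives, a short sketch of that workaround. The OSD id is whatever
the OSD_TOO_MANY_REPAIRS warning points at (osd.12 here is just an example);
the second command assumes a cephadm/orchestrator deployment, the last one a
plain systemd deployment:

   # find out which OSD triggered the warning
   ceph health detail
   # restart just that OSD
   ceph orch daemon restart osd.12
   # ...or, without the orchestrator:
   systemctl restart ceph-osd@12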


Greetings
-Sascha-
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Journal size recommendations

2022-03-01 Thread Eugen Block

Hi,

can you be more specific about what exactly you are looking for? Are you 
talking about the RocksDB size? And what is the unit for 5012? It’s 
really not clear to me what you’re asking. And since the 
recommendations vary between different use cases, you might want to 
share more details about your use case.
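
For what it's worth, "journal size" can mean two rather different things; a
hedged sketch of where each would be set (the values and device names are only
examples, not recommendations):

   # FileStore only: journal size in MB (the old default was 5120, i.e. 5 GiB)
   ceph config set osd osd_journal_size 5120
   # BlueStore has no journal; the RocksDB/WAL usually goes on a separate
   # block.db device, sized when the OSD is created, e.g. with ceph-volume
   ceph-volume lvm create --data /dev/sdX --block.db /dev/sdY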


Quoting Michel Niyoyita :


Dear ceph-users ,

I would like to ask which journal size would perform well for a cluster
that has 36 OSDs, each 8 TB. I would like to set 5012; is this
helpful? Your input is highly appreciated.

Thank you.

Michel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Problem with internals and mgr/ out-of-memory, unresponsive, high-CPU

2022-03-01 Thread Ted Lum
I'm attempting to install an OpenStack cluster, with Ceph. It's doing a 
cephadm install (bootstrap overcloud-controller-0, then deploy from 
there to the other two nodes)


This is a containerized install:

parameter_defaults:
  ContainerImagePrepare:
  - set:
      ceph_alertmanager_image: alertmanager
      ceph_alertmanager_namespace: quay.ceph.io/prometheus
      ceph_alertmanager_tag: v0.16.2
      ceph_grafana_image: grafana
      ceph_grafana_namespace: quay.ceph.io/app-sre
      ceph_grafana_tag: 6.7.4
      ceph_image: daemon
      ceph_namespace: quay.io/ceph
      ceph_node_exporter_image: node-exporter
      ceph_node_exporter_namespace: quay.ceph.io/prometheus
      ceph_node_exporter_tag: v0.17.0
      ceph_prometheus_image: prometheus
      ceph_prometheus_namespace: quay.ceph.io/prometheus
      ceph_prometheus_tag: v2.7.2
      ceph_tag: v6.0.4-stable-6.0-pacific-centos-8-x86_64
      name_prefix: openstack-
      name_suffix: ''
      namespace: quay.io/tripleomaster
      neutron_driver: ovn
      rhel_containers: false
      tag: current-tripleo
    tag_from_label: rdo_version

When a controller node becomes active, a call is made to (I believe) 
ActivePyModules::set_store(...) with a corrupt - corrupt in a huge way - 
JSON payload, which results in it attempting to allocate virtual memory 
for it, but it runs out of VM at around 120 GiB, all the while the node is 
unresponsive because the CPU is running at around 100% I/O wait. Eventually 
control is transferred to another node, which then begins the same 
behavior, and it just transfers from one node to the next indefinitely.


I don't know where the payload is coming from; I don't yet know enough 
about Ceph internals. I believe it's delivered by a message, but I don't 
know who sent it.


The set_store call is trying to do a "config-key set 
mgr/cephadm/host.overcloud-controller-0". The problem with the payload 
seems to be a repeated, seemingly unbounded duplication of an IP address. The 
following is an excerpt from just before it loses its mind:


..."networks_and_interfaces": {"10.100.4.0/24": {"br-ex": 
["10.100.4.71"]}, "10.100.5.0/24": {"vlan1105": ["10.100.5.154"]}, 
"10.100.6.0/24": {"vlan1106": ["10.100.6.163"]}, "10.100.7.0/24": 
{"vlan1107": ["10.100.7.163", "10.100.7.163", ... (the vlan1107 IP is 
repeated more times than the log can hold).
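
If it helps to confirm what actually got stored under that key, a hedged
sketch of how to look at it directly (the key name is taken from the set_store
call above; this assumes the mon/mgr still answer commands):

   # dump the stored host record and just measure how large it has become
   ceph config-key get mgr/cephadm/host.overcloud-controller-0 | wc -c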


I've truncated the length of these lines, but it gives an idea of the 7 
minutes that it spends in I/O wait hell while it's sucking up all 
available VM.


Feb 27 22:43:59 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 1207967744 bytes == 0x5606e03a @
Feb 27 22:44:00 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 1207967744 bytes == 0x560662376000 @
Feb 27 22:44:10 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 2415927296 bytes == 0x5607b8ba6000 @
Feb 27 22:44:12 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 2415927296 bytes == 0x560848ba8000 @
Feb 27 22:44:18 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 2415927296 bytes == 0x560848ba8000 @
Feb 27 22:44:19 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 2415927296 bytes == 0x56063e374000 @
Feb 27 22:44:25 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 2415927296 bytes == 0x56063e374000 @
Feb 27 22:44:32 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 4831846400 bytes == 0x5608d8baa000 @
Feb 27 22:44:35 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 4831846400 bytes == 0x5609f93ac000 @
Feb 27 22:44:46 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 4831846400 bytes == 0x5609f93ac000 @
Feb 27 22:44:48 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 4831846400 bytes == 0x5607b8ba6000 @
Feb 27 22:45:00 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 4831846400 bytes == 0x5607b8ba6000 @
Feb 27 22:45:02 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 4831846400 bytes == 0x56063e374000 @
Feb 27 22:45:13 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 9663684608 bytes == 0x560b193ae000 @
Feb 27 22:45:21 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 9663684608 bytes == 0x560d59bb @
Feb 27 22:45:45 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 9663684608 bytes == 0x560d59bb @
Feb 27 22:45:52 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 9663684608 bytes == 0x560f9abb2000 @
Feb 27 22:46:14 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 9663684608 bytes == 0x560f9abb2000 @
Feb 27 22:46:18 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 9663684608 bytes == 0x5611db3b4000 @
Feb 27 22:47:42 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 19327361024 bytes == 0x56141bbb6000
Feb 27 22:47:58 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 19327361024 bytes == 0x56189cbb8000
Feb 27 22:48:45 overcloud-controller-0 conmon[4885]: tcmalloc: large 
alloc 19327361024 bytes == 0x56189cbb8000
Feb