[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2020-06-25 Thread Lindsay Mathieson
On 25/06/2020 5:10 pm, Frank Schilder wrote: I was pondering that. The problem is that on CentOS systems it seems to be ignored, that in general it does not apply to SAS drives, for example, and that there is no working way of configuring which drives to exclude. For example, while for data

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Lindsay Mathieson
On 26/06/2020 1:44 am, Jiri D. Hoogeveen wrote: In Mimic I had only some misplaced objects and it recovered within an hour. In Nautilus, when I do exactly the same, I get, besides misplaced objects, also degraded PGs and undersized PGs, and the recovery takes almost a day. Slowness of recovery

[ceph-users] Re: RGW listing slower on nominally faster setup

2020-06-25 Thread Mariusz Gronczewski
I've already filed a bug for it, as we ran into the same issue: https://tracker.ceph.com/issues/45955. You might want to add that extra info about debug_rgw there. On 2020-06-24, at 20:31:35, jgo...@teraswitch.com wrote: > We have a cluster, running Octopus 15.2.2, with the same exact issue >

[ceph-users] Re: Bluestore performance tuning for hdd with nvme db+wal

2020-06-25 Thread Mark Kirkwood
Progress update: - tweaked debug_rocksdb to 1/5. *possibly* helped, fewer slow requests - will increase osd_memory_target from 4 to 16G, and observe On 24/06/20 1:30 pm, Mark Kirkwood wrote: Hi, We have recently added a new storage node to our Luminous (12.2.13) cluster. The prev nodes are
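A minimal sketch of how those two settings could be applied on a Luminous node (assumption: runtime change via injectargs plus a ceph.conf entry for persistence, since the centralized config store only arrived in Mimic; the 16 GiB figure is simply the value mentioned above):

    # runtime, all OSDs (value in bytes; 16 GiB = 17179869184)
    ceph tell osd.\* injectargs '--osd_memory_target 17179869184'
    ceph tell osd.\* injectargs '--debug_rocksdb 1/5'
    # persist on the storage node in /etc/ceph/ceph.conf
    [osd]
    osd_memory_target = 17179869184
    debug_rocksdb = 1/5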

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Chad William Seys
Do you mean unfound instead of undersized? There is an as yet unreproducible bug: https://tracker.ceph.com/issues/44286 (Please follow this bug if it affects you! I've experienced it and am leery of doing any drive swaps or upgrades until it is fixed.) Chad.

[ceph-users] Re: Ceph Tech Talk: Solving the Bug of the Year

2020-06-25 Thread Marc Roos
Top! Good to see such pros on the team. Which CephFS bugs is Dan waiting to have fixed before he upgrades from Luminous to Nautilus? -Original Message- To: ceph-users@ceph.io Subject: [ceph-users] Ceph Tech Talk: Solving the Bug of the Year Hi everyone, Thanks again to

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Frank Schilder
OK, this *does* sound bad. I would consider this a show stopper for upgrade from mimic. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Francois Legrand Sent: 25 June 2020 19:25:14 To: ceph-users@ceph.io

[ceph-users] Ceph Tech Talk: Solving the Bug of the Year

2020-06-25 Thread Mike Perez
Hi everyone, Thanks again to everyone who was able to join us for discussion, and to Dan for providing some great content. You can find the full recording for the latest Ceph Tech Talk here: https://www.youtube.com/watch?v=_4HUR00oCGo We're looking for a talk for August 27th. If you're

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Frank Schilder
Hi Jiri, this doesn't sound too bad. I don't know if the recovery time is to be expected; how does it compare to the same operation on Mimic with the same utilization? In any case, single-disk recovery is slow and I plan not to do it. Replacements are installed together with upgrades to have

[ceph-users] node-exporter error problem

2020-06-25 Thread Cem Zafer
Hi, Our Ceph cluster's health is fine, but when I looked at "ceph orch ps", one of the images has an error state, as shown below. node-exporter.ceph102 ceph102 error 7m ago 13m prom/node-exporter How can we debug and locate the problem with a ceph command? Another
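A rough sketch of the usual first debugging steps for a cephadm-managed daemon in error state (assuming an Octopus/cephadm deployment and that the daemon name matches the "ceph orch ps" output above):

    # narrow the orchestrator view to the failing daemon type
    ceph orch ps --daemon-type node-exporter
    # recent cephadm events often explain why a daemon went into error state
    ceph log last cephadm
    # on the affected host, inspect the container's logs
    cephadm logs --name node-exporter.ceph102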

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Frank Schilder
I actually don't think this is the problem. I removed a 120TB file system EC data pool in Mimic without any special flags or magic. The OSDs of the data pool are HDDs with everything collocated. I had absolutely no problems; the data was removed after 2-3 days and nobody even noticed. This is a

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Marc Roos
Do you do things like [1] with the VMs? [1] echo 120 > /sys/block/sda/device/timeout -Original Message- From: Francois Legrand [mailto:f...@lpnhe.in2p3.fr] Sent: donderdag 25 juni 2020 19:25 To: ceph-users@ceph.io Subject: [ceph-users] Re: Removing pool in nautilus is incredibly
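For reference, a hedged sketch of how that timeout is often made persistent inside the VMs via a udev rule (the file name is illustrative; the 120 s value matches the echo above):

    # /etc/udev/rules.d/99-scsi-timeout.rules (illustrative name)
    # raise the SCSI command timeout so brief storage stalls don't error out guest I/O
    ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd[a-z]", ATTR{device/timeout}="120"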

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Francois Legrand
For sure, if I could downgrade to Mimic I would probably do it! So I understand that you plan not to upgrade! F. On 25/06/2020 at 19:28, Frank Schilder wrote: OK, this *does* sound bad. I would consider this a show stopper for upgrade from mimic. Best regards, = Frank

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Jiri D. Hoogeveen
Hi Frank, "With our mimic cluster I have absolutely no problems migrating pools in one go to a completely new set of disks. I have no problems doubling the number of disks and at the same time doubling the number of PGs in a pool and letting the rebalancing loose in one single go. No need for slowly

[ceph-users] Re: Unsubscribe this mail list

2020-06-25 Thread Brian Topping
That kind of information is ALWAYS in the headers of every email. > List-Unsubscribe: > On Jun 25, 2020, at 9:21 AM, adan wrote: > > hello > > I want to unsubscribe from this mailing list. Please help me. > > > On 2020/6/25 22:42, ceph-users-requ...@ceph.io wrote: >>

[ceph-users] Unsubscribe this mail list

2020-06-25 Thread adan
Hello, I want to unsubscribe from this mailing list. Please help me. On 2020/6/25 22:42, ceph-users-requ...@ceph.io wrote: Send ceph-users mailing list submissions to ceph-users@ceph.io To subscribe or unsubscribe via email, send a message with subject or body 'help' to

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Eugen Block
I'm not sure if your OSDs have their RocksDB on faster devices; if not, it sounds a lot like RocksDB fragmentation [1] leading to very high load on the OSDs and occasionally crashing OSDs. If you don't plan to delete this much data at once on a regular basis, you could sit this one out, but
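One commonly suggested mitigation for RocksDB fragmentation is an offline compaction per OSD; a minimal sketch, assuming default data paths and that each OSD is stopped while it is compacted (OSD id 12 is just an example):

    systemctl stop ceph-osd@12
    ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-12 compact
    systemctl start ceph-osd@12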

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Francois Legrand
Thanks for the hint. I tried it, but it doesn't seem to change anything... Moreover, as the OSDs seem quite loaded, I regularly had some OSDs marked down, which triggered new peering and thus more load! I set the nodown flag, but I still have some OSDs reported (wrongly) as down (and
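For context, the flag mentioned above is set and cleared cluster-wide like this (a minimal sketch; remember to unset it once the deletion load settles, since it keeps the monitors from marking unresponsive OSDs down):

    ceph osd set nodown
    # later, once the load has settled
    ceph osd unset nodown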

[ceph-users] Re: Feedback of the used configuration

2020-06-25 Thread Simon Sutter
Hello Paul, Thanks for the answer. I took a look at the subvolumes, but they are a bit odd in my opinion. If I create one with a subvolume group, the folder structure looks like this: /cephfs/volumes/group-name/subvolume-name/random-uuid/ And I have to issue two commands, first set the group
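The two commands referred to above are presumably along these lines (a sketch with placeholder names; "cephfs" is the volume, and the random UUID directory is what "getpath" returns):

    ceph fs subvolumegroup create cephfs group-name
    ceph fs subvolume create cephfs subvolume-name --group_name group-name
    # the mountable path ( .../group-name/subvolume-name/<uuid> ) comes from:
    ceph fs subvolume getpath cephfs subvolume-name --group_name group-name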

[ceph-users] Lifecycle message on logs

2020-06-25 Thread Marcelo Miziara
Hello... it's the first time I need to use lifecycle rules. I created a bucket and set it to expire in one day with s3cmd: s3cmd expire --expiry-days=1 s3://bucket. The rgw_lifecycle_work_time is set to the default value (00:00-06:00). But I noticed a lot of messages like this in the rgw logs:
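To check whether the rule was registered and to trigger processing outside rgw_lifecycle_work_time, radosgw-admin has lc subcommands; a minimal sketch (exact output and available options depend on the RGW release):

    # list buckets with a lifecycle configuration and their processing status
    radosgw-admin lc list
    # force a lifecycle pass now instead of waiting for the work-time window
    radosgw-admin lc process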

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Wout van Heeswijk
Hi Francois, Have you already looked at the option "osd_delete_sleep"? It will not speed up the process, but it will give you some control over your cluster performance. Something like: ceph tell osd.\* injectargs '--osd_delete_sleep 1' Kind regards, Wout 42on On 25-06-2020 09:57, Francois
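A small follow-up sketch for verifying and later reverting the throttle (assuming a mgr that supports "ceph config show"; osd.0 is just an example):

    # confirm the running value on one OSD
    ceph config show osd.0 | grep osd_delete_sleep
    # revert once the pool deletion has finished
    ceph tell osd.\* injectargs '--osd_delete_sleep 0'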

[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2020-06-25 Thread Frank Schilder
Hi all, > I did a quick test with wcache off [1], and have the impression the > simple rados bench of 2 minutes performed a bit worse on my slow HDDs. This probably depends on whether or not the drive actually has a non-volatile write cache. I noticed that from many vendors you can buy the
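For anyone repeating such a test, the write cache is usually toggled along these lines (a sketch; /dev/sdX is a placeholder, and SAS drives need sdparm rather than hdparm):

    # SATA: disable / re-enable the volatile write cache
    hdparm -W 0 /dev/sdX
    hdparm -W 1 /dev/sdX
    # SAS: the same setting is the WCE bit in the caching mode page
    sdparm --set WCE=0 /dev/sdX
    sdparm --get WCE /dev/sdX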

[ceph-users] Re: Bench on specific OSD

2020-06-25 Thread Marc Roos
What is wrong with just doing multiple tests and grouping the OSDs by host in your charts? -Original Message- To: ceph-users Subject: [ceph-users] Bench on specific OSD Hi all. Is there any way to completely health-check one OSD host or instance? For example, rados bench just on that OSD or
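As a complement, a single OSD can also be exercised directly with the built-in bench command, which avoids building a pool around one host (a minimal sketch; osd.7 and the sizes are examples):

    # default: write 1 GiB in 4 MiB objects to this OSD's object store
    ceph tell osd.7 bench
    # explicit total bytes and block size
    ceph tell osd.7 bench 1073741824 4194304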

[ceph-users] Re: Removing pool in nautilus is incredibly slow

2020-06-25 Thread Francois Legrand
Does someone have an idea? F. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: RGW listing slower on nominally faster setup

2020-06-25 Thread Olivier AUDRY
hello is there a way to push this config directly into ceph without using the ceph.conf file ? thanks for your tips oau Le vendredi 12 juin 2020 à 15:24 +, Stefan Wild a écrit : > On 6/12/20, 5:40 AM, "James, GleSYS" wrote: > > > When I set the debug_rgw logs to "20/1", the issue