[ceph-users] Fw:Re: "ceph daemon osd.x ops" shows different number from "ceph osd status "

2020-07-20 Thread rainning
Aha, thanks very much for pointing that out, Anthony! Just a summary for the screenshot pasted in my previous email. Based on my understanding, "ceph daemon osd.x ops" or "ceph daemon osd.x dump_ops_in_flight" shows the ops currently being processed by osd.x. I also noticed that there is
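A minimal sketch of the two views being compared in this thread, assuming osd.0 is the daemon of interest and its admin socket is reachable on the local host; the "ops" dump reports operations currently in flight, while "ceph osd status" reports per-OSD usage and recent op-rate counters, so the two numbers are not expected to match:

    # On the host running osd.0: operations currently in flight
    ceph daemon osd.0 ops
    ceph daemon osd.0 dump_ops_in_flight

    # Cluster-wide per-OSD summary (usage, state, recent read/write op rates)
    ceph osd status

    # Recent slow/historic ops kept by the daemon, for comparison
    ceph daemon osd.0 dump_historic_ops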

[ceph-users] Re: Single Server Ceph OSD Recovery

2020-07-20 Thread Daniel Da Cunha
Hi, I managed to activate the OSDs after adding the keys with: for i in `seq 0 8`; do ceph auth get-or-create osd.$i mon 'profile osd' mgr 'profile osd' osd 'allow *'; done # ceph osd status | id | host | used |
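For readability, a sketch of the loop quoted above, assuming nine OSDs numbered 0 through 8; the ceph-volume activation step is an assumption for illustration, not part of the quoted message, and should be adjusted to how the OSDs were originally deployed:

    # Recreate the auth keys for osd.0 through osd.8
    for i in $(seq 0 8); do
        ceph auth get-or-create osd.$i mon 'profile osd' mgr 'profile osd' osd 'allow *'
    done

    # Assumed follow-up: re-activate the OSD volumes (LVM-based deployments)
    ceph-volume lvm activate --all

    # Verify
    ceph osd status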

[ceph-users] ceph osd log -> set_numa_affinity unable to identify public interface

2020-07-20 Thread EDH - Manuel Rios
Hi, today, checking the OSD logs at boot after upgrading to 14.2.10, we found this: set_numa_affinity unable to identify public interface 'p3p1.4094' numa node: (2) No such file or directory "2020-07-20 20:41:41.134 7f2cd15ca700 -1 osd.12 1120769 set_numa_affinity unable to identify public
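If the automatic NUMA detection cannot resolve a VLAN interface such as 'p3p1.4094', one possible workaround (a sketch, not a confirmed fix for this report) is to disable the automatic affinity or pin the NUMA node explicitly; the node number below is only an example:

    # Disable automatic NUMA affinity detection for all OSDs
    ceph config set osd osd_numa_auto_affinity false

    # Or pin a specific OSD to a NUMA node by hand (node 0 is an example)
    ceph config set osd.12 osd_numa_node 0

    # Restart the OSD for the change to take effect
    systemctl restart ceph-osd@12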

[ceph-users] Re: bluestore_default_buffered_write = true

2020-07-20 Thread Mark Nelson
Hi Adam, We test it fairly regularly on our development test nodes. Basically what this does is cache data in the bluestore buffer cache on write. By default we only cache things when they are first read. The advantage to enabling this is that you immediately have data in the cache once
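For reference, a sketch of how the option could be toggled and observed, assuming a test cluster where the extra cache pressure is acceptable:

    # Cache newly written data in the BlueStore buffer cache
    ceph config set osd bluestore_default_buffered_write true

    # Confirm the running value on one OSD via its admin socket
    ceph daemon osd.0 config get bluestore_default_buffered_write

    # Watch cache behaviour in the OSD perf counters
    ceph daemon osd.0 perf dump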

[ceph-users] Re: Degradation of write-performance after upgrading to Octopus

2020-07-20 Thread Thomas Gradisnik
Hi Mark and others, last week we were finally able to solve the problem. We are using Gentoo on our test cluster, and as it turned out the official ebuilds do not set CMAKE_BUILD_TYPE=RelWithDebInfo, which alone caused the performance degradation we had been seeing after upgrading to
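For anyone building from source, a sketch of forcing the optimized build type; everything beyond CMAKE_BUILD_TYPE is an illustrative assumption about the build layout:

    # An unoptimized (Debug) build can be dramatically slower than a release build.
    # Configure Ceph with optimizations plus debug symbols:
    cd ceph && mkdir -p build && cd build
    cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
    make -j$(nproc)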

[ceph-users] Re: Thank you!

2020-07-20 Thread Marc Roos
I agree. Thanks from me as well; I am also really impressed by this storage solution, as well as by something like Apache Mesos. Those are the most impressive technologies introduced and developed in the last 5(?) years. -Original Message- To: ceph-users Cc: dhils...@performair.com Subject:

[ceph-users] Re: OSD memory leak?

2020-07-20 Thread Frank Schilder
Dear Mark, thank you very much for the very helpful answers. I will raise osd_memory_cache_min, leave everything else alone and watch what happens. I will report back here. Thanks also for raising this as an issue. Best regards, = Frank Schilder AIT Risø Campus Bygning 109,
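A sketch of the change described above, applied through the central config database; the size chosen here (1 GiB) is only an example:

    # Raise the minimum memory the OSD keeps for its caches (example: 1 GiB)
    ceph config set osd osd_memory_cache_min 1073741824

    # Leave the overall target alone; just verify the running values on one daemon
    ceph daemon osd.0 config get osd_memory_cache_min
    ceph daemon osd.0 config get osd_memory_target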

[ceph-users] Re: ceph/rados performace sync vs async

2020-07-20 Thread Daniel Mezentsev
Hi All, I did more tests: just one client with big and small objects, then several clients with big and small objects - and it seems like I'm getting absolutely reasonable numbers. Big objects are saturating the network; small objects are limited by IOPS on the disks. Overall I have a better understanding and I'm happy
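A sketch of the kind of test being described, assuming a throwaway pool named "testbench"; the object sizes and thread counts are illustrative:

    # Large objects (4 MiB) tend to saturate the network
    rados bench -p testbench 60 write -b 4194304 -t 16 --no-cleanup

    # Small objects (4 KiB) tend to be limited by disk IOPS
    rados bench -p testbench 60 write -b 4096 -t 16 --no-cleanup

    # Read back what was written, then clean up the benchmark objects
    rados bench -p testbench 60 rand -t 16
    rados -p testbench cleanup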

[ceph-users] Re: Thank you!

2020-07-20 Thread Brian Topping
If there was a “like” button, I would have just clicked that to keep the list noise down. I have smaller operations and so my cluster goes down a lot more often. I keep dreading my abuse of the cluster and it just keeps coming back for more. Ceph really is amazing, and it’s hard to fully

[ceph-users] Thank you!

2020-07-20 Thread DHilsbos
I just want to thank the Ceph community, and the Ceph developers for such a wonderful product. We had a power outage on Saturday, and both Ceph clusters went offline, along with all of our other servers. Bringing Ceph back to full functionality was an absolute breeze, no problems, no hiccups,

[ceph-users] Re: [Ceph Octopus 15.2.3 ] MDS crashed suddenly

2020-07-20 Thread Patrick Donnelly
On Mon, Jul 20, 2020 at 5:38 AM wrote: > > Hi, > > I made a fresh install of Ceph Octopus 15.2.3 recently. > And after a few days, the 2 standby MDS suddenly crashed with segmentation > fault error. > I try to restart it but it does not start. > [...] Can you please increase MDS debugging:
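A sketch of raising MDS debug levels as requested, assuming the settings are applied through the central config and a crashed standby is then restarted to reproduce the crash with verbose logging; the systemd unit name is an assumption and should match the MDS id used in the deployment:

    # Increase MDS and messenger debug output
    ceph config set mds debug_mds 20
    ceph config set mds debug_ms 1

    # Restart the crashed standby MDS to capture a verbose log
    # (unit name assumed; adjust to this deployment's MDS id)
    systemctl restart ceph-mds@$(hostname -s)

    # Then collect the resulting log from /var/log/ceph/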

[ceph-users] Re: EC profile datastore usage - question

2020-07-20 Thread Steven Pine
Hi Igor, Given the patch history, and the rejection of the previous patch in favor of the one that defaults to a 4k block size, does this essentially mean Ceph does not support larger block sizes when using erasure coding? Will the Ceph project be updating its documentation and references to
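The "block size" in question is BlueStore's minimum allocation unit; a sketch for checking what is configured, shown as an illustration rather than a statement about what any given release supports. Note the value only takes effect when an OSD is created, so existing OSDs keep the size they were built with:

    # Configured values (applied only at OSD creation time)
    ceph daemon osd.0 config get bluestore_min_alloc_size_hdd
    ceph daemon osd.0 config get bluestore_min_alloc_size_ssd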

[ceph-users] Re: [Ceph Octopus 15.2.3 ] MDS crashed suddenly

2020-07-20 Thread Lindsay Mathieson
On 20/07/2020 10:48 pm, carlimeun...@gmail.com wrote: After trying to restart the mds master, it also failed. Now the cluster state is: Try deleting and recreating one of the MDS. -- Lindsay

[ceph-users] Re: OSD memory leak?

2020-07-20 Thread Mark Nelson
On 7/20/20 3:23 AM, Frank Schilder wrote: Dear Mark and Dan, I'm in the process of restarting all OSDs and could use some quick advice on bluestore cache settings. My plan is to set higher minimum values and deal with accumulated excess usage via regular restarts. Looking at the

[ceph-users] Re: EC profile datastore usage - question

2020-07-20 Thread Igor Fedotov
Hi Mateusz, I think you might be hit by https://tracker.ceph.com/issues/44213. This is fixed in the upcoming Pacific release; a Nautilus/Octopus backport is under discussion for now. Thanks, Igor On 7/18/2020 8:35 AM, Mateusz Skała wrote: Hello Community, I would like to ask for help in

[ceph-users] Re: [Ceph Octopus 15.2.3 ] MDS crashed suddenly

2020-07-20 Thread carlimeunier
After trying to restart the MDS master, it also failed. Now the cluster state is: # ceph status cluster: id: dd024fe1-4996-4fed-ba57-03090e53724d health: HEALTH_WARN 1 filesystem is degraded insufficient standby MDS daemons available 29 daemons have recently crashed services:
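With "29 daemons have recently crashed" in the health output, the crash module can be used to inspect and then acknowledge those reports; a sketch, where <crash-id> is a placeholder taken from the listing:

    # List recent crash reports and inspect one of them
    ceph crash ls
    ceph crash info <crash-id>

    # Once investigated, archive them so the warning clears
    ceph crash archive-all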

[ceph-users] [Ceph Octopus 15.2.3 ] MDS crashed suddenly

2020-07-20 Thread carlimeunier
Hi, I made a fresh install of Ceph Octopus 15.2.3 recently, and after a few days the 2 standby MDS daemons suddenly crashed with a segmentation fault error. I tried to restart them but they do not start. Here is the error: -20> 2020-07-17T13:50:27.888+ 7fc8c6c51700 10 monclient: _renew_subs -19>

[ceph-users] Re: OSD memory leak?

2020-07-20 Thread Frank Schilder
Dear Mark and Dan, I'm in the process of restarting all OSDs and could use some quick advice on bluestore cache settings. My plan is to set higher minimum values and deal with accumulated excess usage via regular restarts. Looking at the documentation
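For context, a sketch of the settings being discussed in this thread, with example values only; the cache autotuner sizes the BlueStore caches within osd_memory_target, while osd_memory_cache_min (see the earlier message) sets the floor:

    # Overall memory target per OSD (example: 4 GiB)
    ceph config set osd osd_memory_target 4294967296

    # The autotuner is on by default; check it rather than change it
    ceph daemon osd.0 config get bluestore_cache_autotune

    # Inspect current mempool usage on a daemon
    ceph daemon osd.0 dump_mempools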

[ceph-users] HP printer offline

2020-07-20 Thread masonlava77
When I make my hard efforts to print the documents through my HP printer, suddenly my HP printer goes offline mode. I am applying an appropriate command to my HP printer, I am facing HP printer offline problem. This offline error is an annoying issue for me, so I am not able to work on my HP

[ceph-users] Cache Tier OSDs full and near full - not flushing and evicting

2020-07-20 Thread Priya Sehgal
Hi, We have a large production cluster with a writeback cache tier. Recently, we observed that some of the cache-tier OSDs went near-full and then full, and the cluster is now in an error state. target_max_bytes was not set correctly, and hence I think flushing and eviction never happened. I
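A sketch of setting the missing thresholds and forcing a flush, assuming a cache pool named "cache-pool"; the byte value and ratios are examples only and a manual flush can be slow on a large tier:

    # Give the tiering agent an upper bound so it starts flushing/evicting
    ceph osd pool set cache-pool target_max_bytes 1000000000000
    ceph osd pool set cache-pool cache_target_dirty_ratio 0.4
    ceph osd pool set cache-pool cache_target_full_ratio 0.8

    # Manually flush and evict everything that can be evicted
    rados -p cache-pool cache-flush-evict-all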

[ceph-users] ceph OSD node optimised sysctl configuration

2020-07-20 Thread Jeremi Avenant
Hi All Is there perhaps any updated documentation about ceph OSD node optimised sysctl configuration? I'm seeing a lot of these: $ netstat -s ... 4955341 packets pruned from receive queue because of socket buffer overrun ... 5866 times the listen queue of a socket overflowed ...
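The counters quoted point at receive-buffer and listen-backlog pressure; a sketch of the sysctls usually adjusted for that, with example values that should be validated for the specific NICs and workload rather than copied verbatim:

    # /etc/sysctl.d/90-ceph-osd.conf  (example values)
    net.core.rmem_max = 67108864
    net.core.wmem_max = 67108864
    net.ipv4.tcp_rmem = 4096 87380 67108864
    net.ipv4.tcp_wmem = 4096 65536 67108864
    net.core.somaxconn = 4096
    net.core.netdev_max_backlog = 250000

    # Apply without rebooting
    sysctl --system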

[ceph-users] "ceph daemon osd.x ops" shows different number from "ceph osd status "

2020-07-20 Thread rainning
"ceph daemon osd.x ops" shows ops currently in flight, the number is different from "ceph osd status