[ceph-users] Re: ceph prometheus module no export content

2020-02-26 Thread Jan Fajerski
On Thu, Feb 27, 2020 at 10:20:07AM +0800, 黄明友 wrote: hi all, I have enabled the prometheus module on my ceph cluster; the ceph version is 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable). When I enable this module, I can get exported content
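
A quick check after a mgr restart is whether the module is still enabled on the now-active mgr and still advertises its endpoint (the host name below is a placeholder; 9283 is the module's default port):

   ceph mgr module ls      # "prometheus" should appear under enabled_modules
   ceph mgr services       # URL of the exporter on the currently active mgr
   curl -s http://<active-mgr-host>:9283/metrics | head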

[ceph-users] Cache tier OSDs crashing due to unfound hitset object 14.2.7

2020-02-26 Thread Lincoln Bryant
Hello Ceph experts, In the last day or so, we had a few nodes randomly reboot and now unfound objects are reported in Ceph health during cluster recovery. It appears that the object in question is a hit set object, which I now cannot mark lost because Ceph cannot probe the OSDs that
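
For context, these are the commands commonly used to inspect such a situation (the pg id is a placeholder; whether marking the object lost is appropriate depends on the rest of the thread):

   ceph health detail                        # lists the PGs with unfound objects
   ceph pg <pgid> list_unfound               # names of the unfound objects (the hit set object here)
   ceph pg <pgid> query                      # "recovery_state" shows which OSDs are being probed
   ceph pg <pgid> mark_unfound_lost revert   # or "delete"; last resort once probing is exhausted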

[ceph-users] ceph prometheus module no export content

2020-02-26 Thread 黄明友
hi all, I have enabled the prometheus module on my ceph cluster; the ceph version is 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable). When I enable this module, I can get exported content from the prometheus module. But when I restart all the ceph
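
For reference, a minimal sketch of enabling and scraping the exporter (the host name is a placeholder; 9283 is the default port):

   ceph mgr module enable prometheus
   curl -s http://<mgr-host>:9283/metrics | head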

[ceph-users] Re: Nautilus OSD memory consumption?

2020-02-26 Thread Nigel Williams
On Thu, 27 Feb 2020 at 13:08, Nigel Williams wrote: > On Thu, 27 Feb 2020 at 06:27, Anthony D'Atri wrote: > > If the heap stats reported by telling the OSD `heap stats` is large, > > telling each `heap release` may be useful. I suspect a TCMALLOC > > shortcoming. heap release seemingly had
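
Asking every OSD at once and then comparing resident memory is one way to see whether the release has any effect (a sketch, not from the original message):

   ceph tell 'osd.*' heap release    # ask every OSD to return freed pages to the OS
   ps -o pid,rss,cmd -C ceph-osd     # run on each OSD host before/after to compare RSS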

[ceph-users] Re: Nautilus OSD memory consumption?

2020-02-26 Thread Nigel Williams
On Thu, 27 Feb 2020 at 06:27, Anthony D'Atri wrote: > If the heap stats reported by telling the OSD `heap stats` is large, telling > each `heap release` may be useful. I suspect a TCMALLOC shortcoming. osd.158 tcmalloc heap stats: MALLOC:
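
The heap commands referenced here are per-OSD, e.g. for the osd.158 quoted above:

   ceph tell osd.158 heap stats      # prints the tcmalloc summary shown in this thread
   ceph tell osd.158 heap release    # returns memory tcmalloc has freed but not yet handed back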

[ceph-users] Re: Nautilus OSD memory consumption?

2020-02-26 Thread Nigel Williams
On Wed, 26 Feb 2020 at 23:56, Mark Nelson wrote: > Have you tried dumping the mempools? ... > One reason this can happen for example is if you > have a huge number of PGs (like many thousands per OSD). We are relying on the pg autoscaler to set the PGs, and so far it seems to do the right
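
Checking what the autoscaler settled on, and how many PGs each OSD actually carries, rules out the "many thousands of PGs per OSD" case:

   ceph osd pool autoscale-status    # target vs. actual pg_num per pool
   ceph osd df                       # the PGS column shows placement groups per OSD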

[ceph-users] Re: default data pools for cephfs: replicated vs. ec

2020-02-26 Thread Robert Sander
Hi Thoralf, On 26.02.20 15:35, thoralf schulze wrote: > recently, we've come across a lot of advice to only use replicated rados > pools as default- (ie: root-) data pools for cephfs¹. It should be possible to use an EC pool for CephFS data:
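
A sketch of the commonly recommended layout: a small replicated pool stays the default (root) data pool and the EC pool is attached as an additional data pool, with directories pinned to it via a layout attribute (pool, profile, filesystem and path names are placeholders):

   ceph osd pool create cephfs_data_rep 64 64 replicated
   ceph osd pool create cephfs_data_ec 64 64 erasure myprofile
   ceph osd pool set cephfs_data_ec allow_ec_overwrites true
   ceph fs add_data_pool myfs cephfs_data_ec
   setfattr -n ceph.dir.layout.pool -v cephfs_data_ec /mnt/cephfs/bulk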

[ceph-users] Re: Question about ceph-balancer and OSD reweights

2020-02-26 Thread shubjero
Right, but should I be proactively returning any reweighted OSDs that are not 1.0 back to 1.0? On Wed, Feb 26, 2020 at 3:36 AM Konstantin Shalygin wrote: > > On 2/26/20 3:40 AM, shubjero wrote: > > I'm running a Ceph Mimic cluster 13.2.6 and we use the ceph-balancer > > in upmap mode. This
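
Spotting leftover legacy reweights and returning them to 1.0 one at a time (so upmap can absorb the data movement) looks roughly like this (the OSD id is a placeholder):

   ceph osd df tree                # the REWEIGHT column shows any OSD still below 1.00000
   ceph osd reweight <osd-id> 1.0  # reset one OSD at a time, then let the balancer re-evaluate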

[ceph-users] default data pools for cephfs: replicated vs. ec

2020-02-26 Thread thoralf schulze
hi there, recently we've come across a lot of advice to only use replicated rados pools as default (i.e. root) data pools for cephfs¹. unfortunately, we either skipped or blatantly ignored this advice while creating our cephfs, so our default data pool is an erasure coded one with k=2 and m=4,
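
For readers following along, the existing layout can be inspected as below (the profile name is a placeholder). Note that k=2/m=4 writes six chunks for every two data chunks, i.e. 3x raw overhead, comparable to 3-way replication:

   ceph fs ls                                    # default and additional data pools of each filesystem
   ceph osd pool ls detail | grep erasure        # which pools are EC, and with which profile
   ceph osd erasure-code-profile get <profile>   # k, m, plugin and crush-failure-domain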

[ceph-users] cephfs: ceph-fuse clients getting stuck + causing degraded PG

2020-02-26 Thread Andras Pataki
We've been running into a strange problem repeating every day or so with a specific HPC job on a Mimic cluster (13.2.8) using ceph-fuse (14.2.7).  It seems like some cephfs clients are stuck (perhaps deadlocked) trying to access a file and are not making progress. Ceph reports the following
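
When chasing this kind of hang, the pending requests on both sides are usually the first thing to look at (daemon names and socket paths are placeholders, and the client admin socket is only available if ceph-fuse was started with one):

   ceph daemon mds.<name> dump_ops_in_flight     # on the active MDS host
   ceph daemon mds.<name> session ls             # which clients hold sessions/caps
   ceph daemon /var/run/ceph/ceph-client.<id>.<pid>.asok mds_requests       # pending MDS requests on the ceph-fuse client
   ceph daemon /var/run/ceph/ceph-client.<id>.<pid>.asok objecter_requests  # pending OSD ops on the client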

[ceph-users] Re: Running MDS server on a newer version than monitoring nodes

2020-02-26 Thread Martin Palma
Yes, in the end we are in the process of doing it, but we first upgraded the MDSs, which worked fine and solved the problem we had with CephFS. Best, Martin On Wed, Feb 26, 2020 at 9:34 AM Konstantin Shalygin wrote: > > On 2/26/20 12:49 AM, Martin Palma wrote: > > is it possible to run MDS on
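
The mixed-version state itself is easy to confirm (output is grouped per daemon type):

   ceph versions                  # running release per daemon type (mon, mgr, osd, mds)
   ceph tell mds.<name> version   # version of a single MDS (name is a placeholder)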

[ceph-users] Re: Ceph standby-replay metadata server: MDS internal heartbeat is not healthy

2020-02-26 Thread Martin Palma
Hi Patrick, we have performed a minor upgrade to 12.2.13 which resolved the issue. We think it was the following bug: https://tracker.ceph.com/issues/37723 Best, Martin On Thu, Feb 20, 2020 at 5:16 AM Patrick Donnelly wrote: > > Hi Martin, > > On Thu, Feb 13, 2020 at 4:10 AM Martin Palma

[ceph-users] Re: Nautilus OSD memory consumption?

2020-02-26 Thread Mark Nelson
Have you tried dumping the mempools?  The memory autotuner will grow or shrink the bluestore caches to try to keep the total OSD process mapped memory just under the target.  If there's a memory leak or some other part of the OSD is using more memory than it should, it will shrink the caches
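
Dumping the mempools and the configured target is done through the OSD's admin socket (the OSD id is a placeholder):

   ceph daemon osd.<id> dump_mempools                 # per-mempool item and byte counts
   ceph daemon osd.<id> config get osd_memory_target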

[ceph-users] radosgw lifecycle seems work strangely

2020-02-26 Thread quexian da
ceph version 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable) I made a bucket named "test_lc" and ran `s3cmd expire --expiry-date=2019-01-01 s3://test_lc` to set the lifecycle (2019-01-01 is earlier than the current date, so every object will be removed). Then I ran `radosgw-admin
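
The server-side lifecycle state can be checked with radosgw-admin; note that by default lifecycle processing only runs inside the rgw_lifecycle_work_time window, which can make expiration look delayed (a sketch):

   radosgw-admin lc list       # per-bucket lifecycle processing status
   radosgw-admin lc process    # trigger a lifecycle pass instead of waiting for the work window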

[ceph-users] next Ceph Meetup Berlin, Germany

2020-02-26 Thread Robert Sander
Hi, The Ceph Berlin MeetUp is a community-organized group that has met bi-monthly in past years: https://www.meetup.com/Ceph-Berlin/ The meetups start at 6 pm and consist of one presentation or talk followed by a discussion. The discussion often takes place over dinner in a nearby restaurant

[ceph-users] Re: Question about ceph-balancer and OSD reweights

2020-02-26 Thread Konstantin Shalygin
On 2/26/20 3:40 AM, shubjero wrote: I'm running a Ceph Mimic cluster 13.2.6 and we use the ceph-balancer in upmap mode. This cluster is fairly old and pre-Mimic we used to set osd reweights to balance the standard deviation of the cluster. Since moving to Mimic about 9 months ago I enabled the
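
For reference, the usual upmap balancer setup is (requires all clients to be at least luminous-capable):

   ceph osd set-require-min-compat-client luminous
   ceph balancer mode upmap
   ceph balancer on
   ceph balancer status        # "ceph balancer eval" scores the current distribution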

[ceph-users] Re: Running MDS server on a newer version than monitoring nodes

2020-02-26 Thread Konstantin Shalygin
On 2/26/20 12:49 AM, Martin Palma wrote: is it possible to run MDS on a newer version than the monitoring nodes? I mean, we run monitoring nodes on 12.2.10 and would like to upgrade the MDS to 12.2.13. Is this possible? Just upgrade your cluster to 12.2.13. Luminous is safe and very stable.