[ceph-users] PG upmap corner cases that silently fail

2024-02-02 Thread Andras Pataki
Hi cephers, I've been looking into better balancing our clusters with upmaps lately, and ran into upmap cases that behave in a less than ideal way.  If there is any cycle in the upmaps, like 'ceph osd pg-upmap-items a b b a' or 'ceph osd pg-upmap-items a b b c c a', the upmap validation passes,
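
For reference, a minimal sketch of reproducing and cleaning up such a cycle with the standard CLI (the pgid and OSD ids are placeholders, and jq is assumed to be available):

    # create a two-OSD swap cycle in the upmap items of a PG ("a b b a")
    ceph osd pg-upmap-items 1.1cb7 12 17 17 12

    # validation accepts it; check what actually ended up in the OSD map
    ceph osd dump --format json | jq '.pg_upmap_items[] | select(.pgid == "1.1cb7")'

    # clear the upmap entries for that PG again
    ceph osd rm-pg-upmap-items 1.1cb7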

[ceph-users] Lots of space allocated in completely empty OSDs

2023-08-11 Thread Andras Pataki
Here is a strange problem that I don't seem to be able to figure out.  Some of our OSDs that have zero weight and no PGs have lots of allocated space: [root@cephosd0032 ~]# ceph osd df ID    CLASS  WEIGHT    REWEIGHT  SIZE RAW USE  DATA OMAP  META  AVAIL    %USE   VAR   PGS 
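
A hedged way to cross-check the accounting on such an OSD is bluestore's own allocation counters from the admin socket (osd.32 is a placeholder id, jq assumed available):

    # bytes bluestore has allocated vs. bytes of user data it stores
    ceph daemon osd.32 perf dump | jq '.bluestore | {bluestore_allocated, bluestore_stored}'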

[ceph-users] ceph-fuse in infinite loop reading objects without client requests

2023-02-03 Thread Andras Pataki
We've been running into a strange issue with ceph-fuse on some nodes lately.  After some job runs on the node (and finishes or gets killed), ceph-fuse gets stuck busy requesting objects from the OSDs without any processes on the node using cephfs.  When this happens, ceph-fuse uses 2-3 cores,
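
When chasing something like this, the ceph-fuse admin socket is usually the first place to look; a sketch, assuming the default asok location (the exact file name varies per client and pid):

    # outstanding OSD operations issued by this ceph-fuse instance
    ceph daemon /var/run/ceph/ceph-client.*.asok objecter_requests

    # overall client status (session, whoami, cache state)
    ceph daemon /var/run/ceph/ceph-client.*.asok status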

[ceph-users] Re: Successful Upgrade from 14.2.22 to 15.2.14

2021-09-22 Thread Andras Pataki
Hi Dan, This is excellent to hear - we've also been a bit hesitant to upgrade from Nautilus (which has been working so well for us).  One question: did you/would you consider upgrading straight to Pacific from Nautilus?  Can you share your thoughts that led you to Octopus first? Thanks,

[ceph-users] Strange (incorrect?) upmap entries in OSD map

2021-06-16 Thread Andras Pataki
I've been working on some improvements to our large cluster's space balancing, when I noticed that sometimes the OSD maps have strange upmap entries.  Here is an example on a clean cluster (PGs are active+clean):     {     "pgid": "1.1cb7", ...     "up": [    
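
To see every upmap exception a clean cluster is carrying, and compare one against its current up/acting sets (1.1cb7 is the pgid from the example above, jq assumed available):

    ceph osd dump --format json | jq '.pg_upmap_items'
    ceph pg map 1.1cb7     # prints the up and acting OSD sets for the PG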

[ceph-users] Re: Upmap balancer after node failure

2021-04-02 Thread Andras Pataki
Lowering the weight is what I ended up doing.  But this isn't ideal, since afterwards the balancer will remove too many PGs from the OSD now that it has a lower weight.  So I'll have to put the weight back once the cluster recovers and the balancer goes back to its business. But in any case -

[ceph-users] Upmap balancer after node failure

2021-04-02 Thread Andras Pataki
Dear ceph users, On one of our clusters I have some difficulties with the upmap balancer.  We started with a reasonably well balanced cluster (using the balancer in upmap mode).  After a node failure, we crush reweighted all the OSDs of the node to take it out of the cluster - and waited for
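
A minimal sketch of the sequence being described, with placeholder OSD ids standing in for the failed node; the balancer's view can then be checked directly:

    # take the failed node's OSDs out of data placement
    for osd in 120 121 122 123; do ceph osd crush reweight osd.$osd 0; done

    # what the balancer is doing / how it scores the current distribution
    ceph balancer status
    ceph balancer eval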

[ceph-users] Re: The ceph balancer sets upmap items which violates my crushrule

2020-12-16 Thread Andras Pataki
Hi Manuel, We also had a similar problem, that for a two step crush selection rule, the balancer kept proposing upmaps that were invalid: step take root-disk step choose indep 3 type pod step choose indep 3 type rack step chooseleaf indep 1 type osd
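
Laid out one step per line, as it would appear in a decompiled crush map, the rule quoted above reads (only the steps shown in the snippet; the rule header and final emit are omitted):

    step take root-disk
    step choose indep 3 type pod
    step choose indep 3 type rack
    step chooseleaf indep 1 type osd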

[ceph-users] Re: cephfs - blacklisted client coming back?

2020-11-09 Thread Andras Pataki
for the implementation) Cheers, Dan On Mon, Nov 9, 2020 at 11:59 PM Andras Pataki wrote: We had some network problems (high packet drops) to some cephfs client nodes that run ceph-fuse (14.2.13) against a Nautilus cluster (on version 14.2.8). As a result a couple of clients got evicted (as o

[ceph-users] cephfs - blacklisted client coming back?

2020-11-09 Thread Andras Pataki
We had some network problems (high packet drops) to some cephfs client nodes that run ceph-fuse (14.2.13) against a Nautilus cluster (on version 14.2.8).  As a result a couple of clients got evicted (as one would expect).  What was really odd was that the clients were trying to flush data they
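
On a Nautilus-era CLI the blacklist state in question can be inspected and, if needed, cleared by hand (the client address is a placeholder; newer releases rename blacklist to blocklist):

    # list blacklisted client addresses and their expiry times
    ceph osd blacklist ls

    # remove an entry early if the client should be let back in
    ceph osd blacklist rm 10.0.0.15:0/3271239648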

[ceph-users] Re: Reweighting OSD while down results in undersized+degraded PGs

2020-05-20 Thread Andras Pataki
On Mon, May 18, 2020 at 10:26 PM Andras Pataki wrote: In a recent cluster reorganization, we

[ceph-users] Re: Reweighting OSD while down results in undersized+degraded PGs

2020-05-20 Thread Andras Pataki
crush map changes is limited to up+in OSDs). Only OSDs that can peer are able to respond to changes of the crush map. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Andras Pataki Sent: 19 May 2020 15:57:49

[ceph-users] Re: Reweighting OSD while down results in undersized+degraded PGs

2020-05-19 Thread Andras Pataki
will be vacated as expected. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Andras Pataki Sent: 18 May 2020 22:25:37 To: ceph-users Subject: [ceph-users] Reweighting OSD while down results in undersized+degraded PGs

[ceph-users] Reweighting OSD while down results in undersized+degraded PGs

2020-05-18 Thread Andras Pataki
In a recent cluster reorganization, we ended up with a lot of undersized/degraded PGs and a day of recovery from them, when all we expected was moving some data around.  After retracing my steps, I found something odd.  If I crush reweight an OSD to 0 while it is down, it results in the PGs
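
A sketch of the two orderings being compared (osd.42 is a placeholder); the point of the post is that only the first one leaves PGs undersized/degraded:

    # case 1: zero the crush weight while the OSD is down
    ceph osd crush reweight osd.42 0   # PGs that had a copy on osd.42 go undersized/degraded

    # case 2: zero the crush weight while the OSD is still up and in
    ceph osd crush reweight osd.42 0   # expected behaviour: data backfills off osd.42, PGs stay active+clean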

[ceph-users] Re: ceph-mgr high CPU utilization

2020-05-08 Thread Andras Pataki
Andras Pataki wrote: Also just a follow-up on the misbehavior of ceph-mgr. It looks like the upmap balancer is not acting reasonably either. It is trying to create upmap entries every minute or so - and claims to be successful, but they never show up in the OSD map. Setting the logging to 'debug

[ceph-users] Re: ceph-mgr high CPU utilization

2020-05-07 Thread Andras Pataki
, where we didn't see this issue. Andras On 5/1/20 8:48 AM, Andras Pataki wrote: Also just a follow-up on the misbehavior of ceph-mgr. It looks like the upmap balancer is not acting reasonably either. It is trying to create upmap entries every minute or so - and claims to be successful

[ceph-users] Re: ceph-mgr high CPU utilization

2020-05-01 Thread Andras Pataki
- and keeps going like that every minute without any progress (the set of upmap entries stays the same, does not increase). Andras On 5/1/20 8:12 AM, Andras Pataki wrote: I'm wondering if anyone still sees issues with ceph-mgr using CPU and being unresponsive even in recent Nautilus releases

[ceph-users] ceph-mgr high CPU utilization

2020-05-01 Thread Andras Pataki
I'm wondering if anyone still sees issues with ceph-mgr using CPU and being unresponsive even in recent Nautilus releases.  We upgraded our largest cluster from Mimic to Nautilus (14.2.8) recently - it has about 3500 OSDs.  Now ceph-mgr is constantly at 100-200% CPU (1-2 cores), and becomes
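
Some first checks when the mgr pegs a core or two; a sketch, assuming the commands run on the active mgr host and 'mymgr' stands in for the local mgr id:

    ceph mgr module ls                # which modules (balancer, prometheus, ...) are enabled
    ceph balancer status              # whether the balancer is busy optimizing
    ceph daemon mgr.mymgr perf dump   # per-module counters on the active mgr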

[ceph-users] Re: PG deep-scrub does not finish

2020-04-21 Thread Andras Pataki
9 03:39:17 popeye-oss-3-03 ceph-osd: 15: (clone()+0x6d) [0x73ee988d] I ended up recreating the OSD (and thus overwriting all data) to fix the issue. Andras On 4/20/20 9:28 PM, Brad Hubbard wrote: On Mon, Apr 20, 2020 at 11:01 PM Andras Pataki wrote: On a cluster running Nautilus (14.2.8

[ceph-users] PG deep-scrub does not finish

2020-04-20 Thread Andras Pataki
On a cluster running Nautilus (14.2.8), we are getting a complaint about a PG not being deep-scrubbed on time.  Looking at the primary OSD's logs, it looks like it tries to deep-scrub the PG every hour or so, emits some complaints that I don't understand, but the deep scrub does not finish
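
To poke at such a PG by hand (1.2f3 is a placeholder pgid, jq assumed available):

    # kick off a deep scrub and watch the primary OSD's log
    ceph pg deep-scrub 1.2f3

    # scrub state and timestamps recorded for the PG
    ceph pg 1.2f3 query | jq '.info.stats | {state, last_scrub_stamp, last_deep_scrub_stamp}'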

[ceph-users] Re: Load on drives of different sizes in ceph

2020-04-06 Thread Andras Pataki
ss your (unspecified) failure domains so that the extra capacity isn’t wasted, again depending on your topology. On Mar 31, 2020, at 8:49 AM, Eneko Lacunza wrote: Hi Andras, El 31/3/20 a las 16:42, Andras Pataki escribió: I'm looking for some advice on what to do about drives of differen

[ceph-users] Load on drives of different sizes in ceph

2020-03-31 Thread Andras Pataki
Hi cephers, I'm looking for some advice on what to do about drives of different sizes in the same cluster. We have so far kept the drive sizes consistent on our main ceph cluster (using 8TB drives).  We're getting some new hardware with larger, 12TB drives next, and I'm pondering on how
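
For context, the default crush weight is just the capacity in TiB, so the larger drives attract proportionally more data and IO; one option sometimes used (a trade-off, not a recommendation) is to cap their weight and give up some of the extra capacity:

    # default crush weights:  8 TB -> ~7.28,  12 TB -> ~10.91 (≈1.5x the PGs and IO per drive)
    # capping a 12 TB OSD's weight to match the 8 TB drives (osd.200 is hypothetical):
    ceph osd crush reweight osd.200 7.28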

[ceph-users] Upmap balancing - pools grouped together?

2020-03-16 Thread Andras Pataki
I've been trying the upmap balancer on a new Nautilus cluster.  We have three main pools, a triple replicated pool (id:1) and two 6+3 erasure coded pools (id: 4 and 5).  The balancer does a very nice job on the triple replicated pool, but does something strange on the EC pools.  Here is a sample of
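
When the balancer behaves oddly on particular pools, osdmaptool can be run offline against a copy of the osdmap to see which upmaps it would compute for a single pool (the pool name is a placeholder):

    ceph osd getmap -o /tmp/osdmap
    osdmaptool /tmp/osdmap --upmap /tmp/upmaps.sh --upmap-pool ec-pool-1
    cat /tmp/upmaps.sh     # the pg-upmap-items commands the optimizer would issue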

[ceph-users] cephfs: ceph-fuse clients getting stuck + causing degraded PG

2020-02-26 Thread Andras Pataki
We've been running into a strange problem repeating every day or so with a specific HPC job on a Mimic cluster (13.2.8) using ceph-fuse (14.2.7).  It seems like some cephfs clients are stuck (perhaps deadlocked) trying to access a file and are not making progress. Ceph reports the following

[ceph-users] Re: bluestore compression questions

2020-02-19 Thread Andras Pataki
.  Any ideas? Andras On 2/17/20 3:59 AM, Igor Fedotov wrote: Hi Andras, please find my answers inline. On 2/15/2020 12:27 AM, Andras Pataki wrote: We're considering using bluestore compression for some of our data, and I'm not entirely sure how to interpret compression results

[ceph-users] bluestore compression questions

2020-02-14 Thread Andras Pataki
We're considering using bluestore compression for some of our data, and I'm not entirely sure how to interpret compression results.  As an example, one of the osd perf dump results shows:     "bluestore_compressed": 28089935,     "bluestore_compressed_allocated": 115539968,    
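
The relevant counters can be pulled per OSD as below (osd.12 is a placeholder, jq assumed available); note that bluestore_compressed_allocated counts allocation units actually reserved on disk, so with the figures quoted above it is roughly 4x the compressed payload (115539968 / 28089935 ≈ 4.1), presumably mostly min_alloc_size rounding:

    ceph daemon osd.12 perf dump | jq '.bluestore | {bluestore_compressed, bluestore_compressed_allocated, bluestore_compressed_original}'

    # effective on-disk saving ≈ bluestore_compressed_original - bluestore_compressed_allocated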

[ceph-users] Re: RBD cephx read-only key

2020-02-06 Thread Andras Pataki
Ah, that makes sense.  Thanks for the quick reply! Andras On 2/6/20 11:24 AM, Jason Dillaman wrote: On Thu, Feb 6, 2020 at 11:20 AM Andras Pataki wrote: I'm trying to set up a cephx key to mount RBD images read-only. I have the following two keys: [client.rbd] key = xxx caps

[ceph-users] RBD cephx read-only key

2020-02-06 Thread Andras Pataki
I'm trying to set up a cephx key to mount RBD images read-only.  I have the following two keys: [client.rbd]     key = xxx     caps mgr = "profile rbd"     caps mon = "profile rbd"     caps osd = "profile rbd pool=rbd_vm" [client.rbd-ro]     key = xxx     caps mgr = "profile rbd-read-only"
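
The preview cuts off mid-keyring; as a hedged sketch (pool name rbd_vm taken from the snippet, the cap profiles assumed to mirror the read-write key), a read-only client key is typically created along these lines:

    ceph auth get-or-create client.rbd-ro \
        mgr 'profile rbd-read-only' \
        mon 'profile rbd' \
        osd 'profile rbd-read-only pool=rbd_vm'

Such a key is normally paired with a read-only map or open on the client side.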