Hi cephers,
I've been looking into better balancing our clusters with upmaps lately,
and ran into upmap cases that behave in a less than ideal way. If there
is any cycle in the upmaps like
ceph osd pg-upmap-items <pgid> a b b a
or
ceph osd pg-upmap-items <pgid> a b b c c a
the upmap validation passes.
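As a rough illustration (the pgid and OSD ids below are placeholders), the upmap exceptions currently in the osdmap can be listed and a cyclic entry removed by pgid:

    ceph osd dump | grep pg_upmap_items     # shows entries like: pg_upmap_items 1.1cb7 [3,5,5,3]
    ceph osd rm-pg-upmap-items 1.1cb7       # drop the upmap exception for that PG entirely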
Here is a strange problem that I don't seem to be able to figure out.
Some of our OSDs that have zero weight and no PGs have lots of
allocated space:
[root@cephosd0032 ~]# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS
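To narrow down which OSDs are affected, something like this lists OSDs with zero crush weight and no PGs that still report used space (a sketch; the json field names are from Nautilus-era ceph osd df output and may need adjusting):

    ceph osd df --format json | \
      jq '.nodes[] | select(.crush_weight == 0 and .pgs == 0 and .kb_used > 0)
          | {id, name, kb_used, pgs}'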
We've been running into a strange issue with ceph-fuse on some nodes
lately. After some job runs on the node (and finishes or gets killed),
ceph-fuse gets stuck, busily requesting objects from the OSDs even though
no process on the node is using cephfs. When this happens, ceph-fuse uses
2-3 cores.
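When it gets into this state, the client's admin socket may show what it is actually doing (a sketch; the asok path depends on the client name/pid):

    # RADOS operations the ceph-fuse client currently has in flight
    ceph daemon /var/run/ceph/ceph-client.admin.asok objecter_requests
    # per-component counters (objecter, client, etc.)
    ceph daemon /var/run/ceph/ceph-client.admin.asok perf dump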
Hi Dan,
This is excellent to hear - we've also been a bit hesitant to upgrade
from Nautilus (which has been working so well for us). One question:
did you/would you consider upgrading straight to Pacific from Nautilus?
Can you share the thoughts that led you to Octopus first?
Thanks,
I've been working on some improvements to our large cluster's space
balancing, and noticed that sometimes the OSD maps have strange upmap
entries. Here is an example on a clean cluster (PGs are active+clean):
{
"pgid": "1.1cb7",
...
"up": [
Lowering the weight is what I ended up doing. But this isn't ideal,
since afterwards the balancer will remove too many PGs from the OSD
now that it has a lower weight. So I'll have to put the weight back
once the cluster recovers and the balancer goes back to its business.
But in any case -
Dear ceph users,
On one of our clusters I have some difficulties with the upmap
balancer. We started with a reasonably well balanced cluster (using the
balancer in upmap mode). After a node failure, we crush reweighted all
the OSDs of the node to take it out of the cluster - and waited for
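Taking the node out was done with crush reweights roughly along these lines (a sketch; the OSD ids are placeholders):

    # set the crush weight of every OSD on the failed node to 0
    for id in 120 121 122 123; do
        ceph osd crush reweight osd.$id 0
    done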
Hi Manuel,
We also had a similar problem: for a two-step crush selection rule,
the balancer kept proposing upmaps that were invalid:
step take root-disk
step choose indep 3 type pod
step choose indep 3 type rack
step chooseleaf indep 1 type osd
for the implementation)
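One way to sanity-check a proposed upmap against a rule like this is to compare the crush locations of the source and destination OSDs; for this rule the destination generally has to stay in the same pod (and in a rack not already used by the PG). A sketch with placeholder OSD ids:

    # show the crush location (pod / rack / host) of the OSDs in a proposed upmap
    ceph osd find 123
    ceph osd find 456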
Cheers, Dan
On Mon, Nov 9, 2020 at 11:59 PM Andras Pataki
wrote:
We had some network problems (high packet drops) to some cephfs client
nodes that run ceph-fuse (14.2.13) against a Nautilus cluster (on
version 14.2.8). As a result a couple of clients got evicted (as one
would expect). What was really odd is that the clients were trying to
flush data they
On Mon, May 18, 2020 at 10:26 PM Andras Pataki
wrote:
In a recent cluster reorganization, we
(responding to crush map changes is limited to up+in OSDs). Only OSDs that can
peer are able to respond to changes of the crush map.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Andras Pataki
Sent: 19 May 2020 15:57:49
will be vacated as expected.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Andras Pataki
Sent: 18 May 2020 22:25:37
To: ceph-users
Subject: [ceph-users] Reweighting OSD while down results in undersized+degraded PGs
In a recent cluster reorganization, we ended up with a lot of
undersized/degraded PGs and a day of recovery from them, when all we
expected was moving some data around. After retracing my steps, I found
something odd. If I crush reweight an OSD to 0 while it is down, it
results in the PGs becoming undersized+degraded.
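The sequence that triggers it is roughly (a sketch; osd.123 is a placeholder):

    systemctl stop ceph-osd@123            # the OSD is down
    ceph osd crush reweight osd.123 0      # reweight it to 0 while it is down
    ceph -s                                # PGs go undersized+degraded instead of just remapped
    ceph pg dump pgs_brief | grep degraded | head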
Andras Pataki
wrote:
Also just a follow-up on the misbehavior of ceph-mgr. It looks like the
upmap balancer is not acting reasonably either. It is trying to create
upmap entries every minute or so - and claims to be successful, but they
never show up in the OSD map. Setting the logging to 'debug
, where we didn't see
this issue.
Andras
On 5/1/20 8:48 AM, Andras Pataki wrote:
Also just a follow-up on the misbehavior of ceph-mgr. It looks like
the upmap balancer is not acting reasonably either. It is trying to
create upmap entries every minute or so - and claims to be successful
- and
keeps going like that every minute without any progress (the set of
upmap entries stays the same, does not increase).
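A quick way to see the discrepancy is to compare what the balancer claims with what is actually in the osdmap (a sketch):

    ceph balancer status                       # what the module thinks it is doing
    ceph osd dump | grep -c pg_upmap_items     # upmap exceptions actually present in the osdmap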
Andras
On 5/1/20 8:12 AM, Andras Pataki wrote:
I'm wondering if anyone still sees issues with ceph-mgr using CPU and
being unresponsive even in recent Nautilus releases. We upgraded our
largest cluster from Mimic to Nautilus (14.2.8) recently - it has about
3500 OSDs. Now ceph-mgr is constantly at 100-200% CPU (1-2 cores), and
becomes
9 03:39:17 popeye-oss-3-03 ceph-osd: 15: (clone()+0x6d) [0x73ee988d]
I ended up recreating the OSD (and thus overwriting all data) to fix the
issue.
Andras
On 4/20/20 9:28 PM, Brad Hubbard wrote:
On Mon, Apr 20, 2020 at 11:01 PM Andras Pataki
wrote:
On a cluster running Nautilus (14.2.8), we are getting a complaint about
a PG not being deep-scrubbed on time. Looking at the primary OSD's
logs, it looks like it tries to deep-scrub the PG every hour or so,
emits some complaints that I don't understand, but the deep scrub does
not finish
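For reference, the PG can be deep-scrubbed by hand and its scrub state inspected like this (a sketch; the pgid is a placeholder):

    ceph pg deep-scrub 1.2ab               # ask the primary to start a deep scrub now
    ceph pg 1.2ab query | grep -i scrub    # last (deep) scrub stamps and current scrub state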
across your
(unspecified) failure domains so that the extra capacity isn’t wasted,
again depending on your topology.
On Mar 31, 2020, at 8:49 AM, Eneko Lacunza
wrote:
Hi Andras,
On 31/3/20 at 16:42, Andras Pataki wrote:
Hi cephers,
I'm looking for some advice on what to do about drives of different
sizes in the same cluster.
We have so far kept the drive sizes consistent on our main ceph cluster
(using 8TB drives). We're getting some new hardware with larger, 12TB
drives next, and I'm pondering on how
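For what it's worth, CRUSH weights default to the drive capacity in TiB, so mixed sizes mostly mean the larger drives get proportionally more data (and IO):

    8 TB drive  -> crush weight ~  8e12 / 2^40 ~  7.28
    12 TB drive -> crush weight ~ 12e12 / 2^40 ~ 10.91   (about 1.5x the data per OSD)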
I've been trying the upmap balancer on a new Nautilus cluster. We have three
main pools: a triple replicated pool (id: 1) and two 6+3 erasure coded
pools (id: 4 and 5). The balancer does a very nice job on the triple
replicated pool, but does something strange on the EC pools. Here is a
sample of
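To quantify it, counting how many PGs of one pool land on each OSD shows the imbalance; a rough sketch for pool id 4 (column positions in pgs_brief may vary by release):

    ceph pg dump pgs_brief 2>/dev/null \
      | awk '$1 ~ /^4\./ {print $3}' \
      | tr -d '[]' | tr ',' '\n' \
      | sort -n | uniq -c | sort -n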
We've been running into a strange problem repeating every day or so with
a specific HPC job on a Mimic cluster (13.2.8) using ceph-fuse
(14.2.7). It seems like some cephfs clients are stuck (perhaps
deadlocked) trying to access a file and are not making progress.
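When a client is stuck like this, the MDS and client admin sockets usually show where the request is blocked (a sketch; the mds and client names are placeholders):

    # on the active MDS: requests that have been in flight for a long time
    ceph daemon mds.mds1 dump_ops_in_flight
    # on the stuck client: its outstanding MDS requests
    ceph daemon /var/run/ceph/ceph-client.admin.asok mds_requests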
Ceph reports the following
. Any ideas?
Andras
On 2/17/20 3:59 AM, Igor Fedotov wrote:
Hi Andras,
please find my answers inline.
On 2/15/2020 12:27 AM, Andras Pataki wrote:
We're considering using bluestore compression for some of our data, and
I'm not entirely sure how to interpret compression results. As an
example, one of the osd perf dump results shows:
"bluestore_compressed": 28089935,
"bluestore_compressed_allocated": 115539968,
Ah, that makes sense. Thanks for the quick reply!
Andras
On 2/6/20 11:24 AM, Jason Dillaman wrote:
On Thu, Feb 6, 2020 at 11:20 AM Andras Pataki
wrote:
I'm trying to set up a cephx key to mount RBD images read-only. I have
the following two keys:
[client.rbd]
key = xxx
caps mgr = "profile rbd"
caps mon = "profile rbd"
caps osd = "profile rbd pool=rbd_vm"
[client.rbd-ro]
key = xxx
caps mgr = "profile rbd-read-only"