Hi,

On 06/22/2018 08:06 AM, dave.c...@dell.com wrote:
I saw this statement at the link below ( 
http://docs.ceph.com/docs/master/rados/operations/crush-map/ ); is that the 
reason for the warning?

" This, combined with the default CRUSH failure domain, ensures that replicas or 
erasure code shards are separated across hosts and a single host failure will not affect 
availability."
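
For reference, the failure domain a CRUSH rule enforces can be checked directly 
on the cluster; the rule name "replicated_rule" below is just the common default 
and may differ (e.g. "replicated_ruleset" on older releases):

    # list the rules, then dump the one the pool uses
    $ ceph osd crush rule ls
    $ ceph osd crush rule dump replicated_rule
    ...
        "steps": [
            { "op": "take", "item": -1, "item_name": "default" },
            { "op": "chooseleaf_firstn", "num": 0, "type": "host" },
            { "op": "emit" }
        ]
    ...
    # "type": "host" in the chooseleaf step is the failure domain the quoted
    # documentation refers to: each replica must land on a different host.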

Best Regards,
Dave Chen

-----Original Message-----
From: Chen2, Dave
Sent: Friday, June 22, 2018 1:59 PM
To: 'Burkhard Linke'; ceph-users@lists.ceph.com
Cc: Chen2, Dave
Subject: RE: [ceph-users] PG status is "active+undersized+degraded"

Hi Burkhard,

Thanks for your explanation. I created a new 2TB OSD on another node, and it indeed 
solved the issue; the status of the Ceph cluster is "health HEALTH_OK" now.

Another question: if three homogeneous OSDs are spread across only 2 nodes, I still get the 
warning message and the status is "active+undersized+degraded". Is spreading the 
three OSDs across 3 nodes a mandatory rule for Ceph? Is it only an HA 
consideration? Is there any official Ceph documentation with guidance on this?

The default Ceph CRUSH rules try to distribute PG replicas among hosts. With a default replication count of 3 (pool size = 3), this requires at least three hosts. The pool also defines the minimum number of PG replicas that must be available to allow I/O to a PG; this is usually set to 2 (pool min_size = 2). The above status therefore means that there are enough copies for the min_size (-> active), but not enough for the size (-> undersized + degraded).
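
If it helps, the relevant pool settings and the affected PGs can be inspected 
with something like the following ("rbd" is only an example pool name):

    # replica count and minimum replicas required for I/O
    $ ceph osd pool get rbd size
    $ ceph osd pool get rbd min_size

    # which PGs are undersized/degraded and why
    $ ceph health detail
    $ ceph pg dump_stuck undersized

    # how the OSDs are grouped under hosts (the CRUSH failure domain)
    $ ceph osd tree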

Using fewer than three hosts requires changing the pool size to 2. But this is strongly discouraged, since sane automatic recovery of data in case of a netsplit or other temporary node failure is not possible. Do not do this in a production setup.
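
For completeness, the change would look like the sketch below, but again: only 
consider it for a test cluster ("rbd" is only an example pool name):

    # reduce the replica count of one pool to 2 -- discouraged, see above
    $ ceph osd pool set rbd size 2
    # check what min_size remains before relying on the pool for anything
    $ ceph osd pool get rbd min_size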

For a production setup you should also consider node failures. The default setup uses 3 replicas, so to tolerate a node failure you need 4 hosts; otherwise the self-healing feature of Ceph cannot recreate the third replica. You also need to closely monitor your cluster's free space to avoid a full cluster due to re-replicated PGs after a node failure.
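
Cluster-wide and per-OSD utilisation can be watched with the usual commands, 
for example:

    # overall and per-pool usage
    $ ceph df
    # per-OSD fill level, grouped by the CRUSH tree (hosts)
    $ ceph osd df tree
    # rule of thumb: the surviving hosts need enough free space to absorb the
    # replicas of a failed host without hitting the nearfull/full ratios.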

Regards,
Burkhard
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
