Thanks Sage. I will create a "new feature" request on tracker.ceph.com <http://tracker.ceph.com/> so that this discussion does not get buried in the mailing list.
Developers can implement this at their convenience.

****************************************************************
Karan Singh
Systems Specialist, Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758
tel. +358 9 4572001
fax +358 9 4572302
http://www.csc.fi/
****************************************************************

> On 10 Mar 2015, at 14:26, Sage Weil <s...@newdream.net> wrote:
>
> On Tue, 10 Mar 2015, Christian Eichelmann wrote:
>> Hi Sage,
>>
>> we hit this problem a few months ago as well, and it took us quite a
>> while to figure out what was wrong.
>>
>> As a system administrator I don't like the idea of daemons or even init
>> scripts changing system-wide configuration parameters, so I wouldn't
>> like to see the OSDs do it themselves.
>
> This is my general feeling as well. As we move to systemd, I'd like to
> have the ceph unit file get away from this entirely and have the admin set
> these values in /etc/security/limits.conf or /etc/sysctl.d. The main
> thing making this problematic right now is that the daemons run as root
> instead of a 'ceph' user.
>
>> The idea of the warning is on the one hand a good hint; on the other
>> hand it may also confuse people, since changing this setting is not
>> required for common hardware.
>
> If we make it warn only if it reaches > 50% of the threshold, that is
> probably safe...
>
> sage
>
>
>>
>> Regards,
>> Christian
>>
>> On 03/09/2015 08:01 PM, Sage Weil wrote:
>>> On Mon, 9 Mar 2015, Karan Singh wrote:
>>>> Thanks guys, kernel.pid_max=4194303 did the trick.
>>> Great to hear! Sorry we missed that you only had it at 65536.
>>>
>>> This is a really common problem that people hit when their clusters
>>> start to grow. Is there somewhere in the docs we can put this to catch
>>> more users? Or maybe a warning issued by the osds themselves or
>>> something if they see limits that are low?
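[Editor's note: Sage's proposed check above, warning only once usage passes 50% of the limit, can be sketched as a standalone shell snippet. This is not Ceph code, just the arithmetic of the check, using standard procps tools.]

```shell
#!/bin/sh
# Hypothetical sketch of the proposed health warning: warn only once the
# total number of tasks (processes plus their threads) exceeds 50% of
# kernel.pid_max. Not the OSD implementation, just the check itself.

pid_max=$(cat /proc/sys/kernel/pid_max)

# Sum the thread (light-weight process) count of every process.
threads=$(ps -eo nlwp= | awk '{ s += $1 } END { print s }')

if [ "$threads" -gt $((pid_max / 2)) ]; then
    echo "WARNING: $threads tasks in use, over 50% of kernel.pid_max ($pid_max)"
else
    echo "OK: $threads tasks in use (kernel.pid_max = $pid_max)"
fi
```

With the default pid_max of 32768 and the ~66,000 idle threads Christian reports below for a 60-OSD host, such a check would fire long before OSD restarts start failing.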
>>>
>>> sage
>>>
>>>> - Karan -
>>>>
>>>> On 09 Mar 2015, at 14:48, Christian Eichelmann
>>>> <christian.eichelm...@1und1.de> wrote:
>>>>
>>>> Hi Karan,
>>>>
>>>> as you are actually writing in your own book, the problem is the
>>>> sysctl setting "kernel.pid_max". I've seen in your bug report that you
>>>> were setting it to 65536, which is still too low for high-density
>>>> hardware.
>>>>
>>>> In our cluster, one OSD server has about 66,000 threads when idle
>>>> (60 OSDs per server). The number of threads increases when you
>>>> increase the number of placement groups in the cluster, which I think
>>>> is what triggered your problem.
>>>>
>>>> Set "kernel.pid_max" to 4194303 (the maximum), as Azad Aliyar
>>>> suggested, and the problem should be gone.
>>>>
>>>> Regards,
>>>> Christian
>>>>
>>>> On 09.03.2015 11:41, Karan Singh wrote:
>>>> Hello Community, I need help fixing a long-standing Ceph problem.
>>>>
>>>> The cluster is unhealthy and multiple OSDs are DOWN. When I try to
>>>> restart the OSDs, I get this error:
>>>>
>>>> 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In
>>>> function 'void Thread::create(size_t)' thread 7f760dac9700 time
>>>> 2015-03-09 12:22:16.311970
>>>> common/Thread.cc: 129: FAILED assert(ret == 0)
>>>>
>>>> Environment: 4 nodes, OSD+Monitor, Firefly latest, CentOS 6.5,
>>>> kernel 3.17.2-1.el6.elrepo.x86_64
>>>>
>>>> Tried upgrading from 0.80.7 to 0.80.8, but no luck.
>>>>
>>>> Tried the CentOS stock kernel 2.6.32, but no luck.
>>>>
>>>> Memory is not a problem; more than 150 GB is free.
>>>>
>>>> Has anyone ever faced this problem?
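[Editor's note: for reference, Christian's fix can be applied and made persistent roughly as follows. The commands need root; since the affected cluster runs CentOS 6.5, this uses the classic /etc/sysctl.conf mechanism, and newer distributions would use a drop-in file under /etc/sysctl.d instead.]

```shell
# Apply immediately on the running system (requires root):
sysctl -w kernel.pid_max=4194303

# Persist across reboots (CentOS 6 style):
echo "kernel.pid_max = 4194303" >> /etc/sysctl.conf
sysctl -p

# Verify the running value:
cat /proc/sys/kernel/pid_max
```

4194303 (2^22 - 1) is the kernel's upper bound for pid_max on 64-bit systems, which is why Christian calls it "the maximum".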
>>>>
>>>> Cluster status:
>>>>
>>>>      cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
>>>>       health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs
>>>> incomplete; 1735 pgs peering; 8938 pgs stale; 1736 pgs stuck inactive;
>>>> 8938 pgs stuck stale; 10320 pgs stuck unclean; recovery 6061/31080
>>>> objects degraded (19.501%); 111/196 in osds are down; clock skew
>>>> detected on mon.pouta-s02, mon.pouta-s03
>>>>       monmap e3: 3 mons at
>>>> {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/0},
>>>> election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
>>>>       osdmap e26633: 239 osds: 85 up, 196 in
>>>>       pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
>>>>             4699 GB used, 707 TB / 711 TB avail
>>>>             6061/31080 objects degraded (19.501%)
>>>>                   14 down+remapped+peering
>>>>                   39 active
>>>>                 3289 active+clean
>>>>                  547 peering
>>>>                  663 stale+down+peering
>>>>                  705 stale+active+remapped
>>>>                    1 active+degraded+remapped
>>>>                    1 stale+down+incomplete
>>>>                  484 down+peering
>>>>                  455 active+remapped
>>>>                 3696 stale+active+degraded
>>>>                    4 remapped+peering
>>>>                   23 stale+down+remapped+peering
>>>>                   51 stale+active
>>>>                 3637 active+degraded
>>>>                 3799 stale+active+clean
>>>>
>>>> OSD logs:
>>>>
>>>> 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In
>>>> function 'void Thread::create(size_t)' thread 7f760dac9700 time
>>>> 2015-03-09 12:22:16.311970
>>>> common/Thread.cc: 129: FAILED assert(ret == 0)
>>>>
>>>>  ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
>>>>  1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
>>>>  2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
>>>>  3: (Accepter::entry()+0x265) [0xb5c635]
>>>>  4: /lib64/libpthread.so.0() [0x3c8a6079d1]
>>>>  5: (clone()+0x6d) [0x3c8a2e89dd]
>>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>>> needed to interpret this.
>>>>
>>>> More information at Ceph tracker issue:
>>>> http://tracker.ceph.com/issues/10988#change-49018
>>>>
>>>> ****************************************************************
>>>> Karan Singh
>>>> Systems Specialist, Storage Platforms
>>>> CSC - IT Center for Science,
>>>> Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
>>>> mobile: +358 503 812758
>>>> tel. +358 9 4572001
>>>> fax +358 9 4572302
>>>> http://www.csc.fi/
>>>> ****************************************************************
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>> --
>>>> Christian Eichelmann
>>>> Systemadministrator
>>>>
>>>> 1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
>>>> Brauerstraße 48 · DE-76135 Karlsruhe
>>>> Telefon: +49 721 91374-8026
>>>> christian.eichelm...@1und1.de
>>>>
>>>> Amtsgericht Montabaur / HRB 6484
>>>> Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert
>>>> Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan
>>>> Oetjen
>>>> Aufsichtsratsvorsitzender: Michael Scheeren
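[Editor's note: the backtrace above fails inside Thread::create, whose assert fires when pthread_create returns nonzero. When a host is close to kernel.pid_max, pthread_create fails with EAGAIN even though plenty of memory is free, which matches the symptoms in the thread. A quick diagnostic sketch (my own, not from the thread) to see how many threads the ceph-osd daemons on a host are using:]

```shell
#!/bin/sh
# Sum the "Threads:" field from /proc/<pid>/status for every ceph-osd
# process and print the total next to the current kernel.pid_max.
# Prints 0 for the thread total if no ceph-osd is running.

total=0
for pid in $(pgrep ceph-osd); do
    t=$(awk '/^Threads:/ { print $2 }' "/proc/$pid/status")
    total=$((total + t))
done

echo "ceph-osd threads: $total"
echo "kernel.pid_max:   $(cat /proc/sys/kernel/pid_max)"
```

On a dense host like Christian's (60 OSDs, ~66,000 idle threads), the first number sitting near the second is exactly the condition that makes OSD restarts fail with this assert.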
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com