Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-11 Thread Karan Singh
Thanks Sage

I will create a “new feature” request on tracker.ceph.com 
(http://tracker.ceph.com/) so that this discussion does not get buried in the 
mailing list.

Developers can then implement this at their convenience.



Karan Singh 
Systems Specialist , Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758
tel. +358 9 4572001
fax +358 9 4572302
http://www.csc.fi/


 On 10 Mar 2015, at 14:26, Sage Weil s...@newdream.net wrote:
 
 On Tue, 10 Mar 2015, Christian Eichelmann wrote:
 Hi Sage,
 
 we hit this problem a few months ago as well and it took us quite a while to
 figure out what was wrong.
 
 As a system administrator I don't like the idea of daemons or even init
 scripts changing system-wide configuration parameters, so I wouldn't like
 to see the OSDs do it themselves.
 
 This is my general feeling as well.  As we move to systemd, I'd like to 
 have the ceph unit file get away from this entirely and have the admin set 
 these values in /etc/security/limits.conf or /etc/sysctl.d.  The main 
 thing making this problematic right now is that the daemons run as root 
 instead of a 'ceph' user.
 
 The warning is on the one hand a good hint; on the other hand it
 may also confuse people, since changing this setting is not required for
 common hardware.
 
 If we make it warn only once it reaches 50% of the threshold, that is 
 probably safe...
 
 sage
 
 
 
 Regards,
 Christian
 
 On 03/09/2015 08:01 PM, Sage Weil wrote:
 On Mon, 9 Mar 2015, Karan Singh wrote:
 Thanks Guys kernel.pid_max=4194303 did the trick.
 Great to hear!  Sorry we missed that you only had it at 65536.
 
 This is a really common problem that people hit when their clusters start
 to grow.  Is there somewhere in the docs we can put this to catch more
 users?  Or maybe a warning issued by the osds themselves or something if
 they see limits that are low?
 
 sage
 
 - Karan -
 
   On 09 Mar 2015, at 14:48, Christian Eichelmann
   christian.eichelm...@1und1.de wrote:
 
 Hi Karan,
 
 as you are actually writing in your own book, the problem is the
 sysctl
 setting kernel.pid_max. I've seen in your bug report that you were
 setting it to 65536, which is still to low for high density hardware.
 
 In our cluster, one OSD server has in an idle situation about 66.000
 Threads (60 OSDs per Server). The number of threads increases when you
 increase the number of placement groups in the cluster, which I think
 has triggered your problem.
 
 Set the kernel.pid_max setting to 4194303 (the maximum) like Azad
 Aliyar suggested, and the problem should be gone.
 
 Regards,
 Christian
 
   On 09.03.2015 11:41, Karan Singh wrote:
   Hello Community need help to fix a long going Ceph
   problem.
 
   Cluster is unhealthy , Multiple OSDs are DOWN. When i am
   trying to
   restart OSD?s i am getting this error
 
 
   /2015-03-09 12:22:16.312774 7f760dac9700 -1
   common/Thread.cc
   http://Thread.cc: In function 'void
   Thread::create(size_t)' thread
   7f760dac9700 time 2015-03-09 12:22:16.311970/
   /common/Thread.cc http://Thread.cc: 129: FAILED
   assert(ret == 0)/
 
 
   *Environment *:  4 Nodes , OSD+Monitor , Firefly latest ,
   CentOS6.5
   , 3.17.2-1.el6.elrepo.x86_64
 
   Tried upgrading from 0.80.7 to 0.80.8  but no Luck
 
   Tried centOS stock kernel 2.6.32  but no Luck
 
   Memory is not a problem more then 150+GB is free
 
 
   Did any one every faced this problem ??
 
   *Cluster status *
   *
   *
   / cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33/
   / health HEALTH_WARN 7334 pgs degraded; 1185 pgs down;
   1 pgs
   incomplete; 1735 pgs peering; 8938 pgs stale; 1/
   /736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs
   stuck unclean;
   recovery 6061/31080 objects degraded (19/
   /.501%); 111/196 in osds are down; clock skew detected on
   mon.pouta-s02,
   mon.pouta-s03/
   / monmap e3: 3 mons at
 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX
   .50.3:6789/
   //0}, election epoch 1312, quorum 0,1,2
   pouta-s01,pouta-s02,pouta-s03/
   /   * osdmap e26633: 239 osds: 85 up, 196 in*/
   /  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data,
   10360 objects/
   /4699 GB used, 707 TB / 711 TB avail/
   /6061/31080 objects degraded (19.501%)/
   /  14 down+remapped+peering/
   /  39 active/
   /3289 active+clean/
   / 547 peering/
   / 663 stale+down+peering/
   / 705 stale+active+remapped/
   /   1 active+degraded+remapped/
   /   1 

Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-10 Thread Sage Weil
On Tue, 10 Mar 2015, Christian Eichelmann wrote:
 Hi Sage,
 
 we hit this problem a few months ago as well and it took us quite a while to
 figure out what was wrong.
 
 As a system administrator I don't like the idea of daemons or even init
 scripts changing system-wide configuration parameters, so I wouldn't like
 to see the OSDs do it themselves.

This is my general feeling as well.  As we move to systemd, I'd like to 
have the ceph unit file get away from this entirely and have the admin set 
these values in /etc/security/limits.conf or /etc/sysctl.d.  The main 
thing making this problematic right now is that the daemons run as root 
instead of a 'ceph' user.
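
For reference, a minimal sketch of what that admin-owned configuration could look like (the file name, user name and nofile values below are illustrative, not an official recommendation):

# /etc/sysctl.d/90-ceph.conf -- applied at boot, or manually via `sysctl -p /etc/sysctl.d/90-ceph.conf`
kernel.pid_max = 4194303

# /etc/security/limits.conf -- raise the open-file limits for a dedicated 'ceph' user
# (assumes the daemons eventually run as 'ceph' rather than root, as discussed above)
ceph  soft  nofile  65535
ceph  hard  nofile  65535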

 The warning is on the one hand a good hint; on the other hand it
 may also confuse people, since changing this setting is not required for
 common hardware.

If we make it warn only once it reaches 50% of the threshold, that is 
probably safe...
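
As a very rough illustration of the kind of check such a warning could be based on (a shell approximation only, not actual OSD code; the 50% figure mirrors the suggestion above):

threads=$(ps -eLf --no-headers | wc -l)        # all threads currently on the node
pid_max=$(cat /proc/sys/kernel/pid_max)
if [ $(( threads * 100 / pid_max )) -ge 50 ]; then
    echo "WARNING: ${threads} threads vs kernel.pid_max=${pid_max} (over 50% in use)"
fi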

sage


 
 Regards,
 Christian
 
 On 03/09/2015 08:01 PM, Sage Weil wrote:
  On Mon, 9 Mar 2015, Karan Singh wrote:
   Thanks Guys kernel.pid_max=4194303 did the trick.
  Great to hear!  Sorry we missed that you only had it at 65536.
  
  This is a really common problem that people hit when their clusters start
  to grow.  Is there somewhere in the docs we can put this to catch more
  users?  Or maybe a warning issued by the osds themselves or something if
  they see limits that are low?
  
  sage
  
   - Karan -
   
  On 09 Mar 2015, at 14:48, Christian Eichelmann
  christian.eichelm...@1und1.de wrote:
   
   Hi Karan,
   
   as you are actually writing in your own book, the problem is the
   sysctl
   setting kernel.pid_max. I've seen in your bug report that you were
   setting it to 65536, which is still to low for high density hardware.
   
   In our cluster, one OSD server has in an idle situation about 66.000
   Threads (60 OSDs per Server). The number of threads increases when you
   increase the number of placement groups in the cluster, which I think
   has triggered your problem.
   
   Set the kernel.pid_max setting to 4194303 (the maximum) like Azad
   Aliyar suggested, and the problem should be gone.
   
   Regards,
   Christian
   
    On 09.03.2015 11:41, Karan Singh wrote:
  Hello Community need help to fix a long going Ceph
  problem.
   
  Cluster is unhealthy , Multiple OSDs are DOWN. When i am
  trying to
  restart OSD?s i am getting this error
   
   
  /2015-03-09 12:22:16.312774 7f760dac9700 -1
  common/Thread.cc
  http://Thread.cc: In function 'void
  Thread::create(size_t)' thread
  7f760dac9700 time 2015-03-09 12:22:16.311970/
  /common/Thread.cc http://Thread.cc: 129: FAILED
  assert(ret == 0)/
   
   
  *Environment *:  4 Nodes , OSD+Monitor , Firefly latest ,
  CentOS6.5
  , 3.17.2-1.el6.elrepo.x86_64
   
  Tried upgrading from 0.80.7 to 0.80.8  but no Luck
   
  Tried centOS stock kernel 2.6.32  but no Luck
   
  Memory is not a problem more then 150+GB is free
   
   
  Did any one every faced this problem ??
   
  *Cluster status *
  *
  *
  / cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33/
  / health HEALTH_WARN 7334 pgs degraded; 1185 pgs down;
  1 pgs
  incomplete; 1735 pgs peering; 8938 pgs stale; 1/
  /736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs
  stuck unclean;
  recovery 6061/31080 objects degraded (19/
  /.501%); 111/196 in osds are down; clock skew detected on
  mon.pouta-s02,
  mon.pouta-s03/
  / monmap e3: 3 mons at
   {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX
  .50.3:6789/
  //0}, election epoch 1312, quorum 0,1,2
  pouta-s01,pouta-s02,pouta-s03/
  /   * osdmap e26633: 239 osds: 85 up, 196 in*/
  /  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data,
  10360 objects/
  /4699 GB used, 707 TB / 711 TB avail/
  /6061/31080 objects degraded (19.501%)/
  /  14 down+remapped+peering/
  /  39 active/
  /3289 active+clean/
  / 547 peering/
  / 663 stale+down+peering/
  / 705 stale+active+remapped/
  /   1 active+degraded+remapped/
  /   1 stale+down+incomplete/
  / 484 down+peering/
  / 455 active+remapped/
  /3696 stale+active+degraded/
  /   4 remapped+peering/
  /  23 stale+down+remapped+peering/
  /  51 stale+active/
  /3637 active+degraded/
  /3799 

Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-10 Thread Christian Eichelmann

Hi Sage,

we hit this problem a few months ago as well and it took us quite a 
while to figure out what was wrong.


As a system administrator I don't like the idea of daemons or even init 
scripts changing system-wide configuration parameters, so I wouldn't 
like to see the OSDs do it themselves.


I've noticed that building a Ceph cluster on high-density hardware is a 
totally different thing, with totally different problems and solutions, 
than on common hardware. I would like to see a dedicated section in the 
documentation about problems with that kind of hardware and Ceph 
clusters at a larger scale.


So I vote for the documentation. Sysctls are something I want to set 
myself.
The warning is on the one hand a good hint; on the other hand 
it may also confuse people, since changing this setting is not required 
for common hardware.


Regards,
Christian

On 03/09/2015 08:01 PM, Sage Weil wrote:

On Mon, 9 Mar 2015, Karan Singh wrote:

Thanks Guys kernel.pid_max=4194303 did the trick.

Great to hear!  Sorry we missed that you only had it at 65536.

This is a really common problem that people hit when their clusters start
to grow.  Is there somewhere in the docs we can put this to catch more
users?  Or maybe a warning issued by the osds themselves or something if
they see limits that are low?

sage


- Karan -

   On 09 Mar 2015, at 14:48, Christian Eichelmann
   christian.eichelm...@1und1.de wrote:

Hi Karan,

as you are actually writing in your own book, the problem is the
sysctl
setting kernel.pid_max. I've seen in your bug report that you were
setting it to 65536, which is still to low for high density hardware.

In our cluster, one OSD server has in an idle situation about 66.000
Threads (60 OSDs per Server). The number of threads increases when you
increase the number of placement groups in the cluster, which I think
has triggered your problem.

Set the kernel.pid_max setting to 4194303 (the maximum) like Azad
Aliyar suggested, and the problem should be gone.

Regards,
Christian

On 09.03.2015 11:41, Karan Singh wrote:
   Hello Community need help to fix a long going Ceph
   problem.

   Cluster is unhealthy , Multiple OSDs are DOWN. When i am
   trying to
   restart OSD?s i am getting this error


   /2015-03-09 12:22:16.312774 7f760dac9700 -1
   common/Thread.cc
   http://Thread.cc: In function 'void
   Thread::create(size_t)' thread
   7f760dac9700 time 2015-03-09 12:22:16.311970/
   /common/Thread.cc http://Thread.cc: 129: FAILED
   assert(ret == 0)/


   *Environment *:  4 Nodes , OSD+Monitor , Firefly latest ,
   CentOS6.5
   , 3.17.2-1.el6.elrepo.x86_64

   Tried upgrading from 0.80.7 to 0.80.8  but no Luck

   Tried centOS stock kernel 2.6.32  but no Luck

   Memory is not a problem more then 150+GB is free


   Did any one every faced this problem ??

   *Cluster status *
   *
   *
   / cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33/
   / health HEALTH_WARN 7334 pgs degraded; 1185 pgs down;
   1 pgs
   incomplete; 1735 pgs peering; 8938 pgs stale; 1/
   /736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs
   stuck unclean;
   recovery 6061/31080 objects degraded (19/
   /.501%); 111/196 in osds are down; clock skew detected on
   mon.pouta-s02,
   mon.pouta-s03/
   / monmap e3: 3 mons at
{pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX
   .50.3:6789/
   //0}, election epoch 1312, quorum 0,1,2
   pouta-s01,pouta-s02,pouta-s03/
   /   * osdmap e26633: 239 osds: 85 up, 196 in*/
   /  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data,
   10360 objects/
   /4699 GB used, 707 TB / 711 TB avail/
   /6061/31080 objects degraded (19.501%)/
   /  14 down+remapped+peering/
   /  39 active/
   /3289 active+clean/
   / 547 peering/
   / 663 stale+down+peering/
   / 705 stale+active+remapped/
   /   1 active+degraded+remapped/
   /   1 stale+down+incomplete/
   / 484 down+peering/
   / 455 active+remapped/
   /3696 stale+active+degraded/
   /   4 remapped+peering/
   /  23 stale+down+remapped+peering/
   /  51 stale+active/
   /3637 active+degraded/
   /3799 stale+active+clean/

   *OSD :  Logs *

   /2015-03-09 12:22:16.312774 7f760dac9700 -1
   common/Thread.cc
   http://Thread.cc: In function 'void
   Thread::create(size_t)' thread
   7f760dac9700 time 2015-03-09 12:22:16.311970/
   /common/Thread.cc http://Thread.cc: 129: FAILED
   assert(ret == 0)/
   /
   /
   / ceph version 0.80.8
   

Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Mohamed Pakkeer
Hi Karan,

We faced the same issue and resolved it after increasing the open file limit
and the maximum number of threads.

Config reference:

/etc/security/limits.conf

root hard nofile 65535

sysctl -w kernel.pid_max=4194303
http://tracker.ceph.com/issues/10554#change-47024
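
To confirm on a node that both limits actually took effect, something along these lines should work (illustrative commands, run as root):

sysctl kernel.pid_max     # should now report 4194303
ulimit -Hn                # hard open-file limit for the current shell session

Note that the limits.conf change only applies to sessions started after the edit, so re-login (or restart the daemons) to pick it up.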

Cheers

Mohamed Pakkeer

On Mon, Mar 9, 2015 at 4:20 PM, Azad Aliyar azad.ali...@sparksupport.com
wrote:

 *Check Max Threadcount:* If you have a node with a lot of OSDs, you may
 be hitting the default maximum number of threads (e.g., usually 32k),
 especially during recovery. You can increase the number of threads using
 sysctl to see if increasing the maximum number of threads to the maximum
 possible number of threads allowed (i.e., 4194303) will help. For example:

 sysctl -w kernel.pid_max=4194303

  If increasing the maximum thread count resolves the issue, you can make
 it permanent by including a kernel.pid_max setting in the /etc/sysctl.conf
 file. For example:

 kernel.pid_max = 4194303


 On Mon, Mar 9, 2015 at 4:11 PM, Karan Singh karan.si...@csc.fi wrote:

 Hello Community need help to fix a long going Ceph problem.

 Cluster is unhealthy , Multiple OSDs are DOWN. When i am trying to
 restart OSD’s i am getting this error


 *2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc
 http://Thread.cc: In function 'void Thread::create(size_t)' thread
 7f760dac9700 time 2015-03-09 12:22:16.311970*
 *common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)*


 *Environment *:  4 Nodes , OSD+Monitor , Firefly latest , CentOS6.5 ,
 3.17.2-1.el6.elrepo.x86_64

 Tried upgrading from 0.80.7 to 0.80.8  but no Luck

 Tried centOS stock kernel 2.6.32  but no Luck

 Memory is not a problem more then 150+GB is free


 Did any one every faced this problem ??

 *Cluster status *

  *  cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33*
 * health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs
 incomplete; 1735 pgs peering; 8938 pgs stale; 1*
 *736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean;
 recovery 6061/31080 objects degraded (19*
 *.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02,
 mon.pouta-s03*
 * monmap e3: 3 mons at
 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789*
 */0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03*
 * osdmap e26633: 239 osds: 85 up, 196 in*
 *  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects*
 *4699 GB used, 707 TB / 711 TB avail*
 *6061/31080 objects degraded (19.501%)*
 *  14 down+remapped+peering*
 *  39 active*
 *3289 active+clean*
 * 547 peering*
 * 663 stale+down+peering*
 * 705 stale+active+remapped*
 *   1 active+degraded+remapped*
 *   1 stale+down+incomplete*
 * 484 down+peering*
 * 455 active+remapped*
 *3696 stale+active+degraded*
 *   4 remapped+peering*
 *  23 stale+down+remapped+peering*
 *  51 stale+active*
 *3637 active+degraded*
 *3799 stale+active+clean*

 *OSD :  Logs *

 *2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc
 http://Thread.cc: In function 'void Thread::create(size_t)' thread
 7f760dac9700 time 2015-03-09 12:22:16.311970*
 *common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)*

 * ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)*
 * 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]*
 * 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]*
 * 3: (Accepter::entry()+0x265) [0xb5c635]*
 * 4: /lib64/libpthread.so.0() [0x3c8a6079d1]*
 * 5: (clone()+0x6d) [0x3c8a2e89dd]*
 * NOTE: a copy of the executable, or `objdump -rdS executable` is
 needed to interpret this.*


 *More information at Ceph Tracker Issue :  *
 http://tracker.ceph.com/issues/10988#change-49018


 
 Karan Singh
 Systems Specialist , Storage Platforms
 CSC - IT Center for Science,
 Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
 mobile: +358 503 812758
 tel. +358 9 4572001
 fax +358 9 4572302
 http://www.csc.fi/
 


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




 --
Warm Regards,  Azad Aliyar
Linux Server Engineer
Email: azad.ali...@sparksupport.com  |  Skype: spark.azad
http://www.sparksupport.com  http://www.sparkmycloud.com
https://www.facebook.com/sparksupport
http://www.linkedin.com/company/244846
https://twitter.com/sparksupport
3rd Floor, Leela Infopark, Phase-2, Kakanad, Kochi-30, Kerala, India
Phone: +91 484 6561696 , Mobile: 91-8129270421.
Confidentiality Notice: Information in this 

Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Christian Eichelmann
Hi Karan,

as you actually describe in your own book, the problem is the sysctl
setting kernel.pid_max. I've seen in your bug report that you were
setting it to 65536, which is still too low for high-density hardware.

In our cluster, one OSD server has about 66,000 threads in an idle
state (60 OSDs per server). The number of threads increases when you
increase the number of placement groups in the cluster, which I think
is what triggered your problem.
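
To get comparable numbers on your own nodes, something like this works (illustrative commands):

ps -eLf --no-headers | wc -l                               # total threads on the node
ps -o nlwp= -C ceph-osd | awk '{s+=$1} END {print s}'      # threads held by all ceph-osd processes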

Set the kernel.pid_max setting to 4194303 (the maximum) like Azad
Aliyar suggested, and the problem should be gone.

Regards,
Christian

 On 09.03.2015 11:41, Karan Singh wrote:
 Hello Community, I need help to fix a long-standing Ceph problem.
 
 The cluster is unhealthy, multiple OSDs are DOWN. When I am trying to
 restart OSDs I am getting this error:
 
 
 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void
 Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
 common/Thread.cc: 129: FAILED assert(ret == 0)
 
 
 Environment:  4 nodes, OSD+Monitor, Firefly latest, CentOS 6.5,
 3.17.2-1.el6.elrepo.x86_64
 
 Tried upgrading from 0.80.7 to 0.80.8, but no luck.
 
 Tried the CentOS stock kernel 2.6.32, but no luck.
 
 Memory is not a problem; more than 150 GB is free.
 
 
 Has anyone ever faced this problem?
 
 Cluster status
 
  cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
  health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs
 incomplete; 1735 pgs peering; 8938 pgs stale; 1736 pgs stuck inactive;
 8938 pgs stuck stale; 10320 pgs stuck unclean; recovery 6061/31080
 objects degraded (19.501%); 111/196 in osds are down; clock skew
 detected on mon.pouta-s02, mon.pouta-s03
  monmap e3: 3 mons at
 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/0},
 election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
  osdmap e26633: 239 osds: 85 up, 196 in
   pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
 4699 GB used, 707 TB / 711 TB avail
 6061/31080 objects degraded (19.501%)
   14 down+remapped+peering
   39 active
 3289 active+clean
  547 peering
  663 stale+down+peering
  705 stale+active+remapped
    1 active+degraded+remapped
    1 stale+down+incomplete
  484 down+peering
  455 active+remapped
 3696 stale+active+degraded
    4 remapped+peering
   23 stale+down+remapped+peering
   51 stale+active
 3637 active+degraded
 3799 stale+active+clean
 
 OSD logs
 
 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void
 Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
 common/Thread.cc: 129: FAILED assert(ret == 0)
 
  ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
  1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
  2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
  3: (Accepter::entry()+0x265) [0xb5c635]
  4: /lib64/libpthread.so.0() [0x3c8a6079d1]
  5: (clone()+0x6d) [0x3c8a2e89dd]
  NOTE: a copy of the executable, or `objdump -rdS executable` is
 needed to interpret this.
 
 
 More information at the Ceph tracker issue:
 http://tracker.ceph.com/issues/10988#change-49018
 
 
 
 Karan Singh 
 Systems Specialist , Storage Platforms
 CSC - IT Center for Science,
 Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
 mobile: +358 503 812758
 tel. +358 9 4572001
 fax +358 9 4572302
 http://www.csc.fi/
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 


-- 
Christian Eichelmann
Systemadministrator

1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
Brauerstraße 48 · DE-76135 Karlsruhe
Telefon: +49 721 91374-8026
christian.eichelm...@1und1.de

Amtsgericht Montabaur / HRB 6484
Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert
Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
Aufsichtsratsvorsitzender: Michael Scheeren
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Karan Singh
Thanks Guys kernel.pid_max=4194303 did the trick.

- Karan -

 On 09 Mar 2015, at 14:48, Christian Eichelmann 
 christian.eichelm...@1und1.de wrote:
 
 Hi Karan,
 
 as you are actually writing in your own book, the problem is the sysctl
 setting kernel.pid_max. I've seen in your bug report that you were
 setting it to 65536, which is still to low for high density hardware.
 
 In our cluster, one OSD server has in an idle situation about 66.000
 Threads (60 OSDs per Server). The number of threads increases when you
 increase the number of placement groups in the cluster, which I think
 has triggered your problem.
 
 Set the kernel.pid_max setting to 4194303 (the maximum) like Azad
 Aliyar suggested, and the problem should be gone.
 
 Regards,
 Christian
 
 On 09.03.2015 11:41, Karan Singh wrote:
 Hello Community need help to fix a long going Ceph problem.
 
 Cluster is unhealthy , Multiple OSDs are DOWN. When i am trying to
 restart OSD’s i am getting this error 
 
 
 /2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc
 http://Thread.cc: In function 'void Thread::create(size_t)' thread
 7f760dac9700 time 2015-03-09 12:22:16.311970/
 /common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)/
 
 
 *Environment *:  4 Nodes , OSD+Monitor , Firefly latest , CentOS6.5
 , 3.17.2-1.el6.elrepo.x86_64
 
 Tried upgrading from 0.80.7 to 0.80.8  but no Luck
 
 Tried centOS stock kernel 2.6.32  but no Luck
 
 Memory is not a problem more then 150+GB is free 
 
 
 Did any one every faced this problem ??
 
 *Cluster status *
 *
 *
 / cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33/
 / health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs
 incomplete; 1735 pgs peering; 8938 pgs stale; 1/
 /736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean;
 recovery 6061/31080 objects degraded (19/
 /.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02,
 mon.pouta-s03/
 / monmap e3: 3 mons at
 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/
 //0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03/
 /   * osdmap e26633: 239 osds: 85 up, 196 in*/
 /  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects/
 /4699 GB used, 707 TB / 711 TB avail/
 /6061/31080 objects degraded (19.501%)/
 /  14 down+remapped+peering/
 /  39 active/
 /3289 active+clean/
 / 547 peering/
 / 663 stale+down+peering/
 / 705 stale+active+remapped/
 /   1 active+degraded+remapped/
 /   1 stale+down+incomplete/
 / 484 down+peering/
 / 455 active+remapped/
 /3696 stale+active+degraded/
 /   4 remapped+peering/
 /  23 stale+down+remapped+peering/
 /  51 stale+active/
 /3637 active+degraded/
 /3799 stale+active+clean/
 
 *OSD :  Logs *
 
 /2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc
 http://Thread.cc: In function 'void Thread::create(size_t)' thread
 7f760dac9700 time 2015-03-09 12:22:16.311970/
 /common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)/
 /
 /
 / ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)/
 / 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]/
 / 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]/
 / 3: (Accepter::entry()+0x265) [0xb5c635]/
 / 4: /lib64/libpthread.so.0() [0x3c8a6079d1]/
 / 5: (clone()+0x6d) [0x3c8a2e89dd]/
 / NOTE: a copy of the executable, or `objdump -rdS executable` is
 needed to interpret this./
 
 
 *More information at Ceph Tracker Issue :
 *http://tracker.ceph.com/issues/10988#change-49018
 
 
 
 Karan Singh 
 Systems Specialist , Storage Platforms
 CSC - IT Center for Science,
 Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
 mobile: +358 503 812758
 tel. +358 9 4572001
 fax +358 9 4572302
 http://www.csc.fi/
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 -- 
 Christian Eichelmann
 Systemadministrator
 
 1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
 Brauerstraße 48 · DE-76135 Karlsruhe
 Telefon: +49 721 91374-8026
 christian.eichelm...@1und1.de
 
 Amtsgericht Montabaur / HRB 6484
 Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert
 Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
 Aufsichtsratsvorsitzender: Michael Scheeren



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Nicheal
Umm.. Too many threads are created by SimpleMessenger: every pipe
creates two worker threads, one for sending and one for receiving messages.
Thus, AsyncMessenger would be promising, but it is still in development.
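
For what it's worth, one way to watch this on a live daemon is the kernel's per-process thread count (commands are illustrative):

pid=$(pgrep -o -f ceph-osd)        # pick one ceph-osd pid; adjust the pattern as needed
grep Threads /proc/$pid/status     # e.g. "Threads: 1100" -- grows with each new pipe/connection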

Regards
Ning Yao


2015-03-09 20:48 GMT+08:00 Christian Eichelmann christian.eichelm...@1und1.de:
 Hi Karan,

 as you are actually writing in your own book, the problem is the sysctl
 setting kernel.pid_max. I've seen in your bug report that you were
 setting it to 65536, which is still to low for high density hardware.

 In our cluster, one OSD server has in an idle situation about 66.000
 Threads (60 OSDs per Server). The number of threads increases when you
 increase the number of placement groups in the cluster, which I think
 has triggered your problem.

 Set the kernel.pid_max setting to 4194303 (the maximum) like Azad
 Aliyar suggested, and the problem should be gone.

 Regards,
 Christian

 On 09.03.2015 11:41, Karan Singh wrote:
 Hello Community need help to fix a long going Ceph problem.

 Cluster is unhealthy , Multiple OSDs are DOWN. When i am trying to
 restart OSD’s i am getting this error


 /2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc
 http://Thread.cc: In function 'void Thread::create(size_t)' thread
 7f760dac9700 time 2015-03-09 12:22:16.311970/
 /common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)/


 *Environment *:  4 Nodes , OSD+Monitor , Firefly latest , CentOS6.5
 , 3.17.2-1.el6.elrepo.x86_64

 Tried upgrading from 0.80.7 to 0.80.8  but no Luck

 Tried centOS stock kernel 2.6.32  but no Luck

 Memory is not a problem more then 150+GB is free


 Did any one every faced this problem ??

 *Cluster status *
 *
 *
  / cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33/
 / health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs
 incomplete; 1735 pgs peering; 8938 pgs stale; 1/
 /736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean;
 recovery 6061/31080 objects degraded (19/
 /.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02,
 mon.pouta-s03/
 / monmap e3: 3 mons at
 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/
 //0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03/
 /   * osdmap e26633: 239 osds: 85 up, 196 in*/
 /  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects/
 /4699 GB used, 707 TB / 711 TB avail/
 /6061/31080 objects degraded (19.501%)/
 /  14 down+remapped+peering/
 /  39 active/
 /3289 active+clean/
 / 547 peering/
 / 663 stale+down+peering/
 / 705 stale+active+remapped/
 /   1 active+degraded+remapped/
 /   1 stale+down+incomplete/
 / 484 down+peering/
 / 455 active+remapped/
 /3696 stale+active+degraded/
 /   4 remapped+peering/
 /  23 stale+down+remapped+peering/
 /  51 stale+active/
 /3637 active+degraded/
 /3799 stale+active+clean/

 *OSD :  Logs *

 /2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc
 http://Thread.cc: In function 'void Thread::create(size_t)' thread
 7f760dac9700 time 2015-03-09 12:22:16.311970/
 /common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)/
 /
 /
 / ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)/
 / 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]/
 / 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]/
 / 3: (Accepter::entry()+0x265) [0xb5c635]/
 / 4: /lib64/libpthread.so.0() [0x3c8a6079d1]/
 / 5: (clone()+0x6d) [0x3c8a2e89dd]/
 / NOTE: a copy of the executable, or `objdump -rdS executable` is
 needed to interpret this./


 *More information at Ceph Tracker Issue :
  *http://tracker.ceph.com/issues/10988#change-49018


 
 Karan Singh
 Systems Specialist , Storage Platforms
 CSC - IT Center for Science,
 Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
 mobile: +358 503 812758
 tel. +358 9 4572001
 fax +358 9 4572302
 http://www.csc.fi/
 



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



 --
 Christian Eichelmann
 Systemadministrator

 1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
 Brauerstraße 48 · DE-76135 Karlsruhe
 Telefon: +49 721 91374-8026
 christian.eichelm...@1und1.de

 Amtsgericht Montabaur / HRB 6484
 Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert
 Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
 Aufsichtsratsvorsitzender: Michael Scheeren
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 

Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Tony Harris
I know I'm not even close to this type of problem yet with my small
clusters (both test and production) - but it would be great if
something like that could appear in the cluster HEALTH_WARN: if Ceph could
determine the number of processes in use and compare it against the current
limit, it could throw a health warning once it gets within, say, 10 or 15% of
the max value.  That would be a really quick indicator for anyone who
frequently checks the health status (for example through a web portal), as
they may see it more quickly than during their regular log check interval.
Just a thought.
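
Until something like that exists in Ceph itself, here is a sketch of an external check along those lines (a hypothetical Nagios-style plugin; thresholds and messages are illustrative):

#!/bin/bash
# Warn when node-wide thread usage gets within ~15% of kernel.pid_max.
used=$(ps -eLf --no-headers | wc -l)
max=$(cat /proc/sys/kernel/pid_max)
pct=$(( used * 100 / max ))
if   [ "$pct" -ge 90 ]; then echo "CRITICAL: ${used}/${max} threads (${pct}%)"; exit 2
elif [ "$pct" -ge 85 ]; then echo "WARNING: ${used}/${max} threads (${pct}%)"; exit 1
else                         echo "OK: ${used}/${max} threads (${pct}%)"; exit 0
fi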

-Tony

On Mon, Mar 9, 2015 at 2:01 PM, Sage Weil s...@newdream.net wrote:

 On Mon, 9 Mar 2015, Karan Singh wrote:
  Thanks Guys kernel.pid_max=4194303 did the trick.

 Great to hear!  Sorry we missed that you only had it at 65536.

 This is a really common problem that people hit when their clusters start
 to grow.  Is there somewhere in the docs we can put this to catch more
 users?  Or maybe a warning issued by the osds themselves or something if
 they see limits that are low?

 sage

  - Karan -
 
On 09 Mar 2015, at 14:48, Christian Eichelmann
christian.eichelm...@1und1.de wrote:
 
  Hi Karan,
 
  as you are actually writing in your own book, the problem is the
  sysctl
  setting kernel.pid_max. I've seen in your bug report that you were
  setting it to 65536, which is still to low for high density hardware.
 
  In our cluster, one OSD server has in an idle situation about 66.000
  Threads (60 OSDs per Server). The number of threads increases when you
  increase the number of placement groups in the cluster, which I think
  has triggered your problem.
 
  Set the kernel.pid_max setting to 4194303 (the maximum) like Azad
  Aliyar suggested, and the problem should be gone.
 
  Regards,
  Christian
 
  On 09.03.2015 11:41, Karan Singh wrote:
Hello Community need help to fix a long going Ceph
problem.
 
Cluster is unhealthy , Multiple OSDs are DOWN. When i am
trying to
restart OSD?s i am getting this error
 
 
/2015-03-09 12:22:16.312774 7f760dac9700 -1
common/Thread.cc
http://Thread.cc: In function 'void
Thread::create(size_t)' thread
7f760dac9700 time 2015-03-09 12:22:16.311970/
/common/Thread.cc http://Thread.cc: 129: FAILED
assert(ret == 0)/
 
 
*Environment *:  4 Nodes , OSD+Monitor , Firefly latest ,
CentOS6.5
, 3.17.2-1.el6.elrepo.x86_64
 
Tried upgrading from 0.80.7 to 0.80.8  but no Luck
 
Tried centOS stock kernel 2.6.32  but no Luck
 
Memory is not a problem more then 150+GB is free
 
 
Did any one every faced this problem ??
 
*Cluster status *
*
*
/ cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33/
/ health HEALTH_WARN 7334 pgs degraded; 1185 pgs down;
1 pgs
incomplete; 1735 pgs peering; 8938 pgs stale; 1/
/736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs
stuck unclean;
recovery 6061/31080 objects degraded (19/
/.501%); 111/196 in osds are down; clock skew detected on
mon.pouta-s02,
mon.pouta-s03/
/ monmap e3: 3 mons at
 
 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX
.50.3:6789/
//0}, election epoch 1312, quorum 0,1,2
pouta-s01,pouta-s02,pouta-s03/
/   * osdmap e26633: 239 osds: 85 up, 196 in*/
/  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data,
10360 objects/
/4699 GB used, 707 TB / 711 TB avail/
/6061/31080 objects degraded (19.501%)/
/  14 down+remapped+peering/
/  39 active/
/3289 active+clean/
/ 547 peering/
/ 663 stale+down+peering/
/ 705 stale+active+remapped/
/   1 active+degraded+remapped/
/   1 stale+down+incomplete/
/ 484 down+peering/
/ 455 active+remapped/
/3696 stale+active+degraded/
/   4 remapped+peering/
/  23 stale+down+remapped+peering/
/  51 stale+active/
/3637 active+degraded/
/3799 stale+active+clean/
 
*OSD :  Logs *
 
/2015-03-09 12:22:16.312774 7f760dac9700 -1
common/Thread.cc
http://Thread.cc: In function 'void
Thread::create(size_t)' thread
7f760dac9700 time 2015-03-09 12:22:16.311970/
/common/Thread.cc http://Thread.cc: 129: FAILED
assert(ret == 0)/
/
/
/ ceph version 0.80.8
(69eaad7f8308f21573c604f121956e64679a52a7)/
/ 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]/
/ 2: 

Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Udo Lembke
Hi Tony,
sounds like a good idea!

Udo
On 09.03.2015 21:55, Tony Harris wrote:
 I know I'm not even close to this type of problem yet with my small
 clusters (both test and production) - but it would be great if
 something like that could appear in the cluster HEALTH_WARN: if Ceph
 could determine the number of processes in use and compare it against
 the current limit, it could throw a health warning once it gets within,
 say, 10 or 15% of the max value.  That would be a really quick indicator
 for anyone who frequently checks the health status (for example through
 a web portal), as they may see it more quickly than during their regular
 log check interval.  Just a thought.

 -Tony


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Sage Weil
On Mon, 9 Mar 2015, Karan Singh wrote:
 Thanks Guys kernel.pid_max=4194303 did the trick.

Great to hear!  Sorry we missed that you only had it at 65536.

This is a really common problem that people hit when their clusters start 
to grow.  Is there somewhere in the docs we can put this to catch more 
users?  Or maybe a warning issued by the osds themselves or something if 
they see limits that are low?

sage

 - Karan -
 
   On 09 Mar 2015, at 14:48, Christian Eichelmann
   christian.eichelm...@1und1.de wrote:
 
 Hi Karan,
 
 as you are actually writing in your own book, the problem is the
 sysctl
 setting kernel.pid_max. I've seen in your bug report that you were
 setting it to 65536, which is still to low for high density hardware.
 
 In our cluster, one OSD server has in an idle situation about 66.000
 Threads (60 OSDs per Server). The number of threads increases when you
 increase the number of placement groups in the cluster, which I think
 has triggered your problem.
 
 Set the kernel.pid_max setting to 4194303 (the maximum) like Azad
 Aliyar suggested, and the problem should be gone.
 
 Regards,
 Christian
 
 On 09.03.2015 11:41, Karan Singh wrote:
   Hello Community need help to fix a long going Ceph
   problem.
 
   Cluster is unhealthy , Multiple OSDs are DOWN. When i am
   trying to
   restart OSD?s i am getting this error
 
 
   /2015-03-09 12:22:16.312774 7f760dac9700 -1
   common/Thread.cc
   http://Thread.cc: In function 'void
   Thread::create(size_t)' thread
   7f760dac9700 time 2015-03-09 12:22:16.311970/
   /common/Thread.cc http://Thread.cc: 129: FAILED
   assert(ret == 0)/
 
 
   *Environment *:  4 Nodes , OSD+Monitor , Firefly latest ,
   CentOS6.5
   , 3.17.2-1.el6.elrepo.x86_64
 
   Tried upgrading from 0.80.7 to 0.80.8  but no Luck
 
   Tried centOS stock kernel 2.6.32  but no Luck
 
   Memory is not a problem more then 150+GB is free
 
 
   Did any one every faced this problem ??
 
   *Cluster status *
   *
   *
   / cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33/
   / health HEALTH_WARN 7334 pgs degraded; 1185 pgs down;
   1 pgs
   incomplete; 1735 pgs peering; 8938 pgs stale; 1/
   /736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs
   stuck unclean;
   recovery 6061/31080 objects degraded (19/
   /.501%); 111/196 in osds are down; clock skew detected on
   mon.pouta-s02,
   mon.pouta-s03/
   / monmap e3: 3 mons at
 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX
   .50.3:6789/
   //0}, election epoch 1312, quorum 0,1,2
   pouta-s01,pouta-s02,pouta-s03/
   /   * osdmap e26633: 239 osds: 85 up, 196 in*/
   /  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data,
   10360 objects/
   /    4699 GB used, 707 TB / 711 TB avail/
   /    6061/31080 objects degraded (19.501%)/
   /  14 down+remapped+peering/
   /  39 active/
   /    3289 active+clean/
   / 547 peering/
   / 663 stale+down+peering/
   / 705 stale+active+remapped/
   /   1 active+degraded+remapped/
   /   1 stale+down+incomplete/
   / 484 down+peering/
   / 455 active+remapped/
   /    3696 stale+active+degraded/
   /   4 remapped+peering/
   /  23 stale+down+remapped+peering/
   /  51 stale+active/
   /    3637 active+degraded/
   /    3799 stale+active+clean/
 
   *OSD :  Logs *
 
   /2015-03-09 12:22:16.312774 7f760dac9700 -1
   common/Thread.cc
   http://Thread.cc: In function 'void
   Thread::create(size_t)' thread
   7f760dac9700 time 2015-03-09 12:22:16.311970/
   /common/Thread.cc http://Thread.cc: 129: FAILED
   assert(ret == 0)/
   /
   /
   / ceph version 0.80.8
   (69eaad7f8308f21573c604f121956e64679a52a7)/
   / 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]/
   / 2: (SimpleMessenger::add_accept_pipe(int)+0x6a)
   [0xae84fa]/
   / 3: (Accepter::entry()+0x265) [0xb5c635]/
   / 4: /lib64/libpthread.so.0() [0x3c8a6079d1]/
   / 5: (clone()+0x6d) [0x3c8a2e89dd]/
   / NOTE: a copy of the executable, or `objdump -rdS
   executable` is
   needed to interpret this./
 
 
   *More information at Ceph Tracker Issue :
   *http://tracker.ceph.com/issues/10988#change-49018
 
 
   
   Karan Singh
   Systems Specialist , Storage Platforms
   CSC - IT Center for Science,
   Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
   mobile: +358 503 812758
   tel. +358 9 4572001
   fax +358 9 4572302
   

[ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Karan Singh
Hello Community, I need help to fix a long-standing Ceph problem.

The cluster is unhealthy, multiple OSDs are DOWN. When I am trying to restart 
OSDs I am getting this error:


2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void 
Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
common/Thread.cc: 129: FAILED assert(ret == 0)


Environment:  4 nodes, OSD+Monitor, Firefly latest, CentOS 6.5, 
3.17.2-1.el6.elrepo.x86_64

Tried upgrading from 0.80.7 to 0.80.8, but no luck.

Tried the CentOS stock kernel 2.6.32, but no luck.

Memory is not a problem; more than 150 GB is free.


Has anyone ever faced this problem?

Cluster status 

   cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
 health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs incomplete; 
1735 pgs peering; 8938 pgs stale; 1
736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean; recovery 
6061/31080 objects degraded (19
.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02, 
mon.pouta-s03
 monmap e3: 3 mons at 
{pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789
/0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
 osdmap e26633: 239 osds: 85 up, 196 in
  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
4699 GB used, 707 TB / 711 TB avail
6061/31080 objects degraded (19.501%)
  14 down+remapped+peering
  39 active
3289 active+clean
 547 peering
 663 stale+down+peering
 705 stale+active+remapped
   1 active+degraded+remapped
   1 stale+down+incomplete
 484 down+peering
 455 active+remapped
3696 stale+active+degraded
   4 remapped+peering
  23 stale+down+remapped+peering
  51 stale+active
3637 active+degraded
3799 stale+active+clean

OSD :  Logs 

2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void 
Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
common/Thread.cc: 129: FAILED assert(ret == 0)

 ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
 3: (Accepter::entry()+0x265) [0xb5c635]
 4: /lib64/libpthread.so.0() [0x3c8a6079d1]
 5: (clone()+0x6d) [0x3c8a2e89dd]
 NOTE: a copy of the executable, or `objdump -rdS executable` is needed to 
interpret this.


More information at Ceph Tracker Issue :  
http://tracker.ceph.com/issues/10988#change-49018 
http://tracker.ceph.com/issues/10988#change-49018



Karan Singh 
Systems Specialist , Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758
tel. +358 9 4572001
fax +358 9 4572302
http://www.csc.fi/




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Azad Aliyar
*Check Max Threadcount:* If you have a node with a lot of OSDs, you may be
hitting the default maximum number of threads (e.g., usually 32k),
especially during recovery. You can increase the number of threads using
sysctl to see if increasing the maximum number of threads to the maximum
possible number of threads allowed (i.e., 4194303) will help. For example:

sysctl -w kernel.pid_max=4194303

 If increasing the maximum thread count resolves the issue, you can make it
permanent by including a kernel.pid_max setting in the /etc/sysctl.conf
file. For example:

kernel.pid_max = 4194303
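
After adding that line to /etc/sysctl.conf you can reload and verify it without a reboot, roughly like this:

sysctl -p /etc/sysctl.conf     # re-apply the settings from the file
sysctl kernel.pid_max          # confirm the new value is active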


On Mon, Mar 9, 2015 at 4:11 PM, Karan Singh karan.si...@csc.fi wrote:

 Hello Community need help to fix a long going Ceph problem.

 Cluster is unhealthy , Multiple OSDs are DOWN. When i am trying to restart
 OSD’s i am getting this error


 *2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc
 http://Thread.cc: In function 'void Thread::create(size_t)' thread
 7f760dac9700 time 2015-03-09 12:22:16.311970*
 *common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)*


 *Environment *:  4 Nodes , OSD+Monitor , Firefly latest , CentOS6.5 ,
 3.17.2-1.el6.elrepo.x86_64

 Tried upgrading from 0.80.7 to 0.80.8  but no Luck

 Tried centOS stock kernel 2.6.32  but no Luck

 Memory is not a problem more then 150+GB is free


 Did any one every faced this problem ??

 *Cluster status *

  *  cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33*
 * health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs
 incomplete; 1735 pgs peering; 8938 pgs stale; 1*
 *736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean;
 recovery 6061/31080 objects degraded (19*
 *.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02,
 mon.pouta-s03*
 * monmap e3: 3 mons at
 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789*
 */0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03*
 * osdmap e26633: 239 osds: 85 up, 196 in*
 *  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects*
 *4699 GB used, 707 TB / 711 TB avail*
 *6061/31080 objects degraded (19.501%)*
 *  14 down+remapped+peering*
 *  39 active*
 *3289 active+clean*
 * 547 peering*
 * 663 stale+down+peering*
 * 705 stale+active+remapped*
 *   1 active+degraded+remapped*
 *   1 stale+down+incomplete*
 * 484 down+peering*
 * 455 active+remapped*
 *3696 stale+active+degraded*
 *   4 remapped+peering*
 *  23 stale+down+remapped+peering*
 *  51 stale+active*
 *3637 active+degraded*
 *3799 stale+active+clean*

 *OSD :  Logs *

 *2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc
 http://Thread.cc: In function 'void Thread::create(size_t)' thread
 7f760dac9700 time 2015-03-09 12:22:16.311970*
 *common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)*

 * ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)*
 * 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]*
 * 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]*
 * 3: (Accepter::entry()+0x265) [0xb5c635]*
 * 4: /lib64/libpthread.so.0() [0x3c8a6079d1]*
 * 5: (clone()+0x6d) [0x3c8a2e89dd]*
 * NOTE: a copy of the executable, or `objdump -rdS executable` is needed
 to interpret this.*


 *More information at Ceph Tracker Issue :  *
 http://tracker.ceph.com/issues/10988#change-49018


 
 Karan Singh
 Systems Specialist , Storage Platforms
 CSC - IT Center for Science,
 Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
 mobile: +358 503 812758
 tel. +358 9 4572001
 fax +358 9 4572302
 http://www.csc.fi/
 


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




-- 
Warm Regards,  Azad Aliyar
Linux Server Engineer
Email: azad.ali...@sparksupport.com  |  Skype: spark.azad
http://www.sparksupport.com  http://www.sparkmycloud.com
https://www.facebook.com/sparksupport
http://www.linkedin.com/company/244846  https://twitter.com/sparksupport
3rd Floor, Leela Infopark, Phase-2, Kakanad, Kochi-30, Kerala, India
Phone: +91 484 6561696 , Mobile: 91-8129270421.
Confidentiality Notice: Information in this e-mail is proprietary to
SparkSupport and is intended for use only by the addressee, and may contain
information that is privileged, confidential or exempt from disclosure. If
you are not the intended recipient, you are notified that any use of this
information in any manner is strictly prohibited. Please delete this mail &
notify us immediately at i...@sparksupport.com

Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Azad Aliyar
Great Karan.

On Mon, Mar 9, 2015 at 9:32 PM, Karan Singh karan.si...@csc.fi wrote:

 Thanks Guys kernel.pid_max=4194303 did the trick.

 - Karan -

 On 09 Mar 2015, at 14:48, Christian Eichelmann 
 christian.eichelm...@1und1.de wrote:

 Hi Karan,

 as you are actually writing in your own book, the problem is the sysctl
 setting kernel.pid_max. I've seen in your bug report that you were
 setting it to 65536, which is still to low for high density hardware.

 In our cluster, one OSD server has in an idle situation about 66.000
 Threads (60 OSDs per Server). The number of threads increases when you
 increase the number of placement groups in the cluster, which I think
 has triggered your problem.

 Set the kernel.pid_max setting to 4194303 (the maximum) like Azad
 Aliyar suggested, and the problem should be gone.

 Regards,
 Christian

 On 09.03.2015 11:41, Karan Singh wrote:

 Hello Community need help to fix a long going Ceph problem.

 Cluster is unhealthy , Multiple OSDs are DOWN. When i am trying to
 restart OSD’s i am getting this error


 /2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc
 http://Thread.cc: In function 'void Thread::create(size_t)' thread
 7f760dac9700 time 2015-03-09 12:22:16.311970/
 /common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)/


 *Environment *:  4 Nodes , OSD+Monitor , Firefly latest , CentOS6.5
 , 3.17.2-1.el6.elrepo.x86_64

 Tried upgrading from 0.80.7 to 0.80.8  but no Luck

 Tried centOS stock kernel 2.6.32  but no Luck

 Memory is not a problem more then 150+GB is free


 Did any one every faced this problem ??

 *Cluster status *
 *
 *
 / cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33/
 / health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs
 incomplete; 1735 pgs peering; 8938 pgs stale; 1/
 /736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean;
 recovery 6061/31080 objects degraded (19/
 /.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02,
 mon.pouta-s03/
 / monmap e3: 3 mons at

 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/
 //0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03/
 /   * osdmap e26633: 239 osds: 85 up, 196 in*/
 /  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects/
 /4699 GB used, 707 TB / 711 TB avail/
 /6061/31080 objects degraded (19.501%)/
 /  14 down+remapped+peering/
 /  39 active/
 /3289 active+clean/
 / 547 peering/
 / 663 stale+down+peering/
 / 705 stale+active+remapped/
 /   1 active+degraded+remapped/
 /   1 stale+down+incomplete/
 / 484 down+peering/
 / 455 active+remapped/
 /3696 stale+active+degraded/
 /   4 remapped+peering/
 /  23 stale+down+remapped+peering/
 /  51 stale+active/
 /3637 active+degraded/
 /3799 stale+active+clean/

 *OSD :  Logs *

 /2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc
 http://Thread.cc: In function 'void Thread::create(size_t)' thread
 7f760dac9700 time 2015-03-09 12:22:16.311970/
 /common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)/
 /
 /
 / ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)/
 / 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]/
 / 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]/
 / 3: (Accepter::entry()+0x265) [0xb5c635]/
 / 4: /lib64/libpthread.so.0() [0x3c8a6079d1]/
 / 5: (clone()+0x6d) [0x3c8a2e89dd]/
 / NOTE: a copy of the executable, or `objdump -rdS executable` is
 needed to interpret this./


 *More information at Ceph Tracker Issue :
 *http://tracker.ceph.com/issues/10988#change-49018


 
 Karan Singh
 Systems Specialist , Storage Platforms
 CSC - IT Center for Science,
 Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
 mobile: +358 503 812758
 tel. +358 9 4572001
 fax +358 9 4572302
 http://www.csc.fi/
 



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



 --
 Christian Eichelmann
 Systemadministrator

 11 Internet AG - IT Operations Mail  Media Advertising  Targeting
 Brauerstraße 48 · DE-76135 Karlsruhe
 Telefon: +49 721 91374-8026
 christian.eichelm...@1und1.de

 Amtsgericht Montabaur / HRB 6484
 Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert
 Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
 Aufsichtsratsvorsitzender: Michael Scheeren





-- 
   Warm Regards,  Azad Aliyar
 Linux Server Engineer
 *Email* :  azad.ali...@sparksupport.com   *|*   *Skype* :   spark.azad
http://www.sparksupport.com 

Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Nicheal
2015-03-10 3:01 GMT+08:00 Sage Weil s...@newdream.net:
 On Mon, 9 Mar 2015, Karan Singh wrote:
 Thanks Guys kernel.pid_max=4194303 did the trick.

 Great to hear!  Sorry we missed that you only had it at 65536.

 This is a really common problem that people hit when their clusters start
 to grow.  Is there somewhere in the docs we can put this to catch more
 users?  Or maybe a warning issued by the osds themselves or something if
 they see limits that are low?

 sage

Um, I think we can add the command to the shell script
/etc/init.d/ceph, similar to how we deal with the max fd limitation
(ulimit -n 32768). Thus, if we use the command "service ceph start osd.*"
to start OSDs, the setting will automatically be changed to the proper value.
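
A minimal sketch of what such a hook could look like (not the actual /etc/init.d/ceph code; it simply mirrors the existing ulimit handling and the pid_max value discussed above):

# next to the existing "ulimit -n 32768" call in the start-up path
if [ "$(cat /proc/sys/kernel/pid_max)" -lt 4194303 ]; then
    sysctl -q -w kernel.pid_max=4194303
fi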

 - Karan -

   On 09 Mar 2015, at 14:48, Christian Eichelmann
   christian.eichelm...@1und1.de wrote:

 Hi Karan,

 as you are actually writing in your own book, the problem is the
 sysctl
 setting kernel.pid_max. I've seen in your bug report that you were
 setting it to 65536, which is still to low for high density hardware.

 In our cluster, one OSD server has in an idle situation about 66.000
 Threads (60 OSDs per Server). The number of threads increases when you
 increase the number of placement groups in the cluster, which I think
 has triggered your problem.

 Set the kernel.pid_max setting to 4194303 (the maximum) like Azad
 Aliyar suggested, and the problem should be gone.

 Regards,
 Christian

  On 09.03.2015 11:41, Karan Singh wrote:
   Hello Community need help to fix a long going Ceph
   problem.

   Cluster is unhealthy , Multiple OSDs are DOWN. When i am
   trying to
   restart OSD?s i am getting this error


   /2015-03-09 12:22:16.312774 7f760dac9700 -1
   common/Thread.cc
   http://Thread.cc: In function 'void
   Thread::create(size_t)' thread
   7f760dac9700 time 2015-03-09 12:22:16.311970/
   /common/Thread.cc http://Thread.cc: 129: FAILED
   assert(ret == 0)/


   *Environment *:  4 Nodes , OSD+Monitor , Firefly latest ,
   CentOS6.5
   , 3.17.2-1.el6.elrepo.x86_64

   Tried upgrading from 0.80.7 to 0.80.8  but no Luck

   Tried centOS stock kernel 2.6.32  but no Luck

   Memory is not a problem more then 150+GB is free


   Did any one every faced this problem ??

   *Cluster status *
   *
   *
   / cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33/
   / health HEALTH_WARN 7334 pgs degraded; 1185 pgs down;
   1 pgs
   incomplete; 1735 pgs peering; 8938 pgs stale; 1/
   /736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs
   stuck unclean;
   recovery 6061/31080 objects degraded (19/
   /.501%); 111/196 in osds are down; clock skew detected on
   mon.pouta-s02,
   mon.pouta-s03/
   / monmap e3: 3 mons at
 {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX
   .50.3:6789/
   //0}, election epoch 1312, quorum 0,1,2
   pouta-s01,pouta-s02,pouta-s03/
   /   * osdmap e26633: 239 osds: 85 up, 196 in*/
   /  pgmap v60389: 17408 pgs, 13 pools, 42345 MB data,
   10360 objects/
   /4699 GB used, 707 TB / 711 TB avail/
   /6061/31080 objects degraded (19.501%)/
   /  14 down+remapped+peering/
   /  39 active/
   /3289 active+clean/
   / 547 peering/
   / 663 stale+down+peering/
   / 705 stale+active+remapped/
   /   1 active+degraded+remapped/
   /   1 stale+down+incomplete/
   / 484 down+peering/
   / 455 active+remapped/
   /3696 stale+active+degraded/
   /   4 remapped+peering/
   /  23 stale+down+remapped+peering/
   /  51 stale+active/
   /3637 active+degraded/
   /3799 stale+active+clean/

   *OSD :  Logs *

   /2015-03-09 12:22:16.312774 7f760dac9700 -1
   common/Thread.cc
   http://Thread.cc: In function 'void
   Thread::create(size_t)' thread
   7f760dac9700 time 2015-03-09 12:22:16.311970/
   /common/Thread.cc http://Thread.cc: 129: FAILED
   assert(ret == 0)/
   /
   /
   / ceph version 0.80.8
   (69eaad7f8308f21573c604f121956e64679a52a7)/
   / 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]/
   / 2: (SimpleMessenger::add_accept_pipe(int)+0x6a)
   [0xae84fa]/
   / 3: (Accepter::entry()+0x265) [0xb5c635]/
   / 4: /lib64/libpthread.so.0() [0x3c8a6079d1]/
   / 5: (clone()+0x6d) [0x3c8a2e89dd]/
   / NOTE: a copy of the executable, or `objdump -rdS
   executable` is
   needed to interpret this./


   *More information at Ceph Tracker Issue :
   *http://tracker.ceph.com/issues/10988#change-49018