Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Thanks Sage. I will create a "new feature" request on tracker.ceph.com so that
this discussion does not get buried in the mailing list; the developers can
then implement it at their convenience.

Karan Singh
Systems Specialist, Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758 / tel. +358 9 4572001 / fax +358 9 4572302
http://www.csc.fi/

On 10 Mar 2015, at 14:26, Sage Weil s...@newdream.net wrote:

> On Tue, 10 Mar 2015, Christian Eichelmann wrote:
>> As a system administrator I don't like the idea that daemons or even init
>> scripts change system-wide configuration parameters, so I wouldn't like
>> to see the OSDs do it themselves.
>
> This is my general feeling as well. As we move to systemd, I'd like to
> have the ceph unit file get away from this entirely and have the admin set
> these values in /etc/security/limits.conf or /etc/sysctl.d. The main thing
> making this problematic right now is that the daemons run as root instead
> of a 'ceph' user.
>
> [...]
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
On Tue, 10 Mar 2015, Christian Eichelmann wrote:
> Hi Sage,
>
> we hit this problem a few months ago as well, and it took us quite a while
> to figure out what was wrong. As a system administrator I don't like the
> idea that daemons or even init scripts change system-wide configuration
> parameters, so I wouldn't like to see the OSDs do it themselves.

This is my general feeling as well. As we move to systemd, I'd like to have
the ceph unit file get away from this entirely and have the admin set these
values in /etc/security/limits.conf or /etc/sysctl.d. The main thing making
this problematic right now is that the daemons run as root instead of a
'ceph' user.

> The idea of the warning is on the one hand a good hint; on the other hand
> it may confuse people, since changing this setting is not required for
> common hardware.

If we make it warn only if it reaches 50% of the threshold, that is probably
safe...

sage

> [...]
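For reference, a minimal sketch of the admin-managed configuration Sage
describes: a sysctl.d drop-in plus limits.conf entries. The file name and the
'ceph' user are assumptions (at the time of this thread the daemons still run
as root); the values are the ones discussed in this thread.

    # /etc/sysctl.d/90-ceph.conf -- kernel-wide PID/thread ceiling
    kernel.pid_max = 4194303

    # /etc/security/limits.conf -- entries for a hypothetical 'ceph' user
    ceph  soft  nproc   unlimited
    ceph  hard  nproc   unlimited
    ceph  soft  nofile  65536
    ceph  hard  nofile  65536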
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Hi Sage,

we hit this problem a few months ago as well, and it took us quite a while to
figure out what was wrong. As a system administrator I don't like the idea
that daemons or even init scripts change system-wide configuration
parameters, so I wouldn't like to see the OSDs do it themselves.

I've noticed that building Ceph on high-density hardware is a totally
different thing, with totally different problems and solutions, than on
common hardware. I would like to see a dedicated section in the documentation
about problems with that kind of hardware and with Ceph clusters at a larger
scale. So I vote for the documentation. Sysctls are something I want to set
myself.

The idea of the warning is on the one hand a good hint; on the other hand it
may confuse people, since changing this setting is not required for common
hardware.

Regards,
Christian

On 03/09/2015 08:01 PM, Sage Weil wrote:
> On Mon, 9 Mar 2015, Karan Singh wrote:
>> Thanks Guys, kernel.pid_max=4194303 did the trick.
>
> Great to hear! Sorry we missed that you only had it at 65536.
>
> This is a really common problem that people hit when their clusters start
> to grow. Is there somewhere in the docs we can put this to catch more
> users? Or maybe a warning issued by the osds themselves or something if
> they see limits that are low?
>
> sage
>
> [...]
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Hi Karan,

We faced the same issue and resolved it after increasing the open file limit
and the maximum number of threads.

Config reference:

    /etc/security/limits.conf
    root hard nofile 65535

    sysctl -w kernel.pid_max=4194303

http://tracker.ceph.com/issues/10554#change-47024

Cheers
Mohamed Pakkeer

On Mon, Mar 9, 2015 at 4:20 PM, Azad Aliyar azad.ali...@sparksupport.com wrote:

> Check Max Threadcount: If you have a node with a lot of OSDs, you may be
> hitting the default maximum number of threads (usually 32k), especially
> during recovery. You can use sysctl to see whether raising the maximum
> number of threads to the highest allowed value (4194303) helps. For
> example:
>
>     sysctl -w kernel.pid_max=4194303
>
> [...]
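To confirm that both limits actually took effect, something along these lines
works on each node (standard procfs/ulimit checks; the pgrep pattern assumes
the daemon is named ceph-osd):

    sysctl kernel.pid_max      # kernel-wide PID/thread ceiling
    ulimit -n                  # open-file limit of the current shell
    # limits applied to one running OSD process
    grep -E 'Max (open files|processes)' /proc/$(pgrep -o ceph-osd)/limits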
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Hi Karan,

as you actually write in your own book, the problem is the sysctl setting
kernel.pid_max. I've seen in your bug report that you were setting it to
65536, which is still too low for high-density hardware. In our cluster, one
OSD server has about 66,000 threads when idle (60 OSDs per server). The
number of threads increases when you increase the number of placement groups
in the cluster, which I think is what triggered your problem.

Set kernel.pid_max to 4194303 (the maximum), as Azad Aliyar suggested, and
the problem should be gone.

Regards,
Christian

On 09.03.2015 11:41, Karan Singh wrote:
> Hello Community, I need help fixing a long-running Ceph problem. The
> cluster is unhealthy and multiple OSDs are DOWN. When I try to restart the
> OSDs I get this error:
>
> 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function
> 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09
> 12:22:16.311970
> common/Thread.cc: 129: FAILED assert(ret == 0)
>
> [full report and cluster status trimmed; see the original post below]

--
Christian Eichelmann
Systemadministrator
1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
Brauerstraße 48 · DE-76135 Karlsruhe
Telefon: +49 721 91374-8026
christian.eichelm...@1und1.de
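Christian's figure of roughly 66,000 idle threads is easy to check on your
own hardware; a sketch (assumes the daemons are named ceph-osd):

    ps -eL | tail -n +2 | wc -l        # total threads alive on the node
    for pid in $(pgrep ceph-osd); do   # threads per OSD daemon, busiest first
        printf '%6s threads  pid %s\n' "$(ps -o nlwp= -p "$pid")" "$pid"
    done | sort -rn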
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Thanks Guys, kernel.pid_max=4194303 did the trick.

- Karan -

On 09 Mar 2015, at 14:48, Christian Eichelmann
christian.eichelm...@1und1.de wrote:

> Hi Karan,
>
> as you actually write in your own book, the problem is the sysctl setting
> kernel.pid_max. I've seen in your bug report that you were setting it to
> 65536, which is still too low for high-density hardware.
>
> [...]
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Umm, too many threads are created in SimpleMessenger: every pipe creates two
worker threads, one for sending and one for receiving messages.
AsyncMessenger would therefore be promising, but it is still in development.

Regards
Ning Yao

2015-03-09 20:48 GMT+08:00 Christian Eichelmann
christian.eichelm...@1und1.de:

> Hi Karan,
>
> as you actually write in your own book, the problem is the sysctl setting
> kernel.pid_max. In our cluster, one OSD server has about 66,000 threads
> when idle (60 OSDs per server).
>
> [...]
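A rough way to see the two-threads-per-pipe behaviour on a live node is to
compare one OSD's thread count with its established TCP connections. This is
only a sketch: the 2:1 relationship is approximate, since each OSD also runs
work queues and other internal threads, and ss needs root to resolve process
names.

    pid=$(pgrep -o ceph-osd)
    threads=$(ps -o nlwp= -p "$pid")
    conns=$(ss -tnp 2>/dev/null | grep -c "pid=$pid,")
    echo "osd pid $pid: $threads threads, $conns TCP connections"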
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
I know I'm not even close to this type of problem yet with my small clusters
(both test and production), but it would be great if something like that
could appear as a cluster HEALTH_WARN: if Ceph could determine the number of
processes in use and compare it against the current limit, it could throw a
health warning when usage gets within, say, 10 or 15% of the maximum. That
would be a really quick indicator for anyone who frequently checks the health
status (for example through a web portal), as they may see it sooner than
during their regular log-check interval. Just a thought.

-Tony

On Mon, Mar 9, 2015 at 2:01 PM, Sage Weil s...@newdream.net wrote:

> On Mon, 9 Mar 2015, Karan Singh wrote:
>> Thanks Guys, kernel.pid_max=4194303 did the trick.
>
> Great to hear! Sorry we missed that you only had it at 65536.
>
> This is a really common problem that people hit when their clusters start
> to grow. Is there somewhere in the docs we can put this to catch more
> users? Or maybe a warning issued by the osds themselves or something if
> they see limits that are low?
>
> sage
>
> [...]
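Until a built-in health warning exists, Tony's threshold idea is easy to
script as an external monitoring probe on each OSD node; a sketch, with the
threshold and messages purely illustrative:

    #!/bin/sh
    # Warn when system-wide thread usage approaches kernel.pid_max.
    max=$(cat /proc/sys/kernel/pid_max)
    used=$(ps -eL | tail -n +2 | wc -l)
    pct=$((used * 100 / max))
    if [ "$pct" -ge 85 ]; then
        echo "WARN: $used of $max pids/threads in use (${pct}%)"
        exit 1
    fi
    echo "OK: $used of $max pids/threads in use (${pct}%)"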
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Hi Tony,

sounds like a good idea!

Udo

On 09.03.2015 21:55, Tony Harris wrote:
> I know I'm not even close to this type of problem yet with my small
> clusters, but it would be great if something like that could appear as a
> cluster HEALTH_WARN if Ceph could compare the number of processes in use
> against the current limit.
>
> [...]
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
On Mon, 9 Mar 2015, Karan Singh wrote:
> Thanks Guys, kernel.pid_max=4194303 did the trick.

Great to hear! Sorry we missed that you only had it at 65536.

This is a really common problem that people hit when their clusters start to
grow. Is there somewhere in the docs we can put this to catch more users? Or
maybe a warning issued by the osds themselves or something if they see limits
that are low?

sage

> - Karan -
>
> On 09 Mar 2015, at 14:48, Christian Eichelmann
> christian.eichelm...@1und1.de wrote:
>> Hi Karan,
>>
>> as you actually write in your own book, the problem is the sysctl setting
>> kernel.pid_max.
>>
>> [...]
[ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Hello Community, I need help fixing a long-running Ceph problem. The cluster
is unhealthy and multiple OSDs are DOWN. When I try to restart the OSDs I get
this error:

2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function
'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09
12:22:16.311970
common/Thread.cc: 129: FAILED assert(ret == 0)

Environment: 4 nodes, OSD+Monitor, Firefly latest, CentOS 6.5,
3.17.2-1.el6.elrepo.x86_64

- Tried upgrading from 0.80.7 to 0.80.8, but no luck
- Tried the CentOS stock kernel 2.6.32, but no luck
- Memory is not a problem; more than 150 GB is free

Has anyone ever faced this problem?

Cluster status

     cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
      health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs incomplete;
             1735 pgs peering; 8938 pgs stale; 1736 pgs stuck inactive;
             8938 pgs stuck stale; 10320 pgs stuck unclean; recovery
             6061/31080 objects degraded (19.501%); 111/196 in osds are down;
             clock skew detected on mon.pouta-s02, mon.pouta-s03
      monmap e3: 3 mons at {pouta-s01=10.XXX.50.1:6789/0,
             pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/0},
             election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
      osdmap e26633: 239 osds: 85 up, 196 in
       pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
             4699 GB used, 707 TB / 711 TB avail
             6061/31080 objects degraded (19.501%)
               14 down+remapped+peering
               39 active
             3289 active+clean
              547 peering
              663 stale+down+peering
              705 stale+active+remapped
                1 active+degraded+remapped
                1 stale+down+incomplete
              484 down+peering
              455 active+remapped
             3696 stale+active+degraded
                4 remapped+peering
               23 stale+down+remapped+peering
               51 stale+active
             3637 active+degraded
             3799 stale+active+clean

OSD : Logs

2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function
'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09
12:22:16.311970
common/Thread.cc: 129: FAILED assert(ret == 0)

 ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
 3: (Accepter::entry()+0x265) [0xb5c635]
 4: /lib64/libpthread.so.0() [0x3c8a6079d1]
 5: (clone()+0x6d) [0x3c8a2e89dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
 to interpret this.

More information at Ceph Tracker Issue:
http://tracker.ceph.com/issues/10988#change-49018

Karan Singh
Systems Specialist, Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758 / tel. +358 9 4572001 / fax +358 9 4572302
http://www.csc.fi/
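The failed assert means pthread_create() returned an error (typically EAGAIN
once the kernel's task limit is reached), so a useful first check on an
affected node is how close the thread count is to the configured ceilings; a
sketch:

    cat /proc/sys/kernel/pid_max   # kernel-wide ceiling (65536 in this case)
    ps -eL | tail -n +2 | wc -l    # threads currently alive, node-wide
    ulimit -u                      # per-user process/thread limit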
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Check Max Threadcount: If you have a node with a lot of OSDs, you may be
hitting the default maximum number of threads (usually 32k), especially
during recovery. You can use sysctl to see whether raising the maximum number
of threads to the highest allowed value (4194303) helps. For example:

    sysctl -w kernel.pid_max=4194303

If increasing the maximum thread count resolves the issue, you can make the
change permanent by adding a kernel.pid_max setting to /etc/sysctl.conf. For
example:

    kernel.pid_max = 4194303

On Mon, Mar 9, 2015 at 4:11 PM, Karan Singh karan.si...@csc.fi wrote:

> Hello Community, I need help fixing a long-running Ceph problem. The
> cluster is unhealthy and multiple OSDs are DOWN. When I try to restart the
> OSDs I get this error:
>
> 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function
> 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09
> 12:22:16.311970
> common/Thread.cc: 129: FAILED assert(ret == 0)
>
> [full report and cluster status trimmed; see the original post above]

--
Warm Regards,
Azad Aliyar
Linux Server Engineer
Email: azad.ali...@sparksupport.com | Skype: spark.azad
http://www.sparksupport.com
3rd Floor, Leela Infopark, Phase-2, Kakanad, Kochi-30, Kerala, India
Phone: +91 484 6561696, Mobile: 91-8129270421
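Putting Azad's two steps together as they would be run on each OSD node
(standard sysctl usage, nothing Ceph-specific):

    sysctl -w kernel.pid_max=4194303                      # apply now (lost on reboot)
    echo 'kernel.pid_max = 4194303' >> /etc/sysctl.conf   # persist
    sysctl -p                                             # reload /etc/sysctl.conf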
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Great, Karan.

On Mon, Mar 9, 2015 at 9:32 PM, Karan Singh karan.si...@csc.fi wrote:

> Thanks Guys, kernel.pid_max=4194303 did the trick.
>
> - Karan -
>
> [...]

--
Warm Regards,
Azad Aliyar
Linux Server Engineer
Email: azad.ali...@sparksupport.com | Skype: spark.azad
http://www.sparksupport.com
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
2015-03-10 3:01 GMT+08:00 Sage Weil s...@newdream.net:

> On Mon, 9 Mar 2015, Karan Singh wrote:
>> Thanks Guys, kernel.pid_max=4194303 did the trick.
>
> Great to hear! Sorry we missed that you only had it at 65536.
>
> This is a really common problem that people hit when their clusters start
> to grow. Is there somewhere in the docs we can put this to catch more
> users? Or maybe a warning issued by the osds themselves or something if
> they see limits that are low?
>
> sage

Um, I think we could add the command to the shell script /etc/init.d/ceph,
similar to how we deal with the max-fd limitation (ulimit -n 32768). Then, if
we start the OSDs with "service ceph start osd.*", the value would be raised
to the proper level automatically.

> [...]
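A minimal sketch of what Ning Yao proposes, modeled on the existing ulimit
handling; this is a hypothetical fragment, not an actual patch to
/etc/init.d/ceph, and note the objection earlier in the thread to init
scripts silently changing system-wide settings:

    # hypothetical addition near the existing 'ulimit -n 32768' handling
    if [ "$(cat /proc/sys/kernel/pid_max)" -lt 4194303 ]; then
        sysctl -q -w kernel.pid_max=4194303
    fi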