Re: [ceph-users] New eu.ceph.com mirror machine
Hi Wido,

If I disable the EPEL repo then the error changes:

[root@ninja ~]# yum install --disablerepo=epel ceph
Loaded plugins: langpacks, priorities, product-id, subscription-manager
10 packages excluded due to repository priority protections
Resolving Dependencies
...
--> Finished Dependency Resolution
Error: Package: gperftools-libs-2.1-1.el7.x86_64 (ceph)
       Requires: libunwind.so.8()(64bit)

So this is related to the EPEL repo breaking Ceph again. I have check_obsoletes=1 set, as recommended on this list a couple of weeks ago. Is there any chance you could copy the libunwind repo to eu.ceph.com?

Paul Hewlett
Senior Systems Engineer
Velocix, Cambridge
Alcatel-Lucent
t: +44 1223 435893 m: +44 7985327353

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] New eu.ceph.com mirror machine
Hi Wido,

Has something broken with this move? The following has worked for me repeatedly over the last two months. This morning I tried to install Ceph using the following repo file:

[root@citrus ~]# cat /etc/yum.repos.d/ceph.repo
[ceph]
name=Ceph packages for $basearch
baseurl=http://ceph.com/rpm-giant/rhel7/$basearch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

[ceph-noarch]
name=Ceph noarch packages
baseurl=http://ceph.com/rpm-giant/rhel7/noarch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

[ceph-source]
name=Ceph source packages
baseurl=http://ceph.com/rpm-giant/rhel7/SRPMS
enabled=0
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

and ceph now fails to install:

msg: Error: Package: 1:ceph-0.87.1-0.el7.x86_64 (ceph)
         Requires: python-ceph = 1:0.87.1-0.el7
         Available: 1:python-ceph-0.86-0.el7.x86_64 (ceph)
             python-ceph = 1:0.86-0.el7
         Available: 1:python-ceph-0.87-0.el7.x86_64 (ceph)
             python-ceph = 1:0.87-0.el7
         Available: 1:python-ceph-0.87.1-0.el7.x86_64 (ceph)
             python-ceph = 1:0.87.1-0.el7
     Error: Package: 1:ceph-common-0.87.1-0.el7.x86_64 (ceph)
         Requires: python-ceph = 1:0.87.1-0.el7
         Available: 1:python-ceph-0.86-0.el7.x86_64 (ceph)
             python-ceph = 1:0.86-0.el7
         Available: 1:python-ceph-0.87-0.el7.x86_64 (ceph)
             python-ceph = 1:0.87-0.el7
         Available: 1:python-ceph-0.87.1-0.el7.x86_64 (ceph)
             python-ceph = 1:0.87.1-0.el7

Regards
Paul Hewlett
Senior Systems Engineer
Velocix, Cambridge
Alcatel-Lucent
t: +44 1223 435893 m: +44 7985327353
Re: [ceph-users] New eu.ceph.com mirror machine
On 03/09/2015 12:54 PM, HEWLETT, Paul (Paul)** CTR ** wrote:
> Hi Wido
> Has something broken with this move? The following has worked for me
> repeatedly over the last 2 months:

It shouldn't have broken anything, but you never know. The machine rsyncs the data from ceph.com directly.

> This a.m. I tried to install ceph using the following repo file:
> [...]
> and ceph now fails to install:
> [...]

The directories you are pointing at do exist and contain data. Anybody else noticing something?

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on
[ceph-users] Stuck PGs blocked_by non-existent OSDs
Hi,

I'm trying to fix an issue within 0.93 on our internal cloud related to incomplete PGs (yes, I realise the folly of running the dev release; it's a not-so-test env now, so I really need to recover this). Current outage details: 72 initial (now 65) OSDs across 6 nodes.

* Update to 0.92 from Giant
* Fine for a day
* MDS outage overnight and subsequent node failure
* Massive increase in RAM utilisation (10 GB per OSD!)
* More failures
* OSDs marked 'out' to try to alleviate the new, larger cluster requirements; a couple died under the additional load
* Superfluous and faulty OSDs removed, auth keys deleted
* RAM added to nodes (96 GB each, serving 10-12 OSDs)
* Upgrade to 0.93
* Fixed broken journals caused by the 0.92 update
* No more missing objects or degradation

So, that brings me to today: I still have 73/2264 PGs listed as stuck incomplete/inactive, and I also have blocked requests. Querying those placement groups, I notice they are 'blocked_by' non-existent OSDs (ones I removed due to issues). I have no way to tell the cluster those OSDs are lost, as they have already been removed from both the osdmap and the crushmap. Exporting the crushmap shows the non-existent OSDs as deviceN (i.e. device36 for the removed osd.36); deleting those entries and re-importing the crushmap has no effect.

Some further pg detail: https://gist.github.com/joelio/cecca9b48aca6d44451b

So I'm stuck: I can't recover the PGs because I can't remove a non-existent OSD that the PG thinks is blocking it. Help graciously accepted!

Joel

--
$ echo kpfmAdpoofdufevq/dp/vl | perl -pe 's/(.)/chr(ord($1)-1)/ge'
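One workaround that has been suggested for this symptom (a sketch, not verified against 0.93; osd.36 stands in for whichever id shows up in the PG's "blocked_by" list, and it assumes that id is currently free) is to re-create the missing OSD id just long enough to mark it lost, so the PG can stop waiting on it:

```shell
# Re-create the ghost OSD id so "ceph osd lost" has something to act on,
# then remove it again afterwards.
ceph osd create                          # should re-allocate the lowest free id (36 here)
ceph osd lost 36 --yes-i-really-mean-it  # tell the cluster the data on it is gone
ceph osd crush remove osd.36             # clean the entry back out
ceph auth del osd.36
ceph osd rm 36
```

Note that marking an OSD lost tells recovery to proceed without that copy of the data, so it is only safe when the removed OSD really is unrecoverable.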
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Hi Karan,

We faced the same issue and resolved it after increasing the open file limit and the maximum number of threads.

Config reference:

/etc/security/limits.conf:
root hard nofile 65535

sysctl -w kernel.pid_max=4194303

http://tracker.ceph.com/issues/10554#change-47024

Cheers
Mohamed Pakkeer

On Mon, Mar 9, 2015 at 4:20 PM, Azad Aliyar <azad.ali...@sparksupport.com> wrote:
> Check max thread count: if you have a node with a lot of OSDs, you may be
> hitting the default maximum number of threads (usually 32k), especially
> during recovery. You can use sysctl to see if raising the limit to the
> maximum possible number of threads (4194303) helps. For example:
>
> sysctl -w kernel.pid_max=4194303
>
> If increasing the maximum thread count resolves the issue, you can make it
> permanent by adding a kernel.pid_max setting to /etc/sysctl.conf. For example:
>
> kernel.pid_max = 4194303

On Mon, Mar 9, 2015 at 4:11 PM, Karan Singh <karan.si...@csc.fi> wrote:
> Hello Community, need help to fix a long-running Ceph problem. The cluster
> is unhealthy and multiple OSDs are DOWN. When I try to restart OSDs I get
> this error:
>
> 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function
> 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
> common/Thread.cc: 129: FAILED assert(ret == 0)
>
> Environment: 4 nodes, OSD+Monitor, Firefly latest, CentOS 6.5,
> 3.17.2-1.el6.elrepo.x86_64
> Tried upgrading from 0.80.7 to 0.80.8 but no luck.
> Tried the CentOS stock kernel 2.6.32 but no luck.
> Memory is not a problem; more than 150 GB is free.
> Did anyone ever face this problem?
>
> Cluster status:
>
>   cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
>    health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs incomplete;
>           1735 pgs peering; 8938 pgs stale; 1736 pgs stuck inactive;
>           8938 pgs stuck stale; 10320 pgs stuck unclean; recovery
>           6061/31080 objects degraded (19.501%); 111/196 in osds are down;
>           clock skew detected on mon.pouta-s02, mon.pouta-s03
>    monmap e3: 3 mons at {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/0},
>           election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
>    osdmap e26633: 239 osds: 85 up, 196 in
>     pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
>           4699 GB used, 707 TB / 711 TB avail
>           6061/31080 objects degraded (19.501%)
>             14 down+remapped+peering
>             39 active
>           3289 active+clean
>            547 peering
>            663 stale+down+peering
>            705 stale+active+remapped
>              1 active+degraded+remapped
>              1 stale+down+incomplete
>            484 down+peering
>            455 active+remapped
>           3696 stale+active+degraded
>              4 remapped+peering
>             23 stale+down+remapped+peering
>             51 stale+active
>           3637 active+degraded
>           3799 stale+active+clean
>
> OSD logs:
>
> 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function
> 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
> common/Thread.cc: 129: FAILED assert(ret == 0)
>  ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
>  1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
>  2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
>  3: (Accepter::entry()+0x265) [0xb5c635]
>  4: /lib64/libpthread.so.0() [0x3c8a6079d1]
>  5: (clone()+0x6d) [0x3c8a2e89dd]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
>  to interpret this.
>
> More information at Ceph tracker issue:
> http://tracker.ceph.com/issues/10988#change-49018
>
> Karan Singh
> Systems Specialist, Storage Platforms
> CSC - IT Center for Science,
> Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
> mobile: +358 503 812758
> tel. +358 9 4572001
> fax +358 9 4572302
> http://www.csc.fi/

--
Warm Regards,
Azad Aliyar
Linux Server Engineer
Email: azad.ali...@sparksupport.com | Skype: spark.azad
http://www.sparksupport.com
3rd Floor, Leela Infopark, Phase-2, Kakanad, Kochi-30, Kerala, India
Phone: +91 484 6561696, Mobile: +91-8129270421
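As a quick sanity check on a node like this (a sketch assuming a Linux host; it only reads standard procfs files), you can compare the current system-wide thread count against kernel.pid_max to see how much headroom is left:

```shell
# Every thread consumes a pid, so a system-wide thread count approaching
# kernel.pid_max reproduces the "Thread::create ... FAILED assert(ret == 0)"
# failure above. Sum the Threads: field of every process in /proc.
threads=$(grep -h '^Threads:' /proc/[0-9]*/status 2>/dev/null | awk '{s+=$2} END {print s}')
limit=$(cat /proc/sys/kernel/pid_max)
echo "threads=$threads pid_max=$limit"
```

On a 60-OSD server idling at roughly 66,000 threads, a pid_max of 65536 leaves no headroom at all, which is why the 4194303 maximum is the safer choice.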
Re: [ceph-users] New eu.ceph.com mirror machine
On 03/09/2015 02:47 PM, HEWLETT, Paul (Paul)** CTR ** wrote:
> Hi Wido
> If I disable the epel repo then the error changes:
>
> [root@ninja ~]# yum install --disablerepo=epel ceph
> [...]
> Error: Package: gperftools-libs-2.1-1.el7.x86_64 (ceph)
>        Requires: libunwind.so.8()(64bit)
>
> So this is related to the EPEL repo breaking ceph again. I have
> check_obsoletes=1 as recommended on this list a couple weeks ago. Is there
> any chance you could copy the libunwind repo to eu.ceph.com?

Hmm, I'll check the rsync script again. No manual copy should be required; it should fully sync the whole repository.

I'll look into that!

Wido
Re: [ceph-users] New eu.ceph.com mirror machine
When did you make the change? It worked on Friday, albeit with these extra lines in ceph.repo:

[Ceph-el7]
name=Ceph-el7
baseurl=http://eu.ceph.com/rpms/rhel7/noarch/
enabled=1
gpgcheck=0

which I removed when I discovered this no longer existed.

Regards
Paul Hewlett
Senior Systems Engineer
Velocix, Cambridge
Alcatel-Lucent
t: +44 1223 435893 m: +44 7985327353
Re: [ceph-users] New eu.ceph.com mirror machine
On 03/09/2015 02:27 PM, HEWLETT, Paul (Paul)** CTR ** wrote:
> When did you make the change?

Yesterday.

> It worked on Friday albeit with these extra lines in ceph.repo:
>
> [Ceph-el7]
> name=Ceph-el7
> baseurl=http://eu.ceph.com/rpms/rhel7/noarch/
> enabled=1
> gpgcheck=0
>
> which I removed when I discovered this no longer existed.

Ah, I think I know. The rsync script probably didn't clean up those old directories, since they don't exist here either: http://ceph.com/rpms/rhel7/noarch/

That caused some confusion, since this machine is a fresh sync from ceph.com.

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on
[ceph-users] New eu.ceph.com mirror machine
Hi,

Since the recent reports of rsync failing on eu.ceph.com, I moved eu.ceph.com to a new machine. It went from physical to a KVM VM backed by RBD, so it's now running on Ceph.

URLs and rsync paths haven't changed; it's still eu.ceph.com, available over IPv4 and IPv6. This virtual machine is dedicated to running eu.ceph.com, so hopefully rsync won't fail anymore.

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Hi Karan, as you are actually writing in your own book, the problem is the sysctl setting kernel.pid_max. I've seen in your bug report that you were setting it to 65536, which is still to low for high density hardware. In our cluster, one OSD server has in an idle situation about 66.000 Threads (60 OSDs per Server). The number of threads increases when you increase the number of placement groups in the cluster, which I think has triggered your problem. Set the kernel.pid_max setting to 4194303 (the maximum) like Azad Aliyar suggested, and the problem should be gone. Regards, Christian Am 09.03.2015 11:41, schrieb Karan Singh: Hello Community need help to fix a long going Ceph problem. Cluster is unhealthy , Multiple OSDs are DOWN. When i am trying to restart OSD’s i am getting this error /2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc http://Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970/ /common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)/ *Environment *: 4 Nodes , OSD+Monitor , Firefly latest , CentOS6.5 , 3.17.2-1.el6.elrepo.x86_64 Tried upgrading from 0.80.7 to 0.80.8 but no Luck Tried centOS stock kernel 2.6.32 but no Luck Memory is not a problem more then 150+GB is free Did any one every faced this problem ?? 
*Cluster status * * * / cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33/ / health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs incomplete; 1735 pgs peering; 8938 pgs stale; 1/ /736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean; recovery 6061/31080 objects degraded (19/ /.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02, mon.pouta-s03/ / monmap e3: 3 mons at {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/ //0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03/ / * osdmap e26633: 239 osds: 85 up, 196 in*/ / pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects/ /4699 GB used, 707 TB / 711 TB avail/ /6061/31080 objects degraded (19.501%)/ / 14 down+remapped+peering/ / 39 active/ /3289 active+clean/ / 547 peering/ / 663 stale+down+peering/ / 705 stale+active+remapped/ / 1 active+degraded+remapped/ / 1 stale+down+incomplete/ / 484 down+peering/ / 455 active+remapped/ /3696 stale+active+degraded/ / 4 remapped+peering/ / 23 stale+down+remapped+peering/ / 51 stale+active/ /3637 active+degraded/ /3799 stale+active+clean/ *OSD : Logs * /2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc http://Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970/ /common/Thread.cc http://Thread.cc: 129: FAILED assert(ret == 0)/ / / / ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)/ / 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]/ / 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]/ / 3: (Accepter::entry()+0x265) [0xb5c635]/ / 4: /lib64/libpthread.so.0() [0x3c8a6079d1]/ / 5: (clone()+0x6d) [0x3c8a2e89dd]/ / NOTE: a copy of the executable, or `objdump -rdS executable` is needed to interpret this./ *More information at Ceph Tracker Issue : *http://tracker.ceph.com/issues/10988#change-49018 Karan Singh Systems Specialist , Storage Platforms CSC - IT Center for Science, Keilaranta 14, P. O. 
Box 405, FIN-02101 Espoo, Finland mobile: +358 503 812758 tel. +358 9 4572001 fax +358 9 4572302 http://www.csc.fi/ ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Christian Eichelmann Systemadministrator 1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting Brauerstraße 48 · DE-76135 Karlsruhe Telefon: +49 721 91374-8026 christian.eichelm...@1und1.de Amtsgericht Montabaur / HRB 6484 Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen Aufsichtsratsvorsitzender: Michael Scheeren ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
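Christian's numbers above imply a quick sizing rule: roughly 66,000 threads for 60 OSDs works out to about 1,100 threads per OSD per host. A sketch of that arithmetic, together with the fix from the thread (the 2x headroom factor is an illustrative assumption, not a value anyone posted):

```shell
# Rough per-host thread budget, from Christian's observation of
# ~66,000 threads for 60 OSDs (about 1,100 threads per OSD).
OSDS_PER_HOST=60
THREADS_PER_OSD=1100
REQUIRED=$((OSDS_PER_HOST * THREADS_PER_OSD * 2))   # 2x headroom (assumption)
echo "estimated thread budget per host: $REQUIRED"

# The fix from the thread (run as root, and persist it across reboots):
#   sysctl -w kernel.pid_max=4194303
#   echo 'kernel.pid_max = 4194303' >> /etc/sysctl.conf
```

Even the doubled estimate comfortably fits under the 4194303 maximum, which is why simply setting pid_max to the maximum is the usual advice for dense OSD hosts.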
[ceph-users] Disk serial number from OSD
Hi All, I just created this little bash script to retrieve the /dev/disk/by-id string for each OSD on a host. Our disks are internally mounted, so there is no concept of drive bays; this should make it easier to work out which disk has failed.

#!/bin/bash
# List the "ceph data" partitions known to ceph-disk, one per line.
DISKS=$(ceph-disk list | grep "ceph data")
old_IFS=$IFS
IFS=$'\n'
echo "$DISKS"
for DISK in $DISKS; do
    # Field 1 is the device node, field 7 the osd name (e.g. osd.3).
    DEV=$(echo "$DISK" | awk '{print $1}')
    OSD=$(echo "$DISK" | awk '{print $7}')
    DEV=$(echo "$DEV" | sed -e 's/\/dev\///g')
    # Match the persistent by-id name for that device, ignoring wwn-* aliases.
    ID=$(ls -l /dev/disk/by-id | grep "$DEV" | awk '{print $9}' | egrep -v wwn)
    echo "$OSD $ID"
done
IFS=$old_IFS

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] New eu.ceph.com mirror machine
Hi Wido

It seems that your move coincided with yet another change in the EPEL repo. For anyone who is interested, I fixed this by:

1. Ensuring that check_obsoletes=1 is in /etc/yum/pluginconf.d/priorities.conf
2. Installing libunwind explicitly: yum install libunwind
3. Installing ceph with EPEL disabled: yum install --disablerepo=epel ceph

Regards Paul Hewlett Senior Systems Engineer Velocix, Cambridge Alcatel-Lucent t: +44 1223 435893 m: +44 7985327353

From: Wido den Hollander [w...@42on.com] Sent: 09 March 2015 13:43 To: HEWLETT, Paul (Paul)** CTR **; ceph-users Subject: Re: [ceph-users] New eu.ceph.com mirror machine

On 03/09/2015 02:27 PM, HEWLETT, Paul (Paul)** CTR ** wrote: When did you make the change? Yesterday. It worked on Friday, albeit with these extra lines in ceph.repo:

[Ceph-el7]
name=Ceph-el7
baseurl=http://eu.ceph.com/rpms/rhel7/noarch/
enabled=1
gpgcheck=0

which I removed when I discovered this no longer existed. Ah, I think I know. The rsync script probably didn't clean up those old directories, since they don't exist here either: http://ceph.com/rpms/rhel7/noarch/ That caused some confusion since this machine is a fresh sync from ceph.com Regards Paul Hewlett Senior Systems Engineer Velocix, Cambridge Alcatel-Lucent t: +44 1223 435893 m: +44 7985327353

From: Wido den Hollander [w...@42on.com] Sent: 09 March 2015 12:15 To: HEWLETT, Paul (Paul)** CTR **; ceph-users Subject: Re: [ceph-users] New eu.ceph.com mirror machine

On 03/09/2015 12:54 PM, HEWLETT, Paul (Paul)** CTR ** wrote: Hi Wido Has something broken with this move? The following has worked for me repeatedly over the last 2 months: It shouldn't have broken anything, but you never know. The machine rsyncs the data from ceph.com directly. The directories you are pointing at do exist and contain data. Anybody else noticing something? This a.m.
I tried to install ceph using the following repo file:

[root@citrus ~]# cat /etc/yum.repos.d/ceph.repo
[ceph]
name=Ceph packages for $basearch
baseurl=http://ceph.com/rpm-giant/rhel7/$basearch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

[ceph-noarch]
name=Ceph noarch packages
baseurl=http://ceph.com/rpm-giant/rhel7/noarch
enabled=1
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

[ceph-source]
name=Ceph source packages
baseurl=http://ceph.com/rpm-giant/rhel7/SRPMS
enabled=0
priority=2
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc

and ceph now fails to install:

msg: Error: Package: 1:ceph-0.87.1-0.el7.x86_64 (ceph)
  Requires: python-ceph = 1:0.87.1-0.el7
  Available: 1:python-ceph-0.86-0.el7.x86_64 (ceph)
      python-ceph = 1:0.86-0.el7
  Available: 1:python-ceph-0.87-0.el7.x86_64 (ceph)
      python-ceph = 1:0.87-0.el7
  Available: 1:python-ceph-0.87.1-0.el7.x86_64 (ceph)
      python-ceph = 1:0.87.1-0.el7
Error: Package: 1:ceph-common-0.87.1-0.el7.x86_64 (ceph)
  Requires: python-ceph = 1:0.87.1-0.el7
  Available: 1:python-ceph-0.86-0.el7.x86_64 (ceph)
      python-ceph = 1:0.86-0.el7
  Available: 1:python-ceph-0.87-0.el7.x86_64 (ceph)
      python-ceph = 1:0.87-0.el7
  Available: 1:python-ceph-0.87.1-0.el7.x86_64 (ceph)
      python-ceph = 1:0.87.1-0.el7

Regards Paul Hewlett Senior Systems Engineer Velocix, Cambridge Alcatel-Lucent t: +44 1223 435893 m: +44 7985327353

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] on behalf of Wido den Hollander [w...@42on.com] Sent: 09 March 2015 11:15 To: ceph-users Subject: [ceph-users] New eu.ceph.com mirror machine

Hi, Since the recent reports of rsync failing on eu.ceph.com I moved eu.ceph.com to a new machine. It went from physical to a KVM VM backed by RBD, so it's now running on Ceph. URLs or rsync paths haven't changed; it's still eu.ceph.com and available over IPv4 and IPv6.
This Virtual Machine is dedicated for running eu.ceph.com, so hopefully rsync won't fail anymore. -- Wido den Hollander 42on B.V. Ceph trainer and consultant Phone: +31 (0)20 700 9902 Skype: contact42on ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
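Paul's three-step workaround above can be scripted. A sketch of step 1, made idempotent so it is safe to re-run (the priorities.conf path is the one from the thread; the CONF variable and its scratch-file default are illustrative so the logic can be shown without touching a live system):

```shell
# Step 1: enable check_obsoletes in the yum priorities plugin, idempotently.
# On a real system set CONF=/etc/yum/pluginconf.d/priorities.conf before
# running; the mktemp default here is only for safe illustration.
CONF=${CONF:-$(mktemp)}
grep -q '^check_obsoletes=1' "$CONF" 2>/dev/null || echo 'check_obsoletes=1' >> "$CONF"
grep -c '^check_obsoletes=1' "$CONF"   # stays at 1 even if run repeatedly

# Steps 2 and 3, exactly as given in the thread:
#   yum install libunwind
#   yum install --disablerepo=epel ceph
```

The grep-before-append guard matters because the priorities.conf file already contains a `[main]` section; blindly appending on every run would accumulate duplicate lines.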
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Thanks guys, kernel.pid_max=4194303 did the trick. - Karan - On 09 Mar 2015, at 14:48, Christian Eichelmann christian.eichelm...@1und1.de wrote: Hi Karan, as you are actually writing your own book on Ceph: the problem is the sysctl setting kernel.pid_max. I've seen in your bug report that you were setting it to 65536, which is still too low for high-density hardware. In our cluster, one OSD server has about 66,000 threads in an idle situation (60 OSDs per server). The number of threads increases when you increase the number of placement groups in the cluster, which I think is what triggered your problem. Set kernel.pid_max to 4194303 (the maximum) as Azad Aliyar suggested, and the problem should be gone. Regards, Christian Am 09.03.2015 11:41, schrieb Karan Singh: Hello Community, I need help to fix a long-standing Ceph problem. The cluster is unhealthy and multiple OSDs are DOWN.
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Stuck PGs blocked_by non-existent OSDs
You'll probably have to recreate OSDs with the same ids (empty ones), let them boot, stop them, and mark them lost. There is a feature in the tracker to improve this behavior: http://tracker.ceph.com/issues/10976 -Sam On Mon, 2015-03-09 at 12:24 +0000, joel.merr...@gmail.com wrote: Hi, I'm trying to fix an issue within 0.93 on our internal cloud related to incomplete PGs (yes, I realise the folly of running the dev release; it's a not-so-test env now, so I really need to recover this). Current outage info: 72 initial (now 65) OSDs across 6 nodes. * Update to 0.92 from Giant * Fine for a day * MDS outage overnight and subsequent node failure * Massive increase in RAM utilisation (10 GB per OSD!) * More failures * OSDs marked 'out' to try to alleviate the new, larger cluster requirements; a couple died under the additional load * Superfluous and faulty OSDs removed, auth keys deleted * RAM added to nodes (96 GB each, serving 10-12 OSDs) * Upgrade to 0.93 * Fixed broken journals due to the 0.92 update * No more missing objects or degradation So, that brings me to today: I still have 73/2264 PGs listed as stuck incomplete/inactive, and requests that are blocked. Upon querying said placement groups, I notice that they are 'blocked_by' non-existent OSDs (ones I removed due to issues). I have no way to tell the cluster the OSD is lost, as it's already been removed from both the osdmap and the crushmap. Exporting the crushmap shows non-existent OSDs as deviceN (i.e. device36 for the removed osd.36). Deleting those and reimporting the crushmap has no effect. Some further pg detail: https://gist.github.com/joelio/cecca9b48aca6d44451b So I'm stuck: I can't recover the PGs because I can't remove a non-existent OSD that the PG thinks is blocking it. Help graciously accepted! Joel ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] S3 RadosGW - Create bucket OP
- Original Message - From: Steffen Winther ceph.u...@siimnet.dk To: ceph-users@lists.ceph.com Sent: Monday, March 9, 2015 12:43:58 AM Subject: Re: [ceph-users] S3 RadosGW - Create bucket OP Steffen W Sørensen stefws@... writes:

Response:
HTTP/1.1 200 OK
Date: Fri, 06 Mar 2015 10:41:14 GMT
Server: Apache/2.2.22 (Fedora)
Connection: close
Transfer-Encoding: chunked
Content-Type: application/xml

This response makes the App say: S3.createBucket, class S3, code UnexpectedContent, message Inconsistency in S3 response. error response is not a valid xml message. Is our S3 GW not responding properly? Why doesn't the radosGW return a Content-Length: 0 header when the body is empty? If you're using apache, then it filters out zero Content-Length. Nothing much radosgw can do about it. http://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html Maybe this is confusing my App into expecting some XML in the body. You can try using the radosgw civetweb frontend and see if it changes anything. Yehuda 2. At every create-bucket OP the GW creates what look like new containers for ACLs in the .rgw pool; is this normal, or how can one avoid multiple such objects cluttering the GW pools? Is there something wrong, since I get multiple ACL objects for this bucket every time my App tries to recreate the same bucket, or is this a feature/bug in radosGW? # rados -p .rgw ls .bucket.meta.mssCl:default.6309817.1 .bucket.meta.mssCl:default.6187712.3 .bucket.meta.mssCl:default.6299841.7 .bucket.meta.mssCl:default.6309817.5 .bucket.meta.mssCl:default.6187712.2 .bucket.meta.mssCl:default.6187712.19 .bucket.meta.mssCl:default.6187712.12 mssCl ... # rados -p .rgw listxattr .bucket.meta.mssCl:default.6187712.12 ceph.objclass.version user.rgw.acl /Steffen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
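Yehuda's suggestion to try the civetweb frontend amounts to a one-line ceph.conf change for Firefly/Giant-era radosgw. A minimal fragment (the client section name is an assumption; port 7480 matches the test shown later in this thread):

```ini
[client.radosgw.gateway]
rgw frontends = civetweb port=7480
```

After restarting radosgw, Apache and mod_fastcgi are no longer in the request path, so any header rewriting they perform (such as dropping a zero Content-Length) goes away, which isolates whether the behavior comes from radosgw itself or the web server in front of it.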
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Umm, too many threads are created in SimpleMessenger: every pipe creates two worker threads, one for sending and one for receiving messages. Thus AsyncMessenger would be promising, but it is still in development. Regards Ning Yao 2015-03-09 20:48 GMT+08:00 Christian Eichelmann christian.eichelm...@1und1.de: Hi Karan, as you are actually writing your own book on Ceph: the problem is the sysctl setting kernel.pid_max. I've seen in your bug report that you were setting it to 65536, which is still too low for high-density hardware. In our cluster, one OSD server has about 66,000 threads in an idle situation (60 OSDs per server). The number of threads increases when you increase the number of placement groups in the cluster, which I think is what triggered your problem. Set kernel.pid_max to 4194303 (the maximum) as Azad Aliyar suggested, and the problem should be gone. Regards, Christian Am 09.03.2015 11:41, schrieb Karan Singh: Hello Community, I need help to fix a long-standing Ceph problem. The cluster is unhealthy and multiple OSDs are DOWN.
___ ceph-users mailing list ceph-users@lists.ceph.com
Re: [ceph-users] tgt and krbd
Hi Mike, I was using bs_aio with the krbd and still saw a small caching effect. I'm not sure if it was on the ESXi or tgt/krbd page cache side, but I was definitely seeing the IOs being coalesced into larger ones on the krbd device in iostat. Either way, it would make me nervous to run it like that in an HA setup. tgt itself does not do any type of caching, but depending on how you have tgt access the underlying block device you might end up using the normal old Linux page cache, like you would if you did:

dd if=/dev/rbd0 of=/dev/null bs=4K count=1
dd if=/dev/rbd0 of=/dev/null bs=4K count=1

This is what Ronnie meant in that thread when he said there might be caching in the underlying device. If you use tgt bs_rdwr.c (--bstype=rdwr) with the default settings and with krbd, then you will end up doing caching, because the krbd's block device will be accessed like in the dd example above (no direct bits set). You can tell tgt bs_rdwr devices to use O_DIRECT or O_SYNC. When you create the LUN, pass in --bsoflags {direct | sync}. Here is an example from the man page:

tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 --bsoflags=sync --backing-store=/data/100m_image.raw

If you use bs_aio.c then we always set O_DIRECT when opening the krbd device, so no page caching is done. I think Linux aio might require this, or at least it did at the time it was written. Also, the cache settings exported to the other OS's initiator with that mode page command might affect performance too. It might change how that OS does writes, like sending cache syncs down or doing some sort of barrier or FUA. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
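The buffered-versus-O_DIRECT distinction that --bsoflags controls can be demonstrated with dd alone. A sketch using a scratch file in place of /dev/rbd0 (an assumption for illustration; the open-flag semantics are the same for any block-backed read, though iflag=direct can fail on filesystems like tmpfs that don't support direct I/O, which the fallback branch handles):

```shell
# Create a small scratch image standing in for the krbd device.
IMG=$(mktemp)
dd if=/dev/zero of="$IMG" bs=4K count=16 2>/dev/null

# Buffered read: goes through the page cache (bs_rdwr default behavior).
dd if="$IMG" of=/dev/null bs=4K count=1 2>/dev/null && echo "buffered read ok"

# O_DIRECT read: bypasses the page cache (bs_aio / --bsoflags=direct behavior).
dd if="$IMG" of=/dev/null bs=4K count=1 iflag=direct 2>/dev/null \
  && echo "direct read ok" || echo "direct read unsupported on this filesystem"

rm -f "$IMG"
```

Running the buffered read twice in a row is the quickest way to see the cache at work: the second pass completes from RAM without touching the device, which is exactly the effect that worries people in HA failover scenarios.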
Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel
Thank you Nick for explaining the problem with 4k writes. The queue depth used in this setup is 256, the maximum supported. Can you clarify why adding more nodes will not increase IOPS? In general, how do we increase the IOPS of a Ceph cluster? Thanks for your help. On Sat, Mar 7, 2015 at 5:57 PM, Nick Fisk n...@fisk.me.uk wrote: You are hitting serial latency limits. For a 4kb sync write to happen it has to:

1. Travel across the network from the client to the primary OSD
2. Be processed by Ceph
3. Get written to the primary OSD
4. The ack travels across the network back to the client

At 4kb these 4 steps take up a very high percentage of the total processing time, compared to the actual write to the SSD. Apart from faster (more GHz) CPUs, which will improve step 2, there's not much that can be done. Future Ceph releases may improve step 2 as well, but I wouldn't imagine it will change dramatically. Replication above level 1 will also see the IOPS drop, as you are introducing yet more Ceph processing and network delays, unless a future Ceph feature can be implemented where the ack is returned to the client once data has hit the 1st OSD. Still, 1000 IOPS is not that bad. You mention it needs to achieve 8000 IOPS to replace your existing SAN; at what queue depth is this required? You are getting way above that at a queue depth of only 16. I doubt most Ethernet-based enterprise SANs would be able to provide 8000 IOPS at a queue depth of 1, as network delays alone would limit you to around that figure. A network delay of 0.1 ms will limit you to 10,000 IOPS, 0.2 ms to 5,000 IOPS, and so on. If you really do need pure SSD performance for a certain client you will need to move the SSD local to it using some sort of caching software running on the client, although this can bring its own challenges.
Nick

-----Original Message----- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of mad Engineer Sent: 07 March 2015 10:55 To: Somnath Roy Cc: ceph-users Subject: Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel

Update:
Hardware: Upgraded the RAID controller to an LSI MegaRAID 9341 (12 Gbps). 3x Samsung 840 EVO, showing 45K IOPS in an fio test with 7 threads and 4k block size in JBOD mode. CPU: 16 cores @ 2.27 GHz. RAM: 24 GB. NIC: 10 Gbit/s with under 1 ms latency; iperf shows 9.18 Gbps between host and client.
Software: Ubuntu 14.04 with stock kernel 3.13. Upgraded from Firefly to Giant [ceph version 0.87.1 (283c2e7cfa2457799f534744d7d549f83ea1335e)]. Changed the file system to btrfs and the I/O scheduler to noop.
Ceph setup: replication set to 1, using 2 SSD OSDs and 1 SSD for the journal. All are Samsung 840 EVO in JBOD mode on a single server.

Configuration:

[global]
fsid = 979f32fc-6f31-43b0-832f-29fcc4c5a648
mon_initial_members = ceph1
mon_host = 10.99.10.118
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 1
osd_pool_default_min_size = 1
osd_pool_default_pg_num = 250
osd_pool_default_pgp_num = 250
debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_journaler = 0/0
debug_objectcatcher = 0/0
debug_client = 0/0
debug_osd = 0/0
debug_optracker = 0/0
debug_objclass = 0/0
debug_filestore = 0/0
debug_journal = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
debug_auth = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_perfcounter = 0/0
debug_asok = 0/0
debug_throttle = 0/0
debug_mon = 0/0
debug_paxos = 0/0
debug_rgw = 0/0

[client]
rbd_cache = true

Client: Ubuntu 14.04 with 16 cores @ 2.53 GHz and 24 GB RAM.

Results:

rados bench -p rbd -b 4096 -t 16 10 write

Maintaining 16 concurrent
writes of 4096 bytes for up to 10 seconds or 0 objects
Object prefix: benchmark_data_ubuntucompute_3931
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat    avg lat
    0       0         0         0         0         0         -          0
    1      16      6370      6354   24.8124   24.8203   0.00221 0.00251512
    2      16     11618     11602   22.6536      20.5  0.001025 0.00275493
    3      16     16889     16873   21.9637   20.5898  0.001288 0.00281797
    4      16     17310     17294    16.884   1.64453  0.054066 0.00365805
    5      16     17695     17679    13.808   1.50391  0.001451     0.0009
    6      16     18127     18111   11.7868    1.6875  0.001463 0.00527521
    7      16     21647     21631   12.0669     13.75  0.001601  0.0051773
    8      16     28056     28040   13.6872   25.0352  0.005268 0.00456353
    9      16     28947     28931    12.553   3.48047   0.06647 0.00494762
   10      16     29346     29330   11.4536   1.55859
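Nick's latency arithmetic above is worth making explicit: at queue depth 1, one round trip must complete before the next write can be issued, so the IOPS ceiling is simply one second divided by the round-trip latency. A sketch of the calculation:

```shell
# Max IOPS at queue depth 1 is bounded by round-trip latency:
# 0.1 ms -> 10,000 IOPS, 0.2 ms -> 5,000 IOPS, and so on.
for LAT_US in 100 200 500; do
  echo "latency ${LAT_US} us -> at most $((1000000 / LAT_US)) IOPS at QD1"
done
```

This is why higher queue depths help so much: with 16 writes in flight, their latencies overlap, which matches the thread's observation of far more than 8000 IOPS at a queue depth of 16 even though QD1 throughput is latency-bound.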
[ceph-users] [ANN] ceph-deploy 1.5.22 released
Hi All, This is a new release of ceph-deploy that changes a couple of behaviors. On RPM-based distros, ceph-deploy will now automatically enable check_obsoletes in the Yum priorities plugin. This resolves an issue many community members hit where package dependency resolution was breaking due to conflicts between upstream packaging (hosted on ceph.com) and downstream (i.e., Fedora or EPEL). The other important change is that when using ceph-deploy to install Ceph packages on a RHEL machine, the --release flag *must* be used if you want to install upstream packages. In other words, if you want to install Giant on a RHEL machine, you would need to use ceph-deploy install --release giant. If the --release flag is not used, ceph-deploy will expect to use downstream package on RHEL. This is documented at [1]. The full changelog can be seen at [2]. Please update! - Travis [1] http://ceph.com/ceph-deploy/docs/install.html#distribution-notes [2] http://ceph.com/ceph-deploy/docs/changelog.html#id1 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] S3 RadosGW - Create bucket OP
- Original Message - From: Steffen Winther ceph.u...@siimnet.dk To: ceph-users@lists.ceph.com Sent: Monday, March 9, 2015 1:25:43 PM Subject: Re: [ceph-users] S3 RadosGW - Create bucket OP Yehuda Sadeh-Weinraub yehuda@... writes: If you're using apache, then it filters out zero Content-Length. Nothing much radosgw can do about it. You can try using the radosgw civetweb frontend, see if it changes anything. Thanks, but it made no difference...

Req:
PUT /mssCl/ HTTP/1.1
Host: rgw.gsp.sprawl.dk:7480
Authorization: AWS auth id
Date: Mon, 09 Mar 2015 20:18:16 GMT
Content-Length: 0

Response:
HTTP/1.1 200 OK
Content-type: application/xml
Content-Length: 0

The App still says: S3.createBucket, class S3, code UnexpectedContent, message Inconsistency in S3 response. error response is not a valid xml message :/ According to the API specified at http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketPUT.html, no response body is expected. I can only assume that the application tries to decode the XML if an XML content type is returned. What kind of application is that? Yehuda, any comments on the second issue below? 2. At every create-bucket OP the GW creates what look like new containers for ACLs in the .rgw pool; is this normal, or how can one avoid multiple such objects cluttering the GW pools? Is there something wrong, since I get multiple ACL objects for this bucket every time my App tries to recreate the same bucket, or is this a feature/bug in radosGW? That's a bug. Yehuda # rados -p .rgw ls .bucket.meta.mssCl:default.6309817.1 .bucket.meta.mssCl:default.6187712.3 .bucket.meta.mssCl:default.6299841.7 .bucket.meta.mssCl:default.6309817.5 .bucket.meta.mssCl:default.6187712.2 .bucket.meta.mssCl:default.6187712.19 .bucket.meta.mssCl:default.6187712.12 mssCl ...
# rados -p .rgw listxattr .bucket.meta.mssCl:default.6187712.12 ceph.objclass.version user.rgw.acl /Steffen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
I know I'm not even close to this type of problem yet with my small clusters (both test and production), but it would be great if something like that could appear as a cluster HEALTH_WARN: if Ceph could determine the number of used processes and compare it against the current limit, it could throw a health warning when it gets within, say, 10 or 15% of the max value. That would be a really quick indicator for anyone who frequently checks the health status (like through a web portal), as they may see it more quickly than during their regular log check interval. Just a thought. -Tony On Mon, Mar 9, 2015 at 2:01 PM, Sage Weil s...@newdream.net wrote: On Mon, 9 Mar 2015, Karan Singh wrote: Thanks guys, kernel.pid_max=4194303 did the trick. Great to hear! Sorry we missed that you only had it at 65536. This is a really common problem that people hit when their clusters start to grow. Is there somewhere in the docs we can put this to catch more users? Or maybe a warning issued by the OSDs themselves if they see limits that are low? sage - Karan - On 09 Mar 2015, at 14:48, Christian Eichelmann christian.eichelm...@1und1.de wrote: Hi Karan, as you are actually writing your own book on Ceph: the problem is the sysctl setting kernel.pid_max. I've seen in your bug report that you were setting it to 65536, which is still too low for high-density hardware. In our cluster, one OSD server has about 66,000 threads in an idle situation (60 OSDs per server). The number of threads increases when you increase the number of placement groups in the cluster, which I think is what triggered your problem. Set kernel.pid_max to 4194303 (the maximum) as Azad Aliyar suggested, and the problem should be gone. Regards, Christian Am 09.03.2015 11:41, schrieb Karan Singh: Hello Community, I need help to fix a long-standing Ceph problem. The cluster is unhealthy and multiple OSDs are DOWN.
Re: [ceph-users] EC Pool and Cache Tier Tuning
Either option #1 or #2, depending on whether your data has hot spots or you need to use EC pools. I'm finding that the cache tier can actually slow stuff down depending on how much data is in the cache tier vs on the slower tier. Writes will be about the same speed for both solutions; reads will be a lot faster using a cache tier if the data resides in it. -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Steffen Winther Sent: 09 March 2015 20:47 To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] EC Pool and Cache Tier Tuning Nick Fisk nick@... writes: My Ceph cluster comprises 4 nodes, each with the following: 10x 3TB WD Red Pro disks (7200rpm) - EC pool k=3 m=3; 2x S3700 100GB SSDs (20k write IOPS) for HDD journals; 1x S3700 400GB SSD (35k write IOPS) for cache tier - 3x replica. If I have the following 4x node config: 2x S3700 200GB SSDs, 4x 4TB HDDs - what config should I aim for to optimize RBD write/read ops: 1x S3700 200GB SSD for 4x journals, 1x S3700 200GB cache tier, 4x 4TB HDD OSD disks; or: 2x S3700 200GB SSDs for 2x journals, 4x 4TB HDD OSD disks; or: 2x S3700 200GB cache tier, 4x 4TB HDD OSD disks? /Steffen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Ceph repo - RSYNC?
Hi David, also regarding the Calamari GUI monitoring interface: is there any way to get an Inktank user account and password, since the repo to install Calamari seems to be only for people inside Inktank? Jesus Chavez SYSTEMS ENGINEER-C.SALES jesch...@cisco.com Phone: +52 55 5267 3146 Mobile: +51 1 5538883255 CCIE - 44433 On Mar 8, 2015, at 10:38 AM, David Moreau Simard dmsim...@iweb.com wrote: Hi, With the help of Inktank we have been providing a Ceph mirror at ceph.mirror.iweb.ca. Quick facts: - Located on the east coast of Canada (Montreal, Quebec) - Syncs every four hours directly off of the official repositories - Available over http (http://ceph.mirror.iweb.ca/) and rsync (rsync://mirror.iweb.ca/ceph) We're working on a brand new, faster and improved infrastructure for all of our mirrors, and it will be backed by Ceph... so the Ceph mirror will soon be stored on a Ceph cluster :) Feel free to use it! -- David Moreau Simard On 2015-03-05, 1:14 PM, Brian Rak b...@gameservers.com wrote: Do any of the Ceph repositories run rsync? We generally mirror the repository locally so we don't encounter any unexpected upgrades. eu.ceph.com used to run this, but it seems to be down now. # rsync rsync://eu.ceph.com rsync: failed to connect to eu.ceph.com: Connection refused (111) rsync error: error in socket IO (code 10) at clientserver.c(124) [receiver=3.0.6] ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] EC Pool and Cache Tier Tuning
Nick Fisk nick@... writes: My Ceph cluster comprises 4 nodes, each with the following: 10x 3TB WD Red Pro disks (7200rpm) - EC pool k=3 m=3; 2x S3700 100GB SSDs (20k write IOPS) for HDD journals; 1x S3700 400GB SSD (35k write IOPS) for cache tier - 3x replica. If I have the following 4x node config: 2x S3700 200GB SSDs, 4x 4TB HDDs - what config should I aim for to optimize RBD write/read ops: 1x S3700 200GB SSD for 4x journals, 1x S3700 200GB cache tier, 4x 4TB HDD OSD disks; or: 2x S3700 200GB SSDs for 2x journals, 4x 4TB HDD OSD disks; or: 2x S3700 200GB cache tier, 4x 4TB HDD OSD disks? /Steffen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Hi Tony, sounds like a good idea! Udo On 09.03.2015 21:55, Tony Harris wrote: I know I'm not even close to this type of problem yet with my small cluster (both test and production clusters) - but it would be great if something like that could appear in the cluster HEALTHWARN: if Ceph could determine the number of threads in use and compare it against the current limit, it could throw a health warning when it gets within say 10 or 15% of the max value. That would be a really quick indicator for anyone who frequently checks the health status (like through a web portal), as they may see it more quickly than during their regular log check interval. Just a thought. -Tony ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel
Can you run the Fio test again but with a queue depth of 32? This will probably show what your cluster is capable of. Adding more nodes with SSDs will probably help scale, but only at higher IO depths. At low queue depths you are probably already at the limit, as per my earlier email. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of mad Engineer Sent: 09 March 2015 17:23 To: Nick Fisk Cc: ceph-users Subject: Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel Thank you Nick for explaining the problem with 4k writes. The queue depth used in this setup is 256, the maximum supported. Can you clarify that adding more nodes will not increase iops? In general, how can we increase the iops of a ceph cluster? Thanks for your help On Sat, Mar 7, 2015 at 5:57 PM, Nick Fisk n...@fisk.me.uk wrote: You are hitting serial latency limits. For a 4kb sync write to happen it has to:- 1. Travel across network from client to Primary OSD 2. Be processed by Ceph 3. Get written to Pri OSD 4. Ack travels across network to client At 4kb these 4 steps take up a very high percentage of the actual processing time as compared to the actual write to the SSD. Apart from faster (more GHz) CPUs, which will improve step 2, there's not much that can be done. Future Ceph releases may improve step 2 as well, but I wouldn't imagine it will change dramatically. A replication level above 1 will also see the IOPS drop, as you are introducing yet more Ceph processing and network delays - unless a future Ceph feature can be implemented where it returns the ack to the client once data has hit the 1st OSD. Still, 1000 IOPS is not that bad. You mention it needs to achieve 8000 IOPS to replace your existing SAN; at what queue depth is this required? You are getting way above that at a queue depth of only 16.
I doubt most Ethernet-based enterprise SANs would be able to provide 8000 IOPS at a queue depth of 1, as network delays alone would be limiting you to around that figure. A network delay of .1ms will limit you to 10,000 IOPS, .2ms = 5,000 IOPS, and so on. If you really do need pure SSD performance for a certain client you will need to move the SSD local to it using some sort of caching software running on the client, although this can bring its own challenges. Nick -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of mad Engineer Sent: 07 March 2015 10:55 To: Somnath Roy Cc: ceph-users Subject: Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel Update: Hardware: Upgraded RAID controller to LSI Megaraid 9341 - 12Gbps. 3x Samsung 840 EVO - showed 45K IOPS in a fio test with 7 threads and 4k block size in JBOD mode. CPU - 16 cores @2.27GHz. RAM - 24GB. NIC - 10Gbit with under 1 ms latency; iperf shows 9.18 Gbps between host and client. Software: Ubuntu 14.04 with stock kernel 3.13 - upgraded from firefly to giant [ceph version 0.87.1 (283c2e7cfa2457799f534744d7d549f83ea1335e)]. Changed file system to btrfs and I/O scheduler to noop. Ceph setup: replication set to 1, using 2 SSD OSDs and 1 SSD for the journal. All are Samsung 840 EVO in JBOD mode on a single server.
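Nick's round-trip arithmetic above can be sketched in a few lines; this is a minimal illustration of the latency ceiling, not Ceph code:

```python
def max_sync_iops(round_trip_ms, queue_depth=1):
    """Upper bound on IOPS when each write must wait one network
    round trip: IOPS = queue_depth / latency_in_seconds."""
    return queue_depth / (round_trip_ms / 1000.0)

# 0.1 ms round trip caps a queue-depth-1 client at about 10,000 IOPS,
# 0.2 ms at about 5,000 IOPS, and so on.
print(round(max_sync_iops(0.1)))                 # about 10000
print(round(max_sync_iops(0.2)))                 # about 5000
# Higher queue depths hide the latency, which is why the fio numbers
# at queue depth 16 or 32 look so much better:
print(round(max_sync_iops(0.1, queue_depth=32))) # about 320000
```

This also shows why adding nodes does not help queue-depth-1 IOPS: the per-operation round trip is unchanged.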
Configuration:

[global]
fsid = 979f32fc-6f31-43b0-832f-29fcc4c5a648
mon_initial_members = ceph1
mon_host = 10.99.10.118
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 1
osd_pool_default_min_size = 1
osd_pool_default_pg_num = 250
osd_pool_default_pgp_num = 250
debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_journaler = 0/0
debug_objectcatcher = 0/0
debug_client = 0/0
debug_osd = 0/0
debug_optracker = 0/0
debug_objclass = 0/0
debug_filestore = 0/0
debug_journal = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
debug_auth = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_perfcounter = 0/0
debug_asok = 0/0
debug_throttle = 0/0
debug_mon = 0/0
debug_paxos = 0/0
debug_rgw = 0/0

[client]
rbd_cache = true

Client: Ubuntu 14.04 with 16 cores @2.53 GHz and 24G RAM

Results: rados bench -p rbd -b 4096 -t 16 10 write

Maintaining 16 concurrent writes of 4096 bytes for up to 10 seconds or 0 objects
Object prefix: benchmark_data_ubuntucompute_3931
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat    avg lat
    0       0         0         0         0         0         -          0
    1      16      6370      6354   24.8124   24.8203   0.00221  0.00251512
    2      16     11618     11602   22.6536      20.5  0.001025  0.00275493
    3      16     16889     16873   21.9637   20.5898  0.001288  0.00281797
    4      16     17310     17294    16.884   1.64453  0.054066  0.00365805
    5      16
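As a sanity check on the bench output above, the MB/s column can be converted back to operations per second at the 4096-byte block size; a small sketch (the conversion assumes rados bench reports MiB/s):

```python
def mbps_to_iops(mb_per_s, block_bytes=4096):
    """Convert a rados bench throughput figure (MiB/s) back to ops/s
    for a given block size: one op moves block_bytes bytes."""
    return mb_per_s * (2 ** 20) / block_bytes

# Second one reported ~24.81 MB/s at 4 KiB objects:
print(round(mbps_to_iops(24.81)))  # roughly 6351 ops/s, matching the ~6354 ops finished
```

That is consistent with the ~1000-6000 IOPS range being discussed in this thread.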
Re: [ceph-users] qemu-kvm and cloned rbd image
On 03/05/2015 07:19 PM, Josh Durgin wrote: client.libvirt key: caps: [mon] allow r caps: [osd] allow class-read object_prefix rbd_children, allow rw class-read pool=rbd This includes everything except class-write on the pool you're using. You'll need that so that a copy_up call (used just for clones) works. That's what was getting a permissions error. You can use rwx for short. Josh thanks! That was the problem indeed. I removed class-write capability because I also use this user as the default for ceph cli commands. Without class-write this user can't erase an existing image from the pool, while at the same time being able to create new ones. I should probably come up with a better scheme if I am to utilize cloned images. Thanks again! -Kostas ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Prioritize Heartbeat packets
I've found commit 9b9a682fe035c985e416ee1c112fa58f9045a27c and I see that when 'osd heartbeat use min delay socket = true' it will mark the packet with DSCP CS6. Based on the setting of the socket in msg/simple/Pipe.cc is it possible that this can apply to both OSD and monitor? I don't understand the code enough to know how the set_socket_options() is called from the OSD and monitor. If this applies to both monitor and OSD, would it be better to rename the option to a more generic name? Thanks, On Sat, Mar 7, 2015 at 4:23 PM, Daniel Swarbrick daniel.swarbr...@gmail.com wrote: Judging by the commit, this ought to do the trick: osd heartbeat use min delay socket = true On 07/03/15 01:20, Robert LeBlanc wrote: I see that Jian Wen has done work on this for 0.94. I tried looking through the code to see if I can figure out how to configure this new option, but it all went over my head pretty quick. Can I get a brief summary on how to set the priority of heartbeat packets or where to look in the code to figure it out? Thanks, On Thu, Aug 28, 2014 at 2:01 AM, Daniel Swarbrick daniel.swarbr...@profitbricks.com mailto:daniel.swarbr...@profitbricks.com wrote: On 28/08/14 02:56, Sage Weil wrote: I seem to remember someone telling me there were hooks/hints you could call that would tag either a socket or possibly data on that socket with a label for use by iptables and such.. but I forget what it was. Something like setsockopt() SO_MARK? *SO_MARK *(since Linux 2.6.25) Set the mark for each packet sent through this socket (similar to the netfilter MARK target but socket-based). Changing the mark can be used for mark-based routing without netfilter or for packet filtering. Setting this option requires the *CAP_NET_ADMIN *capability. Alternatively, directly set IP_TOS options on the socket, or SO_PRIORITY which sets the IP TOS bits as well. 
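For reference, the DSCP marking being discussed boils down to a single setsockopt() call; here is a minimal Python sketch of the same mechanism (this is an illustration, not the actual Pipe.cc code, and the constant names are my own):

```python
import socket

# DSCP CS6 (network control) is 48; the TOS byte carries the DSCP in
# its upper six bits, so the value handed to IP_TOS is 48 << 2 == 0xC0.
DSCP_CS6 = 48
TOS_CS6 = DSCP_CS6 << 2

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_CS6)

# Read the option back to confirm the kernel accepted the marking.
tos_value = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)
print(tos_value)  # 192 (0xC0) on Linux
sock.close()
```

Unlike SO_MARK or high SO_PRIORITY values, setting IP_TOS does not require CAP_NET_ADMIN, which is presumably why the heartbeat code can use it unconditionally.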
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
On Mon, 9 Mar 2015, Karan Singh wrote: Thanks Guys kernel.pid_max=4194303 did the trick. Great to hear! Sorry we missed that you only had it at 65536. This is a really common problem that people hit when their clusters start to grow. Is there somewhere in the docs we can put this to catch more users? Or maybe a warning issued by the osds themselves or something if they see limits that are low? sage - Karan - On 09 Mar 2015, at 14:48, Christian Eichelmann christian.eichelm...@1und1.de wrote: Hi Karan, as you are actually writing in your own book, the problem is the sysctl setting kernel.pid_max. I've seen in your bug report that you were setting it to 65536, which is still too low for high-density hardware. In our cluster, one OSD server has about 66,000 threads in an idle situation (60 OSDs per server). The number of threads increases when you increase the number of placement groups in the cluster, which I think has triggered your problem. Set the kernel.pid_max setting to 4194303 (the maximum) like Azad Aliyar suggested, and the problem should be gone. Regards, Christian Am 09.03.2015 11:41, schrieb Karan Singh: Hello Community, I need help to fix a long-running Ceph problem. The cluster is unhealthy; multiple OSDs are DOWN. When I try to restart OSDs I get this error: 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970 common/Thread.cc: 129: FAILED assert(ret == 0) Environment: 4 nodes, OSD+Monitor, Firefly latest, CentOS 6.5, kernel 3.17.2-1.el6.elrepo.x86_64. Tried upgrading from 0.80.7 to 0.80.8 but no luck. Tried the CentOS stock kernel 2.6.32 but no luck. Memory is not a problem; more than 150 GB is free. Has anyone ever faced this problem?
Cluster status

 cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
 health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs incomplete; 1735 pgs peering; 8938 pgs stale; 1736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean; recovery 6061/31080 objects degraded (19.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02, mon.pouta-s03
 monmap e3: 3 mons at {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
 osdmap e26633: 239 osds: 85 up, 196 in
 pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
 4699 GB used, 707 TB / 711 TB avail
 6061/31080 objects degraded (19.501%)
 14 down+remapped+peering
 39 active
 3289 active+clean
 547 peering
 663 stale+down+peering
 705 stale+active+remapped
 1 active+degraded+remapped
 1 stale+down+incomplete
 484 down+peering
 455 active+remapped
 3696 stale+active+degraded
 4 remapped+peering
 23 stale+down+remapped+peering
 51 stale+active
 3637 active+degraded
 3799 stale+active+clean

OSD logs

2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
common/Thread.cc: 129: FAILED assert(ret == 0)

 ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
 3: (Accepter::entry()+0x265) [0xb5c635]
 4: /lib64/libpthread.so.0() [0x3c8a6079d1]
 5: (clone()+0x6d) [0x3c8a2e89dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

More information at Ceph Tracker issue: http://tracker.ceph.com/issues/10988#change-49018

Karan Singh Systems Specialist, Storage Platforms CSC - IT Center for Science, Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland mobile: +358 503 812758 tel. +358 9 4572001 fax +358 9 4572302
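A check along the lines Sage and Tony suggest could be as simple as comparing the thread count against kernel.pid_max; a hypothetical sketch (the function name and 10% threshold are made up for illustration, this is not actual Ceph code):

```python
def pid_max_warning(threads_in_use, pid_max, margin=0.10):
    """Return a HEALTH_WARN-style string if the number of threads/PIDs
    in use is within `margin` (e.g. 10%) of the kernel.pid_max limit,
    else None."""
    if threads_in_use >= pid_max * (1.0 - margin):
        return ("HEALTH_WARN: %d of %d pids in use; consider raising "
                "kernel.pid_max (max 4194303)" % (threads_in_use, pid_max))
    return None

# An idle 60-OSD server with ~60,000 threads is already within 10% of
# the 65536 limit Karan had, but nowhere near the 4194303 maximum:
print(pid_max_warning(60000, 65536))    # warns
print(pid_max_warning(60000, 4194303))  # None
```

On a live node the inputs could come from /proc/sys/kernel/pid_max and a count of entries in /proc, but the comparison itself is the whole idea.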
Re: [ceph-users] S3 RadosGW - Create bucket OP
Yehuda Sadeh-Weinraub yehuda@... writes: If you're using apache, then it filters out zero Content-Length. Nothing much radosgw can do about it. You can try using the radosgw civetweb frontend, see if it changes anything. Thanks, but no difference... Req: PUT /mssCl/ HTTP/1.1 Host: rgw.gsp.sprawl.dk:7480 Authorization: AWS auth id Date: Mon, 09 Mar 2015 20:18:16 GMT Content-Length: 0 Response: HTTP/1.1 200 OK Content-type: application/xml Content-Length: 0 App still says: S3.createBucket, class S3, code UnexpectedContent, message Inconsistency in S3 response. error response is not a valid xml message :/ Yehuda, any comments on issue 2 below? 2. On every create-bucket op the GW creates what look like new containers for ACLs in the .rgw pool; is this normal, or how can I avoid multiple such objects cluttering the GW pools? Is there something wrong, since I get multiple ACL objects for this bucket every time my app tries to recreate the same bucket, or is this a feature/bug in radosGW? # rados -p .rgw ls .bucket.meta.mssCl:default.6309817.1 .bucket.meta.mssCl:default.6187712.3 .bucket.meta.mssCl:default.6299841.7 .bucket.meta.mssCl:default.6309817.5 .bucket.meta.mssCl:default.6187712.2 .bucket.meta.mssCl:default.6187712.19 .bucket.meta.mssCl:default.6187712.12 mssCl ... # rados -p .rgw listxattr .bucket.meta.mssCl:default.6187712.12 ceph.objclass.version user.rgw.acl /Steffen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Tr : RadosGW - Bucket link and ACLs
Yeah, I was thinking about that, and it will be the alternative for me too... Regards. Italo Santos http://italosantos.com.br/ On Friday, March 6, 2015 at 18:20, ghislain.cheval...@orange.com wrote: Original message From: CHEVALIER Ghislain IMT/OLPS ghislain.cheval...@orange.com Date: 06/03/2015 21:56 (GMT+01:00) To: Italo Santos okd...@gmail.com Cc: Subject: RE: [ceph-users] RadosGW - Bucket link and ACLs Hi, We encountered this behavior when developing the rgw admin module in inkscope, and we fixed it as follows: as you created the user access key and secret key with the admin user, it seems better to create the bucket with these credentials. Best regards Sent from my Galaxy Ace4 Orange Original message From: Italo Santos okd...@gmail.com Date: 06/03/2015 20:52 (GMT+01:00) To: ceph-users@lists.ceph.com Cc: Subject: [ceph-users] RadosGW - Bucket link and ACLs Hello, I'm building an object storage environment and I'm in trouble with some administration ops. To manage the entire environment I decided to create an admin user and use that to manage the client users which I'll create further. Using the admin (called “italux”) I created a new user (called “cliente”) and after that I created a new bucket with the admin user (called “cliente-bucket”). After that, still using the admin, I changed the permissions of the “cliente-bucket” (which is owned by the admin), granting FULL_CONTROL to the “cliente” user.
So, using the admin API I unlinked the “cliente-bucket” from the admin user and linked it to the “cliente” user, changing the ownership of the bucket: In [86]: url = 'http://radosgw.example.com/admin/bucket?format=json&bucket=cliente-bucket' In [87]: r = requests.get(url, auth=S3Auth(access_key, secret_key, server)) In [88]: r.content Out[88]: '{"bucket":"cliente-bucket","pool":".rgw.buckets","index_pool":".rgw.buckets.index","id":"default.4361528.1","marker":"default.4361528.1","owner":"cliente","ver":1,"master_ver":0,"mtime":1425670280,"max_marker":"","usage":{},"bucket_quota":{"enabled":false,"max_size_kb":-1,"max_objects":-1}}' After that, when I try to change the permissions/ACLs of the bucket using the “cliente” user, I get AccessDenied. Looking at the raw debug logs it seems that the owner of the bucket wasn't changed. Anyone know why? RadosGW debug logs: 2015-03-06 16:32:55.943167 7fd32bf57700 1 == starting new request req=0x3cf78a0 = 2015-03-06 16:32:55.943183 7fd32bf57700 2 req 2:0.16::PUT /::initializing 2015-03-06 16:32:55.943189 7fd32bf57700 10 host=cliente-bucket.radosgw.example.com rgw_dns_name=object-storage.locaweb.com.br 2015-03-06 16:32:55.943220 7fd32bf57700 10 s->object=NULL s->bucket=cliente-bucket 2015-03-06 16:32:55.943225 7fd32bf57700 2 req 2:0.57:s3:PUT /::getting op 2015-03-06 16:32:55.943230 7fd32bf57700 2 req 2:0.62:s3:PUT /:put_acls:authorizing 2015-03-06 16:32:55.943269 7fd32bf57700 10 get_canon_resource(): dest=/cliente-bucket/?acl 2015-03-06 16:32:55.943272 7fd32bf57700 10 auth_hdr: PUT Fri, 06 Mar 2015 19:32:55 GMT /cliente-bucket/?acl 2015-03-06 16:32:55.943370 7fd32bf57700 15 calculated digest=xtSrQR+GsHyqjqGLdiPmjoP62x4= 2015-03-06 16:32:55.943375 7fd32bf57700 15 auth_sign=xtSrQR+GsHyqjqGLdiPmjoP62x4= 2015-03-06 16:32:55.943377 7fd32bf57700 15 compare=0 2015-03-06 16:32:55.943384 7fd32bf57700 2 req 2:0.000216:s3:PUT /:put_acls:reading permissions 2015-03-06 16:32:55.943425 7fd32bf57700 15 Read AccessControlPolicy <AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>italux</ID><DisplayName>Italo Santos</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>cliente</ID><DisplayName>Cliente</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy> 2015-03-06 16:32:55.943441 7fd32bf57700 2 req 2:0.000273:s3:PUT /:put_acls:init op 2015-03-06 16:32:55.943447 7fd32bf57700 2 req 2:0.000280:s3:PUT /:put_acls:verifying op mask 2015-03-06 16:32:55.943451 7fd32bf57700 20 required_mask= 2 user.op_mask=7 2015-03-06 16:32:55.943453 7fd32bf57700 2 req 2:0.000286:s3:PUT /:put_acls:verifying op permissions 2015-03-06 16:32:55.943457 7fd32bf57700 5 Searching permissions for uid=cliente mask=56 2015-03-06 16:32:55.943461 7fd32bf57700 5 Found permission: 15 2015-03-06 16:32:55.943462 7fd32bf57700 5 Searching permissions for group=1 mask=56 2015-03-06 16:32:55.943464 7fd32bf57700 5 Permissions for group not found 2015-03-06 16:32:55.943466 7fd32bf57700 5 Searching permissions for group=2 mask=56 2015-03-06 16:32:55.943468 7fd32bf57700 5 Permissions for group not found 2015-03-06 16:32:55.943469 7fd32bf57700
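One way to verify whether the ownership actually changed is to parse the ACL document that radosgw returns and inspect the Owner element; a small stdlib sketch, using a sample document shaped like the one in the log above (not live output):

```python
import xml.etree.ElementTree as ET

S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"

acl_xml = """<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Owner><ID>italux</ID><DisplayName>Italo Santos</DisplayName></Owner>
  <AccessControlList>
    <Grant>
      <Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:type="CanonicalUser">
        <ID>cliente</ID><DisplayName>Cliente</DisplayName>
      </Grantee>
      <Permission>FULL_CONTROL</Permission>
    </Grant>
  </AccessControlList>
</AccessControlPolicy>"""

root = ET.fromstring(acl_xml)
# The Owner/ID is what the put_acls request is authorized against.
owner_id = root.find(S3_NS + "Owner/" + S3_NS + "ID").text
print(owner_id)  # italux
```

Here the Owner is still "italux" even though FULL_CONTROL was granted to "cliente", which matches the AccessDenied symptom: the grant changed, but the ownership did not.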
Re: [ceph-users] RadosGW - Create bucket via admin API
Hello Georgios, I thought there was some admin alternative to do that, but I realised there isn't, since the bucket belongs to a specific user. So the alternative is, after creating the user, to authenticate with the created credentials and create the bucket. Thanks. Regards. Italo Santos http://italosantos.com.br/ On Friday, March 6, 2015 at 07:40, Georgios Dimitrakakis wrote: Hi Italo, Check the S3 Bucket OPS at: http://ceph.com/docs/master/radosgw/s3/bucketops/ or use any of the examples provided in Python (http://ceph.com/docs/master/radosgw/s3/python/) or PHP (http://ceph.com/docs/master/radosgw/s3/php/) or Java (http://ceph.com/docs/master/radosgw/s3/java/) or anything else that is provided through the S3 API (http://ceph.com/docs/master/radosgw/s3/) Regards, George Hello guys, In the adminops documentation I saw how to remove a bucket, but I can't find the URI to create one. I'd like to know if this is possible? Regards. ITALO SANTOS http://italosantos.com.br/ ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Stuck PGs blocked_by non-existent OSDs
On Mon, Mar 9, 2015 at 2:28 PM, Samuel Just sj...@redhat.com wrote: You'll probably have to recreate osds with the same ids (empty ones), let them boot, stop them, and mark them lost. There is a feature in the tracker to improve this behavior: http://tracker.ceph.com/issues/10976 -Sam Thanks Sam, I've re-added the OSDs and they became unblocked, but there are still the same number of pgs stuck. I looked at them in some more detail and it seems they all have num_bytes='0'. Tried a repair too, for good measure. Still nothing, I'm afraid. Does this mean some underlying catastrophe has happened and they are never going to recover? Following on, would that cause data loss? There are no missing objects, and I'm hoping there's appropriate checksumming / replicas to balance that out, but now I'm not so sure. Thanks again, Joel ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Hello Community, I need help to fix a long-running Ceph problem. The cluster is unhealthy; multiple OSDs are DOWN. When I try to restart OSDs I get this error:

2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
common/Thread.cc: 129: FAILED assert(ret == 0)

Environment: 4 nodes, OSD+Monitor, Firefly latest, CentOS 6.5, kernel 3.17.2-1.el6.elrepo.x86_64. Tried upgrading from 0.80.7 to 0.80.8 but no luck. Tried the CentOS stock kernel 2.6.32 but no luck. Memory is not a problem; more than 150 GB is free. Has anyone ever faced this problem?

Cluster status

 cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
 health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs incomplete; 1735 pgs peering; 8938 pgs stale; 1736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean; recovery 6061/31080 objects degraded (19.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02, mon.pouta-s03
 monmap e3: 3 mons at {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
 osdmap e26633: 239 osds: 85 up, 196 in
 pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
 4699 GB used, 707 TB / 711 TB avail
 6061/31080 objects degraded (19.501%)
 14 down+remapped+peering
 39 active
 3289 active+clean
 547 peering
 663 stale+down+peering
 705 stale+active+remapped
 1 active+degraded+remapped
 1 stale+down+incomplete
 484 down+peering
 455 active+remapped
 3696 stale+active+degraded
 4 remapped+peering
 23 stale+down+remapped+peering
 51 stale+active
 3637 active+degraded
 3799 stale+active+clean

OSD logs

2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
common/Thread.cc: 129: FAILED assert(ret == 0)

 ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
 3: (Accepter::entry()+0x265) [0xb5c635]
 4: /lib64/libpthread.so.0() [0x3c8a6079d1]
 5: (clone()+0x6d) [0x3c8a2e89dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

More information at Ceph Tracker issue: http://tracker.ceph.com/issues/10988#change-49018

Karan Singh Systems Specialist, Storage Platforms CSC - IT Center for Science, Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland mobile: +358 503 812758 tel. +358 9 4572001 fax +358 9 4572302 http://www.csc.fi/ ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph mds zombie
Hi, On 09/03/2015 04:06, kenmasida wrote: I have resolved the problem, thank you very much. When I use ceph-fuse to mount the client, it works well. Good news, but can you give the kernel version of your cephfs client OS? Like you, I had a problem with cephfs on the client side, and it probably comes from the 3.16 kernel of my cephfs clients, because (like you) my problem didn't occur with ceph-fuse or with a 3.13 kernel. -- François Lafont ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Check Max Threadcount: If you have a node with a lot of OSDs, you may be hitting the default maximum number of threads (e.g., usually 32k), especially during recovery. You can use sysctl to see if increasing the maximum number of threads to the maximum allowed value (i.e., 4194303) helps. For example: sysctl -w kernel.pid_max=4194303 If increasing the maximum thread count resolves the issue, you can make it permanent by including a kernel.pid_max setting in the /etc/sysctl.conf file. For example: kernel.pid_max = 4194303 On Mon, Mar 9, 2015 at 4:11 PM, Karan Singh karan.si...@csc.fi wrote: Hello Community, I need help to fix a long-running Ceph problem. The cluster is unhealthy; multiple OSDs are DOWN. When I try to restart OSDs I get this error: 2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970 common/Thread.cc: 129: FAILED assert(ret == 0) Environment: 4 nodes, OSD+Monitor, Firefly latest, CentOS 6.5, kernel 3.17.2-1.el6.elrepo.x86_64. Tried upgrading from 0.80.7 to 0.80.8 but no luck. Tried the CentOS stock kernel 2.6.32 but no luck. Memory is not a problem; more than 150 GB is free. Has anyone ever faced this problem?
Cluster status

 cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
 health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs incomplete; 1735 pgs peering; 8938 pgs stale; 1736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean; recovery 6061/31080 objects degraded (19.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02, mon.pouta-s03
 monmap e3: 3 mons at {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
 osdmap e26633: 239 osds: 85 up, 196 in
 pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
 4699 GB used, 707 TB / 711 TB avail
 6061/31080 objects degraded (19.501%)
 14 down+remapped+peering
 39 active
 3289 active+clean
 547 peering
 663 stale+down+peering
 705 stale+active+remapped
 1 active+degraded+remapped
 1 stale+down+incomplete
 484 down+peering
 455 active+remapped
 3696 stale+active+degraded
 4 remapped+peering
 23 stale+down+remapped+peering
 51 stale+active
 3637 active+degraded
 3799 stale+active+clean

OSD logs

2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
common/Thread.cc: 129: FAILED assert(ret == 0)

 ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
 1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
 2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
 3: (Accepter::entry()+0x265) [0xb5c635]
 4: /lib64/libpthread.so.0() [0x3c8a6079d1]
 5: (clone()+0x6d) [0x3c8a2e89dd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

More information at Ceph Tracker issue: http://tracker.ceph.com/issues/10988#change-49018

Karan Singh Systems Specialist, Storage Platforms CSC - IT Center for Science, Keilaranta 14, P. O.
Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758, tel. +358 9 4572001, fax +358 9 4572302
http://www.csc.fi/

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Warm Regards,
Azad Aliyar
Linux Server Engineer, SparkSupport
Email: azad.ali...@sparksupport.com | Skype: spark.azad
http://www.sparksupport.com | http://www.sparkmycloud.com
3rd Floor, Leela Infopark, Phase-2, Kakanad, Kochi-30, Kerala, India
Phone: +91 484 6561696, Mobile: +91 8129270421
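Before raising anything with sysctl, it can help to see how close a node actually is to the limit. This is a minimal sketch (not from the original advice) that compares the number of kernel tasks currently in use on an OSD node with kernel.pid_max, using only /proc:

```shell
# Count kernel tasks (processes + threads) on this node and compare
# against kernel.pid_max. OSDs fail with "Thread::create ... FAILED
# assert(ret == 0)" when this limit is exhausted, e.g. during recovery.
used=$(ls -d /proc/[0-9]*/task/[0-9]* 2>/dev/null | wc -l)
max=$(cat /proc/sys/kernel/pid_max)
echo "threads in use: $used, kernel.pid_max: $max"
```

If "threads in use" approaches pid_max on a dense node, the sysctl change above is the likely fix.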
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
Great Karan.

On Mon, Mar 9, 2015 at 9:32 PM, Karan Singh karan.si...@csc.fi wrote:

Thanks guys, kernel.pid_max=4194303 did the trick.

- Karan -

On 09 Mar 2015, at 14:48, Christian Eichelmann christian.eichelm...@1und1.de wrote:

Hi Karan,

as you are actually writing in your own book, the problem is the sysctl setting kernel.pid_max. I've seen in your bug report that you were setting it to 65536, which is still too low for high-density hardware. In our cluster, one OSD server has about 66,000 threads in an idle state (60 OSDs per server). The number of threads increases when you increase the number of placement groups in the cluster, which I think is what triggered your problem. Set kernel.pid_max to 4194303 (the maximum), as Azad Aliyar suggested, and the problem should be gone.

Regards,
Christian

Am 09.03.2015 11:41, schrieb Karan Singh:
[quoted text trimmed; the full report, cluster status and OSD log appear in the original message above]

--
Christian Eichelmann
Systemadministrator
1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
Brauerstraße 48 · DE-76135 Karlsruhe
Telefon: +49 721 91374-8026
christian.eichelm...@1und1.de
Amtsgericht Montabaur / HRB 6484
Vorstände: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
Aufsichtsratsvorsitzender: Michael Scheeren

--
Warm Regards,
Azad Aliyar
Linux Server Engineer, SparkSupport
Email: azad.ali...@sparksupport.com | Skype: spark.azad
http://www.sparksupport.com
[ceph-users] how to improve seek time using hammer-test release
Hello all,

I just set up a single-node Ceph cluster with no replication to familiarize myself with Ceph, using two Intel S3500 800 GB SSDs, 8 GB RAM and a 16-core CPU. The OS is Ubuntu 14.04 64-bit, and the rbd kernel module is loaded (modprobe rbd).

When running bonnie++ against /dev/rbd0 it shows a seek rate of 892.2/s. How can the seek rate be improved? If I run 5 bonnie++ instances on /mnt, where /dev/rbd0 is mounted as ext4, seeks/s drop to 500/s. I am trying to achieve over 1000 seeks/s for each thread. What can I do to improve performance?

Tried the following (all parameters found in the mailing list archives):
- scheduler set to noop
- filesystem set to btrfs
- debugging set to 0/0 (this showed some noticeable difference)

Would configuring the SSDs in RAID 0, with a single OSD on top of the RAID 0 device, improve this?

Regards,
Kevin
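For what it's worth, bonnie++'s random-seek phase can be approximated with fio, which makes it easier to vary queue depth when chasing seeks/s. This is only a sketch; the device path and all parameters are assumptions to adjust for your setup:

```shell
# Random 4k reads against the mapped RBD device, roughly matching
# bonnie++'s seek test; --readonly avoids writing to the device.
DEV=/dev/rbd0   # assumption: your mapped RBD device
if [ -b "$DEV" ] && command -v fio >/dev/null 2>&1; then
    fio --name=seek --filename="$DEV" --rw=randread --bs=4k \
        --iodepth=32 --runtime=30 --time_based --direct=1 \
        --ioengine=libaio --readonly
else
    echo "skipping: $DEV not present or fio not installed"
fi
```

Raising --iodepth or --numjobs shows whether the bottleneck is per-thread latency or total device IOPS.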
Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread
2015-03-10 3:01 GMT+08:00 Sage Weil s...@newdream.net:

On Mon, 9 Mar 2015, Karan Singh wrote:
Thanks guys, kernel.pid_max=4194303 did the trick.

Great to hear! Sorry we missed that you only had it at 65536. This is a really common problem that people hit when their clusters start to grow. Is there somewhere in the docs we can put this to catch more users? Or maybe a warning issued by the OSDs themselves if they see limits that are low?

sage

Um, I think we can add the command to the shell script /etc/init.d/ceph, something like the way the max fd limit is handled (ulimit -n 32768). Then, if we use "service ceph start osd.*" to start the OSDs, the limit will automatically be raised to the proper value.

- Karan -

On 09 Mar 2015, at 14:48, Christian Eichelmann christian.eichelm...@1und1.de wrote:

Hi Karan, as you are actually writing in your own book, the problem is the sysctl setting kernel.pid_max. I've seen in your bug report that you were setting it to 65536, which is still too low for high-density hardware. In our cluster, one OSD server has about 66,000 threads in an idle state (60 OSDs per server). The number of threads increases when you increase the number of placement groups in the cluster, which I think is what triggered your problem. Set kernel.pid_max to 4194303 (the maximum), as Azad Aliyar suggested, and the problem should be gone.

[quoted text trimmed; the full report, cluster status and OSD log appear earlier in this thread]

More information at the Ceph tracker issue: http://tracker.ceph.com/issues/10988#change-49018
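The init-script idea proposed above could look something like the following. This is only a sketch of the suggestion, not actual Ceph code; the sysctl call is commented out since it requires root:

```shell
# Hypothetical fragment for /etc/init.d/ceph, mirroring the existing
# "ulimit -n 32768" handling: raise kernel.pid_max before starting OSDs.
wanted=4194303                              # kernel maximum on 64-bit
current=$(cat /proc/sys/kernel/pid_max)
if [ "$current" -lt "$wanted" ]; then
    echo "kernel.pid_max is $current; raising to $wanted"
    # sysctl -w kernel.pid_max="$wanted"    # requires root
fi
```

Running this on every "service ceph start" would make the fix automatic rather than depending on each admin knowing about pid_max.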
Re: [ceph-users] Prioritize Heartbeat packets
Jian,

Thanks for the clarification. I'll mark traffic destined for the monitors as well.

We are getting ready to put our first cluster into production. If you are interested, we will be testing the heartbeat priority to see if we can saturate the network (not an easy task for 40 Gb) and keep the cluster from falling apart. Our network team is marking CoS based on the DSCP and enforcing priority. We have three VLANs on bonded 40 GbE: management, storage (monitors, clients, OSDs), and cluster (replication). We have three priority classes: management (heartbeats on all VLANs, SSH, DNS, etc.), storage traffic (no marking), and replication (scavenger class). We are interested to see how things pan out.

Thanks,
Robert

On Mon, Mar 9, 2015 at 8:58 PM, Jian Wen wenjia...@gmail.com wrote: Only OSD calls set_socket_priority(). See https://github.com/ceph/ceph/pull/3353 On Tue, Mar 10, 2015 at 3:36 AM, Robert LeBlanc rob...@leblancnet.us wrote: I've found commit 9b9a682fe035c985e416ee1c112fa58f9045a27c and I see that when 'osd heartbeat use min delay socket = true' is set, it will mark the packet with DSCP CS6. Based on the setting of the socket in msg/simple/Pipe.cc, is it possible that this can apply to both the OSD and the monitor? I don't understand the code well enough to know how set_socket_options() is called from the OSD and monitor. If this applies to both monitor and OSD, would it be better to rename the option to a more generic name? Thanks, On Sat, Mar 7, 2015 at 4:23 PM, Daniel Swarbrick daniel.swarbr...@gmail.com wrote: Judging by the commit, this ought to do the trick: osd heartbeat use min delay socket = true On 07/03/15 01:20, Robert LeBlanc wrote: I see that Jian Wen has done work on this for 0.94. I tried looking through the code to see if I can figure out how to configure this new option, but it all went over my head pretty quickly. Can I get a brief summary of how to set the priority of heartbeat packets, or where to look in the code to figure it out?
Thanks,

On Thu, Aug 28, 2014 at 2:01 AM, Daniel Swarbrick daniel.swarbr...@profitbricks.com wrote: On 28/08/14 02:56, Sage Weil wrote: I seem to remember someone telling me there were hooks/hints you could call that would tag either a socket or possibly data on that socket with a label for use by iptables and such, but I forget what it was. Something like setsockopt() SO_MARK?

SO_MARK (since Linux 2.6.25)
    Set the mark for each packet sent through this socket (similar to the netfilter MARK target, but socket-based). Changing the mark can be used for mark-based routing without netfilter, or for packet filtering. Setting this option requires the CAP_NET_ADMIN capability.

Alternatively, directly set IP_TOS options on the socket, or SO_PRIORITY, which sets the IP TOS bits as well.

--
Best,
Jian
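Once 'osd heartbeat use min delay socket = true' is in place, the CS6 marking discussed above can be verified on the wire. This is a sketch; the interface name is an assumption, and CS6 (DSCP 48) corresponds to a TOS byte of 0xc0:

```shell
# Capture a few packets whose DSCP field equals CS6: the top six bits
# of the IP TOS byte (ip[1]) are 110000, so (tos & 0xfc) == 0xc0.
IFACE=${IFACE:-eth0}                       # assumption: cluster-facing NIC
FILTER='ip and (ip[1] & 0xfc) == 0xc0'
if command -v tcpdump >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    tcpdump -ni "$IFACE" -c 5 "$FILTER"
else
    echo "need root and tcpdump; would capture with filter: $FILTER"
fi
```

Seeing heartbeat packets match this filter confirms the marking survives as far as the capture point; whether switches honor it is a separate QoS question.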
Re: [ceph-users] Prioritize Heartbeat packets
Only OSD calls set_socket_priority(). See https://github.com/ceph/ceph/pull/3353

On Tue, Mar 10, 2015 at 3:36 AM, Robert LeBlanc rob...@leblancnet.us wrote:
[quoted text trimmed; the same exchange is quoted in full in the previous message]

--
Best,
Jian
[ceph-users] rados import error: short write
We used `rados export poolA /opt/zs.rgw-buckets` to export the pool named poolA from our Ceph cluster into the local directory /opt/zs.rgw-buckets, and then tried to import that directory into a pool named hello on another Ceph cluster. The import fails as follows:

$ rados import /opt/zs.rgw-buckets hello --create
[ERROR] upload: rados_write error: short write
[ERROR] upload error: -5

The directory /opt/zs.rgw-buckets includes Chinese characters. How can we solve this problem when migrating a rados pool?
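For reference, the two halves of the migration described above can be sketched as below; this assumes the rados CLI and each cluster's conf/keyring are available on the host running the commands, and uses the pool names and paths from the report:

```shell
# Export a pool to a local directory, then import that directory into a
# (newly created) pool on the destination cluster. Guarded so the sketch
# is a no-op on hosts without the rados CLI.
if command -v rados >/dev/null 2>&1; then
    rados export poolA /opt/zs.rgw-buckets
    rados import /opt/zs.rgw-buckets hello --create
    result="import attempted"
else
    result="rados CLI not available; skipping"
fi
echo "$result"
```

The "short write" error itself (-5, EIO) comes from the destination cluster rejecting a write, so checking the destination OSD logs during the import may say more than the client-side message does.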
Re: [ceph-users] S3 RadosGW - Create bucket OP
Steffen W Sørensen stefws@... writes:

Response:

HTTP/1.1 200 OK
Date: Fri, 06 Mar 2015 10:41:14 GMT
Server: Apache/2.2.22 (Fedora)
Connection: close
Transfer-Encoding: chunked
Content-Type: application/xml

This response makes the app say:

S3.createBucket, class S3, code UnexpectedContent, message "Inconsistency in S3 response. error response is not a valid xml message"

Is our S3 gateway not responding properly? Why doesn't the radosgw return a Content-Length: 0 header when the body is empty? http://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html Maybe this is confusing my app into expecting some XML in the body.

2. At every create-bucket operation the gateway creates what look like new objects for ACLs in the .rgw pool. Is this normal, or how can I avoid such multiple objects cluttering the gateway pools? Is something wrong, since I get multiple ACL objects for this bucket every time my app tries to recreate the same bucket, or is this a feature/bug in radosgw?

# rados -p .rgw ls
.bucket.meta.mssCl:default.6309817.1
.bucket.meta.mssCl:default.6187712.3
.bucket.meta.mssCl:default.6299841.7
.bucket.meta.mssCl:default.6309817.5
.bucket.meta.mssCl:default.6187712.2
.bucket.meta.mssCl:default.6187712.19
.bucket.meta.mssCl:default.6187712.12
mssCl
...

# rados -p .rgw listxattr .bucket.meta.mssCl:default.6187712.12
ceph.objclass.version
user.rgw.acl

/Steffen
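To see exactly which headers the gateway returns on the create-bucket PUT (and confirm whether Content-Length is present), the raw response headers can be dumped with curl. The endpoint below is a placeholder, and an unauthenticated PUT will normally be rejected, but the headers of whatever response comes back are still visible:

```shell
# Dump only the response headers of a PUT to the bucket URL.
URL=${RGW_URL:-http://rgw.example.com/mssCl}   # placeholder endpoint
resp=$(curl -s -m 5 -D - -o /dev/null -X PUT "$URL" 2>/dev/null \
       || echo "no response (gateway unreachable)")
echo "$resp"
```

Comparing these headers against a response from S3 itself would show whether the app's XML-body expectation is triggered by the chunked encoding, the missing Content-Length, or something else.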
[ceph-users] Ceph node operating system high availability and osd restoration best practices.
Hi,

I have a 4-node Ceph cluster; the operating system used on the nodes is Ubuntu 14.04. The cluster currently has 12 OSDs spread across the 4 nodes. One of the nodes has just been restored after an operating-system file-system corruption, which made the node and the OSDs on it inaccessible to the rest of the cluster. I had to re-install the operating system to make the node accessible, and I am currently in the process of restoring the OSDs on the re-installed node.

I have 3 questions:
1) Is there any mechanism to provide node operating-system high availability on a Ceph cluster?
2) Are there any best practices to follow while restoring the OSDs on a node that has been rebuilt after an operating-system crash?
3) Is there any way to check that the data stored on the Ceph cluster is safe and was replicated to the other 3 nodes while one node was down?

Regards,
--
Vivek