Re: [ceph-users] Ceph 10.1.1 rbd map fail
I find this message in dmesg:

[83090.212918] libceph: mon0 192.168.159.128:6789 feature set mismatch, my 4a042a42 < server's 2004a042a42, missing 200

According to "http://cephnotes.ksperis.com/blog/2014/01/21/feature-set-mismatch-error-on-ceph-kernel-client", this could mean that I need to upgrade the kernel client to 3.15, or disable the tunables3 features. Upgrading the kernel on our cluster is not convenient. Could you tell me how to disable the tunables3 features?

Thanks!

Kind Regards,
Haitao Wang

At 2016-06-22 12:33:42, "Brad Hubbard" wrote:
>On Wed, Jun 22, 2016 at 1:35 PM, 王海涛 wrote:
>> Hi All
>>
>> I'm using ceph-10.1.1 to map an rbd image, but it doesn't work. The error
>> messages are:
>>
>> root@heaven:~# rbd map rbd/myimage --id admin
>> 2016-06-22 11:16:34.546623 7fc87ca53d80 -1 WARNING: the following dangerous
>> and experimental features are enabled: bluestore,rocksdb
>> 2016-06-22 11:16:34.547166 7fc87ca53d80 -1 WARNING: the following dangerous
>> and experimental features are enabled: bluestore,rocksdb
>> 2016-06-22 11:16:34.549018 7fc87ca53d80 -1 WARNING: the following dangerous
>> and experimental features are enabled: bluestore,rocksdb
>> rbd: sysfs write failed
>> rbd: map failed: (5) Input/output error
>
>Anything in dmesg, or anywhere, about "feature set mismatch"?
>
>http://cephnotes.ksperis.com/blog/2014/01/21/feature-set-mismatch-error-on-ceph-kernel-client
>
>> Could someone tell me what's wrong?
>> Thanks!
>>
>> Kind Regards,
>> Haitao Wang
>
>--
>Cheers,
>Brad

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
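For what it's worth, the missing feature bit can be decoded from the two masks in that dmesg line (the truncated "missing 200" is consistent with 0x20000000000). A small sketch -- the mapping of bit 41 to CEPH_FEATURE_CRUSH_TUNABLES3 is from memory of the kernel feature table, so treat it as an assumption:

```python
# Decode which feature bits the kernel client lacks, from the dmesg line:
#   "feature set mismatch, my 4a042a42 < server's 2004a042a42, missing ..."
client = 0x4a042a42        # features the kernel client supports
server = 0x2004a042a42     # features the monitors require
missing = server & ~client # bits required by the server but absent on the client
print(hex(missing))                # 0x20000000000
print(missing.bit_length() - 1)    # 41
# Bit 41 is CEPH_FEATURE_CRUSH_TUNABLES3 (assumption, from the kernel feature
# table): the chooseleaf_vary_r CRUSH tunable introduced with the firefly
# tunables profile.
```

If that holds, the options are exactly what the linked post describes: run a kernel with tunables3 support (3.15+), or relax the CRUSH tunables profile on the cluster, e.g. `ceph osd crush tunables bobtail` -- with the usual caveat that changing tunables causes data movement.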
Re: [ceph-users] Ceph 10.1.1 rbd map fail
On Wed, Jun 22, 2016 at 1:35 PM, 王海涛 wrote:
> Hi All
>
> I'm using ceph-10.1.1 to map an rbd image, but it doesn't work. The error
> messages are:
>
> root@heaven:~# rbd map rbd/myimage --id admin
> 2016-06-22 11:16:34.546623 7fc87ca53d80 -1 WARNING: the following dangerous
> and experimental features are enabled: bluestore,rocksdb
> 2016-06-22 11:16:34.547166 7fc87ca53d80 -1 WARNING: the following dangerous
> and experimental features are enabled: bluestore,rocksdb
> 2016-06-22 11:16:34.549018 7fc87ca53d80 -1 WARNING: the following dangerous
> and experimental features are enabled: bluestore,rocksdb
> rbd: sysfs write failed
> rbd: map failed: (5) Input/output error

Anything in dmesg, or anywhere, about "feature set mismatch"?

http://cephnotes.ksperis.com/blog/2014/01/21/feature-set-mismatch-error-on-ceph-kernel-client

> Could someone tell me what's wrong?
> Thanks!
>
> Kind Regards,
> Haitao Wang

--
Cheers,
Brad
Re: [ceph-users] performance issue with jewel on ubuntu xenial (kernel)
Hi Yoann,

On Tue, Jun 21, 2016 at 3:11 PM, Yoann Moulin wrote:
> Hello,
>
> I found a performance drop between kernel 3.13.0-88 (default kernel on Ubuntu
> Trusty 14.04) and kernel 4.4.0.24.14 (default kernel on Ubuntu Xenial 16.04)
>
> ceph version is Jewel (10.2.2).
> All tests have been done under Ubuntu 14.04

Knowing that you also have an Infernalis cluster on almost identical hardware, can you please let the list know whether you see the same behavior (severely reduced throughput on a 4.4 kernel vs. 3.13) on that cluster as well? Thank you.

Cheers,
Florian
[ceph-users] Ceph 10.1.1 rbd map fail
Hi All

I'm using ceph-10.1.1 to map an rbd image, but it doesn't work. The error messages are:

root@heaven:~# rbd map rbd/myimage --id admin
2016-06-22 11:16:34.546623 7fc87ca53d80 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
2016-06-22 11:16:34.547166 7fc87ca53d80 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
2016-06-22 11:16:34.549018 7fc87ca53d80 -1 WARNING: the following dangerous and experimental features are enabled: bluestore,rocksdb
rbd: sysfs write failed
rbd: map failed: (5) Input/output error

Could someone tell me what's wrong?
Thanks!

Kind Regards,
Haitao Wang
Re: [ceph-users] Ceph Performance vs Entry Level San Arrays
Hello,

On Wed, 22 Jun 2016 11:09:46 +1200 Denver Williams wrote:

> Hi All
>
> I'm planning an OpenStack private cloud deployment and I'm trying to
> decide which would be the better option.
>
> What would the performance advantages/disadvantages be when comparing a
> 3-node Ceph setup with 15K/12G SAS drives in HP DL380p G8 servers with
> SSDs for write cache to something like an HP MSA 2040 10GbE iSCSI array?
> All network connections would be 10GbE.

Very complex question, and it's easy to compare apples and oranges here as well.

For starters, I have no experience with that (or any similar) SAN, and all my iSCSI experience is purely based on tests, not production environments (and thus performance numbers).

I'd also pit non-HP boxes (unless you get a massive discount from them, of course) against the SAN, both for cost and design flexibility. And 15k or not, 12Gb/s SAS is overkill in my book for anything but SSDs.

That all being said, I'd venture the SAN will win performance-wise: its 4GB of HW cache on the RAID controllers can mask RAID6 performance drops, and if you deploy RAID10s and tiering to SSDs with it, that should only get better. There's a reason I deploy my mailbox servers as DRBD Pacemaker cluster pairs and not with Ceph as backing storage.

3 Ceph storage nodes will give you the capacity of just one due to replication, and you incur the latency penalty associated with that as well. Ceph could outgrow and potentially out-perform that SAN (in its maximum configuration), but clearly you're not looking for that. Ceph also has potentially more resilience, but that's not a performance question either.

It would be helpful to put a little more meat on that question, as in:
- What are your needs (space, IOPS)?
- What are the costs for either solution? (get a quote from HP)

Christian
--
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
[ceph-users] Ceph Performance vs Entry Level San Arrays
Hi All

I'm planning an OpenStack private cloud deployment and I'm trying to decide which would be the better option.

What would the performance advantages/disadvantages be when comparing a 3-node Ceph setup with 15K/12G SAS drives in HP DL380p G8 servers with SSDs for write cache to something like an HP MSA 2040 10GbE iSCSI array? All network connections would be 10GbE.

Kind Regards,
Denver Williams
Re: [ceph-users] slow request, waiting for rw locks / subops from osd doing deep scrub of pg in rgw.buckets.index
.rgw.buckets.index is the pool with rgw's index objects, right? The actual on-disk directory for one of those pgs would contain only empty files -- the actual index data is stored in the osd's leveldb instance. I suspect your index objects are very large (because the buckets contain many objects) and are taking a long time to scrub. IIRC, there is a way to make rgw split up those index objects into smaller ones.
-Sam

On Tue, Jun 21, 2016 at 11:58 AM, Trygve Vea wrote:
> Hi,
>
> I believe I've stumbled on a bug in Ceph, and I'm currently trying to figure
> out if this is a new bug, some behaviour caused by our cluster being in the
> midst of a hammer (0.94.6) -> jewel (10.2.2) upgrade, or other factors.
>
> The state of the cluster at the time of the incident:
>
> - All monitor nodes are running 10.2.2.
> - One OSD server (4 osds) is up with 10.2.2 and with all pgs in active+clean.
> - One OSD server (4 osds) is up with 10.2.2 and undergoing backfills
>   (however: nobackfill was set, as we try to keep backfills running during
>   night time).
>
> We have 4 OSD servers with 4 osds each on 0.94.6.
> We have 3 OSD servers with 2 osds each on 0.94.6.
>
> We experienced something that heavily affected our RGW users. Some requests
> interfacing with 0.94.6 nodes were slow.
>
> During a 10 minute window, our RGW nodes ran out of available workers and
> ceased to respond.
>
> Some nodes logged lines like these (only 0.94.6 nodes):
>
> 2016-06-21 09:51:08.053886 7f54610d8700 0 log_channel(cluster) log [WRN] : 2 slow requests, 1 included below; oldest blocked for > 74.368036 secs
> 2016-06-21 09:51:08.053951 7f54610d8700 0 log_channel(cluster) log [WRN] : slow request 30.056333 seconds old, received at 2016-06-21 09:50:37.997327: osd_op(client.9433496.0:1089298249 somergwuser.buckets [call user.set_buckets_info] 12.da8df901 ondisk+write+known_if_redirected e9906) currently waiting for rw locks
>
> Some nodes logged lines like these (there was some, but not 100%, overlap
> between the osds that logged these and the aforementioned lines -- only
> 0.94.6 nodes):
>
> 2016-06-21 09:51:48.677474 7f8cb6628700 0 log_channel(cluster) log [WRN] : 2 slow requests, 1 included below; oldest blocked for > 42.033650 secs
> 2016-06-21 09:51:48.677565 7f8cb6628700 0 log_channel(cluster) log [WRN] : slow request 30.371173 seconds old, received at 2016-06-21 09:51:18.305770: osd_op(client.9525441.0:764274789 gc.1164 [call lock.lock] 7.7b4f1779 ondisk+write+known_if_redirected e9906) currently waiting for subops from 40,50
>
> All of the osds that logged these lines were waiting for subops from osd.50.
>
> Investigating what's going on on this osd during that window:
>
> 2016-06-21 09:48:22.064630 7f1cbb41d700 0 log_channel(cluster) log [INF] : 5.b5 deep-scrub starts
> 2016-06-21 09:59:56.640012 7f1c90163700 0 -- 10.21.9.22:6800/2003521 >> 10.20.9.21:6805/7755 pipe(0x1e47a000 sd=298 :39448 s=2 pgs=23 cs=1 l=0 c=0x1033ba20).fault with nothing to send, going to standby
> 2016-06-21 09:59:56.997763 7f1c700f8700 0 -- 10.21.9.22:6808/3521 >> 10.21.9.12:0/1028533 pipe(0x1f30f000 sd=87 :6808 s=0 pgs=0 cs=0 l=1 c=0x743c840).accept replacing existing (lossy) channel (new one lossy=1)
> 2016-06-21 10:00:39.938700 7f1cd9828700 0 log_channel(cluster) log [WRN] : 33 slow requests, 33 included below; oldest blocked for > 727.862759 secs
> 2016-06-21 10:00:39.938708 7f1cd9828700 0 log_channel(cluster) log [WRN] : slow request 670.918857 seconds old, received at 2016-06-21 09:49:29.019653: osd_op(client.9403437.0:1209613500 TZ1A91MYDE1LO63AQCM3 [getxattrs,stat] 9.442585e6 ack+read+known_if_redirected e9906) currently no flag points reached
> 2016-06-21 10:00:39.938800 7f1cd9828700 0 log_channel(cluster) log [WRN] : slow request 689.815851 seconds old, received at 2016-06-21 09:49:10.122660: osd_op(client.9403437.0:1209611533 TZ1A91MYDE1LO63AQCM3 [getxattrs,stat] 9.442585e6 ack+read+known_if_redirected e9906) currently no flag points reached
> 2016-06-21 10:00:39.938807 7f1cd9828700 0 log_channel(cluster) log [WRN] : slow request 670.895353 seconds old, received at 2016-06-21 09:49:29.043158: osd_op(client.9403437.0:1209613505 prod.arkham [call version.read,getxattrs,stat] 2.4da23de6 ack+read+known_if_redirected e9906) currently no flag points reached
> 2016-06-21 10:00:39.938810 7f1cd9828700 0 log_channel(cluster) log [WRN] : slow request 688.612303 seconds old, received at 2016-06-21 09:49:11.326207: osd_op(client.20712623.0:137251515 TZ1A91MYDE1LO63AQCM3 [getxattrs,stat] 9.442585e6 ack+read+known_if_redirected e9906) currently no flag points reached
> 2016-06-21 10:00:39.938813 7f1cd9828700 0 log_channel(cluster) log [WRN] : slow request 658.605163 seconds old, received at 2016-06-21 09:49:41.48: osd_op(client.20712623.0:137254412 TZ1A91MYDE1LO63AQCM3 [getxattrs,stat]
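On Sam's point about making rgw split large index objects into smaller ones: if memory serves, this is bucket index sharding. A sketch of the relevant ceph.conf knob -- the option name and placement are from memory of the Hammer/Jewel docs, so verify against your version's documentation:

```
[global]
# Assumption: rgw_override_bucket_index_max_shards, as of Hammer/Jewel.
# Shards each *newly created* bucket's index across 8 rados objects
# instead of a single monolithic omap object; existing buckets keep
# their old, unsharded index.
rgw override bucket index max shards = 8
```

Note that because it only applies at bucket creation time, already-huge indexes still need their buckets recreated (or a resharding tool) to benefit.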
[ceph-users] Bluestore Backend Tech Talk
Hey cephers,

Just a reminder, the Bluestore backend Ceph Tech Talk by Sage is going to be starting in ~10m. Feel free to dial in and ask questions. Thanks.

http://ceph.com/ceph-tech-talks/

--
Best Regards,
Patrick McGarry
Director Ceph Community || Red Hat
http://ceph.com || http://community.redhat.com
@scuttlemonkey || @ceph
[ceph-users] slow request, waiting for rw locks / subops from osd doing deep scrub of pg in rgw.buckets.index
Hi,

I believe I've stumbled on a bug in Ceph, and I'm currently trying to figure out if this is a new bug, some behaviour caused by our cluster being in the midst of a hammer (0.94.6) -> jewel (10.2.2) upgrade, or other factors.

The state of the cluster at the time of the incident:

- All monitor nodes are running 10.2.2.
- One OSD server (4 osds) is up with 10.2.2 and with all pgs in active+clean.
- One OSD server (4 osds) is up with 10.2.2 and undergoing backfills
  (however: nobackfill was set, as we try to keep backfills running during
  night time).

We have 4 OSD servers with 4 osds each on 0.94.6.
We have 3 OSD servers with 2 osds each on 0.94.6.

We experienced something that heavily affected our RGW users. Some requests interfacing with 0.94.6 nodes were slow.

During a 10 minute window, our RGW nodes ran out of available workers and ceased to respond.

Some nodes logged lines like these (only 0.94.6 nodes):

2016-06-21 09:51:08.053886 7f54610d8700 0 log_channel(cluster) log [WRN] : 2 slow requests, 1 included below; oldest blocked for > 74.368036 secs
2016-06-21 09:51:08.053951 7f54610d8700 0 log_channel(cluster) log [WRN] : slow request 30.056333 seconds old, received at 2016-06-21 09:50:37.997327: osd_op(client.9433496.0:1089298249 somergwuser.buckets [call user.set_buckets_info] 12.da8df901 ondisk+write+known_if_redirected e9906) currently waiting for rw locks

Some nodes logged lines like these (there was some, but not 100%, overlap between the osds that logged these and the aforementioned lines -- only 0.94.6 nodes):

2016-06-21 09:51:48.677474 7f8cb6628700 0 log_channel(cluster) log [WRN] : 2 slow requests, 1 included below; oldest blocked for > 42.033650 secs
2016-06-21 09:51:48.677565 7f8cb6628700 0 log_channel(cluster) log [WRN] : slow request 30.371173 seconds old, received at 2016-06-21 09:51:18.305770: osd_op(client.9525441.0:764274789 gc.1164 [call lock.lock] 7.7b4f1779 ondisk+write+known_if_redirected e9906) currently waiting for subops from 40,50

All of the osds that logged these lines were waiting for subops from osd.50.

Investigating what's going on on this osd during that window:

2016-06-21 09:48:22.064630 7f1cbb41d700 0 log_channel(cluster) log [INF] : 5.b5 deep-scrub starts
2016-06-21 09:59:56.640012 7f1c90163700 0 -- 10.21.9.22:6800/2003521 >> 10.20.9.21:6805/7755 pipe(0x1e47a000 sd=298 :39448 s=2 pgs=23 cs=1 l=0 c=0x1033ba20).fault with nothing to send, going to standby
2016-06-21 09:59:56.997763 7f1c700f8700 0 -- 10.21.9.22:6808/3521 >> 10.21.9.12:0/1028533 pipe(0x1f30f000 sd=87 :6808 s=0 pgs=0 cs=0 l=1 c=0x743c840).accept replacing existing (lossy) channel (new one lossy=1)
2016-06-21 10:00:39.938700 7f1cd9828700 0 log_channel(cluster) log [WRN] : 33 slow requests, 33 included below; oldest blocked for > 727.862759 secs
2016-06-21 10:00:39.938708 7f1cd9828700 0 log_channel(cluster) log [WRN] : slow request 670.918857 seconds old, received at 2016-06-21 09:49:29.019653: osd_op(client.9403437.0:1209613500 TZ1A91MYDE1LO63AQCM3 [getxattrs,stat] 9.442585e6 ack+read+known_if_redirected e9906) currently no flag points reached
2016-06-21 10:00:39.938800 7f1cd9828700 0 log_channel(cluster) log [WRN] : slow request 689.815851 seconds old, received at 2016-06-21 09:49:10.122660: osd_op(client.9403437.0:1209611533 TZ1A91MYDE1LO63AQCM3 [getxattrs,stat] 9.442585e6 ack+read+known_if_redirected e9906) currently no flag points reached
2016-06-21 10:00:39.938807 7f1cd9828700 0 log_channel(cluster) log [WRN] : slow request 670.895353 seconds old, received at 2016-06-21 09:49:29.043158: osd_op(client.9403437.0:1209613505 prod.arkham [call version.read,getxattrs,stat] 2.4da23de6 ack+read+known_if_redirected e9906) currently no flag points reached
2016-06-21 10:00:39.938810 7f1cd9828700 0 log_channel(cluster) log [WRN] : slow request 688.612303 seconds old, received at 2016-06-21 09:49:11.326207: osd_op(client.20712623.0:137251515 TZ1A91MYDE1LO63AQCM3 [getxattrs,stat] 9.442585e6 ack+read+known_if_redirected e9906) currently no flag points reached
2016-06-21 10:00:39.938813 7f1cd9828700 0 log_channel(cluster) log [WRN] : slow request 658.605163 seconds old, received at 2016-06-21 09:49:41.48: osd_op(client.20712623.0:137254412 TZ1A91MYDE1LO63AQCM3 [getxattrs,stat] 9.442585e6 ack+read+known_if_redirected e9906) currently no flag points reached
2016-06-21 10:00:39.960300 7f1cbb41d700 0 log_channel(cluster) log [INF] : 5.b5 deep-scrub ok

Looking at the contents of 5.b5 (which is in our .rgw.buckets.index pool, if relevant), it's almost empty (12KB of files on the disk), and I find it unlikely for a scrub to take that long -- which is why I suspect we've run into a bug.

With the information I've provided here, can anyone shed some light on what this may be? And if it's a bug that is not fixed in HEAD, what information would be useful to include in a bug report?

Regards
--
Trygve Vea
Re: [ceph-users] Issue installing ceph with ceph-deploy
On Tue, Jun 21, 2016 at 8:16 AM, shane wrote:
> Fran Barrera writes:
>
>> Hi all,
>> I have a problem installing ceph jewel with ceph-deploy (1.5.33) on ubuntu 14.04.4 (openstack instance).
>>
>> This is my setup:
>>
>> ceph-admin
>> ceph-mon
>> ceph-osd-1
>> ceph-osd-2
>>
>> I've followed these steps from the ceph-admin node:
>>
>> I have the user "ceph" created on all nodes and access via ssh key.
>>
>> 1. # wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | apt-key add -
>> 2. # echo deb http://download.ceph.com/debian-jewel/ $(lsb_release -sc) main | tee /etc/apt/sources.list.d/ceph.list
>> 3. # apt-get update
>> 4. # apt-get install ceph-deploy
>> 5. $ ceph-deploy new ceph-mon
>> 6. Modify ceph.conf and add "osd_pool_default_size = 2"
>> 7. $ ceph-deploy install ceph-admin ceph-mon ceph-osd-1 ceph-osd-2
>>
>> And this is the output:
>>
>> [ceph-admin][DEBUG ] Setting up ceph-common (10.2.1-1trusty) ...
>> [ceph-admin][DEBUG ] Setting system user ceph properties..Processing triggers for libc-bin (2.19-0ubuntu6.9) ...
>> [ceph-admin][WARNIN] usermod: user ceph is currently used by process 1303
>> [ceph-admin][WARNIN] dpkg: error processing package ceph-common (--configure):
>> [ceph-admin][WARNIN] subprocess installed post-installation script returned error exit status 8
>> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of ceph-base:
>> [ceph-admin][WARNIN] ceph-base depends on ceph-common (= 10.2.1-1trusty); however:
>> [ceph-admin][WARNIN] Package ceph-common is not configured yet.
>> [ceph-admin][WARNIN]
>> [ceph-admin][WARNIN] dpkg: error processing package ceph-base (--configure):
>> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
>> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of ceph-mon:
>> [ceph-admin][WARNIN] ceph-mon depends on ceph-base (= 10.2.1-1trusty); however:
>> [ceph-admin][WARNIN] Package ceph-base is not configured yet.
>> [ceph-admin][WARNIN]
>> [ceph-admin][WARNIN] dpkg: error processing package ceph-mon (--configure):
>> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
>> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of ceph-osd:
>> [ceph-admin][WARNIN] ceph-osd depends on ceph-base (= 10.2.1-1trusty); however:
>> [ceph-admin][WARNIN] Package ceph-base is not configured yet.
>> [ceph-admin][WARNIN]
>> [ceph-admin][WARNIN] dpkg: error processing package ceph-osd (--configure):
>> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
>> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of ceph:
>> [ceph-admin][WARNIN] ceph depends on ceph-mon (= 10.2.1-1trusty); however:
>> [ceph-admin][WARNIN] Package ceph-mon is not configured yet.
>> [ceph-admin][WARNIN] ceph depends on ceph-osd (= 10.2.1-1trusty); however:
>> [ceph-admin][WARNIN] Package ceph-osd is not configured yet.
>> [ceph-admin][WARNIN]
>> [ceph-admin][WARNIN] dpkg: error processing package ceph (--configure):
>> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
>> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of ceph-mds:
>> [ceph-admin][WARNIN] ceph-mds depends on ceph-base (= 10.2.1-1trusty); however:
>> [ceph-admin][WARNIN] Package ceph-base is not configured yet.
>> [ceph-admin][WARNIN]
>> [ceph-admin][WARNIN] dpkg: error processing package ceph-mds (--configure):
>> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
>> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of radosgw:
>> [ceph-admin][WARNIN] radosgw depends on ceph-common (= 10.2.1-1trusty); however:
>> [ceph-admin][WARNIN] Package ceph-common is not configured yet.
>> [ceph-admin][WARNIN]
>> [ceph-admin][WARNIN] dpkg: error processing package radosgw (--configure):
>> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
>> [ceph-admin][WARNIN] No apport report written because the error message indicates its a followup error from a previous failure.
>> [ceph-admin][WARNIN] No apport report written because the error message indicates its a followup error from a previous failure.
>> [ceph-admin][WARNIN] No apport report written because MaxReports is reached already
>> [ceph-admin][WARNIN] No apport report written because MaxReports is reached already
>> [ceph-admin][WARNIN] No apport report written because MaxReports is reached already
>> [ceph-admin][WARNIN] No apport report written because MaxReports is reached already
>> [ceph-admin][DEBUG ] Processing triggers for ureadahead (0.100.0-16) ...
>> [ceph-admin][WARNIN] Errors were encountered while processing:
>> [ceph-admin][WARNIN] ceph-common
>> [ceph-admin][WARNIN] ceph-base
>> [ceph-admin][WARNIN] ceph-mon
>> [ceph-admin][WARNIN] ceph-osd
>> [ceph-admin][WARNIN] ceph
>> [ceph-admin][WARNIN] ceph-mds
>> [ceph-admin][WARNIN]
Re: [ceph-users] Chown / symlink issues on download.ceph.com
On 06/20/2016 12:54 AM, Wido den Hollander wrote:
> Hi Dan,
>
> There seems to be a symlink issue on download.ceph.com:
>
> # rsync -4 -avrn download.ceph.com::ceph /tmp | grep 'rpm-hammer/rhel7'
> rpm-hammer/rhel7 -> /home/dhc-user/repos/rpm-hammer/el7
>
> Could you take a quick look at that? It breaks the syncs for all the other
> mirrors who sync from download.ceph.com.
>
> Maybe do a chown (automated, cron?) as well to make sure all the files are
> readable by rsync?
>
> Thanks!
>
> Wido

I've just removed the symlink. It probably was doing no good. If there are further perm issues I don't see them, but let me know.
Re: [ceph-users] Issue installing ceph with ceph-deploy
Fran Barrera writes:
>
> Hi all,
> I have a problem installing ceph jewel with ceph-deploy (1.5.33) on ubuntu 14.04.4 (openstack instance).
>
> This is my setup:
>
> ceph-admin
> ceph-mon
> ceph-osd-1
> ceph-osd-2
>
> I've followed these steps from the ceph-admin node:
>
> I have the user "ceph" created on all nodes and access via ssh key.
>
> 1. # wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | apt-key add -
> 2. # echo deb http://download.ceph.com/debian-jewel/ $(lsb_release -sc) main | tee /etc/apt/sources.list.d/ceph.list
> 3. # apt-get update
> 4. # apt-get install ceph-deploy
> 5. $ ceph-deploy new ceph-mon
> 6. Modify ceph.conf and add "osd_pool_default_size = 2"
> 7. $ ceph-deploy install ceph-admin ceph-mon ceph-osd-1 ceph-osd-2
>
> And this is the output:
>
> [ceph-admin][DEBUG ] Setting up ceph-common (10.2.1-1trusty) ...
> [ceph-admin][DEBUG ] Setting system user ceph properties..Processing triggers for libc-bin (2.19-0ubuntu6.9) ...
> [ceph-admin][WARNIN] usermod: user ceph is currently used by process 1303
> [ceph-admin][WARNIN] dpkg: error processing package ceph-common (--configure):
> [ceph-admin][WARNIN] subprocess installed post-installation script returned error exit status 8
> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of ceph-base:
> [ceph-admin][WARNIN] ceph-base depends on ceph-common (= 10.2.1-1trusty); however:
> [ceph-admin][WARNIN] Package ceph-common is not configured yet.
> [ceph-admin][WARNIN]
> [ceph-admin][WARNIN] dpkg: error processing package ceph-base (--configure):
> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of ceph-mon:
> [ceph-admin][WARNIN] ceph-mon depends on ceph-base (= 10.2.1-1trusty); however:
> [ceph-admin][WARNIN] Package ceph-base is not configured yet.
> [ceph-admin][WARNIN]
> [ceph-admin][WARNIN] dpkg: error processing package ceph-mon (--configure):
> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of ceph-osd:
> [ceph-admin][WARNIN] ceph-osd depends on ceph-base (= 10.2.1-1trusty); however:
> [ceph-admin][WARNIN] Package ceph-base is not configured yet.
> [ceph-admin][WARNIN]
> [ceph-admin][WARNIN] dpkg: error processing package ceph-osd (--configure):
> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of ceph:
> [ceph-admin][WARNIN] ceph depends on ceph-mon (= 10.2.1-1trusty); however:
> [ceph-admin][WARNIN] Package ceph-mon is not configured yet.
> [ceph-admin][WARNIN] ceph depends on ceph-osd (= 10.2.1-1trusty); however:
> [ceph-admin][WARNIN] Package ceph-osd is not configured yet.
> [ceph-admin][WARNIN]
> [ceph-admin][WARNIN] dpkg: error processing package ceph (--configure):
> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of ceph-mds:
> [ceph-admin][WARNIN] ceph-mds depends on ceph-base (= 10.2.1-1trusty); however:
> [ceph-admin][WARNIN] Package ceph-base is not configured yet.
> [ceph-admin][WARNIN]
> [ceph-admin][WARNIN] dpkg: error processing package ceph-mds (--configure):
> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
> [ceph-admin][WARNIN] dpkg: dependency problems prevent configuration of radosgw:
> [ceph-admin][WARNIN] radosgw depends on ceph-common (= 10.2.1-1trusty); however:
> [ceph-admin][WARNIN] Package ceph-common is not configured yet.
> [ceph-admin][WARNIN]
> [ceph-admin][WARNIN] dpkg: error processing package radosgw (--configure):
> [ceph-admin][WARNIN] dependency problems - leaving unconfigured
> [ceph-admin][WARNIN] No apport report written because the error message indicates its a followup error from a previous failure.
> [ceph-admin][WARNIN] No apport report written because the error message indicates its a followup error from a previous failure.
> [ceph-admin][WARNIN] No apport report written because MaxReports is reached already
> [ceph-admin][WARNIN] No apport report written because MaxReports is reached already
> [ceph-admin][WARNIN] No apport report written because MaxReports is reached already
> [ceph-admin][WARNIN] No apport report written because MaxReports is reached already
> [ceph-admin][DEBUG ] Processing triggers for ureadahead (0.100.0-16) ...
> [ceph-admin][WARNIN] Errors were encountered while processing:
> [ceph-admin][WARNIN] ceph-common
> [ceph-admin][WARNIN] ceph-base
> [ceph-admin][WARNIN] ceph-mon
> [ceph-admin][WARNIN] ceph-osd
> [ceph-admin][WARNIN] ceph
> [ceph-admin][WARNIN] ceph-mds
> [ceph-admin][WARNIN] radosgw
> [ceph-admin][WARNIN] E: Sub-process /usr/bin/dpkg returned an error code (1)
> [ceph-admin][ERROR ] RuntimeError: command returned non-zero exit status: 100
> [ceph_deploy][ERROR ]
[ceph-users] performance issue with jewel on ubuntu xenial (kernel)
Hello,

I found a performance drop between kernel 3.13.0-88 (default kernel on Ubuntu Trusty 14.04) and kernel 4.4.0.24.14 (default kernel on Ubuntu Xenial 16.04).

ceph version is Jewel (10.2.2).
All tests have been done under Ubuntu 14.04.

Kernel 4.4 has a drop of 50% compared to 4.2.
Kernel 4.4 has a drop of 40% compared to 3.13.

Details below:

With the 3 kernels I have the same performance on the disks.

Raw benchmark:
dd if=/dev/zero of=/dev/sdX bs=1M count=1024 oflag=direct => average ~230MB/s
dd if=/dev/zero of=/dev/sdX bs=1G count=1 oflag=direct    => average ~220MB/s

Filesystem mounted benchmark:
dd if=/dev/zero of=/sdX1/test.img bs=1G count=1              => average ~205MB/s
dd if=/dev/zero of=/sdX1/test.img bs=1G count=1 oflag=direct => average ~214MB/s
dd if=/dev/zero of=/sdX1/test.img bs=1G count=1 oflag=sync   => average ~190MB/s

Ceph osd benchmark:
Kernel 3.13.0-88-generic : ceph tell osd.ID bench => average ~81MB/s
Kernel 4.2.0-38-generic  : ceph tell osd.ID bench => average ~109MB/s
Kernel 4.4.0-24-generic  : ceph tell osd.ID bench => average ~50MB/s

Does anyone get similar behaviour on their cluster?

Best regards
--
Yoann Moulin
EPFL IC-IT
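As a sanity check, the quoted drop percentages do line up with the per-kernel `ceph tell osd.ID bench` averages (assuming that is how they were computed):

```python
# Verify the reported throughput drops against the per-kernel bench
# averages (MB/s) quoted above.
bench = {"3.13": 81, "4.2": 109, "4.4": 50}

def drop_pct(old: float, new: float) -> int:
    """Relative throughput drop going from `old` to `new`, in percent."""
    return round(100 * (1 - new / old))

print(drop_pct(bench["4.2"], bench["4.4"]))   # 54 -- "~50% compared to 4.2"
print(drop_pct(bench["3.13"], bench["4.4"]))  # 38 -- "~40% compared to 3.13"
```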
Re: [ceph-users] osds udev rules not triggered on reboot (jewel, jessie)
On 16/06/2016 18:01, stephane.d...@orange.com wrote:
> Hi,
>
> Same issue with CentOS 7, I also put back this file in /etc/udev/rules.d.

Hi Stephane,

Could you please detail which version of CentOS 7 you are using? I tried to reproduce the problem with CentOS 7.2 as found on the CentOS cloud images repository ( http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud-1511.qcow2 ) but it "works for me".

Thanks!

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alexandre DERUMIER
> Sent: Thursday, June 16, 2016 17:53
> To: Karsten Heymann; Loris Cuoghi
> Cc: Loic Dachary; ceph-users
> Subject: Re: [ceph-users] osds udev rules not triggered on reboot (jewel, jessie)
>
> Hi,
>
> I have the same problem with osd disks not mounted at boot on jessie with ceph jewel.
>
> The workaround is to re-add the 60-ceph-partuuid-workaround.rules file to udev:
>
> http://tracker.ceph.com/issues/16351
>
> ----- Original Message -----
> From: "aderumier"
> To: "Karsten Heymann", "Loris Cuoghi"
> Cc: "Loic Dachary", "ceph-users"
> Sent: Thursday, 28 April 2016 07:42:04
> Subject: Re: [ceph-users] osds udev rules not triggered on reboot (jewel, jessie)
>
> Hi,
> there are missing target files in the debian packages:
>
> http://tracker.ceph.com/issues/15573
> https://github.com/ceph/ceph/pull/8700
>
> I have also filed some other trackers about packaging bugs:
>
> jewel: debian package: wrong /etc/default/ceph/ceph location
> http://tracker.ceph.com/issues/15587
>
> debian/ubuntu: TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES not specified in /etc/default/cep
> http://tracker.ceph.com/issues/15588
>
> jewel: debian package: init.d script bug
> http://tracker.ceph.com/issues/15585
>
> @CC Loic Dachary, maybe he could help to speed up the packaging fixes.
>
> ----- Original Message -----
> From: "Karsten Heymann"
> To: "Loris Cuoghi"
> Cc: "ceph-users"
> Sent: Wednesday, 27 April 2016 15:20:29
> Subject: Re: [ceph-users] osds udev rules not triggered on reboot (jewel, jessie)
>
> 2016-04-27 15:18 GMT+02:00 Loris Cuoghi:
>> On 27/04/2016 14:45, Karsten Heymann wrote:
>>> one workaround I found was to add
>>>
>>> [Install]
>>> WantedBy=ceph-osd.target
>>>
>>> to /lib/systemd/system/ceph-disk@.service and then manually enable my
>>> disks with
>>>
>>> # systemctl enable ceph-disk\@dev-sdi1
>>> # systemctl start ceph-disk\@dev-sdi1
>>>
>>> That way they at least are started at boot time.
>>
>> Great! But only if the disks keep their device names, right?
>
> Exactly. It's just a little workaround until the real issue is fixed.
>
> +Karsten
>
> This message and its attachments may contain confidential or privileged information that may be protected by law;
> they should not be distributed, used or copied without authorisation.
> If you have received this email in error, please notify the sender and delete this message and its attachments.
> As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
> Thank you.

--
Loïc Dachary, Artisan Logiciel Libre
[ceph-users] Observations after upgrading to latest Hammer (0.94.7)
Hello, I upgraded a staging ceph cluster from latest Firefly to latest Hammer last week. Everything went fine overall and I would like to share my observations so far: a. every OSD upgrade lasts approx. 3 minutes. I doubt there is any way to speed this up though b. rados bench with different block sizes and different numbers of threads produces consistently 15-20% better write/read IOPS/throughput compared to Firefly. Subsequently, CPU load on OSD nodes was lower during the bench c. OSD apply latency increased 2x-3x for all OSDs. No clue though why this is happening. Commit-cycle/journal latencies are at the same level. You may notice the effect for the apply latency in the uploaded image [1] It would be nice if someone else also shares his/her experience after upgrading to Hammer and/or proposes more core metrics that should be looked at after major version upgrades. [1] https://up1.ca/#V5vcso6i8IQ01Se62NJqng Regards, Kostis ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
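For anyone wanting to reproduce observation (b), a typical rados bench run over several block sizes and thread counts looks like this. A sketch only: `testpool` is a placeholder pool name, and the durations/sizes are examples.

```shell
# 4 KiB and 4 MiB writes with 16 and 64 concurrent ops;
# --no-cleanup keeps the objects so the read phases have data to hit
rados bench -p testpool 60 write -b 4096 -t 16 --no-cleanup
rados bench -p testpool 60 write -b 4194304 -t 64 --no-cleanup

# sequential and random reads against the objects written above
rados bench -p testpool 60 seq -t 16
rados bench -p testpool 60 rand -t 64

# remove the benchmark objects afterwards
rados -p testpool cleanup
```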
[ceph-users] Does flushbufs on a rbd-nbd invalidate librbd cache?
Hi All, Does anybody know if calling blockdev --flushbufs on an rbd-nbd device causes the librbd read cache to be invalidated? I've done a quick test and the invalidate_cache counter doesn't increment as it does when you send the invalidate command via the admin socket. Thanks, Nick ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
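A way to check this empirically is to compare the cache counters before and after each operation. This is a sketch: the nbd device, the admin socket path, and the exact admin-socket command names vary by setup and Ceph version, so verify them locally.

```shell
# drop the kernel-side buffers for the nbd device
blockdev --flushbufs /dev/nbd0

# inspect the librbd cache counters through the client admin socket
ceph --admin-daemon /var/run/ceph/ceph-client.admin.$PID.asok perf dump | grep -i invalidate

# for comparison, invalidate the librbd cache explicitly via the admin socket
ceph --admin-daemon /var/run/ceph/ceph-client.admin.$PID.asok rbd cache invalidate
```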
[ceph-users] Regarding executing COSBench onto a specific pool
Hi, In our Ceph cluster, we are currently seeing that COSBench writes IO to the default pools that are created while configuring the RADOS Gateway. Can you please let me know if there is a way to direct IO (using COSBench) to a specific pool? Thanks & Regards, Manoj ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
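One possible approach, offered with the caveat that COSBench talks to RGW in terms of buckets rather than pools: the pool is chosen by the RGW zone's placement configuration, so you can point the zone's bucket data pool at the pool you want and restart radosgw. A sketch; pool and file names below are examples.

```shell
# dump the current zone configuration
radosgw-admin zone get > zone.json

# edit placement_pools -> data_pool in zone.json to the target pool, then:
radosgw-admin zone set --infile zone.json

# restart radosgw so new buckets land in the new data pool
```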
Re: [ceph-users] Inconsistent PGs
Thanks for the response. All OSDs seem to be OK; they have been restarted, joined the cluster after that, nothing weird in the logs. # ceph pg dump_stuck stale ok # ceph pg dump_stuck inactive ok pg_stat state up up_primary acting acting_primary 3.2929 incomplete [109,272,83] 109 [109,272,83] 109 3.1683 incomplete [166,329,281] 166 [166,329,281] 166 # ceph pg dump_stuck unclean ok pg_stat state up up_primary acting acting_primary 3.2929 incomplete [109,272,83] 109 [109,272,83] 109 3.1683 incomplete [166,329,281] 166 [166,329,281] 166 On OSD 166 there are 100 blocked ops (on 109 too); they all end on "event": "reached_pg" # ceph --admin-daemon /var/run/ceph/ceph-osd.166.asok dump_ops_in_flight ... { "description": "osd_op(client.958764031.0:18137113 rbd_data.392585982ae8944a.0ad4 [set-alloc-hint object_size 4194304 write_size 4194304,write 2641920~8192] 3.d6195683 RETRY=15 ack+ondisk+retry+write+known_if_redirected e613241)", "initiated_at": "2016-06-21 10:19:59.894393", "age": 828.025527, "duration": 600.020809, "type_data": [ "reached pg", { "client": "client.958764031", "tid": 18137113 }, [ { "time": "2016-06-21 10:19:59.894393", "event": "initiated" }, { "time": "2016-06-21 10:29:59.915202", "event": "reached_pg" } ] ] } ], "num_ops": 100 } On 06/21/2016 12:27 PM, M Ranga Swami Reddy wrote: > you can use the below cmds: > == > > ceph pg dump_stuck stale > ceph pg dump_stuck inactive > ceph pg dump_stuck unclean > === > > And the query the PG, which are in unclean or stale state, check for > any issue with a specific OSD. > > Thanks > Swami > > On Tue, Jun 21, 2016 at 3:02 PM, Paweł Sadowski wrote: >> Hello, >> >> We have an issue on one of our clusters. One node with 9 OSD was down >> for more than 12 hours. During that time cluster recovered without >> problems. When host back to the cluster we got two PGs in incomplete >> state. We decided to mark OSDs on this host as out but the two PGs are >> still in incomplete state. Trying to query those pg hangs forever. 
We >> were alredy trying restarting OSDs. Is there any way to solve this issue >> without loosing data? Any help appreciate :) >> >> # ceph health detail | grep incomplete >> HEALTH_WARN 2 pgs incomplete; 2 pgs stuck inactive; 2 pgs stuck unclean; >> 200 requests are blocked > 32 sec; 2 osds have slow requests; >> noscrub,nodeep-scrub flag(s) set >> pg 3.2929 is stuck inactive since forever, current state incomplete, >> last acting [109,272,83] >> pg 3.1683 is stuck inactive since forever, current state incomplete, >> last acting [166,329,281] >> pg 3.2929 is stuck unclean since forever, current state incomplete, last >> acting [109,272,83] >> pg 3.1683 is stuck unclean since forever, current state incomplete, last >> acting [166,329,281] >> pg 3.1683 is incomplete, acting [166,329,281] (reducing pool vms >> min_size from 2 may help; search ceph.com/docs for 'incomplete') >> pg 3.2929 is incomplete, acting [109,272,83] (reducing pool vms min_size >> from 2 may help; search ceph.com/docs for 'incomplete') >> >> Directory for PG 3.1683 is present on OSD 166 and containes ~8GB. >> >> We didn't try setting min_size to 1 yet (we treat is as a last resort). 
>> >> >> >> Some cluster info: >> # ceph --version >> >> ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403) >> >> # ceph -s >> health HEALTH_WARN >> 2 pgs incomplete >> 2 pgs stuck inactive >> 2 pgs stuck unclean >> 200 requests are blocked > 32 sec >> noscrub,nodeep-scrub flag(s) set >> monmap e7: 5 mons at >> {mon-03=*.2:6789/0,mon-04=*.36:6789/0,mon-05=*.81:6789/0,mon-06=*.0:6789/0,mon-07=*.40:6789/0} >> election epoch 3250, quorum 0,1,2,3,4 >> mon-06,mon-07,mon-04,mon-03,mon-05 >> osdmap e613040: 346 osds: 346 up, 337 in >> flags noscrub,nodeep-scrub >> pgmap v27163053: 18624 pgs, 6 pools, 138 TB data, 39062 kobjects >> 415 TB used, 186 TB / 601 TB avail >>18622 active+clean >>2 incomplete >> client io 9992 kB/s rd, 64867 kB/s wr, 8458 op/s >> >> >> # ceph osd pool get vms pg_num >> pg_num: 16384 >> >> # ceph osd pool get vms size >> size: 3 >> >> # ceph osd pool get vms min_size >> min_size: 2 -- PS ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Bucket index question
Hello, I have a questions regarding the bucket index: 1) As far as know index of a given bucket is the single RADOS object and it lives in OSD omap. But does it get replicated or not? 2) When trying to copy bucket index pool to some other pool i get the following error: $ rados cppool ed-1.rgw.buckets.index test ed-1.rgw.buckets.index:.dir.06ee966c-5b48-4c53-8ed8-36bbf53204f5.171499.1 => test:.dir.06ee966c-5b48-4c53-8ed8-36bbf53204f5.171499.1 error copying object: (2) No such file or directory error copying pool ed-1.rgw.buckets.index => test: (34) Numerical result out of range and the object is not getting copied. Btw, this particular index servers as an index of bucket with almost 19 million objects and it is not sharded. $ ceph df | grep -e ed-1.rgw.buckets.data -e NAME NAMEID USED %USED MAX AVAIL OBJECTS ed-1.rgw.buckets.data 16 52998G 4.96 769T 38414882 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
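On question 1: since the index is an ordinary RADOS object, it (including its omap data) is replicated according to the index pool's size setting. On question 2, a likely explanation, offered as an assumption to verify on your version: the bucket index keeps its data in omap key/value entries rather than in the object payload, and `rados cppool` copies object data but not omap, which would match the errors above. The omap entries can be inspected directly (the bucket marker below is copied from the error output):

```shell
# list a sample of the omap keys held by the bucket index object
rados -p ed-1.rgw.buckets.index listomapkeys \
    .dir.06ee966c-5b48-4c53-8ed8-36bbf53204f5.171499.1 | head

# the object itself typically has a zero-byte payload, which is
# consistent with the "No such file or directory" copy error
rados -p ed-1.rgw.buckets.index stat \
    .dir.06ee966c-5b48-4c53-8ed8-36bbf53204f5.171499.1
```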
Re: [ceph-users] OSD out/down detection
Regarding your original issue, you may want to configure kdump on one of the machines to get more insight into what is happening when the box hangs/crashes. I faced a similar issue when trying 4.4.8 on my Infernalis cluster (box hangs, black screen, OSD down and out), and as it happens, there were cases with similar traces [0][1]. I didn't have the time at the moment to run more tests, so I went back to using the stock 3.10. Also note that the default kdump behavior on kernel panic is to capture a kernel core dump and restart the server. [0] https://lkml.org/lkml/2016/3/17/570 [1] https://lkml.org/lkml/2016/5/17/136 On Mon, Jun 20, 2016 at 4:12 AM, Adrian Saul wrote: > Hi All, > We have a Jewel (10.2.1) cluster on Centos 7 - I am using an elrepo > 4.4.1 kernel on all machines and we have an issue where some of the > machines hang - not sure if its hardware or OS but essentially the host > including the console is unresponsive and can only be recovered with a > hardware reset. Unfortunately nothing useful is logged so I am still > trying to figure out what is going on to cause this. But the result for > ceph is that if an OSD host goes down like this we have run into an issue > where only some of its OSDs are marked down. In the instance on the > weekend, the host had 8 OSDs and only 5 got marked as down - this lead to > the kRBD devices jamming up trying to send IO to non-responsive OSDs that > stayed marked up. 
> > The machine went into a slow death - lots of reports of slow or blocked > requests: > > 2016-06-19 09:37:49.070810 osd.36 10.145.2.15:6802/31359 65 : cluster > [WRN] 2 slow requests, 2 included below; oldest blocked for > 30.297258 secs > 2016-06-19 09:37:54.071542 osd.36 10.145.2.15:6802/31359 82 : cluster > [WRN] 112 slow requests, 5 included below; oldest blocked for > 35.297988 > secs > 2016-06-19 09:37:54.071737 osd.6 10.145.2.15:6801/21836 221 : cluster > [WRN] 253 slow requests, 5 included below; oldest blocked for > 35.325155 > secs > 2016-06-19 09:37:59.072570 osd.6 10.145.2.15:6801/21836 251 : cluster > [WRN] 262 slow requests, 5 included below; oldest blocked for > 40.325986 > secs > > And then when the monitors did report them down the OSDs disputed that: > > 2016-06-19 09:38:35.821716 mon.0 10.145.2.13:6789/0 244970 : cluster > [INF] osd.6 10.145.2.15:6801/21836 failed (2 reporters from different > host after 20.000365 >= grace 20.00) > 2016-06-19 09:38:36.950556 mon.0 10.145.2.13:6789/0 244978 : cluster > [INF] osd.22 10.145.2.15:6806/21826 failed (2 reporters from different > host after 21.613336 >= grace 20.00) > 2016-06-19 09:38:36.951133 mon.0 10.145.2.13:6789/0 244980 : cluster > [INF] osd.31 10.145.2.15:6812/21838 failed (2 reporters from different > host after 21.613781 >= grace 20.836511) > 2016-06-19 09:38:36.951636 mon.0 10.145.2.13:6789/0 244982 : cluster > [INF] osd.36 10.145.2.15:6802/31359 failed (2 reporters from different > host after 21.614259 >= grace 20.00) > > 2016-06-19 09:38:37.156088 osd.36 10.145.2.15:6802/31359 346 : cluster > [WRN] map e28730 wrongly marked me down > 2016-06-19 09:38:36.002076 osd.6 10.145.2.15:6801/21836 473 : cluster > [WRN] map e28729 wrongly marked me down > 2016-06-19 09:38:37.046885 osd.22 10.145.2.15:6806/21826 374 : cluster > [WRN] map e28730 wrongly marked me down > 2016-06-19 09:38:37.050635 osd.31 10.145.2.15:6812/21838 351 : cluster > [WRN] map e28730 wrongly marked me down > > But shortly 
after > > 2016-06-19 09:43:39.940985 mon.0 10.145.2.13:6789/0 245305 : cluster > [INF] osd.6 out (down for 303.951251) > 2016-06-19 09:43:39.941061 mon.0 10.145.2.13:6789/0 245306 : cluster > [INF] osd.22 out (down for 302.908528) > 2016-06-19 09:43:39.941099 mon.0 10.145.2.13:6789/0 245307 : cluster > [INF] osd.31 out (down for 302.908527) > 2016-06-19 09:43:39.941152 mon.0 10.145.2.13:6789/0 245308 : cluster > [INF] osd.36 out (down for 302.908527) > > 2016-06-19 10:09:10.648924 mon.0 10.145.2.13:6789/0 247076 : cluster > [INF] osd.23 10.145.2.15:6814/21852 failed (2 reporters from different > host after 20.000378 >= grace 20.00) > 2016-06-19 10:09:10.887220 osd.23 10.145.2.15:6814/21852 176 : cluster > [WRN] map e28848 wrongly marked me down > 2016-06-19 10:14:15.160513 mon.0 10.145.2.13:6789/0 247422 : cluster > [INF] osd.23 out (down for 304.288018) > > By the time the issue was eventually escalated and I was able to do > something about it I manual marked the remaining host OSDs down (which > seemed to unclog RBD): > > 2016-06-19 15:25:06.171395 mon.0 10.145.2.13:6789/0 267212 : cluster > [INF] osd.7 10.145.2.15:6808/21837 failed (2 reporters from different > host after 22.000367 >= grace 20.00) > 2016-06-19 15:25:06.171905 mon.0 10.145.2.13:6789/0 267214 : cluster > [INF] osd.24 10.145.2.15:6800/21813 failed (2 reporters from different > host after 22.000748 >= grace 20.710981) > 2016-06-19 15:25:06.172426 mon.0 10.145.2.13:6789/0 267216 : cluster > [INF] osd.37 10.145.2.15:6810/31936
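Setting up kdump on CentOS 7, as suggested above, is roughly the following. A sketch: crash-kernel memory sizing depends on the amount of RAM, so `crashkernel=auto` may need tuning on large boxes.

```shell
# install the kexec/kdump tooling
yum install -y kexec-tools

# reserve memory for the crash kernel on every installed kernel entry
grubby --update-kernel=ALL --args="crashkernel=auto"

# enable the service and reboot so the reservation takes effect
systemctl enable kdump
reboot

# after a subsequent panic, the vmcore appears under /var/crash/ for analysis
```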
Re: [ceph-users] Inconsistent PGs
you can use the below cmds: == ceph pg dump_stuck stale ceph pg dump_stuck inactive ceph pg dump_stuck unclean === And then query the PGs which are in an unclean or stale state, and check for any issue with a specific OSD. Thanks Swami On Tue, Jun 21, 2016 at 3:02 PM, Paweł Sadowski wrote: > Hello, > > We have an issue on one of our clusters. One node with 9 OSD was down > for more than 12 hours. During that time cluster recovered without > problems. When host back to the cluster we got two PGs in incomplete > state. We decided to mark OSDs on this host as out but the two PGs are > still in incomplete state. Trying to query those pg hangs forever. We > were alredy trying restarting OSDs. Is there any way to solve this issue > without loosing data? Any help appreciate :) > > # ceph health detail | grep incomplete > HEALTH_WARN 2 pgs incomplete; 2 pgs stuck inactive; 2 pgs stuck unclean; > 200 requests are blocked > 32 sec; 2 osds have slow requests; > noscrub,nodeep-scrub flag(s) set > pg 3.2929 is stuck inactive since forever, current state incomplete, > last acting [109,272,83] > pg 3.1683 is stuck inactive since forever, current state incomplete, > last acting [166,329,281] > pg 3.2929 is stuck unclean since forever, current state incomplete, last > acting [109,272,83] > pg 3.1683 is stuck unclean since forever, current state incomplete, last > acting [166,329,281] > pg 3.1683 is incomplete, acting [166,329,281] (reducing pool vms > min_size from 2 may help; search ceph.com/docs for 'incomplete') > pg 3.2929 is incomplete, acting [109,272,83] (reducing pool vms min_size > from 2 may help; search ceph.com/docs for 'incomplete') > > Directory for PG 3.1683 is present on OSD 166 and containes ~8GB. > > We didn't try setting min_size to 1 yet (we treat is as a last resort). 
> > > > Some cluster info: > # ceph --version > > ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403) > > # ceph -s > health HEALTH_WARN > 2 pgs incomplete > 2 pgs stuck inactive > 2 pgs stuck unclean > 200 requests are blocked > 32 sec > noscrub,nodeep-scrub flag(s) set > monmap e7: 5 mons at > {mon-03=*.2:6789/0,mon-04=*.36:6789/0,mon-05=*.81:6789/0,mon-06=*.0:6789/0,mon-07=*.40:6789/0} > election epoch 3250, quorum 0,1,2,3,4 > mon-06,mon-07,mon-04,mon-03,mon-05 > osdmap e613040: 346 osds: 346 up, 337 in > flags noscrub,nodeep-scrub > pgmap v27163053: 18624 pgs, 6 pools, 138 TB data, 39062 kobjects > 415 TB used, 186 TB / 601 TB avail >18622 active+clean >2 incomplete > client io 9992 kB/s rd, 64867 kB/s wr, 8458 op/s > > > # ceph osd pool get vms pg_num > pg_num: 16384 > > # ceph osd pool get vms size > size: 3 > > # ceph osd pool get vms min_size > min_size: 2 > > > -- > PS > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] librbd compatibility
The librbd API is stable between releases. While new API methods might be added, the older API methods are kept for backwards compatibility. For example, qemu-kvm under RHEL 7 is built against a librbd from Firefly but can function using a librbd from Jewel. On Tue, Jun 21, 2016 at 1:47 AM, min fang wrote: > Hi, is there a document describing librbd compatibility? For example, > something like this: librbd from Ceph 0.88 can also be applied to > 0.90,0.91.. > > I hope librbd can be kept relatively stable, so we can avoid more code iteration and > testing. > > Thanks. > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- Jason ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Inconsistent PGs
We have already restarted those OSDs and then the whole cluster (rack by rack; the failure domain is rack in this setup). We would like to try the *ceph-objectstore-tool mark-complete* operation. Is there any way (other than checking the mtime on files and querying PGs) to determine which replica has the most up-to-date data? On 06/21/2016 12:37 PM, M Ranga Swami Reddy wrote: > Try to restart OSD 109 and 166? check if it help? > > On Tue, Jun 21, 2016 at 4:05 PM, Paweł Sadowski wrote: >> Thanks for response. >> >> All OSDs seems to be ok, they have been restarted, joined cluster after >> that, nothing weird in the logs. >> >> # ceph pg dump_stuck stale >> ok >> >> # ceph pg dump_stuck inactive >> ok >> pg_stat state up up_primary acting acting_primary >> 3.2929 incomplete [109,272,83] 109 [109,272,83] 109 >> 3.1683 incomplete [166,329,281] 166 [166,329,281] 166 >> >> # ceph pg dump_stuck unclean >> ok >> pg_stat state up up_primary acting acting_primary >> 3.2929 incomplete [109,272,83] 109 [109,272,83] 109 >> 3.1683 incomplete [166,329,281] 166 [166,329,281] 166 >> >> >> On OSD 166 there is 100 blocked ops (on 109 too), they all end on >> "event": "reached_pg" >> >> # ceph --admin-daemon /var/run/ceph/ceph-osd.166.asok dump_ops_in_flight >> ... 
>> { >> "description": "osd_op(client.958764031.0:18137113 >> rbd_data.392585982ae8944a.0ad4 [set-alloc-hint object_size >> 4194304 write_size 4194304,write 2641920~8192] 3.d6195683 RETRY=15 >> ack+ondisk+retry+write+known_if_redirected e613241)", >> "initiated_at": "2016-06-21 10:19:59.894393", >> "age": 828.025527, >> "duration": 600.020809, >> "type_data": [ >> "reached pg", >> { >> "client": "client.958764031", >> "tid": 18137113 >> }, >> [ >> { >> "time": "2016-06-21 10:19:59.894393", >> "event": "initiated" >> }, >> { >> "time": "2016-06-21 10:29:59.915202", >> "event": "reached_pg" >> } >> ] >> ] >> } >> ], >> "num_ops": 100 >> } >> >> >> >> On 06/21/2016 12:27 PM, M Ranga Swami Reddy wrote: >>> you can use the below cmds: >>> == >>> >>> ceph pg dump_stuck stale >>> ceph pg dump_stuck inactive >>> ceph pg dump_stuck unclean >>> === >>> >>> And the query the PG, which are in unclean or stale state, check for >>> any issue with a specific OSD. >>> >>> Thanks >>> Swami >>> >>> On Tue, Jun 21, 2016 at 3:02 PM, Paweł Sadowski wrote: Hello, We have an issue on one of our clusters. One node with 9 OSD was down for more than 12 hours. During that time cluster recovered without problems. When host back to the cluster we got two PGs in incomplete state. We decided to mark OSDs on this host as out but the two PGs are still in incomplete state. Trying to query those pg hangs forever. We were alredy trying restarting OSDs. Is there any way to solve this issue without loosing data? 
Any help appreciate :) # ceph health detail | grep incomplete HEALTH_WARN 2 pgs incomplete; 2 pgs stuck inactive; 2 pgs stuck unclean; 200 requests are blocked > 32 sec; 2 osds have slow requests; noscrub,nodeep-scrub flag(s) set pg 3.2929 is stuck inactive since forever, current state incomplete, last acting [109,272,83] pg 3.1683 is stuck inactive since forever, current state incomplete, last acting [166,329,281] pg 3.2929 is stuck unclean since forever, current state incomplete, last acting [109,272,83] pg 3.1683 is stuck unclean since forever, current state incomplete, last acting [166,329,281] pg 3.1683 is incomplete, acting [166,329,281] (reducing pool vms min_size from 2 may help; search ceph.com/docs for 'incomplete') pg 3.2929 is incomplete, acting [109,272,83] (reducing pool vms min_size from 2 may help; search ceph.com/docs for 'incomplete') Directory for PG 3.1683 is present on OSD 166 and containes ~8GB. We didn't try setting min_size to 1 yet (we treat is as a last resort). Some cluster info: # ceph --version ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403) # ceph -s health HEALTH_WARN 2 pgs incomplete 2 pgs stuck inactive 2 pgs stuck unclean 200 requests are blocked > 32 sec noscrub,nodeep-scrub flag(s) set monmap e7: 5 mons at {mon-03=*.2:6789/0,mon-04=*.36:6789/0,mon-05=*.81:6789/0,mon-06=*.0:6789/0,mon-07=*.40:6789/0} election epoch 3250, quorum
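For reference, the mark-complete path discussed in this thread looks roughly like the following. This is a destructive last-resort sketch, not a recommendation: the op is only present in sufficiently recent releases of ceph-objectstore-tool, the paths below assume a default deployment, and you should export a backup of the PG from every replica before touching anything.

```shell
# stop the OSD holding the replica you believe is most up to date
service ceph stop osd.166

# take a backup of the PG before modifying it
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-166 \
    --journal-path /var/lib/ceph/osd/ceph-166/journal \
    --pgid 3.1683 --op export --file /root/pg.3.1683.export

# mark the PG complete on this replica, then restart the OSD
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-166 \
    --journal-path /var/lib/ceph/osd/ceph-166/journal \
    --pgid 3.1683 --op mark-complete

service ceph start osd.166
```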
[ceph-users] performance issue with jewel on ubuntu xenial (kernel)
Hello, I found a performance drop between kernel 3.13.0-88 (default kernel on Ubuntu Trusty 14.04) and kernel 4.4.0.24.14 (default kernel on Ubuntu Xenial 16.04). The ceph version is Jewel (10.2.2). All tests have been done under Ubuntu 14.04. Kernel 4.4 has a drop of 50% compared to 4.2. Kernel 4.4 has a drop of 40% compared to 3.13. Details below: With the 3 kernels I have the same performance on disks: Raw benchmark: dd if=/dev/zero of=/dev/sdX bs=1M count=1024 oflag=direct => average ~230MB/s dd if=/dev/zero of=/dev/sdX bs=1G count=1 oflag=direct => average ~220MB/s Filesystem mounted benchmark: dd if=/dev/zero of=/sdX1/test.img bs=1G count=1 => average ~205MB/s dd if=/dev/zero of=/sdX1/test.img bs=1G count=1 oflag=direct => average ~214MB/s dd if=/dev/zero of=/sdX1/test.img bs=1G count=1 oflag=sync => average ~190MB/s Ceph OSD benchmark: Kernel 3.13.0-88-generic : ceph tell osd.ID bench => average ~81MB/s Kernel 4.2.0-38-generic : ceph tell osd.ID bench => average ~109MB/s Kernel 4.4.0-24-generic : ceph tell osd.ID bench => average ~50MB/s Does anyone get a similar behaviour on their cluster? Best regards -- Yoann Moulin EPFL IC-IT ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
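The OSD benchmark above is the built-in `bench` command; the total byte count and block size can also be passed explicitly (the values below are the usual defaults of 1 GiB written in 4 MiB blocks, and `osd.0` is an example ID):

```shell
# default: write ~1 GiB to the OSD's backing store and report throughput
ceph tell osd.0 bench

# explicit: total bytes, then block size
ceph tell osd.0 bench 1073741824 4194304
```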
Re: [ceph-users] Inconsistent PGs
Try to restart OSD 109 and 166? check if it help? On Tue, Jun 21, 2016 at 4:05 PM, Paweł Sadowskiwrote: > Thanks for response. > > All OSDs seems to be ok, they have been restarted, joined cluster after > that, nothing weird in the logs. > > # ceph pg dump_stuck stale > ok > > # ceph pg dump_stuck inactive > ok > pg_statstateupup_primaryactingacting_primary > 3.2929incomplete[109,272,83]109[109,272,83]109 > 3.1683incomplete[166,329,281]166[166,329,281]166 > > # ceph pg dump_stuck unclean > ok > pg_statstateupup_primaryactingacting_primary > 3.2929incomplete[109,272,83]109[109,272,83]109 > 3.1683incomplete[166,329,281]166[166,329,281]166 > > > On OSD 166 there is 100 blocked ops (on 109 too), they all end on > "event": "reached_pg" > > # ceph --admin-daemon /var/run/ceph/ceph-osd.166.asok dump_ops_in_flight > ... > { > "description": "osd_op(client.958764031.0:18137113 > rbd_data.392585982ae8944a.0ad4 [set-alloc-hint object_size > 4194304 write_size 4194304,write 2641920~8192] 3.d6195683 RETRY=15 > ack+ondisk+retry+write+known_if_redirected e613241)", > "initiated_at": "2016-06-21 10:19:59.894393", > "age": 828.025527, > "duration": 600.020809, > "type_data": [ > "reached pg", > { > "client": "client.958764031", > "tid": 18137113 > }, > [ > { > "time": "2016-06-21 10:19:59.894393", > "event": "initiated" > }, > { > "time": "2016-06-21 10:29:59.915202", > "event": "reached_pg" > } > ] > ] > } > ], > "num_ops": 100 > } > > > > On 06/21/2016 12:27 PM, M Ranga Swami Reddy wrote: >> you can use the below cmds: >> == >> >> ceph pg dump_stuck stale >> ceph pg dump_stuck inactive >> ceph pg dump_stuck unclean >> === >> >> And the query the PG, which are in unclean or stale state, check for >> any issue with a specific OSD. >> >> Thanks >> Swami >> >> On Tue, Jun 21, 2016 at 3:02 PM, Paweł Sadowski wrote: >>> Hello, >>> >>> We have an issue on one of our clusters. One node with 9 OSD was down >>> for more than 12 hours. 
During that time cluster recovered without >>> problems. When host back to the cluster we got two PGs in incomplete >>> state. We decided to mark OSDs on this host as out but the two PGs are >>> still in incomplete state. Trying to query those pg hangs forever. We >>> were alredy trying restarting OSDs. Is there any way to solve this issue >>> without loosing data? Any help appreciate :) >>> >>> # ceph health detail | grep incomplete >>> HEALTH_WARN 2 pgs incomplete; 2 pgs stuck inactive; 2 pgs stuck unclean; >>> 200 requests are blocked > 32 sec; 2 osds have slow requests; >>> noscrub,nodeep-scrub flag(s) set >>> pg 3.2929 is stuck inactive since forever, current state incomplete, >>> last acting [109,272,83] >>> pg 3.1683 is stuck inactive since forever, current state incomplete, >>> last acting [166,329,281] >>> pg 3.2929 is stuck unclean since forever, current state incomplete, last >>> acting [109,272,83] >>> pg 3.1683 is stuck unclean since forever, current state incomplete, last >>> acting [166,329,281] >>> pg 3.1683 is incomplete, acting [166,329,281] (reducing pool vms >>> min_size from 2 may help; search ceph.com/docs for 'incomplete') >>> pg 3.2929 is incomplete, acting [109,272,83] (reducing pool vms min_size >>> from 2 may help; search ceph.com/docs for 'incomplete') >>> >>> Directory for PG 3.1683 is present on OSD 166 and containes ~8GB. >>> >>> We didn't try setting min_size to 1 yet (we treat is as a last resort). 
>>> >>> >>> >>> Some cluster info: >>> # ceph --version >>> >>> ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403) >>> >>> # ceph -s >>> health HEALTH_WARN >>> 2 pgs incomplete >>> 2 pgs stuck inactive >>> 2 pgs stuck unclean >>> 200 requests are blocked > 32 sec >>> noscrub,nodeep-scrub flag(s) set >>> monmap e7: 5 mons at >>> {mon-03=*.2:6789/0,mon-04=*.36:6789/0,mon-05=*.81:6789/0,mon-06=*.0:6789/0,mon-07=*.40:6789/0} >>> election epoch 3250, quorum 0,1,2,3,4 >>> mon-06,mon-07,mon-04,mon-03,mon-05 >>> osdmap e613040: 346 osds: 346 up, 337 in >>> flags noscrub,nodeep-scrub >>> pgmap v27163053: 18624 pgs, 6 pools, 138 TB data, 39062 kobjects >>> 415 TB used, 186 TB / 601 TB avail >>>18622 active+clean >>>2 incomplete >>> client io 9992 kB/s rd, 64867 kB/s wr, 8458 op/s >>> >>> >>> # ceph osd pool get vms pg_num >>> pg_num: 16384 >>> >>> # ceph osd pool get
[ceph-users] Inconsistent PGs
Hello, We have an issue on one of our clusters. One node with 9 OSDs was down for more than 12 hours. During that time the cluster recovered without problems. When the host came back to the cluster we got two PGs in an incomplete state. We decided to mark the OSDs on this host as out, but the two PGs are still in an incomplete state. Trying to query those PGs hangs forever. We have already tried restarting OSDs. Is there any way to solve this issue without losing data? Any help appreciated :) # ceph health detail | grep incomplete HEALTH_WARN 2 pgs incomplete; 2 pgs stuck inactive; 2 pgs stuck unclean; 200 requests are blocked > 32 sec; 2 osds have slow requests; noscrub,nodeep-scrub flag(s) set pg 3.2929 is stuck inactive since forever, current state incomplete, last acting [109,272,83] pg 3.1683 is stuck inactive since forever, current state incomplete, last acting [166,329,281] pg 3.2929 is stuck unclean since forever, current state incomplete, last acting [109,272,83] pg 3.1683 is stuck unclean since forever, current state incomplete, last acting [166,329,281] pg 3.1683 is incomplete, acting [166,329,281] (reducing pool vms min_size from 2 may help; search ceph.com/docs for 'incomplete') pg 3.2929 is incomplete, acting [109,272,83] (reducing pool vms min_size from 2 may help; search ceph.com/docs for 'incomplete') Directory for PG 3.1683 is present on OSD 166 and contains ~8GB. We didn't try setting min_size to 1 yet (we treat it as a last resort). 
Some cluster info: # ceph --version ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403) # ceph -s health HEALTH_WARN 2 pgs incomplete 2 pgs stuck inactive 2 pgs stuck unclean 200 requests are blocked > 32 sec noscrub,nodeep-scrub flag(s) set monmap e7: 5 mons at {mon-03=*.2:6789/0,mon-04=*.36:6789/0,mon-05=*.81:6789/0,mon-06=*.0:6789/0,mon-07=*.40:6789/0} election epoch 3250, quorum 0,1,2,3,4 mon-06,mon-07,mon-04,mon-03,mon-05 osdmap e613040: 346 osds: 346 up, 337 in flags noscrub,nodeep-scrub pgmap v27163053: 18624 pgs, 6 pools, 138 TB data, 39062 kobjects 415 TB used, 186 TB / 601 TB avail 18622 active+clean 2 incomplete client io 9992 kB/s rd, 64867 kB/s wr, 8458 op/s # ceph osd pool get vms pg_num pg_num: 16384 # ceph osd pool get vms size size: 3 # ceph osd pool get vms min_size min_size: 2 -- PS ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] MDS failover, how to speed it up?
I plan to add more logging and the other info you have asked for at the next MDS restart. As this cluster is being used in production, I have a limited maintenance window, so unless I find time outside this window you will have to wait until Sunday/Monday to get the logs. @John, yes, I have used the "ceph mds fail " command, but I would like to do it again with a bit more logging, just to be sure. @Zheng, it might be due to pressure on the MDS server, but I don't see a critically high load on the MDS server (~0.4) and see ~90 Mbit of traffic to and from the MDS on average. Also an extra question: when doing a "df -i" on the CephFS mountpoint, I get a high inode count which looks like all the inodes on all the OSDs combined divided by the number of replicas; is this assumption correct? Please let me know if any more info is needed. Regards On 20 June 2016 at 14:09, Yan, Zheng wrote: > On Mon, Jun 20, 2016 at 7:04 PM, Brian Lagoni wrote: > > Are anyone here able to help us with a question about mds failover? > > > > The case is that we are hitting a bug in ceph which requires us to > restart > > the mds every week. > > There is a bug and PR for it here - > https://github.com/ceph/ceph/pull/9456 > > but until this have been resolved we need to do a restart. Unless there > are > > a better workaround for this bug? > > > > The issue we are having are when we do a failover, the time it takes for > the > > cephfs kernel client to recover are high enough so that the vm guests > using > > this cephfs are having timeouts to they storage and therefor enters > readonly > > mode. > > > > We have tried with making a failover to another mds or restarting the mds > > while it's the only mds in the cluser and in both cases our cephfs kernel > > client are taking too long to recover. > > We have also tried to set the failover MDS into "MDS_STANDBY_REPLAY" mode > > which didn't help on this matter. 
> > > > When doing a failover all IOPS against ceph are being blocked for 2-5 min > > until the kernel cephfs clients recovers after some timeouts messages > like > > these: > > "2016-06-19 19:09:55.573739 7faaf8f48700 0 log_channel(cluster) log > [WRN] : > > slow request 75.141028 seconds old, received at 2016-06-19 > 19:08:40.432655: > > client_request(client.4283066:4164703242 getattr pAsLsXsFs #1fe > > 2016-06-19 19:08:40.429496) currently failed to rdlock, waiting" > > After this there is a huge spike i IOPS data starts to being processed > > again. > > > > I'm not sure if any of this can be related to this warning which are > present > > 90% of the day. > > "mds0: Behind on trimming (94/30)"? > > I have searched the mailing list for clues and answers on what to do > about > > this but haven't found anything which have helped us. > > We have move/isolated the MDS service to it's own VM with the fastest > > processor we having, without any real changes to this warning. > > > > Our infrastructure is the following: > > - We use CEPH/CEPHFS (10.2.1) > > - We have 3 mons and 6 storage servers with a total of 36 OSDs (~4160 > PGs). > > - We have one main mds and one standby mds. > > - The primary MDS is a virtual machine with 8 core E5-2643 v3 @ > > 3.40GHz(steal time=0), 16G mem > > - We are using ceph kernel client to mount cephfs. > > - Ubuntu 16.04 (4.4.0-22-generic kernel) > > - The OSD's are physical machines with 8 cores & 32GB memory > > - All networking is 10Gb > > > > So at the end are there anything we can do to make the failover and > recovery > > to go faster? > > I guess your MDS is very busy. there are lots of inodes in client > cache. Please run 'ceph daemon mds.xxx session ls' before restarting > the MDS, and send the output to us. 
> > Regards > Yan, Zheng > > > > > > Regards, > > Brian Lagoni > > System administrator, Engineering Tools > > Unity Technologies > > > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
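Regarding the persistent "mds0: Behind on trimming (94/30)" warning in this thread, one commonly suggested mitigation is to raise the MDS log trimming limits while watching the journal counters. The specific values below are assumptions to be tuned for the workload, not recommended settings; verify the option names on your Jewel build before injecting them.

```shell
# inspect the current journal/segment counters on the active MDS
ceph daemon mds.<name> perf dump mds_log

# allow more log segments to be kept and expired per trimming pass
ceph tell mds.<name> injectargs '--mds_log_max_segments=60 --mds_log_max_expiring=40'
```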