Re: [ceph-users] Help needed porting Ceph to RSockets
So I've had a chance to revisit this since Bécholey Alexandre was kind enough to let me know how to compile Ceph with the RDMACM library (thank you again!). At this stage it compiles and runs, but there appears to be a problem with calling rshutdown in Pipe: it seems to wait forever for the pipe to close, which causes commands like 'ceph osd tree' to hang indefinitely after they complete successfully. Debug MS is here - http://pastebin.com/WzMJNKZY

I also tried RADOS bench but it appears to be doing something similar. Debug MS is here - http://pastebin.com/3aXbjzqS

It seems like it's very close to working... I must be missing something small that's causing some grief. You can see the OSD coming up in the ceph monitor and the PGs all become active+clean. When shutting down the monitor I get the output below, which shows it waiting for the pipes to close -

    2013-08-09 15:08:31.339394 7f4643cfd700 20 accepter.accepter closing
    2013-08-09 15:08:31.382075 7f4643cfd700 10 accepter.accepter stopping
    2013-08-09 15:08:31.382115 7f464bd397c0 20 -- 172.16.0.1:6789/0 wait: stopped accepter thread
    2013-08-09 15:08:31.382127 7f464bd397c0 20 -- 172.16.0.1:6789/0 wait: stopping reaper thread
    2013-08-09 15:08:31.382146 7f4645500700 10 -- 172.16.0.1:6789/0 reaper_entry done
    2013-08-09 15:08:31.382182 7f464bd397c0 20 -- 172.16.0.1:6789/0 wait: stopped reaper thread
    2013-08-09 15:08:31.382194 7f464bd397c0 10 -- 172.16.0.1:6789/0 wait: closing pipes
    2013-08-09 15:08:31.382200 7f464bd397c0 10 -- 172.16.0.1:6789/0 reaper
    2013-08-09 15:08:31.382205 7f464bd397c0 10 -- 172.16.0.1:6789/0 reaper done
    2013-08-09 15:08:31.382210 7f464bd397c0 10 -- 172.16.0.1:6789/0 wait: waiting for pipes 0x3014c80,0x3015180,0x3015400 to close

The git repo has been updated if anyone has a few spare minutes to take a look - https://github.com/funkBuild/ceph-rsockets

Thanks again
-Matt

On Thu, Jun 20, 2013 at 5:09 PM, Matthew Anderson manderson8...@gmail.com wrote:

Hi All,

I've had a few conversations on IRC about getting RDMA support into Ceph and thought I would give it a quick attempt to hopefully spur some interest. What I would like to accomplish is an RSockets-only implementation so I'm able to use Ceph, RBD and QEMU at full speed over an Infiniband fabric. What I've tried to do is port Pipe.cc and Accepter.cc to rsockets by replacing the regular socket calls with the rsocket equivalents.
Unfortunately it doesn't compile and I get the following errors -

    CXXLD  ceph-osd
    ./.libs/libglobal.a(libcommon_la-Accepter.o): In function `Accepter::stop()':
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:243: undefined reference to `rshutdown'
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:251: undefined reference to `rclose'
    ./.libs/libglobal.a(libcommon_la-Accepter.o): In function `Accepter::entry()':
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:213: undefined reference to `raccept'
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:230: undefined reference to `rclose'
    ./.libs/libglobal.a(libcommon_la-Accepter.o): In function `Accepter::bind(entity_addr_t const&, int, int)':
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:61: undefined reference to `rsocket'
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:80: undefined reference to `rsetsockopt'
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:87: undefined reference to `rbind'
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:118: undefined reference to `rgetsockname'
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:128: undefined reference to `rlisten'
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:100: undefined reference to `rbind'
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:87: undefined reference to `rbind'
    ./.libs/libglobal.a(libcommon_la-Pipe.o): In function `Pipe::tcp_write(char const*, int)':
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:2175: undefined reference to `rsend'
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:2162: undefined reference to `rshutdown'
    ./.libs/libglobal.a(libcommon_la-Pipe.o): In function `Pipe::do_sendmsg(msghdr*, int, bool)':
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:1867: undefined reference to `rsendmsg'
    ./.libs/libglobal.a(libcommon_la-Pipe.o): In function `Pipe::tcp_read_nonblocking(char*, int)':
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:2129: undefined reference to `rrecv'
    ./.libs/libglobal.a(libcommon_la-Pipe.o): In function `Pipe::tcp_read(char*, int)':
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:2079: undefined reference to `rshutdown'
    ./.libs/libglobal.a(libcommon_la-Pipe.o): In function `Pipe::connect()':
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:768: undefined reference to `rclose'
    /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:773: undefined reference to `rsocket'
Re: [ceph-users] Openstack glance ceph rbd_store_user authentification problem
Hi, thanks for your answers. It was my fault. I configured everything at the beginning of the [DEFAULT] section of glance-api.conf and overlooked the default settings further down (the default Ubuntu glance-api.conf has a default "RBD Store Options" part later in the file).

On 08/08/2013 05:04 PM, Josh Durgin wrote:

On 08/08/2013 06:01 AM, Steffen Thorhauer wrote:

Hi, recently I had a problem with openstack glance and ceph. I used the http://ceph.com/docs/master/rbd/rbd-openstack/#configuring-glance documentation and the http://docs.openstack.org/developer/glance/configuring.html documentation. I'm using ubuntu 12.04 LTS with grizzly from the Ubuntu Cloud Archive and ceph 0.61.7.

glance-api.conf had the following config options:

    default_store = rbd
    rbd_store_user = images
    rbd_store_pool = images
    rbd_store_ceph_conf = /etc/ceph/ceph.conf

Every time I did a glance image create I got errors. In the glance api log I only found errors like:

    2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images Traceback (most recent call last):
    2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images   File /usr/lib/python2.7/dist-packages/glance/api/v1/images.py, line 444, in _upload
    2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images     image_meta['size'])
    2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images   File /usr/lib/python2.7/dist-packages/glance/store/rbd.py, line 241, in add
    2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images     with rados.Rados(conffile=self.conf_file, rados_id=self.user) as conn:
    2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images   File /usr/lib/python2.7/dist-packages/rados.py, line 134, in __enter__
    2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images     self.connect()
    2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images   File /usr/lib/python2.7/dist-packages/rados.py, line 192, in connect
    2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images     raise make_ex(ret, error calling connect)
    2013-08-08 10:25:38.021 5725 TRACE glance.api.v1.images ObjectNotFound: error calling connect

This trace message did not help me very much :-( My google search for "glance.api.v1.images ObjectNotFound: error calling connect" only found http://irclogs.ceph.widodh.nl/index.php?date=2012-10-26. This pointed me to a ceph authentication problem, but the ceph tools worked fine for me. Then I tried the debug option in glance-api.conf and I found the following entries:

    DEBUG glance.common.config [-] rbd_store_pool = images log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:1485
    DEBUG glance.common.config [-] rbd_store_user = glance log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:1485

The glance-api service did not use my rbd_store_user = images option!! Then I configured a client.glance auth and it worked with the implicit glance user!!!

Now my question: am I the only one with this problem?

I've seen people have this issue before due to the way glance-api.conf can have multiple sections. Make sure those rbd settings are in the [DEFAULT] section, not just at the bottom of the file (which may be a different section).

Regards,
Steffen Thorhauer
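For anyone hitting the same trace, a quick way to separate a cephx problem from a glance configuration problem is to make the exact same connection glance's RBD store makes, but from a standalone script. This is only a minimal sketch, assuming a cephx user named "images" and the /etc/ceph/ceph.conf path from the thread; adjust both to your setup.

    # Minimal sketch: reproduce the rados.Rados() call from glance/store/rbd.py
    # outside of glance. If this fails with ObjectNotFound, the cephx user or
    # keyring is the problem; if it succeeds, look at glance-api.conf instead.
    import rados

    # rados_id must match the rbd_store_user glance is supposed to use
    with rados.Rados(conffile='/etc/ceph/ceph.conf', rados_id='images') as cluster:
        print("connected, fsid: %s" % cluster.get_fsid())
        print("pool 'images' exists: %s" % cluster.pool_exists('images'))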
Re: [ceph-users] ceph-deploy behind corporate firewalls
Hi,

I was able to use ceph-deploy behind a proxy by defining the appropriate environment variables used by wget, i.e. on Ubuntu just add the following to /etc/environment:

    http_proxy=http://host:port
    ftp_proxy=http://host:port
    https_proxy=http://host:port

Regards,
Luc.

----- Original Message -----
From: Harvey Skinner hpmpe...@gmail.com
To: ceph-users@lists.ceph.com
Cc: Harvey Skinner hpmpe...@gmail.com
Sent: Friday, 9 August 2013 05:48:35
Subject: [ceph-users] ceph-deploy behind corporate firewalls

Hi all,

I am not sure if I am the only one having issues with ceph-deploy behind a firewall or not; I haven't seen any other reports of similar issues yet. With HTTP proxies I am able to have apt-get working, but wget is still an issue. I am working to use the newer ceph-deploy mechanism to deploy my next POC setup on four storage nodes. The ceph-deploy install process unfortunately uses wget to retrieve the Ceph release key, and this is failing the install. To get around this I can manually add the Ceph release key on all my nodes and apt-get install all the Ceph packages. The question, though, is whether there is anything else that ceph-deploy does that I would need to do manually to have everything in a state where ceph-deploy would work correctly for the rest of the cluster setup and deployment, i.e. ceph-deploy new -and- ceph-deploy mon create, etc.?

thank you,
Harvey
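A small sanity check along the same lines: confirm the proxy variables from /etc/environment are actually visible to a non-interactive process on the node before blaming ceph-deploy, since ceph-deploy invokes wget remotely without a login shell. The sketch below is only illustrative and the URL is a placeholder, not the actual key location.

    # Hypothetical check, not part of ceph-deploy: verify that the *_proxy settings
    # from /etc/environment are seen and usable by a plain Python 2 process.
    import os
    import urllib2

    for var in ('http_proxy', 'https_proxy', 'ftp_proxy'):
        print('%s = %s' % (var, os.environ.get(var)))

    url = 'http://ceph.com/'  # placeholder URL; substitute whatever wget was fetching
    try:
        # urllib2 honours the *_proxy environment variables by default
        response = urllib2.urlopen(url, timeout=10)
        print('fetched %s, HTTP status %s' % (url, response.getcode()))
    except Exception as exc:
        print('fetch through proxy failed: %s' % exc)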
[ceph-users] pgs stuck unclean -- how to fix? (fwd)
Hi,

I have a 5 node ceph cluster that is running well (no problems using any of the rbd images, and that's really all we use). I have replication set to 3 on all three pools (data, metadata and rbd). ceph -s reports:

    health HEALTH_WARN 3 pgs degraded; 114 pgs stuck unclean; recovery 5746/384795 degraded (1.493%)

I have tried everything I could think of to clear/fix those errors and they persist. Most of them appear to be a problem with not having 3 copies:

    0.2a0 0 0 0 0 0 0 0 active+remapped 2013-08-06 05:40:07.874427 0'0 21920'388 [4,7] [4,7,8] 0'0 2013-08-04 08:59:34.035198 0'0 2013-07-29 01:49:40.018625
    4.1d9 260 0 238 0 1021055488 0 0 active+remapped 2013-08-06 05:56:20.447612 21920'12710 21920'53408 [6,13] [6,13,4] 0'0 2013-08-05 06:59:44.717555 0'0 2013-08-05 06:59:44.717555
    1.1dc 0 0 0 0 0 0 0 active+remapped 2013-08-06 05:55:44.687830 0'0 21920'3003 [6,13] [6,13,4] 0'0 2013-08-04 10:56:51.226012 0'0 2013-07-28 23:47:13.404512
    0.1dd 0 0 0 0 0 0 0 active+remapped 2013-08-06 05:55:44.687525 0'0 21920'3003 [6,13] [6,13,4] 0'0 2013-08-04 10:56:45.258459 0'0 2013-08-01 05:58:17.141625
    1.29f 0 0 0 0 0 0 0 active+remapped 2013-08-06 05:40:07.882865 0'0 21920'388 [4,7] [4,7,8] 0'0 2013-08-04 09:01:40.075441 0'0 2013-07-29 01:53:10.068503
    1.118 0 0 0 0 0 0 0 active+remapped 2013-08-06 05:50:34.081067 0'0 21920'208 [8,15] [8,15,5] 0'0 2034-02-12 23:20:03.933842 0'0 2034-02-12 23:20:03.933842
    0.119 0 0 0 0 0 0 0 active+remapped 2013-08-06 05:50:34.095446 0'0 21920'208 [8,15] [8,15,5] 0'0 2034-02-12 23:18:07.310080 0'0 2034-02-12 23:18:07.310080
    4.115 248 0 226 0 987364352 0 0 active+remapped 2013-08-06 05:50:34.112139 21920'6840 21920'42982 [8,15] [8,15,5] 0'0 2013-08-05 06:59:18.303823 0'0 2013-08-05 06:59:18.303823
    4.4a241 0 286 0 941573120 0 0 active+degraded 2013-08-06 12:00:47.758742 21920'85238 21920'206648 [4,6] [4,6] 0'0 2013-08-05 06:58:36.681726 0'0 2013-08-05 06:58:36.681726
    0.4e0 0 0 0 0 0 0 active+remapped 2013-08-06 12:00:47.765391 0'0 21920'489 [4,6] [4,6,1] 0'0 2013-08-04 08:58:12.783265 0'0 2013-07-28 14:21:38.227970

Can anyone suggest a way to clear this up?

Thanks!
Jeff
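Not a fix, but for reference a rough way to summarise which PGs are stuck and where they are mapped is to parse `ceph pg dump_stuck unclean`. The sketch below assumes the JSON field names ('pgid', 'state', 'up', 'acting'); check the raw output of your release before relying on them.

    # Rough helper: list stuck-unclean PGs whose up or acting set is smaller than
    # the pool's replication level (3 in this thread). Field names are assumptions.
    import json
    import subprocess

    POOL_SIZE = 3

    raw = subprocess.check_output(['ceph', 'pg', 'dump_stuck', 'unclean', '--format=json'])
    for pg in json.loads(raw):
        up = pg.get('up', [])
        acting = pg.get('acting', [])
        if len(up) < POOL_SIZE or len(acting) < POOL_SIZE:
            print('%s %s up=%s acting=%s' % (pg.get('pgid'), pg.get('state'), up, acting))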
Re: [ceph-users] pgs stuck unclean -- how to fix? (fwd)
On 08/09/2013 10:58 AM, Jeff Moskow wrote:

Hi, I have a 5 node ceph cluster that is running well (no problems using any of the rbd images and that's really all we use). I have replication set to 3 on all three pools (data, metadata and rbd). ceph -s reports: health HEALTH_WARN 3 pgs degraded; 114 pgs stuck unclean; recovery 5746/384795 degraded (1.493%). I have tried everything I could think of to clear/fix those errors and they persist.

Did you restart the primary OSD for those PGs?

Wido

--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
Hi Josh,

just opened http://tracker.ceph.com/issues/5919 with all collected information incl. debug-log.

Hope it helps,

Oliver.

On 08/08/2013 07:01 PM, Josh Durgin wrote:

On 08/08/2013 05:40 AM, Oliver Francke wrote:

Hi Josh,

I have a session logged with:

    debug_ms=1:debug_rbd=20:debug_objectcacher=30

as you requested from Mike, even if I think we have another story here, anyway. Host kernel is 3.10.0-rc7, qemu-client 1.6.0-rc2, client kernel is 3.2.0-51-amd... Do you want me to open a ticket for that stuff? I have about 5MB of compressed logfile waiting for you ;)

Yes, that'd be great. If you could include the time when you saw the guest hang, that'd be ideal. I'm not sure if this is one or two bugs, but it seems likely it's a bug in rbd and not qemu. Thanks!

Josh

On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote:

On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote:

On 02.08.2013 at 23:47, Mike Dawson mike.daw...@cloudapt.com wrote:

We can un-wedge the guest by opening a NoVNC session or running a 'virsh screenshot' command. After that, the guest resumes and runs as expected. At that point we can examine the guest. Each time we'll see:

If virsh screenshot works then this confirms that QEMU itself is still responding. Its main loop cannot be blocked since it was able to process the screendump command. This supports Josh's theory that a callback is not being invoked. The virtio-blk I/O request would be left in a pending state.

Now here is where the behavior varies between configurations: on a Windows guest with 1 vCPU, you may see the symptom that the guest no longer responds to ping. On a Linux guest with multiple vCPUs, you may see the hung task message from the guest kernel because other vCPUs are still making progress. Just the vCPU that issued the I/O request and whose task is in UNINTERRUPTIBLE state would really be stuck.

Basically, the symptoms depend not just on how QEMU is behaving but also on the guest kernel and how many vCPUs you have configured. I think this can explain how both problems you are observing, Oliver and Mike, are a result of the same bug. At least I hope they are :).

Stefan

--
Oliver Francke
filoo GmbH
Moltkestraße 25a
0 Gütersloh
HRB4355 AG Gütersloh

Managing directors: J.Rehpöhler | C.Kunz

Follow us on Twitter: http://twitter.com/filoogmbh
[ceph-users] All old pgs in stale after recreating all osds
On CentOS 6.4, Ceph 0.61.7. I had a ceph cluster of 9 osds. Today I destroyed all of the osds and recreated 6 new ones. Now I find all the old pgs are stale.

    [root@ceph0 ceph]# ceph -s
      health HEALTH_WARN 192 pgs stale; 192 pgs stuck inactive; 192 pgs stuck stale; 192 pgs stuck unclean
      monmap e1: 3 mons at {ceph0=172.18.11.60:6789/0,ceph1=172.18.11.61:6789/0,ceph2=172.18.11.62:6789/0}, election epoch 24, quorum 0,1,2 ceph0,ceph1,ceph2
      osdmap e166: 6 osds: 6 up, 6 in
      pgmap v837: 192 pgs: 192 stale; 9526 bytes data, 221 MB used, 5586 GB / 5586 GB avail
      mdsmap e114: 0/0/1 up

    [root@ceph0 ~]# ceph health detail
    ...
    pg 2.3 is stuck stale for 10249.230667, current state stale, last acting [5]
    ...

    [root@ceph0 ~]# ceph pg 2.3 query
    i don't have pgid 2.3

How can I get all the pgs back or recreated? Thanks!
Re: [ceph-users] pgs stuck unclean -- how to fix? (fwd)
Thanks for the suggestion. I had tried stopping each OSD for 30 seconds, restarting it, waiting 2 minutes and then doing the next one (all OSDs were eventually restarted). I tried this twice.
[ceph-users] mounting a pool via fuse
Hi,

I'm using ceph 0.61.7. When using ceph-fuse, I couldn't find a way to mount only one pool. Is there a way to mount a single pool, or is it simply not supported?

Kind Regards,
Georg
[ceph-users] do we need to install ceph on KVM hypervisor for cloudstack-ceph integration
Hi,

To access the storage cluster from a KVM hypervisor, what packages need to be installed on the hypervisor (do we need to install qemu and ceph on the KVM host for CloudStack-Ceph integration)? My hypervisor version is RHEL 6.3.

Regards
Sadhu
Re: [ceph-users] do we need to install ceph on KVM hypervisor for cloudstack-ceph integration
On 08/09/2013 01:51 PM, Suresh Sadhu wrote:

Hi, to access the storage cluster from a KVM hypervisor, what packages need to be installed on the hypervisor (do we need to install qemu and ceph on the KVM host for CloudStack-Ceph integration)?

You only need librbd and librados. The Ceph CLI tools and such are not mandatory, but won't hurt anything. Both libvirt and Qemu will link against librbd, which links to librados.

Wido

My hypervisor version is RHEL 6.3.

Regards
Sadhu

--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
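For reference, a quick way to confirm that the hypervisor's qemu build can actually talk to RBD (independent of CloudStack) is to check whether qemu-img lists rbd among its supported formats. This is a minimal sketch, assuming qemu-img is on the PATH of the KVM host.

    # Minimal check: does the installed qemu-img know about the rbd format?
    # If it does not, qemu was built without librbd support.
    import subprocess

    proc = subprocess.Popen(['qemu-img', '--help'],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output, _ = proc.communicate()  # help text ends with the supported format list
    print('rbd listed as a supported format: %s' % (b'rbd' in output))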
Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
I can confirm that I am having similar issues with ubuntu vm guests using fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally I see hung tasks, occasionally the guest vm stops responding without leaving anything in the logs, and sometimes I see a kernel panic on the console. I typically leave the runtime of the fio test at 60 minutes and it tends to stop responding after about 10-30 mins.

I am on ubuntu 12.04 with the 3.5 kernel backport and using ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2

Andrei

----- Original Message -----
From: Oliver Francke oliver.fran...@filoo.de
To: Josh Durgin josh.dur...@inktank.com
Cc: ceph-users@lists.ceph.com, Mike Dawson mike.daw...@cloudapt.com, Stefan Hajnoczi stefa...@redhat.com, qemu-de...@nongnu.org
Sent: Friday, 9 August, 2013 10:22:00 AM
Subject: Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]

Hi Josh,

just opened http://tracker.ceph.com/issues/5919 with all collected information incl. debug-log.

Hope it helps,

Oliver.
Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
On Fri, Aug 09, 2013 at 03:05:22PM +0100, Andrei Mikhailovsky wrote:

I can confirm that I am having similar issues with ubuntu vm guests using fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally I see hung tasks, occasionally the guest vm stops responding without leaving anything in the logs, and sometimes I see a kernel panic on the console. I typically leave the runtime of the fio test at 60 minutes and it tends to stop responding after about 10-30 mins. I am on ubuntu 12.04 with the 3.5 kernel backport and using ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2

Josh,

In addition to the Ceph logs, you can also use QEMU tracing with the following events enabled:

    virtio_blk_handle_write
    virtio_blk_handle_read
    virtio_blk_rw_complete

See docs/tracing.txt for details on usage. Inspecting the trace output will let you observe the I/O request submission/completion from the virtio-blk device perspective. You'll be able to see whether requests are never being completed in some cases.

This bug seems like a corner case or race condition, since most requests seem to complete just fine. The problem is that eventually the virtio-blk device becomes unusable when it runs out of descriptors (it has 128). And before that limit is reached, the guest may become unusable due to the hung I/O requests.

Stefan
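To make that concrete, here is a rough sketch of the kind of post-processing this enables: pairing each virtio_blk submission with a later completion in a trace file and flagging requests that never complete. The exact trace line format (an event name followed by "req 0x...") is an assumption; adjust the regex to match what your QEMU build emits (see docs/tracing.txt).

    # Rough sketch: report virtio_blk requests that were submitted but never completed.
    import re
    import sys

    EVENT_RE = re.compile(r'(virtio_blk_handle_(?:read|write)|virtio_blk_rw_complete)\b.*?req (0x[0-9a-fA-F]+)')
    pending = {}  # request pointer -> submission line

    with open(sys.argv[1]) as trace:
        for line in trace:
            match = EVENT_RE.search(line)
            if not match:
                continue
            event, req = match.groups()
            if event == 'virtio_blk_rw_complete':
                pending.pop(req, None)   # request finished
            else:
                pending[req] = line.strip()  # request submitted

    for req, line in pending.items():
        print('never completed: %s (%s)' % (req, line))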
[ceph-users] CEPH-DEPLOY TRIALS/EVALUATION RESULT ON CEPH VERSION 0.61.7
CEPH-DEPLOY EVALUATION ON CEPH VERSION 0.61.7

ADMIN NODE:

    root@ubuntuceph900athf1:~# ceph -v
    ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)

SERVER NODE:

    root@ubuntuceph700athf1:/etc/ceph# ceph -v
    ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)

===
Trial 1 of using ceph-deploy (http://ceph.com/docs/next/start/quick-ceph-deploy/):

My trial-1 scenario is using ceph-deploy to replace 2 OSDs (osd.2 and osd.11) of a ceph node.

Observation: ceph-deploy creates symbolic links for the ceph-0 -> ceph-2 dir and ceph-1 -> ceph-11 dir. I did not run into any errors or issues in this trial.

One concern: ceph-deploy did not update the Linux fstab with the mount point of the osd data.

===
Trial 2 (http://ceph.com/docs/next/start/quick-ceph-deploy/):

I noticed my node did not have any contents in /var/lib/ceph/bootstrap-{osd}|{mds}.

Result: FAILURE TO MOVE FORWARD BEYOND THIS STEP.

Tip from http://ceph.com/docs/next/start/quick-ceph-deploy/: "If you don't have these keyrings, you may not have created a monitor successfully, or you may have a problem with your network connection. Ensure that you complete this step such that you have the foregoing keyrings before proceeding further."

Tip from http://ceph.com/docs/next/start/quick-ceph-deploy/: "You may repeat this procedure. If it fails, check to see if the /var/lib/ceph/bootstrap-{osd}|{mds} directories on the server node have keyrings. If they do not have keyrings, try adding the monitor again; then, return to this step."

My workaround 1: copied the contents of /var/lib/ceph/bootstrap-{osd}|{mds} from another node.
My workaround 2: used the "create a new cluster" procedure with ceph-deploy to create the missing keyrings.

===
Trial 3: attempt to build a new cluster/1-node using ceph-deploy.

Result: FAILED TO GET BEYOND THE ERRORS BELOW.

    root@ubuntuceph900athf1:~/my-cluster# ceph-deploy osd prepare ubuntuceph700athf1:sde1:/var/lib/ceph/journal/osd.0.journal
    ceph-disk-prepare -- /dev/sde1 /var/lib/ceph/journal/osd.0.journal returned 1
    meta-data=/dev/sde1              isize=2048   agcount=4, agsize=30524098 blks
             =                       sectsz=512   attr=2, projid32bit=0
    data     =                       bsize=4096   blocks=122096390, imaxpct=25
             =                       sunit=0      swidth=0 blks
    naming   =version 2              bsize=4096   ascii-ci=0
    log      =internal log           bsize=4096   blocks=59617, version=2
             =                       sectsz=512   sunit=0 blks, lazy-count=1
    realtime =none                   extsz=4096   blocks=0, rtextents=0
    WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the same device as the osd data
    umount: /var/lib/ceph/tmp/mnt.iMsc1G: device is busy.
            (In some cases useful info about processes that use the device is found by lsof(8) or fuser(1))
    ceph-disk: Unmounting filesystem failed: Command '['/bin/umount', '--', '/var/lib/ceph/tmp/mnt.iMsc1G']' returned non-zero exit status 1
    ceph-deploy: Failed to create 1 OSDs

    root@ubuntuceph900athf1:~/my-cluster# ceph-deploy osd prepare ubuntuceph700athf1:sde1
    ceph-disk-prepare -- /dev/sde1 returned 1
    meta-data=/dev/sde1              isize=2048   agcount=4, agsize=30524098 blks
             =                       sectsz=512   attr=2, projid32bit=0
    data     =                       bsize=4096   blocks=122096390, imaxpct=25
             =                       sunit=0      swidth=0 blks
    naming   =version 2              bsize=4096   ascii-ci=0
    log      =internal log           bsize=4096   blocks=59617, version=2
             =                       sectsz=512   sunit=0 blks, lazy-count=1
    realtime =none                   extsz=4096   blocks=0, rtextents=0
    umount: /var/lib/ceph/tmp/mnt.0JxBp1: device is busy.
            (In some cases useful info about processes that use the device is found by lsof(8) or fuser(1))
    ceph-disk: Unmounting filesystem failed: Command '['/bin/umount', '--', '/var/lib/ceph/tmp/mnt.0JxBp1']' returned non-zero exit status 1
    ceph-deploy: Failed to create 1 OSDs

    root@ubuntuceph900athf1:~/my-cluster# ceph-deploy osd prepare ubuntuceph700athf1:sde1
    ceph-disk-prepare -- /dev/sde1 returned 1
    ceph-disk: Error: Device is mounted: /dev/sde1
    ceph-deploy: Failed to create 1 OSDs

Attempted on the local node:

    root@ubuntuceph700athf1:/etc/ceph# ceph-deploy osd prepare ubuntuceph700athf1:sde1:/var/lib/ceph/journal/osd.0.journal
    ceph-disk-prepare -- /dev/sde1 /var/lib/ceph/journal/osd.0.journal returned 1
    ceph-disk: Error: Device is mounted: /dev/sde1
    /dev/sde1 on /var/lib/ceph/tmp/mnt.GzZLAr type xfs (rw,noatime)

RESULT: ceph-deploy complains that the osd drive is mounted. It was not mounted prior to running the command; ceph-deploy mounted it, then complains that it is mounted.
Re: [ceph-users] ceph-deploy behind corporate firewalls
On Fri, Aug 9, 2013 at 1:34 AM, Luc Dumaine lduma...@sitiv.fr wrote:

Hi, I was able to use ceph-deploy behind a proxy, by defining the appropriate environment variables used by wget, i.e. on Ubuntu just add to /etc/environment: http_proxy=http://host:port ftp_proxy=http://host:port https_proxy=http://host:port

Thanks for letting us know, this definitely sounds useful. I will add it to the docs so someone having a similar issue can have a workaround for now.
Re: [ceph-users] STGT targets.conf example
Awesome. Thanks Darryl. Do you want to propose a fix to stgt, or shall I?

On Aug 8, 2013 7:21 PM, Darryl Bond db...@nrggos.com.au wrote:

Dan,

I found that the tgt-admin perl script looks for a local file:

    if (-e $backing_store && ! -d $backing_store && $can_alloc == 1) {

A bit nasty, but I created some empty files relative to / with the same path as the RBD backing store, which worked around the problem:

    mkdir /iscsi-spin
    touch /iscsi-spin/test

That lets me restart tgtd and have the LUN created properly.

tgt-admin --dump is also not that useful; it doesn't output the backing store type.

    # tgt-admin --dump
    default-driver iscsi

    <target iqn.2013.com.ceph:test>
        backing-store iscsi-spin/test
        initiator-address 192.168.6.100
    </target>

Darryl

On 08/09/13 07:23, Dan Mick wrote:

On 08/04/2013 10:15 PM, Darryl Bond wrote:

I am testing scsi-target-utils tgtd with RBD support. I have successfully created an iscsi target using RBD as an iscsi target and tested it. It backs onto a rados pool iscsi-spin with an RBD called test. Now I want it to survive a reboot. I have created a conf file:

    <target iqn.2008-09.com.ceph:test>
        <backing-store iscsi-spin/test>
            bs-type rbd
            path iscsi-spin/test
        </backing-store>
    </target>

When I restart tgtd it creates the target but doesn't connect the backing store. The tool tgt-admin has a test mode for the configuration file:

    [root@cephgw conf.d]# tgt-admin -p -e
    # Adding target: iqn.2008-09.com.ceph:test
    tgtadm -C 0 --lld iscsi --op new --mode target --tid 1 -T iqn.2008-09.com.ceph:test
    # Skipping device: iscsi-spin/test
    # iscsi-spin/bashful-spin does not exist - please check the configuration file
    tgtadm -C 0 --lld iscsi --op bind --mode target --tid 1 -I ALL

It looks to me like tgtd supports RBD backing stores but the configuration utilities don't.

I have not tried config files or tgt-admin to any great extent, and it doesn't look to me like there are backend dependencies in those tools (or I would have modified them at the time :)), but, that said, there may be some weird problem. tgt-admin is a Perl script that could be instrumented to figure out what's going on. I do know that the syntax of the config file is dicey.

Anyone tried this? What have I missed?

Regards
Darryl
Re: [ceph-users] Why is my mon store.db 220GB?
On 07/08/13 15:14, Jeppesen, Nelson wrote:

Joao,

Have you had a chance to look at my monitor issues? I ran 'ceph-mon -i FOO --compact' last week but it did not improve disk usage. Let me know if there's anything else I can dig up. The monitor is still at 0.67-rc2 with the OSDs at 0.61.7.

Hi Nelson,

It's been a crazy week, and I haven't had the opportunity to dive into the compaction issues -- we've been tying up the last loose ends for the dumpling release.

Btw, I just noticed that you mentioned in your previous email that the 'mon compact on start = true' flag made your monitor hang. Well, that was not a hang per se. If you try that again and take a look at IO on the mon store, you should see the monitor doing loads of it. That's leveldb compacting. It should take a while. A considerable while. As I previously mentioned, 10G stores can take a while to compact -- a 220GB store will take even longer.

However, regardless of how we eventually fix this whole thing, you'll need to compact your store. I seriously doubt there's a way out of it. Well, there may be another way out of it, but that would involve a bit of trickery to get the leveldb contents out of the store and into a new, fresh store, which would seem a lot like a last resort.

But feel free to ping me on IRC and we'll try to figure something out.

-Joao

On 08/02/2013 12:15 AM, Jeppesen, Nelson wrote:

Thanks for the reply, but how can I fix this without an outage? I tried adding 'mon compact on start = true' but the monitor just hung. Unfortunately this is a production cluster and can't take the outages (I'm assuming the cluster will fail without a monitor). I had three monitors; I was hit with the store.db bug and lost two of the three. I have tried running with 0.61.5, 0.61.7 and 0.67-rc2. None of them seem to shrink the DB.

My guess is that the compaction policies we are enforcing won't cover the portions of the store that haven't been compacted *prior* to the upgrade. Even today we still know of users with stores growing over dozens of GBs, requiring occasional restarts to compact (which is far from an acceptable fix). Some of these stores can take several minutes to compact when the monitors are restarted, although these users can often mitigate any downtime by restarting monitors one at a time while maintaining quorum. Unfortunately you don't have that luxury. :-\

If however you are willing to manually force a compaction, you should be able to do so with 'ceph-mon -i FOO --compact'. Now, there is a possibility this is why you've been unable to add other monitors to the cluster. Chances are that the iterators used to synchronize the store get stuck, or move slowly enough that all sorts of funny timeouts are triggered.

I intend to look into your issue (especially the problems with adding new monitors) in the morning to better assess what's happening.

-Joao

-----Original Message-----
From: Mike Dawson [mailto:mike.dawson at cloudapt.com]
Sent: Thursday, August 01, 2013 4:10 PM
To: Jeppesen, Nelson
Cc: ceph-users at lists.ceph.com
Subject: Re: [ceph-users] Why is my mon store.db 220GB?

220GB is way, way too big. I suspect your monitors need to go through a successful leveldb compaction. The early releases of Cuttlefish suffered several issues with store.db growing unbounded. Most were fixed by 0.61.5, I believe.

You may have luck stopping all Ceph daemons, then starting the monitor by itself. When there were bugs, leveldb compaction tended to work better without OSD traffic hitting the monitors.

Also, there are some settings to force a compaction on startup, like 'mon compact on start = true' and 'mon compact on trim = true'. I don't think either is required anymore, though. See some history here: http://tracker.ceph.com/issues/4895

Thanks,
Mike Dawson
Co-Founder
Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250

On 8/1/2013 6:52 PM, Jeppesen, Nelson wrote:

My Mon store.db has been at 220GB for a few months now. Why is this and how can I fix it? I have one monitor in this cluster and I suspect that I can't add monitors to the cluster because it is too big. Thank you.

--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
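For reference, one way to confirm that a "hung" monitor is actually compacting, as described above, is to watch the size of its store.db over time. A minimal sketch, assuming the default mon data path (/var/lib/ceph/mon/ceph-FOO for a monitor with id FOO); adjust the path for your deployment.

    # Rough sketch: report the monitor store size once a minute while compaction runs.
    import os
    import time

    STORE = '/var/lib/ceph/mon/ceph-FOO/store.db'  # assumption: default mon data path

    def store_size(path):
        total = 0
        for dirpath, _dirnames, filenames in os.walk(path):
            for name in filenames:
                try:
                    total += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    pass  # leveldb may delete files while we walk the tree
        return total

    while True:
        print('%s store.db size: %.1f GB' % (time.strftime('%H:%M:%S'), store_size(STORE) / 1e9))
        time.sleep(60)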
[ceph-users] [ANN] ceph-deploy v1.2 has been released!
I am very pleased to announce the release of ceph-deploy to the Python Package Index. The OS packages are yet to come; I will make sure to update this thread when they do.

For now, if you are familiar with Python install tools, you can install directly from PyPI with pip or easy_install:

    pip install ceph-deploy

or

    easy_install ceph-deploy

This release includes a massive effort for better error reporting and granular information on remote hosts (for the `install` and `mon create` commands for now). There were about 18 bug fixes and improvements too, including in upstream libraries that are used by ceph-deploy.

If you find any issues with ceph-deploy, please make sure you let me know via this list or on IRC at #ceph!

Enjoy!

-Alfredo
Re: [ceph-users] [ANN] ceph-deploy v1.2 has been released!
Hi!

Awesome :)) Thanks for such great work!

Cheers,
Sébastien

On 10.08.2013 02:52, Alfredo Deza wrote:

I am very pleased to announce the release of ceph-deploy to the Python Package Index. The OS packages are yet to come; I will make sure to update this thread when they do.