Re: [ceph-users] core dump: qemu-img info -f rbd
Hi,

> I got a core dump when executing:
> root@ceph-node1:~# qemu-img info -f rbd rbd:vm_disks/box1_disk1

Try leaving out -f rbd from the command - I have seen that make a difference before.

--
Jens Kristian Søgaard, Mermaid Consulting ApS, j...@mermaidconsulting.dk, http://www.mermaidconsulting.com/
[ceph-users] Re: core dump: qemu-img info -f rbd
Yes, it made a difference:

root@ceph-node1:~# qemu-img info rbd:vm_disks/box1_disk1
image: rbd:vm_disks/box1_disk1
file format: raw
virtual size: 10G (10737418240 bytes)
disk size: unavailable

I'm not sure if qemu-img guessed the format correctly. Does the above output seem normal?

Thanks!

------------------ Original Message ------------------
From: Jens Kristian Søgaard <j...@mermaidconsulting.dk>
Sent: Thursday, June 6, 2013, 2:12 PM
To: 大椿 <ng...@qq.com>
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] core dump: qemu-img info -f rbd

Hi,

> I got a core dump when executing:
> root@ceph-node1:~# qemu-img info -f rbd rbd:vm_disks/box1_disk1

Try leaving out -f rbd from the command - I have seen that make a difference before.

--
Jens Kristian Søgaard, Mermaid Consulting ApS, j...@mermaidconsulting.dk, http://www.mermaidconsulting.com/
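A quick way to cross-check the guess independently of qemu-img - a sketch only, assuming the rbd CLI is available on the node and the image lives in the vm_disks pool as above:

rbd info vm_disks/box1_disk1    # reports size, object order and the RBD image format (1 or 2)
rbd ls -l vm_disks              # long listing of the pool, with size and format per image

The "file format: raw" line from qemu-img is expected, since an RBD image is presented to QEMU as a raw block device; "disk size: unavailable" most likely just means qemu-img cannot query allocated-on-disk usage through the rbd driver.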
Re: [ceph-users] ceph-osd constantly crashing
Good day!

Thank you, but it's not clear to me what the bottleneck is here:
- the hardware node - load average, disk I/O;
- an underlying file system problem on the OSD, or a bad disk;
- a Ceph journal problem.

The Ceph OSD partition is part of a block device which has practically no load:

Device:    tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda      12,00         0,00         0,12          0          0

Device:    tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda      12,00         0,00         0,14          0          0

The disk with the OSD is good - I just checked it and it has good r/w speed with appropriate IOPS and latency. But the hardware node is working hard and has a high load average. I fear that the ceph-osd process lacks resources. Is there any way to fix it? Maybe raise some kind of timeout when syncing, or give this OSD less weight, or so? Or is it better to move this OSD to another server?

Regards, Artem Silenkov, 2GIS TM.
---
2GIS LLC http://2gis.ru
a.silenkov at 2gis.ru
gtalk: artem.silenkov at gmail.com
cell: +79231534853

2013/6/5 Gregory Farnum g...@inktank.com:

This would be easier to see with a log than with all the GDB stuff, but the reference in the backtrace to SyncEntryTimeout::finish(int) tells me that the filesystem is taking too long to sync things to disk. Either this disk is bad or you're somehow subjecting it to a much heavier load than the others.
-Greg

On Wednesday, June 5, 2013, Artem Silenkov wrote:

Good day! I tried to nullify this osd and reinject it, with no success. It works a little bit, then crashes again.

Regards, Artem Silenkov, 2GIS TM.

2013/6/5 Artem Silenkov artem.silen...@gmail.com:

Hello! We have a simple setup as follows:

Debian GNU/Linux 6.0 x64
Linux h08 2.6.32-19-pve #1 SMP Wed May 15 07:32:52 CEST 2013 x86_64 GNU/Linux

ii ceph            0.61.2-1~bpo60+1   distributed storage and file system
ii ceph-common     0.61.2-1~bpo60+1   common utilities to mount and interact with a ceph storage cluster
ii ceph-fs-common  0.61.2-1~bpo60+1   common utilities to mount and interact with a ceph file system
ii ceph-fuse       0.61.2-1~bpo60+1   FUSE-based client for the Ceph distributed file system
ii ceph-mds        0.61.2-1~bpo60+1   metadata server for the ceph distributed file system
ii libcephfs1      0.61.2-1~bpo60+1   Ceph distributed file system client library
ii libc-bin        2.11.3-4           Embedded GNU C Library: Binaries
ii libc-dev-bin    2.11.3-4           Embedded GNU C Library: Development binaries
ii libc6           2.11.3-4           Embedded GNU C Library: Shared libraries
ii libc6-dev       2.11.3-4           Embedded GNU C Library: Development Libraries and Header Files

All programs are running fine except osd.2, which is crashing repeatedly. All other nodes have the same operating system onboard and the system environment is quite identical.

#cat /etc/ceph/ceph.conf
[global]
    pid file = /var/run/ceph/$name.pid
    auth cluster required = none
    auth service required = none
    auth client required = none
    max open files = 65000

[mon]
[mon.0]
    host = h01
    mon addr = 10.1.1.3:6789
[mon.1]
    host = h07
    mon addr = 10.1.1.10:6789
[mon.2]
    host = h08
    mon addr = 10.1.1.11:6789

[mds]
[mds.3]
    host = h09
[mds.4]
    host = h06

[osd]
    osd journal size = 1
    osd journal = /var/lib/ceph/journal/$cluster-$id/journal
    osd mkfs type = xfs
[osd.0]
    host = h01
    addr = 10.1.1.3
    devs = /dev/sda3
[osd.1]
    host = h07
    addr = 10.1.1.10
    devs = /dev/sda3
[osd.2]
    host = h08
    addr = 10.1.1.11
    devs = /dev/sda3
[osd.3]
    host = h09
    addr = 10.1.1.12
    devs = /dev/sda3
[osd.4]
    host = h06
    addr = 10.1.1.9
    devs = /dev/sda3

~#ceph osd tree
# id    weight  type name          up/down  reweight
-1      5       root default
-3      5         rack unknownrack
-2      1           host h01
0       1             osd.0        up       1
-4      1           host h07
1       1             osd.1        up       1
-5      1           host h08
2       1             osd.2        down     0
-6      1           host h09
3       1             osd.3        up       1
-7      1           host h06
4       1             osd.4        up       1
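Two knobs that relate to Artem's question about sync timeouts and weight - a sketch only; the option and command below are assumptions about what might help rather than a confirmed fix:

# ceph.conf, [osd] (or [osd.2]) section: raise the filestore sync deadline
# (the SyncEntryTimeout in the backtrace appears to be governed by this; the default is 600s)
filestore commit timeout = 1200

# push less data at osd.2 by lowering its CRUSH weight
ceph osd crush reweight osd.2 0.5

Lowering the CRUSH weight triggers data movement, so it is only worth doing if the host really cannot be relieved of its other load.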
[ceph-users] Ceph in OpenNebulaConf2013. Deadline for talk proposals
Hello everyone,

We have just published an extended keynote line-up for the OpenNebula Conference 2013 (http://blog.opennebula.org/?p=4707) that includes experts from leading institutions using OpenNebula. The first ever OpenNebula Conference will be held on Sept. 24-26 in Berlin and is intended to serve as a meeting point for OpenNebula cloud users, developers, administrators, builders, integrators and researchers.

I am sending this email to this list because, as you all know, Ceph is getting massive traction in the cloud computing world, and we are seeing an increasing amount of OpenNebula users deploying their cloud with Ceph. We would love to have people from the Ceph user base speak at the OpenNebulaConf about their experiences with OpenNebula, or about anything you see fit, such as optimizing Ceph for virtualization environments.

There are only 10 days left to submit talk proposals (the deadline is June 15th): http://opennebulaconf.com/proposals/

Looking forward to welcoming you personally in Berlin!

Cheers,
Jaime

--
Join us at OpenNebulaConf2013 http://opennebulaconf.com/ in Berlin, 24-26 September, 2013
--
Jaime Melis
Project Engineer
OpenNebula - The Open Source Toolkit for Cloud Computing
www.OpenNebula.org | jme...@opennebula.org
[ceph-users] cannot use dd to initialize rbd
Hi all,

I want to do some performance tests on kernel rbd, and I set up a ceph cluster with 4 hosts; every host has 20 OSDs, and the journals of the OSDs are on a separate SSD partition.

First, I created 48 RBDs and mapped them to six clients, 8 RBDs for each client, then I executed the following command to initialize each rbd:

dd if=/dev/zero bs=1M of=/dev/rbd1

I was able to initialize all 48 RBDs.

Second, I created another 48 RBDs and mapped them to the six clients, so every client has 12 RBDs. 'rbd showmapped' gives the right output; however, when I initialize the later 48 RBDs as I did before, it does not succeed. The error output is:

dd: writing `/dev/rbd10': No space left on device

Any ideas will be appreciated. If you need more information, please let me know. Thanks in advance.
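A couple of quick checks that might narrow this down - a sketch, run on one of the clients, with /dev/rbd10 taken from the error above and pool/image standing in for the real names:

rbd showmapped                        # confirm which pool/image /dev/rbd10 maps to
blockdev --getsize64 /dev/rbd10       # size the kernel thinks the device has, in bytes
rbd info pool/image                   # size the cluster thinks the image has
ceph df                               # overall cluster usage

If the kernel-side size is wrong, dd hitting ENOSPC early would be expected; if the sizes agree, the cluster itself may genuinely be short on space.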
Re: [ceph-users] Problems with OSDs (cuttlefish)
Hi,

I do not think that ceph-deploy osd prepare/deploy/create actually works when run on a partition. It was returning successfully for me, but wouldn't actually add any OSDs to the configuration and associate them with a host. No errors, but also no result; I had to revert back to using mkcephfs.

Ilja

-----Original Message-----
From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alvaro Izquierdo Jimeno
Sent: Thursday, June 06, 2013 2:26 AM
To: John Wilkins
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Problems with OSDs (cuttlefish)

Hi John,

Thanks for your answer! But maybe I haven't installed cuttlefish correctly on my hosts:

sudo initctl list | grep ceph  ->  none

No ceph-all found anywhere. Steps that I have done to install cuttlefish:

sudo rpm --import 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
sudo su -c 'rpm -Uvh http://ceph.com/rpm-cuttlefish/el6/x86_64/ceph-release-1-0.el6.noarch.rpm'
sudo yum install ceph

Thanks a lot and best regards,
Álvaro.

-----Original Message-----
From: John Wilkins [mailto:john.wilk...@inktank.com]
Sent: Thursday, June 6, 2013 2:48
To: Alvaro Izquierdo Jimeno
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Problems with OSDs (cuttlefish)

You can also start/stop an individual daemon this way:

sudo stop ceph-osd id=0
sudo start ceph-osd id=0

On Wed, Jun 5, 2013 at 4:33 PM, John Wilkins john.wilk...@inktank.com wrote:

Ok. It's more like this:

sudo initctl list | grep ceph

This lists all your ceph scripts and their state.

To start the cluster: sudo start ceph-all
To stop the cluster: sudo stop ceph-all

You can also do the same with all OSDs, MDSs, etc. I'll write it up and check it in.

On Wed, Jun 5, 2013 at 3:16 PM, John Wilkins john.wilk...@inktank.com wrote:

Alvaro, I ran into this too. Clusters running with ceph-deploy now use upstart.

start ceph
stop ceph

Should work. I'm testing and will update the docs shortly.

On Wed, Jun 5, 2013 at 7:41 AM, Alvaro Izquierdo Jimeno aizquie...@aubay.es wrote:

Hi all,

I already installed Ceph Bobtail on CentOS machines and it runs perfectly. But now I have to install Ceph Cuttlefish over Red Hat 6.4. I have two machines (for the moment); we can assume the hostnames IP1 and IP2 ;). I want (just to test) two monitors (one per host) and two osds (one per host).

On both machines I have an XFS logical volume:

Disk /dev/mapper/lvceph: X GB, Y bytes

The logical volume is formatted with XFS (sudo mkfs.xfs -f -i size=2048 /dev/mapper/lvceph) and mounted. In /etc/fstab I have:

/dev/mapper/lvceph  /ceph  xfs  defaults,inode64,noatime  0 2

After using ceph-deploy to install (ceph-deploy install --stable cuttlefish IP1 IP2), create (ceph-deploy new IP1 IP2) and add the two monitors (ceph-deploy --overwrite-conf mon create IP1 and ceph-deploy --overwrite-conf mon create IP2), I want to add the two osds:

ceph-deploy osd prepare IP1:/ceph
ceph-deploy osd activate IP1:/ceph
ceph-deploy osd prepare IP2:/ceph
ceph-deploy osd activate IP2:/ceph

But no osd is up (nor in):

#sudo ceph -d osd stat
e3: 2 osds: 0 up, 0 in

#sudo ceph osd tree
# id    weight  type name      up/down  reweight
-1      0       root default
0       0         osd.0        down     0
1       0         osd.1        down     0

I tried to start both osds:

#sudo /etc/init.d/ceph -a start osd.0
/etc/init.d/ceph: osd.0 not found (/etc/ceph/ceph.conf defines , /var/lib/ceph defines )

I suppose I have something wrong in the ceph-deploy osd prepare or activate steps, but can anybody help me to find it? Is it necessary to add anything more to /etc/ceph/ceph.conf? Now it looks like:

[global]
filestore_xattr_use_omap = true
mon_host = the_ip_of_IP1, the_ip_of_IP2
osd_journal_size = 1024
mon_initial_members = IP1,IP2
auth_supported = cephx
fsid = 43501eb5-e8cf-4f89-a4e2-3c93ab1d9cc5

Thanks in advance and best regards,
Álvaro

--
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599
http://inktank.com
Re: [ceph-users] Problems with OSDs (cuttlefish)
Hi,

Same behavior with ceph version 0.61.2. But with ceph version 0.63-359-g02946e5, ceph-deploy osd prepare never finishes.

Thanks,
Álvaro.

-----Original Message-----
From: Ilja Maslov [mailto:ilja.mas...@openet.us]
Sent: Thursday, June 6, 2013 14:43
To: Alvaro Izquierdo Jimeno; John Wilkins
Cc: ceph-users@lists.ceph.com
Subject: RE: [ceph-users] Problems with OSDs (cuttlefish)

Hi,

I do not think that ceph-deploy osd prepare/deploy/create actually works when run on a partition. It was returning successfully for me, but wouldn't actually add any OSDs to the configuration and associate them with a host. No errors, but also no result; I had to revert back to using mkcephfs.

Ilja

[...]
[ceph-users] Block Storage thin provisioning with Ubuntu 12.04?
Hi all,

Sorry if the question has already been answered. We are thinking about using Ceph for our OpenStack implementation. As far as I know, thin provisioning is not available in Ubuntu 12.04 since it does not include LVM2. Does Ceph have any dependency on LVM, or is thin provisioning supported by its own functionality?

Regards,
Morgan KORCHIA
Re: [ceph-users] Block Storage thin provisioning with Ubuntu 12.04?
On 6 June 2013 15:02, Morgan KORCHIA morgan.korc...@gmail.com wrote:

> As far as I know, thin provisioning is not available in ubuntu 12.04 since it does not include LVM2.

Hi,

Fairly sure it does:

$ lvchange --version
  LVM version:     2.02.66(2) (2010-05-20)
  Library version: 1.02.48 (2010-05-20)
  Driver version:  4.23.1
$ dpkg -l lvm2
ii  lvm2  2.02.66-4ubunt  The Linux Logical Volume Manager
$ lsb_release -d
Description:    Ubuntu 12.04.2 LTS

But in answer to your question, RBD devices are created sparsely and only consume space when you write to them, if that is what you were thinking of using with OpenStack.

Regards,
Damien
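A small illustration of that sparse behaviour - a sketch, assuming a pool named rbd exists and using a made-up image name; rbd diff is available in recent releases:

rbd create rbd/test-thin --size 10240      # 10 GB virtual size, nothing allocated yet
rbd info rbd/test-thin                     # reports the full 10 GB virtual size
# sum the extents that have actually been written - on a fresh image this prints 0
rbd diff rbd/test-thin | awk '{sum += $2} END {print sum+0, "bytes used"}'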
Re: [ceph-users] Having trouble using CORS in radosgw
Are you trying to set the CORS header using a user other than the user who created the bucket?

Yehuda

On Wed, Jun 5, 2013 at 8:25 AM, Mike Bryant mike.bry...@ocado.com wrote:

Hi,

I'm having trouble setting a CORS policy on a bucket. Using the boto python library, I can create a bucket and so on, but when I try to get or set the CORS policy, radosgw responds with a 403:

<?xml version="1.0" encoding="UTF-8"?><Error><Code>AccessDenied</Code></Error>

Would anyone be able to help me with where I'm going wrong? (This is radosgw 0.61, so it's listed as being supported in the changelog.)

Cheers
Mike

--
Mike Bryant | Systems Administrator | Ocado Technology
mike.bry...@ocado.com | 01707 382148 | www.ocadotechnology.com
Re: [ceph-users] Having trouble using CORS in radosgw
No, I'm using the same user. I have in fact tried it as close as possible to the actual creation, to be sure I'm using the same credentials. i.e. using boto: bucket = boto.create_bucket(...), followed by bucket.set_cors().

Mike

On 6 June 2013 15:51, Yehuda Sadeh yeh...@inktank.com wrote:

Are you trying to set the CORS header using a user other than the user who created the bucket?

Yehuda

[...]

--
Mike Bryant | Systems Administrator | Ocado Technology
mike.bry...@ocado.com | 01707 382148 | www.ocadotechnology.com
Re: [ceph-users] Problems with OSDs (cuttlefish)
Hi Sage,

Thank you a lot. This explains a lot of the questions I had. You and John have been very helpful. I will try again with my ceph setup.

Best Regards,
Dewan

On Thu, Jun 6, 2013 at 9:23 PM, Sage Weil s...@inktank.com wrote:

On Thu, 6 Jun 2013, Ilja Maslov wrote:
> Hi,
> I do not think that ceph-deploy osd prepare/deploy/create actually works when run on a partition. It was returning successfully for me, but wouldn't actually add any OSDs to the configuration and associate them with a host. No errors, but also no result, had to revert back to using mkcephfs.

What is supposed to happen:

- ceph-deploy osd create HOST:DIR will create a symlink in /var/lib/ceph/osd/ to DIR.
- The admin is responsible for mounting DIR on boot.
- sysvinit or upstart will see the symlink to DIR and start the daemon.

It is possible that the osd create step also isn't triggering the daemon start at the appropriate time. Sorry, this isn't as well tested a use case.

s

[...]
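A sketch of the directory-based flow Sage describes, using the host and path names from Alvaro's setup; the OSD id (0) and the exact contents of the data directory are assumptions:

# on IP1: the XFS volume must be mounted on /ceph at boot (admin's responsibility)
grep lvceph /etc/fstab

# from the admin node: prepare and activate the directory
ceph-deploy osd create IP1:/ceph

# back on IP1: activation should leave a data dir or symlink under /var/lib/ceph/osd/
ls -l /var/lib/ceph/osd/

# start the daemon by hand if it was not started automatically
sudo /etc/init.d/ceph start osd.0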
Re: [ceph-users] How many Pipe per Ceph OSD daemon will keep?
On Thu, Jun 6, 2013 at 12:25 AM, Chen, Xiaoxi xiaoxi.c...@intel.com wrote:

> Hi,
> From the code, each pipe (which contains a TCP socket) forks 2 threads, a reader and a writer. We really do observe 100+ threads per OSD daemon with 30 instances of rados bench as clients.
> But this number seems a bit crazy: if I have a 40-disk node, I will have 40 OSDs, and we plan to have 6 such nodes to serve 120 VMs from OpenStack. Since an RBD is distributed across all the OSDs, we can expect that every single OSD daemon will have 120 TCP sockets, which means 240 threads. Thus, with 40 OSDs per node, we will have 9600 threads per node. This thread number seems incredible.
> Is there any internal mechanism to track and manage the number of pipes? And another question: why do we need so many threads? Why not epoll?

Yep, right now the OSD maintains two threads per connection. That hasn't been a problem so far (they're fairly cheap threads) and people run into other limits much earlier - 40 OSDs/node, for instance, would require a lot of compute power anyway.

epoll is a good idea and is something we're aware of, but it hasn't been necessary yet and would involve mucking around with some fairly sensitive core code, so it hasn't risen to the top of anybody's queue.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
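For anyone who wants to check the per-daemon thread count on their own nodes, a quick sketch (run on an OSD host; nlwp is the number of light-weight processes, i.e. threads):

for p in $(pidof ceph-osd); do
    echo "ceph-osd pid $p: $(ps -o nlwp= -p $p) threads"
done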
Re: [ceph-users] ceph repair details
Repair does the equivalent of a deep-scrub to find problems. It mostly reads object data/omap/xattrs to create checksums and compares them across all copies. When a discrepancy is identified, an arbitrary copy which did not have I/O errors is selected and used to re-write the other replicas.

David Zafman
Senior Developer
http://www.inktank.com

On May 25, 2013, at 12:33 PM, Mike Lowe j.michael.l...@gmail.com wrote:

> Does anybody know exactly what ceph repair does? Could you list out briefly the steps it takes? I unfortunately need to use it for an inconsistent pg.
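For reference, a sketch of the usual sequence when a pg is flagged inconsistent - the pg id 2.5a is made up for the example:

ceph health detail | grep inconsistent    # find the affected pg(s)
ceph pg deep-scrub 2.5a                   # optionally re-verify before repairing
ceph pg repair 2.5a                       # rewrite the bad replicas from a clean copy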
Re: [ceph-users] OSD recovery over probably corrupted data
Yep, it was so. The disks were mounted with nobarrier (a bad idea for XFS :) ), and in my case corruption happened despite the presence of a battery-backed cache.

On Thu, Jun 6, 2013 at 10:44 PM, David Zafman david.zaf...@inktank.com wrote:

It looks like the enclosure failure caused data corruption. Otherwise, your OSD should have come back online as it would after a power failure.

David Zafman
Senior Developer
http://www.inktank.com

On May 26, 2013, at 9:09 AM, Andrey Korolyov and...@xdel.ru wrote:

> Hello,
> Today a large disk enclosure decided to die peacefully, bringing down a couple of XFS-based disks which serve as storage for OSDs. After reviving the disks on a different enclosure, the OSD processes die with SIGABRT on almost every disk (only one started working okay). Please take a look at the attached backtrace; if there is a way to bring these filestores back without reformatting, I'll be very glad to hear about it. The running Ceph version is 0.56.4.
> (attachment: gdb.txt.gz)
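A quick way to check whether any OSD filesystem is still mounted with nobarrier - a sketch, with /var/lib/ceph/osd/ceph-0 as an assumed mount point:

grep nobarrier /proc/mounts               # any hit here means write barriers are disabled
# after removing "nobarrier" from /etc/fstab, remount to get the XFS default back
umount /var/lib/ceph/osd/ceph-0 && mount /var/lib/ceph/osd/ceph-0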
[ceph-users] v0.61.3 released
This is a much-anticipated point release for the v0.61 Cuttlefish stable series. It resolves a number of issues, primarily with monitor stability and leveldb trimming. All v0.61.x users are encouraged to upgrade.

Upgrading from bobtail:

* There is one known problem with mon upgrades from bobtail. If the ceph-mon conversion on startup is aborted or fails for some reason, we do not correctly error out, but instead continue with (in certain cases) odd results. Please be careful if you have to restart the mons during the upgrade. A 0.61.4 release with a fix will be out shortly.
* In the meantime, for current cuttlefish users, 0.61.3 is safe to use.

Notable changes since v0.61.2:

* mon: paxos state trimming fix (resolves runaway disk usage)
* mon: finer-grained compaction on trim
* mon: discard messages from disconnected clients (lowers load)
* mon: leveldb compaction and other stats available via admin socket
* mon: async compaction (lower overhead)
* mon: fix bug incorrectly marking osds down with insufficient failure reports
* osd: fixed small bug in pg request map
* osd: avoid rewriting pg info on every osdmap
* osd: avoid internal heartbeat timeouts when scrubbing very large objects
* osd: fix narrow race with journal replay
* mon: fixed narrow pg split race
* rgw: fix leaked space when copying object
* rgw: fix iteration over large/untrimmed usage logs
* rgw: fix locking issue with ops log socket
* rgw: require matching version of librados
* librbd: make image creation defaults configurable (e.g., create format 2 images via qemu-img)
* fix units in 'ceph df' output
* debian: fix prerm/postinst hooks to start/stop daemons appropriately
* upstart: allow uppercase daemon names (and thus hostnames)
* sysvinit: fix enumeration of local daemons by type
* sysvinit: fix osd weight calculation when using -a
* fix build on unsigned char platforms (e.g., arm)

For more detailed information, see the release notes and complete changelog:

* http://ceph.com/docs/master/release-notes/#v0-61-2-cuttlefish
* http://ceph.com/docs/master/_downloads/v0.61.3.txt

You can get v0.61.3 from the usual places:

* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.61.3.tar.gz
* For Debian/Ubuntu packages, see http://ceph.com/docs/master/install/debian
* For RPMs, see http://ceph.com/docs/master/install/rpm
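As a small illustration of the configurable image-creation default mentioned above - a sketch only; the option name used here (rbd default format) and the pool/image names are assumptions:

# ceph.conf on the client
[client]
    rbd default format = 2

# qemu-img now creates format 2 images instead of the format 1 default
qemu-img create -f rbd rbd:vm_disks/new_disk 10G
rbd info vm_disks/new_disk    # should report format: 2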
Re: [ceph-users] How many Pipe per Ceph OSD daemon will keep?
But on ceph-users, Mark and some other users are discussing Supermicro chassis that can hold 24 spindles per 2U, or 36/48 spindles per 4U. Even with 20 OSDs per node, the thread count will be more than 5000, and if we take internal heartbeat/replication pipes into account, it should be around 10K threads. This is still too high for 8-core or 16-core CPUs and will waste a lot of cycles in context switching.

Sent from my iPhone

On 2013-6-7, at 0:21, Gregory Farnum g...@inktank.com wrote:

On Thu, Jun 6, 2013 at 12:25 AM, Chen, Xiaoxi xiaoxi.c...@intel.com wrote:
[...]

Yep, right now the OSD maintains two threads per connection. That hasn't been a problem so far (they're fairly cheap threads) and people run into other limits much earlier - 40 OSDs/node, for instance, would require a lot of compute power anyway.

epoll is a good idea and is something we're aware of, but it hasn't been necessary yet and would involve mucking around with some fairly sensitive core code, so it hasn't risen to the top of anybody's queue.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
Re: [ceph-users] v0.61.3 released
There was a mistake in the build script that caused the rpm signing to get skipped. That's been fixed and updated rpms have been pushed out.

Cheers,
Gary

On Jun 6, 2013, at 4:18 PM, Joshua Mesilane josh...@luma-pictures.com wrote:

Hey,

I'm getting RPM signing errors when trying to install this latest release:

Package libcephfs1-0.61.3-0.el6.x86_64.rpm is not signed

Running CentOS 6.4.

Cheers,
Josh

On 06/07/2013 04:56 AM, Sage Weil wrote:
> This is a much-anticipated point release for the v0.61 Cuttlefish stable series. It resolves a number of issues, primarily with monitor stability and leveldb trimming. All v0.61.x users are encouraged to upgrade.
> [...]

--
josh mesilane
senior systems administrator
luma pictures
level 2, 256 clarendon street
0416 039 082 (m)
lumapictures.com
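For anyone who hit this, a quick sketch for verifying that the re-pushed packages are properly signed (the package filename is just an example):

sudo rpm --import 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc'
rpm -K libcephfs1-0.61.3-0.el6.x86_64.rpm    # should now report a valid gpg/pgp signature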
[ceph-users] Issues with a fresh cluster and HEALTH_WARN
Hi,

I'm currently evaluating ceph as a solution for some HA storage that we're looking at. To test, I have 3 servers, each with two disks to be used for OSDs (journals on the same disk as the OSD). I've deployed the cluster with 3 mons (one on each server), 6 OSDs (2 on each server) and 3 MDSs (1 on each server).

I've built the cluster using ceph-deploy checked out from git on my local workstation (Fedora 15), and the servers themselves are running CentOS 6.4.

First note: It looks like the ceph-deploy tool, when you run ceph-deploy osd prepare host:device, is actually also activating the OSD when it's done, instead of waiting for you to run the ceph-deploy osd activate command.

Question: Is ceph-deploy supposed to write out the [mon] and the [osd] sections to the ceph.conf configuration file? I can't find any reference to anything in the config file except for the [global] section, and there are no other sections.

Question: Once I got all 6 of my OSDs online, I'm getting the following health warning:

health HEALTH_WARN 91 pgs degraded; 192 pgs stuck unclean; clock skew detected on mon.sv-dev-ha02, mon.sv-dev-ha03

ceph health detail gives me (truncated for readability):

[root@sv-dev-ha02 ~]# ceph health detail
HEALTH_WARN 91 pgs degraded; 192 pgs stale; 192 pgs stuck unclean; 2/6 in osds are down; clock skew detected on mon.sv-dev-ha02, mon.sv-dev-ha03
pg 2.3d is stuck unclean since forever, current state stale+active+remapped, last acting [1,0]
pg 1.3e is stuck unclean since forever, current state stale+active+remapped, last acting [1,0]
... (lots more lines like this) ...
pg 1.1 is stuck unclean since forever, current state stale+active+remapped, last acting [1,0]
pg 0.0 is stuck unclean since forever, current state stale+active+degraded, last acting [0]
pg 0.3f is stale+active+remapped, acting [1,0]
pg 1.3e is stale+active+remapped, acting [1,0]
... (lots more lines like this) ...
pg 1.1 is stale+active+remapped, acting [1,0]
pg 2.2 is stale+active+remapped, acting [1,0]
osd.0 is down since epoch 25, last address 10.20.100.90:6800/3994
osd.1 is down since epoch 25, last address 10.20.100.90:6803/4758
mon.sv-dev-ha02 addr 10.20.100.91:6789/0 clock skew 0.0858782s > max 0.05s (latency 0.00546217s)
mon.sv-dev-ha03 addr 10.20.100.92:6789/0 clock skew 0.0852838s > max 0.05s (latency 0.00533693s)

Any help on how to start troubleshooting this issue would be appreciated.

Cheers,

--
josh mesilane
senior systems administrator
luma pictures
level 2, 256 clarendon street
0416 039 082 (m)
lumapictures.com
Re: [ceph-users] Issues with a fresh cluster and HEALTH_WARN
Well, I had a closer look at the logs and, for some reason, while it listed the OSDs as being up and in to begin with, fairly shortly after I sent this email the two on one of the hosts went down. It turned out that the OSDs weren't mounted for some reason. After re-mounting and restarting the services, it all came back online and I've got a healthy cluster.

Time is being synced by ntpd on the servers, so I'm not sure what's going on there.

Cheers,
Josh

On 06/07/2013 10:47 AM, Jeff Bailey wrote:

You need to fix your clocks (usually with ntp). According to the log message they can be off by at most 50ms, and yours seem to be about 85ms off.

On 6/6/2013 8:40 PM, Joshua Mesilane wrote:

Hi,

I'm currently evaluating ceph as a solution for some HA storage that we're looking at. To test, I have 3 servers, each with two disks to be used for OSDs (journals on the same disk as the OSD).
[...]

--
josh mesilane
senior systems administrator
luma pictures
level 2, 256 clarendon street
0416 039 082 (m)
lumapictures.com
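If the skew warning persists even with ntpd running, a quick sketch for checking each monitor host; the option for relaxing the threshold is an assumption about the relevant setting:

ntpq -p                          # a '*' entry means the host is actually synced to a peer
ceph health detail | grep skew   # re-check after the clocks settle

# ceph.conf, [mon] section: raise the tolerated skew above the 0.05s default if needed
mon clock drift allowed = 0.1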
Re: [ceph-users] mount.ceph Cannot allocate memory
[Please keep all replies on the list.]

So you're doing all of these operations on the same server? We don't recommend using the kernel client on the same server as an OSD, but that is unlikely to be causing your issue here. Still, ENOMEM is most likely happening in your kernel, and probably indicates that for some reason it genuinely can't allocate some memory. Have you checked dmesg to see if it has any useful information?
-Greg

On Thursday, June 6, 2013, Timofey Koolin wrote:

Sorry, I have 1 mds server:

ceph -s
  health HEALTH_OK
  monmap e1: 1 mons at {sh13-1=[...]:6789/0}, election epoch 1, quorum 0 sh13-1
  osdmap e113: 2 osds: 2 up, 2 in
  pgmap v965: 192 pgs: 192 active+clean; 42331 KB data, 106 MB used, 5584 GB / 5584 GB avail
  mdsmap e66: 1/1/1 up {0=sh13-1=up:active}

ceph-fuse works fine; the kernel mount doesn't.

About the server: Ubuntu 12.04 with all updates, 32GB RAM.

top:
top - 06:18:13 up 15:06, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 141 total, 1 running, 139 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.0%us, 0.1%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem:  32835600k total, 1042444k used, 31793156k free, 97376k buffers
Swap: 16777212k total, 0k used, 16777212k free, 428568k cached

2013/6/7 Gregory Farnum g...@inktank.com:

On Thu, Jun 6, 2013 at 1:19 PM, Timofey Koolin timo...@koolin.ru wrote:

> ceph -v
> ceph version 0.61.3 (92b1e398576d55df8e5888dd1a9545ed3fd99532)
>
> mount.ceph l6:/ /ceph -o name=admin,secret=...
> mount error 12 = Cannot allocate memory
>
> I have a cluster with 1 mon, 2 osds, ipv6 network. rbd works fine.

CephFS isn't going to work if you don't have a metadata server running...
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

--
Blog: www.rekby.ru

--
Software Engineer #42 @ http://inktank.com | http://ceph.com
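A couple of checks that usually narrow down an ENOMEM from mount.ceph - a sketch, run on the client right after the failed mount; the secretfile path is made up for the example:

dmesg | tail -n 20    # look for a libceph/ceph message explaining the error 12

# retry with the key read from a file so it does not show up in ps output
sudo mount -t ceph l6:/ /ceph -o name=admin,secretfile=/etc/ceph/admin.secret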