[ceph-users] strange issues after upgrading to SL6.6 and latest kernel
Hi,

I managed to destroy my development cluster yesterday after upgrading it to Scientific Linux 6.6 and kernel 2.6.32-504.23.4.el6.x86_64. Upon rebooting, the development node hung whilst attempting to start the monitor. It was still in the same state after being left overnight to see if it would time out.

I decided to start from scratch to see if I could recreate the issue on a clean install. I've followed both the quick install and manual install guides on the wiki and always see the following error whilst creating the initial monitor:

https://gist.github.com/barryorourke/47b0a988d38a817afb5b#file-gistfile1-txt

Has anyone seen anything similar?

Regards,

Barry

-- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
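[A common first debugging step for a hang like this is to run the monitor in the foreground with verbose logging rather than via the init script; a sketch, assuming a mon id of "node1" (substitute your own):]

```shell
# Run the monitor in the foreground (-d) with verbose mon and messenger
# logging, to see exactly where start-up stalls. "node1" is a placeholder id.
ceph-mon -i node1 -d --debug-mon 20 --debug-ms 1
```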
Re: [ceph-users] EPEL packages for QEMU-KVM with rbd support?
Hi,

I'm not using OpenStack, I've only really been playing around with Ceph on test machines. I'm currently speccing up my production cluster and will probably end up running it along with OpenNebula.

Barry

On 07/05/13 10:01, Dan van der Ster wrote:
> Hi Barry,
>
> On Mon, May 6, 2013 at 7:06 PM, Barry O'Rourke Barry.O'rou...@ed.ac.uk wrote:
>> Hi,
>>
>> I built a modified version of the fc17 package that I picked up from
>> koji [1]. That might not be ideal for you as fc17 uses systemd rather
>> than init; we use an in-house configuration management system which
>> handles service start-up, so it's not an issue for us.
>>
>> I'd be interested to hear how others install qemu on el6 derivatives,
>> especially those of you running newer versions.
>
> Are you by chance trying to use qemu with OpenStack, RDO OpenStack in
> particular?
>
> We've done a naive backport of rbd.c from qemu 1.5 to the latest
> qemu-kvm 0.12.1.2+ in el6 (patch available upon request, but I wouldn't
> trust it in production since we may have made a mistake). We then
> recompiled libvirt from el6 to force the enabling of rbd support:
>
> [root@xxx rpmbuild]# diff SPECS/libvirt.spec.orig SPECS/libvirt.spec
> 1671c1671
> <            %{?_without_storage_rbd} \
> ---
> >            --with-storage-rbd \
>
> (Without this patch, libvirt only enables rbd for Fedora releases, not
> RHEL.)
>
> We're at the point where qemu-kvm alone works with rbd, but RDO
> OpenStack Cinder and Glance are still failing to attach rbd volumes or
> boot from volumes for some unknown reason. We'd be very interested if
> someone else is trying/succeeding to achieve the same setup, RDO
> OpenStack + RBD.
>
> Cheers,
> Dan van der Ster
> CERN IT
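[For anyone rebuilding qemu like this, a quick sanity check that the resulting binary actually picked up rbd support; a sketch, assuming the usual el6 binary path:]

```shell
# Check whether this qemu build was compiled with rbd support: "rbd"
# should appear in qemu-img's supported formats list, and the qemu-kvm
# binary should link against librbd. Paths are the usual el6 locations.
qemu-img --help | grep -i rbd
ldd /usr/libexec/qemu-kvm | grep librbd
```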
[ceph-users] Dell R515 performance and specification question
Hi,

I'm looking to purchase a production cluster of 3 Dell PowerEdge R515s which I intend to run in 3 x replication. I've opted for the following configuration:

- 2 x 6 core processors
- 32Gb RAM
- H700 controller (1Gb cache)
- 2 x SAS OS disks (in RAID1)
- 2 x 1Gb ethernet (bonded for cluster network)
- 2 x 1Gb ethernet (bonded for client network)

and either 4 x 2Tb nearline SAS OSDs or 8 x 1Tb nearline SAS OSDs. At the moment I'm undecided on the OSDs, although I'm swaying towards the second option as it would give me more flexibility and the option of using some of the disks as journals.

I'm intending to use this cluster to host the images for ~100 virtual machines, which will run on different hardware and will most likely be managed by OpenNebula.

I'd be interested to hear from anyone running a similar configuration with a similar use case, especially people who have spent some time benchmarking a similar configuration and still have a copy of the results. I'd also welcome any comments or critique on the above specification. Purchases have to be made via Dell, and 10Gb ethernet is out of the question at the moment.

Cheers,

Barry
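[If you do end up benchmarking candidate hardware yourself, the built-in `rados bench` tool is the usual starting point; a sketch, using a throwaway pool name:]

```shell
# Sequential write then read benchmark against a disposable pool
# ("bench" is a placeholder pool name; pg counts are illustrative).
ceph osd pool create bench 128 128
rados bench -p bench 60 write --no-cleanup   # 60s of 4MB object writes
rados bench -p bench 60 seq                  # read back what was written
rados -p bench cleanup                       # remove the benchmark objects
```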
Re: [ceph-users] Dell R515 performance and specification question
Hi,

> I'm running a somewhat similar configuration here. I'm wondering why you
> have left out SSDs for the journals?

I can't go into exact prices due to our NDA, but I can say that getting a couple of decent SSD disks from Dell would increase the cost per server by a four figure sum, and we're on a limited budget. Dell do offer a budget range of SSDs with a limited warranty, but I'm not too sure how far a budget SSD can be trusted.

> I gather they would be quite important to achieve a level of performance
> for hosting 100 virtual machines - unless that is not important for you?
> Do you know what kind of disk access patterns those 100 virtual machines
> will have? (i.e. is it a cluster computing setup with minimal disk access
> or are they running all sorts of general purpose systems?)

The majority of our virtual machines are web servers and subversion repositories with quite a low amount of traffic, so I don't imagine the disk I/O being that high.

> Have you considered having more than 3 servers? If you want to run with a
> replication count of 3, I imagine that a failed server would be
> problematic. But perhaps it is not important for you if you have to live
> with paused VMs for a while if a server dies?

We have three server rooms, which is why I decided to go for three servers with 3 x replication. I don't think I could squeeze any more than that into my budget either.

> Why do you consider 10 Gb ethernet to be out of the question?

I was told that it is out of the question at the moment; I'll need to mention it again.

Thanks,

Barry
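[One way to put a number on "quite a low amount of traffic" is to sample a representative guest with fio before sizing the cluster; a sketch, with purely illustrative job parameters:]

```shell
# Mixed random read/write workload roughly resembling a lightly loaded
# web/VCS guest. All parameters here are illustrative, not a recommendation.
fio --name=vmtest --rw=randrw --rwmixread=70 --bs=4k --size=1g \
    --numjobs=4 --runtime=60 --time_based --group_reporting
```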
Re: [ceph-users] Dell R515 performance and specification question
Hi,

> With so few disks and the inability to do 10GbE, you may want to consider
> doing something like 5-6 R410s or R415s and just using the on-board
> controller with a couple of SATA disks and 1 SSD for the journal. That
> should give you better aggregate performance since in your case you can't
> use 10GbE. It will also spread your OSDs across more hosts for better
> redundancy and may not cost that much more per GB since you won't need to
> use the H700 card if you are using an SSD for journals. It's not as dense
> as R515s or R720XDs can be when fully loaded, but for small clusters with
> few disks I think it's a good trade-off to get the added redundancy and
> avoid expander/controller complications.

I hadn't considered lowering the specification and increasing the number of hosts; that seems like a really viable option and not too much more expensive. When you say the on-board controller, do you mean the onboard SATA or the H310 controller?

Thanks,

Barry
Re: [ceph-users] Dell R515 performance and specification question
Hi,

On Tue, 2013-05-07 at 21:07 +0300, Igor Laskovy wrote:
> If I currently understand the idea, when this 1 SSD fails the whole node
> with that SSD will fail. Correct?

Only the OSDs that use that SSD for their journal will fail, as they will lose any writes still in the journal. If I only have 2 OSDs sharing one SSD I would lose the whole node.

> What is the scenario for node recovery in this case?

There are a couple of options: replace the SSD, then remove the dead OSDs from Ceph and create them from scratch. Or, if you need the host back up quickly, delete the OSDs and recreate them with their journals on the data disks, though this will probably impact performance elsewhere.

Barry
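[The "remove the dead OSDs and recreate them" option above can be sketched roughly as follows, assuming el6-style service management and a placeholder OSD id of 3 (repeat for each OSD that journalled to the failed SSD):]

```shell
# Retire an OSD whose journal SSD has died, so the cluster re-replicates
# its data; osd.3 is a placeholder id.
ceph osd out 3              # mark it out so recovery starts
service ceph stop osd.3     # stop the daemon (el6-era init script)
ceph osd crush remove osd.3 # drop it from the CRUSH map
ceph auth del osd.3         # remove its auth key
ceph osd rm 3               # remove it from the OSD map
# ...then replace the SSD and create the OSD again from scratch.
```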
Re: [ceph-users] Dell R515 performance and specification question
Hi,

> Here's a quick performance display with various block sizes on a host with
> 1 public 1GbE link and 1 1GbE link on the same vlan as the ceph cluster.

Thanks for taking the time to look into this for me, I'll compare it with my existing set-up in the morning.

Thanks,

Barry
Re: [ceph-users] EPEL packages for QEMU-KVM with rbd support?
Hi,

The version currently shipping with RHEL is qemu-kvm-0.12.1.2-2.355, which doesn't work with ceph authentication, so you'll need to go for greater than 1.0.1. I'd love to see the most recent version, but would settle for anything I don't need to package myself :)

Cheers,

Barry

On Mon, 2013-05-06 at 19:49 +0100, Neil Levine wrote:
> For RHEL, we will use the same version that ships with 6.3, 6.4 etc. For
> CentOS, it's an open question. We can stay with the same version again or
> use a newer version. Preferences?
>
> On Mon, May 6, 2013 at 7:46 PM, Barry O'Rourke Barry.O'rou...@ed.ac.uk wrote:
>> Hi,
>>
>> That's good to hear, do you know what version you'll be packaging yet?
>>
>> Thanks,
>>
>> Barry
>>
>> On Mon, 2013-05-06 at 19:31 +0100, Neil Levine wrote:
>>> We can't host a modified qemu-with-rbd package in EPEL itself because
>>> it would conflict with the version of qemu that ships by default,
>>> breaking EPEL's policy. However, we will be building and hosting a
>>> qemu package for RHEL 6.3, 6.4 on ceph.com very soon and getting a
>>> package into a CentOS repo.
>>>
>>> Neil
>>>
>>> On Mon, May 6, 2013 at 6:06 PM, Barry O'Rourke Barry.O'rou...@ed.ac.uk wrote:
>>>> Hi,
>>>>
>>>> I built a modified version of the fc17 package that I picked up from
>>>> koji [1]. That might not be ideal for you as fc17 uses systemd rather
>>>> than init; we use an in-house configuration management system which
>>>> handles service start-up, so it's not an issue for us.
>>>>
>>>> I'd be interested to hear how others install qemu on el6 derivatives,
>>>> especially those of you running newer versions.
>>>>
>>>> Cheers,
>>>>
>>>> Barry
>>>>
>>>> 1. http://koji.fedoraproject.org/koji/packageinfo?packageID=3685
>>>>
>>>> On Mon, 2013-05-06 at 16:58 +0000, w sun wrote:
>>>>> Does anyone know if there are RPM packages for EPEL 6-8? I have heard
>>>>> they have been built but could not find them in the latest 6-8 repo.
>>>>> Thanks.
>>>>>
>>>>> --weiguo
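[A quick way to check whether an installed qemu package actually works against a cephx-enabled cluster is to create a small rbd-backed image with qemu-img; a sketch, with "mypool" as a placeholder pool name:]

```shell
# Confirm installed versions, then exercise the rbd driver end to end.
rpm -q qemu-kvm qemu-img
qemu-img create -f rbd rbd:mypool/qemu-test 128M   # fails on builds without rbd/cephx support
rbd -p mypool rm qemu-test                         # tidy up the test image
```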
Re: [ceph-users] rbd command error librbd::ImageCtx: error finding header
Hi,

This sounds a lot like https://bugzilla.redhat.com/show_bug.cgi?id=891993.

Barry

On 04/24/13 04:24, Dennis Chen wrote:
> Hi list,
>
> I am using a ceph cluster (version 0.56.4) with all nodes (mon, mds,
> osd...) deployed on RHEL 6; the client is based on Ubuntu 12.10. I am
> confused by a strange issue; it seems to have been asked before (judging
> by Google) but without a clear answer. The specific details are below.
> On the client side, I want to create an rbd image, so I run the commands:
>
> root@~# ceph osd pool create mypool 100 100
> pool 'mypool' created
> root@~# rbd ls -p mypool
> odm-kvm-img
> root@~# rbd --image odm-kvm-img info
> rbd: error opening image
> 2013-04-24 10:43:42.800917 7fdb47d76780 -1 librbd::ImageCtx: error finding header: (2) No such file or directory
> odm-kvm-img: (2) No such file or directory
>
> So I tried the following steps according to what I googled:
>
> root@~# rados ls -p mypool
> odm-kvm-img.rbd
> rbd_directory
> root@~# rbd info odm-kvm-img.rbd
> rbd: error opening image
> 2013-04-24 10:54:19.468770 7f8332dea780 -1 librbd::ImageCtx: error finding header: (2) No such file or directory
> odm-kvm-img.rbd: (2) No such file or directory
>
> odm-kvm-img.rbd is shown by the 'rados ls' command and it's there, so why
> do I get an error when I run 'rbd info' on odm-kvm-img.rbd? Can anybody
> help with this?
>
> BRs,
> Dennis
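[Separately from the bug above, note that `rbd` defaults to the pool named `rbd` when no pool is given, and the `.rbd` suffix seen via `rados ls` is internal metadata rather than part of the image name. So the commands quoted above would be expected to fail even on a healthy cluster; the working form is:]

```shell
# Name the pool explicitly and use the image name without the internal
# '.rbd' suffix.
rbd info -p mypool odm-kvm-img
```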