[ceph-users] strange issues after upgrading to SL6.6 and latest kernel

2015-07-14 Thread Barry O'Rourke

Hi,

I managed to destroy my development cluster yesterday after upgrading it to
Scientific Linux 6.6 and kernel 2.6.32-504.23.4.el6.x86_64.

Upon rebooting, the development node hung whilst attempting to start the
monitor. It was still in the same state after being left overnight to
see if it would time out.

I decided to start from scratch to see if I could recreate the issue on
a clean install.

I've followed both the quick install and manual install guides on the
wiki and always see the following error whilst creating the initial
monitor.

https://gist.github.com/barryorourke/47b0a988d38a817afb5b#file-gistfile1-txt
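
For reference, the "initial monitor" step that fails is the monitor bootstrap
from the manual deployment guide, which boils down to roughly the following
(the hostname, IP address and fsid here are placeholders, not my real values):

ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
monmaptool --create --add node1 192.168.0.10 --fsid $(uuidgen) /tmp/monmap
ceph-mon --mkfs -i node1 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring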

Has anyone seen anything similar?

Regards,

Barry

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] EPEL packages for QEMU-KVM with rbd support?

2013-05-07 Thread Barry O'Rourke

Hi,

I'm not using OpenStack; I've only really been playing around with Ceph 
on test machines. I'm currently speccing up my production cluster and 
will probably end up running it alongside OpenNebula.


Barry

On 07/05/13 10:01, Dan van der Ster wrote:

Hi Barry,

On Mon, May 6, 2013 at 7:06 PM, Barry O'Rourke Barry.O'rou...@ed.ac.uk wrote:

Hi,

I built a modified version of the fc17 package that I picked up from
koji [1]. That might not be ideal for you as fc17 uses systemd rather
than init, we use an in-house configuration management system which
handles service start-up so it's not an issue for us.

I'd be interested to hear how others install qemu on el6 derivatives,
especially those of you running newer versions.



Are you by chance trying to use qemu with OpenStack, RDO OpenStack in
particular?

We've done a naive backport of rbd.c from qemu 1.5 to latest qemu-kvm
0.12.1.2+ in el6 (patch available upon request, but I wouldn't trust
it in production since we may have made a mistake). We then recompiled
libvirt from el6 to force the enabling of rbd support:

[root@xxx rpmbuild]# diff SPECS/libvirt.spec.orig SPECS/libvirt.spec
1671c1671
<     %{?_without_storage_rbd} \
---
>     --with-storage-rbd \

(Without this patch, libvirt only enables rbd for Fedora releases, not RHEL.)
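
For anyone wanting to reproduce this, the libvirt side is just the usual SRPM
rebuild; a rough sketch (package file names and paths are illustrative):

rpm -ivh libvirt-*.el6.src.rpm                 # unpack sources into ~/rpmbuild
vi ~/rpmbuild/SPECS/libvirt.spec               # apply the one-line change shown above
rpmbuild -ba ~/rpmbuild/SPECS/libvirt.spec     # build the patched packages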

We're at the point where qemu-kvm alone works with rbd, but RDO
OpenStack Cinder and Glance are still failing to attach rbd volumes or
boot from volumes for some unknown reason. We'd be very interested to hear
whether anyone else is trying, or has succeeded, in achieving the same setup:
RDO OpenStack + RBD.
Cheers,
Dan van der Ster
CERN IT



--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Barry O'Rourke

Hi,

I'm looking to purchase a production cluster of 3 Dell PowerEdge R515s, 
which I intend to run with 3x replication. I've opted for the following 
configuration:


2 x 6-core processors
32GB RAM
H700 controller (1GB cache)
2 x SAS OS disks (in RAID1)
2 x 1GbE (bonded for the cluster network)
2 x 1GbE (bonded for the client network)

and either 4 x 2TB nearline SAS OSDs or 8 x 1TB nearline SAS OSDs.

I'm currently undecided on the OSDs, although I'm leaning towards the 
second option as it would give me more flexibility and the option of 
using some of the disks as journals.
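
If I go that way, the journal placement itself is just a per-OSD setting; a
minimal sketch of what it might look like in ceph.conf (the device names and
journal size are hypothetical):

[osd]
        osd journal size = 10240        ; MB

[osd.0]
        host = ceph-node1
        osd journal = /dev/sdh1         ; partition on a disk set aside for journals
[osd.1]
        host = ceph-node1
        osd journal = /dev/sdh2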


I'm intending to use this cluster to host the images for ~100 virtual 
machines, which will run on separate hardware and will most likely be 
managed by OpenNebula.


I'd be interested to hear from anyone running a similar configuration 
with a similar use case, especially anyone who has spent some time 
benchmarking such a set-up and still has a copy of the results.


I'd also welcome any comments or critique on the above specification. 
Purchases have to be made via Dell and 10Gb ethernet is out of the 
question at the moment.


Cheers,

Barry


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Barry O'Rourke

Hi,


I'm running a somewhat similar configuration here. I'm wondering why you
have left out SSDs for the journals?


I can't go into exact prices due to our NDA, but I can say that getting 
a couple of decent SSDs from Dell would increase the cost per server 
by a four-figure sum, and we're on a limited budget. Dell do offer a 
budget range of SSDs with a limited warranty; I'm not sure how far a 
budget SSD can be trusted.



I gather they would be quite important to achieve a level of performance
for hosting 100 virtual machines - unless that is not important for you?


 Do you know what kind of disk access patterns those 100 virtual
 machines will have? (i.e. is it a cluster computing setup with
 minimal disk access or are they running all sorts of general purpose 
 systems?)


The majority of our virtual machines are web servers and Subversion 
repositories with quite a low amount of traffic, so I don't imagine the 
disk I/O being that high.



Have you considered having more than 3 servers?

If you want to run with a replication count of 3, I imagine that a
failed server would be problematic. But perhaps it is not important for
you if you have to live with paused VMs for a while if a server dies?


We have three server rooms, which is why I decided to go for three 
servers with 3x replication. I don't think I could squeeze any more than 
that into my budget either.
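
The replication count itself is just a per-pool setting; a quick sketch (the
pool name 'rbd' below is only an example):

ceph osd pool set rbd size 3        # keep three replicas
ceph osd pool set rbd min_size 2    # keep serving I/O with two replicas available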



Why do you consider 10 Gb ethernet to be out of the question?


I was told that it is out of the question at the moment; I'll need to 
mention it again.


Thanks,

Barry

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Barry O'Rourke
Hi,

 With so few disks and the inability to do 10GbE, you may want to 
 consider doing something like 5-6 R410s or R415s and just using the 
 on-board controller with a couple of SATA disks and 1 SSD for the 
 journal.  That should give you better aggregate performance since in 
 your case you can't use 10GbE.  It will also spread your OSDs across 
 more hosts for better redundancy and may not cost that much more per GB 
 since you won't need to use the H700 card if you are using an SSD for 
 journals.  It's not as dense as R515s or R720XDs can be when fully 
 loaded, but for small clusters with few disks I think it's a good 
 trade-off to get the added redundancy and avoid expander/controller 
 complications.

I hadn't considered lowering the specification and increasing the number
of hosts; that seems like a really viable option and not too much more
expensive. When you say the on-board controller, do you mean the onboard
SATA or the H310 controller? 

Thanks,

Barry



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Barry O'Rourke
Hi,

On Tue, 2013-05-07 at 21:07 +0300, Igor Laskovy wrote:
 If I understand the idea correctly, when this one SSD fails, the whole node
 with that SSD will fail. Correct? 

Only the OSDs that use that SSD for their journal will fail, as they will
lose any writes still in the journal. If I only have 2 OSDs sharing one SSD,
I would lose the whole node.

 What is the recovery scenario for the node in this case? 

There are a couple of options: replace the SSD, remove the dead OSDs from
Ceph and recreate them from scratch; or, if you need the host back up
quickly, delete the OSDs and recreate them with journals on the data disks,
although this will probably impact performance elsewhere.
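
Roughly, the removal side looks like this (osd.12 is a made-up id):

ceph osd out 12                  # let the cluster start re-replicating its data
ceph osd crush remove osd.12     # drop it from the CRUSH map
ceph auth del osd.12             # remove its authentication key
ceph osd rm 12                   # remove it from the OSD map

after which the replacement OSDs can be created in the usual way, against
either a new journal SSD or journals on the data disks.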

Barry







-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Dell R515 performance and specification question

2013-05-07 Thread Barry O'Rourke
Hi,

 Here's a quick performance display with various block sizes on a host
 with 1 public 1GbE link and 1 1GbE link on the same VLAN as the ceph
 cluster.

Thanks for taking the time to look into this for me, I'll compare it
with my existing set-up in the morning.

Thanks,

Barry


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] EPEL packages for QEMU-KVM with rbd support?

2013-05-06 Thread Barry O'Rourke
Hi,

The version currently shipping with RHEL is qemu-kvm-0.12.1.2-2.355,
which doesn't work with Ceph authentication, so you'll need to go for
something greater than 1.0.1.
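
A quick way to tell whether a given build has rbd support at all (it doesn't
prove the authentication part works, it just shows the block driver is
compiled in):

rpm -q qemu-kvm qemu-img
qemu-img --help | grep -i rbd    # rbd should appear in the supported formats line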

I'd love to see the most recent version, but would settle for anything I
don't need to package myself :)

Cheers,

Barry



On Mon, 2013-05-06 at 19:49 +0100, Neil Levine wrote:
 For RHEL, we will use the same version that ships with 6.3, 6.4 etc.
 
 For CentOS, it's an open question. We can stay with the same version
 again or use a newer version.
 
 Preferences?
 
 On Mon, May 6, 2013 at 7:46 PM, Barry O'Rourke Barry.O'rou...@ed.ac.uk 
 wrote:
  Hi,
 
  That's good to hear, do you know what version you'll be packaging yet?
 
  Thanks,
 
  Barry
 
  On Mon, 2013-05-06 at 19:31 +0100, Neil Levine wrote:
  We can't host a modified qemu with rbd package in EPEL itself because
  it would conflict with the version of qemu that ships by default,
  breaking EPEL's policy.
 
  However, we will be building and hosting a qemu package for RH6.3, 6.4
  on ceph.com very soon and getting a package into a CentOS repo.
 
  Neil
 
 
  On Mon, May 6, 2013 at 6:06 PM, Barry O'Rourke Barry.O'rou...@ed.ac.uk 
  wrote:
   Hi,
  
   I built a modified version of the fc17 package that I picked up from
   koji [1]. That might not be ideal for you as fc17 uses systemd rather
   than init, we use an in-house configuration management system which
   handles service start-up so it's not an issue for us.
  
   I'd be interested to hear how others install qemu on el6 derivatives,
   especially those of you running newer versions.
  
   Cheers,
  
   Barry
  
   1. http://koji.fedoraproject.org/koji/packageinfo?packageID=3685
  
   On Mon, 2013-05-06 at 16:58 +, w sun wrote:
   Does anyone know if there are RPM packages for EPEL 6-8 ? I have heard
   they have been built but could not find them in the latest 6-8 repo.
  
  
   Thanks. --weiguo
   ___
   ceph-users mailing list
   ceph-users@lists.ceph.com
   http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
  
  
  
   --
   The University of Edinburgh is a charitable body, registered in
   Scotland, with registration number SC005336.
  
   ___
   ceph-users mailing list
   ceph-users@lists.ceph.com
   http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
  --
  The University of Edinburgh is a charitable body, registered in
  Scotland, with registration number SC005336.
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd command error librbd::ImageCtx: error finding header

2013-04-23 Thread Barry O'Rourke

Hi,

This sounds a lot like https://bugzilla.redhat.com/show_bug.cgi?id=891993.
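
It might also be worth double-checking the invocations quoted below before
chasing that bug; 'rbd info' defaults to the pool named 'rbd' when no pool is
given, and the image name shouldn't include the '.rbd' suffix (that rados
object is just the format-1 image header). Something like this, assuming the
image really does live in 'mypool':

rbd info -p mypool odm-kvm-img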

Barry

On 04/24/13 04:24, Dennis Chen wrote:

Hi list,

I am using a ceph cluster (version 0.56.4) with all nodes (mon, mds,
osd, ...) deployed on RHEL 6; the client is based on Ubuntu 12.10.
Now I am confused by a strange issue; it seems the issue has been asked
about before (judging by Google) but with no clear answer. The specific
details are below.
On the client side, I want to create an rbd image, so I run these commands:

root@~# ceph osd pool create mypool 100 100
pool 'mypool' created

root@~# rbd ls -p mypool
odm-kvm-img

root@~# rbd --image odm-kvm-img info
rbd: error opening image odm-kvm-img: (2) No such file or directory
2013-04-24 10:43:42.800917 7fdb47d76780 -1 librbd::ImageCtx: error finding
header: (2) No such file or directory

So I tried the steps suggested by what I googled:

root@~# rados ls -p mypool
odm-kvm-img.rbd
rbd_directory
root@~# rbd info odm-kvm-img.rbd
rbd: error opening image odm-kvm-img.rbd: (2) No such file or directory
2013-04-24 10:54:19.468770 7f8332dea780 -1 librbd::ImageCtx: error finding
header: (2) No such file or directory

odm-kvm-img.rbd is shown by the 'rados ls' command and it's there, so why
do I get an error when running the 'rbd info' command on odm-kvm-img.rbd?
Can anybody help with this?

BRs,
Dennis
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com