Re: [ceph-users] Local SSD cache for ceph on each compute node.
I’d rather like to see this implemented at the hypervisor level, i.e.: QEMU, so we can have a common layer for all the storage backends. Although this is less portable... > On 17 Mar 2016, at 11:00, Nick Fiskwrote: > > > >> -Original Message- >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >> Daniel Niasoff >> Sent: 16 March 2016 21:02 >> To: Nick Fisk ; 'Van Leeuwen, Robert' >> ; 'Jason Dillaman' >> Cc: ceph-users@lists.ceph.com >> Subject: Re: [ceph-users] Local SSD cache for ceph on each compute node. >> >> Hi Nick, >> >> Your solution requires manual configuration for each VM and cannot be >> setup as part of an automated OpenStack deployment. > > Absolutely, potentially flaky as well. > >> >> It would be really nice if it was a hypervisor based setting as opposed to > a VM >> based setting. > > Yes, I can't wait until we can just specify "rbd_cache_device=/dev/ssd" in > the ceph.conf and get it to write to that instead. Ideally ceph would also > provide some sort of lightweight replication for the cache devices, but > otherwise a iSCSI SSD farm or switched SAS could be used so that the caching > device is not tied to one physical host. > >> >> Thanks >> >> Daniel >> >> -Original Message- >> From: Nick Fisk [mailto:n...@fisk.me.uk] >> Sent: 16 March 2016 08:59 >> To: Daniel Niasoff ; 'Van Leeuwen, Robert' >> ; 'Jason Dillaman' >> Cc: ceph-users@lists.ceph.com >> Subject: RE: [ceph-users] Local SSD cache for ceph on each compute node. >> >> >> >>> -Original Message- >>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf >>> Of Daniel Niasoff >>> Sent: 16 March 2016 08:26 >>> To: Van Leeuwen, Robert ; Jason Dillaman >>> >>> Cc: ceph-users@lists.ceph.com >>> Subject: Re: [ceph-users] Local SSD cache for ceph on each compute node. >>> >>> Hi Robert, >>> Caching writes would be bad because a hypervisor failure would result in >>> loss of the cache which pretty much guarantees inconsistent data on >>> the ceph volume. Also live-migration will become problematic compared to running >>> everything from ceph since you will also need to migrate the >> local-storage. >> >> I tested a solution using iSCSI for the cache devices. Each VM was using >> flashcache with a combination of a iSCSI LUN from a SSD and a RBD. This > gets >> around the problem of moving things around or if the hypervisor goes down. >> It's not local caching but the write latency is at least 10x lower than > the RBD. >> Note I tested it, I didn't put it into production :-) >> >>> >>> My understanding of how a writeback cache should work is that it >>> should only take a few seconds for writes to be streamed onto the >>> network and is focussed on resolving the speed issue of small sync >>> writes. The writes >> would >>> be bundled into larger writes that are not time sensitive. >>> >>> So there is potential for a few seconds data loss but compared to the >> current >>> trend of using ephemeral storage to solve this issue, it's a major >>> improvement. >> >> Yeah, problem is a couple of seconds data loss mean different things to >> different people. >> >>> (considering the time required for setting up and maintaining the extra >>> caching layer on each vm, unless you work for free ;-) >>> >>> Couldn't agree more there. >>> >>> I am just so surprised how the openstack community haven't looked to >>> resolve this issue. 
Ephemeral storage is a HUGE compromise unless you >>> have built in failure into every aspect of your application but many >>> people use openstack as a general purpose devstack. >>> >>> (Jason pointed out his blueprint but I guess it's at least a year or 2 >> away - >>> http://tracker.ceph.com/projects/ceph/wiki/Rbd_-_ordered_crash- >>> consistent_write-back_caching_extension) >>> >>> I see articles discussing the idea such as this one >>> >>> http://www.sebastien-han.fr/blog/2014/06/10/ceph-cache-pool-tiering- >>> scalable-cache/ >>> >>> but no real straightforward validated setup instructions. >>> >>> Thanks >>> >>> Daniel >>> >>> >>> -Original Message- >>> From: Van Leeuwen, Robert [mailto:rovanleeu...@ebay.com] >>> Sent: 16 March 2016 08:11 >>> To: Jason Dillaman ; Daniel Niasoff >>> >>> Cc: ceph-users@lists.ceph.com >>> Subject: Re: [ceph-users] Local SSD cache for ceph on each compute node. >>> Indeed, well understood. As a shorter term workaround, if you have control over the VMs, you could >>> always just slice out an LVM volume from local SSD/NVMe and pass it >>> through to the guest. Within the guest, use dm-cache (or similar) to >>> add >> a >>> cache front-end to your RBD volume. >>> >>> If you do this you need to setup
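For reference, a minimal sketch of the in-guest approach described above (slice an LVM volume off the local SSD, pass it through to the guest, and front the attached RBD volume with dm-cache via lvmcache). Device names and sizes are assumptions, not from the thread, and a writeback cache mode carries exactly the data-loss and live-migration caveats discussed above, so writethrough is shown here:

# inside the guest: /dev/vdb = passed-through SSD slice, /dev/vdc = attached RBD volume (assumed names)
pvcreate /dev/vdb /dev/vdc
vgcreate vg_cached /dev/vdc /dev/vdb
lvcreate -L 100G -n data vg_cached /dev/vdc                        # data LV on the RBD-backed PV
lvcreate --type cache-pool -L 18G -n cachepool vg_cached /dev/vdb  # cache pool on the SSD PV
lvconvert --type cache --cachepool vg_cached/cachepool --cachemode writethrough vg_cached/data
mkfs.xfs /dev/vg_cached/data

flashcache (used in Nick's iSCSI test above) can be substituted for dm-cache; the trade-offs around losing the hypervisor or the caching device stay the same.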
Re: [ceph-users] Migrate Block Volumes and VMs
What you can do is flatten all the images so you break the relationship between the parent image and the child. Then you can export/import. > On 15 Dec 2015, at 12:10, Sam Huracan wrote: > > Hi everybody, > > My OpenStack system uses Ceph as the backend for Glance, Cinder and Nova. In the > future, we intend to build a new Ceph cluster. > I can re-connect the current OpenStack installation with the new Ceph cluster. > > After that, I tried exporting the rbd images and importing them into the new Ceph cluster, but the VMs > and volumes were clones of the Glance rbd images, like this: > > rbd children images/e2c852e1-28ce-408d-b2ec-6351db35d55a@snap > > vms/8a4465fa-cbae-4559-b519-861eb4eda378_disk > volumes/volume-b5937629-5f44-40c8-9f92-5f88129d3171 > > > How can I export all the rbd snapshots and their clones and import them into the new Ceph > cluster? > > Or is there any other solution to move all the VMs, volumes and images from the old Ceph > cluster to the new one? > > Thanks and regards. > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Senior Cloud Architect "Always give 100%. Unless you're giving blood." Mail: s...@redhat.com Address: 11 bis, rue Roquépine - 75008 Paris signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
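A rough command sequence for the flatten-then-copy approach; the pool and image names are the ones from the example above, and the -c path for the destination cluster is an assumption:

# list the clones hanging off the Glance image snapshot
rbd children images/e2c852e1-28ce-408d-b2ec-6351db35d55a@snap
# break the parent/child relationship for each clone
rbd flatten vms/8a4465fa-cbae-4559-b519-861eb4eda378_disk
rbd flatten volumes/volume-b5937629-5f44-40c8-9f92-5f88129d3171
# then copy each image over, e.g. by piping an export straight into the new cluster
rbd export vms/8a4465fa-cbae-4559-b519-861eb4eda378_disk - | rbd -c /etc/ceph/new-cluster.conf import - vms/8a4465fa-cbae-4559-b519-861eb4eda378_disk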
Re: [ceph-users] maximum number of mapped rbds?
Which Kernel are you running on? These days, the theoretical limit is 65536 AFAIK. Ilya would know the kernel needed for that. > On 03 Sep 2015, at 15:05, Jeff Epsteinwrote: > > Hello, > > In response to an rbd map command, we are getting a "Device or resource busy". > > $ rbd -p platform map ceph:pzejrbegg54hi-stage-4ac9303161243dc71c75--php > > rbd: sysfs write failed > > rbd: map failed: (16) Device or resource busy > > > We currently have over 200 rbds mapped on a single host. Can this be the > source of the problem? If so, is there a workaround? > > $ rbd -p platform showmapped|wc -l > 248 > > Thanks. > > Best, > Jeff > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Senior Cloud Architect "Always give 100%. Unless you're giving blood." Mail: s...@redhat.com Address: 11 bis, rue Roquépine - 75008 Paris signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
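A quick way to collect what Sébastien is asking about (kernel version, number of mapped devices) and to see the real reason for the EBUSY, which the kernel client logs; purely illustrative:

uname -r
rbd showmapped | wc -l
dmesg | tail    # rbd/libceph usually report the actual cause of the failed map here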
Re: [ceph-users] Nova fails to download image from Glance backed with Ceph
Just to take away a possible issue from infra (LBs etc). Did you try to download the image on the compute node? Something like rbd export? > On 04 Sep 2015, at 11:56, Vasiliy Angapovwrote: > > Hi all, > > Not sure actually where does this bug belong to - OpenStack or Ceph - > but writing here in humble hope that anyone faced that issue also. > > I configured test OpenStack instance with Glance images stored in Ceph > 0.94.3. Nova has local storage. > But when I'm trying to launch instance from large image stored in Ceph > - it fails to spawn with such an error in nova-conductor.log: > > 2015-09-04 11:52:35.076 3605449 ERROR nova.scheduler.utils > [req-c6af3eca-f166-45bd-8edc-b8cfadeb0d0b > 82c1f134605e4ee49f65015dda96c79a 448cc6119e514398ac2793d043d4fa02 - - > -] [instance: 18c9f1d5-50e8-426f-94d5-167f43129ea6] Error from last > host: slpeah005 (node slpeah005.cloud): [u'Traceback (most recent call > last):\n', u' File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2220, > in _do_build_and_run_instance\nfilter_properties)\n', u' File > "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2363, > in _build_and_run_instance\ninstance_uuid=instance.uuid, > reason=six.text_type(e))\n', u'RescheduledException: Build of instance > 18c9f1d5-50e8-426f-94d5-167f43129ea6 was re-scheduled: [Errno 32] > Corrupt image download. Checksum was 625d0686a50f6b64e57b1facbc042248 > expected 4a7de2fbbd01be5c6a9e114df145b027\n'] > > So nova tries 3 different hosts with the same error messages on every > single one and then fails to spawn an instance. > I've tried Cirros little image and it works fine with it. Issue > happens with large images like 10Gb in size. > I also managed to look into /var/lib/nova/instances/_base folder and > found out that image is actually being downloaded but at some moment > the download process interrupts for some unknown reason and instance > gets deleted. > > I looked at the syslog and found many messages like that: > Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735094 > 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.22 since > back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203 > (cutoff 2015-09-04 12:51:32.735011) > Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735099 > 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.23 since > back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203 > (cutoff 2015-09-04 12:51:32.735011) > Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735104 > 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.24 since > back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203 > (cutoff 2015-09-04 12:51:32.735011) > Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735108 > 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.26 since > back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203 > (cutoff 2015-09-04 12:51:32.735011) > Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735118 > 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.27 since > back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203 > (cutoff 2015-09-04 12:51:32.735011) > > I've also tried to monitor nova-compute process file descriptors > number but it is never more than 102. ("echo > /proc/NOVA_COMPUTE_PID/fd/* | wc -w" like Jan advised in this ML). > It also seems like problem appeared only in 0.94.3, in 0.94.2 > everything worked just fine! > > Would be very grateful for any help! > > Vasily. 
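One way to run the test Sébastien suggests on a compute node is to export the image from the images pool with the same client credentials Nova/Glance use and compare checksums; the client name and image UUID below are placeholders:

rbd -p images --id glance export <image-uuid> /tmp/test.raw
md5sum /tmp/test.raw    # compare against the checksum Glance has recorded for the image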
Re: [ceph-users] Nova with Ceph generate error
Which request generated this trace? Is it nova-compute log? On 10 Jul 2015, at 07:13, Mario Codeniera mario.codeni...@gmail.com wrote: Hi, It is my first time here. I am just having an issue regarding with my configuration with the OpenStack which works perfectly for the cinder and the glance based on Kilo release in CentOS 7. I am based my documentation on this rbd-opeenstack manual. If I enable my rbd in the nova.conf it generates error like the following in the dashboard as the logs don't have any errors: Internal Server Error (HTTP 500) (Request-ID: req-231347dd-f14c-4f97-8a1d-851a149b037c) Code 500 Details File /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 343, in decorated_function return function(self, context, *args, **kwargs) File /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2737, in terminate_instance do_terminate_instance(instance, bdms) File /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py, line 445, in inner return f(*args, **kwargs) File /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2735, in do_terminate_instance self._set_instance_error_state(context, instance) File /usr/lib/python2.7/site-packages/oslo_utils/excutils.py, line 85, in __exit__ six.reraise(self.type_, self.value, self.tb) File /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2725, in do_terminate_instance self._delete_instance(context, instance, bdms, quotas) File /usr/lib/python2.7/site-packages/nova/hooks.py, line 149, in inner rv = f(*args, **kwargs) File /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2694, in _delete_instance quotas.rollback() File /usr/lib/python2.7/site-packages/oslo_utils/excutils.py, line 85, in __exit__ six.reraise(self.type_, self.value, self.tb) File /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2664, in _delete_instance self._shutdown_instance(context, instance, bdms) File /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2604, in _shutdown_instance self.volume_api.detach(context, bdm.volume_id) File /usr/lib/python2.7/site-packages/nova/volume/cinder.py, line 214, in wrapper res = method(self, ctx, volume_id, *args, **kwargs) File /usr/lib/python2.7/site-packages/nova/volume/cinder.py, line 365, in detach cinderclient(context).volumes.detach(volume_id) File /usr/lib/python2.7/site-packages/cinderclient/v2/volumes.py, line 334, in detach return self._action('os-detach', volume) File /usr/lib/python2.7/site-packages/cinderclient/v2/volumes.py, line 311, in _action return self.api.client.post(url, body=body) File /usr/lib/python2.7/site-packages/cinderclient/client.py, line 91, in post return self._cs_request(url, 'POST', **kwargs) File /usr/lib/python2.7/site-packages/cinderclient/client.py, line 85, in _cs_request return self.request(url, method, **kwargs) File /usr/lib/python2.7/site-packages/cinderclient/client.py, line 80, in request return super(SessionClient, self).request(*args, **kwargs) File /usr/lib/python2.7/site-packages/keystoneclient/adapter.py, line 206, in request resp = super(LegacyJsonAdapter, self).request(*args, **kwargs) File /usr/lib/python2.7/site-packages/keystoneclient/adapter.py, line 95, in request return self.session.request(url, method, **kwargs) File /usr/lib/python2.7/site-packages/keystoneclient/utils.py, line 318, in inner return func(*args, **kwargs) File /usr/lib/python2.7/site-packages/keystoneclient/session.py, line 397, in request raise exceptions.from_response(resp, method, url) Created 10 Jul 2015, 4:40 a.m. 
Again if disable I able to work but it is generated on the compute node, as I observe too it doesn't display the hypervisor of the compute nodes, or maybe it is related. It was working on Juno before, but there are unexpected rework as the network infrastructure was change which the I rerun the script and found lots of conflicts et al as I run before using qemu-img-rhev qemu-kvm-rhev from OVirt but seems the new hammer (Ceph repository) solve the issue. Hope someone can enlighten. Thanks, Mario ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Senior Cloud Architect Always give 100%. Unless you're giving blood. Mail: s...@redhat.com Address: 11 bis, rue Roquépine - 75008 Paris signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Expanding a ceph cluster with ansible
Bryan, Answers inline. On 24 Jun 2015, at 00:52, Stillwell, Bryan bryan.stillw...@twcable.com wrote: Sébastien, Nothing has gone wrong with using it in this way, it just has to do with my lack of experience with ansible/ceph-ansible. I'm learning both now, but would love if there were more documentation around using them. For example this documentation around using ceph-deploy is pretty good, and I was hoping for something equivalent for ceph-ansible: http://ceph.com/docs/master/rados/deployment/ Well if this is not enough: https://github.com/ceph/ceph-ansible/wiki Please open an issue with what’s missing and I’ll make sure to clarify everything ASAP. With that said, I'm wondering what tweaks do you think would be needed to get ceph-ansible working on an existing cluster? There are critical variables to edit, so the first thing to do will be to make sure that you perfectly match some variables with your current configuration. Btw I just tried the following: * deployed a cluster with ceph-deploy: 1 mons (on ceph1), 3 OSDs (on ceph4, ceph5, ceph6) * 1 SSD for the journal per OSD Then I configured ceph-ansible normally: * ran ‘ceph fsid’ to pick up the uuid used and edited group_vars/{all,mons,osds} with it (var fsid) * collected the monitor keyring here: /var/lib/ceph/mon/ceph-ceph-eno1/keyring and put it in group_vars/mons on monitor_secret * configured the monitor_interface variable in group_vars/all, this one might be tricky make sure that ceph-deploy used the right interface beforehand * change the journal_size variable in group_vars/all and used 5120 (ceph-deploy default) * change the public_network and cluster_network variables in group_vars/all * removed everything in ~./ceph-ansible/fetch * configure ceph-ansible to use a dedicated journal (journal_collocation: false and raw_multi_journal: true and edited raw_journal_devices variable) Eventually ran “ansible-playbook site.yml” and everything went well. I now have 3 monitors and 4 new OSDs per host all using the same SSDs, so 25 in total. Given that ceph-ansible follows ceph-deploy best practices, it worked without too much difficulty. I’d say that it depends how the cluster was bootstrapped in the first place. Also to answer your other questions, I haven't tried expanding the cluster with ceph-ansible yet. I'm playing around with it in vagrant/virtualbox, and it looks pretty awesome so far! If everything goes well, I'm not against revisiting the choice of puppet-ceph and replacing it with ceph-ansible. Awesome, don’t hesitate and let me know if I can help with this task. One other question, how well does ceph-ansible handle replacing a failed HDD (/dev/sdo) that has the journal at the beginning or middle of an SSD (/dev/sdd2)? At the moment, it doesn’t. Ceph-ansible just expects some basic mapping between OSDs and journals. ceph-disk will do the partitioning, so ceph-ansible doesn’t have any knowledge of the layout. It’d say that this intelligence should probably go intro ceph-disk itself or not but this idea will be to tell ceph-disk to re-use a partition that was a journal once. Then we can build another ansible playbook to re-populate a list of OSDs that died. I’ll have a look at that and will let you know. A bit more about device management in Ceph Ansible. For instance, depending on the scenario you choose. 
Let’s assume you go with dedicated SSDs for your journal, we have 2 variables: * devices (https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-osd/defaults/main.yml#L51): that contains a list of device where to store OSD data * raw_journal_devices (https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-osd/defaults/main.yml#L89): that contains the list of SSD that will host a journal So you can imagine having: devices: - /dev/sdb - /dev/sdc - /dev/sdd - /dev/sde raw_journal_devices: - /dev/sdu - /dev/sdu - /dev/sdv - /dev/sdv Where sdb, sdc will have sdu as a journal device and sdd, see will have sdv as a journal device. I should probably rework a little bit this part with an easier declaration though... Thanks, Bryan On 6/22/15, 7:09 AM, Sebastien Han s...@redhat.com wrote: Hi Bryan, It shouldn¹t be a problem for ceph-ansible to expand a cluster even if it wasn¹t deployed with it. I believe this requires a bit of tweaking on the ceph-ansible, but it¹s not much. Can you elaborate on what went wrong and perhaps how you configured ceph-ansible? As far as I understood, you haven¹t been able to grow the size of your cluster by adding new disks/nodes? Is this statement correct? One more thing, why don¹t you use ceph-ansible entirely to do the provisioning and life cycle management of your cluster? :) On 18 Jun 2015, at 00:14, Stillwell, Bryan bryan.stillw...@twcable.com wrote: I've been working on automating a lot of our ceph admin tasks lately and am pretty pleased with how the puppet-ceph module
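To make the variable mapping described above concrete, a group_vars sketch along those lines might look like the following. Every value is illustrative and has to be matched to the existing cluster, and variable names can change between ceph-ansible releases:

# group_vars/all
fsid: <output of `ceph fsid`>
monitor_interface: eth0
journal_size: 5120
public_network: 192.168.0.0/24
cluster_network: 192.168.1.0/24

# group_vars/mons
monitor_secret: <key from /var/lib/ceph/mon/<cluster>-<hostname>/keyring>

# group_vars/osds
journal_collocation: false
raw_multi_journal: true
devices:
  - /dev/sdb
  - /dev/sdc
raw_journal_devices:
  - /dev/sdu
  - /dev/sdu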
Re: [ceph-users] Expanding a ceph cluster with ansible
Hi Bryan, It shouldn’t be a problem for ceph-ansible to expand a cluster even if it wasn’t deployed with it. I believe this requires a bit of tweaking on the ceph-ansible, but it’s not much. Can you elaborate on what went wrong and perhaps how you configured ceph-ansible? As far as I understood, you haven’t been able to grow the size of your cluster by adding new disks/nodes? Is this statement correct? One more thing, why don’t you use ceph-ansible entirely to do the provisioning and life cycle management of your cluster? :) On 18 Jun 2015, at 00:14, Stillwell, Bryan bryan.stillw...@twcable.com wrote: I've been working on automating a lot of our ceph admin tasks lately and am pretty pleased with how the puppet-ceph module has worked for installing packages, managing ceph.conf, and creating the mon nodes. However, I don't like the idea of puppet managing the OSDs. Since we also use ansible in my group, I took a look at ceph-ansible to see how it might be used to complete this task. I see examples for doing a rolling update and for doing an os migration, but nothing for adding a node or multiple nodes at once. I don't have a problem doing this work, but wanted to check with the community if any one has experience using ceph-ansible for this? After a lot of trial and error I found the following process works well when using ceph-deploy, but it's a lot of steps and can be error prone (especially if you have old cephx keys that haven't been removed yet): # Disable backfilling and scrubbing to prevent too many performance # impacting tasks from happening at the same time. Maybe adding norecover # to this list might be a good idea so only peering happens at first. ceph osd set nobackfill ceph osd set noscrub ceph osd set nodeep-scrub # Zap the disks to start from a clean slate ceph-deploy disk zap dnvrco01-cephosd-025:sd{b..y} # Prepare the disks. I found sleeping between adding each disk can help # prevent performance problems. ceph-deploy osd prepare dnvrco01-cephosd-025:sdh:/dev/sdb; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdi:/dev/sdb; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdj:/dev/sdb; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdk:/dev/sdc; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdl:/dev/sdc; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdm:/dev/sdc; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdn:/dev/sdd; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdo:/dev/sdd; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdp:/dev/sdd; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdq:/dev/sde; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdr:/dev/sde; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sds:/dev/sde; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdt:/dev/sdf; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdu:/dev/sdf; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdv:/dev/sdf; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdw:/dev/sdg; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdx:/dev/sdg; sleep 15 ceph-deploy osd prepare dnvrco01-cephosd-025:sdy:/dev/sdg; sleep 15 # Weight in the new OSDs. We set 'osd_crush_initial_weight = 0' to prevent # them from being added in during the prepare step. Maybe a longer weight # in the last step would make this step unncessary. 
ceph osd crush reweight osd.450 1.09; sleep 60 ceph osd crush reweight osd.451 1.09; sleep 60 ceph osd crush reweight osd.452 1.09; sleep 60 ceph osd crush reweight osd.453 1.09; sleep 60 ceph osd crush reweight osd.454 1.09; sleep 60 ceph osd crush reweight osd.455 1.09; sleep 60 ceph osd crush reweight osd.456 1.09; sleep 60 ceph osd crush reweight osd.457 1.09; sleep 60 ceph osd crush reweight osd.458 1.09; sleep 60 ceph osd crush reweight osd.459 1.09; sleep 60 ceph osd crush reweight osd.460 1.09; sleep 60 ceph osd crush reweight osd.461 1.09; sleep 60 ceph osd crush reweight osd.462 1.09; sleep 60 ceph osd crush reweight osd.463 1.09; sleep 60 ceph osd crush reweight osd.464 1.09; sleep 60 ceph osd crush reweight osd.465 1.09; sleep 60 ceph osd crush reweight osd.466 1.09; sleep 60 ceph osd crush reweight osd.467 1.09; sleep 60 # Once all the OSDs are added to the cluster, allow the backfill process to # begin. ceph osd unset nobackfill # Then once cluster is healthy again, re-enable scrubbing ceph osd unset noscrub ceph osd unset nodeep-scrub This E-mail and any of its attachments may contain Time Warner Cable proprietary information, which is privileged, confidential, or subject to copyright belonging to Time Warner Cable. This E-mail is intended solely for the use of the individual or entity to which it is addressed. If you are not the intended recipient of this E-mail, you are hereby notified that any dissemination, distribution, copying, or action taken in
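The long runs of near-identical prepare/reweight commands above lend themselves to a small shell loop; a sketch using the OSD ids and weight from the example (adjust to your own layout):

for osd in $(seq 450 467); do
    ceph osd crush reweight osd.${osd} 1.09
    sleep 60
done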
Re: [ceph-users] rbd unmap command hangs when there is no network connection with mons and osds
Should we put a timeout to the unmap command on the RBD RA in the meantime? On 08 May 2015, at 15:13, Vandeir Eduardo vandeir.edua...@gmail.com wrote: Wouldn't be better a configuration named (map|unmap)_timeout? Cause we are talking about a map/unmap of a RBD device, not a mount/unmount of a file system. On Fri, May 8, 2015 at 10:04 AM, Ilya Dryomov idryo...@gmail.com wrote: On Fri, May 8, 2015 at 3:59 PM, Ilya Dryomov idryo...@gmail.com wrote: On Fri, May 8, 2015 at 1:18 PM, Vandeir Eduardo vandeir.edua...@gmail.com wrote: This causes an annoying problem with rbd resource agent in pacemaker. In a situation where pacemaker needs to stop a rbd resource agent on a node where there is no network connection, the rbd unmap command hangs. This causes the resource agent stop command to timeout and the node is fenced. On Thu, May 7, 2015 at 4:37 PM, Ilya Dryomov idryo...@gmail.com wrote: On Thu, May 7, 2015 at 10:20 PM, Vandeir Eduardo vandeir.edua...@gmail.com wrote: Hi, when issuing rbd unmap command when there is no network connection with mons and osds, the command hangs. Isn't there a option to force unmap even on this situation? No, but you can Ctrl-C the unmap command and that should do it. In the dmesg you'll see something like rbd: unable to tear down watch request and you may have to wait for the cluster to timeout the watch. We can probably add a --force to rbd unmap. That would require extending our sysfs interface but I don't see any obstacles. Sage? On a second thought, we can timeout our wait for a reply to a watch teardown request with a configurable timeout (mount_timeout). We might still need --force for more in the future, but for this particular problem the timeout is a better solution I think. I'll take care of it. Thanks, Ilya ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
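Until a --force or timeout lands in the kernel client, one workaround for the pacemaker case is to bound the unmap inside the resource agent itself. A hypothetical stop-path snippet (ocf_log and OCF_ERR_GENERIC are the standard OCF helpers; the 30-second value and the DEVICE variable are assumptions):

if ! timeout 30 rbd unmap "${DEVICE}"; then
    ocf_log err "rbd unmap of ${DEVICE} failed or timed out"
    return $OCF_ERR_GENERIC
fi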
Re: [ceph-users] Find out the location of OSD Journal
Under the OSD directory, you can look where the symlink points. This is generally called ‘journal’, it should point to a device. On 06 May 2015, at 06:54, Patrik Plank p.pl...@st-georgen-gusen.at wrote: Hi, i cant remember on which drive I install which OSD journal :-|| Is there any command to show this? thanks regards ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
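For example, on the OSD host (the path assumes the default data directory layout):

readlink -f /var/lib/ceph/osd/ceph-2/journal    # prints the journal device, e.g. /dev/sdb1
ceph-disk list                                  # also shows the data/journal pairing per disk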
Re: [ceph-users] Ceph is Full
If you change mon_osd_full_ratio you should restart the monitors; that shouldn't be a problem. For the unclean PGs, it looks like something is preventing them from becoming healthy again; look at the state of the OSDs responsible for these PGs. On 29 Apr 2015, at 05:06, Ray Sun xiaoq...@gmail.com wrote: mon osd full ratio Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
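If restarting the monitors is not an option right away, the ratio can also be pushed at runtime on the Hammer-era releases discussed here, just long enough to let deletes through; verify the exact command against your version before relying on it:

ceph pg set_full_ratio 0.98    # temporary, only to unblock deletes
# free some space (remove unneeded images/objects), wait for the cluster to settle, then
ceph pg set_full_ratio 0.95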
Re: [ceph-users] Ceph is Full
You can try to push the full ratio a bit further and then delete some objects. On 28 Apr 2015, at 15:51, Ray Sun xiaoq...@gmail.com wrote: More detail about ceph health detail [root@controller ~]# ceph health detail HEALTH_ERR 20 pgs backfill_toofull; 20 pgs degraded; 20 pgs stuck unclean; recovery 7482/129081 objects degraded (5.796%); 2 full osd(s); 1 near full osd(s) pg 3.8 is stuck unclean for 7067109.597691, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.7d is stuck unclean for 1852078.505139, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.21 is stuck unclean for 7072842.637848, current state active+degraded+remapped+backfill_toofull, last acting [0,2] pg 3.22 is stuck unclean for 7070880.213397, current state active+degraded+remapped+backfill_toofull, last acting [0,2] pg 3.a is stuck unclean for 7067057.863562, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.7f is stuck unclean for 7067122.493746, current state active+degraded+remapped+backfill_toofull, last acting [0,2] pg 3.5 is stuck unclean for 7067088.369629, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.1e is stuck unclean for 7073386.246281, current state active+degraded+remapped+backfill_toofull, last acting [0,2] pg 3.19 is stuck unclean for 7068035.310269, current state active+degraded+remapped+backfill_toofull, last acting [0,2] pg 3.5d is stuck unclean for 1852078.505949, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.1a is stuck unclean for 7067088.429544, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.1b is stuck unclean for 7072773.771385, current state active+degraded+remapped+backfill_toofull, last acting [0,2] pg 3.3 is stuck unclean for 7067057.864514, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.15 is stuck unclean for 7067088.825483, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.11 is stuck unclean for 7067057.862408, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.6d is stuck unclean for 7067083.634454, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.6e is stuck unclean for 7067098.452576, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.c is stuck unclean for 5658116.678331, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.e is stuck unclean for 7067078.646953, current state active+degraded+remapped+backfill_toofull, last acting [2,0] pg 3.20 is stuck unclean for 7067140.530849, current state active+degraded+remapped+backfill_toofull, last acting [0,2] pg 3.7d is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.7f is active+degraded+remapped+backfill_toofull, acting [0,2] pg 3.6d is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.6e is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.5d is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.20 is active+degraded+remapped+backfill_toofull, acting [0,2] pg 3.21 is active+degraded+remapped+backfill_toofull, acting [0,2] pg 3.22 is active+degraded+remapped+backfill_toofull, acting [0,2] pg 3.1e is active+degraded+remapped+backfill_toofull, acting [0,2] pg 3.19 is active+degraded+remapped+backfill_toofull, acting [0,2] pg 3.1a is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.1b is 
active+degraded+remapped+backfill_toofull, acting [0,2] pg 3.15 is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.11 is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.c is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.e is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.8 is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.a is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.5 is active+degraded+remapped+backfill_toofull, acting [2,0] pg 3.3 is active+degraded+remapped+backfill_toofull, acting [2,0] recovery 7482/129081 objects degraded (5.796%) osd.0 is full at 95% osd.2 is full at 95% osd.1 is near full at 93% Best Regards -- Ray On Tue, Apr 28, 2015 at 9:43 PM, Ray Sun xiaoq...@gmail.com wrote: Emergency Help! One of ceph cluster is full, and ceph -s returns: [root@controller ~]# ceph -s cluster 059f27e8-a23f-4587-9033-3e3679d03b31 health HEALTH_ERR 20 pgs backfill_toofull; 20 pgs degraded; 20 pgs stuck unclean; recovery 7482/129081 objects degraded (5.796%); 2 full osd(s); 1 near full osd(s) monmap e6: 4 mons at {node-5e40.cloud.com=10.10.20.40:6789/0,node-6670.cloud.com=10.10.20.31:6789/0,node-66c4.cloud.com=10.10.20.36:6789/0,node-fb27.cloud.com=10.10.20.41:6789/0}, election epoch 886, quorum 0,1,2,3
Re: [ceph-users] Ceph recovery network?
Well yes “pretty much” the same thing :). I think some people would like to distinguish recovery from replication and maybe perform some QoS around these 2. We have to replicate while recovering so one can impact the other. In the end, I just think it’s a doc issue, still waiting for a dev to answer :). On 27 Apr 2015, at 00:50, Robert LeBlanc rob...@leblancnet.us wrote: My understanding is that Monitors monitor the public address of the OSDs and other OSDs monitor the cluster address of the OSDs. Replication, recovery and backfill traffic all use the same network when you specify 'cluster network = network/mask' in your ceph.conf. It is useful to remember that replication, recovery and backfill traffic are pretty much the same thing, just at different points in time. On Sun, Apr 26, 2015 at 4:39 PM, Sebastien Han sebastien@enovance.com wrote: Hi list, While reading this http://ceph.com/docs/master/rados/configuration/network-config-ref/#ceph-networks, I came across the following sentence: You can also establish a separate cluster network to handle OSD heartbeat, object replication and recovery traffic” I didn’t know it was possible to perform such stretching, at least for recovery traffic. Replication is generally handled by the cluster_network_addr and the heartbeat can be used with osd_heartbeat_addr. Although I’m a bit confused by the osd_heartbeat_addr since I thought the heartbeat was binding on both public and cluster addresses. So my question is: how to isolate the recovery traffic to specific network? Thanks! Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
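For reference, the public/cluster split mentioned above is just a ceph.conf setting; the subnets here are placeholders:

[global]
public network  = 192.168.0.0/24
cluster network = 192.168.1.0/24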
[ceph-users] Ceph recovery network?
Hi list, While reading this http://ceph.com/docs/master/rados/configuration/network-config-ref/#ceph-networks, I came across the following sentence: You can also establish a separate cluster network to handle OSD heartbeat, object replication and recovery traffic” I didn’t know it was possible to perform such stretching, at least for recovery traffic. Replication is generally handled by the cluster_network_addr and the heartbeat can be used with osd_heartbeat_addr. Although I’m a bit confused by the osd_heartbeat_addr since I thought the heartbeat was binding on both public and cluster addresses. So my question is: how to isolate the recovery traffic to specific network? Thanks! Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph cluster on docker containers
You can have a look at: https://github.com/ceph/ceph-docker On 23 Mar 2015, at 17:16, Pavel V. Kaygorodov pa...@inasan.ru wrote: Hi! I'm using ceph cluster, packed to a number of docker containers. There are two things, which you need to know: 1. Ceph OSDs are using FS attributes, which may not be supported by filesystem inside docker container, so you need to mount external directory inside a container to store OSD data. 2. Ceph monitors must have static external IP-s, so you have to use lxc-conf directives to use static IP-s inside containers. With best regards, Pavel. 6 марта 2015 г., в 10:15, Sumit Gaur sumitkg...@gmail.com написал(а): Hi I need to know if Ceph has any Docker story. What I am not abel to find if there are any predefined steps for ceph cluster to be deployed on Docker containers. Thanks sumit 201503061614748_BEI0XT4N.gif ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
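As an illustration of both points (an external data directory and a fixed monitor IP), a typical invocation of the ceph/daemon image from that repository looks roughly like this; the addresses and network are assumptions and the expected environment variables may differ between image versions:

docker run -d --net=host \
  -v /etc/ceph:/etc/ceph \
  -v /var/lib/ceph:/var/lib/ceph \
  -e MON_IP=192.168.0.20 \
  -e CEPH_PUBLIC_NETWORK=192.168.0.0/24 \
  ceph/daemon mon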
Re: [ceph-users] Sparse RBD instance snapshots in OpenStack
Several patches aim to solve that by using RBD snapshots instead of QEMU snapshots. Unfortunately I doubt we will have something ready for OpenStack Juno. Hopefully Liberty will be the release that fixes that. Having RAW images is not that bad since booting from that snapshot will do a clone. So not sure if doing sparsify a good idea (libguestfs should be able to do that). However it’s better we could do that via RBD snapshots so we can have best of both worlds. On 12 Mar 2015, at 03:45, Charles 'Boyo charlesb...@gmail.com wrote: Hello all. The current behavior of snapshotting instances RBD-backed in OpenStack involves uploading the snapshot into Glance. The resulting Glance image is fully allocated, causing an explosion of originally sparse RAW images. Is there a way to preserve the sparseness? Else I can use qemu-img convert (or rbd export/import) to manually sparsify it? On a related note, my Glance is also backed by the same Ceph cluster, in another pool and I was wondering if Ceph snapshots would not be a better way to do this. Any ideas? Regards, Charles Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
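In the meantime, re-sparsifying a snapshot image by hand is straightforward; both variants below are sketches with placeholder file names, the first using libguestfs as mentioned above:

virt-sparsify snapshot.raw snapshot-sparse.raw
# or: a raw-to-raw qemu-img convert also skips zeroed blocks in the output file
qemu-img convert -O raw snapshot.raw snapshot-sparse.raw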
Re: [ceph-users] OSD on LVM volume
A while ago, I managed to have this working but this was really tricky. See my comment here: https://github.com/ceph/ceph-ansible/issues/9#issuecomment-37127128 One use case I had was a system with 2 SSD for the OS and a couple of OSDs. Both SSD were in RAID1 and the system was configured with lvm already. So we had to create LVs for each journals. On 24 Feb 2015, at 14:41, Jörg Henne henn...@gmail.com wrote: 2015-02-24 14:05 GMT+01:00 John Spray john.sp...@redhat.com: I imagine that without proper partition labels you'll also not get the benefit of e.g. the udev magic that allows plugging OSDs in/out of different hosts. More generally you'll just be in a rather non standard configuration that will confuse anyone working on the host. Ok, thanks for the heads up! Can I ask why you want to use LVM? It is not generally necessary or useful with Ceph: Ceph expects to be fed raw drives. I am currently just experimenting with ceph. Although I have a reasonable number of lab nodes, those nodes are shared with other experimentation and thus it would be rather inconvenient to dedicate the raw disks exclusively to ceph. Joerg Henne ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Journals on all SSD cluster
It has been proven that the OSDs can’t take advantage of the SSD, so I’ll probably collocate both journal and osd data. Search in the ML for [Single OSD performance on SSD] Can't go over 3, 2K IOPS You will see that there is no difference it terms of performance between the following: * 1 SSD for journal + 1 SSD for osd data * 1 SSD for both journal and data What you can do in order to max out your SSD is to run multiple journals and osd data on the same SSD. Something like this gave me more IOPS: * /dev/sda1 ceph journal * /dev/sda2 ceph data * /dev/sda3 ceph journal * /dev/sda4 ceph data On 21 Jan 2015, at 04:32, Andrew Thrift and...@networklabs.co.nz wrote: Hi All, We have a bunch of shiny new hardware we are ready to configure for an all SSD cluster. I am wondering what are other people doing for their journal configuration on all SSD clusters ? - Seperate Journal partition and OSD partition on each SSD or - Journal on OSD Thanks, Andrew ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] how do I show active ceph configuration
You can use the admin socket: $ ceph daemon mon.id config show or locally ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config show On 21 Jan 2015, at 19:46, Robert Fantini robertfant...@gmail.com wrote: Hello Is there a way to see running / acrive ceph.conf configuration items? kind regards Rob Fantini ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
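The same socket can also return a single option instead of the whole configuration; osd_max_backfills is just an example option name:

ceph daemon osd.2 config get osd_max_backfills
# or locally via the socket file
ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config get osd_max_backfills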
Re: [ceph-users] reset osd perf counters
It was added in 0.90 On 13 Jan 2015, at 00:11, Gregory Farnum g...@gregs42.com wrote: perf reset on the admin socket. I'm not sure what version it went in to; you can check the release logs if it doesn't work on whatever you have installed. :) -Greg On Mon, Jan 12, 2015 at 2:26 PM, Shain Miley smi...@npr.org wrote: Is there a way to 'reset' the osd perf counters? The numbers for osd 73 though osd 83 look really high compared to the rest of the numbers I see here. I was wondering if I could clear the counters out, so that I have a fresh set of data to work with. root@cephmount1:/var/log/samba# ceph osd perf osdid fs_commit_latency(ms) fs_apply_latency(ms) 0 0 45 1 0 14 2 0 47 3 0 25 4 1 44 5 12 6 12 7 0 39 8 0 32 9 0 34 10 2 186 11 0 68 12 11 13 0 34 14 01 15 2 37 16 0 23 17 0 28 18 0 26 19 0 22 20 02 21 2 24 22 0 33 23 01 24 3 98 25 2 70 26 01 27 3 99 28 02 29 2 101 30 2 72 31 2 81 32 3 112 33 3 94 34 4 152 35 0 56 36 02 37 2 58 38 01 39 03 40 02 41 02 42 11 43 02 44 1 44 45 02 46 01 47 3 85 48 01 49 2 75 50 4 398 51 3 115 52 01 53 2 47 54 6 290 55 5 153 56 7 453 57 2 66 58 11 59 5 196 60 00 61 0 93 62 09 63 01 64 01 65 04 66 01 67 0 18 68 0 16 69 0 81 70 0 70 71 00 72 01 7374 1217 74 01 7564 1238 7692 1248 77 01 78 01 79 109 1333 8068 1451 8166 1192 8295 1215 8381 1331 84 3 56 85 3 65 86 01 87 3
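For example, against one of the suspect OSDs (run on the host where that daemon lives; 'all' can be replaced by a single counter name):

ceph daemon osd.73 perf reset all
ceph osd perf    # the numbers then repopulate from fresh data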
Re: [ceph-users] Spark/Mesos on top of Ceph/Btrfs
Hey What do you want to use from Ceph? RBD? CephFS? It is not really clear, you mentioned ceph/btfrs which makes me either think of using btrfs for OSD store or btrfs on top of a RBD device. Later you mentioned HDFS, does that mean you want to use CephFS? I don’t know much about Mesos, but what is so specific about Mesos that make you think that you will experience trouble using it with Ceph? On 13 Jan 2015, at 14:25, James wirel...@tampabay.rr.com wrote: Hello, I was wondering if anyone has Mesos running on top of Ceph? I want to test/use Ceph if lieu of HDFS. I'm working on Gentoo, but any experiences with Mesos on Ceph are of keen interest to me as related to performance, stability and any difficulties experienced. James ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Ceph as backend for Swift
You can have a look of what I did here with Christian: * https://github.com/stackforge/swift-ceph-backend * https://github.com/enovance/swiftceph-ansible If you have further question just let us know. On 08 Jan 2015, at 15:51, Robert LeBlanc rob...@leblancnet.us wrote: Anyone have a reference for documentation to get Ceph to be a backend for Swift? Thanks, Robert LeBlanc ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Number of SSD for OSD journal
Salut, The general recommended ratio (for me at least) is 3 journals per SSD. Using a 200GB Intel DC S3700 is great. If you're going with a low perf scenario I don't think you should bother buying SSDs; just remove them from the picture and do 12 SATA 7.2K 4TB. For medium and medium ++ perf, a 1:11 ratio is way too high; the SSD will definitely be the bottleneck here. Please also note that (bandwidth wise) with 22 drives you're already hitting the theoretical limit of a 10Gbps network (~50 MB/s * 22 ≈ 1.1 GB/s, i.e. roughly 8.8 Gbit/s). You can theoretically up that value with LACP (depending on the xmit_hash_policy you're using of course). Btw what's the network? (since I'm only assuming here). On 15 Dec 2014, at 20:44, Florent MONTHEL fmont...@flox-arts.net wrote: Hi, I'm buying several servers to test CEPH and I would like to configure the journal on SSD drives (maybe it's not necessary for all use cases). Could you help me identify the number of SSDs I need (SSDs are very expensive and a GB-price business case killer…)? I don't want to experience an SSD bottleneck (some abacus?). I think I will go with CONF 2 or 3 below: CONF 1 DELL 730XC "Low Perf": 10x SATA 7.2K 3.5" 4TB + 2x SSD 2.5" 200GB intensive write CONF 2 DELL 730XC "Medium Perf": 22x SATA 7.2K 2.5" 1TB + 2x SSD 2.5" 200GB intensive write CONF 3 DELL 730XC "Medium Perf ++": 22x SAS 10K 2.5" 1TB + 2x SSD 2.5" 200GB intensive write Thanks Florent Monthel ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Ceph Block device and Trim/Discard
Discard works with virtio-scsi controllers for disks in QEMU. Just use discard=unmap in the disk section (scsi disk). On 12 Dec 2014, at 13:17, Max Power mailli...@ferienwohnung-altenbeken.de wrote: Wido den Hollander w...@42on.com hat am 12. Dezember 2014 um 12:53 geschrieben: It depends. Kernel RBD does not support discard/trim yet. Qemu does under certain situations and with special configuration. Ah, Thank you. So this is my problem. I use rbd with the kernel modules. I think I should port my fileserver to qemu/kvm environment then and hope that it is safe to have a big qemu-partition with around 10 TB. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
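A trimmed libvirt sketch of what that looks like for an RBD-backed disk; the image name, monitor host and device names are placeholders. The relevant pieces are the virtio-scsi controller, bus='scsi' on the target and discard='unmap' on the driver:

<controller type='scsi' model='virtio-scsi'/>
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
  <source protocol='rbd' name='rbd/myimage'>
    <host name='mon1.example.com' port='6789'/>
  </source>
  <target dev='sda' bus='scsi'/>
</disk>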
Re: [ceph-users] Watch for fstrim running on your Ubuntu systems
Good to know. Thanks for sharing! On 09 Dec 2014, at 10:21, Wido den Hollander w...@42on.com wrote: Hi, Last sunday I got a call early in the morning that a Ceph cluster was having some issues. Slow requests and OSDs marking each other down. Since this is a 100% SSD cluster I was a bit confused and started investigating. It took me about 15 minutes to see that fstrim was running and was utilizing the SSDs 100%. On Ubuntu 14.04 there is a weekly CRON which executes fstrim-all. It detects all mountpoints which can be trimmed and starts to trim those. On the Intel SSDs used here it caused them to become 100% busy for a couple of minutes. That was enough for them to no longer respond on heartbeats, thus timing out and being marked down. Luckily we had the out interval set to 1800 seconds on that cluster, so no OSD was marked as out. fstrim-all does not execute fstrim with a ionice priority. From what I understand, but haven't tested yet, is that running fstrim with ionice -c Idle should solve this. It's weird that this issue didn't come up earlier on that cluster, but after killing fstrim all problems we resolved and the cluster ran happily again. So watch out for fstrim on early Sunday mornings on Ubuntu! -- Wido den Hollander 42on B.V. Ceph trainer and consultant Phone: +31 (0)20 700 9902 Skype: contact42on ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
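The manual equivalent, which also works as a drop-in change to the weekly cron job, is along these lines (the mountpoint is an example):

ionice -c 3 fstrim -v /var/lib/ceph/osd/ceph-0    # -c 3 selects the idle I/O scheduling class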
Re: [ceph-users] Tool or any command to inject metadata/data corruption on rbd
AFAIK there is no tool to do this. You simply rm object or dd a new content in the object (fill with zero) On 04 Dec 2014, at 13:41, Mallikarjun Biradar mallikarjuna.bira...@gmail.com wrote: Hi all, I would like to know which tool or cli that all users are using to simulate metadata/data corruption. This is to test scrub operation. -Thanks regards, Mallikarjun Biradar ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
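A rough recipe for doing that by hand on a test cluster; the object name, PG id and FileStore path are illustrative and have to be looked up on your own cluster (never do this on data you care about):

# locate the object and the PG / acting set it maps to
ceph osd map rbd my-test-object
# on one of the acting OSD hosts, overwrite part of the stored file in place
dd if=/dev/zero of=/var/lib/ceph/osd/ceph-2/current/<pgid>_head/<object-file> bs=4k count=1 conv=notrunc
# then trigger a scrub and watch it flag the inconsistency
ceph pg deep-scrub <pgid>
ceph health detail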
Re: [ceph-users] Suitable SSDs for journal
Eneko, I do have plan to push to a performance initiative section on the ceph.com/docs sooner or later so people will put their own results through github PR. On 04 Dec 2014, at 16:09, Eneko Lacunza elacu...@binovo.es wrote: Thanks, will look back in the list archive. On 04/12/14 15:47, Nick Fisk wrote: Hi Eneko, There has been various discussions on the list previously as to the best SSD for Journal use. All of them have pretty much come to the conclusion that the Intel S3700 models are the best suited and in fact work out the cheapest in terms of write durability. Nick -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Eneko Lacunza Sent: 04 December 2014 14:35 To: Ceph Users Subject: [ceph-users] Suitable SSDs for journal Hi all, Does anyone know about a list of good and bad SSD disks for OSD journals? I was pointed to http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/ But I was looking for something more complete? For example, I have a Samsung 840 Pro that gives me even worse performance than a Crucial m550... I even thought it was dying (but doesn't seem this is the case). Maybe creating a community-contributed list could be a good idea? Regards Eneko -- Zuzendari Teknikoa / Director Técnico Binovo IT Human Project, S.L. Telf. 943575997 943493611 Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) www.binovo.es ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Zuzendari Teknikoa / Director Técnico Binovo IT Human Project, S.L. Telf. 943575997 943493611 Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) www.binovo.es ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
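The test from the blog post referenced above boils down to a single-threaded, O_DSYNC, 4k sequential write; something along these lines (destructive for any data on /dev/sdX):

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test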
Re: [ceph-users] All SSD storage and journals
There were also some investigations around F2FS (https://www.kernel.org/doc/Documentation/filesystems/f2fs.txt), but the last time I tried to install an OSD dir under f2fs it failed: I tried to run the OSD on f2fs, however ceph-osd mkfs got stuck on a xattr test: fremovexattr(10, user.test@5848273) = 0 Maybe someone from the core dev has an update on this? On 24 Oct 2014, at 07:58, Christian Balzer ch...@gol.com wrote: Hello, as others have reported in the past and now having tested things here myself, there really is no point in having journals for SSD-backed OSDs on other SSDs. It is a zero sum game, because: a) using that journal SSD as another OSD with integrated journal will yield the same overall result performance-wise, if all SSDs are the same, and in addition its capacity will be made available for actual storage. b) if the journal SSD is faster than the OSD SSDs it tends to be priced accordingly. For example the DC P3700 400GB is about twice as fast (write) and expensive as the DC S3700 400GB. Things _may_ be different if one doesn't look at bandwidth but IOPS (though certainly not in the near future in regard to Ceph actually getting SSDs busy), but even there the difference is negligible when for example comparing the Intel S and P models in write performance. Reads are another thing, but nobody cares about those in journals. ^o^ Obvious things that come to mind in this context would be the ability to disable journals (difficult, I know, not touching BTRFS, thank you) and probably K/V store in the future. Regards, Christian -- Christian Balzer Network/Systems Engineer ch...@gol.com Global OnLine Japan/Fusion Communications http://www.gol.com/ ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
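For anyone who wants to reproduce the f2fs attempt, a quick xattr sanity check along the lines of what ceph-osd mkfs does might look like this (untested sketch; device and mountpoint are illustrative):

mkfs.f2fs /dev/sdX1
mount -t f2fs /dev/sdX1 /mnt/f2fs-test
touch /mnt/f2fs-test/probe
setfattr -n user.test -v somevalue /mnt/f2fs-test/probe
getfattr -n user.test /mnt/f2fs-test/probe
setfattr -x user.test /mnt/f2fs-test/probe    # the fremovexattr step that hung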
Re: [ceph-users] Performance doesn't scale well on a full ssd cluster.
Mark, please read this: https://www.mail-archive.com/ceph-users@lists.ceph.com/msg12486.html On 16 Oct 2014, at 19:19, Mark Wu wud...@gmail.com wrote: Thanks for the detailed information. but I am already using fio with rbd engine. Almost 4 volumes can reach the peak. 2014 年 10 月 17 日 上午 1:03于 wud...@gmail.com写道: Thanks for the detailed information. but I am already using fio with rbd engine. Almost 4 volumes can reach the peak. 2014 年 10 月 17 日 上午 12:55于 Daniel Schwager daniel.schwa...@dtnet.de写道: Hi Mark, maybe you will check rbd-enabled fio http://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html yum install ceph-devel git clone git://git.kernel.dk/fio.git cd fio ; ./configure ; make -j5 ; make install Setup the number of jobs (==clients) inside fio config to numjobs=8 for simulating multiple clients. regards Danny my test.fio: [global] #logging #write_iops_log=write_iops_log #write_bw_log=write_bw_log #write_lat_log=write_lat_log ioengine=rbd clientname=admin pool=rbd rbdname=myimage invalidate=0# mandatory rw=randwrite bs=1m runtime=120 iodepth=8 numjobs=8 time_based #direct=0 [seq-write] stonewall rw=write #[seq-read] #stonewall #rw=read ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
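To run the job file above, the rbd engine only needs a pre-created image and a readable ceph.conf/keyring for the configured client; a minimal sketch using the pool and image names from the job file:

# fio must be built with rbd support (./configure should list the rbd engine)
rbd create myimage --size 10240 --pool rbd    # 10 GB image, size is in MB
fio test.fio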
Re: [ceph-users] Micro Ceph summit during the OpenStack summit
Hey all, I just saw this thread, I’ve been working on this and was about to share it: https://etherpad.openstack.org/p/kilo-ceph Since the ceph etherpad is down I think we should switch to this one as an alternative. Loic, feel free to work on this one and add more content :). On 13 Oct 2014, at 05:46, Blair Bethwaite blair.bethwa...@gmail.com wrote: Hi Loic, I'll be there and interested to chat with other Cephers. But your pad isn't returning any page data... Cheers, On 11 October 2014 08:48, Loic Dachary l...@dachary.org wrote: Hi Ceph, TL;DR: please register at http://pad.ceph.com/p/kilo if you're attending the OpenStack summit November 3 - 7 in Paris will be the OpenStack summit in Paris https://www.openstack.org/summit/openstack-paris-summit-2014/, an opportunity to meet with Ceph developers and users. We will have a conference room dedicated to Ceph (half a day, date to be determined). Instead of preparing an abstract agenda, it is more interesting to find out who will be there and what topics we would like to talk about. In the spirit of the OpenStack summit it would make sense to primarily discuss the implementation proposals of various features and improvements scheduled for the next Ceph release, Hammer. The online Ceph Developer Summit http://ceph.com/community/ceph-developer-summit-hammer/ is scheduled the week before and we will have plenty of material. If you're attending the OpenStack summit, please add yourself to http://pad.ceph.com/p/kilo and list the topics you'd like to discuss. Next week Josh Durgin and myself will spend some time to prepare this micro Ceph summit and make it a lively and informative experience :-) Cheers -- Loïc Dachary, Artisan Logiciel Libre -- Cheers, ~Blairo ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] RBD on openstack glance+cinder CoW?
Hum I just tried on a devstack and on firefly stable, it works for me. Looking at your config it seems that the glance_api_version=2 is put in the wrong section. Please move it to [DEFAULT] and let me know if it works. On 08 Oct 2014, at 14:28, Nathan Stratton nat...@robotics.net wrote: On Tue, Oct 7, 2014 at 5:35 PM, Jonathan Proulx j...@jonproulx.com wrote: Hi All, We're running Firefly on the ceph side and Icehouse on the OpenStack side I've pulled the recommended nova branch from https://github.com/angdraug/nova/tree/rbd-ephemeral-clone-stable-icehouse according to http://ceph.com/docs/master/rbd/rbd-openstack/#booting-from-a-block-device: When Glance and Cinder are both using Ceph block devices, the image is a copy-on-write clone, so it can create a new volume quickly I'm not seeing this, even though I have glance setup in such away that nova does create copy on write clones when booting ephemeral instances of the same image. Cinder downloads the glance RBD than pushes it back up as full copy. Since Glance - Nova is working (has the show_image_direct_url=True etc...) I suspect a problem with my Cinder config, this is what I added for rbd support: [rbd] volume_driver=cinder.volume.drivers.rbd.RBDDriver rbd_pool=volumes rbd_ceph_conf=/etc/ceph/ceph.conf rbd_flatten_volume_from_snapshot=false rbd_max_clone_depth=5 glance_api_version=2 rbd_user=USER rbd_secret_uuid=UUID volume_backend_name=rbd Note it does *work* just not doing CoW. Am I missing something here? I am running into the same thing, when I import a temp file is created in /var/lib/cinder/conversion. Everything works, it just is not CoW. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
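A minimal sketch of the layout being suggested above, with glance_api_version=2 under [DEFAULT] instead of inside the backend section; values are illustrative:

[DEFAULT]
enabled_backends = rbd
glance_api_version = 2

[rbd]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_flatten_volume_from_snapshot = false
rbd_max_clone_depth = 5
rbd_user = USER
rbd_secret_uuid = UUID
volume_backend_name = rbd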
Re: [ceph-users] rbd + openstack nova instance snapshots?
Hi, Unfortunately this is expected. If you take a snapshot you should not expect a clone but a RBD snapshot. Please see this BP: https://blueprints.launchpad.net/nova/+spec/implement-rbd-snapshots-instead-of-qemu-snapshots A major part of the code is ready, however we missed nova-specs feature freeze so we haven’t proposed anything for Juno. So we will push something for Kilo. On 01 Oct 2014, at 06:12, Jonathan Proulx j...@jonproulx.com wrote: Hi All, I'm working on integrating our new Ceph cluster with our older OpenStack infrastructure. It's going pretty well so far but looking to check my expectations. We're running Firefly on the ceph side and Icehouse on the OpenStack side. I've pulled the recommnded nova branch from https://github.com/angdraug/nova/tree/rbd-ephemeral-clone-stable-icehouse on my test nova nodes and have happily gotten instances booting from CoW clones of images stored in glance's rbd pool. I notice if I take a snapshot of that instance however, rather than making a clone as I'd hoped the hypervisor is pulling down a copy of the instance rbd to local disk then shipping that full sized raw image back to glance to be uploaded back into rbd. Is this expected? Am I misconfiguring some thing (glance and nova are using different pools, which works to launch a cloned instance but maybe doesn't work in reverse)? Is there another patch I need to pull? Thanks, -Jon ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] rbd + openstack nova instance snapshots?
On 01 Oct 2014, at 15:26, Jonathan Proulx j...@jonproulx.com wrote: On Wed, Oct 1, 2014 at 2:57 AM, Sebastien Han sebastien@enovance.com wrote: Hi, Unfortunately this is expected. If you take a snapshot you should not expect a clone but a RBD snapshot. Unfortunate that it doesn't work, but fortunate for me I don't need to figure out what I'm doing wrong :) Wait a second, let me rephrase this: If you take a snapshot you should not expect a clone but a RBD snapshot. You’re not doing anything wrong here :). If this was implemented you would get a RBD snapshot not a clone, meaning that the design approach to this BP is to use snapshots and not clones. Sorry for the confusion. Now what you get is a local snapshot on the compute node that gets streamed through Glance (and in Ceph). Please see this BP: https://blueprints.launchpad.net/nova/+spec/implement-rbd-snapshots-instead-of-qemu-snapshots Merci, De rien. -Jon Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
What about writes with Giant? On 18 Sep 2014, at 08:12, Zhang, Jian jian.zh...@intel.com wrote: Have anyone ever testing multi volume performance on a *FULL* SSD setup? We are able to get ~18K IOPS for 4K random read on a single volume with fio (with rbd engine) on a 12x DC3700 Setup, but only able to get ~23K (peak) IOPS even with multiple volumes. Seems the maximum random write performance we can get on the entire cluster is quite close to single volume performance. Thanks Jian -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Sebastien Han Sent: Tuesday, September 16, 2014 9:33 PM To: Alexandre DERUMIER Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Hi, Thanks for keeping us updated on this subject. dsync is definitely killing the ssd. I don't have much to add, I'm just surprised that you're only getting 5299 with 0.85 since I've been able to get 6,4K, well I was using the 200GB model, that might explain this. On 12 Sep 2014, at 16:32, Alexandre DERUMIER aderum...@odiso.com wrote: here the results for the intel s3500 max performance is with ceph 0.85 + optracker disabled. intel s3500 don't have d_sync problem like crucial %util show almost 100% for read and write, so maybe the ssd disk performance is the limit. I have some stec zeusram 8GB in stock (I used them for zfs zil), I'll try to bench them next week. INTEL s3500 --- raw disk randread: fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k --iodepth=32 --group_reporting --invalidate=0 --name=abc --ioengine=aio bw=288207KB/s, iops=72051 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 0,00 0,00 73454,000,00 293816,00 0,00 8,00 30,960,420,420,00 0,01 99,90 randwrite: fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k --iodepth=32 --group_reporting --invalidate=0 --name=abc --ioengine=aio --sync=1 bw=48131KB/s, iops=12032 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 0,00 0,000,00 24120,00 0,00 48240,00 4,00 2,080,090,000,09 0,04 100,00 ceph 0.80 - randread: no tuning: bw=24578KB/s, iops=6144 randwrite: bw=10358KB/s, iops=2589 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 0,00 373,000,00 8878,00 0,00 34012,50 7,66 1,630,180,000,18 0,06 50,90 ceph 0.85 : - randread : bw=41406KB/s, iops=10351 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 2,00 0,00 10425,000,00 41816,00 0,00 8,02 1,360,130,130,00 0,07 75,90 randwrite : bw=17204KB/s, iops=4301 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 0,00 333,000,00 9788,00 0,00 57909,0011,83 1,460,150,000,15 0,07 67,80 ceph 0.85 tuning op_tracker=false randread : bw=86537KB/s, iops=21634 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 25,00 0,00 21428,000,00 86444,00 0,00 8,07 3,130,150,150,00 0,05 98,00 randwrite: bw=21199KB/s, iops=5299 Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 0,00 1563,000,00 9880,00 0,00 75223,5015,23 2,090,210,000,21 0,07 80,00 - Mail original - De: Alexandre DERUMIER aderum...@odiso.com À: Cedric Lemarchand ced...@yipikai.org Cc: ceph-users@lists.ceph.com Envoyé: Vendredi 12 Septembre 2014 08:15:08 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS results of fio on rbd with kernel patch 
fio rbd crucial m550 1 osd 0.85 (osd_enable_op_tracker true or false, same result): --- bw=12327KB/s, iops=3081 So no much better than before, but this time, iostat show only 15% utils, and latencies are lower Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 0,00 29,00 0,00 3075,00 0,00 36748,50 23,90 0,29 0,10 0,00 0,10 0,05 15,20 So, the write bottleneck seem to be in ceph. I will send s3500 result today - Mail original - De: Alexandre DERUMIER aderum...@odiso.com À
Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
5225,00 0,00 29678,00 11,36 57,63 11,03 0,00 11,03 0,19 99,70 (I don't understand what exactly is %util, 100% in the 2 cases, because 10x slower with ceph) It would be interesting if you could catch the size of writes on SSD during the bench through librbd (I know nmon can do that) Replying to myself ... I ask a bit quickly in the way we already have this information (29678 / 5225 = 5,68Ko), but this is irrelevant. Cheers It could be a dsync problem, result seem pretty poor # dd if=rand.file of=/dev/sdb bs=4k count=65536 oflag=direct 65536+0 enregistrements lus 65536+0 enregistrements écrits 268435456 octets (268 MB) copiés, 2,77433 s, 96,8 MB/s # dd if=rand.file of=/dev/sdb bs=4k count=65536 oflag=dsync,direct ^C17228+0 enregistrements lus 17228+0 enregistrements écrits 70565888 octets (71 MB) copiés, 70,4098 s, 1,0 MB/s I'll do tests with intel s3500 tomorrow to compare - Mail original - De: Sebastien Han sebastien@enovance.com À: Warren Wang warren_w...@cable.comcast.com Cc: ceph-users@lists.ceph.com Envoyé: Lundi 8 Septembre 2014 22:58:25 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS They definitely are Warren! Thanks for bringing this here :). On 05 Sep 2014, at 23:02, Wang, Warren warren_w...@cable.comcast.com wrote: +1 to what Cedric said. Anything more than a few minutes of heavy sustained writes tended to get our solid state devices into a state where garbage collection could not keep up. Originally we used small SSDs and did not overprovision the journals by much. Manufacturers publish their SSD stats, and then in very small font, state that the attained IOPS are with empty drives, and the tests are only run for very short amounts of time. Even if the drives are new, it's a good idea to perform an hdparm secure erase on them (so that the SSD knows that the blocks are truly unused), and then overprovision them. You'll know if you have a problem by watching for utilization and wait data on the journals. One of the other interesting performance issues is that the Intel 10Gbe NICs + default kernel that we typically use max out around 1million packets/sec. It's worth tracking this metric to if you are close. I know these aren't necessarily relevant to the test parameters you gave below, but they're worth keeping in mind. -- Warren Wang Comcast Cloud (OpenStack) From: Cedric Lemarchand ced...@yipikai.org Date: Wednesday, September 3, 2014 at 5:14 PM To: ceph-users@lists.ceph.com ceph-users@lists.ceph.com Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Le 03/09/2014 22:11, Sebastien Han a écrit : Hi Warren, What do mean exactly by secure erase? At the firmware level with constructor softwares? SSDs were pretty new so I don’t we hit that sort of things. I believe that only aged SSDs have this behaviour but I might be wrong. Sorry I forgot to reply to the real question ;-) So yes it only plays after some times, for your case, if the SSD still delivers write IOPS specified by the manufacturer, it will doesn't help in any ways. But it seems this practice is nowadays increasingly used. Cheers On 02 Sep 2014, at 18:23, Wang, Warren warren_w...@cable.comcast.com wrote: Hi Sebastien, Something I didn't see in the thread so far, did you secure erase the SSDs before they got used? I assume these were probably repurposed for this test. 
We have seen some pretty significant garbage collection issue on various SSD and other forms of solid state storage to the point where we are overprovisioning pretty much every solid state device now. By as much as 50% to handle sustained write operations. Especially important for the journals, as we've found. Maybe not an issue on the short fio run below, but certainly evident on longer runs or lots of historical data on the drives. The max transaction time looks pretty good for your test. Something to consider though. Warren -Original Message- From: ceph-users [ mailto:ceph-users-boun...@lists.ceph.com ] On Behalf Of Sebastien Han Sent: Thursday, August 28, 2014 12:12 PM To: ceph-users Cc: Mark Nelson Subject: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Hey all, It has been a while since the last thread performance related on the ML :p I've been running some experiment to see how much I can get from an SSD on a Ceph cluster. To achieve that I did something pretty simple: * Debian wheezy 7.6 * kernel from debian 3.14-0.bpo.2-amd64 * 1 cluster, 3 mons (i'd like to keep this realistic since in a real deployment i'll use 3) * 1 OSD backed by an SSD (journal and osd data on the same device) * 1 replica count of 1 * partitions are perfectly aligned * io scheduler is set to noon but deadline was showing the same results * no updatedb running About the box: * 32GB
Re: [ceph-users] vdb busy error when attaching to instance
Did you follow this ceph.com/docs/master/rbd/rbd-openstack/ to configure your env? On 12 Sep 2014, at 14:38, m.channappa.nega...@accenture.com wrote: Hello Team, I have configured ceph as a multibackend for openstack. I have created 2 pools . 1. Volumes (replication size =3 ) 2. poolb (replication size =2 ) Below is the details from /etc/cinder/cinder.conf enabled_backends=rbd-ceph,rbd-cephrep [rbd-ceph] volume_driver=cinder.volume.drivers.rbd.RBDDriver rbd_pool=volumes volume_backend_name=ceph rbd_user=volumes rbd_secret_uuid=34c88ed2-1cf6-446d-8564-f888934eec35 volumes_dir=/var/lib/cinder/volumes [rbd-cephrep] volume_driver=cinder.volume.drivers.rbd.RBDDriver rbd_pool=poolb volume_backend_name=ceph1 rbd_user=poolb rbd_secret_uuid=d62b0df6-ee26-46f0-8d90-4ef4d55caa5b volumes_dir=/var/lib/cinder/volumes1 when I am attaching a volume to a instance I am getting “DeviceIsBusy: The supplied device (vdb) is busy” error. Please let me know how to correct this.. Regards, Malleshi CN This message is for the designated recipient only and may contain privileged, proprietary, or otherwise confidential information. If you have received it in error, please notify the sender immediately and delete the original. Any other use of the e-mail by you is prohibited. Where allowed by local law, electronic communications with Accenture and its affiliates, including e-mail and instant messaging (including content), may be scanned by our systems for the purposes of information security and assessment of internal compliance with Accenture policy. __ www.accenture.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
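For reference, the step from that guide most often missed when attach fails is the libvirt secret on each compute node; a hedged sketch, reusing the client name and UUID from the config above (they must match rbd_user and rbd_secret_uuid in cinder.conf):

cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
  <uuid>34c88ed2-1cf6-446d-8564-f888934eec35</uuid>
  <usage type='ceph'>
    <name>client.volumes secret</name>
  </usage>
</secret>
EOF
virsh secret-define --file secret.xml
virsh secret-set-value --secret 34c88ed2-1cf6-446d-8564-f888934eec35 \
    --base64 $(ceph auth get-key client.volumes)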
Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
They definitely are Warren! Thanks for bringing this here :). On 05 Sep 2014, at 23:02, Wang, Warren warren_w...@cable.comcast.com wrote: +1 to what Cedric said. Anything more than a few minutes of heavy sustained writes tended to get our solid state devices into a state where garbage collection could not keep up. Originally we used small SSDs and did not overprovision the journals by much. Manufacturers publish their SSD stats, and then in very small font, state that the attained IOPS are with empty drives, and the tests are only run for very short amounts of time. Even if the drives are new, it's a good idea to perform an hdparm secure erase on them (so that the SSD knows that the blocks are truly unused), and then overprovision them. You'll know if you have a problem by watching for utilization and wait data on the journals. One of the other interesting performance issues is that the Intel 10Gbe NICs + default kernel that we typically use max out around 1million packets/sec. It's worth tracking this metric to if you are close. I know these aren't necessarily relevant to the test parameters you gave below, but they're worth keeping in mind. -- Warren Wang Comcast Cloud (OpenStack) From: Cedric Lemarchand ced...@yipikai.org Date: Wednesday, September 3, 2014 at 5:14 PM To: ceph-users@lists.ceph.com ceph-users@lists.ceph.com Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Le 03/09/2014 22:11, Sebastien Han a écrit : Hi Warren, What do mean exactly by secure erase? At the firmware level with constructor softwares? SSDs were pretty new so I don’t we hit that sort of things. I believe that only aged SSDs have this behaviour but I might be wrong. Sorry I forgot to reply to the real question ;-) So yes it only plays after some times, for your case, if the SSD still delivers write IOPS specified by the manufacturer, it will doesn't help in any ways. But it seems this practice is nowadays increasingly used. Cheers On 02 Sep 2014, at 18:23, Wang, Warren warren_w...@cable.comcast.com wrote: Hi Sebastien, Something I didn't see in the thread so far, did you secure erase the SSDs before they got used? I assume these were probably repurposed for this test. We have seen some pretty significant garbage collection issue on various SSD and other forms of solid state storage to the point where we are overprovisioning pretty much every solid state device now. By as much as 50% to handle sustained write operations. Especially important for the journals, as we've found. Maybe not an issue on the short fio run below, but certainly evident on longer runs or lots of historical data on the drives. The max transaction time looks pretty good for your test. Something to consider though. Warren -Original Message- From: ceph-users [ mailto:ceph-users-boun...@lists.ceph.com ] On Behalf Of Sebastien Han Sent: Thursday, August 28, 2014 12:12 PM To: ceph-users Cc: Mark Nelson Subject: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Hey all, It has been a while since the last thread performance related on the ML :p I've been running some experiment to see how much I can get from an SSD on a Ceph cluster. 
To achieve that I did something pretty simple: * Debian wheezy 7.6 * kernel from debian 3.14-0.bpo.2-amd64 * 1 cluster, 3 mons (i'd like to keep this realistic since in a real deployment i'll use 3) * 1 OSD backed by an SSD (journal and osd data on the same device) * 1 replica count of 1 * partitions are perfectly aligned * io scheduler is set to noon but deadline was showing the same results * no updatedb running About the box: * 32GB of RAM * 12 cores with HT @ 2,4 GHz * WB cache is enabled on the controller * 10Gbps network (doesn't help here) The SSD is a 200G Intel DC S3700 and is capable of delivering around 29K iops with random 4k writes (my fio results) As a benchmark tool I used fio with the rbd engine (thanks deutsche telekom guys!). O_DIECT and D_SYNC don't seem to be a problem for the SSD: # dd if=/dev/urandom of=rand.file bs=4k count=65536 65536+0 records in 65536+0 records out 268435456 bytes (268 MB) copied, 29.5477 s, 9.1 MB/s # du -sh rand.file 256Mrand.file # dd if=rand.file of=/dev/sdo bs=4k count=65536 oflag=dsync,direct 65536+0 records in 65536+0 records out 268435456 bytes (268 MB) copied, 2.73628 s, 98.1 MB/s See my ceph.conf: [global] auth cluster required = cephx auth service required = cephx auth client required = cephx fsid = 857b8609-8c9b-499e-9161-2ea67ba51c97 osd pool default pg num = 4096 osd pool default pgp num = 4096 osd pool default size = 2 osd crush chooseleaf type = 0 debug lockdep = 0/0 debug context = 0/0 debug crush = 0/0 debug buffer = 0/0 debug timer = 0/0 debug journaler = 0/0
Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
Hey, Well I ran an fio job that simulates the (more or less) what ceph is doing (journal writes with dsync and o_direct) and the ssd gave me 29K IOPS too. I could do this, but for me it definitely looks like a major waste since we don’t even get a third of the ssd performance. On 02 Sep 2014, at 09:38, Alexandre DERUMIER aderum...@odiso.com wrote: Hi Sebastien, I got 6340 IOPS on a single OSD SSD. (journal and data on the same partition). Shouldn't it better to have 2 partitions, 1 for journal and 1 for datas ? (I'm thinking about filesystem write syncs) - Mail original - De: Sebastien Han sebastien@enovance.com À: Somnath Roy somnath@sandisk.com Cc: ceph-users@lists.ceph.com Envoyé: Mardi 2 Septembre 2014 02:19:16 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Mark and all, Ceph IOPS performance has definitely improved with Giant. With this version: ceph version 0.84-940-g3215c52 (3215c520e1306f50d0094b5646636c02456c9df4) on Debian 7.6 with Kernel 3.14-0. I got 6340 IOPS on a single OSD SSD. (journal and data on the same partition). So basically twice the amount of IOPS that I was getting with Firefly. Rand reads 4k went from 12431 to 10201, so I’m a bit disappointed here. The SSD is still under-utilised: Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdp1 0.00 540.37 0.00 5902.30 0.00 47.14 16.36 0.87 0.15 0.00 0.15 0.07 40.15 sdp2 0.00 0.00 0.00 4454.67 0.00 49.16 22.60 0.31 0.07 0.00 0.07 0.07 30.61 Thanks a ton for all your comments and assistance guys :). One last question for Sage (or other that might know), what’s the status of the S2FS implementation? (or maybe we are waiting for S2FS to provide atomic transactions?) I tried to run the OSD on f2fs however ceph-osd mkfs got stuck on a xattr test: fremovexattr(10, user.test@5848273) = 0 On 01 Sep 2014, at 11:13, Sebastien Han sebastien@enovance.com wrote: Mark, thanks a lot for experimenting this for me. I’m gonna try master soon and will tell you how much I can get. It’s interesting to see that using 2 SSDs brings up more performance, even both SSDs are under-utilized… They should be able to sustain both loads at the same time (journal and osd data). On 01 Sep 2014, at 09:51, Somnath Roy somnath@sandisk.com wrote: As I said, 107K with IOs serving from memory, not hitting the disk.. From: Jian Zhang [mailto:amberzhan...@gmail.com] Sent: Sunday, August 31, 2014 8:54 PM To: Somnath Roy Cc: Haomai Wang; ceph-users@lists.ceph.com Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Somnath, on the small workload performance, 107k is higher than the theoretical IOPS of 520, any idea why? Single client is ~14K iops, but scaling as number of clients increases. 10 clients ~107K iops. ~25 cpu cores are used. 2014-09-01 11:52 GMT+08:00 Jian Zhang amberzhan...@gmail.com: Somnath, on the small workload performance, 2014-08-29 14:37 GMT+08:00 Somnath Roy somnath@sandisk.com: Thanks Haomai ! Here is some of the data from my setup. -- Set up: 32 core cpu with HT enabled, 128 GB RAM, one SSD (both journal and data) - one OSD. 5 client m/c with 12 core cpu and each running two instances of ceph_smalliobench (10 clients total). Network is 10GbE. Workload: - Small workload – 20K objects with 4K size and io_size is also 4K RR. The intent is to serve the ios from memory so that it can uncover the performance problems within single OSD. 
Results from Firefly: -- Single client throughput is ~14K iops, but as the number of client increases the aggregated throughput is not increasing. 10 clients ~15K iops. ~9-10 cpu cores are used. Result with latest master: -- Single client is ~14K iops, but scaling as number of clients increases. 10 clients ~107K iops. ~25 cpu cores are used. -- More realistic workload: - Let’s see how it is performing while 90% of the ios are served from disks Setup: --- 40 cpu core server as a cluster node (single node cluster) with 64 GB RAM. 8 SSDs - 8 OSDs. One similar node for monitor and rgw. Another node for client running fio/vdbench. 4 rbds are configured with ‘noshare’ option. 40 GbE network Workload: 8 SSDs are populated
Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
@Dan, hop my bad I forgot to use these settings, I’ll try again and see how much I can get on the read performance side. @Mark, thanks again and yes I believe that due to some hardware variance we have difference results, I won’t say that the deviance is decent but results are close enough to say that we experience the same limitations (ceph level). @Cédric, yes I did and what fio was showing was consistent with the iostat output, same goes for disk utilisation. On 02 Sep 2014, at 12:44, Cédric Lemarchand c.lemarch...@yipikai.org wrote: Hi Sebastian, Le 2 sept. 2014 à 10:41, Sebastien Han sebastien@enovance.com a écrit : Hey, Well I ran an fio job that simulates the (more or less) what ceph is doing (journal writes with dsync and o_direct) and the ssd gave me 29K IOPS too. I could do this, but for me it definitely looks like a major waste since we don’t even get a third of the ssd performance. Did you had a look if the raw ssd IOPS (using iostat -x for example) show same results during fio bench ? Cheers On 02 Sep 2014, at 09:38, Alexandre DERUMIER aderum...@odiso.com wrote: Hi Sebastien, I got 6340 IOPS on a single OSD SSD. (journal and data on the same partition). Shouldn't it better to have 2 partitions, 1 for journal and 1 for datas ? (I'm thinking about filesystem write syncs) - Mail original - De: Sebastien Han sebastien@enovance.com À: Somnath Roy somnath@sandisk.com Cc: ceph-users@lists.ceph.com Envoyé: Mardi 2 Septembre 2014 02:19:16 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Mark and all, Ceph IOPS performance has definitely improved with Giant. With this version: ceph version 0.84-940-g3215c52 (3215c520e1306f50d0094b5646636c02456c9df4) on Debian 7.6 with Kernel 3.14-0. I got 6340 IOPS on a single OSD SSD. (journal and data on the same partition). So basically twice the amount of IOPS that I was getting with Firefly. Rand reads 4k went from 12431 to 10201, so I’m a bit disappointed here. The SSD is still under-utilised: Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdp1 0.00 540.37 0.00 5902.30 0.00 47.14 16.36 0.87 0.15 0.00 0.15 0.07 40.15 sdp2 0.00 0.00 0.00 4454.67 0.00 49.16 22.60 0.31 0.07 0.00 0.07 0.07 30.61 Thanks a ton for all your comments and assistance guys :). One last question for Sage (or other that might know), what’s the status of the S2FS implementation? (or maybe we are waiting for S2FS to provide atomic transactions?) I tried to run the OSD on f2fs however ceph-osd mkfs got stuck on a xattr test: fremovexattr(10, user.test@5848273) = 0 On 01 Sep 2014, at 11:13, Sebastien Han sebastien@enovance.com wrote: Mark, thanks a lot for experimenting this for me. I’m gonna try master soon and will tell you how much I can get. It’s interesting to see that using 2 SSDs brings up more performance, even both SSDs are under-utilized… They should be able to sustain both loads at the same time (journal and osd data). On 01 Sep 2014, at 09:51, Somnath Roy somnath@sandisk.com wrote: As I said, 107K with IOs serving from memory, not hitting the disk.. From: Jian Zhang [mailto:amberzhan...@gmail.com] Sent: Sunday, August 31, 2014 8:54 PM To: Somnath Roy Cc: Haomai Wang; ceph-users@lists.ceph.com Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Somnath, on the small workload performance, 107k is higher than the theoretical IOPS of 520, any idea why? Single client is ~14K iops, but scaling as number of clients increases. 10 clients ~107K iops. 
~25 cpu cores are used. 2014-09-01 11:52 GMT+08:00 Jian Zhang amberzhan...@gmail.com: Somnath, on the small workload performance, 2014-08-29 14:37 GMT+08:00 Somnath Roy somnath@sandisk.com: Thanks Haomai ! Here is some of the data from my setup. -- Set up: 32 core cpu with HT enabled, 128 GB RAM, one SSD (both journal and data) - one OSD. 5 client m/c with 12 core cpu and each running two instances of ceph_smalliobench (10 clients total). Network is 10GbE. Workload: - Small workload – 20K objects with 4K size and io_size is also 4K RR. The intent is to serve the ios from memory so that it can uncover the performance problems within single OSD. Results from Firefly: -- Single client throughput is ~14K iops, but as the number of client increases the aggregated throughput is not increasing. 10 clients ~15K iops. ~9-10 cpu cores are used. Result with latest master: -- Single client is ~14K iops
Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
It would nice if you could post the results :) Yup gitbuilder is available on debian 7.6 wheezy. On 02 Sep 2014, at 17:55, Alexandre DERUMIER aderum...@odiso.com wrote: I'm going to install next week a small 3 nodes test ssd cluster, I have some intel s3500 and crucial m550. I'll try to bench them with firefly and master. Is a debian wheezy gitbuilder repository available ? (I'm a bit lazy to compile all packages) - Mail original - De: Sebastien Han sebastien@enovance.com À: Alexandre DERUMIER aderum...@odiso.com Cc: ceph-users@lists.ceph.com, Cédric Lemarchand c.lemarch...@yipikai.org Envoyé: Mardi 2 Septembre 2014 15:25:05 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Well the last time I ran two processes in parallel I got half the total amount available so 1,7k per client. On 02 Sep 2014, at 15:19, Alexandre DERUMIER aderum...@odiso.com wrote: Do you have same results, if you launch 2 fio benchs in parallel on 2 differents rbd volumes ? - Mail original - De: Sebastien Han sebastien@enovance.com À: Cédric Lemarchand c.lemarch...@yipikai.org Cc: Alexandre DERUMIER aderum...@odiso.com, ceph-users@lists.ceph.com Envoyé: Mardi 2 Septembre 2014 13:59:13 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS @Dan, hop my bad I forgot to use these settings, I’ll try again and see how much I can get on the read performance side. @Mark, thanks again and yes I believe that due to some hardware variance we have difference results, I won’t say that the deviance is decent but results are close enough to say that we experience the same limitations (ceph level). @Cédric, yes I did and what fio was showing was consistent with the iostat output, same goes for disk utilisation. On 02 Sep 2014, at 12:44, Cédric Lemarchand c.lemarch...@yipikai.org wrote: Hi Sebastian, Le 2 sept. 2014 à 10:41, Sebastien Han sebastien@enovance.com a écrit : Hey, Well I ran an fio job that simulates the (more or less) what ceph is doing (journal writes with dsync and o_direct) and the ssd gave me 29K IOPS too. I could do this, but for me it definitely looks like a major waste since we don’t even get a third of the ssd performance. Did you had a look if the raw ssd IOPS (using iostat -x for example) show same results during fio bench ? Cheers On 02 Sep 2014, at 09:38, Alexandre DERUMIER aderum...@odiso.com wrote: Hi Sebastien, I got 6340 IOPS on a single OSD SSD. (journal and data on the same partition). Shouldn't it better to have 2 partitions, 1 for journal and 1 for datas ? (I'm thinking about filesystem write syncs) - Mail original - De: Sebastien Han sebastien@enovance.com À: Somnath Roy somnath@sandisk.com Cc: ceph-users@lists.ceph.com Envoyé: Mardi 2 Septembre 2014 02:19:16 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Mark and all, Ceph IOPS performance has definitely improved with Giant. With this version: ceph version 0.84-940-g3215c52 (3215c520e1306f50d0094b5646636c02456c9df4) on Debian 7.6 with Kernel 3.14-0. I got 6340 IOPS on a single OSD SSD. (journal and data on the same partition). So basically twice the amount of IOPS that I was getting with Firefly. Rand reads 4k went from 12431 to 10201, so I’m a bit disappointed here. 
The SSD is still under-utilised: Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdp1 0.00 540.37 0.00 5902.30 0.00 47.14 16.36 0.87 0.15 0.00 0.15 0.07 40.15 sdp2 0.00 0.00 0.00 4454.67 0.00 49.16 22.60 0.31 0.07 0.00 0.07 0.07 30.61 Thanks a ton for all your comments and assistance guys :). One last question for Sage (or other that might know), what’s the status of the S2FS implementation? (or maybe we are waiting for S2FS to provide atomic transactions?) I tried to run the OSD on f2fs however ceph-osd mkfs got stuck on a xattr test: fremovexattr(10, user.test@5848273) = 0 On 01 Sep 2014, at 11:13, Sebastien Han sebastien@enovance.com wrote: Mark, thanks a lot for experimenting this for me. I’m gonna try master soon and will tell you how much I can get. It’s interesting to see that using 2 SSDs brings up more performance, even both SSDs are under-utilized… They should be able to sustain both loads at the same time (journal and osd data). On 01 Sep 2014, at 09:51, Somnath Roy somnath@sandisk.com wrote: As I said, 107K with IOs serving from memory, not hitting the disk.. From: Jian Zhang [mailto:amberzhan...@gmail.com] Sent: Sunday, August 31, 2014 8:54 PM To: Somnath Roy Cc: Haomai Wang; ceph-users@lists.ceph.com Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Somnath, on the small workload performance, 107k is higher than the theoretical IOPS of 520, any idea why
Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
Mark and all, Ceph IOPS performance has definitely improved with Giant. With this version: ceph version 0.84-940-g3215c52 (3215c520e1306f50d0094b5646636c02456c9df4) on Debian 7.6 with Kernel 3.14-0. I got 6340 IOPS on a single OSD SSD. (journal and data on the same partition). So basically twice the amount of IOPS that I was getting with Firefly. Rand reads 4k went from 12431 to 10201, so I’m a bit disappointed here. The SSD is still under-utilised: Device: rrqm/s wrqm/s r/s w/srMB/swMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdp1 0.00 540.370.00 5902.30 0.0047.1416.36 0.870.150.000.15 0.07 40.15 sdp2 0.00 0.000.00 4454.67 0.0049.1622.60 0.310.070.000.07 0.07 30.61 Thanks a ton for all your comments and assistance guys :). One last question for Sage (or other that might know), what’s the status of the S2FS implementation? (or maybe we are waiting for S2FS to provide atomic transactions?) I tried to run the OSD on f2fs however ceph-osd mkfs got stuck on a xattr test: fremovexattr(10, user.test@5848273) = 0 On 01 Sep 2014, at 11:13, Sebastien Han sebastien@enovance.com wrote: Mark, thanks a lot for experimenting this for me. I’m gonna try master soon and will tell you how much I can get. It’s interesting to see that using 2 SSDs brings up more performance, even both SSDs are under-utilized… They should be able to sustain both loads at the same time (journal and osd data). On 01 Sep 2014, at 09:51, Somnath Roy somnath@sandisk.com wrote: As I said, 107K with IOs serving from memory, not hitting the disk.. From: Jian Zhang [mailto:amberzhan...@gmail.com] Sent: Sunday, August 31, 2014 8:54 PM To: Somnath Roy Cc: Haomai Wang; ceph-users@lists.ceph.com Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Somnath, on the small workload performance, 107k is higher than the theoretical IOPS of 520, any idea why? Single client is ~14K iops, but scaling as number of clients increases. 10 clients ~107K iops. ~25 cpu cores are used. 2014-09-01 11:52 GMT+08:00 Jian Zhang amberzhan...@gmail.com: Somnath, on the small workload performance, 2014-08-29 14:37 GMT+08:00 Somnath Roy somnath@sandisk.com: Thanks Haomai ! Here is some of the data from my setup. -- Set up: 32 core cpu with HT enabled, 128 GB RAM, one SSD (both journal and data) - one OSD. 5 client m/c with 12 core cpu and each running two instances of ceph_smalliobench (10 clients total). Network is 10GbE. Workload: - Small workload – 20K objects with 4K size and io_size is also 4K RR. The intent is to serve the ios from memory so that it can uncover the performance problems within single OSD. Results from Firefly: -- Single client throughput is ~14K iops, but as the number of client increases the aggregated throughput is not increasing. 10 clients ~15K iops. ~9-10 cpu cores are used. Result with latest master: -- Single client is ~14K iops, but scaling as number of clients increases. 10 clients ~107K iops. ~25 cpu cores are used. -- More realistic workload: - Let’s see how it is performing while 90% of the ios are served from disks Setup: --- 40 cpu core server as a cluster node (single node cluster) with 64 GB RAM. 8 SSDs - 8 OSDs. One similar node for monitor and rgw. Another node for client running fio/vdbench. 4 rbds are configured with ‘noshare’ option. 40 GbE network Workload: 8 SSDs are populated , so, 8 * 800GB = ~6.4 TB of data. Io_size = 4K RR. 
Results from Firefly: Aggregated output while 4 rbd clients stressing the cluster in parallel is ~20-25K IOPS , cpu cores used ~8-10 cores (may be less can’t remember precisely) Results from latest master: Aggregated output while 4 rbd clients stressing the cluster in parallel is ~120K IOPS , cpu is 7% idle i.e ~37-38 cpu cores. Hope this helps. Thanks Regards Somnath -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Thursday, August 28, 2014 8:01 PM To: Somnath Roy Cc: Andrey Korolyov; ceph-users@lists.ceph.com Subject: Re: [ceph-users
Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
Thanks a lot for the answers, even if we drifted from the main subject a little bit. Thanks Somnath for sharing this, when can we expect any codes that might improve _write_ performance? @Mark thanks trying this :) Unfortunately using nobarrier and another dedicated SSD for the journal (plus your ceph setting) didn’t bring much, now I can reach 3,5K IOPS. By any chance, would it be possible for you to test with a single OSD SSD? On 28 Aug 2014, at 18:11, Sebastien Han sebastien@enovance.com wrote: Hey all, It has been a while since the last thread performance related on the ML :p I’ve been running some experiment to see how much I can get from an SSD on a Ceph cluster. To achieve that I did something pretty simple: * Debian wheezy 7.6 * kernel from debian 3.14-0.bpo.2-amd64 * 1 cluster, 3 mons (i’d like to keep this realistic since in a real deployment i’ll use 3) * 1 OSD backed by an SSD (journal and osd data on the same device) * 1 replica count of 1 * partitions are perfectly aligned * io scheduler is set to noon but deadline was showing the same results * no updatedb running About the box: * 32GB of RAM * 12 cores with HT @ 2,4 GHz * WB cache is enabled on the controller * 10Gbps network (doesn’t help here) The SSD is a 200G Intel DC S3700 and is capable of delivering around 29K iops with random 4k writes (my fio results) As a benchmark tool I used fio with the rbd engine (thanks deutsche telekom guys!). O_DIECT and D_SYNC don’t seem to be a problem for the SSD: # dd if=/dev/urandom of=rand.file bs=4k count=65536 65536+0 records in 65536+0 records out 268435456 bytes (268 MB) copied, 29.5477 s, 9.1 MB/s # du -sh rand.file 256Mrand.file # dd if=rand.file of=/dev/sdo bs=4k count=65536 oflag=dsync,direct 65536+0 records in 65536+0 records out 268435456 bytes (268 MB) copied, 2.73628 s, 98.1 MB/s See my ceph.conf: [global] auth cluster required = cephx auth service required = cephx auth client required = cephx fsid = 857b8609-8c9b-499e-9161-2ea67ba51c97 osd pool default pg num = 4096 osd pool default pgp num = 4096 osd pool default size = 2 osd crush chooseleaf type = 0 debug lockdep = 0/0 debug context = 0/0 debug crush = 0/0 debug buffer = 0/0 debug timer = 0/0 debug journaler = 0/0 debug osd = 0/0 debug optracker = 0/0 debug objclass = 0/0 debug filestore = 0/0 debug journal = 0/0 debug ms = 0/0 debug monc = 0/0 debug tp = 0/0 debug auth = 0/0 debug finisher = 0/0 debug heartbeatmap = 0/0 debug perfcounter = 0/0 debug asok = 0/0 debug throttle = 0/0 [mon] mon osd down out interval = 600 mon osd min down reporters = 13 [mon.ceph-01] host = ceph-01 mon addr = 172.20.20.171 [mon.ceph-02] host = ceph-02 mon addr = 172.20.20.172 [mon.ceph-03] host = ceph-03 mon addr = 172.20.20.173 debug lockdep = 0/0 debug context = 0/0 debug crush = 0/0 debug buffer = 0/0 debug timer = 0/0 debug journaler = 0/0 debug osd = 0/0 debug optracker = 0/0 debug objclass = 0/0 debug filestore = 0/0 debug journal = 0/0 debug ms = 0/0 debug monc = 0/0 debug tp = 0/0 debug auth = 0/0 debug finisher = 0/0 debug heartbeatmap = 0/0 debug perfcounter = 0/0 debug asok = 0/0 debug throttle = 0/0 [osd] osd mkfs type = xfs osd mkfs options xfs = -f -i size=2048 osd mount options xfs = rw,noatime,logbsize=256k,delaylog osd journal size = 20480 cluster_network = 172.20.20.0/24 public_network = 172.20.20.0/24 osd mon heartbeat interval = 30 # Performance tuning filestore merge threshold = 40 filestore split multiple = 8 osd op threads = 8 # Recovery tuning osd recovery max active = 1 osd max backfills = 1 osd recovery op 
priority = 1 debug lockdep = 0/0 debug context = 0/0 debug crush = 0/0 debug buffer = 0/0 debug timer = 0/0 debug journaler = 0/0 debug osd = 0/0 debug optracker = 0/0 debug objclass = 0/0 debug filestore = 0/0 debug journal = 0/0 debug ms = 0/0 debug monc = 0/0 debug tp = 0/0 debug auth = 0/0 debug finisher = 0/0 debug heartbeatmap = 0/0 debug perfcounter = 0/0 debug asok = 0/0 debug throttle = 0/0 Disabling all debugging made me win 200/300 more IOPS. See my fio template: [global] #logging #write_iops_log=write_iops_log #write_bw_log=write_bw_log #write_lat_log=write_lat_lo time_based runtime=60 ioengine=rbd clientname=admin pool=test rbdname=fio invalidate=0# mandatory #rw=randwrite rw=write bs=4k #bs=32m size=5G group_reporting [rbd_iodepth32
Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
@Dan: thanks for sharing your config, with all your flags I don’t seem to get more than 3.4K IOPS and they even seem to slow me down :( This is really weird. Yes, I already tried to run two simultaneous processes and got only half of the 3.4K for each of them. @Kasper: thanks for these results, I believe some improvement could be made in the code as well :). FYI I just tried on Ubuntu 12.04 and it looks a bit better because I’m getting iops=3783. On 29 Aug 2014, at 13:10, Dan Van Der Ster daniel.vanders...@cern.ch wrote: vm.dirty_expire_centisecs Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Moving Journal to SSD
Hi Dane, If you deployed with ceph-deploy, you will see that the journal is just a symlink. Take a look at /var/lib/ceph/osd/osd-id/journal. The link should point to the journal partition of your hard drive (partition 2 in your case), so no filesystem for the journal, just a block device. Roughly you should try: create N partitions on your SSD for your N OSDs ceph osd set noout sudo service ceph stop osd.$ID ceph-osd -i $ID --flush-journal rm -f /var/lib/ceph/osd/osd-id/journal ln -s /dev/ssd-partition-for-your-journal /var/lib/ceph/osd/osd-id/journal ceph-osd -i $ID --mkjournal sudo service ceph start osd.$ID ceph osd unset noout This should work. Cheers. On 11 Aug 2014, at 18:36, Dane Elwell dane.elw...@gmail.com wrote: Hi list, Our current setup has OSDs with their journal sharing the same disk as the data, and we've reached the point we're outgrowing this setup. We're currently vacating disks in order to replace them with SSDs and recreate the OSD journals on the SSDs in a 5:1 ratio of spinners to SSDs. I've read in a few places that it's possible to move the OSD journals without losing data on the OSDs, which is great, however none of the stuff I've read seems to cover our case. We installed Ceph using ceph-deploy, putting the journals on the same disks. ceph-deploy doesn't populate a ceph.conf file fully, so we don't have e.g. individual OSD entries in there. If I'm understanding this correctly, the Ceph disks are automounted by udev rules from /lib/udev/rules.d/95-ceph-osd.rules, and this mounts the OSD disk (partition 1) then mounts the journal under /journal (partition 2 of the same disk). That's all well and good, but as I now want to move the journal, how do I go about telling Ceph where the new journals are located so they can be mounted in the right location? Do I need to populate ceph.conf with individual entries for all OSDs or is there a way I can make udev do all the heavy lifting? Regards Dane ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com Cheers. Sébastien Han Cloud Architect Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
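Before the journal move described earlier in this message, the SSD needs its N journal partitions; a hedged sketch using sgdisk (device name, sizes and partition count are illustrative, and the typecode is the GPT GUID the Ceph udev rules treat as a journal, so double-check it against your ceph-disk version):

for i in 1 2 3 4 5; do
  sgdisk --new=${i}:0:+20G --change-name=${i}:'ceph journal' \
         --typecode=${i}:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdX
done
partprobe /dev/sdX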
Re: [ceph-users] qemu image create failed
Can you connect to your Ceph cluster? You can pass options to the command line like this: $ qemu-img create -f rbd rbd:instances/vmdisk01:id=leseb:conf=/etc/ceph/ceph-leseb.conf 2G Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 12 Jul 2014, at 03:06, Yonghua Peng sys...@mail2000.us wrote: Does anybody know about this issue? Thanks. Fri, 11 Jul 2014 10:26:47 +0800 from Yonghua Peng sys...@mail2000.us: Hi, I tried to create a qemu image, but it failed. ceph@ceph:~/my-cluster$ qemu-img create -f rbd rbd:rbd/qemu 2G Formatting 'rbd:rbd/qemu', fmt=rbd size=2147483648 cluster_size=0 qemu-img: error connecting qemu-img: rbd:rbd/qemu: error while creating rbd: Input/output error Can you tell what the problem is? Thanks. -- We are hiring cloud Dev/Ops, more details please see: YY Cloud Jobs ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
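A few hedged checks for the 'error connecting' case above, assuming the admin keyring is readable by the user running qemu-img; names and paths are illustrative:

# Confirm the same id/conf can reach the monitors at all
ceph -s --id admin --conf /etc/ceph/ceph.conf
rbd ls -p rbd --id admin
# Then retry with the id and conf spelled out in the rbd URI
qemu-img create -f rbd rbd:rbd/qemu:id=admin:conf=/etc/ceph/ceph.conf 2G
qemu-img info rbd:rbd/qemu:id=admin:conf=/etc/ceph/ceph.conf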
[ceph-users] Is it still unsafe to map a RBD device on an OSD server?
Hi all, A couple of years ago, I heard that it wasn’t safe to map a krbd block device on an OSD host. It was more or less like mounting an NFS export on the NFS server itself: we can potentially end up with some deadlocks. Still, I tried again recently and didn’t encounter any problem. What do you think? Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] question about feature set mismatch
FYI I encountered the same problem for krbd, removing the ec pool didn’t solve my problem. I’m running 3.13 Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 08 Jun 2014, at 10:19, Ilya Dryomov ilya.dryo...@inktank.com wrote: On Sun, Jun 8, 2014 at 11:27 AM, Igor Krstic puh.dobri...@hotmail.com wrote: On Fri, 2014-06-06 at 17:40 +0400, Ilya Dryomov wrote: On Fri, Jun 6, 2014 at 4:34 PM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: - Message from Igor Krstic igor.z.krs...@gmail.com - Date: Fri, 06 Jun 2014 13:23:19 +0200 From: Igor Krstic igor.z.krs...@gmail.com Subject: Re: [ceph-users] question about feature set mismatch To: Ilya Dryomov ilya.dryo...@inktank.com Cc: ceph-users@lists.ceph.com On Fri, 2014-06-06 at 11:51 +0400, Ilya Dryomov wrote: On Thu, Jun 5, 2014 at 10:38 PM, Igor Krstic igor.z.krs...@gmail.com wrote: Hello, dmesg: [ 690.181780] libceph: mon1 192.168.214.102:6789 feature set mismatch, my 4a042a42 server's 504a042a42, missing 50 [ 690.181907] libceph: mon1 192.168.214.102:6789 socket error on read [ 700.190342] libceph: mon0 192.168.214.101:6789 feature set mismatch, my 4a042a42 server's 504a042a42, missing 50 [ 700.190481] libceph: mon0 192.168.214.101:6789 socket error on read [ 710.194499] libceph: mon1 192.168.214.102:6789 feature set mismatch, my 4a042a42 server's 504a042a42, missing 50 [ 710.194633] libceph: mon1 192.168.214.102:6789 socket error on read [ 720.201226] libceph: mon1 192.168.214.102:6789 feature set mismatch, my 4a042a42 server's 504a042a42, missing 50 [ 720.201482] libceph: mon1 192.168.214.102:6789 socket error on read 50 should be: CEPH_FEATURE_CRUSH_V2 36 10 and CEPH_FEATURE_OSD_ERASURE_CODES 38 40 CEPH_FEATURE_OSD_TMAP2OMAP 38* 40 That is happening on two separate boxes that are just my nfs and block gateways (they are not osd/mon/mds). So I just need on them something like: sudo rbd map share2 sudo mount -t xfs /dev/rbd1 /mnt/share2 On ceph cluster and on those two separate boxes: ~$ ceph -v ceph version 0.80.1 What could be the problem? Which kernel version are you running? Do you have any erasure coded pools? Thanks, Ilya ~$ uname -a Linux ceph-gw1 3.13.0-24-generic #47~precise2-Ubuntu SMP Fri May 2 23:30:46 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Yes, one of the pools is erasure coded pool but the only thing I use on that box is rbd and rbd pool is not ec pool. It is replicated pool. I to not touch ec pool from there. Or at least, I believe so :) Well, I saw something similar with CephFS: I didn't touch the pools in use by cephfs, but I created another pool with Erasure Code, and the ceph client (kernel 3.13, but not enough for EC) also stopped working with 'feature set mismatch'. Thus I guess the clients can't read the crushmap anymore when there is a 'erasure' mentioning in it:) Unfortunately that's true. If there are any erasure code pools in the cluster, kernel clients (both krbd and kcephfs) won't work. The only way it will work is if you remove all erasure coded pools. CRUSH_V2 is also not present in 3.13. You'll have to uprade to 3.14. Alternatively, CRUSH_V2 can be disabled, but I can't tell you how off the top of my head. The fundamental problem is that you are running latest userspace, and the defaults it ships with are incompatible with older kernels. Thanks, Ilya Thanks. Upgrade to 3.14 solved CRUSH_V2. Regarding krbd and kcephfs... 
If that is the case, that is something that should be addressed in the documentation on ceph.com more clearly. There is info that CRUSH_TUNABLES3 (chooseleaf_vary_r) requires Linux kernel version v3.15 or later (for the file system and RBD kernel clients) but nothing else. I think CRUSH_V2 is also mentioned somewhere, most probably in the release notes, but you are right, it should be centralized and easy to find. Only now that I have your information was I able to find https://lkml.org/lkml/2014/4/7/257 Anyway... What I want to test is an SSD pool as a cache pool in front of an EC pool. Is there some way to update krbd manually (from github?) or do I need to wait for 3.15 for this? The "if there are any erasure code pools in the cluster, kernel clients (both krbd and kcephfs) won't work" problem is getting fixed on the server side. The next ceph release will have the fix and you will be able to use a 3.14 kernel with clusters that have EC pools. Thanks, Ilya ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
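For readers hitting the same mismatch: when the requirement comes from the tunables profile (rather than from an EC rule, which needs the pools removed or the server-side fix mentioned above), a hedged sketch of relaxing the tunables so older kernel clients can connect again; note that switching profiles triggers data movement:

# inspect what the cluster currently advertises
ceph osd crush show-tunables

# fall back to an older profile that pre-3.14 kernels understand (illustrative choice)
ceph osd crush tunables bobtail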
Re: [ceph-users] Is it still unsafe to map a RBD device on an OSD server?
Thanks for your answers :) Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 10 Jun 2014, at 20:49, John Wilkins john.wilk...@inktank.com wrote: Sebastian, It's actually not an issue with Ceph, but with the Linux kernel itself. If you want to do this and avoid a deadlock, just use a VM on the same host to mount the block device. Regards, John On Tue, Jun 10, 2014 at 9:51 AM, Jean-Charles LOPEZ jeanchlo...@mac.com wrote: Hi Sébastien, still the case. Depending on what you do, the OSD process will get to a hang and will suicide. Regards JC On Jun 10, 2014, at 09:46, Sebastien Han sebastien@enovance.com wrote: Hi all, A couple of years ago, I heard that it wasn’t safe to map a krbd block on an OSD host. It was more or less like mounting a NFS mount on the NFS server, we can potentially end up with some deadlocks. At least, I tried again recently and didn’t encounter any problem. What do you think? Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood. Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- John Wilkins Senior Technical Writer Intank john.wilk...@inktank.com (415) 425-9599 http://inktank.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Storage Multi Tenancy
Jeroen, Actually this is more a question for the OpenStack ML. All the use cases you described are not possible at the moment. The only thing you can get is shared ressources across all the tenants, you can’t really pin any ressource to a specific tenant. This could done I guess, but not available yet. Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 15 May 2014, at 10:20, Jeroen van Leur jvl...@home.nl wrote: Hello, Currently I am integrating my ceph cluster into Openstack by using Ceph’s RBD. I’d like to store my KVM virtual machines on pools that I have made on the ceph cluster. I would like to achieve to have multiple storage solutions for multiple tenants. Currently when I launch an instance the instance will be set on the Ceph pool that has been defined in the cinder.conf file of my Openstack controller node. If you set up an multi storage backend for cinder then the scheduler will determine which storage backend will be used without looking at the tenant. What I would like to happen is that the instance/VM that’s being launched by a specific tenant should have two choices; either choose for a shared Ceph Pool or have their own pool. Another option might even be a tenant having his own ceph cluster. When the instance is being launched on either shared pool, dedicated pool or even another cluster, I would also like the extra volumes that are being created to have the same option. Data needs to be isolated from another tenants and users and therefore choosing other pools/clusters would be nice. Is this goal achievable or is it impossible. If it’s achievable could I please have some assistance in doing so. Has anyone ever done this before. I would like thank you in advance for reading this lengthy e-mail. If there’s anything that is unclear, please feel free to ask. Best Regards, Jeroen van Leur — Infitialis Jeroen van Leur Sent with Airmail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image
Glad to hear that it works now :) Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 15 May 2014, at 09:02, Maciej Gałkiewicz mac...@shellycloud.com wrote: On 15 May 2014 04:05, Maciej Gałkiewicz mac...@shellycloud.com wrote: On 28 April 2014 16:11, Sebastien Han sebastien@enovance.com wrote: Yes yes, just restart cinder-api and cinder-volume. It worked for me. In my case the image is still downloaded:( Option show_image_direct_url = True was missing in my glance config. -- Maciej Gałkiewicz Shelly Cloud Sp. z o. o., Co-founder, Sysadmin http://shellycloud.com/, mac...@shellycloud.com KRS: 440358 REGON: 101504426 signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
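For reference, the missing option belongs in the Glance API configuration; a minimal sketch (file location and service name can differ per distro):

# /etc/glance/glance-api.conf
[DEFAULT]
show_image_direct_url = True

# then restart the Glance API, e.g.
service glance-api restart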
Re: [ceph-users] Help -Ceph deployment in Single node Like Devstack
http://www.sebastien-han.fr/blog/2014/05/01/vagrant-up-install-ceph-in-one-command/ Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 08 May 2014, at 06:21, Neil Levine neil.lev...@inktank.com wrote: Loic's micro-osd.sh script is as close to single push button as it gets: http://dachary.org/?p=2374 Not exactly a production cluster but it at least allows you to start experimenting on the CLI. Neil On Wed, May 7, 2014 at 7:56 PM, Patrick McGarry patr...@inktank.com wrote: Hey, Sorry for the delay, I have been traveling in Asia. This question should probably go to the ceph-user list (cc'd). Right now there is no single push-button deployment for Ceph like devstack (that I'm aware of)...but we have sever options in terms of orchestration and deployment (including out own ceph-deploy featured in the doc). A good place to see the package options is http://ceph.com/get Sorry I couldn't give you an exact answer, but I think Ceph is pretty approachable in terms of deployment for experimentation. Hope that helps. Best Regards, Patrick McGarry Director, Community || Inktank http://ceph.com || http://inktank.com @scuttlemonkey || @ceph || @inktank On Wed, Apr 30, 2014 at 2:05 AM, Pandiyan M maestropa...@gmail.com wrote: Hi, I am looking for Ceph simple instalation like devstack ( For opennstack by one package contains all), it should supports for ceph, puppet and run its function as whole ceph does? help me out Thanks in Advance !! -- PANDIYAN MUTHURAMAN Mobile : + 91 9600-963-436 (Personal) +91 7259-031-872 (Official) ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image
FYI It’s fixed here: https://review.openstack.org/#/c/90644/1 Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 25 Apr 2014, at 18:16, Sebastien Han sebastien@enovance.com wrote: I just tried, I have the same problem, it looks like a regression… It’s weird because the code didn’t change that much during the Icehouse cycle. I just reported the bug here: https://bugs.launchpad.net/cinder/+bug/1312819 Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 25 Apr 2014, at 16:37, Sebastien Han sebastien@enovance.com wrote: g ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image
Yes yes, just restart cinder-api and cinder-volume. It worked for me. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 28 Apr 2014, at 16:10, Maciej Gałkiewicz mac...@shellycloud.com wrote: On 28 April 2014 15:58, Sebastien Han sebastien@enovance.com wrote: FYI It’s fixed here: https://review.openstack.org/#/c/90644/1 I already have this patch and it didn't help. Have it fixed the problem in your cluster? -- Maciej Gałkiewicz Shelly Cloud Sp. z o. o., Co-founder, Sysadmin http://shellycloud.com/, mac...@shellycloud.com KRS: 440358 REGON: 101504426 signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image
This is a COW clone, but the BP you pointed doesn’t match the feature you described. This might explain Greg’s answer. The BP refers to the libvirt_image_type functionality for Nova. What do you get now when you try to create a volume from an image? Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 25 Apr 2014, at 16:34, Maciej Gałkiewicz mac...@shellycloud.com wrote: On 25 April 2014 16:00, Gregory Farnum g...@inktank.com wrote: If you had it working in Havana I think you must have been using a customized code base; you can still do the same for Icehouse. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com I was using a standard OpenStack version from Debian official repository. This is how I was creating the volume: # cinder create --image-id 1b84776e-25a0-441e-a74f-dc5d3bf5c103 5 The volume is created instantly. # rbd info volume-abca46dc-0a69-43f0-b91a-412dbf30810f -p cinder_volumes rbd image 'volume-abca46dc-0a69-43f0-b91a-412dbf30810f': size 5120 MB in 640 objects order 23 (8192 kB objects) block_name_prefix: rbd_data.301adaa4c2e28e format: 2 features: layering parent: glance_images/1b84776e-25a0-441e-a74f-dc5d3bf5c103@snap overlap: 4608 MB Isn't it a copy-on-write clone? -- Maciej Gałkiewicz Shelly Cloud Sp. z o. o., Co-founder, Sysadmin http://shellycloud.com/, mac...@shellycloud.com KRS: 440358 REGON: 101504426 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
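A quick way to confirm the copy-on-write relationship is to ask the parent snapshot for its children; a sketch reusing the IDs from the output above:

rbd children glance_images/1b84776e-25a0-441e-a74f-dc5d3bf5c103@snap
# should list cinder_volumes/volume-abca46dc-0a69-43f0-b91a-412dbf30810f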
Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image
I just tried, I have the same problem, it looks like a regression… It’s weird because the code didn’t change that much during the Icehouse cycle. I just reported the bug here: https://bugs.launchpad.net/cinder/+bug/1312819 Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 25 Apr 2014, at 16:37, Sebastien Han sebastien@enovance.com wrote: g signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph-brag installation
Hey Loïc, The machine was setup a while ago :). The server side is ready, there is just no graphical interface, everything appears as plain text. It’s not necessary to upgrade. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 20 Apr 2014, at 16:40, Loic Dachary l...@dachary.org wrote: Hi Cédric, This is in the context of https://wiki.ceph.com/Planning/Blueprints/Firefly/Ceph-Brag which is included in Firefly https://github.com/ceph/ceph/tree/firefly/src/brag . It would be good to try it for real before the release ;-) Cheers On 20/04/2014 12:38, Cédric Lemarchand wrote: Hello there, Le 20 avr. 2014 à 12:20, Loic Dachary l...@dachary.org a écrit : Hi Sébastien, I'm available to help setup the ceph-brag machine. Just curious ;-), could you more specific about that ? When would it be more convenient for you to work on this with me ? The brag.ceph.com machine hosted by the Free Software Foundation France is ready with an Ubuntu precise installation. Maybe we should upgrade to Trusty now ? Cheers -- Loïc Dachary, Artisan Logiciel Libre ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Loïc Dachary, Artisan Logiciel Libre ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] rdb - huge disk - slow ceph
To speed up the deletion, you can remove the rbd_header (if the image is empty) and then remove it. For example: $ rados -p rbd ls huge.rbd rbd_directory $ rados -p rbd rm huge.rbd $ time rbd rm huge 2013-12-10 09:35:44.168695 7f9c4a87d780 -1 librbd::ImageCtx: error finding header: (2) No such file or directory Removing image: 100% complete...done. Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 21 Apr 2014, at 17:03, Gonzalo Aguilar Delgado gagui...@aguilardelgado.com wrote: Hi, I did my first mistake so big... I did a rbd disk of about 300 TB, yes 300 TB rbd info test-disk -p high_value rbd image 'test-disk': size 300 TB in 78643200 objects order 22 (4096 kB objects) block_name_prefix: rb.0.18d7.2ae8944a format: 1 but even more. I made an error with the name (I thought it was 300GB) and deleted it and created again. rbd info homes -p high_value rbd image 'homes': size 300 TB in 78643200 objects order 22 (4096 kB objects) block_name_prefix: rb.0.193e.238e1f29 format: 1 Great mistake, eh?! When I realized I deleted them. But it takes a lot to remove just one. Removing image: 21% complete... (1-2h) What's incredible is that ceph didn't break. Question is. How can I delete them without waiting and breaking something? I also moving my 300GB disk to the ceph cluster: /dev/sdd1 307468468 265789152 26037716 92% /mnt/temp /dev/rbd1 309506048 4888396 304601268 2% /mnt/rbd/homes So I have: [1] Running rbd rm test-disk -p high_value (wd: ~) [2]- Running rbd rm homes -p high_value (wd: ~) [3]+ Running cp -rapx * /mnt/rbd/homes/ (wd: /mnt/temp) It copied about 4GB but takes long. I don't know if it's because the rm or because the problem Michael told me about btrfs. Any help on this, also? Best regards, ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
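One caveat, as a hedged note: the header object to remove depends on the image format, so check what is actually in the pool first:

rados -p rbd ls | grep huge
# format 1 (as above): the header is <image>.rbd, e.g. huge.rbd
# format 2: the header is rbd_header.<id>, where <id> is the suffix of block_name_prefix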
Re: [ceph-users] ceph osd creation error ---Please help me
Try ceph auth del osd.1 And then repeat step 6 Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 08 Apr 2014, at 16:26, Srinivasa Rao Ragolu srag...@mvista.com wrote: Correct error as below Error EINVAL: entity osd.1 exists but key does not match On Tue, Apr 8, 2014 at 7:51 PM, Srinivasa Rao Ragolu srag...@mvista.com wrote: Hi, I am trying to setup ceph cluster without using ceph-deploy. Followed the link http://ceph.com/docs/master/install/manual-deployment/ Successfully able to create monitor node and results are as expected I have copied ceph.conf and ceph.client.admin.keyring from monitor node to OSD node. ceph.conf [global] fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993 mon initial members = mon mon host = 10.162.xx.yy auth cluster required = cephx auth service required = cephx auth client required = cephx osd journal size = 1024 filestore xattr use omap = true osd pool default size = 2 osd pool default min size = 1 osd pool default pg num = 333 osd pool default pgp num = 333 osd crush chooseleaf type = 1 -- On OSD node executed followed steps:(All executed in super user mode) 1) ceph osd create result: 1 2) mkdir /var/lib/ceph/osd/ceph-1 3) mkfs -t ext4 /dev/sdb1 4) mount -o user_xattr /dev/sdb1 /var/lib/ceph/osd/ceph-1 5) ceph-osd -i 1 --mkfs --mkkey 6) ceph auth add osd.1 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-1/keyring Now I got error Error EINVAL: entity osd.0 exists but key does not match Please help me in resolving this issue. Please let me know what did I missed? Thanks in advance. Srinivas. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Openstack Nova not removing RBD volumes after removing of instance
I don’t know the packages but for me it looks like a bug… Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 04 Apr 2014, at 09:56, Mariusz Gronczewski mariusz.gronczew...@efigence.com wrote: Nope, one from RDO packages http://openstack.redhat.com/Main_Page On Thu, 3 Apr 2014 23:22:15 +0200, Sebastien Han sebastien@enovance.com wrote: Are you running Havana with josh’s branch? (https://github.com/jdurgin/nova/commits/havana-ephemeral-rbd) Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 03 Apr 2014, at 13:24, Mariusz Gronczewski mariusz.gronczew...@efigence.com wrote: Hi, some time ago I build small Openstack cluster with Ceph as main/only storage backend. I managed to get all parts working (removing/adding volumes works in cinder/glance/nova). I get no errors in logs but I've noticed that after deleting an instance (booted from image) I get leftover RBD volumes: hqblade201(hqstack1):~☠ nova list +--+---+++-+-+ | ID | Name | Status | Task State | Power State | Networks| +--+---+++-+-+ | 5c6261a5-0290-4db6-89a2-f0c81f47d044 | template.devops.non.3dart.com | ACTIVE | None | Running | ext_vlan_102=10.0.102.2 | +--+---+++-+-+ [10:25:00]hqblade201(hqstack1):~☠ nova volume-list +--+---+-+--+-+-+ | ID | Status| Display Name| Size | Volume Type | Attached to | +--+---+-+--+-+-+ | 11aae1a0-48c9-4606-a2be-f44624adb583 | available | stackdev.root | 10 | None| | | 4dacfa9c-dfea-4a15-8ede-0cbdebb5a2e5 | available | cloud-init-test | 10 | None| | | ecf26742-e79e-4d7a-b8a4-9b4dc85dd41f | available | deb-net3| 10 | None| | | 91ec34e3-d597-49e9-80f6-364f5879c6c0 | available | deb-net2| 10 | None| | | 2acee1b6-16ec-4409-b5ad-3af7903f7d5c | available | deb-net1| 10 | None| | | dba790ec-60a3-48ef-ba40-dfb5946a6a1d | available | deb3| 10 | None| | | 57600343-b488-4da6-beb6-94ed351f4f6a | available | deb2| 10 | None| | | 8ff0be71-a36e-40f8-84ad-a8dffa1157fd | available | cvcxvcxv| 10 | None| | | 32a1a61d-698c-4131-bb60-75d95b487b9a | available | deb | 10 | None| | | 5faae133-3e9e-4048-b2bb-ba636f74e8d1 | available | sr | 3 | None| | +--+---+-+--+-+-+ hqblade201(hqstack1):~☠ rbd ls volumes # those are orphaned volumes 003c2a30-240c-4a42-930c-9a81bc9f743d_disk 003c2a30-240c-4a42-930c-9a81bc9f743d_disk.local 003c2a30-240c-4a42-930c-9a81bc9f743d_disk.swap 1026039e-2cb9-4ff1-8f3d-2b270a765858_disk 1026039e-2cb9-4ff1-8f3d-2b270a765858_disk.local 1026039e-2cb9-4ff1-8f3d-2b270a765858_disk.swap 1986fb8e-df4a-40a8-9d1e-762665e60db2_disk 1a0500ad-9311-472b-9c7a-82046ac7aeab_disk 1a0500ad-9311-472b-9c7a-82046ac7aeab_disk.local 1a0500ad-9311-472b-9c7a-82046ac7aeab_disk.swap 1d87569d-db74-480e-af6c-68716460010c_disk 1d87569d-db74-480e-af6c-68716460010c_disk.local 1d87569d-db74-480e-af6c-68716460010c_disk.swap ... 5c6261a5-0290-4db6-89a2-f0c81f47d044_disk 5c6261a5-0290-4db6-89a2-f0c81f47d044_disk.local 5c6261a5-0290-4db6-89a2-f0c81f47d044_disk.swap ... 
fc9bff9c-fa37-4412-992e-5d1c9d5f4fac_disk fc9bff9c-fa37-4412-992e-5d1c9d5f4fac_disk.local fc9bff9c-fa37-4412-992e-5d1c9d5f4fac_disk.swap volume-11aae1a0-48c9-4606-a2be-f44624adb583 volume-2acee1b6-16ec-4409-b5ad-3af7903f7d5c volume-32a1a61d-698c-4131-bb60-75d95b487b9a volume-4dacfa9c-dfea-4a15-8ede-0cbdebb5a2e5 volume-57600343-b488-4da6-beb6-94ed351f4f6a volume-5faae133-3e9e-4048-b2bb-ba636f74e8d1 volume-8ff0be71-a36e-40f8-84ad-a8dffa1157fd volume-91ec34e3-d597-49e9-80f6-364f5879c6c0 volume-dba790ec-60a3-48ef-ba40-dfb5946a6a1d volume-ecf26742-e79e
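If those instances really are gone from Nova, the leftovers can be removed by hand; a hedged sketch using one UUID from the listing above (check every UUID against nova list before deleting anything):

rbd -p volumes rm 003c2a30-240c-4a42-930c-9a81bc9f743d_disk
rbd -p volumes rm 003c2a30-240c-4a42-930c-9a81bc9f743d_disk.local
rbd -p volumes rm 003c2a30-240c-4a42-930c-9a81bc9f743d_disk.swap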
Re: [ceph-users] qemu-rbd
There is a RBD engine for FIO, have a look at http://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 12 Mar 2014, at 04:25, Kyle Bader kyle.ba...@gmail.com wrote: I tried rbd-fuse and it's throughput using fio is approx. 1/4 that of the kernel client. Can you please let me know how to setup RBD backend for FIO? I'm assuming this RBD backend is also based on librbd? You will probably have to build fio from source since the rbd engine is new: https://github.com/axboe/fio Assuming you already have a cluster and a client configured this should do the trick: https://github.com/axboe/fio/blob/master/examples/rbd.fio -- Kyle ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
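For reference, a minimal job file sketch for the rbd engine (pool and image names are assumptions, and the target image must exist before fio runs, e.g. rbd create fio_test --size 2048):

[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio_test
invalidate=0
rw=randwrite
bs=4k

[rbd_iodepth32]
iodepth=32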
Re: [ceph-users] qemu non-shared storage migration of nova instances?
Hi, I use the following live migration flags: VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST It deletes the libvirt.xml and re-creates it on the other side. Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovance.com - Twitter : @enovance On 11 Mar 2014, at 22:06, Don Talton (dotalton) dotal...@cisco.com wrote: Hi guys and gals, I'm able to do live migration via 'nova live-migration', as long as my instances are sitting on shared storage. However, when they are not, nova live-migrate fails, due to a shared storage check. To get around this, I attempted to do a live migration via libvirt directly. Using the feature --copy-storage-all fails. Part of the trouble with this is that even though nova is booted from a volume stored on ceph, there are still support files (eg console.log, disk.config) that reside in the instances directory. The virsh command (I've tried many combinations of many different migration approaches) is virsh migrate --live --copy-storage-all instance-000c qemu+ssh://target/system. This fails due to libvirt not creating the instance dir and copying the support files to the target. I'm curious if anyone has been able to get something like this to work. I'd really love to get ceph-backed live migration going without adding the overhead of shared storage for nova too. Thanks, Donald Talton Cloud Systems Development Cisco Systems ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
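For reference, those flags are usually carried in nova.conf; a minimal sketch (in older releases the option sits under [DEFAULT] rather than [libvirt]):

[libvirt]
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST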
Re: [ceph-users] How to Configure Cinder to access multiple pools
Hi, Please have a look at the cinder multi-backend functionality: examples here: http://www.sebastien-han.fr/blog/2013/04/25/ceph-and-cinder-multi-backend/ Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 25 Feb 2014, at 14:42, Vikrant Verma vikrantverm...@gmail.com wrote: Hi All, I am using cinder as a front end for volume storage in Openstack configuration. Ceph is used as storage back-end. Currently cinder uses only one pool (in my case pool name is volumes ) for its volume storage. I want cinder to use multiple ceph pools for volume storage --following is the cinder.conf--- volume_driver=cinder.volume.drivers.rbd.RBDDriver rbd_pool=volumes rbd_ceph_conf=/etc/ceph/ceph.conf rbd_flatten_volume_from_snapshot=false Please let me know if it is possible to have multiple pools associated to cinder, let me know how to configure it. Regards, Vikrant ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
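As a rough sketch of the multi-backend layout described in that post (section names, pool names and backend names below are assumptions):

[DEFAULT]
enabled_backends=rbd-sata,rbd-ssd

[rbd-sata]
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes
volume_backend_name=RBD_SATA

[rbd-ssd]
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes-ssd
volume_backend_name=RBD_SSD

Each backend can then be exposed as a volume type, e.g. cinder type-create ssd followed by cinder type-key ssd set volume_backend_name=RBD_SSD, so the scheduler routes requests to the right pool.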
Re: [ceph-users] storage
Hi, RBD blocks are stored as objects on a filesystem usually under: /var/lib/ceph/osd/osd.id/current/pg.id/ RBD is just an abstraction layer. Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 25 Feb 2014, at 13:09, yalla.gnan.ku...@accenture.com wrote: Hi All, By default in which directory/directories, does ceph store the block device files ? Is it in the /dev or other filesystem ? Thanks Kumar This message is for the designated recipient only and may contain privileged, proprietary, or otherwise confidential information. If you have received it in error, please notify the sender immediately and delete the original. Any other use of the e-mail by you is prohibited. Where allowed by local law, electronic communications with Accenture and its affiliates, including e-mail and instant messaging (including content), may be scanned by our systems for the purposes of information security and assessment of internal compliance with Accenture policy. . __ www.accenture.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
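To see that mapping for yourself, a quick sketch (image name and prefix below are illustrative):

rbd info rbd/myimage                         # note the block_name_prefix, e.g. rb.0.18d7.2ae8944a
rados -p rbd ls | grep rb.0.18d7.2ae8944a    # the RADOS objects backing the image
# each of those objects ends up as a regular file under the OSD's current/ directory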
Re: [ceph-users] Size of objects in Ceph
Hi, The value can be set during the image creation. Start with this: http://ceph.com/docs/master/man/8/rbd/#striping Followed by the example section. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 25 Feb 2014, at 15:54, Florent Bautista flor...@coppint.com wrote: Hi all, I'm new with Ceph and I would like to know if there is any way of changing size of Ceph's internal objects. I mean, when I put an image on RBD for exemple, I can see this: rbd -p CephTest info base-127-disk-1 rbd image 'base-127-disk-1': size 32768 MB in 8192 objects order 22 (4096 kB objects) block_name_prefix: rbd_data.347c274b0dc51 format: 2 features: layering 4096 kB objects = how can I change size of objects ? Or is it a fixed value in Ceph architecture ? Thank you ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
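For example, a hedged sketch of creating an image with 8 MB objects instead of the default 4 MB (order is log2 of the object size in bytes; pool and image names are placeholders):

rbd create mypool/myimage --size 32768 --order 23    # 2^23 bytes = 8192 kB objects
# finer striping needs a format 2 image, e.g. --image-format 2 --stripe-unit 65536 --stripe-count 16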
Re: [ceph-users] Unable top start instance in openstack
Which distro and packages? libvirt_image_type is broken on cloud archive, please patch with https://github.com/jdurgin/nova/commits/havana-ephemeral-rbd Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 19 Feb 2014, at 08:06, yalla.gnan.ku...@accenture.com wrote: Hi, I have followed the link http://ceph.com/docs/master/rbd/rbd-openstack/ and configured ceph with openstack. But when I try to launch instances, they are going into Error state. I have found the below log in the controller node of openstack: --- injection_path = image(\'disk\').path\n', uAttributeError: 'Rbd' object has no attribute 'path'\n] Thanks Kumar This message is for the designated recipient only and may contain privileged, proprietary, or otherwise confidential information. If you have received it in error, please notify the sender immediately and delete the original. Any other use of the e-mail by you is prohibited. Where allowed by local law, electronic communications with Accenture and its affiliates, including e-mail and instant messaging (including content), may be scanned by our systems for the purposes of information security and assessment of internal compliance with Accenture policy. . __ www.accenture.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Block Devices and OpenStack
Hi, Can I see your ceph.conf? I suspect that [client.cinder] and [client.glance] sections are missing. Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 16 Feb 2014, at 06:55, Ashish Chandra mail.ashishchan...@gmail.com wrote: Hi Jean, Here is the output for ceph auth list for client.cinder client.cinder key: AQCKaP9ScNgiMBAAwWjFnyL69rBfMzQRSHOfoQ== caps: [mon] allow r caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rx pool=images Here is the output of ceph -s: ashish@ceph-client:~$ ceph -s cluster afa13fcd-f662-4778-8389-85047645d034 health HEALTH_OK monmap e1: 1 mons at {ceph-node1=10.0.1.11:6789/0}, election epoch 1, quorum 0 ceph-node1 osdmap e37: 3 osds: 3 up, 3 in pgmap v84: 576 pgs, 6 pools, 0 bytes data, 0 objects 106 MB used, 9076 MB / 9182 MB avail 576 active+clean I created all the keyrings and copied as suggested by the guide. On Sun, Feb 16, 2014 at 3:08 AM, Jean-Charles LOPEZ jc.lo...@inktank.com wrote: Hi, what do you get when you run a 'ceph auth list' command for the user name (client.cinder) you created for cinder? Are the caps and the key for this user correct? No typo in the hostname in the cinder.conf file (host=) ? Did you copy the keyring to the cinder running cinder (can’t really say from your output and there is no ceph-s command to check the monitor names)? It could just be a typo in the ceph auth get-or-create command that’s causing it. Rgds JC On Feb 15, 2014, at 10:35, Ashish Chandra mail.ashishchan...@gmail.com wrote: Hi Cephers, I am trying to configure ceph rbd as backend for cinder and glance by following the steps mentioned in: http://ceph.com/docs/master/rbd/rbd-openstack/ Before I start all openstack services are running normally and ceph cluster health shows HEALTH_OK But once I am done with all steps and restart openstack services, cinder-volume fails to start and throws an error. 
2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Traceback (most recent call last): 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd File /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 262, in check_for_setup_error 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd with RADOSClient(self): 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd File /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 234, in __init__ 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd self.cluster, self.ioctx = driver._connect_to_rados(pool) 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd File /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 282, in _connect_to_rados 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd client.connect() 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd File /usr/lib/python2.7/dist-packages/rados.py, line 185, in connect 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd raise make_ex(ret, error calling connect) 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Error: error calling connect: error code 95 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd 2014-02-16 00:01:42.591 ERROR cinder.volume.manager [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Error encountered during initialization of driver: RBDDriver 2014-02-16 00:01:42.592 ERROR cinder.volume.manager [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Bad or unexpected response from the storage volume backend API: error connecting to ceph cluster 2014-02-16 00:01:42.592 TRACE cinder.volume.manager Traceback (most recent call last): 2014-02-16 00:01:42.592 TRACE cinder.volume.manager File /opt/stack/cinder/cinder/volume/manager.py, line 190, in init_host 2014-02-16 00:01:42.592 TRACE cinder.volume.manager self.driver.check_for_setup_error() 2014-02-16 00:01:42.592 TRACE cinder.volume.manager File /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 267, in check_for_setup_error 2014-02-16 00:01:42.592 TRACE cinder.volume.manager raise exception.VolumeBackendAPIException(data=msg) 2014-02-16 00:01:42.592 TRACE cinder.volume.manager VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: error connecting to ceph cluster Here is the content of my /etc/ceph in openstack node: ashish@ubuntu:/etc/ceph$ ls -lrt total 16 -rw-r--r-- 1 cinder cinder 229 Feb 15 23:45 ceph.conf -rw-r--r-- 1 glance glance 65 Feb 15 23:46 ceph.client.glance.keyring -rw-r--r-- 1 cinder cinder 65 Feb 15 23:47 ceph.client.cinder.keyring -rw-r--r-- 1 cinder cinder 72 Feb 15 23:47
Re: [ceph-users] Block Devices and OpenStack
Hi, If cinder-volume fails to connect and putting the admin keyring works it means that cinder is not configured properly. Please also try to add the following: [client.cinder] keyring = path-to-keyring Same for Glance. Btw: ceph.conf doesn’t need to be own by Cinder, just let mod +r and keep root as owner. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 17 Feb 2014, at 14:48, Ashish Chandra mail.ashishchan...@gmail.com wrote: Hi Sebastian, Jean; This is my ceph.conf looks like. It was auto generated using ceph-deploy. [global] fsid = afa13fcd-f662-4778-8389-85047645d034 mon_initial_members = ceph-node1 mon_host = 10.0.1.11 auth_cluster_required = cephx auth_service_required = cephx auth_client_required = cephx filestore_xattr_use_omap = true If I provide admin.keyring file to openstack node (in /etc/ceph) it works fine and issue is gone . Thanks Ashish On Mon, Feb 17, 2014 at 2:03 PM, Sebastien Han sebastien@enovance.com wrote: Hi, Can I see your ceph.conf? I suspect that [client.cinder] and [client.glance] sections are missing. Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 16 Feb 2014, at 06:55, Ashish Chandra mail.ashishchan...@gmail.com wrote: Hi Jean, Here is the output for ceph auth list for client.cinder client.cinder key: AQCKaP9ScNgiMBAAwWjFnyL69rBfMzQRSHOfoQ== caps: [mon] allow r caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rx pool=images Here is the output of ceph -s: ashish@ceph-client:~$ ceph -s cluster afa13fcd-f662-4778-8389-85047645d034 health HEALTH_OK monmap e1: 1 mons at {ceph-node1=10.0.1.11:6789/0}, election epoch 1, quorum 0 ceph-node1 osdmap e37: 3 osds: 3 up, 3 in pgmap v84: 576 pgs, 6 pools, 0 bytes data, 0 objects 106 MB used, 9076 MB / 9182 MB avail 576 active+clean I created all the keyrings and copied as suggested by the guide. On Sun, Feb 16, 2014 at 3:08 AM, Jean-Charles LOPEZ jc.lo...@inktank.com wrote: Hi, what do you get when you run a 'ceph auth list' command for the user name (client.cinder) you created for cinder? Are the caps and the key for this user correct? No typo in the hostname in the cinder.conf file (host=) ? Did you copy the keyring to the cinder running cinder (can’t really say from your output and there is no ceph-s command to check the monitor names)? It could just be a typo in the ceph auth get-or-create command that’s causing it. Rgds JC On Feb 15, 2014, at 10:35, Ashish Chandra mail.ashishchan...@gmail.com wrote: Hi Cephers, I am trying to configure ceph rbd as backend for cinder and glance by following the steps mentioned in: http://ceph.com/docs/master/rbd/rbd-openstack/ Before I start all openstack services are running normally and ceph cluster health shows HEALTH_OK But once I am done with all steps and restart openstack services, cinder-volume fails to start and throws an error. 
2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Traceback (most recent call last): 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd File /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 262, in check_for_setup_error 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd with RADOSClient(self): 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd File /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 234, in __init__ 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd self.cluster, self.ioctx = driver._connect_to_rados(pool) 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd File /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 282, in _connect_to_rados 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd client.connect() 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd File /usr/lib/python2.7/dist-packages/rados.py, line 185, in connect 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd raise make_ex(ret, error calling connect) 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Error: error calling connect: error code 95 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd 2014-02-16 00:01:42.591 ERROR cinder.volume.manager [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Error encountered during initialization of driver: RBDDriver 2014-02-16 00:01:42.592 ERROR cinder.volume.manager [req
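Putting the two suggestions together, a sketch of the client-side additions to /etc/ceph/ceph.conf on the OpenStack node (keyring paths are assumptions):

[client.cinder]
keyring = /etc/ceph/ceph.client.cinder.keyring

[client.glance]
keyring = /etc/ceph/ceph.client.glance.keyring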
Re: [ceph-users] Meetup in Frankfurt, before the Ceph day
Hi Alexandre, We have a meet up in Paris. Please see: http://www.meetup.com/Ceph-in-Paris/events/158942372/ Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 05 Feb 2014, at 13:51, Alexandre DERUMIER aderum...@odiso.com wrote: Hi Loic, do you known if a ceph meetup is planned soon in France or Belgium ? I miss the Fosdem this year, and I'll be very happy to meet some ceph users/devs. Regards, Alexandre - Mail original - De: Loic Dachary l...@dachary.org À: ceph-users ceph-users@lists.ceph.com Envoyé: Mercredi 5 Février 2014 09:44:04 Objet: [ceph-users] Meetup in Frankfurt, before the Ceph day Hi Ceph, I'll be in Frankfurt for the Ceph day February 27th http://www.eventbrite.com/e/ceph-day-frankfurt-tickets-10173269523 and I will attend the meetup organized the evening before http://www.meetup.com/Ceph-Frankfurt/events/164620852/ Anyone interested to join ? Not sure where we should meet ... I've never been to Frankfurt before :-) Cheers -- Loïc Dachary, Artisan Logiciel Libre ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] During copy new rbd image is totally thick
I have the same behaviour here. I believe this is somehow expected since you’re calling “copy”, clone will do the cow. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 03 Feb 2014, at 08:43, Igor Laskovy igor.lask...@gmail.com wrote: Anybody? ;) On Thu, Jan 30, 2014 at 9:10 PM, Igor Laskovy igor.lask...@gmail.com wrote: Hello list, Is it correct behavior during copy to thicking rbd image? igor@hv03:~$ rbd create rbd/test -s 1024 igor@hv03:~$ rbd diff rbd/test | awk '{ SUM += $2 } END { print SUM/1024/1024 MB }' 0 MB igor@hv03:~$ rbd copy rbd/test rbd/cloneoftest Image copy: 100% complete...done. igor@hv03:~$ rbd diff rbd/cloneoftest | awk '{ SUM += $2 } END { print SUM/1024/1024 MB }' 1024 MB -- Igor Laskovy facebook.com/igor.laskovy studiogrizzly.com -- Igor Laskovy facebook.com/igor.laskovy studiogrizzly.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
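For comparison, a sketch of the copy-on-write path; clone needs a format 2 image, whereas rbd create still defaults to format 1 here:

rbd create rbd/test --size 1024 --image-format 2
rbd snap create rbd/test@base
rbd snap protect rbd/test@base
rbd clone rbd/test@base rbd/cloneoftest
rbd diff rbd/cloneoftest | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'   # stays at 0 MB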
Re: [ceph-users] get virtual size and used
Hi, $ rbd diff rbd/toto | awk '{ SUM += $2 } END { print SUM/1024/1024 MB }’ Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 03 Feb 2014, at 17:10, zorg z...@probesys.com wrote: hi, We use rbd pool for and I wonder how can i have the real size use by my drb image I can have the virtual size rbd info but how can i have the real size use by my drbd image -- probeSys - spécialiste GNU/Linux site web : http://www.probesys.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Openstack Havana release installation with ceph
Usually you would like to start here: http://ceph.com/docs/master/rbd/rbd-openstack/ Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 22 Jan 2014, at 00:14, Dmitry Borodaenko dborodae...@mirantis.com wrote: On Tue, Jan 21, 2014 at 10:38 AM, Dmitry Borodaenko dborodae...@mirantis.com wrote: On Tue, Jan 21, 2014 at 2:23 AM, Lalitha Maruthachalam lalitha.maruthacha...@aricent.com wrote: Can someone please let me know whether there is any documentation for installing Havana release of Openstack along with Ceph. These slides have some information about how this is done in Mirantis OpenStack 4.0, including some gotchas and troubleshooting pointers: http://files.meetup.com/11701852/fuel-ceph.pdf I didn't realize you need to be a participant of the meetup to get that file, here's a link to the same slides on SlideShare: http://www.slideshare.net/mirantis/fuel-ceph Apologies, -- Dmitry Borodaenko ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] OSD port usage
Greg, Do you have any estimation about how heartbeat messages use the network? How busy is it? At some point (if the cluster gets big enough), could this degrade the network performance? Will it make sense to have a separate network for this? So in addition to public and storage we will have an heartbeat network, so we could pin it to a specific network link. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 22 Jan 2014, at 19:01, Gregory Farnum g...@inktank.com wrote: On Tue, Jan 21, 2014 at 8:26 AM, Sylvain Munaut s.mun...@whatever-company.com wrote: Hi, I noticed in the documentation that the OSD should use 3 ports per OSD daemon running and so when I setup the cluster, I originally opened enough port to accomodate this (with a small margin so that restart could proceed even is ports aren't released immediately). However today I just noticed that OSD daemons are using 5 ports and so for some of them, a port or two were locked by the firewall. All the OSD were still reporting as OK and the cluster didn't report anything wrong but I was getting some weird behavior that could have been related. So is that usage of 5 TCP ports normal ? And if it is, could the doc be updated ? Normal! It's increased a couple times recently because we added heartbeating on both the public and cluster network interfaces. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
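Rather than counting ports per daemon, it is usually easier to open the whole range the OSDs can bind to; a sketch (the exact range depends on the release and OSD count):

iptables -A INPUT -p tcp --dport 6800:7100 -j ACCEPT
# the range is governed by ms_bind_port_min / ms_bind_port_max in ceph.conf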
Re: [ceph-users] OSD port usage
I agree but somehow this generates more traffic too. We just need to find a good balance. But I don’t think this will change the scenario where the cluster network is down and OSDs die because of this… Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 24 Jan 2014, at 11:52, Sylvain Munaut s.mun...@whatever-company.com wrote: Hi, At some point (if the cluster gets big enough), could this degrade the network performance? Will it make sense to have a separate network for this? So in addition to public and storage we will have an heartbeat network, so we could pin it to a specific network link. I think the whole point of having added hearbeating on public cluster network is to be able to detect failure of one network but not the other. Separating heartbeat on it's own independent network would be quite counter productive in this respect. Cheers, Sylvain signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] OSD port usage
Ok Greg, thanks for the clarification! Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 24 Jan 2014, at 18:22, Gregory Farnum g...@inktank.com wrote: On Friday, January 24, 2014, Sebastien Han sebastien@enovance.com wrote: Greg, Do you have any estimation about how heartbeat messages use the network? How busy is it? Not very. It's one very small message per OSD peer per...second? At some point (if the cluster gets big enough), could this degrade the network performance? Will it make sense to have a separate network for this? As Sylvain said, that would negate the entire point of heartbeating on both networks. Trust me, you don't want to deal with a cluster where the OSDs can't talk to each other but they can talk to the monitors and keep marking each other down. -Greg So in addition to public and storage we will have an heartbeat network, so we could pin it to a specific network link. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 22 Jan 2014, at 19:01, Gregory Farnum g...@inktank.com wrote: On Tue, Jan 21, 2014 at 8:26 AM, Sylvain Munaut s.mun...@whatever-company.com wrote: Hi, I noticed in the documentation that the OSD should use 3 ports per OSD daemon running and so when I setup the cluster, I originally opened enough port to accomodate this (with a small margin so that restart could proceed even is ports aren't released immediately). However today I just noticed that OSD daemons are using 5 ports and so for some of them, a port or two were locked by the firewall. All the OSD were still reporting as OK and the cluster didn't report anything wrong but I was getting some weird behavior that could have been related. So is that usage of 5 TCP ports normal ? And if it is, could the doc be updated ? Normal! It's increased a couple times recently because we added heartbeating on both the public and cluster network interfaces. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Software Engineer #42 @ http://inktank.com | http://ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] servers advise (dell r515 or supermicro ....)
Hi Alexandre, Are you going with a 10Gb network? It’s not an issue for IOPS but more for the bandwidth. If so read the following: I personally won’t go with a ratio of 1:6 for the journal. I guess 1:5 (or even 1:4) is preferable. SAS 10K gives you around 140MB/sec for sequential writes. So if you use a journal with an SSD, you expect at least 140MB if you don’t want to slow things down. If you do so 140*10 (disks): fulfil your 10GB bandwidth already. So either you don’t need that much disks either you don’t need SSDs. It depends on the performance that you want to achieve. Another thing, I also won’t use the DC S3700 since this disk was definitely made for IOPS intensive applications. The journal is purely sequential (small seq block, IIRC Stephan mentioned 370k blocks). I will instead use with a SSD with large sequential capabilities like 525 series 120GB. Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 15 Jan 2014, at 12:47, Alexandre DERUMIER aderum...@odiso.com wrote: Hello List, I'm going to build a build a rbd cluster this year, with 5 nodes I would like to have this kind of configuration for each node: - 2U - 2,5inch drives os : 2 disk sas drive journal : 2 x ssd intel dc s3700 100GB osd : 10 or 12 x sas Seagate Savvio 10K.6 900GB I see on the mailing that intank use dell r515. I currently own a lot of dell servers and I have good prices. But I have also see on the mailing that dell perc H700 can have some performance problem, and also it's not easy to flash the firmware for jbod mode. http://www.spinics.net/lists/ceph-devel/msg16661.html I don't known if theses performance problem has finally been solved ? Another option could be to use supermicro server, they have some 2U - 16 disks chassis + one or two lsi jbod controller. But, I have had in past really bad experience with supermicro motherboard. (Mainly firmware bug, ipmi card bug,.) Does someone have experience with supermicro, and give me advise for a good motherboard model? Best Regards, Alexandre Derumier ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
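A back-of-the-envelope version of that argument, using the numbers from the mail above:

10 disks x ~140 MB/s sequential (10K SAS)  ~ 1400 MB/s of writes to absorb
10 Gbit/s network                          ~ 1250 MB/s usable at best

So ten spinners behind a single 10 Gbit link can already saturate it, which is why the disk count, the SSD journals and the network have to be sized together.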
Re: [ceph-users] servers advise (dell r515 or supermicro ....)
Hum the Crucial m500 is pretty slow. The biggest one doesn’t even reach 300MB/s. Intel DC S3700 100G showed around 200MB/sec for us. Actually, I don’t know the price difference between the crucial and the intel but the intel looks more suitable for me. Especially after Mark’s comment. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 15 Jan 2014, at 15:28, Mark Nelson mark.nel...@inktank.com wrote: On 01/15/2014 08:03 AM, Robert van Leeuwen wrote: Power-Loss Protection: In the rare event that power fails while the drive is operating, power-loss protection helps ensure that data isn’t corrupted. Seems that not all power protected SSDs are created equal: http://lkcl.net/reports/ssd_analysis.html The m500 is not tested but the m4 is. Up to now it seems that only Intel seems to have done his homework. In general they *seem* to be the most reliable SSD provider. Even at that, there has been some concern on the list (and lkml) that certain older Intel drives without super-capacitors are ignoring ATA_CMD_FLUSH, making them very fast (which I like!) but potentially dangerous (boo!). The 520 in particular is a drive I've used for a lot of Ceph performance testing but I'm afraid that if it's not properly handling CMD FLUSH requests, it may not be indicative of the performance folks would see on other drives that do. On the third hand, if drives with supercaps like the Intel DC S3700 can safely ignore CMD_FLUSH and maintain high performance (even when there are a lot of O_DSYNC calls, ala the journal), that potentially makes them even more attractive (and that drive already has relatively high sequential write performance and high write endurance). Cheers, Robert van Leeuwen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] servers advise (dell r515 or supermicro ....)
Sorry I was only looking at the 4K aligned results. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 15 Jan 2014, at 15:46, Stefan Priebe s.pri...@profihost.ag wrote: Am 15.01.2014 15:44, schrieb Mark Nelson: On 01/15/2014 08:39 AM, Stefan Priebe wrote: Am 15.01.2014 15:34, schrieb Sebastien Han: Hum the Crucial m500 is pretty slow. The biggest one doesn’t even reach 300MB/s. Intel DC S3700 100G showed around 200MB/sec for us. where did you get this values from? I've some 960GB and they all have 450Mb/s write speed. Also in tests like here you see 450MB/s http://www.tomshardware.com/reviews/crucial-m500-1tb-ssd,3551-5.html Looks like at least according to Anand's chart, you'll get full write speed once you buy the 480GB model, but not for the 120 or 240GB models: http://www.anandtech.com/show/6884/crucial-micron-m500-review-960gb-480gb-240gb-120gb that's correct but the sentence was The biggest one doesn’t even reach 300MB/s. Actually, I don’t know the price difference between the crucial and the intel but the intel looks more suitable for me. Especially after Mark’s comment. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 15 Jan 2014, at 15:28, Mark Nelson mark.nel...@inktank.com wrote: On 01/15/2014 08:03 AM, Robert van Leeuwen wrote: Power-Loss Protection: In the rare event that power fails while the drive is operating, power-loss protection helps ensure that data isn’t corrupted. Seems that not all power protected SSDs are created equal: http://lkcl.net/reports/ssd_analysis.html The m500 is not tested but the m4 is. Up to now it seems that only Intel seems to have done his homework. In general they *seem* to be the most reliable SSD provider. Even at that, there has been some concern on the list (and lkml) that certain older Intel drives without super-capacitors are ignoring ATA_CMD_FLUSH, making them very fast (which I like!) but potentially dangerous (boo!). The 520 in particular is a drive I've used for a lot of Ceph performance testing but I'm afraid that if it's not properly handling CMD FLUSH requests, it may not be indicative of the performance folks would see on other drives that do. On the third hand, if drives with supercaps like the Intel DC S3700 can safely ignore CMD_FLUSH and maintain high performance (even when there are a lot of O_DSYNC calls, ala the journal), that potentially makes them even more attractive (and that drive already has relatively high sequential write performance and high write endurance). Cheers, Robert van Leeuwen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] servers advise (dell r515 or supermicro ....)
However you have to get 480GB which ridiculously large for a journal. I believe they are pretty expensive too. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 15 Jan 2014, at 15:49, Sebastien Han sebastien@enovance.com wrote: Sorry I was only looking at the 4K aligned results. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 15 Jan 2014, at 15:46, Stefan Priebe s.pri...@profihost.ag wrote: Am 15.01.2014 15:44, schrieb Mark Nelson: On 01/15/2014 08:39 AM, Stefan Priebe wrote: Am 15.01.2014 15:34, schrieb Sebastien Han: Hum the Crucial m500 is pretty slow. The biggest one doesn’t even reach 300MB/s. Intel DC S3700 100G showed around 200MB/sec for us. where did you get this values from? I've some 960GB and they all have 450Mb/s write speed. Also in tests like here you see 450MB/s http://www.tomshardware.com/reviews/crucial-m500-1tb-ssd,3551-5.html Looks like at least according to Anand's chart, you'll get full write speed once you buy the 480GB model, but not for the 120 or 240GB models: http://www.anandtech.com/show/6884/crucial-micron-m500-review-960gb-480gb-240gb-120gb that's correct but the sentence was The biggest one doesn’t even reach 300MB/s. Actually, I don’t know the price difference between the crucial and the intel but the intel looks more suitable for me. Especially after Mark’s comment. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 15 Jan 2014, at 15:28, Mark Nelson mark.nel...@inktank.com wrote: On 01/15/2014 08:03 AM, Robert van Leeuwen wrote: Power-Loss Protection: In the rare event that power fails while the drive is operating, power-loss protection helps ensure that data isn’t corrupted. Seems that not all power protected SSDs are created equal: http://lkcl.net/reports/ssd_analysis.html The m500 is not tested but the m4 is. Up to now it seems that only Intel seems to have done his homework. In general they *seem* to be the most reliable SSD provider. Even at that, there has been some concern on the list (and lkml) that certain older Intel drives without super-capacitors are ignoring ATA_CMD_FLUSH, making them very fast (which I like!) but potentially dangerous (boo!). The 520 in particular is a drive I've used for a lot of Ceph performance testing but I'm afraid that if it's not properly handling CMD FLUSH requests, it may not be indicative of the performance folks would see on other drives that do. On the third hand, if drives with supercaps like the Intel DC S3700 can safely ignore CMD_FLUSH and maintain high performance (even when there are a lot of O_DSYNC calls, ala the journal), that potentially makes them even more attractive (and that drive already has relatively high sequential write performance and high write endurance). 
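For journal sizing, the rule of thumb from the Ceph docs is: journal size ≈ 2 * (expected throughput * filestore max sync interval). With ~140MB/s per OSD and the default 5s sync interval that works out to roughly 1.4GB per journal, so several journals share one SSD comfortably within 10-20GB. A minimal ceph.conf sketch, value only illustrative:

[osd]
    osd journal size = 10000    # in MB, i.e. ~10GB, already generous for the numbers above

which is why a 480GB device is wildly oversized for this role, even before considering price.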
Cheers, Robert van Leeuwen ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] [Ceph-community] Ceph User Committee elections : call for participation
Thanks! Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 01 Jan 2014, at 10:41, Loic Dachary l...@dachary.org wrote: On 01/01/2014 02:39, Sebastien Han wrote: Hi, I’m not sure to have the whole visibility of the role but I will be more than happy to take over. I believe that I can allocate some time for this. Your name is added to the http://pad.ceph.com/p/ceph-user-committee-candidates list Cheers Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 31 Dec 2013, at 09:18, Loic Dachary l...@dachary.org wrote: Hi, For personal reasons I have to step down as head of the Ceph User Committee at the end of January 2014. Who would be willing to take over this role ? If there is enough interest I'll organize the election. Otherwise we'll have to figure out something ;-) Cheers -- Loïc Dachary, Artisan Logiciel Libre ___ Ceph-community mailing list ceph-commun...@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com -- Loïc Dachary, Artisan Logiciel Libre signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] My experience with ceph now documentted
The ceph doc is currently being updated. See https://github.com/ceph/ceph/pull/906 Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 17 Dec 2013, at 00:13, Andrew Woodward xar...@gmail.com wrote: Karan, This all looks great. I'd encourage you to submit some of this information into the ceph docs, some of the openstack integration docs are getting a little dated Andrew On Fri, Dec 6, 2013 at 12:24 PM, Karan Singh ksi...@csc.fi wrote: Hello Cephers I would like to say a BIG THANKS to ceph community for helping me in setting up and learning ceph. I have created a small documentation http://karan-mj.blogspot.fi/ of my experience with ceph till now , i belive it would help beginners in installing ceph and integrating it with openstack. I would keep updating this blog. PS -- i recommend original ceph documentation http://ceph.com/docs/master/ and other original content published by Ceph community , INKTANK and other partners. My attempt http://karan-mj.blogspot.fi/ is just to contribute for a regular online content about ceph. Karan Singh CSC - IT Center for Science Ltd. P.O. Box 405, FI-02101 Espoo, FINLAND http://www.csc.fi/ | +358 (0) 503 812758 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- If google has done it, Google did it right! ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Journal, SSD and OS
Arf forgot to mention that I’ll do a software mdadm RAID 1 with both sda1 and sdb1 and put the OS on this. The rest (sda2 and sdb2) will go for the journals. @James: I think that Gandalf’s main idea was to save some costs/space on the servers so having dedicated disks is not an option. (that what I understand from your comment “have the OS somewhere else” but I could be wrong) Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 05 Dec 2013, at 16:02, James Pearce ja...@peacon.co.uk wrote: Another option is to run journals on individually presented SSDs, in a 5:1 ratio (spinning-disk:ssd) and have the OS somewhere else. Then the failure domain is smaller. Ideally implement some way to monitor SSD write life SMART data - at least it gives a guide as to device condition compared to its rated life. That can be done with smartmontools, but it would be nice to have it on the InkTank dashboard for example. On 2013-12-05 14:26, Sebastien Han wrote: Hi guys, I won’t do a RAID 1 with SSDs since they both write the same data. Thus, they are more likely to “almost” die at the same time. What I will try to do instead is to use both disk in JBOD mode or (degraded RAID0). Then I will create a tiny root partition for the OS. Then I’ll still have something like /dev/sda2 and /dev/sdb2 and then I can take advantage of the 2 disks independently. The good thing with that is that you can balance your journals across both SSDs. From a performance perspective this is really good. The bad thing as always is that if you loose a SSD you loose all the journals attached to it. Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 05 Dec 2013, at 10:53, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2013/12/4 Simon Leinen simon.lei...@switch.ch: I think this is a fine configuration - you won't be writing to the root partition too much, outside journals. We also put journals on the same SSDs as root partitions (not that we're very ambitious about performance...). Do you suggest a RAID1 for the OS partitions on SSDs ? Is this safe or a RAID1 will decrease SSD life? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
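A minimal sketch of that layout, assuming two SSDs sda/sdb with GPT labels; partition sizes and device names are only illustrative:

$ sudo parted -s /dev/sda mklabel gpt
$ sudo parted -s /dev/sda mkpart os 1MiB 20GiB
$ sudo parted -s /dev/sda mkpart journal 20GiB 100%
$ sudo parted -s /dev/sdb mklabel gpt
$ sudo parted -s /dev/sdb mkpart os 1MiB 20GiB
$ sudo parted -s /dev/sdb mkpart journal 20GiB 100%
$ sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1   # OS lives on md0

sda2 and sdb2 then hold the journals, split between the OSDs so that the write load is balanced across both SSDs, as described above.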
Re: [ceph-users] Journal, SSD and OS
Hi guys, I won’t do a RAID 1 with SSDs since they both write the same data. Thus, they are more likely to “almost” die at the same time. What I will try to do instead is to use both disk in JBOD mode or (degraded RAID0). Then I will create a tiny root partition for the OS. Then I’ll still have something like /dev/sda2 and /dev/sdb2 and then I can take advantage of the 2 disks independently. The good thing with that is that you can balance your journals across both SSDs. From a performance perspective this is really good. The bad thing as always is that if you loose a SSD you loose all the journals attached to it. Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 05 Dec 2013, at 10:53, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2013/12/4 Simon Leinen simon.lei...@switch.ch: I think this is a fine configuration - you won't be writing to the root partition too much, outside journals. We also put journals on the same SSDs as root partitions (not that we're very ambitious about performance...). Do you suggest a RAID1 for the OS partitions on SSDs ? Is this safe or a RAID1 will decrease SSD life? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Docker
Hi guys! Some experiment here: http://www.sebastien-han.fr/blog/2013/09/19/how-I-barely-got-my-first-ceph-mon-running-in-docker/ Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 29 Nov 2013, at 00:08, Patrick McGarry patr...@inktank.com wrote: I played with Docker for a while and ran into some issues (perhaps from my own ignorance of Docker principles). The biggest issue seemed to be that the IP was relatively ephemeral, which the MON really doesn't like. I couldn't find a reliably intuitive way to have the MON get either the same IP or a way to update the IP in a way that would also form a cluster. If anyone has been able to get Ceph into Docker with reliability and portability I would love to hear about it (and feature it on the Ceph.com blog!). Best Regards, Patrick McGarry Director, Community || Inktank http://ceph.com || http://inktank.com @scuttlemonkey || @ceph || @inktank On Thu, Nov 28, 2013 at 5:17 PM, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: Anybody using MONs and RGW inside docker containers? I would like to use a server with two docker containers, one for mon and one for RGW This to archieve a better isolation between services and some reusable components (the same container can be exported and used multiple times on multiple servers) Suggestions? ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] LevelDB Backend For Ceph OSD Preview
Hi Sage, If I recall correctly during the summit you mentioned that it was possible to disable the journal. Is it still part of the plan? Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 25 Nov 2013, at 10:00, Sebastien Han sebastien@enovance.com wrote: Nice job Haomai! Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 25 Nov 2013, at 02:50, Haomai Wang haomaiw...@gmail.com wrote: On Mon, Nov 25, 2013 at 2:17 AM, Mark Nelson mark.nel...@inktank.com wrote: Great Work! This is very exciting! Did you happen to try RADOS bench at different object sizes and concurrency levels? Maybe can try it later. :-) Mark On 11/24/2013 03:01 AM, Haomai Wang wrote: Hi all, For Emperor blueprint(http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Add_LevelDB_support_to_ceph_cluster_backend_store), I'm sorry to delay the progress. Now, I have done the most of the works for the blueprint's goal. Because of sage's F blueprint(http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend), I need to adjust some codes to match it. The branch is here(https://github.com/yuyuyu101/ceph/tree/wip/6173). I have tested the LevelDB backend on three nodes(eight OSDs) and compare it to FileStore(ext4). I just use intern benchmark tool rados bench to get the comparison. The default ceph configurations is used and replication size is 2. The filesystem is ext4 and no others changed. The results is below: *Rados Bench* *Bandwidth(MB/sec)* *Average Latency* *Max Latency* *Min Latency* *Stddev Latency* *Stddev Bandwidth(MB/sec)* *Max Bandwidth(MB/sec)* *Min Bandwidth(MB/sec)* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *Write 30* 24.590 23.495 4.87257 5.07716 14.752 13.0885 0.580851 0.605118 2.97708 3.30538 9.91938 10.5986 44 76 0 0 *Write 20* 23.515 23.064 3.39745 3.45711 11.6089 11.5996 0.169507 0.138595 2.58285 2.75962 9.14467 8.54156 44 40 0 0 *Write 10* 22.927 21.980 1.73815 1.8198 5.53792 6.46675 0.171028 0.143392 1.05982 1.20303 9.18403 8.74401 44 40 0 0 *Write 5* 19.680 20.017 1.01492 0.997019 3.10783 3.05008 0.143758 0.138161 0.561548 0.571459 5.92575 6.844 36 32 0 0 *Read 30* 65.852 60.688 1.80069 1.96009 9.30039 10.1146 0.115153 0.061657 *Read 20* 59.372 60.738 1.30479 1.28383 6.28435 8.21304 0.016843 0.012073 *Read 10* 65.502 55.814 0.608805 0.7087 3.3917 4.72626 0.016267 0.011998 *Read 5* 64.176 54.928 0.307111 0.364077 1.76391 1.90182 0.017174 0.011999 Charts can be view here(http://img42.com/ziwjP+) and (http://img42.com/LKhoo+) From above, I'm feeling relieved that the LevelDB backend isn't useless. Most of metrics are better and if increasing cache size for LevelDB the results may be more attractive. Even more, LevelDB backend is used by KeyValueStore and much of optimizations can be done to improve performance such as increase parallel threads or optimize io path. Next, I use rbd bench-write to test. 
The result is pity: *RBD Bench-Write* *OPS/sec* *Bytes/sec* *KVStore* *FileStore* *KVStore* *FileStore* *Seq 4096 5* 27.42 716.55 111861.51 2492149.21 *Rand 4096 5* 28.27 504 112331.42 1683151.29 Just because kv backend doesn't support read/write operation with offset/length argument, each read/write operation will call
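On Mark's question about different object sizes and concurrency levels: rados bench takes both directly, so the same KVStore/FileStore comparison can be repeated across a small matrix. A sketch, the pool name being a placeholder:

$ rados -p testpool bench 60 write -b 4096 -t 16 --no-cleanup
$ rados -p testpool bench 60 write -b 4194304 -t 64 --no-cleanup
$ rados -p testpool bench 60 seq -t 16

where -b is the object size in bytes, -t the number of concurrent operations, and the seq pass reads back what --no-cleanup left behind.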
Re: [ceph-users] how to Testing cinder and glance with CEPH
Hi, Well after restarting the services run: $ cinder create 1 Then you can check both status in Cinder and Ceph: For Cinder run: $ cinder list For Ceph run: $ rbd -p cinder-pool ls If the image is there, you’re good. Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 27 Nov 2013, at 00:04, Karan Singh ksi...@csc.fi wrote: Hello Cephers I was following http://ceph.com/docs/master/rbd/rbd-openstack/ for ceph and openstack Integration , using this document ih ave done all the changes required for this integration. I am not sure how should i test my configuration , how should i make sure integration is successful. Can you suggest some test that i can perform to check my ceph and openstack integration . FYI , in the document http://ceph.com/docs/master/rbd/rbd-openstack/ , Nothing is mentioned after Restart Openstack Services heading , but there should be steps to test this inttegration , please suggest me here , i am new to openstack great if you can give me some commanes used for testing. Karan Singh CSC - IT Center for Science Ltd. P.O. Box 405, FI-02101 Espoo, FINLAND http://www.csc.fi/ | +358 (0) 503 812758 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com signature.asc Description: Message signed with OpenPGP using GPGMail ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
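The same kind of smoke test works for Glance; a sketch, assuming the images pool from the rbd-openstack guide and a local image file:

$ glance image-create --name test-image --disk-format raw --container-format bare --file ./test.img
$ rbd -p images ls

If the upload shows up in the images pool, the Glance side of the integration is wired up as well.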
Re: [ceph-users] LevelDB Backend For Ceph OSD Preview
Nice job Haomai! Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 25 Nov 2013, at 02:50, Haomai Wang haomaiw...@gmail.com wrote: On Mon, Nov 25, 2013 at 2:17 AM, Mark Nelson mark.nel...@inktank.com wrote: Great Work! This is very exciting! Did you happen to try RADOS bench at different object sizes and concurrency levels? Maybe can try it later. :-) Mark On 11/24/2013 03:01 AM, Haomai Wang wrote: Hi all, For Emperor blueprint(http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Add_LevelDB_support_to_ceph_cluster_backend_store), I'm sorry to delay the progress. Now, I have done the most of the works for the blueprint's goal. Because of sage's F blueprint(http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend), I need to adjust some codes to match it. The branch is here(https://github.com/yuyuyu101/ceph/tree/wip/6173). I have tested the LevelDB backend on three nodes(eight OSDs) and compare it to FileStore(ext4). I just use intern benchmark tool rados bench to get the comparison. The default ceph configurations is used and replication size is 2. The filesystem is ext4 and no others changed. The results is below: *Rados Bench* *Bandwidth(MB/sec)* *Average Latency* *Max Latency* *Min Latency* *Stddev Latency* *Stddev Bandwidth(MB/sec)* *Max Bandwidth(MB/sec)* *Min Bandwidth(MB/sec)* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *KVStore* *FileStore* *Write 30* 24.590 23.495 4.87257 5.07716 14.752 13.0885 0.580851 0.605118 2.97708 3.30538 9.91938 10.5986 44 76 0 0 *Write 20* 23.515 23.064 3.39745 3.45711 11.6089 11.5996 0.169507 0.138595 2.58285 2.75962 9.14467 8.54156 44 40 0 0 *Write 10* 22.927 21.980 1.73815 1.8198 5.53792 6.46675 0.171028 0.143392 1.05982 1.20303 9.18403 8.74401 44 40 0 0 *Write 5* 19.680 20.017 1.01492 0.997019 3.10783 3.05008 0.143758 0.138161 0.561548 0.571459 5.92575 6.844 36 32 0 0 *Read 30* 65.852 60.688 1.80069 1.96009 9.30039 10.1146 0.115153 0.061657 *Read 20* 59.372 60.738 1.30479 1.28383 6.28435 8.21304 0.016843 0.012073 *Read 10* 65.502 55.814 0.608805 0.7087 3.3917 4.72626 0.016267 0.011998 *Read 5* 64.176 54.928 0.307111 0.364077 1.76391 1.90182 0.017174 0.011999 Charts can be view here(http://img42.com/ziwjP+) and (http://img42.com/LKhoo+) From above, I'm feeling relieved that the LevelDB backend isn't useless. Most of
Re: [ceph-users] alternative approaches to CEPH-FS
Hi, 1) nfs over rbd (http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/) This has been in production for more than a year now and heavily tested before. Performance was not expected since frontend server mainly do read (90%). Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 14 Nov 2013, at 17:08, Gautam Saxena gsax...@i-a-inc.com wrote: 1) nfs over rbd (http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/) ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
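The gist of that setup, for anyone who doesn't want to read the full post: map an RBD image on the NFS server, put a filesystem on it and export it. A minimal sketch, with image size, names and the rbd device path as placeholders:

$ rbd create nfs-data --size 102400
$ sudo rbd map nfs-data
$ sudo mkfs.xfs /dev/rbd0
$ sudo mkdir -p /srv/nfs-data
$ sudo mount /dev/rbd0 /srv/nfs-data
$ echo "/srv/nfs-data 10.0.0.0/24(rw,no_subtree_check)" | sudo tee -a /etc/exports
$ sudo exportfs -ra

The linked post covers the production details that actually take the time (making the NFS head highly available, for instance).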
Re: [ceph-users] alternative approaches to CEPH-FS
Hi, Well, basically, the frontend is composed of web servers. They mostly do reads on the NFS mount. I believe that the biggest frontend has around 60 virtual machines, accessing the share and serving it. Unfortunately, I don’t have any figures anymore but performances were really poor in general. However they were fair enough for us since the workload was going to be “mixed read”. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 25 Nov 2013, at 13:50, Gautam Saxena gsax...@i-a-inc.com wrote: Hi Sebastien. Thanks! WHen you say performance was not expected, can you elaborate a little? Specifically, what did you notice in terms of performance? On Mon, Nov 25, 2013 at 4:39 AM, Sebastien Han sebastien@enovance.com wrote: Hi, 1) nfs over rbd (http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/) This has been in production for more than a year now and heavily tested before. Performance was not expected since frontend server mainly do read (90%). Cheers. Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 14 Nov 2013, at 17:08, Gautam Saxena gsax...@i-a-inc.com wrote: 1) nfs over rbd (http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/) ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Intel 520/530 SSD for ceph
I used a blocksize of 350k as my graphes shows me that this is the average workload we have on the journal. Pretty interesting metric Stefan. Has anyone seen the same behaviour? Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 22 Nov 2013, at 02:37, Mark Nelson mark.nel...@inktank.com wrote: On 11/21/2013 02:36 AM, Stefan Priebe - Profihost AG wrote: Hi, Am 21.11.2013 01:29, schrieb m...@linuxbox.com: On Tue, Nov 19, 2013 at 09:02:41AM +0100, Stefan Priebe wrote: ... You might be able to vary this behavior by experimenting with sdparm, smartctl or other tools, or possibly with different microcode in the drive. Which values or which settings do you think of? ... Off-hand, I don't know. Probably the first thing would be to compare the configuration of your 520 530; anything that's different is certainly worth investigating. This should display all pages, sdparm --all --long /dev/sdX the 520 only appears to have 3 pages, which can be fetched directly w/ sdparm --page=ca --long /dev/sdX sdparm --page=co --long /dev/sdX sdparm --page=rw --long /dev/sdX The sample machine I'm looking has an intel 520, and on ours, most options show as 0 except for AWRE1 [cha: n, def: 1] Automatic write reallocation enabled WCE 1 [cha: y, def: 1] Write cache enable DRA 1 [cha: n, def: 1] Disable read ahead GLTSD 1 [cha: n, def: 1] Global logging target save disable BTP-1 [cha: n, def: -1] Busy timeout period (100us) ESTCT 30 [cha: n, def: 30] Extended self test completion time (sec) Perhaps that's an interesting data point to compare with yours. Figuring out if you have up-to-date intel firmware appears to require burning and running an iso image from https://downloadcenter.intel.com/Detail_Desc.aspx?agr=YDwnldID=18455 The results of sdparm --page=whatever --long /dev/sdc show the intel firmware, but this labels it better: smartctl -i /dev/sdc Our 520 has firmware 400i loaded. Firmware is up2date and all values are the same. I expect that the 520 firmware just ignores CMD_FLUSH commands and the 530 does not. For those of you that don't follow LKML, there is some interesting discussion going on regarding this same issue (Hi Stefan!) https://lkml.org/lkml/2013/11/20/158 Can anyone think of a reasonable (ie not yanking power out) way to test what CMD_FLUSH is actually doing? I have some 520s in our test rig I can play with. Otherwise, maybe an Intel engineer can chime in and let us know what's going on? Greets, Stefan ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
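If anyone wants to reproduce Stefan's numbers, the journal-like pattern is easy to approximate with fio: sequential writes at that block size with synchronous completion, directly against the device. A sketch, destructive to whatever is on /dev/sdX:

$ sudo fio --name=journal-sim --filename=/dev/sdX --rw=write --bs=350k --direct=1 --sync=1 --runtime=60 --time_based

Comparing the result with and without --sync=1 gives a decent hint of how much the drive's flush/sync handling costs, which is exactly where the 520 and 530 seem to diverge.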
Re: [ceph-users] presentation videos from Ceph Day London?
Nothing has been recorded as far as I know. However I’ve seen some guys from Scality recording sessions with a cam. Scality? Are you there? :) Sébastien Han Cloud Engineer Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 10, rue de la Victoire - 75009 Paris Web : www.enovance.com - Twitter : @enovance On 30 Oct 2013, at 10:24, Blair Bethwaite blair.bethwa...@gmail.com wrote: I've been perusing the content on slideshare and see some really interesting and creatively composed presentations! Was there any recording done (and plans to make it generally available)? -- Cheers, ~Blairo ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Openstack on ceph rbd installation failure
Can you send your ceph.conf too?Is /etc/ceph/ceph.conf present? Is the key of user volume present too?Sébastien HanCloud Engineer"Always give 100%. Unless you're giving blood."Phone :+33 (0)1 49 70 99 72–Mobile :+33 (0)6 52 84 44 70Email :sebastien@enovance.com–Skype :han.sbastienAddress :10, rue de la Victoire – 75009 ParisWeb :www.enovance.com–Twitter :@enovance On Jul 23, 2013, at 5:39 AM, johnu johnugeorge...@gmail.com wrote:Hi, I have a three node ceph cluster. ceph -w says health ok . I have openstack in the same cluster and trying to map cinder and glance onto rbd.I have followed steps as given inhttp://ceph.com/docs/next/rbd/rbd-openstack/New Settings that is added in cinder.conf for three filesvolume_driver=cinder.volume.drivers.rbd.RBDDriverrbd_pool=volumesglance_api_version=2rbd_user=volumesrbd_secret_uuid=62d0b384-50ad-2e17-15ed-66bfeda40252 ( different for each node)LOGS seen when I run ./rejoin.sh2013-07-22 20:35:01.900 INFO cinder.service [-] Starting 1 workers2013-07-22 20:35:01.909 INFO cinder.service [-] Started child 22902013-07-22 20:35:01.965 AUDIT cinder.service [-] Starting cinder-volume node (version 2013.2)2013-07-22 20:35:02.129 ERROR cinder.volume.drivers.rbd [req-d3bc2e86-e9db-40e8-bcdb-08c609ce44c3 None None] error connecting to ceph cluster2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd Traceback (most recent call last):2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 243, in check_for_setup_error2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd with RADOSClient(self):2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 215, in __init__2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd self.cluster, self.ioctx = driver._connect_to_rados(pool)2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 263, in _connect_to_rados2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd client.connect()2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd File "/usr/lib/python2.7/dist-packages/rados.py", line 192, in connect2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd raise make_ex(ret, "error calling connect")2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd ObjectNotFound: error calling connect2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd2013-07-22 20:35:02.149 ERROR cinder.service [req-d3bc2e86-e9db-40e8-bcdb-08c609ce44c3 None None] Unhandled exception2013-07-22 20:35:02.149 TRACE cinder.service Traceback (most recent call last):2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/service.py", line 228, in _start_child2013-07-22 20:35:02.149 TRACE cinder.service self._child_process(wrap.server)2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/service.py", line 205, in _child_process2013-07-22 20:35:02.149 TRACE cinder.service launcher.run_server(server)2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/service.py", line 96, in run_server2013-07-22 20:35:02.149 TRACE cinder.service server.start()2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/service.py", line 359, in start2013-07-22 20:35:02.149 TRACE cinder.service self.manager.init_host()2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/volume/manager.py", line 139, in init_host2013-07-22 20:35:02.149 TRACE cinder.service 
self.driver.check_for_setup_error()2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 248, in check_for_setup_error2013-07-22 20:35:02.149 TRACE cinder.service raise exception.VolumeBackendAPIException(data="">2013-07-22 20:35:02.149 TRACE cinder.service VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: error connecting to ceph cluster2013-07-22 20:35:02.149 TRACE cinder.service2013-07-22 20:35:02.191 INFO cinder.service [-] Child 2290 exited with status 22013-07-22 20:35:02.192 INFO cinder.service [-] _wait_child 12013-07-22 20:35:02.193 INFO cinder.service [-] wait wrap.failed TrueCan someone help me with some debug points and solve it ?___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
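For this class of "error calling connect / ObjectNotFound" failures, the usual checklist on the cinder-volume host is that the conf and keyring exist and that the cinder user can actually reach the cluster with them. A sketch, assuming the pool/user names from the rbd-openstack guide:

$ ls -l /etc/ceph/ceph.conf /etc/ceph/ceph.client.volumes.keyring
$ sudo ceph auth get client.volumes
$ rados -p volumes ls --id volumes

If the last command fails the same way, the problem is in the Ceph client config rather than in Cinder itself.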
Re: [ceph-users] RBD Mapping
Hi Greg, Just tried the list watchers on an RBD with the QEMU driver and I got: root@ceph:~# rados -p volumes listwatchers rbd_header.789c2ae8944a watcher=client.30882 cookie=1 I also tried with the kernel module but didn't see anything… No IP addresses anywhere… :/, any idea? Nice tip btw :) Sébastien Han Cloud Engineer "Always give 100%. Unless you're giving blood." Phone : +33 (0)1 49 70 99 72 – Mobile : +33 (0)6 52 84 44 70 Email : sebastien@enovance.com – Skype : han.sbastien Address : 10, rue de la Victoire – 75009 Paris Web : www.enovance.com – Twitter : @enovance On Jul 23, 2013, at 11:01 PM, Gregory Farnum g...@inktank.com wrote: On Tue, Jul 23, 2013 at 1:28 PM, Wido den Hollander w...@42on.com wrote: On 07/23/2013 09:09 PM, Gaylord Holder wrote: Is it possible to find out which machines are mapping an RBD? No, that is stateless. You can use locking however; you can for example put the hostname of the machine in the lock. But that's not mandatory in the protocol. Maybe you are able to list watchers for an RBD drive, but I'm not sure about that. You can. "rados listwatchers object" will tell you who's got watches registered, and that output should include IPs. You'll want to run it against the rbd head object. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com -Gaylord -- Wido den Hollander 42on B.V. Phone: +31 (0)20 700 9902 Skype: contact42on ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] RBD Mapping
Arf no worries. Even after a quick dive into the logs, I haven't found anything (default log level). Sébastien Han Cloud Engineer "Always give 100%. Unless you're giving blood." Phone : +33 (0)1 49 70 99 72 – Mobile : +33 (0)6 52 84 44 70 Email : sebastien@enovance.com – Skype : han.sbastien Address : 10, rue de la Victoire – 75009 Paris Web : www.enovance.com – Twitter : @enovance On Jul 24, 2013, at 12:08 AM, Gregory Farnum g...@inktank.com wrote: On Tue, Jul 23, 2013 at 2:55 PM, Sebastien Han sebastien@enovance.com wrote: Hi Greg, Just tried the list watchers on an RBD with the QEMU driver and I got: root@ceph:~# rados -p volumes listwatchers rbd_header.789c2ae8944a watcher=client.30882 cookie=1 I also tried with the kernel module but didn't see anything… No IP addresses anywhere… :/, any idea? Nice tip btw :) Oh, whoops. Looks like the first iteration didn't include IP addresses; they show up in version 0.65 or later. Sorry for the inconvenience. I think there might be a way to convert client IDs into addresses but I can't quite think of any convenient ones (as opposed to inconvenient ones like digging them up out of logs); maybe somebody else has an idea... -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] RADOS Bench strange behavior
Hi all, While running some benchmarks with the internal rados benchmarker I noticed something really strange. First of all, this is the line I used to run it: $ sudo rados -p 07:59:54_performance bench 300 write -b 4194304 -t 1 --no-cleanup So I want to test IO with a concurrency of 1. I had a look at the code and also straced the process, and I noticed that the IOs are sent one by one, sequentially. Thus it does what I expect from it. However, while monitoring the disk usage on all my OSDs, I found out that they were all loaded (writing, both journals and filestore), which is kind of weird since all the IOs are sent one by one. I was expecting that only one OSD at a time would be writing. Obviously there is no replication going on since I changed the rep size to 1. $ ceph osd dump | grep "07:59:54_performance" pool 323 '07:59:54_performance' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 1306 owner 0 Thanks in advance guys. Sébastien Han Cloud Engineer "Always give 100%. Unless you're giving blood." Phone : +33 (0)1 49 70 99 72 – Mobile : +33 (0)6 52 84 44 70 Email : sebastien@enovance.com – Skype : han.sbastien Address : 10, rue de la Victoire – 75009 Paris Web : www.enovance.com – Twitter : @enovance ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
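The likely explanation: rados bench writes a series of distinct 4MB objects, and each object name hashes to a different PG, so even with -t 1 successive objects land on different OSDs spread across the whole cluster. Only one OSD is busy at any instant, but averaged over the run every disk shows activity. Mapping an individual benchmark object makes this visible; a sketch, the object name being whatever rados ls shows for the bench objects:

$ rados -p 07:59:54_performance ls | head
$ ceph osd map 07:59:54_performance <one-of-the-bench-objects>

which prints the PG and the OSD(s) that particular object was written to.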
Re: [ceph-users] Problem with multiple hosts RBD + Cinder
De rien, cool :)Yes start from the libvirt section.Cheers!Sébastien HanCloud Engineer"Always give 100%. Unless you're giving blood."Phone :+33 (0)1 49 70 99 72–Mobile :+33 (0)6 52 84 44 70Email :sebastien@enovance.com–Skype :han.sbastienAddress :10, rue de la Victoire – 75009 ParisWeb :www.enovance.com–Twitter :@enovance On Jun 21, 2013, at 11:25 AM, Igor Laskovy igor.lask...@gmail.com wrote:Merci Sebastien, it's work now ;)Now for live migration do I need followhttps://wiki.openstack.org/wiki/LiveMigrationUsagebegining from libvirt settings section?On Thu, Jun 20, 2013 at 2:47 PM, Sebastien Hansebastien@enovance.comwrote:Hi,No this must always be the same UUID. You can only specify one in cinder.conf.Btw nova does the attachment this is why it needs the uuid and secret.The first secret import generates an UUID, then always re-use the same one for all your compute node, do something like:secret ephemeral='no' private='no' uuid9e4c7795-0681-cd4f-cf36-8cb8aef3c47f/uuid usage type='ceph' nameclient.volumes secret/name /usage/secretCheers.Sébastien HanCloud Engineer"Always give 100%. Unless you're giving blood."image.pngPhone :+33 (0)1 49 70 99 72–Mobile :+33 (0)6 52 84 44 70Email :sebastien@enovance.com–Skype :han.sbastienAddress :10, rue de la Victoire – 75009 ParisWeb :www.enovance.com–Twitter :@enovanceOn Jun 20, 2013, at 12:23 PM, Igor Laskovy igor.lask...@gmail.com wrote:Hello list!I am trying deploy Ceph RBD + OpenStack Cinder.Basically, my question related to this section in documentation:cat secret.xml EOFsecret ephemeral='no' private='no' usage type='ceph' nameclient.volumes secret/name /usage/secretEOFsudo virsh secret-define --file secret.xmluuid of secret is output heresudo virsh secret-set-value --secret {uuid of secret} --base64 $(cat client.volumes.key) rm client.volumes.key secret.xmlDo I need tie libvirt secrets logic with ceph client.volumes user on each cinder-volume hosts? 
So it will be separate "uuid of secret" for each host but they all will use single user cinder.volumes, right?Asking this because I have strange error in nova-scheduler.log on controller host:2013-06-20 13:10:01.270 ERROR nova.scheduler.filter_scheduler [req-b173d765-9528-43af-a3d1-bd811df8710d fd860a2737f94ff0bc7decec5783017b 3f47be9a0c2348faac4deec2a988acd8] [instance: d8dd40d4-61de-498d-a54f-12f4d9e9c594] Errorfrom last host: node03 (node node03.ceph.labspace.studiogrizzly.com): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 848, in _run_instance\n set_access_ip=set_access_ip)\n', u' File"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1107, in _spawn\n LOG.exception(_(\'Instance failed to spawn\'), instance=instance)\n', u' File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__\n self.gen.next()\n', u' File"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1103, in _spawn\n block_device_info)\n', u' File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 1527, in spawn\n block_device_info)\n', u' File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2443, in _create_domain_and_network\n domain = self._create_domain(xml, instance=instance)\n', u' File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2404, in _create_domain\n domain.createWithFlags(launch_flags)\n', u' File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 187, in doit\n result = proxy_call(self._autowrap, f, *args, **kwargs)\n', u' File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 147, inproxy_call\n rv = execute(f,*args,**kwargs)\n', u' File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 76, in tworker\n rv = meth(*args,**kwargs)\n', u' File "/usr/lib/python2.7/dist-packages/libvirt.py", line 711, in createWithFlags\n if ret == -1: raiselibvirtError (\'virDomainCreateWithFlags() failed\', dom=self)\n', u"libvirtError: internal error rbd username 'volumes' specified but secret not found\n"]--Igor Laskovyfacebook.com/igor.laskovystudiogrizzly.com___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com--Igor Laskovyfacebook.com/igor.laskovystudiogrizzly.com___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
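The XML in the quoted doc snippet lost its angle brackets in the archive; for reference, the secret definition from the guide roughly reads as follows, with the UUID and key handling unchanged:

cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
  <usage type='ceph'>
    <name>client.volumes secret</name>
  </usage>
</secret>
EOF
sudo virsh secret-define --file secret.xml
sudo virsh secret-set-value --secret {uuid of secret} --base64 $(cat client.volumes.key)

And, as said above, the same UUID must then be reused in cinder.conf (rbd_secret_uuid) and imported on every compute node.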
Re: [ceph-users] Problem with multiple hosts RBD + Cinder
Hi,No this must always be the same UUID. You can only specify one in cinder.conf.Btw nova does the attachment this is why it needs the uuid and secret.The first secret import generates an UUID, then always re-use the same one for all your compute node, do something like:secret ephemeral='no' private='no' uuid9e4c7795-0681-cd4f-cf36-8cb8aef3c47f/uuid usage type='ceph' nameclient.volumes secret/name /usage/secretCheers.Sébastien HanCloud Engineer"Always give 100%. Unless you're giving blood."Phone :+33 (0)1 49 70 99 72–Mobile :+33 (0)6 52 84 44 70Email :sebastien@enovance.com–Skype :han.sbastienAddress :10, rue de la Victoire – 75009 ParisWeb :www.enovance.com–Twitter :@enovance On Jun 20, 2013, at 12:23 PM, Igor Laskovy igor.lask...@gmail.com wrote:Hello list!I am trying deploy Ceph RBD + OpenStack Cinder.Basically, my question related to this section in documentation:cat secret.xml EOFsecret ephemeral='no' private='no' usage type='ceph' nameclient.volumes secret/name /usage/secretEOFsudo virsh secret-define --file secret.xmluuid of secret is output heresudo virsh secret-set-value --secret {uuid of secret} --base64 $(cat client.volumes.key) rm client.volumes.key secret.xmlDo I need tie libvirt secrets logic with ceph client.volumes user on each cinder-volume hosts? So it will be separate "uuid of secret" for each host but they all will use single user cinder.volumes, right?Asking this because I have strange error in nova-scheduler.log on controller host:2013-06-20 13:10:01.270 ERROR nova.scheduler.filter_scheduler [req-b173d765-9528-43af-a3d1-bd811df8710d fd860a2737f94ff0bc7decec5783017b 3f47be9a0c2348faac4deec2a988acd8] [instance: d8dd40d4-61de-498d-a54f-12f4d9e9c594] Error fromlast host: node03 (nodenode03.ceph.labspace.studiogrizzly.com): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 848, in _run_instance\n set_access_ip=set_access_ip)\n', u' File"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1107, in _spawn\n LOG.exception(_(\'Instance failed to spawn\'), instance=instance)\n', u' File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__\n self.gen.next()\n', u' File"/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1103, in _spawn\n block_device_info)\n', u' File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 1527, in spawn\n block_device_info)\n', u' File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2443, in _create_domain_and_network\n domain = self._create_domain(xml, instance=instance)\n', u' File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2404, in _create_domain\n domain.createWithFlags(launch_flags)\n', u' File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 187, in doit\n result = proxy_call(self._autowrap, f, *args, **kwargs)\n', u' File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 147, in proxy_call\n rv = execute(f,*args,**kwargs)\n', u' File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 76, in tworker\n rv = meth(*args,**kwargs)\n', u' File "/usr/lib/python2.7/dist-packages/libvirt.py", line 711, in createWithFlags\n if ret == -1: raise libvirtError(\'virDomainCreateWithFlags() failed\', dom=self)\n', u"libvirtError: internal error rbd username 'volumes' specified but secret not found\n"]--Igor Laskovyfacebook.com/igor.laskovystudiogrizzly.com___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com___ ceph-users 
mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Live Migrations with cephFS
Thank you, Sebastien Han. I am sure many are thankful you've published your thoughts and experiences with Ceph and even OpemStack.Thanks Bo! :)If I may, I would like to reword my question/statement with greater clarity: To force all instances to always boot from RBD volumes, would a person would have to make changes to something more than Horizon (demonstration GUI)? If the changes need only be inHorizon, the provider would then likely need to restrict or deny their customers access to their unmodified APIs. If they do not, then the unchanged APIs would allow for behavior the provider does not want.Thoughts? Corrections? Feel free to teach.This is correct. Forcing the boot from volume requires a modified version of the API which kinda tricky and GUI modifications. There are 2 cases:1. you're an ISP (public provider), you should forget about the idea unless you want to provide a _really_ close service.2.you're the only one managing your platform (private cloud) this might be doable but even so you'll encounter a lot of problems while upgrading. At the end it's up to you, if you're 100% sure that you have the complete control of your infra and that you know when, who and how new instances are booted (and occasionally don't care about update and compatibility).You can always hack the dashboard but it's more than that you have to automate the action that each time someone is booting a VM you have to create a volume from an image for this. This will prolong the process. At this point, I'll recommend you to push this blueprint, it'll run all the VM through ceph even the one not using the boot-from-volume option.https://blueprints.launchpad.net/nova/+spec/bring-rbd-support-libvirt-images-typeAn article is coming next week and will cover the entire subject.Cheers!Sébastien HanCloud Engineer"Always give 100%. Unless you're giving blood."Phone :+33 (0)1 49 70 99 72–Mobile :+33 (0)6 52 84 44 70Email :sebastien@enovance.com–Skype :han.sbastienAddress :10, rue de la Victoire – 75009 ParisWeb :www.enovance.com–Twitter :@enovance On Jun 17, 2013, at 8:00 AM, Wolfgang Hennerbichler wolfgang.hennerbich...@risc-software.at wrote:On 06/14/2013 08:00 PM, Ilja Maslov wrote:Hi,Is live migration supported with RBD and KVM/OpenStack?Always wanted to know but was afraid to ask :)totally works in my productive setup. but we don't use openstack in thisinstallation, just KVM/RBD.Pardon brevity and formatting, replying from the phone.Cheers,IljaRobert Sander r.san...@heinlein-support.de wrote:On 14.06.2013 12:55, Alvaro Izquierdo Jimeno wrote:By default, openstack uses NFS but… other options are available….can weuse cephFS instead of NFS?Wouldn't you use qemu-rbd for your virtual guests in OpenStack?AFAIK CephFS is not needed for KVM/qemu virtual machines.Regards--Robert SanderHeinlein Support GmbHSchwedter Str. 8/9b, 10119 Berlinhttp://www.heinlein-support.deTel: 030 / 405051-43Fax: 030 / 405051-19Zwangsangaben lt. §35a GmbHG:HRB 93818 B / Amtsgericht Berlin-Charlottenburg,Geschäftsführer: Peer Heinlein -- Sitz: Berlin___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.comThis email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. 
If you are not the intended recipient, please note that any review, dissemination, disclosure, alteration, printing,circulation, retention or transmission of this e-mail and/or any file or attachment transmitted with it, is prohibited and may be unlawful. If you have received this e-mail or any file or attachment transmitted with it in error please notify postmas...@openet.com.Although Openet has taken reasonable precautions to ensure no viruses are present in this email, we cannot accept responsibility for any loss or damage arising from the use of this email or attachments.___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com--DI (FH) Wolfgang HennerbichlerSoftware DevelopmentUnit Advanced Computing TechnologiesRISC Software GmbHA company of the Johannes Kepler University LinzIT-CenterSoftwarepark 354232 HagenbergAustriaPhone: +43 7236 3343 245Fax: +43 7236 3343 250wolfgang.hennerbich...@risc-software.athttp://www.risc-software.at___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
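For reference, a sketch of the manual flow that the dashboard/API hack would automate, using the Grizzly-era CLIs with placeholder IDs (syntax from memory, so double-check against your client versions):

$ cinder create --image-id <glance-image-uuid> --display-name boot-vol 20
$ nova boot --flavor m1.small --block-device-mapping vda=<volume-uuid>:::0 my-instance

That is, first clone the image into an RBD-backed volume, then boot the instance from that volume; it is exactly this extra create step that the bring-rbd-support-libvirt-images-type blueprint would make unnecessary.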
Re: [ceph-users] Live Migrations with cephFS
In OpenStack, a VM booted from a volume (where the disk is located on RBD) supports the live-migration without any problems.Sébastien HanCloud Engineer"Always give 100%. Unless you're giving blood."Phone :+33 (0)1 49 70 99 72–Mobile :+33 (0)6 52 84 44 70Email :sebastien@enovance.com–Skype :han.sbastienAddress :10, rue de la Victoire – 75009 ParisWeb :www.enovance.com–Twitter :@enovance On Jun 14, 2013, at 11:36 PM, Bo b...@samware.com wrote:If I am not mistaken, one would need to modify OpenStack source to force Nova to boot from RBD volumes. Is this no longer the case?Modifying OpenStack's source is a wonderful idea especially if you push your changes upstream for review. However, it does add to your work when you want to pull updated code from upstream into your deployment.-boOn 14.06.2013 12:55, Alvaro Izquierdo Jimeno wrote: By default, openstack uses NFS but… other options are available….can we use cephFS instead of NFS?Wouldn't you use qemu-rbd for your virtual guests in OpenStack?AFAIK CephFS is not needed for KVM/qemu virtual machines.Regards--Robert SanderHeinlein Support GmbHSchwedter Str. 8/9b, 10119 Berlinhttp://www.heinlein-support.deTel: 030 / 405051-43Fax: 030 / 405051-19Zwangsangaben lt. §35a GmbHG:HRB 93818 B / Amtsgericht Berlin-Charlottenburg,Geschäftsführer: Peer Heinlein -- Sitz: Berlin___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com--"But God demonstrates His own love toward us, in that while we were yet sinners, Christ died for us. Much more then, having now been justified by His blood, we shall be saved from the wrath of God through Him." Romans 5:8-9Allhave sinned, broken God's law, and deserve eternal torment. Jesus Christ, the Son of God, died for the sins of those that will believe, purchasing our salvation, and defeated death so that we all may spend eternity in heaven. Do you desire freedom from helland be with God in His love for eternity?"If you confess with your mouth Jesus as Lord, and believe in your heart that God raised Him from the dead, you will be saved." Romans 10:9___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] QEMU -drive setting (if=none) for rbd
OpenStack doesn't know how to set different caching options for attached block device.See the following blueprint,https://blueprints.launchpad.net/nova/+spec/enable-rbd-tuning-optionsThis might be implemented for Havana.Cheers.Sébastien HanCloud Engineer"Always give 100%. Unless you're giving blood."Phone :+33 (0)1 49 70 99 72–Mobile :+33 (0)6 52 84 44 70Email :sebastien@enovance.com–Skype :han.sbastienAddress :10, rue de la Victoire – 75009 ParisWeb :www.enovance.com–Twitter :@enovance On Jun 11, 2013, at 7:43 PM, Oliver Francke oliver.fran...@filoo.de wrote:Hi,Am 11.06.2013 um 19:14 schrieb w sun ws...@hotmail.com:Hi,We are currently testing the performance with rbd caching enabled with write-back mode on our openstack (grizzly) nova nodes. By default, nova fires up the rbd volumes with "if=none" mode evidenced by the following cmd line from "ps | grep".-drive file=rbd:ceph-openstack-volumes/volume-949e2e32-20c7-45cf-b41b-46951c78708b:id=ceph-openstack-volumes:key=12347I9RsEoIDBAAi2t+M6+7zMMZoMM+aasiog==:auth_supported=cephx\;none,if=none,id=drive-virtio-disk0,format=raw,serial=949e2e32-20c7-45cf-b41b-46951c78708b,cache=writebackDoes anyone know if this should be set to anything else (e.g., if=virtio suggested by some qemu posts in general)? Given that the underline network stack for RBD IO is provided by the linux kenerl instead, does this option bear any relevance for rbd volumeperformance inside guest VM?there should be something like "-device virtio-blk-pci,drive=drive-virtio-disk0" in reference to the id= for the drive-specification.Furthermore to really check rbd_cache there is s/t like:rbd_cache=true:rbd_cache_size=33554432:rbd_cache_max_dirty=16777216:rbd_cache_target_dirty=8388608missing in the ":"-list, perhaps after :none:rbd_cache=true:rbd_cache_size=33554432:rbd_cache_max_dirty=16777216:rbd_cache_target_dirty=8388608cache=writeback is necessary, too.No idea, though, how to teach openstack to use these parameters, sorry.Regards,Oliver.Thanks. --weiguo___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com___ceph-users mailing listceph-users@lists.ceph.comhttp://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
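Until that blueprint lands, the cache tuning itself doesn't have to live in the -drive string: librbd also reads it from ceph.conf on the compute node, so a [client] section there plus cache=writeback on the QEMU side should give the same behaviour. A sketch, with values matching the ones Oliver quotes:

[client]
    rbd cache = true
    rbd cache size = 33554432
    rbd cache max dirty = 16777216
    rbd cache target dirty = 8388608

The if=none part of the nova-generated command line is fine by itself, by the way; it only means the drive is defined separately from the device, and the -device virtio-blk-pci,drive=drive-virtio-disk0 entry that references it is what makes it a virtio disk.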
Re: [ceph-users] Live Migration: KVM-Libvirt Shared-storage
I did, what would you like to know? Sébastien Han Cloud Engineer "Always give 100%. Unless you're giving blood." Phone : +33 (0)1 49 70 99 72 – Mobile : +33 (0)6 52 84 44 70 Email : sebastien@enovance.com – Skype : han.sbastien Address : 10, rue de la Victoire – 75009 Paris Web : www.enovance.com – Twitter : @enovance On May 30, 2013, at 1:49 AM, Amit Vijairania amit.vijaira...@gmail.com wrote: We are currently testing Ceph with the OpenStack Grizzly release and looking for some insight on Live Migration [1]. Based on the documentation, there are two options for the shared storage used for Nova instances (/var/lib/nova/instances): NFS and the OpenStack Gluster Connector. Do you know if anyone is using or has tested CephFS for the Nova instances directory (console.log, libvirt.xml, ...)? [1] http://docs.openstack.org/trunk/openstack-compute/admin/content/configuring-migrations.html Thanks! Amit ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] qemu-1.4.2 rbd-fixed ubuntu packages
Arf, sorry Wolfgang, I scratched your name in my previous email :). Sébastien Han Cloud Engineer "Always give 100%. Unless you're giving blood." Phone : +33 (0)1 49 70 99 72 – Mobile : +33 (0)6 52 84 44 70 Email : sebastien@enovance.com – Skype : han.sbastien Address : 10, rue de la Victoire – 75009 Paris Web : www.enovance.com – Twitter : @enovance On May 29, 2013, at 12:19 AM, Sebastien Han sebastien@enovance.com wrote: Wolfgang, I'm interested, and I assume I'm not the only one, thus can't you just make it public for everyone? Thanks. Sébastien Han Cloud Engineer "Always give 100%. Unless you're giving blood." Phone : +33 (0)1 49 70 99 72 – Mobile : +33 (0)6 52 84 44 70 Email : sebastien@enovance.com – Skype : han.sbastien Address : 10, rue de la Victoire – 75009 Paris Web : www.enovance.com – Twitter : @enovance On May 28, 2013, at 8:10 PM, Alex Bligh a...@alex.org.uk wrote: Wolfgang, On 28 May 2013, at 06:50, Wolfgang Hennerbichler wrote: for anybody who's interested, I've packaged the latest qemu-1.4.2 (not 1.5, it didn't work nicely with libvirt) which includes important fixes to RBD for ubuntu 12.04 AMD64. If you want to save some time, I can share the packages with you. Drop me a line if you're interested. Information as to what the important fixes are would be appreciated! -- Alex Bligh ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com