Re: [ceph-users] Local SSD cache for ceph on each compute node.

2016-03-18 Thread Sebastien Han
I’d rather see this implemented at the hypervisor level, i.e. QEMU, so 
we can have a common layer for all the storage backends, although this is less 
portable...
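For what it’s worth, librbd already exposes an in-memory cache that QEMU picks up 
via ceph.conf; a minimal sketch of those existing knobs (values below are only 
examples), as opposed to the SSD-backed rbd_cache_device idea discussed further 
down, which does not exist yet:

[client]
    rbd cache = true
    rbd cache size = 67108864                  # 64MB per client, example value
    rbd cache writethrough until flush = true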

> On 17 Mar 2016, at 11:00, Nick Fisk  wrote:
> 
> 
> 
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Daniel Niasoff
>> Sent: 16 March 2016 21:02
>> To: Nick Fisk ; 'Van Leeuwen, Robert'
>> ; 'Jason Dillaman' 
>> Cc: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Local SSD cache for ceph on each compute node.
>> 
>> Hi Nick,
>> 
>> Your solution requires manual configuration for each VM and cannot be
>> setup as part of an automated OpenStack deployment.
> 
> Absolutely, potentially flaky as well.
> 
>> 
>> It would be really nice if it was a hypervisor based setting as opposed to
> a VM
>> based setting.
> 
> Yes, I can't wait until we can just specify "rbd_cache_device=/dev/ssd" in
> the ceph.conf and get it to write to that instead. Ideally ceph would also
> provide some sort of lightweight replication for the cache devices, but
> otherwise a iSCSI SSD farm or switched SAS could be used so that the caching
> device is not tied to one physical host.
> 
>> 
>> Thanks
>> 
>> Daniel
>> 
>> -Original Message-
>> From: Nick Fisk [mailto:n...@fisk.me.uk]
>> Sent: 16 March 2016 08:59
>> To: Daniel Niasoff ; 'Van Leeuwen, Robert'
>> ; 'Jason Dillaman' 
>> Cc: ceph-users@lists.ceph.com
>> Subject: RE: [ceph-users] Local SSD cache for ceph on each compute node.
>> 
>> 
>> 
>>> -Original Message-
>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
>>> Of Daniel Niasoff
>>> Sent: 16 March 2016 08:26
>>> To: Van Leeuwen, Robert ; Jason Dillaman
>>> 
>>> Cc: ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] Local SSD cache for ceph on each compute node.
>>> 
>>> Hi Robert,
>>> 
 Caching writes would be bad because a hypervisor failure would result
 in
>>> loss of the cache which pretty much guarantees inconsistent data on
>>> the ceph volume.
 Also live-migration will become problematic compared to running
>>> everything from ceph since you will also need to migrate the
>> local-storage.
>> 
>> I tested a solution using iSCSI for the cache devices. Each VM was using
>> flashcache with a combination of a iSCSI LUN from a SSD and a RBD. This
> gets
>> around the problem of moving things around or if the hypervisor goes down.
>> It's not local caching but the write latency is at least 10x lower than
> the RBD.
>> Note I tested it, I didn't put it into production :-)
>> 
>>> 
>>> My understanding of how a writeback cache should work is that it
>>> should only take a few seconds for writes to be streamed onto the
>>> network and is focussed on resolving the speed issue of small sync
>>> writes. The writes
>> would
>>> be bundled into larger writes that are not time sensitive.
>>> 
>>> So there is potential for a few seconds data loss but compared to the
>> current
>>> trend of using ephemeral storage to solve this issue, it's a major
>>> improvement.
>> 
>> Yeah, problem is a couple of seconds data loss mean different things to
>> different people.
>> 
>>> 
 (considering the time required for setting up and maintaining the
 extra
>>> caching layer on each vm, unless you work for free ;-)
>>> 
>>> Couldn't agree more there.
>>> 
>>> I am just so surprised how the openstack community haven't looked to
>>> resolve this issue. Ephemeral storage is a HUGE compromise unless you
>>> have built in failure into every aspect of your application but many
>>> people use openstack as a general purpose devstack.
>>> 
>>> (Jason pointed out his blueprint but I guess it's at least a year or 2
>> away -
>>> http://tracker.ceph.com/projects/ceph/wiki/Rbd_-_ordered_crash-
>>> consistent_write-back_caching_extension)
>>> 
>>> I see articles discussing the idea such as this one
>>> 
>>> http://www.sebastien-han.fr/blog/2014/06/10/ceph-cache-pool-tiering-
>>> scalable-cache/
>>> 
>>> but no real straightforward  validated setup instructions.
>>> 
>>> Thanks
>>> 
>>> Daniel
>>> 
>>> 
>>> -Original Message-
>>> From: Van Leeuwen, Robert [mailto:rovanleeu...@ebay.com]
>>> Sent: 16 March 2016 08:11
>>> To: Jason Dillaman ; Daniel Niasoff
>>> 
>>> Cc: ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] Local SSD cache for ceph on each compute node.
>>> 
 Indeed, well understood.
 
 As a shorter term workaround, if you have control over the VMs, you
 could
>>> always just slice out an LVM volume from local SSD/NVMe and pass it
>>> through to the guest.  Within the guest, use dm-cache (or similar) to
>>> add
>> a
>>> cache front-end to your RBD volume.
>>> 
>>> If you do this you need to setup 

Re: [ceph-users] Migrate Block Volumes and VMs

2015-12-17 Thread Sebastien Han
What you can do is flatten all the images so you break the relationship between 
the parent image and the child.
Then you can export/import.
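Roughly, per clone, something like this with the rbd CLI (image names are taken 
from the example below; the new-cluster ceph.conf path is a placeholder):

# break the clone's dependency on the Glance parent
rbd flatten vms/8a4465fa-cbae-4559-b519-861eb4eda378_disk

# export from the old cluster, then import against the new one
rbd export vms/8a4465fa-cbae-4559-b519-861eb4eda378_disk /tmp/vm_disk.img
rbd -c /etc/ceph/new-cluster.conf import /tmp/vm_disk.img vms/8a4465fa-cbae-4559-b519-861eb4eda378_disk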

> On 15 Dec 2015, at 12:10, Sam Huracan  wrote:
> 
> Hi everybody,
> 
> My OpenStack System use Ceph as backend for Glance, Cinder, Nova. In the 
> future, we intend build a new Ceph Cluster.
> I can re-connect current OpenStack with new Ceph systems.
> 
> After that, I have tried export rbd images and import to new Ceph, but VMs 
> and Volumes were clone of Glance rbd images, like this:
> 
> rbd children images/e2c852e1-28ce-408d-b2ec-6351db35d55a@snap
> 
> vms/8a4465fa-cbae-4559-b519-861eb4eda378_disk
> volumes/volume-b5937629-5f44-40c8-9f92-5f88129d3171
> 
> 
> How could I export all rbd snapshot and its clones to import in new Ceph 
> Cluster?
> 
> Or is there any solution to move all Vms, Volumes, Images from old Ceph 
> cluster to the new ones?
> 
> Thanks and regards.
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Senior Cloud Architect

"Always give 100%. Unless you're giving blood."

Mail: s...@redhat.com
Address: 11 bis, rue Roquépine - 75008 Paris




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] maximum number of mapped rbds?

2015-09-04 Thread Sebastien Han
Which Kernel are you running on?
These days, the theoretical limit is 65536 AFAIK.

Ilya would know the kernel needed for that.
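A quick way to check what your node supports (a hedged sketch; the single_major 
rbd module parameter on recent kernels is what lifts the old device limit):

uname -r
modinfo rbd | grep -i single_major
cat /sys/module/rbd/parameters/single_major
rbd showmapped | wc -l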

> On 03 Sep 2015, at 15:05, Jeff Epstein  wrote:
> 
> Hello,
> 
> In response to an rbd map command, we are getting a "Device or resource busy".
> 
> $ rbd -p platform map ceph:pzejrbegg54hi-stage-4ac9303161243dc71c75--php
> 
> rbd: sysfs write failed
> 
> rbd: map failed: (16) Device or resource busy
> 
> 
> We currently have over 200 rbds mapped on a single host. Can this be the 
> source of the problem? If so, is there a workaround?
> 
> $  rbd -p platform showmapped|wc -l
> 248
> 
> Thanks.
> 
> Best,
> Jeff
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Senior Cloud Architect

"Always give 100%. Unless you're giving blood."

Mail: s...@redhat.com
Address: 11 bis, rue Roquépine - 75008 Paris



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Nova fails to download image from Glance backed with Ceph

2015-09-04 Thread Sebastien Han
Just to rule out a possible issue from the infra (LBs etc.): did you try to 
download the image directly on the compute node, with something like rbd export?
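Something along these lines, run directly on the compute node (the image UUID is 
a placeholder; compare the checksum with the one Glance reports for the image):

rbd -p images export <glance-image-uuid> /tmp/test-image.raw
md5sum /tmp/test-image.raw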

> On 04 Sep 2015, at 11:56, Vasiliy Angapov  wrote:
> 
> Hi all,
> 
> Not sure actually where does this bug belong to - OpenStack or Ceph -
> but writing here in humble hope that anyone faced that issue also.
> 
> I configured test OpenStack instance with Glance images stored in Ceph
> 0.94.3. Nova has local storage.
> But when I'm trying to launch instance from large image stored in Ceph
> - it fails to spawn with such an error in nova-conductor.log:
> 
> 2015-09-04 11:52:35.076 3605449 ERROR nova.scheduler.utils
> [req-c6af3eca-f166-45bd-8edc-b8cfadeb0d0b
> 82c1f134605e4ee49f65015dda96c79a 448cc6119e514398ac2793d043d4fa02 - -
> -] [instance: 18c9f1d5-50e8-426f-94d5-167f43129ea6] Error from last
> host: slpeah005 (node slpeah005.cloud): [u'Traceback (most recent call
> last):\n', u'  File
> "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2220,
> in _do_build_and_run_instance\nfilter_properties)\n', u'  File
> "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2363,
> in _build_and_run_instance\ninstance_uuid=instance.uuid,
> reason=six.text_type(e))\n', u'RescheduledException: Build of instance
> 18c9f1d5-50e8-426f-94d5-167f43129ea6 was re-scheduled: [Errno 32]
> Corrupt image download. Checksum was 625d0686a50f6b64e57b1facbc042248
> expected 4a7de2fbbd01be5c6a9e114df145b027\n']
> 
> So nova tries 3 different hosts with the same error messages on every
> single one and then fails to spawn an instance.
> I've tried Cirros little image and it works fine with it. Issue
> happens with large images like 10Gb in size.
> I also managed to look into /var/lib/nova/instances/_base folder and
> found out that image is actually being downloaded but at some moment
> the download process interrupts for some unknown reason and instance
> gets deleted.
> 
> I looked at the syslog and found many messages like that:
> Sep  4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735094
> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.22 since
> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
> (cutoff 2015-09-04 12:51:32.735011)
> Sep  4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735099
> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.23 since
> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
> (cutoff 2015-09-04 12:51:32.735011)
> Sep  4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735104
> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.24 since
> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
> (cutoff 2015-09-04 12:51:32.735011)
> Sep  4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735108
> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.26 since
> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
> (cutoff 2015-09-04 12:51:32.735011)
> Sep  4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735118
> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.27 since
> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
> (cutoff 2015-09-04 12:51:32.735011)
> 
> I've also tried to monitor nova-compute process file descriptors
> number but it is never more than 102. ("echo
> /proc/NOVA_COMPUTE_PID/fd/* | wc -w" like Jan advised in this ML).
> It also seems like problem appeared only in 0.94.3, in 0.94.2
> everything worked just fine!
> 
> Would be very grateful for any help!
> 
> Vasily.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Senior Cloud Architect

"Always give 100%. Unless you're giving blood."

Mail: s...@redhat.com
Address: 11 bis, rue Roquépine - 75008 Paris



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Nova with Ceph generate error

2015-07-10 Thread Sebastien Han
Which request generated this trace?
Is it from the nova-compute log?

 On 10 Jul 2015, at 07:13, Mario Codeniera mario.codeni...@gmail.com wrote:
 
 Hi,
 
 It is my first time here. I am just having an issue regarding with my 
 configuration with the OpenStack which works perfectly for the cinder and the 
 glance based on Kilo release in CentOS 7. I am based my documentation on this 
 rbd-opeenstack manual.
 
 
 If I enable my rbd in the nova.conf it generates error like the following in 
 the dashboard as the logs don't have any errors:
 
 Internal Server Error (HTTP 500) (Request-ID: 
 req-231347dd-f14c-4f97-8a1d-851a149b037c)
 Code
 500
 Details
 File /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 343, in 
 decorated_function return function(self, context, *args, **kwargs) File 
 /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2737, in 
 terminate_instance do_terminate_instance(instance, bdms) File 
 /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py, line 445, 
 in inner return f(*args, **kwargs) File 
 /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2735, in 
 do_terminate_instance self._set_instance_error_state(context, instance) File 
 /usr/lib/python2.7/site-packages/oslo_utils/excutils.py, line 85, in 
 __exit__ six.reraise(self.type_, self.value, self.tb) File 
 /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2725, in 
 do_terminate_instance self._delete_instance(context, instance, bdms, quotas) 
 File /usr/lib/python2.7/site-packages/nova/hooks.py, line 149, in inner rv 
 = f(*args, **kwargs) File 
 /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2694, in 
 _delete_instance quotas.rollback() File 
 /usr/lib/python2.7/site-packages/oslo_utils/excutils.py, line 85, in 
 __exit__ six.reraise(self.type_, self.value, self.tb) File 
 /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2664, in 
 _delete_instance self._shutdown_instance(context, instance, bdms) File 
 /usr/lib/python2.7/site-packages/nova/compute/manager.py, line 2604, in 
 _shutdown_instance self.volume_api.detach(context, bdm.volume_id) File 
 /usr/lib/python2.7/site-packages/nova/volume/cinder.py, line 214, in 
 wrapper res = method(self, ctx, volume_id, *args, **kwargs) File 
 /usr/lib/python2.7/site-packages/nova/volume/cinder.py, line 365, in detach 
 cinderclient(context).volumes.detach(volume_id) File 
 /usr/lib/python2.7/site-packages/cinderclient/v2/volumes.py, line 334, in 
 detach return self._action('os-detach', volume) File 
 /usr/lib/python2.7/site-packages/cinderclient/v2/volumes.py, line 311, in 
 _action return self.api.client.post(url, body=body) File 
 /usr/lib/python2.7/site-packages/cinderclient/client.py, line 91, in post 
 return self._cs_request(url, 'POST', **kwargs) File 
 /usr/lib/python2.7/site-packages/cinderclient/client.py, line 85, in 
 _cs_request return self.request(url, method, **kwargs) File 
 /usr/lib/python2.7/site-packages/cinderclient/client.py, line 80, in 
 request return super(SessionClient, self).request(*args, **kwargs) File 
 /usr/lib/python2.7/site-packages/keystoneclient/adapter.py, line 206, in 
 request resp = super(LegacyJsonAdapter, self).request(*args, **kwargs) File 
 /usr/lib/python2.7/site-packages/keystoneclient/adapter.py, line 95, in 
 request return self.session.request(url, method, **kwargs) File 
 /usr/lib/python2.7/site-packages/keystoneclient/utils.py, line 318, in 
 inner return func(*args, **kwargs) File 
 /usr/lib/python2.7/site-packages/keystoneclient/session.py, line 397, in 
 request raise exceptions.from_response(resp, method, url)
 Created
 10 Jul 2015, 4:40 a.m.
 
 
 Again if disable I able to work but it is generated on the compute node, as I 
 observe too it doesn't display the hypervisor of the compute nodes, or maybe 
 it is related.
 
 It was working on Juno before, but there are unexpected rework as the network 
 infrastructure was change which the I rerun the script and found lots of 
 conflicts et al as I run before using qemu-img-rhev qemu-kvm-rhev from OVirt 
 but seems the new hammer (Ceph repository) solve the issue.
 
 Hope someone can enlighten.
 
 Thanks,
 Mario
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Senior Cloud Architect

Always give 100%. Unless you're giving blood.

Mail: s...@redhat.com
Address: 11 bis, rue Roquépine - 75008 Paris



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Expanding a ceph cluster with ansible

2015-06-24 Thread Sebastien Han
Bryan,

Answers inline.

 On 24 Jun 2015, at 00:52, Stillwell, Bryan bryan.stillw...@twcable.com 
 wrote:
 
 Sébastien,
 
 Nothing has gone wrong with using it in this way, it just has to do with
 my lack
 of experience with ansible/ceph-ansible.  I'm learning both now, but would
 love
 if there were more documentation around using them.  For example this
 documentation around using ceph-deploy is pretty good, and I was hoping for
 something equivalent for ceph-ansible:
 
 http://ceph.com/docs/master/rados/deployment/
 

Well if this is not enough: https://github.com/ceph/ceph-ansible/wiki
Please open an issue with what’s missing and I’ll make sure to clarify 
everything ASAP.

 
 With that said, I'm wondering what tweaks do you think would be needed to
 get
 ceph-ansible working on an existing cluster?

There are critical variables to edit, so the first thing to do is to make sure 
these variables exactly match your current configuration.

Btw I just tried the following:

* deployed a cluster with ceph-deploy: 1 mons (on ceph1), 3 OSDs (on ceph4, 
ceph5, ceph6)
* 1 SSD for the journal per OSD

Then I configured ceph-ansible normally:

* ran ‘ceph fsid’ to pick up the uuid in use and edited 
group_vars/{all,mons,osds} with it (var fsid)
* collected the monitor keyring from /var/lib/ceph/mon/ceph-ceph-eno1/keyring 
and put it in group_vars/mons as monitor_secret
* configured the monitor_interface variable in group_vars/all; this one might 
be tricky, make sure that ceph-deploy used the right interface beforehand
* changed the journal_size variable in group_vars/all and used 5120 (the 
ceph-deploy default)
* changed the public_network and cluster_network variables in group_vars/all
* removed everything in ~/ceph-ansible/fetch
* configured ceph-ansible to use a dedicated journal (journal_collocation: false 
and raw_multi_journal: true, and edited the raw_journal_devices variable)

Eventually ran “ansible-playbook site.yml”  and everything went well.
I now have 3 monitors and 4 new OSDs per host all using the same SSDs, so 25 in 
total.
Given that ceph-ansible follows ceph-deploy best practices, it worked without 
too much difficulty.
I’d say that it depends on how the cluster was bootstrapped in the first place.
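To make that concrete, roughly the group_vars I ended up with (values are 
illustrative and the exact layout follows the ceph-ansible version of that time):

# group_vars/all
fsid: <output of `ceph fsid` on the existing cluster>   # also set in mons/osds as described above
monitor_interface: eno1            # whatever interface ceph-deploy actually used
journal_size: 5120                 # ceph-deploy default
public_network: 192.168.0.0/24     # match your existing ceph.conf
cluster_network: 192.168.128.0/24

# group_vars/mons
monitor_secret: <key from /var/lib/ceph/mon/<cluster>-<hostname>/keyring>

# group_vars/osds
journal_collocation: false
raw_multi_journal: true
raw_journal_devices:
  - /dev/sdu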

 
 Also to answer your other questions, I haven't tried expanding the cluster
 with
 ceph-ansible yet.  I'm playing around with it in vagrant/virtualbox, and
 it looks
 pretty awesome so far!  If everything goes well, I'm not against
 revisiting the
 choice of puppet-ceph and replacing it with ceph-ansible.

Awesome, don’t hesitate and let me know if I can help with this task.

 
 One other question, how well does ceph-ansible handle replacing a failed
 HDD
 (/dev/sdo) that has the journal at the beginning or middle of an SSD
 (/dev/sdd2)?

At the moment, it doesn’t.
Ceph-ansible just expects some basic mapping between OSDs and journals.
ceph-disk does the partitioning, so ceph-ansible doesn’t have any knowledge 
of the layout.
I’d say that this intelligence should probably go into ceph-disk itself; the 
idea would be to tell ceph-disk to re-use a partition that was a journal once.
Then we can build another ansible playbook to re-populate a list of OSDs that 
died.
I’ll have a look at that and will let you know.

A bit more about device management in ceph-ansible, which depends on the 
scenario you choose.
Let’s assume you go with dedicated SSDs for your journals; we have 2 variables:

* devices 
(https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-osd/defaults/main.yml#L51):
 that contains a list of device where to store OSD data
* raw_journal_devices 
(https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-osd/defaults/main.yml#L89):
 that contains the list of SSD that will host a journal

So you can imagine having:


devices:
  - /dev/sdb
  - /dev/sdc
  - /dev/sdd
  - /dev/sde


raw_journal_devices:
  - /dev/sdu
  - /dev/sdu
  - /dev/sdv
  - /dev/sdv

Where sdb and sdc will have sdu as a journal device, and sdd and sde will have 
sdv as a journal device.

I should probably rework this part a little with an easier declaration 
though...

 Thanks,
 Bryan
 
 On 6/22/15, 7:09 AM, Sebastien Han s...@redhat.com wrote:
 
 Hi Bryan,
 
 It shouldn’t be a problem for ceph-ansible to expand a cluster even if it
 wasn’t deployed with it.
 I believe this requires a bit of tweaking on the ceph-ansible, but it’s
 not much.
 Can you elaborate on what went wrong and perhaps how you configured
 ceph-ansible?
 
 As far as I understood, you haven’t been able to grow the size of your
 cluster by adding new disks/nodes?
 Is this statement correct?
 
 One more thing, why don’t you use ceph-ansible entirely to do the
 provisioning and life cycle management of your cluster? :)
 
 On 18 Jun 2015, at 00:14, Stillwell, Bryan
 bryan.stillw...@twcable.com wrote:
 
 I've been working on automating a lot of our ceph admin tasks lately
 and am
 pretty pleased with how the puppet-ceph module

Re: [ceph-users] Expanding a ceph cluster with ansible

2015-06-22 Thread Sebastien Han
Hi Bryan,

It shouldn’t be a problem for ceph-ansible to expand a cluster even if it 
wasn’t deployed with it.
I believe this requires a bit of tweaking on the ceph-ansible, but it’s not 
much.
Can you elaborate on what went wrong and perhaps how you configured 
ceph-ansible?

As far as I understood, you haven’t been able to grow the size of your cluster 
by adding new disks/nodes?
Is this statement correct?

One more thing, why don’t you use ceph-ansible entirely to do the provisioning 
and life cycle management of your cluster? :)

 On 18 Jun 2015, at 00:14, Stillwell, Bryan bryan.stillw...@twcable.com 
 wrote:
 
 I've been working on automating a lot of our ceph admin tasks lately and am
 pretty pleased with how the puppet-ceph module has worked for installing
 packages, managing ceph.conf, and creating the mon nodes.  However, I don't
 like the idea of puppet managing the OSDs.  Since we also use ansible in my
 group, I took a look at ceph-ansible to see how it might be used to
 complete
 this task.  I see examples for doing a rolling update and for doing an os
 migration, but nothing for adding a node or multiple nodes at once.  I
 don't
 have a problem doing this work, but wanted to check with the community if
 any one has experience using ceph-ansible for this?
 
 After a lot of trial and error I found the following process works well
 when
 using ceph-deploy, but it's a lot of steps and can be error prone
 (especially if you have old cephx keys that haven't been removed yet):
 
 # Disable backfilling and scrubbing to prevent too many performance
 # impacting tasks from happening at the same time.  Maybe adding norecover
 # to this list might be a good idea so only peering happens at first.
 ceph osd set nobackfill
 ceph osd set noscrub
 ceph osd set nodeep-scrub
 
 # Zap the disks to start from a clean slate
 ceph-deploy disk zap dnvrco01-cephosd-025:sd{b..y}
 
 # Prepare the disks.  I found sleeping between adding each disk can help
 # prevent performance problems.
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdh:/dev/sdb; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdi:/dev/sdb; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdj:/dev/sdb; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdk:/dev/sdc; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdl:/dev/sdc; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdm:/dev/sdc; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdn:/dev/sdd; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdo:/dev/sdd; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdp:/dev/sdd; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdq:/dev/sde; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdr:/dev/sde; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sds:/dev/sde; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdt:/dev/sdf; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdu:/dev/sdf; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdv:/dev/sdf; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdw:/dev/sdg; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdx:/dev/sdg; sleep 15
 ceph-deploy osd prepare dnvrco01-cephosd-025:sdy:/dev/sdg; sleep 15
 
 # Weight in the new OSDs.  We set 'osd_crush_initial_weight = 0' to prevent
 # them from being added in during the prepare step.  Maybe a longer weight
 # in the last step would make this step unncessary.
 ceph osd crush reweight osd.450 1.09; sleep 60
 ceph osd crush reweight osd.451 1.09; sleep 60
 ceph osd crush reweight osd.452 1.09; sleep 60
 ceph osd crush reweight osd.453 1.09; sleep 60
 ceph osd crush reweight osd.454 1.09; sleep 60
 ceph osd crush reweight osd.455 1.09; sleep 60
 ceph osd crush reweight osd.456 1.09; sleep 60
 ceph osd crush reweight osd.457 1.09; sleep 60
 ceph osd crush reweight osd.458 1.09; sleep 60
 ceph osd crush reweight osd.459 1.09; sleep 60
 ceph osd crush reweight osd.460 1.09; sleep 60
 ceph osd crush reweight osd.461 1.09; sleep 60
 ceph osd crush reweight osd.462 1.09; sleep 60
 ceph osd crush reweight osd.463 1.09; sleep 60
 ceph osd crush reweight osd.464 1.09; sleep 60
 ceph osd crush reweight osd.465 1.09; sleep 60
 ceph osd crush reweight osd.466 1.09; sleep 60
 ceph osd crush reweight osd.467 1.09; sleep 60
 
 # Once all the OSDs are added to the cluster, allow the backfill process to
 # begin.
 ceph osd unset nobackfill
 
 # Then once cluster is healthy again, re-enable scrubbing
 ceph osd unset noscrub
 ceph osd unset nodeep-scrub
 
 

Re: [ceph-users] rbd unmap command hangs when there is no network connection with mons and osds

2015-05-12 Thread Sebastien Han
Should we add a timeout to the unmap command in the RBD RA in the meantime?
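A sketch of what that could look like in the RA’s stop action (names and the 
timeout value are placeholders, and this is not the actual resource agent code; 
whether failing fast is really better than hanging is exactly the question):

# give up after 30s instead of blocking the stop action indefinitely
timeout 30 rbd unmap /dev/rbd/<pool>/<image>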

 On 08 May 2015, at 15:13, Vandeir Eduardo vandeir.edua...@gmail.com wrote:
 
 Wouldn't be better a configuration named (map|unmap)_timeout? Cause we are 
 talking about a map/unmap of a RBD device, not a mount/unmount of a file 
 system.
 
 On Fri, May 8, 2015 at 10:04 AM, Ilya Dryomov idryo...@gmail.com wrote:
 On Fri, May 8, 2015 at 3:59 PM, Ilya Dryomov idryo...@gmail.com wrote:
  On Fri, May 8, 2015 at 1:18 PM, Vandeir Eduardo
  vandeir.edua...@gmail.com wrote:
  This causes an annoying problem with rbd resource agent in pacemaker. In a
  situation where pacemaker needs to stop a rbd resource agent on a node 
  where
  there is no network connection, the rbd unmap command hangs. This causes 
  the
  resource agent stop command to timeout and the node is fenced.
 
  On Thu, May 7, 2015 at 4:37 PM, Ilya Dryomov idryo...@gmail.com wrote:
 
  On Thu, May 7, 2015 at 10:20 PM, Vandeir Eduardo
  vandeir.edua...@gmail.com wrote:
   Hi,
  
   when issuing rbd unmap command when there is no network connection with
   mons
   and osds, the command hangs. Isn't there a option to force unmap even on
   this situation?
 
  No, but you can Ctrl-C the unmap command and that should do it.  In the
  dmesg you'll see something like
 
rbd: unable to tear down watch request
 
  and you may have to wait for the cluster to timeout the watch.
 
  We can probably add a --force to rbd unmap.  That would require extending 
  our
  sysfs interface but I don't see any obstacles.  Sage?
 
 On a second thought, we can timeout our wait for a reply to a watch
 teardown request with a configurable timeout (mount_timeout).  We might
 still need --force for more in the future, but for this particular
 problem the timeout is a better solution I think.  I'll take care of
 it.
 
 Thanks,
 
 Ilya
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Find out the location of OSD Journal

2015-05-11 Thread Sebastien Han
Under the OSD directory, look at where the symlink points. It is generally 
called ‘journal’ and should point to a device.
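For example, assuming the default data path:

ls -l /var/lib/ceph/osd/ceph-*/journal
readlink -f /var/lib/ceph/osd/ceph-0/journal    # resolves to the actual journal device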

 On 06 May 2015, at 06:54, Patrik Plank p.pl...@st-georgen-gusen.at wrote:
 
 Hi,
 
 i cant remember on which drive I install which OSD journal :-||
 Is there any command to show this?
 
 
 thanks
 regards
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph is Full

2015-04-29 Thread Sebastien Han
With mon_osd_full_ratio you should restart the monitors, and this shouldn’t be a 
problem.

For the unclean PGs, it looks like something is preventing them from becoming 
healthy; look at the state of the OSDs responsible for these PGs.

 On 29 Apr 2015, at 05:06, Ray Sun xiaoq...@gmail.com wrote:
 
 mon osd full ratio


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph is Full

2015-04-28 Thread Sebastien Han
You can try to push the full ratio a bit further and then delete some objects.
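On a Hammer-era cluster that would look roughly like this (ratios and names are 
examples; remember to bring the threshold back down once space has been freed):

ceph pg set_full_ratio 0.98          # temporarily raise the full threshold
rados -p <pool> rm <object-name>     # or `rbd rm <pool>/<image>` to free space
ceph pg set_full_ratio 0.95          # restore the previous value afterwards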

 On 28 Apr 2015, at 15:51, Ray Sun xiaoq...@gmail.com wrote:
 
 More detail about ceph health detail
 [root@controller ~]# ceph health detail
 HEALTH_ERR 20 pgs backfill_toofull; 20 pgs degraded; 20 pgs stuck unclean; 
 recovery 7482/129081 objects degraded (5.796%); 2 full osd(s); 1 near full 
 osd(s)
 pg 3.8 is stuck unclean for 7067109.597691, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.7d is stuck unclean for 1852078.505139, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.21 is stuck unclean for 7072842.637848, current state 
 active+degraded+remapped+backfill_toofull, last acting [0,2]
 pg 3.22 is stuck unclean for 7070880.213397, current state 
 active+degraded+remapped+backfill_toofull, last acting [0,2]
 pg 3.a is stuck unclean for 7067057.863562, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.7f is stuck unclean for 7067122.493746, current state 
 active+degraded+remapped+backfill_toofull, last acting [0,2]
 pg 3.5 is stuck unclean for 7067088.369629, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.1e is stuck unclean for 7073386.246281, current state 
 active+degraded+remapped+backfill_toofull, last acting [0,2]
 pg 3.19 is stuck unclean for 7068035.310269, current state 
 active+degraded+remapped+backfill_toofull, last acting [0,2]
 pg 3.5d is stuck unclean for 1852078.505949, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.1a is stuck unclean for 7067088.429544, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.1b is stuck unclean for 7072773.771385, current state 
 active+degraded+remapped+backfill_toofull, last acting [0,2]
 pg 3.3 is stuck unclean for 7067057.864514, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.15 is stuck unclean for 7067088.825483, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.11 is stuck unclean for 7067057.862408, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.6d is stuck unclean for 7067083.634454, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.6e is stuck unclean for 7067098.452576, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.c is stuck unclean for 5658116.678331, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.e is stuck unclean for 7067078.646953, current state 
 active+degraded+remapped+backfill_toofull, last acting [2,0]
 pg 3.20 is stuck unclean for 7067140.530849, current state 
 active+degraded+remapped+backfill_toofull, last acting [0,2]
 pg 3.7d is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.7f is active+degraded+remapped+backfill_toofull, acting [0,2]
 pg 3.6d is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.6e is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.5d is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.20 is active+degraded+remapped+backfill_toofull, acting [0,2]
 pg 3.21 is active+degraded+remapped+backfill_toofull, acting [0,2]
 pg 3.22 is active+degraded+remapped+backfill_toofull, acting [0,2]
 pg 3.1e is active+degraded+remapped+backfill_toofull, acting [0,2]
 pg 3.19 is active+degraded+remapped+backfill_toofull, acting [0,2]
 pg 3.1a is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.1b is active+degraded+remapped+backfill_toofull, acting [0,2]
 pg 3.15 is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.11 is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.c is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.e is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.8 is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.a is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.5 is active+degraded+remapped+backfill_toofull, acting [2,0]
 pg 3.3 is active+degraded+remapped+backfill_toofull, acting [2,0]
 recovery 7482/129081 objects degraded (5.796%)
 osd.0 is full at 95%
 osd.2 is full at 95%
 osd.1 is near full at 93%
 
 Best Regards
 -- Ray
 
 On Tue, Apr 28, 2015 at 9:43 PM, Ray Sun xiaoq...@gmail.com wrote:
 Emergency Help!
 
 One of ceph cluster is full, and ceph -s returns:
 [root@controller ~]# ceph -s
 cluster 059f27e8-a23f-4587-9033-3e3679d03b31
  health HEALTH_ERR 20 pgs backfill_toofull; 20 pgs degraded; 20 pgs stuck 
 unclean; recovery 7482/129081 objects degraded (5.796%); 2 full osd(s); 1 
 near full osd(s)
  monmap e6: 4 mons at 
 {node-5e40.cloud.com=10.10.20.40:6789/0,node-6670.cloud.com=10.10.20.31:6789/0,node-66c4.cloud.com=10.10.20.36:6789/0,node-fb27.cloud.com=10.10.20.41:6789/0},
  election epoch 886, quorum 0,1,2,3 
 

Re: [ceph-users] Ceph recovery network?

2015-04-27 Thread Sebastien Han
Well yes, “pretty much” the same thing :).
I think some people would like to distinguish recovery from replication and 
maybe apply some QoS between the two.
We have to replicate while recovering, so one can impact the other.

In the end, I just think it’s a doc issue; still waiting for a dev to answer :).
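For reference, the only networks you can declare today are the two below; there 
is no dedicated recovery-network option, which is why the doc sentence is 
confusing (subnets are examples):

[global]
    public network  = 192.168.0.0/24     # clients and monitors
    cluster network = 192.168.128.0/24   # OSD replication, recovery and backfill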

 On 27 Apr 2015, at 00:50, Robert LeBlanc rob...@leblancnet.us wrote:
 
 My understanding is that Monitors monitor the public address of the
 OSDs and other OSDs monitor the cluster address of the OSDs.
 Replication, recovery and backfill traffic all use the same network
 when you specify 'cluster network = network/mask' in your ceph.conf.
 It is useful to remember that replication, recovery and backfill
 traffic are pretty much the same thing, just at different points in
 time.
 
 On Sun, Apr 26, 2015 at 4:39 PM, Sebastien Han
 sebastien@enovance.com wrote:
 Hi list,
 
 While reading this 
 http://ceph.com/docs/master/rados/configuration/network-config-ref/#ceph-networks,
  I came across the following sentence:
 
 You can also establish a separate cluster network to handle OSD heartbeat, 
 object replication and recovery traffic”
 
 I didn’t know it was possible to perform such stretching, at least for 
 recovery traffic.
 Replication is generally handled by the cluster_network_addr and the 
 heartbeat can be used with osd_heartbeat_addr.
 Although I’m a bit confused by the osd_heartbeat_addr since I thought the 
 heartbeat was binding on both public and cluster addresses.
 
 So my question is: how to isolate the recovery traffic to specific network?
 
 Thanks!
 
 Cheers.
 
 Sébastien Han
 Cloud Architect
 
 Always give 100%. Unless you're giving blood.
 
 Phone: +33 (0)1 49 70 99 72
 Mail: sebastien@enovance.com
 Address : 11 bis, rue Roquépine - 75008 Paris
 Web : www.enovance.com - Twitter : @enovance
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph recovery network?

2015-04-26 Thread Sebastien Han
Hi list,

While reading this 
http://ceph.com/docs/master/rados/configuration/network-config-ref/#ceph-networks,
 I came across the following sentence:

“You can also establish a separate cluster network to handle OSD heartbeat, 
object replication and recovery traffic”

I didn’t know it was possible to perform such stretching, at least for recovery 
traffic.
Replication is generally handled by the cluster_network_addr and the heartbeat 
can be used with osd_heartbeat_addr.
Although I’m a bit confused by the osd_heartbeat_addr since I thought the 
heartbeat was binding on both public and cluster addresses.

So my question is: how to isolate the recovery traffic to specific network?

Thanks!

Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph cluster on docker containers

2015-03-29 Thread Sebastien Han
You can have a look at: https://github.com/ceph/ceph-docker
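To illustrate the two points Pavel makes below, a very rough sketch of running a 
monitor container (the image name is illustrative only; see the repo above for 
the real images and entrypoints):

# host networking gives the mon a stable IP; bind mounts keep data on the host FS
docker run -d --net=host \
  -v /etc/ceph:/etc/ceph \
  -v /var/lib/ceph:/var/lib/ceph \
  ceph/mon    # illustrative image name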

 On 23 Mar 2015, at 17:16, Pavel V. Kaygorodov pa...@inasan.ru wrote:
 
 Hi!
 
 I'm using ceph cluster, packed to a number of docker containers.
 There are two things, which you need to know:
 
 1. Ceph OSDs are using FS attributes, which may not be supported by 
 filesystem inside docker container, so you need to mount external directory 
 inside a container to store OSD data.
 2. Ceph monitors must have static external IP-s, so you have to use lxc-conf 
 directives to use static IP-s inside containers.
 
 
 With best regards,
  Pavel.
 
 
 6 марта 2015 г., в 10:15, Sumit Gaur sumitkg...@gmail.com написал(а):
 
 Hi
 I need to know if Ceph has any Docker story. What I am not abel to find if 
 there are any predefined steps for ceph cluster to be deployed on Docker 
 containers.
 
 Thanks
 sumit
 
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Sparse RBD instance snapshots in OpenStack

2015-03-12 Thread Sebastien Han
Several patches aim to solve that by using RBD snapshots instead of QEMU 
snapshots.
Unfortunately I doubt we will have something ready for OpenStack Juno.
Hopefully Liberty will be the release that fixes that.

Having RAW images is not that bad, since booting from that snapshot will do a 
clone.
So I’m not sure sparsifying is a good idea (libguestfs should be able to do 
that).
However, it would be better if we could do that via RBD snapshots so we get the 
best of both worlds.
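In the meantime, re-sparsifying the uploaded image by hand is easy enough (a 
sketch; filenames are placeholders):

# qemu-img skips zero blocks on convert, which restores sparseness
qemu-img convert -f raw -O raw fat-glance-image.raw sparse-image.raw

# or with libguestfs, which also trims deleted blocks inside the filesystem
virt-sparsify fat-glance-image.raw sparse-image.raw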

 On 12 Mar 2015, at 03:45, Charles 'Boyo charlesb...@gmail.com wrote:
 
 Hello all.
 
 The current behavior of snapshotting instances RBD-backed in OpenStack 
 involves uploading the snapshot into Glance.
 
 The resulting Glance image is fully allocated, causing an explosion of 
 originally sparse RAW images. Is there a way to preserve the sparseness? Else 
 I can use qemu-img convert (or rbd export/import) to manually sparsify it?
 
 On a related note, my Glance is also backed by the same Ceph cluster, in 
 another pool and I was wondering if Ceph snapshots would not be a better way 
 to do this. Any ideas?
 
 Regards,
 
 Charles
 


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD on LVM volume

2015-02-24 Thread Sebastien Han
A while ago, I managed to have this working but this was really tricky.
See my comment here: 
https://github.com/ceph/ceph-ansible/issues/9#issuecomment-37127128

One use case I had was a system with 2 SSDs for the OS and a couple of OSDs.
Both SSDs were in RAID1 and the system was already configured with LVM, so we 
had to create an LV for each journal.
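Roughly, the LV part looked like this (VG/LV names are made up; the tricky bit, 
as the linked comment explains, was getting the OSD tooling to accept those LVs 
as journals):

lvcreate -L 5G -n journal-osd0 vg_ssd
lvcreate -L 5G -n journal-osd1 vg_ssd
ls -l /dev/vg_ssd/    # the resulting /dev/vg_ssd/journal-osdN paths become the journals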

 On 24 Feb 2015, at 14:41, Jörg Henne henn...@gmail.com wrote:
 
 2015-02-24 14:05 GMT+01:00 John Spray john.sp...@redhat.com:
 
 I imagine that without proper partition labels you'll also not get the 
 benefit of e.g. the udev magic
 that allows plugging OSDs in/out of different hosts.  More generally you'll 
 just be in a rather non standard configuration that will confuse anyone 
 working on the host.
 Ok, thanks for the heads up!
  
 Can I ask why you want to use LVM?  It is not generally necessary or useful 
 with Ceph: Ceph expects to be fed raw drives.
 I am currently just experimenting with ceph. Although I have a reasonable 
 number of lab nodes, those nodes are shared with other experimentation and 
 thus it would be rather inconvenient to dedicate the raw disks exclusively to 
 ceph.
 
 Joerg Henne
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Journals on all SSD cluster

2015-01-21 Thread Sebastien Han
It has been proven that OSDs can’t take full advantage of an SSD, so I’ll 
probably collocate both journal and OSD data.
Search in the ML for [Single OSD performance on SSD] Can't go over 3, 2K IOPS

You will see that there is no difference in terms of performance between the 
following:

* 1 SSD for journal + 1 SSD for osd data
* 1 SSD for both journal and data

What you can do in order to max out your SSD is to run multiple journals and 
osd data on the same SSD. Something like this gave me more IOPS:

* /dev/sda1 ceph journal
* /dev/sda2 ceph data
* /dev/sda3 ceph journal
* /dev/sda4 ceph data
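As a sketch, splitting one SSD into two journal/data pairs with sgdisk could look 
like this (sizes are arbitrary examples; each data partition is then prepared 
with its neighbouring journal partition):

sgdisk --new=1:0:+10G  --change-name=1:'ceph journal' /dev/sda
sgdisk --new=2:0:+400G --change-name=2:'ceph data'    /dev/sda
sgdisk --new=3:0:+10G  --change-name=3:'ceph journal' /dev/sda
sgdisk --new=4:0:0     --change-name=4:'ceph data'    /dev/sda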

 On 21 Jan 2015, at 04:32, Andrew Thrift and...@networklabs.co.nz wrote:
 
 Hi All,
 
 We have a bunch of shiny new hardware we are ready to configure for an all 
 SSD cluster.
 
 I am wondering what are other people doing for their journal configuration on 
 all SSD clusters ?
 
 - Seperate Journal partition and OSD partition on each SSD
 
 or
 
 - Journal on OSD
 
 
 Thanks,
 
 
 
 
 Andrew
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how do I show active ceph configuration

2015-01-21 Thread Sebastien Han
You can use the admin socket:

$ ceph daemon mon.id config show

or locally

ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config show
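And to check a single value instead of grepping the whole dump, for example:

ceph daemon osd.2 config get osd_journal_size
ceph daemon osd.2 config show | grep osd_max_backfills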

 On 21 Jan 2015, at 19:46, Robert Fantini robertfant...@gmail.com wrote:
 
 Hello
 
  Is there a way to see running / acrive  ceph.conf  configuration items?
 
 kind regards
 Rob Fantini
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] reset osd perf counters

2015-01-14 Thread Sebastien Han
It was added in 0.90
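So on 0.90 or newer it should just be, per daemon (IIRC the argument naming which 
counters to reset, ‘all’ here, is required):

ceph daemon osd.73 perf reset all
# or locally via the socket file:
ceph --admin-daemon /var/run/ceph/ceph-osd.73.asok perf reset all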

 On 13 Jan 2015, at 00:11, Gregory Farnum g...@gregs42.com wrote:
 
 perf reset on the admin socket. I'm not sure what version it went in
 to; you can check the release logs if it doesn't work on whatever you
 have installed. :)
 -Greg
 
 
 On Mon, Jan 12, 2015 at 2:26 PM, Shain Miley smi...@npr.org wrote:
 Is there a way to 'reset' the osd perf counters?
 
 The numbers for osd 73 though osd 83 look really high compared to the rest
 of the numbers I see here.
 
 I was wondering if I could clear the counters out, so that I have a fresh
 set of data to work with.
 
 
 root@cephmount1:/var/log/samba# ceph osd perf
 osdid fs_commit_latency(ms) fs_apply_latency(ms)
0 0   45
1 0   14
2 0   47
3 0   25
4 1   44
5 12
6 12
7 0   39
8 0   32
9 0   34
   10 2  186
   11 0   68
   12 11
   13 0   34
   14 01
   15 2   37
   16 0   23
   17 0   28
   18 0   26
   19 0   22
   20 02
   21 2   24
   22 0   33
   23 01
   24 3   98
   25 2   70
   26 01
   27 3   99
   28 02
   29 2  101
   30 2   72
   31 2   81
   32 3  112
   33 3   94
   34 4  152
   35 0   56
   36 02
   37 2   58
   38 01
   39 03
   40 02
   41 02
   42 11
   43 02
   44 1   44
   45 02
   46 01
   47 3   85
   48 01
   49 2   75
   50 4  398
   51 3  115
   52 01
   53 2   47
   54 6  290
   55 5  153
   56 7  453
   57 2   66
   58 11
   59 5  196
   60 00
   61 0   93
   62 09
   63 01
   64 01
   65 04
   66 01
   67 0   18
   68 0   16
   69 0   81
   70 0   70
   71 00
   72 01
   7374 1217
   74 01
   7564 1238
   7692 1248
   77 01
   78 01
   79   109 1333
   8068 1451
   8166 1192
   8295 1215
   8381 1331
   84 3   56
   85 3   65
   86 01
   87 3  

Re: [ceph-users] Spark/Mesos on top of Ceph/Btrfs

2015-01-14 Thread Sebastien Han
Hey

What do you want to use from Ceph? RBD? CephFS?

It is not really clear: you mentioned ceph/btrfs, which makes me think either of 
using btrfs for the OSD store or btrfs on top of an RBD device.
Later you mentioned HDFS; does that mean you want to use CephFS?

I don’t know much about Mesos, but what is so specific about Mesos that makes 
you think you will experience trouble using it with Ceph?

 On 13 Jan 2015, at 14:25, James wirel...@tampabay.rr.com wrote:
 
 Hello,
 
 
 I was wondering if anyone has Mesos running on top of Ceph?
 I want to test/use Ceph if lieu of HDFS.
 
 
 I'm working on Gentoo, but any experiences with Mesos on Ceph
 are of keen interest to me as related to performance, stability
 and any difficulties experienced.
 
 
 James
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph as backend for Swift

2015-01-08 Thread Sebastien Han
You can have a look of what I did here with Christian:

* https://github.com/stackforge/swift-ceph-backend
* https://github.com/enovance/swiftceph-ansible

If you have further question just let us know.

 On 08 Jan 2015, at 15:51, Robert LeBlanc rob...@leblancnet.us wrote:
 
 Anyone have a reference for documentation to get Ceph to be a backend for 
 Swift?
 
 Thanks,
 Robert LeBlanc
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Number of SSD for OSD journal

2014-12-15 Thread Sebastien Han
Hi,

The general recommended ratio (for me at least) is 3 journals per SSD. Using 
200GB Intel DC S3700 is great.
If you’re going with a low perf scenario I don’t think you should bother buying 
SSD, just remove them from the picture and do 12 SATA 7.2K 4TB.

For the medium and medium++ perf configs, a 1:11 ratio is way too high; the SSDs 
will definitely be the bottleneck here.
Please also note that (bandwidth wise) with 22 drives you’re already hitting 
the theoretical limit of a 10Gbps network (~50MB/s * 22 ≈ 1.1GB/s ≈ 8.8Gbps).
You can theoretically up that value with LACP (depending on the 
xmit_hash_policy you’re using, of course).

Btw what’s the network? (since I’m only assuming here).


 On 15 Dec 2014, at 20:44, Florent MONTHEL fmont...@flox-arts.net wrote:
 
 Hi,
 
 I’m buying several servers to test CEPH and I would like to configure journal 
 on SSD drives (maybe it’s not necessary for all use cases)
 Could you help me to identify number of SSD I need (SSD are very expensive 
 and GB price business case killer… ) ? I don’t want to experience SSD 
 bottleneck (some abacus ?).
 I think I will be with below CONF 2  3
 
 
 CONF 1 DELL 730XC Low Perf:
 10 SATA 7.2K 3.5  4TB + 2 SSD 2.5 » 200GB intensive write
 
 CONF 2 DELL 730XC « Medium Perf :
 22 SATA 7.2K 2.5 1TB + 2 SSD 2.5 » 200GB intensive write
 
 CONF 3 DELL 730XC « Medium Perf ++ :
 22 SAS 10K 2.5 1TB + 2 SSD 2.5 » 200GB intensive write
 
 Thanks
 
 Florent Monthel
 
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Block device and Trim/Discard

2014-12-12 Thread Sebastien Han
Discard works with virtio-scsi controllers for disks in QEMU.
Just use discard=unmap in the disk section (scsi disk).
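With libvirt that translates to something like this in the domain XML (a sketch; 
the pool/image name is a placeholder and the cephx auth element is omitted):

<controller type='scsi' model='virtio-scsi'/>
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
  <source protocol='rbd' name='rbd/myimage'/>
  <target dev='sda' bus='scsi'/>
</disk>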


 On 12 Dec 2014, at 13:17, Max Power mailli...@ferienwohnung-altenbeken.de 
 wrote:
 
 Wido den Hollander w...@42on.com hat am 12. Dezember 2014 um 12:53
 geschrieben:
 It depends. Kernel RBD does not support discard/trim yet. Qemu does
 under certain situations and with special configuration.
 
 Ah, Thank you. So this is my problem. I use rbd with the kernel modules. I 
 think
 I should port my fileserver to qemu/kvm environment then and hope that it is
 safe to have a big qemu-partition with around 10 TB.
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Watch for fstrim running on your Ubuntu systems

2014-12-09 Thread Sebastien Han
Good to know. Thanks for sharing!
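For reference, the mitigation Wido suggests below would look something like this 
(untested here; the mount point is just an example):

ionice -c 3 fstrim -v /var/lib/ceph/osd/ceph-0    # -c 3 = idle I/O scheduling class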

 On 09 Dec 2014, at 10:21, Wido den Hollander w...@42on.com wrote:
 
 Hi,
 
 Last sunday I got a call early in the morning that a Ceph cluster was
 having some issues. Slow requests and OSDs marking each other down.
 
 Since this is a 100% SSD cluster I was a bit confused and started
 investigating.
 
 It took me about 15 minutes to see that fstrim was running and was
 utilizing the SSDs 100%.
 
 On Ubuntu 14.04 there is a weekly CRON which executes fstrim-all. It
 detects all mountpoints which can be trimmed and starts to trim those.
 
 On the Intel SSDs used here it caused them to become 100% busy for a
 couple of minutes. That was enough for them to no longer respond on
 heartbeats, thus timing out and being marked down.
 
 Luckily we had the out interval set to 1800 seconds on that cluster,
 so no OSD was marked as out.
 
 fstrim-all does not execute fstrim with a ionice priority. From what I
 understand, but haven't tested yet, is that running fstrim with ionice
 -c Idle should solve this.
 
 It's weird that this issue didn't come up earlier on that cluster, but
 after killing fstrim all problems we resolved and the cluster ran
 happily again.
 
 So watch out for fstrim on early Sunday mornings on Ubuntu!
 
 --
 Wido den Hollander
 42on B.V.
 Ceph trainer and consultant
 
 Phone: +31 (0)20 700 9902
 Skype: contact42on
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Tool or any command to inject metadata/data corruption on rbd

2014-12-04 Thread Sebastien Han
AFAIK there is no tool to do this.
You simply rm the object or dd new content into the object (e.g. fill it with zeros).
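A hedged sketch of doing that by hand on a FileStore OSD (pool, object name and 
paths are placeholders; adapt to your deployment):

ceph osd map rbd my-test-object        # shows the PG id and the acting OSDs
# on the primary OSD host, find the file backing the object and overwrite part of it
find /var/lib/ceph/osd/ceph-2/current/ -name 'my-test-object*'
dd if=/dev/zero of=<path-from-find> bs=4096 count=1 conv=notrunc
ceph pg deep-scrub <pgid>              # the deep scrub should now report an inconsistency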

 On 04 Dec 2014, at 13:41, Mallikarjun Biradar 
 mallikarjuna.bira...@gmail.com wrote:
 
 Hi all,
 
 I would like to know which tool or cli that all users are using to simulate 
 metadata/data corruption.
 This is to test scrub operation.
 
 -Thanks  regards,
 Mallikarjun Biradar
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Suitable SSDs for journal

2014-12-04 Thread Sebastien Han
Eneko,

I do have plan to push to a performance initiative section on the ceph.com/docs 
sooner or later so people will put their own results through github PR.

 On 04 Dec 2014, at 16:09, Eneko Lacunza elacu...@binovo.es wrote:
 
 Thanks, will look back in the list archive.
 
 On 04/12/14 15:47, Nick Fisk wrote:
 Hi Eneko,
 
 There has been various discussions on the list previously as to the best SSD 
 for Journal use. All of them have pretty much come to the conclusion that 
 the Intel S3700 models are the best suited and in fact work out the cheapest 
 in terms of write durability.
 
 Nick
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
 Eneko Lacunza
 Sent: 04 December 2014 14:35
 To: Ceph Users
 Subject: [ceph-users] Suitable SSDs for journal
 
 Hi all,
 
 Does anyone know about a list of good and bad SSD disks for OSD journals?
 
 I was pointed to
 http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
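
For what it's worth, the core of the test behind that link boils down to checking O_DIRECT + O_DSYNC write behaviour, roughly like this (it overwrites the target device, so use a scratch disk):

dd if=/dev/urandom of=rand.file bs=4k count=65536
dd if=rand.file of=/dev/sdX bs=4k count=65536 oflag=direct,dsync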
 
 But I was looking for something more complete?
 
  For example, I have a Samsung 840 Pro that gives me even worse performance
  than a Crucial m550... I even thought it was dying (but it doesn't seem to be
  the case).
 
 Maybe creating a community-contributed list could be a good idea?
 
 Regards
 Eneko
 
 --
 Zuzendari Teknikoa / Director Técnico
 Binovo IT Human Project, S.L.
 Telf. 943575997
943493611
 Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa) 
 www.binovo.es
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 
 
 
 
 
 --
 Zuzendari Teknikoa / Director Técnico
 Binovo IT Human Project, S.L.
 Telf. 943575997
  943493611
 Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
 www.binovo.es
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.

Sébastien Han
Cloud Architect

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72
Mail: sebastien@enovance.com
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] All SSD storage and journals

2014-10-27 Thread Sebastien Han
They were some investigations as well around F2FS 
(https://www.kernel.org/doc/Documentation/filesystems/f2fs.txt), the last time 
I tried to install an OSD dir under f2fs it failed.
I tried to run the OSD on f2fs however ceph-osd mkfs got stuck on a xattr test:

fremovexattr(10, user.test@5848273)   = 0

Maybe someone from the core dev has an update on this?
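
For anyone who wants to reproduce it, a quick way to check whether a filesystem handles the xattr calls ceph-osd relies on (this assumes the attr tools are installed and /mnt/f2fs is a scratch f2fs mount):

touch /mnt/f2fs/xattr-test
setfattr -n user.test -v somevalue /mnt/f2fs/xattr-test
getfattr -n user.test /mnt/f2fs/xattr-test
setfattr -x user.test /mnt/f2fs/xattr-test    # the removexattr step mkfs got stuck around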

 On 24 Oct 2014, at 07:58, Christian Balzer ch...@gol.com wrote:
 
 
 Hello,
 
 as others have reported in the past and now having tested things here
 myself, there really is no point in having journals for SSD backed OSDs on
 other SSDs.
 
 It is a zero sum game, because:
 a) using that journal SSD as another OSD with integrated journal will
 yield the same overall result performance wise, if all SSDs are the same.
 And In addition its capacity will be made available for actual storage.
 b) if the journal SSD is faster than the OSD SSDs it tends to be priced
 accordingly. For example the DC P3700 400GB is about twice as fast (write)
 and expensive as the DC S3700 400GB.
 
 Things _may_ be different if one doesn't look at bandwidth but IOPS (though
 certainly not in the near future in regard to Ceph actually getting SSDs
 busy), but even there the difference is negligible when for example
 comparing the Intel S and P models in write performance.
 Reads are another thing, but nobody cares about those in journals. ^o^
 
 Obvious things that come to mind in this context would be the ability to
 disable journals (difficult, I know, not touching BTRFS, thank you) and
 probably K/V store in the future.
 
 Regards,
 
 Christian
 -- 
 Christian BalzerNetwork/Systems Engineer
 ch...@gol.com Global OnLine Japan/Fusion Communications
 http://www.gol.com/
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Performance doesn't scale well on a full ssd cluster.

2014-10-16 Thread Sebastien Han
Mark, please read this: 
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg12486.html

On 16 Oct 2014, at 19:19, Mark Wu wud...@gmail.com wrote:

 
 Thanks for the detailed information. but I am already using fio with rbd 
 engine. Almost 4 volumes can reach the peak.
 
 On 17 Oct 2014, at 01:03, wud...@gmail.com wrote:
 Thanks for the detailed information. but I am already using fio with rbd 
 engine. Almost 4 volumes can reach the peak.
 
 On 17 Oct 2014, at 00:55, Daniel Schwager daniel.schwa...@dtnet.de wrote:
 Hi Mark,
 
 maybe you will check rbd-enabled fio
 http://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html
 
 yum install ceph-devel
 git clone git://git.kernel.dk/fio.git
 cd fio ; ./configure ; make -j5 ; make install
 
 Setup the number of jobs (==clients) inside the fio config to numjobs=8
 for simulating multiple clients.
 
 regards
 Danny
 
 my test.fio:
 
 [global]
 #logging
 #write_iops_log=write_iops_log
 #write_bw_log=write_bw_log
 #write_lat_log=write_lat_log
 ioengine=rbd
 clientname=admin
 pool=rbd
 rbdname=myimage
 invalidate=0    # mandatory
 rw=randwrite
 bs=1m
 runtime=120
 iodepth=8
 numjobs=8
 time_based
 #direct=0
 
 [seq-write]
 stonewall
 rw=write
 
 #[seq-read]
 #stonewall
 #rw=read
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Micro Ceph summit during the OpenStack summit

2014-10-13 Thread Sebastien Han
Hey all,

I just saw this thread, I’ve been working on this and was about to share it: 
https://etherpad.openstack.org/p/kilo-ceph
Since the ceph etherpad is down I think we should switch to this one as an 
alternative.

Loic, feel free to work on this one and add more content :).

On 13 Oct 2014, at 05:46, Blair Bethwaite blair.bethwa...@gmail.com wrote:

 Hi Loic,
 
 I'll be there and interested to chat with other Cephers. But your pad
 isn't returning any page data...
 
 Cheers,
 
 On 11 October 2014 08:48, Loic Dachary l...@dachary.org wrote:
 Hi Ceph,
 
 TL;DR: please register at http://pad.ceph.com/p/kilo if you're attending the 
 OpenStack summit
 
 November 3 - 7 in Paris will be the OpenStack summit in Paris 
 https://www.openstack.org/summit/openstack-paris-summit-2014/, an 
 opportunity to meet with Ceph developers and users. We will have a 
 conference room dedicated to Ceph (half a day, date to be determined).
 
 Instead of preparing an abstract agenda, it is more interesting to find out 
 who will be there and what topics we would like to talk about.
 
 In the spirit of the OpenStack summit it would make sense to primarily 
 discuss the implementation proposals of various features and improvements 
 scheduled for the next Ceph release, Hammer. The online Ceph Developer 
 Summit http://ceph.com/community/ceph-developer-summit-hammer/ is scheduled 
 the week before and we will have plenty of material.
 
 If you're attending the OpenStack summit, please add yourself to 
 http://pad.ceph.com/p/kilo and list the topics you'd like to discuss. Next 
 week Josh Durgin and myself will spend some time to prepare this micro Ceph 
 summit and make it a lively and informative experience :-)
 
 Cheers
 
 --
 Loïc Dachary, Artisan Logiciel Libre
 
 
 
 
 -- 
 Cheers,
 ~Blairo
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD on openstack glance+cinder CoW?

2014-10-08 Thread Sebastien Han
Hmm, I just tried on a devstack with stable firefly, and it works for me.

Looking at your config it seems that the glance_api_version=2 is put in the 
wrong section.
Please move it to [DEFAULT] and let me know if it works.
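
For clarity, a minimal sketch of how that part of cinder.conf would look (option names taken from the config quoted below, values are placeholders); restart the cinder volume service afterwards:

[DEFAULT]
glance_api_version=2

[rbd]
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes
rbd_ceph_conf=/etc/ceph/ceph.conf
rbd_user=USER
rbd_secret_uuid=UUID
volume_backend_name=rbd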

On 08 Oct 2014, at 14:28, Nathan Stratton nat...@robotics.net wrote:

 On Tue, Oct 7, 2014 at 5:35 PM, Jonathan Proulx j...@jonproulx.com wrote:
 Hi All,
 
 We're running Firefly on the ceph side and Icehouse on the OpenStack
 side  I've pulled the recommended nova branch from
 https://github.com/angdraug/nova/tree/rbd-ephemeral-clone-stable-icehouse
 
 according to 
 http://ceph.com/docs/master/rbd/rbd-openstack/#booting-from-a-block-device:
 
 When Glance and Cinder are both using Ceph block devices, the image
 is a copy-on-write clone, so it can create a new volume quickly
 
 I'm not seeing this, even though I have glance setup in such away that
 nova does create copy on write clones when booting ephemeral instances
 of the same image.  Cinder downloads the glance RBD than pushes it
 back up as full copy.
 
 Since Glance - Nova is working (has the show_image_direct_url=True
 etc...) I suspect a problem with my Cinder config, this is what I
 added for rbd support:
 
 [rbd]
 volume_driver=cinder.volume.drivers.rbd.RBDDriver
 rbd_pool=volumes
 rbd_ceph_conf=/etc/ceph/ceph.conf
 rbd_flatten_volume_from_snapshot=false
 rbd_max_clone_depth=5
 glance_api_version=2
 rbd_user=USER
 rbd_secret_uuid=UUID
 volume_backend_name=rbd
 
 Note it does *work* just not doing CoW.  Am I missing something here?
 
 I am running into the same thing, when I import a temp file is created in 
 /var/lib/cinder/conversion. Everything works, it just is not CoW.
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd + openstack nova instance snapshots?

2014-10-01 Thread Sebastien Han
Hi,

Unfortunately this is expected.
If you take a snapshot you should not expect a clone but a RBD snapshot.

Please see this BP: 
https://blueprints.launchpad.net/nova/+spec/implement-rbd-snapshots-instead-of-qemu-snapshots

A major part of the code is ready, however we missed nova-specs feature freeze 
so we haven’t proposed anything for Juno.
So we will push something for Kilo.

On 01 Oct 2014, at 06:12, Jonathan Proulx j...@jonproulx.com wrote:

 Hi All,
 
 I'm working on integrating our new Ceph cluster with our older
 OpenStack infrastructure.  It's going pretty well so far but looking
 to check my expectations.
 
 We're running Firefly on the ceph side and Icehouse on the OpenStack
 side.  I've pulled the recommnded nova branch from
 https://github.com/angdraug/nova/tree/rbd-ephemeral-clone-stable-icehouse
 on my test nova nodes and have happily gotten instances booting from
 CoW clones of images stored in glance's rbd pool.
 
 I notice if I take a snapshot of that instance however, rather than
 making a clone as I'd hoped the hypervisor is pulling down a copy of
 the instance rbd to local disk then shipping that full sized raw image
 back to glance to be uploaded back into rbd.
 
 Is this expected?  Am I misconfiguring some thing (glance and nova are
 using different pools, which works to launch a cloned instance but
 maybe doesn't work in reverse)? Is there another patch I need to pull?
 
 Thanks,
 -Jon
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd + openstack nova instance snapshots?

2014-10-01 Thread Sebastien Han
On 01 Oct 2014, at 15:26, Jonathan Proulx j...@jonproulx.com wrote:

 On Wed, Oct 1, 2014 at 2:57 AM, Sebastien Han
 sebastien@enovance.com wrote:
 Hi,
 
 Unfortunately this is expected.
 If you take a snapshot you should not expect a clone but a RBD snapshot.
 
 Unfortunate that it doesn't work, but fortunate for me I don't need to
 figure out what I'm doing wrong :)

Wait a second, let me rephrase this:  If you take a snapshot you should not 
expect a clone but a RBD snapshot. You’re not doing anything wrong here :).
If this was implemented you would get a RBD snapshot not a clone, meaning that 
the design approach to this BP is to use snapshots and not clones.

Sorry for the confusion.
Now what you get is a local snapshot on the compute node that gets streamed 
through Glance (and in Ceph).

 
 Please see this BP: 
 https://blueprints.launchpad.net/nova/+spec/implement-rbd-snapshots-instead-of-qemu-snapshots
 
 Merci,

De rien.

 -Jon


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-23 Thread Sebastien Han
What about writes with Giant?

On 18 Sep 2014, at 08:12, Zhang, Jian jian.zh...@intel.com wrote:

 Have anyone ever testing multi volume performance on a *FULL* SSD setup?
 We are able to get ~18K IOPS for 4K random read on a single volume with fio 
 (with rbd engine) on a 12x DC3700 Setup, but only able to get ~23K (peak) 
 IOPS even with multiple volumes. 
 Seems the maximum random write performance we can get on the entire cluster 
 is quite close to single volume performance. 
 
 Thanks
 Jian
 
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
 Sebastien Han
 Sent: Tuesday, September 16, 2014 9:33 PM
 To: Alexandre DERUMIER
 Cc: ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K 
 IOPS
 
 Hi,
 
 Thanks for keeping us updated on this subject.
 dsync is definitely killing the ssd.
 
 I don't have much to add, I'm just surprised that you're only getting 5299 
 with 0.85 since I've been able to get 6,4K, well I was using the 200GB model, 
 that might explain this.
 
 
 On 12 Sep 2014, at 16:32, Alexandre DERUMIER aderum...@odiso.com wrote:
 
 here the results for the intel s3500
 
 max performance is with ceph 0.85 + optracker disabled.
 intel s3500 don't have d_sync problem like crucial
 
 %util show almost 100% for read and write, so maybe the ssd disk performance 
 is the limit.
 
 I have some stec zeusram 8GB in stock (I used them for zfs zil), I'll try to 
 bench them next week.
 
 
 
 
 
 
 INTEL s3500
 ---
 raw disk
 
 
 randread: fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k 
 --iodepth=32 --group_reporting --invalidate=0 --name=abc 
 --ioengine=aio bw=288207KB/s, iops=72051
 
 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz 
 avgqu-sz   await r_await w_await  svctm  %util
 sdb   0,00 0,00 73454,000,00 293816,00 0,00 8,00 
30,960,420,420,00   0,01  99,90
 
 randwrite: fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k 
 --iodepth=32 --group_reporting --invalidate=0 --name=abc --ioengine=aio 
 --sync=1 bw=48131KB/s, iops=12032
 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz 
 avgqu-sz   await r_await w_await  svctm  %util
 sdb   0,00 0,000,00 24120,00 0,00 48240,00 4,00  
2,080,090,000,09   0,04 100,00
 
 
 ceph 0.80
 -
 randread: no tuning:  bw=24578KB/s, iops=6144
 
 
 randwrite: bw=10358KB/s, iops=2589
 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz 
 avgqu-sz   await r_await w_await  svctm  %util
 sdb   0,00   373,000,00 8878,00 0,00 34012,50 7,66   
   1,630,180,000,18   0,06  50,90
 
 
 ceph 0.85 :
 -
 
 randread :  bw=41406KB/s, iops=10351
 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz 
 avgqu-sz   await r_await w_await  svctm  %util
 sdb   2,00 0,00 10425,000,00 41816,00 0,00 8,02  
1,360,130,130,00   0,07  75,90
 
 randwrite : bw=17204KB/s, iops=4301
 
 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz 
 avgqu-sz   await r_await w_await  svctm  %util
 sdb   0,00   333,000,00 9788,00 0,00 57909,0011,83   
   1,460,150,000,15   0,07  67,80
 
 
 ceph 0.85 tuning op_tracker=false
 
 
 randread :  bw=86537KB/s, iops=21634
 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz 
 avgqu-sz   await r_await w_await  svctm  %util
 sdb  25,00 0,00 21428,000,00 86444,00 0,00 8,07  
3,130,150,150,00   0,05  98,00
 
 randwrite:  bw=21199KB/s, iops=5299
 Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz 
 avgqu-sz   await r_await w_await  svctm  %util
 sdb   0,00  1563,000,00 9880,00 0,00 75223,5015,23   
   2,090,210,000,21   0,07  80,00
 
 
 - Mail original -
 
 De: Alexandre DERUMIER aderum...@odiso.com
 À: Cedric Lemarchand ced...@yipikai.org
 Cc: ceph-users@lists.ceph.com
 Envoyé: Vendredi 12 Septembre 2014 08:15:08
 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 
 3, 2K IOPS
 
 results of fio on rbd with kernel patch
 
 
 
 fio rbd crucial m550 1 osd 0.85 (osd_enable_op_tracker true or false, same 
 result):
 ---
 bw=12327KB/s, iops=3081
 
 So no much better than before, but this time, iostat show only 15% 
 utils, and latencies are lower
 
 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await 
 r_await w_await svctm %util sdb 0,00 29,00 0,00 3075,00 0,00 36748,50 
 23,90 0,29 0,10 0,00 0,10 0,05 15,20
 
 
 So, the write bottleneck seem to be in ceph.
 
 
 
 I will send s3500 result today
 
 - Mail original -
 
 De: Alexandre DERUMIER aderum...@odiso.com
 À

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-16 Thread Sebastien Han
 5225,00 0,00 29678,00 11,36 57,63 11,03 0,00 11,03 
 0,19 99,70
 
 
  (I don't understand exactly what %util means: 100% in both cases, even though
  it's 10x slower with ceph)
 It would be interesting if you could catch the size of writes on SSD
 during the bench through librbd (I know nmon can do that)
 Replying to myself ... I ask a bit quickly in the way we already have
 this information (29678 / 5225 = 5,68Ko), but this is irrelevant.
 
 Cheers
 
 It could be a dsync problem, result seem pretty poor
 
 # dd if=rand.file of=/dev/sdb bs=4k count=65536 oflag=direct
 65536+0 enregistrements lus
 65536+0 enregistrements écrits
 268435456 octets (268 MB) copiés, 2,77433 s, 96,8 MB/s
 
 
 # dd if=rand.file of=/dev/sdb bs=4k count=65536 oflag=dsync,direct
 ^C17228+0 enregistrements lus
 17228+0 enregistrements écrits
 70565888 octets (71 MB) copiés, 70,4098 s, 1,0 MB/s
 
 
 
 I'll do tests with intel s3500 tomorrow to compare
 
 - Mail original -
 
 De: Sebastien Han sebastien@enovance.com
 À: Warren Wang warren_w...@cable.comcast.com
 Cc: ceph-users@lists.ceph.com
 Envoyé: Lundi 8 Septembre 2014 22:58:25
 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K 
 IOPS
 
 They definitely are Warren!
 
 Thanks for bringing this here :).
 
 On 05 Sep 2014, at 23:02, Wang, Warren warren_w...@cable.comcast.com 
 wrote:
 
 +1 to what Cedric said.
 
 Anything more than a few minutes of heavy sustained writes tended to get 
 our solid state devices into a state where garbage collection could not 
 keep up. Originally we used small SSDs and did not overprovision the 
 journals by much. Manufacturers publish their SSD stats, and then in very 
 small font, state that the attained IOPS are with empty drives, and the 
 tests are only run for very short amounts of time. Even if the drives are 
 new, it's a good idea to perform an hdparm secure erase on them (so that 
 the SSD knows that the blocks are truly unused), and then overprovision 
 them. You'll know if you have a problem by watching for utilization and 
 wait data on the journals.
 
 One of the other interesting performance issues is that the Intel 10Gbe 
 NICs + default kernel that we typically use max out around 1million 
 packets/sec. It's worth tracking this metric to if you are close.
 
 I know these aren't necessarily relevant to the test parameters you gave 
 below, but they're worth keeping in mind.
 
 --
 Warren Wang
 Comcast Cloud (OpenStack)
 
 
 From: Cedric Lemarchand ced...@yipikai.org
 Date: Wednesday, September 3, 2014 at 5:14 PM
 To: ceph-users@lists.ceph.com ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 
 2K IOPS
 
 
 Le 03/09/2014 22:11, Sebastien Han a écrit :
 Hi Warren,
 
  What do you mean exactly by secure erase? At the firmware level with the
  manufacturer's software?
  SSDs were pretty new so I don’t think we hit that sort of thing. I believe
  that only aged SSDs show this behaviour, but I might be wrong.
 
 Sorry I forgot to reply to the real question ;-)
  So yes, it only comes into play after some time; in your case, if the SSD still
  delivers the write IOPS specified by the manufacturer, it won't help in
  any way.
 
 But it seems this practice is nowadays increasingly used.
 
 Cheers
 On 02 Sep 2014, at 18:23, Wang, Warren warren_w...@cable.comcast.com
 wrote:
 
 
 Hi Sebastien,
 
 Something I didn't see in the thread so far, did you secure erase the 
 SSDs before they got used? I assume these were probably repurposed for 
 this test. We have seen some pretty significant garbage collection issue 
 on various SSD and other forms of solid state storage to the point where 
 we are overprovisioning pretty much every solid state device now. By as 
 much as 50% to handle sustained write operations. Especially important 
 for the journals, as we've found.
 
 Maybe not an issue on the short fio run below, but certainly evident on 
 longer runs or lots of historical data on the drives. The max 
 transaction time looks pretty good for your test. Something to consider 
 though.
 
 Warren
 
 -Original Message-
 From: ceph-users [
 mailto:ceph-users-boun...@lists.ceph.com
 ] On Behalf Of Sebastien Han
 Sent: Thursday, August 28, 2014 12:12 PM
 To: ceph-users
 Cc: Mark Nelson
 Subject: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 
 2K IOPS
 
 Hey all,
 
 It has been a while since the last thread performance related on the ML 
 :p I've been running some experiment to see how much I can get from an 
 SSD on a Ceph cluster.
 To achieve that I did something pretty simple:
 
 * Debian wheezy 7.6
 * kernel from debian 3.14-0.bpo.2-amd64
 * 1 cluster, 3 mons (i'd like to keep this realistic since in a real 
 deployment i'll use 3)
 * 1 OSD backed by an SSD (journal and osd data on the same device)
 * 1 replica count of 1
 * partitions are perfectly aligned
 * io scheduler is set to noop but deadline was showing the same results
 * no updatedb running
 
 About the box:
 
 * 32GB

Re: [ceph-users] vdb busy error when attaching to instance

2014-09-16 Thread Sebastien Han
Did you follow this ceph.com/docs/master/rbd/rbd-openstack/ to configure your 
env?
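
One thing worth double-checking from that guide is that every rbd_secret_uuid used in cinder.conf also exists as a libvirt secret on each compute node. A rough sketch, reusing the uuid and client name from the config below (adjust to your setup):

cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
  <uuid>34c88ed2-1cf6-446d-8564-f888934eec35</uuid>
  <usage type='ceph'>
    <name>client.volumes secret</name>
  </usage>
</secret>
EOF
virsh secret-define --file secret.xml
virsh secret-set-value --secret 34c88ed2-1cf6-446d-8564-f888934eec35 \
  --base64 $(ceph auth get-key client.volumes)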

On 12 Sep 2014, at 14:38, m.channappa.nega...@accenture.com wrote:

 Hello Team,
  
 I have configured ceph as a multibackend for openstack.
  
 I have created 2 pools .
 1.   Volumes (replication size =3 )
 2.   poolb (replication size =2 )
  
 Below is the details from /etc/cinder/cinder.conf
  
 enabled_backends=rbd-ceph,rbd-cephrep
 [rbd-ceph]
 volume_driver=cinder.volume.drivers.rbd.RBDDriver
 rbd_pool=volumes
 volume_backend_name=ceph
 rbd_user=volumes
 rbd_secret_uuid=34c88ed2-1cf6-446d-8564-f888934eec35
 volumes_dir=/var/lib/cinder/volumes
 [rbd-cephrep]
 volume_driver=cinder.volume.drivers.rbd.RBDDriver
 rbd_pool=poolb
 volume_backend_name=ceph1
 rbd_user=poolb
 rbd_secret_uuid=d62b0df6-ee26-46f0-8d90-4ef4d55caa5b
 volumes_dir=/var/lib/cinder/volumes1
  
 when I am attaching a volume to a instance I am getting “DeviceIsBusy: The 
 supplied device (vdb) is busy” error.
  
 Please let me know how to correct this..
  
 Regards,
 Malleshi CN
 
 
 This message is for the designated recipient only and may contain privileged, 
 proprietary, or otherwise confidential information. If you have received it 
 in error, please notify the sender immediately and delete the original. Any 
 other use of the e-mail by you is prohibited. Where allowed by local law, 
 electronic communications with Accenture and its affiliates, including e-mail 
 and instant messaging (including content), may be scanned by our systems for 
 the purposes of information security and assessment of internal compliance 
 with Accenture policy. 
 __
 
 www.accenture.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-08 Thread Sebastien Han
They definitely are Warren!

Thanks for bringing this here :).

On 05 Sep 2014, at 23:02, Wang, Warren warren_w...@cable.comcast.com wrote:

 +1 to what Cedric said.
 
 Anything more than a few minutes of heavy sustained writes tended to get our 
 solid state devices into a state where garbage collection could not keep up. 
 Originally we used small SSDs and did not overprovision the journals by much. 
 Manufacturers publish their SSD stats, and then in very small font, state 
 that the attained IOPS are with empty drives, and the tests are only run for 
 very short amounts of time.  Even if the drives are new, it's a good idea to 
 perform an hdparm secure erase on them (so that the SSD knows that the blocks 
 are truly unused), and then overprovision them. You'll know if you have a 
 problem by watching for utilization and wait data on the journals.
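
For reference, a secure erase along those lines would look roughly like this (it destroys all data, the drive must not report "frozen", and the device name is an example):

hdparm -I /dev/sdX | grep -i frozen
hdparm --user-master u --security-set-pass p /dev/sdX
hdparm --user-master u --security-erase p /dev/sdX
# then partition only part of the device (e.g. 50-80%) and leave the rest unallocated as over-provisioning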
 
  One of the other interesting performance issues is that the Intel 10Gbe NICs
  + default kernel that we typically use max out around 1 million packets/sec.
  It's worth tracking this metric too if you are close.
 
 I know these aren't necessarily relevant to the test parameters you gave 
 below, but they're worth keeping in mind.
 
 -- 
 Warren Wang
 Comcast Cloud (OpenStack)
 
 
 From: Cedric Lemarchand ced...@yipikai.org
 Date: Wednesday, September 3, 2014 at 5:14 PM
 To: ceph-users@lists.ceph.com ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K 
 IOPS
 
 
 Le 03/09/2014 22:11, Sebastien Han a écrit :
 Hi Warren,
 
  What do you mean exactly by secure erase? At the firmware level with the
  manufacturer's software?
  SSDs were pretty new so I don’t think we hit that sort of thing. I believe that
  only aged SSDs show this behaviour, but I might be wrong.
 
 Sorry I forgot to reply to the real question ;-)
  So yes, it only comes into play after some time; in your case, if the SSD still
  delivers the write IOPS specified by the manufacturer, it won't help in
  any way.
 
 But it seems this practice is nowadays increasingly used.
 
 Cheers
 On 02 Sep 2014, at 18:23, Wang, Warren warren_w...@cable.comcast.com
  wrote:
 
 
 Hi Sebastien,
 
 Something I didn't see in the thread so far, did you secure erase the SSDs 
 before they got used? I assume these were probably repurposed for this 
 test. We have seen some pretty significant garbage collection issue on 
 various SSD and other forms of solid state storage to the point where we 
 are overprovisioning pretty much every solid state device now. By as much 
 as 50% to handle sustained write operations. Especially important for the 
 journals, as we've found.
 
 Maybe not an issue on the short fio run below, but certainly evident on 
 longer runs or lots of historical data on the drives. The max transaction 
 time looks pretty good for your test. Something to consider though.
 
 Warren
 
 -Original Message-
 From: ceph-users [
 mailto:ceph-users-boun...@lists.ceph.com
 ] On Behalf Of Sebastien Han
 Sent: Thursday, August 28, 2014 12:12 PM
 To: ceph-users
 Cc: Mark Nelson
 Subject: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K 
 IOPS
 
 Hey all,
 
  It has been a while since the last performance-related thread on the ML :p
  I've been running some experiments to see how much I can get from an SSD on
  a Ceph cluster.
 To achieve that I did something pretty simple:
 
 * Debian wheezy 7.6
 * kernel from debian 3.14-0.bpo.2-amd64
 * 1 cluster, 3 mons (i'd like to keep this realistic since in a real 
 deployment i'll use 3)
 * 1 OSD backed by an SSD (journal and osd data on the same device)
 * 1 replica count of 1
 * partitions are perfectly aligned
  * io scheduler is set to noop but deadline was showing the same results
 * no updatedb running
 
 About the box:
 
 * 32GB of RAM
 * 12 cores with HT @ 2,4 GHz
 * WB cache is enabled on the controller
 * 10Gbps network (doesn't help here)
 
 The SSD is a 200G Intel DC S3700 and is capable of delivering around 29K 
 iops with random 4k writes (my fio results) As a benchmark tool I used fio 
 with the rbd engine (thanks deutsche telekom guys!).
 
  O_DIRECT and D_SYNC don’t seem to be a problem for the SSD:
 
 # dd if=/dev/urandom of=rand.file bs=4k count=65536
 65536+0 records in
 65536+0 records out
 268435456 bytes (268 MB) copied, 29.5477 s, 9.1 MB/s
 
 # du -sh rand.file
 256Mrand.file
 
 # dd if=rand.file of=/dev/sdo bs=4k count=65536 oflag=dsync,direct
 65536+0 records in
 65536+0 records out
 268435456 bytes (268 MB) copied, 2.73628 s, 98.1 MB/s
 
 See my ceph.conf:
 
 [global]
  auth cluster required = cephx
  auth service required = cephx
  auth client required = cephx
  fsid = 857b8609-8c9b-499e-9161-2ea67ba51c97
  osd pool default pg num = 4096
  osd pool default pgp num = 4096
  osd pool default size = 2
  osd crush chooseleaf type = 0
 
   debug lockdep = 0/0
debug context = 0/0
debug crush = 0/0
debug buffer = 0/0
debug timer = 0/0
debug journaler = 0/0

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-02 Thread Sebastien Han
Hey,

Well I ran an fio job that simulates (more or less) what ceph is doing
(journal writes with dsync and o_direct), and the SSD gave me 29K IOPS too.
I could do this, but for me it definitely looks like a major waste since we 
don’t even get a third of the ssd performance.
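
A job along these lines approximates that write pattern; this is a sketch, not the exact job file, and the device name needs adjusting (it overwrites the device):

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
    --iodepth=1 --numjobs=1 --runtime=60 --time_based \
    --group_reporting --name=journal-sim --ioengine=libaio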

On 02 Sep 2014, at 09:38, Alexandre DERUMIER aderum...@odiso.com wrote:

 Hi Sebastien,
 
 I got 6340 IOPS on a single OSD SSD. (journal and data on the same 
 partition).
 
  Shouldn't it be better to have 2 partitions, 1 for the journal and 1 for data?
 
 (I'm thinking about filesystem write syncs)
 
 
 
 
 - Mail original -
 
 De: Sebastien Han sebastien@enovance.com
 À: Somnath Roy somnath@sandisk.com
 Cc: ceph-users@lists.ceph.com
 Envoyé: Mardi 2 Septembre 2014 02:19:16
 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K 
 IOPS
 
 Mark and all, Ceph IOPS performance has definitely improved with Giant.
 With this version: ceph version 0.84-940-g3215c52 
 (3215c520e1306f50d0094b5646636c02456c9df4) on Debian 7.6 with Kernel 3.14-0.
 
 I got 6340 IOPS on a single OSD SSD. (journal and data on the same partition).
 So basically twice the amount of IOPS that I was getting with Firefly.
 
 Rand reads 4k went from 12431 to 10201, so I’m a bit disappointed here.
 
 The SSD is still under-utilised:
 
 Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await 
 w_await svctm %util
 sdp1 0.00 540.37 0.00 5902.30 0.00 47.14 16.36 0.87 0.15 0.00 0.15 0.07 40.15
 sdp2 0.00 0.00 0.00 4454.67 0.00 49.16 22.60 0.31 0.07 0.00 0.07 0.07 30.61
 
 Thanks a ton for all your comments and assistance guys :).
 
  One last question for Sage (or others that might know): what’s the status of
  the F2FS implementation? (or maybe we are waiting for F2FS to provide atomic
  transactions?)
 I tried to run the OSD on f2fs however ceph-osd mkfs got stuck on a xattr 
 test:
 
 fremovexattr(10, user.test@5848273) = 0
 
 On 01 Sep 2014, at 11:13, Sebastien Han sebastien@enovance.com wrote:
 
 Mark, thanks a lot for experimenting this for me.
 I’m gonna try master soon and will tell you how much I can get.
 
 It’s interesting to see that using 2 SSDs brings up more performance, even 
 both SSDs are under-utilized…
 They should be able to sustain both loads at the same time (journal and osd 
 data).
 
 On 01 Sep 2014, at 09:51, Somnath Roy somnath@sandisk.com wrote:
 
 As I said, 107K with IOs serving from memory, not hitting the disk..
 
 From: Jian Zhang [mailto:amberzhan...@gmail.com]
 Sent: Sunday, August 31, 2014 8:54 PM
 To: Somnath Roy
 Cc: Haomai Wang; ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 
 2K IOPS
 
 Somnath,
 on the small workload performance, 107k is higher than the theoretical IOPS 
 of 520, any idea why?
 
 
 
 Single client is ~14K iops, but scaling as number of clients increases. 
 10 clients ~107K iops. ~25 cpu cores are used.
 
 
 2014-09-01 11:52 GMT+08:00 Jian Zhang amberzhan...@gmail.com:
 Somnath,
 on the small workload performance,
 
 
 
 2014-08-29 14:37 GMT+08:00 Somnath Roy somnath@sandisk.com:
 
 Thanks Haomai !
 
 Here is some of the data from my setup.
 
 
 
 --
 
 Set up:
 
 
 
 
 
 32 core cpu with HT enabled, 128 GB RAM, one SSD (both journal and data) - 
 one OSD. 5 client m/c with 12 core cpu and each running two instances of 
 ceph_smalliobench (10 clients total). Network is 10GbE.
 
 
 
 Workload:
 
 -
 
 
 
 Small workload – 20K objects with 4K size and io_size is also 4K RR. The 
 intent is to serve the ios from memory so that it can uncover the 
 performance problems within single OSD.
 
 
 
 Results from Firefly:
 
 --
 
 
 
 Single client throughput is ~14K iops, but as the number of client 
 increases the aggregated throughput is not increasing. 10 clients ~15K 
 iops. ~9-10 cpu cores are used.
 
 
 
 Result with latest master:
 
 --
 
 
 
 Single client is ~14K iops, but scaling as number of clients increases. 10 
 clients ~107K iops. ~25 cpu cores are used.
 
 
 
 --
 
 
 
 
 
 More realistic workload:
 
 -
 
 Let’s see how it is performing while  90% of the ios are served from disks
 
 Setup:
 
 ---
 
 40 cpu core server as a cluster node (single node cluster) with 64 GB RAM. 
 8 SSDs - 8 OSDs. One similar node for monitor and rgw. Another node for 
 client running fio/vdbench. 4 rbds are configured with ‘noshare’ option. 40 
 GbE network
 
 
 
 Workload:
 
 
 
 
 
 8 SSDs are populated

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-02 Thread Sebastien Han
@Dan, oops, my bad, I forgot to use these settings. I’ll try again and see how
much I can get on the read performance side.
@Mark, thanks again. Yes, I believe that due to some hardware variance we get
different results; I won’t say the deviation is small, but results are close
enough to say that we experience the same limitations (at the Ceph level).
@Cédric, yes I did and what fio was showing was consistent with the iostat 
output, same goes for disk utilisation.


On 02 Sep 2014, at 12:44, Cédric Lemarchand c.lemarch...@yipikai.org wrote:

 Hi Sebastian,
 
 Le 2 sept. 2014 à 10:41, Sebastien Han sebastien@enovance.com a écrit :
 
 Hey,
 
 Well I ran an fio job that simulates the (more or less) what ceph is doing 
 (journal writes with dsync and o_direct) and the ssd gave me 29K IOPS too.
 I could do this, but for me it definitely looks like a major waste since we 
 don’t even get a third of the ssd performance.
 
 Did you had a look if the raw ssd IOPS (using iostat -x for example) show 
 same results during fio bench ?
 
 Cheers 
 
 
 On 02 Sep 2014, at 09:38, Alexandre DERUMIER aderum...@odiso.com wrote:
 
 Hi Sebastien,
 
 I got 6340 IOPS on a single OSD SSD. (journal and data on the same 
 partition).
 
 Shouldn't it better to have 2 partitions, 1 for journal and 1 for datas ?
 
 (I'm thinking about filesystem write syncs)
 
 
 
 
 - Mail original -
 
 De: Sebastien Han sebastien@enovance.com
 À: Somnath Roy somnath@sandisk.com
 Cc: ceph-users@lists.ceph.com
 Envoyé: Mardi 2 Septembre 2014 02:19:16
 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K 
 IOPS
 
 Mark and all, Ceph IOPS performance has definitely improved with Giant.
 With this version: ceph version 0.84-940-g3215c52 
 (3215c520e1306f50d0094b5646636c02456c9df4) on Debian 7.6 with Kernel 3.14-0.
 
 I got 6340 IOPS on a single OSD SSD. (journal and data on the same 
 partition).
 So basically twice the amount of IOPS that I was getting with Firefly.
 
 Rand reads 4k went from 12431 to 10201, so I’m a bit disappointed here.
 
 The SSD is still under-utilised:
 
 Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await 
 w_await svctm %util
 sdp1 0.00 540.37 0.00 5902.30 0.00 47.14 16.36 0.87 0.15 0.00 0.15 0.07 
 40.15
 sdp2 0.00 0.00 0.00 4454.67 0.00 49.16 22.60 0.31 0.07 0.00 0.07 0.07 30.61
 
 Thanks a ton for all your comments and assistance guys :).
 
  One last question for Sage (or others that might know): what’s the status
  of the F2FS implementation? (or maybe we are waiting for F2FS to provide
  atomic transactions?)
 I tried to run the OSD on f2fs however ceph-osd mkfs got stuck on a xattr 
 test:
 
 fremovexattr(10, user.test@5848273) = 0
 
 On 01 Sep 2014, at 11:13, Sebastien Han sebastien@enovance.com wrote:
 
 Mark, thanks a lot for experimenting this for me.
 I’m gonna try master soon and will tell you how much I can get.
 
 It’s interesting to see that using 2 SSDs brings up more performance, even 
 both SSDs are under-utilized…
 They should be able to sustain both loads at the same time (journal and 
 osd data).
 
 On 01 Sep 2014, at 09:51, Somnath Roy somnath@sandisk.com wrote:
 
 As I said, 107K with IOs serving from memory, not hitting the disk..
 
 From: Jian Zhang [mailto:amberzhan...@gmail.com]
 Sent: Sunday, August 31, 2014 8:54 PM
 To: Somnath Roy
 Cc: Haomai Wang; ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 
 3, 2K IOPS
 
 Somnath,
 on the small workload performance, 107k is higher than the theoretical 
 IOPS of 520, any idea why?
 
 
 
 Single client is ~14K iops, but scaling as number of clients increases. 
 10 clients ~107K iops. ~25 cpu cores are used.
 
 
 2014-09-01 11:52 GMT+08:00 Jian Zhang amberzhan...@gmail.com:
 Somnath,
 on the small workload performance,
 
 
 
 2014-08-29 14:37 GMT+08:00 Somnath Roy somnath@sandisk.com:
 
 Thanks Haomai !
 
 Here is some of the data from my setup.
 
 
 
 --
 
 Set up:
 
 
 
 
 
 32 core cpu with HT enabled, 128 GB RAM, one SSD (both journal and data) 
 - one OSD. 5 client m/c with 12 core cpu and each running two instances 
 of ceph_smalliobench (10 clients total). Network is 10GbE.
 
 
 
 Workload:
 
 -
 
 
 
 Small workload – 20K objects with 4K size and io_size is also 4K RR. The 
 intent is to serve the ios from memory so that it can uncover the 
 performance problems within single OSD.
 
 
 
 Results from Firefly:
 
 --
 
 
 
 Single client throughput is ~14K iops, but as the number of client 
 increases the aggregated throughput is not increasing. 10 clients ~15K 
 iops. ~9-10 cpu cores are used.
 
 
 
 Result with latest master:
 
 --
 
 
 
 Single client is ~14K iops

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-02 Thread Sebastien Han
It would be nice if you could post the results :)
Yup gitbuilder is available on debian 7.6 wheezy.


On 02 Sep 2014, at 17:55, Alexandre DERUMIER aderum...@odiso.com wrote:

 I'm going to install next week a small 3 nodes test ssd cluster,
 
 I have some intel s3500 and crucial m550.
 I'll try to bench them with firefly and master.
 
 Is a debian wheezy gitbuilder repository available ? (I'm a bit lazy to 
 compile all packages)
 
 
 - Mail original -
 
 De: Sebastien Han sebastien@enovance.com
 À: Alexandre DERUMIER aderum...@odiso.com
 Cc: ceph-users@lists.ceph.com, Cédric Lemarchand c.lemarch...@yipikai.org
 Envoyé: Mardi 2 Septembre 2014 15:25:05
 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K 
 IOPS
 
 Well the last time I ran two processes in parallel I got half the total 
 amount available so 1,7k per client.
 
 On 02 Sep 2014, at 15:19, Alexandre DERUMIER aderum...@odiso.com wrote:
 
 
 Do you have same results, if you launch 2 fio benchs in parallel on 2 
 differents rbd volumes ?
 
 
 - Mail original -
 
 De: Sebastien Han sebastien@enovance.com
 À: Cédric Lemarchand c.lemarch...@yipikai.org
 Cc: Alexandre DERUMIER aderum...@odiso.com, ceph-users@lists.ceph.com
 Envoyé: Mardi 2 Septembre 2014 13:59:13
 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K 
 IOPS
 
 @Dan, oops, my bad, I forgot to use these settings. I’ll try again and see how
 much I can get on the read performance side.
 @Mark, thanks again. Yes, I believe that due to some hardware variance we get
 different results; I won’t say the deviation is small, but results are close
 enough to say that we experience the same limitations (at the Ceph level).
 @Cédric, yes I did and what fio was showing was consistent with the iostat 
 output, same goes for disk utilisation.
 
 
 On 02 Sep 2014, at 12:44, Cédric Lemarchand c.lemarch...@yipikai.org wrote:
 
 Hi Sebastian,
 
 Le 2 sept. 2014 à 10:41, Sebastien Han sebastien@enovance.com a 
 écrit :
 
 Hey,
 
 Well I ran an fio job that simulates the (more or less) what ceph is doing 
 (journal writes with dsync and o_direct) and the ssd gave me 29K IOPS too.
 I could do this, but for me it definitely looks like a major waste since 
 we don’t even get a third of the ssd performance.
 
 Did you had a look if the raw ssd IOPS (using iostat -x for example) show 
 same results during fio bench ?
 
 Cheers
 
 
 On 02 Sep 2014, at 09:38, Alexandre DERUMIER aderum...@odiso.com wrote:
 
 Hi Sebastien,
 
 I got 6340 IOPS on a single OSD SSD. (journal and data on the same 
 partition).
 
 Shouldn't it better to have 2 partitions, 1 for journal and 1 for datas ?
 
 (I'm thinking about filesystem write syncs)
 
 
 
 
 - Mail original -
 
 De: Sebastien Han sebastien@enovance.com
 À: Somnath Roy somnath@sandisk.com
 Cc: ceph-users@lists.ceph.com
 Envoyé: Mardi 2 Septembre 2014 02:19:16
 Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 
 2K IOPS
 
 Mark and all, Ceph IOPS performance has definitely improved with Giant.
 With this version: ceph version 0.84-940-g3215c52 
 (3215c520e1306f50d0094b5646636c02456c9df4) on Debian 7.6 with Kernel 
 3.14-0.
 
 I got 6340 IOPS on a single OSD SSD. (journal and data on the same 
 partition).
 So basically twice the amount of IOPS that I was getting with Firefly.
 
 Rand reads 4k went from 12431 to 10201, so I’m a bit disappointed here.
 
 The SSD is still under-utilised:
 
 Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await 
 w_await svctm %util
 sdp1 0.00 540.37 0.00 5902.30 0.00 47.14 16.36 0.87 0.15 0.00 0.15 0.07 
 40.15
 sdp2 0.00 0.00 0.00 4454.67 0.00 49.16 22.60 0.31 0.07 0.00 0.07 0.07 
 30.61
 
 Thanks a ton for all your comments and assistance guys :).
 
  One last question for Sage (or others that might know): what’s the status
  of the F2FS implementation? (or maybe we are waiting for F2FS to provide
  atomic transactions?)
 I tried to run the OSD on f2fs however ceph-osd mkfs got stuck on a xattr 
 test:
 
 fremovexattr(10, user.test@5848273) = 0
 
 On 01 Sep 2014, at 11:13, Sebastien Han sebastien@enovance.com 
 wrote:
 
 Mark, thanks a lot for experimenting this for me.
 I’m gonna try master soon and will tell you how much I can get.
 
 It’s interesting to see that using 2 SSDs brings up more performance, 
 even both SSDs are under-utilized…
 They should be able to sustain both loads at the same time (journal and 
 osd data).
 
 On 01 Sep 2014, at 09:51, Somnath Roy somnath@sandisk.com wrote:
 
 As I said, 107K with IOs serving from memory, not hitting the disk..
 
 From: Jian Zhang [mailto:amberzhan...@gmail.com]
 Sent: Sunday, August 31, 2014 8:54 PM
 To: Somnath Roy
 Cc: Haomai Wang; ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 
 3, 2K IOPS
 
 Somnath,
 on the small workload performance, 107k is higher than the theoretical 
 IOPS of 520, any idea why

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-09-01 Thread Sebastien Han
Mark and all, Ceph IOPS performance has definitely improved with Giant.
With this version: ceph version 0.84-940-g3215c52 
(3215c520e1306f50d0094b5646636c02456c9df4) on Debian 7.6 with Kernel 3.14-0.

I got 6340 IOPS on a single OSD SSD. (journal and data on the same partition).
So basically twice the amount of IOPS that I was getting with Firefly.

Rand reads 4k went from 12431 to 10201, so I’m a bit disappointed here.

The SSD is still under-utilised:

Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz 
avgqu-sz   await r_await w_await  svctm  %util
sdp1  0.00   540.370.00 5902.30 0.0047.1416.36 
0.870.150.000.15   0.07  40.15
sdp2  0.00 0.000.00 4454.67 0.0049.1622.60 
0.310.070.000.07   0.07  30.61

Thanks a ton for all your comments and assistance guys :).

One last question for Sage (or others that might know): what’s the status of the
F2FS implementation? (or maybe we are waiting for F2FS to provide atomic
transactions?)
I tried to run the OSD on f2fs however ceph-osd mkfs got stuck on a xattr test:

fremovexattr(10, user.test@5848273)   = 0

On 01 Sep 2014, at 11:13, Sebastien Han sebastien@enovance.com wrote:

 Mark, thanks a lot for experimenting this for me.
 I’m gonna try master soon and will tell you how much I can get. 
 
 It’s interesting to see that using 2 SSDs brings up more performance, even 
 both SSDs are under-utilized…
 They should be able to sustain both loads at the same time (journal and osd 
 data).
 
 On 01 Sep 2014, at 09:51, Somnath Roy somnath@sandisk.com wrote:
 
 As I said, 107K with IOs serving from memory, not hitting the disk..
 
 From: Jian Zhang [mailto:amberzhan...@gmail.com] 
 Sent: Sunday, August 31, 2014 8:54 PM
 To: Somnath Roy
 Cc: Haomai Wang; ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 
 2K IOPS
 
 Somnath,
 on the small workload performance, 107k is higher than the theoretical IOPS 
 of 520, any idea why? 
 
 
 
 Single client is ~14K iops, but scaling as number of clients increases. 10 
 clients ~107K iops. ~25 cpu cores are used.
 
 
 2014-09-01 11:52 GMT+08:00 Jian Zhang amberzhan...@gmail.com:
 Somnath,
 on the small workload performance, 
 
 
 
 2014-08-29 14:37 GMT+08:00 Somnath Roy somnath@sandisk.com:
 
 Thanks Haomai !
 
 Here is some of the data from my setup.
 
 
 
 --
 
 Set up:
 
 
 
 
 
 32 core cpu with HT enabled, 128 GB RAM, one SSD (both journal and data) - 
 one OSD. 5 client m/c with 12 core cpu and each running two instances of 
 ceph_smalliobench (10 clients total). Network is 10GbE.
 
 
 
 Workload:
 
 -
 
 
 
 Small workload – 20K objects with 4K size and io_size is also 4K RR. The 
 intent is to serve the ios from memory so that it can uncover the 
 performance problems within single OSD.
 
 
 
 Results from Firefly:
 
 --
 
 
 
 Single client throughput is ~14K iops, but as the number of client increases 
 the aggregated throughput is not increasing. 10 clients ~15K iops. ~9-10 cpu 
 cores are used.
 
 
 
 Result with latest master:
 
 --
 
 
 
 Single client is ~14K iops, but scaling as number of clients increases. 10 
 clients ~107K iops. ~25 cpu cores are used.
 
 
 
 --
 
 
 
 
 
 More realistic workload:
 
 -
 
 Let’s see how it is performing while  90% of the ios are served from disks
 
 Setup:
 
 ---
 
 40 cpu core server as a cluster node (single node cluster) with 64 GB RAM. 8 
 SSDs - 8 OSDs. One similar node for monitor and rgw. Another node for 
 client running fio/vdbench. 4 rbds are configured with ‘noshare’ option. 40 
 GbE network
 
 
 
 Workload:
 
 
 
 
 
 8 SSDs are populated , so, 8 * 800GB = ~6.4 TB of data.  Io_size = 4K RR.
 
 
 
 Results from Firefly:
 
 
 
 
 
 Aggregated output while 4 rbd clients stressing the cluster in parallel is 
 ~20-25K IOPS , cpu cores used ~8-10 cores (may be less can’t remember 
 precisely)
 
 
 
 Results from latest master:
 
 
 
 
 
 Aggregated output while 4 rbd clients stressing the cluster in parallel is 
 ~120K IOPS , cpu is 7% idle i.e  ~37-38 cpu cores.
 
 
 
 Hope this helps.
 
 
 
 Thanks  Regards
 
 Somnath
 
 
 
 -Original Message-
 From: Haomai Wang [mailto:haomaiw...@gmail.com] 
 Sent: Thursday, August 28, 2014 8:01 PM
 To: Somnath Roy
 Cc: Andrey Korolyov; ceph-users@lists.ceph.com
 Subject: Re: [ceph-users

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-29 Thread Sebastien Han
Thanks a lot for the answers, even if we drifted from the main subject a little 
bit.
Thanks Somnath for sharing this; when can we expect any code that might
improve _write_ performance?

@Mark thanks trying this :)
Unfortunately using nobarrier and another dedicated SSD for the journal  (plus 
your ceph setting) didn’t bring much, now I can reach 3,5K IOPS.
By any chance, would it be possible for you to test with a single OSD SSD?

On 28 Aug 2014, at 18:11, Sebastien Han sebastien@enovance.com wrote:

 Hey all,
 
 It has been a while since the last performance-related thread on the ML :p
 I’ve been running some experiments to see how much I can get from an SSD on a
 Ceph cluster.
 To achieve that I did something pretty simple:
 
 * Debian wheezy 7.6
 * kernel from debian 3.14-0.bpo.2-amd64
 * 1 cluster, 3 mons (i’d like to keep this realistic since in a real 
 deployment i’ll use 3)
 * 1 OSD backed by an SSD (journal and osd data on the same device)
 * 1 replica count of 1
 * partitions are perfectly aligned
 * io scheduler is set to noop but deadline was showing the same results
 * no updatedb running
 
 About the box:
 
 * 32GB of RAM
 * 12 cores with HT @ 2,4 GHz
 * WB cache is enabled on the controller
 * 10Gbps network (doesn’t help here)
 
 The SSD is a 200G Intel DC S3700 and is capable of delivering around 29K iops 
 with random 4k writes (my fio results)
 As a benchmark tool I used fio with the rbd engine (thanks deutsche telekom 
 guys!).
 
 O_DIRECT and D_SYNC don’t seem to be a problem for the SSD:
 
 # dd if=/dev/urandom of=rand.file bs=4k count=65536
 65536+0 records in
 65536+0 records out
 268435456 bytes (268 MB) copied, 29.5477 s, 9.1 MB/s
 
 # du -sh rand.file
 256Mrand.file
 
 # dd if=rand.file of=/dev/sdo bs=4k count=65536 oflag=dsync,direct
 65536+0 records in
 65536+0 records out
 268435456 bytes (268 MB) copied, 2.73628 s, 98.1 MB/s
 
 See my ceph.conf:
 
 [global]
  auth cluster required = cephx
  auth service required = cephx
  auth client required = cephx
  fsid = 857b8609-8c9b-499e-9161-2ea67ba51c97
  osd pool default pg num = 4096
  osd pool default pgp num = 4096
  osd pool default size = 2
  osd crush chooseleaf type = 0
 
   debug lockdep = 0/0
debug context = 0/0
debug crush = 0/0
debug buffer = 0/0
debug timer = 0/0
debug journaler = 0/0
debug osd = 0/0
debug optracker = 0/0
debug objclass = 0/0
debug filestore = 0/0
debug journal = 0/0
debug ms = 0/0
debug monc = 0/0
debug tp = 0/0
debug auth = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug perfcounter = 0/0
debug asok = 0/0
debug throttle = 0/0
 
 [mon]
  mon osd down out interval = 600
  mon osd min down reporters = 13
[mon.ceph-01]
host = ceph-01
mon addr = 172.20.20.171
  [mon.ceph-02]
host = ceph-02
mon addr = 172.20.20.172
  [mon.ceph-03]
host = ceph-03
mon addr = 172.20.20.173
 
debug lockdep = 0/0
debug context = 0/0
debug crush = 0/0
debug buffer = 0/0
debug timer = 0/0
debug journaler = 0/0
debug osd = 0/0
debug optracker = 0/0
debug objclass = 0/0
debug filestore = 0/0
debug journal = 0/0
debug ms = 0/0
debug monc = 0/0
debug tp = 0/0
debug auth = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug perfcounter = 0/0
debug asok = 0/0
debug throttle = 0/0
 
 [osd]
  osd mkfs type = xfs
 osd mkfs options xfs = -f -i size=2048
 osd mount options xfs = rw,noatime,logbsize=256k,delaylog
  osd journal size = 20480
  cluster_network = 172.20.20.0/24
  public_network = 172.20.20.0/24
  osd mon heartbeat interval = 30
  # Performance tuning
  filestore merge threshold = 40
  filestore split multiple = 8
  osd op threads = 8
  # Recovery tuning
  osd recovery max active = 1
  osd max backfills = 1
  osd recovery op priority = 1
 
 
debug lockdep = 0/0
debug context = 0/0
debug crush = 0/0
debug buffer = 0/0
debug timer = 0/0
debug journaler = 0/0
debug osd = 0/0
debug optracker = 0/0
debug objclass = 0/0
debug filestore = 0/0
debug journal = 0/0
debug ms = 0/0
debug monc = 0/0
debug tp = 0/0
debug auth = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug perfcounter = 0/0
debug asok = 0/0
debug throttle = 0/0
 
 Disabling all debugging gained me another 200-300 IOPS.
 
 See my fio template:
 
 [global]
 #logging
 #write_iops_log=write_iops_log
 #write_bw_log=write_bw_log
 #write_lat_log=write_lat_lo
 
 time_based
 runtime=60
 
 ioengine=rbd
 clientname=admin
 pool=test
 rbdname=fio
 invalidate=0# mandatory
 #rw=randwrite
 rw=write
 bs=4k
 #bs=32m
 size=5G
 group_reporting
 
 [rbd_iodepth32

Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-29 Thread Sebastien Han
@Dan: thanks for sharing your config. With all your flags I don’t seem to get
more than 3,4K IOPS, and they even seem to slow me down :( This is really weird.
Yes, I already tried running two simultaneous processes, and each of them only
got half of the 3,4K.

@Kasper: thanks for these results, I believe some improvement could be made in 
the code as well :).

FYI I just tried on Ubuntu 12.04 and it looks a bit better because I’m getting 
iops=3783.

On 29 Aug 2014, at 13:10, Dan Van Der Ster daniel.vanders...@cern.ch wrote:

 vm.dirty_expire_centisecs


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Moving Journal to SSD

2014-08-11 Thread Sebastien Han
Hi Dane,

If you deployed with ceph-deploy, you will see that the journal is just a 
symlink.
Take a look at /var/lib/ceph/osd/osd-id/journal
The link should point to the first partition of your hard drive disk, so no 
filesystem for the journal, just a block device.

Roughly you should try:

create N partitions on your SSD for your N OSDs
ceph osd set noout
sudo service ceph stop osd.$ID
ceph-osd -i $ID --flush-journal
rm -f /var/lib/ceph/osd/osd-id/journal
ln -s /dev/ssd-partition-for-your-journal /var/lib/ceph/osd/osd-id/journal
ceph-osd -i $ID --mkjournal
sudo service ceph start osd.$ID
ceph osd unset noout

This should work.
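
A consolidated, hedged version of the above for several OSDs sharing one SSD
(device/partition names and OSD ids below are assumptions, adjust to your setup):

ceph osd set noout
for ID in 0 1 2 3 4; do
    PART=/dev/sdh$((ID + 1))                # one SSD partition per OSD
    sudo service ceph stop osd.$ID
    ceph-osd -i $ID --flush-journal         # flush the old co-located journal
    rm -f /var/lib/ceph/osd/ceph-$ID/journal
    ln -s $PART /var/lib/ceph/osd/ceph-$ID/journal
    ceph-osd -i $ID --mkjournal             # initialise the journal on the SSD
    sudo service ceph start osd.$ID
done
ceph osd unset noout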

Cheers.

On 11 Aug 2014, at 18:36, Dane Elwell dane.elw...@gmail.com wrote:

 Hi list,
 
 Our current setup has OSDs with their journal sharing the same disk as
 the data, and we've reached the point we're outgrowing this setup.
 We're currently vacating disks in order to replace them with SSDs and
 recreate the OSD journals on the SSDs in a 5:1 ratio of spinners to
 SSDs.
 
 I've read in a few places that it's possible to move the OSD journals
 without losing data on the OSDs, which is great, however none of the
 stuff I've read seems to cover our case.
 
 We installed Ceph using ceph-deploy, putting the journals on the same
 disks. ceph-deploy doesn't populate a ceph.conf file fully, so we
 don't have e.g. individual OSD entries in there.
 
 If I'm understanding this correctly, the Ceph disks are automounted by
 udev rules from /lib/udev/rules.d/95-ceph-osd.rules, and this mounts
 the OSD disk (partition 1) then mounts the journal under /journal
 (partition 2 of the same disk).
 
 That's all well and good, but as I now want to move the journal, how
 do I go about telling Ceph where the new journals are located so they
 can be mounted in the right location? Do I need to populate ceph.conf
 with individual entries for all OSDs or is there a way I can make udev
 do all the heavy lifting?
 
 Regards
 
 Dane
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Cheers.
 
Sébastien Han 
Cloud Architect 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] qemu image create failed

2014-07-15 Thread Sebastien Han
Can you connect to your Ceph cluster?
You can pass options to the cmd line like this:

$ qemu-img create -f rbd rbd:instances/vmdisk01:id=leseb:conf=/etc/ceph/ceph-leseb.conf 2G
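
A quick hedged way to check connectivity/auth first (client id and conf path
reused from the command above):

$ ceph -s --id leseb --conf /etc/ceph/ceph-leseb.conf
$ rbd ls instances --id leseb --conf /etc/ceph/ceph-leseb.conf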

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 12 Jul 2014, at 03:06, Yonghua Peng sys...@mail2000.us wrote:

 Anybody knows this issue? thanks.
 
 
 
 
 Fri, 11 Jul 2014 10:26:47 +0800 from Yonghua Peng sys...@mail2000.us:
 Hi,
 
 I try to create a qemu image, but got failed.
 
 ceph@ceph:~/my-cluster$ qemu-img create -f rbd rbd:rbd/qemu 2G
 Formatting 'rbd:rbd/qemu', fmt=rbd size=2147483648 cluster_size=0 
 qemu-img: error connecting
 qemu-img: rbd:rbd/qemu: error while creating rbd: Input/output error
 
 Can you tell what's the problem?
 
 Thanks.
 
 -- 
 We are hiring cloud Dev/Ops, more details please see: YY Cloud Jobs
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Is it still unsafe to map a RBD device on an OSD server?

2014-06-10 Thread Sebastien Han
Hi all,

A couple of years ago, I heard that it wasn’t safe to map a krbd block on an 
OSD host.
It was more or less like mounting an NFS export on the NFS server: we can 
potentially end up with some deadlocks.

At least, I tried again recently and didn’t encounter any problem.

What do you think?

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] question about feature set mismatch

2014-06-10 Thread Sebastien Han
FYI I encountered the same problem for krbd, removing the ec pool didn’t solve 
my problem.
I’m running 3.13

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 08 Jun 2014, at 10:19, Ilya Dryomov ilya.dryo...@inktank.com wrote:

 On Sun, Jun 8, 2014 at 11:27 AM, Igor Krstic puh.dobri...@hotmail.com wrote:
 On Fri, 2014-06-06 at 17:40 +0400, Ilya Dryomov wrote:
 On Fri, Jun 6, 2014 at 4:34 PM, Kenneth Waegeman
 kenneth.waege...@ugent.be wrote:
 
 - Message from Igor Krstic igor.z.krs...@gmail.com -
   Date: Fri, 06 Jun 2014 13:23:19 +0200
   From: Igor Krstic igor.z.krs...@gmail.com
 Subject: Re: [ceph-users] question about feature set mismatch
 To: Ilya Dryomov ilya.dryo...@inktank.com
 Cc: ceph-users@lists.ceph.com
 
 
 
 On Fri, 2014-06-06 at 11:51 +0400, Ilya Dryomov wrote:
 
 On Thu, Jun 5, 2014 at 10:38 PM, Igor Krstic igor.z.krs...@gmail.com
 wrote:
 Hello,
 
 dmesg:
 [  690.181780] libceph: mon1 192.168.214.102:6789 feature set mismatch, my 4a042a42 < server's 504a042a42, missing 50
 [  690.181907] libceph: mon1 192.168.214.102:6789 socket error on read
 [  700.190342] libceph: mon0 192.168.214.101:6789 feature set mismatch, my 4a042a42 < server's 504a042a42, missing 50
 [  700.190481] libceph: mon0 192.168.214.101:6789 socket error on read
 [  710.194499] libceph: mon1 192.168.214.102:6789 feature set mismatch, my 4a042a42 < server's 504a042a42, missing 50
 [  710.194633] libceph: mon1 192.168.214.102:6789 socket error on read
 [  720.201226] libceph: mon1 192.168.214.102:6789 feature set mismatch, my 4a042a42 < server's 504a042a42, missing 50
 [  720.201482] libceph: mon1 192.168.214.102:6789 socket error on read
 
 50 should be:
 CEPH_FEATURE_CRUSH_V2 36 10
 and
 CEPH_FEATURE_OSD_ERASURE_CODES 38 40
 CEPH_FEATURE_OSD_TMAP2OMAP 38* 40
 
 That is happening on  two separate boxes that are just my nfs and block
 gateways (they are not osd/mon/mds). So I just need on them something
 like:
 sudo rbd map share2
 sudo mount -t xfs /dev/rbd1 /mnt/share2
 
 On ceph cluster and on those two separate boxes:
 ~$ ceph -v
 ceph version 0.80.1
 
 What could be the problem?
 
 Which kernel version are you running?  Do you have any erasure coded
 pools?
 
 Thanks,
 
Ilya
 
 ~$ uname -a
 Linux ceph-gw1 3.13.0-24-generic #47~precise2-Ubuntu SMP Fri May 2
 23:30:46 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
 
 Yes, one of the pools is erasure coded pool but the only thing I use on
 that box is rbd and rbd pool is not ec pool. It is replicated pool. I to
 not touch ec pool from there. Or at least, I believe so :)
 
 
 Well, I saw something similar with CephFS: I didn't touch the pools in use
 by cephfs, but I created another pool with Erasure Code, and the ceph 
 client
 (kernel 3.13, but not enough for EC) also stopped working with 'feature set
 mismatch'. Thus I guess the clients can't read the crushmap anymore when
 there is a 'erasure' mentioning in it:)
 
 Unfortunately that's true.  If there are any erasure code pools in the
 cluster, kernel clients (both krbd and kcephfs) won't work.  The only
 way it will work is if you remove all erasure coded pools.
 
 CRUSH_V2 is also not present in 3.13.  You'll have to uprade to 3.14.
 Alternatively, CRUSH_V2 can be disabled, but I can't tell you how off
 the top of my head.  The fundamental problem is that you are running
 latest userspace, and the defaults it ships with are incompatible with
 older kernels.
 
 Thanks,
 
Ilya
 
 Thanks. Upgrade to 3.14 solved CRUSH_V2.
 
 Regarding krbd and kcephfs...
 
 If that is the case, that is something that should be addressed in
 documentation on ceph.com more clearly.
 There is info that CRUSH_TUNABLES3 (chooseleaf_vary_r) requires Linux
 kernel version v3.15 or later (for the file system and RBD kernel
 clients) but nothing else.
 
 I think CRUSH_V2 is also mentioned somewhere, most probably in the
 release notes, but you are right, it should centralized and easy to
 find.
 
 
 Only now that I have your information I was able to find
 https://lkml.org/lkml/2014/4/7/257
 
 Anyway... What I want to test is SSD pool as cache pool in front of ec
 pool. Is there some way to update krbd manually (from github?) or I need
 to wait 3.15 for this?
 
 The "if there are any erasure code pools in the cluster, kernel clients
 (both krbd and kcephfs) won't work" problem is getting fixed on the
 server side.  The next ceph release will have the fix and you will be
 able to use 3.14 kernel with clusters that have EC pools.
 
 Thanks,
 
Ilya
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
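
As an aside, a quick hedged way to decode masks like the ones in the dmesg
quoted above (the two values are copied from those lines; plain bash arithmetic):

$ printf '%x\n' $(( 0x504a042a42 ^ 0x4a042a42 ))    # -> 5000000000
# bit 36 (0x1000000000) = CEPH_FEATURE_CRUSH_V2
# bit 38 (0x4000000000) = CEPH_FEATURE_OSD_ERASURE_CODES / OSD_TMAP2OMAP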



signature.asc
Description: 

Re: [ceph-users] Is it still unsafe to map a RBD device on an OSD server?

2014-06-10 Thread Sebastien Han
Thanks for your answers :)

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 10 Jun 2014, at 20:49, John Wilkins john.wilk...@inktank.com wrote:

 Sebastian, 
 
 It's actually not an issue with Ceph, but with the Linux kernel itself. If 
 you want to do this and avoid a deadlock, just use a VM on the same host to 
 mount the block device.
 
 Regards, 
 
 
 John
 
 
 On Tue, Jun 10, 2014 at 9:51 AM, Jean-Charles LOPEZ jeanchlo...@mac.com 
 wrote:
 Hi Sébastien,
 
 still the case. Depending on what you do, the OSD process will get to a hang 
 and will suicide.
 
 Regards
 JC
 
 On Jun 10, 2014, at 09:46, Sebastien Han sebastien@enovance.com wrote:
 
  Hi all,
 
  A couple of years ago, I heard that it wasn’t safe to map a krbd block on 
  an OSD host.
  It was more or less like mounting a NFS mount on the NFS server, we can 
  potentially end up with some deadlocks.
 
  At least, I tried again recently and didn’t encounter any problem.
 
  What do you think?
 
  Cheers.
  
  Sébastien Han
  Cloud Engineer
 
  Always give 100%. Unless you're giving blood.
 
  Phone: +33 (0)1 49 70 99 72
  Mail: sebastien@enovance.com
  Address : 11 bis, rue Roquépine - 75008 Paris
  Web : www.enovance.com - Twitter : @enovance
 
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 -- 
 John Wilkins
 Senior Technical Writer
 Intank
 john.wilk...@inktank.com
 (415) 425-9599
 http://inktank.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Storage Multi Tenancy

2014-05-16 Thread Sebastien Han
Jeroen,

Actually this is more a question for the OpenStack ML.
None of the use cases you described are possible at the moment.

The only thing you can get is shared resources across all the tenants; you 
can't really pin a resource to a specific tenant.
This could be done, I guess, but it is not available yet.

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 15 May 2014, at 10:20, Jeroen van Leur jvl...@home.nl wrote:

 Hello,
 
 Currently I am integrating my ceph cluster into Openstack by using Ceph’s 
 RBD. I’d like to store my KVM virtual machines on pools that I have made on 
 the ceph cluster.
 I would like to achieve to have multiple storage solutions for multiple 
 tenants. Currently when I launch an instance the instance will be set on the 
 Ceph pool that has been defined in the cinder.conf file of my Openstack 
 controller node. If you set up an multi storage backend for cinder then the 
 scheduler will determine which storage backend will be used without looking 
 at the tenant. 
 
 What I would like to happen is that the instance/VM that’s being launched by 
 a specific tenant should have two choices; either choose for a shared Ceph 
 Pool or have their own pool. Another option might even be a tenant having his 
 own ceph cluster. When the instance is being launched on either shared pool, 
 dedicated pool or even another cluster, I would also like the extra volumes 
 that are being created to have the same option. 
 
 
 Data needs to be isolated from another tenants and users and therefore 
 choosing other pools/clusters would be nice. 
 Is this goal achievable or is it impossible. If it’s achievable could I 
 please have some assistance in doing so. Has anyone ever done this before.
 
 I would like thank you in advance for reading this lengthy e-mail. If there’s 
 anything that is unclear, please feel free to ask.
 
 Best Regards,
 
 Jeroen van Leur
 
 — 
 Infitialis
 Jeroen van Leur
 Sent with Airmail
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image

2014-05-15 Thread Sebastien Han
Glad to hear that it works now :)

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 15 May 2014, at 09:02, Maciej Gałkiewicz mac...@shellycloud.com wrote:

 On 15 May 2014 04:05, Maciej Gałkiewicz mac...@shellycloud.com wrote:
 On 28 April 2014 16:11, Sebastien Han sebastien@enovance.com wrote:
 Yes yes, just restart cinder-api and cinder-volume.
 It worked for me.
 
 In my case the image is still downloaded:(
 
 Option show_image_direct_url = True was missing in my glance config.
 
 -- 
 Maciej Gałkiewicz
 Shelly Cloud Sp. z o. o., Co-founder, Sysadmin
 http://shellycloud.com/, mac...@shellycloud.com
 KRS: 440358 REGON: 101504426



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Help -Ceph deployment in Single node Like Devstack

2014-05-09 Thread Sebastien Han
http://www.sebastien-han.fr/blog/2014/05/01/vagrant-up-install-ceph-in-one-command/

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 08 May 2014, at 06:21, Neil Levine neil.lev...@inktank.com wrote:

 Loic's micro-osd.sh script is as close to single push button as it gets:
 
 http://dachary.org/?p=2374
 
 Not exactly a production cluster but it at least allows you to start
 experimenting on the CLI.
 
 Neil
 
 On Wed, May 7, 2014 at 7:56 PM, Patrick McGarry patr...@inktank.com wrote:
 Hey,
 
 Sorry for the delay, I have been traveling in Asia.  This question
 should probably go to the ceph-user list (cc'd).
 
 Right now there is no single push-button deployment for Ceph like
 devstack (that I'm aware of)... but we have several options in terms of
 orchestration and deployment (including our own ceph-deploy featured
 in the doc).
 
 A good place to see the package options is http://ceph.com/get
 
 Sorry I couldn't give you an exact answer, but I think Ceph is pretty
 approachable in terms of deployment for experimentation.  Hope that
 helps.
 
 
 
 Best Regards,
 
 Patrick McGarry
 Director, Community || Inktank
 http://ceph.com  ||  http://inktank.com
 @scuttlemonkey || @ceph || @inktank
 
 
 On Wed, Apr 30, 2014 at 2:05 AM, Pandiyan M maestropa...@gmail.com wrote:
 
 Hi,
 
 I am looking for Ceph simple instalation like devstack ( For opennstack by
 one package contains all), it should supports for ceph, puppet and run its
 function as whole ceph does? help me out
 
 Thanks in Advance !!
 --
 PANDIYAN MUTHURAMAN
 
 Mobile : + 91 9600-963-436   (Personal)
  +91 7259-031-872  (Official)
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image

2014-04-28 Thread Sebastien Han
FYI It’s fixed here: https://review.openstack.org/#/c/90644/1

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 25 Apr 2014, at 18:16, Sebastien Han sebastien@enovance.com wrote:

 I just tried, I have the same problem, it looks like a regression…
 It’s weird because the code didn’t change that much during the Icehouse cycle.
 
 I just reported the bug here: https://bugs.launchpad.net/cinder/+bug/1312819
 
  
 Sébastien Han 
 Cloud Engineer 
 
 Always give 100%. Unless you're giving blood.” 
 
 Phone: +33 (0)1 49 70 99 72 
 Mail: sebastien@enovance.com 
 Address : 11 bis, rue Roquépine - 75008 Paris
 Web : www.enovance.com - Twitter : @enovance 
 
 On 25 Apr 2014, at 16:37, Sebastien Han sebastien@enovance.com wrote:
 
 g
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image

2014-04-28 Thread Sebastien Han
Yes yes, just restart cinder-api and cinder-volume.
It worked for me.

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 28 Apr 2014, at 16:10, Maciej Gałkiewicz mac...@shellycloud.com wrote:

 On 28 April 2014 15:58, Sebastien Han sebastien@enovance.com wrote:
 FYI It’s fixed here: https://review.openstack.org/#/c/90644/1
 
 I already have this patch and it didn't help. Have it fixed the problem in 
 your cluster? 
 
 -- 
 Maciej Gałkiewicz
 Shelly Cloud Sp. z o. o., Co-founder, Sysadmin
 http://shellycloud.com/, mac...@shellycloud.com
 KRS: 440358 REGON: 101504426



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image

2014-04-25 Thread Sebastien Han
This is a COW clone, but the BP you pointed to doesn't match the feature you 
described. This might explain Greg's answer.
The BP refers to the libvirt_image_type functionality for Nova.

What do you get now when you try to create a volume from an image?

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 25 Apr 2014, at 16:34, Maciej Gałkiewicz mac...@shellycloud.com wrote:

 On 25 April 2014 16:00, Gregory Farnum g...@inktank.com wrote:
 If you had it working in Havana I think you must have been using a
 customized code base; you can still do the same for Icehouse.
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com
 
 I was using a standard OpenStack version from Debian official repository.
 
 This is how I was creating the volume:
 
 # cinder create --image-id 1b84776e-25a0-441e-a74f-dc5d3bf5c103 5
 
 The volume is created instantly.
 
 # rbd info volume-abca46dc-0a69-43f0-b91a-412dbf30810f -p cinder_volumes
 rbd image 'volume-abca46dc-0a69-43f0-b91a-412dbf30810f':
 size 5120 MB in 640 objects
 order 23 (8192 kB objects)
 block_name_prefix: rbd_data.301adaa4c2e28e
 format: 2
 features: layering
 parent: glance_images/1b84776e-25a0-441e-a74f-dc5d3bf5c103@snap
 overlap: 4608 MB
 
 Isn't it a copy-on-write clone?
 
 -- 
 Maciej Gałkiewicz
 Shelly Cloud Sp. z o. o., Co-founder, Sysadmin
 http://shellycloud.com/, mac...@shellycloud.com
 KRS: 440358 REGON: 101504426
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OpenStack Icehouse and ephemeral disks created from image

2014-04-25 Thread Sebastien Han
I just tried, I have the same problem, it looks like a regression…
It’s weird because the code didn’t change that much during the Icehouse cycle.

I just reported the bug here: https://bugs.launchpad.net/cinder/+bug/1312819

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 25 Apr 2014, at 16:37, Sebastien Han sebastien@enovance.com wrote:

 g



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-brag installation

2014-04-22 Thread Sebastien Han
Hey Loïc,

The machine was set up a while ago :).
The server side is ready; there is just no graphical interface, so everything 
appears as plain text.

It’s not necessary to upgrade.

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 20 Apr 2014, at 16:40, Loic Dachary l...@dachary.org wrote:

 Hi Cédric,
 
 This is in the context of 
 https://wiki.ceph.com/Planning/Blueprints/Firefly/Ceph-Brag which is included 
 in Firefly https://github.com/ceph/ceph/tree/firefly/src/brag . It would be 
 good to try it for real before the release ;-) 
 
 Cheers
 
 On 20/04/2014 12:38, Cédric Lemarchand wrote:
 Hello there,
 
 Le 20 avr. 2014 à 12:20, Loic Dachary l...@dachary.org a écrit :
 
 Hi Sébastien,
 
 I'm available to help setup the ceph-brag machine.
 Just curious ;-), could you more specific about that ?
 
 When would it be more convenient for you to work on this with me ? The 
 brag.ceph.com machine hosted by the Free Software Foundation France is 
 ready with an Ubuntu precise installation. Maybe we should upgrade to 
 Trusty now ? 
 
 Cheers
 -- 
 Loïc Dachary, Artisan Logiciel Libre
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 -- 
 Loïc Dachary, Artisan Logiciel Libre
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rdb - huge disk - slow ceph

2014-04-22 Thread Sebastien Han
To speed up the deletion, you can remove the rbd header object first (only if 
the image is empty) and then remove the image.

For example:

$ rados -p rbd ls
huge.rbd
rbd_directory


$ rados -p rbd rm huge.rbd
$ time rbd rm huge
2013-12-10 09:35:44.168695 7f9c4a87d780 -1 librbd::ImageCtx: error finding header: (2) No such file or directory
Removing image: 100% complete...done.

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 21 Apr 2014, at 17:03, Gonzalo Aguilar Delgado gagui...@aguilardelgado.com 
wrote:

 Hi, 
 
 I did my first mistake so big... I did a rbd disk of about  300 TB, yes  300 
 TB
 rbd info test-disk -p high_value
 rbd image 'test-disk':
   size 300 TB in 78643200 objects
   order 22 (4096 kB objects)
   block_name_prefix: rb.0.18d7.2ae8944a
   format: 1
 
 but even more. I made an error with the name (I thought it was 300GB) and 
 deleted it and created again. 
 
 
 rbd info homes -p high_value
 rbd image 'homes':
  size 300 TB in 78643200 objects
  order 22 (4096 kB objects)
  block_name_prefix: rb.0.193e.238e1f29
  format: 1
 
 Great mistake, eh?!
 
 When I realized I deleted them. But it takes a lot to remove just one. 
 
 Removing image: 21% complete... (1-2h)
 
 What's incredible is that ceph didn't break. 
 
 Question is. How can I delete them without waiting and breaking something?
 
 I also moving my 300GB disk to the ceph cluster:
 
 /dev/sdd1 307468468 265789152 26037716 92% /mnt/temp
 /dev/rbd1 309506048 4888396 304601268 2% /mnt/rbd/homes
 
 So I have:
 [1]   Running rbd rm test-disk -p high_value   (wd: ~)
 [2]-  Running rbd rm homes -p high_value   (wd: ~)
 [3]+  Running cp -rapx * /mnt/rbd/homes/   (wd: /mnt/temp)
 
 
 It copied about 4GB but takes long. I don't know if it's because the rm or 
 because the problem Michael told me about btrfs. 
 
 Any help on this, also?
 
 Best regards,
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph osd creation error ---Please help me

2014-04-08 Thread Sebastien Han
Try ceph auth del osd.1

And then repeat step 6
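
In other words, a hedged sketch (the second command is your step 6 repeated):

ceph auth del osd.1
ceph auth add osd.1 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-1/keyring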

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 08 Apr 2014, at 16:26, Srinivasa Rao Ragolu srag...@mvista.com wrote:

 Correct error as below
 Error EINVAL: entity osd.1 exists but key does not match
 
 
 On Tue, Apr 8, 2014 at 7:51 PM, Srinivasa Rao Ragolu srag...@mvista.com 
 wrote:
 Hi,
 
 I am trying to setup ceph cluster without using ceph-deploy.
 
 Followed the link http://ceph.com/docs/master/install/manual-deployment/
 
 Successfully able to create monitor node and results are as expected
 
 I have copied ceph.conf and ceph.client.admin.keyring from monitor node to 
 OSD node.
 
 
 ceph.conf
 [global]
 fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
 mon initial members = mon
 mon host = 10.162.xx.yy
 auth cluster required = cephx
 auth service required = cephx
 auth client required = cephx
 osd journal size = 1024
 filestore xattr use omap = true
 osd pool default size = 2
 osd pool default min size = 1
 osd pool default pg num = 333
 osd pool default pgp num = 333
 osd crush chooseleaf type = 1
 
 --
 
 On OSD node executed followed steps:(All executed in super user mode)
 
 1) ceph osd create
   result: 1
 
 2) mkdir /var/lib/ceph/osd/ceph-1
 
 3) mkfs -t ext4 /dev/sdb1
 
 4) mount -o user_xattr /dev/sdb1 /var/lib/ceph/osd/ceph-1
 
 5) ceph-osd -i 1 --mkfs --mkkey
 
 6) ceph auth add osd.1 osd 'allow *' mon 'allow rwx' -i 
 /var/lib/ceph/osd/ceph-1/keyring
 
 Now I got error
 Error EINVAL: entity osd.0 exists but key does not match
 
 
 Please help me in resolving this issue. Please let me know what did I missed? 
 Thanks in advance.
 
 Srinivas.
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Openstack Nova not removing RBD volumes after removing of instance

2014-04-04 Thread Sebastien Han
I don’t know the packages but for me it looks like a bug…

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 04 Apr 2014, at 09:56, Mariusz Gronczewski 
mariusz.gronczew...@efigence.com wrote:

 Nope, one from RDO packages http://openstack.redhat.com/Main_Page
 
 On Thu, 3 Apr 2014 23:22:15 +0200, Sebastien Han
 sebastien@enovance.com wrote:
 
 Are you running Havana with josh’s branch?
 (https://github.com/jdurgin/nova/commits/havana-ephemeral-rbd)
 
  
 Sébastien Han 
 Cloud Engineer 
 
 Always give 100%. Unless you're giving blood.” 
 
 Phone: +33 (0)1 49 70 99 72 
 Mail: sebastien@enovance.com 
 Address : 11 bis, rue Roquépine - 75008 Paris
 Web : www.enovance.com - Twitter : @enovance 
 
 On 03 Apr 2014, at 13:24, Mariusz Gronczewski 
 mariusz.gronczew...@efigence.com wrote:
 
 Hi,
 
 some time ago I build small Openstack cluster with Ceph as main/only
 storage backend. I managed to get all parts working (removing/adding
 volumes works in cinder/glance/nova).
 
 I get no errors in logs but I've noticed that after deleting an
 instance (booted from image) I get leftover RBD volumes:
 
 hqblade201(hqstack1):~☠ nova list
 +--+---+++-+-+
 | ID   | Name  | 
 Status | Task State | Power State | Networks|
 +--+---+++-+-+
 | 5c6261a5-0290-4db6-89a2-f0c81f47d044 | template.devops.non.3dart.com | 
 ACTIVE | None   | Running | ext_vlan_102=10.0.102.2 |
 +--+---+++-+-+
 
 [10:25:00]hqblade201(hqstack1):~☠ nova volume-list
 +--+---+-+--+-+-+
 | ID   | Status| Display Name| Size 
 | Volume Type | Attached to |
 +--+---+-+--+-+-+
 | 11aae1a0-48c9-4606-a2be-f44624adb583 | available | stackdev.root   | 10   
 | None| |
 | 4dacfa9c-dfea-4a15-8ede-0cbdebb5a2e5 | available | cloud-init-test | 10   
 | None| |
 | ecf26742-e79e-4d7a-b8a4-9b4dc85dd41f | available | deb-net3| 10   
 | None| |
 | 91ec34e3-d597-49e9-80f6-364f5879c6c0 | available | deb-net2| 10   
 | None| |
 | 2acee1b6-16ec-4409-b5ad-3af7903f7d5c | available | deb-net1| 10   
 | None| |
 | dba790ec-60a3-48ef-ba40-dfb5946a6a1d | available | deb3| 10   
 | None| |
 | 57600343-b488-4da6-beb6-94ed351f4f6a | available | deb2| 10   
 | None| |
 | 8ff0be71-a36e-40f8-84ad-a8dffa1157fd | available | cvcxvcxv| 10   
 | None| |
 | 32a1a61d-698c-4131-bb60-75d95b487b9a | available | deb | 10   
 | None| |
 | 5faae133-3e9e-4048-b2bb-ba636f74e8d1 | available | sr  | 3
 | None| |
 +--+---+-+--+-+-+
 
 hqblade201(hqstack1):~☠ rbd ls volumes
 # those are orphaned volumes
 003c2a30-240c-4a42-930c-9a81bc9f743d_disk
 003c2a30-240c-4a42-930c-9a81bc9f743d_disk.local
 003c2a30-240c-4a42-930c-9a81bc9f743d_disk.swap
 1026039e-2cb9-4ff1-8f3d-2b270a765858_disk
 1026039e-2cb9-4ff1-8f3d-2b270a765858_disk.local
 1026039e-2cb9-4ff1-8f3d-2b270a765858_disk.swap
 1986fb8e-df4a-40a8-9d1e-762665e60db2_disk
 1a0500ad-9311-472b-9c7a-82046ac7aeab_disk
 1a0500ad-9311-472b-9c7a-82046ac7aeab_disk.local
 1a0500ad-9311-472b-9c7a-82046ac7aeab_disk.swap
 1d87569d-db74-480e-af6c-68716460010c_disk
 1d87569d-db74-480e-af6c-68716460010c_disk.local
 1d87569d-db74-480e-af6c-68716460010c_disk.swap
 ...
 5c6261a5-0290-4db6-89a2-f0c81f47d044_disk
 5c6261a5-0290-4db6-89a2-f0c81f47d044_disk.local
 5c6261a5-0290-4db6-89a2-f0c81f47d044_disk.swap
 ...
 fc9bff9c-fa37-4412-992e-5d1c9d5f4fac_disk
 fc9bff9c-fa37-4412-992e-5d1c9d5f4fac_disk.local
 fc9bff9c-fa37-4412-992e-5d1c9d5f4fac_disk.swap
 volume-11aae1a0-48c9-4606-a2be-f44624adb583
 volume-2acee1b6-16ec-4409-b5ad-3af7903f7d5c
 volume-32a1a61d-698c-4131-bb60-75d95b487b9a
 volume-4dacfa9c-dfea-4a15-8ede-0cbdebb5a2e5
 volume-57600343-b488-4da6-beb6-94ed351f4f6a
 volume-5faae133-3e9e-4048-b2bb-ba636f74e8d1
 volume-8ff0be71-a36e-40f8-84ad-a8dffa1157fd
 volume-91ec34e3-d597-49e9-80f6-364f5879c6c0
 volume-dba790ec-60a3-48ef-ba40-dfb5946a6a1d
 volume-ecf26742-e79e

Re: [ceph-users] qemu-rbd

2014-03-17 Thread Sebastien Han
There is a RBD engine for FIO, have a look at 
http://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html
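
A minimal job file along the lines of the linked example (pool and image names
are assumptions; the image must exist beforehand, e.g. rbd create fio-test
--size 5120):

[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio-test
invalidate=0
rw=randwrite
bs=4k
time_based
runtime=60

[rbd_iodepth32]
iodepth=32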

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 12 Mar 2014, at 04:25, Kyle Bader kyle.ba...@gmail.com wrote:

 I tried rbd-fuse and it's throughput using fio is approx. 1/4 that of the 
 kernel client.
 
 Can you please let me know how to setup RBD backend for FIO? I'm assuming 
 this RBD backend is also based on librbd?
 
 You will probably have to build fio from source since the rbd engine is new:
 
 https://github.com/axboe/fio
 
 Assuming you already have a cluster and a client configured this
 should do the trick:
 
 https://github.com/axboe/fio/blob/master/examples/rbd.fio
 
 -- 
 
 Kyle
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] qemu non-shared storage migration of nova instances?

2014-03-17 Thread Sebastien Han
Hi,

I use the following live migration flags: 
VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST
It deletes the libvirt.xml and re-creates it on the other side.
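
For reference, a hedged sketch of the matching nova.conf bit (option name and
section are what I remember for Havana, double-check against your release):

[DEFAULT]
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST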

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 11 bis, rue Roquépine - 75008 Paris
Web : www.enovance.com - Twitter : @enovance 

On 11 Mar 2014, at 22:06, Don Talton (dotalton) dotal...@cisco.com wrote:

 Hi guys and gals,
 
 I'm able to do live migration via 'nova live-migration', as long as my 
 instances are sitting on shared storage. However, when they are not, nova 
 live-migrate fails, due to a shared storage check.
 
 To get around this, I attempted to do a live migration via libvirt directly. 
 Using the feature --copy-storage-all fails. Part of the trouble with this 
 is that even though nova is booted from a volume stored on ceph, there are 
 still support files (eg console.log, disk.config) that reside in the 
 instances directory. The virsh command (I've tried many combinations of many 
 different migration approaches) is virsh migrate --live --copy-storage-all 
 instance-000c qemu+ssh://target/system. This fails due to libvirt not 
 creating the instance dir and copying the support files to the target.
 
 I'm curious if anyone has been able to get something like this to work. I'd 
 really love to get ceph-backed live migration going without adding the 
 overhead of shared storage for nova too.
 
 Thanks,
 
 Donald Talton
 Cloud Systems Development
 Cisco Systems
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to Configure Cinder to access multiple pools

2014-02-25 Thread Sebastien Han
Hi,

Please have a look at the cinder multi-backend functionality: examples here:
http://www.sebastien-han.fr/blog/2013/04/25/ceph-and-cinder-multi-backend/
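
As a hedged sketch of what the relevant part of cinder.conf can look like
(backend names and the second pool are made up for illustration):

[DEFAULT]
enabled_backends = rbd-volumes,rbd-fast

[rbd-volumes]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
volume_backend_name = RBD_VOLUMES

[rbd-fast]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = fast-volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
volume_backend_name = RBD_FAST

Each backend is then mapped to a volume type, e.g.:

cinder type-create fast
cinder type-key fast set volume_backend_name=RBD_FAST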

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 25 Feb 2014, at 14:42, Vikrant Verma vikrantverm...@gmail.com wrote:

 Hi All,
 
 I am using cinder as a front end for volume storage in Openstack 
 configuration.
 Ceph is used as storage back-end.
 
 Currently cinder uses only one pool (in my case pool name is volumes ) for 
 its volume storage.
 I want cinder to use multiple ceph pools for volume storage
 
 
 --following is the cinder.conf---
 volume_driver=cinder.volume.drivers.rbd.RBDDriver
 rbd_pool=volumes
 rbd_ceph_conf=/etc/ceph/ceph.conf
 rbd_flatten_volume_from_snapshot=false
 
 
 Please let me know if it is possible to have multiple pools associated to 
 cinder, let me know how to configure it.
 
 Regards,
 Vikrant
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] storage

2014-02-25 Thread Sebastien Han
Hi,

RBD blocks are stored as objects on a filesystem usually under: 
/var/lib/ceph/osd/osd.id/current/pg.id/
RBD is just an abstraction layer.
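
For example, a hedged way to see where one object of an image ends up (the pool,
image name and prefix below are illustrative):

$ rbd info rbd/myimage | grep block_name_prefix     # e.g. rb.0.18d7.2ae8944a
$ ceph osd map rbd rb.0.18d7.2ae8944a.000000000000  # prints the PG and acting OSDs

The object file then sits under /var/lib/ceph/osd/ceph-<id>/current/<pg.id>_head/
on the reported OSD hosts.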

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 25 Feb 2014, at 13:09, yalla.gnan.ku...@accenture.com wrote:

 Hi All,
  
 By default in which directory/directories, does ceph store the block device 
 files ? Is it in the /dev or other filesystem ?
  
  
 Thanks
 Kumar
 
 
 This message is for the designated recipient only and may contain privileged, 
 proprietary, or otherwise confidential information. If you have received it 
 in error, please notify the sender immediately and delete the original. Any 
 other use of the e-mail by you is prohibited. Where allowed by local law, 
 electronic communications with Accenture and its affiliates, including e-mail 
 and instant messaging (including content), may be scanned by our systems for 
 the purposes of information security and assessment of internal compliance 
 with Accenture policy. .
 __
 
 www.accenture.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Size of objects in Ceph

2014-02-25 Thread Sebastien Han
Hi,

The value can be set during the image creation.
Start with this: http://ceph.com/docs/master/man/8/rbd/#striping

Followed by the example section.
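
For example, a hedged sketch (size and names borrowed from your output; order 23
gives 8192 kB objects instead of the 4096 kB default):

rbd create -p CephTest base-127-disk-2 --size 32768 --order 23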

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 25 Feb 2014, at 15:54, Florent Bautista flor...@coppint.com wrote:

 Hi all,
 
 I'm new with Ceph and I would like to know if there is any way of
 changing size of Ceph's internal objects.
 
 I mean, when I put an image on RBD for exemple, I can see this:
 
 rbd -p CephTest info base-127-disk-1
 rbd image 'base-127-disk-1':
size 32768 MB in 8192 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.347c274b0dc51
format: 2
features: layering
 
 
 4096 kB objects = how can I change size of objects ?
 
 Or is it a fixed value in Ceph architecture ?
 
 Thank you
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Unable top start instance in openstack

2014-02-20 Thread Sebastien Han
Which distro and packages?

libvirt_image_type is broken on cloud archive, please patch with 
https://github.com/jdurgin/nova/commits/havana-ephemeral-rbd

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 19 Feb 2014, at 08:06, yalla.gnan.ku...@accenture.com wrote:

 Hi,
  
 I have followed the link http://ceph.com/docs/master/rbd/rbd-openstack/ and 
 configured ceph with openstack.
 But  when I try to launch instances, they are going into Error state. I have 
 found the below log in the controller node of openstack:
  
 ---
 injection_path = image(\'disk\').path\n', uAttributeError: 'Rbd' object has 
 no attribute 'path'\n]
  
  
  
 Thanks
 Kumar
 
 
 This message is for the designated recipient only and may contain privileged, 
 proprietary, or otherwise confidential information. If you have received it 
 in error, please notify the sender immediately and delete the original. Any 
 other use of the e-mail by you is prohibited. Where allowed by local law, 
 electronic communications with Accenture and its affiliates, including e-mail 
 and instant messaging (including content), may be scanned by our systems for 
 the purposes of information security and assessment of internal compliance 
 with Accenture policy. .
 __
 
 www.accenture.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Block Devices and OpenStack

2014-02-17 Thread Sebastien Han
Hi,

Can I see your ceph.conf?
I suspect that [client.cinder] and [client.glance] sections are missing.

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 16 Feb 2014, at 06:55, Ashish Chandra mail.ashishchan...@gmail.com wrote:

 Hi Jean,
 
 Here is the output for ceph auth list for client.cinder
 
 client.cinder
 key: AQCKaP9ScNgiMBAAwWjFnyL69rBfMzQRSHOfoQ==
 caps: [mon] allow r
 caps: [osd] allow class-read object_prefix rbd_children, allow rwx 
 pool=volumes, allow rx pool=images
 
 
 Here is the output of ceph -s:
 
 ashish@ceph-client:~$ ceph -s
 cluster afa13fcd-f662-4778-8389-85047645d034
  health HEALTH_OK
  monmap e1: 1 mons at {ceph-node1=10.0.1.11:6789/0}, election epoch 1, 
 quorum 0 ceph-node1
  osdmap e37: 3 osds: 3 up, 3 in
   pgmap v84: 576 pgs, 6 pools, 0 bytes data, 0 objects
 106 MB used, 9076 MB / 9182 MB avail
  576 active+clean
 
 I created all the keyrings and copied as suggested by the guide.
 
 
 
 
 
 
 On Sun, Feb 16, 2014 at 3:08 AM, Jean-Charles LOPEZ jc.lo...@inktank.com 
 wrote:
 Hi,
 
 what do you get when you run a 'ceph auth list' command for the user name 
 (client.cinder) you created for cinder? Are the caps and the key for this 
 user correct? No typo in the hostname in the cinder.conf file (host=) ? Did 
 you copy the keyring to the cinder running cinder (can’t really say from your 
 output and there is no ceph-s command to check the monitor names)?
 
 It could just be a typo in the ceph auth get-or-create command that’s causing 
 it.
 
 Rgds
 JC
 
 
 
 On Feb 15, 2014, at 10:35, Ashish Chandra mail.ashishchan...@gmail.com 
 wrote:
 
 Hi Cephers,
 
 I am trying to configure ceph rbd as backend for cinder and glance by 
 following the steps mentioned in:
 
 http://ceph.com/docs/master/rbd/rbd-openstack/
 
 Before I start all openstack services are running normally and ceph cluster 
 health shows HEALTH_OK
 
 But once I am done with all steps and restart openstack services, 
 cinder-volume fails to start and throws an error.
 
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Traceback (most 
 recent call last):
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 262, in 
 check_for_setup_error
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd with 
 RADOSClient(self):
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 234, in __init__
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd self.cluster, 
 self.ioctx = driver._connect_to_rados(pool)
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 282, in 
 _connect_to_rados
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd client.connect()
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
 /usr/lib/python2.7/dist-packages/rados.py, line 185, in connect
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd raise 
 make_ex(ret, error calling connect)
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Error: error calling 
 connect: error code 95
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd
 2014-02-16 00:01:42.591 ERROR cinder.volume.manager 
 [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Error encountered 
 during initialization of driver: RBDDriver
 2014-02-16 00:01:42.592 ERROR cinder.volume.manager 
 [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Bad or unexpected 
 response from the storage volume backend API: error connecting to ceph 
 cluster
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager Traceback (most recent 
 call last):
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager   File 
 /opt/stack/cinder/cinder/volume/manager.py, line 190, in init_host
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager 
 self.driver.check_for_setup_error()
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager   File 
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 267, in 
 check_for_setup_error
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager raise 
 exception.VolumeBackendAPIException(data=msg)
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager 
 VolumeBackendAPIException: Bad or unexpected response from the storage 
 volume backend API: error connecting to ceph cluster
 
 
 Here is the content of my /etc/ceph in openstack node: 
 
 ashish@ubuntu:/etc/ceph$ ls -lrt
 total 16
 -rw-r--r-- 1 cinder cinder 229 Feb 15 23:45 ceph.conf
 -rw-r--r-- 1 glance glance  65 Feb 15 23:46 ceph.client.glance.keyring
 -rw-r--r-- 1 cinder cinder  65 Feb 15 23:47 ceph.client.cinder.keyring
 -rw-r--r-- 1 cinder cinder  72 Feb 15 23:47 

Re: [ceph-users] Block Devices and OpenStack

2014-02-17 Thread Sebastien Han
Hi,

If cinder-volume fails to connect but putting the admin keyring in place works, 
it means that cinder is not configured properly.
Please also try to add the following:

[client.cinder]
keyring =  path-to-keyring

Same for Glance.

Btw: ceph.conf doesn’t need to be owned by Cinder; just make it world-readable 
(chmod +r) and keep root as the owner.
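
For example, a minimal sketch (keyring paths assume the file names shown in the
/etc/ceph listing earlier in this thread):

[client.cinder]
keyring = /etc/ceph/ceph.client.cinder.keyring

[client.glance]
keyring = /etc/ceph/ceph.client.glance.keyring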

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 17 Feb 2014, at 14:48, Ashish Chandra mail.ashishchan...@gmail.com wrote:

 Hi Sebastian, Jean;
 
 This is my ceph.conf looks like. It was auto generated using ceph-deploy.
 
 [global]
 fsid = afa13fcd-f662-4778-8389-85047645d034
 mon_initial_members = ceph-node1
 mon_host = 10.0.1.11
 auth_cluster_required = cephx
 auth_service_required = cephx
 auth_client_required = cephx
 filestore_xattr_use_omap = true
 
 If I provide admin.keyring file to openstack node (in /etc/ceph) it works 
 fine and issue is gone .
 
 Thanks 
 
 Ashish
 
 
 On Mon, Feb 17, 2014 at 2:03 PM, Sebastien Han sebastien@enovance.com 
 wrote:
 Hi,
 
 Can I see your ceph.conf?
 I suspect that [client.cinder] and [client.glance] sections are missing.
 
 Cheers.
 
 Sébastien Han
 Cloud Engineer
 
 Always give 100%. Unless you're giving blood.”
 
 Phone: +33 (0)1 49 70 99 72
 Mail: sebastien@enovance.com
 Address : 10, rue de la Victoire - 75009 Paris
 Web : www.enovance.com - Twitter : @enovance
 
 On 16 Feb 2014, at 06:55, Ashish Chandra mail.ashishchan...@gmail.com wrote:
 
  Hi Jean,
 
  Here is the output for ceph auth list for client.cinder
 
  client.cinder
  key: AQCKaP9ScNgiMBAAwWjFnyL69rBfMzQRSHOfoQ==
  caps: [mon] allow r
  caps: [osd] allow class-read object_prefix rbd_children, allow rwx 
  pool=volumes, allow rx pool=images
 
 
  Here is the output of ceph -s:
 
  ashish@ceph-client:~$ ceph -s
  cluster afa13fcd-f662-4778-8389-85047645d034
   health HEALTH_OK
   monmap e1: 1 mons at {ceph-node1=10.0.1.11:6789/0}, election epoch 1, 
  quorum 0 ceph-node1
   osdmap e37: 3 osds: 3 up, 3 in
pgmap v84: 576 pgs, 6 pools, 0 bytes data, 0 objects
  106 MB used, 9076 MB / 9182 MB avail
   576 active+clean
 
  I created all the keyrings and copied as suggested by the guide.
 
 
 
 
 
 
  On Sun, Feb 16, 2014 at 3:08 AM, Jean-Charles LOPEZ jc.lo...@inktank.com 
  wrote:
  Hi,
 
  what do you get when you run a 'ceph auth list' command for the user name 
  (client.cinder) you created for cinder? Are the caps and the key for this 
  user correct? No typo in the hostname in the cinder.conf file (host=) ? Did 
  you copy the keyring to the cinder running cinder (can’t really say from 
  your output and there is no ceph-s command to check the monitor names)?
 
  It could just be a typo in the ceph auth get-or-create command that’s 
  causing it.
 
  Rgds
  JC
 
 
 
  On Feb 15, 2014, at 10:35, Ashish Chandra mail.ashishchan...@gmail.com 
  wrote:
 
  Hi Cephers,
 
  I am trying to configure ceph rbd as backend for cinder and glance by 
  following the steps mentioned in:
 
  http://ceph.com/docs/master/rbd/rbd-openstack/
 
  Before I start all openstack services are running normally and ceph 
  cluster health shows HEALTH_OK
 
  But once I am done with all steps and restart openstack services, 
  cinder-volume fails to start and throws an error.
 
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Traceback (most 
  recent call last):
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
  /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 262, in 
  check_for_setup_error
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd with 
  RADOSClient(self):
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
  /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 234, in __init__
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd self.cluster, 
  self.ioctx = driver._connect_to_rados(pool)
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
  /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 282, in 
  _connect_to_rados
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd 
  client.connect()
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
  /usr/lib/python2.7/dist-packages/rados.py, line 185, in connect
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd raise 
  make_ex(ret, error calling connect)
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Error: error 
  calling connect: error code 95
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd
  2014-02-16 00:01:42.591 ERROR cinder.volume.manager 
  [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Error encountered 
  during initialization of driver: RBDDriver
  2014-02-16 00:01:42.592 ERROR cinder.volume.manager 
  [req

Re: [ceph-users] Meetup in Frankfurt, before the Ceph day

2014-02-05 Thread Sebastien Han
Hi Alexandre,

We have a meet up in Paris.
Please see: http://www.meetup.com/Ceph-in-Paris/events/158942372/

Cheers.

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 05 Feb 2014, at 13:51, Alexandre DERUMIER aderum...@odiso.com wrote:

 Hi Loic, 
 do you know if a ceph meetup is planned soon in France or Belgium?
 
 I missed FOSDEM this year, and I'd be very happy to meet some ceph 
 users/devs.
 
 Regards,
 
 Alexandre
 
 - Mail original - 
 
 De: Loic Dachary l...@dachary.org 
 À: ceph-users ceph-users@lists.ceph.com 
 Envoyé: Mercredi 5 Février 2014 09:44:04 
 Objet: [ceph-users] Meetup in Frankfurt, before the Ceph day 
 
 Hi Ceph, 
 
 I'll be in Frankfurt for the Ceph day February 27th 
 http://www.eventbrite.com/e/ceph-day-frankfurt-tickets-10173269523 and I will 
 attend the meetup organized the evening before 
 http://www.meetup.com/Ceph-Frankfurt/events/164620852/ 
 
 Anyone interested to join ? Not sure where we should meet ... I've never been 
 to Frankfurt before :-) 
 
 Cheers 
 
 -- 
 Loïc Dachary, Artisan Logiciel Libre 
 
 
 ___ 
 ceph-users mailing list 
 ceph-users@lists.ceph.com 
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] During copy new rbd image is totally thick

2014-02-03 Thread Sebastien Han
I have the same behaviour here.
I believe this is expected since you’re calling “copy”; “clone” is what does 
the copy-on-write.
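
If what you want is a thin copy, a hedged sketch of the clone path (clones need
a format 2 image and a protected snapshot; names reused from your example):

rbd create rbd/test -s 1024 --image-format 2
rbd snap create rbd/test@base
rbd snap protect rbd/test@base
rbd clone rbd/test@base rbd/cloneoftest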

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 03 Feb 2014, at 08:43, Igor Laskovy igor.lask...@gmail.com wrote:

 Anybody? ;)
 
 
 On Thu, Jan 30, 2014 at 9:10 PM, Igor Laskovy igor.lask...@gmail.com wrote:
 Hello list,
 
 Is it correct behavior during copy to thicking rbd image?
 
 igor@hv03:~$ rbd create rbd/test -s 1024
 igor@hv03:~$ rbd diff rbd/test | awk '{ SUM += $2 } END { print SUM/1024/1024 
  MB }'
 0 MB
 igor@hv03:~$ rbd copy rbd/test rbd/cloneoftest
 Image copy: 100% complete...done.
 igor@hv03:~$ rbd diff rbd/cloneoftest | awk '{ SUM += $2 } END { print 
 SUM/1024/1024  MB }'
 1024 MB
 
 -- 
 Igor Laskovy
 facebook.com/igor.laskovy
 studiogrizzly.com
 
 
 
 -- 
 Igor Laskovy
 facebook.com/igor.laskovy
 studiogrizzly.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] get virtual size and used

2014-02-03 Thread Sebastien Han
Hi,

$ rbd diff rbd/toto | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 03 Feb 2014, at 17:10, zorg z...@probesys.com wrote:

 hi,
 We use an rbd pool, and I wonder how I can get
 the real size used by my rbd image.
 
 I can get the virtual size from rbd info,
 but how can I get the real size used by my rbd image?
 
 
 -- 
 probeSys - spécialiste GNU/Linux
 site web : http://www.probesys.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Openstack Havana release installation with ceph

2014-01-24 Thread Sebastien Han
Usually you would like to start here: 
http://ceph.com/docs/master/rbd/rbd-openstack/

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 22 Jan 2014, at 00:14, Dmitry Borodaenko dborodae...@mirantis.com wrote:

 On Tue, Jan 21, 2014 at 10:38 AM, Dmitry Borodaenko
 dborodae...@mirantis.com wrote:
 On Tue, Jan 21, 2014 at 2:23 AM, Lalitha Maruthachalam
 lalitha.maruthacha...@aricent.com wrote:
 Can someone please let me know whether there is any documentation for
 installing Havana release of Openstack along with Ceph.
 These slides have some information about how this is done in Mirantis
 OpenStack 4.0, including some gotchas and troubleshooting pointers:
 http://files.meetup.com/11701852/fuel-ceph.pdf
 
 I didn't realize you need to be a participant of the meetup to get
 that file, here's a link to the same slides on SlideShare:
 http://www.slideshare.net/mirantis/fuel-ceph
 
 Apologies,
 
 -- 
 Dmitry Borodaenko
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD port usage

2014-01-24 Thread Sebastien Han
Greg,

Do you have any estimation about how heartbeat messages use the network?
How busy is it?

At some point (if the cluster gets big enough), could this degrade the network 
performance? Will it make sense to have a separate network for this?

So in addition to the public and storage networks we will have a heartbeat network, so we 
could pin it to a specific network link.

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 22 Jan 2014, at 19:01, Gregory Farnum g...@inktank.com wrote:

 On Tue, Jan 21, 2014 at 8:26 AM, Sylvain Munaut
 s.mun...@whatever-company.com wrote:
 Hi,
 
 I noticed in the documentation that the OSD should use 3 ports per OSD
 daemon running and so when I setup the cluster, I originally opened
 enough port to accomodate this (with a small margin so that restart
 could proceed even is ports aren't released immediately).
 
 However today I just noticed that OSD daemons are using 5 ports and so
 for some of them, a port or two were locked by the firewall.
 
 All the OSD were still reporting as OK and the cluster didn't report
 anything wrong but I was getting some weird behavior that could have
 been related.
 
 
 So is that usage of 5 TCP ports normal ? And if it is, could the doc
 be updated ?
 
 Normal! It's increased a couple times recently because we added
 heartbeating on both the public and cluster network interfaces.
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD port usage

2014-01-24 Thread Sebastien Han
I agree but somehow this generates more traffic too. We just need to find a 
good balance.
But I don’t think this will change the scenario where the cluster network is 
down and OSDs die because of this…

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 24 Jan 2014, at 11:52, Sylvain Munaut s.mun...@whatever-company.com wrote:

 Hi,
 
 At some point (if the cluster gets big enough), could this degrade the 
 network performance? Will it make sense to have a separate network for this?
 
 So in addition to public and storage we will have an heartbeat network, so 
 we could pin it to a specific network link.
 
 I think the whole point of having added heartbeating on public & 
 cluster network is to be able to detect failure of one network but not
 the other. Separating heartbeat onto its own independent network would
 be quite counterproductive in this respect.
 
 
 Cheers,
 
   Sylvain



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD port usage

2014-01-24 Thread Sebastien Han
Ok Greg, thanks for the clarification!

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 24 Jan 2014, at 18:22, Gregory Farnum g...@inktank.com wrote:

 On Friday, January 24, 2014, Sebastien Han sebastien@enovance.com wrote:
 Greg,
 
 Do you have any estimation about how heartbeat messages use the network?
 How busy is it?
 
 Not very. It's one very small message per OSD peer per...second?
  
 
 At some point (if the cluster gets big enough), could this degrade the 
 network performance? Will it make sense to have a separate network for this?
 
 As Sylvain said, that would negate the entire point of heartbeating on both 
 networks. Trust me, you don't want to deal with a cluster where the OSDs 
 can't talk to each other but they can talk to the monitors and keep marking 
 each other down.
 -Greg
  
 
 So in addition to public and storage we will have an heartbeat network, so we 
 could pin it to a specific network link.
 
 
 Sébastien Han
 Cloud Engineer
 
 Always give 100%. Unless you're giving blood.”
 
 Phone: +33 (0)1 49 70 99 72
 Mail: sebastien@enovance.com
 Address : 10, rue de la Victoire - 75009 Paris
 Web : www.enovance.com - Twitter : @enovance
 
 On 22 Jan 2014, at 19:01, Gregory Farnum g...@inktank.com wrote:
 
  On Tue, Jan 21, 2014 at 8:26 AM, Sylvain Munaut
  s.mun...@whatever-company.com wrote:
  Hi,
 
  I noticed in the documentation that the OSD should use 3 ports per OSD
  daemon running and so when I setup the cluster, I originally opened
  enough port to accomodate this (with a small margin so that restart
  could proceed even is ports aren't released immediately).
 
  However today I just noticed that OSD daemons are using 5 ports and so
  for some of them, a port or two were locked by the firewall.
 
  All the OSD were still reporting as OK and the cluster didn't report
  anything wrong but I was getting some weird behavior that could have
  been related.
 
 
  So is that usage of 5 TCP ports normal ? And if it is, could the doc
  be updated ?
 
  Normal! It's increased a couple times recently because we added
  heartbeating on both the public and cluster network interfaces.
  -Greg
  Software Engineer #42 @ http://inktank.com | http://ceph.com
  ___
  ceph-users mailing list
  ceph-users@lists.ceph.com
  http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 -- 
 Software Engineer #42 @ http://inktank.com | http://ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] servers advise (dell r515 or supermicro ....)

2014-01-15 Thread Sebastien Han
Hi Alexandre,

Are you going with a 10Gb network? It’s not an issue for IOPS but more for the 
bandwidth. If so, read the following:

I personally wouldn’t go with a 1:6 ratio for the journals. I think 1:5 (or even 
1:4) is preferable.
A SAS 10K disk gives you around 140MB/sec of sequential writes.
So if you put the journal on an SSD, it must sustain at least 140MB/sec per backing 
disk if you don’t want to slow things down.
With 10 disks, 140*10 MB/sec already fills your 10Gb bandwidth. So either you don’t 
need that many disks, or you don’t need the SSDs.
It depends on the performance that you want to achieve.
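
Rough numbers behind that, assuming ~140MB/sec per SAS 10K drive:

10 disks * 140MB/sec = 1400MB/sec, i.e. roughly 11.2Gbit/sec, which is already more 
than a single 10Gbit link can carry.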
Another thing: I also wouldn’t use the DC S3700, since this disk was definitely 
made for IOPS-intensive applications. The journal is purely sequential (smallish 
sequential blocks, IIRC Stefan mentioned ~370k blocks).
I would instead use an SSD with large sequential write capabilities, like the 525 
series 120GB. 

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 15 Jan 2014, at 12:47, Alexandre DERUMIER aderum...@odiso.com wrote:

 Hello List,
 
 I'm going to build a build a rbd cluster this year, with 5 nodes
 
 I would like to have this kind of configuration for each node:
 
 - 2U
 - 2,5inch drives
 
 os : 2 disk sas drive
 journal : 2 x ssd intel dc s3700 100GB
 osd : 10 or 12  x sas Seagate Savvio 10K.6 900GB
 
 
 
 I see on the mailing that intank use dell r515. 
 I currently own a lot of dell servers and I have good prices.
 
 But I have also see on the mailing that dell perc H700 can have some 
 performance problem,
 and also it's not easy to flash the firmware for jbod mode.
 http://www.spinics.net/lists/ceph-devel/msg16661.html
 
 I don't known if theses performance problem has finally been solved ?
 
 
 
 Another option could be to use supermicro server,
 they have some 2U - 16 disks chassis + one or two lsi jbod controller.
 But, I have had in past really bad experience with supermicro motherboard.
 (Mainly firmware bug, ipmi card bug,.)
 
 Does someone have experience with supermicro, and give me advise for a good 
 motherboard model? 
 
 
 Best Regards,
 
 Alexandre Derumier
 
 
 
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] servers advise (dell r515 or supermicro ....)

2014-01-15 Thread Sebastien Han
Hum the Crucial m500 is pretty slow. The biggest one doesn’t even reach 300MB/s.
Intel DC S3700 100G showed around 200MB/sec for us.

Actually, I don’t know the price difference between the crucial and the intel 
but the intel looks more suitable for me. Especially after Mark’s comment.

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 15 Jan 2014, at 15:28, Mark Nelson mark.nel...@inktank.com wrote:

 On 01/15/2014 08:03 AM, Robert van Leeuwen wrote:
 Power-Loss Protection:  In the rare event that power fails while the
 drive is operating, power-loss protection helps ensure that data isn’t
 corrupted.
 
 Seems that not all power protected SSDs are created equal:
 http://lkcl.net/reports/ssd_analysis.html
 
 The m500 is not tested but the m4 is.
 
 Up to now it seems that only Intel seems to have done his homework.
 In general they *seem* to be the most reliable SSD provider.
 
 Even at that, there has been some concern on the list (and lkml) that certain 
 older Intel drives without super-capacitors are ignoring ATA_CMD_FLUSH, 
 making them very fast (which I like!) but potentially dangerous (boo!).  The 
 520 in particular is a drive I've used for a lot of Ceph performance testing 
 but I'm afraid that if it's not properly handling CMD FLUSH requests, it may 
 not be indicative of the performance folks would see on other drives that do.
 
 On the third hand, if drives with supercaps like the Intel DC S3700 can 
 safely ignore CMD_FLUSH and maintain high performance (even when there are a 
 lot of O_DSYNC calls, ala the journal), that potentially makes them even more 
 attractive (and that drive already has relatively high sequential write 
 performance and high write endurance).
 
 
 Cheers,
 Robert van Leeuwen
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] servers advise (dell r515 or supermicro ....)

2014-01-15 Thread Sebastien Han
Sorry I was only looking at the 4K aligned results.

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 15 Jan 2014, at 15:46, Stefan Priebe s.pri...@profihost.ag wrote:

 Am 15.01.2014 15:44, schrieb Mark Nelson:
 On 01/15/2014 08:39 AM, Stefan Priebe wrote:
 
 Am 15.01.2014 15:34, schrieb Sebastien Han:
 Hum the Crucial m500 is pretty slow. The biggest one doesn’t even
 reach 300MB/s.
 Intel DC S3700 100G showed around 200MB/sec for us.
 
 where did you get this values from? I've some 960GB and they all have 
 450Mb/s write speed. Also in tests like here you see  450MB/s
 http://www.tomshardware.com/reviews/crucial-m500-1tb-ssd,3551-5.html
 
 Looks like at least according to Anand's chart, you'll get full write
 speed once you buy the 480GB model, but not for the 120 or 240GB models:
 
 http://www.anandtech.com/show/6884/crucial-micron-m500-review-960gb-480gb-240gb-120gb
 
 that's correct but the sentence was  The biggest one doesn’t even
 reach 300MB/s.
 
 
 
 Actually, I don’t know the price difference between the crucial and
 the intel but the intel looks more suitable for me. Especially after
 Mark’s comment.
 
 
 Sébastien Han
 Cloud Engineer
 
 Always give 100%. Unless you're giving blood.”
 
 Phone: +33 (0)1 49 70 99 72
 Mail: sebastien@enovance.com
 Address : 10, rue de la Victoire - 75009 Paris
 Web : www.enovance.com - Twitter : @enovance
 
 On 15 Jan 2014, at 15:28, Mark Nelson mark.nel...@inktank.com wrote:
 
 On 01/15/2014 08:03 AM, Robert van Leeuwen wrote:
 Power-Loss Protection:  In the rare event that power fails while the
 drive is operating, power-loss protection helps ensure that data
 isn’t
 corrupted.
 
 Seems that not all power protected SSDs are created equal:
 http://lkcl.net/reports/ssd_analysis.html
 
 The m500 is not tested but the m4 is.
 
 Up to now it seems that only Intel seems to have done his homework.
 In general they *seem* to be the most reliable SSD provider.
 
 Even at that, there has been some concern on the list (and lkml) that
 certain older Intel drives without super-capacitors are ignoring
 ATA_CMD_FLUSH, making them very fast (which I like!) but potentially
 dangerous (boo!).  The 520 in particular is a drive I've used for a
 lot of Ceph performance testing but I'm afraid that if it's not
 properly handling CMD FLUSH requests, it may not be indicative of the
 performance folks would see on other drives that do.
 
 On the third hand, if drives with supercaps like the Intel DC S3700
 can safely ignore CMD_FLUSH and maintain high performance (even when
 there are a lot of O_DSYNC calls, ala the journal), that potentially
 makes them even more attractive (and that drive already has
 relatively high sequential write performance and high write endurance).
 
 
 Cheers,
 Robert van Leeuwen
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] servers advise (dell r515 or supermicro ....)

2014-01-15 Thread Sebastien Han
However you have to get the 480GB model or bigger, which is ridiculously large for a 
journal. I believe they are pretty expensive too.

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 15 Jan 2014, at 15:49, Sebastien Han sebastien@enovance.com wrote:

 Sorry I was only looking at the 4K aligned results.
 
  
 Sébastien Han 
 Cloud Engineer 
 
 Always give 100%. Unless you're giving blood.” 
 
 Phone: +33 (0)1 49 70 99 72 
 Mail: sebastien@enovance.com 
 Address : 10, rue de la Victoire - 75009 Paris 
 Web : www.enovance.com - Twitter : @enovance 
 
 On 15 Jan 2014, at 15:46, Stefan Priebe s.pri...@profihost.ag wrote:
 
 Am 15.01.2014 15:44, schrieb Mark Nelson:
 On 01/15/2014 08:39 AM, Stefan Priebe wrote:
 
 Am 15.01.2014 15:34, schrieb Sebastien Han:
 Hum the Crucial m500 is pretty slow. The biggest one doesn’t even
 reach 300MB/s.
 Intel DC S3700 100G showed around 200MB/sec for us.
 
 where did you get this values from? I've some 960GB and they all have 
 450Mb/s write speed. Also in tests like here you see  450MB/s
 http://www.tomshardware.com/reviews/crucial-m500-1tb-ssd,3551-5.html
 
 Looks like at least according to Anand's chart, you'll get full write
 speed once you buy the 480GB model, but not for the 120 or 240GB models:
 
 http://www.anandtech.com/show/6884/crucial-micron-m500-review-960gb-480gb-240gb-120gb
 
 that's correct but the sentence was  The biggest one doesn’t even
 reach 300MB/s.
 
 
 
 Actually, I don’t know the price difference between the crucial and
 the intel but the intel looks more suitable for me. Especially after
 Mark’s comment.
 
 
 Sébastien Han
 Cloud Engineer
 
 Always give 100%. Unless you're giving blood.”
 
 Phone: +33 (0)1 49 70 99 72
 Mail: sebastien@enovance.com
 Address : 10, rue de la Victoire - 75009 Paris
 Web : www.enovance.com - Twitter : @enovance
 
 On 15 Jan 2014, at 15:28, Mark Nelson mark.nel...@inktank.com wrote:
 
 On 01/15/2014 08:03 AM, Robert van Leeuwen wrote:
 Power-Loss Protection:  In the rare event that power fails while the
 drive is operating, power-loss protection helps ensure that data
 isn’t
 corrupted.
 
 Seems that not all power protected SSDs are created equal:
 http://lkcl.net/reports/ssd_analysis.html
 
 The m500 is not tested but the m4 is.
 
 Up to now it seems that only Intel seems to have done his homework.
 In general they *seem* to be the most reliable SSD provider.
 
 Even at that, there has been some concern on the list (and lkml) that
 certain older Intel drives without super-capacitors are ignoring
 ATA_CMD_FLUSH, making them very fast (which I like!) but potentially
 dangerous (boo!).  The 520 in particular is a drive I've used for a
 lot of Ceph performance testing but I'm afraid that if it's not
 properly handling CMD FLUSH requests, it may not be indicative of the
 performance folks would see on other drives that do.
 
 On the third hand, if drives with supercaps like the Intel DC S3700
 can safely ignore CMD_FLUSH and maintain high performance (even when
 there are a lot of O_DSYNC calls, ala the journal), that potentially
 makes them even more attractive (and that drive already has
 relatively high sequential write performance and high write endurance).
 
 
 Cheers,
 Robert van Leeuwen
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-community] Ceph User Committee elections : call for participation

2014-01-01 Thread Sebastien Han
Thanks!

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 01 Jan 2014, at 10:41, Loic Dachary l...@dachary.org wrote:

 
 
 On 01/01/2014 02:39, Sebastien Han wrote:
 Hi,
 
 I’m not sure to have the whole visibility of the role but I will be more 
 than happy to take over.
 I believe that I can allocate some time for this.
 
 Your name is added to the 
 http://pad.ceph.com/p/ceph-user-committee-candidates list
 
 Cheers
 
 
 Cheers.
  
 Sébastien Han 
 Cloud Engineer 
 
 Always give 100%. Unless you're giving blood.” 
 
 Phone: +33 (0)1 49 70 99 72 
 Mail: sebastien@enovance.com 
 Address : 10, rue de la Victoire - 75009 Paris 
 Web : www.enovance.com - Twitter : @enovance 
 
 On 31 Dec 2013, at 09:18, Loic Dachary l...@dachary.org wrote:
 
 Hi,
 
 For personal reasons I have to step down as head of the Ceph User Committee 
 at the end of January 2014. Who would be willing to take over this role ? 
 If there is enough interest I'll organize the election. Otherwise we'll 
 have to figure out something ;-)
 
 Cheers
 
 -- 
 Loïc Dachary, Artisan Logiciel Libre
 
 ___
 Ceph-community mailing list
 ceph-commun...@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com
 
 
 -- 
 Loïc Dachary, Artisan Logiciel Libre



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] My experience with ceph now documentted

2013-12-17 Thread Sebastien Han
The ceph doc is currently being updated. See 
https://github.com/ceph/ceph/pull/906

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 17 Dec 2013, at 00:13, Andrew Woodward xar...@gmail.com wrote:

 Karan,
 
 This all looks great. I'd encourage you to submit some of this information 
 into the ceph docs, some of the openstack integration docs are getting a 
 little dated 
 
 Andrew
 
 
 On Fri, Dec 6, 2013 at 12:24 PM, Karan Singh ksi...@csc.fi wrote:
 Hello Cephers 
 
 I would like to say a BIG THANKS to ceph community for helping me in setting 
 up and learning ceph.
 
 I have created a small documentation  http://karan-mj.blogspot.fi/  of my 
 experience with ceph till now , i belive it would help beginners in 
 installing ceph and integrating it with openstack. I would keep updating this 
 blog.
 
 
 PS -- i recommend original ceph documentation http://ceph.com/docs/master/ 
 and other original content published by Ceph community , INKTANK and other 
 partners.  My attempt  http://karan-mj.blogspot.fi/  is just to contribute 
 for a regular online content about ceph.
 
 
 
 Karan Singh
 CSC - IT Center for Science Ltd.
 P.O. Box 405, FI-02101 Espoo, FINLAND
 http://www.csc.fi/ | +358 (0) 503 812758
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 
 -- 
 If google has done it, Google did it right!
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Journal, SSD and OS

2013-12-06 Thread Sebastien Han
Arf forgot to mention that I’ll do a software mdadm RAID 1 with both sda1 and 
sdb1 and put the OS on this.
The rest (sda2 and sdb2) will go for the journals.
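
Roughly something like this (device names and partition sizes are only an example):

$ parted -s /dev/sda mklabel gpt
$ parted -s /dev/sda mkpart primary 1MiB 10GiB     # OS, md member
$ parted -s /dev/sda mkpart primary 10GiB 20GiB    # journal for one OSD
$ parted -s /dev/sda mkpart primary 20GiB 30GiB    # journal for another OSD
(same layout on /dev/sdb)
$ mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1

Then each OSD section in ceph.conf points at its own partition, something like 
“osd journal = /dev/sda2”, with half of the journals on each SSD.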

@James: I think that Gandalf’s main idea was to save some cost/space on the 
servers, so having dedicated disks is not an option. (That’s what I understand 
from your comment “have the OS somewhere else”, but I could be wrong.)

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 05 Dec 2013, at 16:02, James Pearce ja...@peacon.co.uk wrote:

 Another option is to run journals on individually presented SSDs, in a 5:1 
 ratio (spinning-disk:ssd) and have the OS somewhere else.  Then the failure 
 domain is smaller.
 
 Ideally implement some way to monitor SSD write life SMART data - at least it 
 gives a guide as to device condition compared to its rated life.  That can be 
 done with smartmontools, but it would be nice to have it on the InkTank 
 dashboard for example.
 
 
 On 2013-12-05 14:26, Sebastien Han wrote:
 Hi guys,
 
 I won’t do a RAID 1 with SSDs since they both write the same data.
 Thus, they are more likely to “almost” die at the same time.
 
 What I will try to do instead is to use both disk in JBOD mode or
 (degraded RAID0).
 Then I will create a tiny root partition for the OS.
 
 Then I’ll still have something like /dev/sda2 and /dev/sdb2 and then
 I can take advantage of the 2 disks independently.
 The good thing with that is that you can balance your journals across both 
 SSDs.
 From a performance perspective this is really good.
 The bad thing as always is that if you loose a SSD you loose all the
 journals attached to it.
 
 Cheers.
 
 
 Sébastien Han
 Cloud Engineer
 
 Always give 100%. Unless you're giving blood.”
 
 Phone: +33 (0)1 49 70 99 72
 Mail: sebastien@enovance.com
 Address : 10, rue de la Victoire - 75009 Paris
 Web : www.enovance.com - Twitter : @enovance
 
 On 05 Dec 2013, at 10:53, Gandalf Corvotempesta
 gandalf.corvotempe...@gmail.com wrote:
 
 2013/12/4 Simon Leinen simon.lei...@switch.ch:
 I think this is a fine configuration - you won't be writing to the root
 partition too much, outside journals.  We also put journals on the same
 SSDs as root partitions (not that we're very ambitious about
 performance...).
 
 Do you suggest a RAID1 for the OS partitions on SSDs ? Is this safe or
 a RAID1 will decrease SSD life?
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Journal, SSD and OS

2013-12-05 Thread Sebastien Han
Hi guys,

I won’t do a RAID 1 with SSDs since they both write the same data.
Thus, they are more likely to “almost” die at the same time.

What I will try to do instead is to use both disks in JBOD mode (or as degraded 
RAID0).
Then I will create a tiny root partition for the OS.

Then I’ll still have something like /dev/sda2 and /dev/sdb2 and then I can take 
advantage of the 2 disks independently.
The good thing with that is that you can balance your journals across both SSDs.
From a performance perspective this is really good.
The bad thing, as always, is that if you lose an SSD you lose all the journals 
attached to it.

Cheers.

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 05 Dec 2013, at 10:53, Gandalf Corvotempesta 
gandalf.corvotempe...@gmail.com wrote:

 2013/12/4 Simon Leinen simon.lei...@switch.ch:
 I think this is a fine configuration - you won't be writing to the root
 partition too much, outside journals.  We also put journals on the same
 SSDs as root partitions (not that we're very ambitious about
 performance...).
 
 Do you suggest a RAID1 for the OS partitions on SSDs ? Is this safe or
 a RAID1 will decrease SSD life?
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Docker

2013-11-29 Thread Sebastien Han
Hi guys!

Some experiment here: 
http://www.sebastien-han.fr/blog/2013/09/19/how-I-barely-got-my-first-ceph-mon-running-in-docker/

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 29 Nov 2013, at 00:08, Patrick McGarry patr...@inktank.com wrote:

 I played with Docker for a while and ran into some issues (perhaps
 from my own ignorance of Docker principles).  The biggest issue seemed
 to be that the IP was relatively ephemeral, which the MON really
 doesn't like.  I couldn't find a reliably intuitive way to have the
 MON get either the same IP or a way to update the IP in a way that
 would also form a cluster.
 
 If anyone has been able to get Ceph into Docker with reliability and
 portability I would love to hear about it (and feature it on the
 Ceph.com blog!).
 
 
 Best Regards,
 
 Patrick McGarry
 Director, Community || Inktank
 http://ceph.com  ||  http://inktank.com
 @scuttlemonkey || @ceph || @inktank
 
 
 On Thu, Nov 28, 2013 at 5:17 PM, Gandalf Corvotempesta
 gandalf.corvotempe...@gmail.com wrote:
 Anybody using MONs and RGW inside docker containers?
 I would like to use a server with two docker containers, one for mon
 and one for RGW
 
 This to archieve a better isolation between services and some reusable
 components (the same container can be exported and used multiple times
 on multiple servers)
 
 Suggestions?
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] LevelDB Backend For Ceph OSD Preview

2013-11-26 Thread Sebastien Han
Hi Sage,
If I recall correctly during the summit you mentioned that it was possible to 
disable the journal.
Is it still part of the plan?

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 25 Nov 2013, at 10:00, Sebastien Han sebastien@enovance.com wrote:

 Nice job Haomai!
 
  
 Sébastien Han 
 Cloud Engineer 
 
 Always give 100%. Unless you're giving blood.” 
 
 Phone: +33 (0)1 49 70 99 72 
 Mail: sebastien@enovance.com 
 Address : 10, rue de la Victoire - 75009 Paris 
 Web : www.enovance.com - Twitter : @enovance 
 
 On 25 Nov 2013, at 02:50, Haomai Wang haomaiw...@gmail.com wrote:
 
 
 
 
 On Mon, Nov 25, 2013 at 2:17 AM, Mark Nelson mark.nel...@inktank.com wrote:
 Great Work! This is very exciting!  Did you happen to try RADOS bench at 
 different object sizes and concurrency levels?
 
 
 Maybe can try it later. :-)
 
 Mark
 
 
 On 11/24/2013 03:01 AM, Haomai Wang wrote:
 Hi all,
 
 For Emperor
 blueprint(http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Add_LevelDB_support_to_ceph_cluster_backend_store),
 I'm sorry to delay the progress. Now, I have done the most of the works
 for the blueprint's goal. Because of sage's F
 blueprint(http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend),
 I need to adjust some codes to match it. The branch is
 here(https://github.com/yuyuyu101/ceph/tree/wip/6173).
 
 I have tested the LevelDB backend on three nodes(eight OSDs) and compare
 it to FileStore(ext4). I just use intern benchmark tool rados bench to
 get the comparison. The default ceph configurations is used and
 replication size is 2. The filesystem is ext4 and no others changed. The
 results is below:
 
 Rados Bench (KVStore vs FileStore):
 
                       Bandwidth  Average   Max       Min       Stddev    Stddev BW  Max BW    Min BW
                       (MB/sec)   Latency   Latency   Latency   Latency   (MB/sec)   (MB/sec)  (MB/sec)
 Write 30   KVStore    24.590     4.87257   14.752    0.580851  2.97708   9.91938    44        0
            FileStore  23.495     5.07716   13.0885   0.605118  3.30538   10.5986    76        0
 Write 20   KVStore    23.515     3.39745   11.6089   0.169507  2.58285   9.14467    44        0
            FileStore  23.064     3.45711   11.5996   0.138595  2.75962   8.54156    40        0
 Write 10   KVStore    22.927     1.73815   5.53792   0.171028  1.05982   9.18403    44        0
            FileStore  21.980     1.8198    6.46675   0.143392  1.20303   8.74401    40        0
 Write 5    KVStore    19.680     1.01492   3.10783   0.143758  0.561548  5.92575    36        0
            FileStore  20.017     0.997019  3.05008   0.138161  0.571459  6.844      32        0
 Read 30    KVStore    65.852     1.80069   9.30039   0.115153  -         -          -         -
            FileStore  60.688     1.96009   10.1146   0.061657  -         -          -         -
 Read 20    KVStore    59.372     1.30479   6.28435   0.016843  -         -          -         -
            FileStore  60.738     1.28383   8.21304   0.012073  -         -          -         -
 Read 10    KVStore    65.502     0.608805  3.3917    0.016267  -         -          -         -
            FileStore  55.814     0.7087    4.72626   0.011998  -         -          -         -
 Read 5     KVStore    64.176     0.307111  1.76391   0.017174  -         -          -         -
            FileStore  54.928     0.364077  1.90182   0.011999  -         -          -         -
 
 Charts can be view here(http://img42.com/ziwjP+) and
 (http://img42.com/LKhoo+)
 
 
 From above, I'm feeling relieved that the LevelDB backend isn't
 useless. Most of metrics are better and if increasing cache size for
 LevelDB the results may be more attractive.
 Even more, LevelDB backend is used by KeyValueStore and much of
 optimizations can be done to improve performance such as increase
 parallel threads or optimize io path.
 
 Next, I use rbd bench-write to test. The result is pity:
 
 RBD bench-write (KVStore vs FileStore):
 
                     OPS/sec                 Bytes/sec
                     KVStore    FileStore    KVStore      FileStore
 Seq  4096 5         27.42      716.55       111861.51    2492149.21
 Rand 4096 5         28.27      504          112331.42    1683151.29
 
 
 Just because kv backend doesn't support read/write operation with
 offset/length argument, each read/write operation will call

Re: [ceph-users] how to Testing cinder and glance with CEPH

2013-11-26 Thread Sebastien Han
Hi,

Well after restarting the services run:

$ cinder create 1

Then you can check both status in Cinder and Ceph:

For Cinder run:
$ cinder list

For Ceph run:
$ rbd -p cinder-pool ls

If the image is there, you’re good.
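
For Glance you can do the same kind of round trip (pool name “images” is just an 
example, use whatever you configured in glance-api.conf):

$ glance image-create --name test-image --disk-format raw --container-format bare --file test-image.raw
$ rbd -p images ls

If the image ID shows up in the pool, Glance is writing to Ceph as well.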

Cheers.

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 27 Nov 2013, at 00:04, Karan Singh ksi...@csc.fi wrote:

 Hello Cephers
 
 I was following http://ceph.com/docs/master/rbd/rbd-openstack/ for ceph and 
 openstack integration; using this document I have done all the changes 
 required for this integration.
 
 I am not sure how I should test my configuration, or how I should make sure the 
 integration is successful. Can you suggest some tests that I can perform to 
 check my ceph and openstack integration?
 
 FYI, in the document http://ceph.com/docs/master/rbd/rbd-openstack/ nothing is 
 mentioned after the "Restart Openstack Services" heading, but there should be 
 steps to test this integration. Please advise; I am new to openstack, and it 
 would be great if you can give me some commands to use for testing.
 
 
 
 Karan Singh
 CSC - IT Center for Science Ltd.
 P.O. Box 405, FI-02101 Espoo, FINLAND
 http://www.csc.fi/ | +358 (0) 503 812758
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] LevelDB Backend For Ceph OSD Preview

2013-11-25 Thread Sebastien Han
Nice job Haomai!

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 25 Nov 2013, at 02:50, Haomai Wang haomaiw...@gmail.com wrote:

 
 
 
 On Mon, Nov 25, 2013 at 2:17 AM, Mark Nelson mark.nel...@inktank.com wrote:
 Great Work! This is very exciting!  Did you happen to try RADOS bench at 
 different object sizes and concurrency levels?
 
 
 Maybe can try it later. :-)
  
 Mark
 
 
 On 11/24/2013 03:01 AM, Haomai Wang wrote:
 Hi all,
 
 For Emperor
 blueprint(http://wiki.ceph.com/01Planning/02Blueprints/Emperor/Add_LevelDB_support_to_ceph_cluster_backend_store),
 I'm sorry to delay the progress. Now, I have done the most of the works
 for the blueprint's goal. Because of sage's F
 blueprint(http://wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend),
 I need to adjust some codes to match it. The branch is
 here(https://github.com/yuyuyu101/ceph/tree/wip/6173).
 
 I have tested the LevelDB backend on three nodes(eight OSDs) and compare
 it to FileStore(ext4). I just use intern benchmark tool rados bench to
 get the comparison. The default ceph configurations is used and
 replication size is 2. The filesystem is ext4 and no others changed. The
 results is below:
 
 Rados Bench (KVStore vs FileStore):
 
                       Bandwidth  Average   Max       Min       Stddev    Stddev BW  Max BW    Min BW
                       (MB/sec)   Latency   Latency   Latency   Latency   (MB/sec)   (MB/sec)  (MB/sec)
 Write 30   KVStore    24.590     4.87257   14.752    0.580851  2.97708   9.91938    44        0
            FileStore  23.495     5.07716   13.0885   0.605118  3.30538   10.5986    76        0
 Write 20   KVStore    23.515     3.39745   11.6089   0.169507  2.58285   9.14467    44        0
            FileStore  23.064     3.45711   11.5996   0.138595  2.75962   8.54156    40        0
 Write 10   KVStore    22.927     1.73815   5.53792   0.171028  1.05982   9.18403    44        0
            FileStore  21.980     1.8198    6.46675   0.143392  1.20303   8.74401    40        0
 Write 5    KVStore    19.680     1.01492   3.10783   0.143758  0.561548  5.92575    36        0
            FileStore  20.017     0.997019  3.05008   0.138161  0.571459  6.844      32        0
 Read 30    KVStore    65.852     1.80069   9.30039   0.115153  -         -          -         -
            FileStore  60.688     1.96009   10.1146   0.061657  -         -          -         -
 Read 20    KVStore    59.372     1.30479   6.28435   0.016843  -         -          -         -
            FileStore  60.738     1.28383   8.21304   0.012073  -         -          -         -
 Read 10    KVStore    65.502     0.608805  3.3917    0.016267  -         -          -         -
            FileStore  55.814     0.7087    4.72626   0.011998  -         -          -         -
 Read 5     KVStore    64.176     0.307111  1.76391   0.017174  -         -          -         -
            FileStore  54.928     0.364077  1.90182   0.011999  -         -          -         -
 
 Charts can be view here(http://img42.com/ziwjP+) and
 (http://img42.com/LKhoo+)
 
 
  From above, I'm feeling relieved that the LevelDB backend isn't
 useless. Most of 

Re: [ceph-users] alternative approaches to CEPH-FS

2013-11-25 Thread Sebastien Han
Hi,

1) nfs over rbd (http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/)

This has been in production for more than a year now and was heavily tested before.
Performance was not the goal, since the frontend servers mainly do reads (90%).
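
The rough idea (only a sketch, names and sizes are examples) is simply:

$ rbd create nfs-share --size 102400
$ rbd map nfs-share
$ mkfs.xfs /dev/rbd0
$ mount /dev/rbd0 /srv/share

and then export /srv/share with a standard NFS server, with something like pacemaker 
on top if the export needs to fail over.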

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 14 Nov 2013, at 17:08, Gautam Saxena gsax...@i-a-inc.com wrote:

 1) nfs over rbd (http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/)

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] alternative approaches to CEPH-FS

2013-11-25 Thread Sebastien Han
Hi,

Well, basically, the frontend is composed of web servers. 
They mostly do reads on the NFS mount. 
I believe that the biggest frontend has around 60 virtual machines, accessing 
the share and serving it.

Unfortunately, I don’t have any figures anymore, but performance was really 
poor in general. However it was fair enough for us since the workload was 
going to be “mixed read”.

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 25 Nov 2013, at 13:50, Gautam Saxena gsax...@i-a-inc.com wrote:

 Hi Sebastien.
 
 Thanks! WHen you say performance was not expected, can you elaborate a 
 little? Specifically, what did you notice in terms of performance?
 
 
 
 On Mon, Nov 25, 2013 at 4:39 AM, Sebastien Han sebastien@enovance.com 
 wrote:
 Hi,
 
 1) nfs over rbd (http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/)
 
 This has been in production for more than a year now and heavily tested 
 before.
 Performance was not expected since frontend server mainly do read (90%).
 
 Cheers.
 
 Sébastien Han
 Cloud Engineer
 
 Always give 100%. Unless you're giving blood.”
 
 Phone: +33 (0)1 49 70 99 72
 Mail: sebastien@enovance.com
 Address : 10, rue de la Victoire - 75009 Paris
 Web : www.enovance.com - Twitter : @enovance
 
 On 14 Nov 2013, at 17:08, Gautam Saxena gsax...@i-a-inc.com wrote:
 
  1) nfs over rbd (http://www.sebastien-han.fr/blog/2012/07/06/nfs-over-rbd/)
 
 
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Intel 520/530 SSD for ceph

2013-11-22 Thread Sebastien Han
 I used a blocksize of 350k as my graphes shows me that this is the
 average workload we have on the journal.

Pretty interesting metric Stefan.
Has anyone seen the same behaviour?
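
If you want to check your own journals, something like this (assuming the journal 
lives on /dev/sdb) shows the average request size:

$ iostat -x /dev/sdb 5

The avgrq-sz column is in 512-byte sectors, so around 700 sectors would match a 
~350k average write.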

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 22 Nov 2013, at 02:37, Mark Nelson mark.nel...@inktank.com wrote:

 On 11/21/2013 02:36 AM, Stefan Priebe - Profihost AG wrote:
 Hi,
 
 Am 21.11.2013 01:29, schrieb m...@linuxbox.com:
 On Tue, Nov 19, 2013 at 09:02:41AM +0100, Stefan Priebe wrote:
 ...
 You might be able to vary this behavior by experimenting with sdparm,
 smartctl or other tools, or possibly with different microcode in the 
 drive.
 Which values or which settings do you think of?
 ...
 
 Off-hand, I don't know.  Probably the first thing would be
 to compare the configuration of your 520  530; anything that's
 different is certainly worth investigating.
 
 This should display all pages,
 sdparm --all --long /dev/sdX
 the 520 only appears to have 3 pages, which can be fetched directly w/
 sdparm --page=ca --long /dev/sdX
 sdparm --page=co --long /dev/sdX
 sdparm --page=rw --long /dev/sdX
 
 The sample machine I'm looking has an intel 520, and on ours,
 most options show as 0 except for
   AWRE1  [cha: n, def:  1]  Automatic write reallocation enabled
   WCE 1  [cha: y, def:  1]  Write cache enable
   DRA 1  [cha: n, def:  1]  Disable read ahead
   GLTSD   1  [cha: n, def:  1]  Global logging target save disable
   BTP-1  [cha: n, def: -1]  Busy timeout period (100us)
   ESTCT  30  [cha: n, def: 30]  Extended self test completion time (sec)
 Perhaps that's an interesting data point to compare with yours.
 
 Figuring out if you have up-to-date intel firmware appears to require
 burning and running an iso image from
 https://downloadcenter.intel.com/Detail_Desc.aspx?agr=YDwnldID=18455
 
 The results of sdparm --page=whatever --long /dev/sdc
 show the intel firmware, but this labels it better:
 smartctl -i /dev/sdc
 Our 520 has firmware 400i loaded.
 
 Firmware is up2date and all values are the same. I expect that the 520
 firmware just ignores CMD_FLUSH commands and the 530 does not.
 
 For those of you that don't follow LKML, there is some interesting discussion 
 going on regarding this same issue (Hi Stefan!)
 
 https://lkml.org/lkml/2013/11/20/158
 
 Can anyone think of a reasonable (ie not yanking power out) way to test what 
 CMD_FLUSH is actually doing?  I have some 520s in our test rig I can play 
 with.  Otherwise, maybe an Intel engineer can chime in and let us know what's 
 going on?
 
 
 Greets,
 Stefan
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] presentation videos from Ceph Day London?

2013-10-31 Thread Sebastien Han
Nothing has been recorded as far as I know.
However I’ve seen some guys from Scality recording sessions with a cam.

Scality? Are you there? :)

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.” 

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 30 Oct 2013, at 10:24, Blair Bethwaite blair.bethwa...@gmail.com wrote:

 I've been perusing the content on slideshare and see some really interesting 
 and creatively composed presentations! Was there any recording done (and 
 plans to make it generally available)?
 
 -- 
 Cheers,
 ~Blairo
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Openstack on ceph rbd installation failure

2013-07-23 Thread Sebastien Han
Can you send your ceph.conf too?
Is /etc/ceph/ceph.conf present? Is the key of the volumes user present too?

Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72 – Mobile: +33 (0)6 52 84 44 70
Email: sebastien@enovance.com – Skype: han.sbastien
Address: 10, rue de la Victoire – 75009 Paris
Web: www.enovance.com – Twitter: @enovance
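
A quick way to check from the cinder-volume host (assuming the client is called 
“volumes”):

$ ls /etc/ceph/
$ ceph -s --id volumes

If the second command cannot connect, cinder won't be able to either.
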
On Jul 23, 2013, at 5:39 AM, johnu johnugeorge...@gmail.com wrote:

 Hi,

 I have a three node ceph cluster. ceph -w says health ok. I have openstack in the 
 same cluster and trying to map cinder and glance onto rbd. I have followed steps 
 as given in http://ceph.com/docs/next/rbd/rbd-openstack/

 New settings added in cinder.conf for the three files:
 volume_driver=cinder.volume.drivers.rbd.RBDDriver
 rbd_pool=volumes
 glance_api_version=2
 rbd_user=volumes
 rbd_secret_uuid=62d0b384-50ad-2e17-15ed-66bfeda40252 (different for each node)

 Logs seen when I run ./rejoin.sh:

 2013-07-22 20:35:01.900 INFO cinder.service [-] Starting 1 workers
 2013-07-22 20:35:01.909 INFO cinder.service [-] Started child 2290
 2013-07-22 20:35:01.965 AUDIT cinder.service [-] Starting cinder-volume node (version 2013.2)
 2013-07-22 20:35:02.129 ERROR cinder.volume.drivers.rbd [req-d3bc2e86-e9db-40e8-bcdb-08c609ce44c3 None None] error connecting to ceph cluster
 2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd Traceback (most recent call last):
 2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 243, in check_for_setup_error
 2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd     with RADOSClient(self):
 2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 215, in __init__
 2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd     self.cluster, self.ioctx = driver._connect_to_rados(pool)
 2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 263, in _connect_to_rados
 2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd     client.connect()
 2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd   File "/usr/lib/python2.7/dist-packages/rados.py", line 192, in connect
 2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd     raise make_ex(ret, "error calling connect")
 2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd ObjectNotFound: error calling connect
 2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd
 2013-07-22 20:35:02.149 ERROR cinder.service [req-d3bc2e86-e9db-40e8-bcdb-08c609ce44c3 None None] Unhandled exception
 2013-07-22 20:35:02.149 TRACE cinder.service Traceback (most recent call last):
 2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/service.py", line 228, in _start_child
 2013-07-22 20:35:02.149 TRACE cinder.service     self._child_process(wrap.server)
 2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/service.py", line 205, in _child_process
 2013-07-22 20:35:02.149 TRACE cinder.service     launcher.run_server(server)
 2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/service.py", line 96, in run_server
 2013-07-22 20:35:02.149 TRACE cinder.service     server.start()
 2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/service.py", line 359, in start
 2013-07-22 20:35:02.149 TRACE cinder.service     self.manager.init_host()
 2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/volume/manager.py", line 139, in init_host
 2013-07-22 20:35:02.149 TRACE cinder.service     self.driver.check_for_setup_error()
 2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 248, in check_for_setup_error
 2013-07-22 20:35:02.149 TRACE cinder.service     raise exception.VolumeBackendAPIException(data="">
 2013-07-22 20:35:02.149 TRACE cinder.service VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: error connecting to ceph cluster
 2013-07-22 20:35:02.149 TRACE cinder.service
 2013-07-22 20:35:02.191 INFO cinder.service [-] Child 2290 exited with status 2
 2013-07-22 20:35:02.192 INFO cinder.service [-] _wait_child 1
 2013-07-22 20:35:02.193 INFO cinder.service [-] wait wrap.failed True

 Can someone help me with some debug points and solve it?

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD Mapping

2013-07-23 Thread Sebastien Han
Hi Greg,

Just tried the list watchers, on a rbd with the QEMU driver and I got:

root@ceph:~# rados -p volumes listwatchers rbd_header.789c2ae8944a
watcher=client.30882 cookie=1

I also tried with the kernel module but didn't see anything…
No IP addresses anywhere… :/, any idea?

Nice tip btw :)

Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72 – Mobile: +33 (0)6 52 84 44 70
Email: sebastien@enovance.com – Skype: han.sbastien
Address: 10, rue de la Victoire – 75009 Paris
Web: www.enovance.com – Twitter: @enovance
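
(In case it helps someone searching the archives: for a format 2 image you can get 
that id from “rbd info”, e.g. with a made-up volume name:

$ rbd info volumes/volume-foo | grep block_name_prefix

which prints something like rbd_data.789c2ae8944a; the header object is then 
rbd_header.789c2ae8944a.)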

On Jul 23, 2013, at 11:01 PM, Gregory Farnum g...@inktank.com wrote:

 On Tue, Jul 23, 2013 at 1:28 PM, Wido den Hollander w...@42on.com wrote:
 On 07/23/2013 09:09 PM, Gaylord Holder wrote:
 Is it possible to find out which machines are mapping an RBD?

 No, that is stateless. You can use locking however, you can for example put
 the hostname of the machine in the lock.
 But that's not mandatory in the protocol.
 Maybe you are able to list watchers for a RBD drive, but I'm not sure about
 that.

 You can. "rados listwatchers object" will tell you who's got watches
 registered, and that output should include IPs. You'll want to run it
 against the rbd head object.
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com

 -Gaylord

 --
 Wido den Hollander
 42on B.V.
 Phone: +31 (0)20 700 9902
 Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD Mapping

2013-07-23 Thread Sebastien Han
Arf, no worries. Even after a quick dive into the logs, I haven't found anything 
(default log level).

Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72 – Mobile: +33 (0)6 52 84 44 70
Email: sebastien@enovance.com – Skype: han.sbastien
Address: 10, rue de la Victoire – 75009 Paris
Web: www.enovance.com – Twitter: @enovance

On Jul 24, 2013, at 12:08 AM, Gregory Farnum g...@inktank.com wrote:

 On Tue, Jul 23, 2013 at 2:55 PM, Sebastien Han sebastien@enovance.com wrote:
 Hi Greg,
 Just tried the list watchers, on a rbd with the QEMU driver and I got:
 root@ceph:~# rados -p volumes listwatchers rbd_header.789c2ae8944a
 watcher=client.30882 cookie=1
 I also tried with the kernel module but didn't see anything…
 No IP addresses anywhere… :/, any idea?
 Nice tip btw :)

 Oh, whoops. Looks like the first iteration didn't include IP
 addresses; they show up in version 0.65 or later. Sorry for the
 inconvenience. I think there might be a way to convert client IDs into
 addresses but I can't quite think of any convenient ones (as opposed
 to inconvenient ones like digging them up out of logs); maybe somebody
 else has an idea...
 -Greg
 Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RADOS Bench strange behavior

2013-07-09 Thread Sebastien Han
Hi all,

While running some benchmarks with the internal rados benchmarker I noticed 
something really strange. First of all, this is the line I used to run it:

$ sudo rados -p 07:59:54_performance bench 300 write -b 4194304 -t 1 --no-cleanup

So I want to test IO with a concurrency of 1. I had a look at the code and also 
straced the process, and I noticed that the IOs are sent one by one, sequentially. 
Thus it does what I expect from it.

However, while monitoring the disk usage on all my OSDs, I found out that they were 
all loaded (writing, both journals and filestore), which is kind of weird since all 
the IOs are sent one by one. I was expecting that only one OSD at a time would be 
writing.

Obviously there is no replication going on since I changed the rep size to 1.

$ ceph osd dump | grep "07:59:54_performance"
pool 323 '07:59:54_performance' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 1306 owner 0

Thanks in advance guys.

Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72 – Mobile: +33 (0)6 52 84 44 70
Email: sebastien@enovance.com – Skype: han.sbastien
Address: 10, rue de la Victoire – 75009 Paris
Web: www.enovance.com – Twitter: @enovance
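
For anyone hitting the same thing: each 4MB object written by rados bench gets a new 
object name and therefore hashes to a different PG, so over the whole run the writes 
end up spread across all the OSDs even with -t 1. You can check where a given object 
lands with something like this (the object name is only an illustration of what 
rados bench generates):

$ ceph osd map 07:59:54_performance benchmark_data_hostname_1234_object0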
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
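
A likely explanation (not spelled out in the thread) is that rados bench writes every 4 MB request to a brand-new object, and CRUSH places each object on a different primary OSD; with -t 1 only one write is in flight at a time, but successive writes still land on different disks. One way to check, assuming the default benchmark object naming of benchmark_data_<host>_<pid>_object<N>:

$ rados -p 07:59:54_performance ls | head -3
# map two of the listed objects to their OSDs; the primary OSD should differ from object to object
$ ceph osd map 07:59:54_performance benchmark_data_<host>_<pid>_object0
$ ceph osd map 07:59:54_performance benchmark_data_<host>_<pid>_object1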


Re: [ceph-users] Problem with multiple hosts RBD + Cinder

2013-06-21 Thread Sebastien Han
De rien, cool :)

Yes, start from the libvirt section.

Cheers!

Sébastien Han
Cloud Engineer
"Always give 100%. Unless you're giving blood."
Email: sebastien@enovance.com – Web: www.enovance.com – Twitter: @enovance
On Jun 21, 2013, at 11:25 AM, Igor Laskovy igor.lask...@gmail.com wrote:

> Merci Sebastien, it works now ;)
>
> Now, for live migration do I need to follow
> https://wiki.openstack.org/wiki/LiveMigrationUsage
> beginning from the libvirt settings section?
>
> On Thu, Jun 20, 2013 at 2:47 PM, Sebastien Han sebastien@enovance.com wrote:
>> Hi,
>>
>> No, this must always be the same UUID. You can only specify one in cinder.conf.
>> Btw, nova does the attachment, this is why it needs the uuid and secret.
>>
>> The first secret import generates a UUID, then always re-use the same one for all
>> your compute nodes. Do something like:
>>
>> <secret ephemeral='no' private='no'>
>>   <uuid>9e4c7795-0681-cd4f-cf36-8cb8aef3c47f</uuid>
>>   <usage type='ceph'>
>>     <name>client.volumes secret</name>
>>   </usage>
>> </secret>
>>
>> Cheers.
>>
>> Sébastien Han
>> Cloud Engineer
>> "Always give 100%. Unless you're giving blood."
>>
>> On Jun 20, 2013, at 12:23 PM, Igor Laskovy igor.lask...@gmail.com wrote:
>>> Hello list!
>>>
>>> I am trying to deploy Ceph RBD + OpenStack Cinder. Basically, my question relates to this section in the documentation:
>>>
>>> cat > secret.xml <<EOF
>>> <secret ephemeral='no' private='no'>
>>>   <usage type='ceph'>
>>>     <name>client.volumes secret</name>
>>>   </usage>
>>> </secret>
>>> EOF
>>> sudo virsh secret-define --file secret.xml
>>> <uuid of secret is output here>
>>> sudo virsh secret-set-value --secret {uuid of secret} --base64 $(cat client.volumes.key) && rm client.volumes.key secret.xml
>>>
>>> Do I need to tie the libvirt secrets logic to the ceph client.volumes user on each cinder-volume host? So there will be a separate "uuid of secret" for each host, but they will all use the single user cinder.volumes, right?
>>>
>>> Asking this because I have a strange error in nova-scheduler.log on the controller host:
>>>
>>> 2013-06-20 13:10:01.270 ERROR nova.scheduler.filter_scheduler [req-b173d765-9528-43af-a3d1-bd811df8710d fd860a2737f94ff0bc7decec5783017b 3f47be9a0c2348faac4deec2a988acd8] [instance: d8dd40d4-61de-498d-a54f-12f4d9e9c594] Error from last host: node03 (node node03.ceph.labspace.studiogrizzly.com):
>>> [u'Traceback (most recent call last):\n',
>>>  u'  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 848, in _run_instance\n    set_access_ip=set_access_ip)\n',
>>>  u'  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1107, in _spawn\n    LOG.exception(_(\'Instance failed to spawn\'), instance=instance)\n',
>>>  u'  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__\n    self.gen.next()\n',
>>>  u'  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1103, in _spawn\n    block_device_info)\n',
>>>  u'  File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 1527, in spawn\n    block_device_info)\n',
>>>  u'  File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2443, in _create_domain_and_network\n    domain = self._create_domain(xml, instance=instance)\n',
>>>  u'  File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2404, in _create_domain\n    domain.createWithFlags(launch_flags)\n',
>>>  u'  File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 187, in doit\n    result = proxy_call(self._autowrap, f, *args, **kwargs)\n',
>>>  u'  File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 147, in proxy_call\n    rv = execute(f, *args, **kwargs)\n',
>>>  u'  File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 76, in tworker\n    rv = meth(*args, **kwargs)\n',
>>>  u'  File "/usr/lib/python2.7/dist-packages/libvirt.py", line 711, in createWithFlags\n    if ret == -1: raise libvirtError(\'virDomainCreateWithFlags() failed\', dom=self)\n',
>>>  u"libvirtError: internal error rbd username 'volumes' specified but secret not found\n"]
>>>
>>> --
>>> Igor Laskovy
>>> facebook.com/igor.laskovy
>>> studiogrizzly.com
>
> --
> Igor Laskovy
> facebook.com/igor.laskovy
> studiogrizzly.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
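
To make the above concrete, a minimal sketch of how the pieces usually fit together on a Grizzly-era setup (UUID, pool and user names are the ones from the thread; the cinder.conf option names come from the RBD volume driver and may differ slightly between releases):

# on EVERY compute node, define the secret with the same fixed UUID and attach the client.volumes key
$ sudo virsh secret-define --file secret.xml    # secret.xml carries the fixed <uuid> element shown above
$ sudo virsh secret-set-value --secret 9e4c7795-0681-cd4f-cf36-8cb8aef3c47f \
      --base64 $(ceph auth get-key client.volumes)

# cinder.conf on the cinder-volume host(s)
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes
rbd_user=volumes
rbd_secret_uuid=9e4c7795-0681-cd4f-cf36-8cb8aef3c47f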


Re: [ceph-users] Problem with multiple hosts RBD + Cinder

2013-06-20 Thread Sebastien Han
Hi,

No, this must always be the same UUID. You can only specify one in cinder.conf.
Btw, nova does the attachment, this is why it needs the uuid and secret.

The first secret import generates a UUID, then always re-use the same one for all your compute nodes. Do something like:

<secret ephemeral='no' private='no'>
  <uuid>9e4c7795-0681-cd4f-cf36-8cb8aef3c47f</uuid>
  <usage type='ceph'>
    <name>client.volumes secret</name>
  </usage>
</secret>

Cheers.

Sébastien Han
Cloud Engineer
"Always give 100%. Unless you're giving blood."
Email: sebastien@enovance.com – Web: www.enovance.com – Twitter: @enovance
On Jun 20, 2013, at 12:23 PM, Igor Laskovy igor.lask...@gmail.com wrote:

> Hello list!
>
> I am trying to deploy Ceph RBD + OpenStack Cinder. Basically, my question relates to this section in the documentation:
>
> cat > secret.xml <<EOF
> <secret ephemeral='no' private='no'>
>   <usage type='ceph'>
>     <name>client.volumes secret</name>
>   </usage>
> </secret>
> EOF
> sudo virsh secret-define --file secret.xml
> <uuid of secret is output here>
> sudo virsh secret-set-value --secret {uuid of secret} --base64 $(cat client.volumes.key) && rm client.volumes.key secret.xml
>
> Do I need to tie the libvirt secrets logic to the ceph client.volumes user on each cinder-volume host? So there will be a separate "uuid of secret" for each host, but they will all use the single user cinder.volumes, right?
>
> Asking this because I have a strange error in nova-scheduler.log on the controller host:
>
> 2013-06-20 13:10:01.270 ERROR nova.scheduler.filter_scheduler [req-b173d765-9528-43af-a3d1-bd811df8710d fd860a2737f94ff0bc7decec5783017b 3f47be9a0c2348faac4deec2a988acd8] [instance: d8dd40d4-61de-498d-a54f-12f4d9e9c594] Error from last host: node03 (node node03.ceph.labspace.studiogrizzly.com):
> [u'Traceback (most recent call last):\n',
>  u'  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 848, in _run_instance\n    set_access_ip=set_access_ip)\n',
>  u'  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1107, in _spawn\n    LOG.exception(_(\'Instance failed to spawn\'), instance=instance)\n',
>  u'  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__\n    self.gen.next()\n',
>  u'  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1103, in _spawn\n    block_device_info)\n',
>  u'  File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 1527, in spawn\n    block_device_info)\n',
>  u'  File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2443, in _create_domain_and_network\n    domain = self._create_domain(xml, instance=instance)\n',
>  u'  File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2404, in _create_domain\n    domain.createWithFlags(launch_flags)\n',
>  u'  File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 187, in doit\n    result = proxy_call(self._autowrap, f, *args, **kwargs)\n',
>  u'  File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 147, in proxy_call\n    rv = execute(f, *args, **kwargs)\n',
>  u'  File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 76, in tworker\n    rv = meth(*args, **kwargs)\n',
>  u'  File "/usr/lib/python2.7/dist-packages/libvirt.py", line 711, in createWithFlags\n    if ret == -1: raise libvirtError(\'virDomainCreateWithFlags() failed\', dom=self)\n',
>  u"libvirtError: internal error rbd username 'volumes' specified but secret not found\n"]
>
> --
> Igor Laskovy
> facebook.com/igor.laskovy
> studiogrizzly.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
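
A quick way to chase the "secret not found" error above on a given compute node is to ask libvirt directly whether the secret exists and carries a value (the UUID is the example one from the thread):

$ sudo virsh secret-list
$ sudo virsh secret-get-value 9e4c7795-0681-cd4f-cf36-8cb8aef3c47f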


Re: [ceph-users] Live Migrations with cephFS

2013-06-17 Thread Sebastien Han
> Thank you, Sebastien Han. I am sure many are thankful you've published your thoughts and experiences with Ceph and even OpenStack.

Thanks Bo! :)

> If I may, I would like to reword my question/statement with greater clarity: to force all instances to always boot from RBD volumes, would a person have to make changes to something more than Horizon (the demonstration GUI)? If the changes need only be in Horizon, the provider would then likely need to restrict or deny their customers access to the unmodified APIs. If they do not, then the unchanged APIs would allow behavior the provider does not want.
>
> Thoughts? Corrections? Feel free to teach.

This is correct. Forcing boot from volume requires a modified version of the API (which is kinda tricky) plus GUI modifications. There are 2 cases:

1. You're an ISP (public provider): you should forget about the idea unless you want to provide a _really_ closed service.
2. You're the only one managing your platform (private cloud): this might be doable, but even so you'll encounter a lot of problems while upgrading.

At the end it's up to you, if you're 100% sure that you have complete control of your infra and that you know when, who and how new instances are booted (and occasionally don't care about updates and compatibility).

You can always hack the dashboard, but it's more than that: you have to automate things so that each time someone boots a VM, a volume is created from an image for it. This will prolong the process. At this point, I'd recommend you to push this blueprint; it'll run all the VMs through Ceph, even the ones not using the boot-from-volume option:

https://blueprints.launchpad.net/nova/+spec/bring-rbd-support-libvirt-images-type

An article is coming next week and will cover the entire subject.

Cheers!

Sébastien Han
Cloud Engineer
"Always give 100%. Unless you're giving blood."
Email: sebastien@enovance.com – Web: www.enovance.com – Twitter: @enovance
On Jun 17, 2013, at 8:00 AM, Wolfgang Hennerbichler wolfgang.hennerbich...@risc-software.at wrote:

> On 06/14/2013 08:00 PM, Ilja Maslov wrote:
>> Hi,
>>
>> Is live migration supported with RBD and KVM/OpenStack?
>> Always wanted to know but was afraid to ask :)
>
> Totally works in my productive setup, but we don't use OpenStack in this
> installation, just KVM/RBD.
>
>> Pardon brevity and formatting, replying from the phone.
>>
>> Cheers,
>> Ilja
>>
>> Robert Sander r.san...@heinlein-support.de wrote:
>>> On 14.06.2013 12:55, Alvaro Izquierdo Jimeno wrote:
>>>> By default, openstack uses NFS but… other options are available…. Can we
>>>> use cephFS instead of NFS?
>>>
>>> Wouldn't you use qemu-rbd for your virtual guests in OpenStack?
>>> AFAIK CephFS is not needed for KVM/qemu virtual machines.
>>>
>>> Regards
>>> --
>>> Robert Sander
>>> Heinlein Support GmbH
>>> Schwedter Str. 8/9b, 10119 Berlin
>>> http://www.heinlein-support.de
>
> --
> DI (FH) Wolfgang Hennerbichler
> Software Development, RISC Software GmbH
> http://www.risc-software.at
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
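
For reference, the unmodified, non-GUI way to get a VM running from an RBD-backed volume on a Grizzly-era cloud is roughly the following (UUIDs are placeholders; CLI flag spellings vary between client releases, and some releases also insist on an --image argument even when booting from a volume):

# create a bootable volume from a Glance image (it ends up as an RBD image in the volumes pool)
$ cinder create --image-id <glance image uuid> --display-name vm-root 20

# boot the instance from that volume (some clients spell the flag --block-device-mapping)
$ nova boot --flavor m1.small --block_device_mapping vda=<volume uuid>:::0 vm-from-rbd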


Re: [ceph-users] Live Migrations with cephFS

2013-06-16 Thread Sebastien Han
In OpenStack, a VM booted from a volume (where the disk is located on RBD) supports live migration without any problems.

Sébastien Han
Cloud Engineer
"Always give 100%. Unless you're giving blood."
Email: sebastien@enovance.com – Web: www.enovance.com – Twitter: @enovance
On Jun 14, 2013, at 11:36 PM, Bo b...@samware.com wrote:

> If I am not mistaken, one would need to modify the OpenStack source to force Nova to boot from RBD volumes. Is this no longer the case?
>
> Modifying OpenStack's source is a wonderful idea, especially if you push your changes upstream for review. However, it does add to your work when you want to pull updated code from upstream into your deployment.
>
> -bo
>
>> On 14.06.2013 12:55, Alvaro Izquierdo Jimeno wrote:
>>> By default, openstack uses NFS but… other options are available…. Can we
>>> use cephFS instead of NFS?
>>
>> Wouldn't you use qemu-rbd for your virtual guests in OpenStack?
>> AFAIK CephFS is not needed for KVM/qemu virtual machines.
>>
>> Regards
>> --
>> Robert Sander
>> Heinlein Support GmbH
>> Schwedter Str. 8/9b, 10119 Berlin
>> http://www.heinlein-support.de
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] QEMU -drive setting (if=none) for rbd

2013-06-13 Thread Sebastien Han
OpenStack doesn't know how to set different caching options for attached block devices. See the following blueprint:

https://blueprints.launchpad.net/nova/+spec/enable-rbd-tuning-options

This might be implemented for Havana.

Cheers.

Sébastien Han
Cloud Engineer
"Always give 100%. Unless you're giving blood."
Email: sebastien@enovance.com – Web: www.enovance.com – Twitter: @enovance
On Jun 11, 2013, at 7:43 PM, Oliver Francke oliver.fran...@filoo.de wrote:

> Hi,
>
> Am 11.06.2013 um 19:14 schrieb w sun ws...@hotmail.com:
>
>> Hi,
>>
>> We are currently testing the performance with rbd caching enabled in write-back mode on our OpenStack (Grizzly) nova nodes. By default, nova fires up the rbd volumes in "if=none" mode, evidenced by the following command line from "ps | grep":
>>
>> -drive file=rbd:ceph-openstack-volumes/volume-949e2e32-20c7-45cf-b41b-46951c78708b:id=ceph-openstack-volumes:key=12347I9RsEoIDBAAi2t+M6+7zMMZoMM+aasiog==:auth_supported=cephx\;none,if=none,id=drive-virtio-disk0,format=raw,serial=949e2e32-20c7-45cf-b41b-46951c78708b,cache=writeback
>>
>> Does anyone know if this should be set to anything else (e.g., if=virtio, suggested by some qemu posts in general)? Given that the underlying network stack for RBD IO is provided by the Linux kernel instead, does this option have any relevance for rbd volume performance inside the guest VM?
>
> There should be something like "-device virtio-blk-pci,drive=drive-virtio-disk0" in reference to the id= of the drive specification.
>
> Furthermore, to really check rbd_cache, there is something like:
>
> rbd_cache=true:rbd_cache_size=33554432:rbd_cache_max_dirty=16777216:rbd_cache_target_dirty=8388608
>
> missing in the ":"-list, perhaps after :none:
>
> rbd_cache=true:rbd_cache_size=33554432:rbd_cache_max_dirty=16777216:rbd_cache_target_dirty=8388608
>
> cache=writeback is necessary, too.
>
> No idea, though, how to teach openstack to use these parameters, sorry.
>
> Regards,
>
> Oliver.
>
>> Thanks. --weiguo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
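
Putting Oliver's two hints together, a drive definition with the librbd cache options spliced into the ":"-list plus the matching virtio device would look roughly like this (pool, volume, user and key are placeholders; a sketch assembled from the thread, not a verified nova-generated command line):

-drive file=rbd:ceph-openstack-volumes/volume-<uuid>:id=ceph-openstack-volumes:key=<base64 key>:auth_supported=cephx\;none:rbd_cache=true:rbd_cache_size=33554432:rbd_cache_max_dirty=16777216:rbd_cache_target_dirty=8388608,if=none,id=drive-virtio-disk0,format=raw,cache=writeback \
-device virtio-blk-pci,drive=drive-virtio-disk0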


Re: [ceph-users] Live Migration: KVM-Libvirt Shared-storage

2013-06-05 Thread Sebastien Han
I did, what would you like to know?

Sébastien Han
Cloud Engineer
"Always give 100%. Unless you're giving blood."
Email: sebastien@enovance.com – Web: www.enovance.com – Twitter: @enovance
On May 30, 2013, at 1:49 AM, Amit Vijairania amit.vijaira...@gmail.com wrote:

> We are currently testing Ceph with the OpenStack Grizzly release and looking for some insight on Live Migration [1]. Based on the documentation, there are two options for the shared storage used for Nova instances (/var/lib/nova/instances): NFS and the OpenStack Gluster Connector.
>
> Do you know if anyone is using, or has tested, CephFS for the Nova instances directory (console.log, libvirt.xml, )?
>
> [1] http://docs.openstack.org/trunk/openstack-compute/admin/content/configuring-migrations.html
>
> Thanks!
> Amit
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
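
Nobody in the thread reports production experience with it, but mechanically a shared instances directory on CephFS is just a kernel-client mount on every compute node (monitor address and secret file are placeholders; the libvirt/nova live-migration flags themselves are a separate topic):

$ sudo mkdir -p /var/lib/nova/instances
$ sudo mount -t ceph mon1.example.com:6789:/ /var/lib/nova/instances \
      -o name=admin,secretfile=/etc/ceph/admin.secret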


Re: [ceph-users] qemu-1.4.2 rbd-fixed ubuntu packages

2013-05-28 Thread Sebastien Han
Arf, sorry Wolfgang, I scratched your name in my previous email :).

Sébastien Han
Cloud Engineer
"Always give 100%. Unless you're giving blood."
Email: sebastien@enovance.com – Web: www.enovance.com – Twitter: @enovance
On May 29, 2013, at 12:19 AM, Sebastien Han sebastien@enovance.com wrote:

> Wolfgang,
>
> I'm interested, and I assume I'm not the only one, thus can't you just make it public for everyone?
>
> Thanks.
>
> Sébastien Han
> Cloud Engineer
> "Always give 100%. Unless you're giving blood."
>
> On May 28, 2013, at 8:10 PM, Alex Bligh a...@alex.org.uk wrote:
>
>> Wolfgang,
>>
>> On 28 May 2013, at 06:50, Wolfgang Hennerbichler wrote:
>>
>>> For anybody who's interested, I've packaged the latest qemu-1.4.2 (not 1.5, it didn't work nicely with libvirt), which includes important fixes to RBD, for Ubuntu 12.04 AMD64. If you want to save some time, I can share the packages with you. Drop me a line if you're interested.
>>
>> Information as to what the important fixes are would be appreciated!
>>
>> --
>> Alex Bligh
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

