[ceph-users] VMs freeze after slow requests

2013-06-03 Thread Dominik Mostowiec
Hi, I am trying to start a postgres cluster on VMs with a second disk mounted from Ceph (rbd - kvm). I started some writes (pgbench initialisation) on 8 VMs and the VMs froze. Ceph reports slow requests on 1 osd. I restarted this osd to clear the slow requests, and the VMs hung permanently. Is this a normal situation after

Re: [ceph-users] VMs freeze after slow requests

2013-06-03 Thread Gregory Farnum
On Sunday, June 2, 2013, Dominik Mostowiec wrote: Hi, I am trying to start a postgres cluster on VMs with a second disk mounted from Ceph (rbd - kvm). I started some writes (pgbench initialisation) on 8 VMs and the VMs froze. Ceph reports slow requests on 1 osd. I restarted this osd to clear the slow requests and

Re: [ceph-users] VMs freeze after slow requests

2013-06-03 Thread Olivier Bonvalet
On Monday, June 3, 2013 at 08:04 -0700, Gregory Farnum wrote: On Sunday, June 2, 2013, Dominik Mostowiec wrote: Hi, I am trying to start a postgres cluster on VMs with a second disk mounted from Ceph (rbd - kvm). I started some writes (pgbench initialisation)

Re: [ceph-users] Ceph killed by OS because of OOM under high load

2013-06-03 Thread Gregory Farnum
On Mon, Jun 3, 2013 at 8:47 AM, Chen, Xiaoxi xiaoxi.c...@intel.com wrote: Hi, As my previous mail reported some weeks ago, we are suffering from OSD crashes / OSD flipping / system reboots etc.; all these stability issues really stop us from digging further into Ceph characterization.

Re: [ceph-users] MDS has been repeatedly laggy or crashed

2013-06-03 Thread Gregory Farnum
On Sat, Jun 1, 2013 at 7:50 PM, MinhTien MinhTien tientienminh080...@gmail.com wrote: Hi all. I have 3 servers (using ceph 0.56.6): 1 server used for mon + mds.0, 1 server running an OSD daemon (RAID 6 (44TB) = osd.0) + mds.1, 1 server running an OSD daemon (RAID 6 (44TB) = osd.1) + mds.2. When running ceph

Re: [ceph-users] ceph-deploy

2013-06-03 Thread John Wilkins
Actually, as I said, I unmounted them first, zapped the disk, then used OSD create. For you, that might look like: sudo umount /dev/sda3 ceph-deploy disk zap ceph0:sda3 ceph1:sda3 ceph2:sda3 ceph-deploy osd create ceph0:sda3 ceph1:sda3 ceph2:sda3 I was referring to the entire disk in my
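The sequence John describes, spelled out one command per line (hostnames and the sda3 partition are taken from his example; note that disk zap destroys anything on the target partition):

    sudo umount /dev/sda3                                     # on each node
    ceph-deploy disk zap ceph0:sda3 ceph1:sda3 ceph2:sda3     # run from the admin node
    ceph-deploy osd create ceph0:sda3 ceph1:sda3 ceph2:sda3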

Re: [ceph-users] ceph-deploy

2013-06-03 Thread John Wilkins
Sorry... hit send inadvertently... http://ceph.com/docs/master/start/quick-ceph-deploy/#multiple-osds-on-the-os-disk-demo-only On Mon, Jun 3, 2013 at 1:00 PM, John Wilkins john.wilk...@inktank.com wrote: Actually, as I said, I unmounted them first, zapped the disk, then used OSD create. For

Re: [ceph-users] replacing an OSD or crush map sensitivity

2013-06-03 Thread Chen, Xiaoxi
My 0.02: you really don't need to wait for health_ok between your recovery steps, just go ahead. Every time a new map is generated and broadcast, the old map and in-progress recovery will be cancelled. Sent from my iPhone. On 2013-6-2, 11:30, Nigel Williams nigel.d.willi...@gmail.com wrote: Could I have a

Re: [ceph-users] CentOS + qemu-kvm rbd support update

2013-06-03 Thread YIP Wai Peng
Hi Andrel, Have you tried the patched ones at https://objects.dreamhost.com/rpms/qemu/qemu-kvm-0.12.1.2-2.355.el6.2.x86_64.rpm and https://objects.dreamhost.com/rpms/qemu/qemu-img-0.12.1.2-2.355.el6.2.x86_64.rpm? I got the links off the IRC chat; I'm using them now. - WP On Sun, Jun 2, 2013 at
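A hedged sketch of how those packages could be installed on CentOS 6, using the URLs quoted above (yum accepts package URLs directly; dependency handling may vary on your system):

    yum install \
      https://objects.dreamhost.com/rpms/qemu/qemu-img-0.12.1.2-2.355.el6.2.x86_64.rpm \
      https://objects.dreamhost.com/rpms/qemu/qemu-kvm-0.12.1.2-2.355.el6.2.x86_64.rpm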

Re: [ceph-users] qemu-1.4.2 rbd-fixed ubuntu packages

2013-06-03 Thread w sun
Thanks for your clarification. I don't have much in-depth knowledge of libvirt although I believe openstack does use it for scheduling nova compute jobs (initiating VM instances) and supporting live-migration, both of which work properly in our grizzly environment. I will keep an eye on this

[ceph-users] PG active+clean+degraded, but not creating new replicas

2013-06-03 Thread YIP Wai Peng
Hi all, I'm running ceph on CentOS6 on 3 hosts, with 3 OSDs each (9 OSDs total). When I increased one of my pools' rep size from 2 to 3, 6 PGs got stuck in active+clean+degraded mode, but Ceph doesn't create new replicas. One of the problematic PGs has the following (snipped for brevity) {
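For reference, the step YIP describes, raising a pool's replica count, corresponds to the standard command below; the pool name is a placeholder:

    ceph osd pool set <poolname> size 3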

Re: [ceph-users] replacing an OSD or crush map sensitivity

2013-06-03 Thread Nigel Williams
On 4/06/2013 9:16 AM, Chen, Xiaoxi wrote: My 0.02: you really don't need to wait for health_ok between your recovery steps, just go ahead. Every time a new map is generated and broadcast, the old map and in-progress recovery will be cancelled. Thanks Xiaoxi, that is helpful to know. It seems to

Re: [ceph-users] PG active+clean+degraded, but not creating new replicas

2013-06-03 Thread Sage Weil
On Tue, 4 Jun 2013, YIP Wai Peng wrote: Hi all, I'm running ceph on CentOS6 on 3 hosts, with 3 OSDs each (9 OSDs total). When I increased one of my pools' rep size from 2 to 3, 6 PGs got stuck in active+clean+degraded mode, but Ceph doesn't create new replicas. My first guess is that you

Re: [ceph-users] replacing an OSD or crush map sensitivity

2013-06-03 Thread Sage Weil
On Tue, 4 Jun 2013, Nigel Williams wrote: On 4/06/2013 9:16 AM, Chen, Xiaoxi wrote: My 0.02: you really don't need to wait for health_ok between your recovery steps, just go ahead. Every time a new map is generated and broadcast, the old map and in-progress recovery will be cancelled. Thanks

Re: [ceph-users] PG active+clean+degraded, but not creating new replicas

2013-06-03 Thread YIP Wai Peng
Hi Sage, It is on optimal tunables already. However, I'm on kernel 2.6.32-358.6.2.el6.x86_64. Will the tunables take effect or do I have to upgrade to something newer? - WP On Tue, Jun 4, 2013 at 11:58 AM, Sage Weil s...@inktank.com wrote: On Tue, 4 Jun 2013, YIP Wai Peng wrote: Hi all,

Re: [ceph-users] replacing an OSD or crush map sensitivity

2013-06-03 Thread Nigel Williams
On Tue, Jun 4, 2013 at 1:59 PM, Sage Weil s...@inktank.com wrote: On Tue, 4 Jun 2013, Nigel Williams wrote: Something else I noticed: ... Does the monitor data directory share a disk with an OSD? If so, that makes sense: compaction freed enough space to drop below the threshold... Of

Re: [ceph-users] PG active+clean+degraded, but not creating new replicas

2013-06-03 Thread Wolfgang Hennerbichler
On Mon, Jun 03, 2013 at 08:58:00PM -0700, Sage Weil wrote: My first guess is that you do not have the newer crush tunables set and some placements are not quite right. If you are prepared for some data migration, and are not using an older kernel client, try ceph osd crush tunables
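The command Sage is pointing at, as a sketch: switching profiles can trigger data migration, and older kernel clients may not understand the newer tunables, which is why the kernel version comes up later in the thread.

    ceph osd crush tunables optimal   # switch to the newer tunable profile
    ceph osd crush tunables legacy    # revert to the old behaviour if needed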

Re: [ceph-users] PG active+clean+degraded, but not creating new replicas

2013-06-03 Thread YIP Wai Peng
Sorry, to set things in context, I had some other problems last weekend. Setting it to optimal tunables helped (although I am on the older kernel). Since it worked, I was inclined to believe that the tunables do work on the older kernel. That being said, I will upgrade the kernel to see if this

Re: [ceph-users] PG active+clean+degraded, but not creating new replicas

2013-06-03 Thread Sage Weil
On Tue, 4 Jun 2013, Wolfgang Hennerbichler wrote: On Mon, Jun 03, 2013 at 08:58:00PM -0700, Sage Weil wrote: My first guess is that you do not have the newer crush tunables set and some placements are not quite right. If you are prepared for some data migration, and are not using an

Re: [ceph-users] PG active+clean+degraded, but not creating new replicas

2013-06-03 Thread Sage Weil
On Tue, 4 Jun 2013, YIP Wai Peng wrote: Sorry, to set things in context, I had some other problems last weekend. Setting it to optimal tunables helped (although I am on the older kernel). Since it worked, I was inclined to believe that the tunables do work on the older kernel. That being

Re: [ceph-users] PG active+clean+degraded, but not creating new replicas

2013-06-03 Thread YIP Wai Peng
Hi Sage, Thanks, I noticed after re-reading the documentation. I realized that osd.8 was not in host3. After adding osd.8 to host3, the PGs are now active+remapped. # ceph pg 3.45 query { state: active+remapped, epoch: 1374, up: [ 4, 8], acting: [ 4, 8,
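For reference, a hedged sketch of the kind of CRUSH change described here; the weight and bucket names are placeholders, not values taken from the thread:

    ceph osd crush set osd.8 1.0 root=default host=host3   # place osd.8 under the host3 bucket
    ceph pg 3.45 query                                      # re-check the PG afterwards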