Re: Possible memory leak in mon?

2012-05-18 Thread Vladimir Bashkirtsev
On 16/05/12 02:43, Gregory Farnum wrote: On Sun, May 6, 2012 at 5:53 PM, Vladimir Bashkirtsev wrote: On 03/05/12 16:23, Greg Farnum wrote: On Wednesday, May 2, 2012 at 11:24 PM, Vladimir Bashkirtsev wrote: Greg, Apologies for multiple emails: my mail server is backed by ceph now and it

Re: PGs stuck in creating state

2012-05-22 Thread Vladimir Bashkirtsev
On 08/05/12 01:26, Sage Weil wrote: On Mon, 7 May 2012, Vladimir Bashkirtsev wrote: On 20/04/12 14:41, Sage Weil wrote: On Fri, 20 Apr 2012, Vladimir Bashkirtsev wrote: Dear devs, First of all I would like to bow my head at your great effort! Even if ceph did not reach prime time status yet

Stuck OSD phantom

2012-06-03 Thread Vladimir Bashkirtsev
Dear devs, While playing around with ceph with six OSDs I decided to retire two OSDs simultaneously (I do triplication, so ceph should withstand such damage) to see how ceph would cope with it. I did it in different ways, trying to get ceph off the rails, and it looks like I have managed it. :) Fir

Re: Stuck OSD phantom

2012-06-03 Thread Vladimir Bashkirtsev
On 04/06/12 13:38, Sage Weil wrote: Hi Vladimir, On Mon, 4 Jun 2012, Vladimir Bashkirtsev wrote: Dear devs, While playing around with ceph with six OSDs I decided to retire two OSDs simultaneously (I do triplication so ceph should withstand such damage) to see how ceph will cope with it. I

Ceph and KVM live migration

2012-06-30 Thread Vladimir Bashkirtsev
Dear all, Currently I am testing KVMs running on ceph, and particularly the recent cache feature. Performance is of course vastly improved, but I still have occasional KVM hold-ups - not sure which is to blame, ceph or KVM. But I will deal with it later. Right now I've got myself a question which I

Re: Ceph and KVM live migration

2012-06-30 Thread Vladimir Bashkirtsev
On 01/07/12 10:47, Josh Durgin wrote: On 06/30/2012 05:42 PM, Vladimir Bashkirtsev wrote: Dear all, Currently I am testing KVMs running on ceph, and particularly the recent cache feature. Performance is of course vastly improved, but I still have occasional KVM hold-ups - not sure which is to blame

Re: Ceph and KVM live migration

2012-06-30 Thread Vladimir Bashkirtsev
On 01/07/12 11:59, Josh Durgin wrote: On 06/30/2012 07:15 PM, Vladimir Bashkirtsev wrote: On 01/07/12 10:47, Josh Durgin wrote: On 06/30/2012 05:42 PM, Vladimir Bashkirtsev wrote: Dear all, Currently I testing KVMs running on ceph and particularly testing recent cache feature. Performance

librbd: error finding header

2012-07-08 Thread Vladimir Bashkirtsev
Hello, I just hit this error: error opening image sip.logics.net.au: (2) No such file or directory 2012-07-09 15:03:59.935835 7ffbe0673780 -1 librbd: error finding header: (2) No such file or directory Googled around and found that Oliver Francke had similar issue back in March. Read your re

Re: librbd: error finding header

2012-07-09 Thread Vladimir Bashkirtsev
belong? On 07/08/2012 10:42 PM, Vladimir Bashkirtsev wrote: Hello, I just hit this error: error opening image sip.logics.net.au: (2) No such file or directory 2012-07-09 15:03:59.935835 7ffbe0673780 -1 librbd: error finding header: (2) No such file or directory Googled around and found that

Re: librbd: error finding header

2012-07-09 Thread Vladimir Bashkirtsev
rned that one of them may not exist. Right on the ball: the .rbd for the image concerned just does not exist. So how can we recover from this? And why did it disappear in the first place? (I guess the latter may be related to some sort of bug.) On 07/09/2012 03:29 AM, Vladimir Bashkirtsev wrote: On 09/07

Re: librbd: error finding header

2012-07-09 Thread Vladimir Bashkirtsev
On 10/07/12 02:00, Florian Haas wrote: On 07/09/12 12:29, Vladimir Bashkirtsev wrote: On 09/07/12 18:33, Dan Mick wrote: Vladimir: you can do some investigation with the rados command. What does rados -p rbd ls show you? Rather long list of: rb.0.11.2786 rb.0.d.54a2 rb

Re: librbd: error finding header

2012-07-10 Thread Vladimir Bashkirtsev
On 10/07/12 14:32, Dan Mick wrote: On 07/09/2012 08:27 PM, Vladimir Bashkirtsev wrote: On 10/07/12 03:17, Dan Mick wrote: Well, it's not so much those; those are the objects that hold data blocks. You're more interested in the objects whose names end in '.rbd'. These ar
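The distinction drawn in this exchange (data-block objects versus header objects whose names end in '.rbd') can be applied mechanically to the output of `rados -p rbd ls`. A minimal sketch, assuming the listing has already been captured as a list of strings; the function name and the `example-image.rbd` entry are hypothetical, and the two `rb.0...` names are copied as they appear in the snippets above:

```python
def header_objects(object_names):
    """Return only the names that end in '.rbd' (image header objects)."""
    return [name for name in object_names if name.endswith(".rbd")]


# Two data-block object names as shown in the thread, plus one hypothetical header.
listing = ["rb.0.11.2786", "rb.0.d.54a2", "example-image.rbd"]
headers = header_objects(listing)
print(headers)
```

An image whose header name is missing from the filtered list would be a candidate for the "error finding header" failure discussed in this thread.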

Re: librbd: error finding header

2012-07-11 Thread Vladimir Bashkirtsev
On 11/07/12 05:38, Josh Durgin wrote: On 07/10/2012 02:25 AM, Vladimir Bashkirtsev wrote: On 10/07/12 14:32, Dan Mick wrote: On 07/09/2012 08:27 PM, Vladimir Bashkirtsev wrote: On 10/07/12 03:17, Dan Mick wrote: Well, it's not so much those; those are the objects that hold data b

Re: librbd: error finding header

2012-07-13 Thread Vladimir Bashkirtsev
On 13/07/12 01:30, Tommi Virtanen wrote: On Wed, Jul 11, 2012 at 9:41 PM, Josh Durgin wrote: You're right about the object name - you can get its offset in the image that way. Since rbd is thin-provisioned, however, the highest index object might not be the highest possible object. When you fir

Poor read performance in KVM

2012-07-15 Thread Vladimir Bashkirtsev
Hello, Lately I was trying to get KVM to perform well on RBD. But it still appears elusive. [root@alpha etc]# rados -p rbd bench 120 seq -t 8 Total time run: 16.873277 Total reads made: 302 Read size: 4194304 Bandwidth (MB/sec): 71.592 Average Latency: 0.437984
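The bandwidth figure in the benchmark output quoted above can be cross-checked from the other fields. A minimal sketch, assuming the tool counts 1 MB as 2**20 bytes (an assumption, though it is consistent with the quoted number); the function name is illustrative only:

```python
def bench_bandwidth_mb_s(reads_made: int, read_size_bytes: int, seconds: float) -> float:
    """Throughput in MB/s, taking 1 MB = 2**20 bytes (assumption)."""
    total_mb = reads_made * read_size_bytes / 2**20
    return total_mb / seconds


# Values copied from the rados bench output in the message above.
bandwidth = bench_bandwidth_mb_s(302, 4194304, 16.873277)
print(f"{bandwidth:.3f} MB/s")
```

The result agrees with the reported 71.592 MB/sec to within rounding, so the reported bandwidth is simply total data read divided by total run time.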

Re: Poor read performance in KVM

2012-07-17 Thread Vladimir Bashkirtsev
On 16/07/12 15:46, Josh Durgin wrote: On 07/15/2012 06:13 AM, Vladimir Bashkirtsev wrote: Hello, Lately I was trying to get KVM to perform well on RBD. But it still appears elusive. [root@alpha etc]# rados -p rbd bench 120 seq -t 8 Total time run: 16.873277 Total reads made: 302

Re: Poor read performance in KVM

2012-07-19 Thread Vladimir Bashkirtsev
It's actually the sum of the latencies of all 3971 asynchronous reads, in seconds, so the average latency was ~200ms, which is still pretty high. OK. I realized it later that day when I noticed that the sum only goes up. So the sum is the number of seconds spent, and divided by avgcount gives an i
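The arithmetic described here (a cumulative latency sum divided by the operation count) is sketched below. Only the count of 3971 comes from the thread; the sum value is hypothetical, chosen to reproduce the ~200ms figure, and the second helper assumes that both counters are cumulative, so an average over an interval has to be taken from the differences between two samples:

```python
def average_latency_s(latency_sum_s: float, avgcount: int) -> float:
    """Average latency in seconds over the counter's whole lifetime."""
    if avgcount == 0:
        return 0.0
    return latency_sum_s / avgcount


def interval_latency_s(earlier: dict, later: dict) -> float:
    """Average latency between two samples of a cumulative counter."""
    ops = later["avgcount"] - earlier["avgcount"]
    if ops == 0:
        return 0.0
    return (later["sum"] - earlier["sum"]) / ops


# avgcount is from the thread; sum is a hypothetical value for illustration.
sample = {"avgcount": 3971, "sum": 794.2}
avg = average_latency_s(sample["sum"], sample["avgcount"])
print(f"{avg * 1000:.0f} ms")
```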

Re: Poor read performance in KVM

2012-07-19 Thread Vladimir Bashkirtsev
Try to determine how much of the 200ms avg latency comes from osds vs the qemu block driver. Looks like osd.0 performs with low latency but osd.1 latency is way too high, and on average it appears as 200ms. The osd is backed by btrfs over LVM2. Maybe the issue lies in the backing fs selection? All four

Re: Poor read performance in KVM

2012-07-19 Thread Vladimir Bashkirtsev
On 20/07/2012 1:22 AM, Tommi Virtanen wrote: On Thu, Jul 19, 2012 at 5:19 AM, Vladimir Bashkirtsev wrote: Look like that osd.0 performs with low latency but osd.1 latency is way too high and on average it appears as 200ms. osd is backed by btrfs over LVM2. May be issue lie in backing fs

Re: Poor read performance in KVM

2012-07-19 Thread Vladimir Bashkirtsev
We are seeing degradation at 64k node/leaf sizes as well. So far the degradation is most obvious with small writes. It affects XFS as well, though not as severely. We are vigorously looking into it. :) Just confirming that one of our clients has run a fair amount (on a gigabyte scale) of m

Re: Poor read performance in KVM

2012-07-19 Thread Vladimir Bashkirtsev
What node/leaf size are you using on your btrfs volume? Default 4K.

Re: Poor read performance in KVM

2012-07-19 Thread Vladimir Bashkirtsev
Yes, they can hold up reads to the same object. Depending on where they're stuck, they may be blocking other requests as well if they're e.g. taking up all the filestore threads. Waiting for subops means they're waiting for replicas to acknowledge the write and commit it to disk. The real cause f

Re: Poor read performance in KVM

2012-07-20 Thread Vladimir Bashkirtsev
Yes, they can hold up reads to the same object. Depending on where they're stuck, they may be blocking other requests as well if they're e.g. taking up all the filestore threads. Waiting for subops means they're waiting for replicas to acknowledge the write and commit it to disk. The real cause

Re: Poor read performance in KVM

2012-07-20 Thread Vladimir Bashkirtsev
On 21/07/2012 2:12 AM, Tommi Virtanen wrote: On Fri, Jul 20, 2012 at 9:17 AM, Vladimir Bashkirtsev wrote: not running. So I ended up rebooting hosts and that's where the fun began: btrfs failed to umount, and on boot up it spat out "btrfs: free space inode generation (0) did not match

Increasing number of PGs

2012-07-28 Thread Vladimir Bashkirtsev
Hello, I am working on optimization of ceph performance: CPU load vs OSD data load. Right now I have 576 PGs in total. Three pools: metadata, data, rbd. Each pool has 192 PGs. data is not used heavily, rbd is in heavy use. In total there are 6 OSDs in the cluster. I have read a recommendation about 100 PGs p
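The PG counts in this message can be put side by side with the OSD count. A rough sketch using only the numbers stated above; the replica count of 3 is an assumption taken from the "triplication" mentioned in the "Stuck OSD phantom" thread, and the recommendation itself is cut off in the snippet, so it is not reproduced here:

```python
# PG counts per pool and OSD count, as stated in the message.
pools = {"metadata": 192, "data": 192, "rbd": 192}
osd_count = 6

# Assumption: three replicas of each PG, per the author's "triplication".
replicas = 3

total_pgs = sum(pools.values())
pgs_per_osd = total_pgs / osd_count
pg_copies_per_osd = total_pgs * replicas / osd_count

print(total_pgs, pgs_per_osd, pg_copies_per_osd)
```

Whether the per-OSD figure should count each PG once or once per replica depends on how the recommendation is worded, which the truncated snippet does not show.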

Re: Poor read performance in KVM

2012-07-29 Thread Vladimir Bashkirtsev
On 21/07/12 02:12, Tommi Virtanen wrote: But it leaves me with very final question: should we rely on btrfs at this point given it is having such major faults? What if I will use well tested by time ext4? You might want to try xfs. We hear/see problems with all three, but xfs currently seems to h

Re: Increasing number of PGs

2012-07-30 Thread Vladimir Bashkirtsev
On 31/07/12 01:46, Josh Durgin wrote: So here two questions: 1. Should I increase number of PGs in rbd pool or better leave it where it is? It would be better to increase it from a data balancing point of view, but it's not clear that it would help performance. I would care about data balanc

Crash of almost full ceph

2012-08-04 Thread Vladimir Bashkirtsev
Hello, Yesterday I finally managed to screw up my installation of ceph! :) My ceph was at 80% capacity. I rebooted one of the OSDs remotely and managed to screw up fstab. The host failed to come up, and while I was driving from home to my office ceph took recovery action. But it meant t

Re: Crash of almost full ceph

2012-08-06 Thread Vladimir Bashkirtsev
On 07/08/12 01:55, Gregory Farnum wrote: There is not yet any such feature, no — dealing with full systems is notoriously hard and we haven't come up with a great solution yet. One thing you can do is experiment with the "mon_osd_min_in_ratio" parameter, which prevents the monitors from marking

Is there any way to throttle recovery?

2012-09-02 Thread Vladimir Bashkirtsev
Hello devs, I have noticed that while ceph is recovering from an OSD failure, RBD performance becomes dismal. Judging by the load on the network, ceph attempts to shuffle around a huge amount of data, leaving virtually no bandwidth for normal operations. Is there any way to throttle the recovery process? Network th

RBD image max simultaneous mounts

2012-09-16 Thread Vladimir Bashkirtsev
Dear devs, We just had an incident where two instances of the same VM started up on different hosts and mounted the same RBD image. The image had an ext4 partition and it quickly became corrupted. Therefore it would be good to have some setting in RBD info like max_simultaneous_mounts which will dict

PGs stuck in creating state

2012-04-19 Thread Vladimir Bashkirtsev
Dear devs, First of all I would like to bow my head at your great effort! Even if ceph has not reached prime time status yet, it is already extremely powerful and fairly stable, to the point that we have deployed it in a live environment (still backing up, of course). I have played with ceph extensively

Re: PGs stuck in creating state

2012-04-20 Thread Vladimir Bashkirtsev
On 20/04/12 14:41, Sage Weil wrote: On Fri, 20 Apr 2012, Vladimir Bashkirtsev wrote: Dear devs, First of all I would like to bow my head at your great effort! Even if ceph did not reach prime time status yet it is already extremely powerful and fairly stable to the point we have deployed

OSD weighting

2012-04-20 Thread Vladimir Bashkirtsev
Dear devs, Playing around with ceph and gradually moving it from a toy thing into production I wanted ceph to actually make its run for the money (so to speak). I have assembled a number of OSDs built on really different hardware: starting from an old P4 with 512MB of RAM and ending up w

Possible memory leak in mon?

2012-05-02 Thread Vladimir Bashkirtsev
Dear devs, I have three mons, and two of them suddenly consumed around 4G of RAM while the third one happily lived with 150M. This immediately prompts a few questions: 1. What is the expected memory use of a mon? I believed that a mon merely directs clients to the relevant OSDs and should not consume a lot of

Re: Possible memory leak in mon?

2012-05-02 Thread Vladimir Bashkirtsev
monitored by nagios and so this record appears every 5 minutes), monitors periodically call for election (different periods between 1 to 15 minutes as it looks). That's it. Regards, Vladimir On 03/05/12 09:52, Greg Farnum wrote: On Wednesday, May 2, 2012 at 3:28 PM, Vladimir Bashkirtsev wrote:

Re: Possible memory leak in mon?

2012-05-06 Thread Vladimir Bashkirtsev
On 03/05/12 16:23, Greg Farnum wrote: On Wednesday, May 2, 2012 at 11:24 PM, Vladimir Bashkirtsev wrote: Greg, Apologies for multiple emails: my mail server is backed by ceph now and it struggled this morning (separate issue). So my mail server reported back to my mailer that sending of email

Re: PGs stuck in creating state

2012-05-06 Thread Vladimir Bashkirtsev
On 20/04/12 14:41, Sage Weil wrote: On Fri, 20 Apr 2012, Vladimir Bashkirtsev wrote: Dear devs, First of all I would like to bow my head at your great effort! Even if ceph did not reach prime time status yet it is already extremely powerful and fairly stable to the point we have deployed it in