Re: osd not in tree

2012-11-16 Thread Drunkard Zhang
2012/11/16 Josh Durgin josh.dur...@inktank.com: On 11/15/2012 11:21 PM, Drunkard Zhang wrote: I installed mon x1, mds x1 and osd x11 in one host, then add some osd from other hosts, But they are not in osd tree, also not usable, how can I fix this? The crush command I used: ceph osd crush

Re: ceph-osd cpu usage

2012-11-16 Thread Alexandre DERUMIER
Any other ideas how to reduce ceph-osd while doing randwrite? Randread gives me with 3 VMs: 60.000 iops Randwrite gives me with 3 VMs: 25.000 iops Great to see that reads scale! For randwrite, what is the bottleneck now with filestore xattr use omap = true? Still CPU? - Mail

Re: ceph-osd cpu usage

2012-11-16 Thread Stefan Priebe - Profihost AG
On 16.11.2012 09:41, Alexandre DERUMIER wrote: Any other ideas how to reduce ceph-osd while doing randwrite? Randread gives me with 3 VMs: 60.000 iops Randwrite gives me with 3 VMs: 25.000 iops Great to see that reads scale! Yes that works fine. For randwrite, what is the bottleneck now

rbd tool changed format? (breaks compatibility)

2012-11-16 Thread Constantinos Venetsanopoulos
Hello ceph team, As you may already know, our team in GRNET is building a complete open source cloud platform called Synnefo [1], which already powers our production public cloud service ~okeanos [2]. Synnefo is using Google Ganeti for the low level VM management part [3]. As of Jan 2012, we

[PATCH] rbd: do not allow remove of mounted-on image

2012-11-16 Thread Alex Elder
There is no check in rbd_remove() to see if anybody holds open the image being removed. That's not cool. Add a simple open count that goes up and down with opens and closes (releases) of the device, and don't allow an rbd image to be removed if the count is non-zero. Both functions are protected
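The open-count idea described in this patch can be modelled outside the kernel. The following is only a simplified user-space sketch in Python, not the rbd driver code: the class and method names are hypothetical, and a `threading.Lock` stands in for whatever lock protects the real functions.

```python
import errno
import threading


class ImageDevice:
    """Toy model of a device that refuses removal while it is held open."""

    def __init__(self, name):
        self.name = name
        self.open_count = 0      # goes up on open, down on release
        self.removed = False
        self._lock = threading.Lock()  # stand-in for the driver's lock (assumption)

    def open(self):
        with self._lock:
            if self.removed:
                return -errno.ENOENT
            self.open_count += 1
            return 0

    def release(self):
        with self._lock:
            if self.open_count > 0:
                self.open_count -= 1

    def remove(self):
        with self._lock:
            # Refuse removal while anybody still holds the device open.
            if self.open_count > 0:
                return -errno.EBUSY
            self.removed = True
            return 0
```

In this model, `remove()` returns `-errno.EBUSY` until every `open()` has been matched by a `release()`. The choice of error codes here is an assumption made for illustration.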

[PATCH] rbd: get rid of rbd_{get,put}_dev()

2012-11-16 Thread Alex Elder
The functions rbd_get_dev() and rbd_put_dev() are trivial wrappers that add no value, and their existence suggests they may do more than what they do. Get rid of them. Signed-off-by: Alex Elder el...@inktank.com --- drivers/block/rbd.c | 14 ++ 1 file changed, 2 insertions(+), 12

Re: osd not in tree

2012-11-16 Thread Sage Weil
On Fri, 16 Nov 2012, Drunkard Zhang wrote: 2012/11/16 Josh Durgin josh.dur...@inktank.com: On 11/15/2012 11:21 PM, Drunkard Zhang wrote: I installed mon x1, mds x1 and osd x11 in one host, then add some osd from other hosts, But they are not in osd tree, also not usable, how can I fix

OSD and MON memory usage

2012-11-16 Thread Cláudio Martins
Hi, We're testing ceph using a recent build from the 'next' branch (commit b40387d) and we've run into some interesting problems related to memory usage. The setup consists of 64 OSDs (4 boxes, each with 16 disks, most of them 2TB, some 1.5TB, XFS filesystems, Debian Wheezy). After the

osd server hardware requirements

2012-11-16 Thread Snider, Tim
I'm starting to look at Ceph and was wondering if anyone had a link to more specific OSD server guidelines WRT processor speed and memory requirements. My candidate server has an Intel 2.13 GHz quad-core processor and supports up to 24 GB of memory. I'm wondering about the approximate

Re: rbd map command hangs for 15 minutes during system start up

2012-11-16 Thread Nick Bartos
Turns out we're having the 'rbd map' hang on startup again, after we started using the wip-3.5 patch set. How critical is the libceph_protect_ceph_con_open_with_mutex commit? That's the one I removed before which seemed to get rid of the problem (although I'm not completely sure if it completely

Re: OSD and MON memory usage

2012-11-16 Thread Joao Eduardo Luis
On 11/16/2012 05:24 PM, Cláudio Martins wrote: As for the monitor daemon on this cluster (running on a dedicated machine), it is currently using 3.2GB of memory, and it got to that point again in a matter of minutes after being restarted. Would it be good if we tested with the changes from

Re: rbd map command hangs for 15 minutes during system start up

2012-11-16 Thread Sage Weil
I just realized I was mixing up this thread with the other deadlock thread. On Fri, 16 Nov 2012, Nick Bartos wrote: Turns out we're having the 'rbd map' hang on startup again, after we started using the wip-3.5 patch set. How critical is the libceph_protect_ceph_con_open_with_mutex commit?

Question about building on SLES 11SP2

2012-11-16 Thread Gary Lowell
Hi - We are setting up an in-house build machine on SLES 11SP2. I've run into a couple issues compiling the latest ceph release. I suspect the root problem is that we need more up to date Boost libraries. The latest I can find for SLES are version 1.36. So I am wondering how other folks

Re: osd server hardware requirements

2012-11-16 Thread Samuel Just
We've heard that 1 GHz + 1 GB or so per OSD is sufficient. So in your case, around 8 OSDs. It's likely that you could go even a bit further, since memory is usually more constrained than CPU. Note that memory and processor use during recovery can be considerably higher than during normal operation,
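The rule of thumb above can be worked through with the hardware from the original question in this thread (a quad-core 2.13 GHz processor and up to 24 GB of memory). This is a minimal sketch; the function name and parameters are hypothetical, and it ignores the extra headroom needed during recovery.

```python
def estimate_max_osds(cores, ghz_per_core, ram_gb,
                      ghz_per_osd=1.0, gb_per_osd=1.0):
    """Estimate OSDs per host from the ~1 GHz + ~1 GB per OSD rule of thumb."""
    cpu_limit = int(cores * ghz_per_core / ghz_per_osd)  # OSDs the CPU can carry
    mem_limit = int(ram_gb / gb_per_osd)                 # OSDs the memory can carry
    return min(cpu_limit, mem_limit)


# 4 cores x 2.13 GHz = 8.52 GHz total, so CPU is the limit here, not the 24 GB.
print(estimate_max_osds(cores=4, ghz_per_core=2.13, ram_gb=24))  # prints 8
```

The CPU budget, not memory, is what yields the "around 8" figure for this server.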

Re: rbd map command hangs for 15 minutes during system start up

2012-11-16 Thread Nick Bartos
How far off do the clocks need to be before there is a problem? It would seem to be hard to ensure a very large cluster has all of its nodes synchronized within 50ms (which seems to be the default for mon clock drift allowed). Does the mon clock drift allowed parameter change anything other

Re: rbd map command hangs for 15 minutes during system start up

2012-11-16 Thread Nick Bartos
Should I be lowering the clock drift allowed, or the lease interval to help reproduce it? On Fri, Nov 16, 2012 at 2:13 PM, Sage Weil s...@inktank.com wrote: You can safely set the clock drift allowed as high as 500ms. The real limitation is that it needs to be well under the lease interval,

Re: rbd map command hangs for 15 minutes during system start up

2012-11-16 Thread Sage Weil
On Fri, 16 Nov 2012, Nick Bartos wrote: Should I be lowering the clock drift allowed, or the lease interval to help reproduce it? clock drift allowed. On Fri, Nov 16, 2012 at 2:13 PM, Sage Weil s...@inktank.com wrote: You can safely set the clock drift allowed as high as 500ms. The

Re: rbd map command hangs for 15 minutes during system start up

2012-11-16 Thread Gregory Farnum
To be clear, the monitor cluster needs to be within this clock drift — the rest of the Ceph cluster can be off by as much as you care to. (Well, there's also a limit imposed by cephx authorization which can keep nodes out of the cluster, but that drift allowance is measured in units of hours.)
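As a rough illustration of the drift limits discussed in this thread (50 ms default, up to 500 ms said to be safe), the check below flags monitors whose clock offset exceeds an allowance. It is a sketch under assumptions: the function name is hypothetical, the offsets are taken relative to some reference clock, and it does not reproduce how the monitors themselves measure skew.

```python
def monitors_outside_drift(offsets_s, allowed_s=0.05):
    """Return the names of monitors whose clock offset exceeds the allowance.

    offsets_s maps a monitor name to its offset in seconds relative to a
    reference clock (an assumption made for this sketch).
    """
    return sorted(name for name, offset in offsets_s.items()
                  if abs(offset) > allowed_s)


offsets = {"mon.a": 0.0, "mon.b": 0.02, "mon.c": -0.12}
print(monitors_outside_drift(offsets))                 # default 50 ms allowance
print(monitors_outside_drift(offsets, allowed_s=0.5))  # relaxed to 500 ms
```

Only the monitor hosts need to appear in `offsets_s`, since the drift allowance discussed here applies to the monitor cluster rather than to the other nodes.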

Re: [PATCH] rbd: do not allow remove of mounted-on image

2012-11-16 Thread Josh Durgin
On 11/16/2012 07:43 AM, Alex Elder wrote: There is no check in rbd_remove() to see if anybody holds open the image being removed. That's not cool. Add a simple open count that goes up and down with opens and closes (releases) of the device, and don't allow an rbd image to be removed if the

Re: RBD fio Performance concerns

2012-11-16 Thread Mark Kampe
On 11/15/2012 12:23 PM, Sébastien Han wrote: First of all, I would like to thank you for this well explained, structured and clear answer. I guess I got better IOPS thanks to the 10K disks. 10K RPM would bring your per-drive throughput (for 4K random writes) up to 142 IOPS and your aggregate

Re: osd not in tree

2012-11-16 Thread Drunkard Zhang
2012/11/17 Sage Weil s...@inktank.com: On Fri, 16 Nov 2012, Drunkard Zhang wrote: 2012/11/16 Josh Durgin josh.dur...@inktank.com: On 11/15/2012 11:21 PM, Drunkard Zhang wrote: I installed mon x1, mds x1 and osd x11 in one host, then add some osd from other hosts, But they are not in osd

Re: osd not in tree

2012-11-16 Thread Sage Weil
On Sat, 17 Nov 2012, Drunkard Zhang wrote: 2012/11/17 Sage Weil s...@inktank.com: On Fri, 16 Nov 2012, Drunkard Zhang wrote: 2012/11/16 Josh Durgin josh.dur...@inktank.com: On 11/15/2012 11:21 PM, Drunkard Zhang wrote: I installed mon x1, mds x1 and osd x11 in one host, then add some

Re: osd not in tree

2012-11-16 Thread Drunkard Zhang
2012/11/17 Sage Weil s...@inktank.com: On Sat, 17 Nov 2012, Drunkard Zhang wrote: 2012/11/17 Sage Weil s...@inktank.com: On Fri, 16 Nov 2012, Drunkard Zhang wrote: 2012/11/16 Josh Durgin josh.dur...@inktank.com: On 11/15/2012 11:21 PM, Drunkard Zhang wrote: I installed mon x1, mds

Re: osd not in tree

2012-11-16 Thread Sage Weil
Hi, Okay, it looks like something in the past added the host entry but for some reason didn't give it a parent. Did you previously modify the crush map by hand, or did you only manipulate it via the 'ceph osd crush ...' commands? Unfortunately the fix is to edit it manually. ceph osd getcrushmap -o

Re: OSD network failure

2012-11-16 Thread Josh Durgin
On 11/15/2012 01:51 AM, Gandalf Corvotempesta wrote: 2012/11/15 Josh Durgin josh.dur...@inktank.com: So basically you'd only need a single nic per storage node. Multiple can be useful to separate frontend and backend traffic, but ceph is designed to maintain strong consistency when failures

Re: osd not in tree

2012-11-16 Thread Sage Weil
On Sat, 17 Nov 2012, Drunkard Zhang wrote: 2012/11/17 Sage Weil s...@inktank.com: Hi, Okay, it looks like something in the past added the host entry but for some reason didn't give it a parent. Did you previously modify the crush map by hand, or did you only manipulate it via the 'ceph osd

Re: osd not in tree

2012-11-16 Thread Drunkard Zhang
2012/11/17 Sage Weil s...@inktank.com: On Sat, 17 Nov 2012, Drunkard Zhang wrote: 2012/11/17 Sage Weil s...@inktank.com: Hi, Okay, it looks like something in the past added the host entry but for some reason didn't give it a parent. Did you previously modify the crush map by hand, or did

Re: [PATCH] rbd: do not allow remove of mounted-on image

2012-11-16 Thread Alex Elder
On 11/16/2012 04:27 PM, Josh Durgin wrote: On 11/16/2012 07:43 AM, Alex Elder wrote: There is no check in rbd_remove() to see if anybody holds open the image being removed. That's not cool. Add a simple open count that goes up and down with opens and closes (releases) of the device, and