2012/11/16 Josh Durgin josh.dur...@inktank.com:
On 11/15/2012 11:21 PM, Drunkard Zhang wrote:
I installed mon x1, mds x1 and osd x11 on one host, then added some OSDs
from other hosts, but they are not in the osd tree and are not usable. How
can I fix this?
The crush command I used:
ceph osd crush
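For reference, the usual way to place a new OSD into the CRUSH hierarchy
so that it shows up in 'ceph osd tree' looks roughly like this (the id,
weight and bucket names below are only illustrative, not taken from the
original report):

    ceph osd crush set 11 osd.11 1.0 pool=default host=newhost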
Any other ideas how to reduce ceph-osd CPU usage while doing randwrite?
Randread gives me with 3 VMs: 60,000 IOPS
Randwrite gives me with 3 VMs: 25,000 IOPS
Great to see that reads scale!
For randwrite, what is the bottleneck now with filestore xattr use omap = true?
Still the CPU?
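A note for anyone following along: "filestore xattr use omap = true" is
set in the [osd] section of ceph.conf. A minimal sketch, assuming
otherwise default settings:

    [osd]
        filestore xattr use omap = true

It tells the OSD to keep object xattrs in its omap (leveldb) store
rather than in filesystem xattrs, which matters on filesystems with
tight xattr limits such as ext4.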
On 16.11.2012 09:41, Alexandre DERUMIER wrote:
Any other ideas how to reduce ceph-osd CPU usage while doing randwrite?
Randread gives me with 3 VMs: 60,000 IOPS
Randwrite gives me with 3 VMs: 25,000 IOPS
Great to see that reads scale!
Yes that works fine.
For randwrite, what is the bottleneck now
Hello ceph team,
As you may already know, our team at GRNET is building a complete open
source cloud platform called Synnefo [1], which already powers our
production public cloud service ~okeanos [2].
Synnefo is using Google Ganeti for the low level VM management part [3].
As of Jan 2012, we
There is no check in rbd_remove() to see if anybody holds open the
image being removed. That's not cool.
Add a simple open count that goes up and down with opens and closes
(releases) of the device, and don't allow an rbd image to be removed
if the count is non-zero. Both functions are protected
The functions rbd_get_dev() and rbd_put_dev() are trivial wrappers
that add no value, and their existence suggests they may do more
than they actually do.
Get rid of them.
Signed-off-by: Alex Elder el...@inktank.com
---
drivers/block/rbd.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
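To illustrate the idea, here is a standalone userspace sketch of the
open-count scheme described above. This is not the actual rbd.c change;
in particular, the real driver takes a lock around the check-and-remove
step, which is omitted here for brevity:

    /* open_count_sketch.c - refuse "removal" while a device is open */
    #include <stdatomic.h>
    #include <stdio.h>
    #include <errno.h>

    struct fake_dev {
            atomic_int open_count;  /* raised on open, dropped on release */
    };

    static void dev_open(struct fake_dev *d)
    {
            atomic_fetch_add(&d->open_count, 1);
    }

    static void dev_release(struct fake_dev *d)
    {
            atomic_fetch_sub(&d->open_count, 1);
    }

    /* Refuse removal while anybody still holds the device open. */
    static int dev_remove(struct fake_dev *d)
    {
            if (atomic_load(&d->open_count) != 0)
                    return -EBUSY;
            /* actual teardown would happen here */
            return 0;
    }

    int main(void)
    {
            static struct fake_dev d;       /* zero-initialized */

            dev_open(&d);
            printf("remove while open:    %d (expect %d)\n",
                   dev_remove(&d), -EBUSY);
            dev_release(&d);
            printf("remove after release: %d (expect 0)\n",
                   dev_remove(&d));
            return 0;
    }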
On Fri, 16 Nov 2012, Drunkard Zhang wrote:
2012/11/16 Josh Durgin josh.dur...@inktank.com:
On 11/15/2012 11:21 PM, Drunkard Zhang wrote:
I installed mon x1, mds x1 and osd x11 on one host, then added some OSDs
from other hosts, but they are not in the osd tree and are not usable. How
can I fix
Hi,
We're testing ceph using a recent build from the 'next' branch (commit
b40387d) and we've run into some interesting problems related to memory
usage.
The setup consists of 64 OSDs (4 boxes, each with 16 disks, most of
them 2TB, some 1.5TB, XFS filesystems, Debian Wheezy). After the
I'm starting to look at Ceph and was wondering if anyone had a link to more
specific OSD server guidelines WRT processor speed and memory requirements.
My candidate server has a 2.13 GHz quad-core Intel processor and
supports up to 24 GB of memory.
I'm wondering about the approximate
Turns out we're having the 'rbd map' hang on startup again, after we
started using the wip-3.5 patch set. How critical is the
libceph_protect_ceph_con_open_with_mutex commit? That's the one I
removed before which seemed to get rid of the problem (although I'm
not completely sure if it completely
On 11/16/2012 05:24 PM, Cláudio Martins wrote:
As for the monitor daemon on this cluster (running on a dedicated
machine), it is currently using 3.2GB of memory, and it got to that
point again in a matter of minutes after being restarted. Would it be
good if we tested with the changes from
I just realized I was mixing up this thread with the other deadlock
thread.
On Fri, 16 Nov 2012, Nick Bartos wrote:
Turns out we're having the 'rbd map' hang on startup again, after we
started using the wip-3.5 patch set. How critical is the
libceph_protect_ceph_con_open_with_mutex commit?
Hi -
We are setting up an in-house build machine on SLES 11SP2. I've run into a
couple of issues compiling the latest ceph release. I suspect the root
problem is that we need more up-to-date Boost libraries. The latest I can
find for SLES is version 1.36.
So I am wondering how other folks
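One approach that generally works when the distro Boost is too old is to
build a newer Boost from source and point the ceph configure run at it.
A sketch, with placeholder paths and no claim that this is the blessed
procedure (run from an unpacked Boost source tree):

    ./bootstrap.sh --prefix=/opt/boost
    ./b2 install
    # then, in the ceph source tree:
    CPPFLAGS=-I/opt/boost/include LDFLAGS=-L/opt/boost/lib ./configure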
We've heard that 1 GHz + 1 GB or so per OSD is sufficient. So in your
case, around 8 OSDs. It's likely that you could go even a bit
further, since memory is usually more constrained than CPU. Note that
memory and processor use during recovery can be considerably higher
than during normal operation,
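Working that rule of thumb through the numbers from the question (2.13
GHz quad-core, 24 GB of RAM):

    CPU:    4 cores x 2.13 GHz ~ 8.5 GHz -> ~8 OSDs
    memory: 24 GB / 1 GB per OSD         -> ~24 OSDs

so the CPU estimate, not memory, is what caps this box at around 8 OSDs.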
How far off do the clocks need to be before there is a problem? It
would seem to be hard to ensure a very large cluster has all of its
nodes synchronized within 50ms (which seems to be the default for mon
clock drift allowed). Does the mon clock drift allowed parameter
change anything other
Should I be lowering the clock drift allowed, or the lease interval to
help reproduce it?
On Fri, Nov 16, 2012 at 2:13 PM, Sage Weil s...@inktank.com wrote:
You can safely set the clock drift allowed as high as 500ms. The real
limitation is that it needs to be well under the lease interval,
On Fri, 16 Nov 2012, Nick Bartos wrote:
Should I be lowering the clock drift allowed, or the lease interval to
help reproduce it?
clock drift allowed.
On Fri, Nov 16, 2012 at 2:13 PM, Sage Weil s...@inktank.com wrote:
You can safely set the clock drift allowed as high as 500ms. The
To be clear, the monitor cluster needs to be within this clock drift —
the rest of the Ceph cluster can be off by as much as you care to.
(Well, there's also a limit imposed by cephx authorization which can
keep nodes out of the cluster, but that drift allowance is measured in
units of hours.)
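For anyone who wants to experiment with these knobs, they go in the
[mon] section of ceph.conf. The values below are only an example; both
are in seconds, and as Sage notes the drift allowance must stay well
under the lease interval:

    [mon]
        mon clock drift allowed = 0.5
        mon lease = 5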
On 11/16/2012 07:43 AM, Alex Elder wrote:
There is no check in rbd_remove() to see if anybody holds open the
image being removed. That's not cool.
Add a simple open count that goes up and down with opens and closes
(releases) of the device, and don't allow an rbd image to be removed
if the
On 11/15/2012 12:23 PM, Sébastien Han wrote:
First of all, I would like to thank you for this well explained,
structured and clear answer. I guess I got better IOPS thanks to the 10K disks.
10K RPM would bring your per-drive throughput (for 4K random writes)
up to 142 IOPS and your aggregate
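For reference, that 142 IOPS figure falls out of simple service-time
arithmetic (assuming a typical ~4 ms average seek for a 10K RPM drive,
which is an assumption on my part, not a number from this thread):

    rotational latency:  60 s / 10,000 RPM / 2 ~ 3 ms
    per-IO service time: 4 ms seek + 3 ms      ~ 7 ms
    IOPS:                1 / 0.007 s           ~ 142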
2012/11/17 Sage Weil s...@inktank.com:
On Fri, 16 Nov 2012, Drunkard Zhang wrote:
2012/11/16 Josh Durgin josh.dur...@inktank.com:
On 11/15/2012 11:21 PM, Drunkard Zhang wrote:
I installed mon x1, mds x1 and osd x11 on one host, then added some OSDs
from other hosts, but they are not in the osd
On Sat, 17 Nov 2012, Drunkard Zhang wrote:
2012/11/17 Sage Weil s...@inktank.com:
On Fri, 16 Nov 2012, Drunkard Zhang wrote:
2012/11/16 Josh Durgin josh.dur...@inktank.com:
On 11/15/2012 11:21 PM, Drunkard Zhang wrote:
I installed mon x1, mds x1 and osd x11 in one host, then add some
2012/11/17 Sage Weil s...@inktank.com:
On Sat, 17 Nov 2012, Drunkard Zhang wrote:
2012/11/17 Sage Weil s...@inktank.com:
On Fri, 16 Nov 2012, Drunkard Zhang wrote:
2012/11/16 Josh Durgin josh.dur...@inktank.com:
On 11/15/2012 11:21 PM, Drunkard Zhang wrote:
I installed mon x1, mds
Hi,
Okay, it looks like something in the past added the host entry but for some
reason didn't give it a parent. Did you previously modify the crush map
by hand, or did you only manipulate it via the 'ceph osd crush ...'
commands?
Unfortunately the fix is to edit it manually.
ceph osd getcrushmap -o
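The full round trip looks like this (the /tmp paths are just
placeholders; the edit step is where you give the orphaned host a
parent bucket):

    ceph osd getcrushmap -o /tmp/crushmap
    crushtool -d /tmp/crushmap -o /tmp/crushmap.txt
    # edit /tmp/crushmap.txt to put the host under a parent (e.g. the root)
    crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.new
    ceph osd setcrushmap -i /tmp/crushmap.new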
On 11/15/2012 01:51 AM, Gandalf Corvotempesta wrote:
2012/11/15 Josh Durgin josh.dur...@inktank.com:
So basically you'd only need a single NIC per storage node. Multiple NICs
can be useful to separate frontend and backend traffic, but ceph
is designed to maintain strong consistency when failures
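In ceph.conf terms, the frontend/backend split maps to the public and
cluster network options; the subnets here are placeholders:

    [global]
        public network  = 192.168.0.0/24
        cluster network = 10.0.0.0/24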
On Sat, 17 Nov 2012, Drunkard Zhang wrote:
2012/11/17 Sage Weil s...@inktank.com:
Hi,
Okay, it looks like something in the past added the host entry but for some
reason didn't give it a parent. Did you previously modify the crush map
by hand, or did you only manipulate it via the 'ceph osd
2012/11/17 Sage Weil s...@inktank.com:
On Sat, 17 Nov 2012, Drunkard Zhang wrote:
2012/11/17 Sage Weil s...@inktank.com:
Hi,
Okay, it looks like something in the past added the host entry but for some
reason didn't give it a parent. Did you previously modify the crush map
by hand, or did
On 11/16/2012 04:27 PM, Josh Durgin wrote:
On 11/16/2012 07:43 AM, Alex Elder wrote:
There is no check in rbd_remove() to see if anybody holds open the
image being removed. That's not cool.
Add a simple open count that goes up and down with opens and closes
(releases) of the device, and