Re: less cores more iops / speed

2012-11-07 Thread Stefan Priebe - Profihost AG
On 08.11.2012 at 06:54, Dietmar Maurer wrote: >> Why is the vhost net driver involved here at all? The KVM guest only uses ssh here. > > I thought you were testing things (rbd) which depend on KVM network speed? The KVM process uses librbd and both are running on the host, not in the guest. Stefan

Re: less cores more iops / speed

2012-11-07 Thread Stefan Priebe - Profihost AG
On 08.11.2012 at 06:49, Dietmar Maurer wrote: >>> I've noticed something really interesting. >>> >>> I get 5000 iops / VM for rand. 4k writes while assigning 4 cores on a >>> 2.5 GHz Xeon. >>> >>> When I move this VM to another kvm host with 3.6 GHz I get 8000 iops >>> (still 8 >>> cores) when

RE: less cores more iops / speed

2012-11-07 Thread Dietmar Maurer
> Why is the vhost net driver involved here at all? The KVM guest only uses ssh here. I thought you were testing things (rbd) which depend on KVM network speed?

Re: less cores more iops / speed

2012-11-07 Thread Stefan Priebe - Profihost AG
On 08.11.2012 at 06:42, Dietmar Maurer wrote: >> I've noticed something really interesting. >> >> I get 5000 iops / VM for rand. 4k writes while assigning 4 cores on a >> 2.5 GHz Xeon. >> >> When I move this VM to another kvm host with 3.6 GHz I get 8000 iops (still 8 >> cores) when I then LOWE

RE: less cores more iops / speed

2012-11-07 Thread Dietmar Maurer
> > I've noticed something really interesting. > > > > I get 5000 iops / VM for rand. 4k writes while assigning 4 cores on a > > 2.5 GHz Xeon. > > > > When I move this VM to another kvm host with 3.6 GHz I get 8000 iops > > (still 8 > > cores) when I then LOWER the assigned cores from 8 to 4 I get >

RE: less cores more iops / speed

2012-11-07 Thread Dietmar Maurer
> I've noticed something really interesting. > > I get 5000 iops / VM for rand. 4k writes while assigning 4 cores on a > 2.5 GHz Xeon. > > When I move this VM to another kvm host with 3.6 GHz I get 8000 iops (still 8 > cores) when I then LOWER the assigned cores from 8 to 4 I get > 14,500 iops. If

Re: syncfs slower than without syncfs

2012-11-07 Thread Josh Durgin
On 11/07/2012 08:26 AM, Stefan Priebe wrote: On 07.11.2012 16:04, Mark Nelson wrote: Whew, glad you found the problem, Stefan! I was starting to wonder what was going on. :) Do you mind filing a bug about the control dependencies? Sure, where should I file it? http://www.tracker.newdre

Re: Unexpected behavior by ceph 0.48.2argonaut.

2012-11-07 Thread Josh Durgin
On 11/07/2012 05:34 AM, hemant surale wrote: I am not sure about my judgments, but please help me understand the result of the following experiment: Experiment: (3 node cluster, all have ceph v0.48.2argonaut (after building ceph from source code) + Ubuntu 12.04 + kernel 3.2.0)

Re: less cores more iops / speed

2012-11-07 Thread Mark Nelson
On 11/07/2012 06:00 PM, Joao Eduardo Luis wrote: On 11/07/2012 10:02 PM, Stefan Priebe wrote: Hello again, I've noticed something really interesting. I get 5000 iops / VM for rand. 4k writes while assigning 4 cores on a 2.5 GHz Xeon. When I move this VM to another kvm host with 3.6 GHz I get 8

Openstack - Boot From New Volume

2012-11-07 Thread Quenten Grasso
Hi All, I've been looking for this bit of code for a while: how to make OpenStack create a VM from the dashboard with an attached/imaged volume, to avoid the current multistep process of creating a VM with a ceph volume. Here's the code; thanks goes out to vishy on openstack-dev, and I've added a
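
The code referenced above is cut off in this preview. Purely as an illustration of the same workflow from the command line (the volume name, image UUID and size are placeholders, and flag spellings varied between novaclient/cinderclient releases of that era), booting a VM from a pre-imaged volume went roughly like this:

    # create a bootable volume from an existing Glance image (UUID is a placeholder)
    cinder create --image-id 11111111-2222-3333-4444-555555555555 --display-name boot-vol 10

    # boot an instance from that volume instead of an ephemeral disk
    nova boot --flavor m1.small --block-device-mapping vda=<volume-id>:::0 boot-from-volume-vm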

Re: less cores more iops / speed

2012-11-07 Thread Joao Eduardo Luis
On 11/07/2012 10:02 PM, Stefan Priebe wrote: > Hello again, > > I've noticed something really interesting. > > I get 5000 iops / VM for rand. 4k writes while assigning 4 cores on a > 2.5 GHz Xeon. > > When I move this VM to another kvm host with 3.6 GHz I get 8000 iops > (still 8 cores) when I th

Re: SSD journal suggestion

2012-11-07 Thread Mark Nelson
On 11/07/2012 04:51 PM, Gandalf Corvotempesta wrote: 2012/11/7 Martin Mailand : But it looks good on paper, so it's definitely worth a try. is at least 4x faster than 10GbE and AFAIK should have lower latency. I'm planning to use InfiniBand as the backend storage network, used for OSD repl

Re: SSD journal suggestion

2012-11-07 Thread Martin Mailand
Good question; we probably do not have enough experience with IPoIB. But it looks good on paper, so it's definitely worth a try. -martin On 07.11.2012 23:28, Gandalf Corvotempesta wrote: 2012/11/7 Martin Mailand : I tested an Arista 7150S-24, an HP5900 and in a few weeks I will get a Mellanox

Re: SSD journal suggestion

2012-11-07 Thread Martin Mailand
Hi, I *think* the HP is Broadcom based, the Arista is Fulcrum based, and I don't know which chips Mellanox is using. Our NOC tested both of them, and the Arista was the clear winner, at least in our workload. -martin On 07.11.2012 22:59, Stefan Priebe wrote: HP told me they all use the s

less cores more iops / speed

2012-11-07 Thread Stefan Priebe
Hello again, I've noticed something really interesting. I get 5000 iops / VM for rand. 4k writes while assigning 4 cores on a 2.5 GHz Xeon. When I move this VM to another kvm host with 3.6 GHz I get 8000 iops (still 8 cores); when I then LOWER the assigned cores from 8 to 4 I get 14,500 iops.
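
The fio job behind these numbers is not shown in the preview; a minimal invocation of the kind that produces them (the device path, queue depth and job count are assumptions) would be:

    fio --name=randwrite-4k --filename=/dev/vdb --rw=randwrite --bs=4k \
        --ioengine=libaio --direct=1 --iodepth=32 --numjobs=1 \
        --runtime=90 --time_based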

Re: SSD journal suggestion

2012-11-07 Thread Stefan Priebe
On 07.11.2012 22:55, Martin Mailand wrote: Hi Stefan, deep buffers mean latency spikes; you should go for fast switching latency. The HP5900 has a latency of 1ms, the Arista and Mellanox of 250ns. HP told me they all use the same chips and Arista measures latency while only one port is in

Re: SSD journal suggestion

2012-11-07 Thread Martin Mailand
Hi Stefan, deep buffers mean latency spikes; you should go for fast switching latency. The HP5900 has a latency of 1ms, the Arista and Mellanox of 250ns. And you should consider the price: the HP5900 costs 3 times as much as the Mellanox. -martin On 07.11.2012 22:44, Stefan Priebe wrote: On 07.11

Re: SSD journal suggestion

2012-11-07 Thread Stefan Priebe
On 07.11.2012 22:35, Martin Mailand wrote: Hi, I tested an Arista 7150S-24, an HP5900 and in a few weeks I will get a Mellanox MSX1016. ATM the Arista is my favourite. For the dual 10GbE NICs I tested the Intel X520-DA2 and the Mellanox ConnectX-3. My favourite is the Intel X520-DA2. That's p

Re: SSD journal suggestion

2012-11-07 Thread Martin Mailand
Hi, I tested an Arista 7150S-24, an HP5900 and in a few weeks I will get a Mellanox MSX1016. ATM the Arista is my favourite. For the dual 10GbE NICs I tested the Intel X520-DA2 and the Mellanox ConnectX-3. My favourite is the Intel X520-DA2. -martin On 07.11.2012 22:14, Gandalf Corvot

Re: SSD journal suggestion

2012-11-07 Thread Martin Mailand
Hi, I have 16 SAS disks on an LSI 9266-8i and 4 Intel 520 SSDs on an HBA; the node has dual 10G Ethernet. The clients are 4 nodes with dual 10GbE; as a test I use rados bench on each client. The aggregated write speed is around 1.6GB/s with single replication. In the first configuration, I had the
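
For reference, the per-client rados bench runs mentioned above take roughly this form (the pool name, duration and concurrency are assumptions, not the actual settings used):

    # 60-second write benchmark with 16 concurrent operations against pool "data"
    rados -p data bench 60 write -t 16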

extreme ceph-osd cpu load for rand. 4k write

2012-11-07 Thread Stefan Priebe
Hello list, while benchmarking I was wondering why the ceph-osd load is so extremely high under random 4k write i/o. Here is an example while benchmarking: random 4k write: 16,000 iop/s, 180% CPU load in top for EACH ceph-osd process; random 4k read: 16,000 iop/s, 19% CPU load in top fr
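
The CPU figures quoted here come from top; one way to capture comparable per-OSD numbers during a benchmark run (assuming the sysstat package is installed) is:

    # per-process CPU usage of every ceph-osd, sampled once a second for 5 seconds
    pidstat -u -p $(pgrep -d, -x ceph-osd) 1 5

    # or a single batch-mode snapshot from top
    top -b -n 1 | grep ceph-osd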

Re: rbd striping format v1 / v2 benchmark

2012-11-07 Thread Sage Weil
On Wed, 7 Nov 2012, Stefan Priebe wrote: > Hello list, > > I've done some benchmarks regarding striping / v1 / v2. > > Results: > format 1: > > write: io=5739MB, bw=65278KB/s, iops=16319, runt= 90029msec > read : io=5771MB, bw=65636KB/s, iops=16408, runt= 90030msec > write: io=77224MB, bw=

rbd striping format v1 / v2 benchmark

2012-11-07 Thread Stefan Priebe
Hello list, I've done some benchmarks regarding striping / v1 / v2. Results: format 1: write: io=5739MB, bw=65278KB/s, iops=16319, runt= 90029msec read : io=5771MB, bw=65636KB/s, iops=16408, runt= 90030msec write: io=77224MB, bw=874044KB/s, iops=213, runt= 90473msec read : io=178840MB,
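
The image creation commands are not included in the preview; a sketch of how the two formats are typically created for such a comparison (the size and stripe parameters are invented, and the flag was spelled --format in releases of that era and --image-format in later ones) looks like:

    # format 1 image (the default at the time)
    rbd create --size 10240 rbd/bench-v1

    # format 2 image with explicit striping
    rbd create --size 10240 --format 2 --stripe-unit 65536 --stripe-count 8 rbd/bench-v2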

Re: Ubuntu 12.04.1 + xfs + syncfs is still not our friend

2012-11-07 Thread Josh Durgin
On 11/07/2012 12:14 AM, Gandalf Corvotempesta wrote: 2012/11/7 Dan Mick : Resolution: installing the packages built for precise, rather than squeeze, got versions that use syncfs. Which packages, ceph or libc? The ceph packages.
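
A quick way to verify which binaries actually picked up syncfs (the paths below are for 64-bit Ubuntu 12.04; adjust as needed) is to look at the dynamic symbols:

    # does the installed libc export syncfs()?
    nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep -w syncfs

    # does the ceph-osd binary reference it?
    nm -D /usr/bin/ceph-osd | grep -w syncfs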

Re: SSD journal suggestion

2012-11-07 Thread Mark Nelson
On 11/07/2012 10:35 AM, Atchley, Scott wrote: On Nov 7, 2012, at 11:20 AM, Mark Nelson wrote: Right now I'm doing 3 journals per SSD, but topping out at about 1.2-1.4GB/s from the client perspective for the node with 15+ drives and 5 SSDs. It's possible newer versions of the code and tuning
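
As context for the "3 journals per SSD" layout mentioned above, a minimal ceph.conf sketch of journals placed on SSD partitions might look like the following; the device paths and size are invented for illustration:

    [osd]
        osd journal size = 10240    ; MB, relevant for file-based journals

    [osd.0]
        osd journal = /dev/sdg1     ; partition 1 on the journal SSD
    [osd.1]
        osd journal = /dev/sdg2     ; partition 2 on the journal SSD
    [osd.2]
        osd journal = /dev/sdg3     ; partition 3 on the journal SSD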

Re: SSD journal suggestion

2012-11-07 Thread Atchley, Scott
On Nov 7, 2012, at 11:20 AM, Mark Nelson wrote: >>> Right now I'm doing 3 journals per SSD, but topping out at about >>> 1.2-1.4GB/s from the client perspective for the node with 15+ drives and >>> 5 SSDs. It's possible newer versions of the code and tuning may >>> increase that. >> >> What in

Re: syncfs slower than without syncfs

2012-11-07 Thread Stefan Priebe
On 07.11.2012 16:04, Mark Nelson wrote: Whew, glad you found the problem, Stefan! I was starting to wonder what was going on. :) Do you mind filing a bug about the control dependencies? Sure, where should I file it? Stefan

Re: SSD journal suggestion

2012-11-07 Thread Atchley, Scott
On Nov 7, 2012, at 10:01 AM, Mark Nelson wrote: > On 11/07/2012 06:28 AM, Gandalf Corvotempesta wrote: >> 2012/11/7 Sage Weil : >>> On Wed, 7 Nov 2012, Gandalf Corvotempesta wrote: I'm evaluating some SSD drives as journal. Samsung 840 Pro seems to be the fastest in sequential reads and

Re: SSD journal suggestion

2012-11-07 Thread Mark Nelson
On 11/07/2012 10:12 AM, Atchley, Scott wrote: On Nov 7, 2012, at 10:01 AM, Mark Nelson wrote: On 11/07/2012 06:28 AM, Gandalf Corvotempesta wrote: 2012/11/7 Sage Weil : On Wed, 7 Nov 2012, Gandalf Corvotempesta wrote: I'm evaluating some SSD drives as journal. Samsung 840 Pro seems to be th

trying to import crushmap results in max_devices > osdmap max_osd

2012-11-07 Thread Stefan Priebe - Profihost AG
Hello, I've added two nodes with 4 devices each and modified the crushmap. But importing the new map results in: crushmap max_devices 55 > osdmap max_osd 35 What's wrong? Greets Stefan
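
One likely way past this error, assuming the extra device IDs in the edited map are intentional, is to raise max_osd before injecting the map (the value 55 is taken from the error message above):

    crushtool -c crushmap.txt -o crushmap.new   # compile the edited text map
    ceph osd setmaxosd 55                       # raise osdmap max_osd to match
    ceph osd setcrushmap -i crushmap.new        # inject the new map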

Re: Mons network

2012-11-07 Thread Gregory Farnum
On Wed, Nov 7, 2012 at 4:20 PM, Gandalf Corvotempesta wrote: > 2012/11/7 Gregory Farnum : >> The mons need to be reachable by everybody. They don't do a ton of >> network traffic, but 100Mb/s might be pushing it a bit low... > > Some portion of my network are 10/100 with gigabit uplink. > In this

Re: Mons network

2012-11-07 Thread Gregory Farnum
On Wed, Nov 7, 2012 at 1:31 PM, Gandalf Corvotempesta wrote: > Which kind of network should I plan for 3 or 4 MON nodes? > Are these nodes called by the ceph client (RBD, RGW, and so on) or only by OSDs? > > Can I use some virtual machines distributed across multiple Xen nodes > on a 100mbit/s net

Re: syncfs slower than without syncfs

2012-11-07 Thread Mark Nelson
Whew, glad you found the problem, Stefan! I was starting to wonder what was going on. :) Do you mind filing a bug about the control dependencies? Mark On 11/07/2012 07:31 AM, Stefan Priebe - Profihost AG wrote: On 07.11.2012 10:41, Stefan Priebe - Profihost AG wrote: Hello list, syncfs i

Re: SSD journal suggestion

2012-11-07 Thread Mark Nelson
On 11/07/2012 06:28 AM, Gandalf Corvotempesta wrote: 2012/11/7 Sage Weil : On Wed, 7 Nov 2012, Gandalf Corvotempesta wrote: I'm evaluating some SSD drives as journal. Samsung 840 Pro seems to be the fastest in sequential reads and write. The 840 Pro seems to reach 485MB/s in sequential write:

unexpected problem with radosgw fcgi

2012-11-07 Thread SÅ‚awomir Skowron
I have realized that requests served from radosgw via fastcgi in nginx return: HTTP/1.1 200, not HTTP/1.1 200 OK. Any other cgi that I run, for example PHP via fastcgi, returns this as the RFC says, with OK. Has anyone experienced this problem? I see in the code: ./src/rgw/rgw_rest.cc line 36 const sta
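
The difference is easy to observe from a client; for example (the hostname is a placeholder):

    # print only the status line returned by the gateway
    curl -sI http://rgw.example.com/ | head -n 1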

Unexpected behavior by ceph 0.48.2argonaut.

2012-11-07 Thread hemant surale
I am not sure about my judgments, but please help me understand the result of the following experiment: Experiment: (3 node cluster, all have ceph v0.48.2argonaut (after building ceph from source code) + Ubuntu 12.04 + kernel 3.2.0) --

Re: syncfs slower than without syncfs

2012-11-07 Thread Stefan Priebe - Profihost AG
On 07.11.2012 10:41, Stefan Priebe - Profihost AG wrote: Hello list, syncfs is much slower than without syncfs for me. If I compile latest ceph master with wip-rbd-read: with syncfs: rand 4K: write: io=1133MB, bw=12853KB/s, iops=3213, runt= 90277msec read : io=1239MB, bw=14046KB/s, iops

Re: SSD journal suggestion

2012-11-07 Thread Sage Weil
On Wed, 7 Nov 2012, Gandalf Corvotempesta wrote: > I'm evaluating some SSD drives as journals. > The Samsung 840 Pro seems to be the fastest in sequential reads and writes. > > What parameter should I consider for a journal? I think that none of the > read benchmarks matter, because when dumping the journal

syncfs slower than without syncfs

2012-11-07 Thread Stefan Priebe - Profihost AG
Hello list, syncfs is much slower than without syncfs for me. If I compile latest ceph master with wip-rbd-read: with syncfs: rand 4K: write: io=1133MB, bw=12853KB/s, iops=3213, runt= 90277msec read : io=1239MB, bw=14046KB/s, iops=3511, runt= 90325msec seq 4M: write: io=37560MB, bw=423874K

Re: What would a good OSD node hardware configuration look like?

2012-11-07 Thread Wido den Hollander
On 07-11-12 09:17, Gandalf Corvotempesta wrote: 2012/11/7 Wido den Hollander : Except that SSDs will mainly fail due to the amount of write cycles they had to endure. So in RAID-1 your SSDs will fail at almost the same time. With for example 8 OSDs in a server you better spread them out 50/5