-----Original Message-----
> From: ceph-devel-ow...@vger.kernel.org
> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Sébastien Han
> Sent: November 22, 2012 5:47
> To: Mark Nelson
> Cc: Alexandre DERUMIER; ceph-devel; Mark Kampe
> Subject: Re: RBD fio Performance concerns
>
>
- Original message -
From: "Stefan Priebe - Profihost AG"
To: "Alexandre DERUMIER"
Cc: "Mark Nelson", "ceph-devel", "Mark Kampe", "Sébastien Han"
Sent: Friday, November 23, 2012 11:49:10
Subject: Re: RBD fio Performance concerns
On 23.11.2012 1
-ow...@vger.kernel.org] On Behalf Of Sébastien Han
Sent: November 22, 2012 5:47
To: Mark Nelson
Cc: Alexandre DERUMIER; ceph-devel; Mark Kampe
Subject: Re: RBD fio Performance concerns
Hi Mark,
Well the most concerning thing is that I have 2 Ceph clusters and both of them
show better rand than seq...
I
"ceph-devel", "Mark Kampe", "Sébastien Han"
Sent: Friday, November 23, 2012 11:49:10
Subject: Re: RBD fio Performance concerns
On 23.11.2012 11:47, Alexandre DERUMIER wrote:
when I switch the journal to a separate partition on each OSD disk
(/dev/sdX1 for jou
Han"
Sent: Friday, November 23, 2012 14:24:26
Subject: Re: RBD fio Performance concerns
On 23.11.2012 14:18, Mark Nelson wrote:
> Agreed with Alexandre, try putting the journal on a raw partition.
> That's pretty insane! What controller are you using again?
Makes no differenc
rofihost AG"
To: "Alexandre DERUMIER"
Cc: "Mark Nelson", "ceph-devel", "Mark Kampe", "Sébastien Han"
Sent: Friday, November 23, 2012 11:49:10
Subject: Re: RBD fio Performance concerns
On 23.11.2012 11:47, Alexandre DERUMIER wrote:
when i s
- Original message -
From: "Stefan Priebe - Profihost AG"
To: "Alexandre DERUMIER"
Cc: "Mark Nelson", "ceph-devel", "Mark Kampe", "Sébastien Han"
Sent: Friday, November 23, 2012 11:49:10
Subject: Re: RBD fio Performance concerns
On
On 23.11.2012 12:03, Alexandre DERUMIER wrote:
so correctly aligned...
Maybe try to use the journal directly on the full partition, without xfs?
The same - just 200 iops for rand 4k.
Stefan
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majo
bastien
Han"
Sent: Friday, November 23, 2012 11:49:10
Subject: Re: RBD fio Performance concerns
On 23.11.2012 11:47, Alexandre DERUMIER wrote:
>>> when I switch the journal to a separate partition on each OSD disk
>>> (/dev/sdX1 for journal 1GB and /dev/sdX2 f
On 23.11.2012 11:47, Alexandre DERUMIER wrote:
when I switch the journal to a separate partition on each OSD disk
(/dev/sdX1 for journal 1GB and /dev/sdX2 for OSD) I go down from 23,000
iops to 200 iops random 4k.
O_o, that seems crazy...
Are you sure that your partitions are correctly
ing first
partition at sector 2048 is best for ssd)
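The alignment question above can be checked with a little arithmetic. This is only a sketch: the 512-byte sector size and the 1 MiB boundary are assumptions, and is_aligned is a made-up helper name, not something from the thread.

```python
SECTOR_BYTES = 512           # assumption: 512-byte logical sectors
ALIGN_BYTES = 1024 * 1024    # assumption: 1 MiB alignment boundary


def is_aligned(start_sector, sector_bytes=SECTOR_BYTES, align_bytes=ALIGN_BYTES):
    """Return True if the partition's first byte falls on an alignment boundary."""
    return (start_sector * sector_bytes) % align_bytes == 0


print(is_aligned(2048))  # True: 2048 * 512 bytes is exactly 1 MiB
print(is_aligned(63))    # False: 63 * 512 bytes is not a multiple of 1 MiB
```

On Linux the start sector of a partition can usually be read from /sys/block/sdX/sdX1/start and passed to the function.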
- Original message -
From: "Stefan Priebe - Profihost AG"
To: "Sébastien Han"
Cc: "Mark Nelson", "Alexandre DERUMIER", "ceph-devel", "Mark Kampe"
Sent: Friday 23 Novemb
Hi,
when I switch the journal to a separate partition on each OSD disk
(/dev/sdX1 for journal 1GB and /dev/sdX2 for OSD) I go down from 23,000
iops to 200 iops random 4k.
Greets,
Stefan
On 22.11.2012 13:50, Sébastien Han wrote:
journal is running on tmpfs to me but that changes nothi
> journal is running on tmpfs to me but that changes nothing.
I don't think it works then. According to the doc: Enables using
libaio for asynchronous writes to the journal. Requires journal dio
set to true.
On Thu, Nov 22, 2012 at 12:48 PM, Stefan Priebe - Profihost AG
wrote:
> On 22.11.2012 1
>>But who cares? it's also on the 2nd node. or even on the 3rd if you have
>>replicas 3.
Yes but you could also suffer a crash while writing the first replica.
If the journal is in tmpfs, there is nothing to replay.
On Thu, Nov 22, 2012 at 4:35 PM, Alexandre DERUMIER wrote:
>
> >>But who cares
"Alexandre DERUMIER", "ceph-devel", "Mark Kampe", "Sébastien Han"
Sent: Thursday, November 22, 2012 16:01:56
Subject: Re: RBD fio Performance concerns
On 22.11.2012 15:46, Mark Nelson wrote:
> I haven't played a whole lot with SSD only OSDs yet (oth
st AG"
To: "Alexandre DERUMIER"
Cc: "ceph-devel", "Mark Kampe", "Sébastien Han", "Mark Nelson"
Sent: Thursday, November 22, 2012 16:28:57
Subject: Re: RBD fio Performance concerns
On 22.11.2012 16:26, Alexandre DERUMIER wrote:
>>>
fs as we can use dio?
Greets,
Stefan
- Original message -
From: "Stefan Priebe - Profihost AG"
To: "Sébastien Han"
Cc: "Mark Nelson", "Alexandre DERUMIER", "ceph-devel", "Mark Kampe"
Sent: Thursday, November 22, 2012 14:29:03
Subject: Re
4k: 1600iops
>>>> seq write 4M : 31iops (1gigabit client bandwidth limit)
>>>>
>>>>
>>>> I really don't understand why I can't get more rand read iops with 4K
>>>> block ...
>>>>
>>>> I try with high e
>>>
>>>> I see no other option while working with SSDs - the only option would be
>>>> to be able to deactivate the journal entirely. But ceph does not support
>>>> this.
>>>>
>>>> Stefan
>>>>
>>
Sequential is faster than random on a disk, but we are not
doing I/O to a disk, but a distributed storage cluster:
small random operations are striped over multiple objects and
servers, and so can proceed in parallel and take advantage of
more nodes and disks. This parallelism can overcome
e DERUMIER", "ceph-devel", "Mark Kampe"
Sent: Thursday, November 22, 2012 14:29:03
Subject: Re: RBD fio Performance concerns
On 22.11.2012 14:22, Sébastien Han wrote:
And RAMDISK devices are too expensive.
It would make sense in your infra, but yes they are really expensi
Stefan
- Original message -
From: "Stefan Priebe - Profihost AG"
To: "Mark Nelson"
Cc: "Alexandre DERUMIER", "ceph-devel", "Mark Kampe", "Sébastien Han"
Sent: Thursday, November 22, 2012 16:01:56
Subject: Re: RBD fio Performance concerns
ompare.
I'll redo the tests in 1 or 2 weeks.
I hope performance will improve.
I'll keep you posted!
Alexandre
- Original message -
From: "Sébastien Han"
To: "Mark Nelson"
Cc: "Alexandre DERUMIER", "ceph-devel", "Mark Kampe"
t, it doesn't change anything.
But the test cluster uses old 8-core E5420 @ 2.50GHz CPUs (CPU is around
15% on the cluster during the read bench)
- Original message -
From: "Sébastien Han"
To: "Mark Kampe"
Cc: "Alexandre DERUMIER", "ceph-devel"
Sent: Monday
From: "Stefan Priebe - Profihost AG"
To: "Mark Nelson"
Cc: "Alexandre DERUMIER", "ceph-devel", "Mark Kampe", "Sébastien Han"
Sent: Thursday, November 22, 2012 15:42:14
Subject: Re: RBD fio Performance concerns
On 22.11.2012 15:37,
"Stefan Priebe - Profihost AG"
To: "Sébastien Han"
Cc: "Mark Nelson", "Alexandre DERUMIER", "ceph-devel", "Mark Kampe"
Sent: Thursday, November 22, 2012 14:29:03
Subject: Re: RBD fio Performance concerns
On 22.11.2012 14:22, Sébastien
On 22.11.2012 11:49, Sébastien Han wrote:
@Alexandre: cool!
@ Stefan: Full SSD cluster and 10G switches?
Yes
Couple of weeks ago I saw
that you use journal aio, did you notice performance improvement with it?
journal is running on tmpfs to me but that changes nothing.
Stefan
mpe", "Sébastien Han"
Sent: Thursday, November 22, 2012 15:42:14
Subject: Re: RBD fio Performance concerns
On 22.11.2012 15:37, Mark Nelson wrote:
I don't think we recommend tmpfs at all for anything other than playing
around. :)
I discussed this with somebody from inkt
On 22.11.2012 14:22, Sébastien Han wrote:
And RAMDISK devices are too expensive.
It would make sense in your infra, but yes they are really expensive.
We need something like tmpfs - running in local memory but supporting dio.
Stefan
tefan
- Original message -
From: "Stefan Priebe - Profihost AG"
To: "Sébastien Han"
Cc: "Mark Nelson", "Alexandre DERUMIER", "ceph-devel", "Mark Kampe"
Sent: Thursday, November 22, 2012 14:29:03
Subject: Re: RBD fio Performance concerns
On
Otherwise you would have the same problem with the disk crashes
On 22.11.2012 at 16:55, Sébastien Han wrote:
> Hum sorry, you're right. Forget about what I said :)
>
>
> On Thu, Nov 22, 2012 at 4:54 PM, Stefan Priebe - Profihost AG
> wrote:
>> I thought the Client would then write to the 2nd
>>>>>
>>>>> rand read 4K : 6000 iops
>>>>> seq read 4K : 3500 iops
>>>>> seq read 4M : 31iops (1gigabit client bandwidth limit)
>>>>>
>>>>> rand write 4k: 6000iops (tmpfs journal)
>>>>
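A rough check of the "1gigabit client bandwidth limit" remark for the 4M tests: dividing the link rate by the request size gives the ceiling on requests per second. This is only a sketch; it ignores protocol overhead, and whether "4M" means 4 MiB or 4,000,000 bytes is an assumption either way, so both are shown.

```python
LINK_BITS_PER_SEC = 1_000_000_000  # 1 gigabit client link


def max_iops(link_bits_per_sec, io_bytes):
    """Upper bound on requests per second if the link is the only limit."""
    return (link_bits_per_sec / 8) / io_bytes


# Assumption: "4M" read as 4 MiB (binary) or as 4,000,000 bytes (decimal).
print(f"{max_iops(LINK_BITS_PER_SEC, 4 * 1024 * 1024):.1f}")  # 29.8
print(f"{max_iops(LINK_BITS_PER_SEC, 4_000_000):.2f}")        # 31.25
```

Both values sit close to the reported 31 iops, which fits the idea that the 4M tests are bound by the client network rather than by the cluster.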
etter than tmpfs as we can use dio?
Greets,
Stefan
- Original message -
From: "Stefan Priebe - Profihost AG"
To: "Sébastien Han"
Cc: "Mark Nelson", "Alexandre DERUMIER", "ceph-devel", "Mark Kampe"
Sent: Thursday, November 22, 2012
On 22.11.2012 13:50, Sébastien Han wrote:
journal is running on tmpfs to me but that changes nothing.
I don't think it works then. According to the doc: Enables using
libaio for asynchronous writes to the journal. Requires journal dio
set to true.
Ah might be but as the SSDs are pretty fast
I thought the Client would then write to the 2nd is this wrong?
Stefan
On 22.11.2012 at 16:49, Sébastien Han wrote:
>>> But who cares? it's also on the 2nd node. or even on the 3rd if you have
>>> replicas 3.
>
> Yes but you could also suffer a crash while writing the first replica.
> If the
tien Han"
Cc: "Mark Nelson", "Alexandre DERUMIER", "ceph-devel", "Mark Kampe"
Sent: Thursday, November 22, 2012 14:29:03
Subject: Re: RBD fio Performance concerns
On 22.11.2012 14:22, Sébastien Han wrote:
> And RAMDISK devices are too expens
Hum sorry, you're right. Forget about what I said :)
On Thu, Nov 22, 2012 at 4:54 PM, Stefan Priebe - Profihost AG
wrote:
> I thought the Client would then write to the 2nd is this wrong?
>
> Stefan
>
> On 22.11.2012 at 16:49, Sébastien Han wrote:
>
But who cares? it's also on the 2nd nod
tien Han"
To: "Mark Kampe"
Cc: "Alexandre DERUMIER", "ceph-devel"
Sent: Monday, November 19, 2012 19:03:40
Subject: Re: RBD fio Performance concerns
@Sage, thanks for the info :)
@Mark:
If you want to do sequential I/O, you should do it buffered
(so
ster uses old 8-core E5420 @ 2.50GHz CPUs (CPU is around 15% on
the cluster during the read bench)
- Original message -
From: "Sébastien Han"
To: "Mark Kampe"
Cc: "Alexandre DERUMIER", "ceph-devel"
Sent: Monday, November 19, 2012 19:03:40
Subject: Re: RBD fio Per
r with kvm guest)
>
>
>
> - Original message -
>
> From: "Sébastien Han"
> To: "Alexandre DERUMIER"
> Cc: "ceph-devel", "Mark Kampe"
>
> Sent: Monday, November 19, 2012 21:57:59
> Subject: Re: RBD fio Performance concerns
(each fio is at 6000iops)
(I get the same result with the rbd module or with a kvm guest)
- Original message -
From: "Sébastien Han"
To: "Alexandre DERUMIER"
Cc: "ceph-devel", "Mark Kampe"
Sent: Monday, November 19, 2012 21:57:59
Subject: Re: RBD fio Pe
Hello Mark,
See my benchmark results below:
-RADOS Bench with 4M block size write:
# rados -p bench bench 300 write -t 32 --no-cleanup
Maintaining 32 concurrent writes of 4194304 bytes for at least 300 seconds.
2012-11-19 21:35:01.722143min lat: 0.255396 max lat: 8.40212 avg lat: 1.14076
se
> But the test cluster uses old 8-core E5420 @ 2.50GHz CPUs (CPU is around 15% on
> the cluster during the read bench)
>
>
> - Original message -
>
> From: "Sébastien Han"
> To: "Mark Kampe"
> Cc: "Alexandre DERUMIER", "ceph-devel"
>
>
From: "Sébastien Han"
To: "Mark Kampe"
Cc: "Alexandre DERUMIER", "ceph-devel"
Sent: Monday, November 19, 2012 19:03:40
Subject: Re: RBD fio Performance concerns
@Sage, thanks for the info :)
@Mark:
> If you want to do sequential I/O, you sh
@Sage, thanks for the info :)
@Mark:
> If you want to do sequential I/O, you should do it buffered
> (so that the writes can be aggregated) or with a 4M block size
> (very efficient and avoiding object serialization).
The original benchmark has been performed with 4M block size. And as
you can se
Recall:
1. RBD volumes are striped (4M wide) across RADOS objects
2. distinct writes to a single RADOS object are serialized
Your sequential 4K writes are direct, depth=256, so there are
(at all times) 256 writes queued to the same object. All of
your writes are waiting through a very long
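A minimal sketch of the two points above, assuming the simplest layout where byte offset N of the image lives in object N // 4 MiB (no custom striping). The image size, the fixed random seed and the helper names are made up for illustration.

```python
import random

OBJECT_BYTES = 4 * 1024 * 1024   # RBD stripe width from the message (4M)
IO_BYTES = 4 * 1024              # 4K writes
QUEUE_DEPTH = 256                # fio depth from the message
IMAGE_BYTES = 10 * 1024 ** 3     # hypothetical 10 GiB image


def object_index(offset, object_bytes=OBJECT_BYTES):
    """Index of the RADOS object that holds this byte offset."""
    return offset // object_bytes


def objects_touched(offsets):
    """Number of distinct objects hit by a batch of in-flight writes."""
    return len({object_index(off) for off in offsets})


# 256 sequential 4K writes cover only 1 MiB, so they share one object
# and are serialized behind each other.
seq_offsets = [i * IO_BYTES for i in range(QUEUE_DEPTH)]

# 256 random 4K writes spread over the image land on many objects
# and can proceed in parallel.
rng = random.Random(0)
rand_offsets = [rng.randrange(IMAGE_BYTES // IO_BYTES) * IO_BYTES
                for _ in range(QUEUE_DEPTH)]

print(objects_touched(seq_offsets))   # 1
print(objects_touched(rand_offsets))  # many more than 1
```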
astien HAN.
>
>
> On Mon, Nov 19, 2012 at 4:28 PM, Alexandre DERUMIER
> wrote:
> >>>why the
> >>>sequential read/writes are lower than the random ones? Or maybe do I
> >>>just need to care about the bandwidth for those values?
> >
> > If I remember, you use f
t more bandwidth.
>
>
>
> - Original message -
>
> From: "Sébastien Han"
> To: "Mark Kampe"
> Cc: "ceph-devel"
> Sent: Monday, November 19, 2012 15:56:35
> Subject: Re: RBD fio Performance concerns
>
> Hello Mark,
>
> Firs
- Original message -
From: "Sébastien Han"
To: "Mark Kampe"
Cc: "ceph-devel"
Sent: Monday, November 19, 2012 15:56:35
Subject: Re: RBD fio Performance concerns
Hello Mark,
First of all, thank you again for another accurate answer :-).
> I would have exp
Hello Mark,
First of all, thank you again for another accurate answer :-).
> I would have expected write aggregation and cylinder affinity to
> have eliminated some seeks and improved rotational latency resulting
> in better than theoretical random write throughput. Against those
> expectations
On 11/15/2012 12:23 PM, Sébastien Han wrote:
First of all, I would like to thank you for this well explained,
structured and clear answer. I guess I got better IOPS thanks to the 10K disks.
10K RPM would bring your per-drive throughput (for 4K random writes)
up to 142 IOPS and your aggregate c
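The 142 IOPS figure is consistent with the usual rough model for a rotating disk: one random I/O costs an average seek plus half a rotation. The sketch below assumes a 4 ms average seek; that value is chosen for illustration and is not stated in the thread.

```python
def rotational_latency_s(rpm):
    """Average rotational latency: half a revolution, in seconds."""
    return 0.5 * 60.0 / rpm


def random_iops(rpm, avg_seek_s):
    """Rough per-drive random IOPS: one seek plus half a rotation per I/O."""
    return 1.0 / (avg_seek_s + rotational_latency_s(rpm))


ASSUMED_SEEK_S = 0.004  # assumption: 4 ms average seek

print(int(random_iops(10_000, ASSUMED_SEEK_S)))  # 142
```

At 10K RPM half a rotation takes 3 ms, so each I/O costs about 7 ms under this assumption.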