Re: [ceph-users] RADOS Bench strange behavior
On Wed, Jul 10, 2013 at 12:38 AM, Erwan Velu wrote:
> Hi,
>
> I've just subscribed to the mailing list. I'm maybe breaking the thread, as I cannot "answer to all" ;o)
>
> I'd like to share my research on understanding this behavior.
>
> A rados put shows the expected behavior, while rados bench doesn't, even with a concurrency set to one.
>
> As a newcomer, I've been reading the code to understand the difference between the "put" and "bench" approaches.
>
> The first one is pretty straightforward: we perform the IO via do_put, which calls io_ctx.write{full}.
>
> On the other hand, the benchmark uses much more complicated machinery based on AIO. If I understand properly, that's mostly to be able to increase concurrency. After a few calls we reach the write_bench() function, which is the main loop of the benchmark (https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L302).
>
> That's mostly where I have some trouble understanding how it can work as expected. Here is why:
>
> At https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L330, we prepare as many objects as we have concurrent_ios.
>
> At https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L344, we dispatch as many IOs as we have concurrent_ios.
>
> At https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L368, we start the main loop, which runs until we reach the limit (time or number of objects).
>
> In this loop, at https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L371, we wait until all sent IOs (up to concurrent_ios) are completed. By the way, I didn't understand how the end of an IO is detected. AIO supports callbacks, signals or polling. Which one is used?
> I saw that we rely on completion_is_done(), which does a "return completions[slot]->complete;". I only found something here, but I'm not sure it's the right one: https://github.com/ceph/ceph/blob/master/src/tools/rest_bench.cc#L329

This code has all been refactored several times, but from my memory we have an array of completions matching the array of in-flight objects. When an object op has completed, its completion gets marked complete. So, once we've spun off the initial async IOs, we enter a loop. The first thing we do in that loop is look through the list of completions for one that's marked complete. Then:

> Then we reach https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L389. That's where I'm confused, as from my understanding we are rescheduling _a single IO_ and getting back to the waiting loop. So I don't really get how the concurrency is kept up.

So here we've found that *one* of the IOs (not all of them) has completed, and we're spinning up a replacement IO for that one IO that finished. If more IOs have finished while we were setting that one up, then we'll notice that very quickly in the while(1) loop.

> To be more direct about my thoughts: I think that somewhere the AIO machinery acks the IO too soon, so we are sending a new IO while the previous one hasn't completed. That would explain the kind of behavior we see with Sébastien.

Yeah, the IO is acked here once it's in the journal, but after that it still needs to get into the backing store. (Actually, IIRC, the ack it's looking for is the in-memory one, but with XFS the journal is write-ahead and handles the ordering for you.) However, this shouldn't really result in any different behavior than what you'd see with a bunch of looped "rados put" commands (especially on XFS).
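The reap-and-replace loop Greg describes can be sketched in miniature. This is a Python toy model, not the actual obj_bencher.cc code: Completion stands in for a librados AioCompletion, aio_write for io_ctx.aio_write(), and the slot scan for completion_is_done().

```python
import random
import threading
import time

class Completion:
    # Stand-in for a librados AioCompletion: the IO thread flips this flag.
    def __init__(self):
        self.complete = False

def aio_write(comp):
    # Simulated async write: returns immediately, completes after a delay.
    def io():
        time.sleep(random.uniform(0.001, 0.005))
        comp.complete = True
    threading.Thread(target=io).start()

def write_bench(concurrent_ios, total_ops):
    # Spin off the initial batch: one completion slot per in-flight IO.
    slots = [Completion() for _ in range(concurrent_ios)]
    for c in slots:
        aio_write(c)
    started = concurrent_ios
    finished = 0
    while finished < total_ops:
        # Poll the slots for *one* IO that is done (completion_is_done()-style).
        slot = None
        while slot is None:
            slot = next((i for i, c in enumerate(slots) if c.complete), None)
            if slot is None:
                time.sleep(0.0001)
        finished += 1
        if started < total_ops:
            # Replace just that one IO; anything else that finished in the
            # meantime is picked up on the next pass, so the pipe stays full.
            slots[slot] = Completion()
            aio_write(slots[slot])
            started += 1
        else:
            slots.pop(slot)  # drain phase: no replacement, just reap
    return finished

print(write_bench(concurrent_ios=4, total_ops=20))  # → 20
```

Even though each pass reschedules only a single IO, the loop keeps concurrent_ios writes in flight at all times, which is Greg's point about where the concurrency comes from.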
My guess, Sébastien, is that the BBU/RAID card is just reordering things in ways you weren't expecting, but that this doesn't show up with "rados put" because it's running at about half the speed of rados bench.

> As a side note, I saw that ceph_clock_now uses gettimeofday, which is not resilient to system date changes (like when an NTP update occurs). clock_gettime with CLOCK_MONOTONIC is clearly preferred for this kind of "time difference" computation.

Hmm, that's probably correct. Getting that kind of flag through the current interfaces sounds a little annoying, though. :/

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
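On the clock point above: the pattern Erwan suggests maps directly onto, for example, Python's time.monotonic() (CLOCK_MONOTONIC) versus time.time() (gettimeofday-style wall clock). A minimal sketch of measuring a duration the NTP-safe way:

```python
import time

def timed(fn):
    # Duration measured on a clock that an NTP step cannot perturb.
    t0 = time.monotonic()   # CLOCK_MONOTONIC: only ever moves forward
    fn()
    return time.monotonic() - t0

# time.time() is the wall clock: if ntpd steps the system date mid-run,
# a (t1 - t0) computed from it can come out too large or even negative.
# time.monotonic() is immune, so it is the safer base for latency samples.
elapsed = timed(lambda: time.sleep(0.05))
```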
Re: [ceph-users] RADOS Bench strange behavior
Hi,

I've just subscribed to the mailing list. I'm maybe breaking the thread, as I cannot "answer to all" ;o)

I'd like to share my research on understanding this behavior.

A rados put shows the expected behavior, while rados bench doesn't, even with a concurrency set to one.

As a newcomer, I've been reading the code to understand the difference between the "put" and "bench" approaches.

The first one is pretty straightforward: we perform the IO via do_put, which calls io_ctx.write{full}.

On the other hand, the benchmark uses much more complicated machinery based on AIO. If I understand properly, that's mostly to be able to increase concurrency. After a few calls we reach the write_bench() function, which is the main loop of the benchmark (https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L302).

That's mostly where I have some trouble understanding how it can work as expected. Here is why:

At https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L330, we prepare as many objects as we have concurrent_ios.

At https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L344, we dispatch as many IOs as we have concurrent_ios.

At https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L368, we start the main loop, which runs until we reach the limit (time or number of objects).

In this loop, at https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L371, we wait until all sent IOs (up to concurrent_ios) are completed. By the way, I didn't understand how the end of an IO is detected. AIO supports callbacks, signals or polling. Which one is used? I saw that we rely on completion_is_done(), which does a "return completions[slot]->complete;". I only found something here, but I'm not sure it's the right one: https://github.com/ceph/ceph/blob/master/src/tools/rest_bench.cc#L329

Then we reach https://github.com/ceph/ceph/blob/master/src/common/obj_bencher.cc#L389. That's where I'm confused, as from my understanding we are rescheduling _a single IO_ and getting back to the waiting loop. So I don't really get how the concurrency is kept up.

To be more direct about my thoughts: I think that somewhere the AIO machinery acks the IO too soon, so we are sending a new IO while the previous one hasn't completed. That would explain the kind of behavior we see with Sébastien.

As a side note, I saw that ceph_clock_now uses gettimeofday, which is not resilient to system date changes (like when an NTP update occurs). clock_gettime with CLOCK_MONOTONIC is clearly preferred for this kind of "time difference" computation.
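The put-versus-bench difference described above boils down to blocking versus asynchronous writes. Here is a minimal Python sketch of that contrast; write_full and aio_write are simulations named after the librados calls, not the real implementations, and the sleeps stand in for client-to-OSD round trips:

```python
import threading
import time

def write_full(data):
    # "put" path (io_ctx.write_full): blocks until acked, one IO in flight.
    time.sleep(0.01)  # simulated client -> OSD round trip

def aio_write(data, done):
    # bench path (io_ctx.aio_write): returns at once, completion fires later.
    def io():
        time.sleep(0.01)
        done.set()
    threading.Thread(target=io).start()

# Five sequential write_full calls: the round trips add up.
t0 = time.monotonic()
for _ in range(5):
    write_full(b"x" * 4096)
sync_elapsed = time.monotonic() - t0

# Five aio_write calls dispatched together: the round trips overlap.
events = [threading.Event() for _ in range(5)]
t0 = time.monotonic()
for e in events:
    aio_write(b"x" * 4096, e)
for e in events:
    e.wait()
async_elapsed = time.monotonic() - t0

assert sync_elapsed > async_elapsed  # concurrency is why bench uses AIO
```

This is exactly why the bench code accepts the extra complexity of completions: with AIO, raising concurrent_ios overlaps the round trips instead of serializing them.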
Re: [ceph-users] RADOS Bench strange behavior
> Hrm! Were you using 4MB of data with rados put? Also, I don't know how much extra latency running "rados put" would add from start to finish. Is it slower than RADOS bench when you loop it? It may not show much concurrency if the writes on the OSDs are finishing quickly and waiting on the next operation to start.

The file was 1GB large; it was just for the test. I just tried with 4M, and it took around 60 sec for the loop to get 1000 objects of 4M done, against 35 sec for rados bench. Thus rados bench looks almost 2 times faster. I assume rados put gives the ack as soon as the object is stored on the backend fs.

> Another thing to look at with RADOS bench is the average latency being reported at each reporting interval. Is it better than you'd expect for a 4MB write to a single spinning disk? That might tell you how much of an effect cache is having.

The average latency was 0.0378687. Disks are SEAGATE ST91000640SS, 931.012 GB. The latency announced by Seagate is 4.16 ms, thus it's better than expected.

Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72 – Mobile: +33 (0)6 52 84 44 70
Email: sebastien@enovance.com – Skype: han.sbastien
Address: 10, rue de la Victoire – 75009 Paris
Web: www.enovance.com – Twitter: @enovance

On Jul 9, 2013, at 2:19 PM, Mark Nelson wrote:

On 07/09/2013 06:47 AM, Sebastien Han wrote:

Hi Mark,

Yes, write-back caching is enabled since we have a BBU.
See the current cache policy of the controller: WriteBack, ReadAheadNone and Direct.

FYI, both journal and filestore are stored on the same disks; thus sd*1 is the journal and sd*2 is the filestore.

To give you a bit more insight into the behaviour, here's what I see when I do a "rados put":

Device:  rrqm/s  wrqm/s  r/s   w/s     rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdc      0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdc1     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdc2     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sde      0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sde1     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sde2     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd      0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd1     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd2     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdb      0.00    0.30    0.00  100.40  0.00   25.63  522.85    0.07      0.69   0.00     0.69     0.08   0.80
sdb1     0.00    0.00    0.00  48.10   0.00   12.83  546.08    0.03      0.67   0.00     0.67     0.09   0.44
sdb2     0.00    0.30    0.00  49.00   0.00   12.81  535.27    0.04      0.75   0.00     0.75     0.07   0.36

And here's what I see with a "rados bench" with a concurrency of 1, on a pool with only one copy:

Device:  rrqm/s  wrqm/s  r/s   w/s    rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdc      0.00    8.20    0.00  71.20  0.00   16.12  463.78    0.05      0.77   0.00     0.77     0.08   0.60
sdc1     0.00    0.00    0.00  30.20  0.00   8.02   543.63    0.02      0.81   0.00     0.81     0.11   0.32
sdc2     0.00    8.20    0.00  38.80  0.00   8.11   427.93    0.03      0.78   0.00     0.78     0.07   0.28
sde      0.00    1.20    0.00  57.70  0.00   14.42  511.94    0.06      0.96   0.00     0.96     0.14   0.80
sde1     0.00    0.00    0.00  27.20  0.00   7.21   543.24    0.02      0.81   0.00     0.81     0.10   0.28
sde2     0.00    1.20    0.00  28.50  0.00   7.21   518.01    0.03      1.16   0.00     1.16     0.18   0.52
sdd      0.00    1.50    0.00  78.40  0.00   19.24  502.50    0.08      1.08   0.00     1.08     0.12   0.92
sdd1     0.00    0.00    0.00  36.30  0.00   9.62   542.74    0.03      0.88   0.00     0.88     0.09   0.32
sdd2     0.00    1.50    0.00  39.40  0.00   9.62   499.86    0.05      1.33   0.00     1.33     0.15   0.60
sdb      0.00    2.10    0.00  80.70  0.00   20.04  508.49    0.06      0.69   0.00     0.69     0.07   0.60
sdb1     0.00    0.00    0.00  37.80  0.00   10.02  542.92    0.02      0.62   0.00     0.62     0.07   0.28
sdb2     0.00    2.10    0.00  40.10  0.00   10.02  511.54    0.03      0.80   0.00     0.80     0.08   0.32

This definitely looks like way too much to me… I also tried to reproduce the rados bench behaviour by looping the "rados put" command.
Re: [ceph-users] RADOS Bench strange behavior
On 07/09/2013 06:47 AM, Sebastien Han wrote:

Hi Mark,

Yes, write-back caching is enabled since we have a BBU. See the current cache policy of the controller: WriteBack, ReadAheadNone and Direct.

FYI, both journal and filestore are stored on the same disks; thus sd*1 is the journal and sd*2 is the filestore.

To give you a bit more insight into the behaviour, here's what I see when I do a "rados put":

Device:  rrqm/s  wrqm/s  r/s   w/s     rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdc      0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdc1     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdc2     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sde      0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sde1     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sde2     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd      0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd1     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd2     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdb      0.00    0.30    0.00  100.40  0.00   25.63  522.85    0.07      0.69   0.00     0.69     0.08   0.80
sdb1     0.00    0.00    0.00  48.10   0.00   12.83  546.08    0.03      0.67   0.00     0.67     0.09   0.44
sdb2     0.00    0.30    0.00  49.00   0.00   12.81  535.27    0.04      0.75   0.00     0.75     0.07   0.36

And here's what I see with a "rados bench" with a concurrency of 1, on a pool with only one copy:

Device:  rrqm/s  wrqm/s  r/s   w/s    rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdc      0.00    8.20    0.00  71.20  0.00   16.12  463.78    0.05      0.77   0.00     0.77     0.08   0.60
sdc1     0.00    0.00    0.00  30.20  0.00   8.02   543.63    0.02      0.81   0.00     0.81     0.11   0.32
sdc2     0.00    8.20    0.00  38.80  0.00   8.11   427.93    0.03      0.78   0.00     0.78     0.07   0.28
sde      0.00    1.20    0.00  57.70  0.00   14.42  511.94    0.06      0.96   0.00     0.96     0.14   0.80
sde1     0.00    0.00    0.00  27.20  0.00   7.21   543.24    0.02      0.81   0.00     0.81     0.10   0.28
sde2     0.00    1.20    0.00  28.50  0.00   7.21   518.01    0.03      1.16   0.00     1.16     0.18   0.52
sdd      0.00    1.50    0.00  78.40  0.00   19.24  502.50    0.08      1.08   0.00     1.08     0.12   0.92
sdd1     0.00    0.00    0.00  36.30  0.00   9.62   542.74    0.03      0.88   0.00     0.88     0.09   0.32
sdd2     0.00    1.50    0.00  39.40  0.00   9.62   499.86    0.05      1.33   0.00     1.33     0.15   0.60
sdb      0.00    2.10    0.00  80.70  0.00   20.04  508.49    0.06      0.69   0.00     0.69     0.07   0.60
sdb1     0.00    0.00    0.00  37.80  0.00   10.02  542.92    0.02      0.62   0.00     0.62     0.07   0.28
sdb2     0.00    2.10    0.00  40.10  0.00   10.02  511.54    0.03      0.80   0.00     0.80     0.08   0.32

This definitely looks like way too much to me… I also tried to reproduce the rados bench behaviour by looping the "rados put" command; then I got something like:

Device:  rrqm/s  wrqm/s  r/s   w/s     rMB/s  wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdc      0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdc1     0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdc2     0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sde      0.00    0.00    0.00  589.00  0.00   152.15  529.03    0.55      0.93   0.00     0.93     0.12   6.80
sde1     0.00    0.00    0.00  285.00  0.00   76.15   547.20    0.22      0.77   0.00     0.77     0.13   3.60
sde2     0.00    0.00    0.00  285.00  0.00   76.00   546.13    0.32      1.14   0.00     1.14     0.11   3.20
sdd      0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd1     0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd2     0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdb      0.00    0.00
Re: [ceph-users] RADOS Bench strange behavior
Hi Mark,

Yes, write-back caching is enabled since we have a BBU. See the current cache policy of the controller: WriteBack, ReadAheadNone and Direct.

FYI, both journal and filestore are stored on the same disks; thus sd*1 is the journal and sd*2 is the filestore.

To give you a bit more insight into the behaviour, here's what I see when I do a "rados put":

Device:  rrqm/s  wrqm/s  r/s   w/s     rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdc      0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdc1     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdc2     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sde      0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sde1     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sde2     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd      0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd1     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd2     0.00    0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdb      0.00    0.30    0.00  100.40  0.00   25.63  522.85    0.07      0.69   0.00     0.69     0.08   0.80
sdb1     0.00    0.00    0.00  48.10   0.00   12.83  546.08    0.03      0.67   0.00     0.67     0.09   0.44
sdb2     0.00    0.30    0.00  49.00   0.00   12.81  535.27    0.04      0.75   0.00     0.75     0.07   0.36

And here's what I see with a "rados bench" with a concurrency of 1, on a pool with only one copy:

Device:  rrqm/s  wrqm/s  r/s   w/s    rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdc      0.00    8.20    0.00  71.20  0.00   16.12  463.78    0.05      0.77   0.00     0.77     0.08   0.60
sdc1     0.00    0.00    0.00  30.20  0.00   8.02   543.63    0.02      0.81   0.00     0.81     0.11   0.32
sdc2     0.00    8.20    0.00  38.80  0.00   8.11   427.93    0.03      0.78   0.00     0.78     0.07   0.28
sde      0.00    1.20    0.00  57.70  0.00   14.42  511.94    0.06      0.96   0.00     0.96     0.14   0.80
sde1     0.00    0.00    0.00  27.20  0.00   7.21   543.24    0.02      0.81   0.00     0.81     0.10   0.28
sde2     0.00    1.20    0.00  28.50  0.00   7.21   518.01    0.03      1.16   0.00     1.16     0.18   0.52
sdd      0.00    1.50    0.00  78.40  0.00   19.24  502.50    0.08      1.08   0.00     1.08     0.12   0.92
sdd1     0.00    0.00    0.00  36.30  0.00   9.62   542.74    0.03      0.88   0.00     0.88     0.09   0.32
sdd2     0.00    1.50    0.00  39.40  0.00   9.62   499.86    0.05      1.33   0.00     1.33     0.15   0.60
sdb      0.00    2.10    0.00  80.70  0.00   20.04  508.49    0.06      0.69   0.00     0.69     0.07   0.60
sdb1     0.00    0.00    0.00  37.80  0.00   10.02  542.92    0.02      0.62   0.00     0.62     0.07   0.28
sdb2     0.00    2.10    0.00  40.10  0.00   10.02  511.54    0.03      0.80   0.00     0.80     0.08   0.32

This definitely looks like way too much to me… I also tried to reproduce the rados bench behaviour by looping the "rados put" command; then I got something like:

Device:  rrqm/s  wrqm/s  r/s   w/s     rMB/s  wMB/s   avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdc      0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdc1     0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdc2     0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sde      0.00    0.00    0.00  589.00  0.00   152.15  529.03    0.55      0.93   0.00     0.93     0.12   6.80
sde1     0.00    0.00    0.00  285.00  0.00   76.15   547.20    0.22      0.77   0.00     0.77     0.13   3.60
sde2     0.00    0.00    0.00  285.00  0.00   76.00   546.13    0.32      1.14   0.00     1.14     0.11   3.20
sdd      0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd1     0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdd2     0.00    0.00    0.00  0.00    0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00
sdb      0.00    0.00    0.
Re: [ceph-users] RADOS Bench strange behavior
On 07/09/2013 03:20 AM, Sebastien Han wrote:

Hi all,

While running some benchmarks with the internal rados benchmarker, I noticed something really strange. First of all, this is the line I used to run it:

$ sudo rados -p 07:59:54_performance bench 300 write -b 4194304 -t 1 --no-cleanup

So I want to test IO with a concurrency of 1. I had a look at the code and also straced the process, and I noticed that the IOs are sent one by one, sequentially. Thus it does what I expect from it.

However, while monitoring disk usage on all my OSDs, I found out that they were all loaded (writing, both journals and filestore), which is kind of weird since the IOs are sent one by one. I was expecting that only one OSD at a time would be writing. Obviously there is no replication going on, since I changed the rep size to 1.

$ ceph osd dump | grep "07:59:54_performance"
pool 323 '07:59:54_performance' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 1306 owner 0

Thanks in advance, guys.

Hi Sebastien,

Do your SAS controllers have writeback cache, perchance? If so, you may be seeing the OSDs immediately acknowledge journal writes before the data actually makes it to disk, allowing the client to send the next operation before the first is truly committed. Even without WB cache, you'll probably see a lot of concurrent journal writes and OSD writes, since the OSD writes are just happening in the background via buffered IO. There's also drive cache helping out here, unless you've explicitly disabled it. I suspect some combination of the above is what you are seeing.

Mark
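Mark's explanation (journal write acked straight away by the writeback cache, filestore write trailing behind via buffered IO) can be modeled in a few lines. Everything below is a simulation invented for illustration; osd_write, flusher and the counters are not Ceph APIs:

```python
import queue
import threading
import time

journal_writes = 0
filestore_writes = 0
flushq = queue.Queue()

def flusher():
    # Background thread: drains journaled ops into the filestore,
    # like buffered IO being flushed behind the ack.
    global filestore_writes
    while True:
        op = flushq.get()
        if op is None:
            break
        time.sleep(0.002)          # simulated filestore write
        filestore_writes += 1

def osd_write(data):
    # Journal the op and ack immediately (as a WB cache allows);
    # the filestore write completes later, in the background.
    global journal_writes
    time.sleep(0.001)              # simulated (cached) journal write
    journal_writes += 1
    flushq.put(data)               # the ack returns before this is flushed

t = threading.Thread(target=flusher)
t.start()

# A concurrency-1 client: each op waits for the *ack* before sending the
# next, yet journal and filestore IO still overlap on the disks.
for i in range(20):
    osd_write(b"x" * 4096)

flushq.put(None)
t.join()
assert journal_writes == filestore_writes == 20
```

The client never has more than one op outstanding, but while op N+1 is being journaled, op N is still being written to the filestore, so both partitions show traffic at once, which matches the iostat output above.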
[ceph-users] RADOS Bench strange behavior
Hi all,

While running some benchmarks with the internal rados benchmarker, I noticed something really strange. First of all, this is the line I used to run it:

$ sudo rados -p 07:59:54_performance bench 300 write -b 4194304 -t 1 --no-cleanup

So I want to test IO with a concurrency of 1. I had a look at the code and also straced the process, and I noticed that the IOs are sent one by one, sequentially. Thus it does what I expect from it.

However, while monitoring disk usage on all my OSDs, I found out that they were all loaded (writing, both journals and filestore), which is kind of weird since the IOs are sent one by one. I was expecting that only one OSD at a time would be writing. Obviously there is no replication going on, since I changed the rep size to 1.

$ ceph osd dump | grep "07:59:54_performance"
pool 323 '07:59:54_performance' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 1306 owner 0

Thanks in advance, guys.
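One more thing worth noting about the question above: even at concurrency 1, successive bench objects hash to different PGs (object_hash rjenkins, pg_num 2048 in the dump), and hence to different OSDs, so over the run every OSD gets writes. A toy illustration, using md5 as a stand-in for rjenkins, a direct PG-to-OSD modulo instead of CRUSH, and an invented object-name pattern:

```python
import hashlib

def osd_for(obj_name, num_osds=4, pg_num=2048):
    # Toy placement: hash the object name to a PG, then map PG -> OSD.
    # (md5 stands in for rjenkins; real placement goes through CRUSH.)
    h = int(hashlib.md5(obj_name.encode()).hexdigest(), 16)
    return (h % pg_num) % num_osds

# Sequentially written objects scatter across OSDs, so a concurrency-1
# run still lights up every disk over its lifetime.
hits = {osd_for("object%d" % i) for i in range(100)}
assert len(hits) > 1  # more than one OSD receives writes
```

This is why "all OSDs busy" by itself isn't proof that more than one IO is in flight; the placement function spreads the sequential stream across the whole cluster.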