Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-12 Thread ulembke

Hi,
when you write from a client, the data is written to one (or more) placement
groups in 4MB chunks. These PGs are written to the journal and to the OSD
disk, and because of that the data also ends up in the Linux file buffer
(page cache) on the osd-node, until the OS needs that memory for something
else (file buffer or anything else).


If you then read the data from the client again, the osd-node serves it from
the file buffer instead of reading it again from the slow disks. This is the
reason why lots of RAM in the osd-nodes speeds up Ceph ;-)

Normally that is nice, but it makes benchmarking difficult.
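
If you want to take the osd-node page cache out of the game, one option
(only a sketch - it assumes root access on the osd-nodes and that flushing
their caches between runs is acceptable) is to run this on every osd-node
before each read benchmark:

  sync
  echo 3 > /proc/sys/vm/drop_caches   # drops page cache, dentries and inodes

Then the first read has to come from the disks again.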

Udo

On 2016-12-12 05:51, V Plus wrote:

Hi Udo,
I am not sure I understood what you said.
Did you mean that the data written by the 'dd' command also got cached on
the osd node, or something else?





Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread V Plus
Hi Udo,
I am not sure I understood what you said.
Did you mean that the data written by the 'dd' command also got cached on
the osd node, or something else?


On Sun, Dec 11, 2016 at 10:46 PM, Udo Lembke <ulem...@polarzone.de> wrote:

> Hi,
> but I assume you are also measuring cache effects in this scenario - the
> osd-nodes have cached the writes in the file buffer
> (which is why the latency should be very small).
>
> Udo
>


Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread Udo Lembke
Hi,
but I assume you are also measuring cache effects in this scenario - the
osd-nodes have cached the writes in the file buffer
(which is why the latency should be very small).

Udo

On 12.12.2016 03:00, V Plus wrote:
> Thanks Somnath!
> As you recommended, I executed:
> dd if=/dev/zero bs=1M count=4096 of=/dev/rbd0
> dd if=/dev/zero bs=1M count=4096 of=/dev/rbd1
>
> Then the output results look more reasonable!
> Could you tell me why??
>
> Btw, the purpose of my run is to test the performance of rbd in ceph.
> Does my case mean that before every test, I have to "initialize" all
> the images???
>
> Great thanks!!
>


Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread Somnath Roy
I generally do a 1M sequential write to fill up the device. The block size
doesn't matter here, but a bigger block size fills the device faster, which
is why people use it.
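
A preconditioning job could look roughly like this (just a sketch - adjust
filename to your device; there is no time_based/runtime, so it writes the
whole device exactly once):

  [precondition]
  direct=1
  rw=write
  bs=1M
  numjobs=1
  size=100%
  filename=/dev/rbd0

Run it with something like "sudo fio precondition.job" (the job file name is
just an example).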

From: V Plus [mailto:v.plussh...@gmail.com]
Sent: Sunday, December 11, 2016 7:03 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph performance is too good (impossible..)...

Thanks!

One more question: what do you mean by "bigger"?
Do you mean a bigger block size (say, if I run the read test with bs=4K, do
I first need to write the rbd with bs>4K?), or a size that is big enough to
cover the area where the test will be executed?




Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread V Plus
Thanks!

One more question: what do you mean by "bigger"?
Do you mean a bigger block size (say, if I run the read test with bs=4K, do
I first need to write the rbd with bs>4K?), or a size that is big enough
to cover the area where the test will be executed?


On Sun, Dec 11, 2016 at 9:54 PM, Somnath Roy <somnath@sandisk.com>
wrote:

> A block needs to be written before it is read, otherwise you will get funny
> results. For example, in the case of flash (depending on how the FW is
> implemented), it will mostly return zeros for a block that has not been
> written. Now, I have seen some flash FW that is really inefficient at
> manufacturing this data (say, zeros) for unwritten blocks, and some that is
> really fast.
>
> So, to get predictable results you should always be reading blocks that
> have been written. If, say, only half of a device has been written and you
> are doing full-device random reads, you will get
> unpredictable/spiky/imbalanced results.
>
> The same goes for rbd: consider it a storage device and the behavior will
> be similar. So, it is always recommended to precondition (fill up) an rbd
> image with a big-block sequential write before you run any synthetic test
> on it. Now, for a filestore backend, the added advantage of preconditioning
> the rbd is that the files in the filesystem will be created beforehand.
>
>
>
> Thanks & Regards
>
> Somnath


Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread Somnath Roy
A block needs to be written before it is read, otherwise you will get funny
results. For example, in the case of flash (depending on how the FW is
implemented), it will mostly return zeros for a block that has not been
written. Now, I have seen some flash FW that is really inefficient at
manufacturing this data (say, zeros) for unwritten blocks, and some that is
really fast.
So, to get predictable results you should always be reading blocks that have
been written. If, say, only half of a device has been written and you are
doing full-device random reads, you will get unpredictable/spiky/imbalanced
results.
The same goes for rbd: consider it a storage device and the behavior will be
similar. So, it is always recommended to precondition (fill up) an rbd image
with a big-block sequential write before you run any synthetic test on it.
Now, for a filestore backend, the added advantage of preconditioning the rbd
is that the files in the filesystem will be created beforehand.
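
If you fill with dd, make sure the write covers the whole image and not just
the first few GB, and consider bypassing the client page cache. A rough
sketch (the device name is just an example; blockdev and dd's oflag=direct
are standard Linux tools/flags):

  SIZE=$(blockdev --getsize64 /dev/rbd0)   # mapped image size in bytes
  dd if=/dev/zero of=/dev/rbd0 bs=1M count=$((SIZE / 1048576)) oflag=direct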

Thanks & Regards
Somnath

From: V Plus [mailto:v.plussh...@gmail.com]
Sent: Sunday, December 11, 2016 6:01 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph performance is too good (impossible..)...

Thanks Somnath!
As you recommended, I executed:
dd if=/dev/zero bs=1M count=4096 of=/dev/rbd0
dd if=/dev/zero bs=1M count=4096 of=/dev/rbd1

Then the output results look more reasonable!
Could you tell me why??

Btw, the purpose of my run is to test the performance of rbd in ceph. Does my 
case mean that before every test, I have to "initialize" all the images???

Great thanks!!



Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread V Plus
Thanks.
Then how can we avoid this if I want to test the Ceph rbd performance?

BTW, it seems that is not the case here.
I followed what Somnath said and got reasonable results.
But I am still confused.

On Sun, Dec 11, 2016 at 8:59 PM, JiaJia Zhong <zhongjia...@haomaiyi.com>
wrote:

> >> 3. After the test, in a.txt, we got *bw=1162.7MB/s*, in b.txt, we get
> >> *bw=3579.6MB/s*.
>
> That is mostly due to the kernel buffer (page cache) on your client host.
>
>


Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread V Plus
Thanks Somnath!
As you recommended, I executed:
dd if=/dev/zero bs=1M count=4096 of=/dev/rbd0
dd if=/dev/zero bs=1M count=4096 of=/dev/rbd1

Then the output results look more reasonable!
Could you tell me why??

Btw, the purpose of my run is to test the performance of rbd in ceph. Does
my case mean that before every test, I have to "initialize" all the
images???

Great thanks!!

On Sun, Dec 11, 2016 at 8:47 PM, Somnath Roy <somnath@sandisk.com>
wrote:

> Fill up the image with a big write (say 1M) first before reading, and you
> should see sane throughput.
>
>
>
> Thanks & Regards
>
> Somnath
>


Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread JiaJia Zhong
>> 3. After the test, in a.txt, we got bw=1162.7MB/s, in b.txt, we get
>> bw=3579.6MB/s.

That is mostly due to the kernel buffer (page cache) on your client host.
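
One way to check this during the run is to watch the NIC on host A (a sketch
only - it assumes the sysstat package is installed and that the 10G
interface is eth0):

  sar -n DEV 1

If rxkB/s on eth0 stays far below the bandwidth fio reports, the reads are
being served from a cache somewhere instead of coming over the network.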



 
 
-- Original --
From: "Somnath Roy" <somnath@sandisk.com>
Date: Mon, Dec 12, 2016 09:47 AM
To: "V Plus" <v.plussh...@gmail.com>; "CEPH list" <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] Ceph performance is too good (impossible..)...

Fill up the image with a big write (say 1M) first before reading, and you
should see sane throughput.

Thanks & Regards
Somnath
 


Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread Somnath Roy
Fill up the image with a big write (say 1M) first before reading, and you
should see sane throughput.
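
Another way to fill (and exercise) the image is rbd's built-in write
benchmark. A sketch only - the pool/image name is hypothetical and option
names can differ between Ceph releases:

  rbd bench-write rbd/testimg --io-size 1M --io-pattern seq

By default it only writes a limited total; --io-total can be raised so that
the whole image gets covered.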

Thanks & Regards
Somnath


[ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread V Plus
Hi Guys,
we have a ceph cluster with 6 machines (6 OSDs per host).
1. I created 2 images in Ceph and mapped them to another host A (*outside *the
Ceph cluster). On host A, I got */dev/rbd0* and */dev/rbd1*.
2. I start two fio jobs to perform a READ test on rbd0 and rbd1 (the fio job
descriptions can be found below):
*"sudo fio fioA.job -output a.txt & sudo fio fioB.job -output b.txt & wait"*
3. After the test, in a.txt we got *bw=1162.7MB/s*, and in b.txt we got
*bw=3579.6MB/s*.
The results do NOT make sense, because there is only one NIC on host A and
its limit is 10 Gbps (1.25GB/s).

I suspect it is because of some cache setting.
But I am sure that in the file */etc/ceph/ceph.conf* on host A, I already added:
*[client]*
*rbd cache = false*

Could anyone give me a hint about what is missing, and why?
Thank you very much.

*fioA.job:*
*[A]*
*direct=1*
*group_reporting=1*
*unified_rw_reporting=1*
*size=100%*
*time_based=1*
*filename=/dev/rbd0*
*rw=read*
*bs=4MB*
*numjobs=16*
*ramp_time=10*
*runtime=20*

*fioB.job:*
*[B]*
*direct=1*
*group_reporting=1*
*unified_rw_reporting=1*
*size=100%*
*time_based=1*
*filename=/dev/rbd1*
*rw=read*
*bs=4MB*
*numjobs=16*
*ramp_time=10*
*runtime=20*

*Thanks...*
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com