Re: [ceph-users] RBD caching on 4K reads???

2015-02-02 Thread Somnath Roy
The rbd_cache setting only applies to librbd, not to the kernel rbd driver. I hope you
are testing in a librbd-based environment.
If not, the caching effect you are seeing is the filesystem (page) cache.
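
One quick way to confirm that is to repeat the read with O_DIRECT, which bypasses the
page cache entirely. A rough sketch (device path and block count are just examples,
about 1 GB of 4K reads):

dd if=/dev/rbd1 of=/dev/null bs=4k count=262144               # buffered: can be served from the page cache
dd if=/dev/rbd1 of=/dev/null bs=4k count=262144 iflag=direct  # O_DIRECT: has to come from the cluster

If the direct run drops back to spindle-level IOPS, the page cache was serving the first one.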

Thanks & Regards
Somnath

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Bruce 
McFarland
Sent: Monday, February 02, 2015 3:39 PM
To: Mykola Golub; Udo Lembke
Cc: ceph-us...@ceph.com; Prashanth Nednoor
Subject: Re: [ceph-users] RBD caching on 4K reads???

I'm still missing something. I can check on the monitor to see that the running 
config on the cluster has rbd cache = false

[root@essperf13 ceph]# ceph --admin-daemon 
/var/run/ceph/ceph-mon.essperf13.asok config show | grep rbd
  "debug_rbd": "0\/5",
  "rbd_cache": "false",

Since rbd caching is a client setting, I've added the following to /etc/ceph/ceph.conf 
on the rbd client:

[global]
log file = /var/log/ceph/rbd.log
rbd cache = false
rbd readahead max bytes = 0    # readahead should already be disabled when rbd cache = false, but I'm paranoid

 [client]
admin socket = /var/run/ceph/rbd-$pid.asok

I never see an rbd-*.asok file in /var/run/ceph. When I started the rbd driver on the 
client without the /var/run/ceph directory in place, I saw:
2015-02-02 14:40:30.254509 7f81888257c0 -1 asok(0x7f8189182390) 
AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to 
bind the UNIX domain socket to '/var/run/ceph/rbd-1716.asok': (2) No such file 
or directory

These errors appear when I attempt to map the rbd image to the client device with rbd 
map. Once I create /var/run/ceph the messages no longer occur, so it appears that the 
admin sockets are being created, but only for the duration of the command.

I still see the effects of rbd caching if I run fio/vdbench with 4K random reads, but I 
have not been able to create a persistent rbd admin socket so that I can dump the 
running configuration and/or change it at run time.

Any ideas on what I've overlooked?

Any pointers to documentation on the [client] section of ceph.conf or on rbd admin 
sockets? I couldn't find anything at ceph.com/docs on either topic.
Thanks,
Bruce

-Original Message-
From: Mykola Golub [mailto:to.my.troc...@gmail.com]
Sent: Sunday, February 01, 2015 1:24 PM
To: Udo Lembke
Cc: Bruce McFarland; ceph-us...@ceph.com; Prashanth Nednoor
Subject: Re: [ceph-users] RBD caching on 4K reads???

On Fri, Jan 30, 2015 at 10:09:32PM +0100, Udo Lembke wrote:
 Hi Bruce,
 you can also look on the mon, like
 ceph --admin-daemon /var/run/ceph/ceph-mon.b.asok config show | grep
 cache

rbd cache is a client setting, so you have to check it by connecting to the 
client admin socket. Its location is defined in ceph.conf, [client] section, 
admin socket parameter.

--
Mykola Golub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD caching on 4K reads???

2015-02-02 Thread Bruce McFarland
I'm still missing something. I can check on the monitor to see that the running 
config on the cluster has rbd cache = false

[root@essperf13 ceph]# ceph --admin-daemon 
/var/run/ceph/ceph-mon.essperf13.asok config show | grep rbd
  "debug_rbd": "0\/5",
  "rbd_cache": "false",

Since rbd caching is a client setting, I've added the following to /etc/ceph/ceph.conf 
on the rbd client:

[global]
log file = /var/log/ceph/rbd.log
rbd cache = false
rbd readahead max bytes = 0    # readahead should already be disabled when rbd cache = false, but I'm paranoid

 [client]
admin socket = /var/run/ceph/rbd-$pid.asok

I never see an rbd-*.asok file in /var/run/ceph. When I started the rbd driver on the 
client without the /var/run/ceph directory in place, I saw:
2015-02-02 14:40:30.254509 7f81888257c0 -1 asok(0x7f8189182390) 
AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to 
bind the UNIX domain socket to '/var/run/ceph/rbd-1716.asok': (2) No such file 
or directory

These errors appear when I attempt to map the rbd image to the client device with rbd 
map. Once I create /var/run/ceph the messages no longer occur, so it appears that the 
admin sockets are being created, but only for the duration of the command.

I still see the effects of rbd caching if I run fio/vdbench with 4K random reads, but I 
have not been able to create a persistent rbd admin socket so that I can dump the 
running configuration and/or change it at run time.
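
If I understand it right, rbd map is serviced by the kernel client, so there is no 
long-lived librbd process to own a socket - the transient rbd-$pid.asok belongs to the 
short-lived rbd CLI itself. One thing I may try, assuming my fio build has librbd 
support (and guessing pool "rbd" / client "admin"), is a long-running job with the rbd 
engine so the socket stays up and can be queried, roughly:

[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=test
rw=randread
bs=4k
direct=1
runtime=300
time_based

[rbd-4k-randread]
iodepth=32

and then, while it runs:

ceph --admin-daemon /var/run/ceph/rbd-<pid>.asok config show | grep rbd_cache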

Any ideas on what I've overlooked?

Any pointers to documentation on the [client] section of ceph.conf or on rbd admin 
sockets? I couldn't find anything at ceph.com/docs on either topic.
Thanks,
Bruce

-Original Message-
From: Mykola Golub [mailto:to.my.troc...@gmail.com] 
Sent: Sunday, February 01, 2015 1:24 PM
To: Udo Lembke
Cc: Bruce McFarland; ceph-us...@ceph.com; Prashanth Nednoor
Subject: Re: [ceph-users] RBD caching on 4K reads???

On Fri, Jan 30, 2015 at 10:09:32PM +0100, Udo Lembke wrote:
 Hi Bruce,
 you can also look on the mon, like
 ceph --admin-daemon /var/run/ceph/ceph-mon.b.asok config show | grep 
 cache

rbd cache is a client setting, so you have to check it by connecting to the 
client admin socket. Its location is defined in ceph.conf, [client] section, 
admin socket parameter.

--
Mykola Golub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD caching on 4K reads???

2015-02-02 Thread Nicheal
Hmm, that is strange. You said you have already cleared the caches on both the
client and the OSD nodes, so the data must be coming directly from the disks. Let's
wait for others' ideas.
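
One more thing worth double-checking: watch both ends at the same time while the 4K
random read test runs, for example:

iostat -x 1    # on the client, watch rbd1; on the OSD nodes, watch the sd* data disks

If the client shows no reads on rbd1 and the OSD disks show none either, the requests
are being satisfied from the client's page cache before they ever reach the block layer.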

2015-02-03 11:44 GMT+08:00 Bruce McFarland bruce.mcfarl...@taec.toshiba.com:
 Yes, I'm using the kernel rbd in Ubuntu 14.04, which makes calls into libceph:

 root@essperf3:/etc/ceph# lsmod | grep rbd
 rbd                    63707  1
 libceph               225026  1 rbd
 root@essperf3:/etc/ceph#

 I'm doing raw device IO with either fio or vdbench (preferred tool) and there 
 is no filesystem on top of /dev/rbd1. Yes I did invalidate the kmem pages by 
 writing to the drop_caches and I've also allocated huge pages to be the max 
 allowable based on free memory. The huge page allocation should minimize any 
 system caches. I have a relatively small storage pool since this is a 
 development environment and there is only ~ 4TB total and the rbd image is 
 3TB. On my lab system with 320TB I don't see this problem since the data set 
 is orders of magnitude larger than available system cache.

 Maybe I should try testing after removing DIMMs from the client system to 
 physically limit kernel caching.

 -Original Message-
 From: Nicheal [mailto:zay11...@gmail.com]
 Sent: Monday, February 02, 2015 7:35 PM
 To: Bruce McFarland
 Cc: ceph-us...@ceph.com; Prashanth Nednoor
 Subject: Re: [ceph-users] RBD caching on 4K reads???

 It seems you are using the kernel rbd, so rbd_cache has no effect; that setting is only 
 used by librbd. The kernel rbd driver goes through the system page cache directly. You 
 said that you have already run something like echo 3 > /proc/sys/vm/drop_caches to 
 invalidate all pages cached in the kernel. Are you testing /dev/rbd1 through a 
 filesystem such as ext4 or xfs?
 If so, and you run a test tool like fio with a write test first and file_size = 10G, fio 
 creates a 10G file that can contain lots of holes. Your read test may then read those 
 holes, and the filesystem can tell they contain nothing, so there is no need to access 
 the physical disk to get the data. You can check the fiemap of the file to see whether 
 it contains holes, or just remove the file and recreate it before the read test.

 Ning Yao

 2015-01-31 4:51 GMT+08:00 Bruce McFarland bruce.mcfarl...@taec.toshiba.com:
 I have a cluster and have created a rbd device - /dev/rbd1. It shows
 up as expected with ‘rbd --image test info’ and rbd showmapped. I have
 been looking at cluster performance with the usual Linux block device
 tools – fio and vdbench. When I look at writes and large block
 sequential reads I’m seeing what I’d expect with performance limited
 by either my cluster interconnect bandwidth or the backend device
 throughput speeds – 1 GE frontend and cluster network and 7200rpm SATA
 OSDs with 1 SSD/osd for journal. Everything looks good EXCEPT 4K
 random reads. There is caching occurring somewhere in my system that I 
 haven’t been able to detect and suppress - yet.



 I’ve set ‘rbd_cache=false’ in the [client] section of ceph.conf on the
 client, monitor, and storage nodes. I’ve flushed the system caches on
 the client and storage nodes before test run ie vm.drop_caches=3 and
 set the huge pages to the maximum available to consume free system
 memory so that it can’t be used for system cache. I’ve also disabled
 read-ahead on all of the HDD/OSDs.



 When I run a 4k random read workload on the client the most I could
 expect would be ~100iops/osd x number of osd’s – I’m seeing an order
 of magnitude greater than that AND running IOSTAT on the storage nodes
 show no read activity on the OSD disks.



 Any ideas on what I’ve overlooked? There appears to be some read-ahead
 caching that I’ve missed.



 Thanks,

 Bruce


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD caching on 4K reads???

2015-02-02 Thread Nicheal
It seems you are using the kernel rbd, so rbd_cache has no effect; that setting is only
used by librbd. The kernel rbd driver goes through the system page cache directly. You
said that you have already run something like echo 3 > /proc/sys/vm/drop_caches to
invalidate all pages cached in the kernel. Are you testing /dev/rbd1 through a
filesystem such as ext4 or xfs?
If so, and you run a test tool like fio with a write test first and file_size = 10G, fio
creates a 10G file that can contain lots of holes. Your read test may then read those
holes, and the filesystem can tell they contain nothing, so there is no need to access
the physical disk to get the data. You can check the fiemap of the file to see whether
it contains holes, or just remove the file and recreate it before the read test.
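
For example, something along these lines shows the extent map (the path is just an
example for wherever fio put its test file); gaps in the logical offsets mean holes:

filefrag -v /mnt/test/fio.0.0

Or simply delete the file and write it out fully once before running the read test.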

Ning Yao

2015-01-31 4:51 GMT+08:00 Bruce McFarland bruce.mcfarl...@taec.toshiba.com:
 I have a cluster and have created a rbd device - /dev/rbd1. It shows up as
 expected with ‘rbd --image test info’ and rbd showmapped. I have been looking
 at cluster performance with the usual Linux block device tools – fio and
 vdbench. When I look at writes and large block sequential reads I’m seeing
 what I’d expect with performance limited by either my cluster interconnect
 bandwidth or the backend device throughput speeds – 1 GE frontend and
 cluster network and 7200rpm SATA OSDs with 1 SSD/osd for journal. Everything
 looks good EXCEPT 4K random reads. There is caching occurring somewhere in
 my system that I haven’t been able to detect and suppress - yet.



 I’ve set ‘rbd_cache=false’ in the [client] section of ceph.conf on the
 client, monitor, and storage nodes. I’ve flushed the system caches on the
 client and storage nodes before test run ie vm.drop_caches=3 and set the
 huge pages to the maximum available to consume free system memory so that it
 can’t be used for system cache. I’ve also disabled read-ahead on all of the
 HDD/OSDs.



 When I run a 4k random read workload on the client the most I could expect
 would be ~100iops/osd x number of osd’s – I’m seeing an order of magnitude
 greater than that AND running IOSTAT on the storage nodes show no read
 activity on the OSD disks.



 Any ideas on what I’ve overlooked? There appears to be some read-ahead
 caching that I’ve missed.



 Thanks,

 Bruce


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD caching on 4K reads???

2015-02-02 Thread Bruce McFarland
I'm using Ubuntu 14.04 and the kernel rbd, which makes calls into libceph:
root@essperf3:/etc/ceph# lsmod | grep rbd
rbd                    63707  1
libceph               225026  1 rbd
root@essperf3:/etc/ceph#

I'm doing raw device IO with either fio or vdbench (preferred tool) and there 
is no filesystem on top of /dev/rbd1. Yes I did invalidate the kmem pages by 
writing to the drop_caches and I've also allocated huge pages to be the max 
allowable based on free memory. The huge page allocation should minimize any 
system caches. I have a, relatively, small storage pool since this is a 
development environment and there is only ~ 4TB total and the rbd image is 3TB. 
On my lab system with 320TB I don't see this problem since the data set is 
orders of magnitude larger than available system cache. Maybe I'll remove DIMMs 
from the client system and physically disable kernel caching.
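
Before going that far, I may just force O_DIRECT on the raw device so the page cache is
out of the picture altogether; roughly something like this fio job (queue depth and
runtime are just what I'd start with):

[rbd1-4k-randread]
filename=/dev/rbd1
direct=1
rw=randread
bs=4k
ioengine=libaio
iodepth=32
runtime=300
time_based
norandommap

I believe vdbench has an equivalent open-flags setting for O_DIRECT as well.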

-Original Message-
From: Nicheal [mailto:zay11...@gmail.com] 
Sent: Monday, February 02, 2015 7:35 PM
To: Bruce McFarland
Cc: ceph-us...@ceph.com; Prashanth Nednoor
Subject: Re: [ceph-users] RBD caching on 4K reads???

It seems you are using the kernel rbd, so rbd_cache has no effect; that setting is only 
used by librbd. The kernel rbd driver goes through the system page cache directly. You 
said that you have already run something like echo 3 > /proc/sys/vm/drop_caches to 
invalidate all pages cached in the kernel. Are you testing /dev/rbd1 through a 
filesystem such as ext4 or xfs?
If so, and you run a test tool like fio with a write test first and file_size = 10G, fio 
creates a 10G file that can contain lots of holes. Your read test may then read those 
holes, and the filesystem can tell they contain nothing, so there is no need to access 
the physical disk to get the data. You can check the fiemap of the file to see whether 
it contains holes, or just remove the file and recreate it before the read test.

Ning Yao

2015-01-31 4:51 GMT+08:00 Bruce McFarland bruce.mcfarl...@taec.toshiba.com:
 I have a cluster and have created a rbd device - /dev/rbd1. It shows 
 up as expected with ‘rbd --image test info’ and rbd showmapped. I have 
 been looking at cluster performance with the usual Linux block device 
 tools – fio and vdbench. When I look at writes and large block 
 sequential reads I’m seeing what I’d expect with performance limited 
 by either my cluster interconnect bandwidth or the backend device 
 throughput speeds – 1 GE frontend and cluster network and 7200rpm SATA 
 OSDs with 1 SSD/osd for journal. Everything looks good EXCEPT 4K 
 random reads. There is caching occurring somewhere in my system that I 
 haven’t been able to detect and suppress - yet.



 I’ve set ‘rbd_cache=false’ in the [client] section of ceph.conf on the 
 client, monitor, and storage nodes. I’ve flushed the system caches on 
 the client and storage nodes before test run ie vm.drop_caches=3 and 
 set the huge pages to the maximum available to consume free system 
 memory so that it can’t be used for system cache. I’ve also disabled 
 read-ahead on all of the HDD/OSDs.



 When I run a 4k random read workload on the client the most I could 
 expect would be ~100iops/osd x number of osd’s – I’m seeing an order 
 of magnitude greater than that AND running IOSTAT on the storage nodes 
 show no read activity on the OSD disks.



 Any ideas on what I’ve overlooked? There appears to be some read-ahead 
 caching that I’ve missed.



 Thanks,

 Bruce


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD caching on 4K reads???

2015-02-02 Thread Bruce McFarland
Yes, I'm using the kernel rbd in Ubuntu 14.04, which makes calls into libceph:

root@essperf3:/etc/ceph# lsmod | grep rbd
rbd                    63707  1
libceph               225026  1 rbd
root@essperf3:/etc/ceph#

I'm doing raw device IO with either fio or vdbench (preferred tool) and there 
is no filesystem on top of /dev/rbd1. Yes I did invalidate the kmem pages by 
writing to the drop_caches and I've also allocated huge pages to be the max 
allowable based on free memory. The huge page allocation should minimize any 
system caches. I have a relatively small storage pool since this is a 
development environment and there is only ~ 4TB total and the rbd image is 3TB. 
On my lab system with 320TB I don't see this problem since the data set is 
orders of magnitude larger than available system cache. 

Maybe I should try testing after removing DIMMs from the client system to physically 
limit kernel caching.

-Original Message-
From: Nicheal [mailto:zay11...@gmail.com] 
Sent: Monday, February 02, 2015 7:35 PM
To: Bruce McFarland
Cc: ceph-us...@ceph.com; Prashanth Nednoor
Subject: Re: [ceph-users] RBD caching on 4K reads???

It seems you are using the kernel rbd, so rbd_cache has no effect; that setting is only 
used by librbd. The kernel rbd driver goes through the system page cache directly. You 
said that you have already run something like echo 3 > /proc/sys/vm/drop_caches to 
invalidate all pages cached in the kernel. Are you testing /dev/rbd1 through a 
filesystem such as ext4 or xfs?
If so, and you run a test tool like fio with a write test first and file_size = 10G, fio 
creates a 10G file that can contain lots of holes. Your read test may then read those 
holes, and the filesystem can tell they contain nothing, so there is no need to access 
the physical disk to get the data. You can check the fiemap of the file to see whether 
it contains holes, or just remove the file and recreate it before the read test.

Ning Yao

2015-01-31 4:51 GMT+08:00 Bruce McFarland bruce.mcfarl...@taec.toshiba.com:
 I have a cluster and have created a rbd device - /dev/rbd1. It shows 
 up as expected with ‘rbd --image test info’ and rbd showmapped. I have 
 been looking at cluster performance with the usual Linux block device 
 tools – fio and vdbench. When I look at writes and large block 
 sequential reads I’m seeing what I’d expect with performance limited 
 by either my cluster interconnect bandwidth or the backend device 
 throughput speeds – 1 GE frontend and cluster network and 7200rpm SATA 
 OSDs with 1 SSD/osd for journal. Everything looks good EXCEPT 4K 
 random reads. There is caching occurring somewhere in my system that I 
 haven’t been able to detect and suppress - yet.



 I’ve set ‘rbd_cache=false’ in the [client] section of ceph.conf on the 
 client, monitor, and storage nodes. I’ve flushed the system caches on 
 the client and storage nodes before test run ie vm.drop_caches=3 and 
 set the huge pages to the maximum available to consume free system 
 memory so that it can’t be used for system cache. I’ve also disabled 
 read-ahead on all of the HDD/OSDs.



 When I run a 4k random read workload on the client the most I could 
 expect would be ~100iops/osd x number of osd’s – I’m seeing an order 
 of magnitude greater than that AND running IOSTAT on the storage nodes 
 show no read activity on the OSD disks.



 Any ideas on what I’ve overlooked? There appears to be some read-ahead 
 caching that I’ve missed.



 Thanks,

 Bruce


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD caching on 4K reads???

2015-02-01 Thread Mykola Golub
On Fri, Jan 30, 2015 at 10:09:32PM +0100, Udo Lembke wrote:
 Hi Bruce,
 you can also look on the mon, like
 ceph --admin-daemon /var/run/ceph/ceph-mon.b.asok config show | grep cache

rbd cache is a client setting, so you have to check it by connecting to
the client admin socket. Its location is defined in ceph.conf,
[client] section, admin socket parameter.
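
For example, while a long-lived librbd client is up (a qemu VM, or an fio run using the
rbd engine), something along these lines should show the effective value (the socket
name depends on your admin socket setting):

ceph --admin-daemon /var/run/ceph/rbd-<pid>.asok config show | grep rbd_cache

Note that the kernel rbd client has no admin socket at all, since it is not a
librbd/librados process.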

-- 
Mykola Golub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD caching on 4K reads???

2015-01-30 Thread Udo Lembke
Hi Bruce,
hmm, that sounds to me like the rbd cache.
Can you check whether the cache is really disabled in the running config with

ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep cache

Udo

On 30.01.2015 21:51, Bruce McFarland wrote:

 I have a cluster and have created a rbd device - /dev/rbd1. It shows
 up as expected with ‘rbd --image test info’ and rbd showmapped. I have
 been looking at cluster performance with the usual Linux block device
 tools – fio and vdbench. When I look at writes and large block
 sequential reads I’m seeing what I’d expect with performance limited
 by either my cluster interconnect bandwidth or the backend device
 throughput speeds – 1 GE frontend and cluster network and 7200rpm SATA
 OSDs with 1 SSD/osd for journal. Everything looks good EXCEPT 4K
 random reads. There is caching occurring somewhere in my system that I
 haven’t been able to detect and suppress - yet.

  

 I’ve set ‘rbd_cache=false’ in the [client] section of ceph.conf on the
 client, monitor, and storage nodes. I’ve flushed the system caches on
 the client and storage nodes before test run ie vm.drop_caches=3 and
 set the huge pages to the maximum available to consume free system
 memory so that it can’t be used for system cache. I’ve also disabled
 read-ahead on all of the HDD/OSDs.

  

 When I run a 4k random read workload on the client the most I could
 expect would be ~100iops/osd x number of osd’s – I’m seeing an order
 of magnitude greater than that AND running IOSTAT on the storage nodes
 show no read activity on the OSD disks.

  

 Any ideas on what I’ve overlooked? There appears to be some read-ahead
 caching that I’ve missed.

  

 Thanks,

 Bruce



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RBD caching on 4K reads???

2015-01-30 Thread Bruce McFarland
I have a cluster and have created a rbd device - /dev/rbd1. It shows up as 
expected with 'rbd --image test info' and rbd showmapped. I have been looking at 
cluster performance with the usual Linux block device tools - fio and vdbench. 
When I look at writes and large block sequential reads I'm seeing what I'd 
expect with performance limited by either my cluster interconnect bandwidth or 
the backend device throughput speeds - 1 GE frontend and cluster network and 
7200rpm SATA OSDs with 1 SSD/osd for journal. Everything looks good EXCEPT 4K 
random reads. There is caching occurring somewhere in my system that I haven't 
been able to detect and suppress - yet.

I've set 'rbd_cache=false' in the [client] section of ceph.conf on the client, 
monitor, and storage nodes. I've flushed the system caches on the client and 
storage nodes before test run ie vm.drop_caches=3 and set the huge pages to the 
maximum available to consume free system memory so that it can't be used for 
system cache. I've also disabled read-ahead on all of the HDD/OSDs.
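
Concretely, the sort of commands I mean are along these lines (sdX stands for each OSD 
data disk, and N is a hugepage count sized to soak up most of the free RAM):

sync; sysctl -w vm.drop_caches=3               # on the client and every OSD node
echo 0 > /sys/block/sdX/queue/read_ahead_kb    # read-ahead off on each OSD data disk
sysctl -w vm.nr_hugepages=N                    # consume most of the free memory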

When I run a 4k random read workload on the client the most I could expect 
would be ~100iops/osd x number of osd's - I'm seeing an order of magnitude 
greater than that AND running IOSTAT on the storage nodes show no read activity 
on the OSD disks.

Any ideas on what I've overlooked? There appears to be some read-ahead caching 
that I've missed.

Thanks,
Bruce
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD caching on 4K reads???

2015-01-30 Thread Bruce McFarland
The ceph daemon isn't running on the client with the rbd device, so I can't verify 
whether it's disabled at the librbd level on the client. If you mean on the storage 
nodes, I've had some issues dumping the config. Does rbd caching occur on the storage 
nodes, the client, or both?


From: Udo Lembke [mailto:ulem...@polarzone.de]
Sent: Friday, January 30, 2015 1:00 PM
To: Bruce McFarland; ceph-us...@ceph.com
Cc: Prashanth Nednoor
Subject: Re: [ceph-users] RBD caching on 4K reads???

Hi Bruce,
hmm, that sounds to me like the rbd cache.
Can you check whether the cache is really disabled in the running config with

ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep cache

Udo
On 30.01.2015 21:51, Bruce McFarland wrote:
I have a cluster and have created a rbd device - /dev/rbd1. It shows up as 
expected with 'rbd --image test info' and rbd showmapped. I have been looking at 
cluster performance with the usual Linux block device tools - fio and vdbench. 
When I look at writes and large block sequential reads I'm seeing what I'd 
expect with performance limited by either my cluster interconnect bandwidth or 
the backend device throughput speeds - 1 GE frontend and cluster network and 
7200rpm SATA OSDs with 1 SSD/osd for journal. Everything looks good EXCEPT 4K 
random reads. There is caching occurring somewhere in my system that I haven't 
been able to detect and suppress - yet.

I've set 'rbd_cache=false' in the [client] section of ceph.conf on the client, 
monitor, and storage nodes. I've flushed the system caches on the client and 
storage nodes before test run ie vm.drop_caches=3 and set the huge pages to the 
maximum available to consume free system memory so that it can't be used for 
system cache. I've also disabled read-ahead on all of the HDD/OSDs.

When I run a 4k random read workload on the client the most I could expect 
would be ~100iops/osd x number of osd's - I'm seeing an order of magnitude 
greater than that AND running IOSTAT on the storage nodes show no read activity 
on the OSD disks.

Any ideas on what I've overlooked? There appears to be some read-ahead caching 
that I've missed.

Thanks,
Bruce




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD caching on 4K reads???

2015-01-30 Thread Udo Lembke
Hi Bruce,
you can also look on the mon, like
ceph --admin-daemon /var/run/ceph/ceph-mon.b.asok config show | grep cache

(I guess you have a number instead of the .b.)
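
To see which daemons expose a socket on that node, something like this works:

ls /var/run/ceph/*.asok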

Udo
On 30.01.2015 22:02, Bruce McFarland wrote:

 The ceph daemon isn’t running on the client with the rbd device so I
 can’t verify if it’s disabled at the librbd level on the client. If
 you mean on the storage nodes I’ve had some issues dumping the config.
 Does the rbd caching occur on the storage nodes, client, or both?

  

  

 From: Udo Lembke [mailto:ulem...@polarzone.de]
 Sent: Friday, January 30, 2015 1:00 PM
 To: Bruce McFarland; ceph-us...@ceph.com
 Cc: Prashanth Nednoor
 Subject: Re: [ceph-users] RBD caching on 4K reads???

  

 Hi Bruce,
 hmm, that sounds to me like the rbd cache.
 Can you check whether the cache is really disabled in the running config with

 ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep cache

 Udo

 On 30.01.2015 21:51, Bruce McFarland wrote:

 I have a cluster and have created a rbd device - /dev/rbd1. It
 shows up as expected with ‘rbd --image test info’ and rbd
 showmapped. I have been looking at cluster performance with the
 usual Linux block device tools – fio and vdbench. When I look at
 writes and large block sequential reads I’m seeing what I’d expect
 with performance limited by either my cluster interconnect
 bandwidth or the backend device throughput speeds – 1 GE frontend
 and cluster network and 7200rpm SATA OSDs with 1 SSD/osd for
 journal. Everything looks good EXCEPT 4K random reads. There is
 caching occurring somewhere in my system that I haven’t been able
 to detect and suppress - yet.

  

 I’ve set ‘rbd_cache=false’ in the [client] section of ceph.conf on
 the client, monitor, and storage nodes. I’ve flushed the system
 caches on the client and storage nodes before test run ie
 vm.drop_caches=3 and set the huge pages to the maximum available
 to consume free system memory so that it can’t be used for system
 cache. I’ve also disabled read-ahead on all of the HDD/OSDs.

  

 When I run a 4k random read workload on the client the most I
 could expect would be ~100iops/osd x number of osd’s – I’m seeing
 an order of magnitude greater than that AND running IOSTAT on the
 storage nodes show no read activity on the OSD disks.

  

 Any ideas on what I’ve overlooked? There appears to be some
 read-ahead caching that I’ve missed.

  

 Thanks,

 Bruce




 ___

 ceph-users mailing list

 ceph-users@lists.ceph.com

 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

  


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com