Re: [ceph-users] RBD caching on 4K reads???
It seems you are using the kernel rbd, so rbd_cache has no effect; that option only applies to librbd. The kernel rbd driver goes directly through the system page cache. You said you have already run something like echo 3 > /proc/sys/vm/drop_caches to invalidate all pages cached in the kernel. Are you testing /dev/rbd1 through a filesystem such as ext4 or xfs? If so, and you run a tool like fio first with a write test and file_size = 10G, the 10G file fio creates can contain lots of holes. Your read test may then read those holes, and the filesystem knows they contain nothing, so it never needs to touch the physical disk to return data. You can check the fiemap of the file to see whether it contains holes, or simply remove the file and recreate it fully written before the read test.

Ning Yao

2015-01-31 4:51 GMT+08:00 Bruce McFarland :
> I have a cluster and have created an rbd device - /dev/rbd1. It shows up as expected with 'rbd --image test info' and rbd showmapped. I have been looking at cluster performance with the usual Linux block device tools - fio and vdbench. When I look at writes and large-block sequential reads I'm seeing what I'd expect, with performance limited by either my cluster interconnect bandwidth or the backend device throughput - 1 GbE frontend and cluster network, and 7200 rpm SATA OSDs with one SSD per OSD for the journal. Everything looks good EXCEPT 4K random reads. There is caching occurring somewhere in my system that I haven't been able to detect and suppress - yet.
>
> I've set 'rbd_cache=false' in the [client] section of ceph.conf on the client, monitor, and storage nodes. I've flushed the system caches on the client and storage nodes before each test run, i.e. vm.drop_caches=3, and set huge pages to the maximum available to consume free system memory so that it can't be used for system cache. I've also disabled read-ahead on all of the HDD/OSDs.
>
> When I run a 4K random read workload on the client, the most I could expect would be ~100 IOPS/OSD x the number of OSDs - I'm seeing an order of magnitude more than that, AND running iostat on the storage nodes shows no read activity on the OSD disks.
>
> Any ideas on what I've overlooked? There appears to be some read-ahead caching that I've missed.
>
> Thanks,
> Bruce
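For the hole check Ning Yao suggests, something along these lines should work; this is only a sketch, and the file path is a placeholder for whatever data file the fio write job created:

  # show the extent map of the fio data file; ranges that were never written
  # (holes) have no extents, so a sparse 10G file maps far less than 10G
  filefrag -v /mnt/test/fio.data

  # or sidestep the problem: delete the file and rewrite it in full before reading
  rm /mnt/test/fio.data
  fio --name=prefill --filename=/mnt/test/fio.data --rw=write --bs=1M --size=10g --direct=1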
Re: [ceph-users] RBD caching on 4K reads???
Uh, that is strange. You said you have already cleared the caches on both the client and the OSD nodes, so the data must be coming straight from the disks. Let's wait for other people's ideas.
Re: [ceph-users] RBD caching on 4K reads???
Yes, I'm using the kernel rbd in Ubuntu 14.04, which makes calls into libceph:

root@essperf3:/etc/ceph# lsmod | grep rbd
rbd                    63707  1
libceph               225026  1 rbd
root@essperf3:/etc/ceph#

I'm doing raw device IO with either fio or vdbench (preferred tool) and there is no filesystem on top of /dev/rbd1. Yes, I did invalidate the kernel page cache by writing to drop_caches, and I've also allocated huge pages up to the maximum allowable based on free memory. The huge page allocation should minimize any system caching. I have a relatively small storage pool since this is a development environment - there is only ~4TB total and the rbd image is 3TB. On my lab system with 320TB I don't see this problem, since the data set is orders of magnitude larger than the available system cache.

Maybe I should try testing after removing DIMMs from the client system to physically limit what the kernel can cache.
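For reference, the client-side preparation Bruce describes (flush the page cache, then pin most of the free RAM in huge pages so it cannot be reused as cache) can be scripted roughly as below; the huge page count is purely illustrative and has to be sized to the client's free memory:

  # flush dirty data and drop the page/dentry/inode caches on the client
  sync
  echo 3 > /proc/sys/vm/drop_caches

  # reserve most of the free RAM as 2 MB huge pages, e.g. ~60 GB on a 64 GB client
  echo 30000 > /proc/sys/vm/nr_hugepages
  grep -i huge /proc/meminfo    # confirm HugePages_Total / HugePages_Free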
Re: [ceph-users] RBD caching on 4K reads???
rbd_cache is only applicable to librbd, not to the kernel rbd driver. Hopefully you are testing in a librbd-based environment; if not, the caching effect you are seeing is the filesystem (page) cache.

Thanks & Regards,
Somnath
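If the page cache is the suspect, one way to take it out of the picture without pulling DIMMs is to run the read test with direct I/O. A minimal fio sketch, assuming /dev/rbd1 is the mapped image and libaio is available:

  # 4K random reads against the raw rbd device; direct=1 bypasses the client
  # page cache, so the result reflects what the OSDs actually serve
  fio --name=randread-4k --filename=/dev/rbd1 --rw=randread --bs=4k \
      --direct=1 --ioengine=libaio --iodepth=32 --runtime=60 --time_based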
Re: [ceph-users] RBD caching on 4K reads???
I'm still missing something. I can check on the monitor and see that the running config on the cluster has rbd cache = false:

[root@essperf13 ceph]# ceph --admin-daemon /var/run/ceph/ceph-mon.essperf13.asok config show | grep rbd
    "debug_rbd": "0\/5",
    "rbd_cache": "false",

Since rbd caching is a client setting, I've added the following to the rbd client's /etc/ceph/ceph.conf:

[global]
log file = /var/log/ceph/rbd.log
rbd cache = false
rbd readahead max bytes = 0    # "should be" disabled if rbd cache = false, but I'm paranoid

[client]
admin socket = /var/run/ceph/rbd-$pid.asok

I never see an rbd*.asok file in /var/run/ceph. When I map the rbd image to the client device with "rbd map" while the /var/run/ceph directory doesn't exist, I see:

2015-02-02 14:40:30.254509 7f81888257c0 -1 asok(0x7f8189182390) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/rbd-1716.asok': (2) No such file or directory

Once I create /var/run/ceph these messages don't occur. So it appears the admin sockets are being created, but only for the duration of the command. I still see the effects of rbd caching when I run fio/vdbench with 4K random reads, but I have not been able to create a persistent rbd admin socket so that I can dump the running configuration and/or change it at run time.

Any ideas on what I've overlooked? Any pointers to documentation on the [client] section of ceph.conf or on rbd admin sockets? There's nothing at ceph.com/docs on either topic.

Thanks,
Bruce
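A plausible explanation for the transient sockets: with the kernel rbd driver there is no long-running librbd process on the client, so the only userspace Ceph clients are the short-lived CLI commands themselves, and each one creates and removes its own socket. While one of them (or any long-lived librbd client such as a qemu VM) is running, its socket can be queried; <pid> below is a placeholder for whatever PID appears:

  ls /var/run/ceph/rbd-*.asok
  ceph --admin-daemon /var/run/ceph/rbd-<pid>.asok config get rbd_cache
  ceph --admin-daemon /var/run/ceph/rbd-<pid>.asok config show | grep rbd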
Re: [ceph-users] RBD caching on 4K reads???
On Fri, Jan 30, 2015 at 10:09:32PM +0100, Udo Lembke wrote:
> Hi Bruce,
> you can also look on the mon, like
> ceph --admin-daemon /var/run/ceph/ceph-mon.b.asok config show | grep cache

rbd cache is a client setting, so you have to check it by connecting to the client admin socket. Its location is defined in ceph.conf, in the [client] section, via the "admin socket" parameter.

--
Mykola Golub
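For reference, a common way to give clients a predictable admin socket path is something like the following in the client's ceph.conf; the exact pattern is illustrative, built from the standard $cluster, $type, $id, $pid and $cctid metavariables:

  [client]
      admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok

With that in place, a running librbd client exposes a socket that can be checked with ceph --admin-daemon <socket> config show | grep rbd_cache; note this applies to librbd consumers, not to the in-kernel rbd driver.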
Re: [ceph-users] RBD caching on 4K reads???
Hi Bruce,
you can also look on the mon, like

ceph --admin-daemon /var/run/ceph/ceph-mon.b.asok config show | grep cache

(I guess you have a number instead of the .b.)

Udo
Re: [ceph-users] RBD caching on 4K reads???
The ceph daemon isn't running on the client with the rbd device, so I can't verify whether it's disabled at the librbd level on the client. If you mean on the storage nodes, I've had some issues dumping the config. Does the rbd caching occur on the storage nodes, the client, or both?
Re: [ceph-users] RBD caching on 4K reads???
Hi Bruce,
hmm, that sounds to me like the rbd cache. Can you check whether the cache is really disabled in the running config with

ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep cache

Udo