Re: [ceph-users] Fio rbd stalls during 4M reads
Yeah, looks like it. If I disable the rbd cache:

$ tail /etc/ceph/ceph.conf
...
[client]
rbd cache = false

then the 2-4M reads work fine (no invalid reads in valgrind either). I'll let the fio guys know.

Cheers

Mark

On 25/10/14 06:56, Gregory Farnum wrote:
> There's currently an issue in the master branch that makes rbd reads
> greater than the cache size hang (if the cache is on). This might be
> that. (Jason is working on it: http://tracker.ceph.com/issues/9854)
> -Greg
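For a quick before/after check, the same 4M read can also be driven as a one-shot fio command line equivalent to the job file at the end of this thread (pool, client, and image names are taken from that file, so adjust to your setup):

$ fio --name=env-read-4M --ioengine=rbd --clientname=admin \
      --pool=rbd --rbdname=rbd-fio-test \
      --bs=4M --rw=read --iodepth=32 --direct=1 --runtime=120

With rbd cache = false in the [client] section this should complete normally; with the cache on it should reproduce the stall described below.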
Re: [ceph-users] Fio rbd stalls during 4M reads
FWIW the specific fio read problem appears to have started after 0.86 and before commit 42bcabf.

Mark

On 10/24/2014 12:56 PM, Gregory Farnum wrote:
> There's currently an issue in the master branch that makes rbd reads
> greater than the cache size hang (if the cache is on). This might be
> that. (Jason is working on it: http://tracker.ceph.com/issues/9854)
> -Greg
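Narrowing a window like that is the textbook use case for git bisect. A rough sketch against the ceph tree, assuming a hypothetical wrapper script test-4m-read.sh that rebuilds, runs the 4M read job under a timeout (the failure mode is a hang, so the script has to kill fio itself), and exits non-zero on a stall:

$ cd ceph
$ git bisect start
$ git bisect bad 42bcabf      # known to exhibit the stall
$ git bisect good v0.86       # last known-good point
$ git bisect run ./test-4m-read.sh
$ git bisect reset            # restore HEAD when finished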
Re: [ceph-users] Fio rbd stalls during 4M reads
There's currently an issue in the master branch that makes rbd reads greater than the cache size hang (if the cache is on). This might be that. (Jason is working on it: http://tracker.ceph.com/issues/9854)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Thu, Oct 23, 2014 at 5:09 PM, Mark Kirkwood wrote:
> I'm doing some fio tests on Giant using the fio rbd driver to measure
> performance on a new ceph cluster.
>
> However with block sizes > 1M (initially noticed with 4M) I am seeing
> absolutely no IOPS for *reads* - and the fio process becomes
> non-interruptible (needs kill -9):
> [...]
> This appears to be a pure fio rbd driver issue, as I can attach the
> relevant rbd volume to a vm and dd from it using 4M blocks no problem.
>
> Any ideas?
>
> Cheers
>
> Mark
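If the trigger really is a single read larger than the cache, then until the fix for #9854 lands the client-side options would seem to be either turning the cache off (the workaround confirmed earlier in this thread) or sizing the cache above the largest read. A hedged sketch only - the second variant is untested against this bug, and the 32MB default for rbd cache size is from memory rather than verified here:

[client]
# confirmed workaround from this thread
rbd cache = false

# untested alternative: keep the cache but make it larger than the
# biggest read issued (value in bytes; the default is reportedly 32MB)
#rbd cache = true
#rbd cache size = 67108864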
Re: [ceph-users] Fio rbd stalls during 4M reads
On 24/10/14 13:09, Mark Kirkwood wrote:
> I'm doing some fio tests on Giant using the fio rbd driver to measure
> performance on a new ceph cluster.
> [...]
> This appears to be a pure fio rbd driver issue, as I can attach the
> relevant rbd volume to a vm and dd from it using 4M blocks no problem.

Running under valgrind shows up some invalid reads (I have raised a bug for fio):

$ valgrind fio read-test.fio
==12519== Memcheck, a memory error detector
==12519== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==12519== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==12519== Command: fio read-test.fio
==12519==
rbd_thread: (g=0): rw=read, bs=2M-2M/2M-2M/2M-2M, ioengine=rbd, iodepth=32
fio-2.1.13-88-gb2ee7
Starting 1 process
rbd engine: RBD version: 0.1.8
==12519== Thread 6:
==12519== Invalid read of size 8
==12519==    at 0x4EFA7B3: ObjectCacher::_readx(ObjectCacher::OSDRead*, ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1158)
==12519==    by 0x4E965A7: librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*, unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
==12519==    by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > > const&, char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
==12519==    by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3135)
==12519==    by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
==12519==    by 0x459D92: fio_rbd_queue (rbd.c:294)
==12519==    by 0x40D379: td_io_queue (ioengines.c:300)
==12519==    by 0x44B77E: thread_main (backend.c:781)
==12519==    by 0x81F6181: start_thread (pthread_create.c:312)
==12519==    by 0x870AFBC: clone (clone.S:111)
==12519==  Address 0x197b6fe0 is 48 bytes inside a block of size 264 free'd
==12519==    at 0x4C2C2BC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12519==    by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*, ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
==12519==    by 0x4E965A7: librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*, unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
==12519==    by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > > const&, char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
==12519==    by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3135)
==12519==    by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
==12519==    by 0x459D92: fio_rbd_queue (rbd.c:294)
==12519==    by 0x40D379: td_io_queue (ioengines.c:300)
==12519==    by 0x44B77E: thread_main (backend.c:781)
==12519==    by 0x81F6181: start_thread (pthread_create.c:312)
==12519==    by 0x870AFBC: clone (clone.S:111)
==12519==
==12519== Invalid read of size 8
==12519==    at 0x4EFA7CD: ObjectCacher::_readx(ObjectCacher::OSDRead*, ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.h:170)
==12519==    by 0x4E965A7: librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*, unsigned long, unsigned long, Context*) (ImageCtx.cc:484)
==12519==    by 0x4EAA9FA: librbd::aio_read(librbd::ImageCtx*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > > const&, char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3262)
==12519==    by 0x4EAB872: librbd::aio_read(librbd::ImageCtx*, unsigned long, unsigned long, char*, ceph::buffer::list*, librbd::AioCompletion*) (internal.cc:3135)
==12519==    by 0x4E8B737: rbd_aio_read (librbd.cc:1518)
==12519==    by 0x459D92: fio_rbd_queue (rbd.c:294)
==12519==    by 0x40D379: td_io_queue (ioengines.c:300)
==12519==    by 0x44B77E: thread_main (backend.c:781)
==12519==    by 0x81F6181: start_thread (pthread_create.c:312)
==12519==    by 0x870AFBC: clone (clone.S:111)
==12519==  Address 0x197b6fe8 is 56 bytes inside a block of size 264 free'd
==12519==    at 0x4C2C2BC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12519==    by 0x4EFA7AE: ObjectCacher::_readx(ObjectCacher::OSDRead*, ObjectCacher::ObjectSet*, Context*, bool) (ObjectCacher.cc:1149)
==12519==    by 0x4E965A7: librbd::ImageCtx::aio_read_from_cache(object_t, ceph::buffer::list*, unsigned long, unsigned long, Context*) (ImageCtx.cc:484
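For anyone repeating that run: memcheck is already valgrind's default tool, but a couple of extra flags make use-after-free reports like the above easier to work with. A sketch - the flag choices are mine, not from the original run:

# --num-callers lengthens the stack traces, --log-file keeps the report
# separate from fio's own output, and --error-exitcode makes the run
# usable from a script (e.g. for bisecting)
$ valgrind --tool=memcheck --num-callers=20 --error-exitcode=1 \
    --log-file=fio-valgrind.log fio read-test.fio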
[ceph-users] Fio rbd stalls during 4M reads
I'm doing some fio tests on Giant using the fio rbd driver to measure performance on a new ceph cluster.

However with block sizes > 1M (initially noticed with 4M) I am seeing absolutely no IOPS for *reads* - and the fio process becomes non-interruptible (needs kill -9):

$ ceph -v
ceph version 0.86-467-g317b83d (317b831a917f70838870b31931a79bdd4dd0)

$ fio --version
fio-2.1.11-20-g9a44

$ fio read-busted.fio
env-read-4M: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=rbd, iodepth=32
fio-2.1.11-20-g9a44
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [R(1)] [inf% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta 1158050441d:06h:58m:03s]

This appears to be a pure fio rbd driver issue, as I can attach the relevant rbd volume to a vm and dd from it using 4M blocks no problem.

Any ideas?

Cheers

Mark

The job file (read-busted.fio):

[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=rbd-fio-test
invalidate=0
iodepth=32
nrfiles=1
runtime=120
direct=1
sync=1
unlink=1
numjobs=1
thread=0
disk_util=0

[env-read-4M]
bs=4M
rw=read
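For completeness, the in-VM cross-check mentioned above would be along these lines. The guest device name /dev/vdb is an assumption - use whatever the attached rbd volume shows up as:

# inside the guest; iflag=direct bypasses the guest page cache so the
# 4M reads actually reach the rbd volume
$ dd if=/dev/vdb of=/dev/null bs=4M count=256 iflag=direct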