Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Jason Dillaman
It's in the build and has tests to verify that it is properly being
triggered [1].

$ git tag --contains 5498377205523052476ed81aebb2c2e6973f67ef
v10.2.3

What are your tests that say otherwise?

[1] 
https://github.com/ceph/ceph/pull/10797/commits/5498377205523052476ed81aebb2c2e6973f67ef

On Fri, Oct 21, 2016 at 7:42 AM, Pavan Rallabhandi wrote:
> I see the fix for the writeback cache not getting turned on after a flush has
> made it into Jewel 10.2.3 (http://tracker.ceph.com/issues/17080), but our
> testing says otherwise.
>
> The cache still behaves as if it were in writethrough mode, even though the
> setting is set to true. I wanted to check whether it is still broken in Jewel
> 10.2.3, or whether I am missing something here.
>
> Thanks,
> -Pavan.
>



-- 
Jason


Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
From my VMs that have Cinder-provisioned volumes, I tried dd / fio (as below)
and found the IOPS to be low; even a sync before the runs didn't help. The same
runs with the option set to false yield better results.
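
For clarity, the client-side option I am toggling lives in our ceph.conf; a
minimal [client] sketch (the values shown are simply how I believe our setup
is configured):

[client]
rbd cache = true
# setting the next option to false is the "false" case mentioned above
rbd cache writethrough until flush = true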

Both the clients and the cluster are running 10.2.3; perhaps the only
difference is that the clients are on Trusty and the cluster is on Xenial.

dd if=/dev/zero of=/dev/vdd bs=4K count=1000 oflag=direct

fio -name iops -rw=write -bs=4k -direct=1 -runtime=60 -iodepth 1 -filename /dev/vde -ioengine=libaio
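
In case it is easier to tweak, the same fio run expressed as a job file
(equivalent to the command line above; the device name is unchanged):

[iops]
rw=write
bs=4k
direct=1
runtime=60
iodepth=1
ioengine=libaio
filename=/dev/vde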

Thanks,
-Pavan.

On 10/21/16, 6:15 PM, "Jason Dillaman" wrote:

It's in the build and has tests to verify that it is properly being
triggered [1].

$ git tag --contains 5498377205523052476ed81aebb2c2e6973f67ef
v10.2.3

What are your tests that say otherwise?

[1] 
https://github.com/ceph/ceph/pull/10797/commits/5498377205523052476ed81aebb2c2e6973f67ef





Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
And to add, the host running the Cinder services is on Hammer 0.94.9, but the
rest of them, like the Nova hosts, are on Jewel 10.2.3.
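
For what it's worth, a quick way to double-check which librbd each host
actually has installed (this assumes the Ubuntu packaging on Trusty/Xenial):

$ dpkg -l librbd1 | grep ^ii
$ ceph --version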

FWIW, the rbd info for one such image looks like this:

rbd image 'volume-f6ec45e2-b644-4b58-b6b5-b3a418c3c5b2':
        size 2048 MB in 512 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.5ebf12d1934e
        format: 2
        features: layering, striping
        flags:
        stripe unit: 4096 kB
        stripe count: 1

Thanks!

On 10/21/16, 7:26 PM, "ceph-users on behalf of Pavan Rallabhandi" wrote:

Both the clients and the cluster are running 10.2.3, perhaps the only 
difference is that the clients are on Trusty and the cluster is Xenial.





Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Jason Dillaman
I just tested from the v10.2.3 git tag on my local machine and
averaged 2912.54 4K writes/second with
"rbd_cache_writethrough_until_flush = false" and 3035.09 4K
writes/second with "rbd_cache_writethrough_until_flush = true"
(queue depth of 1 in both cases). I used new images between each run
to ensure there wasn't any warm data.
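
For reference, a sketch of one way to reproduce a comparable 4K, queue-depth-1
write test directly against librbd, taking QEMU out of the picture; the pool
and image names are placeholders and it assumes an fio build with the rbd
engine:

$ rbd create --size 2048 rbd/cache-test
$ fio --name=qd1-write --ioengine=rbd --clientname=admin --pool=rbd \
      --rbdname=cache-test --rw=write --bs=4k --iodepth=1 \
      --runtime=60 --time_based

Recreate the image between runs so there is no warm data in the cache.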

What is the IOPS delta percentage between your two cases? What is your
QEMU cache setting for the rbd drive?

On Fri, Oct 21, 2016 at 9:56 AM, Pavan Rallabhandi wrote:
> From my VMs that have Cinder-provisioned volumes, I tried dd / fio (as
> below) and found the IOPS to be low; even a sync before the runs didn't
> help. The same runs with the option set to false yield better results.
>
> Both the clients and the cluster are running 10.2.3, perhaps the only 
> difference is that the clients are on Trusty and the cluster is Xenial.
>
> dd if=/dev/zero of=/dev/vdd bs=4K count=1000 oflag=direct
>
> fio -name iops -rw=write -bs=4k -direct=1  -runtime=60 -iodepth 1 -filename 
> /dev/vde -ioengine=libaio
>
> Thanks,
> -Pavan.



-- 
Jason


Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
Thanks for verifying at your end Jason.

It's pretty weird that the difference is more than ~10x: with
"rbd_cache_writethrough_until_flush = true" I see ~400 IOPS, versus ~6000 IOPS
with "rbd_cache_writethrough_until_flush = false".

The QEMU cache is set to none for all of the rbd drives. On that note, would
older librbd versions (like Hammer) have any caching issues when talking to
Jewel clusters?
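
For context, the cache mode I am referring to is the cache attribute on the
disk's driver element in the libvirt domain XML; the excerpt below is only
illustrative, and the pool name and monitor host are assumptions on my part:

<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source protocol='rbd' name='volumes/volume-f6ec45e2-b644-4b58-b6b5-b3a418c3c5b2'>
    <host name='mon1.example.com' port='6789'/>
  </source>
  <target dev='vdd' bus='virtio'/>
</disk>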

Thanks,
-Pavan.

On 10/21/16, 8:17 PM, "Jason Dillaman" wrote:

QEMU cache setting for the rbd drive?





Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Jason Dillaman
On Fri, Oct 21, 2016 at 1:15 PM, Pavan Rallabhandi wrote:
> The QEMU cache is none for all of the rbd drives

Hmm -- if you have QEMU cache disabled, I would expect it to disable
the librbd cache.

I have to ask, but did you (re)start/live-migrate these VMs you are
testing against after you upgraded to librbd v10.2.3?
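
If you want to see what the running guest's librbd actually has in effect
(rather than what ceph.conf says), a client admin socket will tell you; a
sketch, assuming you can add the following under [client] on the compute node
and then restart the guest (the socket filename below is illustrative):

[client]
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok

$ ceph --admin-daemon /var/run/ceph/ceph-client.cinder.3412.94839516749824.asok \
      config get rbd_cache_writethrough_until_flush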

-- 
Jason


Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
The VM I am testing against was created after the librbd upgrade.

I have always had confusion around this bit in the docs
(http://docs.ceph.com/docs/jewel/rbd/qemu-rbd/#qemu-cache-options):

“QEMU’s cache settings override Ceph’s default settings (i.e., settings that 
are not explicitly set in the Ceph configuration file). If you explicitly set 
RBD Cache settings in your Ceph configuration file, your Ceph settings override 
the QEMU cache settings. If you set cache settings on the QEMU command line, 
the QEMU command line settings override the Ceph configuration file settings.”
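
For what it's worth, by "QEMU command line" settings I understand something
like the cache= flag on the drive definition, e.g. (the pool and user ID here
are placeholders):

qemu-system-x86_64 ... \
    -drive format=raw,file=rbd:volumes/volume-f6ec45e2-b644-4b58-b6b5-b3a418c3c5b2:id=cinder,cache=writeback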

Thanks,
-Pavan.

On 10/21/16, 11:31 PM, "Jason Dillaman" wrote:

On Fri, Oct 21, 2016 at 1:15 PM, Pavan Rallabhandi wrote:
> The QEMU cache is none for all of the rbd drives

Hmm -- if you have QEMU cache disabled, I would expect it to disable
the librbd cache.

I have to ask, but did you (re)start/live-migrate these VMs you are
testing against after you upgraded to librbd v10.2.3?

-- 
Jason





Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Jason Dillaman
Thanks for pointing that out; it is incorrect for (semi-)modern QEMUs.
All configuration starts at the Ceph defaults, is overwritten by your
ceph.conf, and is then further overwritten by any QEMU-specific
override. I would recommend retesting with "cache=writeback" to see if
that helps.

On Fri, Oct 21, 2016 at 2:10 PM, Pavan Rallabhandi wrote:
> The VM I am testing against was created after the librbd upgrade.
>
> I have always had confusion around this bit in the docs
> (http://docs.ceph.com/docs/jewel/rbd/qemu-rbd/#qemu-cache-options):
>
> “QEMU’s cache settings override Ceph’s default settings (i.e., settings that 
> are not explicitly set in the Ceph configuration file). If you explicitly set 
> RBD Cache settings in your Ceph configuration file, your Ceph settings 
> override the QEMU cache settings. If you set cache settings on the QEMU 
> command line, the QEMU command line settings override the Ceph configuration 
> file settings.”
>
> Thanks,
> -Pavan.



-- 
Jason