Re: issue 8747 / 9011

2014-09-19 Thread Dmitry Smirnov
Hi Sage,

On Fri, 19 Sep 2014 16:43:31 Sage Weil wrote:
> Are you still seeing this crash?
> 
>  osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubber.start || soid >=
> scrubber.end)

Thanks for following-up on this, Sage.
Yes, I've seen this crash just recently on 0.80.5. It usually happens during a 
long recovery, such as when an OSD is replaced. I've seen it happen after hours 
of backfilling/remapping, although it can take a long time to manifest.
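
If it would help to finally capture a log of it, I can bump OSD debugging the
next time a big backfill starts, along the lines of:

 ceph tell osd.* injectargs '--debug-osd 20 --debug-ms 1'

and drop it back down again once the assert fires.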

-- 
Cheers,
 Dmitry Smirnov
 GPG key : 4096R/53968D1B

---

However beautiful the strategy, you should occasionally look at the
results.
-- Winston Churchill




issue 8747 / 9011

2014-09-19 Thread Sage Weil
Hey Dmitry,

Are you still seeing this crash?

 osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubber.start || soid >= 
scrubber.end)

We haven't turned it up in our testing in the last two months, so we 
still have no log of it occurring.

Thanks!
sage


Re: Fwd: S3 API Compatibility support

2014-09-19 Thread Yehuda Sadeh
On Fri, Sep 19, 2014 at 8:32 AM, M Ranga Swami Reddy
 wrote:
> Hi Sage,
> Thanks for quick reply.
>
>>Ceph doesn't interact at all with AWS services like Glacier, if that's
>>what you mean.
>
> No. I meant Ceph interacting with Glacier- and RRS-type storage tiers
> alongside the currently used OSD (or standard) storage.
>>For RRS, though, I assume you mean the ability to create buckets with
>>reduced redundancy with radosgw?  That is supported, although not quite
>>the way AWS does it.  You can create different pools that back RGW
>>buckets, and each bucket is stored in one of those pools.  So you could
>>make one of them 2x instead of 3x, or use an erasure code of your choice.
>
> Yes, we can configure Ceph to use 2x replicas, which will look like
> reduced redundancy, but AWS uses a separate low-cost RRS storage class
> (instead of standard) for this purpose. I am checking if we can do
> something similar in Ceph too.

You can use multiple placement targets and can specify on bucket
creation which placement target to use. At this time we don't support
the exact S3 reduced redundancy fields, although it should be pretty
easy to add.
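
Roughly (sketching from memory here, so double check the field names against
your own region/zone dumps), the flow is:

 radosgw-admin region get > region.json   (add e.g. "reduced-placement" to placement_targets)
 radosgw-admin region set < region.json
 radosgw-admin zone get > zone.json       (map "reduced-placement" to its own data/index pools)
 radosgw-admin zone set < zone.json
 radosgw-admin regionmap update

Then either set default_placement for the user, or pass the placement target
at bucket creation time (the S3 LocationConstraint can carry it, though the
exact syntax depends on the release).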

>
>>What isn't currently supported is the ability to reduce the redundancy of
>>individual objects in a bucket.  I don't think there is anything
>>architecturally preventing that, but it is not implemented or supported.
>
> OK. Do we have the issue id for the above? Else, we can file one. Please 
> advise.

I think #8929 would cover it.

Yehuda

>
>>When we look at the S3 archival features in more detail (soon!) I'm sure
>>this will come up!  The current plan is to address object versioning
>>first.  That is, unless a developer surfaces who wants to start hacking on
>>this right away...
>
> Great to know this. Even we are keen with S3 support in Ceph and we
> are happy support you here.
>
> Thanks
> Swami
>
> On Fri, Sep 19, 2014 at 11:08 AM, Sage Weil  wrote:
>> On Fri, 19 Sep 2014, M Ranga Swami Reddy wrote:
>>> Hi Sage,
>>> Could you please advise, if Ceph support the low cost object
>>> storages(like Amazon Glacier or RRS) for archiving objects like log
>>> file etc.?
>>
>> Ceph doesn't interact at all with AWS services like Glacier, if that's
>> what you mean.
>>
>> For RRS, though, I assume you mean the ability to create buckets with
>> reduced redundancy with radosgw?  That is supported, although not quite
>> the way AWS does it.  You can create different pools that back RGW
>> buckets, and each bucket is stored in one of those pools.  So you could
>> make one of them 2x instead of 3x, or use an erasure code of your choice.
>>
>> What isn't currently supported is the ability to reduce the redundancy of
>> individual objects in a bucket.  I don't think there is anything
>> architecturally preventing that, but it is not implemented or supported.
>>
>> When we look at the S3 archival features in more detail (soon!) I'm sure
>> this will come up!  The current plan is to address object versioning
>> first.  That is, unless a developer surfaces who wants to start hacking on
>> this right away...
>>
>> sage
>>
>>
>>
>>>
>>> Thanks
>>> Swami
>>>
>>> On Thu, Sep 18, 2014 at 6:20 PM, M Ranga Swami Reddy
>>>  wrote:
>>> > Hi ,
>>> >
>>> > Could you please check and clarify the below question on object
>>> > lifecycle and notification S3 APIs support:
>>> >
>>> > 1. To support the bucket lifecycle - we need to support the
>>> > moving/deleting the objects/buckets based lifecycle settings.
>>> > For ex: If an object lifecyle set as below:
>>> >   1. Archive it after 10 days - means move this object to low
>>> > cost object storage after 10 days of the creation date.
>>> >2. Remove this object after 90days - mean remove this
>>> > object from the low cost object after 90days of creation date.
>>> >
>>> > Q1- Does the ceph support the above concept like moving to low cost
>>> > storage and delete from that storage?
>>> >
>>> > 2. To support the object notifications:
>>> >   - First there should be low cost and high availability storage
>>> > with single replica only. If an object created with this type of
>>> > object storage,
>>> > There could be chances that object could lose, so if an object
>>> > of this type of storage lost, set the notifications.
>>> >
>>> > Q2- Does Ceph support low cost and high availability storage type?
>>> >
>>> > Thanks
>>> >
>>> > On Fri, Sep 12, 2014 at 8:00 PM, M Ranga Swami Reddy
>>> >  wrote:
>>> >> Hi Yehuda,
>>> >>
>>> >> Could you please check and clarify the below question on object
>>> >> lifecycle and notification S3 APIs support:
>>> >>
>>> >> 1. To support the bucket lifecycle - we need to support the
>>> >> moving/deleting the objects/buckets based lifecycle settings.
>>> >> For ex: If an object lifecyle set as below:
>>> >>   1. Archive it after 10 days - means move this object to low
>>> >> cost object storage after 10 days of the creation date.
>>> >>2. Remove this object after 90days - 

Re: Fwd: S3 API Compatibility support

2014-09-19 Thread M Ranga Swami Reddy
>What do you mean by "RRS storage-low cost storage"?  My read of the RRS
>numbers is that they simply have a different tier of S3 that runs fewer
>replicas and (probably) cheaper disks.  In radosgw-land, this would just
>be a different rados pool with 2x replicas and (probably) a CRUSH rule
>mapping it to different hardware (with bigger and/or cheaper disks).

That's correct. If we could do this with a different rados pool using
2x replicas, along with a CRUSH rule
mapping it to different h/w (with bigger and cheaper disks), then it is the
same as RRS support in AWS.
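
For the record, a minimal sketch of that setup with placeholder names (it
assumes the cheaper disks sit under their own CRUSH root, here called
"cheap-disks"):

 ceph osd crush rule create-simple rrs-rule cheap-disks host
 ceph osd pool create rgw-rrs-data 256 256 replicated
 ceph osd pool set rgw-rrs-data size 2
 ceph osd pool set rgw-rrs-data crush_ruleset <id from 'ceph osd crush rule dump'>

The resulting pool can then be used as the data pool of an RGW placement
target.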


>> >What isn't currently supported is the ability to reduce the redundancy of
>> >individual objects in a bucket.  I don't think there is anything
>> >architecturally preventing that, but it is not implemented or supported.
>>
>> OK. Do we have the issue id for the above? Else, we can file one. Please 
>> advise.

>There is the main #4099 issue for object expiration, but there is no real
>detail there.  The plan is (as always) to have equivalent functionality to S3.

>Do you mind creating a new feature ticket that specifically references the
>ability to move objects to a second storage tier based on policy?  Any
>references to AWS docs about the API or functionality would be helpful in
>the ticket.


Sure, I will create a new feature ticket and add the needed information there.

Thanks
Swami

On Fri, Sep 19, 2014 at 9:08 PM, Sage Weil  wrote:
> On Fri, 19 Sep 2014, M Ranga Swami Reddy wrote:
>> Hi Sage,
>> Thanks for quick reply.
>>
>> >what you mean.
>> >For RRS, though, I assume you mean the ability to create buckets with
>> >reduced redundancy with radosgw?  That is supported, although not quite
>> >the way AWS does it.  You can create different pools that back RGW
>> >buckets, and each bucket is stored in one of those pools.  So you could
>> >make one of them 2x instead of 3x, or use an erasure code of your choice.
>>
>> Yes, we can confiure ceph to use 2x replicas, which will look like
>> reduced redundancy, but AWS uses a separate RRS storage-low cost
>> (instead of
>> standard) storage for this purpose. I am checking, if we could
>> similarly in ceph too.
>
> What do you mean by "RRS storage-low cost storage"?  My read of the RRS
> numbers is that they simply have a different tier of S3 that runs fewer
> replicas and (probably) cheaper disks.  In radosgw-land, this would just
> be a different rados pool with 2x replicas and (probably) a CRUSH rule
> mapping it to different hardware (with bigger and/or cheaper disks).
>
>> >What isn't currently supported is the ability to reduce the redundancy of
>> >individual objects in a bucket.  I don't think there is anything
>> >architecturally preventing that, but it is not implemented or supported.
>>
>> OK. Do we have the issue id for the above? Else, we can file one. Please 
>> advise.
>
> There is the main #4099 issue for object expiration, but there is no real
> detail there.  The plan is (as always) to have equivalent functionality to
> S3.
>
> Do you mind creating a new feature ticket that specifically references the
> ability to move objects to a second storage tier based on policy?  Any
> references to AWS docs about the API or functionality would be helpful in
> the ticket.
>
>> >When we look at the S3 archival features in more detail (soon!) I'm sure
>> >this will come up!  The current plan is to address object versioning
>> >first.  That is, unless a developer surfaces who wants to start hacking on
>> >this right away...
>>
>> Great to know this. Even we are keen with S3 support in Ceph and we
>> are happy support you here.
>
> Great to hear!
>
> Thanks-
> sage
>
>
>>
>> Thanks
>> Swami
>>
>> On Fri, Sep 19, 2014 at 11:08 AM, Sage Weil  wrote:
>> > On Fri, 19 Sep 2014, M Ranga Swami Reddy wrote:
>> >> Hi Sage,
>> >> Could you please advise, if Ceph support the low cost object
>> >> storages(like Amazon Glacier or RRS) for archiving objects like log
>> >> file etc.?
>> >
>> > Ceph doesn't interact at all with AWS services like Glacier, if that's
>> > what you mean.
>> >
>> > For RRS, though, I assume you mean the ability to create buckets with
>> > reduced redundancy with radosgw?  That is supported, although not quite
>> > the way AWS does it.  You can create different pools that back RGW
>> > buckets, and each bucket is stored in one of those pools.  So you could
>> > make one of them 2x instead of 3x, or use an erasure code of your choice.
>> >
>> > What isn't currently supported is the ability to reduce the redundancy of
>> > individual objects in a bucket.  I don't think there is anything
>> > architecturally preventing that, but it is not implemented or supported.
>> >
>> > When we look at the S3 archival features in more detail (soon!) I'm sure
>> > this will come up!  The current plan is to address object versioning
>> > first.  That is, unless a developer surfaces who wants to start hacking on
>> > this right away...
>> >
>> > sage
>> >
>> >
>> >
>> >>
>> >> Thanks
>> >> Swami
>> >>
>> >

Re: why ZFS on ceph is unstable?

2014-09-19 Thread Mark Nelson

On 09/19/2014 10:40 AM, Eric Eastman wrote:
>> Hi developers, it mentioned in the source code that
>> OPTION(filestore_zfs_snap, OPT_BOOL, false) // zfsonlinux is still
>> unstable. So if we turn on filestore_zfs_snap and neglect journal like btrf,
>> it will be unstable? As is mentioned on the "zfs on linux community",
>> It is stable enough to run a ZFS root filesystem on a GNU/Linux
>> installation for your workstation as something to play around with.
>> It is copy-on-write, supports compression, deduplication, file
>> atomicity, off-disk caching, (encryption not support), and much more.
>> So it seems that all features are supported except for
>> encryption. Thus, I am puzzled that the unstable, you mean, is
>> ZFS unstable itself. Or it now is already stable on linux, but still
>> unstable when used as ceph FileStore filesystem. If so,
>> what will happen if we use it, losing data or frequent crash?

> In testing I did last year, there were multiple issues with using ZFS
> for my OSD backend, that would lock up the ZFS file systems, and take
> the OSD down.
>
> Several of these have been fixed by the ZFS team. See:
>
> https://github.com/zfsonlinux/zfs/issues/1891
> https://github.com/zfsonlinux/zfs/issues/1961
> https://github.com/zfsonlinux/zfs/issues/2015
>
> The recommendation is to use xattr=sa, but looking at the current open
> issues for ZFS, there seems to still be issues with this option.  See:
>
> https://github.com/zfsonlinux/zfs/issues/2700
> https://github.com/zfsonlinux/zfs/issues/2717
> https://github.com/zfsonlinux/zfs/issues/2663
> and others

SA xattrs are pretty important from a performance perspective for Ceph 
on ZFS based on some testing I did a while back with Brian Behlendorf.

> Also per the recent ZFS posting on clusterhq, aio will not be supported
> until 0.6.4 so the following needs to be added to your ceph.conf file
>
>  filestore zfs_snap = 1
>  journal aio = 0
>  journal dio = 0
>
> My plans are to retest ZFS as an OSD backend once ZFS version 0.6.4 has
> been released.
>
> Please test ZFS with Ceph, and submit bugs, as this is how it will get
> stable enough to use in production.
>
> Eric


Re: why ZFS on ceph is unstable?

2014-09-19 Thread Eric Eastman
> Hi developers, it mentioned in the source code that
> OPTION(filestore_zfs_snap, OPT_BOOL, false) // zfsonlinux is still
> unstable. So if we turn on filestore_zfs_snap and neglect journal like btrf,
> it will be unstable? As is mentioned on the "zfs on linux community",
> It is stable enough to run a ZFS root filesystem on a GNU/Linux
> installation for your workstation as something to play around with.
> It is copy-on-write, supports compression, deduplication, file
> atomicity, off-disk caching, (encryption not support), and much more.
> So it seems that all features are supported except for
> encryption. Thus, I am puzzled that the unstable, you mean, is
> ZFS unstable itself. Or it now is already stable on linux, but still
> unstable when used as ceph FileStore filesystem. If so,
> what will happen if we use it, losing data or frequent crash?


In testing I did last year, there were multiple issues with using ZFS 
for my OSD backend, that would lock up the ZFS file systems, and take 
the OSD down.

Several of these have been fixed by the ZFS team. See:

https://github.com/zfsonlinux/zfs/issues/1891
https://github.com/zfsonlinux/zfs/issues/1961
https://github.com/zfsonlinux/zfs/issues/2015

The recommendation is to use xattr=sa, but looking at the current open 
issues for ZFS, there seems to still be issues with this option.  See:


https://github.com/zfsonlinux/zfs/issues/2700
https://github.com/zfsonlinux/zfs/issues/2717
https://github.com/zfsonlinux/zfs/issues/2663
and others
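
For anyone trying this, a minimal sketch of creating an OSD dataset with SA
xattrs enabled (zpool name, device and mount point are only examples):

 zpool create osdpool /dev/sdb
 zfs create -o xattr=sa -o mountpoint=/var/lib/ceph/osd/ceph-0 osdpool/osd-0

An existing dataset can be switched with 'zfs set xattr=sa', but only newly
written xattrs use the SA format.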

Also per the recent ZFS posting on clusterhq, aio will not be supported 
until 0.6.4 so the following needs to be added to your ceph.conf file

 filestore zfs_snap = 1
 journal aio = 0
 journal dio = 0

My plans are to retest ZFS as an OSD backend once ZFS version 0.6.4 has 
been released.


Please test ZFS with Ceph, and submit bugs, as this is how it will get 
stable enough to use in production.


Eric


Re: Fwd: S3 API Compatibility support

2014-09-19 Thread Sage Weil
On Fri, 19 Sep 2014, M Ranga Swami Reddy wrote:
> Hi Sage,
> Thanks for quick reply.
> 
> >what you mean.
> >For RRS, though, I assume you mean the ability to create buckets with
> >reduced redundancy with radosgw?  That is supported, although not quite
> >the way AWS does it.  You can create different pools that back RGW
> >buckets, and each bucket is stored in one of those pools.  So you could
> >make one of them 2x instead of 3x, or use an erasure code of your choice.
> 
> Yes, we can configure Ceph to use 2x replicas, which will look like
> reduced redundancy, but AWS uses a separate low-cost RRS storage class
> (instead of standard) for this purpose. I am checking if we can do
> something similar in Ceph too.

What do you mean by "RRS storage-low cost storage"?  My read of the RRS 
numbers is that they simply have a different tier of S3 that runs fewer 
replicas and (probably) cheaper disks.  In radosgw-land, this would just 
be a different rados pool with 2x replicas and (probably) a CRUSH rule 
mapping it to different hardware (with bigger and/or cheaper disks).
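
For the erasure-coded flavor, a rough sketch with placeholder names (firefly
syntax):

 ceph osd erasure-code-profile set rrs-profile k=4 m=2
 ceph osd pool create rgw-rrs-ec 256 256 erasure rrs-profile

A plain 2x replicated pool on cheaper hardware, selected via its own CRUSH
rule, works the same way from radosgw's point of view.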

> >What isn't currently supported is the ability to reduce the redundancy of
> >individual objects in a bucket.  I don't think there is anything
> >architecturally preventing that, but it is not implemented or supported.
> 
> OK. Do we have the issue id for the above? Else, we can file one. Please 
> advise.

There is the main #4099 issue for object expiration, but there is no real 
detail there.  The plan is (as always) to have equivalent functionality to 
S3.

Do you mind creating a new feature ticket that specifically references the 
ability to move objects to a second storage tier based on policy?  Any 
references to AWS docs about the API or functionality would be helpful in 
the ticket.

> >When we look at the S3 archival features in more detail (soon!) I'm sure
> >this will come up!  The current plan is to address object versioning
> >first.  That is, unless a developer surfaces who wants to start hacking on
> >this right away...
> 
> Great to know this. Even we are keen with S3 support in Ceph and we
> are happy support you here.

Great to hear!

Thanks-
sage


> 
> Thanks
> Swami
> 
> On Fri, Sep 19, 2014 at 11:08 AM, Sage Weil  wrote:
> > On Fri, 19 Sep 2014, M Ranga Swami Reddy wrote:
> >> Hi Sage,
> >> Could you please advise, if Ceph support the low cost object
> >> storages(like Amazon Glacier or RRS) for archiving objects like log
> >> file etc.?
> >
> > Ceph doesn't interact at all with AWS services like Glacier, if that's
> > what you mean.
> >
> > For RRS, though, I assume you mean the ability to create buckets with
> > reduced redundancy with radosgw?  That is supported, although not quite
> > the way AWS does it.  You can create different pools that back RGW
> > buckets, and each bucket is stored in one of those pools.  So you could
> > make one of them 2x instead of 3x, or use an erasure code of your choice.
> >
> > What isn't currently supported is the ability to reduce the redundancy of
> > individual objects in a bucket.  I don't think there is anything
> > architecturally preventing that, but it is not implemented or supported.
> >
> > When we look at the S3 archival features in more detail (soon!) I'm sure
> > this will come up!  The current plan is to address object versioning
> > first.  That is, unless a developer surfaces who wants to start hacking on
> > this right away...
> >
> > sage
> >
> >
> >
> >>
> >> Thanks
> >> Swami
> >>
> >> On Thu, Sep 18, 2014 at 6:20 PM, M Ranga Swami Reddy
> >>  wrote:
> >> > Hi ,
> >> >
> >> > Could you please check and clarify the below question on object
> >> > lifecycle and notification S3 APIs support:
> >> >
> >> > 1. To support the bucket lifecycle - we need to support the
> >> > moving/deleting the objects/buckets based lifecycle settings.
> >> > For ex: If an object lifecyle set as below:
> >> >   1. Archive it after 10 days - means move this object to low
> >> > cost object storage after 10 days of the creation date.
> >> >2. Remove this object after 90days - mean remove this
> >> > object from the low cost object after 90days of creation date.
> >> >
> >> > Q1- Does the ceph support the above concept like moving to low cost
> >> > storage and delete from that storage?
> >> >
> >> > 2. To support the object notifications:
> >> >   - First there should be low cost and high availability storage
> >> > with single replica only. If an object created with this type of
> >> > object storage,
> >> > There could be chances that object could lose, so if an object
> >> > of this type of storage lost, set the notifications.
> >> >
> >> > Q2- Does Ceph support low cost and high availability storage type?
> >> >
> >> > Thanks
> >> >
> >> > On Fri, Sep 12, 2014 at 8:00 PM, M Ranga Swami Reddy
> >> >  wrote:
> >> >> Hi Yehuda,
> >> >>
> >> >> Could you please check and clarify the below question on object
> >> >> lifecycle and notification S3 APIs support:
>

Re: Fwd: S3 API Compatibility support

2014-09-19 Thread M Ranga Swami Reddy
Hi Sage,
Thanks for quick reply.

>Ceph doesn't interact at all with AWS services like Glacier, if that's
>what you mean.

No. I meant Ceph interacting with Glacier- and RRS-type storage tiers
alongside the currently used OSD (or standard) storage.
>For RRS, though, I assume you mean the ability to create buckets with
>reduced redundancy with radosgw?  That is supported, although not quite
>the way AWS does it.  You can create different pools that back RGW
>buckets, and each bucket is stored in one of those pools.  So you could
>make one of them 2x instead of 3x, or use an erasure code of your choice.

Yes, we can configure Ceph to use 2x replicas, which will look like
reduced redundancy, but AWS uses a separate low-cost RRS storage class
(instead of standard) for this purpose. I am checking if we can do
something similar in Ceph too.

>What isn't currently supported is the ability to reduce the redundancy of
>individual objects in a bucket.  I don't think there is anything
>architecturally preventing that, but it is not implemented or supported.

OK. Do we have the issue id for the above? Else, we can file one. Please advise.

>When we look at the S3 archival features in more detail (soon!) I'm sure
>this will come up!  The current plan is to address object versioning
>first.  That is, unless a developer surfaces who wants to start hacking on
>this right away...

Great to know. We are also keen on S3 support in Ceph and are happy to
support you here.

Thanks
Swami

On Fri, Sep 19, 2014 at 11:08 AM, Sage Weil  wrote:
> On Fri, 19 Sep 2014, M Ranga Swami Reddy wrote:
>> Hi Sage,
>> Could you please advise, if Ceph support the low cost object
>> storages(like Amazon Glacier or RRS) for archiving objects like log
>> file etc.?
>
> Ceph doesn't interact at all with AWS services like Glacier, if that's
> what you mean.
>
> For RRS, though, I assume you mean the ability to create buckets with
> reduced redundancy with radosgw?  That is supported, although not quite
> the way AWS does it.  You can create different pools that back RGW
> buckets, and each bucket is stored in one of those pools.  So you could
> make one of them 2x instead of 3x, or use an erasure code of your choice.
>
> What isn't currently supported is the ability to reduce the redundancy of
> individual objects in a bucket.  I don't think there is anything
> architecturally preventing that, but it is not implemented or supported.
>
> When we look at the S3 archival features in more detail (soon!) I'm sure
> this will come up!  The current plan is to address object versioning
> first.  That is, unless a developer surfaces who wants to start hacking on
> this right away...
>
> sage
>
>
>
>>
>> Thanks
>> Swami
>>
>> On Thu, Sep 18, 2014 at 6:20 PM, M Ranga Swami Reddy
>>  wrote:
>> > Hi ,
>> >
>> > Could you please check and clarify the below question on object
>> > lifecycle and notification S3 APIs support:
>> >
>> > 1. To support the bucket lifecycle - we need to support the
>> > moving/deleting the objects/buckets based lifecycle settings.
>> > For ex: If an object lifecyle set as below:
>> >   1. Archive it after 10 days - means move this object to low
>> > cost object storage after 10 days of the creation date.
>> >2. Remove this object after 90days - mean remove this
>> > object from the low cost object after 90days of creation date.
>> >
>> > Q1- Does the ceph support the above concept like moving to low cost
>> > storage and delete from that storage?
>> >
>> > 2. To support the object notifications:
>> >   - First there should be low cost and high availability storage
>> > with single replica only. If an object created with this type of
>> > object storage,
>> > There could be chances that object could lose, so if an object
>> > of this type of storage lost, set the notifications.
>> >
>> > Q2- Does Ceph support low cost and high availability storage type?
>> >
>> > Thanks
>> >
>> > On Fri, Sep 12, 2014 at 8:00 PM, M Ranga Swami Reddy
>> >  wrote:
>> >> Hi Yehuda,
>> >>
>> >> Could you please check and clarify the below question on object
>> >> lifecycle and notification S3 APIs support:
>> >>
>> >> 1. To support the bucket lifecycle - we need to support the
>> >> moving/deleting the objects/buckets based lifecycle settings.
>> >> For ex: If an object lifecyle set as below:
>> >>   1. Archive it after 10 days - means move this object to low
>> >> cost object storage after 10 days of the creation date.
>> >>2. Remove this object after 90days - mean remove this
>> >> object from the low cost object after 90days of creation date.
>> >>
>> >> Q1- Does the ceph support the above concept like moving to low cost
>> >> storage and delete from that storage?
>> >>
>> >> 2. To support the object notifications:
>> >>   - First there should be low cost and high availability storage
>> >> with single replica only. If an object created with this type of
>> >> object storage,
>> >> There c

Re: snap_trimming + backfilling is inefficient with many purged_snaps

2014-09-19 Thread Dan van der Ster
September 19 2014 5:19 PM, "Sage Weil"  wrote: 
> On Fri, 19 Sep 2014, Dan van der Ster wrote:
> 
>> On Fri, Sep 19, 2014 at 10:41 AM, Dan Van Der Ster
>>  wrote:
 On 19 Sep 2014, at 08:12, Florian Haas  wrote:
 
 On Fri, Sep 19, 2014 at 12:27 AM, Sage Weil  wrote:
> On Fri, 19 Sep 2014, Florian Haas wrote:
>> Hi Sage,
>> 
>> was the off-list reply intentional?
> 
> Whoops! Nope :)
> 
>> On Thu, Sep 18, 2014 at 11:47 PM, Sage Weil  wrote:
 So, disaster is a pretty good description. Would anyone from the core
 team like to suggest another course of action or workaround, or are
 Dan and I generally on the right track to make the best out of a
 pretty bad situation?
>>> 
>>> The short term fix would probably be to just prevent backfill for the 
>>> time
>>> being until the bug is fixed.
>> 
>> As in, osd max backfills = 0?
> 
> Yeah :)
> 
> Just managed to reproduce the problem...
> 
> sage
 
 Saw the wip branch. Color me freakishly impressed on the turnaround. :) 
 Thanks!
>>> 
>>> Indeed :) Thanks Sage!
>>> wip-9487-dumpling fixes the problem on my test cluster. Trying in prod now?
>> 
>> Final update, after 4 hours in prod and after draining 8 OSDs -- zero
>> slow requests :)
> 
> That's great news!
> 
> But, please be careful. This code hasn't been reviewed yet or been through
> any testing! I would hold off on further backfills until it's merged.

Roger; I've been watching it very closely and so far it seems to work very 
well. Looking forward to that merge :)

Cheers, Dan


> 
> Thanks!
> sage


Re: severe librbd performance degradation in Giant

2014-09-19 Thread Sage Weil
On Fri, 19 Sep 2014, Alexandre DERUMIER wrote:
> >>Crazy, I've 56 SSDs and can't go above 20 000 iops.
> 
> I just notice than my fio benchmark is cpu bound...
> 
> I can reach around 4iops. Don't have more client machines for the moment 
> to bench

A quick aside on the fio testing: Mark noticed a few weeks back that the 
fio rbd driver isn't doing quite the right thing when you turn up the number 
of threads: each one issues its own IOs but they touch the same blocks in 
the image (or something like that).  See

http://tracker.ceph.com/issues/9391

It would be great to get this fixed in fio...

sage


> 
> 
> - Original Message - 
> 
> From: "Stefan Priebe - Profihost AG"  
> To: "Xinxin Shu" , "Somnath Roy" 
> , "Alexandre DERUMIER" , 
> "Haomai Wang"  
> Cc: "Sage Weil" , "Josh Durgin" , 
> ceph-devel@vger.kernel.org 
> Sent: Friday, 19 September 2014 15:31:14 
> Subject: Re: severe librbd performance degradation in Giant 
> 
> On 19.09.2014 at 15:02, Shu, Xinxin wrote: 
> > 12 x Intel DC 3700 200GB, every SSD has two OSDs. 
> 
> Crazy, I've 56 SSDs and can't go above 20 000 iops. 
> 
> Regards, Stefan 
> 
> > Cheers, 
> > xinxin 
> > 
> > -Original Message- 
> > From: Stefan Priebe [mailto:s.pri...@profihost.ag] 
> > Sent: Friday, September 19, 2014 2:54 PM 
> > To: Shu, Xinxin; Somnath Roy; Alexandre DERUMIER; Haomai Wang 
> > Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org 
> > Subject: Re: severe librbd performance degradation in Giant 
> > 
> > Am 19.09.2014 03:08, schrieb Shu, Xinxin: 
> >> I also observed performance degradation on my full SSD setup , I can 
> >> got ~270K IOPS for 4KB random read with 0.80.4 , but with latest 
> >> master , I only got ~12K IOPS 
> > 
> > This are impressive numbers. Can you tell me how many OSDs you have and 
> > which SSDs you use? 
> > 
> > Thanks, 
> > Stefan 
> > 
> > 
> >> Cheers, 
> >> xinxin 
> >> 
> >> -Original Message- 
> >> From: ceph-devel-ow...@vger.kernel.org 
> >> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy 
> >> Sent: Friday, September 19, 2014 2:03 AM 
> >> To: Alexandre DERUMIER; Haomai Wang 
> >> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org 
> >> Subject: RE: severe librbd performance degradation in Giant 
> >> 
> >> Alexandre, 
> >> What tool are you using ? I used fio rbd. 
> >> 
> >> Also, I hope you have Giant package installed in the client side as well 
> >> and rbd_cache =true is set on the client conf file. 
> >> FYI, firefly librbd + librados and Giant cluster will work seamlessly and 
> >> I had to make sure fio rbd is really loading giant librbd (if you have 
> >> multiple copies around , which was in my case) for reproducing it. 
> >> 
> >> Thanks & Regards 
> >> Somnath 
> >> 
> >> -Original Message- 
> >> From: Alexandre DERUMIER [mailto:aderum...@odiso.com] 
> >> Sent: Thursday, September 18, 2014 2:49 AM 
> >> To: Haomai Wang 
> >> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org; Somnath Roy 
> >> Subject: Re: severe librbd performance degradation in Giant 
> >> 
>  According http://tracker.ceph.com/issues/9513, do you mean that rbd 
>  cache will make 10x performance degradation for random read? 
> >> 
> >> Hi, on my side, I don't see any degradation performance on read (seq or 
> >> rand) with or without. 
> >> 
> >> firefly : around 12000iops (with or without rbd_cache) giant : around 
> >> 12000iops (with or without rbd_cache) 
> >> 
> >> (and I can reach around 2-3 iops on giant with disabling 
> >> optracker). 
> >> 
> >> 
> >> rbd_cache only improve write performance for me (4k block ) 
> >> 
> >> 
> >> 
> >> - Original Message - 
> >> 
> >> From: "Haomai Wang"  
> >> To: "Somnath Roy"  
> >> Cc: "Sage Weil" , "Josh Durgin" 
> >> , ceph-devel@vger.kernel.org 
> >> Sent: Thursday, 18 September 2014 04:27:56 
> >> Subject: Re: severe librbd performance degradation in Giant 
> >> 
> >> According http://tracker.ceph.com/issues/9513, do you mean that rbd cache 
> >> will make 10x performance degradation for random read? 
> >> 
> >> On Thu, Sep 18, 2014 at 7:44 AM, Somnath Roy  
> >> wrote: 
> >>> Josh/Sage, 
> >>> I should mention that even after turning off rbd cache I am getting ~20% 
> >>> degradation over Firefly. 
> >>> 
> >>> Thanks & Regards 
> >>> Somnath 
> >>> 
> >>> -Original Message- 
> >>> From: Somnath Roy 
> >>> Sent: Wednesday, September 17, 2014 2:44 PM 
> >>> To: Sage Weil 
> >>> Cc: Josh Durgin; ceph-devel@vger.kernel.org 
> >>> Subject: RE: severe librbd performance degradation in Giant 
> >>> 
> >>> Created a tracker for this. 
> >>> 
> >>> http://tracker.ceph.com/issues/9513 
> >>> 
> >>> Thanks & Regards 
> >>> Somnath 
> >>> 
> >>> -Original Message- 
> >>> From: ceph-devel-ow...@vger.kernel.org 
> >>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy 
> >>> Sent: Wednesday, September 17, 2014 2:39 PM 
> >>> To: Sage Weil 
> >>> Cc: Josh Durgin; ceph-devel@vger.kernel.org 
> 

Re: [ceph-users] Status of snapshots in CephFS

2014-09-19 Thread Sage Weil
On Fri, 19 Sep 2014, Florian Haas wrote:
> Hello everyone,
> 
> Just thought I'd circle back on some discussions I've had with people
> earlier in the year:
> 
> Shortly before firefly, snapshot support for CephFS clients was
> effectively disabled by default at the MDS level, and can only be
> enabled after accepting a scary warning that your filesystem is highly
> likely to break if snapshot support is enabled. Has any progress been
> made on this in the interim?
> 
> With libcephfs support slowly maturing in Ganesha, the option of
> deploying a Ceph-backed userspace NFS server is becoming more
> attractive -- and it's probably a better use of resources than mapping
> a boatload of RBDs on an NFS head node and then exporting all the data
> from there. Recent snapshot trimming issues notwithstanding, RBD
> snapshot support is reasonably stable, but even so, making snapshot
> data available via NFS, that way, is rather ugly. In addition, the
> libcephfs/Ganesha approach would obviously include much better
> horizontal scalability.

We haven't done any work on snapshot stability.  It is probably moderately 
stable if snapshots are only done at the root or at a consistent point in 
the hierarchy (as opposed to random directories), but there are still some 
basic problems that need to be resolved.  I would not suggest deploying 
this in production!  But some stress testing would as always be very 
welcome.  :)

> In addition, 
> https://github.com/nfs-ganesha/nfs-ganesha/wiki/ReleaseNotes_2.0#CEPH
> states:
> 
> "The current requirement to build and use the Ceph FSAL is a Ceph
> build environment which includes Ceph client enhancements staged on
> the libwipcephfs development branch. These changes are expected to be
> part of the Ceph Firefly release."
> 
> ... though it's not clear whether they ever did make it into firefly.
> Could someone in the know comment on that?

I think this is referring to the libcephfs API changes that the cohortfs 
folks did.  That all merged shortly before firefly.

By the way, we have some basic samba integration tests in our regular 
regression tests, but nothing based on ganesha.  If you really want this 
to work, the most valuable thing you could do would be to help 
get the tests written and integrated into ceph-qa-suite.git.  Probably the 
biggest piece of work there is creating a task/ganesha.py that installs 
and configures ganesha with the ceph FSAL.
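
For whoever picks that up, the export definition such a task would need to
template is small; a rough Ganesha 2.x sketch (untested here, so check the
FSAL_CEPH documentation for the current option names):

 EXPORT
 {
     Export_Id = 1;
     Path = "/";
     Pseudo = "/cephfs";
     Access_Type = RW;
     FSAL {
         Name = CEPH;
     }
 }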

sage


Re: snap_trimming + backfilling is inefficient with many purged_snaps

2014-09-19 Thread Sage Weil
On Fri, 19 Sep 2014, Dan van der Ster wrote:
> On Fri, Sep 19, 2014 at 10:41 AM, Dan Van Der Ster
>  wrote:
> >> On 19 Sep 2014, at 08:12, Florian Haas  wrote:
> >>
> >> On Fri, Sep 19, 2014 at 12:27 AM, Sage Weil  wrote:
> >>> On Fri, 19 Sep 2014, Florian Haas wrote:
>  Hi Sage,
> 
>  was the off-list reply intentional?
> >>>
> >>> Whoops!  Nope :)
> >>>
>  On Thu, Sep 18, 2014 at 11:47 PM, Sage Weil  wrote:
> >> So, disaster is a pretty good description. Would anyone from the core
> >> team like to suggest another course of action or workaround, or are
> >> Dan and I generally on the right track to make the best out of a
> >> pretty bad situation?
> >
> > The short term fix would probably be to just prevent backfill for the 
> > time
> > being until the bug is fixed.
> 
>  As in, osd max backfills = 0?
> >>>
> >>> Yeah :)
> >>>
> >>> Just managed to reproduce the problem...
> >>>
> >>> sage
> >>
> >> Saw the wip branch. Color me freakishly impressed on the turnaround. :) 
> >> Thanks!
> >
> > Indeed :) Thanks Sage!
> > wip-9487-dumpling fixes the problem on my test cluster. Trying in prod now?
> 
> Final update, after 4 hours in prod and after draining 8 OSDs -- zero
> slow requests :)

That's great news!

But, please be careful.  This code hasn't been reviewed yet or been through 
any testing!  I would hold off on further backfills until it's merged.
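
For reference, backfill can be paused at runtime (without restarting OSDs)
with either of:

 ceph tell osd.* injectargs '--osd-max-backfills 0'
 ceph osd set nobackfill    (and 'ceph osd unset nobackfill' afterwards)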

Thanks!
sage


Re: severe librbd performance degradation in Giant

2014-09-19 Thread Sage Weil
On Fri, 19 Sep 2014, Alexandre DERUMIER wrote:
> >> with rbd_cache=true , I got around 6iops (and I don't see any network 
> >> traffic) 
> >>
> >>So maybe they are a bug in fio ? 
> >>maybe this is related to: 
> 
> Oh, sorry, this was my fault, I didn't fill the rbd with data before doing 
> the bench
> 
> Now the results are (for 1 osd)
> 
> firefly
> --
>  bw=37460KB/s, iops=9364
> 
> giant
> -
>  bw=32741KB/s, iops=8185
> 
> 
> So, a little regression
> 
> (the results are equals rbd_cache=true|false)

Do you see a difference with rados bench, or is it just librbd?
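
Something like this would do (the pool name is only an example; run the write
pass with --no-cleanup first so the read passes have objects to work on):

 rados -p rbdbench bench 60 write -t 32 --no-cleanup
 rados -p rbdbench bench 60 seq -t 32
 rados -p rbdbench bench 60 rand -t 32    (if your rados has the rand mode)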

Thanks!
sage


> 
> 
> I'll try to compare with more osds
> 
> - Original Message - 
> 
> From: "Alexandre DERUMIER"  
> To: "Somnath Roy"  
> Cc: "Sage Weil" , "Josh Durgin" , 
> ceph-devel@vger.kernel.org, "Haomai Wang"  
> Sent: Friday, 19 September 2014 12:09:41 
> Subject: Re: severe librbd performance degradation in Giant 
> 
> >>What tool are you using ? I used fio rbd. 
> 
> fio rbd too 
> 
> 
> [global] 
> ioengine=rbd 
> clientname=admin 
> pool=test 
> rbdname=test 
> invalidate=0 
> #rw=read 
> #rw=randwrite 
> #rw=write 
> rw=randread 
> bs=4k 
> direct=1 
> numjobs=2 
> group_reporting=1 
> size=10G 
> 
> [rbd_iodepth32] 
> iodepth=32 
> 
> 
> 
> I just notice something strange 
> 
> with rbd_cache=true , I got around 6iops (and I don't see any network 
> traffic) 
> 
> So maybe they are a bug in fio ? 
> maybe this is related to: 
> 
> 
> http://tracker.ceph.com/issues/9391 
> "fio rbd driver rewrites same blocks" 
> 
> - Original Message - 
> 
> From: "Somnath Roy"  
> To: "Alexandre DERUMIER" , "Haomai Wang" 
>  
> Cc: "Sage Weil" , "Josh Durgin" , 
> ceph-devel@vger.kernel.org 
> Sent: Thursday, 18 September 2014 20:02:49 
> Subject: RE: severe librbd performance degradation in Giant 
> 
> Alexandre, 
> What tool are you using ? I used fio rbd. 
> 
> Also, I hope you have Giant package installed in the client side as well and 
> rbd_cache =true is set on the client conf file. 
> FYI, firefly librbd + librados and Giant cluster will work seamlessly and I 
> had to make sure fio rbd is really loading giant librbd (if you have multiple 
> copies around , which was in my case) for reproducing it. 
> 
> Thanks & Regards 
> Somnath 
> 
> -Original Message- 
> From: Alexandre DERUMIER [mailto:aderum...@odiso.com] 
> Sent: Thursday, September 18, 2014 2:49 AM 
> To: Haomai Wang 
> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org; Somnath Roy 
> Subject: Re: severe librbd performance degradation in Giant 
> 
> >>According http://tracker.ceph.com/issues/9513, do you mean that rbd 
> >>cache will make 10x performance degradation for random read? 
> 
> Hi, on my side, I don't see any degradation performance on read (seq or rand) 
> with or without. 
> 
> firefly : around 12000iops (with or without rbd_cache) giant : around 
> 12000iops (with or without rbd_cache) 
> 
> (and I can reach around 2-3 iops on giant with disabling optracker). 
> 
> 
> rbd_cache only improve write performance for me (4k block ) 
> 
> 
> 
> - Original Message - 
> 
> From: "Haomai Wang"  
> To: "Somnath Roy"  
> Cc: "Sage Weil" , "Josh Durgin" , 
> ceph-devel@vger.kernel.org 
> Sent: Thursday, 18 September 2014 04:27:56 
> Subject: Re: severe librbd performance degradation in Giant 
> 
> According http://tracker.ceph.com/issues/9513, do you mean that rbd cache 
> will make 10x performance degradation for random read? 
> 
> On Thu, Sep 18, 2014 at 7:44 AM, Somnath Roy  wrote: 
> > Josh/Sage, 
> > I should mention that even after turning off rbd cache I am getting ~20% 
> > degradation over Firefly. 
> > 
> > Thanks & Regards 
> > Somnath 
> > 
> > -Original Message- 
> > From: Somnath Roy 
> > Sent: Wednesday, September 17, 2014 2:44 PM 
> > To: Sage Weil 
> > Cc: Josh Durgin; ceph-devel@vger.kernel.org 
> > Subject: RE: severe librbd performance degradation in Giant 
> > 
> > Created a tracker for this. 
> > 
> > http://tracker.ceph.com/issues/9513 
> > 
> > Thanks & Regards 
> > Somnath 
> > 
> > -Original Message- 
> > From: ceph-devel-ow...@vger.kernel.org 
> > [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy 
> > Sent: Wednesday, September 17, 2014 2:39 PM 
> > To: Sage Weil 
> > Cc: Josh Durgin; ceph-devel@vger.kernel.org 
> > Subject: RE: severe librbd performance degradation in Giant 
> > 
> > Sage, 
> > It's a 4K random read. 
> > 
> > Thanks & Regards 
> > Somnath 
> > 
> > -Original Message- 
> > From: Sage Weil [mailto:sw...@redhat.com] 
> > Sent: Wednesday, September 17, 2014 2:36 PM 
> > To: Somnath Roy 
> > Cc: Josh Durgin; ceph-devel@vger.kernel.org 
> > Subject: RE: severe librbd performance degradation in Giant 
> > 
> > What was the io pattern? Sequential or random? For random a slowdown makes 
> > sense (tho maybe not 10x!) but not for sequentail 
> > 
> > s 
> > 
> > On Wed, 17 Sep 2014, Somnath Roy wrote: 
> > 
> >> I set the following in the client side 

Re: why ZFS on ceph is unstable?

2014-09-19 Thread Sage Weil
On Fri, 19 Sep 2014, Nicheal wrote:
> Hi developers,
> 
> it mentioned in the source code that OPTION(filestore_zfs_snap,
> OPT_BOOL, false) // zfsonlinux is still unstable. So if we turn on
> filestore_zfs_snap and neglect journal like btrf, it will be unstable?
> 
> As is mentioned on the "zfs on linux community", It is stable enough
> to run a ZFS root filesystem on a GNU/Linux installation for your
> workstation as something to play around with. It is copy-on-write,
> supports compression, deduplication, file atomicity, off-disk caching,
> (encryption not support), and much more.  So it seems that all
> features are supported except for encryption.
> Thus, I am puzzled that the unstable, you mean, is ZFS unstable
> itself. Or it now is already stable on linux, but still unstable when
> used as ceph FileStore filesystem.
> 
> If so, what will happen if we use it, losing data or frequent crash?

At the time the libzfs support was added, zfsonlinux would crash very 
quickly under the ceph-osd workload.  If that has changed, great!  We 
haven't tested it, though, since Zheng added the initial support.

sage


Re: severe librbd performance degradation in Giant

2014-09-19 Thread Alexandre DERUMIER
>>Crazy, I've 56 SSDs and can't go above 20 000 iops.

I just noticed that my fio benchmark is cpu bound...

I can reach around 4iops. Don't have more client machines for the moment to 
bench


- Original Message - 

From: "Stefan Priebe - Profihost AG"  
To: "Xinxin Shu" , "Somnath Roy" 
, "Alexandre DERUMIER" , "Haomai 
Wang"  
Cc: "Sage Weil" , "Josh Durgin" , 
ceph-devel@vger.kernel.org 
Sent: Friday, 19 September 2014 15:31:14 
Subject: Re: severe librbd performance degradation in Giant 

On 19.09.2014 at 15:02, Shu, Xinxin wrote: 
> 12 x Intel DC 3700 200GB, every SSD has two OSDs. 

Crazy, I've 56 SSDs and can't go above 20 000 iops. 

Regards, Stefan 

> Cheers, 
> xinxin 
> 
> -Original Message- 
> From: Stefan Priebe [mailto:s.pri...@profihost.ag] 
> Sent: Friday, September 19, 2014 2:54 PM 
> To: Shu, Xinxin; Somnath Roy; Alexandre DERUMIER; Haomai Wang 
> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org 
> Subject: Re: severe librbd performance degradation in Giant 
> 
> Am 19.09.2014 03:08, schrieb Shu, Xinxin: 
>> I also observed performance degradation on my full SSD setup , I can 
>> got ~270K IOPS for 4KB random read with 0.80.4 , but with latest 
>> master , I only got ~12K IOPS 
> 
> This are impressive numbers. Can you tell me how many OSDs you have and which 
> SSDs you use? 
> 
> Thanks, 
> Stefan 
> 
> 
>> Cheers, 
>> xinxin 
>> 
>> -Original Message- 
>> From: ceph-devel-ow...@vger.kernel.org 
>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy 
>> Sent: Friday, September 19, 2014 2:03 AM 
>> To: Alexandre DERUMIER; Haomai Wang 
>> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org 
>> Subject: RE: severe librbd performance degradation in Giant 
>> 
>> Alexandre, 
>> What tool are you using ? I used fio rbd. 
>> 
>> Also, I hope you have Giant package installed in the client side as well and 
>> rbd_cache =true is set on the client conf file. 
>> FYI, firefly librbd + librados and Giant cluster will work seamlessly and I 
>> had to make sure fio rbd is really loading giant librbd (if you have 
>> multiple copies around , which was in my case) for reproducing it. 
>> 
>> Thanks & Regards 
>> Somnath 
>> 
>> -Original Message- 
>> From: Alexandre DERUMIER [mailto:aderum...@odiso.com] 
>> Sent: Thursday, September 18, 2014 2:49 AM 
>> To: Haomai Wang 
>> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org; Somnath Roy 
>> Subject: Re: severe librbd performance degradation in Giant 
>> 
 According http://tracker.ceph.com/issues/9513, do you mean that rbd 
 cache will make 10x performance degradation for random read? 
>> 
>> Hi, on my side, I don't see any degradation performance on read (seq or 
>> rand) with or without. 
>> 
>> firefly : around 12000iops (with or without rbd_cache) giant : around 
>> 12000iops (with or without rbd_cache) 
>> 
>> (and I can reach around 2-3 iops on giant with disabling optracker). 
>> 
>> 
>> rbd_cache only improve write performance for me (4k block ) 
>> 
>> 
>> 
>> - Mail original - 
>> 
>> De: "Haomai Wang"  
>> À: "Somnath Roy"  
>> Cc: "Sage Weil" , "Josh Durgin" 
>> , ceph-devel@vger.kernel.org 
>> Envoyé: Jeudi 18 Septembre 2014 04:27:56 
>> Objet: Re: severe librbd performance degradation in Giant 
>> 
>> According http://tracker.ceph.com/issues/9513, do you mean that rbd cache 
>> will make 10x performance degradation for random read? 
>> 
>> On Thu, Sep 18, 2014 at 7:44 AM, Somnath Roy  
>> wrote: 
>>> Josh/Sage, 
>>> I should mention that even after turning off rbd cache I am getting ~20% 
>>> degradation over Firefly. 
>>> 
>>> Thanks & Regards 
>>> Somnath 
>>> 
>>> -Original Message- 
>>> From: Somnath Roy 
>>> Sent: Wednesday, September 17, 2014 2:44 PM 
>>> To: Sage Weil 
>>> Cc: Josh Durgin; ceph-devel@vger.kernel.org 
>>> Subject: RE: severe librbd performance degradation in Giant 
>>> 
>>> Created a tracker for this. 
>>> 
>>> http://tracker.ceph.com/issues/9513 
>>> 
>>> Thanks & Regards 
>>> Somnath 
>>> 
>>> -Original Message- 
>>> From: ceph-devel-ow...@vger.kernel.org 
>>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy 
>>> Sent: Wednesday, September 17, 2014 2:39 PM 
>>> To: Sage Weil 
>>> Cc: Josh Durgin; ceph-devel@vger.kernel.org 
>>> Subject: RE: severe librbd performance degradation in Giant 
>>> 
>>> Sage, 
>>> It's a 4K random read. 
>>> 
>>> Thanks & Regards 
>>> Somnath 
>>> 
>>> -Original Message- 
>>> From: Sage Weil [mailto:sw...@redhat.com] 
>>> Sent: Wednesday, September 17, 2014 2:36 PM 
>>> To: Somnath Roy 
>>> Cc: Josh Durgin; ceph-devel@vger.kernel.org 
>>> Subject: RE: severe librbd performance degradation in Giant 
>>> 
>>> What was the io pattern? Sequential or random? For random a slowdown makes 
>>> sense (tho maybe not 10x!) but not for sequentail 
>>> 
>>> s 
>>> 
>>> On Wed, 17 Sep 2014, Somnath Roy wrote: 
>>> 
 I set the following in the client side /etc/ceph/ceph.conf

Re: severe librbd performance degradation in Giant

2014-09-19 Thread David Moreau Simard
Numbers vary a lot from brand to brand and from model to model.

Just within Intel, you'd be surprised at the large difference between DC
S3500 and DC S3700:
http://ark.intel.com/compare/75680,71914
-- 
David Moreau Simard


On 2014-09-19, 9:31 AM, « Stefan Priebe - Profihost AG »
 wrote:

>On 19.09.2014 at 15:02, Shu, Xinxin wrote:
>>  12 x Intel DC 3700 200GB, every SSD has two OSDs.
>
>Crazy, I've 56 SSDs and can't go above 20 000 iops.
>
>Regards, Stefan
>
>> Cheers,
>> xinxin
>> 
>> -Original Message-
>> From: Stefan Priebe [mailto:s.pri...@profihost.ag]
>> Sent: Friday, September 19, 2014 2:54 PM
>> To: Shu, Xinxin; Somnath Roy; Alexandre DERUMIER; Haomai Wang
>> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org
>> Subject: Re: severe librbd performance degradation in Giant
>> 
>> Am 19.09.2014 03:08, schrieb Shu, Xinxin:
>>> I also observed performance degradation on my full SSD setup ,  I can
>>> got  ~270K IOPS for 4KB random read with 0.80.4 , but with latest
>>> master , I only got ~12K IOPS
>> 
>> This are impressive numbers. Can you tell me how many OSDs you have and
>>which SSDs you use?
>> 
>> Thanks,
>> Stefan
>> 
>> 
>>> Cheers,
>>> xinxin
>>>
>>> -Original Message-
>>> From: ceph-devel-ow...@vger.kernel.org
>>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy
>>> Sent: Friday, September 19, 2014 2:03 AM
>>> To: Alexandre DERUMIER; Haomai Wang
>>> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org
>>> Subject: RE: severe librbd performance degradation in Giant
>>>
>>> Alexandre,
>>> What tool are you using ? I used fio rbd.
>>>
>>> Also, I hope you have Giant package installed in the client side as
>>>well and rbd_cache =true is set on the client conf file.
>>> FYI, firefly librbd + librados and Giant cluster will work seamlessly
>>>and I had to make sure fio rbd is really loading giant librbd (if you
>>>have multiple copies around , which was in my case) for reproducing it.
>>>
>>> Thanks & Regards
>>> Somnath
>>>
>>> -Original Message-
>>> From: Alexandre DERUMIER [mailto:aderum...@odiso.com]
>>> Sent: Thursday, September 18, 2014 2:49 AM
>>> To: Haomai Wang
>>> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org; Somnath Roy
>>> Subject: Re: severe librbd performance degradation in Giant
>>>
> According http://tracker.ceph.com/issues/9513, do you mean that rbd
> cache will make 10x performance degradation for random read?
>>>
>>> Hi, on my side, I don't see any degradation performance on read (seq
>>>or rand)  with or without.
>>>
>>> firefly : around 12000iops (with or without rbd_cache) giant : around
>>> 12000iops  (with or without rbd_cache)
>>>
>>> (and I can reach around 2-3 iops on giant with disabling
>>>optracker).
>>>
>>>
>>> rbd_cache only improve write performance for me (4k block )
>>>
>>>
>>>
>>> - Mail original -
>>>
>>> De: "Haomai Wang" 
>>> À: "Somnath Roy" 
>>> Cc: "Sage Weil" , "Josh Durgin"
>>> , ceph-devel@vger.kernel.org
>>> Envoyé: Jeudi 18 Septembre 2014 04:27:56
>>> Objet: Re: severe librbd performance degradation in Giant
>>>
>>> According http://tracker.ceph.com/issues/9513, do you mean that rbd
>>>cache will make 10x performance degradation for random read?
>>>
>>> On Thu, Sep 18, 2014 at 7:44 AM, Somnath Roy 
>>>wrote:
 Josh/Sage,
 I should mention that even after turning off rbd cache I am getting
~20% degradation over Firefly.

 Thanks & Regards
 Somnath

 -Original Message-
 From: Somnath Roy
 Sent: Wednesday, September 17, 2014 2:44 PM
 To: Sage Weil
 Cc: Josh Durgin; ceph-devel@vger.kernel.org
 Subject: RE: severe librbd performance degradation in Giant

 Created a tracker for this.

 http://tracker.ceph.com/issues/9513

 Thanks & Regards
 Somnath

 -Original Message-
 From: ceph-devel-ow...@vger.kernel.org
 [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy
 Sent: Wednesday, September 17, 2014 2:39 PM
 To: Sage Weil
 Cc: Josh Durgin; ceph-devel@vger.kernel.org
 Subject: RE: severe librbd performance degradation in Giant

 Sage,
 It's a 4K random read.

 Thanks & Regards
 Somnath

 -Original Message-
 From: Sage Weil [mailto:sw...@redhat.com]
 Sent: Wednesday, September 17, 2014 2:36 PM
 To: Somnath Roy
 Cc: Josh Durgin; ceph-devel@vger.kernel.org
 Subject: RE: severe librbd performance degradation in Giant

 What was the io pattern? Sequential or random? For random a slowdown
makes sense (tho maybe not 10x!) but not for sequentail

 s

 On Wed, 17 Sep 2014, Somnath Roy wrote:

> I set the following in the client side /etc/ceph/ceph.conf where I
>am running fio rbd.
>
> rbd_cache_writethrough_until_flush = false
>
> But, no difference. BTW, I am doing Random read, not write. Still
>this setting applies ?
>
>

Re: severe librbd performance degradation in Giant

2014-09-19 Thread Stefan Priebe - Profihost AG
On 19.09.2014 at 15:02, Shu, Xinxin wrote:
>  12 x Intel DC 3700 200GB, every SSD has two OSDs.

Crazy, I've 56 SSDs and can't go above 20 000 iops.

Regards, Stefan

> Cheers,
> xinxin
> 
> -Original Message-
> From: Stefan Priebe [mailto:s.pri...@profihost.ag] 
> Sent: Friday, September 19, 2014 2:54 PM
> To: Shu, Xinxin; Somnath Roy; Alexandre DERUMIER; Haomai Wang
> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org
> Subject: Re: severe librbd performance degradation in Giant
> 
> On 19.09.2014 03:08, Shu, Xinxin wrote:
>> I also observed performance degradation on my full SSD setup ,  I can 
>> got  ~270K IOPS for 4KB random read with 0.80.4 , but with latest 
>> master , I only got ~12K IOPS
> 
> This are impressive numbers. Can you tell me how many OSDs you have and which 
> SSDs you use?
> 
> Thanks,
> Stefan
> 
> 
>> Cheers,
>> xinxin
>>
>> -Original Message-
>> From: ceph-devel-ow...@vger.kernel.org 
>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy
>> Sent: Friday, September 19, 2014 2:03 AM
>> To: Alexandre DERUMIER; Haomai Wang
>> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org
>> Subject: RE: severe librbd performance degradation in Giant
>>
>> Alexandre,
>> What tool are you using ? I used fio rbd.
>>
>> Also, I hope you have Giant package installed in the client side as well and 
>> rbd_cache =true is set on the client conf file.
>> FYI, firefly librbd + librados and Giant cluster will work seamlessly and I 
>> had to make sure fio rbd is really loading giant librbd (if you have 
>> multiple copies around , which was in my case) for reproducing it.
>>
>> Thanks & Regards
>> Somnath
>>
>> -Original Message-
>> From: Alexandre DERUMIER [mailto:aderum...@odiso.com]
>> Sent: Thursday, September 18, 2014 2:49 AM
>> To: Haomai Wang
>> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org; Somnath Roy
>> Subject: Re: severe librbd performance degradation in Giant
>>
 According http://tracker.ceph.com/issues/9513, do you mean that rbd 
 cache will make 10x performance degradation for random read?
>>
>> Hi, on my side, I don't see any degradation performance on read (seq or 
>> rand)  with or without.
>>
>> firefly : around 12000iops (with or without rbd_cache) giant : around 
>> 12000iops  (with or without rbd_cache)
>>
>> (and I can reach around 2-3 iops on giant with disabling optracker).
>>
>>
>> rbd_cache only improve write performance for me (4k block )
>>
>>
>>
>> - Mail original -
>>
>> De: "Haomai Wang" 
>> À: "Somnath Roy" 
>> Cc: "Sage Weil" , "Josh Durgin" 
>> , ceph-devel@vger.kernel.org
>> Envoyé: Jeudi 18 Septembre 2014 04:27:56
>> Objet: Re: severe librbd performance degradation in Giant
>>
>> According http://tracker.ceph.com/issues/9513, do you mean that rbd cache 
>> will make 10x performance degradation for random read?
>>
>> On Thu, Sep 18, 2014 at 7:44 AM, Somnath Roy  wrote:
>>> Josh/Sage,
>>> I should mention that even after turning off rbd cache I am getting ~20% 
>>> degradation over Firefly.
>>>
>>> Thanks & Regards
>>> Somnath
>>>
>>> -Original Message-
>>> From: Somnath Roy
>>> Sent: Wednesday, September 17, 2014 2:44 PM
>>> To: Sage Weil
>>> Cc: Josh Durgin; ceph-devel@vger.kernel.org
>>> Subject: RE: severe librbd performance degradation in Giant
>>>
>>> Created a tracker for this.
>>>
>>> http://tracker.ceph.com/issues/9513
>>>
>>> Thanks & Regards
>>> Somnath
>>>
>>> -Original Message-
>>> From: ceph-devel-ow...@vger.kernel.org 
>>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy
>>> Sent: Wednesday, September 17, 2014 2:39 PM
>>> To: Sage Weil
>>> Cc: Josh Durgin; ceph-devel@vger.kernel.org
>>> Subject: RE: severe librbd performance degradation in Giant
>>>
>>> Sage,
>>> It's a 4K random read.
>>>
>>> Thanks & Regards
>>> Somnath
>>>
>>> -Original Message-
>>> From: Sage Weil [mailto:sw...@redhat.com]
>>> Sent: Wednesday, September 17, 2014 2:36 PM
>>> To: Somnath Roy
>>> Cc: Josh Durgin; ceph-devel@vger.kernel.org
>>> Subject: RE: severe librbd performance degradation in Giant
>>>
>>> What was the io pattern? Sequential or random? For random a slowdown makes 
>>> sense (tho maybe not 10x!) but not for sequential
>>>
>>> s
>>>
>>> On Wed, 17 Sep 2014, Somnath Roy wrote:
>>>
 I set the following in the client side /etc/ceph/ceph.conf where I am 
 running fio rbd.

 rbd_cache_writethrough_until_flush = false

 But, no difference. BTW, I am doing Random read, not write. Still this 
 setting applies ?

 Next, I tried to tweak the rbd_cache setting to false and I *got back* the 
 old performance. Now, it is similar to firefly throughput !

 So, looks like rbd_cache=true was the culprit.

 Thanks Josh !

 Regards
 Somnath

 -Original Message-
 From: Josh Durgin [mailto:josh.dur...@inktank.com]
 Sent: Wednesday, September 17, 2014 2:20 PM
 To: Somnath Roy; ceph-devel@vger.kernel.org

RE: severe librbd performance degradation in Giant

2014-09-19 Thread Shu, Xinxin
 12 x Intel DC 3700 200GB, every SSD has two OSDs.

Cheers,
xinxin

-Original Message-
From: Stefan Priebe [mailto:s.pri...@profihost.ag] 
Sent: Friday, September 19, 2014 2:54 PM
To: Shu, Xinxin; Somnath Roy; Alexandre DERUMIER; Haomai Wang
Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org
Subject: Re: severe librbd performance degradation in Giant

On 19.09.2014 03:08, Shu, Xinxin wrote:
> I also observed performance degradation on my full SSD setup; I got 
> ~270K IOPS for 4KB random read with 0.80.4, but with latest 
> master I only got ~12K IOPS

These are impressive numbers. Can you tell me how many OSDs you have and which 
SSDs you use?

Thanks,
Stefan


> Cheers,
> xinxin
>
> -Original Message-
> From: ceph-devel-ow...@vger.kernel.org 
> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy
> Sent: Friday, September 19, 2014 2:03 AM
> To: Alexandre DERUMIER; Haomai Wang
> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org
> Subject: RE: severe librbd performance degradation in Giant
>
> Alexandre,
> What tool are you using ? I used fio rbd.
>
> Also, I hope you have Giant package installed in the client side as well and 
> rbd_cache =true is set on the client conf file.
> FYI, firefly librbd + librados and Giant cluster will work seamlessly and I 
> had to make sure fio rbd is really loading giant librbd (if you have multiple 
> copies around , which was in my case) for reproducing it.
>
> Thanks & Regards
> Somnath
>
> -Original Message-
> From: Alexandre DERUMIER [mailto:aderum...@odiso.com]
> Sent: Thursday, September 18, 2014 2:49 AM
> To: Haomai Wang
> Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org; Somnath Roy
> Subject: Re: severe librbd performance degradation in Giant
>
>>> According http://tracker.ceph.com/issues/9513, do you mean that rbd 
>>> cache will make 10x performance degradation for random read?
>
> Hi, on my side, I don't see any performance degradation on read (seq or rand) 
>  with or without.
>
> firefly : around 12000iops (with or without rbd_cache) giant : around 
> 12000iops  (with or without rbd_cache)
>
> (and I can reach around 2-3 iops on giant with disabling optracker).
>
>
> rbd_cache only improves write performance for me (4k blocks)
>
>
>
> - Mail original -
>
> De: "Haomai Wang" 
> À: "Somnath Roy" 
> Cc: "Sage Weil" , "Josh Durgin" 
> , ceph-devel@vger.kernel.org
> Envoyé: Jeudi 18 Septembre 2014 04:27:56
> Objet: Re: severe librbd performance degradation in Giant
>
> According http://tracker.ceph.com/issues/9513, do you mean that rbd cache 
> will make 10x performance degradation for random read?
>
> On Thu, Sep 18, 2014 at 7:44 AM, Somnath Roy  wrote:
>> Josh/Sage,
>> I should mention that even after turning off rbd cache I am getting ~20% 
>> degradation over Firefly.
>>
>> Thanks & Regards
>> Somnath
>>
>> -Original Message-
>> From: Somnath Roy
>> Sent: Wednesday, September 17, 2014 2:44 PM
>> To: Sage Weil
>> Cc: Josh Durgin; ceph-devel@vger.kernel.org
>> Subject: RE: severe librbd performance degradation in Giant
>>
>> Created a tracker for this.
>>
>> http://tracker.ceph.com/issues/9513
>>
>> Thanks & Regards
>> Somnath
>>
>> -Original Message-
>> From: ceph-devel-ow...@vger.kernel.org 
>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy
>> Sent: Wednesday, September 17, 2014 2:39 PM
>> To: Sage Weil
>> Cc: Josh Durgin; ceph-devel@vger.kernel.org
>> Subject: RE: severe librbd performance degradation in Giant
>>
>> Sage,
>> It's a 4K random read.
>>
>> Thanks & Regards
>> Somnath
>>
>> -Original Message-
>> From: Sage Weil [mailto:sw...@redhat.com]
>> Sent: Wednesday, September 17, 2014 2:36 PM
>> To: Somnath Roy
>> Cc: Josh Durgin; ceph-devel@vger.kernel.org
>> Subject: RE: severe librbd performance degradation in Giant
>>
>> What was the io pattern? Sequential or random? For random a slowdown makes 
>> sense (tho maybe not 10x!) but not for sequential
>>
>> s
>>
>> On Wed, 17 Sep 2014, Somnath Roy wrote:
>>
>>> I set the following in the client side /etc/ceph/ceph.conf where I am 
>>> running fio rbd.
>>>
>>> rbd_cache_writethrough_until_flush = false
>>>
>>> But, no difference. BTW, I am doing Random read, not write. Still this 
>>> setting applies ?
>>>
>>> Next, I tried to tweak the rbd_cache setting to false and I *got back* the 
>>> old performance. Now, it is similar to firefly throughput !
>>>
>>> So, looks like rbd_cache=true was the culprit.
>>>
>>> Thanks Josh !
>>>
>>> Regards
>>> Somnath
>>>
>>> -Original Message-
>>> From: Josh Durgin [mailto:josh.dur...@inktank.com]
>>> Sent: Wednesday, September 17, 2014 2:20 PM
>>> To: Somnath Roy; ceph-devel@vger.kernel.org
>>> Subject: Re: severe librbd performance degradation in Giant
>>>
>>> On 09/17/2014 01:55 PM, Somnath Roy wrote:
 Hi Sage,
 We are experiencing severe librbd performance degradation in Giant over 
 firefly release. Here is the experiment we did to isolate it as a librbd problem.

Re: snap_trimming + backfilling is inefficient with many purged_snaps

2014-09-19 Thread Dan van der Ster
On Fri, Sep 19, 2014 at 10:41 AM, Dan Van Der Ster
 wrote:
>> On 19 Sep 2014, at 08:12, Florian Haas  wrote:
>>
>> On Fri, Sep 19, 2014 at 12:27 AM, Sage Weil  wrote:
>>> On Fri, 19 Sep 2014, Florian Haas wrote:
 Hi Sage,

 was the off-list reply intentional?
>>>
>>> Whoops!  Nope :)
>>>
 On Thu, Sep 18, 2014 at 11:47 PM, Sage Weil  wrote:
>> So, disaster is a pretty good description. Would anyone from the core
>> team like to suggest another course of action or workaround, or are
>> Dan and I generally on the right track to make the best out of a
>> pretty bad situation?
>
> The short term fix would probably be to just prevent backfill for the time
> being until the bug is fixed.

 As in, osd max backfills = 0?
>>>
>>> Yeah :)
>>>
>>> Just managed to reproduce the problem...
>>>
>>> sage
>>
>> Saw the wip branch. Color me freakishly impressed on the turnaround. :) 
>> Thanks!
>
> Indeed :) Thanks Sage!
> wip-9487-dumpling fixes the problem on my test cluster. Trying in prod now…

Final update, after 4 hours in prod and after draining 8 OSDs -- zero
slow requests :)

Thanks again!

Dan
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: severe librbd performance degradation in Giant

2014-09-19 Thread Alexandre DERUMIER
giant results with 6 osd

bw=118129KB/s, iops=29532  : rbd_cache = false
bw=101771KB/s, iops=25442 : rbd_cache = true



fio config (note that numjobs is important, i'm going from 18000iops -> 29000 
iops for numjobs 1->4)
--
[global]
#logging
#write_iops_log=write_iops_log
#write_bw_log=write_bw_log
#write_lat_log=write_lat_log
ioengine=rbd
clientname=admin
pool=test
rbdname=test
invalidate=0    # mandatory
#rw=read
#rw=randwrite
#rw=write
rw=randread
bs=4K
direct=1
numjobs=4
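# note: iops scale from ~18000 (numjobs=1) to ~29000 (numjobs=4), as mentioned above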
group_reporting=1
size=10G

[rbd_iodepth32]
iodepth=32



ceph.conf
-
debug lockdep = 0/0
debug context = 0/0
debug crush = 0/0
debug buffer = 0/0
debug timer = 0/0
debug journaler = 0/0
debug osd = 0/0
debug optracker = 0/0
debug objclass = 0/0
debug filestore = 0/0
debug journal = 0/0
debug ms = 0/0
debug monc = 0/0
debug tp = 0/0
debug auth = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug perfcounter = 0/0
debug asok = 0/0
debug throttle = 0/0

osd_op_num_threads_per_shard = 2
osd_op_num_shards = 25
filestore_fd_cache_size = 64
filestore_fd_cache_shards = 32

 ms_nocrc = true
 cephx sign messages = false
 cephx require signatures = false

 ms_dispatch_throttle_bytes = 0
 throttler_perf_counter = false

[osd]
 osd_client_message_size_cap = 0
 osd_client_message_cap = 0
 osd_enable_op_tracker = false
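
Since the two runs at the top of this mail differ only in the client-side
rbd_cache option, here is a minimal, illustrative sketch (assuming the
standard librados/librbd C++ API; pool and image names are taken from the
fio job above) of how a test client can pin that option without editing
ceph.conf:

#include <rados/librados.hpp>
#include <rbd/librbd.hpp>

int main() {
  librados::Rados cluster;
  cluster.init("admin");                           // clientname=admin, as in the fio job
  cluster.conf_read_file("/etc/ceph/ceph.conf");
  cluster.conf_set("rbd_cache", "false");          // the option the two runs differ in
  cluster.conf_set("rbd_cache_writethrough_until_flush", "false");
  if (cluster.connect() < 0)
    return 1;

  librados::IoCtx io;
  cluster.ioctx_create("test", io);                // pool=test

  librbd::RBD rbd;
  librbd::Image image;
  rbd.open(io, image, "test");                     // rbdname=test

  ceph::bufferlist bl;
  image.read(0, 4096, bl);                         // one 4K read; fio issues many of these

  // destructors close the image, ioctx and cluster connection
  return 0;
}

(Sketch only: error handling is omitted, and it needs -lrados -lrbd to link.)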


- Mail original - 

De: "Alexandre DERUMIER"  
À: "Somnath Roy"  
Cc: "Sage Weil" , "Josh Durgin" , 
ceph-devel@vger.kernel.org, "Haomai Wang"  
Envoyé: Vendredi 19 Septembre 2014 13:30:24 
Objet: Re: severe librbd performance degradation in Giant 

>> with rbd_cache=true , I got around 6iops (and I don't see any network 
>> traffic) 
>> 
>>So maybe there is a bug in fio ? 
>>maybe this is related to: 

Oh, sorry, this was my fault, I didn't fill the rbd with data before doing the 
bench 

Now the results are (for 1 osd) 

firefly 
-- 
bw=37460KB/s, iops=9364 

giant 
- 
bw=32741KB/s, iops=8185 


So, a little regression 

(the results are equal with rbd_cache=true|false) 


I'll try to compare with more osds 

- Mail original - 

De: "Alexandre DERUMIER"  
À: "Somnath Roy"  
Cc: "Sage Weil" , "Josh Durgin" , 
ceph-devel@vger.kernel.org, "Haomai Wang"  
Envoyé: Vendredi 19 Septembre 2014 12:09:41 
Objet: Re: severe librbd performance degradation in Giant 

>>What tool are you using ? I used fio rbd. 

fio rbd too 


[global] 
ioengine=rbd 
clientname=admin 
pool=test 
rbdname=test 
invalidate=0 
#rw=read 
#rw=randwrite 
#rw=write 
rw=randread 
bs=4k 
direct=1 
numjobs=2 
group_reporting=1 
size=10G 

[rbd_iodepth32] 
iodepth=32 



I just noticed something strange 

with rbd_cache=true , I got around 6iops (and I don't see any network 
traffic) 

So maybe there is a bug in fio ? 
maybe this is related to: 


http://tracker.ceph.com/issues/9391 
"fio rbd driver rewrites same blocks" 

- Mail original - 

De: "Somnath Roy"  
À: "Alexandre DERUMIER" , "Haomai Wang" 
 
Cc: "Sage Weil" , "Josh Durgin" , 
ceph-devel@vger.kernel.org 
Envoyé: Jeudi 18 Septembre 2014 20:02:49 
Objet: RE: severe librbd performance degradation in Giant 

Alexandre, 
What tool are you using ? I used fio rbd. 

Also, I hope you have Giant package installed in the client side as well and 
rbd_cache =true is set on the client conf file. 
FYI, firefly librbd + librados and Giant cluster will work seamlessly and I had 
to make sure fio rbd is really loading giant librbd (if you have multiple 
copies around , which was in my case) for reproducing it. 

Thanks & Regards 
Somnath 

-Original Message- 
From: Alexandre DERUMIER [mailto:aderum...@odiso.com] 
Sent: Thursday, September 18, 2014 2:49 AM 
To: Haomai Wang 
Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org; Somnath Roy 
Subject: Re: severe librbd performance degradation in Giant 

>>According http://tracker.ceph.com/issues/9513, do you mean that rbd 
>>cache will make 10x performance degradation for random read? 

Hi, on my side, I don't see any performance degradation on read (seq or rand) 
with or without. 

firefly : around 12000iops (with or without rbd_cache) giant : around 12000iops 
(with or without rbd_cache) 

(and I can reach around 2-3 iops on giant with disabling optracker). 


rbd_cache only improves write performance for me (4k blocks) 



- Mail original - 

De: "Haomai Wang"  
À: "Somnath Roy"  
Cc: "Sage Weil" , "Josh Durgin" , 
ceph-devel@vger.kernel.org 
Envoyé: Jeudi 18 Septembre 2014 04:27:56 
Objet: Re: severe librbd performance degradation in Giant 

According http://tracker.ceph.com/issues/9513, do you mean that rbd cache will 
make 10x performance degradation for random read? 

On Thu, Sep 18, 2014 at 7:44 AM, Somnath Roy  wrote:

Re: severe librbd performance degradation in Giant

2014-09-19 Thread Alexandre DERUMIER
>> with rbd_cache=true , I got around 6iops (and I don't see any network 
>> traffic) 
>>
>>So maybe there is a bug in fio ? 
>>maybe this is related to: 

Oh, sorry, this was my fault, I didn't fill the rbd with data before doing the 
bench

Now the results are (for 1 osd)

firefly
--
 bw=37460KB/s, iops=9364

giant
-
 bw=32741KB/s, iops=8185


So, a little regression

(the results are equal with rbd_cache=true|false)


I'll try to compare with more osds
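
For anyone reproducing this, a rough sketch of prefilling the test image so
that random reads hit allocated objects rather than holes (assuming the
librbd C++ API; an equivalent sequential-write pass with fio works just as
well):

#include <rados/librados.hpp>
#include <rbd/librbd.hpp>
#include <algorithm>
#include <cstdint>
#include <string>

// Write the whole image once with dummy data before running the random-read job.
int prefill(librbd::Image &image) {
  uint64_t size = 0;
  image.size(&size);

  const size_t chunk = 4 * 1024 * 1024;       // 4M writes
  ceph::bufferlist bl;
  bl.append(std::string(chunk, 'x'));         // dummy payload

  for (uint64_t off = 0; off < size; off += chunk) {
    size_t len = std::min<uint64_t>(chunk, size - off);
    if (image.write(off, len, bl) < 0)        // sketch only: no retry/error detail
      return -1;
  }
  return 0;
}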

- Mail original - 

De: "Alexandre DERUMIER"  
À: "Somnath Roy"  
Cc: "Sage Weil" , "Josh Durgin" , 
ceph-devel@vger.kernel.org, "Haomai Wang"  
Envoyé: Vendredi 19 Septembre 2014 12:09:41 
Objet: Re: severe librbd performance degradation in Giant 

>>What tool are you using ? I used fio rbd. 

fio rbd too 


[global] 
ioengine=rbd 
clientname=admin 
pool=test 
rbdname=test 
invalidate=0 
#rw=read 
#rw=randwrite 
#rw=write 
rw=randread 
bs=4k 
direct=1 
numjobs=2 
group_reporting=1 
size=10G 

[rbd_iodepth32] 
iodepth=32 



I just noticed something strange 

with rbd_cache=true , I got around 6iops (and I don't see any network 
traffic) 

So maybe there is a bug in fio ? 
maybe this is related to: 


http://tracker.ceph.com/issues/9391 
"fio rbd driver rewrites same blocks" 

- Mail original - 

De: "Somnath Roy"  
À: "Alexandre DERUMIER" , "Haomai Wang" 
 
Cc: "Sage Weil" , "Josh Durgin" , 
ceph-devel@vger.kernel.org 
Envoyé: Jeudi 18 Septembre 2014 20:02:49 
Objet: RE: severe librbd performance degradation in Giant 

Alexandre, 
What tool are you using ? I used fio rbd. 

Also, I hope you have Giant package installed in the client side as well and 
rbd_cache =true is set on the client conf file. 
FYI, firefly librbd + librados and Giant cluster will work seamlessly and I had 
to make sure fio rbd is really loading giant librbd (if you have multiple 
copies around , which was in my case) for reproducing it. 

Thanks & Regards 
Somnath 

-Original Message- 
From: Alexandre DERUMIER [mailto:aderum...@odiso.com] 
Sent: Thursday, September 18, 2014 2:49 AM 
To: Haomai Wang 
Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org; Somnath Roy 
Subject: Re: severe librbd performance degradation in Giant 

>>According http://tracker.ceph.com/issues/9513, do you mean that rbd 
>>cache will make 10x performance degradation for random read? 

Hi, on my side, I don't see any performance degradation on read (seq or rand) 
with or without. 

firefly : around 12000iops (with or without rbd_cache) giant : around 12000iops 
(with or without rbd_cache) 

(and I can reach around 2-3 iops on giant with disabling optracker). 


rbd_cache only improves write performance for me (4k blocks) 



- Mail original - 

De: "Haomai Wang"  
À: "Somnath Roy"  
Cc: "Sage Weil" , "Josh Durgin" , 
ceph-devel@vger.kernel.org 
Envoyé: Jeudi 18 Septembre 2014 04:27:56 
Objet: Re: severe librbd performance degradation in Giant 

According http://tracker.ceph.com/issues/9513, do you mean that rbd cache will 
make 10x performance degradation for random read? 

On Thu, Sep 18, 2014 at 7:44 AM, Somnath Roy  wrote: 
> Josh/Sage, 
> I should mention that even after turning off rbd cache I am getting ~20% 
> degradation over Firefly. 
> 
> Thanks & Regards 
> Somnath 
> 
> -Original Message- 
> From: Somnath Roy 
> Sent: Wednesday, September 17, 2014 2:44 PM 
> To: Sage Weil 
> Cc: Josh Durgin; ceph-devel@vger.kernel.org 
> Subject: RE: severe librbd performance degradation in Giant 
> 
> Created a tracker for this. 
> 
> http://tracker.ceph.com/issues/9513 
> 
> Thanks & Regards 
> Somnath 
> 
> -Original Message- 
> From: ceph-devel-ow...@vger.kernel.org 
> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy 
> Sent: Wednesday, September 17, 2014 2:39 PM 
> To: Sage Weil 
> Cc: Josh Durgin; ceph-devel@vger.kernel.org 
> Subject: RE: severe librbd performance degradation in Giant 
> 
> Sage, 
> It's a 4K random read. 
> 
> Thanks & Regards 
> Somnath 
> 
> -Original Message- 
> From: Sage Weil [mailto:sw...@redhat.com] 
> Sent: Wednesday, September 17, 2014 2:36 PM 
> To: Somnath Roy 
> Cc: Josh Durgin; ceph-devel@vger.kernel.org 
> Subject: RE: severe librbd performance degradation in Giant 
> 
> What was the io pattern? Sequential or random? For random a slowdown makes 
> sense (tho maybe not 10x!) but not for sequential 
> 
> s 
> 
> On Wed, 17 Sep 2014, Somnath Roy wrote: 
> 
>> I set the following in the client side /etc/ceph/ceph.conf where I am 
>> running fio rbd. 
>> 
>> rbd_cache_writethrough_until_flush = false 
>> 
>> But, no difference. BTW, I am doing Random read, not write. Still this 
>> setting applies ? 
>> 
>> Next, I tried to tweak the rbd_cache setting to false and I *got back* the 
>> old performance. Now, it is similar to firefly throughput ! 
>> 
>> So, looks like rbd_cache=true was the culprit. 
>> 
>> Thanks Josh ! 
>> 
>> Regards 
>> Somnath 
>> 
>> -Original Message- 
>

Re: severe librbd performance degradation in Giant

2014-09-19 Thread Alexandre DERUMIER
>>What tool are you using ? I used fio rbd. 

fio rbd too


[global]
ioengine=rbd
clientname=admin
pool=test
rbdname=test
invalidate=0 
#rw=read
#rw=randwrite
#rw=write
rw=randread
bs=4k
direct=1
numjobs=2
group_reporting=1
size=10G

[rbd_iodepth32]
iodepth=32



I just noticed something strange

with rbd_cache=true , I got around 6iops  (and I don't see any network 
traffic)

So maybe there is a bug in fio ?
maybe this is related to:


http://tracker.ceph.com/issues/9391
"fio rbd driver rewrites same blocks"

- Mail original - 

De: "Somnath Roy"  
À: "Alexandre DERUMIER" , "Haomai Wang" 
 
Cc: "Sage Weil" , "Josh Durgin" , 
ceph-devel@vger.kernel.org 
Envoyé: Jeudi 18 Septembre 2014 20:02:49 
Objet: RE: severe librbd performance degradation in Giant 

Alexandre, 
What tool are you using ? I used fio rbd. 

Also, I hope you have Giant package installed in the client side as well and 
rbd_cache =true is set on the client conf file. 
FYI, firefly librbd + librados and Giant cluster will work seamlessly and I had 
to make sure fio rbd is really loading giant librbd (if you have multiple 
copies around , which was in my case) for reproducing it. 

Thanks & Regards 
Somnath 

-Original Message- 
From: Alexandre DERUMIER [mailto:aderum...@odiso.com] 
Sent: Thursday, September 18, 2014 2:49 AM 
To: Haomai Wang 
Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org; Somnath Roy 
Subject: Re: severe librbd performance degradation in Giant 

>>According http://tracker.ceph.com/issues/9513, do you mean that rbd 
>>cache will make 10x performance degradation for random read? 

Hi, on my side, I don't see any performance degradation on read (seq or rand) 
with or without. 

firefly : around 12000iops (with or without rbd_cache) giant : around 12000iops 
(with or without rbd_cache) 

(and I can reach around 2-3 iops on giant with disabling optracker). 


rbd_cache only improves write performance for me (4k blocks) 



- Mail original - 

De: "Haomai Wang"  
À: "Somnath Roy"  
Cc: "Sage Weil" , "Josh Durgin" , 
ceph-devel@vger.kernel.org 
Envoyé: Jeudi 18 Septembre 2014 04:27:56 
Objet: Re: severe librbd performance degradation in Giant 

According http://tracker.ceph.com/issues/9513, do you mean that rbd cache will 
make 10x performance degradation for random read? 

On Thu, Sep 18, 2014 at 7:44 AM, Somnath Roy  wrote: 
> Josh/Sage, 
> I should mention that even after turning off rbd cache I am getting ~20% 
> degradation over Firefly. 
> 
> Thanks & Regards 
> Somnath 
> 
> -Original Message- 
> From: Somnath Roy 
> Sent: Wednesday, September 17, 2014 2:44 PM 
> To: Sage Weil 
> Cc: Josh Durgin; ceph-devel@vger.kernel.org 
> Subject: RE: severe librbd performance degradation in Giant 
> 
> Created a tracker for this. 
> 
> http://tracker.ceph.com/issues/9513 
> 
> Thanks & Regards 
> Somnath 
> 
> -Original Message- 
> From: ceph-devel-ow...@vger.kernel.org 
> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy 
> Sent: Wednesday, September 17, 2014 2:39 PM 
> To: Sage Weil 
> Cc: Josh Durgin; ceph-devel@vger.kernel.org 
> Subject: RE: severe librbd performance degradation in Giant 
> 
> Sage, 
> It's a 4K random read. 
> 
> Thanks & Regards 
> Somnath 
> 
> -Original Message- 
> From: Sage Weil [mailto:sw...@redhat.com] 
> Sent: Wednesday, September 17, 2014 2:36 PM 
> To: Somnath Roy 
> Cc: Josh Durgin; ceph-devel@vger.kernel.org 
> Subject: RE: severe librbd performance degradation in Giant 
> 
> What was the io pattern? Sequential or random? For random a slowdown makes 
> sense (tho maybe not 10x!) but not for sequential 
> 
> s 
> 
> On Wed, 17 Sep 2014, Somnath Roy wrote: 
> 
>> I set the following in the client side /etc/ceph/ceph.conf where I am 
>> running fio rbd. 
>> 
>> rbd_cache_writethrough_until_flush = false 
>> 
>> But, no difference. BTW, I am doing Random read, not write. Still this 
>> setting applies ? 
>> 
>> Next, I tried to tweak the rbd_cache setting to false and I *got back* the 
>> old performance. Now, it is similar to firefly throughput ! 
>> 
>> So, looks like rbd_cache=true was the culprit. 
>> 
>> Thanks Josh ! 
>> 
>> Regards 
>> Somnath 
>> 
>> -Original Message- 
>> From: Josh Durgin [mailto:josh.dur...@inktank.com] 
>> Sent: Wednesday, September 17, 2014 2:20 PM 
>> To: Somnath Roy; ceph-devel@vger.kernel.org 
>> Subject: Re: severe librbd performance degradation in Giant 
>> 
>> On 09/17/2014 01:55 PM, Somnath Roy wrote: 
>> > Hi Sage, 
>> > We are experiencing severe librbd performance degradation in Giant over 
>> > firefly release. Here is the experiment we did to isolate it as a librbd 
>> > problem. 
>> > 
>> > 1. Single OSD is running latest Giant and client is running fio rbd on top 
>> > of firefly based librbd/librados. For one client it is giving ~11-12K iops 
>> > (4K RR). 
>> > 2. Single OSD is running Giant and client is running fio rbd on top of 
>> > Giant based librbd/librados. For 

Re: [PATCH v2 2/3] ec: use 32-byte aligned buffers

2014-09-19 Thread Loic Dachary
Hi Janne,

This looks good ! The 32 byte aligned buffer applies to the diff related to 
buffer.h though, could you update the title ? I tend to prefer erasure-code 
over ec : it is easier to grep / search ;-)

Cheers

On 18/09/2014 12:33, Janne Grunau wrote:
> Requiring page aligned buffers and realigning the input if necessary
> creates measurable overhead. ceph_erasure_code_benchmark is ~30% faster
> with this change for technique=reed_sol_van,k=2,m=1.
> 
> Also prevents a misaligned buffer when bufferlist::c_str(bufferlist)
> has to allocate a new buffer to provide a contiguous one. See bug #9408
> 
> Signed-off-by: Janne Grunau 
> ---
>  src/erasure-code/ErasureCode.cc | 57 
> -
>  src/erasure-code/ErasureCode.h  |  3 ++-
>  2 files changed, 41 insertions(+), 19 deletions(-)
> 
> diff --git a/src/erasure-code/ErasureCode.cc b/src/erasure-code/ErasureCode.cc
> index 5953f49..7aa5235 100644
> --- a/src/erasure-code/ErasureCode.cc
> +++ b/src/erasure-code/ErasureCode.cc
> @@ -54,22 +54,49 @@ int ErasureCode::minimum_to_decode_with_cost(const 
> set &want_to_read,
>  }
>  
>  int ErasureCode::encode_prepare(const bufferlist &raw,
> -bufferlist *prepared) const
> +map &encoded) const
>  {
>unsigned int k = get_data_chunk_count();
>unsigned int m = get_chunk_count() - k;
>unsigned blocksize = get_chunk_size(raw.length());
> -  unsigned padded_length = blocksize * k;
> -  *prepared = raw;
> -  if (padded_length - raw.length() > 0) {
> -bufferptr pad(padded_length - raw.length());
> -pad.zero();
> -prepared->push_back(pad);
> +  unsigned pad_len = blocksize * k - raw.length();
> +  unsigned padded_chunks = k - raw.length() / blocksize;
> +  bufferlist prepared = raw;
> +
> +  if (!prepared.is_aligned()) {
> +// splice padded chunks off to make the rebuild faster
> +if (padded_chunks)
> +  prepared.splice((k - padded_chunks) * blocksize,
> +  padded_chunks * blocksize - pad_len);
> +prepared.rebuild_aligned();
> +  }
> +
> +  for (unsigned int i = 0; i < k - padded_chunks; i++) {
> +int chunk_index = chunk_mapping.size() > 0 ? chunk_mapping[i] : i;
> +bufferlist &chunk = encoded[chunk_index];
> +chunk.substr_of(prepared, i * blocksize, blocksize);
> +  }
> +  if (padded_chunks) {
> +unsigned remainder = raw.length() - (k - padded_chunks) * blocksize;
> +bufferlist padded;
> +bufferptr buf(buffer::create_aligned(padded_chunks * blocksize));
> +
> +raw.copy((k - padded_chunks) * blocksize, remainder, buf.c_str());
> +buf.zero(remainder, pad_len);
> +padded.push_back(buf);
> +
> +for (unsigned int i = k - padded_chunks; i < k; i++) {
> +  int chunk_index = chunk_mapping.size() > 0 ? chunk_mapping[i] : i;
> +  bufferlist &chunk = encoded[chunk_index];
> +  chunk.substr_of(padded, (i - (k - padded_chunks)) * blocksize, 
> blocksize);
> +}
> +  }
> +  for (unsigned int i = k; i < k + m; i++) {
> +int chunk_index = chunk_mapping.size() > 0 ? chunk_mapping[i] : i;
> +bufferlist &chunk = encoded[chunk_index];
> +chunk.push_back(buffer::create_aligned(blocksize));
>}
> -  unsigned coding_length = blocksize * m;
> -  bufferptr coding(buffer::create_page_aligned(coding_length));
> -  prepared->push_back(coding);
> -  prepared->rebuild_page_aligned();
> +
>return 0;
>  }
>  
> @@ -80,15 +107,9 @@ int ErasureCode::encode(const set &want_to_encode,
>unsigned int k = get_data_chunk_count();
>unsigned int m = get_chunk_count() - k;
>bufferlist out;
> -  int err = encode_prepare(in, &out);
> +  int err = encode_prepare(in, *encoded);
>if (err)
>  return err;
> -  unsigned blocksize = get_chunk_size(in.length());
> -  for (unsigned int i = 0; i < k + m; i++) {
> -int chunk_index = chunk_mapping.size() > 0 ? chunk_mapping[i] : i;
> -bufferlist &chunk = (*encoded)[chunk_index];
> -chunk.substr_of(out, i * blocksize, blocksize);
> -  }
>encode_chunks(want_to_encode, encoded);
>for (unsigned int i = 0; i < k + m; i++) {
>  if (want_to_encode.count(i) == 0)
> diff --git a/src/erasure-code/ErasureCode.h b/src/erasure-code/ErasureCode.h
> index 7aaea95..62aa383 100644
> --- a/src/erasure-code/ErasureCode.h
> +++ b/src/erasure-code/ErasureCode.h
> @@ -46,7 +46,8 @@ namespace ceph {
>  const map &available,
>  set *minimum);
>  
> -int encode_prepare(const bufferlist &raw, bufferlist *prepared) const;
> +int encode_prepare(const bufferlist &raw,
> +   map &encoded) const;
>  
>  virtual int encode(const set &want_to_encode,
> const bufferlist &in,
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



signature.asc
Description: OpenPGP digital signature
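
To make the padding arithmetic in encode_prepare() above easier to follow,
here is a standalone sketch of the same calculation with made-up example
numbers (plain C++, no Ceph types):

#include <cstdio>

int main() {
  unsigned k = 2;               // data chunks, as in technique=reed_sol_van,k=2,m=1
  unsigned blocksize = 4096;    // example value of get_chunk_size(raw.length())
  unsigned raw_len = 6000;      // example input length

  // The same quantities the patch computes:
  unsigned pad_len = blocksize * k - raw_len;        // 2192 zero bytes of padding
  unsigned padded_chunks = k - raw_len / blocksize;  // 1 trailing chunk needs padding

  printf("pad_len=%u padded_chunks=%u\n", pad_len, padded_chunks);
  return 0;
}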


why ZFS on ceph is unstable?

2014-09-19 Thread Nicheal
Hi developers,

It is mentioned in the source code that OPTION(filestore_zfs_snap,
OPT_BOOL, false) // zfsonlinux is still unstable. So if we turn on
filestore_zfs_snap and neglect the journal like btrfs, will it be unstable?

As mentioned in the "ZFS on Linux" community, it is stable enough
to run a ZFS root filesystem on a GNU/Linux installation for your
workstation as something to play around with. It is copy-on-write,
supports compression, deduplication, file atomicity, off-disk caching,
and much more (encryption is not supported). So it seems that all
features are supported except for encryption.
Thus, I am puzzled by what "unstable" means here: is ZFS on Linux
unstable itself, or is it already stable on Linux but still unstable
when used as the filesystem backing the Ceph FileStore?

If so, what would happen if we used it: data loss or frequent crashes?

Nicheal
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: v2 aligned buffer changes for erasure codes

2014-09-19 Thread Loic Dachary
Hi Andreas,

The per_chunk_alignment addresses a backward compatible change in the way chunk 
sizes are calculated. The problem was that the initial calculation led to oversized 
chunks. The long explanation is at 
https://github.com/ceph/ceph/commit/c7daaaf5e63d0bd1d444385e62611fe276f6ce29

Please let me know if you see something wrong :-)

Cheers

On 18/09/2014 14:34, Andreas Joachim Peters wrote:
> Hi Janne/Loic, 
> there is more confusion, at least on my side ...
> 
> I had now a look at the jerasure plug-in and I am now slightly confused why 
> you have two ways to return in get_alignment ... one is as I assume and 
> another one is "per_chunk_alignment" ... what should the function return Loic?
> 
> Cheers Andreas.
> 
> From: ceph-devel-ow...@vger.kernel.org [ceph-devel-ow...@vger.kernel.org] on 
> behalf of Andreas Joachim Peters [andreas.joachim.pet...@cern.ch]
> Sent: 18 September 2014 14:18
> To: Janne Grunau; ceph-devel@vger.kernel.org
> Subject: RE: v2 aligned buffer changes for erasure codes
> 
> Hi Janne,
> => (src/erasure-code/isa/README claims it needs 16*k byte aligned buffers
> 
> I should update the README since it is misleading ... it should say 8*k or 
> 16*k byte aligned chunk size depending on the compiler/platform used, it is 
> not the alignment of the allocated buffer addresses. The get_alignment 
> function in the plug-in is used to compute the chunk size for the encoding (as I 
> said not the start address alignment).
> 
> If you pass k buffers for decoding each buffer should be aligned at least to 
> 16 or as you pointed out better 32 bytes.
> 
> For encoding there is normally a single buffer split 'virtually' into k 
> pieces. To make all pieces start at an aligned address one needs to align 
> the chunk size to e.g. 16*k. For the best possible performance on all 
> platforms we should change the get_alignment function in the ISA plug-in to 
> return 32*k if there are no other objections ?!?!
> 
> Cheers Andreas.
> 
> From: ceph-devel-ow...@vger.kernel.org [ceph-devel-ow...@vger.kernel.org] on 
> behalf of Janne Grunau [j...@jannau.net]
> Sent: 18 September 2014 12:33
> To: ceph-devel@vger.kernel.org
> Subject: v2 aligned buffer changes for erasure codes
> 
> Hi,
> 
> following a is an updated patchset. It passes now make check in src
> 
> It has the following changes:
>  * use 32-byte alignment since the isa plugin uses AVX2
>(src/erasure-code/isa/README claims it needs 16*k byte aligned buffers
>but I can't see a reason why it would need more than 32 bytes)
>  * ErasureCode::encode_prepare() handles more than one chunk with padding
> 
> cheers
> 
> Janne
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



signature.asc
Description: OpenPGP digital signature
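
As an aside, a small sketch of the chunk-size rounding Andreas describes
(illustrative code, not the actual plugin API): if get_alignment() returns
32*k and the padded object length is rounded up to a multiple of it, every
one of the k pieces of a 32-byte aligned contiguous buffer also starts on a
32-byte boundary.

#include <cassert>
#include <cstdint>

// Round v up to a multiple of a.
static uint64_t align_up(uint64_t v, uint64_t a) {
  return (v + a - 1) / a * a;
}

int main() {
  const unsigned k = 2;             // data chunks
  const unsigned alignment = 32;    // AVX2-friendly alignment discussed above
  uint64_t object_size = 6000;      // example payload length

  // Pad the object to a multiple of 32*k, then split it into k equal chunks:
  // the chunk size is a multiple of 32, so each piece stays 32-byte aligned.
  uint64_t padded = align_up(object_size, (uint64_t)alignment * k);  // 6016
  uint64_t chunk  = padded / k;                                      // 3008

  assert(chunk % alignment == 0);
  (void)chunk;
  return 0;
}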


Re: snap_trimming + backfilling is inefficient with many purged_snaps

2014-09-19 Thread Dan Van Der Ster
> On 19 Sep 2014, at 08:12, Florian Haas  wrote:
> 
> On Fri, Sep 19, 2014 at 12:27 AM, Sage Weil  wrote:
>> On Fri, 19 Sep 2014, Florian Haas wrote:
>>> Hi Sage,
>>> 
>>> was the off-list reply intentional?
>> 
>> Whoops!  Nope :)
>> 
>>> On Thu, Sep 18, 2014 at 11:47 PM, Sage Weil  wrote:
> So, disaster is a pretty good description. Would anyone from the core
> team like to suggest another course of action or workaround, or are
> Dan and I generally on the right track to make the best out of a
> pretty bad situation?
 
 The short term fix would probably be to just prevent backfill for the time
 being until the bug is fixed.
>>> 
>>> As in, osd max backfills = 0?
>> 
>> Yeah :)
>> 
>> Just managed to reproduce the problem...
>> 
>> sage
> 
> Saw the wip branch. Color me freakishly impressed on the turnaround. :) 
> Thanks!

Indeed :) Thanks Sage!
wip-9487-dumpling fixes the problem on my test cluster. Trying in prod now…
Cheers, 
Dan