Re: [ceph-users] any recommendation of using EnhanceIO?

2015-09-01 Thread Nick Fisk




> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Wang, Zhiqiang
> Sent: 01 September 2015 09:48
> To: Nick Fisk ; 'Samuel Just' 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> > -Original Message-
> > From: Nick Fisk [mailto:n...@fisk.me.uk]
> > Sent: Tuesday, September 1, 2015 4:37 PM
> > To: Wang, Zhiqiang; 'Samuel Just'
> > Cc: ceph-users@lists.ceph.com
> > Subject: RE: [ceph-users] any recommendation of using EnhanceIO?
> >
> >
> >
> >
> >
> > > -Original Message-
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > Behalf Of Wang, Zhiqiang
> > > Sent: 01 September 2015 09:18
> > > To: Nick Fisk ; 'Samuel Just' 
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > >
> > > > -Original Message-
> > > > From: Nick Fisk [mailto:n...@fisk.me.uk]
> > > > Sent: Tuesday, September 1, 2015 3:55 PM
> > > > To: Wang, Zhiqiang; 'Nick Fisk'; 'Samuel Just'
> > > > Cc: ceph-users@lists.ceph.com
> > > > Subject: RE: [ceph-users] any recommendation of using EnhanceIO?
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > > > Behalf Of Wang, Zhiqiang
> > > > > Sent: 01 September 2015 02:48
> > > > > To: Nick Fisk ; 'Samuel Just'
> > > > > 
> > > > > Cc: ceph-users@lists.ceph.com
> > > > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > > > >
> > > > > > -Original Message-
> > > > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > > > > Behalf Of Nick Fisk
> > > > > > Sent: Wednesday, August 19, 2015 5:25 AM
> > > > > > To: 'Samuel Just'
> > > > > > Cc: ceph-users@lists.ceph.com
> > > > > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > > > > >
> > > > > > Hi Sam,
> > > > > >
> > > > > > > -Original Message-
> > > > > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com]
> > > > > > > On Behalf Of Samuel Just
> > > > > > > Sent: 18 August 2015 21:38
> > > > > > > To: Nick Fisk 
> > > > > > > Cc: ceph-users@lists.ceph.com
> > > > > > > Subject: Re: [ceph-users] any recommendation of using
> EnhanceIO?
> > > > > > >
> > > > > > > 1.  We've kicked this around a bit.  What kind of failure
> > > > > > > semantics would
> > > > > > you
> > > > > > > be comfortable with here (that is, what would be reasonable
> > > > > > > behavior if
> > > > > > the
> > > > > > > client side cache fails)?
> > > > > >
> > > > > > I would either expect to provide the cache with a redundant
> > > > > > block device (ie
> > > > > > RAID1 SSD's) or the cache to allow itself to be configured to
> > > > > > mirror across two SSD's. Of course single SSD's can be used if
> > > > > > the user accepts
> > > > the
> > > > > risk.
> > > > > > If the cache did the mirroring then you could do fancy stuff
> > > > > > like mirror the writes, but leave the read cache blocks as
> > > > > > single copies to increase the cache capacity.
> > > > > >
> > > > > > In either case although an outage is undesirable, its only
> > > > > > data loss which would be unacceptable, which would hopefully
> > > > > > be avoided by the mirroring. As part of this, it would need to
> > > > > > be a way to make sure a "dirty" RBD can't be accessed unless
> > > > > > the corresponding cache is also
> > > > > attached.
> > > > > >
> > > > > > I guess as it caching the RBD and not the pool or entire
> > > > > >

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-09-01 Thread Wang, Zhiqiang
> -Original Message-
> From: Nick Fisk [mailto:n...@fisk.me.uk]
> Sent: Tuesday, September 1, 2015 4:37 PM
> To: Wang, Zhiqiang; 'Samuel Just'
> Cc: ceph-users@lists.ceph.com
> Subject: RE: [ceph-users] any recommendation of using EnhanceIO?
> 
> 
> 
> 
> 
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Wang, Zhiqiang
> > Sent: 01 September 2015 09:18
> > To: Nick Fisk ; 'Samuel Just' 
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> >
> > > -Original Message-
> > > From: Nick Fisk [mailto:n...@fisk.me.uk]
> > > Sent: Tuesday, September 1, 2015 3:55 PM
> > > To: Wang, Zhiqiang; 'Nick Fisk'; 'Samuel Just'
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: RE: [ceph-users] any recommendation of using EnhanceIO?
> > >
> > >
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > > Behalf Of Wang, Zhiqiang
> > > > Sent: 01 September 2015 02:48
> > > > To: Nick Fisk ; 'Samuel Just' 
> > > > Cc: ceph-users@lists.ceph.com
> > > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > > >
> > > > > -Original Message-
> > > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > > > Behalf Of Nick Fisk
> > > > > Sent: Wednesday, August 19, 2015 5:25 AM
> > > > > To: 'Samuel Just'
> > > > > Cc: ceph-users@lists.ceph.com
> > > > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > > > >
> > > > > Hi Sam,
> > > > >
> > > > > > -Original Message-
> > > > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > > > > Behalf Of Samuel Just
> > > > > > Sent: 18 August 2015 21:38
> > > > > > To: Nick Fisk 
> > > > > > Cc: ceph-users@lists.ceph.com
> > > > > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > > > > >
> > > > > > 1.  We've kicked this around a bit.  What kind of failure
> > > > > > semantics would
> > > > > you
> > > > > > be comfortable with here (that is, what would be reasonable
> > > > > > behavior if
> > > > > the
> > > > > > client side cache fails)?
> > > > >
> > > > > I would either expect to provide the cache with a redundant
> > > > > block device (ie
> > > > > RAID1 SSD's) or the cache to allow itself to be configured to
> > > > > mirror across two SSD's. Of course single SSD's can be used if
> > > > > the user accepts
> > > the
> > > > risk.
> > > > > If the cache did the mirroring then you could do fancy stuff
> > > > > like mirror the writes, but leave the read cache blocks as
> > > > > single copies to increase the cache capacity.
> > > > >
> > > > > In either case although an outage is undesirable, its only data
> > > > > loss which would be unacceptable, which would hopefully be
> > > > > avoided by the mirroring. As part of this, it would need to be a
> > > > > way to make sure a "dirty" RBD can't be accessed unless the
> > > > > corresponding cache is also
> > > > attached.
> > > > >
> > > > > I guess as it caching the RBD and not the pool or entire
> > > > > cluster, the cache only needs to match the failure requirements
> > > > > of the application
> > > its
> > > > caching.
> > > > > If I need to cache a RBD that is on  a single server, there is
> > > > > no requirement to make the cache redundant across
> > > > racks/PDU's/servers...etc.
> > > > >
> > > > > I hope I've answered your question?
> > > > >
> > > > >
> > > > > > 2. We've got a branch which should merge soon (tomorrow
> > > > > > probably) which actually does allow writes to be proxied, so
> > > > > > that should alleviate some of these pain point

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-09-01 Thread Nick Fisk




> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Wang, Zhiqiang
> Sent: 01 September 2015 09:18
> To: Nick Fisk ; 'Samuel Just' 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> > -Original Message-
> > From: Nick Fisk [mailto:n...@fisk.me.uk]
> > Sent: Tuesday, September 1, 2015 3:55 PM
> > To: Wang, Zhiqiang; 'Nick Fisk'; 'Samuel Just'
> > Cc: ceph-users@lists.ceph.com
> > Subject: RE: [ceph-users] any recommendation of using EnhanceIO?
> >
> >
> >
> >
> >
> > > -Original Message-
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > Behalf Of Wang, Zhiqiang
> > > Sent: 01 September 2015 02:48
> > > To: Nick Fisk ; 'Samuel Just' 
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > >
> > > > -Original Message-
> > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > > Behalf Of Nick Fisk
> > > > Sent: Wednesday, August 19, 2015 5:25 AM
> > > > To: 'Samuel Just'
> > > > Cc: ceph-users@lists.ceph.com
> > > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > > >
> > > > Hi Sam,
> > > >
> > > > > -Original Message-
> > > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > > > Behalf Of Samuel Just
> > > > > Sent: 18 August 2015 21:38
> > > > > To: Nick Fisk 
> > > > > Cc: ceph-users@lists.ceph.com
> > > > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > > > >
> > > > > 1.  We've kicked this around a bit.  What kind of failure
> > > > > semantics would
> > > > you
> > > > > be comfortable with here (that is, what would be reasonable
> > > > > behavior if
> > > > the
> > > > > client side cache fails)?
> > > >
> > > > I would either expect to provide the cache with a redundant block
> > > > device (ie
> > > > RAID1 SSD's) or the cache to allow itself to be configured to
> > > > mirror across two SSD's. Of course single SSD's can be used if the
> > > > user accepts
> > the
> > > risk.
> > > > If the cache did the mirroring then you could do fancy stuff like
> > > > mirror the writes, but leave the read cache blocks as single
> > > > copies to increase the cache capacity.
> > > >
> > > > In either case although an outage is undesirable, its only data
> > > > loss which would be unacceptable, which would hopefully be avoided
> > > > by the mirroring. As part of this, it would need to be a way to
> > > > make sure a "dirty" RBD can't be accessed unless the corresponding
> > > > cache is also
> > > attached.
> > > >
> > > > I guess as it caching the RBD and not the pool or entire cluster,
> > > > the cache only needs to match the failure requirements of the
> > > > application
> > its
> > > caching.
> > > > If I need to cache a RBD that is on  a single server, there is no
> > > > requirement to make the cache redundant across
> > > racks/PDU's/servers...etc.
> > > >
> > > > I hope I've answered your question?
> > > >
> > > >
> > > > > 2. We've got a branch which should merge soon (tomorrow
> > > > > probably) which actually does allow writes to be proxied, so
> > > > > that should alleviate some of these pain points somewhat.  I'm
> > > > > not sure it is clever enough to allow through writefulls for an
> > > > > ec base tier though (but it would be a good
> > > > idea!) -
> > > >
> > > > Excellent news, I shall look forward to testing in the future. I
> > > > did mention the proxy write for write fulls to someone who was
> > > > working on the proxy write code, but I'm not sure if it ever got
followed
> up.
> > >
> > > I think someone here is me. In the current code, for an ec base
> > > tier,
> > writefull
> > > can be proxied to the base.
> >
> > Excellent news. Is

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-09-01 Thread Wang, Zhiqiang
> -Original Message-
> From: Nick Fisk [mailto:n...@fisk.me.uk]
> Sent: Tuesday, September 1, 2015 3:55 PM
> To: Wang, Zhiqiang; 'Nick Fisk'; 'Samuel Just'
> Cc: ceph-users@lists.ceph.com
> Subject: RE: [ceph-users] any recommendation of using EnhanceIO?
> 
> 
> 
> 
> 
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Wang, Zhiqiang
> > Sent: 01 September 2015 02:48
> > To: Nick Fisk ; 'Samuel Just' 
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> >
> > > -Original Message-
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > Behalf Of Nick Fisk
> > > Sent: Wednesday, August 19, 2015 5:25 AM
> > > To: 'Samuel Just'
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > >
> > > Hi Sam,
> > >
> > > > -Original Message-----
> > > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > > Behalf Of Samuel Just
> > > > Sent: 18 August 2015 21:38
> > > > To: Nick Fisk 
> > > > Cc: ceph-users@lists.ceph.com
> > > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > > >
> > > > 1.  We've kicked this around a bit.  What kind of failure
> > > > semantics would
> > > you
> > > > be comfortable with here (that is, what would be reasonable
> > > > behavior if
> > > the
> > > > client side cache fails)?
> > >
> > > I would either expect to provide the cache with a redundant block
> > > device (ie
> > > RAID1 SSD's) or the cache to allow itself to be configured to mirror
> > > across two SSD's. Of course single SSD's can be used if the user
> > > accepts
> the
> > risk.
> > > If the cache did the mirroring then you could do fancy stuff like
> > > mirror the writes, but leave the read cache blocks as single copies
> > > to increase the cache capacity.
> > >
> > > In either case although an outage is undesirable, its only data loss
> > > which would be unacceptable, which would hopefully be avoided by the
> > > mirroring. As part of this, it would need to be a way to make sure a
> > > "dirty" RBD can't be accessed unless the corresponding cache is also
> > attached.
> > >
> > > I guess as it caching the RBD and not the pool or entire cluster,
> > > the cache only needs to match the failure requirements of the
> > > application
> its
> > caching.
> > > If I need to cache a RBD that is on  a single server, there is no
> > > requirement to make the cache redundant across
> > racks/PDU's/servers...etc.
> > >
> > > I hope I've answered your question?
> > >
> > >
> > > > 2. We've got a branch which should merge soon (tomorrow probably)
> > > > which actually does allow writes to be proxied, so that should
> > > > alleviate some of these pain points somewhat.  I'm not sure it is
> > > > clever enough to allow through writefulls for an ec base tier
> > > > though (but it would be a good
> > > idea!) -
> > >
> > > Excellent news, I shall look forward to testing in the future. I did
> > > mention the proxy write for write fulls to someone who was working
> > > on the proxy write code, but I'm not sure if it ever got followed up.
> >
> > I think someone here is me. In the current code, for an ec base tier,
> writefull
> > can be proxied to the base.
> 
> Excellent news. Is this intelligent enough to determine when say a normal 
> write
> IO from a RBD is equal to the underlying object size and then turn this normal
> write effectively into a write full?

I checked the code, and it seems we don't do this right now... Would this be
very helpful? I think we can do this if the answer is yes.
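
For reference, whether a client write lines up with a whole object depends on
the image's object size (its "order"), which can be read off the image itself.
A minimal illustration, assuming a hypothetical image called testimg in the
rbd pool:

    # show the image layout; the "order" line gives the object size
    rbd -p rbd info testimg
    # output includes a line like: order 22 (4096 kB objects)
    # so a 4 MB write aligned on a 4 MB boundary covers exactly one object
    # and could in principle be issued as a writefull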

> 
> >
> > >
> > > > Sam
> > > >
> > > > On Tue, Aug 18, 2015 at 12:48 PM, Nick Fisk  wrote:
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >> -Original Message-
> > > > >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > > >> Behalf Of Mark Nelson
> > > > >> Sent: 18 August 20

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-09-01 Thread Nick Fisk




> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Wang, Zhiqiang
> Sent: 01 September 2015 02:48
> To: Nick Fisk ; 'Samuel Just' 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Nick Fisk
> > Sent: Wednesday, August 19, 2015 5:25 AM
> > To: 'Samuel Just'
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> >
> > Hi Sam,
> >
> > > -Original Message-
> > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > Behalf Of Samuel Just
> > > Sent: 18 August 2015 21:38
> > > To: Nick Fisk 
> > > Cc: ceph-users@lists.ceph.com
> > > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > >
> > > 1.  We've kicked this around a bit.  What kind of failure semantics
> > > would
> > you
> > > be comfortable with here (that is, what would be reasonable behavior
> > > if
> > the
> > > client side cache fails)?
> >
> > I would either expect to provide the cache with a redundant block
> > device (ie
> > RAID1 SSD's) or the cache to allow itself to be configured to mirror
> > across two SSD's. Of course single SSD's can be used if the user accepts
the
> risk.
> > If the cache did the mirroring then you could do fancy stuff like
> > mirror the writes, but leave the read cache blocks as single copies to
> > increase the cache capacity.
> >
> > In either case although an outage is undesirable, its only data loss
> > which would be unacceptable, which would hopefully be avoided by the
> > mirroring. As part of this, it would need to be a way to make sure a
> > "dirty" RBD can't be accessed unless the corresponding cache is also
> attached.
> >
> > I guess as it caching the RBD and not the pool or entire cluster, the
> > cache only needs to match the failure requirements of the application
its
> caching.
> > If I need to cache a RBD that is on  a single server, there is no
> > requirement to make the cache redundant across
> racks/PDU's/servers...etc.
> >
> > I hope I've answered your question?
> >
> >
> > > 2. We've got a branch which should merge soon (tomorrow probably)
> > > which actually does allow writes to be proxied, so that should
> > > alleviate some of these pain points somewhat.  I'm not sure it is
> > > clever enough to allow through writefulls for an ec base tier though
> > > (but it would be a good
> > idea!) -
> >
> > Excellent news, I shall look forward to testing in the future. I did
> > mention the proxy write for write fulls to someone who was working on
> > the proxy write code, but I'm not sure if it ever got followed up.
> 
> I think someone here is me. In the current code, for an ec base tier,
writefull
> can be proxied to the base.

Excellent news. Is this intelligent enough to determine when, say, a normal
write IO from an RBD is equal to the underlying object size, and then turn
that normal write effectively into a writefull?

> 
> >
> > > Sam
> > >
> > > On Tue, Aug 18, 2015 at 12:48 PM, Nick Fisk  wrote:
> > > >
> > > >
> > > >
> > > >
> > > >> -Original Message-
> > > >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > > >> Behalf Of Mark Nelson
> > > >> Sent: 18 August 2015 18:51
> > > >> To: Nick Fisk ; 'Jan Schermer' 
> > > >> Cc: ceph-users@lists.ceph.com
> > > >> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > > >>
> > > >>
> > > >>
> > > >> On 08/18/2015 11:52 AM, Nick Fisk wrote:
> > > >> > 
> > > >> >>>>
> > > >> >>>> Here's kind of how I see the field right now:
> > > >> >>>>
> > > >> >>>> 1) Cache at the client level.  Likely fastest but obvious
> > > >> >>>> issues like
> > > > above.
> > > >> >>>> RAID1 might be an option at increased cost.  Lack of
> > > >> >>>> barriers in some implementations scary.
> > > >> >>>
> >

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-31 Thread Wang, Zhiqiang
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Nick Fisk
> Sent: Wednesday, August 19, 2015 5:25 AM
> To: 'Samuel Just'
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> Hi Sam,
> 
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Samuel Just
> > Sent: 18 August 2015 21:38
> > To: Nick Fisk 
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> >
> > 1.  We've kicked this around a bit.  What kind of failure semantics
> > would
> you
> > be comfortable with here (that is, what would be reasonable behavior
> > if
> the
> > client side cache fails)?
> 
> I would either expect to provide the cache with a redundant block device (ie
> RAID1 SSD's) or the cache to allow itself to be configured to mirror across 
> two
> SSD's. Of course single SSD's can be used if the user accepts the risk.
> If the cache did the mirroring then you could do fancy stuff like mirror the
> writes, but leave the read cache blocks as single copies to increase the cache
> capacity.
> 
> In either case although an outage is undesirable, its only data loss which 
> would
> be unacceptable, which would hopefully be avoided by the mirroring. As part of
> this, it would need to be a way to make sure a "dirty" RBD can't be accessed
> unless the corresponding cache is also attached.
> 
> I guess as it caching the RBD and not the pool or entire cluster, the cache 
> only
> needs to match the failure requirements of the application its caching.
> If I need to cache a RBD that is on  a single server, there is no requirement 
> to
> make the cache redundant across racks/PDU's/servers...etc.
> 
> I hope I've answered your question?
> 
> 
> > 2. We've got a branch which should merge soon (tomorrow probably)
> > which actually does allow writes to be proxied, so that should
> > alleviate some of these pain points somewhat.  I'm not sure it is
> > clever enough to allow through writefulls for an ec base tier though
> > (but it would be a good
> idea!) -
> 
> Excellent news, I shall look forward to testing in the future. I did mention 
> the
> proxy write for write fulls to someone who was working on the proxy write 
> code,
> but I'm not sure if it ever got followed up.

I think that someone is me. In the current code, for an EC base tier, a writefull
can be proxied to the base.
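
For anyone who wants to try the proxied writefull path, a rough sketch of a
writeback cache tier in front of an EC base pool (pool names, PG counts and
thresholds are illustrative only, not recommendations):

    # EC base pool plus a replicated cache pool
    ceph osd erasure-code-profile set ec42 k=4 m=2
    ceph osd pool create ecbase 256 256 erasure ec42
    ceph osd pool create hotcache 128 128 replicated
    # attach the cache pool as a writeback tier and make it the overlay
    ceph osd tier add ecbase hotcache
    ceph osd tier cache-mode hotcache writeback
    ceph osd tier set-overlay ecbase hotcache
    # minimal hit set and sizing settings so flush/evict can operate
    ceph osd pool set hotcache hit_set_type bloom
    ceph osd pool set hotcache hit_set_count 1
    ceph osd pool set hotcache hit_set_period 3600
    ceph osd pool set hotcache target_max_bytes 1000000000000
    ceph osd pool set hotcache cache_target_dirty_ratio 0.4
    ceph osd pool set hotcache cache_target_full_ratio 0.8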

> 
> > Sam
> >
> > On Tue, Aug 18, 2015 at 12:48 PM, Nick Fisk  wrote:
> > >
> > >
> > >
> > >
> > >> -Original Message-
> > >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> > >> Behalf Of Mark Nelson
> > >> Sent: 18 August 2015 18:51
> > >> To: Nick Fisk ; 'Jan Schermer' 
> > >> Cc: ceph-users@lists.ceph.com
> > >> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > >>
> > >>
> > >>
> > >> On 08/18/2015 11:52 AM, Nick Fisk wrote:
> > >> > 
> > >> >>>>
> > >> >>>> Here's kind of how I see the field right now:
> > >> >>>>
> > >> >>>> 1) Cache at the client level.  Likely fastest but obvious
> > >> >>>> issues like
> > > above.
> > >> >>>> RAID1 might be an option at increased cost.  Lack of barriers
> > >> >>>> in some implementations scary.
> > >> >>>
> > >> >>> Agreed.
> > >> >>>
> > >> >>>>
> > >> >>>> 2) Cache below the OSD.  Not much recent data on this.  Not
> > >> >>>> likely as fast as client side cache, but likely cheaper (fewer
> > >> >>>> OSD nodes than client
> > >> >> nodes?).
> > >> >>>> Lack of barriers in some implementations scary.
> > >> >>>
> > >> >>> This also has the benefit of caching the leveldb on the OSD, so
> > >> >>> get a big
> > >> >> performance gain from there too for small sequential writes. I
> > >> >> looked at using Flashcache for this too but decided it was
> > >> >> adding to much complexity and risk.
> > >> >>>
> > >> >>> I tho

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-19 Thread Christian Balzer
On Wed, 19 Aug 2015 10:02:25 +0100 Nick Fisk wrote:

> 
> 
> 
> 
> > -Original Message-
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> > Of Christian Balzer
> > Sent: 19 August 2015 03:32
> > To: ceph-users@lists.ceph.com
> > Cc: Nick Fisk 
> > Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> > 
> > On Tue, 18 Aug 2015 20:48:26 +0100 Nick Fisk wrote:
> > 
> > [mega snip]
> > > 4. Disk based OSD with SSD Journal performance As I touched on above
> > > earlier, I would expect a disk based OSD with SSD journal to have
> > > similar performance to a pure SSD OSD when dealing with sequential
> > > small IO's. Currently the levelDB sync and potentially other things
> > > slow this down.
> > >
> > 
> > Has anybody tried symlinking the omap directory to an SSD and tested if
> > that makes a (significant) difference?
> 
> I thought I remember reading somewhere that all these items need to
> remain on the OSD itself so that when the OSD calls fsync it can be sure
> they are all in sync at the same time.
>
Would be nice to have this confirmed by the devs.
It being leveldb, you'd think it would be in sync by default.

But even if it were potentially unsafe (not crash safe) in the current
incarnation, the results of such a test might make any needed changes
attractive.
Unfortunately I don't have anything resembling an SSD in my test cluster.

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-19 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

> Probably the big question is what are the pain points?  The most common
> answer we get when asking folks what applications they run on top of Ceph is
> "everything!".  This is wonderful, but not helpful when trying to figure out
> what performance issues matter most! :)
>
> IE, should we be focusing on IOPS?  Latency?  Finding a way to avoid journal
> overhead for large writes?  Are there specific use cases where we should
> specifically be focusing attention? general iscsi?  S3? databases directly
> on RBD? etc.  There's tons of different areas that we can work on (general
> OSD threading improvements, different messenger implementations, newstore,
> client side bottlenecks, etc) but all of those things tackle different kinds
> of problems.

We run "everything" or it sure seems like it. I did some computation
of a large sampling of our servers and found that the average request
size was ~12K/~18K (read/write) and ~30%/70% (it looks like I didn't
save that spreadsheet to get exact numbers).

So, any optimization in smaller I/O sizes would really benefit us
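
For what it's worth, a rough way to pull cumulative average request sizes per
device out of /proc/diskstats (the device name is an example; sectors are 512
bytes, and the figures are averages since boot):

    awk '$3 == "sda" && $4 > 0 && $8 > 0 {
           printf "avg read %.1f KB, avg write %.1f KB (%d reads, %d writes)\n",
                  $6 * 512 / $4 / 1024, $10 * 512 / $8 / 1024, $4, $8
         }' /proc/diskstats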

- 
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
-BEGIN PGP SIGNATURE-
Version: Mailvelope v1.0.0
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJV1ONDCRDmVDuy+mK58QAAGZMP/2jZk2ScXzJP0kpvVpCZ
pHn74r2tR2gwpzSDzFp2az5L36DutrJImDx/wDZnxsuA6v1sUWiW3/VBwm8r
LRHr0jHvVm5EnpyJuT7zPRg/aJhXcK9hlIuud3tQ357BtMb0Hv6rbVclqtoP
RCYFoWv6vHVg0nmbKyZODj9W4PWfb6AjXazwHlgZw10q1GcYSs5LS2n9Yx8B
Q8hpn/8mf49IopYuyBOH5VTIxOUGlv1XAUD4kSRSYvhFLMQg0lt7L1VZrbiY
qqFtMUlvSoasb7eFahYZskkjUPB9c9kplWjKkKo8K7nV40pUfg8yClZmZ0kl
a4gok35Cn8x58rWBrSpMEqAvr5ObE27LNnwGhy9KfzdkMpdWS2/lr7oZsb+O
Gwk/4/u4hWbjuYSeGsqXefFINgRjl8TPOTQ7ZawlOcxsJhnyiGwWa5jXypHZ
Smju6lEKMd9XvoBHtRBk+wX08E/T/U6pZplOpRwG8jV5xQtrF9B9AEiQoMTj
x4HWdD17O9DcW5veiPkDzf9onkrbWZdcYjTXwdKqm6q0vEvk4stcLTfgjCVu
+zqcgcbCyvw/URNmVjAHH3dSkfmrFBHuLW062hYhSnPlqgSBJ6xTwFzQSIIZ
ZcDOfQIR72l5mLLg/YAvOMljhUnwZjdURRJytYG5KsdzZRWJZ9bWd+n2ZwfZ
Yf4H
=bSAx
-END PGP SIGNATURE-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-19 Thread Nick Fisk




> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Christian Balzer
> Sent: 19 August 2015 03:32
> To: ceph-users@lists.ceph.com
> Cc: Nick Fisk 
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> On Tue, 18 Aug 2015 20:48:26 +0100 Nick Fisk wrote:
> 
> [mega snip]
> > 4. Disk based OSD with SSD Journal performance As I touched on above
> > earlier, I would expect a disk based OSD with SSD journal to have
> > similar performance to a pure SSD OSD when dealing with sequential
> > small IO's. Currently the levelDB sync and potentially other things
> > slow this down.
> >
> 
> Has anybody tried symlinking the omap directory to an SSD and tested if that
> makes a (significant) difference?

I thought I remember reading somewhere that all these items need to remain
on the OSD itself so that when the OSD calls fsync it can be sure they are
all in sync at the same time.

> 
> Christian
> --
> Christian Balzer        Network/Systems Engineer
> ch...@gol.com Global OnLine Japan/Fusion Communications
> http://www.gol.com/
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Stefan Priebe - Profihost AG
Am 18.08.2015 um 15:43 schrieb Campbell, Bill:
> Hey Stefan,
> Are you using your Ceph cluster for virtualization storage?
Yes

>  Is dm-writeboost configured on the OSD nodes themselves?
Yes

Stefan

> 
> 
> *From: *"Stefan Priebe - Profihost AG" 
> *To: *"Mark Nelson" , ceph-users@lists.ceph.com
> *Sent: *Tuesday, August 18, 2015 7:36:10 AM
> *Subject: *Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> We're using an extra caching layer for ceph since the beginning for our
> older ceph deployments. All new deployments go with full SSDs.
> 
> I've tested so far:
> - EnhanceIO
> - Flashcache
> - Bcache
> - dm-cache
> - dm-writeboost
> 
> The best working solution was and is bcache, except for its buggy code.
> The current code in the 4.2-rc7 vanilla kernel still contains bugs, e.g.
> discards result in a crashed FS after reboots, and so on. But it's still
> the fastest for Ceph.
> 
> The 2nd best solution which we already use in production is
> dm-writeboost (https://github.com/akiradeveloper/dm-writeboost).
> 
> Everything else is too slow.
> 
> Stefan
> Am 18.08.2015 um 13:33 schrieb Mark Nelson:
>> Hi Jan,
>>
>> Out of curiosity did you ever try dm-cache?  I've been meaning to give
>> it a spin but haven't had the spare cycles.
>>
>> Mark
>>
>> On 08/18/2015 04:00 AM, Jan Schermer wrote:
>>> I already evaluated EnhanceIO in combination with CentOS 6 (and
>>> backported 3.10 and 4.0 kernel-lt if I remember correctly).
>>> It worked fine during benchmarks and stress tests, but once we ran DB2
>>> on it, it panicked within minutes and took all the data with it (almost
>>> literally - files that weren't touched, like OS binaries, were b0rked
>>> and the filesystem was unsalvageable).
>>> If you disregard this warning - the performance gains weren't that
>>> great either, at least in a VM. It had problems when flushing to disk
>>> after reaching dirty watermark and the block size has some
>>> not-well-documented implications (not sure now, but I think it only
>>> cached IO _larger_than the block size, so if your database keeps
>>> incrementing an XX-byte counter it will go straight to disk).
>>>
>>> Flashcache doesn't respect barriers (or does it now?) - if that's ok
>>> for you than go for it, it should be stable and I used it in the past
>>> in production without problems.
>>>
>>> bcache seemed to work fine, but I needed to
>>> a) use it for root
>>> b) disable and enable it on the fly (doh)
>>> c) make it non-persistent (flush it) before reboot - not sure if that
>>> was possible either.
>>> d) all that in a customer's VM, and that customer didn't have a strong
>>> technical background to be able to fiddle with it...
>>> So I haven't tested it heavily.
>>>
>>> Bcache should be the obvious choice if you are in control of the
>>> environment. At least you can cry on LKML's shoulder when you lose
>>> data :-)
>>>
>>> Jan
>>>
>>>
>>>> On 18 Aug 2015, at 01:49, Alex Gorbachev  wrote:
>>>>
>>>> What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
>>>> months ago, but no external contributors :(
>>>>
>>>> The nice thing about EnhanceIO is there is no need to change device
>>>> name, unlike bcache, flashcache etc.
>>>>
>>>> Best regards,
>>>> Alex
>>>>
>>>> On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz 
>>>> wrote:
>>>>> I did some (non-ceph) work on these, and concluded that bcache was
>>>>> the best
>>>>> supported, most stable, and fastest.  This was ~1 year ago, to take
>>>>> it with
>>>>> a grain of salt, but that's what I would recommend.
>>>>>
>>>>> Daniel
>>>>>
>>>>>
>>>>> 
>>>>> From: "Dominik Zalewski" 
>>>>> To: "German Anders" 
>>>>> Cc: "ceph-users" 
>>>>> Sent: Wednesday, July 1, 2015 5:28:10 PM
>>>>> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I’ve asked same question last weeks or so (just search the mailing 

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Christian Balzer
On Tue, 18 Aug 2015 12:50:38 -0500 Mark Nelson wrote:

[snap]
> Probably the big question is what are the pain points?  The most common 
> answer we get when asking folks what applications they run on top of 
> Ceph is "everything!".  This is wonderful, but not helpful when trying 
> to figure out what performance issues matter most! :)
> 
Well, the "everything" answer really is the one everybody who runs VMs
backed by RBD for internal or external customers will give.
I.e. no idea what is installed and no control over how it accesses the
Ceph cluster.

And even when you think you have a predictable use case it might not be
true.
As in, one of our Ceph installs backs a Ganeti cluster with hundreds of VMs
running two types of applications, and from past experience I know their I/O
patterns (nearly 100% write-only; any reads can usually be satisfied from
local or storage-node pagecache).
Thus the Ceph cluster was configured in a way that was optimized for this,
and it worked beautifully until:
a) scrubs became too heavy (generating too many read IOPS while also
invalidating page caches), and
b) somebody thought a third type of VM, running Windows, with IOPS equal to
dozens of the other types would be a good idea.
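
Regarding point (a), the scrub load can at least be throttled at runtime; a
sketch of the relevant knobs (the values are illustrative, not tuned
recommendations):

    # limit concurrent scrubs, pause between scrub chunks, and skip scrubs
    # when the OSD host is already loaded
    ceph tell osd.* injectargs '--osd_max_scrubs 1 --osd_scrub_sleep 0.1'
    ceph tell osd.* injectargs '--osd_scrub_load_threshold 0.5'
    # the same settings can be made permanent under [osd] in ceph.conf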


> IE, should we be focusing on IOPS?  Latency?  Finding a way to avoid 
> journal overhead for large writes?  Are there specific use cases where 
> we should specifically be focusing attention? general iscsi?  S3? 
> databases directly on RBD? etc.  There's tons of different areas that we 
> can work on (general OSD threading improvements, different messenger 
> implementations, newstore, client side bottlenecks, etc) but all of 
> those things tackle different kinds of problems.
> 
All of these except S3 would have a positive impact in my various use
cases.
However at the risk of sounding like a broken record, any time spent on
these improvements before Ceph can recover from a scrub error fully
autonomously (read: checksums) would be a waste in my book.

All the speed in the world is pretty insignificant when a simple 
"ceph pg repair" (which is still in the Ceph docs w/o any qualification of
what it actually does) has a good chance of wiping out good data "by
imposing the primary OSD's view of the world on the replicas", to quote
Greg.
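
For anyone who does have to repair an inconsistent PG today, a cautious manual
workflow looks roughly like this (a sketch only; FileStore on-disk paths are
assumed and the <pgid>/<object> placeholders are hypothetical):

    # find the inconsistent PG and the OSDs in its acting set
    ceph health detail | grep inconsistent
    ceph pg map <pgid>
    # on each OSD host in the acting set, checksum the suspect object's
    # on-disk file and compare the replicas by hand
    md5sum /var/lib/ceph/osd/ceph-*/current/<pgid>_head/*<object-name>*
    # only once you are satisfied the primary holds the good copy:
    ceph pg repair <pgid>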

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Christian Balzer
On Tue, 18 Aug 2015 20:48:26 +0100 Nick Fisk wrote:

[mega snip]
> 4. Disk based OSD with SSD Journal performance
> As I touched on above earlier, I would expect a disk based OSD with SSD
> journal to have similar performance to a pure SSD OSD when dealing with
> sequential small IO's. Currently the levelDB sync and potentially other
> things slow this down.
> 

Has anybody tried symlinking the omap directory to an SSD and tested if that
makes a (significant) difference?
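
For completeness, the kind of relocation I'm asking about would look roughly
like this (entirely untested; the OSD id, init system and SSD mount point are
assumptions, and note the crash-consistency concern raised elsewhere in the
thread):

    # move the FileStore omap (leveldb) onto an SSD for OSD 0
    service ceph stop osd.0
    mv /var/lib/ceph/osd/ceph-0/current/omap /mnt/ssd0/ceph-0-omap
    ln -s /mnt/ssd0/ceph-0-omap /var/lib/ceph/osd/ceph-0/current/omap
    service ceph start osd.0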

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
Hi Sam,

> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Samuel Just
> Sent: 18 August 2015 21:38
> To: Nick Fisk 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> 1.  We've kicked this around a bit.  What kind of failure semantics would
you
> be comfortable with here (that is, what would be reasonable behavior if
the
> client side cache fails)?

I would either expect to provide the cache with a redundant block device (i.e.
RAID1 SSDs) or for the cache to allow itself to be configured to mirror across
two SSDs. Of course, single SSDs can be used if the user accepts the risk.
If the cache did the mirroring then you could do fancy stuff like mirror the
writes, but leave the read cache blocks as single copies to increase the
cache capacity.

In either case, although an outage is undesirable, it's only data loss which
would be unacceptable, and that would hopefully be avoided by the mirroring. As
part of this, there would need to be a way to make sure a "dirty" RBD can't be
accessed unless the corresponding cache is also attached.

I guess as it's caching the RBD and not the pool or entire cluster, the cache
only needs to match the failure requirements of the application it's caching.
If I need to cache an RBD that is on a single server, there is no
requirement to make the cache redundant across racks/PDUs/servers, etc.

I hope I've answered your question?
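
To make the first option concrete, a mirrored client-side cache could be put
together today with md RAID1 plus flashcache, something like the sketch below
(device and image names are assumptions, and the flashcache barrier caveats
mentioned elsewhere in this thread still apply):

    # mirror two local SSDs, then layer a writeback flashcache over a mapped RBD
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
    rbd map mypool/myimage                      # appears as e.g. /dev/rbd0
    flashcache_create -p back rbd0_cache /dev/md0 /dev/rbd0
    mkfs.xfs /dev/mapper/rbd0_cache
    mount /dev/mapper/rbd0_cache /mnt/myimage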


> 2. We've got a branch which should merge soon (tomorrow probably) which
> actually does allow writes to be proxied, so that should alleviate some of
> these pain points somewhat.  I'm not sure it is clever enough to allow
> through writefulls for an ec base tier though (but it would be a good
idea!) -

Excellent news, I shall look forward to testing it in the future. I did mention
proxying writefulls to someone who was working on the proxy
write code, but I'm not sure if it ever got followed up.

> Sam
> 
> On Tue, Aug 18, 2015 at 12:48 PM, Nick Fisk  wrote:
> >
> >
> >
> >
> >> -Original Message-
> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> >> Of Mark Nelson
> >> Sent: 18 August 2015 18:51
> >> To: Nick Fisk ; 'Jan Schermer' 
> >> Cc: ceph-users@lists.ceph.com
> >> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> >>
> >>
> >>
> >> On 08/18/2015 11:52 AM, Nick Fisk wrote:
> >> > 
> >> >>>>
> >> >>>> Here's kind of how I see the field right now:
> >> >>>>
> >> >>>> 1) Cache at the client level.  Likely fastest but obvious issues
> >> >>>> like
> > above.
> >> >>>> RAID1 might be an option at increased cost.  Lack of barriers in
> >> >>>> some implementations scary.
> >> >>>
> >> >>> Agreed.
> >> >>>
> >> >>>>
> >> >>>> 2) Cache below the OSD.  Not much recent data on this.  Not
> >> >>>> likely as fast as client side cache, but likely cheaper (fewer
> >> >>>> OSD nodes than client
> >> >> nodes?).
> >> >>>> Lack of barriers in some implementations scary.
> >> >>>
> >> >>> This also has the benefit of caching the leveldb on the OSD, so
> >> >>> get a big
> >> >> performance gain from there too for small sequential writes. I
> >> >> looked at using Flashcache for this too but decided it was adding
> >> >> to much complexity and risk.
> >> >>>
> >> >>> I thought I read somewhere that RocksDB allows you to move its
> >> >>> WAL to
> >> >> SSD, is there anything in the pipeline for something like moving
> >> >> the filestore to use RocksDB?
> >> >>
> >> >> I believe you can already do this, though I haven't tested it.
> >> >> You can certainly move the monitors to rocksdb (tested) and
> >> >> newstore uses
> >> rocksdb as well.
> >> >>
> >> >
> >> > Interesting, I might have a look into this.
> >> >
> >> >>>
> >> >>>>
> >> >>>> 3) Ceph Cache Tiering. Network overhead and write amplification
> >> >>>> on promotion makes this primarily useful when workloads fit
> >> >>>> mostly into the cache tier.  Overal

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Samuel Just
1.  We've kicked this around a bit.  What kind of failure semantics
would you be comfortable with here (that is, what would be reasonable
behavior if the client side cache fails)?
2. We've got a branch which should merge soon (tomorrow probably)
which actually does allow writes to be proxied, so that should
alleviate some of these pain points somewhat.  I'm not sure it is
clever enough to allow through writefulls for an ec base tier though
(but it would be a good idea!)
-Sam

On Tue, Aug 18, 2015 at 12:48 PM, Nick Fisk  wrote:
>
>
>
>
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Mark Nelson
>> Sent: 18 August 2015 18:51
>> To: Nick Fisk ; 'Jan Schermer' 
>> Cc: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
>>
>>
>>
>> On 08/18/2015 11:52 AM, Nick Fisk wrote:
>> > 
>> >>>>
>> >>>> Here's kind of how I see the field right now:
>> >>>>
>> >>>> 1) Cache at the client level.  Likely fastest but obvious issues like
> above.
>> >>>> RAID1 might be an option at increased cost.  Lack of barriers in
>> >>>> some implementations scary.
>> >>>
>> >>> Agreed.
>> >>>
>> >>>>
>> >>>> 2) Cache below the OSD.  Not much recent data on this.  Not likely
>> >>>> as fast as client side cache, but likely cheaper (fewer OSD nodes
>> >>>> than client
>> >> nodes?).
>> >>>> Lack of barriers in some implementations scary.
>> >>>
>> >>> This also has the benefit of caching the leveldb on the OSD, so get
>> >>> a big
>> >> performance gain from there too for small sequential writes. I looked
>> >> at using Flashcache for this too but decided it was adding to much
>> >> complexity and risk.
>> >>>
>> >>> I thought I read somewhere that RocksDB allows you to move its WAL
>> >>> to
>> >> SSD, is there anything in the pipeline for something like moving the
>> >> filestore to use RocksDB?
>> >>
>> >> I believe you can already do this, though I haven't tested it.  You
>> >> can certainly move the monitors to rocksdb (tested) and newstore uses
>> rocksdb as well.
>> >>
>> >
>> > Interesting, I might have a look into this.
>> >
>> >>>
>> >>>>
>> >>>> 3) Ceph Cache Tiering. Network overhead and write amplification on
>> >>>> promotion makes this primarily useful when workloads fit mostly
>> >>>> into the cache tier.  Overall safe design but care must be taken to
>> >>>> not over-
>> >> promote.
>> >>>>
>> >>>> 4) separate SSD pool.  Manual and not particularly flexible, but
>> >>>> perhaps
>> >> best
>> >>>> for applications that need consistently high performance.
>> >>>
>> >>> I think it depends on the definition of performance. Currently even
>> >>> very
>> >> fast CPU's and SSD's in their own pool will still struggle to get
>> >> less than 1ms of write latency. If your performance requirements are
>> >> for large queue depths then you will probably be alright. If you
>> >> require something that mirrors the performance of traditional write
>> >> back cache, then even pure SSD Pools can start to struggle.
>> >>
>> >> Agreed.  This is definitely the crux of the problem.  The example
>> >> below is a great start!  It'd would be fantastic if we could get more
>> >> feedback from the list on the relative importance of low latency
>> >> operations vs high IOPS through concurrency.  We have general
>> >> suspicions but not a ton of actual data regarding what folks are
>> >> seeing in practice and under what scenarios.
>> >>
>> >
>> > If you have any specific questions that you think I might be able to
> answer,
>> please let me know. The only other main app that I can really think of
> where
>> these sort of write latency is critical is SQL, particularly the
> transaction logs.
>>
>> Probably the big question is what are the pain points?  The most common
>> answer we get when asking folks what applications they run on top of Ceph
>> is "everything!".  Th

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk




> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Mark Nelson
> Sent: 18 August 2015 18:51
> To: Nick Fisk ; 'Jan Schermer' 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> 
> 
> On 08/18/2015 11:52 AM, Nick Fisk wrote:
> > 
> >>>>
> >>>> Here's kind of how I see the field right now:
> >>>>
> >>>> 1) Cache at the client level.  Likely fastest but obvious issues like
above.
> >>>> RAID1 might be an option at increased cost.  Lack of barriers in
> >>>> some implementations scary.
> >>>
> >>> Agreed.
> >>>
> >>>>
> >>>> 2) Cache below the OSD.  Not much recent data on this.  Not likely
> >>>> as fast as client side cache, but likely cheaper (fewer OSD nodes
> >>>> than client
> >> nodes?).
> >>>> Lack of barriers in some implementations scary.
> >>>
> >>> This also has the benefit of caching the leveldb on the OSD, so get
> >>> a big
> >> performance gain from there too for small sequential writes. I looked
> >> at using Flashcache for this too but decided it was adding to much
> >> complexity and risk.
> >>>
> >>> I thought I read somewhere that RocksDB allows you to move its WAL
> >>> to
> >> SSD, is there anything in the pipeline for something like moving the
> >> filestore to use RocksDB?
> >>
> >> I believe you can already do this, though I haven't tested it.  You
> >> can certainly move the monitors to rocksdb (tested) and newstore uses
> rocksdb as well.
> >>
> >
> > Interesting, I might have a look into this.
> >
> >>>
> >>>>
> >>>> 3) Ceph Cache Tiering. Network overhead and write amplification on
> >>>> promotion makes this primarily useful when workloads fit mostly
> >>>> into the cache tier.  Overall safe design but care must be taken to
> >>>> not over-
> >> promote.
> >>>>
> >>>> 4) separate SSD pool.  Manual and not particularly flexible, but
> >>>> perhaps
> >> best
> >>>> for applications that need consistently high performance.
> >>>
> >>> I think it depends on the definition of performance. Currently even
> >>> very
> >> fast CPU's and SSD's in their own pool will still struggle to get
> >> less than 1ms of write latency. If your performance requirements are
> >> for large queue depths then you will probably be alright. If you
> >> require something that mirrors the performance of traditional write
> >> back cache, then even pure SSD Pools can start to struggle.
> >>
> >> Agreed.  This is definitely the crux of the problem.  The example
> >> below is a great start!  It'd would be fantastic if we could get more
> >> feedback from the list on the relative importance of low latency
> >> operations vs high IOPS through concurrency.  We have general
> >> suspicions but not a ton of actual data regarding what folks are
> >> seeing in practice and under what scenarios.
> >>
> >
> > If you have any specific questions that you think I might be able to
answer,
> please let me know. The only other main app that I can really think of
where
> these sort of write latency is critical is SQL, particularly the
transaction logs.
> 
> Probably the big question is what are the pain points?  The most common
> answer we get when asking folks what applications they run on top of Ceph
> is "everything!".  This is wonderful, but not helpful when trying to
figure out
> what performance issues matter most! :)

Sort of like someone telling you their PC is broken and, when asked for
details, getting "It's not working" in return.

In general I think a lot of it comes down to people not appreciating the
differences between Ceph and, say, a RAID array. For most things, like larger
block IO, performance tends to scale with cluster size, and the cost
effectiveness of Ceph makes it a no-brainer to just add a handful of
extra OSDs.

I will try and be more precise. Here is my list of pain points / wishes that
I have come across in the last 12 months of running Ceph.

1. Improve small IO write latency
As discussed in depth in this thread. If it's possible just to make Ceph a
lot faster then great, but I fear even a doubling in performance will still
fall short compared to if you ar
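
(As a side note on the small-IO latency point above, a simple way to measure
the queue-depth-1 sync write latency in question against a kernel-mapped RBD;
the image name and device are assumptions:)

    rbd map mypool/testimg                      # e.g. /dev/rbd0
    fio --name=synclat --filename=/dev/rbd0 --ioengine=libaio --direct=1 \
        --sync=1 --rw=write --bs=4k --iodepth=1 --runtime=60 --time_based
    # the "clat" figures in fio's output are the latencies of interest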

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Alex Gorbachev
> IE, should we be focusing on IOPS?  Latency?  Finding a way to avoid journal
> overhead for large writes?  Are there specific use cases where we should
> specifically be focusing attention? general iscsi?  S3? databases directly
> on RBD? etc.  There's tons of different areas that we can work on (general
> OSD threading improvements, different messenger implementations, newstore,
> client side bottlenecks, etc) but all of those things tackle different kinds
> of problems.
>

Mark, my take is definitely write latency.  Based on this discussion,
there is no truly safe solution for write caching outside Ceph.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Mark Nelson



On 08/18/2015 11:52 AM, Nick Fisk wrote:




Here's kind of how I see the field right now:

1) Cache at the client level.  Likely fastest but obvious issues like above.
RAID1 might be an option at increased cost.  Lack of barriers in some
implementations scary.


Agreed.



2) Cache below the OSD.  Not much recent data on this.  Not likely as
fast as client side cache, but likely cheaper (fewer OSD nodes than client

nodes?).

Lack of barriers in some implementations scary.


This also has the benefit of caching the leveldb on the OSD, so get a big

performance gain from there too for small sequential writes. I looked at
using Flashcache for this too but decided it was adding to much complexity
and risk.


I thought I read somewhere that RocksDB allows you to move its WAL to

SSD, is there anything in the pipeline for something like moving the filestore
to use RocksDB?

I believe you can already do this, though I haven't tested it.  You can 
certainly
move the monitors to rocksdb (tested) and newstore uses rocksdb as well.



Interesting, I might have a look into this.





3) Ceph Cache Tiering. Network overhead and write amplification on
promotion makes this primarily useful when workloads fit mostly into the
cache tier.  Overall safe design but care must be taken to not over-

promote.


4) separate SSD pool.  Manual and not particularly flexible, but perhaps

best

for applications that need consistently high performance.


I think it depends on the definition of performance. Currently even very

fast CPU's and SSD's in their own pool will still struggle to get less than 1ms 
of
write latency. If your performance requirements are for large queue depths
then you will probably be alright. If you require something that mirrors the
performance of traditional write back cache, then even pure SSD Pools can
start to struggle.

Agreed.  This is definitely the crux of the problem.  The example below
is a great start!  It'd would be fantastic if we could get more feedback
from the list on the relative importance of low latency operations vs
high IOPS through concurrency.  We have general suspicions but not a ton
of actual data regarding what folks are seeing in practice and under
what scenarios.



If you have any specific questions that you think I might be able to answer,
please let me know. The only other major app I can really think of where
this sort of write latency is critical is SQL, particularly the transaction
logs.


Probably the big question is what are the pain points?  The most common 
answer we get when asking folks what applications they run on top of 
Ceph is "everything!".  This is wonderful, but not helpful when trying 
to figure out what performance issues matter most! :)


IE, should we be focusing on IOPS?  Latency?  Finding a way to avoid 
journal overhead for large writes?  Are there specific use cases where 
we should specifically be focusing attention? general iscsi?  S3? 
databases directly on RBD? etc.  There's tons of different areas that we 
can work on (general OSD threading improvements, different messenger 
implementations, newstore, client side bottlenecks, etc) but all of 
those things tackle different kinds of problems.


Mark
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Jan Schermer
> Sent: 18 August 2015 17:13
> To: Nick Fisk 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> 
> > On 18 Aug 2015, at 16:44, Nick Fisk  wrote:
> >
> >> -Original Message-
> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> >> Of Mark Nelson
> >> Sent: 18 August 2015 14:51
> >> To: Nick Fisk ; 'Jan Schermer' 
> >> Cc: ceph-users@lists.ceph.com
> >> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> >>
> >>
> >>
> >> On 08/18/2015 06:47 AM, Nick Fisk wrote:
> >>> Just to chime in, I gave dmcache a limited test but its lack of
> >>> proper
> >> writeback cache ruled it out for me. It only performs write back
> >> caching on blocks already on the SSD, whereas I need something that
> >> works like a Battery backed raid controller caching all writes.
> >>>
> >>> It's amazing the 100x performance increase you get with RBD's when
> >>> doing
> >> sync writes and give it something like just 1GB write back cache with
> >> flashcache.
> >>
> >> For your use case, is it ok that data may live on the flashcache for
> >> some amount of time before making to ceph to be replicated?  We've
> >> wondered internally if this kind of trade-off is acceptable to
> >> customers or not should the flashcache SSD fail.
> >
> > Yes, I agree, it's not ideal. But I believe it’s the only way to get the
> performance required for some workloads that need write latencies <1ms.
> >
> > I'm still in testing at the moment with the testing kernel that includes 
> > blk-
> mq fixes for large queue depths and max io sizes. But if we decide to put into
> production, it would be using 2x SAS dual port SSD's in RAID1 across two
> servers for HA. As we are currently using iSCSI from these two servers, there
> is no real loss of availability by doing this. Generally I think as long as 
> you build
> this around the fault domains of the application you are caching, it shouldn't
> impact too much.
> >
> > I guess for people using openstack and other direct RBD interfaces it may
> not be such an attractive option. I've been thinking that maybe Ceph needs
> to have an additional daemon with very low overheads, which is run on SSD's
> to provide shared persistent cache devices for librbd. There's still a trade 
> off,
> maybe not as much as using Flashcache, but for some workloads like
> database's, many people may decide that it's worth it. Of course I realise 
> this
> would be a lot of work and everyone is really busy, but in terms of
> performance gained it would most likely have a dramatic effect in making
> Ceph look comparable to other solutions like VSAN or ScaleIO when it comes
> to high iops/low latency stuff.
> >
> 
> Additional daemon that is persistent how? Isn't that what journal does
> already, just too slowly?

The journal is part of an OSD, and its speed is restricted by a lot of the
functionality that Ceph has to provide. I was more thinking of a very
lightweight "service" that acts as an interface between an SSD and librbd and
is focused on speed. For something like a standalone SQL server it might run on
the SQL server with a local SSD, but in other scenarios you might have this
"service" remote, where the SSDs are installed. HA for the SSD could be
provided by RAID + dual-port SAS, or maybe some sort of lightweight replication
could be built into the service.


This was just a random thought rather than something I have planned out, though.

> 
> I think the best (and easiest!) approach is to mimic what a monolithic SAN
> does
> 
> Currently
> 1) client issues blocking/atomic/sync IO
> 2) rbd client sends this IO to all OSDs
> 3) after all OSDs "process the IO", the IO is finished and considered 
> persistent
> 
> That has serious implications
>   * every IO is processed separately, not much coalescing
>   * OSD processes add the latency when processing this IO
>   * one OSD can be slow momentarily, IO backs up and the cluster
> stalls
> 
> Let me just select what "processing the IO" means with respect to my
> architecture and I can likely get a 100x improvement
> 
> Let me choose:
> 
> 1) WHERE the IO is persisted
> Do I really need all (e.g. 3) OSDs to persist the data or is quorum (2)
> sufficient?
> No

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Jan Schermer
Yes, writeback mode. I didn't try anything else.

Jan

> On 18 Aug 2015, at 18:30, Alex Gorbachev  wrote:
> 
> HI Jan,
> 
> On Tue, Aug 18, 2015 at 5:00 AM, Jan Schermer  wrote:
>> I already evaluated EnhanceIO in combination with CentOS 6 (and backported 
>> 3.10 and 4.0 kernel-lt if I remember correctly).
>> It worked fine during benchmarks and stress tests, but once we ran DB2 on it 
>> it panicked within minutes and took all the data with it (almost literally - 
>> files that weren't touched, like OS binaries were b0rked and the filesystem 
>> was unsalvageable).
> 
> Out of curiosity, were you using EnhanceIO in writeback mode?  I
> assume so, as a read cache should not hurt anything.
> 
> Thanks,
> Alex
> 
>> If you disregard this warning - the performance gains weren't that great 
>> either, at least in a VM. It had problems when flushing to disk after 
>> reaching dirty watermark and the block size has some not-well-documented 
>> implications (not sure now, but I think it only cached IO _larger_than the 
>> block size, so if your database keeps incrementing an XX-byte counter it 
>> will go straight to disk).
>> 
>> Flashcache doesn't respect barriers (or does it now?) - if that's ok for you 
>> then go for it, it should be stable and I used it in the past in production 
>> without problems.
>> 
>> bcache seemed to work fine, but I needed to
>> a) use it for root
>> b) disable and enable it on the fly (doh)
>> c) make it non-persistent (flush it) before reboot - not sure if that was 
>> possible either.
>> d) all that in a customer's VM, and that customer didn't have a strong 
>> technical background to be able to fiddle with it...
>> So I haven't tested it heavily.
>> 
>> Bcache should be the obvious choice if you are in control of the 
>> environment. At least you can cry on LKML's shoulder when you lose data :-)
>> 
>> Jan
>> 
>> 
>>> On 18 Aug 2015, at 01:49, Alex Gorbachev  wrote:
>>> 
>>> What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
>>> months ago, but no external contributors :(
>>> 
>>> The nice thing about EnhanceIO is there is no need to change device
>>> name, unlike bcache, flashcache etc.
>>> 
>>> Best regards,
>>> Alex
>>> 
>>> On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz  wrote:
>>>> I did some (non-ceph) work on these, and concluded that bcache was the best
>>>> supported, most stable, and fastest.  This was ~1 year ago, to take it with
>>>> a grain of salt, but that's what I would recommend.
>>>> 
>>>> Daniel
>>>> 
>>>> 
>>>> 
>>>> From: "Dominik Zalewski" 
>>>> To: "German Anders" 
>>>> Cc: "ceph-users" 
>>>> Sent: Wednesday, July 1, 2015 5:28:10 PM
>>>> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
>>>> 
>>>> 
>>>> Hi,
>>>> 
>>>> I’ve asked same question last weeks or so (just search the mailing list
>>>> archives for EnhanceIO :) and got some interesting answers.
>>>> 
>>>> Looks like the project is pretty much dead since it was bought out by HGST.
>>>> Even their website has some broken links in regards to EnhanceIO
>>>> 
>>>> I’m keen to try flashcache or bcache (its been in the mainline kernel for
>>>> some time)
>>>> 
>>>> Dominik
>>>> 
>>>> On 1 Jul 2015, at 21:13, German Anders  wrote:
>>>> 
>>>> Hi cephers,
>>>> 
>>>>  Is anyone out there that implement enhanceIO in a production environment?
>>>> any recommendation? any perf output to share with the diff between using it
>>>> and not?
>>>> 
>>>> Thanks in advance,
>>>> 
>>>> German
>> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk

> >>
> >> Here's kind of how I see the field right now:
> >>
> >> 1) Cache at the client level.  Likely fastest but obvious issues like 
> >> above.
> >> RAID1 might be an option at increased cost.  Lack of barriers in some
> >> implementations scary.
> >
> > Agreed.
> >
> >>
> >> 2) Cache below the OSD.  Not much recent data on this.  Not likely as
> >> fast as client side cache, but likely cheaper (fewer OSD nodes than client
> nodes?).
> >> Lack of barriers in some implementations scary.
> >
> > This also has the benefit of caching the leveldb on the OSD, so get a big
> performance gain from there too for small sequential writes. I looked at
> using Flashcache for this too but decided it was adding to much complexity
> and risk.
> >
> > I thought I read somewhere that RocksDB allows you to move its WAL to
> SSD, is there anything in the pipeline for something like moving the filestore
> to use RocksDB?
> 
> I believe you can already do this, though I haven't tested it.  You can 
> certainly
> move the monitors to rocksdb (tested) and newstore uses rocksdb as well.
> 

Interesting, I might have a look into this. 
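
From what I can tell the relevant options are roughly the ones below; this is 
untested on my side and the names may differ between releases, so treat it as a 
sketch rather than gospel:

  [mon]
  mon keyvaluedb = rocksdb          # I believe this is only picked up by newly created mon stores

  [osd]
  filestore omap backend = rocksdb  # likewise, only affects newly created OSDs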

> >
> >>
> >> 3) Ceph Cache Tiering. Network overhead and write amplification on
> >> promotion makes this primarily useful when workloads fit mostly into the
> >> cache tier.  Overall safe design but care must be taken to not over-
> promote.
> >>
> >> 4) separate SSD pool.  Manual and not particularly flexible, but perhaps
> best
> >> for applications that need consistently high performance.
> >
> > I think it depends on the definition of performance. Currently even very
> fast CPU's and SSD's in their own pool will still struggle to get less than 
> 1ms of
> write latency. If your performance requirements are for large queue depths
> then you will probably be alright. If you require something that mirrors the
> performance of traditional write back cache, then even pure SSD Pools can
> start to struggle.
> 
> Agreed.  This is definitely the crux of the problem.  The example below
> is a great start!  It'd would be fantastic if we could get more feedback
> from the list on the relative importance of low latency operations vs
> high IOPS through concurrency.  We have general suspicions but not a ton
> of actual data regarding what folks are seeing in practice and under
> what scenarios.
> 

If you have any specific questions that you think I might be able to answer, 
please let me know. The only other major app I can really think of where this 
sort of write latency is critical is SQL, particularly the transaction logs.

> >
> >
> > To give a real world example of what I see when doing various tests,  here
> is a rough guide to IOP's when removing a snapshot on a ESX server
> >
> > Traditional Array 10K disks = 300-600 IOPs
> > Ceph 7.2K + SSD Journal = 100-200 IOPs (LevelDB syncing on OSD seems to
> be the main limitation)
> > Ceph Pure SSD Pool = 500 IOPs (Intel s3700 SSD's)
> 
> I'd be curious to see how much jemalloc or tcmalloc 2.4 + 128MB TC help
> here.  Sandisk and Intel have both done some very useful investigations,
> I've got some additional tests replicating some of their findings coming
> shortly.

Ok, it will be interesting to see. I will see if I can change it on my 
environment and whether it gives any improvement. I think I came to the 
conclusion that Ceph takes a certain amount of time to do a write, and by the 
time you add in a replica copy I was struggling to get much below 2ms per IO 
with my 2.1GHz CPUs. 2ms = ~500 IOPs.

> 
> > Ceph Cache Tiering = 10-500 IOPs (As we know, misses can be very painful)
> 
> Indeed.  There's some work going on in this area too.  Hopefully we'll
> know how some of our ideas pan out later this week.  Assuming excessive
> promotions aren't a problem, the jemalloc/tcmalloc improvements I
> suspect will generally make cache teiring more interesting (though
> buffer cache will still be the primary source of really hot cached reads)
> 
> > Ceph + RBD Caching with Flashcache = 200-1000 IOPs (Readahead can give
> high bursts if snapshot blocks are sequential)
> 
> Good to know!
> 
> >
> > And when copying VM's to datastore (ESXi does this in sequential 64k
> IO's.yes silly I know)
> >
> > Traditional Array 10K disks = ~100MB/s (Limited by 1GB interface, on other
> arrays I guess this scales)
> > Ceph 7.2K + SSD Journal = ~20MB/s (Again LevelDB sync seems to limit here
> for sequential writes)
> 
> This is pretty bad.  Is RBD cache enabled?

Tell me about it, moving a 2TB VM is a painful experience. Yes, the librbd 
cache is on, but iSCSI effectively turns all writes into sync writes, which 
bypass the cache, so you are dependent on the time it takes for each OSD to 
ACK the write. In this case, waiting each time for 64KB IOs to complete due to 
the LevelDB sync, you end up with transfer speeds somewhere in the region of 
15-20MB/s. You can reproduce the same thing with something like IOmeter (64k, 
sequential write, direct IO, QD=1).
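
For reference, an fio job along these lines should reproduce roughly the same 
pattern without needing a Windows guest (the RBD device path is just a 
placeholder, and the run is destructive to whatever is on it):

  fio --name=esxi-64k-copy --filename=/dev/rbd/rbd/testvol \
      --rw=write --bs=64k --iodepth=1 --numjobs=1 \
      --direct=1 --sync=1 --runtime=60 --time_based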

NFS is even worse 

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Alex Gorbachev
Hi Jan,

On Tue, Aug 18, 2015 at 5:00 AM, Jan Schermer  wrote:
> I already evaluated EnhanceIO in combination with CentOS 6 (and backported 
> 3.10 and 4.0 kernel-lt if I remember correctly).
> It worked fine during benchmarks and stress tests, but once we ran DB2 on it 
> it panicked within minutes and took all the data with it (almost literally - 
> files that weren't touched, like OS binaries were b0rked and the filesystem 
> was unsalvageable).

Out of curiosity, were you using EnhanceIO in writeback mode?  I
assume so, as a read cache should not hurt anything.

Thanks,
Alex

> If you disregard this warning - the performance gains weren't that great 
> either, at least in a VM. It had problems when flushing to disk after 
> reaching dirty watermark and the block size has some not-well-documented 
> implications (not sure now, but I think it only cached IO _larger_than the 
> block size, so if your database keeps incrementing an XX-byte counter it will 
> go straight to disk).
>
> Flashcache doesn't respect barriers (or does it now?) - if that's ok for you 
> then go for it, it should be stable and I used it in the past in production 
> without problems.
>
> bcache seemed to work fine, but I needed to
> a) use it for root
> b) disable and enable it on the fly (doh)
> c) make it non-persistent (flush it) before reboot - not sure if that was 
> possible either.
> d) all that in a customer's VM, and that customer didn't have a strong 
> technical background to be able to fiddle with it...
> So I haven't tested it heavily.
>
> Bcache should be the obvious choice if you are in control of the environment. 
> At least you can cry on LKML's shoulder when you lose data :-)
>
> Jan
>
>
>> On 18 Aug 2015, at 01:49, Alex Gorbachev  wrote:
>>
>> What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
>> months ago, but no external contributors :(
>>
>> The nice thing about EnhanceIO is there is no need to change device
>> name, unlike bcache, flashcache etc.
>>
>> Best regards,
>> Alex
>>
>> On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz  wrote:
>>> I did some (non-ceph) work on these, and concluded that bcache was the best
>>> supported, most stable, and fastest.  This was ~1 year ago, to take it with
>>> a grain of salt, but that's what I would recommend.
>>>
>>> Daniel
>>>
>>>
>>> 
>>> From: "Dominik Zalewski" 
>>> To: "German Anders" 
>>> Cc: "ceph-users" 
>>> Sent: Wednesday, July 1, 2015 5:28:10 PM
>>> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
>>>
>>>
>>> Hi,
>>>
>>> I’ve asked same question last weeks or so (just search the mailing list
>>> archives for EnhanceIO :) and got some interesting answers.
>>>
>>> Looks like the project is pretty much dead since it was bought out by HGST.
>>> Even their website has some broken links in regards to EnhanceIO
>>>
>>> I’m keen to try flashcache or bcache (its been in the mainline kernel for
>>> some time)
>>>
>>> Dominik
>>>
>>> On 1 Jul 2015, at 21:13, German Anders  wrote:
>>>
>>> Hi cephers,
>>>
>>>   Is anyone out there that implement enhanceIO in a production environment?
>>> any recommendation? any perf output to share with the diff between using it
>>> and not?
>>>
>>> Thanks in advance,
>>>
>>> German
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Mark Nelson



On 08/18/2015 11:08 AM, Nick Fisk wrote:

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Mark Nelson
Sent: 18 August 2015 15:55
To: Jan Schermer 
Cc: ceph-users@lists.ceph.com; Nick Fisk 
Subject: Re: [ceph-users] any recommendation of using EnhanceIO?



On 08/18/2015 09:24 AM, Jan Schermer wrote:



On 18 Aug 2015, at 15:50, Mark Nelson  wrote:



On 08/18/2015 06:47 AM, Nick Fisk wrote:

Just to chime in, I gave dmcache a limited test but its lack of proper

writeback cache ruled it out for me. It only performs write back caching on
blocks already on the SSD, whereas I need something that works like a
Battery backed raid controller caching all writes.


It's amazing the 100x performance increase you get with RBD's when

doing sync writes and give it something like just 1GB write back cache with
flashcache.


For your use case, is it ok that data may live on the flashcache for some

amount of time before making to ceph to be replicated?  We've wondered
internally if this kind of trade-off is acceptable to customers or not should 
the
flashcache SSD fail.




Was it me pestering you about it? :-)
All my customers need this desperately - people don't care about having

RPO=0 seconds when all hell breaks loose.

People care about their apps being slow all the time which is effectively an

"outage".

I (sysadmin) care about having consistent data where all I have to do is start

up the VMs.


Any ideas how to approach this? I think even checkpoints (like reverting to

a known point in the past) would be great and sufficient for most people...

Here's kind of how I see the field right now:

1) Cache at the client level.  Likely fastest but obvious issues like above.
RAID1 might be an option at increased cost.  Lack of barriers in some
implementations scary.


Agreed.



2) Cache below the OSD.  Not much recent data on this.  Not likely as fast as
client side cache, but likely cheaper (fewer OSD nodes than client nodes?).
Lack of barriers in some implementations scary.


This also has the benefit of caching the leveldb on the OSD, so get a big 
performance gain from there too for small sequential writes. I looked at using 
Flashcache for this too but decided it was adding to much complexity and risk.

I thought I read somewhere that RocksDB allows you to move its WAL to SSD, is 
there anything in the pipeline for something like moving the filestore to use 
RocksDB?


I believe you can already do this, though I haven't tested it.  You can 
certainly move the monitors to rocksdb (tested) and newstore uses 
rocksdb as well.






3) Ceph Cache Tiering. Network overhead and write amplification on
promotion makes this primarily useful when workloads fit mostly into the
cache tier.  Overall safe design but care must be taken to not over-promote.

4) separate SSD pool.  Manual and not particularly flexible, but perhaps best
for applications that need consistently high performance.


I think it depends on the definition of performance. Currently even very fast 
CPU's and SSD's in their own pool will still struggle to get less than 1ms of 
write latency. If your performance requirements are for large queue depths then 
you will probably be alright. If you require something that mirrors the 
performance of traditional write back cache, then even pure SSD Pools can start 
to struggle.


Agreed.  This is definitely the crux of the problem.  The example below 
is a great start!  It would be fantastic if we could get more feedback 
from the list on the relative importance of low latency operations vs 
high IOPS through concurrency.  We have general suspicions but not a ton 
of actual data regarding what folks are seeing in practice and under 
what scenarios.





To give a real world example of what I see when doing various tests,  here is a 
rough guide to IOP's when removing a snapshot on a ESX server

Traditional Array 10K disks = 300-600 IOPs
Ceph 7.2K + SSD Journal = 100-200 IOPs (LevelDB syncing on OSD seems to be the 
main limitation)
Ceph Pure SSD Pool = 500 IOPs (Intel s3700 SSD's)


I'd be curious to see how much jemalloc or tcmalloc 2.4 + 128MB TC help 
here.  Sandisk and Intel have both done some very useful investigations, 
I've got some additional tests replicating some of their findings coming 
shortly.



Ceph Cache Tiering = 10-500 IOPs (As we know, misses can be very painful)


Indeed.  There's some work going on in this area too.  Hopefully we'll 
know how some of our ideas pan out later this week.  Assuming excessive 
promotions aren't a problem, the jemalloc/tcmalloc improvements I 
suspect will generally make cache tiering more interesting (though 
buffer cache will still be the primary source of really hot cached reads)
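
(For anyone who hasn't played with tiering yet, the basic setup is only a 
handful of commands; the pool names and sizes below are placeholders and the 
promotion-tuning options vary by release:)

  ceph osd pool create cachepool 128 128          # the SSD-backed pool
  ceph osd tier add rbd cachepool
  ceph osd tier cache-mode cachepool writeback
  ceph osd tier set-overlay rbd cachepool
  ceph osd pool set cachepool hit_set_type bloom
  ceph osd pool set cachepool target_max_bytes 500000000000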



Ceph + RBD Caching with Flashcache = 200-1000 IOPs (Readahead can give high 
bursts if snapshot blocks are sequential)


Good to know!

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Jan Schermer

> On 18 Aug 2015, at 16:44, Nick Fisk  wrote:
> 
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Mark Nelson
>> Sent: 18 August 2015 14:51
>> To: Nick Fisk ; 'Jan Schermer' 
>> Cc: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
>> 
>> 
>> 
>> On 08/18/2015 06:47 AM, Nick Fisk wrote:
>>> Just to chime in, I gave dmcache a limited test but its lack of proper
>> writeback cache ruled it out for me. It only performs write back caching on
>> blocks already on the SSD, whereas I need something that works like a
>> Battery backed raid controller caching all writes.
>>> 
>>> It's amazing the 100x performance increase you get with RBD's when doing
>> sync writes and give it something like just 1GB write back cache with
>> flashcache.
>> 
>> For your use case, is it ok that data may live on the flashcache for some
>> amount of time before making to ceph to be replicated?  We've wondered
>> internally if this kind of trade-off is acceptable to customers or not 
>> should the
>> flashcache SSD fail.
> 
> Yes, I agree, it's not ideal. But I believe it’s the only way to get the 
> performance required for some workloads that need write latency's <1ms. 
> 
> I'm still in testing at the moment with the testing kernel that includes 
> blk-mq fixes for large queue depths and max io sizes. But if we decide to put 
> into production, it would be using 2x SAS dual port SSD's in RAID1 across two 
> servers for HA. As we are currently using iSCSI from these two servers, there 
> is no real loss of availability by doing this. Generally I think as long as 
> you build this around the fault domains of the application you are caching, 
> it shouldn't impact too much.
> 
> I guess for people using openstack and other direct RBD interfaces it may not 
> be such an attractive option. I've been thinking that maybe Ceph needs to 
> have an additional daemon with very low overheads, which is run on SSD's to 
> provide shared persistent cache devices for librbd. There's still a trade 
> off, maybe not as much as using Flashcache, but for some workloads like 
> database's, many people may decide that it's worth it. Of course I realise 
> this would be a lot of work and everyone is really busy, but in terms of 
> performance gained it would most likely have a dramatic effect in making Ceph 
> look comparable to other solutions like VSAN or ScaleIO when it comes to high 
> iops/low latency stuff.
> 

Additional daemon that is persistent how? Isn't that what journal does already, 
just too slowly?

I think the best (and easiest!) approach is to mimic what a monolithic SAN does

Currently
1) client issues blocking/atomic/sync IO
2) rbd client sends this IO to all OSDs
3) after all OSDs "process the IO", the IO is finished and considered persistent

That has serious implications
* every IO is processed separately, not much coalescing
* OSD processes add the latency when processing this IO
* one OSD can be slow momentarily, IO backs up and the cluster stalls

Let me just select what "processing the IO" means with respect to my 
architecture and I can likely get a 100x improvement

Let me choose:

1) WHERE the IO is persisted
Do I really need all (e.g. 3) OSDs to persist the data or is quorum (2) 
sufficient?
Not waiting for one slow OSD gives me at least some SLA for planned tasks like 
backfilling, scrubbing, deep-scrubbing
Hands up who can afford to leave deep-scrub enabled in production...

2) WHEN the IO is persisted
Do I really need all OSDs to flush the data to disk?
If all the nodes are in the same cabinet and on the same UPS then this makes 
sense.
But my nodes are actually in different buildings ~10km apart. The chances of 
power failing simultaneously, N+1 UPSes failing simultaneously, diesels failing 
simultaneously... When nukes start falling and this happens then I'll start 
looking for backups.
Even if your nodes are in one datacentre, there are likely redundant (2+) 
circuits.
And even if you have just one cabinet, you can add 3x UPS in there and gain a 
nice speed boost.

So the IO could actually be pretty safe and happy once it gets to remote 
buffers on enough (quorum) nodes and waits for processing. It can be batched, 
it can be coalesced, it can be rewritten with subsequent updates...

3)  WHAT amount of IO is stored
Do I need to have the last transaction or can I tolerate 1 minute of missing 
data?
Checkpoints, checksums on last transaction, rollback (journal already does this 
AFAIK)...

4) I DON'T CARE mode :-)
qemu cache=unsafe equivalent but s

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Mark Nelson
> Sent: 18 August 2015 15:55
> To: Jan Schermer 
> Cc: ceph-users@lists.ceph.com; Nick Fisk 
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
>
>
>
> On 08/18/2015 09:24 AM, Jan Schermer wrote:
> >
> >> On 18 Aug 2015, at 15:50, Mark Nelson  wrote:
> >>
> >>
> >>
> >> On 08/18/2015 06:47 AM, Nick Fisk wrote:
> >>> Just to chime in, I gave dmcache a limited test but its lack of proper
> writeback cache ruled it out for me. It only performs write back caching on
> blocks already on the SSD, whereas I need something that works like a
> Battery backed raid controller caching all writes.
> >>>
> >>> It's amazing the 100x performance increase you get with RBD's when
> doing sync writes and give it something like just 1GB write back cache with
> flashcache.
> >>
> >> For your use case, is it ok that data may live on the flashcache for some
> amount of time before making to ceph to be replicated?  We've wondered
> internally if this kind of trade-off is acceptable to customers or not should 
> the
> flashcache SSD fail.
> >>
> >
> > Was it me pestering you about it? :-)
> > All my customers need this desperately - people don't care about having
> RPO=0 seconds when all hell breaks loose.
> > People care about their apps being slow all the time which is effectively an
> "outage".
> > I (sysadmin) care about having consistent data where all I have to do is 
> > start
> up the VMs.
> >
> > Any ideas how to approach this? I think even checkpoints (like reverting to
> a known point in the past) would be great and sufficient for most people...
>
> Here's kind of how I see the field right now:
>
> 1) Cache at the client level.  Likely fastest but obvious issues like above.
> RAID1 might be an option at increased cost.  Lack of barriers in some
> implementations scary.

Agreed.

>
> 2) Cache below the OSD.  Not much recent data on this.  Not likely as fast as
> client side cache, but likely cheaper (fewer OSD nodes than client nodes?).
> Lack of barriers in some implementations scary.

This also has the benefit of caching the leveldb on the OSD, so you get a big 
performance gain there too for small sequential writes. I looked at using 
Flashcache for this too but decided it was adding too much complexity and risk.

I thought I read somewhere that RocksDB allows you to move its WAL to SSD, is 
there anything in the pipeline for something like moving the filestore to use 
RocksDB?

>
> 3) Ceph Cache Tiering. Network overhead and write amplification on
> promotion makes this primarily useful when workloads fit mostly into the
> cache tier.  Overall safe design but care must be taken to not over-promote.
>
> 4) separate SSD pool.  Manual and not particularly flexible, but perhaps best
> for applications that need consistently high performance.

I think it depends on the definition of performance. Currently even very fast 
CPU's and SSD's in their own pool will still struggle to get less than 1ms of 
write latency. If your performance requirements are for large queue depths then 
you will probably be alright. If you require something that mirrors the 
performance of traditional write back cache, then even pure SSD Pools can start 
to struggle.
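
For anyone who hasn't carved one out before, a separate SSD pool is mostly a 
CRUSH exercise. A rough sketch, assuming you already have a separate CRUSH root 
called "ssd" containing only the SSD OSDs (names and PG counts are placeholders):

  ceph osd crush rule create-simple ssd-rule ssd host
  ceph osd pool create ssd-pool 512 512 replicated ssd-rule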


To give a real world example of what I see when doing various tests,  here is a 
rough guide to IOP's when removing a snapshot on a ESX server

Traditional Array 10K disks = 300-600 IOPs
Ceph 7.2K + SSD Journal = 100-200 IOPs (LevelDB syncing on OSD seems to be the 
main limitation)
Ceph Pure SSD Pool = 500 IOPs (Intel s3700 SSD's)
Ceph Cache Tiering = 10-500 IOPs (As we know, misses can be very painful)
Ceph + RBD Caching with Flashcache = 200-1000 IOPs (Readahead can give high 
bursts if snapshot blocks are sequential)

And when copying VMs to the datastore (ESXi does this in sequential 64k 
IOs... yes, silly I know)

Traditional Array 10K disks = ~100MB/s (Limited by 1GB interface, on other 
arrays I guess this scales)
Ceph 7.2K + SSD Journal = ~20MB/s (Again LevelDB sync seems to limit here for 
sequential writes)
Ceph Pure SSD Pool = ~50MB/s (Ceph CPU bottleneck is occurring)
Ceph Cache Tiering = ~50MB/s when writing to new block, <10MB/s when 
promote+overwrite
Ceph + RBD Caching with Flashcache = As fast as the SSD will go


>
> >
> >
> >>>
> >>>
> >>>> -Original Message-
> >>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
> >>>> Behalf Of Jan Schermer
> >>>> Sent: 18 

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Mark Nelson



On 08/18/2015 09:24 AM, Jan Schermer wrote:



On 18 Aug 2015, at 15:50, Mark Nelson  wrote:



On 08/18/2015 06:47 AM, Nick Fisk wrote:

Just to chime in, I gave dmcache a limited test but its lack of proper 
writeback cache ruled it out for me. It only performs write back caching on 
blocks already on the SSD, whereas I need something that works like a Battery 
backed raid controller caching all writes.

It's amazing the 100x performance increase you get with RBD's when doing sync 
writes and give it something like just 1GB write back cache with flashcache.


For your use case, is it ok that data may live on the flashcache for some 
amount of time before making to ceph to be replicated?  We've wondered 
internally if this kind of trade-off is acceptable to customers or not should 
the flashcache SSD fail.



Was it me pestering you about it? :-)
All my customers need this desperately - people don't care about having RPO=0 
seconds when all hell breaks loose.
People care about their apps being slow all the time which is effectively an 
"outage".
I (sysadmin) care about having consistent data where all I have to do is start 
up the VMs.

Any ideas how to approach this? I think even checkpoints (like reverting to a 
known point in the past) would be great and sufficient for most people...


Here's kind of how I see the field right now:

1) Cache at the client level.  Likely fastest but obvious issues like 
above.  RAID1 might be an option at increased cost.  Lack of barriers in 
some implementations scary.


2) Cache below the OSD.  Not much recent data on this.  Not likely as 
fast as client side cache, but likely cheaper (fewer OSD nodes than 
client nodes?).  Lack of barriers in some implementations scary.


3) Ceph Cache Tiering. Network overhead and write amplification on 
promotion makes this primarily useful when workloads fit mostly into the 
cache tier.  Overall safe design but care must be taken to not over-promote.


4) separate SSD pool.  Manual and not particularly flexible, but perhaps 
best for applications that need consistently high performance.









-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Jan Schermer
Sent: 18 August 2015 12:44
To: Mark Nelson 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] any recommendation of using EnhanceIO?

I did not. Not sure why now - probably for the same reason I didn't
extensively test bcache.
I'm not a real fan of device mapper though, so if I had to choose I'd still go 
for
bcache :-)

Jan


On 18 Aug 2015, at 13:33, Mark Nelson  wrote:

Hi Jan,

Out of curiosity did you ever try dm-cache?  I've been meaning to give it a

spin but haven't had the spare cycles.


Mark

On 08/18/2015 04:00 AM, Jan Schermer wrote:

I already evaluated EnhanceIO in combination with CentOS 6 (and

backported 3.10 and 4.0 kernel-lt if I remember correctly).

It worked fine during benchmarks and stress tests, but once we ran DB2

on it it panicked within minutes and took all the data with it (almost 
literally -
files that weren't touched, like OS binaries were b0rked and the filesystem
was unsalvageable).

If you disregard this warning - the performance gains weren't that great

either, at least in a VM. It had problems when flushing to disk after reaching
dirty watermark and the block size has some not-well-documented
implications (not sure now, but I think it only cached IO _larger_than the
block size, so if your database keeps incrementing an XX-byte counter it will
go straight to disk).


Flashcache doesn't respect barriers (or does it now?) - if that's ok for you

then go for it, it should be stable and I used it in the past in production
without problems.


bcache seemed to work fine, but I needed to
a) use it for root
b) disable and enable it on the fly (doh)
c) make it non-persistent (flush it) before reboot - not sure if that was

possible either.

d) all that in a customer's VM, and that customer didn't have a strong

technical background to be able to fiddle with it...

So I haven't tested it heavily.

Bcache should be the obvious choice if you are in control of the
environment. At least you can cry on LKML's shoulder when you lose
data :-)

Jan



On 18 Aug 2015, at 01:49, Alex Gorbachev 

wrote:


What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
months ago, but no external contributors :(

The nice thing about EnhanceIO is there is no need to change device
name, unlike bcache, flashcache etc.

Best regards,
Alex

On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz 

wrote:

I did some (non-ceph) work on these, and concluded that bcache was
the best supported, most stable, and fastest.  This was ~1 year
ago, to take it with a grain of salt, but that's what I would recommend.

Daniel



From: "Dominik Zalewski" 
To: "German Anders

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Mark Nelson
> Sent: 18 August 2015 14:51
> To: Nick Fisk ; 'Jan Schermer' 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> 
> 
> On 08/18/2015 06:47 AM, Nick Fisk wrote:
> > Just to chime in, I gave dmcache a limited test but its lack of proper
> writeback cache ruled it out for me. It only performs write back caching on
> blocks already on the SSD, whereas I need something that works like a
> Battery backed raid controller caching all writes.
> >
> > It's amazing the 100x performance increase you get with RBD's when doing
> sync writes and give it something like just 1GB write back cache with
> flashcache.
> 
> For your use case, is it ok that data may live on the flashcache for some
> amount of time before making to ceph to be replicated?  We've wondered
> internally if this kind of trade-off is acceptable to customers or not should 
> the
> flashcache SSD fail.

Yes, I agree, it's not ideal. But I believe it’s the only way to get the 
performance required for some workloads that need write latencies of <1ms. 

I'm still in testing at the moment with the testing kernel that includes blk-mq 
fixes for large queue depths and max io sizes. But if we decide to put into 
production, it would be using 2x SAS dual port SSD's in RAID1 across two 
servers for HA. As we are currently using iSCSI from these two servers, there 
is no real loss of availability by doing this. Generally I think as long as you 
build this around the fault domains of the application you are caching, it 
shouldn't impact too much.

I guess for people using openstack and other direct RBD interfaces it may not 
be such an attractive option. I've been thinking that maybe Ceph needs to have 
an additional daemon with very low overheads, which is run on SSD's to provide 
shared persistent cache devices for librbd. There's still a trade off, maybe 
not as much as using Flashcache, but for some workloads like databases, many 
people may decide that it's worth it. Of course I realise this would be a lot 
of work and everyone is really busy, but in terms of performance gained it 
would most likely have a dramatic effect in making Ceph look comparable to 
other solutions like VSAN or ScaleIO when it comes to high iops/low latency 
stuff.

> 
> >
> >
> >> -Original Message-
> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> >> Of Jan Schermer
> >> Sent: 18 August 2015 12:44
> >> To: Mark Nelson 
> >> Cc: ceph-users@lists.ceph.com
> >> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> >>
> >> I did not. Not sure why now - probably for the same reason I didn't
> >> extensively test bcache.
> >> I'm not a real fan of device mapper though, so if I had to choose I'd
> >> still go for bcache :-)
> >>
> >> Jan
> >>
> >>> On 18 Aug 2015, at 13:33, Mark Nelson  wrote:
> >>>
> >>> Hi Jan,
> >>>
> >>> Out of curiosity did you ever try dm-cache?  I've been meaning to
> >>> give it a
> >> spin but haven't had the spare cycles.
> >>>
> >>> Mark
> >>>
> >>> On 08/18/2015 04:00 AM, Jan Schermer wrote:
> >>>> I already evaluated EnhanceIO in combination with CentOS 6 (and
> >> backported 3.10 and 4.0 kernel-lt if I remember correctly).
> >>>> It worked fine during benchmarks and stress tests, but once we run
> >>>> DB2
> >> on it it panicked within minutes and took all the data with it
> >> (almost literally - files that werent touched, like OS binaries were
> >> b0rked and the filesystem was unsalvageable).
> >>>> If you disregard this warning - the performance gains weren't that
> >>>> great
> >> either, at least in a VM. It had problems when flushing to disk after
> >> reaching dirty watermark and the block size has some
> >> not-well-documented implications (not sure now, but I think it only
> >> cached IO _larger_than the block size, so if your database keeps
> >> incrementing an XX-byte counter it will go straight to disk).
> >>>>
> >>>> Flashcache doesn't respect barriers (or does it now?) - if that's
> >>>> ok for you
> >> than go for it, it should be stable and I used it in the past in
> >> production without problems.
> >>>>
> >>>> 

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Jan Schermer

> On 18 Aug 2015, at 15:50, Mark Nelson  wrote:
> 
> 
> 
> On 08/18/2015 06:47 AM, Nick Fisk wrote:
>> Just to chime in, I gave dmcache a limited test but its lack of proper 
>> writeback cache ruled it out for me. It only performs write back caching on 
>> blocks already on the SSD, whereas I need something that works like a 
>> Battery backed raid controller caching all writes.
>> 
>> It's amazing the 100x performance increase you get with RBD's when doing 
>> sync writes and give it something like just 1GB write back cache with 
>> flashcache.
> 
> For your use case, is it ok that data may live on the flashcache for some 
> amount of time before making to ceph to be replicated?  We've wondered 
> internally if this kind of trade-off is acceptable to customers or not should 
> the flashcache SSD fail.
> 

Was it me pestering you about it? :-)
All my customers need this desperately - people don't care about having RPO=0 
seconds when all hell breaks loose.
People care about their apps being slow all the time which is effectively an 
"outage".
I (sysadmin) care about having consistent data where all I have to do is start 
up the VMs.

Any ideas how to approach this? I think even checkpoints (like reverting to a 
known point in the past) would be great and sufficient for most people...


>> 
>> 
>>> -Original Message-
>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>>> Jan Schermer
>>> Sent: 18 August 2015 12:44
>>> To: Mark Nelson 
>>> Cc: ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
>>> 
>>> I did not. Not sure why now - probably for the same reason I didn't
>>> extensively test bcache.
>>> I'm not a real fan of device mapper though, so if I had to choose I'd still 
>>> go for
>>> bcache :-)
>>> 
>>> Jan
>>> 
>>>> On 18 Aug 2015, at 13:33, Mark Nelson  wrote:
>>>> 
>>>> Hi Jan,
>>>> 
>>>> Out of curiosity did you ever try dm-cache?  I've been meaning to give it a
>>> spin but haven't had the spare cycles.
>>>> 
>>>> Mark
>>>> 
>>>> On 08/18/2015 04:00 AM, Jan Schermer wrote:
>>>>> I already evaluated EnhanceIO in combination with CentOS 6 (and
>>> backported 3.10 and 4.0 kernel-lt if I remember correctly).
>>>>> It worked fine during benchmarks and stress tests, but once we run DB2
>>> on it it panicked within minutes and took all the data with it (almost 
>>> literally -
>>> files that werent touched, like OS binaries were b0rked and the filesystem
>>> was unsalvageable).
>>>>> If you disregard this warning - the performance gains weren't that great
>>> either, at least in a VM. It had problems when flushing to disk after 
>>> reaching
>>> dirty watermark and the block size has some not-well-documented
>>> implications (not sure now, but I think it only cached IO _larger_than the
>>> block size, so if your database keeps incrementing an XX-byte counter it 
>>> will
>>> go straight to disk).
>>>>> 
>>>>> Flashcache doesn't respect barriers (or does it now?) - if that's ok for 
>>>>> you
>>> than go for it, it should be stable and I used it in the past in production
>>> without problems.
>>>>> 
>>>>> bcache seemed to work fine, but I needed to
>>>>> a) use it for root
>>>>> b) disable and enable it on the fly (doh)
>>>>> c) make it non-persisent (flush it) before reboot - not sure if that was
>>> possible either.
>>>>> d) all that in a customer's VM, and that customer didn't have a strong
>>> technical background to be able to fiddle with it...
>>>>> So I haven't tested it heavily.
>>>>> 
>>>>> Bcache should be the obvious choice if you are in control of the
>>>>> environment. At least you can cry on LKML's shoulder when you lose
>>>>> data :-)
>>>>> 
>>>>> Jan
>>>>> 
>>>>> 
>>>>>> On 18 Aug 2015, at 01:49, Alex Gorbachev 
>>> wrote:
>>>>>> 
>>>>>> What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
>>>>>> months ago, but no external contributors :(
>>>>>> 
>>>>>> The nice thing about EnhanceIO is th

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Mark Nelson



On 08/18/2015 06:47 AM, Nick Fisk wrote:

Just to chime in, I gave dmcache a limited test but its lack of proper 
writeback cache ruled it out for me. It only performs write back caching on 
blocks already on the SSD, whereas I need something that works like a Battery 
backed raid controller caching all writes.

It's amazing the 100x performance increase you get with RBD's when doing sync 
writes and give it something like just 1GB write back cache with flashcache.


For your use case, is it ok that data may live on the flashcache for 
some amount of time before making it to Ceph to be replicated?  We've 
wondered internally whether this kind of trade-off is acceptable to customers 
or not should the flashcache SSD fail.






-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Jan Schermer
Sent: 18 August 2015 12:44
To: Mark Nelson 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] any recommendation of using EnhanceIO?

I did not. Not sure why now - probably for the same reason I didn't
extensively test bcache.
I'm not a real fan of device mapper though, so if I had to choose I'd still go 
for
bcache :-)

Jan


On 18 Aug 2015, at 13:33, Mark Nelson  wrote:

Hi Jan,

Out of curiosity did you ever try dm-cache?  I've been meaning to give it a

spin but haven't had the spare cycles.


Mark

On 08/18/2015 04:00 AM, Jan Schermer wrote:

I already evaluated EnhanceIO in combination with CentOS 6 (and

backported 3.10 and 4.0 kernel-lt if I remember correctly).

It worked fine during benchmarks and stress tests, but once we ran DB2

on it it panicked within minutes and took all the data with it (almost 
literally -
files that weren't touched, like OS binaries were b0rked and the filesystem
was unsalvageable).

If you disregard this warning - the performance gains weren't that great

either, at least in a VM. It had problems when flushing to disk after reaching
dirty watermark and the block size has some not-well-documented
implications (not sure now, but I think it only cached IO _larger_than the
block size, so if your database keeps incrementing an XX-byte counter it will
go straight to disk).


Flashcache doesn't respect barriers (or does it now?) - if that's ok for you

then go for it, it should be stable and I used it in the past in production
without problems.


bcache seemed to work fine, but I needed to
a) use it for root
b) disable and enable it on the fly (doh)
c) make it non-persistent (flush it) before reboot - not sure if that was

possible either.

d) all that in a customer's VM, and that customer didn't have a strong

technical background to be able to fiddle with it...

So I haven't tested it heavily.

Bcache should be the obvious choice if you are in control of the
environment. At least you can cry on LKML's shoulder when you lose
data :-)

Jan



On 18 Aug 2015, at 01:49, Alex Gorbachev 

wrote:


What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
months ago, but no external contributors :(

The nice thing about EnhanceIO is there is no need to change device
name, unlike bcache, flashcache etc.

Best regards,
Alex

On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz 

wrote:

I did some (non-ceph) work on these, and concluded that bcache was
the best supported, most stable, and fastest.  This was ~1 year
ago, to take it with a grain of salt, but that's what I would recommend.

Daniel



From: "Dominik Zalewski" 
To: "German Anders" 
Cc: "ceph-users" 
Sent: Wednesday, July 1, 2015 5:28:10 PM
Subject: Re: [ceph-users] any recommendation of using EnhanceIO?


Hi,

I’ve asked same question last weeks or so (just search the mailing
list archives for EnhanceIO :) and got some interesting answers.

Looks like the project is pretty much dead since it was bought out by

HGST.

Even their website has some broken links in regards to EnhanceIO

I’m keen to try flashcache or bcache (its been in the mainline
kernel for some time)

Dominik

On 1 Jul 2015, at 21:13, German Anders 

wrote:


Hi cephers,

   Is anyone out there that implement enhanceIO in a production

environment?

any recommendation? any perf output to share with the diff between
using it and not?

Thanks in advance,

German


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Campbell, Bill
Hey Stefan, 
Are you using your Ceph cluster for virtualization storage? Is dm-writeboost 
configured on the OSD nodes themselves? 

- Original Message -

From: "Stefan Priebe - Profihost AG"  
To: "Mark Nelson" , ceph-users@lists.ceph.com 
Sent: Tuesday, August 18, 2015 7:36:10 AM 
Subject: Re: [ceph-users] any recommendation of using EnhanceIO? 

We've been using an extra caching layer for Ceph since the beginning for our 
older Ceph deployments. All new deployments go with full SSDs. 

I've tested so far: 
- EnhanceIO 
- Flashcache 
- Bcache 
- dm-cache 
- dm-writeboost 

The best working solution was and is bcache, except for its buggy code. 
The current code in the 4.2-rc7 vanilla kernel still contains bugs, e.g. 
discards result in a crashed FS after reboots and so on. But it's still 
the fastest for Ceph. 
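
(For reference, a minimal bcache setup looks roughly like this; device names 
are placeholders and on most distros the register step is handled by udev:)

  make-bcache -C /dev/nvme0n1          # format the SSD as a cache device
  make-bcache -B /dev/sdb              # format the backing disk
  echo /dev/nvme0n1 > /sys/fs/bcache/register
  echo /dev/sdb > /sys/fs/bcache/register
  echo <cset-uuid from bcache-super-show> > /sys/block/bcache0/bcache/attach
  echo writeback > /sys/block/bcache0/bcache/cache_mode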

The 2nd best solution which we already use in production is 
dm-writeboost (https://github.com/akiradeveloper/dm-writeboost). 
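
(A minimal dm-writeboost sketch, going from memory of the project README; the 
table format has changed between versions, so check the docs for the version 
you build:)

  BACKING=/dev/sdb
  CACHE=/dev/nvme0n1
  SZ=$(blockdev --getsz $BACKING)
  dmsetup create wbdev --table "0 $SZ writeboost $BACKING $CACHE"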

Everything else is too slow. 

Stefan 
Am 18.08.2015 um 13:33 schrieb Mark Nelson: 
> Hi Jan, 
> 
> Out of curiosity did you ever try dm-cache? I've been meaning to give 
> it a spin but haven't had the spare cycles. 
> 
> Mark 
> 
> On 08/18/2015 04:00 AM, Jan Schermer wrote: 
>> I already evaluated EnhanceIO in combination with CentOS 6 (and 
>> backported 3.10 and 4.0 kernel-lt if I remember correctly). 
>> It worked fine during benchmarks and stress tests, but once we run DB2 
>> on it it panicked within minutes and took all the data with it (almost 
>> literally - files that werent touched, like OS binaries were b0rked 
>> and the filesystem was unsalvageable). 
>> If you disregard this warning - the performance gains weren't that 
>> great either, at least in a VM. It had problems when flushing to disk 
>> after reaching dirty watermark and the block size has some 
>> not-well-documented implications (not sure now, but I think it only 
>> cached IO _larger_than the block size, so if your database keeps 
>> incrementing an XX-byte counter it will go straight to disk). 
>> 
>> Flashcache doesn't respect barriers (or does it now?) - if that's ok 
>> for you than go for it, it should be stable and I used it in the past 
>> in production without problems. 
>> 
>> bcache seemed to work fine, but I needed to 
>> a) use it for root 
>> b) disable and enable it on the fly (doh) 
>> c) make it non-persisent (flush it) before reboot - not sure if that 
>> was possible either. 
>> d) all that in a customer's VM, and that customer didn't have a strong 
>> technical background to be able to fiddle with it... 
>> So I haven't tested it heavily. 
>> 
>> Bcache should be the obvious choice if you are in control of the 
>> environment. At least you can cry on LKML's shoulder when you lose 
>> data :-) 
>> 
>> Jan 
>> 
>> 
>>> On 18 Aug 2015, at 01:49, Alex Gorbachev  wrote: 
>>> 
>>> What about https://github.com/Frontier314/EnhanceIO? Last commit 2 
>>> months ago, but no external contributors :( 
>>> 
>>> The nice thing about EnhanceIO is there is no need to change device 
>>> name, unlike bcache, flashcache etc. 
>>> 
>>> Best regards, 
>>> Alex 
>>> 
>>> On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz  
>>> wrote: 
>>>> I did some (non-ceph) work on these, and concluded that bcache was 
>>>> the best 
>>>> supported, most stable, and fastest. This was ~1 year ago, to take 
>>>> it with 
>>>> a grain of salt, but that's what I would recommend. 
>>>> 
>>>> Daniel 
>>>> 
>>>> 
>>>>  
>>>> From: "Dominik Zalewski"  
>>>> To: "German Anders"  
>>>> Cc: "ceph-users"  
>>>> Sent: Wednesday, July 1, 2015 5:28:10 PM 
>>>> Subject: Re: [ceph-users] any recommendation of using EnhanceIO? 
>>>> 
>>>> 
>>>> Hi, 
>>>> 
>>>> I’ve asked same question last weeks or so (just search the mailing list 
>>>> archives for EnhanceIO :) and got some interesting answers. 
>>>> 
>>>> Looks like the project is pretty much dead since it was bought out 
>>>> by HGST. 
>>>> Even their website has some broken links in regards to EnhanceIO 
>>>> 
>>>> I’m keen to try flashcache or bcache (its been in the mainline 
>>>> kernel for 
>>>> some time) 
>>>> 
>>>> Dominik

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
> -Original Message-
> From: Emmanuel Florac [mailto:eflo...@intellique.com]
> Sent: 18 August 2015 12:26
> To: Nick Fisk 
> Cc: 'Jan Schermer' ; 'Alex Gorbachev'  integration.com>; 'Dominik Zalewski' ; ceph-
> us...@lists.ceph.com
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> Le Tue, 18 Aug 2015 10:12:59 +0100
> Nick Fisk  écrivait:
> 
> > > Bcache should be the obvious choice if you are in control of the
> > > environment. At least you can cry on LKML's shoulder when you lose
> > > data :-)
> >
> > Please note, it looks like the main(only?) dev of Bcache has started
> > making a new version of bcache, bcachefs. At this stage I'm not sure
> > what this means for the ongoing support of the existing bcache
> > project.
> 
> bcachefs is more than a "new version of bcache", it's a complete POSIX
> filesystem with integrated caching. Looks like a silly idea if you ask me
> (because we already have several excellent filesystems; because developing
> a reliable filesystem is DAMN HARD; because building a feature-complete FS
> is CRAZY HARD; because FTL sucks anyway; etc).

Agreed, it's such a shame that there isn't a simple, reliable and maintained 
caching solution out there for Linux. When I started seeing all these projects 
spring up 5-6 years ago I was full of optimism, but we still don't have 
anything I would call fully usable.

> 
> --
> 
> Emmanuel Florac |   Direction technique
> |   Intellique
> | 
> |   +33 1 78 94 84 02
> 




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
Just to chime in, I gave dmcache a limited test but its lack of a proper 
writeback cache ruled it out for me. It only performs writeback caching on 
blocks already on the SSD, whereas I need something that works like a 
battery-backed RAID controller, caching all writes.

It's amazing the 100x performance increase you get with RBDs when doing sync 
writes if you give them something like just 1GB of writeback cache with flashcache.
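
For reference, the sort of setup being described is roughly the following 
(device names are placeholders and the syntax is from memory, so check the 
flashcache docs):

  flashcache_create -p back -s 1g rbd_wb /dev/sdb1 /dev/rbd0
  # then point the filesystem / iSCSI LUN at /dev/mapper/rbd_wb instead of the raw RBD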


> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Jan Schermer
> Sent: 18 August 2015 12:44
> To: Mark Nelson 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> I did not. Not sure why now - probably for the same reason I didn't
> extensively test bcache.
> I'm not a real fan of device mapper though, so if I had to choose I'd still 
> go for
> bcache :-)
> 
> Jan
> 
> > On 18 Aug 2015, at 13:33, Mark Nelson  wrote:
> >
> > Hi Jan,
> >
> > Out of curiosity did you ever try dm-cache?  I've been meaning to give it a
> spin but haven't had the spare cycles.
> >
> > Mark
> >
> > On 08/18/2015 04:00 AM, Jan Schermer wrote:
> >> I already evaluated EnhanceIO in combination with CentOS 6 (and
> backported 3.10 and 4.0 kernel-lt if I remember correctly).
> >> It worked fine during benchmarks and stress tests, but once we run DB2
> on it it panicked within minutes and took all the data with it (almost 
> literally -
> files that werent touched, like OS binaries were b0rked and the filesystem
> was unsalvageable).
> >> If you disregard this warning - the performance gains weren't that great
> either, at least in a VM. It had problems when flushing to disk after reaching
> dirty watermark and the block size has some not-well-documented
> implications (not sure now, but I think it only cached IO _larger_than the
> block size, so if your database keeps incrementing an XX-byte counter it will
> go straight to disk).
> >>
> >> Flashcache doesn't respect barriers (or does it now?) - if that's ok for 
> >> you
> than go for it, it should be stable and I used it in the past in production
> without problems.
> >>
> >> bcache seemed to work fine, but I needed to
> >> a) use it for root
> >> b) disable and enable it on the fly (doh)
> >> c) make it non-persisent (flush it) before reboot - not sure if that was
> possible either.
> >> d) all that in a customer's VM, and that customer didn't have a strong
> technical background to be able to fiddle with it...
> >> So I haven't tested it heavily.
> >>
> >> Bcache should be the obvious choice if you are in control of the
> >> environment. At least you can cry on LKML's shoulder when you lose
> >> data :-)
> >>
> >> Jan
> >>
> >>
> >>> On 18 Aug 2015, at 01:49, Alex Gorbachev 
> wrote:
> >>>
> >>> What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
> >>> months ago, but no external contributors :(
> >>>
> >>> The nice thing about EnhanceIO is there is no need to change device
> >>> name, unlike bcache, flashcache etc.
> >>>
> >>> Best regards,
> >>> Alex
> >>>
> >>> On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz 
> wrote:
> >>>> I did some (non-ceph) work on these, and concluded that bcache was
> >>>> the best supported, most stable, and fastest.  This was ~1 year
> >>>> ago, to take it with a grain of salt, but that's what I would recommend.
> >>>>
> >>>> Daniel
> >>>>
> >>>>
> >>>> 
> >>>> From: "Dominik Zalewski" 
> >>>> To: "German Anders" 
> >>>> Cc: "ceph-users" 
> >>>> Sent: Wednesday, July 1, 2015 5:28:10 PM
> >>>> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> >>>>
> >>>>
> >>>> Hi,
> >>>>
> >>>> I’ve asked same question last weeks or so (just search the mailing
> >>>> list archives for EnhanceIO :) and got some interesting answers.
> >>>>
> >>>> Looks like the project is pretty much dead since it was bought out by
> HGST.
> >>>> Even their website has some broken links in regards to EnhanceIO
> >>>>
> >>>> I’m keen to try flash

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Jan Schermer
I did not. Not sure why now - probably for the same reason I didn't extensively 
test bcache.
I'm not a real fan of device mapper though, so if I had to choose I'd still go 
for bcache :-)

Jan
 
> On 18 Aug 2015, at 13:33, Mark Nelson  wrote:
> 
> Hi Jan,
> 
> Out of curiosity did you ever try dm-cache?  I've been meaning to give it a 
> spin but haven't had the spare cycles.
> 
> Mark
> 
> On 08/18/2015 04:00 AM, Jan Schermer wrote:
>> I already evaluated EnhanceIO in combination with CentOS 6 (and backported 
>> 3.10 and 4.0 kernel-lt if I remember correctly).
>> It worked fine during benchmarks and stress tests, but once we run DB2 on it 
>> it panicked within minutes and took all the data with it (almost literally - 
>> files that werent touched, like OS binaries were b0rked and the filesystem 
>> was unsalvageable).
>> If you disregard this warning - the performance gains weren't that great 
>> either, at least in a VM. It had problems when flushing to disk after 
>> reaching dirty watermark and the block size has some not-well-documented 
>> implications (not sure now, but I think it only cached IO _larger_than the 
>> block size, so if your database keeps incrementing an XX-byte counter it 
>> will go straight to disk).
>> 
>> Flashcache doesn't respect barriers (or does it now?) - if that's ok for you 
>> than go for it, it should be stable and I used it in the past in production 
>> without problems.
>> 
>> bcache seemed to work fine, but I needed to
>> a) use it for root
>> b) disable and enable it on the fly (doh)
>> c) make it non-persisent (flush it) before reboot - not sure if that was 
>> possible either.
>> d) all that in a customer's VM, and that customer didn't have a strong 
>> technical background to be able to fiddle with it...
>> So I haven't tested it heavily.
>> 
>> Bcache should be the obvious choice if you are in control of the 
>> environment. At least you can cry on LKML's shoulder when you lose data :-)
>> 
>> Jan
>> 
>> 
>>> On 18 Aug 2015, at 01:49, Alex Gorbachev  wrote:
>>> 
>>> What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
>>> months ago, but no external contributors :(
>>> 
>>> The nice thing about EnhanceIO is there is no need to change device
>>> name, unlike bcache, flashcache etc.
>>> 
>>> Best regards,
>>> Alex
>>> 
>>> On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz  wrote:
>>>> I did some (non-ceph) work on these, and concluded that bcache was the best
>>>> supported, most stable, and fastest.  This was ~1 year ago, to take it with
>>>> a grain of salt, but that's what I would recommend.
>>>> 
>>>> Daniel
>>>> 
>>>> 
>>>> 
>>>> From: "Dominik Zalewski" 
>>>> To: "German Anders" 
>>>> Cc: "ceph-users" 
>>>> Sent: Wednesday, July 1, 2015 5:28:10 PM
>>>> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
>>>> 
>>>> 
>>>> Hi,
>>>> 
>>>> I’ve asked same question last weeks or so (just search the mailing list
>>>> archives for EnhanceIO :) and got some interesting answers.
>>>> 
>>>> Looks like the project is pretty much dead since it was bought out by HGST.
>>>> Even their website has some broken links in regards to EnhanceIO
>>>> 
>>>> I’m keen to try flashcache or bcache (its been in the mainline kernel for
>>>> some time)
>>>> 
>>>> Dominik
>>>> 
>>>> On 1 Jul 2015, at 21:13, German Anders  wrote:
>>>> 
>>>> Hi cephers,
>>>> 
>>>>   Is anyone out there that implement enhanceIO in a production environment?
>>>> any recommendation? any perf output to share with the diff between using it
>>>> and not?
>>>> 
>>>> Thanks in advance,
>>>> 
>>>> German

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Stefan Priebe - Profihost AG
We've been using an extra caching layer for Ceph since the beginning for our
older Ceph deployments. All new deployments go with full SSDs.

I've tested so far:
- EnhanceIO
- Flashcache
- Bcache
- dm-cache
- dm-writeboost

The best working solution was and is bcache, except for its buggy code.
The current code in the 4.2-rc7 vanilla kernel still contains bugs, e.g.
discards result in a crashed FS after reboots, and so on. But it's still
the fastest for Ceph.
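
For anyone wanting to rule the discard path in or out as the trigger, a minimal
sketch of checking and disabling that knob via sysfs (paths follow the upstream
bcache documentation and can differ between kernel versions; /dev/bcache0 is a
placeholder):

cat /sys/block/bcache0/bcache/cache_mode          # current mode, e.g. writethrough [writeback]
for knob in /sys/fs/bcache/*/cache0/discard; do
    cat "$knob"       # 1 means a TRIM is issued before a bucket is reused
    echo 0 > "$knob"  # switch discards off while chasing post-reboot FS corruption
done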

The second-best solution, which we already use in production, is
dm-writeboost (https://github.com/akiradeveloper/dm-writeboost).

Everything else is too slow.

Stefan
Am 18.08.2015 um 13:33 schrieb Mark Nelson:
> Hi Jan,
> 
> Out of curiosity did you ever try dm-cache?  I've been meaning to give
> it a spin but haven't had the spare cycles.
> 
> Mark
> 
> On 08/18/2015 04:00 AM, Jan Schermer wrote:
>> I already evaluated EnhanceIO in combination with CentOS 6 (and
>> backported 3.10 and 4.0 kernel-lt if I remember correctly).
>> It worked fine during benchmarks and stress tests, but once we run DB2
>> on it it panicked within minutes and took all the data with it (almost
>> literally - files that werent touched, like OS binaries were b0rked
>> and the filesystem was unsalvageable).
>> If you disregard this warning - the performance gains weren't that
>> great either, at least in a VM. It had problems when flushing to disk
>> after reaching dirty watermark and the block size has some
>> not-well-documented implications (not sure now, but I think it only
>> cached IO _larger_than the block size, so if your database keeps
>> incrementing an XX-byte counter it will go straight to disk).
>>
>> Flashcache doesn't respect barriers (or does it now?) - if that's ok
>> for you than go for it, it should be stable and I used it in the past
>> in production without problems.
>>
>> bcache seemed to work fine, but I needed to
>> a) use it for root
>> b) disable and enable it on the fly (doh)
>> c) make it non-persisent (flush it) before reboot - not sure if that
>> was possible either.
>> d) all that in a customer's VM, and that customer didn't have a strong
>> technical background to be able to fiddle with it...
>> So I haven't tested it heavily.
>>
>> Bcache should be the obvious choice if you are in control of the
>> environment. At least you can cry on LKML's shoulder when you lose
>> data :-)
>>
>> Jan
>>
>>
>>> On 18 Aug 2015, at 01:49, Alex Gorbachev  wrote:
>>>
>>> What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
>>> months ago, but no external contributors :(
>>>
>>> The nice thing about EnhanceIO is there is no need to change device
>>> name, unlike bcache, flashcache etc.
>>>
>>> Best regards,
>>> Alex
>>>
>>> On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz 
>>> wrote:
>>>> I did some (non-ceph) work on these, and concluded that bcache was
>>>> the best
>>>> supported, most stable, and fastest.  This was ~1 year ago, to take
>>>> it with
>>>> a grain of salt, but that's what I would recommend.
>>>>
>>>> Daniel
>>>>
>>>>
>>>> 
>>>> From: "Dominik Zalewski" 
>>>> To: "German Anders" 
>>>> Cc: "ceph-users" 
>>>> Sent: Wednesday, July 1, 2015 5:28:10 PM
>>>> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
>>>>
>>>>
>>>> Hi,
>>>>
>>>> I’ve asked same question last weeks or so (just search the mailing list
>>>> archives for EnhanceIO :) and got some interesting answers.
>>>>
>>>> Looks like the project is pretty much dead since it was bought out
>>>> by HGST.
>>>> Even their website has some broken links in regards to EnhanceIO
>>>>
>>>> I’m keen to try flashcache or bcache (its been in the mainline
>>>> kernel for
>>>> some time)
>>>>
>>>> Dominik
>>>>
>>>> On 1 Jul 2015, at 21:13, German Anders  wrote:
>>>>
>>>> Hi cephers,
>>>>
>>>>Is anyone out there that implement enhanceIO in a production
>>>> environment?
>>>> any recommendation? any perf output to share with the diff between
>>>> using it
>>>> and not?
>>>>
>>>> Thanks in advance,
>>>>
>>&

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Mark Nelson

Hi Jan,

Out of curiosity did you ever try dm-cache?  I've been meaning to give 
it a spin but haven't had the spare cycles.


Mark

On 08/18/2015 04:00 AM, Jan Schermer wrote:

I already evaluated EnhanceIO in combination with CentOS 6 (and backported 3.10 
and 4.0 kernel-lt if I remember correctly).
It worked fine during benchmarks and stress tests, but once we ran DB2 on it, it 
panicked within minutes and took all the data with it (almost literally - files 
that weren't touched, like OS binaries, were b0rked and the filesystem was 
unsalvageable).
If you disregard this warning - the performance gains weren't that great 
either, at least in a VM. It had problems when flushing to disk after reaching 
dirty watermark and the block size has some not-well-documented implications 
(not sure now, but I think it only cached IO _larger_than the block size, so if 
your database keeps incrementing an XX-byte counter it will go straight to 
disk).

Flashcache doesn't respect barriers (or does it now?) - if that's OK for you 
then go for it; it should be stable, and I used it in the past in production 
without problems.

bcache seemed to work fine, but I needed to
a) use it for root
b) disable and enable it on the fly (doh)
c) make it non-persistent (flush it) before reboot - not sure if that was 
possible either.
d) all that in a customer's VM, and that customer didn't have a strong 
technical background to be able to fiddle with it...
So I haven't tested it heavily.

Bcache should be the obvious choice if you are in control of the environment. 
At least you can cry on LKML's shoulder when you lose data :-)

Jan



On 18 Aug 2015, at 01:49, Alex Gorbachev  wrote:

What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
months ago, but no external contributors :(

The nice thing about EnhanceIO is there is no need to change device
name, unlike bcache, flashcache etc.

Best regards,
Alex

On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz  wrote:

I did some (non-ceph) work on these, and concluded that bcache was the best
supported, most stable, and fastest.  This was ~1 year ago, to take it with
a grain of salt, but that's what I would recommend.

Daniel



From: "Dominik Zalewski" 
To: "German Anders" 
Cc: "ceph-users" 
Sent: Wednesday, July 1, 2015 5:28:10 PM
Subject: Re: [ceph-users] any recommendation of using EnhanceIO?


Hi,

I’ve asked same question last weeks or so (just search the mailing list
archives for EnhanceIO :) and got some interesting answers.

Looks like the project is pretty much dead since it was bought out by HGST.
Even their website has some broken links in regards to EnhanceIO

I’m keen to try flashcache or bcache (its been in the mainline kernel for
some time)

Dominik

On 1 Jul 2015, at 21:13, German Anders  wrote:

Hi cephers,

   Is anyone out there that implement enhanceIO in a production environment?
any recommendation? any perf output to share with the diff between using it
and not?

Thanks in advance,

German


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Emmanuel Florac
Le Tue, 18 Aug 2015 10:12:59 +0100
Nick Fisk  écrivait:

> > Bcache should be the obvious choice if you are in control of the
> > environment. At least you can cry on LKML's shoulder when you lose
> > data :-)  
> 
> Please note, it looks like the main(only?) dev of Bcache has started
> making a new version of bcache, bcachefs. At this stage I'm not sure
> what this means for the ongoing support of the existing bcache
> project.

bcachefs is more than a "new version of bcache", it's a complete POSIX
filesystem with integrated caching. Looks like a silly idea if you ask
me (because we already have several excellent filesystems; because
developing a reliable filesystem is DAMN HARD; because building a
feature-complete FS is CRAZY HARD; because FTL sucks anyway; etc).

-- 

Emmanuel Florac |   Direction technique
|   Intellique
|   
|   +33 1 78 94 84 02

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk




> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Jan Schermer
> Sent: 18 August 2015 10:01
> To: Alex Gorbachev 
> Cc: Dominik Zalewski ; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> 
> I already evaluated EnhanceIO in combination with CentOS 6 (and
> backported 3.10 and 4.0 kernel-lt if I remember correctly).
> It worked fine during benchmarks and stress tests, but once we run DB2 on it
> it panicked within minutes and took all the data with it (almost literally - 
> files
> that werent touched, like OS binaries were b0rked and the filesystem was
> unsalvageable).
> If you disregard this warning - the performance gains weren't that great
> either, at least in a VM. It had problems when flushing to disk after reaching
> dirty watermark and the block size has some not-well-documented
> implications (not sure now, but I think it only cached IO _larger_than the
> block size, so if your database keeps incrementing an XX-byte counter it will
> go straight to disk).
> 
> Flashcache doesn't respect barriers (or does it now?) - if that's ok for you
> than go for it, it should be stable and I used it in the past in production
> without problems.
> 
> bcache seemed to work fine, but I needed to
> a) use it for root
> b) disable and enable it on the fly (doh)
> c) make it non-persisent (flush it) before reboot - not sure if that was
> possible either.
> d) all that in a customer's VM, and that customer didn't have a strong
> technical background to be able to fiddle with it...
> So I haven't tested it heavily.
> 
> Bcache should be the obvious choice if you are in control of the environment.
> At least you can cry on LKML's shoulder when you lose data :-)

Please note, it looks like the main (only?) dev of bcache has started working on a 
new version of bcache, bcachefs. At this stage I'm not sure what this means for 
the ongoing support of the existing bcache project.

> 
> Jan
> 
> 
> > On 18 Aug 2015, at 01:49, Alex Gorbachev  wrote:
> >
> > What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
> > months ago, but no external contributors :(
> >
> > The nice thing about EnhanceIO is there is no need to change device
> > name, unlike bcache, flashcache etc.
> >
> > Best regards,
> > Alex
> >
> > On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz 
> wrote:
> >> I did some (non-ceph) work on these, and concluded that bcache was
> >> the best supported, most stable, and fastest.  This was ~1 year ago,
> >> to take it with a grain of salt, but that's what I would recommend.
> >>
> >> Daniel
> >>
> >>
> >> 
> >> From: "Dominik Zalewski" 
> >> To: "German Anders" 
> >> Cc: "ceph-users" 
> >> Sent: Wednesday, July 1, 2015 5:28:10 PM
> >> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
> >>
> >>
> >> Hi,
> >>
> >> I’ve asked same question last weeks or so (just search the mailing
> >> list archives for EnhanceIO :) and got some interesting answers.
> >>
> >> Looks like the project is pretty much dead since it was bought out by
> HGST.
> >> Even their website has some broken links in regards to EnhanceIO
> >>
> >> I’m keen to try flashcache or bcache (its been in the mainline kernel
> >> for some time)
> >>
> >> Dominik
> >>
> >> On 1 Jul 2015, at 21:13, German Anders  wrote:
> >>
> >> Hi cephers,
> >>
> >>   Is anyone out there that implement enhanceIO in a production
> environment?
> >> any recommendation? any perf output to share with the diff between
> >> using it and not?
> >>
> >> Thanks in advance,
> >>
> >> German




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Jan Schermer
I already evaluated EnhanceIO in combination with CentOS 6 (and backported 3.10 
and 4.0 kernel-lt if I remember correctly).
It worked fine during benchmarks and stress tests, but once we ran DB2 on it, it 
panicked within minutes and took all the data with it (almost literally - files 
that weren't touched, like OS binaries, were b0rked and the filesystem was 
unsalvageable).
If you disregard this warning - the performance gains weren't that great 
either, at least in a VM. It had problems when flushing to disk after reaching 
the dirty watermark, and the block size has some not-well-documented implications 
(not sure now, but I think it only cached IO _larger_ than the block size, so if 
your database keeps incrementing an XX-byte counter it will go straight to 
disk).

Flashcache doesn't respect barriers (or does it now?) - if that's OK for you 
then go for it; it should be stable, and I used it in the past in production 
without problems.
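
For reference, layering flashcache over a disk is a one-liner; a sketch with
placeholder device names (write-through shown first, since it sidesteps most of
the barrier worry):

# write-through cache: SSD partition in front of /dev/sdb, exposed as /dev/mapper/cache_sdb
flashcache_create -p thru cache_sdb /dev/nvme0n1p1 /dev/sdb
# or write-back, if the caveat above is acceptable:
# flashcache_create -p back cache_sdb /dev/nvme0n1p1 /dev/sdb
mount /dev/mapper/cache_sdb /mnt/data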

bcache seemed to work fine, but I needed to
a) use it for root
b) disable and enable it on the fly (doh)
c) make it non-persistent (flush it) before reboot - not sure if that was 
possible either (see the sketch below).
d) all that in a customer's VM, and that customer didn't have a strong 
technical background to be able to fiddle with it...
So I haven't tested it heavily.
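
A rough sketch of what b) and c) look like via sysfs, assuming a /dev/bcache0
device and the knobs described in the upstream bcache documentation (untested
here):

echo writethrough > /sys/block/bcache0/bcache/cache_mode   # stop generating new dirty data
cat /sys/block/bcache0/bcache/state                        # poll until it reports "clean"
echo 1 > /sys/block/bcache0/bcache/detach                  # detach the cache set on the fly
# re-attach later using the cache set UUID:
# echo <cset-uuid> > /sys/block/bcache0/bcache/attach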

Bcache should be the obvious choice if you are in control of the environment. 
At least you can cry on LKML's shoulder when you lose data :-)

Jan


> On 18 Aug 2015, at 01:49, Alex Gorbachev  wrote:
> 
> What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
> months ago, but no external contributors :(
> 
> The nice thing about EnhanceIO is there is no need to change device
> name, unlike bcache, flashcache etc.
> 
> Best regards,
> Alex
> 
> On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz  wrote:
>> I did some (non-ceph) work on these, and concluded that bcache was the best
>> supported, most stable, and fastest.  This was ~1 year ago, to take it with
>> a grain of salt, but that's what I would recommend.
>> 
>> Daniel
>> 
>> 
>> 
>> From: "Dominik Zalewski" 
>> To: "German Anders" 
>> Cc: "ceph-users" 
>> Sent: Wednesday, July 1, 2015 5:28:10 PM
>> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
>> 
>> 
>> Hi,
>> 
>> I’ve asked same question last weeks or so (just search the mailing list
>> archives for EnhanceIO :) and got some interesting answers.
>> 
>> Looks like the project is pretty much dead since it was bought out by HGST.
>> Even their website has some broken links in regards to EnhanceIO
>> 
>> I’m keen to try flashcache or bcache (its been in the mainline kernel for
>> some time)
>> 
>> Dominik
>> 
>> On 1 Jul 2015, at 21:13, German Anders  wrote:
>> 
>> Hi cephers,
>> 
>>   Is anyone out there that implement enhanceIO in a production environment?
>> any recommendation? any perf output to share with the diff between using it
>> and not?
>> 
>> Thanks in advance,
>> 
>> German

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-17 Thread Alex Gorbachev
What about https://github.com/Frontier314/EnhanceIO?  Last commit 2
months ago, but no external contributors :(

The nice thing about EnhanceIO is there is no need to change device
name, unlike bcache, flashcache etc.

Best regards,
Alex

On Thu, Jul 23, 2015 at 11:02 AM, Daniel Gryniewicz  wrote:
> I did some (non-ceph) work on these, and concluded that bcache was the best
> supported, most stable, and fastest.  This was ~1 year ago, to take it with
> a grain of salt, but that's what I would recommend.
>
> Daniel
>
>
> 
> From: "Dominik Zalewski" 
> To: "German Anders" 
> Cc: "ceph-users" 
> Sent: Wednesday, July 1, 2015 5:28:10 PM
> Subject: Re: [ceph-users] any recommendation of using EnhanceIO?
>
>
> Hi,
>
> I’ve asked same question last weeks or so (just search the mailing list
> archives for EnhanceIO :) and got some interesting answers.
>
> Looks like the project is pretty much dead since it was bought out by HGST.
> Even their website has some broken links in regards to EnhanceIO
>
> I’m keen to try flashcache or bcache (its been in the mainline kernel for
> some time)
>
> Dominik
>
> On 1 Jul 2015, at 21:13, German Anders  wrote:
>
> Hi cephers,
>
>Is anyone out there that implement enhanceIO in a production environment?
> any recommendation? any perf output to share with the diff between using it
> and not?
>
> Thanks in advance,
>
> German
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-23 Thread Daniel Gryniewicz
I did some (non-ceph) work on these, and concluded that bcache was the best 
supported, most stable, and fastest. This was ~1 year ago, so take it with a 
grain of salt, but that's what I would recommend. 

Daniel 
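
For anyone following that recommendation, a minimal bcache setup sketch
(bcache-tools; all device names are placeholders):

make-bcache -B /dev/sdb                # format the backing (spinning) device
make-bcache -C /dev/nvme0n1p1          # format the caching (SSD) device
# udev normally registers both automatically; otherwise:
echo /dev/sdb       > /sys/fs/bcache/register
echo /dev/nvme0n1p1 > /sys/fs/bcache/register
bcache-super-show /dev/nvme0n1p1 | grep cset.uuid
echo "<cset-uuid from above>" > /sys/block/bcache0/bcache/attach
echo writeback > /sys/block/bcache0/bcache/cache_mode
mkfs.xfs /dev/bcache0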


- Original Message -

From: "Dominik Zalewski"  
To: "German Anders"  
Cc: "ceph-users"  
Sent: Wednesday, July 1, 2015 5:28:10 PM 
Subject: Re: [ceph-users] any recommendation of using EnhanceIO? 

Hi, 

I’ve asked same question last weeks or so (just search the mailing list 
archives for EnhanceIO :) and got some interesting answers. 

Looks like the project is pretty much dead since it was bought out by HGST. 
Even their website has some broken links in regards to EnhanceIO 

I’m keen to try flashcache or bcache (its been in the mainline kernel for some 
time) 

Dominik 




On 1 Jul 2015, at 21:13, German Anders < gand...@despegar.com > wrote: 

Hi cephers, 

Is anyone out there that implement enhanceIO in a production environment? any 
recommendation? any perf output to share with the diff between using it and 
not? 

Thanks in advance, 



German 
___ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 





___ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-23 Thread Dominik Zalewski
Hi,

I’ve asked the same question a week or so ago (just search the mailing list 
archives for EnhanceIO :) and got some interesting answers.

Looks like the project is pretty much dead since it was bought out by HGST. 
Even their website has some broken links in regard to EnhanceIO.

I’m keen to try flashcache or bcache (it’s been in the mainline kernel for some 
time).

Dominik

> On 1 Jul 2015, at 21:13, German Anders  wrote:
> 
> Hi cephers,
> 
>Is anyone out there that implement enhanceIO in a production environment? 
> any recommendation? any perf output to share with the diff between using it 
> and not?
> 
> Thanks in advance,
> 
> 
> German
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread Lionel Bouton
On 07/02/15 13:49, German Anders wrote:
> output from iostat:
>
> CEPHOSD01:
>
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdc(ceph-0)   0.00 0.001.00  389.00 0.0035.98  
> 188.9660.32  120.12   16.00  120.39   1.26  49.20
> sdd(ceph-1)   0.00 0.000.000.00 0.00 0.00
> 0.00 0.000.000.000.00   0.00   0.00
> sdf(ceph-2)   0.00 1.006.00  521.00 0.0260.72  
> 236.05   143.10  309.75  484.00  307.74   1.90 100.00
> sdg(ceph-3)   0.00 0.00   11.00  535.00 0.0442.41  
> 159.22   139.25  279.72  394.18  277.37   1.83 100.00
> sdi(ceph-4)   0.00 1.004.00  560.00 0.0254.87  
> 199.32   125.96  187.07  562.00  184.39   1.65  93.20
> sdj(ceph-5)   0.00 0.000.00  566.00 0.0061.41  
> 222.19   109.13  169.620.00  169.62   1.53  86.40
> sdl(ceph-6)   0.00 0.008.000.00 0.09 0.00   
> 23.00 0.12   12.00   12.000.00   2.50   2.00
> sdm(ceph-7)   0.00 0.002.00  481.00 0.0144.59  
> 189.12   116.64  241.41  268.00  241.30   2.05  99.20
> sdn(ceph-8)   0.00 0.001.000.00 0.00 0.00
> 8.00 0.018.008.000.00   8.00   0.80
> fioa  0.00 0.000.00 1016.00 0.0019.09   
> 38.47 0.000.060.000.06   0.00   0.00
>
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdc(ceph-0)   0.00 1.00   10.00  278.00 0.0426.07  
> 185.6960.82  257.97  309.60  256.12   2.83  81.60
> sdd(ceph-1)   0.00 0.002.000.00 0.02 0.00   
> 20.00 0.02   10.00   10.000.00  10.00   2.00
> sdf(ceph-2)   0.00 1.006.00  579.00 0.0254.16  
> 189.68   142.78  246.55  328.67  245.70   1.71 100.00
> sdg(ceph-3)   0.00 0.00   10.00   75.00 0.05 5.32  
> 129.41 4.94  185.08   11.20  208.27   4.05  34.40
> sdi(ceph-4)   0.00 0.00   19.00  147.00 0.0912.61  
> 156.6317.88  230.89  114.32  245.96   3.37  56.00
> sdj(ceph-5)   0.00 1.002.00  629.00 0.0143.66  
> 141.72   143.00  223.35  426.00  222.71   1.58 100.00
> sdl(ceph-6)   0.00 0.00   10.000.00 0.04 0.00
> 8.00 0.16   18.40   18.400.00   5.60   5.60
> sdm(ceph-7)   0.00 0.00   11.004.00 0.05 0.01
> 8.00 0.48   35.20   25.82   61.00  14.13  21.20
> sdn(ceph-8)   0.00 0.009.000.00 0.07 0.00   
> 15.11 0.078.008.000.00   4.89   4.40
> fioa  0.00 0.000.00 6415.00 0.00   125.81   
> 40.16 0.000.140.000.14   0.00   0.00
>
> CEPHOSD02:
>
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdc1(ceph-9)  0.00 0.00   13.000.00 0.11 0.00   
> 16.62 0.17   13.23   13.230.00   4.92   6.40
> sdd1(ceph-10) 0.00 0.00   15.000.00 0.13 0.00   
> 18.13 0.26   17.33   17.330.00   1.87   2.80
> sdf1(ceph-11) 0.00 0.00   22.00  650.00 0.1151.75  
> 158.04   143.27  212.07  308.55  208.81   1.49 100.00
> sdg1(ceph-12) 0.00 0.00   12.00  282.00 0.0554.60  
> 380.6813.16  120.52  352.00  110.67   2.91  85.60
> sdi1(ceph-13) 0.00 0.001.000.00 0.00 0.00
> 8.00 0.018.008.000.00   8.00   0.80
> sdj1(ceph-14) 0.00 0.00   20.000.00 0.08 0.00
> 8.00 0.26   12.80   12.800.00   3.60   7.20
> sdl1(ceph-15) 0.00 0.000.000.00 0.00 0.00
> 0.00 0.000.000.000.00   0.00   0.00
> sdm1(ceph-16) 0.00 0.00   20.00  424.00 0.1132.20  
> 149.0589.69  235.30  243.00  234.93   2.14  95.20
> sdn1(ceph-17) 0.00 0.005.00  411.00 0.0245.47  
> 223.9498.32  182.28 1057.60  171.63   2.40 100.00
>
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdc1(ceph-9)  0.00 0.00   26.00  383.00 0.1134.32  
> 172.4486.92  258.64  297.08  256.03   2.29  93.60
> sdd1(ceph-10) 0.00 0.008.00   31.00 0.09 1.86  
> 101.95 0.84  178.15   94.00  199.87   6.46  25.20
> sdf1(ceph-11) 0.00 1.005.00  409.00 0.0548.34  
> 239.3490.94  219.43  383.20  217.43   2.34  96.80
> sdg1(ceph-12) 0.00 0.000.00  238.00 0.00 1.64   
> 14.1258.34  143.600.00  143.60   1.83  43.60
> sdi1(ceph-13) 0.00 0.00   11.000.00 0.05 0.00   
> 10.18 0.16   14.18   14.180.00   5.09   5.60
> sdj1(ceph-14) 0.00 0.001.000.00 0.00 0.00
> 8.00 0.02   16.00   16.000.00

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread German Anders
yeah 3TB SAS disks

*German Anders*
Storage System Engineer Leader
*Despegar* | IT Team
*office* +54 11 4894 3500 x3408
*mobile* +54 911 3493 7262
*mail* gand...@despegar.com

2015-07-02 9:04 GMT-03:00 Jan Schermer :

> And those disks are spindles?
> Looks like there’s simply too few of there….
>
> Jan
>
> On 02 Jul 2015, at 13:49, German Anders  wrote:
>
> output from iostat:
>
> *CEPHOSD01:*
>
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz
> avgqu-sz   await r_await w_await  svctm  %util
> sdc(ceph-0)   0.00 0.001.00  389.00 0.0035.98
> 188.9660.32  120.12   16.00  120.39   1.26  49.20
> sdd(ceph-1)   0.00 0.000.000.00 0.00 0.00
> 0.00 0.000.000.000.00   0.00   0.00
> sdf(ceph-2)   0.00 1.006.00  521.00 0.0260.72
> 236.05   143.10  309.75  484.00  307.74   1.90 100.00
> sdg(ceph-3)   0.00 0.00   11.00  535.00 0.0442.41
> 159.22   139.25  279.72  394.18  277.37   1.83 100.00
> sdi(ceph-4)   0.00 1.004.00  560.00 0.0254.87
> 199.32   125.96  187.07  562.00  184.39   1.65  93.20
> sdj(ceph-5)   0.00 0.000.00  566.00 0.0061.41
> 222.19   109.13  169.620.00  169.62   1.53  86.40
> sdl(ceph-6)   0.00 0.008.000.00 0.09 0.00
> 23.00 0.12   12.00   12.000.00   2.50   2.00
> sdm(ceph-7)   0.00 0.002.00  481.00 0.0144.59
> 189.12   116.64  241.41  268.00  241.30   2.05  99.20
> sdn(ceph-8)   0.00 0.001.000.00 0.00 0.00
> 8.00 0.018.008.000.00   8.00   0.80
> fioa  0.00 0.000.00 1016.00 0.0019.09
> 38.47 0.000.060.000.06   0.00   0.00
>
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz
> avgqu-sz   await r_await w_await  svctm  %util
> sdc(ceph-0)   0.00 1.00   10.00  278.00 0.0426.07
> 185.6960.82  257.97  309.60  256.12   2.83  81.60
> sdd(ceph-1)   0.00 0.002.000.00 0.02 0.00
> 20.00 0.02   10.00   10.000.00  10.00   2.00
> sdf(ceph-2)   0.00 1.006.00  579.00 0.0254.16
> 189.68   142.78  246.55  328.67  245.70   1.71 100.00
> sdg(ceph-3)   0.00 0.00   10.00   75.00 0.05 5.32
> 129.41 4.94  185.08   11.20  208.27   4.05  34.40
> sdi(ceph-4)   0.00 0.00   19.00  147.00 0.0912.61
> 156.6317.88  230.89  114.32  245.96   3.37  56.00
> sdj(ceph-5)   0.00 1.002.00  629.00 0.0143.66
> 141.72   143.00  223.35  426.00  222.71   1.58 100.00
> sdl(ceph-6)   0.00 0.00   10.000.00 0.04 0.00
> 8.00 0.16   18.40   18.400.00   5.60   5.60
> sdm(ceph-7)   0.00 0.00   11.004.00 0.05 0.01
> 8.00 0.48   35.20   25.82   61.00  14.13  21.20
> sdn(ceph-8)   0.00 0.009.000.00 0.07 0.00
> 15.11 0.078.008.000.00   4.89   4.40
> fioa  0.00 0.000.00 6415.00 0.00   125.81
> 40.16 0.000.140.000.14   0.00   0.00
>
> *CEPHOSD02:*
>
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz
> avgqu-sz   await r_await w_await  svctm  %util
> sdc1(ceph-9)  0.00 0.00   13.000.00 0.11 0.00
> 16.62 0.17   13.23   13.230.00   4.92   6.40
> sdd1(ceph-10) 0.00 0.00   15.000.00 0.13 0.00
> 18.13 0.26   17.33   17.330.00   1.87   2.80
> sdf1(ceph-11) 0.00 0.00   22.00  650.00 0.1151.75
> 158.04   143.27  212.07  308.55  208.81   1.49 100.00
> sdg1(ceph-12) 0.00 0.00   12.00  282.00 0.0554.60
> 380.6813.16  120.52  352.00  110.67   2.91  85.60
> sdi1(ceph-13) 0.00 0.001.000.00 0.00 0.00
> 8.00 0.018.008.000.00   8.00   0.80
> sdj1(ceph-14) 0.00 0.00   20.000.00 0.08 0.00
> 8.00 0.26   12.80   12.800.00   3.60   7.20
> sdl1(ceph-15) 0.00 0.000.000.00 0.00 0.00
> 0.00 0.000.000.000.00   0.00   0.00
> sdm1(ceph-16) 0.00 0.00   20.00  424.00 0.1132.20
> 149.0589.69  235.30  243.00  234.93   2.14  95.20
> sdn1(ceph-17) 0.00 0.005.00  411.00 0.0245.47
> 223.9498.32  182.28 1057.60  171.63   2.40 100.00
>
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz
> avgqu-sz   await r_await w_await  svctm  %util
> sdc1(ceph-9)  0.00 0.00   26.00  383.00 0.1134.32
> 172.4486.92  258.64  297.08  256.03   2.29  93.60
> sdd1(ceph-10) 0.00 0.008.00   31.00 0.09 1.86
> 101.95 0.84  178.15   94.00  199.87   6.46  25.20
> sdf1(ceph-11) 0.00 1.005.00  409.00 0.0548.34
> 239.3490.94  219.43  383.20  217.43   2.34  96.80
> sdg1(ceph-12) 0.00 0.000.00  238.00 0.00 1.64
> 14.1258.34  143.600.00  143.60   1.83  43.60
> 

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread Jan Schermer
And those disks are spindles? 
Looks like there are simply too few of them…

Jan

> On 02 Jul 2015, at 13:49, German Anders  wrote:
> 
> output from iostat:
> 
> CEPHOSD01:
> 
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz 
> avgqu-sz   await r_await w_await  svctm  %util
> sdc(ceph-0)   0.00 0.001.00  389.00 0.0035.98   188.96
> 60.32  120.12   16.00  120.39   1.26  49.20
> sdd(ceph-1)   0.00 0.000.000.00 0.00 0.00 0.00
>  0.000.000.000.00   0.00   0.00
> sdf(ceph-2)   0.00 1.006.00  521.00 0.0260.72   236.05   
> 143.10  309.75  484.00  307.74   1.90 100.00
> sdg(ceph-3)   0.00 0.00   11.00  535.00 0.0442.41   159.22   
> 139.25  279.72  394.18  277.37   1.83 100.00
> sdi(ceph-4)   0.00 1.004.00  560.00 0.0254.87   199.32   
> 125.96  187.07  562.00  184.39   1.65  93.20
> sdj(ceph-5)   0.00 0.000.00  566.00 0.0061.41   222.19   
> 109.13  169.620.00  169.62   1.53  86.40
> sdl(ceph-6)   0.00 0.008.000.00 0.09 0.0023.00
>  0.12   12.00   12.000.00   2.50   2.00
> sdm(ceph-7)   0.00 0.002.00  481.00 0.0144.59   189.12   
> 116.64  241.41  268.00  241.30   2.05  99.20
> sdn(ceph-8)   0.00 0.001.000.00 0.00 0.00 8.00
>  0.018.008.000.00   8.00   0.80
> fioa  0.00 0.000.00 1016.00 0.0019.0938.47
>  0.000.060.000.06   0.00   0.00
> 
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz 
> avgqu-sz   await r_await w_await  svctm  %util
> sdc(ceph-0)   0.00 1.00   10.00  278.00 0.0426.07   185.69
> 60.82  257.97  309.60  256.12   2.83  81.60
> sdd(ceph-1)   0.00 0.002.000.00 0.02 0.0020.00
>  0.02   10.00   10.000.00  10.00   2.00
> sdf(ceph-2)   0.00 1.006.00  579.00 0.0254.16   189.68   
> 142.78  246.55  328.67  245.70   1.71 100.00
> sdg(ceph-3)   0.00 0.00   10.00   75.00 0.05 5.32   129.41
>  4.94  185.08   11.20  208.27   4.05  34.40
> sdi(ceph-4)   0.00 0.00   19.00  147.00 0.0912.61   156.63
> 17.88  230.89  114.32  245.96   3.37  56.00
> sdj(ceph-5)   0.00 1.002.00  629.00 0.0143.66   141.72   
> 143.00  223.35  426.00  222.71   1.58 100.00
> sdl(ceph-6)   0.00 0.00   10.000.00 0.04 0.00 8.00
>  0.16   18.40   18.400.00   5.60   5.60
> sdm(ceph-7)   0.00 0.00   11.004.00 0.05 0.01 8.00
>  0.48   35.20   25.82   61.00  14.13  21.20
> sdn(ceph-8)   0.00 0.009.000.00 0.07 0.0015.11
>  0.078.008.000.00   4.89   4.40
> fioa  0.00 0.000.00 6415.00 0.00   125.8140.16
>  0.000.140.000.14   0.00   0.00
> 
> CEPHOSD02:
> 
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz 
> avgqu-sz   await r_await w_await  svctm  %util
> sdc1(ceph-9)  0.00 0.00   13.000.00 0.11 0.0016.62
>  0.17   13.23   13.230.00   4.92   6.40
> sdd1(ceph-10) 0.00 0.00   15.000.00 0.13 0.0018.13
>  0.26   17.33   17.330.00   1.87   2.80
> sdf1(ceph-11) 0.00 0.00   22.00  650.00 0.1151.75   158.04   
> 143.27  212.07  308.55  208.81   1.49 100.00
> sdg1(ceph-12) 0.00 0.00   12.00  282.00 0.0554.60   380.68
> 13.16  120.52  352.00  110.67   2.91  85.60
> sdi1(ceph-13) 0.00 0.001.000.00 0.00 0.00 8.00
>  0.018.008.000.00   8.00   0.80
> sdj1(ceph-14) 0.00 0.00   20.000.00 0.08 0.00 8.00
>  0.26   12.80   12.800.00   3.60   7.20
> sdl1(ceph-15) 0.00 0.000.000.00 0.00 0.00 0.00
>  0.000.000.000.00   0.00   0.00
> sdm1(ceph-16) 0.00 0.00   20.00  424.00 0.1132.20   149.05
> 89.69  235.30  243.00  234.93   2.14  95.20
> sdn1(ceph-17) 0.00 0.005.00  411.00 0.0245.47   223.94
> 98.32  182.28 1057.60  171.63   2.40 100.00
> 
> Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz 
> avgqu-sz   await r_await w_await  svctm  %util
> sdc1(ceph-9)  0.00 0.00   26.00  383.00 0.1134.32   172.44
> 86.92  258.64  297.08  256.03   2.29  93.60
> sdd1(ceph-10) 0.00 0.008.00   31.00 0.09 1.86   101.95
>  0.84  178.15   94.00  199.87   6.46  25.20
> sdf1(ceph-11) 0.00 1.005.00  409.00 0.0548.34   239.34
> 90.94  219.43  383.20  217.43   2.34  96.80
> sdg1(ceph-12) 0.00 0.000.00  238.00 0.00 1.6414.12
> 58.34  143.600.00  143.60   1.83  43.60
> sdi1(ceph-13) 0.00 0.00   11.000.00 0.05 0.0010.18
>  0.16   14.18   

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread German Anders
output from iostat:

*CEPHOSD01:*

Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz
avgqu-sz   await r_await w_await  svctm  %util
sdc(ceph-0)   0.00 0.001.00  389.00 0.0035.98
188.9660.32  120.12   16.00  120.39   1.26  49.20
sdd(ceph-1)   0.00 0.000.000.00 0.00 0.00
0.00 0.000.000.000.00   0.00   0.00
sdf(ceph-2)   0.00 1.006.00  521.00 0.0260.72
236.05   143.10  309.75  484.00  307.74   1.90 100.00
sdg(ceph-3)   0.00 0.00   11.00  535.00 0.0442.41
159.22   139.25  279.72  394.18  277.37   1.83 100.00
sdi(ceph-4)   0.00 1.004.00  560.00 0.0254.87
199.32   125.96  187.07  562.00  184.39   1.65  93.20
sdj(ceph-5)   0.00 0.000.00  566.00 0.0061.41
222.19   109.13  169.620.00  169.62   1.53  86.40
sdl(ceph-6)   0.00 0.008.000.00 0.09 0.00
23.00 0.12   12.00   12.000.00   2.50   2.00
sdm(ceph-7)   0.00 0.002.00  481.00 0.0144.59
189.12   116.64  241.41  268.00  241.30   2.05  99.20
sdn(ceph-8)   0.00 0.001.000.00 0.00 0.00
8.00 0.018.008.000.00   8.00   0.80
fioa  0.00 0.000.00 1016.00 0.0019.09
38.47 0.000.060.000.06   0.00   0.00

Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz
avgqu-sz   await r_await w_await  svctm  %util
sdc(ceph-0)   0.00 1.00   10.00  278.00 0.0426.07
185.6960.82  257.97  309.60  256.12   2.83  81.60
sdd(ceph-1)   0.00 0.002.000.00 0.02 0.00
20.00 0.02   10.00   10.000.00  10.00   2.00
sdf(ceph-2)   0.00 1.006.00  579.00 0.0254.16
189.68   142.78  246.55  328.67  245.70   1.71 100.00
sdg(ceph-3)   0.00 0.00   10.00   75.00 0.05 5.32
129.41 4.94  185.08   11.20  208.27   4.05  34.40
sdi(ceph-4)   0.00 0.00   19.00  147.00 0.0912.61
156.6317.88  230.89  114.32  245.96   3.37  56.00
sdj(ceph-5)   0.00 1.002.00  629.00 0.0143.66
141.72   143.00  223.35  426.00  222.71   1.58 100.00
sdl(ceph-6)   0.00 0.00   10.000.00 0.04 0.00
8.00 0.16   18.40   18.400.00   5.60   5.60
sdm(ceph-7)   0.00 0.00   11.004.00 0.05 0.01
8.00 0.48   35.20   25.82   61.00  14.13  21.20
sdn(ceph-8)   0.00 0.009.000.00 0.07 0.00
15.11 0.078.008.000.00   4.89   4.40
fioa  0.00 0.000.00 6415.00 0.00   125.81
40.16 0.000.140.000.14   0.00   0.00

*CEPHOSD02:*

Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz
avgqu-sz   await r_await w_await  svctm  %util
sdc1(ceph-9)  0.00 0.00   13.000.00 0.11 0.00
16.62 0.17   13.23   13.230.00   4.92   6.40
sdd1(ceph-10) 0.00 0.00   15.000.00 0.13 0.00
18.13 0.26   17.33   17.330.00   1.87   2.80
sdf1(ceph-11) 0.00 0.00   22.00  650.00 0.1151.75
158.04   143.27  212.07  308.55  208.81   1.49 100.00
sdg1(ceph-12) 0.00 0.00   12.00  282.00 0.0554.60
380.6813.16  120.52  352.00  110.67   2.91  85.60
sdi1(ceph-13) 0.00 0.001.000.00 0.00 0.00
8.00 0.018.008.000.00   8.00   0.80
sdj1(ceph-14) 0.00 0.00   20.000.00 0.08 0.00
8.00 0.26   12.80   12.800.00   3.60   7.20
sdl1(ceph-15) 0.00 0.000.000.00 0.00 0.00
0.00 0.000.000.000.00   0.00   0.00
sdm1(ceph-16) 0.00 0.00   20.00  424.00 0.1132.20
149.0589.69  235.30  243.00  234.93   2.14  95.20
sdn1(ceph-17) 0.00 0.005.00  411.00 0.0245.47
223.9498.32  182.28 1057.60  171.63   2.40 100.00

Device: rrqm/s   wrqm/s r/s w/srMB/swMB/s avgrq-sz
avgqu-sz   await r_await w_await  svctm  %util
sdc1(ceph-9)  0.00 0.00   26.00  383.00 0.1134.32
172.4486.92  258.64  297.08  256.03   2.29  93.60
sdd1(ceph-10) 0.00 0.008.00   31.00 0.09 1.86
101.95 0.84  178.15   94.00  199.87   6.46  25.20
sdf1(ceph-11) 0.00 1.005.00  409.00 0.0548.34
239.3490.94  219.43  383.20  217.43   2.34  96.80
sdg1(ceph-12) 0.00 0.000.00  238.00 0.00 1.64
14.1258.34  143.600.00  143.60   1.83  43.60
sdi1(ceph-13) 0.00 0.00   11.000.00 0.05 0.00
10.18 0.16   14.18   14.180.00   5.09   5.60
sdj1(ceph-14) 0.00 0.001.000.00 0.00 0.00
8.00 0.02   16.00   16.000.00  16.00   1.60
sdl1(ceph-15) 0.00 0.001.000.00 0.03 0.00
64.00 0.01   12.00   12.000.00  12.00   1.20
sdm1(ceph-16) 0.00 1.004.00  587.00 0.0350.09
173.69   143.32  244.97  296.00  244.62   1.69 100.00
sdn1(ceph-17) 0.00 0.000.00  375.

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread Jan Schermer
There are scripts to integrate an existing device into bcache (not sure how well 
they work).

Jan
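
The conversion route is, as far as I know, the third-party "blocks" tool
(https://github.com/g2p/blocks); the invocation below is an assumption, so
check its README before trusting it on real data:

# rewrites the partition in place to make room for a bcache superblock
blocks to-bcache /dev/sdb1
# afterwards the device shows up as a bcache backing device and a cache set
# can be attached exactly as with a fresh make-bcache setup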

> On 02 Jul 2015, at 12:59, Emmanuel Florac  wrote:
> 
> 
> bcache has the advantage of being natively integrated to the linux
> kernel, feeling more "proper". It seems slightly faster than flashcache
> too, but YMMV. However you cannot add bcache as an afterthought to an
> existing volume, but you can set up flashcache this way apparently.
> I had a few crashes with bcache on different machines but never had any
> corruption, so it looks production-safe.
> 
> 
> Le Thu, 2 Jul 2015 07:48:48 -0300
> German Anders  écrivait:
> 
>> The idea is to cache rbd at a host level. Also could be possible to
>> cache at the osd level. We have high iowait and we need to lower it a
>> bit, since we are getting the max from our sas disks 100-110 iops per
>> disk (3TB osd's), any advice? Flashcache?
> 
> -- 
> 
> Emmanuel Florac |   Direction technique
>|   Intellique
>|  
>|   +33 1 78 94 84 02
> 



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread Lionel Bouton
On 07/02/15 12:48, German Anders wrote:
> The idea is to cache rbd at a host level. Also could be possible to
> cache at the osd level. We have high iowait and we need to lower it a
> bit, since we are getting the max from our sas disks 100-110 iops per
> disk (3TB osd's), any advice? Flashcache?

It's hard to suggest anything without knowing more about your setup. Are
your I/O mostly reads or writes? Reads: can you add enough RAM on your
guests or on your OSD to cache your working set? Writes: do you use SSD
for journals already?

Lionel
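
A quick way to answer the journal question on an OSD host, assuming the default
/var/lib/ceph layout (FileStore-era journals are just symlinks):

for j in /var/lib/ceph/osd/ceph-*/journal; do
    printf '%s -> %s\n' "$j" "$(readlink -f "$j")"
done
# entries resolving back into the OSD's own data disk mean no SSD journal yet
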
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread Emmanuel Florac

bcache has the advantage of being natively integrated into the Linux
kernel, feeling more "proper". It seems slightly faster than flashcache
too, but YMMV. However, you cannot add bcache as an afterthought to an
existing volume, whereas you apparently can set up flashcache that way.
I had a few crashes with bcache on different machines but never had any
corruption, so it looks production-safe.


Le Thu, 2 Jul 2015 07:48:48 -0300
German Anders  écrivait:

> The idea is to cache rbd at a host level. Also could be possible to
> cache at the osd level. We have high iowait and we need to lower it a
> bit, since we are getting the max from our sas disks 100-110 iops per
> disk (3TB osd's), any advice? Flashcache?

-- 

Emmanuel Florac |   Direction technique
|   Intellique
|   
|   +33 1 78 94 84 02

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread Jan Schermer
Tune the OSDs or add more OSDs (if the problem is really in the disks).

Can you post iostat output for the disks that are loaded? (iostat -mx 1 
/dev/sdX, a few lines…)
What drives are those? What controller?

Jan

> On 02 Jul 2015, at 12:48, German Anders  wrote:
> 
> The idea is to cache rbd at a host level. Also could be possible to cache at 
> the osd level. We have high iowait and we need to lower it a bit, since we 
> are getting the max from our sas disks 100-110 iops per disk (3TB osd's), any 
> advice? Flashcache?
> 
> 
> On Thursday, July 2, 2015, Jan Schermer  > wrote:
> I think I posted my experience here ~1 month ago.
> 
> My advice for EnhanceIO: don’t use it.
> 
> But you didn’t exactly say what you want to cache - do you want to cache the 
> OSD filestore disks? RBD devices on hosts? RBD devices inside guests?
> 
> Jan
> 
> > On 02 Jul 2015, at 11:29, Emmanuel Florac  > > wrote:
> >
> > Le Wed, 1 Jul 2015 17:13:03 -0300
> > German Anders > écrivait:
> >
> >> Hi cephers,
> >>
> >>   Is anyone out there that implement enhanceIO in a production
> >> environment? any recommendation? any perf output to share with the
> >> diff between using it and not?
> >
> > I've tried EnhanceIO back when it wasn't too stale, but never put it in
> > production. I've set up bcache on trial, it has its problems (load is
> > stuck at 1.0 because of the bcache_writeback kernel thread, and I
> > suspect a crash was due to it) but works pretty well overall.
> >
> > --
> > 
> > Emmanuel Florac |   Direction technique
> >|   Intellique
> >|  >
> >|   +33 1 78 94 84 02
> > 
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com 
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
> > 
> 
> 
> 
> -- 
> 
> German Anders
> Storage System Engineer Leader
> Despegar | IT Team
> office +54 11 4894 3500 x3408
> mobile +54 911 3493 7262
> mail gand...@despegar.com 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread German Anders
The idea is to cache RBD at the host level. It could also be possible to cache
at the OSD level. We have high iowait and we need to lower it a bit, since we
are already getting the maximum out of our SAS disks, 100-110 IOPS per disk
(3 TB OSDs). Any advice? Flashcache?
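
Before adding an SSD layer it may be worth checking librbd's own client-side
cache on the hosts; a sketch of the relevant ceph.conf settings (sizes are
illustrative, option names as of the Hammer-era releases):

cat >> /etc/ceph/ceph.conf <<'EOF'
[client]
rbd cache = true
rbd cache size = 67108864
rbd cache max dirty = 50331648
rbd cache writethrough until flush = true
EOF
# guests need to be restarted (or live-migrated) to pick the settings up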


On Thursday, July 2, 2015, Jan Schermer  wrote:

> I think I posted my experience here ~1 month ago.
>
> My advice for EnhanceIO: don’t use it.
>
> But you didn’t exactly say what you want to cache - do you want to cache
> the OSD filestore disks? RBD devices on hosts? RBD devices inside guests?
>
> Jan
>
> > On 02 Jul 2015, at 11:29, Emmanuel Florac  > wrote:
> >
> > Le Wed, 1 Jul 2015 17:13:03 -0300
> > German Anders > écrivait:
> >
> >> Hi cephers,
> >>
> >>   Is anyone out there that implement enhanceIO in a production
> >> environment? any recommendation? any perf output to share with the
> >> diff between using it and not?
> >
> > I've tried EnhanceIO back when it wasn't too stale, but never put it in
> > production. I've set up bcache on trial, it has its problems (load is
> > stuck at 1.0 because of the bcache_writeback kernel thread, and I
> > suspect a crash was due to it) but works pretty well overall.
> >
> > --
> > 
> > Emmanuel Florac |   Direction technique
> >|   Intellique
> >|  >
> >|   +33 1 78 94 84 02
> > 
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com 
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>

-- 

*German Anders*
Storage System Engineer Leader
*Despegar* | IT Team
*office* +54 11 4894 3500 x3408
*mobile* +54 911 3493 7262
*mail* gand...@despegar.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread Jan Schermer
I think I posted my experience here ~1 month ago.

My advice for EnhanceIO: don’t use it.

But you didn’t exactly say what you want to cache - do you want to cache the 
OSD filestore disks? RBD devices on hosts? RBD devices inside guests?

Jan

> On 02 Jul 2015, at 11:29, Emmanuel Florac  wrote:
> 
> Le Wed, 1 Jul 2015 17:13:03 -0300
> German Anders  écrivait:
> 
>> Hi cephers,
>> 
>>   Is anyone out there that implement enhanceIO in a production
>> environment? any recommendation? any perf output to share with the
>> diff between using it and not?
> 
> I've tried EnhanceIO back when it wasn't too stale, but never put it in
> production. I've set up bcache on trial, it has its problems (load is
> stuck at 1.0 because of the bcache_writeback kernel thread, and I
> suspect a crash was due to it) but works pretty well overall.
> 
> -- 
> 
> Emmanuel Florac |   Direction technique
>|   Intellique
>|  
>|   +33 1 78 94 84 02
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread Emmanuel Florac
Le Wed, 1 Jul 2015 17:13:03 -0300
German Anders  écrivait:

> Hi cephers,
> 
>Is anyone out there that implement enhanceIO in a production
> environment? any recommendation? any perf output to share with the
> diff between using it and not?

I've tried EnhanceIO back when it wasn't too stale, but never put it in
production. I've set up bcache on trial; it has its problems (load is
stuck at 1.0 because of the bcache_writeback kernel thread, and I
suspect a crash was due to it) but works pretty well overall.
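
That writeback thread can at least be inspected and throttled through sysfs; a
small sketch, assuming /dev/bcache0 (knob names from the upstream bcache
documentation):

cat /sys/block/bcache0/bcache/dirty_data           # data still waiting to be written back
cat /sys/block/bcache0/bcache/writeback_percent    # dirty percentage background writeback aims for
echo 10 > /sys/block/bcache0/bcache/writeback_percent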

-- 

Emmanuel Florac |   Direction technique
|   Intellique
|   
|   +33 1 78 94 84 02

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-02 Thread Burkhard Linke

Hi,

On 07/01/2015 10:13 PM, German Anders wrote:

Hi cephers,

Is anyone out there that implement enhanceIO in a production 
environment? any recommendation? any perf output to share with the 
diff between using it and not?


I've used EnhanceIO as an accelerator for our MySQL server, but I had to 
discard it after a fatal kernel crash related to the module.


In my experience it works stably in write-through mode, but write-back 
is buggy. Since the latter is the interesting one in almost any use 
case, I would not recommend using it.
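
For completeness, a write-through EnhanceIO cache is created roughly like this
with the eio_cli tool that ships with the module (flags taken from the project
documentation, so treat them as illustrative rather than verified):

eio_cli create -d /dev/sdb -s /dev/nvme0n1p1 -p lru -m wt -c eio_sdb
eio_cli info -c eio_sdb      # statistics for the cache
eio_cli delete -c eio_sdb    # tear the cache down again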


Best regards,
Burkhard Linke
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] any recommendation of using EnhanceIO?

2015-07-01 Thread Dominik Zalewski
Hi,


I’ve asked the same question a week or so ago (just search the mailing list
archives for EnhanceIO :) and got some interesting answers.


Looks like the project is pretty much dead since it was bought out by HGST.
Even their website has some broken links in regard to EnhanceIO.


I’m keen to try flashcache or bcache (it’s been in the mainline kernel for
some time).


Dominik

On Wed, Jul 1, 2015 at 9:13 PM, German Anders  wrote:

> Hi cephers,
>
>Is anyone out there that implement enhanceIO in a production
> environment? any recommendation? any perf output to share with the diff
> between using it and not?
>
> Thanks in advance,
>
> *German*
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] any recommendation of using EnhanceIO?

2015-07-01 Thread German Anders
Hi cephers,

   Is anyone out there who has implemented EnhanceIO in a production
environment? Any recommendations? Any perf output to share showing the
difference between using it and not?

Thanks in advance,

*German*
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com