[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-01 Thread jsterr
Is there any update on this? Has anyone tested the option, and do you
have performance numbers from before and after?

Is there any good documentation regarding this option?


[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-01 Thread Igor Fedotov
I played with this feature a while ago and recall it had a visible
negative impact on user operations, due to the need to submit tons of
discard operations - effectively, each data overwrite triggers the
submission of one or more discard operations to disk.


And I doubt this has been widely used, if at all.

Nevertheless, we recently got a PR to rework some aspects of thread
management for this feature, see https://github.com/ceph/ceph/pull/55469


The author said they needed this feature for their cluster, so you
might want to ask them about their experience.



W.r.t. documentation - there are actually just two options:

- bdev_enable_discard - enables issuing discards to the disk

- bdev_async_discard - controls whether discard requests are issued
synchronously (along with the release of disk extents) or
asynchronously (using a background thread).
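
For reference, a minimal sketch of how one might set these via the
centralized config (treat this as illustrative only - the option names
are as above, but exact behaviour, and whether an OSD restart is
needed, may vary by release):

    # enable issuing discards on all OSDs (illustrative scope)
    ceph config set osd bdev_enable_discard true
    # issue discards from a background thread rather than inline
    ceph config set osd bdev_async_discard true
    # check the value that applies to a given OSD (osd.0 is just an example)
    ceph config get osd.0 bdev_enable_discard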


Thanks,

Igor



[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-01 Thread Anthony D'Atri
I have a number of drives in my fleet with old firmware that seems to
have discard/TRIM bugs, as in the drives get bricked.

Much worse, since they're behind legacy RAID HBAs, many of them can't
have their firmware updated.

YMMV.



[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-02 Thread David C.
I came across an enterprise NVMe drive used for the BlueFS DB whose
performance dropped sharply a few months after delivery (I won't
mention the brand here, but it was not one of these three: Intel,
Samsung, Micron).
Enabling bdev_enable_discard clearly impacted performance, but the
option also saved the platform after a few days of discarding.

IMHO the most important thing is to validate the behavior once the
entire flash media has been written to.
But this option has the merit of existing.

It seems to me that the ideal would be to not have several
bdev_*discard options, and for this task to run asynchronously, issuing
the (D)iscard instructions during quieter periods of activity (I see no
impact if the instructions are lost during an OSD reboot).




[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-02 Thread Matt Vandermeulen
We've had a specific set of drives for which we've had to enable
bdev_enable_discard and bdev_async_discard in order to maintain
acceptable performance on block clusters. I wrote the patch that Igor
mentioned in order to try to send more parallel discards to the
devices, but these drives in particular seem to process them serially
(based on the observed discard counts and latency going to the device),
which is unfortunate. We're also testing new firmware that should help
alleviate some of the initial concerns we had about discards not
keeping up, which is what prompted the patch in the first place.


Most of our drives do not need discards enabled (and definitely not
without async) in order to maintain performance, unless we're doing a
full-disk fio test or something like that where we're trying to find
the drive's cliff profile. We've used OSD device classes to target
these options at specific OSDs via the centralized conf, which helps
when adding new hosts that may have different drives, so the options
aren't applied globally.
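
As an aside, a sketch of that kind of targeting with device-class masks
in the centralized config (this assumes the drives in question already
map to a distinct device class - "ssd" here versus e.g. "nvme" for the
rest; host masks work the same way):

    # apply the discard options only to OSDs in the "ssd" device class
    ceph config set osd/class:ssd bdev_enable_discard true
    ceph config set osd/class:ssd bdev_async_discard true
    # or target a single host instead (the hostname is just an example)
    ceph config set osd/host:node01 bdev_enable_discard true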


Based on our experience, I wouldn't enable it unless you're seeing
some sort of cliff-like behaviour as your OSDs run low on free space,
or are heavily fragmented. I would also consider bdev_async_discard =
true a requirement, so that discards don't block user IO. Keep an eye
on the discards being sent to the devices and on the discard latency as
well (via node_exporter, for example).
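
For example, a quick way to watch per-device discard counts and latency
(assuming a reasonably recent sysstat; node_exporter exposes similar
counters, e.g. node_disk_discards_completed_total):

    # extended device stats, refreshed every second; on newer sysstat
    # the output has discard columns (d/s, dkB/s, d_await) alongside
    # the read and write columns
    iostat -dx 1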


Matt




[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-02 Thread David C.
Could we not consider adding a “bluefstrim” that could be orchestrated?

This would avoid a continuous stream of (D)iscard instructions to the
disks during activity.

A weekly (or probably even monthly) bluefstrim would likely be enough
for the platforms that really need it.




[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-02 Thread Joshua Baergen
Periodic discard was actually attempted in the past:
https://github.com/ceph/ceph/pull/20723

A proper implementation would probably need appropriate scheduling and
throttling that can be tuned to balance it against client I/O impact.

Josh


[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-06 Thread jsterr
Is there any update on this? Has anyone tested the option, and do you
have performance numbers from before and after?

Is there any good documentation regarding this option?


[ceph-users] Re: has anyone enabled bdev_enable_discard?

2021-04-13 Thread Wido den Hollander



On 4/12/21 5:46 PM, Dan van der Ster wrote:
> Hi all,
> 
> bdev_enable_discard has been in ceph for several major releases now
> but it is still off by default.
> Did anyone try it recently -- is it safe to use? And do you have perf
> numbers before and after enabling?
> 

I have done so on SATA SSDs in a few cases, and it worked.

Did I notice a real difference? Not really.

It's highly debated whether this still makes a difference with modern
flash devices. I don't think there is a real conclusion on whether you
still need to trim/discard blocks.

Wido



[ceph-users] Re: has anyone enabled bdev_enable_discard?

2021-04-13 Thread Dan van der Ster
On Tue, Apr 13, 2021 at 9:00 AM Wido den Hollander  wrote:
> I have done so on SATA SSDs in a few cases and: it worked
>
> Did I notice a real difference? Not really.
>

Thanks, I've enabled it on a test box and am draining data to check
that it doesn't crash anything.

> It's highly debated if this still makes a difference with modern flash
> devices. I don't think there is a real conclusion if you still need to
> trim/discard blocks.

Do you happen to have any more info on these debates? As you know, we
have seen major performance issues on hypervisors that are not running
a periodic fstrim; we use similar or identical SATA SSDs for HV local
storage and for our block.db's. If it doesn't hurt anything, why
wouldn't we enable it by default?
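
For context, the periodic fstrim on the hypervisor side is nothing
exotic - roughly the following (assuming util-linux's fstrim on a
systemd-based host; this is a sketch of the idea, not our exact setup):

    # one-off trim of all mounted filesystems that support discard
    fstrim -av
    # or run it on a schedule (weekly by default)
    systemctl enable --now fstrim.timer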

Cheers, Dan


[ceph-users] Re: has anyone enabled bdev_enable_discard?

2021-04-13 Thread Mark Nelson




There's some good discussion in the original PR:

https://github.com/ceph/ceph/pull/14727

I suspect the primary concerns with enabling it by default are twofold:
(1) the issue of having to maintain a blocklist for buggy firmware
implementations, and (2) even "good" firmware can potentially see
slowdowns with bursts of trim commands due to needing to update the FTL
metadata, per this comment:

https://github.com/ceph/ceph/pull/14727#issuecomment-342399578

The original question of how to decide between online discard, periodic
bulk discard, or no discard is still open, imho. I think we probably
need to get more feedback from people with real large deployments (hint
hint :D) before we enable online discard by default.



Mark






[ceph-users] Re: has anyone enabled bdev_enable_discard?

2021-04-13 Thread Dan van der Ster

Thanks for the links. Further to those, I found the earlier attempt at
a periodic discard: https://github.com/ceph/ceph/pull/20723
Igor posted some performance numbers there for both online and periodic
discard, and neither looks very promising.
And I didn't find any further work on periodic discard for the bitmap
allocator or beyond.

Since the runtime performance impact looks unpredictable, maybe a
conservative way to resume this work would be to allow discard via the
offline bluestore tooling?

Cheers, Dan



[ceph-users] Re: has anyone enabled bdev_enable_discard?

2021-04-15 Thread Wido den Hollander






These debates are more about if it really makes sense with modern SSDs 
as the performance gain seems limited.


With older (SATA) SSDs it might, but with the modern NVMe DC-grade ones 
people are doubting if it is still needed.


SATA 3.0 also had the issue that the TRIM command was a blocking command 
where with SATA 3.1 it became async and thus non-blocking.


With NVMe it is a different story again.

I don't have links or papers for you, it's mainly stories I heard on 
conferences and such.
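
If you want to check what a given device actually advertises, something
like this is a quick sketch (replace /dev/sdX; hdparm only applies to
SATA devices):

    # discard granularity/limits as seen by the kernel
    lsblk --discard /dev/sdX
    # SATA: look for the "Data Set Management TRIM supported" line
    hdparm -I /dev/sdX | grep -i trim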


Wido

