+1 to switch it on in Hadoop 3.4.0

(1) It doesn't break any existing applications that I am aware of.
(2) I have observed no noticeable performance regression in any case.

I feel compelled to make a feature the default if it is strictly better.
Hopefully we can make Hadoop easier to use in this way too.

On Tue, Apr 28, 2020 at 8:36 AM Stephen O'Donnell
<sodonn...@cloudera.com.invalid> wrote:

> Hi,
>
> A long time back there was a Jira raised to change the default volume
> choosing policy from Round Robin to Available Space:
>
> https://issues.apache.org/jira/browse/HDFS-8538
>
> At the time there were some objections / concerns about using available
> space.
>
> In the 5 years since then, at Cloudera we have seen about 1000 clusters
> running with Available Space enabled, and we have not seen any issues
> caused by it. It feels like this policy should be the default, as we have
> to change it more often than not.
>
> To recap, the Available Space policy places blocks on disks with more free space
> with a higher probability until all disks are within a threshold of free
> space from each other. After that it behaves in a round robin fashion. This
> means if a disk is replaced, it will slowly catch up to the usage of the
> others, and if you have disks of different sizes, they will self balance.
>
> I would like to ask:
>
> 1. Are there others in the community running the Available Space volume
> choosing policy, and if so, have you seen any issues, or does it run
> smoothly?
>
> 2. Does anyone have any strong objections to changing the default to
> Available Space from 3.4 onwards?
>
> Thanks,
>
> Stephen.
>
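
For anyone following along, the behaviour Stephen recaps above (prefer disks
with more free space until all disks are within a threshold of each other,
then plain round robin) can be sketched roughly as below. This is a simplified
illustration and not the Hadoop source: the class name AvailableSpaceSketch,
the parameters thresholdBytes and preferHighFraction, and the way disks are
split into two groups are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Simplified illustration only; this is not the Hadoop source. */
class AvailableSpaceSketch {
    private final long thresholdBytes;        // hypothetical: free-space spread treated as "balanced"
    private final double preferHighFraction;  // hypothetical: chance of picking a roomier disk
    private final Random random;
    private int nextAll = 0, nextHigh = 0, nextLow = 0; // round-robin cursors

    AvailableSpaceSketch(long thresholdBytes, double preferHighFraction, Random random) {
        this.thresholdBytes = thresholdBytes;
        this.preferHighFraction = preferHighFraction;
        this.random = random;
    }

    /** Returns the index of the volume to write to, given free bytes per volume. */
    int choose(long[] freeBytes) {
        if (freeBytes.length == 0) {
            throw new IllegalArgumentException("no volumes");
        }
        long min = freeBytes[0];
        long max = freeBytes[0];
        for (long f : freeBytes) {
            min = Math.min(min, f);
            max = Math.max(max, f);
        }

        // All disks are within the threshold of each other: plain round robin.
        if (max - min <= thresholdBytes) {
            int idx = nextAll % freeBytes.length;
            nextAll = (idx + 1) % freeBytes.length;
            return idx;
        }

        // Otherwise split the disks into "roomier" and "fuller" groups.
        List<Integer> high = new ArrayList<>();
        List<Integer> low = new ArrayList<>();
        for (int i = 0; i < freeBytes.length; i++) {
            if (freeBytes[i] - min > thresholdBytes) {
                high.add(i);
            } else {
                low.add(i);
            }
        }

        // Pick the roomier group with the configured probability,
        // then go round robin inside whichever group was picked.
        if (random.nextDouble() < preferHighFraction) {
            int pos = nextHigh % high.size();
            nextHigh = pos + 1;
            return high.get(pos);
        }
        int pos = nextLow % low.size();
        nextLow = pos + 1;
        return low.get(pos);
    }
}
```

With a threshold of 10 and free space of {100, 500, 105}, only the second disk
counts as roomier in this sketch, so it receives new blocks with probability
preferHighFraction until its free space comes within the threshold of the
others. That is the "replaced disk slowly catches up" effect described above.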
