I'm sorry if I'm missing something, but this would ideally have been
started as a [DISCUSS] thread, since I haven't seen any reference to a
consensus on this practice.

For me, there are simply two different practices co-existing in the
codebase, which means either it comes down to individual preference (with
the implicit understanding that others may prefer differently), or it
hasn't been discussed thoroughly.

Personally, I find it hard to imagine someone not understanding that
setting `.enabled` to `false` disables the functionality and, effectively,
all of its sub-configurations. E.g. when `spark.sql.adaptive.enabled` is
`false`, all the `spark.sql.adaptive.*` configurations are implicitly
no-ops. To me this is quite intuitive and one of the major benefits of
structured configurations.
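
For illustration, here is a rough spark-shell sketch of what I mean (the
sub-configuration name is just the AQE example above; the comments describe
the general pattern, not a specific code path):

    // A minimal sketch: with the parent flag off, the sub-configuration
    // can still be set, but it is effectively a no-op.
    spark.conf.set("spark.sql.adaptive.enabled", "false")
    spark.conf.set(
      "spark.sql.adaptive.shuffle.reducePostShufflePartitions.enabled", "true")
    // Adaptive execution stays disabled, so the second setting has no effect.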

If we want to make it explicit, "all" sub-configurations would need a
redundant part in their docs, and even more redundancy when the condition
is nested. I agree this is a good step toward "being kind", but it's less
pragmatic.
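
Concretely, making it explicit would mean every sub-configuration carries a
sentence like the one below. This is only a rough sketch in the SQLConf
ConfigBuilder style; the wording and the default value are illustrative,
not an existing doc string:

    buildConf("spark.sql.adaptive.shuffle.reducePostShufflePartitions.enabled")
      .doc("<existing description> " +
        "This configuration only takes effect when " +
        "spark.sql.adaptive.enabled is true.")
      .booleanConf
      .createWithDefault(true)  // the default here is illustrative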

I'd be happy to follow whatever consensus we reach in this thread. I'd
appreciate more voices.

Thanks,
Jungtaek Lim (HeartSaVioR)


On Wed, Feb 12, 2020 at 10:36 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:

> > I don't plan to document this officially yet
> Just to prevent confusion: I meant that I don't yet plan to document the
> fact that we should describe the relationships between configurations as
> a code/review guideline in https://spark.apache.org/contributing.html
>
>
> On Wed, Feb 12, 2020 at 9:57 AM, Hyukjin Kwon <gurwls...@gmail.com> wrote:
>
>> Hi all,
>>
>> I happened to review some PRs and noticed that some configurations are
>> missing necessary information.
>>
>> To be explicit, I would like to make sure we document the direct
>> relationship between configurations. For example,
>> `spark.sql.adaptive.shuffle.reducePostShufflePartitions.enabled` can
>> only take effect when `spark.sql.adaptive.enabled` is enabled, and that
>> is clearly documented.
>> In general we're good about documenting these in Apache Spark; see
>> 'spark.task.reaper.enabled', 'spark.dynamicAllocation.enabled',
>> 'spark.sql.parquet.filterPushdown', etc.
>>
>> However, I noticed a pattern where such information is missing in some
>> components, for example `spark.history.fs.cleaner.*`,
>> `spark.history.kerberos.*` and `spark.history.ui.acls.*`.
>>
>> I hope we all start documenting such information. Otherwise users can't
>> know the relationship, and I myself had to read the code to confirm it
>> when reviewing.
>> I don't plan to document this officially yet because it looks like a
>> pretty logical request to me; however, let me know if you have different
>> opinions.
>>
>> Thanks.
>>
>>
>>
