Thank you, Wenchen.

The new policy looks clear to me. +1 for the explicit policy.

So, are we going to revise the existing conf names before the 3.0.0 release?

Or does it apply only to new configurations from now on?

Best,
Dongjoon.

On Wed, Feb 12, 2020 at 7:43 AM Wenchen Fan <cloud0...@gmail.com> wrote:

> Hi all,
>
> I'd like to discuss the naming policy for Spark configs. For now it
> depends on personal preference, which leads to inconsistent naming.
>
> In general, the config name should be a noun that describes its meaning
> clearly.
> Good examples:
> spark.sql.session.timeZone
> spark.sql.streaming.continuous.executorQueueSize
> spark.sql.statistics.histogram.numBins
> Bad example:
> spark.sql.defaultSizeInBytes (default size for what?)
>
> Also note that a config name has many parts, joined by dots. Each part is a
> namespace. Don't create namespaces unnecessarily.
> Good examples:
> spark.sql.execution.rangeExchange.sampleSizePerPartition
> spark.sql.execution.arrow.maxRecordsPerBatch
> Bad example:
> spark.sql.windowExec.buffer.in.memory.threshold ("in" is not a useful
> namespace, better to be .buffer.inMemoryThreshold)
>
> For a big feature, usually we need to create an umbrella config to turn it
> on/off, and other configs for fine-grained controls. These configs should
> share the same namespace, and the umbrella config should be named like
> featureName.enabled. For example:
> spark.sql.cbo.enabled
> spark.sql.cbo.starSchemaDetection
> spark.sql.cbo.starJoinFTRatio
> spark.sql.cbo.joinReorder.enabled
> spark.sql.cbo.joinReorder.dp.threshold (BTW "dp" is not a good namespace)
> spark.sql.cbo.joinReorder.card.weight (BTW "card" is not a good namespace)
>
> For boolean configs, in general the name should end with a verb, e.g.
> spark.sql.join.preferSortMergeJoin. If the config is for a feature and
> you can't find a good verb for the feature, featureName.enabled is also
> good.
>
> I'll update https://spark.apache.org/contributing.html after we reach a
> consensus here. Any comments are welcome!
>
> Thanks,
> Wenchen
>
>
>
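
A minimal sketch of how the rules quoted above could be checked mechanically. This is only an illustration, not anything that exists in Spark: the helper names (lint_config_name, has_umbrella), the lowerCamelCase check, and the minimum part length of 3 are all assumptions made for the example. The length heuristic happens to flag "in" and "dp" from the examples above, but it would not flag "card".

```python
import re

# Assumption: each dot-separated part should be lowerCamelCase.
CAMEL_CASE = re.compile(r"^[a-z][a-zA-Z0-9]*$")


def lint_config_name(name, min_part_len=3):
    """Return warnings for a dotted config name (hypothetical helper).

    min_part_len is an arbitrary heuristic for spotting parts that are
    probably not useful namespaces, such as "in" or "dp".
    """
    warnings = []
    for part in name.split("."):
        if not CAMEL_CASE.match(part):
            warnings.append(f"'{part}' is not lowerCamelCase")
        elif len(part) < min_part_len:
            warnings.append(f"'{part}' is too short to be a useful namespace")
    return warnings


def has_umbrella(names, feature_namespace):
    """Check that a feature namespace has a featureName.enabled config."""
    return f"{feature_namespace}.enabled" in names


# Config names taken from the examples in the quoted message.
CBO_CONFIGS = {
    "spark.sql.cbo.enabled",
    "spark.sql.cbo.starSchemaDetection",
    "spark.sql.cbo.starJoinFTRatio",
    "spark.sql.cbo.joinReorder.enabled",
    "spark.sql.cbo.joinReorder.dp.threshold",
    "spark.sql.cbo.joinReorder.card.weight",
}

if __name__ == "__main__":
    for config in [
        "spark.sql.execution.arrow.maxRecordsPerBatch",
        "spark.sql.windowExec.buffer.in.memory.threshold",
    ]:
        print(config, lint_config_name(config))
    print(has_umbrella(CBO_CONFIGS, "spark.sql.cbo"))
```

A check like this can only catch mechanical problems. Whether a name "describes its meaning clearly" (for example spark.sql.defaultSizeInBytes) still needs human review.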
