This is really cool. We should also be more opinionated about how we specify time and intervals.
On Wed, Feb 12, 2020 at 3:15 PM, Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Thank you, Wenchen.
>
> The new policy looks clear to me. +1 for the explicit policy.
>
> So, are we going to revise the existing conf names before 3.0.0 release?
>
> Or, is it applied to new up-coming configurations from now?
>
> Bests,
> Dongjoon.
>
> On Wed, Feb 12, 2020 at 7:43 AM Wenchen Fan <cloud0...@gmail.com> wrote:
>
>> Hi all,
>>
>> I'd like to discuss the naming policy of Spark configs, as for now it
>> depends on personal preference which leads to inconsistent namings.
>>
>> In general, the config name should be a noun that describes its meaning
>> clearly.
>> Good examples:
>>   spark.sql.session.timeZone
>>   spark.sql.streaming.continuous.executorQueueSize
>>   spark.sql.statistics.histogram.numBins
>> Bad examples:
>>   spark.sql.defaultSizeInBytes (default size for what?)
>>
>> Also note that, config name has many parts, joined by dots. Each part is a
>> namespace. Don't create namespace unnecessarily.
>> Good example:
>>   spark.sql.execution.rangeExchange.sampleSizePerPartition
>>   spark.sql.execution.arrow.maxRecordsPerBatch
>> Bad examples:
>>   spark.sql.windowExec.buffer.in.memory.threshold ("in" is not a
>>   useful namespace, better to be .buffer.inMemoryThreshold)
>>
>> For a big feature, usually we need to create an umbrella config to turn it
>> on/off, and other configs for fine-grained controls. These configs should
>> share the same namespace, and the umbrella config should be named like
>> featureName.enabled. For example:
>>   spark.sql.cbo.enabled
>>   spark.sql.cbo.starSchemaDetection
>>   spark.sql.cbo.starJoinFTRatio
>>   spark.sql.cbo.joinReorder.enabled
>>   spark.sql.cbo.joinReorder.dp.threshold (BTW "dp" is not a good namespace)
>>   spark.sql.cbo.joinReorder.card.weight (BTW "card" is not a good namespace)
>>
>> For boolean configs, in general it should end with a verb, e.g.
>> spark.sql.join.preferSortMergeJoin. If the config is for a feature and you
>> can't find a good verb for the feature, featureName.enabled is also good.
>>
>> I'll update https://spark.apache.org/contributing.html after we reach a
>> consensus here. Any comments are welcome!
>>
>> Thanks,
>> Wenchen
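
As a rough illustration of the rules in the quoted proposal, here is a minimal Python sketch of a config-name checker. Everything in it is an assumption made for illustration: the function names `check_config_name` and `missing_umbrella`, the deny-list of namespaces (taken only from the "bad" examples in the thread), and the lowerCamelCase pattern (inferred from the "good" examples). Nothing like this exists in Spark or in the proposal itself.

```python
import re

# Namespace parts the thread calls out as unhelpful. This deny-list is an
# illustrative assumption, not an official Spark list.
UNHELPFUL_NAMESPACES = {"in", "dp", "card"}

# Assumption: each dot-separated part is lowerCamelCase, as in the thread's
# good examples (timeZone, numBins, maxRecordsPerBatch).
PART_PATTERN = re.compile(r"^[a-z][a-zA-Z0-9]*$")


def check_config_name(name):
    """Return a list of human-readable problems found in a config name."""
    problems = []
    parts = name.split(".")
    for part in parts:
        if not PART_PATTERN.match(part):
            problems.append(f"part {part!r} is not lowerCamelCase")
    # Every part except the last one acts as a namespace.
    for part in parts[:-1]:
        if part in UNHELPFUL_NAMESPACES:
            problems.append(f"namespace {part!r} is not descriptive")
    return problems


def missing_umbrella(names, feature_namespace):
    """True if configs exist under feature_namespace but '<ns>.enabled' does not."""
    under = [n for n in names if n.startswith(feature_namespace + ".")]
    return bool(under) and (feature_namespace + ".enabled") not in names


if __name__ == "__main__":
    for conf in [
        "spark.sql.session.timeZone",
        "spark.sql.windowExec.buffer.in.memory.threshold",
        "spark.sql.cbo.joinReorder.dp.threshold",
    ]:
        print(conf, check_config_name(conf))
```

A deny-list can only catch namespaces someone has already flagged, so a check like this would supplement review rather than replace it. The "noun that describes its meaning clearly" rule is not something this sketch attempts to automate.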