[ 
https://issues.apache.org/jira/browse/SPARK-37727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-37727:
---------------------------------
    Description: 
Currently, {{SparkSession.builder.getOrCreate()}} is too noisy: it warns even when 
the configurations being set duplicate values that are already in effect, and users 
cannot tell which configurations they need to fix. See the example below:

{code}
./bin/spark-shell --conf spark.abc=abc
{code}

{code}
import org.apache.spark.sql.SparkSession
spark.sparkContext.setLogLevel("DEBUG")
SparkSession.builder.config("spark.abc", "abc").getOrCreate
{code}

{code}
...
21:12:40.601 [main] WARN  org.apache.spark.sql.SparkSession - Using an existing SparkSession; some spark core configurations may not take effect.
{code}

This is straightforward when there are only a few configurations, but it is difficult 
for users to figure out which ones are affected when there are many, especially when 
the configurations are defined in property files such as {{spark-defaults.conf}} 
that are sometimes maintained separately by system admins.
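
As a rough illustration of the behavior being asked for (not the actual Spark implementation), the sketch below splits the options passed to the builder into those that already match the existing session and those that differ, and produces a warning only for the latter. {{ConfDiff}}, {{diff}}, {{warning}} and the message wording are hypothetical names chosen for this example. It also does not model which differing options Spark could still apply at runtime.

{code}
// Hypothetical helper, not part of Spark.
object ConfDiff {
  // alreadySet: requested options whose value matches the existing session.
  // ignored:    requested options that are missing or have a different value.
  final case class Result(
      alreadySet: Map[String, String],
      ignored: Map[String, String])

  def diff(
      existing: Map[String, String],
      requested: Map[String, String]): Result = {
    val (same, different) = requested.partition {
      case (key, value) => existing.get(key).contains(value)
    }
    Result(same, different)
  }

  // Returns a message only when at least one option differs, so that
  // duplicate configurations stay silent. The wording is an assumption.
  def warning(ignored: Map[String, String]): Option[String] =
    if (ignored.isEmpty) None
    else Some(
      "Using an existing SparkSession; these configurations may not take effect: " +
        ignored.toSeq.sorted.map { case (k, v) => s"$k=$v" }.mkString(", "))
}
{code}

With the example above, {{ConfDiff.diff(Map("spark.abc" -> "abc"), Map("spark.abc" -> "abc"))}} yields an empty {{ignored}} map, so no warning would be produced. In {{spark-shell}}, the {{existing}} map could be taken from {{spark.conf.getAll}}.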

See also https://github.com/apache/spark/pull/34757#discussion_r769248275

  was:
Currently, {{SparkSession.builder.getOrCreate()}} is too noisy: it warns even when 
the configurations being set duplicate values that are already in effect, and users 
cannot tell which configurations they need to fix. See the example below:

{code}
./bin/spark-shell --conf spark.abc=abc
{code}

{code}
import org.apache.spark.sql.SparkSession
SparkSession.builder.config("spark.abc", "abc").getOrCreate
{code}

{code}

{code}

This is straightforward when there are only a few configurations, but it is difficult 
for users to figure out which ones are affected when there are many, especially when 
the configurations are defined in property files such as {{spark-defaults.conf}} 
that are sometimes maintained separately by system admins.

See also https://github.com/apache/spark/pull/34757#discussion_r769248275


> Show ignored confs & hide warnings for conf already set in 
> SparkSession.builder.getOrCreate
> -------------------------------------------------------------------------------------------
>
>                 Key: SPARK-37727
>                 URL: https://issues.apache.org/jira/browse/SPARK-37727
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Hyukjin Kwon
>            Priority: Major


