[ https://issues.apache.org/jira/browse/SPARK-21752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129322#comment-16129322 ]
Jakub Nowacki commented on SPARK-21752:
---------------------------------------

Well, it seems so, but it is, at least logically, unclear why passing a {{SparkConf}} object via the {{SparkSession}} builder works, whereas setting the same key-value pair directly does not. This should at least be mentioned somewhere in the documentation, but currently neither the [Configuration|https://spark.apache.org/docs/latest/configuration.html] page nor the [Spark SQL guide|https://spark.apache.org/docs/latest/sql-programming-guide.html] says anything about it.

> Config spark.jars.packages is ignored in SparkSession config
> ------------------------------------------------------------
>
>                 Key: SPARK-21752
>                 URL: https://issues.apache.org/jira/browse/SPARK-21752
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Jakub Nowacki
>
> If I set the config key {{spark.jars.packages}} using the {{SparkSession}} builder
> as follows:
> {code}
> spark = pyspark.sql.SparkSession.builder \
>     .appName('test-mongo') \
>     .master('local[*]') \
>     .config("spark.jars.packages",
>             "org.mongodb.spark:mongo-spark-connector_2.11:2.2.0") \
>     .config("spark.mongodb.input.uri", "mongodb://mongo/test.coll") \
>     .config("spark.mongodb.output.uri", "mongodb://mongo/test.coll") \
>     .getOrCreate()
> {code}
> the SparkSession gets created, but no package download logs are printed, and if I
> try to use the loaded classes (the Mongo connector in this case, but it is the same
> for other packages), I get {{java.lang.ClassNotFoundException}} for the missing
> classes.
> If I use the config file {{conf/spark-defaults.conf}} or the command-line option
> {{--packages}}, e.g.:
> {code}
> import os
> os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages org.mongodb.spark:mongo-spark-connector_2.11:2.2.0 pyspark-shell'
> {code}
> it works fine. Interestingly, using a {{SparkConf}} object works fine as well, e.g.:
> {code}
> conf = pyspark.SparkConf()
> conf.set("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector_2.11:2.2.0")
> conf.set("spark.mongodb.input.uri", "mongodb://mongo/test.coll")
> conf.set("spark.mongodb.output.uri", "mongodb://mongo/test.coll")
> spark = pyspark.sql.SparkSession.builder \
>     .appName('test-mongo') \
>     .master('local[*]') \
>     .config(conf=conf) \
>     .getOrCreate()
> {code}
> The above is in Python, but I have seen the same behavior in other languages,
> though I did not check R. I have also seen it in older Spark versions.
> It seems that this is the only config key that does not work for me via the
> {{SparkSession}} builder config.
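
For completeness, below is a minimal end-to-end sketch of the working pattern from the report, combining its {{PYSPARK_SUBMIT_ARGS}} snippet with the builder call (package coordinates and URIs are taken from the examples above; the comment about ordering is my reading of why this variant works, namely that {{--packages}} has to reach spark-submit before the JVM starts, and is not something the report or the docs state):

{code}
import os

# Sketch: PYSPARK_SUBMIT_ARGS is read when the JVM gateway is launched,
# i.e. when the first SparkContext is created, so it must be set before
# getOrCreate() runs (assumption based on the workaround reported above).
os.environ['PYSPARK_SUBMIT_ARGS'] = (
    '--packages org.mongodb.spark:mongo-spark-connector_2.11:2.2.0 '
    'pyspark-shell'
)

import pyspark

spark = pyspark.sql.SparkSession.builder \
    .appName('test-mongo') \
    .master('local[*]') \
    .config("spark.mongodb.input.uri", "mongodb://mongo/test.coll") \
    .config("spark.mongodb.output.uri", "mongodb://mongo/test.coll") \
    .getOrCreate()

# Standard PySpark API for inspecting the effective configuration;
# note the key can appear here even when the packages were never fetched,
# so the real test is whether the connector classes resolve at use time.
print(spark.sparkContext.getConf().get("spark.jars.packages", "<not set>"))
{code}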