Rafal Wojdyla created SPARK-38438:
-------------------------------------

Summary: Can't update spark.jars.packages on existing global/default context
Key: SPARK-38438
URL: https://issues.apache.org/jira/browse/SPARK-38438
Project: Spark
Issue Type: New Feature
Components: PySpark, Spark Core
Affects Versions: 3.2.1
Environment: py: 3.9
spark: 3.2.1
Reporter: Rafal Wojdyla

Reproduction:

{code:python}
from pyspark import SparkConf
from pyspark.sql import SparkSession

# default session:
s = SparkSession.builder.getOrCreate()

# later on we want to update jars.packages, here's e.g. spark-hats
s = (SparkSession.builder
     .config("spark.jars.packages", "za.co.absa:spark-hats_2.12:0.2.2")
     .getOrCreate())

# the line below returns None: the config was not propagated
s._sc._conf.get("spark.jars.packages")
{code}

Stopping the context doesn't help. In fact, it is even more confusing, because the configuration is updated but has no effect:

{code:python}
from pyspark import SparkConf
from pyspark.sql import SparkSession

# default session:
s = SparkSession.builder.getOrCreate()

s.stop()

s = (SparkSession.builder
     .config("spark.jars.packages", "za.co.absa:spark-hats_2.12:0.2.2")
     .getOrCreate())

# now this line returns 'za.co.absa:spark-hats_2.12:0.2.2', but the context
# doesn't download the jar/package, as it would if there were no global context,
# so the extra package is unusable: it is neither downloaded nor added to the
# classpath.
s._sc._conf.get("spark.jars.packages")
{code}

One workaround is to stop the context AND kill the JVM gateway, which seems to be a kind of hard reset:

{code:python}
# SparkContext is imported because its class-level gateway/JVM handles are reset below
from pyspark import SparkContext
from pyspark.sql import SparkSession

# default session:
s = SparkSession.builder.getOrCreate()

# Hard reset: stop the context, shut down the JVM gateway,
# and clear the cached gateway/JVM references
s.stop()
s._sc._gateway.shutdown()
SparkContext._gateway = None
SparkContext._jvm = None

s = (SparkSession.builder
     .config("spark.jars.packages", "za.co.absa:spark-hats_2.12:0.2.2")
     .getOrCreate())

# Now we are guaranteed there's a new spark session, and packages
# are downloaded, added to the classpath etc.
{code}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)