Michael Armbrust created SPARK-27911:
----------------------------------------

             Summary: PySpark Packages should automatically choose correct Scala version
                 Key: SPARK-27911
                 URL: https://issues.apache.org/jira/browse/SPARK-27911
             Project: Spark
          Issue Type: New Feature
          Components: PySpark
    Affects Versions: 2.4.3
            Reporter: Michael Armbrust


Today, users of PySpark (and Scala) need to manually specify the version of 
Scala that their Spark installation is using when adding a Spark package to 
their application. This extra configuration is confusing to users who may not 
even know which version of Scala they are using (for example, if they installed 
Spark using {{pip}}). The confusion is exacerbated by Spark releases that have 
changed the default from {{2.11}} -> {{2.12}} -> {{2.11}}:

https://spark.apache.org/releases/spark-release-2-4-2.html
https://spark.apache.org/releases/spark-release-2-4-3.html
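
For a concrete illustration of the manual step involved today, here is a 
minimal PySpark snippet (the package coordinate is only an example):

{code:python}
from pyspark.sql import SparkSession

# The user has to know that this Spark build was compiled against Scala 2.11
# and pick the matching artifact suffix by hand; on a Scala 2.12 build the
# correct coordinate would instead end in _2.12.
spark = (
    SparkSession.builder
    .config("spark.jars.packages", "io.delta:delta-core_2.11:0.1.0")
    .getOrCreate()
)
{code}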

Since Spark knows which version of Scala it was compiled for, we should give 
users the option to have the correct version chosen automatically. This could be 
as simple as substituting a {{$scalaVersion}} placeholder when resolving a 
package (similar to SBT's support for automatically handling Scala 
dependencies), as sketched below.
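
A minimal sketch of what such a substitution could look like (the placeholder 
name and helper below are hypothetical, not an existing Spark API):

{code:python}
# Hypothetical helper: Spark already knows the Scala binary version it was
# built against, so it could expand a placeholder in the Maven coordinate
# before handing it to the package resolver, much like SBT's %% appends the
# Scala binary version to an artifact name.
def resolve_coordinate(coordinate: str, scala_binary_version: str) -> str:
    return coordinate.replace("$scalaVersion", scala_binary_version)

# On a Spark build compiled for Scala 2.11:
resolve_coordinate("io.delta:delta-core_$scalaVersion:0.1.0", "2.11")
# -> 'io.delta:delta-core_2.11:0.1.0'
{code}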

Here are some concrete examples of users getting it wrong and getting confused:
https://github.com/delta-io/delta/issues/6
https://github.com/delta-io/delta/issues/63


