Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/22746#discussion_r226263066 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -0,0 +1,520 @@ +--- +layout: global +title: Spark SQL Upgrading Guide +displayTitle: Spark SQL Upgrading Guide +--- + +* Table of contents +{:toc} + +## Upgrading From Spark SQL 2.4 to 3.0 + + - In PySpark, when creating a `SparkSession` with `SparkSession.builder.getOrCreate()`, if there is an existing `SparkContext`, the builder was trying to update the `SparkConf` of the existing `SparkContext` with configurations specified to the builder, but the `SparkContext` is shared by all `SparkSession`s, so we should not update them. Since 3.0, the builder come to not update the configurations. This is the same behavior as Java/Scala API in 2.3 and above. If you want to update them, you need to update them prior to creating a `SparkSession`. + +## Upgrading From Spark SQL 2.3 to 2.4 + + - In Spark version 2.3 and earlier, the second parameter to array_contains function is implicitly promoted to the element type of first array type parameter. This type promotion can be lossy and may cause `array_contains` function to return wrong result. This problem has been addressed in 2.4 by employing a safer type promotion mechanism. This can cause some change in behavior and are illustrated in the table below. + <table class="table"> + <tr> + <th> + <b>Query</b> + </th> + <th> + <b>Result Spark 2.3 or Prior</b> + </th> + <th> + <b>Result Spark 2.4</b> + </th> + <th> + <b>Remarks</b> + </th> + </tr> + <tr> + <th> + <b>SELECT <br> array_contains(array(1), 1.34D);</b> + </th> + <th> + <b>true</b> + </th> + <th> + <b>false</b> + </th> + <th> + <b>In Spark 2.4, left and right parameters are promoted to array(double) and double type respectively.</b> + </th> + </tr> + <tr> + <th> + <b>SELECT <br> array_contains(array(1), '1');</b> + </th> + <th> + <b>true</b> + </th> + <th> + <b>AnalysisException is thrown since integer type can not be promoted to string type in a loss-less manner.</b> + </th> + <th> + <b>Users can use explict cast</b> + </th> + </tr> + <tr> + <th> + <b>SELECT <br> array_contains(array(1), 'anystring');</b> + </th> + <th> + <b>null</b> + </th> + <th> + <b>AnalysisException is thrown since integer type can not be promoted to string type in a loss-less manner.</b> + </th> + <th> + <b>Users can use explict cast</b> --- End diff -- `explict` -> `explicit`
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org