Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18853#discussion_r150227062

    --- Diff: docs/sql-programming-guide.md ---
    @@ -1460,6 +1460,13 @@ that these options will be deprecated in future release as more optimizations ar
           Configures the number of partitions to use when shuffling data for joins or aggregations.
         </td>
       </tr>
    +  <tr>
    +    <td><code>spark.sql.typeCoercion.mode</code></td>
    +    <td><code>legacy</code></td>
    +    <td>
    +      The <code>legacy</code> type coercion mode was used in spark prior to 2.3, and so it continues to be the default to avoid breaking behavior. However, it has logical inconsistencies. The <code>hive</code> mode is preferred for most new applications, though it may require additional manual casting.
    --- End diff --

    I don't agree that Hive's type coercion rules are the most reasonable. One example: when comparing a string with a long, Hive casts both sides to double, which may produce wrong results because of precision loss. I'd like to stay neutral here and just say that users can choose among different type coercion modes, like hive, mysql, etc. By default it's spark.
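The precision-loss concern raised above can be illustrated outside of Spark: an IEEE 754 double has a 53-bit significand, so distinct long values above 2^53 can become equal once both sides are cast to double. A minimal sketch in plain Python (not Spark code, just the underlying floating-point behavior):

```python
# Doubles have a 53-bit significand, so integers above 2**53
# cannot all be represented exactly as doubles.
a = 2**53 + 1   # e.g. a BIGINT column value
b = 2**53       # e.g. the value parsed from a string literal

print(a == b)                # False: the integers differ
print(float(a) == float(b))  # True: both round to the same double
```

This is why a coercion rule that compares string and long via double can silently treat distinct values as equal; comparing as decimal or string would avoid the loss.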