[spark] branch branch-3.0 updated: [SPARK-31151][SQL][DOC] Reorganize the migration guide of SQL

2020-03-14 Thread yamamuro
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new f83ef7d  [SPARK-31151][SQL][DOC] Reorganize the migration guide of SQL
f83ef7d is described below

commit f83ef7d143aafbbdd1bb322567481f68db72195a
Author: gatorsmile 
AuthorDate: Sun Mar 15 07:35:20 2020 +0900

[SPARK-31151][SQL][DOC] Reorganize the migration guide of SQL

### What changes were proposed in this pull request?
The current SQL migration guide is too long for most readers to find the
information they need. This PR groups the items in the Spark SQL migration
guide by their corresponding components.

Note: this PR does not change the contents of the migration guide.
The attached figure is a screenshot after the change.


![screencapture-127-0-0-1-4000-sql-migration-guide-html-2020-03-14-12_00_40](https://user-images.githubusercontent.com/11567269/76688626-d3010200-65eb-11ea-9ce7-265bc90ebb2c.png)

### Why are the changes needed?
The current SQL migration guide is too long for most readers to find the
information they need.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
N/A

Closes #27909 from gatorsmile/migrationGuideReorg.

Authored-by: gatorsmile 
Signed-off-by: Takeshi Yamamuro 
(cherry picked from commit 4d4c00c1b564b57d3016ce8c3bfcffaa6e58f012)
Signed-off-by: Takeshi Yamamuro 
---
 docs/sql-migration-guide.md | 287 +++-
 1 file changed, 150 insertions(+), 137 deletions(-)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 19c744c..31d5c68 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -23,92 +23,119 @@ license: |
 {:toc}
 
 ## Upgrading from Spark SQL 2.4 to 3.0
-  - Since Spark 3.0, when inserting a value into a table column with a
different data type, the type coercion is performed per the ANSI SQL standard.
Certain unreasonable type conversions, such as converting `string` to `int` and
`double` to `boolean`, are disallowed. A runtime exception is thrown if the
value is out of range for the data type of the column. In Spark version 2.4 and
earlier, type conversions during table insertion are allowed as long as they
are valid `Cast`s. When inse [...]
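For illustration, a minimal sketch of the insertion change (hypothetical table
name `t`; assumes a `spark-shell` session, which provides `spark`):

```scala
spark.sql("CREATE TABLE t (i INT) USING parquet")

// Spark 2.4 and earlier: accepted as long as string -> int is a valid Cast.
// Spark 3.0: ANSI store assignment rejects the string-to-int coercion,
// so the statement fails at analysis time.
spark.sql("INSERT INTO t VALUES ('1')")
```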
 
-  - In Spark 3.0, the deprecated methods `SQLContext.createExternalTable` and
`SparkSession.createExternalTable` have been removed in favor of their
replacement, `createTable`.
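A sketch of the replacement call, with a hypothetical table name and Parquet
path:

```scala
// Removed in 3.0 (deprecated since 2.2):
// spark.catalog.createExternalTable("events", "/data/events")

// Replacement, available since Spark 2.2:
spark.catalog.createTable("events", "/data/events")
```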
-
-  - In Spark 3.0, the deprecated `HiveContext` class has been removed. Use 
`SparkSession.builder.enableHiveSupport()` instead.
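A sketch of the replacement, building a Hive-enabled session instead of
constructing a `HiveContext` (app name and master are hypothetical):

```scala
import org.apache.spark.sql.SparkSession

// Spark 1.x/2.x: new HiveContext(sc) -- the class is gone in 3.0.
val spark = SparkSession.builder()
  .appName("hive-enabled-app")
  .master("local[*]")
  .enableHiveSupport()   // wires up the Hive metastore and Hive SerDes
  .getOrCreate()
```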
-
-  - Since Spark 3.0, the configuration `spark.sql.crossJoin.enabled` has become
an internal configuration and is `true` by default, so by default Spark no
longer raises an exception on SQL statements with an implicit cross join.
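For example, an implicit cross join that required opting in under 2.4 now runs
by default (a minimal sketch for `spark-shell`):

```scala
import spark.implicits._

Seq(1, 2).toDF("x").createOrReplaceTempView("a")
Seq(3, 4).toDF("y").createOrReplaceTempView("b")

// 2.4 default: AnalysisException unless spark.sql.crossJoin.enabled=true.
// 3.0 default: returns the 4-row Cartesian product.
spark.sql("SELECT * FROM a, b").show()
```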
-
-  - In Spark version 2.4 and earlier, SQL queries such as `FROM <table>` or
`FROM <table> UNION ALL FROM <table>` are supported by accident. In hive-style
`FROM <table> SELECT <expr>`, the `SELECT` clause is not negligible. Neither
Hive nor Presto supports this syntax. Therefore, these queries are treated as
invalid since Spark 3.0.
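The distinction, sketched against a hypothetical table `t`:

```scala
// Still valid hive-style syntax in 3.0: the SELECT clause is present.
spark.sql("FROM t SELECT 1")

// Accepted by accident in 2.4 and earlier; rejected by the parser in 3.0.
spark.sql("FROM t")
```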
+### Dataset/DataFrame APIs
 
  - Since Spark 3.0, the Dataset and DataFrame API `unionAll` is no longer
deprecated. It is an alias for `union`.
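A minimal sketch of the undeprecated alias (for `spark-shell`):

```scala
import spark.implicits._

val df1 = Seq(1, 2).toDF("n")
val df2 = Seq(2, 3).toDF("n")

// unionAll is now just an alias for union: both keep duplicates
// (apply .distinct() afterwards for set semantics).
df1.unionAll(df2).show()   // 4 rows, including the duplicate 2
```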
 
-  - In Spark version 2.4 and earlier, the parser of the JSON data source treats
empty strings as null for some data types such as `IntegerType`. For
`FloatType`, `DoubleType`, `DateType` and `TimestampType`, it fails on empty
strings and throws exceptions. Since Spark 3.0, empty strings are disallowed
and exceptions are thrown for all data types except `StringType` and
`BinaryType`. The previous behavior of allowing empty strings can be restored
by setting `spark.sql.legacy.json.allowEmptyStrin [...]
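A sketch of the parsing change (for `spark-shell`; the legacy flag is the one
named, truncated, above):

```scala
import org.apache.spark.sql.types._
import spark.implicits._

val schema = new StructType().add("a", IntegerType)
val input = Seq("""{"a": ""}""").toDS()

// 2.4: the empty string is read as null for IntegerType.
// 3.0: the record counts as malformed -- the row is nulled out in the
// default PERMISSIVE mode, and an exception is thrown in FAILFAST mode.
spark.read.schema(schema).json(input).show()
```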
-
-  - Since Spark 3.0, the `from_json` function supports two modes,
`PERMISSIVE` and `FAILFAST`, which can be set via the `mode` option. The
default mode became `PERMISSIVE`. In previous versions, the behavior of
`from_json` conformed to neither `PERMISSIVE` nor `FAILFAST`, especially when
processing malformed JSON records. For example, the JSON string `{"a" 1}` with
the schema `a INT` is converted to `null` by previous versions, but Spark 3.0
converts it to `Row(null)`.
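The example from the text, as a sketch for `spark-shell`:

```scala
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types._
import spark.implicits._

val schema = new StructType().add("a", IntegerType)
val df = Seq("""{"a" 1}""").toDF("json")   // malformed: missing colon

// 2.4: the parsed column is null.
// 3.0 (PERMISSIVE): the column is Row(null) -- a struct with a null field.
df.select(from_json($"json", schema).as("parsed")).show()
```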
-
-  - The `ADD JAR` command previously returned a result set with the single 
value 0. It now returns an empty result set.
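For reference, a sketch of the changed command result (the jar path is
hypothetical):

```scala
// 2.4: one row containing the single value 0.
// 3.0: an empty result set.
spark.sql("ADD JAR /tmp/my-udfs.jar").show()
```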
-
-  - In Spark version 2.4 and earlier, users can create map values with map-type
keys via built-in functions such as `CreateMap`, `MapFromArrays`, etc. Since
Spark 3.0, creating map values with map-type keys via these built-in functions
is no longer allowed. Users can use the `map_entries` function to convert a map
to `array<struct<key, value>>` as a workaround. In addition, users can still
read map values with map-type keys from data sources or Java/Scala collections,
though it is discouraged.
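A sketch of the restriction and the workaround (for `spark-shell`):

```scala
import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq(1).toDF("i")

// Disallowed since 3.0: a map used as a map key.
// df.select(map(map($"i", $"i"), lit("v")))

// Workaround from the text: flatten the inner map with map_entries,
// yielding an array<struct<key, value>> key instead.
df.select(map(map_entries(map($"i", $"i")), lit("v")).as("m")).show()
```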
