[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r495564516 ## File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md ## @@ -36,6 +36,14 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ LOCATION path ] Review comment: Yes, we need it, will add it soon. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r457913593 ## File path: docs/sql-ref-syntax-qry-select-groupby.md ## @@ -81,6 +81,9 @@ aggregate_name ( [ DISTINCT ] expression [ , ... ] ) [ FILTER ( WHERE boolean_ex * **aggregate_name** Specifies an aggregate function name (MIN, MAX, COUNT, SUM, AVG, etc.). +Note that `FIRST` and `LAST` have an optional `IGNORE NULLS` clause; when the option specified, +it will returns the first or last value that is not null (or null if all values are null). +**Syntax:** `[ FIRST | LAST ] ( [ distinct ] expression [ IGNORE NULLS ] ) [ FILTER ( WHERE boolean_expression ) ]` Review comment: now have have aggregate functions above and examples in the end, looks enough,let me move it This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r457310417 ## File path: docs/sql-ref-syntax-qry-select-lateral-view.md ## @@ -0,0 +1,130 @@ +--- +layout: global +title: LATERAL VIEW Clause +displayTitle: LATERAL VIEW Clause +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +The `LATERAL VIEW` clause is used in conjunction with generator functions such as explode(), which will generate a virtual table containing one or more rows. `LATERAL VIEW` will apply the rows to each original output row. + +### Syntax + +```sql +LATERAL VIEW [ OUTER ] generator_function ( expression [ , ... ] ) [ table_alias ] AS column_alias [ , ... ] +``` + +### Parameters + +* **OUTER** + +If `LATERAL VIEW` is used without `OUTER`, and `generator_function` returns empty, then no results will be output in `SELECT` clause. +If `LATERAL VIEW` is used with `OUTER`, and `generator_function` returns empty, then results will be output normally with `NULL` as `generator_function` output. Review comment: looks better ## File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md ## @@ -114,9 +162,45 @@ CREATE TABLE student (id INT, name STRING) PARTITIONED BY (age INT); --Use Row Format and file format -CREATE TABLE student (id INT,name STRING) +CREATE TABLE student (id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE; + +--Use complex datatype +CREATE EXTERNAL TABLE family( +name STRING, +friends ARRAY, +children MAP, +address STRUCT +) +ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '\\' +COLLECTION ITEMS TERMINATED BY '_' +MAP KEYS TERMINATED BY ':' +LINES TERMINATED BY '\n' +NULL DEFINED AS 'foonull' +STORED AS TEXTFILE +LOCATION '/tmp/family/'; + +--Use predefined custom serde +CREATE TABLE avroExample +ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' +STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' +OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' +TBLPROPERTIES ('avro.schema.literal'='{ "namespace": "org.apache.hive", +"name": "first_schema", +"type": "record", +"fields": [ +{ "name":"string1", "type":"string" }, +{ "name":"string2", "type":"string" } +] }'); + +--Use personalized custom serde(we may need to `ADD JAR xxx.jar` first to ensure we can find the serde_class, or you may run into `CLASSNOTFOUND` exception) +ADD JAR /usr/lib/hive_serde/lib/hive_serde_example.jar; Review comment: we may inform our user, they can use serdes provided by spark or hive, or defined by themselves This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r456729395 ## File path: docs/sql-ref-syntax-qry-select-case.md ## @@ -0,0 +1,107 @@ +--- +layout: global +title: CASE Clause +displayTitle: CASE Clause +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`CASE` clause uses a rule to return specific result based on the specified condition, similar to if/else statements in other programming languages. + +### Syntax + +```sql +CASE [ expression ] { WHEN boolean_expression THEN then_expression } [ ... ] +[ ELSE else_expression ] +END +``` + +### Parameters + +* **boolean_expression** + +Specific specified condition, it should be boolean type. Review comment: > also edit the menu-sql.yaml to update the sidebar menu? Thanks huaxingao,updated This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r456728329 ## File path: docs/sql-ref-syntax-qry-select-groupby.md ## @@ -91,6 +93,18 @@ aggregate_name ( [ DISTINCT ] expression [ , ... ] ) [ FILTER ( WHERE boolean_ex Filters the input rows for which the `boolean_expression` in the `WHERE` clause evaluates to true are passed to the aggregate function; other rows are discarded. +* **FIRST** + +`FIRST` select a first expression value from the data set, we can specific `IGNORE NULLS` to ignore NULLS, it is optional. Review comment: it seems your expression more natural. will improve it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r456727633 ## File path: docs/sql-ref-syntax-qry-select-lateral-view.md ## @@ -0,0 +1,130 @@ +--- +layout: global +title: LATERAL VIEW Clause +displayTitle: LATERAL VIEW Clause +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`LATERAL VIEW` clause is used in conjunction with generator functions such as explode(), which will generate a virtual table containing one or more rows. `LATERAL VIEW` will apply the rows to each original output row. + +### Syntax + +```sql +LATERAL VIEW [ OUTER ] { generator_function ( expression [ , ... ] ) [ table_alias ] AS column_alias [ , ... ] } [ ... ] Review comment: > I have the same question in https://github.com/apache/spark/pull/29056/files#diff-bc52347d9c1373190bce4389d3066ad5R47 `[ LATERAL VIEW clause ] [ ... ] ` means we can have one or more `LATERAL VIEW clause` together , not like `PIVOT clause`, which appears at most once in one place. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r456464691 ## File path: docs/sql-ref-syntax-qry-select-lateral-view.md ## @@ -0,0 +1,130 @@ +--- +layout: global +title: LATERAL VIEW Clause +displayTitle: LATERAL VIEW Clause +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`LATERAL VIEW` clause is used in conjunction with generator functions such as explode(), which will generate a virtual table containing one or more rows. `LATERAL VIEW` will apply the rows to each original output row. + +### Syntax + +```sql +LATERAL VIEW [ OUTER ] { generator_function ( expression [ , ... ] ) [ table_alias ] AS column_alias [ , ... ] } [ ... ] Review comment: emm, have removed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r456194748 ## File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md ## @@ -36,6 +36,14 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ LOCATION path ] [ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] [ AS select_statement ] + +row_format: + : SERDE serde_class [WITH SERDEPROPERTIES (k1=v1, k2=v2, ...) ] + | DELIMITED [ FIELDS TERMINATED BY fields_termiated_char [ ESCAPED BY escaped_char] ] + [ COLLECTION ITEMS TERMINATED BY collection_items_termiated_char ] + [ MAP KEYS TERMINATED BY map_key_termiated_char ] + [ LINES TERMINATED BY row_termiated_char ] + [ NULL DEFINED AS null_char ] Review comment: I checked other docs. seems the indent rule is more general, i will check and follow it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r456190158 ## File path: docs/sql-ref-syntax-qry-select-case.md ## @@ -0,0 +1,123 @@ +--- +layout: global +title: CASE Clause +displayTitle: CASE Clause +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`CASE` clause uses a rule to return specific result based on the specified condition, similar to if/else statements in other programming languages. + +### Syntax + +```sql +CASE [ expression ] { WHEN boolean_expression THEN then_expression } [ ... ] +[ ELSE else_expression ] +END +``` + +### Parameters + +* **WHEN** + +Specific a boolean condition ,under which to return the `THEN` result, `WHEN` must exist in `CASE` clause. + +* **THEN** + +Specific a result base the `WHEN` condition, `THEN` must exist in `CASE` clause. + +* **ELSE** + +Specific a default result for the `CASE` rules, it is optional, if user don't use else then the `CASE` will not have default result. + +* **END** Review comment: +1 for me. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r456187128 ## File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md ## @@ -114,9 +162,44 @@ CREATE TABLE student (id INT, name STRING) PARTITIONED BY (age INT); --Use Row Format and file format -CREATE TABLE student (id INT,name STRING) +CREATE TABLE student (id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE; + +--Use complex datatype +CREATE EXTERNAL TABLE family( +name STRING, +friends ARRAY, +children MAP, +address STRUCT +) +ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '\\' +COLLECTION ITEMS TERMINATED BY '_' +MAP KEYS TERMINATED BY ':' +LINES TERMINATED BY '\n' +NULL DEFINED AS 'foonull' +STORED AS TEXTFILE +LOCATION '/tmp/family/'; + +--Use native serde Review comment: let me improve the description This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455572972 ## File path: docs/sql-ref-syntax-qry-select-case.md ## @@ -0,0 +1,114 @@ +--- +layout: global +title: CASE Clause +displayTitle: CASE Clause +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`CASE` clause uses rule to return specific result based on the specified condition, similar to if/else statements in other programming languages. + +### Syntax + +```sql +CASE [ expression ] { WHEN boolean_expression THEN then_expression } [ ... ] +[ ELSE else_expression ] +END +``` + +### Parameters + +* **WHEN** + +Specific a boolean condition ,under which to return the `THEN` result, `WHEN` must exist in `CASE` clause. + +* **THEN** + +Specific a result base the `WHEN` condition, `THEN` must exist in `CASE` clause. + +* **ELSE** + +Specific a default result for the `CASE` rules, it is optional, if user don't use else then the `CASE` will not have default result. + +* **END** + +Key words to finish a case clause, `END` must exist in `CASE` clause. + +* **boolean_expression** + +Specific specified condition, it should be boolean type. + +* **then_expression** + +Specific the then expression based on the `boolean_expression` condition, `then_expression` and `else_expression` should all be same type or coercible to a common type. + +* **else_expression** + +Specific the default expression, `then_expression` and `else_expression` should all be same type or coercible to a common type. + +### Examples + +```sql +CREATE TABLE person (id INT, name STRING, age INT); +INSERT INTO person VALUES +(100, 'John', 30), +(200, 'Mary', NULL), +(300, 'Mike', 80), +(400, 'Dan', 50); + +SELECT id, CASE WHEN id > 200 THEN 'bigger' ELSE 'small' END FROM person; ++--+--+ +| id | CASE WHEN (id > 200) THEN bigger ELSE small END | ++--+--+ +| 100 | small| +| 200 | small| +| 300 | bigger | +| 400 | bigger | ++--+--+ + +SELECT id, CASE id WHEN 100 then 'bigger' WHEN id > 300 THEN '300' ELSE 'small' END FROM person; ++--+---+ +| id | CASE WHEN (id = 100) THEN bigger WHEN (id = CAST((id > 300) AS INT)) THEN 300 ELSE small END | ++--+---+ +| 100 | bigger | +| 200 | small | +| 300 | small | +| 400 | small | ++--+---+ + +SELECT * FROM person where CASE 1 = 1 WHEN 100 THEN 'big' WHEN 200 THEN 'bigger' WHEN 300 THEN 'biggest' ELSE 'small' END = 'small'; Review comment: It is case for `CASE` expression, and we have `CASE` column case above. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455594631 ## File path: docs/sql-ref-syntax-qry-select-groupby.md ## @@ -38,6 +38,8 @@ GROUP BY GROUPING SETS (grouping_set [ , ...]) While aggregate functions are defined as ```sql aggregate_name ( [ DISTINCT ] expression [ , ... ] ) [ FILTER ( WHERE boolean_expression ) ] + +[ FIRST | LAST ] ( expression [ IGNORE NULLS ] ) ] Review comment: have updated in the docs This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455576141 ## File path: docs/sql-ref-syntax-qry.md ## @@ -45,4 +45,7 @@ ability to generate logical and physical plan for a given query using * [TABLESAMPLE](sql-ref-syntax-qry-select-sampling.html) * [Table-valued Function](sql-ref-syntax-qry-select-tvf.html) * [Window Function](sql-ref-syntax-qry-select-window.html) + * [CASE Clause](sql-ref-syntax-qry-select-case.html) Review comment: I checked docs in hive and mysql, both of them singlely define `CASE` as function, and not defined together with `SELECT` clause.i have put the `CASE` example link in other `select` docs. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF https://dev.mysql.com/doc/refman/8.0/en/control-flow-functions.html This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455572972 ## File path: docs/sql-ref-syntax-qry-select-case.md ## @@ -0,0 +1,114 @@ +--- +layout: global +title: CASE Clause +displayTitle: CASE Clause +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`CASE` clause uses rule to return specific result based on the specified condition, similar to if/else statements in other programming languages. + +### Syntax + +```sql +CASE [ expression ] { WHEN boolean_expression THEN then_expression } [ ... ] +[ ELSE else_expression ] +END +``` + +### Parameters + +* **WHEN** + +Specific a boolean condition ,under which to return the `THEN` result, `WHEN` must exist in `CASE` clause. + +* **THEN** + +Specific a result base the `WHEN` condition, `THEN` must exist in `CASE` clause. + +* **ELSE** + +Specific a default result for the `CASE` rules, it is optional, if user don't use else then the `CASE` will not have default result. + +* **END** + +Key words to finish a case clause, `END` must exist in `CASE` clause. + +* **boolean_expression** + +Specific specified condition, it should be boolean type. + +* **then_expression** + +Specific the then expression based on the `boolean_expression` condition, `then_expression` and `else_expression` should all be same type or coercible to a common type. + +* **else_expression** + +Specific the default expression, `then_expression` and `else_expression` should all be same type or coercible to a common type. + +### Examples + +```sql +CREATE TABLE person (id INT, name STRING, age INT); +INSERT INTO person VALUES +(100, 'John', 30), +(200, 'Mary', NULL), +(300, 'Mike', 80), +(400, 'Dan', 50); + +SELECT id, CASE WHEN id > 200 THEN 'bigger' ELSE 'small' END FROM person; ++--+--+ +| id | CASE WHEN (id > 200) THEN bigger ELSE small END | ++--+--+ +| 100 | small| +| 200 | small| +| 300 | bigger | +| 400 | bigger | ++--+--+ + +SELECT id, CASE id WHEN 100 then 'bigger' WHEN id > 300 THEN '300' ELSE 'small' END FROM person; ++--+---+ +| id | CASE WHEN (id = 100) THEN bigger WHEN (id = CAST((id > 300) AS INT)) THEN 300 ELSE small END | ++--+---+ +| 100 | bigger | +| 200 | small | +| 300 | small | +| 400 | small | ++--+---+ + +SELECT * FROM person where CASE 1 = 1 WHEN 100 THEN 'big' WHEN 200 THEN 'bigger' WHEN 300 THEN 'biggest' ELSE 'small' END = 'small'; Review comment: It is case for `CASE` expression, and we have `CASE` column case above. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455571961 ## File path: docs/sql-ref-syntax-qry-select-case.md ## @@ -0,0 +1,114 @@ +--- +layout: global +title: CASE Clause +displayTitle: CASE Clause +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`CASE` clause uses rule to return specific result based on the specified condition, similar to if/else statements in other programming languages. + +### Syntax + +```sql +CASE [ expression ] { WHEN boolean_expression THEN then_expression } [ ... ] +[ ELSE else_expression ] +END +``` + +### Parameters + +* **WHEN** + +Specific a boolean condition ,under which to return the `THEN` result, `WHEN` must exist in `CASE` clause. + +* **THEN** + +Specific a result base the `WHEN` condition, `THEN` must exist in `CASE` clause. + +* **ELSE** + +Specific a default result for the `CASE` rules, it is optional, if user don't use else then the `CASE` will not have default result. + +* **END** + +Key words to finish a case clause, `END` must exist in `CASE` clause. + +* **boolean_expression** + +Specific specified condition, it should be boolean type. + +* **then_expression** + +Specific the then expression based on the `boolean_expression` condition, `then_expression` and `else_expression` should all be same type or coercible to a common type. + +* **else_expression** + +Specific the default expression, `then_expression` and `else_expression` should all be same type or coercible to a common type. + +### Examples + +```sql +CREATE TABLE person (id INT, name STRING, age INT); +INSERT INTO person VALUES +(100, 'John', 30), +(200, 'Mary', NULL), +(300, 'Mike', 80), +(400, 'Dan', 50); + +SELECT id, CASE WHEN id > 200 THEN 'bigger' ELSE 'small' END FROM person; ++--+--+ +| id | CASE WHEN (id > 200) THEN bigger ELSE small END | ++--+--+ +| 100 | small| +| 200 | small| +| 300 | bigger | +| 400 | bigger | ++--+--+ + +SELECT id, CASE id WHEN 100 then 'bigger' WHEN id > 300 THEN '300' ELSE 'small' END FROM person; ++--+---+ +| id | CASE WHEN (id = 100) THEN bigger WHEN (id = CAST((id > 300) AS INT)) THEN 300 ELSE small END | ++--+---+ +| 100 | bigger | +| 200 | small | +| 300 | small | +| 400 | small | ++--+---+ + +SELECT * FROM person where CASE 1 = 1 WHEN 100 THEN 'big' WHEN 200 THEN 'bigger' WHEN 300 THEN 'biggest' ELSE 'small' END = 'small'; Review comment: It is case for `CASE` expression, and we have `CASE` column case above. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455549369 ## File path: docs/sql-ref-syntax-qry.md ## @@ -45,4 +45,7 @@ ability to generate logical and physical plan for a given query using * [TABLESAMPLE](sql-ref-syntax-qry-select-sampling.html) * [Table-valued Function](sql-ref-syntax-qry-select-tvf.html) * [Window Function](sql-ref-syntax-qry-select-window.html) + * [CASE Clause](sql-ref-syntax-qry-select-case.html) Review comment: I saw mysql and hive both define CASE as function singlely, not defined in `SELECT` clause https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF https://dev.mysql.com/doc/refman/8.0/en/control-flow-functions.html This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455220561 ## File path: docs/sql-ref-syntax-qry-select-groupby.md ## @@ -38,6 +38,8 @@ GROUP BY GROUPING SETS (grouping_set [ , ...]) While aggregate functions are defined as ```sql aggregate_name ( [ DISTINCT ] expression [ , ... ] ) [ FILTER ( WHERE boolean_expression ) ] + +[ FIRST | LAST ] ( expression [ IGNORE NULLS ] ) ] Review comment: I just tried, not working ,even aggregate functions do not support `FILTER` in V2.4.5, i will test for other version tomorrow. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455178397 ## File path: docs/sql-ref-syntax-qry.md ## @@ -45,4 +45,7 @@ ability to generate logical and physical plan for a given query using * [TABLESAMPLE](sql-ref-syntax-qry-select-sampling.html) * [Table-valued Function](sql-ref-syntax-qry-select-tvf.html) * [Window Function](sql-ref-syntax-qry-select-window.html) + * [CASE Clause](sql-ref-syntax-qry-select-case.html) Review comment: `case` may appear in many key words,such as `select`,`on`,`from`,'where' ...,If we define in `select`, we may define many places, seems a little weird,WYDT? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r455176092 ## File path: docs/sql-ref-syntax-qry-select.md ## @@ -74,6 +74,12 @@ SELECT [ hints , ... ] [ ALL | DISTINCT ] { named_expression [ , ... ] } An expression with an assigned name. In general, it denotes a column expression. **Syntax:** `expression [AS] [alias]` + + * **FROM** Review comment: have added the basic defintion, i saw there was `from_item` already in it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r453643327 ## File path: docs/sql-ref-syntax-qry-select.md ## @@ -159,3 +159,6 @@ SELECT [ hints , ... ] [ ALL | DISTINCT ] { named_expression [ , ... ] } * [TABLESAMPLE](sql-ref-syntax-qry-select-sampling.html) Review comment: > Could you update the `FROM` syntax, too? > https://github.com/apache/spark/blob/b6229df16c02d9edcd53bc16ee12b199aaa0ee38/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4#L574 have updated:) ## File path: docs/sql-ref-syntax-qry-select-case.md ## @@ -0,0 +1,112 @@ +--- +layout: global +title: CASE Clause +displayTitle: CASE Clause +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`CASE` clause uses rule to return specific result based on the specified condition. + +### Syntax + +```sql +CASE [ expression ] { WHEN boolean_expression THEN then_expression } [ ... ] [ ELSE else_expression ] END +``` + +### Parameters + +* **WHEN** + +Specific a boolean condition ,under which to return the `THEN` result, `WHEN` must exist in `CASE` clause. + +* **THEN** + +Specific a result base the `WHEN` condition, `THEN` must exist in `CASE` clause. + +* **ELSE** + +Specific a default result for the `CASE` rules, it is optional, if user don't use else then the `CASE` will not have default result. + +* **END** + +Key words to finish a case clause, `END` must exist in `CASE` clause. + +* **boolean_expression** + +Specific specified condition, it should be boolean type. + +* **then_expression** + +Specific the then expression based on the `boolean_expression` condition, `then_expression` and `else_expression` should all be same type or coercible to a common type. + +* **else_expression** + +Specific the default expression, `then_expression` and `else_expression` should all be same type or coercible to a common type. + +### Examples + +```sql +CREATE TABLE person (id INT, name STRING, age INT); +INSERT INTO person VALUES +(100, 'John', 30), +(200, 'Mary', NULL), +(300, 'Mike', 80), +(400, 'Dan', 50); + +SELECT id, CASE WHEN id > 200 THEN 'bigger' ELSE 'small' END FROM person; ++--+--+--+ Review comment: moved~ ## File path: docs/sql-ref-syntax-qry-select-ignore-nulls.md ## @@ -0,0 +1,66 @@ +--- +layout: global +title: IGNORE NULLS +displayTitle: IGNORE NULLS +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`IGNORE NULLS` is to ignore null values, which is used in first and last + +### Syntax + +```sql +[ first | last ] ( expression [ IGNORE NULLS ] ) ] Review comment: Merged~ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] GuoPhilipse commented on a change in pull request #29056: [SPARK-31753][SQL][DOCS] Add missing keywords in the SQL docs
GuoPhilipse commented on a change in pull request #29056: URL: https://github.com/apache/spark/pull/29056#discussion_r453444219 ## File path: docs/sql-ref-syntax-ddl-create-table-hiveformat.md ## @@ -117,6 +145,21 @@ CREATE TABLE student (id INT, name STRING) CREATE TABLE student (id INT,name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE; + +--Use complex datatype Review comment: sure, will update later This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org