[jira] [Assigned] (SPARK-38838) Support ALTER TABLE ALTER COLUMN commands with DEFAULT values
[ https://issues.apache.org/jira/browse/SPARK-38838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-38838: -- Assignee: Daniel > Support ALTER TABLE ALTER COLUMN commands with DEFAULT values > - > > Key: SPARK-38838 > URL: https://issues.apache.org/jira/browse/SPARK-38838 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Daniel >Assignee: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39108) Show hints for try_add/try_subtract/try_multiply in error messages of int/long overflow
Gengliang Wang created SPARK-39108:
--------------------------------------
Summary: Show hints for try_add/try_subtract/try_multiply in error messages of int/long overflow
Key: SPARK-39108
URL: https://issues.apache.org/jira/browse/SPARK-39108
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.3.0, 3.4.0
Reporter: Gengliang Wang
Assignee: Gengliang Wang
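The try_* variants named in the summary above return NULL instead of raising on overflow. A minimal Python sketch of that documented semantics for 32-bit ints (the function name and structure are illustrative, not Spark's implementation):

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def try_add_int(a, b):
    # Mimics the documented behavior of Spark's try_add for INT values:
    # return the sum, or None (SQL NULL) when it overflows 32 bits,
    # where ANSI-mode `+` would raise an overflow error.
    result = a + b
    return result if INT_MIN <= result <= INT_MAX else None
```

For example, `try_add_int(INT_MAX, 1)` yields `None`, which is the case the proposed error-message hint points users toward.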
[jira] [Resolved] (SPARK-39093) Dividing interval by integral can result in codegen compilation error
[ https://issues.apache.org/jira/browse/SPARK-39093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang resolved SPARK-39093.
------------------------------------
Fix Version/s: 3.3.0
Resolution: Fixed

Issue resolved by pull request 36442
[https://github.com/apache/spark/pull/36442]

> Dividing interval by integral can result in codegen compilation error
> ---------------------------------------------------------------------
>
> Key: SPARK-39093
> URL: https://issues.apache.org/jira/browse/SPARK-39093
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.2.1, 3.3.0, 3.4.0
> Reporter: Bruce Robbins
> Assignee: Bruce Robbins
> Priority: Minor
> Fix For: 3.3.0
>
> Assume this data:
> {noformat}
> create or replace temp view v1 as
> select * FROM VALUES
> (interval '10' months, interval '10' day, 2)
> as v1(period, duration, num);
> cache table v1;
> {noformat}
> These two queries work:
> {noformat}
> spark-sql> select period/num from v1;
> 0-5
> Time taken: 0.143 seconds, Fetched 1 row(s)
> {noformat}
> {noformat}
> spark-sql> select duration/num from v1;
> 5 00:00:00.0
> Time taken: 0.094 seconds, Fetched 1 row(s)
> {noformat}
> However, these two queries get a codegen compilation error:
> {noformat}
> spark-sql> select period/(num + 3) from v1;
> 22/05/03 08:56:37 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 44: Expression "project_value_2" is not an rvalue
> ...
> 22/05/03 08:56:37 WARN UnsafeProjection: Expr codegen error and falling back to interpreter mode
> ...
> 0-2
> Time taken: 0.149 seconds, Fetched 1 row(s)
> {noformat}
> {noformat}
> spark-sql> select duration/(num + 3) from v1;
> 22/05/03 08:57:29 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 54: Expression "project_value_2" is not an rvalue
> ...
> 22/05/03 08:57:29 WARN UnsafeProjection: Expr codegen error and falling back to interpreter mode
> ...
> 2 00:00:00.0
> Time taken: 0.089 seconds, Fetched 1 row(s)
> {noformat}
> Even the first two queries will get a compilation error if you turn off whole-stage codegen:
> {noformat}
> spark-sql> set spark.sql.codegen.wholeStage=false;
> spark.sql.codegen.wholeStage	false
> Time taken: 0.055 seconds, Fetched 1 row(s)
> spark-sql> select period/num from v1;
> 22/05/03 09:16:42 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 37, Column 5: Expression "value_1" is not an rvalue
> ...
> 0-5
> Time taken: 0.175 seconds, Fetched 1 row(s)
> spark-sql> select duration/num from v1;
> 22/05/03 09:17:41 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 37, Column 5: Expression "value_1" is not an rvalue
> ...
> 5 00:00:00.0
> Time taken: 0.104 seconds, Fetched 1 row(s)
> {noformat}
> Note that in the error cases, the queries still return a result because Spark falls back on interpreting the divide expression (so I marked this as "minor").
[jira] [Assigned] (SPARK-39093) Dividing interval by integral can result in codegen compilation error
[ https://issues.apache.org/jira/browse/SPARK-39093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang reassigned SPARK-39093:
--------------------------------------
Assignee: Bruce Robbins

> Dividing interval by integral can result in codegen compilation error
> ---------------------------------------------------------------------
>
> Key: SPARK-39093
> URL: https://issues.apache.org/jira/browse/SPARK-39093
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.2.1, 3.3.0, 3.4.0
> Reporter: Bruce Robbins
> Assignee: Bruce Robbins
> Priority: Minor
[jira] [Resolved] (SPARK-38838) Support ALTER TABLE ALTER COLUMN commands with DEFAULT values
[ https://issues.apache.org/jira/browse/SPARK-38838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38838. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 36398 [https://github.com/apache/spark/pull/36398] > Support ALTER TABLE ALTER COLUMN commands with DEFAULT values > - > > Key: SPARK-38838 > URL: https://issues.apache.org/jira/browse/SPARK-38838 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-39046) Return an empty context string if TreeNode.origin is wrongly set
[ https://issues.apache.org/jira/browse/SPARK-39046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-39046. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36379 [https://github.com/apache/spark/pull/36379] > Return an empty context string if TreeNode.origin is wrongly set > > > Key: SPARK-39046 > URL: https://issues.apache.org/jira/browse/SPARK-39046 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38914) Allow user to insert specified columns into insertable view
[ https://issues.apache.org/jira/browse/SPARK-38914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang reassigned SPARK-38914:
--------------------------------------
Assignee: morvenhuang

> Allow user to insert specified columns into insertable view
> -----------------------------------------------------------
>
> Key: SPARK-38914
> URL: https://issues.apache.org/jira/browse/SPARK-38914
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.1
> Reporter: morvenhuang
> Assignee: morvenhuang
> Priority: Minor
>
> The option `spark.sql.defaultColumn.useNullsForMissingDefaultValues` allows us to insert specified columns into a table (SPARK-38795), but currently this option does not work for an insertable view.
> The INSERT INTO below results in an AnalysisException even when the useNullsForMissingDefaultValues option is true:
> {code:java}
> spark.sql("CREATE TEMPORARY VIEW v1 (c1 int, c2 string) USING org.apache.spark.sql.json.DefaultSource OPTIONS ( path 'json_dir')");
> spark.sql("INSERT INTO v1(c1) VALUES(100)");
> org.apache.spark.sql.AnalysisException: unknown requires that the data to be inserted have the same number of columns as the target table: target table has 2 column(s) but the inserted data has 1 column(s), including 0 partition column(s) having constant value(s).
> {code}
>
> I can provide a fix for this issue.
[jira] [Resolved] (SPARK-38914) Allow user to insert specified columns into insertable view
[ https://issues.apache.org/jira/browse/SPARK-38914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang resolved SPARK-38914.
------------------------------------
Fix Version/s: 3.4.0
Resolution: Fixed

Issue resolved by pull request 36212
[https://github.com/apache/spark/pull/36212]

> Allow user to insert specified columns into insertable view
> -----------------------------------------------------------
>
> Key: SPARK-38914
> URL: https://issues.apache.org/jira/browse/SPARK-38914
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.1
> Reporter: morvenhuang
> Assignee: morvenhuang
> Priority: Minor
> Fix For: 3.4.0
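The behavior SPARK-38914 extends to insertable views can be sketched as a tiny column-resolution helper: when the user lists only some columns, the unspecified ones become NULL instead of the INSERT being rejected. All names here are hypothetical stand-ins, not Spark's code:

```python
def expand_insert_row(target_cols, insert_cols, values,
                      use_nulls_for_missing=True):
    # target_cols: ordered column names of the target table/view.
    # insert_cols / values: the user-specified column list and the row
    # of values aligned with it, as in INSERT INTO v1(c1) VALUES(100).
    if len(insert_cols) != len(values):
        raise ValueError("column list and VALUES have different lengths")
    missing = [c for c in target_cols if c not in insert_cols]
    if missing and not use_nulls_for_missing:
        # The pre-fix behavior: demand a value for every target column.
        raise ValueError("inserted data must have the same number of "
                         "columns as the target table")
    row = dict(zip(insert_cols, values))
    # Unspecified columns become NULL (None), mirroring the
    # useNullsForMissingDefaultValues semantics quoted above.
    return [row.get(c) for c in target_cols]
```

With this sketch, `expand_insert_row(["c1", "c2"], ["c1"], [100])` produces `[100, None]`, matching what `INSERT INTO v1(c1) VALUES(100)` writes once the fix applies.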
[jira] [Updated] (SPARK-39046) Return an empty context string if TreeNode.origin is wrongly set
[ https://issues.apache.org/jira/browse/SPARK-39046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-39046: --- Parent: SPARK-38615 Issue Type: Sub-task (was: Improvement) > Return an empty context string if TreeNode.origin is wrongly set > > > Key: SPARK-39046 > URL: https://issues.apache.org/jira/browse/SPARK-39046 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39046) Return an empty context string if TreeNode.origin is wrongly set
Gengliang Wang created SPARK-39046: -- Summary: Return an empty context string if TreeNode.origin is wrongly set Key: SPARK-39046 URL: https://issues.apache.org/jira/browse/SPARK-39046 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39028) Use SparkDateTimeException when casting to datetime types failed
Gengliang Wang created SPARK-39028: -- Summary: Use SparkDateTimeException when casting to datetime types failed Key: SPARK-39028 URL: https://issues.apache.org/jira/browse/SPARK-39028 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38984) Allow comparison between TimestampNTZ and Timestamp/Date
[ https://issues.apache.org/jira/browse/SPARK-38984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38984. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 36300 [https://github.com/apache/spark/pull/36300] > Allow comparison between TimestampNTZ and Timestamp/Date > > > Key: SPARK-38984 > URL: https://issues.apache.org/jira/browse/SPARK-38984 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38980) Move error class tests requiring ANSI SQL mode to QueryExecutionAnsiErrorsSuite
[ https://issues.apache.org/jira/browse/SPARK-38980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38980. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 36299 [https://github.com/apache/spark/pull/36299] > Move error class tests requiring ANSI SQL mode to > QueryExecutionAnsiErrorsSuite > --- > > Key: SPARK-38980 > URL: https://issues.apache.org/jira/browse/SPARK-38980 > Project: Spark > Issue Type: Sub-task > Components: SQL, Tests >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Trivial > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38980) Move error class tests requiring ANSI SQL mode to QueryExecutionAnsiErrorsSuite
[ https://issues.apache.org/jira/browse/SPARK-38980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38980: --- Affects Version/s: 3.4.0 (was: 3.3.0) > Move error class tests requiring ANSI SQL mode to > QueryExecutionAnsiErrorsSuite > --- > > Key: SPARK-38980 > URL: https://issues.apache.org/jira/browse/SPARK-38980 > Project: Spark > Issue Type: Sub-task > Components: SQL, Tests >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Trivial > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38984) Allow comparison between TimestampNTZ and Timestamp/Date
Gengliang Wang created SPARK-38984: -- Summary: Allow comparison between TimestampNTZ and Timestamp/Date Key: SPARK-38984 URL: https://issues.apache.org/jira/browse/SPARK-38984 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38980) Move error class tests requiring ANSI SQL mode to QueryExecutionAnsiErrorsSuite
Gengliang Wang created SPARK-38980: -- Summary: Move error class tests requiring ANSI SQL mode to QueryExecutionAnsiErrorsSuite Key: SPARK-38980 URL: https://issues.apache.org/jira/browse/SPARK-38980 Project: Spark Issue Type: Sub-task Components: SQL, Tests Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38967) Turn spark.sql.ansi.strictIndexOperator into internal config
Gengliang Wang created SPARK-38967:
--------------------------------------
Summary: Turn spark.sql.ansi.strictIndexOperator into internal config
Key: SPARK-38967
URL: https://issues.apache.org/jira/browse/SPARK-38967
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.3.0
Reporter: Gengliang Wang
Assignee: Gengliang Wang

Currently, all ANSI error messages show the hint "If necessary set spark.sql.ansi.enabled to false to bypass this error." The "map key not exist" and "array index out of bound" errors are special: they show the config spark.sql.ansi.strictIndexOperator instead. This one special case can confuse users. To make it simple:
* Show the configuration spark.sql.ansi.enabled instead.
* If it is a "map key not exist" error, show the hint for using "try_element_at"; otherwise, don't show it. For arrays, the `[]` operator uses a 0-based index while `try_element_at` uses a 1-based index.
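The 0-based vs 1-based distinction called out above is easy to see in a small sketch of try_element_at's documented array semantics (a Python stand-in, not Spark code):

```python
def try_element_at(arr, index):
    # Spark's try_element_at uses SQL's 1-based indexing (negative
    # indexes count back from the end) and returns NULL (None) instead
    # of raising when the index is out of range; index 0 is invalid.
    if index == 0:
        raise ValueError("SQL array index cannot be 0")
    pos = index - 1 if index > 0 else len(arr) + index
    if 0 <= pos < len(arr):
        return arr[pos]
    return None  # where ANSI-mode arr[index] would raise instead
```

So `try_element_at([10, 20, 30], 1)` returns the first element, while the `[]` operator would need index 0 for the same element, which is why the hint is only safe to suggest for the map-key case.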
[jira] [Assigned] (SPARK-38811) Support ALTER TABLE ADD COLUMN commands with DEFAULT values
[ https://issues.apache.org/jira/browse/SPARK-38811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-38811: -- Assignee: Daniel > Support ALTER TABLE ADD COLUMN commands with DEFAULT values > --- > > Key: SPARK-38811 > URL: https://issues.apache.org/jira/browse/SPARK-38811 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Daniel >Assignee: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38811) Support ALTER TABLE ADD COLUMN commands with DEFAULT values
[ https://issues.apache.org/jira/browse/SPARK-38811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38811. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 36091 [https://github.com/apache/spark/pull/36091] > Support ALTER TABLE ADD COLUMN commands with DEFAULT values > --- > > Key: SPARK-38811 > URL: https://issues.apache.org/jira/browse/SPARK-38811 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38935) Improve the exception type and message of casting string to numbers
[ https://issues.apache.org/jira/browse/SPARK-38935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38935: --- Priority: Minor (was: Major) > Improve the exception type and message of casting string to numbers > --- > > Key: SPARK-38935 > URL: https://issues.apache.org/jira/browse/SPARK-38935 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Apache Spark >Priority: Minor > > # change the exception type from "java.lang.NumberFormatException" to > SparkNumberFormatException > 2. Show the exact target data type in the error message -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38935) Improve the exception type and message of casting string to numbers
[ https://issues.apache.org/jira/browse/SPARK-38935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38935: --- Description: # change the exception type from "java.lang.NumberFormatException" to SparkNumberFormatException 2. Show the exact target data type in the error message was: # change the exception type from "java.lang.NumberFormatException" to SparkNumberFormatException # Show the exact target data type in the error message > Improve the exception type and message of casting string to numbers > --- > > Key: SPARK-38935 > URL: https://issues.apache.org/jira/browse/SPARK-38935 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > > # change the exception type from "java.lang.NumberFormatException" to > SparkNumberFormatException > 2. Show the exact target data type in the error message -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38935) Improve the exception type and message of casting string to numbers
Gengliang Wang created SPARK-38935: -- Summary: Improve the exception type and message of casting string to numbers Key: SPARK-38935 URL: https://issues.apache.org/jira/browse/SPARK-38935 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang # change the exception type from "java.lang.NumberFormatException" to SparkNumberFormatException # Show the exact target data type in the error message -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
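The two changes SPARK-38935 describes (a Spark-specific exception type and naming the exact target type in the message) can be sketched as follows. The class name and message format here are illustrative stand-ins, not Spark's actual API:

```python
class SparkNumberFormatExceptionSketch(ValueError):
    """Illustrative stand-in for a Spark-specific cast error type,
    replacing a bare java.lang.NumberFormatException-style failure."""

def cast_string_to_int(s):
    # On failure, name the exact target data type (INT) in the error
    # message instead of only echoing the malformed input.
    try:
        return int(s.strip())
    except ValueError:
        raise SparkNumberFormatExceptionSketch(
            f"invalid input syntax for type INT: '{s}'")
```

A caller can then catch the dedicated type rather than pattern-matching on a generic exception, which is the point of the change.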
[jira] [Assigned] (SPARK-38924) Update dataTables to 1.10.25 for security issue
[ https://issues.apache.org/jira/browse/SPARK-38924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-38924: -- Assignee: Sean R. Owen > Update dataTables to 1.10.25 for security issue > --- > > Key: SPARK-38924 > URL: https://issues.apache.org/jira/browse/SPARK-38924 > Project: Spark > Issue Type: Improvement > Components: Web UI >Affects Versions: 3.3.0 >Reporter: Sean R. Owen >Assignee: Sean R. Owen >Priority: Minor > > https://nvd.nist.gov/vuln/detail/CVE-2020-28458 affects datatables up to > 1.10.21 and we're on 1.10.20. It may or may not affect Spark, but updating to > 1.10.25 at least should be easy -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38924) Update dataTables to 1.10.25 for security issue
[ https://issues.apache.org/jira/browse/SPARK-38924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38924. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36226 [https://github.com/apache/spark/pull/36226] > Update dataTables to 1.10.25 for security issue > --- > > Key: SPARK-38924 > URL: https://issues.apache.org/jira/browse/SPARK-38924 > Project: Spark > Issue Type: Improvement > Components: Web UI >Affects Versions: 3.3.0 >Reporter: Sean R. Owen >Assignee: Sean R. Owen >Priority: Minor > Fix For: 3.3.0 > > > https://nvd.nist.gov/vuln/detail/CVE-2020-28458 affects datatables up to > 1.10.21 and we're on 1.10.20. It may or may not affect Spark, but updating to > 1.10.25 at least should be easy -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38908) Provide query context in runtime error of Casting from String to Number/Date/Timestamp/Boolean
[ https://issues.apache.org/jira/browse/SPARK-38908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38908. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36206 [https://github.com/apache/spark/pull/36206] > Provide query context in runtime error of Casting from String to > Number/Date/Timestamp/Boolean > -- > > Key: SPARK-38908 > URL: https://issues.apache.org/jira/browse/SPARK-38908 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38908) Provide query context in runtime error of Casting from String to Number/Date/Timestamp/Boolean
Gengliang Wang created SPARK-38908: -- Summary: Provide query context in runtime error of Casting from String to Number/Date/Timestamp/Boolean Key: SPARK-38908 URL: https://issues.apache.org/jira/browse/SPARK-38908 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38829) New configuration for controlling timestamp inference of Parquet
[ https://issues.apache.org/jira/browse/SPARK-38829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38829. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36137 [https://github.com/apache/spark/pull/36137] > New configuration for controlling timestamp inference of Parquet > > > Key: SPARK-38829 > URL: https://issues.apache.org/jira/browse/SPARK-38829 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Ivan Sadikov >Priority: Major > Fix For: 3.3.0 > > > A new SQL conf which can fallback to the behavior that reads all the Parquet > Timestamp column as TimestampType. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38589) New SQL function: try_avg
[ https://issues.apache.org/jira/browse/SPARK-38589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38589. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35896 [https://github.com/apache/spark/pull/35896] > New SQL function: try_avg > - > > Key: SPARK-38589 > URL: https://issues.apache.org/jira/browse/SPARK-38589 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38869) Respect Table capability `ACCEPT_ANY_SCHEMA` in default column resolution
Gengliang Wang created SPARK-38869: -- Summary: Respect Table capability `ACCEPT_ANY_SCHEMA` in default column resolution Key: SPARK-38869 URL: https://issues.apache.org/jira/browse/SPARK-38869 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Gengliang Wang Assignee: Daniel If a V2 table has the capability of [ACCEPT_ANY_SCHEMA|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableCapability.java#L94], we should skip adding default column values to the insert schema. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
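A minimal sketch of the resolution rule described above, under the assumption that default-column resolution simply decides whether to pad the insert schema (function and parameter names are hypothetical):

```python
def schema_for_insert(user_columns, default_columns, capabilities):
    # If the target V2 table declares ACCEPT_ANY_SCHEMA, take the
    # user's insert schema as-is; otherwise pad it with the columns
    # that carry DEFAULT values, as default column resolution does.
    if "ACCEPT_ANY_SCHEMA" in capabilities:
        return list(user_columns)
    return list(user_columns) + [c for c in default_columns
                                 if c not in user_columns]
```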
[jira] [Commented] (SPARK-38829) New configuration for controlling timestamp inference of Parquet
[ https://issues.apache.org/jira/browse/SPARK-38829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17520900#comment-17520900 ] Gengliang Wang commented on SPARK-38829: [~ivan.sadikov] +1, disabling it in 3.3 is simpler > New configuration for controlling timestamp inference of Parquet > > > Key: SPARK-38829 > URL: https://issues.apache.org/jira/browse/SPARK-38829 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Ivan Sadikov >Priority: Major > > A new SQL conf which can fallback to the behavior that reads all the Parquet > Timestamp column as TimestampType. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38829) New configuration for controlling timestamp inference of Parquet
[ https://issues.apache.org/jira/browse/SPARK-38829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17520405#comment-17520405 ] Gengliang Wang commented on SPARK-38829: [~ivan.sadikov] I think we still need a new configuration on master branch > New configuration for controlling timestamp inference of Parquet > > > Key: SPARK-38829 > URL: https://issues.apache.org/jira/browse/SPARK-38829 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Ivan Sadikov >Priority: Major > > A new SQL conf which can fallback to the behavior that reads all the Parquet > Timestamp column as TimestampType. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38795) Support INSERT INTO user specified column lists with DEFAULT values
[ https://issues.apache.org/jira/browse/SPARK-38795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-38795: -- Assignee: Daniel > Support INSERT INTO user specified column lists with DEFAULT values > --- > > Key: SPARK-38795 > URL: https://issues.apache.org/jira/browse/SPARK-38795 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Daniel >Assignee: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38795) Support INSERT INTO user specified column lists with DEFAULT values
[ https://issues.apache.org/jira/browse/SPARK-38795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38795. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 36077 [https://github.com/apache/spark/pull/36077] > Support INSERT INTO user specified column lists with DEFAULT values > --- > > Key: SPARK-38795 > URL: https://issues.apache.org/jira/browse/SPARK-38795 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38842) Replace all the ArithmeticException with SparkArithmeticException
Gengliang Wang created SPARK-38842: -- Summary: Replace all the ArithmeticException with SparkArithmeticException Key: SPARK-38842 URL: https://issues.apache.org/jira/browse/SPARK-38842 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Similar to [https://github.com/apache/spark/pull/36022], we should replace all the ArithmeticException with SparkArithmeticException
[jira] [Resolved] (SPARK-38834) Update the version of TimestampNTZ related changes as 3.4.0
[ https://issues.apache.org/jira/browse/SPARK-38834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38834. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 36118 [https://github.com/apache/spark/pull/36118] > Update the version of TimestampNTZ related changes as 3.4.0 > --- > > Key: SPARK-38834 > URL: https://issues.apache.org/jira/browse/SPARK-38834 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35662) Support Timestamp without time zone data type
[ https://issues.apache.org/jira/browse/SPARK-35662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-35662: --- Affects Version/s: 3.4.0 (was: 3.3.0) > Support Timestamp without time zone data type > - > > Key: SPARK-35662 > URL: https://issues.apache.org/jira/browse/SPARK-35662 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 3.4.0 >Reporter: Gengliang Wang >Assignee: Apache Spark >Priority: Major > > Spark SQL today supports the TIMESTAMP data type. However the semantics > provided actually match TIMESTAMP WITH LOCAL TIMEZONE as defined by Oracle. > Timestamps embedded in a SQL query or passed through JDBC are presumed to be > in session local timezone and cast to UTC before being processed. > These are desirable semantics in many cases, such as when dealing with > calendars. > In many (more) other cases, such as when dealing with log files it is > desirable that the provided timestamps not be altered. > SQL users expect that they can model either behavior and do so by using > TIMESTAMP WITHOUT TIME ZONE for time zone insensitive data and TIMESTAMP WITH > LOCAL TIME ZONE for time zone sensitive data. > Most traditional RDBMS map TIMESTAMP to TIMESTAMP WITHOUT TIME ZONE and will > be surprised to see TIMESTAMP WITH LOCAL TIME ZONE, a feature that does not > exist in the standard. > In this new feature, we will introduce TIMESTAMP WITH LOCAL TIMEZONE to > describe the existing timestamp type and add TIMESTAMP WITHOUT TIME ZONE for > standard semantic. > Using these two types will provide clarity. > We will also allow users to set the default behavior for TIMESTAMP to either > use TIMESTAMP WITH LOCAL TIME ZONE or TIMESTAMP WITHOUT TIME ZONE. > h3. Milestone 1 – Spark Timestamp equivalency ( The new Timestamp type > TimestampWithoutTZ meets or exceeds all function of the existing SQL > Timestamp): > * Add a new DataType implementation for TimestampWithoutTZ. 
> * Support TimestampWithoutTZ in Dataset/UDF. > * TimestampWithoutTZ literals > * TimestampWithoutTZ arithmetic(e.g. TimestampWithoutTZ - > TimestampWithoutTZ, TimestampWithoutTZ - Date) > * Datetime functions/operators: dayofweek, weekofyear, year, etc > * Cast to and from TimestampWithoutTZ, cast String/Timestamp to > TimestampWithoutTZ, cast TimestampWithoutTZ to string (pretty > printing)/Timestamp, with the SQL syntax to specify the types > * Support sorting TimestampWithoutTZ. > h3. Milestone 2 – Persistence: > * Ability to create tables of type TimestampWithoutTZ > * Ability to write to common file formats such as Parquet and JSON. > * INSERT, SELECT, UPDATE, MERGE > * Discovery > h3. Milestone 3 – Client support > * JDBC support > * Hive Thrift server > h3. Milestone 4 – PySpark and Spark R integration > * Python UDF can take and return TimestampWithoutTZ > * DataFrame support -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
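The distinction the proposal above draws between TIMESTAMP WITH LOCAL TIME ZONE and TIMESTAMP WITHOUT TIME ZONE has a close analogy in stdlib Python (this is not Spark code, just an illustration): an aware datetime is a fixed instant whose display shifts with the zone, while a naive datetime is a bag of wall-clock fields that never shifts.

```python
from datetime import datetime, timezone, timedelta

# Naive datetime ~ TIMESTAMP WITHOUT TIME ZONE: wall-clock fields only,
# rendered the same regardless of the session time zone.
wall_clock = datetime(2022, 4, 8, 12, 0, 0)

# Aware datetime ~ TIMESTAMP WITH LOCAL TIME ZONE: a point on the UTC
# timeline, converted for display.
instant = datetime(2022, 4, 8, 12, 0, 0,
                   tzinfo=timezone(timedelta(hours=-7)))

assert wall_clock.isoformat() == "2022-04-08T12:00:00"
# 12:00 at UTC-7 is the instant 19:00 UTC.
assert instant.astimezone(timezone.utc).isoformat() == "2022-04-08T19:00:00+00:00"
```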
[jira] [Created] (SPARK-38834) Update the version of TimestampNTZ related changes as 3.4.0
Gengliang Wang created SPARK-38834: -- Summary: Update the version of TimestampNTZ related changes as 3.4.0 Key: SPARK-38834 URL: https://issues.apache.org/jira/browse/SPARK-38834 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Gengliang Wang Assignee: Gengliang Wang
[jira] [Resolved] (SPARK-38813) Remove TimestampNTZ type support in Spark 3.3
[ https://issues.apache.org/jira/browse/SPARK-38813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38813. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36094 [https://github.com/apache/spark/pull/36094] > Remove TimestampNTZ type support in Spark 3.3 > - > > Key: SPARK-38813 > URL: https://issues.apache.org/jira/browse/SPARK-38813 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.3.0 > > > Note that this one doesn't include the PySpark part. > See also: https://issues.apache.org/jira/browse/SPARK-38828 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38829) New configuration for controlling timestamp inference of Parquet
[ https://issues.apache.org/jira/browse/SPARK-38829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38829: --- Description: A new SQL conf which can fallback to the behavior that reads all the Parquet Timestamp column as TimestampType. > New configuration for controlling timestamp inference of Parquet > > > Key: SPARK-38829 > URL: https://issues.apache.org/jira/browse/SPARK-38829 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Ivan Sadikov >Priority: Major > > A new SQL conf which can fallback to the behavior that reads all the Parquet > Timestamp column as TimestampType. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38829) New configuration for controlling timestamp inference of Parquet
Gengliang Wang created SPARK-38829: -- Summary: New configuration for controlling timestamp inference of Parquet Key: SPARK-38829 URL: https://issues.apache.org/jira/browse/SPARK-38829 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Ivan Sadikov
[jira] [Updated] (SPARK-38813) Remove TimestampNTZ type support in Spark 3.3
[ https://issues.apache.org/jira/browse/SPARK-38813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38813: --- Description: Note that this one doesn't include the PySpark part. See also: https://issues.apache.org/jira/browse/SPARK-38828 > Remove TimestampNTZ type support in Spark 3.3 > - > > Key: SPARK-38813 > URL: https://issues.apache.org/jira/browse/SPARK-38813 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > > Note that this one doesn't include the PySpark part. > See also: https://issues.apache.org/jira/browse/SPARK-38828 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38828) Remove TimestampNTZ type Python support in Spark 3.3
Gengliang Wang created SPARK-38828: -- Summary: Remove TimestampNTZ type Python support in Spark 3.3 Key: SPARK-38828 URL: https://issues.apache.org/jira/browse/SPARK-38828 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Haejoon Lee
[jira] [Resolved] (SPARK-38818) Fix the docs of try_multiply/try_subtract/ANSI cast
[ https://issues.apache.org/jira/browse/SPARK-38818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38818. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36099 [https://github.com/apache/spark/pull/36099] > Fix the docs of try_multiply/try_subtract/ANSI cast > --- > > Key: SPARK-38818 > URL: https://issues.apache.org/jira/browse/SPARK-38818 > Project: Spark > Issue Type: Bug > Components: Documentation >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > Fix For: 3.3.0 > > > * Fix the valid combinations of ANSI CAST. > * Fix the usage of try_multiply/try_subtract, from "expr1 _FUNC_ expr2" to > "_FUNC_(expr1, expr2)"
[jira] [Created] (SPARK-38818) Fix the docs of try_multiply/try_subtract/ANSI cast
Gengliang Wang created SPARK-38818: -- Summary: Fix the docs of try_multiply/try_subtract/ANSI cast Key: SPARK-38818 URL: https://issues.apache.org/jira/browse/SPARK-38818 Project: Spark Issue Type: Bug Components: Documentation Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang * Fix the valid combinations of ANSI CAST. * Fix the usage of try_multiply/try_subtract, from "expr1 _FUNC_ expr2" to "_FUNC_(expr1, expr2)"
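So the corrected call form is `try_subtract(expr1, expr2)`, a function call rather than an infix operator. The try_* semantics themselves (return NULL instead of raising on 64-bit overflow) can be sketched in pure Python; this mirrors, but is not, Spark's implementation:

```python
# Signed 64-bit long range, matching Spark SQL's LongType.
LONG_MIN, LONG_MAX = -(2**63), 2**63 - 1

def try_subtract(a, b):
    """Spark-style try_subtract(expr1, expr2): returns None (SQL NULL)
    instead of an error when the exact result overflows a signed long."""
    if a is None or b is None:
        return None
    result = a - b  # Python ints are arbitrary precision, so this is exact
    return result if LONG_MIN <= result <= LONG_MAX else None
```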
[jira] [Created] (SPARK-38813) Remove TimestampNTZ type support in Spark 3.2
Gengliang Wang created SPARK-38813: -- Summary: Remove TimestampNTZ type support in Spark 3.2 Key: SPARK-38813 URL: https://issues.apache.org/jira/browse/SPARK-38813 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang
[jira] [Updated] (SPARK-38813) Remove TimestampNTZ type support in Spark 3.3
[ https://issues.apache.org/jira/browse/SPARK-38813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38813: --- Summary: Remove TimestampNTZ type support in Spark 3.3 (was: Remove TimestampNTZ type support in Spark 3.2) > Remove TimestampNTZ type support in Spark 3.3 > - > > Key: SPARK-38813 > URL: https://issues.apache.org/jira/browse/SPARK-38813 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38762) Provide query context in Decimal overflow errors
[ https://issues.apache.org/jira/browse/SPARK-38762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38762. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36040 [https://github.com/apache/spark/pull/36040] > Provide query context in Decimal overflow errors > > > Key: SPARK-38762 > URL: https://issues.apache.org/jira/browse/SPARK-38762 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38762) Provide query context in Decimal overflow errors
Gengliang Wang created SPARK-38762: -- Summary: Provide query context in Decimal overflow errors Key: SPARK-38762 URL: https://issues.apache.org/jira/browse/SPARK-38762 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang
[jira] [Resolved] (SPARK-38716) Provide query context in runtime error of map key not exists
[ https://issues.apache.org/jira/browse/SPARK-38716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38716. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36025 [https://github.com/apache/spark/pull/36025] > Provide query context in runtime error of map key not exists > > > Key: SPARK-38716 > URL: https://issues.apache.org/jira/browse/SPARK-38716 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38710) use SparkArithmeticException for Arithmetic overflow runtime errors
[ https://issues.apache.org/jira/browse/SPARK-38710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38710. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36022 [https://github.com/apache/spark/pull/36022] > use SparkArithmeticException for Arithmetic overflow runtime errors > --- > > Key: SPARK-38710 > URL: https://issues.apache.org/jira/browse/SPARK-38710 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > Fix For: 3.3.0 > > > use SparkArithmeticException in Arithmetic overflow runtime errors, instead > of > java.lang.ArithmeticException -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38716) Provide query context in runtime error of map key not exists
Gengliang Wang created SPARK-38716: -- Summary: Provide query context in runtime error of map key not exists Key: SPARK-38716 URL: https://issues.apache.org/jira/browse/SPARK-38716 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang
[jira] [Updated] (SPARK-38710) use SparkArithmeticException for Arithmetic overflow runtime errors
[ https://issues.apache.org/jira/browse/SPARK-38710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38710: --- Summary: use SparkArithmeticException for Arithmetic overflow runtime errors (was: use SparkArithmeticException in Arithmetic overflow runtime errors) > use SparkArithmeticException for Arithmetic overflow runtime errors > --- > > Key: SPARK-38710 > URL: https://issues.apache.org/jira/browse/SPARK-38710 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > > use SparkArithmeticException in Arithmetic overflow runtime errors, instead > of > java.lang.ArithmeticException -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38710) use SparkArithmeticException in Arithmetic overflow runtime errors
Gengliang Wang created SPARK-38710: -- Summary: use SparkArithmeticException in Arithmetic overflow runtime errors Key: SPARK-38710 URL: https://issues.apache.org/jira/browse/SPARK-38710 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang use SparkArithmeticException in Arithmetic overflow runtime errors, instead of java.lang.ArithmeticException
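The point of SPARK-38710 is that a typed, Spark-specific exception can carry a stable error class and parameters, whereas a bare java.lang.ArithmeticException only carries message text. A hedged Python sketch of that shape (class and field names illustrative, not Spark's exact API):

```python
class SparkArithmeticException(ArithmeticError):
    """Illustrative stand-in for Spark's SparkArithmeticException: still an
    arithmetic error, but carrying a machine-readable error class and
    parameters so callers can match on more than the message string."""
    def __init__(self, error_class, message_parameters):
        self.error_class = error_class
        self.message_parameters = message_parameters
        super().__init__(f"[{error_class}] " + ", ".join(message_parameters))

def checked_add(a, b, bits=64):
    """Add two ints, raising the typed error on signed overflow."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    result = a + b
    if not lo <= result <= hi:
        # Raise the typed error instead of a bare ArithmeticException.
        raise SparkArithmeticException(
            "ARITHMETIC_OVERFLOW", [f"{a} + {b} caused overflow"])
    return result
```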
[jira] [Resolved] (SPARK-38698) Provide query context in runtime error of Divide/Div/Reminder/Pmod
[ https://issues.apache.org/jira/browse/SPARK-38698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38698. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36013 [https://github.com/apache/spark/pull/36013] > Provide query context in runtime error of Divide/Div/Reminder/Pmod > -- > > Key: SPARK-38698 > URL: https://issues.apache.org/jira/browse/SPARK-38698 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38698) Provide query context in runtime error of Divide/Div/Reminder/Pmod
Gengliang Wang created SPARK-38698: -- Summary: Provide query context in runtime error of Divide/Div/Reminder/Pmod Key: SPARK-38698 URL: https://issues.apache.org/jira/browse/SPARK-38698 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang
[jira] [Resolved] (SPARK-38676) Provide query context in runtime error of Add/Subtract/Multiply
[ https://issues.apache.org/jira/browse/SPARK-38676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38676. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35992 [https://github.com/apache/spark/pull/35992] > Provide query context in runtime error of Add/Subtract/Multiply > --- > > Key: SPARK-38676 > URL: https://issues.apache.org/jira/browse/SPARK-38676 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38676) Provide query context in runtime error of Add/Subtract/Multiply
Gengliang Wang created SPARK-38676: -- Summary: Provide query context in runtime error of Add/Subtract/Multiply Key: SPARK-38676 URL: https://issues.apache.org/jira/browse/SPARK-38676 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang
[jira] [Resolved] (SPARK-38616) Keep track of SQL query text in Catalyst TreeNode
[ https://issues.apache.org/jira/browse/SPARK-38616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38616. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35936 [https://github.com/apache/spark/pull/35936] > Keep track of SQL query text in Catalyst TreeNode > - > > Key: SPARK-38616 > URL: https://issues.apache.org/jira/browse/SPARK-38616 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.3.0 > > > Spark SQL uses the class Origin for tracking the position of each TreeNode in > the SQL query text. When there is a parser error, we can show the position > info in the error message: > {code:java} > > sql("create tabe foo(i int)") > org.apache.spark.sql.catalyst.parser.ParseException: > no viable alternative at input 'create tabe'(line 1, pos 7) > == SQL == > create tabe foo(i int) > ---^^^ {code} > It contains two fields: line and startPosition. This is enough for the parser > since the SQL query text is known. > However, the SQL query text is unknown in the execution phase. One solution > is to include the query text in Origin and show it in the runtime error > message. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
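The Origin-based mechanism described above can be sketched concretely. Note the hedge: Catalyst's Origin really does hold line and startPosition; carrying the SQL text alongside them is the proposal in SPARK-38616, and the names below are a Python illustration, not Spark's Scala code:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Origin:
    """Sketch of Catalyst's Origin: line/start_position exist today,
    sql_text is what SPARK-38616 proposes to add."""
    line: Optional[int] = None
    start_position: Optional[int] = None
    sql_text: Optional[str] = None

def context_message(origin: Origin) -> str:
    """Render a parser-style caret under the failing position, as in the
    ParseException output quoted in the issue description."""
    if origin.sql_text is None or origin.line is None:
        return ""
    source_line = origin.sql_text.splitlines()[origin.line - 1]
    caret = "-" * origin.start_position + "^^^"
    return f"== SQL ==\n{source_line}\n{caret}"

print(context_message(Origin(1, 7, "create tabe foo(i int)")))
# prints:
# == SQL ==
# create tabe foo(i int)
# -------^^^
```

With sql_text populated at parse time, the same context could be attached to runtime errors (e.g. divide by zero), not just parse errors.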
[jira] [Updated] (SPARK-38336) Catalyst changes for DEFAULT column support
[ https://issues.apache.org/jira/browse/SPARK-38336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38336: --- Affects Version/s: 3.4.0 (was: 3.2.1) > Catalyst changes for DEFAULT column support > --- > > Key: SPARK-38336 > URL: https://issues.apache.org/jira/browse/SPARK-38336 > Project: Spark > Issue Type: Sub-task > Components: Optimizer >Affects Versions: 3.4.0 >Reporter: Daniel >Assignee: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38336) Catalyst changes for DEFAULT column support
[ https://issues.apache.org/jira/browse/SPARK-38336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512670#comment-17512670 ] Gengliang Wang commented on SPARK-38336: [~dtenedor] the affected version should be 3.4.0 > Catalyst changes for DEFAULT column support > --- > > Key: SPARK-38336 > URL: https://issues.apache.org/jira/browse/SPARK-38336 > Project: Spark > Issue Type: Sub-task > Components: Optimizer >Affects Versions: 3.4.0 >Reporter: Daniel >Assignee: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38336) Catalyst changes for DEFAULT column support
[ https://issues.apache.org/jira/browse/SPARK-38336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-38336: -- Assignee: Daniel > Catalyst changes for DEFAULT column support > --- > > Key: SPARK-38336 > URL: https://issues.apache.org/jira/browse/SPARK-38336 > Project: Spark > Issue Type: Sub-task > Components: Optimizer >Affects Versions: 3.2.1 >Reporter: Daniel >Assignee: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38336) Catalyst changes for DEFAULT column support
[ https://issues.apache.org/jira/browse/SPARK-38336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38336. Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 35855 [https://github.com/apache/spark/pull/35855] > Catalyst changes for DEFAULT column support > --- > > Key: SPARK-38336 > URL: https://issues.apache.org/jira/browse/SPARK-38336 > Project: Spark > Issue Type: Sub-task > Components: Optimizer >Affects Versions: 3.2.1 >Reporter: Daniel >Priority: Major > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38574) Enrich Avro data source documentation
[ https://issues.apache.org/jira/browse/SPARK-38574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-38574: -- Assignee: Tianhan Hu > Enrich Avro data source documentation > - > > Key: SPARK-38574 > URL: https://issues.apache.org/jira/browse/SPARK-38574 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.2.1 >Reporter: Tianhan Hu >Assignee: Tianhan Hu >Priority: Minor > > Enrich Avro data source documentation to emphasize the difference between > *avroSchema*, which is an option, and *jsonFormatSchema*, which is a parameter > of the function *from_avro*. > When using {*}from_avro{*}, the *avroSchema* option can be set to a compatible > and evolved schema, while *jsonFormatSchema* has to be the actual schema. > Otherwise, the behavior is undefined.
[jira] [Resolved] (SPARK-38574) Enrich Avro data source documentation
[ https://issues.apache.org/jira/browse/SPARK-38574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38574. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35880 [https://github.com/apache/spark/pull/35880] > Enrich Avro data source documentation > - > > Key: SPARK-38574 > URL: https://issues.apache.org/jira/browse/SPARK-38574 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.2.1 >Reporter: Tianhan Hu >Assignee: Tianhan Hu >Priority: Minor > Fix For: 3.3.0 > > > Enrich Avro data source documentation to emphasize the difference between > *avroSchema*, which is an option, and *jsonFormatSchema*, which is a parameter > of the function *from_avro*. > When using {*}from_avro{*}, the *avroSchema* option can be set to a compatible > and evolved schema, while *jsonFormatSchema* has to be the actual schema. > Otherwise, the behavior is undefined.
[jira] [Commented] (SPARK-38615) Provide error context for runtime ANSI failures
[ https://issues.apache.org/jira/browse/SPARK-38615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1751#comment-1751 ] Gengliang Wang commented on SPARK-38615: [~maxgekk] I am targeting this one in 3.3 as well. Since it is an error message improvement, let's try to finish as much as we can in 3.3. What do you think? > Provide error context for runtime ANSI failures > --- > > Key: SPARK-38615 > URL: https://issues.apache.org/jira/browse/SPARK-38615 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Priority: Major > > Currently, there is not enough error context for runtime ANSI failures. > In the following example, the error message only tells that there is a > "divide by zero" error, without pointing out where the exact SQL statement is. > {code:java} > > SELECT > ss1.ca_county, > ss1.d_year, > ws2.web_sales / ws1.web_sales web_q1_q2_increase, > ss2.store_sales / ss1.store_sales store_q1_q2_increase, > ws3.web_sales / ws2.web_sales web_q2_q3_increase, > ss3.store_sales / ss2.store_sales store_q2_q3_increase > FROM > ss ss1, ss ss2, ss ss3, ws ws1, ws ws2, ws ws3 > WHERE > ss1.d_qoy = 1 > AND ss1.d_year = 2000 > AND ss1.ca_county = ss2.ca_county > AND ss2.d_qoy = 2 > AND ss2.d_year = 2000 > AND ss2.ca_county = ss3.ca_county > AND ss3.d_qoy = 3 > AND ss3.d_year = 2000 > AND ss1.ca_county = ws1.ca_county > AND ws1.d_qoy = 1 > AND ws1.d_year = 2000 > AND ws1.ca_county = ws2.ca_county > AND ws2.d_qoy = 2 > AND ws2.d_year = 2000 > AND ws1.ca_county = ws3.ca_county > AND ws3.d_qoy = 3 > AND ws3.d_year = 2000 > AND CASE WHEN ws1.web_sales > 0 > THEN ws2.web_sales / ws1.web_sales > ELSE NULL END > > CASE WHEN ss1.store_sales > 0 > THEN ss2.store_sales / ss1.store_sales > ELSE NULL END > AND CASE WHEN ws2.web_sales > 0 > THEN ws3.web_sales / ws2.web_sales > ELSE NULL END > > CASE WHEN ss2.store_sales > 0 > THEN ss3.store_sales / ss2.store_sales > ELSE NULL END > ORDER BY ss1.ca_county > 
{code} > {code:java} > org.apache.spark.SparkArithmeticException: divide by zero at > org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:140) > at > org.apache.spark.sql.catalyst.expressions.DivModLike.eval(arithmetic.scala:437) > at > org.apache.spark.sql.catalyst.expressions.DivModLike.eval$(arithmetic.scala:425) > at > org.apache.spark.sql.catalyst.expressions.Divide.eval(arithmetic.scala:534) > {code} > > I suggest that we provide details in the error message, including: > * the problematic expression from the original SQL query, e.g. > "ss3.store_sales / ss2.store_sales store_q2_q3_increase" > * the line number and starting char position of the problematic expression, > in case of queries like "select a + b from t1 union select a + b from t2" -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38615) Provide error context for runtime ANSI failures
[ https://issues.apache.org/jira/browse/SPARK-38615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38615: --- Issue Type: Improvement (was: Bug) > Provide error context for runtime ANSI failures > --- > > Key: SPARK-38615 > URL: https://issues.apache.org/jira/browse/SPARK-38615 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38616) Keep track of SQL query text in Catalyst TreeNode
[ https://issues.apache.org/jira/browse/SPARK-38616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38616: --- Summary: Keep track of SQL query text in Catalyst TreeNode (was: Include SQL query text in the Origin of TreeNode) > Keep track of SQL query text in Catalyst TreeNode > - > > Key: SPARK-38616 > URL: https://issues.apache.org/jira/browse/SPARK-38616 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > > Spark SQL uses the class Origin for tracking the position of each TreeNode in > the SQL query text. When there is a parser error, we can show the position > info in the error message: > {code:java} > > sql("create tabe foo(i int)") > org.apache.spark.sql.catalyst.parser.ParseException: > no viable alternative at input 'create tabe'(line 1, pos 7) > == SQL == > create tabe foo(i int) > ---^^^ {code} > It contains two fields: line and startPosition. This is enough for the parser > since the SQL query text is known. > However, the SQL query text is unknown in the execution phase. One solution > is to include the query text in Origin and show it in the runtime error > message. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
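The `== SQL ==` / `---^^^` rendering in the parser error above, and what carrying the query text into Origin would enable at runtime, can be approximated with a short sketch. Names and fields here (`Origin`, `mark_position`, `sql_text`) mirror the issue's description but are illustrative, not Spark's internal classes.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Origin:
    # The two fields the issue mentions, plus the proposed query text.
    line: int
    start_position: int
    sql_text: Optional[str] = None  # unknown today in the execution phase

def mark_position(origin: Origin) -> str:
    """Render the offending line with a caret marker, as the parser
    error message does; fall back to bare coordinates without the text."""
    if origin.sql_text is None:
        return f"(line {origin.line}, pos {origin.start_position})"
    text_line = origin.sql_text.splitlines()[origin.line - 1]
    marker = "-" * origin.start_position + "^^^"
    return f"== SQL ==\n{text_line}\n{marker}"

print(mark_position(Origin(1, 7, "create tabe foo(i int)")))
```

The fallback branch is the status quo for runtime errors: without the query text, only line and position can be reported.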
[jira] [Created] (SPARK-38616) Include SQL query text in the Origin of TreeNode
Gengliang Wang created SPARK-38616: -- Summary: Include SQL query text in the Origin of TreeNode Key: SPARK-38616 URL: https://issues.apache.org/jira/browse/SPARK-38616 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang Spark SQL uses the class Origin for tracking the position of each TreeNode in the SQL query text. When there is a parser error, we can show the position info in the error message: {code:java} > sql("create tabe foo(i int)") org.apache.spark.sql.catalyst.parser.ParseException: no viable alternative at input 'create tabe'(line 1, pos 7) == SQL == create tabe foo(i int) ---^^^ {code} It contains two fields: line and startPosition. This is enough for the parser since the SQL query text is known. However, the SQL query text is unknown in the execution phase. One solution is to include the query text in Origin and show it in the runtime error message. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38615) Provide error context for runtime ANSI failures
[ https://issues.apache.org/jira/browse/SPARK-38615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38615: --- Epic Link: SPARK-35030 > Provide error context for runtime ANSI failures > --- > > Key: SPARK-38615 > URL: https://issues.apache.org/jira/browse/SPARK-38615 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38615) Provide error context for runtime ANSI failures
Gengliang Wang created SPARK-38615: -- Summary: Provide error context for runtime ANSI failures Key: SPARK-38615 URL: https://issues.apache.org/jira/browse/SPARK-38615 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Currently, there is not enough error context for runtime ANSI failures. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38154) Set up a new GA job to run tests with ANSI mode
[ https://issues.apache.org/jira/browse/SPARK-38154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38154. Fix Version/s: 3.3.0 Assignee: Apache Spark Resolution: Fixed > Set up a new GA job to run tests with ANSI mode > --- > > Key: SPARK-38154 > URL: https://issues.apache.org/jira/browse/SPARK-38154 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Wenchen Fan >Assignee: Apache Spark >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38548) New SQL function: try_sum
[ https://issues.apache.org/jira/browse/SPARK-38548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38548. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35848 [https://github.com/apache/spark/pull/35848] > New SQL function: try_sum > - > > Key: SPARK-38548 > URL: https://issues.apache.org/jira/browse/SPARK-38548 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
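The `try_*` family tracked in these sub-tasks (try_sum, try_avg, try_to_binary) returns NULL instead of raising when the ANSI-mode operation would fail, e.g. on long overflow. A rough pure-Python analogue of that contract, assuming 64-bit long semantics; this is a sketch of the behavior, not Spark's implementation:

```python
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1

def try_add(a, b):
    """None (NULL) on long overflow instead of an error."""
    if a is None or b is None:
        return None
    s = a + b
    return s if LONG_MIN <= s <= LONG_MAX else None

def try_sum(values):
    """Fold try_add over the non-null inputs; any overflow
    turns the whole aggregate into None."""
    total = 0
    for v in (x for x in values if x is not None):
        total = try_add(total, v)
        if total is None:
            return None
    return total

print(try_sum([1, 2, 3]))      # → 6
print(try_sum([LONG_MAX, 1]))  # → None (overflow)
```

Under plain ANSI mode the same overflow would instead raise an arithmetic error; the try_* variants give users the pre-ANSI null-producing behavior on a per-expression basis.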
[jira] [Created] (SPARK-38590) New SQL function: try_to_binary
Gengliang Wang created SPARK-38590: -- Summary: New SQL function: try_to_binary Key: SPARK-38590 URL: https://issues.apache.org/jira/browse/SPARK-38590 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38589) New SQL function: try_avg
Gengliang Wang created SPARK-38589: -- Summary: New SQL function: try_avg Key: SPARK-38589 URL: https://issues.apache.org/jira/browse/SPARK-38589 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38548) New SQL function: try_sum
Gengliang Wang created SPARK-38548: -- Summary: New SQL function: try_sum Key: SPARK-38548 URL: https://issues.apache.org/jira/browse/SPARK-38548 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38501) Fix thriftserver test failures under ANSI mode
[ https://issues.apache.org/jira/browse/SPARK-38501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38501. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35802 [https://github.com/apache/spark/pull/35802] > Fix thriftserver test failures under ANSI mode > -- > > Key: SPARK-38501 > URL: https://issues.apache.org/jira/browse/SPARK-38501 > Project: Spark > Issue Type: Sub-task > Components: SQL, Tests >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38451) Fix R tests under ANSI mode
[ https://issues.apache.org/jira/browse/SPARK-38451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38451. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35798 [https://github.com/apache/spark/pull/35798] > Fix R tests under ANSI mode > --- > > Key: SPARK-38451 > URL: https://issues.apache.org/jira/browse/SPARK-38451 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 3.3.0 > > > [https://github.com/gengliangwang/spark/runs/5461227887?check_suite_focus=true] > > {quote}1. Error (test_sparkSQL.R:2064:3): SPARK-37108: expose make_date > expression i > 2022-03-08T10:06:54.9600113Z Error in `handleErrors(returnStatus, conn)`: > org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in > stage 661.0 failed 1 times, most recent failure: Lost task 0.0 in stage 661.0 > (TID 570) (localhost executor driver): java.time.DateTimeException: Invalid > value for MonthOfYear (valid values 1 - 12): 13. If necessary set > spark.sql.ansi.enabled to false to bypass this error. > {quote} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38490) Add Github action test job for ANSI SQL mode
[ https://issues.apache.org/jira/browse/SPARK-38490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38490. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35797 [https://github.com/apache/spark/pull/35797] > Add Github action test job for ANSI SQL mode > > > Key: SPARK-38490 > URL: https://issues.apache.org/jira/browse/SPARK-38490 > Project: Spark > Issue Type: Sub-task > Components: Project Infra, SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38170) Fix //sql/hive-thriftserver:org.apache.spark.sql.hive.thriftserver.ThriftServerWithSparkContextInHttpSuite-hive-2.3__hadoop-2.7 in ANSI
[ https://issues.apache.org/jira/browse/SPARK-38170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38170. Resolution: Duplicate It is included in https://issues.apache.org/jira/browse/SPARK-38501 > Fix > //sql/hive-thriftserver:org.apache.spark.sql.hive.thriftserver.ThriftServerWithSparkContextInHttpSuite-hive-2.3__hadoop-2.7 > in ANSI > --- > > Key: SPARK-38170 > URL: https://issues.apache.org/jira/browse/SPARK-38170 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Rui Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38501) Fix thriftserver test failures under ANSI mode
Gengliang Wang created SPARK-38501: -- Summary: Fix thriftserver test failures under ANSI mode Key: SPARK-38501 URL: https://issues.apache.org/jira/browse/SPARK-38501 Project: Spark Issue Type: Sub-task Components: SQL, Tests Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38490) Add Github action test job for ANSI SQL mode
Gengliang Wang created SPARK-38490: -- Summary: Add Github action test job for ANSI SQL mode Key: SPARK-38490 URL: https://issues.apache.org/jira/browse/SPARK-38490 Project: Spark Issue Type: Sub-task Components: Project Infra, SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38407) ANSI Cast: loosen the limitation of casting non-null complex types
[ https://issues.apache.org/jira/browse/SPARK-38407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38407. Fix Version/s: 3.3.0 3.2.2 Resolution: Fixed Issue resolved by pull request 35754 [https://github.com/apache/spark/pull/35754] > ANSI Cast: loosen the limitation of casting non-null complex types > -- > > Key: SPARK-38407 > URL: https://issues.apache.org/jira/browse/SPARK-38407 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0, 3.2.2 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.3.0, 3.2.2 > > > When ANSI mode is off, `ArrayType(DoubleType, containsNull = false)` can't > cast as `ArrayType(IntegerType, containsNull = false)` since there can be > overflow thus result in null results and breaks the non-null constraint. > > When ANSI mode is on, currently Spark SQL has the same behavior. However, > this is not correct since the non-null constraint won't be break. Spark SQL > can just execute the cast and throw runtime error on overflow, just like > casting DoubleType as IntegerType. > > This applies to MapType and StructType as well. > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
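The behavioral difference described above can be illustrated outside Spark: under ANSI semantics, casting a non-null double array to a non-null int array is allowed to execute, and an error is raised only if an element actually overflows, so the non-null constraint is never broken by a silent null. A hypothetical pure-Python sketch, using 32-bit int bounds:

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def ansi_cast_double_to_int(x: float) -> int:
    """ANSI-style cast: truncate toward zero, throw on overflow
    rather than producing null (illustrative only)."""
    truncated = int(x)  # int() truncates toward zero, like SQL CAST
    if not (INT_MIN <= truncated <= INT_MAX):
        raise ArithmeticError(f"integer overflow casting {x}")
    return truncated

def cast_nonnull_double_array(xs):
    """ArrayType(DoubleType, containsNull=false) as
    ArrayType(IntegerType, containsNull=false): every element
    casts successfully or the whole cast fails loudly."""
    return [ansi_cast_double_to_int(x) for x in xs]

print(cast_nonnull_double_array([1.5, -2.9]))  # → [1, -2]
try:
    cast_nonnull_double_array([3.0e9])
except ArithmeticError as e:
    print(e)
```

With ANSI mode off, the overflowing element would become null, which is exactly what the `containsNull = false` constraint forbids; that is why the non-ANSI cast must be rejected up front while the ANSI cast can be permitted.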
[jira] [Resolved] (SPARK-38450) Fix HiveQuerySuite//PushFoldableIntoBranchesSuite/TransposeWindowSuite
[ https://issues.apache.org/jira/browse/SPARK-38450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38450. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35771 [https://github.com/apache/spark/pull/35771] > Fix HiveQuerySuite//PushFoldableIntoBranchesSuite/TransposeWindowSuite > -- > > Key: SPARK-38450 > URL: https://issues.apache.org/jira/browse/SPARK-38450 > Project: Spark > Issue Type: Sub-task > Components: SQL, Tests >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38451) Fix R tests under ANSI mode
[ https://issues.apache.org/jira/browse/SPARK-38451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38451: --- Description: [https://github.com/gengliangwang/spark/runs/5461227887?check_suite_focus=true] 1. Error (test_sparkSQL.R:2064:3): SPARK-37108: expose make_date expression i 2022-03-08T10:06:54.9600113Z Error in `handleErrors(returnStatus, conn)`: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 661.0 failed 1 times, most recent failure: Lost task 0.0 in stage 661.0 (TID 570) (localhost executor driver): java.time.DateTimeException: Invalid value for MonthOfYear (valid values 1 - 12): 13. If necessary set spark.sql.ansi.enabled to false to bypass this error. was:https://github.com/gengliangwang/spark/runs/5461227887?check_suite_focus=true > Fix R tests under ANSI mode > --- > > Key: SPARK-38451 > URL: https://issues.apache.org/jira/browse/SPARK-38451 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > > [https://github.com/gengliangwang/spark/runs/5461227887?check_suite_focus=true] > > 1. Error (test_sparkSQL.R:2064:3): SPARK-37108: expose make_date expression i > 2022-03-08T10:06:54.9600113Z Error in `handleErrors(returnStatus, conn)`: > org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in > stage 661.0 failed 1 times, most recent failure: Lost task 0.0 in stage 661.0 > (TID 570) (localhost executor driver): java.time.DateTimeException: Invalid > value for MonthOfYear (valid values 1 - 12): 13. If necessary set > spark.sql.ansi.enabled to false to bypass this error. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38451) Fix R tests under ANSI mode
[ https://issues.apache.org/jira/browse/SPARK-38451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38451: --- Description: [https://github.com/gengliangwang/spark/runs/5461227887?check_suite_focus=true] {quote}1. Error (test_sparkSQL.R:2064:3): SPARK-37108: expose make_date expression i 2022-03-08T10:06:54.9600113Z Error in `handleErrors(returnStatus, conn)`: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 661.0 failed 1 times, most recent failure: Lost task 0.0 in stage 661.0 (TID 570) (localhost executor driver): java.time.DateTimeException: Invalid value for MonthOfYear (valid values 1 - 12): 13. If necessary set spark.sql.ansi.enabled to false to bypass this error. {quote} was: [https://github.com/gengliangwang/spark/runs/5461227887?check_suite_focus=true] 1. Error (test_sparkSQL.R:2064:3): SPARK-37108: expose make_date expression i 2022-03-08T10:06:54.9600113Z Error in `handleErrors(returnStatus, conn)`: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 661.0 failed 1 times, most recent failure: Lost task 0.0 in stage 661.0 (TID 570) (localhost executor driver): java.time.DateTimeException: Invalid value for MonthOfYear (valid values 1 - 12): 13. If necessary set spark.sql.ansi.enabled to false to bypass this error. > Fix R tests under ANSI mode > --- > > Key: SPARK-38451 > URL: https://issues.apache.org/jira/browse/SPARK-38451 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > > [https://github.com/gengliangwang/spark/runs/5461227887?check_suite_focus=true] > > {quote}1. 
Error (test_sparkSQL.R:2064:3): SPARK-37108: expose make_date > expression i > 2022-03-08T10:06:54.9600113Z Error in `handleErrors(returnStatus, conn)`: > org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in > stage 661.0 failed 1 times, most recent failure: Lost task 0.0 in stage 661.0 > (TID 570) (localhost executor driver): java.time.DateTimeException: Invalid > value for MonthOfYear (valid values 1 - 12): 13. If necessary set > spark.sql.ansi.enabled to false to bypass this error. > {quote} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38451) Fix R tests under ANSI mode
[ https://issues.apache.org/jira/browse/SPARK-38451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-38451: -- Assignee: Hyukjin Kwon (was: Gengliang Wang) > Fix R tests under ANSI mode > --- > > Key: SPARK-38451 > URL: https://issues.apache.org/jira/browse/SPARK-38451 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Hyukjin Kwon >Priority: Major > > [https://github.com/gengliangwang/spark/runs/5461227887?check_suite_focus=true] > > {quote}1. Error (test_sparkSQL.R:2064:3): SPARK-37108: expose make_date > expression i > 2022-03-08T10:06:54.9600113Z Error in `handleErrors(returnStatus, conn)`: > org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in > stage 661.0 failed 1 times, most recent failure: Lost task 0.0 in stage 661.0 > (TID 570) (localhost executor driver): java.time.DateTimeException: Invalid > value for MonthOfYear (valid values 1 - 12): 13. If necessary set > spark.sql.ansi.enabled to false to bypass this error. > {quote} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38451) Fix R tests under ANSI mode
[ https://issues.apache.org/jira/browse/SPARK-38451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502982#comment-17502982 ] Gengliang Wang commented on SPARK-38451: [~hyukjin.kwon] could you take a look at this one when you have time? It's a blocker for this new GA job > Fix R tests under ANSI mode > --- > > Key: SPARK-38451 > URL: https://issues.apache.org/jira/browse/SPARK-38451 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > > https://github.com/gengliangwang/spark/runs/5461227887?check_suite_focus=true -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38451) Fix R tests under ANSI mode
Gengliang Wang created SPARK-38451: -- Summary: Fix R tests under ANSI mode Key: SPARK-38451 URL: https://issues.apache.org/jira/browse/SPARK-38451 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang https://github.com/gengliangwang/spark/runs/5461227887?check_suite_focus=true -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38450) Fix HiveQuerySuite//PushFoldableIntoBranchesSuite/TransposeWindowSuite
Gengliang Wang created SPARK-38450: -- Summary: Fix HiveQuerySuite//PushFoldableIntoBranchesSuite/TransposeWindowSuite Key: SPARK-38450 URL: https://issues.apache.org/jira/browse/SPARK-38450 Project: Spark Issue Type: Sub-task Components: SQL, Tests Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38442) Fix ConstantFoldingSuite/ColumnExpressionSuite/DataFrameSuite/AdaptiveQueryExecSuite under ANSI mode
[ https://issues.apache.org/jira/browse/SPARK-38442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38442. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35761 [https://github.com/apache/spark/pull/35761] > Fix > ConstantFoldingSuite/ColumnExpressionSuite/DataFrameSuite/AdaptiveQueryExecSuite > under ANSI mode > > > Key: SPARK-38442 > URL: https://issues.apache.org/jira/browse/SPARK-38442 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38442) Fix ConstantFoldingSuite/ColumnExpressionSuite/DataFrameSuite/AdaptiveQueryExecSuite under ANSI mode
Gengliang Wang created SPARK-38442: -- Summary: Fix ConstantFoldingSuite/ColumnExpressionSuite/DataFrameSuite/AdaptiveQueryExecSuite under ANSI mode Key: SPARK-38442 URL: https://issues.apache.org/jira/browse/SPARK-38442 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38434) Correct semantic of CheckAnalysis.getDataTypesAreCompatibleFn method
[ https://issues.apache.org/jira/browse/SPARK-38434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang reassigned SPARK-38434: -- Assignee: huangtengfei > Correct semantic of CheckAnalysis.getDataTypesAreCompatibleFn method > > > Key: SPARK-38434 > URL: https://issues.apache.org/jira/browse/SPARK-38434 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.1 >Reporter: huangtengfei >Assignee: huangtengfei >Priority: Minor > > Currently, in `CheckAnalysis` method [getDataTypesAreCompatibleFn > |https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala#L606] > implemented as: > {code:java} > private def getDataTypesAreCompatibleFn(plan: LogicalPlan): (DataType, > DataType) => Boolean = { > val isUnion = plan.isInstanceOf[Union] > if (isUnion) { > (dt1: DataType, dt2: DataType) => > !DataType.equalsStructurally(dt1, dt2, true) > } else { > // SPARK-18058: we shall not care about the nullability of columns > (dt1: DataType, dt2: DataType) => > TypeCoercion.findWiderTypeForTwo(dt1.asNullable, > dt2.asNullable).isEmpty > } > } > {code} > Return false when data types are compatible, otherwise return true, which is > pretty confusing. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38434) Correct semantic of CheckAnalysis.getDataTypesAreCompatibleFn method
[ https://issues.apache.org/jira/browse/SPARK-38434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang resolved SPARK-38434.
------------------------------------
    Fix Version/s: 3.3.0
       Resolution: Fixed

Issue resolved by pull request 35752
[https://github.com/apache/spark/pull/35752]

> Correct semantic of CheckAnalysis.getDataTypesAreCompatibleFn method
> --------------------------------------------------------------------
>
>                 Key: SPARK-38434
>                 URL: https://issues.apache.org/jira/browse/SPARK-38434
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.1
>            Reporter: huangtengfei
>            Assignee: huangtengfei
>            Priority: Minor
>             Fix For: 3.3.0
>
> Currently, the `CheckAnalysis` method [getDataTypesAreCompatibleFn|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala#L606] is implemented as:
> {code:java}
> private def getDataTypesAreCompatibleFn(plan: LogicalPlan): (DataType, DataType) => Boolean = {
>   val isUnion = plan.isInstanceOf[Union]
>   if (isUnion) {
>     (dt1: DataType, dt2: DataType) =>
>       !DataType.equalsStructurally(dt1, dt2, true)
>   } else {
>     // SPARK-18058: we shall not care about the nullability of columns
>     (dt1: DataType, dt2: DataType) =>
>       TypeCoercion.findWiderTypeForTwo(dt1.asNullable, dt2.asNullable).isEmpty
>   }
> }
> {code}
> It returns false when the data types are compatible and true when they are not, which is confusing given its name.
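[Editor's note] The confusion in SPARK-38434 is a name/polarity mismatch: a function named for *compatibility* answers true for *incompatible* types. A minimal Java sketch of the pitfall and the fix; `TypeKind` and `widerTypeForTwo` are hypothetical stand-ins for Spark's `DataType` and `TypeCoercion.findWiderTypeForTwo`, not the real API:

```java
import java.util.Optional;
import java.util.function.BiPredicate;

public class CompatiblePredicate {
    // Hypothetical stand-in for Spark's DataType hierarchy.
    enum TypeKind { INT, LONG, STRING }

    // Hypothetical stand-in for TypeCoercion.findWiderTypeForTwo: a wider
    // type exists for identical kinds or for the two numeric kinds.
    static Optional<TypeKind> widerTypeForTwo(TypeKind a, TypeKind b) {
        if (a == b) {
            return Optional.of(a);
        }
        if (a == TypeKind.STRING || b == TypeKind.STRING) {
            return Optional.empty();
        }
        return Optional.of(TypeKind.LONG); // INT combined with LONG widens to LONG
    }

    // Before the fix: the name suggests "compatible" but the body answers
    // "incompatible" -- every call site must remember to negate.
    static final BiPredicate<TypeKind, TypeKind> dataTypesAreIncompatible =
        (a, b) -> widerTypeForTwo(a, b).isEmpty();

    // After the fix: polarity matches the name, so call sites read naturally.
    static final BiPredicate<TypeKind, TypeKind> dataTypesAreCompatible =
        (a, b) -> widerTypeForTwo(a, b).isPresent();

    public static void main(String[] args) {
        System.out.println(dataTypesAreCompatible.test(TypeKind.INT, TypeKind.LONG));   // true
        System.out.println(dataTypesAreCompatible.test(TypeKind.INT, TypeKind.STRING)); // false
    }
}
```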
[jira] [Assigned] (SPARK-38335) Parser changes for DEFAULT column support
[ https://issues.apache.org/jira/browse/SPARK-38335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang reassigned SPARK-38335:
--------------------------------------
    Assignee: Daniel

> Parser changes for DEFAULT column support
> -----------------------------------------
>
>                 Key: SPARK-38335
>                 URL: https://issues.apache.org/jira/browse/SPARK-38335
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.2.1
>            Reporter: Daniel
>            Assignee: Daniel
>            Priority: Major
>             Fix For: 3.3.0
[jira] [Resolved] (SPARK-38335) Parser changes for DEFAULT column support
[ https://issues.apache.org/jira/browse/SPARK-38335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang resolved SPARK-38335.
------------------------------------
    Fix Version/s: 3.3.0
       Resolution: Fixed

Issue resolved by pull request 35690
[https://github.com/apache/spark/pull/35690]

> Parser changes for DEFAULT column support
> -----------------------------------------
>
>                 Key: SPARK-38335
>                 URL: https://issues.apache.org/jira/browse/SPARK-38335
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.2.1
>            Reporter: Daniel
>            Priority: Major
>             Fix For: 3.3.0
[jira] [Created] (SPARK-38407) ANSI Cast: loosen the limitation of casting non-null complex types
Gengliang Wang created SPARK-38407:
--------------------------------------

             Summary: ANSI Cast: loosen the limitation of casting non-null complex types
                 Key: SPARK-38407
                 URL: https://issues.apache.org/jira/browse/SPARK-38407
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.3.0, 3.2.2
            Reporter: Gengliang Wang
            Assignee: Gengliang Wang

When ANSI mode is off, `ArrayType(DoubleType, containsNull = false)` can't be cast as `ArrayType(IntegerType, containsNull = false)`, since overflow could produce null results and break the non-null constraint. When ANSI mode is on, Spark SQL currently has the same behavior. However, this is unnecessarily restrictive: the non-null constraint won't be broken, because Spark SQL can simply execute the cast and throw a runtime error on overflow, just as it does when casting DoubleType as IntegerType. This applies to MapType and StructType as well.
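[Editor's note] The behavior proposed in SPARK-38407 (execute the element cast and fail fast on overflow instead of inserting nulls) can be sketched in plain Java. This is an illustration of the checked-narrowing idea under ANSI semantics, not Spark's actual `Cast` implementation:

```java
import java.util.Arrays;

public class CheckedNarrowing {
    // Checked double -> int narrowing: throws on overflow instead of
    // silently producing null or a wrong value, mirroring ANSI-mode casts.
    static int toIntExact(double d) {
        if (d > Integer.MAX_VALUE || d < Integer.MIN_VALUE || Double.isNaN(d)) {
            throw new ArithmeticException("integer overflow casting " + d);
        }
        return (int) d;
    }

    // Casting a non-null double[] yields a non-null int[]: the non-null
    // constraint is preserved because failures throw rather than yield null.
    static int[] castArray(double[] xs) {
        int[] out = new int[xs.length];
        for (int i = 0; i < xs.length; i++) {
            out[i] = toIntExact(xs[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(castArray(new double[] {1.0, 2.9}))); // [1, 2]
        try {
            castArray(new double[] {1e20}); // overflows Integer.MAX_VALUE
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```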
[jira] [Resolved] (SPARK-38363) Avoid runtime error in Dataset.summary() when ANSI mode is on
[ https://issues.apache.org/jira/browse/SPARK-38363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang resolved SPARK-38363.
------------------------------------
    Fix Version/s: 3.3.0
                   3.2.2
       Resolution: Fixed

Issue resolved by pull request 35699
[https://github.com/apache/spark/pull/35699]

> Avoid runtime error in Dataset.summary() when ANSI mode is on
> -------------------------------------------------------------
>
>                 Key: SPARK-38363
>                 URL: https://issues.apache.org/jira/browse/SPARK-38363
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>             Fix For: 3.3.0, 3.2.2
>
> When executing df.summary(), Spark SQL converts String columns to Double for the percentiles/mean/stddev metrics. This can cause runtime errors when ANSI mode is on. Since this API is for getting a quick summary of the DataFrame, I suggest using "TryCast" for the problematic stats so that the API still works under ANSI mode.
[jira] [Created] (SPARK-38363) Avoid runtime error in Dataset.summary() when ANSI mode is on
Gengliang Wang created SPARK-38363:
--------------------------------------

             Summary: Avoid runtime error in Dataset.summary() when ANSI mode is on
                 Key: SPARK-38363
                 URL: https://issues.apache.org/jira/browse/SPARK-38363
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: Gengliang Wang
            Assignee: Gengliang Wang

When executing df.summary(), Spark SQL converts String columns to Double for the percentiles/mean/stddev metrics. This can cause runtime errors when ANSI mode is on. Since this API is for getting a quick summary of the DataFrame, I suggest using "TryCast" for the problematic stats so that the API still works under ANSI mode.
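[Editor's note] The "TryCast" idea in SPARK-38363 (return "no value" instead of raising when a string is not numeric, so summary statistics skip the cell rather than abort the query) can be sketched with hypothetical helpers in plain Java; this is not Spark's `TryCast` expression itself:

```java
import java.util.Optional;

public class TryCastSketch {
    // ANSI-style cast: invalid input is a hard error.
    static double ansiCastToDouble(String s) {
        return Double.parseDouble(s); // throws NumberFormatException on "abc"
    }

    // Try-cast: absorb the failure and report "no value", so callers like
    // a summary() implementation can skip non-numeric cells gracefully.
    static Optional<Double> tryCastToDouble(String s) {
        try {
            return Optional.of(Double.parseDouble(s));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCastToDouble("3.5")); // Optional[3.5]
        System.out.println(tryCastToDouble("abc")); // Optional.empty
    }
}
```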
[jira] [Resolved] (SPARK-38352) Fix DataFrameAggregateSuite/DataFrameSetOperationsSuite/DataFrameWindowFunctionsSuite under ANSI mode
[ https://issues.apache.org/jira/browse/SPARK-38352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang resolved SPARK-38352.
------------------------------------
    Fix Version/s: 3.3.0
       Resolution: Fixed

Issue resolved by pull request 35682
[https://github.com/apache/spark/pull/35682]

> Fix DataFrameAggregateSuite/DataFrameSetOperationsSuite/DataFrameWindowFunctionsSuite under ANSI mode
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-38352
>                 URL: https://issues.apache.org/jira/browse/SPARK-38352
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL, Tests
>    Affects Versions: 3.3.0
>            Reporter: Gengliang Wang
>            Assignee: Gengliang Wang
>            Priority: Major
>             Fix For: 3.3.0
[jira] [Resolved] (SPARK-38347) Nullability propagation in transformUpWithNewOutput
[ https://issues.apache.org/jira/browse/SPARK-38347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang resolved SPARK-38347.
------------------------------------
    Fix Version/s: 3.3.0
                   3.2.2
       Resolution: Fixed

Issue resolved by pull request 35685
[https://github.com/apache/spark/pull/35685]

> Nullability propagation in transformUpWithNewOutput
> ---------------------------------------------------
>
>                 Key: SPARK-38347
>                 URL: https://issues.apache.org/jira/browse/SPARK-38347
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Yingyi Bu
>            Assignee: Yingyi Bu
>            Priority: Major
>             Fix For: 3.3.0, 3.2.2
>
> The nullability of a replaced attribute should be `a.nullable` instead of `b.nullable`:
> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala#L357
> The scenario is a left outer join where the RHS attributes are replaced bottom-up.
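[Editor's note] The point of SPARK-38347 is that when a reference to attribute `a` is rewritten to a new attribute `b`, the rewritten reference must keep `a`'s nullability: an enclosing LEFT OUTER JOIN may have widened `a` to nullable even though `b` is declared non-nullable. A toy Java sketch with a hypothetical `Attribute` class, not Catalyst's:

```java
public class NullabilityPropagation {
    // Toy stand-in for Catalyst's Attribute: just a name plus nullability.
    static final class Attribute {
        final String name;
        final boolean nullable;
        Attribute(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }
    }

    // Rewriting a reference from a to b keeps a's nullability (the fix):
    // using b.nullable here would wrongly mark the output non-nullable.
    static Attribute replaceReference(Attribute a, Attribute b) {
        return new Attribute(b.name, a.nullable);
    }

    public static void main(String[] args) {
        Attribute a = new Attribute("col", true);      // nullable after the outer join
        Attribute b = new Attribute("col_new", false); // replacement is non-nullable
        System.out.println(replaceReference(a, b).nullable); // true
    }
}
```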