[jira] [Comment Edited] (SPARK-35161) Safe version SQL functions
[ https://issues.apache.org/jira/browse/SPARK-35161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17326256#comment-17326256 ] Gengliang Wang edited comment on SPARK-35161 at 4/21/21, 5:51 AM: -- cc [~beliefer] [~angerszhuuu] Are you interested in these new features? was (Author: gengliang.wang): cc [~beliefer][~angerszhuuu] Are you interested in these new features? > Safe version SQL functions > -- > > Key: SPARK-35161 > URL: https://issues.apache.org/jira/browse/SPARK-35161 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gengliang Wang >Priority: Major > > Create new safe version SQL functions for existing SQL functions/operators, > which return NULL if an overflow or error occurs, so that: > 1. Users can finish queries without interruption in ANSI mode. > 2. Users can get NULLs instead of unreasonable results if overflow occurs > when ANSI mode is off. > For example, the behavior of the following SQL operations is unreasonable: > {code:java} > 2147483647 + 2 => -2147483647 > CAST(2147483648L AS INT) => -2147483648 > {code} > With the new safe version SQL functions: > {code:java} > TRY_ADD(2147483647, 2) => null > TRY_CAST(2147483648L AS INT) => null > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
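The intended TRY_* behavior is easy to sanity-check from a Spark shell. Below is a minimal sketch, assuming the functions land with the names and NULL-on-error semantics proposed in this umbrella (none of them exist in releases before 3.2.0, so this only runs against a build containing the sub-tasks):

{code:scala}
// With ANSI mode off, the plain operator silently wraps around:
spark.sql("SELECT 2147483647 + 2").show()                // -2147483647
// The safe variants return NULL instead of wrapping or throwing:
spark.sql("SELECT TRY_ADD(2147483647, 2)").show()        // NULL
spark.sql("SELECT TRY_CAST(2147483648L AS INT)").show()  // NULL, not -2147483648
{code}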
[jira] [Commented] (SPARK-35161) Safe version SQL functions
[ https://issues.apache.org/jira/browse/SPARK-35161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17326272#comment-17326272 ] angerszhu commented on SPARK-35161: --- Got it. > Safe version SQL functions > -- > > Key: SPARK-35161 > URL: https://issues.apache.org/jira/browse/SPARK-35161 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gengliang Wang >Priority: Major > > Create new safe version SQL functions for existing SQL functions/operators, > which return NULL if an overflow or error occurs, so that: > 1. Users can finish queries without interruption in ANSI mode. > 2. Users can get NULLs instead of unreasonable results if overflow occurs > when ANSI mode is off. > For example, the behavior of the following SQL operations is unreasonable: > {code:java} > 2147483647 + 2 => -2147483647 > CAST(2147483648L AS INT) => -2147483648 > {code} > With the new safe version SQL functions: > {code:java} > TRY_ADD(2147483647, 2) => null > TRY_CAST(2147483648L AS INT) => null > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35161) Safe version SQL functions
[ https://issues.apache.org/jira/browse/SPARK-35161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-35161: --- Description: Create new safe version SQL functions for existing SQL functions/operators, which return NULL if an overflow or error occurs, so that: 1. Users can finish queries without interruption in ANSI mode. 2. Users can get NULLs instead of unreasonable results if overflow occurs when ANSI mode is off. For example, the behavior of the following SQL operations is unreasonable: {code:java} 2147483647 + 2 => -2147483647 CAST(2147483648L AS INT) => -2147483648 {code} With the new safe version SQL functions: {code:java} TRY_ADD(2147483647, 2) => null TRY_CAST(2147483648L AS INT) => null {code} was: Create new safe version SQL functions for existing SQL functions/operators, which return NULL if an overflow or error occurs, so that: 1. Users can finish queries without interruption in ANSI mode. 2. Even when ANSI mode is off, the results can be more reasonable. For example, the results of the following operations are unreasonable: {code:java} 2147483647 + 2 => -2147483647 CAST(2147483648L AS INT) => -2147483648 {code} Having the safe version SQL functions provides an alternative solution for handling such cases {code:java} TRY_ADD(2147483647, 2) => null TRY_CAST(2147483648L AS INT) => null {code} > Safe version SQL functions > -- > > Key: SPARK-35161 > URL: https://issues.apache.org/jira/browse/SPARK-35161 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gengliang Wang >Priority: Major > > Create new safe version SQL functions for existing SQL functions/operators, > which return NULL if an overflow or error occurs, so that: > 1. Users can finish queries without interruption in ANSI mode. > 2. Users can get NULLs instead of unreasonable results if overflow occurs > when ANSI mode is off. > For example, the behavior of the following SQL operations is unreasonable: > {code:java} > 2147483647 + 2 => -2147483647 > CAST(2147483648L AS INT) => -2147483648 > {code} > With the new safe version SQL functions: > {code:java} > TRY_ADD(2147483647, 2) => null > TRY_CAST(2147483648L AS INT) => null > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-33966) Two-tier encryption key management
[ https://issues.apache.org/jira/browse/SPARK-33966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gidon Gershinsky updated SPARK-33966: - Target Version/s: (was: 3.2.0) > Two-tier encryption key management > -- > > Key: SPARK-33966 > URL: https://issues.apache.org/jira/browse/SPARK-33966 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gidon Gershinsky >Priority: Major > > Columnar data formats (Parquet and ORC) have recently added a column > encryption capability. The data protection follows the practice of envelope > encryption, where the Data Encryption Key (DEK) is freshly generated for each > file/column, and is encrypted with a master key (or an intermediate key that > is in turn encrypted with a master key). The master keys are kept in a > centralized Key Management Service (KMS) - meaning that each Spark worker > needs to interact with a (typically slow) KMS server. > This Jira (and its sub-tasks) introduces an alternative approach that on one > hand preserves the best practice of generating fresh encryption keys for each > data file/column, and on the other hand allows Spark clusters to have a > scalable interaction with a KMS server, by delegating it to the application > driver. This is done via two-tier management of the keys, where a random Key > Encryption Key (KEK) is generated by the driver, encrypted by the master key > in the KMS, and distributed by the driver to the workers, so they can use it > to encrypt the DEKs generated there by the Parquet or ORC libraries. In the > workers, the KEKs are distributed to the executors/threads in the write path. > In the read path, the encrypted KEKs are fetched by workers from file > metadata, decrypted via interaction with the driver, and shared among the > executors/threads. > The KEK layer further improves scalability of the key management, because > neither the driver nor the workers need to interact with the KMS for each file/column. > Stand-alone Parquet/ORC libraries (without Spark) and/or other frameworks > (e.g., Presto, pandas) must be able to read/decrypt the files > written/encrypted by this Spark-driven key management mechanism - and > vice versa. [of course, only if both sides have proper authorisation for > using the master keys in the KMS] > A link to a discussion/design doc is attached. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
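The two-tier scheme described here is envelope encryption with one extra wrapping layer, and the driver/worker split can be illustrated with plain JCE primitives. A minimal sketch, assuming AES-GCM for key wrapping; the KMS call, the driver-to-worker distribution, and the Parquet/ORC integration are elided, and none of these names come from the actual Spark or Parquet APIs:

{code:scala}
import java.security.SecureRandom
import javax.crypto.{Cipher, KeyGenerator, SecretKey}
import javax.crypto.spec.GCMParameterSpec

val keyGen = KeyGenerator.getInstance("AES")
keyGen.init(256)

// Driver side: generate a random KEK, send it to the KMS once to be encrypted
// with the master key, and distribute the plaintext KEK to the workers.
val kek: SecretKey = keyGen.generateKey()

// Worker side: Parquet/ORC generates a fresh DEK per file/column; wrap it
// locally with the KEK instead of calling the KMS for every file.
val dek: SecretKey = keyGen.generateKey()
val iv = new Array[Byte](12)
new SecureRandom().nextBytes(iv)
val cipher = Cipher.getInstance("AES/GCM/NoPadding")
cipher.init(Cipher.ENCRYPT_MODE, kek, new GCMParameterSpec(128, iv))
val wrappedDek: Array[Byte] = cipher.doFinal(dek.getEncoded) // goes into file metadata
{code}

The read path reverses the steps: a worker pulls the encrypted KEK from file metadata, asks the driver (which holds the KMS connection) to unwrap it once, and then decrypts every DEK locally with no further KMS traffic.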
[jira] [Updated] (SPARK-35161) Safe version SQL functions
[ https://issues.apache.org/jira/browse/SPARK-35161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-35161: --- Description: Create new safe version SQL functions for existing SQL functions/operators, which return NULL if an overflow or error occurs, so that: 1. Users can finish queries without interruption in ANSI mode. 2. Even when ANSI mode is off, the results can be more reasonable. For example, the results of the following operations are unreasonable: {code:java} 2147483647 + 2 => -2147483647 CAST(2147483648L AS INT) => -2147483648 {code} Having the safe version SQL functions provides an alternative solution for handling such cases {code:java} TRY_ADD(2147483647, 2) => null TRY_CAST(2147483648L AS INT) => null {code} was: Create new safe version SQL functions for existing SQL functions/operators, which return NULL if an overflow or error occurs, so that: 1. Users can finish queries without interruption. 2. The results can be more reasonable. For example, the results of the following operations are unreasonable: {code:java} 2147483647 + 2 => -2147483647 CAST(2147483648L AS INT) => -2147483648 {code} Having the safe version SQL functions provides an alternative solution for handling such cases {code:java} TRY_ADD(2147483647, 2) => null TRY_CAST(2147483648L AS INT) => null {code} > Safe version SQL functions > -- > > Key: SPARK-35161 > URL: https://issues.apache.org/jira/browse/SPARK-35161 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gengliang Wang >Priority: Major > > Create new safe version SQL functions for existing SQL functions/operators, > which return NULL if an overflow or error occurs, so that: > 1. Users can finish queries without interruption in ANSI mode. > 2. Even when ANSI mode is off, the results can be more reasonable. For > example, the results of the following operations are unreasonable: > {code:java} > 2147483647 + 2 => -2147483647 > CAST(2147483648L AS INT) => -2147483648 > {code} > Having the safe version SQL functions provides an alternative solution for > handling such cases > {code:java} > TRY_ADD(2147483647, 2) => null > TRY_CAST(2147483648L AS INT) => null > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35161) Safe version SQL functions
[ https://issues.apache.org/jira/browse/SPARK-35161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17326256#comment-17326256 ] Gengliang Wang commented on SPARK-35161: cc [~beliefer][~angerszhuuu] Are you interested in these new features? > Safe version SQL functions > -- > > Key: SPARK-35161 > URL: https://issues.apache.org/jira/browse/SPARK-35161 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gengliang Wang >Priority: Major > > Create new safe version SQL functions for existing SQL functions/operators, > which return NULL if an overflow or error occurs, so that: > 1. Users can finish queries without interruption. > 2. The results can be more reasonable. For example, the results of the > following operations are unreasonable: > {code:java} > 2147483647 + 2 => -2147483647 > CAST(2147483648L AS INT) => -2147483648 > {code} > Having the safe version SQL functions provides an alternative solution for > handling such cases > {code:java} > TRY_ADD(2147483647, 2) => null > TRY_CAST(2147483648L AS INT) => null > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35167) New SQL function: TRY_NEGATIVE
Gengliang Wang created SPARK-35167: -- Summary: New SQL function: TRY_NEGATIVE Key: SPARK-35167 URL: https://issues.apache.org/jira/browse/SPARK-35167 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Gengliang Wang -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35166) New SQL function: TRY_DIV
Gengliang Wang created SPARK-35166: -- Summary: New SQL function: TRY_DIV Key: SPARK-35166 URL: https://issues.apache.org/jira/browse/SPARK-35166 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Gengliang Wang This is for integral division. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35164) New SQL function: TRY_MULTIPLY
Gengliang Wang created SPARK-35164: -- Summary: New SQL function: TRY_MULTIPLY Key: SPARK-35164 URL: https://issues.apache.org/jira/browse/SPARK-35164 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Gengliang Wang -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35165) New SQL function: TRY_DIVIDE
Gengliang Wang created SPARK-35165: -- Summary: New SQL function: TRY_DIVIDE Key: SPARK-35165 URL: https://issues.apache.org/jira/browse/SPARK-35165 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Gengliang Wang -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35163) New SQL function: TRY_SUBTRACT
Gengliang Wang created SPARK-35163: -- Summary: New SQL function: TRY_SUBTRACT Key: SPARK-35163 URL: https://issues.apache.org/jira/browse/SPARK-35163 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Gengliang Wang -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35162) New SQL function: TRY_ADD
Gengliang Wang created SPARK-35162: -- Summary: New SQL function: TRY_ADD Key: SPARK-35162 URL: https://issues.apache.org/jira/browse/SPARK-35162 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Gengliang Wang -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34881) New SQL Function: TRY_CAST
[ https://issues.apache.org/jira/browse/SPARK-34881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-34881: --- Parent: SPARK-35161 Issue Type: Sub-task (was: New Feature) > New SQL Function: TRY_CAST > -- > > Key: SPARK-34881 > URL: https://issues.apache.org/jira/browse/SPARK-34881 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > Fix For: 3.2.0 > > > Add a new SQL function try_cast. try_cast is identical to CAST with > `spark.sql.ansi.enabled` as true, except it returns NULL instead of raising > an error. This expression has one major difference from `cast` with > `spark.sql.ansi.enabled` as true: when the source value can't be stored in > the target integral (Byte/Short/Int/Long) type, `try_cast` returns null > instead of returning the low-order bytes of the source value. > This follows Google BigQuery and Snowflake: > https://docs.snowflake.com/en/sql-reference/functions/try_cast.html > https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#safe_casting -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
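The contrast with ANSI-mode CAST can be shown directly in a shell. A short sketch, assuming a build that already contains this sub-task:

{code:scala}
spark.conf.set("spark.sql.ansi.enabled", true)
spark.sql("SELECT CAST(2147483648L AS INT)").show()      // raises an overflow error
spark.sql("SELECT TRY_CAST(2147483648L AS INT)").show()  // NULL
spark.sql("SELECT TRY_CAST('abc' AS INT)").show()        // NULL instead of an error
{code}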
[jira] [Updated] (SPARK-35161) Safe version SQL functions
[ https://issues.apache.org/jira/browse/SPARK-35161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-35161: --- Issue Type: Umbrella (was: New Feature) > Safe version SQL functions > -- > > Key: SPARK-35161 > URL: https://issues.apache.org/jira/browse/SPARK-35161 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gengliang Wang >Priority: Major > > Create new safe version SQL functions for existing SQL functions/operators, > which return NULL if an overflow or error occurs, so that: > 1. Users can finish queries without interruption. > 2. The results can be more reasonable. For example, the results of the > following operations are unreasonable: > {code:java} > 2147483647 + 2 => -2147483647 > CAST(2147483648L AS INT) => -2147483648 > {code} > Having the safe version SQL functions provides an alternative solution for > handling such cases > {code:java} > TRY_ADD(2147483647, 2) => null > TRY_CAST(2147483648L AS INT) => null > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35112) Cast string to day-time interval
[ https://issues.apache.org/jira/browse/SPARK-35112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17326252#comment-17326252 ] angerszhu commented on SPARK-35112: --- Working on this. > Cast string to day-time interval > > > Key: SPARK-35112 > URL: https://issues.apache.org/jira/browse/SPARK-35112 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Priority: Major > > Support cast of string to DayTimeIntervalType. The cast should support both the full > form INTERVAL '1 10:11:12' DAY TO SECOND and the bare interval payload '1 > 10:11:12'. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35111) Cast string to year-month interval
[ https://issues.apache.org/jira/browse/SPARK-35111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17326251#comment-17326251 ] angerszhu commented on SPARK-35111: --- Working on this. > Cast string to year-month interval > -- > > Key: SPARK-35111 > URL: https://issues.apache.org/jira/browse/SPARK-35111 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Priority: Major > > Support cast of string to YearMonthIntervalType. The cast should support both the full > form INTERVAL '1-1' YEAR TO MONTH and the bare interval payload '1-1'. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
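Both of these casts (SPARK-35112 for day-time, SPARK-35111 for year-month) have the same shape. A hedged sketch of the expected behavior, assuming the proposed syntax is accepted, i.e. either the full INTERVAL literal form or the bare payload casts cleanly:

{code:scala}
// SPARK-35112: day-time; both the full literal form and the bare payload
spark.sql("""SELECT CAST("INTERVAL '1 10:11:12' DAY TO SECOND" AS INTERVAL DAY TO SECOND)""").show()
spark.sql("SELECT CAST('1 10:11:12' AS INTERVAL DAY TO SECOND)").show()
// SPARK-35111: year-month
spark.sql("SELECT CAST('1-1' AS INTERVAL YEAR TO MONTH)").show()
{code}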
[jira] [Created] (SPARK-35161) Safe version SQL functions
Gengliang Wang created SPARK-35161: -- Summary: Safe version SQL functions Key: SPARK-35161 URL: https://issues.apache.org/jira/browse/SPARK-35161 Project: Spark Issue Type: New Feature Components: SQL Affects Versions: 3.2.0 Reporter: Gengliang Wang Create new safe version SQL functions for existing SQL functions/operators, which return NULL if an overflow or error occurs, so that: 1. Users can finish queries without interruption. 2. The results can be more reasonable. For example, the results of the following operations are unreasonable: {code:java} 2147483647 + 2 => -2147483647 CAST(2147483648L AS INT) => -2147483648 {code} Having the safe version SQL functions provides an alternative solution for handling such cases {code:java} TRY_ADD(2147483647, 2) => null TRY_CAST(2147483648L AS INT) => null {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-35113) Support ANSI intervals in the Hash expression
[ https://issues.apache.org/jira/browse/SPARK-35113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-35113. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 32259 [https://github.com/apache/spark/pull/32259] > Support ANSI intervals in the Hash expression > - > > Key: SPARK-35113 > URL: https://issues.apache.org/jira/browse/SPARK-35113 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Assignee: angerszhu >Priority: Major > Fix For: 3.2.0 > > > Handle YearMonthIntervalType and DayTimeIntervalType in HashExpression. And > write tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
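Since a year-month interval is physically an Int (months) and a day-time interval a Long (microseconds), hashing can delegate to the existing integral code paths. A quick check of the resolved behavior, assuming a build with the patch above and the ANSI interval literal syntax that is also new in 3.2:

{code:scala}
spark.sql("""
  SELECT hash(INTERVAL '1-1' YEAR TO MONTH)        AS ym_hash,
         hash(INTERVAL '1 10:11:12' DAY TO SECOND) AS dt_hash
""").show()
{code}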
[jira] [Assigned] (SPARK-35113) Support ANSI intervals in the Hash expression
[ https://issues.apache.org/jira/browse/SPARK-35113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-35113: Assignee: angerszhu > Support ANSI intervals in the Hash expression > - > > Key: SPARK-35113 > URL: https://issues.apache.org/jira/browse/SPARK-35113 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Assignee: angerszhu >Priority: Major > > Handle YearMonthIntervalType and DayTimeIntervalType in HashExpression. And > write tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35159) extract doc of hive format
[ https://issues.apache.org/jira/browse/SPARK-35159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17326225#comment-17326225 ] Apache Spark commented on SPARK-35159: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/32264 > extract doc of hive format > -- > > Key: SPARK-35159 > URL: https://issues.apache.org/jira/browse/SPARK-35159 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Priority: Major > > extract doc of hive format -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35159) extract doc of hive format
[ https://issues.apache.org/jira/browse/SPARK-35159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35159: Assignee: (was: Apache Spark) > extract doc of hive format > -- > > Key: SPARK-35159 > URL: https://issues.apache.org/jira/browse/SPARK-35159 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Priority: Major > > extract doc of hive format -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35159) extract doc of hive format
[ https://issues.apache.org/jira/browse/SPARK-35159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35159: Assignee: Apache Spark > extract doc of hive format > -- > > Key: SPARK-35159 > URL: https://issues.apache.org/jira/browse/SPARK-35159 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: Apache Spark >Priority: Major > > extract doc of hive format -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35159) extract doc of hive format
[ https://issues.apache.org/jira/browse/SPARK-35159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17326224#comment-17326224 ] Apache Spark commented on SPARK-35159: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/32264 > extract doc of hive format > -- > > Key: SPARK-35159 > URL: https://issues.apache.org/jira/browse/SPARK-35159 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Priority: Major > > extract doc of hive format -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34928) CTE Execution fails for Sql Server
[ https://issues.apache.org/jira/browse/SPARK-34928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Supun De Silva updated SPARK-34928: --- Description: h2. Issue We have a simple SQL statement that we intend to execute on SQL Server. This has a CTE component. Execution of this yields the following error {code:java} java.sql.SQLException: Incorrect syntax near the keyword 'WITH'.{code} We are using the JDBC driver *net.sourceforge.jtds.jdbc.Driver* (version 1.3.1) This is a particularly annoying issue and due to this we have to write inner queries that are a fair bit inefficient. h2. SQL statement (not the actual one but a simplified version with renamed parameters) {code:sql} WITH OldChanges as ( SELECT distinct SomeDate, Name FROM [dbo].[DateNameFoo] (nolock) WHERE SomeDate != '2021-03-30' AND convert(date, UpdateDateTime) = '2021-03-31' ) SELECT * from OldChanges {code} h3. Update on 2021-04-21 We tried the *com.microsoft.sqlserver.jdbc.SQLServerDriver* driver as well. It also yields the same error. was: h2. Issue We have a simple SQL statement that we intend to execute on SQL Server. This has a CTE component. Execution of this yields the following error {code:java} java.sql.SQLException: Incorrect syntax near the keyword 'WITH'.{code} We are using the JDBC driver *net.sourceforge.jtds.jdbc.Driver* (version 1.3.1) This is a particularly annoying issue and due to this we have to write inner queries that are a fair bit inefficient. h2. SQL statement (not the actual one but a simplified version with renamed parameters) {code:sql} WITH OldChanges as ( SELECT distinct SomeDate, Name FROM [dbo].[DateNameFoo] (nolock) WHERE SomeDate != '2021-03-30' AND convert(date, UpdateDateTime) = '2021-03-31' ) SELECT * from OldChanges {code} > CTE Execution fails for Sql Server > -- > > Key: SPARK-34928 > URL: https://issues.apache.org/jira/browse/SPARK-34928 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.0.1 >Reporter: Supun De Silva >Priority: Minor > > h2. Issue > We have a simple SQL statement that we intend to execute on SQL Server. This > has a CTE component. > Execution of this yields the following error > {code:java} > java.sql.SQLException: Incorrect syntax near the keyword 'WITH'.{code} > We are using the JDBC driver *net.sourceforge.jtds.jdbc.Driver* (version > 1.3.1) > This is a particularly annoying issue and due to this we have to write > inner queries that are a fair bit inefficient. > h2. SQL statement > (not the actual one but a simplified version with renamed parameters) > > {code:sql} > WITH OldChanges as ( >SELECT distinct > SomeDate, > Name >FROM [dbo].[DateNameFoo] (nolock) >WHERE SomeDate != '2021-03-30' >AND convert(date, UpdateDateTime) = '2021-03-31' > ) > SELECT * from OldChanges {code} > h3. Update on 2021-04-21 > We tried the *com.microsoft.sqlserver.jdbc.SQLServerDriver* driver as well. It > also yields the same error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34928) CTE Execution fails for Sql Server
[ https://issues.apache.org/jira/browse/SPARK-34928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17326220#comment-17326220 ] Supun De Silva commented on SPARK-34928: [~hyukjin.kwon] We tried the *com.microsoft.sqlserver.jdbc.SQLServerDriver* driver as well. It also yields the same error. > CTE Execution fails for Sql Server > -- > > Key: SPARK-34928 > URL: https://issues.apache.org/jira/browse/SPARK-34928 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.0.1 >Reporter: Supun De Silva >Priority: Minor > > h2. Issue > We have a simple SQL statement that we intend to execute on SQL Server. This > has a CTE component. > Execution of this yields the following error > {code:java} > java.sql.SQLException: Incorrect syntax near the keyword 'WITH'.{code} > We are using the JDBC driver *net.sourceforge.jtds.jdbc.Driver* (version > 1.3.1) > This is a particularly annoying issue and due to this we have to write > inner queries that are a fair bit inefficient. > h2. SQL statement > (not the actual one but a simplified version with renamed parameters) > > {code:sql} > WITH OldChanges as ( >SELECT distinct > SomeDate, > Name >FROM [dbo].[DateNameFoo] (nolock) >WHERE SomeDate != '2021-03-30' >AND convert(date, UpdateDateTime) = '2021-03-31' > ) > SELECT * from OldChanges {code} > h3. Update on 2021-04-21 > We tried the *com.microsoft.sqlserver.jdbc.SQLServerDriver* driver as well. It > also yields the same error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
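The error is consistent with how Spark's JDBC source composes queries rather than with either driver: the statement supplied via dbtable/query is wrapped in a derived table (roughly SELECT * FROM (<statement>) alias, plus a WHERE 1=0 probe for schema inference), and T-SQL does not allow WITH inside a derived table. That reading is inferred from the Spark source, not confirmed in this thread. A hedged sketch of the usual workaround, inlining the CTE as a subquery; the connection options are illustrative:

{code:scala}
// Inline the CTE as a derived table so the statement survives Spark's wrapping.
val pushdown =
  """(SELECT * FROM (
    |   SELECT DISTINCT SomeDate, Name
    |   FROM [dbo].[DateNameFoo] (nolock)
    |   WHERE SomeDate != '2021-03-30'
    |     AND convert(date, UpdateDateTime) = '2021-03-31'
    | ) OldChanges) t""".stripMargin

val df = spark.read.format("jdbc")
  .option("url", "jdbc:jtds:sqlserver://host:1433/SomeDb") // hypothetical URL
  .option("dbtable", pushdown)
  .option("user", "...")         // placeholders
  .option("password", "...")
  .load()
{code}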
[jira] [Created] (SPARK-35160) Spark application submitted despite failing to get Hive delegation token
Manu Zhang created SPARK-35160: -- Summary: Spark application submitted despite failing to get Hive delegation token Key: SPARK-35160 URL: https://issues.apache.org/jira/browse/SPARK-35160 Project: Spark Issue Type: Improvement Components: Security Affects Versions: 3.1.1 Reporter: Manu Zhang Currently, when running on YARN and failing to get a Hive delegation token, a Spark SQL application will still be submitted. Eventually, the application will fail to connect to the Hive metastore without a valid delegation token. Is there any reason for this design? cc [~jerryshao], who originally implemented this in https://issues.apache.org/jira/browse/SPARK-14743 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35159) extract doc of hive format
angerszhu created SPARK-35159: - Summary: extract doc of hive format Key: SPARK-35159 URL: https://issues.apache.org/jira/browse/SPARK-35159 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: angerszhu extract doc of hive format -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35084) [k8s] On Spark 3, jars listed in spark.jars and spark.jars.packages are not added to sparkContext
[ https://issues.apache.org/jira/browse/SPARK-35084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keunhyun Oh updated SPARK-35084: Issue Type: Bug (was: Question) > [k8s] On Spark 3, jars listed in spark.jars and spark.jars.packages are not > added to sparkContext > - > > Key: SPARK-35084 > URL: https://issues.apache.org/jira/browse/SPARK-35084 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.0.0, 3.0.2, 3.1.1 >Reporter: Keunhyun Oh >Priority: Major > > I'm trying to migrate from Spark 2 to Spark 3 in k8s. > > In my environment, on Spark 3.x, jars listed in spark.jars and > spark.jars.packages are not added to the sparkContext. > After the driver process is launched, the jars are not propagated to the executors, so > NoClassDefFoundError is raised in the executors. > > In spark.properties, only the main application jar is contained in > spark.jars. This is different from Spark 2. > > How can this be solved? Did any Spark options change from Spark 2 to > Spark 3? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
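One way to confirm the symptom from the driver, independent of the cause: SparkContext exposes the list of jars it serves to executors. A small check, assuming a spark-shell or a driver-side hook in the application:

{code:scala}
// On an affected Spark 3 driver in k8s cluster mode this reportedly prints only
// the main application jar; on Spark 2 the --jars/--packages entries showed up too.
spark.sparkContext.listJars().foreach(println)
{code}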
[jira] [Comment Edited] (SPARK-35084) [k8s] On Spark 3, jars listed in spark.jars and spark.jars.packages are not added to sparkContext
[ https://issues.apache.org/jira/browse/SPARK-35084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17326200#comment-17326200 ] Keunhyun Oh edited comment on SPARK-35084 at 4/21/21, 2:07 AM: --- *Spark 2.4.5* [https://github.com/apache/spark/blob/v2.4.5/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala] {code:java} if (!isMesosCluster && !isStandAloneCluster) { // Resolve maven dependencies if there are any and add classpath to jars. Add them to py-files // too for packages that include Python code val resolvedMavenCoordinates = DependencyUtils.resolveMavenDependencies( args.packagesExclusions, args.packages, args.repositories, args.ivyRepoPath, args.ivySettingsPath) if (!StringUtils.isBlank(resolvedMavenCoordinates)) { args.jars = mergeFileLists(args.jars, resolvedMavenCoordinates) if (args.isPython || isInternal(args.primaryResource)) { args.pyFiles = mergeFileLists(args.pyFiles, resolvedMavenCoordinates) } } // install any R packages that may have been passed through --jars or --packages. // Spark Packages may contain R source code inside the jar. if (args.isR && !StringUtils.isBlank(args.jars)) { RPackageUtils.checkAndBuildRPackage(args.jars, printStream, args.verbose) } } {code} *Spark 3.0.2* [https://github.com/apache/spark/blob/v3.0.2/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala] {code:java} if (!StringUtils.isBlank(resolvedMavenCoordinates)) { // In K8s client mode, when in the driver, add resolved jars early as we might need // them at the submit time for artifact downloading. // For example we might use the dependencies for downloading // files from a Hadoop Compatible fs eg. S3. In this case the user might pass: // --packages com.amazonaws:aws-java-sdk:1.7.4:org.apache.hadoop:hadoop-aws:2.7.6 if (isKubernetesClusterModeDriver) { val loader = getSubmitClassLoader(sparkConf) for (jar <- resolvedMavenCoordinates.split(",")) { addJarToClasspath(jar, loader) } } else if (isKubernetesCluster) { // We need this in K8s cluster mode so that we can upload local deps // via the k8s application, like in cluster mode driver childClasspath ++= resolvedMavenCoordinates.split(",") } else { args.jars = mergeFileLists(args.jars, resolvedMavenCoordinates) if (args.isPython || isInternal(args.primaryResource)) { args.pyFiles = mergeFileLists(args.pyFiles, resolvedMavenCoordinates) } } }{code} When using the k8s master, in Spark 2, jars derived from Maven are added to args.jars. However, in Spark 3, Maven dependencies are not merged into args.jars. I assume that, because of this, spark-submit in k8s cluster mode does not support spark.jars.packages as I expected, so jars from packages are not added to the Spark context. How can Maven packages be used in k8s cluster mode? was (Author: ocworld): *Spark 2.4.5* [https://github.com/apache/spark/blob/v2.4.5/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala] {code:java} if (!isMesosCluster && !isStandAloneCluster) { // Resolve maven dependencies if there are any and add classpath to jars. 
Add them to py-files // too for packages that include Python code val resolvedMavenCoordinates = DependencyUtils.resolveMavenDependencies( args.packagesExclusions, args.packages, args.repositories, args.ivyRepoPath, args.ivySettingsPath) if (!StringUtils.isBlank(resolvedMavenCoordinates)) { args.jars = mergeFileLists(args.jars, resolvedMavenCoordinates) if (args.isPython || isInternal(args.primaryResource)) { args.pyFiles = mergeFileLists(args.pyFiles, resolvedMavenCoordinates) } } // install any R packages that may have been passed through --jars or --packages. // Spark Packages may contain R source code inside the jar. if (args.isR && !StringUtils.isBlank(args.jars)) { RPackageUtils.checkAndBuildRPackage(args.jars, printStream, args.verbose) } } {code} *Spark 3.0.2* [https://github.com/apache/spark/blob/v3.0.2/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala] {code:java} if (!StringUtils.isBlank(resolvedMavenCoordinates)) { // In K8s client mode, when in the driver, add resolved jars early as we might need // them at the submit time for artifact downloading. // For example we might use the dependencies for downloading // files from a Hadoop Compatible fs eg. S3. In this case the user might pass: // --packages com.amazonaws:aws-java-sdk:1.7.4:org.apache.hadoop:hadoop-aws:2.7.6 if
[jira] [Commented] (SPARK-35084) [k8s] On Spark 3, jars listed in spark.jars and spark.jars.packages are not added to sparkContext
[ https://issues.apache.org/jira/browse/SPARK-35084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17326200#comment-17326200 ] Keunhyun Oh commented on SPARK-35084: - *Spark 2.4.5* [https://github.com/apache/spark/blob/v2.4.5/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala] {code:java} if (!isMesosCluster && !isStandAloneCluster) { // Resolve maven dependencies if there are any and add classpath to jars. Add them to py-files // too for packages that include Python code val resolvedMavenCoordinates = DependencyUtils.resolveMavenDependencies( args.packagesExclusions, args.packages, args.repositories, args.ivyRepoPath, args.ivySettingsPath) if (!StringUtils.isBlank(resolvedMavenCoordinates)) { args.jars = mergeFileLists(args.jars, resolvedMavenCoordinates) if (args.isPython || isInternal(args.primaryResource)) { args.pyFiles = mergeFileLists(args.pyFiles, resolvedMavenCoordinates) } } // install any R packages that may have been passed through --jars or --packages. // Spark Packages may contain R source code inside the jar. if (args.isR && !StringUtils.isBlank(args.jars)) { RPackageUtils.checkAndBuildRPackage(args.jars, printStream, args.verbose) } } {code} *Spark 3.0.2* [https://github.com/apache/spark/blob/v3.0.2/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala] {code:java} if (!StringUtils.isBlank(resolvedMavenCoordinates)) { // In K8s client mode, when in the driver, add resolved jars early as we might need // them at the submit time for artifact downloading. // For example we might use the dependencies for downloading // files from a Hadoop Compatible fs eg. S3. In this case the user might pass: // --packages com.amazonaws:aws-java-sdk:1.7.4:org.apache.hadoop:hadoop-aws:2.7.6 if (isKubernetesClusterModeDriver) { val loader = getSubmitClassLoader(sparkConf) for (jar <- resolvedMavenCoordinates.split(",")) { addJarToClasspath(jar, loader) } } else if (isKubernetesCluster) { // We need this in K8s cluster mode so that we can upload local deps // via the k8s application, like in cluster mode driver childClasspath ++= resolvedMavenCoordinates.split(",") } else { args.jars = mergeFileLists(args.jars, resolvedMavenCoordinates) if (args.isPython || isInternal(args.primaryResource)) { args.pyFiles = mergeFileLists(args.pyFiles, resolvedMavenCoordinates) } } }{code} When using the k8s master, in Spark 2, jars derived from Maven are added to args.jars. However, in Spark 3, Maven dependencies are not merged into args.jars. I assume that, because of this, spark-submit in k8s cluster mode does not support spark.jars.packages as I expected, so jars from packages are not added to the Spark context. > [k8s] On Spark 3, jars listed in spark.jars and spark.jars.packages are not > added to sparkContext > - > > Key: SPARK-35084 > URL: https://issues.apache.org/jira/browse/SPARK-35084 > Project: Spark > Issue Type: Question > Components: Kubernetes >Affects Versions: 3.0.0, 3.0.2, 3.1.1 >Reporter: Keunhyun Oh >Priority: Major > > I'm trying to migrate from Spark 2 to Spark 3 in k8s. > > In my environment, on Spark 3.x, jars listed in spark.jars and > spark.jars.packages are not added to the sparkContext. > After the driver process is launched, the jars are not propagated to the executors, so > NoClassDefFoundError is raised in the executors. > > In spark.properties, only the main application jar is contained in > spark.jars. This is different from Spark 2. > > How can this be solved? Did any Spark options change from Spark 2 to > Spark 3? 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35084) [k8s] On Spark 3, jars listed in spark.jars and spark.jars.packages are not added to sparkContext
[ https://issues.apache.org/jira/browse/SPARK-35084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keunhyun Oh updated SPARK-35084: Affects Version/s: 3.1.1 > [k8s] On Spark 3, jars listed in spark.jars and spark.jars.packages are not > added to sparkContext > - > > Key: SPARK-35084 > URL: https://issues.apache.org/jira/browse/SPARK-35084 > Project: Spark > Issue Type: Question > Components: Kubernetes >Affects Versions: 3.0.0, 3.0.2, 3.1.1 >Reporter: Keunhyun Oh >Priority: Major > > I'm trying to migrate from Spark 2 to Spark 3 in k8s. > > In my environment, on Spark 3.x, jars listed in spark.jars and > spark.jars.packages are not added to the sparkContext. > After the driver process is launched, the jars are not propagated to the executors, so > NoClassDefFoundError is raised in the executors. > > In spark.properties, only the main application jar is contained in > spark.jars. This is different from Spark 2. > > How can this be solved? Did any Spark options change from Spark 2 to > Spark 3? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35158) Add some guides for authors to retrigger the workflow run
[ https://issues.apache.org/jira/browse/SPARK-35158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-35158: - Description: Currently only authors can retrigger the GitHub Actions build in their PRs. We should explicitly guide them, at [https://github.com/apache/spark/blob/master/.github/workflows/notify_test_workflow.yml#L110], something like: If the tests fail for reasons unrelated to the change, please retrigger the workflow run in your forked repository. If related, please investigate, fix and push new changes to fix the test failure. This guide can be removed once SPARK-35157 is done. was: Currently only authors can retrigger the GitHub Actions build in their PRs. We should explicitly guide them, at [https://github.com/apache/spark/blob/master/.github/workflows/notify_test_workflow.yml#L110], something like: If the tests fail for reasons unrelated to the change, please retrigger the workflow run in your forked repository. If related, please investigate, fix and push new changes to fix the test failure. > Add some guides for authors to retrigger the workflow run > - > > Key: SPARK-35158 > URL: https://issues.apache.org/jira/browse/SPARK-35158 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 3.2.0 >Reporter: Hyukjin Kwon >Priority: Major > > Currently only authors can retrigger the GitHub Actions build in their PRs. > We should explicitly guide them, at > [https://github.com/apache/spark/blob/master/.github/workflows/notify_test_workflow.yml#L110], > something like: > If the tests fail for reasons unrelated to the change, please retrigger the > workflow run in your forked repository. > If related, please investigate, fix and push new changes to fix the test > failure. > This guide can be removed once SPARK-35157 is done. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35158) Add some guides for authors to retrigger the workflow run
[ https://issues.apache.org/jira/browse/SPARK-35158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-35158: - Description: Currently only authors can retrigger the GitHub Actions build in their PRs. We should explicitly guide them, at [https://github.com/apache/spark/blob/master/.github/workflows/notify_test_workflow.yml#L110], something like: If the tests fail for reasons unrelated to the change, please retrigger the workflow run in your forked repository. If related, please investigate, fix and push new changes to fix the test failure. was:Currently only authors can retrigger the GitHub Actions build in their PRs. We should explicitly guide them, at [https://github.com/apache/spark/blob/master/.github/workflows/notify_test_workflow.yml#L110], to fix the changes and/or retrigger the tests if they fail. > Add some guides for authors to retrigger the workflow run > - > > Key: SPARK-35158 > URL: https://issues.apache.org/jira/browse/SPARK-35158 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 3.2.0 >Reporter: Hyukjin Kwon >Priority: Major > > Currently only authors can retrigger the GitHub Actions build in their PRs. > We should explicitly guide them, at > [https://github.com/apache/spark/blob/master/.github/workflows/notify_test_workflow.yml#L110], > something like: > If the tests fail for reasons unrelated to the change, please retrigger the > workflow run in your forked repository. > If related, please investigate, fix and push new changes to fix the test > failure. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35158) Add some guides for authors to retrigger the workflow run
Hyukjin Kwon created SPARK-35158: Summary: Add some guides for authors to retrigger the workflow run Key: SPARK-35158 URL: https://issues.apache.org/jira/browse/SPARK-35158 Project: Spark Issue Type: Sub-task Components: Project Infra Affects Versions: 3.2.0 Reporter: Hyukjin Kwon Currently only authors can retrigger the GitHub Actions build in their PRs. We should explicitly guide them, at [https://github.com/apache/spark/blob/master/.github/workflows/notify_test_workflow.yml#L110], to fix the changes and/or retrigger the tests if they fail. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35157) Have a way for other people to retrigger the build in GitHub Actions
[ https://issues.apache.org/jira/browse/SPARK-35157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-35157: - Description: We should ask contributors to retrigger the tests because the builds run in their fork, and currently committers or other people cannot retrigger out of the box. Note that the retriggering has to happen in the forked repository. This cannot be done from the main repository. One possible way is to create a workflow *that only runs in the forked repository*: 1. Regularly (every 15 mins?) get a list of PRs opened from the forked repository, or possibly by the author (see [https://github.com/apache/spark/blob/master/.github/workflows/update_build_status.yml#L36-L47]) 2. Iterate the PRs: ㅤ2.1. Get the latest workflow run ([https://github.com/apache/spark/blob/master/.github/workflows/notify_test_workflow.yml#L41-L59]) ㅤ2.2. Iterate the comments in the PR (see [https://docs.github.com/en/rest/guides/working-with-comments#pull-request-comments] for Javascript and [https://docs.github.com/en/rest/reference/issues#list-issue-comments] for REST API). Issue number is PR number. ㅤㅤ2.2.1. Check if there is a comment such as "GitHub Actions: retrigger please" _after the latest workflow run_. Last update time is available when you get a workflow run, see also [https://docs.github.com/en/rest/reference/actions#get-a-workflow-run] ㅤㅤㅤ2.2.1.1. If there is, retrigger the workflow run, see also [https://docs.github.com/en/rest/reference/actions#create-a-workflow-dispatch-event] ㅤㅤㅤ2.2.1.2. If not, skip. was:We should ask contributors to retrigger the tests because the builds run in their fork, and currently committers cannot retrigger out of the box. > Have a way for other people to retrigger the build in GitHub Actions > > > Key: SPARK-35157 > URL: https://issues.apache.org/jira/browse/SPARK-35157 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 3.2.0 >Reporter: Hyukjin Kwon >Priority: Major > > We should ask contributors to retrigger the tests because the builds run in > their fork, and currently committers or other people cannot retrigger out of > the box. > Note that the retriggering has to happen in the forked repository. This cannot > be done from the main repository. > One possible way is to create a workflow *that only runs in the forked > repository*: > 1. Regularly (every 15 mins?) get a list of PRs opened from the forked repository, > or possibly by the author (see > [https://github.com/apache/spark/blob/master/.github/workflows/update_build_status.yml#L36-L47]) > 2. Iterate the PRs: > ㅤ2.1. Get the latest workflow run > ([https://github.com/apache/spark/blob/master/.github/workflows/notify_test_workflow.yml#L41-L59]) > ㅤ2.2. Iterate the comments in the PR (see > [https://docs.github.com/en/rest/guides/working-with-comments#pull-request-comments] > for Javascript and > [https://docs.github.com/en/rest/reference/issues#list-issue-comments] for > REST API). Issue number is PR number. > ㅤㅤ2.2.1. Check if there is a comment such as "GitHub Actions: retrigger > please" _after the latest workflow run_. Last update time is available when > you get a workflow run, see also > [https://docs.github.com/en/rest/reference/actions#get-a-workflow-run] > ㅤㅤㅤ2.2.1.1. If there is, retrigger the workflow run, see also > [https://docs.github.com/en/rest/reference/actions#create-a-workflow-dispatch-event] > ㅤㅤㅤ2.2.1.2. If not, skip. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
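Step 2.2.1.1 above maps onto a single REST call. A minimal sketch of just that step, assuming Java 11+, a token with repo scope in GITHUB_TOKEN, and the workflow-dispatch endpoint linked in the description; OWNER and the workflow file name are placeholders, and the target workflow must declare a workflow_dispatch trigger:

{code:scala}
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

// POST /repos/{owner}/{repo}/actions/workflows/{workflow_id}/dispatches
val request = HttpRequest.newBuilder()
  .uri(URI.create(
    "https://api.github.com/repos/OWNER/spark/actions/workflows/build_and_test.yml/dispatches"))
  .header("Authorization", "token " + sys.env("GITHUB_TOKEN"))
  .header("Accept", "application/vnd.github.v3+json")
  .POST(HttpRequest.BodyPublishers.ofString("""{"ref": "master"}"""))
  .build()

val response = HttpClient.newHttpClient()
  .send(request, HttpResponse.BodyHandlers.ofString())
println(response.statusCode()) // 204 on success
{code}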
[jira] [Updated] (SPARK-35157) Have a way for other people to retrigger the build in GitHub Actions
[ https://issues.apache.org/jira/browse/SPARK-35157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-35157: - Summary: Have a way for other people to retrigger the build in GitHub Actions (was: Guide users to retrigger if tests fails in GitHub Actions build) > Have a way for other people to retrigger the build in GitHub Actions > > > Key: SPARK-35157 > URL: https://issues.apache.org/jira/browse/SPARK-35157 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 3.2.0 >Reporter: Hyukjin Kwon >Priority: Major > > We should ask contributors to retrigger the tests because the builds run in > their fork, and currently committers cannot retrigger out of the box. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34639) always remove unnecessary Alias in Analyzer.resolveExpression
[ https://issues.apache.org/jira/browse/SPARK-34639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-34639: - Fix Version/s: 3.1.2 > always remove unnecessary Alias in Analyzer.resolveExpression > - > > Key: SPARK-34639 > URL: https://issues.apache.org/jira/browse/SPARK-34639 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > Fix For: 3.1.2, 3.2.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35157) Guide users to retrigger if tests fails in GitHub Actions build
Hyukjin Kwon created SPARK-35157: Summary: Guide users to retrigger if tests fails in GitHub Actions build Key: SPARK-35157 URL: https://issues.apache.org/jira/browse/SPARK-35157 Project: Spark Issue Type: Sub-task Components: Project Infra Affects Versions: 3.2.0 Reporter: Hyukjin Kwon We should ask contributors to retrigger the tests because the builds run in their fork, and currently committers cannot retrigger out of the box. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35156) Thrown java.lang.NoClassDefFoundError when using spark-submit
[ https://issues.apache.org/jira/browse/SPARK-35156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-35156: Description: Got NoClassDefFoundError when running spark-submit to submit a Spark app to a K8S cluster. Master, branch-3.1 are okay. Branch-3.1 is affected. How to reproduce: 1. Using sbt to build Spark with Kubernetes (-Pkubernetes) 2. Run spark-submit to submit to K8S cluster 3. Get the following exception {code:java} 21/04/20 16:33:37 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory at io.fabric8.kubernetes.client.internal.KubeConfigUtils.parseConfigFromString(KubeConfigUtils.java:46) at io.fabric8.kubernetes.client.Config.loadFromKubeconfig(Config.java:564) at io.fabric8.kubernetes.client.Config.tryKubeConfig(Config.java:530) at io.fabric8.kubernetes.client.Config.autoConfigure(Config.java:264) at io.fabric8.kubernetes.client.Config.<init>(Config.java:230) at io.fabric8.kubernetes.client.Config.<init>(Config.java:224) at io.fabric8.kubernetes.client.Config.autoConfigure(Config.java:259) at org.apache.spark.deploy.k8s.SparkKubernetesClientFactory$.createKubernetesClient(SparkKubernetesClientFactory.scala:80) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$2(KubernetesClientApplication.scala:207) at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2621) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:207) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:179) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.dataformat.yaml.YAMLFactory at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ... 19 more {code} was: Got NoClassDefFoundError when running spark-submit to submit a Spark app to a K8S cluster. Master branch is okay. Branch-3.1 is affected. How to reproduce: 1. Using sbt to build Spark with Kubernetes (-Pkubernetes) 2. Run spark-submit to submit to K8S cluster 3. Get the following exception {code:java} 21/04/20 16:33:37 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory at io.fabric8.kubernetes.client.internal.KubeConfigUtils.parseConfigFromString(KubeConfigUtils.java:46) at io.fabric8.kubernetes.client.Config.loadFromKubeconfig(Config.java:564)
[jira] [Updated] (SPARK-35156) Thrown java.lang.NoClassDefFoundError when using spark-submit
[ https://issues.apache.org/jira/browse/SPARK-35156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh updated SPARK-35156: Description: Got NoClassDefFoundError when running spark-submit to submit a Spark app to a K8S cluster. Master branch is okay. Branch-3.1 is affected. How to reproduce: 1. Using sbt to build Spark with Kubernetes (-Pkubernetes) 2. Run spark-submit to submit to K8S cluster 3. Get the following exception {code:java} 21/04/20 16:33:37 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory at io.fabric8.kubernetes.client.internal.KubeConfigUtils.parseConfigFromString(KubeConfigUtils.java:46) at io.fabric8.kubernetes.client.Config.loadFromKubeconfig(Config.java:564) at io.fabric8.kubernetes.client.Config.tryKubeConfig(Config.java:530) at io.fabric8.kubernetes.client.Config.autoConfigure(Config.java:264) at io.fabric8.kubernetes.client.Config.<init>(Config.java:230) at io.fabric8.kubernetes.client.Config.<init>(Config.java:224) at io.fabric8.kubernetes.client.Config.autoConfigure(Config.java:259) at org.apache.spark.deploy.k8s.SparkKubernetesClientFactory$.createKubernetesClient(SparkKubernetesClientFactory.scala:80) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$2(KubernetesClientApplication.scala:207) at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2621) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:207) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:179) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.dataformat.yaml.YAMLFactory at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ... 19 more {code} was: How to reproduce: 1. Using sbt to build Spark with Kubernetes (-Pkubernetes) 2. Run spark-submit to submit to K8S cluster 3. Get the following exception {code:java} 21/04/20 16:33:37 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory at io.fabric8.kubernetes.client.internal.KubeConfigUtils.parseConfigFromString(KubeConfigUtils.java:46) at io.fabric8.kubernetes.client.Config.loadFromKubeconfig(Config.java:564) at io.fabric8.kubernetes.client.Config.tryKubeConfig(Config.java:530)
[jira] [Created] (SPARK-35156) Thrown java.lang.NoClassDefFoundError when using spark-submit
L. C. Hsieh created SPARK-35156: --- Summary: Thrown java.lang.NoClassDefFoundError when using spark-submit Key: SPARK-35156 URL: https://issues.apache.org/jira/browse/SPARK-35156 Project: Spark Issue Type: Bug Components: Build, Kubernetes Affects Versions: 3.1.1 Reporter: L. C. Hsieh How to reproduce: 1. Use sbt to build Spark with Kubernetes support (-Pkubernetes) 2. Run spark-submit to submit to a K8S cluster 3. Get the following exception
{code:java}
21/04/20 16:33:37 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory
    at io.fabric8.kubernetes.client.internal.KubeConfigUtils.parseConfigFromString(KubeConfigUtils.java:46)
    at io.fabric8.kubernetes.client.Config.loadFromKubeconfig(Config.java:564)
    at io.fabric8.kubernetes.client.Config.tryKubeConfig(Config.java:530)
    at io.fabric8.kubernetes.client.Config.autoConfigure(Config.java:264)
    at io.fabric8.kubernetes.client.Config.<init>(Config.java:230)
    at io.fabric8.kubernetes.client.Config.<init>(Config.java:224)
    at io.fabric8.kubernetes.client.Config.autoConfigure(Config.java:259)
    at org.apache.spark.deploy.k8s.SparkKubernetesClientFactory$.createKubernetesClient(SparkKubernetesClientFactory.scala:80)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$2(KubernetesClientApplication.scala:207)
    at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2621)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:207)
    at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:179)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1030)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.dataformat.yaml.YAMLFactory
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 19 more
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
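The missing class lives in the {{jackson-dataformat-yaml}} artifact, which fabric8's kubernetes-client needs in order to parse kubeconfig files, so any classpath that carries the client must carry this module too. A minimal sbt illustration of that dependency relationship (the version is an assumption and should track the Jackson version already on the classpath):
{code:scala}
// jackson-dataformat-yaml provides com.fasterxml.jackson.dataformat.yaml.YAMLFactory,
// the class missing from the stack trace above. The version shown is illustrative.
libraryDependencies += "com.fasterxml.jackson.dataformat" % "jackson-dataformat-yaml" % "2.10.0"
{code}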
[jira] [Resolved] (SPARK-35132) Upgrade netty-all to 4.1.63.Final
[ https://issues.apache.org/jira/browse/SPARK-35132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-35132. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 32227 [https://github.com/apache/spark/pull/32227] > Upgrade netty-all to 4.1.63.Final > - > > Key: SPARK-35132 > URL: https://issues.apache.org/jira/browse/SPARK-35132 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > Fix For: 3.2.0 > > > Three CVE problems were found after netty 4.1.51.Final: > > ||Name||Description|| > |[CVE-2021-21409|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-21409]|Netty > is an open-source, asynchronous event-driven network application framework > for rapid development of maintainable high performance protocol servers & > clients. In Netty (io.netty:netty-codec-http2) before version 4.1.61.Final > there is a vulnerability that enables request smuggling. The content-length > header is not correctly validated if the request only uses a single > Http2HeaderFrame with the endStream set to to true. This could lead to > request smuggling if the request is proxied to a remote peer and translated > to HTTP/1.1. This is a followup of GHSA-wm47-8v5p-wjpj/CVE-2021-21295 which > did miss to fix this one case. This was fixed as part of 4.1.61.Final.| > |[CVE-2021-21295|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-21295]|Netty > is an open-source, asynchronous event-driven network application framework > for rapid development of maintainable high performance protocol servers & > clients. In Netty (io.netty:netty-codec-http2) before version 4.1.60.Final > there is a vulnerability that enables request smuggling. If a Content-Length > header is present in the original HTTP/2 request, the field is not validated > by `Http2MultiplexHandler` as it is propagated up. This is fine as long as > the request is not proxied through as HTTP/1.1. If the request comes in as an > HTTP/2 stream, gets converted into the HTTP/1.1 domain objects > (`HttpRequest`, `HttpContent`, etc.) via `Http2StreamFrameToHttpObjectCodec > `and then sent up to the child channel's pipeline and proxied through a > remote peer as HTTP/1.1 this may result in request smuggling. In a proxy > case, users may assume the content-length is validated somehow, which is not > the case. If the request is forwarded to a backend channel that is a HTTP/1.1 > connection, the Content-Length now has meaning and needs to be checked. An > attacker can smuggle requests inside the body as it gets downgraded from > HTTP/2 to HTTP/1.1. For an example attack refer to the linked GitHub > Advisory. Users are only affected if all of this is true: > `HTTP2MultiplexCodec` or `Http2FrameCodec` is used, > `Http2StreamFrameToHttpObjectCodec` is used to convert to HTTP/1.1 objects, > and these HTTP/1.1 objects are forwarded to another remote peer. This has > been patched in 4.1.60.Final As a workaround, the user can do the validation > by themselves by implementing a custom `ChannelInboundHandler` that is put in > the `ChannelPipeline` behind `Http2StreamFrameToHttpObjectCodec`.| > |[CVE-2021-21290|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-21290]|Netty > is an open-source, asynchronous event-driven network application framework > for rapid development of maintainable high performance protocol servers & > clients. 
In Netty before version 4.1.59.Final there is a vulnerability on > Unix-like systems involving an insecure temp file. When netty's multipart > decoders are used local information disclosure can occur via the local system > temporary directory if temporary storing uploads on the disk is enabled. On > unix-like systems, the temporary directory is shared between all user. As > such, writing to this directory using APIs that do not explicitly set the > file/directory permissions can lead to information disclosure. Of note, this > does not impact modern MacOS Operating Systems. The method > "File.createTempFile" on unix-like systems creates a random file, but, by > default will create this file with the permissions "-rw-r--r--". Thus, if > sensitive information is written to this file, other local users can read > this information. This is the case in netty's "AbstractDiskHttpData" is > vulnerable. This has been fixed in version 4.1.59.Final. As a workaround, one > may specify your own "java.io.tmpdir" when you start the JVM or use > "DefaultHttpDataFactory.setBaseDir(...)" to set the directory to something > that is only readable by the current user.| > > Upgrade netty version to avoid these potential risks -- This message was sent by Atlassian Jira
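For downstream sbt builds that cannot pick up the Spark upgrade immediately, the same mitigation can be sketched as a local dependency override (sbt 1.x syntax; coordinates follow Spark's own dependency tree, and this is an illustration rather than a tested configuration):
{code:scala}
// Pin the patched netty release; 4.1.63.Final contains the fixes for
// CVE-2021-21409, CVE-2021-21295 and CVE-2021-21290 described above.
dependencyOverrides += "io.netty" % "netty-all" % "4.1.63.Final"
{code}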
[jira] [Assigned] (SPARK-35132) Upgrade netty-all to 4.1.63.Final
[ https://issues.apache.org/jira/browse/SPARK-35132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen reassigned SPARK-35132: Assignee: Yang Jie > Upgrade netty-all to 4.1.63.Final > - > > Key: SPARK-35132 > URL: https://issues.apache.org/jira/browse/SPARK-35132 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.2.0 >Reporter: Yang Jie >Assignee: Yang Jie >Priority: Minor > > Three CVE problems were found after netty 4.1.51.Final: > > ||Name||Description|| > |[CVE-2021-21409|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-21409]|Netty > is an open-source, asynchronous event-driven network application framework > for rapid development of maintainable high performance protocol servers & > clients. In Netty (io.netty:netty-codec-http2) before version 4.1.61.Final > there is a vulnerability that enables request smuggling. The content-length > header is not correctly validated if the request only uses a single > Http2HeaderFrame with the endStream set to to true. This could lead to > request smuggling if the request is proxied to a remote peer and translated > to HTTP/1.1. This is a followup of GHSA-wm47-8v5p-wjpj/CVE-2021-21295 which > did miss to fix this one case. This was fixed as part of 4.1.61.Final.| > |[CVE-2021-21295|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-21295]|Netty > is an open-source, asynchronous event-driven network application framework > for rapid development of maintainable high performance protocol servers & > clients. In Netty (io.netty:netty-codec-http2) before version 4.1.60.Final > there is a vulnerability that enables request smuggling. If a Content-Length > header is present in the original HTTP/2 request, the field is not validated > by `Http2MultiplexHandler` as it is propagated up. This is fine as long as > the request is not proxied through as HTTP/1.1. If the request comes in as an > HTTP/2 stream, gets converted into the HTTP/1.1 domain objects > (`HttpRequest`, `HttpContent`, etc.) via `Http2StreamFrameToHttpObjectCodec > `and then sent up to the child channel's pipeline and proxied through a > remote peer as HTTP/1.1 this may result in request smuggling. In a proxy > case, users may assume the content-length is validated somehow, which is not > the case. If the request is forwarded to a backend channel that is a HTTP/1.1 > connection, the Content-Length now has meaning and needs to be checked. An > attacker can smuggle requests inside the body as it gets downgraded from > HTTP/2 to HTTP/1.1. For an example attack refer to the linked GitHub > Advisory. Users are only affected if all of this is true: > `HTTP2MultiplexCodec` or `Http2FrameCodec` is used, > `Http2StreamFrameToHttpObjectCodec` is used to convert to HTTP/1.1 objects, > and these HTTP/1.1 objects are forwarded to another remote peer. This has > been patched in 4.1.60.Final As a workaround, the user can do the validation > by themselves by implementing a custom `ChannelInboundHandler` that is put in > the `ChannelPipeline` behind `Http2StreamFrameToHttpObjectCodec`.| > |[CVE-2021-21290|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-21290]|Netty > is an open-source, asynchronous event-driven network application framework > for rapid development of maintainable high performance protocol servers & > clients. In Netty before version 4.1.59.Final there is a vulnerability on > Unix-like systems involving an insecure temp file. 
When netty's multipart > decoders are used local information disclosure can occur via the local system > temporary directory if temporary storing uploads on the disk is enabled. On > unix-like systems, the temporary directory is shared between all user. As > such, writing to this directory using APIs that do not explicitly set the > file/directory permissions can lead to information disclosure. Of note, this > does not impact modern MacOS Operating Systems. The method > "File.createTempFile" on unix-like systems creates a random file, but, by > default will create this file with the permissions "-rw-r--r--". Thus, if > sensitive information is written to this file, other local users can read > this information. This is the case in netty's "AbstractDiskHttpData" is > vulnerable. This has been fixed in version 4.1.59.Final. As a workaround, one > may specify your own "java.io.tmpdir" when you start the JVM or use > "DefaultHttpDataFactory.setBaseDir(...)" to set the directory to something > that is only readable by the current user.| > > Upgrade netty version to avoid these potential risks -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail:
[jira] [Resolved] (SPARK-35153) Override `sql()` of ANSI interval operators
[ https://issues.apache.org/jira/browse/SPARK-35153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-35153. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 32262 [https://github.com/apache/spark/pull/32262] > Override `sql()` of ANSI interval operators > --- > > Key: SPARK-35153 > URL: https://issues.apache.org/jira/browse/SPARK-35153 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Fix For: 3.2.0 > > > Override the sql() method of the expression that implements operators over > ANSI interval, and make SQL representation more readable and potentially > parsable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35155) Add rule id to all ResolveXxx rules
Yingyi Bu created SPARK-35155: - Summary: Add rule id to all ResolveXxx rules Key: SPARK-35155 URL: https://issues.apache.org/jira/browse/SPARK-35155 Project: Spark Issue Type: Sub-task Components: Optimizer Affects Versions: 3.1.0 Reporter: Yingyi Bu All ResolveXxx rules run in a fixed-point batch and can benefit from rule-id-based pruning, regardless of whether there is a stop-condition lambda. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-34472) SparkContext.addJar with an ivy path fails in cluster mode with a custom ivySettings file
[ https://issues.apache.org/jira/browse/SPARK-34472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-34472. --- Fix Version/s: 3.2.0 Assignee: Shardul Mahadik Resolution: Fixed > SparkContext.addJar with an ivy path fails in cluster mode with a custom > ivySettings file > - > > Key: SPARK-34472 > URL: https://issues.apache.org/jira/browse/SPARK-34472 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.2.0 >Reporter: Shardul Mahadik >Assignee: Shardul Mahadik >Priority: Major > Fix For: 3.2.0 > > > SPARK-33084 introduced support for Ivy paths in {{sc.addJar}} or Spark SQL > {{ADD JAR}}. If we use a custom ivySettings file using > {{spark.jars.ivySettings}}, it is loaded at > [https://github.com/apache/spark/blob/b26e7b510bbaee63c4095ab47e75ff2a70e377d7/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L1280.] > However, this file is only accessible on the client machine. In cluster > mode, this file is not available on the driver and so {{addJar}} fails. > {code:sh} > spark-submit --master yarn --deploy-mode cluster --class IvyAddJarExample > --conf spark.jars.ivySettings=/path/to/ivySettings.xml example.jar > {code} > {code} > java.lang.IllegalArgumentException: requirement failed: Ivy settings file > /path/to/ivySettings.xml does not exist > at scala.Predef$.require(Predef.scala:281) > at > org.apache.spark.deploy.SparkSubmitUtils$.loadIvySettings(SparkSubmit.scala:1331) > at > org.apache.spark.util.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:176) > at > org.apache.spark.util.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:156) > at > org.apache.spark.sql.internal.SessionResourceLoader.resolveJars(SessionState.scala:166) > at > org.apache.spark.sql.hive.HiveSessionResourceLoader.addJar(HiveSessionStateBuilder.scala:133) > at > org.apache.spark.sql.execution.command.AddJarCommand.run(resources.scala:40) > {code} > We should ship the ivySettings file to the driver so that {{addJar}} is able > to find it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
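For context, the Ivy-path form of {{addJar}} that SPARK-33084 introduced, and that this fix makes usable in cluster mode with a custom ivySettings file, looks roughly like the sketch below. The coordinates and paths are placeholders, and in practice {{spark.jars.ivySettings}} is normally passed via --conf at submit time rather than set in the builder:
{code:scala}
import org.apache.spark.sql.SparkSession

// Placeholder settings path: per this fix, the client-side ivySettings file
// is shipped to the driver so that addJar can find it in cluster mode.
val spark = SparkSession.builder()
  .appName("ivy-addjar-sketch")
  .config("spark.jars.ivySettings", "/path/to/ivySettings.xml")
  .getOrCreate()

// Resolves the artifact and its transitive dependencies through Ivy.
spark.sparkContext.addJar("ivy://com.example:mylib:1.0.0")
// Equivalent SQL form:
spark.sql("ADD JAR ivy://com.example:mylib:1.0.0")
{code}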
[jira] [Commented] (SPARK-35044) Support retrieve hadoop configurations via SET syntax
[ https://issues.apache.org/jira/browse/SPARK-35044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325994#comment-17325994 ] Apache Spark commented on SPARK-35044: -- User 'yaooqinn' has created a pull request for this issue: https://github.com/apache/spark/pull/32263 > Support retrieve hadoop configurations via SET syntax > - > > Key: SPARK-35044 > URL: https://issues.apache.org/jira/browse/SPARK-35044 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Fix For: 3.2.0 > > > Currently, pure SQL users have few ways to see the Hadoop configurations > which may affect their jobs a lot. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
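A sketch of what this enables for SQL users, assuming the Hadoop key can be queried directly through the usual SET syntax (the key name below is only an example, and the exact lookup semantics are an assumption):
{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
// With this change, SET should be able to surface values from the session's
// Hadoop configuration, not only from the SQL conf.
spark.sql("SET mapreduce.job.reduces").show(truncate = false)
{code}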
[jira] [Commented] (SPARK-35044) Support retrieve hadoop configurations via SET syntax
[ https://issues.apache.org/jira/browse/SPARK-35044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325992#comment-17325992 ] Apache Spark commented on SPARK-35044: -- User 'yaooqinn' has created a pull request for this issue: https://github.com/apache/spark/pull/32263 > Support retrieve hadoop configurations via SET syntax > - > > Key: SPARK-35044 > URL: https://issues.apache.org/jira/browse/SPARK-35044 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Major > Fix For: 3.2.0 > > > Currently, pure SQL users have few ways to see the Hadoop configurations > which may affect their jobs a lot. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35154) Rpc env not shutdown when shutdown method call by endpoint onStop
[ https://issues.apache.org/jira/browse/SPARK-35154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] LIU updated SPARK-35154: Description: When I run this code, the RPC thread hangs and does not close gracefully. I think that when the RPC thread calls shutdown from the onStop method, it tries to post MessageLoop.PoisonPill so that the threads in the RPC pool return and stop. In Spark 3.x, this makes the other threads return and stop, but the current thread, which called onStop, then waits for its own pool to terminate. As a result, the current thread never stops and the program hangs. I'm not sure whether this needs to be improved or not.
{code:java}
// code placeholder
{code}
test("Rpc env not shutdown when shutdown method call by endpoint onStop") {
  val rpcEndpoint = new RpcEndpoint {
    override val rpcEnv: RpcEnv = env

    override def onStop(): Unit = {
      env.shutdown()
      env.awaitTermination()
    }

    override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
      case m => context.reply(m)
    }
  }
  env.setupEndpoint("test", rpcEndpoint)
  rpcEndpoint.stop()
  env.awaitTermination()
}
was: When I run this code, the RPC thread hangs and does not close gracefully. I think that when the RPC thread calls shutdown from the onStop method, it tries to post MessageLoop.PoisonPill so that the threads in the RPC pool return and stop. In Spark 3.x, this makes the other threads return and stop, but the current thread, which called onStop, then waits for its own pool to terminate. As a result, the current thread never stops and the program hangs. I'm not sure whether this needs to be improved or not.
{code:java}
// code placeholder
{code}
test("Rpc env not shutdown when shutdown method call by endpoint onStop") {
  val rpcEndpoint = new RpcEndpoint {
    override val rpcEnv: RpcEnv = env

    override def onStop(): Unit = {
      env.shutdown()
      env.awaitTermination()
    }

    override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
      case m => context.reply(m)
    }
  }
  env.setupEndpoint("test", rpcEndpoint)
  rpcEndpoint.stop()
  env.awaitTermination()
}
> Rpc env not shutdown when shutdown method call by endpoint onStop > - > > Key: SPARK-35154 > URL: https://issues.apache.org/jira/browse/SPARK-35154 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.0.0 > Environment: spark-3.x >Reporter: LIU >Priority: Major > > When I run this code, the RPC thread hangs and does not close gracefully. > I think that when the RPC thread calls shutdown from the onStop method, it tries to post > MessageLoop.PoisonPill so that the threads in the RPC pool return and stop. In Spark 3.x, > this makes the other threads return and stop, but the current thread, which called onStop, > then waits for its own pool to terminate. As a result, the current thread never stops and > the program hangs. > I'm not sure whether this needs to be improved or not. > > {code:java} > // code placeholder{code} > test("Rpc env not shutdown when shutdown method call by endpoint onStop") { > val rpcEndpoint = new RpcEndpoint { > override val rpcEnv: RpcEnv = env > override def onStop(): Unit = { > env.shutdown() > env.awaitTermination() > } > override def receiveAndReply(context: RpcCallContext): > PartialFunction[Any, Unit] = { > case m => context.reply(m) > } > } > env.setupEndpoint("test", rpcEndpoint) > rpcEndpoint.stop() > env.awaitTermination() > } > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35154) Rpc env not shutdown when shutdown method call by endpoint onStop
[ https://issues.apache.org/jira/browse/SPARK-35154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] LIU updated SPARK-35154: Description: When I run this code, the RPC thread hangs and does not close gracefully. I think that when the RPC thread calls shutdown from the onStop method, it tries to post MessageLoop.PoisonPill so that the threads in the RPC pool return and stop. In Spark 3.x, this makes the other threads return and stop, but the current thread, which called onStop, then waits for its own pool to terminate. As a result, the current thread never stops and the program hangs. I'm not sure whether this needs to be improved or not.
{code:java}
// code placeholder
{code}
test("Rpc env not shutdown when shutdown method call by endpoint onStop") {
  val rpcEndpoint = new RpcEndpoint {
    override val rpcEnv: RpcEnv = env

    override def onStop(): Unit = {
      env.shutdown()
      env.awaitTermination()
    }

    override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
      case m => context.reply(m)
    }
  }
  env.setupEndpoint("test", rpcEndpoint)
  rpcEndpoint.stop()
  env.awaitTermination()
}
was: When I run this code, the RPC thread hangs and does not close gracefully. I think that when the RPC thread calls shutdown from the onStop method, it tries to post MessageLoop.PoisonPill so that the threads in the RPC pool return and stop. In Spark 3.x, this makes the other threads return and stop, but the current thread, which called onStop, then waits for its own pool to terminate. As a result, the current thread never stops and the program hangs. I'm not sure whether this needs to be improved or not.
{code:java}
// code placeholder
{code}
test("Rpc env not shutdown when shutdown method call by endpoint onStop") {
  val rpcEndpoint = new RpcEndpoint {
    override val rpcEnv: RpcEnv = env

    override def onStop(): Unit = {
      env.shutdown()
      env.awaitTermination()
    }

    override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
      case m => context.reply(m)
    }
  }
  env.setupEndpoint("test", rpcEndpoint)
  rpcEndpoint.stop()
  env.awaitTermination()
}
> Rpc env not shutdown when shutdown method call by endpoint onStop > - > > Key: SPARK-35154 > URL: https://issues.apache.org/jira/browse/SPARK-35154 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.0.0 > Environment: spark-3.x >Reporter: LIU >Priority: Major > > When I run this code, the RPC thread hangs and does not close gracefully. > I think that when the RPC thread calls shutdown from the onStop method, it tries to post > MessageLoop.PoisonPill so that the threads in the RPC pool return and stop. In Spark 3.x, > this makes the other threads return and stop, but the current thread, which called onStop, > then waits for its own pool to terminate. As a result, the current thread never stops and > the program hangs. > I'm not sure whether this needs to be improved or not. > > {code:java} > // code placeholder{code} > test("Rpc env not shutdown when shutdown method call by endpoint onStop") { > val rpcEndpoint = new RpcEndpoint { > override val rpcEnv: RpcEnv = env > override def onStop(): Unit = { > env.shutdown() > env.awaitTermination() > } > override def receiveAndReply(context: RpcCallContext): > PartialFunction[Any, Unit] = { > case m => context.reply(m) > } > } > env.setupEndpoint("test", rpcEndpoint) > rpcEndpoint.stop() > env.awaitTermination() > } > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35154) Rpc env not shutdown when shutdown method call by endpoint onStop
[ https://issues.apache.org/jira/browse/SPARK-35154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] LIU updated SPARK-35154: Description: When I run this code, the RPC thread hangs and does not close gracefully. I think that when the RPC thread calls shutdown from the onStop method, it tries to post MessageLoop.PoisonPill so that the threads in the RPC pool return and stop. In Spark 3.x, this makes the other threads return and stop, but the current thread, which called onStop, then waits for its own pool to terminate. As a result, the current thread never stops and the program hangs. I'm not sure whether this needs to be improved or not.
{code:java}
// code placeholder
{code}
test("Rpc env not shutdown when shutdown method call by endpoint onStop") {
  val rpcEndpoint = new RpcEndpoint {
    override val rpcEnv: RpcEnv = env

    override def onStop(): Unit = {
      env.shutdown()
      env.awaitTermination()
    }

    override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
      case m => context.reply(m)
    }
  }
  env.setupEndpoint("test", rpcEndpoint)
  rpcEndpoint.stop()
  env.awaitTermination()
}
was: When I run this code, the RPC thread hangs and does not close gracefully. I think that when the RPC thread calls shutdown from the onStop method, it tries to post MessageLoop.PoisonPill so that the threads in the RPC pool return and stop. In Spark 3.x, this makes the other threads return and stop, but the current thread, which called onStop, then waits for its own pool to terminate. As a result, the current thread never stops and the program hangs. I'm not sure whether this needs to be improved or not.
{code:java}
// code placeholder
{code}
test("Rpc env not shutdown when shutdown method call by endpoint onStop") {
  val rpcEndpoint = new RpcEndpoint {
    override val rpcEnv: RpcEnv = env

    override def onStop(): Unit = {
      env.shutdown()
      env.awaitTermination()
    }

    override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
      case m => context.reply(m)
    }
  }
  env.setupEndpoint("test", rpcEndpoint)
  rpcEndpoint.stop()
  env.awaitTermination()
}
> Rpc env not shutdown when shutdown method call by endpoint onStop > - > > Key: SPARK-35154 > URL: https://issues.apache.org/jira/browse/SPARK-35154 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.0.0 > Environment: spark-3.x >Reporter: LIU >Priority: Major > > When I run this code, the RPC thread hangs and does not close gracefully. > I think that when the RPC thread calls shutdown from the onStop method, it tries to post > MessageLoop.PoisonPill so that the threads in the RPC pool return and stop. In Spark 3.x, > this makes the other threads return and stop, but the current thread, which called onStop, > then waits for its own pool to terminate. As a result, the current thread never stops and > the program hangs. > I'm not sure whether this needs to be improved or not. > > {code:java} > // code placeholder{code} > test("Rpc env not shutdown when shutdown method call by endpoint onStop") { > val rpcEndpoint = new RpcEndpoint { > override val rpcEnv: RpcEnv = env > override def onStop(): Unit = { > env.shutdown() > env.awaitTermination() > } > override def receiveAndReply(context: RpcCallContext): > PartialFunction[Any, Unit] = { > case m => context.reply(m) > } > } > env.setupEndpoint("test", rpcEndpoint) > rpcEndpoint.stop() > env.awaitTermination() > } > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35154) Rpc env not shutdown when shutdown method call by endpoint onStop
LIU created SPARK-35154: --- Summary: Rpc env not shutdown when shutdown method call by endpoint onStop Key: SPARK-35154 URL: https://issues.apache.org/jira/browse/SPARK-35154 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.0.0 Environment: spark-3.x Reporter: LIU When I run this code, the RPC thread hangs and does not close gracefully. I think that when the RPC thread calls shutdown from the onStop method, it tries to post MessageLoop.PoisonPill so that the threads in the RPC pool return and stop. In Spark 3.x, this makes the other threads return and stop, but the current thread, which called onStop, then waits for its own pool to terminate. As a result, the current thread never stops and the program hangs. I'm not sure whether this needs to be improved or not.
{code:java}
// code placeholder
{code}
test("Rpc env not shutdown when shutdown method call by endpoint onStop") {
  val rpcEndpoint = new RpcEndpoint {
    override val rpcEnv: RpcEnv = env

    override def onStop(): Unit = {
      env.shutdown()
      env.awaitTermination()
    }

    override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
      case m => context.reply(m)
    }
  }
  env.setupEndpoint("test", rpcEndpoint)
  rpcEndpoint.stop()
  env.awaitTermination()
}
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
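The hang described above is an instance of a general hazard: a task running on a pool's only thread blocking on that same pool's termination. A standalone illustration with plain java.util.concurrent (not Spark's RpcEnv):
{code:scala}
import java.util.concurrent.{Executors, TimeUnit}

object SelfShutdownDemo extends App {
  val pool = Executors.newFixedThreadPool(1)
  pool.submit(new Runnable {
    override def run(): Unit = {
      pool.shutdown()
      // This Runnable occupies the pool's only thread, so the pool cannot
      // reach the terminated state while we block inside it; the await can
      // only give up when the timeout expires.
      val terminated = pool.awaitTermination(5, TimeUnit.SECONDS)
      println(s"terminated while still inside the task? $terminated") // false
    }
  })
}
{code}
An await without a timeout, as in the reported onStop, never returns, which matches the pending program the reporter observes.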
[jira] [Updated] (SPARK-32288) [UI] Add failure summary table in stage page
[ https://issues.apache.org/jira/browse/SPARK-32288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhongwei Zhu updated SPARK-32288: - Summary: [UI] Add failure summary table in stage page (was: [UI] Add exception summary table in stage page) > [UI] Add failure summary table in stage page > > > Key: SPARK-32288 > URL: https://issues.apache.org/jira/browse/SPARK-32288 > Project: Spark > Issue Type: New Feature > Components: Web UI >Affects Versions: 3.0.0 >Reporter: Zhongwei Zhu >Priority: Major > > When there are many task failures during one stage, it's hard to find failure > patterns, for example by aggregating task failures by exception type and message. With > such information, we could easily see which type of failure is the root cause of the > stage failure. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34777) [UI] StagePage input size/records not show when records greater than zero
[ https://issues.apache.org/jira/browse/SPARK-34777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhongwei Zhu updated SPARK-34777: - Summary: [UI] StagePage input size/records not show when records greater than zero (was: [UI] StagePage input size records not show when records greater than zero) > [UI] StagePage input size/records not show when records greater than zero > - > > Key: SPARK-34777 > URL: https://issues.apache.org/jira/browse/SPARK-34777 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 3.1.1 >Reporter: Zhongwei Zhu >Priority: Minor > Attachments: No input size records.png > > > !No input size records.png|width=547,height=212! > The `Input Size / Records` metric should show in the summary metrics table and the task > columns when the input record count is greater than zero even though the byte count is > zero. One example is a Spark Streaming job reading from Kafka. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35153) Override `sql()` of ANSI interval operators
[ https://issues.apache.org/jira/browse/SPARK-35153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325925#comment-17325925 ] Apache Spark commented on SPARK-35153: -- User 'MaxGekk' has created a pull request for this issue: https://github.com/apache/spark/pull/32262 > Override `sql()` of ANSI interval operators > --- > > Key: SPARK-35153 > URL: https://issues.apache.org/jira/browse/SPARK-35153 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Override the sql() method of the expression that implements operators over > ANSI interval, and make SQL representation more readable and potentially > parsable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35153) Override `sql()` of ANSI interval operators
[ https://issues.apache.org/jira/browse/SPARK-35153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35153: Assignee: Max Gekk (was: Apache Spark) > Override `sql()` of ANSI interval operators > --- > > Key: SPARK-35153 > URL: https://issues.apache.org/jira/browse/SPARK-35153 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Override the sql() method of the expression that implements operators over > ANSI interval, and make SQL representation more readable and potentially > parsable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35153) Override `sql()` of ANSI interval operators
[ https://issues.apache.org/jira/browse/SPARK-35153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325924#comment-17325924 ] Apache Spark commented on SPARK-35153: -- User 'MaxGekk' has created a pull request for this issue: https://github.com/apache/spark/pull/32262 > Override `sql()` of ANSI interval operators > --- > > Key: SPARK-35153 > URL: https://issues.apache.org/jira/browse/SPARK-35153 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Override the sql() method of the expression that implements operators over > ANSI interval, and make SQL representation more readable and potentially > parsable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35153) Override `sql()` of ANSI interval operators
[ https://issues.apache.org/jira/browse/SPARK-35153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35153: Assignee: Apache Spark (was: Max Gekk) > Override `sql()` of ANSI interval operators > --- > > Key: SPARK-35153 > URL: https://issues.apache.org/jira/browse/SPARK-35153 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Assignee: Apache Spark >Priority: Major > > Override the sql() method of the expression that implements operators over > ANSI interval, and make SQL representation more readable and potentially > parsable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35153) Override `sql()` of ANSI interval operators
Max Gekk created SPARK-35153: Summary: Override `sql()` of ANSI interval operators Key: SPARK-35153 URL: https://issues.apache.org/jira/browse/SPARK-35153 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Max Gekk Assignee: Max Gekk Override the sql() method of the expression that implements operators over ANSI interval, and make SQL representation more readable and potentially parsable. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
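As a toy illustration of the idea, outside Spark's actual Expression API (all types below are hypothetical): overriding an operator node's sql() rendering is what turns an opaque default form into readable, potentially parsable SQL text:
{code:scala}
// Hypothetical mini expression tree, not Catalyst's.
sealed trait Expr { def sql: String }
case class Ref(name: String) extends Expr { def sql: String = name }
case class AddInterval(left: Expr, right: Expr) extends Expr {
  // A readable operator form instead of, say, a class-name dump.
  def sql: String = s"(${left.sql} + ${right.sql})"
}

object SqlRenderDemo extends App {
  println(AddInterval(Ref("ts"), Ref("INTERVAL '1' YEAR")).sql)
  // prints: (ts + INTERVAL '1' YEAR)
}
{code}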
[jira] [Commented] (SPARK-35151) Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-35151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325911#comment-17325911 ] Apache Spark commented on SPARK-35151: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/32261 > Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13 > -- > > Key: SPARK-35151 > URL: https://issues.apache.org/jira/browse/SPARK-35151 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Minor > > Add compile args to suppress compilation warnings as follows: > > {code:java} > [warn] > /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:38: > [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; > use Symbol("id") instead > [warn] val ds = spark.range(20).select(('id % 3).as("key"), > 'id).as[(Long, Long)] > [warn] ^ > [warn] > /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:58: > [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; > use Symbol("id") instead > [warn] val ds = spark.range(20).select(('id % 3).as("key"), > 'id).as[(Long, Long)] > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
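For reference, the rewrite that the 2.13 deprecation message suggests looks like this sketch; Symbol("id") keeps the old semantics, and $"id" via spark.implicits._ is an equally warning-free alternative:
{code:scala}
import org.apache.spark.sql.SparkSession

object SymbolLiteralMigration extends App {
  val spark = SparkSession.builder().master("local[*]").getOrCreate()
  import spark.implicits._

  // Deprecated under Scala 2.13:
  //   spark.range(20).select(('id % 3).as("key"), 'id).as[(Long, Long)]
  // Warning-free equivalent:
  val ds = spark.range(20)
    .select((Symbol("id") % 3).as("key"), Symbol("id"))
    .as[(Long, Long)]
  ds.show()
}
{code}
The ticket itself takes the other route and suppresses the warning with compiler args, which keeps the shorter symbol-literal syntax in existing sources.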
[jira] [Assigned] (SPARK-35151) Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-35151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35151: Assignee: Apache Spark > Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13 > -- > > Key: SPARK-35151 > URL: https://issues.apache.org/jira/browse/SPARK-35151 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.2.0 >Reporter: Yang Jie >Assignee: Apache Spark >Priority: Minor > > Add compile args to suppress compilation warnings as follows: > > {code:java} > [warn] > /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:38: > [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; > use Symbol("id") instead > [warn] val ds = spark.range(20).select(('id % 3).as("key"), > 'id).as[(Long, Long)] > [warn] ^ > [warn] > /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:58: > [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; > use Symbol("id") instead > [warn] val ds = spark.range(20).select(('id % 3).as("key"), > 'id).as[(Long, Long)] > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35151) Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-35151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35151: Assignee: (was: Apache Spark) > Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13 > -- > > Key: SPARK-35151 > URL: https://issues.apache.org/jira/browse/SPARK-35151 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Minor > > Add compile args to suppress compilation warnings as follows: > > {code:java} > [warn] > /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:38: > [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; > use Symbol("id") instead > [warn] val ds = spark.range(20).select(('id % 3).as("key"), > 'id).as[(Long, Long)] > [warn] ^ > [warn] > /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:58: > [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; > use Symbol("id") instead > [warn] val ds = spark.range(20).select(('id % 3).as("key"), > 'id).as[(Long, Long)] > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35151) Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-35151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325910#comment-17325910 ] Apache Spark commented on SPARK-35151: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/32261 > Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13 > -- > > Key: SPARK-35151 > URL: https://issues.apache.org/jira/browse/SPARK-35151 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Minor > > Add compile args to suppress compilation warnings as follows: > > {code:java} > [warn] > /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:38: > [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; > use Symbol("id") instead > [warn] val ds = spark.range(20).select(('id % 3).as("key"), > 'id).as[(Long, Long)] > [warn] ^ > [warn] > /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:58: > [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; > use Symbol("id") instead > [warn] val ds = spark.range(20).select(('id % 3).as("key"), > 'id).as[(Long, Long)] > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-35145) CurrentOrigin should support nested invoking
[ https://issues.apache.org/jira/browse/SPARK-35145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-35145. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 32249 [https://github.com/apache/spark/pull/32249] > CurrentOrigin should support nested invoking > > > Key: SPARK-35145 > URL: https://issues.apache.org/jira/browse/SPARK-35145 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > Fix For: 3.2.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35145) CurrentOrigin should support nested invoking
[ https://issues.apache.org/jira/browse/SPARK-35145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-35145: --- Assignee: Wenchen Fan > CurrentOrigin should support nested invoking > > > Key: SPARK-35145 > URL: https://issues.apache.org/jira/browse/SPARK-35145 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.2.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35152) ANSI mode: IntegralDivide throws exception on overflow
[ https://issues.apache.org/jira/browse/SPARK-35152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325894#comment-17325894 ] Apache Spark commented on SPARK-35152: -- User 'gengliangwang' has created a pull request for this issue: https://github.com/apache/spark/pull/32260 > ANSI mode: IntegralDivide throws exception on overflow > -- > > Key: SPARK-35152 > URL: https://issues.apache.org/jira/browse/SPARK-35152 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > > IntegralDivide throws an exception on overflow. > There is only one case that can cause that: > ``` > Long.MinValue div -1 > ``` -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35152) ANSI mode: IntegralDivide throws exception on overflow
[ https://issues.apache.org/jira/browse/SPARK-35152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35152: Assignee: Apache Spark (was: Gengliang Wang) > ANSI mode: IntegralDivide throws exception on overflow > -- > > Key: SPARK-35152 > URL: https://issues.apache.org/jira/browse/SPARK-35152 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gengliang Wang >Assignee: Apache Spark >Priority: Major > > IntegralDivide throws an exception on overflow. > There is only one case that can cause that: > ``` > Long.MinValue div -1 > ``` -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35152) ANSI mode: IntegralDivide throws exception on overflow
[ https://issues.apache.org/jira/browse/SPARK-35152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325892#comment-17325892 ] Apache Spark commented on SPARK-35152: -- User 'gengliangwang' has created a pull request for this issue: https://github.com/apache/spark/pull/32260 > ANSI mode: IntegralDivide throws exception on overflow > -- > > Key: SPARK-35152 > URL: https://issues.apache.org/jira/browse/SPARK-35152 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > > IntegralDivide throws an exception on overflow. > There is only one case that can cause that: > ``` > Long.MinValue div -1 > ``` -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35152) ANSI mode: IntegralDivide throws exception on overflow
[ https://issues.apache.org/jira/browse/SPARK-35152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35152: Assignee: Gengliang Wang (was: Apache Spark) > ANSI mode: IntegralDivide throws exception on overflow > -- > > Key: SPARK-35152 > URL: https://issues.apache.org/jira/browse/SPARK-35152 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > > IntegralDivide throws an exception on overflow. > There is only one case that can cause that: > ``` > Long.MinValue div -1 > ``` -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-34338) Report metrics from Datasource v2 scan
[ https://issues.apache.org/jira/browse/SPARK-34338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-34338. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 31451 [https://github.com/apache/spark/pull/31451] > Report metrics from Datasource v2 scan > -- > > Key: SPARK-34338 > URL: https://issues.apache.org/jira/browse/SPARK-34338 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 3.2.0 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > Fix For: 3.2.0 > > > This is related to SPARK-34297. > In SPARK-34297, we want to add a couple of useful metrics when reading from > Kafka in SS. We need some public API change in DS v2 to make it possible. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35152) ANSI mode: IntegralDivide throws exception on overflow
Gengliang Wang created SPARK-35152: -- Summary: ANSI mode: IntegralDivide throws exception on overflow Key: SPARK-35152 URL: https://issues.apache.org/jira/browse/SPARK-35152 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: Gengliang Wang Assignee: Gengliang Wang IntegralDivide throws an exception on overflow. There is only one case that can cause that: ``` Long.MinValue div -1 ``` -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
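The single overflowing case is easy to demonstrate on the JVM, where integral division wraps silently instead of throwing, which is exactly the behavior ANSI mode replaces with an error:
{code:scala}
object IntegralDivideOverflow extends App {
  // -(Long.MinValue) = 9223372036854775808 does not fit in a Long, so the
  // JVM wraps around and the division yields Long.MinValue itself.
  println(Long.MinValue / -1L)                  // -9223372036854775808
  println(Long.MinValue / -1L == Long.MinValue) // true
}
{code}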
[jira] [Resolved] (SPARK-34035) Refactor ScriptTransformation to remove input parameter and replace it by child.output
[ https://issues.apache.org/jira/browse/SPARK-34035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-34035. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 32228 [https://github.com/apache/spark/pull/32228] > Refactor ScriptTransformation to remove input parameter and replace it by > child.output > --- > > Key: SPARK-34035 > URL: https://issues.apache.org/jira/browse/SPARK-34035 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.2.0 > > > According to discussion here > https://github.com/apache/spark/pull/29087#discussion_r552625920 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-34035) Refactor ScriptTransformation to remove input parameter and replace it by child.output
[ https://issues.apache.org/jira/browse/SPARK-34035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-34035: --- Assignee: angerszhu > Refactor ScriptTransformation to remove input parameter and replace it by > child.output > --- > > Key: SPARK-34035 > URL: https://issues.apache.org/jira/browse/SPARK-34035 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > > According to discussion here > https://github.com/apache/spark/pull/29087#discussion_r552625920 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35151) Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-35151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-35151: - Description: Add compile args to suppress compilation warnings as follows: {code:java} [warn] /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:38: [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; use Symbol("id") instead [warn] val ds = spark.range(20).select(('id % 3).as("key"), 'id).as[(Long, Long)] [warn] ^ [warn] /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:58: [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; use Symbol("id") instead [warn] val ds = spark.range(20).select(('id % 3).as("key"), 'id).as[(Long, Long)] {code} was:Add compile args to suppress > Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13 > -- > > Key: SPARK-35151 > URL: https://issues.apache.org/jira/browse/SPARK-35151 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Minor > > Add compile args to suppress compilation warnings as follows: > > {code:java} > [warn] > /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:38: > [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; > use Symbol("id") instead > [warn] val ds = spark.range(20).select(('id % 3).as("key"), > 'id).as[(Long, Long)] > [warn] ^ > [warn] > /home/kou/work/oss/spark-scala-2.13/examples/src/main/scala/org/apache/spark/examples/sql/SimpleTypedAggregator.scala:34:58: > [deprecation @ | origin= | version=2.13.0] symbol literal is deprecated; > use Symbol("id") instead > [warn] val ds = spark.range(20).select(('id % 3).as("key"), > 'id).as[(Long, Long)] > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-35151) Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13
[ https://issues.apache.org/jira/browse/SPARK-35151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-35151: - Description: Add compile args to suppress > Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13 > -- > > Key: SPARK-35151 > URL: https://issues.apache.org/jira/browse/SPARK-35151 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 3.2.0 >Reporter: Yang Jie >Priority: Minor > > Add compile args to suppress -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35151) Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13
Yang Jie created SPARK-35151: Summary: Suppress `symbol literal is deprecated` compilation warnings in Scala 2.13 Key: SPARK-35151 URL: https://issues.apache.org/jira/browse/SPARK-35151 Project: Spark Issue Type: Sub-task Components: Build Affects Versions: 3.2.0 Reporter: Yang Jie -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35150) Accelerate fallback BLAS with dev.ludovic.netlib
[ https://issues.apache.org/jira/browse/SPARK-35150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35150: Assignee: (was: Apache Spark) > Accelerate fallback BLAS with dev.ludovic.netlib > > > Key: SPARK-35150 > URL: https://issues.apache.org/jira/browse/SPARK-35150 > Project: Spark > Issue Type: Improvement > Components: GraphX, ML, MLlib >Affects Versions: 3.2.0 >Reporter: Ludovic Henry >Priority: Major > > Following https://github.com/apache/spark/pull/30810, I've continued looking > for ways to accelerate the usage of BLAS in Spark. With this PR, I integrate > work done in the [{{dev.ludovic.netlib}}|https://github.com/luhenry/netlib/] > Maven package. > The {{dev.ludovic.netlib}} library wraps the original > {{com.github.fommil.netlib}} library and focuses on accelerating the linear > algebra routines in use in Spark. When running the > {{org.apache.spark.ml.linalg.BLASBenchmark}} benchmarking suite, I get the > results at [1] on an Intel machine. Moreover, this library is thoroughly > tested to return the exact same results as the reference implementation. > Under the hood, it reimplements the necessary algorithms in pure > autovectorization-friendly Java 8, and takes advantage of the Vector > API and Foreign Linker API introduced in JDK 16 when available. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35150) Accelerate fallback BLAS with dev.ludovic.netlib
[ https://issues.apache.org/jira/browse/SPARK-35150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325842#comment-17325842 ] Apache Spark commented on SPARK-35150: -- User 'luhenry' has created a pull request for this issue: https://github.com/apache/spark/pull/32253 > Accelerate fallback BLAS with dev.ludovic.netlib > > > Key: SPARK-35150 > URL: https://issues.apache.org/jira/browse/SPARK-35150 > Project: Spark > Issue Type: Improvement > Components: GraphX, ML, MLlib >Affects Versions: 3.2.0 >Reporter: Ludovic Henry >Priority: Major > > Following https://github.com/apache/spark/pull/30810, I've continued looking > for ways to accelerate the usage of BLAS in Spark. With this PR, I integrate > work done in the [{{dev.ludovic.netlib}}|https://github.com/luhenry/netlib/] > Maven package. > The {{dev.ludovic.netlib}} library wraps the original > {{com.github.fommil.netlib}} library and focuses on accelerating the linear > algebra routines in use in Spark. When running the > {{org.apache.spark.ml.linalg.BLASBenchmark}} benchmarking suite, I get the > results at [1] on an Intel machine. Moreover, this library is thoroughly > tested to return the exact same results as the reference implementation. > Under the hood, it reimplements the necessary algorithms in pure > autovectorization-friendly Java 8, and takes advantage of the Vector > API and Foreign Linker API introduced in JDK 16 when available. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35150) Accelerate fallback BLAS with dev.ludovic.netlib
[ https://issues.apache.org/jira/browse/SPARK-35150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35150: Assignee: Apache Spark > Accelerate fallback BLAS with dev.ludovic.netlib > > > Key: SPARK-35150 > URL: https://issues.apache.org/jira/browse/SPARK-35150 > Project: Spark > Issue Type: Improvement > Components: GraphX, ML, MLlib >Affects Versions: 3.2.0 >Reporter: Ludovic Henry >Assignee: Apache Spark >Priority: Major > > Following https://github.com/apache/spark/pull/30810, I've continued looking > for ways to accelerate the usage of BLAS in Spark. With this PR, I integrate > work done in the [{{dev.ludovic.netlib}}|https://github.com/luhenry/netlib/] > Maven package. > The {{dev.ludovic.netlib}} library wraps the original > {{com.github.fommil.netlib}} library and focuses on accelerating the linear > algebra routines in use in Spark. When running the > {{org.apache.spark.ml.linalg.BLASBenchmark}} benchmarking suite, I get the > results at [1] on an Intel machine. Moreover, this library is thoroughly > tested to return the exact same results as the reference implementation. > Under the hood, it reimplements the necessary algorithms in pure > autovectorization-friendly Java 8, and takes advantage of the Vector > API and Foreign Linker API introduced in JDK 16 when available. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35150) Accelerate fallback BLAS with dev.ludovic.netlib
Ludovic Henry created SPARK-35150: - Summary: Accelerate fallback BLAS with dev.ludovic.netlib Key: SPARK-35150 URL: https://issues.apache.org/jira/browse/SPARK-35150 Project: Spark Issue Type: Improvement Components: GraphX, ML, MLlib Affects Versions: 3.2.0 Reporter: Ludovic Henry Following https://github.com/apache/spark/pull/30810, I've continued looking for ways to accelerate the usage of BLAS in Spark. With this PR, I integrate work done in the [{{dev.ludovic.netlib}}|https://github.com/luhenry/netlib/] Maven package. The {{dev.ludovic.netlib}} library wraps the original {{com.github.fommil.netlib}} library and focuses on accelerating the linear algebra routines in use in Spark. When running the {{org.apache.spark.ml.linalg.BLASBenchmark}} benchmarking suite, I get the results at [1] on an Intel machine. Moreover, this library is thoroughly tested to return the exact same results as the reference implementation. Under the hood, it reimplements the necessary algorithms in pure autovectorization-friendly Java 8, and takes advantage of the Vector API and Foreign Linker API introduced in JDK 16 when available. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
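[Editor's note] For context on the API involved in SPARK-35150: a minimal sketch of a level-1 BLAS call through the netlib-java-style interface the ticket builds on. The {{dev.ludovic.netlib}} entry point below is an assumption (the wrapper is described as mirroring the {{com.github.fommil.netlib}} API); the instance returned by getInstance() is whichever implementation is fastest on the current JVM.
{code:java}
import dev.ludovic.netlib.BLAS // assumed entry point; mirrors com.github.fommil.netlib.BLAS

object DaxpyExample {
  def main(args: Array[String]): Unit = {
    val blas = BLAS.getInstance() // picks the fastest available implementation
    val x = Array(1.0, 2.0, 3.0)
    val y = Array(4.0, 5.0, 6.0)
    // daxpy computes y := 2.0 * x + y, in place
    blas.daxpy(x.length, 2.0, x, 1, y, 1)
    println(y.mkString(", ")) // 6.0, 9.0, 12.0
  }
}
{code}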
[jira] [Commented] (SPARK-35113) Support ANSI intervals in the Hash expression
[ https://issues.apache.org/jira/browse/SPARK-35113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325827#comment-17325827 ] Apache Spark commented on SPARK-35113: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/32259 > Support ANSI intervals in the Hash expression > - > > Key: SPARK-35113 > URL: https://issues.apache.org/jira/browse/SPARK-35113 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Priority: Major > > Handle YearMonthIntervalType and DayTimeIntervalType in HashExpression. And > write tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35113) Support ANSI intervals in the Hash expression
[ https://issues.apache.org/jira/browse/SPARK-35113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35113: Assignee: Apache Spark > Support ANSI intervals in the Hash expression > - > > Key: SPARK-35113 > URL: https://issues.apache.org/jira/browse/SPARK-35113 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Assignee: Apache Spark >Priority: Major > > Handle YearMonthIntervalType and DayTimeIntervalType in HashExpression. And > write tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35113) Support ANSI intervals in the Hash expression
[ https://issues.apache.org/jira/browse/SPARK-35113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35113: Assignee: (was: Apache Spark) > Support ANSI intervals in the Hash expression > - > > Key: SPARK-35113 > URL: https://issues.apache.org/jira/browse/SPARK-35113 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Priority: Major > > Handle YearMonthIntervalType and DayTimeIntervalType in HashExpression. And > write tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35113) Support ANSI intervals in the Hash expression
[ https://issues.apache.org/jira/browse/SPARK-35113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325825#comment-17325825 ] Apache Spark commented on SPARK-35113: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/32259 > Support ANSI intervals in the Hash expression > - > > Key: SPARK-35113 > URL: https://issues.apache.org/jira/browse/SPARK-35113 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Priority: Major > > Handle YearMonthIntervalType and DayTimeIntervalType in HashExpression. And > write tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
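[Editor's note] For context on what "handle the new types in HashExpression" amounts to in SPARK-35113: both ANSI interval types have simple integral physical representations (YearMonthIntervalType is an Int count of months, DayTimeIntervalType a Long count of microseconds), so hashing can reuse the existing integral hash paths. A self-contained sketch under that assumption; the type names and hash kernels below are illustrative placeholders, not Spark's Murmur3/xxHash64 code:
{code:java}
object IntervalHashSketch extends App {
  // Placeholder hash kernels standing in for Spark's Murmur3/xxHash64 implementations.
  def hashInt(i: Int, seed: Long): Long = seed * 31 + i
  def hashLong(l: Long, seed: Long): Long = seed * 31 + l

  sealed trait IntervalValue
  final case class YearMonthInterval(months: Int) extends IntervalValue
  final case class DayTimeInterval(micros: Long) extends IntervalValue

  // The whole change is dispatch: route each interval type to the integral
  // hash path that matches its physical representation.
  def hashInterval(v: IntervalValue, seed: Long): Long = v match {
    case YearMonthInterval(m) => hashInt(m, seed)
    case DayTimeInterval(us)  => hashLong(us, seed)
  }

  println(hashInterval(YearMonthInterval(14), seed = 42L))         // 14 months, i.e. 1-2
  println(hashInterval(DayTimeInterval(86400000000L), seed = 42L)) // 1 day in microseconds
}
{code}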
[jira] [Resolved] (SPARK-34877) Add Spark AM Log link in case of master as yarn and deploy mode as client
[ https://issues.apache.org/jira/browse/SPARK-34877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-34877. --- Fix Version/s: 3.2.0 Assignee: Saurabh Chawla Resolution: Fixed > Add Spark AM Log link in case of master as yarn and deploy mode as client > - > > Key: SPARK-34877 > URL: https://issues.apache.org/jira/browse/SPARK-34877 > Project: Spark > Issue Type: Improvement > Components: Spark Core, YARN >Affects Versions: 3.1.1 >Reporter: Saurabh Chawla >Assignee: Saurabh Chawla >Priority: Minor > Fix For: 3.2.0 > > > When running a Spark job with YARN as master and deploy mode as client, the > Spark driver and the Spark Application Master launch in two separate containers. In various > scenarios there is a need to see the Spark Application Master logs to check the > resource allocation, decommissioning status, and other information shared > between the YARN RM and the Spark Application Master. > Till now, the only way to check this is by finding the container id of the AM and > checking the logs either with the YARN utility or the YARN RM Application History > server. > This Jira is for adding the Spark AM log link for Spark jobs running in > client mode on YARN, so that instead of searching for the container id and then finding the > logs, we can check them directly in the Spark UI. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35120) Guide users to sync branch and enable GitHub Actions in their forked repository
[ https://issues.apache.org/jira/browse/SPARK-35120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325810#comment-17325810 ] Apache Spark commented on SPARK-35120: -- User 'HyukjinKwon' has created a pull request for this issue: https://github.com/apache/spark/pull/32258 > Guide users to sync branch and enable GitHub Actions in their forked > repository > --- > > Key: SPARK-35120 > URL: https://issues.apache.org/jira/browse/SPARK-35120 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 3.2.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 3.2.0 > > > If developers don't enable GitHub Actions in their fork, the PR builds cannot > run. We should guide them to enable it. > Also, the branch should be synced to the latest master branch. > We could leverage the Action Required status in GitHub checks: > https://docs.github.com/en/rest/guides/getting-started-with-the-checks-api -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33976) Add a dedicated SQL document page for the TRANSFORM-related functionality,
[ https://issues.apache.org/jira/browse/SPARK-33976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325800#comment-17325800 ] Apache Spark commented on SPARK-33976: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/32257 > Add a dedicated SQL document page for the TRANSFORM-related functionality, > -- > > Key: SPARK-33976 > URL: https://issues.apache.org/jira/browse/SPARK-33976 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.2.0 > > > Add doc about transform > https://github.com/apache/spark/pull/30973#issuecomment-753715318 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-33976) Add a dedicated SQL document page for the TRANSFORM-related functionality,
[ https://issues.apache.org/jira/browse/SPARK-33976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325801#comment-17325801 ] Apache Spark commented on SPARK-33976: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/32257 > Add a dedicated SQL document page for the TRANSFORM-related functionality, > -- > > Key: SPARK-33976 > URL: https://issues.apache.org/jira/browse/SPARK-33976 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.2.0 > > > Add doc about transform > https://github.com/apache/spark/pull/30973#issuecomment-753715318 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-31225) Override `sql` method for OuterReference
[ https://issues.apache.org/jira/browse/SPARK-31225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325792#comment-17325792 ] Apache Spark commented on SPARK-31225: -- User 'yaooqinn' has created a pull request for this issue: https://github.com/apache/spark/pull/32256 > Override `sql` method for OuterReference > - > > Key: SPARK-31225 > URL: https://issues.apache.org/jira/browse/SPARK-31225 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Minor > Fix For: 3.0.0 > > > OuterReference is a LeafExpression, so its children are Nil, which makes its > SQL representation always be outer(). This makes our explain command output and > error messages unclear when an OuterReference exists -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-31225) Override `sql` method for OuterReference
[ https://issues.apache.org/jira/browse/SPARK-31225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325791#comment-17325791 ] Apache Spark commented on SPARK-31225: -- User 'yaooqinn' has created a pull request for this issue: https://github.com/apache/spark/pull/32256 > Override `sql` method for OuterReference > - > > Key: SPARK-31225 > URL: https://issues.apache.org/jira/browse/SPARK-31225 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.0 >Reporter: Kent Yao >Assignee: Kent Yao >Priority: Minor > Fix For: 3.0.0 > > > OuterReference is a LeafExpression, so its children are Nil, which makes its > SQL representation always be outer(). This makes our explain command output and > error messages unclear when an OuterReference exists -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
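[Editor's note] For context on SPARK-31225, a self-contained analogue of the bug and of what the proposed override would do. The class names below are illustrative, not Spark's Catalyst classes, and the production fix may differ in detail:
{code:java}
trait Expr { def sql: String }
case class Column(name: String) extends Expr { def sql: String = name }

// Before: the node is a leaf (no children), so the generic leaf rendering
// has nothing to print inside the parentheses and emits a bare "outer()".
case class OuterRefBefore(e: Expr) extends Expr { def sql: String = "outer()" }

// After: override sql to delegate to the referenced expression's own SQL text.
case class OuterRefAfter(e: Expr) extends Expr { def sql: String = s"outer(${e.sql})" }

object OuterSqlDemo extends App {
  println(OuterRefBefore(Column("t1.a")).sql) // outer()
  println(OuterRefAfter(Column("t1.a")).sql)  // outer(t1.a)
}
{code}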
[jira] [Commented] (SPARK-35108) Pickle produces incorrect key labels for GenericRowWithSchema (data corruption)
[ https://issues.apache.org/jira/browse/SPARK-35108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325764#comment-17325764 ] Hyukjin Kwon commented on SPARK-35108: -- Thanks for cc'ing me [~tgraves]. I will take a look early next week if no one takes this one. > Pickle produces incorrect key labels for GenericRowWithSchema (data > corruption) > --- > > Key: SPARK-35108 > URL: https://issues.apache.org/jira/browse/SPARK-35108 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.1, 3.0.2 >Reporter: Robert Joseph Evans >Priority: Blocker > Labels: correctness > Attachments: test.py, test.sh > > > I think this also shows up for all versions of Spark that pickle the data > when doing a collect from Python. > When you do a collect in Python, Java will do a collect and convert the > UnsafeRows into GenericRowWithSchema instances before it sends them to the > Pickler. The Pickler, by default, will try to dedupe objects using hashCode > and .equals for the object. But .equals and .hashCode for > GenericRowWithSchema only look at the data, not the schema. But when we > pickle the row, the keys from the schema are written out. > This can result in data corruption, sort of, in a few cases where a row has > the same number of elements as a struct within the row does, or a sub-struct > within another struct. > If the data happens to be the same, the keys for the resulting row or struct > can be wrong. > My repro case is a bit convoluted, but it does happen. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
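[Editor's note] For context on SPARK-35108, a minimal demonstration of the equality behavior described above: two rows with identical values but different schemas compare equal, which is exactly what lets a value-deduplicating pickler substitute one for the other and emit the wrong field names. A sketch runnable against Spark's Catalyst classes:
{code:java}
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

object RowEqualityDemo extends App {
  val outer = StructType(Seq(StructField("a", IntegerType), StructField("b", IntegerType)))
  val inner = StructType(Seq(StructField("x", IntegerType), StructField("y", IntegerType)))

  val r1 = new GenericRowWithSchema(Array[Any](1, 2), outer)
  val r2 = new GenericRowWithSchema(Array[Any](1, 2), inner)

  // equals/hashCode come from Row and compare only the values, not the schema.
  println(r1 == r2)                   // true
  println(r1.hashCode == r2.hashCode) // true
  println(r1.schema == r2.schema)     // false -- the part the dedupe ignores
}
{code}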
[jira] [Commented] (SPARK-35143) Add default log config for spark-sql
[ https://issues.apache.org/jira/browse/SPARK-35143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325696#comment-17325696 ] Apache Spark commented on SPARK-35143: -- User 'ChenDou2021' has created a pull request for this issue: https://github.com/apache/spark/pull/32254 > Add default log config for spark-sql > > > Key: SPARK-35143 > URL: https://issues.apache.org/jira/browse/SPARK-35143 > Project: Spark > Issue Type: Improvement > Components: Spark Shell, SQL >Affects Versions: 3.1.1 >Reporter: hong dongdong >Priority: Minor > > The default log level for spark-sql is WARN, and how to change it is > confusing; we need a default config. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
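[Editor's note] Until a default config ships for SPARK-35143, a common workaround is to set the level programmatically from a session. This is a sketch of one option, not the fix the ticket proposes (the spark-sql CLI itself would still need a log4j.properties file under conf/):
{code:java}
import org.apache.spark.sql.SparkSession

object LogLevelExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("log-level-demo")
      .master("local[*]")
      .getOrCreate()

    // Valid levels include ALL, DEBUG, INFO, WARN, ERROR, FATAL, OFF.
    spark.sparkContext.setLogLevel("INFO")

    spark.sql("SELECT 1").show()
    spark.stop()
  }
}
{code}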
[jira] [Assigned] (SPARK-33976) Add a dedicated SQL document page for the TRANSFORM-related functionality,
[ https://issues.apache.org/jira/browse/SPARK-33976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-33976: --- Assignee: angerszhu > Add a dedicated SQL document page for the TRANSFORM-related functionality, > -- > > Key: SPARK-33976 > URL: https://issues.apache.org/jira/browse/SPARK-33976 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > > Add doc about transform > https://github.com/apache/spark/pull/30973#issuecomment-753715318 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-33976) Add a dedicated SQL document page for the TRANSFORM-related functionality,
[ https://issues.apache.org/jira/browse/SPARK-33976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-33976. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 31010 [https://github.com/apache/spark/pull/31010] > Add a dedicated SQL document page for the TRANSFORM-related functionality, > -- > > Key: SPARK-33976 > URL: https://issues.apache.org/jira/browse/SPARK-33976 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.2.0 > > > Add doc about transform > https://github.com/apache/spark/pull/30973#issuecomment-753715318 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35149) I am facing this issue regularly, how to fix this issue.
Eppa Rakesh created SPARK-35149: --- Summary: I am facing this issue regularly, how to fix this issue. Key: SPARK-35149 URL: https://issues.apache.org/jira/browse/SPARK-35149 Project: Spark Issue Type: Question Components: Spark Submit Affects Versions: 2.2.2 Reporter: Eppa Rakesh 21/04/19 21:02:11 WARN hdfs.DataStreamer: Exception for BP-823308525-10.56.47.77-1544458538172:blk_1170699623_96969312 java.io.EOFException: Unexpected EOF while trying to read response from server at org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:448) at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213) at org.apache.hadoop.hdfs.DataStreamer$ResponseProcessor.run(DataStreamer.java:1086) 21/04/19 21:04:01 WARN hdfs.DataStreamer: Error Recovery for BP-823308525-10.56.47.77-1544458538172:blk_1170699623_96969312 in pipeline [DatanodeInfoWithStorage[10.34.39.42:9866,DS-0ad94d03-fa3f-486b-b204-3e8d2df91f17,DISK], DatanodeInfoWithStorage[10.56.47.67:9866,DS-c28dab54-8fa0-4a49-80ec-345cc0cc52bd,DISK], DatanodeInfoWithStorage[10.56.47.55:9866,DS-79f5dd22-d0bc-4fe0-8e50-8a570779de17,DISK]]: datanode 0(DatanodeInfoWithStorage[10.56.47.36:9866,DS-0ad94d03-fa3f-486b-b204-3e8d2df91f17,DISK]) is bad. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-35068) Add tests for ANSI intervals to HiveThriftBinaryServerSuite
[ https://issues.apache.org/jira/browse/SPARK-35068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-35068: Assignee: angerszhu > Add tests for ANSI intervals to HiveThriftBinaryServerSuite > --- > > Key: SPARK-35068 > URL: https://issues.apache.org/jira/browse/SPARK-35068 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > > Add tests for year-month and day-time intervals to > HiveThriftBinaryServerSuite similar to: > # Query Intervals in VIEWs through thrift server > # Support interval type -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-35068) Add tests for ANSI intervals to HiveThriftBinaryServerSuite
[ https://issues.apache.org/jira/browse/SPARK-35068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-35068. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 32250 [https://github.com/apache/spark/pull/32250] > Add tests for ANSI intervals to HiveThriftBinaryServerSuite > --- > > Key: SPARK-35068 > URL: https://issues.apache.org/jira/browse/SPARK-35068 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.2.0 > > > Add tests for year-month and day-time intervals to > HiveThriftBinaryServerSuite similar to: > # Query Intervals in VIEWs through thrift server > # Support interval type -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-34526) Skip checking glob path in FileStreamSink.hasMetadata
[ https://issues.apache.org/jira/browse/SPARK-34526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuanjian Li updated SPARK-34526: Description: When checking the path in {{FileStreamSink.hasMetadata}}, we should ignore the error and assume the user wants to read a batch output. This is to keep the original behavior of ignoring the error. (was: Some users may use a very long glob path to read and `isDirectory` may fail when the path is too long. We should ignore the error when the path is a glob path since the file streaming sink doesn’t support glob paths.) > Skip checking glob path in FileStreamSink.hasMetadata > - > > Key: SPARK-34526 > URL: https://issues.apache.org/jira/browse/SPARK-34526 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 3.1.0 >Reporter: Yuanjian Li >Priority: Major > > When checking the path in {{FileStreamSink.hasMetadata}}, we should ignore > the error and assume the user wants to read a batch output. This is to keep > the original behavior of ignoring the error. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
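[Editor's note] For context on SPARK-34526, a minimal sketch of the guard described above, under the assumption that the check simply swallows non-fatal probing errors; the function name and structure are illustrative rather than the exact patch. "_spark_metadata" is the directory FileStreamSink writes its log to:
{code:java}
import scala.util.control.NonFatal
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

object HasMetadataSketch {
  // If probing the path fails for any reason (e.g. an overly long glob),
  // treat the input as plain batch output instead of failing the query.
  def hasStreamingMetadata(pathStr: String, conf: Configuration): Boolean = {
    try {
      val path = new Path(pathStr)
      val fs = path.getFileSystem(conf)
      fs.isDirectory(path) && fs.exists(new Path(path, "_spark_metadata"))
    } catch {
      case NonFatal(_) => false // assume the user wants to read a batch output
    }
  }
}
{code}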
[jira] [Commented] (SPARK-35113) Support ANSI intervals in the Hash expression
[ https://issues.apache.org/jira/browse/SPARK-35113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325670#comment-17325670 ] Max Gekk commented on SPARK-35113: -- [~angerszhuuu] Feel free to take this. > Support ANSI intervals in the Hash expression > - > > Key: SPARK-35113 > URL: https://issues.apache.org/jira/browse/SPARK-35113 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Priority: Major > > Handle YearMonthIntervalType and DayTimeIntervalType in HashExpression. And > write tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-35113) Support ANSI intervals in the Hash expression
[ https://issues.apache.org/jira/browse/SPARK-35113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17325663#comment-17325663 ] angerszhu commented on SPARK-35113: --- [~maxgekk] Have you worked on this? If not, can I take this one? > Support ANSI intervals in the Hash expression > - > > Key: SPARK-35113 > URL: https://issues.apache.org/jira/browse/SPARK-35113 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Max Gekk >Priority: Major > > Handle YearMonthIntervalType and DayTimeIntervalType in HashExpression. And > write tests. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org