[jira] [Assigned] (SPARK-52540) Support the time type by make_timestamp_ntz()
[ https://issues.apache.org/jira/browse/SPARK-52540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-52540: Assignee: Max Gekk > Support the time type by make_timestamp_ntz() > - > > Key: SPARK-52540 > URL: https://issues.apache.org/jira/browse/SPARK-52540 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.1.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > > Modify the make_timestamp_ntz() function to create a timestamp from date and > time. > h4. Syntax > {code:sql} > make_timestamp_ntz(date [, time]) > {code} > h4. Arguments > # date: A date expression > # time: A time expression > h4. Returns > A TIMESTAMP. > h4. Examples > {code:sql} > > SELECT make_timestamp_ntz(DATE'2014-12-28', TIME'6:30:45.887'); > 2014-12-28 06:30:45.887 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
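Since the ticket's syntax marks the time argument as optional, a hedged companion example; the midnight default shown is an assumption read off the syntax, not behavior stated in the ticket:

{code:sql}
-- Assumed behavior when the optional time argument is omitted:
-- the time-of-day part defaults to midnight.
> SELECT make_timestamp_ntz(DATE'2014-12-28');
 2014-12-28 00:00:00
{code}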
[jira] [Created] (SPARK-52549) Disable Recursive CTE self-references from window functions
Pavle Martinović created SPARK-52549: Summary: Disable Recursive CTE self-references from window functions Key: SPARK-52549 URL: https://issues.apache.org/jira/browse/SPARK-52549 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.1.0 Reporter: Pavle Martinović -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52547) Build dry runs against master branch
Hyukjin Kwon created SPARK-52547: Summary: Build dry runs against master branch Key: SPARK-52547 URL: https://issues.apache.org/jira/browse/SPARK-52547 Project: Spark Issue Type: Sub-task Components: Project Infra Affects Versions: 4.0.0 Reporter: Hyukjin Kwon -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52548) Add a test case for when shuffle manager is overridden by a SparkPlugin
Hongze Zhang created SPARK-52548: Summary: Add a test case for when shuffle manager is overridden by a SparkPlugin Key: SPARK-52548 URL: https://issues.apache.org/jira/browse/SPARK-52548 Project: Spark Issue Type: Improvement Components: SQL, Tests Affects Versions: 4.0.0 Reporter: Hongze Zhang The PR [https://github.com/apache/spark/pull/43627] for SPARK-45762 introduced a change that allows the shuffle manager specified in the Spark configuration to be overridden by a SparkPlugin; however, this change was not tested. I suggest adding a test case for it. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
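A minimal sketch of the plugin such a test could register, assuming the SPARK-45762 mechanism of returning extra configuration from DriverPlugin.init; "org.example.MyShuffleManager" is a hypothetical placeholder, not a class from the PR:

{code:java}
import java.util.{Map => JMap}

import scala.jdk.CollectionConverters._

import org.apache.spark.SparkContext
import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, PluginContext, SparkPlugin}

// Hypothetical test plugin: its driver side overrides spark.shuffle.manager.
class ShuffleOverridePlugin extends SparkPlugin {
  override def driverPlugin(): DriverPlugin = new DriverPlugin {
    override def init(sc: SparkContext, ctx: PluginContext): JMap[String, String] = {
      // A real test would point at a concrete ShuffleManager implementation
      // that is on the classpath.
      Map("spark.shuffle.manager" -> "org.example.MyShuffleManager").asJava
    }
  }
  override def executorPlugin(): ExecutorPlugin = null
}
{code}

The test itself would set spark.plugins to this class and assert that SparkEnv ends up with the overridden shuffle manager rather than the one from the original configuration.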
[jira] [Resolved] (SPARK-52528) Enable divide-by-zero for numeric mod with ANSI enabled
[ https://issues.apache.org/jira/browse/SPARK-52528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-52528. -- Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 51219 [https://github.com/apache/spark/pull/51219] > Enable divide-by-zero for numeric mod with ANSI enabled > > > Key: SPARK-52528 > URL: https://issues.apache.org/jira/browse/SPARK-52528 > Project: Spark > Issue Type: Sub-task > Components: PS >Affects Versions: 4.1.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Fix For: 4.1.0 > > > Enable divide-by-zero for numeric mod with ANSI enabled -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-52408) SPIP: Upgrade Apache Hive to 4.x
[ https://issues.apache.org/jira/browse/SPARK-52408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985476#comment-17985476 ] Kousuke Saruta commented on SPARK-52408: [~dongjoon], [~srowen], I don't think we need an SPIP for this kind of change, but what do you think? In the past, we upgraded Scala to 2.13, which involved lots of tasks including user-facing ones, but it was done without an SPIP. https://issues.apache.org/jira/browse/SPARK-25075 https://issues.apache.org/jira/browse/SPARK-39786 [~sunchao] I'd like to hear from you if you have any concerns about this upgrade, because you are very familiar with Hive. > SPIP: Upgrade Apache Hive to 4.x > > > Key: SPARK-52408 > URL: https://issues.apache.org/jira/browse/SPARK-52408 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 4.1.0 >Reporter: Vlad Rozov >Priority: Major > Labels: SPIP > > The > [SPIP|https://docs.google.com/document/d/1ejaGpuBvwBz2cD3Xj-QysShauBrdgYSh5yTxfAGvS1c/edit?usp=sharing] > proposes upgrading the Apache Hive version used in Apache Spark builds from > 2.3.10 to *version 4.x* (either 4.0.1 or the upcoming 4.1.0). It also > proposes discontinuing support for Apache Hive 2.x and 3.x, as these versions > are no longer maintained by the Apache Hive community and have reached > end-of-life (EOL). > The *key objectives* of this proposal are to: > # *Maintain all existing functionality* currently supported in Apache Hive > 2.x that *is compatible* with Apache Hive 4.x > # Ensure *no functional or performance regressions* occur > # Provide *the best upgrade path* for current Apache Spark users, minimizing > prerequisites and manual steps for those using Hive 2.x or 3.x > SPIP > [doc|https://docs.google.com/document/d/1ejaGpuBvwBz2cD3Xj-QysShauBrdgYSh5yTxfAGvS1c/edit?usp=sharing] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52462) Enforce type coercion before children output deduplication in Union
[ https://issues.apache.org/jira/browse/SPARK-52462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-52462. - Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 51172 [https://github.com/apache/spark/pull/51172] > Enforce type coercion before children output deduplication in Union > --- > > Key: SPARK-52462 > URL: https://issues.apache.org/jira/browse/SPARK-52462 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.1.0 >Reporter: Mihailo Aleksic >Assignee: Mihailo Aleksic >Priority: Major > Labels: pull-request-available > Fix For: 4.1.0 > > > Right now, the following query produces plans that are not consistent > across different underlying table providers. Query: > SELECT col1, col2, col3, NULLIF('','') AS col4 > FROM table > UNION ALL > SELECT col2, col2, null AS col3, col4 > FROM table; > This happens because of rule ordering: > - Sometimes: ... -> WidenSetOperationTypes -> ... -> ResolveReferences > (deduplication of Union children outputs) -> ... > - Sometimes: ... -> ResolveReferences (deduplication of Union children > outputs) -> ... -> WidenSetOperationTypes -> ... > In this issue I propose that we align those two by enforcing type coercion to > happen before deduplication. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-52462) Enforce type coercion before children output deduplication in Union
[ https://issues.apache.org/jira/browse/SPARK-52462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-52462: --- Assignee: Mihailo Aleksic > Enforce type coercion before children output deduplication in Union > --- > > Key: SPARK-52462 > URL: https://issues.apache.org/jira/browse/SPARK-52462 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.1.0 >Reporter: Mihailo Aleksic >Assignee: Mihailo Aleksic >Priority: Major > Labels: pull-request-available > > Right now, the following query produces plans that are not consistent > across different underlying table providers. Query: > SELECT col1, col2, col3, NULLIF('','') AS col4 > FROM table > UNION ALL > SELECT col2, col2, null AS col3, col4 > FROM table; > This happens because of rule ordering: > - Sometimes: ... -> WidenSetOperationTypes -> ... -> ResolveReferences > (deduplication of Union children outputs) -> ... > - Sometimes: ... -> ResolveReferences (deduplication of Union children > outputs) -> ... -> WidenSetOperationTypes -> ... > In this issue I propose that we align those two by enforcing type coercion to > happen before deduplication. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-52408) SPIP: Upgrade Apache Hive to 4.x
[ https://issues.apache.org/jira/browse/SPARK-52408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985640#comment-17985640 ] Sean R. Owen commented on SPARK-52408: -- Maybe, maybe not, but if there is already a document detailing some of the work and tradeoffs, that seems fine and not a problem. I don't see that it makes any particular difference from here. > SPIP: Upgrade Apache Hive to 4.x > > > Key: SPARK-52408 > URL: https://issues.apache.org/jira/browse/SPARK-52408 > Project: Spark > Issue Type: Umbrella > Components: SQL >Affects Versions: 4.1.0 >Reporter: Vlad Rozov >Priority: Major > Labels: SPIP > > The > [SPIP|https://docs.google.com/document/d/1ejaGpuBvwBz2cD3Xj-QysShauBrdgYSh5yTxfAGvS1c/edit?usp=sharing] > proposes upgrading the Apache Hive version used in Apache Spark builds from > 2.3.10 to *version 4.x* (either 4.0.1 or the upcoming 4.1.0). It also > proposes discontinuing support for Apache Hive 2.x and 3.x, as these versions > are no longer maintained by the Apache Hive community and have reached > end-of-life (EOL). > The *key objectives* of this proposal are to: > # *Maintain all existing functionality* currently supported in Apache Hive > 2.x that *is compatible* with Apache Hive 4.x > # Ensure *no functional or performance regressions* occur > # Provide *the best upgrade path* for current Apache Spark users, minimizing > prerequisites and manual steps for those using Hive 2.x or 3.x > SPIP > [doc|https://docs.google.com/document/d/1ejaGpuBvwBz2cD3Xj-QysShauBrdgYSh5yTxfAGvS1c/edit?usp=sharing] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-52548) Add a test case for when shuffle manager is overridden by a SparkPlugin
[ https://issues.apache.org/jira/browse/SPARK-52548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongze Zhang updated SPARK-52548: - Component/s: Spark Core (was: SQL) > Add a test case for when shuffle manager is overridden by a SparkPlugin > --- > > Key: SPARK-52548 > URL: https://issues.apache.org/jira/browse/SPARK-52548 > Project: Spark > Issue Type: Improvement > Components: Spark Core, Tests >Affects Versions: 4.0.0 >Reporter: Hongze Zhang >Priority: Major > > The PR [https://github.com/apache/spark/pull/43627] for SPARK-45762 > introduced a change that allows the shuffle manager specified in the Spark > configuration to be overridden by a SparkPlugin; however, this change was not > tested. > > I suggest adding a test case for it. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52550) SparkSessionExtensions requires support for DSV2 based extensions which require the classic SparkSession
Jack created SPARK-52550: Summary: SparkSessionExtensions requires support for DSV2 based extensions which require the classic SparkSession Key: SPARK-52550 URL: https://issues.apache.org/jira/browse/SPARK-52550 Project: Spark Issue Type: New Feature Components: SQL Affects Versions: 4.0.0 Reporter: Jack Extensions providing connector capabilities, such as [https://github.com/apache/cassandra-spark-connector], register custom V2 strategies as part of the extension. However, in order to implement the data source v2 strategy, we require a session type to be provided as classic.SparkSession [due to API changes in DSV2|https://github.com/apache/spark/commit/5db31aec33c53aaa7c814f33ec84e6ba66fc193b#diff-7aeb491d44e183c8c8cf86d90b57701dba009fc19983c2a5c09449c768b047ceR36]. It appears it is no longer possible to implement a custom strategy from an extension through this mechanism, since SparkSessionExtensions only provides a handle to the SparkSession class, which cannot be used for DSV2 strategy planners. This item of work is to enable a user to provision extensions when they know the session will be scoped to classic, allowing registration of a DSV2 strategy, e.g. {code:java} import org.apache.spark.sql.classic.{SparkSession => ClassicSparkSession} . . . case class MyCustomStrategy(spark: ClassicSparkSession) extends Strategy with Serializable { . . . override def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match { . . val dataSourceOptimizedPlan = new DataSourceV2Strategy(spark)...{code} This is registered via {code:java} class CoolSparkExtensions extends (SparkSessionExtensions => Unit) with Logging { override def apply(extensions: SparkSessionExtensions): Unit = { extensions.injectPlannerStrategy(MyCustomStrategy.apply) . . .{code} It is worth noting that the existing API is marked both @Experimental and @Unstable, meaning proposed changes could be considered if a better solution to this issue is not devised. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52551) Add a new v2 Predicate BOOLEAN_EXPRESSION
Wenchen Fan created SPARK-52551: --- Summary: Add a new v2 Predicate BOOLEAN_EXPRESSION Key: SPARK-52551 URL: https://issues.apache.org/jira/browse/SPARK-52551 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 4.1.0 Reporter: Wenchen Fan -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-52549) Disable Recursive CTE self-references from window functions and inside sorts
[ https://issues.apache.org/jira/browse/SPARK-52549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavle Martinović updated SPARK-52549: - Summary: Disable Recursive CTE self-references from window functions and inside sorts (was: Disable Recursive CTE self-references from window functions) > Disable Recursive CTE self-references from window functions and inside sorts > > > Key: SPARK-52549 > URL: https://issues.apache.org/jira/browse/SPARK-52549 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.1.0 >Reporter: Pavle Martinović >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-52538) Add new method to check if the value is fully extractable
[ https://issues.apache.org/jira/browse/SPARK-52538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985253#comment-17985253 ] Anastasia Filippova commented on SPARK-52538: - User 'vladimirg-db' has created a pull request for this issue: https://github.com/apache/spark/pull/51231 > Add new method to check if the value is fully extractable > - > > Key: SPARK-52538 > URL: https://issues.apache.org/jira/browse/SPARK-52538 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 4.1.0 >Reporter: Vladimir Golubev >Priority: Major > > To be later used in the single-pass Analyzer. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52542) Use `/nonexistent` instead of nonexistent `/opt/spark`
[ https://issues.apache.org/jira/browse/SPARK-52542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-52542. --- Fix Version/s: 4.0.1 Assignee: Dongjoon Hyun Resolution: Fixed This is resolved via https://github.com/apache/spark-docker/pull/87 > Use `/nonexistent` instead of nonexistent `/opt/spark` > -- > > Key: SPARK-52542 > URL: https://issues.apache.org/jira/browse/SPARK-52542 > Project: Spark > Issue Type: Bug > Components: Kubernetes >Affects Versions: 3.3.4, 3.4.4, 3.5.6, 4.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Fix For: 4.0.1 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-51717) Possible SST mismatch error for the second snapshot created for a new query
[ https://issues.apache.org/jira/browse/SPARK-51717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985730#comment-17985730 ] Hudson commented on SPARK-51717: User 'micheal-o' has created a pull request for this issue: https://github.com/apache/spark/pull/51255 > Possible SST mismatch error for the second snapshot created for a new query > --- > > Key: SPARK-51717 > URL: https://issues.apache.org/jira/browse/SPARK-51717 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 4.1.0, 4.0.0 >Reporter: B. Micheal Okutubo >Assignee: B. Micheal Okutubo >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > > Fix this error: Sst file size mismatch ... MANIFEST-05 may be corrupted > An edge case in SST file reuse that can only happen for the first ever > RocksDB checkpoint if: > # The first ever RocksDB checkpoint (e.g. for version 10) was created with > x.sst, but not yet uploaded by maintenance > # The next batch using RocksDB at v10 fails and rolls back the store to -1 > (invalidates RocksDB) > # A new request to load RocksDB at v10 comes in, but the v10 checkpoint is still > not uploaded, hence we have to start replaying the changelog from > checkpoint v0. > # We create a new v11 and a new checkpoint with a new x*.sst. v10 is now > uploaded by maintenance. Then during upload of x*.sst for v11, we reuse the x.sst > DFS file, thinking it is the same as x*.sst. > The problem here comes from step 3: the way the file manager loads v0 is > different from how it loads other versions. During the load of other > versions, when we delete an existing local file we also delete it from the file > mapping. But for v0, the file manager just deletes the local dir, and we missed > clearing the file mapping in this case. Hence the old x.sst was still showing > in the file mapping at step 4. We need to fix this and also add an additional > size check. > > This only happens when using changelog checkpointing -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
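A toy model of the invariant described above, not the actual RocksDB file-manager code: whichever path loads a version must also drop stale entries from the local-to-DFS file mapping, including the v0 path that merely deletes the local directory.

{code:java}
import scala.collection.mutable

// localFileMapping: local SST file name -> DFS file it is believed to match.
class FileMappingModel {
  private val localFileMapping = mutable.Map.empty[String, String]

  def recordUpload(localSst: String, dfsFile: String): Unit =
    localFileMapping(localSst) = dfsFile

  // Loading v0 used to just wipe the local dir; the fix is to also clear the
  // mapping so a later upload cannot wrongly reuse an old DFS file.
  def loadVersionZero(): Unit =
    localFileMapping.clear()

  def reusableDfsFile(localSst: String): Option[String] =
    localFileMapping.get(localSst)
}
{code}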
[jira] [Commented] (SPARK-52515) Approx_top_k SQL function based on Apache DataSketches
[ https://issues.apache.org/jira/browse/SPARK-52515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985731#comment-17985731 ] Hudson commented on SPARK-52515: User 'yhuang-db' has created a pull request for this issue: https://github.com/apache/spark/pull/51236 > Approx_top_k SQL function based on Apache DataSketches > -- > > Key: SPARK-52515 > URL: https://issues.apache.org/jira/browse/SPARK-52515 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 4.1.0 >Reporter: Yuchuan Huang >Priority: Major > > Apache DataSketches is an open-source library of sketch algorithms that is > widely used and adopted in industry. This ticket aims to introduce a new > function "approx_top_k", which uses the _frequent items sketches_ from Apache > DataSketches to find the approximate k-most-frequent items in a dataset. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
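A hedged usage sketch; the signature and result shape below are assumptions read off the ticket's description, not the merged implementation:

{code:sql}
-- Hypothetical call: approximately the 3 most frequent items in a column,
-- backed by a DataSketches frequent-items sketch. Counts are estimates.
> SELECT approx_top_k(item, 3) FROM purchases;
{code}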
[jira] [Resolved] (SPARK-52549) Disable Recursive CTE self-references from window functions and inside sorts
[ https://issues.apache.org/jira/browse/SPARK-52549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-52549. - Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 51178 [https://github.com/apache/spark/pull/51178] > Disable Recursive CTE self-references from window functions and inside sorts > > > Key: SPARK-52549 > URL: https://issues.apache.org/jira/browse/SPARK-52549 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.1.0 >Reporter: Pavle Martinović >Assignee: Pavle Martinović >Priority: Major > Fix For: 4.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-52549) Disable Recursive CTE self-references from window functions and inside sorts
[ https://issues.apache.org/jira/browse/SPARK-52549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-52549: --- Assignee: Pavle Martinović > Disable Recursive CTE self-references from window functions and inside sorts > > > Key: SPARK-52549 > URL: https://issues.apache.org/jira/browse/SPARK-52549 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.1.0 >Reporter: Pavle Martinović >Assignee: Pavle Martinović >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52555) Enforce `UnusedLocalVariable` rule
[ https://issues.apache.org/jira/browse/SPARK-52555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-52555. --- Fix Version/s: kubernetes-operator-0.4.0 Resolution: Fixed Issue resolved by pull request 254 [https://github.com/apache/spark-kubernetes-operator/pull/254] > Enforce `UnusedLocalVariable` rule > -- > > Key: SPARK-52555 > URL: https://issues.apache.org/jira/browse/SPARK-52555 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.4.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Fix For: kubernetes-operator-0.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-52555) Enforce `UnusedLocalVariable` rule
[ https://issues.apache.org/jira/browse/SPARK-52555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-52555: - Assignee: Dongjoon Hyun > Enforce `UnusedLocalVariable` rule > -- > > Key: SPARK-52555 > URL: https://issues.apache.org/jira/browse/SPARK-52555 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.4.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52558) Lower `SparkOperatorConfManager` log level to WARN for `FileNotFoundException`
[ https://issues.apache.org/jira/browse/SPARK-52558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-52558. --- Fix Version/s: kubernetes-operator-0.4.0 Resolution: Fixed Issue resolved by pull request 255 [https://github.com/apache/spark-kubernetes-operator/pull/255] > Lower `SparkOperatorConfManager` log level to WARN for `FileNotFoundException` > -- > > Key: SPARK-52558 > URL: https://issues.apache.org/jira/browse/SPARK-52558 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.4.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Fix For: kubernetes-operator-0.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-52558) Lower `SparkOperatorConfManager` log level to WARN for `FileNotFoundException`
[ https://issues.apache.org/jira/browse/SPARK-52558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-52558: - Assignee: Dongjoon Hyun > Lower `SparkOperatorConfManager` log level to WARN for `FileNotFoundException` > -- > > Key: SPARK-52558 > URL: https://issues.apache.org/jira/browse/SPARK-52558 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.4.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-52559) Synchronize `SparkOperatorConfManager.getValue`
[ https://issues.apache.org/jira/browse/SPARK-52559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-52559: - Assignee: Dongjoon Hyun > Synchronize `SparkOperatorConfManager.getValue` > --- > > Key: SPARK-52559 > URL: https://issues.apache.org/jira/browse/SPARK-52559 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.4.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Blocker > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52559) Synchronize `SparkOperatorConfManager.getValue`
[ https://issues.apache.org/jira/browse/SPARK-52559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-52559. --- Fix Version/s: kubernetes-operator-0.4.0 Resolution: Fixed Issue resolved by pull request 256 [https://github.com/apache/spark-kubernetes-operator/pull/256] > Synchronize `SparkOperatorConfManager.getValue` > --- > > Key: SPARK-52559 > URL: https://issues.apache.org/jira/browse/SPARK-52559 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes >Affects Versions: kubernetes-operator-0.4.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Blocker > Fix For: kubernetes-operator-0.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52559) Synchronize `SparkOperatorConfManager.getValue`
Dongjoon Hyun created SPARK-52559: - Summary: Synchronize `SparkOperatorConfManager.getValue` Key: SPARK-52559 URL: https://issues.apache.org/jira/browse/SPARK-52559 Project: Spark Issue Type: Sub-task Components: Kubernetes Affects Versions: kubernetes-operator-0.4.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52547) Build dry runs against master branch
[ https://issues.apache.org/jira/browse/SPARK-52547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-52547. -- Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 51245 [https://github.com/apache/spark/pull/51245] > Build dry runs against master branch > > > Key: SPARK-52547 > URL: https://issues.apache.org/jira/browse/SPARK-52547 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > Fix For: 4.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-52547) Build dry runs against master branch
[ https://issues.apache.org/jira/browse/SPARK-52547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-52547: Assignee: Hyukjin Kwon > Build dry runs against master branch > > > Key: SPARK-52547 > URL: https://issues.apache.org/jira/browse/SPARK-52547 > Project: Spark > Issue Type: Sub-task > Components: Project Infra >Affects Versions: 4.0.0 >Reporter: Hyukjin Kwon >Assignee: Hyukjin Kwon >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52554) Avoid multiple roundtrips for config check in connect
[ https://issues.apache.org/jira/browse/SPARK-52554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-52554. -- Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 51252 [https://github.com/apache/spark/pull/51252] > Avoid multiple roundtrips for config check in connect > - > > Key: SPARK-52554 > URL: https://issues.apache.org/jira/browse/SPARK-52554 > Project: Spark > Issue Type: Sub-task > Components: Pandas API on Spark >Affects Versions: 4.1.0 >Reporter: Takuya Ueshin >Assignee: Takuya Ueshin >Priority: Major > Fix For: 4.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-52554) Avoid multiple roundtrips for config check in connect
[ https://issues.apache.org/jira/browse/SPARK-52554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-52554: Assignee: Takuya Ueshin > Avoid multiple roundtrips for config check in connect > - > > Key: SPARK-52554 > URL: https://issues.apache.org/jira/browse/SPARK-52554 > Project: Spark > Issue Type: Sub-task > Components: Pandas API on Spark >Affects Versions: 4.1.0 >Reporter: Takuya Ueshin >Assignee: Takuya Ueshin >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-52536) Specify AsyncProfilerLoader.extractionDir to spark local dir
[ https://issues.apache.org/jira/browse/SPARK-52536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-52536: Assignee: Zhen Wang > Specify AsyncProfilerLoader.extractionDir to spark local dir > > > Key: SPARK-52536 > URL: https://issues.apache.org/jira/browse/SPARK-52536 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Zhen Wang >Assignee: Zhen Wang >Priority: Major > > AsyncProfilerLoader uses `user.home` by default to store the extracted > libraries: > [https://github.com/jvm-profiling-tools/ap-loader/blob/main/src/main/java/one/profiler/AsyncProfilerLoader.java#L139-L152] > The `user.home` directory of the datanodes in our yarn cluster was not > initialized, causing the executor startup to fail: > {code:java} > 25/06/20 11:54:26 ERROR YarnCoarseGrainedExecutorBackend: Executor > self-exiting due to : Unable to create executor due to /home/pilot > java.nio.file.AccessDeniedException: /home/pilot > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:84) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > at > sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384) > at java.nio.file.Files.createDirectory(Files.java:674) > at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781) > at java.nio.file.Files.createDirectories(Files.java:767) > at > one.profiler.AsyncProfilerLoader.getExtractionDirectory(AsyncProfilerLoader.java:133) > at > one.profiler.AsyncProfilerLoader.getAsyncProfilerPath(AsyncProfilerLoader.java:562) > at one.profiler.AsyncProfilerLoader.load(AsyncProfilerLoader.java:861) > at > org.apache.spark.profiler.SparkAsyncProfiler.<init>(SparkAsyncProfiler.scala:70) > at > org.apache.spark.profiler.ProfilerExecutorPlugin.init(ProfilerPlugin.scala:82) > at > org.apache.spark.internal.plugin.ExecutorPluginContainer.$anonfun$executorPlugins$1(PluginContainer.scala:125) > at > scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293) > at > scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > at > scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293) > at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290) > at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108) > at > org.apache.spark.internal.plugin.ExecutorPluginContainer.<init>(PluginContainer.scala:113) > at > org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:211) > at > org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:199) > at > org.apache.spark.executor.Executor.$anonfun$plugins$1(Executor.scala:337) > at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:178) > at org.apache.spark.executor.Executor.<init>(Executor.scala:337) > at > org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:181) > at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115) > at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213) > at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100) > at > org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75) > at > org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 25/06/20 11:54:26 INFO YarnCoarseGrainedExecutorBackend: Driver commanded a > shutdown {code} > > We can set `AsyncProfilerLoader.extractionDir` to the Spark local dir to > avoid this issue. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52536) Specify AsyncProfilerLoader.extractionDir to spark local dir
[ https://issues.apache.org/jira/browse/SPARK-52536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-52536. -- Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 51229 [https://github.com/apache/spark/pull/51229] > Specify AsyncProfilerLoader.extractionDir to spark local dir > > > Key: SPARK-52536 > URL: https://issues.apache.org/jira/browse/SPARK-52536 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 4.0.0 >Reporter: Zhen Wang >Assignee: Zhen Wang >Priority: Major > Fix For: 4.1.0 > > > AsyncProfilerLoader uses `user.home` by default to store the extracted > libraries: > [https://github.com/jvm-profiling-tools/ap-loader/blob/main/src/main/java/one/profiler/AsyncProfilerLoader.java#L139-L152] > The `user.home` directory of the datanodes in our yarn cluster was not > initialized, causing the executor startup to fail: > {code:java} > 25/06/20 11:54:26 ERROR YarnCoarseGrainedExecutorBackend: Executor > self-exiting due to : Unable to create executor due to /home/pilot > java.nio.file.AccessDeniedException: /home/pilot > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:84) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > at > sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384) > at java.nio.file.Files.createDirectory(Files.java:674) > at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781) > at java.nio.file.Files.createDirectories(Files.java:767) > at > one.profiler.AsyncProfilerLoader.getExtractionDirectory(AsyncProfilerLoader.java:133) > at > one.profiler.AsyncProfilerLoader.getAsyncProfilerPath(AsyncProfilerLoader.java:562) > at one.profiler.AsyncProfilerLoader.load(AsyncProfilerLoader.java:861) > at > org.apache.spark.profiler.SparkAsyncProfiler.<init>(SparkAsyncProfiler.scala:70) > at > org.apache.spark.profiler.ProfilerExecutorPlugin.init(ProfilerPlugin.scala:82) > at > org.apache.spark.internal.plugin.ExecutorPluginContainer.$anonfun$executorPlugins$1(PluginContainer.scala:125) > at > scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293) > at > scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > at > scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293) > at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290) > at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108) > at > org.apache.spark.internal.plugin.ExecutorPluginContainer.<init>(PluginContainer.scala:113) > at > org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:211) > at > org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:199) > at > org.apache.spark.executor.Executor.$anonfun$plugins$1(Executor.scala:337) > at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:178) > at org.apache.spark.executor.Executor.<init>(Executor.scala:337) > at > org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:181) > at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115) > at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213) > at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100) > at > org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75) > at > org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 25/06/20 11:54:26 INFO YarnCoarseGrainedExecutorBackend: Driver commanded a > shutdown {code} > > We can set `AsyncProfilerLoader.extractionDir` to the Spark local dir to > avoid this issue. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
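A minimal sketch of the proposed direction, assuming ap-loader's AsyncProfilerLoader.setExtractionDirectory setter and Spark's internal Utils.getLocalDir helper; the exact wiring in the eventual fix may differ:

{code:java}
import java.nio.file.Paths

import one.profiler.AsyncProfilerLoader

import org.apache.spark.SparkConf
import org.apache.spark.util.Utils

// Point ap-loader's extraction directory at a Spark local dir instead of
// `user.home`, so library extraction no longer depends on a usable home
// directory on the node. Utils is private[spark], so this only applies
// inside Spark's own code paths (e.g. SparkAsyncProfiler).
val conf = new SparkConf()
AsyncProfilerLoader.setExtractionDirectory(
  Paths.get(Utils.getLocalDir(conf), "ap-loader"))
val profiler = AsyncProfilerLoader.load()
{code}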
[jira] [Resolved] (SPARK-52499) Add more tests for data types
[ https://issues.apache.org/jira/browse/SPARK-52499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-52499. - Fix Version/s: 4.1.0 4.0.1 Resolution: Fixed Issue resolved by pull request 51193 [https://github.com/apache/spark/pull/51193] > Add more tests for data types > - > > Key: SPARK-52499 > URL: https://issues.apache.org/jira/browse/SPARK-52499 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.1.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Fix For: 4.1.0, 4.0.1 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-52499) Add more tests for data types
[ https://issues.apache.org/jira/browse/SPARK-52499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-52499: --- Assignee: Allison Wang > Add more tests for data types > - > > Key: SPARK-52499 > URL: https://issues.apache.org/jira/browse/SPARK-52499 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.1.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52534) Make MLCache and MLHandler thread-safe
[ https://issues.apache.org/jira/browse/SPARK-52534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu resolved SPARK-52534. Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 51226 [https://github.com/apache/spark/pull/51226] > Make MLCache and MLHandler thread-safe > -- > > Key: SPARK-52534 > URL: https://issues.apache.org/jira/browse/SPARK-52534 > Project: Spark > Issue Type: Sub-task > Components: Connect, ML >Affects Versions: 4.1.0 >Reporter: Weichen Xu >Assignee: Weichen Xu >Priority: Major > Fix For: 4.1.0 > > > Make MLCache and MLHandler thread-safe > > The current implementation might cause race conditions. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52546) when sparkcontext crashes in executing sql, final state should be "error", but eventually return "finished".
[ https://issues.apache.org/jira/browse/SPARK-52546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-52546. -- Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 51243 [https://github.com/apache/spark/pull/51243] > when sparkcontext crashes in executing sql, final state should be "error", > but eventually return "finished". > > > Key: SPARK-52546 > URL: https://issues.apache.org/jira/browse/SPARK-52546 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.5.0, 4.0.0 >Reporter: xuyu >Assignee: xuyu >Priority: Major > Labels: pull-request-available > Fix For: 4.1.0 > > > When the SparkContext crashes while execute() is running in > SparkExecuteStatementOperation.scala, execution falls into the catch block; the > state in Operation.scala should be set to "error", but "finished" is eventually > returned. The final state is wrong because the catch block in execute() is > missing a judgment branch. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
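A self-contained toy model of the bug as described, with illustrative names only, not the actual SparkExecuteStatementOperation code: the catch path must distinguish a stopped context from normal completion, or the final state degrades to "finished".

{code:java}
object OperationStateModel {
  sealed trait State
  case object Finished extends State
  case object Error extends State

  // The missing judgment branch: a crashed/stopped SparkContext must map to
  // Error and never fall through to Finished.
  def finalState(statementFailed: Boolean, contextStopped: Boolean): State =
    if (statementFailed || contextStopped) Error else Finished

  def main(args: Array[String]): Unit = {
    assert(finalState(statementFailed = false, contextStopped = true) == Error)
    assert(finalState(statementFailed = false, contextStopped = false) == Finished)
  }
}
{code}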
[jira] [Assigned] (SPARK-52546) when sparkcontext crashes in executing sql, final state should be "error", but eventually return "finished".
[ https://issues.apache.org/jira/browse/SPARK-52546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-52546: Assignee: xuyu > when sparkcontext crashes in executing sql, final state should be "error", > but eventually return "finished". > > > Key: SPARK-52546 > URL: https://issues.apache.org/jira/browse/SPARK-52546 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.5.0, 4.0.0 >Reporter: xuyu >Assignee: xuyu >Priority: Major > Labels: pull-request-available > > When the SparkContext crashes while execute() is running in > SparkExecuteStatementOperation.scala, execution falls into the catch block; the > state in Operation.scala should be set to "error", but "finished" is eventually > returned. The final state is wrong because the catch block in execute() is > missing a judgment branch. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52540) Support the time type by make_timestamp_ntz()
[ https://issues.apache.org/jira/browse/SPARK-52540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-52540. -- Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 51232 [https://github.com/apache/spark/pull/51232] > Support the time type by make_timestamp_ntz() > - > > Key: SPARK-52540 > URL: https://issues.apache.org/jira/browse/SPARK-52540 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 4.1.0 >Reporter: Max Gekk >Assignee: Max Gekk >Priority: Major > Fix For: 4.1.0 > > > Modify the make_timestamp_ntz() function to create a timestamp from date and > time. > h4. Syntax > {code:sql} > make_timestamp_ntz(date [, time]) > {code} > h4. Arguments > # date: A date expression > # time: A time expression > h4. Returns > A TIMESTAMP. > h4. Examples > {code:sql} > > SELECT make_timestamp_ntz(DATE'2014-12-28', TIME'6:30:45.887'); > 2014-12-28 06:30:45.887 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-52349) Enable boolean division tests with ANSI enabled
[ https://issues.apache.org/jira/browse/SPARK-52349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-52349: - Summary: Enable boolean division tests with ANSI enabled (was: Enable divide-by-zero for boolean division with ANSI enabled) > Enable boolean division tests with ANSI enabled > --- > > Key: SPARK-52349 > URL: https://issues.apache.org/jira/browse/SPARK-52349 > Project: Spark > Issue Type: Sub-task > Components: PS >Affects Versions: 4.1.0 >Reporter: Xinrong Meng >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52554) Avoid multiple roundtrips for config check in connect
Takuya Ueshin created SPARK-52554: - Summary: Avoid multiple roundtrips for config check in connect Key: SPARK-52554 URL: https://issues.apache.org/jira/browse/SPARK-52554 Project: Spark Issue Type: Sub-task Components: Pandas API on Spark Affects Versions: 4.1.0 Reporter: Takuya Ueshin -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52555) Enforce `UnusedLocalVariable` rule
Dongjoon Hyun created SPARK-52555: - Summary: Enforce `UnusedLocalVariable` rule Key: SPARK-52555 URL: https://issues.apache.org/jira/browse/SPARK-52555 Project: Spark Issue Type: Sub-task Components: Kubernetes Affects Versions: kubernetes-operator-0.4.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52556) CAST_INVALID_INPUT for Pandas on Spark in ANSI mode
Xinrong Meng created SPARK-52556: Summary: CAST_INVALID_INPUT for Pandas on Spark in ANSI mode Key: SPARK-52556 URL: https://issues.apache.org/jira/browse/SPARK-52556 Project: Spark Issue Type: Umbrella Components: PS Affects Versions: 4.1.0 Reporter: Xinrong Meng -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52557) Avoid CAST_INVALID_INPUT of to_numeric(errors='coerce') in ANSI mode
Xinrong Meng created SPARK-52557: Summary: Avoid CAST_INVALID_INPUT of to_numeric(errors='coerce') in ANSI mode Key: SPARK-52557 URL: https://issues.apache.org/jira/browse/SPARK-52557 Project: Spark Issue Type: Sub-task Components: PS Affects Versions: 4.1.0 Reporter: Xinrong Meng -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52558) Lower `SparkOperatorConfManager` log level to WARN for `FileNotFoundException`
Dongjoon Hyun created SPARK-52558: - Summary: Lower `SparkOperatorConfManager` log level to WARN for `FileNotFoundException` Key: SPARK-52558 URL: https://issues.apache.org/jira/browse/SPARK-52558 Project: Spark Issue Type: Sub-task Components: Kubernetes Affects Versions: kubernetes-operator-0.4.0 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52553) Fix NumberFormatException when reading v1 changelog
[ https://issues.apache.org/jira/browse/SPARK-52553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-52553. -- Fix Version/s: 4.1.0 4.0.1 Resolution: Fixed Issue resolved by pull request 51255 [https://github.com/apache/spark/pull/51255] > Fix NumberFormatException when reading v1 changelog > --- > > Key: SPARK-52553 > URL: https://issues.apache.org/jira/browse/SPARK-52553 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 4.0.0 >Reporter: B. Micheal Okutubo >Assignee: B. Micheal Okutubo >Priority: Major > Fix For: 4.1.0, 4.0.1 > > > When trying to read the changelog version, the reader factory throws > NumberFormatException for a v1 changelog if it decodes the first few bytes of > the file as a UTF string, e.g. "v)" -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
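An illustrative sketch of the failure mode and a defensive parse; the names and format details are assumptions, not the actual reader-factory code:

{code:java}
// Later changelog versions are assumed to begin with a "v<N>" UTF marker, while
// v1 files begin directly with record bytes that can still decode to a string
// such as "v)". Calling toInt unconditionally on the suffix then throws
// NumberFormatException; a non-numeric suffix should instead be treated as the
// legacy v1 format.
def changelogVersion(header: String): Int =
  if (header.startsWith("v")) header.drop(1).toIntOption.getOrElse(1)
  else 1

assert(changelogVersion("v2") == 2)
assert(changelogVersion("v)") == 1)  // previously: NumberFormatException
{code}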
[jira] [Assigned] (SPARK-52553) Fix NumberFormatException when reading v1 changelog
[ https://issues.apache.org/jira/browse/SPARK-52553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reassigned SPARK-52553: Assignee: B. Micheal Okutubo > Fix NumberFormatException when reading v1 changelog > --- > > Key: SPARK-52553 > URL: https://issues.apache.org/jira/browse/SPARK-52553 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 4.0.0 >Reporter: B. Micheal Okutubo >Assignee: B. Micheal Okutubo >Priority: Major > > When trying to read the changelog version, the reader factory throws > NumberFormatException for a v1 changelog if it decodes the first few bytes of > the file as a UTF string, e.g. "v)" -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-48231) Remove unused CodeHaus Jackson dependencies
[ https://issues.apache.org/jira/browse/SPARK-48231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie resolved SPARK-48231. -- Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 46521 [https://github.com/apache/spark/pull/46521] > Remove unused CodeHaus Jackson dependencies > --- > > Key: SPARK-48231 > URL: https://issues.apache.org/jira/browse/SPARK-48231 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 4.1.0 >Reporter: Cheng Pan >Assignee: Cheng Pan >Priority: Major > Labels: pull-request-available > Fix For: 4.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-52349) Enable boolean division tests with ANSI enabled
[ https://issues.apache.org/jira/browse/SPARK-52349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin reassigned SPARK-52349: - Assignee: Xinrong Meng > Enable boolean division tests with ANSI enabled > --- > > Key: SPARK-52349 > URL: https://issues.apache.org/jira/browse/SPARK-52349 > Project: Spark > Issue Type: Sub-task > Components: PS >Affects Versions: 4.1.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52552) Skip CHECK constraint enforcement for deletion vector deletes
Gengliang Wang created SPARK-52552: -- Summary: Skip CHECK constraint enforcement for deletion vector deletes Key: SPARK-52552 URL: https://issues.apache.org/jira/browse/SPARK-52552 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 4.1.0 Reporter: Gengliang Wang Assignee: Gengliang Wang Writing a delta of rows to an existing table doesn't produce any new rows, so enforcing CHECK constraints is unnecessary. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-51068) CTEs are not canonicalized and resulting in cached result not being used and recomputed
[ https://issues.apache.org/jira/browse/SPARK-51068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nimesh Khandelwal updated SPARK-51068: -- Target Version/s: 3.3.2 (was: 3.3.2, 4.0.0) > CTEs are not canonicalized and resulting in cached result not being used and > recomputed > --- > > Key: SPARK-51068 > URL: https://issues.apache.org/jira/browse/SPARK-51068 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.2, 3.1.3, 3.3.2 >Reporter: Nimesh Khandelwal >Priority: Major > Labels: pull-request-available > > To check whether the plan exists in the cache or not, CacheManager matches > the canonicalized version of the plan. Currently, CTE IDs are not handled in > canonicalized plans, which results in unnecessary cache misses when queries > using CTEs are cached. This issue dates from the commit to > [Avoid inlining non-deterministic > With-CTEs|https://github.com/apache/spark/pull/33671/files], which introduced > CTERelationDef and CTERelationRef but did not handle their canonicalization. > {code:java} > >>>spark.sql("CACHE TABLE cached_cte AS WITH cte1 AS ( SELECT 1 AS id, > >>>'Alice' AS name UNION ALL SELECT 2 AS id, 'Bob' AS name ), cte2 AS ( > >>>SELECT 1 AS id, 10 AS score UNION ALL SELECT 2 AS id, 20 AS score ) SELECT > >>>cte1.id, cte1.name, cte2.score FROM cte1 JOIN cte2 ON cte1.id = cte2.id"); > DataFrame[] > >>> spark.sql("select count(*) from cached_cte").explain() > == Physical Plan == > AdaptiveSparkPlan isFinalPlan=false > +- HashAggregate(keys=[], functions=[count(1)]) > +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=165] > +- HashAggregate(keys=[], functions=[partial_count(1)]) > +- Project > +- BroadcastHashJoin [id#120], [id#124], Inner, BuildRight, false > :- Union > : :- Project [1 AS id#120] > : : +- Scan OneRowRelation[] > : +- Project [2 AS id#122] > : +- Scan OneRowRelation[] > +- BroadcastExchange > HashedRelationBroadcastMode(List(cast(input[0, int, false] as > bigint)),false), [plan_id=160] > +- Union > :- Project [1 AS id#124] > : +- Scan OneRowRelation[] > +- Project [2 AS id#126] > +- Scan OneRowRelation[]{code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-52349) Enable boolean division tests with ANSI enabled
[ https://issues.apache.org/jira/browse/SPARK-52349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-52349. --- Fix Version/s: 4.1.0 Resolution: Fixed Issue resolved by pull request 51249 [https://github.com/apache/spark/pull/51249] > Enable boolean division tests with ANSI enabled > --- > > Key: SPARK-52349 > URL: https://issues.apache.org/jira/browse/SPARK-52349 > Project: Spark > Issue Type: Sub-task > Components: PS >Affects Versions: 4.1.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Fix For: 4.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-52349) Enable divide-by-zero for boolean division with ANSI enabled
[ https://issues.apache.org/jira/browse/SPARK-52349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-52349: - Summary: Enable divide-by-zero for boolean division with ANSI enabled (was: Enable divide-by-zero test cases) > Enable divide-by-zero for boolean division with ANSI enabled > > > Key: SPARK-52349 > URL: https://issues.apache.org/jira/browse/SPARK-52349 > Project: Spark > Issue Type: Sub-task > Components: PS >Affects Versions: 4.1.0 >Reporter: Xinrong Meng >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-52553) Fix NumberFormatException when reading v1 changelog
B. Micheal Okutubo created SPARK-52553: -- Summary: Fix NumberFormatException when reading v1 changelog Key: SPARK-52553 URL: https://issues.apache.org/jira/browse/SPARK-52553 Project: Spark Issue Type: Bug Components: Structured Streaming Affects Versions: 4.0.0 Reporter: B. Micheal Okutubo When trying to read the changelog version, the reader factory throws NumberFormatException for v1 changelog, if it decodes the first few bytes in the file as UTF string e.g. "v)" -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org