[jira] [Resolved] (SPARK-39676) Add task partition id for Task assertEquals method in JsonProtocolSuite

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-39676. -- Assignee: Qian Sun Resolution: Fixed Fixed in https://github.com/apache/spark/pull/37081

[jira] [Updated] (SPARK-39676) Add task partition id for Task assertEquals method in JsonProtocolSuite

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39676: - Fix Version/s: 3.3.1 > Add task partition id for Task assertEquals method in JsonProtocolSuite

[jira] [Assigned] (SPARK-39676) Add task partition id for Task assertEquals method in JsonProtocolSuite

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39676: Assignee: Apache Spark > Add task partition id for Task assertEquals method in JsonProtoc

[jira] [Assigned] (SPARK-39676) Add task partition id for Task assertEquals method in JsonProtocolSuite

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39676: Assignee: (was: Apache Spark) > Add task partition id for Task assertEquals method in

[jira] [Commented] (SPARK-39676) Add task partition id for Task assertEquals method in JsonProtocolSuite

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562362#comment-17562362 ] Apache Spark commented on SPARK-39676: -- User 'dcoliversun' has created a pull reque

[jira] [Commented] (SPARK-39602) Invoking .repartition(100000) in a unit test causes the unit test to take >20 minutes.

2022-07-04 Thread Tanin Na Nakorn (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562361#comment-17562361 ] Tanin Na Nakorn commented on SPARK-39602: - We certainly can. Different points

[jira] [Comment Edited] (SPARK-39602) Invoking .repartition(100000) in a unit test causes the unit test to take >20 minutes.

2022-07-04 Thread Tanin Na Nakorn (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562331#comment-17562331 ] Tanin Na Nakorn edited comment on SPARK-39602 at 7/5/22 4:26 AM: -

[jira] [Created] (SPARK-39676) Add task partition id for Task assertEquals method in JsonProtocolSuite

2022-07-04 Thread Qian Sun (Jira)
Qian Sun created SPARK-39676: Summary: Add task partition id for Task assertEquals method in JsonProtocolSuite Key: SPARK-39676 URL: https://issues.apache.org/jira/browse/SPARK-39676 Project: Spark

[jira] [Comment Edited] (SPARK-39602) Invoking .repartition(100000) in a unit test causes the unit test to take >20 minutes.

2022-07-04 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562358#comment-17562358 ] Jungtaek Lim edited comment on SPARK-39602 at 7/5/22 4:14 AM:

[jira] [Commented] (SPARK-39602) Invoking .repartition(100000) in a unit test causes the unit test to take >20 minutes.

2022-07-04 Thread Jungtaek Lim (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562358#comment-17562358 ] Jungtaek Lim commented on SPARK-39602: -- Why not make the number of partitions be co

[jira] [Commented] (SPARK-35208) Add docs for LATERAL subqueries

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562356#comment-17562356 ] Apache Spark commented on SPARK-35208: -- User 'huaxingao' has created a pull request

[jira] [Assigned] (SPARK-35208) Add docs for LATERAL subqueries

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35208: Assignee: (was: Apache Spark) > Add docs for LATERAL subqueries > ---

[jira] [Commented] (SPARK-35208) Add docs for LATERAL subqueries

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562357#comment-17562357 ] Apache Spark commented on SPARK-35208: -- User 'huaxingao' has created a pull request

[jira] [Assigned] (SPARK-35208) Add docs for LATERAL subqueries

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-35208: Assignee: Apache Spark > Add docs for LATERAL subqueries > --

[jira] [Resolved] (SPARK-39675) Switch 'spark.sql.codegen.factoryMode' configuration from testing purpose to internal purpose

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-39675. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37076 [https://gi

[jira] [Assigned] (SPARK-39675) Switch 'spark.sql.codegen.factoryMode' configuration from testing purpose to internal purpose

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-39675: Assignee: Hyukjin Kwon > Switch 'spark.sql.codegen.factoryMode' configuration from testin

[jira] [Commented] (SPARK-39610) Add safe.directory for container based job

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562351#comment-17562351 ] Apache Spark commented on SPARK-39610: -- User 'Yikun' has created a pull request for

[jira] [Assigned] (SPARK-39610) Add safe.directory for container based job

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39610: Assignee: Apache Spark > Add safe.directory for container based job > ---

[jira] [Commented] (SPARK-39610) Add safe.directory for container based job

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562350#comment-17562350 ] Apache Spark commented on SPARK-39610: -- User 'Yikun' has created a pull request for

[jira] [Assigned] (SPARK-39610) Add safe.directory for container based job

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39610: Assignee: (was: Apache Spark) > Add safe.directory for container based job >

[jira] [Assigned] (SPARK-39579) Make ListFunctions/getFunction/functionExists API compatible

2022-07-04 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-39579: --- Assignee: Ruifeng Zheng > Make ListFunctions/getFunction/functionExists API compatible > -

[jira] [Resolved] (SPARK-39579) Make ListFunctions/getFunction/functionExists API compatible

2022-07-04 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-39579. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 36977 [https://gith

[jira] [Commented] (SPARK-39578) The driver cannot parse the SQL statement and the job is not executed

2022-07-04 Thread miaowang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562340#comment-17562340 ] miaowang commented on SPARK-39578: -- Thank you for your answer, please ask, spark 2 X is

[jira] [Commented] (SPARK-39664) RowMatrix(...).computeCovariance() VS Correlation.corr(..., ...)

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562339#comment-17562339 ] Hyukjin Kwon commented on SPARK-39664: -- [~igaloly] mind sharing simplified versions

[jira] [Assigned] (SPARK-39611) PySpark support numpy 1.23.X

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39611: Assignee: (was: Apache Spark) > PySpark support numpy 1.23.X > --

[jira] [Assigned] (SPARK-39611) PySpark support numpy 1.23.X

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39611: Assignee: Apache Spark > PySpark support numpy 1.23.X

[jira] [Commented] (SPARK-39611) PySpark support numpy 1.23.X

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562337#comment-17562337 ] Apache Spark commented on SPARK-39611: -- User 'Yikun' has created a pull request for

[jira] [Updated] (SPARK-39635) Custom driver metrics for Datasource v2

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39635: - Target Version/s: (was: 3.4.0) > Custom driver metrics for Datasource v2 > ---

[jira] [Updated] (SPARK-39635) Custom driver metrics for Datasource v2

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39635: - Fix Version/s: (was: 3.4.0) > Custom driver metrics for Datasource v2 >

[jira] [Commented] (SPARK-39611) PySpark support numpy 1.23.X

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562338#comment-17562338 ] Apache Spark commented on SPARK-39611: -- User 'Yikun' has created a pull request for

[jira] [Commented] (SPARK-39623) partitionng by datestamp leads to wrong query on backend?

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562336#comment-17562336 ] Hyukjin Kwon commented on SPARK-39623: -- You can actually check the logs and which S

[jira] [Commented] (SPARK-39612) The dataframe returned by exceptAll() can no longer perform operations such as count() or isEmpty(), or an exception will be thrown.

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562334#comment-17562334 ] Hyukjin Kwon commented on SPARK-39612: -- This is a regression from Spark 3.2.0 > Th

[jira] [Updated] (SPARK-39612) The dataframe returned by exceptAll() can no longer perform operations such as count() or isEmpty(), or an exception will be thrown.

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39612: - Component/s: SQL (was: PySpark) > The dataframe returned by exceptAll() can

[jira] [Updated] (SPARK-39612) The dataframe returned by exceptAll() can no longer perform operations such as count() or isEmpty(), or an exception will be thrown.

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39612: - Priority: Critical (was: Major) > The dataframe returned by exceptAll() can no longer perform o

[jira] [Commented] (SPARK-39612) The dataframe returned by exceptAll() can no longer perform operations such as count() or isEmpty(), or an exception will be thrown.

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562332#comment-17562332 ] Hyukjin Kwon commented on SPARK-39612: -- Scala reproducer: {code} val d1 = Seq("a")

[jira] [Comment Edited] (SPARK-39602) Invoking .repartition(100000) in a unit test causes the unit test to take >20 minutes.

2022-07-04 Thread Tanin Na Nakorn (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562331#comment-17562331 ] Tanin Na Nakorn edited comment on SPARK-39602 at 7/5/22 2:55 AM: -

[jira] [Commented] (SPARK-39602) Invoking .repartition(100000) in a unit test causes the unit test to take >20 minutes.

2022-07-04 Thread Tanin Na Nakorn (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562331#comment-17562331 ] Tanin Na Nakorn commented on SPARK-39602: - The number of partitions are for illu

[jira] [Comment Edited] (SPARK-39602) Invoking .repartition(100000) in a unit test causes the unit test to take >20 minutes.

2022-07-04 Thread Tanin Na Nakorn (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562331#comment-17562331 ] Tanin Na Nakorn edited comment on SPARK-39602 at 7/5/22 2:54 AM: -

[jira] [Updated] (SPARK-39608) Upgrade to spark 3.3.0 is causing error "Cannot grow BufferHolder by size -179446840 because the size is negative"

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39608: - Priority: Major (was: Critical) > Upgrade to spark 3.3.0 is causing error "Cannot grow BufferHo

[jira] [Commented] (SPARK-39605) PySpark df.count() operation works fine on DBR 7.3 LTS but fails in DBR 10.4 LTS

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562330#comment-17562330 ] Hyukjin Kwon commented on SPARK-39605: -- The exception is from MongoDB. I suspect th

[jira] [Updated] (SPARK-39605) PySpark df.count() operation works fine on DBR 7.3 LTS but fails in DBR 10.4 LTS

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39605: - Flags: (was: Important) > PySpark df.count() operation works fine on DBR 7.3 LTS but fails in

[jira] [Updated] (SPARK-39602) Invoking .repartition(100000) in a unit test causes the unit test to take >20 minutes.

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39602: - Issue Type: Improvement (was: Bug) > Invoking .repartition(100000) in a unit test causes the un

[jira] [Commented] (SPARK-39602) Invoking .repartition(100000) in a unit test causes the unit test to take >20 minutes.

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562326#comment-17562326 ] Hyukjin Kwon commented on SPARK-39602: -- Why don't you decrease the number of partit

[jira] [Commented] (SPARK-39603) Dataset planning in a unit test takes a very long time to finish (e.g. >8mins for complex job)

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562327#comment-17562327 ] Hyukjin Kwon commented on SPARK-39603: -- Mind showing the reproducer? It's very diff

[jira] [Commented] (SPARK-39593) Configurable State Checkpointing Frequency in Structured Streaming

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562325#comment-17562325 ] Hyukjin Kwon commented on SPARK-39593: -- cc [~kabhwan] FYI > Configurable State Che

[jira] [Commented] (SPARK-39592) Asynchronous State Checkpointing in Structured Streaming

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562324#comment-17562324 ] Hyukjin Kwon commented on SPARK-39592: -- cc [~kabhwan] FYI > Asynchronous State Che

[jira] [Commented] (SPARK-39591) Offset Management Improvements in Structured Streaming

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562323#comment-17562323 ] Hyukjin Kwon commented on SPARK-39591: -- cc [~kabhwan] FYI > Offset Management Impr

[jira] [Commented] (SPARK-39588) Inspect the State of Stateful Streaming Pipelines

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562320#comment-17562320 ] Hyukjin Kwon commented on SPARK-39588: -- cc [~kabhwan] FYI > Inspect the State of S

[jira] [Commented] (SPARK-39586) Advanced Windowing in Structured Streaming

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562318#comment-17562318 ] Hyukjin Kwon commented on SPARK-39586: -- cc [~kabhwan] FYI > Advanced Windowing in

[jira] [Commented] (SPARK-39585) Multiple Stateful Operators in Structured Streaming

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562317#comment-17562317 ] Hyukjin Kwon commented on SPARK-39585: -- cc. [~kabhwan] FYI > Multiple Stateful Ope

[jira] [Commented] (SPARK-39587) Schema Evolution for Stateful Pipelines in Structured Streaming

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562319#comment-17562319 ] Hyukjin Kwon commented on SPARK-39587: -- cc [~kabhwan] FYI > Schema Evolution for S

[jira] [Commented] (SPARK-39589) Asynchronous I/O API support in Structured Streaming

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562321#comment-17562321 ] Hyukjin Kwon commented on SPARK-39589: -- cc [~kabhwan] FYI > Asynchronous I/O API s

[jira] [Commented] (SPARK-39578) The driver cannot parse the SQL statement and the job is not executed

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562315#comment-17562315 ] Hyukjin Kwon commented on SPARK-39578: -- Spark 2.X is EOL. Mind trying Spark 3.X and

[jira] [Commented] (SPARK-39568) when using df.astype("str") on pyspark dataframe. None are converted "None"

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562314#comment-17562314 ] Hyukjin Kwon commented on SPARK-39568: -- [~itholic] [~XinrongM] [~podongfeng] FYI >

[jira] [Resolved] (SPARK-39669) Support unpivot function

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-39669. -- Resolution: Duplicate > Support unpivot function

[jira] [Updated] (SPARK-39671) insert overwrite table java.lang.NoSuchMethodException: org.apache.hadoop.hive.ql.metadata.Hive.loadPartition .This problem occurred when we installed Apache Spark3.0.1-

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39671: - Component/s: SQL (was: Pandas API on Spark) > insert overwrite table java.l

[jira] [Commented] (SPARK-39671) insert overwrite table java.lang.NoSuchMethodException: org.apache.hadoop.hive.ql.metadata.Hive.loadPartition .This problem occurred when we installed Apache Spark3.0.

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562312#comment-17562312 ] Hyukjin Kwon commented on SPARK-39671: -- [~xinjian.zhang.bean] mind sharing the self

[jira] [Updated] (SPARK-39611) PySpark support numpy 1.23.X

2022-07-04 Thread Yikun Jiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yikun Jiang updated SPARK-39611: Description: {code:java} == ER

[jira] [Resolved] (SPARK-39062) Add Standalone backend support for Stage Level Scheduling

2022-07-04 Thread wuyi (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuyi resolved SPARK-39062. -- Fix Version/s: 3.4.0 Assignee: huangtengfei Resolution: Fixed Issue resolved by https://github

[jira] [Assigned] (SPARK-39675) Switch 'spark.sql.codegen.factoryMode' configuration from testing purpose to internal purpose

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39675: Assignee: (was: Apache Spark) > Switch 'spark.sql.codegen.factoryMode' configuration

[jira] [Commented] (SPARK-39675) Switch 'spark.sql.codegen.factoryMode' configuration from testing purpose to internal purpose

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562302#comment-17562302 ] Apache Spark commented on SPARK-39675: -- User 'HyukjinKwon' has created a pull reque

[jira] [Assigned] (SPARK-39675) Switch 'spark.sql.codegen.factoryMode' configuration from testing purpose to internal purpose

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39675: Assignee: Apache Spark > Switch 'spark.sql.codegen.factoryMode' configuration from testin

[jira] [Created] (SPARK-39675) Switch 'spark.sql.codegen.factoryMode' configuration from testing purpose to internal purpose

2022-07-04 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-39675: Summary: Switch 'spark.sql.codegen.factoryMode' configuration from testing purpose to internal purpose Key: SPARK-39675 URL: https://issues.apache.org/jira/browse/SPARK-39675

[jira] [Created] (SPARK-39674) Initial Protobuf Definition for Spark Connect API

2022-07-04 Thread Martin Grund (Jira)
Martin Grund created SPARK-39674: Summary: Initial Protobuf Definition for Spark Connect API Key: SPARK-39674 URL: https://issues.apache.org/jira/browse/SPARK-39674 Project: Spark Issue Type:

[jira] [Created] (SPARK-39673) High-Level Design Doc for Spark Connect

2022-07-04 Thread Martin Grund (Jira)
Martin Grund created SPARK-39673: Summary: High-Level Design Doc for Spark Connect Key: SPARK-39673 URL: https://issues.apache.org/jira/browse/SPARK-39673 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-39653) Remove `ColumnVectorUtils#populate(WritableColumnVector, InternalRow, int) ` method

2022-07-04 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen reassigned SPARK-39653: Assignee: Yang Jie > Remove `ColumnVectorUtils#populate(WritableColumnVector, InternalRow

[jira] [Resolved] (SPARK-39653) Remove `ColumnVectorUtils#populate(WritableColumnVector, InternalRow, int) ` method

2022-07-04 Thread Sean R. Owen (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-39653. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37045 [https://gi

[jira] [Commented] (SPARK-38904) Low cost DataFrame schema swap util

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562224#comment-17562224 ] Apache Spark commented on SPARK-38904: -- User 'cloud-fan' has created a pull request

[jira] [Commented] (SPARK-38904) Low cost DataFrame schema swap util

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562225#comment-17562225 ] Apache Spark commented on SPARK-38904: -- User 'cloud-fan' has created a pull request

[jira] [Assigned] (SPARK-39672) NotExists subquery failed with conflicting attributes

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39672: Assignee: (was: Apache Spark) > NotExists subquery failed with conflicting attributes

[jira] [Commented] (SPARK-39672) NotExists subquery failed with conflicting attributes

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562190#comment-17562190 ] Apache Spark commented on SPARK-39672: -- User 'manuzhang' has created a pull request

[jira] [Commented] (SPARK-39672) NotExists subquery failed with conflicting attributes

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562189#comment-17562189 ] Apache Spark commented on SPARK-39672: -- User 'manuzhang' has created a pull request

[jira] [Assigned] (SPARK-39672) NotExists subquery failed with conflicting attributes

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39672: Assignee: Apache Spark > NotExists subquery failed with conflicting attributes >

[jira] [Created] (SPARK-39672) NotExists subquery failed with conflicting attributes

2022-07-04 Thread Manu Zhang (Jira)
Manu Zhang created SPARK-39672: -- Summary: NotExists subquery failed with conflicting attributes Key: SPARK-39672 URL: https://issues.apache.org/jira/browse/SPARK-39672 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-39649) Make listDatabases / getDatabase / listColumns / refreshTable in PySpark support 3-layer-namespace

2022-07-04 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-39649. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37039 [https://gith

[jira] [Assigned] (SPARK-39649) Make listDatabases / getDatabase / listColumns / refreshTable in PySpark support 3-layer-namespace

2022-07-04 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-39649: --- Assignee: Ruifeng Zheng > Make listDatabases / getDatabase / listColumns / refreshTable in

[jira] [Commented] (SPARK-39656) Fix wrong namespace in DescribeNamespaceExec

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562164#comment-17562164 ] Apache Spark commented on SPARK-39656: -- User 'ulysses-you' has created a pull reque

[jira] [Commented] (SPARK-39656) Fix wrong namespace in DescribeNamespaceExec

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562163#comment-17562163 ] Apache Spark commented on SPARK-39656: -- User 'ulysses-you' has created a pull reque

[jira] [Commented] (SPARK-39656) Fix wrong namespace in DescribeNamespaceExec

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562161#comment-17562161 ] Apache Spark commented on SPARK-39656: -- User 'ulysses-you' has created a pull reque

[jira] [Closed] (SPARK-39670) pyspark set "spark.hadoop.validateOutputSpecs" to false can't take effect

2022-07-04 Thread youngxinler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] youngxinler closed SPARK-39670. --- > pyspark set "spark.hadoop.validateOutputSpecs" to false can't take effect > --

[jira] [Resolved] (SPARK-39670) pyspark set "spark.hadoop.validateOutputSpecs" to false can't take effect

2022-07-04 Thread youngxinler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] youngxinler resolved SPARK-39670. - Resolution: Not A Problem self code error > pyspark set "spark.hadoop.validateOutputSpecs" to f

[jira] [Resolved] (SPARK-39666) Use UnsafeProjection.create to respect `spark.sql.codegen.factoryMode` in ExpresssionEncoder

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-39666. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37067 [https://gi

[jira] [Assigned] (SPARK-39666) Use UnsafeProjection.create to respect `spark.sql.codegen.factoryMode` in ExpresssionEncoder

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-39666: Assignee: Hyukjin Kwon > Use UnsafeProjection.create to respect `spark.sql.codegen.factor

[jira] [Comment Edited] (SPARK-38958) Override S3 Client in Spark Write/Read calls

2022-07-04 Thread Daniel Carl Jones (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562117#comment-17562117 ] Daniel Carl Jones edited comment on SPARK-38958 at 7/4/22 10:06 AM: --

[jira] [Comment Edited] (SPARK-38958) Override S3 Client in Spark Write/Read calls

2022-07-04 Thread Daniel Carl Jones (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562117#comment-17562117 ] Daniel Carl Jones edited comment on SPARK-38958 at 7/4/22 10:03 AM: --

[jira] [Updated] (SPARK-39619) PrometheusServlet: add "TYPE" comment to exposed metrics

2022-07-04 Thread Eric Barault (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Barault updated SPARK-39619: - Description: The PrometheusServlet sink does not include the usual comments when exposing the m

[jira] [Created] (SPARK-39671) insert overwrite table java.lang.NoSuchMethodException: org.apache.hadoop.hive.ql.metadata.Hive.loadPartition. This problem occurred when we installed Apache Spark3.0.1-

2022-07-04 Thread xin (Jira)
xin created SPARK-39671: --- Summary: insert overwrite table java.lang.NoSuchMethodException: org.apache.hadoop.hive.ql.metadata.Hive.loadPartition. This problem occurred when we installed Apache Spark3.0.1-hadoop3.0 in CDH6.1.1 Key: SPARK-

[jira] [Commented] (SPARK-38958) Override S3 Client in Spark Write/Read calls

2022-07-04 Thread Daniel Carl Jones (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562117#comment-17562117 ] Daniel Carl Jones commented on SPARK-38958: --- There is a workaround which may h

[jira] [Commented] (SPARK-38958) Override S3 Client in Spark Write/Read calls

2022-07-04 Thread Daniel Carl Jones (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562115#comment-17562115 ] Daniel Carl Jones commented on SPARK-38958: --- I'm not aware of documented suppo

[jira] [Created] (SPARK-39670) pyspark set "spark.hadoop.validateOutputSpecs" to false can't take effect

2022-07-04 Thread youngxinler (Jira)
youngxinler created SPARK-39670: --- Summary: pyspark set "spark.hadoop.validateOutputSpecs" to false can't take effect Key: SPARK-39670 URL: https://issues.apache.org/jira/browse/SPARK-39670 Project: Spar

[jira] [Commented] (SPARK-39081) Impl DataFrame.resample and Series.resample

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562053#comment-17562053 ] Apache Spark commented on SPARK-39081: -- User 'zhengruifeng' has created a pull requ

[jira] [Commented] (SPARK-39081) Impl DataFrame.resample and Series.resample

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562051#comment-17562051 ] Apache Spark commented on SPARK-39081: -- User 'zhengruifeng' has created a pull requ

[jira] [Updated] (SPARK-39430) The inconsistent timezone in Spark History Server UI

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39430: - Priority: Major (was: Critical) > The inconsistent timezone in Spark History Server UI > --

[jira] [Updated] (SPARK-39666) Use UnsafeProjection.create to respect `spark.sql.codegen.factoryMode` in ExpressionEncoder

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39666: - Summary: Use UnsafeProjection.create to respect `spark.sql.codegen.factoryMode` in ExpressionEn

[jira] [Updated] (SPARK-33782) Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-33782: - Target Version/s: 3.4.0 (was: 3.3.0) > Place spark.files, spark.jars and spark.files under the

[jira] [Commented] (SPARK-39665) (GitHub CI) Bump workflow versions

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562041#comment-17562041 ] Apache Spark commented on SPARK-39665: -- User 'ArjunSharda' has created a pull reque

[jira] [Commented] (SPARK-39515) Improve/recover scheduled jobs in GitHub Actions

2022-07-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17562040#comment-17562040 ] Hyukjin Kwon commented on SPARK-39515: -- Thanks [~yikunkero] > Improve/recover sche

[jira] [Assigned] (SPARK-39665) (GitHub CI) Bump workflow versions

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39665: Assignee: Apache Spark > (GitHub CI) Bump workflow versions > ---

[jira] [Assigned] (SPARK-39665) (GitHub CI) Bump workflow versions

2022-07-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39665: Assignee: (was: Apache Spark) > (GitHub CI) Bump workflow versions >
