[jira] [Updated] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated SPARK-39833: - Affects Version/s: 3.3.0 > Filtered parquet data frame count() and show() produce inconsistent results

[jira] [Updated] (SPARK-39988) LevelDBIterator not close after used in `RemoteBlockPushResolver`, `YarnShuffleService` and `ExternalShuffleBlockResolver`

2022-08-04 Thread Yang Jie (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-39988: - Description: For example: {code:java} @VisibleForTesting static ConcurrentMap

[jira] [Commented] (SPARK-39988) LevelDBIterator not close after used in `RemoteBlockPushResolver`, `YarnShuffleService` and `ExternalShuffleBlockResolver`

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575574#comment-17575574 ] Apache Spark commented on SPARK-39988: -- User 'LuciferYang' has created a pull request for this

[jira] [Assigned] (SPARK-39988) LevelDBIterator not close after used in `RemoteBlockPushResolver`, `YarnShuffleService` and `ExternalShuffleBlockResolver`

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39988: Assignee: Apache Spark > LevelDBIterator not close after used in

[jira] [Assigned] (SPARK-39988) LevelDBIterator not close after used in `RemoteBlockPushResolver`, `YarnShuffleService` and `ExternalShuffleBlockResolver`

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39988: Assignee: (was: Apache Spark) > LevelDBIterator not close after used in

[jira] [Created] (SPARK-39988) LevelDBIterator not close after used in `RemoteBlockPushResolver`, `YarnShuffleService` and `ExternalShuffleBlockResolver`

2022-08-04 Thread Yang Jie (Jira)
Yang Jie created SPARK-39988: Summary: LevelDBIterator not close after used in `RemoteBlockPushResolver`, `YarnShuffleService` and `ExternalShuffleBlockResolver` Key: SPARK-39988 URL:

[jira] [Commented] (SPARK-39988) LevelDBIterator not close after used in `RemoteBlockPushResolver`, `YarnShuffleService` and `ExternalShuffleBlockResolver`

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575572#comment-17575572 ] Apache Spark commented on SPARK-39988: -- User 'LuciferYang' has created a pull request for this

[jira] [Commented] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575573#comment-17575573 ] Apache Spark commented on SPARK-39833: -- User 'sadikovi' has created a pull request for this issue:

[jira] [Assigned] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39833: Assignee: Apache Spark > Filtered parquet data frame count() and show() produce

[jira] [Assigned] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39833: Assignee: (was: Apache Spark) > Filtered parquet data frame count() and show()

[jira] [Comment Edited] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575032#comment-17575032 ] Ivan Sadikov edited comment on SPARK-39833 at 8/5/22 5:07 AM: -- Your example

[jira] [Commented] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575571#comment-17575571 ] Apache Spark commented on SPARK-39833: -- User 'sadikovi' has created a pull request for this issue:

[jira] [Commented] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575570#comment-17575570 ] Ivan Sadikov commented on SPARK-39833: -- I opened a PR to quickly fix it:

[jira] [Commented] (SPARK-39987) Support PEAK_JVM_(ON|OFF)HEAP_MEMORY executor rolling policy

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1757#comment-1757 ] Apache Spark commented on SPARK-39987: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-39987) Support PEAK_JVM_(ON|OFF)HEAP_MEMORY executor rolling policy

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39987: Assignee: Apache Spark > Support PEAK_JVM_(ON|OFF)HEAP_MEMORY executor rolling policy >

[jira] [Assigned] (SPARK-39987) Support PEAK_JVM_(ON|OFF)HEAP_MEMORY executor rolling policy

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39987: Assignee: (was: Apache Spark) > Support PEAK_JVM_(ON|OFF)HEAP_MEMORY executor

[jira] [Commented] (SPARK-39965) Spark on K8s delete pvc even though it's not being used.

2022-08-04 Thread pralabhkumar (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575554#comment-17575554 ] pralabhkumar commented on SPARK-39965: -- [~dongjoon] Please review. > Spark on K8s delete pvc even

[jira] [Created] (SPARK-39987) Support PEAK_JVM_(ON|OFF)HEAP_MEMORY executor rolling policy

2022-08-04 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-39987: - Summary: Support PEAK_JVM_(ON|OFF)HEAP_MEMORY executor rolling policy Key: SPARK-39987 URL: https://issues.apache.org/jira/browse/SPARK-39987 Project: Spark

[jira] [Commented] (SPARK-33782) Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575553#comment-17575553 ] Apache Spark commented on SPARK-33782: -- User 'pralabhkumar' has created a pull request for this

[jira] [Commented] (SPARK-33782) Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575552#comment-17575552 ] Apache Spark commented on SPARK-33782: -- User 'pralabhkumar' has created a pull request for this

[jira] [Assigned] (SPARK-33782) Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33782: Assignee: Apache Spark > Place spark.files, spark.jars and spark.files under the current

[jira] [Assigned] (SPARK-33782) Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-33782: Assignee: (was: Apache Spark) > Place spark.files, spark.jars and spark.files under

[jira] [Resolved] (SPARK-39775) Regression due to AVRO-2035

2022-08-04 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-39775. - Fix Version/s: 3.3.1 3.2.3 3.4.0 Resolution: Fixed

[jira] [Assigned] (SPARK-39775) Regression due to AVRO-2035

2022-08-04 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-39775: --- Assignee: Yuming Wang > Regression due to AVRO-2035 > --- > >

[jira] [Commented] (SPARK-39743) Unable to set zstd compression level while writing parquet files

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575542#comment-17575542 ] Apache Spark commented on SPARK-39743: -- User 'ming95' has created a pull request for this

[jira] [Commented] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575538#comment-17575538 ] Felipe commented on SPARK-39971: [~yumwang] I added the query plans to all the scenarios. (note that

[jira] [Assigned] (SPARK-39986) Better example for Co-grouped Map

2022-08-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-39986: Assignee: Xinrong Meng > Better example for Co-grouped Map >

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Attachment: 2.2.AfterAnalyzeTable WITHOUT ForAllColumns-joinreorder-enabled.txt

[jira] [Resolved] (SPARK-39986) Better example for Co-grouped Map

2022-08-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-39986. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37412

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Attachment: (was: BeforeAnalyzeTable.txt) > ANALYZE TABLE makes some queries run forever >

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Attachment: (was: AfterAnalyzeTable WITHOUT ForAllColumns.txt) > ANALYZE TABLE makes some queries run

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Attachment: (was: AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt) > ANALYZE TABLE makes some

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Attachment: (was: AfterAnalyzeTableForAllColumns.txt) > ANALYZE TABLE makes some queries run forever >

[jira] [Commented] (SPARK-39953) Hudi spark-submits from EMR 5.33 to EMR 6.5

2022-08-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575515#comment-17575515 ] Hyukjin Kwon commented on SPARK-39953: -- [~lavak] can you reproduce this in Apache Spark instead of

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39971: - Component/s: SQL > ANALYZE TABLE makes some queries run forever >

[jira] [Commented] (SPARK-39981) CheckOverflowInTableInsert returns exception rather than throwing it

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575514#comment-17575514 ] Apache Spark commented on SPARK-39981: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-39981) CheckOverflowInTableInsert returns exception rather than throwing it

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39981: Assignee: Apache Spark > CheckOverflowInTableInsert returns exception rather than

[jira] [Assigned] (SPARK-39981) CheckOverflowInTableInsert returns exception rather than throwing it

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39981: Assignee: (was: Apache Spark) > CheckOverflowInTableInsert returns exception rather

[jira] [Comment Edited] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575032#comment-17575032 ] Ivan Sadikov edited comment on SPARK-39833 at 8/5/22 1:48 AM: -- This is

[jira] [Commented] (SPARK-39833) Filtered parquet data frame count() and show() produce inconsistent results when spark.sql.parquet.filterPushdown is true

2022-08-04 Thread Ivan Sadikov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575510#comment-17575510 ] Ivan Sadikov commented on SPARK-39833: -- It appears to be a bug in Parquet-Mr. There is a

[jira] [Updated] (SPARK-39976) NULL check in ArrayIntersect adds extraneous null from first param

2022-08-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-39976: - Priority: Major (was: Blocker) > NULL check in ArrayIntersect adds extraneous null from first

[jira] [Commented] (SPARK-39979) IndexOutOfBoundsException on groupby + apply pandas grouped map udf function

2022-08-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575509#comment-17575509 ] Hyukjin Kwon commented on SPARK-39979: -- This is from error limitation, it has to be fixed now. I

[jira] [Commented] (SPARK-39981) CheckOverflowInTableInsert returns exception rather than throwing it

2022-08-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575508#comment-17575508 ] Hyukjin Kwon commented on SPARK-39981: -- will make a quick fix soon. Thanks for reporting this. >

[jira] [Assigned] (SPARK-39983) Should not cache unserialized broadcast relations on the driver

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39983: Assignee: Apache Spark > Should not cache unserialized broadcast relations on the driver

[jira] [Commented] (SPARK-39983) Should not cache unserialized broadcast relations on the driver

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575506#comment-17575506 ] Apache Spark commented on SPARK-39983: -- User 'alex-balikov' has created a pull request for this

[jira] [Assigned] (SPARK-39983) Should not cache unserialized broadcast relations on the driver

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39983: Assignee: (was: Apache Spark) > Should not cache unserialized broadcast relations on

[jira] [Commented] (SPARK-39983) Should not cache unserialized broadcast relations on the driver

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575505#comment-17575505 ] Apache Spark commented on SPARK-39983: -- User 'alex-balikov' has created a pull request for this

[jira] [Commented] (SPARK-39934) takeRDD in R is slow

2022-08-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575502#comment-17575502 ] Hyukjin Kwon commented on SPARK-39934: -- [~deshanxiao] do you have a reproducer? DataFrame.take

[jira] [Assigned] (SPARK-39984) Check workerLastHeartbeat with master before HeartbeatReceiver expires an executor

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39984: Assignee: Apache Spark > Check workerLastHeartbeat with master before HeartbeatReceiver

[jira] [Commented] (SPARK-39984) Check workerLastHeartbeat with master before HeartbeatReceiver expires an executor

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575499#comment-17575499 ] Apache Spark commented on SPARK-39984: -- User 'kevin85421' has created a pull request for this

[jira] [Assigned] (SPARK-39986) Better example for Co-grouped Map

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39986: Assignee: (was: Apache Spark) > Better example for Co-grouped Map >

[jira] [Assigned] (SPARK-39984) Check workerLastHeartbeat with master before HeartbeatReceiver expires an executor

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39984: Assignee: (was: Apache Spark) > Check workerLastHeartbeat with master before

[jira] [Assigned] (SPARK-39986) Better example for Co-grouped Map

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39986: Assignee: Apache Spark > Better example for Co-grouped Map >

[jira] [Commented] (SPARK-39986) Better example for Co-grouped Map

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575501#comment-17575501 ] Apache Spark commented on SPARK-39986: -- User 'xinrong-meng' has created a pull request for this

[jira] [Commented] (SPARK-39984) Check workerLastHeartbeat with master before HeartbeatReceiver expires an executor

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575500#comment-17575500 ] Apache Spark commented on SPARK-39984: -- User 'kevin85421' has created a pull request for this

[jira] [Created] (SPARK-39986) Better example for Co-grouped Map

2022-08-04 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-39986: Summary: Better example for Co-grouped Map Key: SPARK-39986 URL: https://issues.apache.org/jira/browse/SPARK-39986 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-38493) Improve the test coverage for pyspark/pandas module

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575494#comment-17575494 ] Apache Spark commented on SPARK-38493: -- User 'itholic' has created a pull request for this issue:

[jira] [Created] (SPARK-39985) Test DEFAULT column values with DataFrames

2022-08-04 Thread Daniel (Jira)
Daniel created SPARK-39985: -- Summary: Test DEFAULT column values with DataFrames Key: SPARK-39985 URL: https://issues.apache.org/jira/browse/SPARK-39985 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-39974) Create separate static image tag for infra cache

2022-08-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-39974: Assignee: Yikun Jiang > Create separate static image tag for infra cache >

[jira] [Resolved] (SPARK-39974) Create separate static image tag for infra cache

2022-08-04 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-39974. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37402

[jira] [Created] (SPARK-39984) Check workerLastHeartbeat with master before HeartbeatReceiver expires an executor

2022-08-04 Thread Kai-Hsun Chen (Jira)
Kai-Hsun Chen created SPARK-39984: - Summary: Check workerLastHeartbeat with master before HeartbeatReceiver expires an executor Key: SPARK-39984 URL: https://issues.apache.org/jira/browse/SPARK-39984

[jira] [Commented] (SPARK-39970) Introduce ThrottledLogger to prevent log message flooding caused by network issues

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575467#comment-17575467 ] Apache Spark commented on SPARK-39970: -- User 'kevin85421' has created a pull request for this

[jira] [Assigned] (SPARK-39970) Introduce ThrottledLogger to prevent log message flooding caused by network issues

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39970: Assignee: (was: Apache Spark) > Introduce ThrottledLogger to prevent log message

[jira] [Assigned] (SPARK-39970) Introduce ThrottledLogger to prevent log message flooding caused by network issues

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39970: Assignee: Apache Spark > Introduce ThrottledLogger to prevent log message flooding

[jira] [Commented] (SPARK-39970) Introduce ThrottledLogger to prevent log message flooding caused by network issues

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575466#comment-17575466 ] Apache Spark commented on SPARK-39970: -- User 'kevin85421' has created a pull request for this

[jira] [Created] (SPARK-39983) Should not cache unserialized broadcast relations on the driver

2022-08-04 Thread Alex Balikov (Jira)
Alex Balikov created SPARK-39983: Summary: Should not cache unserialized broadcast relations on the driver Key: SPARK-39983 URL: https://issues.apache.org/jira/browse/SPARK-39983 Project: Spark

[jira] [Commented] (SPARK-39982) StructType.fromJson method missing documentation

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575406#comment-17575406 ] Apache Spark commented on SPARK-39982: -- User 'khalidmammadov' has created a pull request for this

[jira] [Commented] (SPARK-39982) StructType.fromJson method missing documentation

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575405#comment-17575405 ] Apache Spark commented on SPARK-39982: -- User 'khalidmammadov' has created a pull request for this

[jira] [Assigned] (SPARK-39982) StructType.fromJson method missing documentation

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39982: Assignee: (was: Apache Spark) > StructType.fromJson method missing documentation >

[jira] [Assigned] (SPARK-39982) StructType.fromJson method missing documentation

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39982: Assignee: Apache Spark > StructType.fromJson method missing documentation >

[jira] [Created] (SPARK-39982) StructType.fromJson method missing documentation

2022-08-04 Thread Khalid Mammadov (Jira)
Khalid Mammadov created SPARK-39982: --- Summary: StructType.fromJson method missing documentation Key: SPARK-39982 URL: https://issues.apache.org/jira/browse/SPARK-39982 Project: Spark Issue

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Attachment: (was: AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt) > ANALYZE TABLE makes some

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Attachment: AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt > ANALYZE TABLE makes some queries run

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Attachment: AfterAnalyzeTableForAllColumns-joinreorder-disabled.txt > ANALYZE TABLE makes some queries run

[jira] [Updated] (SPARK-39976) NULL check in ArrayIntersect adds extraneous null from first param

2022-08-04 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-39976: -- Labels: (was: corr) > NULL check in ArrayIntersect adds extraneous null from first param >

[jira] [Updated] (SPARK-39976) NULL check in ArrayIntersect adds extraneous null from first param

2022-08-04 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-39976: -- Priority: Blocker (was: Major) > NULL check in ArrayIntersect adds extraneous null from

[jira] [Updated] (SPARK-39976) NULL check in ArrayIntersect adds extraneous null from first param

2022-08-04 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-39976: -- Labels: corr (was: ) > NULL check in ArrayIntersect adds extraneous null from first param >

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Description: I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without the FOR ALL

[jira] [Commented] (SPARK-39976) NULL check in ArrayIntersect adds extraneous null from first param

2022-08-04 Thread Thomas Graves (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575398#comment-17575398 ] Thomas Graves commented on SPARK-39976: --- [~cloud_fan] [~angerszhuuu] who worked on original

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Attachment: AfterAnalyzeTable WITHOUT ForAllColumns.txt AfterAnalyzeTableForAllColumns.txt

[jira] [Assigned] (SPARK-39961) DS V2 push-down translate Cast if the cast is safe

2022-08-04 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-39961: - Assignee: jiaan.geng > DS V2 push-down translate Cast if the cast is safe >

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Description: I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without the FOR ALL

[jira] [Updated] (SPARK-39971) ANALYZE TABLE makes some queries run forever

2022-08-04 Thread Felipe (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felipe updated SPARK-39971: --- Description: I'm using TPCDS to run benchmarks, and after running ANALYZE TABLE (without the FOR ALL

[jira] [Resolved] (SPARK-39961) DS V2 push-down translate Cast if the cast is safe

2022-08-04 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-39961. --- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37388

[jira] [Assigned] (SPARK-39872) HeapByteBuffer#get(int) is a hotspot path when using BytePackerForLong#unpack8Values with ByteBuffer input API

2022-08-04 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-39872: - Assignee: Yang Jie > HeapByteBuffer#get(int) is a hotspot path when using >

[jira] [Resolved] (SPARK-39872) HeapByteBuffer#get(int) is a hotspot path when using BytePackerForLong#unpack8Values with ByteBuffer input API

2022-08-04 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-39872. --- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 37293

[jira] [Created] (SPARK-39981) CheckOverflowInTableInsert returns exception rather than throwing it

2022-08-04 Thread Jason Darrell Lowe (Jira)
Jason Darrell Lowe created SPARK-39981: -- Summary: CheckOverflowInTableInsert returns exception rather than throwing it Key: SPARK-39981 URL: https://issues.apache.org/jira/browse/SPARK-39981

[jira] [Assigned] (SPARK-39876) Unpivot / melt function for SQL

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39876: Assignee: Apache Spark > Unpivot / melt function for SQL >

[jira] [Assigned] (SPARK-39876) Unpivot / melt function for SQL

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39876: Assignee: (was: Apache Spark) > Unpivot / melt function for SQL >

[jira] [Commented] (SPARK-39876) Unpivot / melt function for SQL

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575311#comment-17575311 ] Apache Spark commented on SPARK-39876: -- User 'EnricoMi' has created a pull request for this issue:

[jira] [Commented] (SPARK-37321) Wrong size estimation leads to "Cannot broadcast the table that is larger than 8GB: 8 GB"

2022-08-04 Thread Jira
[ https://issues.apache.org/jira/browse/SPARK-37321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575307#comment-17575307 ] Igor Uchôa commented on SPARK-37321: I'm facing the same situation too. You can see more details

[jira] [Commented] (SPARK-39980) Change infra image to static tag

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575256#comment-17575256 ] Apache Spark commented on SPARK-39980: -- User 'Yikun' has created a pull request for this issue:

[jira] [Assigned] (SPARK-39980) Change infra image to static tag

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39980: Assignee: Apache Spark > Change infra image to static tag >

[jira] [Assigned] (SPARK-39980) Change infra image to static tag

2022-08-04 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-39980: Assignee: (was: Apache Spark) > Change infra image to static tag >

[jira] [Commented] (SPARK-32952) Test failure on IBM Z: CoalesceShufflePartitionsSuite: - determining the number of reducers: complex query 1

2022-08-04 Thread Vivian Kong (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575252#comment-17575252 ] Vivian Kong commented on SPARK-32952: - The test is still failing on Spark v3.3.0 on IBM Z.  We

[jira] [Commented] (SPARK-35520) Spark-SQL test fails on IBM Z for certain config combinations.

2022-08-04 Thread Vivian Kong (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575254#comment-17575254 ] Vivian Kong commented on SPARK-35520: - The test is still failing on Spark v3.3.0 on IBM Z.  We

[jira] [Created] (SPARK-39980) Change infra image to static tag

2022-08-04 Thread Yikun Jiang (Jira)
Yikun Jiang created SPARK-39980: --- Summary: Change infra image to static tag Key: SPARK-39980 URL: https://issues.apache.org/jira/browse/SPARK-39980 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-39979) IndexOutOfBoundsException on groupby + apply pandas grouped map udf function

2022-08-04 Thread yaniv oren (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yaniv oren updated SPARK-39979: --- Description: I'm grouping on relatively small subset of groups with big size groups. Working with

[jira] [Updated] (SPARK-39979) IndexOutOfBoundsException on groupby + apply pandas grouped map udf function

2022-08-04 Thread yaniv oren (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yaniv oren updated SPARK-39979: --- Description: I'm grouping on relatively small subset of groups with big size groups. Working with

[jira] [Updated] (SPARK-39979) IndexOutOfBoundsException on groupby + apply pandas grouped map udf function

2022-08-04 Thread yaniv oren (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yaniv oren updated SPARK-39979: --- Summary: IndexOutOfBoundsException on groupby + apply pandas grouped map udf function (was:
