[jira] [Created] (SPARK-36214) Add add_categories to CategoricalAccessor and CategoricalIndex.

2021-07-19 Thread Takuya Ueshin (Jira)
Takuya Ueshin created SPARK-36214: - Summary: Add add_categories to CategoricalAccessor and CategoricalIndex. Key: SPARK-36214 URL: https://issues.apache.org/jira/browse/SPARK-36214 Project: Spark

[jira] [Updated] (SPARK-36217) Rename CustomShuffleReader and OptimizeLocalShuffleReader

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-36217: - Description: The name {{CustomShuffleReader}} is confusing and sounds like an API. This should

[jira] [Commented] (SPARK-36220) Incorrect pyspark.sql.types.Row __new__ and __init__ type annotations

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383727#comment-17383727 ] Apache Spark commented on SPARK-36220: -- User 'tobiasedwards' has created a pull request for this

[jira] [Assigned] (SPARK-36220) Incorrect pyspark.sql.types.Row __new__ and __init__ type annotations

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36220: Assignee: Apache Spark > Incorrect pyspark.sql.types.Row __new__ and __init__ type

[jira] [Assigned] (SPARK-36220) Incorrect pyspark.sql.types.Row __new__ and __init__ type annotations

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36220: Assignee: (was: Apache Spark) > Incorrect pyspark.sql.types.Row __new__ and __init__

[jira] [Assigned] (SPARK-36217) Rename CustomShuffleReader and OptimizeLocalShuffleReader

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36217: Assignee: (was: Apache Spark) > Rename CustomShuffleReader and

[jira] [Commented] (SPARK-36217) Rename CustomShuffleReader and OptimizeLocalShuffleReader

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383732#comment-17383732 ] Apache Spark commented on SPARK-36217: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-36217) Rename CustomShuffleReader and OptimizeLocalShuffleReader

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36217: Assignee: Apache Spark > Rename CustomShuffleReader and OptimizeLocalShuffleReader >

[jira] [Commented] (SPARK-36217) Rename CustomShuffleReader and OptimizeLocalShuffleReader

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383731#comment-17383731 ] Apache Spark commented on SPARK-36217: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Created] (SPARK-36221) Make sure CustomShuffleReaderExec has at least one partition

2021-07-19 Thread XiDuo You (Jira)
XiDuo You created SPARK-36221: - Summary: Make sure CustomShuffleReaderExec has at least one partition Key: SPARK-36221 URL: https://issues.apache.org/jira/browse/SPARK-36221 Project: Spark

[jira] [Commented] (SPARK-36046) Support new functions make_timestamp_ntz and make_timestamp_ltz

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383736#comment-17383736 ] Apache Spark commented on SPARK-36046: -- User 'beliefer' has created a pull request for this issue:

[jira] [Assigned] (SPARK-36216) Increase timeout for StreamingLinearRegressionWithTests.test_parameter_convergence

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-36216: Assignee: Hyukjin Kwon > Increase timeout for >

[jira] [Resolved] (SPARK-36216) Increase timeout for StreamingLinearRegressionWithTests.test_parameter_convergence

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36216. -- Fix Version/s: 3.1.3 3.2.0 3.0.4 Resolution:

[jira] [Commented] (SPARK-36046) Support new functions make_timestamp_ntz and make_timestamp_ltz

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383738#comment-17383738 ] Apache Spark commented on SPARK-36046: -- User 'beliefer' has created a pull request for this issue:

[jira] [Assigned] (SPARK-36093) The result incorrect if the partition path case is inconsistent

2021-07-19 Thread Yuming Wang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang reassigned SPARK-36093: --- Assignee: angerszhu (was: Apache Spark) > The result incorrect if the partition path case

[jira] [Updated] (SPARK-36196) Spark FetchFailedException Stream is corrupted Error

2021-07-19 Thread Arghya Saha (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arghya Saha updated SPARK-36196: Component/s: PySpark Kubernetes > Spark FetchFailedException Stream is corrupted

[jira] [Assigned] (SPARK-36221) Make sure CustomShuffleReaderExec has at least one partition

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36221: Assignee: Apache Spark > Make sure CustomShuffleReaderExec has at least one partition >

[jira] [Assigned] (SPARK-36221) Make sure CustomShuffleReaderExec has at least one partition

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36221: Assignee: (was: Apache Spark) > Make sure CustomShuffleReaderExec has at least one

[jira] [Commented] (SPARK-36221) Make sure CustomShuffleReaderExec has at least one partition

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383751#comment-17383751 ] Apache Spark commented on SPARK-36221: -- User 'ulysses-you' has created a pull request for this

[jira] [Updated] (SPARK-35228) Add expression ToHiveString for keep consistent between hive/spark format in df.show and transform

2021-07-19 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-35228: Parent: (was: SPARK-27790) Issue Type: Improvement (was: Sub-task) > Add expression

[jira] [Commented] (SPARK-35815) Allow delayThreshold for watermark to be represented as ANSI day-time/year-month interval literals

2021-07-19 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383780#comment-17383780 ] Wenchen Fan commented on SPARK-35815: - [~sarutak] do we have any more blockers for this one? >

[jira] [Commented] (SPARK-35815) Allow delayThreshold for watermark to be represented as ANSI day-time/year-month interval literals

2021-07-19 Thread Kousuke Saruta (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383781#comment-17383781 ] Kousuke Saruta commented on SPARK-35815: I don't think so, and I'll work on this soon. > Allow

[jira] [Updated] (SPARK-35809) Add `index_col` argument for ps.sql.

2021-07-19 Thread Haejoon Lee (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haejoon Lee updated SPARK-35809: Description: The current behavior of [ps.sql 

[jira] [Created] (SPARK-36218) Flaky Test: TPC-DS in PR builder

2021-07-19 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-36218: Summary: Flaky Test: TPC-DS in PR builder Key: SPARK-36218 URL: https://issues.apache.org/jira/browse/SPARK-36218 Project: Spark Issue Type: Test

[jira] [Issue Comment Deleted] (SPARK-36218) Flaky Test: TPC-DS in PR builder

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-36218: - Comment: was deleted (was: If we're not sure, I think we can land the same hacky fix for now

[jira] [Comment Edited] (SPARK-36218) Flaky Test: TPC-DS in PR builder

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383707#comment-17383707 ] Hyukjin Kwon edited comment on SPARK-36218 at 7/20/21, 2:44 AM: cc

[jira] [Commented] (SPARK-36218) Flaky Test: TPC-DS in PR builder

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383709#comment-17383709 ] Hyukjin Kwon commented on SPARK-36218: -- Let me create a PR for now as a temporary workaround ... >

[jira] [Issue Comment Deleted] (SPARK-36218) Flaky Test: TPC-DS in PR builder

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-36218: - Comment: was deleted (was: Let me create a PR for now as a temporary workaround ...) > Flaky

[jira] [Updated] (SPARK-36219) Add flag to allow Driver to request for OPPORTUNISTIC containers

2021-07-19 Thread chaosju (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chaosju updated SPARK-36219: Description: YARN-2882 and YARN-4335 introduce the concept of container ExecutionTypes and specifically

[jira] [Updated] (SPARK-36219) Add flag to allow Driver to request for OPPORTUNISTIC containers

2021-07-19 Thread chaosju (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chaosju updated SPARK-36219: Description: YARN-2882 and YARN-4335 introduce the concept of container ExecutionTypes and specifically

[jira] [Updated] (SPARK-36219) Add flag to allow Driver to request for OPPORTUNISTIC containers

2021-07-19 Thread chaosju (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chaosju updated SPARK-36219: Affects Version/s: (was: 3.1.2) 3.3.0 > Add flag to allow Driver to request

[jira] [Commented] (SPARK-35809) Add `index_col` argument for ps.sql.

2021-07-19 Thread Haejoon Lee (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383679#comment-17383679 ] Haejoon Lee commented on SPARK-35809: - I'm working on this > Add `index_col` argument for ps.sql. >

[jira] [Assigned] (SPARK-36179) Support TimestampNTZType in SparkGetColumnsOperation

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-36179: Assignee: Kent Yao > Support TimestampNTZType in SparkGetColumnsOperation >

[jira] [Resolved] (SPARK-36179) Support TimestampNTZType in SparkGetColumnsOperation

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36179. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33393

[jira] [Created] (SPARK-36215) Add logging for slow fetches to diagnose external shuffle service issues

2021-07-19 Thread Shardul Mahadik (Jira)
Shardul Mahadik created SPARK-36215: --- Summary: Add logging for slow fetches to diagnose external shuffle service issues Key: SPARK-36215 URL: https://issues.apache.org/jira/browse/SPARK-36215

[jira] [Commented] (SPARK-35807) Deprecate the `num_files` argument

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383698#comment-17383698 ] Apache Spark commented on SPARK-35807: -- User 'itholic' has created a pull request for this issue:

[jira] [Commented] (SPARK-35807) Deprecate the `num_files` argument

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383699#comment-17383699 ] Apache Spark commented on SPARK-35807: -- User 'itholic' has created a pull request for this issue:

[jira] [Updated] (SPARK-36216) Increase timeout for StreamingLinearRegressionWithTests.test_parameter_convergence

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-36216: - Docs Text: (was: Test is flaky (https://github.com/apache/spark/runs/3109815586): {code}

[jira] [Updated] (SPARK-36216) Increase timeout for StreamingLinearRegressionWithTests.test_parameter_convergence

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-36216: - Description: Test is flaky (https://github.com/apache/spark/runs/3109815586): {code} Traceback

[jira] [Created] (SPARK-36216) Increase timeout for StreamingLinearRegressionWithTests.test_parameter_convergence

2021-07-19 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-36216: Summary: Increase timeout for StreamingLinearRegressionWithTests.test_parameter_convergence Key: SPARK-36216 URL: https://issues.apache.org/jira/browse/SPARK-36216

[jira] [Assigned] (SPARK-36216) Increase timeout for StreamingLinearRegressionWithTests.test_parameter_convergence

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36216: Assignee: Apache Spark > Increase timeout for >

[jira] [Commented] (SPARK-36216) Increase timeout for StreamingLinearRegressionWithTests.test_parameter_convergence

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383701#comment-17383701 ] Apache Spark commented on SPARK-36216: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Created] (SPARK-36217) Rename CustomShuffleReader and OptimizeLocalShuffleReader

2021-07-19 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-36217: Summary: Rename CustomShuffleReader and OptimizeLocalShuffleReader Key: SPARK-36217 URL: https://issues.apache.org/jira/browse/SPARK-36217 Project: Spark

[jira] [Assigned] (SPARK-36216) Increase timeout for StreamingLinearRegressionWithTests.test_parameter_convergence

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36216: Assignee: (was: Apache Spark) > Increase timeout for >

[jira] [Assigned] (SPARK-35807) Deprecate the `num_files` argument

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-35807: Assignee: Haejoon Lee > Deprecate the `num_files` argument >

[jira] [Updated] (SPARK-35807) Deprecate the `num_files` argument

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-35807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-35807: - Fix Version/s: 3.2.0 > Deprecate the `num_files` argument > --

[jira] [Commented] (SPARK-36218) Flaky Test: TPC-DS in PR builder

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383707#comment-17383707 ] Hyukjin Kwon commented on SPARK-36218: -- cc [~maropu], [~cloud_fan], [~dongjoon] FYI. Actually, I

[jira] [Comment Edited] (SPARK-36218) Flaky Test: TPC-DS in PR builder

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383707#comment-17383707 ] Hyukjin Kwon edited comment on SPARK-36218 at 7/20/21, 2:41 AM: cc

[jira] [Commented] (SPARK-36218) Flaky Test: TPC-DS in PR builder

2021-07-19 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383708#comment-17383708 ] Hyukjin Kwon commented on SPARK-36218: -- If we're not sure, I think we can land the same hacky fix

[jira] [Updated] (SPARK-36093) The result incorrect if the partition path case is inconsistent

2021-07-19 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-36093: Fix Version/s: 3.0.4 > The result incorrect if the partition path case is inconsistent >

[jira] [Created] (SPARK-36219) Add flag to allow Driver to request for OPPORTUNISTIC containers

2021-07-19 Thread chaosju (Jira)
chaosju created SPARK-36219: --- Summary: Add flag to allow Driver to request for OPPORTUNISTIC containers Key: SPARK-36219 URL: https://issues.apache.org/jira/browse/SPARK-36219 Project: Spark

[jira] [Created] (SPARK-36220) Incorrect pyspark.sql.types.Row __new__ and __init__ type annotations

2021-07-19 Thread Tobias Edwards (Jira)
Tobias Edwards created SPARK-36220: -- Summary: Incorrect pyspark.sql.types.Row __new__ and __init__ type annotations Key: SPARK-36220 URL: https://issues.apache.org/jira/browse/SPARK-36220 Project:

[jira] [Commented] (SPARK-32709) Write Hive ORC/Parquet bucketed table with hivehash (for Hive 1,2)

2021-07-19 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383792#comment-17383792 ] Apache Spark commented on SPARK-32709: -- User 'c21' has created a pull request for this issue:
