[jira] [Comment Edited] (SPARK-32536) deleted not existing hdfs locations when use spark sql to execute "insert overwrite" statement to dynamic partition

2020-08-06 Thread yx91490 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172889#comment-17172889 ] yx91490 edited comment on SPARK-32536 at 8/7/20, 6:49 AM: -- the

[jira] [Commented] (SPARK-32180) Getting Started - Installation

2020-08-06 Thread Rohit Mishra (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172938#comment-17172938 ] Rohit Mishra commented on SPARK-32180: -- [~hyukjin.kwon], I will start contributing

[jira] [Comment Edited] (SPARK-32526) Let sql/catalyst module tests pass for Scala 2.13

2020-08-06 Thread Yang Jie (Jira)
>Affects Versions: 3.0.0 >Reporter: Yang Jie >Priority: Minor > Attachments: failed-and-aborted-20200806 > > > sql/catalyst module has following compile errors with scala-2.13 profile: > {code:java} > [ERROR] [Error] > /Users/yangjie01/Source

[jira] [Resolved] (SPARK-32560) improve exception message

2020-08-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-32560. -- Fix Version/s: 3.1.0 2.4.7 3.0.1 Resolution: Fixed

[jira] [Assigned] (SPARK-32560) improve exception message

2020-08-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-32560: Assignee: philipse > improve exception message > - > >

[jira] [Commented] (SPARK-32536) deleted not existing hdfs locations when use spark sql to execute "insert overwrite" statement to dynamic partition

2020-08-06 Thread yx91490 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172889#comment-17172889 ] yx91490 commented on SPARK-32536: - the method org.apache.hadoop.hive.ql.metadata.Hive.de

[jira] [Updated] (SPARK-32562) Pyspark drop duplicate columns

2020-08-06 Thread abhijeet dada mote (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] abhijeet dada mote updated SPARK-32562: --- Description: Hi All, This is one suggestion can we have a feature in pyspark to re

[jira] [Created] (SPARK-32562) Pyspark drop duplicate columns

2020-08-06 Thread abhijeet dada mote (Jira)
abhijeet dada mote created SPARK-32562: -- Summary: Pyspark drop duplicate columns Key: SPARK-32562 URL: https://issues.apache.org/jira/browse/SPARK-32562 Project: Spark Issue Type: Improv

[jira] [Commented] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172805#comment-17172805 ] Ramakrishna Prasad K S commented on SPARK-32558: [~rohitmishr1484] [~hyu

[jira] [Comment Edited] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172805#comment-17172805 ] Ramakrishna Prasad K S edited comment on SPARK-32558 at 8/7/20, 4:07 AM: -

[jira] [Commented] (SPARK-32560) improve exception message

2020-08-06 Thread philipse (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172791#comment-17172791 ] philipse commented on SPARK-32560: --   Thanks [~maropu] for you notice. will improve it

[jira] [Commented] (SPARK-31703) Changes made by SPARK-26985 break reading parquet files correctly in BigEndian architectures (AIX + LinuxPPC64)

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172786#comment-17172786 ] Apache Spark commented on SPARK-31703: -- User 'tinhto-000' has created a pull reques

[jira] [Assigned] (SPARK-31703) Changes made by SPARK-26985 break reading parquet files correctly in BigEndian architectures (AIX + LinuxPPC64)

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-31703: Assignee: (was: Apache Spark) > Changes made by SPARK-26985 break reading parquet fil

[jira] [Commented] (SPARK-31703) Changes made by SPARK-26985 break reading parquet files correctly in BigEndian architectures (AIX + LinuxPPC64)

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172785#comment-17172785 ] Apache Spark commented on SPARK-31703: -- User 'tinhto-000' has created a pull reques

[jira] [Assigned] (SPARK-31703) Changes made by SPARK-26985 break reading parquet files correctly in BigEndian architectures (AIX + LinuxPPC64)

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-31703: Assignee: Apache Spark > Changes made by SPARK-26985 break reading parquet files correctl

[jira] [Resolved] (SPARK-32549) Add column name in _infer_schema error message

2020-08-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-32549. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 29365 [https://gi

[jira] [Assigned] (SPARK-32549) Add column name in _infer_schema error message

2020-08-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-32549: Assignee: Liang Zhang > Add column name in _infer_schema error message >

[jira] [Commented] (SPARK-32560) improve exception message

2020-08-06 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172780#comment-17172780 ] Takeshi Yamamuro commented on SPARK-32560: -- Hi, [~小郭飞飞刀], thanks for the report

[jira] [Commented] (SPARK-32540) Eliminate filter clause in aggregate

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172779#comment-17172779 ] Apache Spark commented on SPARK-32540: -- User 'beliefer' has created a pull request

[jira] [Resolved] (SPARK-32538) Use local time zone for the timestamp logged in unit-tests.log

2020-08-06 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro resolved SPARK-32538. -- Fix Version/s: 3.0.1 Resolution: Fixed Resolved by [https://github.com/apache/s

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-32558: - Description: Steps to reproduce the issue: ---  Download Spark_3.0

[jira] [Commented] (SPARK-30577) StorageLevel.DISK_ONLY_2 causes the data loss

2020-08-06 Thread zero222 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172771#comment-17172771 ] zero222 commented on SPARK-30577: - OK, Thank you very much. > StorageLevel.DISK_ONLY_2

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-32558: - Target Version/s: (was: 3.0.0) > ORC target files that Spark_3.0 produces does not work with H

[jira] [Commented] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172770#comment-17172770 ] Hyukjin Kwon commented on SPARK-32558: -- Thanks [~rohitmishr1484]. [~ramks] please a

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-32558: - Fix Version/s: (was: 3.0.0) > ORC target files that Spark_3.0 produces does not work with Hi

[jira] [Reopened] (SPARK-32515) Distinct Function Weird Bug

2020-08-06 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro reopened SPARK-32515: -- > Distinct Function Weird Bug > --- > > Key: SPARK

[jira] [Resolved] (SPARK-32515) Distinct Function Weird Bug

2020-08-06 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro resolved SPARK-32515. -- Resolution: Not A Problem > Distinct Function Weird Bug > ---

[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-06 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-32515: - Target Version/s: (was: 2.4.6) > Distinct Function Weird Bug > ---

[jira] [Updated] (SPARK-32515) Distinct Function Weird Bug

2020-08-06 Thread Takeshi Yamamuro (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-32515: - Fix Version/s: (was: 2.4.6) > Distinct Function Weird Bug >

[jira] [Commented] (SPARK-31851) Redesign PySpark documentation

2020-08-06 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172758#comment-17172758 ] Hyukjin Kwon commented on SPARK-31851: -- [~Shan_Chandra] Please go ahead. You might

[jira] [Updated] (SPARK-32560) improve exception message

2020-08-06 Thread philipse (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32560: - Description: Exception messages are lack of single quotes, we can improve it to keep consisent (was: Ex

[jira] [Updated] (SPARK-32560) improve exception message

2020-08-06 Thread philipse (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32560: - Attachment: exception.png > improve exception message > - > > Ke

[jira] [Updated] (SPARK-32560) improve exception message

2020-08-06 Thread philipse (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] philipse updated SPARK-32560: - Description: Exception messages are lack of single quotes, we can improve it to keep consisent !image-

[jira] [Commented] (SPARK-30577) StorageLevel.DISK_ONLY_2 causes the data loss

2020-08-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172715#comment-17172715 ] Dongjoon Hyun commented on SPARK-30577: --- Thanks for the confirmation. BTW, `get_a

[jira] [Commented] (SPARK-32522) Using pyspark with a MultiLayerPerceptron model given inconsistent outputs if a large amount of data is fed into it and at least one of the model outputs is fed to a P

2020-08-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172714#comment-17172714 ] Dongjoon Hyun commented on SPARK-32522: --- Thank you for the explanation, [~Ben Smit

[jira] [Commented] (SPARK-32264) More resources in Github Actions

2020-08-06 Thread Dongjoon Hyun (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172703#comment-17172703 ] Dongjoon Hyun commented on SPARK-32264: --- That's too bad. However, thank you for in

[jira] [Commented] (SPARK-32526) Let sql/catalyst module tests pass for Scala 2.13

2020-08-06 Thread Dongjoon Hyun (Jira)
> Components: SQL >Affects Versions: 3.0.0 >Reporter: Yang Jie >Priority: Minor > Attachments: failed-and-aborted-20200806 > > > sql/catalyst module has following compile errors with scala-2.13 profile: > {code:java} > [ERROR] [E

[jira] [Commented] (SPARK-32018) Fix UnsafeRow set overflowed decimal

2020-08-06 Thread Sunitha Kambhampati (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172689#comment-17172689 ] Sunitha Kambhampati commented on SPARK-32018: - I have added a summary of my

[jira] [Commented] (SPARK-32018) Fix UnsafeRow set overflowed decimal

2020-08-06 Thread Sunitha Kambhampati (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172688#comment-17172688 ] Sunitha Kambhampati commented on SPARK-32018: - The important issue is we sho

[jira] [Assigned] (SPARK-32506) flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests

2020-08-06 Thread Huaxin Gao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao reassigned SPARK-32506: -- Assignee: Huaxin Gao > flaky test: > pyspark.mllib.tests.test_streaming_algorithms.Streaming

[jira] [Resolved] (SPARK-32506) flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests

2020-08-06 Thread Huaxin Gao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao resolved SPARK-32506. Fix Version/s: 3.1.0 3.0.1 Resolution: Fixed Issue resolved by pull requ

[jira] [Commented] (SPARK-32561) Allow DataSourceReadBenchmark to run for select formats

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172647#comment-17172647 ] Apache Spark commented on SPARK-32561: -- User 'msamirkhan' has created a pull reques

[jira] [Assigned] (SPARK-32561) Allow DataSourceReadBenchmark to run for select formats

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32561: Assignee: (was: Apache Spark) > Allow DataSourceReadBenchmark to run for select forma

[jira] [Commented] (SPARK-32561) Allow DataSourceReadBenchmark to run for select formats

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172646#comment-17172646 ] Apache Spark commented on SPARK-32561: -- User 'msamirkhan' has created a pull reques

[jira] [Assigned] (SPARK-32561) Allow DataSourceReadBenchmark to run for select formats

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32561: Assignee: Apache Spark > Allow DataSourceReadBenchmark to run for select formats > --

[jira] [Created] (SPARK-32561) Allow DataSourceReadBenchmark to run for select formats

2020-08-06 Thread Muhammad Samir Khan (Jira)
Muhammad Samir Khan created SPARK-32561: --- Summary: Allow DataSourceReadBenchmark to run for select formats Key: SPARK-32561 URL: https://issues.apache.org/jira/browse/SPARK-32561 Project: Spark

[jira] [Updated] (SPARK-32531) Add benchmarks for nested structs and arrays for different file formats

2020-08-06 Thread Muhammad Samir Khan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Muhammad Samir Khan updated SPARK-32531: Component/s: Tests > Add benchmarks for nested structs and arrays for different fi

[jira] [Commented] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Rohit Mishra (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172635#comment-17172635 ] Rohit Mishra commented on SPARK-32558: -- [~ramks], Thanks for raising this bug but P

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Rohit Mishra (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohit Mishra updated SPARK-32558: - Priority: Major (was: Blocker) > ORC target files that Spark_3.0 produces does not work with Hi

[jira] [Commented] (SPARK-31851) Redesign PySpark documentation

2020-08-06 Thread Shanmugavel Kuttiyandi Chandrakasu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172627#comment-17172627 ] Shanmugavel Kuttiyandi Chandrakasu commented on SPARK-31851:

[jira] [Closed] (SPARK-32551) Ambiguous self join error in non self join with window

2020-08-06 Thread kanika dhuria (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanika dhuria closed SPARK-32551. - Closing as duplicate. > Ambiguous self join error in non self join with window > --

[jira] [Resolved] (SPARK-32551) Ambiguous self join error in non self join with window

2020-08-06 Thread kanika dhuria (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanika dhuria resolved SPARK-32551. --- Fix Version/s: 3.0.1 Resolution: Fixed > Ambiguous self join error in non self join w

[jira] [Commented] (SPARK-32551) Ambiguous self join error in non self join with window

2020-08-06 Thread kanika dhuria (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172586#comment-17172586 ] kanika dhuria commented on SPARK-32551: --- Thanks [~cloud_fan], it is fixed in lates

[jira] [Commented] (SPARK-32506) flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172581#comment-17172581 ] Apache Spark commented on SPARK-32506: -- User 'huaxingao' has created a pull request

[jira] [Resolved] (SPARK-32515) Distinct Function Weird Bug

2020-08-06 Thread Jayce Jiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jayce Jiang resolved SPARK-32515. - Fix Version/s: 2.4.6 Target Version/s: 2.4.6 Resolution: Fixed > Distinct Funct

[jira] [Commented] (SPARK-32515) Distinct Function Weird Bug

2020-08-06 Thread Jayce Jiang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172579#comment-17172579 ] Jayce Jiang commented on SPARK-32515: - Closing the issues, it has to due with spark.

[jira] [Assigned] (SPARK-32506) flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32506: Assignee: (was: Apache Spark) > flaky test: > pyspark.mllib.tests.test_streaming_alg

[jira] [Assigned] (SPARK-32506) flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32506: Assignee: Apache Spark > flaky test: > pyspark.mllib.tests.test_streaming_algorithms.Str

[jira] [Commented] (SPARK-32506) flaky test: pyspark.mllib.tests.test_streaming_algorithms.StreamingLinearRegressionWithTests

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172578#comment-17172578 ] Apache Spark commented on SPARK-32506: -- User 'huaxingao' has created a pull request

[jira] [Commented] (SPARK-32551) Ambiguous self join error in non self join with window

2020-08-06 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172407#comment-17172407 ] Wenchen Fan commented on SPARK-32551: - Can you try the latest 3.0 branch? There are

[jira] [Updated] (SPARK-32546) SHOW VIEWS fails with MetaException ... ClassNotFoundException

2020-08-06 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-32546: Fix Version/s: 3.0.1 > SHOW VIEWS fails with MetaException ... ClassNotFoundException > --

[jira] [Commented] (SPARK-32544) Bucketing and Partitioning information are not passed on to non FileFormat datasource writes

2020-08-06 Thread Rahij Ramsharan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172313#comment-17172313 ] Rahij Ramsharan commented on SPARK-32544: - [~hyukjin.kwon] do you happen to know

[jira] [Commented] (SPARK-32546) SHOW VIEWS fails with MetaException ... ClassNotFoundException

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172275#comment-17172275 ] Apache Spark commented on SPARK-32546: -- User 'MaxGekk' has created a pull request f

[jira] [Commented] (SPARK-30069) Clean up non-shuffle disk block manager files following executor exists on YARN

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172267#comment-17172267 ] Apache Spark commented on SPARK-30069: -- User 'LantaoJin' has created a pull request

[jira] [Commented] (SPARK-30069) Clean up non-shuffle disk block manager files following executor exists on YARN

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172264#comment-17172264 ] Apache Spark commented on SPARK-30069: -- User 'LantaoJin' has created a pull request

[jira] [Commented] (SPARK-32546) SHOW VIEWS fails with MetaException ... ClassNotFoundException

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172231#comment-17172231 ] Apache Spark commented on SPARK-32546: -- User 'MaxGekk' has created a pull request f

[jira] [Comment Edited] (SPARK-12741) DataFrame count method return wrong size.

2020-08-06 Thread Yu Gan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172226#comment-17172226 ] Yu Gan edited comment on SPARK-12741 at 8/6/20, 10:38 AM: -- Aha,

[jira] [Comment Edited] (SPARK-12741) DataFrame count method return wrong size.

2020-08-06 Thread Yu Gan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172226#comment-17172226 ] Yu Gan edited comment on SPARK-12741 at 8/6/20, 10:38 AM: -- Aha,

[jira] [Commented] (SPARK-12741) DataFrame count method return wrong size.

2020-08-06 Thread Yu Gan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-12741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172226#comment-17172226 ] Yu Gan commented on SPARK-12741: Aha, I came across the similar issue. My sql is  s

[jira] [Commented] (SPARK-32560) improve exception message

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172219#comment-17172219 ] Apache Spark commented on SPARK-32560: -- User 'GuoPhilipse' has created a pull reque

[jira] [Assigned] (SPARK-32560) improve exception message

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32560: Assignee: (was: Apache Spark) > improve exception message > -

[jira] [Commented] (SPARK-32560) improve exception message

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172217#comment-17172217 ] Apache Spark commented on SPARK-32560: -- User 'GuoPhilipse' has created a pull reque

[jira] [Assigned] (SPARK-32560) improve exception message

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32560: Assignee: Apache Spark > improve exception message > - > >

[jira] [Issue Comment Deleted] (SPARK-32536) deleted not existing hdfs locations when use spark sql to execute "insert overwrite" statement to dynamic partition

2020-08-06 Thread yx91490 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yx91490 updated SPARK-32536: Comment: was deleted (was: it seems that the hive code in there(I use hdp-3.1.4.0-315): {code:java} org.sp

[jira] [Created] (SPARK-32560) improve exception message

2020-08-06 Thread philipse (Jira)
philipse created SPARK-32560: Summary: improve exception message Key: SPARK-32560 URL: https://issues.apache.org/jira/browse/SPARK-32560 Project: Spark Issue Type: Improvement Component

[jira] [Commented] (SPARK-32547) Cant able to process Timestamp 0001-01-01T00:00:00.000+0000 with TimestampType

2020-08-06 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172214#comment-17172214 ] Kent Yao commented on SPARK-32547: -- thanks [~ManjunathHatti] I also tested other zones

[jira] [Commented] (SPARK-32536) deleted not existing hdfs locations when use spark sql to execute "insert overwrite" statement to dynamic partition

2020-08-06 Thread yx91490 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172207#comment-17172207 ] yx91490 commented on SPARK-32536: - it seems that the hive code in there(I use hdp-3.1.4.

[jira] [Commented] (SPARK-32547) Cant able to process Timestamp 0001-01-01T00:00:00.000+0000 with TimestampType

2020-08-06 Thread Manjunath H (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172199#comment-17172199 ] Manjunath H commented on SPARK-32547: - [~Qin Yao]  timezone is UTC spark.conf.get('

[jira] [Comment Edited] (SPARK-30577) StorageLevel.DISK_ONLY_2 causes the data loss

2020-08-06 Thread zero222 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172177#comment-17172177 ] zero222 edited comment on SPARK-30577 at 8/6/20, 9:50 AM: -- spar

[jira] [Commented] (SPARK-30577) StorageLevel.DISK_ONLY_2 causes the data loss

2020-08-06 Thread zero222 (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172177#comment-17172177 ] zero222 commented on SPARK-30577: - spark-2.4.6 can work normally with DISK_ONLY_2. But w

[jira] [Updated] (SPARK-32559) Fix the trim logic in UTF8String.toInt/toLong did't handle Chinese characters correctly

2020-08-06 Thread EdisonWang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] EdisonWang updated SPARK-32559: --- Description: The trim logic in Cast expression introduced in  [https://github.com/apache/spark/pull/

[jira] [Commented] (SPARK-32559) Fix the trim logic in UTF8String.toInt/toLong did't handle Chinese characters correctly

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172153#comment-17172153 ] Apache Spark commented on SPARK-32559: -- User 'WangGuangxin' has created a pull requ

[jira] [Assigned] (SPARK-32559) Fix the trim logic in UTF8String.toInt/toLong did't handle Chinese characters correctly

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32559: Assignee: (was: Apache Spark) > Fix the trim logic in UTF8String.toInt/toLong did't h

[jira] [Assigned] (SPARK-32559) Fix the trim logic in UTF8String.toInt/toLong did't handle Chinese characters correctly

2020-08-06 Thread Apache Spark (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-32559: Assignee: Apache Spark > Fix the trim logic in UTF8String.toInt/toLong did't handle Chine

[jira] [Created] (SPARK-32559) Fix the trim logic in UTF8String.toInt/toLong did't handle Chinese characters correctly

2020-08-06 Thread EdisonWang (Jira)
EdisonWang created SPARK-32559: -- Summary: Fix the trim logic in UTF8String.toInt/toLong did't handle Chinese characters correctly Key: SPARK-32559 URL: https://issues.apache.org/jira/browse/SPARK-32559 P

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Description: Steps to reproduce the issue: --- 

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Description: Steps to reproduce the issue: --- 

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Environment: Spark 3.0 and Hadoop cluster having Hive_2.1.1 version. (was:

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Environment: Spark 3.0 and Hadoop cluster having Hive_2.1.1 version. (Linux

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Description: Steps to reproduce the issue: --- 

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Description: Steps to reproduce the issue: --- 

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Description: Steps to reproduce the issue: --- 

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Description: Steps to reproduce the issue: --- 

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Description: Steps to reproduce the issue: --- 

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Description: Steps to reproduce the issue: --- 

[jira] [Resolved] (SPARK-32546) SHOW VIEWS fails with MetaException ... ClassNotFoundException

2020-08-06 Thread Wenchen Fan (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-32546. - Fix Version/s: 3.1.0 Assignee: Maxim Gekk Resolution: Fixed > SHOW VIEWS fails w

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Description: Steps to reproduce the issue: --- 

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Description: Steps to reproduce the issue: --- 

[jira] [Updated] (SPARK-32558) ORC target files that Spark_3.0 produces does not work with Hive_2.1.1 (work-around of using spark.sql.orc.impl=hive is also not working)

2020-08-06 Thread Ramakrishna Prasad K S (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramakrishna Prasad K S updated SPARK-32558: --- Description: Steps to reproduce the issue: Download Spark_3.0 on Linu

[jira] [Commented] (SPARK-32547) Cant able to process Timestamp 0001-01-01T00:00:00.000+0000 with TimestampType

2020-08-06 Thread Kent Yao (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172127#comment-17172127 ] Kent Yao commented on SPARK-32547: -- {code:java} >>> spark.sql("select timestamp '0001-
