[jira] [Assigned] (SPARK-25618) KafkaContinuousSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false 1 min 1 sec

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25618: Assignee: (was: Apache Spark) > KafkaContinuousSourceStressForDontFailOnDataLossSuite

[jira] [Commented] (SPARK-25618) KafkaContinuousSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false 1 min 1 sec

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669672#comment-16669672 ] Apache Spark commented on SPARK-25618: -- User 'dilipbiswal' has created a pull reque

[jira] [Assigned] (SPARK-25618) KafkaContinuousSourceStressForDontFailOnDataLossSuite: stress test for failOnDataLoss=false 1 min 1 sec

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25618: Assignee: Apache Spark > KafkaContinuousSourceStressForDontFailOnDataLossSuite: stress te

[jira] [Assigned] (SPARK-25573) Combine resolveExpression and resolve in the rule ResolveReferences

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25573: Assignee: (was: Apache Spark) > Combine resolveExpression and resolve in the rule Res

[jira] [Assigned] (SPARK-25573) Combine resolveExpression and resolve in the rule ResolveReferences

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25573: Assignee: Apache Spark > Combine resolveExpression and resolve in the rule ResolveReferen

[jira] [Commented] (SPARK-25573) Combine resolveExpression and resolve in the rule ResolveReferences

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669664#comment-16669664 ] Apache Spark commented on SPARK-25573: -- User 'dilipbiswal' has created a pull reque

[jira] [Commented] (SPARK-25573) Combine resolveExpression and resolve in the rule ResolveReferences

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669665#comment-16669665 ] Apache Spark commented on SPARK-25573: -- User 'dilipbiswal' has created a pull reque

[jira] [Assigned] (SPARK-25833) Update migration guide for Hive view compatibility

2018-10-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-25833: - Assignee: Chenxiao Mao > Update migration guide for Hive view compatibility > -

[jira] [Resolved] (SPARK-25833) Update migration guide for Hive view compatibility

2018-10-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-25833. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22868 [https://

[jira] [Commented] (SPARK-25746) Refactoring ExpressionEncoder

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669651#comment-16669651 ] Apache Spark commented on SPARK-25746: -- User 'cloud-fan' has created a pull request

[jira] [Resolved] (SPARK-25862) Remove rangeBetween APIs introduced in SPARK-21608

2018-10-30 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-25862. - Resolution: Fixed Fix Version/s: 3.0.0 > Remove rangeBetween APIs introduced in SPARK-21608 > ---

[jira] [Resolved] (SPARK-25847) Refactor JSONBenchmarks to use main method

2018-10-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-25847. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22844 [https://gi

[jira] [Assigned] (SPARK-25847) Refactor JSONBenchmarks to use main method

2018-10-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-25847: Assignee: caoxuewen > Refactor JSONBenchmarks to use main method > --

[jira] [Resolved] (SPARK-25691) Analyzer rule "AliasViewChild" does not stabilize

2018-10-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25691. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22713 [https://gith

[jira] [Assigned] (SPARK-25691) Analyzer rule "AliasViewChild" does not stabilize

2018-10-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-25691: --- Assignee: Marco Gaido > Analyzer rule "AliasViewChild" does not stabilize > ---

[jira] [Updated] (SPARK-25890) Null rows are ignored with Ctrl-A as a delimiter when reading a CSV file.

2018-10-30 Thread Lakshminarayan Kamath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lakshminarayan Kamath updated SPARK-25890: -- Description: Reading a Ctrl-A delimited CSV file ignores rows with all null va

[jira] [Created] (SPARK-25890) Null rows are ignored with Ctrl-A as a delimiter when reading a CSV file.

2018-10-30 Thread Lakshminarayan Kamath (JIRA)
Lakshminarayan Kamath created SPARK-25890: - Summary: Null rows are ignored with Ctrl-A as a delimiter when reading a CSV file. Key: SPARK-25890 URL: https://issues.apache.org/jira/browse/SPARK-25890

[jira] [Commented] (SPARK-25889) Dynamic allocation load-aware ramp up

2018-10-30 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669384#comment-16669384 ] DB Tsai commented on SPARK-25889: - The problem sounds valid, and the solution could work

[jira] [Updated] (SPARK-25888) Service requests for persist() blocks via external service after dynamic deallocation

2018-10-30 Thread Adam Kennedy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Kennedy updated SPARK-25888: - Description: Large and highly multi-tenant Spark on YARN clusters with diverse job execution of

[jira] [Updated] (SPARK-25888) Service requests for persist() blocks via external service after dynamic deallocation

2018-10-30 Thread Adam Kennedy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Kennedy updated SPARK-25888: - Issue Type: New Feature (was: Improvement) > Service requests for persist() blocks via external

[jira] [Updated] (SPARK-25888) Service requests for persist() blocks via external service after dynamic deallocation

2018-10-30 Thread Adam Kennedy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Kennedy updated SPARK-25888: - Shepherd: DB Tsai > Service requests for persist() blocks via external service after dynamic >

[jira] [Created] (SPARK-25889) Dynamic allocation load-aware ramp up

2018-10-30 Thread Adam Kennedy (JIRA)
Adam Kennedy created SPARK-25889: Summary: Dynamic allocation load-aware ramp up Key: SPARK-25889 URL: https://issues.apache.org/jira/browse/SPARK-25889 Project: Spark Issue Type: New Feature

[jira] [Resolved] (SPARK-24434) Support user-specified driver and executor pod templates

2018-10-30 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-24434. Resolution: Fixed > Support user-specified driver and executor pod templates > ---

[jira] [Updated] (SPARK-25888) Service requests for persist() blocks via external service after dynamic deallocation

2018-10-30 Thread Adam Kennedy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Kennedy updated SPARK-25888: - Description: Large and highly multi-tenant Spark on YARN clusters with diverse job execution of

[jira] [Updated] (SPARK-25888) Service requests for persist() blocks via external service after dynamic deallocation

2018-10-30 Thread Adam Kennedy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Kennedy updated SPARK-25888: - Description: Large and highly multi-tenant Spark on YARN clusters with diverse job execution of

[jira] [Updated] (SPARK-25888) Service requests for persist() blocks via external service after dynamic deallocation

2018-10-30 Thread Adam Kennedy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Kennedy updated SPARK-25888: - Environment: (was: Large YARN cluster with 1,000 nodes, 50,000 cores and 250 users, with pre

[jira] [Updated] (SPARK-25888) Service requests for persist() blocks via external service after dynamic deallocation

2018-10-30 Thread Adam Kennedy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Kennedy updated SPARK-25888: - Environment: Large YARN cluster with 1,000 nodes, 50,000 cores and 250 users, with predominantl

[jira] [Updated] (SPARK-25888) Service requests for persist() blocks via external service after dynamic deallocation

2018-10-30 Thread Adam Kennedy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Kennedy updated SPARK-25888: - Summary: Service requests for persist() blocks via external service after dynamic deallocation

[jira] [Created] (SPARK-25888) Service requests for persist() block via external service after dynamic deallocation

2018-10-30 Thread Adam Kennedy (JIRA)
Adam Kennedy created SPARK-25888: Summary: Service requests for persist() block via external service after dynamic deallocation Key: SPARK-25888 URL: https://issues.apache.org/jira/browse/SPARK-25888

[jira] [Commented] (SPARK-25875) Merge code to set up driver features for different languages

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669183#comment-16669183 ] Apache Spark commented on SPARK-25875: -- User 'vanzin' has created a pull request fo

[jira] [Assigned] (SPARK-25875) Merge code to set up driver features for different languages

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25875: Assignee: (was: Apache Spark) > Merge code to set up driver features for different la

[jira] [Assigned] (SPARK-25875) Merge code to set up driver features for different languages

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25875: Assignee: Apache Spark > Merge code to set up driver features for different languages > -

[jira] [Commented] (SPARK-25879) Schema pruning fails when a nested field and top level field are selected

2018-10-30 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669170#comment-16669170 ] DB Tsai commented on SPARK-25879: - I can confirm that with PR from SPARK-25407, this iss

[jira] [Resolved] (SPARK-25773) Cancel zombie tasks in a result stage when the job finishes

2018-10-30 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-25773. -- Resolution: Fixed Fix Version/s: 3.0.0 > Cancel zombie tasks in a result stage when the

[jira] [Created] (SPARK-25887) Allow specifying Kubernetes context to use

2018-10-30 Thread Rob Vesse (JIRA)
Rob Vesse created SPARK-25887: - Summary: Allow specifying Kubernetes context to use Key: SPARK-25887 URL: https://issues.apache.org/jira/browse/SPARK-25887 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-25809) Support additional K8S cluster types for integration tests

2018-10-30 Thread Rob Vesse (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669048#comment-16669048 ] Rob Vesse commented on SPARK-25809: --- Fairly close to this being ready to merge > Supp

[jira] [Commented] (SPARK-25870) RandomSplit with seed gives different results depending on column order

2018-10-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669037#comment-16669037 ] Marco Gaido commented on SPARK-25870: - Thanks [~deacuna]. > RandomSplit with seed g

[jira] [Resolved] (SPARK-25870) RandomSplit with seed gives different results depending on column order

2018-10-30 Thread Daniel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel resolved SPARK-25870. Resolution: Not A Problem > RandomSplit with seed gives different results depending on column order >

[jira] [Commented] (SPARK-25870) RandomSplit with seed gives different results depending on column order

2018-10-30 Thread Daniel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669022#comment-16669022 ] Daniel commented on SPARK-25870: Semantically, I was thinking that rows should stay in t

[jira] [Commented] (SPARK-24285) Flaky test: ContinuousSuite.query without test harness

2018-10-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668984#comment-16668984 ] Dongjoon Hyun commented on SPARK-24285: --- Thank you for investigating this, [~irash

[jira] [Resolved] (SPARK-25848) Refactor CSVBenchmarks to use main method

2018-10-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-25848. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22845 [https://

[jira] [Assigned] (SPARK-25848) Refactor CSVBenchmarks to use main method

2018-10-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-25848: - Assignee: caoxuewen > Refactor CSVBenchmarks to use main method > -

[jira] [Commented] (SPARK-25886) Improve error message of `FailureSafeParser` and `from_avro` in FAILFAST mode

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668958#comment-16668958 ] Apache Spark commented on SPARK-25886: -- User 'gengliangwang' has created a pull req

[jira] [Assigned] (SPARK-25886) Improve error message of `FailureSafeParser` and `from_avro` in FAILFAST mode

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25886: Assignee: Apache Spark > Improve error message of `FailureSafeParser` and `from_avro` in

[jira] [Assigned] (SPARK-25886) Improve error message of `FailureSafeParser` and `from_avro` in FAILFAST mode

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25886: Assignee: (was: Apache Spark) > Improve error message of `FailureSafeParser` and `fro

[jira] [Commented] (SPARK-10892) Join with Data Frame returns wrong results

2018-10-30 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668941#comment-16668941 ] Nicholas Chammas commented on SPARK-10892: -- Is this issue still present in Spar

[jira] [Created] (SPARK-25886) Improve error message of `FailureSafeParser` and `from_avro` in FAILFAST mode

2018-10-30 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-25886: -- Summary: Improve error message of `FailureSafeParser` and `from_avro` in FAILFAST mode Key: SPARK-25886 URL: https://issues.apache.org/jira/browse/SPARK-25886 Pro

[jira] [Commented] (SPARK-23429) Add executor memory metrics to heartbeat and expose in executors REST API

2018-10-30 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668939#comment-16668939 ] Imran Rashid commented on SPARK-23429: -- merged https://github.com/apache/spark/pull

[jira] [Commented] (SPARK-24285) Flaky test: ContinuousSuite.query without test harness

2018-10-30 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668933#comment-16668933 ] Imran Rashid commented on SPARK-24285: -- I ran into this on one of my prs so I tried

[jira] [Commented] (SPARK-25873) Date corruption when Spark and Hive both are on different timezones

2018-10-30 Thread Pawan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668853#comment-16668853 ] Pawan commented on SPARK-25873: --- [~hyukjin.kwon] Please find below the data in t_src and

[jira] [Resolved] (SPARK-25790) PCA doesn't support more than 65535 column matrix

2018-10-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25790. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22784 [https://github.c

[jira] [Assigned] (SPARK-25790) PCA doesn't support more than 65535 column matrix

2018-10-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-25790: - Assignee: shahid > PCA doesn't support more than 65535 column matrix >

[jira] [Commented] (SPARK-25441) calculate term frequency in CountVectorizer()

2018-10-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668654#comment-16668654 ] Marco Gaido commented on SPARK-25441: - TF has an appropriate transformer. I think th

[jira] [Commented] (SPARK-25885) HighlyCompressedMapStatus deserialization optimization

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668648#comment-16668648 ] Apache Spark commented on SPARK-25885: -- User 'Koraseg' has created a pull request f

[jira] [Assigned] (SPARK-25885) HighlyCompressedMapStatus deserialization optimization

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25885: Assignee: (was: Apache Spark) > HighlyCompressedMapStatus deserialization optimizatio

[jira] [Assigned] (SPARK-25885) HighlyCompressedMapStatus deserialization optimization

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25885: Assignee: Apache Spark > HighlyCompressedMapStatus deserialization optimization > ---

[jira] [Commented] (SPARK-25885) HighlyCompressedMapStatus deserialization optimization

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668646#comment-16668646 ] Apache Spark commented on SPARK-25885: -- User 'Koraseg' has created a pull request f

[jira] [Created] (SPARK-25885) HighlyCompressedMapStatus deserialization optimization

2018-10-30 Thread Artem Kupchinskiy (JIRA)
Artem Kupchinskiy created SPARK-25885: - Summary: HighlyCompressedMapStatus deserialization optimization Key: SPARK-25885 URL: https://issues.apache.org/jira/browse/SPARK-25885 Project: Spark

[jira] [Resolved] (SPARK-25755) Supplementation of non-CodeGen unit tested for BroadcastHashJoinExec

2018-10-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25755. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22755 [https://gith

[jira] [Assigned] (SPARK-25755) Supplementation of non-CodeGen unit tested for BroadcastHashJoinExec

2018-10-30 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-25755: --- Assignee: caoxuewen > Supplementation of non-CodeGen unit tested for BroadcastHashJoinExec

[jira] [Assigned] (SPARK-25868) One part of Spark MLlib Kmean Logic Performance problem

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25868: Assignee: Apache Spark > One part of Spark MLlib Kmean Logic Performance problem > --

[jira] [Assigned] (SPARK-25868) One part of Spark MLlib Kmean Logic Performance problem

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25868: Assignee: (was: Apache Spark) > One part of Spark MLlib Kmean Logic Performance probl

[jira] [Commented] (SPARK-25868) One part of Spark MLlib Kmean Logic Performance problem

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668603#comment-16668603 ] Apache Spark commented on SPARK-25868: -- User 'KyleLi1985' has created a pull reques

[jira] [Commented] (SPARK-25868) One part of Spark MLlib Kmean Logic Performance problem

2018-10-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668591#comment-16668591 ] Hyukjin Kwon commented on SPARK-25868: -- JIRA describes what to fix and why. PR desc

[jira] [Updated] (SPARK-25868) One part of Spark MLlib Kmean Logic Performance problem

2018-10-30 Thread Liang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Li updated SPARK-25868: - Description: In function fastSquaredDistance, there is a low performance logic Already update a patch #

[jira] [Commented] (SPARK-25884) Add TBLPROPERTIES and COMMENT, and use LOCATION when SHOW CREATE TABLE.

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668502#comment-16668502 ] Apache Spark commented on SPARK-25884: -- User 'ueshin' has created a pull request fo

[jira] [Assigned] (SPARK-25884) Add TBLPROPERTIES and COMMENT, and use LOCATION when SHOW CREATE TABLE.

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25884: Assignee: Apache Spark > Add TBLPROPERTIES and COMMENT, and use LOCATION when SHOW CREATE

[jira] [Assigned] (SPARK-25884) Add TBLPROPERTIES and COMMENT, and use LOCATION when SHOW CREATE TABLE.

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25884: Assignee: Apache Spark > Add TBLPROPERTIES and COMMENT, and use LOCATION when SHOW CREATE

[jira] [Assigned] (SPARK-25884) Add TBLPROPERTIES and COMMENT, and use LOCATION when SHOW CREATE TABLE.

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25884: Assignee: (was: Apache Spark) > Add TBLPROPERTIES and COMMENT, and use LOCATION when

[jira] [Created] (SPARK-25884) Add TBLPROPERTIES and COMMENT, and use LOCATION when SHOW CREATE TABLE.

2018-10-30 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-25884: - Summary: Add TBLPROPERTIES and COMMENT, and use LOCATION when SHOW CREATE TABLE. Key: SPARK-25884 URL: https://issues.apache.org/jira/browse/SPARK-25884 Project: Sp

[jira] [Commented] (SPARK-25881) pyspark df.topandas() deal decimal type as object

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668442#comment-16668442 ] Apache Spark commented on SPARK-25881: -- User '351zyf' has created a pull request fo

[jira] [Commented] (SPARK-25863) java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala

2018-10-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668438#comment-16668438 ] Marco Gaido commented on SPARK-25863: - [~Tagar] thanks. ??not sure yet as it might

[jira] [Assigned] (SPARK-25883) Override method `prettyName` in `from_avro`/`to_avro`

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25883: Assignee: Apache Spark > Override method `prettyName` in `from_avro`/`to_avro` >

[jira] [Assigned] (SPARK-25883) Override method `prettyName` in `from_avro`/`to_avro`

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25883: Assignee: (was: Apache Spark) > Override method `prettyName` in `from_avro`/`to_avro`

[jira] [Commented] (SPARK-25883) Override method `prettyName` in `from_avro`/`to_avro`

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668433#comment-16668433 ] Apache Spark commented on SPARK-25883: -- User 'gengliangwang' has created a pull req

[jira] [Created] (SPARK-25883) Override method `prettyName` in `from_avro`/`to_avro`

2018-10-30 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-25883: -- Summary: Override method `prettyName` in `from_avro`/`to_avro` Key: SPARK-25883 URL: https://issues.apache.org/jira/browse/SPARK-25883 Project: Spark Iss

[jira] [Commented] (SPARK-25882) Add a function to join two datasets using one column with join type parameter

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668384#comment-16668384 ] Apache Spark commented on SPARK-25882: -- User 'arman1371' has created a pull request

[jira] [Assigned] (SPARK-25882) Add a function to join two datasets using one column with join type parameter

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25882: Assignee: (was: Apache Spark) > Add a function to join two datasets using one column

[jira] [Commented] (SPARK-25882) Add a function to join two datasets using one column with join type parameter

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668382#comment-16668382 ] Apache Spark commented on SPARK-25882: -- User 'arman1371' has created a pull request

[jira] [Assigned] (SPARK-25882) Add a function to join two datasets using one column with join type parameter

2018-10-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25882: Assignee: Apache Spark > Add a function to join two datasets using one column with join t

[jira] [Created] (SPARK-25882) Add a function to join two datasets using one column with join type parameter

2018-10-30 Thread Arman Yazdani (JIRA)
Arman Yazdani created SPARK-25882: - Summary: Add a function to join two datasets using one column with join type parameter Key: SPARK-25882 URL: https://issues.apache.org/jira/browse/SPARK-25882 Proje

[jira] [Closed] (SPARK-25881) pyspark df.topandas() deal decimal type as object

2018-10-30 Thread JohnsonZhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JohnsonZhang closed SPARK-25881. > pyspark df.topandas() deal decimal type as object >

[jira] [Comment Edited] (SPARK-25880) user set some hadoop configurations can not work

2018-10-30 Thread guojh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668263#comment-16668263 ] guojh edited comment on SPARK-25880 at 10/30/18 8:53 AM: - The ro

[jira] [Resolved] (SPARK-25881) pyspark df.topandas() deal decimal type as object

2018-10-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-25881. -- Resolution: Won't Fix > pyspark df.topandas() deal decimal type as object > --

[jira] [Resolved] (SPARK-25879) Schema pruning fails when a nested field and top level field are selected

2018-10-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-25879. -- Resolution: Duplicate > Schema pruning fails when a nested field and top level field are selec

[jira] [Commented] (SPARK-25879) Schema pruning fails when a nested field and top level field are selected

2018-10-30 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668332#comment-16668332 ] Liang-Chi Hsieh commented on SPARK-25879: - I agreed with [~hyukjin.kwon]. > Sch

[jira] [Commented] (SPARK-25879) Schema pruning fails when a nested field and top level field are selected

2018-10-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668329#comment-16668329 ] Dongjoon Hyun commented on SPARK-25879: --- +1 for [~hyukjin.kwon]'s suggestion. > S

[jira] [Commented] (SPARK-25879) Schema pruning fails when a nested field and top level field are selected

2018-10-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668327#comment-16668327 ] Hyukjin Kwon commented on SPARK-25879: -- Shall we leave this resolved as a duplicate

[jira] [Commented] (SPARK-25873) Date corruption when Spark and Hive both are on different timezones

2018-10-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668324#comment-16668324 ] Hyukjin Kwon commented on SPARK-25873: -- Can you show current results, expected resu

[jira] [Updated] (SPARK-25873) Date corruption when Spark and Hive both are on different timezones

2018-10-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-25873: - Description: There is date alteration when loading date from one table to another in hive throu

[jira] [Resolved] (SPARK-24233) Union Operation on Read of Dataframe does NOT produce correct result

2018-10-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24233. -- Resolution: Invalid > Union Operation on Read of Dataframe does NOT produce correct result >

[jira] [Updated] (SPARK-25852) we should filter the workOffers with freeCores>=CPUS_PER_TASK at first for better performance

2018-10-30 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-25852: Summary: we should filter the workOffers with freeCores>=CPUS_PER_TASK at first for better perform

[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>=CPUS_PER_TASK at first for better performance

2018-10-30 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-25852: Description: We should filter the workOffers with freeCores>=CPUS_PER_TASK for better performance.

[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>=CPUS_PER_TASK at first for better performance

2018-10-30 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-25852: Summary: we should filter the workOffers of which freeCores>=CPUS_PER_TASK at first for better per

[jira] [Commented] (SPARK-25868) One part of Spark MLlib Kmean Logic Performance problem

2018-10-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668314#comment-16668314 ] Hyukjin Kwon commented on SPARK-25868: -- What kind of problem? what was the expected

[jira] [Updated] (SPARK-25852) we should filter the workOffers of which freeCores>CPUS_PER_TASK at first for better performance

2018-10-30 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-25852: Description: We should filter the workOffers of which freeCores>=CPUS_PER_TASK for better performa

[jira] [Commented] (SPARK-25870) RandomSplit with seed gives different results depending on column order

2018-10-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668313#comment-16668313 ] Marco Gaido commented on SPARK-25870: - If you do some transformations (simple or com

[jira] [Issue Comment Deleted] (SPARK-25407) Spark throws a `ParquetDecodingException` when attempting to read a field from a complex type in certain cases of schema merging

2018-10-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25407: -- Comment: was deleted (was: [~michael]. Is this a regression from Spark 2.3?) > Spark throws a

[jira] [Updated] (SPARK-25880) user set some hadoop configurations can not work

2018-10-30 Thread guojh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] guojh updated SPARK-25880: -- Component/s: Spark Core > user set some hadoop configurations can not work > -

[jira] [Commented] (SPARK-25407) Spark throws a `ParquetDecodingException` when attempting to read a field from a complex type in certain cases of schema merging

2018-10-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668298#comment-16668298 ] Dongjoon Hyun commented on SPARK-25407: --- [~michael]. Is this a regression from Spa
