[jira] [Created] (SPARK-23544) Remove repartition operation from join in the optimizer

2018-02-28 Thread caoxuewen (JIRA)
caoxuewen created SPARK-23544: - Summary: Remove repartition operation from join in the optimizer Key: SPARK-23544 URL: https://issues.apache.org/jira/browse/SPARK-23544 Project: Spark Issue

[jira] [Assigned] (SPARK-23543) Automatic Module creation fails in Java 9

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23543: Assignee: (was: Apache Spark) > Automatic Module creation fails in Java 9 >

[jira] [Assigned] (SPARK-23543) Automatic Module creation fails in Java 9

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23543: Assignee: Apache Spark > Automatic Module creation fails in Java 9 >

[jira] [Assigned] (SPARK-23437) [ML] Distributed Gaussian Process Regression for MLlib

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23437: Assignee: Apache Spark > [ML] Distributed Gaussian Process Regression for MLlib >

[jira] [Commented] (SPARK-23266) Matrix Inversion on BlockMatrix

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381604#comment-16381604 ] Apache Spark commented on SPARK-23266: -- Hello, which version will this issue be released in? > Matrix

[jira] [Assigned] (SPARK-23389) When the shuffle dependency specifies aggregation ,and `dependency.mapSideCombine=false`, we should be able to use serialized sorting.

2018-02-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-23389: --- Assignee: liuxian > When the shuffle dependency specifies aggregation ,and >

[jira] [Resolved] (SPARK-23389) When the shuffle dependency specifies aggregation ,and `dependency.mapSideCombine=false`, we should be able to use serialized sorting.

2018-02-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23389. - Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20576

[jira] [Updated] (SPARK-23542) The `where exists' action in optimized logical plan should be optimized

2018-02-28 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23542: -- Description: The optimized logical plan of query '*select * from tt1 where exists (select * 

[jira] [Updated] (SPARK-23543) Automatic Module creation fails in Java 9

2018-02-28 Thread Brian D Chambers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian D Chambers updated SPARK-23543: - Description: When adding Spark to a Java 9 project that is utilizing the new jdk9 module

[jira] [Created] (SPARK-23543) Automatic Module creation fails in Java 9

2018-02-28 Thread Brian D Chambers (JIRA)
Brian D Chambers created SPARK-23543: Summary: Automatic Module creation fails in Java 9 Key: SPARK-23543 URL: https://issues.apache.org/jira/browse/SPARK-23543 Project: Spark Issue

[jira] [Resolved] (SPARK-23493) insert-into depends on columns order, otherwise incorrect data inserted

2018-02-28 Thread Xiaoju Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoju Wu resolved SPARK-23493. --- Resolution: Not A Bug > insert-into depends on columns order, otherwise incorrect data inserted >

[jira] [Resolved] (SPARK-23540) The `where exists' action in optimized logical plan should be optimized

2018-02-28 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei resolved SPARK-23540. --- Resolution: Duplicate > The `where exists' action in optimized logical plan should be

[jira] [Updated] (SPARK-23542) The `where exists' action in optimized logical plan should be optimized

2018-02-28 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23542: -- Description: The optimized logical plan of query 'select * from tt1 where exists (select * 

[jira] [Updated] (SPARK-23542) The `where exists' action in optimized logical plan should be optimized

2018-02-28 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23542: -- Description: The optimized logical plan of query 'select * from tt1 where exists (select * 

[jira] [Commented] (SPARK-23526) KafkaMicroBatchV2SourceSuite.ensure stream-stream self-join generates only one offset in offset log

2018-02-28 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381408#comment-16381408 ] Gabor Somogyi commented on SPARK-23526: --- I've checked the code and the problem is similar just like

[jira] [Updated] (SPARK-23542) The `where exists' action in optimized logical plan should be optimized

2018-02-28 Thread KaiXinXIaoLei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KaiXinXIaoLei updated SPARK-23542: -- Description: The optimized logical plan of query 'select * from tt1 where exists (select * 

[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread

2018-02-28 Thread Thilak Raj Balasubramanian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381396#comment-16381396 ] Thilak Raj Balasubramanian commented on SPARK-21918: [~huLiu] This feature is a very

[jira] [Created] (SPARK-23542) The `where exists' action in optimized logical plan should be optimized

2018-02-28 Thread KaiXinXIaoLei (JIRA)
KaiXinXIaoLei created SPARK-23542: - Summary: The `where exists' action in optimized logical plan should be optimized Key: SPARK-23542 URL: https://issues.apache.org/jira/browse/SPARK-23542 Project:

[jira] [Assigned] (SPARK-23541) Allow Kafka source to read data with greater parallelism than the number of topic-partitions

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23541: Assignee: Tathagata Das (was: Apache Spark) > Allow Kafka source to read data with

[jira] [Commented] (SPARK-23541) Allow Kafka source to read data with greater parallelism than the number of topic-partitions

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381363#comment-16381363 ] Apache Spark commented on SPARK-23541: -- User 'tdas' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23541) Allow Kafka source to read data with greater parallelism than the number of topic-partitions

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23541: Assignee: Apache Spark (was: Tathagata Das) > Allow Kafka source to read data with

[jira] [Created] (SPARK-23541) Allow Kafka source to read data with greater parallelism than the number of topic-partitions

2018-02-28 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-23541: - Summary: Allow Kafka source to read data with greater parallelism than the number of topic-partitions Key: SPARK-23541 URL: https://issues.apache.org/jira/browse/SPARK-23541

[jira] [Created] (SPARK-23540) The `where exists' action in optimized logical plan should be optimized

2018-02-28 Thread KaiXinXIaoLei (JIRA)
KaiXinXIaoLei created SPARK-23540: - Summary: The `where exists' action in optimized logical plan should be optimized Key: SPARK-23540 URL: https://issues.apache.org/jira/browse/SPARK-23540 Project:

[jira] [Commented] (SPARK-23526) KafkaMicroBatchV2SourceSuite.ensure stream-stream self-join generates only one offset in offset log

2018-02-28 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381349#comment-16381349 ] Gabor Somogyi commented on SPARK-23526: --- Reminds me 

[jira] [Commented] (SPARK-23527) Error with spark-submit and kerberos with TLS-enabled Hadoop cluster

2018-02-28 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381258#comment-16381258 ] Gabor Somogyi commented on SPARK-23527: --- Yeah, I agree with Yuming. In the first case host not

[jira] [Commented] (SPARK-23427) spark.sql.autoBroadcastJoinThreshold causing OOM exception in the driver

2018-02-28 Thread Pratik Dhumal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381251#comment-16381251 ] Pratik Dhumal commented on SPARK-23427: --- Hello, For the purpose of development plan, # Can we

[jira] [Commented] (SPARK-18630) PySpark ML memory leak

2018-02-28 Thread yogesh garg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381244#comment-16381244 ] yogesh garg commented on SPARK-18630: - I would like to take this. If I understand correctly, moving

[jira] [Resolved] (SPARK-23083) Adding Kubernetes as an option to https://spark.apache.org/

2018-02-28 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anirudh Ramanathan resolved SPARK-23083. Resolution: Fixed This has been merged, closing. > Adding Kubernetes as an option

[jira] [Commented] (SPARK-22942) Spark Sql UDF throwing NullPointer when adding a filter on a columns that uses that UDF

2018-02-28 Thread Ravneet Popli (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381077#comment-16381077 ] Ravneet Popli commented on SPARK-22942: --- Matthew - Were you able to resolve this? We are also

[jira] [Created] (SPARK-23539) Add support for Kafka headers in Structured Streaming

2018-02-28 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-23539: - Summary: Add support for Kafka headers in Structured Streaming Key: SPARK-23539 URL: https://issues.apache.org/jira/browse/SPARK-23539 Project: Spark

[jira] [Updated] (SPARK-18057) Update structured streaming kafka from 0.10.0.1 to 1.1.0

2018-02-28 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18057: - Summary: Update structured streaming kafka from 0.10.0.1 to 1.1.0 (was: Update structured

[jira] [Commented] (SPARK-23502) Support async init of spark context during spark-shell startup

2018-02-28 Thread Sital Kedia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380942#comment-16380942 ] Sital Kedia commented on SPARK-23502: - >> what happens when you operate on {{sc}} before it's

[jira] [Created] (SPARK-23538) Simplify SSL configuration for https client

2018-02-28 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-23538: -- Summary: Simplify SSL configuration for https client Key: SPARK-23538 URL: https://issues.apache.org/jira/browse/SPARK-23538 Project: Spark Issue Type:

[jira] [Updated] (SPARK-23537) Logistic Regression without standardization

2018-02-28 Thread Jordi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordi updated SPARK-23537: -- Affects Version/s: 2.0.2 > Logistic Regression without standardization >

[jira] [Updated] (SPARK-23537) Logistic Regression without standardization

2018-02-28 Thread Jordi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordi updated SPARK-23537: -- Priority: Major (was: Minor) > Logistic Regression without standardization >

[jira] [Updated] (SPARK-23537) Logistic Regression without standardization

2018-02-28 Thread Jordi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordi updated SPARK-23537: -- Description: I'm trying to train a Logistic Regression model, using Spark 2.2.1. I prefer to not use

[jira] [Updated] (SPARK-23537) Logistic Regression without standardization

2018-02-28 Thread Jordi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordi updated SPARK-23537: -- Description: I'm trying to train a Logistic Regression model, using Spark 2.2.1. I prefer to not use

[jira] [Updated] (SPARK-23537) Logistic Regression without standardization

2018-02-28 Thread Jordi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordi updated SPARK-23537: -- Attachment: standardization.log > Logistic Regression without standardization >

[jira] [Updated] (SPARK-23537) Logistic Regression without standardization

2018-02-28 Thread Jordi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordi updated SPARK-23537: -- Attachment: non-standardization.log > Logistic Regression without standardization >

[jira] [Commented] (SPARK-23498) Accuracy problem in comparison with string and integer

2018-02-28 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380722#comment-16380722 ] Kazuaki Ishizaki commented on SPARK-23498: -- Is SPARK-21774 related to this issue, too? >

[jira] [Created] (SPARK-23537) Logistic Regression without standardization

2018-02-28 Thread Jordi (JIRA)
Jordi created SPARK-23537: - Summary: Logistic Regression without standardization Key: SPARK-23537 URL: https://issues.apache.org/jira/browse/SPARK-23537 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-23536) Update each Data frame row with a random value

2018-02-28 Thread Deenadayal (JIRA)
Deenadayal created SPARK-23536: -- Summary: Update each Data frame row with a random value Key: SPARK-23536 URL: https://issues.apache.org/jira/browse/SPARK-23536 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-23499) Mesos Cluster Dispatcher should support priority queues to submit drivers

2018-02-28 Thread Pascal GILLET (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380652#comment-16380652 ] Pascal GILLET edited comment on SPARK-23499 at 2/28/18 4:54 PM: Below is a

[jira] [Comment Edited] (SPARK-23513) java.io.IOException: Expected 12 fields, but got 5 for row :Spark submit error

2018-02-28 Thread Anuroopa George (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380650#comment-16380650 ] Anuroopa George edited comment on SPARK-23513 at 2/28/18 4:49 PM: -- Could

[jira] [Issue Comment Deleted] (SPARK-23499) Mesos Cluster Dispatcher should support priority queues to submit drivers

2018-02-28 Thread Pascal GILLET (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pascal GILLET updated SPARK-23499: -- Comment: was deleted (was: I attached a screenshot of the MesosClusterDispatcher UI showing

[jira] [Commented] (SPARK-23499) Mesos Cluster Dispatcher should support priority queues to submit drivers

2018-02-28 Thread Pascal GILLET (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380652#comment-16380652 ] Pascal GILLET commented on SPARK-23499: --- Below is a screenshot of the MesosClusterDispatcher UI

[jira] [Updated] (SPARK-23499) Mesos Cluster Dispatcher should support priority queues to submit drivers

2018-02-28 Thread Pascal GILLET (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pascal GILLET updated SPARK-23499: -- Attachment: Screenshot from 2018-02-28 17-22-47.png > Mesos Cluster Dispatcher should support

[jira] [Updated] (SPARK-23514) Replace spark.sparkContext.hadoopConfiguration by spark.sessionState.newHadoopConf()

2018-02-28 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23514: Fix Version/s: (was: 2.3.1) 2.4.0 > Replace spark.sparkContext.hadoopConfiguration

[jira] [Assigned] (SPARK-23514) Replace spark.sparkContext.hadoopConfiguration by spark.sessionState.newHadoopConf()

2018-02-28 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-23514: --- Assignee: Juliusz Sompolski > Replace spark.sparkContext.hadoopConfiguration by >

[jira] [Resolved] (SPARK-23514) Replace spark.sparkContext.hadoopConfiguration by spark.sessionState.newHadoopConf()

2018-02-28 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23514. - Resolution: Fixed Fix Version/s: 2.3.1 > Replace spark.sparkContext.hadoopConfiguration by >

[jira] [Commented] (SPARK-23513) java.io.IOException: Expected 12 fields, but got 5 for row :Spark submit error

2018-02-28 Thread Anuroopa George (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380650#comment-16380650 ] Anuroopa George commented on SPARK-23513: - Could you please post the complete spark-submit

[jira] [Commented] (SPARK-23499) Mesos Cluster Dispatcher should support priority queues to submit drivers

2018-02-28 Thread Pascal GILLET (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380641#comment-16380641 ] Pascal GILLET commented on SPARK-23499: --- I attached a screenshot of the MesosClusterDispatcher UI

[jira] [Updated] (SPARK-23499) Mesos Cluster Dispatcher should support priority queues to submit drivers

2018-02-28 Thread Pascal GILLET (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pascal GILLET updated SPARK-23499: -- Attachment: (was: Screenshot from 2018-02-28 17-22-47.png) > Mesos Cluster Dispatcher

[jira] [Updated] (SPARK-23499) Mesos Cluster Dispatcher should support priority queues to submit drivers

2018-02-28 Thread Pascal GILLET (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pascal GILLET updated SPARK-23499: -- Attachment: Screenshot from 2018-02-28 17-22-47.png > Mesos Cluster Dispatcher should support

[jira] [Resolved] (SPARK-20368) Support Sentry on PySpark workers

2018-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-20368. -- Resolution: Duplicate Let me leave this resolved as a duplicate of SPARK-22959. > Support

[jira] [Assigned] (SPARK-23517) Make pyspark.util._exception_message produce the trace from Java side for Py4JJavaError

2018-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-23517: Assignee: Hyukjin Kwon > Make pyspark.util._exception_message produce the trace from Java

[jira] [Resolved] (SPARK-23517) Make pyspark.util._exception_message produce the trace from Java side for Py4JJavaError

2018-02-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-23517. -- Resolution: Fixed Fix Version/s: 2.3.1 Issue resolved by pull request 20680

[jira] [Commented] (SPARK-23525) ALTER TABLE CHANGE COLUMN doesn't work for external hive table

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380486#comment-16380486 ] Apache Spark commented on SPARK-23525: -- User 'jiangxb1987' has created a pull request for this

[jira] [Assigned] (SPARK-23525) ALTER TABLE CHANGE COLUMN doesn't work for external hive table

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23525: Assignee: (was: Apache Spark) > ALTER TABLE CHANGE COLUMN doesn't work for external

[jira] [Assigned] (SPARK-23525) ALTER TABLE CHANGE COLUMN doesn't work for external hive table

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23525: Assignee: Apache Spark > ALTER TABLE CHANGE COLUMN doesn't work for external hive table >

[jira] [Assigned] (SPARK-23508) blockManagerIdCache in BlockManagerId may cause oom

2018-02-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-23508: --- Assignee: zhoukang > blockManagerIdCache in BlockManagerId may cause oom >

[jira] [Resolved] (SPARK-23508) blockManagerIdCache in BlockManagerId may cause oom

2018-02-28 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23508. - Resolution: Fixed Fix Version/s: 2.3.1 2.4.0 2.2.2

[jira] [Commented] (SPARK-23527) Error with spark-submit and kerberos with TLS-enabled Hadoop cluster

2018-02-28 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380440#comment-16380440 ] Yuming Wang commented on SPARK-23527: - It isn't a bug; please check your hostname: 

[jira] [Commented] (SPARK-23528) Expose vital statistics of GaussianMixtureModel

2018-02-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380426#comment-16380426 ] Marco Gaido commented on SPARK-23528: - The log likelihood is already available in the summary (eg.

[jira] [Commented] (SPARK-16996) Hive ACID delta files not seen

2018-02-28 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380370#comment-16380370 ] Steve Loughran commented on SPARK-16996: Like I said, Spark is trouble; we've just been including

[jira] [Assigned] (SPARK-21741) Python API for DataFrame-based multivariate summarizer

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21741: Assignee: Apache Spark > Python API for DataFrame-based multivariate summarizer >

[jira] [Commented] (SPARK-21741) Python API for DataFrame-based multivariate summarizer

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380368#comment-16380368 ] Apache Spark commented on SPARK-21741: -- User 'WeichenXu123' has created a pull request for this

[jira] [Assigned] (SPARK-21741) Python API for DataFrame-based multivariate summarizer

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21741: Assignee: (was: Apache Spark) > Python API for DataFrame-based multivariate

[jira] [Updated] (SPARK-23535) MinMaxScaler return 0.5 for an all zero column

2018-02-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-23535: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) (Not a bug -- any value in the

[jira] [Updated] (SPARK-23529) Specify hostpath volume and mount the volume in Spark driver and executor pods in Kubernetes

2018-02-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-23529: -- Target Version/s: (was: 2.3.0) Fix Version/s: (was: 2.3.0) > Specify hostpath volume and

[jira] [Commented] (SPARK-23535) MinMaxScaler return 0.5 for an all zero column

2018-02-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380334#comment-16380334 ] Marco Gaido commented on SPARK-23535: - I checked and each tool behaves in its own way when this case

[jira] [Assigned] (SPARK-23173) from_json can produce nulls for fields which are marked as non-nullable

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23173: Assignee: Apache Spark > from_json can produce nulls for fields which are marked as

[jira] [Assigned] (SPARK-23173) from_json can produce nulls for fields which are marked as non-nullable

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23173: Assignee: (was: Apache Spark) > from_json can produce nulls for fields which are

[jira] [Commented] (SPARK-23173) from_json can produce nulls for fields which are marked as non-nullable

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380250#comment-16380250 ] Apache Spark commented on SPARK-23173: -- User 'mswit-databricks' has created a pull request for this

[jira] [Commented] (SPARK-23523) Incorrect result caused by the rule OptimizeMetadataOnlyQuery

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380247#comment-16380247 ] Apache Spark commented on SPARK-23523: -- User 'jiangxb1987' has created a pull request for this

[jira] [Assigned] (SPARK-23531) When explain, plan's output should include attribute type info

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23531: Assignee: (was: Apache Spark) > When explain, plan's output should include attribute

[jira] [Commented] (SPARK-23531) When explain, plan's output should include attribute type info

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380235#comment-16380235 ] Apache Spark commented on SPARK-23531: -- User 'mgaido91' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23531) When explain, plan's output should include attribute type info

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23531: Assignee: Apache Spark > When explain, plan's output should include attribute type info >

[jira] [Commented] (SPARK-14974) spark sql job create too many files in HDFS when doing insert overwrite hive table

2018-02-28 Thread Kevin Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380126#comment-16380126 ] Kevin Zhang commented on SPARK-14974: - I encountered the same problem as [~ussraf] in Spark 2.2 and

[jira] [Created] (SPARK-23535) MinMaxScaler return 0.5 for an all zero column

2018-02-28 Thread Yigal Weinberger (JIRA)
Yigal Weinberger created SPARK-23535: Summary: MinMaxScaler return 0.5 for an all zero column Key: SPARK-23535 URL: https://issues.apache.org/jira/browse/SPARK-23535 Project: Spark Issue

[jira] [Updated] (SPARK-23534) Spark run on Hadoop 3.0.0

2018-02-28 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-23534: Description: Major Hadoop vendors have already moved, or will move, to Hadoop 3.0, so we should also make sure

[jira] [Commented] (SPARK-18161) Default PickleSerializer pickle protocol doesn't handle > 4GB objects

2018-02-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379971#comment-16379971 ] Apache Spark commented on SPARK-18161: -- User 'inpefess' has created a pull request for this issue:

[jira] [Created] (SPARK-23534) Spark run on Hadoop 3.0.0

2018-02-28 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-23534: --- Summary: Spark run on Hadoop 3.0.0 Key: SPARK-23534 URL: https://issues.apache.org/jira/browse/SPARK-23534 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-23531) When explain, plan's output should include attribute type info

2018-02-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379967#comment-16379967 ] Marco Gaido commented on SPARK-23531: - I am working on this. I will submit a PR soon. > When