[jira] [Commented] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2016-08-27 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15442542#comment-15442542 ] Weichen Xu commented on SPARK-17139: Because the LOR & MLOR interfaces need to be unified, I will create

[jira] [Assigned] (SPARK-17138) Python API for multinomial logistic regression

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17138: Assignee: (was: Apache Spark) > Python API for multinomial logistic regression >

[jira] [Commented] (SPARK-17138) Python API for multinomial logistic regression

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15442537#comment-15442537 ] Apache Spark commented on SPARK-17138: -- User 'WeichenXu123' has created a pull request for this

[jira] [Assigned] (SPARK-17138) Python API for multinomial logistic regression

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17138: Assignee: Apache Spark > Python API for multinomial logistic regression >

[jira] [Commented] (SPARK-17264) DataStreamWriter does not support "json" format

2016-08-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15442475#comment-15442475 ] Hyukjin Kwon commented on SPARK-17264: -- Is this a duplicate of SPARK-15472? > DataStreamWriter does

[jira] [Resolved] (SPARK-10834) SPARK SQL doesn't support INSERT INTO ... VALUES

2016-08-27 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-10834. Resolution: Fixed As of (at least) Spark 2.0 we now support INSERT INTO ... VALUES, so I'm going

[jira] [Resolved] (SPARK-11299) SQL Programming Guide's link to DataFrame Function Reference is wrong

2016-08-27 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-11299. Resolution: Fixed This was fixed by my PR in 2015. > SQL Programming Guide's link to DataFrame

[jira] [Commented] (SPARK-17275) Flaky test: org.apache.spark.deploy.RPackageUtilsSuite.jars that don't exist are skipped and print warning

2016-08-27 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15442059#comment-15442059 ] Shivaram Venkataraman commented on SPARK-17275: --- I don't think anything changed there

[jira] [Commented] (SPARK-17281) Add treeAggregateDepth parameter for AFTSurvivalRegression

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441999#comment-15441999 ] Apache Spark commented on SPARK-17281: -- User 'WeichenXu123' has created a pull request for this

[jira] [Assigned] (SPARK-17281) Add treeAggregateDepth parameter for AFTSurvivalRegression

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17281: Assignee: (was: Apache Spark) > Add treeAggregateDepth parameter for

[jira] [Assigned] (SPARK-17281) Add treeAggregateDepth parameter for AFTSurvivalRegression

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17281: Assignee: Apache Spark > Add treeAggregateDepth parameter for AFTSurvivalRegression >

[jira] [Created] (SPARK-17281) Add treeAggregateDepth parameter for AFTSurvivalRegression

2016-08-27 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-17281: -- Summary: Add treeAggregateDepth parameter for AFTSurvivalRegression Key: SPARK-17281 URL: https://issues.apache.org/jira/browse/SPARK-17281 Project: Spark Issue

[jira] [Commented] (SPARK-17280) Flaky test: org.apache.spark.streaming.kafka010.JavaKafkaRDDSuite and JavaDirectKafkaStreamSuite.testKafkaStream

2016-08-27 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441981#comment-15441981 ] Cody Koeninger commented on SPARK-17280: I can take a look but there's not a lot to go on. >

[jira] [Updated] (SPARK-17280) Flaky test: org.apache.spark.streaming.kafka010.JavaKafkaRDDSuite and JavaDirectKafkaStreamSuite.testKafkaStream

2016-08-27 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-17280: - Summary: Flaky test: org.apache.spark.streaming.kafka010.JavaKafkaRDDSuite and

[jira] [Updated] (SPARK-17280) Flaky test: org.apache.spark.streaming.kafka010.JavaKafkaRDDSuite and JavaDirectKafkaStreamSuite.testKafkaStream

2016-08-27 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-17280: - Description: https://spark-tests.appspot.com/builds/spark-master-test-maven-hadoop-2.2/1793

[jira] [Commented] (SPARK-17280) Flaky test: org.apache.spark.streaming.kafka010.JavaKafkaRDDSuite

2016-08-27 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441885#comment-15441885 ] Yin Huai commented on SPARK-17280: -- [~c...@koeninger.org] Will you have time to take a look at these two

[jira] [Created] (SPARK-17280) Flaky test: org.apache.spark.streaming.kafka010.JavaKafkaRDDSuite

2016-08-27 Thread Yin Huai (JIRA)
Yin Huai created SPARK-17280: Summary: Flaky test: org.apache.spark.streaming.kafka010.JavaKafkaRDDSuite Key: SPARK-17280 URL: https://issues.apache.org/jira/browse/SPARK-17280 Project: Spark

[jira] [Commented] (SPARK-12394) Support writing out pre-hash-partitioned data and exploit that in join optimizations to avoid shuffle (i.e. bucketing in Hive)

2016-08-27 Thread Darren Fu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441782#comment-15441782 ] Darren Fu commented on SPARK-12394: --- Good news, Tejas! I think SMB join is a required feature to make

[jira] [Commented] (SPARK-16957) Use weighted midpoints for split values.

2016-08-27 Thread Abdeali Kothari (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441640#comment-15441640 ] Abdeali Kothari commented on SPARK-16957: - Hi, I'd like to begin contributing, and this seems

[jira] [Comment Edited] (SPARK-13525) SparkR: java.net.SocketTimeoutException: Accept timed out when running any dataframe function

2016-08-27 Thread Arihanth Jain (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441107#comment-15441107 ] Arihanth Jain edited comment on SPARK-13525 at 8/27/16 1:58 PM: I checked

[jira] [Commented] (SPARK-13525) SparkR: java.net.SocketTimeoutException: Accept timed out when running any dataframe function

2016-08-27 Thread Arihanth Jain (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441107#comment-15441107 ] Arihanth Jain commented on SPARK-13525: --- I checked for localhost and it works. The spark cluster

[jira] [Closed] (SPARK-9066) Improve cartesian performance

2016-08-27 Thread Weizhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weizhong closed SPARK-9066. --- Resolution: Fixed > Improve cartesian performance > -- > > Key:

[jira] [Closed] (SPARK-13768) Set hive conf failed use --hiveconf when beeline connect to thriftserver

2016-08-27 Thread Weizhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weizhong closed SPARK-13768. Resolution: Fixed > Set hive conf failed use --hiveconf when beeline connect to thriftserver >

[jira] [Assigned] (SPARK-17279) better error message for NPE during ScalaUDF execution

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17279: Assignee: Apache Spark (was: Wenchen Fan) > better error message for NPE during ScalaUDF

[jira] [Assigned] (SPARK-17279) better error message for NPE during ScalaUDF execution

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17279: Assignee: Wenchen Fan (was: Apache Spark) > better error message for NPE during ScalaUDF

[jira] [Commented] (SPARK-17279) better error message for NPE during ScalaUDF execution

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441008#comment-15441008 ] Apache Spark commented on SPARK-17279: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Created] (SPARK-17279) better error message for NPE during ScalaUDF execution

2016-08-27 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-17279: --- Summary: better error message for NPE during ScalaUDF execution Key: SPARK-17279 URL: https://issues.apache.org/jira/browse/SPARK-17279 Project: Spark Issue

[jira] [Created] (SPARK-17278) better error message for NPE during ScalaUDF execution

2016-08-27 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-17278: --- Summary: better error message for NPE during ScalaUDF execution Key: SPARK-17278 URL: https://issues.apache.org/jira/browse/SPARK-17278 Project: Spark Issue

[jira] [Updated] (SPARK-17277) Set hive conf failed

2016-08-27 Thread Weizhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weizhong updated SPARK-17277: - Description: Now we can't use "SET k=v" to set Hive conf, for example: run below SQL in spark-sql

[jira] [Updated] (SPARK-17277) Set hive conf failed

2016-08-27 Thread Weizhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weizhong updated SPARK-17277: - Description: Now we can't use "SET k=v" to set Hive conf, for example: run below SQL in spark-sql

[jira] [Resolved] (SPARK-15044) spark-sql will throw "input path does not exist" exception if it handles a partition which exists in hive table, but the path is removed manually

2016-08-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-15044. --- Resolution: Not A Problem > spark-sql will throw "input path does not exist" exception if it handles

[jira] [Created] (SPARK-17277) Set hive conf failed

2016-08-27 Thread Weizhong (JIRA)
Weizhong created SPARK-17277: Summary: Set hive conf failed Key: SPARK-17277 URL: https://issues.apache.org/jira/browse/SPARK-17277 Project: Spark Issue Type: Bug Components: SQL

[jira] [Resolved] (SPARK-17236) Use saveAsHadoopDataset to save RDD to HBASE, long time no response

2016-08-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17236. --- Resolution: Not A Problem I'm not sure how to resolve it, but if it's more an HBase issue, ask the

[jira] [Commented] (SPARK-17214) How to deal with dots (.) present in column names in SparkR

2016-08-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440966#comment-15440966 ] Sean Owen commented on SPARK-17214: --- I think the issue is that the 'underlying' dataframe hasn't

[jira] [Resolved] (SPARK-17143) pyspark unable to create UDF: java.lang.RuntimeException: org.apache.hadoop.fs.FileAlreadyExistsException: Parent path is not a directory: /tmp tmp

2016-08-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17143. --- Resolution: Not A Problem > pyspark unable to create UDF: java.lang.RuntimeException: >

[jira] [Resolved] (SPARK-17001) Enable standardScaler to standardize sparse vectors when withMean=True

2016-08-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17001. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14663

[jira] [Assigned] (SPARK-17001) Enable standardScaler to standardize sparse vectors when withMean=True

2016-08-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-17001: - Assignee: Sean Owen > Enable standardScaler to standardize sparse vectors when withMean=True >

[jira] [Updated] (SPARK-17216) Even timeline for a stage doesn't core 100% of the bar timeline bar in chrome

2016-08-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17216: -- Assignee: Robert Kruszewski Component/s: Web UI > Even timeline for a stage doesn't core 100%

[jira] [Resolved] (SPARK-17216) Even timeline for a stage doesn't core 100% of the bar timeline bar in chrome

2016-08-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17216. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Updated] (SPARK-15382) monotonicallyIncreasingId doesn't work when data is upsampled

2016-08-27 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-15382: -- Assignee: Takeshi Yamamuro > monotonicallyIncreasingId doesn't work when data is upsampled >

[jira] [Resolved] (SPARK-17274) Move join optimizer rules into a separate file

2016-08-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17274. - Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 > Move join optimizer

[jira] [Resolved] (SPARK-17273) Move expression optimizer rules into a separate file

2016-08-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17273. - Resolution: Fixed Fix Version/s: 2.1.0 > Move expression optimizer rules into a separate

[jira] [Resolved] (SPARK-17272) Move subquery optimizer rules into its own file

2016-08-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17272. - Resolution: Fixed Fix Version/s: 2.1.0 > Move subquery optimizer rules into its own file

[jira] [Updated] (SPARK-17270) Move object optimization rules into its own file

2016-08-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17270: Fix Version/s: 2.0.1 > Move object optimization rules into its own file >

[jira] [Updated] (SPARK-17110) Pyspark with locality ANY throw java.io.StreamCorruptedException

2016-08-27 Thread Tomer Kaftan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomer Kaftan updated SPARK-17110: - Environment: Cluster of 2 AWS r3.xlarge slaves launched via ec2 scripts, Spark 2.0.0, hadoop:

[jira] [Commented] (SPARK-17110) Pyspark with locality ANY throw java.io.StreamCorruptedException

2016-08-27 Thread Tomer Kaftan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440817#comment-15440817 ] Tomer Kaftan commented on SPARK-17110: -- Hi Miao, That setup wouldn't cause this bug to appear

[jira] [Commented] (SPARK-17276) Stop environment parameters flooding Jenkins build output

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440812#comment-15440812 ] Apache Spark commented on SPARK-17276: -- User 'keypointt' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17276) Stop environment parameters flooding Jenkins build output

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17276: Assignee: (was: Apache Spark) > Stop environment parameters flooding Jenkins build

[jira] [Assigned] (SPARK-17276) Stop environment parameters flooding Jenkins build output

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17276: Assignee: Apache Spark > Stop environment parameters flooding Jenkins build output >

[jira] [Assigned] (SPARK-17254) Filter operator should have “stop if false” semantics for sorted data

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17254: Assignee: Apache Spark > Filter operator should have “stop if false” semantics for sorted

[jira] [Commented] (SPARK-17254) Filter operator should have “stop if false” semantics for sorted data

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15440806#comment-15440806 ] Apache Spark commented on SPARK-17254: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17254) Filter operator should have “stop if false” semantics for sorted data

2016-08-27 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17254: Assignee: (was: Apache Spark) > Filter operator should have “stop if false” semantics

[jira] [Updated] (SPARK-17110) Pyspark with locality ANY throw java.io.StreamCorruptedException

2016-08-27 Thread Tomer Kaftan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomer Kaftan updated SPARK-17110: - Description: In Pyspark 2.0.0, any task that accesses cached data non-locally throws a