[jira] [Updated] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2017-01-03 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-13677: - Component/s: (was: MLlib) ML > Support Tree-Based Feature Transformation for

[jira] [Comment Edited] (SPARK-7768) Make user-defined type (UDT) API public

2017-01-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797443#comment-15797443 ] Liang-Chi Hsieh edited comment on SPARK-7768 at 1/4/17 7:35 AM:

[jira] [Commented] (SPARK-7768) Make user-defined type (UDT) API public

2017-01-03 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797443#comment-15797443 ] Liang-Chi Hsieh commented on SPARK-7768: I would like to push this forward and mak

[jira] [Commented] (SPARK-19035) rand() function in case when cause failed

2017-01-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797395#comment-15797395 ] Xiao Li commented on SPARK-19035: - Oracle treats it as the same random function. {nofor

[jira] [Updated] (SPARK-19026) local directories cannot be cleanuped when create directory of "executor-***" throws IOException such as there is no more free disk space to create it etc.

2017-01-03 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-19026: Description: i set SPARK_LOCAL_DIRS variable like this: SPARK_LOCAL_DIRS=/data2/spark/tmp,/data3/sp

[jira] [Resolved] (SPARK-19072) Catalyst's IN always returns false for infinity

2017-01-03 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-19072. -- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16469 [https://github.com/

[jira] [Updated] (SPARK-19072) Catalyst's IN always returns false for infinity

2017-01-03 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-19072: - Assignee: Wenchen Fan > Catalyst's IN always returns false for infinity > ---

[jira] [Comment Edited] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore

2017-01-03 Thread Dapeng Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797215#comment-15797215 ] Dapeng Sun edited comment on SPARK-13446 at 1/4/17 5:22 AM: H

[jira] [Commented] (SPARK-13446) Spark need to support reading data from Hive 2.0.0 metastore

2017-01-03 Thread Dapeng Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797215#comment-15797215 ] Dapeng Sun commented on SPARK-13446: Hi all, I created SPARK-19076 may be related to

[jira] [Commented] (SPARK-19076) Upgrade Hive dependence to Hive 2.x

2017-01-03 Thread Dapeng Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797208#comment-15797208 ] Dapeng Sun commented on SPARK-19076: After a quick look of the upstream Spark, it may

[jira] [Updated] (SPARK-19076) Upgrade Hive dependence to Hive 2.x

2017-01-03 Thread Dapeng Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dapeng Sun updated SPARK-19076: --- Description: Currently the upstream Spark depends on Hive 1.2.1 to build package, and Hive 2.0 has be

[jira] [Created] (SPARK-19076) Upgrade Hive dependence to Hive 2.x

2017-01-03 Thread Dapeng Sun (JIRA)
Dapeng Sun created SPARK-19076: -- Summary: Upgrade Hive dependence to Hive 2.x Key: SPARK-19076 URL: https://issues.apache.org/jira/browse/SPARK-19076 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-19075) Plz make MinMaxScaler can work with a Number type field

2017-01-03 Thread lichenglin (JIRA)
lichenglin created SPARK-19075: -- Summary: Plz make MinMaxScaler can work with a Number type field Key: SPARK-19075 URL: https://issues.apache.org/jira/browse/SPARK-19075 Project: Spark Issue Typ

[jira] [Commented] (SPARK-19072) Catalyst's IN always returns false for infinity

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797055#comment-15797055 ] Apache Spark commented on SPARK-19072: -- User 'cloud-fan' has created a pull request

[jira] [Assigned] (SPARK-19072) Catalyst's IN always returns false for infinity

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19072: Assignee: Apache Spark > Catalyst's IN always returns false for infinity > ---

[jira] [Assigned] (SPARK-19072) Catalyst's IN always returns false for infinity

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19072: Assignee: (was: Apache Spark) > Catalyst's IN always returns false for infinity >

[jira] [Commented] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2017-01-03 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796968#comment-15796968 ] zhengruifeng commented on SPARK-13677: -- Not at all. I know you committers are busy.

[jira] [Commented] (SPARK-13435) Add Weighted Cohen's kappa to MulticlassMetrics

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796965#comment-15796965 ] Joseph K. Bradley commented on SPARK-13435: --- Thanks! We just have not been abl

[jira] [Updated] (SPARK-19072) Catalyst's IN always returns false for infinity

2017-01-03 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-19072: --- Description: This bug was caused by the fix for SPARK-18999 (https://github.com/apache/spark

[jira] [Updated] (SPARK-19072) Catalyst's IN always returns false for infinity

2017-01-03 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-19072: --- Description: This can be reproduced by adding the following test to PredicateSuite.scala (wh

[jira] [Updated] (SPARK-19072) Catalyst's IN always returns false for infinity

2017-01-03 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-19072: --- Summary: Catalyst's IN always returns false for infinity (was: PredicateSuite.IN test is fla

[jira] [Assigned] (SPARK-19074) Update Structured Streaming Programming guide for Update Mode

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19074: Assignee: Tathagata Das (was: Apache Spark) > Update Structured Streaming Programming gui

[jira] [Assigned] (SPARK-19074) Update Structured Streaming Programming guide for Update Mode

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19074: Assignee: Apache Spark (was: Tathagata Das) > Update Structured Streaming Programming gui

[jira] [Commented] (SPARK-19074) Update Structured Streaming Programming guide for Update Mode

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796884#comment-15796884 ] Apache Spark commented on SPARK-19074: -- User 'tdas' has created a pull request for t

[jira] [Created] (SPARK-19074) Update Structured Streaming Programming guide for Update Mode

2017-01-03 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-19074: - Summary: Update Structured Streaming Programming guide for Update Mode Key: SPARK-19074 URL: https://issues.apache.org/jira/browse/SPARK-19074 Project: Spark

[jira] [Assigned] (SPARK-19073) LauncherState should be only set to SUBMITTED after the application is submitted

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19073: Assignee: Apache Spark > LauncherState should be only set to SUBMITTED after the applicati

[jira] [Assigned] (SPARK-19073) LauncherState should be only set to SUBMITTED after the application is submitted

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19073: Assignee: (was: Apache Spark) > LauncherState should be only set to SUBMITTED after th

[jira] [Commented] (SPARK-19073) LauncherState should be only set to SUBMITTED after the application is submitted

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796852#comment-15796852 ] Apache Spark commented on SPARK-19073: -- User 'shimingfei' has created a pull request

[jira] [Created] (SPARK-19073) LauncherState should be only set to SUBMITTED after the application is submitted

2017-01-03 Thread shimingfei (JIRA)
shimingfei created SPARK-19073: -- Summary: LauncherState should be only set to SUBMITTED after the application is submitted Key: SPARK-19073 URL: https://issues.apache.org/jira/browse/SPARK-19073 Project:

[jira] [Assigned] (SPARK-19017) NOT IN subquery with more than one column may return incorrect results

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19017: Assignee: (was: Apache Spark) > NOT IN subquery with more than one column may return i

[jira] [Commented] (SPARK-19017) NOT IN subquery with more than one column may return incorrect results

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796839#comment-15796839 ] Apache Spark commented on SPARK-19017: -- User 'nsyca' has created a pull request for

[jira] [Assigned] (SPARK-19017) NOT IN subquery with more than one column may return incorrect results

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19017: Assignee: Apache Spark > NOT IN subquery with more than one column may return incorrect re

[jira] [Commented] (SPARK-19033) HistoryServer still uses old ACLs even if ACLs are updated

2017-01-03 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796834#comment-15796834 ] Saisai Shao commented on SPARK-19033: - Thanks a lot [~tgraves], I will think about ho

[jira] [Assigned] (SPARK-19070) Clean-up dataset actions

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19070: Assignee: Apache Spark (was: Herman van Hovell) > Clean-up dataset actions >

[jira] [Commented] (SPARK-19070) Clean-up dataset actions

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796777#comment-15796777 ] Apache Spark commented on SPARK-19070: -- User 'hvanhovell' has created a pull request

[jira] [Assigned] (SPARK-19070) Clean-up dataset actions

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19070: Assignee: Herman van Hovell (was: Apache Spark) > Clean-up dataset actions >

[jira] [Assigned] (SPARK-19064) Fix pip install issue with ml sub components

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19064: Assignee: Apache Spark > Fix pip install issue with ml sub components > --

[jira] [Commented] (SPARK-19064) Fix pip install issue with ml sub components

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796775#comment-15796775 ] Apache Spark commented on SPARK-19064: -- User 'holdenk' has created a pull request fo

[jira] [Assigned] (SPARK-19064) Fix pip install issue with ml sub components

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19064: Assignee: (was: Apache Spark) > Fix pip install issue with ml sub components > ---

[jira] [Commented] (SPARK-18966) NOT IN subquery with correlated expressions may return incorrect result

2017-01-03 Thread Nattavut Sutyanyong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796762#comment-15796762 ] Nattavut Sutyanyong commented on SPARK-18966: - Making a note that the query

[jira] [Commented] (SPARK-13677) Support Tree-Based Feature Transformation for ML

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796748#comment-15796748 ] Joseph K. Bradley commented on SPARK-13677: --- [~podongfeng] Apologies for the in

[jira] [Created] (SPARK-19072) PredicateSuite.IN test is flaky

2017-01-03 Thread Kay Ousterhout (JIRA)
Kay Ousterhout created SPARK-19072: -- Summary: PredicateSuite.IN test is flaky Key: SPARK-19072 URL: https://issues.apache.org/jira/browse/SPARK-19072 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-19039) UDF ClosureCleaner bug when UDF, col applied in paste mode in REPL

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796740#comment-15796740 ] Joseph K. Bradley commented on SPARK-19039: --- Whoops, thanks! Posted stack trac

[jira] [Updated] (SPARK-19039) UDF ClosureCleaner bug when UDF, col applied in paste mode in REPL

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19039: -- Description: When I try this: * Define UDF * Apply UDF to get Column * Use Column in a

[jira] [Commented] (SPARK-15009) PySpark CountVectorizerModel should be able to construct from vocabulary list

2017-01-03 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796737#comment-15796737 ] Bryan Cutler commented on SPARK-15009: -- Hi [~sueann], I pretty much have this done b

[jira] [Commented] (SPARK-18948) Add Mean Percentile Rank metric for ranking algorithms

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796730#comment-15796730 ] Joseph K. Bradley commented on SPARK-18948: --- OK, but please say if you'd like t

[jira] [Commented] (SPARK-19071) Optimizations for ML Pipeline Tuning

2017-01-03 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796709#comment-15796709 ] Bryan Cutler commented on SPARK-19071: -- I have a working version of parallel model e

[jira] [Commented] (SPARK-19071) Optimizations for ML Pipeline Tuning

2017-01-03 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796712#comment-15796712 ] Bryan Cutler commented on SPARK-19071: -- It would be great to get at least step 1 in

[jira] [Created] (SPARK-19071) Optimizations for ML Pipeline Tuning

2017-01-03 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-19071: Summary: Optimizations for ML Pipeline Tuning Key: SPARK-19071 URL: https://issues.apache.org/jira/browse/SPARK-19071 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-19069) Expose task 'status' and 'duration' in spark history server REST API.

2017-01-03 Thread Parag Chaudhari (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parag Chaudhari updated SPARK-19069: Description: Although Spark history server UI shows task ‘status’ and ‘duration’ fields, it

[jira] [Updated] (SPARK-19069) Expose task 'status' and 'duration' in spark history server REST API.

2017-01-03 Thread Parag Chaudhari (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parag Chaudhari updated SPARK-19069: Attachment: screenshot-1.png > Expose task 'status' and 'duration' in spark history server

[jira] [Assigned] (SPARK-19070) Clean-up dataset actions

2017-01-03 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell reassigned SPARK-19070: - Assignee: Herman van Hovell > Clean-up dataset actions > ---

[jira] [Updated] (SPARK-19069) Expose task 'status' and 'duration' in spark history server REST API.

2017-01-03 Thread Parag Chaudhari (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parag Chaudhari updated SPARK-19069: Attachment: (was: UI-showing-status-and-duration.png) > Expose task 'status' and 'durat

[jira] [Updated] (SPARK-19069) Expose task 'status' and 'duration' in spark history server REST API.

2017-01-03 Thread Parag Chaudhari (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parag Chaudhari updated SPARK-19069: Attachment: UI-showing-status-and-duration.png > Expose task 'status' and 'duration' in spa

[jira] [Created] (SPARK-19070) Clean-up dataset actions

2017-01-03 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-19070: - Summary: Clean-up dataset actions Key: SPARK-19070 URL: https://issues.apache.org/jira/browse/SPARK-19070 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-19069) Expose task 'status' and 'duration' in spark history server REST API.

2017-01-03 Thread Parag Chaudhari (JIRA)
Parag Chaudhari created SPARK-19069: --- Summary: Expose task 'status' and 'duration' in spark history server REST API. Key: SPARK-19069 URL: https://issues.apache.org/jira/browse/SPARK-19069 Project:

[jira] [Commented] (SPARK-19032) Non-deterministic results using aggregation first across multiple workers

2017-01-03 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796650#comment-15796650 ] Herman van Hovell commented on SPARK-19032: --- [~harryw] First/Last only produce

[jira] [Closed] (SPARK-16786) LDA topic distributions for new documents in PySpark

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-16786. - Resolution: Won't Fix > LDA topic distributions for new documents in PySpark > --

[jira] [Commented] (SPARK-16786) LDA topic distributions for new documents in PySpark

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796648#comment-15796648 ] Joseph K. Bradley commented on SPARK-16786: --- [~supremekai] Thanks for the PR.

[jira] [Resolved] (SPARK-15163) Mark experimental algorithms experimental in PySpark

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-15163. --- Resolution: Fixed Assignee: holdenk Fix Version/s: 2.0.0 I'm resolvin

[jira] [Commented] (SPARK-15009) PySpark CountVectorizerModel should be able to construct from vocabulary list

2017-01-03 Thread Sue Ann Hong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796611#comment-15796611 ] Sue Ann Hong commented on SPARK-15009: -- also [~lins05] > PySpark CountVectorizerMod

[jira] [Commented] (SPARK-19032) Non-deterministic results using aggregation first across multiple workers

2017-01-03 Thread Harry Weppner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796603#comment-15796603 ] Harry Weppner commented on SPARK-19032: --- [~srowen] thanks for clarifying the intend

[jira] [Updated] (SPARK-19068) Large number of executors causing a ton of ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorMetricsUpdate(41,Wr

2017-01-03 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JESSE CHEN updated SPARK-19068: --- Attachment: sparklog.tar.gz This is the Spark console output in which you can find settings and seque

[jira] [Commented] (SPARK-15009) PySpark CountVectorizerModel should be able to construct from vocabulary list

2017-01-03 Thread Sue Ann Hong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796575#comment-15796575 ] Sue Ann Hong commented on SPARK-15009: -- I'd like to work on this -- [~bryanc] is tha

[jira] [Updated] (SPARK-19068) Large number of executors causing a ton of ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorMetricsUpdate(41,Wr

2017-01-03 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JESSE CHEN updated SPARK-19068: --- Description: On a large cluster with 45TB RAM and 1,000 cores, we used 1008 executors in order to us

[jira] [Created] (SPARK-19068) Large number of executors causing a ton of ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorMetricsUpdate(41,Wr

2017-01-03 Thread JESSE CHEN (JIRA)
JESSE CHEN created SPARK-19068: -- Summary: Large number of executors causing a ton of ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorMetricsUpdate(41,WrappedArray()) Key: SP

[jira] [Created] (SPARK-19067) mapWithState Style API

2017-01-03 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19067: Summary: mapWithState Style API Key: SPARK-19067 URL: https://issues.apache.org/jira/browse/SPARK-19067 Project: Spark Issue Type: New Feature

[jira] [Assigned] (SPARK-19066) SparkR LDA doesn't set optimizer correctly

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19066: Assignee: Apache Spark > SparkR LDA doesn't set optimizer correctly >

[jira] [Commented] (SPARK-19066) SparkR LDA doesn't set optimizer correctly

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796545#comment-15796545 ] Apache Spark commented on SPARK-19066: -- User 'wangmiao1981' has created a pull reque

[jira] [Assigned] (SPARK-19066) SparkR LDA doesn't set optimizer correctly

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19066: Assignee: (was: Apache Spark) > SparkR LDA doesn't set optimizer correctly > -

[jira] [Updated] (SPARK-19066) SparkR LDA doesn't set optimizer correctly

2017-01-03 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miao Wang updated SPARK-19066: -- Description: spark.lda pass the optimizer "em" or "online" to the backend. However, LDAWrapper doesn't

[jira] [Created] (SPARK-19066) SparkR LDA doesn't set optimizer correctly

2017-01-03 Thread Miao Wang (JIRA)
Miao Wang created SPARK-19066: - Summary: SparkR LDA doesn't set optimizer correctly Key: SPARK-19066 URL: https://issues.apache.org/jira/browse/SPARK-19066 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-19065) Bad error when using dropDuplicates in Streaming

2017-01-03 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-19065: Summary: Bad error when using dropDuplicates in Streaming Key: SPARK-19065 URL: https://issues.apache.org/jira/browse/SPARK-19065 Project: Spark Issu

[jira] [Created] (SPARK-19064) Fix pip install issue with ml sub components

2017-01-03 Thread holdenk (JIRA)
holdenk created SPARK-19064: --- Summary: Fix pip install issue with ml sub components Key: SPARK-19064 URL: https://issues.apache.org/jira/browse/SPARK-19064 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-19059) Unable to retrieve data from a parquet table whose name starts with underscore

2017-01-03 Thread Remon K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796394#comment-15796394 ] Remon K commented on SPARK-19059: - My apologies, yes, I had my SPARK_HOME set to a differ

[jira] [Updated] (SPARK-19057) Instance weights must be non-negative

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19057: -- Summary: Instance weights must be non-negative (was: Instances' weight must be non-neg

[jira] [Comment Edited] (SPARK-5535) Add parameter for storage levels

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796294#comment-15796294 ] Joseph K. Bradley edited comment on SPARK-5535 at 1/3/17 10:02 PM: -

[jira] [Updated] (SPARK-5535) Add parameter for storage levels

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5535: - Description: Add a special parameter type for storage levels that takes the string repres

[jira] [Commented] (SPARK-19007) Speedup and optimize the GradientBoostedTrees in the "data>memory" scene

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796313#comment-15796313 ] Joseph K. Bradley commented on SPARK-19007: --- From discussion on the linked PR:

[jira] [Created] (SPARK-19063) Add parameter for storage levels to LDA

2017-01-03 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-19063: - Summary: Add parameter for storage levels to LDA Key: SPARK-19063 URL: https://issues.apache.org/jira/browse/SPARK-19063 Project: Spark Issue Type:

[jira] [Updated] (SPARK-5535) Add parameter for storage levels

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5535: - Summary: Add parameter for storage levels (was: Add parameter for storage levels.) > Add

[jira] [Commented] (SPARK-5535) Add parameter for storage levels.

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796294#comment-15796294 ] Joseph K. Bradley commented on SPARK-5535: -- This issue came up in [SPARK-19007],

[jira] [Updated] (SPARK-5535) Add parameter for storage levels

2017-01-03 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5535: - Description: Add a special parameter type for storage levels that takes both StorageLevels

[jira] [Updated] (SPARK-19059) Unable to retrieve data from a parquet table whose name starts with underscore

2017-01-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-19059: -- Target Version/s: (was: 2.1.0) Fix Version/s: (was: 2.1.1) > Unable to retrieve data from

[jira] [Commented] (SPARK-19059) Unable to retrieve data from a parquet table whose name starts with underscore

2017-01-03 Thread Giambattista (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796253#comment-15796253 ] Giambattista commented on SPARK-19059: -- Please note your environment is printing ver

[jira] [Commented] (SPARK-19059) Unable to retrieve data from a parquet table whose name starts with underscore

2017-01-03 Thread Giambattista (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796248#comment-15796248 ] Giambattista commented on SPARK-19059: -- I'm using version 2.1.0 and I suspect it is

[jira] [Commented] (SPARK-18866) Codegen fails with cryptic error if regexp_replace() output column is not aliased

2017-01-03 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796227#comment-15796227 ] Nicholas Chammas commented on SPARK-18866: -- Could be. I guess the issue of alias

[jira] [Commented] (SPARK-19059) Unable to retrieve data from a parquet table whose name starts with underscore

2017-01-03 Thread Remon K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796213#comment-15796213 ] Remon K commented on SPARK-19059: - I was not able to reproduce it {code} spark-2.1.0-bin-

[jira] [Commented] (SPARK-18877) Unable to read given csv data. Excepion: java.lang.IllegalArgumentException: requirement failed: Decimal precision 28 exceeds max precision 20

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796183#comment-15796183 ] Apache Spark commented on SPARK-18877: -- User 'dongjoon-hyun' has created a pull requ

[jira] [Commented] (SPARK-19035) rand() function in case when cause failed

2017-01-03 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796082#comment-15796082 ] Xiao Li commented on SPARK-19035: - Yes. They should not be treated as the same. > rand(

[jira] [Assigned] (SPARK-19062) Utils.writeByteBuffer should not modify buffer position

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19062: Assignee: Apache Spark (was: Kay Ousterhout) > Utils.writeByteBuffer should not modify bu

[jira] [Assigned] (SPARK-19062) Utils.writeByteBuffer should not modify buffer position

2017-01-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19062: Assignee: Kay Ousterhout (was: Apache Spark) > Utils.writeByteBuffer should not modify bu
