[jira] [Commented] (SPARK-17868) Do not use bitmasks during parsing and analysis of CUBE/ROLLUP/GROUPING SETS

2016-10-10 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564586#comment-15564586 ] Herman van Hovell commented on SPARK-17868: --- [~jiangxb] can you work on this one? > Do not use

[jira] [Updated] (SPARK-17868) Do not use bitmasks during parsing and analysis of CUBE/ROLLUP/GROUPING SETS

2016-10-10 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-17868: -- Assignee: (was: Herman van Hovell) > Do not use bitmasks during parsing and

[jira] [Created] (SPARK-17868) Do not use bitmasks during parsing and analysis of CUBE/ROLLUP/GROUPING SETS

2016-10-10 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-17868: - Summary: Do not use bitmasks during parsing and analysis of CUBE/ROLLUP/GROUPING SETS Key: SPARK-17868 URL: https://issues.apache.org/jira/browse/SPARK-17868

[jira] [Resolved] (SPARK-9560) Add LDA data generator

2016-10-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-9560. -- Resolution: Won't Fix > Add LDA data generator > -- > > Key:

[jira] [Resolved] (SPARK-15957) RFormula supports forcing to index label

2016-10-10 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-15957. - Resolution: Fixed Fix Version/s: 2.1.0 > RFormula supports forcing to index label >

[jira] [Commented] (SPARK-15343) NoClassDefFoundError when initializing Spark with YARN

2016-10-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564578#comment-15564578 ] Sean Owen commented on SPARK-15343: --- [~jdesmet] I'm not clear what you're advocating _in Spark_. See

[jira] [Commented] (SPARK-17865) R API for global temp view

2016-10-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564577#comment-15564577 ] Wenchen Fan commented on SPARK-17865: - All global temp views should be gone when the SparkContext is

[jira] [Commented] (SPARK-17865) R API for global temp view

2016-10-10 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564558#comment-15564558 ] Felix Cheung commented on SPARK-17865: -- I haven't kept up on SharedState before this but it looks to

[jira] [Created] (SPARK-17867) Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name

2016-10-10 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-17867: --- Summary: Dataset.dropDuplicates (i.e. distinct) should consider the columns with same column name Key: SPARK-17867 URL: https://issues.apache.org/jira/browse/SPARK-17867

[jira] [Resolved] (SPARK-17844) DataFrame API should simplify defining frame boundaries without partitioning/ordering

2016-10-10 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-17844. --- Resolution: Fixed Fix Version/s: 2.1.0 Resolved per Reynold's PR. >

[jira] [Created] (SPARK-17866) Dataset.dropDuplicates (i.e., distinct) should not change the output of child plan

2016-10-10 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-17866: --- Summary: Dataset.dropDuplicates (i.e., distinct) should not change the output of child plan Key: SPARK-17866 URL: https://issues.apache.org/jira/browse/SPARK-17866

[jira] [Commented] (SPARK-17865) R API for global temp view

2016-10-10 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564534#comment-15564534 ] Felix Cheung commented on SPARK-17865: -- I can take this. > R API for global temp view >

[jira] [Updated] (SPARK-17719) Unify and tie up options in a single place in JDBC datasource API

2016-10-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-17719: Assignee: Hyukjin Kwon > Unify and tie up options in a single place in JDBC datasource API >

[jira] [Updated] (SPARK-17776) Potentially duplicated names which might have conflicts between JDBC options and properties instance

2016-10-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-17776: Assignee: Hyukjin Kwon > Potentially duplicated names which might have conflicts between JDBC options >

[jira] [Resolved] (SPARK-17776) Potentially duplicated names which might have conflicts between JDBC options and properties instance

2016-10-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-17776. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15292

[jira] [Resolved] (SPARK-17719) Unify and tie up options in a single place in JDBC datasource API

2016-10-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-17719. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15292

[jira] [Commented] (SPARK-17865) R API for global temp view

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564495#comment-15564495 ] Reynold Xin commented on SPARK-17865: - [~jiangxb1987] never mind -- there is already the Python API.

[jira] [Commented] (SPARK-17865) Python API for global temp view

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564492#comment-15564492 ] Reynold Xin commented on SPARK-17865: - Ah OK - can you then make sure the Python API is updated

[jira] [Commented] (SPARK-17865) R API for global temp view

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564494#comment-15564494 ] Reynold Xin commented on SPARK-17865: - cc [~felixcheung] [~yanboliang] know anybody to take on this

[jira] [Updated] (SPARK-17865) R API for global temp view

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17865: Description: We need to add the R API for managing global temp views, mirroring the changes in

[jira] [Updated] (SPARK-17865) R API for global temp view

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17865: Summary: R API for global temp view (was: Python API for global temp view) > R API for global

[jira] [Commented] (SPARK-17865) Python API for global temp view

2016-10-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564489#comment-15564489 ] Wenchen Fan commented on SPARK-17865: - The Python API is already added, but the R API hasn't been. Should we

[jira] [Created] (SPARK-17865) Python API for global temp view

2016-10-10 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-17865: --- Summary: Python API for global temp view Key: SPARK-17865 URL: https://issues.apache.org/jira/browse/SPARK-17865 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-17865) Python API for global temp view

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564484#comment-15564484 ] Reynold Xin commented on SPARK-17865: - cc [~cloud_fan] [~jiangxb1987] would you be interested in

[jira] [Updated] (SPARK-17338) Add global temp view support

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17338: Summary: Add global temp view support (was: add global temp view support) > Add global temp view

[jira] [Updated] (SPARK-17338) add global temp view support

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17338: Summary: add global temp view support (was: add global temp view) > add global temp view support

[jira] [Updated] (SPARK-17338) Add global temp view support

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17338: Description: Global temporary view is a cross-session temporary view, which means it's shared

[jira] [Commented] (SPARK-17858) Provide option for Spark SQL to skip corrupt files

2016-10-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564479#comment-15564479 ] Sean Owen commented on SPARK-17858: --- Yeah, the related JIRA gives an argument that we shouldn't do

[jira] [Commented] (SPARK-3577) Add task metric to report spill time

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564472#comment-15564472 ] Reynold Xin commented on SPARK-3577: [~dreamworks007] can you take a look at the problem here?

[jira] [Commented] (SPARK-17864) Mark data type APIs as stable, rather than DeveloperApi

2016-10-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564464#comment-15564464 ] Apache Spark commented on SPARK-17864: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17864) Mark data type APIs as stable, rather than DeveloperApi

2016-10-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17864: Assignee: Apache Spark (was: Reynold Xin) > Mark data type APIs as stable, rather than

[jira] [Assigned] (SPARK-17864) Mark data type APIs as stable, rather than DeveloperApi

2016-10-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17864: Assignee: Reynold Xin (was: Apache Spark) > Mark data type APIs as stable, rather than

[jira] [Updated] (SPARK-17799) InterfaceStability annotation

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17799: Labels: releasenotes (was: ) > InterfaceStability annotation > - > >

[jira] [Updated] (SPARK-17864) Mark data type APIs as stable, rather than DeveloperApi

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17864: Labels: releasenotes (was: ) > Mark data type APIs as stable, rather than DeveloperApi >

[jira] [Created] (SPARK-17864) Mark data type APIs as stable, rather than DeveloperApi

2016-10-10 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-17864: --- Summary: Mark data type APIs as stable, rather than DeveloperApi Key: SPARK-17864 URL: https://issues.apache.org/jira/browse/SPARK-17864 Project: Spark Issue

[jira] [Comment Edited] (SPARK-3577) Add task metric to report spill time

2016-10-10 Thread Gaoxiang Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563038#comment-15563038 ] Gaoxiang Liu edited comment on SPARK-3577 at 10/11/16 4:49 AM: --- [~rxin] I

[jira] [Commented] (SPARK-17816) Json serialzation of accumulators are failing with ConcurrentModificationException

2016-10-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564410#comment-15564410 ] Apache Spark commented on SPARK-17816: -- User 'seyfe' has created a pull request for this issue:

[jira] [Closed] (SPARK-4630) Dynamically determine optimal number of partitions

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-4630. -- Resolution: Duplicate Assignee: (was: Kostas Sakellis) > Dynamically determine optimal number

[jira] [Commented] (SPARK-14393) monotonicallyIncreasingId not monotonically increasing with downstream coalesce

2016-10-10 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564350#comment-15564350 ] Takeshi Yamamuro commented on SPARK-14393: -- Since coalesce() just after

[jira] [Resolved] (SPARK-17816) Json serialzation of accumulators are failing with ConcurrentModificationException

2016-10-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-17816. -- Resolution: Fixed Assignee: Ergin Seyfe Fix Version/s: 2.1.0 > Json

[jira] [Updated] (SPARK-15814) Aggregator can return null result

2016-10-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-15814: - Fix Version/s: 2.0.1 2.1.0 > Aggregator can return null result >

[jira] [Resolved] (SPARK-15577) Java can't import DataFrame type alias

2016-10-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-15577. -- Resolution: Not A Problem > Java can't import DataFrame type alias >

[jira] [Commented] (SPARK-15577) Java can't import DataFrame type alias

2016-10-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564316#comment-15564316 ] Hyukjin Kwon commented on SPARK-15577: -- Let me please close this as a not-a-problem. Please revoke

[jira] [Commented] (SPARK-15577) Java can't import DataFrame type alias

2016-10-10 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564302#comment-15564302 ] Jakob Odersky commented on SPARK-15577: --- this cleaning of jiras is really good to see :)

[jira] [Comment Edited] (SPARK-9265) Dataframe.limit joined with another dataframe can be non-deterministic

2016-10-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15558875#comment-15558875 ] Xiao Li edited comment on SPARK-9265 at 10/11/16 3:14 AM: -- This has been resolved

[jira] [Resolved] (SPARK-16896) Loading csv with duplicate column names

2016-10-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-16896. - Resolution: Fixed Fix Version/s: 2.1.0 > Loading csv with duplicate column names >

[jira] [Updated] (SPARK-16896) Loading csv with duplicate column names

2016-10-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-16896: Assignee: Hyukjin Kwon > Loading csv with duplicate column names >

[jira] [Updated] (SPARK-17738) Flaky test: org.apache.spark.sql.execution.columnar.ColumnTypeSuite MAP append/extract

2016-10-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17738: - Component/s: Tests > Flaky test: org.apache.spark.sql.execution.columnar.ColumnTypeSuite MAP >

[jira] [Updated] (SPARK-17738) Flaky test: org.apache.spark.sql.execution.columnar.ColumnTypeSuite MAP append/extract

2016-10-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17738: - Labels: flaky-test (was: ) > Flaky test:

[jira] [Updated] (SPARK-17738) Flaky test: org.apache.spark.sql.execution.columnar.ColumnTypeSuite MAP append/extract

2016-10-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17738: - Issue Type: Test (was: Bug) > Flaky test:

[jira] [Resolved] (SPARK-17738) Flaky test: org.apache.spark.sql.execution.columnar.ColumnTypeSuite MAP append/extract

2016-10-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-17738. -- Resolution: Fixed Fix Version/s: 2.0.2 > Flaky test:

[jira] [Commented] (SPARK-17772) Add helper testing methods for instance weighting

2016-10-10 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564174#comment-15564174 ] Seth Hendrickson commented on SPARK-17772: -- I'm working on this. > Add helper testing methods

[jira] [Commented] (SPARK-17626) TPC-DS performance improvements using star-schema heuristics

2016-10-10 Thread Ioana Delaney (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564169#comment-15564169 ] Ioana Delaney commented on SPARK-17626: --- [~ron8hu] Thank you for your comments. Our current star

[jira] [Created] (SPARK-17863) SELECT distinct does not work if there is a order by clause

2016-10-10 Thread Yin Huai (JIRA)
Yin Huai created SPARK-17863: Summary: SELECT distinct does not work if there is a order by clause Key: SPARK-17863 URL: https://issues.apache.org/jira/browse/SPARK-17863 Project: Spark Issue

[jira] [Updated] (SPARK-17863) SELECT distinct does not work if there is a order by clause

2016-10-10 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-17863: - Labels: correctness (was: ) > SELECT distinct does not work if there is a order by clause >

[jira] [Commented] (SPARK-17338) add global temp view

2016-10-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564138#comment-15564138 ] Apache Spark commented on SPARK-17338: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Commented] (SPARK-15577) Java can't import DataFrame type alias

2016-10-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564115#comment-15564115 ] Hyukjin Kwon commented on SPARK-15577: -- WDYT - [~holdenk] > Java can't import DataFrame type alias

[jira] [Comment Edited] (SPARK-15577) Java can't import DataFrame type alias

2016-10-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564115#comment-15564115 ] Hyukjin Kwon edited comment on SPARK-15577 at 10/11/16 1:32 AM: WDYT? -

[jira] [Comment Edited] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2016-10-10 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564097#comment-15564097 ] Weichen Xu edited comment on SPARK-17139 at 10/11/16 1:25 AM: -- I'm working

[jira] [Commented] (SPARK-15577) Java can't import DataFrame type alias

2016-10-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564103#comment-15564103 ] Hyukjin Kwon commented on SPARK-15577: -- Cool, then I guess we might be able to take an action on the

[jira] [Commented] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2016-10-10 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564097#comment-15564097 ] Weichen Xu commented on SPARK-17139: I'm working hard on it and will create a PR this week, thanks!

[jira] [Commented] (SPARK-4630) Dynamically determine optimal number of partitions

2016-10-10 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564071#comment-15564071 ] holdenk commented on SPARK-4630: I also agree this would be really good to revisit, from talking with

[jira] [Updated] (SPARK-16980) Load only catalog table partition metadata required to answer a query

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16980: Issue Type: Sub-task (was: Improvement) Parent: SPARK-17861 > Load only catalog table

[jira] [Created] (SPARK-17862) Feature flag SPARK-16980

2016-10-10 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-17862: --- Summary: Feature flag SPARK-16980 Key: SPARK-17862 URL: https://issues.apache.org/jira/browse/SPARK-17862 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-17861) Push data source partitions into metastore for catalog tables

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17861: Description: Initially, Spark SQL does not store any partition information in the catalog for

[jira] [Updated] (SPARK-17861) Push data source partitions into metastore for catalog tables and support partition pruning

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17861: Summary: Push data source partitions into metastore for catalog tables and support partition

[jira] [Updated] (SPARK-17861) Store data source partitions in metastore and push partition pruning into metastore

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17861: Summary: Store data source partitions in metastore and push partition pruning into metastore

[jira] [Updated] (SPARK-17861) Store data source partitions in metastore and push partition pruning into the metastore

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17861: Summary: Store data source partitions in metastore and push partition pruning into the metastore

[jira] [Commented] (SPARK-17861) Push data source partitions into metastore for catalog tables

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564058#comment-15564058 ] Reynold Xin commented on SPARK-17861: - cc [~michael] this is the main work I want to get in for 2.1.

[jira] [Created] (SPARK-17861) Push data source partitions into metastore for catalog tables

2016-10-10 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-17861: --- Summary: Push data source partitions into metastore for catalog tables Key: SPARK-17861 URL: https://issues.apache.org/jira/browse/SPARK-17861 Project: Spark

[jira] [Updated] (SPARK-17861) Push data source partitions into metastore for catalog tables

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17861: Priority: Critical (was: Major) > Push data source partitions into metastore for catalog tables >

[jira] [Updated] (SPARK-16980) Load only catalog table partition metadata required to answer a query

2016-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16980: Assignee: Michael Allman > Load only catalog table partition metadata required to answer a query >

[jira] [Commented] (SPARK-17801) [ML]Random Forest Regression fails for large input

2016-10-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563974#comment-15563974 ] Joseph K. Bradley commented on SPARK-17801: --- Btw, that maxBins setting is way too high. It

[jira] [Comment Edited] (SPARK-17801) [ML]Random Forest Regression fails for large input

2016-10-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563966#comment-15563966 ] Joseph K. Bradley edited comment on SPARK-17801 at 10/11/16 12:10 AM:

[jira] [Commented] (SPARK-17801) [ML]Random Forest Regression fails for large input

2016-10-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563966#comment-15563966 ] Joseph K. Bradley commented on SPARK-17801: --- Have you tried this with Spark 2.0? > [ML]Random

[jira] [Resolved] (SPARK-14610) Remove superfluous split from random forest findSplitsForContinousFeature

2016-10-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-14610. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 12374

[jira] [Updated] (SPARK-14610) Remove superfluous split from random forest findSplitsForContinousFeature

2016-10-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-14610: -- Assignee: Seth Hendrickson > Remove superfluous split from random forest

[jira] [Commented] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-10 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563944#comment-15563944 ] Hossein Falaki commented on SPARK-17781: Yes, but somehow inside {{worker.R}} Date fields in the

[jira] [Commented] (SPARK-9478) Add class weights to Random Forest

2016-10-10 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563919#comment-15563919 ] Seth Hendrickson commented on SPARK-9478: - I'm going to revive this, and hopefully submit a PR

[jira] [Assigned] (SPARK-17860) SHOW COLUMN's database conflict check should respect case sensitivity setting

2016-10-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17860: Assignee: Apache Spark > SHOW COLUMN's database conflict check should respect case

[jira] [Commented] (SPARK-17860) SHOW COLUMN's database conflict check should respect case sensitivity setting

2016-10-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563915#comment-15563915 ] Apache Spark commented on SPARK-17860: -- User 'dilipbiswal' has created a pull request for this

[jira] [Assigned] (SPARK-17860) SHOW COLUMN's database conflict check should respect case sensitivity setting

2016-10-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17860: Assignee: (was: Apache Spark) > SHOW COLUMN's database conflict check should respect

[jira] [Commented] (SPARK-15577) Java can't import DataFrame type alias

2016-10-10 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563910#comment-15563910 ] Jakob Odersky commented on SPARK-15577: --- This was considered and trade-offs were actively

[jira] [Comment Edited] (SPARK-15577) Java can't import DataFrame type alias

2016-10-10 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563910#comment-15563910 ] Jakob Odersky edited comment on SPARK-15577 at 10/10/16 11:41 PM: -- This

[jira] [Commented] (SPARK-15343) NoClassDefFoundError when initializing Spark with YARN

2016-10-10 Thread Jo Desmet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563894#comment-15563894 ] Jo Desmet commented on SPARK-15343: --- Still not acceptable, I mean how can we. This tool has been

[jira] [Created] (SPARK-17860) SHOW COLUMN's database conflict check should use case sensitive compare.

2016-10-10 Thread Dilip Biswal (JIRA)
Dilip Biswal created SPARK-17860: Summary: SHOW COLUMN's database conflict check should use case sensitive compare. Key: SPARK-17860 URL: https://issues.apache.org/jira/browse/SPARK-17860 Project:

[jira] [Updated] (SPARK-17860) SHOW COLUMN's database conflict check should respect case sensitivity setting

2016-10-10 Thread Dilip Biswal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dilip Biswal updated SPARK-17860: - Summary: SHOW COLUMN's database conflict check should respect case sensitivity setting (was:

[jira] [Commented] (SPARK-17857) SHOW TABLES IN schema throws exception if schema doesn't exist

2016-10-10 Thread Todd Nemet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563877#comment-15563877 ] Todd Nemet commented on SPARK-17857: I didn't even think to check Hive 1.2, since I figured it would

[jira] [Commented] (SPARK-17781) datetime is serialized as double inside dapply()

2016-10-10 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563862#comment-15563862 ] Shivaram Venkataraman commented on SPARK-17781: --- I'm not sure I follow. The class of the

[jira] [Commented] (SPARK-4630) Dynamically determine optimal number of partitions

2016-10-10 Thread Mike Dusenberry (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563839#comment-15563839 ] Mike Dusenberry commented on SPARK-4630: It would be really nice to revisit this issue, perhaps

[jira] [Comment Edited] (SPARK-4630) Dynamically determine optimal number of partitions

2016-10-10 Thread Mike Dusenberry (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563839#comment-15563839 ] Mike Dusenberry edited comment on SPARK-4630 at 10/10/16 11:10 PM: --- It

[jira] [Created] (SPARK-17859) persist should not impede with spark's ability to perform a broadcast join.

2016-10-10 Thread Franck Tago (JIRA)
Franck Tago created SPARK-17859: --- Summary: persist should not impede with spark's ability to perform a broadcast join. Key: SPARK-17859 URL: https://issues.apache.org/jira/browse/SPARK-17859 Project:

[jira] [Updated] (SPARK-17858) Provide option for Spark SQL to skip corrupt files

2016-10-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17858: - Description: In Spark 2.0, corrupt files will fail a SQL query. However, the user may just want

[jira] [Updated] (SPARK-17858) Provide option for Spark SQL to skip corrupt files

2016-10-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17858: - Description: In Spark 2.0, corrupt files will fail a job. However, the user may just want to

[jira] [Updated] (SPARK-17858) Provide option for Spark SQL to skip corrupt files

2016-10-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17858: - Issue Type: Improvement (was: Bug) > Provide option for Spark SQL to skip corrupt files >

[jira] [Updated] (SPARK-17858) Provide option for Spark SQL to skip corrupt files

2016-10-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-17858: - Description: In Spark 2.0, corrupt files will fail a job. However, the user may not > Provide

[jira] [Created] (SPARK-17858) Provide option for Spark SQL to skip corrupt files

2016-10-10 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-17858: Summary: Provide option for Spark SQL to skip corrupt files Key: SPARK-17858 URL: https://issues.apache.org/jira/browse/SPARK-17858 Project: Spark Issue

[jira] [Assigned] (SPARK-17850) HadoopRDD should not swallow EOFException

2016-10-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17850: Assignee: (was: Apache Spark) > HadoopRDD should not swallow EOFException >

[jira] [Assigned] (SPARK-17850) HadoopRDD should not swallow EOFException

2016-10-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17850: Assignee: Apache Spark > HadoopRDD should not swallow EOFException >

[jira] [Commented] (SPARK-17850) HadoopRDD should not swallow EOFException

2016-10-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563774#comment-15563774 ] Apache Spark commented on SPARK-17850: -- User 'zsxwing' has created a pull request for this issue:
