[jira] [Commented] (SPARK-16095) Yarn cluster mode should return consistent result for command line and SparkLauncher

2016-06-21 Thread Peng Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343760#comment-15343760 ] Peng Zhang commented on SPARK-16095: I want to make a unit test to explain this issue, but found

[jira] [Commented] (SPARK-16125) YarnClusterSuite test cluster mode incorrectly

2016-06-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343747#comment-15343747 ] Sean Owen commented on SPARK-16125: --- Looks like we might have related issues in UtilsSuite, context.py

[jira] [Commented] (SPARK-16064) Fix the GLM error caused by NA produced by reweight function

2016-06-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343745#comment-15343745 ] Sean Owen commented on SPARK-16064: --- Based on what though? you should compile the information publicly

[jira] [Created] (SPARK-16125) YarnClusterSuite test cluster mode incorrectly

2016-06-21 Thread Peng Zhang (JIRA)
Peng Zhang created SPARK-16125: -- Summary: YarnClusterSuite test cluster mode incorrectly Key: SPARK-16125 URL: https://issues.apache.org/jira/browse/SPARK-16125 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-16024) add tests for table creation with column comment

2016-06-21 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-16024: Description: (was: CREATE TABLE src(a INT COMMENT 'bla') USING parquet. When we describe

[jira] [Updated] (SPARK-16024) add tests for table creation with column comment

2016-06-21 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-16024: Description: should test both hive serde tables and datasource tables > add tests for table

[jira] [Updated] (SPARK-16104) Do not create CSV writer object for every flush when writing

2016-06-21 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-16104: --- Assignee: Hyukjin Kwon > Do not create CSV writer object for every flush when writing >

[jira] [Resolved] (SPARK-16104) Do not create CSV writer object for every flush when writing

2016-06-21 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16104. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 13809

[jira] [Commented] (SPARK-12173) Consider supporting DataSet API in SparkR

2016-06-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343699#comment-15343699 ] Reynold Xin commented on SPARK-12173: - I don't think you are looking for the dataset API. You are

[jira] [Updated] (SPARK-16041) Disallow Duplicate Columns in `partitionBy`, `bucketBy` and `sortBy`

2016-06-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-16041: Description: Duplicate columns are not allowed in `partitionBy`, `bucketBy`, `sortBy` in DataFrameWriter.

[jira] [Updated] (SPARK-16041) Disallow Duplicate Columns in `partitionBy`, `bucketBy` and `sortBy`

2016-06-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-16041: Summary: Disallow Duplicate Columns in `partitionBy`, `bucketBy` and `sortBy` (was: Disallow Duplicate

[jira] [Updated] (SPARK-16041) Disallow Duplicate Columns in `partitionBy`, `blockBy` and `sortBy`

2016-06-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-16041: Summary: Disallow Duplicate Columns in `partitionBy`, `blockBy` and `sortBy` (was: Disallow Duplicate

[jira] [Commented] (SPARK-16100) Aggregator fails with Tungsten error when complex types are used for results and partial sum

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343656#comment-15343656 ] Apache Spark commented on SPARK-16100: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343243#comment-15343243 ] Ryan Blue commented on SPARK-16032: --- [~cloud_fan], while I think by-name insertion is important in the

[jira] [Commented] (SPARK-16000) Make model loading backward compatible with saved models using old vector columns

2016-06-21 Thread Gayathri Murali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343235#comment-15343235 ] Gayathri Murali commented on SPARK-16000: - [~yuhaoyan] I can help with this. > Make model

[jira] [Commented] (SPARK-16121) ListingFileCatalog does not list in parallel anymore

2016-06-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343232#comment-15343232 ] Xiao Li commented on SPARK-16121: - I also saw this, but I thought this is by design. : ) >

[jira] [Commented] (SPARK-12172) Consider removing SparkR internal RDD APIs

2016-06-21 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343215#comment-15343215 ] Sun Rui commented on SPARK-12172: - currently spark.lapply() internally depends on RDD, we have to change

[jira] [Commented] (SPARK-12173) Consider supporting DataSet API in SparkR

2016-06-21 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343212#comment-15343212 ] Sun Rui commented on SPARK-12173: - [~rxin] yes R don't need compile time type safety, but map/reduce

[jira] [Commented] (SPARK-16124) Throws exception when executing query on `build/sbt hive/console`

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343176#comment-15343176 ] Apache Spark commented on SPARK-16124: -- User 'tilumi' has created a pull request for this issue:

[jira] [Closed] (SPARK-15326) Doing multiple unions on a Dataframe will result in a very inefficient query plan

2016-06-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-15326. - Resolution: Not A Problem > Doing multiple unions on a Dataframe will result in a very

[jira] [Assigned] (SPARK-16124) Throws exception when executing query on `build/sbt hive/console`

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16124: Assignee: Apache Spark > Throws exception when executing query on `build/sbt

[jira] [Assigned] (SPARK-16124) Throws exception when executing query on `build/sbt hive/console`

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16124: Assignee: (was: Apache Spark) > Throws exception when executing query on `build/sbt

[jira] [Commented] (SPARK-15326) Doing multiple unions on a Dataframe will result in a very inefficient query plan

2016-06-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343175#comment-15343175 ] Herman van Hovell commented on SPARK-15326: --- So we have code that will flatten nested Unions.

[jira] [Created] (SPARK-16124) Throws exception when executing query on `build/sbt hive/console`

2016-06-21 Thread MIN-FU YANG (JIRA)
MIN-FU YANG created SPARK-16124: --- Summary: Throws exception when executing query on `build/sbt hive/console` Key: SPARK-16124 URL: https://issues.apache.org/jira/browse/SPARK-16124 Project: Spark

[jira] [Comment Edited] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343029#comment-15343029 ] Wenchen Fan edited comment on SPARK-16032 at 6/22/16 1:15 AM: -- I think it

[jira] [Assigned] (SPARK-16123) Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16123: Assignee: (was: Apache Spark) > Avoid NegativeArraySizeException while reserving

[jira] [Assigned] (SPARK-16123) Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16123: Assignee: Apache Spark > Avoid NegativeArraySizeException while reserving additional

[jira] [Commented] (SPARK-16123) Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343139#comment-15343139 ] Apache Spark commented on SPARK-16123: -- User 'sameeragarwal' has created a pull request for this

[jira] [Updated] (SPARK-16123) Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader

2016-06-21 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal updated SPARK-16123: --- Description: Both off-heap and on-heap variants of ColumnVector.reserve() can unfortunately

[jira] [Updated] (SPARK-16102) Use Record API from Univocity rather than current data cast API.

2016-06-21 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16102: - Affects Version/s: 2.0.0 > Use Record API from Univocity rather than current data cast API. >

[jira] [Created] (SPARK-16123) Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader

2016-06-21 Thread Sameer Agarwal (JIRA)
Sameer Agarwal created SPARK-16123: -- Summary: Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader Key: SPARK-16123 URL:

[jira] [Commented] (SPARK-14172) Hive table partition predicate not passed down correctly

2016-06-21 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343121#comment-15343121 ] MIN-FU YANG commented on SPARK-14172: - I cannot reproduce it on 1.6.1 either. Could you give more

[jira] [Comment Edited] (SPARK-14172) Hive table partition predicate not passed down correctly

2016-06-21 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343121#comment-15343121 ] MIN-FU YANG edited comment on SPARK-14172 at 6/22/16 12:48 AM: --- I cannot

[jira] [Assigned] (SPARK-16119) Support "DROP TABLE ... PURGE" if Hive client supports it

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16119: Assignee: (was: Apache Spark) > Support "DROP TABLE ... PURGE" if Hive client

[jira] [Commented] (SPARK-16119) Support "DROP TABLE ... PURGE" if Hive client supports it

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343106#comment-15343106 ] Apache Spark commented on SPARK-16119: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16119) Support "DROP TABLE ... PURGE" if Hive client supports it

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16119: Assignee: Apache Spark > Support "DROP TABLE ... PURGE" if Hive client supports it >

[jira] [Assigned] (SPARK-16121) ListingFileCatalog does not list in parallel anymore

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16121: Assignee: Apache Spark > ListingFileCatalog does not list in parallel anymore >

[jira] [Assigned] (SPARK-16121) ListingFileCatalog does not list in parallel anymore

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16121: Assignee: (was: Apache Spark) > ListingFileCatalog does not list in parallel anymore

[jira] [Commented] (SPARK-16121) ListingFileCatalog does not list in parallel anymore

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343069#comment-15343069 ] Apache Spark commented on SPARK-16121: -- User 'yhuai' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16071) Not sufficient array size checks to avoid integer overflows in Tungsten

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16071: Assignee: Apache Spark > Not sufficient array size checks to avoid integer overflows in

[jira] [Assigned] (SPARK-16071) Not sufficient array size checks to avoid integer overflows in Tungsten

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16071: Assignee: (was: Apache Spark) > Not sufficient array size checks to avoid integer

[jira] [Commented] (SPARK-16071) Not sufficient array size checks to avoid integer overflows in Tungsten

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343067#comment-15343067 ] Apache Spark commented on SPARK-16071: -- User 'clockfly' has created a pull request for this issue:

[jira] [Created] (SPARK-16122) Spark History Server REST API missing an environment endpoint per application

2016-06-21 Thread Neelesh Srinivas Salian (JIRA)
Neelesh Srinivas Salian created SPARK-16122: --- Summary: Spark History Server REST API missing an environment endpoint per application Key: SPARK-16122 URL:

[jira] [Commented] (SPARK-15643) ML 2.0 QA: migration guide update

2016-06-21 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343059#comment-15343059 ] Joseph K. Bradley commented on SPARK-15643: --- A few more deprecations to add from current PRs: *

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343036#comment-15343036 ] Wenchen Fan commented on SPARK-16032: - [~rdblue] I think the biggest problem is we don't have by-name

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15343029#comment-15343029 ] Wenchen Fan commented on SPARK-16032: - I think it's nonsense to use `partitionBy` with `insertInto`,

[jira] [Resolved] (SPARK-16117) Hide LibSVMFileFormat in public API docs

2016-06-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-16117. - Resolution: Fixed Fix Version/s: 2.0.0 > Hide LibSVMFileFormat in public API docs >

[jira] [Resolved] (SPARK-16118) getDropLast is missing in OneHotEncoder

2016-06-21 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-16118. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13821

[jira] [Updated] (SPARK-16119) Support "DROP TABLE ... PURGE" if Hive client supports it

2016-06-21 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-16119: --- Description: There's currently code that explicitly disables the "PURGE" flag when dropping

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342963#comment-15342963 ] Ryan Blue commented on SPARK-16032: --- I'm referring to disabling the use of {{partitionBy}} with

[jira] [Commented] (SPARK-16075) Make VectorUDT/MatrixUDT singleton under spark.ml package

2016-06-21 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342930#comment-15342930 ] Miao Wang commented on SPARK-16075: --- I will follow on this one. Thanks! > Make VectorUDT/MatrixUDT

[jira] [Assigned] (SPARK-16106) TaskSchedulerImpl does not correctly handle new executors on existing hosts

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16106: Assignee: Apache Spark > TaskSchedulerImpl does not correctly handle new executors on

[jira] [Assigned] (SPARK-16106) TaskSchedulerImpl does not correctly handle new executors on existing hosts

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16106: Assignee: (was: Apache Spark) > TaskSchedulerImpl does not correctly handle new

[jira] [Commented] (SPARK-16106) TaskSchedulerImpl does not correctly handle new executors on existing hosts

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342926#comment-15342926 ] Apache Spark commented on SPARK-16106: -- User 'squito' has created a pull request for this issue:

[jira] [Commented] (SPARK-14172) Hive table partition predicate not passed down correctly

2016-06-21 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342917#comment-15342917 ] MIN-FU YANG commented on SPARK-14172: - Hi, I cannot reproduce the problem in master branch. Maybe

[jira] [Created] (SPARK-16121) ListingFileCatalog does not list in parallel anymore

2016-06-21 Thread Yin Huai (JIRA)
Yin Huai created SPARK-16121: Summary: ListingFileCatalog does not list in parallel anymore Key: SPARK-16121 URL: https://issues.apache.org/jira/browse/SPARK-16121 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342856#comment-15342856 ] Yin Huai commented on SPARK-16032: -- Regarding {{disabling Hive features}}, can you be more specific?

[jira] [Updated] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-16032: - Priority: Critical (was: Blocker) > Audit semantics of various insertion operations related to

[jira] [Assigned] (SPARK-16120) getCurrentLogFiles method in ReceiverSuite "WAL - generating and cleaning" case uses external variable instead of the passed parameter

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16120: Assignee: (was: Apache Spark) > getCurrentLogFiles method in ReceiverSuite "WAL -

[jira] [Assigned] (SPARK-16120) getCurrentLogFiles method in ReceiverSuite "WAL - generating and cleaning" case uses external variable instead of the passed parameter

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16120: Assignee: Apache Spark > getCurrentLogFiles method in ReceiverSuite "WAL - generating and

[jira] [Commented] (SPARK-16120) getCurrentLogFiles method in ReceiverSuite "WAL - generating and cleaning" case uses external variable instead of the passed parameter

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342852#comment-15342852 ] Apache Spark commented on SPARK-16120: -- User 'ahmed-mahran' has created a pull request for this

[jira] [Commented] (SPARK-15326) Doing multiple unions on a Dataframe will result in a very inefficient query plan

2016-06-21 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342850#comment-15342850 ] MIN-FU YANG commented on SPARK-15326: - I'll look into it. > Doing multiple unions on a Dataframe

[jira] [Created] (SPARK-16120) getCurrentLogFiles method in ReceiverSuite "WAL - generating and cleaning" case uses external variable instead of the passed parameter

2016-06-21 Thread Ahmed Mahran (JIRA)
Ahmed Mahran created SPARK-16120: Summary: getCurrentLogFiles method in ReceiverSuite "WAL - generating and cleaning" case uses external variable instead of the passed parameter Key: SPARK-16120 URL:

[jira] [Assigned] (SPARK-16110) Can't set Python via spark-submit for YARN cluster mode when PYSPARK_PYTHON & PYSPARK_DRIVER_PYTHON are set

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16110: Assignee: (was: Apache Spark) > Can't set Python via spark-submit for YARN cluster

[jira] [Assigned] (SPARK-16110) Can't set Python via spark-submit for YARN cluster mode when PYSPARK_PYTHON & PYSPARK_DRIVER_PYTHON are set

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16110: Assignee: Apache Spark > Can't set Python via spark-submit for YARN cluster mode when

[jira] [Commented] (SPARK-16110) Can't set Python via spark-submit for YARN cluster mode when PYSPARK_PYTHON & PYSPARK_DRIVER_PYTHON are set

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342831#comment-15342831 ] Apache Spark commented on SPARK-16110: -- User 'KevinGrealish' has created a pull request for this

[jira] [Commented] (SPARK-16106) TaskSchedulerImpl does not correctly handle new executors on existing hosts

2016-06-21 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342821#comment-15342821 ] Imran Rashid commented on SPARK-16106: -- cc [~kayousterhout] After taking a closer look at this, I

[jira] [Updated] (SPARK-16106) TaskSchedulerImpl does not correctly handle new executors on existing hosts

2016-06-21 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-16106: - Priority: Trivial (was: Major) > TaskSchedulerImpl does not correctly handle new executors on

[jira] [Updated] (SPARK-15606) Driver hang in o.a.s.DistributedSuite on 2 core machine

2016-06-21 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-15606: - Fix Version/s: 1.6.2 > Driver hang in o.a.s.DistributedSuite on 2 core machine >

[jira] [Updated] (SPARK-15606) Driver hang in o.a.s.DistributedSuite on 2 core machine

2016-06-21 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-15606: - Affects Version/s: 1.6.2 > Driver hang in o.a.s.DistributedSuite on 2 core machine >

[jira] [Created] (SPARK-16119) Support "DROP TABLE ... PURGE" if Hive client supports it

2016-06-21 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-16119: -- Summary: Support "DROP TABLE ... PURGE" if Hive client supports it Key: SPARK-16119 URL: https://issues.apache.org/jira/browse/SPARK-16119 Project: Spark

[jira] [Assigned] (SPARK-16115) Improve output column name for SHOW PARTITIONS command and improve an error message

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16115: Assignee: (was: Apache Spark) > Improve output column name for SHOW PARTITIONS

[jira] [Assigned] (SPARK-16115) Improve output column name for SHOW PARTITIONS command and improve an error message

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16115: Assignee: Apache Spark > Improve output column name for SHOW PARTITIONS command and

[jira] [Commented] (SPARK-16115) Improve output column name for SHOW PARTITIONS command and improve an error message

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342796#comment-15342796 ] Apache Spark commented on SPARK-16115: -- User 'skambha' has created a pull request for this issue:

[jira] [Commented] (SPARK-16118) getDropLast is missing in OneHotEncoder

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342790#comment-15342790 ] Apache Spark commented on SPARK-16118: -- User 'mengxr' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16118) getDropLast is missing in OneHotEncoder

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16118: Assignee: Xiangrui Meng (was: Apache Spark) > getDropLast is missing in OneHotEncoder >

[jira] [Assigned] (SPARK-16118) getDropLast is missing in OneHotEncoder

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16118: Assignee: Apache Spark (was: Xiangrui Meng) > getDropLast is missing in OneHotEncoder >

[jira] [Created] (SPARK-16118) getDropLast is missing in OneHotEncoder

2016-06-21 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-16118: - Summary: getDropLast is missing in OneHotEncoder Key: SPARK-16118 URL: https://issues.apache.org/jira/browse/SPARK-16118 Project: Spark Issue Type: New

[jira] [Commented] (SPARK-16107) Group GLM-related methods in generated doc

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342773#comment-15342773 ] Apache Spark commented on SPARK-16107: -- User 'junyangq' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16107) Group GLM-related methods in generated doc

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16107: Assignee: Junyang Qian (was: Apache Spark) > Group GLM-related methods in generated doc

[jira] [Assigned] (SPARK-16107) Group GLM-related methods in generated doc

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16107: Assignee: Apache Spark (was: Junyang Qian) > Group GLM-related methods in generated doc

[jira] [Commented] (SPARK-16117) Hide LibSVMFileFormat in public API docs

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342741#comment-15342741 ] Apache Spark commented on SPARK-16117: -- User 'mengxr' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16117) Hide LibSVMFileFormat in public API docs

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16117: Assignee: Apache Spark (was: Xiangrui Meng) > Hide LibSVMFileFormat in public API docs >

[jira] [Assigned] (SPARK-16117) Hide LibSVMFileFormat in public API docs

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16117: Assignee: Xiangrui Meng (was: Apache Spark) > Hide LibSVMFileFormat in public API docs >

[jira] [Commented] (SPARK-15968) HiveMetastoreCatalog does not correctly validate partitioned metastore relation when searching the internal table cache

2016-06-21 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342731#comment-15342731 ] Michael Allman commented on SPARK-15968: I've created a revised PR for this issue,

[jira] [Commented] (SPARK-15968) HiveMetastoreCatalog does not correctly validate partitioned metastore relation when searching the internal table cache

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342730#comment-15342730 ] Apache Spark commented on SPARK-15968: -- User 'mallman' has created a pull request for this issue:

[jira] [Commented] (SPARK-15997) Audit ml.feature Update documentation for ml feature transformers

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342728#comment-15342728 ] Apache Spark commented on SPARK-15997: -- User 'GayathriMurali' has created a pull request for this

[jira] [Assigned] (SPARK-16116) ConsoleSink should not require checkpointLocation

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16116: Assignee: Shixiong Zhu (was: Apache Spark) > ConsoleSink should not require

[jira] [Commented] (SPARK-16116) ConsoleSink should not require checkpointLocation

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342708#comment-15342708 ] Apache Spark commented on SPARK-16116: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16116) ConsoleSink should not require checkpointLocation

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16116: Assignee: Apache Spark (was: Shixiong Zhu) > ConsoleSink should not require

[jira] [Created] (SPARK-16117) Hide LibSVMFileFormat in public API docs

2016-06-21 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-16117: - Summary: Hide LibSVMFileFormat in public API docs Key: SPARK-16117 URL: https://issues.apache.org/jira/browse/SPARK-16117 Project: Spark Issue Type:

[jira] [Created] (SPARK-16116) ConsoleSink should not require checkpointLocation

2016-06-21 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-16116: Summary: ConsoleSink should not require checkpointLocation Key: SPARK-16116 URL: https://issues.apache.org/jira/browse/SPARK-16116 Project: Spark Issue

[jira] [Assigned] (SPARK-16114) Add network word count example

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16114: Assignee: (was: Apache Spark) > Add network word count example >

[jira] [Comment Edited] (SPARK-16090) Improve method grouping in SparkR generated docs

2016-06-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342696#comment-15342696 ] Felix Cheung edited comment on SPARK-16090 at 6/21/16 9:08 PM: --- This is for

[jira] [Commented] (SPARK-16114) Add network word count example

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342700#comment-15342700 ] Apache Spark commented on SPARK-16114: -- User 'jjthomas' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16114) Add network word count example

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16114: Assignee: Apache Spark > Add network word count example > --

[jira] [Commented] (SPARK-16090) Improve method grouping in SparkR generated docs

2016-06-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342696#comment-15342696 ] Felix Cheung commented on SPARK-16090: -- This is for example the html output for gapply {code} # S4

[jira] [Updated] (SPARK-16105) PCA Reverse Transformer

2016-06-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16105: -- Flags: (was: Important) Priority: Minor (was: Major) The transformation is to a

[jira] [Commented] (SPARK-16115) Improve output column name for SHOW PARTITIONS command and improve an error message

2016-06-21 Thread Sunitha Kambhampati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15342689#comment-15342689 ] Sunitha Kambhampati commented on SPARK-16115: - I will submit a PR shortly. > Improve output

[jira] [Updated] (SPARK-16115) Improve output column name for SHOW PARTITIONS command and improve an error message

2016-06-21 Thread Sunitha Kambhampati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunitha Kambhampati updated SPARK-16115: Description: Opening this issue to address the following: 1. For the SHOW
