[jira] [Commented] (SPARK-15124) R 2.0 QA: New R APIs and API docs

2016-06-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343810#comment-15343810 ] Felix Cheung commented on SPARK-15124: -- I think both of these are updated now. > R

[jira] [Updated] (SPARK-16088) Update setJobGroup, clearJobGroup, cancelJobGroup SparkR API to not require sc

2016-06-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-16088: - Summary: Update setJobGroup, clearJobGroup, cancelJobGroup SparkR API to not require sc (was: De

[jira] [Commented] (SPARK-16088) Deprecate setJobGroup, clearJobGroup, cancelJobGroup from SparkR API

2016-06-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343807#comment-15343807 ] Felix Cheung commented on SPARK-16088: -- Right, since they are S3 methods there reall

[jira] [Commented] (SPARK-16108) Why is KMeansModel (scala) private?

2016-06-21 Thread Rémi Delassus (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343808#comment-15343808 ] Rémi Delassus commented on SPARK-16108: --- It still prevents me from extending KMeans

[jira] [Updated] (SPARK-16127) Audit @Since annotations related to ml.linalg

2016-06-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-16127: --- Description: SPARK-14615 converted {{spark.ml}} to use the new {{Vector}}/{{Matrix}} classes

[jira] [Assigned] (SPARK-16127) Audit @Since annotations related to ml.linalg

2016-06-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-16127: -- Assignee: Nick Pentreath > Audit @Since annotations related to ml.linalg > ---

[jira] [Created] (SPARK-16127) Audit @Since annotations related to ml.linalg

2016-06-21 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-16127: -- Summary: Audit @Since annotations related to ml.linalg Key: SPARK-16127 URL: https://issues.apache.org/jira/browse/SPARK-16127 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16126) Better Error Message When using DataFrameReader without `path`

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343798#comment-15343798 ] Apache Spark commented on SPARK-16126: -- User 'gatorsmile' has created a pull request

[jira] [Assigned] (SPARK-16126) Better Error Message When using DataFrameReader without `path`

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16126: Assignee: (was: Apache Spark) > Better Error Message When using DataFrameReader withou

[jira] [Assigned] (SPARK-16126) Better Error Message When using DataFrameReader without `path`

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16126: Assignee: Apache Spark > Better Error Message When using DataFrameReader without `path` >

[jira] [Created] (SPARK-16126) Better Error Message When using DataFrameReader without `path`

2016-06-21 Thread Xiao Li (JIRA)
Xiao Li created SPARK-16126: --- Summary: Better Error Message When using DataFrameReader without `path` Key: SPARK-16126 URL: https://issues.apache.org/jira/browse/SPARK-16126 Project: Spark Issue T

[jira] [Commented] (SPARK-16064) Fix the GLM error caused by NA produced by reweight function

2016-06-21 Thread Zhang Mengqi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343796#comment-15343796 ] Zhang Mengqi commented on SPARK-16064: -- Please check the link below for all the info

[jira] [Commented] (SPARK-15516) Schema merging in driver fails for parquet when merging LongType and IntegerType

2016-06-21 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343791#comment-15343791 ] MIN-FU YANG commented on SPARK-15516: - I would like to look into this issue > Schema

[jira] [Resolved] (SPARK-15644) Replace SQLContext with SparkSession in MLlib

2016-06-21 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-15644. --- Resolution: Fixed Fix Version/s: 2.0.0 > Replace SQLContext with SparkSession

[jira] [Commented] (SPARK-15326) Doing multiple unions on a Dataframe will result in a very inefficient query plan

2016-06-21 Thread Jurriaan Pruis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343775#comment-15343775 ] Jurriaan Pruis commented on SPARK-15326: [~hvanhovell] unfortunately that doesn't

[jira] [Commented] (SPARK-16125) YarnClusterSuite test cluster mode incorrectly

2016-06-21 Thread Peng Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343766#comment-15343766 ] Peng Zhang commented on SPARK-16125: OK, I'll check them. > YarnClusterSuite test cl

[jira] [Comment Edited] (SPARK-16095) Yarn cluster mode should return consistent result for command line and SparkLauncher

2016-06-21 Thread Peng Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343760#comment-15343760 ] Peng Zhang edited comment on SPARK-16095 at 6/22/16 5:59 AM: -

[jira] [Commented] (SPARK-16095) Yarn cluster mode should return consistent result for command line and SparkLauncher

2016-06-21 Thread Peng Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343760#comment-15343760 ] Peng Zhang commented on SPARK-16095: I want to make a unit test to explain this issue

[jira] [Commented] (SPARK-16125) YarnClusterSuite test cluster mode incorrectly

2016-06-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343747#comment-15343747 ] Sean Owen commented on SPARK-16125: --- Looks like we might have related issues in UtilsSu

[jira] [Commented] (SPARK-16064) Fix the GLM error caused by NA produced by reweight function

2016-06-21 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343745#comment-15343745 ] Sean Owen commented on SPARK-16064: --- Based on what though? you should compile the infor

[jira] [Created] (SPARK-16125) YarnClusterSuite test cluster mode incorrectly

2016-06-21 Thread Peng Zhang (JIRA)
Peng Zhang created SPARK-16125: -- Summary: YarnClusterSuite test cluster mode incorrectly Key: SPARK-16125 URL: https://issues.apache.org/jira/browse/SPARK-16125 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-16024) add tests for table creation with column comment

2016-06-21 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-16024: Description: (was: CREATE TABLE src(a INT COMMENT 'bla') USING parquet. When we describe table,

[jira] [Updated] (SPARK-16024) add tests for table creation with column comment

2016-06-21 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-16024: Description: should test both hive serde tables and datasource tables > add tests for table creatio

[jira] [Updated] (SPARK-16104) Do not create CSV writer object for every flush when writing

2016-06-21 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-16104: --- Assignee: Hyukjin Kwon > Do not create CSV writer object for every flush when writing >

[jira] [Resolved] (SPARK-16104) Do not create CSV writer object for every flush when writing

2016-06-21 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16104. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 13809 [https://github.

[jira] [Commented] (SPARK-12173) Consider supporting DataSet API in SparkR

2016-06-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343699#comment-15343699 ] Reynold Xin commented on SPARK-12173: - I don't think you are looking for the dataset

[jira] [Updated] (SPARK-16041) Disallow Duplicate Columns in `partitionBy`, `bucketBy` and `sortBy`

2016-06-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-16041: Description: Duplicate columns are not allowed in `partitionBy`, `bucketBy`, `sortBy` in DataFrameWriter.

[jira] [Updated] (SPARK-16041) Disallow Duplicate Columns in `partitionBy`, `blockBy` and `sortBy`

2016-06-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-16041: Summary: Disallow Duplicate Columns in `partitionBy`, `blockBy` and `sortBy` (was: Disallow Duplicate Colu

[jira] [Updated] (SPARK-16041) Disallow Duplicate Columns in `partitionBy`, `bucketBy` and `sortBy`

2016-06-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-16041: Summary: Disallow Duplicate Columns in `partitionBy`, `bucketBy` and `sortBy` (was: Disallow Duplicate Col

[jira] [Commented] (SPARK-16100) Aggregator fails with Tungsten error when complex types are used for results and partial sum

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343656#comment-15343656 ] Apache Spark commented on SPARK-16100: -- User 'cloud-fan' has created a pull request

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343243#comment-15343243 ] Ryan Blue commented on SPARK-16032: --- [~cloud_fan], while I think by-name insertion is i

[jira] [Commented] (SPARK-16000) Make model loading backward compatible with saved models using old vector columns

2016-06-21 Thread Gayathri Murali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343235#comment-15343235 ] Gayathri Murali commented on SPARK-16000: - [~yuhaoyan] I can help with this. >

[jira] [Commented] (SPARK-16121) ListingFileCatalog does not list in parallel anymore

2016-06-21 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343232#comment-15343232 ] Xiao Li commented on SPARK-16121: - I also saw this, but I thought this is by design. : )

[jira] [Commented] (SPARK-12172) Consider removing SparkR internal RDD APIs

2016-06-21 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343215#comment-15343215 ] Sun Rui commented on SPARK-12172: - currently spark.lapply() internally depends on RDD, we

[jira] [Commented] (SPARK-12173) Consider supporting DataSet API in SparkR

2016-06-21 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343212#comment-15343212 ] Sun Rui commented on SPARK-12173: - [~rxin] yes R don't need compile time type safety, but

[jira] [Commented] (SPARK-16124) Throws exception when executing query on `build/sbt hive/console`

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343176#comment-15343176 ] Apache Spark commented on SPARK-16124: -- User 'tilumi' has created a pull request for

[jira] [Closed] (SPARK-15326) Doing multiple unions on a Dataframe will result in a very inefficient query plan

2016-06-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell closed SPARK-15326. - Resolution: Not A Problem > Doing multiple unions on a Dataframe will result in a very in

[jira] [Assigned] (SPARK-16124) Throws exception when executing query on `build/sbt hive/console`

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16124: Assignee: Apache Spark > Throws exception when executing query on `build/sbt hive/console`

[jira] [Assigned] (SPARK-16124) Throws exception when executing query on `build/sbt hive/console`

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16124: Assignee: (was: Apache Spark) > Throws exception when executing query on `build/sbt hi

[jira] [Commented] (SPARK-15326) Doing multiple unions on a Dataframe will result in a very inefficient query plan

2016-06-21 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343175#comment-15343175 ] Herman van Hovell commented on SPARK-15326: --- So we have code that will flatten

[jira] [Created] (SPARK-16124) Throws exception when executing query on `build/sbt hive/console`

2016-06-21 Thread MIN-FU YANG (JIRA)
MIN-FU YANG created SPARK-16124: --- Summary: Throws exception when executing query on `build/sbt hive/console` Key: SPARK-16124 URL: https://issues.apache.org/jira/browse/SPARK-16124 Project: Spark

[jira] [Comment Edited] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343029#comment-15343029 ] Wenchen Fan edited comment on SPARK-16032 at 6/22/16 1:15 AM: -

[jira] [Assigned] (SPARK-16123) Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16123: Assignee: (was: Apache Spark) > Avoid NegativeArraySizeException while reserving addit

[jira] [Assigned] (SPARK-16123) Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16123: Assignee: Apache Spark > Avoid NegativeArraySizeException while reserving additional capac

[jira] [Commented] (SPARK-16123) Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343139#comment-15343139 ] Apache Spark commented on SPARK-16123: -- User 'sameeragarwal' has created a pull requ

[jira] [Updated] (SPARK-16123) Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader

2016-06-21 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal updated SPARK-16123: --- Description: Both off-heap and on-heap variants of ColumnVector.reserve() can unfortunately

[jira] [Updated] (SPARK-16102) Use Record API from Univocity rather than current data cast API.

2016-06-21 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16102: - Affects Version/s: 2.0.0 > Use Record API from Univocity rather than current data cast API. > ---

[jira] [Created] (SPARK-16123) Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader

2016-06-21 Thread Sameer Agarwal (JIRA)
Sameer Agarwal created SPARK-16123: -- Summary: Avoid NegativeArraySizeException while reserving additional capacity in VectorizedColumnReader Key: SPARK-16123 URL: https://issues.apache.org/jira/browse/SPARK-16123

[jira] [Commented] (SPARK-14172) Hive table partition predicate not passed down correctly

2016-06-21 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343121#comment-15343121 ] MIN-FU YANG commented on SPARK-14172: - I cannot reproduce it on 1.6.1 either. Could y

[jira] [Comment Edited] (SPARK-14172) Hive table partition predicate not passed down correctly

2016-06-21 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343121#comment-15343121 ] MIN-FU YANG edited comment on SPARK-14172 at 6/22/16 12:48 AM:

[jira] [Assigned] (SPARK-16119) Support "DROP TABLE ... PURGE" if Hive client supports it

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16119: Assignee: (was: Apache Spark) > Support "DROP TABLE ... PURGE" if Hive client supports

[jira] [Commented] (SPARK-16119) Support "DROP TABLE ... PURGE" if Hive client supports it

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343106#comment-15343106 ] Apache Spark commented on SPARK-16119: -- User 'vanzin' has created a pull request for

[jira] [Assigned] (SPARK-16119) Support "DROP TABLE ... PURGE" if Hive client supports it

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16119: Assignee: Apache Spark > Support "DROP TABLE ... PURGE" if Hive client supports it > -

[jira] [Assigned] (SPARK-16121) ListingFileCatalog does not list in parallel anymore

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16121: Assignee: Apache Spark > ListingFileCatalog does not list in parallel anymore > --

[jira] [Assigned] (SPARK-16121) ListingFileCatalog does not list in parallel anymore

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16121: Assignee: (was: Apache Spark) > ListingFileCatalog does not list in parallel anymore >

[jira] [Commented] (SPARK-16121) ListingFileCatalog does not list in parallel anymore

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343069#comment-15343069 ] Apache Spark commented on SPARK-16121: -- User 'yhuai' has created a pull request for

[jira] [Assigned] (SPARK-16071) Not sufficient array size checks to avoid integer overflows in Tungsten

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16071: Assignee: Apache Spark > Not sufficient array size checks to avoid integer overflows in Tu

[jira] [Assigned] (SPARK-16071) Not sufficient array size checks to avoid integer overflows in Tungsten

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16071: Assignee: (was: Apache Spark) > Not sufficient array size checks to avoid integer over

[jira] [Commented] (SPARK-16071) Not sufficient array size checks to avoid integer overflows in Tungsten

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343067#comment-15343067 ] Apache Spark commented on SPARK-16071: -- User 'clockfly' has created a pull request f

[jira] [Created] (SPARK-16122) Spark History Server REST API missing an environment endpoint per application

2016-06-21 Thread Neelesh Srinivas Salian (JIRA)
Neelesh Srinivas Salian created SPARK-16122: --- Summary: Spark History Server REST API missing an environment endpoint per application Key: SPARK-16122 URL: https://issues.apache.org/jira/browse/SPARK-1612

[jira] [Commented] (SPARK-15643) ML 2.0 QA: migration guide update

2016-06-21 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343059#comment-15343059 ] Joseph K. Bradley commented on SPARK-15643: --- A few more deprecations to add fro

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343036#comment-15343036 ] Wenchen Fan commented on SPARK-16032: - [~rdblue] I think the biggest problem is we do

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343029#comment-15343029 ] Wenchen Fan commented on SPARK-16032: - I think it's nonsense to use `partitionBy` wit

[jira] [Resolved] (SPARK-16117) Hide LibSVMFileFormat in public API docs

2016-06-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-16117. - Resolution: Fixed Fix Version/s: 2.0.0 > Hide LibSVMFileFormat in public API docs > --

[jira] [Resolved] (SPARK-16118) getDropLast is missing in OneHotEncoder

2016-06-21 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-16118. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13821 [https://g

[jira] [Updated] (SPARK-16119) Support "DROP TABLE ... PURGE" if Hive client supports it

2016-06-21 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-16119: --- Description: There's currently code that explicitly disables the "PURGE" flag when dropping

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342963#comment-15342963 ] Ryan Blue commented on SPARK-16032: --- I'm referring to disabling the use of {{partitionB

[jira] [Commented] (SPARK-16075) Make VectorUDT/MatrixUDT singleton under spark.ml package

2016-06-21 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342930#comment-15342930 ] Miao Wang commented on SPARK-16075: --- I will follow on this one. Thanks! > Make VectorU

[jira] [Assigned] (SPARK-16106) TaskSchedulerImpl does not correctly handle new executors on existing hosts

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16106: Assignee: Apache Spark > TaskSchedulerImpl does not correctly handle new executors on exis

[jira] [Assigned] (SPARK-16106) TaskSchedulerImpl does not correctly handle new executors on existing hosts

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16106: Assignee: (was: Apache Spark) > TaskSchedulerImpl does not correctly handle new execut

[jira] [Commented] (SPARK-16106) TaskSchedulerImpl does not correctly handle new executors on existing hosts

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342926#comment-15342926 ] Apache Spark commented on SPARK-16106: -- User 'squito' has created a pull request for

[jira] [Commented] (SPARK-14172) Hive table partition predicate not passed down correctly

2016-06-21 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342917#comment-15342917 ] MIN-FU YANG commented on SPARK-14172: - Hi, I cannot reproduce the problem in master b

[jira] [Created] (SPARK-16121) ListingFileCatalog does not list in parallel anymore

2016-06-21 Thread Yin Huai (JIRA)
Yin Huai created SPARK-16121: Summary: ListingFileCatalog does not list in parallel anymore Key: SPARK-16121 URL: https://issues.apache.org/jira/browse/SPARK-16121 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342856#comment-15342856 ] Yin Huai commented on SPARK-16032: -- Regarding {{disabling Hive features}}, can you be mo

[jira] [Updated] (SPARK-16032) Audit semantics of various insertion operations related to partitioned tables

2016-06-21 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-16032: - Priority: Critical (was: Blocker) > Audit semantics of various insertion operations related to partition

[jira] [Assigned] (SPARK-16120) getCurrentLogFiles method in ReceiverSuite "WAL - generating and cleaning" case uses external variable instead of the passed parameter

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16120: Assignee: (was: Apache Spark) > getCurrentLogFiles method in ReceiverSuite "WAL - gene

[jira] [Assigned] (SPARK-16120) getCurrentLogFiles method in ReceiverSuite "WAL - generating and cleaning" case uses external variable instead of the passed parameter

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16120: Assignee: Apache Spark > getCurrentLogFiles method in ReceiverSuite "WAL - generating and

[jira] [Commented] (SPARK-16120) getCurrentLogFiles method in ReceiverSuite "WAL - generating and cleaning" case uses external variable instead of the passed parameter

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342852#comment-15342852 ] Apache Spark commented on SPARK-16120: -- User 'ahmed-mahran' has created a pull reque

[jira] [Commented] (SPARK-15326) Doing multiple unions on a Dataframe will result in a very inefficient query plan

2016-06-21 Thread MIN-FU YANG (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342850#comment-15342850 ] MIN-FU YANG commented on SPARK-15326: - I'll look into it. > Doing multiple unions on

[jira] [Created] (SPARK-16120) getCurrentLogFiles method in ReceiverSuite "WAL - generating and cleaning" case uses external variable instead of the passed parameter

2016-06-21 Thread Ahmed Mahran (JIRA)
Ahmed Mahran created SPARK-16120: Summary: getCurrentLogFiles method in ReceiverSuite "WAL - generating and cleaning" case uses external variable instead of the passed parameter Key: SPARK-16120 URL: https://issu

[jira] [Assigned] (SPARK-16110) Can't set Python via spark-submit for YARN cluster mode when PYSPARK_PYTHON & PYSPARK_DRIVER_PYTHON are set

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16110: Assignee: (was: Apache Spark) > Can't set Python via spark-submit for YARN cluster mod

[jira] [Assigned] (SPARK-16110) Can't set Python via spark-submit for YARN cluster mode when PYSPARK_PYTHON & PYSPARK_DRIVER_PYTHON are set

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16110: Assignee: Apache Spark > Can't set Python via spark-submit for YARN cluster mode when PYSP

[jira] [Commented] (SPARK-16110) Can't set Python via spark-submit for YARN cluster mode when PYSPARK_PYTHON & PYSPARK_DRIVER_PYTHON are set

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342831#comment-15342831 ] Apache Spark commented on SPARK-16110: -- User 'KevinGrealish' has created a pull requ

[jira] [Commented] (SPARK-16106) TaskSchedulerImpl does not correctly handle new executors on existing hosts

2016-06-21 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342821#comment-15342821 ] Imran Rashid commented on SPARK-16106: -- cc [~kayousterhout] After taking a closer l

[jira] [Updated] (SPARK-16106) TaskSchedulerImpl does not correctly handle new executors on existing hosts

2016-06-21 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-16106: - Priority: Trivial (was: Major) > TaskSchedulerImpl does not correctly handle new executors on ex

[jira] [Updated] (SPARK-15606) Driver hang in o.a.s.DistributedSuite on 2 core machine

2016-06-21 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-15606: - Fix Version/s: 1.6.2 > Driver hang in o.a.s.DistributedSuite on 2 core machine >

[jira] [Updated] (SPARK-15606) Driver hang in o.a.s.DistributedSuite on 2 core machine

2016-06-21 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-15606: - Affects Version/s: 1.6.2 > Driver hang in o.a.s.DistributedSuite on 2 core machine >

[jira] [Created] (SPARK-16119) Support "DROP TABLE ... PURGE" if Hive client supports it

2016-06-21 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-16119: -- Summary: Support "DROP TABLE ... PURGE" if Hive client supports it Key: SPARK-16119 URL: https://issues.apache.org/jira/browse/SPARK-16119 Project: Spark

[jira] [Assigned] (SPARK-16115) Improve output column name for SHOW PARTITIONS command and improve an error message

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16115: Assignee: (was: Apache Spark) > Improve output column name for SHOW PARTITIONS command

[jira] [Assigned] (SPARK-16115) Improve output column name for SHOW PARTITIONS command and improve an error message

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16115: Assignee: Apache Spark > Improve output column name for SHOW PARTITIONS command and improv

[jira] [Commented] (SPARK-16115) Improve output column name for SHOW PARTITIONS command and improve an error message

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342796#comment-15342796 ] Apache Spark commented on SPARK-16115: -- User 'skambha' has created a pull request fo

[jira] [Commented] (SPARK-16118) getDropLast is missing in OneHotEncoder

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342790#comment-15342790 ] Apache Spark commented on SPARK-16118: -- User 'mengxr' has created a pull request for

[jira] [Assigned] (SPARK-16118) getDropLast is missing in OneHotEncoder

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16118: Assignee: Xiangrui Meng (was: Apache Spark) > getDropLast is missing in OneHotEncoder > -

[jira] [Assigned] (SPARK-16118) getDropLast is missing in OneHotEncoder

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16118: Assignee: Apache Spark (was: Xiangrui Meng) > getDropLast is missing in OneHotEncoder > -

[jira] [Created] (SPARK-16118) getDropLast is missing in OneHotEncoder

2016-06-21 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-16118: - Summary: getDropLast is missing in OneHotEncoder Key: SPARK-16118 URL: https://issues.apache.org/jira/browse/SPARK-16118 Project: Spark Issue Type: New Fea

[jira] [Commented] (SPARK-16107) Group GLM-related methods in generated doc

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342773#comment-15342773 ] Apache Spark commented on SPARK-16107: -- User 'junyangq' has created a pull request f

[jira] [Assigned] (SPARK-16107) Group GLM-related methods in generated doc

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16107: Assignee: Junyang Qian (was: Apache Spark) > Group GLM-related methods in generated doc >

[jira] [Assigned] (SPARK-16107) Group GLM-related methods in generated doc

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16107: Assignee: Apache Spark (was: Junyang Qian) > Group GLM-related methods in generated doc >

[jira] [Commented] (SPARK-16117) Hide LibSVMFileFormat in public API docs

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342741#comment-15342741 ] Apache Spark commented on SPARK-16117: -- User 'mengxr' has created a pull request for

[jira] [Assigned] (SPARK-16117) Hide LibSVMFileFormat in public API docs

2016-06-21 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16117: Assignee: Apache Spark (was: Xiangrui Meng) > Hide LibSVMFileFormat in public API docs >
