[jira] [Updated] (SPARK-16404) LeastSquaresAggregator in Linear Regression serializes unnecessary data

2016-08-04 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai updated SPARK-16404: Shepherd: DB Tsai > LeastSquaresAggregator in Linear Regression serializes unnecessary data >

[jira] [Updated] (SPARK-16404) LeastSquaresAggregator in Linear Regression serializes unnecessary data

2016-08-04 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai updated SPARK-16404: Assignee: Seth Hendrickson > LeastSquaresAggregator in Linear Regression serializes unnecessary data >

[jira] [Commented] (SPARK-16883) SQL decimal type is not properly cast to number when collecting SparkDataFrame

2016-08-04 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408916#comment-15408916 ] Miao Wang commented on SPARK-16883: --- Right. I am trying to modify it. I will send a PR once I fix it.

[jira] [Commented] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-08-04 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408900#comment-15408900 ] Liwei Lin commented on SPARK-16903: --- [~hyukjin.kwon] thanks for pinging me! I believe the cause is

[jira] [Commented] (SPARK-16409) regexp_extract with optional groups causes NPE

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408894#comment-15408894 ] Sean Owen commented on SPARK-16409: --- It's not quite my area, but I might know the answer. As to

[jira] [Comment Edited] (SPARK-16895) Reading empty string from csv has changed behaviour

2016-08-04 Thread Aseem Bansal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408892#comment-15408892 ] Aseem Bansal edited comment on SPARK-16895 at 8/5/16 5:19 AM: -- I see that

[jira] [Commented] (SPARK-16895) Reading empty string from csv has changed behaviour

2016-08-04 Thread Aseem Bansal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408892#comment-15408892 ] Aseem Bansal commented on SPARK-16895: -- I understand that it is duplicate. Regarding it being a bug

[jira] [Commented] (SPARK-16911) Fix broken links on the programming guide.

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408889#comment-15408889 ] Apache Spark commented on SPARK-16911: -- User 'shiv4nsh' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16911) Fix broken links on the programming guide.

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16911: Assignee: Apache Spark > Fix broken links on the programming guide. >

[jira] [Assigned] (SPARK-16911) Fix broken links on the programming guide.

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16911: Assignee: (was: Apache Spark) > Fix broken links on the programming guide. >

[jira] [Updated] (SPARK-16911) Fix broken links on the programming guide.

2016-08-04 Thread Shivansh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivansh updated SPARK-16911: - Summary: Fix broken links on the programming guide. (was: Fix broken liks on the programming guide.) >

[jira] [Commented] (SPARK-16321) [Spark 2.0] Performance regression when reading parquet and using PPD and non-vectorized reader

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408878#comment-15408878 ] Apache Spark commented on SPARK-16321: -- User 'viirya' has created a pull request for this issue:

[jira] [Created] (SPARK-16911) Fix broken liks on the programming guide.

2016-08-04 Thread Shivansh (JIRA)
Shivansh created SPARK-16911: Summary: Fix broken liks on the programming guide. Key: SPARK-16911 URL: https://issues.apache.org/jira/browse/SPARK-16911 Project: Spark Issue Type: Documentation

[jira] [Commented] (SPARK-16666) Kryo encoder for custom complex classes

2016-08-04 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408855#comment-15408855 ] Sean Zhong commented on SPARK-16666: This issue has been fixed in Spark 2.0 and trunk. Can you use

[jira] [Resolved] (SPARK-16666) Kryo encoder for custom complex classes

2016-08-04 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong resolved SPARK-16666. Resolution: Not A Problem > Kryo encoder for custom complex classes >

[jira] [Commented] (SPARK-9761) Inconsistent metadata handling with ALTER TABLE

2016-08-04 Thread David Winters (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408828#comment-15408828 ] David Winters commented on SPARK-9761: -- [~xwu0226] - Thanks for the follow-up. BTW, I was able to

[jira] [Created] (SPARK-16910) Infrastructure Projects name update in Supplemental Spark Projects

2016-08-04 Thread Qi Li (JIRA)
Qi Li created SPARK-16910: - Summary: Infrastructure Projects name update in Supplemental Spark Projects Key: SPARK-16910 URL: https://issues.apache.org/jira/browse/SPARK-16910 Project: Spark Issue

[jira] [Assigned] (SPARK-16909) Streaming for postgreSQL JDBC driver

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16909: Assignee: (was: Apache Spark) > Streaming for postgreSQL JDBC driver >

[jira] [Commented] (SPARK-16909) Streaming for postgreSQL JDBC driver

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408820#comment-15408820 ] Apache Spark commented on SPARK-16909: -- User 'princejwesley' has created a pull request for this

[jira] [Assigned] (SPARK-16909) Streaming for postgreSQL JDBC driver

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16909: Assignee: Apache Spark > Streaming for postgreSQL JDBC driver >

[jira] [Updated] (SPARK-16907) Parquet table reading performance regression when vectorized record reader is not used

2016-08-04 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-16907: Assignee: Sean Zhong > Parquet table reading performance regression when vectorized record reader

[jira] [Resolved] (SPARK-16907) Parquet table reading performance regression when vectorized record reader is not used

2016-08-04 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-16907. - Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 > Parquet table reading

[jira] [Created] (SPARK-16909) Streaming for postgreSQL JDBC driver

2016-08-04 Thread prince john wesley (JIRA)
prince john wesley created SPARK-16909: -- Summary: Streaming for postgreSQL JDBC driver Key: SPARK-16909 URL: https://issues.apache.org/jira/browse/SPARK-16909 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-16908) Java code style guideline documentation

2016-08-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408772#comment-15408772 ] Hyukjin Kwon edited comment on SPARK-16908 at 8/5/16 2:12 AM: -- Could you

[jira] [Commented] (SPARK-16908) Java code style guideline documentation

2016-08-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408772#comment-15408772 ] Hyukjin Kwon commented on SPARK-16908: -- Could you please take a look [~srowen]? > Java code style

[jira] [Created] (SPARK-16908) Java code style guideline documentation

2016-08-04 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-16908: Summary: Java code style guideline documentation Key: SPARK-16908 URL: https://issues.apache.org/jira/browse/SPARK-16908 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-16907) Parquet table reading performance regression when vectorized record reader is not used

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16907: Assignee: (was: Apache Spark) > Parquet table reading performance regression when

[jira] [Commented] (SPARK-16907) Parquet table reading performance regression when vectorized record reader is not used

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408737#comment-15408737 ] Apache Spark commented on SPARK-16907: -- User 'clockfly' has created a pull request for this issue:

[jira] [Created] (SPARK-16907) Parquet table reading performance regression when vectorized record reader is not used

2016-08-04 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16907: -- Summary: Parquet table reading performance regression when vectorized record reader is not used Key: SPARK-16907 URL: https://issues.apache.org/jira/browse/SPARK-16907

[jira] [Assigned] (SPARK-16907) Parquet table reading performance regression when vectorized record reader is not used

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16907: Assignee: Apache Spark > Parquet table reading performance regression when vectorized

[jira] [Commented] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-08-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408735#comment-15408735 ] Hyukjin Kwon commented on SPARK-16903: -- Oh, I thought we should apply {{nullValue}} to all types

[jira] [Commented] (SPARK-16896) Loading csv with duplicate column names

2016-08-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408728#comment-15408728 ] Hossein Falaki commented on SPARK-16896: I suggest we generally follow the restrictions of

[jira] [Commented] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-08-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408722#comment-15408722 ] Hossein Falaki commented on SPARK-16903: Thanks for the info. That makes me doubt the decision to

[jira] [Commented] (SPARK-16896) Loading csv with duplicate column names

2016-08-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408715#comment-15408715 ] Hyukjin Kwon commented on SPARK-16896: -- I agree with appending a number to the duplicated column

[jira] [Commented] (SPARK-15899) file scheme should be used correctly

2016-08-04 Thread skaarthik (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408717#comment-15408717 ] skaarthik commented on SPARK-15899: --- Confirming that overriding the setting ("--conf

[jira] [Updated] (SPARK-16896) Loading csv with duplicate column names

2016-08-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16896: - Component/s: SQL > Loading csv with duplicate column names >

[jira] [Commented] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-08-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408704#comment-15408704 ] Hyukjin Kwon commented on SPARK-16903: -- Hi [~falaki], is this about SPARK-16462, SPARK-16460 and

[jira] [Comment Edited] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-08-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408704#comment-15408704 ] Hyukjin Kwon edited comment on SPARK-16903 at 8/5/16 12:44 AM: --- Hi

[jira] [Assigned] (SPARK-16906) Adds more input type information for TypedAggregateExpression

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16906: Assignee: Apache Spark > Adds more input type information for TypedAggregateExpression >

[jira] [Commented] (SPARK-16906) Adds more input type information for TypedAggregateExpression

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408622#comment-15408622 ] Apache Spark commented on SPARK-16906: -- User 'clockfly' has created a pull request for this issue:

[jira] [Commented] (SPARK-16883) SQL decimal type is not properly cast to number when collecting SparkDataFrame

2016-08-04 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408621#comment-15408621 ] Hossein Falaki commented on SPARK-16883: I think that is because we are not converting

[jira] [Assigned] (SPARK-16906) Adds more input type information for TypedAggregateExpression

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16906: Assignee: (was: Apache Spark) > Adds more input type information for

[jira] [Created] (SPARK-16906) Adds more input type information for TypedAggregateExpression

2016-08-04 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16906: -- Summary: Adds more input type information for TypedAggregateExpression Key: SPARK-16906 URL: https://issues.apache.org/jira/browse/SPARK-16906 Project: Spark

[jira] [Commented] (SPARK-16883) SQL decimal type is not properly cast to number when collecting SparkDataFrame

2016-08-04 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408611#comment-15408611 ] Miao Wang commented on SPARK-16883: --- > test <- sql("select cast('1' as double) as x, cast('2' as

[jira] [Commented] (SPARK-16409) regexp_extract with optional groups causes NPE

2016-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408607#comment-15408607 ] Reynold Xin commented on SPARK-16409: - [~srowen] do you have time to fix this one? > regexp_extract

[jira] [Created] (SPARK-16905) Support SQL DDL: MSCK REPAIR TABLE

2016-08-04 Thread Davies Liu (JIRA)
Davies Liu created SPARK-16905: -- Summary: Support SQL DDL: MSCK REPAIR TABLE Key: SPARK-16905 URL: https://issues.apache.org/jira/browse/SPARK-16905 Project: Spark Issue Type: New Feature

[jira] [Assigned] (SPARK-16904) Removal of Hive Built-in Hash Functions and TestHiveFunctionRegistry

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16904: Assignee: Apache Spark > Removal of Hive Built-in Hash Functions and

[jira] [Assigned] (SPARK-16904) Removal of Hive Built-in Hash Functions and TestHiveFunctionRegistry

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16904: Assignee: (was: Apache Spark) > Removal of Hive Built-in Hash Functions and

[jira] [Commented] (SPARK-16904) Removal of Hive Built-in Hash Functions and TestHiveFunctionRegistry

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408559#comment-15408559 ] Apache Spark commented on SPARK-16904: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Resolved] (SPARK-15074) Spark shuffle service bottlenecked while fetching large amount of intermediate data

2016-08-04 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-15074. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 12944

[jira] [Created] (SPARK-16904) Removal of Hive Built-in Hash Functions and TestHiveFunctionRegistry

2016-08-04 Thread Xiao Li (JIRA)
Xiao Li created SPARK-16904: --- Summary: Removal of Hive Built-in Hash Functions and TestHiveFunctionRegistry Key: SPARK-16904 URL: https://issues.apache.org/jira/browse/SPARK-16904 Project: Spark

[jira] [Commented] (SPARK-16334) [SQL] SQL query on parquet table java.lang.ArrayIndexOutOfBoundsException

2016-08-04 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408531#comment-15408531 ] Sameer Agarwal commented on SPARK-16334: Thanks Keith, that'll work. You can mail it to me at

[jira] [Commented] (SPARK-16334) [SQL] SQL query on parquet table java.lang.ArrayIndexOutOfBoundsException

2016-08-04 Thread Keith Kraus (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408520#comment-15408520 ] Keith Kraus commented on SPARK-16334: - [~sameerag] Sharing even a subset of the data would be very

[jira] [Commented] (SPARK-16611) Expose several hidden DataFrame/RDD functions

2016-08-04 Thread Alok Singh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408513#comment-15408513 ] Alok Singh commented on SPARK-16611: Hi [~shivaram] Thanks for the reply. 1)To illustrate what I

[jira] [Resolved] (SPARK-16877) Add a rule for preventing use Java's Override annotation

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16877. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Updated] (SPARK-16877) Add a rule for preventing use Java's Override annotation

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16877: -- Assignee: Hyukjin Kwon > Add a rule for preventing use Java's Override annotation >

[jira] [Resolved] (SPARK-16880) Improve ANN training, add training data persist if needed

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16880. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Updated] (SPARK-16880) Improve ANN training, add training data persist if needed

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16880: -- Assignee: Weichen Xu Priority: Minor (was: Major) > Improve ANN training, add training data

[jira] [Updated] (SPARK-16875) Add args checking for DataSet randomSplit and sample

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16875: -- Assignee: zhengruifeng > Add args checking for DataSet randomSplit and sample >

[jira] [Resolved] (SPARK-16875) Add args checking for DataSet randomSplit and sample

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16875. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Commented] (SPARK-16901) Hive settings in hive-site.xml may be overridden by Hive's default values

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408439#comment-15408439 ] Apache Spark commented on SPARK-16901: -- User 'yhuai' has created a pull request for this issue:

[jira] [Commented] (SPARK-16901) Hive settings in hive-site.xml may be overridden by Hive's default values

2016-08-04 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408431#comment-15408431 ] Yin Huai commented on SPARK-16901: -- This issue only appears if the value of

[jira] [Created] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-08-04 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-16903: -- Summary: nullValue in first field is not respected by CSV source when read Key: SPARK-16903 URL: https://issues.apache.org/jira/browse/SPARK-16903 Project: Spark

[jira] [Created] (SPARK-16902) Custom ExpressionEncoder for primitive array is not effective

2016-08-04 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-16902: Summary: Custom ExpressionEncoder for primitive array is not effective Key: SPARK-16902 URL: https://issues.apache.org/jira/browse/SPARK-16902 Project: Spark

[jira] [Commented] (SPARK-9761) Inconsistent metadata handling with ALTER TABLE

2016-08-04 Thread Xin Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408405#comment-15408405 ] Xin Wu commented on SPARK-9761: --- [~drwinters] Spark 2.0 has support for DDL commands, which means it gives the

[jira] [Resolved] (SPARK-16876) Add match Column expression for regular expression matching in Scala API

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16876. --- Resolution: Duplicate It's not specific to Pyspark (I'll remove the spurious label), but in any

[jira] [Updated] (SPARK-16203) regexp_extract to return an ArrayType(StringType())

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16203: -- Component/s: (was: PySpark) > regexp_extract to return an ArrayType(StringType()) >

[jira] [Updated] (SPARK-16893) Spark CSV Provider option is not documented

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16893: -- Priority: Minor (was: Major) Please specify the problem here; it just says you faced some problem

[jira] [Resolved] (SPARK-16895) Reading empty string from csv has changed behaviour

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16895. --- Resolution: Duplicate In general, it's not true that a behavior change is a bug, when across major

[jira] [Commented] (SPARK-16832) CrossValidator and TrainValidationSplit are not random without seed

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408393#comment-15408393 ] Sean Owen commented on SPARK-16832: --- I think it's worth tabling a pull request to change that behavior,

[jira] [Updated] (SPARK-16899) Structured Streaming Checkpointing Example invalid

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16899: -- Priority: Minor (was: Critical) OK, open a pull request > Structured Streaming Checkpointing Example

[jira] [Commented] (SPARK-16885) Spark shell failed to run in yarn-client mode

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408388#comment-15408388 ] Sean Owen commented on SPARK-16885: --- spark.yarn.archive=/home/yzhishko/work/spark/jars > Spark shell

[jira] [Commented] (SPARK-15899) file scheme should be used correctly

2016-08-04 Thread Arsen Vladimirskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408362#comment-15408362 ] Arsen Vladimirskiy commented on SPARK-15899: Are you using the pre-compiled binaries? The

[jira] [Issue Comment Deleted] (SPARK-16885) Spark shell failed to run in yarn-client mode

2016-08-04 Thread Yury Zhyshko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yury Zhyshko updated SPARK-16885: - Comment: was deleted (was: Specified where?) > Spark shell failed to run in yarn-client mode >

[jira] [Commented] (SPARK-16885) Spark shell failed to run in yarn-client mode

2016-08-04 Thread Yury Zhyshko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408358#comment-15408358 ] Yury Zhyshko commented on SPARK-16885: -- Specified where? > Spark shell failed to run in yarn-client

[jira] [Commented] (SPARK-16885) Spark shell failed to run in yarn-client mode

2016-08-04 Thread Yury Zhyshko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408357#comment-15408357 ] Yury Zhyshko commented on SPARK-16885: -- Specified where? > Spark shell failed to run in yarn-client

[jira] [Commented] (SPARK-16334) [SQL] SQL query on parquet table java.lang.ArrayIndexOutOfBoundsException

2016-08-04 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408334#comment-15408334 ] Sameer Agarwal commented on SPARK-16334: [~keith.j.kraus], [~sebastianherold] -- would it be

[jira] [Created] (SPARK-16901) Hive settings in hive-site.xml may be overridden by Hive's default values

2016-08-04 Thread Yin Huai (JIRA)
Yin Huai created SPARK-16901: Summary: Hive settings in hive-site.xml may be overridden by Hive's default values Key: SPARK-16901 URL: https://issues.apache.org/jira/browse/SPARK-16901 Project: Spark

[jira] [Commented] (SPARK-16883) SQL decimal type is not properly cast to number when collecting SparkDataFrame

2016-08-04 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408298#comment-15408298 ] Miao Wang commented on SPARK-16883: --- I will take a look. > SQL decimal type is not properly cast to

[jira] [Updated] (SPARK-16884) Move DataSourceScanExec out of ExistingRDD.scala file

2016-08-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-16884: --- Assignee: Eric Liang > Move DataSourceScanExec out of ExistingRDD.scala file >

[jira] [Resolved] (SPARK-16884) Move DataSourceScanExec out of ExistingRDD.scala file

2016-08-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16884. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14487

[jira] [Resolved] (SPARK-16802) joins.LongToUnsafeRowMap crashes with ArrayIndexOutOfBoundsException

2016-08-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-16802. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Updated] (SPARK-16899) Structured Streaming Checkpointing Example invalid

2016-08-04 Thread Vladimir Feinberg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Feinberg updated SPARK-16899: -- Description: The structured streaming checkpointing example at the bottom of the page

[jira] [Created] (SPARK-16900) Complete-mode output for file sinks

2016-08-04 Thread Vladimir Feinberg (JIRA)
Vladimir Feinberg created SPARK-16900: - Summary: Complete-mode output for file sinks Key: SPARK-16900 URL: https://issues.apache.org/jira/browse/SPARK-16900 Project: Spark Issue Type:

[jira] [Created] (SPARK-16899) Structured Streaming Checkpointing Example invalid

2016-08-04 Thread Vladimir Feinberg (JIRA)
Vladimir Feinberg created SPARK-16899: - Summary: Structured Streaming Checkpointing Example invalid Key: SPARK-16899 URL: https://issues.apache.org/jira/browse/SPARK-16899 Project: Spark

[jira] [Commented] (SPARK-15937) Spark declares a succeeding job to be failed in yarn-cluster mode if the job takes very small time (~ < 10 seconds) to finish

2016-08-04 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408217#comment-15408217 ] Marcelo Vanzin commented on SPARK-15937: [~subrotosanyal] from the logs you posted it doesn't

[jira] [Commented] (SPARK-16832) CrossValidator and TrainValidationSplit are not random without seed

2016-08-04 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408190#comment-15408190 ] Bryan Cutler commented on SPARK-16832: -- Yeah, I'm not sure of the reason myself, but I agree with

[jira] [Resolved] (SPARK-16890) substring returns wrong result for positive position

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16890. --- Resolution: Not A Problem Yes, this is how it's supposed to work. Compare with

[jira] [Closed] (SPARK-16897) Invalid use of var where val can be used.

2016-08-04 Thread Shivansh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivansh closed SPARK-16897. Resolution: Won't Fix > Invalid use of var where val can be used. >

[jira] [Resolved] (SPARK-11938) Expose numFeatures in all ML PredictionModel for PySpark

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-11938. --- Resolution: Duplicate Target Version/s: (was: 2.1.0) > Expose numFeatures in all ML

[jira] [Commented] (SPARK-16772) Correct API doc references to PySpark classes + formatting fixes

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408167#comment-15408167 ] Apache Spark commented on SPARK-16772: -- User 'nchammas' has created a pull request for this issue:

[jira] [Resolved] (SPARK-15422) Remove unnecessary calculation of stage's parents

2016-08-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-15422. --- Resolution: Duplicate > Remove unnecessary calculation of stage's parents >

[jira] [Commented] (SPARK-11638) Run Spark on Mesos with bridge networking

2016-08-04 Thread John Omernik (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408068#comment-15408068 ] John Omernik commented on SPARK-11638: -- Another new guy, late to the game: Would work done on

[jira] [Commented] (SPARK-16898) Adds argument type information for typed logical plan like MapElements, TypedFilter, and AppendColumn

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408051#comment-15408051 ] Apache Spark commented on SPARK-16898: -- User 'clockfly' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16898) Adds argument type information for typed logical plan like MapElements, TypedFilter, and AppendColumn

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16898: Assignee: (was: Apache Spark) > Adds argument type information for typed logical plan

[jira] [Assigned] (SPARK-16898) Adds argument type information for typed logical plan like MapElements, TypedFilter, and AppendColumn

2016-08-04 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16898: Assignee: Apache Spark > Adds argument type information for typed logical plan like

[jira] [Updated] (SPARK-16898) Adds argument type information for typed logical plan like MapElements, TypedFilter, and AppendColumn

2016-08-04 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-16898: --- Summary: Adds argument type information for typed logical plan like MapElements, TypedFilter, and

[jira] [Created] (SPARK-16898) Adds argument type information for typed logical plan likMapElements, TypedFilter, and AppendColumn

2016-08-04 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16898: -- Summary: Adds argument type information for typed logical plan likMapElements, TypedFilter, and AppendColumn Key: SPARK-16898 URL: https://issues.apache.org/jira/browse/SPARK-16898

[jira] [Commented] (SPARK-16409) regexp_extract with optional groups causes NPE

2016-08-04 Thread Max Moroz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15407972#comment-15407972 ] Max Moroz commented on SPARK-16409: --- Still causes NPE on the newly released Spark 2.0.0. >

[jira] [Commented] (SPARK-16798) java.lang.IllegalArgumentException: bound must be positive : Worked in 1.5.2

2016-08-04 Thread Charles Allen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15407877#comment-15407877 ] Charles Allen commented on SPARK-16798: --- I reproduced it with stock spark. I'm working on getting a
