[jira] [Assigned] (SPARK-16849) Improve subquery execution by deduplicating the subqueries with the same results

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16849: Assignee: (was: Apache Spark) > Improve subquery execution by deduplicating the

[jira] [Assigned] (SPARK-16849) Improve subquery execution by deduplicating the subqueries with the same results

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16849: Assignee: Apache Spark > Improve subquery execution by deduplicating the subqueries with

[jira] [Commented] (SPARK-16849) Improve subquery execution by deduplicating the subqueries with the same results

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403394#comment-15403394 ] Apache Spark commented on SPARK-16849: -- User 'viirya' has created a pull request for this issue:

[jira] [Created] (SPARK-16849) Improve subquery execution by deduplicating the subqueries with the same results

2016-08-01 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-16849: --- Summary: Improve subquery execution by deduplicating the subqueries with the same results Key: SPARK-16849 URL: https://issues.apache.org/jira/browse/SPARK-16849

[jira] [Commented] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403380#comment-15403380 ] Xiao Li commented on SPARK-16842: - For each table, we just need to issue one query. That query will

[jira] [Comment Edited] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403376#comment-15403376 ] Hyukjin Kwon edited comment on SPARK-16842 at 8/2/16 5:21 AM: -- hm.. don't we

[jira] [Commented] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403376#comment-15403376 ] Hyukjin Kwon commented on SPARK-16842: -- hm.. don't we make a connection and then run a query to

[jira] [Commented] (SPARK-16848) Make jdbc() and read.format("jdbc") consistently throwing exception for user-specified schema

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403364#comment-15403364 ] Apache Spark commented on SPARK-16848: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-16848) Make jdbc() and read.format("jdbc") consistently throwing exception for user-specified schema

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16848: Assignee: Apache Spark > Make jdbc() and read.format("jdbc") consistently throwing

[jira] [Assigned] (SPARK-16848) Make jdbc() and read.format("jdbc") consistently throwing exception for user-specified schema

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16848: Assignee: (was: Apache Spark) > Make jdbc() and read.format("jdbc") consistently

[jira] [Commented] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403358#comment-15403358 ] Xiao Li commented on SPARK-16842: - I heard of a case. In one big Internet company, their use case could

[jira] [Commented] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403354#comment-15403354 ] Xiao Li commented on SPARK-16842: - The overhead of schema parsing in JDBC is small, right? > Concern

[jira] [Created] (SPARK-16848) Make jdbc() and read.format("jdbc") consistently throwing exception for user-specified schema

2016-08-01 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-16848: Summary: Make jdbc() and read.format("jdbc") consistently throwing exception for user-specified schema Key: SPARK-16848 URL: https://issues.apache.org/jira/browse/SPARK-16848

[jira] [Assigned] (SPARK-16847) Prevent to potentially read corrupt statstics on binary in Parquet via VectorizedReader

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16847: Assignee: Apache Spark > Prevent to potentially read corrupt statstics on binary in

[jira] [Assigned] (SPARK-16847) Prevent to potentially read corrupt statstics on binary in Parquet via VectorizedReader

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16847: Assignee: (was: Apache Spark) > Prevent to potentially read corrupt statstics on

[jira] [Commented] (SPARK-16847) Prevent to potentially read corrupt statstics on binary in Parquet via VectorizedReader

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403337#comment-15403337 ] Apache Spark commented on SPARK-16847: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-16843) Select features according to a percentile of the highest scores of ChiSqSelector

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16843: Assignee: (was: Apache Spark) > Select features according to a percentile of the

[jira] [Commented] (SPARK-16843) Select features according to a percentile of the highest scores of ChiSqSelector

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1540#comment-1540 ] Apache Spark commented on SPARK-16843: -- User 'mpjlu' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16843) Select features according to a percentile of the highest scores of ChiSqSelector

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16843: Assignee: Apache Spark > Select features according to a percentile of the highest scores

[jira] [Updated] (SPARK-16847) Prevent to potentially read corrupt statstics on binary in Parquet via VectorizedReader

2016-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16847: - Summary: Prevent to potentially read corrupt statstics on binary in Parquet via VectorizedReader

[jira] [Updated] (SPARK-16847) Do not read Parquet corrupt statstics on binary via VectorizedReader when it is corrupt

2016-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16847: - Summary: Do not read Parquet corrupt statstics on binary via VectorizedReader when it is corrupt

[jira] [Created] (SPARK-16847) Do not read Parquet corrupt statstics on binary

2016-08-01 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-16847: Summary: Do not read Parquet corrupt statstics on binary Key: SPARK-16847 URL: https://issues.apache.org/jira/browse/SPARK-16847 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403315#comment-15403315 ] Hyukjin Kwon commented on SPARK-16842: -- If we don't support schema compatibility but should support

[jira] [Comment Edited] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403308#comment-15403308 ] Hyukjin Kwon edited comment on SPARK-16842 at 8/2/16 3:56 AM: -- Thanks for

[jira] [Commented] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403308#comment-15403308 ] Hyukjin Kwon commented on SPARK-16842: -- Thanks for your feedback. Yea, but I think it might not be

[jira] [Issue Comment Deleted] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16842: - Comment: was deleted (was: Thanks for your feedback. Yea, but I think it might not be very heavy

[jira] [Commented] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403309#comment-15403309 ] Hyukjin Kwon commented on SPARK-16842: -- Thanks for your feedback. Yea, but I think it might not be

[jira] [Closed] (SPARK-15939) Clarify ml.linalg usage

2016-08-01 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng closed SPARK-15939. Resolution: Not A Problem > Clarify ml.linalg usage > --- > >

[jira] [Updated] (SPARK-16843) Select features according to a percentile of the highest scores of ChiSqSelector

2016-08-01 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peng Meng updated SPARK-16843: -- Fix Version/s: (was: 2.0.1) 2.1.0 > Select features according to a percentile

[jira] [Updated] (SPARK-16843) Select features according to a percentile of the highest scores of ChiSqSelector

2016-08-01 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peng Meng updated SPARK-16843: -- Target Version/s: (was: 2.0.1) > Select features according to a percentile of the highest scores of

[jira] [Updated] (SPARK-16843) Select features according to a percentile of the highest scores of ChiSqSelector

2016-08-01 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peng Meng updated SPARK-16843: -- Priority: Minor (was: Major) > Select features according to a percentile of the highest scores of >

[jira] [Updated] (SPARK-16843) Select features according to a percentile of the highest scores of ChiSqSelector

2016-08-01 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peng Meng updated SPARK-16843: -- Affects Version/s: (was: 2.0.0) 2.1.0 > Select features according to a

[jira] [Created] (SPARK-16846) read.csv() option: "inferSchema" don't work

2016-08-01 Thread hejie (JIRA)
hejie created SPARK-16846: - Summary: read.csv() option: "inferSchema" don't work Key: SPARK-16846 URL: https://issues.apache.org/jira/browse/SPARK-16846 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-16818) Exchange reuse incorrectly reuses scans over different sets of partitions

2016-08-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16818: Fix Version/s: 2.0.1 > Exchange reuse incorrectly reuses scans over different sets of partitions >

[jira] [Commented] (SPARK-16826) java.util.Hashtable limits the throughput of PARSE_URL()

2016-08-01 Thread Sylvain Zimmer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403262#comment-15403262 ] Sylvain Zimmer commented on SPARK-16826: [~srowen] what about this?

[jira] [Commented] (SPARK-16579) Add a spark install function

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403256#comment-15403256 ] Apache Spark commented on SPARK-16579: -- User 'junyangq' has created a pull request for this issue:

[jira] [Commented] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403255#comment-15403255 ] Xiao Li commented on SPARK-16842: - When users specify the schema, we do not need to discover the schema,

[jira] [Commented] (SPARK-14559) Netty RPC didn't check channel is active before sending message

2016-08-01 Thread Tao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403247#comment-15403247 ] Tao Wang commented on SPARK-14559: -- Hi [~zsxwing], Sadly the application is ended now so i can't get the

[jira] [Comment Edited] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403238#comment-15403238 ] Sean Zhong edited comment on SPARK-16320 at 8/2/16 2:22 AM: [~maver1ck] Can

[jira] [Commented] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403238#comment-15403238 ] Sean Zhong commented on SPARK-16320: [~loziniak] Can you check whether the PR works for you? >

[jira] [Commented] (SPARK-16826) java.util.Hashtable limits the throughput of PARSE_URL()

2016-08-01 Thread Sylvain Zimmer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403236#comment-15403236 ] Sylvain Zimmer commented on SPARK-16826: Sorry I can't be more helpful on the Java side... But I

[jira] [Created] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-08-01 Thread hejie (JIRA)
hejie created SPARK-16845: - Summary: org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB Key: SPARK-16845 URL: https://issues.apache.org/jira/browse/SPARK-16845

[jira] [Commented] (SPARK-16826) java.util.Hashtable limits the throughput of PARSE_URL()

2016-08-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403222#comment-15403222 ] Sean Owen commented on SPARK-16826: --- URI.toURL just follows the same code path. Does URI itself parse

[jira] [Commented] (SPARK-16844) Generate code for sort based aggregation

2016-08-01 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403202#comment-15403202 ] yucai commented on SPARK-16844: --- We are working on the whole stage code gen for the sort based aggregation.

[jira] [Created] (SPARK-16844) Generate code for sort based aggregation

2016-08-01 Thread yucai (JIRA)
yucai created SPARK-16844: - Summary: Generate code for sort based aggregation Key: SPARK-16844 URL: https://issues.apache.org/jira/browse/SPARK-16844 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-16843) Select features according to a percentile of the highest scores of ChiSqSelector

2016-08-01 Thread Peng Meng (JIRA)
Peng Meng created SPARK-16843: - Summary: Select features according to a percentile of the highest scores of ChiSqSelector Key: SPARK-16843 URL: https://issues.apache.org/jira/browse/SPARK-16843 Project:

[jira] [Comment Edited] (SPARK-16826) java.util.Hashtable limits the throughput of PARSE_URL()

2016-08-01 Thread Sylvain Zimmer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403165#comment-15403165 ] Sylvain Zimmer edited comment on SPARK-16826 at 8/2/16 1:15 AM: [~srowen]

[jira] [Commented] (SPARK-16826) java.util.Hashtable limits the throughput of PARSE_URL()

2016-08-01 Thread Sylvain Zimmer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403165#comment-15403165 ] Sylvain Zimmer commented on SPARK-16826: [~srowen] thanks for the pointers! I'm parsing every

[jira] [Updated] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16842: - Description: If my understanding is correct, If the user-given schema is different with the

[jira] [Commented] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403139#comment-15403139 ] Hyukjin Kwon commented on SPARK-16842: -- Let me cc [~liancheng], [~smilegator] [~dongjoon] and

[jira] [Commented] (SPARK-16445) Multilayer Perceptron Classifier wrapper in SparkR

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403137#comment-15403137 ] Apache Spark commented on SPARK-16445: -- User 'keypointt' has created a pull request for this issue:

[jira] [Resolved] (SPARK-16828) remove MaxOf and MinOf

2016-08-01 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-16828. -- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14434

[jira] [Created] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-16842: Summary: Concern about disallowing user-given schema for Parquet and ORC Key: SPARK-16842 URL: https://issues.apache.org/jira/browse/SPARK-16842 Project: Spark

[jira] [Commented] (SPARK-16798) java.lang.IllegalArgumentException: bound must be positive : Worked in 1.5.2

2016-08-01 Thread Charles Allen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403116#comment-15403116 ] Charles Allen commented on SPARK-16798: --- Yep, still happens: {code} 16/08/02 00:41:17 INFO

[jira] [Commented] (SPARK-16802) joins.LongToUnsafeRowMap crashes with ArrayIndexOutOfBoundsException

2016-08-01 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403110#comment-15403110 ] Miao Wang commented on SPARK-16802: --- With latest code, it should have been fixed. I re-run the test

[jira] [Commented] (SPARK-16832) CrossValidator and TrainValidationSplit are not random without seed

2016-08-01 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403067#comment-15403067 ] Bryan Cutler commented on SPARK-16832: -- The default seed value is a constant, this is the trait

[jira] [Assigned] (SPARK-16841) Improves the row level metrics performance when reading Parquet table

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16841: Assignee: Apache Spark > Improves the row level metrics performance when reading Parquet

[jira] [Commented] (SPARK-16841) Improves the row level metrics performance when reading Parquet table

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403053#comment-15403053 ] Apache Spark commented on SPARK-16841: -- User 'clockfly' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16841) Improves the row level metrics performance when reading Parquet table

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16841: Assignee: (was: Apache Spark) > Improves the row level metrics performance when

[jira] [Updated] (SPARK-16841) Improves the row level metrics performance when reading Parquet table

2016-08-01 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-16841: --- Summary: Improves the row level metrics performance when reading Parquet table (was: Improve the

[jira] [Created] (SPARK-16841) Improve the row level metrics performance when reading Parquet table

2016-08-01 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-16841: -- Summary: Improve the row level metrics performance when reading Parquet table Key: SPARK-16841 URL: https://issues.apache.org/jira/browse/SPARK-16841 Project: Spark

[jira] [Assigned] (SPARK-16802) joins.LongToUnsafeRowMap crashes with ArrayIndexOutOfBoundsException

2016-08-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-16802: -- Assignee: Davies Liu > joins.LongToUnsafeRowMap crashes with ArrayIndexOutOfBoundsException >

[jira] [Commented] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402961#comment-15402961 ] Apache Spark commented on SPARK-16320: -- User 'clockfly' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16320: Assignee: (was: Apache Spark) > Spark 2.0 slower than 1.6 when querying nested

[jira] [Assigned] (SPARK-16320) Spark 2.0 slower than 1.6 when querying nested columns

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16320: Assignee: Apache Spark > Spark 2.0 slower than 1.6 when querying nested columns >

[jira] [Created] (SPARK-16840) Please save the aggregate term frequencies as part of the NaiveBayesModel

2016-08-01 Thread Barry Becker (JIRA)
Barry Becker created SPARK-16840: Summary: Please save the aggregate term frequencies as part of the NaiveBayesModel Key: SPARK-16840 URL: https://issues.apache.org/jira/browse/SPARK-16840 Project:

[jira] [Commented] (SPARK-16839) CleanupAliases may leave redundant aliases at end of analysis state

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402920#comment-15402920 ] Apache Spark commented on SPARK-16839: -- User 'eyalfa' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16839) CleanupAliases may leave redundant aliases at end of analysis state

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16839: Assignee: (was: Apache Spark) > CleanupAliases may leave redundant aliases at end of

[jira] [Assigned] (SPARK-16839) CleanupAliases may leave redundant aliases at end of analysis state

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16839: Assignee: Apache Spark > CleanupAliases may leave redundant aliases at end of analysis

[jira] [Created] (SPARK-16839) CleanupAliases may leave redundant aliases at end of analysis state

2016-08-01 Thread Eyal Farago (JIRA)
Eyal Farago created SPARK-16839: --- Summary: CleanupAliases may leave redundant aliases at end of analysis state Key: SPARK-16839 URL: https://issues.apache.org/jira/browse/SPARK-16839 Project: Spark

[jira] [Resolved] (SPARK-15869) HTTP 500 and NPE on streaming batch details page

2016-08-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-15869. -- Resolution: Fixed Assignee: Shixiong Zhu Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-16834) TrainValildationSplit and direct evaluation produce different scores

2016-08-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402847#comment-15402847 ] Sean Owen commented on SPARK-16834: --- Hm, I see. Is it due to the bug you found in

[jira] [Resolved] (SPARK-16548) java.io.CharConversionException: Invalid UTF-32 character prevents me from querying my data

2016-08-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16548. --- Resolution: Won't Fix > java.io.CharConversionException: Invalid UTF-32 character prevents me from

[jira] [Resolved] (SPARK-16495) Add ADMM optimizer in mllib package

2016-08-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16495. --- Resolution: Later > Add ADMM optimizer in mllib package > --- > >

[jira] [Resolved] (SPARK-16465) Add nonnegative flag to mllib ALS

2016-08-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16465. --- Resolution: Won't Fix > Add nonnegative flag to mllib ALS > - > >

[jira] [Resolved] (SPARK-16801) clearThreshold does not work for SparseVector

2016-08-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16801. --- Resolution: Not A Problem > clearThreshold does not work for SparseVector >

[jira] [Updated] (SPARK-16774) Fix use of deprecated TimeStamp constructor (also providing incorrect results)

2016-08-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-16774: -- Assignee: holdenk Priority: Minor (was: Major) > Fix use of deprecated TimeStamp constructor

[jira] [Comment Edited] (SPARK-7445) StringIndexer should handle binary labels properly

2016-08-01 Thread Ruben Janssen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402786#comment-15402786 ] Ruben Janssen edited comment on SPARK-7445 at 8/1/16 8:57 PM: -- I'd be

[jira] [Resolved] (SPARK-16774) Fix use of deprecated TimeStamp constructor (also providing incorrect results)

2016-08-01 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16774. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Commented] (SPARK-7445) StringIndexer should handle binary labels properly

2016-08-01 Thread Ruben Janssen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402786#comment-15402786 ] Ruben Janssen commented on SPARK-7445: -- I'd be interested to work on this. Before I start however,

[jira] [Commented] (SPARK-16700) StructType doesn't accept Python dicts anymore

2016-08-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402739#comment-15402739 ] Davies Liu commented on SPARK-16700: There are two separate problems here: 1) Spark 2.0 enforce data

[jira] [Assigned] (SPARK-15869) HTTP 500 and NPE on streaming batch details page

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15869: Assignee: Apache Spark > HTTP 500 and NPE on streaming batch details page >

[jira] [Assigned] (SPARK-15869) HTTP 500 and NPE on streaming batch details page

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-15869: Assignee: (was: Apache Spark) > HTTP 500 and NPE on streaming batch details page >

[jira] [Commented] (SPARK-15869) HTTP 500 and NPE on streaming batch details page

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402730#comment-15402730 ] Apache Spark commented on SPARK-15869: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Updated] (SPARK-16792) Dataset containing a Case Class with a List type causes a CompileException (converting sequence to list)

2016-08-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-16792: - Component/s: (was: Spark Core) SQL > Dataset containing a Case Class with a

[jira] [Commented] (SPARK-14559) Netty RPC didn't check channel is active before sending message

2016-08-01 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402712#comment-15402712 ] Shixiong Zhu commented on SPARK-14559: -- [~WangTao] Could you check the AM process? Looks like it's

[jira] [Updated] (SPARK-16836) Hive date/time function error

2016-08-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16836: Description: Previously available hive functions for date/time are not available in Spark 2.0

[jira] [Comment Edited] (SPARK-16798) java.lang.IllegalArgumentException: bound must be positive : Worked in 1.5.2

2016-08-01 Thread Charles Allen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402676#comment-15402676 ] Charles Allen edited comment on SPARK-16798 at 8/1/16 7:30 PM: --- Minor

[jira] [Commented] (SPARK-16798) java.lang.IllegalArgumentException: bound must be positive : Worked in 1.5.2

2016-08-01 Thread Charles Allen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402676#comment-15402676 ] Charles Allen commented on SPARK-16798: --- Minor update. Due to library collisions I have to change

[jira] [Assigned] (SPARK-16836) Hive date/time function error

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16836: Assignee: Apache Spark > Hive date/time function error > - >

[jira] [Commented] (SPARK-16836) Hive date/time function error

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402635#comment-15402635 ] Apache Spark commented on SPARK-16836: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16836) Hive date/time function error

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16836: Assignee: (was: Apache Spark) > Hive date/time function error >

[jira] [Updated] (SPARK-16837) TimeWindow incorrectly drops slideDuration in constructors

2016-08-01 Thread Tom Magrino (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Magrino updated SPARK-16837: Description: Right now, the constructors for the TimeWindow expression in Catalyst incorrectly

[jira] [Assigned] (SPARK-16837) TimeWindow incorrectly drops slideDuration in constructors

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16837: Assignee: (was: Apache Spark) > TimeWindow incorrectly drops slideDuration in

[jira] [Commented] (SPARK-16837) TimeWindow incorrectly drops slideDuration in constructors

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402600#comment-15402600 ] Apache Spark commented on SPARK-16837: -- User 'tmagrino' has created a pull request for this issue:

[jira] [Assigned] (SPARK-16837) TimeWindow incorrectly drops slideDuration in constructors

2016-08-01 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16837: Assignee: Apache Spark > TimeWindow incorrectly drops slideDuration in constructors >

[jira] [Commented] (SPARK-16768) pyspark calls incorrect version of logistic regression

2016-08-01 Thread Colin Beckingham (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402599#comment-15402599 ] Colin Beckingham commented on SPARK-16768: -- Sean said "If you mean the calling stack trace..." -

[jira] [Commented] (SPARK-16775) Reduce internal warnings from deprecated accumulator API

2016-08-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402592#comment-15402592 ] holdenk commented on SPARK-16775: - Yes so my plan is to replace it with the new API in all of the places
