[jira] [Commented] (SPARK-7520) Install Jekyll On Jenkins Machines

2015-05-11 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538491#comment-14538491 ] Shivaram Venkataraman commented on SPARK-7520: -- The R docs look fine to me h

[jira] [Updated] (SPARK-7529) Java compatibility check for MLlib 1.4

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7529: - Description: Check Java compatibility for MLlib 1.4. We should create separate JIRAs for

[jira] [Created] (SPARK-7536) Audit MLlib Python API for 1.4

2015-05-11 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-7536: Summary: Audit MLlib Python API for 1.4 Key: SPARK-7536 URL: https://issues.apache.org/jira/browse/SPARK-7536 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-7535) Audit Pipeline APIs for 1.4

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7535: - Summary: Audit Pipeline APIs for 1.4 (was: Audit Pipeline APIs) > Audit Pipeline APIs for

[jira] [Updated] (SPARK-7443) MLlib 1.4 QA plan

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7443: - Description: TODO: create JIRAs for each task and assign them accordingly. h2. API * Che

[jira] [Updated] (SPARK-7537) Audit new public Scala APIs for MLlib 1.4

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7537: - Description: Audit new public Scala APIs added to MLlib in 1.4. Take note of: * Protected

[jira] [Created] (SPARK-7537) Audit new public Scala APIs for MLlib 1.4

2015-05-11 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-7537: Summary: Audit new public Scala APIs for MLlib 1.4 Key: SPARK-7537 URL: https://issues.apache.org/jira/browse/SPARK-7537 Project: Spark Issue Type: S

[jira] [Updated] (SPARK-7443) MLlib 1.4 QA plan

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7443: - Description: TODO: create JIRAs for each task and assign them accordingly. h2. API * Che

[jira] [Updated] (SPARK-7536) Audit MLlib Python API for 1.4

2015-05-11 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7536: - Assignee: Yanbo Liang > Audit MLlib Python API for 1.4 > -- > >

[jira] [Updated] (SPARK-7535) Audit Pipeline APIs for 1.4

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7535: - Description: This is an umbrella for auditing the Pipeline (spark.ml) APIs. Items to chec

[jira] [Created] (SPARK-7538) Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Lee McFadden (JIRA)
Lee McFadden created SPARK-7538: --- Summary: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge Key: SPARK-7538 URL: https://issues.apache.org/jira/browse/SPARK-7538 Project: Spar

[jira] [Updated] (SPARK-7443) MLlib 1.4 QA plan

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7443: - Description: TODO: create JIRAs for each task and assign them accordingly. h2. API * Che

[jira] [Updated] (SPARK-7443) MLlib 1.4 QA plan

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7443: - Description: TODO: create JIRAs for each task and assign them accordingly. h2. API * Che

[jira] [Updated] (SPARK-7443) MLlib 1.4 QA plan

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7443: - Description: TODO: create JIRAs for each task and assign them accordingly. h2. API * Che

[jira] [Created] (SPARK-7539) Perf tests for Python MLlib

2015-05-11 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-7539: Summary: Perf tests for Python MLlib Key: SPARK-7539 URL: https://issues.apache.org/jira/browse/SPARK-7539 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-7443) MLlib 1.4 QA plan

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7443: - Description: TODO: create JIRAs for each task and assign them accordingly. h2. API * Che

[jira] [Updated] (SPARK-7443) MLlib 1.4 QA plan

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7443: - Description: TODO: create JIRAs for each task and assign them accordingly. h2. API * Che

[jira] [Created] (SPARK-7540) PMML correctness check

2015-05-11 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-7540: Summary: PMML correctness check Key: SPARK-7540 URL: https://issues.apache.org/jira/browse/SPARK-7540 Project: Spark Issue Type: Sub-task C

[jira] [Updated] (SPARK-7443) MLlib 1.4 QA plan

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7443: - Description: TODO: create JIRAs for each task and assign them accordingly. h2. API * Che

[jira] [Created] (SPARK-7541) Check model save/load for MLlib 1.4

2015-05-11 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-7541: Summary: Check model save/load for MLlib 1.4 Key: SPARK-7541 URL: https://issues.apache.org/jira/browse/SPARK-7541 Project: Spark Issue Type: Sub-tas

[jira] [Updated] (SPARK-7443) MLlib 1.4 QA plan

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7443: - Description: TODO: create JIRAs for each task and assign them accordingly. h2. API * Che

[jira] [Updated] (SPARK-7443) MLlib 1.4 QA plan

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7443: - Description: TODO: create JIRAs for each task and assign them accordingly. h2. API * Che

[jira] [Resolved] (SPARK-7508) JettyUtils-generated servlets to log & report all errors

2015-05-11 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-7508. Resolution: Fixed Fix Version/s: 1.4.0 Assignee: Steve Loughran > JettyUtils

[jira] [Commented] (SPARK-7538) Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538613#comment-14538613 ] Sean Owen commented on SPARK-7538: -- This usually indicates classpath problems with the ap

[jira] [Commented] (SPARK-7462) By default retain group by columns in aggregate

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538657#comment-14538657 ] Apache Spark commented on SPARK-7462: - User 'rxin' has created a pull request for this

[jira] [Updated] (SPARK-7515) Update documentation for PySpark on YARN with cluster mode

2015-05-11 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated SPARK-7515: -- Assignee: Kousuke Saruta > Update documentation for PySpark on YARN with cluster mode >

[jira] [Resolved] (SPARK-7515) Update documentation for PySpark on YARN with cluster mode

2015-05-11 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza resolved SPARK-7515. --- Resolution: Fixed Fix Version/s: 1.5.0 Target Version/s: (was: 1.4.0) > Update docu

[jira] [Commented] (SPARK-7200) Tungsten test suites should fail if memory leak is detected

2015-05-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538674#comment-14538674 ] Josh Rosen commented on SPARK-7200: --- We should consider how we want to handle memory lea

[jira] [Resolved] (SPARK-7516) Replace deprecated Data Frame api in Python Docs

2015-05-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-7516. Resolution: Fixed Assignee: Guancheng Chen > Replace deprecated Data Frame api in Python Docs

[jira] [Created] (SPARK-7542) Use LongArray for sort buffer in UnsafeExternalSorter

2015-05-11 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-7542: - Summary: Use LongArray for sort buffer in UnsafeExternalSorter Key: SPARK-7542 URL: https://issues.apache.org/jira/browse/SPARK-7542 Project: Spark Issue Type: Imp

[jira] [Assigned] (SPARK-7360) Compare Pyrolite performance affected by useMemo

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7360: --- Assignee: Apache Spark (was: Nicholas Chammas) > Compare Pyrolite performance affected by us

[jira] [Assigned] (SPARK-7360) Compare Pyrolite performance affected by useMemo

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7360: --- Assignee: Nicholas Chammas (was: Apache Spark) > Compare Pyrolite performance affected by us

[jira] [Commented] (SPARK-7360) Compare Pyrolite performance affected by useMemo

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538727#comment-14538727 ] Apache Spark commented on SPARK-7360: - User 'mengxr' has created a pull request for th

[jira] [Commented] (SPARK-7088) [REGRESSION] Spark 1.3.1 breaks analysis of third-party logical plans

2015-05-11 Thread Santiago M. Mola (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538734#comment-14538734 ] Santiago M. Mola commented on SPARK-7088: - Any thoughts on this? > [REGRESSION] S

[jira] [Commented] (SPARK-6743) Join with empty projection on one side produces invalid results

2015-05-11 Thread Santiago M. Mola (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538737#comment-14538737 ] Santiago M. Mola commented on SPARK-6743: - Any thoughts on this? > Join with empt

[jira] [Created] (SPARK-7543) Break dataframe.py into multiple files

2015-05-11 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-7543: -- Summary: Break dataframe.py into multiple files Key: SPARK-7543 URL: https://issues.apache.org/jira/browse/SPARK-7543 Project: Spark Issue Type: Sub-task

[jira] [Resolved] (SPARK-7462) By default retain group by columns in aggregate

2015-05-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-7462. Resolution: Fixed Fix Version/s: 1.4.0 > By default retain group by columns in aggregate > --

[jira] [Resolved] (SPARK-7280) Add a method for dropping a column in Java/Scala

2015-05-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-7280. Resolution: Fixed > Add a method for dropping a column in Java/Scala > -

[jira] [Commented] (SPARK-7410) Add option to avoid broadcasting configuration with newAPIHadoopFile

2015-05-11 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538784#comment-14538784 ] Sandy Ryza commented on SPARK-7410: --- Thanks for the pointer, [~joshrosen]. Looked over

[jira] [Commented] (SPARK-7410) Add option to avoid broadcasting configuration with newAPIHadoopFile

2015-05-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538813#comment-14538813 ] Josh Rosen commented on SPARK-7410: --- The correctness issue might be slightly overblown:

[jira] [Created] (SPARK-7544) pyspark.sql.types.Row should implement __getitem__

2015-05-11 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-7544: --- Summary: pyspark.sql.types.Row should implement __getitem__ Key: SPARK-7544 URL: https://issues.apache.org/jira/browse/SPARK-7544 Project: Spark Issue

[jira] [Commented] (SPARK-7544) pyspark.sql.types.Row should implement __getitem__

2015-05-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538830#comment-14538830 ] Nicholas Chammas commented on SPARK-7544: - cc [~rxin], [~davies] > pyspark.sql.ty

[jira] [Commented] (SPARK-7133) Implement struct, array, and map field accessor using apply in Scala and __getitem__ in Python

2015-05-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538832#comment-14538832 ] Nicholas Chammas commented on SPARK-7133: - [SPARK-7544] > Implement struct, array

[jira] [Comment Edited] (SPARK-7133) Implement struct, array, and map field accessor using apply in Scala and __getitem__ in Python

2015-05-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538832#comment-14538832 ] Nicholas Chammas edited comment on SPARK-7133 at 5/11/15 11:02 PM: -

[jira] [Commented] (SPARK-7324) Add DataFrame.dropDuplicates

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538879#comment-14538879 ] Apache Spark commented on SPARK-7324: - User 'rxin' has created a pull request for this

[jira] [Created] (SPARK-7545) Bernoulli NaiveBayes should validate data

2015-05-11 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-7545: Summary: Bernoulli NaiveBayes should validate data Key: SPARK-7545 URL: https://issues.apache.org/jira/browse/SPARK-7545 Project: Spark Issue Type: I

[jira] [Updated] (SPARK-7545) Bernoulli NaiveBayes should validate data

2015-05-11 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7545: - Assignee: Leah McGuire > Bernoulli NaiveBayes should validate data > -

[jira] [Updated] (SPARK-7545) Bernoulli NaiveBayes should validate data

2015-05-11 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7545: - Affects Version/s: 1.4.0 > Bernoulli NaiveBayes should validate data > ---

[jira] [Commented] (SPARK-7545) Bernoulli NaiveBayes should validate data

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538892#comment-14538892 ] Joseph K. Bradley commented on SPARK-7545: -- [~lmcguire] Would you be able to add

[jira] [Created] (SPARK-7546) Add ML Pipelines example with complex feature transformations

2015-05-11 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-7546: Summary: Add ML Pipelines example with complex feature transformations Key: SPARK-7546 URL: https://issues.apache.org/jira/browse/SPARK-7546 Project: Spark

[jira] [Created] (SPARK-7547) ElasticNet example code

2015-05-11 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-7547: Summary: ElasticNet example code Key: SPARK-7547 URL: https://issues.apache.org/jira/browse/SPARK-7547 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-7547) ElasticNet example code

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-7547: - Assignee: DB Tsai > ElasticNet example code > --- > >

[jira] [Commented] (SPARK-7547) ElasticNet example code

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538906#comment-14538906 ] Joseph K. Bradley commented on SPARK-7547: -- [~dbtsai] Would you mind adding an ex

[jira] [Created] (SPARK-7548) Add explode expression

2015-05-11 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-7548: -- Summary: Add explode expression Key: SPARK-7548 URL: https://issues.apache.org/jira/browse/SPARK-7548 Project: Spark Issue Type: Sub-task Components: S

[jira] [Created] (SPARK-7549) Support aggregating over nested fields

2015-05-11 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-7549: -- Summary: Support aggregating over nested fields Key: SPARK-7549 URL: https://issues.apache.org/jira/browse/SPARK-7549 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-5707) Enabling spark.sql.codegen throws ClassNotFound exception

2015-05-11 Thread Nathan McCarthy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan McCarthy updated SPARK-5707: --- Affects Version/s: 1.3.1 > Enabling spark.sql.codegen throws ClassNotFound exception > ---

[jira] [Created] (SPARK-7550) Support setting the right schema & serde when writing to Hive metastore

2015-05-11 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-7550: -- Summary: Support setting the right schema & serde when writing to Hive metastore Key: SPARK-7550 URL: https://issues.apache.org/jira/browse/SPARK-7550 Project: Spark

[jira] [Commented] (SPARK-5707) Enabling spark.sql.codegen throws ClassNotFound exception

2015-05-11 Thread Nathan McCarthy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538926#comment-14538926 ] Nathan McCarthy commented on SPARK-5707: Also throws an error when switching to J

[jira] [Created] (SPARK-7551) Don't split by dot if within backticks for DataFrame attribute resolution

2015-05-11 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-7551: -- Summary: Don't split by dot if within backticks for DataFrame attribute resolution Key: SPARK-7551 URL: https://issues.apache.org/jira/browse/SPARK-7551 Project: Spark

[jira] [Commented] (SPARK-7551) Don't split by dot if within backticks for DataFrame attribute resolution

2015-05-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538930#comment-14538930 ] Reynold Xin commented on SPARK-7551: [~cloud_fan] do you have time for this? Would be

[jira] [Updated] (SPARK-7551) Don't split by dot if within backticks for DataFrame attribute resolution

2015-05-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7551: --- Priority: Critical (was: Major) > Don't split by dot if within backticks for DataFrame attribute reso

[jira] [Commented] (SPARK-7548) Add explode expression

2015-05-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538937#comment-14538937 ] Nicholas Chammas commented on SPARK-7548: - To provide a motivating example for the

[jira] [Assigned] (SPARK-7550) Support setting the right schema & serde when writing to Hive metastore

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7550: --- Assignee: Apache Spark > Support setting the right schema & serde when writing to Hive metast

[jira] [Assigned] (SPARK-7550) Support setting the right schema & serde when writing to Hive metastore

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7550: --- Assignee: (was: Apache Spark) > Support setting the right schema & serde when writing to

[jira] [Commented] (SPARK-7550) Support setting the right schema & serde when writing to Hive metastore

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538936#comment-14538936 ] Apache Spark commented on SPARK-7550: - User 'rxin' has created a pull request for this

[jira] [Updated] (SPARK-7509) Add drop column to Python DataFrame API

2015-05-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-7509: Target Version/s: 1.4.0 I'm targeting this for 1.4.0, though that's optimistic given that we

[jira] [Commented] (SPARK-7549) Support aggregating over nested fields

2015-05-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538965#comment-14538965 ] Nicholas Chammas commented on SPARK-7549: - To provide a motivating example for the

[jira] [Assigned] (SPARK-7509) Add drop column to Python DataFrame API

2015-05-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reassigned SPARK-7509: -- Assignee: Reynold Xin > Add drop column to Python DataFrame API > -

[jira] [Commented] (SPARK-7509) Add drop column to Python DataFrame API

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538977#comment-14538977 ] Apache Spark commented on SPARK-7509: - User 'rxin' has created a pull request for this

[jira] [Assigned] (SPARK-7509) Add drop column to Python DataFrame API

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7509: --- Assignee: Reynold Xin (was: Apache Spark) > Add drop column to Python DataFrame API > --

[jira] [Assigned] (SPARK-7509) Add drop column to Python DataFrame API

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7509: --- Assignee: Apache Spark (was: Reynold Xin) > Add drop column to Python DataFrame API > --

[jira] [Commented] (SPARK-7509) Add drop column to Python DataFrame API

2015-05-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538978#comment-14538978 ] Nicholas Chammas commented on SPARK-7509: - Oh, well nevermind then. :) > Add drop

[jira] [Updated] (SPARK-7269) Incorrect aggregation analysis

2015-05-11 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7269: -- Description: In a case-insensitive analyzer (HiveContext), the attribute name capital differences will

[jira] [Commented] (SPARK-7413) Time to write shuffle spill files is not captured in ShuffleWriteMetrics

2015-05-11 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539067#comment-14539067 ] Josh Rosen commented on SPARK-7413: --- Actually, it looks like we sort-of try to do this i

[jira] [Resolved] (SPARK-5893) Add Bucketizer

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-5893. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5980 [https

[jira] [Updated] (SPARK-5893) Add Bucketizer

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-5893: - Assignee: Xusen Yin (was: Joseph K. Bradley) > Add Bucketizer > -- > >

[jira] [Commented] (SPARK-7545) Bernoulli NaiveBayes should validate data

2015-05-11 Thread Leah McGuire (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539078#comment-14539078 ] Leah McGuire commented on SPARK-7545: - Yes, I think I can get it in. > Bernoulli Nai

[jira] [Commented] (SPARK-7545) Bernoulli NaiveBayes should validate data

2015-05-11 Thread Leah McGuire (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539077#comment-14539077 ] Leah McGuire commented on SPARK-7545: - I think I can get it in :-) On Mon, May 11, 20

[jira] [Issue Comment Deleted] (SPARK-7545) Bernoulli NaiveBayes should validate data

2015-05-11 Thread Leah McGuire (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leah McGuire updated SPARK-7545: Comment: was deleted (was: Yes, I think I can get it in. ) > Bernoulli NaiveBayes should validate d

[jira] [Commented] (SPARK-7545) Bernoulli NaiveBayes should validate data

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539081#comment-14539081 ] Joseph K. Bradley commented on SPARK-7545: -- OK, I appreciate it! > Bernoulli Nai

[jira] [Created] (SPARK-7552) Close files correctly when iteration is finished in WAL recovery

2015-05-11 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-7552: -- Summary: Close files correctly when iteration is finished in WAL recovery Key: SPARK-7552 URL: https://issues.apache.org/jira/browse/SPARK-7552 Project: Spark I

[jira] [Commented] (SPARK-7540) PMML correctness check

2015-05-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539085#comment-14539085 ] Joseph K. Bradley commented on SPARK-7540: -- [~selvinsource] How much of this JIR

[jira] [Resolved] (SPARK-7530) Add API to get the current state of a StreamingContext

2015-05-11 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-7530. -- Resolution: Fixed Fix Version/s: 1.4.0 > Add API to get the current state of a StreamingC

[jira] [Updated] (SPARK-7551) Don't split by dot if within backticks for DataFrame attribute resolution

2015-05-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7551: --- Description: DataFrame's resolve: {code} protected[sql] def resolve(colName: String): NamedExpressio

[jira] [Resolved] (SPARK-7538) Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-7538. Resolution: Fixed This was a cross post from the mailing list. The poster closed the thread

[jira] [Resolved] (SPARK-7331) Create HiveConf per application instead of per query in HiveQl.scala

2015-05-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-7331. - Resolution: Fixed Fix Version/s: 1.2.3 Issue resolved by pull request 6036 [https:/

[jira] [Updated] (SPARK-7320) Add rollup and cube support to DataFrame DSL

2015-05-11 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7320: -- Assignee: Cheng Hao > Add rollup and cube support to DataFrame DSL > ---

[jira] [Updated] (SPARK-7150) SQLContext.range()

2015-05-11 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7150: -- Assignee: Adrian Wang > SQLContext.range() > -- > > Key: SPARK-7150 >

[jira] [Updated] (SPARK-7322) Add DataFrame DSL for window function support

2015-05-11 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-7322: -- Assignee: Cheng Hao > Add DataFrame DSL for window function support > --

[jira] [Resolved] (SPARK-7520) Install Jekyll On Jenkins Machines

2015-05-11 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-7520. Resolution: Fixed Fix Version/s: 1.4.0 All green - awesome thanks [~shaneknapp]! > I

[jira] [Updated] (SPARK-6876) DataFrame.na.replace value support for Python

2015-05-11 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-6876: -- Assignee: Adrian Wang > DataFrame.na.replace value support for Python >

[jira] [Commented] (SPARK-7531) Install GPG on Jenkins machines

2015-05-11 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539097#comment-14539097 ] Patrick Wendell commented on SPARK-7531: Yep - that one should work. I've actually

[jira] [Resolved] (SPARK-7324) Add DataFrame.dropDuplicates

2015-05-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-7324. - Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 6066 [https:/

[jira] [Updated] (SPARK-7324) Add DataFrame.dropDuplicates

2015-05-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-7324: Assignee: Reynold Xin > Add DataFrame.dropDuplicates > > >

[jira] [Resolved] (SPARK-7411) CTAS parser is incomplete

2015-05-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-7411. - Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5963 [https:/

[jira] [Resolved] (SPARK-7437) Fold "literal in (item1, item2, ..., literal, ...)" into true or false directly

2015-05-11 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-7437. - Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5972 [https:/

[jira] [Updated] (SPARK-7553) Add methods to maintain a singleton StreamingContext

2015-05-11 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-7553: - Description: In a REPL/notebook environment, it's very easy to lose a reference to a StreamingCont

[jira] [Created] (SPARK-7553) Add methods to maintain a singleton StreamingContext

2015-05-11 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-7553: Summary: Add methods to maintain a singleton StreamingContext Key: SPARK-7553 URL: https://issues.apache.org/jira/browse/SPARK-7553 Project: Spark Issue Typ

[jira] [Commented] (SPARK-4128) Create instructions on fully building Spark in Intellij

2015-05-11 Thread Christian Kadner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539122#comment-14539122 ] Christian Kadner commented on SPARK-4128: - Hi Patrick, I recently set up my Intel

[jira] [Assigned] (SPARK-7552) Close files correctly when iteration is finished in WAL recovery

2015-05-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7552: --- Assignee: Apache Spark > Close files correctly when iteration is finished in WAL recovery > -
