[jira] [Commented] (SPARK-5288) Stabilize Spark SQL data type API followup

2015-01-28 Thread Peter Rudenko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295128#comment-14295128 ] Peter Rudenko commented on SPARK-5288: NumericType should be public. Here's a use

[jira] [Resolved] (SPARK-5097) Adding data frame APIs to SchemaRDD

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-5097. Resolution: Fixed Fix Version/s: 1.3.0 Adding data frame APIs to SchemaRDD

[jira] [Commented] (SPARK-5439) Expose yarn app id for yarn mode

2015-01-28 Thread Chengxiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294872#comment-14294872 ] Chengxiang Li commented on SPARK-5439: I think the gap here is that, when launch a

[jira] [Commented] (SPARK-5426) SQL Java API helper methods

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294896#comment-14294896 ] Apache Spark commented on SPARK-5426: User 'kul' has created a pull request for this

[jira] [Commented] (SPARK-5428) Declare the 'assembly' module at the bottom of the modules element in the parent POM

2015-01-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294935#comment-14294935 ] Patrick Wendell commented on SPARK-5428: [~tzolov] Do you mind explaining a bit

[jira] [Comment Edited] (SPARK-5288) Stabilize Spark SQL data type API followup

2015-01-28 Thread Peter Rudenko (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295128#comment-14295128 ] Peter Rudenko edited comment on SPARK-5288 at 1/28/15 1:25 PM:

[jira] [Commented] (SPARK-5097) Adding data frame APIs to SchemaRDD

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294846#comment-14294846 ] Apache Spark commented on SPARK-5097: User 'rxin' has created a pull request for this

[jira] [Updated] (SPARK-5426) SQL Java API helper methods

2015-01-28 Thread Kuldeep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuldeep updated SPARK-5426: Summary: SQL Java API helper methods (was: SQL Java API helpe) SQL Java API helper methods

[jira] [Updated] (SPARK-5426) SQL Java API helpe

2015-01-28 Thread Kuldeep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuldeep updated SPARK-5426: Summary: SQL Java API helpe (was: SQL Ja) SQL Java API helpe Key:

[jira] [Created] (SPARK-5449) What happened to RDD's join transformation?

2015-01-28 Thread Abou Haydar Elias (JIRA)
Abou Haydar Elias created SPARK-5449: Summary: What happened to RDD's join transformation? Key: SPARK-5449 URL: https://issues.apache.org/jira/browse/SPARK-5449 Project: Spark Issue

[jira] [Updated] (SPARK-5426) SQL Ja

2015-01-28 Thread Kuldeep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuldeep updated SPARK-5426: Summary: SQL Ja (was: SchemaRDD is java incompatible) SQL Ja Key: SPARK-5426

[jira] [Commented] (SPARK-5420) Cross-langauge load/store functions for creating and saving DataFrames

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294888#comment-14294888 ] Reynold Xin commented on SPARK-5420: cc [~yhuai] Table is probably not the best name

[jira] [Updated] (SPARK-5426) SQL Java API helper methods

2015-01-28 Thread Kuldeep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuldeep updated SPARK-5426: Description: DataFrame previously SchemaRDD is not directly java compatible. But this does seems a bit odd as

[jira] [Updated] (SPARK-5426) SQL Java API helper methods

2015-01-28 Thread Kuldeep (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuldeep updated SPARK-5426: Description: DataFrame previously SchemaRDD is not directly java compatible. But (was: Here is a sample

[jira] [Resolved] (SPARK-5452) We are migrating Tera Data SQL to Spark SQL. Query is taking long time. Please have a look on this issue

2015-01-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5452. Resolution: Invalid I think this is another question that is appropriate on the user@ list. JIRA would

[jira] [Created] (SPARK-5452) We are migrating Tera Data SQL to Spark SQL. Query is taking long time. Please have a look on this issue

2015-01-28 Thread irfan (JIRA)
irfan created SPARK-5452: Summary: We are migrating Tera Data SQL to Spark SQL. Query is taking long time. Please have a look on this issue Key: SPARK-5452 URL: https://issues.apache.org/jira/browse/SPARK-5452

[jira] [Commented] (SPARK-5324) Results of describe can't be queried

2015-01-28 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294847#comment-14294847 ] Yanbo Liang commented on SPARK-5324: [~OopsOutOfMemory] Thanks for your comments. I

[jira] [Created] (SPARK-5447) Replace reference to SchemaRDD with DataFrame

2015-01-28 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-5447: Summary: Replace reference to SchemaRDD with DataFrame Key: SPARK-5447 URL: https://issues.apache.org/jira/browse/SPARK-5447 Project: Spark Issue Type: Sub-task

[jira] [Resolved] (SPARK-5449) What happened to RDD's join transformation?

2015-01-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5449. Resolution: Invalid Questions are better for the user@ list; JIRA is for reporting issues. It didn't go

[jira] [Commented] (SPARK-5446) Parquet column pruning should work for Map and Struct

2015-01-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294943#comment-14294943 ] Cheng Lian commented on SPARK-5446: No, I believe it's irrelevant. Parquet column

[jira] [Commented] (SPARK-5450) Add APIs to save a graph as a SequenceFile and load it

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294949#comment-14294949 ] Apache Spark commented on SPARK-5450: User 'maropu' has created a pull request for

[jira] [Resolved] (SPARK-5415) Upgrade sbt to 0.13.7

2015-01-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5415. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Ryan Williams Upgrade sbt

[jira] [Updated] (SPARK-5341) Support maven coordinates in spark-shell and spark-submit

2015-01-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5341: Priority: Critical (was: Major) Support maven coordinates in spark-shell and spark-submit

[jira] [Closed] (SPARK-5061) SQLContext: overload createParquetFile

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-5061. Resolution: Not a Problem Per PR discussion, this can be done with just saveAsParquetFile on a

[jira] [Created] (SPARK-5446) Parquet column pruning should work for Map and Struct

2015-01-28 Thread Jianshi Huang (JIRA)
Jianshi Huang created SPARK-5446: Summary: Parquet column pruning should work for Map and Struct Key: SPARK-5446 URL: https://issues.apache.org/jira/browse/SPARK-5446 Project: Spark Issue

[jira] [Commented] (SPARK-5447) Replace reference to SchemaRDD with DataFrame

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294882#comment-14294882 ] Apache Spark commented on SPARK-5447: User 'rxin' has created a pull request for this

[jira] [Commented] (SPARK-5420) Cross-langauge load/store functions for creating and saving DataFrames

2015-01-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294928#comment-14294928 ] Patrick Wendell commented on SPARK-5420: How about just load and store then?

[jira] [Resolved] (SPARK-5144) spark-yarn module should be published

2015-01-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5144. Resolution: Duplicate spark-yarn module should be published

[jira] [Updated] (SPARK-4574) Adding support for defining schema in foreign DDL commands.

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-4574: Assignee: wangfei Adding support for defining schema in foreign DDL commands.

[jira] [Updated] (SPARK-4574) Adding support for defining schema in foreign DDL commands.

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-4574: Description: Adding support for defining schema in foreign DDL commands. Now foreign DDL support

[jira] [Resolved] (SPARK-4809) Improve Guava shading in Spark

2015-01-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4809. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Marcelo Vanzin Improve

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-28 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294884#comment-14294884 ] Xiangrui Meng commented on SPARK-4846: We should throw a RuntimeException before

[jira] [Commented] (SPARK-5449) What happened to RDD's join transformation?

2015-01-28 Thread Abou Haydar Elias (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294944#comment-14294944 ] Abou Haydar Elias commented on SPARK-5449: Thanks a lot [~srowen]! I'm sorry I

[jira] [Created] (SPARK-5450) Add APIs to save a graph as a SequenceFile and load it

2015-01-28 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-5450: Summary: Add APIs to save a graph as a SequenceFile and load it Key: SPARK-5450 URL: https://issues.apache.org/jira/browse/SPARK-5450 Project: Spark

[jira] [Issue Comment Deleted] (SPARK-5324) Results of describe can't be queried

2015-01-28 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang updated SPARK-5324: Comment: was deleted (was: https://github.com/apache/spark/pull/4207) Results of describe can't be

[jira] [Created] (SPARK-5448) Make CacheManager a concrete class and field in SQLContext

2015-01-28 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-5448: Summary: Make CacheManager a concrete class and field in SQLContext Key: SPARK-5448 URL: https://issues.apache.org/jira/browse/SPARK-5448 Project: Spark Issue

[jira] [Commented] (SPARK-5446) Parquet column pruning should work for Map and Struct

2015-01-28 Thread Yi Tian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294925#comment-14294925 ] Yi Tian commented on SPARK-5446: Is it related to

[jira] [Created] (SPARK-5453) Use hive-site.xml to set class for adding custom filter for input files

2015-01-28 Thread Yash Datta (JIRA)
Yash Datta created SPARK-5453: Summary: Use hive-site.xml to set class for adding custom filter for input files Key: SPARK-5453 URL: https://issues.apache.org/jira/browse/SPARK-5453 Project: Spark

[jira] [Created] (SPARK-5454) [SQL] Self join with ArrayType columns problems

2015-01-28 Thread Pierre Borckmans (JIRA)
Pierre Borckmans created SPARK-5454: Summary: [SQL] Self join with ArrayType columns problems Key: SPARK-5454 URL: https://issues.apache.org/jira/browse/SPARK-5454 Project: Spark Issue

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295011#comment-14295011 ] Apache Spark commented on SPARK-4846: User 'jinntrance' has created a pull request

[jira] [Commented] (SPARK-3872) Rewrite the test for ActorInputStream.

2015-01-28 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295077#comment-14295077 ] Prashant Sharma commented on SPARK-3872: For reasons outlined here

[jira] [Comment Edited] (SPARK-3872) Rewrite the test for ActorInputStream.

2015-01-28 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295077#comment-14295077 ] Prashant Sharma edited comment on SPARK-3872 at 1/28/15 11:57 AM:

[jira] [Commented] (SPARK-3872) Rewrite the test for ActorInputStream.

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295081#comment-14295081 ] Apache Spark commented on SPARK-3872: User 'ScrapCodes' has created a pull request

[jira] [Created] (SPARK-5455) Add MultipleTransformer abstract class

2015-01-28 Thread Peter Rudenko (JIRA)
Peter Rudenko created SPARK-5455: Summary: Add MultipleTransformer abstract class Key: SPARK-5455 URL: https://issues.apache.org/jira/browse/SPARK-5455 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-28 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295020#comment-14295020 ] Joseph Tang commented on SPARK-4846: OK. I've sent a new PR as below. When the

[jira] [Issue Comment Deleted] (SPARK-5135) Add support for describe [extended] table to DDL in SQLContext

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5135: Comment: was deleted (was: User 'OopsOutOfMemory' has created a pull request for this issue:

[jira] [Issue Comment Deleted] (SPARK-5135) Add support for describe [extended] table to DDL in SQLContext

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5135: Comment: was deleted (was: User 'rxin' has created a pull request for this issue:

[jira] [Commented] (SPARK-5428) Declare the 'assembly' module at the bottom of the modules element in the parent POM

2015-01-28 Thread Christian Tzolov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294979#comment-14294979 ] Christian Tzolov commented on SPARK-5428: [~pwendell] it is a bit confusing that

[jira] [Commented] (SPARK-5452) We are migrating Tera Data SQL to Spark SQL. Query is taking long time. Please have a look on this issue

2015-01-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294995#comment-14294995 ] Sean Owen commented on SPARK-5452: [~Irfan123] I do not think this is suitable for JIRA,

[jira] [Commented] (SPARK-1444) Update branch-0.9's SBT to 0.13.1 so that it works with Java 8

2015-01-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295068#comment-14295068 ] Sean Owen commented on SPARK-1444: FWIW I quickly tried updating branch 0.9 this way,

[jira] [Commented] (SPARK-5428) Declare the 'assembly' module at the bottom of the modules element in the parent POM

2015-01-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294998#comment-14294998 ] Sean Owen commented on SPARK-5428: Of course, I suppose this depends on the idea that

[jira] [Comment Edited] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-28 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295020#comment-14295020 ] Joseph Tang edited comment on SPARK-4846 at 1/28/15 11:26 AM:

[jira] [Commented] (SPARK-5324) Results of describe can't be queried

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295085#comment-14295085 ] Apache Spark commented on SPARK-5324: User 'OopsOutOfMemory' has created a pull

[jira] [Reopened] (SPARK-5452) We are migrating Tera Data SQL to Spark SQL. Query is taking long time. Please have a look on this issue

2015-01-28 Thread irfan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] irfan reopened SPARK-5452: please provide some inputs. this looks like performance issue considering the configuration and environment. We

[jira] [Updated] (SPARK-5413) Upgrade metrics dependency to 3.1.0

2015-01-28 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5413: Target Version/s: 1.3.0 Upgrade metrics dependency to 3.1.0

[jira] [Closed] (SPARK-5459) The reference of combineByKey in the programming guide should be replaced by aggregateByKey

2015-01-28 Thread Juliet Hougland (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Juliet Hougland closed SPARK-5459. Resolution: Duplicate The reference of combineByKey in the programming guide should be

[jira] [Updated] (SPARK-5109) Loading multiple parquet files into a single SchemaRDD

2015-01-28 Thread Sam Steingold (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Steingold updated SPARK-5109: Summary: Loading multiple parquet files into a single SchemaRDD (was: Loading multiple parquet

[jira] [Commented] (SPARK-5395) Large number of Python workers causing resource depletion

2015-01-28 Thread Sven Krasser (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295409#comment-14295409 ] Sven Krasser commented on SPARK-5395: Thanks Davies! Large number of Python workers

[jira] [Commented] (SPARK-5458) Refer to aggregateByKey instead of combineByKey in docs

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295415#comment-14295415 ] Apache Spark commented on SPARK-5458: User 'sryza' has created a pull request for

[jira] [Commented] (SPARK-5461) Graph should have isCheckpointed, getCheckpointFiles methods

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295817#comment-14295817 ] Apache Spark commented on SPARK-5461: User 'jkbradley' has created a pull request for

[jira] [Created] (SPARK-5462) Catalyst UnresolvedException Invalid call to qualifiers on unresolved object error when accessing fields in Python DataFrame

2015-01-28 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-5462: Summary: Catalyst UnresolvedException Invalid call to qualifiers on unresolved object error when accessing fields in Python DataFrame Key: SPARK-5462 URL:

[jira] [Updated] (SPARK-4259) Add Power Iteration Clustering Algorithm with Gaussian Similarity Function

2015-01-28 Thread Fan Jiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fan Jiang updated SPARK-4259: Description: In recent years, spectral clustering has become one of the most popular modern clustering

[jira] [Updated] (SPARK-5440) Add toLocalIterator to pyspark rdd

2015-01-28 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5440: Affects Version/s: (was: 1.2.0) I'm removing the Affects Version(s) field from this since it isn't

[jira] [Closed] (SPARK-4955) Dynamic allocation doesn't work in YARN cluster mode

2015-01-28 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4955. Resolution: Fixed Fix Version/s: 1.3.0 Dynamic allocation doesn't work in YARN cluster mode

[jira] [Updated] (SPARK-5440) Add toLocalIterator to pyspark rdd

2015-01-28 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5440: Assignee: Michael Nazario Add toLocalIterator to pyspark rdd

[jira] [Resolved] (SPARK-5440) Add toLocalIterator to pyspark rdd

2015-01-28 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-5440. Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4237

[jira] [Updated] (SPARK-5417) Remove redundant executor-ID set() call

2015-01-28 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5417: Target Version/s: 1.3.0, 1.2.1 Fix Version/s: 1.3.0 Assignee: Ryan Williams

[jira] [Updated] (SPARK-4989) wrong application configuration cause cluster down in standalone mode

2015-01-28 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4989: Target Version/s: 1.3.0, 1.1.2, 1.2.2 (was: 1.3.0) wrong application configuration cause cluster down

[jira] [Reopened] (SPARK-4989) wrong application configuration cause cluster down in standalone mode

2015-01-28 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or reopened SPARK-4989: wrong application configuration cause cluster down in standalone mode

[jira] [Updated] (SPARK-4258) NPE with new Parquet Filters

2015-01-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-4258: Issue Type: Sub-task (was: Bug) Parent: SPARK-5463 NPE with new Parquet Filters

[jira] [Updated] (SPARK-4387) Refactoring python profiling code to make it extensible

2015-01-28 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4387: Assignee: Yandu Oppacher Refactoring python profiling code to make it extensible

[jira] [Resolved] (SPARK-4387) Refactoring python profiling code to make it extensible

2015-01-28 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-4387. Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3901

[jira] [Updated] (SPARK-5434) Preserve spaces in path to spark-ec2

2015-01-28 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5434: Assignee: Nicholas Chammas Preserve spaces in path to spark-ec2

[jira] [Updated] (SPARK-5434) Preserve spaces in path to spark-ec2

2015-01-28 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-5434: Target Version/s: 1.3.0, 1.2.1 Fix Version/s: 1.3.0 Labels: backport-needed (was: )

[jira] [Created] (SPARK-5463) Fix Parquet filter push-down

2015-01-28 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-5463: Summary: Fix Parquet filter push-down Key: SPARK-5463 URL: https://issues.apache.org/jira/browse/SPARK-5463 Project: Spark Issue Type: Bug Components:

[jira] [Created] (SPARK-5464) Calling help() on a Python DataFrame fails with cannot resolve column name __name__ error

2015-01-28 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-5464: Summary: Calling help() on a Python DataFrame fails with cannot resolve column name __name__ error Key: SPARK-5464 URL: https://issues.apache.org/jira/browse/SPARK-5464

[jira] [Updated] (SPARK-5346) Parquet filter pushdown is not enabled when parquet.task.side.metadata is set to true (default value)

2015-01-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-5346: Issue Type: Sub-task (was: Bug) Parent: SPARK-5463 Parquet filter pushdown is not enabled

[jira] [Resolved] (SPARK-5458) Refer to aggregateByKey instead of combineByKey in docs

2015-01-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5458. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Sandy Ryza Refer to

[jira] [Commented] (SPARK-5463) Fix Parquet filter push-down

2015-01-28 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295902#comment-14295902 ] Cheng Lian commented on SPARK-5463: SPARK-4258 is fixed in Parquet master. SPARK-5451

[jira] [Commented] (SPARK-4259) Add Power Iteration Clustering Algorithm with Gaussian Similarity Function

2015-01-28 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295930#comment-14295930 ] Andrew Musselman commented on SPARK-4259: So this feature won't be doing spectral

[jira] [Commented] (SPARK-4049) Storage web UI fraction cached shows as 100%

2015-01-28 Thread Sven Krasser (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296037#comment-14296037 ] Sven Krasser commented on SPARK-4049: I'm also seeing this for a 2x replicated RDD

[jira] [Resolved] (SPARK-5448) Make CacheManager a concrete class and field in SQLContext

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-5448. Resolution: Fixed Fix Version/s: 1.3.0 Make CacheManager a concrete class and field in

[jira] [Resolved] (SPARK-5447) Replace reference to SchemaRDD with DataFrame

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-5447. Resolution: Fixed Fix Version/s: 1.3.0 Replace reference to SchemaRDD with DataFrame

[jira] [Commented] (SPARK-5388) Provide a stable application submission gateway

2015-01-28 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296068#comment-14296068 ] Marcelo Vanzin commented on SPARK-5388: Hi [~andrewor14], I read through the spec

[jira] [Created] (SPARK-5468) Remove Python LocalHiveContext

2015-01-28 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-5468: Summary: Remove Python LocalHiveContext Key: SPARK-5468 URL: https://issues.apache.org/jira/browse/SPARK-5468 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-5247) Enable javadoc/scaladoc for public classes in catalyst project

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5247: Priority: Blocker (was: Major) Enable javadoc/scaladoc for public classes in catalyst project

[jira] [Commented] (SPARK-3977) Conversions between {Row, Coordinate}Matrix - BlockMatrix

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296085#comment-14296085 ] Apache Spark commented on SPARK-3977: User 'brkyvz' has created a pull request for

[jira] [Updated] (SPARK-5420) Cross-langauge load/store functions for creating and saving DataFrames

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5420: Description: We should have standard API's for loading or saving a table from a data store. Per

[jira] [Updated] (SPARK-5420) Cross-langauge load/store functions for creating and saving DataFrames

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5420: Description: We should have standard API's for loading or saving a table from a data store. Per

[jira] [Updated] (SPARK-5420) Cross-language load/store functions for creating and saving DataFrames

2015-01-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5420: --- Priority: Blocker (was: Major) Cross-language load/store functions for creating and saving

[jira] [Comment Edited] (SPARK-4049) Storage web UI fraction cached shows as 100%

2015-01-28 Thread Sven Krasser (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296037#comment-14296037 ] Sven Krasser edited comment on SPARK-4049 at 1/29/15 12:07 AM:

[jira] [Resolved] (SPARK-4586) Python API for ML Pipeline

2015-01-28 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-4586. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4151

[jira] [Resolved] (SPARK-5188) make-distribution.sh should support curl, not only wget to get Tachyon

2015-01-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5188. Resolution: Fixed Assignee: Kousuke Saruta make-distribution.sh should support curl,

[jira] [Updated] (SPARK-5188) make-distribution.sh should support curl, not only wget to get Tachyon

2015-01-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5188: --- Fix Version/s: 1.3.0 make-distribution.sh should support curl, not only wget to get Tachyon

[jira] [Updated] (SPARK-4989) wrong application configuration cause cluster down in standalone mode

2015-01-28 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4989: - Fix Version/s: 1.1.2 wrong application configuration cause cluster down in standalone mode

[jira] [Commented] (SPARK-4259) Add Power Iteration Clustering Algorithm with Gaussian Similarity Function

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295912#comment-14295912 ] Apache Spark commented on SPARK-4259: - User 'fjiang6' has created a pull request for

[jira] [Commented] (SPARK-5466) Build Error caused by Guava shading in Spark

2015-01-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296031#comment-14296031 ] Sean Owen commented on SPARK-5466: -- I see this too from a completely clean build. Build

[jira] [Resolved] (SPARK-5467) DStreams should provide windowing based on timestamps from the data (as opposed to wall clock time)

2015-01-28 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid resolved SPARK-5467. - Resolution: Duplicate DStreams should provide windowing based on timestamps from the data (as

[jira] [Commented] (SPARK-5467) DStreams should provide windowing based on timestamps from the data (as opposed to wall clock time)

2015-01-28 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296047#comment-14296047 ] Imran Rashid commented on SPARK-5467: - shoot, sorry I missed that jira. (I swear I

[jira] [Commented] (SPARK-5445) Make sure DataFrame expressions are usable in Java

2015-01-28 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296064#comment-14296064 ] Apache Spark commented on SPARK-5445: - User 'rxin' has created a pull request for this
