[jira] [Created] (SPARK-6052) In JSON schema inference, we should always set containsNull of an ArrayType to true

2015-02-26 Thread Yin Huai (JIRA)
Yin Huai created SPARK-6052: --- Summary: In JSON schema inference, we should always set containsNull of an ArrayType to true Key: SPARK-6052 URL: https://issues.apache.org/jira/browse/SPARK-6052 Project:

[jira] [Created] (SPARK-6053) Support model save/load in Python's ALS.

2015-02-26 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-6053: Summary: Support model save/load in Python's ALS. Key: SPARK-6053 URL: https://issues.apache.org/jira/browse/SPARK-6053 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-5556) Latent Dirichlet Allocation (LDA) using Gibbs sampler

2015-02-26 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339847#comment-14339847 ] Guoqiang Li commented on SPARK-5556: [This

[jira] [Commented] (SPARK-5556) Latent Dirichlet Allocation (LDA) using Gibbs sampler

2015-02-26 Thread Pedro Rodriguez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339772#comment-14339772 ] Pedro Rodriguez commented on SPARK-5556: See PR for info, TLDR: contains

[jira] [Updated] (SPARK-5991) Python API for ML model import/export

2015-02-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5991: - Target Version/s: (was: 1.4.0) Python API for ML model import/export

[jira] [Updated] (SPARK-5991) Python API for ML model import/export

2015-02-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5991: - Issue Type: Umbrella (was: Sub-task) Parent: (was: SPARK-4587) Python API for ML

[jira] [Created] (SPARK-6054) SQL UDF returning object of case class; regression from 1.2.0

2015-02-26 Thread Spiro Michaylov (JIRA)
Spiro Michaylov created SPARK-6054: -- Summary: SQL UDF returning object of case class; regression from 1.2.0 Key: SPARK-6054 URL: https://issues.apache.org/jira/browse/SPARK-6054 Project: Spark

[jira] [Created] (SPARK-6055) memory leak in pyspark sql

2015-02-26 Thread Davies Liu (JIRA)
Davies Liu created SPARK-6055: - Summary: memory leak in pyspark sql Key: SPARK-6055 URL: https://issues.apache.org/jira/browse/SPARK-6055 Project: Spark Issue Type: Bug Components:

[jira] [Commented] (SPARK-6055) memory leak in pyspark sql

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339846#comment-14339846 ] Apache Spark commented on SPARK-6055: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-6055) memory leak in pyspark sql

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339850#comment-14339850 ] Apache Spark commented on SPARK-6055: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-5281) Registering table on RDD is giving MissingRequirementError

2015-02-26 Thread Sangkyoon Nam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339779#comment-14339779 ] Sangkyoon Nam commented on SPARK-5281: -- I have the same problem. In my case, I used CDH

[jira] [Updated] (SPARK-6036) EventLog process logic has race condition with Akka actor system

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-6036: - Labels: backport-needed (was: ) EventLog process logic has race condition with Akka actor system

[jira] [Commented] (SPARK-6055) memory leak in pyspark sql

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339840#comment-14339840 ] Apache Spark commented on SPARK-6055: - User 'davies' has created a pull request for

[jira] [Created] (SPARK-6056) Unlimit offHeap memory use cause RM killing the container

2015-02-26 Thread SaintBacchus (JIRA)
SaintBacchus created SPARK-6056: --- Summary: Unlimit offHeap memory use cause RM killing the container Key: SPARK-6056 URL: https://issues.apache.org/jira/browse/SPARK-6056 Project: Spark Issue

[jira] [Commented] (SPARK-5991) Python API for ML model import/export

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339853#comment-14339853 ] Apache Spark commented on SPARK-5991: - User 'mengxr' has created a pull request for

[jira] [Comment Edited] (SPARK-6050) Spark on YARN does not work --executor-cores is specified

2015-02-26 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339686#comment-14339686 ] Mridul Muralidharan edited comment on SPARK-6050 at 2/27/15 5:24 AM:

[jira] [Commented] (SPARK-6052) In JSON schema inference, we should always set containsNull of an ArrayType to true

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339761#comment-14339761 ] Apache Spark commented on SPARK-6052: - User 'yhuai' has created a pull request for

[jira] [Commented] (SPARK-6051) Add an option for DirectKafkaInputDStream to commit the offsets into ZK

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339758#comment-14339758 ] Apache Spark commented on SPARK-6051: - User 'jerryshao' has created a pull request for

[jira] [Commented] (SPARK-5556) Latent Dirichlet Allocation (LDA) using Gibbs sampler

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339767#comment-14339767 ] Apache Spark commented on SPARK-5556: - User 'EntilZha' has created a pull request for

[jira] [Commented] (SPARK-5845) Time to cleanup spilled shuffle files not included in shuffle write time

2015-02-26 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339794#comment-14339794 ] Ilya Ganelin commented on SPARK-5845: - I'm code complete on this, will submit a PR

[jira] [Commented] (SPARK-6050) Spark on YARN does not work --executor-cores is specified

2015-02-26 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339798#comment-14339798 ] Mridul Muralidharan commented on SPARK-6050: With more verbose debug added,

[jira] [Commented] (SPARK-5556) Latent Dirichlet Allocation (LDA) using Gibbs sampler

2015-02-26 Thread Pedro Rodriguez (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339849#comment-14339849 ] Pedro Rodriguez commented on SPARK-5556: Based on initial testing, I recall

[jira] [Closed] (SPARK-4300) Race condition during SparkWorker shutdown

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4300. Resolution: Fixed Fix Version/s: 1.4.0 1.2.2 Assignee: Sean

[jira] [Reopened] (SPARK-4300) Race condition during SparkWorker shutdown

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or reopened SPARK-4300: -- Race condition during SparkWorker shutdown --

[jira] [Resolved] (SPARK-794) Remove sleep() in ClusterScheduler.stop

2015-02-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-794. - Resolution: Fixed Remove sleep() in ClusterScheduler.stop ---

[jira] [Commented] (SPARK-1673) GLMNET implementation in Spark

2015-02-26 Thread mike bowles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339320#comment-14339320 ] mike bowles commented on SPARK-1673: Good discussion. I can see how it might be

[jira] [Created] (SPARK-6047) pyspark - class loading on driver failing with --jars and --packages

2015-02-26 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-6047: -- Summary: pyspark - class loading on driver failing with --jars and --packages Key: SPARK-6047 URL: https://issues.apache.org/jira/browse/SPARK-6047 Project: Spark

[jira] [Commented] (SPARK-3508) annotate the Spark configs to indicate which ones are meant for the end user

2015-02-26 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338457#comment-14338457 ] Thomas Graves commented on SPARK-3508: -- I wasn't suggesting renaming existing

[jira] [Commented] (SPARK-1391) BlockManager cannot transfer blocks larger than 2G in size

2015-02-26 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338648#comment-14338648 ] Imran Rashid commented on SPARK-1391: - The one complication here comes from the

[jira] [Resolved] (SPARK-6016) Cannot read the parquet table after overwriting the existing table when spark.sql.parquet.cacheMetadata=true

2015-02-26 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-6016. --- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4775

[jira] [Commented] (SPARK-6018) NoSuchMethodError in Spark app is swallowed by YARN AM

2015-02-26 Thread Tarek Abouzeid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338577#comment-14338577 ] Tarek Abouzeid commented on SPARK-6018: --- I am encountering the same error:

[jira] [Updated] (SPARK-4545) If first Spark Streaming batch fails, it waits 10x batch duration before stopping

2015-02-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4545: - Priority: Major (was: Minor) Affects Version/s: 1.2.1 Bumping a bit since this makes a

[jira] [Commented] (SPARK-6018) NoSuchMethodError in Spark app is swallowed by YARN AM

2015-02-26 Thread Tarek Abouzeid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338580#comment-14338580 ] Tarek Abouzeid commented on SPARK-6018: --- I am encountering the same error while

[jira] [Created] (SPARK-6042) spark-submit giving Exception in thread main java.lang.NoSuchMethodError: org.apache.spark.sql.hive.HiveContext.sql(Ljava/lang/String;)Lorg/apache/spark/sql/SchemaRDD;

2015-02-26 Thread Tarek Abouzeid (JIRA)
Tarek Abouzeid created SPARK-6042: - Summary: spark-submit giving Exception in thread main java.lang.NoSuchMethodError: org.apache.spark.sql.hive.HiveContext.sql(Ljava/lang/String;)Lorg/apache/spark/sql/SchemaRDD; Key: SPARK-6042

[jira] [Issue Comment Deleted] (SPARK-6018) NoSuchMethodError in Spark app is swallowed by YARN AM

2015-02-26 Thread Tarek Abouzeid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tarek Abouzeid updated SPARK-6018: -- Comment: was deleted (was: i am encountering same error : ) NoSuchMethodError in Spark app is

[jira] [Commented] (SPARK-4545) If first Spark Streaming batch fails, it waits 10x batch duration before stopping

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338480#comment-14338480 ] Apache Spark commented on SPARK-4545: - User 'srowen' has created a pull request for

[jira] [Updated] (SPARK-6042) spark-submit giving Exception in thread main java.lang.NoSuchMethodError: org.apache.spark.sql.hive.HiveContext.sql(Ljava/lang/String;)Lorg/apache/spark/sql/SchemaRDD;

2015-02-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6042: - Component/s: (was: Spark Submit) SQL Target Version/s: (was:

[jira] [Commented] (SPARK-6040) Fix the percent bug in tablesample

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338336#comment-14338336 ] Apache Spark commented on SPARK-6040: - User 'watermen' has created a pull request for

[jira] [Commented] (SPARK-6041) Compute shortest path for graph with edge distances

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338454#comment-14338454 ] Apache Spark commented on SPARK-6041: - User 'viirya' has created a pull request for

[jira] [Created] (SPARK-6041) Compute shortest path for graph with edge distances

2015-02-26 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-6041: -- Summary: Compute shortest path for graph with edge distances Key: SPARK-6041 URL: https://issues.apache.org/jira/browse/SPARK-6041 Project: Spark Issue

[jira] [Updated] (SPARK-6040) Fix the percent bug in tablesample

2015-02-26 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-6040: - Summary: Fix the percent bug in tablesample (was: Fix the bug in tablesample) Fix the percent bug in

[jira] [Commented] (SPARK-6017) Provide transparent secure communication channel on Yarn

2015-02-26 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338423#comment-14338423 ] Thomas Graves commented on SPARK-6017: -- when you say: Currently the method of

[jira] [Resolved] (SPARK-6023) ParquetConversions fails to replace the destination MetastoreRelation of an InsertIntoTable node to ParquetRelation2

2015-02-26 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-6023. --- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4782

[jira] [Closed] (SPARK-3508) annotate the Spark configs to indicate which ones are meant for the end user

2015-02-26 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves closed SPARK-3508. Resolution: Won't Fix annotate the Spark configs to indicate which ones are meant for the end user

[jira] [Updated] (SPARK-6043) Error when trying to rename table with alter table after using INSERT OVERWITE to populate the table

2015-02-26 Thread Trystan Leftwich (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trystan Leftwich updated SPARK-6043: Description: If you populate a table using INSERT OVERWRITE and then try to rename the

[jira] [Resolved] (SPARK-5508) [hive context] java.lang.IndexOutOfBoundsException: Index: 0, Size: 0

2015-02-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-5508. - Resolution: Duplicate [hive context] java.lang.IndexOutOfBoundsException: Index: 0, Size: 0

[jira] [Created] (SPARK-6043) Error when trying to rename table with alter table after using INSERT OVERWITE to populate the table

2015-02-26 Thread Trystan Leftwich (JIRA)
Trystan Leftwich created SPARK-6043: --- Summary: Error when trying to rename table with alter table after using INSERT OVERWITE to populate the table Key: SPARK-6043 URL:

[jira] [Commented] (SPARK-6017) Provide transparent secure communication channel on Yarn

2015-02-26 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338759#comment-14338759 ] Marcelo Vanzin commented on SPARK-6017: --- Hey Tom, No, I was referring to the

[jira] [Commented] (SPARK-6017) Provide transparent secure communication channel on Yarn

2015-02-26 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338787#comment-14338787 ] Thomas Graves commented on SPARK-6017: -- spark.authenticate.secret is only used in

[jira] [Commented] (SPARK-6017) Provide transparent secure communication channel on Yarn

2015-02-26 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338803#comment-14338803 ] Marcelo Vanzin commented on SPARK-6017: --- Ah, I see. I missed the code in

[jira] [Commented] (SPARK-6017) Provide transparent secure communication channel on Yarn

2015-02-26 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338811#comment-14338811 ] Marcelo Vanzin commented on SPARK-6017: --- [~rxin] yeah, when I wrote write a new RPC

[jira] [Commented] (SPARK-5950) Arrays and Maps stored with Hive Parquet Serde may not be able to read by the Parquet support in the Data Souce API

2015-02-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338855#comment-14338855 ] Yin Huai commented on SPARK-5950: - Seems the problem is that arrays and maps stored with

[jira] [Comment Edited] (SPARK-5950) Arrays and Maps stored with Hive Parquet Serde may not be able to read by the Parquet support in the Data Souce API

2015-02-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338855#comment-14338855 ] Yin Huai edited comment on SPARK-5950 at 2/26/15 6:23 PM: -- Seems

[jira] [Updated] (SPARK-5950) Arrays and Maps stored with Hive Parquet Serde may not be able to read by the Parquet support in the Data Souce API

2015-02-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-5950: Summary: Arrays and Maps stored with Hive Parquet Serde may not be able to read by the Parquet support in

[jira] [Commented] (SPARK-4545) If first Spark Streaming batch fails, it waits 10x batch duration before stopping

2015-02-26 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338718#comment-14338718 ] Saisai Shao commented on SPARK-4545: Hi [~srowen], from my understanding, if

[jira] [Updated] (SPARK-5950) Insert array into table saved as parquet should work when using datasource api

2015-02-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-5950: Priority: Critical (was: Major) Target Version/s: 1.3.0 Insert array into table saved as

[jira] [Updated] (SPARK-6043) Error when trying to rename table with alter table after using INSERT OVERWITE to populate the table

2015-02-26 Thread Trystan Leftwich (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trystan Leftwich updated SPARK-6043: Priority: Minor (was: Major) Error when trying to rename table with alter table after

[jira] [Commented] (SPARK-6026) Eliminate the bypassMergeThreshold parameter and associated hash-ish shuffle within the Sort shuffle code

2015-02-26 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338762#comment-14338762 ] Kay Ousterhout commented on SPARK-6026: --- I've observed this both when running the

[jira] [Updated] (SPARK-5950) Insert array into table saved as parquet should work when using datasource api

2015-02-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-5950: Priority: Major (was: Critical) Insert array into table saved as parquet should work when using

[jira] [Resolved] (SPARK-5801) Shuffle creates too many nested directories

2015-02-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5801. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4747

[jira] [Updated] (SPARK-5801) Shuffle creates too many nested directories

2015-02-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5801: - Priority: Minor (was: Critical) Assignee: Marcelo Vanzin Bumping down priority since it didn't seem

[jira] [Commented] (SPARK-5508) [hive context] java.lang.IndexOutOfBoundsException: Index: 0, Size: 0

2015-02-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338844#comment-14338844 ] Yin Huai commented on SPARK-5508: - [~Ayoub] I believe that it is the same problem with

[jira] [Commented] (SPARK-1673) GLMNET implementation in Spark

2015-02-26 Thread mike bowles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338897#comment-14338897 ] mike bowles commented on SPARK-1673: Some colleagues and I have a Spark version of

[jira] [Commented] (SPARK-5981) pyspark ML models should support predict/transform on vector within map

2015-02-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338939#comment-14338939 ] Joseph K. Bradley commented on SPARK-5981: -- Actually, rather than creating JIRAs

[jira] [Commented] (SPARK-1673) GLMNET implementation in Spark

2015-02-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339062#comment-14339062 ] Joseph K. Bradley commented on SPARK-1673: -- Some thoughts: {quote} Friedman says

[jira] [Commented] (SPARK-6044) RDD.aggregate() should not use the closure serializer on the zero value

2015-02-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339078#comment-14339078 ] Sean Owen commented on SPARK-6044: -- Yeah I think that's right, given the recent mailing

[jira] [Created] (SPARK-6044) RDD.aggregate() should not use the closure serializer on the zero value

2015-02-26 Thread Matt Cheah (JIRA)
Matt Cheah created SPARK-6044: - Summary: RDD.aggregate() should not use the closure serializer on the zero value Key: SPARK-6044 URL: https://issues.apache.org/jira/browse/SPARK-6044 Project: Spark

[jira] [Resolved] (SPARK-5363) Spark 1.2 freeze without error notification

2015-02-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-5363. --- Resolution: Fixed Fix Version/s: 1.4.0 1.2.2 1.3.0 I've

[jira] [Resolved] (SPARK-6007) Add numRows param in DataFrame.show

2015-02-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-6007. Resolution: Fixed Assignee: Jacky Li Add numRows param in DataFrame.show

[jira] [Commented] (SPARK-5253) LinearRegression with L1/L2 (elastic net) using OWLQN in new ML package

2015-02-26 Thread mike bowles (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338908#comment-14338908 ] mike bowles commented on SPARK-5253: There's some new discussion of the relationship

[jira] [Resolved] (SPARK-6004) Pick the best model when training GradientBoostedTrees with validation

2015-02-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-6004. -- Resolution: Fixed Fix Version/s: 1.4.0 Target Version/s: 1.4.0 Pick

[jira] [Updated] (SPARK-6004) Pick the best model when training GradientBoostedTrees with validation

2015-02-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6004: - Assignee: Liang-Chi Hsieh Pick the best model when training GradientBoostedTrees with

[jira] [Commented] (SPARK-5972) Cache residuals for GradientBoostedTrees during training

2015-02-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338930#comment-14338930 ] Joseph K. Bradley commented on SPARK-5972: -- Yes, but it also includes the

[jira] [Commented] (SPARK-5775) GenericRow cannot be cast to SpecificMutableRow when nested data and partitioned table

2015-02-26 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339010#comment-14339010 ] Cheng Lian commented on SPARK-5775: --- Hey [~avignon], sorry for the delay. I've left

[jira] [Commented] (SPARK-794) Remove sleep() in ClusterScheduler.stop

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339017#comment-14339017 ] Apache Spark commented on SPARK-794: User 'srowen' has created a pull request for this

[jira] [Commented] (SPARK-5972) Cache residuals for GradientBoostedTrees during training

2015-02-26 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338869#comment-14338869 ] Manoj Kumar commented on SPARK-5972: Just to clarify, this is just to prevent

[jira] [Commented] (SPARK-5775) GenericRow cannot be cast to SpecificMutableRow when nested data and partitioned table

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338969#comment-14338969 ] Apache Spark commented on SPARK-5775: - User 'liancheng' has created a pull request for

[jira] [Comment Edited] (SPARK-5972) Cache residuals for GradientBoostedTrees during training

2015-02-26 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338869#comment-14338869 ] Manoj Kumar edited comment on SPARK-5972 at 2/26/15 6:30 PM: -

[jira] [Resolved] (SPARK-6015) Backport Python doc source code link fix to 1.2

2015-02-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-6015. Resolution: Fixed Fix Version/s: 1.2.2 Backport Python doc source code link fix to 1.2

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338911#comment-14338911 ] Reynold Xin commented on SPARK-5124: I took another look at the PR. Do we need to

[jira] [Updated] (SPARK-6004) Pick the best model when training GradientBoostedTrees with validation

2015-02-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6004: - Affects Version/s: 1.4.0 Pick the best model when training GradientBoostedTrees with

[jira] [Commented] (SPARK-5981) pyspark ML models should support predict/transform on vector within map

2015-02-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338936#comment-14338936 ] Joseph K. Bradley commented on SPARK-5981: -- You can, but I'm going to split it

[jira] [Updated] (SPARK-6032) Move ivy logging to System.err in --packages

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-6032: - Affects Version/s: 1.3.0 Move ivy logging to System.err in --packages

[jira] [Updated] (SPARK-4704) SparkSubmitDriverBootstrap doesn't flush output

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4704: - Affects Version/s: 1.2.0 SparkSubmitDriverBootstrap doesn't flush output

[jira] [Updated] (SPARK-4704) SparkSubmitDriverBootstrap doesn't flush output

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4704: - Assignee: Sean Owen SparkSubmitDriverBootstrap doesn't flush output

[jira] [Commented] (SPARK-6045) RecordWriter should be checked against null in PairRDDFunctions#saveAsNewAPIHadoopDataset

2015-02-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339164#comment-14339164 ] Sean Owen commented on SPARK-6045: -- Per the PR, it's not so much that this isn't checked

[jira] [Commented] (SPARK-6045) RecordWriter should be checked against null in PairRDDFunctions#saveAsNewAPIHadoopDataset

2015-02-26 Thread Ted Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339169#comment-14339169 ] Ted Yu commented on SPARK-6045: --- The logic of CassandraHadoopMigrator.scala is unknown.

[jira] [Updated] (SPARK-6031) Refactor --packages to work inside the DriverBootstrapper so that the jars can be added to the driver classpath

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-6031: - Affects Version/s: 1.3.0 Refactor --packages to work inside the DriverBootstrapper so that the jars

[jira] [Created] (SPARK-6045) RecordWriter should be checked against null in PairRDDFunctions#saveAsNewAPIHadoopDataset

2015-02-26 Thread Ted Yu (JIRA)
Ted Yu created SPARK-6045: - Summary: RecordWriter should be checked against null in PairRDDFunctions#saveAsNewAPIHadoopDataset Key: SPARK-6045 URL: https://issues.apache.org/jira/browse/SPARK-6045 Project:

[jira] [Updated] (SPARK-4704) SparkSubmitDriverBootstrap doesn't flush output

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4704: - Target Version/s: 1.2.2, 1.4.0, 1.3.1 SparkSubmitDriverBootstrap doesn't flush output

[jira] [Updated] (SPARK-4704) SparkSubmitDriverBootstrap doesn't flush output

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4704: - Labels: backport-needed (was: ) SparkSubmitDriverBootstrap doesn't flush output

[jira] [Updated] (SPARK-4704) SparkSubmitDriverBootstrap doesn't flush output

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4704: - Fix Version/s: 1.4.0 1.2.2 SparkSubmitDriverBootstrap doesn't flush output

[jira] [Commented] (SPARK-6045) RecordWriter should be checked against null in PairRDDFunctions#saveAsNewAPIHadoopDataset

2015-02-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339149#comment-14339149 ] Apache Spark commented on SPARK-6045: - User 'tedyu' has created a pull request for

[jira] [Commented] (SPARK-6045) RecordWriter should be checked against null in PairRDDFunctions#saveAsNewAPIHadoopDataset

2015-02-26 Thread Ted Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339147#comment-14339147 ] Ted Yu commented on SPARK-6045: --- https://github.com/apache/spark/pull/4794 RecordWriter

[jira] [Updated] (SPARK-6027) Make KafkaUtils work in Python with kafka-assembly provided as --jar or maven package provided as --packages

2015-02-26 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-6027: - Summary: Make KafkaUtils work in Python with kafka-assembly provided as --jar or maven package

[jira] [Closed] (SPARK-3562) Periodic cleanup event logs

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-3562. Resolution: Fixed Fix Version/s: 1.4.0 Target Version/s: 1.4.0 Periodic cleanup event

[jira] [Updated] (SPARK-3562) Periodic cleanup event logs

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3562: - Assignee: xukun Periodic cleanup event logs --- Key:

[jira] [Created] (SPARK-6046) Provide an easier way for developers to handle deprecated configs

2015-02-26 Thread Andrew Or (JIRA)
Andrew Or created SPARK-6046: Summary: Provide an easier way for developers to handle deprecated configs Key: SPARK-6046 URL: https://issues.apache.org/jira/browse/SPARK-6046 Project: Spark

[jira] [Updated] (SPARK-6045) RecordWriter should be checked against null in PairRDDFunctions#saveAsNewAPIHadoopDataset

2015-02-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6045: - Priority: Trivial (was: Major) Yeah, this replaces an NPE-hiding-an-NPE with a clearer

[jira] [Updated] (SPARK-6027) Make KafkaUtils work in Python with kafka-assembly provided as --jar or maven package provided as --packages

2015-02-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-6027: - Affects Version/s: 1.3.0 Make KafkaUtils work in Python with kafka-assembly provided as --jar or maven
