[jira] [Updated] (SPARK-10802) Let ALS recommend for subset of data

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10802: -- Priority: Minor (was: Major) > Let ALS recommend for subset of data >

[jira] [Commented] (SPARK-10802) Let ALS recommend for subset of data

2015-09-24 Thread Tomasz Bartczak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907669#comment-14907669 ] Tomasz Bartczak commented on SPARK-10802: - hmm you are probably referring to meth

[jira] [Assigned] (SPARK-10829) Scan DataSource with predicate expression combine partition key and attributes doesn't work

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10829: Assignee: (was: Apache Spark) > Scan DataSource with predicate expression combine part

[jira] [Assigned] (SPARK-10829) Scan DataSource with predicate expression combine partition key and attributes doesn't work

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10829: Assignee: Apache Spark > Scan DataSource with predicate expression combine partition key a

[jira] [Commented] (SPARK-10829) Scan DataSource with predicate expression combine partition key and attributes doesn't work

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907646#comment-14907646 ] Apache Spark commented on SPARK-10829: -- User 'chenghao-intel' has created a pull req

[jira] [Created] (SPARK-10829) Scan DataSource with predicate expression combine partition key and attributes doesn't work

2015-09-24 Thread Cheng Hao (JIRA)
Cheng Hao created SPARK-10829: - Summary: Scan DataSource with predicate expression combine partition key and attributes doesn't work Key: SPARK-10829 URL: https://issues.apache.org/jira/browse/SPARK-10829

[jira] [Assigned] (SPARK-10427) Spark-sql -f or -e will output some

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10427: Assignee: Apache Spark > Spark-sql -f or -e will output some > ---

[jira] [Commented] (SPARK-10427) Spark-sql -f or -e will output some

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907640#comment-14907640 ] Apache Spark commented on SPARK-10427: -- User 'zhichao-li' has created a pull request

[jira] [Assigned] (SPARK-10427) Spark-sql -f or -e will output some

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10427: Assignee: (was: Apache Spark) > Spark-sql -f or -e will output some >

[jira] [Comment Edited] (SPARK-10796) The Stage taskSets may are all removed while stage still have pending partitions after having lost some executors

2015-09-24 Thread SuYan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907532#comment-14907532 ] SuYan edited comment on SPARK-10796 at 9/25/15 5:11 AM: I already

[jira] [Updated] (SPARK-10794) Spark-SQL- select query on table column with binary Data Type displays error message- java.lang.ClassCastException: java.lang.String cannot be cast to [B

2015-09-24 Thread Anilkumar Kalshetti (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anilkumar Kalshetti updated SPARK-10794: Priority: Critical (was: Minor) > Spark-SQL- select query on table column with bin

[jira] [Created] (SPARK-10828) Can we use the accumulo data RDD created from JAVA in spark, in sparkR?Is there any other way to proceed with it to create RRDD from a source RDD other than text RDD?Or

2015-09-24 Thread madhvi gupta (JIRA)
madhvi gupta created SPARK-10828: Summary: Can we use the accumulo data RDD created from JAVA in spark, in sparkR?Is there any other way to proceed with it to create RRDD from a source RDD other than text RDD?Or to use any other format of data st

[jira] [Comment Edited] (SPARK-10796) The Stage taskSets may are all removed while stage still have pending partitions after having lost some executors

2015-09-24 Thread SuYan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907532#comment-14907532 ] SuYan edited comment on SPARK-10796 at 9/25/15 4:07 AM: I already

[jira] [Commented] (SPARK-10796) The Stage taskSets may are all removed while stage still have pending partitions after having lost some executors

2015-09-24 Thread SuYan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907532#comment-14907532 ] SuYan commented on SPARK-10796: --- I already refine that description. Simple Example will be

[jira] [Updated] (SPARK-10796) The Stage taskSets may are all removed while stage still have pending partitions after having lost some executors

2015-09-24 Thread SuYan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] SuYan updated SPARK-10796: -- Description: We meet that problem in Spark 1.3.0, and I also check the latest Spark code, and I think that pro

[jira] [Commented] (SPARK-9850) Adaptive execution in Spark

2015-09-24 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907518#comment-14907518 ] Matei Zaharia commented on SPARK-9850: -- Hey Imran, this could make sense, but note th

[jira] [Created] (SPARK-10827) AppClient should not use `askWithReply` in `receiveAndReply` directly

2015-09-24 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-10827: Summary: AppClient should not use `askWithReply` in `receiveAndReply` directly Key: SPARK-10827 URL: https://issues.apache.org/jira/browse/SPARK-10827 Project: Spark

[jira] [Resolved] (SPARK-9852) Let reduce tasks fetch multiple map output partitions

2015-09-24 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-9852. -- Resolution: Fixed Fix Version/s: 1.6.0 > Let reduce tasks fetch multiple map output parti

[jira] [Created] (SPARK-10826) MasterSource should not access Master.workers, apps, waitingApps directly

2015-09-24 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-10826: Summary: MasterSource should not access Master.workers, apps, waitingApps directly Key: SPARK-10826 URL: https://issues.apache.org/jira/browse/SPARK-10826 Project: Sp

[jira] [Updated] (SPARK-10824) DataFrame show method - show(df) should show first N number of rows, similar to R

2015-09-24 Thread Narine Kokhlikyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Narine Kokhlikyan updated SPARK-10824: -- Summary: DataFrame show method - show(df) should show first N number of rows, similar t

[jira] [Assigned] (SPARK-10825) Fix race conditions in StandaloneDynamicAllocationSuite

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10825: Assignee: Apache Spark > Fix race conditions in StandaloneDynamicAllocationSuite > ---

[jira] [Assigned] (SPARK-10825) Fix race conditions in StandaloneDynamicAllocationSuite

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10825: Assignee: (was: Apache Spark) > Fix race conditions in StandaloneDynamicAllocationSuit

[jira] [Commented] (SPARK-10825) Fix race conditions in StandaloneDynamicAllocationSuite

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907484#comment-14907484 ] Apache Spark commented on SPARK-10825: -- User 'zsxwing' has created a pull request fo

[jira] [Assigned] (SPARK-10824) DataFrame show method - show(df) should show N number of rows, similar to R

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10824: Assignee: (was: Apache Spark) > DataFrame show method - show(df) should show N number

[jira] [Assigned] (SPARK-10824) DataFrame show method - show(df) should show N number of rows, similar to R

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10824: Assignee: Apache Spark > DataFrame show method - show(df) should show N number of rows, si

[jira] [Commented] (SPARK-10824) DataFrame show method - show(df) should show N number of rows, similar to R

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907476#comment-14907476 ] Apache Spark commented on SPARK-10824: -- User 'NarineK' has created a pull request fo

[jira] [Created] (SPARK-10825) Fix race conditions in StandaloneDynamicAllocationSuite

2015-09-24 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-10825: Summary: Fix race conditions in StandaloneDynamicAllocationSuite Key: SPARK-10825 URL: https://issues.apache.org/jira/browse/SPARK-10825 Project: Spark Issue

[jira] [Commented] (SPARK-10824) DataFrame show method - show(df) should show N number of rows, similar to R

2015-09-24 Thread Narine Kokhlikyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907472#comment-14907472 ] Narine Kokhlikyan commented on SPARK-10824: --- The actual structure of dataframe

[jira] [Created] (SPARK-10824) DataFrame show method - show(df) should show N number of rows, similar to R

2015-09-24 Thread Narine Kokhlikyan (JIRA)
Narine Kokhlikyan created SPARK-10824: - Summary: DataFrame show method - show(df) should show N number of rows, similar to R Key: SPARK-10824 URL: https://issues.apache.org/jira/browse/SPARK-10824

[jira] [Updated] (SPARK-10796) The Stage taskSets may are all removed while stage still have pending partitions after having lost some executors

2015-09-24 Thread SuYan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] SuYan updated SPARK-10796: -- Description: We meet that problem in Spark 1.3.0, and I also check the latest Spark code, and I think that pro

[jira] [Created] (SPARK-10823) API design: external state management

2015-09-24 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-10823: --- Summary: API design: external state management Key: SPARK-10823 URL: https://issues.apache.org/jira/browse/SPARK-10823 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-10822) Move contents of spark-unsafe subproject into spark-core

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10822: Assignee: Apache Spark (was: Josh Rosen) > Move contents of spark-unsafe subproject into

[jira] [Assigned] (SPARK-10822) Move contents of spark-unsafe subproject into spark-core

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10822: Assignee: Josh Rosen (was: Apache Spark) > Move contents of spark-unsafe subproject into

[jira] [Commented] (SPARK-10822) Move contents of spark-unsafe subproject into spark-core

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907332#comment-14907332 ] Apache Spark commented on SPARK-10822: -- User 'JoshRosen' has created a pull request

[jira] [Created] (SPARK-10822) Move contents of spark-unsafe subproject into spark-core

2015-09-24 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-10822: -- Summary: Move contents of spark-unsafe subproject into spark-core Key: SPARK-10822 URL: https://issues.apache.org/jira/browse/SPARK-10822 Project: Spark Issue Ty

[jira] [Commented] (SPARK-10000) Consolidate cache memory management and execution memory management

2015-09-24 Thread Bowen Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907329#comment-14907329 ] Bowen Zhang commented on SPARK-10000: - [~rxin], I am interested in this ticket. Can y

[jira] [Created] (SPARK-10821) RandomForest serialization OOM during findBestSplits

2015-09-24 Thread Jay Luan (JIRA)
Jay Luan created SPARK-10821: Summary: RandomForest serialization OOM during findBestSplits Key: SPARK-10821 URL: https://issues.apache.org/jira/browse/SPARK-10821 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-10812) Spark Hadoop Util does not support stopping a non-yarn Spark Context & starting a Yarn spark context.

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10812: Assignee: Apache Spark > Spark Hadoop Util does not support stopping a non-yarn Spark Cont

[jira] [Assigned] (SPARK-10812) Spark Hadoop Util does not support stopping a non-yarn Spark Context & starting a Yarn spark context.

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10812: Assignee: (was: Apache Spark) > Spark Hadoop Util does not support stopping a non-yarn

[jira] [Commented] (SPARK-10812) Spark Hadoop Util does not support stopping a non-yarn Spark Context & starting a Yarn spark context.

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907272#comment-14907272 ] Apache Spark commented on SPARK-10812: -- User 'holdenk' has created a pull request fo

[jira] [Updated] (SPARK-10816) API design: window and session specification

2015-09-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-10816: Summary: API design: window and session specification (was: API design: window specification) > A

[jira] [Assigned] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10790: Assignee: (was: Apache Spark) > Dynamic Allocation does not request any executors if f

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907180#comment-14907180 ] Apache Spark commented on SPARK-10790: -- User 'jerryshao' has created a pull request

[jira] [Assigned] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10790: Assignee: Apache Spark > Dynamic Allocation does not request any executors if first stage

[jira] [Resolved] (SPARK-1856) Standardize MLlib interfaces

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-1856. -- Resolution: Fixed Fix Version/s: 1.5.0 I'm closing this. The Pipelines API is pr

[jira] [Updated] (SPARK-10817) ML abstraction umbrella

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-10817: -- Description: This is an umbrella for discussing and creating ML abstractions. This was

[jira] [Created] (SPARK-10820) Physical plan: determine physical operators needed

2015-09-24 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-10820: --- Summary: Physical plan: determine physical operators needed Key: SPARK-10820 URL: https://issues.apache.org/jira/browse/SPARK-10820 Project: Spark Issue Type:

[jira] [Updated] (SPARK-10819) Logical plan: determine logical operators needed

2015-09-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-10819: Summary: Logical plan: determine logical operators needed (was: Logical plan investigation) > Log

[jira] [Closed] (SPARK-3702) Standardize MLlib classes for learners, models

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-3702. Resolution: Fixed Fix Version/s: 1.5.0 I'm closing this and calling it fixed. This J

[jira] [Created] (SPARK-10819) Logical plan investigation

2015-09-24 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-10819: --- Summary: Logical plan investigation Key: SPARK-10819 URL: https://issues.apache.org/jira/browse/SPARK-10819 Project: Spark Issue Type: Sub-task Compo

[jira] [Created] (SPARK-10818) Query optimization: investigate whether we need a separate optimizer from Spark SQL's

2015-09-24 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-10818: --- Summary: Query optimization: investigate whether we need a separate optimizer from Spark SQL's Key: SPARK-10818 URL: https://issues.apache.org/jira/browse/SPARK-10818 P

[jira] [Created] (SPARK-10817) ML abstraction umbrella

2015-09-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-10817: - Summary: ML abstraction umbrella Key: SPARK-10817 URL: https://issues.apache.org/jira/browse/SPARK-10817 Project: Spark Issue Type: Umbrella

[jira] [Commented] (SPARK-10000) Consolidate cache memory management and execution memory management

2015-09-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907127#comment-14907127 ] Reynold Xin commented on SPARK-10000: - Not much - unless you can think of something.

[jira] [Created] (SPARK-10816) API design: window specification

2015-09-24 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-10816: --- Summary: API design: window specification Key: SPARK-10816 URL: https://issues.apache.org/jira/browse/SPARK-10816 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-10814) API design: convergence of batch and streaming DataFrame

2015-09-24 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-10814: --- Summary: API design: convergence of batch and streaming DataFrame Key: SPARK-10814 URL: https://issues.apache.org/jira/browse/SPARK-10814 Project: Spark Issue

[jira] [Created] (SPARK-10815) API design: data sources and sinks

2015-09-24 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-10815: --- Summary: API design: data sources and sinks Key: SPARK-10815 URL: https://issues.apache.org/jira/browse/SPARK-10815 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-10813) API design: high level class structuring regarding windowed and non-windowed streams

2015-09-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-10813: Summary: API design: high level class structuring regarding windowed and non-windowed streams (was

[jira] [Created] (SPARK-10813) API design: high level semantics

2015-09-24 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-10813: --- Summary: API design: high level semantics Key: SPARK-10813 URL: https://issues.apache.org/jira/browse/SPARK-10813 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-10644) Applications wait even if free executors are available

2015-09-24 Thread Balagopal Nair (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907100#comment-14907100 ] Balagopal Nair commented on SPARK-10644: 4 core machine, 3 Workers with 3 executo

[jira] [Updated] (SPARK-10810) Improve session management for SQL

2015-09-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-10810: Issue Type: Improvement (was: Bug) > Improve session management for SQL >

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907072#comment-14907072 ] Thomas Graves commented on SPARK-10735: --- print schema for this shows: |-- beaconI

[jira] [Commented] (SPARK-10808) LDA user guide: discuss running time of LDA

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907071#comment-14907071 ] Joseph K. Bradley commented on SPARK-10808: --- Sure, thank you! > LDA user guide

[jira] [Assigned] (SPARK-10810) Improve session management for SQL

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10810: Assignee: Davies Liu (was: Apache Spark) > Improve session management for SQL > -

[jira] [Commented] (SPARK-10810) Improve session management for SQL

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907062#comment-14907062 ] Apache Spark commented on SPARK-10810: -- User 'davies' has created a pull request for

[jira] [Assigned] (SPARK-10810) Improve session management for SQL

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10810: Assignee: Apache Spark (was: Davies Liu) > Improve session management for SQL > -

[jira] [Created] (SPARK-10812) Spark Hadoop Util does not support stopping a non-yarn Spark Context & starting a Yarn spark context.

2015-09-24 Thread holdenk (JIRA)
holdenk created SPARK-10812: --- Summary: Spark Hadoop Util does not support stopping a non-yarn Spark Context & starting a Yarn spark context. Key: SPARK-10812 URL: https://issues.apache.org/jira/browse/SPARK-10812

[jira] [Commented] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2015-09-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907051#comment-14907051 ] Marcelo Vanzin commented on SPARK-10804: bq. Spark doesn't support INSERT INTO ..

[jira] [Resolved] (SPARK-10761) Refactor DiskBlockObjectWriter to not require BlockId

2015-09-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-10761. - Resolution: Fixed Fix Version/s: 1.6.0 > Refactor DiskBlockObjectWriter to not require Blo

[jira] [Assigned] (SPARK-10807) Add as.data.frame() as a synonym for collect()

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10807: Assignee: (was: Apache Spark) > Add as.data.frame() as a synonym for collect() > -

[jira] [Commented] (SPARK-10807) Add as.data.frame() as a synonym for collect()

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907035#comment-14907035 ] Apache Spark commented on SPARK-10807: -- User 'olarayej' has created a pull request f

[jira] [Assigned] (SPARK-10807) Add as.data.frame() as a synonym for collect()

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10807: Assignee: Apache Spark > Add as.data.frame() as a synonym for collect() >

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907011#comment-14907011 ] Josh Rosen commented on SPARK-10735: I would _not_ recommend using UserDefinedType. U

[jira] [Commented] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2015-09-24 Thread Antonio Piccolboni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906989#comment-14906989 ] Antonio Piccolboni commented on SPARK-10804: OK, maybe I read too much into i

[jira] [Commented] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2015-09-24 Thread Antonio Piccolboni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906988#comment-14906988 ] Antonio Piccolboni commented on SPARK-10804: HIVE-11949 -- but, as I was fili

[jira] [Assigned] (SPARK-10811) Minimize array copying cost in Parquet converters

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10811: Assignee: Apache Spark (was: Cheng Lian) > Minimize array copying cost in Parquet convert

[jira] [Assigned] (SPARK-10811) Minimize array copying cost in Parquet converters

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10811: Assignee: Cheng Lian (was: Apache Spark) > Minimize array copying cost in Parquet convert

[jira] [Commented] (SPARK-10811) Minimize array copying cost in Parquet converters

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906983#comment-14906983 ] Apache Spark commented on SPARK-10811: -- User 'liancheng' has created a pull request

[jira] [Updated] (SPARK-10796) The Stage taskSets may are all removed while stage still have pending partitions after having lost some executors

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10796: -- Component/s: Scheduler > The Stage taskSets may are all removed while stage still have pending > parti

[jira] [Commented] (SPARK-10808) LDA user guide: discuss running time of LDA

2015-09-24 Thread Mohamed Baddar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906975#comment-14906975 ] Mohamed Baddar commented on SPARK-10808: Hello [~josephkb] , can i take this task

[jira] [Updated] (SPARK-10796) The Stage taskSets may are all removed while stage still have pending partitions after having lost some executors

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10796: -- Priority: Minor (was: Major) Can you clarify with a simple example? I'm not clear on the situation you

[jira] [Updated] (SPARK-10787) Reset ObjectOutputStream more often to prevent OOME

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10787: -- Priority: Minor (was: Major) Component/s: Spark Core Issue Type: Improvement (was: Bug) >

[jira] [Commented] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2015-09-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906970#comment-14906970 ] Marcelo Vanzin commented on SPARK-10804: I'm not making any statement about anyth

[jira] [Created] (SPARK-10811) Minimize array copying cost in Parquet converters

2015-09-24 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-10811: -- Summary: Minimize array copying cost in Parquet converters Key: SPARK-10811 URL: https://issues.apache.org/jira/browse/SPARK-10811 Project: Spark Issue Type: Imp

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906969#comment-14906969 ] Saisai Shao commented on SPARK-10790: - Hi [~jonathak], I think I reproduced the probl

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906962#comment-14906962 ] Thomas Graves commented on SPARK-10735: --- It looks like there code created the dataf

[jira] [Commented] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2015-09-24 Thread Antonio Piccolboni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906941#comment-14906941 ] Antonio Piccolboni commented on SPARK-10804: That said, I will try and file a

[jira] [Commented] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2015-09-24 Thread Antonio Piccolboni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906934#comment-14906934 ] Antonio Piccolboni commented on SPARK-10804: As a developer, any issue my use

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906936#comment-14906936 ] Josh Rosen commented on SPARK-10735: While UserDefinedType is currently exposed as {{

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906931#comment-14906931 ] Thomas Graves commented on SPARK-10735: --- They are using SQLContext.createDataFrame(

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906912#comment-14906912 ] Jonathan Kelly commented on SPARK-10790: I can reproduce it with minExecutors=N a

[jira] [Resolved] (SPARK-10705) Stop converting internal rows to external rows in DataFrame.toJSON

2015-09-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-10705. -- Resolution: Fixed Fix Version/s: 1.6.0 This issue has been resolved by https://github.com/apache

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906894#comment-14906894 ] Josh Rosen commented on SPARK-10735: [~tgraves], Spark 1.5.0 is stricter in its enfor

[jira] [Comment Edited] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Glenn Strycker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906877#comment-14906877 ] Glenn Strycker edited comment on SPARK-10735 at 9/24/15 7:40 PM: --

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Glenn Strycker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906877#comment-14906877 ] Glenn Strycker commented on SPARK-10735: This appears very similar to a problem I

[jira] [Updated] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-10735: --- Description: In spark 1.5.0 we are now seeing an exception when converting an RDD with custom object

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906870#comment-14906870 ] Thomas Graves commented on SPARK-10735: --- [~joshrosen] it appears you did some of th

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906862#comment-14906862 ] Thomas Graves commented on SPARK-10735: --- The issue here appears to be that in spark

[jira] [Created] (SPARK-10810) Improve session management for SQL

2015-09-24 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10810: -- Summary: Improve session management for SQL Key: SPARK-10810 URL: https://issues.apache.org/jira/browse/SPARK-10810 Project: Spark Issue Type: Bug Comp

[jira] [Commented] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Ian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906835#comment-14906835 ] Ian commented on SPARK-10741: - yup, it works. The following insert select statement works fo

[jira] [Commented] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906820#comment-14906820 ] Yin Huai commented on SPARK-10741: -- [~ianlcsd] Does {{select "test1" as c1, (count(*)+1)
