[jira] [Created] (SPARK-10789) Cluster mode SparkSubmit classpath only includes Spark classpath

2015-09-24 Thread Jonathan Kelly (JIRA)
Jonathan Kelly created SPARK-10789: -- Summary: Cluster mode SparkSubmit classpath only includes Spark classpath Key: SPARK-10789 URL: https://issues.apache.org/jira/browse/SPARK-10789 Project: Spark

[jira] [Updated] (SPARK-10789) Cluster mode SparkSubmit classpath only includes Spark classpath

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Kelly updated SPARK-10789: --- Component/s: Spark Submit > Cluster mode SparkSubmit classpath only includes Spark classpath

[jira] [Created] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Jonathan Kelly (JIRA)
Jonathan Kelly created SPARK-10790: -- Summary: Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors Key: SPARK-10790 URL: https://is

[jira] [Commented] (SPARK-8386) DataFrame and JDBC regression

2015-09-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905940#comment-14905940 ] Liang-Chi Hsieh commented on SPARK-8386: [~phaumer] I can't reproduce this problem

[jira] [Updated] (SPARK-10789) Cluster mode SparkSubmit classpath only includes Spark classpath

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Kelly updated SPARK-10789: --- Description: When using cluster deploy mode, the classpath of the SparkSubmit process that g

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905989#comment-14905989 ] Sean Owen commented on SPARK-10790: --- [~jonathak] A number of quite similar sounding thi

[jira] [Created] (SPARK-10791) Optimize MLlib LDA topic distribution query performance

2015-09-24 Thread Marko Asplund (JIRA)
Marko Asplund created SPARK-10791: - Summary: Optimize MLlib LDA topic distribution query performance Key: SPARK-10791 URL: https://issues.apache.org/jira/browse/SPARK-10791 Project: Spark Iss

[jira] [Commented] (SPARK-10487) MLlib model fitting causes DataFrame write to break with OutOfMemory exception

2015-09-24 Thread Zsolt Tóth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906008#comment-14906008 ] Zsolt Tóth commented on SPARK-10487: Increasing the perm size on the driver fixes the

[jira] [Commented] (SPARK-10773) Repartition operation failing on RDD with "argument type mismatch" error

2015-09-24 Thread Bo soon Park (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906022#comment-14906022 ] Bo soon Park commented on SPARK-10773: -- I also saw this error like this in mapr-spark

[jira] [Commented] (SPARK-10644) Applications wait even if free executors are available

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906030#comment-14906030 ] Sean Owen commented on SPARK-10644: --- How many cores per executor? I'm assuming you mean

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906047#comment-14906047 ] Jonathan Kelly commented on SPARK-10790: I did search through all dynamicAllocati

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906055#comment-14906055 ] Sean Owen commented on SPARK-10790: --- In your email I think you said you were using 1.4.

[jira] [Created] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
Adrian Tanase created SPARK-10792: - Summary: Spark streaming + YARN – executor is not re-created on machine restart Key: SPARK-10792 URL: https://issues.apache.org/jira/browse/SPARK-10792 Project: Spa

[jira] [Updated] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10792: -- Priority: Minor (was: Major) > Spark streaming + YARN – executor is not re-created on machine restart

[jira] [Commented] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906068#comment-14906068 ] Adrian Tanase commented on SPARK-10792: --- https://issues.apache.org/jira/browse/SPAR

[jira] [Commented] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906081#comment-14906081 ] Sean Owen commented on SPARK-10792: --- I wonder if this is interacting with a blacklist m

[jira] [Updated] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Tanase updated SPARK-10792: -- Description: We’re using spark streaming (1.4.0), deployed on AWS through yarn. It’s a statefu

[jira] [Commented] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906109#comment-14906109 ] Adrian Tanase commented on SPARK-10792: --- Yarn side or Spark side? If it does, shoul

[jira] [Updated] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Tanase updated SPARK-10792: -- Priority: Minor (was: Major) > Spark streaming + YARN – executor is not re-created on machine

[jira] [Updated] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Tanase updated SPARK-10792: -- Priority: Major (was: Minor) > Spark streaming + YARN – executor is not re-created on machine

[jira] [Commented] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906116#comment-14906116 ] Sean Owen commented on SPARK-10792: --- Both potentially, though I mean the Spark side. In

[jira] [Commented] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906133#comment-14906133 ] Adrian Tanase commented on SPARK-10792: --- Correct - I forgot to attach a screenshot

[jira] [Commented] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2015-09-24 Thread Konstantinos Kougios (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906126#comment-14906126 ] Konstantinos Kougios commented on SPARK-5928: - Same issue here with spark 1.5.

[jira] [Updated] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Tanase updated SPARK-10792: -- Attachment: Screen Shot 2015-09-21 at 1.58.28 PM.png > Spark streaming + YARN – executor is not

[jira] [Created] (SPARK-10793) Make sparks use/subclassing of hive more maintainable

2015-09-24 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-10793: -- Summary: Make sparks use/subclassing of hive more maintainable Key: SPARK-10793 URL: https://issues.apache.org/jira/browse/SPARK-10793 Project: Spark Iss

[jira] [Commented] (SPARK-10793) Make sparks use/subclassing of hive more maintainable

2015-09-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906172#comment-14906172 ] Steve Loughran commented on SPARK-10793: Leaving the sqp/hive parser integration

[jira] [Assigned] (SPARK-9346) Conversion is applied three times on partitioned data sources that require conversion

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-9346: --- Assignee: Apache Spark > Conversion is applied three times on partitioned data sources that r

[jira] [Commented] (SPARK-9346) Conversion is applied three times on partitioned data sources that require conversion

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906197#comment-14906197 ] Apache Spark commented on SPARK-9346: - User 'viirya' has created a pull request for th

[jira] [Assigned] (SPARK-9346) Conversion is applied three times on partitioned data sources that require conversion

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-9346: --- Assignee: (was: Apache Spark) > Conversion is applied three times on partitioned data sou

[jira] [Created] (SPARK-10794) Spark-SQL- select query on table column with binary Data Type displays error message- java.lang.ClassCastException: java.lang.String cannot be cast to [B

2015-09-24 Thread Anilkumar Kalshetti (JIRA)
Anilkumar Kalshetti created SPARK-10794: --- Summary: Spark-SQL- select query on table column with binary Data Type displays error message- java.lang.ClassCastException: java.lang.String cannot be cast to [B Key: SPARK-10794

[jira] [Commented] (SPARK-7483) [MLLib] Using Kryo with FPGrowth fails with an exception

2015-09-24 Thread simon.lou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906222#comment-14906222 ] simon.lou commented on SPARK-7483: -- kyro not support ListBuffer because ListBuffer don't

[jira] [Updated] (SPARK-10794) Spark-SQL- select query on table column with binary Data Type displays error message- java.lang.ClassCastException: java.lang.String cannot be cast to [B

2015-09-24 Thread Anilkumar Kalshetti (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anilkumar Kalshetti updated SPARK-10794: Description: Spark-SQL connected to Hive Metastore-- MapR5.0 has Hive 1.0.0 Use bee

[jira] [Commented] (SPARK-7483) [MLLib] Using Kryo with FPGrowth fails with an exception

2015-09-24 Thread simon.lou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906221#comment-14906221 ] simon.lou commented on SPARK-7483: -- kyro not support ListBuffer because ListBuffer don't

[jira] [Issue Comment Deleted] (SPARK-7483) [MLLib] Using Kryo with FPGrowth fails with an exception

2015-09-24 Thread simon.lou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] simon.lou updated SPARK-7483: - Comment: was deleted (was: kyro not support ListBuffer because ListBuffer don't have any "zero argument c

[jira] [Updated] (SPARK-10794) Spark-SQL- select query on table column with binary Data Type displays error message- java.lang.ClassCastException: java.lang.String cannot be cast to [B

2015-09-24 Thread Anilkumar Kalshetti (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anilkumar Kalshetti updated SPARK-10794: Attachment: binaryDataType.png testbinary.txt > Spark-SQL- select q

[jira] [Created] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2015-09-24 Thread Harshit (JIRA)
Harshit created SPARK-10795: --- Summary: FileNotFoundException while deploying pyspark job on cluster Key: SPARK-10795 URL: https://issues.apache.org/jira/browse/SPARK-10795 Project: Spark Issue Typ

[jira] [Updated] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2015-09-24 Thread Harshit (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harshit updated SPARK-10795: Description: I am trying to run simple spark job using pyspark, it works as standalone , but while I deplo

[jira] [Commented] (SPARK-10778) Implement toString for AssociationRules.Rule

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906272#comment-14906272 ] Apache Spark commented on SPARK-10778: -- User 'y-shimizu' has created a pull request

[jira] [Assigned] (SPARK-10778) Implement toString for AssociationRules.Rule

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10778: Assignee: Apache Spark > Implement toString for AssociationRules.Rule > --

[jira] [Assigned] (SPARK-10778) Implement toString for AssociationRules.Rule

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10778: Assignee: (was: Apache Spark) > Implement toString for AssociationRules.Rule > ---

[jira] [Commented] (SPARK-6028) Provide an alternative RPC implementation based on the network transport module

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906282#comment-14906282 ] Apache Spark commented on SPARK-6028: - User 'zsxwing' has created a pull request for t

[jira] [Commented] (SPARK-10474) TungstenAggregation cannot acquire memory for pointer array after switching to sort-based

2015-09-24 Thread Yi Zhou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906291#comment-14906291 ] Yi Zhou commented on SPARK-10474: - Hi [~andrewor14] [~yhuai]. It's OK for me and get no e

[jira] [Created] (SPARK-10796) The Stage taskSets may are all removed while stage still have pending partitions after having lost some executors

2015-09-24 Thread SuYan (JIRA)
SuYan created SPARK-10796: - Summary: The Stage taskSets may are all removed while stage still have pending partitions after having lost some executors Key: SPARK-10796 URL: https://issues.apache.org/jira/browse/SPARK-1079

[jira] [Commented] (SPARK-10688) Python API for AFTSurvivalRegression

2015-09-24 Thread Kai Jiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906372#comment-14906372 ] Kai Jiang commented on SPARK-10688: --- Working on it~ > Python API for AFTSurvivalRegres

[jira] [Created] (SPARK-10797) RDD's coalesce should not write out the temporary key

2015-09-24 Thread Zoltán Zvara (JIRA)
Zoltán Zvara created SPARK-10797: Summary: RDD's coalesce should not write out the temporary key Key: SPARK-10797 URL: https://issues.apache.org/jira/browse/SPARK-10797 Project: Spark Issue T

[jira] [Commented] (SPARK-10797) RDD's coalesce should not write out the temporary key

2015-09-24 Thread Zoltán Zvara (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906397#comment-14906397 ] Zoltán Zvara commented on SPARK-10797: -- I have prepared a solution for this, because

[jira] [Created] (SPARK-10798) JsonMappingException with Spark Context Parallelize

2015-09-24 Thread Dev Lakhani (JIRA)
Dev Lakhani created SPARK-10798: --- Summary: JsonMappingException with Spark Context Parallelize Key: SPARK-10798 URL: https://issues.apache.org/jira/browse/SPARK-10798 Project: Spark Issue Type:

[jira] [Updated] (SPARK-10798) JsonMappingException with Spark Context Parallelize

2015-09-24 Thread Dev Lakhani (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dev Lakhani updated SPARK-10798: Description: When trying to create an RDD of Rows using a Java Spark Context: List rows= new Vecto

[jira] [Created] (SPARK-10799) Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads

2015-09-24 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-10799: - Summary: Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads Key: SPARK-10799 URL: https://issues.apache.org/jira/browse/SPARK-10799 Project: S

[jira] [Updated] (SPARK-10799) Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads

2015-09-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10799: -- Labels: flaky-test (was: ) > Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple

[jira] [Reopened] (SPARK-10651) Flaky test: BroadcastSuite

2015-09-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reopened SPARK-10651: --- I think timeout doesn't work well. Saw more failures: https://amplab.cs.berkeley.edu/jenkins/job

[jira] [Created] (SPARK-10800) Flaky test: org.apache.spark.deploy.StandaloneDynamicAllocationSuite

2015-09-24 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-10800: - Summary: Flaky test: org.apache.spark.deploy.StandaloneDynamicAllocationSuite Key: SPARK-10800 URL: https://issues.apache.org/jira/browse/SPARK-10800 Project: Spark

[jira] [Updated] (SPARK-10651) Flaky test: BroadcastSuite

2015-09-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10651: -- Fix Version/s: (was: 1.6.0) > Flaky test: BroadcastSuite > -- > >

[jira] [Assigned] (SPARK-10799) Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10799: Assignee: Apache Spark (was: Shixiong Zhu) > Flaky test: org.apache.spark.rpc.netty.Inbox

[jira] [Commented] (SPARK-10799) Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906490#comment-14906490 ] Apache Spark commented on SPARK-10799: -- User 'zsxwing' has created a pull request fo

[jira] [Assigned] (SPARK-10799) Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10799: Assignee: Shixiong Zhu (was: Apache Spark) > Flaky test: org.apache.spark.rpc.netty.Inbox

[jira] [Updated] (SPARK-10801) StatCounter uses mutability and is not thread-safe

2015-09-24 Thread Gianmario Spacagna (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmario Spacagna updated SPARK-10801: --- Summary: StatCounter uses mutability and is not thread-safe (was: StatCounter uses m

[jira] [Created] (SPARK-10801) StatCounter uses mutability, is not thread-safe and hard to understand its implementation

2015-09-24 Thread Gianmario Spacagna (JIRA)
Gianmario Spacagna created SPARK-10801: -- Summary: StatCounter uses mutability, is not thread-safe and hard to understand its implementation Key: SPARK-10801 URL: https://issues.apache.org/jira/browse/SPARK-10

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906505#comment-14906505 ] Jonathan Kelly commented on SPARK-10790: Yes, this is on Spark 1.5.0. That's why

[jira] [Updated] (SPARK-10801) StatCounter uses mutability and is not thread-safe

2015-09-24 Thread Gianmario Spacagna (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmario Spacagna updated SPARK-10801: --- Description: The current implementation of org.apache.spark.util.StatCounter is muta

[jira] [Updated] (SPARK-10801) StatCounter uses mutability and is not thread-safe

2015-09-24 Thread Gianmario Spacagna (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmario Spacagna updated SPARK-10801: --- Affects Version/s: 1.0.0 > StatCounter uses mutability and is not thread-safe > -

[jira] [Updated] (SPARK-10778) Implement toString for AssociationRules.Rule

2015-09-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10778: -- Assignee: shimizu yoshihiro > Implement toString for AssociationRules.Rule > --

[jira] [Commented] (SPARK-10801) StatCounter uses mutability and is not thread-safe

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906512#comment-14906512 ] Sean Owen commented on SPARK-10801: --- Are you suggesting it be immutable? I think that w

[jira] [Updated] (SPARK-10670) Link to each language's API in codetabs in ML docs: spark.ml

2015-09-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10670: -- Assignee: yuhao yang > Link to each language's API in codetabs in ML docs: spark.ml > -

[jira] [Updated] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10790: -- Priority: Major (was: Critical) Yeah if so then ignore most of this since I thought you were on 1.4.1.

[jira] [Updated] (SPARK-10789) Cluster mode SparkSubmit classpath only includes Spark assembly

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Kelly updated SPARK-10789: --- Summary: Cluster mode SparkSubmit classpath only includes Spark assembly (was: Cluster mode

[jira] [Created] (SPARK-10802) Let ALS recommend for subset of data

2015-09-24 Thread Tomasz Bartczak (JIRA)
Tomasz Bartczak created SPARK-10802: --- Summary: Let ALS recommend for subset of data Key: SPARK-10802 URL: https://issues.apache.org/jira/browse/SPARK-10802 Project: Spark Issue Type: Improv

[jira] [Updated] (SPARK-9103) Tracking spark's memory usage

2015-09-24 Thread Zhang, Liye (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhang, Liye updated SPARK-9103: --- Attachment: Tracking Spark Memory Usage - Phase 1.pdf > Tracking spark's memory usage > --

[jira] [Commented] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2015-09-24 Thread Konstantinos Kougios (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906572#comment-14906572 ] Konstantinos Kougios commented on SPARK-5928: - Is there a work around for this

[jira] [Commented] (SPARK-10688) Python API for AFTSurvivalRegression

2015-09-24 Thread Gayathri Murali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906583#comment-14906583 ] Gayathri Murali commented on SPARK-10688: - I started working on it as well. Since

[jira] [Created] (SPARK-10803) Allow users to write and query Parquet user-defined key-value metadata directly

2015-09-24 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-10803: -- Summary: Allow users to write and query Parquet user-defined key-value metadata directly Key: SPARK-10803 URL: https://issues.apache.org/jira/browse/SPARK-10803 Project:

[jira] [Commented] (SPARK-10802) Let ALS recommend for subset of data

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906612#comment-14906612 ] Sean Owen commented on SPARK-10802: --- You can already pass an RDD of user,item pairs. I

[jira] [Commented] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906618#comment-14906618 ] Yin Huai commented on SPARK-10741: -- [~ianlcsd] Can you try the following queries to see

[jira] [Resolved] (SPARK-10765) use new aggregate interface for hive UDAF

2015-09-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-10765. -- Resolution: Fixed Fix Version/s: 1.6.0 > use new aggregate interface for hive UDAF > ---

[jira] [Commented] (SPARK-10765) use new aggregate interface for hive UDAF

2015-09-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906637#comment-14906637 ] Yin Huai commented on SPARK-10765: -- This issue has been resolved by https://github.com/a

[jira] [Commented] (SPARK-10487) MLlib model fitting causes DataFrame write to break with OutOfMemory exception

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906656#comment-14906656 ] Joseph K. Bradley commented on SPARK-10487: --- Ohh, that's very helpful. I suspe

[jira] [Updated] (SPARK-10797) RDD's coalesce should not write out the temporary key

2015-09-24 Thread Zoltán Zvara (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Zvara updated SPARK-10797: - Description: It seems that {{RDD.coalesce}} will unnecessarily write out (to shuffle files) temp

[jira] [Updated] (SPARK-10797) RDD's coalesce should not write out the temporary key

2015-09-24 Thread Zoltán Zvara (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Zvara updated SPARK-10797: - Description: It seems that {{RDD.coalesce}} will unnecessarily write out (to shuffle files) temp

[jira] [Updated] (SPARK-10797) RDD's coalesce should not write out the temporary key

2015-09-24 Thread Zoltán Zvara (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Zvara updated SPARK-10797: - Description: It seems that {{RDD.coalesce}} will unnecessarily write out (to shuffle files) temp

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906696#comment-14906696 ] Saisai Shao commented on SPARK-10790: - Thanks [~srowen], let me check it. > Dynamic

[jira] [Created] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2015-09-24 Thread Antonio Piccolboni (JIRA)
Antonio Piccolboni created SPARK-10804: -- Summary: "LOCAL" in LOAD DATA LOCAL INPATH means "remote" Key: SPARK-10804 URL: https://issues.apache.org/jira/browse/SPARK-10804 Project: Spark

[jira] [Commented] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2015-09-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906723#comment-14906723 ] Marcelo Vanzin commented on SPARK-10804: This is really a Hive issue, which Spark

[jira] [Created] (SPARK-10805) JSON Data Frame does not return correct string lengths

2015-09-24 Thread Jeff Li (JIRA)
Jeff Li created SPARK-10805: --- Summary: JSON Data Frame does not return correct string lengths Key: SPARK-10805 URL: https://issues.apache.org/jira/browse/SPARK-10805 Project: Spark Issue Type: Impr

[jira] [Created] (SPARK-10806) Following val redefinition, sometimes the old value is still visible

2015-09-24 Thread Boris Alexeev (JIRA)
Boris Alexeev created SPARK-10806: - Summary: Following val redefinition, sometimes the old value is still visible Key: SPARK-10806 URL: https://issues.apache.org/jira/browse/SPARK-10806 Project: Spark

[jira] [Created] (SPARK-10807) Add as.data.frame() as a synonym for collect()

2015-09-24 Thread Oscar D. Lara Yejas (JIRA)
Oscar D. Lara Yejas created SPARK-10807: --- Summary: Add as.data.frame() as a synonym for collect() Key: SPARK-10807 URL: https://issues.apache.org/jira/browse/SPARK-10807 Project: Spark

[jira] [Commented] (SPARK-10807) Add as.data.frame() as a synonym for collect()

2015-09-24 Thread Oscar D. Lara Yejas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906760#comment-14906760 ] Oscar D. Lara Yejas commented on SPARK-10807: - I'm working on this one. Than

[jira] [Created] (SPARK-10808) LDA user guide: discuss running time of LDA

2015-09-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-10808: - Summary: LDA user guide: discuss running time of LDA Key: SPARK-10808 URL: https://issues.apache.org/jira/browse/SPARK-10808 Project: Spark Issue T

[jira] [Updated] (SPARK-10808) LDA user guide: discuss running time of LDA

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-10808: -- Description: Based on feedback like [SPARK-10791], we should discuss the computational

[jira] [Commented] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Ian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906768#comment-14906768 ] Ian commented on SPARK-10741: - The org.apache.spark.sql.AnalysisException is fixed, but the w

[jira] [Created] (SPARK-10809) Single-document topicDistributions method for LocalLDAModel

2015-09-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-10809: - Summary: Single-document topicDistributions method for LocalLDAModel Key: SPARK-10809 URL: https://issues.apache.org/jira/browse/SPARK-10809 Project: Spark

[jira] [Commented] (SPARK-10791) Optimize MLlib LDA topic distribution query performance

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906776#comment-14906776 ] Joseph K. Bradley commented on SPARK-10791: --- This sounds like a question for th

[jira] [Closed] (SPARK-10791) Optimize MLlib LDA topic distribution query performance

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-10791. - Resolution: Done > Optimize MLlib LDA topic distribution query performance >

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906787#comment-14906787 ] Saisai Shao commented on SPARK-10790: - Hi [~jonathak], let me trying to understand yo

[jira] [Commented] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906786#comment-14906786 ] Yin Huai commented on SPARK-10741: -- Any error? > Hive Query Having/OrderBy against Parq

[jira] [Commented] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Ian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906801#comment-14906801 ] Ian commented on SPARK-10741: - Two of my tests failed. The query returns nothing. {code}

[jira] [Closed] (SPARK-10487) MLlib model fitting causes DataFrame write to break with OutOfMemory exception

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-10487. - Resolution: Not A Problem > MLlib model fitting causes DataFrame write to break with OutO

[jira] [Commented] (SPARK-10487) MLlib model fitting causes DataFrame write to break with OutOfMemory exception

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906802#comment-14906802 ] Joseph K. Bradley commented on SPARK-10487: --- As far as I can tell, there isn't

[jira] [Commented] (SPARK-10773) Repartition operation failing on RDD with "argument type mismatch" error

2015-09-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906803#comment-14906803 ] Andrew Or commented on SPARK-10773: --- I believe this is fixed in 1.5.0: https://issues.a

[jira] [Commented] (SPARK-6028) Provide an alternative RPC implementation based on the network transport module

2015-09-24 Thread Neal Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906807#comment-14906807 ] Neal Yin commented on SPARK-6028: - [~rxin] I am wondering why spark wants to remove AKKA d

[jira] [Comment Edited] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Ian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906801#comment-14906801 ] Ian edited comment on SPARK-10741 at 9/24/15 6:46 PM: -- Two of my tes
