[jira] [Assigned] (SPARK-10670) Link to each language's API in codetabs in ML docs: spark.ml

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10670: Assignee: (was: Apache Spark) > Link to each language's API in codetabs in ML docs:

[jira] [Assigned] (SPARK-10670) Link to each language's API in codetabs in ML docs: spark.ml

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10670: Assignee: Apache Spark > Link to each language's API in codetabs in ML docs: spark.ml >

[jira] [Commented] (SPARK-8386) DataFrame and JDBC regression

2015-09-24 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905940#comment-14905940 ] Liang-Chi Hsieh commented on SPARK-8386: [~phaumer] I can't reproduce this problem. Can you give

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906047#comment-14906047 ] Jonathan Kelly commented on SPARK-10790: I did search through all dynamicAllocation-related JIRAs

[jira] [Updated] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10792: -- Priority: Minor (was: Major) > Spark streaming + YARN – executor is not re-created on machine restart

[jira] [Commented] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906068#comment-14906068 ] Adrian Tanase commented on SPARK-10792: --- https://issues.apache.org/jira/browse/SPARK-8297 seems to

[jira] [Commented] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906081#comment-14906081 ] Sean Owen commented on SPARK-10792: --- I wonder if this is interacting with a blacklist mechanism? sort

[jira] [Updated] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Tanase updated SPARK-10792: -- Description: We’re using spark streaming (1.4.0), deployed on AWS through yarn. It’s a

[jira] [Updated] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Tanase updated SPARK-10792: -- Priority: Major (was: Minor) > Spark streaming + YARN – executor is not re-created on machine

[jira] [Updated] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Tanase updated SPARK-10792: -- Priority: Minor (was: Major) > Spark streaming + YARN – executor is not re-created on machine

[jira] [Created] (SPARK-10794) Spark-SQL- select query on table column with binary Data Type displays error message- java.lang.ClassCastException: java.lang.String cannot be cast to [B

2015-09-24 Thread Anilkumar Kalshetti (JIRA)
Anilkumar Kalshetti created SPARK-10794: --- Summary: Spark-SQL- select query on table column with binary Data Type displays error message- java.lang.ClassCastException: java.lang.String cannot be cast to [B Key: SPARK-10794

[jira] [Commented] (SPARK-7483) [MLLib] Using Kryo with FPGrowth fails with an exception

2015-09-24 Thread simon.lou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906222#comment-14906222 ] simon.lou commented on SPARK-7483: -- Kryo does not support ListBuffer because ListBuffer doesn't have any "zero

[jira] [Updated] (SPARK-10794) Spark-SQL- select query on table column with binary Data Type displays error message- java.lang.ClassCastException: java.lang.String cannot be cast to [B

2015-09-24 Thread Anilkumar Kalshetti (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anilkumar Kalshetti updated SPARK-10794: Description: Spark-SQL connected to Hive Metastore-- MapR5.0 has Hive 1.0.0 Use

[jira] [Commented] (SPARK-7483) [MLLib] Using Kryo with FPGrowth fails with an exception

2015-09-24 Thread simon.lou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906221#comment-14906221 ] simon.lou commented on SPARK-7483: -- Kryo does not support ListBuffer because ListBuffer doesn't have any "zero

[jira] [Commented] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906133#comment-14906133 ] Adrian Tanase commented on SPARK-10792: --- Correct - I forgot to attach a screenshot where this is

[jira] [Commented] (SPARK-10793) Make sparks use/subclassing of hive more maintainable

2015-09-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906172#comment-14906172 ] Steve Loughran commented on SPARK-10793: Leaving the sqp/hive parser integration alone, this is

[jira] [Commented] (SPARK-9346) Conversion is applied three times on partitioned data sources that require conversion

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906197#comment-14906197 ] Apache Spark commented on SPARK-9346: - User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-9346) Conversion is applied three times on partitioned data sources that require conversion

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-9346: --- Assignee: (was: Apache Spark) > Conversion is applied three times on partitioned data

[jira] [Assigned] (SPARK-9346) Conversion is applied three times on partitioned data sources that require conversion

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-9346: --- Assignee: Apache Spark > Conversion is applied three times on partitioned data sources that

[jira] [Created] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2015-09-24 Thread Harshit (JIRA)
Harshit created SPARK-10795: --- Summary: FileNotFoundException while deploying pyspark job on cluster Key: SPARK-10795 URL: https://issues.apache.org/jira/browse/SPARK-10795 Project: Spark Issue

[jira] [Updated] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Tanase updated SPARK-10792: -- Attachment: Screen Shot 2015-09-21 at 1.58.28 PM.png > Spark streaming + YARN – executor is

[jira] [Commented] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906109#comment-14906109 ] Adrian Tanase commented on SPARK-10792: --- Yarn side or Spark side? If it does, shouldn't that also

[jira] [Commented] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906116#comment-14906116 ] Sean Owen commented on SPARK-10792: --- Both potentially, though I mean the Spark side. In 6 it is

[jira] [Created] (SPARK-10793) Make sparks use/subclassing of hive more maintainable

2015-09-24 Thread Steve Loughran (JIRA)
Steve Loughran created SPARK-10793: -- Summary: Make sparks use/subclassing of hive more maintainable Key: SPARK-10793 URL: https://issues.apache.org/jira/browse/SPARK-10793 Project: Spark

[jira] [Issue Comment Deleted] (SPARK-7483) [MLLib] Using Kryo with FPGrowth fails with an exception

2015-09-24 Thread simon.lou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] simon.lou updated SPARK-7483: - Comment: was deleted (was: Kryo does not support ListBuffer because ListBuffer doesn't have any "zero argument

[jira] [Updated] (SPARK-10794) Spark-SQL- select query on table column with binary Data Type displays error message- java.lang.ClassCastException: java.lang.String cannot be cast to [B

2015-09-24 Thread Anilkumar Kalshetti (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anilkumar Kalshetti updated SPARK-10794: Attachment: binaryDataType.png testbinary.txt > Spark-SQL- select

[jira] [Updated] (SPARK-10795) FileNotFoundException while deploying pyspark job on cluster

2015-09-24 Thread Harshit (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harshit updated SPARK-10795: Description: I am trying to run simple spark job using pyspark, it works as standalone , but while I

[jira] [Updated] (SPARK-10789) Cluster mode SparkSubmit classpath only includes Spark classpath

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Kelly updated SPARK-10789: --- Component/s: Spark Submit > Cluster mode SparkSubmit classpath only includes Spark classpath

[jira] [Created] (SPARK-10792) Spark streaming + YARN – executor is not re-created on machine restart

2015-09-24 Thread Adrian Tanase (JIRA)
Adrian Tanase created SPARK-10792: - Summary: Spark streaming + YARN – executor is not re-created on machine restart Key: SPARK-10792 URL: https://issues.apache.org/jira/browse/SPARK-10792 Project:

[jira] [Commented] (SPARK-10773) Repartition operation failing on RDD with "argument type mismatch" error

2015-09-24 Thread Bo soon Park (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906022#comment-14906022 ] Bo soon Park commented on SPARK-10773: -- I also saw an error like this in mapr-spark-1.4.1 [Code]

[jira] [Created] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Jonathan Kelly (JIRA)
Jonathan Kelly created SPARK-10790: -- Summary: Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors Key: SPARK-10790 URL:

[jira] [Created] (SPARK-10791) Optimize MLlib LDA topic distribution query performance

2015-09-24 Thread Marko Asplund (JIRA)
Marko Asplund created SPARK-10791: - Summary: Optimize MLlib LDA topic distribution query performance Key: SPARK-10791 URL: https://issues.apache.org/jira/browse/SPARK-10791 Project: Spark

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906055#comment-14906055 ] Sean Owen commented on SPARK-10790: --- In your email I think you said you were using 1.4.1; just to be

[jira] [Updated] (SPARK-10789) Cluster mode SparkSubmit classpath only includes Spark classpath

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Kelly updated SPARK-10789: --- Description: When using cluster deploy mode, the classpath of the SparkSubmit process that

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905989#comment-14905989 ] Sean Owen commented on SPARK-10790: --- [~jonathak] A number of quite similar sounding things were fixed

[jira] [Commented] (SPARK-10644) Applications wait even if free executors are available

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906030#comment-14906030 ] Sean Owen commented on SPARK-10644: --- How many cores per executor? I'm assuming you mean 1 and have

[jira] [Commented] (SPARK-10670) Link to each language's API in codetabs in ML docs: spark.ml

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905898#comment-14905898 ] Apache Spark commented on SPARK-10670: -- User 'hhbyyh' has created a pull request for this issue:

[jira] [Created] (SPARK-10789) Cluster mode SparkSubmit classpath only includes Spark classpath

2015-09-24 Thread Jonathan Kelly (JIRA)
Jonathan Kelly created SPARK-10789: -- Summary: Cluster mode SparkSubmit classpath only includes Spark classpath Key: SPARK-10789 URL: https://issues.apache.org/jira/browse/SPARK-10789 Project: Spark

[jira] [Commented] (SPARK-10487) MLlib model fitting causes DataFrame write to break with OutOfMemory exception

2015-09-24 Thread Zsolt Tóth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906008#comment-14906008 ] Zsolt Tóth commented on SPARK-10487: Increasing the perm size on the driver fixes the OOM:

[jira] [Commented] (SPARK-10778) Implement toString for AssociationRules.Rule

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906272#comment-14906272 ] Apache Spark commented on SPARK-10778: -- User 'y-shimizu' has created a pull request for this issue:

[jira] [Assigned] (SPARK-10778) Implement toString for AssociationRules.Rule

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10778: Assignee: Apache Spark > Implement toString for AssociationRules.Rule >

[jira] [Assigned] (SPARK-10778) Implement toString for AssociationRules.Rule

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10778: Assignee: (was: Apache Spark) > Implement toString for AssociationRules.Rule >

[jira] [Commented] (SPARK-6028) Provide an alternative RPC implementation based on the network transport module

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906282#comment-14906282 ] Apache Spark commented on SPARK-6028: - User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-10474) TungstenAggregation cannot acquire memory for pointer array after switching to sort-based

2015-09-24 Thread Yi Zhou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906291#comment-14906291 ] Yi Zhou commented on SPARK-10474: - Hi [~andrewor14] [~yhuai]. It works for me and I get no errors. Thanks!

[jira] [Created] (SPARK-10796) The Stage taskSets may are all removed while stage still have pending partitions after having lost some executors

2015-09-24 Thread SuYan (JIRA)
SuYan created SPARK-10796: - Summary: The Stage taskSets may are all removed while stage still have pending partitions after having lost some executors Key: SPARK-10796 URL:

[jira] [Commented] (SPARK-10688) Python API for AFTSurvivalRegression

2015-09-24 Thread Kai Jiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906372#comment-14906372 ] Kai Jiang commented on SPARK-10688: --- Working on it~ > Python API for AFTSurvivalRegression >

[jira] [Commented] (SPARK-10797) RDD's coalesce should not write out the temporary key

2015-09-24 Thread Zoltán Zvara (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906397#comment-14906397 ] Zoltán Zvara commented on SPARK-10797: -- I have prepared a solution for this, because I had to

[jira] [Updated] (SPARK-10798) JsonMappingException with Spark Context Parallelize

2015-09-24 Thread Dev Lakhani (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dev Lakhani updated SPARK-10798: Description: When trying to create an RDD of Rows using a Java Spark Context: List rows= new

[jira] [Created] (SPARK-10797) RDD's coalesce should not write out the temporary key

2015-09-24 Thread Zoltán Zvara (JIRA)
Zoltán Zvara created SPARK-10797: Summary: RDD's coalesce should not write out the temporary key Key: SPARK-10797 URL: https://issues.apache.org/jira/browse/SPARK-10797 Project: Spark Issue

[jira] [Created] (SPARK-10798) JsonMappingException with Spark Context Parallelize

2015-09-24 Thread Dev Lakhani (JIRA)
Dev Lakhani created SPARK-10798: --- Summary: JsonMappingException with Spark Context Parallelize Key: SPARK-10798 URL: https://issues.apache.org/jira/browse/SPARK-10798 Project: Spark Issue

[jira] [Created] (SPARK-10799) Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads

2015-09-24 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-10799: - Summary: Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads Key: SPARK-10799 URL: https://issues.apache.org/jira/browse/SPARK-10799 Project:

[jira] [Updated] (SPARK-10799) Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads

2015-09-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10799: -- Labels: flaky-test (was: ) > Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple

[jira] [Reopened] (SPARK-10651) Flaky test: BroadcastSuite

2015-09-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reopened SPARK-10651: --- I think timeout doesn't work well. Saw more failures:

[jira] [Created] (SPARK-10800) Flaky test: org.apache.spark.deploy.StandaloneDynamicAllocationSuite

2015-09-24 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-10800: - Summary: Flaky test: org.apache.spark.deploy.StandaloneDynamicAllocationSuite Key: SPARK-10800 URL: https://issues.apache.org/jira/browse/SPARK-10800 Project:

[jira] [Assigned] (SPARK-10799) Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10799: Assignee: Apache Spark (was: Shixiong Zhu) > Flaky test:

[jira] [Commented] (SPARK-10799) Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906490#comment-14906490 ] Apache Spark commented on SPARK-10799: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-10799) Flaky test: org.apache.spark.rpc.netty.InboxSuite.post: multiple threads

2015-09-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-10799: Assignee: Shixiong Zhu (was: Apache Spark) > Flaky test:

[jira] [Updated] (SPARK-10801) StatCounter uses mutability and is not thread-safe

2015-09-24 Thread Gianmario Spacagna (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmario Spacagna updated SPARK-10801: --- Summary: StatCounter uses mutability and is not thread-safe (was: StatCounter uses

[jira] [Updated] (SPARK-10801) StatCounter uses mutability and is not thread-safe

2015-09-24 Thread Gianmario Spacagna (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmario Spacagna updated SPARK-10801: --- Description: The current implementation of org.apache.spark.util.StatCounter is

[jira] [Updated] (SPARK-10651) Flaky test: BroadcastSuite

2015-09-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10651: -- Fix Version/s: (was: 1.6.0) > Flaky test: BroadcastSuite > -- > >

[jira] [Updated] (SPARK-10801) StatCounter uses mutability and is not thread-safe

2015-09-24 Thread Gianmario Spacagna (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmario Spacagna updated SPARK-10801: --- Affects Version/s: 1.0.0 > StatCounter uses mutability and is not thread-safe >

[jira] [Updated] (SPARK-10778) Implement toString for AssociationRules.Rule

2015-09-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10778: -- Assignee: shimizu yoshihiro > Implement toString for AssociationRules.Rule >

[jira] [Updated] (SPARK-10789) Cluster mode SparkSubmit classpath only includes Spark assembly

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Kelly updated SPARK-10789: --- Summary: Cluster mode SparkSubmit classpath only includes Spark assembly (was: Cluster mode

[jira] [Created] (SPARK-10801) StatCounter uses mutability, is not thread-safe and hard to understand its implementation

2015-09-24 Thread Gianmario Spacagna (JIRA)
Gianmario Spacagna created SPARK-10801: -- Summary: StatCounter uses mutability, is not thread-safe and hard to understand its implementation Key: SPARK-10801 URL:

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906505#comment-14906505 ] Jonathan Kelly commented on SPARK-10790: Yes, this is on Spark 1.5.0. That's why I chose 1.5.0

[jira] [Commented] (SPARK-10801) StatCounter uses mutability and is not thread-safe

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906512#comment-14906512 ] Sean Owen commented on SPARK-10801: --- Are you suggesting it be immutable? I think that would be much

[jira] [Updated] (SPARK-10670) Link to each language's API in codetabs in ML docs: spark.ml

2015-09-24 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10670: -- Assignee: yuhao yang > Link to each language's API in codetabs in ML docs: spark.ml >

[jira] [Updated] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-10790: -- Priority: Major (was: Critical) Yeah if so then ignore most of this since I thought you were on

[jira] [Created] (SPARK-10802) Let ALS recommend for subset of data

2015-09-24 Thread Tomasz Bartczak (JIRA)
Tomasz Bartczak created SPARK-10802: --- Summary: Let ALS recommend for subset of data Key: SPARK-10802 URL: https://issues.apache.org/jira/browse/SPARK-10802 Project: Spark Issue Type:

[jira] [Commented] (SPARK-5928) Remote Shuffle Blocks cannot be more than 2 GB

2015-09-24 Thread Konstantinos Kougios (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906572#comment-14906572 ] Konstantinos Kougios commented on SPARK-5928: - Is there a work around for this? > Remote

[jira] [Created] (SPARK-10803) Allow users to write and query Parquet user-defined key-value metadata directly

2015-09-24 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-10803: -- Summary: Allow users to write and query Parquet user-defined key-value metadata directly Key: SPARK-10803 URL: https://issues.apache.org/jira/browse/SPARK-10803 Project:

[jira] [Commented] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906618#comment-14906618 ] Yin Huai commented on SPARK-10741: -- [~ianlcsd] Can you try the following queries to see if you can

[jira] [Commented] (SPARK-10688) Python API for AFTSurvivalRegression

2015-09-24 Thread Gayathri Murali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906583#comment-14906583 ] Gayathri Murali commented on SPARK-10688: - I started working on it as well. Since there isn't a

[jira] [Commented] (SPARK-10802) Let ALS recommend for subset of data

2015-09-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906612#comment-14906612 ] Sean Owen commented on SPARK-10802: --- You can already pass an RDD of user,item pairs. I think that's

[jira] [Commented] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906786#comment-14906786 ] Yin Huai commented on SPARK-10741: -- Any error? > Hive Query Having/OrderBy against Parquet table is not

[jira] [Comment Edited] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Ian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906801#comment-14906801 ] Ian edited comment on SPARK-10741 at 9/24/15 6:46 PM: -- Two of my tests failed. The

[jira] [Commented] (SPARK-6028) Provide an alternative RPC implementation based on the network transport module

2015-09-24 Thread Neal Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906807#comment-14906807 ] Neal Yin commented on SPARK-6028: - [~rxin] I am wondering why spark wants to remove AKKA dependence? Is

[jira] [Created] (SPARK-10810) Improve session management for SQL

2015-09-24 Thread Davies Liu (JIRA)
Davies Liu created SPARK-10810: -- Summary: Improve session management for SQL Key: SPARK-10810 URL: https://issues.apache.org/jira/browse/SPARK-10810 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906862#comment-14906862 ] Thomas Graves commented on SPARK-10735: --- The issue here appears to be that in spark 1.4.1 it would

[jira] [Created] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2015-09-24 Thread Antonio Piccolboni (JIRA)
Antonio Piccolboni created SPARK-10804: -- Summary: "LOCAL" in LOAD DATA LOCAL INPATH means "remote" Key: SPARK-10804 URL: https://issues.apache.org/jira/browse/SPARK-10804 Project: Spark

[jira] [Created] (SPARK-10807) Add as.data.frame() as a synonym for collect()

2015-09-24 Thread Oscar D. Lara Yejas (JIRA)
Oscar D. Lara Yejas created SPARK-10807: --- Summary: Add as.data.frame() as a synonym for collect() Key: SPARK-10807 URL: https://issues.apache.org/jira/browse/SPARK-10807 Project: Spark

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Glenn Strycker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906877#comment-14906877 ] Glenn Strycker commented on SPARK-10735: This appears very similar to a problem I had earlier

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906894#comment-14906894 ] Josh Rosen commented on SPARK-10735: [~tgraves], Spark 1.5.0 is stricter in its enforcement of types

[jira] [Created] (SPARK-10808) LDA user guide: discuss running time of LDA

2015-09-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-10808: - Summary: LDA user guide: discuss running time of LDA Key: SPARK-10808 URL: https://issues.apache.org/jira/browse/SPARK-10808 Project: Spark Issue

[jira] [Commented] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Ian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906801#comment-14906801 ] Ian commented on SPARK-10741: - Two of my tests failed. The query returns nothing. {code} test("test

[jira] [Commented] (SPARK-10487) MLlib model fitting causes DataFrame write to break with OutOfMemory exception

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906802#comment-14906802 ] Joseph K. Bradley commented on SPARK-10487: --- As far as I can tell, there isn't a huge change

[jira] [Closed] (SPARK-10487) MLlib model fitting causes DataFrame write to break with OutOfMemory exception

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley closed SPARK-10487. - Resolution: Not A Problem > MLlib model fitting causes DataFrame write to break with

[jira] [Resolved] (SPARK-10705) Stop converting internal rows to external rows in DataFrame.toJSON

2015-09-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-10705. -- Resolution: Fixed Fix Version/s: 1.6.0 This issue has been resolved by

[jira] [Updated] (SPARK-10797) RDD's coalesce should not write out the temporary key

2015-09-24 Thread Zoltán Zvara (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Zvara updated SPARK-10797: - Description: It seems that {{RDD.coalesce}} will unnecessarily write out (to shuffle files)

[jira] [Commented] (SPARK-10791) Optimize MLlib LDA topic distribution query performance

2015-09-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906776#comment-14906776 ] Joseph K. Bradley commented on SPARK-10791: --- This sounds like a question for the user list, not

[jira] [Updated] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-10735: --- Description: In spark 1.5.0 we are now seeing an exception when converting an RDD with custom

[jira] [Created] (SPARK-10806) Following val redefinition, sometimes the old value is still visible

2015-09-24 Thread Boris Alexeev (JIRA)
Boris Alexeev created SPARK-10806: - Summary: Following val redefinition, sometimes the old value is still visible Key: SPARK-10806 URL: https://issues.apache.org/jira/browse/SPARK-10806 Project:

[jira] [Commented] (SPARK-10807) Add as.data.frame() as a synonym for collect()

2015-09-24 Thread Oscar D. Lara Yejas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906760#comment-14906760 ] Oscar D. Lara Yejas commented on SPARK-10807: - I'm working on this one. Thanks, Oscar > Add

[jira] [Commented] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Ian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906835#comment-14906835 ] Ian commented on SPARK-10741: - yup, it works. The following insert select statement works for non parquet

[jira] [Commented] (SPARK-10804) "LOCAL" in LOAD DATA LOCAL INPATH means "remote"

2015-09-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906723#comment-14906723 ] Marcelo Vanzin commented on SPARK-10804: This is really a Hive issue, which Spark just inherits

[jira] [Created] (SPARK-10809) Single-document topicDistributions method for LocalLDAModel

2015-09-24 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-10809: - Summary: Single-document topicDistributions method for LocalLDAModel Key: SPARK-10809 URL: https://issues.apache.org/jira/browse/SPARK-10809 Project: Spark

[jira] [Commented] (SPARK-10773) Repartition operation failing on RDD with "argument type mismatch" error

2015-09-24 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906803#comment-14906803 ] Andrew Or commented on SPARK-10773: --- I believe this is fixed in 1.5.0:

[jira] [Commented] (SPARK-10741) Hive Query Having/OrderBy against Parquet table is not working

2015-09-24 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906820#comment-14906820 ] Yin Huai commented on SPARK-10741: -- [~ianlcsd] Does {{select "test1" as c1, (count(*)+1) *10 as c2 from

[jira] [Commented] (SPARK-10735) CatalystTypeConverters MatchError converting RDD with custom object to dataframe

2015-09-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906870#comment-14906870 ] Thomas Graves commented on SPARK-10735: --- [~joshrosen] it appears you did some of the refactoring

[jira] [Commented] (SPARK-10790) Dynamic Allocation does not request any executors if first stage needs less than or equal to spark.dynamicAllocation.initialExecutors

2015-09-24 Thread Jonathan Kelly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14906912#comment-14906912 ] Jonathan Kelly commented on SPARK-10790: I can reproduce it with minExecutors=N and
