[jira] [Commented] (SPARK-7324) Add DataFrame.dropDuplicates

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525686#comment-14525686 ] Apache Spark commented on SPARK-7324: - User 'kaka1992' has created a pull request for

[jira] [Assigned] (SPARK-7324) Add DataFrame.dropDuplicates

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7324: --- Assignee: Apache Spark Add DataFrame.dropDuplicates

[jira] [Created] (SPARK-7326) Performing window() on a WindowedDStream doesn't work all the time

2015-05-03 Thread Wesley Miao (JIRA)
Wesley Miao created SPARK-7326: -- Summary: Performing window() on a WindowedDStream doesn't work all the time Key: SPARK-7326 URL: https://issues.apache.org/jira/browse/SPARK-7326 Project: Spark

[jira] [Assigned] (SPARK-7324) Add DataFrame.dropDuplicates

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7324: --- Assignee: (was: Apache Spark) Add DataFrame.dropDuplicates

[jira] [Updated] (SPARK-6026) Eliminate the bypassMergeThreshold parameter and associated hash-ish shuffle within the Sort shuffle code

2015-05-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-6026: -- Component/s: Shuffle Eliminate the bypassMergeThreshold parameter and associated hash-ish shuffle

[jira] [Commented] (SPARK-6411) PySpark DataFrames can't be created if any datetimes have timezones

2015-05-03 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526291#comment-14526291 ] Xiangrui Meng commented on SPARK-6411: -- [~airhorns] I'm testing Spark with Pyrolite

[jira] [Closed] (SPARK-4184) Improve Spark Streaming documentation to address commonly-asked questions

2015-05-03 Thread Chris Fregly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Fregly closed SPARK-4184. --- Resolution: Duplicate we'll incorporate changes in incrementally Improve Spark Streaming

[jira] [Updated] (SPARK-6654) Update Kinesis Streaming impls (both KCL-based and Direct) to use latest aws-java-sdk and kinesis-client-library

2015-05-03 Thread Chris Fregly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Fregly updated SPARK-6654: Priority: Major (was: Blocker) Target Version/s: 1.5.0 (was: 1.4.0) Update Kinesis

[jira] [Resolved] (SPARK-6907) Create an isolated classloader for the Hive Client.

2015-05-03 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-6907. - Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5851

[jira] [Resolved] (SPARK-7302) SPARK building documentation still mentions building for yarn 0.23

2015-05-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-7302. -- Resolution: Fixed Fix Version/s: 1.4.0 Assignee: Sean Owen Resolved by

[jira] [Commented] (SPARK-7326) Performing window() on a WindowedDStream doesn't work all the time

2015-05-03 Thread Wesley Miao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526062#comment-14526062 ] Wesley Miao commented on SPARK-7326: What I'd like to achieve is to do multiple-level

[jira] [Assigned] (SPARK-6908) Refactor existing code to use the isolated client

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6908: --- Assignee: (was: Apache Spark) Refactor existing code to use the isolated client

[jira] [Commented] (SPARK-6908) Refactor existing code to use the isolated client

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526104#comment-14526104 ] Apache Spark commented on SPARK-6908: - User 'marmbrus' has created a pull request for

[jira] [Assigned] (SPARK-6908) Refactor existing code to use the isolated client

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6908: --- Assignee: Apache Spark Refactor existing code to use the isolated client

[jira] [Updated] (SPARK-7302) SPARK building documentation still mentions building for yarn 0.23

2015-05-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-7302: - Priority: Minor (was: Major) SPARK building documentation still mentions building for yarn 0.23

[jira] [Commented] (SPARK-6873) Some Hive-Catalyst comparison tests fail due to unimportant order of some printed elements

2015-05-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526019#comment-14526019 ] Sean Owen commented on SPARK-6873: -- Sorry for the late reply, missed this. Yes, HashMap

[jira] [Commented] (SPARK-7013) Add unit test for spark.ml StandardScaler

2015-05-03 Thread Glenn Weidner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526042#comment-14526042 ] Glenn Weidner commented on SPARK-7013: -- I would like to work on this. I've started

[jira] [Commented] (SPARK-7275) Make LogicalRelation public

2015-05-03 Thread Glenn Weidner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526098#comment-14526098 ] Glenn Weidner commented on SPARK-7275: -- I checked the history of the file

[jira] [Commented] (SPARK-2336) Approximate k-NN Models for MLLib

2015-05-03 Thread Sen Fang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526153#comment-14526153 ] Sen Fang commented on SPARK-2336: - Hey Longbao, great to hear from you. To my best

[jira] [Updated] (SPARK-6943) Graphically show RDD's included in a stage

2015-05-03 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-6943: - Attachment: (was: with-stack-trace.png) Graphically show RDD's included in a stage

[jira] [Updated] (SPARK-6943) Graphically show RDD's included in a stage

2015-05-03 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-6943: - Attachment: (was: with-closures.png) Graphically show RDD's included in a stage

[jira] [Assigned] (SPARK-7330) JDBC RDD could lead to NPE when the date field is null

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7330: --- Assignee: Apache Spark JDBC RDD could lead to NPE when the date field is null

[jira] [Commented] (SPARK-6943) Graphically show RDD's included in a stage

2015-05-03 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526181#comment-14526181 ] Andrew Or commented on SPARK-6943: -- Hi [~kayousterhout] I updated the patch to include

[jira] [Assigned] (SPARK-3524) remove workaround to pickle array of float for Pyrolite

2015-05-03 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-3524: Assignee: Xiangrui Meng remove workaround to pickle array of float for Pyrolite

[jira] [Updated] (SPARK-3524) remove workaround to pickle array of float for Pyrolite

2015-05-03 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3524: - Target Version/s: 1.4.0 (was: 1.2.0) remove workaround to pickle array of float for Pyrolite

[jira] [Commented] (SPARK-7327) DataFrame show() method doesn't like empty dataframes

2015-05-03 Thread Olivier Girardot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526231#comment-14526231 ] Olivier Girardot commented on SPARK-7327: - ok thx DataFrame show() method

[jira] [Created] (SPARK-7332) RpcCallContext.sender has a different name from the original sender's name

2015-05-03 Thread Qiping Li (JIRA)
Qiping Li created SPARK-7332: Summary: RpcCallContext.sender has a different name from the original sender's name Key: SPARK-7332 URL: https://issues.apache.org/jira/browse/SPARK-7332 Project: Spark

[jira] [Updated] (SPARK-6602) Replace direct use of Akka with Spark RPC interface

2015-05-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6602: --- Priority: Critical (was: Major) Replace direct use of Akka with Spark RPC interface

[jira] [Updated] (SPARK-5293) Enable Spark user applications to use different versions of Akka

2015-05-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5293: --- Target Version/s: 1.6.0 Enable Spark user applications to use different versions of Akka

[jira] [Updated] (SPARK-6602) Replace direct use of Akka with Spark RPC interface

2015-05-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6602: --- Target Version/s: 1.5.0 (was: 1.4.0) Replace direct use of Akka with Spark RPC interface

[jira] [Assigned] (SPARK-6944) Mechanism to associate generic operator scope with RDD's

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6944: --- Assignee: Andrew Or (was: Apache Spark) Mechanism to associate generic operator scope with

[jira] [Created] (SPARK-7330) JDBC RDD could lead to NPE when the date field is null

2015-05-03 Thread Adrian Wang (JIRA)
Adrian Wang created SPARK-7330: -- Summary: JDBC RDD could lead to NPE when the date field is null Key: SPARK-7330 URL: https://issues.apache.org/jira/browse/SPARK-7330 Project: Spark Issue Type:

[jira] [Commented] (SPARK-6944) Mechanism to associate generic operator scope with RDD's

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526178#comment-14526178 ] Apache Spark commented on SPARK-6944: - User 'andrewor14' has created a pull request

[jira] [Assigned] (SPARK-7113) Add the direct stream related information to the streaming listener and web UI

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7113: --- Assignee: Apache Spark Add the direct stream related information to the streaming listener

[jira] [Commented] (SPARK-7113) Add the direct stream related information to the streaming listener and web UI

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526202#comment-14526202 ] Apache Spark commented on SPARK-7113: - User 'jerryshao' has created a pull request for

[jira] [Assigned] (SPARK-7113) Add the direct stream related information to the streaming listener and web UI

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7113: --- Assignee: (was: Apache Spark) Add the direct stream related information to the

[jira] [Assigned] (SPARK-7331) Create HiveConf per application instead of per query in HiveQl.scala

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7331: --- Assignee: Apache Spark Create HiveConf per application instead of per query in HiveQl.scala

[jira] [Assigned] (SPARK-7331) Create HiveConf per application instead of per query in HiveQl.scala

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7331: --- Assignee: (was: Apache Spark) Create HiveConf per application instead of per query in

[jira] [Commented] (SPARK-7331) Create HiveConf per application instead of per query in HiveQl.scala

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526223#comment-14526223 ] Apache Spark commented on SPARK-7331: - User 'nitin2goyal' has created a pull request

[jira] [Updated] (SPARK-7331) Create HiveConf per application instead of per query in HiveQl.scala

2015-05-03 Thread Nitin Goyal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nitin Goyal updated SPARK-7331: --- Description: A new HiveConf is created per query in getAst method in HiveQl.scala def getAst(sql:

[jira] [Assigned] (SPARK-7275) Make LogicalRelation public

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7275: --- Assignee: (was: Apache Spark) Make LogicalRelation public ---

[jira] [Commented] (SPARK-7275) Make LogicalRelation public

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526227#comment-14526227 ] Apache Spark commented on SPARK-7275: - User 'gweidner' has created a pull request for

[jira] [Assigned] (SPARK-7275) Make LogicalRelation public

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7275: --- Assignee: Apache Spark Make LogicalRelation public ---

[jira] [Commented] (SPARK-7322) Add DataFrame DSL for window function support

2015-05-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526166#comment-14526166 ] Reynold Xin commented on SPARK-7322: Yup. Add DataFrame DSL for window function

[jira] [Commented] (SPARK-7330) JDBC RDD could lead to NPE when the date field is null

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526180#comment-14526180 ] Apache Spark commented on SPARK-7330: - User 'adrian-wang' has created a pull request

[jira] [Assigned] (SPARK-7330) JDBC RDD could lead to NPE when the date field is null

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7330: --- Assignee: (was: Apache Spark) JDBC RDD could lead to NPE when the date field is null

[jira] [Commented] (SPARK-6944) Mechanism to associate generic operator scope with RDD's

2015-05-03 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526179#comment-14526179 ] Andrew Or commented on SPARK-6944: -- https://github.com/apache/spark/pull/5729 Mechanism

[jira] [Assigned] (SPARK-6944) Mechanism to associate generic operator scope with RDD's

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6944: --- Assignee: Apache Spark (was: Andrew Or) Mechanism to associate generic operator scope with

[jira] [Issue Comment Deleted] (SPARK-6944) Mechanism to associate generic operator scope with RDD's

2015-05-03 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-6944: - Comment: was deleted (was: https://github.com/apache/spark/pull/5729) Mechanism to associate generic

[jira] [Updated] (SPARK-7331) Create HiveConf per application instead of per query in HiveQl.scala

2015-05-03 Thread Nitin Goyal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nitin Goyal updated SPARK-7331: --- Description: A new HiveConf is created per query in getAst method in HiveQl.scala def getAst(sql:

[jira] [Created] (SPARK-7331) Create HiveConf per application instead of per query in HiveQl.scala

2015-05-03 Thread Nitin Goyal (JIRA)
Nitin Goyal created SPARK-7331: -- Summary: Create HiveConf per application instead of per query in HiveQl.scala Key: SPARK-7331 URL: https://issues.apache.org/jira/browse/SPARK-7331 Project: Spark

[jira] [Commented] (SPARK-3524) remove workaround to pickle array of float for Pyrolite

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526229#comment-14526229 ] Apache Spark commented on SPARK-3524: - User 'mengxr' has created a pull request for

[jira] [Assigned] (SPARK-3524) remove workaround to pickle array of float for Pyrolite

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-3524: --- Assignee: Apache Spark (was: Xiangrui Meng) remove workaround to pickle array of float for

[jira] [Assigned] (SPARK-3524) remove workaround to pickle array of float for Pyrolite

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-3524: --- Assignee: Xiangrui Meng (was: Apache Spark) remove workaround to pickle array of float for

[jira] [Resolved] (SPARK-7241) Pearson correlation for DataFrames

2015-05-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-7241. Resolution: Fixed Fix Version/s: 1.4.0 Pearson correlation for DataFrames

[jira] [Assigned] (SPARK-6411) PySpark DataFrames can't be created if any datetimes have timezones

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6411: --- Assignee: Apache Spark (was: Xiangrui Meng) PySpark DataFrames can't be created if any

[jira] [Commented] (SPARK-6411) PySpark DataFrames can't be created if any datetimes have timezones

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526244#comment-14526244 ] Apache Spark commented on SPARK-6411: - User 'mengxr' has created a pull request for

[jira] [Assigned] (SPARK-6411) PySpark DataFrames can't be created if any datetimes have timezones

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6411: --- Assignee: Xiangrui Meng (was: Apache Spark) PySpark DataFrames can't be created if any

[jira] [Assigned] (SPARK-6411) PySpark DataFrames can't be created if any datetimes have timezones

2015-05-03 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-6411: Assignee: Xiangrui Meng (was: Davies Liu) PySpark DataFrames can't be created if any

[jira] [Commented] (SPARK-6514) For Kinesis Streaming, use the same region for DynamoDB (KCL checkpoints) as the Kinesis stream itself

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526261#comment-14526261 ] Apache Spark commented on SPARK-6514: - User 'cfregly' has created a pull request for

[jira] [Commented] (SPARK-5960) Allow AWS credentials to be passed to KinesisUtils.createStream()

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526262#comment-14526262 ] Apache Spark commented on SPARK-5960: - User 'cfregly' has created a pull request for

[jira] [Assigned] (SPARK-6514) For Kinesis Streaming, use the same region for DynamoDB (KCL checkpoints) as the Kinesis stream itself

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6514: --- Assignee: Apache Spark For Kinesis Streaming, use the same region for DynamoDB (KCL

[jira] [Assigned] (SPARK-6656) Allow the application name to be passed in versus pulling from SparkContext.getAppName()

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6656: --- Assignee: Apache Spark Allow the application name to be passed in versus pulling from

[jira] [Commented] (SPARK-6656) Allow the application name to be passed in versus pulling from SparkContext.getAppName()

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526263#comment-14526263 ] Apache Spark commented on SPARK-6656: - User 'cfregly' has created a pull request for

[jira] [Assigned] (SPARK-6514) For Kinesis Streaming, use the same region for DynamoDB (KCL checkpoints) as the Kinesis stream itself

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6514: --- Assignee: (was: Apache Spark) For Kinesis Streaming, use the same region for DynamoDB

[jira] [Assigned] (SPARK-6656) Allow the application name to be passed in versus pulling from SparkContext.getAppName()

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6656: --- Assignee: (was: Apache Spark) Allow the application name to be passed in versus pulling

[jira] [Updated] (SPARK-6028) Provide an alternative RPC implementation based on the network transport module

2015-05-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6028: --- Priority: Critical (was: Major) Target Version/s: 1.5.0 Provide an alternative RPC

[jira] [Updated] (SPARK-6280) Remove Akka systemName from Spark

2015-05-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6280: --- Target Version/s: 1.5.0 (was: 1.4.0) Remove Akka systemName from Spark

[jira] [Resolved] (SPARK-7329) Use itertools.product in ParamGridBuilder

2015-05-03 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-7329. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5873

[jira] [Updated] (SPARK-6943) Graphically show RDD's included in a stage

2015-05-03 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-6943: - Attachment: job-page.png stage-page.png Graphically show RDD's included in a stage

[jira] [Commented] (SPARK-7327) DataFrame show() method doesn't like empty dataframes

2015-05-03 Thread Chen Song (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526191#comment-14526191 ] Chen Song commented on SPARK-7327: -- I am working on upgrading the show function. I'll run

[jira] [Closed] (SPARK-7024) Improve performance of function containsStar

2015-05-03 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi closed SPARK-7024. Resolution: Not A Problem Improve performance of function containsStar

[jira] [Commented] (SPARK-6873) Some Hive-Catalyst comparison tests fail due to unimportant order of some printed elements

2015-05-03 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526218#comment-14526218 ] Cheng Lian commented on SPARK-6873: --- The UDF synonyms lines are easy to be filtered out.

[jira] [Commented] (SPARK-7326) Performing window() on a WindowedDStream doesn't work all the time

2015-05-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526278#comment-14526278 ] Sean Owen commented on SPARK-7326: -- Makes sense, just interested in whether this was

[jira] [Updated] (SPARK-7326) Performing window() on a WindowedDStream doesn't work all the time

2015-05-03 Thread Wesley Miao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wesley Miao updated SPARK-7326: --- Description: Someone reported similar issues before but got no response.

[jira] [Updated] (SPARK-7326) Performing window() on a WindowedDStream doesn't work all the time

2015-05-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-7326: - Description: Someone reported similar issues before but got no response.

[jira] [Updated] (SPARK-7326) Performing window() on a WindowedDStream doesn't work all the time

2015-05-03 Thread Wesley Miao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wesley Miao updated SPARK-7326: --- Description: Someone reported similar issues before but got no response.

[jira] [Updated] (SPARK-7249) Updated Hadoop dependencies due to inconsistency in the versions

2015-05-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-7249: - Priority: Blocker (was: Minor) Target Version/s: 1.4.0 Affects Version/s: 1.3.1

[jira] [Commented] (SPARK-7326) Performing window() on a WindowedDStream doesn't work all the time

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525808#comment-14525808 ] Apache Spark commented on SPARK-7326: - User 'wesleymiao' has created a pull request

[jira] [Commented] (SPARK-1437) Jenkins should build with Java 6

2015-05-03 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525761#comment-14525761 ] Steve Loughran commented on SPARK-1437: --- ..be good for the pull request test runs to

[jira] [Commented] (SPARK-7326) Performing window() on a WindowedDStream doesn't work all the time

2015-05-03 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525815#comment-14525815 ] Sean Owen commented on SPARK-7326: -- Out of curiosity why do you have a window + slide of

[jira] [Commented] (SPARK-4105) FAILED_TO_UNCOMPRESS(5) errors when fetching shuffle data with sort-based shuffle

2015-05-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525903#comment-14525903 ] Josh Rosen commented on SPARK-4105: --- While working on some new shuffle code, I managed

[jira] [Updated] (SPARK-4106) Shuffle write and spill to disk metrics are incorrect

2015-05-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4106: -- Component/s: Shuffle Shuffle write and spill to disk metrics are incorrect

[jira] [Created] (SPARK-7329) Use itertools.product in ParamGridBuilder

2015-05-03 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-7329: Summary: Use itertools.product in ParamGridBuilder Key: SPARK-7329 URL: https://issues.apache.org/jira/browse/SPARK-7329 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-7022) PySpark is missing ParamGridBuilder

2015-05-03 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-7022. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5601

[jira] [Commented] (SPARK-7329) Use itertools.product in ParamGridBuilder

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525962#comment-14525962 ] Apache Spark commented on SPARK-7329: - User 'mengxr' has created a pull request for

[jira] [Assigned] (SPARK-7329) Use itertools.product in ParamGridBuilder

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7329: --- Assignee: Xiangrui Meng (was: Apache Spark) Use itertools.product in ParamGridBuilder

[jira] [Updated] (SPARK-4112) Have a reserved copy of Sorter/SortDataFormat

2015-05-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4112: -- Component/s: Shuffle Have a reserved copy of Sorter/SortDataFormat

[jira] [Updated] (SPARK-5392) Shuffle spill size is shown as negative

2015-05-03 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5392: -- Component/s: Shuffle Shuffle spill size is shown as negative ---

[jira] [Commented] (SPARK-7328) Add missing items to pyspark.mllib.linalg.Vectors

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525921#comment-14525921 ] Apache Spark commented on SPARK-7328: - User 'MechCoder' has created a pull request for

[jira] [Assigned] (SPARK-7328) Add missing items to pyspark.mllib.linalg.Vectors

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7328: --- Assignee: Apache Spark Add missing items to pyspark.mllib.linalg.Vectors

[jira] [Assigned] (SPARK-7329) Use itertools.product in ParamGridBuilder

2015-05-03 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7329: --- Assignee: Apache Spark (was: Xiangrui Meng) Use itertools.product in ParamGridBuilder

[jira] [Commented] (SPARK-6980) Akka timeout exceptions indicate which conf controls them

2015-05-03 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525927#comment-14525927 ] Bryan Cutler commented on SPARK-6980: - I added another commit to the PR and some basic

[jira] [Commented] (SPARK-5989) Model import/export for LDAModel

2015-05-03 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525931#comment-14525931 ] Manoj Kumar commented on SPARK-5989: Oh just saw this, but there are a few other PR's