[jira] [Resolved] (SPARK-7310) SparkSubmit does not escape for java options and ^ won't work

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-7310. -- Resolution: Not A Problem OK sounds like the particular escaping issue is no longer a problem as far as

[jira] [Closed] (SPARK-7151) Correlation methods for DataFrame

2015-05-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-7151. -- Resolution: Duplicate Fix Version/s: 1.4.0 Assignee: Burak Yavuz Target

[jira] [Updated] (SPARK-1762) Add functionality to pin RDDs in cache

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-1762: - Target Version/s: (was: 1.2.0) Add functionality to pin RDDs in cache

[jira] [Commented] (SPARK-7394) Add Pandas style cast (astype)

2015-05-06 Thread Chen Song (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530065#comment-14530065 ] Chen Song commented on SPARK-7394: -- Ok, I'll work on this after finishing my daily job.

[jira] [Created] (SPARK-7395) some suggestion about SimpleApp in quick-start.html

2015-05-06 Thread zhengbing li (JIRA)
zhengbing li created SPARK-7395: --- Summary: some suggestion about SimpleApp in quick-start.html Key: SPARK-7395 URL: https://issues.apache.org/jira/browse/SPARK-7395 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-6940) PySpark CrossValidator

2015-05-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-6940. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 5926

[jira] [Assigned] (SPARK-7397) Add missing input information report back to ReceiverInputDStream due to SPARK-7139

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7397: --- Assignee: Apache Spark Add missing input information report back to ReceiverInputDStream

[jira] [Commented] (SPARK-7397) Add missing input information report back to ReceiverInputDStream due to SPARK-7139

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530218#comment-14530218 ] Apache Spark commented on SPARK-7397: - User 'jerryshao' has created a pull request for

[jira] [Assigned] (SPARK-7397) Add missing input information report back to ReceiverInputDStream due to SPARK-7139

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7397: --- Assignee: (was: Apache Spark) Add missing input information report back to

[jira] [Assigned] (SPARK-6812) filter() on DataFrame does not work as expected

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6812: --- Assignee: Apache Spark (was: Sun Rui) filter() on DataFrame does not work as expected

[jira] [Commented] (SPARK-6812) filter() on DataFrame does not work as expected

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530268#comment-14530268 ] Apache Spark commented on SPARK-6812: - User 'sun-rui' has created a pull request for

[jira] [Created] (SPARK-7397) Add missing input information report back to ReceiverInputDStream due to SPARK-7139

2015-05-06 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-7397: -- Summary: Add missing input information report back to ReceiverInputDStream due to SPARK-7139 Key: SPARK-7397 URL: https://issues.apache.org/jira/browse/SPARK-7397

[jira] [Commented] (SPARK-5281) Registering table on RDD is giving MissingRequirementError

2015-05-06 Thread Iulian Dragos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530224#comment-14530224 ] Iulian Dragos commented on SPARK-5281: -- Here's my workaround from [this stack

[jira] [Updated] (SPARK-7386) Spark application level metrics application.$AppName.$number.cores doesn't reset on Standalone Master deployment

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-7386: - Component/s: Spark Core Priority: Minor (was: Major) Please set component. I'm not familiar with

[jira] [Resolved] (SPARK-7374) Error message when launching: find: 'version' : No such file or directory

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-7374. -- Resolution: Duplicate Please search JIRA first and review

[jira] [Resolved] (SPARK-7369) Spark Python 1.3.1 Mllib dataframe random forest problem

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-7369. -- Resolution: Invalid Have a look at

[jira] [Updated] (SPARK-7285) Audit missing Hive functions

2015-05-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7285: --- Description: Create a list of functions that is on this page but not in SQL/DataFrame.

[jira] [Commented] (SPARK-5556) Latent Dirichlet Allocation (LDA) using Gibbs sampler

2015-05-06 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530029#comment-14530029 ] Guoqiang Li commented on SPARK-5556:

[jira] [Updated] (SPARK-7285) Audit missing Hive functions

2015-05-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7285: --- Description: Create a list of functions that is on this page but not in SQL/DataFrame.

[jira] [Updated] (SPARK-4488) Add control over map-side aggregation

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4488: - Target Version/s: (was: 1.1.1, 1.2.0) Add control over map-side aggregation

[jira] [Updated] (SPARK-3095) [PySpark] Speed up RDD.count()

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3095: - Target Version/s: (was: 1.2.0) [PySpark] Speed up RDD.count() --

[jira] [Updated] (SPARK-3492) Clean up Yarn integration code

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3492: - Target Version/s: (was: 1.2.0) Clean up Yarn integration code --

[jira] [Updated] (SPARK-3913) Spark Yarn Client API change to expose Yarn Resource Capacity, Yarn Application Listener and killApplication() API

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3913: - Target Version/s: (was: 1.2.0) Spark Yarn Client API change to expose Yarn Resource Capacity, Yarn

[jira] [Updated] (SPARK-2838) performance tests for feature transformations

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-2838: - Target Version/s: (was: 1.2.0) performance tests for feature transformations

[jira] [Updated] (SPARK-2653) Heap size should be the sum of driver.memory and executor.memory in local mode

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-2653: - Target Version/s: (was: 1.2.0) Heap size should be the sum of driver.memory and executor.memory in

[jira] [Updated] (SPARK-3685) Spark's local dir should accept only local paths

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3685: - Target Version/s: (was: 1.2.0) Spark's local dir should accept only local paths

[jira] [Updated] (SPARK-1239) Don't fetch all map output statuses at each reducer during shuffles

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-1239: - Target Version/s: (was: 1.2.0) Don't fetch all map output statuses at each reducer during shuffles

[jira] [Updated] (SPARK-3075) Expose a way for users to parse event logs

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3075: - Target Version/s: (was: 1.2.0) Expose a way for users to parse event logs

[jira] [Updated] (SPARK-2774) Set preferred locations for reduce tasks

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-2774: - Target Version/s: (was: 1.2.0) Set preferred locations for reduce tasks

[jira] [Updated] (SPARK-4609) Job can not finish if there is one bad slave in clusters

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4609: - Target Version/s: (was: 1.3.0) Job can not finish if there is one bad slave in clusters

[jira] [Updated] (SPARK-2992) The transforms formerly known as non-lazy

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-2992: - Target Version/s: (was: 1.2.0) The transforms formerly known as non-lazy

[jira] [Updated] (SPARK-3376) Memory-based shuffle strategy to reduce overhead of disk I/O

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3376: - Target Version/s: (was: 1.3.0) Memory-based shuffle strategy to reduce overhead of disk I/O

[jira] [Updated] (SPARK-3348) Support user-defined SparkListeners properly

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3348: - Target Version/s: (was: 1.2.0) Support user-defined SparkListeners properly

[jira] [Updated] (SPARK-4784) Model.fittingParamMap should store all Params

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4784: - Target Version/s: (was: 1.3.0) Model.fittingParamMap should store all Params

[jira] [Updated] (SPARK-3916) recognize appended data in textFileStream()

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3916: - Target Version/s: (was: 1.2.0) recognize appended data in textFileStream()

[jira] [Updated] (SPARK-3051) Support looking-up named accumulators in a registry

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3051: - Target Version/s: (was: 1.2.0) Support looking-up named accumulators in a registry

[jira] [Updated] (SPARK-7394) Add Pandas style cast (astype)

2015-05-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7394: --- Fix Version/s: (was: 1.4.0) Add Pandas style cast (astype) --

[jira] [Resolved] (SPARK-7372) Multiclass SVM - One vs All wrapper

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-7372. -- Resolution: Won't Fix This should be a question on user@ I think. It would be better to build this once

[jira] [Commented] (SPARK-7394) Add Pandas style cast (astype)

2015-05-06 Thread Chen Song (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530066#comment-14530066 ] Chen Song commented on SPARK-7394: -- Ok, I'll work on this after finishing my daily job.

[jira] [Created] (SPARK-7396) Update Kafka example to use new API of Kafka 0.8.2

2015-05-06 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-7396: -- Summary: Update Kafka example to use new API of Kafka 0.8.2 Key: SPARK-7396 URL: https://issues.apache.org/jira/browse/SPARK-7396 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-7395) some suggestion about SimpleApp in quick-start.html

2015-05-06 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-7395: --- Affects Version/s: (was: 1.3.1) 1.4.0 some suggestion about SimpleApp in

[jira] [Updated] (SPARK-7395) some suggestion about SimpleApp in quick-start.html

2015-05-06 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao updated SPARK-7395: --- Affects Version/s: (was: 1.4.0) 1.3.1 some suggestion about SimpleApp in

[jira] [Commented] (SPARK-6824) Fill the docs for DataFrame API in SparkR

2015-05-06 Thread Qian Huang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529997#comment-14529997 ] Qian Huang commented on SPARK-6824: --- start working on this issue Fill the docs for

[jira] [Updated] (SPARK-7150) SQLContext.range()

2015-05-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7150: --- Summary: SQLContext.range() (was: Facilitate random column generation for DataFrames)

[jira] [Updated] (SPARK-7285) Audit missing Hive functions

2015-05-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7285: --- Description: Create a list of functions that is on this page but not in SQL/DataFrame.

[jira] [Updated] (SPARK-3454) Expose JSON representation of data shown in WebUI

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3454: - Target Version/s: (was: 1.2.0) Expose JSON representation of data shown in WebUI

[jira] [Updated] (SPARK-3166) Custom serialisers can't be shipped in application jars

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3166: - Target Version/s: (was: 1.2.0) Custom serialisers can't be shipped in application jars

[jira] [Updated] (SPARK-3134) Update block locations asynchronously in TorrentBroadcast

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3134: - Target Version/s: (was: 1.2.0) Update block locations asynchronously in TorrentBroadcast

[jira] [Updated] (SPARK-3684) Can't configure local dirs in Yarn mode

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3684: - Target Version/s: (was: 1.2.0) Can't configure local dirs in Yarn mode

[jira] [Updated] (SPARK-3631) Add docs for checkpoint usage

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3631: - Target Version/s: (was: 1.2.0) Add docs for checkpoint usage -

[jira] [Updated] (SPARK-3514) Provide a utility function for returning the hosts (and number) of live executors

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3514: - Target Version/s: (was: 1.2.0) Provide a utility function for returning the hosts (and number) of live

[jira] [Updated] (SPARK-1832) Executor UI improvement suggestions

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-1832: - Target Version/s: (was: 1.2.0) Executor UI improvement suggestions

[jira] [Updated] (SPARK-3630) Identify cause of Kryo+Snappy PARSING_ERROR

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3630: - Target Version/s: (was: 1.1.1, 1.2.0) Identify cause of Kryo+Snappy PARSING_ERROR

[jira] [Updated] (SPARK-3385) Improve shuffle performance

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3385: - Target Version/s: (was: 1.3.0) Improve shuffle performance ---

[jira] [Updated] (SPARK-4106) Shuffle write and spill to disk metrics are incorrect

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4106: - Target Version/s: (was: 1.2.0) Shuffle write and spill to disk metrics are incorrect

[jira] [Updated] (SPARK-4356) Test Scala 2.11 on Jenkins

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4356: - Target Version/s: (was: 1.2.0) Test Scala 2.11 on Jenkins --

[jira] [Updated] (SPARK-3115) Improve task broadcast latency for small tasks

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3115: - Target Version/s: (was: 1.2.0) Improve task broadcast latency for small tasks

[jira] [Updated] (SPARK-1823) ExternalAppendOnlyMap can still OOM if one key is very large

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-1823: - Target Version/s: (was: 1.2.0) ExternalAppendOnlyMap can still OOM if one key is very large

[jira] [Updated] (SPARK-3374) Spark on Yarn config cleanup

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3374: - Target Version/s: (was: 1.2.0) Spark on Yarn config cleanup

[jira] [Updated] (SPARK-2365) Add IndexedRDD, an efficient updatable key-value store

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-2365: - Target Version/s: (was: 1.2.0) Add IndexedRDD, an efficient updatable key-value store

[jira] [Updated] (SPARK-3441) Explain in docs that repartitionAndSortWithinPartitions enacts Hadoop style shuffle

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3441: - Target Version/s: (was: 1.2.0) Explain in docs that repartitionAndSortWithinPartitions enacts Hadoop

[jira] [Updated] (SPARK-3629) Improvements to YARN doc

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3629: - Target Version/s: (was: 1.1.1, 1.2.0) Improvements to YARN doc

[jira] [Updated] (SPARK-3982) receiverStream in Python API

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-3982: - Target Version/s: (was: 1.2.0) receiverStream in Python API

[jira] [Updated] (SPARK-7285) Audit missing Hive functions

2015-05-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7285: --- Description: Create a list of functions that is on this page but not in SQL/DataFrame.

[jira] [Resolved] (SPARK-4784) Model.fittingParamMap should store all Params

2015-05-06 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-4784. -- Resolution: Not A Problem [SPARK-5956] removed fittingParamMap, so this is no longer an

[jira] [Updated] (SPARK-7394) Add Pandas style cast (astype)

2015-05-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7394: --- Description: Basically alias astype == cast in Column for Python (and Python only). was:Basically

[jira] [Commented] (SPARK-7394) Add Pandas style cast (astype)

2015-05-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530063#comment-14530063 ] Reynold Xin commented on SPARK-7394: cc [~smacat] would you like to do this one as

[jira] [Resolved] (SPARK-7395) some suggestion about SimpleApp in quick-start.html

2015-05-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-7395. -- Resolution: Not A Problem Fix Version/s: (was: 1.4.0) Target Version/s: (was:

[jira] [Updated] (SPARK-7381) Python API for Transformers

2015-05-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7381: - Assignee: Burak Yavuz Python API for Transformers ---

[jira] [Updated] (SPARK-7383) Python API for ml.feature

2015-05-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7383: - Assignee: Burak Yavuz Python API for ml.feature -

[jira] [Updated] (SPARK-7382) Python API for ml.classification

2015-05-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7382: - Assignee: Burak Yavuz Python API for ml.classification

[jira] [Updated] (SPARK-7383) Python API for ml.feature

2015-05-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7383: - Priority: Major (was: Blocker) Python API for ml.feature -

[jira] [Updated] (SPARK-7388) Python Api for Param[Array[T]]

2015-05-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7388: - Assignee: Burak Yavuz Python Api for Param[Array[T]] --

[jira] [Updated] (SPARK-7381) Python API for Transformers

2015-05-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7381: - Priority: Major (was: Blocker) Python API for Transformers ---

[jira] [Updated] (SPARK-7382) Python API for ml.classification

2015-05-06 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7382: - Priority: Major (was: Blocker) Python API for ml.classification

[jira] [Commented] (SPARK-6812) filter() on DataFrame does not work as expected

2015-05-06 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530106#comment-14530106 ] Sun Rui commented on SPARK-6812: According to the R manual:

[jira] [Assigned] (SPARK-7396) Update Kafka example to use new API of Kafka 0.8.2

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7396: --- Assignee: Apache Spark Update Kafka example to use new API of Kafka 0.8.2

[jira] [Created] (SPARK-7393) How to improve Spark SQL performance?

2015-05-06 Thread Liang Lee (JIRA)
Liang Lee created SPARK-7393: Summary: How to improve Spark SQL performance? Key: SPARK-7393 URL: https://issues.apache.org/jira/browse/SPARK-7393 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-7394) Add Pandas style cast (astype)

2015-05-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7394: --- Labels: starter (was: ) Add Pandas style cast (astype) --

[jira] [Updated] (SPARK-7394) Add Pandas style cast (astype)

2015-05-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7394: --- Description: Basically alias astype == cast in Column for Python. (was: Basically alias astype ==

[jira] [Commented] (SPARK-6284) Support framework authentication and role in Mesos framework

2015-05-06 Thread Bharath Ravi Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530117#comment-14530117 ] Bharath Ravi Kumar commented on SPARK-6284: --- [~tnachen] This issue makes Spark

[jira] [Comment Edited] (SPARK-6258) Python MLlib API missing items: Clustering

2015-05-06 Thread Hrishikesh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530438#comment-14530438 ] Hrishikesh edited comment on SPARK-6258 at 5/6/15 12:34 PM:

[jira] [Reopened] (SPARK-3454) Expose JSON representation of data shown in WebUI

2015-05-06 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid reopened SPARK-3454: - need to fix some issues w/ test files in pr ... Expose JSON representation of data shown in WebUI

[jira] [Comment Edited] (SPARK-7116) Intermediate RDD cached but never unpersisted

2015-05-06 Thread Dennis Proppe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530390#comment-14530390 ] Dennis Proppe edited comment on SPARK-7116 at 5/6/15 11:47 AM:

[jira] [Assigned] (SPARK-5913) Python API for ChiSqSelector

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-5913: --- Assignee: Apache Spark (was: Yanbo Liang) Python API for ChiSqSelector

[jira] [Assigned] (SPARK-5913) Python API for ChiSqSelector

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-5913: --- Assignee: Yanbo Liang (was: Apache Spark) Python API for ChiSqSelector

[jira] [Commented] (SPARK-6258) Python MLlib API missing items: Clustering

2015-05-06 Thread Hrishikesh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530438#comment-14530438 ] Hrishikesh commented on SPARK-6258: --- [~yanboliang], you can start working on it.

[jira] [Commented] (SPARK-6258) Python MLlib API missing items: Clustering

2015-05-06 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530434#comment-14530434 ] Yanbo Liang commented on SPARK-6258: [~hrishikesh] Are you still working on this issue?

[jira] [Comment Edited] (SPARK-6258) Python MLlib API missing items: Clustering

2015-05-06 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530434#comment-14530434 ] Yanbo Liang edited comment on SPARK-6258 at 5/6/15 12:26 PM: -

[jira] [Updated] (SPARK-7369) Spark Python 1.3.1 Mllib dataframe random forest problem

2015-05-06 Thread Lisbeth Ron (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisbeth Ron updated SPARK-7369: --- Attachment: random_forest_dataframe_spark_30042015.py Hi Sean, I still have problems with python

[jira] [Updated] (SPARK-7322) Add DataFrame DSL for window function support

2015-05-06 Thread Olivier Girardot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olivier Girardot updated SPARK-7322: Labels: DataFrame (was: ) Add DataFrame DSL for window function support

[jira] [Commented] (SPARK-3454) Expose JSON representation of data shown in WebUI

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530562#comment-14530562 ] Apache Spark commented on SPARK-3454: - User 'squito' has created a pull request for

[jira] [Created] (SPARK-7398) Add back-pressure to Spark Streaming

2015-05-06 Thread François Garillot (JIRA)
François Garillot created SPARK-7398: Summary: Add back-pressure to Spark Streaming Key: SPARK-7398 URL: https://issues.apache.org/jira/browse/SPARK-7398 Project: Spark Issue Type: Bug

[jira] [Assigned] (SPARK-6093) Add RegressionMetrics in PySpark/MLlib

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6093: --- Assignee: Apache Spark Add RegressionMetrics in PySpark/MLlib

[jira] [Commented] (SPARK-6093) Add RegressionMetrics in PySpark/MLlib

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530581#comment-14530581 ] Apache Spark commented on SPARK-6093: - User 'yanboliang' has created a pull request

[jira] [Commented] (SPARK-7116) Intermediate RDD cached but never unpersisted

2015-05-06 Thread Kalle Jepsen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530580#comment-14530580 ] Kalle Jepsen commented on SPARK-7116: - [~marmbrus] Do you remember why that

[jira] [Assigned] (SPARK-6093) Add RegressionMetrics in PySpark/MLlib

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6093: --- Assignee: (was: Apache Spark) Add RegressionMetrics in PySpark/MLlib

[jira] [Commented] (SPARK-6093) Add RegressionMetrics in PySpark/MLlib

2015-05-06 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530586#comment-14530586 ] Yanbo Liang commented on SPARK-6093: [~mengxr] Could you assign this to me? Add

[jira] [Assigned] (SPARK-7035) Drop __getattr__ on pyspark.sql.DataFrame

2015-05-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-7035: --- Assignee: (was: Apache Spark) Drop __getattr__ on pyspark.sql.DataFrame

[jira] [Commented] (SPARK-7393) How to improve Spark SQL performance?

2015-05-06 Thread Dennis Proppe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530636#comment-14530636 ] Dennis Proppe commented on SPARK-7393: -- Hi, Liang Lee, without more information (HDD
