[jira] [Resolved] (SPARK-5878) Python DataFrame.repartition() is broken

2015-02-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-5878. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Davies Liu Python

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325640#comment-14325640 ] Sean Owen commented on SPARK-5837: -- [~mukh007] That's different, and just you means

[jira] [Updated] (SPARK-5878) Python DataFrame.repartition() is broken

2015-02-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5878: --- Labels: DataFrame (was: ) Python DataFrame.repartition() is broken

[jira] [Updated] (SPARK-5878) Python DataFrame.repartition() is broken

2015-02-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5878: --- Component/s: SQL Python DataFrame.repartition() is broken

[jira] [Resolved] (SPARK-5864) support .jar as python package

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5864. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Davies Liu support .jar

[jira] [Resolved] (SPARK-5850) Remove experimental label for Scala 2.11 and FlumePollingStream

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5850. Resolution: Fixed Fix Version/s: 1.3.0 Remove experimental label for Scala 2.11 and

[jira] [Resolved] (SPARK-5856) In Maven build script, launch Zinc with more memory

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5856. Resolution: Fixed Fix Version/s: 1.3.0 In Maven build script, launch Zinc with more

[jira] [Commented] (SPARK-4454) Race condition in DAGScheduler

2015-02-18 Thread Rafal Kwasny (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325653#comment-14325653 ] Rafal Kwasny commented on SPARK-4454: - Thanks for looking into this The problem

[jira] [Commented] (SPARK-4579) Scheduling Delay appears negative

2015-02-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325631#comment-14325631 ] Sean Owen commented on SPARK-4579: -- See also

[jira] [Updated] (SPARK-4579) Scheduling Delay appears negative

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4579: --- Assignee: Andrew Or Scheduling Delay appears negative -

[jira] [Commented] (SPARK-4579) Scheduling Delay appears negative

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325622#comment-14325622 ] Patrick Wendell commented on SPARK-4579: [~andrewor14] Can you take a look at this

[jira] [Updated] (SPARK-4579) Scheduling Delay appears negative

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4579: --- Labels: (was: starter) Scheduling Delay appears negative

[jira] [Commented] (SPARK-5669) Spark assembly includes incompatibly licensed libgfortran, libgcc code via JBLAS

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325759#comment-14325759 ] Apache Spark commented on SPARK-5669: - User 'srowen' has created a pull request for

[jira] [Updated] (SPARK-4579) Scheduling Delay appears negative

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4579: --- Priority: Critical (was: Minor) Scheduling Delay appears negative

[jira] [Updated] (SPARK-4579) Scheduling Delay appears negative

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4579: --- Labels: starter (was: ) Scheduling Delay appears negative

[jira] [Resolved] (SPARK-5389) spark-shell.cmd does not run from DOS Windows 7

2015-02-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5389. -- Resolution: Cannot Reproduce The error shows that Java and grep have crashed. This isn't a Spark

[jira] [Comment Edited] (SPARK-5389) spark-shell.cmd does not run from DOS Windows 7

2015-02-18 Thread Matt McKnight (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326275#comment-14326275 ] Matt McKnight edited comment on SPARK-5389 at 2/18/15 6:01 PM:

[jira] [Resolved] (SPARK-5519) Add user guide for FP-Growth

2015-02-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5519. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4661

[jira] [Created] (SPARK-5883) Add compression scheme in VertexAttributeBlock for shipping vertices to edge partitions

2015-02-18 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-5883: --- Summary: Add compression scheme in VertexAttributeBlock for shipping vertices to edge partitions Key: SPARK-5883 URL: https://issues.apache.org/jira/browse/SPARK-5883

[jira] [Commented] (SPARK-5016) GaussianMixtureEM should distribute matrix inverse for large numFeatures, k

2015-02-18 Thread Travis Galoppo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326229#comment-14326229 ] Travis Galoppo commented on SPARK-5016: --- [~josephkb] My previous comment got me

[jira] [Commented] (SPARK-5389) spark-shell.cmd does not run from DOS Windows 7

2015-02-18 Thread Matt McKnight (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326275#comment-14326275 ] Matt McKnight commented on SPARK-5389: -- Error message is reproducible. The error

[jira] [Commented] (SPARK-2973) Add a way to show tables without executing a job

2015-02-18 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326247#comment-14326247 ] Yin Huai commented on SPARK-2973: - I just tried our master. sql(show tables).collect()

[jira] [Updated] (SPARK-5883) Add compression scheme in VertexAttributeBlock for shipping vertices to edge partitions

2015-02-18 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-5883: Description: The size of shipped data between vertex partitions and edge partitions is one

[jira] [Resolved] (SPARK-5507) Add user guide for block matrix and its operations

2015-02-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5507. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4664

[jira] [Commented] (SPARK-5830) Don't create unnecessary directory for local root dir

2015-02-18 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326342#comment-14326342 ] Marcelo Vanzin commented on SPARK-5830: --- Just to clarify, that directory is not

[jira] [Commented] (SPARK-5866) pyspark read from s3

2015-02-18 Thread venu k tangirala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326362#comment-14326362 ] venu k tangirala commented on SPARK-5866: - is there a temporary work around for

[jira] [Updated] (SPARK-5885) Add VectorAssembler

2015-02-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5885: - Description: `VectorAssembler` takes a list of columns (of type double/int/vector) and merge

[jira] [Updated] (SPARK-5866) pyspark read from s3

2015-02-18 Thread venu k tangirala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] venu k tangirala updated SPARK-5866: Description: I am trying to read data from s3 via pyspark, I gave the credentials with sc=

[jira] [Updated] (SPARK-5886) Add LabelIndexer

2015-02-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5886: - Description: `LabelIndexer` takes a column of labels (raw categories) and outputs an integer

[jira] [Created] (SPARK-5890) Add FeatureDiscretizer

2015-02-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5890: Summary: Add FeatureDiscretizer Key: SPARK-5890 URL: https://issues.apache.org/jira/browse/SPARK-5890 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-5892) Clean up ML, MLlib docs for 1.3 release

2015-02-18 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-5892: Summary: Clean up ML, MLlib docs for 1.3 release Key: SPARK-5892 URL: https://issues.apache.org/jira/browse/SPARK-5892 Project: Spark Issue Type:

[jira] [Commented] (SPARK-5889) remove pid file in spark-daemon.sh after killing the process.

2015-02-18 Thread Zhan Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326443#comment-14326443 ] Zhan Zhang commented on SPARK-5889: --- https://github.com/apache/spark/pull/4676 remove

[jira] [Commented] (SPARK-5889) remove pid file in spark-daemon.sh after killing the process.

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326445#comment-14326445 ] Apache Spark commented on SPARK-5889: - User 'zhzhan' has created a pull request for

[jira] [Created] (SPARK-5886) Add LabelIndexer

2015-02-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5886: Summary: Add LabelIndexer Key: SPARK-5886 URL: https://issues.apache.org/jira/browse/SPARK-5886 Project: Spark Issue Type: Sub-task Components: ML

[jira] [Created] (SPARK-5889) remove pid file in spark-daemon.sh after killing the process.

2015-02-18 Thread Zhan Zhang (JIRA)
Zhan Zhang created SPARK-5889: - Summary: remove pid file in spark-daemon.sh after killing the process. Key: SPARK-5889 URL: https://issues.apache.org/jira/browse/SPARK-5889 Project: Spark Issue

[jira] [Commented] (SPARK-5867) Update spark.ml docs with DataFrame, Python examples

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326430#comment-14326430 ] Apache Spark commented on SPARK-5867: - User 'jkbradley' has created a pull request for

[jira] [Updated] (SPARK-5884) Implement feature transformers to ML pipelines in 1.4

2015-02-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5884: - Description: This is an umbrella JIRA to keep a list of feature transformers for ML pipelines we

[jira] [Commented] (SPARK-5832) Add Affinity Propagation clustering algorithm

2015-02-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326323#comment-14326323 ] Xiangrui Meng commented on SPARK-5832: -- [~viirya] Thanks for sharing the details! I'm

[jira] [Updated] (SPARK-5884) Implement feature transformers to ML pipelines in 1.4

2015-02-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5884: - Priority: Critical (was: Major) Implement feature transformers to ML pipelines in 1.4

[jira] [Created] (SPARK-5885) Add VectorAssembler

2015-02-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5885: Summary: Add VectorAssembler Key: SPARK-5885 URL: https://issues.apache.org/jira/browse/SPARK-5885 Project: Spark Issue Type: Sub-task Components:

[jira] [Commented] (SPARK-5016) GaussianMixtureEM should distribute matrix inverse for large numFeatures, k

2015-02-18 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326389#comment-14326389 ] Joseph K. Bradley commented on SPARK-5016: -- [~tgaloppo] That would be great if we

[jira] [Commented] (SPARK-5892) Clean up ML, MLlib docs for 1.3 release

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326431#comment-14326431 ] Apache Spark commented on SPARK-5892: - User 'jkbradley' has created a pull request for

[jira] [Created] (SPARK-5894) Add PolynomialMapper

2015-02-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5894: Summary: Add PolynomialMapper Key: SPARK-5894 URL: https://issues.apache.org/jira/browse/SPARK-5894 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-5895) Add VectorSlicer

2015-02-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5895: Summary: Add VectorSlicer Key: SPARK-5895 URL: https://issues.apache.org/jira/browse/SPARK-5895 Project: Spark Issue Type: Sub-task Components: ML

[jira] [Updated] (SPARK-5887) Class not found exception com.datastax.spark.connector.rdd.partitioner.CassandraPartition

2015-02-18 Thread Vijay Pawnarkar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay Pawnarkar updated SPARK-5887: --- Description: I am getting following class not found exception when using Spark 1.2.1 with

[jira] [Created] (SPARK-5891) Add Binarizer

2015-02-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5891: Summary: Add Binarizer Key: SPARK-5891 URL: https://issues.apache.org/jira/browse/SPARK-5891 Project: Spark Issue Type: Sub-task Components: ML

[jira] [Created] (SPARK-5884) Implement feature transformers to ML pipelines in 1.4

2015-02-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5884: Summary: Implement feature transformers to ML pipelines in 1.4 Key: SPARK-5884 URL: https://issues.apache.org/jira/browse/SPARK-5884 Project: Spark Issue

[jira] [Created] (SPARK-5888) Add OneHotEncoder

2015-02-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5888: Summary: Add OneHotEncoder Key: SPARK-5888 URL: https://issues.apache.org/jira/browse/SPARK-5888 Project: Spark Issue Type: Sub-task Components:

[jira] [Created] (SPARK-5893) Add Bucketizer

2015-02-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5893: Summary: Add Bucketizer Key: SPARK-5893 URL: https://issues.apache.org/jira/browse/SPARK-5893 Project: Spark Issue Type: Sub-task Components: ML

[jira] [Commented] (SPARK-5436) Validate GradientBoostedTrees during training

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326590#comment-14326590 ] Apache Spark commented on SPARK-5436: - User 'MechCoder' has created a pull request for

[jira] [Commented] (SPARK-5896) toDF in python doesn't work with Strings

2015-02-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326600#comment-14326600 ] Michael Armbrust commented on SPARK-5896: - Sorry! I had the wrong example.

[jira] [Created] (SPARK-5897) Add PIC code example to user guide

2015-02-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5897: Summary: Add PIC code example to user guide Key: SPARK-5897 URL: https://issues.apache.org/jira/browse/SPARK-5897 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-5897) Add PIC code example to user guide

2015-02-18 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5897: - Description: PIC user guide doesn't have code examples. Add PIC code example to user guide

[jira] [Created] (SPARK-5896) toDF in python doesn't work with Strings

2015-02-18 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-5896: --- Summary: toDF in python doesn't work with Strings Key: SPARK-5896 URL: https://issues.apache.org/jira/browse/SPARK-5896 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-5896) toDF in python doesn't work with Strings

2015-02-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326589#comment-14326589 ] Davies Liu commented on SPARK-5896: --- [~marmbrus] There is a mistake in your script, it

[jira] [Closed] (SPARK-5896) toDF in python doesn't work with Strings

2015-02-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-5896. - Resolution: Invalid Target Version/s: (was: 1.3.0) toDF in python doesn't work with Strings

[jira] [Created] (SPARK-5898) Can't create DataFrame from Pandas data frame

2015-02-18 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-5898: --- Summary: Can't create DataFrame from Pandas data frame Key: SPARK-5898 URL: https://issues.apache.org/jira/browse/SPARK-5898 Project: Spark Issue

[jira] [Updated] (SPARK-5896) toDF in python doesn't work with Strings

2015-02-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5896: Description: {code} rdd = sc.parallelize(range(10)).map(lambda x: (str(x), x)) kvdf =

[jira] [Updated] (SPARK-5896) toDF in python doesn't work with Strings

2015-02-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5896: Priority: Critical (was: Major) toDF in python doesn't work with Strings

[jira] [Updated] (SPARK-5896) toDF in python doesn't work with Strings

2015-02-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-5896: Description: {code} rdd = sc.parallelize(range(10)).map(lambda x: (str(x), x)) kvdf =

[jira] [Created] (SPARK-5899) Viewing specific stage information which contains thousands of tasks will freak out the driver and CPU cores from where it runs

2015-02-18 Thread Mark Khaitman (JIRA)
Mark Khaitman created SPARK-5899: Summary: Viewing specific stage information which contains thousands of tasks will freak out the driver and CPU cores from where it runs Key: SPARK-5899 URL:

[jira] [Commented] (SPARK-5896) toDF in python doesn't work with Strings

2015-02-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326603#comment-14326603 ] Davies Liu commented on SPARK-5896: --- I think this is a *feature*, not bug. `_1` is a

[jira] [Commented] (SPARK-5896) toDF in python doesn't work with Strings

2015-02-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326610#comment-14326610 ] Michael Armbrust commented on SPARK-5896: - Why not auto assign column names by

[jira] [Updated] (SPARK-5896) toDF in python doesn't work with tuple/list w/o names

2015-02-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-5896: -- Summary: toDF in python doesn't work with tuple/list w/o names (was: toDF in python doesn't work with

[jira] [Resolved] (SPARK-5840) HiveContext cannot be serialized due to tuple extraction

2015-02-18 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-5840. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4628

[jira] [Commented] (SPARK-5896) toDF in python doesn't work with tuple/list w/o names

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326655#comment-14326655 ] Apache Spark commented on SPARK-5896: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-5898) Can't create DataFrame from Pandas data frame

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326654#comment-14326654 ] Apache Spark commented on SPARK-5898: - User 'davies' has created a pull request for

[jira] [Updated] (SPARK-5896) toDF in python doesn't work with tuple/list

2015-02-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-5896: -- Summary: toDF in python doesn't work with tuple/list (was: toDF in python doesn't work with Strings)

[jira] [Created] (SPARK-5900) Wrap the results returned by PIC and FPGrowth in case classes

2015-02-18 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5900: Summary: Wrap the results returned by PIC and FPGrowth in case classes Key: SPARK-5900 URL: https://issues.apache.org/jira/browse/SPARK-5900 Project: Spark

[jira] [Commented] (SPARK-5906) Input read size incorrect for Parquet files

2015-02-18 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327007#comment-14327007 ] Sandy Ryza commented on SPARK-5906: --- That stack trace doesn't necessarily indicate to me

[jira] [Commented] (SPARK-5342) Allow long running Spark apps to run on secure YARN/HDFS

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327055#comment-14327055 ] Apache Spark commented on SPARK-5342: - User 'harishreedharan' has created a pull

[jira] [Commented] (SPARK-5751) Flaky test: o.a.s.sql.hive.thriftserver.HiveThriftServer2Suite sometimes times out

2015-02-18 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327053#comment-14327053 ] Cheng Lian commented on SPARK-5751: --- Hey [~joshrosen], thanks for the investigation! I

[jira] [Commented] (SPARK-5906) Input read size incorrect for Parquet files

2015-02-18 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326992#comment-14326992 ] Sandy Ryza commented on SPARK-5906: --- Hmm, that's definitely not the expected behavior.

[jira] [Commented] (SPARK-5906) Input read size incorrect for Parquet files

2015-02-18 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327010#comment-14327010 ] Kay Ousterhout commented on SPARK-5906: --- Ah sorry I was conflating NewHadoopRDD with

[jira] [Commented] (SPARK-5751) Flaky test: o.a.s.sql.hive.thriftserver.HiveThriftServer2Suite sometimes times out

2015-02-18 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327017#comment-14327017 ] Josh Rosen commented on SPARK-5751: --- I took a brief look at this today and have a few

[jira] [Updated] (SPARK-5859) fix Data Frame Python API

2015-02-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-5859: --- Assignee: Davies Liu fix Data Frame Python API -- Key:

[jira] [Commented] (SPARK-5881) RDD remains cached after the table gets overridden by CACHE TABLE

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327056#comment-14327056 ] Apache Spark commented on SPARK-5881: - User 'yhuai' has created a pull request for

[jira] [Closed] (SPARK-5906) Input read size incorrect for Parquet files

2015-02-18 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout closed SPARK-5906. - Resolution: Won't Fix Ah I see -- then I'm closing this, because I'm using Hadoop 2.0, which is

[jira] [Updated] (SPARK-5906) Input read size incorrect when using Hadoop version 2.5

2015-02-18 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-5906: -- Summary: Input read size incorrect when using Hadoop version 2.5 (was: Input read size

[jira] [Commented] (SPARK-5841) Memory leak in DiskBlockManager

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327086#comment-14327086 ] Apache Spark commented on SPARK-5841: - User 'nishkamravi2' has created a pull request

[jira] [Commented] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327038#comment-14327038 ] Apache Spark commented on SPARK-1537: - User 'zhzhan' has created a pull request for

[jira] [Assigned] (SPARK-5904) DataFrame methods with varargs do not work in Java

2015-02-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reassigned SPARK-5904: -- Assignee: Reynold Xin DataFrame methods with varargs do not work in Java

[jira] [Updated] (SPARK-5906) Input read size can be incorrect when using Hadoop version 2.5

2015-02-18 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-5906: -- Summary: Input read size can be incorrect when using Hadoop version 2.5 (was: Input read size

[jira] [Updated] (SPARK-5906) Input read size incorrect when using Hadoop version 2.5

2015-02-18 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-5906: -- Description: When SparkSQL reads input data from parquet, there are many cases where it

[jira] [Commented] (SPARK-5906) Input read size incorrect for Parquet files

2015-02-18 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327004#comment-14327004 ] Kay Ousterhout commented on SPARK-5906: --- Here's one stack trace from one of the

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325806#comment-14325806 ] Sean Owen commented on SPARK-5837: -- That's different [~rok], if you mean your browser

[jira] [Commented] (SPARK-5615) Fix testPackage in StreamingContextSuite

2015-02-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326148#comment-14326148 ] Sean Owen commented on SPARK-5615: -- Although the PR was closed, I think this still tracks

[jira] [Resolved] (SPARK-4949) shutdownCallback in SparkDeploySchedulerBackend should be enclosed by synchronized block.

2015-02-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-4949. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 3781

[jira] [Updated] (SPARK-5825) Failure stopping Services while command line argument is too long

2015-02-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5825: - Priority: Blocker (was: Major) Target Version/s: 1.3.0 Assignee: Cheng Hao

[jira] [Commented] (SPARK-1920) Spark JAR compiled with Java 7 leads to PySpark not working in YARN

2015-02-18 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325961#comment-14325961 ] Thomas Graves commented on SPARK-1920: -- Note that jdk8 seems to also have this same

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-18 Thread Rok Roskar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325957#comment-14325957 ] Rok Roskar commented on SPARK-5837: --- by the way, what would be the normal way of seeing

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325960#comment-14325960 ] Sean Owen commented on SPARK-5837: -- If you access the driver on 4040 (or whatever you

[jira] [Commented] (SPARK-4879) Missing output partitions after job completes with speculative execution

2015-02-18 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325969#comment-14325969 ] Andrew Ash commented on SPARK-4879: --- That sounds exactly like what I'd expect to happen

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-18 Thread Rok Roskar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325802#comment-14325802 ] Rok Roskar commented on SPARK-5837: --- I'm seeing a similar issue to the one being

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-18 Thread Rok Roskar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326058#comment-14326058 ] Rok Roskar commented on SPARK-5837: --- yes, going to port 4040 sends me to port 8088 on

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-18 Thread Rok Roskar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325809#comment-14325809 ] Rok Roskar commented on SPARK-5837: --- or that a service isn't running where it's expected

[jira] [Commented] (SPARK-5436) Validate GradientBoostedTrees during training

2015-02-18 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325829#comment-14325829 ] Manoj Kumar commented on SPARK-5436: The idea sounds great. I shall come up with a

[jira] [Created] (SPARK-5882) Add a test for GraphLoader.edgeListFile

2015-02-18 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-5882: --- Summary: Add a test for GraphLoader.edgeListFile Key: SPARK-5882 URL: https://issues.apache.org/jira/browse/SPARK-5882 Project: Spark Issue Type: Test

[jira] [Commented] (SPARK-5882) Add a test for GraphLoader.edgeListFile

2015-02-18 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326006#comment-14326006 ] Apache Spark commented on SPARK-5882: - User 'maropu' has created a pull request for

[jira] [Commented] (SPARK-5837) HTTP 500 if try to access Spark UI in yarn-cluster or yarn-client mode

2015-02-18 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326154#comment-14326154 ] Sean Owen commented on SPARK-5837: -- Yes, if I have this right, it should redirect you
