[jira] [Commented] (SPARK-2103) Java + Kafka + Spark Streaming NoSuchMethodError in java.lang.Object.

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068263#comment-14068263 ] Apache Spark commented on SPARK-2103: User 'jerryshao' has created a pull request for

[jira] [Commented] (SPARK-2565) Update ShuffleReadMetrics as blocks are fetched

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068262#comment-14068262 ] Apache Spark commented on SPARK-2565: User 'sryza' has created a pull request for thi

[jira] [Updated] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1630: Target Version/s: 1.1.0 > PythonRDDs don't handle nulls gracefully >

[jira] [Updated] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1630: Fix Version/s: (was: 1.1.0) > PythonRDDs don't handle nulls gracefully >

[jira] [Updated] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1630: Component/s: SQL > PythonRDDs don't handle nulls gracefully > ---

[jira] [Updated] (SPARK-2546) Configuration object thread safety issue

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2546: Target Version/s: 1.1.0 > Configuration object thread safety issue >

[jira] [Updated] (SPARK-2312) Spark Actors do not handle unknown messages in their receive methods

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2312: Priority: Minor (was: Major) > Spark Actors do not handle unknown messages in their receive

[jira] [Updated] (SPARK-2380) Support displaying accumulator contents in the web UI

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2380: Priority: Critical (was: Major) > Support displaying accumulator contents in the web UI > --

[jira] [Resolved] (SPARK-2021) External hashing in PySpark

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2021. Resolution: Duplicate I think this is a dup of SPARK-2538 - so I'm closing it. Feel free to

[jira] [Updated] (SPARK-2579) Reading from S3 returns an inconsistent number of items with Spark 0.9.1

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2579: Priority: Critical (was: Major) > Reading from S3 returns an inconsistent number of items wi

[jira] [Commented] (SPARK-1767) Prefer HDFS-cached replicas when scheduling data-local tasks

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068242#comment-14068242 ] Patrick Wendell commented on SPARK-1767: I think the solution that [~adav] propose

[jira] [Updated] (SPARK-2045) Sort-based shuffle implementation

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2045: Component/s: Spark Core > Sort-based shuffle implementation > ---

[jira] [Comment Edited] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068240#comment-14068240 ] Patrick Wendell edited comment on SPARK-2583 at 7/21/14 6:03 AM: ---

[jira] [Commented] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not

2014-07-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068240#comment-14068240 ] Patrick Wendell commented on SPARK-2583: Hey [~sarutak] - I'm curious - what is th

[jira] [Commented] (SPARK-2582) Make Block Manager Master pluggable

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068238#comment-14068238 ] Apache Spark commented on SPARK-2582: User 'harishreedharan' has created a pull reque

[jira] [Commented] (SPARK-2582) Make Block Manager Master pluggable

2014-07-20 Thread Hari Shreedharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068237#comment-14068237 ] Hari Shreedharan commented on SPARK-2582: PR: https://github.com/apache/spark/pul

[jira] [Resolved] (SPARK-1945) Add full Java examples in MLlib docs

2014-07-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-1945. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1311 [https://gith

[jira] [Commented] (SPARK-2470) Fix PEP 8 violations

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068162#comment-14068162 ] Apache Spark commented on SPARK-2470: User 'nchammas' has created a pull request for

[jira] [Commented] (SPARK-2511) Add TF-IDF featurizer

2014-07-20 Thread Michael Yannakopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068159#comment-14068159 ] Michael Yannakopoulos commented on SPARK-2511: I am really interested in thi

[jira] [Comment Edited] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068113#comment-14068113 ] Doris Xin edited comment on SPARK-2599 at 7/21/14 2:06 AM: Foun

[jira] [Commented] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068113#comment-14068113 ] Doris Xin commented on SPARK-2599: Found this in-depth article discussing the different

[jira] [Resolved] (SPARK-2552) Stabilize the computation of logistic function in pyspark

2014-07-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-2552. Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1493 [https://gith

[jira] [Closed] (SPARK-2512) Stratified sampling

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doris Xin closed SPARK-2512. Resolution: Duplicate > Stratified sampling > --- > > Key: SPARK-2512 >

[jira] [Updated] (SPARK-2082) Stratified sampling implementation in PairRDDFunctions

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doris Xin updated SPARK-2082: Target Version/s: 1.1.0 > Stratified sampling implementation in PairRDDFunctions > --

[jira] [Commented] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Debasish Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068094#comment-14068094 ] Debasish Das commented on SPARK-2602: CDH5 does not even support Java 6 anymore! > s

[jira] [Commented] (SPARK-2603) Remove unnecessary toMap and toList in converting Java collections to Scala collections JsonRDD.scala

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068093#comment-14068093 ] Apache Spark commented on SPARK-2603: User 'yhuai' has created a pull request for thi

[jira] [Updated] (SPARK-2603) Remove unnecessary toMap and toList in converting Java collections to Scala collections JsonRDD.scala

2014-07-20 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-2603: Description: In JsonRDD.scalafy, we are using toMap/toList to convert a Java Map/List to a Scala one. These

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-20 Thread Ken Carlile (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068091#comment-14068091 ] Ken Carlile commented on SPARK-2282: Hi Aaron, I have pulled the spark-master repo a

[jira] [Created] (SPARK-2603) Remove unnecessary toMap and toList in converting Java collections to Scala collections JsonRDD.scala

2014-07-20 Thread Yin Huai (JIRA)
Yin Huai created SPARK-2603: Summary: Remove unnecessary toMap and toList in converting Java collections to Scala collections JsonRDD.scala Key: SPARK-2603 URL: https://issues.apache.org/jira/browse/SPARK-2603

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-20 Thread Aaron Davidson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068083#comment-14068083 ] Aaron Davidson commented on SPARK-2282: Hey Ken, I created [PR 1503|https://github

[jira] [Commented] (SPARK-2282) PySpark crashes if too many tasks complete quickly

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068082#comment-14068082 ] Apache Spark commented on SPARK-2282: User 'aarondav' has created a pull request for

[jira] [Comment Edited] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068050#comment-14068050 ] Nicholas Chammas edited comment on SPARK-2602 at 7/20/14 9:24 PM: --

[jira] [Commented] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068050#comment-14068050 ] Nicholas Chammas commented on SPARK-2602: Ah, I'm on Java 6. Looking at [this co

[jira] [Commented] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068045#comment-14068045 ] Sean Owen commented on SPARK-2602: I have not observed this ever. OS X 10.9.4 / Java 7.

[jira] [Commented] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Gera Shegalov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068044#comment-14068044 ] Gera Shegalov commented on SPARK-2602: Take a look at the thread on HADOOP-10290 >

[jira] [Created] (SPARK-2602) sbt/sbt test steals window focus on OS X

2014-07-20 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-2602: Summary: sbt/sbt test steals window focus on OS X Key: SPARK-2602 URL: https://issues.apache.org/jira/browse/SPARK-2602 Project: Spark Issue Type: Impr

[jira] [Commented] (SPARK-2047) Use less memory in AppendOnlyMap.destructiveSortedIterator

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068042#comment-14068042 ] Apache Spark commented on SPARK-2047: User 'aarondav' has created a pull request for

[jira] [Resolved] (SPARK-2519) Eliminate pattern-matching on Tuple2 in performance-critical aggregation code

2014-07-20 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza resolved SPARK-2519. Resolution: Fixed Fix Version/s: 1.1.0 > Eliminate pattern-matching on Tuple2 in performance-c

[jira] [Resolved] (SPARK-2598) RangePartitioner's binary search does not use the given Ordering

2014-07-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-2598. Resolution: Fixed Fix Version/s: 1.0.2, 1.1.0 > RangePartitioner's binary

[jira] [Updated] (SPARK-2601) py4j.Py4JException on sc.pickleFile

2014-07-20 Thread Kevin Matzen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Matzen updated SPARK-2601: Description: {code:title=test.py} from pyspark import SparkContext text_filename = 'README.md' pic

[jira] [Created] (SPARK-2601) py4j.Py4JException on sc.pickleFile

2014-07-20 Thread Kevin Matzen (JIRA)
Kevin Matzen created SPARK-2601: Summary: py4j.Py4JException on sc.pickleFile Key: SPARK-2601 URL: https://issues.apache.org/jira/browse/SPARK-2601 Project: Spark Issue Type: Bug Com

[jira] [Commented] (SPARK-2552) Stabilize the computation of logistic function in pyspark

2014-07-20 Thread Michael Yannakopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067971#comment-14067971 ] Michael Yannakopoulos commented on SPARK-2552: Xiangrui Meng, Sorry about p

[jira] [Comment Edited] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067876#comment-14067876 ] Sean Owen edited comment on SPARK-2599 at 7/20/14 10:20 AM: Ye

[jira] [Commented] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067876#comment-14067876 ] Sean Owen commented on SPARK-2599: The relative error will never be more than 2.0; it wo
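
The preview above is cut off, and the actual definition used by mllib.util.TestingUtils is not shown in this listing. As a rough illustration only, the Python sketch below assumes a symmetric relative error, |a - b| / max(|a|, |b|); the names relative_error and almost_equal and the tolerance defaults are hypothetical. Under that assumption the value is bounded by 2.0 and is exactly 1.0 whenever one operand is 0.0, which is why a purely relative check cannot pass against zero without an absolute fallback.

```python
def relative_error(a: float, b: float) -> float:
    """Assumed definition: |a - b| / max(|a|, |b|), taken as 0.0 when both are 0."""
    scale = max(abs(a), abs(b))
    if scale == 0.0:
        return 0.0
    return abs(a - b) / scale


def almost_equal(a: float, b: float,
                 rel_tol: float = 1e-8, abs_tol: float = 1e-12) -> bool:
    """Hypothetical check: pass on a small absolute OR a small relative difference."""
    if abs(a - b) <= abs_tol:  # absolute fallback, needed near zero
        return True
    return relative_error(a, b) <= rel_tol


# Comparing any nonzero value against 0.0 gives a relative error of 1.0.
tiny_vs_zero = relative_error(1e-15, 0.0)
# Opposite-sign values of equal magnitude hit the upper bound of 2.0.
opposite_signs = relative_error(1.0, -1.0)
```

The absolute tolerance is a design choice for this sketch, not something stated in the thread; Python's standard math.isclose exposes the same pair of knobs (rel_tol and abs_tol).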

[jira] [Created] (SPARK-2600) Correlations (Pearson, Spearman)

2014-07-20 Thread Doris Xin (JIRA)
Doris Xin created SPARK-2600: Summary: Correlations (Pearson, Spearman) Key: SPARK-2600 URL: https://issues.apache.org/jira/browse/SPARK-2600 Project: Spark Issue Type: Sub-task Compone

[jira] [Closed] (SPARK-2600) Correlations (Pearson, Spearman)

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doris Xin closed SPARK-2600. Resolution: Implemented > Correlations (Pearson, Spearman) > > >

[jira] [Commented] (SPARK-2512) Stratified sampling

2014-07-20 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067848#comment-14067848 ] Doris Xin commented on SPARK-2512: Hey Xiangrui can you close this one since there's alr

[jira] [Created] (SPARK-2599) almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0

2014-07-20 Thread Doris Xin (JIRA)
Doris Xin created SPARK-2599: Summary: almostEquals mllib.util.TestingUtils does not behave as expected when comparing against 0.0 Key: SPARK-2599 URL: https://issues.apache.org/jira/browse/SPARK-2599 Pro

[jira] [Updated] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not

2014-07-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2583: Priority: Critical (was: Major) > ConnectionManager cannot distinguish whether error occurred or not

[jira] [Updated] (SPARK-2583) ConnectionManager cannot distinguish whether error occurred or not

2014-07-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2583: Target Version/s: 1.1.0 Assignee: Kousuke Saruta > ConnectionManager cannot distinguish w

[jira] [Commented] (SPARK-2598) RangePartitioner's binary search does not use the given Ordering

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067836#comment-14067836 ] Apache Spark commented on SPARK-2598: User 'rxin' has created a pull request for this

[jira] [Commented] (SPARK-2045) Sort-based shuffle implementation

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067826#comment-14067826 ] Apache Spark commented on SPARK-2045: User 'mateiz' has created a pull request for th

[jira] [Created] (SPARK-2598) RangePartitioner's binary search does not use the given Ordering

2014-07-20 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2598: Summary: RangePartitioner's binary search does not use the given Ordering Key: SPARK-2598 URL: https://issues.apache.org/jira/browse/SPARK-2598 Project: Spark I

[jira] [Commented] (SPARK-2521) Broadcast RDD object once per TaskSet (instead of sending it for every task)

2014-07-20 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067824#comment-14067824 ] Apache Spark commented on SPARK-2521: User 'rxin' has created a pull request for this