[GitHub] spark pull request: Set spark.executor.uri from environment variab...

2014-04-03 Thread ivanwick
GitHub user ivanwick opened a pull request: https://github.com/apache/spark/pull/311 Set spark.executor.uri from environment variable (needed by Mesos) The Mesos backend uses this property when setting up a slave process. It is similarly set in the Scala repl

[GitHub] spark pull request: Set spark.executor.uri from environment variab...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/311#issuecomment-39416318 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: [SPARK-1371] fix computePreferredLocations sig...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/302#issuecomment-39416816 Merged build triggered.

[GitHub] spark pull request: [SPARK-1371] fix computePreferredLocations sig...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/302#issuecomment-39416824 Merged build started.

[GitHub] spark pull request: SPARK-1380: Add sort-merge based cogroup/joins...

2014-04-03 Thread ueshin
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/283#issuecomment-39417683 @rxin Thank you for your reply. There are some cases where merge join can be used for optimization: 1. If the data to be joined are already sorted by join keys, merge join

[GitHub] spark pull request: Spark parquet improvements

2014-04-03 Thread AndreSchumacher
Github user AndreSchumacher commented on the pull request: https://github.com/apache/spark/pull/195#issuecomment-39418052 @marmbrus Great, thanks a lot. I will go through those comments and your PR and extend the documentation.

[GitHub] spark pull request: [SPARK-1398] Bumped FindBugs 1 to FindBugs 2

2014-04-03 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/307#issuecomment-39418061 Actually, looking at this a little deeper, the smarter move might be to bump our Guava version to 16.0.1 and eliminate the explicit dependency on findbugs

[GitHub] spark pull request: Spark parquet improvements

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/195#issuecomment-39418179 Merged build triggered.

[GitHub] spark pull request: [SPARK-1371] fix computePreferredLocations sig...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/302#issuecomment-39419969 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13719/

[GitHub] spark pull request: [SPARK-1371] fix computePreferredLocations sig...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/302#issuecomment-39419968 Merged build finished. All automated tests passed.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-39420353 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-39420359 Merged build started.

[GitHub] spark pull request: DAGScheduler NullPointerException bug fix

2014-04-03 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/273#issuecomment-39420452 ok

[GitHub] spark pull request: SPARK-1380: Add sort-merge based cogroup/joins...

2014-04-03 Thread ueshin
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/283#issuecomment-39421176 @mridulm Thank you for your reply. There are 2 points I have to mention about memory: - Before shuffle If data are sorted, no more memory is needed

[GitHub] spark pull request: Spark parquet improvements

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/195#issuecomment-39421942 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13720/

[GitHub] spark pull request: Spark parquet improvements

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/195#issuecomment-39421940 Merged build finished. All automated tests passed.

[GitHub] spark pull request: SPARK-1216. Add a OneHotEncoder for handling c...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/304#issuecomment-39423129 Merged build triggered.

[GitHub] spark pull request: SPARK-1216. Add a OneHotEncoder for handling c...

2014-04-03 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/304#issuecomment-39423212 Thanks for the tip Reynold. Updated patch fixes the comments.

Re: [GitHub] spark pull request: SPARK-1154: Clean up app folders in worker nod...

2014-04-03 Thread Mark Hamstra
Clone a new copy of your fork just to be safe; checkout your SPARK-1154 branch; git pull --rebase g...@github.com:apache/spark.git master; now your and Kelvin's changes should appear last, after everything that is already in master; push origin +SPARK-1154-cleanup-app-folders -- and you're

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-39424549 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13721/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-39424548 Merged build finished. All automated tests passed.

[GitHub] spark pull request: SPARK-1093: Annotate developer and experimenta...

2014-04-03 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/274#discussion_r11243339 --- Diff: core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala --- @@ -57,6 +57,7 @@ private[spark] class CoGroupPartition(idx: Int, val deps:

[GitHub] spark pull request: SPARK-1093: Annotate developer and experimenta...

2014-04-03 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/274#discussion_r11243369 --- Diff: core/src/main/scala/org/apache/spark/rdd/PartitionPruningRDD.scala --- @@ -21,7 +21,8 @@ import scala.reflect.ClassTag import

[GitHub] spark pull request: SPARK-1093: Annotate developer and experimenta...

2014-04-03 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/274#discussion_r11243475 --- Diff: core/src/main/scala/org/apache/spark/util/FileLogger.scala --- @@ -36,7 +36,7 @@ import org.apache.spark.io.CompressionCodec * @param compress

[GitHub] spark pull request: SPARK-1093: Annotate developer and experimenta...

2014-04-03 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/274#discussion_r11243534 --- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala --- @@ -21,6 +21,11 @@ import scala.collection.mutable.ArrayBuffer import

[GitHub] spark pull request: SPARK-1216. Add a OneHotEncoder for handling c...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/304#issuecomment-39429342 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13722/

[GitHub] spark pull request: SPARK-1216. Add a OneHotEncoder for handling c...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/304#issuecomment-39429341 Merged build finished.

[GitHub] spark pull request: [SPARK-1398] Bumped FindBugs 1 to FindBugs 2

2014-04-03 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/307#issuecomment-39429880 Ah I should have looked more at this. The dependency is just to bring in some annotations that were later standardized. We do not use them, and so do not need any copy,

[GitHub] spark pull request: method getAllPools in SC throws NPE

2014-04-03 Thread scwf
GitHub user scwf opened a pull request: https://github.com/apache/spark/pull/312 method getAllPools in SC throws NPE SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in

[GitHub] spark pull request: method getAllPools in SC throws NPE

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/312#issuecomment-39431227 Can one of the admins verify this patch?

[GitHub] spark pull request: method getAllPools in SC throws NPE

2014-04-03 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/312#issuecomment-39431524 TaskSchedulerImpl should override rootPool method

[GitHub] spark pull request: [WIP] [SPARK-1328] Add vector statistics

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/268#issuecomment-39432280 Build finished. All automated tests passed.

[GitHub] spark pull request: [WIP] [SPARK-1328] Add vector statistics

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/268#issuecomment-39432281 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13723/

[GitHub] spark pull request: method getAllPools in SC throws NPE

2014-04-03 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/312#issuecomment-39434966 I was wrong. I think this is an ordering issue: it depends on the order in which the UI and the taskScheduler are started in sc.

[GitHub] spark pull request: SPARK-1404: Always upgrade spark-env.sh vars t...

2014-04-03 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/310#discussion_r11246612 --- Diff: bin/spark-shell --- @@ -145,7 +145,7 @@ function resolve_spark_master(){ fi if [ -z $MASTER ]; then -

[GitHub] spark pull request: [SPARK-1276] Add a HistoryServer to render per...

2014-04-03 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/204#issuecomment-39449932 It's not readable by other users in the intermediate directory. Sorry I didn't explain fully. That directory itself is 1777, sticky bit set, world writable. Each user

[GitHub] spark pull request: Spark parquet improvements

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/195#issuecomment-39450898 Merged build started.

[GitHub] spark pull request: [SPARK-1276] Add a HistoryServer to render per...

2014-04-03 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/204#issuecomment-39453542 Also, to clarify: the history files are actually first written to the user's staging directory, which is only user readable/writable, and then moved over to

[GitHub] spark pull request: [SPARK-1276] Add a HistoryServer to render per...

2014-04-03 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/204#issuecomment-39454420 First of all I would like to say great work, having this history server is awesome!! I'm trying it out on a non-secure yarn cluster but when I run something

[GitHub] spark pull request: Spark parquet improvements

2014-04-03 Thread AndreSchumacher
Github user AndreSchumacher commented on the pull request: https://github.com/apache/spark/pull/195#issuecomment-39457091 @marmbrus Thanks again for the comments and changes. I shuffled the imports and removed the additions to SQLContext. Also moved the insert examples to the tests

[GitHub] spark pull request: Spark parquet improvements

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/195#issuecomment-39457183 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13724/

[GitHub] spark pull request: [SPARK-1276] Add a HistoryServer to render per...

2014-04-03 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/204#discussion_r11253731 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -0,0 +1,221 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-1276] Add a HistoryServer to render per...

2014-04-03 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/204#discussion_r11254082 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -0,0 +1,221 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-1276] Add a HistoryServer to render per...

2014-04-03 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/204#discussion_r11254543 --- Diff: sbin/start-history-server.sh --- @@ -0,0 +1,46 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-1276] Add a HistoryServer to render per...

2014-04-03 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/204#discussion_r11256081 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -0,0 +1,221 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-1276] Add a HistoryServer to render per...

2014-04-03 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/204#discussion_r11256259 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -0,0 +1,221 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-1404: Always upgrade spark-env.sh vars t...

2014-04-03 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/310#discussion_r11261202 --- Diff: bin/spark-shell --- @@ -145,7 +145,7 @@ function resolve_spark_master(){ fi if [ -z $MASTER ]; then -

[GitHub] spark pull request: SPARK-1404: Always upgrade spark-env.sh vars t...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/310#issuecomment-39476946 Merged build triggered.

[GitHub] spark pull request: SPARK-1404: Always upgrade spark-env.sh vars t...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/310#issuecomment-39476959 Merged build started.

[GitHub] spark pull request: SPARK-1350. Always use JAVA_HOME to run execut...

2014-04-03 Thread sryza
GitHub user sryza opened a pull request: https://github.com/apache/spark/pull/313 SPARK-1350. Always use JAVA_HOME to run executor container JVMs. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sryza/spark sandy-spark-1350

[GitHub] spark pull request: SPARK-1350. Always use JAVA_HOME to run execut...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/313#issuecomment-39478597 Merged build triggered.

[GitHub] spark pull request: SPARK-1350. Always use JAVA_HOME to run execut...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/313#issuecomment-39478612 Merged build started.

[GitHub] spark pull request: [SPARK-1276] Add a HistoryServer to render per...

2014-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/204#discussion_r11262132 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -0,0 +1,221 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-1154: Clean up app folders in worker nod...

2014-04-03 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/288#issuecomment-39481008 @velvia what happens if you just close and re-open the pull request? Will that work?

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r11263139 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r11263287 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: SPARK-1154: Clean up app folders in worker nod...

2014-04-03 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/288#issuecomment-39482892 Rebasing and then force pushing isn't kind to Git, but GitHub does handle it well.

[GitHub] spark pull request: [SQL] SPARK-1333 First draft of java API

2014-04-03 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/248#issuecomment-39482957 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-1360] Add Timestamp Support for SQL

2014-04-03 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/275#issuecomment-39483020 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r11263798 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: SPARK-1404: Always upgrade spark-env.sh vars t...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/310#issuecomment-39483328 Merged build finished. All automated tests passed.

[GitHub] spark pull request: SPARK-1404: Always upgrade spark-env.sh vars t...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/310#issuecomment-39483329 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13725/

[GitHub] spark pull request: [SQL] SPARK-1333 First draft of java API

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/248#issuecomment-39483451 Merged build triggered.

[GitHub] spark pull request: [SQL] SPARK-1333 First draft of java API

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/248#issuecomment-39483466 Merged build started.

[GitHub] spark pull request: [SQL] SPARK-1371 Hash Aggregation Improvements

2014-04-03 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/295#issuecomment-39483627 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r11263980 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r11264135 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -65,26 +65,36 @@ private[spark] class MapOutputTrackerMasterActor(tracker:

[GitHub] spark pull request: [SQL] SPARK-1371 Hash Aggregation Improvements

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/295#issuecomment-39484062 Merged build started.

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r11264351 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -194,17 +194,43 @@ private[spark] class MapOutputTracker(conf: SparkConf)

[GitHub] spark pull request: SPARK-1350. Always use JAVA_HOME to run execut...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/313#issuecomment-39485509 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13726/

[GitHub] spark pull request: SPARK-1350. Always use JAVA_HOME to run execut...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/313#issuecomment-39485508 Merged build finished. All automated tests passed.

[GitHub] spark pull request: SPARK-1154: Clean up app folders in worker nod...

2014-04-03 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/288#issuecomment-39485564 Rebasing is no big deal if the commits that you are rebasing are only in a private repo. For example, even though https://github.com/markhamstra/spark is technically

[GitHub] spark pull request: SPARK-1305: Support persisting RDD's directly ...

2014-04-03 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/158#discussion_r11264777 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -669,7 +717,7 @@ private[spark] class BlockManager( // Either

[GitHub] spark pull request: [SPARK-1398] Bumped FindBugs 1 to FindBugs 2

2014-04-03 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/307#issuecomment-39486891 Looks like just removing the findbugs jsr305 dependency entirely works fine, but bumping the Guava version not so much. I'm going to change the name of this JIRA and

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r11265542 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -641,7 +643,11 @@ class SparkContext( *

[GitHub] spark pull request: SPARK-1350. Always use JAVA_HOME to run execut...

2014-04-03 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/313#issuecomment-39488004 Change looks good to me. Will wait to see if anyone else has comments.

[GitHub] spark pull request: [SPARK-1398] Removed findbugs jsr305 dependenc...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/307#issuecomment-39488330 Merged build started.

[GitHub] spark pull request: [SPARK-1398] Removed findbugs jsr305 dependenc...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/307#issuecomment-39488315 Merged build triggered.

[GitHub] spark pull request: [SPARK-1395] Allow local: URIs to work on Ya...

2014-04-03 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/303#discussion_r11265870 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -427,28 +430,49 @@ object ClientBase { } def

[GitHub] spark pull request: [SPARK-1198] Allow pipes tasks to run in diffe...

2014-04-03 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/128#issuecomment-39489093 BTW if you want this to go to 1.0, it would be good to make these changes soon. We can potentially also include it without Windows support, and mark the folder stuff as

[GitHub] spark pull request: SPARK-729: Closures not always serialized at c...

2014-04-03 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/189#issuecomment-39489309 Hey Will, sorry for the really late reply, still interested in having this go in, at least with the simple version. I've just been swamped with other reviews.

[GitHub] spark pull request: [SPARK-1395] Allow local: URIs to work on Ya...

2014-04-03 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/303#discussion_r11266231 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -427,28 +430,49 @@ object ClientBase { } def

[GitHub] spark pull request: [SPARK-1259] Make RDD locally iterable

2014-04-03 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/156#issuecomment-39489375 Jenkins, test this please

[GitHub] spark pull request: [SPARK-1259] Make RDD locally iterable

2014-04-03 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/156#discussion_r11266262 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala --- @@ -282,6 +283,17 @@ trait JavaRDDLike[T, This : JavaRDDLike[T, This]] extends

[GitHub] spark pull request: [SPARK-1259] Make RDD locally iterable

2014-04-03 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/156#discussion_r11266266 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -664,6 +664,24 @@ abstract class RDD[T: ClassTag]( } /** + *

[GitHub] spark pull request: SPARK-729: Closures not always serialized at c...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/189#issuecomment-39489560 Merged build started.

[GitHub] spark pull request: SPARK-729: Closures not always serialized at c...

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/189#issuecomment-39489550 Merged build triggered.

[GitHub] spark pull request: [SPARK-1259] Make RDD locally iterable

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/156#issuecomment-39489554 Merged build triggered.

[GitHub] spark pull request: [SPARK-1259] Make RDD locally iterable

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/156#issuecomment-39489572 Merged build started.

[GitHub] spark pull request: [SPARK-1259] Make RDD locally iterable

2014-04-03 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/156#discussion_r11266352 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -664,6 +664,24 @@ abstract class RDD[T: ClassTag]( } /** + *

[GitHub] spark pull request: SPARK-1154: Clean up app folders in worker nod...

2014-04-03 Thread velvia
Github user velvia commented on the pull request: https://github.com/apache/spark/pull/288#issuecomment-39489767 Mark, thanks. Yeah, I realize instead of doing a merge I should just do a pull now, but it's probably too late for this PR. Instead I'll close it and reopen and

[GitHub] spark pull request: [SPARK-1259] Make RDD locally iterable

2014-04-03 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/156#discussion_r11266448 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -664,6 +664,24 @@ abstract class RDD[T: ClassTag]( } /** + *

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r11266501 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -180,12 +180,24 @@ object SparkEnv extends Logging { } } +

[GitHub] spark pull request: SPARK-729: Closures not always serialized at c...

2014-04-03 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/189#issuecomment-39489148 Jenkins, test this please

[GitHub] spark pull request: [SQL] SPARK-1333 First draft of java API

2014-04-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/248#issuecomment-39490054 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13727/

[GitHub] spark pull request: SPARK-729: Closures not always serialized at c...

2014-04-03 Thread willb
Github user willb commented on the pull request: https://github.com/apache/spark/pull/189#issuecomment-39490073 Thanks for taking another look, Matei! I know there's a lot of stuff to get in before the merge window closes and appreciate the update.

[GitHub] spark pull request: [SPARK-1259] Make RDD locally iterable

2014-04-03 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/156#issuecomment-39490196 Made some more comments to simplify this now that we return an Iterator.

[GitHub] spark pull request: [SPARK-1198] Allow pipes tasks to run in diffe...

2014-04-03 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/128#issuecomment-39490166 Ok, thanks for the heads up, I hope to get to it today or tomorrow.

[GitHub] spark pull request: SPARK-1154: Clean up app folders in worker nod...

2014-04-03 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/288#issuecomment-39490098 Really, there should be no need for that. Just do a 'pull --rebase' in your branch and everything that is not already part of master will be moved to after

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-04-03 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r11266739 --- Diff: core/src/main/scala/org/apache/spark/broadcast/Broadcast.scala --- @@ -51,49 +50,37 @@ import org.apache.spark._ * @tparam T Type of the data
