[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/146#discussion_r10631616 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala --- @@ -0,0 +1,328 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Don't swallow all kryo errors, only those that...

2014-03-14 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/142#issuecomment-37676276 Good catch, merging into master. We may want to merge this into branch-0.9 as well, @pwendell any thoughts? --- If your project is set up for it, you can reply to this

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37676157 Looks good to me.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10618053 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -49,11 +50,13 @@ private[spark] class CacheManager(blockManager: BlockManager

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-13 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10576136 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -222,4 +232,19 @@ private[spark] object HadoopRDD { def

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-13 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10576150 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -222,4 +232,19 @@ private[spark] object HadoopRDD { def

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-13 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10576162 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -222,4 +232,19 @@ private[spark] object HadoopRDD { def

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-13 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10576154 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -222,4 +232,19 @@ private[spark] object HadoopRDD { def

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/101#issuecomment-37499281 Thanks for updating this, I think changing the package is a very good cleanup.

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10552386 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -222,4 +232,27 @@ private[spark] object HadoopRDD { def

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10552362 --- Diff: core/src/main/scala/org/apache/spark/SparkHadoopWriter.scala --- @@ -15,18 +15,19 @@ * limitations under the License

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10552358 --- Diff: core/src/main/scala/org/apache/spark/SparkHadoopWriter.scala --- @@ -15,18 +15,19 @@ * limitations under the License

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10552253 --- Diff: core/src/main/scala/org/apache/spark/SparkHadoopWriter.scala --- @@ -15,18 +15,19 @@ * limitations under the License

[GitHub] spark pull request: [SPARK-1186] : Enrich the Spark Shell to suppo...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/116#discussion_r10551425 --- Diff: bin/spark-shell --- @@ -30,69 +30,367 @@ esac # Enter posix mode for bash set -o posix -CORE_PATTERN="^[0-9]+$" -M

[GitHub] spark pull request: [SPARK-1186] : Enrich the Spark Shell to suppo...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/116#discussion_r10551396 --- Diff: bin/spark-shell --- @@ -30,69 +30,367 @@ esac # Enter posix mode for bash set -o posix -CORE_PATTERN="^[0-9]+$" -M

[GitHub] spark pull request: [SPARK-1186] : Enrich the Spark Shell to suppo...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/116#discussion_r10551389 --- Diff: bin/spark-shell --- @@ -30,69 +30,367 @@ esac # Enter posix mode for bash set -o posix -CORE_PATTERN="^[0-9]+$" -M

[GitHub] spark pull request: [SPARK-1186] : Enrich the Spark Shell to suppo...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/116#discussion_r10551344 --- Diff: bin/spark-shell --- @@ -30,69 +30,367 @@ esac # Enter posix mode for bash set -o posix -CORE_PATTERN="^[0-9]+$" -M

[GitHub] spark pull request: hot fix for PR105 - change to Java annotation

2014-03-12 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/133#issuecomment-37495480 LGTM, merged into master. Thanks!!

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10551206 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag]( checkpointData.flatMap

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549798 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549758 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag]( checkpointData.flatMap

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549710 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag]( checkpointData.flatMap

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549475 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549439 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549405 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549357 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549336 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549211 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: SPARK-1160: Deprecate toArray in RDD

2014-03-12 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/105#issuecomment-37488227 Looks good to me. Merged into master, thanks!

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/101#issuecomment-37480611 Thanks, this looks like a good bug fix. I just had some questions since I'm not super familiar with the nature of these Hadoop JobIds and such, and some minor styl

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10546476 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -165,12 +174,29 @@ class HadoopRDD[K, V]( override def compute(theSplit

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10546480 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -165,12 +174,29 @@ class HadoopRDD[K, V]( override def compute(theSplit

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10546472 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -165,12 +174,29 @@ class HadoopRDD[K, V]( override def compute(theSplit

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10546478 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -165,12 +174,29 @@ class HadoopRDD[K, V]( override def compute(theSplit

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10546468 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -34,6 +39,7 @@ import org.apache.spark.broadcast.Broadcast import

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10546475 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -165,12 +174,29 @@ class HadoopRDD[K, V]( override def compute(theSplit

[GitHub] spark pull request: SPARK-1128: set hadoop task properties when co...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/101#discussion_r10546469 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -111,6 +117,9 @@ class HadoopRDD[K, V]( protected val

[GitHub] spark pull request: [SPARK-1233] Fix running hadoop 0.23 due to ja...

2014-03-12 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/129#issuecomment-37445337 Merged into master. Thanks!

[GitHub] spark pull request: [SPARK-1233] Fix running hadoop 0.23 due to ja...

2014-03-12 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/129#issuecomment-37438275 Looks like this was introduced in #102. Looks good to me.

[GitHub] spark pull request: [SPARK-1232] Fix the hadoop 0.23 yarn build

2014-03-12 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/127#issuecomment-37437658 This was removed by accident in #91. Looks good to me.

[GitHub] spark pull request: SPARK-1099:Spark's local mode should probably ...

2014-03-12 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/110#issuecomment-37432422 This looks good to me, but I will leave this PR for a little longer in case anyone wants to raise questions about changing the behavior here.

[GitHub] spark pull request: SPARK-1144 Added license and RAT to check lice...

2014-03-11 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/125#issuecomment-37319171 Do we have to include the actual rat docs and jar in our source, instead of being able to pull it down dynamically? The jar itself lends an extra 1.5MB to our source

[GitHub] spark pull request: [SPARK-1186] : Enrich the Spark Shell to suppo...

2014-03-10 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/116#issuecomment-37255431 Here are some of your questions from the previous PR, let me answer inline: 1. *Based [SPARK-929] would it make sense to also include --spark-daemon-memory as an

[GitHub] spark pull request: [SPARK-1186] : Enrich the Spark Shell to suppo...

2014-03-10 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/116#issuecomment-37253813 Jenkins, this is ok to test.

[GitHub] spark pull request: SPARK-1099:Spark's local mode should probably ...

2014-03-10 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/110#discussion_r10442282 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -1204,7 +1204,7 @@ object SparkContext extends Logging { master match

[GitHub] spark pull request: SPARK-929: Fully deprecate usage of SPARK_MEM

2014-03-10 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/99#issuecomment-37213831 You might also look at my comment summarizing how one can configure memory for each of the components here: https://github.com/apache/incubator-spark/pull/615

[GitHub] spark pull request: SPARK-1167: Remove metrics-ganglia from defaul...

2014-03-09 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/108#issuecomment-37136294 Will this require the spark-ec2 scripts to pull in this dependency as well?

[GitHub] spark pull request: SPARK-1205: Clean up callSite/origin/generator...

2014-03-09 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/106#issuecomment-37133978 Just had a couple questions essentially related to our deprecation policy. Change looks good to me otherwise. Thanks for the cleanup!

[GitHub] spark pull request: SPARK-1205: Clean up callSite/origin/generator...

2014-03-09 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/106#discussion_r10414915 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala --- @@ -135,7 +135,11 @@ class JavaRDD[T](val rdd: RDD[T])(implicit val classTag

[GitHub] spark pull request: SPARK-1205: Clean up callSite/origin/generator...

2014-03-09 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/106#discussion_r10414905 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDD.scala --- @@ -135,7 +135,11 @@ class JavaRDD[T](val rdd: RDD[T])(implicit val classTag

[GitHub] spark pull request: SPARK-929: Fully deprecate usage of SPARK_MEM

2014-03-07 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/99#issuecomment-37059772 Jenkins, retest this please.

[GitHub] spark pull request: SPARK-929: Fully deprecate usage of SPARK_MEM

2014-03-07 Thread aarondav
GitHub user aarondav opened a pull request: https://github.com/apache/spark/pull/99 SPARK-929: Fully deprecate usage of SPARK_MEM (Continued from old repo, prior discussion at https://github.com/apache/incubator-spark/pull/615) This patch cements our deprecation of the

[GitHub] spark pull request: SPARK-1136: Fix FaultToleranceTest for Docker ...

2014-03-07 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/5#issuecomment-37052191 Merged into master.

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/98#issuecomment-37049831 Jenkins, this is ok to test.