[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10597200 --- Diff: core/src/main/scala/org/apache/spark/ui/SparkUI.scala --- @@ -68,19 +105,53 @@ private[spark] class SparkUI(sc: SparkContext) extends Logging {

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10597228 --- Diff: core/src/main/scala/org/apache/spark/ui/SparkUI.scala --- @@ -68,19 +105,53 @@ private[spark] class SparkUI(sc: SparkContext) extends Logging {

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37620617 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10597778 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -196,13 +152,46 @@ class DAGScheduler( })) } -

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10597813 --- Diff: core/src/main/scala/org/apache/spark/scheduler/JobLogger.scala --- @@ -22,24 +22,25 @@ import java.text.SimpleDateFormat import java.util.{Date,

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10597888 --- Diff: core/src/main/scala/org/apache/spark/scheduler/JobLogger.scala --- @@ -80,187 +81,78 @@ class JobLogger(val user: String, val logDirName: String)

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10598046 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerStatusListener.scala --- @@ -0,0 +1,63 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37623099 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37623106 Merged build finished.

[GitHub] spark pull request: SPARK-1104: kill Process in workerThread of Ex...

2014-03-14 Thread lalaguozhe
Github user lalaguozhe commented on the pull request: https://github.com/apache/spark/pull/35#issuecomment-37634728 up

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37634990 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37638767 Merged build finished.

[GitHub] spark pull request: SPARK-1170-pyspark-histogram: added histogram ...

2014-03-14 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/122#issuecomment-37642205 Hi Daniel, thanks for the patch. It would be good to separate out the implementation of min/max into a different PR and provide RDD.min and RDD.max

[GitHub] spark pull request: SPARK-1246, added min max API to Double RDDs i...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/140#issuecomment-37642655 Merged build started.

[GitHub] spark pull request: SPARK-1246, added min max API to Double RDDs i...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/140#issuecomment-37642654 Merged build triggered.

[GitHub] spark pull request: SPARK-1246, added min max API to Double RDDs i...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/140#issuecomment-37647358 Merged build finished.

cloudera repo down again - mqtt

2014-03-14 Thread Tom Graves
It appears the cloudera repo for the mqtt stuff is down again. Did someone ping them the last time? Can we pick this up from some other repo? [ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.4:process (default) on project spark-examples_2.10: Error

Re: cloudera repo down again - mqtt

2014-03-14 Thread Sean Owen
Repo is fine: http://repository.cloudera.com/artifactory/cloudera-repos/ This artifact has never been in the Cloudera repo actually. As I mentioned I am able to build successfully even if the repo is completely gone, as a result. The Spark build actually does not use the Cloudera repo, as

Re: cloudera repo down again - mqtt

2014-03-14 Thread sandy . ryza
Our guys are looking into it. I'll post when things are back up. -Sandy On Mar 14, 2014, at 7:37 AM, Tom Graves tgraves...@yahoo.com wrote: It appears the cloudera repo for the mqtt stuff is down again. Did someone ping them the last time? Can we pick this up from some other repo?

Re: cloudera repo down again - mqtt

2014-03-14 Thread Tom Graves
Thanks Sean, I assume you are building with Maven and not sbt? It completely fails for me. Maven version is 3.1.0. I'm also building for yarn but I don't think that matters (mvn -Dyarn.version=0.23.10 -Dhadoop.version=0.23.10 -Pyarn-alpha package -DskipTests). I changed the pom to use

Re: spark config params conventions

2014-03-14 Thread Chester Chen
Based on the Typesafe Config maintainer's response, with the latest version of Typesafe Config, double quotes are no longer needed for keys like spark.speculation, so you don't need code to strip the quotes. Chester, Alpine Data Labs. On Mar 12, 2014, at 2:50 PM, Aaron Davidson

Re: cloudera repo down again - mqtt

2014-03-14 Thread Sean Owen
Yes, I'm using Maven 3.2.1. Actually, scratch that, it fails for me too once it gets down into the MQTT module, with a clearer error: sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: timestamp check failed for project

[GitHub] spark pull request: SPARK-1170-pyspark-histogram: added histogram ...

2014-03-14 Thread dwmclary
Github user dwmclary commented on the pull request: https://github.com/apache/spark/pull/122#issuecomment-37666086 Prashant, Thanks for the review! I'll break out the min/max bit and submit a PR for 1246 later today and then just leave the histogram bit on 1170.

[GitHub] spark pull request: SPARK-1246, added min max API to Double RDDs i...

2014-03-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/140#discussion_r10616091 --- Diff: project/build.properties --- @@ -14,4 +14,4 @@ # See the License for the specific language governing permissions and # limitations under

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10616395 --- Diff: core/src/main/scala/org/apache/spark/scheduler/JobLogger.scala --- @@ -22,24 +22,25 @@ import java.text.SimpleDateFormat import

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10616539 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerStatusListener.scala --- @@ -0,0 +1,63 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10616549 --- Diff: core/src/main/scala/org/apache/spark/storage/PutResult.scala --- @@ -20,7 +20,13 @@ package org.apache.spark.storage import

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10616570 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -196,13 +152,46 @@ class DAGScheduler( })) } -

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10616671 --- Diff: core/src/main/scala/org/apache/spark/ui/SparkUI.scala --- @@ -68,19 +105,53 @@ private[spark] class SparkUI(sc: SparkContext) extends Logging {

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread marmbrus
GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/141 Fix serialization of MutablePair. Also provide an interface for easy updating. You can merge this pull request into a Git repository by running: $ git pull https://github.com/marmbrus/spark

[GitHub] spark pull request: Don't swallow all kryo errors, only those that...

2014-03-14 Thread marmbrus
GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/142 Don't swallow all kryo errors, only those that indicate we are out of data. You can merge this pull request into a Git repository by running: $ git pull https://github.com/marmbrus/spark

[GitHub] spark pull request: SPARK-1246, added min max API to Double RDDs i...

2014-03-14 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/140#discussion_r10618038 --- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala --- @@ -86,14 +92,9 @@ class DoubleRDDFunctions(self: RDD[Double]) extends

[GitHub] spark pull request: Don't swallow all kryo errors, only those that...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/142#issuecomment-37675315 Merged build started.

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37676117 Merged build triggered.

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37676157 Looks good to me.

[GitHub] spark pull request: Don't swallow all kryo errors, only those that...

2014-03-14 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/142#issuecomment-37676276 Good catch, merging into master. We may want to merge this into branch-0.9 as well, @pwendell any thoughts?

[GitHub] spark pull request: Don't swallow all kryo errors, only those that...

2014-03-14 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/142#issuecomment-37678686 Looks good but maybe make the test `e.getMessage.toLowerCase.contains("buffer underflow")`, in case they change the wording.
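
A minimal sketch of the idea in this comment, written in Python rather than the Scala of the pull request: treat an error as "out of data" only when its message matches case-insensitively, and re-raise everything else. `FakeSerializerError`, `is_end_of_data`, and `read_all` are hypothetical names for illustration; none of this is the code from the pull request or a Kryo API.

```python
class FakeSerializerError(Exception):
    """Hypothetical stand-in for a serializer exception; not a real Kryo class."""


def is_end_of_data(exc):
    # Lower-case the message so a change in capitalization does not break the check.
    return "buffer underflow" in str(exc).lower()


def read_all(read_next):
    """Call read_next() until it signals end of data; re-raise any other error."""
    items = []
    while True:
        try:
            items.append(read_next())
        except FakeSerializerError as exc:
            if is_end_of_data(exc):
                return items  # expected: the stream is exhausted
            raise  # unexpected: do not swallow it
```

Matching on message text remains fragile; lower-casing only guards against capitalization changes, not a full rewording.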

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/141#discussion_r10619938 --- Diff: core/src/main/scala/org/apache/spark/util/MutablePair.scala --- @@ -25,10 +25,20 @@ package org.apache.spark.util * @param _2 Element 2 of

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/144#discussion_r10620079 --- Diff: python/pyspark/rdd.py --- @@ -24,6 +24,7 @@ import sys import shlex import traceback +from bisect import bisect_right --- End

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37679298 It might be better to implement `RDD.min` and `RDD.max` with `reduce` directly instead of building a whole StatCounter for them. Also, can you add these to the Java/Scala

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37679480 Changed to update.

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37679595 Can you check whether this is broken in Python too, and fix it there as well?

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37681456 Merged build finished.

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37681462 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13182/

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37681461 Merged build finished.

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37681460 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13183/

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37681668 Merged build triggered.

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37681669 Merged build started.

Re: cloudera repo down again - mqtt

2014-03-14 Thread Sean Owen
PS the Cloudera cert issue was cleared up a few hours ago; give it a spin. On Fri, Mar 14, 2014 at 8:22 AM, Sean Owen so...@cloudera.com wrote: Yes, I'm using Maven 3.2.1. Actually, scratch that, it fails for me too once it gets down into the MQTT module, with a clearer error:

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/143#discussion_r10622183 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala --- @@ -533,7 +533,7 @@ abstract class DStream[T: ClassTag] (

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/143#discussion_r10622197 --- Diff: core/src/test/scala/org/apache/spark/serializer/ProactiveClosureSerializationSuite.scala --- @@ -0,0 +1,79 @@ +/* + * Licensed to the

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37686977 Ah sorry I didn't see that clean() gets called when the RDD is created and not just when the job is submitted. I think the check in DAGScheduler should be removed

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread willb
Github user willb commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37693355 A configuration option makes sense to me and I'm happy to add it. Let me know if you have strong feelings about what it should be called.

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread dwmclary
Github user dwmclary commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37694293 Matei, I updated the branch to do just that. Thanks for the review!

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37699807 Merged build triggered.

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37699806 Merged build started.

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37702868 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13185/

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37703032 Merged build started.

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37703031 Merged build triggered.

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37704877 Merged build finished.

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37704961 @mateiz, done~

Re: test cases stuck on local-cluster mode of ReplSuite?

2014-03-14 Thread Michael Armbrust
Sorry to revive an old thread, but I just ran into this issue myself. It is likely that you do not have the assembly jar built, or that you have SPARK_HOME set incorrectly (it does not need to be set). Michael On Thu, Feb 27, 2014 at 8:13 AM, Nan Zhu zhunanmcg...@gmail.com wrote: Hi, all

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread marmbrus
GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/146 SPARK-1251 Support for optimizing and executing structured queries This pull request adds support to Spark for working with structured data using a simple SQL dialect, HiveQL and a Scala Query DSL.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37708044 Merged build triggered.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37708046 Merged build started.

[GitHub] spark pull request: SPARK-1242 Add aggregate to python rdd

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/139#issuecomment-37708361 Merged build triggered.

[GitHub] spark pull request: SPARK-1242 Add aggregate to python rdd

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/139#issuecomment-37708362 Merged build started.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/146#discussion_r10631616 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala --- @@ -0,0 +1,328 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-1242 Add aggregate to python rdd

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/139#issuecomment-37710423 Merged build finished.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37710426 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13187/

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37710424 Merged build finished.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37710500 Merged build started.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/146#discussion_r10632433 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ExpressionEvaluationSuite.scala --- @@ -0,0 +1,115 @@ +/* + * Licensed to

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37711109 The examples that you added are awesome!!!

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37711356 I know @pwendell has expressed concern about config option bloat so maybe he has an opinion here...I would be in favor of not adding a config option because it's a

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37712122 Merged build finished.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37712123 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13189/

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37712447 Hey Matei, for a large dataset someone might want to do it once; with StatCounter, all of the numbers are calculated in one go.

[GitHub] spark pull request: SPARK-1246, added min max API to Double RDDs i...

2014-03-14 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/140#discussion_r10632860 --- Diff: project/build.properties --- @@ -14,4 +14,4 @@ # See the License for the specific language governing permissions and # limitations under

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37712645 Ahh, I understand the downside: that would be just for numbers then. Makes sense. Maybe we can have both?

[GitHub] spark pull request: SPARK-1246, added min max API to Double RDDs i...

2014-03-14 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/140#discussion_r10632880 --- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala --- @@ -86,14 +92,9 @@ class DoubleRDDFunctions(self: RDD[Double]) extends

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-14 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37713958 Well done, the PR can fix [SPARK-1248](https://spark-project.atlassian.net/browse/SPARK-1248)

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-03-14 Thread Matei Zaharia
I like the pom-reader approach as well — in particular, that it lets you add extra stuff in your SBT build after loading the dependencies from the POM. Profiles would be the one thing missing to be able to pass options through. Matei On Mar 14, 2014, at 10:03 AM, Patrick Wendell

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/144#discussion_r10633002 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala --- @@ -477,6 +477,16 @@ trait JavaRDDLike[T, This : JavaRDDLike[T, This]] extends

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/144#discussion_r10633001 --- Diff: core/src/test/scala/org/apache/spark/PartitioningSuite.scala --- @@ -171,6 +171,8 @@ class PartitioningSuite extends FunSuite with SharedSparkContext

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37714253 Yeah sorry, I didn't mean to leave out max and min from StatCounter; I just meant that the RDD.max() and RDD.min() methods should directly call reduce. If you're calling

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread andrewor14
GitHub user andrewor14 opened a pull request: https://github.com/apache/spark/pull/147 [SPARK-1244] Throw exception if map output status exceeds frame size In the existing code, this fails silently... You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37716333 Merged build started.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37716332 Merged build triggered.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10633389 --- Diff: core/src/main/scala/org/apache/spark/scheduler/JobLogger.scala --- @@ -80,187 +81,78 @@ class JobLogger(val user: String, val logDirName: String)

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37717211 Merged build finished.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37717261 Build triggered.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37717262 Build started.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10633506 --- Diff: core/src/main/scala/org/apache/spark/ui/UIReloader.scala --- @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37717648 I did: https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=e19044cb1048c3755d1ea2cb43879d2225d49b54

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37717834 @marmbrus mind closing this? Somehow GitHub didn't detect the close id correctly.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10633600 --- Diff: core/src/test/scala/org/apache/spark/MapOutputTrackerSuite.scala --- @@ -136,4 +123,47 @@ class MapOutputTrackerSuite extends FunSuite with

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718089 Build triggered.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718090 Build started.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10633644 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId: Int)
