[GitHub] spark pull request: SPARK-1469: Scheduler mode should accept lower...

2014-04-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/388#discussion_r11558664 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -98,8 +98,12 @@ private[spark] class TaskSchedulerImpl( var

[GitHub] spark pull request: SPARK-1057 (alternative) Remove fastutil

2014-04-12 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/266#issuecomment-40273037 I did not notice this earlier. The toByteArray method is insanely expensive for anything nontrivial. A better solution would be to replace use of

[GitHub] spark pull request: [SPARK-1415] Hadoop min split for wholeTextFil...

2014-04-12 Thread yinxusen
Github user yinxusen commented on the pull request: https://github.com/apache/spark/pull/376#issuecomment-40273076 @mateiz I have to admit that I overlooked the importance of providing `minSplits`. I encountered a problem just now. I have 20,000 files and call `wholeTextFiles(dir)`

[GitHub] spark pull request: SPARK-1057 (alternative) Remove fastutil

2014-04-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/266 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/290#issuecomment-40273164 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14072/

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/290#issuecomment-40273163 Merged build finished. All automated tests passed.

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

2014-04-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/290#issuecomment-40273215 Thanks - merged this and picked it into 1.0. @andrewor14: get some sleep.

[GitHub] spark pull request: SPARK-1469: Scheduler mode should accept lower...

2014-04-12 Thread techaddict
Github user techaddict commented on the pull request: https://github.com/apache/spark/pull/388#issuecomment-40273462 @pwendell Done :+1: anything else?

[GitHub] spark pull request: SPARK-1057 (alternative) Remove fastutil

2014-04-12 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/266#issuecomment-40273651 I think we can replace it with a custom impl - where we decide that it is ok to waste some memory within some threshold in case the copy is much more expensive -

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

2014-04-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/290

[GitHub] spark pull request: Remove extendedDebugInfo option in test build ...

2014-04-12 Thread haosdent
Github user haosdent closed the pull request at: https://github.com/apache/spark/pull/346

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/397 Added a FastByteArrayOutputStream that exposes the underlying array to avoid unnecessary mem copy. This should fix the extra memory copy introduced by #266. @mridulm @pwendell @mateiz You can
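
The idea described in this pull request (exposing the stream's backing array so that callers can skip the copy made by `toByteArray`) can be sketched roughly as follows. This is a minimal Java illustration, not the class from the pull request: the name `ExposedByteArrayOutputStream` and its methods `underlyingArray` and `toByteBuffer` are hypothetical, and the sketch assumes it is acceptable to subclass `java.io.ByteArrayOutputStream` and read its protected `buf` and `count` fields.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical class for illustration only; not the implementation in the pull request.
class ExposedByteArrayOutputStream extends ByteArrayOutputStream {

    ExposedByteArrayOutputStream(int initialCapacity) {
        super(initialCapacity);
    }

    // Returns the internal buffer without copying it.
    // Only the first size() bytes hold written data; the rest is spare capacity.
    synchronized byte[] underlyingArray() {
        return buf;
    }

    // Wraps the written region in a ByteBuffer that shares the internal array.
    synchronized ByteBuffer toByteBuffer() {
        return ByteBuffer.wrap(buf, 0, count);
    }

    public static void main(String[] args) {
        ExposedByteArrayOutputStream out = new ExposedByteArrayOutputStream(16);
        byte[] data = "spark".getBytes(StandardCharsets.UTF_8);
        out.write(data, 0, data.length);

        ByteBuffer view = out.toByteBuffer();
        System.out.println("valid bytes: " + view.remaining()
                + ", capacity: " + out.underlyingArray().length);
    }
}
```

The trade-off in this sketch is that the returned array may be longer than the written data and is still owned by the stream, so a caller has to carry the valid length alongside it and must not keep using the array after further writes, which can reallocate the buffer.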

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40274836 Merged build triggered.

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40274840 Merged build started.

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40274869 Merged build finished.

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40274870 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14073/

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

2014-04-12 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/290#issuecomment-40274906 Thanks. You and @tdas too.

[GitHub] spark pull request: Update WindowedDStream.scala

2014-04-12 Thread baishuo-ailk
Github user baishuo-ailk commented on the pull request: https://github.com/apache/spark/pull/390#issuecomment-40275213 thank you @pwendell

[GitHub] spark pull request: Update WindowedDStream.scala

2014-04-12 Thread baishuo
Github user baishuo commented on the pull request: https://github.com/apache/spark/pull/390#issuecomment-40275299 thank you @pwendell

[GitHub] spark pull request: SPARK-1057 (alternative) Remove fastutil

2014-04-12 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/266#issuecomment-40275465 Hold up a sec -- the array copy is not new. It was merely hidden in the call to `trim()` before, or to `ByteBuffer.allocate()`. Yes, it's better to avoid it if possible.

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/397#discussion_r11559113 --- Diff: core/src/main/scala/org/apache/spark/util/io/FastByteArrayOutputStream.scala --- @@ -0,0 +1,104 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40276213 I actually meant something like this: (This is from an internal WIP branch to tackle the ByteBuffer to Seq[ByteBuffer]) Ideally I should submit this via a PR, but

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40277866 So I think I agree with the overall direction here, but want to make a few comments to clarify why. Apologies if I'm stating the obvious. The management of the

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40278958 Your summarization is fairly accurate @srowen. To add, my initial approach was to subclass to minimize code :-) The reason why I moved away from it was because I did

[GitHub] spark pull request: [WIP] SPARK-1477: Add the lifecycle interface

2014-04-12 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/379#issuecomment-40279118 @andrewor14, @tdas, mind reviewing this?

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40279192 You could deprecate and override `toByteArray` to throw an exception, etc., to be extra-safe. They work, the result just may not have much meaning independently. Your

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40279518 Build triggered.

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40279524 Build started.

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40279828 Merged build triggered.

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40279833 Merged build started.

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40279862 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14075/

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread iven
Github user iven commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40280398 I've finally got this working and fixed several bugs in the original PR. It's really hard to get Spark (0.9 and higher) working on Mesos. Here are some notes: *

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40280554 Merged build triggered.

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40280557 Merged build started.

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40280597 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14074/

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40280596 Build finished. All automated tests passed.

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40281446 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14076/

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40281445 Merged build finished. All automated tests passed.

[GitHub] spark pull request: [SPARK-1157][MLlib] L-BFGS Optimizer based on ...

2014-04-12 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/353#issuecomment-40281879 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-1157][MLlib] L-BFGS Optimizer based on ...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/353#issuecomment-40281918 Merged build triggered.

[GitHub] spark pull request: [SPARK-1157][MLlib] L-BFGS Optimizer based on ...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/353#issuecomment-40281922 Merged build started.

[GitHub] spark pull request: Add role and checkpoint support for Mesos back...

2014-04-12 Thread iven
Github user iven commented on the pull request: https://github.com/apache/spark/pull/60#issuecomment-40282298 I've no idea why the test fails.

[GitHub] spark pull request: [SPARK-1157][MLlib] L-BFGS Optimizer based on ...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/353#issuecomment-40283049 Merged build finished. All automated tests passed.

[GitHub] spark pull request: [SPARK-1157][MLlib] L-BFGS Optimizer based on ...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/353#issuecomment-40283050 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14077/

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40284529 There are two issues here: a) If we are going to override and deprecate/throw exception for every method which is not exposed by OutputStream - while overriding

[GitHub] spark pull request: [SPARK-1403] Move the class loader creation ba...

2014-04-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/322#issuecomment-40284736 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-1403] Move the class loader creation ba...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/322#issuecomment-40284773 Merged build triggered.

[GitHub] spark pull request: [SPARK-1403] Move the class loader creation ba...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/322#issuecomment-40284781 Merged build started.

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-04-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-40284983 @sryza this is failing due to a python syntax error. In general if you wouldn't mind it would be good to run tests locally before pushing, since spinning up the test

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40285054 Sure, I myself was not suggesting that we should make them throw exceptions. If one really wanted to prohibit their use, that would be a way to do so even when

[GitHub] spark pull request: [SPARK-1403] Move the class loader creation ba...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/322#issuecomment-40285726 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14078/

[GitHub] spark pull request: [SPARK-1403] Move the class loader creation ba...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/322#issuecomment-40285724 Merged build finished. All automated tests passed.

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread ahirreddy
Github user ahirreddy commented on a diff in the pull request: https://github.com/apache/spark/pull/363#discussion_r11560720 --- Diff: docs/sql-programming-guide.md --- @@ -318,4 +391,24 @@ Row[] results = hiveCtx.hql("FROM src SELECT key, value").collect(); </div>

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-40286775 Merged build started.

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-40286771 Merged build triggered.

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-04-12 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-40286819 My bad. I made a change after running tests and should have re-run them. Posted a patch that fixes the syntax error.

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread ahirreddy
Github user ahirreddy commented on a diff in the pull request: https://github.com/apache/spark/pull/363#discussion_r11560776 --- Diff: python/pyspark/rdd.py --- @@ -1387,6 +1387,95 @@ def _jrdd(self): def _is_pipelinable(self): return not (self.is_cached or

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread ahirreddy
Github user ahirreddy commented on a diff in the pull request: https://github.com/apache/spark/pull/363#discussion_r11560795 --- Diff: python/pyspark/rdd.py --- @@ -1387,6 +1387,95 @@ def _jrdd(self): def _is_pipelinable(self): return not (self.is_cached or

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread ahirreddy
Github user ahirreddy commented on a diff in the pull request: https://github.com/apache/spark/pull/363#discussion_r11560881 --- Diff: python/pyspark/rdd.py --- @@ -1387,6 +1387,95 @@ def _jrdd(self): def _is_pipelinable(self): return not (self.is_cached or

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-40287754 Merged build finished. All automated tests passed.

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-40287755 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14079/

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/397#discussion_r11561006 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/util/RawTextSender.scala --- @@ -43,15 +45,15 @@ object RawTextSender extends Logging {

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/397#discussion_r11561015 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -1001,9 +1003,9 @@ private[spark] class BlockManager( blockId:

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40288679 Having a toByteBuffer method definitely seems reasonable to me, the only issue is that ByteBuffer does not provide a good stream-compatible API. So it would either still

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread ahirreddy
Github user ahirreddy commented on a diff in the pull request: https://github.com/apache/spark/pull/363#discussion_r11561160 --- Diff: python/pyspark/rdd.py --- @@ -1387,6 +1387,95 @@ def _jrdd(self): def _is_pipelinable(self): return not (self.is_cached or

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread ahirreddy
Github user ahirreddy commented on a diff in the pull request: https://github.com/apache/spark/pull/363#discussion_r11561162 --- Diff: python/pyspark/context.py --- @@ -460,6 +463,225 @@ def sparkUser(self): return self._jsc.sc().sparkUser()

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread ahirreddy
Github user ahirreddy commented on a diff in the pull request: https://github.com/apache/spark/pull/363#discussion_r11561164 --- Diff: python/run-tests --- @@ -56,6 +56,9 @@ run_test pyspark/mllib/clustering.py run_test pyspark/mllib/recommendation.py run_test

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/363#issuecomment-40289144 Merged build triggered.

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/363#issuecomment-40289149 Merged build started.

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40289835 @aarondav I personally like your second method. That alone is probably just what is needed. Callers who actually want a `ByteBuffer` can wrap easily with this info. In

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/363#issuecomment-40289932 Merged build triggered.

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/363#issuecomment-40289938 Merged build started.

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/363#issuecomment-40291396 Merged build finished.

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/363#issuecomment-40291397 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14080/

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/363#issuecomment-40292065 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14081/

[GitHub] spark pull request: SPARK-1374: PySpark API for SparkSQL

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/363#issuecomment-40292064 Merged build finished.

[GitHub] spark pull request: SPARK-1057: Upgrade fastutil to 6.5.11

2014-04-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/215#issuecomment-40294619 @velvia mind closing this?

[GitHub] spark pull request: [Fix #204] Update out-dated comments

2014-04-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/381

[GitHub] spark pull request: SPARK-1057: Upgrade fastutil to 6.5.11

2014-04-12 Thread velvia
Github user velvia closed the pull request at: https://github.com/apache/spark/pull/215

[GitHub] spark pull request: SPARK-1057: Upgrade fastutil to 6.5.11

2014-04-12 Thread velvia
Github user velvia commented on the pull request: https://github.com/apache/spark/pull/215#issuecomment-40295287 Ok.

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40295319 Ok pushed a new version that avoids the extra trim.

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40295427 Merged build started.

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40295467 Merged build finished.

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40295468 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14082/

[GitHub] spark pull request: SPARK-1470: Spark logger moving to use scala-l...

2014-04-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/332#issuecomment-40295844 We should wait until questions like these are answered before we move to scala-logging. https://github.com/typesafehub/scala-logging/issues/4

[GitHub] spark pull request: Added a FastByteArrayOutputStream that exposes...

2014-04-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/397#issuecomment-40296425 @rxin hm looks like this RAT exclude isn't working. Can take another crack at it later tonight. https://github.com/apache/spark/blob/master/.rat-excludes#L43

[GitHub] spark pull request: Yarn: do not set local IP in remote process en...

2014-04-12 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/394#issuecomment-40296579 Jenkins, test this please

[GitHub] spark pull request: Yarn: do not set local IP in remote process en...

2014-04-12 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/394#issuecomment-40296582 Seems like a good catch.

[GitHub] spark pull request: Yarn: do not set local IP in remote process en...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/394#issuecomment-40296590 Merged build started.

[GitHub] spark pull request: pyspark need Py2 to work, graceful and helping...

2014-04-12 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/392#issuecomment-40296603 Is there any way to do this test in Python instead of in bash? It looks like complicated and potentially brittle bash code.

[GitHub] spark pull request: Yarn: do not set local IP in remote process en...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/394#issuecomment-40296586 Merged build triggered.

[GitHub] spark pull request: Yarn: do not set local IP in remote process en...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/394#issuecomment-40296609 Merged build finished.

[GitHub] spark pull request: Yarn: do not set local IP in remote process en...

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/394#issuecomment-40296610 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14083/

[GitHub] spark pull request: SPARK-1426: Make MLlib work with NumPy version...

2014-04-12 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/391#issuecomment-40296943 Is 1.6 the oldest version it works with now, or could it also work with 1.5 or older?

[GitHub] spark pull request: [SPARK-1415] Hadoop min split for wholeTextFil...

2014-04-12 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/376#issuecomment-40296996 While a minSplits for all New API Hadoop files would be useful, I think that's too complicated to do in 1.0, so it would be fine to just add it for wholeTextFiles now.

[GitHub] spark pull request: [SPARK-1415] Hadoop min split for wholeTextFil...

2014-04-12 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/376#discussion_r11562788 --- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala --- @@ -24,10 +24,13 @@ import org.apache.hadoop.conf.{Configurable, Configuration}

[GitHub] spark pull request: [SPARK-1415] Hadoop min split for wholeTextFil...

2014-04-12 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/376#issuecomment-40297045 BTW the current approach looks good, we should just merge this for now and maybe open a JIRA for the other types of files.

[GitHub] spark pull request: update spark.default.parallelism

2014-04-12 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/389#issuecomment-40297101 Good catch

[GitHub] spark pull request: update spark.default.parallelism

2014-04-12 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/389#issuecomment-40297100 Jenkins, test this please

[GitHub] spark pull request: update spark.default.parallelism

2014-04-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/389#issuecomment-40297146 Merged build triggered.
