Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/180#issuecomment-35963553
(`sbt/sbt scalastyle` runs its namesake, by the way)
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. To do so, please top-post your reply.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/582#issuecomment-35954244
Looks good to me! Will merge soon, if no one else has comments.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/582#issuecomment-35919353
Let's just try it again - Jenkins, retest this please.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/586#issuecomment-35838441
Ah, great, that'll make it simple. We can only merge at the granularity of
PRs, so it'd be great if you could split the dependency stuff into its own
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/605#issuecomment-35821340
Thanks, LGTM! Merged into master.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/419#issuecomment-35820427
Looks like all comments were addressed (...a while ago). Unfortunately,
this patch conflicts with the style cleanup patch, and is not currently in a
mergeable
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/incubator-spark/pull/508#discussion_r9973811
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -394,6 +394,17 @@ abstract class RDD[T: ClassTag](
def ++(other: RDD[T
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/586#issuecomment-35820063
Do you think it would be possible to separate this into two patches, one
for the code warnings and one for the build issues? Code warnings are
relatively easy
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/incubator-spark/pull/586#discussion_r9973707
--- Diff: project/SparkBuild.scala ---
@@ -340,7 +336,8 @@ object SparkBuild extends Build {
def streamingSettings = sharedSettings ++ Seq
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/634#issuecomment-35818896
Looks good to me. Just ran FaultToleranceTest with these updates and it ran
just fine. Merging into master and branch-0.9.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/615#issuecomment-35818768
Updated to de-publicize SPARK_DRIVER_MEMORY. `spark-class` is
currently only used in the following cases:
- spark-shell (which has a memory command
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/621#issuecomment-35818246
Thanks! Merged into master.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#issuecomment-35818159
This looks great to me. "curator-recipes-2.4.0" includes a total of 3
curator dependencies, ZK 3.4.5 (same version as before), and guava 14.0.1 (sam
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/582#issuecomment-35817227
There is some desire to get this change in as we have a PR in the pipeline
that will use our JSON serialization library more heavily to serialize and log
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/incubator-spark/pull/634#discussion_r9973330
--- Diff: docker/spark-test/base/Dockerfile ---
@@ -25,8 +25,8 @@ RUN apt-get update
# install a few other useful packages plus Open Jdk 7
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/incubator-spark/pull/614#discussion_r9972843
--- Diff: repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala ---
@@ -876,7 +876,14 @@ class SparkILoop(in0: Option[BufferedReader
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/incubator-spark/pull/630#discussion_r9958810
--- Diff: python/pyspark/rdd.py ---
@@ -916,6 +916,11 @@ def flatMapValues(self, f):
Pass each value in the key-value pair RDD through a
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/625#issuecomment-35762793
Thanks! Merged in master and branch-0.9.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/624#issuecomment-35705443
Thanks @guojc and @andrewor14! LGTM. Maybe @pwendell wants to take a look
as well?
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/625#issuecomment-35704996
LGTM save the minor regex nit
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/incubator-spark/pull/625#discussion_r9938103
--- Diff: core/src/main/scala/org/apache/spark/deploy/ClientArguments.scala
---
@@ -115,3 +110,7 @@ private[spark] class ClientArguments(args
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/incubator-spark/pull/623#discussion_r9930713
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
---
@@ -77,6 +77,7 @@ class PairRDDFunctions[K: ClassTag, V: ClassTag
GitHub user aarondav opened a pull request:
https://github.com/apache/incubator-spark/pull/623
Super minor: Add require for mergeCombiners in combineByKey
We changed the behavior in 0.9.0 from requiring that mergeCombiners be null
when mapSideCombine was false to requiring that
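For context, a `require` precondition of the sort described would look roughly like this -- a minimal sketch with a simplified signature and an assumed error message, not the actual PairRDDFunctions code:

```scala
// Hypothetical sketch: fail fast with a clear message when the caller
// omits mergeCombiners, instead of hitting a NullPointerException later.
def combineByKey[K, V, C](
    createCombiner: V => C,
    mergeValue: (C, V) => C,
    mergeCombiners: (C, C) => C): Unit = {
  require(mergeCombiners != null, "mergeCombiners must be defined")
  // ... combine logic would follow here ...
}
```

Calling it with a null `mergeCombiners` then raises `IllegalArgumentException` up front.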
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/618#issuecomment-35529528
LGTM too, apologies for not catching this issue earlier.
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/incubator-spark/pull/618#discussion_r9875739
--- Diff: docs/scala-programming-guide.md ---
@@ -17,12 +17,12 @@ This guide shows each of these features and walks
through some samples. It assum
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/incubator-spark/pull/611#discussion_r9875482
--- Diff:
core/src/main/scala/org/apache/spark/deploy/master/ZooKeeperLeaderElectionAgent.scala
---
@@ -18,105 +18,73 @@
package
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#discussion_r9855451
Whoops, was thinking that preStart was called on restart as well, for some
reason.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/610#issuecomment-35470142
Don't worry about changing the return types on new lines for code that
you're not directly altering otherwise. We mostly decided this change of styl
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#issuecomment-35469138
Everything seems in order, just a few comments outstanding. I've run the
FaultToleranceTest and all tests passed. I had to update it, though (for
instance
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#discussion_r9855077
Make sure to close latch and zk before restarting, since we construct new
ones.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#discussion_r9854099
It's as you said, this would be a significant change. Still, I think the
correct way to implement this now would probably be to have the Master
subscribe
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#discussion_r9854034
Looks like if you did not use a sameThreadExecutor, it could execute
outside of the synchronization block, but I think you're right that we're
techni
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/615#issuecomment-35464406
Changed to be SPARK_DRIVER_MEMORY
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/605#discussion_r9850119
The exact same confusion led me to centralize and document this :)
Just as a stylistic thing, I think we can put these on one line (that's the
idea
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/605#issuecomment-35454755
Ah, got it. I actually didn't realize that StorageLevel.scala is the Scala
version, and StorageLevels is the Java version (clear naming scheme we got
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/610#discussion_r9849854
Unfortunately, we decided not to go with the return type on a new line
style (see dev list). The acceptable style is one of these two:
```
def
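The snippet is cut off above, but the two accepted forms were presumably along these lines (illustrative method names, not the original example):

```scala
// Style 1: the whole signature, return type included, on one line.
def add(a: Int, b: Int): Int = a + b

// Style 2: parameters wrapped, but the return type still ends the
// signature line rather than moving to a new line of its own.
def addAll(
    first: Int,
    second: Int,
    third: Int): Int = {
  first + second + third
}
```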
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/610#discussion_r9849773
perhaps you meant to add a type here?
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/616#issuecomment-35453423
Also merged into branch-0.9, as we also use scala-2.10.3 there.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/616#issuecomment-35452922
Thanks! Merged.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/613#issuecomment-35444870
Thanks! Merged into master. This PR was rebased up to the previous master
HEAD, so there are no commits that could've conflicted and result in dupl
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/615#issuecomment-35433301
I'm amenable to switching to SPARK_DRIVER_MEMORY. I avoided it for two
reasons:
1) Following SPARK_MEM -> SPARK_DRIVER_MEM / SPARK_EXECUTOR_MEM
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/613#issuecomment-35432325
LGTM too after @rxin's change. Since Double is used throughout that file, I
think renaming JDouble is cleanest.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/605#issuecomment-35432171
Any reason not to convert StorageLevels as well?
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/605#discussion_r9841053
I'm like 85% sure we can use scala.Serializable (which extends
java.io.Serializable) and should not require an import.
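That 85% can be checked directly: in Scala 2.x, unqualified `Serializable` resolves to `scala.Serializable` (auto-imported from package `scala`), which extends `java.io.Serializable`:

```scala
// scala.Serializable is visible without any import and extends
// java.io.Serializable, so Java serializability comes for free.
class Payload extends Serializable

assert(classOf[java.io.Serializable].isAssignableFrom(classOf[Payload]))
```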
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/605#discussion_r9841070
same here
`def returnType(): ClassTag[R] = JavaSparkContext.fakeClassTag`
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/605#discussion_r9840846
Hm, let's use JavaSparkContext's fakeClassTag method, so we link to the
documentation on why we're doing this. You may have to rebase to get
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/613#discussion_r9839626
Nice catch on this one.
GitHub user aarondav opened a pull request:
https://github.com/apache/incubator-spark/pull/615
SPARK-929: Fully deprecate usage of SPARK_MEM
This patch cements our deprecation of the SPARK_MEM environment variable by
replacing it with three more specialized variables
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/585#discussion_r9834854
This and the one below were changed to private APIs. Also
cogroupResult2ToJava, but that's not here for some reason.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/613#discussion_r9834309
Sorry, just meant there are 2 newlines between the last import and the
following Javadoc. Not a change caused by you, but I thought I'd mention it
a
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/613#issuecomment-35414789
Woo-hoo! Thanks for doing this. This could generate a lot of merge
conflicts, so I will merge after comments are addressed. Since everything
compiles, it
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/613#discussion_r9834189
nice! Deleted imports make me happy...
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/613#discussion_r9834148
extra whitespace here
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/613#discussion_r9834055
These imports of Some are not actually required, just inserted by IDEs
occasionally. This is actually the only remaining usage in our code base, so
let's
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/613#discussion_r9834070
braces not required
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/613#discussion_r9833979
Feel free to put them on the same line, we discussed that overly long
imports are perfectly fine.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/612#issuecomment-35413258
Great find here! Thanks very much for submitting this patch. I just had a
suggestion regarding the style of your check. It might be neat if we could add
a test
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/612#issuecomment-35413065
Jenkins, this is ok to test.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/612#discussion_r9833685
This could be written pretty trivially using an Ordering, but I fear for
the performance characteristics here, since this is a fast path. How does this
look
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#issuecomment-35361778
Thanks for doing this! Took a very superficial pass, will attempt a deeper
one later. Have you tried running FaultToleranceTest? If you're not on a Linux
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#discussion_r9815357
This statement doesn't really give us anything if we're not behind a lock.
I think we should probably synchronize this method and notLeader and
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#discussion_r9815316
This suggests we may be able to do away with the actor-ness of this class
-- I haven't given it sufficient thought, but I suspect it's the case.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#discussion_r9815286
Let's not have a default value for this and throw an error if it's not
defined. (I know that wasn't the original behavior, but I think this behavio
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#discussion_r9815294
extra newline?
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/611#discussion_r9815215
explicit return type
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/599#issuecomment-35331962
Thanks! Merged into master.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/536#issuecomment-35312964
Thanks! Merged into master.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/572#discussion_r9801137
(1 to folds) is preferred, your style is fine though we use 2 space wrapped
indents instead of 4. Would this be possible, though?
```
(1 to folds).map
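The snippet is cut off above, but the shape being suggested is presumably something like this (placeholder per-fold work, 2-space wrapped indents):

```scala
// Hypothetical sketch: produce one result per fold with map,
// rather than accumulating into a mutable buffer.
val folds = 3
val results = (1 to folds).map { fold =>
  // the real per-fold work (splitting data, training, etc.) would go
  // here; a placeholder computation stands in for it
  fold * fold
}
```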
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/608#issuecomment-35307471
Thanks!! Merged in master and branch-0.9. We should've listened to the
"suspicious shadowing" warning in the first place, no good comes of those
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/607#issuecomment-35307506
Jenkins, test this please.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/403#issuecomment-35307150
Jenkins, add to whitelist and ok to test. (c'mon Jenkins, I believe in you)
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/605#issuecomment-35305983
Jenkins whitelist is separate from Jenkins admins, the latter of which is
explicitly managed by the people who run the AMPLab Jenkins, alas.
Jenkins
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/606#discussion_r9798640
Did you add this class? I don't see it in this PR.
also, nit: the previous style was more correct, putting this parameter on
the next line (and both ind
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/475#issuecomment-35236088
Maybe Jenkins was down last time. Jenkins, test this please.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/403#issuecomment-35236003
Jenkins, test this please.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/605#issuecomment-35228095
Moving these to src/main/java is a good idea, but I wonder if most of these
files would be better refactored into Scala. This should be trivial for all
except
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/604#discussion_r9781736
Went ahead and made these private[spark]. It seems unlikely that anyone
else would use these methods, and making them private means that the type
parameter
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/599#issuecomment-35226986
Jenkins, test this please.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/600#issuecomment-35226744
No need for your renaming PR @punya -- I've already taken care of it in
#604 :) Thanks!
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/597#issuecomment-35226013
What is the use-case you have in mind here? Just some sort of final status
of all executors right before terminating a job/shell?
If you're
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/600#issuecomment-35225376
This PR doesn't need to block on this discussion, since it doesn't actually
rely on the fake ClassTag, so I've merged it into master.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/599#discussion_r9781258
update or remove comment at the top of this file that talks about the
options -- removal is fine since we have the list in code now
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/599#discussion_r9781246
pattern should start with ^ and end with $ -- just tried with something
like "4gz" and it passed
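To illustrate with an assumed memory-string pattern (not necessarily the one in the PR):

```scala
// Unanchored, "4gz" still contains a valid substring match ("4g"),
// so a scan such as findFirstIn accepts the malformed input.
val unanchored = "[0-9]+[mMgG]".r
assert(unanchored.findFirstIn("4gz").isDefined)

// Anchored with ^ and $, the entire string must be a memory size,
// so "4gz" is rejected while "4g" still passes.
val anchored = "^[0-9]+[mMgG]$".r
assert(anchored.findFirstIn("4gz").isEmpty)
assert(anchored.findFirstIn("4g").isDefined)
```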
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/599#discussion_r9781248
change OPTIONS to SPARK_SHELL_OPTS!
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/599#discussion_r9781252
nit: maybe "the maximum number of cores to be used by the spark shell"
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/599#discussion_r9781247
update for -em
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/536#discussion_r9780371
I don't think a semicolon is grammatically correct here. In fact, I think
"The `updateStateByKey` operation allows you to maintain some state
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/536#discussion_r9780346
Not sure if this is the section you were talking about:
http://kafka.apache.org/documentation.html#kafkahadoopconsumerapi
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/568#issuecomment-35221890
Merged in master and branch-0.9. Thanks!
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/573#issuecomment-35221813
I'm not 100% caught up on the state of this issue. Is #570 a "complete fix"
for this issue, or is this PR still the best fix we have in the pipe
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/599#discussion_r9780214
Most of the rest of the code in this file uses echo, so I would avoid
changing this unless you have a good reason.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/599#discussion_r9780203
A slightly confusing point is that the driver is in the same memory space
as the shell, and we're really just controlling the memory of the shell process
i
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/599#discussion_r9780175
It is perhaps unfortunate that SPARK_WORKER_MEMORY actually does exist, and
controls the total amount of memory that a worker can lease across all
executors on a
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/603#issuecomment-35221556
Sorry if this is a stupid question, but why do we need Tachyon JSPs? Are we
going to host Tachyon pages from our own UI?
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/603#discussion_r9780128
I don't know much about log4j, but could this accidentally override Spark's
own logging properties if it is the first log4j.properties file fo
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/604#discussion_r9780111
These changes may actually be problematic, as they are part of a publicly
accessible API. Even removing the specialization of V is not really
backwards
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/604#discussion_r9780099
ClassTags do not store generic information, so here we are still just
finding Tuple2.
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/604#discussion_r9780093
Note: I removed the ClassTag for V, as it was not necessary. Also, I
reordered the type parameters to put K in front.
GitHub user aarondav opened a pull request:
https://github.com/apache/incubator-spark/pull/604
SPARK-1098: Minor cleanup of ClassTag usage in Java API
Our usage of fake ClassTags in this manner is probably not healthy, but I'm
not sure if there's a better solution avail
Github user aarondav commented on the pull request:
https://github.com/apache/incubator-spark/pull/602#issuecomment-35211178
Neato, Jenkins actually listened to me. Merged into master, thanks!