[GitHub] spark pull request: do you mean inadvertently?

2014-12-05 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/3620

do you mean inadvertently?



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark streaming-foreachRDD

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3620.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3620


commit b72886b6570be62ca4bcf1964c489a5f51d41394
Author: CrazyJvm crazy...@gmail.com
Date:   2014-12-05T13:39:13Z

do you mean inadvertently?




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: use isRunningLocally rather than runningLocall...

2014-10-21 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/2879

use isRunningLocally rather than runningLocally

runningLocally is deprecated now
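
For illustration, a minimal self-contained sketch of the deprecated-alias pattern behind this change (the class body and version string here are assumptions, not Spark's actual source):

```scala
class TaskContext {
  def isRunningLocally: Boolean = false

  // Old name kept as a deprecated alias so existing callers still compile.
  @deprecated("use isRunningLocally", "1.2.0")
  def runningLocally: Boolean = isRunningLocally
}
```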

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark runningLocally

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2879.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2879


commit bec0b3ef008c3fbc1dcf133db08271eb0892b50e
Author: CrazyJvm crazy...@gmail.com
Date:   2014-10-21T11:52:14Z

use isRunningLocally rather than runningLocally







[GitHub] spark pull request: use --total-executor-cores rather than --co...

2014-09-27 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/2540#issuecomment-57045529
  
@andrewor14 I've already modified the title according to your suggestion. Thanks!





[GitHub] spark pull request: use --total-executor-cores rather than --co...

2014-09-25 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/2540

use --total-executor-cores rather than --cores after spark-shell
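
For example, a standalone-mode launch with the correct flag might look like this (host and core count are illustrative):

$ ./bin/spark-shell --master spark://master-host:7077 --total-executor-cores 8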



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark standalone-core

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2540.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2540


commit 66d9fc61af64a43c3022727b08a569702b759d30
Author: CrazyJvm crazy...@gmail.com
Date:   2014-09-26T02:50:51Z

use --total-executor-cores rather than --cores after spark-shell







[GitHub] spark pull request: use --total-executor-cores rather than --co...

2014-09-25 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/2540#issuecomment-56914942
  
I mean launching spark-shell in standalone mode.





[GitHub] spark pull request: add some shuffle configurations in doc

2014-08-20 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/2031#issuecomment-52741370
  
@colorant thanks, I hadn't noticed `toLowerCase` before :). Already modified.





[GitHub] spark pull request: add spark.shuffle.spill.batchSize and fix the ...

2014-08-19 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/2031

add spark.shuffle.spill.batchSize and fix the value of spark.shuffle.manager

```scala
private val serializerBatchSize =
  sparkConf.getLong("spark.shuffle.spill.batchSize", 10000)
```
add `spark.shuffle.spill.batchSize` to the doc.

And according to
```scala
// Let the user specify short names for shuffle managers
val shortShuffleMgrNames = Map(
  "hash" -> "org.apache.spark.shuffle.hash.HashShuffleManager",
  "sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager")
val shuffleMgrName = conf.get("spark.shuffle.manager", "hash")
```
the value should be "hash" or "sort" rather than "HASH" or "SORT".
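
For illustration, a minimal sketch of setting both keys through the standard SparkConf API (the values here are arbitrary examples):

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.shuffle.manager", "sort")            // short lowercase name, not "SORT"
  .set("spark.shuffle.spill.batchSize", "10000")   // objects written per batch when spilling
```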


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark conf-spill-batchSize

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2031.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2031


commit 49b47f04150d7c6fd428631228fa1428a2978e9d
Author: CrazyJvm crazy...@gmail.com
Date:   2014-08-19T07:59:36Z

add configuration `spark.shuffle.spill.batchSize` and fix the value of 
spark.shuffle.manager







[GitHub] spark pull request: there's no need to use masterLock in Worker no...

2014-08-17 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/2008

there's no need to use masterLock in Worker now since all communications 
are within Akka actor

There's no need to use masterLock in Worker now, since all communication happens within the Akka actor.
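
As an illustrative sketch of why the lock is unnecessary (class and field names here are assumptions, not Spark's actual source), an Akka actor processes one message at a time, so mutable state the lock used to guard is already confined to a single thread:

```scala
import akka.actor.{Actor, ActorRef}

class Worker extends Actor {
  var master: Option[ActorRef] = None  // previously guarded by masterLock

  def receive = {
    case newMaster: ActorRef =>
      master = Some(newMaster)  // safe: only the actor's own message loop runs here
  }
}
```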

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark no-need-master-lock

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/2008.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2008


commit 58e7fa50e2e71f5d92de2cdcf6f2928c0da0db12
Author: CrazyJvm crazy...@gmail.com
Date:   2014-08-18T02:30:39Z

there's no need to use masterLock now since all communications are within 
Akka actor







[GitHub] spark pull request: add cacheTable guide

2014-08-01 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/1681#issuecomment-50952014
  
OK, got it, thanks for the reminder. @pwendell




[GitHub] spark pull request: add cacheTable guide

2014-07-31 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/1681#issuecomment-50723525
  
thanks @pwendell, already fixed according to your suggestions.




[GitHub] spark pull request: add cacheTable guide

2014-07-30 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/1681

add cacheTable guide

add the `cacheTable` specification
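
A short usage sketch of what the guide section covers (the SQLContext setup and table name are illustrative assumptions):

```scala
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)  // `sc` is an existing SparkContext
sqlContext.cacheTable("people")      // cache the registered table in memory
sqlContext.uncacheTable("people")    // drop it from the cache
```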

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark sql-programming-guide-cache

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1681.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1681


commit 2cbbf58c9a5efccbf392f0e1bbc777ac7b9d8179
Author: CrazyJvm crazy...@gmail.com
Date:   2014-07-31T04:17:13Z

add cacheTable guide






[GitHub] spark pull request: SPARK-2000:cannot connect to cluster in Standa...

2014-07-29 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/952#issuecomment-50458866
  
@mateiz Yes, I agree. I was motivated by http://spark.apache.org/docs/latest/spark-standalone.html, which says: "Note that if you are running spark-shell from one of the spark cluster machines, the bin/spark-shell script will automatically set MASTER from the SPARK_MASTER_IP and SPARK_MASTER_PORT variables in conf/spark-env.sh."
So should I modify the guide rather than the code?




[GitHub] spark pull request: SPARK-2000:cannot connect to cluster in Standa...

2014-07-29 Thread CrazyJvm
Github user CrazyJvm closed the pull request at:

https://github.com/apache/spark/pull/952




[GitHub] spark pull request: SPARK-2000:cannot connect to cluster in Standa...

2014-07-29 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/952#issuecomment-50562260
  
OK, so I will close this PR and send another patch for the guide. Thanks for the discussion.




[GitHub] spark pull request: automatically set master according to `spark.m...

2014-07-29 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/1644

automatically set master according to `spark.master` in `spark-defaults

automatically set master according to `spark.master` in 
`spark-defaults.conf`
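
For illustration, the relevant entry in conf/spark-defaults.conf might look like this (host and port are assumptions):

spark.master    spark://master-host:7077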

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark standalone-guide

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1644.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1644


commit bb12b950c149e8ebeb78b047b9bfc37a4313eb76
Author: CrazyJvm crazy...@gmail.com
Date:   2014-07-30T01:45:31Z

automatically set master according to `spark.master` in 
`spark-defaults.conf`






[GitHub] spark pull request: Graphx example

2014-07-22 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/1523

Graphx example

fix examples

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark graphx-example

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1523.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1523


commit 7cfff1d029ace9bdb2cd39e726d144a1cb8d868f
Author: CrazyJvm crazy...@gmail.com
Date:   2014-07-22T06:56:28Z

fix example for joinVertices

commit 663457a9f63c6e7bb1087e1ca4ed2a483ad3aa7a
Author: CrazyJvm crazy...@gmail.com
Date:   2014-07-22T07:04:03Z

outDegrees does not take parameters
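
A minimal sketch of the corrected usage (the edge-list path and `sc` are assumptions): outDegrees is a member value, so it takes no parameters:

```scala
import org.apache.spark.graphx.GraphLoader

val graph = GraphLoader.edgeListFile(sc, "edges.txt")  // `sc` is an existing SparkContext
val degrees = graph.outDegrees                         // note: no parentheses; a VertexRDD[Int]
```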






[GitHub] spark pull request: fix Graph partitionStrategy comment

2014-07-10 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/1368

fix Graph partitionStrategy comment



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark graph-comment-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1368.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1368


commit e190d6fdd5b4d0f5a89352c38e5f06f5238b35a8
Author: CrazyJvm crazy...@gmail.com
Date:   2014-07-11T02:53:54Z

fix Graph partitionStrategy comment






[GitHub] spark pull request: fix spark.yarn.max.executor.failures explainat...

2014-07-02 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/1282

fix spark.yarn.max.executor.failures explanation

According to
```scala
private val maxNumExecutorFailures =
  sparkConf.getInt("spark.yarn.max.executor.failures",
    sparkConf.getInt("spark.yarn.max.worker.failures",
      math.max(args.numExecutors * 2, 3)))
```

the default value should be numExecutors * 2, with a minimum of 3 (for example, with 10 executors the default is max(20, 3) = 20), and the same applies to the config
`spark.yarn.max.worker.failures`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark yarn-doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1282.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1282


commit a4b2e27b0c2d2345a60ba66943b219968465b48a
Author: CrazyJvm crazy...@gmail.com
Date:   2014-07-02T06:59:48Z

fix configuration spark.yarn.max.executor.failures

commit 2900d234c6ebb90a5c4601083ddf8d329a2ee99d
Author: CrazyJvm crazy...@gmail.com
Date:   2014-07-02T07:04:51Z

fix style

commit 211f1302aa6d57b07a7b2d3b7cd4ab21e6d50bbd
Author: CrazyJvm crazy...@gmail.com
Date:   2014-07-02T07:06:28Z

fix html tag

commit 86effa612d2ec9ae991b43e229c4ed266e6605a6
Author: CrazyJvm crazy...@gmail.com
Date:   2014-07-02T07:15:08Z

change expression

commit c438aecdec8ce90cb839b7c9aa8260ff4d3c62ba
Author: CrazyJvm crazy...@gmail.com
Date:   2014-07-02T07:18:18Z

fix style






[GitHub] spark pull request: make port of ConnectionManager configurable

2014-06-30 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/1267

make port of ConnectionManager configurable

I encountered a port conflict caused by ConnectionManager, which is really annoying.
So I made the ConnectionManager port configurable, with the default port unchanged.
I added a new configuration called `spark.network.connectionmanager.port`,
and I'm not sure whether the name is OK or not :)
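
A one-line sketch of how such a key would be read (the key name comes from this PR; `sparkConf` is assumed, and a default of 0 asks the OS for an ephemeral port):

```scala
val port = sparkConf.getInt("spark.network.connectionmanager.port", 0)
```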

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark ConnectionManager-Port

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1267.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1267


commit e34fbe5003a97b693de7abb5144d907d01283fef
Author: CrazyJvm crazy...@gmail.com
Date:   2014-06-30T09:37:12Z

make port of ConnectionManager configurable






[GitHub] spark pull request: make port of ConnectionManager configurable

2014-06-30 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/1267#issuecomment-47609357
  
Ah, yes, only in tests. So maybe I should close the PR?
It doesn't look like a big deal.




[GitHub] spark pull request: make port of ConnectionManager configurable

2014-06-30 Thread CrazyJvm
Github user CrazyJvm closed the pull request at:

https://github.com/apache/spark/pull/1267




[GitHub] spark pull request: Refactor DriverRunner and DriverRunnerTest

2014-06-17 Thread CrazyJvm
Github user CrazyJvm closed the pull request at:

https://github.com/apache/spark/pull/1066




[GitHub] spark pull request: SPARK-1999: StorageLevel in storage tab and RD...

2014-06-15 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/968#issuecomment-46136772
  
@pwendell , would you like to take a look at it again?




[GitHub] spark pull request: Master supports pluggable clock

2014-06-12 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/1066

Master supports pluggable clock

Convenient for testing, especially in timeout scenarios.
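
A minimal sketch of the pluggable-clock pattern (the names here are assumptions, not Spark's actual source):

```scala
trait Clock { def currentTime: Long }

object SystemClock extends Clock {
  def currentTime: Long = System.currentTimeMillis()
}

// In tests, time can be advanced manually to exercise timeout paths.
class FakeClock(var now: Long) extends Clock {
  def currentTime: Long = now
}

class Master(clock: Clock = SystemClock) {
  def workerTimedOut(lastHeartbeat: Long, timeoutMs: Long): Boolean =
    clock.currentTime - lastHeartbeat > timeoutMs
}
```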

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark master-clock

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1066.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1066


commit 4dddc889ff7e3c0f27ff65f6fe3101ead9cd91d9
Author: CrazyJvm crazy...@gmail.com
Date:   2014-06-12T07:59:30Z

Master supports pluggable clock






[GitHub] spark pull request: [SPARK-1946] Submit stage after (configured nu...

2014-06-12 Thread CrazyJvm
Github user CrazyJvm commented on a diff in the pull request:

https://github.com/apache/spark/pull/900#discussion_r13692635
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
 ---
@@ -225,6 +232,17 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A
         throw new SparkException("Error notifying standalone scheduler's driver actor", e)
       }
   }
+
+  override def isReady(): Boolean = {
+    if (ready) {
+      return true
+    }
+    if ((System.currentTimeMillis() - createTime) >= maxRegisteredWaitingTime) {
+      ready = true
+      return true
+    }
+    return false
--- End diff --

No need for `return` here, I think.
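
As an illustrative sketch of the reviewer's point, the same logic can be written without explicit returns (names follow the diff above):

```scala
override def isReady(): Boolean = {
  if (!ready && (System.currentTimeMillis() - createTime) >= maxRegisteredWaitingTime) {
    ready = true
  }
  ready
}
```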




[GitHub] spark pull request: Master supports pluggable clock

2014-06-12 Thread CrazyJvm
Github user CrazyJvm closed the pull request at:

https://github.com/apache/spark/pull/1066




[GitHub] spark pull request: Master supports pluggable clock

2014-06-12 Thread CrazyJvm
GitHub user CrazyJvm reopened a pull request:

https://github.com/apache/spark/pull/1066

Master supports pluggable clock

Convenient for testing, especially in timeout scenarios.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark master-clock

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1066.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1066


commit 4dddc889ff7e3c0f27ff65f6fe3101ead9cd91d9
Author: CrazyJvm crazy...@gmail.com
Date:   2014-06-12T07:59:30Z

Master supports pluggable clock

commit a002e423b85c4c46b1e54c761caad95dc34ef923
Author: CrazyJvm crazy...@gmail.com
Date:   2014-06-12T13:17:42Z

fix exception "no matching constructor found on class org.apache.spark.deploy.master.Master for arguments"






[GitHub] spark pull request: Master supports pluggable clock

2014-06-12 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/1066#issuecomment-45899433
  
I can't figure out why the build failed here, since everything is OK on my Mac.




[GitHub] spark pull request: Master supports pluggable clock

2014-06-12 Thread CrazyJvm
Github user CrazyJvm closed the pull request at:

https://github.com/apache/spark/pull/1066




[GitHub] spark pull request: Master supports pluggable clock

2014-06-12 Thread CrazyJvm
GitHub user CrazyJvm reopened a pull request:

https://github.com/apache/spark/pull/1066

Master supports pluggable clock

Convenient for testing, especially in timeout scenarios.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark master-clock

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1066.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1066


commit 4dddc889ff7e3c0f27ff65f6fe3101ead9cd91d9
Author: CrazyJvm crazy...@gmail.com
Date:   2014-06-12T07:59:30Z

Master supports pluggable clock

commit a002e423b85c4c46b1e54c761caad95dc34ef923
Author: CrazyJvm crazy...@gmail.com
Date:   2014-06-12T13:17:42Z

fix exception "no matching constructor found on class org.apache.spark.deploy.master.Master for arguments"






[GitHub] spark pull request: Use pluggable clock in DAGScheduler #SPARK-2031

2014-06-05 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/976

Use pluggable clock in DAGScheduler #SPARK-2031

DAGScheduler supports a pluggable clock, like TaskSetManager does.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark clock

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/976.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #976


commit 6779a4c70f43f705b218bdd28bb9cdffaa4a4b1c
Author: CrazyJvm crazy...@gmail.com
Date:   2014-06-05T06:32:46Z

Use pluggable clock in DAGScheduler






[GitHub] spark pull request: SPARK-1999: StorageLevel in storage tab and RD...

2014-06-05 Thread CrazyJvm
Github user CrazyJvm commented on a diff in the pull request:

https://github.com/apache/spark/pull/968#discussion_r13429125
  
--- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala ---
@@ -33,6 +33,7 @@ class RDDInfo(
   var memSize = 0L
   var diskSize = 0L
   var tachyonSize = 0L
+  var _storageLevel = storageLevel
--- End diff --

I agree with you, so I will change it.
Another reason to make storageLevel a var is that the RDD information also uses it; it should be updated, too.

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/RDDInfo.scala#L37-L43




[GitHub] spark pull request: JIRA https://issues.apache.org/jira/browse/SPA...

2014-06-04 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/968

JIRA https://issues.apache.org/jira/browse/SPARK-1999

StorageLevel in 'storage tab' and 'RDD Storage Info' never changes, even if you call rdd.unpersist() and then give the RDD a different storage level.
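
A minimal repro sketch of the reported behavior, using the standard RDD persistence API (assuming an existing SparkContext `sc`):

```scala
import org.apache.spark.storage.StorageLevel

val rdd = sc.parallelize(1 to 100).persist(StorageLevel.MEMORY_ONLY)
rdd.count()                          // materialize so the storage tab shows it
rdd.unpersist()
rdd.persist(StorageLevel.DISK_ONLY)  // the UI reportedly keeps showing MEMORY_ONLY
rdd.count()
```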

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark ui-storagelevel

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/968.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #968


commit 9f1571ef47bb7975eed55441f01aecda25851f74
Author: CrazyJvm crazy...@gmail.com
Date:   2014-06-04T08:36:02Z

JIRA https://issues.apache.org/jira/browse/SPARK-1999
UI : StorageLevel in storage tab and RDD Storage Info never changes






[GitHub] spark pull request: StorageLevel in 'storage tab' and 'RDD Storage...

2014-06-03 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/950

StorageLevel in 'storage tab' and 'RDD Storage Info' never changes  
#SPARK-1999#

StorageLevel in 'storage tab' and 'RDD Storage Info' never changes, even if you call rdd.unpersist() and then give the RDD a different storage level.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/950.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #950


commit bbe28d2b4fee09258cd89c1085e0597d188bffa0
Author: Chen Chao crazy...@gmail.com
Date:   2014-06-02T08:25:51Z

Merge pull request #3 from apache/master

Add landmark-based Shortest Path algorithm to graphx.lib

commit a158ff031c021491f4e1ddd5c13f8317905c76ee
Author: Chen Chao crazy...@gmail.com
Date:   2014-06-03T01:16:34Z

Merge pull request #4 from apache/master

merge from master

commit 21aef67288bd3d50f29f4e4dceaa8df71d9e279d
Author: CrazyJvm crazy...@gmail.com
Date:   2014-06-03T10:52:38Z

fix ui storagelevel bug






[GitHub] spark pull request: StorageLevel in 'storage tab' and 'RDD Storage...

2014-06-03 Thread CrazyJvm
Github user CrazyJvm closed the pull request at:

https://github.com/apache/spark/pull/950




[GitHub] spark pull request: cannot connect to cluster in Standalone mode w...

2014-06-03 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/952

cannot connect to cluster in Standalone mode when running spark-shell on one of the cluster nodes without the master option #SPARK-2000

JIRA:https://issues.apache.org/jira/browse/SPARK-2000

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark patch-9

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/952.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #952


commit 08579af6d2aa2239da5b6532094301a4c4afe86b
Author: Chen Chao crazy...@gmail.com
Date:   2014-06-03T15:02:26Z

connect to cluster automatically






[GitHub] spark pull request: cannot connect to cluster in Standalone mode w...

2014-06-03 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/952#issuecomment-44985944
  
Hi @witgo, I still can't figure out what you mean... could you please give me some more detailed clues?




[GitHub] spark pull request: cannot connect to cluster in Standalone mode w...

2014-06-03 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/952#issuecomment-45045909
  
Thanks, @witgo. I think your suggestion is better than my current solution, since we don't need to modify the shell script.
Another problem is that spark-shell cannot read spark-env.sh when submitting, because it does not source 'load-spark-env.sh'.
I will modify and test, thanks a lot.





[GitHub] spark pull request: correct tiny comment error

2014-05-31 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/928

correct tiny comment error



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark patch-8

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/928.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #928


commit 144328bc2262e15aaff60e70ec0abfb807c9bb43
Author: Chen Chao crazy...@gmail.com
Date:   2014-05-31T06:22:53Z

correct tiny comment error






[GitHub] spark pull request: default task number misleading in several plac...

2014-05-15 Thread CrazyJvm
Github user CrazyJvm commented on a diff in the pull request:

https://github.com/apache/spark/pull/766#discussion_r12670482
  
--- Diff: docs/streaming-programming-guide.md ---
@@ -522,9 +522,9 @@ common ones are as follows.
  <td> <b>reduceByKey</b>(<i>func</i>, [<i>numTasks</i>]) </td>
  <td> When called on a DStream of (K, V) pairs, return a new DStream of (K, V) pairs where the
  values for each key are aggregated using the given reduce function. <b>Note:</b> By default,
-  this uses Spark's default number of parallel tasks (2 for local machine, 8 for a cluster) to
-  do the grouping. You can pass an optional <code>numTasks</code> argument to set a different
-  number of tasks.</td>
+  this uses Spark's default number of parallel tasks (local mode is 2, while cluster mode is
--- End diff --

It's good, I think :)




[GitHub] spark pull request: default task number misleading in several plac...

2014-05-14 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/766

default task number misleading in several places

```scala
private[streaming] def defaultPartitioner(numPartitions: Int = self.ssc.sc.defaultParallelism) = {
  new HashPartitioner(numPartitions)
}
```

This shows that the default task number in Spark Streaming relies on the variable defaultParallelism in SparkContext, which is determined by the config property spark.default.parallelism.

The property spark.default.parallelism refers to
https://github.com/apache/spark/pull/389
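
For illustration, the default can be bypassed per operation by passing an explicit task count (the DStream `pairs` is an assumption):

```scala
// assuming `pairs` is a DStream[(String, Int)]
val counts = pairs.reduceByKey(_ + _, 32)  // 32 tasks instead of the default
```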

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark patch-7

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/766.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #766


commit cc5b66c1883eca8862b8f37ef50d64cc0408c54c
Author: Chen Chao crazy...@gmail.com
Date:   2014-05-14T07:45:10Z

default task number misleading in several places

```scala
private[streaming] def defaultPartitioner(numPartitions: Int = self.ssc.sc.defaultParallelism) = {
  new HashPartitioner(numPartitions)
}
```

This shows that the default task number in Spark Streaming relies on the variable defaultParallelism in SparkContext, which is determined by the config property spark.default.parallelism.

The property spark.default.parallelism refers to
https://github.com/apache/spark/pull/389






[GitHub] spark pull request: Args for worker rather than master

2014-04-29 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/587

Args for worker rather than master

Args for worker rather than master

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark patch-6

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/587.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #587


commit b54b89f3c83e8ae41ce65c806674e1675add72f1
Author: Chen Chao crazy...@gmail.com
Date:   2014-04-29T08:22:56Z

Args for worker rather than master

Args for worker rather than master






[GitHub] spark pull request: misleading task number of groupByKey

2014-04-14 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/403

misleading task number of groupByKey

"By default, this uses only 8 parallel tasks to do the grouping." is quite misleading. Please refer to https://github.com/apache/spark/pull/389

The detail is in the following code:
```scala
def defaultPartitioner(rdd: RDD[_], others: RDD[_]*): Partitioner = {
  val bySize = (Seq(rdd) ++ others).sortBy(_.partitions.size).reverse
  for (r <- bySize if r.partitioner.isDefined) {
    return r.partitioner.get
  }
  if (rdd.context.conf.contains("spark.default.parallelism")) {
    new HashPartitioner(rdd.context.defaultParallelism)
  } else {
    new HashPartitioner(bySize.head.partitions.size)
  }
}
```
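
For illustration, the number of tasks can instead be set explicitly at the call site (the pair RDD `pairs` is an assumption):

```scala
// assuming `pairs` is an RDD[(String, Int)]
val grouped = pairs.groupByKey(64)  // 64 tasks instead of the inferred default
```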

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark patch-4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/403.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #403


commit 156833643d9ea1479222e9033164e92a9846351c
Author: Chen Chao crazy...@gmail.com
Date:   2014-04-14T07:39:50Z

misleading task number of groupByKey

"By default, this uses only 8 parallel tasks to do the grouping." is quite misleading. Please refer to https://github.com/apache/spark/pull/389

The detail is in the following code:
```scala
def defaultPartitioner(rdd: RDD[_], others: RDD[_]*): Partitioner = {
  val bySize = (Seq(rdd) ++ others).sortBy(_.partitions.size).reverse
  for (r <- bySize if r.partitioner.isDefined) {
    return r.partitioner.get
  }
  if (rdd.context.conf.contains("spark.default.parallelism")) {
    new HashPartitioner(rdd.context.defaultParallelism)
  } else {
    new HashPartitioner(bySize.head.partitions.size)
  }
}
```






[GitHub] spark pull request: style fix

2014-04-14 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/411

style fix

delete semicolon

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark patch-5

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/411.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #411


commit de5d9a75e2c16a7c85c6cfbae9f052994ee60688
Author: Chen Chao crazy...@gmail.com
Date:   2014-04-15T05:57:17Z

style fix 

delete semicolon






[GitHub] spark pull request: update spark.default.parallelism

2014-04-11 Thread CrazyJvm
GitHub user CrazyJvm opened a pull request:

https://github.com/apache/spark/pull/389

update spark.default.parallelism

Actually, the value 8 is only valid in Mesos fine-grained mode:
```scala
override def defaultParallelism() = sc.conf.getInt("spark.default.parallelism", 8)
```

while in coarse-grained mode, including Mesos coarse-grained, the value of the property depends on the number of cores:
```scala
override def defaultParallelism(): Int = {
  conf.getInt("spark.default.parallelism", math.max(totalCoreCount.get(), 2))
}
```
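
For illustration, a sketch of pinning the property explicitly so it no longer depends on the deployment mode (the value 16 is arbitrary):

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf().set("spark.default.parallelism", "16")
```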

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CrazyJvm/spark patch-2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/389.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #389


commit ee0fae00ad0294dedca6a78ed05b97ef0ddcc211
Author: Chen Chao crazy...@gmail.com
Date:   2014-04-11T06:54:58Z

update spark.default.parallelism

Actually, the value 8 is only valid in Mesos fine-grained mode:
```scala
override def defaultParallelism() = sc.conf.getInt("spark.default.parallelism", 8)
```

while in coarse-grained mode, including Mesos coarse-grained, the value of the property depends on the number of cores:
```scala
override def defaultParallelism(): Int = {
  conf.getInt("spark.default.parallelism", math.max(totalCoreCount.get(), 2))
}
```






[GitHub] spark pull request: Merge Hadoop Into Spark

2014-04-01 Thread CrazyJvm
Github user CrazyJvm commented on the pull request:

https://github.com/apache/spark/pull/286#issuecomment-39278059
  
+1, amazing! I've been looking forward to it for a long time! Thanks!

