[GitHub] spark pull request: do you mean inadvertently?
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/3620

    do you mean inadvertently?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark streaming-foreachRDD

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3620.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3620

commit b72886b6570be62ca4bcf1964c489a5f51d41394
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-12-05T13:39:13Z

    do you mean inadvertently?

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: use isRunningLocally rather than runningLocall...
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/2879

    use isRunningLocally rather than runningLocally

    `runningLocally` is deprecated now.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark runningLocally

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2879.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2879

commit bec0b3ef008c3fbc1dcf133db08271eb0892b50e
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-10-21T11:52:14Z

    use isRunningLocally rather than runningLocally
[GitHub] spark pull request: use --total-executor-cores rather than --co...
Github user CrazyJvm commented on the pull request:

    https://github.com/apache/spark/pull/2540#issuecomment-57045529

    @andrewor14 I have already modified the title according to your suggestion. Thanks!
[GitHub] spark pull request: use --total-executor-cores rather than --co...
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/2540

    use --total-executor-cores rather than --cores after spark-shell

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark standalone-core

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2540.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2540

commit 66d9fc61af64a43c3022727b08a569702b759d30
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-09-26T02:50:51Z

    use --total-executor-cores rather than --cores after spark-shell
[GitHub] spark pull request: use --total-executor-cores rather than --co...
Github user CrazyJvm commented on the pull request:

    https://github.com/apache/spark/pull/2540#issuecomment-56914942

    I mean launching spark-shell in standalone mode.
[GitHub] spark pull request: add some shuffle configurations in doc
Github user CrazyJvm commented on the pull request:

    https://github.com/apache/spark/pull/2031#issuecomment-52741370

    @colorant thanks, I hadn't noticed `toLowerCase` before : ). Already modified.
[GitHub] spark pull request: add spark.shuffle.spill.batchSize and fix the ...
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/2031

    add spark.shuffle.spill.batchSize and fix the value of spark.shuffle.manager

According to

```scala
private val serializerBatchSize = sparkConf.getLong("spark.shuffle.spill.batchSize", 1)
```

add `spark.shuffle.spill.batchSize` to the doc. And according to

```scala
// Let the user specify short names for shuffle managers
val shortShuffleMgrNames = Map(
  "hash" -> "org.apache.spark.shuffle.hash.HashShuffleManager",
  "sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager")
val shuffleMgrName = conf.get("spark.shuffle.manager", "hash")
```

the value should be `hash` and `sort` rather than `HASH` and `SORT`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark conf-spill-batchSize

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2031.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2031

commit 49b47f04150d7c6fd428631228fa1428a2978e9d
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-08-19T07:59:36Z

    add configuration `spark.shuffle.spill.batchSize` and fix the value of spark.shuffle.manager
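To illustrate why the documented values must be the short names rather than `HASH`/`SORT`, here is a minimal sketch of the lookup behavior. The `ShuffleManagerNames` object and the `resolve` helper are hypothetical names introduced for this example; only the `Map` mirrors the code quoted in the PR, and the case-insensitive lookup assumes the `toLowerCase` call mentioned in the review discussion.

```scala
// Hypothetical sketch of how a configured shuffle-manager value would be
// resolved to a fully qualified class name.
object ShuffleManagerNames {
  // Short names mapped to implementation classes, as quoted in the PR.
  val shortShuffleMgrNames = Map(
    "hash" -> "org.apache.spark.shuffle.hash.HashShuffleManager",
    "sort" -> "org.apache.spark.shuffle.sort.SortShuffleManager")

  // Look the value up case-insensitively; anything not in the map is
  // treated as an already fully qualified class name.
  def resolve(configured: String): String =
    shortShuffleMgrNames.getOrElse(configured.toLowerCase, configured)
}
```

Under this sketch, `HASH` only works because of the `toLowerCase` normalization; the canonical documented values are still the lowercase short names.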
[GitHub] spark pull request: there's no need to use masterLock in Worker no...
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/2008

    there's no need to use masterLock in Worker now since all communications are within Akka actor

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark no-need-master-lock

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2008.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2008

commit 58e7fa50e2e71f5d92de2cdcf6f2928c0da0db12
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-08-18T02:30:39Z

    there's no need to use masterLock now since all communications are within Akka actor
[GitHub] spark pull request: add cacheTable guide
Github user CrazyJvm commented on the pull request:

    https://github.com/apache/spark/pull/1681#issuecomment-50952014

    OK, got it, thanks for the reminder. @pwendell
[GitHub] spark pull request: add cacheTable guide
Github user CrazyJvm commented on the pull request:

    https://github.com/apache/spark/pull/1681#issuecomment-50723525

    thanks @pwendell, already fixed according to your suggestions.
[GitHub] spark pull request: add cacheTable guide
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/1681

    add cacheTable guide

    add the `cacheTable` specification

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark sql-programming-guide-cache

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1681.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1681

commit 2cbbf58c9a5efccbf392f0e1bbc777ac7b9d8179
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-07-31T04:17:13Z

    add cacheTable guide
[GitHub] spark pull request: SPARK-2000:cannot connect to cluster in Standa...
Github user CrazyJvm commented on the pull request:

    https://github.com/apache/spark/pull/952#issuecomment-50458866

    @mateiz Yes, I agree. I was motivated by http://spark.apache.org/docs/latest/spark-standalone.html, which says: "Note that if you are running spark-shell from one of the spark cluster machines, the bin/spark-shell script will automatically set MASTER from the SPARK_MASTER_IP and SPARK_MASTER_PORT variables in conf/spark-env.sh." So should I modify the guide rather than the code?
[GitHub] spark pull request: SPARK-2000:cannot connect to cluster in Standa...
Github user CrazyJvm closed the pull request at:

    https://github.com/apache/spark/pull/952
[GitHub] spark pull request: SPARK-2000:cannot connect to cluster in Standa...
Github user CrazyJvm commented on the pull request:

    https://github.com/apache/spark/pull/952#issuecomment-50562260

    OK, so I will close this PR and send another patch for the guide. Thanks for the discussion.
[GitHub] spark pull request: automatically set master according to `spark.m...
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/1644

    automatically set master according to `spark.master` in `spark-defaults.conf`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark standalone-guide

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1644.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1644

commit bb12b950c149e8ebeb78b047b9bfc37a4313eb76
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-07-30T01:45:31Z

    automatically set master according to `spark.master` in `spark-defaults.conf`
[GitHub] spark pull request: Graphx example
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/1523

    Graphx example

    fix examples

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark graphx-example

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1523.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1523

commit 7cfff1d029ace9bdb2cd39e726d144a1cb8d868f
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-07-22T06:56:28Z

    fix example for joinVertices

commit 663457a9f63c6e7bb1087e1ca4ed2a483ad3aa7a
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-07-22T07:04:03Z

    outDegrees does not take parameters
[GitHub] spark pull request: fix Graph partitionStrategy comment
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/1368

    fix Graph partitionStrategy comment

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark graph-comment-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1368.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1368

commit e190d6fdd5b4d0f5a89352c38e5f06f5238b35a8
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-07-11T02:53:54Z

    fix Graph partitionStrategy comment
[GitHub] spark pull request: fix spark.yarn.max.executor.failures explainat...
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/1282

    fix spark.yarn.max.executor.failures explanation

According to

```scala
private val maxNumExecutorFailures = sparkConf.getInt("spark.yarn.max.executor.failures",
  sparkConf.getInt("spark.yarn.max.worker.failures", math.max(args.numExecutors * 2, 3)))
```

the default value should be `numExecutors * 2`, with a minimum of 3, and the same applies to the config `spark.yarn.max.worker.failures`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark yarn-doc

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1282.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1282

commit a4b2e27b0c2d2345a60ba66943b219968465b48a
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-07-02T06:59:48Z

    fix configuration spark.yarn.max.executor.failures

commit 2900d234c6ebb90a5c4601083ddf8d329a2ee99d
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-07-02T07:04:51Z

    fix style

commit 211f1302aa6d57b07a7b2d3b7cd4ab21e6d50bbd
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-07-02T07:06:28Z

    fix html tag

commit 86effa612d2ec9ae991b43e229c4ed266e6605a6
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-07-02T07:15:08Z

    change expression

commit c438aecdec8ce90cb839b7c9aa8260ff4d3c62ba
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-07-02T07:18:18Z

    fix style
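The corrected documentation wording follows directly from the nested `getInt` fallback quoted above: if neither config key is set, the default is twice the requested executor count, floored at 3. The helper below is a hypothetical standalone sketch of just that arithmetic, not code from the PR.

```scala
// Hypothetical helper reproducing the default from the quoted snippet:
// twice the number of requested executors, but never below 3.
def defaultMaxExecutorFailures(numExecutors: Int): Int =
  math.max(numExecutors * 2, 3)
```

So a one-executor job tolerates 3 failures, while a ten-executor job tolerates 20, which is the behavior the fixed doc text describes.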
[GitHub] spark pull request: make port of ConnectionManager configurable
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/1267

    make port of ConnectionManager configurable

I encountered a port conflict problem due to ConnectionManager, which is really annoying. So I made the ConnectionManager port configurable, keeping the default port. I added a new configuration called `spark.network.connectionmanager.port`, and I'm not sure whether the name is OK or not : )

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark ConnectionManager-Port

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1267.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1267

commit e34fbe5003a97b693de7abb5144d907d01283fef
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-06-30T09:37:12Z

    make port of ConnectionManager configurable
[GitHub] spark pull request: make port of ConnectionManager configurable
Github user CrazyJvm commented on the pull request:

    https://github.com/apache/spark/pull/1267#issuecomment-47609357

    Ah, yes, just in tests. So maybe I should close the PR? It doesn't look like a big deal.
[GitHub] spark pull request: make port of ConnectionManager configurable
Github user CrazyJvm closed the pull request at:

    https://github.com/apache/spark/pull/1267
[GitHub] spark pull request: Refactor DriverRunner and DriverRunnerTest
Github user CrazyJvm closed the pull request at:

    https://github.com/apache/spark/pull/1066
[GitHub] spark pull request: SPARK-1999: StorageLevel in storage tab and RD...
Github user CrazyJvm commented on the pull request:

    https://github.com/apache/spark/pull/968#issuecomment-46136772

    @pwendell, would you like to take a look at it again?
[GitHub] spark pull request: Master supports pluggable clock
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/1066

    Master supports pluggable clock

    Convenient for testing, especially in timeout scenarios.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark master-clock

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1066.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1066

commit 4dddc889ff7e3c0f27ff65f6fe3101ead9cd91d9
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-06-12T07:59:30Z

    Master supports pluggable clock
[GitHub] spark pull request: [SPARK-1946] Submit stage after (configured nu...
Github user CrazyJvm commented on a diff in the pull request:

    https://github.com/apache/spark/pull/900#discussion_r13692635

--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ---

```scala
@@ -225,6 +232,17 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A
         throw new SparkException("Error notifying standalone scheduler's driver actor", e)
       }
     }
+
+  override def isReady(): Boolean = {
+    if (ready) {
+      return true
+    }
+    if ((System.currentTimeMillis() - createTime) >= maxRegisteredWaitingTime) {
+      ready = true
+      return true
+    }
+    return false
```

--- End diff --

    no need for `return`, I think
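The reviewer's point is that Scala returns the value of the last expression, so the explicit `return` statements in the diff are unnecessary. A minimal standalone sketch of the same logic without them, using a hypothetical `ReadinessCheck` class (the name and the explicit `now` parameter are introduced here for testability; the real code reads `System.currentTimeMillis()` directly):

```scala
// Hypothetical sketch: once the waiting time has elapsed since creation,
// the check latches to ready and stays ready.
class ReadinessCheck(createTime: Long, maxRegisteredWaitingTime: Long) {
  private var ready = false

  def isReady(now: Long): Boolean = {
    if (!ready && now - createTime >= maxRegisteredWaitingTime) {
      ready = true // latch: never flips back to false
    }
    ready // last expression is the result; no `return` needed
  }
}
```

The latch also makes the "once ready, always ready" behavior of the original explicit: after the timeout fires once, later calls return true regardless of the timestamp.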
[GitHub] spark pull request: Master supports pluggable clock
Github user CrazyJvm closed the pull request at:

    https://github.com/apache/spark/pull/1066
[GitHub] spark pull request: Master supports pluggable clock
GitHub user CrazyJvm reopened a pull request:

    https://github.com/apache/spark/pull/1066

    Master supports pluggable clock

    Convenient for testing, especially in timeout scenarios.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark master-clock

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1066.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1066

commit 4dddc889ff7e3c0f27ff65f6fe3101ead9cd91d9
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-06-12T07:59:30Z

    Master supports pluggable clock

commit a002e423b85c4c46b1e54c761caad95dc34ef923
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-06-12T13:17:42Z

    fix exception "no matching constructor found on class org.apache.spark.deploy.master.Master for arguments"
[GitHub] spark pull request: Master supports pluggable clock
Github user CrazyJvm commented on the pull request:

    https://github.com/apache/spark/pull/1066#issuecomment-45899433

    I cannot figure out why the build failed here, since everything is OK on my Mac.
[GitHub] spark pull request: Master supports pluggable clock
Github user CrazyJvm closed the pull request at:

    https://github.com/apache/spark/pull/1066
[GitHub] spark pull request: Master supports pluggable clock
GitHub user CrazyJvm reopened a pull request:

    https://github.com/apache/spark/pull/1066

    Master supports pluggable clock

    Convenient for testing, especially in timeout scenarios.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark master-clock

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1066.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1066

commit 4dddc889ff7e3c0f27ff65f6fe3101ead9cd91d9
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-06-12T07:59:30Z

    Master supports pluggable clock

commit a002e423b85c4c46b1e54c761caad95dc34ef923
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-06-12T13:17:42Z

    fix exception "no matching constructor found on class org.apache.spark.deploy.master.Master for arguments"
[GitHub] spark pull request: Use pluggable clock in DAGScheduler #SPARK-2031
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/976

    Use pluggable clock in DAGScheduler #SPARK-2031

    DAGScheduler supports a pluggable clock like TaskSetManager does.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark clock

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/976.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #976

commit 6779a4c70f43f705b218bdd28bb9cdffaa4a4b1c
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-06-05T06:32:46Z

    Use pluggable clock in DAGSheduler
[GitHub] spark pull request: SPARK-1999: StorageLevel in storage tab and RD...
Github user CrazyJvm commented on a diff in the pull request:

    https://github.com/apache/spark/pull/968#discussion_r13429125

--- Diff: core/src/main/scala/org/apache/spark/storage/RDDInfo.scala ---

```scala
@@ -33,6 +33,7 @@ class RDDInfo(
   var memSize = 0L
   var diskSize = 0L
   var tachyonSize = 0L
+  var _storageLevel = storageLevel
```

--- End diff --

    I agree with you, so I will change it. Another reason to make storageLevel a var is that the RDD information also uses it; it should be updated, too. https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/RDDInfo.scala#L37-L43
[GitHub] spark pull request: JIRA https://issues.apache.org/jira/browse/SPA...
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/968

    JIRA https://issues.apache.org/jira/browse/SPARK-1999

    StorageLevel in 'storage tab' and 'RDD Storage Info' never changes, even if you call rdd.unpersist() and then give the rdd a different storage level.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark ui-storagelevel

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/968.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #968

commit 9f1571ef47bb7975eed55441f01aecda25851f74
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-06-04T08:36:02Z

    JIRA https://issues.apache.org/jira/browse/SPARK-1999 UI : StorageLevel in storage tab and RDD Storage Info never changes
[GitHub] spark pull request: StorageLevel in 'storage tab' and 'RDD Storage...
GitHub user CrazyJvm opened a pull request:

    https://github.com/apache/spark/pull/950

    StorageLevel in 'storage tab' and 'RDD Storage Info' never changes #SPARK-1999#

    StorageLevel in 'storage tab' and 'RDD Storage Info' never changes, even if you call rdd.unpersist() and then give the rdd a different storage level.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/950.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #950

commit bbe28d2b4fee09258cd89c1085e0597d188bffa0
Author: Chen Chao <crazy...@gmail.com>
Date: 2014-06-02T08:25:51Z

    Merge pull request #3 from apache/master

    Add landmark-based Shortest Path algorithm to graphx.lib

commit a158ff031c021491f4e1ddd5c13f8317905c76ee
Author: Chen Chao <crazy...@gmail.com>
Date: 2014-06-03T01:16:34Z

    Merge pull request #4 from apache/master

    merge from master

commit 21aef67288bd3d50f29f4e4dceaa8df71d9e279d
Author: CrazyJvm <crazy...@gmail.com>
Date: 2014-06-03T10:52:38Z

    fix ui storagelevel bug
[GitHub] spark pull request: StorageLevel in 'storage tab' and 'RDD Storage...
Github user CrazyJvm closed the pull request at: https://github.com/apache/spark/pull/950
[GitHub] spark pull request: cannot connect to cluster in Standalone mode w...
GitHub user CrazyJvm opened a pull request: https://github.com/apache/spark/pull/952

Cannot connect to the cluster in Standalone mode when running spark-shell on one of the cluster nodes without the master option (SPARK-2000).

JIRA: https://issues.apache.org/jira/browse/SPARK-2000

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark patch-9

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/952.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #952

commit 08579af6d2aa2239da5b6532094301a4c4afe86b
Author: Chen Chao crazy...@gmail.com
Date: 2014-06-03T15:02:26Z

    connect to cluster automatically
[GitHub] spark pull request: cannot connect to cluster in Standalone mode w...
Github user CrazyJvm commented on the pull request: https://github.com/apache/spark/pull/952#issuecomment-44985944

Hi witgo, I still cannot figure out what you mean... could you please give me some more detailed clues?
[GitHub] spark pull request: cannot connect to cluster in Standalone mode w...
Github user CrazyJvm commented on the pull request: https://github.com/apache/spark/pull/952#issuecomment-45045909

Thanks, witgo. I think your suggestion may be better than my current solution, since it does not require modifying the shell scripts. Another problem is that spark-shell cannot read spark-env.sh when submitting, because it does not source 'load-spark-env.sh'. I will modify and test; thanks a lot.
[GitHub] spark pull request: correct tiny comment error
GitHub user CrazyJvm opened a pull request: https://github.com/apache/spark/pull/928

Correct a tiny comment error.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark patch-8

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/928.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #928

commit 144328bc2262e15aaff60e70ec0abfb807c9bb43
Author: Chen Chao crazy...@gmail.com
Date: 2014-05-31T06:22:53Z

    correct tiny comment error
[GitHub] spark pull request: default task number misleading in several plac...
Github user CrazyJvm commented on a diff in the pull request: https://github.com/apache/spark/pull/766#discussion_r12670482

--- Diff: docs/streaming-programming-guide.md ---
@@ -522,9 +522,9 @@ common ones are as follows.
   <td><b>reduceByKey</b>(<i>func</i>, [<i>numTasks</i>])</td>
   <td>When called on a DStream of (K, V) pairs, return a new DStream of (K, V) pairs where the values for each key are aggregated using the given reduce function. <b>Note:</b> By default,
-  this uses Spark's default number of parallel tasks (2 for local machine, 8 for a cluster) to
-  do the grouping. You can pass an optional <code>numTasks</code> argument to set a different
-  number of tasks.</td>
+  this uses Spark's default number of parallel tasks (local mode is 2, while cluster mode is
--- End diff --

It's good, I think :)
[GitHub] spark pull request: default task number misleading in several plac...
GitHub user CrazyJvm opened a pull request: https://github.com/apache/spark/pull/766

The default task number is misleading in several places.

    private[streaming] def defaultPartitioner(numPartitions: Int = self.ssc.sc.defaultParallelism) = {
      new HashPartitioner(numPartitions)
    }

This shows that the default task number in Spark Streaming depends on the variable defaultParallelism in SparkContext, which is in turn decided by the config property spark.default.parallelism. For that property, refer to https://github.com/apache/spark/pull/389.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark patch-7

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/766.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #766

commit cc5b66c1883eca8862b8f37ef50d64cc0408c54c
Author: Chen Chao crazy...@gmail.com
Date: 2014-05-14T07:45:10Z

    default task number misleading in several places
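The fallback described above can be sketched in Python (a hypothetical mimic of the quoted Scala snippet, not Spark's actual API; `default_parallelism` stands in for `SparkContext.defaultParallelism`):

```python
def default_partitioner_num(num_partitions=None, default_parallelism=8):
    # Mimics Spark Streaming's defaultPartitioner: when the caller passes
    # no explicit numTasks, fall back to SparkContext.defaultParallelism,
    # which in turn comes from spark.default.parallelism.
    return default_parallelism if num_partitions is None else num_partitions

# No explicit task count: the configured default wins.
print(default_partitioner_num(default_parallelism=4))      # 4
# An explicit numTasks overrides the default.
print(default_partitioner_num(10, default_parallelism=4))  # 10
```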
[GitHub] spark pull request: Args for worker rather than master
GitHub user CrazyJvm opened a pull request: https://github.com/apache/spark/pull/587

Args for worker rather than master.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark patch-6

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/587.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #587

commit b54b89f3c83e8ae41ce65c806674e1675add72f1
Author: Chen Chao crazy...@gmail.com
Date: 2014-04-29T08:22:56Z

    Args for worker rather than master
[GitHub] spark pull request: misleading task number of groupByKey
GitHub user CrazyJvm opened a pull request: https://github.com/apache/spark/pull/403

Misleading task number of groupByKey: the statement "By default, this uses only 8 parallel tasks to do the grouping." is quite misleading. Please refer to https://github.com/apache/spark/pull/389. The detail is in the following code:

    def defaultPartitioner(rdd: RDD[_], others: RDD[_]*): Partitioner = {
      val bySize = (Seq(rdd) ++ others).sortBy(_.partitions.size).reverse
      for (r <- bySize if r.partitioner.isDefined) {
        return r.partitioner.get
      }
      if (rdd.context.conf.contains("spark.default.parallelism")) {
        new HashPartitioner(rdd.context.defaultParallelism)
      } else {
        new HashPartitioner(bySize.head.partitions.size)
      }
    }

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark patch-4

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/403.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #403

commit 156833643d9ea1479222e9033164e92a9846351c
Author: Chen Chao crazy...@gmail.com
Date: 2014-04-14T07:39:50Z

    misleading task number of groupByKey
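The three-step selection in defaultPartitioner can be mimicked with a short Python sketch (a hypothetical illustration, not Spark code; RDDs are modeled as plain dicts with a partition count and an optional partitioner):

```python
def default_partitioner(rdds, conf):
    # 1. Prefer an existing partitioner from the largest RDD that has one.
    by_size = sorted(rdds, key=lambda r: r["partitions"], reverse=True)
    for r in by_size:
        if r.get("partitioner") is not None:
            return r["partitioner"]
    # 2. Otherwise honour spark.default.parallelism if it is set...
    if "spark.default.parallelism" in conf:
        return ("HashPartitioner", conf["spark.default.parallelism"])
    # 3. ...and only then fall back to the largest RDD's partition count,
    #    which is where the hard-coded "8" in the docs came from.
    return ("HashPartitioner", by_size[0]["partitions"])

rdds = [{"partitions": 8}, {"partitions": 16}]
# Without the property set, the largest RDD's partition count is used:
print(default_partitioner(rdds, {}))                                # ('HashPartitioner', 16)
# With it set, the configured value wins:
print(default_partitioner(rdds, {"spark.default.parallelism": 4}))  # ('HashPartitioner', 4)
```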
[GitHub] spark pull request: style fix
GitHub user CrazyJvm opened a pull request: https://github.com/apache/spark/pull/411

Style fix: delete a semicolon.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark patch-5

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/411.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #411

commit de5d9a75e2c16a7c85c6cfbae9f052994ee60688
Author: Chen Chao crazy...@gmail.com
Date: 2014-04-15T05:57:17Z

    style fix: delete semicolon
[GitHub] spark pull request: update spark.default.parallelism
GitHub user CrazyJvm opened a pull request: https://github.com/apache/spark/pull/389

Update spark.default.parallelism. Actually, the value 8 is only valid in Mesos fine-grained mode:

    override def defaultParallelism() = sc.conf.getInt("spark.default.parallelism", 8)

while in coarse-grained mode, including Mesos coarse-grained, the value of the property depends on the number of cores:

    override def defaultParallelism(): Int = {
      conf.getInt("spark.default.parallelism", math.max(totalCoreCount.get(), 2))
    }

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CrazyJvm/spark patch-2

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/389.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #389

commit ee0fae00ad0294dedca6a78ed05b97ef0ddcc211
Author: Chen Chao crazy...@gmail.com
Date: 2014-04-11T06:54:58Z

    update spark.default.parallelism
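The mode-dependent fallback quoted above can be summarized in a small Python sketch (a hypothetical mimic of the two Scala overrides; the mode names and `total_cores` parameter are illustrative, not Spark API):

```python
def default_parallelism(conf, mode, total_cores=0):
    # The explicit property always wins, regardless of scheduler mode.
    if "spark.default.parallelism" in conf:
        return conf["spark.default.parallelism"]
    # Mesos fine-grained hard-codes a fallback of 8; coarse-grained
    # backends instead fall back to max(total core count, 2).
    if mode == "mesos-fine-grained":
        return 8
    return max(total_cores, 2)

print(default_parallelism({}, "mesos-fine-grained"))              # 8
print(default_parallelism({}, "coarse-grained", total_cores=16))  # 16
print(default_parallelism({"spark.default.parallelism": 4},
                          "coarse-grained", total_cores=16))      # 4
```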
[GitHub] spark pull request: Merge Hadoop Into Spark
Github user CrazyJvm commented on the pull request: https://github.com/apache/spark/pull/286#issuecomment-39278059

+1, amazing! I've been looking forward to it for a long time! Thanks!