[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-09-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/5178


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-09-01 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-136894558
  
Let's close this patch and reopen it later if necessary.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-07-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-121073615
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-06-18 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-113335190
  
@viper-kun @maropu any updates? Should we take over? I would recommend that 
we close this patch since it's mostly gone stale. We can always reopen an 
updated version later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-19 Thread viper-kun
Github user viper-kun commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-94250246
  
@maropu  I will update it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-19 Thread viper-kun
Github user viper-kun commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r28650737
  
--- Diff: 
core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockManager.scala ---
@@ -180,7 +180,8 @@ class FileShuffleBlockManager(conf: SparkConf)
 Some(segment.nioByteBuffer())
   }
 
-  override def getBlockData(blockId: ShuffleBlockId): ManagedBuffer = {
+  override def getBlockData(blockId: ShuffleBlockId,
+  blockManagerId: BlockManagerId = blockManager.blockManagerId): 
ManagedBuffer = {
--- End diff --

It not support HashShuffle with consolidateShuffleFiles. For not 
confuse,it only support SortShuffleManager.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-17 Thread maropu
Github user maropu commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-93917004
  
@viper-kun What's the status of this patch? If you don't make further 
updates, I'd like to brush up this patch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-05 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27781900
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala 
---
@@ -181,68 +181,102 @@ final class ShuffleBlockFetcherIterator(
 // at most maxBytesInFlight in order to limit the amount of data in 
flight.
 val remoteRequests = new ArrayBuffer[FetchRequest]
 
+val shuffleMgrName = blockManager.conf.get(spark.shuffle.manager, 
sort)
+val externalShuffleServiceEnabled =
+  blockManager.conf.getBoolean(spark.shuffle.service.enabled, false)
+
 // Tracks total number of blocks (including zero sized blocks)
 var totalBlocks = 0
-for ((address, blockInfos) - blocksByAddress) {
-  totalBlocks += blockInfos.size
-  if (address.executorId == blockManager.blockManagerId.executorId) {
-// Filter out zero-sized blocks
-localBlocks ++= blockInfos.filter(_._2 != 0).map(_._1)
-numBlocksToFetch += localBlocks.size
-  } else {
-val iterator = blockInfos.iterator
-var curRequestSize = 0L
-var curBlocks = new ArrayBuffer[(BlockId, Long)]
-while (iterator.hasNext) {
-  val (blockId, size) = iterator.next()
-  // Skip empty blocks
-  if (size  0) {
-curBlocks += ((blockId, size))
-remoteBlocks += blockId
-numBlocksToFetch += 1
-curRequestSize += size
-  } else if (size  0) {
-throw new BlockException(blockId, Negative block size  + 
size)
-  }
-  if (curRequestSize = targetRequestSize) {
-// Add this FetchRequest
-remoteRequests += new FetchRequest(address, curBlocks)
-curBlocks = new ArrayBuffer[(BlockId, Long)]
-logDebug(sCreating fetch request of $curRequestSize at 
$address)
-curRequestSize = 0
-  }
+if (shuffleMgrName.toLowerCase == hash || 
externalShuffleServiceEnabled) {
+  for ((address, blockInfos) - blocksByAddress) {
--- End diff --

We should check if SortShuffleManager is used because this patch only 
supports it. This logic fails when new shuffle managers will be implemented.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-05 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27781922
  
--- Diff: 
core/src/main/scala/org/apache/spark/shuffle/FileShuffleBlockManager.scala ---
@@ -180,7 +180,8 @@ class FileShuffleBlockManager(conf: SparkConf)
 Some(segment.nioByteBuffer())
   }
 
-  override def getBlockData(blockId: ShuffleBlockId): ManagedBuffer = {
+  override def getBlockData(blockId: ShuffleBlockId,
+  blockManagerId: BlockManagerId = blockManager.blockManagerId): 
ManagedBuffer = {
--- End diff --

Why this patch only supports SortShuffleManager?
ISTM the same logic could be straightforwardly implemented in 
HashShuffleManager.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-05 Thread maropu
Github user maropu commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-89735823
  
Understood.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-04 Thread scwf
Github user scwf commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-89716003
  
@maropu , yeah i think it is a common case for yarn mode. We often specify 
more executors than nodemanager, that means there are more than one executor on 
one machine.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-03 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27719488
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---
@@ -223,6 +226,15 @@ class BlockManagerMasterActor(val isLocal: Boolean, 
conf: SparkConf, listenerBus
 }
   }
 
+  // Return local dirs of other blockmanager on the same machine as 
blockManagerId
+  private def getLocalDirsPath(
+blockManagerId: BlockManagerId): Map[BlockManagerId, 
Array[String]] = {
+blockManagerInfo
+  .filter { case(id, _) = (id != blockManagerId  id.host == 
blockManagerId.host)}
--- End diff --

Nit: Unnecessary parentheses.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-04-03 Thread maropu
Github user maropu commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-89318869
  
One question; are there many cases for executors to share a single host in 
the Yarn mode?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-31 Thread viper-kun
Github user viper-kun commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-88064150
  
@andrewor14  Pls review it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87572136
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29390/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-30 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87577323
  
  [Test build #29391 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29391/consoleFull)
 for   PR 5178 at commit 
[`6c5c1d4`](https://github.com/apache/spark/commit/6c5c1d4d143e4806edd6cf747b84c56f992f14a9).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-30 Thread scwf
Github user scwf commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87571847
  
Jenkins, test this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87566631
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29385/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87611637
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29391/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87535646
  
  [Test build #29381 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29381/consoleFull)
 for   PR 5178 at commit 
[`c429b66`](https://github.com/apache/spark/commit/c429b6617c042accf9028c61ba604bb8a6fe2635).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87552816
  
  [Test build #29381 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29381/consoleFull)
 for   PR 5178 at commit 
[`c429b66`](https://github.com/apache/spark/commit/c429b6617c042accf9028c61ba604bb8a6fe2635).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends 
ToBlockManagerMaster`

 * This patch does not change any dependencies.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87552831
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29381/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87547529
  
  [Test build #29385 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29385/consoleFull)
 for   PR 5178 at commit 
[`7ae6e46`](https://github.com/apache/spark/commit/7ae6e46dbdf5b9f7853c1ff2f03ea6c39af259ab).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87370436
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29361/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87370430
  
  [Test build #29361 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29361/consoleFull)
 for   PR 5178 at commit 
[`383a21f`](https://github.com/apache/spark/commit/383a21ff062a3f96513e72705596527a2e571cb0).
 * This patch **fails Scala style tests**.

 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends 
ToBlockManagerMaster`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87369980
  
  [Test build #29361 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29361/consoleFull)
 for   PR 5178 at commit 
[`383a21f`](https://github.com/apache/spark/commit/383a21ff062a3f96513e72705596527a2e571cb0).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87370834
  
  [Test build #29363 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29363/consoleFull)
 for   PR 5178 at commit 
[`6ee8df2`](https://github.com/apache/spark/commit/6ee8df2ccd10b05c031ed10ef3efe20a0b016a1e).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87378645
  
  [Test build #29363 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29363/consoleFull)
 for   PR 5178 at commit 
[`6ee8df2`](https://github.com/apache/spark/commit/6ee8df2ccd10b05c031ed10ef3efe20a0b016a1e).
 * This patch **fails Spark unit tests**.

 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends 
ToBlockManagerMaster`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87378679
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29363/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-28 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87212610
  
  [Test build #29346 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29346/consoleFull)
 for   PR 5178 at commit 
[`9ddf84c`](https://github.com/apache/spark/commit/9ddf84cb983182abb208a49256c3037868229295).
 * This patch **fails Spark unit tests**.

 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends 
ToBlockManagerMaster`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87212626
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29346/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-28 Thread scwf
Github user scwf commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-87201266
  
Jenkins, test this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-27 Thread viper-kun
Github user viper-kun commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86849484
  
Hi @andrewor14. pls retest it, test build time out.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86906368
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29296/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86906360
  
**[Test build #29296 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29296/consoleFull)**
 for PR 5178 at commit 
[`bb94736`](https://github.com/apache/spark/commit/bb9473641c575e5022b95ee89372fe8981aa0814)
 after a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-27 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86878577
  
Jenkins, test this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-27 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86879709
  
  [Test build #29296 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29296/consoleFull)
 for   PR 5178 at commit 
[`bb94736`](https://github.com/apache/spark/commit/bb9473641c575e5022b95ee89372fe8981aa0814).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86430853
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29217/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86430835
  
**[Test build #29217 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29217/consoleFull)**
 for PR 5178 at commit 
[`0eaf1af`](https://github.com/apache/spark/commit/0eaf1afdc0c082fde53f3edcbc45ee75425ea98b)
 after a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86433963
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29218/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86433945
  
**[Test build #29218 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29218/consoleFull)**
 for PR 5178 at commit 
[`6162aef`](https://github.com/apache/spark/commit/6162aef5a5a2be52cf1b706b6cfc0c1aec204552)
 after a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86805460
  
  [Test build #29274 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29274/consoleFull)
 for   PR 5178 at commit 
[`bb94736`](https://github.com/apache/spark/commit/bb9473641c575e5022b95ee89372fe8981aa0814).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86828278
  
**[Test build #29274 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29274/consoleFull)**
 for PR 5178 at commit 
[`bb94736`](https://github.com/apache/spark/commit/bb9473641c575e5022b95ee89372fe8981aa0814)
 after a configured wait of `120m`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86828291
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29274/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86383226
  
  [Test build #29217 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29217/consoleFull)
 for   PR 5178 at commit 
[`0eaf1af`](https://github.com/apache/spark/commit/0eaf1afdc0c082fde53f3edcbc45ee75425ea98b).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-26 Thread viper-kun
Github user viper-kun commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27193887
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala 
---
@@ -181,68 +181,100 @@ final class ShuffleBlockFetcherIterator(
 // at most maxBytesInFlight in order to limit the amount of data in 
flight.
 val remoteRequests = new ArrayBuffer[FetchRequest]
 
+val shuffleMgrName = SparkEnv.get.conf.get(spark.shuffle.manager, 
sort)
+
 // Tracks total number of blocks (including zero sized blocks)
 var totalBlocks = 0
-for ((address, blockInfos) - blocksByAddress) {
-  totalBlocks += blockInfos.size
-  if (address.executorId == blockManager.blockManagerId.executorId) {
-// Filter out zero-sized blocks
-localBlocks ++= blockInfos.filter(_._2 != 0).map(_._1)
-numBlocksToFetch += localBlocks.size
-  } else {
-val iterator = blockInfos.iterator
-var curRequestSize = 0L
-var curBlocks = new ArrayBuffer[(BlockId, Long)]
-while (iterator.hasNext) {
-  val (blockId, size) = iterator.next()
-  // Skip empty blocks
-  if (size  0) {
-curBlocks += ((blockId, size))
-remoteBlocks += blockId
-numBlocksToFetch += 1
-curRequestSize += size
-  } else if (size  0) {
-throw new BlockException(blockId, Negative block size  + 
size)
-  }
-  if (curRequestSize = targetRequestSize) {
-// Add this FetchRequest
-remoteRequests += new FetchRequest(address, curBlocks)
-curBlocks = new ArrayBuffer[(BlockId, Long)]
-logDebug(sCreating fetch request of $curRequestSize at 
$address)
-curRequestSize = 0
-  }
+if(shuffleMgrName == hash) {
--- End diff --

what are you means?how to specify the full path?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-26 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86388109
  
  [Test build #29218 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29218/consoleFull)
 for   PR 5178 at commit 
[`6162aef`](https://github.com/apache/spark/commit/6162aef5a5a2be52cf1b706b6cfc0c1aec204552).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27157449
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -81,7 +89,38 @@ private[spark] class DiskBlockManager(blockManager: 
BlockManager, conf: SparkCon
 new File(subDir, filename)
   }
 
-  def getFile(blockId: BlockId): File = getFile(blockId.name)
+  /** Looks up a file by blockManagerId */
+  // This method should be kept in sync with
+  // 
org.apache.spark.network.shuffle.StandaloneShuffleBlockManager#getFile().
--- End diff --

It seems that the changes here caused this to be out of sync, right? Do we 
want to make the same changes in `StandaloneShuffleBlockManager` too?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86183403
  
  [Test build #29167 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29167/consoleFull)
 for   PR 5178 at commit 
[`5b766ca`](https://github.com/apache/spark/commit/5b766ca3f02d2e9b213e39908aec9c6b5804b623).
 * This patch **fails Scala style tests**.

 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class GetLocalDirsPath(blockManagerId: BlockManagerId) extends 
ToBlockManagerMaster`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86183415
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29167/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27156826
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---
@@ -52,9 +52,9 @@ class BlockManagerMasterActor(val isLocal: Boolean, conf: 
SparkConf, listenerBus
 
   private val akkaTimeout = AkkaUtils.askTimeout(conf)
 
-  override def receiveWithLogging: PartialFunction[Any, Unit] = {
--- End diff --

please keep the return type


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86182951
  
  [Test build #29167 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29167/consoleFull)
 for   PR 5178 at commit 
[`5b766ca`](https://github.com/apache/spark/commit/5b766ca3f02d2e9b213e39908aec9c6b5804b623).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27159008
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -81,7 +89,38 @@ private[spark] class DiskBlockManager(blockManager: 
BlockManager, conf: SparkCon
 new File(subDir, filename)
   }
 
-  def getFile(blockId: BlockId): File = getFile(blockId.name)
+  /** Looks up a file by blockManagerId */
+  // This method should be kept in sync with
+  // 
org.apache.spark.network.shuffle.StandaloneShuffleBlockManager#getFile().
+  def getFile(filename: String,  blockManagerId: BlockManagerId): File = {
+// Figure out which local directory it hashes to, and which 
subdirectory in that
+val hash = Utils.nonNegativeHash(filename)
+
+var localDirsPath = localDirsByBlkMgr.get(blockManagerId)
+if(!localDirsPath.isDefined) {
+  localDirsPath = localDirsByBlkMgr.synchronized {
+val old = localDirsByBlkMgr.get(blockManagerId)
+if(old.isDefined) {
+  old
+} else {
+  localDirsByBlkMgr ++= 
blockManager.master.getLocalDirsPath(blockManager.blockManagerId)
+  localDirsByBlkMgr.get(blockManagerId)
+}
+  }
+}
+
+val dirId = hash % localDirsPath.get.length
+val subDirId = (hash / localDirsPath.get.length) % subDirsPerLocalDir
+
+new File(localDirsPath.get(dirId) + / + %02x.format(subDirId), 
filename)
+  }
+
+  def getFile(blockId: BlockId, blockManagerId: BlockManagerId = 
blockManager.blockManagerId): File = {
+if(blockManagerId == blockManager.blockManagerId)
+  getFile(blockId.name)
+else
+  getFile(blockId.name, blockManagerId)
--- End diff --

style:
```
if (...) {
  getFile(...)
} else {
  getFile(...)
}
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27159189
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -81,7 +89,38 @@ private[spark] class DiskBlockManager(blockManager: 
BlockManager, conf: SparkCon
 new File(subDir, filename)
   }
 
-  def getFile(blockId: BlockId): File = getFile(blockId.name)
+  /** Looks up a file by blockManagerId */
+  // This method should be kept in sync with
+  // 
org.apache.spark.network.shuffle.StandaloneShuffleBlockManager#getFile().
+  def getFile(filename: String,  blockManagerId: BlockManagerId): File = {
--- End diff --

should this be called `getLocalFile`? Also the javadoc should explicitly 
say that the given `blockManagerId` is expected to refer to an executor started 
on this particular machine (i.e. it should be local).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86182102
  
ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27156748
  
--- Diff: 
core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockManager.scala ---
@@ -52,12 +52,16 @@ class IndexShuffleBlockManager(conf: SparkConf) extends 
ShuffleBlockManager {
 ShuffleBlockId(shuffleId, mapId, 0)
   }
 
-  def getDataFile(shuffleId: Int, mapId: Int): File = {
-blockManager.diskBlockManager.getFile(ShuffleDataBlockId(shuffleId, 
mapId, 0))
+  def getDataFile(shuffleId: Int,
+   mapId: Int,
+   blockManagerId: BlockManagerId = 
blockManager.blockManagerId): File = {
--- End diff --

hey @viper-kun the style here and other places should be:
```
def getDataFile(
shuffleId: Int,
mapId: Int,
blockManagerId: BlockManagerId = ...): File = {
  ...
}
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27157027
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---
@@ -223,6 +226,14 @@ class BlockManagerMasterActor(val isLocal: Boolean, 
conf: SparkConf, listenerBus
 }
   }
 
+  // Return local dirs of other blockmanager on the same machine as 
blockManagerId
+  private def getLocalDirsPath(blockManagerId: BlockManagerId): 
Map[BlockManagerId, Array[String]] = {
+blockManagerInfo.filter(info = (info._1 != blockManagerId  
info._1.host == blockManagerId.host))
+  .map { case(blockManagerId, info) =
+  (blockManagerId, info.localDirsPath)
+}.toMap
--- End diff --

style
```
blockManagerInfo
  .filter { case (id, _) = (id != blockManagerId  id.host == 
blockManagerId.host) }
  .mapValues { info = info.localDirsPath }
  .toMap
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27159923
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -81,7 +89,38 @@ private[spark] class DiskBlockManager(blockManager: 
BlockManager, conf: SparkCon
 new File(subDir, filename)
   }
 
-  def getFile(blockId: BlockId): File = getFile(blockId.name)
+  /** Looks up a file by blockManagerId */
+  // This method should be kept in sync with
+  // 
org.apache.spark.network.shuffle.StandaloneShuffleBlockManager#getFile().
+  def getFile(filename: String,  blockManagerId: BlockManagerId): File = {
--- End diff --

actually, I think it would make more sense to refactor this and the 
`getFile` in L65 to take the form:
```
def getLocalFile(
filename: String,
localDirs: Seq[String],
createIfAbsent: Boolean = false): File
```
If the block manager is the one that belongs to this executor, then 
`createIfAbsent` will be true. Otherwise it's false.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5178#discussion_r27161351
  
--- Diff: 
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala 
---
@@ -181,68 +181,100 @@ final class ShuffleBlockFetcherIterator(
 // at most maxBytesInFlight in order to limit the amount of data in 
flight.
 val remoteRequests = new ArrayBuffer[FetchRequest]
 
+val shuffleMgrName = SparkEnv.get.conf.get(spark.shuffle.manager, 
sort)
+
 // Tracks total number of blocks (including zero sized blocks)
 var totalBlocks = 0
-for ((address, blockInfos) - blocksByAddress) {
-  totalBlocks += blockInfos.size
-  if (address.executorId == blockManager.blockManagerId.executorId) {
-// Filter out zero-sized blocks
-localBlocks ++= blockInfos.filter(_._2 != 0).map(_._1)
-numBlocksToFetch += localBlocks.size
-  } else {
-val iterator = blockInfos.iterator
-var curRequestSize = 0L
-var curBlocks = new ArrayBuffer[(BlockId, Long)]
-while (iterator.hasNext) {
-  val (blockId, size) = iterator.next()
-  // Skip empty blocks
-  if (size  0) {
-curBlocks += ((blockId, size))
-remoteBlocks += blockId
-numBlocksToFetch += 1
-curRequestSize += size
-  } else if (size  0) {
-throw new BlockException(blockId, Negative block size  + 
size)
-  }
-  if (curRequestSize = targetRequestSize) {
-// Add this FetchRequest
-remoteRequests += new FetchRequest(address, curBlocks)
-curBlocks = new ArrayBuffer[(BlockId, Long)]
-logDebug(sCreating fetch request of $curRequestSize at 
$address)
-curRequestSize = 0
-  }
+if(shuffleMgrName == hash) {
--- End diff --

I don't think this is safe to do. The user can specify the full path to 
hash shuffle writer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-25 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-86201608
  
Hi @viper-kun it seems that there are quite a few style violations. For 
more detail please see 
https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide or 
https://github.com/databricks/scala-style-guide. Once you fix those I will do a 
closer review.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5178#issuecomment-85807625
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-6521][Core]executors in the same node r...

2015-03-24 Thread viper-kun
GitHub user viper-kun opened a pull request:

https://github.com/apache/spark/pull/5178

[SPARK-6521][Core]executors in the same node read local shuffle file

In the past, executor read other executor's shuffle file in the same node 
by net. This pr make that executors in the same node read local shuffle file In 
sort-based Shuffle. It will reduce net transport.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viper-kun/spark readlocal

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5178.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5178


commit 28a96ec3fe2e9e6adadedddc6e74b2c1ef667b00
Author: xukun 00228947 xukun...@huawei.com
Date:   2015-03-25T02:26:33Z

read local

commit e6d1e0b62e248714042a1f1d463f9ab26b018516
Author: xukun 00228947 xukun...@huawei.com
Date:   2015-03-25T02:46:08Z

read local

commit 5b766ca3f02d2e9b213e39908aec9c6b5804b623
Author: xukun 00228947 xukun...@huawei.com
Date:   2015-03-25T02:49:02Z

fix it




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org