GitHub user drexin commented on the pull request:

    https://github.com/apache/spark/pull/1358#issuecomment-48706534
  
    Hi Patrick,
    
    The problem is described in [this mailing list entry](http://mail-archives.apache.org/mod_mbox/mesos-user/201407.mbox/%3c53b66e6d.7090...@uninett.no%3e).
    
    If I understand the [documentation on run modes](http://spark.apache.org/docs/latest/running-on-mesos.html) and the code correctly, in fine-grained mode Spark starts a separate instance of `MesosExecutorBackend` for each Spark task. If that is correct, then as soon as two tasks run concurrently on the same machine we should run into this problem.
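    
    To make the collision concrete, here is a minimal sketch (only `BlockManagerId`'s fields mirror Spark's; the object, names, and values are illustrative) of two backends on one slave producing distinct `BlockManagerId`s that share an `executorId`:
    
    ```scala
    // Simplified stand-in for org.apache.spark.storage.BlockManagerId.
    case class BlockManagerId(executorId: String, host: String, port: Int)
    
    object CollisionSketch extends App {
      // In fine-grained mode each Mesos task launches its own
      // MesosExecutorBackend, which registers a BlockManager. The
      // executorId is the Mesos slaveId, but the port is assigned at
      // random, so two concurrent tasks on one slave yield two distinct
      // BlockManagerIds that nevertheless share the same executorId:
      val first  = BlockManagerId("slave-1", "host-a", 49152)
      val second = BlockManagerId("slave-1", "host-a", 49153)
    
      assert(first.executorId == second.executorId) // same slaveId
      assert(first != second) // different port => different BlockManagerId
    }
    ```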
    
    On [this line](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala#L329) in the `BlockManagerMasterActor`, there is a check on the `BlockManagerId`, which will always differ per `Executor` instance, because the port it contains is randomly assigned. The `executorId`, however, is always set to the Mesos `slaveId`. This means that we run into [this case](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala#L331-L335) as soon as we start two `Executor` instances on the same slave. This PR fixes the problem by appending a counter to the `executorId`. Please tell me if I overlooked something.
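    
    A hedged sketch of the failing check and of the counter-based fix (the real `register` method in `BlockManagerMasterActor` does more bookkeeping; the map, the error message, and the helper below are simplified stand-ins):
    
    ```scala
    import java.util.concurrent.atomic.AtomicInteger
    import scala.collection.mutable
    
    case class BlockManagerId(executorId: String, host: String, port: Int)
    
    object MasterSketch {
      // The master tracks one BlockManagerId per executorId. A second
      // registration under the same executorId but with a different id
      // (the random port differs) hits the error branch linked above.
      private val blockManagerIdByExecutor =
        mutable.HashMap.empty[String, BlockManagerId]
    
      def register(id: BlockManagerId): Unit =
        blockManagerIdByExecutor.get(id.executorId) match {
          case Some(oldId) if oldId != id =>
            sys.error(s"Got two registrations for one executor: $oldId, $id")
          case _ =>
            blockManagerIdByExecutor(id.executorId) = id
        }
    
      // The fix sketched here: make each backend's executorId unique by
      // appending a counter to the slaveId, so concurrent executors on
      // the same slave no longer collide.
      private val counter = new AtomicInteger(0)
      def uniqueExecutorId(slaveId: String): String =
        s"$slaveId/${counter.getAndIncrement()}"
    }
    ```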

