[jira] [Commented] (SPARK-11098) RPC message ordering is not guaranteed
[ https://issues.apache.org/jira/browse/SPARK-11098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966657#comment-14966657 ]

Shixiong Zhu commented on SPARK-11098:
--------------------------------------

[~vanzin] Sorry, I didn't realize you were not working on this one. I just sent a PR to fix the message ordering issue.

I think we should guarantee that if the same RpcEndpointRef is used to send multiple messages from the same thread, the receiver sees those messages in the same order.

> RPC message ordering is not guaranteed
> --------------------------------------
>
>                 Key: SPARK-11098
>                 URL: https://issues.apache.org/jira/browse/SPARK-11098
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core
>            Reporter: Reynold Xin
>
> NettyRpcEnv doesn't guarantee message delivery order since there are multiple
> threads sending messages in clientConnectionExecutor thread pool. We should
> fix that.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
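The guarantee proposed in the comment above (one sender thread, one endpoint reference, in-order delivery) can be illustrated with a toy model. This is only a sketch in Python, not the code from the pull request: the `Outbox` class and its `send`/`close` methods are hypothetical names, and the assumption is that each destination gets its own FIFO queue drained by a single thread, instead of handing every message to a shared thread pool.

```python
import queue
import threading


class Outbox:
    """Hypothetical per-destination outbox (not Spark's implementation).

    Messages are queued in FIFO order and delivered by exactly one
    drainer thread, so two messages sent by the same caller thread can
    never overtake each other on the way to the receiver.
    """

    def __init__(self, deliver):
        self._queue = queue.Queue()   # FIFO, thread-safe
        self._deliver = deliver       # callback standing in for the network write
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def send(self, message):
        # Enqueue only; the caller never writes to the connection directly.
        self._queue.put(message)

    def _drain(self):
        while True:
            message = self._queue.get()
            if message is None:       # sentinel pushed by close()
                break
            self._deliver(message)

    def close(self):
        self._queue.put(None)
        self._thread.join()


received = []
outbox = Outbox(received.append)
for i in range(1000):
    outbox.send(i)
outbox.close()
print(received == list(range(1000)))
```

With a pool of several sender threads pulling from a shared work queue, nothing prevents message 2 from being written before message 1; serializing delivery per destination removes that freedom at the cost of some parallelism.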
[jira] [Commented] (SPARK-11098) RPC message ordering is not guaranteed
[ https://issues.apache.org/jira/browse/SPARK-11098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14966651#comment-14966651 ]

Apache Spark commented on SPARK-11098:
--------------------------------------

User 'zsxwing' has created a pull request for this issue:
https://github.com/apache/spark/pull/9197
[jira] [Commented] (SPARK-11098) RPC message ordering is not guaranteed
[ https://issues.apache.org/jira/browse/SPARK-11098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957232#comment-14957232 ]

Marcelo Vanzin commented on SPARK-11098:
----------------------------------------

I'm not explicitly working on this at the moment.
[jira] [Commented] (SPARK-11098) RPC message ordering is not guaranteed
[ https://issues.apache.org/jira/browse/SPARK-11098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14956396#comment-14956396 ]

Reynold Xin commented on SPARK-11098:
-------------------------------------

[~vanzin] zsxwing told me you were working on this. Let me know if it is not the case.
[jira] [Commented] (SPARK-11098) RPC message ordering is not guaranteed
[ https://issues.apache.org/jira/browse/SPARK-11098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958103#comment-14958103 ]

Marcelo Vanzin commented on SPARK-11098:
----------------------------------------

So, while working on another patch in this area, I ran into this issue, and I don't think it's a problem in the RPC layer, but rather a problem in the code calling the RPC layer.

Even if you somehow synchronize things in the RPC env implementation so that RPCs are sent in the order they arrive, multiple threads can be calling {{RpcEndpointRef.send()}} or {{RpcEndpointRef.ask()}} at the same time, and at that point there is no guarantee of any order.

The problem I ran into specifically was the Worker ignoring messages from the Master because it thought the Master was not active. That's because those messages were arriving before the Master had replied to the Worker's registration message. That's not the fault of the RPC layer; it's the fault of that reply being sent to the Worker as a separate message instead of as an RPC reply to the {{RegisterWorker}} message. {{Worker}} in this case should be using {{ask}} and getting the reply from that ask; that ensures the reply will arrive before any other messages the Master may want to send to the Worker.

If you want to see how to do that properly, see how {{CoarseGrainedExecutorBackend}} does its registration with the scheduler using {{ask}} instead of {{send}}.

Anyway, I have that fixed in my patch; I might take it out as a separate fix and attach it to this bug. But I'm not sure whether other areas of the code suffer from the same problem.
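The registration race described in the comment above can be modelled in a few lines. This is a deliberately simplified Python sketch, not Spark code: the `Worker` and `Master` classes, `register_with_ask`, and the message strings other than {{RegisterWorker}} (for example "RegisteredWorker" and "LaunchExecutor") are illustrative assumptions. The reordering in the send-style case is simulated by feeding the messages in the unlucky order.

```python
from concurrent.futures import Future


class Master:
    """Toy master: answers a registration request through the returned future."""

    def ask(self, message):
        reply = Future()
        if message == "RegisterWorker":
            # The acknowledgement travels as the reply to this request,
            # not as an independent one-way message.
            reply.set_result("RegisteredWorker")
        return reply


class Worker:
    """Toy worker that ignores master messages until it is registered."""

    def __init__(self):
        self.registered = False
        self.handled = []
        self.dropped = []

    def receive(self, message):
        # Handler for one-way messages coming from the master.
        if message == "RegisteredWorker":
            self.registered = True
        elif self.registered:
            self.handled.append(message)
        else:
            self.dropped.append(message)   # master looks inactive, so ignore

    def register_with_ask(self, master):
        # Block on the reply, so registration completes before any
        # later one-way message is processed by this worker.
        reply = master.ask("RegisterWorker")
        if reply.result(timeout=1) == "RegisteredWorker":
            self.registered = True


# send-style: the acknowledgement is an ordinary message and loses the race.
racy = Worker()
for message in ["LaunchExecutor", "RegisteredWorker"]:
    racy.receive(message)

# ask-style: the acknowledgement is the reply to the request itself.
safe = Worker()
safe.register_with_ask(Master())
safe.receive("LaunchExecutor")

print(racy.dropped, safe.handled)
```

In the send-style run the worker drops the first real message because the acknowledgement has not arrived yet; in the ask-style run the worker is already registered when that message shows up.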
[jira] [Commented] (SPARK-11098) RPC message ordering is not guaranteed
[ https://issues.apache.org/jira/browse/SPARK-11098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958309#comment-14958309 ]

Marcelo Vanzin commented on SPARK-11098:
----------------------------------------

Never mind (too much). After thinking a little more: while the above is a potential problem, it's unrelated to this bug (and might not be fixed by fixing this bug). So I'll file a separate issue for that fix.