[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-03-30 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388002#comment-14388002 ]

Apache Spark commented on SPARK-5124:
-

User 'zsxwing' has created a pull request for this issue:
https://github.com/apache/spark/pull/5283

> Standardize internal RPC interface
> --
>
> Key: SPARK-5124
> URL: https://issues.apache.org/jira/browse/SPARK-5124
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Reporter: Reynold Xin
>Assignee: Shixiong Zhu
> Fix For: 1.4.0
>
> Attachments: Pluggable RPC - draft 1.pdf, Pluggable RPC - draft 2.pdf
>
>
> In Spark we use Akka as the RPC layer. It would be great if we can 
> standardize the internal RPC interface to facilitate testing. This will also 
> provide the foundation to try other RPC implementations in the future.



[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-03-08 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352101#comment-14352101 ]

Shixiong Zhu commented on SPARK-5124:
-

The problem is that the message may come from caller.receive, but the callee 
wants to send the reply to caller.receiveAndReply.

However, I cannot find a use case for that right now. I did find that some 
RpcEndpoints may need to know the sender's address, so I added a sender method 
to RpcCallContext. I also removed replyWithSender, since it can now be replaced 
with RpcCallContext.sender.sendWithReply(msg, self).
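
For illustration, a minimal sketch of that replacement, assuming simplified 
stub traits (RpcEndpointRef, RpcCallContext, and CalleeEndpoint are stand-ins, 
not the actual draft API):

{code}
import scala.concurrent.Future

// Stub traits standing in for the draft API; signatures are assumptions.
trait RpcEndpointRef {
  def sendWithReply[T](message: Any, sender: RpcEndpointRef): Future[T]
}

trait RpcCallContext {
  def reply(response: Any): Unit
  def sender: RpcEndpointRef // ref of the endpoint that sent the current message
}

class CalleeEndpoint(self: RpcEndpointRef) {
  def receiveAndReply(msg: Any, context: RpcCallContext): Unit = msg match {
    case request =>
      // Instead of a dedicated replyWithSender, start a fresh RPC back to the
      // caller's receiveAndReply via the context's sender reference.
      val followUp: Future[String] =
        context.sender.sendWithReply[String](("need-more-info", request), self)
      context.reply("ack") // the direct reply still goes through the context
  }
}
{code}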

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-03-06 Thread Marcelo Vanzin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350652#comment-14350652 ]

Marcelo Vanzin commented on SPARK-5124:
---

bq. Reply to A and expect a further reply: the reply should be delivered to the receiveAndReply method of A. (replyWithSender)

What's an actual use case here? Why can't this be handled by two RPCs, caller 
-> callee and then callee -> caller, without the need for this separate reply 
method?

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-03-05 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349816#comment-14349816 ]

Reynold Xin commented on SPARK-5124:


RpcCallContext sounds good.

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-03-05 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349808#comment-14349808 ]

Shixiong Zhu commented on SPARK-5124:
-

1. For the local Endpoint, let's put it in a future change.
2. RpcCallContext looks good to me. @rxin, thoughts? 
3. For `replyWithSender`, I copied my answer from the PR here:

For example, RpcEndpoint A calls sendWithReply to send a message to RpcEndpoint 
B. In B's receiveAndReply, there may be two requirements:

- Reply to A without needing a further reply: the reply should be delivered to 
the receive method of A. (reply)
- Reply to A and expect a further reply: the reply should be delivered to the 
receiveAndReply method of A. (replyWithSender)

That's why we need two methods here.

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-03-05 Thread Marcelo Vanzin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349689#comment-14349689 ]

Marcelo Vanzin commented on SPARK-5124:
---

re: a local Endpoint, if it's really necessary, it would probably be an 
implementation of {{RpcEnv}} (similar to how akka implements it). Much of 
{{RpcEnv}} probably doesn't make sense for a local Endpoint, but those 
differences can either be abstracted away (e.g. having a more generic "Env" 
class that {{RpcEnv}} extends) or covered by dummy implementations of the 
things that don't make sense locally.

Either way, it shouldn't be hard to do if it's really needed - but from your 
patch it seems like we can punt that to a different change.

On a different note, I'm not a big fan of the {{RpcResponse}} name. It's not 
the response per se, but something that allows you to respond to an RPC. 
Something like {{RpcContext}} or {{RpcCallContext}} sounds more appropriate.

I'm also a little puzzled by {{replyWithSender}}. Why is it needed? In my view 
of things, when the caller receives a reply to an RPC, it knows to whom the RPC 
was sent, so this method doesn't seem necessary. Unless I've misunderstood what 
it is.

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-03-04 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348215#comment-14348215 ]

Shixiong Zhu commented on SPARK-5124:
-

I made some changes to the above APIs.

1. I added a new trait, RpcResponse, because RpcResponseCallback cannot be used 
directly. Both `receive` and `receiveAndReply` can receive messages when the 
sender is an RpcEndpoint, so if the receiver wants to reply to the sender, it 
needs a way to specify which method the reply should be delivered to. Here is 
the interface of RpcResponse:

{code}
private[spark] trait RpcResponse {
  def reply(response: Any): Unit
  def replyWithSender(response: Any, sender: RpcEndpointRef): Unit
  def fail(e: Throwable): Unit
}
{code}

Calling `reply` will send the message to RpcEndpoint.receive, and calling 
`replyWithSender` will send the message to RpcEndpoint.receiveAndReply.
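
A hedged usage sketch of that trait (WorkerEndpoint and the bare RpcEndpointRef 
below are illustrative stand-ins, not part of the draft):

{code}
trait RpcEndpointRef

// Assumes the RpcResponse trait defined above.
class WorkerEndpoint(self: RpcEndpointRef) {
  def receiveAndReply(msg: Any, response: RpcResponse): Unit = msg match {
    case "status" =>
      response.reply("alive") // delivered to the sender's receive
    case "reconfigure" =>
      // delivered to the sender's receiveAndReply, which can answer back
      response.replyWithSender("new-config", self)
    case other =>
      response.fail(new IllegalArgumentException(s"unexpected message: $other"))
  }
}
{code}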

2. Instead of adding a trait like ThreadSafeRpcEndpoint, I added a new method 
`setupThreadSafeEndpoint` to RpcEnv, so that RpcEnv can decide how to implement 
the thread-safe semantics internally. For `setupEndpoint`, RpcEnv won't 
guarantee thread-safe message delivery.

[~vanzin] For the local Endpoint idea, I think we need a local message 
dispatcher to send/receive messages between different Endpoints; it cannot be 
implemented in the Endpoint alone. We would also need an EndpointRef to refer 
to an Endpoint, and a generic trait shared by the local message dispatcher and 
RpcEnv. It looks a bit complex. What do you think?

Please help review my PR: https://github.com/apache/spark/pull/4588

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-27 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340066#comment-14340066 ]

Shixiong Zhu commented on SPARK-5124:
-

I have not tested the performance. OK, I will remove NetworkRpcEndpoint; if we 
find a performance problem, we can add it back later.

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-26 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14338911#comment-14338911 ]

Reynold Xin commented on SPARK-5124:


I took another look at the PR. Do we need to separate NetworkRpcEndpoint and 
RpcEndpoint? Is it really that expensive to listen to network events? If not, 
it would be simpler to consolidate them.


[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-25 Thread Aaron Davidson (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337992#comment-14337992 ]

Aaron Davidson commented on SPARK-5124:
---

As Reynold mentioned me mentioning, I am partial towards something like 
receiveAndReply(msg: Any, response: RpcResponseCallback). This makes the 
difference between the two methods' parameters more obvious (more so than 
having one take an Any and the other an RpcMessage).

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-25 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337965#comment-14337965 ]

Reynold Xin commented on SPARK-5124:


I think that works. Aaron also proposed receiveAndReply(Any, Callback).

I think both are ok.


[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-25 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337900#comment-14337900 ]

Shixiong Zhu commented on SPARK-5124:
-

So `RpcMessage.reply()` can be used in `receiveAndReply` but not in `receive`? 
Then I think changing `receive(msg: RpcMessage)` to `receive(msg: Any)` is 
better: the message itself is sent to `receive` directly, while the RpcMessage 
passed to `receiveAndReply` carries the message content and also provides the 
`reply` method.


[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-25 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337853#comment-14337853 ]

Reynold Xin commented on SPARK-5124:


Yup - Aaron knew exactly what I had in mind!


[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-25 Thread Aaron Davidson (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337745#comment-14337745 ]

Aaron Davidson commented on SPARK-5124:
---

[~zsxwing] For receiveAndReply, I think the intention is message.reply(), so it 
can be performed in a separate thread.

My comment on failure was so that exceptions can be relayed during 
sendWithReply() (this is natural anyway: if sendWithReply returns a Future, it 
can just be completed with the exception).
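
For illustration, a minimal sketch of that idea under assumed names (PendingRpc 
is hypothetical, not a Spark class):

{code}
import scala.concurrent.{Future, Promise}

// Send-side stand-in: the RPC layer keeps one Promise per outstanding
// request and completes it with either the reply or the remote error, so
// the caller observes a failed Future when the receiver reports a failure.
class PendingRpc[T] {
  private val promise = Promise[T]()

  def future: Future[T] = promise.future

  def onReply(response: T): Unit = promise.trySuccess(response)
  def onRemoteFailure(error: Throwable): Unit = promise.tryFailure(error)
}
{code}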

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-25 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337739#comment-14337739 ]

Shixiong Zhu commented on SPARK-5124:
-

[~adav] 
{quote}
I might even supplement message with a reply with failure method, so exceptions 
can be relayed.
{quote}
Do you mean the sender can receive the error message if the receiver encounters 
an error?

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-25 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337733#comment-14337733 ]

Shixiong Zhu commented on SPARK-5124:
-

[~rxin] could you clarify how to reply to the message in `receiveAndReply`? 
Using the return value, or `message.reply()`?

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-25 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337724#comment-14337724 ]

Shixiong Zhu commented on SPARK-5124:
-

[~vanzin], thanks for the suggestions. I agree with most of them. Just a small 
comment on the following point:

{quote}
The default Endpoint has no thread-safety guarantees. You can wrap an Endpoint 
in an EventLoop if you want messages to be handled using a queue, or 
synchronize your receive() method (although that can block the dispatcher 
thread, which could be bad). But this would easily allow actors to process 
multiple messages concurrently if desired.
{quote}

Every Endpoint with an EventLoop needs an exclusive thread, so this would 
significantly increase the number of threads and add context-switch overhead. 
However, I think we can have a global Dispatcher for the Endpoints that need 
thread-safety guarantees. An Endpoint registers itself with the Dispatcher; 
the Dispatcher dispatches messages to these Endpoints and guarantees 
thread-safety. The Dispatcher only needs a few threads and queues for 
dispatching messages.
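
A minimal sketch of that global Dispatcher idea, under assumed names 
(Dispatcher, Inbox): a small shared pool drains per-endpoint inboxes, and a 
`scheduled` flag guarantees at most one thread handles a given endpoint at a 
time.

{code}
import java.util.concurrent.{ConcurrentHashMap, ConcurrentLinkedQueue, Executors}
import java.util.concurrent.atomic.AtomicBoolean

trait Endpoint { def receive(msg: Any): Unit }

class Dispatcher(numThreads: Int) {
  private val pool = Executors.newFixedThreadPool(numThreads)
  private val inboxes = new ConcurrentHashMap[String, Inbox]()

  private class Inbox(endpoint: Endpoint) {
    private val messages = new ConcurrentLinkedQueue[Any]()
    private val scheduled = new AtomicBoolean(false)

    def post(msg: Any): Unit = {
      messages.add(msg)
      trySchedule()
    }

    private def trySchedule(): Unit = {
      // At most one drain task per inbox is in flight, which serializes this
      // endpoint's message handling across the shared pool.
      if (scheduled.compareAndSet(false, true)) {
        pool.execute { () =>
          var msg = messages.poll()
          while (msg != null) {
            endpoint.receive(msg)
            msg = messages.poll()
          }
          scheduled.set(false)
          // A message may have arrived after the last poll; re-check.
          if (!messages.isEmpty) trySchedule()
        }
      }
    }
  }

  def register(name: String, endpoint: Endpoint): Unit = {
    inboxes.put(name, new Inbox(endpoint))
  }

  def send(name: String, msg: Any): Unit = {
    val inbox = inboxes.get(name)
    if (inbox != null) inbox.post(msg)
  }
}
{code}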

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-25 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337028#comment-14337028 ]

Reynold Xin commented on SPARK-5124:


I went and looked at the various uses of Akka in the code base. Quite a few of 
them actually use askWithReply as well. There are really two categories of 
message delivery: one where the sender expects a reply, and one where it 
doesn't. As a result, I think the following interface would make more sense:

{code}
// receive side:
def receive(msg: RpcMessage): Unit
def receiveAndReply(msg: RpcMessage): Unit = sys.error("...")

// send side:
def send(message: Object): Unit
def sendWithReply[T](message: Object): Future[T]
{code}

There is an obvious pairing here. Messages sent via "send" go to "receive", 
without requiring an ack (although if we use the transport layer, an implicit 
ack can be sent). Messages sent via "sendWithReply" go to "receiveAndReply", 
and the caller needs to explicitly handle the reply. 

In most cases, an RPC endpoint only needs to receive messages without replying, 
and thus can override just receive. 
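
For example, a sketch of an endpoint that only overrides receive, with stub 
types (RpcMessage's shape here is an assumption):

{code}
// Stub versions of the proposed interface.
trait RpcMessage { def payload: Any }

abstract class RpcEndpoint {
  def receive(msg: RpcMessage): Unit
  // Endpoints that never expect sendWithReply traffic keep the default.
  def receiveAndReply(msg: RpcMessage): Unit =
    sys.error("this endpoint does not accept messages that expect a reply")
}

// Heartbeats are fire-and-forget, so only receive is overridden.
class HeartbeatEndpoint extends RpcEndpoint {
  override def receive(msg: RpcMessage): Unit = msg.payload match {
    case "heartbeat" => () // record liveness; no ack required
    case other       => sys.error(s"unexpected message: $other")
  }
}
{code}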




[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-25 Thread Aaron Davidson (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336974#comment-14336974 ]

Aaron Davidson commented on SPARK-5124:
---

I tend to prefer having explicit message.reply() semantics over Akka's weird 
ask. This is how the Transport layer implements RPCs, for instance: 
https://github.com/apache/spark/blob/master/network/common/src/main/java/org/apache/spark/network/server/RpcHandler.java#L26

I might even supplement the message with a reply-with-failure method, so 
exceptions can be relayed.

I have not found this mechanism to be a bottleneck at many thousands of 
requests per second.

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-24 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335352#comment-14335352 ]

Reynold Xin commented on SPARK-5124:


BTW great points about the life cycle and thread safety expectations. We should 
make sure we document all of those clearly in the API doc.


[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-24 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335264#comment-14335264 ]

Reynold Xin commented on SPARK-5124:


It seems to me the 2nd would be useful in some cases, but it isn't really 
necessary, and maybe even belongs in a layer above what is proposed here. It 
might also be too expensive to track, considering you can have thousands of RPC 
messages per second. And it is also subject to memory leaks if the receiver 
doesn't properly discard the message. As you said, none of the Spark code uses 
it at the moment.



[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-24 Thread Marcelo Vanzin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335252#comment-14335252 ]

Marcelo Vanzin commented on SPARK-5124:
---

Hi @rxin,

Yeah, the current {{receive()}} gives you the message and the sender, but it 
doesn't give you a way to explicitly reply to a message. That means two things:

- It's not clear whether calling something like {{sender.send()}} would be 
thread-safe. For example, in akka it would be, because akka processes one 
message at a time for each actor, so you're sending replies in order. But that 
forces you to implement the same semantics in any new RPC layer, and that is 
one akka feature I'd prefer not to inherit (as you say, let each endpoint 
decide whether it needs it).

- It precludes asynchronously handling that message, for much the same reasons. 
E.g., let's say some incoming message requires a long computation before 
sending the reply, and you use a thread pool for that. With {{sender.send()}}, 
you'd need to build correlation logic into your application. If you have 
something like {{message.reply()}} instead, you don't need to - the RPC layer 
handles correlation and timing out if a reply doesn't arrive (see the sketch 
below).

Granted, I have no idea how you'd implement that second bullet in akka, and 
perhaps for that reason I don't think Spark currently uses that pattern 
anywhere. But it sounds like a useful feature to have.
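
A minimal sketch of that asynchronous-reply pattern, with illustrative stub 
types (Message and SlowEndpoint are stand-ins, not a proposed API):

{code}
import java.util.concurrent.Executors

// The message carries its payload plus an explicit reply channel that the
// RPC layer correlates with the pending request on the caller's side.
trait Message {
  def payload: Any
  def reply(response: Any): Unit
}

class SlowEndpoint {
  private val workers = Executors.newFixedThreadPool(4)

  def receive(msg: Message): Unit = msg.payload match {
    case "expensive-query" =>
      // Return immediately and reply later from a worker thread; no
      // correlation bookkeeping is needed in the application itself.
      workers.execute { () =>
        msg.reply(computeExpensiveAnswer())
      }
    case _ => // ignore everything else in this sketch
  }

  private def computeExpensiveAnswer(): String = "answer"
}
{code}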


[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-24 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335234#comment-14335234 ]

Reynold Xin commented on SPARK-5124:


[~vanzin] I agree - it would be better to have a receive function that just 
takes a message and sender information. But that is what the current design 
provides, isn't it? There is an explicit sender. Are you suggesting getting rid 
of the partial function?

For synchronization, I keep going back and forth on this. For generality, it is 
better for the RPC interface not to enforce any synchronization and to let the 
endpoints synchronize themselves. However, enforcing it does reduce the 
potential for application errors.

cc [~adav]



[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-24 Thread Marcelo Vanzin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335204#comment-14335204 ]

Marcelo Vanzin commented on SPARK-5124:
---

Ah, also, another comment before I forget. Scala's syntax sometimes throws me 
off, but it seems that in the current implementation of your RPC interfaces, 
you're following Akka's semantics for the {{receive()}} method. That is, it 
sort of assumes a single-threaded message queue, since otherwise you can't 
really have {{ask()}} without running into concurrency issues.

I'd suggest a different approach that allows more standard RPC semantics. For 
example, have the {{receive()}} method take an {{RpcMessage}} as the argument; 
this message contains information about the sender, what type of message it is, 
methods that allow you to reply to a message if it needs a reply, and the 
actual message payload.

That way, you can do the usual matching, and you can easily do asynchronous (or 
concurrent) processing and still be able to reply to the right message. This 
should easily map to akka's backend implementation, and allow a new RPC 
implementation to be more flexible.
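
One assumed shape for such an {{RpcMessage}} (field names are illustrative, not 
the draft API):

{code}
trait RpcAddress

trait RpcMessage {
  def senderAddress: RpcAddress    // who sent the message
  def payload: Any                 // the actual message content
  def reply(response: Any): Unit   // only meaningful if the sender awaits a reply
  def fail(error: Throwable): Unit // relay an error instead of a response
}

class RegistrationEndpoint {
  def receive(msg: RpcMessage): Unit = msg.payload match {
    case ("register", name: String) =>
      msg.reply(s"registered $name")
    case other =>
      msg.fail(new IllegalArgumentException(s"unexpected: $other"))
  }
}
{code}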

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-23 Thread Marcelo Vanzin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334166#comment-14334166 ]

Marcelo Vanzin commented on SPARK-5124:
---

Hi [~zsxwing], I briefly looked over the design and parts of the 
implementation, and I have some comments. Overall this looks OK, even if it 
does look a lot like akka's actor system, but I don't think there would be a 
problem implementing it over a different RPC library.

One thing that I found very confusing is {{RpcEndpoint}} vs. 
{{NetworkRpcEndpoint}}. The "R" in RPC means "remote", which sort of implies 
networking. I think Reynold touched on this when he talked about using an event 
loop for "local actors". Perhaps a more flexible approach would be something 
like:

- A generic {{Endpoint}} that defines the {{receive()}} method and other 
generic interfaces
- {{RpcEnv}} takes an {{Endpoint}} and a name to register endpoints for remote 
connections. Things like "onConnected" and "onDisassociated" become messages 
passed to {{receive()}}, kinda like in akka today. This means that there's no 
need for a specialized {{RpcEndpoint}} interface.
- "local" Endpoints could be exposed directly, without the need for an 
{{RpcEnv}}. For example, as a getter in {{SparkEnv}}. You could have some 
wrapper to expose {{send()}} and {{ask()}} so that the client interface looks 
similar for remote and local endpoints.
- The default {{Endpoint}} has no thread-safety guarantees. You can wrap an 
{{Endpoint}} in an {{EventLoop}} if you want messages to be handled using a 
queue, or synchronize your {{receive()}} method (although that can block the 
dispatcher thread, which could be bad). But this would easily allow actors to 
process multiple messages concurrently if desired (see the sketch below).
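
A minimal sketch of that wrapping idea, under assumed names 
({{EventLoopEndpoint}} is illustrative): a single consumer thread drains a 
queue, so the wrapped endpoint handles one message at a time.

{code}
import java.util.concurrent.LinkedBlockingQueue

trait Endpoint { def receive(msg: Any): Unit }

// All messages funnel through a queue drained by a single daemon thread,
// so the wrapped endpoint never sees concurrent calls.
class EventLoopEndpoint(underlying: Endpoint) extends Endpoint {
  private val queue = new LinkedBlockingQueue[Any]()

  private val loop = new Thread(() => {
    while (true) {
      underlying.receive(queue.take()) // single consumer => serialized handling
    }
  })
  loop.setDaemon(true)
  loop.start()

  // Enqueueing is thread-safe and non-blocking for senders.
  override def receive(msg: Any): Unit = queue.put(msg)
}
{code}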

What do you think?



[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-02-13 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320005#comment-14320005 ]

Apache Spark commented on SPARK-5124:
-

User 'zsxwing' has created a pull request for this issue:
https://github.com/apache/spark/pull/4588

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-01-13 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275014#comment-14275014 ]

Shixiong Zhu commented on SPARK-5124:
-

For 1), I prefer to finish it before this JIRA. See 
[#4016|https://github.com/apache/spark/pull/4016].

For 2), I will write some prototype code to see whether the current API design 
is sufficient.

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-01-12 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273885#comment-14273885 ]

Reynold Xin commented on SPARK-5124:


1. Let's put that outside of this PR (either leave it as an actor for now and 
follow up to change it to a loop, or submit a separate PR to change it to a 
loop before we merge the actor PR).

2. Yes - you don't necessarily need an alternative implementation, but making 
sure the current API design can indeed support alternative implementations is a 
good idea.


[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-01-12 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273560#comment-14273560 ]

Shixiong Zhu commented on SPARK-5124:
-

{quote}
1. Let's not rely on the property that local actors don't pass messages through 
a socket for the local-actor speedup. Conceptually, there is no reason to tie 
the local actor implementation to RPC. DAGScheduler's actor used to be a simple 
queue & event loop (before it was turned into an actor for no good reason). We 
can restore it to that.
{quote}
OK. I will change the DAGScheduler actor to a simple event loop.

{quote}
2. Have you thought about how the fate sharing stuff would work with 
alternative RPC implementations?
{quote}

Just want to make sure we are thinking of the same thing: do you mean how to 
deliver DisassociatedEvent in an alternative RPC implementation? If so, I'm 
thinking about how to extract it from the RPC layer, but I have not started 
yet.

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-01-11 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273246#comment-14273246 ]

Reynold Xin commented on SPARK-5124:


Thanks for the response.

1. Let's not rely on the property that local actors don't pass messages through 
a socket for the local-actor speedup. Conceptually, there is no reason to tie 
the local actor implementation to RPC. DAGScheduler's actor used to be a simple 
queue & event loop (before it was turned into an actor for no good reason). We 
can restore it to that.

2. Have you thought about how the fate sharing stuff would work with 
alternative RPC implementations? 

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-01-09 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270809#comment-14270809 ]

Shixiong Zhu commented on SPARK-5124:
-

{quote}
1. For DAGScheduler, we are probably OK to just use an event loop instead of an 
actor. Just put messages into a queue, and have a loop that processes the 
queue. Otherwise we are making every call to DAGScheduler go through a socket, 
and that can severely impact scheduler throughput. (Although I haven't looked 
closely at your change, so maybe you are doing a different thing here.)
{quote}

In my current implementation, DAGScheduler still uses an Actor. A LocalActorRef 
should not pass messages through a socket. However, for better performance, we 
can use a multi-producer single-consumer (MPSC) queue to bypass Akka.

{quote}
2. Is it really so expensive to listen to network events that it warrants a 
separate NetworkRpcEndpoint?
{quote}

I created NetworkRpcEndpoint to avoid doing the following pattern-matching 
checks on every message for RpcEndpoints that are not interested in them:

{code}
case AssociatedEvent(_, remoteAddress, _) =>
  ...
case DisassociatedEvent(_, remoteAddress, _) =>
  ...
case AssociationErrorEvent(cause, localAddress, remoteAddress, inbound, _) =>
  ...
{code}

{quote}
Thread-safe contract when processing messages (similar to Akka).
{quote}

The thread-safe contract means that only one method of an RpcEndpoint will be 
called at a time, just like an Actor. Without this property, an RpcEndpoint 
would need a lock to protect its data, and considering the complex logic of so 
many RpcEndpoints, that could lead to deadlock.

{quote}
A simple fault tolerance(if a RpcEndpoint is crashed, restart it or stop it).
{quote}

It means: "Any error thrown by `onStart`, `receive`, or `onStop` will be sent 
to `onError`. If `onError` itself throws an error, it will force the 
RpcEndpoint to restart by creating a new one."

"Restart" may not be the right approach, but an `onError` that handles all 
errors is better than requiring an RpcEndpoint to never have uncaught 
exceptions (which would need a lot of try-catch code).

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-01-09 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270771#comment-14270771 ]

Reynold Xin commented on SPARK-5124:


Thanks for doing this. I haven't looked through all the changes yet, but here 
are some quick comments:

1. For DAGScheduler, we are probably OK to just use an event loop instead of an 
actor. Just put messages into a queue, and have a loop that processes the 
queue. Otherwise we are making every call to DAGScheduler go through a socket, 
and that can severely impact scheduler throughput. (Although I haven't looked 
closely at your change, so maybe you are doing a different thing here.)

2. Is it really so expensive to listen to network events that it warrants a 
separate NetworkRpcEndpoint?

[jira] [Commented] (SPARK-5124) Standardize internal RPC interface

2015-01-09 Thread Apache Spark (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270748#comment-14270748 ]

Apache Spark commented on SPARK-5124:
-

User 'zsxwing' has created a pull request for this issue:
https://github.com/apache/spark/pull/3974
