[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use ` SO_RCVBUF...

2017-08-24 Thread neoremind
Github user neoremind commented on the issue:

https://github.com/apache/spark/pull/18964
  
@zsxwing Thanks for reviewing. The project I mentioned above is for study 
purposes, and I hope it will help others who are interested. I totally agree 
that Spark RPC is mainly for internal use, but as I tested, its performance 
is good in general cases, which is good news :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18964: [SPARK-21701][CORE] Enable RPC client to use ` SO...

2017-08-24 Thread neoremind
Github user neoremind commented on a diff in the pull request:

https://github.com/apache/spark/pull/18964#discussion_r135081475
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java
 ---
@@ -210,6 +210,14 @@ private TransportClient createClient(InetSocketAddress 
address)
   .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 
conf.connectionTimeoutMs())
   .option(ChannelOption.ALLOCATOR, pooledAllocator);
 
+if (conf.receiveBuf() > 0) {
+  bootstrap.option(ChannelOption.SO_RCVBUF, conf.receiveBuf());
--- End diff --

On the server side, Netty follows the reactor pattern and `ServerBootstrap` is 
used here: `option` applies to the listening socket when binding and accepting 
connections, while `childOption` applies later, to the per-client channel 
created once a connection is established.
On the client side, Spark RPC uses `Bootstrap`, which has no `childOption` 
method because a client does not work like a server: it is much simpler, and 
Netty uses a single thread pool to do the network I/O.
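The per-socket nature of these options can be illustrated with plain JDK sockets rather than Netty (a simplified stand-in, not Spark's actual code): on the server, `SO_RCVBUF` set on the listening socket is inherited by accepted sockets, which is the role `childOption` plays in Netty, while a client has only its single connecting socket to configure.

```java
import java.net.ServerSocket;
import java.net.Socket;

public class SocketBufDemo {
    public static void main(String[] args) throws Exception {
        // Client side: there is only one socket, so one place to set the
        // option (analogous to Bootstrap.option in Netty).
        try (Socket client = new Socket()) {
            client.setReceiveBufferSize(128 * 1024);  // request 128 KB
            // The OS may round the requested value; read back the effective size.
            System.out.println("client SO_RCVBUF = " + client.getReceiveBufferSize());
        }

        // Server side: SO_RCVBUF set on the listening socket before bind is
        // inherited by accepted per-client sockets (analogous to childOption).
        try (ServerSocket server = new ServerSocket()) {
            server.setReceiveBufferSize(128 * 1024);
            System.out.println("server SO_RCVBUF = " + server.getReceiveBufferSize());
        }
    }
}
```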





[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use ` SO_RCVBUF...

2017-08-23 Thread neoremind
Github user neoremind commented on the issue:

https://github.com/apache/spark/pull/18964
  
@zsxwing I did create a performance test against Spark RPC; the results can 
be found [here](https://github.com/neoremind/kraps-rpc#4-performance-test). 
Note that I created the project for study purposes and the code is based on 
2.1.0. But as you said, performance does not drop when the client leaves the 
`SO_RCVBUF` and `SO_SNDBUF` values from `SparkConf` unset.

For example, with 10 concurrent callers, 100k total calls, and everything 
else at its default, QPS is around 11k. When I set `SO_RCVBUF` and 
`SO_SNDBUF` to an extremely small number like 100, performance degrades 
tremendously. If they are set to a large number like 128k, the results are 
not boosted regardless of whether clients set the corresponding `SO_RCVBUF` 
and `SO_SNDBUF` values.
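The benchmark setup described above (a fixed concurrency, a fixed total call count, QPS computed from wall-clock time) can be sketched with a plain JDK harness; the `rpcCall` body below is a no-op placeholder, not the kraps-rpc client:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class QpsHarness {
    // Stand-in for an RPC call; replace with a real client call when measuring.
    static void rpcCall() { /* no-op placeholder */ }

    public static void main(String[] args) throws Exception {
        int concurrency = 10;
        int totalCalls = 100_000;
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        CountDownLatch done = new CountDownLatch(totalCalls);

        long start = System.nanoTime();
        for (int i = 0; i < totalCalls; i++) {
            pool.submit(() -> { rpcCall(); done.countDown(); });
        }
        done.await();  // wait until every call has completed
        double seconds = (System.nanoTime() - start) / 1e9;

        System.out.printf("QPS = %.0f%n", totalCalls / seconds);
        pool.shutdown();
    }
}
```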

I admit that the change is trivial, but from a user's perspective, if 
`spark.{module}.io.sendBuffer` and `spark.{module}.io.receiveBuffer` are 
exposed and can be set, yet only take effect on the server side, that is a 
little inconsistent. So I raised this PR to make them work on both the 
server and the client side.





[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use ` SO_RCVBUF...

2017-08-23 Thread neoremind
Github user neoremind commented on the issue:

https://github.com/apache/spark/pull/18964
  
@cloud-fan would you take a look at this PR? The change is very simple. 
Thanks very much!





[GitHub] spark issue #18965: [SPARK-21749][DOC] Add comments for MessageEncoder to ex...

2017-08-17 Thread neoremind
Github user neoremind commented on the issue:

https://github.com/apache/spark/pull/18965
  
@srowen I see your concern: these comments are internal-oriented and would 
likely be updated by developers as the library evolves, so I will close the 
PR. Thanks for reviewing!





[GitHub] spark pull request #18965: [SPARK-21749][DOC] Add comments for MessageEncode...

2017-08-17 Thread neoremind
Github user neoremind closed the pull request at:

https://github.com/apache/spark/pull/18965





[GitHub] spark issue #18965: [SPARK-21749][DOC] Add comments for MessageEncoder to ex...

2017-08-16 Thread neoremind
Github user neoremind commented on the issue:

https://github.com/apache/spark/pull/18965
  
I see. In any case, this is what I found when I dug into the wire protocol of 
Spark RPC, since the wire format is a big part of understanding the message 
structure. If anyone thinks this is unnecessary, I can close the PR.





[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use ` SO_RCVBUF...

2017-08-16 Thread neoremind
Github user neoremind commented on the issue:

https://github.com/apache/spark/pull/18964
  
Not yet, since it is OK to keep the buffer sizes at their system defaults; 
but making them consistent for users who do want to specify them makes sense.
I also noticed that Spark RPC by default uses Java native serialization: even 
a request that merely verifies whether an endpoint exists costs about 1 KB of 
payload, not to mention endpoints with real logic. So in the real world it 
might be useful to profile this; providing more RPC monitoring logs or hooks 
could be beneficial, though that should be discussed in another thread.
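The payload overhead mentioned above comes largely from Java serialization writing the full class descriptor alongside the data. A minimal, hypothetical illustration (the `Ping` class here is a stand-in, not Spark's actual message envelope) shows the overhead even for a tiny message:

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationOverhead {
    // Hypothetical message, standing in for an RPC request body.
    static class Ping implements Serializable {
        private static final long serialVersionUID = 1L;
        String endpoint = "verifier";
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Ping());
        }
        // The stream carries the full class descriptor (class name, field
        // names, types), so even a tiny message is much larger than its
        // field data alone.
        System.out.println("serialized size = " + bytes.size() + " bytes");
    }
}
```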





[GitHub] spark issue #18965: [SPARK-21749][CORE] Add comments for MessageEncoder to e...

2017-08-16 Thread neoremind
Github user neoremind commented on the issue:

https://github.com/apache/spark/pull/18965
  
@jerryshao please review my separate PR. Thanks!





[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use ` SO_RCVBUF...

2017-08-16 Thread neoremind
Github user neoremind commented on the issue:

https://github.com/apache/spark/pull/18964
  
@jerryshao please review my separate PR. Thanks!





[GitHub] spark issue #18922: [SPARK-21701][CORE] Enable RPC client to use SO_RCVBUF, ...

2017-08-16 Thread neoremind
Github user neoremind commented on the issue:

https://github.com/apache/spark/pull/18922
  
@jerryshao Thanks for your review! Per your advice, I have split the issue 
into separate PRs:
https://github.com/apache/spark/pull/18964
and
https://github.com/apache/spark/pull/18965





[GitHub] spark pull request #18965: [SPARK-21749][CORE] Add comments for MessageEncod...

2017-08-16 Thread neoremind
GitHub user neoremind opened a pull request:

https://github.com/apache/spark/pull/18965

[SPARK-21749][CORE] Add comments for MessageEncoder to explain the wire 
format

## What changes were proposed in this pull request?

Spark RPC is built on top of TCP and leverages the Netty framework. It would 
be better to have some comments in the RPC module, especially explaining the 
RPC message wire format, since it is critical to understanding what a Message 
consists of (header + payload). So Javadoc comments can be added to the 
MessageEncoder class.
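As a rough illustration of a header-plus-payload layout, a length-prefixed frame can be built with `ByteBuffer` as below. Note this is an assumed, simplified layout for illustration only; the real format written by MessageEncoder should be taken from the Spark source and its Javadoc.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class FrameSketch {
    // Illustrative layout: [frame length (8 bytes)][msg type (1 byte)][body].
    // The length field counts the whole frame, including itself.
    static ByteBuffer encode(byte msgType, byte[] body) {
        int headerLen = 8 + 1;
        long frameLen = headerLen + body.length;
        ByteBuffer frame = ByteBuffer.allocate((int) frameLen);
        frame.putLong(frameLen);
        frame.put(msgType);
        frame.put(body);
        frame.flip();  // prepare the buffer for reading/writing to the wire
        return frame;
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
        ByteBuffer frame = encode((byte) 3, body);
        System.out.println("frame length = " + frame.getLong());  // prints 14
    }
}
```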

## How was this patch tested?

Only comments added. Existing unit tests are all passed.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/neoremind/spark add_comments

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18965.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18965


commit a65bc214be8c05a3901193f9cc57d00fbbf506d5
Author: xu.zhang 
Date:   2017-08-16T14:37:23Z

Add comments for MessageEncoder to explain the wire format







[GitHub] spark pull request #18964: [SPARK-21701][CORE] Enable RPC client to use ` SO...

2017-08-16 Thread neoremind
GitHub user neoremind opened a pull request:

https://github.com/apache/spark/pull/18964

[SPARK-21701][CORE] Enable RPC client to use `SO_RCVBUF` and `SO_SNDBUF` 
in SparkConf.

## What changes were proposed in this pull request?

TCP parameters like SO_RCVBUF and SO_SNDBUF can be set in SparkConf, and 
`org.apache.spark.network.server.TransportServer` can use those parameters to 
build the server via Netty. But for `TransportClientFactory` there is no way 
to set those parameters from `SparkConf`. This can be inconsistent between 
the server and client sides when people set the parameters in `SparkConf`, so 
this PR enables the RPC client to use those TCP parameters as well.
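The change can be pictured with the same guard the server side uses: apply a socket option only when its configured value is positive, otherwise keep the OS default. A stdlib sketch of that pattern follows; the `receiveBuf`/`sendBuf` values are hypothetical stand-ins for `TransportConf` getters:

```java
import java.net.Socket;

public class ClientSocketOptions {
    public static void main(String[] args) throws Exception {
        int receiveBuf = 128 * 1024;  // stand-in for a configured SO_RCVBUF
        int sendBuf = 0;              // 0 means "keep the OS default"

        try (Socket socket = new Socket()) {
            // Mirror the PR's guard: only set options that were configured.
            if (receiveBuf > 0) {
                socket.setReceiveBufferSize(receiveBuf);
            }
            if (sendBuf > 0) {
                socket.setSendBufferSize(sendBuf);
            }
            System.out.println("SO_RCVBUF = " + socket.getReceiveBufferSize());
            System.out.println("SO_SNDBUF = " + socket.getSendBufferSize());
        }
    }
}
```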

## How was this patch tested?

Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/neoremind/spark add_client_param

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18964.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18964


commit 14ba13b1fec4fbe20b130283ba0d6b2d5c58bb87
Author: xu.zhang 
Date:   2017-08-16T14:30:25Z

Enable RPC client to use `SO_RCVBUF` and `SO_SNDBUF` in SparkConf.







[GitHub] spark pull request #18922: [SPARK-21701][CORE] Enable RPC client to use SO_R...

2017-08-16 Thread neoremind
Github user neoremind commented on a diff in the pull request:

https://github.com/apache/spark/pull/18922#discussion_r133467528
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java
 ---
@@ -210,6 +210,18 @@ private TransportClient createClient(InetSocketAddress 
address)
   .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 
conf.connectionTimeoutMs())
   .option(ChannelOption.ALLOCATOR, pooledAllocator);
 
+if (conf.backLog() > 0) {
+  bootstrap.option(ChannelOption.SO_BACKLOG, conf.backLog());
--- End diff --

I see your point. I will remove the setting of that parameter.





[GitHub] spark pull request #18922: [SPARK-21701][CORE] Enable RPC client to use SO_R...

2017-08-16 Thread neoremind
Github user neoremind closed the pull request at:

https://github.com/apache/spark/pull/18922





[GitHub] spark issue #18922: [SPARK-21701][CORE] Enable RPC client to use SO_RCVBUF, ...

2017-08-14 Thread neoremind
Github user neoremind commented on the issue:

https://github.com/apache/spark/pull/18922
  
@zsxwing Would you mind reviewing the patch? I noticed you have contributed 
to the RPC module in Spark. Many thanks!





[GitHub] spark pull request #18922: [SPARK-21701][CORE] Enable RPC client to use SO_R...

2017-08-11 Thread neoremind
GitHub user neoremind opened a pull request:

https://github.com/apache/spark/pull/18922

[SPARK-21701][CORE] Enable RPC client to use SO_RCVBUF, SO_SNDBUF and 
SO_BACKLOG in SparkConf

## What changes were proposed in this pull request?

1. TCP parameters like SO_RCVBUF, SO_SNDBUF and SO_BACKLOG can be set in 
`SparkConf`, and `org.apache.spark.network.server.TransportServer` can use 
those parameters to build the server via Netty. But for 
`TransportClientFactory` there is no way to set those parameters from 
`SparkConf`. This can be inconsistent between the server and client sides 
when people set the parameters in `SparkConf`, so this PR enables the RPC 
client to use those TCP parameters as well.

2. Add comments to the `MessageEncoder` class to show what the wire format 
looks like for an RPC message, with a diagram presenting the detailed wire 
protocol.

## How was this patch tested?
Existing tests, plus a refined test case for asking a non-existent endpoint 
in `RpcEnvSuite`.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/neoremind/spark SPARK-21701

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18922.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18922


commit 09734e2a1cf611bfb05e5873f55e9423f8851fe8
Author: xu.zhang 
Date:   2017-08-11T14:06:15Z

1.  Enable RPC client to use `SO_RCVBUF`, `SO_SNDBUF` and `SO_BACKLOG` 
in SparkConf.
2.  Add Javadoc comments to the MessageEncoder class.
3.  Refine the test case for asking a non-existent endpoint in the RPC test 
suite.



