[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use `SO_RCVBUF...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18964 @zsxwing Thanks for reviewing. The project I mentioned above is for study purposes, and I hope it will help others who are interested. I fully agree that Spark RPC is mainly for internal use, but as I tested, its performance is good in general cases, which is good news :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #18964: [SPARK-21701][CORE] Enable RPC client to use `SO...
Github user neoremind commented on a diff in the pull request: https://github.com/apache/spark/pull/18964#discussion_r135081475 --- Diff: common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java --- @@ -210,6 +210,14 @@ private TransportClient createClient(InetSocketAddress address) .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, conf.connectionTimeoutMs()) .option(ChannelOption.ALLOCATOR, pooledAllocator); +if (conf.receiveBuf() > 0) { + bootstrap.option(ChannelOption.SO_RCVBUF, conf.receiveBuf()); --- End diff -- On the server side, Netty follows the reactor pattern and `ServerBootstrap` is used: `option` applies to the listening (parent) channel when binding and accepting connections, while `childOption` applies afterwards, to the per-client channel created once a connection is established. On the client side, Spark RPC uses `Bootstrap`, which has no `childOption` method; the client is much simpler than the server, with a single connection channel whose network I/O is handled by one Netty thread pool.
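The parent/child split described above can be illustrated with plain JDK sockets. This is only a rough analogy (Netty's `option`/`childOption` API is not used here): the backlog, like a parent-channel `option`, belongs to the listening socket at bind time, while a receive-buffer size set on the listening socket is inherited by each accepted per-client connection, like a `childOption`.

```java
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ParentChildDemo {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket();
        // Analogous to childOption: the SO_RCVBUF value set on the listening
        // socket is inherited by every accepted (per-client) socket.
        server.setReceiveBufferSize(65536);
        // Analogous to option on the parent channel: the backlog applies to
        // the listening socket itself, at bind time.
        server.bind(new InetSocketAddress("127.0.0.1", 0), 128);
        System.out.println("bound on port " + server.getLocalPort());
        server.close();
    }
}
```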
[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use `SO_RCVBUF...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18964 @zsxwing I did create a performance test against Spark RPC; the results can be found [here](https://github.com/neoremind/kraps-rpc#4-performance-test). Note that I created the project for study purposes and the code is based on 2.1.0. But as you said, performance does not drop when the client leaves `SO_RCVBUF` and `SO_SNDBUF` unset in `SparkConf`. For example, in the scenario of 10 concurrent callers and 100k total calls, with everything else at its default, the QPS is around 11k. When I set `SO_RCVBUF` and `SO_SNDBUF` to an extremely small number like 100, performance suffers tremendously. If they are set to a large number like 128k, the results are not boosted regardless of whether clients set the corresponding `SO_RCVBUF` and `SO_SNDBUF` values. I admit that the change is trivial, but from a user's perspective, if `spark.{module}.io.sendBuffer` and `spark.{module}.io.receiveBuffer` are exposed and can be set, yet only take effect on the server side, that is a little inconsistent. So I raised this PR to make them work on both the server and the client side, just for consistency.
[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use ` SO_RCVBUF...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18964 @cloud-fan would you take a look at the PR? The change is very simple. Thanks very much!
[GitHub] spark issue #18965: [SPARK-21749][DOC] Add comments for MessageEncoder to ex...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18965 @srowen I see your concern; it is more internally oriented and may be updated by developers as the library evolves. I will close the PR then. Thanks for reviewing!
[GitHub] spark pull request #18965: [SPARK-21749][DOC] Add comments for MessageEncode...
Github user neoremind closed the pull request at: https://github.com/apache/spark/pull/18965
[GitHub] spark issue #18965: [SPARK-21749][DOC] Add comments for MessageEncoder to ex...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18965 I see. Anyway, this is what I found when I dug into the wire protocol of Spark RPC, since the wire format is a big part of understanding the message structure. If someone thinks this is not necessary, I can close the PR.
[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use `SO_RCVBUF...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18964 Not yet, since it is OK to keep the buffer sizes at the system defaults, but making them configurable on both sides, for users who would like to specify them, makes sense. I also notice that Spark RPC by default uses Java native serialization: even a request that merely verifies whether an endpoint exists costs about 1K of payload, not to mention endpoints with real logic. So in the real world it might be useful to profile this; I suggest that providing more RPC monitoring logs or hooks would be beneficial. Anyway, that should be discussed in another thread.
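The payload overhead mentioned above comes from Java native serialization writing full class metadata onto the wire. A minimal stdlib-only sketch makes the effect visible; the `Msg` class here is a hypothetical stand-in for a tiny RPC request, not a Spark type:

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSizeDemo {
    // Hypothetical message type, standing in for a small RPC request.
    static class Msg implements Serializable {
        private static final long serialVersionUID = 1L;
        String endpointName = "ping";
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            // Java serialization emits the full class descriptor (class name,
            // serialVersionUID, field metadata) before the field values.
            out.writeObject(new Msg());
        }
        // The 4-character field costs far more than 4 bytes on the wire.
        System.out.println("serialized size = " + bytes.size() + " bytes");
    }
}
```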
[GitHub] spark issue #18965: [SPARK-21749][CORE] Add comments for MessageEncoder to e...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18965 @jerryshao please review my separate PR. Thanks!
[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use `SO_RCVBUF...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18964 @jerryshao please review my separate PR. Thanks!
[GitHub] spark issue #18922: [SPARK-21701][CORE] Enable RPC client to use SO_RCVBUF, ...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18922 @jerryshao Thanks for your review! Per your advice, I have split the issue into separate PRs: https://github.com/apache/spark/pull/18964 and https://github.com/apache/spark/pull/18965
[GitHub] spark pull request #18965: [SPARK-21749][CORE] Add comments for MessageEncod...
GitHub user neoremind opened a pull request: https://github.com/apache/spark/pull/18965 [SPARK-21749][CORE] Add comments for MessageEncoder to explain the wire format ## What changes were proposed in this pull request? Spark RPC is built on top of TCP and leverages the Netty framework. It would be better to have some comments in the RPC module, especially to explain the RPC message wire format, since it is critical to understand what a Message consists of (header + payload). So Javadoc comments can be added to the MessageEncoder class. ## How was this patch tested? Only comments are added. Existing unit tests all pass. You can merge this pull request into a Git repository by running: $ git pull https://github.com/neoremind/spark add_comments Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18965.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18965 commit a65bc214be8c05a3901193f9cc57d00fbbf506d5 Author: xu.zhang Date: 2017-08-16T14:37:23Z Add comments for MessageEncoder to explain the wire format
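The header + payload layout this PR documents can be sketched with a pure-JDK `ByteBuffer` encoder. The field sizes below are illustrative assumptions for the sketch, not a claim about Spark's actual MessageEncoder format:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class FrameSketch {
    // Illustrative framing: [frame length (8-byte long, counting itself)]
    // [message type tag (1 byte)][payload]. Sizes are assumptions.
    static ByteBuffer encode(byte msgType, byte[] payload) {
        int headerLen = Long.BYTES + Byte.BYTES;      // 8 + 1 = 9
        long frameLen = headerLen + payload.length;   // length field counts itself
        ByteBuffer frame = ByteBuffer.allocate((int) frameLen);
        frame.putLong(frameLen);   // header, part 1: total frame length
        frame.put(msgType);        // header, part 2: message type tag
        frame.put(payload);        // body
        frame.flip();
        return frame;
    }

    public static void main(String[] args) {
        byte[] body = "ping".getBytes(StandardCharsets.UTF_8);
        ByteBuffer frame = encode((byte) 3, body);
        System.out.println("frame length field = " + frame.getLong(0)); // 9 + 4 = 13
    }
}
```

A decoder on the receiving side would read the 8-byte length first, wait until that many bytes have arrived, then dispatch on the type tag.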
[GitHub] spark pull request #18964: [SPARK-21701][CORE] Enable RPC client to use `SO...
GitHub user neoremind opened a pull request: https://github.com/apache/spark/pull/18964 [SPARK-21701][CORE] Enable RPC client to use `SO_RCVBUF` and `SO_SNDBUF` in SparkConf. ## What changes were proposed in this pull request? TCP parameters like SO_RCVBUF and SO_SNDBUF can be set in SparkConf, and `org.apache.spark.network.server.TransportServer` can use those parameters to build the server via Netty. But for TransportClientFactory, there is no way to set those parameters from SparkConf. This can leave the server and client sides inconsistent when people set the parameters in SparkConf. So this PR enables the RPC client to use those TCP parameters as well. ## How was this patch tested? Existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/neoremind/spark add_client_param Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18964.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18964 commit 14ba13b1fec4fbe20b130283ba0d6b2d5c58bb87 Author: xu.zhang Date: 2017-08-16T14:30:25Z Enable RPC client to use `SO_RCVBUF` and `SO_SNDBUF` in SparkConf.
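At the socket level, the two Netty channel options map onto the standard SO_RCVBUF / SO_SNDBUF socket options. A plain-JDK sketch of the PR's "only set when configured positive" pattern (the `applyBuffers` helper is hypothetical, not Spark or Netty API):

```java
import java.net.Socket;

public class SocketBufferDemo {
    // Hypothetical helper mirroring the PR's conditional: leave the OS
    // default in place unless a positive value was configured.
    static void applyBuffers(Socket socket, int receiveBuf, int sendBuf) throws Exception {
        if (receiveBuf > 0) {
            socket.setReceiveBufferSize(receiveBuf); // SO_RCVBUF
        }
        if (sendBuf > 0) {
            socket.setSendBufferSize(sendBuf);       // SO_SNDBUF
        }
    }

    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket()) {
            applyBuffers(socket, 65536, 65536);
            // The kernel may round or clamp the requested values, so the
            // reported sizes are not guaranteed to match exactly.
            System.out.println("SO_RCVBUF=" + socket.getReceiveBufferSize());
            System.out.println("SO_SNDBUF=" + socket.getSendBufferSize());
        }
    }
}
```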
[GitHub] spark pull request #18922: [SPARK-21701][CORE] Enable RPC client to use SO_R...
Github user neoremind commented on a diff in the pull request: https://github.com/apache/spark/pull/18922#discussion_r133467528 --- Diff: common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java --- @@ -210,6 +210,18 @@ private TransportClient createClient(InetSocketAddress address) .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, conf.connectionTimeoutMs()) .option(ChannelOption.ALLOCATOR, pooledAllocator); +if (conf.backLog() > 0) { + bootstrap.option(ChannelOption.SO_BACKLOG, conf.backLog()); --- End diff -- I see your point. I will remove the setting of this parameter.
[GitHub] spark pull request #18922: [SPARK-21701][CORE] Enable RPC client to use SO_R...
Github user neoremind closed the pull request at: https://github.com/apache/spark/pull/18922
[GitHub] spark issue #18922: [SPARK-21701][CORE] Enable RPC client to use SO_RCVBUF, ...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18922 @zsxwing Would you mind reviewing the patch? I notice you have contributed to the RPC module in Spark. Many thanks!
[GitHub] spark pull request #18922: [SPARK-21701][CORE] Enable RPC client to use SO_R...
GitHub user neoremind opened a pull request: https://github.com/apache/spark/pull/18922 [SPARK-21701][CORE] Enable RPC client to use SO_RCVBUF, SO_SNDBUF and SO_BACKLOG in SparkConf ## What changes were proposed in this pull request? 1. TCP parameters like SO_RCVBUF, SO_SNDBUF and SO_BACKLOG can be set in `SparkConf`, and `org.apache.spark.network.server.TransportServer` can use those parameters to build the server via Netty. But for `TransportClientFactory`, there is no way to set those parameters from `SparkConf`. This can leave the server and client sides inconsistent when people set the parameters in `SparkConf`. So this PR enables the RPC client to use those TCP parameters as well. 2. Add some comments to the `MessageDecoder` class to show what the wire format looks like for an RPC message, and draw a diagram to present the detailed wire protocol. ## How was this patch tested? Existing tests, plus a refined test case for asking a non-existent endpoint in `RpcEnvSuite`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/neoremind/spark SPARK-21701 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18922.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18922 commit 09734e2a1cf611bfb05e5873f55e9423f8851fe8 Author: xu.zhang Date: 2017-08-11T14:06:15Z 1. Enable RPC client to use `SO_RCVBUF`, `SO_SNDBUF` and `SO_BACKLOG` in SparkConf. 2. Add Javadoc comment in MessageEncoder class. 3. Refine test case of asking non-existent endpoint in RPC test suite.