[jira] [Updated] (HDFS-17531) RBF: Aynchronous router RPC.

2024-05-19 Thread Jian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Zhang updated HDFS-17531:
--
Attachment: (was: Aynchronous router.pdf)

> RBF: Aynchronous router RPC.
> 
>
> Key: HDFS-17531
> URL: https://issues.apache.org/jira/browse/HDFS-17531
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Jian Zhang
> Assignee: Jian Zhang
> Priority: Major
> Labels: pull-request-available
> Attachments: Aynchronous router.pdf, HDFS-17531.001.patch, 
> image-2024-05-19-18-07-51-282.png
>
>
> *Description*
> Currently, the main function of the Router service is to accept client 
> requests, forward them to the corresponding downstream nameservice (ns), and 
> return the ns's results to the client. The RPC link is as follows:
> *!image-2024-05-19-18-07-51-282.png|width=900,height=300!*
> The main threads involved in the RPC link are:
> {*}Read{*}: Get the client request and put it into the call queue *(1)*
> {*}Handler{*}:
> Take a call from the call queue *(2)*, process it, generate a new call, add 
> it to the calls of the connection thread, and wait for that call to 
> complete *(3)*
> After being awakened by the connection thread, process the response and put 
> it into the response queue *(5)*
> *Connection:*
> Hold the link with the downstream ns, send the calls held in 
> connection.calls to the downstream ns (via {*}rpcRequestThread{*}), and 
> obtain the response from the ns. Based on the call ID in the response, 
> notify the matching call that processing is complete *(4)*
> *Responder:*
> Retrieve the response from the response queue *(6)* and return it to the 
> client
>  
> *Shortcoming*
> Even if the *connection* thread can send more requests to downstream 
> nameservices, *(3)* and *(4)* are synchronous: after the *handler* thread 
> adds the call to connection.calls, it must wait until the *connection* 
> thread notifies it that the call is complete. Only after the response is put 
> into the response queue can the handler take a new call from the call queue. 
> Therefore, the concurrency of the router is limited by the number of 
> handlers. A simple example: if the number of handlers is 1 and the maximum 
> number of calls in the connection thread is 10, then even though the 
> connection thread can send 10 requests to the downstream ns, the router can 
> only process one request at a time.
>  
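The single-handler example can be reproduced with a toy model. The following Java sketch is not Hadoop code: the class `SyncHandlerDemo`, the method `peakInFlight`, and the 20 ms sleep standing in for a downstream round trip are all assumptions made for illustration. It models a handler that blocks on each forwarded call, as in steps (3) and (4), and records how many calls are in flight downstream at the same time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only (hypothetical names); it mirrors the handler and connection roles.
class SyncHandlerDemo {

  /** Pushes `calls` requests through blocking handlers; returns the peak number in flight downstream. */
  static int peakInFlight(int handlers, int connectionCapacity, int calls) {
    ExecutorService handlerPool = Executors.newFixedThreadPool(handlers);
    ExecutorService connection = Executors.newFixedThreadPool(connectionCapacity);
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    try {
      List<Future<?>> done = new ArrayList<>();
      for (int i = 0; i < calls; i++) {
        done.add(handlerPool.submit(() -> {
          // Step (3): hand the call to the connection, then block until it completes.
          Future<?> downstream = connection.submit(() -> {
            int now = inFlight.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            sleepQuietly(20); // stand-in for the downstream ns round trip
            inFlight.decrementAndGet();
          });
          downstream.get(); // step (4) wakes the handler here
          return null;
        }));
      }
      for (Future<?> f : done) {
        f.get(30, TimeUnit.SECONDS);
      }
      return peak.get();
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      handlerPool.shutdownNow();
      connection.shutdownNow();
    }
  }

  static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    // 1 handler, connection capacity 10, 10 calls
    System.out.println("peak in flight: " + peakInFlight(1, 10, 10));
  }
}
```

With one handler the peak is 1 even though the connection could carry 10 calls, which is the bottleneck described above.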
> Since router RPC performance is mainly limited by the number of handlers, 
> the most effective way to improve it currently is to increase the number of 
> handlers. However, creating a large number of handler threads also increases 
> thread switching and cannot make full use of the machine.
>  
> There are usually multiple ns downstream of the router. If a handler 
> forwards a request to a slow ns, the handler waits for a long time. With 
> fewer handlers available, the router's ability to serve requests for ns that 
> are performing normally is reduced. From the client's perspective, the 
> performance of those downstream ns appears to have deteriorated. We often 
> find that the call queue of the downstream ns is short while the call queue 
> of the router is very long.
>  
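The effect of one slow ns on requests for healthy ns can also be shown with a toy model. This Java sketch is an illustration under assumptions, not router code: `SlowNsDemo`, the pool of two handlers, and the latch that stands in for the slow ns recovering are all hypothetical. Two requests for the slow ns occupy both handlers, so a request for a healthy ns has to wait in the handler queue even though its own ns could answer immediately.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Toy model only (hypothetical names): blocking handlers shared by a slow and a fast ns.
class SlowNsDemo {

  /** Returns {fast request finished while the slow ns was stuck, fast request finished after it recovered}. */
  static boolean[] run() {
    ExecutorService handlers = Executors.newFixedThreadPool(2);
    CountDownLatch slowNsRecovers = new CountDownLatch(1);
    try {
      // Two requests for the slow ns take both handlers; each blocks like step (3).
      for (int i = 0; i < 2; i++) {
        handlers.submit(() -> {
          return slowNsRecovers.await(30, TimeUnit.SECONDS);
        });
      }
      // A request for a healthy ns needs no downstream time, yet it must queue.
      Future<String> fast = handlers.submit(() -> "ok from fast ns");
      boolean doneWhileStuck;
      try {
        fast.get(200, TimeUnit.MILLISECONDS);
        doneWhileStuck = true;
      } catch (TimeoutException e) {
        doneWhileStuck = false;
      }
      // The slow ns answers, the handlers are released, and the fast request runs.
      slowNsRecovers.countDown();
      boolean doneAfter = "ok from fast ns".equals(fast.get(30, TimeUnit.SECONDS));
      return new boolean[] {doneWhileStuck, doneAfter};
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      handlers.shutdownNow();
    }
  }

  public static void main(String[] args) {
    boolean[] result = run();
    System.out.println("fast request served while slow ns stuck: " + result[0]);
    System.out.println("fast request served after slow ns recovered: " + result[1]);
  }
}
```

The fast request is only served once a handler is released, which matches the observation that the router's call queue grows while the downstream call queues stay short.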
> Therefore, although the main function of the router is to federate and handle 
> requests for multiple nameservices, the current synchronous RPC performance 
> cannot satisfy scenarios with many nameservices downstream of the router. 
> Even if the router's concurrency can be improved by increasing the number of 
> handlers, it remains relatively slow: more threads increase CPU 
> context-switching time, and many of the handler threads are in fact blocked, 
> which wastes thread resources. When a request enters the router, there is no 
> guarantee that a handler is free to process it.
>  
> Therefore, I propose asynchronous router RPC. Please see the *pdf* for the 
> complete solution.
>  
> Everyone is welcome to join the discussion!
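
For contrast with the synchronous flow described in the issue, here is a minimal Java sketch of a handler that does not wait for the downstream call. It is not the design from the attached PDF and not Hadoop code: the class `AsyncHandlerDemo`, the method `forwardAll`, and the use of `CompletableFuture` are assumptions chosen only to illustrate the general idea. Each simulated downstream call waits until all calls are in flight at once, so the run can only succeed if the single handler hands calls over without blocking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model only (hypothetical names): one handler that never blocks on a downstream call.
class AsyncHandlerDemo {

  /** One non-blocking handler forwards `calls` requests; returns how many responses came back. */
  static int forwardAll(int calls) {
    ExecutorService handler = Executors.newSingleThreadExecutor();
    ExecutorService connection = Executors.newFixedThreadPool(calls);
    // Every downstream call waits here until all `calls` are in flight together.
    CountDownLatch allInFlight = new CountDownLatch(calls);
    try {
      List<CompletableFuture<Boolean>> responses = new ArrayList<>();
      for (int i = 0; i < calls; i++) {
        CompletableFuture<Boolean> response = new CompletableFuture<>();
        responses.add(response);
        // The handler only hands the call over; the connection side completes the response later.
        handler.execute(() -> CompletableFuture
            .supplyAsync(() -> awaitPeers(allInFlight), connection)
            .thenAccept(response::complete));
      }
      int completed = 0;
      for (CompletableFuture<Boolean> r : responses) {
        if (r.get(30, TimeUnit.SECONDS)) {
          completed++;
        }
      }
      return completed;
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      handler.shutdownNow();
      connection.shutdownNow();
    }
  }

  /** Simulated downstream call: succeeds only if every other call is in flight too. */
  static boolean awaitPeers(CountDownLatch latch) {
    latch.countDown();
    try {
      return latch.await(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("responses received: " + forwardAll(10));
  }
}
```

In this model a single handler keeps all 10 downstream calls in flight at once, so throughput is bounded by the connection rather than by the number of handlers.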



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17531) RBF: Aynchronous router RPC.

2024-05-19 Thread Jian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Zhang updated HDFS-17531:
--
Attachment: Aynchronous router.pdf




[jira] [Updated] (HDFS-17531) RBF: Aynchronous router RPC.

2024-05-19 Thread Jian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Zhang updated HDFS-17531:
--
Attachment: HDFS-17531.001.patch
Status: Patch Available  (was: Open)




[jira] [Updated] (HDFS-17531) RBF: Aynchronous router RPC.

2024-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17531:
--
Labels: pull-request-available  (was: )

> RBF: Aynchronous router RPC.
> 
>
> Key: HDFS-17531
> URL: https://issues.apache.org/jira/browse/HDFS-17531
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Jian Zhang
> Assignee: Jian Zhang
> Priority: Major
> Labels: pull-request-available
> Attachments: Aynchronous router.pdf, image-2024-05-19-18-07-51-282.png
>
>



[jira] [Updated] (HDFS-17531) RBF: Aynchronous router RPC.

2024-05-19 Thread Jian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Zhang updated HDFS-17531:
--
Description: 
*Description*

Currently, the main function of the Router service is to accept client 
requests, forward them to the corresponding downstream nameservice (ns), and 
return the ns's results to the client. The RPC link is as follows:

*!image-2024-05-19-18-07-51-282.png|width=900,height=300!*
The main threads involved in the RPC link are:
{*}Read{*}: Get the client request and put it into the call queue *(1)*
{*}Handler{*}:
Take a call from the call queue *(2)*, process it, generate a new call, add it 
to the calls of the connection thread, and wait for that call to complete *(3)*
After being awakened by the connection thread, process the response and put it 
into the response queue *(5)*
*Connection:*
Hold the link with the downstream ns, send the calls held in connection.calls 
to the downstream ns (via {*}rpcRequestThread{*}), and obtain the response from 
the ns. Based on the call ID in the response, notify the matching call that 
processing is complete *(4)*
*Responder:*
Retrieve the response from the response queue *(6)* and return it to the client
 

*Shortcoming*
Even if the *connection* thread can send more requests to downstream 
nameservices, *(3)* and *(4)* are synchronous: after the *handler* thread adds 
the call to connection.calls, it must wait until the *connection* thread 
notifies it that the call is complete. Only after the response is put into the 
response queue can the handler take a new call from the call queue. Therefore, 
the concurrency of the router is limited by the number of handlers. A simple 
example: if the number of handlers is 1 and the maximum number of calls in the 
connection thread is 10, then even though the connection thread can send 10 
requests to the downstream ns, the router can only process one request at a 
time.
 
Since router RPC performance is mainly limited by the number of handlers, the 
most effective way to improve it currently is to increase the number of 
handlers. However, creating a large number of handler threads also increases 
thread switching and cannot make full use of the machine.
 
There are usually multiple ns downstream of the router. If a handler forwards a 
request to a slow ns, the handler waits for a long time. With fewer handlers 
available, the router's ability to serve requests for ns that are performing 
normally is reduced. From the client's perspective, the performance of those 
downstream ns appears to have deteriorated. We often find that the call queue 
of the downstream ns is short while the call queue of the router is very long.
 
Therefore, although the main function of the router is to federate and handle 
requests for multiple nameservices, the current synchronous RPC performance 
cannot satisfy scenarios with many nameservices downstream of the router. Even 
if the router's concurrency can be improved by increasing the number of 
handlers, it remains relatively slow: more threads increase CPU 
context-switching time, and many of the handler threads are in fact blocked, 
which wastes thread resources. When a request enters the router, there is no 
guarantee that a handler is free to process it.
 

Therefore, I propose asynchronous router RPC. Please see the *pdf* for the 
complete solution.

 

Everyone is welcome to join the discussion!

  was:
*Description*

Currently, the main function of the Router service is to accept client 
requests, forward the requests to the corresponding downstream ns, and then 
return the results of the downstream ns to the client. The link is as follows:

*!image-2024-05-19-18-07-51-282.png|width=900,height=300!*
The main threads involved in the rpc link are:
{*}Read{*}: Get the client request and put it into the call queue *(1)*
{*}Handler{*}:
Extract call *(2)* from the call queue, process the call, generate a new call, 
place it in the call of the connection thread, and wait for the call processing 
to complete *(3)*
After being awakened by the connection thread, process the response and put it 
into the response queue *(5)*
*Connection:*
Hold the link with downstream ns, send the call from the call to the downstream 
ns (via {*}rpcRequestThread{*}), and obtain a response from ns. Based on the 
call in the response, notify the call to complete processing *(4)*
*Responder:*
Retrieve the response queue from the queue *(6)* and return it to the client
 

*Shortcoming*
Even if the *connection* thread can send more requests to downstream 
nameservices, since *(3)* and *(4)* are synchronous, when the *handler* thread 
adds the call to connectio

[jira] [Updated] (HDFS-17531) RBF: Aynchronous router RPC.

2024-05-19 Thread Jian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Zhang updated HDFS-17531:
--
Attachment: Aynchronous router.pdf

> RBF: Aynchronous router RPC.
> 
>
> Key: HDFS-17531
> URL: https://issues.apache.org/jira/browse/HDFS-17531
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Jian Zhang
> Assignee: Jian Zhang
> Priority: Major
> Attachments: Aynchronous router.pdf, image-2024-05-19-18-07-51-282.png
>
>



[jira] [Updated] (HDFS-17531) RBF: Aynchronous router RPC.

2024-05-19 Thread Jian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Zhang updated HDFS-17531:
--
Description: 
*Description*

Currently, the main function of the Router service is to accept client 
requests, forward them to the corresponding downstream nameservice (ns), and 
return the ns's results to the client. The RPC link is as follows:

*!image-2024-05-19-18-07-51-282.png|width=900,height=300!*
The main threads involved in the RPC link are:
{*}Read{*}: Get the client request and put it into the call queue *(1)*
{*}Handler{*}:
Take a call from the call queue *(2)*, process it, generate a new call, add it 
to the calls of the connection thread, and wait for that call to complete *(3)*
After being awakened by the connection thread, process the response and put it 
into the response queue *(5)*
*Connection:*
Hold the link with the downstream ns, send the calls held in connection.calls 
to the downstream ns (via {*}rpcRequestThread{*}), and obtain the response from 
the ns. Based on the call ID in the response, notify the matching call that 
processing is complete *(4)*
*Responder:*
Retrieve the response from the response queue *(6)* and return it to the client
 

*Shortcoming*
Even if the *connection* thread can send more requests to downstream 
nameservices, *(3)* and *(4)* are synchronous: after the *handler* thread adds 
the call to connection.calls, it must wait until the *connection* thread 
notifies it that the call is complete. Only after the response is put into the 
response queue can the handler take a new call from the call queue. Therefore, 
the concurrency of the router is limited by the number of handlers. A simple 
example: if the number of handlers is 1 and the maximum number of calls in the 
connection thread is 10, then even though the connection thread can send 10 
requests to the downstream ns, the router can only process one request at a 
time.
 
Since router RPC performance is mainly limited by the number of handlers, the 
most effective way to improve it currently is to increase the number of 
handlers. However, creating a large number of handler threads also increases 
thread switching and cannot make full use of the machine.
 
There are usually multiple ns downstream of the router. If a handler forwards a 
request to a slow ns, the handler waits for a long time. With fewer handlers 
available, the router's ability to serve requests for ns that are performing 
normally is reduced. From the client's perspective, the performance of those 
downstream ns appears to have deteriorated. We often find that the call queue 
of the downstream ns is short while the call queue of the router is very long.
 
Therefore, although the main function of the router is to federate and handle 
requests for multiple nameservices, the current synchronous RPC performance 
cannot satisfy scenarios with many nameservices downstream of the router. Even 
if the router's concurrency can be improved by increasing the number of 
handlers, it remains relatively slow: more threads increase CPU 
context-switching time, and many of the handler threads are in fact blocked, 
which wastes thread resources. When a request enters the router, there is no 
guarantee that a handler is free to process it.
 

Therefore, I propose asynchronous router RPC. Please see the *pdf* for the 
complete solution.

 

  was:
*Description*

Currently, the main function of the Router service is to accept client 
requests, forward the requests to the corresponding downstream ns, and then 
return the results of the downstream ns to the client. The link is as follows:

*!image-2024-05-19-18-07-51-282.png|width=900,height=300!*

The main threads involved in the rpc link are:
{*}read{*}: Get the client request and put it into the call queue *(1)*

{*}handler{*}:
Remove the call from the call queue {*}(2){*}, process the call, generate a new 
call and put it into the calls of the connection thread, and wait for the call 
to be processed *(3)*
After being awakened by the connection thread, process the response and put the 
response into the response queue *(5)*
**

{*}connection{*}:
Hold the link with the downstream ns, send the call in calls to the downstream 
ns (through rpcRequestThread), and get the response from the ns, and notify the 
call processing completion according to the callid in the response *(4)*
**

{*}responder{*}:
Take out the response in the response queue column *(6)* and return it to the 
client

 

*Shortcoming*

Even if the *connection* thread can send more requests to downstream 
nameservices, since *(3)* and *(4)* are synchronous, when the *handler* thread 
adds the call t

[jira] [Updated] (HDFS-17531) RBF: Aynchronous router RPC.

2024-05-19 Thread Jian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Zhang updated HDFS-17531:
--
Description: 
*Description*

Currently, the main function of the Router service is to accept client 
requests, forward the requests to the corresponding downstream ns, and then 
return the results of the downstream ns to the client. The link is as follows:

*!image-2024-05-19-18-07-51-282.png|width=900,height=300!*

The main threads involved in the rpc link are:
{*}read{*}: Get the client request and put it into the call queue *(1)*

{*}handler{*}:
Take a call from the call queue {*}(2){*}, process it, generate a new call, 
put it into the calls of the connection thread, and wait for that call to be 
processed *(3)*
After being awakened by the connection thread, process the response and put it 
into the response queue *(5)*

{*}connection{*}:
Hold the link with the downstream ns, send the calls in calls to the downstream 
ns (through rpcRequestThread), get the responses from the ns, and notify the 
matching call of completion according to the callid in the response *(4)*

{*}responder{*}:
Take the response out of the response queue *(6)* and return it to the 
client

 

*Shortcoming*

Even if the *connection* thread could send more requests to the downstream 
nameservices, *(3)* and *(4)* are synchronous: after the *handler* thread 
adds a call to connection.calls, it must wait until the *connection* 
notifies it that the call is complete. Only after the response is put into the 
response queue can the handler take a new call from the call queue and process 
it. Therefore, the concurrency of the router is limited by the number of 
handlers. A simple example:
 - If the number of handlers is 1 and the maximum number of calls in the 
connection thread is 10, then even though the connection thread can send 10 
requests to the downstream ns, the single handler means the router can only 
process requests one at a time.
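
The example above can be checked with a little arithmetic. The sketch below is 
illustrative only: the class and method names are hypothetical, and it assumes 
a single connection and a fixed downstream latency, neither of which is stated 
in this issue.

```java
public class SyncThroughputBound {

  /**
   * Calls that can be in flight downstream at once. In the synchronous model
   * each handler blocks on exactly one call, so the handler count caps this
   * value no matter how many calls the connection could carry.
   */
  public static int maxInFlight(int handlers, int connectionMaxCalls) {
    return Math.min(handlers, connectionMaxCalls);
  }

  /**
   * Upper bound on requests per second, assuming every downstream call takes
   * latencyMillis (an assumed, fixed value used only for illustration).
   */
  public static double maxRequestsPerSecond(
      int handlers, int connectionMaxCalls, long latencyMillis) {
    return maxInFlight(handlers, connectionMaxCalls) * 1000.0 / latencyMillis;
  }

  public static void main(String[] args) {
    // 1 handler, connection capacity 10, assumed 10 ms per downstream call.
    System.out.println(maxInFlight(1, 10));
    System.out.println(maxRequestsPerSecond(1, 10, 10));
  }
}
```

With 1 handler the connection's capacity of 10 is never used: only one call is 
in flight at a time, and under the assumed 10 ms latency the bound is 100 
requests per second.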

Since the performance of router rpc is mainly limited by the number of 
handlers, the most effective way to improve rpc performance today is to 
increase the number of handlers. However, letting the router create a large 
number of handler threads also increases thread switching and does not make 
full use of the machine.

A router usually has multiple ns downstream. If a handler forwards a request 
to an ns with poor performance, the handler waits for a long time. With fewer 
handlers available, the router's ability to serve requests for ns with normal 
performance is reduced as well. From the perspective of the client, the 
downstream ns of the router all appear to have deteriorated. We often find 
that the call queue of the downstream ns is short, but the call queue of the 
router is very long.
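
The starvation effect described above can be reproduced with a plain thread 
pool. This is a minimal sketch, not router code: the pool stands in for the 
handler threads, a latch stands in for a slow ns, and all names are made up 
for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class HandlerStarvationSketch {

  /**
   * Returns the length of the router-side queue observed while every handler
   * is blocked waiting for the slow ns.
   */
  public static int queuedWhileSlowNsBlocks(int handlerCount) {
    ThreadPoolExecutor handlers = new ThreadPoolExecutor(
        handlerCount, handlerCount, 0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<>());
    CountDownLatch handlersBusy = new CountDownLatch(handlerCount);
    CountDownLatch slowNsResponds = new CountDownLatch(1);
    try {
      // Each handler picks up a call for the slow ns and blocks on its reply.
      for (int i = 0; i < handlerCount; i++) {
        handlers.submit(() -> {
          handlersBusy.countDown();
          slowNsResponds.await();
          return "slow-ns response";
        });
      }
      handlersBusy.await();
      // A call for a healthy ns arrives, but no handler is free to take it.
      Future<String> fastCall = handlers.submit(() -> "fast-ns response");
      int queued = handlers.getQueue().size();
      // Only when the slow ns answers can the queued call be served.
      slowNsResponds.countDown();
      fastCall.get(5, TimeUnit.SECONDS);
      return queued;
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      handlers.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("queued = " + queuedWhileSlowNsBlocks(2));
  }
}
```

The call for the healthy ns sits in the router-side queue even though that ns 
is idle, which matches the observation that the router's call queue grows 
while the downstream queues stay short.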

Therefore, although the main function of the router is to federate and handle 
requests for multiple NSs, the current synchronous RPC model cannot satisfy 
scenarios where there are many NSs downstream of the router. Increasing the 
number of handlers improves the router's concurrency, but only to a limited 
extent. More threads increase CPU context switching, and many of the handler 
threads are merely blocked, which wastes thread resources. When a request 
enters the router, there is no guarantee that a handler is free to run it.

 

Therefore, I propose making router RPC asynchronous. Please see the pdf for 
the complete solution.
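
As a rough illustration of the general idea only (the pdf describes the actual 
design, which this sketch does not claim to reproduce), a handler can register 
a continuation and return instead of waiting at step *(3)*. Everything below 
is hypothetical: the class, the map standing in for connection.calls, and the 
queue standing in for the response queue.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

public class AsyncHandlerSketch {
  // Hypothetical stand-in for connection.calls: pending calls keyed by call id.
  private final Map<Integer, CompletableFuture<String>> pendingCalls =
      new ConcurrentHashMap<>();
  // Hypothetical stand-in for the response queue drained by the responder.
  private final Queue<String> responseQueue = new ConcurrentLinkedQueue<>();

  /** Handler side: register the call and return at once instead of waiting. */
  public void handle(int callId) {
    CompletableFuture<String> pending = new CompletableFuture<>();
    // The continuation does the work of step (5) once the response arrives.
    pending.thenAccept(responseQueue::add);
    pendingCalls.put(callId, pending);
  }

  /** Connection side: the downstream ns answered the call with this id. */
  public void onResponse(int callId, String response) {
    CompletableFuture<String> pending = pendingCalls.remove(callId);
    if (pending != null) {
      pending.complete(response);
    }
  }

  public int inFlight() {
    return pendingCalls.size();
  }

  public int queuedResponses() {
    return responseQueue.size();
  }

  /** Returns {calls in flight after dispatch, responses queued at the end}. */
  public static int[] demo(int calls) {
    AsyncHandlerSketch router = new AsyncHandlerSketch();
    for (int id = 0; id < calls; id++) {
      router.handle(id);
    }
    int inFlight = router.inFlight();
    for (int id = 0; id < calls; id++) {
      router.onResponse(id, "response-" + id);
    }
    return new int[] {inFlight, router.queuedResponses()};
  }

  public static void main(String[] args) {
    int[] result = demo(10);
    System.out.println(result[0] + " in flight, " + result[1] + " responses");
  }
}
```

Here a single dispatching thread keeps all 10 calls in flight at once, so the 
number of outstanding calls is no longer tied to the number of handlers.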

 

> RBF: Aynchronous router RPC.
> 
>
> Key: HDFS-17531
> URL: https://issues.apache.org/jira/browse/HDFS-17531
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
> Attachments: image-2024-05-19-18-07-51-282.png
>
>
> *Description*
> Currently, the main function of the Router service is to accept client 
> requests, forward the requests to the corresponding downstream ns, and then 
> return the results of the downstream ns to the client. The link is as follows:
> *!image-2024-05-19-18-07-51-282.png|width=900,height=300!*
> The main threads involved in the rpc link are:
> {*}read{*}: Get the client request and put it into the call queue *(1)*
> {*}handler{*}:
> Take a call from the call queue {*}(2){*}, process it, generate a new call, 
> put it into the calls of the connection thread, and wait for that call to be 
> processed *(3)*
> After being awakened by the connection thread, process the response and put 
> it into the response queue *(5)*
> {*}connection{*}:
> Hold the link with the downstream ns, send th

[jira] [Updated] (HDFS-17531) RBF: Aynchronous router RPC.

2024-05-19 Thread Jian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Zhang updated HDFS-17531:
--
Attachment: image-2024-05-19-18-07-51-282.png

> RBF: Aynchronous router RPC.
> 
>
> Key: HDFS-17531
> URL: https://issues.apache.org/jira/browse/HDFS-17531
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jian Zhang
>Assignee: Jian Zhang
>Priority: Major
> Attachments: image-2024-05-19-18-07-51-282.png
>
>



