Re: [Cluster-devel] [PATCH v6.1-rc1 18/18] fs: dlm: rework lowcomms handling

Alexander Aring Fri, 28 Oct 2022 14:26:16 -0700

Hi,

On Thu, Oct 27, 2022 at 4:45 PM Alexander Aring <[email protected]> wrote:
>
> This patch is rework of lowcomms handling, the main goal was here to
> handle recvmsg() and sendpage() to run parallel. Parallel in two sense:
> 1. per connection and 2. that recvmsg()/sendpage() doesn't block each
> other.
>
> Currently recvmsg()/sendpage() cannot run parallel because two
> workqueues "dlm_recv" and "dlm_send" are ordered workqeues. That's said
> and we having two workqueues for receive and send, they block each if
> both workqueues schedule the same connection at the same time because a
> per connection mutex to prevent that the socket gets released by being
> used.
>
> To make it more parallel we introduce one "dlm_io" workqueue which is
> not an ordered workqueue, the amount of workers are not limited. Due
> per connection flags SEND/RECV pending we schedule workers ordered per
> connection and per send and receive task. To get rid of the mutex
> blocking same workers to do socket handling we switched to a semaphore
> which handles socket operations as read lock and sock releases as write
> operations, to prevent sock_release() being called while the socket is
> being used.
>
> There might be more optimization removing the semaphore and replacing it
> with rcu, however due other circumstances e.g. othercon behaviour it
> seems complicated to doing this change. I added comments to remove the
> othercon handling and moving to rcu handling. We need to do that to the
> next dlm major version upgrade because it is not backwards compatible
> with the current connect mechanism.
>
> The processing of dlm messages need to be still handled by a ordered
> workqueue. An dlm_process ordered workqeue was introduced which gets
> filled by the receive worker. This is probably the next bottleneck of
> DLM but the application can't currently parse dlm messages parallel. A
> comment was introduced to lift the workqueue context of dlm processing
> in a non-sleepable softirq.
>
> The lifetime of a connection struct is now different because the
> dlm_node_addr structure was merged into the connection struct. The
> shutdown handling for TCP was removed to handle half-open sockets. We
> have a application handling for dealing with half-opened connection
> which is required by dlm and cluster manager join/leave events. SCTP
> doesn't support half-closed sockets and it's already handled by the
> application to never run into such case, therefore we removed it.
>
> To check if lowcomms is currently running and it cannot be configured at
> this time we introduced a function dlm_lowcomms_is_running() and remove
> the dlm_allow_conn variable. The synchronization of stopping all listen
> socket workers goes now of it's data ready socket callback.
>
> Signed-off-by: Alexander Aring <[email protected]>
> ---
>  fs/dlm/config.c    |    4 +-
>  fs/dlm/lockspace.c |    5 +-
>  fs/dlm/lowcomms.c  | 1453 ++++++++++++++++++++------------------------
>  fs/dlm/lowcomms.h  |    6 +-
>  fs/dlm/main.c      |    7 +-
>  fs/dlm/midcomms.c  |   68 ++-
>  fs/dlm/midcomms.h  |    4 +
>  7 files changed, 730 insertions(+), 817 deletions(-)
>
> diff --git a/fs/dlm/config.c b/fs/dlm/config.c
> index ac8b62106ce0..20b60709eccf 100644
> --- a/fs/dlm/config.c
> +++ b/fs/dlm/config.c
> @@ -183,7 +183,7 @@ static int dlm_check_protocol_and_dlm_running(unsigned 
> int x)
>                 return -EINVAL;
>         }
>
> -       if (dlm_allow_conn)
...
> +
> +       switch (ret) {
> +       case DLM_IO_END:
> +               dlm_midcomms_receive_done(con->nodeid);
> +               break;
> +       case DLM_IO_EOF:
> +               close_connection(con, false);
> +               break;
> +       case DLM_IO_RESCHED:
> +               cond_resched();
> +               queue_work(io_workqueue, &con->rwork);
> +               break;
> +       default:
> +               if (ret < 0) {
> +                       log_print("node %d recv error %d", con->nodeid, ret);
> +                       close_connection(con, false);
> +                       dlm_midcomms_unack_msg_resend(con->nodeid);


I will change it to only call close_connection() on the send side and
put only a dlm_midcomms_unack_msg_resend() in here. It is difficult to
synchronize it with the sending worker. As we retransmit not acked dlm
messages we will hopefully get an error on the sending try and it will
close the socket.

Because the othercon and if we have the othercon it's only receiving
we will call close_connection() if it's a "othercon".

I currently try to run a test with tcpkill to see how the
close/reconnect behaviour acts.
I know the patch is really really big. I will try to split them
somehow into smaller ones as possible.

- Alex

Re: [Cluster-devel] [PATCH v6.1-rc1 18/18] fs: dlm: rework lowcomms handling

Reply via email to