On Tue, Jun 29, 2021 at 3:57 PM Jon Maloy <[email protected]> wrote:
>
>
> On 28/06/2021 15:16, Xin Long wrote:
> > On Mon, Jun 28, 2021 at 3:03 PM Xin Long <[email protected]> wrote:
> >> On Sun, Jun 27, 2021 at 3:44 PM Erin Shepherd <[email protected]> wrote:
> >>> Xin Long <[email protected]> writes:
> >>>
> >>>> Currently, when userspace reads a datagram with a buffer that is
> >>>> smaller than this datagram, the data will be truncated and only
> >>>> part of it can be received by users. It doesn't seem right that
> >>>> users don't know the datagram size and have to use a huge buffer
> >>>> to read it to avoid the truncation.
> >>>>
> >>>> This patch to fix it by keeping the skb in rcv queue until the
> >>>> whole data is read by users. Only the last msg of the datagram
> >>>> will be marked with MSG_EOR, just as TCP/SCTP does.
> Makes sense to me.
> >>> I agree that the current behavior is suboptimal, but:
> >>>
> >>>   * Isn't this the same behavior that other datagram socket types
> >>>     exhibit? It seems like this would make TIPC behave inconsistently
> >>>     compared to other transports
> >> Yes, SCTP.
> >> Do you see any reliable datagram transports not doing this?
> >>
> >>>   * Isn't this a compatibility break with existing software? Particularly
> >>>     existing software will not expect to receive trailers of overlong
> >>>     datagrams
> >> I talked to Jon about this, he seems okay with this.
> >>
> >>> It feels like this behavior should be activated either with a
> >>> setsockopt(2) call or a new MSG_* flag passed to recv
> >> Anyway, It may not be worth a new sockopt.
> >> I'm thinking to pass MSG_EOR into sendmsg:
> >>    sendmsg(MSG_EOR).
> > sorry, I meant recvmsg();
>
> Still not sure I understand what you are suggesting here. Do you mean
> that if we add MSG_EOR as a flag  to recvmsg() that means we *don't*
> want the remainder of the message, i.e., it is ok to truncate it?
>
> Or do you mean the opposite?
Yes, Jon, I mean the opposite.

when MSG_EOR is set, we will go with what this patch does,
but to delete MSG_EOR if this is not the last part of the data,
and keep MSG_EOR if this is the last part of the data.

when MSG_EOR is not set, the msg will be truncated as before.

>
> In the first case, we don't solve any compatibility issue, if that is
> the purpose. The programmer still has to add code to get the current
> behavior.
>
> In the latter case we would be on the 100% safe side, although I have a
> real hard time to see that this could be a real issue. Why would anybody
> deliberately design an application for having messages truncated.
>
> ///jon
>
>
> >> to indicate we don't want the truncating msg.
> >>
> >> When the msg flag returns with no MSG_EOR, it means there's more data to 
> >> read.
> >>
> >> Thanks.
> >>> - Erin
> >>>
> >>>> Signed-off-by: Xin Long <[email protected]>
> >>>> ---
> >>>>   net/tipc/socket.c | 30 +++++++++++++++++++++---------
> >>>>   1 file changed, 21 insertions(+), 9 deletions(-)
> >>>>
> >>>> diff --git a/net/tipc/socket.c b/net/tipc/socket.c
> >>>> index 34a97ea36cc8..504e59838b8b 100644
> >>>> --- a/net/tipc/socket.c
> >>>> +++ b/net/tipc/socket.c
> >>>> @@ -1880,6 +1880,7 @@ static int tipc_recvmsg(struct socket *sock, 
> >>>> struct msghdr *m,
> >>>>        bool connected = !tipc_sk_type_connectionless(sk);
> >>>>        struct tipc_sock *tsk = tipc_sk(sk);
> >>>>        int rc, err, hlen, dlen, copy;
> >>>> +     struct tipc_skb_cb *skb_cb;
> >>>>        struct sk_buff_head xmitq;
> >>>>        struct tipc_msg *hdr;
> >>>>        struct sk_buff *skb;
> >>>> @@ -1903,6 +1904,7 @@ static int tipc_recvmsg(struct socket *sock, 
> >>>> struct msghdr *m,
> >>>>                if (unlikely(rc))
> >>>>                        goto exit;
> >>>>                skb = skb_peek(&sk->sk_receive_queue);
> >>>> +             skb_cb = TIPC_SKB_CB(skb);
> >>>>                hdr = buf_msg(skb);
> >>>>                dlen = msg_data_sz(hdr);
> >>>>                hlen = msg_hdr_sz(hdr);
> >>>> @@ -1922,18 +1924,27 @@ static int tipc_recvmsg(struct socket *sock, 
> >>>> struct msghdr *m,
> >>>>
> >>>>        /* Capture data if non-error msg, otherwise just set return value 
> >>>> */
> >>>>        if (likely(!err)) {
> >>>> -             copy = min_t(int, dlen, buflen);
> >>>> -             if (unlikely(copy != dlen))
> >>>> -                     m->msg_flags |= MSG_TRUNC;
> >>>> -             rc = skb_copy_datagram_msg(skb, hlen, m, copy);
> >>>> +             int offset = skb_cb->bytes_read;
> >>>> +
> >>>> +             copy = min_t(int, dlen - offset, buflen);
> >>>> +             rc = skb_copy_datagram_msg(skb, hlen + offset, m, copy);
> >>>> +             if (unlikely(rc))
> >>>> +                     goto exit;
> >>>> +             if (unlikely(offset + copy < dlen)) {
> >>>> +                     if (!(flags & MSG_PEEK))
> >>>> +                             skb_cb->bytes_read = offset + copy;
> >>>> +             } else {
> >>>> +                     m->msg_flags |= MSG_EOR;
> >>>> +                     skb_cb->bytes_read = 0;
> >>>> +             }
> >>>>        } else {
> >>>>                copy = 0;
> >>>>                rc = 0;
> >>>> -             if (err != TIPC_CONN_SHUTDOWN && connected && 
> >>>> !m->msg_control)
> >>>> +             if (err != TIPC_CONN_SHUTDOWN && connected && 
> >>>> !m->msg_control) {
> >>>>                        rc = -ECONNRESET;
> >>>> +                     goto exit;
> >>>> +             }
> >>>>        }
> >>>> -     if (unlikely(rc))
> >>>> -             goto exit;
> >>>>
> >>>>        /* Mark message as group event if applicable */
> >>>>        if (unlikely(grp_evt)) {
> >>>> @@ -1956,9 +1967,10 @@ static int tipc_recvmsg(struct socket *sock, 
> >>>> struct msghdr *m,
> >>>>                tipc_node_distr_xmit(sock_net(sk), &xmitq);
> >>>>        }
> >>>>
> >>>> -     tsk_advance_rx_queue(sk);
> >>>> +     if (!skb_cb->bytes_read)
> >>>> +             tsk_advance_rx_queue(sk);
> >>>>
> >>>> -     if (likely(!connected))
> >>>> +     if (likely(!connected) || skb_cb->bytes_read)
> >>>>                goto exit;
> >>>>
> >>>>        /* Send connection flow control advertisement when applicable */
> >>>> --
> >>>> 2.27.0
> >>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> tipc-discussion mailing list
> >>>> [email protected]
> >>>> https://lists.sourceforge.net/lists/listinfo/tipc-discussion
> >
> > _______________________________________________
> > tipc-discussion mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/tipc-discussion
> >
>


_______________________________________________
tipc-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

Reply via email to