On Wed, Apr 14, 2021 at 2:38 AM Stefano Garzarella <sgarz...@redhat.com> wrote:
>
> On Wed, Apr 14, 2021 at 03:20:07AM -0400, Michael S. Tsirkin wrote:
> >On Wed, Apr 14, 2021 at 08:57:06AM +0200, Stefano Garzarella wrote:
> >> On Tue, Apr 13, 2021 at 03:58:34PM -0400, Michael S. Tsirkin wrote:
> >> > On Tue, Apr 13, 2021 at 04:03:51PM +0200, Stefano Garzarella wrote:
> >> > > On Tue, Apr 13, 2021 at 09:50:45AM -0400, Michael S. Tsirkin wrote:
> >> > > > On Tue, Apr 13, 2021 at 03:38:52PM +0200, Stefano Garzarella wrote:
> >> > > > > On Tue, Apr 13, 2021 at 09:16:50AM -0400, Michael S. Tsirkin wrote:
> >> > > > > > On Tue, Apr 13, 2021 at 02:58:53PM +0200, Stefano Garzarella 
> >> > > > > > wrote:
> >> > > > > > > On Mon, Apr 12, 2021 at 03:42:23PM -0700, Jiang Wang . wrote:
> >> > > > > > > > On Mon, Apr 12, 2021 at 7:21 AM Stefano Garzarella 
> >> > > > > > > > <sgarz...@redhat.com> wrote:
> >> > > > > > > > >
> >> > > > > > > > > On Mon, Apr 12, 2021 at 02:50:17PM +0100, Stefan Hajnoczi 
> >> > > > > > > > > wrote:
> >> > > > > > > > > >On Thu, Apr 01, 2021 at 04:36:02AM +0000, jiang.wang
> >> > > > > > > > > >wrote:
>
> [...]
>
> >> > > > > > > > > >>
> >> > > > > > > > > >> +Datagram sockets provide connectionless unreliable 
> >> > > > > > > > > >> messages of
> >> > > > > > > > > >> +a fixed maximum length.
> >> > > > > > > > > >
> >> > > > > > > > > >Plus unordered (?) and with message boundaries. In other 
> >> > > > > > > > > >words:
> >> > > > > > > > > >
> >> > > > > > > > > >  Datagram sockets provide unordered, unreliable, 
> >> > > > > > > > > > connectionless message
> >> > > > > > > > > >  with message boundaries and a fixed maximum length.
> >> > > > > > > > > >
> >> > > > > > > > > >I didn't think of the fixed maximum length aspect before. 
> >> > > > > > > > > >I guess the
> >> > > > > > > > > >intention is that the rx buffer size is the message size 
> >> > > > > > > > > >limit? That's
> >> > > > > > > > > >different from UDP messages, which can be fragmented into 
> >> > > > > > > > > >multiple IP
> >> > > > > > > > > >packets and can be larger than 64KiB:
> >> > > > > > > > > >https://en.wikipedia.org/wiki/User_Datagram_Protocol#UDP_datagram_structure
> >> > > > > > > > > >
> >> > > > > > > > > >Is it possible to support large datagram messages in 
> >> > > > > > > > > >vsock? I'm a little
> >> > > > > > > > > >concerned that applications that run successfully over 
> >> > > > > > > > > >UDP will not be
> >> > > > > > > > > >portable if vsock has this limitation because it would 
> >> > > > > > > > > >impose extra
> >> > > > > > > > > >message boundaries that the application protocol might 
> >> > > > > > > > > >not tolerate.
> >> > > > > > > > >
> >> > > > > > > > > Maybe we can reuse the same approach Arseny is using for 
> >> > > > > > > > > SEQPACKET.
> >> > > > > > > > > Fragment the packets according to the buffers in the 
> >> > > > > > > > > virtqueue and set
> >> > > > > > > > > the EOR flag to indicate the last packet in the message.
> >> > > > > > > > >
> >> > > > > > > > Agree. Another option is to use the ones for skb since we 
> >> > > > > > > > may need to
> >> > > > > > > > use skbs for multiple transport support anyway.
> >> > > > > > > >
> >> > > > > > >
> >> > > > > > > The important thing I think is to have a single flag in 
> >> > > > > > > virtio-vsock that
> >> > > > > > > identifies pretty much the same thing: this is the last 
> >> > > > > > > fragment of a series
> >> > > > > > > to rebuild a packet.
> >> > > > > > >
> >> > > > > > > We should reuse the same flag for DGRAM and SEQPACKET.
> >> > > > > > >
> >> > > > > > > Thanks,
> >> > > > > > > Stefano
> >> > > > > >
> >> > > > > > Well DGRAM can drop data so I wonder whether it can work ...
> >> > > > > >
> >> > > > >
> >> > > > > Yep, this is true, but the channel should not be losing packets, 
> >> > > > > so if the
> >> > > > > receiver discards packets, it knows that it must then discard all 
> >> > > > > of them
> >> > > > > until the EOR.
> >> > > >
> >> > > > That is not so easy - they can come mixed up from multiple sources.
> >> > >
> >> > > I think we can prevent mixing because virtuqueue is point to point and 
> >> > > its
> >> > > use is not thread safe, so the access (in the same peer) is already
> >> > > serialized.
> >> > > In the end the packet would be fragmented only before copying it to the
> >> > > virtuqueue.
> >> > >
> >> > > But maybe I missed something...
> >> >
> >> > Well I ask what's the point of fragmenting then. I assume it's so we
> >> > can pass huge messages around so you can't keep locks ...
> >> >
> >>
> >> Maybe I'm wrong, but isn't this similar to what we do in virtio-net with
> >> mergeable buffers?
> >
> >The point of mergeable buffers is to use less memory: both for each
> >packet and for a full receive vq.
> >
> >> Also in this case I think the fragmentation will happen only in the device,
> >> since the driver can enqueue the entire buffer.
> >>
> >> Maybe we can reuse mergeable buffers for virtio-vsock if the EOR flag is 
> >> not
> >> suitable.
> >
> >That sounds very reasonable.
>
> It should also allow us to save the header for each fragment.
>
> @Jiang Do you want to explore this?
> I'm talking about VIRTIO_NET_F_MRG_RXBUF feature.

Sure. Will do.

> >
> >> IIUC in the vsock device the fragmentation for DGRAM will happen just
> >> before
> >> to queue it in the virtqueue, and the device can check how many buffers are
> >> available in the queue and it can decide whether to queue them all up or
> >> throw them away.
> >> >
> >> > > > Sure linux net core does this but with fragmentation added in,
> >> > > > I start wondering whether you are beginning to reinvent the net stack
> >> > > > ...
> >> > >
> >> > > No, I hope not :-), in the end our advantage is that we have a channel 
> >> > > that
> >> > > doesn't lose packets, so I guess we can make assumptions that the 
> >> > > network
> >> > > stack can't.
> >> > >
> >> > > Thanks,
> >> > > Stefano
> >> >
> >> > I still don't know how will credit accounting work for datagram,
> >> > but proposals I saw seem to actually lose packets ...
> >> >
> >>
> >> I still don't know too, but I think it's not an issue in the RX side,
> >> since if it doesn't have space, can drop all the fragment.
> >>
> >> Another option to avoid fragmentation could be to allocate 64K buffers for
> >> the new DGRAM virtqueue.
> >
> >That's a lot of buffers ...
>
> Yep I see, and they would often be mostly unused...
>
> >
> >> In this way we will have at most 64K packets, which is similar to
> >> UDP/IP,
> >> without extra work for the fragmentation.
> >
> >IIRC default MTU is 1280 not 64K ...
>
> I was thinking that UDP at most can support 64K messages that IP should
> fragment according to MTU.
>
> Thanks,
> Stefano
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Reply via email to