On Mon, Jan 19, 2026 at 10:55:06PM +0100, Erik Huelsmann wrote:
> Hi,
> 
> > > While analysing traffic generated by "virsh vol-download", I noticed
> > > that the download stream exhibits the following pattern (over
> > > qemu+ssh://):
> > >
> > > <Start>
> > > Read 4 bytes (= frame length field)
> > > Read 262144 bytes (= header + payload ; VIR_NET_MESSAGE_HEADER_MAX
> > > (24) + VIR_NET_MESSAGE_LEGACY_PAYLOAD_MAX (262120))
> > > Read 4 bytes (= frame length field)
> > > Read 48 bytes (= header + payload)
> > > <Go back to Start>
> > >
> > > While nothing seems to be actually wrong with this pattern, the
> > > average payload on large data transfers is only (262120+24)/2 ==
> > > 131072 and the CPU makes double the iterations necessary. I'll be
> > > trying to figure out where this comes from, but I imagine that will be
> > > pretty hard to track down.
> > >
> > > Two questions:
> > >  1. Do you see the same thing?
> > >  2. Any ideas/hints where I could start looking?
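
For reference, that read pattern is plain length-prefixed framing: a
4-byte length word followed by that many bytes of header + payload.
Below is a minimal sketch of such a reader; it only illustrates the
observed pattern and is not the actual libvirt RPC reader code.
read_full() and read_frame() are invented names, and it assumes the
length word is a big-endian count of the header + payload bytes that
follow:

  #include <stdint.h>
  #include <stddef.h>
  #include <unistd.h>
  #include <arpa/inet.h>

  /* Read exactly 'want' bytes from 'fd', looping over short reads. */
  static int read_full(int fd, char *buf, size_t want)
  {
      size_t got = 0;
      while (got < want) {
          ssize_t n = read(fd, buf + got, want - got);
          if (n <= 0)
              return -1;
          got += (size_t)n;
      }
      return 0;
  }

  /* One iteration of the pattern: "Read 4 bytes" (length field),
   * then "Read <length> bytes" of header + payload. */
  static int read_frame(int fd, char *buf, size_t buflen)
  {
      uint32_t lenword;
      if (read_full(fd, (char *)&lenword, sizeof(lenword)) < 0)
          return -1;

      uint32_t len = ntohl(lenword);  /* assumed: counts header + payload only */
      if (len > buflen)
          return -1;

      if (read_full(fd, buf, len) < 0)
          return -1;
      return (int)len;
  }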
> >
> > I'd suggest enabling RPC tracing with systemtap, or debug logs, to see
> > what the messages being sent are.
> >
> > The 262144 byte message will be VIR_NET_STREAM with data payload.
> >
> > The 48 byte message is slightly surprising, but given you're using
> > vol-download, I'm going to guess that the volume is sparse, and
> > say that the 48 byte message is a VIR_NET_STREAM_HOLE message
> > skipping over the transfer of a large area of zeros.
> 
> I think I found the problem by trying to figure out the various call
> paths in the code base: the root cause is a difference in buffer sizes
> and how the remote driver works with those.
> 
> I found that the remote driver uses the virFDStream API from
> src/util/virfdstream.c to read local data. This API reads blocks of
> 256kB and sends those off into the (local) stream [1]. Then the remote
> driver finds that the input message to be transferred is larger than the
> maximum data frame size on the wire protocol (also 256kB, but *including*
> the header of 24 bytes), meaning it will cut the buffer into two
> parts: the maximum size of the data frame payload and "the rest". (By
> the way, if we had read 1MB instead of 256kB, it would have split this
> one input chunk into 5 chunks on the wire protocol.)
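
To make that arithmetic concrete, here is a small stand-alone sketch
(my own illustration, not libvirt code) of how a single buffer handed
to the stream gets cut into data frames of at most 262120 payload
bytes plus a 24-byte header: a 256kB read becomes one 262144-byte
frame plus a 48-byte frame (which matches the 48 byte message seen in
the trace), and a 1MB read becomes 5 frames:

  #include <stdio.h>

  #define HEADER_MAX       24   /* VIR_NET_MESSAGE_HEADER_MAX */
  #define PAYLOAD_MAX  262120   /* VIR_NET_MESSAGE_LEGACY_PAYLOAD_MAX */

  /* Split one buffer of 'bufsize' bytes into data frames carrying at
   * most PAYLOAD_MAX bytes of payload plus a HEADER_MAX byte header. */
  static void show_split(size_t bufsize)
  {
      int frame = 0;
      while (bufsize > 0) {
          size_t payload = bufsize > PAYLOAD_MAX ? PAYLOAD_MAX : bufsize;
          printf("frame %d: %zu payload + %d header = %zu bytes\n",
                 ++frame, payload, HEADER_MAX, payload + HEADER_MAX);
          bufsize -= payload;
      }
  }

  int main(void)
  {
      show_split(262144);     /* one 256kB read -> 262144-byte frame + 48-byte frame */
      show_split(1048576);    /* one 1MB read   -> 5 frames */
      return 0;
  }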
> 
> The actual splitting happens in daemonStreamHandleWrite() in
> remote_daemon_stream.c: it does not consider that there could be a
> next message in the queue which could fill the remaining space in the
> data frame; it only handles the "remaining" data in the current
> message.
> 
> I *do* understand why the virfdstream implementation reads chunks of
> 256kB (and not 256kB minus 24), and I understand that - from a
> simplicity perspective - it makes sense to push out one message from
> the transfer queue at a time. But is the intent that the average
> transfer then becomes 128kB minus 12 bytes?
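
To show the effect of that "one message at a time" handling in the
abstract (the code below is not the real daemonStreamHandleWrite(),
just an invented model of a transmit queue of 256kB messages), compare
filling each data frame from the head message only with topping the
frame up from the following messages as well:

  #include <stdio.h>

  #define PAYLOAD_MAX  262120   /* VIR_NET_MESSAGE_LEGACY_PAYLOAD_MAX */
  #define HEADER_MAX       24   /* VIR_NET_MESSAGE_HEADER_MAX */

  /* Illustrative only: drain a queue of pending messages into data
   * frames, either one message per frame-fill or coalescing across
   * queue entries until a frame is full. */
  static void send_frames(const size_t *queue, size_t n, int coalesce)
  {
      size_t i = 0, left = 0, frames = 0, bytes = 0;

      while (i < n || left > 0) {
          size_t payload = 0;
          while (payload < PAYLOAD_MAX) {       /* fill one data frame */
              if (left == 0) {
                  if (i == n)
                      break;                    /* queue is empty */
                  left = queue[i++];            /* pull the next message */
              }
              size_t take = PAYLOAD_MAX - payload;
              if (take > left)
                  take = left;
              payload += take;
              left -= take;
              if (!coalesce)
                  break;                        /* never look at the next message */
          }
          frames++;
          bytes += payload;
      }
      printf("%-22s: %zu frames, average payload %zu bytes\n",
             coalesce ? "coalescing" : "one message at a time",
             frames, bytes / frames);
  }

  int main(void)
  {
      size_t queue[] = { 262144, 262144, 262144, 262144 };  /* four 256kB reads */
      send_frames(queue, 4, 0);
      send_frames(queue, 4, 1);
      return 0;
  }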

Ok, this is an implementation flaw in virfdstream.

There are two scenarios: (1) reading from a local file, where fdstream
spawns a background helper thread; (2) reading from a non-file, where
fdstream reads directly from the FD.

When reading directly from the FD, the buffer passed to virFDStreamRead
will be filled with as much data as is available.

When reading via the helper thread, the buffer passed to virFDStreamRead
will only be filled with the data from the 1st pending message from the
background thread.

We need to fix the latter so that it will fill the buffer with data from
*all* pending messages from the thread, until the buffer is full. 
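
As a rough sketch of that idea (this is not the actual virfdstream
code or its data structures; 'struct msg' and queue_read() below are
invented stand-ins), the read path would keep draining queued messages
until the caller's buffer is full or the queue runs dry:

  #include <stddef.h>
  #include <string.h>

  /* Invented stand-in for the helper thread's message queue. */
  struct msg {
      struct msg *next;
      size_t len;      /* total bytes in this message */
      size_t offset;   /* how far the reader has consumed it */
      char *data;
  };

  /* Fill 'buf' from *all* pending messages, not just the first one,
   * stopping only when the buffer is full or the queue is empty. */
  static size_t queue_read(struct msg **head, char *buf, size_t buflen)
  {
      size_t filled = 0;

      while (filled < buflen && *head) {
          struct msg *m = *head;
          size_t avail = m->len - m->offset;
          size_t take = buflen - filled;
          if (take > avail)
              take = avail;

          memcpy(buf + filled, m->data + m->offset, take);
          m->offset += take;
          filled += take;

          if (m->offset == m->len)
              *head = m->next;   /* fully consumed; move to the next
                                  * (a real implementation would also
                                  * free 'm' here) */
      }
      return filled;
  }

The current behaviour described above corresponds to returning after
the first pass through that loop; the fix amounts to keeping the loop
going across queue entries.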


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
