* Juan Quintela (quint...@redhat.com) wrote: > "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote: > > Hi Juan, > > What are the semantics of 'qemu_peek_buffer'? > > > > - is it supposed to guarantee (if there are no errors) that > > it will read 'size' bytes? (i.e. it should block) > > > We are talking qemu, specifically migration. "quarantee" is always a > too strong word O:-)
It would be nice for people to expect it! > Once told that, qemu_peek_buffer() should always read the amount of > stuff it has been asked (or terun one error. OK. > > There are currently two users of it: > > * qemu_read_buffer which spins filling it's buffer up > > with repeated calls to qemu_peek_buffer > > <if people are using grep, function in qemu_get_buffer()> > > if we ask for "size" bytes, and there are less that that size, we are in > trouble. > > > > * vmstate_subsection_load that returns if the size read > > doesn't match what it was expecting > > This is look-ahead of "size" chars, and has all the problems that you > can think of read-ahead of more than one char. How is it used in > sub-sections: > > - we read that the 1st char is a subsection number (but it could not be > a subsection). > - we read the size > - we read the subsection name of that size > > - we search for a subsection with that name, if it exist, we then read > them properly. if it don't exist, we abort the whole subsection idea, > nad continue as if the subsection number hadn't exist in the 1st place. > > > I can't see how both of them can be right. > > > > The problem I'm seeing is that in my world I've got a > > qemu_peek_buffer of 8 bytes, and with a repeated virt-test > > local tcp migration it's failing about 1 in 8 times; > > here is some debug: > > > > 19:51:15 INFO | [qemu output] qemu_peek_buffer refill case (pre); size=8 > > offset=0 index=32764 pending=4 buf_index=32764 buf_size=32768 pos=23302795 > > 19:51:15 INFO | [qemu output] qemu_fill_buffer got 1 > > 19:51:15 INFO | [qemu output] qemu_peek_buffer refill case (post); size=8 > > offset=0 index=0 pending=5 buf_index=0 buf_size=5 pos=23302796 > > 19:51:15 INFO | [qemu output] qemu_peek_buffer (size>pending); size=8 > > offset=0 index=0 pending=5 buf_index=0 buf_size=5 pos=23302796 > > > > i.e. I asked for 8 bytes, there were 4 in the buffer, it called fill > > buffer, which got one > > more byte, and thus it returned me 5. > > Uh, oh. That shouldn't happen. could you try to change the > "if (pending < size)" > to > "while (pending < size)" > > remember that you need to handle errors on that loop? Yes, I think that will work - I think really the loop that is in qemu_get_buffer() needs to move up into qemu_peek_buffer() (with qemu_fill_buffer returning the number of bytes it has read). > BTW, what are you doing when you get that error? It is vmstate_load or > qemu_get_buffer()? and for what fiel? In my migration visitor world, in the compatibility code for the current format, I'm using qemu_peek_buffer to peek the 8 byte header on ram pages to figure out which type of page encoding we're using, so this means I'm calling qemu_peek_buffer once per page (with 8 bytes peek'd each time), so it's calling it a lot. > > I think what vmstate_subsection_load wants (and what I want) is something > > like qemu_read_buffer but which doesn't advance it's pointer, i.e. to read > > a header, decide it's not for me and let the next function along use it. > > Problem was to fix the case when there is less that <size> bytes into > the buffer. i.e. we start to search for a string that is bigger than > the remaining (already read) buffer. > > > vmstate_subsection_load doesn't look like it flags an error if it > > doesn't read enough; I guess the effect will be just to fail to > > load a migration in some interesting way. > > Error checking is missing there. If we don't get enough space, we > really want to fail the subsection read. Yes, I think it's possible this code could hit the problem I've seen, but it's much less likely because there are a lot less subsections than there are pages. I think the kernel can return only a few bytes, possibly because when qemu_fill_buffer fills the buffer it can read a weird number of bytes depending how much buffer space is left; so the kernel could have a few left for the next read. Dave -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK