On 04/26/2017 01:01 PM, Dr. David Alan Gilbert wrote:
> * Christian Borntraeger ([email protected]) wrote:
>> On 04/24/2017 09:12 PM, Dr. David Alan Gilbert wrote:
>>> * Christian Borntraeger ([email protected]) wrote:
>>>> On 04/24/2017 04:35 PM, Dr. David Alan Gilbert wrote:
>>>>> * Christian Borntraeger ([email protected]) wrote:
>>>>>> On 04/24/2017 12:53 PM, Dr. David Alan Gilbert wrote:
>>>>>>> * Christian Borntraeger ([email protected]) wrote:
>>>>>>>> David, Juan,
>>>>>>>>
>>>>>>>> I can trigger a hang of postcopy migration (when I do it early) so
>>>>>>>> that both sides are in paused state. Looks like one thread is still
>>>>>>>> loading vmstates for virtio and this load accesses guest memory and
>>>>>>>> triggers a userfault.
>>>>>>>
>>>>>>> It's perfectly legal for the destination to cause a userfault as it
>>>>>>> loads the virtio queue - the virtio queue should be being loaded by
>>>>>>> the main thread from the buffer while the 'listening' thread is
>>>>>>> waiting for the incoming page data.
>>>>>>>
>>>>>>> Can you turn on the following tracing please:
>>>>>>> destination:
>>>>>>>   postcopy_ram_fault_thread_request
>>>>>>>   postcopy_place_page
>>>>>>>   postcopy_place_page_zero
>>>>>>>
>>>>>>> source:
>>>>>>>   migrate_handle_rp_req_pages
>>>>>>>   ram_save_queue_pages
>>>>>>>
>>>>>>> You should see:
>>>>>>>   virtio does the access
>>>>>>>   userfault generates a fault
>>>>>>>   postcopy_ram_fault_thread_request sends the request to the source
>>>>>>>
>>>>>>>   the source sees migrate_handle_rp_req_pages and
>>>>>>>   queues it with ram_save_queue_pages
>>>>>>>
>>>>>>>   the destination sees the page arrive and postcopy_place_page or
>>>>>>>   postcopy_place_page_zero
>>>>>>>
>>>>>>> Some of that might disappear if the page was already on its way.
>>>>>>
>>>>>> The last events on the source are
>>>>>> [..]
>>>>>> [email protected]:postcopy_place_page host=0x3ff92246000
>>>>>> [email protected]:postcopy_place_page host=0x3ff92247000
>>>>>
>>>>> How do you see those on the source???
>>>>
>>>> It was the previous migrate (I did it in a loop forth and back).
>>>> The problem happens on the migrate back.
>>>
>>> OK, good.
>>>
>>>>>> [email protected]:migrate_handle_rp_req_pages in s390.ram at 41d9000 len 1000
>>>>>> [email protected]:ram_save_queue_pages s390.ram: start: 41d9000 len: 100
>>>>>
>>>>> Is that a typo? I'm expecting those two 'len' fields to be the same?
>>>>
>>>> Yes, it's a cut'n'paste "miss the last byte".
>>>
>>> Good.
>>>
>>> OK, before digging further, is this a new bug or does it happen on older
>>> QEMU? Have you got a rune to reproduce it on x86?
>>>
>>>>>> On the target I see lots of
>>>>>>
>>>>>> [email protected]:postcopy_place_page_zero host=0x3ff9befa000
>>>>>> [email protected]:postcopy_place_page host=0x3ff9befb000
>>>>>> [email protected]:postcopy_place_page host=0x3ff9befc000
>>>>>> [email protected]:postcopy_place_page host=0x3ff9befd000
>>>>>> [email protected]:postcopy_place_page_zero host=0x3ff9befe000
>>>>>> [email protected]:postcopy_place_page host=0x3ff9beff000
>>>>>>
>>>>>> So we have about 2 seconds of traffic going on after that request;
>>>>>> I assume it's precopy related.
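
(Aside, for anyone following along: the destination-side path those trace points
cover is roughly the following. This is a paraphrase of the fault thread in
migration/postcopy-ram.c rather than the actual code, so the function name below
is made up and the signatures/error handling are only approximate.)

    /* Sketch of the destination fault thread (approximate, 2.9-era code). */
    static void *postcopy_fault_thread_sketch(void *opaque)
    {
        MigrationIncomingState *mis = opaque;

        for (;;) {
            struct uffd_msg msg;
            RAMBlock *rb;
            ram_addr_t rb_offset;

            /* Block until the kernel reports a missing page on the uffd. */
            if (read(mis->userfault_fd, &msg, sizeof(msg)) != sizeof(msg)) {
                break;
            }

            /* Map the faulting host address back to a RAMBlock + offset. */
            rb = qemu_ram_block_from_host(
                     (void *)(uintptr_t)msg.arg.pagefault.address,
                     true, &rb_offset);

            /* This is where postcopy_ram_fault_thread_request fires. */
            trace_postcopy_ram_fault_thread_request(
                msg.arg.pagefault.address, qemu_ram_get_idstr(rb), rb_offset);

            /* Ask the source for the page over the return path; the source
             * side then logs migrate_handle_rp_req_pages and
             * ram_save_queue_pages. */
            migrate_send_rp_req_pages(mis, qemu_ram_get_idstr(rb),
                                      rb_offset, qemu_ram_pagesize(rb));
        }
        return NULL;
    }
    /* When the page data arrives, the destination places it with UFFDIO_COPY
     * (or UFFDIO_ZEROPAGE), which is where postcopy_place_page and
     * postcopy_place_page_zero fire and the blocked loader wakes up. */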
>>>>>>
>>>>>> Looking back on the target history there was
>>>>>> [email protected]:postcopy_ram_fault_thread_request Request for
>>>>>> HVA=3ff618d9000 rb=s390.ram offset=41d9000
>>>>>>
>>>>>> In fact it seems to be the first and only request:
>>>>>>
>>>>>> # cat /tmp/test0.trace | grep -v postcopy_place_page
>>>>>>
>>>>>> [email protected]:postcopy_ram_fault_thread_request Request for
>>>>>> HVA=3ff618d9000 rb=s390.ram offset=41d9000
>>>>>
>>>>> OK, does the HVA there correspond to the address that your virtio device
>>>>> is blocking on?
>>>>
>>>> Yes, it is the same page.
>>>>
>>>>> (or at least the start of the page)
>>>>> Do you see a postcopy_place_page with a host= the same HVA?
>>>>
>>>> No.
>>>
>>> Hmm, that's bad.
>>> The flow is:
>>>   Precopy
>>>     (a) Send pages
>>>   Switch to postcopy
>>>     (b) Send discards for pages that have been changed after (a)
>>>   Postcopy
>>>     (c) Keep sending pages until we run out
>>>     (d) But send pages we're asked for first
>>>
>>> (d) might be ignored if the source thinks the page was already sent or not
>>> dirty.
>>>
>>> So we need to figure out:
>>>   1) If the source sent the pages during (a)
>>>   2) If the source discarded it during (b)
>>>   3) If it sent it again any time in c/d
>>>   4) If it ignored the request from d
>>>
>>> So please turn on the traces:
>>>   get_queued_page_not_dirty              should help with (4)
>>>   get_queued_page                        should help with (4)
>>>   ram_discard_range                      should help with (2)
>>>   loadvm_postcopy_ram_handle_discard     should help with (2)
>>>   qemu_savevm_send_postcopy_ram_discard  should help with (2)
>>>
>>> and add near the top of ram_save_page in ram.c:
>>>   fprintf(stderr, "%s: %s:%zx\n", __func__, block->idstr, (size_t)offset);
>>> which should help with (1) and (3).
>>
>> source:
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> [email protected]:qemu_savevm_send_postcopy_ram_discard s390.ram: 9d
>> [email protected]:migrate_handle_rp_req_pages in s390.ram at 3c6b000 len 1000
>> [email protected]:ram_save_queue_pages s390.ram: start: 3c6b000 len: 1000
>> [email protected]:get_queued_page_not_dirty s390.ram/3c6b000 ram_addr=3c6b000 (sent=0)
>
> OK, the source is ignoring the request because it thinks the page has
> already been sent and hasn't been redirtied.
>
>> I also get
>>   ram_save_page: s390.ram:3c6b000
>
> and you said in your other message that happens before the discards, which
> means it's precopy, and that's why you don't see the 'place_page'.
>
>> target (without postcopy_place):
>> [email protected]:ram_discard_range s390.ram: start: 0 5dc00000
>
> That's start of day I think, so we can ignore that (I think!)
>
>> [email protected]:loadvm_postcopy_ram_handle_discard
>
> ....
>
>> [email protected]:ram_discard_range s390.ram: start: 3bf6000 1000
>> [email protected]:ram_discard_range s390.ram: start: 3bfd000 1000
>> [email protected]:ram_discard_range s390.ram: start: 3c06000 4000
>> [email protected]:loadvm_postcopy_ram_handle_discard
>> [email protected]:ram_discard_range s390.ram: start: 3c79000 3000
>
> Well, it hasn't discarded the 3c6b000 page, so the page we received
> during precopy should still be there, and it shouldn't fault.
>
> In migration/ram.c in ram_load, before the line:
>   switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
> can you add:
>   fprintf(stderr, "%s: addr=%zx flags=%zx host=%p\n", __func__, addr,
>           flags, host);
> That should show the incoming blocks during precopy and we should be able
> to see the page coming in.
>
> Hmm, I have a theory: if the flags field has bit 1 set, i.e.
> RAM_SAVE_FLAG_COMPRESS, then try changing ram_handle_compressed to
> always do the memset.
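
For reference, ram_handle_compressed() in migration/ram.c is currently roughly
the following (quoted from memory, so treat the exact form as approximate):

    /* Zap a page that was sent as a single fill byte (normally zero).
     * Note the shortcut: a destination page that is already all-zero is
     * left untouched, i.e. never written at all. */
    void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
    {
        if (ch != 0 || !is_zero_range(host, size)) {
            memset(host, ch, size);
        }
    }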
FWIW, changing ram_handle_compressed to always memset makes the problem go away.
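
Concretely, the experiment was along these lines (a quick hack to test the
theory, not a proposed patch):

    void ram_handle_compressed(void *host, uint8_t ch, uint64_t size)
    {
        /* Always write the page, even if it is already zero, so the
         * destination page is definitely populated during precopy. */
        memset(host, ch, size);
    }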
