At 12/15/2011 09:30 AM, HATAYAMA Daisuke Write:
> From: Wen Congyang <we...@cn.fujitsu.com>
> Subject: Re: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci
> device is used by guest
> Date: Tue, 13 Dec 2011 17:20:24 +0800
>
>> At 12/13/2011 02:01 PM, HATAYAMA Daisuke Write:
>>> From: Wen Congyang <we...@cn.fujitsu.com>
>>> Subject: Re: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci
>>> device is used by guest
>>> Date: Tue, 13 Dec 2011 11:35:53 +0800
>>>
>>>> Hi, hatayama-san
>>>>
>>>> At 12/13/2011 11:12 AM, HATAYAMA Daisuke Write:
>>>>> Hello Wen,
>>>>>
>>>>> From: Wen Congyang <we...@cn.fujitsu.com>
>>>>> Subject: [Qemu-devel] [RFC][PATCT 0/5 v2] dump memory when host pci
>>>>> device is used by guest
>>>>> Date: Fri, 09 Dec 2011 15:57:26 +0800
>>>>>
>>>>>> Hi, all
>>>>>>
>>>>>> 'virsh dump' cannot work when a host PCI device is used by the guest.
>>>>>> We have discussed this issue here:
>>>>>> http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00736.html
>>>>>>
>>>>>> We have decided to introduce a new command, dump, to dump memory. The
>>>>>> core file's format can be ELF.
>>>>>>
>>>>>> Note:
>>>>>> 1. The guest should be x86 or x86_64. Other arches are not supported.
>>>>>> 2. If you use an old gdb, gdb may crash. I use gdb-7.3.1, and it does
>>>>>>    not crash.
>>>>>> 3. If the OS is in the second kernel, gdb may not work well, but crash
>>>>>>    can work by specifying '--machdep phys_base=xxx' on the command
>>>>>>    line. The reason is that the second kernel updates the page table,
>>>>>>    and we cannot get the page table of the first kernel.
>>>>>
>>>>> I guess the current implementation still breaks vmalloc'ed areas that
>>>>> need page tables originally located in the first 640kB, right? If you
>>>>> want to handle this correctly, you need to identify the position of
>>>>> the backup region and read the 1st kernel's page tables from there.
>>>>
>>>> I do not know anything about vmalloc'ed areas. Can you explain them in
>>>> more detail?
>>>>
>>>
>>> It is a memory area that is not straight-mapped. To read the area, it is
>>> necessary to look up the guest machine's page tables. If I understand
>>> correctly, your current implementation translates the vmalloc'ed area so
>>> that the generated vmcore is linearly mapped w.r.t. virtual addresses
>>> for gdb to work.
>>
>> Do you mean the page table for the vmalloc'ed area is stored in the first
>> 640KB, and it may be overwritten by the second kernel (this region has
>> been backed up)?
>>
>
> This might be wrong... I have tried locally to confirm this, but I have
> not managed to yet.
>
> I can confirm that at least pglist_data can be within the first 640kB:
>
> crash> log
> <cut>
> No NUMA configuration found
> Faking a node at 0000000000000000-000000007f800000
> Bootmem setup node 0 0000000000000000-000000007f800000
>   NODE_DATA [0000000000011000 - 0000000000044fff]   <-- this
Only a kernel built with CONFIG_NUMA has this. This config is only enabled
on RHEL for x86_64. I do not have such an environment on hand now.

>   bootmap [0000000000045000 - 0000000000054eff] pages 10
> (7 early reservations) ==> bootmem [0000000000 - 007f800000]
>   #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
>   #1 [0000006000 - 0000008000]       TRAMPOLINE ==> [0000006000 - 0000008000]
>
> And I once had a vmcore created after entering the 2nd kernel where I
> could not see module data using the mod sub-command; this was resolved by
> re-reading the addresses from the corresponding backup region.
>
> I guess that because crash uses the page table in memory, this affects
> paging badly.
>
> I want to look into this more, but I don't have such a vmcore now because
> I lost them accidentally... I tried to reproduce this several times
> yesterday but did not succeed. The vmcore above is one of them.
>
>>>
>>> kdump saves the first 640kB of physical memory into the backup region.
>>> I guess that, for some vmcores created by the current implementation,
>>> gdb and crash cannot correctly see the vmalloc'ed memory areas that
>>> need page tables
>>
>> Hmm, IIRC, crash does not use the CPU's page table. gdb uses the
>> information in PT_LOAD to read memory areas.
>>
>
> I was confused about this. Your dump command uses the CPU's page table.
>
> So on the qemu side you can get page tables over the whole physical
> address space, right? If so, the contents themselves are not broken, I
> think.
>
>>> placed at the 640kB region correctly. For example, try to use the mod
>>> sub-command. Kernel modules are allocated in the vmalloc'ed area.
>>>
>>> I have developed very similar logic for sadump. Look at sadump.c in
>>> crash. The logic itself is very simple, but debugging information is
>>> necessary. Documentation/kdump/kdump.txt and the following paper explain
>>> the backup region mechanism very well, and the implementation around
>>> there remains the same now.
>>
>> Hmm, we cannot use debugging information on the qemu side.
>>
>
> How about re-reading them later in crash? Users want to see the 1st
> kernel rather than the 2nd kernel.

An easy way to see the 1st kernel is to specify --machdep phys_base=xxx on
the command line.

Thanks
Wen Congyang

>
> To do that, the dump format must be distinguishable by crash. Which
> function in crash reads the vmcores created by this command? kcore, or
> netdump?
>
> Thanks.
> HATAYAMA, Daisuke
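
---

For readers following the thread: the translation being discussed is an
ordinary 4-level x86_64 page-table walk performed over guest physical
memory, so that vmalloc'ed (non-straight-mapped) areas come out linearly
mapped in the generated vmcore. Below is a minimal sketch of such a walk,
not the patch code itself; read_guest_phys() is a hypothetical helper
standing in for something like QEMU's cpu_physical_memory_read(), and
error handling is reduced to a bool return.

/*
 * Sketch: translate a guest virtual address to a guest physical
 * address by walking the guest's 4-level page tables (PML4 -> PDPT
 * -> PD -> PT), starting from the guest's CR3.
 */
#include <stdint.h>
#include <stdbool.h>

#define PTE_P     0x1ULL                  /* present bit */
#define PTE_PS    0x80ULL                 /* page-size bit (1GB/2MB page) */
#define ADDR_MASK 0x000ffffffffff000ULL   /* physical address bits of an entry */

/* Hypothetical helper: read 'len' bytes of guest physical memory. */
extern bool read_guest_phys(uint64_t paddr, void *buf, int len);

static bool virt_to_phys(uint64_t cr3, uint64_t vaddr, uint64_t *paddr)
{
    uint64_t entry = 0;
    uint64_t table = cr3 & ADDR_MASK;
    int shift;

    /* Index bits: PML4 47..39, PDPT 38..30, PD 29..21, PT 20..12. */
    for (shift = 39; shift >= 12; shift -= 9) {
        uint64_t idx = (vaddr >> shift) & 0x1ff;

        if (!read_guest_phys(table + idx * 8, &entry, 8) ||
            !(entry & PTE_P)) {
            return false;                 /* not mapped */
        }
        /* 1GB (shift == 30) or 2MB (shift == 21) large page ends the walk. */
        if ((shift == 30 || shift == 21) && (entry & PTE_PS)) {
            uint64_t page_mask = (1ULL << shift) - 1;
            *paddr = (entry & ADDR_MASK & ~page_mask) | (vaddr & page_mask);
            return true;
        }
        table = entry & ADDR_MASK;
    }
    /* 'entry' now holds the 4kB PTE. */
    *paddr = (entry & ADDR_MASK) | (vaddr & 0xfff);
    return true;
}

With physical addresses obtained this way, the dump command can emit ELF
PT_LOAD segments whose p_vaddr/p_paddr pairs give gdb a linear view of the
guest's virtual address space. A vmcore of the 1st kernel can then be
opened with something like

    crash --machdep phys_base=0x200000 vmlinux vmcore

where the phys_base value shown is only a placeholder; the real value
depends on where the 1st kernel was loaded.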