Hi,

On 03/07/2019 10:16, huang.jun...@zte.com.cn wrote:
>> On 02/07/2019 11:34, Yi Wang wrote:
>>> From: Junhua Huang <huang.jun...@zte.com.cn>
>>> Commit 50d7ba36b916 ("arm64: export memblock_reserve()d regions via
>>> /proc/iomem") shows the reserved memblock regions in /proc/iomem. But the
>>> initrd's reserved memblock will be freed in free_initrd_mem(), which
>>> executes after reserve_memblock_reserved_regions().
>>> So some incorrect information is shown in /proc/iomem, e.g.:
>>> 80000000-bbdfffff : System RAM
>>>   80080000-813effff : Kernel code
>>>   813f0000-8156ffff : reserved
>>>   81570000-817fcfff : Kernel data
>>>   83400000-83ffffff : reserved
>>>   90000000-90004fff : reserved
>>>   b0000000-b2618fff : reserved
>>>   b8c00000-bbbfffff : reserved
>>> In this case, the range from b0000000 to b2618fff is reserved for the
>>> initrd, and should be removed from the resource tree after it is freed.
>>
>> (There was some discussion about this over-estimate on the list, but it
>> didn't make it into the commit message.) I think a reserved->free change is
>> fine. If user-space thinks it's still reserved, nothing bad happens.

>>> As kexec-tool will collect the iomem reserved info and use it in the
>>> second kernel, this causes the error message to be generated a second time.

>> What error message?

> Sorry, it's my mistake. The kexec-tool could not use iomem reserved info in
> the second kernel.
> The error message I mean is that the initrd reserved memblock region will be
> shown in the second kernel's /proc/iomem. But this message comes from the
> dtb's memreserve node, not the first kernel's /proc/iomem.

This doesn't sound right.
Is kexec-tool spraying anything reserved in /proc/iomem into the DT as 
memreserve?


These top-level 'nomap' and second-level 'reserved' entries exist to stop
kexec-tools trying to write the new kernel over the top of something
important. This only matters between 'load' and 'exec', while the #1-kernel
is still running:

| kexec-tools reads /proc/iomem.
| kexec-tools tells #1-kernel "I want this 10MB image to be located at 0xf00".
| #1-kernel knows 0xf00 is in use, so it stores the data elsewhere until kexec-time.
[some time passes]
| #1-kernel kexec's, copying the image to 0xf00
| #2-kernel now owns the machine

This goes wrong if 0xf00 belonged to firmware (nomap), or contained something
important (UEFI memory map, ACPI tables etc).
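To make the load-time check concrete, here is a rough Python sketch of the
idea: parse /proc/iomem-style text and ask whether a proposed load range
touches anything named 'reserved'. This is only an illustration, not what
kexec-tools actually does. The function names are made up for the example,
the sample data is the listing quoted earlier in this thread, and treating
every entry named 'reserved' as off-limits is an assumption.

```python
import re

# Sample /proc/iomem text, copied from the listing quoted earlier in the thread.
SAMPLE_IOMEM = """\
80000000-bbdfffff : System RAM
  80080000-813effff : Kernel code
  813f0000-8156ffff : reserved
  81570000-817fcfff : Kernel data
  83400000-83ffffff : reserved
  90000000-90004fff : reserved
  b0000000-b2618fff : reserved
  b8c00000-bbbfffff : reserved
"""

# "<indent><start>-<end> : <name>", addresses in hex without a 0x prefix.
LINE_RE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def parse_iomem(text):
    """Return a list of (depth, start, end, name); 'end' is inclusive."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        indent, start, end, name = m.groups()
        # Assumption: each nesting level is indented by two spaces.
        depth = len(indent) // 2
        entries.append((depth, int(start, 16), int(end, 16), name.strip()))
    return entries


def overlaps_reserved(entries, start, size):
    """Return the 'reserved' ranges that intersect [start, start + size)."""
    end = start + size - 1
    return [
        (s, e)
        for (_depth, s, e, name) in entries
        if name == "reserved" and s <= end and start <= e
    ]


if __name__ == "__main__":
    entries = parse_iomem(SAMPLE_IOMEM)
    # A 10MB image placed just after the 83400000-83ffffff reserved block.
    print(overlaps_reserved(entries, 0x84000000, 10 * 1024 * 1024))
    # A 4KB image placed at the start of the initrd's reserved range.
    print(overlaps_reserved(entries, 0xB0000000, 0x1000))
```

With the sample listing, the first query finds no conflict, while the second
one reports the b0000000-b2618fff range that was reserved for the initrd.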

Once the second kernel has started running it should re-discover where this
important stuff is from the EFI and ACPI tables.

We deliberately over-estimate these second-level reserved regions as it's the
simplest thing to do. (e.g. the per-cpu chunk allocations get swept up too)


Does this mean the amount of usable memory in the system reduces each time you
kexec? That shouldn't be true!


Thanks,

James
