On 03/18/2017 at 01:38 AM, Eric W. Biederman wrote:
> Xunlei Pang <xlp...@redhat.com> writes:
>
>> kexec setups identity mappings for all the memory mapped in 1st kernel,
>> this is not necessary for the kdump case. Actually it can cause extra
>> memory consumption for paging structures, which is quite considerable
>> on modern machines with huge memory.
>>
>> E.g. On our 24TB machine, it will waste around 96MB (around 4MB/TB)
>> from the reserved memory range if setting all the identity mappings.
>>
>> It also causes some trouble for distributions that use an intelligent
>> policy to evaluate the proper "crashkernel=X" for users.
>>
>> To solve it, in case of kdump, we only setup identity mappings for the
>> crash memory and the ISA memory(may be needed by purgatory/kdump
>> boot).
> How about instead we detect the presence of 1GiB pages and use them
> if they are available.  We already use 2MiB pages.  If we can do that
> we will only need about 192K for page tables in the case you have
> described and this all becomes a non-issue.
>
> I strongly suspect that the presence of 24TiB of memory in an x86 system
> strongly correlates to the presence of 1GiB pages.
>
> In principle we certainly can use a less extensive mapping but that
> should not be something that differs between the two kexec cases.

Ok, will try gbpages for the identity mapping.

Regards,
Xunlei

> I can see forcing the low 1MiB range in.  But calling it ISA range is
> very wrong and misleading.  The reasons that range are special during
> boot-up have nothing to do with ISA.  But have everything to do with
> where legacy page tables are mapped, and where we need identity pages to
> start other cpus.  I think the only user that actually cares is
> purgatory where it plays swapping games with the low 1MiB because we
> can't preload what we need to down there or it would mess up the running
> kernel.  So saying anything about the old ISA bus is wrong and
> misleading.  At the very very least we need accurate comments.
>
> Eric
>
>
>> Signed-off-by: Xunlei Pang <xlp...@redhat.com>
>> ---
>>  arch/x86/kernel/machine_kexec_64.c | 34 ++++++++++++++++++++++++++++++----
>>  1 file changed, 30 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kernel/machine_kexec_64.c 
>> b/arch/x86/kernel/machine_kexec_64.c
>> index 857cdbd..db77a76 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -112,14 +112,40 @@ static int init_pgtable(struct kimage *image, unsigned 
>> long start_pgtable)
>>  
>>      level4p = (pgd_t *)__va(start_pgtable);
>>      clear_page(level4p);
>> -    for (i = 0; i < nr_pfn_mapped; i++) {
>> -            mstart = pfn_mapped[i].start << PAGE_SHIFT;
>> -            mend   = pfn_mapped[i].end << PAGE_SHIFT;
>>  
>> +    if (image->type == KEXEC_TYPE_CRASH) {
>> +            /* Always map the ISA range */
>>              result = kernel_ident_mapping_init(&info,
>> -                                             level4p, mstart, mend);
>> +                            level4p, 0, ISA_END_ADDRESS);
>>              if (result)
>>                      return result;
>> +
>> +            /* crashk_low_res may not be initialized when reaching here */
>> +            if (crashk_low_res.end) {
>> +                    mstart = crashk_low_res.start;
>> +                    mend = crashk_low_res.end + 1;
>> +                    result = kernel_ident_mapping_init(&info,
>> +                                    level4p, mstart, mend);
>> +                    if (result)
>> +                            return result;
>> +            }
>> +
>> +            mstart = crashk_res.start;
>> +            mend = crashk_res.end + 1;
>> +            result = kernel_ident_mapping_init(&info,
>> +                            level4p, mstart, mend);
>> +            if (result)
>> +                    return result;
>> +    } else {
>> +            for (i = 0; i < nr_pfn_mapped; i++) {
>> +                    mstart = pfn_mapped[i].start << PAGE_SHIFT;
>> +                    mend   = pfn_mapped[i].end << PAGE_SHIFT;
>> +
>> +                    result = kernel_ident_mapping_init(&info,
>> +                                     level4p, mstart, mend);
>> +                    if (result)
>> +                            return result;
>> +            }
>>      }
>>  
>>      /*

Reply via email to