Re: devm_memremap_pages() triggers a kasan_add_zero_shadow() warning

2019-08-21 Thread Qian Cai



> On Aug 21, 2019, at 9:31 PM, Baoquan He  wrote:
> 
> On 08/21/19 at 05:12pm, Qian Cai wrote:
>>>> Does disabling CONFIG_RANDOMIZE_BASE help? Maybe that workaround has
>>>> regressed. Effectively we need to find what is causing the kernel to
>>>> sometimes be placed in the middle of a custom reserved memmap= range.
>>> 
>>> Yes, disabling KASLR works good so far. Assuming the workaround, i.e.,
>>> f28442497b5c
>>> (“x86/boot: Fix KASLR and memmap= collision”) is correct.
>>> 
>>> The only other commit that might regress it from my research so far is,
>>> 
>>> d52e7d5a952c ("x86/KASLR: Parse all 'memmap=' boot option entries”)
>>> 
>> 
>> It turns out that the origin commit f28442497b5c (“x86/boot: Fix KASLR and
>> memmap= collision”) has a bug that is unable to handle "memmap=" in
>> CONFIG_CMDLINE instead of a parameter in bootloader because when it (as well
>> as the commit d52e7d5a952c) calls get_cmd_line_ptr() in order to run
>> mem_avoid_memmap(), "boot_params" has no knowledge of CONFIG_CMDLINE. Only
>> later in setup_arch(), the kernel will deal with parameters over there.
> 
> Yes, we didn't consider CONFIG_CMDLINE during boot compressing stage. It
> should be a generic issue since other parameters from CONFIG_CMDLINE could
> be ignored too, not only KASLR handling. Would you like to cast a patch
> to fix it? Or I can fix it later, maybe next week.

I think you have more experience than me in this area, so if you have time to
fix it, that would be nice.



Re: devm_memremap_pages() triggers a kasan_add_zero_shadow() warning

2019-08-21 Thread Baoquan He
On 08/21/19 at 05:12pm, Qian Cai wrote:
> > > Does disabling CONFIG_RANDOMIZE_BASE help? Maybe that workaround has
> > > regressed. Effectively we need to find what is causing the kernel to
> > > sometimes be placed in the middle of a custom reserved memmap= range.
> > 
> > Yes, disabling KASLR works good so far. Assuming the workaround, i.e.,
> > f28442497b5c
> > (“x86/boot: Fix KASLR and memmap= collision”) is correct.
> > 
> > The only other commit that might regress it from my research so far is,
> > 
> > d52e7d5a952c ("x86/KASLR: Parse all 'memmap=' boot option entries”)
> > 
> 
> It turns out that the origin commit f28442497b5c (“x86/boot: Fix KASLR and
> memmap= collision”) has a bug that is unable to handle "memmap=" in
> CONFIG_CMDLINE instead of a parameter in bootloader because when it (as well
> as the commit d52e7d5a952c) calls get_cmd_line_ptr() in order to run
> mem_avoid_memmap(), "boot_params" has no knowledge of CONFIG_CMDLINE. Only
> later in setup_arch(), the kernel will deal with parameters over there.

Yes, we didn't consider CONFIG_CMDLINE during the compressed boot stage. It
should be a generic issue, since other parameters from CONFIG_CMDLINE could
be ignored too, not only those for KASLR handling. Would you like to post a
patch to fix it? Or I can fix it later, maybe next week.

Thanks
Baoquan
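
For readers following along, here is a minimal userspace sketch of the idea
discussed above: combine the built-in command line with the bootloader's
before scanning for "memmap=". It is not the kernel's code. BUILTIN_CMDLINE,
merge_cmdline() and has_memmap_option() are hypothetical names, and placing
the built-in string first is an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the Kconfig string CONFIG_CMDLINE. */
#define BUILTIN_CMDLINE "memmap=4G!4G"

/*
 * Build "<builtin> <bootloader>" into out.
 * Returns 0 on success, -1 if the result does not fit in out_len bytes.
 */
static int merge_cmdline(const char *bootloader, char *out, size_t out_len)
{
	int n;

	assert(bootloader != NULL && out != NULL && out_len > 0);
	n = snprintf(out, out_len, "%s %s", BUILTIN_CMDLINE, bootloader);
	return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Return 1 if some space-separated option starts with "memmap=", else 0. */
static int has_memmap_option(const char *cmdline)
{
	const char *p = cmdline;

	while ((p = strstr(p, "memmap=")) != NULL) {
		/* Accept only a match at the start of an option. */
		if (p == cmdline || p[-1] == ' ')
			return 1;
		p++;
	}
	return 0;
}
```

Scanning only the bootloader string misses the built-in "memmap=", which is
the failure described above; scanning the merged buffer finds it.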


Re: devm_memremap_pages() triggers a kasan_add_zero_shadow() warning

2019-08-21 Thread Qian Cai
On Sat, 2019-08-17 at 23:25 -0400, Qian Cai wrote:
> > On Aug 17, 2019, at 12:59 PM, Dan Williams  wrote:
> > 
> > On Sat, Aug 17, 2019 at 4:13 AM Qian Cai  wrote:
> > > 
> > > 
> > > 
> > > > On Aug 16, 2019, at 11:57 PM, Dan Williams 
> > > > wrote:
> > > > 
> > > > On Fri, Aug 16, 2019 at 8:34 PM Qian Cai  wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > > On Aug 16, 2019, at 5:48 PM, Dan Williams 
> > > > > > wrote:
> > > > > > 
> > > > > > On Fri, Aug 16, 2019 at 2:36 PM Qian Cai  wrote:
> > > > > > > 
> > > > > > > Every so often recently, booting Intel CPU server on linux-next
> > > > > > > triggers this
> > > > > > > warning. Trying to figure out if  the commit 7cc7867fb061
> > > > > > > ("mm/devm_memremap_pages: enable sub-section remap") is the
> > > > > > > culprit here.
> > > > > > > 
> > > > > > > # ./scripts/faddr2line vmlinux devm_memremap_pages+0x894/0xc70
> > > > > > > devm_memremap_pages+0x894/0xc70:
> > > > > > > devm_memremap_pages at mm/memremap.c:307
> > > > > > 
> > > > > > Previously the forced section alignment in devm_memremap_pages()
> > > > > > would
> > > > > > cause the implementation to never violate the
> > > > > > KASAN_SHADOW_SCALE_SIZE
> > > > > > (12K on x86) constraint.
> > > > > > 
> > > > > > Can you provide a dump of /proc/iomem? I'm curious what resource is
> > > > > > triggering such a small alignment granularity.
> > > > > 
> > > > > This is with memmap=4G!4G ,
> > > > > 
> > > > > # cat /proc/iomem
> > > > 
> > > > [..]
> > > > > 1-155df : Persistent Memory (legacy)
> > > > > 1-155df : namespace0.0
> > > > > 155e0-15982bfff : System RAM
> > > > > 155e0-156a00fa0 : Kernel code
> > > > > 156a00fa1-15765d67f : Kernel data
> > > > > 157837000-1597f : Kernel bss
> > > > > 15982c000-1 : Persistent Memory (legacy)
> > > > > 2-87fff : System RAM
> > > > 
> > > > Ok, looks like 4G is bad choice to land the pmem emulation on this
> > > > system because it collides with where the kernel is deployed and gets
> > > > broken into tiny pieces that violate kasan's. This is a known problem
> > > > with memmap=. You need to pick an memory range that does not collide
> > > > with anything else. See:
> > > > 
> > > >   https://nvdimm.wiki.kernel.org/how_to_choose_the_correct_memmap_kernel
> > > > _parameter_for_pmem_on_your_system
> > > > 
> > > > ...for more info.
> > > 
> > > Well, it seems I did exactly follow the information in that link,
> > > 
> > > [0.00] BIOS-provided physical RAM map:
> > > [0.00] BIOS-e820: [mem 0x-0x00093fff]
> > > usable
> > > [0.00] BIOS-e820: [mem 0x00094000-0x0009]
> > > reserved
> > > [0.00] BIOS-e820: [mem 0x000e-0x000f]
> > > reserved
> > > [0.00] BIOS-e820: [mem 0x0010-0x5a7a0fff]
> > > usable
> > > [0.00] BIOS-e820: [mem 0x5a7a1000-0x5b5e0fff]
> > > reserved
> > > [0.00] BIOS-e820: [mem 0x5b5e1000-0x790fefff]
> > > usable
> > > [0.00] BIOS-e820: [mem 0x790ff000-0x791fefff]
> > > reserved
> > > [0.00] BIOS-e820: [mem 0x791ff000-0x7b5fefff] ACPI
> > > NVS
> > > [0.00] BIOS-e820: [mem 0x7b5ff000-0x7b7fefff] ACPI
> > > data
> > > [0.00] BIOS-e820: [mem 0x7b7ff000-0x7b7f]
> > > usable
> > > [0.00] BIOS-e820: [mem 0x7b80-0x8fff]
> > > reserved
> > > [0.00] BIOS-e820: [mem 0xff80-0x]
> > > reserved
> > > [0.00] BIOS-e820: [mem 0x0001-0x00087fff]
> > > usable
> > > 
> > > Where 4G is good. Then,
> > > 
> > > [0.00] user-defined physical RAM map:
> > > [0.00] user: [mem 0x-0x00093fff] usable
> > > [0.00] user: [mem 0x00094000-0x0009] reserved
> > > [0.00] user: [mem 0x000e-0x000f] reserved
> > > [0.00] user: [mem 0x0010-0x5a7a0fff] usable
> > > [0.00] user: [mem 0x5a7a1000-0x5b5e0fff] reserved
> > > [0.00] user: [mem 0x5b5e1000-0x790fefff] usable
> > > [0.00] user: [mem 0x790ff000-0x791fefff] reserved
> > > [0.00] user: [mem 0x791ff000-0x7b5fefff] ACPI NVS
> > > [0.00] user: [mem 0x7b5ff000-0x7b7fefff] ACPI data
> > > [0.00] user: [mem 0x7b7ff000-0x7b7f] usable
> > > [0.00] user: [mem 0x7b80-0x8fff] reserved
> > > [0.00] user: [mem 0xff80-0x] reserved
> > > [0.00] user: [mem 0x0001-0x0001]
> > > persistent (type 12)
> > > [0.00] user: [mem 0x0002-0x00087fff] usable
> > > 
> > > The doc did mention that “There seems to be an issue with 

Re: devm_memremap_pages() triggers a kasan_add_zero_shadow() warning

2019-08-17 Thread Qian Cai



> On Aug 17, 2019, at 12:59 PM, Dan Williams  wrote:
> 
> On Sat, Aug 17, 2019 at 4:13 AM Qian Cai  wrote:
>> 
>> 
>> 
>>> On Aug 16, 2019, at 11:57 PM, Dan Williams  wrote:
>>> 
>>> On Fri, Aug 16, 2019 at 8:34 PM Qian Cai  wrote:
 
 
 
>>>>> On Aug 16, 2019, at 5:48 PM, Dan Williams  wrote:
>>>>>
>>>>> On Fri, Aug 16, 2019 at 2:36 PM Qian Cai  wrote:
>>>>>>
>>>>>> Every so often recently, booting Intel CPU server on linux-next triggers
>>>>>> this warning. Trying to figure out if  the commit 7cc7867fb061
>>>>>> ("mm/devm_memremap_pages: enable sub-section remap") is the culprit here.
>>>>>>
>>>>>> # ./scripts/faddr2line vmlinux devm_memremap_pages+0x894/0xc70
>>>>>> devm_memremap_pages+0x894/0xc70:
>>>>>> devm_memremap_pages at mm/memremap.c:307
>>>>>
>>>>> Previously the forced section alignment in devm_memremap_pages() would
>>>>> cause the implementation to never violate the KASAN_SHADOW_SCALE_SIZE
>>>>> (12K on x86) constraint.
>>>>>
>>>>> Can you provide a dump of /proc/iomem? I'm curious what resource is
>>>>> triggering such a small alignment granularity.
>>>>
>>>> This is with memmap=4G!4G ,
>>>>
>>>> # cat /proc/iomem
>>> [..]
>>>> 1-155df : Persistent Memory (legacy)
>>>> 1-155df : namespace0.0
>>>> 155e0-15982bfff : System RAM
>>>> 155e0-156a00fa0 : Kernel code
>>>> 156a00fa1-15765d67f : Kernel data
>>>> 157837000-1597f : Kernel bss
>>>> 15982c000-1 : Persistent Memory (legacy)
>>>> 2-87fff : System RAM
>>> 
>>> Ok, looks like 4G is bad choice to land the pmem emulation on this
>>> system because it collides with where the kernel is deployed and gets
>>> broken into tiny pieces that violate kasan's. This is a known problem
>>> with memmap=. You need to pick an memory range that does not collide
>>> with anything else. See:
>>> 
>>>   
>>> https://nvdimm.wiki.kernel.org/how_to_choose_the_correct_memmap_kernel_parameter_for_pmem_on_your_system
>>> 
>>> ...for more info.
>> 
>> Well, it seems I did exactly follow the information in that link,
>> 
>> [0.00] BIOS-provided physical RAM map:
>> [0.00] BIOS-e820: [mem 0x-0x00093fff] usable
>> [0.00] BIOS-e820: [mem 0x00094000-0x0009] 
>> reserved
>> [0.00] BIOS-e820: [mem 0x000e-0x000f] 
>> reserved
>> [0.00] BIOS-e820: [mem 0x0010-0x5a7a0fff] usable
>> [0.00] BIOS-e820: [mem 0x5a7a1000-0x5b5e0fff] 
>> reserved
>> [0.00] BIOS-e820: [mem 0x5b5e1000-0x790fefff] usable
>> [0.00] BIOS-e820: [mem 0x790ff000-0x791fefff] 
>> reserved
>> [0.00] BIOS-e820: [mem 0x791ff000-0x7b5fefff] ACPI 
>> NVS
>> [0.00] BIOS-e820: [mem 0x7b5ff000-0x7b7fefff] ACPI 
>> data
>> [0.00] BIOS-e820: [mem 0x7b7ff000-0x7b7f] usable
>> [0.00] BIOS-e820: [mem 0x7b80-0x8fff] 
>> reserved
>> [0.00] BIOS-e820: [mem 0xff80-0x] 
>> reserved
>> [0.00] BIOS-e820: [mem 0x0001-0x00087fff] usable
>> 
>> Where 4G is good. Then,
>> 
>> [0.00] user-defined physical RAM map:
>> [0.00] user: [mem 0x-0x00093fff] usable
>> [0.00] user: [mem 0x00094000-0x0009] reserved
>> [0.00] user: [mem 0x000e-0x000f] reserved
>> [0.00] user: [mem 0x0010-0x5a7a0fff] usable
>> [0.00] user: [mem 0x5a7a1000-0x5b5e0fff] reserved
>> [0.00] user: [mem 0x5b5e1000-0x790fefff] usable
>> [0.00] user: [mem 0x790ff000-0x791fefff] reserved
>> [0.00] user: [mem 0x791ff000-0x7b5fefff] ACPI NVS
>> [0.00] user: [mem 0x7b5ff000-0x7b7fefff] ACPI data
>> [0.00] user: [mem 0x7b7ff000-0x7b7f] usable
>> [0.00] user: [mem 0x7b80-0x8fff] reserved
>> [0.00] user: [mem 0xff80-0x] reserved
>> [0.00] user: [mem 0x0001-0x0001] persistent 
>> (type 12)
>> [0.00] user: [mem 0x0002-0x00087fff] usable
>> 
>> The doc did mention that “There seems to be an issue with CONFIG_KSAN at the 
>> moment however.”
>> without more detail though.
> 
> Does disabling CONFIG_RANDOMIZE_BASE help? Maybe that workaround has
> regressed. Effectively we need to find what is causing the kernel to
> sometimes be placed in the middle of a custom reserved memmap= range.

Yes, disabling KASLR works well so far, assuming the workaround, i.e.,
f28442497b5c (“x86/boot: Fix KASLR and memmap= collision”), is correct.

The only other commit that might regress it from my research so far is,

d52e7d5a952c ("x86/KASLR: Parse all 

Re: devm_memremap_pages() triggers a kasan_add_zero_shadow() warning

2019-08-17 Thread Dan Williams
On Sat, Aug 17, 2019 at 4:13 AM Qian Cai  wrote:
>
>
>
> > On Aug 16, 2019, at 11:57 PM, Dan Williams  wrote:
> >
> > On Fri, Aug 16, 2019 at 8:34 PM Qian Cai  wrote:
> >>
> >>
> >>
> >>> On Aug 16, 2019, at 5:48 PM, Dan Williams  
> >>> wrote:
> >>>
> >>> On Fri, Aug 16, 2019 at 2:36 PM Qian Cai  wrote:
> >>>>
> >>>> Every so often recently, booting Intel CPU server on linux-next triggers
> >>>> this warning. Trying to figure out if  the commit 7cc7867fb061
> >>>> ("mm/devm_memremap_pages: enable sub-section remap") is the culprit here.
> >>>>
> >>>> # ./scripts/faddr2line vmlinux devm_memremap_pages+0x894/0xc70
> >>>> devm_memremap_pages+0x894/0xc70:
> >>>> devm_memremap_pages at mm/memremap.c:307
> >>>
> >>> Previously the forced section alignment in devm_memremap_pages() would
> >>> cause the implementation to never violate the KASAN_SHADOW_SCALE_SIZE
> >>> (12K on x86) constraint.
> >>>
> >>> Can you provide a dump of /proc/iomem? I'm curious what resource is
> >>> triggering such a small alignment granularity.
> >>
> >> This is with memmap=4G!4G ,
> >>
> >> # cat /proc/iomem
> > [..]
> >> 1-155df : Persistent Memory (legacy)
> >>  1-155df : namespace0.0
> >> 155e0-15982bfff : System RAM
> >>  155e0-156a00fa0 : Kernel code
> >>  156a00fa1-15765d67f : Kernel data
> >>  157837000-1597f : Kernel bss
> >> 15982c000-1 : Persistent Memory (legacy)
> >> 2-87fff : System RAM
> >
> > Ok, looks like 4G is bad choice to land the pmem emulation on this
> > system because it collides with where the kernel is deployed and gets
> > broken into tiny pieces that violate kasan's. This is a known problem
> > with memmap=. You need to pick an memory range that does not collide
> > with anything else. See:
> >
> >
> > https://nvdimm.wiki.kernel.org/how_to_choose_the_correct_memmap_kernel_parameter_for_pmem_on_your_system
> >
> > ...for more info.
>
> Well, it seems I did exactly follow the information in that link,
>
> [0.00] BIOS-provided physical RAM map:
> [0.00] BIOS-e820: [mem 0x-0x00093fff] usable
> [0.00] BIOS-e820: [mem 0x00094000-0x0009] reserved
> [0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
> [0.00] BIOS-e820: [mem 0x0010-0x5a7a0fff] usable
> [0.00] BIOS-e820: [mem 0x5a7a1000-0x5b5e0fff] reserved
> [0.00] BIOS-e820: [mem 0x5b5e1000-0x790fefff] usable
> [0.00] BIOS-e820: [mem 0x790ff000-0x791fefff] reserved
> [0.00] BIOS-e820: [mem 0x791ff000-0x7b5fefff] ACPI NVS
> [0.00] BIOS-e820: [mem 0x7b5ff000-0x7b7fefff] ACPI 
> data
> [0.00] BIOS-e820: [mem 0x7b7ff000-0x7b7f] usable
> [0.00] BIOS-e820: [mem 0x7b80-0x8fff] reserved
> [0.00] BIOS-e820: [mem 0xff80-0x] reserved
> [0.00] BIOS-e820: [mem 0x0001-0x00087fff] usable
>
> Where 4G is good. Then,
>
> [0.00] user-defined physical RAM map:
> [0.00] user: [mem 0x-0x00093fff] usable
> [0.00] user: [mem 0x00094000-0x0009] reserved
> [0.00] user: [mem 0x000e-0x000f] reserved
> [0.00] user: [mem 0x0010-0x5a7a0fff] usable
> [0.00] user: [mem 0x5a7a1000-0x5b5e0fff] reserved
> [0.00] user: [mem 0x5b5e1000-0x790fefff] usable
> [0.00] user: [mem 0x790ff000-0x791fefff] reserved
> [0.00] user: [mem 0x791ff000-0x7b5fefff] ACPI NVS
> [0.00] user: [mem 0x7b5ff000-0x7b7fefff] ACPI data
> [0.00] user: [mem 0x7b7ff000-0x7b7f] usable
> [0.00] user: [mem 0x7b80-0x8fff] reserved
> [0.00] user: [mem 0xff80-0x] reserved
> [0.00] user: [mem 0x0001-0x0001] persistent 
> (type 12)
> [0.00] user: [mem 0x0002-0x00087fff] usable
>
> The doc did mention that “There seems to be an issue with CONFIG_KSAN at the 
> moment however.”
> without more detail though.

Does disabling CONFIG_RANDOMIZE_BASE help? Maybe that workaround has
regressed. Effectively we need to find what is causing the kernel to
sometimes be placed in the middle of a custom reserved memmap= range.


Re: devm_memremap_pages() triggers a kasan_add_zero_shadow() warning

2019-08-17 Thread Qian Cai



> On Aug 16, 2019, at 11:57 PM, Dan Williams  wrote:
> 
> On Fri, Aug 16, 2019 at 8:34 PM Qian Cai  wrote:
>> 
>> 
>> 
>>> On Aug 16, 2019, at 5:48 PM, Dan Williams  wrote:
>>> 
>>> On Fri, Aug 16, 2019 at 2:36 PM Qian Cai  wrote:
 
>>>> Every so often recently, booting Intel CPU server on linux-next triggers
>>>> this warning. Trying to figure out if  the commit 7cc7867fb061
>>>> ("mm/devm_memremap_pages: enable sub-section remap") is the culprit here.
>>>>
>>>> # ./scripts/faddr2line vmlinux devm_memremap_pages+0x894/0xc70
>>>> devm_memremap_pages+0x894/0xc70:
>>>> devm_memremap_pages at mm/memremap.c:307
>>> 
>>> Previously the forced section alignment in devm_memremap_pages() would
>>> cause the implementation to never violate the KASAN_SHADOW_SCALE_SIZE
>>> (12K on x86) constraint.
>>> 
>>> Can you provide a dump of /proc/iomem? I'm curious what resource is
>>> triggering such a small alignment granularity.
>> 
>> This is with memmap=4G!4G ,
>> 
>> # cat /proc/iomem
> [..]
>> 1-155df : Persistent Memory (legacy)
>>  1-155df : namespace0.0
>> 155e0-15982bfff : System RAM
>>  155e0-156a00fa0 : Kernel code
>>  156a00fa1-15765d67f : Kernel data
>>  157837000-1597f : Kernel bss
>> 15982c000-1 : Persistent Memory (legacy)
>> 2-87fff : System RAM
> 
> Ok, looks like 4G is bad choice to land the pmem emulation on this
> system because it collides with where the kernel is deployed and gets
> broken into tiny pieces that violate kasan's. This is a known problem
> with memmap=. You need to pick an memory range that does not collide
> with anything else. See:
> 
>
> https://nvdimm.wiki.kernel.org/how_to_choose_the_correct_memmap_kernel_parameter_for_pmem_on_your_system
> 
> ...for more info.

Well, it seems I did exactly follow the information in that link,

[0.00] BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x00093fff] usable
[0.00] BIOS-e820: [mem 0x00094000-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x5a7a0fff] usable
[0.00] BIOS-e820: [mem 0x5a7a1000-0x5b5e0fff] reserved
[0.00] BIOS-e820: [mem 0x5b5e1000-0x790fefff] usable
[0.00] BIOS-e820: [mem 0x790ff000-0x791fefff] reserved
[0.00] BIOS-e820: [mem 0x791ff000-0x7b5fefff] ACPI NVS
[0.00] BIOS-e820: [mem 0x7b5ff000-0x7b7fefff] ACPI data
[0.00] BIOS-e820: [mem 0x7b7ff000-0x7b7f] usable
[0.00] BIOS-e820: [mem 0x7b80-0x8fff] reserved
[0.00] BIOS-e820: [mem 0xff80-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x00087fff] usable

Where 4G is good. Then,

[0.00] user-defined physical RAM map:
[0.00] user: [mem 0x-0x00093fff] usable
[0.00] user: [mem 0x00094000-0x0009] reserved
[0.00] user: [mem 0x000e-0x000f] reserved
[0.00] user: [mem 0x0010-0x5a7a0fff] usable
[0.00] user: [mem 0x5a7a1000-0x5b5e0fff] reserved
[0.00] user: [mem 0x5b5e1000-0x790fefff] usable
[0.00] user: [mem 0x790ff000-0x791fefff] reserved
[0.00] user: [mem 0x791ff000-0x7b5fefff] ACPI NVS
[0.00] user: [mem 0x7b5ff000-0x7b7fefff] ACPI data
[0.00] user: [mem 0x7b7ff000-0x7b7f] usable
[0.00] user: [mem 0x7b80-0x8fff] reserved
[0.00] user: [mem 0xff80-0x] reserved
[0.00] user: [mem 0x0001-0x0001] persistent (type 12)
[0.00] user: [mem 0x0002-0x00087fff] usable

The doc did mention that “There seems to be an issue with CONFIG_KSAN at the
moment however.” without more detail though.

Re: devm_memremap_pages() triggers a kasan_add_zero_shadow() warning

2019-08-16 Thread Dan Williams
On Fri, Aug 16, 2019 at 8:34 PM Qian Cai  wrote:
>
>
>
> > On Aug 16, 2019, at 5:48 PM, Dan Williams  wrote:
> >
> > On Fri, Aug 16, 2019 at 2:36 PM Qian Cai  wrote:
> >>
> >> Every so often recently, booting Intel CPU server on linux-next triggers 
> >> this
> >> warning. Trying to figure out if  the commit 7cc7867fb061
> >> ("mm/devm_memremap_pages: enable sub-section remap") is the culprit here.
> >>
> >> # ./scripts/faddr2line vmlinux devm_memremap_pages+0x894/0xc70
> >> devm_memremap_pages+0x894/0xc70:
> >> devm_memremap_pages at mm/memremap.c:307
> >
> > Previously the forced section alignment in devm_memremap_pages() would
> > cause the implementation to never violate the KASAN_SHADOW_SCALE_SIZE
> > (12K on x86) constraint.
> >
> > Can you provide a dump of /proc/iomem? I'm curious what resource is
> > triggering such a small alignment granularity.
>
> This is with memmap=4G!4G ,
>
> # cat /proc/iomem
[..]
> 1-155df : Persistent Memory (legacy)
>   1-155df : namespace0.0
> 155e0-15982bfff : System RAM
>   155e0-156a00fa0 : Kernel code
>   156a00fa1-15765d67f : Kernel data
>   157837000-1597f : Kernel bss
> 15982c000-1 : Persistent Memory (legacy)
> 2-87fff : System RAM

Ok, looks like 4G is a bad choice to land the pmem emulation on this
system because it collides with where the kernel is deployed and gets
broken into tiny pieces that violate kasan's alignment constraint. This is
a known problem with memmap=. You need to pick a memory range that does
not collide with anything else. See:


https://nvdimm.wiki.kernel.org/how_to_choose_the_correct_memmap_kernel_parameter_for_pmem_on_your_system

...for more info.
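
A small userspace sketch of the collision described above, assuming the
"memmap=<size>!<start>" form used in this thread. parse_size(),
parse_pmem_memmap() and ranges_overlap() are hypothetical helpers, not
kernel functions; parse_size() only loosely imitates the K/M/G suffix
handling of the kernel's memparse().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse "<number>[K|M|G]" and advance *s past what was consumed. */
static uint64_t parse_size(const char **s)
{
	char *end;
	uint64_t v = strtoull(*s, &end, 0);

	switch (*end) {
	case 'G': case 'g':
		v <<= 10; /* fall through */
	case 'M': case 'm':
		v <<= 10; /* fall through */
	case 'K': case 'k':
		v <<= 10;
		end++;
		break;
	default:
		break;
	}
	*s = end;
	return v;
}

/* Parse the value part of memmap=<size>!<start>. Returns 0 on success. */
static int parse_pmem_memmap(const char *val, uint64_t *start, uint64_t *size)
{
	assert(val != NULL && start != NULL && size != NULL);
	*size = parse_size(&val);
	if (*val != '!')
		return -1;
	val++;
	*start = parse_size(&val);
	return (*val == '\0' && *size != 0) ? 0 : -1;
}

/* Do the half-open ranges [a, a+alen) and [b, b+blen) intersect? */
static int ranges_overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
	return a < b + blen && b < a + alen;
}
```

With "4G!4G" this yields start 0x100000000 and size 0x100000000, and the
"Kernel bss" start shown in the dump (0x157837000) falls inside that range,
which is the overlap being pointed out.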


Re: devm_memremap_pages() triggers a kasan_add_zero_shadow() warning

2019-08-16 Thread Qian Cai



> On Aug 16, 2019, at 5:48 PM, Dan Williams  wrote:
> 
> On Fri, Aug 16, 2019 at 2:36 PM Qian Cai  wrote:
>> 
>> Every so often recently, booting Intel CPU server on linux-next triggers this
>> warning. Trying to figure out if  the commit 7cc7867fb061
>> ("mm/devm_memremap_pages: enable sub-section remap") is the culprit here.
>> 
>> # ./scripts/faddr2line vmlinux devm_memremap_pages+0x894/0xc70
>> devm_memremap_pages+0x894/0xc70:
>> devm_memremap_pages at mm/memremap.c:307
> 
> Previously the forced section alignment in devm_memremap_pages() would
> cause the implementation to never violate the KASAN_SHADOW_SCALE_SIZE
> (12K on x86) constraint.
> 
> Can you provide a dump of /proc/iomem? I'm curious what resource is
> triggering such a small alignment granularity.

This is with memmap=4G!4G ,

# cat /proc/iomem 
-0fff : Reserved
1000-00093fff : System RAM
00094000-0009 : Reserved
000a-000b : PCI Bus :00
000c-000c7fff : Video ROM
000c8000-000cbfff : Adapter ROM
000cc000-000ccfff : Adapter ROM
000e-000f : Reserved
  000f-000f : System ROM
0010-5a7a0fff : System RAM
5a7a1000-5b5e0fff : Reserved
5b5e1000-790fefff : System RAM
  6900-78ff : Crash kernel
790ff000-791fefff : Reserved
791ff000-7b5fefff : ACPI Non-volatile Storage
7b5ff000-7b7fefff : ACPI Tables
7b7ff000-7b7f : System RAM
7b80-8fff : Reserved
  8000-8fff : PCI MMCONFIG  [bus 00-ff]
9000-c7ffbfff : PCI Bus :00
  9000-92af : PCI Bus :01
9000-9000 : :01:00.2
9100-91ff : :01:00.1
9200-927f : :01:00.1
9280-928f : :01:00.2
9290-929f : :01:00.2
92a0-92a7 : :01:00.2
92a8-92a87fff : :01:00.2
92a88000-92a8bfff : :01:00.1
92a8c000-92a8c0ff : :01:00.2
92a8d000-92a8d1ff : :01:00.0
  92b0-92df : PCI Bus :02
92b0-92bf : :02:00.1
  92b0-92bf : igb
92c0-92cf : :02:00.0
  92c0-92cf : igb
92d0-92d03fff : :02:00.1
  92d0-92d03fff : igb
92d04000-92d07fff : :02:00.0
  92d04000-92d07fff : igb
92d8-92df : :02:00.0
  92e0-92ff : PCI Bus :03
92e0-92ef : :03:00.0
  92e0-92ef : hpsa
92f0-92f003ff : :03:00.0
  92f0-92f003ff : hpsa
92f8-92ff : :03:00.0
  9300-930003ff : :00:1d.0
  93001000-930013ff : :00:1a.0
  93003000-93003fff : :00:05.4
c7ffc000-c7ffcfff : dmar1
c800-fbffbfff : PCI Bus :80
  c800-c8000fff : :80:05.4
fbffc000-fbffcfff : dmar0
fec0-fecf : PNP0003:00
  fec0-fec003ff : IOAPIC 0
  fec01000-fec013ff : IOAPIC 1
  fec4-fec403ff : IOAPIC 2
fed0-fed003ff : HPET 0
  fed0-fed003ff : PNP0103:00
fed12000-fed1200f : pnp 00:01
fed12010-fed1201f : pnp 00:01
fed1b000-fed1bfff : pnp 00:01
fed1c000-fed3 : pnp 00:01
fed45000-fed8bfff : pnp 00:01
fee0-feef : pnp 00:01
  fee0-fee00fff : Local APIC
ff80- : Reserved
1-155df : Persistent Memory (legacy)
  1-155df : namespace0.0
155e0-15982bfff : System RAM
  155e0-156a00fa0 : Kernel code
  156a00fa1-15765d67f : Kernel data
  157837000-1597f : Kernel bss
15982c000-1 : Persistent Memory (legacy)
2-87fff : System RAM
  85800-877ff : Crash kernel
380-39f : PCI Bus :00
  39fffe0-39fffef : PCI Bus :02
  390-390 : :00:14.0
  391-3913fff : :00:04.7
  3914000-3917fff : :00:04.6
  3918000-391bfff : :00:04.5
  391c000-391 : :00:04.4
  392-3923fff : :00:04.3
  3924000-3927fff : :00:04.2
  3928000-392bfff : :00:04.1
  392c000-392 : :00:04.0
  3931000-39310ff : :00:1f.3
3a0-3bf : PCI Bus :80
  3b0-3b03fff : :80:04.7
  3b04000-3b07fff : :80:04.6
  3b08000-3b0bfff : :80:04.5
  3b0c000-3b0 : :80:04.4
  3b1-3b13fff : :80:04.3
  3b14000-3b17fff : :80:04.2
  3b18000-3b1bfff : :80:04.1
  3b1c000-3b1 : :80:04.0

> 
> Is it truly only linux-next or does latest mainline have this issue as well?

No idea. I have not had a chance to test it on the mainline yet.



Re: devm_memremap_pages() triggers a kasan_add_zero_shadow() warning

2019-08-16 Thread Dan Williams
On Fri, Aug 16, 2019 at 2:36 PM Qian Cai  wrote:
>
> Every so often recently, booting Intel CPU server on linux-next triggers this
> warning. Trying to figure out if  the commit 7cc7867fb061
> ("mm/devm_memremap_pages: enable sub-section remap") is the culprit here.
>
> # ./scripts/faddr2line vmlinux devm_memremap_pages+0x894/0xc70
> devm_memremap_pages+0x894/0xc70:
> devm_memremap_pages at mm/memremap.c:307

Previously the forced section alignment in devm_memremap_pages() would
cause the implementation to never violate the KASAN_SHADOW_SCALE_SIZE
(12K on x86) constraint.

Can you provide a dump of /proc/iomem? I'm curious what resource is
triggering such a small alignment granularity.

Is it truly only linux-next or does latest mainline have this issue as well?
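
To make the alignment constraint concrete, here is a hedged userspace
sketch. As far as I can tell, kasan_add_zero_shadow() warns and returns
-EINVAL when the start or size is not a multiple of
KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE. The two constants below are my
assumptions (8-byte shadow scale, 4 KiB pages) and may not match the figure
quoted above; shadow_aligned() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES   4096ULL /* assumption: 4 KiB pages */
#define SHADOW_SCALE_SIZE 8ULL    /* assumption: 1 shadow byte per 8 bytes */

/*
 * Return 1 if both start and size are multiples of
 * SHADOW_SCALE_SIZE * PAGE_SIZE_BYTES, else 0.
 */
static int shadow_aligned(uint64_t start, uint64_t size)
{
	const uint64_t align = SHADOW_SCALE_SIZE * PAGE_SIZE_BYTES;

	assert(align != 0);
	return (start % align == 0) && (size % align == 0);
}
```

Under those assumptions a range starting at 4G passes the check, while one
starting at 0x15982c000 (the start of the second "Persistent Memory
(legacy)" entry in the /proc/iomem dump in this thread) does not.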


devm_memremap_pages() triggers a kasan_add_zero_shadow() warning

2019-08-16 Thread Qian Cai
Every so often recently, booting an Intel CPU server on linux-next triggers
this warning. I am trying to figure out whether the commit 7cc7867fb061
("mm/devm_memremap_pages: enable sub-section remap") is the culprit here.

# ./scripts/faddr2line vmlinux devm_memremap_pages+0x894/0xc70
devm_memremap_pages+0x894/0xc70:
devm_memremap_pages at mm/memremap.c:307

[   32.074412][  T294] WARNING: CPU: 31 PID: 294 at mm/kasan/init.c:496 kasan_add_zero_shadow.cold.2+0xc/0x39
[   32.077448][  T294] Modules linked in:
[   32.078614][  T294] CPU: 31 PID: 294 Comm: kworker/u97:1 Not tainted 5.3.0-rc4-next-20190816+ #7
[   32.081299][  T294] Hardware name: HP ProLiant XL420 Gen9/ProLiant XL420 Gen9, BIOS U19 12/27/2015
[   32.084430][  T294] Workqueue: events_unbound async_run_entry_fn
[   32.086347][  T294] RIP: 0010:kasan_add_zero_shadow.cold.2+0xc/0x39
[   32.088303][  T294] Code: ff 48 c7 c7 b0 06 74 86 e8 0e e2 db ff 0f 0b e9 64 f7 ff ff 48 8b 45 98 48 89 45 b8 eb be 48 c7 c7 b0 06 74 86 e8 f1 e1 db ff <0f> 0b b8 ea ff ff ff e9 ad fe ff ff 48 c7 c7 b0 06 74 86 e8 d9 e1
[   32.094183][  T294] RSP: :8884428cf738 EFLAGS: 00010282
[   32.096030][  T294] RAX: 0024 RBX: 88833c1b8100 RCX: 85730ba8
[   32.098391][  T294] RDX:  RSI: dc00 RDI: 86964740
[   32.100802][  T294] RBP: 8884428cf750 R08: fbfff0d2c8e9 R09: fbfff0d2c8e9
[   32.103229][  T294] R10: fbfff0d2c8e8 R11: 86964743 R12: 111088519ef3
[   32.105581][  T294] R13: 88833dbc8010 R14: 00017a02c000 R15: 88833c1b8128
[   32.107956][  T294] FS:  () GS:88844db8() knlGS:
[   32.110585][  T294] CS:  0010 DS:  ES:  CR0: 80050033
[   32.112606][  T294] CR2:  CR3: 000163012001 CR4: 001606a0
[   32.112610][  T294] Call Trace:
[   32.112622][  T294]  devm_memremap_pages+0x894/0xc70
[   32.112635][  T294]  ? devm_memremap_pages_release+0x510/0x510
[   32.119291][  T294]  ? do_raw_read_unlock+0x2c/0x60
[   32.122470][  T332] namespace0.0 initialised, 400896 pages in 50ms
[   32.143086][  T294]  ? _raw_read_unlock+0x27/0x40
[   32.143094][  T294]  pmem_attach_disk+0x490/0x880
[   32.143106][  T294]  ? pmem_pagemap_kill+0x30/0x30
[   32.186834][T1] debug: unmapping init [mem 0x9d602000-0x9d7f]
[   32.195383][  T294]  ? kfree+0x106/0x400
[   32.195394][  T294]  ? kfree_const+0x17/0x30
[   32.314107][  T294]  ? kobject_put+0xfb/0x250
[   32.334569][  T294]  ? put_device+0x13/0x20
[   32.354169][  T294]  nd_pmem_probe+0x83/0xa0
[   32.374162][  T294]  nvdimm_bus_probe+0xaa/0x1f0
[   32.395901][  T294]  really_probe+0x1a2/0x630
[   32.416352][  T294]  driver_probe_device+0xcd/0x1f0
[   32.438901][  T294]  __device_attach_driver+0xed/0x150
[   32.463074][  T294]  ? driver_allows_async_probing+0x90/0x90
[   32.489538][  T294]  bus_for_each_drv+0xfa/0x160
[   32.511038][  T294]  ? bus_rescan_devices+0x20/0x20
[   32.731179][  T294]  ? do_raw_spin_unlock+0xa8/0x140
[   32.754475][  T294]  __device_attach+0x16d/0x220
[   32.775648][  T294]  ? device_bind_driver+0x80/0x80
[   32.798379][  T294]  ? __kasan_check_write+0x14/0x20
[   32.821550][  T294]  ? wait_for_completion_io+0x20/0x20
[   32.846143][  T294]  device_initial_probe+0x13/0x20
[   32.868959][  T294]  bus_probe_device+0x10f/0x130
[   32.891093][  T294]  device_add+0xadb/0xd00
[   32.910946][  T294]  ? root_device_unregister+0x40/0x40
[   32.935477][  T294]  ? nd_synchronize+0x20/0x20
[   32.956715][  T294]  nd_async_device_register+0x12/0x40
[   32.981106][  T294]  async_run_entry_fn+0x7f/0x2d0
[   33.003537][  T294]  process_one_work+0x53b/0xa70
[   33.026673][  T294]  ? pwq_dec_nr_in_flight+0x170/0x170
[   33.051060][  T294]  worker_thread+0x63/0x5b0
[   33.071431][  T294]  kthread+0x1df/0x200
[   33.089767][  T294]  ? process_one_work+0xa70/0xa70
[   33.112635][  T294]  ? kthread_park+0xc0/0xc0
[   33.132698][  T294]  ret_from_fork+0x35/0x40
[   33.155214][  T294] ---[ end trace 6917fee95b72ffee ]---
[   33.182365][T1] debug: unmapping init [mem 0x86e7b000-0x87031fff]
[   33.184491][  T332] pmem0: detected capacity change from 0 to 1642070016
[   33.251029][  T294] nd_pmem: probe of namespace1.0 failed with error -22