Re: [mm] c4344e8035: WARNING: CPU: 0 PID: 101 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99

2016-10-12 Thread Ye Xiaolong
On 10/12, Aneesh Kumar K.V wrote:
>kernel test robot  writes:
>
>> FYI, we noticed the following commit:
>>
>> https://github.com/0day-ci/linux 
>> Aneesh-Kumar-K-V/mm-Use-the-correct-page-size-when-removing-the-page/20161012-013446
>> commit c4344e80359420d7574b3b90fddf53311f1d24e6 ("mm: Remove the page size 
>> change check in tlb_remove_page")
>>
>> in testcase: boot
>>
>> on test machine: qemu-system-i386 -enable-kvm -cpu Haswell,+smep,+smap -m 
>> 360M
>>
>> caused below changes:
>>
>>
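>> +------------------------------------------------+------------+------------+
>> |                                                | eff764128d | c4344e8035 |
>> +------------------------------------------------+------------+------------+
>> | boot_successes                                 | 59         | 0          |
>> | boot_failures                                  | 0          | 43         |
>> | WARNING:at_mm/memory.c:#__tlb_remove_page_size | 0          | 43         |
>> | calltrace:SyS_execve                           | 0          | 43         |
>> | calltrace:run_init_process                     | 0          | 21         |
>> +------------------------------------------------+------------+------------+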
>>
>>
>>
>> [4.096204] Write protecting the kernel text: 3148k
>> [4.096911] Write protecting the kernel read-only data: 1444k
>> [4.120357] [ cut here ]
>> [4.121078] WARNING: CPU: 0 PID: 101 at mm/memory.c:303 
>> __tlb_remove_page_size+0x25/0x99
>> [4.122380] Modules linked in:
>> [4.122788] CPU: 0 PID: 101 Comm: run-parts Not tainted 
>> 4.8.0-mm1-00315-gc4344e8 #5
>> [4.123956]  bd145dc4 b111e5e6 bd145de0 b10320dc 012f b10974d1 
>> bd145e70 c4954170
>> [4.125277]  c4954170 bd145df4 b103215f 0009   
>> bd145e04 b10974d1
>> [4.126424]  c4954170 bd145e70 bd145e14 b10263ca bd145e70 bd47bafc 
>> bd145e40 b109767a
>> [4.127622] Call Trace:
>
>Thanks for the report. The below change should fix this.
>
>commit 18c929e7cf672da617dc218c6265366bf78b1644
>Author: Aneesh Kumar K.V 
>Date:   Wed Oct 12 08:40:41 2016 +0530
>
>update mmu gather page size before flushing page table cache
>
>diff --git a/mm/memory.c b/mm/memory.c
>index 26d1ba8c87e6..7e7eccb82a2b 100644
>--- a/mm/memory.c
>+++ b/mm/memory.c
>@@ -526,7 +526,11 @@ void free_pgd_range(struct mmu_gather *tlb,
> 		end -= PMD_SIZE;
> 	if (addr > end - 1)
> 		return;
>-
>+	/*
>+	 * We add page table cache pages with PAGE_SIZE,
>+	 * (see pte_free_tlb()), flush the tlb if we need
>+	 */
>+	tlb_remove_check_page_size_change(tlb, PAGE_SIZE);
> 	pgd = pgd_offset(tlb->mm, addr);
> 	do {
> 		next = pgd_addr_end(addr, end);
>

I just applied this fix on top of commit c4344e8035 and confirmed that
the reported warning is gone.

Tested-by: Xiaolong Ye 

=========================================================================================
compiler/kconfig/rootfs/sleep/tbox_group/testcase:
  gcc-6/i386-randconfig-s1-201641/quantal-core-i386.cgz/1/vm-vp-quantal-i386/boot

commit:
  c4344e80359420d7574b3b90fddf53311f1d24e6
  384db818365c90b91d8bad80be188765e801cf58 ("update mmu gather page size before flushing page table cache")

c4344e80359420d7 384db818365c90b91d8bad80be
---------------- --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
         24:24        -100%            :5     dmesg.WARNING:at_mm/memory.c:#__tlb_remove_page_size

Thanks,
Xiaolong


[mm] c4344e8035: WARNING: CPU: 0 PID: 101 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99

2016-10-12 Thread kernel test robot
FYI, we noticed the following commit:

https://github.com/0day-ci/linux 
Aneesh-Kumar-K-V/mm-Use-the-correct-page-size-when-removing-the-page/20161012-013446
commit c4344e80359420d7574b3b90fddf53311f1d24e6 ("mm: Remove the page size 
change check in tlb_remove_page")

in testcase: boot

on test machine: qemu-system-i386 -enable-kvm -cpu Haswell,+smep,+smap -m 360M

caused below changes:


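+------------------------------------------------+------------+------------+
|                                                | eff764128d | c4344e8035 |
+------------------------------------------------+------------+------------+
| boot_successes                                 | 59         | 0          |
| boot_failures                                  | 0          | 43         |
| WARNING:at_mm/memory.c:#__tlb_remove_page_size | 0          | 43         |
| calltrace:SyS_execve                           | 0          | 43         |
| calltrace:run_init_process                     | 0          | 21         |
+------------------------------------------------+------------+------------+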



[4.096204] Write protecting the kernel text: 3148k
[4.096911] Write protecting the kernel read-only data: 1444k
[4.120357] ------------[ cut here ]------------
[4.121078] WARNING: CPU: 0 PID: 101 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99
[4.122380] Modules linked in:
[4.122788] CPU: 0 PID: 101 Comm: run-parts Not tainted 4.8.0-mm1-00315-gc4344e8 #5
[4.123956]  bd145dc4 b111e5e6 bd145de0 b10320dc 012f b10974d1 bd145e70 c4954170
[4.125277]  c4954170 bd145df4 b103215f 0009   bd145e04 b10974d1
[4.126424]  c4954170 bd145e70 bd145e14 b10263ca bd145e70 bd47bafc bd145e40 b109767a
[4.127622] Call Trace:
[4.128255] ------------[ cut here ]------------
[4.128261] WARNING: CPU: 0 PID: 103 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99
[4.128261] Modules linked in:
[4.128264] CPU: 0 PID: 103 Comm: sh Not tainted 4.8.0-mm1-00315-gc4344e8 #5
[4.128268]  bd143dc4 b111e5e6 bd143de0 b10320dc 012f b10974d1 bd143e70 c494cd00
[4.128271]  c494cd00 bd143df4 b103215f 0009   bd143e04 b10974d1
[4.128274]  c494cd00 bd143e70 bd143e14 b10263ca bd143e70 bd47dafc bd143e40 b109767a
[4.128275] Call Trace:
[4.128281]  [] dump_stack+0x16/0x18
[4.128284]  [] __warn+0xa5/0xbc
[4.128286]  [] ? __tlb_remove_page_size+0x25/0x99
[4.128288]  [] warn_slowpath_null+0x11/0x16
[4.128290]  [] __tlb_remove_page_size+0x25/0x99
[4.128293]  [] ___pte_free_tlb+0x57/0x66
[4.128295]  [] free_pgd_range+0x135/0x1d0
[4.128298]  [] setup_arg_pages+0x219/0x29a
[4.128302]  [] load_elf_binary+0x2ad/0x94a
[4.128305]  [] ? _copy_from_user+0x49/0x5c
[4.128307]  [] search_binary_handler+0x106/0x159
[4.128309]  [] do_execveat_common+0x3bf/0x4dc
[4.128311]  [] do_execve+0x14/0x16
[4.128313]  [] SyS_execve+0x16/0x18
[4.128316]  [] do_fast_syscall_32+0x8f/0xce
[4.128320]  [] sysenter_past_esp+0x47/0x75
[4.128322] ---[ end trace 816334aebb0eaffe ]---
[4.132981] ------------[ cut here ]------------





Thanks,
Kernel Test Robot
#
# Automatically generated file; DO NOT EDIT.
# Linux/i386 4.8.0-mm1 Kernel Configuration
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf32-i386"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_BITS_MAX=16
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEBUG_RODATA=y
CONFIG_PGTABLE_LEVELS=2
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
# CONFIG_KERNEL_GZIP is not set
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
CONFIG_KERNEL_LZO=y
# 

Re: [mm] c4344e8035: WARNING: CPU: 0 PID: 101 at mm/memory.c:303 __tlb_remove_page_size+0x25/0x99

2016-10-11 Thread Aneesh Kumar K.V
kernel test robot  writes:

> FYI, we noticed the following commit:
>
> https://github.com/0day-ci/linux 
> Aneesh-Kumar-K-V/mm-Use-the-correct-page-size-when-removing-the-page/20161012-013446
> commit c4344e80359420d7574b3b90fddf53311f1d24e6 ("mm: Remove the page size 
> change check in tlb_remove_page")
>
> in testcase: boot
>
> on test machine: qemu-system-i386 -enable-kvm -cpu Haswell,+smep,+smap -m 360M
>
> caused below changes:
>
>
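> +------------------------------------------------+------------+------------+
> |                                                | eff764128d | c4344e8035 |
> +------------------------------------------------+------------+------------+
> | boot_successes                                 | 59         | 0          |
> | boot_failures                                  | 0          | 43         |
> | WARNING:at_mm/memory.c:#__tlb_remove_page_size | 0          | 43         |
> | calltrace:SyS_execve                           | 0          | 43         |
> | calltrace:run_init_process                     | 0          | 21         |
> +------------------------------------------------+------------+------------+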
>
>
>
> [4.096204] Write protecting the kernel text: 3148k
> [4.096911] Write protecting the kernel read-only data: 1444k
> [4.120357] [ cut here ]
> [4.121078] WARNING: CPU: 0 PID: 101 at mm/memory.c:303 
> __tlb_remove_page_size+0x25/0x99
> [4.122380] Modules linked in:
> [4.122788] CPU: 0 PID: 101 Comm: run-parts Not tainted 
> 4.8.0-mm1-00315-gc4344e8 #5
> [4.123956]  bd145dc4 b111e5e6 bd145de0 b10320dc 012f b10974d1 
> bd145e70 c4954170
> [4.125277]  c4954170 bd145df4 b103215f 0009   
> bd145e04 b10974d1
> [4.126424]  c4954170 bd145e70 bd145e14 b10263ca bd145e70 bd47bafc 
> bd145e40 b109767a
> [4.127622] Call Trace:

Thanks for the report. The below change should fix this.

commit 18c929e7cf672da617dc218c6265366bf78b1644
Author: Aneesh Kumar K.V 
Date:   Wed Oct 12 08:40:41 2016 +0530

update mmu gather page size before flushing page table cache

diff --git a/mm/memory.c b/mm/memory.c
index 26d1ba8c87e6..7e7eccb82a2b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -526,7 +526,11 @@ void free_pgd_range(struct mmu_gather *tlb,
 		end -= PMD_SIZE;
 	if (addr > end - 1)
 		return;
-
+	/*
+	 * We add page table cache pages with PAGE_SIZE,
+	 * (see pte_free_tlb()), flush the tlb if we need
+	 */
+	tlb_remove_check_page_size_change(tlb, PAGE_SIZE);
 	pgd = pgd_offset(tlb->mm, addr);
 	do {
 		next = pgd_addr_end(addr, end);
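
For readers following the thread, here is a rough user-space sketch of why the warning fires and why the one-line change silences it. It is not kernel code. It assumes that the check at mm/memory.c:303 compares the page size recorded in the gather structure with the size passed to __tlb_remove_page_size(), and that free_pgd_range() reached ___pte_free_tlb() (as the call trace shows) before anything had recorded PAGE_SIZE in the gather. All names starting with model_ are hypothetical stand-ins invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for struct mmu_gather; only what the sketch needs. */
struct model_gather {
	unsigned int page_size;	/* size of the pages currently being batched */
	unsigned int queued;	/* pages queued since the last flush */
	unsigned int flushes;	/* number of simulated flushes */
	unsigned int warnings;	/* number of size mismatches seen */
};

/*
 * Models the role of tlb_remove_check_page_size_change(): if the size is
 * about to change, flush what is pending, then record the new size.
 */
static void model_check_page_size_change(struct model_gather *g,
					 unsigned int page_size)
{
	if (g->page_size != page_size) {
		if (g->queued) {
			g->flushes++;
			g->queued = 0;
		}
		g->page_size = page_size;
	}
}

/*
 * Models the assumed sanity check in __tlb_remove_page_size(): count a
 * warning when the caller's size differs from the recorded one.
 */
static bool model_remove_page_size(struct model_gather *g,
				   unsigned int page_size)
{
	bool ok = (g->page_size == page_size);

	if (!ok)
		g->warnings++;
	g->queued++;
	return ok;
}

/*
 * Models free_pgd_range() freeing one page table page, with or without
 * the added call. Returns the number of warnings triggered.
 */
static unsigned int model_free_range(bool with_fix)
{
	struct model_gather g = {0};	/* page_size starts unset */

	if (with_fix)
		model_check_page_size_change(&g, MODEL_PAGE_SIZE);
	model_remove_page_size(&g, MODEL_PAGE_SIZE);
	assert(g.queued == 1);
	return g.warnings;
}
```

In this model, model_free_range(false) reports one mismatch because the recorded size is still unset when the page table page is queued, while model_free_range(true) reports none. The real helper may behave differently per architecture, so treat this only as an aid to reading the diff.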