Re: [linux-next][Oops] memory hot-unplug results fault instruction address at /include/linux/list.h:104

2017-10-03 Thread Abdul Haleem
On Wed, 2017-09-20 at 12:54 -0700, Kees Cook wrote:
> On Wed, Sep 20, 2017 at 12:40 AM, Abdul Haleem wrote:
> > On Tue, 2017-09-12 at 12:11 +0530, abdul wrote:
> >> Hi,
> >>
> >> Memory hot-unplug on PowerVM LPAR running next-20170911 results in
> >> Faulting instruction address: 0xc02b56c4
> >>
> >> which maps to the below code path:
> >>
> >> 0xc02b56c4 is in __rmqueue (./include/linux/list.h:104).
> >> 99 * This is only for internal list manipulation where we know
> >> 100* the prev/next entries already!
> >> 101*/
> >> 102   static inline void __list_del(struct list_head * prev, struct
> >> list_head * next)
> >> 103   {
> >> 104   next->prev = prev;
> >> 105   WRITE_ONCE(prev->next, next);
> >> 106   }
> >> 107
> >> 108   /**
> >>
> >
> > I see another kernel Oops when running transparent hugepages
> > de-fragmentation test.
> >
> > And the faulty instruction address again pointing to same code line
> > 0xc026f9f4 is in compaction_alloc (./include/linux/list.h:104)
> >
> > steps to recreate:
> > -
> > 1. Enable transparent hugepages ("always")
> > 2. Turn off the defrag $ echo 0 > khugepaged/defrag
> > 3. Write random to memory path
> > 4. Set huge pages numbers
> > 5. Turn on defrag $ echo 1 > khugepaged/defrag
> >
> >
> > new trace:
> > --
> > Unable to handle kernel paging request for data at address
> > 0x5deadbeef108
> 
> This looks like use-after-list-removal, that value appears to be LIST_POISON1.
> 
> Try enabling CONFIG_DEBUG_LIST to see if you get better details?

Trace messages after enabling CONFIG_DEBUG_LIST

BUG: Bad page state in process in:imklog  pfn:6cbb3
page:f1b2ecc0 count:2 mapcount:0 mapping:c00769aafd20 index:0x1
flags: 0x3381068(uptodate|lru|active|private)
raw: 03381068 c00769aafd20 0001 0002
raw: 5deadbeef100 5deadbeef200  c000feca3400
page dumped because: page still charged to cgroup
page->mem_cgroup:c000feca3400
bad because of flags: 0x1068(uptodate|lru|active|private)
kernel BUG at mm/vmscan.c:1556!
[c5da79f0] [c02bfe74] __alloc_pages_nodemask+0x754/0x1160
Oops: Exception in kernel mode, sig: 5 [#1]
LE SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: xt_addrtype xt_conntrack ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
iptable_filter 
[c5da7bf0] [c034c238] alloc_pages_vma+0xb8/0x290
[c5da7c60] [c03102b0] __handle_mm_fault+0x1150/0x1ad0
[c5da7d40] [c0310d58] handle_mm_fault+0x128/0x210
[c5da7d80] [c0067878] __do_page_fault+0x218/0x8e0
[c5da7e30] [c000a4a4] handle_page_fault+0x18/0x38
Instruction dump:
38210060 e8010010 7c0803a6 4e800020 6042 3c62ff93 7ca62b78 7d244b78 
7d455378 3863edc8 4bafe4d1 6000 <0fe0> 3860 4b60 6000 
---[ end trace 1e619608a776e913 ]---
list_add corruption. next->prev should be prev (c0077ff54710), but was 
5deadbeef200. (next=f1b2ece0).
[ cut here ]
WARNING: CPU: 5 PID: 308 at lib/list_debug.c:25 __list_add_valid+0xa4/0xf0
Modules linked in: xt_addrtype xt_conntrack ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge stp llc 
dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c vmx_crypto 
pseries_rng
 ip_tables x_tables nf_nat nf_conntrack bridge stp llc dm_thin_pool 
dm_persistent_data dm_bio_prison dm_bufio libcrc32c vmx_crypto pseries_rng 
rtc_generic autofs4
CPU: 2 PID: 1 Comm: systemd Tainted: GB   W   
4.14.0-rc2-next-20170929-autotest #2
task: c00777e0 task.stack: c00777e8
NIP:  c02d5900 LR: c02d586c CTR: 
REGS: c00777e82c20 TRAP: 0700   Tainted: GB   W
(4.14.0-rc2-next-20170929-autotest)
MSR:  80029033   CR: 22248428  XER: 200a  
CFAR: c02d587c SOFTE: 0 
GPR00: c02d586c c00777e82ea0 c15ac700 ffea 
GPR04:  c00777e830a0 00014f28 0001 
GPR08:  033800010008  3563376431303030 
GPR12: 8800 
 rtc_generic
ce741500 f1d7c4a0 0001 
GPR16: c00777e833ac c00777e830b0 0002 c00777e830a0 
GPR20:  c00777e833c4 c00777e82f10 0006 
GPR24: c00777e82f50 0020 0007 c00774193800 
GPR28: 0006 000c c00774193820 
 autofs4
f1d7c560 
NIP [c02d5900] isolate_lru_pages.isra.21+0x360/0x580
LR [c02d586c] isolate_lru_pages.isra.21+0x2cc/0x580
Call Trace:
[c00777e82ea0] [c02d586c] isolate_lru_pages.isra.21+0x2cc/0x580 
(unreliable)

Re: [linux-next][Oops] memory hot-unplug results fault instruction address at /include/linux/list.h:104

2017-09-29 Thread Abdul Haleem
On Wed, 2017-09-20 at 12:54 -0700, Kees Cook wrote:
> On Wed, Sep 20, 2017 at 12:40 AM, Abdul Haleem wrote:
> > On Tue, 2017-09-12 at 12:11 +0530, abdul wrote:
> >> Hi,
> >>
> >> Memory hot-unplug on PowerVM LPAR running next-20170911 results in
> >> Faulting instruction address: 0xc02b56c4
> >>
> >> which maps to the below code path:
> >>
> >> 0xc02b56c4 is in __rmqueue (./include/linux/list.h:104).
> >> 99 * This is only for internal list manipulation where we know
> >> 100* the prev/next entries already!
> >> 101*/
> >> 102   static inline void __list_del(struct list_head * prev, struct
> >> list_head * next)
> >> 103   {
> >> 104   next->prev = prev;
> >> 105   WRITE_ONCE(prev->next, next);
> >> 106   }
> >> 107
> >> 108   /**
> >>
> >
> > I see another kernel Oops when running transparent hugepages
> > de-fragmentation test.
> >
> > And the faulty instruction address again pointing to same code line
> > 0xc026f9f4 is in compaction_alloc (./include/linux/list.h:104)
> >
> > steps to recreate:
> > -
> > 1. Enable transparent hugepages ("always")
> > 2. Turn off the defrag $ echo 0 > khugepaged/defrag
> > 3. Write random to memory path
> > 4. Set huge pages numbers
> > 5. Turn on defrag $ echo 1 > khugepaged/defrag
> >
> >
> > new trace:
> > --
> > Unable to handle kernel paging request for data at address
> > 0x5deadbeef108
> 
> This looks like use-after-list-removal, that value appears to be LIST_POISON1.
> 
> Try enabling CONFIG_DEBUG_LIST to see if you get better details?

With the above config enabled I see the messages below, along with call
traces, but no kernel Oops.

BUG: Bad page state in process drmgr  pfn:770c7
page:f1dc31c0 count:0 mapcount:0 mapping:f1dc31c8
index:0x1
flags: 0x338()
raw: 0338 f1dc31c8 0001 
raw: 5deadbeef100 5deadbeef200  
page dumped because: non-NULL mapping
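
One possible reading of why the Oops no longer shows up: with CONFIG_DEBUG_LIST, list_del() validates the entry first and, when it finds a poison value, warns and skips the unlink instead of dereferencing the poisoned pointer. Below is a minimal userspace model of that behaviour, loosely following __list_del_entry_valid() in lib/list_debug.c. The poison constants are assumptions (ppc64 poison delta plus 0x100 and 0x200), and debug_list_del() and list_del_entry_valid() are illustrative names, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Assumed poison values; a 64-bit build is assumed as well. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0x5deadbeef0000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0x5deadbeef0000200ULL)

static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert item right after head. */
static void list_add(struct list_head *item, struct list_head *head)
{
    struct list_head *next = head->next;

    next->prev = item;
    item->next = next;
    item->prev = head;
    head->next = item;
}

/* Return false (after logging) if unlinking entry would be unsafe. */
static bool list_del_entry_valid(const struct list_head *entry)
{
    /* Poison checks come first so a poisoned pointer is never followed. */
    if (entry->next == LIST_POISON1 || entry->prev == LIST_POISON2) {
        fprintf(stderr, "list_del corruption: %p looks already deleted\n",
                (const void *)entry);
        return false;
    }
    if (entry->prev->next != entry || entry->next->prev != entry) {
        fprintf(stderr, "list_del corruption: neighbours of %p do not point back\n",
                (const void *)entry);
        return false;
    }
    return true;
}

/* Debug-style deletion: unlink and poison only if the entry validates. */
static bool debug_list_del(struct list_head *entry)
{
    assert(entry != NULL);
    if (!list_del_entry_valid(entry))
        return false; /* skip the unlink, so no store through a poison pointer */

    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
    return true;
}
```

In this model, deleting the same entry twice makes the second call log a message and return false rather than write through LIST_POISON1. Whether that is what happens on the reported system is not confirmed by the logs above.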



-- 
Regards

Abdul Haleem
IBM Linux Technology Centre


BUG: Bad page state in process drmgr  pfn:770c7
page:f1dc31c0 count:0 mapcount:0 mapping:f1dc31c8 index:0x1
flags: 0x338()
raw: 0338 f1dc31c8 0001 
raw: 5deadbeef100 5deadbeef200  
page dumped because: non-NULL mapping
BUG: Bad page state in process drmgr  pfn:770c7
page:f1dc31c0 count:0 mapcount:0 mapping:f1dc31c8 index:0x1
flags: 0x338()
raw: 0338 f1dc31c8 0001 
raw: 5deadbeef100 5deadbeef200  
page dumped because: non-NULL mapping
BUG: Bad page state in process drmgr  pfn:770c7
page:f1dc31c0 count:0 mapcount:0 mapping:f1dc31c8 index:0x1
flags: 0x338()
raw: 0338 f1dc31c8 0001 
raw: 5deadbeef100 5deadbeef200  
page dumped because: non-NULL mapping
BUG: Bad page state in process systemd-journal  pfn:770c7
page:f1dc31c0 count:0 mapcount:0 mapping:f1dc31c8 index:0x1
flags: 0x338()
raw: 0338 f1dc31c8 0001 
raw: 5deadbeef100 5deadbeef200  
page dumped because: non-NULL mapping
BUG: Bad page state in process systemd-journal  pfn:770c7
page:f1dc31c0 count:0 mapcount:0 mapping:f1dc31c8 index:0x1
flags: 0x338()
raw: 0338 f1dc31c8 0001 
raw: 5deadbeef100 5deadbeef200  
page dumped because: non-NULL mapping
BUG: Bad page state in process in:imklog  pfn:770c7
page:f1dc31c0 count:0 mapcount:0 mapping:f1dc31c8 index:0x1
flags: 0x338()
raw: 0338 f1dc31c8 0001 
raw: 5deadbeef100 5deadbeef200  
page dumped because: non-NULL mapping
BUG: Bad page state in process systemd-journal  pfn:770c7
page:f1dc31c0 count:0 mapcount:0 mapping:f1dc31c8 index:0x1
flags: 0x338()
raw: 0338 f1dc31c8 0001 
raw: 5deadbeef100 5deadbeef200  
page dumped because: non-NULL mapping
BUG: Bad page state in process in:imklog  pfn:770c7
page:f1dc31c0 count:0 mapcount:0 mapping:f1dc31c8 index:0x1
flags: 0x338()
raw: 0338 f1dc31c8 0001 
raw: 5deadbeef100 5deadbeef200  
page dumped because: non-NULL mapping
BUG: Bad page state in process in:imklog  pfn:770c7
page:f1dc31c0 count:0 mapcount:0 mapping:f1dc31c8 index:0x1
flags: 0x338()
raw: 0338 

Re: [linux-next][Oops] memory hot-unplug results fault instruction address at /include/linux/list.h:104

2017-09-20 Thread Kees Cook
On Wed, Sep 20, 2017 at 12:40 AM, Abdul Haleem wrote:
> On Tue, 2017-09-12 at 12:11 +0530, abdul wrote:
>> Hi,
>>
>> Memory hot-unplug on PowerVM LPAR running next-20170911 results in
>> Faulting instruction address: 0xc02b56c4
>>
>> which maps to the below code path:
>>
>> 0xc02b56c4 is in __rmqueue (./include/linux/list.h:104).
>> 99 * This is only for internal list manipulation where we know
>> 100* the prev/next entries already!
>> 101*/
>> 102   static inline void __list_del(struct list_head * prev, struct
>> list_head * next)
>> 103   {
>> 104   next->prev = prev;
>> 105   WRITE_ONCE(prev->next, next);
>> 106   }
>> 107
>> 108   /**
>>
>
> I see another kernel Oops when running transparent hugepages
> de-fragmentation test.
>
> And the faulty instruction address again pointing to same code line
> 0xc026f9f4 is in compaction_alloc (./include/linux/list.h:104)
>
> steps to recreate:
> -
> 1. Enable transparent hugepages ("always")
> 2. Turn off the defrag $ echo 0 > khugepaged/defrag
> 3. Write random to memory path
> 4. Set huge pages numbers
> 5. Turn on defrag $ echo 1 > khugepaged/defrag
>
>
> new trace:
> --
> Unable to handle kernel paging request for data at address
> 0x5deadbeef108

This looks like use-after-list-removal; that value appears to be LIST_POISON1.

Try enabling CONFIG_DEBUG_LIST to see if you get better details?
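
As a rough illustration of why a poisoned entry would fault near this address: list_del() leaves LIST_POISON1 in entry->next and LIST_POISON2 in entry->prev, so a later __list_del() on the same entry runs "next->prev = prev" with next equal to LIST_POISON1, and the store lands at LIST_POISON1 plus the offset of the prev member. The sketch below only does that arithmetic in userspace. Its constants are assumptions, not values taken from the report: the 0x100 and 0x200 offsets as in include/linux/poison.h of this era, and a ppc64 poison delta of 0x5deadbeef0000000. The addresses quoted in this thread also appear to have lost their runs of zeros in the archive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Assumed values, not taken from the report itself. */
#define POISON_POINTER_DELTA UINT64_C(0x5deadbeef0000000)
#define LIST_POISON1 (UINT64_C(0x100) + POISON_POINTER_DELTA)
#define LIST_POISON2 (UINT64_C(0x200) + POISON_POINTER_DELTA)

/*
 * Address written by "next->prev = prev" when next holds next_value.
 * Pure integer arithmetic: nothing is dereferenced.
 */
static uint64_t list_del_store_address(uint64_t next_value)
{
    return next_value + offsetof(struct list_head, prev);
}

/* Print the address a second deletion of a poisoned entry would touch. */
static void print_poison_fault_address(void)
{
    uint64_t addr = list_del_store_address(LIST_POISON1);

    /* prev is the second member, so the store lies past the poison value. */
    assert(addr > LIST_POISON1);
    printf("next = 0x%llx, store to 0x%llx\n",
           (unsigned long long)LIST_POISON1, (unsigned long long)addr);
}
```

On a 64-bit build, where prev sits 8 bytes into the struct, these assumed constants give a store address of 0x5deadbeef0000108, which has the same shape as the reported data address 0x5deadbeef108.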

-Kees

-- 
Kees Cook
Pixel Security


Re: [linux-next][Oops] memory hot-unplug results fault instruction address at /include/linux/list.h:104

2017-09-20 Thread Abdul Haleem
On Tue, 2017-09-12 at 12:11 +0530, abdul wrote:
> Hi,
> 
> Memory hot-unplug on PowerVM LPAR running next-20170911 results in
> Faulting instruction address: 0xc02b56c4
> 
> which maps to the below code path:
> 
> 0xc02b56c4 is in __rmqueue (./include/linux/list.h:104).
> 99 * This is only for internal list manipulation where we know
> 100* the prev/next entries already!
> 101*/
> 102   static inline void __list_del(struct list_head * prev, struct
> list_head * next)
> 103   {
> 104   next->prev = prev;
> 105   WRITE_ONCE(prev->next, next);
> 106   }
> 107   
> 108   /**
> 

I see another kernel Oops when running the transparent hugepages
de-fragmentation test.

The faulting instruction address again points to the same line of code:
0xc026f9f4 is in compaction_alloc (./include/linux/list.h:104)

Steps to recreate:
------------------
1. Enable transparent hugepages ("always")
2. Turn off defrag: $ echo 0 > khugepaged/defrag
3. Write random data to the memory path
4. Set the number of huge pages
5. Turn on defrag: $ echo 1 > khugepaged/defrag
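
The toggling in steps 1, 2 and 5 could be sketched as a small C helper. This is an illustration under assumptions, not the test that was actually run. The report gives only the relative path khugepaged/defrag, so the sketch assumes it lives under /sys/kernel/mm/transparent_hugepage. write_knob(), thp_enable_always() and thp_set_defrag() are hypothetical names. Steps 3 and 4 are left out because the report does not say how they were done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed sysfs base; the report only shows "khugepaged/defrag". */
#define THP_BASE "/sys/kernel/mm/transparent_hugepage"

/* Write a short string to a sysfs-style file. Returns 0 on success, -1 on error. */
static int write_knob(const char *path, const char *value)
{
    FILE *f;
    int ok;

    assert(path != NULL && value != NULL);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fputs(value, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Step 1: select the "always" THP mode under the given base directory. */
static int thp_enable_always(const char *base)
{
    char path[512];
    int n = snprintf(path, sizeof(path), "%s/enabled", base);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    return write_knob(path, "always");
}

/* Steps 2 and 5: turn khugepaged defrag off (on == 0) or on (on != 0). */
static int thp_set_defrag(const char *base, int on)
{
    char path[512];
    int n = snprintf(path, sizeof(path), "%s/khugepaged/defrag", base);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    return write_knob(path, on ? "1" : "0");
}
```

A caller running as root would do thp_enable_always(THP_BASE), then thp_set_defrag(THP_BASE, 0), carry out steps 3 and 4, and finish with thp_set_defrag(THP_BASE, 1). The base directory is a parameter so the helpers can be pointed at a scratch directory when no real sysfs is available.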


New trace:
----------
Unable to handle kernel paging request for data at address
0x5deadbeef108
Faulting instruction address: 0xc026f9f4
Oops: Kernel access of bad area, sig: 11 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Dumping ftrace buffer: 
   (ftrace buffer empty)
Modules linked in: bridge iptable_mangle ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4
xt_tcpudp tun stp llc kvm_hv kvm iptable_filter vmx_crypto
powernv_op_panel powernv_rng leds_powernv rng_core ipmi_powernv
led_class ipmi_devintf ipmi_msghandler binfmt_misc nfsd ip_tables
x_tables autofs4 [last unloaded: bridge]
CPU: 52 PID: 803 Comm: kcompactd1 Not tainted
4.13.0-next-20170915-autotest #1
task: c007f238 task.stack: c007f240
NIP:  c026f9f4 LR: c02d1328 CTR: c026f980
REGS: c007f24037d0 TRAP: 0380   Not tainted
(4.13.0-next-20170915-autotest)
MSR:  92009033   CR: 22822088  XER:
  
CFAR: c02d1324 SOFTE: 1 
GPR00: c02d1328 c007f2403a50 c10bd500
f3dcd100 
GPR04: c007f2403c90 c007f2403af0 f21628a0
5deadbeef100 
GPR08: 5deadbeef200 5deadbeef200 5deadbeef100
0060 
GPR12: c026f980 cfd51e00 f2163700
2000 
GPR16:  8000 
c026c3d0 
GPR20: 0003 0001 c007f2403ca0
c007f2403c90 
GPR24: c026f980  f21636c0
f3dcd100 
GPR28: 5deadbeef100 5deadbeef200 0001
c007f2403c90 
NIP [c026f9f4] compaction_alloc+0x74/0x350
LR [c02d1328] migrate_pages+0x268/0x10c0
Call Trace:
[c007f2403a50] [c0239584] free_hot_cold_page+0x2b4/0x310
(unreliable)
[c007f2403ad0] [c02d1328] migrate_pages+0x268/0x10c0
[c007f2403bc0] [c0270814] compact_zone+0x294/0xb30
[c007f2403c70] [c02714c8] kcompactd_do_work+0x168/0x300
[c007f2403d40] [c0271718] kcompactd+0xb8/0x250
[c007f2403dc0] [c01102f0] kthread+0x160/0x1a0
[c007f2403e30] [c000bc60] ret_from_kernel_thread+0x5c/0x7c
Instruction dump:
419e008c 3d405dea e87f 614adbee 794a07c6 654af000 e9030008 e8e3 
3863ffe0 7d495378 614a0100 61290200  f8e8 f9430020
f9230028 
---[ end trace 27b8c4e55ceebc7d ]---

> 
> Machine Type: Power 8 PowerVM LPAR
> Kernel version: 4.13.0-next-20170911
> config file : attached 
> 
> 
> dmesg logs
> -
> 
> Unable to handle kernel paging request for data at address
> 0x5deadbeef108
> Faulting instruction address: 0xc02b56c4
> Oops: Kernel access of bad area, sig: 11 [#1]
> LE SMP NR_CPUS=2048 NUMA pSeries
> Modules linked in: xt_addrtype xt_conntrack ipt_MASQUERADE
> nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
> nf_nat_ipv4 iptable_filter ip_tables x_tables nf_nat nf_conntrack bridge
> stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c
> rtc_generic vmx_crypto pseries_rng autofs4
> CPU: 5 PID: 846 Comm: avocado Not tainted 4.13.0-next-20170911 #1
> task: c00771c02e00 task.stack: c00771c88000
> NIP:  c02b56c4 LR: c02b7738 CTR: c03587b0
> REGS: c00771c8b2c0 TRAP: 0380   Not tainted  (4.13.0-next-20170911)
> MSR:  80010280b033   CR:
> 84228828  XER: 2000  
> CFAR: c02b7734 SOFTE: 0 
> GPR00: c02b7738 c00771c8b540 c1598a00
>  
> GPR04: f1d2cce0 0001 5deadbeef100
> 5deadbeef200 
> GPR08: 5deadbee c0077ff54710 
> 0060 
> GPR12: 24242824 ce743480 00077eb9
> c0077fc68978 
> GPR16: c0077ff54600