Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-18 Thread Jan Stancek
----- Original Message ----- > Jan Stancek writes: > > Hi Mike, > > > > Revert of 67961f9db8c4 helps, I let whole suite run for 100 iterations, > > there were no issues. > > > > I cut down reproducer and removed last mmap/write/munmap as that is enough > > to reproduce the problem. Then I star

Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-18 Thread Aneesh Kumar K.V
Jan Stancek writes: > Hi Mike, > > Revert of 67961f9db8c4 helps, I let whole suite run for 100 iterations, > there were no issues. > > I cut down reproducer and removed last mmap/write/munmap as that is enough > to reproduce the problem. Then I started introducing some traces into kernel > and not

Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-17 Thread Mike Kravetz
On 10/17/2016 03:53 PM, Mike Kravetz wrote: > On 10/16/2016 10:04 PM, Aneesh Kumar K.V wrote: >> >> looking at that commit, I am not sure region_chg output indicate a hole >> punched. ie, w.r.t private mapping when we mmap, we don't do a >> region_chg (hugetlb_reserve_page()). So with a fault later

Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-17 Thread Mike Kravetz
On 10/17/2016 11:27 AM, Aneesh Kumar K.V wrote: > Jan Stancek writes: > > >> Hi Mike, >> >> Revert of 67961f9db8c4 helps, I let whole suite run for 100 iterations, >> there were no issues. >> >> I cut down reproducer and removed last mmap/write/munmap as that is enough >> to reproduce the proble

Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-17 Thread Mike Kravetz
On 10/16/2016 10:04 PM, Aneesh Kumar K.V wrote: > Mike Kravetz writes: > >> On 10/14/2016 01:48 AM, Jan Stancek wrote: >>> On 10/14/2016 01:26 AM, Mike Kravetz wrote: >>>> >>>> Hi Jan, >>>> >>>> Any chance you can get the contents of /sys/kernel/mm/hugepages >>>> before and after the first run of

Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-17 Thread Aneesh Kumar K.V
Jan Stancek writes: > Hi Mike, > > Revert of 67961f9db8c4 helps, I let whole suite run for 100 iterations, > there were no issues. > > I cut down reproducer and removed last mmap/write/munmap as that is enough > to reproduce the problem. Then I started introducing some traces into kernel > and n

Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-17 Thread Jan Stancek
> "aneesh kumar", "iamjoonsoo kim" > Sent: Saturday, 15 October, 2016 1:57:31 AM > Subject: Re: [bug/regression] libhugetlbfs testsuite failures and OOMs > eventually kill my system > > > It is pretty consistent that we leak a reserve page e

Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-17 Thread Aneesh Kumar K.V
Mike Kravetz writes: > On 10/14/2016 01:48 AM, Jan Stancek wrote: >> On 10/14/2016 01:26 AM, Mike Kravetz wrote: >>> >>> Hi Jan, >>> >>> Any chance you can get the contents of /sys/kernel/mm/hugepages >>> before and after the first run of libhugetlbfs testsuite on Power? >>> Perhaps a script like

Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-14 Thread Mike Kravetz
On 10/14/2016 01:48 AM, Jan Stancek wrote: > On 10/14/2016 01:26 AM, Mike Kravetz wrote: >> >> Hi Jan, >> >> Any chance you can get the contents of /sys/kernel/mm/hugepages >> before and after the first run of libhugetlbfs testsuite on Power? >> Perhaps a script like: >> >> cd /sys/kernel/mm/hugepa

Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-14 Thread Jan Stancek
On 10/14/2016 01:26 AM, Mike Kravetz wrote: > > Hi Jan, > > Any chance you can get the contents of /sys/kernel/mm/hugepages > before and after the first run of libhugetlbfs testsuite on Power? > Perhaps a script like: > > cd /sys/kernel/mm/hugepages > for f in hugepages-*/*; do > n=`cat $f

Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-13 Thread Mike Kravetz
On 10/13/2016 08:24 AM, Mike Kravetz wrote: > On 10/13/2016 05:19 AM, Jan Stancek wrote: >> Hi, >> >> I'm running into ENOMEM failures with libhugetlbfs testsuite [1] on >> a power8 lpar system running 4.8 or latest git [2]. Repeated runs of >> this suite trigger multiple OOMs, that eventually kill

Re: [bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-13 Thread Mike Kravetz
On 10/13/2016 05:19 AM, Jan Stancek wrote: > Hi, > > I'm running into ENOMEM failures with libhugetlbfs testsuite [1] on > a power8 lpar system running 4.8 or latest git [2]. Repeated runs of > this suite trigger multiple OOMs, that eventually kill entire system, > it usually takes 3-5 runs: > >

[bug/regression] libhugetlbfs testsuite failures and OOMs eventually kill my system

2016-10-13 Thread Jan Stancek
Hi, I'm running into ENOMEM failures with libhugetlbfs testsuite [1] on a power8 lpar system running 4.8 or latest git [2]. Repeated runs of this suite trigger multiple OOMs, that eventually kill entire system, it usually takes 3-5 runs: * Total System Memory..: 18024 MB * Shared Mem Max M