Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-25 Thread Joonsoo Kim
On Fri, May 23, 2014 at 05:57:58PM -0700, Laura Abbott wrote:
> On 5/12/2014 10:04 AM, Laura Abbott wrote:
> > 
> > I'm going to see about running this through tests internally for comparison.
> > Hopefully I'll get useful results in a day or so.
> > 
> > Thanks,
> > Laura
> > 
> 
> We ran some tests internally and found that for our purposes these patches made
> the benchmarks worse vs. the existing implementation of using CMA first for some
> pages. These are mostly androidisms but androidisms that we care about for
> having a device be useful.
> 
> The foreground memory headroom on the device was on average about 40 MB smaller
> when using these patches vs our existing implementation of something like
> solution #1. By foreground memory headroom we simply mean the amount of memory
> that the foreground application can allocate before it is killed by the Android
> Low Memory killer.
> 
> We also found that when running a sequence of app launches these patches had
> more high priority app kills by the LMK and more alloc stalls. The test did a
> total of 500 hundred app launches (using 9 separate applications). The CMA
> memory in our system is rarely used by its client and is therefore available
> to the system most of the time.
> 
> Test device
> - 4 CPUs
> - Android 4.4.2
> - 512MB of RAM
> - 68 MB of CMA
> 
> 
> Results:
> 
> Existing solution:
> Foreground headroom: 200MB
> Number of higher priority LMK kills (oom_score_adj < 529): 332
> Number of alloc stalls: 607
> 
> 
> Test patches:
> Foreground headroom: 160MB
> Number of higher priority LMK kills (oom_score_adj < 529): 459
> Number of alloc stalls: 29538
> 
> We believe that the issues seen with these patches are the result of the LMK
> being more aggressive. The LMK will be more aggressive because it will ignore
> free CMA pages for unmovable allocations, and since most calls to the LMK are
> made by kswapd (which uses GFP_KERNEL) the LMK will mostly ignore free CMA
> pages. Because the LMK thresholds are higher than the zone watermarks, there
> will often be a lot of free CMA pages in the system when the LMK is called,
> which the LMK will usually ignore.

Hello,

Thanks a lot for testing!
If possible, please let me know the value of nr_free_cma with these patches
and with your in-house implementation before the test starts.

I can guess at the following scenario for your test.

On boot-up, CMA memory is mostly used by native processes, because
your implementation uses CMA first for some pages. kswapd is woken up
late, since there is more non-CMA free memory than with my
implementation. And, on reclaim, when the LMK reclaims memory by
killing an app process, it reclaims movable memory with high probability,
since CMA memory is mostly used by native processes and app processes
have only movable memory.

This is just my guess. But if it is true, this is not a fair test for
this patchset. If possible, could you make nr_free_cma the same on both
implementations before testing?

Moreover, in the mainline implementation, the LMK doesn't consider whether
memory is CMA or not. Maybe your overall system is highly optimized
for your implementation, so I'm not sure whether your test is
appropriate for this patchset.

Anyway, I would like to optimize this for Android. :)
Please let me know more about your system.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-23 Thread Laura Abbott
On 5/12/2014 10:04 AM, Laura Abbott wrote:
> 
> I'm going to see about running this through tests internally for comparison.
> Hopefully I'll get useful results in a day or so.
> 
> Thanks,
> Laura
> 

We ran some tests internally and found that for our purposes these patches made
the benchmarks worse vs. the existing implementation of using CMA first for some
pages. These are mostly androidisms but androidisms that we care about for
having a device be useful.

The foreground memory headroom on the device was on average about 40 MB smaller
when using these patches vs our existing implementation of something like
solution #1. By foreground memory headroom we simply mean the amount of memory
that the foreground application can allocate before it is killed by the Android
Low Memory killer.

We also found that when running a sequence of app launches these patches had
more high priority app kills by the LMK and more alloc stalls. The test did a
total of 500 hundred app launches (using 9 separate applications). The CMA
memory in our system is rarely used by its client and is therefore available
to the system most of the time.

Test device
- 4 CPUs
- Android 4.4.2
- 512MB of RAM
- 68 MB of CMA


Results:

Existing solution:
Foreground headroom: 200MB
Number of higher priority LMK kills (oom_score_adj < 529): 332
Number of alloc stalls: 607


Test patches:
Foreground headroom: 160MB
Number of higher priority LMK kills (oom_score_adj < 529): 459
Number of alloc stalls: 29538

We believe that the issues seen with these patches are the result of the LMK
being more aggressive. The LMK will be more aggressive because it will ignore
free CMA pages for unmovable allocations, and since most calls to the LMK are
made by kswapd (which uses GFP_KERNEL) the LMK will mostly ignore free CMA
pages. Because the LMK thresholds are higher than the zone watermarks, there
will often be a lot of free CMA pages in the system when the LMK is called,
which the LMK will usually ignore.

Thanks,
Laura

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-19 Thread Joonsoo Kim
On Tue, May 20, 2014 at 08:18:59AM +0900, Minchan Kim wrote:
> On Mon, May 19, 2014 at 01:50:01PM +0900, Joonsoo Kim wrote:
> > On Mon, May 19, 2014 at 11:53:05AM +0900, Minchan Kim wrote:
> > > On Mon, May 19, 2014 at 11:11:21AM +0900, Joonsoo Kim wrote:
> > > > On Thu, May 15, 2014 at 11:43:53AM +0900, Minchan Kim wrote:
> > > > > On Thu, May 15, 2014 at 10:53:01AM +0900, Joonsoo Kim wrote:
> > > > > > On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
> > > > > > > Hey Joonsoo,
> > > > > > > 
> > > > > > > On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
> > > > > > > > CMA is introduced to provide physically contiguous pages at runtime.
> > > > > > > > For this purpose, it reserves memory at boot time. Although it reserve
> > > > > > > > memory, this reserved memory can be used for movable memory allocation
> > > > > > > > request. This usecase is beneficial to the system that needs this CMA
> > > > > > > > reserved memory infrequently and it is one of main purpose of
> > > > > > > > introducing CMA.
> > > > > > > > 
> > > > > > > > But, there is a problem in current implementation. The problem is that
> > > > > > > > it works like as just reserved memory approach. The pages on cma reserved
> > > > > > > > memory are hardly used for movable memory allocation. This is caused by
> > > > > > > > combination of allocation and reclaim policy.
> > > > > > > > 
> > > > > > > > The pages on cma reserved memory are allocated if there is no movable
> > > > > > > > memory, that is, as fallback allocation. So the time this fallback
> > > > > > > > allocation is started is under heavy memory pressure. Although it is under
> > > > > > > > memory pressure, movable allocation easily succeed, since there would be
> > > > > > > > many pages on cma reserved memory. But this is not the case for unmovable
> > > > > > > > and reclaimable allocation, because they can't use the pages on cma
> > > > > > > > reserved memory. These allocations regard system's free memory as
> > > > > > > > (free pages - free cma pages) on watermark checking, that is, free
> > > > > > > > unmovable pages + free reclaimable pages + free movable pages. Because
> > > > > > > > we already exhausted movable pages, only free pages we have are unmovable
> > > > > > > > and reclaimable types and this would be really small amount. So watermark
> > > > > > > > checking would be failed. It will wake up kswapd to make enough free
> > > > > > > > memory for unmovable and reclaimable allocation and kswapd will do.
> > > > > > > > So before we fully utilize pages on cma reserved memory, kswapd start to
> > > > > > > > reclaim memory and try to make free memory over the high watermark. This
> > > > > > > > watermark checking by kswapd doesn't take care free cma pages so many
> > > > > > > > movable pages would be reclaimed. After then, we have a lot of movable
> > > > > > > > pages again, so fallback allocation doesn't happen again. To conclude,
> > > > > > > > amount of free memory on meminfo which includes free CMA pages is moving
> > > > > > > > around 512 MB if I reserve 512 MB memory for CMA.
> > > > > > > > 
> > > > > > > > I found this problem on following experiment.
> > > > > > > > 
> > > > > > > > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > > > > > > > make -j24
> > > > > > > > 
> > > > > > > > CMA reserve:        0 MB        512 MB
> > > > > > > > Elapsed-time:       234.8       361.8
> > > > > > > > Average-MemFree:    283880 KB   530851 KB
> > > > > > > > 
> > > > > > > > To solve this problem, I can think following 2 possible solutions.
> > > > > > > > 1. allocate the pages on cma reserved memory first, and if they are
> > > > > > > >    exhausted, allocate movable pages.
> > > > > > > > 2. interleaved allocation: try to allocate specific amounts of memory
> > > > > > > >    from cma reserved memory and then allocate from free movable memory.
> > > > > > > 
> > > > > > > I love this idea but when I see the code, I don't like that.
> > > > > > > In allocation path, just try to allocate pages by round-robin so it's role
> > > > > > > of allocator. If one of migratetype is full, just pass mission to reclaimer
> > > > > > > with hint(ie, Hey reclaimer, it's non-movable allocation fail
> > > > > > > so there is pointless if you reclaim MIGRATE_CMA pages) so that
> > > > > > > reclaimer can filter it out during page scanning.
> > > > > > > We already have an tool to achieve it(ie, isola

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-19 Thread Minchan Kim
On Thu, May 15, 2014 at 11:45:31AM +0900, Heesub Shin wrote:
> Hello,
> 
> On 05/15/2014 10:53 AM, Joonsoo Kim wrote:
> >On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
> >>Hey Joonsoo,
> >>
> >>On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
> >>>CMA is introduced to provide physically contiguous pages at runtime.
> >>>For this purpose, it reserves memory at boot time. Although it reserve
> >>>memory, this reserved memory can be used for movable memory allocation
> >>>request. This usecase is beneficial to the system that needs this CMA
> >>>reserved memory infrequently and it is one of main purpose of
> >>>introducing CMA.
> >>>
> >>>But, there is a problem in current implementation. The problem is that
> >>>it works like as just reserved memory approach. The pages on cma reserved
> >>>memory are hardly used for movable memory allocation. This is caused by
> >>>combination of allocation and reclaim policy.
> >>>
> >>>The pages on cma reserved memory are allocated if there is no movable
> >>>memory, that is, as fallback allocation. So the time this fallback
> >>>allocation is started is under heavy memory pressure. Although it is under
> >>>memory pressure, movable allocation easily succeed, since there would be
> >>>many pages on cma reserved memory. But this is not the case for unmovable
> >>>and reclaimable allocation, because they can't use the pages on cma
> >>>reserved memory. These allocations regard system's free memory as
> >>>(free pages - free cma pages) on watermark checking, that is, free
> >>>unmovable pages + free reclaimable pages + free movable pages. Because
> >>>we already exhausted movable pages, only free pages we have are unmovable
> >>>and reclaimable types and this would be really small amount. So watermark
> >>>checking would be failed. It will wake up kswapd to make enough free
> >>>memory for unmovable and reclaimable allocation and kswapd will do.
> >>>So before we fully utilize pages on cma reserved memory, kswapd start to
> >>>reclaim memory and try to make free memory over the high watermark. This
> >>>watermark checking by kswapd doesn't take care free cma pages so many
> >>>movable pages would be reclaimed. After then, we have a lot of movable
> >>>pages again, so fallback allocation doesn't happen again. To conclude,
> >>>amount of free memory on meminfo which includes free CMA pages is moving
> >>>around 512 MB if I reserve 512 MB memory for CMA.
> >>>
> >>>I found this problem on following experiment.
> >>>
> >>>4 CPUs, 1024 MB, VIRTUAL MACHINE
> >>>make -j24
> >>>
> >>>CMA reserve:        0 MB        512 MB
> >>>Elapsed-time:       234.8       361.8
> >>>Average-MemFree:    283880 KB   530851 KB
> >>>
> >>>To solve this problem, I can think following 2 possible solutions.
> >>>1. allocate the pages on cma reserved memory first, and if they are
> >>>   exhausted, allocate movable pages.
> >>>2. interleaved allocation: try to allocate specific amounts of memory
> >>>   from cma reserved memory and then allocate from free movable memory.
> >>
> >>I love this idea but when I see the code, I don't like that.
> >>In allocation path, just try to allocate pages by round-robin so it's role
> >>of allocator. If one of migratetype is full, just pass mission to reclaimer
> >>with hint(ie, Hey reclaimer, it's non-movable allocation fail
> >>so there is pointless if you reclaim MIGRATE_CMA pages) so that
> >>reclaimer can filter it out during page scanning.
> >>We already have an tool to achieve it(ie, isolate_mode_t).
> >
> >Hello,
> >
> >I agree with leaving fast allocation path as simple as possible.
> >I will remove runtime computation for determining ratio in
> >__rmqueue_cma() and, instead, will use pre-computed value calculated
> >on the other path.
> >
> >I am not sure that whether your second suggestion(Hey relaimer part)
> >is good or not. In my quick thought, that could be helpful in the
> >situation that many free cma pages remained. But, it would be not helpful
> >when there are neither free movable and cma pages. In generally, most
> >workloads mainly uses movable pages for page cache or anonymous mapping.
> >Although reclaim is triggered by non-movable allocation failure, reclaimed
> >pages are used mostly by movable allocation. We can handle these allocation
> >request even if we reclaim the pages just in lru order. If we rotate
> >the lru list for finding movable pages, it could cause more useful
> >pages to be evicted.
> >
> >This is just my quick thought, so please let me correct if I am wrong.
> 
> We have an out of tree implementation that is completely the same
> with the approach Minchan said and it works, but it has definitely
> some side-effects as you pointed, distorting the LRU and evicting
> hot pages. I do not attach code fragments in this thread for some
> reasons, but it must be easy for yourself. I am wondering if it
> could help also in your case.
> 
> Thanks,
> Heesub

Heesub, To be sure, did you 

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-19 Thread Minchan Kim
On Mon, May 19, 2014 at 01:50:01PM +0900, Joonsoo Kim wrote:
> On Mon, May 19, 2014 at 11:53:05AM +0900, Minchan Kim wrote:
> > On Mon, May 19, 2014 at 11:11:21AM +0900, Joonsoo Kim wrote:
> > > On Thu, May 15, 2014 at 11:43:53AM +0900, Minchan Kim wrote:
> > > > On Thu, May 15, 2014 at 10:53:01AM +0900, Joonsoo Kim wrote:
> > > > > On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
> > > > > > Hey Joonsoo,
> > > > > > 
> > > > > > On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
> > > > > > > CMA is introduced to provide physically contiguous pages at runtime.
> > > > > > > For this purpose, it reserves memory at boot time. Although it reserve
> > > > > > > memory, this reserved memory can be used for movable memory allocation
> > > > > > > request. This usecase is beneficial to the system that needs this CMA
> > > > > > > reserved memory infrequently and it is one of main purpose of
> > > > > > > introducing CMA.
> > > > > > > 
> > > > > > > But, there is a problem in current implementation. The problem is that
> > > > > > > it works like as just reserved memory approach. The pages on cma reserved
> > > > > > > memory are hardly used for movable memory allocation. This is caused by
> > > > > > > combination of allocation and reclaim policy.
> > > > > > > 
> > > > > > > The pages on cma reserved memory are allocated if there is no movable
> > > > > > > memory, that is, as fallback allocation. So the time this fallback
> > > > > > > allocation is started is under heavy memory pressure. Although it is under
> > > > > > > memory pressure, movable allocation easily succeed, since there would be
> > > > > > > many pages on cma reserved memory. But this is not the case for unmovable
> > > > > > > and reclaimable allocation, because they can't use the pages on cma
> > > > > > > reserved memory. These allocations regard system's free memory as
> > > > > > > (free pages - free cma pages) on watermark checking, that is, free
> > > > > > > unmovable pages + free reclaimable pages + free movable pages. Because
> > > > > > > we already exhausted movable pages, only free pages we have are unmovable
> > > > > > > and reclaimable types and this would be really small amount. So watermark
> > > > > > > checking would be failed. It will wake up kswapd to make enough free
> > > > > > > memory for unmovable and reclaimable allocation and kswapd will do.
> > > > > > > So before we fully utilize pages on cma reserved memory, kswapd start to
> > > > > > > reclaim memory and try to make free memory over the high watermark. This
> > > > > > > watermark checking by kswapd doesn't take care free cma pages so many
> > > > > > > movable pages would be reclaimed. After then, we have a lot of movable
> > > > > > > pages again, so fallback allocation doesn't happen again. To conclude,
> > > > > > > amount of free memory on meminfo which includes free CMA pages is moving
> > > > > > > around 512 MB if I reserve 512 MB memory for CMA.
> > > > > > > 
> > > > > > > I found this problem on following experiment.
> > > > > > > 
> > > > > > > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > > > > > > make -j24
> > > > > > > 
> > > > > > > CMA reserve:        0 MB        512 MB
> > > > > > > Elapsed-time:       234.8       361.8
> > > > > > > Average-MemFree:    283880 KB   530851 KB
> > > > > > > 
> > > > > > > To solve this problem, I can think following 2 possible solutions.
> > > > > > > 1. allocate the pages on cma reserved memory first, and if they are
> > > > > > >    exhausted, allocate movable pages.
> > > > > > > 2. interleaved allocation: try to allocate specific amounts of memory
> > > > > > >    from cma reserved memory and then allocate from free movable memory.
> > > > > > 
> > > > > > I love this idea but when I see the code, I don't like that.
> > > > > > In allocation path, just try to allocate pages by round-robin so it's role
> > > > > > of allocator. If one of migratetype is full, just pass mission to reclaimer
> > > > > > with hint(ie, Hey reclaimer, it's non-movable allocation fail
> > > > > > so there is pointless if you reclaim MIGRATE_CMA pages) so that
> > > > > > reclaimer can filter it out during page scanning.
> > > > > > We already have an tool to achieve it(ie, isolate_mode_t).
> > > > > 
> > > > > Hello,
> > > > > 
> > > > > I agree with leaving fast allocation path as simple as possible.
> > > > > I will remove runtime computation for determining ratio in
> > > > > __rmqueue_cma() and, instead, will use pre-computed value calculated
> > > > > on the other path.
> > > > 
> >

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-18 Thread Joonsoo Kim
On Mon, May 19, 2014 at 11:53:05AM +0900, Minchan Kim wrote:
> On Mon, May 19, 2014 at 11:11:21AM +0900, Joonsoo Kim wrote:
> > On Thu, May 15, 2014 at 11:43:53AM +0900, Minchan Kim wrote:
> > > On Thu, May 15, 2014 at 10:53:01AM +0900, Joonsoo Kim wrote:
> > > > On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
> > > > > Hey Joonsoo,
> > > > > 
> > > > > On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
> > > > > > CMA is introduced to provide physically contiguous pages at runtime.
> > > > > > For this purpose, it reserves memory at boot time. Although it reserve
> > > > > > memory, this reserved memory can be used for movable memory allocation
> > > > > > request. This usecase is beneficial to the system that needs this CMA
> > > > > > reserved memory infrequently and it is one of main purpose of
> > > > > > introducing CMA.
> > > > > > 
> > > > > > But, there is a problem in current implementation. The problem is that
> > > > > > it works like as just reserved memory approach. The pages on cma reserved
> > > > > > memory are hardly used for movable memory allocation. This is caused by
> > > > > > combination of allocation and reclaim policy.
> > > > > > 
> > > > > > The pages on cma reserved memory are allocated if there is no movable
> > > > > > memory, that is, as fallback allocation. So the time this fallback
> > > > > > allocation is started is under heavy memory pressure. Although it is under
> > > > > > memory pressure, movable allocation easily succeed, since there would be
> > > > > > many pages on cma reserved memory. But this is not the case for unmovable
> > > > > > and reclaimable allocation, because they can't use the pages on cma
> > > > > > reserved memory. These allocations regard system's free memory as
> > > > > > (free pages - free cma pages) on watermark checking, that is, free
> > > > > > unmovable pages + free reclaimable pages + free movable pages. Because
> > > > > > we already exhausted movable pages, only free pages we have are unmovable
> > > > > > and reclaimable types and this would be really small amount. So watermark
> > > > > > checking would be failed. It will wake up kswapd to make enough free
> > > > > > memory for unmovable and reclaimable allocation and kswapd will do.
> > > > > > So before we fully utilize pages on cma reserved memory, kswapd start to
> > > > > > reclaim memory and try to make free memory over the high watermark. This
> > > > > > watermark checking by kswapd doesn't take care free cma pages so many
> > > > > > movable pages would be reclaimed. After then, we have a lot of movable
> > > > > > pages again, so fallback allocation doesn't happen again. To conclude,
> > > > > > amount of free memory on meminfo which includes free CMA pages is moving
> > > > > > around 512 MB if I reserve 512 MB memory for CMA.
> > > > > > 
> > > > > > I found this problem on following experiment.
> > > > > > 
> > > > > > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > > > > > make -j24
> > > > > > 
> > > > > > CMA reserve:        0 MB        512 MB
> > > > > > Elapsed-time:       234.8       361.8
> > > > > > Average-MemFree:    283880 KB   530851 KB
> > > > > > 
> > > > > > To solve this problem, I can think following 2 possible solutions.
> > > > > > 1. allocate the pages on cma reserved memory first, and if they are
> > > > > >    exhausted, allocate movable pages.
> > > > > > 2. interleaved allocation: try to allocate specific amounts of memory
> > > > > >    from cma reserved memory and then allocate from free movable memory.
> > > > > 
> > > > > I love this idea but when I see the code, I don't like that.
> > > > > In allocation path, just try to allocate pages by round-robin so it's role
> > > > > of allocator. If one of migratetype is full, just pass mission to reclaimer
> > > > > with hint(ie, Hey reclaimer, it's non-movable allocation fail
> > > > > so there is pointless if you reclaim MIGRATE_CMA pages) so that
> > > > > reclaimer can filter it out during page scanning.
> > > > > We already have an tool to achieve it(ie, isolate_mode_t).
> > > > 
> > > > Hello,
> > > > 
> > > > I agree with leaving fast allocation path as simple as possible.
> > > > I will remove runtime computation for determining ratio in
> > > > __rmqueue_cma() and, instead, will use pre-computed value calculated
> > > > on the other path.
> > > 
> > > Sounds good.
> > > 
> > > > 
> > > > I am not sure that whether your second suggestion(Hey relaimer part)
> > > > is good or not. In my quick thought, that could be helpful in the
> > > > situation that many free cma pages remained. But, it would be not helpful
> > > > when there are neither free movable 

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-18 Thread Minchan Kim
On Mon, May 19, 2014 at 11:11:21AM +0900, Joonsoo Kim wrote:
> On Thu, May 15, 2014 at 11:43:53AM +0900, Minchan Kim wrote:
> > On Thu, May 15, 2014 at 10:53:01AM +0900, Joonsoo Kim wrote:
> > > On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
> > > > Hey Joonsoo,
> > > > 
> > > > On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
> > > > > CMA is introduced to provide physically contiguous pages at runtime.
> > > > > For this purpose, it reserves memory at boot time. Although it reserve
> > > > > memory, this reserved memory can be used for movable memory allocation
> > > > > request. This usecase is beneficial to the system that needs this CMA
> > > > > reserved memory infrequently and it is one of main purpose of
> > > > > introducing CMA.
> > > > > 
> > > > > But, there is a problem in current implementation. The problem is that
> > > > > it works like as just reserved memory approach. The pages on cma reserved
> > > > > memory are hardly used for movable memory allocation. This is caused by
> > > > > combination of allocation and reclaim policy.
> > > > > 
> > > > > The pages on cma reserved memory are allocated if there is no movable
> > > > > memory, that is, as fallback allocation. So the time this fallback
> > > > > allocation is started is under heavy memory pressure. Although it is under
> > > > > memory pressure, movable allocation easily succeed, since there would be
> > > > > many pages on cma reserved memory. But this is not the case for unmovable
> > > > > and reclaimable allocation, because they can't use the pages on cma
> > > > > reserved memory. These allocations regard system's free memory as
> > > > > (free pages - free cma pages) on watermark checking, that is, free
> > > > > unmovable pages + free reclaimable pages + free movable pages. Because
> > > > > we already exhausted movable pages, only free pages we have are unmovable
> > > > > and reclaimable types and this would be really small amount. So watermark
> > > > > checking would be failed. It will wake up kswapd to make enough free
> > > > > memory for unmovable and reclaimable allocation and kswapd will do.
> > > > > So before we fully utilize pages on cma reserved memory, kswapd start to
> > > > > reclaim memory and try to make free memory over the high watermark. This
> > > > > watermark checking by kswapd doesn't take care free cma pages so many
> > > > > movable pages would be reclaimed. After then, we have a lot of movable
> > > > > pages again, so fallback allocation doesn't happen again. To conclude,
> > > > > amount of free memory on meminfo which includes free CMA pages is moving
> > > > > around 512 MB if I reserve 512 MB memory for CMA.
> > > > > 
> > > > > I found this problem on following experiment.
> > > > > 
> > > > > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > > > > make -j24
> > > > > 
> > > > > CMA reserve:        0 MB        512 MB
> > > > > Elapsed-time:       234.8       361.8
> > > > > Average-MemFree:    283880 KB   530851 KB
> > > > > 
> > > > > To solve this problem, I can think following 2 possible solutions.
> > > > > 1. allocate the pages on cma reserved memory first, and if they are
> > > > >    exhausted, allocate movable pages.
> > > > > 2. interleaved allocation: try to allocate specific amounts of memory
> > > > >    from cma reserved memory and then allocate from free movable memory.
> > > > 
> > > > I love this idea but when I see the code, I don't like that.
> > > > In allocation path, just try to allocate pages by round-robin so it's role
> > > > of allocator. If one of migratetype is full, just pass mission to reclaimer
> > > > with hint(ie, Hey reclaimer, it's non-movable allocation fail
> > > > so there is pointless if you reclaim MIGRATE_CMA pages) so that
> > > > reclaimer can filter it out during page scanning.
> > > > We already have an tool to achieve it(ie, isolate_mode_t).
> > > 
> > > Hello,
> > > 
> > > I agree with leaving fast allocation path as simple as possible.
> > > I will remove runtime computation for determining ratio in
> > > __rmqueue_cma() and, instead, will use pre-computed value calculated
> > > on the other path.
> > 
> > Sounds good.
> > 
> > > 
> > > I am not sure that whether your second suggestion(Hey relaimer part)
> > > is good or not. In my quick thought, that could be helpful in the
> > > situation that many free cma pages remained. But, it would be not helpful
> > > when there are neither free movable and cma pages. In generally, most
> > > workloads mainly uses movable pages for page cache or anonymous mapping.
> > > Although reclaim is triggered by non-movable allocation failure, reclaimed
> > > pages are used mostly by movable allocation. We can handle these allocation
> > > request even if we reclaim the pages just in lru order. If we rotate
> > > the lru list for finding m

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-18 Thread Joonsoo Kim
On Sun, May 18, 2014 at 11:06:08PM +0530, Aneesh Kumar K.V wrote:
> Joonsoo Kim  writes:
> 
> > On Wed, May 14, 2014 at 02:12:19PM +0530, Aneesh Kumar K.V wrote:
> >> Joonsoo Kim  writes:
> >> 
> >> 
> >> 
> >> Another issue i am facing with the current code is the atomic allocation
> >> failing even with large number of CMA pages around. In my case we never
> >> reclaimed because large part of the memory is consumed by the page cache and
> >> for that, free memory check doesn't include at free_cma. I will test
> >> with this patchset and update here once i have the results.
> >> 
> >
> > Hello,
> >
> > Could you elaborate more on your issue?
> > I can't completely understand your problem.
> > So your atomic allocation is movable? And although there are many free
> > cma pages, that request is fail?
> >
> 
> non movable atomic allocations are failing because we don't have
> anything other than CMA pages left and kswapd is yet to catchup ?
> 
> 
>   swapper/0: page allocation failure: order:0, mode:0x20
>   CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.23-1500.pkvm2_1.5.ppc64 #1
>   Call Trace:
>  [c00cb610] [c0017330] .show_stack+0x130/0x200 (unreliable)
>  [c00cb6e0] [c087a8c8] .dump_stack+0x28/0x3c
>  [c00cb750] [c01e06f0] .warn_alloc_failed+0x110/0x160
>  [c00cb800] [c01e5984] .__alloc_pages_nodemask+0x9d4/0xbf0
>  [c00cb9e0] [c023775c] .alloc_pages_current+0xcc/0x1b0
>  [c00cba80] [c07098d4] .__netdev_alloc_frag+0x1a4/0x1d0
>  [c00cbb20] [c070d750] .__netdev_alloc_skb+0xc0/0x130
>  [c00cbbb0] [d9639b40] .tg3_poll_work+0x900/0x1110 [tg3]
>  [c00cbd10] [d963a3a4] .tg3_poll_msix+0x54/0x200 [tg3]
>  [c00cbdb0] [c071fcec] .net_rx_action+0x1dc/0x310
>  [c00cbe90] [c00c1b08] .__do_softirq+0x158/0x330
>  [c00cbf90] [c0025744] .call_do_softirq+0x14/0x24
>  [c00c7e00] [c0011684] .do_softirq+0xf4/0x130
>  [c00c7e90] [c00c1f18] .irq_exit+0xc8/0x110
>  [c00c7f10] [c0011258] .__do_irq+0xc8/0x1f0
>  [c00c7f90] [c0025768] .call_do_irq+0x14/0x24
>  [c137b750] [c001142c] .do_IRQ+0xac/0x130
>  [c137b800] [c0002a64] hardware_interrupt_common+0x164/0x180
> 
> 
> 
> 
>  Node 0 DMA: 408*64kB (C) 408*128kB (C) 408*256kB (C) 408*512kB (C) 408*1024kB (C) 406*2048kB (C) 199*4096kB (C) 97*8192kB (C) 6*16384kB (C) = 3348992kB
>  Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16384kB
>  Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16777216kB
> 
> meminfo details:
> 
>  MemTotal:       65875584 kB
>  MemFree:         8001856 kB
>  Buffers:        49330368 kB
>  Cached:           178752 kB
>  SwapCached:            0 kB
>  Active:         28550464 kB
>  Inactive:       25476416 kB
>  Active(anon):    3771008 kB
>  Inactive(anon):   767360 kB
>  Active(file):   24779456 kB
>  Inactive(file): 24709056 kB
>  Unevictable:       15104 kB
>  Mlocked:           15104 kB
>  SwapTotal:       8384448 kB
>  SwapFree:        8384448 kB
>  Dirty:                 0 kB
> 
> -aneesh
> 

Hello,

I think the third patch in this patchset would solve this problem.
Your problem may occur in the following scenario.

1. Free unmovable and reclaimable pages are nearly exhausted.
2. There are some free movable pages, so the watermark check passes.
3. A lot of movable allocations are requested.
4. Most of the free movable pages get allocated.
5. The watermark check still passes, because we have a lot of free
   cma pages and the allocation is of movable type, so kswapd is
   not woken up.
6. A non-movable atomic allocation is requested => it fails.

So the problem is in step #5. Although we have enough pages for the
movable type, we should also prepare for allocation requests of the
other types. With my third patch, kswapd could be woken by a movable
allocation, so your problem would disappear.
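
To make step #5 concrete, here is a rough Python sketch of the watermark
logic in the scenario. It is a toy model, not kernel code: the Zone
fields, watermark_ok(), should_wake_kswapd() and the check_without_cma
flag are hypothetical names, the page counts are made up, and the flag
only models the assumption that free cma pages are left out of the check
for movable requests as well.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    free_pages: int      # all free pages, free cma pages included
    free_cma_pages: int  # free pages that sit in the cma reserved area
    low_wmark: int       # low watermark, in pages


def watermark_ok(zone: Zone, can_use_cma: bool) -> bool:
    """Toy stand-in for the watermark check, not the kernel function."""
    usable = zone.free_pages
    if not can_use_cma:
        # Unmovable and reclaimable requests cannot be served from cma.
        usable -= zone.free_cma_pages
    return usable > zone.low_wmark


def should_wake_kswapd(zone: Zone, movable: bool,
                       check_without_cma: bool = False) -> bool:
    """check_without_cma=True models the assumed behaviour with the fix."""
    can_use_cma = movable and not check_without_cma
    return not watermark_ok(zone, can_use_cma)


# State at step #5: plenty of free cma pages, almost nothing else.
zone = Zone(free_pages=52_000, free_cma_pages=51_000, low_wmark=2_000)

print(should_wake_kswapd(zone, movable=True))    # movable, current check
print(should_wake_kswapd(zone, movable=False))   # unmovable request
print(should_wake_kswapd(zone, movable=True, check_without_cma=True))
```

In this model a movable request alone never wakes kswapd, so no
background reclaim has started by the time the unmovable request of
step #6 arrives. With the flag set, the movable request already fails
the check and reclaim can begin earlier.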

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-18 Thread Joonsoo Kim
On Thu, May 15, 2014 at 11:43:53AM +0900, Minchan Kim wrote:
> On Thu, May 15, 2014 at 10:53:01AM +0900, Joonsoo Kim wrote:
> > On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
> > > Hey Joonsoo,
> > > 
> > > On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
> > > > CMA is introduced to provide physically contiguous pages at runtime.
> > > > For this purpose, it reserves memory at boot time. Although it reserve
> > > > memory, this reserved memory can be used for movable memory allocation
> > > > request. This usecase is beneficial to the system that needs this CMA
> > > > reserved memory infrequently and it is one of main purpose of
> > > > introducing CMA.
> > > > 
> > > > But, there is a problem in current implementation. The problem is that
> > > > it works like as just reserved memory approach. The pages on cma reserved
> > > > memory are hardly used for movable memory allocation. This is caused by
> > > > combination of allocation and reclaim policy.
> > > > 
> > > > The pages on cma reserved memory are allocated if there is no movable
> > > > memory, that is, as fallback allocation. So the time this fallback
> > > > allocation is started is under heavy memory pressure. Although it is under
> > > > memory pressure, movable allocation easily succeed, since there would be
> > > > many pages on cma reserved memory. But this is not the case for unmovable
> > > > and reclaimable allocation, because they can't use the pages on cma
> > > > reserved memory. These allocations regard system's free memory as
> > > > (free pages - free cma pages) on watermark checking, that is, free
> > > > unmovable pages + free reclaimable pages + free movable pages. Because
> > > > we already exhausted movable pages, only free pages we have are unmovable
> > > > and reclaimable types and this would be really small amount. So watermark
> > > > checking would be failed. It will wake up kswapd to make enough free
> > > > memory for unmovable and reclaimable allocation and kswapd will do.
> > > > So before we fully utilize pages on cma reserved memory, kswapd start to
> > > > reclaim memory and try to make free memory over the high watermark. This
> > > > watermark checking by kswapd doesn't take care free cma pages so many
> > > > movable pages would be reclaimed. After then, we have a lot of movable
> > > > pages again, so fallback allocation doesn't happen again. To conclude,
> > > > amount of free memory on meminfo which includes free CMA pages is moving
> > > > around 512 MB if I reserve 512 MB memory for CMA.
> > > > 
> > > > I found this problem on following experiment.
> > > > 
> > > > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > > > make -j24
> > > > 
> > > > CMA reserve:        0 MB        512 MB
> > > > Elapsed-time:       234.8       361.8
> > > > Average-MemFree:    283880 KB   530851 KB
> > > > 
> > > > To solve this problem, I can think following 2 possible solutions.
> > > > 1. allocate the pages on cma reserved memory first, and if they are
> > > >    exhausted, allocate movable pages.
> > > > 2. interleaved allocation: try to allocate specific amounts of memory
> > > >    from cma reserved memory and then allocate from free movable memory.
> > > 
> > > I love this idea but when I see the code, I don't like that.
> > > In allocation path, just try to allocate pages by round-robin so it's role
> > > of allocator. If one of migratetype is full, just pass mission to reclaimer
> > > with hint(ie, Hey reclaimer, it's non-movable allocation fail
> > > so there is pointless if you reclaim MIGRATE_CMA pages) so that
> > > reclaimer can filter it out during page scanning.
> > > We already have an tool to achieve it(ie, isolate_mode_t).
> > 
> > Hello,
> > 
> > I agree with leaving fast allocation path as simple as possible.
> > I will remove runtime computation for determining ratio in
> > __rmqueue_cma() and, instead, will use pre-computed value calculated
> > on the other path.
> 
> Sounds good.
> 
> > 
> > I am not sure that whether your second suggestion(Hey relaimer part)
> > is good or not. In my quick thought, that could be helpful in the
> > situation that many free cma pages remained. But, it would be not helpful
> > when there are neither free movable and cma pages. In generally, most
> > workloads mainly uses movable pages for page cache or anonymous mapping.
> > Although reclaim is triggered by non-movable allocation failure, reclaimed
> > pages are used mostly by movable allocation. We can handle these allocation
> > request even if we reclaim the pages just in lru order. If we rotate
> > the lru list for finding movable pages, it could cause more useful
> > pages to be evicted.
> > 
> > This is just my quick thought, so please let me correct if I am wrong.
> 
> Why should reclaimer reclaim unnecessary pages?
> So, your answer is that it would be better because upcoming newly allocated
> pages would be allocated easily 

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-18 Thread Aneesh Kumar K.V
Joonsoo Kim  writes:

> On Wed, May 14, 2014 at 02:12:19PM +0530, Aneesh Kumar K.V wrote:
>> Joonsoo Kim  writes:
>> 
>> 
>> 
>> Another issue i am facing with the current code is the atomic allocation
>> failing even with large number of CMA pages around. In my case we never
>> reclaimed because large part of the memory is consumed by the page cache and
>> for that, free memory check doesn't include at free_cma. I will test
>> with this patchset and update here once i have the results.
>> 
>
> Hello,
>
> Could you elaborate more on your issue?
> I can't completely understand your problem.
> So your atomic allocation is movable? And although there are many free
> cma pages, that request is fail?
>

non movable atomic allocations are failing because we don't have
anything other than CMA pages left and kswapd is yet to catchup ?


  swapper/0: page allocation failure: order:0, mode:0x20
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.23-1500.pkvm2_1.5.ppc64 #1
  Call Trace:
 [c00cb610] [c0017330] .show_stack+0x130/0x200 (unreliable)
 [c00cb6e0] [c087a8c8] .dump_stack+0x28/0x3c
 [c00cb750] [c01e06f0] .warn_alloc_failed+0x110/0x160
 [c00cb800] [c01e5984] .__alloc_pages_nodemask+0x9d4/0xbf0
 [c00cb9e0] [c023775c] .alloc_pages_current+0xcc/0x1b0
 [c00cba80] [c07098d4] .__netdev_alloc_frag+0x1a4/0x1d0
 [c00cbb20] [c070d750] .__netdev_alloc_skb+0xc0/0x130
 [c00cbbb0] [d9639b40] .tg3_poll_work+0x900/0x1110 [tg3]
 [c00cbd10] [d963a3a4] .tg3_poll_msix+0x54/0x200 [tg3]
 [c00cbdb0] [c071fcec] .net_rx_action+0x1dc/0x310
 [c00cbe90] [c00c1b08] .__do_softirq+0x158/0x330
 [c00cbf90] [c0025744] .call_do_softirq+0x14/0x24
 [c00c7e00] [c0011684] .do_softirq+0xf4/0x130
 [c00c7e90] [c00c1f18] .irq_exit+0xc8/0x110
 [c00c7f10] [c0011258] .__do_irq+0xc8/0x1f0
 [c00c7f90] [c0025768] .call_do_irq+0x14/0x24
 [c137b750] [c001142c] .do_IRQ+0xac/0x130
 [c137b800] [c0002a64] hardware_interrupt_common+0x164/0x180




 Node 0 DMA: 408*64kB (C) 408*128kB (C) 408*256kB (C) 408*512kB (C) 408*1024kB (C) 406*2048kB (C) 199*4096kB (C) 97*8192kB (C) 6*16384kB (C) = 3348992kB
 Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16384kB
 Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16777216kB
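
As a sanity check on the free list dump above, the short Python snippet
below re-adds the per-order counts and collects the type markers. The
(C) marker is what the kernel prints for free blocks of the CMA migrate
type. The regular expressions are only an illustration and assume
exactly this line format.

```python
import re

line = ("Node 0 DMA: 408*64kB (C) 408*128kB (C) 408*256kB (C) 408*512kB (C) "
        "408*1024kB (C) 406*2048kB (C) 199*4096kB (C) 97*8192kB (C) "
        "6*16384kB (C) = 3348992kB")

# Each entry is "<count>*<block size>kB (<migrate type markers>)".
entries = re.findall(r"(\d+)\*(\d+)kB \((\w+)\)", line)

total_kb = sum(int(count) * int(size) for count, size, _ in entries)
reported_kb = int(re.search(r"= (\d+)kB", line).group(1))
markers = {marker for _, _, marker in entries}

print(total_kb, reported_kb, markers)
```

The computed total matches the reported 3348992kB, and the only marker
that shows up is C. In other words, every free block listed for this
zone is CMA memory, which a non movable atomic allocation cannot use.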

meminfo details:

 MemTotal:       65875584 kB
 MemFree:         8001856 kB
 Buffers:        49330368 kB
 Cached:           178752 kB
 SwapCached:            0 kB
 Active:         28550464 kB
 Inactive:       25476416 kB
 Active(anon):    3771008 kB
 Inactive(anon):   767360 kB
 Active(file):   24779456 kB
 Inactive(file): 24709056 kB
 Unevictable:       15104 kB
 Mlocked:           15104 kB
 SwapTotal:       8384448 kB
 SwapFree:        8384448 kB
 Dirty:                 0 kB

-aneesh



Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-14 Thread Minchan Kim
Hello Heesub,

On Thu, May 15, 2014 at 11:45:31AM +0900, Heesub Shin wrote:
> Hello,
> 
> On 05/15/2014 10:53 AM, Joonsoo Kim wrote:
> >On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
> >>Hey Joonsoo,
> >>
> >>On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
> >>>CMA is introduced to provide physically contiguous pages at runtime.
> >>>For this purpose, it reserves memory at boot time. Although it reserve
> >>>memory, this reserved memory can be used for movable memory allocation
> >>>request. This usecase is beneficial to the system that needs this CMA
> >>>reserved memory infrequently and it is one of main purpose of
> >>>introducing CMA.
> >>>
> >>>But, there is a problem in current implementation. The problem is that
> >>>it works like as just reserved memory approach. The pages on cma reserved
> >>>memory are hardly used for movable memory allocation. This is caused by
> >>>combination of allocation and reclaim policy.
> >>>
> >>>The pages on cma reserved memory are allocated if there is no movable
> >>>memory, that is, as fallback allocation. So the time this fallback
> >>>allocation is started is under heavy memory pressure. Although it is under
> >>>memory pressure, movable allocation easily succeed, since there would be
> >>>many pages on cma reserved memory. But this is not the case for unmovable
> >>>and reclaimable allocation, because they can't use the pages on cma
> >>>reserved memory. These allocations regard system's free memory as
> >>>(free pages - free cma pages) on watermark checking, that is, free
> >>>unmovable pages + free reclaimable pages + free movable pages. Because
> >>>we already exhausted movable pages, only free pages we have are unmovable
> >>>and reclaimable types and this would be really small amount. So watermark
> >>>checking would be failed. It will wake up kswapd to make enough free
> >>>memory for unmovable and reclaimable allocation and kswapd will do.
> >>>So before we fully utilize pages on cma reserved memory, kswapd start to
> >>>reclaim memory and try to make free memory over the high watermark. This
> >>>watermark checking by kswapd doesn't take care free cma pages so many
> >>>movable pages would be reclaimed. After then, we have a lot of movable
> >>>pages again, so fallback allocation doesn't happen again. To conclude,
> >>>amount of free memory on meminfo which includes free CMA pages is moving
> >>>around 512 MB if I reserve 512 MB memory for CMA.
> >>>
> >>>I found this problem on following experiment.
> >>>
> >>>4 CPUs, 1024 MB, VIRTUAL MACHINE
> >>>make -j24
> >>>
> >>>CMA reserve:        0 MB        512 MB
> >>>Elapsed-time:       234.8       361.8
> >>>Average-MemFree:    283880 KB   530851 KB
> >>>
> >>>To solve this problem, I can think following 2 possible solutions.
> >>>1. allocate the pages on cma reserved memory first, and if they are
> >>>   exhausted, allocate movable pages.
> >>>2. interleaved allocation: try to allocate specific amounts of memory
> >>>   from cma reserved memory and then allocate from free movable memory.
> >>
> >>I love this idea but when I see the code, I don't like that.
> >>In allocation path, just try to allocate pages by round-robin so it's role
> >>of allocator. If one of migratetype is full, just pass mission to reclaimer
> >>with hint(ie, Hey reclaimer, it's non-movable allocation fail
> >>so there is pointless if you reclaim MIGRATE_CMA pages) so that
> >>reclaimer can filter it out during page scanning.
> >>We already have an tool to achieve it(ie, isolate_mode_t).
> >
> >Hello,
> >
> >I agree with leaving fast allocation path as simple as possible.
> >I will remove runtime computation for determining ratio in
> >__rmqueue_cma() and, instead, will use pre-computed value calculated
> >on the other path.
> >
> >I am not sure that whether your second suggestion(Hey relaimer part)
> >is good or not. In my quick thought, that could be helpful in the
> >situation that many free cma pages remained. But, it would be not helpful
> >when there are neither free movable and cma pages. In generally, most
> >workloads mainly uses movable pages for page cache or anonymous mapping.
> >Although reclaim is triggered by non-movable allocation failure, reclaimed
> >pages are used mostly by movable allocation. We can handle these allocation
> >request even if we reclaim the pages just in lru order. If we rotate
> >the lru list for finding movable pages, it could cause more useful
> >pages to be evicted.
> >
> >This is just my quick thought, so please let me correct if I am wrong.
> 
> We have an out of tree implementation that is completely the same
> with the approach Minchan said and it works, but it has definitely
> some side-effects as you pointed, distorting the LRU and evicting
> hot pages. I do not attach code fragments in this thread for some

Actually, Joonsoo and I discussed solving such a corner case in the future
if someone reported it, but you have already done it. Thanks!

LRU churning

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-14 Thread Heesub Shin

Hello,

On 05/15/2014 10:53 AM, Joonsoo Kim wrote:

>On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
>>Hey Joonsoo,
>>
>>On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
>>>CMA is introduced to provide physically contiguous pages at runtime.
>>>For this purpose, it reserves memory at boot time. Although it reserve
>>>memory, this reserved memory can be used for movable memory allocation
>>>request. This usecase is beneficial to the system that needs this CMA
>>>reserved memory infrequently and it is one of main purpose of
>>>introducing CMA.
>>>
>>>But, there is a problem in current implementation. The problem is that
>>>it works like as just reserved memory approach. The pages on cma reserved
>>>memory are hardly used for movable memory allocation. This is caused by
>>>combination of allocation and reclaim policy.
>>>
>>>The pages on cma reserved memory are allocated if there is no movable
>>>memory, that is, as fallback allocation. So the time this fallback
>>>allocation is started is under heavy memory pressure. Although it is under
>>>memory pressure, movable allocation easily succeed, since there would be
>>>many pages on cma reserved memory. But this is not the case for unmovable
>>>and reclaimable allocation, because they can't use the pages on cma
>>>reserved memory. These allocations regard system's free memory as
>>>(free pages - free cma pages) on watermark checking, that is, free
>>>unmovable pages + free reclaimable pages + free movable pages. Because
>>>we already exhausted movable pages, only free pages we have are unmovable
>>>and reclaimable types and this would be really small amount. So watermark
>>>checking would be failed. It will wake up kswapd to make enough free
>>>memory for unmovable and reclaimable allocation and kswapd will do.
>>>So before we fully utilize pages on cma reserved memory, kswapd start to
>>>reclaim memory and try to make free memory over the high watermark. This
>>>watermark checking by kswapd doesn't take care free cma pages so many
>>>movable pages would be reclaimed. After then, we have a lot of movable
>>>pages again, so fallback allocation doesn't happen again. To conclude,
>>>amount of free memory on meminfo which includes free CMA pages is moving
>>>around 512 MB if I reserve 512 MB memory for CMA.
>>>
>>>I found this problem on following experiment.
>>>
>>>4 CPUs, 1024 MB, VIRTUAL MACHINE
>>>make -j24
>>>
>>>CMA reserve:        0 MB        512 MB
>>>Elapsed-time:       234.8       361.8
>>>Average-MemFree:    283880 KB   530851 KB
>>>
>>>To solve this problem, I can think following 2 possible solutions.
>>>1. allocate the pages on cma reserved memory first, and if they are
>>>   exhausted, allocate movable pages.
>>>2. interleaved allocation: try to allocate specific amounts of memory
>>>   from cma reserved memory and then allocate from free movable memory.
>>
>>I love this idea but when I see the code, I don't like that.
>>In allocation path, just try to allocate pages by round-robin so it's role
>>of allocator. If one of migratetype is full, just pass mission to reclaimer
>>with hint(ie, Hey reclaimer, it's non-movable allocation fail
>>so there is pointless if you reclaim MIGRATE_CMA pages) so that
>>reclaimer can filter it out during page scanning.
>>We already have an tool to achieve it(ie, isolate_mode_t).
>
>Hello,
>
>I agree with leaving fast allocation path as simple as possible.
>I will remove runtime computation for determining ratio in
>__rmqueue_cma() and, instead, will use pre-computed value calculated
>on the other path.
>
>I am not sure that whether your second suggestion(Hey relaimer part)
>is good or not. In my quick thought, that could be helpful in the
>situation that many free cma pages remained. But, it would be not helpful
>when there are neither free movable and cma pages. In generally, most
>workloads mainly uses movable pages for page cache or anonymous mapping.
>Although reclaim is triggered by non-movable allocation failure, reclaimed
>pages are used mostly by movable allocation. We can handle these allocation
>request even if we reclaim the pages just in lru order. If we rotate
>the lru list for finding movable pages, it could cause more useful
>pages to be evicted.
>
>This is just my quick thought, so please let me correct if I am wrong.


We have an out of tree implementation that is completely the same with 
the approach Minchan said and it works, but it has definitely some 
side-effects as you pointed, distorting the LRU and evicting hot pages. 
I do not attach code fragments in this thread for some reasons, but it 
must be easy for yourself. I am wondering if it could help also in your 
case.
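
The side effect mentioned above can be illustrated with a toy model. The
Python sketch below is not the out of tree implementation (which is not
shown in this thread) and not kernel code. The page tuples, the function
names and the rule that skipped cma pages are rotated to the hot end of
the list are all assumptions made for illustration.

```python
from collections import deque

# Each page is (age_rank, is_cma); rank 0 is the coldest page.
PAGES = [(0, True), (1, True), (2, True), (3, False),
         (4, True), (5, False), (6, False), (7, False)]


def reclaim_lru_order(lru, need_non_cma):
    """Evict from the cold end, whatever the type, until enough
    non-cma pages have been freed."""
    evicted, freed = [], 0
    while lru and freed < need_non_cma:
        page = lru.popleft()
        evicted.append(page)
        if not page[1]:
            freed += 1
    return evicted


def reclaim_skip_cma(lru, need_non_cma):
    """Evict only non-cma pages. Skipped cma pages are rotated to the
    hot end of the list (an assumption of this model)."""
    evicted = []
    for _ in range(len(lru)):          # at most one pass over the list
        if len(evicted) >= need_non_cma:
            break
        page = lru.popleft()
        if page[1]:
            lru.append(page)           # a cold cma page now looks hot
        else:
            evicted.append(page)
    return evicted


def ranks(pages):
    return [rank for rank, _ in pages]


plain = deque(PAGES)
print(ranks(reclaim_lru_order(plain, 2)), ranks(plain))

hinted = deque(PAGES)
print(ranks(reclaim_skip_cma(hinted, 2)), ranks(hinted))
```

In this model, plain LRU order evicts six pages to free two non-cma
pages, while the hinted scan evicts only two. The price is the list
order: the cold cma pages 0, 1, 2 and 4 end up behind the hot pages 6
and 7, so a later reclaim pass in LRU order would evict 6 and 7 first.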


Thanks,
Heesub





>>And we couldn't do it in zone_watermark_ok with set/reset ALLOC_CMA?
>>If possible, it would be better because it's generic function to check
>>free pages and cause trigger reclaim/compaction logic.
>
>I guess, your *it* means ratio computation. Right?
>I don't like putting it on zone_watermark_ok(). Although it need to
>refer to free cma pages value which are also referred in zone_watermark_ok(),
>this computation is for determining ratio, not

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-14 Thread Minchan Kim
On Thu, May 15, 2014 at 10:53:01AM +0900, Joonsoo Kim wrote:
> On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
> > Hey Joonsoo,
> > 
> > On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
> > > CMA is introduced to provide physically contiguous pages at runtime.
> > > For this purpose, it reserves memory at boot time. Although it reserve
> > > memory, this reserved memory can be used for movable memory allocation
> > > request. This usecase is beneficial to the system that needs this CMA
> > > reserved memory infrequently and it is one of main purpose of
> > > introducing CMA.
> > > 
> > > But, there is a problem in current implementation. The problem is that
> > > it works like as just reserved memory approach. The pages on cma reserved
> > > memory are hardly used for movable memory allocation. This is caused by
> > > combination of allocation and reclaim policy.
> > > 
> > > The pages on cma reserved memory are allocated if there is no movable
> > > memory, that is, as fallback allocation. So the time this fallback
> > > allocation is started is under heavy memory pressure. Although it is under
> > > memory pressure, movable allocation easily succeed, since there would be
> > > many pages on cma reserved memory. But this is not the case for unmovable
> > > and reclaimable allocation, because they can't use the pages on cma
> > > reserved memory. These allocations regard system's free memory as
> > > (free pages - free cma pages) on watermark checking, that is, free
> > > unmovable pages + free reclaimable pages + free movable pages. Because
> > > we already exhausted movable pages, only free pages we have are unmovable
> > > and reclaimable types and this would be really small amount. So watermark
> > > checking would be failed. It will wake up kswapd to make enough free
> > > memory for unmovable and reclaimable allocation and kswapd will do.
> > > So before we fully utilize pages on cma reserved memory, kswapd start to
> > > reclaim memory and try to make free memory over the high watermark. This
> > > watermark checking by kswapd doesn't take care free cma pages so many
> > > movable pages would be reclaimed. After then, we have a lot of movable
> > > pages again, so fallback allocation doesn't happen again. To conclude,
> > > amount of free memory on meminfo which includes free CMA pages is moving
> > > around 512 MB if I reserve 512 MB memory for CMA.
> > > 
> > > I found this problem on following experiment.
> > > 
> > > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > > make -j24
> > > 
> > > CMA reserve:        0 MB        512 MB
> > > Elapsed-time:       234.8       361.8
> > > Average-MemFree:    283880 KB   530851 KB
> > > 
> > > To solve this problem, I can think following 2 possible solutions.
> > > 1. allocate the pages on cma reserved memory first, and if they are
> > >    exhausted, allocate movable pages.
> > > 2. interleaved allocation: try to allocate specific amounts of memory
> > >    from cma reserved memory and then allocate from free movable memory.
> > 
> > I love this idea but when I see the code, I don't like that.
> > In allocation path, just try to allocate pages by round-robin so it's role
> > of allocator. If one of migratetype is full, just pass mission to reclaimer
> > with hint(ie, Hey reclaimer, it's non-movable allocation fail
> > so there is pointless if you reclaim MIGRATE_CMA pages) so that
> > reclaimer can filter it out during page scanning.
> > We already have an tool to achieve it(ie, isolate_mode_t).
> 
> Hello,
> 
> I agree with leaving fast allocation path as simple as possible.
> I will remove runtime computation for determining ratio in
> __rmqueue_cma() and, instead, will use pre-computed value calculated
> on the other path.

Sounds good.

> 
> I am not sure that whether your second suggestion(Hey relaimer part)
> is good or not. In my quick thought, that could be helpful in the
> situation that many free cma pages remained. But, it would be not helpful
> when there are neither free movable and cma pages. In generally, most
> workloads mainly uses movable pages for page cache or anonymous mapping.
> Although reclaim is triggered by non-movable allocation failure, reclaimed
> pages are used mostly by movable allocation. We can handle these allocation
> request even if we reclaim the pages just in lru order. If we rotate
> the lru list for finding movable pages, it could cause more useful
> pages to be evicted.
> 
> This is just my quick thought, so please let me correct if I am wrong.

Why should the reclaimer reclaim unnecessary pages?
So, your answer is that it would be better because upcoming allocations
could then be served easily, without interruption. But it could reclaim
too many pages before the watermark for unmovable allocation is satisfied.
Sometimes you might even see OOM.

Moreover, how would you handle the current trouble?
For example, there is atomic allocation and the only thing to save the world
is kswapd because it's 

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-14 Thread Joonsoo Kim
On Wed, May 14, 2014 at 02:12:19PM +0530, Aneesh Kumar K.V wrote:
> Joonsoo Kim  writes:
> 
> > CMA is introduced to provide physically contiguous pages at runtime.
> > For this purpose, it reserves memory at boot time. Although it reserve
> > memory, this reserved memory can be used for movable memory allocation
> > request. This usecase is beneficial to the system that needs this CMA
> > reserved memory infrequently and it is one of main purpose of
> > introducing CMA.
> >
> > But, there is a problem in current implementation. The problem is that
> > it works like as just reserved memory approach. The pages on cma reserved
> > memory are hardly used for movable memory allocation. This is caused by
> > combination of allocation and reclaim policy.
> >
> > The pages on cma reserved memory are allocated if there is no movable
> > memory, that is, as fallback allocation. So the time this fallback
> > allocation is started is under heavy memory pressure. Although it is under
> > memory pressure, movable allocation easily succeed, since there would be
> > many pages on cma reserved memory. But this is not the case for unmovable
> > and reclaimable allocation, because they can't use the pages on cma
> > reserved memory. These allocations regard system's free memory as
> > (free pages - free cma pages) on watermark checking, that is, free
> > unmovable pages + free reclaimable pages + free movable pages. Because
> > we already exhausted movable pages, only free pages we have are unmovable
> > and reclaimable types and this would be really small amount. So watermark
> > checking would be failed. It will wake up kswapd to make enough free
> > memory for unmovable and reclaimable allocation and kswapd will do.
> > So before we fully utilize pages on cma reserved memory, kswapd start to
> > reclaim memory and try to make free memory over the high watermark. This
> > watermark checking by kswapd doesn't take care free cma pages so many
> > movable pages would be reclaimed. After then, we have a lot of movable
> > pages again, so fallback allocation doesn't happen again. To conclude,
> > amount of free memory on meminfo which includes free CMA pages is moving
> > around 512 MB if I reserve 512 MB memory for CMA.
> 
> 
> Another issue i am facing with the current code is the atomic allocation
> failing even with large number of CMA pages around. In my case we never
> reclaimed because large part of the memory is consumed by the page cache and
> for that, free memory check doesn't include at free_cma. I will test
> with this patchset and update here once i have the results.
> 

Hello,

Could you elaborate more on your issue?
I can't completely understand your problem.
So your atomic allocation is movable? And although there are many free
cma pages, that request fails?


> >
> > I found this problem on following experiment.
> >
> > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > make -j24
> >
> > CMA reserve:        0 MB        512 MB
> > Elapsed-time:       234.8       361.8
> > Average-MemFree:    283880 KB   530851 KB
> >
> > To solve this problem, I can think following 2 possible solutions.
> > 1. allocate the pages on cma reserved memory first, and if they are
> >    exhausted, allocate movable pages.
> > 2. interleaved allocation: try to allocate specific amounts of memory
> >    from cma reserved memory and then allocate from free movable memory.
> >
> > I tested #1 approach and found the problem. Although free memory on
> > meminfo can move around low watermark, there is large fluctuation on free
> > memory, because too many pages are reclaimed when kswapd is invoked.
> > Reason for this behaviour is that successive allocated CMA pages are
> > on the LRU list in that order and kswapd reclaim them in same order.
> > These memory doesn't help watermark checking from kswapd, so too many
> > pages are reclaimed, I guess.
> >
> > So, I implement #2 approach.
> > One thing I should note is that we should not change allocation target
> > (movable list or cma) on each allocation attempt, since this prevent
> > allocated pages to be in physically succession, so some I/O devices can
> > be hurt their performance. To solve this, I keep allocation target
> > in at least pageblock_nr_pages attempts and make this number reflect
> > ratio, free pages without free cma pages to free cma pages. With this
> > approach, system works very smoothly and fully utilize the pages on
> > cma reserved memory.
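
The interleaving rule described in the quoted text can be sketched
roughly as follows. This is a simplified Python model, not the patch
itself: batch_sizes(), allocation_targets() and the value 512 for
PAGEBLOCK_NR_PAGES are assumptions made for illustration, and the real
code may compute and refresh the ratio differently.

```python
PAGEBLOCK_NR_PAGES = 512  # assumed value; it depends on the architecture


def batch_sizes(free_normal, free_cma, base=PAGEBLOCK_NR_PAGES):
    """Return (normal_batch, cma_batch), the number of consecutive
    attempts spent on each target.

    The side with fewer free pages gets `base` attempts and the other
    side is scaled by the ratio of free pages.
    """
    if free_cma == 0:
        return base, 0
    if free_normal == 0:
        return 0, base
    if free_normal >= free_cma:
        return base * free_normal // free_cma, base
    return base, base * free_cma // free_normal


def allocation_targets(free_normal, free_cma, attempts):
    """Yield 'movable' or 'cma' for each allocation attempt."""
    normal_batch, cma_batch = batch_sizes(free_normal, free_cma)
    cycle = normal_batch + cma_batch
    for i in range(attempts):
        yield "movable" if (i % cycle) < normal_batch else "cma"


print(batch_sizes(3000, 1000))
targets = list(allocation_targets(1000, 1000, 2048))
print(targets[511], targets[512], targets[1024])
```

Because the target only changes after a whole batch of at least
PAGEBLOCK_NR_PAGES attempts, consecutive allocations tend to come from
the same area, which is the property the description asks for.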
> >
> > Following is the experimental result of this patch.
> >
> > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > make -j24
> >
> > 
> > CMA reserve:        0 MB        512 MB
> > Elapsed-time:       234.8       361.8
> > Average-MemFree:    283880 KB   530851 KB
> > pswpin:             7           110064
> > pswpout:            452         767502
> >
> > 
> > CMA reserve:        0 MB        512 MB
> > Elapsed-time:       234.2       235.6
> > Average-MemFree:    2816

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-14 Thread Joonsoo Kim
On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
> Hey Joonsoo,
> 
> On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
> > CMA is introduced to provide physically contiguous pages at runtime.
> > For this purpose, it reserves memory at boot time. Although it reserve
> > memory, this reserved memory can be used for movable memory allocation
> > request. This usecase is beneficial to the system that needs this CMA
> > reserved memory infrequently and it is one of main purpose of
> > introducing CMA.
> > 
> > But, there is a problem in current implementation. The problem is that
> > it works like as just reserved memory approach. The pages on cma reserved
> > memory are hardly used for movable memory allocation. This is caused by
> > combination of allocation and reclaim policy.
> > 
> > The pages on cma reserved memory are allocated if there is no movable
> > memory, that is, as fallback allocation. So the time this fallback
> > allocation is started is under heavy memory pressure. Although it is under
> > memory pressure, movable allocation easily succeed, since there would be
> > many pages on cma reserved memory. But this is not the case for unmovable
> > and reclaimable allocation, because they can't use the pages on cma
> > reserved memory. These allocations regard system's free memory as
> > (free pages - free cma pages) on watermark checking, that is, free
> > unmovable pages + free reclaimable pages + free movable pages. Because
> > we already exhausted movable pages, only free pages we have are unmovable
> > and reclaimable types and this would be really small amount. So watermark
> > checking would be failed. It will wake up kswapd to make enough free
> > memory for unmovable and reclaimable allocation and kswapd will do.
> > So before we fully utilize pages on cma reserved memory, kswapd start to
> > reclaim memory and try to make free memory over the high watermark. This
> > watermark checking by kswapd doesn't take care free cma pages so many
> > movable pages would be reclaimed. After then, we have a lot of movable
> > pages again, so fallback allocation doesn't happen again. To conclude,
> > amount of free memory on meminfo which includes free CMA pages is moving
> > around 512 MB if I reserve 512 MB memory for CMA.
> > 
> > I found this problem on following experiment.
> > 
> > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > make -j24
> > 
> > CMA reserve:        0 MB        512 MB
> > Elapsed-time:       234.8       361.8
> > Average-MemFree:    283880 KB   530851 KB
> > 
> > To solve this problem, I can think following 2 possible solutions.
> > 1. allocate the pages on cma reserved memory first, and if they are
> >    exhausted, allocate movable pages.
> > 2. interleaved allocation: try to allocate specific amounts of memory
> >    from cma reserved memory and then allocate from free movable memory.
> 
> I love this idea but when I see the code, I don't like that.
> In allocation path, just try to allocate pages by round-robin so it's role
> of allocator. If one of migratetype is full, just pass mission to reclaimer
> with hint(ie, Hey reclaimer, it's non-movable allocation fail
> so there is pointless if you reclaim MIGRATE_CMA pages) so that
> reclaimer can filter it out during page scanning.
> We already have an tool to achieve it(ie, isolate_mode_t).

Hello,

I agree with keeping the fast allocation path as simple as possible.
I will remove the runtime computation that determines the ratio in
__rmqueue_cma() and will instead use a pre-computed value calculated
on another path.

I am not sure whether your second suggestion (the "Hey reclaimer" part)
is good or not. At first thought, it could help in the situation where
many free CMA pages remain. But it would not help when there are neither
free movable pages nor free CMA pages. In general, most workloads mainly
use movable pages, for page cache or anonymous mappings. Although reclaim
is triggered by a non-movable allocation failure, the reclaimed pages are
mostly used by movable allocations. We can handle these allocation
requests even if we reclaim the pages just in LRU order. If we rotate
the LRU list to find movable pages, it could cause more useful
pages to be evicted.

This is just my quick thought, so please correct me if I am wrong.

> 
> And we couldn't do it in zone_watermark_ok with set/reset ALLOC_CMA?
> If possible, it would be better becauser it's generic function to check
> free pages and cause trigger reclaim/compaction logic.

I guess your *it* means the ratio computation. Right?
I don't like putting it in zone_watermark_ok(). Although it needs to
refer to the free CMA pages value, which zone_watermark_ok() also reads,
this computation is for determining the ratio, not for triggering
reclaim/compaction. And zone_watermark_ok() is on a hotter path, so
putting this logic into zone_watermark_ok() doesn't look better to me.

I will think about a better place to do it.

Thanks.

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-14 Thread Aneesh Kumar K.V
Joonsoo Kim writes:

> CMA is introduced to provide physically contiguous pages at runtime.
> For this purpose, it reserves memory at boot time. Although it reserve
> memory, this reserved memory can be used for movable memory allocation
> request. This usecase is beneficial to the system that needs this CMA
> reserved memory infrequently and it is one of main purpose of
> introducing CMA.
>
> But, there is a problem in current implementation. The problem is that
> it works like as just reserved memory approach. The pages on cma reserved
> memory are hardly used for movable memory allocation. This is caused by
> combination of allocation and reclaim policy.
>
> The pages on cma reserved memory are allocated if there is no movable
> memory, that is, as fallback allocation. So the time this fallback
> allocation is started is under heavy memory pressure. Although it is under
> memory pressure, movable allocation easily succeed, since there would be
> many pages on cma reserved memory. But this is not the case for unmovable
> and reclaimable allocation, because they can't use the pages on cma
> reserved memory. These allocations regard system's free memory as
> (free pages - free cma pages) on watermark checking, that is, free
> unmovable pages + free reclaimable pages + free movable pages. Because
> we already exhausted movable pages, only free pages we have are unmovable
> and reclaimable types and this would be really small amount. So watermark
> checking would be failed. It will wake up kswapd to make enough free
> memory for unmovable and reclaimable allocation and kswapd will do.
> So before we fully utilize pages on cma reserved memory, kswapd start to
> reclaim memory and try to make free memory over the high watermark. This
> watermark checking by kswapd doesn't take care free cma pages so many
> movable pages would be reclaimed. After then, we have a lot of movable
> pages again, so fallback allocation doesn't happen again. To conclude,
> amount of free memory on meminfo which includes free CMA pages is moving
> around 512 MB if I reserve 512 MB memory for CMA.


Another issue I am facing with the current code is atomic allocations
failing even with a large number of CMA pages around. In my case we never
reclaimed, because a large part of the memory is consumed by the page cache
and, for that, the free memory check doesn't include free_cma. I will test
with this patchset and update here once I have the results.

>
> I found this problem on following experiment.
>
> 4 CPUs, 1024 MB, VIRTUAL MACHINE
> make -j24
>
> CMA reserve:        0 MB        512 MB
> Elapsed-time:       234.8       361.8
> Average-MemFree:    283880 KB   530851 KB
>
> To solve this problem, I can think following 2 possible solutions.
> 1. allocate the pages on cma reserved memory first, and if they are
>    exhausted, allocate movable pages.
> 2. interleaved allocation: try to allocate specific amounts of memory
>    from cma reserved memory and then allocate from free movable memory.
>
> I tested #1 approach and found the problem. Although free memory on
> meminfo can move around low watermark, there is large fluctuation on free
> memory, because too many pages are reclaimed when kswapd is invoked.
> Reason for this behaviour is that successive allocated CMA pages are
> on the LRU list in that order and kswapd reclaim them in same order.
> These memory doesn't help watermark checking from kwapd, so too many
> pages are reclaimed, I guess.
>
> So, I implement #2 approach.
> One thing I should note is that we should not change allocation target
> (movable list or cma) on each allocation attempt, since this prevent
> allocated pages to be in physically succession, so some I/O devices can
> be hurt their performance. To solve this, I keep allocation target
> in at least pageblock_nr_pages attempts and make this number reflect
> ratio, free pages without free cma pages to free cma pages. With this
> approach, system works very smoothly and fully utilize the pages on
> cma reserved memory.
>
> Following is the experimental result of this patch.
>
> 4 CPUs, 1024 MB, VIRTUAL MACHINE
> make -j24
>
> 
> CMA reserve:        0 MB        512 MB
> Elapsed-time:       234.8       361.8
> Average-MemFree:    283880 KB   530851 KB
> pswpin:             7           110064
> pswpout:            452         767502
>
> 
> CMA reserve:        0 MB        512 MB
> Elapsed-time:       234.2       235.6
> Average-MemFree:    281651 KB   290227 KB
> pswpin:             8           8
> pswpout:            430         510
>
> There is no difference if we don't have cma reserved memory (0 MB case).
> But, with cma reserved memory (512 MB case), we fully utilize these
> reserved memory through this patch and the system behaves like as
> it doesn't reserve any memory.
>
> With this patch, we aggressively allocate the pages on cma reserved memory
> so latency of CMA 

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-12 Thread Minchan Kim
On Mon, May 12, 2014 at 10:04:29AM -0700, Laura Abbott wrote:
> Hi,
> 
> On 5/7/2014 5:32 PM, Joonsoo Kim wrote:
> > CMA is introduced to provide physically contiguous pages at runtime.
> > For this purpose, it reserves memory at boot time. Although it reserve
> > memory, this reserved memory can be used for movable memory allocation
> > request. This usecase is beneficial to the system that needs this CMA
> > reserved memory infrequently and it is one of main purpose of
> > introducing CMA.
> > 
> > But, there is a problem in current implementation. The problem is that
> > it works like as just reserved memory approach. The pages on cma reserved
> > memory are hardly used for movable memory allocation. This is caused by
> > combination of allocation and reclaim policy.
> > 
> > The pages on cma reserved memory are allocated if there is no movable
> > memory, that is, as fallback allocation. So the time this fallback
> > allocation is started is under heavy memory pressure. Although it is under
> > memory pressure, movable allocation easily succeed, since there would be
> > many pages on cma reserved memory. But this is not the case for unmovable
> > and reclaimable allocation, because they can't use the pages on cma
> > reserved memory. These allocations regard system's free memory as
> > (free pages - free cma pages) on watermark checking, that is, free
> > unmovable pages + free reclaimable pages + free movable pages. Because
> > we already exhausted movable pages, only free pages we have are unmovable
> > and reclaimable types and this would be really small amount. So watermark
> > checking would be failed. It will wake up kswapd to make enough free
> > memory for unmovable and reclaimable allocation and kswapd will do.
> > So before we fully utilize pages on cma reserved memory, kswapd start to
> > reclaim memory and try to make free memory over the high watermark. This
> > watermark checking by kswapd doesn't take care free cma pages so many
> > movable pages would be reclaimed. After then, we have a lot of movable
> > pages again, so fallback allocation doesn't happen again. To conclude,
> > amount of free memory on meminfo which includes free CMA pages is moving
> > around 512 MB if I reserve 512 MB memory for CMA.
> > 
> > I found this problem on following experiment.
> > 
> > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > make -j24
> > 
> > CMA reserve:        0 MB        512 MB
> > Elapsed-time:       234.8       361.8
> > Average-MemFree:    283880 KB   530851 KB
> > 
> > To solve this problem, I can think following 2 possible solutions.
> > 1. allocate the pages on cma reserved memory first, and if they are
> >    exhausted, allocate movable pages.
> > 2. interleaved allocation: try to allocate specific amounts of memory
> >    from cma reserved memory and then allocate from free movable memory.
> > 
> > I tested #1 approach and found the problem. Although free memory on
> > meminfo can move around low watermark, there is large fluctuation on free
> > memory, because too many pages are reclaimed when kswapd is invoked.
> > Reason for this behaviour is that successive allocated CMA pages are
> > on the LRU list in that order and kswapd reclaim them in same order.
> > These memory doesn't help watermark checking from kwapd, so too many
> > pages are reclaimed, I guess.
> > 
> 
> We have an out of tree implementation of #1 and so far it's worked for us
> although we weren't looking at the same metrics. I don't completely
> understand the issue you pointed out with #1. It sounds like the issue is
> that CMA pages are already in use by other processes and on LRU lists and
> because the pages are on LRU lists these aren't counted towards the
> watermark by kswapd. Is my understanding correct?

kswapd can reclaim MIGRATE_CMA pages unconditionally even though the
allocator path failed on a non-movable allocation. That is pointless and
should be fixed.

> 
> > So, I implement #2 approach.
> > One thing I should note is that we should not change allocation target
> > (movable list or cma) on each allocation attempt, since this prevent
> > allocated pages to be in physically succession, so some I/O devices can
> > be hurt their performance. To solve this, I keep allocation target
> > in at least pageblock_nr_pages attempts and make this number reflect
> > ratio, free pages without free cma pages to free cma pages. With this
> > approach, system works very smoothly and fully utilize the pages on
> > cma reserved memory.
> > 
> > Following is the experimental result of this patch.
> > 
> > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > make -j24
> > 
> > 
> > CMA reserve:        0 MB        512 MB
> > Elapsed-time:       234.8       361.8
> > Average-MemFree:    283880 KB   530851 KB
> > pswpin:             7           110064
> > pswpout:            452         767502
> > 
> > 
> > CMA reserve:0 MB512 MB
> > Elapsed-time:   234.2

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-12 Thread Minchan Kim
Hey Joonsoo,

On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
> CMA is introduced to provide physically contiguous pages at runtime.
> For this purpose, it reserves memory at boot time. Although it reserve
> memory, this reserved memory can be used for movable memory allocation
> request. This usecase is beneficial to the system that needs this CMA
> reserved memory infrequently and it is one of main purpose of
> introducing CMA.
> 
> But, there is a problem in current implementation. The problem is that
> it works like as just reserved memory approach. The pages on cma reserved
> memory are hardly used for movable memory allocation. This is caused by
> combination of allocation and reclaim policy.
> 
> The pages on cma reserved memory are allocated if there is no movable
> memory, that is, as fallback allocation. So the time this fallback
> allocation is started is under heavy memory pressure. Although it is under
> memory pressure, movable allocation easily succeed, since there would be
> many pages on cma reserved memory. But this is not the case for unmovable
> and reclaimable allocation, because they can't use the pages on cma
> reserved memory. These allocations regard system's free memory as
> (free pages - free cma pages) on watermark checking, that is, free
> unmovable pages + free reclaimable pages + free movable pages. Because
> we already exhausted movable pages, only free pages we have are unmovable
> and reclaimable types and this would be really small amount. So watermark
> checking would be failed. It will wake up kswapd to make enough free
> memory for unmovable and reclaimable allocation and kswapd will do.
> So before we fully utilize pages on cma reserved memory, kswapd start to
> reclaim memory and try to make free memory over the high watermark. This
> watermark checking by kswapd doesn't take care free cma pages so many
> movable pages would be reclaimed. After then, we have a lot of movable
> pages again, so fallback allocation doesn't happen again. To conclude,
> amount of free memory on meminfo which includes free CMA pages is moving
> around 512 MB if I reserve 512 MB memory for CMA.
> 
> I found this problem on following experiment.
> 
> 4 CPUs, 1024 MB, VIRTUAL MACHINE
> make -j24
> 
> CMA reserve:        0 MB        512 MB
> Elapsed-time:       234.8       361.8
> Average-MemFree:    283880 KB   530851 KB
> 
> To solve this problem, I can think following 2 possible solutions.
> 1. allocate the pages on cma reserved memory first, and if they are
>    exhausted, allocate movable pages.
> 2. interleaved allocation: try to allocate specific amounts of memory
>    from cma reserved memory and then allocate from free movable memory.

I love this idea, but when I look at the code, I don't like how it is done.
In the allocation path, just try to allocate pages round-robin; that is
the allocator's role. If one of the migratetypes is full, just hand the job
to the reclaimer with a hint (i.e., "Hey reclaimer, this is a non-movable
allocation failure, so it is pointless to reclaim MIGRATE_CMA pages") so
that the reclaimer can filter them out during page scanning.
We already have a tool to achieve this (i.e., isolate_mode_t).

And couldn't we do it in zone_watermark_ok() by setting/resetting ALLOC_CMA?
If possible, that would be better, because it is the generic function that
checks free pages and triggers the reclaim/compaction logic.

> 
> I tested #1 approach and found the problem. Although free memory on
> meminfo can move around low watermark, there is large fluctuation on free
> memory, because too many pages are reclaimed when kswapd is invoked.
> Reason for this behaviour is that successive allocated CMA pages are
> on the LRU list in that order and kswapd reclaim them in same order.
> These memory doesn't help watermark checking from kwapd, so too many
> pages are reclaimed, I guess.
> 
> So, I implement #2 approach.
> One thing I should note is that we should not change allocation target
> (movable list or cma) on each allocation attempt, since this prevent
> allocated pages to be in physically succession, so some I/O devices can
> be hurt their performance. To solve this, I keep allocation target
> in at least pageblock_nr_pages attempts and make this number reflect
> ratio, free pages without free cma pages to free cma pages. With this
> approach, system works very smoothly and fully utilize the pages on
> cma reserved memory.
> 
> Following is the experimental result of this patch.
> 
> 4 CPUs, 1024 MB, VIRTUAL MACHINE
> make -j24
> 
> 
> CMA reserve:        0 MB        512 MB
> Elapsed-time:       234.8       361.8
> Average-MemFree:    283880 KB   530851 KB
> pswpin:             7           110064
> pswpout:            452         767502
> 
> 
> CMA reserve:0 MB512 MB
> Elapsed-time:   234.2   235.6
> Average-MemFree:281651 KB   290227 KB
> pswpin: 8   8
> pswpout:430   

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-12 Thread Joonsoo Kim
On Mon, May 12, 2014 at 10:04:29AM -0700, Laura Abbott wrote:
> Hi,
> 
> On 5/7/2014 5:32 PM, Joonsoo Kim wrote:
> > CMA is introduced to provide physically contiguous pages at runtime.
> > For this purpose, it reserves memory at boot time. Although it reserve
> > memory, this reserved memory can be used for movable memory allocation
> > request. This usecase is beneficial to the system that needs this CMA
> > reserved memory infrequently and it is one of main purpose of
> > introducing CMA.
> > 
> > But, there is a problem in current implementation. The problem is that
> > it works like as just reserved memory approach. The pages on cma reserved
> > memory are hardly used for movable memory allocation. This is caused by
> > combination of allocation and reclaim policy.
> > 
> > The pages on cma reserved memory are allocated if there is no movable
> > memory, that is, as fallback allocation. So the time this fallback
> > allocation is started is under heavy memory pressure. Although it is under
> > memory pressure, movable allocation easily succeed, since there would be
> > many pages on cma reserved memory. But this is not the case for unmovable
> > and reclaimable allocation, because they can't use the pages on cma
> > reserved memory. These allocations regard system's free memory as
> > (free pages - free cma pages) on watermark checking, that is, free
> > unmovable pages + free reclaimable pages + free movable pages. Because
> > we already exhausted movable pages, only free pages we have are unmovable
> > and reclaimable types and this would be really small amount. So watermark
> > checking would be failed. It will wake up kswapd to make enough free
> > memory for unmovable and reclaimable allocation and kswapd will do.
> > So before we fully utilize pages on cma reserved memory, kswapd start to
> > reclaim memory and try to make free memory over the high watermark. This
> > watermark checking by kswapd doesn't take care free cma pages so many
> > movable pages would be reclaimed. After then, we have a lot of movable
> > pages again, so fallback allocation doesn't happen again. To conclude,
> > amount of free memory on meminfo which includes free CMA pages is moving
> > around 512 MB if I reserve 512 MB memory for CMA.
> > 
> > I found this problem on following experiment.
> > 
> > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > make -j24
> > 
> > CMA reserve:        0 MB        512 MB
> > Elapsed-time:       234.8       361.8
> > Average-MemFree:    283880 KB   530851 KB
> > 
> > To solve this problem, I can think following 2 possible solutions.
> > 1. allocate the pages on cma reserved memory first, and if they are
> >    exhausted, allocate movable pages.
> > 2. interleaved allocation: try to allocate specific amounts of memory
> >    from cma reserved memory and then allocate from free movable memory.
> > 
> > I tested #1 approach and found the problem. Although free memory on
> > meminfo can move around low watermark, there is large fluctuation on free
> > memory, because too many pages are reclaimed when kswapd is invoked.
> > Reason for this behaviour is that successive allocated CMA pages are
> > on the LRU list in that order and kswapd reclaim them in same order.
> > These memory doesn't help watermark checking from kwapd, so too many
> > pages are reclaimed, I guess.
> > 
> 
> We have an out of tree implementation of #1 and so far it's worked for us
> although we weren't looking at the same metrics. I don't completely
> understand the issue you pointed out with #1. It sounds like the issue is
> that CMA pages are already in use by other processes and on LRU lists and
> because the pages are on LRU lists these aren't counted towards the
> watermark by kswapd. Is my understanding correct?

Hello,

Yes, your understanding is correct.
kswapd wants to reclaim normal (non-CMA) pages, but with approach #1 the
LRU lists can contain long consecutive runs of CMA pages, so the watermark
isn't restored easily.
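
The effect described here can be illustrated with a toy model. This is
only a sketch and not kernel code: the LRU is reduced to a plain list that
is reclaimed front to back, and the names (pages_reclaimed, the "cma" and
"normal" labels) and all page counts are assumptions made up for the
illustration. It counts how many pages must be reclaimed before a given
number of non-CMA pages has been freed.

```python
def pages_reclaimed(lru, needed_normal):
    """Reclaim pages in LRU order until enough non-CMA pages are freed.

    Returns the total number of pages reclaimed (CMA and non-CMA).
    """
    reclaimed = 0
    freed_normal = 0
    for page in lru:
        if freed_normal >= needed_normal:
            break
        reclaimed += 1
        if page == "normal":
            freed_normal += 1
    return reclaimed


# Approach #1 (hypothetical layout): CMA pages were allocated first, so a
# long consecutive run of them is reclaimed before any normal page.
lru_cma_first = ["cma"] * 1000 + ["normal"] * 1000

# Interleaved layout (hypothetical run length of 10 pages per target).
lru_interleaved = (["cma"] * 10 + ["normal"] * 10) * 100

cma_first_cost = pages_reclaimed(lru_cma_first, 100)
interleaved_cost = pages_reclaimed(lru_interleaved, 100)
print(cma_first_cost, interleaved_cost)
```

In this model the long run of CMA pages is reclaimed in full before the
first non-CMA page is freed, which is the over-reclaim discussed above.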


> 
> > So, I implement #2 approach.
> > One thing I should note is that we should not change allocation target
> > (movable list or cma) on each allocation attempt, since this prevent
> > allocated pages to be in physically succession, so some I/O devices can
> > be hurt their performance. To solve this, I keep allocation target
> > in at least pageblock_nr_pages attempts and make this number reflect
> > ratio, free pages without free cma pages to free cma pages. With this
> > approach, system works very smoothly and fully utilize the pages on
> > cma reserved memory.
> > 
> > Following is the experimental result of this patch.
> > 
> > 4 CPUs, 1024 MB, VIRTUAL MACHINE
> > make -j24
> > 
> > 
> > CMA reserve:        0 MB        512 MB
> > Elapsed-time:       234.8       361.8
> > Average-MemFree:    283880 KB   530851 KB
> > pswpin:             7           110064
> > pswpout:            452         767502
> > 
> > 
> > CMA reserve:0 MB   

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-12 Thread Laura Abbott
Hi,

On 5/7/2014 5:32 PM, Joonsoo Kim wrote:
> CMA is introduced to provide physically contiguous pages at runtime.
> For this purpose, it reserves memory at boot time. Although it reserve
> memory, this reserved memory can be used for movable memory allocation
> request. This usecase is beneficial to the system that needs this CMA
> reserved memory infrequently and it is one of main purpose of
> introducing CMA.
> 
> But, there is a problem in current implementation. The problem is that
> it works like as just reserved memory approach. The pages on cma reserved
> memory are hardly used for movable memory allocation. This is caused by
> combination of allocation and reclaim policy.
> 
> The pages on cma reserved memory are allocated if there is no movable
> memory, that is, as fallback allocation. So the time this fallback
> allocation is started is under heavy memory pressure. Although it is under
> memory pressure, movable allocation easily succeed, since there would be
> many pages on cma reserved memory. But this is not the case for unmovable
> and reclaimable allocation, because they can't use the pages on cma
> reserved memory. These allocations regard system's free memory as
> (free pages - free cma pages) on watermark checking, that is, free
> unmovable pages + free reclaimable pages + free movable pages. Because
> we already exhausted movable pages, only free pages we have are unmovable
> and reclaimable types and this would be really small amount. So watermark
> checking would be failed. It will wake up kswapd to make enough free
> memory for unmovable and reclaimable allocation and kswapd will do.
> So before we fully utilize pages on cma reserved memory, kswapd start to
> reclaim memory and try to make free memory over the high watermark. This
> watermark checking by kswapd doesn't take care free cma pages so many
> movable pages would be reclaimed. After then, we have a lot of movable
> pages again, so fallback allocation doesn't happen again. To conclude,
> amount of free memory on meminfo which includes free CMA pages is moving
> around 512 MB if I reserve 512 MB memory for CMA.
> 
> I found this problem on following experiment.
> 
> 4 CPUs, 1024 MB, VIRTUAL MACHINE
> make -j24
> 
> CMA reserve:        0 MB        512 MB
> Elapsed-time:       234.8       361.8
> Average-MemFree:    283880 KB   530851 KB
> 
> To solve this problem, I can think following 2 possible solutions.
> 1. allocate the pages on cma reserved memory first, and if they are
>    exhausted, allocate movable pages.
> 2. interleaved allocation: try to allocate specific amounts of memory
>    from cma reserved memory and then allocate from free movable memory.
> 
> I tested #1 approach and found the problem. Although free memory on
> meminfo can move around low watermark, there is large fluctuation on free
> memory, because too many pages are reclaimed when kswapd is invoked.
> Reason for this behaviour is that successive allocated CMA pages are
> on the LRU list in that order and kswapd reclaim them in same order.
> These memory doesn't help watermark checking from kwapd, so too many
> pages are reclaimed, I guess.
> 

We have an out of tree implementation of #1 and so far it's worked for us
although we weren't looking at the same metrics. I don't completely
understand the issue you pointed out with #1. It sounds like the issue is
that CMA pages are already in use by other processes and on LRU lists and
because the pages are on LRU lists these aren't counted towards the
watermark by kswapd. Is my understanding correct?

> So, I implement #2 approach.
> One thing I should note is that we should not change allocation target
> (movable list or cma) on each allocation attempt, since this prevent
> allocated pages to be in physically succession, so some I/O devices can
> be hurt their performance. To solve this, I keep allocation target
> in at least pageblock_nr_pages attempts and make this number reflect
> ratio, free pages without free cma pages to free cma pages. With this
> approach, system works very smoothly and fully utilize the pages on
> cma reserved memory.
> 
> Following is the experimental result of this patch.
> 
> 4 CPUs, 1024 MB, VIRTUAL MACHINE
> make -j24
> 
> 
> CMA reserve:        0 MB        512 MB
> Elapsed-time:       234.8       361.8
> Average-MemFree:    283880 KB   530851 KB
> pswpin:             7           110064
> pswpout:            452         767502
> 
> 
> CMA reserve:        0 MB        512 MB
> Elapsed-time:       234.2       235.6
> Average-MemFree:    281651 KB   290227 KB
> pswpin:             8           8
> pswpout:            430         510
> 
> There is no difference if we don't have cma reserved memory (0 MB case).
> But, with cma reserved memory (512 MB case), we fully utilize these
> reserved memory through this patch and the system behaves like as
> it doesn't reserve any memory.

What m

Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-09 Thread Michal Nazarewicz
On Wed, May 07 2014, Joonsoo Kim wrote:
> Signed-off-by: Joonsoo Kim 

The code looks good to me, but I don't feel competent on whether the
approach is beneficial or not.  Still:

Acked-by: Michal Nazarewicz 


-- 
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +--ooO--(_)--Ooo--




[RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used

2014-05-07 Thread Joonsoo Kim
CMA was introduced to provide physically contiguous pages at runtime.
For this purpose, it reserves memory at boot time. Although it reserves
this memory, the reserved memory can still be used for movable memory
allocation requests. This use case benefits systems that need the CMA
reserved memory only infrequently, and it is one of the main purposes of
introducing CMA.

But there is a problem in the current implementation: it behaves just
like a plain reserved-memory approach. The pages on CMA reserved memory
are hardly ever used for movable memory allocation. This is caused by the
combination of the allocation and reclaim policies.

The pages on CMA reserved memory are allocated only when there is no free
movable memory, that is, as a fallback allocation. So this fallback
allocation starts only under heavy memory pressure. Even under memory
pressure, movable allocations easily succeed, since there are many free
pages on CMA reserved memory. But this is not the case for unmovable and
reclaimable allocations, because they can't use the pages on CMA reserved
memory. For watermark checking, these allocations regard the system's free
memory as (free pages - free CMA pages), that is, free unmovable pages +
free reclaimable pages + free movable pages. Because we have already
exhausted the movable pages, the only free pages left are of the unmovable
and reclaimable types, and that is a really small amount. So the watermark
check fails. This wakes up kswapd to free enough memory for unmovable and
reclaimable allocations, and kswapd does so.
So before we fully utilize the pages on CMA reserved memory, kswapd starts
to reclaim memory and tries to bring free memory above the high watermark.
This watermark check by kswapd doesn't count free CMA pages, so many
movable pages are reclaimed. After that, we have a lot of free movable
pages again, so the fallback allocation doesn't happen again. As a result,
the amount of free memory in meminfo, which includes free CMA pages, hovers
around 512 MB if I reserve 512 MB of memory for CMA.
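
The watermark behaviour described above can be illustrated with a small
model. This is only a sketch, not the kernel's zone_watermark_ok(): the
function name watermark_ok, the page size, the page counts and the
watermark value are all assumptions chosen for the illustration. The only
thing taken from the description is that allocations which cannot use CMA
pages subtract the free CMA pages before comparing against the watermark.

```python
def watermark_ok(free_pages, free_cma_pages, watermark, can_use_cma):
    """Simplified watermark check.

    Allocations that cannot use CMA pages (unmovable, reclaimable) only
    count the free pages outside the CMA reserved area.
    """
    usable = free_pages if can_use_cma else free_pages - free_cma_pages
    return usable > watermark


# Hypothetical state: 512 MB of free CMA pages, very little other free memory.
PAGE_KB = 4                              # assumed page size
free_cma = 512 * 1024 // PAGE_KB         # free CMA pages
free_other = 2000                        # all other free pages (assumed)
watermark = 4096                         # assumed watermark, in pages

total_free = free_cma + free_other
movable_ok = watermark_ok(total_free, free_cma, watermark, can_use_cma=True)
unmovable_ok = watermark_ok(total_free, free_cma, watermark, can_use_cma=False)
print(movable_ok, unmovable_ok)
```

With these numbers the check passes for a movable allocation and fails for
an unmovable one, even though total free memory is far above the watermark.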

I found this problem in the following experiment.

4 CPUs, 1024 MB, VIRTUAL MACHINE
make -j24

CMA reserve:        0 MB        512 MB
Elapsed-time:       234.8       361.8
Average-MemFree:    283880 KB   530851 KB

To solve this problem, I can think of the following two possible solutions.
1. allocate the pages on cma reserved memory first, and if they are
   exhausted, allocate movable pages.
2. interleaved allocation: try to allocate specific amounts of memory
   from cma reserved memory and then allocate from free movable memory.

I tested approach #1 and found a problem. Although free memory in
meminfo can move around the low watermark, there is a large fluctuation
in free memory, because too many pages are reclaimed when kswapd is
invoked. The reason for this behaviour is that successively allocated
CMA pages are on the LRU list in that order and kswapd reclaims them in
the same order. This memory doesn't help the watermark check from
kswapd, so too many pages are reclaimed, I guess.

So, I implemented approach #2.
One thing I should note is that we should not change the allocation
target (movable list or cma) on each allocation attempt, since this
prevents allocated pages from being physically contiguous, which can
hurt the performance of some I/O devices. To solve this, I keep the
allocation target for at least pageblock_nr_pages attempts and make this
number reflect the ratio of free pages without free cma pages to free
cma pages. With this approach, the system works very smoothly and fully
utilizes the pages on cma reserved memory.
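The interleaving rule can be sketched roughly as follows. This is only
an illustration of the idea, not the code of this patch: the struct, the
function names and the PAGEBLOCK_NR_PAGES value are assumptions made for
the example. Each target is kept for at least one pageblock's worth of
attempts, and the side with more free pages gets a budget scaled by the
free-page ratio.

```c
#include <assert.h>

#define PAGEBLOCK_NR_PAGES 1024L	/* assumed value for this sketch */

enum alloc_target { TARGET_MOVABLE, TARGET_CMA };

/* Hypothetical per-zone state: remaining attempts for each target. */
struct interleave_state {
	long nr_try_movable;
	long nr_try_cma;
};

/*
 * Refill both budgets. The side with fewer free pages gets one
 * pageblock's worth of attempts; the other side is scaled by the
 * ratio of free non-CMA pages to free CMA pages.
 */
static void reset_budgets(struct interleave_state *s,
			  long free_normal, long free_cma)
{
	assert(free_normal > 0 && free_cma > 0);
	if (free_normal >= free_cma) {
		s->nr_try_cma = PAGEBLOCK_NR_PAGES;
		s->nr_try_movable = PAGEBLOCK_NR_PAGES * free_normal / free_cma;
	} else {
		s->nr_try_movable = PAGEBLOCK_NR_PAGES;
		s->nr_try_cma = PAGEBLOCK_NR_PAGES * free_cma / free_normal;
	}
}

/* Pick the target for one allocation attempt and charge its budget. */
static enum alloc_target pick_target(struct interleave_state *s,
				     long free_normal, long free_cma)
{
	if (s->nr_try_movable <= 0 && s->nr_try_cma <= 0)
		reset_budgets(s, free_normal, free_cma);
	if (s->nr_try_movable > 0) {
		s->nr_try_movable--;
		return TARGET_MOVABLE;
	}
	s->nr_try_cma--;
	return TARGET_CMA;
}
```

Because the target only switches when a budget runs out, consecutive
attempts stay on the same free list for a whole run, which is what keeps
the allocated pages physically adjacent in this model.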

The following is the experimental result of this patch.

4 CPUs, 1024 MB, VIRTUAL MACHINE
make -j24


Before:
CMA reserve:        0 MB        512 MB
Elapsed-time:       234.8       361.8
Average-MemFree:    283880 KB   530851 KB
pswpin:             7           110064
pswpout:            452         767502

After:
CMA reserve:        0 MB        512 MB
Elapsed-time:       234.2       235.6
Average-MemFree:    281651 KB   290227 KB
pswpin:             8           8
pswpout:            430         510

There is no difference if we don't have cma reserved memory (0 MB case).
But with cma reserved memory (512 MB case), we fully utilize this
reserved memory through this patch and the system behaves as if
it doesn't reserve any memory.

With this patch, we aggressively allocate the pages on cma reserved
memory, so the latency of CMA allocation can rise. Below is the
experimental result for latency.

4 CPUs, 1024 MB, VIRTUAL MACHINE
CMA reserve: 512 MB
Background Workload: make -jN
Real Workload: 8 MB CMA allocation/free 20 times with 5 sec interval

N:                     1        4        8        16
Elapsed-time(Before):  4309.75  9511.09  12276.1  77103.5
Elapsed-time(After):   5391.69  16114.1  19380.3  34879.2
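
The relative change for each column of the table above can be checked
with a few lines of C. The helper name is made up for this example; the
numbers are copied from the table.

```c
#include <assert.h>

/* Elapsed times copied from the table, for N = 1, 4, 8, 16. */
static const double elapsed_before[4] = { 4309.75, 9511.09, 12276.1, 77103.5 };
static const double elapsed_after[4]  = { 5391.69, 16114.1, 19380.3, 34879.2 };

/* Relative change in percent; positive means latency got worse. */
static double percent_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

For example, percent_change(elapsed_before[1], elapsed_after[1]) gives
the change for N = 4, and index 3 gives the change for N = 16.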

So in general we see a latency increase. The ratio of this increase
is rather big, up to 70%. But, under the heavy workload, it shows a
latency decrease - up to 55%. Th