[PATCH -mm -V8] mm, swap: fix race between swapoff and some swap operations

2019-02-17 Thread Huang, Ying
From: Huang Ying When swapin is performed, after getting the swap entry information from the page table, system will swap in the swap entry, without any lock held to prevent the swap device from being swapoff. This may cause the race like below, CPU 1 CPU 2

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-15 Thread Michal Hocko
On Fri 15-02-19 15:08:36, Huang, Ying wrote: > Michal Hocko writes: > > > On Mon 11-02-19 16:38:46, Huang, Ying wrote: > >> From: Huang Ying > >> > >> When swapin is performed, after getting the swap entry information from > >> the page table,

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-14 Thread Huang, Ying
Andrea Arcangeli writes: > On Thu, Feb 14, 2019 at 04:07:37PM +0800, Huang, Ying wrote: >> Before, we choose to use stop_machine() to reduce the overhead of hot >> path (page fault handler) as much as possible. But now, I found >> rcu_read_lock_sched() is just a wrapper of preempt_disable().

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-14 Thread Andrea Arcangeli
On Thu, Feb 14, 2019 at 04:07:37PM +0800, Huang, Ying wrote: > Before, we choose to use stop_machine() to reduce the overhead of hot > path (page fault handler) as much as possible. But now, I found > rcu_read_lock_sched() is just a wrapper of preempt_disable(). So maybe > we can switch to RCU

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-14 Thread Andrea Arcangeli
Hello, On Thu, Feb 14, 2019 at 12:30:02PM -0800, Andrew Morton wrote: > This was discussed to death and I think the changelog explains the > conclusions adequately. swapoff is super-rare so a stop_machine() in > that path is appropriate if its use permits more efficiency in the >

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-14 Thread Andrew Morton
> > get_swap_device() to put_swap_device(), the preemption is disabled, so > > stop_machine() in swapoff() will wait until put_swap_device() is called. > > > > In addition to swap_map, cluster_info, etc. data structure in the struct > > swap_info_struct, the swap cach
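The get/put pairing quoted above can be illustrated with a small userspace sketch. It is not the patch's mechanism: the patch relies on disabled preemption plus stop_machine(), which has no userspace equivalent, so this sketch swaps in an explicit reader counter and a validity flag built on C11 atomics. Every name ending in _sketch is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of a swap device that matter here. */
struct swap_dev_sketch {
    atomic_bool valid;   /* cleared when swapoff starts */
    atomic_int readers;  /* users between get and put */
};

/*
 * Announce the reader first, then check validity. With sequentially
 * consistent atomics, either this reader sees valid == false and backs
 * off, or the swapoff side sees readers != 0 and has to wait.
 */
struct swap_dev_sketch *get_swap_device_sketch(struct swap_dev_sketch *si)
{
    if (si == NULL)
        return NULL;
    atomic_fetch_add(&si->readers, 1);
    if (!atomic_load(&si->valid)) {
        atomic_fetch_sub(&si->readers, 1);
        return NULL;
    }
    return si;
}

/* Drop the reference taken by a successful get. */
void put_swap_device_sketch(struct swap_dev_sketch *si)
{
    atomic_fetch_sub(&si->readers, 1);
}

/* Swapoff side, step 1: refuse new readers. */
void swapoff_begin_sketch(struct swap_dev_sketch *si)
{
    atomic_store(&si->valid, false);
}

/* Swapoff side, step 2: data may be freed only once this returns true. */
bool swapoff_may_free_sketch(struct swap_dev_sketch *si)
{
    return atomic_load(&si->readers) == 0;
}
```

A caller would bracket every access to the device's data structures with a successful get and a matching put, and the swapoff side would poll swapoff_may_free_sketch() after swapoff_begin_sketch() before releasing anything.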

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-14 Thread Michal Hocko
On Mon 11-02-19 16:38:46, Huang, Ying wrote: > From: Huang Ying > > When swapin is performed, after getting the swap entry information from > the page table, system will swap in the swap entry, without any lock held > to prevent the swap device from being swapoff. This may cause

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-13 Thread Andrea Arcangeli
Hello everyone, On Mon, Feb 11, 2019 at 04:38:46PM +0800, Huang, Ying wrote: > @@ -2386,7 +2463,17 @@ static void enable_swap_info(struct swap_info_struct > *p, int prio, > frontswap_init(p->type, frontswap_map); > spin_lock(_lock); > spin_lock(>lock); > -

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-12 Thread Huang, Ying
Tim Chen writes: > On 2/11/19 10:47 PM, Huang, Ying wrote: >> Andrea Parri writes: >> > + if (!si) > + goto bad_nofile; > + > + preempt_disable(); > + if (!(si->flags & SWP_VALID)) > + goto unlock_out; After Hugh alluded to barriers, it seems

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-12 Thread Daniel Jordan
On Tue, Feb 12, 2019 at 04:21:21AM +0100, Andrea Parri wrote: > > > + if (!si) > > > + goto bad_nofile; > > > + > > > + preempt_disable(); > > > + if (!(si->flags & SWP_VALID)) > > > + goto unlock_out; > > > > After Hugh alluded to barriers, it seems the read of SWP_VALID could be

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-12 Thread Tim Chen
On 2/11/19 10:47 PM, Huang, Ying wrote: > Andrea Parri writes: > + if (!si) + goto bad_nofile; + + preempt_disable(); + if (!(si->flags & SWP_VALID)) + goto unlock_out; >>> >>> After Hugh alluded to barriers, it seems the read of SWP_VALID could

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-12 Thread Andrea Parri
> Alternative implementation could be replacing disable preemption with > rcu_read_lock_sched and stop_machine() with synchronize_sched(). JFYI, starting with v4.20-rc1, synchronize_rcu{,expedited}() also wait for preempt-disable sections (the intent seems to retire the RCU-sched update-side

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-11 Thread Huang, Ying
Daniel Jordan writes: > On Mon, Feb 11, 2019 at 04:38:46PM +0800, Huang, Ying wrote: >> +struct swap_info_struct *get_swap_device(swp_entry_t entry) >> +{ >> +struct swap_info_struct *si; >> +unsigned long type, offset; >> + >> +if (!entry.val) >> +goto out; > >> +

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-11 Thread Andrea Parri
> > + if (!si) > > + goto bad_nofile; > > + > > + preempt_disable(); > > + if (!(si->flags & SWP_VALID)) > > + goto unlock_out; > > After Hugh alluded to barriers, it seems the read of SWP_VALID could be > reordered with the write in preempt_disable at runtime. Without

[PATCH v2 17/26] userfaultfd: wp: support swap and page migration

2019-02-11 Thread Peter Xu
For either swap and page migration, we all use the bit 2 of the entry to identify whether this entry is uffd write-protected. It plays a similar role as the existing soft dirty bit in swap entries but only for keeping the uffd-wp tracking for a specific PTE/PMD. Something special here
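As a rough illustration of the idea in this snippet, the sketch below keeps a marker in bit 2 of an entry value and provides helpers to set, test, and clear it. The type and helper names are hypothetical and are not taken from the patch; only the use of bit 2 comes from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 2 of the entry, as described in the snippet above. */
#define SWP_UFFD_WP_SKETCH (UINT64_C(1) << 2)

/* Hypothetical stand-in for a swap or migration entry stored in a PTE. */
typedef struct {
    uint64_t val;
} swp_pte_sketch_t;

/* Mark the entry as uffd write-protected. */
swp_pte_sketch_t swp_mkuffd_wp_sketch(swp_pte_sketch_t pte)
{
    pte.val |= SWP_UFFD_WP_SKETCH;
    return pte;
}

/* Report whether the entry carries the uffd-wp marker. */
bool swp_uffd_wp_sketch(swp_pte_sketch_t pte)
{
    return (pte.val & SWP_UFFD_WP_SKETCH) != 0;
}

/* Remove the marker, leaving every other bit untouched. */
swp_pte_sketch_t swp_clear_uffd_wp_sketch(swp_pte_sketch_t pte)
{
    pte.val &= ~SWP_UFFD_WP_SKETCH;
    return pte;
}
```

Because the helpers only touch one bit, the marker can travel with the entry across swap-out or migration without disturbing the rest of its encoding.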

Re: [PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-11 Thread Daniel Jordan
On Mon, Feb 11, 2019 at 04:38:46PM +0800, Huang, Ying wrote: > +struct swap_info_struct *get_swap_device(swp_entry_t entry) > +{ > + struct swap_info_struct *si; > + unsigned long type, offset; > + > + if (!entry.val) > + goto out; > + type = swp_type(entry); > +

[PATCH v1] MAINAINERS: Swap words in INTEL PMIC MULTIFUNCTION DEVICE DRIVERS

2019-02-11 Thread Andy Shevchenko
Swap PMIC and MULTIFUNCTION words in the title to: - show that this is about Intel PMICs - keep MAINTAINERS properly sorted Signed-off-by: Andy Shevchenko --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 9919840d54cd

[PATCH -mm -V7] mm, swap: fix race between swapoff and some swap operations

2019-02-11 Thread Huang, Ying
From: Huang Ying When swapin is performed, after getting the swap entry information from the page table, system will swap in the swap entry, without any lock held to prevent the swap device from being swapoff. This may cause the race like below, CPU 1 CPU 2

[PATCH 4.19 097/106] mm/swap: use nr_node_ids for avail_lists in swap_info_struct

2019-01-24 Thread Greg Kroah-Hartman
4.19-stable review patch. If anyone has any objections, please let me know. -- [ Upstream commit 66f71da9dd38af17dc17209cdde7987d4679a699 ] Since a2468cc9bfdf ("swap: choose swap device according to numa node"), avail_lists field of swap_info_struct is changed t
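The size effect described in this changelog can be sketched in userspace. The example assumes a trailing flexible array sized at allocation time, which is one common way to size a per-node array by nr_node_ids; the snippet above does not show how the patch itself does it. All names ending in _sketch and the value of MAX_NUMNODES_SKETCH are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed compile-time maximum; the real value depends on kernel config. */
#define MAX_NUMNODES_SKETCH 1024

/* Hypothetical stand-in for a per-node list head. */
struct plist_node_sketch {
    int prio;
    void *prev;
    void *next;
};

/* Fixed layout: every instance pays for the compile-time maximum. */
struct swap_info_fixed_sketch {
    unsigned long flags;
    struct plist_node_sketch avail_lists[MAX_NUMNODES_SKETCH];
};

/* Flexible layout: the trailing array is sized when the struct is allocated. */
struct swap_info_flex_sketch {
    unsigned long flags;
    struct plist_node_sketch avail_lists[];
};

/* Bytes needed for a flexible instance covering nr_node_ids nodes. */
size_t swap_info_flex_size_sketch(size_t nr_node_ids)
{
    return sizeof(struct swap_info_flex_sketch)
           + nr_node_ids * sizeof(struct plist_node_sketch);
}

/* Allocate a zeroed instance with room for nr_node_ids list heads. */
struct swap_info_flex_sketch *swap_info_flex_alloc_sketch(size_t nr_node_ids)
{
    assert(nr_node_ids > 0);
    return calloc(1, swap_info_flex_size_sketch(nr_node_ids));
}
```

On a machine with two possible nodes, the flexible variant needs room for two list heads instead of MAX_NUMNODES_SKETCH of them.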

[PATCH 4.20 117/127] mm/swap: use nr_node_ids for avail_lists in swap_info_struct

2019-01-24 Thread Greg Kroah-Hartman
4.20-stable review patch. If anyone has any objections, please let me know. -- [ Upstream commit 66f71da9dd38af17dc17209cdde7987d4679a699 ] Since a2468cc9bfdf ("swap: choose swap device according to numa node"), avail_lists field of swap_info_struct is changed t

[PATCH 4.14 56/63] mm/swap: use nr_node_ids for avail_lists in swap_info_struct

2019-01-24 Thread Greg Kroah-Hartman
4.14-stable review patch. If anyone has any objections, please let me know. -- [ Upstream commit 66f71da9dd38af17dc17209cdde7987d4679a699 ] Since a2468cc9bfdf ("swap: choose swap device according to numa node"), avail_lists field of swap_info_struct is changed t

[PATCH RFC 19/24] userfaultfd: wp: support swap and page migration

2019-01-21 Thread Peter Xu
For either swap and page migration, we all use the bit 2 of the entry to identify whether this entry is uffd write-protected. It plays a similar role as the existing soft dirty bit in swap entries but only for keeping the uffd-wp tracking for a specific PTE/PMD. Something special here

[PATCH v4 1/2] mm: Refactor swap-in logic out of shmem_getpage_gfp

2019-01-14 Thread Vineeth Remanan Pillai
swap-in logic could be reused independently without rest of the logic in shmem_getpage_gfp. So lets refactor it out as an independent function. Signed-off-by: Vineeth Remanan Pillai --- mm/shmem.c | 449 + 1 file changed, 244 insertions

Re: [PATCH] mm: swap: use mem_cgroup_is_root() instead of deferencing css->parent

2019-01-11 Thread Michal Hocko
On Sat 12-01-19 02:55:13, Yang Shi wrote: > mem_cgroup_is_root() is preferred API to check if memcg is root or not. > Use it instead of deferencing css->parent. > > Cc: Huang Ying > Cc: Tim Chen > Signed-off-by: Yang Shi Yes, this is more readable. Acked-by: Michal Hocko > --- >

[PATCH] mm: swap: use mem_cgroup_is_root() instead of deferencing css->parent

2019-01-11 Thread Yang Shi
mem_cgroup_is_root() is preferred API to check if memcg is root or not. Use it instead of deferencing css->parent. Cc: Huang Ying Cc: Tim Chen Signed-off-by: Yang Shi --- include/linux/swap.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/swap.h

[PATCH 4.20 16/65] mm, swap: fix swapoff with KSM pages

2019-01-11 Thread Greg Kroah-Hartman
be found. So in try_to_unuse(), even if the swap count of a swap entry isn't zero, the page needs to be deleted from swap cache, so that, in the next round a new page could be allocated and swapin for the other mappings of the swapped out KSM page. But this contradicts with the THP swap support. Where

[PATCH 4.19 088/148] mm, swap: fix swapoff with KSM pages

2019-01-11 Thread Greg Kroah-Hartman
be found. So in try_to_unuse(), even if the swap count of a swap entry isn't zero, the page needs to be deleted from swap cache, so that, in the next round a new page could be allocated and swapin for the other mappings of the swapped out KSM page. But this contradicts with the THP swap support. Where

[PATCH 4.14 066/105] mm, swap: fix swapoff with KSM pages

2019-01-11 Thread Greg Kroah-Hartman
be found. So in try_to_unuse(), even if the swap count of a swap entry isn't zero, the page needs to be deleted from swap cache, so that, in the next round a new page could be allocated and swapin for the other mappings of the swapped out KSM page. But this contradicts with the THP swap support. Where

Re: [PATCH 2/4] drm/atmel-hlcdc: do not swap w/h of the crtc when a plane is rotated

2019-01-11 Thread Nicolas.Ferre
>>> index ea8fc0deb814..d6f93f029020 100644 >>> --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c >>> +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c >>> @@ -642,9 +642,6 @@ static int atmel_hlcdc_plane_atomic_check(struct >>> drm

Re: [PATCH 2/4] drm/atmel-hlcdc: do not swap w/h of the crtc when a plane is rotated

2019-01-11 Thread Peter Rosin
tmel_hlcdc_plane.c >> +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c >> @@ -642,9 +642,6 @@ static int atmel_hlcdc_plane_atomic_check(struct >> drm_plane *p, >> * Swap width and size in case of 90 or 270 degrees rotation >> */ >> if (drm

Re: [v5 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2019-01-10 Thread Yang Shi
On 1/10/19 3:31 PM, Andrew Morton wrote: On Fri, 4 Jan 2019 03:27:52 +0800 Yang Shi wrote: Swap readahead would read in a few pages regardless if the underlying device is busy or not. It may incur long waiting time if the device is congested, and it may also exacerbate the congestion

RE: [v5 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2019-01-10 Thread Chen, Tim C
> > > > + if (si->flags & (SWP_BLKDEV | SWP_FS)) { > > I re-read your discussion with Tim and I must say the reasoning behind this > test remain foggy. I was worried that the dereference inode = si->swap_file->f_mapping->host; is not always safe for corner cases. So the test makes sure that

Re: [v5 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2019-01-10 Thread Andrew Morton
On Fri, 4 Jan 2019 03:27:52 +0800 Yang Shi wrote: > Swap readahead would read in a few pages regardless if the underlying > device is busy or not. It may incur long waiting time if the device is > congested, and it may also exacerbate the congestion. > > Use inode_read_conge

Re: [v5 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2019-01-10 Thread Yang Shi
Hi Andrew, How do you look these patches? They had been reviewed and the commit log has been updated per your and Daniel's comments. Thanks, Yang On 1/3/19 11:27 AM, Yang Shi wrote: Swap readahead would read in a few pages regardless if the underlying device is busy or not. It may

Re: [PATCH 2/4] drm/atmel-hlcdc: do not swap w/h of the crtc when a plane is rotated

2019-01-10 Thread Boris Brezillon
atic int atmel_hlcdc_plane_atomic_check(struct > drm_plane *p, >* Swap width and size in case of 90 or 270 degrees rotation >*/ > if (drm_rotation_90_or_270(state->base.rotation)) { > - tmp = state->crtc_w; > - state->crtc

[PATCH 2/4] drm/atmel-hlcdc: do not swap w/h of the crtc when a plane is rotated

2019-01-10 Thread Peter Rosin
/atmel_hlcdc_plane.c index ea8fc0deb814..d6f93f029020 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c @@ -642,9 +642,6 @@ static int atmel_hlcdc_plane_atomic_check(struct drm_plane *p, * Swap width and size in case of 90 or 270 degrees

[PATCH v8 02/11] gpu: ipu-csi: Swap fields according to input/output field types

2019-01-09 Thread Steve Longerbeam
The function ipu_csi_init_interface() was inverting the F-bit for NTSC case, in the CCIR_CODE_1/2 registers. The result being that for NTSC bottom-top field order, the CSI would swap fields and capture in top-bottom order. Instead, base field swap on the field order of the input to the CSI

[PATCH v7 02/11] gpu: ipu-csi: Swap fields according to input/output field types

2019-01-09 Thread Steve Longerbeam
The function ipu_csi_init_interface() was inverting the F-bit for NTSC case, in the CCIR_CODE_1/2 registers. The result being that for NTSC bottom-top field order, the CSI would swap fields and capture in top-bottom order. Instead, base field swap on the field order of the input to the CSI

Re: [v2 PATCH 2/5] mm: memcontrol: do not try to do swap when force empty

2019-01-09 Thread Yang Shi
have already been freed or uncharged. Even though anonymous pages get swapped out, but they still get charged to swap space. So, it sounds pointless to do swap for force empty. I tried to dig into the history of this, it was introduced by commit 8c7c6e34a125 ("memcg: mem+swap controller

Re: [PATCH v6 02/12] gpu: ipu-csi: Swap fields according to input/output field types

2019-01-09 Thread Philipp Zabel
On Tue, 2019-01-08 at 16:15 -0800, Steve Longerbeam wrote: > The function ipu_csi_init_interface() was inverting the F-bit for > NTSC case, in the CCIR_CODE_1/2 registers. The result being that > for NTSC bottom-top field order, the CSI would swap fields and > capture in top-

[PATCH v6 02/12] gpu: ipu-csi: Swap fields according to input/output field types

2019-01-08 Thread Steve Longerbeam
The function ipu_csi_init_interface() was inverting the F-bit for NTSC case, in the CCIR_CODE_1/2 registers. The result being that for NTSC bottom-top field order, the CSI would swap fields and capture in top-bottom order. Instead, base field swap on the field order of the input to the CSI

Re: [PATCH v6] gpu: ipu-csi: Swap fields according to input/output field types

2019-01-08 Thread Steve Longerbeam
NTSC case, in the CCIR_CODE_1/2 registers. The result being that for NTSC bottom-top field order, the CSI would swap fields and capture in top-bottom order. Instead, base field swap on the field order of the input to the CSI, and the field order of the requested output. If the input/output fields are

[PATCH AUTOSEL 4.14 52/53] mm/swap: use nr_node_ids for avail_lists in swap_info_struct

2019-01-08 Thread Sasha Levin
From: Aaron Lu [ Upstream commit 66f71da9dd38af17dc17209cdde7987d4679a699 ] Since a2468cc9bfdf ("swap: choose swap device according to numa node"), avail_lists field of swap_info_struct is changed to an array with MAX_NUMNODES elements. This made swap_info_struct size increase

[PATCH AUTOSEL 4.19 95/97] mm/swap: use nr_node_ids for avail_lists in swap_info_struct

2019-01-08 Thread Sasha Levin
From: Aaron Lu [ Upstream commit 66f71da9dd38af17dc17209cdde7987d4679a699 ] Since a2468cc9bfdf ("swap: choose swap device according to numa node"), avail_lists field of swap_info_struct is changed to an array with MAX_NUMNODES elements. This made swap_info_struct size increase

[PATCH AUTOSEL 4.20 114/117] mm/swap: use nr_node_ids for avail_lists in swap_info_struct

2019-01-08 Thread Sasha Levin
From: Aaron Lu [ Upstream commit 66f71da9dd38af17dc17209cdde7987d4679a699 ] Since a2468cc9bfdf ("swap: choose swap device according to numa node"), avail_lists field of swap_info_struct is changed to an array with MAX_NUMNODES elements. This made swap_info_struct size increase

[PATCH 4.14 056/101] x86/speculation/l1tf: Drop the swap storage limit restriction when l1tf=off

2019-01-07 Thread Greg Kroah-Hartman
4.14-stable review patch. If anyone has any objections, please let me know. -- From: Michal Hocko commit 5b5e4d623ec8a34689df98e42d038a3b594d2ff9 upstream. Swap storage is restricted to max_swapfile_size (~16TB on x86_64) whenever the system is deemed affected by L1TF

[PATCH 4.19 090/170] x86/speculation/l1tf: Drop the swap storage limit restriction when l1tf=off

2019-01-07 Thread Greg Kroah-Hartman
4.19-stable review patch. If anyone has any objections, please let me know. -- From: Michal Hocko commit 5b5e4d623ec8a34689df98e42d038a3b594d2ff9 upstream. Swap storage is restricted to max_swapfile_size (~16TB on x86_64) whenever the system is deemed affected by L1TF

[PATCH 4.20 054/145] x86/speculation/l1tf: Drop the swap storage limit restriction when l1tf=off

2019-01-07 Thread Greg Kroah-Hartman
4.20-stable review patch. If anyone has any objections, please let me know. -- From: Michal Hocko commit 5b5e4d623ec8a34689df98e42d038a3b594d2ff9 upstream. Swap storage is restricted to max_swapfile_size (~16TB on x86_64) whenever the system is deemed affected by L1TF

Re: [v2 PATCH 2/5] mm: memcontrol: do not try to do swap when force empty

2019-01-04 Thread Shakeel Butt
ed or uncharged. Even though anonymous pages get > swapped out, but they still get charged to swap space. So, it sounds > pointless to do swap for force empty. > > I tried to dig into the history of this, it was introduced by > commit 8c7c6e34a125 ("memcg: mem+swap controller

[v2 PATCH 2/5] mm: memcontrol: do not try to do swap when force empty

2019-01-04 Thread Yang Shi
, but they still get charged to swap space. So, it sounds pointless to do swap for force empty. I tried to dig into the history of this, it was introduced by commit 8c7c6e34a125 ("memcg: mem+swap controller core"), but there is not any clue about why it was done so at the first place. The below s

Re: [v5 PATCH 2/2] mm: swap: add comment for swap_vma_readahead

2019-01-03 Thread Huang, Ying
NULL. > */ > struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask, > struct vm_fault *vmf) > @@ -698,6 +698,20 @@ static void swap_ra_info(struct vm_fault *vmf, > pte_unmap(orig_pte); > } > > +/** > + * swap_vma_rea

[v5 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2019-01-03 Thread Yang Shi
Swap readahead would read in a few pages regardless if the underlying device is busy or not. It may incur long waiting time if the device is congested, and it may also exacerbate the congestion. Use inode_read_congested() to check if the underlying device is busy or not like what file page
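The decision described in this changelog can be sketched as a small helper that shrinks the readahead window to the single faulting page when the backing device reports congestion. This is a userspace illustration only: the structures, flag values, and the read_congested field are hypothetical stand-ins, and the flag guard mirrors the (si->flags & (SWP_BLKDEV | SWP_FS)) test quoted elsewhere in the v5 thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values; the real ones live in the kernel headers. */
enum {
    SWP_BLKDEV_SKETCH = 1 << 0,
    SWP_FS_SKETCH = 1 << 1,
};

/* Stand-in for the inode behind the swap file or partition. */
struct inode_sketch {
    bool read_congested;
};

/* Stand-in for the swap device description. */
struct swap_dev_sketch {
    unsigned int flags;
    struct inode_sketch *host;
};

/*
 * Return how many pages to read for one swap-in fault. The host inode is
 * only inspected when the device flags say it is meaningful, and a
 * congested device gets just the one page that was actually requested.
 */
unsigned int swapin_window_sketch(const struct swap_dev_sketch *si,
                                  unsigned int nr_pages)
{
    if (nr_pages <= 1)
        return 1;
    if ((si->flags & (SWP_BLKDEV_SKETCH | SWP_FS_SKETCH)) &&
        si->host != NULL && si->host->read_congested)
        return 1;
    return nr_pages;
}
```

The faulting page is always read; only the speculative neighbours are dropped while the device is busy.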

[v5 PATCH 2/2] mm: swap: add comment for swap_vma_readahead

2019-01-03 Thread Yang Shi
ault *vmf, pte_unmap(orig_pte); } +/** + * swap_vma_readahead - swap in pages in hope we need them soon + * @entry: swap entry of this memory + * @gfp_mask: memory allocation flags + * @vmf: fault information + * + * Returns the struct page for entry and addr, after queueing swapin. + * + * Primi

Re: [PATCH 2/3] mm: memcontrol: do not try to do swap when force empty

2019-01-03 Thread Yang Shi
. Since there should be no attached tasks to offlining memcg, the tasks anonymous pages would have already been freed or uncharged. Anon pages can come from tmpfs files as well. Yes, but they are charged to swap space as regular anon pages. The point was the lifetime of tmpfs anon pages are not tied

Re: [v4 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2019-01-03 Thread Daniel Jordan
On Thu, Jan 03, 2019 at 09:10:13AM -0800, Yang Shi wrote: > How about the below description: > > The test with page_fault1 of will-it-scale (sometimes tracing may just show > runtest.py that is the wrapper script of page_fault1), which basically > launches NR_CPU threads to generate 128MB

Re: [v4 PATCH 2/2] mm: swap: add comment for swap_vma_readahead

2019-01-03 Thread Yang Shi
/swap_state.c b/mm/swap_state.c index 78d500e..dd8f698 100644 --- a/mm/swap_state.c +++ b/mm/swap_state.c @@ -698,6 +698,23 @@ static void swap_ra_info(struct vm_fault *vmf, pte_unmap(orig_pte); } +/** + * swap_vm_readahead - swap in pages in hope we need them soon s/swap_vm_readahead

Re: [v4 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2019-01-03 Thread Yang Shi
(#thr, runtime, system cpus/memory/swap) for more context. How about the below description: The test with page_fault1 of will-it-scale (sometimes tracing may just show runtest.py that is the wrapper script of page_fault1), which basically launches NR_CPU threads to generate 128MB anonymous pages

Re: [PATCH 2/3] mm: memcontrol: do not try to do swap when force empty

2019-01-03 Thread Shakeel Butt
flining a memcg. Since there should be no > >> attached tasks to offlining memcg, the tasks anonymous pages would have > >> already been freed or uncharged. > > Anon pages can come from tmpfs files as well. > > Yes, but they are charged to swap space as regular anon pages. >

Re: [PATCH 2/3] mm: memcontrol: do not try to do swap when force empty

2019-01-03 Thread Yang Shi
have already been freed or uncharged. Anon pages can come from tmpfs files as well. Yes, but they are charged to swap space as regular anon pages. Even though anonymous pages get swapped out, but they still get charged to swap space. So, it sounds pointless to do swap for force empty. I

Re: [v4 PATCH 2/2] mm: swap: add comment for swap_vma_readahead

2019-01-02 Thread Huang, Ying
mm/swap_state.c b/mm/swap_state.c > index 78d500e..dd8f698 100644 > --- a/mm/swap_state.c > +++ b/mm/swap_state.c > @@ -698,6 +698,23 @@ static void swap_ra_info(struct vm_fault *vmf, > pte_unmap(orig_pte); > } > > +/** > + * swap_vm_readahead - swap in pages in ho

Re: [v4 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2019-01-02 Thread Daniel Jordan
ntry:3.600us > | do_swap_page(); > runtest.py-1417 [020] 301.935878: funcgraph_entry:7.202us > | do_swap_page(); Hi Yang, I guess runtest.py just calls page_fault1_thr? Being explicit about this may improve the changelog for those unfamiliar with will-it-scale. May also be useful to name will-it-scale and how it was run (#thr, runtime, system cpus/memory/swap) for more context.

Re: [PATCH 2/3] mm: memcontrol: do not try to do swap when force empty

2019-01-02 Thread Shakeel Butt
ed or uncharged. Anon pages can come from tmpfs files as well. > Even though anonymous pages get > swapped out, but they still get charged to swap space. So, it sounds > pointless to do swap for force empty. > I understand that force_empty is typically used before rmdir'ing a

[PATCH 2/3] mm: memcontrol: do not try to do swap when force empty

2019-01-02 Thread Yang Shi
, but they still get charged to swap space. So, it sounds pointless to do swap for force empty. I tried to dig into the history of this, it was introduced by commit 8c7c6e34a125 ("memcg: mem+swap controller core"), but there is not any clue about why it was done so at the first place. The below s

[v4 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-29 Thread Yang Shi
Swap readahead would read in a few pages regardless if the underlying device is busy or not. It may incur long waiting time if the device is congested, and it may also exacerbate the congestion. Use inode_read_congested() to check if the underlying device is busy or not like what file page

[v4 PATCH 2/2] mm: swap: add comment for swap_vma_readahead

2018-12-29 Thread Yang Shi
/swap_state.c +++ b/mm/swap_state.c @@ -698,6 +698,23 @@ static void swap_ra_info(struct vm_fault *vmf, pte_unmap(orig_pte); } +/** + * swap_vm_readahead - swap in pages in hope we need them soon + * @entry: swap entry of this memory + * @gfp_mask: memory allocation flags + * @vmf: fault

Re: [v3 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-28 Thread Yang Shi
On 12/28/18 4:42 PM, Andrew Morton wrote: On Sat, 22 Dec 2018 05:40:19 +0800 Yang Shi wrote: Swap readahead would read in a few pages regardless if the underlying device is busy or not. It may incur long waiting time if the device is congested, and it may also exacerbate the congestion

Re: [v3 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-28 Thread Andrew Morton
On Sat, 22 Dec 2018 05:40:19 +0800 Yang Shi wrote: > Swap readahead would read in a few pages regardless if the underlying > device is busy or not. It may incur long waiting time if the device is > congested, and it may also exacerbate the congestion. > > Use inode_read_conge

Re: [PATCH] mm, swap: Fix swapoff with KSM pages

2018-12-27 Thread Andrew Morton
need_to_copy() for details. > > During swapoff, unuse_vma() uses anon_vma (if available) to locate VMA > and virtual address mapped to the page, so not all mappings to a > swapped out KSM page could be found. So in try_to_unuse(), even if > the swap count of a swap entry

Re: [PATCH] mm, swap: Fix swapoff with KSM pages

2018-12-25 Thread Huang, Ying
Hi, Andrew, This patch is based on linus' tree instead of the head of mmotm tree because it is to fix a bug there. The bug is introduced by commit e07098294adf ("mm, THP, swap: support to reclaim swap space for THP swapped out"), which is merged by v4.14-rc1. So I think we shoul

[PATCH] mm, swap: Fix swapoff with KSM pages

2018-12-25 Thread Huang Ying
(if available) to locate VMA and virtual address mapped to the page, so not all mappings to a swapped out KSM page could be found. So in try_to_unuse(), even if the swap count of a swap entry isn't zero, the page needs to be deleted from swap cache, so that, in the next round a new page could

Re: [v3 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-21 Thread Yang Shi
On 12/21/18 2:42 PM, Tim Chen wrote: On 12/21/18 1:40 PM, Yang Shi wrote: Swap readahead would read in a few pages regardless if the underlying device is busy or not. It may incur long waiting time if the device is congested, and it may also exacerbate the congestion. Use

Re: [v3 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-21 Thread Tim Chen
On 12/21/18 1:40 PM, Yang Shi wrote: > Swap readahead would read in a few pages regardless if the underlying > device is busy or not. It may incur long waiting time if the device is > congested, and it may also exacerbate the congestion. > > Use inode_read_congested() to check if

[v3 PATCH 2/2] mm: swap: add comment for swap_vma_readahead

2018-12-21 Thread Yang Shi
/swap_state.c +++ b/mm/swap_state.c @@ -698,6 +698,23 @@ static void swap_ra_info(struct vm_fault *vmf, pte_unmap(orig_pte); } +/** + * swap_vm_readahead - swap in pages in hope we need them soon + * @entry: swap entry of this memory + * @gfp_mask: memory allocation flags + * @vmf: fault

[v3 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-21 Thread Yang Shi
Swap readahead would read in a few pages regardless if the underlying device is busy or not. It may incur long waiting time if the device is congested, and it may also exacerbate the congestion. Use inode_read_congested() to check if the underlying device is busy or not like what file page

Re: [v2 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-21 Thread Yang Shi
On 12/21/18 10:34 AM, Tim Chen wrote: On 12/20/18 4:21 PM, Yang Shi wrote: --- a/mm/swap_state.c +++ b/mm/swap_state.c @@ -538,11 +538,17 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask, bool do_poll = true, page_allocated; struct vm_area_struct

Re: [v2 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-21 Thread Tim Chen
On 12/20/18 4:21 PM, Yang Shi wrote: > --- a/mm/swap_state.c > +++ b/mm/swap_state.c > @@ -538,11 +538,17 @@ struct page *swap_cluster_readahead(swp_entry_t entry, > gfp_t gfp_mask, > bool do_poll = true, page_allocated; > struct vm_area_struct *vma = vmf->vma; > unsigned long

[v2 PATCH 2/2] mm: swap: add comment for swap_vma_readahead

2018-12-20 Thread Yang Shi
/swap_state.c +++ b/mm/swap_state.c @@ -697,6 +697,23 @@ static void swap_ra_info(struct vm_fault *vmf, pte_unmap(orig_pte); } +/** + * swap_vm_readahead - swap in pages in hope we need them soon + * @entry: swap entry of this memory + * @gfp_mask: memory allocation flags + * @vmf: fault

[v2 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-20 Thread Yang Shi
Swap readahead would read in a few pages regardless if the underlying device is busy or not. It may incur long waiting time if the device is congested, and it may also exacerbate the congestion. Use inode_read_congested() to check if the underlying device is busy or not like what file page

Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-19 Thread Tim Chen
You should do it only when (si->flags & SWP_FS) is true. >>>>> Do you mean it is not safe for swap partition? >>>> The f_mapping may not be instantiated.  It is only done for SWP_FS. >>> Really? I saw the below calls in swapon: >>> >>>

Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-19 Thread Yang Shi
On 12/19/18 11:00 AM, Tim Chen wrote: On 12/19/18 10:40 AM, Yang Shi wrote: I don't think your dereference inode = si->swap_file->f_mapping->host is always safe.  You should do it only when (si->flags & SWP_FS) is true. Do you mean it is not safe for swap partition? T

Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-19 Thread Tim Chen
On 12/19/18 10:40 AM, Yang Shi wrote: > > >>>> I don't think your dereference inode = si->swap_file->f_mapping->host >>>> is always safe.  You should do it only when (si->flags & SWP_FS) is true. >>> Do you mean it is not safe for s

Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-19 Thread Yang Shi
levant. As long as it is trying to readahead from swap, it should check if the underlying device is busy or not regardless of shmem or anon page. I don't think your dereference inode = si->swap_file->f_mapping->host is always safe.  You should do it only when (si->flags &

Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-19 Thread Tim Chen
host; >>>>>      mask = swapin_nr_pages(offset) - 1; >>>>>    if (!mask) >>>>>    goto skip; >>>>>    >>>> Shmem will also be using this function and I don't think the >>>> inode_read_congested >

Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-18 Thread Yang Shi
swapin_nr_pages(offset) - 1;   if (!mask)   goto skip; Shmem will also be using this function and I don't think the inode_read_congested logic is relevant for that case. IMHO, shmem is also relevant. As long as it is trying to readahead from swap, it should check if the un

Re: [PATCH -V9 07/21] swap: Support PMD swap mapping when splitting huge PMD

2018-12-18 Thread Huang, Ying
> Problem is 'pmd' is passed here, which has been pmdp_invalidate()ed under the > assumption that it is not a swap entry. pmd's pfn bits get inverted for L1TF, > so the swap entry gets corrupted and this BUG is the first place that notices. > > I don't see a reason to invalidate so soon, so wha

Re: [PATCH -V9 10/21] swap: Swapin a THP in one piece

2018-12-18 Thread Huang, Ying
Daniel Jordan writes: > On Fri, Dec 14, 2018 at 02:27:43PM +0800, Huang Ying wrote: >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >> index 1cec1eec340e..644cb5d6b056 100644 >> --- a/mm/huge_memory.c >> +++ b/mm/huge_memory.c >> @@ -33,6 +33,8 @@ >> #include >> #include >> #include >>

Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-18 Thread Tim Chen
long addr = vmf->address; >>> +    struct inode *inode = si->swap_file->f_mapping->host; >>>     mask = swapin_nr_pages(offset) - 1; >>>   if (!mask) >>>   goto skip; >>>   >> Shmem will also be using this function and I don't thin

Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-18 Thread Yang Shi
skip; Shmem will also be using this function and I don't think the inode_read_congested logic is relevant for that case. IMHO, shmem is also relevant. As long as it is trying to readahead from swap, it should check if the underlying device is busy or not regardless of shmem or anon pa

Re: [PATCH -V9 10/21] swap: Swapin a THP in one piece

2018-12-18 Thread Daniel Jordan
On Fri, Dec 14, 2018 at 02:27:43PM +0800, Huang Ying wrote: > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 1cec1eec340e..644cb5d6b056 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -33,6 +33,8 @@ > #include > #include > #include > +#include > +#include swap.h is

Re: [PATCH -V9 07/21] swap: Support PMD swap mapping when splitting huge PMD

2018-12-18 Thread Daniel Jordan
WAP) && is_swap_pmd(old_pmd)) > + return __split_huge_swap_pmd(vma, haddr, pmd); Problem is 'pmd' is passed here, which has been pmdp_invalidate()ed under the assumption that it is not a swap entry. pmd's pfn bits get inverted for L1TF, so the swap entry gets corrupted and this BUG is

Re: [RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-18 Thread Tim Chen
On 12/17/18 10:52 PM, Yang Shi wrote: > > diff --git a/mm/swap_state.c b/mm/swap_state.c > index fd2f21e..7cc3c29 100644 > --- a/mm/swap_state.c > +++ b/mm/swap_state.c > @@ -538,11 +538,15 @@ struct page *swap_cluster_readahead(swp_entry_t entry, > gfp_t gfp_mask, > bool do_poll = true,

[RFC PATCH 2/2] mm: swap: add comment for swap_vma_readahead

2018-12-17 Thread Yang Shi
/swap_state.c +++ b/mm/swap_state.c @@ -695,6 +695,23 @@ static void swap_ra_info(struct vm_fault *vmf, pte_unmap(orig_pte); } +/** + * swap_vm_readahead - swap in pages in hope we need them soon + * @entry: swap entry of this memory + * @gfp_mask: memory allocation flags + * @vmf: fault

[RFC PATCH 1/2] mm: swap: check if swap backing device is congested or not

2018-12-17 Thread Yang Shi
Swap readahead would read in a few pages regardless if the underlying device is busy or not. It may incur long waiting time if the device is congested, and it may also exacerbate the congestion. Use inode_read_congested() to check if the underlying device is busy or not like what file page

Re: [PATCH v6] gpu: ipu-csi: Swap fields according to input/output field types

2018-12-15 Thread kbuild test robot
/Steve-Longerbeam/gpu-ipu-csi-Swap-fields-according-to-input-output-field-types/20181215-135741 config: nds32-allmodconfig (attached as .config) compiler: nds32le-linux-gcc (GCC) 6.4.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin

[PATCH v6] gpu: ipu-csi: Swap fields according to input/output field types

2018-12-14 Thread Steve Longerbeam
The function ipu_csi_init_interface() was inverting the F-bit for NTSC case, in the CCIR_CODE_1/2 registers. The result being that for NTSC bottom-top field order, the CSI would swap fields and capture in top-bottom order. Instead, base field swap on the field order of the input to the CSI

Re: [PATCH v5 02/12] gpu: ipu-csi: Swap fields according to input/output field types

2018-12-14 Thread Steve Longerbeam
swap fields and capture in top-bottom order. Instead, base field swap on the field order of the input to the CSI, and the field order of the requested output. If the input/output fields are sequential but different, swap fields, otherwise do not swap. This requires passing both the input and output
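The rule stated above (swap only when the input and output fields are both sequential but differ) can be sketched as a small predicate. The enum and function names here are hypothetical and chosen for illustration; they are not the V4L2 or IPU definitions used by the actual driver.

```c
#include <stdbool.h>

/* Hypothetical field-order values for illustration only. */
enum field_order {
    FIELD_NONE,        /* progressive, no fields */
    FIELD_SEQ_TB,      /* sequential, top field first */
    FIELD_SEQ_BT,      /* sequential, bottom field first */
    FIELD_INTERLACED   /* interleaved lines, not sequential */
};

static bool field_is_sequential(enum field_order f)
{
    return f == FIELD_SEQ_TB || f == FIELD_SEQ_BT;
}

/* Swap fields only if both sides are sequential and their orders differ. */
static bool needs_field_swap(enum field_order in, enum field_order out)
{
    return field_is_sequential(in) && field_is_sequential(out) && in != out;
}
```

Under these assumptions, a bottom-top input with a top-bottom output requests a swap, while matching orders or a non-sequential side do not.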

[PATCH -V9 05/21] swap: Support PMD swap mapping in put_swap_page()

2018-12-13 Thread Huang Ying
Previously, during swapout, all PMD page mapping will be split and replaced with PTE swap mapping. And when clearing the SWAP_HAS_CACHE flag for the huge swap cluster in put_swap_page(), the huge swap cluster will be split. Now, during swapout, the PMD page mappings to the THP will be changed

[PATCH -V9 18/21] swap: Support PMD swap mapping for MADV_WILLNEED

2018-12-13 Thread Huang Ying
During MADV_WILLNEED, for a PMD swap mapping, if THP swapin is enabled for the VMA, the whole swap cluster will be swapin. Otherwise, the huge swap cluster and the PMD swap mapping will be split and fallback to PTE swap mapping. Signed-off-by: "Huang, Ying" Cc: "Kirill A. Shutem

[PATCH -V9 19/21] swap: Support PMD swap mapping in mincore()

2018-12-13 Thread Huang Ying
During mincore(), for PMD swap mapping, swap cache will be looked up. If the resulting page isn't compound page, the PMD swap mapping will be split and fallback to PTE swap mapping processing. Signed-off-by: "Huang, Ying" Cc: "Kirill A. Shutemov" Cc: Andrea Arcangeli
