From: Huang Ying
When swapin is performed, after getting the swap entry information from
the page table, the system swaps in the swap entry without holding any lock
that prevents the swap device from being swapped off. This may cause a race
like the one below,
CPU 1 CPU 2
On Thu, Feb 14, 2019 at 04:07:37PM +0800, Huang, Ying wrote:
> Before, we chose to use stop_machine() to reduce the overhead of the hot
> path (the page fault handler) as much as possible. But now, I found that
> rcu_read_lock_sched() is just a wrapper around preempt_disable(). So maybe
> we can switch to RCU
Hello,
On Thu, Feb 14, 2019 at 12:30:02PM -0800, Andrew Morton wrote:
> This was discussed to death and I think the changelog explains the
> conclusions adequately. swapoff is super-rare so a stop_machine() in
> that path is appropriate if its use permits more efficiency in the
>
> > get_swap_device() to put_swap_device(), the preemption is disabled, so
> > stop_machine() in swapoff() will wait until put_swap_device() is called.
> >
> > In addition to swap_map, cluster_info, etc. data structure in the struct
> > swap_info_struct, the swap cach
Hello everyone,
On Mon, Feb 11, 2019 at 04:38:46PM +0800, Huang, Ying wrote:
> @@ -2386,7 +2463,17 @@ static void enable_swap_info(struct swap_info_struct *p, int prio,
> 	frontswap_init(p->type, frontswap_map);
> 	spin_lock(&swap_lock);
> 	spin_lock(&p->lock);
> -
On Tue, Feb 12, 2019 at 04:21:21AM +0100, Andrea Parri wrote:
> > > + if (!si)
> > > + goto bad_nofile;
> > > +
> > > + preempt_disable();
> > > + if (!(si->flags & SWP_VALID))
> > > + goto unlock_out;
> >
> > After Hugh alluded to barriers, it seems the read of SWP_VALID could be
> > reordered with the write in preempt_disable at runtime. Without
> An alternative implementation could replace preemption disabling with
> rcu_read_lock_sched() and stop_machine() with synchronize_sched().
JFYI, starting with v4.20-rc1, synchronize_rcu{,expedited}() also wait
for preempt-disable sections (the intent seems to retire the RCU-sched
update-side
For both swap and page migration, we use bit 2 of the entry to
identify whether this entry is uffd write-protected. It plays a role
similar to the existing soft-dirty bit in swap entries, but only keeps
the uffd-wp tracking for a specific PTE/PMD.
Something special here
On Mon, Feb 11, 2019 at 04:38:46PM +0800, Huang, Ying wrote:
> +struct swap_info_struct *get_swap_device(swp_entry_t entry)
> +{
> + struct swap_info_struct *si;
> + unsigned long type, offset;
> +
> + if (!entry.val)
> + goto out;
> + type = swp_type(entry);
> +
Swap PMIC and MULTIFUNCTION words in the title to:
- show that this is about Intel PMICs
- keep MAINTAINERS properly sorted
Signed-off-by: Andy Shevchenko
---
MAINTAINERS | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 9919840d54cd
4.19-stable review patch. If anyone has any objections, please let me know.
--
From: Aaron Lu
[ Upstream commit 66f71da9dd38af17dc17209cdde7987d4679a699 ]
Since a2468cc9bfdf ("swap: choose swap device according to numa node"),
avail_lists field of swap_info_struct is changed to an array with
MAX_NUMNODES elements. This made swap_info_struct size increase
The swap-in logic could be reused independently of the rest of the logic
in shmem_getpage_gfp(). So let's refactor it out as an independent
function.
Signed-off-by: Vineeth Remanan Pillai
---
mm/shmem.c | 449 +
1 file changed, 244 insertions
On Sat 12-01-19 02:55:13, Yang Shi wrote:
> mem_cgroup_is_root() is the preferred API to check whether a memcg is root.
> Use it instead of dereferencing css->parent.
>
> Cc: Huang Ying
> Cc: Tim Chen
> Signed-off-by: Yang Shi
Yes, this is more readable.
Acked-by: Michal Hocko
> ---
>
mem_cgroup_is_root() is the preferred API to check whether a memcg is root.
Use it instead of dereferencing css->parent.
Cc: Huang Ying
Cc: Tim Chen
Signed-off-by: Yang Shi
---
include/linux/swap.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/swap.h
be found. So in try_to_unuse(), even if the swap count of
a swap entry isn't zero, the page needs to be deleted from the swap cache
so that, in the next round, a new page can be allocated and swapped in for
the other mappings of the swapped-out KSM page.
But this contradicts the THP swap support, where
>> index ea8fc0deb814..d6f93f029020 100644
>> --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
>> +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
>> @@ -642,9 +642,6 @@ static int atmel_hlcdc_plane_atomic_check(struct drm_plane *p,
>> 	 * Swap width and size in case of 90 or 270 degrees rotation
>> 	 */
>> 	 if (drm
On 1/10/19 3:31 PM, Andrew Morton wrote:
On Fri, 4 Jan 2019 03:27:52 +0800 Yang Shi wrote:
Swap readahead would read in a few pages regardless of whether the underlying
device is busy or not. It may incur a long waiting time if the device is
congested, and it may also exacerbate the congestion
> >
> > + if (si->flags & (SWP_BLKDEV | SWP_FS)) {
>
> I re-read your discussion with Tim and I must say the reasoning behind this
> test remains foggy.
I was worried that the dereference
inode = si->swap_file->f_mapping->host;
is not always safe for corner cases.
So the test makes sure that
Hi Andrew,
What do you think of these patches? They have been reviewed, and the commit
log has been updated per your and Daniel's comments.
Thanks,
Yang
On 1/3/19 11:27 AM, Yang Shi wrote:
Swap readahead would read in a few pages regardless if the underlying
device is busy or not. It may
The function ipu_csi_init_interface() was inverting the F-bit for
NTSC case, in the CCIR_CODE_1/2 registers. The result being that
for NTSC bottom-top field order, the CSI would swap fields and
capture in top-bottom order.
Instead, base field swap on the field order of the input to the CSI,
and the field order of the requested output. If the input/output
fields are sequential but different, swap fields, otherwise do
not swap. This requires passing both the input and output
have
already been freed or uncharged. Even though anonymous pages get
swapped out, they still get charged to swap space. So it sounds
pointless to do swap for force_empty.
I tried to dig into the history of this; it was introduced by
commit 8c7c6e34a125 ("memcg: mem+swap controller core"), but there is
no clue about why it was done that way in the first place.
4.14-stable review patch. If anyone has any objections, please let me know.
--
From: Michal Hocko
commit 5b5e4d623ec8a34689df98e42d038a3b594d2ff9 upstream.
Swap storage is restricted to max_swapfile_size (~16TB on x86_64) whenever
the system is deemed affected by L1TF
NULL.
> */
> struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
> struct vm_fault *vmf)
> @@ -698,6 +698,20 @@ static void swap_ra_info(struct vm_fault *vmf,
> pte_unmap(orig_pte);
> }
>
> +/**
> + * swap_vma_rea
Swap readahead would read in a few pages regardless of whether the underlying
device is busy or not. It may incur a long waiting time if the device is
congested, and it may also exacerbate the congestion.
Use inode_read_congested() to check whether the underlying device is busy or
not like what file page
static void swap_ra_info(struct vm_fault *vmf,
pte_unmap(orig_pte);
}
+/**
+ * swap_vma_readahead - swap in pages in hope we need them soon
+ * @entry: swap entry of this memory
+ * @gfp_mask: memory allocation flags
+ * @vmf: fault information
+ *
+ * Returns the struct page for entry and addr, after queueing swapin.
+ *
+ * Primi
. Since there should be no
attached tasks to an offlining memcg, the tasks' anonymous pages would have
already been freed or uncharged.
Anon pages can come from tmpfs files as well.
Yes, but they are charged to swap space as regular anon pages.
The point was that the lifetime of tmpfs anon pages is not tied
On Thu, Jan 03, 2019 at 09:10:13AM -0800, Yang Shi wrote:
> How about the below description:
>
> The test with page_fault1 of will-it-scale (sometimes tracing may just show
> runtest.py that is the wrapper script of page_fault1), which basically
> launches NR_CPU threads to generate 128MB
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 78d500e..dd8f698 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -698,6 +698,23 @@ static void swap_ra_info(struct vm_fault *vmf,
pte_unmap(orig_pte);
}
+/**
+ * swap_vm_readahead - swap in pages in hope we need them soon
s/swap_vm_readahead/swap_vma_readahead
(#thr, runtime,
system cpus/memory/swap) for more context.
How about the below description:
The test with page_fault1 of will-it-scale (sometimes tracing may just
show runtest.py, which is the wrapper script of page_fault1) basically
launches NR_CPU threads to generate 128MB anonymous pages
ntry:3.600us
> | do_swap_page();
> runtest.py-1417 [020] 301.935878: funcgraph_entry:7.202us
> | do_swap_page();
Hi Yang, I guess runtest.py just calls page_fault1_thr? Being explicit about
this may improve the changelog for those unfamiliar with will-it-scale.
May also be useful to name will-it-scale and how it was run (#thr, runtime,
system cpus/memory/swap) for more context.
ed or uncharged.
Anon pages can come from tmpfs files as well.
> Even though anonymous pages get
> swapped out, but they still get charged to swap space. So, it sounds
> pointless to do swap for force empty.
>
I understand that force_empty is typically used before rmdir'ing a
need_to_copy() for details.
>
> During swapoff, unuse_vma() uses anon_vma (if available) to locate VMA
> and virtual address mapped to the page, so not all mappings to a
> swapped out KSM page could be found. So in try_to_unuse(), even if
> the swap count of a swap entry
Hi, Andrew,
This patch is based on linus' tree instead of the head of mmotm tree
because it is to fix a bug there.
The bug is introduced by commit e07098294adf ("mm, THP, swap: support to
reclaim swap space for THP swapped out"), which is merged by v4.14-rc1.
So I think we shoul
On 12/21/18 10:34 AM, Tim Chen wrote:
On 12/20/18 4:21 PM, Yang Shi wrote:
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -538,11 +538,17 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
bool do_poll = true, page_allocated;
struct vm_area_struct
You should do it only when (si->flags & SWP_FS) is true.
>>>>> Do you mean it is not safe for swap partition?
>>>> The f_mapping may not be instantiated. It is only done for SWP_FS.
>>> Really? I saw the below calls in swapon:
>>>
>>>
On 12/19/18 11:00 AM, Tim Chen wrote:
On 12/19/18 10:40 AM, Yang Shi wrote:
I don't think your dereference inode = si->swap_file->f_mapping->host
is always safe. You should do it only when (si->flags & SWP_FS) is true.
Do you mean it is not safe for swap partition?
T
levant. As long as it is trying to readahead from swap,
it should check if the underlying device is busy or not regardless of shmem or
anon page.
I don't think your dereference inode = si->swap_file->f_mapping->host
is always safe. You should do it only when (si->flags &
swapin_nr_pages(offset) - 1;
if (!mask)
goto skip;
Shmem will also be using this function and I don't think the
inode_read_congested
logic is relevant for that case.
IMHO, shmem is also relevant. As long as it is trying to readahead from swap,
it should check if the un
>
> Problem is 'pmd' is passed here, which has been pmdp_invalidate()ed under the
> assumption that it is not a swap entry. pmd's pfn bits get inverted for L1TF,
> so the swap entry gets corrupted and this BUG is the first place that notices.
>
> I don't see a reason to invalidate so soon, so wha
long addr = vmf->address;
>>> + struct inode *inode = si->swap_file->f_mapping->host;
>>> mask = swapin_nr_pages(offset) - 1;
>>> if (!mask)
>>> goto skip;
>>>
>> Shmem will also be using this function and I don't thin
On Fri, Dec 14, 2018 at 02:27:43PM +0800, Huang Ying wrote:
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 1cec1eec340e..644cb5d6b056 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -33,6 +33,8 @@
> #include
> #include
> #include
> +#include
> +#include
swap.h is
> + if (IS_ENABLED(CONFIG_THP_SWAP) && is_swap_pmd(old_pmd))
> + return __split_huge_swap_pmd(vma, haddr, pmd);
Problem is 'pmd' is passed here, which has been pmdp_invalidate()ed under the
assumption that it is not a swap entry. pmd's pfn bits get inverted for L1TF,
so the swap entry gets corrupted and this BUG is
On 12/17/18 10:52 PM, Yang Shi wrote:
>
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index fd2f21e..7cc3c29 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -538,11 +538,15 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
> bool do_poll = true,
/Steve-Longerbeam/gpu-ipu-csi-Swap-fields-according-to-input-output-field-types/20181215-135741
config: nds32-allmodconfig (attached as .config)
compiler: nds32le-linux-gcc (GCC) 6.4.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin
Previously, during swapout, all PMD page mappings would be split and
replaced with PTE swap mappings, and when clearing the SWAP_HAS_CACHE
flag for the huge swap cluster in put_swap_page(), the huge swap
cluster would be split. Now, during swapout, the PMD page mappings to
the THP will be changed
During MADV_WILLNEED, for a PMD swap mapping, if THP swapin is enabled
for the VMA, the whole swap cluster will be swapped in. Otherwise, the
huge swap cluster and the PMD swap mapping will be split and fall back
to PTE swap mapping.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutem
During mincore(), for a PMD swap mapping, the swap cache will be looked
up. If the resulting page isn't a compound page, the PMD swap mapping
will be split and fall back to PTE swap mapping processing.
Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli