Hi Ted,
On Sat, Apr 14, 2018 at 8:30 PM, Theodore Y. Ts'o wrote:
> When you open "foo", the resulting file descriptor is not associated
> with the symlink. The resulting file descriptor is the exact same
> thing you would get if you had instead called:
>
> fd = open("/tmp/bar/quux", O_C
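
The point quoted above can be checked from userspace. The following is a minimal sketch, not code from the thread: it assumes a POSIX system with a writable /tmp, and the function name and scratch paths are hypothetical. It opens a file once directly and once through a symlink, then compares the device and inode numbers reported by fstat().

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a descriptor opened through a symlink
 * refers to the same inode as one opened on the target, 0 if it does not,
 * and -1 if any setup step fails.
 */
int symlink_fd_matches_target(void)
{
    char dir[] = "/tmp/symfd-XXXXXX";   /* scratch directory (assumption) */
    char target[64], linkpath[64];
    struct stat via_link, via_target;
    int fd_link = -1, fd_target = -1, result = -1;

    if (!mkdtemp(dir))
        return -1;
    snprintf(target, sizeof(target), "%s/quux", dir);
    snprintf(linkpath, sizeof(linkpath), "%s/foo", dir);

    fd_target = open(target, O_CREAT | O_RDWR, 0600);
    if (fd_target < 0)
        goto out;
    if (symlink(target, linkpath) < 0)
        goto out;

    /* open() follows the symlink, so this descriptor is for the target. */
    fd_link = open(linkpath, O_RDWR);
    if (fd_link < 0)
        goto out;

    if (fstat(fd_link, &via_link) == 0 && fstat(fd_target, &via_target) == 0)
        result = (via_link.st_dev == via_target.st_dev &&
                  via_link.st_ino == via_target.st_ino);
out:
    if (fd_link >= 0)
        close(fd_link);
    if (fd_target >= 0)
        close(fd_target);
    unlink(linkpath);
    unlink(target);
    rmdir(dir);
    return result;
}
```

A return value of 1 means both descriptors name the same inode, which is why an fsync() on the descriptor obtained through "foo" acts on the target file rather than on the symlink.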
Hi Ted,
Thanks for the reply.
On Sat, Apr 14, 2018 at 8:17 PM, Theodore Y. Ts'o wrote:
>
> The only thing I would add to Dave's comments is that a lot of these
> formal semantics are de facto, and not de jure. If you take a look at
> POSIX or the Single Unix Specification, they are remarkably s
On Sat, Apr 14, 2018 at 08:13:28PM -0500, Vijay Chidambaram wrote:
>
> We are *not* saying an fsync on a symlink file has to result in any
> action on the original file. We understand the lack of ordering
> constraints here.
The problem is you're not being precise here. The fsync(2) system
call
The only thing I would add to Dave's comments is that a lot of these
formal semantics are de facto, and not de jure. If you take a look at
POSIX or the Single Unix Specification, they are remarkably silent
about how fsync works.
In fact POSIX/SUS doesn't even define "fsync on a directory". In th
Hi Dave,
Thank you for your detailed reply.
I think we still have a misunderstanding. Bear with me, much of this
may seem obvious to you, but not to us and future readers of this
mailing list :)
We are *not* saying an fsync on a symlink file has to result in any
action on the original file. We u
On Fri, Apr 13, 2018 at 10:27:56PM -0500, Vijay Chidambaram wrote:
> Hi Dave,
>
> Thanks for the reply.
>
> I feel like we are not talking about the same thing here.
>
> What we are asking is: if you perform
>
> fsync(symlink)
> crash
>
> can we expect it to see the symlink file in the parent
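
For context, a common userspace idiom for making a new directory entry durable is to fsync() the parent directory after creating the entry. The sketch below illustrates that idiom only; it does not answer the question asked in the thread, and the function names and scratch paths are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: create dirpath/name as a symlink to target, then
 * fsync() the directory that holds the new entry. Returns 0 on success and
 * -1 on failure (the symlink may still exist if only the fsync failed).
 */
int create_symlink_and_sync_dir(const char *target, const char *dirpath,
                                const char *name)
{
    int dfd = open(dirpath, O_RDONLY | O_DIRECTORY);
    int ret = -1;

    if (dfd < 0)
        return -1;
    if (symlinkat(target, dfd, name) == 0 && fsync(dfd) == 0)
        ret = 0;
    close(dfd);
    return ret;
}

/* Small self-check in a scratch directory under /tmp (assumption). */
int demo_symlink_sync(void)
{
    char dir[] = "/tmp/symsync-XXXXXX";
    char buf[16];
    int ok = -1;

    if (!mkdtemp(dir))
        return -1;
    if (create_symlink_and_sync_dir("quux", dir, "foo") == 0) {
        int dfd = open(dir, O_RDONLY | O_DIRECTORY);
        if (dfd >= 0) {
            /* The link body is "quux", so readlinkat() should return 4. */
            if (readlinkat(dfd, "foo", buf, sizeof(buf)) == 4)
                ok = 0;
            unlinkat(dfd, "foo", 0);
            close(dfd);
        }
    }
    rmdir(dir);
    return ok;
}
```

Whether a given filesystem also persists the entry without the directory fsync is exactly the kind of de facto behaviour the thread is discussing, so this sketch takes the conservative route.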
Use new return type vm_fault_t for page_mkwrite
and fault handler.
Signed-off-by: Souptick Joarder
Reviewed-by: Matthew Wilcox
---
fs/f2fs/file.c | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542..045337a 100644
--- a/fs
On Sat, Apr 14, 2018 at 12:50:30PM -0700, Matthew Wilcox wrote:
> On Mon, Apr 09, 2018 at 04:18:07PM -0500, Goldwyn Rodrigues wrote:
>
> I'm sorry I missed this email. My inbox is a disaster :(
>
> > I tried these patches against next-20180329 and added the patch for the
> > bug reported by Mike
On Mon, Apr 09, 2018 at 04:18:07PM -0500, Goldwyn Rodrigues wrote:
I'm sorry I missed this email. My inbox is a disaster :(
> I tried these patches against next-20180329 and added the patch for the
> bug reported by Mike Kravetz. I am getting the following BUG on ext4 and
> xfs, running generic/
From: Matthew Wilcox
This hopefully temporary function is useful for users who have not yet
been converted to multi-index entries.
Signed-off-by: Matthew Wilcox
---
include/linux/xarray.h | 2 ++
lib/xarray.c | 22 ++
2 files changed, 24 insertions(+)
diff --git
From: Matthew Wilcox
Introduce xarray value entries to replace the radix tree exceptional
entry code. This is a slight change in encoding to allow the use of an
extra bit (we can now store BITS_PER_LONG - 1 bits in a value entry).
It is also a change in emphasis; exceptional entries are intimida
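
To illustrate how BITS_PER_LONG - 1 bits can fit in a pointer-sized entry, here is a userspace sketch. It assumes the integer is shifted left by one and tagged with the low bit, and that real pointers are at least 2-byte aligned; the toy_ names are hypothetical and are not the kernel API.

```c
#include <limits.h>
#include <stdint.h>

/* Number of payload bits left once one bit is spent on the tag. */
#define TOY_VALUE_BITS (sizeof(unsigned long) * CHAR_BIT - 1)

/* Pack an integer into a pointer-sized entry: shift left, set bit 0. */
static inline void *toy_mk_value(unsigned long v)
{
    return (void *)(uintptr_t)((v << 1) | 1UL);
}

/* Recover the integer by dropping the tag bit. */
static inline unsigned long toy_to_value(const void *entry)
{
    return (unsigned long)(uintptr_t)entry >> 1;
}

/* Aligned pointers have bit 0 clear, so bit 0 set marks a value entry. */
static inline int toy_is_value(const void *entry)
{
    return (int)((uintptr_t)entry & 1UL);
}
```

Under this assumed encoding, any integer that fits in TOY_VALUE_BITS bits round-trips through toy_mk_value() and toy_to_value(), while an ordinary aligned pointer is never mistaken for a value.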
From: Matthew Wilcox
I'm not 100% convinced that the rewrite of nilfs_copy_back_pages is
correct, but it will at least have different bugs from the current
version.
Signed-off-by: Matthew Wilcox
---
fs/nilfs2/btnode.c | 37 +---
fs/nilfs2/page.c | 72 +
From: Matthew Wilcox
Includes moving mapping_tagged() to fs.h as a static inline, and
changing it to return bool.
Signed-off-by: Matthew Wilcox
---
include/linux/fs.h | 17 ++-
mm/page-writeback.c | 72 -
2 files changed, 36 insertions(+), 5
From: Matthew Wilcox
Use the XArray APIs to add and replace pages in the page cache. This
removes two uses of the radix tree preload API and is significantly
shorter code.
Signed-off-by: Matthew Wilcox
---
include/linux/swap.h | 8 ++-
mm/filemap.c | 143 ++--
From: Matthew Wilcox
This is a 1:1 conversion.
Signed-off-by: Matthew Wilcox
---
mm/shmem.c | 23 +++
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index 5b8a2d944a0c..784a49aad902 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1085,2
From: Matthew Wilcox
It does not return an error, so we don't need to check the return value
for IS_ERR().
Signed-off-by: Matthew Wilcox
---
fs/dax.c | 16 +---
1 file changed, 1 insertion(+), 15 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index b0efb0a9604a..c6086e7566c3 100644
From: Matthew Wilcox
Switch to a batch-processing model like shmem_wait_for_pins() and
use the xa_state previously set up by shmem_wait_for_pins().
Signed-off-by: Matthew Wilcox
Reviewed-by: Mike Kravetz
---
mm/shmem.c | 44 ++--
1 file changed, 18 inse
From: Matthew Wilcox
A couple of short loops.
Signed-off-by: Matthew Wilcox
---
fs/fs-writeback.c | 25 +
1 file changed, 9 insertions(+), 16 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 18e7807c74ea..87a2827721e0 100644
--- a/fs/fs-writeback.c
From: Matthew Wilcox
This is documentation on how to use the XArray, not details about its
internal implementation.
Signed-off-by: Matthew Wilcox
Acked-by: Josef Bacik
---
Documentation/core-api/index.rst | 1 +
Documentation/core-api/xarray.rst | 361 ++
2 file
From: Matthew Wilcox
This iterator allows the user to efficiently walk a range of the array,
executing the loop body once for each entry in that range that matches
the filter. This commit also includes xa_find() and xa_find_above()
which are helper functions for xa_for_each() but may also be use
From: Matthew Wilcox
Remove the last mentions of radix tree from various comments.
Signed-off-by: Matthew Wilcox
---
mm/shmem.c | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index 2283872a84a1..075b19da8327 100644
--- a/mm/shmem.c
+++
From: Matthew Wilcox
Signed-off-by: Matthew Wilcox
Acked-by: David Sterba
---
fs/btrfs/compression.c | 4 +---
fs/btrfs/extent_io.c | 8 +++-
2 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index dfd73e7265cf..54448d5d86e8 10
From: Matthew Wilcox
Rename the function from page_cache_tree_delete_batch to just
page_cache_delete_batch.
Signed-off-by: Matthew Wilcox
---
mm/filemap.c | 28 +---
1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index faadf1
From: Matthew Wilcox
Simpler code because the xarray takes care of things like the limit and
dereferencing the slot.
Signed-off-by: Matthew Wilcox
---
mm/shmem.c | 18 --
1 file changed, 4 insertions(+), 14 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index 0ead678725c4..
From: Matthew Wilcox
xa_store() differs from radix_tree_insert() in that it will overwrite an
existing element in the array rather than returning an error. This is
the behaviour which most users want, and those that want more complex
behaviour generally want to use the xas family of routines any
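
The difference in semantics described above can be modelled in userspace. This is a toy sketch over a fixed-size array, with hypothetical toy_ names, no locking, and no memory allocation; it is not the kernel implementation.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_SLOTS 16

struct toy_array {
    void *slot[TOY_SLOTS];
};

/*
 * Store-style semantics: overwrite whatever is at the index and return the
 * previous entry (NULL if the slot was empty or the index is out of range).
 */
void *toy_store(struct toy_array *a, size_t index, void *entry)
{
    void *old;

    if (index >= TOY_SLOTS)
        return NULL;
    old = a->slot[index];
    a->slot[index] = entry;
    return old;
}

/*
 * Insert-style semantics: refuse to replace an existing entry and report
 * -EEXIST instead, leaving the slot untouched.
 */
int toy_insert_only(struct toy_array *a, size_t index, void *entry)
{
    if (index >= TOY_SLOTS)
        return -EINVAL;
    if (a->slot[index])
        return -EEXIST;
    a->slot[index] = entry;
    return 0;
}
```

With store-style semantics the caller learns what was replaced; with insert-style semantics the caller only learns that something was already there.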
From: Matthew Wilcox
This first function in the XArray API brings with it a lot of support
infrastructure. The advanced API is based around the xa_state which is
a more capable version of the radix_tree_iter.
As the test-suite demonstrates, it is possible to use the xarray and
radix tree APIs o
From: Matthew Wilcox
This one is trivial.
Signed-off-by: Matthew Wilcox
---
mm/readahead.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/mm/readahead.c b/mm/readahead.c
index c7ddcf60ac6d..50910c27b372 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -174,9 +174,7 @
From: Matthew Wilcox
Signed-off-by: Matthew Wilcox
---
mm/filemap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index f85cdda6744f..bf231ebadb86 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2583,7 +2583,7 @@ static struct page *do_read_cac
From: Matthew Wilcox
Signed-off-by: Matthew Wilcox
---
mm/migrate.c | 41 -
1 file changed, 16 insertions(+), 25 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index f65dd69e1fd1..de1a602d0056 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -323,7
From: Matthew Wilcox
This is a perfect use for xa_cmpxchg(). Note the use of 0 for GFP
flags; we won't be allocating memory.
Signed-off-by: Matthew Wilcox
---
mm/shmem.c | 7 ++-
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index 985c5cdec7f7..0ead
From: Matthew Wilcox
dax_load_hole was swallowing the errors from vm_insert_mixed().
Use vmf_insert_mixed() instead to get a vm_fault_t, and convert
dax_load_hole() to the vm_fault_t convention.
Signed-off-by: Matthew Wilcox
---
fs/dax.c | 9 +
1 file changed, 5 insertions(+), 4 deleti
From: Matthew Wilcox
Instead of storing a pointer to the slot containing the canonical entry,
store the offset of the slot. Produces slightly more efficient code
(~300 bytes) and simplifies the implementation.
Signed-off-by: Matthew Wilcox
Reviewed-by: Josef Bacik
---
include/linux/xarray.h
From: Matthew Wilcox
Both callers of __delete_from_swap_cache have the swp_entry_t already,
so pass that in to make constructing the XA_STATE easier.
Signed-off-by: Matthew Wilcox
---
include/linux/swap.h | 5 +++--
mm/swap_state.c | 24 ++--
mm/vmscan.c | 2
From: Matthew Wilcox
The page cache was the only user of this interface and it has now
been converted to the XArray. Transform the test into a test of
xas_init_tags().
Signed-off-by: Matthew Wilcox
---
include/linux/radix-tree.h | 2 --
lib/radix-tree.c | 13 ---
From: Matthew Wilcox
Add some XArray-based helper functions to replace the radix tree based
metaphors currently in use. The biggest change is that converted code
doesn't see its own lock bit; get_unlocked_entry() always returns an
entry with the lock bit clear, and locking the entry now returns
From: Matthew Wilcox
With no more radix tree API users left, we can drop the GFP flags
and use xa_init() instead of INIT_RADIX_TREE().
Signed-off-by: Matthew Wilcox
---
fs/inode.c | 2 +-
mm/swap_state.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/inode.c b/f
From: Matthew Wilcox
I found another victim of the radix tree being hard to use. Because
there was no call to radix_tree_preload(), khugepaged was allocating
radix_tree_nodes using GFP_ATOMIC.
I also converted a local_irq_save()/restore() pair to
disable()/enable().
Signed-off-by: Matthew Wilc
From: Matthew Wilcox
This conversion keeps the radix tree and XArray data structures in sync
at all times. That allows us to convert the page cache one function at
a time and should allow for easier bisection. Other than renaming some
elements of the structures, the data structures are fundamen
From: Matthew Wilcox
xa_load has its own RCU locking, so we can eliminate it here.
Signed-off-by: Matthew Wilcox
---
mm/shmem.c | 7 +--
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index b538cd71f772..5b8a2d944a0c 100644
--- a/mm/shmem.c
+++ b/mm/sh
From: Matthew Wilcox
Use my_zero_pfn instead of ZERO_PAGE, and pass the vaddr to it so it
works on MIPS and s390.
Signed-off-by: Matthew Wilcox
---
fs/dax.c | 10 +-
1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index e014c99b21fd..b0efb0a9604a 100644
From: Matthew Wilcox
Remove mentions of 'radix' and 'radix tree'. Simplify some names by
dropping the word 'mapping'.
Signed-off-by: Matthew Wilcox
---
fs/dax.c | 87 +++-
1 file changed, 42 insertions(+), 45 deletions(-)
diff --git a/fs/da
From: Matthew Wilcox
Avoids walking the radix tree multiple times looking for tags.
Signed-off-by: Matthew Wilcox
---
fs/dax.c | 17 +
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 19ac013204a1..b68b2f81fa47 100644
--- a/fs/dax.c
+++ b/
From: Matthew Wilcox
Like cmpxchg(), xa_cmpxchg will only store to the index if the current
entry matches the old entry. It returns the current entry, which is
usually more useful than the errno returned by radix_tree_insert().
For the users who really only want the errno, the xa_insert() wrappe
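
The compare-and-exchange behaviour described above can be sketched in userspace as follows. This is a single-threaded toy over a fixed-size array; the cx_ names are hypothetical, and layering the insert helper on top of the compare-and-exchange is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>

#define CX_SLOTS 16

struct cx_array {
    void *slot[CX_SLOTS];
};

/*
 * Store entry at index only if the slot currently holds old. Always returns
 * the entry that was found, so the caller can tell whether the store happened
 * (return value == old) and, if not, what was there instead.
 * Toy limitation: an out-of-range index returns NULL and stores nothing.
 */
void *cx_cmpxchg(struct cx_array *a, size_t index, void *old, void *entry)
{
    void *cur;

    if (index >= CX_SLOTS)
        return NULL;
    cur = a->slot[index];
    if (cur == old)
        a->slot[index] = entry;
    return cur;
}

/* Errno-only wrapper: succeed only if the slot was empty. */
int cx_insert(struct cx_array *a, size_t index, void *entry)
{
    if (index >= CX_SLOTS)
        return -EINVAL;
    return cx_cmpxchg(a, index, NULL, entry) == NULL ? 0 : -EEXIST;
}
```

Returning the current entry lets a caller that loses the race use the entry it found without a second lookup, which an errno alone cannot offer.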
From: Matthew Wilcox
Simplify the locking by taking the spinlock while we walk the tree on
the assumption that many acquires and releases of the lock will be worse
than holding the lock while we process an entire batch of pages.
Signed-off-by: Matthew Wilcox
Reviewed-by: Mike Kravetz
---
mm/s
From: Matthew Wilcox
The following functions are (now) unused:
- __radix_tree_delete_node
- radix_tree_for_each_contig
- radix_tree_gang_lookup_slot
- radix_tree_join
- radix_tree_maybe_preload_order
- radix_tree_split
- radix_tree_split_preload
Signed-off-by: Matthew Wilcox
---
.clang-
From: Matthew Wilcox
Use XArray iteration instead of a pagevec.
Signed-off-by: Matthew Wilcox
---
fs/dax.c | 129 ++-
1 file changed, 61 insertions(+), 68 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index b68b2f81fa47..87da84c761a7 100644
From: Matthew Wilcox
Introduce page_cache_pin() to factor out the common logic between the
various lookup routines:
find_get_entry
find_get_entries
find_get_pages_range
find_get_pages_contig
find_get_pages_range_tag
find_get_entries_tag
filemap_map_pages
By using the xa_state to control the ite
From: Matthew Wilcox
This is the last part of DAX to be converted to the XArray so
remove all the old helper functions.
Signed-off-by: Matthew Wilcox
---
fs/dax.c | 368 ++-
1 file changed, 92 insertions(+), 276 deletions(-)
diff --git a/fs/
From: Matthew Wilcox
Since the XArray is embedded in the struct address_space, this contains
exactly as much entropy as the address of the mapping.
Signed-off-by: Matthew Wilcox
---
fs/dax.c | 29 +++--
1 file changed, 15 insertions(+), 14 deletions(-)
diff --git a/fs/
From: Matthew Wilcox
This is a direct replacement for struct radix_tree_node. A couple of
struct members have changed name, so convert those. Use a #define so
that radix tree users continue to work without change.
Signed-off-by: Matthew Wilcox
Reviewed-by: Josef Bacik
---
include/linux/radi
From: Matthew Wilcox
Change i_pages from a radix_tree_root to an xarray, convert the
documentation into kernel-doc format and change the order of the elements
to pack them better on 64-bit systems.
Signed-off-by: Matthew Wilcox
---
include/linux/fs.h | 46 +++---
From: Matthew Wilcox
Slightly shorter and easier to read code.
Signed-off-by: Matthew Wilcox
---
mm/khugepaged.c | 17 +
1 file changed, 5 insertions(+), 12 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 43598cc5998b..28579ad0c5fe 100644
--- a/mm/khugepaged.
From: Matthew Wilcox
The page cache offers the ability to search for a miss in the previous or
next N locations. Rather than teach the XArray about the page cache's
definition of a miss, use xas_prev() and xas_next() to search the page
array. This should be more efficient as it does not have to
From: Matthew Wilcox
Add myself as XArray and IDR maintainer.
Signed-off-by: Matthew Wilcox
---
MAINTAINERS | 12
1 file changed, 12 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 0a1410d5a621..3fec61e86022 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15386,6 +15386
From: Matthew Wilcox
Instead of calling find_get_pages_range() and putting any reference,
use xas_find() to iterate over any entries in the range, skipping the
shadow/swap entries.
Signed-off-by: Matthew Wilcox
---
mm/filemap.c | 26 ++
1 file changed, 18 insertions(+),
From: Matthew Wilcox
This removes the last caller of radix_tree_maybe_preload_order().
Simpler code, unless we run out of memory for new xa_nodes partway through
inserting entries into the xarray. Hopefully we can support multi-index
entries in the page cache soon and all the awful code goes awa
From: Matthew Wilcox
This function frees all the internal memory allocated to the xarray
and reinitialises it to be empty.
Signed-off-by: Matthew Wilcox
---
include/linux/xarray.h | 1 +
lib/xarray.c | 28
2 files changed, 29 insertions(+)
diff --git a/
From: Matthew Wilcox
shmem_radix_tree_replace() is renamed to shmem_xa_replace() and
converted to use the XArray API.
Signed-off-by: Matthew Wilcox
---
mm/shmem.c | 22 --
1 file changed, 8 insertions(+), 14 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index e47aaaed0
From: Matthew Wilcox
This is a straightforward conversion.
Signed-off-by: Matthew Wilcox
---
fs/f2fs/data.c | 3 +--
fs/f2fs/dir.c | 2 +-
fs/f2fs/inline.c | 4 ++--
fs/f2fs/node.c | 9 +++--
4 files changed, 7 insertions(+), 11 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/da
From: Matthew Wilcox
Signed-off-by: Matthew Wilcox
---
drivers/staging/lustre/lustre/llite/glimpse.c | 12 +---
drivers/staging/lustre/lustre/mdc/mdc_request.c | 16
2 files changed, 13 insertions(+), 15 deletions(-)
diff --git a/drivers/staging/lustre/lustre/llite/g
From: Matthew Wilcox
Mostly comment fixes, but one use of __xa_set_tag.
Signed-off-by: Matthew Wilcox
---
fs/buffer.c | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index fda79261aa08..2dcca4263b5c 100644
--- a/fs/buffer.c
+++ b/fs/
From: Matthew Wilcox
The code is slightly shorter and simpler.
Signed-off-by: Matthew Wilcox
---
mm/filemap.c | 30 ++
1 file changed, 14 insertions(+), 16 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index 070b5e4527ac..4af06a1a9818 100644
--- a/mm/filema
From: Matthew Wilcox
xa_find() is a slightly easier API to use than
radix_tree_gang_lookup_slot() because it contains its own RCU locking.
Signed-off-by: Matthew Wilcox
---
mm/shmem.c | 14 --
1 file changed, 4 insertions(+), 10 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
in
From: Matthew Wilcox
Combine __add_to_swap_cache and add_to_swap_cache into one function
since there is no more need to preload.
Signed-off-by: Matthew Wilcox
---
mm/swap_state.c | 93 +++--
1 file changed, 29 insertions(+), 64 deletions(-)
diff --g
From: Matthew Wilcox
Removes sparse warnings.
Signed-off-by: Matthew Wilcox
---
fs/btrfs/extent_io.c | 4 ++--
fs/ext4/inode.c | 2 +-
fs/f2fs/data.c | 2 +-
fs/gfs2/aops.c | 2 +-
include/linux/pagevec.h | 8 +---
mm/swap.c | 4 ++--
6 files chan
From: Matthew Wilcox
This is a direct replacement for struct radix_tree_root. Some of the
struct members have changed name; convert those, and use a #define so
that radix_tree users continue to work without change.
Signed-off-by: Matthew Wilcox
Reviewed-by: Josef Bacik
---
include/linux/radi
From: Matthew Wilcox
This function combines the functionality of radix_tree_gang_lookup() and
radix_tree_gang_lookup_tagged(). It extracts entries matching the
specified filter into a normal array.
Signed-off-by: Matthew Wilcox
---
include/linux/xarray.h | 2 ++
lib/xarray.c | 80 +
From: Matthew Wilcox
XArray tags are slightly more strongly typed than the radix tree tags,
but occupy the same bits. This commit also adds the xas_ family of tag
operations, for cases where the caller is already holding the lock, and
xa_tagged() to ask whether any array member has a particular
From: Matthew Wilcox
The only user of this functionality was the page cache, and it's now
been converted to the XArray.
Signed-off-by: Matthew Wilcox
---
include/linux/radix-tree.h | 4 +---
lib/idr.c | 2 +-
lib/radix-tree.c | 25 +
From: Matthew Wilcox
These two functions move the xas index by one position, and adjust the
rest of the iterator state to match it. This is more efficient than
calling xas_set() as it keeps the iterator at the leaves of the tree
instead of walking the iterator from the root each time.
Signed-of
From: Matthew Wilcox
Quite a straightforward conversion.
Signed-off-by: Matthew Wilcox
---
mm/huge_memory.c | 17 +++--
1 file changed, 7 insertions(+), 10 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 14ed6ee5e02f..f2d6b53cb8e9 100644
--- a/mm/huge_memory.c
From: Matthew Wilcox
This is essentially xa_cmpxchg() with the locking handled above us,
and it doesn't have to handle replacing a NULL entry.
Signed-off-by: Matthew Wilcox
---
mm/truncate.c | 15 ++-
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/mm/truncate.c b/mm/
From: Matthew Wilcox
We construct a fake XA_STATE and use it to delete the node with xa_store()
rather than adding a special function for this unique use case.
Signed-off-by: Matthew Wilcox
---
include/linux/swap.h | 9
mm/workingset.c | 51 +++---