I thought btrfs supports 64-bit inodes, in which case this will
truncate the inode number on 32-bit architectures.
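To illustrate the concern, here is a minimal userspace sketch, not the btrfs trace code. It assumes a 64-bit inode number is stored in a field that is only 32 bits wide, as would happen with a 32-bit `unsigned long` or `ino_t`. The helper names are hypothetical and `uint32_t` stands in for the narrower type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for storing a 64-bit inode number in a field
 * that is only 32 bits wide: the upper 32 bits are silently dropped. */
static uint32_t store_ino_32(uint64_t ino)
{
	return (uint32_t)ino;
}

/* Returns 1 if the inode number survives the narrowing unchanged. */
static int ino_fits_32(uint64_t ino)
{
	return (uint64_t)store_ino_32(ino) == ino;
}

/* Small usage example: an inode number above 2^32 loses its high bits. */
void ino_truncation_demo(void)
{
	uint64_t big = 0x100000001ULL;	/* 2^32 + 1 */

	assert(store_ino_32(big) == 1u);
	assert(!ino_fits_32(big));
}
```

Two distinct inodes whose numbers differ only in the upper 32 bits would then show up under the same value, which is why a 64-bit type is the safer choice for the stored field.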
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordom
On Mon, Jun 12, 2017 at 05:38:13PM -0500, Goldwyn Rodrigues wrote:
> We had FS_NOWAIT in filesystem type flags (in v3), but retracted it
> later in v4.
A per-fs flag is wrong, as different file_operations may have different
capabilities.
> I will work on adding FMODE_AIO_NOWAIT in the meantime.
If Al prefer
On Mon, 12 Jun 2017 11:00:31 +0200, Henk Slager wrote:
> Hi all,
>
> there is a 1-block corruption on an 8TB filesystem that showed up several
> months ago. The fs is almost exclusively a btrfs receive target and
> receives monthly sequential snapshots from two hosts but 1 received
> uuid. I do not k
On Mon, Jun 12, 2017 at 08:23:15AM -0400, Jeff Layton wrote:
> Just set the FS_WB_ERRSEQ flag to indicate that we want to use errseq_t
> based error reporting. Internal filemap_* calls are left as-is for now.
>
> Signed-off-by: Jeff Layton
> ---
> fs/xfs/xfs_super.c | 2 +-
> 1 file changed, 1 i
Signed-off-by: Anand Jain
---
v2: fix another location where ino_t was required.
include/trace/events/btrfs.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index e37973526153..692207e3f5d5 100644
--- a/include
On 06/10/2017 12:34 AM, Al Viro wrote:
> On Thu, Jun 08, 2017 at 12:39:10AM -0700, Christoph Hellwig wrote:
>> As already indicated this whole series looks fine to me.
>>
>> Al: are you going to pick this up? Or Andrew?
>
> The main issue here is "let's bail out from ->write_iter() instances"
>
Since dio submit has used bio_clone_fast, the submitted bio may not have a
reliable bi_vcnt, for the bio vector iterations in checksum related
functions, bio->bi_iter is not modified yet and it's safe to use
bio_for_each_segment, while for those bio vector iterations in dio read's
endio, we now sav
On Mon, Jun 12, 2017 at 05:29:41PM +0200, David Sterba wrote:
> We can hardcode GFP_NOFS to btrfs_io_bio_alloc, although it means we
> change it back from GFP_KERNEL in scrub. I'd rather save a few stack
> bytes from not passing the gfp flags in the remaining, more important,
> contexts and the bi
On Tue, Jun 06, 2017 at 04:45:28PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval
>
> The extents marked in pin_down_extent() will be unpinned later in
> unpin_extent_range(), which decrements total_bytes_pinned.
> pin_down_extent() must increment the counter to avoid underflowing it.
> Also a
On Tue, Jun 06, 2017 at 04:45:26PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval
>
> There are a few places where we pass in a negative num_bytes, so make it
> signed for clarity. Also move it up in the file since later patches will
> need it there.
>
Reviewed-by: Liu Bo
-liubo
> Signed-o
On Mon, Jun 12, 2017 at 05:29:39PM +0200, David Sterba wrote:
> We use btrfs_bioset for bios and ask to allocate the entire size of
> btrfs_io_bio from btrfs bio_alloc_bioset. The member 'bio' is
> initialized but the bytes from 0 to offset of 'bio' are left
> uninitialized. Although we initialize
Main part is the initialization of btrfs_io_bio, the rest is doc & cleanup.
David Sterba (3):
btrfs: document mandatory order of bio in btrfs_io_bio
btrfs: add helper to initialize the non-bio part of btrfs_io_bio
btrfs: sink gfp parameter to btrfs_io_bio_alloc
fs/btrfs/check-integrity.c |
We can hardcode GFP_NOFS to btrfs_io_bio_alloc, although it means we
change it back from GFP_KERNEL in scrub. I'd rather save a few stack
bytes from not passing the gfp flags in the remaining, more important,
contexts and the bio allocating API now looks more consistent.
Signed-off-by: David Ster
We use btrfs_bioset for bios and ask to allocate the entire size of
btrfs_io_bio from btrfs bio_alloc_bioset. The member 'bio' is
initialized but the bytes from 0 to offset of 'bio' are left
uninitialized. Although we initialize some of the members in our
helpers, we should initialize the whole str
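The idea of initializing everything that precedes the embedded 'bio' can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the btrfs helper: the struct layouts and all `*_model` names are hypothetical, and the only property carried over from the description is that 'bio' is the last member and has already been initialized by the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the embedded struct bio. */
struct bio_model {
	int initialized_by_allocator;
};

/* Hypothetical container: private fields first, 'bio' must stay last. */
struct io_bio_model {
	unsigned int mirror_num;
	unsigned char *csum;
	struct bio_model bio;
};

/* Zero every byte that precedes 'bio', leaving 'bio' itself untouched. */
static void io_bio_model_init(struct io_bio_model *b)
{
	assert(b != NULL);
	memset(b, 0, offsetof(struct io_bio_model, bio));
}
```

Using `offsetof` ties the cleared range to the member order, which is why the position of 'bio' inside the container has to stay fixed for this approach to work.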
Signed-off-by: David Sterba
---
fs/btrfs/volumes.h | 4
1 file changed, 4 insertions(+)
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 58b97b6f5f02..35327efecdbb 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -281,6 +281,10 @@ struct btrfs_io_bio {
u8 *csum_
On Mon, Jun 12, 2017 at 06:03:15PM +0200, Hans van Kranenburg wrote:
> >> Most interesting changes since v1:
> >> - mention the special tree_id input value 0
> >> - rewrite the part about min_key and max_key, trying to be more concise
> >
> > I find the description instructive enough so the expa
On Fri, Jun 09, 2017 at 02:52:29PM +0800, Anand Jain wrote:
> So remove the static check of send error
Please improve the changelog text.
> Signed-off-by: Anand Jain
> ---
> fs/btrfs/disk-io.c | 19 +--
> 1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/fs/btrf
On Fri, Jun 09, 2017 at 02:52:28PM +0800, Anand Jain wrote:
> Commit
> 9035b5dbc576 btrfs: btrfs_io_bio_alloc never fails, skip error handling
>
> removed the -ENOMEM return from write_dev_flush() so no need to
> check for the -ENOMEM during send.
>
> This patch also peels write_dev_flush's wait
On Tue, Jun 06, 2017 at 12:20:32AM +0200, Hans van Kranenburg wrote:
> A programmer who is trying to implement calling the btrfs SEARCH
> or SEARCH_V2 ioctl will probably soon end up reading this struct
> definition.
>
> Properly document the input fields to prevent common misconceptions:
> 1. Th
On 06/12/2017 05:38 PM, David Sterba wrote:
> On Tue, Jun 06, 2017 at 12:20:32AM +0200, Hans van Kranenburg wrote:
>> A programmer who is trying to implement calling the btrfs SEARCH
>> or SEARCH_V2 ioctl will probably soon end up reading this struct
>> definition.
>>
>> Properly document the input
On Fri, Jun 09, 2017 at 09:44:31AM +0300, Nikolay Borisov wrote:
>
>
> On 7.06.2017 21:09, Sargun Dhillon wrote:
> > This patch is a small performance optimization to get rid of a spin
> > lock, where instead an atomic64_t can be used.
> >
> > Signed-off-by: Sargun Dhillon
>
> I've already se
On Tue, Jun 06, 2017 at 01:52:52PM -0600, Liu Bo wrote:
> With switching to use btrfs_bio_clone_partial() to split bio in
> directIO path, read endio is also adapted to that by recording an
> iterator in btrfs_bio, however, it breaks those bios which are less
> than stripe length thus no need to be
On Tue, Jun 06, 2017 at 03:00:58PM +0200, Hans van Kranenburg wrote:
> While talking to another btrfs user on IRC today, it became clear that a
> major point of confusion in the btrfs send manual is that it's not
> telling the user soon enough that send/receive solely operates on
> subvolume snapsh
On Tue, Jun 06, 2017 at 04:45:26PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval
>
> There are a few places where we pass in a negative num_bytes, so make it
> signed for clarity. Also move it up in the file since later patches will
> need it there.
>
> Signed-off-by: Omar Sandoval
Reviewe
On Tue, Jun 06, 2017 at 04:45:27PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval
>
> Signed-off-by: Omar Sandoval
> ---
> fs/btrfs/extent-tree.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 7c01b4e9e3b6..60
On Wed, Jun 07, 2017 at 11:19:37AM -0700, Omar Sandoval wrote:
> On Fri, Jun 02, 2017 at 06:58:36PM +0200, David Sterba wrote:
> > Update direct callers of btrfs_bio_clone that do error handling, that we
> > can now remove.
> >
> > Signed-off-by: David Sterba
> > ---
> > fs/btrfs/inode.c | 4 -
On Wed, Jun 07, 2017 at 03:40:16PM +0800, Anand Jain wrote:
>
>
> On 06/03/17 00:58, David Sterba wrote:
> > All callers pass gfp_flags=GFP_NOFS and nr_vecs=BIO_MAX_PAGES.
>
> The line (in the other thread) mentioning the reason to remove
> __GFP_HIGH can go into the commit log here.
Makes
On Mon, Jun 12, 2017 at 06:10:28PM +0800, Anand Jain wrote:
> Signed-off-by: Anand Jain
There's one more in btrfs__qgroup_data_map, please fix it as well.
/commits/Anand-Jain/btrfs-add-compression-trace-points/20170612-184615
base: https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
config: s390-allmodconfig (attached as .config)
compiler: s390x-linux-gnu-gcc (Debian 6.3.0-18) 6.3.0 20170516
reproduce:
wget
https
On Mon, Jun 12, 2017 at 02:40:38PM +0200, David Sterba wrote:
> On Fri, Jun 09, 2017 at 08:50:12AM -0700, Filip Bystricky wrote:
> > Dear btrfs maintainers,
> > Google is evaluating btrfs for its potential use in android, but
> > currently the lack of native file-based encryption unfortunately make
On Mon, Jun 12, 2017 at 08:23:10AM -0400, Jeff Layton wrote:
> For now, only do this when the FS_WB_ERRSEQ flag is set. The
> AS_EIO/AS_ENOSPC flags are not currently cleared in the older code when
> writeback initiation fails, only when we discover an error after waiting
> on writeback to complete
On Mon, Jun 12, 2017 at 08:23:06AM -0400, Jeff Layton wrote:
> Add a new FS_WB_ERRSEQ flag to the fstype. Later patches will set and
> key off of that to decide what behavior should be used.
Please invert this so that only file systems that keep the old semantics
need a flag.
Ensure that we get an error back on all fds when a block device is
open by multiple writers and writeback fails.
Signed-off-by: Jeff Layton
---
tests/generic/998 | 64 +++
tests/generic/998.out | 2 ++
tests/generic/group | 1 +
3 files cha
On Mon, Jun 12, 2017 at 08:23:11AM -0400, Jeff Layton wrote:
> Allow filesystems to opt-in to a final check of wb_err if FS_WB_ERRSEQ
> is set. Technically, we could just plumb these calls into all of the
> fsync operations, but I think this means less code, changes and churn.
Please add it to eve
Make a new btrfs/999 test that works the way Chris Mason suggested:
Build a filesystem with 2 devices that stripes the data across
both devices, but mirrors metadata across both. Then, make one
of the devices fail and see how fsync is handled.
Signed-off-by: Jeff Layton
---
tests/btrfs/999 |
I'm working on a set of kernel patches to change how writeback errors
are handled and reported in the kernel. Instead of reporting a
writeback error to only the first fsync caller on the file, I aim
to make the kernel report them once on every file description.
This patch adds a test for the new b
The writeback error handling test requires that you put the journal on a
separate device. This allows us to use dmerror to simulate data
writeback failure, without affecting the journal.
xfs already has infrastructure for this (à la $SCRATCH_LOGDEV), so wire
up the ext4 code so that it can do the
Signed-off-by: Jeff Layton
---
common/rc | 11 ++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/common/rc b/common/rc
index 08807ac7c22a..46b890cbff6a 100644
--- a/common/rc
+++ b/common/rc
@@ -832,7 +832,16 @@ _scratch_mkfs()
mkfs_cmd="$MKFS_BTRFS_PROG"
v4: respin set based on Eryu's comments
These tests are companion tests to the patchset I recently posted with
the cover letter:
[PATCH v6 00/20] fs: enhanced writeback error reporting with errseq_t (pile
#1)
This set just adds 3 new xfstests to test writeback behavior. One generic
filesyst
On Fri, Jun 09, 2017 at 08:50:12AM -0700, Filip Bystricky wrote:
> Dear btrfs maintainers,
> Google is evaluating btrfs for its potential use in android, but
> currently the lack of native file-based encryption unfortunately makes
> it a nonstarter.
The file-based encryption is covered by the fscr
v6:
===
This is the sixth posting of the patchset to revamp the way writeback
errors are tracked and reported.
This is a smaller set than the last one. The main difference from the
last set is that this one just adds errseq_t based error reporting for
the purposes of fsync, while leaving the inter
The callers all set it to 1.
Also, make it clear that this function will not set any sort of AS_*
error, and that the caller must do so if necessary. No existing caller
uses this on normal files, so none of them need it.
Also, add __must_check here since, in general, the callers need to
handle an
The -EIO returned here can end up overriding whatever error is marked in
the address space, and be returned at fsync time, even when there is a
more appropriate error stored in the mapping.
Read errors are also sometimes tracked on a per-page level using
PG_error. Suppose we have a read error on a
Signed-off-by: Jeff Layton
Reviewed-by: Jan Kara
Reviewed-by: Matthew Wilcox
Reviewed-by: Christoph Hellwig
---
fs/buffer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index 161be58c5cb0..4be8b914a222 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
To enable that, make __errseq_set return the value that it was set to
when we exit the loop. Take heed that that value is not suitable as a
later "since" value, as it will not have been marked seen.
Signed-off-by: Jeff Layton
---
include/linux/errseq.h | 2 +-
include/linux/fs.h
I noticed on xfs that I could still sometimes get back an error on fsync
on a fd that was opened after the error condition had been cleared.
The problem is that the buffer code sets the write_io_error flag and
then later checks that flag to set the error in the mapping. That flag
persists for qui
ext2 currently does a test+clear of the AS_EIO flag, which
is problematic for some coming changes.
What we really need to do instead is call filemap_check_errors
in __generic_file_fsync after syncing out the buffers. That
will be sufficient for this case, and help other callers detect
these err
ext4 uses the blockdev mapping for tracking metadata stored in the
pagecache. Sample its wb_err when opening a file and store that in
the f_md_wb_err field.
Change ext4_sync_file to check for data errors first, and then check the
blockdev mapping for metadata errors afterward.
Note that because m
Now that we have new infrastructure for handling writeback errors using
errseq_t, we need to convert the existing code to use it. We could
attempt to retrofit the old interfaces on top of the new, but there is
a conceptual disconnect here in the case of internal callers that
invoke filemap_fdatawai
Jan Kara's description for this patch is much better than mine, so I'm
quoting it verbatim here:
-8<-
DAX currently doesn't set errors in the mapping when cache flushing
fails in dax_writeback_mapping_range(). Since this function can get
called only from fsync(2) or
Just check and advance the data errseq_t in struct file
before returning from fsync on normal files. Internal filemap_*
callers are left as-is.
We also set the FS_WB_ERRSEQ flag just for completeness' sake.
Not much is really using it at this point.
Signed-off-by: Jeff Layton
---
fs/xfs/x
This is a very minimal conversion to errseq_t based error tracking
for raw block device access.
The only real change that is strictly required is that we must ensure that
filemap_report_wb_err is unconditionally called after fsync, which is
now done if FS_WB_ERRSEQ is set in fs_flags. That ensures tha
Some filesystems keep a different mapping for metadata writeback. Add a
second errseq_t to struct file for tracking metadata writeback errors.
Also add a new function for checking a mapping of the caller's choosing
vs. the f_md_wb_err value.
Signed-off-by: Jeff Layton
---
include/linux/fs.h
When a writeback error occurs, we want later callers to be able to pick
up that fact when they go to wait on that writeback to complete.
Traditionally, we've used AS_EIO/AS_ENOSPC flags to track that, but
that's problematic since only one "checker" will be informed when an
error occurs.
In later p
An errseq_t is a way of recording errors in one place, and allowing any
number of "subscribers" to tell whether an error has been set again
since a previous time.
It's implemented as an unsigned 32-bit value that is managed with atomic
operations. The low order bits are designated to hold an error
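The subscriber idea described above can be sketched with a simplified userspace model. This is not the kernel's errseq implementation: it assumes the low 12 bits hold the error value and the remaining bits act as a change counter, it omits the "seen" tracking that the real type performs, and every `model_*` name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_ERR_BITS 12u
#define MODEL_ERR_MASK ((1u << MODEL_ERR_BITS) - 1u)

typedef _Atomic uint32_t errseq_model_t;

/* Record a positive errno-style value and bump the change counter. */
static void model_set(errseq_model_t *eseq, unsigned int err)
{
	uint32_t old = atomic_load(eseq);
	uint32_t next;

	assert(err > 0 && err <= MODEL_ERR_MASK);
	do {
		/* Counter lives above the error bits; unsigned wraparound is fine. */
		next = ((old & ~MODEL_ERR_MASK) + (1u << MODEL_ERR_BITS)) | err;
	} while (!atomic_compare_exchange_weak(eseq, &old, next));
}

/* Take a cursor that a subscriber can compare against later. */
static uint32_t model_sample(errseq_model_t *eseq)
{
	return atomic_load(eseq);
}

/* Return the negated error if the value changed since *since and move
 * the cursor forward, so each subscriber sees a given error only once. */
static int model_check_and_advance(errseq_model_t *eseq, uint32_t *since)
{
	uint32_t cur = atomic_load(eseq);

	if (cur == *since)
		return 0;
	*since = cur;
	return -(int)(cur & MODEL_ERR_MASK);
}
```

In this model, two cursors sampled before an error is recorded each report it once, independently of each other, which is the property the AS_EIO/AS_ENOSPC flag approach cannot provide.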
Add the FS_WB_ERRSEQ flag to indicate to other subsystems that errseq_t
based error reporting for data writeback is in effect, and to opt-in to
reporting those errors in call_fsync.
ext4 uses the blockdev mapping for tracking metadata stored in the
pagecache. Sample its wb_err when opening a file
Set the FS_WB_ERRSEQ flag to opt-in to errseq_t based reporting.
Internal calls to filemap_* functions are left as-is.
Signed-off-by: Jeff Layton
---
fs/btrfs/super.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 4f1cdd5058f1..c99af0
Allow filesystems to opt-in to a final check of wb_err if FS_WB_ERRSEQ
is set. Technically, we could just plumb these calls into all of the
fsync operations, but I think this means less code, changes and churn.
Signed-off-by: Jeff Layton
---
include/linux/fs.h | 20 ++--
1 file c
Internal callers of filemap_* functions are left as-is.
Signed-off-by: Jeff Layton
---
fs/btrfs/file.c | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index da1096eb1a40..4632f16bc49c 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
I waxed a little loquacious here, but I figured that more detail was
better, and writeback error handling is so hard to get right.
Although I think we'll eventually remove it once the transition is
complete, I've gone ahead and documented the FS_WB_ERRSEQ flag as well.
Cc: Jan Kara
Signed-off-by
Let's try to make this extra clear for fs authors.
Also, although I think we'll eventually remove it once the transition is
complete, I've gone ahead and documented the FS_WB_ERRSEQ flag as well.
Cc: Jan Kara
Signed-off-by: Jeff Layton
---
Documentation/filesystems/vfs.txt | 48 +++
Just set the FS_WB_ERRSEQ flag to indicate that we want to use errseq_t
based error reporting. Internal filemap_* calls are left as-is for now.
Signed-off-by: Jeff Layton
---
fs/xfs/xfs_super.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_su
Most filesystems currently use mapping_set_error and
filemap_check_errors for setting and reporting/clearing writeback errors
at the mapping level. filemap_check_errors is indirectly called from
most of the filemap_fdatawait_* functions and from
filemap_write_and_wait*. These functions are called f
The error code should be negative. Since this ends up in the default
case anyway, this is harmless, but it's less confusing to negate it.
Also, later patches will require a negative error code here.
Signed-off-by: Jeff Layton
Reviewed-by: Ross Zwisler
Reviewed-by: Jan Kara
Reviewed-by: Matthew
Don't try to check PageError since that's potentially racy and not
necessarily going to be set after writepage errors out.
Instead, check the mapping for an error after writepage returns.
Signed-off-by: Jeff Layton
Reviewed-by: Jan Kara
---
mm/page-writeback.c | 15 +++
1 file chan
On 12.06.2017 12:44, Anand Jain wrote:
>
> Thanks for the review.
>
>> I haven't read previous submissions and any discussions that might have
>> occurred there but why not just stick to
>> btrfs_data_compression/btrfs_data_compressor. I know there is certain
>> semantic overload since we call
Signed-off-by: Anand Jain
---
include/trace/events/btrfs.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index e37973526153..6494e34b6df9 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
Thanks for the review.
I haven't read previous submissions and any discussions that might have
occurred there but why not just stick to
btrfs_data_compression/btrfs_data_compressor. I know there is certain
semantic overload since we call it a compressor yet it also does
decompression but let's
Hi all,
there is a 1-block corruption on an 8TB filesystem that showed up several
months ago. The fs is almost exclusively a btrfs receive target and
receives monthly sequential snapshots from two hosts but 1 received
uuid. I do not know exactly when the corruption has happened but it
must have been rou
On 12.06.2017 11:32, Anand Jain wrote:
> This patch adds compression and decompression trace points for the
> purpose of debugging.
>
> Signed-off-by: Anand Jain
> ---
> v2:
> . Use better naming.
>(If transform is not good enough I have run out of ideas, pls suggest).
> . To be applied o
This patch adds compression and decompression trace points for the
purpose of debugging.
Signed-off-by: Anand Jain
---
v2:
. Use better naming.
(If transform is not good enough I have run out of ideas, pls suggest).
. To be applied on top of
git://git.kernel.org/pub/scm/linux/kernel/git/k