date:20170710

Re: [PATCH v2 3/4] btrfs: Add zstd support

2017-07-10 Thread Nick Terrell

On 7/10/17, 5:36 AM, "Austin S. Hemmelgarn"  wrote:
> On 2017-07-07 23:07, Adam Borowski wrote:
>> On Sat, Jul 08, 2017 at 01:40:18AM +0200, Adam Borowski wrote:
>>> On Fri, Jul 07, 2017 at 11:17:49PM +, Nick Terrell wrote:
>>>> On 7/6/17, 9:32 AM, "Adam Borowski"  wrote:
>>>>> Got a reproducible crash on amd64:
>>>>
>>>> Thanks for the bug report Adam! I'm looking into the failure, and haven't
>>>> been able to reproduce it yet. I've built my kernel from your tree, and
>>>> I ran your script with the kernel.tar tarball 100 times, but haven't gotten
>>>> a failure yet.
>>>>
>>>> I have a few questions to guide my debugging.
>>>>
>>>> - How many cores are you running with? I’ve run the script with 1, 2, and
>>>> 4 cores.
>>>> - Which version of gcc are you using to compile the kernel? I’m using
>>>> gcc-6.2.0-5ubuntu12.
>>>> - Are the failures always in exactly the same place, and does it fail 100%
>>>> of the time or just regularly?
>>>
>>> 6 cores -- all on bare metal.  gcc-7.1.0-9.
>>> Lemme try with gcc-6, a different config or in a VM.
>>
>> I've tried the following:
>> * gcc-6, defconfig (+btrfs obviously)
>> * gcc-7, defconfig
>> * gcc-6, my regular config
>> * gcc-7, my regular config
>> * gcc-7, debug + UBSAN + etc
>> * gcc-7, defconfig, qemu-kvm with only 1 core
>>
>> Every build with gcc-7 reproduces the crash, every with gcc-6 does not.
>>
> Got a GCC7 tool-chain built, and I can confirm this here too, tested 
> with various numbers of cores ranging from 1-32 in a QEMU+KVM VM, with 
> various combinations of debug options and other config switches.

The problem is caused by a gcc-7 bug [1]. It miscompiles
ZSTD_wildcopy(void *dst, void const *src, ptrdiff_t len) when len is 0.
It only happens when it can't analyze ZSTD_copy8(), which is the case in
the kernel, because memcpy() is implemented with inline assembly. The
generated code is slow anyways, so I propose this workaround, which will
be included in the next patch set. I've confirmed that it fixes the bug for
me. This alternative implementation is also 10-20x faster, and compiles to
the same x86 assembly as the original ZSTD_wildcopy() with the userland
memcpy() implementation [2].

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81388#add_comment
[2] https://godbolt.org/g/q5YpLx

Signed-off-by: Nick Terrell 
---
 lib/zstd/zstd_internal.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/zstd/zstd_internal.h b/lib/zstd/zstd_internal.h
index 6748719..ade0365 100644
--- a/lib/zstd/zstd_internal.h
+++ b/lib/zstd/zstd_internal.h
@@ -126,7 +126,9 @@ static const U32 OF_defaultNormLog = OF_DEFAULTNORMLOG;
 /*-***
 *  Shared functions to include for inlining
 */
-static void ZSTD_copy8(void *dst, const void *src) { memcpy(dst, src, 8); }
+static void ZSTD_copy8(void *dst, const void *src) {
+   ZSTD_write64(dst, ZSTD_read64(src));
+}
 #define COPY8(d, s)   \
{ \
ZSTD_copy8(d, s); \
--
2.9.3
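
For readers following along outside the kernel tree, here is a minimal
userland sketch of the idea behind the workaround. The names read64(),
write64(), copy8() and wildcopy() are stand-ins for ZSTD_read64(),
ZSTD_write64(), ZSTD_copy8() and ZSTD_wildcopy(); they are not copies of the
kernel code, and the do/while loop shape (one 8-byte copy even when len is 0)
is an assumption about the original routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Unaligned 64-bit load/store helpers (stand-ins for ZSTD_read64/ZSTD_write64). */
static uint64_t read64(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

static void write64(void *p, uint64_t v)
{
	memcpy(p, &v, sizeof(v));
}

/* Copy exactly 8 bytes as one load plus one store. */
static void copy8(void *dst, const void *src)
{
	write64(dst, read64(src));
}

/*
 * Copy len bytes in 8-byte steps. Assumed shape: the loop body runs at
 * least once, so 8 bytes are written even when len == 0, and the last
 * step may overshoot dst + len by up to 7 bytes. Callers must provide
 * that slack in both buffers.
 */
static void wildcopy(void *dst, const void *src, ptrdiff_t len)
{
	const unsigned char *ip = src;
	unsigned char *op = dst;
	unsigned char *const oend = op + len;

	assert(len >= 0);
	do {
		copy8(op, ip);
		op += 8;
		ip += 8;
	} while (op < oend);
}
```

Calling wildcopy(dst, src, 0) on buffers with at least 8 bytes of slack
exercises the len == 0 case discussed above.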

Re: [PATCH v2] btrfs: Correct assignment of pos

2017-07-10 Thread Liu Bo

On Tue, Jul 04, 2017 at 10:33:07PM -0500, Goldwyn Rodrigues wrote:
> From: Goldwyn Rodrigues 
> 
> Assigning pos for usage early messes up in append mode, where
> the pos is re-assigned in generic_write_checks(). Assign
> pos later to get the correct position to write from iocb->ki_pos.
> 
> Since check_can_nocow also uses the value of pos, we shift
> generic_write_checks() before check_can_nocow(). Checks with
> IOCB_DIRECT are present in generic_write_checks(), so checking
> for IOCB_NOWAIT is enough.
> 
> Also, put locking sequence in the fast path.
> 
> Changes since v1:
>  - Moved pos higher up to encompass check_can_nocow() call.
> 
> Fixes: edf064e7c6fe ("btrfs: nowait aio support")
> Signed-off-by: Goldwyn Rodrigues 
> ---
>  fs/btrfs/file.c | 26 ++
>  1 file changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 59e2dccdf75b..ad53832838b5 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1875,16 +1875,25 @@ static ssize_t btrfs_file_write_iter(struct kiocb 
> *iocb,
>   ssize_t num_written = 0;
>   bool sync = (file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host);
>   ssize_t err;
> - loff_t pos = iocb->ki_pos;
> + loff_t pos;
>   size_t count = iov_iter_count(from);
>   loff_t oldsize;
>   int clean_page = 0;
>  
> - if ((iocb->ki_flags & IOCB_NOWAIT) &&
> - (iocb->ki_flags & IOCB_DIRECT)) {
> - /* Don't sleep on inode rwsem */
> - if (!inode_trylock(inode))
> + if (!inode_trylock(inode)) {
> + if (iocb->ki_flags & IOCB_NOWAIT)
>   return -EAGAIN;
> + inode_lock(inode);
> + }
> +
> + err = generic_write_checks(iocb, from);
> + if (err <= 0) {
> + inode_unlock(inode);
> + return err;
> + }
> +
> + pos = iocb->ki_pos;
> + if (iocb->ki_flags & IOCB_NOWAIT) {
>   /*
>* We will allocate space in case nodatacow is not set,
>* so bail
> @@ -1895,13 +1904,6 @@ static ssize_t btrfs_file_write_iter(struct kiocb 
> *iocb,
>   inode_unlock(inode);
>   return -EAGAIN;
>   }
> - } else
> - inode_lock(inode);
> -
> - err = generic_write_checks(iocb, from);
> - if (err <= 0) {
> - inode_unlock(inode);
> - return err;
>   }
>  
>   current->backing_dev_info = inode_to_bdi(inode);

Reviewed-by: Liu Bo 

-liubo
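
The ordering problem fixed by the quoted patch can be illustrated with a small
userland sketch. Everything in it is hypothetical: struct kiocb_sketch,
write_checks_sketch() and SKETCH_APPEND only mimic the one behaviour that
matters here, namely that generic_write_checks() moves iocb->ki_pos to the end
of the file in append mode.

```c
#include <assert.h>

#define SKETCH_APPEND 0x1	/* hypothetical stand-in for IOCB_APPEND */

struct kiocb_sketch {
	long long ki_pos;	/* requested write position */
	int ki_flags;
};

/* Mimics the append handling: the position becomes the current file size. */
static void write_checks_sketch(struct kiocb_sketch *iocb, long long file_size)
{
	assert(file_size >= 0);
	if (iocb->ki_flags & SKETCH_APPEND)
		iocb->ki_pos = file_size;
}

/* Old order: pos is sampled before the checks and goes stale on append. */
static long long pos_sampled_early(struct kiocb_sketch *iocb, long long file_size)
{
	long long pos = iocb->ki_pos;

	write_checks_sketch(iocb, file_size);
	return pos;
}

/* Patched order: pos is sampled after the checks have updated ki_pos. */
static long long pos_sampled_late(struct kiocb_sketch *iocb, long long file_size)
{
	write_checks_sketch(iocb, file_size);
	return iocb->ki_pos;
}
```

With an append-mode request against a 4096-byte file, the early sample still
holds the caller's original offset while the late sample holds 4096.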
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] btrfs-progs: Tighten integer types in print-tree.

2017-07-10 Thread Adam Buchbinder

There are likely more places where the wrong size types are used, but
these tripped Clang's warnings because they eventually get passed to
printf.

Signed-off-by: Adam Buchbinder 
---
 print-tree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/print-tree.c b/print-tree.c
index a0d3395..82d6572 100644
--- a/print-tree.c
+++ b/print-tree.c
@@ -197,7 +197,7 @@ static void qgroup_flags_to_str(u64 flags, char *ret)
 
 void print_chunk(struct extent_buffer *eb, struct btrfs_chunk *chunk)
 {
-   int num_stripes = btrfs_chunk_num_stripes(eb, chunk);
+   u16 num_stripes = btrfs_chunk_num_stripes(eb, chunk);
int i;
u32 chunk_item_size = btrfs_chunk_item_size(num_stripes);
char chunk_flags_str[32] = {0};
@@ -336,7 +336,7 @@ static void print_file_extent_item(struct extent_buffer *eb,
   int slot,
   struct btrfs_file_extent_item *fi)
 {
-   int extent_type = btrfs_file_extent_type(eb, fi);
+   unsigned char extent_type = btrfs_file_extent_type(eb, fi);
char compress_str[16];
 
compress_type_to_str(btrfs_file_extent_compression(eb, fi),
-- 
2.13.2.725.g09c95d1e9-goog


[PATCH] btrfs-progs: Fix missing internal deps in tests.

2017-07-10 Thread Adam Buchbinder

Doing a straight 'make test' would fail because some misc and fsck
tests require particular tools to already be built. Add dependencies
at the Makefile and shell-script level.

Signed-off-by: Adam Buchbinder 
---
 Makefile| 5 +++--
 tests/fsck-tests.sh | 1 +
 tests/misc-tests.sh | 3 +++
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 81598df..4669525 100644
--- a/Makefile
+++ b/Makefile
@@ -272,11 +272,12 @@ test-convert: btrfs btrfs-convert
$(Q)bash tests/convert-tests.sh
 
 test-check: test-fsck
-test-fsck: btrfs btrfs-image btrfs-corrupt-block mkfs.btrfs
+test-fsck: btrfs btrfs-image btrfs-corrupt-block mkfs.btrfs btrfstune
@echo "[TEST]   fsck-tests.sh"
$(Q)bash tests/fsck-tests.sh
 
-test-misc: btrfs btrfs-image btrfs-corrupt-block mkfs.btrfs btrfstune fssum
+test-misc: btrfs btrfs-image btrfs-corrupt-block mkfs.btrfs btrfstune fssum \
+   btrfs-zero-log btrfs-find-root btrfs-select-super
@echo "[TEST]   misc-tests.sh"
$(Q)bash tests/misc-tests.sh
 
diff --git a/tests/fsck-tests.sh b/tests/fsck-tests.sh
index 44cca1b..15d26c7 100755
--- a/tests/fsck-tests.sh
+++ b/tests/fsck-tests.sh
@@ -23,6 +23,7 @@ rm -f "$RESULTS"
 check_prereq btrfs-corrupt-block
 check_prereq btrfs-image
 check_prereq btrfs
+check_prereq btrfstune
 check_kernel_support
 
 run_one_test() {
diff --git a/tests/misc-tests.sh b/tests/misc-tests.sh
index 1c645c9..0898801 100755
--- a/tests/misc-tests.sh
+++ b/tests/misc-tests.sh
@@ -24,6 +24,9 @@ check_prereq btrfs-corrupt-block
 check_prereq btrfs-image
 check_prereq btrfstune
 check_prereq btrfs
+check_prereq btrfs-zero-log
+check_prereq btrfs-find-root
+check_prereq btrfs-select-super
 check_kernel_support
 
 # The tests are driven by their custom script called 'test.sh'
-- 
2.13.2.725.g09c95d1e9-goog


Re: [PATCH v2 3/4] btrfs: Add zstd support

2017-07-10 Thread Clemens Eisserer

> So, I don't see any problem making the level configurable.

+1 - a configurable compression level would be much appreciated from my side.
Can't wait until zstd support is mainline :)

Thanks and br, Clemens

Re: [PATCH v2 3/4] btrfs: Add zstd support

2017-07-10 Thread Nick Terrell

On 7/10/17, 5:36 AM, "Austin S. Hemmelgarn"  wrote:
> On 2017-07-07 23:07, Adam Borowski wrote:
>> On Sat, Jul 08, 2017 at 01:40:18AM +0200, Adam Borowski wrote:
>>> On Fri, Jul 07, 2017 at 11:17:49PM +, Nick Terrell wrote:
>>>> On 7/6/17, 9:32 AM, "Adam Borowski"  wrote:
>>>>> Got a reproducible crash on amd64:
>>>>
>>>> Thanks for the bug report Adam! I'm looking into the failure, and haven't
>>>> been able to reproduce it yet. I've built my kernel from your tree, and
>>>> I ran your script with the kernel.tar tarball 100 times, but haven't gotten
>>>> a failure yet.
>>>>
>>>> I have a few questions to guide my debugging.
>>>>
>>>> - How many cores are you running with? I’ve run the script with 1, 2, and
>>>> 4 cores.
>>>> - Which version of gcc are you using to compile the kernel? I’m using
>>>> gcc-6.2.0-5ubuntu12.
>>>> - Are the failures always in exactly the same place, and does it fail 100%
>>>> of the time or just regularly?
>>>
>>> 6 cores -- all on bare metal.  gcc-7.1.0-9.
>>> Lemme try with gcc-6, a different config or in a VM.
>> 
>> I've tried the following:
>> * gcc-6, defconfig (+btrfs obviously)
>> * gcc-7, defconfig
>> * gcc-6, my regular config
>> * gcc-7, my regular config
>> * gcc-7, debug + UBSAN + etc
>> * gcc-7, defconfig, qemu-kvm with only 1 core
>>
>> Every build with gcc-7 reproduces the crash, every with gcc-6 does not.
>>
> Got a GCC7 tool-chain built, and I can confirm this here too, tested 
> with various numbers of cores ranging from 1-32 in a QEMU+KVM VM, with 
> various combinations of debug options and other config switches.

I was running in an Ubuntu 16.10 VM on a MacBook Pro. I built with gcc-6.2
with KASAN, and couldn't trigger it, as expected. I built with gcc-7.1.0
built from source, and couldn't reproduce it. However, when I set up
qemu-kvm on another device, and compiled with gcc-7.1.0 built from source,
I was able to reproduce the bug. Now that I can reproduce it, I'll look
into a fix. Thanks Adam and Austin for finding, reproducing, and verifying
the bug.

Re: [PATCH v4 0/6] Chunk level degradable check

2017-07-10 Thread Dmitrii Tcvetkov

On Wed, 28 Jun 2017 13:43:29 +0800
Qu Wenruo  wrote:

> The patchset can be fetched from my github repo:
> https://github.com/adam900710/linux/tree/degradable
> 
> The patchset is based on David's for-4.13-part1 branch.
> 
> Btrfs currently uses num_tolerated_disk_barrier_failures to do a global
> check for tolerated missing devices.
> 
> Although the one-size-fits-all solution is quite safe, it's too strict
> if data and metadata have different duplication levels.
> 
> For example, if one uses Single data and RAID1 metadata on 2 disks,
> any missing device will make the fs impossible to mount degraded.
> 
> But in fact, sometimes all single chunks may be on the remaining
> device, and in that case we should allow it to be mounted rw degraded.
> 
> Such case can be easily reproduced using the following script:
>  # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
>  # wipefs -f /dev/sdc
>  # mount /dev/sdb -o degraded,rw
> 
> If using btrfs-debug-tree to check /dev/sdb, one should find that the
> data chunk is only in sdb, so in fact it should allow degraded mount.
> 
> This patchset introduces a new per-chunk degradable check for
> btrfs, allowing the above case to succeed, and it's quite small anyway.
> 
> It also enhances the kernel error message for missing devices, so the user
> at least knows what made the mount fail, rather than seeing the meaningless
> "failed to read system chunk/chunk tree -5".
> 
> v2:
>   Update after almost 2 years.
>   Add the last patch to enhance the kernel output, so user can know
>   it's missing devices that prevents btrfs to be mounted.
> v3:
>   Remove one duplicated missing device output
>   Use the advice from Anand Jain, not to add new members in
> btrfs_device, but use a new structure extra_rw_degrade_errors, to
> record error when sending down/waiting device.
> v3.1:
>   Reduce the critical section in btrfs_check_rw_degradable(), follow
> other caller to only acquire the lock when searching, as extent_map
> has refcount to avoid concurrency already.
>   The modification itself won't affect the behavior, so tested-by
> tags are added to each patch.
> v4:
>   Thanks Anand for this dev flush work, which makes it easier for us to
>   detect flush errors in the previous transaction.
>   Now this patchset won't need to alloc memory, and can just use
>   btrfs_device->last_flush_error to check if last flush finished
>   correctly.
>   New rebase, so old tested by tags are all removed, sorry guys.
> 
> Qu Wenruo (6):
>   btrfs: Introduce a function to check if all chunks a OK for degraded
> rw mount
>   btrfs: Do chunk level rw degrade check at mount time
>   btrfs: Do chunk level degradation check for remount
>   btrfs: Allow barrier_all_devices to do chunk level device check
>   btrfs: Cleanup num_tolerated_disk_barrier_failures
>   btrfs: Enhance missing device kernel message
> 
>  fs/btrfs/ctree.h   |  2 --
>  fs/btrfs/disk-io.c | 81 
>  fs/btrfs/disk-io.h |  2 --
>  fs/btrfs/super.c   |  3 +-
>  fs/btrfs/volumes.c | 99 +-
>  fs/btrfs/volumes.h |  3 ++
>  6 files changed, 85 insertions(+), 105 deletions(-)
> 

Tested on top of current mainline master (commit 
af3c8d98508d37541d4bf57f13a984a7f73a328c). Didn't find any regressions.
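
The difference between the global check and the per-chunk check described in
the quoted cover letter can be sketched in userland C. This is a simplified
model, not the patchset's code: struct chunk_sketch, the function names and the
fixed stripe array are all hypothetical, and the tolerance values (0 for
single, 1 for RAID1) are filled in by hand.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_STRIPES 4

struct chunk_sketch {
	int tolerated_failures;			/* 0 for single, 1 for RAID1 */
	int num_stripes;
	int stripe_dev[SKETCH_MAX_STRIPES];	/* device index of each stripe */
};

/* A chunk is usable if it lost no more devices than its profile tolerates. */
static bool chunk_rw_degradable(const struct chunk_sketch *c,
				const bool *dev_missing)
{
	int missing = 0;

	assert(c->num_stripes <= SKETCH_MAX_STRIPES);
	for (int i = 0; i < c->num_stripes; i++)
		if (dev_missing[c->stripe_dev[i]])
			missing++;
	return missing <= c->tolerated_failures;
}

/* Per-chunk check: every chunk must pass on its own. */
static bool fs_rw_degradable(const struct chunk_sketch *chunks, size_t n,
			     const bool *dev_missing)
{
	for (size_t i = 0; i < n; i++)
		if (!chunk_rw_degradable(&chunks[i], dev_missing))
			return false;
	return true;
}

/* Global check: missing devices versus the smallest tolerance of any chunk. */
static bool fs_rw_degradable_global(const struct chunk_sketch *chunks, size_t n,
				    const bool *dev_missing, int num_devs)
{
	int min_tolerated = INT_MAX;
	int missing = 0;

	for (size_t i = 0; i < n; i++)
		if (chunks[i].tolerated_failures < min_tolerated)
			min_tolerated = chunks[i].tolerated_failures;
	for (int d = 0; d < num_devs; d++)
		if (dev_missing[d])
			missing++;
	return missing <= min_tolerated;
}
```

With RAID1 metadata on devices 0 and 1, a single data chunk on device 0 only,
and device 1 missing, the global check refuses the mount while the per-chunk
check allows it.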

Re: [PATCH] Btrfs: add skeleton code for compression heuristic

2017-07-10 Thread David Sterba

On Tue, Jul 04, 2017 at 08:28:15PM +0300, Timofey Titovets wrote:
> For now that code just returns true.
> More complex heuristic code will be added later.
> 
> Signed-off-by: Timofey Titovets 
> ---
>  fs/btrfs/compression.c | 22 ++
>  fs/btrfs/compression.h |  2 ++
>  fs/btrfs/inode.c   | 25 -
>  3 files changed, 40 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> index a2fad39f79ba..481e56f61461 100644
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -1098,3 +1098,25 @@ int btrfs_decompress_buf2page(const char *buf, 
> unsigned long buf_start,
>  
>   return 1;
>  }
> +
> +/*
> + * Heuristic skeleton
> + * For now just would be a naive and very optimistic 'return true'.
> + */
> +int btrfs_compress_heuristic(struct inode *inode, u64 start, u64 end)
> +{
> + u64 index = start >> PAGE_SHIFT;
> + u64 end_index = end >> PAGE_SHIFT;
> + struct page *page;
> + int ret = 1;
> +
> + while (index <= end_index) {
> + page = find_get_page(inode->i_mapping, index);
> + kmap(page);
> + kunmap(page);
> + put_page(page);
> + index++;
> + }
> +
> + return ret;
> +}
> diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
> index 680d4265d601..259ea776c9d4 100644
> --- a/fs/btrfs/compression.h
> +++ b/fs/btrfs/compression.h
> @@ -93,4 +93,6 @@ struct btrfs_compress_op {
>  extern const struct btrfs_compress_op btrfs_zlib_compress;
>  extern const struct btrfs_compress_op btrfs_lzo_compress;
>  
> +int btrfs_compress_heuristic(struct inode *inode, u64 start, u64 end);
> +
>  #endif
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 556c93060606..285e5b5eed35 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -510,15 +510,6 @@ static noinline void compress_file_range(struct inode 
> *inode,
>*/
>   if (inode_need_compress(inode)) {

Actually, I think we should put the decision logic based on heuristic
into inode_need_compress that's called from here.

So, please update prototype of inode_need_compress to take start/end and
call btrfs_compress_heuristic from there.

>   WARN_ON(pages);
> - pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
> - if (!pages) {
> - /* just bail out to the uncompressed code */
> - goto cont;
> - }
> -
> - if (BTRFS_I(inode)->force_compress)
> - compress_type = BTRFS_I(inode)->force_compress;
> -
>   /*
>* we need to call clear_page_dirty_for_io on each
>* page in the range.  Otherwise applications with the file
> @@ -530,6 +521,22 @@ static noinline void compress_file_range(struct inode 
> *inode,
>*/
>   extent_range_clear_dirty_for_io(inode, start, end);
>   redirty = 1;
> +
> + ret = btrfs_compress_heuristic(inode, start, end);

Here we're too far to skip compression, as the pages have been marked
for redirtying again. The original code proceeds to compression
directly.

Merging the logic into inode_need_compress would also mean that it's
going to be used from run_delalloc_range. This could have other
implications, but I haven't looked closely.
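
A rough userland sketch of the structure suggested in the review comments
above: the heuristic is consulted from inside the need-compress check, which
takes start/end, so the decision is made before any page state is touched. All
types and names ending in _sketch are hypothetical, and the two boolean fields
merely stand in for the real flag checks and for whatever a future heuristic
would measure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct inode_sketch {
	bool compress_enabled;		/* stands in for the existing flag checks */
	bool data_looks_compressible;	/* stands in for a real heuristic's verdict */
};

/* Placeholder heuristic: 1 means "worth compressing", 0 means "skip". */
static int compress_heuristic_sketch(const struct inode_sketch *inode,
				     uint64_t start, uint64_t end)
{
	assert(start <= end);
	return inode->data_looks_compressible ? 1 : 0;
}

/* The need-compress check takes the range and consults the heuristic itself. */
static int inode_need_compress_sketch(const struct inode_sketch *inode,
				      uint64_t start, uint64_t end)
{
	if (!inode->compress_enabled)
		return 0;
	return compress_heuristic_sketch(inode, start, end);
}

/* Caller: decide first, and only then change page state in the range. */
static bool compress_range_sketch(const struct inode_sketch *inode,
				  uint64_t start, uint64_t end,
				  bool *pages_redirtied)
{
	*pages_redirtied = false;
	if (!inode_need_compress_sketch(inode, start, end))
		return false;		/* skip compression; nothing to undo */
	*pages_redirtied = true;	/* models clearing dirty bits and redirty = 1 */
	return true;
}
```

In this arrangement a "don't compress" verdict never leaves pages marked for
redirtying, which is the concern raised about the placement in the patch.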

> +
> + /* Heuristic say: dont try compress that */
> + if (ret == 0)
> + goto cont;
> +
> + pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
> + if (!pages) {
> + /* just bail out to the uncompressed code */
> + goto cont;
> + }
> +
> + if (BTRFS_I(inode)->force_compress)
> + compress_type = BTRFS_I(inode)->force_compress;
> +
>   ret = btrfs_compress_pages(compress_type,
>  inode->i_mapping, start,
>  pages,
> -- 
> 2.13.2
> 

Re: [PATCH v2 1/2] btrfs: account for pinned bytes in should_alloc_chunk

2017-07-10 Thread David Sterba

On Thu, Jun 22, 2017 at 09:51:47AM -0400, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> In a heavy write scenario, we can end up with a large number of pinned bytes.
> This can translate into (very) premature ENOSPC because pinned bytes
> must be accounted for when allowing a reservation but aren't accounted for
> when deciding whether to create a new chunk.
> 
> This patch adds the accounting to should_alloc_chunk so that we can
> create the chunk.
> 
> Signed-off-by: Jeff Mahoney 

I'm adding the two patches to for-next. More reviews welcome.
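
The accounting mismatch described in the quoted changelog can be illustrated
with a toy model. The struct, both function names and the 80% threshold are
hypothetical and are not taken from the patch; only the idea that pinned bytes
count in one decision but not in the other comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct space_info_sketch {
	uint64_t total;
	uint64_t used;
	uint64_t reserved;
	uint64_t pinned;	/* freed, but not reusable until the transaction commits */
};

/* Reservation path: pinned bytes count as occupied. */
static bool can_reserve(const struct space_info_sketch *s, uint64_t bytes)
{
	return s->used + s->reserved + s->pinned + bytes <= s->total;
}

/* Chunk decision; count_pinned = false models the behaviour before the patch. */
static bool should_alloc_chunk_sketch(const struct space_info_sketch *s,
				      bool count_pinned)
{
	uint64_t allocated = s->used + s->reserved;

	assert(s->total > 0);
	if (count_pinned)
		allocated += s->pinned;
	/* Hypothetical threshold: allocate once more than 80% is taken. */
	return allocated * 10 > s->total * 8;
}
```

With total = 100, used = 30 and pinned = 65, a 10-byte reservation is refused,
yet the variant that ignores pinned bytes sees no reason to allocate a chunk.
That combination is the premature ENOSPC described above.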

Re: [PATCH v2 00/13] use rbtrees for preliminary backrefs

2017-07-10 Thread David Sterba

On Wed, Jun 28, 2017 at 09:56:52PM -0600, Edmund Nadolski wrote:
> Edmund Nadolski (6):
>   btrfs: btrfs_check_shared should manage its own transaction
>   btrfs: remove ref_tree implementation from backref.c
>   btrfs: convert prelimary reference tracking to use rbtrees
>   btrfs: add cond_resched() calls when resolving backrefs
>   btrfs: allow backref search checks for shared extents
>   btrfs: clean up extraneous computations in add_delayed_refs
> 
> Jeff Mahoney (7):
>   btrfs: struct-funcs, constify readers
>   btrfs: constify tracepoint arguments
>   btrfs: backref, constify some arguments
>   btrfs: backref, add unode_aux_to_inode_list helper
>   btrfs: backref, cleanup __ namespace abuse
>   btrfs: add a node counter to each of the rbtrees
>   btrfs: backref, add tracepoints for prelim_ref insertion and merging

I'm merging patches 1-7 now, as they're fairly independent, no need to
keep them separate. 8 needs an update and the rest seems to depend on
it. I'll keep it in the for-next testing and replace with updated
version when it arrives.

Re: [PATCH v2 08/13] btrfs: convert prelimary reference tracking to use rbtrees

2017-07-10 Thread David Sterba

On Wed, Jun 28, 2017 at 09:57:00PM -0600, Edmund Nadolski wrote:
> It's been known for a while that the use of multiple lists
> that are periodically merged was an algorithmic problem within
> btrfs.  There are several workloads that don't complete in any
> reasonable amount of time (e.g. btrfs/130) and others that cause
> soft lockups.
> 
> The solution is to use a pair of rbtrees that do insertion merging

You've added a third rbtree, so the changelog should be updated.

> for both indirect and direct refs, with the former converting
> refs into the latter.  The result is a btrfs/130 workload that
> used to take several hours now takes about half of that. This
> runtime still isn't acceptable and a future patch will address that
> by moving the rbtrees higher in the stack so the lookups can be
> shared across multiple calls to find_parent_nodes.
> 
> Signed-off-by: Edmund Nadolski 
> Signed-off-by: Jeff Mahoney 
> ---
>  fs/btrfs/backref.c | 437 
> ++---
>  1 file changed, 280 insertions(+), 157 deletions(-)
> 
> diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
> index 6cac5ab..ebe8875 100644
> --- a/fs/btrfs/backref.c
> +++ b/fs/btrfs/backref.c
> @@ -26,11 +26,6 @@
>  #include "delayed-ref.h"
>  #include "locking.h"
>  
> -enum merge_mode {
> - MERGE_IDENTICAL_KEYS = 1,
> - MERGE_IDENTICAL_PARENTS,
> -};
> -
>  /* Just an arbitrary number so we can be sure this happened */
>  #define BACKREF_FOUND_SHARED 6
>  
> @@ -129,7 +124,7 @@ static int find_extent_in_eb(const struct extent_buffer 
> *eb,
>   * this structure records all encountered refs on the way up to the root
>   */
>  struct prelim_ref {
> - struct list_head list;
> + struct rb_node rbnode;
>   u64 root_id;
>   struct btrfs_key key_for_search;
>   int level;
> @@ -139,6 +134,18 @@ struct prelim_ref {
>   u64 wanted_disk_byte;
>  };
>  
> +struct preftree {
> + struct rb_root root;
> +};
> +
> +#define PREFTREE_INIT { .root = RB_ROOT }
> +
> +struct preftrees {
> + struct preftree direct;/* BTRFS_SHARED_[DATA|BLOCK]_REF_KEY */
> + struct preftree indirect;  /* BTRFS_[TREE_BLOCK|EXTENT_DATA]_REF_KEY */
> + struct preftree indirect_missing_keys;
> +};
> +
>  static struct kmem_cache *btrfs_prelim_ref_cache;
>  
>  int __init btrfs_prelim_ref_init(void)
> @@ -158,6 +165,108 @@ void btrfs_prelim_ref_exit(void)
>   kmem_cache_destroy(btrfs_prelim_ref_cache);
>  }
>  
> +static void free_pref(struct prelim_ref *ref)
> +{
> + kmem_cache_free(btrfs_prelim_ref_cache, ref);
> +}
> +
> +/*
> + * Return 0 when both refs are for the same block (and can be merged).
> + * A -1 return indicates ref1 is a 'lower' block than ref2, while 1
> + * indicates a 'higher' block.
> + */
> +static int prelim_ref_compare(struct prelim_ref *ref1,
> +   struct prelim_ref *ref2)
> +{
> + if (ref1->level < ref2->level)
> + return -1;
> + if (ref1->level > ref2->level)
> + return 1;
> + if (ref1->root_id < ref2->root_id)
> + return -1;
> + if (ref1->root_id > ref2->root_id)
> + return 1;
> + if (ref1->key_for_search.type < ref2->key_for_search.type)
> + return -1;
> + if (ref1->key_for_search.type > ref2->key_for_search.type)
> + return 1;
> + if (ref1->key_for_search.objectid < ref2->key_for_search.objectid)
> + return -1;
> + if (ref1->key_for_search.objectid > ref2->key_for_search.objectid)
> + return 1;
> + if (ref1->key_for_search.offset < ref2->key_for_search.offset)
> + return -1;
> + if (ref1->key_for_search.offset > ref2->key_for_search.offset)
> + return 1;
> + if (ref1->parent < ref2->parent)
> + return -1;
> + if (ref1->parent > ref2->parent)
> + return 1;
> +
> + return 0;
> +}
> +
> +/*
> + * Add @newref to the @root rbtree, merging identical refs.
> + *
> + * Callers should assume that newref has been freed after calling.
> + */
> +static void prelim_ref_insert(struct preftree *preftree,
> +   struct prelim_ref *newref)
> +{
> + struct rb_root *root;
> + struct rb_node **p;
> + struct rb_node *parent = NULL;
> + struct prelim_ref *ref;
> + int result;
> +
> + root = &preftree->root;
> + p = &root->rb_node;
> +
> + while (*p) {
> + parent = *p;
> + ref = rb_entry(parent, struct prelim_ref, rbnode);
> + result = prelim_ref_compare(ref, newref);
> + if (result < 0) {
> + p = &(*p)->rb_left;
> + } else if (result > 0) {
> + p = &(*p)->rb_right;
> + } else {
> + /* Identical refs, merge them and free @newref */
> + struct extent_inode_elem *eie = ref->inode_list;
> +
> + while (eie &&

Re: [PATCH v2 07/13] btrfs: remove ref_tree implementation from backref.c

2017-07-10 Thread David Sterba

On Wed, Jun 28, 2017 at 09:56:59PM -0600, Edmund Nadolski wrote:
> Commit afce772e87c3 ("btrfs: fix check_shared for fiemap ioctl") added
> the ref_tree code in backref.c to reduce backref searching for
> shared extents under the FIEMAP ioctl. This code will not be
> compatible with the upcoming rbtree changes for improved backref
> searching, so this patch removes the ref_tree code.  The rbtree
> changes will provide the equivalent functionality for FIEMAP.
> 
> The above commit also introduced transaction semantics around calls to
> btrfs_check_shared() in order to accurately account for delayed refs.
> This functionality needs to be retained, so a complete revert of the
> above commit is not desirable. This patch therefore removes the
> ref_tree portion of the commit as above, however it does not remove
> the transaction portion.
> 
> Signed-off-by: Edmund Nadolski 
> Signed-off-by: Jeff Mahoney 

Reviewed-by: David Sterba 

Re: [PATCH v2 06/13] btrfs: btrfs_check_shared should manage its own transaction

2017-07-10 Thread David Sterba

On Wed, Jun 28, 2017 at 09:56:58PM -0600, Edmund Nadolski wrote:
> Commit afce772e87c3 ("btrfs: fix check_shared for fiemap ioctl") added
> transaction semantics around calls to btrfs_check_shared() in order to
> provide accurate accounting of delayed refs. The transaction management
> should be done inside btrfs_check_shared(), so that callers do not need
> to manage transactions individually.
> 
> Signed-off-by: Edmund Nadolski 
> Signed-off-by: Jeff Mahoney 

Reviewed-by: David Sterba 

Re: [PATCH 01/14] VFS: Don't use save/replace_mount_options if not using generic_show_options

2017-07-10 Thread David Sterba

On Wed, Jul 05, 2017 at 04:24:09PM +0100, David Howells wrote:
> btrfs, debugfs, reiserfs and tracefs call save_mount_options() and reiserfs
> calls replace_mount_options(), but they then implement their own
> ->show_options() methods and don't touch s_options, rendering the saved
> options unnecessary.  I'm trying to eliminate s_options to make it easier
> to implement a context-based mount where the mount options can be passed
> individually over a file descriptor.
> 
> Remove the calls to save/replace_mount_options() in these cases.
> 
> Signed-off-by: David Howells 
> cc: Chris Mason 
> cc: Greg Kroah-Hartman 
> cc: Steven Rostedt 
> cc: linux-btrfs@vger.kernel.org
> cc: reiserfs-de...@vger.kernel.org
> ---

For

>  fs/btrfs/super.c|1 -

Acked-by: David Sterba 

[GIT PULL] Urgent nowait aio fixup for btrfs

2017-07-10 Thread David Sterba

Hi,

please pull the following patch that fixes a user-visible bug introduced by the
nowait-aio patches merged in this cycle.

The branch is based on the closest merge in Jens' block changes that contained
the nowait-aio patches; the patch does not apply to the current btrfs branch.
As the fix is urgent, I'm sending the pull request myself (it's all within
btrfs code); I was not available to sync with Jens over the weekend.

The following changes since commit f95a0d6a95b12a79b7492da7ab687ae4cd741124:

  Merge commit '8e8320c9315c' into for-4.13/block (2017-06-22 21:55:24 -0600)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git nowait-aio-btrfs-fixup

for you to fetch changes up to ff0fa73247e442518936baa43c3f037b17f10fa7:

  btrfs: nowait aio: Correct assignment of pos (2017-07-10 15:29:44 +0200)


Goldwyn Rodrigues (1):
  btrfs: nowait aio: Correct assignment of pos

 fs/btrfs/file.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

Re: [PATCH] btrfs: resume qgroup rescan on rw remount

2017-07-10 Thread Nikolay Borisov



On 10.07.2017 16:12, Nikolay Borisov wrote:
> 
> 
> On  4.07.2017 14:49, Aleksa Sarai wrote:
>> Several distributions mount the "proper root" as ro during initrd and
>> then remount it as rw before pivot_root(2). Thus, if a rescan had been
>> aborted by a previous shutdown, the rescan would never be resumed.
>>
>> This issue would manifest itself as several btrfs ioctl(2)s causing the
>> entire machine to hang when btrfs_qgroup_wait_for_completion was hit
>> (due to the fs_info->qgroup_rescan_running flag being set but the rescan
>> itself not being resumed). Notably, Docker's btrfs storage driver makes
>> regular use of BTRFS_QUOTA_CTL_DISABLE and BTRFS_IOC_QUOTA_RESCAN_WAIT
>> (causing this problem to be manifested on boot for some machines).
>>
>> Cc:  # v3.11+
>> Cc: Jeff Mahoney 
>> Fixes: b382a324b60f ("Btrfs: fix qgroup rescan resume on mount")
>> Signed-off-by: Aleksa Sarai 
> 
> Indeed, looking at the code it seems that b382a324b60f ("Btrfs: fix
> qgroup rescan resume on mount") missed adding the qgroup_rescan_resume
> in the remount path. One thing which I couldn't verify though is whether
> reading fs_info->qgroup_flags without any locking is safe from remount
> context.
> 
> During remount I don't see any locks taken that prevent operations which
> can modify qgroup_flags.
> 
> 

Further inspection reveals that the access rules to qgroup_flags are
somewhat broken so this patch doesn't really make things any worse than
they are. As such:

Reviewed-by: Nikolay Borisov 
Tested-by: Nikolay Borisov 

> 
>> ---
>>  fs/btrfs/super.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
>> index 6346876c97ea..ff6690389343 100644
>> --- a/fs/btrfs/super.c
>> +++ b/fs/btrfs/super.c
>> @@ -1821,6 +1821,8 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
>>  			goto restore;
>>  		}
>>  
>> +		btrfs_qgroup_rescan_resume(fs_info);
>> +
>>  		if (!fs_info->uuid_root) {
>>  			btrfs_info(fs_info, "creating UUID tree");
>>  			ret = btrfs_create_uuid_tree(fs_info);
>>

Re: [PATCH] btrfs: resume qgroup rescan on rw remount

2017-07-10 Thread Nikolay Borisov



On  4.07.2017 14:49, Aleksa Sarai wrote:
> Several distributions mount the "proper root" as ro during initrd and
> then remount it as rw before pivot_root(2). Thus, if a rescan had been
> aborted by a previous shutdown, the rescan would never be resumed.
> 
> This issue would manifest itself as several btrfs ioctl(2)s causing the
> entire machine to hang when btrfs_qgroup_wait_for_completion was hit
> (due to the fs_info->qgroup_rescan_running flag being set but the rescan
> itself not being resumed). Notably, Docker's btrfs storage driver makes
> regular use of BTRFS_QUOTA_CTL_DISABLE and BTRFS_IOC_QUOTA_RESCAN_WAIT
> (causing this problem to be manifested on boot for some machines).
> 
> Cc:  # v3.11+
> Cc: Jeff Mahoney 
> Fixes: b382a324b60f ("Btrfs: fix qgroup rescan resume on mount")
> Signed-off-by: Aleksa Sarai 

Indeed, looking at the code it seems that b382a324b60f ("Btrfs: fix
qgroup rescan resume on mount") missed adding the qgroup_rescan_resume
in the remount path. One thing which I couldn't verify though is whether
reading fs_info->qgroup_flags without any locking is safe from remount
context.

During remount I don't see any locks taken that prevent operations which
can modify qgroup_flags.



> ---
>  fs/btrfs/super.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 6346876c97ea..ff6690389343 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1821,6 +1821,8 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
>  			goto restore;
>  		}
>  
> +		btrfs_qgroup_rescan_resume(fs_info);
> +
>  		if (!fs_info->uuid_root) {
>  			btrfs_info(fs_info, "creating UUID tree");
>  			ret = btrfs_create_uuid_tree(fs_info);
> 

linxi★Canton Fair: 2017 Guangzhou International Import and Export Auto Parts Fair [地右P4/L100-Z]

2017-07-10 Thread zwt2

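【Exhibitors who sign up through this email get a 500 yuan discount per booth; to register, reply to the dedicated mailbox "12809...@qq.com"】

Dear linxi company leader / person in charge:

  You are cordially invited to China's largest auto parts export trade fair: APF 2017
  A flagship event for the auto parts industry, the best choice for export companies, the preferred platform for global sourcing!

★ Held at the same time and place as the Canton Fair,
★ Backed by the Canton Fair's huge visitor traffic, with buyer interaction and momentum borrowed from it,
★ Sharing the resources of hundreds of thousands of purchasers from around the world...
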
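【 Basic Information 】

Chinese name: 2017广州国际进出口汽车配件展览会
English name: The Guangzhou International Import and Export Auto Parts Fair 2017 (APF 2017)

Dates: 13-15 October 2017
Venue: Guangzhou Pazhou International Sourcing Center (广州琶洲国际采购中心)

Approved by: Ministry of Commerce of the People's Republic of China
Organizers: China Association of Enterprises for Foreign Trade and Economic Cooperation (中国对外贸易经济合作企业协会) and Yingde International Exhibition Co., Ltd. (映德国际会展有限公司)

Official website: http://www.CAPE-china.com
Online customer service: email/QQ: q...@12809395.com; WeChat: ZhanShangZhiJia; Weibo: http://weibo.com/yingdehuizhan
Phone: 4000-580-850 (ext. 5206 or 8144); 131-2662-5206; 010-8699-7155, 8084-2128;
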
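【 About the Fair 】

China's vehicle fleet has already passed 195 million, and by 2020 it is expected to exceed 250 million. China's annual vehicle production and sales are expected to exceed 30 million units in 2016, and by 2020 production and sales are each expected to exceed 45 million units, making China the world's largest auto market in fact as well as in name. Auto parts are the foundation of the auto industry's development, and the parts supply and after-sales service market is an important part of the auto market. The rapid growth of China's auto industry gives the auto parts sector a solid industrial base and strong market support, and is expected to create a very large market worth 1.5 to 2 trillion yuan.

As a focal point of the auto market, Guangzhou has the country's largest auto production base and auto industry cluster, and for three consecutive years its growth in auto consumption has been among the fastest in the country. 2017 is an important year in the implementation of the 13th Five-Year Plan and a year of deepening supply-side structural reform. China's auto industry has started moving from large to strong: the allocation of industry resources is improving and the industrial layout is becoming more rational, and development is gradually shifting from growth in output and sales to a leap in quality. In particular, under the guiding principle of consolidating the industry's foundations and promoting healthy development, the auto parts industry has been elevated to the top development priority in the auto industry chain, with resources tilted toward it, policy support, and rectification and regulation. It can be expected that, following the vigorous growth of vehicle production and consumption in China over the past decade, the next five to ten years will be a golden period of fundamental change for China's auto parts industry.

Benefiting from the rapid development of China's auto industry and the active shift of the global auto parts supply chain to China, 映德会展 and 中汽展览, together with authoritative industry bodies, will hold the "2017 Guangzhou International Import and Export Auto Parts Fair" (APF 2017) on 13-15 October 2017 at the Guangzhou Pazhou International Sourcing Center. Relying on the auto industry and the world's largest potential market, based on the current state of the auto parts industry and the needs of Chinese and foreign markets, building on the successful experience of previous editions, with the care and support of government departments at all levels and industry associations, and after careful planning by the organizers, "APF 2017" will return to Guangzhou with a completely new look. The fair will comprehensively showcase the latest products and achievements in the automotive field and its future directions, and more than a hundred partner media outlets will provide all-round publicity. APF national exhibitor registration hotline: 4000-580-850 (ext. 5206, 8144).

We will continue to follow our exhibition principles of "highlighting brands, pioneering and innovating, focusing on practical results, and strengthening service". With distinctive ideas, sound organization and management, and excellent service, we will offer Chinese and foreign exhibitors a "professional, international, branded" platform for display and exchange, provide more cooperation opportunities for the global auto parts and aftermarket industry, strongly promote the full entry of Chinese auto parts into the global procurement system, and work with the auto industries of all countries toward coordinated cooperation, mutual benefit, and common progress.
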
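【 Advantages of the Fair 】

● Excellent business opportunity: APF 2017 takes place during the Canton Fair. Known as "China's No. 1 Fair", the Canton Fair draws more than 200,000 purchasers each year from over a hundred countries and regions. Through a range of channels we will make full use of the Canton Fair's huge pool of global buyers, and will send invitations to more than 300,000 domestic and foreign purchasers through the organizing committee's customer relationship invitation system, interacting fully with the Canton Fair and borrowing its momentum, while also making up for the Canton Fair's weakness in domestic sales, so that the two complement each other, one domestic and one export. Backed by the Canton Fair's huge visitor traffic, with Chinese and foreign purchasers gathered together, the market potential is immeasurable and the business opportunities are plain to see; it is an important platform for opening up international markets!

● Prime location: The Guangzhou Pazhou International Sourcing Center is just across the road from the Canton Fair complex and effectively forms one site with it, adjoining the Canton Fair's exhibition area for similar products. It is only 200 meters from Exit A of Pazhou Station on Metro Line 8, so transport is very convenient for overseas buyers coming to visit and purchase.

● Returns for exhibitors: Meet domestic and foreign purchasing decision-makers face to face and close deals with prospective customers; expand brand influence among professional customers; build overseas distribution networks and develop international markets; promote new products and technologies; open up new markets; learn about competitors and industry trends; gain insight into the latest international technology and information; meet existing customers and develop new business.
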
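【 Target Visitors 】

  The organizing committee of the China (Guangzhou) International Auto Parts and Accessories Exhibition (映德会展, YOND EXPO) treats the recruitment of professional visitors and media publicity as its main priorities. It invites Chinese and foreign automakers, modification plants and shops; automotive industrial equipment manufacturers; manufacturers, traders, agents, distributors and end users of auto parts and accessories; auto parts and accessories markets, supermarkets, franchise chains and 4S dealerships; car care and detailing centers, auto repair centers and repair shops; comprehensive vehicle performance testing stations; aftermarket distributors; experts, scholars and investment companies in aftermarket chain operations, and people at home and abroad interested in investing or starting businesses in the aftermarket; the auto service industry, car enthusiasts, owners' clubs, business organizations, departments related to auto repair and testing, auto transport departments, government authorities, auto industry associations, trade media, and other key organizations and their heads. Effective measures will be taken to build a platform for exchange and cooperation for exhibitors, promote the commercialization of scientific and technological results, and improve companies' market competitiveness; at the same time, a series of closely coordinated publicity activities will ensure that the fair attracts maximum attention at home and abroad.

  160,000 domestic and foreign professional buyers gather in Guangzhou:

I. Domestic professional buyers
1. 300 vehicle manufacturers and auto sales companies
 - Purchasing managers from 35 mainstream vehicle manufacturers, including Honda (Guangzhou, Dongfeng), Toyota (FAW, GAC), Volkswagen (FAW, Shanghai), Beijing Hyundai, Shanghai GM, Dongfeng Nissan, Changan Ford, BYD and Chery, and from 60 auto sales companies and auto accessories companies, will visit and purchase on site.
2. 8,000 4S dealership groups and 4S dealerships nationwide
 - Purchasing managers from 300 4S dealership groups, including 新疆广汇, 冀东庞大, 上海永达, 浙江物产元通, 广物汽贸, 东创建国, 大连中升, 湖南申湘, 深圳深业, 中汽西南, 安徽亚夏 and 郑州豫华, and from 4,000 4S dealerships of all brands in China, will attend and purchase.
3. 1,500 first-tier wholesale and logistics companies nationwide
 - 1,200 first-tier wholesale and logistics companies, including 欧特隆 (Liaoning, Hangzhou, Nanjing, Shanxi), 沈阳新天成, 郑州二仟家, 山西茂德隆, 长沙湘泸, 福建永联, 成都穗丰, 广州永丰, 新疆半分利, 北京派安 and 石家庄中惠, will attend and purchase.
4. 7,000 agents and distributors from cities across the country
5. 2,500 quality car audio-video modification shops nationwide
 - Quality audio-video modification shops from all regions, represented by 新城子昂, 上海车之宝, 北京双周, 音乐前线, 先歌兄弟 and 非常城市, will attend and purchase.
6. 300 large retail chains
 - Quality retail outlets and large chains from all regions, represented by 新奇特, 黄帽子 and 上海美车饰, will attend and purchase.
7. 9 domestic retail outlets (including 3 in the South / Pan-Pearl River Delta region)
 - Retail outlets from the Pan-Pearl River Delta region (Fujian, Jiangxi, Hunan, Guangdong, Guangxi, Hainan, Sichuan, Guizhou, Yunnan, Hong Kong and Macau), represented by 金手指 and 车元素, will purchase on site, along with 20,000 outstanding retail outlets nationwide.

II. Foreign professional buyers
1. 4,000 Asian buyers:
 - Industry chambers of commerce from Japan, South Korea, Indonesia, Malaysia, India, Thailand, the Philippines, Vietnam, Singapore and other countries will send delegations to purchase and visit.
2. 1,500 Middle Eastern buyers:
 - Purchasers from the United Arab Emirates, Saudi Arabia, Iran, Syria, Israel, Kuwait, Qatar, Yemen and other countries will send delegations to visit and purchase.
3. 2,500 European and American buyers:
 - Purchasers from Germany, the United Kingdom, France, the United States, Mexico, Canada and other countries will purchase and visit.
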
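【 Scope of Exhibits 】

Auto parts and components; engine, chassis, braking, running gear, steering, body, transmission, exhaust, cooling and fuel systems; auto accessories, universal parts, fasteners, seals, friction materials; automotive motors, bearings, batteries, filters, radiators, mufflers, sensors, instruments and gauges, wipers, gearboxes, clutches, clutch plates, brake pads, springs, shock absorbers, bumpers, airbags, seats, glass, mirrors, lamps, air conditioning, tires, wheel hubs, chains, snow chains; wiring harnesses, connectors, rigid pipes, hoses, flexible shafts, cables; automotive textiles; automotive paint, lubricants, engine oil, additives; auto accessories; automotive electronics and electrical equipment; car audio-video, sound systems, navigation, in-vehicle communications, security and anti-theft systems; modification parts and accessories; auto maintenance equipment and tools; automotive molds; auto parts manufacturing technology, equipment, tools and materials; auto parts cleaning equipment and packaging; new automotive products; energy-saving, environmental and new-energy automotive technology and products; related software, media, certification, financial and insurance institutions, and so on.
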
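【 Exhibiting Rules 】

◆ Booth specifications:
  1. Custom-built booth: minimum rental 36 square meters; only bare indoor or outdoor space of the corresponding area is provided. Booth construction, display equipment, electricity and water are the exhibitor's responsibility.
  2. Standard booth: 9 square meters (3m×3m) each, with 2.5m-high wall panels, one fascia board (exhibitor name), one negotiation table, two chairs, two spotlights, and one 220V/5A power socket.

◆ Booth fees:
  Custom-built booth: domestic companies RMB2000 per square meter; overseas companies USD500 per square meter;
  Standard booth: domestic companies RMB2 each; overseas companies USD5000 each; (a 10% surcharge applies to standard booths open on two sides)

◆ Catalog advertising:
(The fair Catalog will help you find customers after the fair! Besides being widely distributed during the fair, it is also sent through various channels to professionals around the country who could not attend, who can use it to quickly look up services and contact details.
 Catalog size: 130mm*210mm, color printed on imported coated paper, print run 100,000 copies.)
  Front cover CNY 3; inside front and inside back covers CNY 22000; title page CNY 18000; black-and-white page CNY 5000;
  Back cover CNY 2; color double-page spread CNY 18000; color page CNY 12000; 300-character company profile CNY 2000;

◆ Conferences and forums:

For example technical seminars or product launch events: CNY9000 per hour per session, covering rental of the venue and related equipment (including the venue, PA system, lighting, projectors, tables and chairs, air conditioning, tea, and help organizing an audience for the presenting company).
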
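【 Exhibiting Procedure 】

1. Registration is open from today. The organizing committee (映德会展, YOND EXPO) allocates booths strictly in order of payment received. Early registrants get booths nearer the front at the same standard fee, and can also enjoy more value-added services such as pre-show publicity and buyer introductions.
2. Exhibitors should fill in the Exhibition Application Form (available on request) in full, stamp it with the company seal, and fax it or mail a copy to the fair's organizing office (映德会展, YOND EXPO), then remit the exhibition fee to the fair's designated account within three working days.
3. When registering, exhibitors should also send the organizing office a company profile of up to 300 characters, for timely pre-show publicity and publication in the Catalog.
4. For logistics such as transport, storage and hoisting of exhibits, and exhibitor check-in, reception, meals and accommodation, see the Exhibitor Manual, which is sent out about a month and a half before the fair opens.
5. For power, compressed air or water supply, custom booth fit-out and similar matters, please send the relevant details to the organizing committee one month before the fair opens so that staff can help exhibitors make arrangements.
6. The organizing committee will refuse companies whose products fall outside the scope of the fair. Registration deadline: 31 August 2017.


【 Contact 】

Organizing Committee of the Guangzhou International Import and Export Auto Parts Fair
Official website: http://www.CAPE-china.com
National customer service hotline: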

Re: Chunk root problem

2017-07-10 Thread Austin S. Hemmelgarn


On 2017-07-10 00:21, Daniel Brady wrote:

> On 7/7/2017 1:06 AM, Daniel Brady wrote:
>> On 7/6/2017 11:48 PM, Roman Mamedov wrote:
>>> On Wed, 5 Jul 2017 22:10:35 -0600
>>> Daniel Brady wrote:
>>>
>>>> parent transid verify failed
>>>
>>> Typically in Btrfs terms this means "you're screwed", fsck will not fix it, and
>>> nobody will know how to fix or what is the cause either. Time to restore from
>>> backups! Or look into "btrfs restore" if you don't have any.
>>>
>>> In your case it's especially puzzling as the difference in transid numbers is
>>> really significant (about 100K), almost like the FS was operating for months
>>> without updating some parts of itself -- and no checksum errors either, so
>>> all looks correct, except that everything is horribly wrong.
>>>
>>> This kind of error seems to occur more often in RAID setups, either Btrfs
>>> native RAID, or with Btrfs on top of other RAID setups -- i.e. where it
>>> becomes a complex issue that all writes to multi devices DO complete IN order,
>>> in case of an unclean shutdown. (which is much simpler on a single device FS).
>>>
>>> Also one of your disks or cables is failing (was /dev/sde on that boot, but may
>>> get a different index next boot), check SMART data for it and replace.
>>>
>>>> [   21.230919] BTRFS info (device sdf): bdev /dev/sde errs: wr 402545, rd 234683174, flush 194501, corrupt 0, gen 0
>>
>> Well that's not good news. Unfortunately I made a fatal error in not
>> having a backup. Restore looks like I could recover a good chunk of it
>> from the dry runs, however it has a lot of trouble reading many files.
>> I'm sure that is related to the one disk (sde). Drives were set up as raid56.
>>
>> After updating the kernel as suggested in the email from Duncan it
>> reduced the "parent transid verify" errors down to just one and the errs
>> on sde still exist.
>>
>> [   21.400190] BTRFS info (device sdb): use no compression
>> [   21.400191] BTRFS info (device sdb): disk space caching is enabled
>> [   21.400192] BTRFS info (device sdb): has skinny extents
>> [   21.584923] BTRFS info (device sdb): bdev /dev/sde errs: wr 402545, rd 234683174, flush 194501, corrupt 0, gen 0
>> [   23.394788] BTRFS error (device sdb): parent transid verify failed on 5257838690304 wanted 591492 found 489231
>> [   23.416489] BTRFS error (device sdb): parent transid verify failed on 5257838690304 wanted 591492 found 489231
>> [   23.416524] BTRFS error (device sdb): failed to read block groups: -5
>> [   23.448478] BTRFS error (device sdb): open_ctree failed
>>
>> I ran a SMART test as you suggested with a passing result. I also
>> swapped SATA cables & power with another drive and the error followed
>> the drive confirmed by the serial via SMART. It seems like it just can't
>> read from that one drive for whatever reason. I also tried disconnecting
>> the drive and trying to mount it degraded with no luck. Still had the
>> transid error just with null as the bdev.
>>
>> smartctl -a /dev/sde
>> smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.12.0-1.el7.elrepo.x86_64] (local build)
>> Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
>>
>> === START OF INFORMATION SECTION ===
>> Model Family:     Western Digital Red (AF)
>> Device Model:     WDC WD30EFRX-68EUZN0
>> Serial Number:    WD-WCC4N0PEYTEV
>> LU WWN Device Id: 5 0014ee 2b7dbfe54
>> Firmware Version: 82.00A82
>> User Capacity:    3,000,592,982,016 bytes [3.00 TB]
>> Sector Sizes:     512 bytes logical, 4096 bytes physical
>> Rotation Rate:    5400 rpm
>> Device is:        In smartctl database [for details use: -P show]
>> ATA Version is:   ACS-2 (minor revision not indicated)
>> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
>> Local Time is:    Fri Jul  7 00:30:10 2017 MDT
>> SMART support is: Available - device has SMART capability.
>> SMART support is: Enabled
>>
>> === START OF READ SMART DATA SECTION ===
>> SMART overall-health self-assessment test result: PASSED
>>
>> General SMART Values:
>> Offline data collection status:  (0x00) Offline data collection activity
>>                                         was never started.
>>                                         Auto Offline Data Collection: Disabled.
>> Self-test execution status:      (   0) The previous self-test routine completed
>>                                         without error or no self-test has ever
>>                                         been run.
>> Total time to complete Offline
>> data collection:                (40500) seconds.
>> Offline data collection
>> capabilities:                    (0x7b) SMART execute Offline immediate.
>>                                         Auto Offline data collection on/off support.
>>                                         Suspend Offline collection upon new
>>                                         command.
>>                                         Offline surface scan supported.
>>                                         Self-test supported.
>>                                         Conveyance Self-test supported.
>>                                         Selective Self-test supported.
>> SMART capabilities:            (0x0003) Saves SMART data before entering

Re: raid10 array lost with single disk failure?

2017-07-10 Thread Austin S. Hemmelgarn


On 2017-07-09 22:13, Adam Bahe wrote:

> I have finished all of the above suggestions, ran a scrub, remounted,
> rebooted, made sure the system didn't hang, and then kicked off
> another balance on the entire pool. It completed rather quickly but
> something still does not seem right.
>
> Label: 'btrfs_pool1'  uuid: 04a7fa70-1572-47a2-a55c-7c99aef12603
>         Total devices 18 FS bytes used 23.64TiB
>         devid    1 size 1.82TiB used 1.82TiB path /dev/sdd
>         devid    2 size 1.82TiB used 1.82TiB path /dev/sdf
>         devid    3 size 3.64TiB used 3.07TiB path /dev/sdg
>         devid    4 size 3.64TiB used 3.06TiB path /dev/sdk
>         devid    5 size 1.82TiB used 1.82TiB path /dev/sdn
>         devid    6 size 3.64TiB used 3.06TiB path /dev/sdo
>         devid    7 size 1.82TiB used 1.82TiB path /dev/sds
>         devid    8 size 1.82TiB used 1.82TiB path /dev/sdj
>         devid    9 size 1.82TiB used 1.82TiB path /dev/sdi
>         devid   10 size 1.82TiB used 1.82TiB path /dev/sdq
>         devid   11 size 1.82TiB used 1.82TiB path /dev/sdr
>         devid   12 size 1.82TiB used 1.82TiB path /dev/sde
>         devid   13 size 1.82TiB used 1.82TiB path /dev/sdm
>         devid   14 size 7.28TiB used 4.78TiB path /dev/sdh
>         devid   15 size 7.28TiB used 4.99TiB path /dev/sdl
>         devid   16 size 7.28TiB used 4.97TiB path /dev/sdp
>         devid   17 size 7.28TiB used 4.99TiB path /dev/sdc
>         devid   18 size 5.46TiB used 210.12GiB path /dev/sdb
>
> /dev/sdb is the new disk, but btrfs only moved 210.12GB over to it.
> Most disks in the array are >50% utilized or more. Is this normal?
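
For anyone who wants to check the per-device utilization in a listing like the one above without doing the arithmetic by hand, here is a minimal sketch. It assumes the raw `btrfs filesystem show` layout (`devid N size X used Y path P`); the function name `usage_pct` is made up for illustration and is not part of btrfs-progs.

```shell
# usage_pct: read `btrfs filesystem show` output on stdin and print
# "<device path> <used as a percentage of size>" for every devid line.
usage_pct() {
    awk '
    # Convert a size such as "1.82TiB" or "210.12GiB" to GiB.
    function to_gib(s,    n) {
        n = s + 0                        # numeric prefix of the string
        if (s ~ /TiB$/) return n * 1024
        if (s ~ /GiB$/) return n
        if (s ~ /MiB$/) return n / 1024
        return n / (1024 * 1024 * 1024)  # assumed to be plain bytes
    }
    $1 == "devid" && $3 == "size" && $5 == "used" {
        printf "%s %.0f%%\n", $8, 100 * to_gib($6) / to_gib($4)
    }'
}
```

Typical use would be `btrfs filesystem show /mnt/pool | usage_pct`, where `/mnt/pool` is a placeholder mount point.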


Was this from a full balance, or just running a scrub to repair chunks?

You have three ways you can repair a BTRFS volume that's lost a device:

* The first, quickest, and most reliable is to use `btrfs device 
replace` to replace the failing/missing device.  This will result in 
only reading data that needs to be read to go on the new device, so it 
completes quicker, but you will also need to resize the new device if 
you are going to a larger device, and can't replace the missing device 
with a smaller one.


* The second is to add the device to the array, then run a scrub on the 
whole array.  This will spit out a bunch of errors from the chunks that 
need to be rebuilt, but will make sure everything is consistent.  This 
isn't as fast as using `device replace`, but is still quicker than a 
full balance most of the time.  In this particular case, I would expect 
behavior like what you're seeing above at least some of the time.


* The third, and slowest method, is to add the new device, then run a 
full balance.  This will make sure data is evenly distributed 
proportionate to device size and will rebuild all the partial chunks. 
It will also take the longest, and put significantly more stress on the 
array than the other two options (it rewrites the entire array).  If 
this is what you used, then you probably found a bug, because it should 
never result in what you're seeing.
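
The three options above can be sketched as shell functions. This is only an illustration under stated assumptions: the function names are invented here, the device, devid and mount-point arguments are placeholders, and in btrfs-progs the replace operation is spelled `btrfs replace start` (the text above calls it `btrfs device replace`). The sketch is a dry run by default: it prints the commands instead of running them unless `BTRFS=btrfs` is set.

```shell
# Dry run by default: $BTRFS expands to "echo btrfs", so each command is
# printed rather than executed. Set BTRFS=btrfs to run for real (as root).
BTRFS="${BTRFS:-echo btrfs}"

# Option 1: replace the failed device in place, then grow the filesystem
# onto the (possibly larger) new device.
# $1 = devid of the failed/missing device, $2 = new device, $3 = mount point
repair_replace() {
    $BTRFS replace start -B "$1" "$2" "$3" &&
    $BTRFS filesystem resize "$1:max" "$3"
}

# Option 2: add the new device, then scrub the whole array.
# $1 = new device, $2 = mount point
repair_add_scrub() {
    $BTRFS device add "$1" "$2" &&
    $BTRFS scrub start -Bd "$2"
}

# Option 3: add the new device, then run a full balance (slowest).
# $1 = new device, $2 = mount point
repair_add_balance() {
    $BTRFS device add "$1" "$2" &&
    $BTRFS balance start "$2"
}
```

For example, `repair_replace 7 /dev/sdb /mnt/pool` prints the two commands it would run; the resize step reuses the devid because the replacement device takes over the devid of the one it replaces.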


Re: [PATCH v2 3/4] btrfs: Add zstd support

2017-07-10 Thread Austin S. Hemmelgarn


On 2017-07-07 23:07, Adam Borowski wrote:

> On Sat, Jul 08, 2017 at 01:40:18AM +0200, Adam Borowski wrote:
>> On Fri, Jul 07, 2017 at 11:17:49PM +, Nick Terrell wrote:
>>> On 7/6/17, 9:32 AM, "Adam Borowski" wrote:
>>>> Got a reproducible crash on amd64:
>>>
>>> Thanks for the bug report Adam! I'm looking into the failure, and haven't
>>> been able to reproduce it yet. I've built my kernel from your tree, and
>>> I ran your script with the kernel.tar tarball 100 times, but haven't gotten
>>> a failure yet.
>>>
>>> I have a few questions to guide my debugging.
>>>
>>> - How many cores are you running with? I’ve run the script with 1, 2, and
>>>   4 cores.
>>> - Which version of gcc are you using to compile the kernel? I’m using
>>>   gcc-6.2.0-5ubuntu12.
>>> - Are the failures always in exactly the same place, and does it fail 100%
>>>   of the time or just regularly?
>>
>> 6 cores -- all on bare metal.  gcc-7.1.0-9.
>> Lemme try with gcc-6, a different config or in a VM.
>
> I've tried the following:
> * gcc-6, defconfig (+btrfs obviously)
> * gcc-7, defconfig
> * gcc-6, my regular config
> * gcc-7, my regular config
> * gcc-7, debug + UBSAN + etc
> * gcc-7, defconfig, qemu-kvm with only 1 core
>
> Every build with gcc-7 reproduces the crash, every with gcc-6 does not.

Got a GCC7 tool-chain built, and I can confirm this here too, tested 
with various numbers of cores ranging from 1-32 in a QEMU+KVM VM, with 
various combinations of debug options and other config switches.


Re: Commit edf064e7c (btrfs: nowait aio support) breaks shells

2017-07-10 Thread David Sterba

On Fri, Jul 07, 2017 at 08:09:28PM -0600, Jens Axboe wrote:
> On 07/07/2017 07:51 PM, Goldwyn Rodrigues wrote:
> > On 07/04/2017 05:16 PM, Jens Axboe wrote:
> >>
> >> Please expedite getting this upstream, asap.
> > 
> > I have posted an updated patch [1] and it is acked by David. Would you
> > pick it up or should it go through the btrfs tree (or some other tree)?
> > 
> > [1] https://patchwork.kernel.org/patch/9825813/
> 
> I'm fine with either, I just want it to go in asap. I'm sending off
> a pull Monday. David, up to you.

I'm reading this just now, so I'll send the patch in my pull today.

Re: btrfs device ready purpose

2017-07-10 Thread Austin S. Hemmelgarn


On 2017-07-07 13:40, Chris Murphy wrote:

> On Fri, Jul 7, 2017 at 10:59 AM, Andrei Borzenkov wrote:
>> 07.07.2017 19:42, Chris Murphy wrote:
>>> I'm digging through piles of list emails and not really finding an
>>> answer to this. Maybe it's Friday and I'm just confused...
>>>
>>> [root@f26s ~]# btrfs device ready /dev/sda1
>>> [root@f26s ~]# echo $?
>>> 0
>>> [root@f26s ~]# btrfs device ready /dev/mapper/vg-1
>>> [root@f26s ~]# echo $?
>>> 0
>>>
>>> /dev/sda1 is a single device btrfs and it is present, 'btrfs fi show' finds it.
>>> /dev/mapper/vg-1 is one member of a two device Btrfs volume, the
>>> missing device is /dev/mapper/vg-2 and 'btrfs fi show' sees 1 but not
>>> 2 and says a device is missing.
>>>
>>> And yet the ready command completes and the exit code is the same, and
>>> I just don't understand the purpose of this command. The man page says
>>> "wait" but I don't see any waiting, so at the very least we probably
>>> need to come up with a more descriptive man page description.
>>
>> Here on Ubuntu 16.04.2 man page says
>>
>>     ready
>>         Check device to see if it has all of it’s devices in cache
>>         for mounting.
>
> 4.11 man page says for 'btrfs device ready'
> Wait until all devices of a multiple-device filesystem are scanned and
> registered within the kernel module.
>
> Not sure when it was updated.
>
>> Nothing about "wait".
>>
>> btrfs device ready is only useful during startup. The only check it does
>> - whether kernel knows about all devices or not. It is really intended
>> to be used from udev rule to delay attempt to mount filesystem until all
>> devices have been seen. And it is only useful with systemd as it is the
>> only program that actually pays attention to it.
>
> OK that makes sense. I'm just not used to the convention where a user
> space tool does literally nothing for the user, and also has a man
> page entry as if it is for users. And I get that developers are users
> too, it just seems like an odd convention but whatever.

TBH, it's not even entirely useful during startup.  Even if all the 
devices are supposedly there, the filesystem may still not mount because 
of any number of other issues, the inherent TOCTOU race 
notwithstanding, and even if all the devices aren't there, the filesystem 
may still mount depending on what options you pass.
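
To make the udev-style use discussed above concrete, here is a minimal polling sketch. It assumes that `btrfs device ready <device>` exits with status 0 once the kernel has registered every member device and non-zero otherwise; the function name `wait_ready`, the retry count and the interval are invented for illustration. As noted above, a zero status only says the devices have been seen, not that a mount will succeed.

```shell
# wait_ready DEVICE [TRIES] [INTERVAL]
# Poll `btrfs device ready` until it succeeds or TRIES attempts have failed.
# BTRFS can be overridden (e.g. for testing); it defaults to the real tool.
wait_ready() {
    dev=$1
    tries=${2:-30}
    interval=${3:-1}
    while [ "$tries" -gt 0 ]; do
        if ${BTRFS:-btrfs} device ready "$dev"; then
            return 0            # kernel reports all member devices present
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1                    # gave up: some devices were never seen
}
```

A caller might write `wait_ready /dev/mapper/vg-1 10 && mount /dev/mapper/vg-1 /mnt`, keeping in mind that the mount can still fail, or could have succeeded earlier with `-o degraded`.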


Re: Can I drop/reset files with corrupted data if they are in a read only snapshot?

2017-07-10 Thread Marc MERLIN

Thanks for the Cc/ping, I appreciate it

On Sun, Jul 09, 2017 at 11:38:51AM +, Duncan wrote:
> At your own risk you can try using btrfs property to set the ro snapshot 
> to rw.  Then you can delete the corrupted files and reset the snapshot 
> back to ro.
> 
> Of course you'll need to do the same thing on both the send and receive 
> side in order to keep the two reference snapshots in sync.
> 
> However, because my use-case doesn't involve send/receive, I've not been 
> able to personally verify that the above procedure doesn't screw up an 
> incremental send/receive job using that modified snapshot as a 
> reference.  There has been one report on the list of someone who did the 

On the plus side, on FS #1, I was able to do this, clear all the
corrupted blocks and both scrub and btrfs check come back clean.

On the 2nd one with more corruption, I'm very saddened to say that I did
clear all the blocks, and on my next scrub I got new files that had
corruption and didn't show up on the previous scrub (!).
So, I rebooted from 4.11 to 4.8 and I'm doing a new scrub. I will clear all
those blocks too, and see if it stays that way, or not.

For reference the steps look like this:
gargamel:/mnt/btrfs_pool1# btrfs property set Video_ro.20170625_01:23:05 ro false
truncate damaged files
gargamel:/mnt/btrfs_pool1# btrfs property set Video_ro.20170625_01:23:05 ro true
gargamel:/mnt/btrfs_pool1# btrfs scrub start -Bd /mnt/btrfs_pool1
scrub device /dev/mapper/dshelf1 (id 1) done
scrub started at Sun Jul  9 08:05:44 2017 and finished after 11:31:58
total bytes scrubbed: 10.43TiB with 0 errors
gargamel:/mnt/btrfs_pool1#
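
The same steps, written out as a small helper in case it is useful to others. This is a sketch only: the function name `drop_damaged` is made up, and it assumes "truncate damaged files" means truncating each file to zero length with coreutils `truncate -s 0`. It is a dry run by default and prints the commands instead of running them.

```shell
# Dry run by default: commands are echoed, not executed.
# Set BTRFS=btrfs and TRUNCATE=truncate to run them for real.
BTRFS="${BTRFS:-echo btrfs}"
TRUNCATE="${TRUNCATE:-echo truncate}"

# drop_damaged SNAPSHOT FILE...
# Flip a read-only snapshot to read-write, truncate the listed files
# (paths relative to the snapshot), then flip it back to read-only.
drop_damaged() {
    snap=$1
    shift
    $BTRFS property set "$snap" ro false || return 1
    for f in "$@"; do
        $TRUNCATE -s 0 "$snap/$f"
    done
    # Restore the read-only flag even if a truncate above failed.
    $BTRFS property set "$snap" ro true
}
```

As Duncan warned, the effect of this on later incremental send/receive against the modified snapshot is not verified.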

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901

