date:20160401

On 04/02/2016 09:33 AM, Yauhen Kharuzhy wrote:

On Sat, Apr 02, 2016 at 09:15:56AM +0800, Anand Jain wrote:

On 03/30/2016 03:47 AM, Yauhen Kharuzhy wrote:

On Tue, Mar 29, 2016 at 10:41:36PM +0800, Anand Jain wrote:

Hi Yauhen,

Issue 2.
When autoreplacing a drive with the hot spare starts, the kernel crashes in
the transaction handling code (inside btrfs_commit_transaction(), called by
the autoreplace initiating routines). I 'fixed' this by removing the closing
of the bdev in btrfs_close_one_device_dont_free(), see
https://bitbucket.org/jekhor/linux-btrfs/commits/dfa441c9ec7b3833f6a5e4d0b6f8c678faea29bb?at=master
(the oops text is also attached). The bdev is closed after the replace by
btrfs_dev_replace_finishing(), so this is safe, but it doesn't seem
to be the right way.

I have sent out V2. I don't see that issue with this,
could you pls try ?

Yes, it reproduced on v4.4.5 kernel. I will try with current
'for-linus-4.6' Chris' tree soon.

To emulate a drive failure, I disconnect the drive in VirtualBox, so bdev
can be freed by kernel after releasing of all references to it.

So far the RAID group profile adapts to a lower suitable
group profile when a device is missing or has failed. This appears
not to be happening with RAID56, or there is stale IO which wasn't
flushed out. Anyway, to have this fixed I am moving the patch
btrfs: introduce device dynamic state transition to offline or failed
to the top in v3 for any potential changes.
But first we need a reliable test case, or a very carefully
crafted one, which can create this situation.

Below is the dm-error test that I am using, which
apparently doesn't hit this issue. Could you please try it on V3?
(Please note that the device names are hard-coded in the test script,
sorry about that.) This will eventually become an fstests script.

Sure. But I don't see any V3 patches on the list. Are you still
preparing to send them, or did I miss something?

It's out now. There was a little distraction when I was about to send it.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: Global hotspare functionality

2016-04-01 Thread Yauhen Kharuzhy

On Sat, Apr 02, 2016 at 09:15:56AM +0800, Anand Jain wrote:
> 
> 
> On 03/30/2016 03:47 AM, Yauhen Kharuzhy wrote:
> >On Tue, Mar 29, 2016 at 10:41:36PM +0800, Anand Jain wrote:
> >>
> >>Hi Yauhen,
> >>
> >
> >>>
> >>>Issue 2.
> >>>When autoreplacing a drive with the hot spare starts, the kernel crashes in
> >>>the transaction handling code (inside btrfs_commit_transaction(), called by
> >>>the autoreplace initiating routines). I 'fixed' this by removing the closing
> >>>of the bdev in btrfs_close_one_device_dont_free(), see
> >>>https://bitbucket.org/jekhor/linux-btrfs/commits/dfa441c9ec7b3833f6a5e4d0b6f8c678faea29bb?at=master
> >>>(the oops text is also attached). The bdev is closed after the replace by
> >>>btrfs_dev_replace_finishing(), so this is safe, but it doesn't seem
> >>>to be the right way.
> >>
> >>  I have sent out V2. I don't see that issue with this,
> >>  could you pls try ?
> >
> >Yes, it reproduced on v4.4.5 kernel. I will try with current
> >'for-linus-4.6' Chris' tree soon.
> >
> >To emulate a drive failure, I disconnect the drive in VirtualBox, so bdev
> >can be freed by kernel after releasing of all references to it.
> 
>   So far the RAID group profile adapts to a lower suitable
>   group profile when a device is missing or has failed. This appears
>   not to be happening with RAID56, or there is stale IO which wasn't
>   flushed out. Anyway, to have this fixed I am moving the patch
>btrfs: introduce device dynamic state transition to offline or failed
>   to the top in v3 for any potential changes.
>   But first we need a reliable test case, or a very carefully
>   crafted one, which can create this situation.
> 
>   Below is the dm-error test that I am using, which
>   apparently doesn't hit this issue. Could you please try it on V3?
>   (Please note that the device names are hard-coded in the test script,
>   sorry about that.) This will eventually become an fstests script.

Sure. But I don't see any V3 patches on the list. Are you still
preparing to send them, or did I miss something?


-- 
Yauhen Kharuzhy

[PATCH 00/13 v3] Introduce device state 'failed', Hot spare and Auto replace

Thanks for various comments, tests and feedback.

Background: Hot spare and Auto replace:
 A hot spare is predominantly used to narrow the time window of
 degraded mode, during which any further disk failure might lead
 to catastrophic data loss. Data center storage generally has a
 couple of disks reserved as spares, so that a spare automatically
 kicks in to resilver the storage pool and bring it back to a
 healthy state. This is mainly a storage feature rather than an FS
 feature. I believe people acquainted with enterprise storage use
 cases will appreciate the need for it, and most, if not all,
 enterprise storage has a hot spare feature.

Btrfs device states:
 This patch set adds the 'failed' state and makes provision to use
 the 'offline' state, as two new device states. To summarize the
 various device states and their meanings:

 /* missing: device wasn't found at the time of mount */
 int missing;

 /*
  * failed: device confirmed to have experienced critical
  * io failure
  */
 int failed;

 /*
  * offline: there is no confirmation that the disk has failed,
  * only an interim communication breakdown, so the device is
  * not necessarily a candidate for device replace.
  * The device might come back online after user intervention
  * or after block transport layer error recovery.
  */
 int offline;
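
A minimal userspace sketch of how the three flags above could be read
together. The type and function names (dev_flags, classify_device,
is_replace_candidate) are hypothetical and not part of the patch set, and
the precedence failed > offline > missing is a choice of this sketch, made
because patch 11 also sets 'missing' when it forces a device offline or
failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the three flags above (hypothetical type). */
struct dev_flags {
	int missing;	/* device was not found at mount time */
	int failed;	/* confirmed critical IO failure */
	int offline;	/* interim communication breakdown */
};

enum dev_condition { DEV_ONLINE, DEV_MISSING, DEV_OFFLINE, DEV_FAILED };

/* Collapse the flags into one condition; the most specific state wins. */
enum dev_condition classify_device(const struct dev_flags *f)
{
	assert(f != NULL);
	if (f->failed)
		return DEV_FAILED;
	if (f->offline)
		return DEV_OFFLINE;
	if (f->missing)
		return DEV_MISSING;
	return DEV_ONLINE;
}

/* Only a confirmed failure makes the device an auto replace candidate. */
bool is_replace_candidate(const struct dev_flags *f)
{
	assert(f != NULL);
	return f->failed != 0;
}
```

An offline device is deliberately not reported as a candidate here, since
it may come back after transport recovery.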


Device state transition tuning and visualization:
 Sysfs interfaces are planned to provide the required tuning of
 device state transitions and sensitivities, and visualization of
 device states. However, the sysfs framework which could provide such
 an interface is still being reviewed/tested and is not ready yet. So
 for testing and debugging these features I have used an updated
 version of the procfs patch which is on the ML:

  [PATCH] btrfs: debug: procfs-devlist: introduce procfs interface for
  the device list for debugging

 I find the above patch very useful, stable, and easy to use (compared
 to sysfs) for visualizing the device state.

This patch set does not depend on any of the sysfs patches as such.

Backward compatibility:
 Adds a new incompatibility feature flag
 (BTRFS_FEATURE_INCOMPAT_SPARE_DEV) to manage the spare device
 when older kernels are used. It has been tested to work fine
 with older kernel/progs versions.


Auto replace:
 Replace happens automatically: when any write or flush fails,
 the device is marked as failed, which stops any further IO
 attempts to that device. In the next commit cycle, auto replace
 picks the spare device to replace the failed device, and so the
 btrfs volume is brought back to a healthy state.

Per FSID spare vs Global spare:
 As of now only the global hot spare is supported, that is, hot
 spares are for all the btrfs filesystems in the system. However,
 in future there will be an fs_info->no_auto_replace tunable which
 the user can set to limit the use of the global spare.


Example use case:
 Here below is an example use case of the hot spare setup.

 Add a spare device:
btrfs spare add /dev/sde -f

 If there is a spare device which is already added before the,
 just run

btrfs dev scan [/dev/sde]

 which will register the spare device with the kernel.

btrfs fi show
 Label: none uuid: 52f170c1-725c-457d-8cfd-d57090460091
  Total devices 2 FS bytes used 112.00KiB
  devid 1 size 2.00GiB used 417.50MiB path /dev/sdc
  devid 2 size 2.00GiB used 417.50MiB path /dev/sdd

Global spare
  device size 3.00GiB path /dev/sde


Patches:

Kernel:
 First, it needs Qu's per-chunk missing device patch set, which is
 included as part of this set.

 Next, patches 6-9 add support for the spare device. A kernel without
 the spare feature keeps away from the spare device, and a kernel
 that supports the spare device refuses to mount it. Further, these
 patches provide helper functions to pick a spare device and to
 release a spare device back to the spare device pool.

 Patch 10 provides a helper function for auto replace.
 Patch 11 provides a helper function to bring a device to the failed state.
 Patch 12 marks a device as failed based on flush and write errors,
  and avoids any further IO to it.
 The last patch, 13, triggers auto replace.
Progs:
 Needs the 4 patches below, which add the sub-command 'spare' to
 manage the spare device. As of now, deleting a spare device has to
 be done using wipefs. However, in the long run we would want a
 proper btrfs command to do that job.

V2->V3:
Kernel:
  Thanks to Yauhen and Austin for the review comments.
  Split patches 11 and 12 again, which were merged in V2.
  Patch numbers are reordered (sorry about that), but for the better.
  Fix rcu issue in btrfs_get_spare_device(); we don't need rcu
   as it's under uuid_mutex
  Fix rcu issue and check for the replace lock in
   btrfs_auto_replace_start()
  Cleanup old: casualty_kthread() new: health_kthread() with
   changes as per 838fe188 'btrfs: cleaner_kthread() doesn't need
   explicit freeze' (thanks

[PATCH 10/13] btrfs: introduce helper functions to perform hot replace

Hot replace / auto replace is an important volume manager feature
and is critical to data center operations, so that a degraded
volume can be brought back to a healthy state at the earliest and
without manual intervention.

This modifies the existing replace code to suit the needs of auto
replace; in the long run I hope both code paths will be merged.

Signed-off-by: Anand Jain 
Tested-by: Austin S. Hemmelgarn 
---
 fs/btrfs/dev-replace.c | 43 +++
 fs/btrfs/dev-replace.h |  1 +
 2 files changed, 44 insertions(+)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 2b926867d136..ceab4c51db32 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -957,3 +957,46 @@ void btrfs_bio_counter_inc_blocked(struct btrfs_fs_info *fs_info)
 &fs_info->fs_state));
}
 }
+
+int btrfs_auto_replace_start(struct btrfs_root *root,
+   struct btrfs_device *src_device)
+{
+   int ret;
+   char *tgt_path;
+   char *src_path;
+   struct btrfs_fs_info *fs_info = root->fs_info;
+
+   if (fs_info->sb->s_flags & MS_RDONLY)
+   return -EROFS;
+
+   btrfs_dev_replace_lock(&fs_info->dev_replace, 0);
+   if (btrfs_dev_replace_is_ongoing(&fs_info->dev_replace)) {
+   btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
+   return -EBUSY;
+   }
+   btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
+
+   if (btrfs_get_spare_device(&tgt_path)) {
+   btrfs_err(root->fs_info,
+   "No spare device found/configured in the kernel");
+   return -EINVAL;
+   }
+
+   rcu_read_lock();
+   src_path = kstrdup(rcu_str_deref(src_device->name), GFP_ATOMIC);
+   rcu_read_unlock();
+   if (!src_path) {
+   kfree(tgt_path);
+   return -ENOMEM;
+   }
+   ret = btrfs_dev_replace_start(root, tgt_path,
+   src_device->devid, src_path,
+   BTRFS_IOCTL_DEV_REPLACE_CONT_READING_FROM_SRCDEV_MODE_AVOID);
+   if (ret)
+   btrfs_put_spare_device(tgt_path);
+
+   kfree(tgt_path);
+   kfree(src_path);
+
+   return 0;
+}
diff --git a/fs/btrfs/dev-replace.h b/fs/btrfs/dev-replace.h
index e922b42d91df..b918b9d6e5df 100644
--- a/fs/btrfs/dev-replace.h
+++ b/fs/btrfs/dev-replace.h
@@ -46,4 +46,5 @@ static inline void btrfs_dev_replace_stats_inc(atomic64_t *stat_value)
 {
atomic64_inc(stat_value);
 }
+int btrfs_auto_replace_start(struct btrfs_root *root, struct btrfs_device *src_device);
 #endif
-- 
2.7.0
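
The control flow of btrfs_auto_replace_start() above can be modelled in
userspace roughly as follows. This is only a sketch under stated
assumptions: struct volume, struct spare_pool, spare_get() and spare_put()
are hypothetical stand-ins, and the start_result field simulates whatever
btrfs_dev_replace_start() would return. Unlike the posted hunk, which ends
in 'return 0', the sketch propagates the start result to the caller; that
is a choice of the sketch, not a statement about the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these are kernel structures. */
struct spare_pool {
	const char *path;	/* NULL when no spare is registered */
	bool in_use;
};

struct volume {
	bool read_only;
	bool replace_ongoing;
	int start_result;	/* what the simulated replace start returns */
};

/* Take the spare out of the pool; NULL if none is available. */
const char *spare_get(struct spare_pool *pool)
{
	assert(pool != NULL);
	if (pool->path == NULL || pool->in_use)
		return NULL;
	pool->in_use = true;
	return pool->path;
}

/* Hand the spare back to the pool. */
void spare_put(struct spare_pool *pool)
{
	assert(pool != NULL);
	pool->in_use = false;
}

int auto_replace_start(struct volume *vol, struct spare_pool *pool)
{
	int ret;

	assert(vol != NULL && pool != NULL);
	if (vol->read_only)
		return -EROFS;
	if (vol->replace_ongoing)
		return -EBUSY;
	if (spare_get(pool) == NULL)
		return -EINVAL;	/* no spare found/configured */

	ret = vol->start_result;	/* stands in for btrfs_dev_replace_start() */
	if (ret != 0)
		spare_put(pool);	/* failed start: the spare stays available */
	else
		vol->replace_ongoing = true;
	return ret;
}
```

The ordering matters: the spare is only taken after the read-only and
replace-ongoing checks pass, so an early return never leaks a spare.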


[PATCH 02/13] btrfs: Do per-chunk check for mount time check

From: Qu Wenruo 

Now use btrfs_check_degradable() to do the mount-time degraded check.

With this patch, we can now mount in the following case:
 # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
 # wipefs -a /dev/sdc
 # mount /dev/sdb /mnt/btrfs -o degraded
 As the single data chunk is only on sdb, it's OK to mount as degraded,
 as missing one device is OK for RAID1.

But it still fails in the following case, as expected:
 # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
 # wipefs -a /dev/sdb
 # mount /dev/sdc /mnt/btrfs -o degraded
 As the data chunk is only on sdb, it's not OK to mount it as degraded.

Reported-by: Zhao Lei 
Reported-by: Anand Jain 
Signed-off-by: Qu Wenruo 

[Btrfs: use btrfs_error instead of btrfs_err during mount]
Signed-off-by: Anand Jain 
---
 fs/btrfs/disk-io.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c95e3ce9f22e..bfea0f8f6a87 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2880,6 +2880,16 @@ int open_ctree(struct super_block *sb,
goto fail_tree_roots;
}
 
+   ret = btrfs_check_degradable(fs_info, fs_info->sb->s_flags);
+   if (ret < 0) {
+   btrfs_err(fs_info, "degraded writable mount failed %d", ret);
+   goto fail_tree_roots;
+   } else if (ret > 0 && !btrfs_test_opt(chunk_root, DEGRADED)) {
+   btrfs_warn(fs_info,
+   "Some device missing, but still degraded mountable, please mount with -o degraded option");
+   ret = -EACCES;
+   goto fail_tree_roots;
+   }
/*
 * keep the device that is marked to be the target device for the
 * dev_replace procedure
@@ -2983,14 +2993,6 @@ retry_root_backup:
}
fs_info->num_tolerated_disk_barrier_failures =
btrfs_calc_num_tolerated_disk_barrier_failures(fs_info);
-   if (fs_info->fs_devices->missing_devices >
-fs_info->num_tolerated_disk_barrier_failures &&
-   !(sb->s_flags & MS_RDONLY)) {
-   pr_warn("BTRFS: missing devices(%llu) exceeds the limit(%d), writeable mount is not allowed\n",
-   fs_info->fs_devices->missing_devices,
-   fs_info->num_tolerated_disk_barrier_failures);
-   goto fail_sysfs;
-   }
 
fs_info->cleaner_kthread = kthread_run(cleaner_kthread, tree_root,
   "btrfs-cleaner");
-- 
2.7.0
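
The two mkfs cases in the description above can be replayed with a small
userspace model of the per-chunk check plus the mount-time policy from the
hunk. This is a sketch, assuming a simplified chunk description: struct
chunk, check_degradable() and mount_decision() are hypothetical names, and
the model omits the "no chunk at all" error path of the real
btrfs_check_degradable().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STRIPES 4

/* Hypothetical, simplified chunk description (not the kernel's map_lookup). */
struct chunk {
	int max_tolerated;	/* missing devices the profile survives */
	int num_stripes;
	bool stripe_missing[MAX_STRIPES];	/* true if that device is gone */
};

/* 0: all chunks complete, 1: degraded but usable, -EIO: some chunk lost. */
int check_degradable(const struct chunk *chunks, size_t n)
{
	int ret = 0;

	assert(chunks != NULL || n == 0);
	for (size_t c = 0; c < n; c++) {
		int missing = 0;

		for (int i = 0; i < chunks[c].num_stripes; i++)
			if (chunks[c].stripe_missing[i])
				missing++;
		if (missing > chunks[c].max_tolerated)
			return -EIO;
		if (missing)
			ret = 1;
	}
	return ret;
}

/* Mount-time policy: a degradable volume still needs -o degraded. */
int mount_decision(int check_ret, bool degraded_opt)
{
	if (check_ret < 0)
		return check_ret;
	if (check_ret > 0 && !degraded_opt)
		return -EACCES;
	return 0;
}
```

With metadata as a two-stripe RAID1 chunk (tolerates one missing device)
and data as a one-stripe single chunk on sdb (tolerates none), wiping sdc
yields a degradable result and wiping sdb yields an error, matching the
two cases in the description.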


[PATCH 05/13] btrfs: Cleanup num_tolerated_disk_barrier_failures

From: Qu Wenruo 

As we now use the per-chunk degradable check, the global
num_tolerated_disk_barrier_failures is of no use. So clean it up.

Signed-off-by: Qu Wenruo 

[Btrfs: resolve conflict to apply 'btrfs: Cleanup num_tolerated_disk_barrier_failures']
Signed-off-by: Anand Jain 
---
 fs/btrfs/ctree.h   |  2 --
 fs/btrfs/disk-io.c | 56 --
 fs/btrfs/disk-io.h |  2 --
 fs/btrfs/volumes.c | 17 -
 4 files changed, 77 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 84a6a5b3384a..e0a50f478e01 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1829,8 +1829,6 @@ struct btrfs_fs_info {
/* next backup root to be overwritten */
int backup_root_index;
 
-   int num_tolerated_disk_barrier_failures;
-
/* device replace state */
struct btrfs_dev_replace dev_replace;
 
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 85e26d62c089..7f02f1766037 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2991,8 +2991,6 @@ retry_root_backup:
printk(KERN_ERR "BTRFS: Failed to read block groups: %d\n", ret);
goto fail_sysfs;
}
-   fs_info->num_tolerated_disk_barrier_failures =
-   btrfs_calc_num_tolerated_disk_barrier_failures(fs_info);
 
fs_info->cleaner_kthread = kthread_run(cleaner_kthread, tree_root,
   "btrfs-cleaner");
@@ -3559,60 +3557,6 @@ int btrfs_get_num_tolerated_disk_barrier_failures(u64 flags)
return min_tolerated;
 }
 
-int btrfs_calc_num_tolerated_disk_barrier_failures(
-   struct btrfs_fs_info *fs_info)
-{
-   struct btrfs_ioctl_space_info space;
-   struct btrfs_space_info *sinfo;
-   u64 types[] = {BTRFS_BLOCK_GROUP_DATA,
-  BTRFS_BLOCK_GROUP_SYSTEM,
-  BTRFS_BLOCK_GROUP_METADATA,
-  BTRFS_BLOCK_GROUP_DATA | BTRFS_BLOCK_GROUP_METADATA};
-   int i;
-   int c;
-   int num_tolerated_disk_barrier_failures =
-   (int)fs_info->fs_devices->num_devices;
-
-   for (i = 0; i < ARRAY_SIZE(types); i++) {
-   struct btrfs_space_info *tmp;
-
-   sinfo = NULL;
-   rcu_read_lock();
-   list_for_each_entry_rcu(tmp, &fs_info->space_info, list) {
-   if (tmp->flags == types[i]) {
-   sinfo = tmp;
-   break;
-   }
-   }
-   rcu_read_unlock();
-
-   if (!sinfo)
-   continue;
-
-   down_read(&sinfo->groups_sem);
-   for (c = 0; c < BTRFS_NR_RAID_TYPES; c++) {
-   u64 flags;
-
-   if (list_empty(&sinfo->block_groups[c]))
-   continue;
-
-   btrfs_get_block_group_info(&sinfo->block_groups[c],
-  &space);
-   if (space.total_bytes == 0 || space.used_bytes == 0)
-   continue;
-   flags = space.flags;
-
-   num_tolerated_disk_barrier_failures = min(
-   num_tolerated_disk_barrier_failures,
-   btrfs_get_num_tolerated_disk_barrier_failures(
-   flags));
-   }
-   up_read(&sinfo->groups_sem);
-   }
-
-   return num_tolerated_disk_barrier_failures;
-}
-
 static int write_all_supers(struct btrfs_root *root, int max_mirrors)
 {
struct list_head *head;
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 8e79d0070bcf..dd155621f95f 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -141,8 +141,6 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 int btree_lock_page_hook(struct page *page, void *data,
void (*flush_fn)(void *));
 int btrfs_get_num_tolerated_disk_barrier_failures(u64 flags);
-int btrfs_calc_num_tolerated_disk_barrier_failures(
-   struct btrfs_fs_info *fs_info);
 int __init btrfs_end_io_wq_init(void);
 void btrfs_end_io_wq_exit(void);
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a840d78ba127..dff2deaf88d3 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1876,9 +1876,6 @@ int btrfs_rm_device(struct btrfs_root *root, char *device_path)
free_fs_devices(cur_devices);
}
 
-   root->fs_info->num_tolerated_disk_barrier_failures =
-   btrfs_calc_num_tolerated_disk_barrier_failures(root->fs_info);
-
/*
 * at this point, the device is zero sized.  We want to
 * remove it from the devices list and zero out the old super
@@ -2405,8 +2402,6 @@ int btrfs_init_new_device(struct btrfs_root *root, char *device_path)

[PATCH 11/13] btrfs: introduce device dynamic state transition to offline or failed

This patch provides helper functions to force a device to offline
or failed. We need these device states for the following reasons:
1) a. it can be reported that a device has failed when it does
   b. close the device when it goes offline so that the block layer can
      clean up
2) identify the candidate for the auto replace
3) avoid further commit errors reported against the failing device and
4) a device in a multi-device btrfs may go offline from the system
   (but as of now, in some system configs btrfs gets unmounted in this
   context, which is not correct behavior)

Signed-off-by: Anand Jain 
Tested-by: Austin S. Hemmelgarn 
---
 fs/btrfs/volumes.c | 137 +
 fs/btrfs/volumes.h |  13 +
 2 files changed, 150 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 072cefac958c..eb9f28504d3f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7149,3 +7149,140 @@ out:
read_unlock(&map_tree->map_tree.lock);
return ret;
 }
+
+static void __close_device(struct work_struct *work)
+{
+   struct btrfs_device *device;
+
+   device = container_of(work, struct btrfs_device, rcu_work);
+
+   if (device->bdev)
+   blkdev_put(device->bdev, device->mode);
+
+   device->bdev = NULL;
+}
+
+static void close_device(struct rcu_head *head)
+{
+   struct btrfs_device *device;
+
+   device = container_of(head, struct btrfs_device, rcu);
+
+   INIT_WORK(&device->rcu_work, __close_device);
+   schedule_work(&device->rcu_work);
+}
+
+void btrfs_close_one_device_dont_free(struct btrfs_device *device)
+{
+   struct btrfs_fs_devices *fs_devices = device->fs_devices;
+
+   if (device->bdev)
+   fs_devices->open_devices--;
+
+   if (device->writeable &&
+   device->devid != BTRFS_DEV_REPLACE_DEVID) {
+   list_del_init(&device->dev_alloc_list);
+   fs_devices->rw_devices--;
+   }
+
+   device->writeable = 0;
+
+   call_rcu(&device->rcu, close_device);
+}
+
+void force_device_close(struct btrfs_device *device)
+{
+   struct btrfs_device *next_device;
+   struct btrfs_fs_devices *fs_devices;
+
+   fs_devices = device->fs_devices;
+
+   mutex_lock(&fs_devices->device_list_mutex);
+   lock_chunks(fs_devices->fs_info->fs_root);
+
+   next_device = list_entry(fs_devices->devices.next,
+   struct btrfs_device, dev_list);
+   if (device->bdev == fs_devices->fs_info->sb->s_bdev)
+   fs_devices->fs_info->sb->s_bdev = next_device->bdev;
+
+   if (device->bdev == fs_devices->latest_bdev)
+   fs_devices->latest_bdev = next_device->bdev;
+
+   btrfs_close_one_device_dont_free(device);
+
+   /*
+* TODO: works for now, but its better to keep the state of
+* missing and offline different, and update rest of the
+* places where we check for only missing and not for failed
+* or offline as of now.
+*/
+   device->missing = 1;
+   fs_devices->missing_devices++;
+   device->writeable = 0;
+
+   rcu_barrier();
+
+   unlock_chunks(fs_devices->fs_info->fs_root);
+   mutex_unlock(&fs_devices->device_list_mutex);
+}
+
+void btrfs_enforce_device_state(struct btrfs_device *dev, char *why)
+{
+   bool degrade_option;
+   int tolerated_fail;
+   struct btrfs_fs_info *fs_info;
+   struct btrfs_fs_devices *fs_devices;
+
+   fs_devices = dev->fs_devices;
+   fs_info = fs_devices->fs_info;
+   degrade_option = btrfs_test_opt(fs_info->fs_root, DEGRADED);
+
+   /* todo: support seed later */
+   if (fs_devices->seeding)
+   return;
+
+   /* this shouldn't be called if device is already missing */
+   if (dev->missing || !dev->bdev)
+   return;
+
+   if (dev->offline || dev->failed)
+   return;
+
+   /* Only RW device is requested to force close let FS handle it*/
+   if (fs_devices->rw_devices == 1) {
+   btrfs_std_error(fs_info, -EIO,
+   "force offline last RW device");
+   return;
+   }
+
+   if (!strcmp(why, "offline"))
+   dev->offline = 1;
+   else if (!strcmp(why, "failed"))
+   dev->failed = 1;
+   else
+   return;
+
+   btrfs_sysfs_rm_device_link(fs_devices, dev);
+
+   force_device_close(dev);
+
+   tolerated_fail = btrfs_check_degradable(fs_info,
+   fs_info->sb->s_flags);
+   if (tolerated_fail > 0) {
+   btrfs_warn_in_rcu(fs_info, "device %s %s, chunks degraded",
+   rcu_str_deref(dev->name), why);
+   } else if(tolerated_fail < 0) {
+   btrfs_warn_in_rcu(fs_info,
+   "device %s %s, chunks failed",
+   rcu_str_deref(dev->name), why);
+
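
The guard sequence in btrfs_enforce_device_state() shown in the truncated
hunk above can be summarized with the following userspace sketch. The
struct names and the int return codes are hypothetical (the posted helper
returns void), and the model assumes the device being forced was
writeable, so that closing it decrements the RW device count.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced views of the device and the volume. */
struct dev {
	bool missing, offline, failed, has_bdev;
};

struct vol {
	bool seeding;
	int rw_devices;
};

/* Returns 0 if the state was applied, else a negative errno (sketch only). */
int enforce_device_state(struct vol *v, struct dev *d, const char *why)
{
	assert(v != NULL && d != NULL && why != NULL);
	if (v->seeding)
		return -EOPNOTSUPP;	/* seed volumes are not handled yet */
	if (d->missing || !d->has_bdev || d->offline || d->failed)
		return -EALREADY;	/* nothing left to transition */
	if (v->rw_devices == 1)
		return -EIO;		/* never force away the last RW device */

	if (strcmp(why, "offline") == 0)
		d->offline = true;
	else if (strcmp(why, "failed") == 0)
		d->failed = true;
	else
		return -EINVAL;

	/* Mirrors force_device_close(): closed and counted as missing. */
	d->has_bdev = false;
	d->missing = true;
	v->rw_devices--;
	return 0;
}
```

Note that the state string is validated only after all the guards pass, so
an unknown string on a healthy multi-device volume changes nothing.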

[PATCH 03/13] btrfs: Do per-chunk degraded check for remount

From: Qu Wenruo 

Just as for the mount-time check, use the new btrfs_check_degradable() to
do the per-chunk check.

Signed-off-by: Qu Wenruo 

Btrfs: use btrfs_error instead of btrfs_err during remount

Signed-off-by: Anand Jain 
---
 fs/btrfs/super.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 00b8f37cc306..87639fa53b10 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1767,11 +1767,14 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
goto restore;
}
 
-   if (fs_info->fs_devices->missing_devices >
-fs_info->num_tolerated_disk_barrier_failures &&
-   !(*flags & MS_RDONLY)) {
+   ret = btrfs_check_degradable(fs_info, *flags);
+   if (ret < 0) {
+   btrfs_err(fs_info,
+   "degraded writable remount failed %d", ret);
+   goto restore;
+   } else if (ret > 0 && !btrfs_test_opt(root, DEGRADED)) {
btrfs_warn(fs_info,
-   "too many missing devices, writeable remount is not allowed");
+   "some device missing, but still degraded mountable, please remount with -o degraded option");
ret = -EACCES;
goto restore;
}
-- 
2.7.0


[PATCH 01/13] btrfs: Introduce a new function to check if all chunks are OK for degraded mount

From: Qu Wenruo 

Introduce a new function, btrfs_check_degradable(), to judge whether all
chunks in a btrfs filesystem are OK for degraded mount.

It provides a new basis for accurate btrfs mount/remount and even
runtime degraded mount checks, replacing the old one-size-fits-all method.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/volumes.c | 63 ++
 fs/btrfs/volumes.h |  1 +
 2 files changed, 64 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index e2b54d546b7c..dd3dc53a302a 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7042,3 +7042,66 @@ static void btrfs_close_one_device(struct btrfs_device *device)
 
call_rcu(&device->rcu, free_device);
 }
+
+/*
+ * Check if all chunks in the fs are OK for degraded mount
+ * Caller itself should do extra check if DEGRADED mount option is given
+ * for >0 return value.
+ *
+ * Return 0 if all chunks are OK.
+ * Return >0 if all chunks are degradable but not all OK.
+ * Return <0 if any chunk is not degradable or other bug.
+ */
+int btrfs_check_degradable(struct btrfs_fs_info *fs_info, unsigned flags)
+{
+   struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
+   struct extent_map *em;
+   u64 next_start = 0;
+   int ret = 0;
+
+   if (flags & MS_RDONLY)
+   return 0;
+
+   read_lock(&map_tree->map_tree.lock);
+   em = lookup_extent_mapping(&map_tree->map_tree, 0, (u64)(-1));
+   /* No any chunk? Should be a huge bug */
+   if (!em) {
+   ret = -ENOENT;
+   goto out;
+   }
+
+   while (em) {
+   struct map_lookup *map;
+   int missing = 0;
+   int max_tolerated;
+   int i;
+
+   map = (struct map_lookup *) em->bdev;
+   max_tolerated =
+   btrfs_get_num_tolerated_disk_barrier_failures(
+   map->type);
+   for (i = 0; i < map->num_stripes; i++) {
+   if (map->stripes[i].dev->missing)
+   missing++;
+   }
+   if (missing > max_tolerated) {
+   ret = -EIO;
+   btrfs_warn(fs_info,
+  "missing devices(%d) exceeds the limit(%d), writeable mount is not allowed",
+  missing, max_tolerated);
+   goto out;
+   } else if (missing)
+   ret = 1;
+   next_start = extent_map_end(em);
+
+   /*
+* Always search range [next_start, (u64)-1) to find the next
+* chunk map
+*/
+   em = lookup_extent_mapping(&map_tree->map_tree, next_start,
+  (u64)(-1) - next_start);
+   }
+out:
+   read_unlock(&map_tree->map_tree.lock);
+   return ret;
+}
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 1939ebde63df..351431a3f5aa 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -566,5 +566,6 @@ static inline void unlock_chunks(struct btrfs_root *root)
 struct list_head *btrfs_get_fs_uuids(void);
 void btrfs_set_fs_info_ptr(struct btrfs_fs_info *fs_info);
 void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info);
+int btrfs_check_degradable(struct btrfs_fs_info *fs_info, unsigned flags);
 
 #endif
-- 
2.7.0
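
The chunk walk in btrfs_check_degradable() restarts the lookup at the end
of the previous mapping and passes (u64)(-1) - next_start as the length,
so that start plus length never wraps. A minimal userspace sketch of that
iteration pattern follows; struct range, lookup_range() and count_ranges()
are hypothetical, and a sorted array with a linear scan stands in for the
kernel's extent map tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a chunk mapping covering [start, start + len). */
struct range {
	uint64_t start;
	uint64_t len;
};

/*
 * Return the first entry of the sorted, non-overlapping array that
 * intersects [start, start + len), or NULL if there is none. Plays the
 * role of lookup_extent_mapping() in this sketch.
 */
const struct range *lookup_range(const struct range *r, size_t n,
				 uint64_t start, uint64_t len)
{
	assert(len <= UINT64_MAX - start);	/* start + len must not wrap */
	for (size_t i = 0; i < n; i++) {
		if (r[i].start < start + len && r[i].start + r[i].len > start)
			return &r[i];
	}
	return NULL;
}

/* Visit every range once by restarting the search at the previous end. */
size_t count_ranges(const struct range *r, size_t n)
{
	size_t seen = 0;
	uint64_t next_start = 0;
	const struct range *cur = lookup_range(r, n, 0, UINT64_MAX);

	while (cur != NULL) {
		seen++;
		next_start = cur->start + cur->len;
		cur = lookup_range(r, n, next_start, UINT64_MAX - next_start);
	}
	return seen;
}
```

Gaps between mappings are skipped naturally, because each lookup returns
the first mapping at or after next_start rather than an exact match.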


[PATCH 07/13] btrfs: add check not to mount a spare device

Spare devices can be scanned but shouldn't be mountable.

Signed-off-by: Anand Jain 
Tested-by: Austin S. Hemmelgarn 
---
 fs/btrfs/disk-io.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7f02f1766037..b99329e37965 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2806,6 +2806,14 @@ int open_ctree(struct super_block *sb,
goto fail_alloc;
}
 
+   if (btrfs_super_incompat_flags(disk_super) &
+   BTRFS_FEATURE_INCOMPAT_SPARE_DEV) {
+   /*You can only scan a spare device but not mount*/
+   printk(KERN_ERR "BTRFS: You can't mount a spare device\n");
+   err = -ENOTSUPP;
+   goto fail_alloc;
+   }
+
/*
 * Needn't use the lock because there is no other task which will
 * update the flag.
-- 
2.7.0
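
The scan-versus-mount split for spare devices comes down to one bit test
on the superblock incompat flags. The sketch below models it in userspace.
The flag value 0x400 is an assumption (the "(400)" in the patch 06
description read as hexadecimal), struct super_copy is a hypothetical
stand-in for the superblock, and -ENOTSUP is used as the userspace
analogue of the kernel-internal -ENOTSUPP in the hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value: the "(400)" from the patch 06 description, read as hex. */
#define INCOMPAT_SPARE_DEV 0x400ULL

/* Minimal stand-in for the superblock field consulted by both paths. */
struct super_copy {
	uint64_t incompat_flags;
};

/* Scanning is always allowed; it only records whether this is a spare. */
int scan_is_spare(const struct super_copy *sb)
{
	assert(sb != NULL);
	return (sb->incompat_flags & INCOMPAT_SPARE_DEV) != 0;
}

/* Mounting a spare is refused; any other device passes this check. */
int mount_check(const struct super_copy *sb)
{
	assert(sb != NULL);
	if (sb->incompat_flags & INCOMPAT_SPARE_DEV)
		return -ENOTSUP;
	return 0;
}
```

Because the bit lives in the incompat set, a kernel that does not know the
flag is expected to reject the device as well, which is how older kernels
are kept away from spares.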


[PATCH 08/13] btrfs: support btrfs dev scan for spare device

When the user or system calls the BTRFS_IOC_SCAN_DEV ioctl,
this patch makes sure the device is added to the device
list and marked as a spare.

The same applies to the BTRFS_IOC_DEVICES_READY ioctl,
since BTRFS_IOC_DEVICES_READY has been scanning devices
as well for legacy reasons.

Signed-off-by: Anand Jain 
Tested-by: Austin S. Hemmelgarn 
---
 fs/btrfs/volumes.c | 4 
 fs/btrfs/volumes.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index dff2deaf88d3..d729539f9612 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -604,6 +604,10 @@ static noinline int device_list_add(const char *path,
if (IS_ERR(fs_devices))
return PTR_ERR(fs_devices);
 
+   if (btrfs_super_incompat_flags(disk_super) &
+   BTRFS_FEATURE_INCOMPAT_SPARE_DEV)
+   fs_devices->spare = 1;
+
list_add(&fs_devices->list, &fs_uuids);
 
device = NULL;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 48ced5cc09e4..51cf716eb35b 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -263,6 +263,8 @@ struct btrfs_fs_devices {
struct kobject fsid_kobj;
struct kobject *device_dir_kobj;
struct completion kobj_unregister;
+
+   int spare;
 };
 
 #define BTRFS_BIO_INLINE_CSUM_SIZE 64
-- 
2.7.0


[PATCH 12/13] btrfs: check device for critical errors and mark failed

Write and flush errors are considered critical errors,
upon which the device is brought offline and marked as
failed. Write and flush errors are identified using the device
error statistics. This is monitored by a kthread,
btrfs_health.

Signed-off-by: Anand Jain 
Tested-by: Austin S. Hemmelgarn 
---
 fs/btrfs/ctree.h   |   2 ++
 fs/btrfs/disk-io.c | 101 -
 fs/btrfs/volumes.c |   1 +
 fs/btrfs/volumes.h |   4 +++
 4 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index aa693cfdc9f0..47e9cd9dd29a 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1569,6 +1569,7 @@ struct btrfs_fs_info {
struct mutex tree_log_mutex;
struct mutex transaction_kthread_mutex;
struct mutex cleaner_mutex;
+   struct mutex health_mutex;
struct mutex chunk_mutex;
struct mutex volume_mutex;
 
@@ -1686,6 +1687,7 @@ struct btrfs_fs_info {
struct btrfs_workqueue *extent_workers;
struct task_struct *transaction_kthread;
struct task_struct *cleaner_kthread;
+   struct task_struct *health_kthread;
int thread_pool_size;
 
struct kobject *space_info_kobj;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b99329e37965..b523e56b34e9 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1869,6 +1869,93 @@ sleep:
return 0;
 }
 
+/*
+ * returns:
+ * < 0 : Check didn't run, std error
+ *   0 : No errors found
+ * > 0 : # of devices having fatal errors
+ */
+static int btrfs_update_devices_health(struct btrfs_root *root)
+{
+   int ret = 0;
+   struct btrfs_device *device;
+   struct btrfs_fs_info *fs_info = root->fs_info;
+
+   if (btrfs_fs_closing(fs_info))
+   return -EBUSY;
+
+   /* mark disk(s) with write or flush error(s) as failed */
+   mutex_lock(&fs_info->volume_mutex);
+   list_for_each_entry_rcu(device,
+   &fs_info->fs_devices->devices, dev_list) {
+   int c_err;
+
+   if (device->failed) {
+   ret++;
+   continue;
+   }
+
+   /*
+* todo: replace target device's write/flush error,
+* skip for now
+*/
+   if (device->is_tgtdev_for_dev_replace)
+   continue;
+
+   if (!device->dev_stats_valid)
+   continue;
+
+   c_err = atomic_read(&device->new_critical_errs);
+   atomic_sub(c_err, &device->new_critical_errs);
+   if (c_err) {
+   btrfs_crit_in_rcu(fs_info,
+   "fatal error on device %s",
+   rcu_str_deref(device->name));
+   btrfs_enforce_device_state(device, "failed");
+   ret ++;
+   }
+   }
+   mutex_unlock(&fs_info->volume_mutex);
+
+   return ret;
+}
+
+/*
+ * Devices health maintenance kthread, gets woken-up by transaction
+ * kthread, once sysfs is ready, this should publish the report
+ * through sysfs so that user land scripts can invoke actions.
+ */
+static int health_kthread(void *arg)
+{
+   struct btrfs_root *root = arg;
+
+   do {
+   if (btrfs_need_cleaner_sleep(root))
+   goto sleep;
+
+   if (!mutex_trylock(&root->fs_info->health_mutex))
+   goto sleep;
+
+   if (btrfs_need_cleaner_sleep(root)) {
+   mutex_unlock(&root->fs_info->health_mutex);
+   goto sleep;
+   }
+
+   /* Check devices health */
+   btrfs_update_devices_health(root);
+
+   mutex_unlock(&root->fs_info->health_mutex);
+
+sleep:
+   set_current_state(TASK_INTERRUPTIBLE);
+   if (!kthread_should_stop())
+   schedule();
+   __set_current_state(TASK_RUNNING);
+   } while (!kthread_should_stop());
+
+   return 0;
+}
+
 static int transaction_kthread(void *arg)
 {
struct btrfs_root *root = arg;
@@ -1915,6 +2002,7 @@ static int transaction_kthread(void *arg)
btrfs_end_transaction(trans, root);
}
 sleep:
+   wake_up_process(root->fs_info->health_kthread);
wake_up_process(root->fs_info->cleaner_kthread);
mutex_unlock(&root->fs_info->transaction_kthread_mutex);
 
@@ -2663,6 +2751,7 @@ int open_ctree(struct super_block *sb,
mutex_init(&fs_info->chunk_mutex);
mutex_init(&fs_info->transaction_kthread_mutex);
mutex_init(&fs_info->cleaner_mutex);
+   mutex_init(&fs_info->health_mutex);
mutex_init(&fs_info->volume_mutex);
mutex_init(&fs_info->ro_block_group_mutex);
init_rwsem(&fs_info->commit_root_sem);
@@ -3005,11 +3094,16 @@ retry_root_backup:
if

[PATCH 06/13] btrfs: introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV

Add the BTRFS_FEATURE_INCOMPAT_SPARE_DEV (0x400) flag to identify
a spare device.

Along with this, the mount path checks for the flag and refuses
to mount a spare device, as spare devices aren't mountable.

Signed-off-by: Anand Jain 
Tested-by: Austin S. Hemmelgarn 
---
 fs/btrfs/ctree.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index e0a50f478e01..2c185a8e92f0 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -531,6 +531,7 @@ struct btrfs_super_block {
 #define BTRFS_FEATURE_INCOMPAT_RAID56  (1ULL << 7)
 #define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA (1ULL << 8)
 #define BTRFS_FEATURE_INCOMPAT_NO_HOLES(1ULL << 9)
+#define BTRFS_FEATURE_INCOMPAT_SPARE_DEV   (1ULL << 10)
 
 #define BTRFS_FEATURE_COMPAT_SUPP  0ULL
 #define BTRFS_FEATURE_COMPAT_SAFE_SET  0ULL
@@ -551,7 +552,8 @@ struct btrfs_super_block {
 BTRFS_FEATURE_INCOMPAT_RAID56 |\
 BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF | \
 BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA |   \
-BTRFS_FEATURE_INCOMPAT_NO_HOLES)
+BTRFS_FEATURE_INCOMPAT_NO_HOLES |  \
+BTRFS_FEATURE_INCOMPAT_SPARE_DEV)
 
 #define BTRFS_FEATURE_INCOMPAT_SAFE_SET\
(BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF)
-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
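
For readers following the series, here is a minimal userspace sketch of how an incompat bit such as SPARE_DEV is typically tested at mount time. The bit positions are copied from the hunk above; the DEMO_ names, the demo_can_mount() helper and its return codes are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from the hunk above; DEMO_ marks them as illustrative. */
#define DEMO_INCOMPAT_RAID56          (1ULL << 7)
#define DEMO_INCOMPAT_SKINNY_METADATA (1ULL << 8)
#define DEMO_INCOMPAT_NO_HOLES        (1ULL << 9)
#define DEMO_INCOMPAT_SPARE_DEV       (1ULL << 10)   /* 0x400 */

/* Every incompat bit this (hypothetical) implementation understands. */
#define DEMO_INCOMPAT_SUPP                 \
    (DEMO_INCOMPAT_RAID56 |                \
     DEMO_INCOMPAT_SKINNY_METADATA |       \
     DEMO_INCOMPAT_NO_HOLES |              \
     DEMO_INCOMPAT_SPARE_DEV)

/* Returns the on-disk incompat bits that are not in the supported mask. */
static uint64_t demo_unsupported_bits(uint64_t on_disk, uint64_t supported)
{
    return on_disk & ~supported;
}

/*
 * Hypothetical mount-time policy:
 *   0  : mountable
 *  -1  : superblock carries an incompat bit we do not understand
 *  -2  : device is a spare, which is understood but never mountable
 */
static int demo_can_mount(uint64_t on_disk)
{
    if (demo_unsupported_bits(on_disk, DEMO_INCOMPAT_SUPP))
        return -1;
    if (on_disk & DEMO_INCOMPAT_SPARE_DEV)
        return -2;
    return 0;
}
```

The point of the sketch is that the flag has to be in the supported mask for the superblock to be recognised at all, while the mount itself is still refused by a separate check.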

[PATCH 04/13] btrfs: Allow barrier_all_devices to do per-chunk device check

From: Qu Wenruo 

The last user of num_tolerated_disk_barrier_failures is
barrier_all_devices(). But it can easily be changed to use the new
per-chunk degradable check framework.

Now btrfs_device has two extra members, representing send/wait
errors, set at write_dev_flush() time. They are then checked in a way
that is similar to, but more accurate than, the old code.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/disk-io.c | 13 +
 fs/btrfs/volumes.c |  6 +-
 fs/btrfs/volumes.h |  4 
 3 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index bfea0f8f6a87..85e26d62c089 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3491,8 +3491,6 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
 {
struct list_head *head;
struct btrfs_device *dev;
-   int errors_send = 0;
-   int errors_wait = 0;
int ret;
 
/* send down all the barriers */
@@ -3501,7 +3499,7 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
if (dev->missing)
continue;
if (!dev->bdev) {
-   errors_send++;
+   dev->err_send = 1;
continue;
}
if (!dev->in_fs_metadata || !dev->writeable)
@@ -3509,7 +3507,7 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
 
ret = write_dev_flush(dev, 0);
if (ret)
-   errors_send++;
+   dev->err_send = 1;
}
 
/* wait for all the barriers */
@@ -3517,7 +3515,7 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
if (dev->missing)
continue;
if (!dev->bdev) {
-   errors_wait++;
+   dev->err_wait = 1;
continue;
}
if (!dev->in_fs_metadata || !dev->writeable)
@@ -3525,10 +3523,9 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
 
ret = write_dev_flush(dev, 1);
if (ret)
-   errors_wait++;
+   dev->err_wait = 1;
}
-   if (errors_send > info->num_tolerated_disk_barrier_failures ||
-   errors_wait > info->num_tolerated_disk_barrier_failures)
+   if (btrfs_check_degradable(info, info->sb->s_flags) < 0)
return -EIO;
return 0;
 }
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index dd3dc53a302a..a840d78ba127 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7081,8 +7081,12 @@ int btrfs_check_degradable(struct btrfs_fs_info *fs_info, unsigned flags)
btrfs_get_num_tolerated_disk_barrier_failures(
map->type);
for (i = 0; i < map->num_stripes; i++) {
-   if (map->stripes[i].dev->missing)
+   if (map->stripes[i].dev->missing ||
+   map->stripes[i].dev->err_wait ||
+   map->stripes[i].dev->err_send)
missing++;
+   map->stripes[i].dev->err_wait = 0;
+   map->stripes[i].dev->err_send = 0;
}
if (missing > max_tolerated) {
ret = -EIO;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 351431a3f5aa..48ced5cc09e4 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -76,6 +76,10 @@ struct btrfs_device {
int can_discard;
int is_tgtdev_for_dev_replace;
 
+   /* for barrier_all_devices() check */
+   int err_send;
+   int err_wait;
+
 #ifdef __BTRFS_NEED_DEVICE_DATA_ORDERED
seqcount_t data_seqcount;
 #endif
-- 
2.7.0
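
The difference between a global failure count and the per-chunk check can be sketched in userspace roughly as follows. All struct and function names here (demo_dev, demo_chunk, demo_check_degradable and so on) are hypothetical stand-ins, not the kernel's types. The sketch also clears the error flags in a separate pass so that every chunk is evaluated against the same device state, whereas the hunk above resets them inside the stripe loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct btrfs_device. */
struct demo_dev {
    bool missing;
    bool err_send;   /* flush request could not be submitted */
    bool err_wait;   /* flush request did not complete */
};

/* Hypothetical stand-in for a chunk mapping. */
struct demo_chunk {
    struct demo_dev **stripes;
    size_t num_stripes;
    int max_tolerated;   /* device failures this chunk's profile survives */
};

/* Count the stripes of one chunk that sit on a missing or failing device. */
static int demo_chunk_failures(const struct demo_chunk *c)
{
    int failed = 0;

    for (size_t i = 0; i < c->num_stripes; i++) {
        const struct demo_dev *d = c->stripes[i];

        if (d->missing || d->err_send || d->err_wait)
            failed++;
    }
    return failed;
}

/* Returns 0 if every chunk is still within its tolerance, -1 otherwise. */
static int demo_check_degradable(const struct demo_chunk *chunks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (demo_chunk_failures(&chunks[i]) > chunks[i].max_tolerated)
            return -1;
    return 0;
}

/* Reset the per-device barrier error flags once all chunks were checked. */
static void demo_clear_barrier_errors(struct demo_dev **devs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        devs[i]->err_send = false;
        devs[i]->err_wait = false;
    }
}
```

With a mirrored chunk on two devices plus a single-profile chunk on the second device, a flush error on the first device is tolerated, while the same error on the second device is not, which a single filesystem-wide counter cannot express.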


[PATCH 09/13] btrfs: provide framework to get and put a spare device

This adds functions to get and put a spare device from the list,
so that the hot replace code can pick a spare device when needed.

Signed-off-by: Anand Jain 
Tested-by: Austin S. Hemmelgarn 
---
 fs/btrfs/ctree.h   |  1 +
 fs/btrfs/super.c   |  5 +
 fs/btrfs/volumes.c | 53 +
 fs/btrfs/volumes.h |  2 ++
 4 files changed, 61 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 2c185a8e92f0..aa693cfdc9f0 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -4185,6 +4185,7 @@ void btrfs_sysfs_remove_mounted(struct btrfs_fs_info *fs_info);
 ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size);
 
 /* super.c */
+struct file_system_type *btrfs_get_fs_type(void);
 int btrfs_parse_options(struct btrfs_root *root, char *options,
unsigned long new_flags);
 int btrfs_sync_fs(struct super_block *sb, int wait);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 87639fa53b10..49ba899b2d36 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -69,6 +69,11 @@ static struct file_system_type btrfs_fs_type;
 
 static int btrfs_remount(struct super_block *sb, int *flags, char *data);
 
+struct file_system_type *btrfs_get_fs_type()
+{
+   return &btrfs_fs_type;
+}
+
 const char *btrfs_decode_error(int errno)
 {
char *errstr = "unknown";
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index d729539f9612..072cefac958c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -525,6 +525,59 @@ static void pending_bios_fn(struct btrfs_work *work)
run_scheduled_bios(device);
 }
 
+int btrfs_get_spare_device(char **path)
+{
+   int ret = 1;
+   struct btrfs_fs_devices *fs_devices;
+   struct btrfs_device *device;
+   struct list_head *fs_uuids = btrfs_get_fs_uuids();
+
+   mutex_lock(&uuid_mutex);
+   list_for_each_entry(fs_devices, fs_uuids, list) {
+   if (!fs_devices->spare)
+   continue;
+
+   /* as of now there is only one device in the spare fs_devices */
+   device = list_entry(fs_devices->devices.next,
+   struct btrfs_device, dev_list);
+
+   if (!device || !device->name)
+   continue;
+
+   fs_devices->spare = 0;
+   /*
+* Its under uuid_mutex and there is one spare per fsid
+* so rcu lock is actually not required
+*/
+   *path = kstrdup(device->name->str, GFP_KERNEL);
+   if (*path)
+   ret = 0;
+   else
+   ret = -ENOMEM;
+   break;
+   }
+
+   if (!ret) {
+   btrfs_sysfs_remove_fsid(fs_devices);
+   list_del(&fs_devices->list);
+   free_fs_devices(fs_devices);
+   }
+   mutex_unlock(&uuid_mutex);
+
+   return ret;
+}
+
+void btrfs_put_spare_device(char *path)
+{
+   struct file_system_type *btrfs_fs_type;
+   struct btrfs_fs_devices *fs_devices;
+
+   btrfs_fs_type = btrfs_get_fs_type();
+
+   if (btrfs_scan_one_device(path, FMODE_READ,
+   btrfs_fs_type, &fs_devices))
+   printk(KERN_INFO "failed to return spare device\n");
+}
 
 void btrfs_free_stale_device(struct btrfs_device *cur_dev)
 {
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 51cf716eb35b..b4308afa3097 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -469,6 +469,8 @@ int btrfs_init_new_device(struct btrfs_root *root, char *path);
 int btrfs_init_dev_replace_tgtdev(struct btrfs_root *root, char *device_path,
  struct btrfs_device *srcdev,
  struct btrfs_device **device_out);
+int btrfs_get_spare_device(char **path);
+void btrfs_put_spare_device(char *path);
 int btrfs_balance(struct btrfs_balance_control *bctl,
  struct btrfs_ioctl_balance_args *bargs);
 int btrfs_resume_balance_async(struct btrfs_fs_info *fs_info);
-- 
2.7.0
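
The get/put contract can be sketched in plain C as below. This is an illustration under assumptions, not the kernel code: the pool is a fixed array instead of the fs_uuids list, there is no locking, and every demo_ name is hypothetical. What it keeps from the patch is the return convention of the getter (0 when a spare was handed out, 1 when none is available) and the fact that the caller owns the returned path string.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_SPARES 4

/* Hypothetical spare pool: a NULL entry marks an empty slot. */
struct demo_spare_pool {
    char *paths[DEMO_MAX_SPARES];
};

/* Portable replacement for strdup(), which is not part of ISO C. */
static char *demo_strdup(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Register (or return) a device path as spare: 0, -ENOMEM or -ENOSPC. */
static int demo_put_spare(struct demo_spare_pool *pool, const char *path)
{
    for (int i = 0; i < DEMO_MAX_SPARES; i++) {
        if (!pool->paths[i]) {
            pool->paths[i] = demo_strdup(path);
            return pool->paths[i] ? 0 : -ENOMEM;
        }
    }
    return -ENOSPC;
}

/*
 * Take one spare out of the pool.
 * Returns 0 and stores the path in *path (the caller frees it),
 * or 1 if no spare is available, in which case *path is left untouched.
 */
static int demo_get_spare(struct demo_spare_pool *pool, char **path)
{
    for (int i = 0; i < DEMO_MAX_SPARES; i++) {
        if (pool->paths[i]) {
            *path = pool->paths[i];
            pool->paths[i] = NULL;
            return 0;
        }
    }
    return 1;
}
```

A caller that fails to start the replace would hand the path back with demo_put_spare() and then free its own copy, which is the shape of the error path in btrfs_auto_replace_start() quoted later in this thread.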


[PATCH 13/13] btrfs: check for failed device and hot replace

This patch checks for a failed device and kicks off auto
replace. If the user decides to disable auto replace, that
can be done through a future sysfs or ioctl interface that
sets the fs_info->no_auto_replace parameter to 1.

Signed-off-by: Anand Jain 
Tested-by: Austin S. Hemmelgarn 
---
 fs/btrfs/ctree.h   |  2 ++
 fs/btrfs/disk-io.c | 34 ++
 2 files changed, 36 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 47e9cd9dd29a..67bb36bb82ee 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1862,6 +1862,8 @@ struct btrfs_fs_info {
struct list_head pinned_chunks;
 
int creating_free_space_tree;
+
+   int no_auto_replace;
 };
 
 struct btrfs_subvolume_writers {
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b523e56b34e9..f205e7e94948 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1869,6 +1869,38 @@ sleep:
return 0;
 }
 
+static int btrfs_recuperate(struct btrfs_root *root)
+{
+   int ret = 0;
+   int found = 0;
+   struct btrfs_device *device;
+   struct btrfs_fs_devices *fs_devices;
+
+   fs_devices = root->fs_info->fs_devices;
+
+   mutex_lock(&fs_devices->device_list_mutex);
+   rcu_read_lock();
+   list_for_each_entry_rcu(device,
+   &fs_devices->devices, dev_list) {
+   if (device->failed) {
+   found = 1;
+   break;
+   }
+   }
+   rcu_read_unlock();
+   mutex_unlock(&fs_devices->device_list_mutex);
+
+   /*
+* We are using the replace code which should be interrupt-able
+* during unmount, and as of now there is no user land stop
+* request that we support and this will run until its complete
+*/
+   if (found && !root->fs_info->no_auto_replace)
+   ret = btrfs_auto_replace_start(root, device);
+
+   return ret;
+}
+
 /*
  * returns:
  * < 0 : Check didn't run, std error
@@ -1944,6 +1976,8 @@ static int health_kthread(void *arg)
/* Check devices health */
btrfs_update_devices_health(root);
 
+   btrfs_recuperate(root);
+
mutex_unlock(&root->fs_info->health_mutex);
 
 sleep:
-- 
2.7.0
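
The health check in this series drains the per-device critical error counter by reading it and then subtracting the value that was read, so that errors recorded between the two calls stay in the counter for the next pass. A userspace sketch of that pattern with C11 atomics, next to the single-step exchange variant, might look like this. The demo_ function names are hypothetical and the counter is a plain atomic_int rather than the kernel's atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Read-then-subtract drain: returns the number of errors seen by this
 * pass. Errors added by another thread between the load and the
 * subtraction are not lost, they remain in the counter.
 */
static int demo_drain_read_then_sub(atomic_int *counter)
{
    int seen = atomic_load(counter);

    atomic_fetch_sub(counter, seen);
    return seen;
}

/*
 * Exchange drain: fetches the current value and resets the counter to
 * zero in one atomic step, so there is no window between read and reset.
 */
static int demo_drain_exchange(atomic_int *counter)
{
    return atomic_exchange(counter, 0);
}
```

Both variants avoid the lost-update problem of a plain "read, then store zero" sequence. Which one fits better is a design choice for the patch author; the sketch only shows that the two are equivalent when nothing races with them.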


Re: Global hotspare functionality




On 03/31/2016 06:17 AM, Yauhen Kharuzhy wrote:

On Tue, Mar 29, 2016 at 10:40:40PM +0300, Yauhen Kharuzhy wrote:

Hi.

I am now testing hotspare v2 on kernel v4.4.5 with lockdep debugging enabled
(I will try the latest tree from Chris later). At the start of replacement, a
lockdep warning is displayed, because kstrdup() is called with GFP_NOFS inside
an rcu_read_lock()/rcu_read_unlock() block (a GFP_NOFS allocation can sleep).


Similar thing in the btrfs_auto_replace_start(): rcu_str_deref() without
rcu_read_lock():

int btrfs_auto_replace_start(struct btrfs_root *root,
 struct btrfs_device *src_device)
{
 int ret;
 char *tgt_path;

 if (btrfs_get_spare_device(&tgt_path)) {
 btrfs_err(root->fs_info,
 "No spare device found/configured in the kernel");
 return -EINVAL;
 }

 ret = btrfs_dev_replace_start(root, tgt_path,
 src_device->devid,
 rcu_str_deref(src_device->name),


This is fixed in V3.

Thanks, Anand



 BTRFS_IOCTL_DEV_REPLACE_CONT_READING_FROM_SRCDEV_MODE_AVOID);
 if (ret)
 btrfs_put_spare_device(tgt_path);

 kfree(tgt_path);

 return 0;
}

[  156.168133] ===
[  156.168963] [ INFO: suspicious RCU usage. ]
[  156.169822] 4.4.5-scst31x+ #20 Not tainted
[  156.170656] ---
[  156.171488] fs/btrfs/dev-replace.c:990 suspicious rcu_dereference_check() usage!
[  156.172920]
[  156.172920] other info that might help us debug this:
[  156.172920]
[  156.174825]
[  156.174825] rcu_scheduler_active = 1, debug_locks = 0
[  156.176152] 1 lock held by btrfs-casualty/4807:
[  156.181917]  #0:  (&fs_info->casualty_mutex){+.+...}, at: [] casualty_kthread+0x64/0x390 [btrfs]
[  156.193511]
[  156.193511] stack backtrace:
[  156.194680] CPU: 0 PID: 4807 Comm: btrfs-casualty Not tainted 4.4.5-scst31x+ #20
[  156.201650] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  156.219100]   88005d79fda0 813529e3 
88005e19c600
[  156.221216]  0001 88005d79fdd0 810d6407 

[  156.224287]   88005f4a0c00 88005da36000 
88005d79fe08
[  156.226375] Call Trace:
[  156.227078]  [] dump_stack+0x85/0xc2
[  156.228152]  [] lockdep_rcu_suspicious+0xd7/0x110
[  156.229418]  [] btrfs_auto_replace_start+0xa6/0xd0 [btrfs]
[  156.230714]  [] casualty_kthread+0x2c4/0x390 [btrfs]
[  156.231915]  [] ? casualty_kthread+0x19c/0x390 [btrfs]
[  156.233105]  [] ? btrfs_check_devices+0x200/0x200 [btrfs]
[  156.234339]  [] kthread+0xef/0x110
[  156.235309]  [] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[  156.236940]  [] ? kthread_create_on_node+0x200/0x200
[  156.239489]  [] ret_from_fork+0x3f/0x70
[  156.240533]  [] ? kthread_create_on_node+0x200/0x200
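
The warning above comes from dereferencing an RCU-protected string outside a read-side critical section. A toy userspace model of that check, together with the usual remedy of copying the name into a caller-owned buffer while the read lock is held, is sketched below. Everything in it is hypothetical: the demo_ functions only count lock depth the way a debugging check would, they provide no real RCU semantics, and this is not necessarily how V3 fixes the problem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Toy bookkeeping standing in for lockdep's view of the read-side lock. */
static int demo_rcu_depth;
static int demo_rcu_warnings;

static void demo_rcu_read_lock(void)
{
    demo_rcu_depth++;
}

static void demo_rcu_read_unlock(void)
{
    demo_rcu_depth--;
}

/* Dereference a protected string; count a warning if no read lock is held. */
static const char *demo_rcu_str_deref(const char *const *slot)
{
    if (demo_rcu_depth == 0)
        demo_rcu_warnings++;   /* "suspicious rcu_dereference_check() usage" */
    return *slot;
}

/*
 * Remedy: copy the name into a caller-provided buffer inside the
 * read-side section. Nothing in here allocates or sleeps, and the caller
 * can keep using the copy after the lock has been dropped.
 */
static void demo_copy_name(const char *const *slot, char *buf, size_t len)
{
    demo_rcu_read_lock();
    snprintf(buf, len, "%s", demo_rcu_str_deref(slot));
    demo_rcu_read_unlock();
}
```

Copying into a stack buffer also sidesteps the other problem reported in this thread, a sleeping allocation inside rcu_read_lock().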




Re: Global hotspare functionality




On 03/30/2016 03:47 AM, Yauhen Kharuzhy wrote:

On Tue, Mar 29, 2016 at 10:41:36PM +0800, Anand Jain wrote:


Hi Yauhen,





Issue 2.
At start of autoreplacing drive by hotspare, kernel crashes in transaction
handling code (inside of btrfs_commit_transaction() called by autoreplace
initiating routines). I 'fixed' this by removing of closing of bdev in
btrfs_close_one_device_dont_free(), see
https://bitbucket.org/jekhor/linux-btrfs/commits/dfa441c9ec7b3833f6a5e4d0b6f8c678faea29bb?at=master
(oops text is attached also). Bdev is closed after replacing by
btrfs_dev_replace_finishing(), so this is safe but doesn't seem
to be right way.


  I have sent out V2. I don't see that issue with this,
  could you pls try ?


Yes, it reproduced on v4.4.5 kernel. I will try with current
'for-linus-4.6' Chris' tree soon.

To emulate a drive failure, I disconnect the drive in VirtualBox, so bdev
can be freed by kernel after releasing of all references to it.


  So far the raid group profile would adapt to lower suitable
  group profile when device is missing/failed. This appears to
  be not happening with RAID56 OR there are stale IO which wasn't
  flushed out. Anyway to have this fixed I am moving the patch
   btrfs: introduce device dynamic state transition to offline or failed
  to the top in v3 for any potential changes.
  But firstly we need a reliable test case, or a very carefully
  crafted test case which can create this situation

  Here below is the dm-error that I am using for testing, which
  apparently doesn't report this issue. Could you please try on V3. ?
  (pls note the device names are hard coded in the test script
  sorry about that) This would eventually be fstests script.



# cat util
run()
{
local ret

echo -- ${*} --
echo ${*} | bash
ret=$?
if [ $ret -ne 0 ]; then
echo
echo "## FAILED: RET $ret #"
echo
exit
fi
echo
#echo "OK?"; read
}

runnt()
{
local ret

echo -- ${*} --
echo ${*} | bash
ret=$?
echo
#echo "OK?"; read
}

wipeall()
{
runnt "wipefs -a /dev/sd[c-h] > /dev/null"
}

create_err_dev_raid1()
{
dm_backing_dev="/dev/sdd"
blk_dev_size=`blockdev --getsz $dm_backing_dev`
dmerror_dev="/dev/mapper/dm-sdd"
dmlinear_table="0 $blk_dev_size linear $dm_backing_dev 0"
dmerror_table="0 $blk_dev_size error $dm_backing_dev 0"

echo -e dm_backing_dev'\t'= $dm_backing_dev
echo -e blk_dev_size'\t'= $blk_dev_size
echo -e dmerror_dev'\t'= $dmerror_dev
echo -e dmlinear_table'\t'= $dmlinear_table
echo -e dmerror_table'\t'= $dmerror_table
echo

runnt "dmsetup remove dm-sdd > /dev/null 2>&1"
run "dmsetup create dm-sdd --table '${dmlinear_table}'"

run "mkfs.btrfs -f -draid1 -mraid1 /dev/sdc $dmerror_dev > /dev/null 2>&1"
run mount /dev/sdc /btrfs
run "fillfs /btrfs 1000 > /dev/null 2>&1"
run "dd if=/dev/zero of=/btrfs/tf1 bs=4096 count=100 > /dev/null 2>&1"

run btrfs fi show

#   run sleep 32

run dmsetup suspend dm-sdd
run "dmsetup load dm-sdd --table '$dmerror_table'"
run dmsetup resume dm-sdd
run "dd if=/dev/zero of=/btrfs/tf1 bs=4096 count=100 > /dev/null 2>&1"

run btrfs fi show
}

create_err_dev_raid56()
{
dm_backing_dev="/dev/sdd"
blk_dev_size=`blockdev --getsz $dm_backing_dev`
dmerror_dev="/dev/mapper/dm-sdd"
dmlinear_table="0 $blk_dev_size linear $dm_backing_dev 0"
dmerror_table="0 $blk_dev_size error $dm_backing_dev 0"

echo -e dm_backing_dev'\t'= $dm_backing_dev
echo -e blk_dev_size'\t'= $blk_dev_size
echo -e dmerror_dev'\t'= $dmerror_dev
echo -e dmlinear_table'\t'= $dmlinear_table
echo -e dmerror_table'\t'= $dmerror_table
echo

runnt "dmsetup remove dm-sdd > /dev/null 2>&1"
run "dmsetup create dm-sdd --table '${dmlinear_table}'"

run "mkfs.btrfs -f -draid5 -mraid5 /dev/sdc /dev/sdf $dmerror_dev > /dev/null 2>&1"

run mount /dev/sdc /btrfs
run "fillfs /btrfs 1000 > /dev/null 2>&1"
run "dd if=/dev/zero of=/btrfs/tf1 bs=4096 count=100 > /dev/null 2>&1"

run btrfs fi show

#   run sleep 32

run dmsetup suspend dm-sdd
run "dmsetup load dm-sdd --table '$dmerror_table'"
run dmsetup resume dm-sdd
run "dd if=/dev/zero of=/btrfs/tf1 bs=4096 count=100 > /dev/null 2>&1"

run btrfs fi show
}

# cat auto-replace-test56
source $(dirname $0)/util

wipeall

run btrfs spare add /dev/sde

#run cat /proc/fs/btrfs/devlist

create_err_dev_raid56
--


Thanks, Anand




[ 1464.232552] BTRFS info (device sdc): dev_replace from  (devid 4) to /dev/sdg started
[ 1464.255824] BUG: unable to handle kernel NULL pointer dereference at 0548
[ 1464.291760]

Re: [PATCH 12/12] btrfs: check device for critical errors and mark failed




On 03/30/2016 08:49 AM, Yauhen Kharuzhy wrote:

On Tue, Mar 29, 2016 at 10:22:29PM +0800, Anand Jain wrote:

Write and Flush errors are considered as critical errors,
upon which the device will be brought offline and marked as
failed. Write and Flush errors are identified using device
error statistics.

Signed-off-by: Anand Jain 

btrfs: check for failed device and hot replace

This patch creates casualty_kthread to check for the failed
devices, and triggers device replace.

Signed-off-by: Anand Jain 
---
  fs/btrfs/ctree.h   |   2 +
  fs/btrfs/disk-io.c | 161 -
  fs/btrfs/disk-io.h |   2 +
  fs/btrfs/volumes.c |   1 +
  fs/btrfs/volumes.h |   4 ++
  5 files changed, 169 insertions(+), 1 deletion(-)


btrfs_check_and_handle_casualty() tries to perform auto-replacement
only once after each failure. If no hotspare was added to the system before
the failure, the only remaining way to replace the drive is to do it manually.
This sounds reasonable, so just a clarification: are you sure that we shouldn't
start autoreplacement if a hotspare is added after the drive failure?

V1 of the patchset tried to perform autoreplace endlessly until a replacement
drive was added.


Yeah, I made that change on purpose, but in V3 I have reverted it, so
that the code is more flexible and the design is easier to control and change.

Thanks, Anand
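
The two behaviours discussed here (try the replace once per failure, or retry on every pass until a spare shows up) can be contrasted with a small state sketch. The enum, struct and function names are hypothetical and the model ignores everything except spare availability. It is only meant to make the question above concrete, not to describe what any version of the patchset actually does internally.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_policy {
    DEMO_TRY_ONCE,   /* give up after the first attempt */
    DEMO_RETRY       /* try again on every health pass */
};

struct demo_state {
    bool device_failed;
    bool attempted;    /* an auto-replace attempt was already made */
    bool replaced;     /* a replace was started successfully */
};

/*
 * One pass of a (hypothetical) health thread.
 * Returns true if this pass started a replace.
 */
static bool demo_health_pass(struct demo_state *s, enum demo_policy policy,
                             bool spare_available)
{
    if (!s->device_failed || s->replaced)
        return false;
    if (policy == DEMO_TRY_ONCE && s->attempted)
        return false;

    s->attempted = true;
    if (!spare_available)
        return false;

    s->replaced = true;
    return true;
}
```

Under DEMO_TRY_ONCE a spare added after the failure is never picked up, which is exactly the case asked about; under DEMO_RETRY the next pass after the spare is added starts the replace.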



Re: [PATCH 12/12] btrfs: check device for critical errors and mark failed




On 03/30/2016 06:41 AM, Yauhen Kharuzhy wrote:

On Tue, Mar 29, 2016 at 10:22:29PM +0800, Anand Jain wrote:

Write and Flush errors are considered as critical errors,
upon which the device will be brought offline and marked as
failed. Write and Flush errors are identified using device
error statistics.

Signed-off-by: Anand Jain 

btrfs: check for failed device and hot replace

This patch creates casualty_kthread to check for the failed
devices, and triggers device replace.

Signed-off-by: Anand Jain 
---
  fs/btrfs/ctree.h   |   2 +
  fs/btrfs/disk-io.c | 161 -
  fs/btrfs/disk-io.h |   2 +
  fs/btrfs/volumes.c |   1 +
  fs/btrfs/volumes.h |   4 ++
  5 files changed, 169 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 2c185a8e92f0..36f1c29e00a0 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1569,6 +1569,7 @@ struct btrfs_fs_info {
struct mutex tree_log_mutex;
struct mutex transaction_kthread_mutex;
struct mutex cleaner_mutex;
+   struct mutex casualty_mutex;
struct mutex chunk_mutex;
struct mutex volume_mutex;

@@ -1686,6 +1687,7 @@ struct btrfs_fs_info {
struct btrfs_workqueue *extent_workers;
struct task_struct *transaction_kthread;
struct task_struct *cleaner_kthread;
+   struct task_struct *casualty_kthread;
int thread_pool_size;

struct kobject *space_info_kobj;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b99329e37965..650e26e0acda 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1869,6 +1869,153 @@ sleep:
return 0;
  }

+static int btrfs_check_and_handle_casualty(void *arg)
+{
+   int ret;
+   int found = 0;
+   struct btrfs_device *device;
+   struct btrfs_root *root = arg;
+   struct btrfs_fs_info *fs_info = root->fs_info;
+   struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
+
+   btrfs_dev_replace_lock(&fs_info->dev_replace, 0);
+   if (btrfs_dev_replace_is_ongoing(&fs_info->dev_replace)) {
+   btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
+   return -EBUSY;
+   }
+   btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
+
+   ret = btrfs_check_devices(fs_devices);
+   if (ret == 1) {
+   /*
+* There were some casualties, and if its beyond a
+* chunk group can tolerate, then FS will already
+* be in readonly, so check that. And that's best
+* btrfs could do as of now and no replace will help.
+*/
+   if (fs_info->sb->s_flags & MS_RDONLY)
+   return -EROFS;
+
+   mutex_lock(&fs_devices->device_list_mutex);
+   rcu_read_lock();
+   list_for_each_entry_rcu(device,
+   &fs_devices->devices, dev_list) {
+   if (device->failed) {
+   found = 1;
+   break;
+   }
+   }
+   rcu_read_unlock();
+   mutex_unlock(&fs_devices->device_list_mutex);
+   }
+
+   /*
+* We are using the replace code which should be interrupt-able
+* during unmount, and as of now there is no user land stop
+* request that we support and this will run until its complete
+*/
+   if (found)
+   ret = btrfs_auto_replace_start(root, device);
+
+   return ret;
+}
+
+/*
+ * A kthread to check if any auto maintenance be required. This is
+ * multithread safe, and kthread is running only if
+ * fs_info->casualty_kthread is not NULL, fixme: atomic ?
+ */
+static int casualty_kthread(void *arg)
+{
+   int ret;
+   int again;
+   struct btrfs_root *root = arg;
+
+   do {
+   again = 0;
+
+   if (btrfs_need_cleaner_sleep(root))
+   goto sleep;
+
+   if (!mutex_trylock(&root->fs_info->casualty_mutex))
+   goto sleep;
+
+   if (btrfs_need_cleaner_sleep(root)) {
+   mutex_unlock(&root->fs_info->casualty_mutex);
+   goto sleep;
+   }
+
+   ret = btrfs_check_and_handle_casualty(arg);
+   if (ret == -EROFS) {
+   /*
+* When checking and fixing the devices, the
+* FS may be marked as RO in some situations.
+* And on ROFS casualty thread has no work.
+* So optimize here, to stop this thread until
+* FS is back to RW.
+*/
+   }
+   mutex_unlock(&root->fs_info->casualty_mutex);
+
+sleep:
+   if (!try_to_freeze() && !again) {


This block was copy-pasted from the cleaner_kthread(). 'again' variable

Re: Another ENOSPC situation

2016-04-01 Thread Henk Slager

On Fri, Apr 1, 2016 at 10:40 PM, Marc Haber  wrote:
> On Fri, Apr 01, 2016 at 09:20:52PM +0200, Henk Slager wrote:
>> On Fri, Apr 1, 2016 at 6:50 PM, Marc Haber wrote:
>> > On Fri, Apr 01, 2016 at 06:30:20PM +0200, Marc Haber wrote:
>> >> On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote:
>> >> > On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber wrote:
>> >> > > btrfs balance -mprofiles seems to do something. one kworker and one
>> >> > > btrfs-transaction process hog one CPU core each for hours, while
>> >> > > blocking the filesystem for minutes apiece, which leads to the host
>> >> > > being nearly unusable up to the point of "clock and mouse pointer
>> >> > > frozen for nearly ten minutes".
>> >> >
>> >> > I assume you still have your every 10 minutes snapshotting running
>> >> > while balancing?
>> >>
>> >> No, I disabled the cronjob before trying the balance. I might be
>> >> crazy, but not stup^wunexperienced.
>> >
>> > That being said, I would still expect the code not to allow _this_
>> > kind of effect on the entire system when two allegedly incompatible
>> > operations run simultaneously. I mean, Linux is a multi-user,
>> > multi-tasking operating system where one simply cannot expect all
>> > processes to be cooperative to each other. We have the operating
>> > systems to prevent this kind of issues, not to cause them.
>>
>> Maybe look at it differently: Does user mh have trouble using this
>> laptop w.r.t. storing files?
>
> No. I would have cried murder otherwise.
>
>> In openSUSE Tumbleweed (the snapshot from end of march), root access
>> is needed to change the default snapshotting config, otherwise you
>> will have a 10 year history. After that change has been done according
>> to needs of the user, there is no need to run manual balance.
>
> So you are saying the balancing a filesystem should never be
> necessary? Or what are you trying to say?

There is a package, btrfsmaintenance, which does balancing for the
user after it is configured by root according to the user's wishes and
needs.

The key thing I want to say is that you should change your snapshotting
rate and/or policy. It has been hinted at before, and I think it is more a
psychological issue than a technical one.

[GIT PULL] Btrfs

2016-04-01 Thread Chris Mason

Hi Linus,

My for-linus-4.6 branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus-4.6

Has a few fixes Dave Sterba had queued up.  These are all pretty small,
but since they were tested I decided against waiting for more:

Alex Lyakas (2) commits (+18/-10):
btrfs: do not write corrupted metadata blocks to disk (+13/-2)
btrfs: csum_tree_block: return proper errno value (+5/-8)

Jiri Kosina (2) commits (+7/-10):
btrfs: cleaner_kthread() doesn't need explicit freeze (+1/-1)
btrfs: transaction_kthread() is not freezable (+6/-9)

Total: (4) commits (+25/-20)

 fs/btrfs/disk-io.c | 45 +
 1 file changed, 25 insertions(+), 20 deletions(-)

RE: Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1)

2016-04-01 Thread James Johnston

> I grabbed this part from the log after the machine crashed again
> following trying to transfer a bunch of files that included ones with
> csum errors, let me know if this looks like the same issue you were
> having:
> 

Idk?  You hit a soft lockup, mine got a "kernel BUG at..."

Your stack trace diverges from mine after bio_endio.

James 




Re: Another ENOSPC situation

On Fri, Apr 01, 2016 at 09:20:52PM +0200, Henk Slager wrote:
> On Fri, Apr 1, 2016 at 6:50 PM, Marc Haber wrote:
> > On Fri, Apr 01, 2016 at 06:30:20PM +0200, Marc Haber wrote:
> >> On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote:
> >> > On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber wrote:
> >> > > btrfs balance -mprofiles seems to do something. one kworker and one
> >> > > btrfs-transaction process hog one CPU core each for hours, while
> >> > > blocking the filesystem for minutes apiece, which leads to the host
> >> > > being nearly unusable up to the point of "clock and mouse pointer
> >> > > frozen for nearly ten minutes".
> >> >
> >> > I assume you still have your every 10 minutes snapshotting running
> >> > while balancing?
> >>
> >> No, I disabled the cronjob before trying the balance. I might be
> >> crazy, but not stup^wunexperienced.
> >
> > That being said, I would still expect the code not to allow _this_
> > kind of effect on the entire system when two allegedly incompatible
> > operations run simultaneously. I mean, Linux is a multi-user,
> > multi-tasking operating system where one simply cannot expect all
> > processes to be cooperative to each other. We have the operating
> > systems to prevent this kind of issues, not to cause them.
> 
> Maybe look at it differently: Does user mh have trouble using this
> laptop w.r.t. storing files?

No. I would have cried murder otherwise.

> In openSUSE Tumbleweed (the snapshot from end of march), root access
> is needed to change the default snapshotting config, otherwise you
> will have a 10 year history. After that change has been done according
> to needs of the user, there is no need to run manual balance.

So you are saying the balancing a filesystem should never be
necessary? Or what are you trying to say?

Greetings
Marc

-- 
-
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

[PATCH 0/8] btrfs: uapi migration for user-visible API components

commit 55e301fd57a (Btrfs: move fs/btrfs/ioctl.h to
include/uapi/linux/btrfs.h) was intended to make the ioctl definitions
available to userspace.  Unfortunately, moving just that file wasn't
enough and many of the ioctls aren't actually usable without the
userspace programmer filling in the gaps.  Specifically, for the routine
ioctls like BTRFS_IOC_SET_FSLABEL, BTRFS_LABEL_SIZE wasn't defined so the
ioctl definition would be incomplete.  We were also missing
the argument structure for defrag.  Beyond that, many of the ioctl
structures have a flags field that may or may not be independent of
the btrfs internals.  Lastly, the SEARCH_TREE ioctl exposes all of the
internal items of the tree to userspace programmers so the item
structures should be exposed so that they can be parsed properly.

So, to make all this more convenient for consumers of these APIs, I've
moved the flags used by the ioctl structures into btrfs.h and
moved the item definitions, key IDs, tree root objectids, and other
well-known objectids into a new btrfs_tree.h.  ctree.h includes this
new header directly, so there aren't any changes to .c files at all.

The only part of this set that isn't just a direct cut-and-paste is
the last one which converts u8 and u64 values to __u8 and __u64 since
the former aren't exported via include/uapi.

The goal is that everything required to use the btrfs ioctls for a
particular kernel release should be made available by exporting the uapi
headers for that release.

I intend to use these for the strace ioctl decoding patch I've been
working on so that I don't need to duplicate all of the definitions in the
code I send upstream as the final version of the patch.  Prior to this
patchset, I had to duplicate nearly 100 defines and several structures --
and that's without doing any item decoding at all.

I do expect there might be some discussion here. :)

-Jeff

Jeff Mahoney (8):
  btrfs: uapi/linux/btrfs.h migration, move BTRFS_LABEL_SIZE
  btrfs: uapi/linux/btrfs.h migration, qgroup limit flags
  btrfs: uapi/linux/btrfs.h migration, document subvol flags
  btrfs: uapi/linux/btrfs.h migration, move feature flags
  btrfs: uapi/linux/btrfs.h migration, move balance flags
  btrfs: uapi/linux/btrfs.h migration, move struct btrfs_ioctl_defrag_range_args
  btrfs: uapi/linux/btrfs_tree.h migration, item types and defines
  btrfs: uapi/linux/btrfs_tree.h, use __u8 and __u64

 fs/btrfs/ctree.h                | 1014 +--
 fs/btrfs/volumes.h              |   46 --
 include/uapi/linux/btrfs.h      |  173 ++-
 include/uapi/linux/btrfs_tree.h |  966 +
 4 files changed, 1135 insertions(+), 1064 deletions(-)
 create mode 100644 include/uapi/linux/btrfs_tree.h

-- 
2.7.1


[PATCH 4/8] btrfs: uapi/linux/btrfs.h migration, move feature flags

The compat/compat_ro/incompat feature flags are used by the feature set/get
ioctls.

Signed-off-by: Jeff Mahoney 
---
 fs/btrfs/ctree.h   | 25 -
 include/uapi/linux/btrfs.h | 31 +++
 2 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index c228b39..378482c 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -506,31 +506,6 @@ struct btrfs_super_block {
  * Compat flags that we support.  If any incompat flags are set other than the
  * ones specified below then we will fail to mount
  */
-#define BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE (1ULL << 0)
-
-#define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF   (1ULL << 0)
-#define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL  (1ULL << 1)
-#define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS    (1ULL << 2)
-#define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO    (1ULL << 3)
-/*
- * some patches floated around with a second compression method
- * lets save that incompat here for when they do get in
- * Note we don't actually support it, we're just reserving the
- * number
- */
-#define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZOv2  (1ULL << 4)
-
-/*
- * older kernels tried to do bigger metadata blocks, but the
- * code was pretty buggy.  Lets not let them try anymore.
- */
-#define BTRFS_FEATURE_INCOMPAT_BIG_METADATA    (1ULL << 5)
-
-#define BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF   (1ULL << 6)
-#define BTRFS_FEATURE_INCOMPAT_RAID56          (1ULL << 7)
-#define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA (1ULL << 8)
-#define BTRFS_FEATURE_INCOMPAT_NO_HOLES        (1ULL << 9)
-
 #define BTRFS_FEATURE_COMPAT_SUPP              0ULL
 #define BTRFS_FEATURE_COMPAT_SAFE_SET          0ULL
 #define BTRFS_FEATURE_COMPAT_SAFE_CLEAR        0ULL
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 0316e23..de98717 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -222,6 +222,37 @@ struct btrfs_ioctl_fs_info_args {
__u64 reserved[122];/* pad to 1k */
 };
 
+/*
+ * feature flags
+ *
+ * Used by:
+ * struct btrfs_ioctl_feature_flags
+ */
+#define BTRFS_FEATURE_COMPAT_RO_FREE_SPACE_TREE (1ULL << 0)
+
+#define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF   (1ULL << 0)
+#define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL  (1ULL << 1)
+#define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS    (1ULL << 2)
+#define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO    (1ULL << 3)
+/*
+ * some patches floated around with a second compression method
+ * lets save that incompat here for when they do get in
+ * Note we don't actually support it, we're just reserving the
+ * number
+ */
+#define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZOv2  (1ULL << 4)
+
+/*
+ * older kernels tried to do bigger metadata blocks, but the
+ * code was pretty buggy.  Lets not let them try anymore.
+ */
+#define BTRFS_FEATURE_INCOMPAT_BIG_METADATA    (1ULL << 5)
+
+#define BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF   (1ULL << 6)
+#define BTRFS_FEATURE_INCOMPAT_RAID56          (1ULL << 7)
+#define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA (1ULL << 8)
+#define BTRFS_FEATURE_INCOMPAT_NO_HOLES        (1ULL << 9)
+
 struct btrfs_ioctl_feature_flags {
__u64 compat_flags;
__u64 compat_ro_flags;
-- 
2.7.1
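
For illustration only (not part of the patch): a minimal sketch of how a
userspace consumer, such as an ioctl decoder, might turn the incompat
bits from struct btrfs_ioctl_feature_flags into names once the defines
are exported. The flag values are copied from the diff above; the lookup
table and decode_incompat() are hypothetical helpers, and real code would
include <linux/btrfs.h> rather than repeating the defines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Values copied from the patch; real code would include <linux/btrfs.h>. */
#define BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF	(1ULL << 0)
#define BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL	(1ULL << 1)
#define BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS	(1ULL << 2)
#define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO	(1ULL << 3)
#define BTRFS_FEATURE_INCOMPAT_COMPRESS_LZOv2	(1ULL << 4)
#define BTRFS_FEATURE_INCOMPAT_BIG_METADATA	(1ULL << 5)
#define BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF	(1ULL << 6)
#define BTRFS_FEATURE_INCOMPAT_RAID56		(1ULL << 7)
#define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA	(1ULL << 8)
#define BTRFS_FEATURE_INCOMPAT_NO_HOLES		(1ULL << 9)

/* Hypothetical lookup table, not part of the patch. */
static const struct {
	uint64_t bit;
	const char *name;
} incompat_names[] = {
	{ BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF,   "MIXED_BACKREF" },
	{ BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL,  "DEFAULT_SUBVOL" },
	{ BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS,    "MIXED_GROUPS" },
	{ BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO,    "COMPRESS_LZO" },
	{ BTRFS_FEATURE_INCOMPAT_COMPRESS_LZOv2,  "COMPRESS_LZOv2" },
	{ BTRFS_FEATURE_INCOMPAT_BIG_METADATA,    "BIG_METADATA" },
	{ BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF,   "EXTENDED_IREF" },
	{ BTRFS_FEATURE_INCOMPAT_RAID56,          "RAID56" },
	{ BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA, "SKINNY_METADATA" },
	{ BTRFS_FEATURE_INCOMPAT_NO_HOLES,        "NO_HOLES" },
};

/*
 * Write a '|'-separated list of the known incompat flag names into buf.
 * Returns the bits that were not printed (unknown bits, or known bits
 * that no longer fit into buf).
 */
static uint64_t decode_incompat(uint64_t flags, char *buf, size_t len)
{
	size_t i, used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(incompat_names) / sizeof(incompat_names[0]); i++) {
		int n;

		if (!(flags & incompat_names[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", incompat_names[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial name */
			break;
		}
		used += (size_t)n;
		flags &= ~incompat_names[i].bit;
	}
	return flags;
}
```

A decoder would print buf and then append any leftover bits in hex, so
that flags added by a newer kernel still show up.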


[PATCH 6/8] btrfs: uapi/linux/btrfs.h migration, move struct btrfs_ioctl_defrag_range_args

struct btrfs_ioctl_defrag_range_args is used by the BTRFS_IOC_DEFRAG_RANGE
ioctl.

Signed-off-by: Jeff Mahoney 
---
 fs/btrfs/ctree.h   | 31 ---
 include/uapi/linux/btrfs.h | 38 +-
 2 files changed, 37 insertions(+), 32 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 378482c..89f36b6 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1992,37 +1992,6 @@ struct btrfs_root {
atomic_t qgroup_meta_rsv;
 };
 
-struct btrfs_ioctl_defrag_range_args {
-       /* start of the defrag operation */
-       __u64 start;
-
-       /* number of bytes to defrag, use (u64)-1 to say all */
-       __u64 len;
-
-       /*
-        * flags for the operation, which can include turning
-        * on compression for this one defrag
-        */
-       __u64 flags;
-
-       /*
-        * any extent bigger than this will be considered
-        * already defragged.  Use 0 to take the kernel default
-        * Use 1 to say every single extent must be rewritten
-        */
-       __u32 extent_thresh;
-
-       /*
-        * which compression method to use if turning on compression
-        * for this defrag operation.  If unspecified, zlib will
-        * be used
-        */
-       __u32 compress_type;
-
-       /* spare for later */
-       __u32 unused[4];
-};
-
 
 /*
  * inode items have the data typically returned from stat and store other
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index abae362..98aff38 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -474,9 +474,45 @@ struct btrfs_ioctl_clone_range_args {
   __u64 dest_offset;
 };
 
-/* flags for the defrag range ioctl */
+/*
+ * flags definition for the defrag range ioctl
+ *
+ * Used by:
+ * struct btrfs_ioctl_defrag_range_args.flags
+ */
 #define BTRFS_DEFRAG_RANGE_COMPRESS 1
 #define BTRFS_DEFRAG_RANGE_START_IO 2
+struct btrfs_ioctl_defrag_range_args {
+       /* start of the defrag operation */
+       __u64 start;
+
+       /* number of bytes to defrag, use (u64)-1 to say all */
+       __u64 len;
+
+       /*
+        * flags for the operation, which can include turning
+        * on compression for this one defrag
+        */
+       __u64 flags;
+
+       /*
+        * any extent bigger than this will be considered
+        * already defragged.  Use 0 to take the kernel default
+        * Use 1 to say every single extent must be rewritten
+        */
+       __u32 extent_thresh;
+
+       /*
+        * which compression method to use if turning on compression
+        * for this defrag operation.  If unspecified, zlib will
+        * be used
+        */
+       __u32 compress_type;
+
+       /* spare for later */
+       __u32 unused[4];
+};
+
 
 #define BTRFS_SAME_DATA_DIFFERS        1
 /* For extent-same ioctl */
-- 
2.7.1
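
For illustration only (not part of the patch): a minimal sketch of
filling the argument structure for a whole-file defrag, following the
field comments in the diff above. The struct is mirrored here with
<stdint.h> types so the sketch compiles on its own; defrag_whole_file()
is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copied from the patch; real code would include <linux/btrfs.h>. */
#define BTRFS_DEFRAG_RANGE_COMPRESS 1
#define BTRFS_DEFRAG_RANGE_START_IO 2

/* Mirror of the uapi struct, with __u64/__u32 spelled as stdint types. */
struct btrfs_ioctl_defrag_range_args {
	uint64_t start;		/* start of the defrag operation */
	uint64_t len;		/* bytes to defrag, (u64)-1 means all */
	uint64_t flags;		/* BTRFS_DEFRAG_RANGE_* */
	uint32_t extent_thresh;	/* 0 takes the kernel default */
	uint32_t compress_type;	/* unspecified (0): zlib is used */
	uint32_t unused[4];	/* spare for later */
};

/* Hypothetical helper: request a defrag of the entire file. */
static void defrag_whole_file(struct btrfs_ioctl_defrag_range_args *args,
			      int compress)
{
	assert(args != NULL);
	memset(args, 0, sizeof(*args));	/* also clears the spare words */
	args->start = 0;
	args->len = (uint64_t)-1;
	args->flags = BTRFS_DEFRAG_RANGE_START_IO;
	if (compress)
		args->flags |= BTRFS_DEFRAG_RANGE_COMPRESS;
}
```

The filled structure would then be handed to the BTRFS_IOC_DEFRAG_RANGE
ioctl on an open file descriptor.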


[PATCH 2/8] btrfs: uapi/linux/btrfs.h migration, qgroup limit flags

The BTRFS_QGROUP_LIMIT_* flags are required to tell the kernel which
fields are valid when using the BTRFS_IOC_QGROUP_LIMIT ioctl.

Signed-off-by: Jeff Mahoney 
---
 fs/btrfs/ctree.h   |  8 
 include/uapi/linux/btrfs.h | 22 +-
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 3beaa24..c228b39 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1154,14 +1154,6 @@ struct btrfs_qgroup_info_item {
__le64 excl_cmpr;
 } __attribute__ ((__packed__));
 
-/* flags definition for qgroup limits */
-#define BTRFS_QGROUP_LIMIT_MAX_RFER    (1ULL << 0)
-#define BTRFS_QGROUP_LIMIT_MAX_EXCL    (1ULL << 1)
-#define BTRFS_QGROUP_LIMIT_RSV_RFER    (1ULL << 2)
-#define BTRFS_QGROUP_LIMIT_RSV_EXCL    (1ULL << 3)
-#define BTRFS_QGROUP_LIMIT_RFER_CMPR   (1ULL << 4)
-#define BTRFS_QGROUP_LIMIT_EXCL_CMPR   (1ULL << 5)
-
 struct btrfs_qgroup_limit_item {
/*
 * only updated when any of the other values change
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 11eee34..9651af3 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -41,7 +41,19 @@ struct btrfs_ioctl_vol_args {
 #define BTRFS_UUID_SIZE 16
 #define BTRFS_UUID_UNPARSED_SIZE   37
 
-#define BTRFS_QGROUP_INHERIT_SET_LIMITS        (1ULL << 0)
+/*
+ * flags definition for qgroup limits
+ *
+ * Used by:
+ * struct btrfs_qgroup_limit.flags
+ * struct btrfs_qgroup_limit_item.flags
+ */
+#define BTRFS_QGROUP_LIMIT_MAX_RFER    (1ULL << 0)
+#define BTRFS_QGROUP_LIMIT_MAX_EXCL    (1ULL << 1)
+#define BTRFS_QGROUP_LIMIT_RSV_RFER    (1ULL << 2)
+#define BTRFS_QGROUP_LIMIT_RSV_EXCL    (1ULL << 3)
+#define BTRFS_QGROUP_LIMIT_RFER_CMPR   (1ULL << 4)
+#define BTRFS_QGROUP_LIMIT_EXCL_CMPR   (1ULL << 5)
 
 struct btrfs_qgroup_limit {
__u64   flags;
@@ -51,6 +63,14 @@ struct btrfs_qgroup_limit {
__u64   rsv_excl;
 };
 
+/*
+ * flags definition for qgroup inheritance
+ *
+ * Used by:
+ * struct btrfs_qgroup_inherit.flags
+ */
+#define BTRFS_QGROUP_INHERIT_SET_LIMITS        (1ULL << 0)
+
 struct btrfs_qgroup_inherit {
__u64   flags;
__u64   num_qgroups;
-- 
2.7.1
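
For illustration only (not part of the patch): a minimal sketch of why
the limit flags matter to userspace. The flags word tells the kernel
which fields of struct btrfs_qgroup_limit are valid, so a caller has to
set a bit for every field it fills in. The struct is mirrored with
<stdint.h> types; the diff context only shows its first and last
members, so the middle field names are assumed from the flag names, and
qgroup_limit_init() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copied from the patch; real code would include <linux/btrfs.h>. */
#define BTRFS_QGROUP_LIMIT_MAX_RFER	(1ULL << 0)
#define BTRFS_QGROUP_LIMIT_MAX_EXCL	(1ULL << 1)

/* Mirror of the uapi struct; middle field names are an assumption. */
struct btrfs_qgroup_limit {
	uint64_t flags;
	uint64_t max_rfer;
	uint64_t max_excl;
	uint64_t rsv_rfer;
	uint64_t rsv_excl;
};

/*
 * Hypothetical helper: a NULL pointer means "leave this limit alone",
 * so its flag bit stays clear and the kernel ignores the field.
 */
static void qgroup_limit_init(struct btrfs_qgroup_limit *lim,
			      const uint64_t *max_rfer,
			      const uint64_t *max_excl)
{
	assert(lim != NULL);
	memset(lim, 0, sizeof(*lim));
	if (max_rfer) {
		lim->max_rfer = *max_rfer;
		lim->flags |= BTRFS_QGROUP_LIMIT_MAX_RFER;
	}
	if (max_excl) {
		lim->max_excl = *max_excl;
		lim->flags |= BTRFS_QGROUP_LIMIT_MAX_EXCL;
	}
}
```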


[PATCH 7/8] btrfs: uapi/linux/btrfs_tree.h migration, item types and defines

The BTRFS_IOC_SEARCH_TREE ioctl returns file system items directly
to userspace.  In order to decode them, full type information is required.

Create a new header, btrfs_tree to contain these since most users won't
need them.

Signed-off-by: Jeff Mahoney 
---
 fs/btrfs/ctree.h| 949 +--
 include/uapi/linux/btrfs_tree.h | 966 
 2 files changed, 967 insertions(+), 948 deletions(-)
 create mode 100644 include/uapi/linux/btrfs_tree.h

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 89f36b6..cf34fb5 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -64,98 +65,6 @@ struct btrfs_ordered_sum;
 
 #define BTRFS_COMPAT_EXTENT_TREE_V0
 
-/* holds pointers to all of the tree roots */
-#define BTRFS_ROOT_TREE_OBJECTID 1ULL
-
-/* stores information about which extents are in use, and reference counts */
-#define BTRFS_EXTENT_TREE_OBJECTID 2ULL
-
-/*
- * chunk tree stores translations from logical -> physical block numbering
- * the super block points to the chunk tree
- */
-#define BTRFS_CHUNK_TREE_OBJECTID 3ULL
-
-/*
- * stores information about which areas of a given device are in use.
- * one per device.  The tree of tree roots points to the device tree
- */
-#define BTRFS_DEV_TREE_OBJECTID 4ULL
-
-/* one per subvolume, storing files and directories */
-#define BTRFS_FS_TREE_OBJECTID 5ULL
-
-/* directory objectid inside the root tree */
-#define BTRFS_ROOT_TREE_DIR_OBJECTID 6ULL
-
-/* holds checksums of all the data extents */
-#define BTRFS_CSUM_TREE_OBJECTID 7ULL
-
-/* holds quota configuration and tracking */
-#define BTRFS_QUOTA_TREE_OBJECTID 8ULL
-
-/* for storing items that use the BTRFS_UUID_KEY* types */
-#define BTRFS_UUID_TREE_OBJECTID 9ULL
-
-/* tracks free space in block groups. */
-#define BTRFS_FREE_SPACE_TREE_OBJECTID 10ULL
-
-/* device stats in the device tree */
-#define BTRFS_DEV_STATS_OBJECTID 0ULL
-
-/* for storing balance parameters in the root tree */
-#define BTRFS_BALANCE_OBJECTID -4ULL
-
-/* orhpan objectid for tracking unlinked/truncated files */
-#define BTRFS_ORPHAN_OBJECTID -5ULL
-
-/* does write ahead logging to speed up fsyncs */
-#define BTRFS_TREE_LOG_OBJECTID -6ULL
-#define BTRFS_TREE_LOG_FIXUP_OBJECTID -7ULL
-
-/* for space balancing */
-#define BTRFS_TREE_RELOC_OBJECTID -8ULL
-#define BTRFS_DATA_RELOC_TREE_OBJECTID -9ULL
-
-/*
- * extent checksums all have this objectid
- * this allows them to share the logging tree
- * for fsyncs
- */
-#define BTRFS_EXTENT_CSUM_OBJECTID -10ULL
-
-/* For storing free space cache */
-#define BTRFS_FREE_SPACE_OBJECTID -11ULL
-
-/*
- * The inode number assigned to the special inode for storing
- * free ino cache
- */
-#define BTRFS_FREE_INO_OBJECTID -12ULL
-
-/* dummy objectid represents multiple objectids */
-#define BTRFS_MULTIPLE_OBJECTIDS -255ULL
-
-/*
- * All files have objectids in this range.
- */
-#define BTRFS_FIRST_FREE_OBJECTID 256ULL
-#define BTRFS_LAST_FREE_OBJECTID -256ULL
-#define BTRFS_FIRST_CHUNK_TREE_OBJECTID 256ULL
-
-
-/*
- * the device items go into the chunk tree.  The key is in the form
- * [ 1 BTRFS_DEV_ITEM_KEY device_id ]
- */
-#define BTRFS_DEV_ITEMS_OBJECTID 1ULL
-
-#define BTRFS_BTREE_INODE_OBJECTID 1
-
-#define BTRFS_EMPTY_SUBVOL_DIR_OBJECTID 2
-
-#define BTRFS_DEV_REPLACE_DEVID 0ULL
-
 /*
  * the max metadata block size.  This limit is somewhat artificial,
  * but the memmove costs go through the roof for larger blocks.
@@ -175,12 +84,6 @@ struct btrfs_ordered_sum;
  */
 #define BTRFS_LINK_MAX 65535U
 
-/* 32 bytes in various csum fields */
-#define BTRFS_CSUM_SIZE 32
-
-/* csum types */
-#define BTRFS_CSUM_TYPE_CRC32  0
-
 static const int btrfs_csum_sizes[] = { 4 };
 
 /* four bytes for CRC32 */
@@ -189,17 +92,6 @@ static const int btrfs_csum_sizes[] = { 4 };
 /* spefic to btrfs_map_block(), therefore not in include/linux/blk_types.h */
 #define REQ_GET_READ_MIRRORS   (1 << 30)
 
-#define BTRFS_FT_UNKNOWN   0
-#define BTRFS_FT_REG_FILE  1
-#define BTRFS_FT_DIR       2
-#define BTRFS_FT_CHRDEV    3
-#define BTRFS_FT_BLKDEV    4
-#define BTRFS_FT_FIFO      5
-#define BTRFS_FT_SOCK      6
-#define BTRFS_FT_SYMLINK   7
-#define BTRFS_FT_XATTR     8
-#define BTRFS_FT_MAX       9
-
 /* ioprio of readahead is set to idle */
 #define BTRFS_IOPRIO_READA (IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0))
 
@@ -207,138 +99,10 @@ static const int btrfs_csum_sizes[] = { 4 };
 
 #define BTRFS_MAX_EXTENT_SIZE SZ_128M
 
-/*
- * The key defines the order in the tree, and so it also defines (optimal)
- * block layout.
- *
- * objectid corresponds to the inode number.
- *
- * type tells us things about the object, and is a kind of stream selector.
- * so for a given inode, keys with type of 1 might refer to the inode data,
- * type of 2 may point to file data in the btree and type
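
For illustration only (not part of the patch, which is cut off above in
this archive): a minimal sketch of how the objectid defines behave once
userspace can see them. The negative-looking values such as -256ULL are
unsigned 64-bit constants near the top of the range, so a range check
against them works as expected. The values are copied from the diff;
objectid_is_file_range() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Copied from the patch; real code would include <linux/btrfs_tree.h>. */
#define BTRFS_FS_TREE_OBJECTID 5ULL
#define BTRFS_BALANCE_OBJECTID -4ULL
#define BTRFS_FIRST_FREE_OBJECTID 256ULL
#define BTRFS_LAST_FREE_OBJECTID -256ULL

/*
 * Hypothetical helper: true if objectid falls in the range that the
 * header comment reserves for files ("All files have objectids in
 * this range").
 */
static bool objectid_is_file_range(uint64_t objectid)
{
	return objectid >= BTRFS_FIRST_FREE_OBJECTID &&
	       objectid <= BTRFS_LAST_FREE_OBJECTID;
}
```

Here -256ULL evaluates to 0xFFFFFFFFFFFFFF00, so well-known objectids
such as BTRFS_BALANCE_OBJECTID (-4ULL) sit above the file range and the
tree objectids 1 through 10 sit below it.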

[PATCH 3/8] btrfs: uapi/linux/btrfs.h migration, document subvol flags

Signed-off-by: Jeff Mahoney 
---
 include/uapi/linux/btrfs.h | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index 9651af3..0316e23 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -34,9 +34,6 @@ struct btrfs_ioctl_vol_args {
 
 #define BTRFS_DEVICE_PATH_NAME_MAX 1024
 
-#define BTRFS_SUBVOL_CREATE_ASYNC  (1ULL << 0)
-#define BTRFS_SUBVOL_RDONLY            (1ULL << 1)
-#define BTRFS_SUBVOL_QGROUP_INHERIT    (1ULL << 2)
 #define BTRFS_FSID_SIZE 16
 #define BTRFS_UUID_SIZE 16
 #define BTRFS_UUID_UNPARSED_SIZE   37
@@ -85,6 +82,20 @@ struct btrfs_ioctl_qgroup_limit_args {
struct btrfs_qgroup_limit lim;
 };
 
+/*
+ * flags for subvolumes
+ *
+ * Used by:
+ * struct btrfs_ioctl_vol_args_v2.flags
+ *
+ * BTRFS_SUBVOL_RDONLY is also provided/consumed by the following ioctls:
+ * - BTRFS_IOC_SUBVOL_GETFLAGS
+ * - BTRFS_IOC_SUBVOL_SETFLAGS
+ */
+#define BTRFS_SUBVOL_CREATE_ASYNC  (1ULL << 0)
+#define BTRFS_SUBVOL_RDONLY            (1ULL << 1)
+#define BTRFS_SUBVOL_QGROUP_INHERIT    (1ULL << 2)
+
 #define BTRFS_SUBVOL_NAME_MAX 4039
 struct btrfs_ioctl_vol_args_v2 {
__s64 fd;
-- 
2.7.1
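
For illustration only (not part of the patch): a minimal sketch of
handling the subvolume flags word as documented in the new comment,
where BTRFS_SUBVOL_RDONLY is the bit exchanged through
BTRFS_IOC_SUBVOL_GETFLAGS and BTRFS_IOC_SUBVOL_SETFLAGS. The values are
copied from the diff; both helper functions are hypothetical, and which
bits a given kernel accepts is not something this sketch asserts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Copied from the patch; real code would include <linux/btrfs.h>. */
#define BTRFS_SUBVOL_CREATE_ASYNC	(1ULL << 0)
#define BTRFS_SUBVOL_RDONLY		(1ULL << 1)
#define BTRFS_SUBVOL_QGROUP_INHERIT	(1ULL << 2)

/* Hypothetical helper: true if only the three documented bits are set. */
static bool subvol_flags_known(uint64_t flags)
{
	const uint64_t known = BTRFS_SUBVOL_CREATE_ASYNC |
			       BTRFS_SUBVOL_RDONLY |
			       BTRFS_SUBVOL_QGROUP_INHERIT;

	return (flags & ~known) == 0;
}

/* Hypothetical helper: return flags with the read-only bit set or cleared. */
static uint64_t subvol_set_rdonly(uint64_t flags, bool rdonly)
{
	if (rdonly)
		return flags | BTRFS_SUBVOL_RDONLY;
	return flags & ~BTRFS_SUBVOL_RDONLY;
}
```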


[PATCH 8/8] btrfs: uapi/linux/btrfs_tree.h, use __u8 and __u64

u8 and u64 aren't exported to userspace, while __u8 and __u64 are.

Signed-off-by: Jeff Mahoney 
---
 include/uapi/linux/btrfs_tree.h | 52 -
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
index 1e87505..d5ad15a 100644
--- a/include/uapi/linux/btrfs_tree.h
+++ b/include/uapi/linux/btrfs_tree.h
@@ -334,14 +334,14 @@
  */
 struct btrfs_disk_key {
__le64 objectid;
-   u8 type;
+   __u8 type;
__le64 offset;
 } __attribute__ ((__packed__));
 
 struct btrfs_key {
-   u64 objectid;
-   u8 type;
-   u64 offset;
+   __u64 objectid;
+   __u8 type;
+   __u64 offset;
 } __attribute__ ((__packed__));
 
 struct btrfs_dev_item {
@@ -379,22 +379,22 @@ struct btrfs_dev_item {
__le32 dev_group;
 
/* seek speed 0-100 where 100 is fastest */
-   u8 seek_speed;
+   __u8 seek_speed;
 
/* bandwidth 0-100 where 100 is fastest */
-   u8 bandwidth;
+   __u8 bandwidth;
 
/* btrfs generated uuid for this device */
-   u8 uuid[BTRFS_UUID_SIZE];
+   __u8 uuid[BTRFS_UUID_SIZE];
 
/* uuid of FS who owns this device */
-   u8 fsid[BTRFS_UUID_SIZE];
+   __u8 fsid[BTRFS_UUID_SIZE];
 } __attribute__ ((__packed__));
 
 struct btrfs_stripe {
__le64 devid;
__le64 offset;
-   u8 dev_uuid[BTRFS_UUID_SIZE];
+   __u8 dev_uuid[BTRFS_UUID_SIZE];
 } __attribute__ ((__packed__));
 
 struct btrfs_chunk {
@@ -433,7 +433,7 @@ struct btrfs_chunk {
 struct btrfs_free_space_entry {
__le64 offset;
__le64 bytes;
-   u8 type;
+   __u8 type;
 } __attribute__ ((__packed__));
 
 struct btrfs_free_space_header {
@@ -486,7 +486,7 @@ struct btrfs_extent_item_v0 {
 
 struct btrfs_tree_block_info {
struct btrfs_disk_key key;
-   u8 level;
+   __u8 level;
 } __attribute__ ((__packed__));
 
 struct btrfs_extent_data_ref {
@@ -501,7 +501,7 @@ struct btrfs_shared_data_ref {
 } __attribute__ ((__packed__));
 
 struct btrfs_extent_inline_ref {
-   u8 type;
+   __u8 type;
__le64 offset;
 } __attribute__ ((__packed__));
 
@@ -523,7 +523,7 @@ struct btrfs_dev_extent {
__le64 chunk_objectid;
__le64 chunk_offset;
__le64 length;
-   u8 chunk_tree_uuid[BTRFS_UUID_SIZE];
+   __u8 chunk_tree_uuid[BTRFS_UUID_SIZE];
 } __attribute__ ((__packed__));
 
 struct btrfs_inode_ref {
@@ -583,7 +583,7 @@ struct btrfs_dir_item {
__le64 transid;
__le16 data_len;
__le16 name_len;
-   u8 type;
+   __u8 type;
 } __attribute__ ((__packed__));
 
 #define BTRFS_ROOT_SUBVOL_RDONLY   (1ULL << 0)
@@ -605,8 +605,8 @@ struct btrfs_root_item {
__le64 flags;
__le32 refs;
struct btrfs_disk_key drop_progress;
-   u8 drop_level;
-   u8 level;
+   __u8 drop_level;
+   __u8 level;
 
/*
 * The following fields appear after subvol_uuids+subvol_times
@@ -625,9 +625,9 @@ struct btrfs_root_item {
 * when invalidating the fields.
 */
__le64 generation_v2;
-   u8 uuid[BTRFS_UUID_SIZE];
-   u8 parent_uuid[BTRFS_UUID_SIZE];
-   u8 received_uuid[BTRFS_UUID_SIZE];
+   __u8 uuid[BTRFS_UUID_SIZE];
+   __u8 parent_uuid[BTRFS_UUID_SIZE];
+   __u8 received_uuid[BTRFS_UUID_SIZE];
__le64 ctransid; /* updated when an inode changes */
__le64 otransid; /* trans when created */
__le64 stransid; /* trans when sent. non-zero for received subvol */
@@ -751,12 +751,12 @@ struct btrfs_file_extent_item {
 * it is treated like an incompat flag for reading and writing,
 * but not for stat.
 */
-   u8 compression;
-   u8 encryption;
+   __u8 compression;
+   __u8 encryption;
__le16 other_encoding; /* spare for later use */
 
/* are we inline data or a real extent? */
-   u8 type;
+   __u8 type;
 
/*
 * disk space consumed by the extent, checksum blocks are included
@@ -783,7 +783,7 @@ struct btrfs_file_extent_item {
 } __attribute__ ((__packed__));
 
 struct btrfs_csum_item {
-   u8 csum;
+   __u8 csum;
 } __attribute__ ((__packed__));
 
 struct btrfs_dev_stats_item {
@@ -874,14 +874,14 @@ enum btrfs_raid_types {
 #define BTRFS_EXTENDED_PROFILE_MASK    (BTRFS_BLOCK_GROUP_PROFILE_MASK | \
                                         BTRFS_AVAIL_ALLOC_BIT_SINGLE)
 
-static inline u64 chunk_to_extended(u64 flags)
+static inline __u64 chunk_to_extended(__u64 flags)
 {
if ((flags & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0)
flags |= BTRFS_AVAIL_ALLOC_BIT_SINGLE;
 
return flags;
 }
-static inline u64 extended_to_chunk(u64 flags)
+static inline __u64 extended_to_chunk(__u64 flags)
 {
return flags & ~BTRFS_AVAIL_ALLOC_BIT_SINGLE;
 }
@@ -900,7 +900,7 @@ struct btrfs_free_space_info {
 #define

[PATCH 5/8] btrfs: uapi/linux/btrfs.h migration, move balance flags

The BTRFS_BALANCE_* flags are used by struct btrfs_ioctl_balance_args.flags
and btrfs_ioctl_balance_args.{data,meta,sys}.flags in the BTRFS_IOC_BALANCE
ioctl.

Signed-off-by: Jeff Mahoney 
---
 fs/btrfs/volumes.h | 46 -
 include/uapi/linux/btrfs.h | 64 ++
 2 files changed, 64 insertions(+), 46 deletions(-)

diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 1939ebd..144cec3 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -357,52 +357,6 @@ struct map_lookup {
 #define map_lookup_size(n) (sizeof(struct map_lookup) + \
(sizeof(struct btrfs_bio_stripe) * (n)))
 
-/*
- * Restriper's general type filter
- */
-#define BTRFS_BALANCE_DATA (1ULL << 0)
-#define BTRFS_BALANCE_SYSTEM   (1ULL << 1)
-#define BTRFS_BALANCE_METADATA (1ULL << 2)
-
-#define BTRFS_BALANCE_TYPE_MASK        (BTRFS_BALANCE_DATA |   \
-                                        BTRFS_BALANCE_SYSTEM | \
-                                        BTRFS_BALANCE_METADATA)
-
-#define BTRFS_BALANCE_FORCE    (1ULL << 3)
-#define BTRFS_BALANCE_RESUME   (1ULL << 4)
-
-/*
- * Balance filters
- */
-#define BTRFS_BALANCE_ARGS_PROFILES    (1ULL << 0)
-#define BTRFS_BALANCE_ARGS_USAGE   (1ULL << 1)
-#define BTRFS_BALANCE_ARGS_DEVID   (1ULL << 2)
-#define BTRFS_BALANCE_ARGS_DRANGE  (1ULL << 3)
-#define BTRFS_BALANCE_ARGS_VRANGE  (1ULL << 4)
-#define BTRFS_BALANCE_ARGS_LIMIT   (1ULL << 5)
-#define BTRFS_BALANCE_ARGS_LIMIT_RANGE (1ULL << 6)
-#define BTRFS_BALANCE_ARGS_STRIPES_RANGE (1ULL << 7)
-#define BTRFS_BALANCE_ARGS_USAGE_RANGE (1ULL << 10)
-
-#define BTRFS_BALANCE_ARGS_MASK                \
-       (BTRFS_BALANCE_ARGS_PROFILES |          \
-        BTRFS_BALANCE_ARGS_USAGE |             \
-        BTRFS_BALANCE_ARGS_DEVID |             \
-        BTRFS_BALANCE_ARGS_DRANGE |            \
-        BTRFS_BALANCE_ARGS_VRANGE |            \
-        BTRFS_BALANCE_ARGS_LIMIT |             \
-        BTRFS_BALANCE_ARGS_LIMIT_RANGE |       \
-        BTRFS_BALANCE_ARGS_STRIPES_RANGE |     \
-        BTRFS_BALANCE_ARGS_USAGE_RANGE)
-
-/*
- * Profile changing flags.  When SOFT is set we won't relocate chunk if
- * it already has the target profile (even though it may be
- * half-filled).
- */
-#define BTRFS_BALANCE_ARGS_CONVERT (1ULL << 8)
-#define BTRFS_BALANCE_ARGS_SOFT        (1ULL << 9)
-
 struct btrfs_balance_args;
 struct btrfs_balance_progress;
 struct btrfs_balance_control {
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index de98717..abae362 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -317,6 +317,70 @@ struct btrfs_balance_progress {
__u64 completed;/* # of chunks relocated so far */
 };
 
+/*
+ * flags definition for balance
+ *
+ * Restriper's general type filter
+ *
+ * Used by:
+ * btrfs_ioctl_balance_args.flags
+ * btrfs_balance_control.flags (internal)
+ */
+#define BTRFS_BALANCE_DATA (1ULL << 0)
+#define BTRFS_BALANCE_SYSTEM   (1ULL << 1)
+#define BTRFS_BALANCE_METADATA (1ULL << 2)
+
+#define BTRFS_BALANCE_TYPE_MASK        (BTRFS_BALANCE_DATA |   \
+                                        BTRFS_BALANCE_SYSTEM | \
+                                        BTRFS_BALANCE_METADATA)
+
+#define BTRFS_BALANCE_FORCE    (1ULL << 3)
+#define BTRFS_BALANCE_RESUME   (1ULL << 4)
+
+/*
+ * flags definitions for per-type balance args
+ *
+ * Balance filters
+ *
+ * Used by:
+ * struct btrfs_balance_args
+ */
+#define BTRFS_BALANCE_ARGS_PROFILES    (1ULL << 0)
+#define BTRFS_BALANCE_ARGS_USAGE   (1ULL << 1)
+#define BTRFS_BALANCE_ARGS_DEVID   (1ULL << 2)
+#define BTRFS_BALANCE_ARGS_DRANGE  (1ULL << 3)
+#define BTRFS_BALANCE_ARGS_VRANGE  (1ULL << 4)
+#define BTRFS_BALANCE_ARGS_LIMIT   (1ULL << 5)
+#define BTRFS_BALANCE_ARGS_LIMIT_RANGE (1ULL << 6)
+#define BTRFS_BALANCE_ARGS_STRIPES_RANGE (1ULL << 7)
+#define BTRFS_BALANCE_ARGS_USAGE_RANGE (1ULL << 10)
+
+#define BTRFS_BALANCE_ARGS_MASK                \
+       (BTRFS_BALANCE_ARGS_PROFILES |          \
+        BTRFS_BALANCE_ARGS_USAGE |             \
+        BTRFS_BALANCE_ARGS_DEVID |             \
+        BTRFS_BALANCE_ARGS_DRANGE |            \
+        BTRFS_BALANCE_ARGS_VRANGE |            \
+        BTRFS_BALANCE_ARGS_LIMIT |             \
+        BTRFS_BALANCE_ARGS_LIMIT_RANGE |       \
+        BTRFS_BALANCE_ARGS_STRIPES_RANGE |     \
+        BTRFS_BALANCE_ARGS_USAGE_RANGE)
+
+/*
+ * Profile changing flags.  When SOFT is set we won't relocate chunk if
+ * it already has the target profile (even though it may be
+ * half-filled).
+ */
+#define BTRFS_BALANCE_ARGS_CONVERT (1ULL << 8)
+#define BTRFS_BALANCE_ARGS_SOFT        (1ULL << 9)
+
+
+/*
+ * flags definition for balance state
+ *
+ * Used by:
+ *
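
For illustration only (not part of the patch, which is cut off above in
this archive): a minimal sketch of the two flag namespaces the diff
moves. The BTRFS_BALANCE_* type bits go into
btrfs_ioctl_balance_args.flags, while the BTRFS_BALANCE_ARGS_* filter
bits go into the per-type {data,meta,sys}.flags. The values are copied
from the diff; the two helper functions are hypothetical, and treating
ARGS_MASK plus CONVERT and SOFT as the full set of known per-type bits
is an assumption of this sketch, not a statement about what the kernel
enforces.

```c
#include <stdint.h>

/* Copied from the patch; real code would include <linux/btrfs.h>. */
#define BTRFS_BALANCE_DATA		(1ULL << 0)
#define BTRFS_BALANCE_SYSTEM		(1ULL << 1)
#define BTRFS_BALANCE_METADATA		(1ULL << 2)
#define BTRFS_BALANCE_TYPE_MASK		(BTRFS_BALANCE_DATA |   \
					 BTRFS_BALANCE_SYSTEM | \
					 BTRFS_BALANCE_METADATA)
#define BTRFS_BALANCE_FORCE		(1ULL << 3)
#define BTRFS_BALANCE_RESUME		(1ULL << 4)

#define BTRFS_BALANCE_ARGS_PROFILES	(1ULL << 0)
#define BTRFS_BALANCE_ARGS_USAGE	(1ULL << 1)
#define BTRFS_BALANCE_ARGS_DEVID	(1ULL << 2)
#define BTRFS_BALANCE_ARGS_DRANGE	(1ULL << 3)
#define BTRFS_BALANCE_ARGS_VRANGE	(1ULL << 4)
#define BTRFS_BALANCE_ARGS_LIMIT	(1ULL << 5)
#define BTRFS_BALANCE_ARGS_LIMIT_RANGE	(1ULL << 6)
#define BTRFS_BALANCE_ARGS_STRIPES_RANGE (1ULL << 7)
#define BTRFS_BALANCE_ARGS_CONVERT	(1ULL << 8)
#define BTRFS_BALANCE_ARGS_SOFT		(1ULL << 9)
#define BTRFS_BALANCE_ARGS_USAGE_RANGE	(1ULL << 10)

#define BTRFS_BALANCE_ARGS_MASK			\
	(BTRFS_BALANCE_ARGS_PROFILES |		\
	 BTRFS_BALANCE_ARGS_USAGE |		\
	 BTRFS_BALANCE_ARGS_DEVID |		\
	 BTRFS_BALANCE_ARGS_DRANGE |		\
	 BTRFS_BALANCE_ARGS_VRANGE |		\
	 BTRFS_BALANCE_ARGS_LIMIT |		\
	 BTRFS_BALANCE_ARGS_LIMIT_RANGE |	\
	 BTRFS_BALANCE_ARGS_STRIPES_RANGE |	\
	 BTRFS_BALANCE_ARGS_USAGE_RANGE)

/* Hypothetical helper: 1 if at least one chunk type is selected. */
static int balance_selects_type(uint64_t flags)
{
	return (flags & BTRFS_BALANCE_TYPE_MASK) != 0;
}

/* Hypothetical helper: per-type bits that none of the defines cover. */
static uint64_t balance_args_unknown_bits(uint64_t flags)
{
	return flags & ~(BTRFS_BALANCE_ARGS_MASK |
			 BTRFS_BALANCE_ARGS_CONVERT |
			 BTRFS_BALANCE_ARGS_SOFT);
}
```

Note that USAGE_RANGE is bit 10 rather than bit 8, because bits 8 and 9
are already taken by CONVERT and SOFT; BTRFS_BALANCE_ARGS_MASK therefore
evaluates to 0x4ff.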

[PATCH 1/8] btrfs: uapi/linux/btrfs.h migration, move BTRFS_LABEL_SIZE

BTRFS_LABEL_SIZE is required to define the BTRFS_IOC_GET_FSLABEL and
BTRFS_IOC_SET_FSLABEL ioctls.

Signed-off-by: Jeff Mahoney 
---
 fs/btrfs/ctree.h   | 1 -
 include/uapi/linux/btrfs.h | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 84a6a5b..3beaa24 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -410,7 +410,6 @@ struct btrfs_header {
  * room to translate 14 chunks with 3 stripes each.
  */
 #define BTRFS_SYSTEM_CHUNK_ARRAY_SIZE 2048
-#define BTRFS_LABEL_SIZE 256
 
 /*
  * just in case we somehow lose the roots and are not able to mount,
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index dea8931..11eee34 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -23,6 +23,7 @@
 
 #define BTRFS_IOCTL_MAGIC 0x94
 #define BTRFS_VOL_NAME_MAX 255
+#define BTRFS_LABEL_SIZE 256
 
 /* this should be 4k */
 #define BTRFS_PATH_NAME_MAX 4087
-- 
2.7.1
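
For illustration only (not part of the patch): a minimal sketch of why
userspace needs BTRFS_LABEL_SIZE. The label ioctls take a fixed-size
character buffer, so a caller has to size its buffer with this constant
and make sure the label fits together with its terminating NUL. The
value is copied from the diff; prepare_label() is a hypothetical helper,
and rejecting labels longer than BTRFS_LABEL_SIZE - 1 bytes is this
sketch's own policy.

```c
#include <assert.h>
#include <string.h>

/* Copied from the patch; real code would include <linux/btrfs.h>. */
#define BTRFS_LABEL_SIZE 256

/*
 * Hypothetical helper: copy label into a zero-filled buffer of
 * BTRFS_LABEL_SIZE bytes. Returns 0 on success, or -1 if the label
 * would not fit together with its terminating NUL.
 */
static int prepare_label(char out[BTRFS_LABEL_SIZE], const char *label)
{
	size_t n;

	assert(out != NULL && label != NULL);
	n = strlen(label);
	if (n > BTRFS_LABEL_SIZE - 1)
		return -1;
	memset(out, 0, BTRFS_LABEL_SIZE);
	memcpy(out, label, n);
	return 0;
}
```

The prepared buffer is what a caller would pass to
BTRFS_IOC_SET_FSLABEL, and a buffer of the same size receives the
result of BTRFS_IOC_GET_FSLABEL.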


Re: Another ENOSPC situation

2016-04-01 Thread Henk Slager

On Fri, Apr 1, 2016 at 6:50 PM, Marc Haber  wrote:
> On Fri, Apr 01, 2016 at 06:30:20PM +0200, Marc Haber wrote:
>> On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote:
>> > On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber  
>> > wrote:
>> > > btrfs balance -mprofiles seems to do something. One kworker and one
>> > > btrfs-transaction process hog one CPU core each for hours, while
>> > > blocking the filesystem for minutes apiece, which leads to the host
>> > > being nearly unusable up to the point of "clock and mouse pointer
>> > > frozen for nearly ten minutes".
>> >
>> > I assume you still have your every 10 minutes snapshotting running
>> > while balancing?
>>
>> No, I disabled the cronjob before trying the balance. I might be
>> crazy, but not stup^wunexperienced.
>
> That being said, I would still expect the code not to allow _this_
> kind of effect on the entire system when two allegedly incompatible
> operations run simultaneously. I mean, Linux is a multi-user,
> multi-tasking operating system where one simply cannot expect all
> processes to cooperate with each other. We have operating
> systems to prevent this kind of issue, not to cause it.

Maybe look at it differently: Does user mh have trouble using this
laptop w.r.t. storing files?

In openSUSE Tumbleweed (the snapshot from the end of March), root access
is needed to change the default snapshotting config, otherwise you
will have a 10 year history. After that change has been done according
to needs of the user, there is no need to run manual balance.

Re: Compression causes kernel crashes if there are I/O or checksum errors (was: RE: kernel BUG at fs/btrfs/volumes.c:5519 when hot-removing device in RAID-1)

2016-04-01 Thread mitch

I grabbed this part of the log after the machine crashed again while I
was trying to transfer a bunch of files, including some with csum
errors. Let me know whether this looks like the same issue you were
having:


Mar 31 00:49:42 sl-server kernel: NMI watchdog: BUG: soft lockup -
CPU#21 stuck for 22s! [kworker/u67:5:80994]
Mar 31 00:49:42 sl-server kernel: Modules linked in: fuse xt_CHECKSUM
ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT
nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat
ebtable_broute ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6
nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
iptable_security iptable_raw iptable_filter dm_mirror dm_region_hash
dm_log dm_mod kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel xfs aesni_intel lrw gf128mul glue_helper libcrc32c
ablk_helper cryptd joydev input_leds edac_mce_amd k10temp edac_core
fam15h_power sp5100_tco sg i2c_piix4 8250_fintek acpi_cpufreq shpchp
nfsd auth_rpcgss nfs_acl
Mar 31 00:49:42 sl-server kernel:  lockd grace sunrpc ip_tables btrfs
xor ata_generic pata_acpi raid6_pq sd_mod mgag200 crc32c_intel
drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci
serio_raw pata_atiixp libahci igb drm ptp pps_core mpt3sas dca
raid_class libata i2c_algo_bit scsi_transport_sas fjes uas usb_storage
Mar 31 00:49:42 sl-server kernel: CPU: 21 PID: 80994 Comm:
kworker/u67:5 Not tainted 4.5.0-1.el7.elrepo.x86_64 #1
Mar 31 00:49:42 sl-server kernel: Hardware name: Supermicro
H8DG6/H8DGi/H8DG6/H8DGi, BIOS 3.511/25/2013
Mar 31 00:49:42 sl-server kernel: Workqueue: btrfs-endio
btrfs_endio_helper [btrfs]
Mar 31 00:49:42 sl-server kernel: task: 8817f6fa8000 ti:
8800b731 task.ti: 8800b731
Mar 31 00:49:42 sl-server kernel: RIP:
0010:[]  []
btrfs_decompress_buf2page+0x123/0x200 [btrfs]
Mar 31 00:49:42 sl-server kernel: RSP: 0018:8800b7313be0  EFLAGS:
0246
Mar 31 00:49:42 sl-server kernel: RAX:  RBX:
 RCX: 
Mar 31 00:49:42 sl-server kernel: RDX:  RSI:
c9000e3d8000 RDI: 88144c7cc000
Mar 31 00:49:42 sl-server kernel: RBP: 8800b7313c48 R08:
8810f0295000 R09: 0020
Mar 31 00:49:42 sl-server kernel: R10: 8810d2ba7869 R11:
00010008 R12: 8817f6fa8000
Mar 31 00:49:42 sl-server kernel: R13: 8800b7313ce0 R14:
0008 R15: 1000
Mar 31 00:49:42 sl-server kernel: FS:  7efce58fb740()
GS:881807d4() knlGS:
Mar 31 00:49:42 sl-server kernel: CS:  0010 DS:  ES:  CR0:
8005003b
Mar 31 00:49:42 sl-server kernel: CR2: 7f00caf249e8 CR3:
001062121000 CR4: 000406e0
Mar 31 00:49:42 sl-server kernel: Stack:
Mar 31 00:49:42 sl-server kernel:  0020 f000
8810f0295000 8744
Mar 31 00:49:42 sl-server kernel:  00010008 c9000e3d7000
ea005131f300 0001
Mar 31 00:49:42 sl-server kernel:  0797 2869
0869 8810d2ba7000
Mar 31 00:49:42 sl-server kernel: Call Trace:
Mar 31 00:49:42 sl-server kernel:  []
lzo_decompress_biovec+0x202/0x300 [btrfs]
Mar 31 00:49:42 sl-server kernel:  []
end_compressed_bio_read+0x1f6/0x2f0 [btrfs]
Mar 31 00:49:42 sl-server kernel:  []
bio_endio+0x40/0x60
Mar 31 00:49:42 sl-server kernel:  []
end_workqueue_fn+0x3c/0x40 [btrfs]
Mar 31 00:49:42 sl-server kernel:  []
normal_work_helper+0xc0/0x2c0 [btrfs]
Mar 31 00:49:42 sl-server kernel:  []
btrfs_endio_helper+0x12/0x20 [btrfs]
Mar 31 00:49:42 sl-server kernel:  []
process_one_work+0x14f/0x400
Mar 31 00:49:42 sl-server kernel:  []
worker_thread+0x125/0x4b0
Mar 31 00:49:42 sl-server kernel:  [] ?
rescuer_thread+0x370/0x370
Mar 31 00:49:42 sl-server kernel:  []
kthread+0xd8/0xf0
Mar 31 00:49:42 sl-server kernel:  [] ?
kthread_park+0x60/0x60
Mar 31 00:49:42 sl-server kernel:  []
ret_from_fork+0x3f/0x70
Mar 31 00:49:42 sl-server kernel:  [] ?
kthread_park+0x60/0x60
Mar 31 00:49:42 sl-server kernel: Code: c7 48 8b 45 c0 49 03 7d 00 4a
8d 34 38 e8 06 18 00 e1 41 83 ac 24 28 12 00 00 01 41 8b 84 24 28 12 00
00 85 c0 0f 88 bf 00 00 00 <48> 89 d8 49 03 45 00 49 01 df 49 29 de 48
01 5d d0 48 3d 00 10 
Mar 31 00:49:43 sl-server sh[1297]: abrt-dump-oops: Found oopses: 1
Mar 31 00:49:43 sl-server sh[1297]: abrt-dump-oops: Creating problem
directories
Mar 31 00:49:43 sl-server sh[1297]: abrt-dump-oops: Not going to make
dump directories world readable because PrivateReports is on

Re: btrfs_destroy_inode WARN_ON.

2016-04-01 Thread Dave Jones

On Fri, Apr 01, 2016 at 02:12:27PM -0400, Dave Jones wrote:
 > BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 30s!
 > Showing busy workqueues and worker pools:
 > workqueue events: flags=0x0
 >   pwq 6: cpus=3 node=0 flags=0x0 nice=0 active=1/256
 > pending: vmstat_shepherd
 >   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256
 > pending: check_corruption
 >   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=3/256
 > pending: usb_serial_port_work, lru_add_drain_per_cpu BAR(17230), 
 > e1000_watchdog_task
 > workqueue events_power_efficient: flags=0x82
 >   pwq 8: cpus=0-3 flags=0x4 nice=0 active=3/256
 > pending: fb_flashcursor, neigh_periodic_work, neigh_periodic_work
 > workqueue events_freezable_power_: flags=0x86
 >   pwq 8: cpus=0-3 flags=0x4 nice=0 active=1/256
 > pending: disk_events_workfn
 > workqueue netns: flags=0x6000a
 >   pwq 8: cpus=0-3 flags=0x4 nice=0 active=1/1
 > in-flight: 10038:cleanup_net
 > workqueue writeback: flags=0x4e
 >   pwq 8: cpus=0-3 flags=0x4 nice=0 active=2/256
 > pending: wb_workfn, wb_workfn
 > workqueue kblockd: flags=0x18
 >   pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=2/256
 > pending: blk_mq_timeout_work, blk_mq_timeout_work
 > workqueue vmstat: flags=0xc
 >   pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256
 > pending: vmstat_update
 >   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
 > pending: vmstat_update
 >   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
 > pending: vmstat_update
 > pool 8: cpus=0-3 flags=0x4 nice=0 hung=0s workers=11 idle: 11638 10276 609 
 > 17937 606 9237 605 891 15998 14100
 > note: trinity-c13[18815] exited with preempt_count 1

This has wedged userspace too:

23082 pts/2    SN+    0:00  |   \_ /bin/bash scripts/test-multi.sh
14140 pts/2    SNL+   0:15  |   \_ ../trinity -q -l off -N 100 -a64 -x fsync -x fdatasync
16900 ?        DNs    0:04  |   \_ ../trinity -q -l off -N 100 -a64 -x fsync -x fdata
18894 ?        DNs    0:02  |   \_ ../trinity -q -l off -N 100 -a64 -x fsync -x fdata

(14:16:02:davej@think:trinity[master])$ stack 16900
[] wait_on_page_bit_killable+0x156/0x1b0
[] __lock_page_or_retry+0x112/0x1b0
[] filemap_fault+0x367/0xb30
[] __do_fault+0x167/0x3d0
[] handle_mm_fault+0x1837/0x2520
[] __do_page_fault+0x248/0x770
[] do_page_fault+0x39/0xa0
[] page_fault+0x1f/0x30
[] mm_release+0x1ec/0x230
[] do_exit+0x5d0/0x18c0
[] do_group_exit+0xac/0x190
[] get_signal+0x48f/0xeb0
[] do_signal+0xa0/0xb50
[] exit_to_usermode_loop+0xd9/0x100
[] do_syscall_64+0x238/0x2b0
[] return_from_SYSCALL_64+0x0/0x7a
[] 0x

(14:16:09:davej@think:trinity[master])$ stack 18894
[] btrfs_file_write_iter+0xe8/0x9a0 [btrfs]
[] __vfs_write+0x279/0x2e0
[] vfs_write+0x11e/0x2b0
[] SyS_write+0xd2/0x1a0
[] do_syscall_64+0x103/0x2b0
[] return_from_SYSCALL_64+0x0/0x7a
[] 0x

I tried to ftrace the latter process, and the box completely hung.

Dave

Re: btrfs_destroy_inode WARN_ON.

2016-04-01 Thread Dave Jones

On Sun, Mar 27, 2016 at 09:14:00PM -0400, Dave Jones wrote:
 
 >  > WARNING: CPU: 2 PID: 32570 at fs/btrfs/inode.c:9261 
 > btrfs_destroy_inode+0x389/0x3f0 [btrfs]
 >  > CPU: 2 PID: 32570 Comm: rm Not tainted 4.5.0-think+ #14
 >  >  c039baf9 ef721ef0 88025966fc08 8957bcdb
 >  >    88025966fc50 890b41f1
 >  >  88045d918040 242d4eed6048 88024eed6048 88024eed6048
 >  > Call Trace:
 >  >  [] ? btrfs_destroy_inode+0x389/0x3f0 [btrfs]
 >  >  [] dump_stack+0x68/0x9d
 >  >  [] __warn+0x111/0x130
 >  >  [] warn_slowpath_null+0x1d/0x20
 >  >  [] btrfs_destroy_inode+0x389/0x3f0 [btrfs]
 >  >  [] destroy_inode+0x67/0x90
 >  >  [] evict+0x1b7/0x240
 >  >  [] iput+0x3ae/0x4e0
 >  >  [] ? dput+0x20e/0x460
 >  >  [] do_unlinkat+0x256/0x440
 >  >  [] ? do_rmdir+0x350/0x350
 >  >  [] ? syscall_trace_enter_phase1+0x87/0x260
 >  >  [] ? enter_from_user_mode+0x50/0x50
 >  >  [] ? __lock_is_held+0x25/0xd0
 >  >  [] ? mark_held_locks+0x22/0xc0
 >  >  [] ? syscall_trace_enter_phase2+0x12d/0x3d0
 >  >  [] ? SyS_rmdir+0x20/0x20
 >  >  [] SyS_unlinkat+0x1b/0x30
 >  >  [] do_syscall_64+0xf4/0x240
 >  >  [] entry_SYSCALL64_slow_path+0x25/0x25
 >  > ---[ end trace a48ce4e6a1b5e409 ]---
 >  > 
 >  > That's WARN_ON(BTRFS_I(inode)->csum_bytes);
 >  > 
 >  > *maybe* it's a bad disk, but there's no indication in dmesg of anything 
 > awry.
 >  > Spinning rust on SATA, nothing special.
 > 
 > Same WARN_ON is reachable from umount too..
 > 
 > WARNING: CPU: 2 PID: 20092 at fs/btrfs/inode.c:9261 
 > btrfs_destroy_inode+0x40c/0x480 [btrfs]
 > CPU: 2 PID: 20092 Comm: umount Tainted: GW   4.5.0-think+ #1
 >   a32c482b 8803cd187b60 9d63af84
 >    c05c5e40 c04d316c
 >  8803cd187ba8 9d0c4c27 880460d80040 242dcd187bb0
 > Call Trace:
 >  [] dump_stack+0x95/0xe1
 >  [] ? btrfs_destroy_inode+0x40c/0x480 [btrfs]
 >  [] __warn+0x147/0x170
 >  [] warn_slowpath_null+0x31/0x40
 >  [] btrfs_destroy_inode+0x40c/0x480 [btrfs]
 >  [] ? btrfs_test_destroy_inode+0x40/0x40 [btrfs]
 >  [] destroy_inode+0x77/0xb0
 >  [] evict+0x20e/0x2c0
 >  [] dispose_list+0x70/0xb0
 >  [] evict_inodes+0x26f/0x2c0
 >  [] ? inode_add_lru+0x60/0x60
 >  [] ? fsnotify_unmount_inodes+0x215/0x2c0
 >  [] generic_shutdown_super+0x76/0x1c0
 >  [] kill_anon_super+0x29/0x40
 >  [] btrfs_kill_super+0x31/0x130 [btrfs]
 >  [] deactivate_locked_super+0x6f/0xb0
 >  [] deactivate_super+0x99/0xb0
 >  [] cleanup_mnt+0x70/0xd0
 >  [] __cleanup_mnt+0x1b/0x20
 >  [] task_work_run+0xef/0x130
 >  [] exit_to_usermode_loop+0xf9/0x100
 >  [] do_syscall_64+0x238/0x2b0
 >  [] entry_SYSCALL64_slow_path+0x25/0x25

Additional fallout:

BTRFS: assertion failed: num_extents, file: fs/btrfs/extent-tree.c, line: 5584
------------[ cut here ]------------
kernel BUG at fs/btrfs/ctree.h:4320!
invalid opcode:  [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
CPU: 1 PID: 18815 Comm: trinity-c13 Tainted: G        W       4.6.0-rc1-think+ #1
task: 88045de10040 ti: 8803afa38000 task.ti: 8803afa38000
RIP: 0010:[]  [] 
assfail.constprop.88+0x2b/0x2d [btrfs]
RSP: 0018:8803afa3f838  EFLAGS: 00010282
RAX: 004e RBX: c046e200 RCX: 
RDX:  RSI: 0003 RDI: ed0075f47efb
RBP: 8803afa3f848 R08: 0001 R09: 0001
R10:  R11: 0001 R12: 15d0
R13: 8803fda0e048 R14: 8803fda0dc38 R15: 8803fda0dc58
FS:  7fa0566d6700() GS:880468a0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fa0566d9000 CR3: 000333bc4000 CR4: 001406e0
DR0: 7fa0554fb000 DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0600
Stack:
  8803fda0e048 8803afa3f880 c032288b
  880460bb33f8 8803fda0e048 8803fda0dc38
 8803fda0dc58 8803afa3f8c8 c032f851 0001
Call Trace:
 [] drop_outstanding_extent+0x10b/0x130 [btrfs]
 [] btrfs_delalloc_release_metadata+0x71/0x480 [btrfs]
 [] ? __btrfs_buffered_write+0xa6f/0xb50 [btrfs]
 [] btrfs_delalloc_release_space+0x27/0x50 [btrfs]
 [] __btrfs_buffered_write+0xa28/0xb50 [btrfs]
 [] ? btrfs_dirty_pages+0x1c0/0x1c0 [btrfs]
 [] ? filemap_fdatawait_range+0x3e/0x50
 [] ? generic_file_direct_write+0x237/0x2f0
 [] ? filemap_write_and_wait_range+0xa0/0xa0
 [] ? btrfs_file_write_iter+0x670/0x9a0 [btrfs]
 [] btrfs_file_write_iter+0x74d/0x9a0 [btrfs]
 [] do_iter_readv_writev+0x153/0x1f0
 [] ? btrfs_sync_file+0x920/0x920 [btrfs]
 [] ? vfs_iter_read+0x1e0/0x1e0
 [] ? preempt_count_sub+0xb9/0x130
 [] ? percpu_down_read+0x57/0xa0
 [] ? __sb_start_write+0xee/0x130
 [] ? btrfs_sync_file+0x920/0x920 [btrfs]
 [] do_readv_writev+0x30f/0x460
 [] ? vfs_write+0x2b0/0x2b0
 [] ?

Re: Another ENOSPC situation

On Fri, Apr 01, 2016 at 06:30:20PM +0200, Marc Haber wrote:
> On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote:
> > On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber wrote:
> > > btrfs balance -mprofiles seems to do something. One kworker and one
> > > btrfs-transaction process hog one CPU core each for hours, while
> > > blocking the filesystem for minutes apiece, which leads to the host
> > > being nearly unusable up to the point of "clock and mouse pointer
> > > frozen for nearly ten minutes".
> > 
> > I assume you still have your every 10 minutes snapshotting running
> > while balancing?
> 
> No, I disabled the cronjob before trying the balance. I might be
> crazy, but not stup^wunexperienced.

That being said, I would still expect the code not to allow _this_
kind of effect on the entire system when two allegedly incompatible
operations run simultaneously. I mean, Linux is a multi-user,
multi-tasking operating system where one simply cannot expect all
processes to cooperate with each other. We have operating
systems to prevent this kind of issue, not to cause it.

Greetings
Marc

-- 
-
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

Re: [PATCH v9 00/19] Btrfs dedupe framework

2016-04-01 Thread David Sterba

On Fri, Apr 01, 2016 at 08:26:43AM +0800, Qu Wenruo wrote:
> 
> 
> David Sterba wrote on 2016/03/31 18:12 +0200:
> > On Wed, Mar 30, 2016 at 03:55:55PM +0800, Qu Wenruo wrote:
> >> This March 30th patchset update mostly addresses the patchset structure
> >> comment from David:
> >> 1) Change the patchset sequence
> >> Now, if only the first 14 patches are applied, they provide a fully
> >> backward-compatible, in-memory-only dedupe backend.
> >>
> >> Only from patch 15 onward is the on-disk format changed.
> >>
> >> So patches 1~14 are going to be pushed for the next merge window, while
> >> I'll still submit them all for review purposes.
> >
> > I'll buy 1-10 with the ioctl hidden under the BTRFS_DEBUG config option
> > until the interface is settled.
> >
> >
> Nice to hear that.
> 
> I'll add BTRFS_DEBUG config then.

Independent of the next merge window, I'll add them to my for-next after
you send the updated version. I'll also try to review them next week,
but I don't remember any critical issue during first reading, so there's
no blocker.

> BTW, any comment on btrfs-convert rewrite?

This is not the right place to ask; better to ping as a reply to the thread,
as I could miss it. Nevertheless, the answer is that it's going to the devel
branch. The convert tests passed (as the required minimum), but the patchset
is still not reviewed to my satisfaction.

Re: Another ENOSPC situation

On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote:
> On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber wrote:
> > btrfs balance -mprofiles seems to do something. One kworker and one
> > btrfs-transaction process hog one CPU core each for hours, while
> > blocking the filesystem for minutes apiece, which leads to the host
> > being nearly unusable up to the point of "clock and mouse pointer
> > frozen for nearly ten minutes".
> 
> I assume you still have your every 10 minutes snapshotting running
> while balancing?

No, I disabled the cronjob before trying the balance. I might be
crazy, but not stup^wunexperienced.

Greetings
Marc

-- 
-
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

Re: Another ENOSPC situation

2016-04-01 Thread Henk Slager

On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber  wrote:
> Hi,
>
> just for a change, this is another btrfs on a different host. The host
> is also running Debian unstable with mainline kernels, the btrfs in
> question was created (not converted) in March 2015 with btrfs-tools
> 3.17. It is the root fs of my main work notebook which is under
> workstation load, with lots of snapshots being created and deleted.
>
> Balance immediately fails with ENOSPC
>
> balance -dprofiles=single -dusage=1 goes through "fine" ("had to
> relocate 0 out of 602 chunks")
>
> balance -dprofiles=single -dusage=2 also ENOSPCes immediately.
>
> [4/502]mh@swivel:~$ sudo btrfs fi usage /
> Overall:
>     Device size:                 600.00GiB
>     Device allocated:            600.00GiB
>     Device unallocated:            1.00MiB
>     Device missing:                  0.00B
>     Used:                        413.40GiB
>     Free (estimated):            148.20GiB      (min: 148.20GiB)
>     Data ratio:                       1.00
>     Metadata ratio:                   2.00
>     Global reserve:              512.00MiB      (used: 0.00B)
>
> Data,single: Size:553.93GiB, Used:405.73GiB
>    /dev/mapper/swivelbtr 553.93GiB
>
> Metadata,DUP: Size:23.00GiB, Used:3.83GiB
>    /dev/mapper/swivelbtr  46.00GiB
>
> System,DUP: Size:32.00MiB, Used:112.00KiB
>    /dev/mapper/swivelbtr  64.00MiB
>
> Unallocated:
>    /dev/mapper/swivelbtr   1.00MiB
> [5/503]mh@swivel:~$
>
> btrfs balance -mprofiles seems to do something. One kworker and one
> btrfs-transaction process hog one CPU core each for hours, while
> blocking the filesystem for minutes apiece, which leads to the host
> being nearly unusable up to the point of "clock and mouse pointer
> frozen for nearly ten minutes".

I assume you still have your every 10 minutes snapshotting running
while balancing?

> The btrfs balance cancel I issued after four hours of this state took
> eleven minutes alone to complete.
>
> These are all the log entries that were obtained after starting btrfs
> balance -mprofiles at 09:43:
> Apr  1 12:18:21 swivel kernel: [253651.970413] BTRFS info (device dm-14): 
> found 3523 extents
> Apr  1 12:18:21 swivel kernel: [253652.035572] BTRFS info (device dm-14): 
> relocating block group 1538365849600 flags 36
> Apr  1 13:30:57 swivel kernel: [258007.653597] BTRFS info (device dm-14): 
> found 3585 extents
> Apr  1 13:30:57 swivel kernel: [258007.746541] BTRFS info (device dm-14): 
> relocating block group 1536755236864 flags 36
> Apr  1 13:49:39 swivel kernel: [259130.296184] BTRFS info (device dm-14): 
> found 3047 extents
> Apr  1 13:49:39 swivel kernel: [259130.357314] BTRFS info (device dm-14): 
> relocating block group 1528702173184 flags 36
> Apr  1 14:30:00 swivel kernel: [261550.776348] BTRFS info (device dm-14): 
> found 4200 extents
>
> This kernel trace from 11:16 is not btrfs-related, is it? I guess it's
> Bluetooth-related, since it happened at the same time as the Bluetooth
> device popping out and in:
> Apr  1 11:16:38 swivel kernel: [249948.993751] usb 1-1.4: USB disconnect, 
> device number 39
> Apr  1 11:16:38 swivel systemd[1]: Starting Load/Save RF Kill Switch Status...
> Apr  1 11:16:38 swivel systemd[1]: Started Load/Save RF Kill Switch Status.
> Apr  1 11:16:38 swivel systemd[1]: bluetooth.target: Unit not needed anymore. 
> Stopping.
> Apr  1 11:16:38 swivel systemd[1]: Stopped target Bluetooth.
> Apr  1 11:16:38 swivel laptop-mode: Laptop mode
> Apr  1 11:16:38 swivel laptop-mode: enabled, not active
> Apr  1 11:16:39 swivel kernel: [249949.211549] usb 1-1.4: new full-speed USB 
> device number 40 using ehci-pci
> Apr  1 11:16:39 swivel kernel: [249949.308386] usb 1-1.4: New USB device 
> found, idVendor=0a5c, idProduct=217f
> Apr  1 11:16:39 swivel kernel: [249949.308397] usb 1-1.4: New USB device 
> strings: Mfr=1, Product=2, SerialNumber=3
> Apr  1 11:16:39 swivel kernel: [249949.308402] usb 1-1.4: Product: Broadcom 
> Bluetooth Device
> Apr  1 11:16:39 swivel kernel: [249949.308407] usb 1-1.4: Manufacturer: 
> Broadcom Corp
> Apr  1 11:16:39 swivel kernel: [249949.308412] usb 1-1.4: SerialNumber: 
> CCAF78F1274F
> Apr  1 11:16:39 swivel systemd[1]: Reached target Bluetooth.
> Apr  1 11:16:39 swivel kernel: [249949.507794] ------------[ cut here ]------------
> Apr  1 11:16:39 swivel kernel: [249949.507810] WARNING: CPU: 1 PID: 11 at 
> arch/x86/kernel/cpu/perf_event_intel_ds.c:325 reserve_ds_buffers+0x102/0x326()
> Apr  1 11:16:39 swivel kernel: [249949.507813] alloc_bts_buffer: BTS buffer 
> allocation failure
> Apr  1 11:16:39 swivel kernel: [249949.507816] Modules linked in: cpuid 
> hid_generic usbhid hid e1000e tun ctr ccm rfcomm bridge stp llc 
> cpufreq_userspace cpufreq_stats cpufreq_conservative cpufreq_powersave 
> nf_conntrack_netlink nfnetlink bnep binfmt_misc intel_rapl 
> x86_pkg_temp_thermal arc4 intel_powerclamp kvm_intel kvm irqbypass iwldvm 
> snd_hda_codec_conexant

Re: Again, no space left on device while rebalancing and recipe doesnt work

On Sat, Feb 27, 2016 at 10:14:50PM +0100, Marc Haber wrote:
> I have again the issue of no space left on device while rebalancing
> (with btrfs-tools 4.4.1 on kernel 4.4.2 on Debian unstable):

just for the record: The host started acting up in more and more
interesting ways, and after a call of rm during kernel build resulted
in SIGSEGV, I did the backup-format-restore routine for this system
back to ext4 just to find out whether I have bad hardware or a bad
filesystem.

And, since going back to ext4, the system is just fine again. So it's
not bad hardware.

This system's root drive is going to stay on ext4 for a loong
time. If the btrfs phenomena I experience on other hosts get
solved at some time in the future, I might migrate /home back to
btrfs, but that's not going to happen in the next six months.

This is a really bad experience which has made me lose a lot of faith
in the new filesystem. I really feel sad about that.

Greetings
Marc

-- 
-
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

Another ENOSPC situation

Hi,

just for a change, this is another btrfs on a different host. The host
is also running Debian unstable with mainline kernels, the btrfs in
question was created (not converted) in March 2015 with btrfs-tools
3.17. It is the root fs of my main work notebook which is under
workstation load, with lots of snapshots being created and deleted.

Balance immediately fails with ENOSPC

balance -dprofiles=single -dusage=1 goes through "fine" ("had to
relocate 0 out of 602 chunks")

balance -dprofiles=single -dusage=2 also ENOSPCes immediately.

[4/502]mh@swivel:~$ sudo btrfs fi usage /
Overall:
    Device size:                 600.00GiB
    Device allocated:            600.00GiB
    Device unallocated:            1.00MiB
    Device missing:                  0.00B
    Used:                        413.40GiB
    Free (estimated):            148.20GiB      (min: 148.20GiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,single: Size:553.93GiB, Used:405.73GiB
   /dev/mapper/swivelbtr 553.93GiB

Metadata,DUP: Size:23.00GiB, Used:3.83GiB
   /dev/mapper/swivelbtr  46.00GiB

System,DUP: Size:32.00MiB, Used:112.00KiB
   /dev/mapper/swivelbtr  64.00MiB

Unallocated:
   /dev/mapper/swivelbtr   1.00MiB
[5/503]mh@swivel:~$ 
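
The numbers in this output can be cross-checked with a little arithmetic.
The following is only a minimal sketch in Python: the values are copied from
the output above, the variable names are made up for illustration, and it
assumes that the per-device figures for DUP profiles count both copies.

```python
# Values copied from the `btrfs fi usage /` output above, in GiB.
# All names are illustrative; none of this is part of any btrfs tool.
MIB = 1 / 1024  # one MiB expressed in GiB

device_size = 600.00
data_on_disk = 553.93       # Data,single: one copy on disk
metadata_on_disk = 46.00    # Metadata,DUP: 23.00 GiB logical, two copies
system_on_disk = 64 * MIB   # System,DUP: 32 MiB logical, two copies
unallocated = 1 * MIB

# Space already handed out to chunks of any type.
allocated = data_on_disk + metadata_on_disk + system_on_disk

# Free space that exists only *inside* already allocated chunks.
data_free = 553.93 - 405.73
metadata_free = 23.00 - 3.83

print(f"allocated to chunks:  {allocated:.2f} GiB of {device_size:.2f} GiB")
print(f"free inside data:     {data_free:.2f} GiB")
print(f"free inside metadata: {metadata_free:.2f} GiB (logical)")
print(f"unallocated:          {unallocated * 1024:.2f} MiB")
```

Within rounding of the displayed values, the chunk sizes add up to the whole
device, and the "Free (estimated)" figure matches the free space inside the
data chunks. So nearly all of the free space sits inside existing chunks while
only 1 MiB is unallocated. Balance normally has to allocate a new chunk to
relocate into, which is the usual explanation for an immediate ENOSPC in this
state.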

btrfs balance -mprofiles seems to do something. One kworker and one
btrfs-transaction process hog one CPU core each for hours, while
blocking the filesystem for minutes apiece, which leads to the host
being nearly unusable up to the point of "clock and mouse pointer
frozen for nearly ten minutes".

The btrfs balance cancel I issued after four hours of this state took
eleven minutes alone to complete.

These are all the log entries that were obtained after starting btrfs
balance -mprofiles at 09:43:
Apr  1 12:18:21 swivel kernel: [253651.970413] BTRFS info (device dm-14): found 
3523 extents
Apr  1 12:18:21 swivel kernel: [253652.035572] BTRFS info (device dm-14): 
relocating block group 1538365849600 flags 36
Apr  1 13:30:57 swivel kernel: [258007.653597] BTRFS info (device dm-14): found 
3585 extents
Apr  1 13:30:57 swivel kernel: [258007.746541] BTRFS info (device dm-14): 
relocating block group 1536755236864 flags 36
Apr  1 13:49:39 swivel kernel: [259130.296184] BTRFS info (device dm-14): found 
3047 extents
Apr  1 13:49:39 swivel kernel: [259130.357314] BTRFS info (device dm-14): 
relocating block group 1528702173184 flags 36
Apr  1 14:30:00 swivel kernel: [261550.776348] BTRFS info (device dm-14): found 
4200 extents
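
To put numbers on these entries, here is a rough Python sketch. It decodes
the "flags 36" value using the block group type and profile bits from the
kernel's btrfs headers, and measures the wall-clock time between each
"relocating" line and the next "found ... extents" line. Treating that
interval as the time spent on one block group is an assumption on my side,
and the helper names are made up for illustration.

```python
from datetime import datetime

# (timestamp, message) pairs copied from the log excerpt above.
LOG = [
    ("12:18:21", "found 3523 extents"),
    ("12:18:21", "relocating block group 1538365849600 flags 36"),
    ("13:30:57", "found 3585 extents"),
    ("13:30:57", "relocating block group 1536755236864 flags 36"),
    ("13:49:39", "found 3047 extents"),
    ("13:49:39", "relocating block group 1528702173184 flags 36"),
    ("14:30:00", "found 4200 extents"),
]

# Block group type and profile bits as defined in the btrfs headers.
FLAG_NAMES = {
    1 << 0: "DATA", 1 << 1: "SYSTEM", 1 << 2: "METADATA",
    1 << 3: "RAID0", 1 << 4: "RAID1", 1 << 5: "DUP",
    1 << 6: "RAID10", 1 << 7: "RAID5", 1 << 8: "RAID6",
}


def decode_flags(flags):
    """Return the names of all bits set in a block group flags value."""
    return [name for bit, name in FLAG_NAMES.items() if flags & bit]


def relocation_minutes(log):
    """Pair each 'relocating' line with the next 'found' line.

    Returns a list of (block_group, minutes) tuples. Assumes all
    timestamps fall on the same day.
    """
    result = []
    start = None
    for stamp, msg in log:
        t = datetime.strptime(stamp, "%H:%M:%S")
        if msg.startswith("relocating"):
            start = (t, msg.split()[3])  # 4th word is the block group
        elif msg.startswith("found") and start is not None:
            minutes = (t - start[0]).total_seconds() / 60
            result.append((start[1], minutes))
            start = None
    return result


print("flags 36 =", " | ".join(decode_flags(36)))
for block_group, minutes in relocation_minutes(LOG):
    print(f"block group {block_group}: {minutes:.1f} min")
```

Flags 36 is 0x24, i.e. METADATA plus DUP, which fits a metadata-only
balance on this filesystem. Under the pairing assumption above, the three
block groups took roughly 73, 19 and 40 minutes each.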

This kernel trace from 11:16 is not btrfs-related, is it? I guess it's
Bluetooth-related, since it happened at the same time as the Bluetooth
device popping out and in:
Apr  1 11:16:38 swivel kernel: [249948.993751] usb 1-1.4: USB disconnect, 
device number 39
Apr  1 11:16:38 swivel systemd[1]: Starting Load/Save RF Kill Switch Status...
Apr  1 11:16:38 swivel systemd[1]: Started Load/Save RF Kill Switch Status.
Apr  1 11:16:38 swivel systemd[1]: bluetooth.target: Unit not needed anymore. 
Stopping.
Apr  1 11:16:38 swivel systemd[1]: Stopped target Bluetooth.
Apr  1 11:16:38 swivel laptop-mode: Laptop mode
Apr  1 11:16:38 swivel laptop-mode: enabled, not active
Apr  1 11:16:39 swivel kernel: [249949.211549] usb 1-1.4: new full-speed USB 
device number 40 using ehci-pci
Apr  1 11:16:39 swivel kernel: [249949.308386] usb 1-1.4: New USB device found, 
idVendor=0a5c, idProduct=217f
Apr  1 11:16:39 swivel kernel: [249949.308397] usb 1-1.4: New USB device 
strings: Mfr=1, Product=2, SerialNumber=3
Apr  1 11:16:39 swivel kernel: [249949.308402] usb 1-1.4: Product: Broadcom 
Bluetooth Device
Apr  1 11:16:39 swivel kernel: [249949.308407] usb 1-1.4: Manufacturer: 
Broadcom Corp
Apr  1 11:16:39 swivel kernel: [249949.308412] usb 1-1.4: SerialNumber: 
CCAF78F1274F
Apr  1 11:16:39 swivel systemd[1]: Reached target Bluetooth.
Apr  1 11:16:39 swivel kernel: [249949.507794] ------------[ cut here ]------------
Apr  1 11:16:39 swivel kernel: [249949.507810] WARNING: CPU: 1 PID: 11 at 
arch/x86/kernel/cpu/perf_event_intel_ds.c:325 reserve_ds_buffers+0x102/0x326()
Apr  1 11:16:39 swivel kernel: [249949.507813] alloc_bts_buffer: BTS buffer 
allocation failure
Apr  1 11:16:39 swivel kernel: [249949.507816] Modules linked in: cpuid 
hid_generic usbhid hid e1000e tun ctr ccm rfcomm bridge stp llc 
cpufreq_userspace cpufreq_stats cpufreq_conservative cpufreq_powersave 
nf_conntrack_netlink nfnetlink bnep binfmt_misc intel_rapl x86_pkg_temp_thermal 
arc4 intel_powerclamp kvm_intel kvm irqbypass iwldvm snd_hda_codec_conexant 
snd_hda_codec_generic mac80211 input_leds btusb btbcm i2c_i801 snd_hda_intel 
btintel snd_hda_codec bluetooth iwlwifi snd_hda_core cfg80211 snd_hwdep sg 
snd_pcm_oss snd_mixer_oss lpc_ich mfd_core snd_pcm shpchp snd_timer 
thinkpad_acpi nvram snd battery soundcore rfkill ac tpm_tis tpm evdev processor 
xt_TCPMSS xt_tcpudp iptable_mangle iptable_filter

Re: [PATCH] btrfs-progs: fsck: Fix a false metadata extent warning

2016-04-01 Thread David Sterba

On Fri, Apr 01, 2016 at 08:09:56PM +0800, Qu Wenruo wrote:
> 
> 
> On 04/01/2016 07:39 PM, David Sterba wrote:
> > On Fri, Apr 01, 2016 at 04:50:06PM +0800, Qu Wenruo wrote:
> >>> After another look, why don't we use nodesize directly? Or stripesize
> >>> where it applies. With max_size == 0 the test does not make sense, we
> >>> ought to know the alignment.
> >>>
> >> Yes, my first thought was also to use nodesize directly, which should
> >> always be correct.
> >>
> >> But the problem is that the related function call stack doesn't have any
> >> member to reach btrfs_root or btrfs_fs_info.
> >>
> >> In the very first version of this crossing-stripe check, I added
> >> a btrfs_root/btrfs_fs_info parameter to the function.
> >>
> >> But the code changes were too many, so I used 'max_size'.
> >>
> >> I can try to redo that modification, but IIRC it didn't lead to a good
> >> result.
> >
> > Yes, it would require refactoring, which would be good in itself, because
> > add_extent_rec takes 12(!) parameters. Some of its callers would need to
> > be updated, but it seems doable.
> 
> I'll try to refactor.

I'm working on it.

Re: "bad metadata" not fixed by btrfs repair