On Tue, Oct 27, 2015 at 4:13 AM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
>
>
> Filipe Manana wrote on 2015/10/25 14:39 +0000:
>>
>> On Tue, Oct 13, 2015 at 3:20 AM, Qu Wenruo <quwen...@cn.fujitsu.com>
>> wrote:
>>>
>>> Add new function btrfs_add_delayed_qgroup_reserve() function to record
>>> how much space is reserved for that extent.
>>>
>>> As btrfs only accounts qgroup at run_delayed_refs() time, so newly
>>> allocated extent should keep the reserved space until then.
>>>
>>> So add needed function with related members to do it.
>>>
>>> Signed-off-by: Qu Wenruo <quwen...@cn.fujitsu.com>
>>> ---
>>> v2:
>>>    None
>>> v3:
>>>    None
>>> ---
>>>   fs/btrfs/delayed-ref.c | 29 +++++++++++++++++++++++++++++
>>>   fs/btrfs/delayed-ref.h | 14 ++++++++++++++
>>>   2 files changed, 43 insertions(+)
>>>
>>> diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
>>> index ac3e81d..bd9b63b 100644
>>> --- a/fs/btrfs/delayed-ref.c
>>> +++ b/fs/btrfs/delayed-ref.c
>>> @@ -476,6 +476,8 @@ add_delayed_ref_head(struct btrfs_fs_info *fs_info,
>>>          INIT_LIST_HEAD(&head_ref->ref_list);
>>>          head_ref->processing = 0;
>>>          head_ref->total_ref_mod = count_mod;
>>> +       head_ref->qgroup_reserved = 0;
>>> +       head_ref->qgroup_ref_root = 0;
>>>
>>>          /* Record qgroup extent info if provided */
>>>          if (qrecord) {
>>> @@ -746,6 +748,33 @@ int btrfs_add_delayed_data_ref(struct btrfs_fs_info
>>> *fs_info,
>>>          return 0;
>>>   }
>>>
>>> +int btrfs_add_delayed_qgroup_reserve(struct btrfs_fs_info *fs_info,
>>> +                                    struct btrfs_trans_handle *trans,
>>> +                                    u64 ref_root, u64 bytenr, u64
>>> num_bytes)
>>> +{
>>> +       struct btrfs_delayed_ref_root *delayed_refs;
>>> +       struct btrfs_delayed_ref_head *ref_head;
>>> +       int ret = 0;
>>> +
>>> +       if (!fs_info->quota_enabled || !is_fstree(ref_root))
>>> +               return 0;
>>> +
>>> +       delayed_refs = &trans->transaction->delayed_refs;
>>> +
>>> +       spin_lock(&delayed_refs->lock);
>>> +       ref_head = find_ref_head(&delayed_refs->href_root, bytenr, 0);
>>> +       if (!ref_head) {
>>> +               ret = -ENOENT;
>>> +               goto out;
>>> +       }
>>
>>
>> Hi Qu,
>>
>> So while running btrfs/063, with qgroups enabled (I modified the test
>> to enable qgroups), ran into this 2 times:
>>
>> [169125.246506] BTRFS info (device sdc): disk space caching is enabled
>> [169125.363164] ------------[ cut here ]------------
>> [169125.365236] WARNING: CPU: 10 PID: 2827 at fs/btrfs/inode.c:2929
>> btrfs_finish_ordered_io+0x347/0x4eb [btrfs]()
>> [169125.367702] BTRFS: Transaction aborted (error -2)
>> [169125.368830] Modules linked in: btrfs dm_flakey dm_mod
>> crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs
>> lockd grace fscache sunrpc loop fuse parport_pc parport i2c_piix4
>> psmouse acpi_cpufreq microcode pcspkr processor evdev i2c_core
>> serio_raw button ext4 crc16 jbd2 mbcache sd_mod sg sr_mod cdrom
>> ata_generic virtio_scsi ata_piix libata floppy virtio_pci virtio_ring
>> scsi_mod e1000 virtio [last unloaded: btrfs]
>> [169125.376755] CPU: 10 PID: 2827 Comm: kworker/u32:14 Tainted: G
>>    W       4.3.0-rc5-btrfs-next-17+ #1
>
>
> Hi Filipe,

Hi Qu,

>
> Although not related to the bug report, I'm a little interested in your
> testing kernel.
>
> Are you testing integration-4.4 from Chris repo?

Yes, I got that from Chris' integration-4.4 branch.

> Or 4.3-rc from mainline repo with my qgroup reserve patchset applied?
>
> Although integration-4.4 already merged qgroup reserve patchset, but it's
> causing some strange bug like over decrease data sinfo->bytes_may_use,
> mainly in generic/127 testcase.

Haven't hit that one yet.

>
> But if qgroup reserve patchset is rebased to integration-4.3 (I did all my
> old tests based on that), no generic/127 problem at all.
>
> Thanks,
> Qu
>
>
>> [169125.378522] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org
>> 04/01/2014
>> [169125.380916] Workqueue: btrfs-endio-write btrfs_endio_write_helper
>> [btrfs]
>> [169125.382167]  0000000000000000 ffff88007ef2bc28 ffffffff812566f4
>> ffff88007ef2bc70
>> [169125.383643]  ffff88007ef2bc60 ffffffff8104d0a6 ffffffffa03cac33
>> ffff8801f5ca6db0
>> [169125.385197]  ffff8802c6c7ee98 ffff880122bc1000 00000000fffffffe
>> ffff88007ef2bcc8
>> [169125.386691] Call Trace:
>> [169125.387194]  [<ffffffff812566f4>] dump_stack+0x4e/0x79
>> [169125.388205]  [<ffffffff8104d0a6>] warn_slowpath_common+0x9f/0xb8
>> [169125.389386]  [<ffffffffa03cac33>] ?
>> btrfs_finish_ordered_io+0x347/0x4eb [btrfs]
>> [169125.390837]  [<ffffffff8104d107>] warn_slowpath_fmt+0x48/0x50
>> [169125.391839]  [<ffffffffa03d67bb>] ? unpin_extent_cache+0xbe/0xcc
>> [btrfs]
>> [169125.392973]  [<ffffffffa03cac33>]
>> btrfs_finish_ordered_io+0x347/0x4eb [btrfs]
>> [169125.395714]  [<ffffffff8147c612>] ?
>> _raw_spin_unlock_irqrestore+0x38/0x60
>> [169125.396888]  [<ffffffff81087d0b>] ?
>> trace_hardirqs_off_caller+0x1f/0xb9
>> [169125.397986]  [<ffffffffa03cadec>] finish_ordered_fn+0x15/0x17 [btrfs]
>> [169125.399122]  [<ffffffffa03ec706>] normal_work_helper+0x14c/0x32a
>> [btrfs]
>> [169125.400300]  [<ffffffffa03ec9e6>] btrfs_endio_write_helper+0x12/0x14
>> [btrfs]
>> [169125.401450]  [<ffffffff81063b23>] process_one_work+0x24a/0x4ac
>> [169125.402631]  [<ffffffff81064285>] worker_thread+0x206/0x2c2
>> [169125.403622]  [<ffffffff8106407f>] ? rescuer_thread+0x2cb/0x2cb
>> [169125.404693]  [<ffffffff8106904d>] kthread+0xef/0xf7
>> [169125.405727]  [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24
>> [169125.406808]  [<ffffffff8147d10f>] ret_from_fork+0x3f/0x70
>> [169125.407834]  [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24
>> [169125.408840] ---[ end trace 6ee4342a5722b119 ]---
>> [169125.409654] BTRFS: error (device sdc) in
>> btrfs_finish_ordered_io:2929: errno=-2 No such entry
>>
>> So what you have here is racy:
>>
>> btrfs_finish_ordered_io()
>>     joins existing transaction (or starts a new one)
>>     insert_reserved_file_extent()
>>        btrfs_alloc_reserved_file_extent() --> creates delayed ref
>>
>>        ******* delayed refs are run, someone called
>> btrfs_async_run_delayed_refs() from btrfs_end_transaction(), ref head
>> is removed ******
>>
>>        btrfs_add_delayed_qgroup_reserve() --> does not find delayed ref
>> head, returns -ENOENT and finish_ordered_io aborts current
>> transaction...
>>
>> A very tiny race, but...
>>
>> thanks
>>
>>
>>> +       WARN_ON(ref_head->qgroup_reserved || ref_head->qgroup_ref_root);
>>> +       ref_head->qgroup_ref_root = ref_root;
>>> +       ref_head->qgroup_reserved = num_bytes;
>>> +out:
>>> +       spin_unlock(&delayed_refs->lock);
>>> +       return ret;
>>> +}
>>> +
>>>   int btrfs_add_delayed_extent_op(struct btrfs_fs_info *fs_info,
>>>                                  struct btrfs_trans_handle *trans,
>>>                                  u64 bytenr, u64 num_bytes,
>>> diff --git a/fs/btrfs/delayed-ref.h b/fs/btrfs/delayed-ref.h
>>> index 13fb5e6..d4c41e2 100644
>>> --- a/fs/btrfs/delayed-ref.h
>>> +++ b/fs/btrfs/delayed-ref.h
>>> @@ -113,6 +113,17 @@ struct btrfs_delayed_ref_head {
>>>          int total_ref_mod;
>>>
>>>          /*
>>> +        * For qgroup reserved space freeing.
>>> +        *
>>> +        * ref_root and reserved will be recorded after
>>> +        * BTRFS_ADD_DELAYED_EXTENT is called.
>>> +        * And will be used to free reserved qgroup space at
>>> +        * run_delayed_refs() time.
>>> +        */
>>> +       u64 qgroup_ref_root;
>>> +       u64 qgroup_reserved;
>>> +
>>> +       /*
>>>           * when a new extent is allocated, it is just reserved in memory
>>>           * The actual extent isn't inserted into the extent allocation
>>> tree
>>>           * until the delayed ref is processed.  must_insert_reserved is
>>> @@ -242,6 +253,9 @@ int btrfs_add_delayed_data_ref(struct btrfs_fs_info
>>> *fs_info,
>>>                                 u64 owner, u64 offset, int action,
>>>                                 struct btrfs_delayed_extent_op
>>> *extent_op,
>>>                                 int no_quota);
>>> +int btrfs_add_delayed_qgroup_reserve(struct btrfs_fs_info *fs_info,
>>> +                                    struct btrfs_trans_handle *trans,
>>> +                                    u64 ref_root, u64 bytenr, u64
>>> num_bytes);
>>>   int btrfs_add_delayed_extent_op(struct btrfs_fs_info *fs_info,
>>>                                  struct btrfs_trans_handle *trans,
>>>                                  u64 bytenr, u64 num_bytes,
>>> --
>>> 2.6.1
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>>
>>
>



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to