On Fri, Sep 14, 2012 at 07:25:45PM +0800, Liu Bo wrote:
> On 09/14/2012 07:15 PM, David Sterba wrote:
> > On Fri, Sep 14, 2012 at 04:58:04PM +0800, Liu Bo wrote:
> >> --- a/fs/btrfs/transaction.c
> >> +++ b/fs/btrfs/transaction.c
> >> @@ -306,9 +306,17 @@ static struct btrfs_trans_handle 
> >> *start_transaction(struct btrfs_root *root,
> >>            WARN_ON(type != TRANS_JOIN && type != TRANS_JOIN_NOLOCK &&
> >>                    type != TRANS_JOIN_ONLY);
> >>            h = current->journal_info;
> >> -          h->use_count++;
> >> -          h->orig_rsv = h->block_rsv;
> >> +          if (h->block_rsv) {
> >> +                  struct btrfs_trans_rsv_item *item;
> >> +                  item = kmalloc(sizeof(*item), GFP_NOFS);
> > 
> > I'd rather avoid the kmalloc here and add a list hook into
> > btrfs_block_rsv itself (used only for this purpose).
> > 
> > It also does not increase the failure surface and we don't have to
> > handle error conditions from deep callchains.
> > 
> 
> Actually I placed a list hook at first, but I found the same block_rsv could 
> be inserted into
> the list chain twice, which will cause list_head's terrible warnings.

Is it expected and correct to add the rsv twice? I know the transaction
joins and commits are sometimes wildly nested so it does not mean that's
necessarily a bug. Then it is not possible to embed the list hook, we
need to keep the full track of the blk_rsv stack.

I'm counting two other similar structs used inside btrfs
(list_head + 8B value), we could make a shared separate slab for them.

struct delayed_iput {
        struct list_head           list;                 /*     0    16 */
        struct inode *             inode;                /*    16     8 */

        /* size: 24, cachelines: 1, members: 2 */
        /* last cacheline: 24 bytes */
};

truct seq_list {
        struct list_head           list;                 /*     0    16 */
        u64                        seq;                  /*    16     8 */

        /* size: 24, cachelines: 1, members: 2 */
        /* last cacheline: 24 bytes */
};

seq_list is never allocated, only embedded inside tree_mod_log structures.

delayed_iput is allocated on every add_delayed_iput and freed quite
frequently (once per-commit at least), so this is a short-lived object
as well as the proposed btrfs_trans_rsv_item. IMO this justifies using
the slab. Does this sound good to you?


david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to