Hi Tejun,
On 18/2/23 22:23, Tejun Heo wrote:
> Hello,
>
> On Fri, Feb 23, 2018 at 09:56:54AM +0800, xuejiufei wrote:
>>> On Thu, Feb 22, 2018 at 02:14:34PM +0800, Joseph Qi wrote:
>>>> I still don't get how css_tryget can work here.
>>>>
>>>> The race happens when:
>>>> 1) writeback kworker has found the blkg with rcu;
>>>> 2) blkcg is during offlining and blkg_destroy() has already been called.
>>>> Then, writeback kworker will take queue lock and access the blkg with
>>>> refcount 0.
>>>
>>> Yeah, then tryget would fail and it should go through the root.
>>>
>> In this race, the refcount of blkg becomes zero and is destroyed.
>> However css may still have refcount, and css_tryget can return success
>> before other callers put the refcount.
>> So I don't get how css_tryget can fix this race? Or I wonder if we can
>> add another function blkg_tryget?
>
> IIRC, as long as the blkcg and the device are there, the blkgs aren't
> gonna be destroyed. So, if you have a ref to the blkcg through
> tryget, the blkg shouldn't go away.
>
Maybe we have misunderstanding here.
In this case, blkg doesn't go away as we have rcu protect, but
blkg_destroy() can be called, in which blkg_put() will put the last
refcnt and then schedule __blkg_release_rcu().
css refcnt can't prevent blkcg css from offlining, instead it is css
online_cnt.
css_tryget() will only get a refcnt of blkcg css, but can't be
guaranteed to fail when css is confirmed to kill.
The race sequence:
writeback kworker cgroup_rmdir
cgroup_destroy_locked
kill_css
css_killed_ref_fn
css_killed_work_fn
offline_css
blkcg_css_offline
blkcg_bio_issue_check
rcu_read_lock
blkg_lookup
spin_trylock(q->queue_lock)
blkg_destroy
spin_unlock(q->queue_lock)
blk_throtl_bio
spin_lock_irq(q->queue_lock)
spin_unlock_irq(q->queue_lock)
rcu_read_unlock
Thanks,
Joseph