On Tue, Apr 02, 2019 at 05:45:54PM +0800, Qu Wenruo wrote: > There are two tree lock events which can sleep: > - btrfs_tree_read_lock() > - btrfs_tree_lock() > > Sometimes we may need to look into the concurrency picture of the fs. > For that case, we need the execution time of above two functions and the > owner of @eb. > > Here we introduce a trace events for user space tools like bcc, to get > the execution time of above two functions, and get detailed owner info > where eBPF code can't. > > All the overhead is hidden behind the trace events, so if events are not > enabled, there is no overhead. > > Also, since this patch and later user space tool only cares about the > execution time and owner, other info like bytenr is ignored in this > events.
The time is definetelly the interesting part here, I'm thinking if the block number could also provide some useful hints. The gathered trace data can be used to group by blocks to compare the times, or see the histogram of times. The block at minimum would be useful and I don't want to add more information there, like level or nritems. In case we do want that information, a heavier tracepoint version should be added with all the other data. > Signed-off-by: Qu Wenruo <w...@suse.com> > --- > Changelog: > v2: > - Hide all ktime_get_ns() call behind trace events > So there will be no overhead if trace events are not enabled. > --- > fs/btrfs/locking.c | 12 ++++++++++++ > include/trace/events/btrfs.h | 36 ++++++++++++++++++++++++++++++++++++ > 2 files changed, 48 insertions(+) > > diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c > index 6df03ba36026..67b77f1d113e 100644 > --- a/fs/btrfs/locking.c > +++ b/fs/btrfs/locking.c > @@ -158,6 +158,10 @@ void btrfs_clear_lock_blocking_write(struct > extent_buffer *eb) > */ > void btrfs_tree_read_lock(struct extent_buffer *eb) > { > + u64 start_ns = 0; > + > + if (trace_btrfs_tree_read_lock_enabled()) > + start_ns = ktime_get_ns(); > again: > BUG_ON(!atomic_read(&eb->blocking_writers) && > current->pid == eb->lock_owner); > @@ -174,6 +178,7 @@ void btrfs_tree_read_lock(struct extent_buffer *eb) > BUG_ON(eb->lock_nested); > eb->lock_nested = true; > read_unlock(&eb->lock); > + trace_btrfs_tree_read_lock(eb, start_ns); > return; > } > if (atomic_read(&eb->blocking_writers)) { > @@ -184,6 +189,7 @@ void btrfs_tree_read_lock(struct extent_buffer *eb) > } > btrfs_assert_tree_read_locks_get(eb); > btrfs_assert_spinning_readers_get(eb); > + trace_btrfs_tree_read_lock(eb, start_ns); > } > > /* > @@ -299,6 +305,11 @@ void btrfs_tree_read_unlock_blocking(struct > extent_buffer *eb) > */ > void btrfs_tree_lock(struct extent_buffer *eb) > { > + u64 start_ns = 0; > + > + if (trace_btrfs_tree_lock_enabled()) > + start_ns = ktime_get_ns(); > + > WARN_ON(eb->lock_owner == current->pid); > again: > wait_event(eb->read_lock_wq, atomic_read(&eb->blocking_readers) == 0); > @@ -312,6 +323,7 @@ void btrfs_tree_lock(struct extent_buffer *eb) > btrfs_assert_spinning_writers_get(eb); > btrfs_assert_tree_write_locks_get(eb); > eb->lock_owner = current->pid; > + trace_btrfs_tree_lock(eb, start_ns); > } > > /* > diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h > index 8b12753fee78..406a71571254 100644 > --- a/include/trace/events/btrfs.h > +++ b/include/trace/events/btrfs.h > @@ -2005,6 +2005,42 @@ TRACE_EVENT(btrfs_convert_extent_bit, > __print_flags(__entry->clear_bits, "|", EXTENT_FLAGS)) > ); > > +DECLARE_EVENT_CLASS(btrfs_sleep_tree_lock, > + TP_PROTO(const struct extent_buffer *eb, u64 start_ns), > + > + TP_ARGS(eb, start_ns), > + > + TP_STRUCT__entry_btrfs( > + __field( u64, start_ns ) > + __field( u64, end_ns ) > + __field( u64, owner ) > + __field( int, is_log_tree ) > + ), > + > + TP_fast_assign_btrfs(eb->fs_info, > + __entry->start_ns = start_ns; > + __entry->end_ns = ktime_get_ns(); > + __entry->owner = btrfs_header_owner(eb); > + __entry->is_log_tree = (eb->log_index >= 0); > + ), > + > + TP_printk_btrfs("start_ns=%llu end_ns=%llu owner=%llu is_log_tree=%d", > + __entry->start_ns, __entry->end_ns, __entry->owner, > + __entry->is_log_tree) I think the delta between start_ns and end_ns should be printed too, so it's obvious without further processing: TP_printk_btrfs("start_ns=%llu end_ns=%llu diff=%llu owner=%llu is_log_tree=%d",