[
https://issues.apache.org/jira/browse/HBASE-24380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125899#comment-17125899
]
Viraj Jasani commented on HBASE-24380:
--------------------------------------
HBASE-23037 is not directly relevant here, but since it is a logging
improvement, it is something we can backport to branch-1. It prints the WAL
file path in the log once the split is completed.
As part of this Jira, I agree with enabling status journal logging
(MonitoredTask is already being used).
I tried some minor modifications, and this is what the journal looks like
(master branch):
-------------------------------------------------------------------------------------------------
Journal Log: null at 1591274328697
Opening log file
hdfs://localhost:62321/user/vjasani/test-data/2171b716-f908-fbd4-82db-13037736881b/WALs/testioeonoutputthread,16010,1591274326862/wal.dat.7
at 1591274328715
Processed 20 edits across 2 regions cost 10335 ms; edits skipped=0;
WAL=hdfs://localhost:62321/user/vjasani/test-data/2171b716-f908-fbd4-82db-13037736881b/WALs/testioeonoutputthread,16010,1591274326862/wal.dat.7,
size=1.4 K, length=1411, corrupted=false, progress failed=true at 1591274339638
------------------------------------------------------------------------------------------------
Maybe we can also pass the MonitoredTask instance to the recovered edits
writers and capture the recovered edits writer path in the journal log.
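A rough sketch of that idea, to show the shape of the output rather than the
actual change. SplitJournalSketch is a hypothetical stand-in and not the HBase
MonitoredTask API; the writer path in the usage note is made up, and
timestamps are passed in explicitly only so the sketch is easy to test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a task status journal; NOT the HBase MonitoredTask API.
class SplitJournalSketch {
  // One recorded status message and the time (ms) at which it was set.
  static final class Entry {
    final String status;
    final long timeMs;

    Entry(String status, long timeMs) {
      this.status = status;
      this.timeMs = timeMs;
    }
  }

  private final List<Entry> entries = new ArrayList<>();

  // Record a status message. A recovered edits writer holding a reference to
  // the journal could call this with its output path.
  void setStatus(String status, long timeMs) {
    entries.add(new Entry(status, timeMs));
  }

  // Render "<status> at <timestamp>" per entry, plus the delta in ms from the
  // previous entry for every entry after the first.
  String prettyPrint() {
    StringBuilder sb = new StringBuilder("Journal Log:");
    long prev = -1;
    for (Entry e : entries) {
      sb.append('\n').append(e.status).append(" at ").append(e.timeMs);
      if (prev >= 0) {
        sb.append(" (+").append(e.timeMs - prev).append(" ms)");
      }
      prev = e.timeMs;
    }
    return sb.toString();
  }
}
```

A writer would then do something like
journal.setStatus("Creating recovered edits writer path=" + path, now), so the
path shows up in the same journal line sequence as the "Opening log file"
entry.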
> Improve WAL splitting log lines to enable sessionization
> --------------------------------------------------------
>
> Key: HBASE-24380
> URL: https://issues.apache.org/jira/browse/HBASE-24380
> Project: HBase
> Issue Type: Improvement
> Components: logging, Operability, wal
> Reporter: Andrew Kyle Purtell
> Priority: Minor
>
> When trying to reconstruct a timeline from the write of a recovered.edits file
> back to the start of the WAL file split, with a bunch of unrelated activity in
> between, there isn't a consistent token that links split file write messages
> (which include the store path, including the region hash) to the beginning of
> WAL splitting activity. Sessionizing by host doesn't work because work can
> bounce around through retries. Thread context names in the logs vary and can
> look like [nds1-225-fra:60020-7] or [fb472085572ba72e96f1] (trailing digits of
> the region hash) or [splits-1589016325868].
> We could have WALSplitter get the current time when starting the split of a
> WAL file and have it log this timestamp in every line as a splitting session
> identifier.
> Related, we should track the time of split task execution end to end and
> export a metric that captures it.
> It might also be worthwhile to wire up more of WAL splitting to TaskMonitor
> status logging. If we do this we can also enable status journal logging, so
> when a WAL split has completed, a line will appear in the log that has the
> list of all status messages recorded during splitting and the time delta in
> milliseconds between them.
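
For the session identifier suggested in the description, a minimal sketch
could look like the following. SplitSession and its methods are hypothetical
names and not part of WALSplitter; the only idea taken from the description is
using the split start time as the token and tracking elapsed time end to end.

```java
// Hypothetical helper; not part of WALSplitter.
final class SplitSession {
  private final long startMs;
  private final String id;

  // startMs is the time at which the split of one WAL file began.
  SplitSession(long startMs) {
    this.startMs = startMs;
    this.id = "split-" + startMs;
  }

  // Prefix a log message with the session token so that lines from different
  // threads and hosts can be grouped by it.
  String tag(String message) {
    return "[" + id + "] " + message;
  }

  // End-to-end duration of the split so far, suitable for feeding a metric.
  long elapsedMs(long nowMs) {
    return nowMs - startMs;
  }
}
```

A timestamp alone might collide when two splits start in the same millisecond,
so a real change would probably want to combine it with the WAL file name.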