[ https://issues.apache.org/jira/browse/HBASE-22976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Stack updated HBASE-22976:
----------------------------------
    Release Note:
WALPlayer can replay the content of recovered.edits directories.

A side effect is that the WAL filename timestamp is now factored in when setting start/end times for WALInputFormat; i.e. the wal.start.time and wal.end.time values on a job context. Previously we looked at wal.end.time only. Now we consider wal.start.time too. If a file has a name outside of wal.start.time<->wal.end.time, it'll be bypassed. This change in behavior will make it easier for operators crafting timestamp filters when processing WALs.

  was:
WALPlayer can replay the content of recovered.edits directories.


> [HBCK2] Add RecoveredEditsPlayer
> --------------------------------
>
>                 Key: HBASE-22976
>                 URL: https://issues.apache.org/jira/browse/HBASE-22976
>             Project: HBase
>          Issue Type: Sub-task
>          Components: hbck2, walplayer
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.3, 2.4.0
>
>         Attachments: 22976.txt
>
>
> We need a recovered edits player. Messing w/ the 'adoption service' --
> tooling to adopt orphan regions and hfiles -- I've been manufacturing damaged
> clusters by moving stuff around under the running cluster. No reason to think
> that an hbase couldn't lose accounting of a whole region in a cataclysm. If
> so, the region will have stuff like the '.regioninfo', dirs per column family
> w/ store files, but it could also have a 'recovered_edits' directory with
> content in it. We have a WALPlayer for errant WALs. We have the FSHLog tool
> which can read recovered_edits content for debugging data loss. Missing is a
> RecoveredEditsPlayer.
> I took a look at extending the WALPlayer since it has a bunch of nice options
> and it can run in bulk. Ideally, it would just digest recovered edits content
> if passed an option or recovered edits directories. On first glance, it
> didn't seem like an easy integration.... Would be worth taking a look again.
> Would be good if we could avoid making a new, distinct tool, just for
> Recovered Edits.
> The bulkload tool expects hfiles in column family directories. Recovered
> edits files are not hfiles and the files are x-columnfamily, so this is not
> the way to go, though a bulkload-like tool that moved the recovered edits
> files under the appropriate region dir and asked the region to reopen would
> be a possibility (would need the bulk load complete trick of splitting input
> if the region boundaries in the live cluster do not align w/ those of the
> errant recovered edits files).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
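
The filename-timestamp filtering described in the release note can be pictured with a rough sketch. This is not the WALInputFormat code; it is a small Python illustration under stated assumptions: that a WAL file name ends in ".<epoch-millis>" (optionally followed by ".meta"), that both bounds are inclusive, and that names with no parseable timestamp are left unfiltered. The helper names wal_timestamp and select_wals are hypothetical.

```python
import re
from typing import Iterable, List, Optional

# Assumption: a WAL file name ends in ".<epoch-millis>", optionally followed
# by ".meta". Names that do not match yield no timestamp.
_TIMESTAMP_SUFFIX = re.compile(r"\.(\d+)(?:\.meta)?$")


def wal_timestamp(name: str) -> Optional[int]:
    """Return the timestamp embedded in a WAL file name, or None if absent."""
    match = _TIMESTAMP_SUFFIX.search(name)
    return int(match.group(1)) if match else None


def select_wals(names: Iterable[str], start_time: int, end_time: int) -> List[str]:
    """Keep files whose name timestamp lies in [start_time, end_time].

    start_time and end_time stand in for the wal.start.time and wal.end.time
    values on the job context. Inclusive bounds are an assumption.
    """
    kept = []
    for name in names:
        ts = wal_timestamp(name)
        if ts is None:
            # Assumption: a name without a parseable timestamp is not filtered.
            kept.append(name)
        elif start_time <= ts <= end_time:
            kept.append(name)
        # Otherwise the file is bypassed: its timestamp is outside the window.
    return kept


if __name__ == "__main__":
    candidates = [
        "rs1%2C16020%2C1.100",
        "rs1%2C16020%2C1.200",
        "rs1%2C16020%2C1.300.meta",
        "0000000000000000042",
    ]
    print(select_wals(candidates, 150, 300))
```

Under the earlier behavior described in the note, only the end bound would have been checked, so the first candidate (timestamp 100) would have been kept as well.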