[
https://issues.apache.org/jira/browse/HADOOP-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jim Kellerman updated HADOOP-1820:
----------------------------------
Status: Patch Available (was: In Progress)
TestLogRolling: wait after writing until data is flushed to disk.
TestHBaseCluster: Make lease timeout longer, increase time between client
retries.
> [hbase] regionserver creates hlogs without bound
> ------------------------------------------------
>
> Key: HADOOP-1820
> URL: https://issues.apache.org/jira/browse/HADOOP-1820
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/hbase
> Affects Versions: 0.15.0
> Reporter: stack
> Assignee: Jim Kellerman
> Fix For: 0.15.0
>
> Attachments: excerpt.log, patch.txt, patch.txt, patch.txt, patch.txt,
> patch.txt, patch.txt, patch.txt
>
>
> Regionserver keeps log of all edits for all the regions its carrying. Its
> used recoverying state if a regionserver crashes: edits that have not been
> persisted to an HStoreFile are rerun to populate memcache which in turn is
> converted to an on-filesytem HStoreFile. On a period, the log is rotated and
> a new one is opened. While the region server is up, the logs grow in number
> without bound. Only the most recent contain unpersisted edits. If the
> region server goes down clean, then its logs are cleaned up. If a region
> server crashes, as part of recovery, the logs of edits are sorted and split
> per region. Recovery would run faster if it did not have to plough through
> reams of stale edits.
> Just now, I had a host crash w/ 112 log files each of 30k plus edits each.
> We could rename the log rolling thread the log maintainer. As well as
> rolling logs, it could check for edit logs to clean. When rolled, logs could
> be marked with the sequence id of their last contained edit. The thread
> could on a period ask each hosted region for the "lowest highest" sequence id
> of all regions deployed. Once this number had crossed out that on a
> particular log, the log could be cleaned up safely.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.