[ https://issues.apache.org/jira/browse/HBASE-6774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicolas Liochon updated HBASE-6774:
-----------------------------------

    Assignee: Himanshu Vashishtha

> Immediate assignment of regions that don't have entries in HLog
> ---------------------------------------------------------------
>
>                 Key: HBASE-6774
>                 URL: https://issues.apache.org/jira/browse/HBASE-6774
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, regionserver
>    Affects Versions: 0.95.2
>            Reporter: Nicolas Liochon
>            Assignee: Himanshu Vashishtha
>
> Today, after a failure is detected, the algorithm is:
> - split the logs
> - when all the logs are split, assign the regions
> But some regions can have no entries at all in the HLog. There are many
> reasons for this:
> - reference or historical tables: bulk-written occasionally, then read
> only.
> - sequential rowkeys. In this case, most of the regions will be read only,
> but they can be on a regionserver with a lot of writes.
> - tables flushed often for safety reasons. I'm thinking about meta here.
> For meta, we can imagine flushing very often. Hence the recovery time for
> meta will, in many cases, be just the failure detection time.
> There are different possible algos:
> Option 1)
> A new task is added, in parallel with the split. This task reads all the
> HLog files. If there is no entry for a region, this region is assigned.
> Pro: simple
> Cons: we need to read all the files, which adds a read.
> Option 2)
> The master writes in ZK the number of log files, per region.
> When the regionserver starts the split, it reads the full block (64M) and
> decreases the log file counter of the region. If it reaches 0, the
> assignment starts. At the end of its split, the region server decreases
> the counter as well. This allows the assignment to start even if not all
> the HLog files are finished. It would also make some regions available
> even if we have an issue in one of the log files.
> Pro: parallel
> Cons: adds work for the region server. Requires reading the whole file
> before starting to write.
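
The per-region counter of Option 2 could be sketched roughly as follows. This is only an illustration under stated assumptions, not HBase code: the class `RegionLogCounters` and its methods are hypothetical, a plain dictionary stands in for the counters the master would write to ZK, and the sketch assumes that a region with no edits in a log file is decremented as soon as that file has been read, while a region with edits is decremented once the split of that file is finished.

```python
class RegionLogCounters:
    """Hypothetical in-memory stand-in for the per-region counters kept in ZK."""

    def __init__(self, regions, num_log_files):
        # The master records, for every region, how many log files are pending.
        self.pending = {region: num_log_files for region in regions}
        # Regions whose counter reached 0, in the order they became assignable.
        self.assignable = []

    def _done_with_file(self, region):
        self.pending[region] -= 1
        if self.pending[region] == 0:
            self.assignable.append(region)

    def file_read(self, regions_with_edits):
        # Assumption: regions with no edits in this file need nothing from it,
        # so their counter can drop right after the file has been read.
        for region in sorted(self.pending):
            if region not in regions_with_edits:
                self._done_with_file(region)

    def file_split(self, regions_with_edits):
        # Regions that had edits in this file are done with it only after the split.
        for region in sorted(regions_with_edits):
            self._done_with_file(region)


# Two log files: the first has edits for r1 only, the second for r1 and r2.
counters = RegionLogCounters(["r1", "r2", "r3"], num_log_files=2)
counters.file_read({"r1"})
counters.file_read({"r1", "r2"})
assignable_before_split = list(counters.assignable)
counters.file_split({"r1"})
counters.file_split({"r1", "r2"})
```

In this example `r3`, which has no edits in either file, becomes assignable before any split completes. `r1` and `r2` follow once the files that contain their edits have been split.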
> Option 3)
> Add some metadata at the end of the log file. The last log file won't have
> metadata, because if we are recovering, it is because the server crashed.
> But the others will. And the last log file should be smaller (half a block
> on average).
> Option 4)
> Still some metadata, but in a different file.
> Cons: writes are increased (but not that much, we just need to write the
> region once).
> Pros: if we lose the HLog files (major failure, no replica available) we
> can still continue with the regions that were not written at this stage.
> I think it should be done, even if none of the algorithms above is totally
> convincing yet. It's linked as well to locality and short-circuit reads:
> with these two, reading the file twice becomes much less of an issue, for
> example. My current preference would be to open the file twice in the
> region server: once for splitting, as today, and once for a quick read
> looking for unused regions. Who knows, maybe it would even be faster this
> way, as the quick-read thread would warm up the different caches for the
> splitting thread.
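
The "quick read looking for unused regions" mentioned in the description (and the extra task of Option 1) amounts to a set difference between the regions the failed server hosted and the regions that appear in its logs. Below is a minimal sketch of that idea, not the HBase implementation: the function name `regions_without_edits` is hypothetical, and each log file is modeled as a simple list of `(region, edit)` pairs rather than a real HLog reader.

```python
def regions_without_edits(hosted_regions, log_files):
    """Return the hosted regions that have no entry in any of the log files.

    hosted_regions: names of the regions the failed server was hosting.
    log_files: iterable of log files, each modeled here as a list of
               (region_name, edit) pairs (an assumption of this sketch).
    """
    seen = set()
    for entries in log_files:
        for region, _edit in entries:
            seen.add(region)
    # Regions never seen in any log need no replay and could be assigned at once.
    return sorted(set(hosted_regions) - seen)


hosted = ["meta", "t1,a", "t1,b"]
logs = [
    [("t1,b", "put-1")],
    [("t1,b", "put-2"), ("t1,b", "put-3")],
]
immediately_assignable = regions_without_edits(hosted, logs)
```

Here only `t1,b` has entries in the logs, so `meta` and `t1,a` would not have to wait for the split.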