[ https://issues.apache.org/jira/browse/HBASE-18971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang resolved HBASE-18971.
-------------------------------
    Resolution: Duplicate

Duplicated by HBASE-19358.

> Limit the concurrent opened wal writers when splitting
> ------------------------------------------------------
>
>                 Key: HBASE-18971
>                 URL: https://issues.apache.org/jira/browse/HBASE-18971
>             Project: HBase
>          Issue Type: Improvement
>          Components: Recovery, wal
>            Reporter: Duo Zhang
>
> Under the current architecture, a whole-cluster restart can easily fail
> if there are many regions on a single region server.
> On a small cluster, although a recovered edits file is very small, the NN
> reserves a full block for it when the file is opened, so the cluster can
> easily run out of space.
> On a large cluster, although the max xceiver count is already 4096, it is
> still easy to exhaust that quota and have the DN reject our requests when
> there are 1k+ regions on a single RS, since we write 3 copies of each block.
> Under the current architecture we need to choose
> 'hbase.regionserver.wal.max.splitters' and
> 'hbase.master.executor.serverops.threads' carefully to limit the concurrency
> of the wal splitter. But this is only a compromise, as it also slows down
> failure recovery.
> So here we want to limit the number of concurrently opened wal writers when
> splitting. It may work like a memstore, which buffers the wal entries in
> memory and flushes some entries out when it is full.
> Suggestions are welcome.
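
The memstore-like idea in the description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the approach taken in HBASE-19358 and not HBase code: the class BoundedSplitBuffer, the RegionSink interface, and the byte limit are all hypothetical names invented for the example. Entries are buffered per region, and an output is touched only while one region's buffer is being flushed, so the number of writers open at any moment is bounded by the number of concurrent flushes (one here) rather than by the number of regions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Sketch only: buffers WAL entries per region in memory and hands a region's
 * entries to a sink when the total buffered size exceeds a limit.
 * All names are hypothetical; none of this is an HBase API.
 */
class BoundedSplitBuffer {
    /** Hypothetical stand-in for "open a recovered edits writer, write, close". */
    interface RegionSink {
        void write(String region, List<byte[]> entries);
    }

    private final long maxBufferedBytes;
    private final RegionSink sink;
    private final Map<String, List<byte[]>> buffers = new HashMap<>();
    private final Map<String, Long> sizes = new HashMap<>();
    private long bufferedBytes = 0;

    BoundedSplitBuffer(long maxBufferedBytes, RegionSink sink) {
        if (maxBufferedBytes < 0) {
            throw new IllegalArgumentException("maxBufferedBytes must be >= 0");
        }
        this.maxBufferedBytes = maxBufferedBytes;
        this.sink = sink;
    }

    /** Buffers one entry; flushes the largest region buffers while over the limit. */
    void append(String region, byte[] entry) {
        buffers.computeIfAbsent(region, r -> new ArrayList<>()).add(entry);
        sizes.merge(region, (long) entry.length, Long::sum);
        bufferedBytes += entry.length;
        while (bufferedBytes > maxBufferedBytes) {
            flush(largestRegion());
        }
    }

    /** Flushes everything that is still buffered, e.g. at the end of a split. */
    void close() {
        for (String region : new ArrayList<>(buffers.keySet())) {
            flush(region);
        }
    }

    long bufferedBytes() {
        return bufferedBytes;
    }

    /** Picks the region holding the most buffered bytes. */
    private String largestRegion() {
        String victim = null;
        long largest = -1;
        for (Map.Entry<String, Long> e : sizes.entrySet()) {
            if (e.getValue() > largest) {
                largest = e.getValue();
                victim = e.getKey();
            }
        }
        return victim;
    }

    /** Writes one region's buffered entries to the sink and releases the memory. */
    private void flush(String region) {
        List<byte[]> entries = buffers.remove(region);
        long size = sizes.remove(region);
        sink.write(region, entries);
        bufferedBytes -= size;
    }
}
```

Flushing the largest buffer first is one possible policy, chosen here because it frees the most memory per writer opened. A real implementation would also need to append to an existing output or create a new file when the same region is flushed more than once, which this sketch leaves to the sink.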