[ https://issues.apache.org/jira/browse/HBASE-16288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406229#comment-15406229 ]
Hudson commented on HBASE-16288:
--------------------------------

FAILURE: Integrated in HBase-0.98-matrix #380 (See [https://builds.apache.org/job/HBase-0.98-matrix/380/])
HBASE-16319 Fix TestCacheOnWrite after HBASE-16288 (apurtell: rev f8c45d755241930e2b3bb2db610e41fefb4f1aa2)
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java

> HFile intermediate block level indexes might recurse forever creating multi TB files
> ------------------------------------------------------------------------------------
>
> Key: HBASE-16288
> URL: https://issues.apache.org/jira/browse/HBASE-16288
> Project: HBase
> Issue Type: Bug
> Components: HFile
> Reporter: Enis Soztutar
> Assignee: Enis Soztutar
> Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
> Attachments: hbase-16288_v1.patch, hbase-16288_v2.patch, hbase-16288_v3.patch, hbase-16288_v4.patch
>
>
> Mighty [~elserj] was debugging an opentsdb cluster where some region
> directory ended up having 5TB+ files under <regiondir>/.tmp/
> After further debugging and analysis, we were able to reproduce the problem
> locally: we never stopped recursing in this code path for writing
> intermediate-level indices:
> {code:title=HFileBlockIndex.java}
>     if (curInlineChunk != null) {
>       while (rootChunk.getRootSize() > maxChunkSize) {
>         rootChunk = writeIntermediateLevel(out, rootChunk);
>         numLevels += 1;
>       }
>     }
> {code}
> The problem happens when a very large rowKey (larger than
> "hfile.index.block.max.size") is the first key in a block and is carried all
> the way up to the root-level index. We then keep writing the next level of
> intermediate indices, each holding that single very large key, so the root
> never shrinks below the limit. This can happen during flush, compaction, or
> region recovery, and leaves the cluster inoperable due to ever-growing files.
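
The non-termination described above can be reproduced with a small model. The sketch below is not HBase code: the class `IndexLevelSketch`, its methods, and the `minEntries` and `levelCap` parameters are hypothetical names made up for illustration, and a chunk is reduced to a list of key sizes in bytes. It assumes each new index level keeps the first key of every block below it. The `minEntries` guard is one plausible way to stop the loop and is not claimed to be what the attached patches do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class IndexLevelSketch {
    // Size of a chunk in this model: the sum of its key sizes, in bytes.
    static int rootSize(List<Integer> chunk) {
        int total = 0;
        for (int keySize : chunk) {
            total += keySize;
        }
        return total;
    }

    // Models writing one intermediate level: entries are packed into blocks
    // until a block reaches maxChunkSize, and the parent level receives the
    // first key of each block. A single oversized key yields a parent that
    // is identical to the input, so no progress is made.
    static List<Integer> writeIntermediateLevel(List<Integer> chunk, int maxChunkSize) {
        List<Integer> parent = new ArrayList<>();
        int blockBytes = 0;
        boolean blockOpen = false;
        for (int keySize : chunk) {
            if (!blockOpen) {
                parent.add(keySize); // first key of a new block goes to the parent
                blockOpen = true;
                blockBytes = 0;
            }
            blockBytes += keySize;
            if (blockBytes >= maxChunkSize) {
                blockOpen = false; // block is full, close it
            }
        }
        return parent;
    }

    // Same loop shape as the excerpt, plus two illustrative conditions:
    // minEntries (stop splitting a root with too few entries) and levelCap
    // (only here so the demo terminates when the guard is disabled).
    static int buildLevels(List<Integer> root, int maxChunkSize, int minEntries, int levelCap) {
        int numLevels = 1;
        while (rootSize(root) > maxChunkSize
                && root.size() > minEntries
                && numLevels < levelCap) {
            root = writeIntermediateLevel(root, maxChunkSize);
            numLevels += 1;
        }
        return numLevels;
    }

    public static void main(String[] args) {
        int max = 128 * 1024;
        List<Integer> oneHugeKey = Collections.singletonList(200 * 1024);
        // Guard disabled (minEntries = 0): only the demo cap stops the loop.
        System.out.println("no guard:   " + buildLevels(oneHugeKey, max, 0, 50));
        // Guard enabled (minEntries = 1): a single-entry root is never split.
        System.out.println("with guard: " + buildLevels(oneHugeKey, max, 1, 50));
    }
}
```

With the guard disabled, the single 200 KB key is copied into a new level on every iteration and the loop ends only at the artificial cap. With the guard enabled, the loop exits immediately because a one-entry root cannot be made smaller by adding a level.
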
> Seems the issue was also reported earlier, with a temporary workaround:
> https://github.com/OpenTSDB/opentsdb/issues/490

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)