[ https://issues.apache.org/jira/browse/HBASE-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268413#comment-14268413 ]
Cosmin Lehene commented on HBASE-6351:
--------------------------------------

[~te...@apache.org], [~apurtell] perhaps this is worth a refresh?

> Stop compactions from polluting OS FS cache
> --------------------------------------------
>
>                 Key: HBASE-6351
>                 URL: https://issues.apache.org/jira/browse/HBASE-6351
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>            Reporter: Ted Yu
>
> The following came from Otis via http://search-hadoop.com/m/MGVqgZJ4Mj2 :
> Lucene 4.0.0-Alpha was recently released. Mike McCandless, one of the Lucene
> developers, wrote a really nice post about new things in this version of
> Lucene. The part that I think is interesting for HBase, and that HBase devs
> may want to look at (and borrow to use with compactions) is this:
>
> Reducing merge IO impact
>
> Merging (consolidating many small segments into a single big one) is a very
> IO and CPU intensive operation which can easily interfere with ongoing
> searches. In 4.0.0 we now have two ways to reduce this impact:
> * Rate-limit the IO caused by ongoing merging, by calling
> FSDirectory.setMaxMergeWriteMBPerSec.
> * Use the new NativeUnixDirectory which bypasses the OS's IO cache
> for all merge IO, by using direct IO. This ensures that a merge won't evict
> hot pages used by searches. (Note that there is also a native
> WindowsDirectory, but it does not yet use direct IO during merging... patches
> welcome!).
>
> Remember to also set swappiness to 0 on Linux if you want to maximize search
> responsiveness.
>
> More generally, the APIs that open an input or output file
> (Directory.openInput and Directory.createOutput) now take an IOContext
> describing what's being done (e.g., flush vs merge), so you can create a
> custom Directory that changes its behavior depending on the context.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
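
For anyone refreshing this issue, the first option quoted above (rate-limiting the write IO of a merge or compaction) can be illustrated with a small, language-neutral sketch. This is not Lucene's or HBase's implementation: `WriteRateLimiter`, `throttled_copy`, and their parameters are hypothetical names chosen for illustration, and the approach shown (sleep whenever cumulative bytes written run ahead of a MB/s budget) is only one simple way to cap throughput. It does not address the page-cache side of the problem, which the quoted text handles with direct IO.

```python
import time


class WriteRateLimiter:
    """Caps average write throughput by sleeping after each chunk.

    Hypothetical helper for illustration; not a Lucene or HBase API.
    """

    def __init__(self, max_mb_per_sec, clock=time.monotonic, sleep=time.sleep):
        if max_mb_per_sec <= 0:
            raise ValueError("max_mb_per_sec must be positive")
        self.bytes_per_sec = max_mb_per_sec * 1024 * 1024
        # clock and sleep are injectable so the limiter can be tested
        # without real waiting.
        self._clock = clock
        self._sleep = sleep
        self._start = clock()
        self._written = 0

    def acquire(self, num_bytes):
        """Record num_bytes as written and sleep if ahead of the budget.

        Returns the number of seconds slept.
        """
        self._written += num_bytes
        # Earliest moment at which this many bytes may have been written.
        earliest = self._start + self._written / self.bytes_per_sec
        delay = earliest - self._clock()
        if delay > 0:
            self._sleep(delay)
            return delay
        return 0.0


def throttled_copy(src, dst, limiter, chunk_size=64 * 1024):
    """Copy src to dst in chunks, throttled by limiter. Returns bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        limiter.acquire(len(chunk))
        total += len(chunk)
    return total
```

In a compaction-like setting, `src` would stand in for the files being merged and `dst` for the new output file; foreground reads would then compete with at most the configured write rate rather than with an unthrottled bulk write.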