[
https://issues.apache.org/jira/browse/ACCUMULO-416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Christopher Tubbs updated ACCUMULO-416:
---------------------------------------
Fix Version/s: 1.5.0
> reevaluate limiting the number of open files given HDFS improvements
> --------------------------------------------------------------------
>
> Key: ACCUMULO-416
> URL: https://issues.apache.org/jira/browse/ACCUMULO-416
> Project: Accumulo
> Issue Type: Improvement
> Components: tserver
> Reporter: Adam Fuchs
> Assignee: Keith Turner
> Fix For: 1.5.0
>
>
> Tablet servers limit the number of files that can be opened for scans and for
> major compactions. The two main reasons for this limit were to reduce our
> impact on HDFS, primarily regarding connections to data nodes, and to limit
> our memory usage related to preloading file indexes. A third reason might be
> that disk thrashing could become a problem if we try to read from too many
> places at once.
> Two improvements may have made (or may soon make) this limit obsolete: HDFS
> now pools connections, and RFile now uses a multi-level index. With these
> improvements, is it reasonable to lift some of our open file restrictions?
> The tradeoff on the query side might be availability vs. overall resource usage.
> On the compaction side, the tradeoff is probably write replication vs.
> thrashing on reads. I think we can make an argument that queries should be
> available at almost any cost, but the compaction tradeoff is not as clear. We
> should test the efficiency of compacting a large number of files to get a
> better feeling for how the two extremes affect read and write performance
> across the system.
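
To make the compaction tradeoff concrete, here is a rough sketch of the write side only. It assumes a deliberately simplified model that is not taken from the Accumulo code: all files are the same size, and a full compaction is done in passes, each pass merging groups of at most maxOpen files. The class and method names are hypothetical. Under that model every byte is rewritten at most once per pass, so the pass count approximates how often data is re-written (and re-replicated by HDFS) for a given open file limit.

```java
public class CompactionPasses {

    /**
     * Number of merge passes needed to reduce fileCount files to a single
     * file when at most maxOpen files may be merged at once.
     * Simplified model (an assumption, not Accumulo's actual algorithm):
     * each pass merges the remaining files in groups of up to maxOpen.
     */
    static int passes(int fileCount, int maxOpen) {
        if (fileCount < 1 || maxOpen < 2) {
            throw new IllegalArgumentException("need fileCount >= 1 and maxOpen >= 2");
        }
        int passes = 0;
        int remaining = fileCount;
        while (remaining > 1) {
            // Ceiling division: number of output files after this pass.
            remaining = (remaining + maxOpen - 1) / maxOpen;
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        int fileCount = 1000; // hypothetical number of files in a tablet
        for (int maxOpen : new int[] {10, 100, 1000}) {
            System.out.println("maxOpen=" + maxOpen
                + " -> passes=" + passes(fileCount, maxOpen));
        }
    }
}
```

In this model, raising the limit trades fewer rewrites of the data against more files being read concurrently in each merge. The read-side cost (seeks, thrashing) is not modeled here and is what the proposed testing would need to measure.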