[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666314#comment-13666314 ]
Dave Marion edited comment on ACCUMULO-118 at 5/24/13 2:12 PM:
---------------------------------------------------------------

Personally I am not a fan of the hash idea. I would rather see a mapping of namespace prefix to NN in the configuration (ns1 = hdfs://host:port, ns2 = hdfs://host:port). I'm thinking forward to table file load balancing across namespaces and backups (see my comment from 3/Apr/12). If, for example, you quiesced the database and performed a backup, then you could change the namespace mapping such that ns1 and ns2 point to the same hdfs://host:port if for some reason you lost the 2nd hdfs instance (it crashed, you wanted to remove it, etc). This could also allow for an upgrade of Hadoop while Accumulo is still running.

Think about the scenario where ns1 is on racks 1&2 and ns2 is on racks 3&4 of a cluster and the files of table T are spread across ns1 and ns2. You could change the configuration of the table file load balancer (new feature) so that it puts new files on ns2. You recompact the table so now all new files are on ns2. When done for all tables (and walogs), you can shut down ns1 and upgrade to a new version of Hadoop.

was (Author: dlmarion):
Personally I am not a fan of the hash idea. I would rather see a mapping of namespace prefix to NN in the configuration (ns1 = hdfs://host:port, ns2 = hdfs://host:port). I'm thinking forward to table file load balancing across namespaces and backups (see my comment from 3/Apr/12). If for example you quiesced the database and performed a backup, then you could change the namespace mapping such that ns1 and ns2 point to the same hdfs://host:port if for some reason you lost the 2nd hdfs instance (it crashed, you wanted to remove it, etc). This could also allow for of Hadoop wile Accumulo is still running. Think about the scenario where ns1 is on racks 1&2 and ns2 is on racks 3&4 of a cluster and the files of table T are spread across ns1 and ns2.
You could change the configuration of the table file load balancer (new feature) that puts new files on ns2. You recompact the table so now all new files are on ns2. When done for all tables (and walogs), then you can shutdown ns1 and upgrade to a new version of Hadoop.

> accumulo could work across HDFS instances, which would help it to scale past a single namenode
> ----------------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-118
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-118
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master, tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Blocker
>             Fix For: 1.6.0
>
>         Attachments: ACCUMULO-118-01.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.
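
For illustration only, the prefix-to-NameNode mapping proposed in the comment, together with a simple chooser for where new files go, might be sketched as below. This is not Accumulo code: NamespaceMap, choose_namespace, the hostnames nn1/nn2, port 8020, and the "ns1/tables/..." stored-path layout are all assumptions made for the example. It only shows that when files are recorded with a namespace prefix rather than a full URI, pointing a prefix at a different NameNode needs no change to the stored paths.

```python
from dataclasses import dataclass, field


@dataclass
class NamespaceMap:
    """Hypothetical prefix -> NameNode URI mapping (not an Accumulo API)."""
    mapping: dict = field(default_factory=dict)

    def resolve(self, stored_path: str) -> str:
        # Assumed stored form: "<prefix>/<path>", e.g. "ns1/tables/T/f1.rf".
        prefix, _, rest = stored_path.partition("/")
        if prefix not in self.mapping:
            raise KeyError(f"unknown namespace prefix: {prefix}")
        return f"{self.mapping[prefix].rstrip('/')}/{rest}"

    def remap(self, prefix: str, namenode_uri: str) -> None:
        # Point an existing prefix at another HDFS instance; stored paths
        # are untouched, only the resolution changes.
        self.mapping[prefix] = namenode_uri


def choose_namespace(preferred: list, available: set) -> str:
    """Return the first preferred namespace that is still available."""
    for ns in preferred:
        if ns in available:
            return ns
    raise RuntimeError("no usable namespace for new files")


# Placeholder hosts and port, chosen only for the example.
ns = NamespaceMap({"ns1": "hdfs://nn1:8020", "ns2": "hdfs://nn2:8020"})
print(ns.resolve("ns2/tables/T/f1.rf"))  # hdfs://nn2:8020/tables/T/f1.rf

# After restoring a backup onto the first instance, point ns2 at it.
ns.remap("ns2", "hdfs://nn1:8020")
print(ns.resolve("ns2/tables/T/f1.rf"))  # hdfs://nn1:8020/tables/T/f1.rf

# Steer new files to ns2 ahead of draining ns1, falling back to ns1.
print(choose_namespace(["ns2", "ns1"], {"ns1", "ns2"}))  # ns2
```

Remapping a prefix only makes sense once the files really exist under the new NameNode, for example after the backup and restore described in the comment. The preference list passed to choose_namespace stands in for the configuration of the proposed table file load balancer.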