[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970183#comment-13970183 ]

ASF subversion and git services commented on ACCUMULO-118:

Commit 1587749 from dlmar...@apache.org in branch 'site/trunk'
[ https://svn.apache.org/r1587749 ]

Changed name of ACCUMULO-118 feature

> accumulo could work across HDFS instances, which would help it to scale past
> a single namenode
>
>                 Key: ACCUMULO-118
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-118
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master, tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Blocker
>             Fix For: 1.6.0
>
>         Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
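The issue description mentions "a pluggable strategy to determine namespace for new files". A minimal sketch of what such a strategy could look like follows. It is not Accumulo's implementation: the names `random_chooser`, `make_round_robin_chooser`, and `new_file_path`, and the hosts `nn1` and `nn2`, are hypothetical and exist only for illustration.

```python
import itertools
import random
from typing import Callable, Sequence

# Hypothetical type: a chooser takes the configured volumes and returns one.
VolumeChooser = Callable[[Sequence[str]], str]


def random_chooser(volumes: Sequence[str]) -> str:
    """Pick any configured volume with equal probability."""
    if not volumes:
        raise ValueError("no volumes configured")
    return random.choice(list(volumes))


def make_round_robin_chooser() -> VolumeChooser:
    """Return a chooser that cycles through the volumes in order."""
    counter = itertools.count()

    def choose(volumes: Sequence[str]) -> str:
        if not volumes:
            raise ValueError("no volumes configured")
        return volumes[next(counter) % len(volumes)]

    return choose


def new_file_path(chooser: VolumeChooser, volumes: Sequence[str], relative_path: str) -> str:
    """Build a full path for a new file on whichever volume the chooser selects."""
    base = chooser(volumes).rstrip("/")
    return f"{base}/{relative_path.lstrip('/')}"
```

Because the strategy is just a function over the volume list, swapping the random policy for round-robin (or something space-aware) would not need to touch the code that creates files.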
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949692#comment-13949692 ]

Sean Busbey commented on ACCUMULO-118:

Issue closed; please file any newly discovered issues as follow on tickets.
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948015#comment-13948015 ]

Josh Elser commented on ACCUMULO-118:

Would be good to have someone else look through the potential "bugs" from direct FileSystem usage that I outlined on ACCUMULO-2552. I believe that list doesn't have anything that would be a blocker for initial functionality, so I think I'm good to close this issue out.
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948000#comment-13948000 ]

Sean Busbey commented on ACCUMULO-118:

All subtasks against this ticket are now complete. Any objections to closing it out?
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900947#comment-13900947 ]

Josh Elser commented on ACCUMULO-118:

And then I realized that this was precisely what ACCUMULO-2061 was recommending. I'll bump the priority on that issue to denote my frustration with the current mechanisms and my approval of that idea.
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900917#comment-13900917 ]

Josh Elser commented on ACCUMULO-118:

I was just testing this out. I tried to add a new volume to an existing 1.6 instance I had lying around. I expected that each volume I specified in {{instance.volumes}} was the "equivalent" of how multiple {{instance.dfs.uri}}+{{instance.dfs.dir}} would have worked. In other words, I expected each element in {{instance.volumes}} to be the base directory that Accumulo would write to. Instead, it actually wrote to that volume + {{instance.dfs.dir}}.

This irks me for a few reasons:

# I must have the same base directory used in HDFS across all volumes (not the end of the world, but I don't see any reason to impose that on our end).
# I expected {{instance.volumes}} to be a replacement for {{instance.dfs.dir}} and {{instance.dfs.uri}}, but the new configuration still relies on the old configuration.

Let me try to be crystal clear. I had an existing installation on machine1 in {{/accumulo1.6}} in HDFS. I tried to add a second volume, stored on machine2, in {{/accumulo1.6-newvolume}} (I already had an /accumulo1.6 from other testing on machine2). I configured my {{instance.volumes}} value to {{hdfs://machine1:8020/accumulo1.6,hdfs://machine2:8020/accumulo1.6-newvolume}}. Sadly, when invoking {{bin/accumulo init --add-volumes}}, this failed on me because it actually looked in {{hdfs://machine1:8020/accumulo1.6/accumulo}} and {{hdfs://machine2:8020/accumulo1.6-newvolume/accumulo}}.
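The mismatch Josh describes can be shown with a small sketch. It only models the two interpretations of {{instance.volumes}} from his comment and is not Accumulo code. The function names are hypothetical, and the value {{/accumulo}} for {{instance.dfs.dir}} is an assumption inferred from the paths he reports.

```python
from typing import List


def expected_base_dirs(instance_volumes: str) -> List[str]:
    """What the commenter expected: each entry is itself the base directory."""
    return [v.strip().rstrip("/") for v in instance_volumes.split(",") if v.strip()]


def observed_base_dirs(instance_volumes: str, instance_dfs_dir: str) -> List[str]:
    """What he observed: instance.dfs.dir is appended to every volume entry."""
    suffix = instance_dfs_dir.strip("/")
    return [f"{base}/{suffix}" for base in expected_base_dirs(instance_volumes)]


# Configuration taken from the comment; "/accumulo" is an assumed instance.dfs.dir.
volumes = "hdfs://machine1:8020/accumulo1.6,hdfs://machine2:8020/accumulo1.6-newvolume"
print(expected_base_dirs(volumes))
print(observed_base_dirs(volumes, "/accumulo"))
```

Under the observed behavior every volume ends up with the same trailing directory, which is the first of the two complaints in the list above.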
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891494#comment-13891494 ]

Dave Marion commented on ACCUMULO-118:

bq. calling for a vote on the dev list after writing a design doc for a feature seems like a simple thing to try

Caveat: if you don't know anything about the subject, then don't vote. I can see the situation where someone gives a +1 for an idea after reading the design doc, but without really understanding what it will entail to complete the task. Maybe not a vote, but a "yes, I've read it and my comments / questions are in the associated JIRA."
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890670#comment-13890670 ]

Mike Drob commented on ACCUMULO-118:

This seems like it would be a good discussion for the mailing list, to make sure it gets more eyes on it, instead of just the folks paying attention to the JIRA.
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890330#comment-13890330 ]

Josh Elser commented on ACCUMULO-118:

I think the short of it here: it's hard. I remember when Eric was initially working on absolute paths and I thought "hrm, that's a good idea. should simplify a lot of things in the end". In hindsight, I don't think I really considered all of the difficulties that the changes introduce (most notably around upgrades and namenode/namespace decommissioning). Maintaining a long-running feature branch isn't too bad as long as the code you tweaked also doesn't change out from underneath you.

I agree with you, Keith. I think that focusing on design docs before starting to work on a feature can help quite a bit on a couple of levels (avoid flaws in design, catch bugs earlier, net a better-architected solution). Additionally, firming up a design can also help us break down "really big" problems into "slightly less big" problems, which will likely help manage those changes.

I think we've generally tried to abstain from requiring voting, but if that's what we need to get eyes on ideas, so be it. If we can get good, thought-out reviews without voting (which I think we've been fairly decent at so far, but we haven't had "big" designs go through review yet), I'd rather stay that way.
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890317#comment-13890317 ]

Keith Turner commented on ACCUMULO-118:

My comment about voting on design documents was a possible solution to a problem. The problem is: how can we get peer-reviewed designs? How can we motivate each other to spend a good chunk of time critically thinking about our designs? This issue had a design document, and I remember reading it and thinking "makes sense". I do not remember investing much time trying to find flaws with it; I think I just wanted to get back to whatever I was immersed in at the time. My thinking is that calling for a vote on the dev list after writing a design doc for a feature seems like a simple thing to try w/o any additional overhead, because we have the dev list in place.
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890300#comment-13890300 ]

Keith Turner commented on ACCUMULO-118:

bq. But it was a pretty massive change, and maintaining it as a patch set, even with git's help, would have been very hard.

There is a cost there. There is also a cost to having incomplete features in master when we declare feature freeze. 1.5.0 and 1.6.0 were both delayed because of incomplete features. One possible solution to this for 1.7.0 development is to only merge complete features into master.

There are lots of commits related to all of the 1.6.0 features, many of them reworking code changed by previous commits related to the feature. If someone was working on another feature in a branch, this introduces a lot of unnecessary noise for them to deal with. Ideally they would only have to merge and resolve conflicts once per feature. Of course we will never achieve the ideal ratio of 1, but I think we can easily make the commits per feature/bug ratio much lower than it currently is. I know [~ctubbsii] worked on namespaces in a branch for a long period of time, merging in changes and resolving conflicts; I am not sure how painful this was.

For a feature like this that touches a lot of existing code, there is the option of refactoring w/o changing functionality and merging that into master. Of course the refactoring would be done to make the functionality changes in the branch easier. So it's a multi-step process w/ the goal of always leaving master in a state where it's ready for release testing and minimizing the number of merges for other feature branches. This approach would also make code reviews of commits to master much easier. I am going to try doing this for the 1.7 features I work on.

bq. but was certainly not anything I was thinking about when I was changing thousands of lines

I was trying to determine how we can flush out more potential issues before changing thousands of lines. If we can get as many people as possible to carefully review the design, that would probably do the trick. I wonder if voting on design docs for new features would help. Voting would motivate me to carefully review a design because I would not want to vote until I had done so.
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888661#comment-13888661 ]

Eric Newton commented on ACCUMULO-118:

bq. I think this feature was merged in before it was complete

Probably. But it was a pretty massive change, and maintaining it as a patch set, even with git's help, would have been very hard.

bq. I did not realize all of the problems absolute paths could cause

Nor would we have if it had not been merged in.

bq. should have started with administrative use cases

I think we are getting better at this. For example, I can think of lots of ways that the initial WAL implementation caused a lot of grief for unsuspecting administrators. We fixed this after it was released into the wild, based on feedback from the administrators. Ultimately these problems were fixed by moving the WAL to HDFS, and then ferreting out all the settings to make HDFS an appropriate store for the WAL.

I think the use case of "what if administrators change the URL of a NN?" is a reasonable one, but was certainly not anything I was thinking about when I was changing thousands of lines of code to use full paths. The more subtle issues of determining aliases for namespaces (hdfs://example:9000 vs hdfs://example.com:9000), and recognizing real namespaces under viewfs, are the sort of things that we will only find through actual use.

My initial goal of using concrete paths to simplify debugging might have been the wrong choice. Using some kind of indirect configuration that points to a real namespace (like viewfs) may have been better. But that requires that you value "administrators should be able to easily move a NN to a new URL." The ability to do this with the old relative paths was not a design goal, so much as a useful result of using the shortest name possible for each file.

bq. These really seem to be the long pole in the tent for the 1.6 release

Seems to me to not be so far behind namespaces.

Constructive criticism includes suggestions on how to make things better. Working code is even more constructive.
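The alias problem Eric raises (hdfs://example:9000 vs hdfs://example.com:9000) can be illustrated with a short sketch. It is not how Accumulo resolves namespaces: `namespace_of` and `same_namespace` are hypothetical helpers, and the alias table is an assumed piece of administrator-supplied configuration.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def namespace_of(path: str) -> str:
    """Return the scheme://authority portion of a fully qualified path."""
    parts = urlsplit(path)
    return f"{parts.scheme}://{parts.netloc}"


def same_namespace(a: str, b: str, aliases: Optional[Dict[str, str]] = None) -> bool:
    """Compare namespaces, optionally mapping known aliases to a canonical name.

    Without an alias table this is plain string equality, so two spellings of
    the same NameNode are treated as different file systems.
    """
    aliases = aliases or {}
    na, nb = namespace_of(a), namespace_of(b)
    return aliases.get(na, na) == aliases.get(nb, nb)


short = "hdfs://example:9000/accumulo/tables/1/f.rf"
fqdn = "hdfs://example.com:9000/accumulo/tables/1/f.rf"

print(same_namespace(short, fqdn))  # False: the strings differ
print(same_namespace(short, fqdn, {"hdfs://example:9000": "hdfs://example.com:9000"}))  # True
```

The sketch shows why absolute paths stored verbatim are fragile: equality of the stored strings says nothing about whether two paths live on the same NameNode, so some extra source of truth is needed.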
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887940#comment-13887940 ]

Keith Turner commented on ACCUMULO-118:

bq. This suite of tickets is really starting to concern me.

I am also concerned. I am worried about administrators having a really bad experience (a bricked Accumulo instance). In hindsight, I think this feature was merged in before it was complete and should not have been merged in. What can we learn from this to make development go more smoothly in the future? Personally, I think if a new feature is incomplete (in terms of tests, usability, documentation, etc.), breaks other features of Accumulo, or introduces new problems, it should not be merged in.

I did not realize all of the problems absolute paths could cause until well after it was merged in. In hindsight, I think the design of this feature should have started with administrative use cases. Maybe that would have brought the problems w/ absolute paths to attention much earlier.

bq. I would like to start the conversation about marking this item and items around it as @Experimental

I wish it were that simple. Whether you use multiple namenodes or not, Accumulo will use absolute paths in 1.6, and administrators do not have a reliable way to manage those paths.

I have a fix for ACCUMULO-1832 in the works; I should be done w/ that very soon. I think that fix will address my major concern w/ this issue.
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887170#comment-13887170 ]

John Vines commented on ACCUMULO-118:

This suite of tickets is really starting to concern me. These really seem to be the long pole in the tent for the 1.6 release, and I'm starting to question how badly they're holding back the release. We have a mix of standing improvements that are encompassed by this feature, along with tickets for bugs found and bugs that are theoretically possible.

I would like to start the conversation about marking this item and items around it as @Experimental and minimizing impact from the work done so far on single node instances.
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852117#comment-13852117 ]

Keith Turner commented on ACCUMULO-118:

Need to fix ACCUMULO-1771 before fixing this
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764775#comment-13764775 ]

ASF subversion and git services commented on ACCUMULO-118:

Commit 1465b6e32a6b8b1c480908f0fd39e825b20c3d21 in branch refs/heads/master from [~keith_turner]
[ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=1465b6e ]

ACCUMULO-118 made export table generate fully qualified path for export file in distcp file
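The commit above concerns writing fully qualified paths into the distcp file list. As a rough illustration of what "fully qualified" means here, and not a reproduction of the committed code, the sketch below qualifies a bare path against a default file system URI. The helper `qualify` and the hosts `nn1` and `nn2` are hypothetical.

```python
from urllib.parse import urlsplit


def qualify(path: str, default_fs: str) -> str:
    """Return path unchanged if it already names a file system, else prefix default_fs."""
    if urlsplit(path).scheme:
        # Already of the form scheme://authority/..., so leave it alone.
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


# A bare path is ambiguous once more than one NameNode is in play.
print(qualify("/accumulo/tables/1/f.rf", "hdfs://nn1:8020"))
# A path that already carries its namespace passes through untouched.
print(qualify("hdfs://nn2:8020/accumulo/tables/1/g.rf", "hdfs://nn1:8020"))
```

With files spread across volumes, a copy tool reading the list cannot guess which file system a bare path belongs to, which is presumably why the export file needed qualified entries.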
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692404#comment-13692404 ]

ASF subversion and git services commented on ACCUMULO-118:

Commit 1496225 from [~ecn]
[ https://svn.apache.org/r1496225 ]

ACCUMULO-118 added getConnector method for VolumeManager test
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689408#comment-13689408 ]

ASF subversion and git services commented on ACCUMULO-118:

Commit 1495105 from [~ecn]
[ https://svn.apache.org/r1495105 ]

ACCUMULO-118 fix rat check
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689396#comment-13689396 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1495099 from [~ecn] [ https://svn.apache.org/r1495099 ] ACCUMULO-118 add a note to the user's manual about multi-volume configuration
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689393#comment-13689393 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1495096 from [~ecn] [ https://svn.apache.org/r1495096 ] ACCUMULO-118 there's only one default volume
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689387#comment-13689387 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1495087 from [~ecn] [ https://svn.apache.org/r1495087 ] ACCUMULO-118 move unit test into test area
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689277#comment-13689277 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1495020 from [~ecn] [ https://svn.apache.org/r1495020 ] ACCUMULO-118 add unit test for multiple volumes
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688403#comment-13688403 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1494761 from [~ecn] [ https://svn.apache.org/r1494761 ] ACCUMULO-118 cleanup sandbox
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688387#comment-13688387 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1494759 from [~ecn] [ https://svn.apache.org/r1494759 ] ACCUMULO-118 merge to trunk
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688356#comment-13688356 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1494746 from [~ecn] [ https://svn.apache.org/r1494746 ] ACCUMULO-118 fix last TODO
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688319#comment-13688319 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1494736 from [~ecn] [ https://svn.apache.org/r1494736 ] ACCUMULO-118 use the correct data range for making file delete markers for !METADATA merges
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688089#comment-13688089 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1494671 from [~ecn] [ https://svn.apache.org/r1494671 ] ACCUMULO-118 merge trunk to branch
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688052#comment-13688052 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1494648 from [~ecn] [ https://svn.apache.org/r1494648 ] ACCUMULO-118 fix fs chooser, enabled multi-volume where the first volume is not the same as the HDFS config
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687274#comment-13687274 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1494340 from [~ecn] [ https://svn.apache.org/r1494340 ] ACCUMULO-118 cleanup TODOs simplify VolumeManager interface, add some basic documentation
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687215#comment-13687215 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1494310 from [~ecn] [ https://svn.apache.org/r1494310 ] ACCUMULO-118 use a configurable class to select volumes for new files
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687126#comment-13687126 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1494279 from [~ecn] [ https://svn.apache.org/r1494279 ] ACCUMULO-118 merge trunk to sandbox
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687002#comment-13687002 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1494226 from [~ecn] [ https://svn.apache.org/r1494226 ] ACCUMULO-118 switch to using "volume" instead of "namespace"
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13685920#comment-13685920 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1493916 from [~ecn] [ https://svn.apache.org/r1493916 ] ACCUMULO-118 merge root, fix bunches of issues with gc
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13685547#comment-13685547 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1493756 from [~ecn] [ https://svn.apache.org/r1493756 ] ACCUMULO-118 fixed log recovery, du
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677252#comment-13677252 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1490362 from [~ecn] [ https://svn.apache.org/r1490362 ] ACCUMULO-118 add apache license header to new files
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677217#comment-13677217 ] Eric Newton commented on ACCUMULO-118: --
Right now I'm using "instance.namespaces" for the file system roots to use. It was noticed recently that the term "namespace" conflicts with the concept of "table namespaces". Anyone have a better term for these file system roots to use?
* instance.dfs.uris
* instance.dfs.dirs
* instance.dfs.roots
* instance.dfs.filesystems
> accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677197#comment-13677197 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1490338 from [~ecn] [ https://svn.apache.org/r1490338 ] ACCUMULO-118 handle different uris
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677189#comment-13677189 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1490337 from [~ecn] [ https://svn.apache.org/r1490337 ] ACCUMULO-118 merge trunk into sandbox
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677002#comment-13677002 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1490273 from [~ecn] [ https://svn.apache.org/r1490273 ] ACCUMULO-118 merge trunk into sandbox
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676974#comment-13676974 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1490262 from [~ecn] [ https://svn.apache.org/r1490262 ] ACCUMULO-118 merge trunk to branch
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676155#comment-13676155 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1489980 from [~ecn] [ https://svn.apache.org/r1489980 ] ACCUMULO-118 merge trunk into sandbox
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676132#comment-13676132 ] ASF subversion and git services commented on ACCUMULO-118: -- Commit 1489969 from [~ecn] [ https://svn.apache.org/r1489969 ] ACCUMULO-118 support multiple namespaces for tables > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668603#comment-13668603 ] Keith Turner commented on ACCUMULO-118: ---
I was looking at some docs on viewfs. If possible, I am thinking we should not do anything that would preclude using viewfs. It seems like if URIs were supported for tablet dirs and files (along with a way to choose a tablet dir) that this would almost be enough to support viewfs.
{noformat}
1;m srv:dir viewfs://clusterX/accumulo1/tables/abc
1;m file:viewfs://ns1/accumulo1/tables/abc/F002.rf []196,1
1< srv:dir viewfs://clusterX/accumulo2/tables/abc
1< file:viewfs://ns1/accumulo2/tables/abc/F003.rf []196,1
{noformat}
If we want to further develop our own indirection layer, then maybe we should define our own URI prefix. Something like ans://. How independent should this URI be? Something like ans:/// would assume that you know where to look up. If the URI were like ans://++/ then it would be more self contained. I do not think it's necessary to make it self contained; it's for internal use and would be translated as needed.
I was thinking about how bulk import will work in this federated world. Below is one way this could work.
* Client calls import dir w/ /foo1
* Accumulo client code uses local config to convert /foo1 to URI hdfs://nn1/foo1
* hdfs://nn1/foo1 is passed to Accumulo server code via thrift
* Accumulo server code looks at URI to determine where to move to, determines it has accumulo dir hdfs://nn1/accumulo.
* moves files in hdfs://nn1/foo1 to hdfs://nn1/accumulo/tables/abc
* Replaces hdfs://nn1/accumulo/tables/abc with ans://ns1/accumulo/tables/abc
* Does bulk import of files in ans://ns1/accumulo/tables/abc
Is this how this should work? The scenario above implies that Accumulo needs a dir on each namenode and a way of mapping URIs to the appropriate Accumulo dir. Need to work through this scenario w/ viewfs also.
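The bulk import steps in this comment can be traced with a small sketch. It is illustrative Python, not Accumulo code: the dictionaries `ACCUMULO_DIRS` and `LOGICAL_NAMESPACES`, the `DEFAULT_FS` value, and the function names are all hypothetical stand-ins for whatever configuration and server logic would hold the "dir on each namenode" mapping. It only computes the URIs involved and does not touch a file system.

```python
from urllib.parse import urlparse

# Hypothetical configuration (assumption): one Accumulo directory per namenode,
# and the logical ans:// name that stands in for each namenode.
ACCUMULO_DIRS = {"hdfs://nn1": "hdfs://nn1/accumulo"}
LOGICAL_NAMESPACES = {"hdfs://nn1": "ans://ns1"}
DEFAULT_FS = "hdfs://nn1"  # what the client's local config would supply


def qualify(path, default_fs=DEFAULT_FS):
    """Client side: turn a bare path such as /foo1 into a full URI."""
    if urlparse(path).scheme:
        return path  # already a URI, leave it alone
    return default_fs + path


def filesystem_of(uri):
    """Return the scheme://authority part of a URI, e.g. hdfs://nn1."""
    parsed = urlparse(uri)
    return f"{parsed.scheme}://{parsed.netloc}"


def plan_bulk_import(import_dir, table_id):
    """Server side: decide where files move and how they are recorded."""
    source = qualify(import_dir)
    fs = filesystem_of(source)
    if fs not in ACCUMULO_DIRS:
        raise ValueError(f"no Accumulo directory configured on {fs}")
    # Files stay on the namenode they arrived on; only the directory changes.
    physical = f"{ACCUMULO_DIRS[fs]}/tables/{table_id}"
    # The recorded location uses the logical prefix instead of the namenode.
    logical = physical.replace(fs, LOGICAL_NAMESPACES[fs], 1)
    return {"source": source, "move_to": physical, "record_as": logical}


plan = plan_bulk_import("/foo1", "abc")
print(plan["source"], "->", plan["move_to"], "recorded as", plan["record_as"])
```

Keeping the move on the source namenode is what makes the step a cheap rename; the sketch raises an error when no Accumulo directory exists on that namenode, which is the gap the comment points out.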
> accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Attachments: ACCUMULO-118-01.txt > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668434#comment-13668434 ] Dave Marion commented on ACCUMULO-118: -- [~afuchs] I think they are related in that they are both solutions to the same problem. However, I think you can still use both solutions together. > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Attachments: ACCUMULO-118-01.txt > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668428#comment-13668428 ] Dave Marion commented on ACCUMULO-118: -- bq. Would avoiding storing direct pointers to namenodes in the metadata table be sufficient to satisfy this? Always have a level of indirection like viewfs? Yes, a simple direct mapping is what I would like to see. Using a hash function is not simple or direct and I don't see why it's needed. If I have a backup of the files from ns1 on ns2 and vice versa (think hot backup) and ns2 fails, then I can just change the mapping so that both ns1 and ns2 point to the same Hadoop HDFS instance. No update of the metadata table needed. > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Attachments: ACCUMULO-118-01.txt > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
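The direct mapping described in the comment above could look roughly like the following. This is a minimal Python sketch under stated assumptions: the `namespace://` scheme is borrowed from the metadata example elsewhere in this thread, while the host names, port 8020, and the `resolve` function are hypothetical. The point it illustrates is that stored locations never name a namenode, so re-pointing ns2 after a failure needs only a mapping change.

```python
PREFIX = "namespace://"


def resolve(logical_uri, mapping):
    """Translate namespace://<name>/<path> into a physical HDFS URI."""
    if not logical_uri.startswith(PREFIX):
        raise ValueError(f"not a logical namespace URI: {logical_uri}")
    # Split "ns1/accumulo/tables/..." into the namespace name and the path.
    name, _, path = logical_uri[len(PREFIX):].partition("/")
    if name not in mapping:
        raise KeyError(f"unknown namespace: {name}")
    return f"{mapping[name]}/{path}"


# Hypothetical mapping of namespace prefix to namenode (hosts and port assumed).
mapping = {"ns1": "hdfs://nn1:8020", "ns2": "hdfs://nn2:8020"}
stored = "namespace://ns2/accumulo/tables/abc/F002.rf"  # as kept in metadata
print(resolve(stored, mapping))

# ns2's HDFS instance is lost; its files also exist on ns1's instance (hot backup).
# Only the mapping changes; the stored metadata entry is untouched.
mapping["ns2"] = mapping["ns1"]
print(resolve(stored, mapping))
```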
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668380#comment-13668380 ] Adam Fuchs commented on ACCUMULO-118: - How does this relate to ACCUMULO-722? Seems like if we implemented ACCUMULO-722 then we wouldn't need this, but this is a much easier ticket, right? > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Attachments: ACCUMULO-118-01.txt > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668377#comment-13668377 ] Keith Turner commented on ACCUMULO-118: --- Would also be nice to outline an upgrade strategy in the design doc. > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Attachments: ACCUMULO-118-01.txt > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668366#comment-13668366 ] Keith Turner commented on ACCUMULO-118: ---
bq. I would rather see a mapping of namespace prefix to NN in the configuration (ns1 = hdfs://host:port, ns2 = hdfs://host:port)
I agree, I think making this explicit and straightforward is the way to go. Although, storing the namespace config in accumulo-site.xml seems error prone. In the worst case node1 defines ns1=hdfs://nn3 and node2 defines ns2=hdfs://nn4. I would advocate only storing this mapping in ZooKeeper.
bq. I'm thinking forward to table file load balancing across namespaces and backups (see my comment from 3/Apr/12).
[~dlmarion] Would avoiding storing direct pointers to namenodes in the metadata table be sufficient to satisfy this? Always have a level of indirection like viewfs?
[~ecn] Would there be any reason for a single tablet to spread its files across multiple name nodes? Tablets currently have a directory column that tells a tablet where to create new files. This could be converted to an absolute path/url. When a tablet creates a new file, it uses this path. There may be some small efficiency gain when opening multiple files for a tablet if all of the calls went to the same namenode.
{noformat}
1< srv:dir namespace://ns1/accumulo/tables/abc
1< file:namespace://ns1/accumulo/tables/abc/F002.rf []196,1
{noformat}
bq. Perhaps a per-table configuration of the hash function?
Could possibly have a plugin that's called to choose the value of srv:dir for a new tablet. The input to this function could be the KeyExtent and list of available namespaces and it could return a namespace/url. The default implementation could hash+mod the tablet's end row and use that index into the list of namespaces.
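A default chooser along the lines just described might be sketched as follows. This is illustrative Python rather than an Accumulo plugin interface: the function name `choose_tablet_dir`, its parameters, and the use of CRC32 as the hash are assumptions, and a real plugin would receive a KeyExtent rather than a bare table id and end row.

```python
import zlib


def choose_tablet_dir(table_id, end_row, namespaces):
    """Pick the srv:dir value for a new tablet by hash+mod of its end row.

    end_row is bytes, or None for the last tablet of a table (no end row).
    namespaces is the ordered list of available namespace URIs.
    """
    if not namespaces:
        raise ValueError("at least one namespace is required")
    key = b"" if end_row is None else end_row
    # CRC32 is used because it is stable across processes and restarts;
    # any deterministic hash would serve the same purpose.
    index = zlib.crc32(key) % len(namespaces)
    return f"{namespaces[index]}/accumulo/tables/{table_id}"


available = ["namespace://ns1", "namespace://ns2"]
for row in (b"apple", b"mango", None):
    print(row, "->", choose_tablet_dir("abc", row, available))
```

Because the index depends on the length and order of the namespace list, adding or reordering namespaces changes the choice for future tablets only; existing tablets keep the srv:dir already stored for them.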
[~ecn] once a design is settled on, I think it would be useful if the design doc outlined how this new feature will interact with bulk import, clone table, export/import table, and offline map reduce. > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Attachments: ACCUMULO-118-01.txt > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666516#comment-13666516 ] Christopher Tubbs commented on ACCUMULO-118: {quote}Personally I am not a fan of the hash idea. I would rather...{quote} Perhaps a per-table configuration of the hash function? > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Attachments: ACCUMULO-118-01.txt > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666314#comment-13666314 ] Dave Marion commented on ACCUMULO-118: -- Personally I am not a fan of the hash idea. I would rather see a mapping of namespace prefix to NN in the configuration (ns1 = hdfs://host:port, ns2 = hdfs://host:port). I'm thinking forward to table file load balancing across namespaces and backups (see my comment from 3/Apr/12). If for example you quiesced the database and performed a backup, then you could change the namespace mapping such that ns1 and ns2 point to the same hdfs://host:port if for some reason you lost the 2nd hdfs instance (it crashed, you wanted to remove it, etc). This could also allow for upgrades of Hadoop while Accumulo is still running. Think about the scenario where ns1 is on racks 1&2 and ns2 is on racks 3&4 of a cluster and the files of table T are spread across ns1 and ns2. You could change the configuration of the table file load balancer (new feature) so that it puts new files on ns2. You recompact the table so now all new files are on ns2. When done for all tables (and walogs), then you can shut down ns1 and upgrade to a new version of Hadoop. > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Attachments: ACCUMULO-118-01.txt > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. 
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665879#comment-13665879 ] Josh Elser commented on ACCUMULO-118: -
{quote}
With a different viewfs implementation, we could use a hash to determine the namespace and hide the whole switch between file systems:
hash("/accumulo/tables/1/default_tablet/A0003892.rf") % 2 -> hdfs://nn1:12345
hash("/accumulo/tables/1/default_tablet/F0004567.rf") % 2 -> hdfs://nn2:12345
Unfortunately, renaming a file might force it to a new namespace.
{quote}
I bet you could be tricky with the actual hash+modulo in a way that guarantees a renamed file wouldn't change namespaces.
> accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Attachments: ACCUMULO-118-01.txt > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
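One possible reading of "being tricky" with the hash, offered only as an illustration and not as what the commenter had in mind, is to hash the parent directory instead of the full file path. A rename inside the same tablet directory then cannot change the result. The Python sketch below assumes CRC32 as the hash and reuses the two namenode URIs from the quoted example; the function name `namespace_for` is hypothetical.

```python
import posixpath
import zlib

# Namenode URIs taken from the quoted example.
NAMENODES = ["hdfs://nn1:12345", "hdfs://nn2:12345"]


def namespace_for(path, namenodes=NAMENODES):
    """Choose a namenode from the file's directory, ignoring the file name."""
    directory = posixpath.dirname(path)
    index = zlib.crc32(directory.encode("utf-8")) % len(namenodes)
    return namenodes[index]


before = namespace_for("/accumulo/tables/1/default_tablet/F0004567.rf")
after = namespace_for("/accumulo/tables/1/default_tablet/A0003892.rf")
print(before == after)  # both files share a directory, so they share a namenode
```

The trade-off is coarser balancing: all files of one tablet directory land on the same namenode, and a move to a different directory can still change the namespace.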
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665392#comment-13665392 ] Eric Newton commented on ACCUMULO-118: -- Yes, I'm working on it now. > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665235#comment-13665235 ] Keith Turner commented on ACCUMULO-118: --- bq. It's time to start thinking about 1.6, and I intend for this to be part of that release. My goal is a working branch in 2 months. Do you plan to post a design doc outlining your approach to tackling this? > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664756#comment-13664756 ] Eric Newton commented on ACCUMULO-118: -- Yes, [~elserj] that's what I mean. It's time to start thinking about 1.6, and I intend for this to be part of that release. My goal is a working branch in 2 months. > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664749#comment-13664749 ] Josh Elser commented on ACCUMULO-118: - I believe this means that Eric intends to not release 1.6 without this. > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
[ https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664741#comment-13664741 ] David Medinets commented on ACCUMULO-118: - Why was this switched to a blocker? What is it blocking? > accumulo could work across HDFS instances, which would help it to scale past > a single namenode > -- > > Key: ACCUMULO-118 > URL: https://issues.apache.org/jira/browse/ACCUMULO-118 > Project: Accumulo > Issue Type: Improvement > Components: master, tserver >Reporter: Eric Newton >Assignee: Eric Newton >Priority: Blocker > Fix For: 1.6.0 > > Original Estimate: 2,016h > Remaining Estimate: 2,016h > > Consider using full path names to files, which would allow the servers to > access the files on any HDFS file system. > Work may exist elsewhere to run HDFS using a number of NameNode instances to > break up the namespace. > We may need a pluggable strategy to determine namespace for new files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira