[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-04-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970183#comment-13970183
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1587749 from dlmar...@apache.org in branch 'site/trunk'
[ https://svn.apache.org/r1587749 ]

Changed name of ACCUMULO-118 feature

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
> Issue Type: Improvement
> Components: master, tserver
> Reporter: Eric Newton
> Assignee: Eric Newton
> Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt
>
> Original Estimate: 2,016h
> Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.
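
To make the "pluggable strategy" idea in the description concrete, here is a minimal sketch in Python. It is not Accumulo code: the names VolumeChooser, random_chooser, first_volume_chooser and new_file_path, and the nn1/nn2 host names in the usage note, are all hypothetical and exist only for illustration. The assumption is that a strategy is simply a function that picks one of the configured base URIs for each new file.

```python
import random
from typing import Callable, Sequence

# Hypothetical strategy type: given the configured volumes, pick one for a new file.
VolumeChooser = Callable[[Sequence[str]], str]


def random_chooser(volumes: Sequence[str]) -> str:
    """Spread new files across the volumes uniformly at random."""
    return random.choice(volumes)


def first_volume_chooser(volumes: Sequence[str]) -> str:
    """Always pick the first volume, which mimics a single-namenode setup."""
    return volumes[0]


def new_file_path(volumes: Sequence[str], relative_path: str,
                  chooser: VolumeChooser = random_chooser) -> str:
    """Return a full path for a new file on whichever volume the strategy picks."""
    if not volumes:
        raise ValueError("at least one volume is required")
    base = chooser(volumes).rstrip("/")
    return base + "/" + relative_path.lstrip("/")
```

For example, new_file_path(["hdfs://nn1:8020/accumulo", "hdfs://nn2:8020/accumulo"], "tables/1/f.rf", first_volume_chooser) yields a path under the first volume. Swapping the chooser changes placement without touching the callers.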



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-03-27 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949692#comment-13949692
 ] 

Sean Busbey commented on ACCUMULO-118:
--

Issue closed; please file any newly discovered issues as follow on tickets.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-03-26 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948015#comment-13948015
 ] 

Josh Elser commented on ACCUMULO-118:
-

Would be good to have someone else look through the potential "bugs" from 
direct FileSystem usage that I outlined on ACCUMULO-2552.

I believe that list doesn't have anything that would be a blocker for initial 
functionality, so I think I'm good to close this issue out.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-03-26 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948000#comment-13948000
 ] 

Sean Busbey commented on ACCUMULO-118:
--

All subtasks against this ticket are now complete. Any objections to closing it 
out?



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-02-13 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900947#comment-13900947
 ] 

Josh Elser commented on ACCUMULO-118:
-

And then I realized that this was precisely what ACCUMULO-2061 was 
recommending. I'll bump the priority on that issue to denote my frustration 
with the current mechanisms and approval of that idea.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-02-13 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900917#comment-13900917
 ] 

Josh Elser commented on ACCUMULO-118:
-

I was just testing this out. I tried to add a new volume to an existing 1.6 
instance I had lying around.

I expected that each volume I specified in {{instance.volumes}} was the 
"equivalent" of how multiple {{instance.dfs.uri}}+{{instance.dfs.dir}} would 
have worked. In other words, I expected each element in {{instance.volumes}} to 
be the base directory that Accumulo would write to. Instead, it actually wrote 
to that volume + {{instance.dfs.dir}}.

This irks me for a few reasons:

# I must have the same base directory used in HDFS across all volumes (not the 
end of the world, but I don't see any reason to impose that on our end).
# I expected {{instance.volumes}} to be a replacement to {{instance.dfs.dir}} 
and {{instance.dfs.uri}}, but the new configuration still relies on the old 
configuration.

Let me try to be crystal clear. I had an existing installation on machine1 in 
{{/accumulo1.6}} in HDFS. I tried to add a second volume, stored on machine2, 
in {{/accumulo1.6-newvolume}} (I already had an /accumulo1.6 from other testing 
on machine2). I configured my {{instance.volumes}} value to 
{{hdfs://machine1:8020/accumulo1.6,hdfs://machine2:8020/accumulo1.6-newvolume}}.
 Sadly, when invoking {{bin/accumulo init --add-volumes}}, this failed on me 
because it actually looked in {{hdfs://machine1:8020/accumulo1.6/accumulo}} and 
{{hdfs://machine2:8020/accumulo1.6-newvolume/accumulo}}.
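
The behavior described above can be sketched as a small Python function. This is only an illustration of what was observed, not the actual Accumulo implementation; the function name effective_base_dirs is hypothetical, and the value "/accumulo" for instance.dfs.dir is an assumption inferred from the paths that init looked in.

```python
from typing import List


def effective_base_dirs(instance_volumes: str, instance_dfs_dir: str) -> List[str]:
    """Model the observed behavior: each volume gets instance.dfs.dir appended.

    instance_volumes is the comma-separated value of instance.volumes;
    instance_dfs_dir is the value of instance.dfs.dir.
    """
    volumes = [v.strip() for v in instance_volumes.split(",") if v.strip()]
    suffix = instance_dfs_dir.strip("/")
    return [v.rstrip("/") + "/" + suffix for v in volumes]


# The configuration from the comment above, with an assumed instance.dfs.dir.
dirs = effective_base_dirs(
    "hdfs://machine1:8020/accumulo1.6,hdfs://machine2:8020/accumulo1.6-newvolume",
    "/accumulo",
)
for d in dirs:
    print(d)
```

Under that assumption the function returns the two paths that init actually looked in, rather than the two volume URIs as configured, which is the mismatch being reported.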



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-02-04 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891494#comment-13891494
 ] 

Dave Marion commented on ACCUMULO-118:
--

bq. calling for a vote on the dev list after writing a design doc for a feature 
seems like a simple thing to try

caveat that if you don't know anything about the subject, then don't vote. I 
can see the situation where someone gives a +1 for an idea after reading the 
design doc, but not really understanding what it will entail to complete the 
task. Maybe not a vote, but a "yes, I've read it and my comments / questions 
are in the associated JIRA."



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-02-04 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890670#comment-13890670
 ] 

Mike Drob commented on ACCUMULO-118:


This seems like it would be a good discussion for the mailing list, to make 
sure it gets more eyes on it, instead of just the folks paying attention to the 
JIRA.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-02-03 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890330#comment-13890330
 ] 

Josh Elser commented on ACCUMULO-118:
-

I think the short of it here: it's hard.

I remember when Eric was initially working on absolute paths and I thought 
"hrm, that's a good idea. should simplify a lot of things in the end". In 
hindsight, I don't think I really considered all of the difficulties that the 
changes introduce (most notably around upgrades and namenode/namespace 
decommissioning).

Maintaining a long-running feature branch isn't too bad as long as the code you 
tweaked also doesn't change out from underneath you.

I agree with you, Keith. I think that focusing on design docs before starting to 
work on it can help quite a bit on a couple of levels (avoid flaws in design, 
catch bugs earlier, net a better-architected solution). Additionally, firming 
up a design can also help us break down "really big" problems into "slightly 
less big" problems which will likely help manage those changes. I think we've 
generally tried to abstain from requiring voting, but if that's what we need to 
get eyes on ideas, so be it. If we can get good, thought-out reviews without 
voting (which I think we've been fairly decent at so far, but we haven't had 
"big" designs go through review yet), I'd rather stay that way.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-02-03 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890317#comment-13890317
 ] 

Keith Turner commented on ACCUMULO-118:
---

My comment about voting on design documents was a possible solution to a 
problem.  The problem is how can we get peer-reviewed designs?  How can we 
motivate each other to spend a good chunk of time critically thinking about our 
designs?   This issue had a design document and I remember reading and thinking 
"makes sense".  I do not remember investing much time trying to find flaws with 
it, I think I just wanted to get back to whatever I was immersed in at the 
time.   My thinking is that calling for a vote on the dev list after writing a 
design doc for a feature seems like a simple thing to try w/o any additional 
overhead because we have the dev list in place.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-02-03 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890300#comment-13890300
 ] 

Keith Turner commented on ACCUMULO-118:
---

bq. But it was a pretty massive change, and maintaining it as a patch set, even 
with git's help, would have been very hard.

There is a cost there.  There is also a cost to having incomplete features in 
master when we declare feature freeze.   1.5.0 and 1.6.0 were both delayed 
because of incomplete features.  One possible solution to this for 1.7.0 
development is to only merge in complete features to master.

There are lots of commits related to all of the 1.6.0 features, many of them 
reworking code changed by previous commits related to the feature.  If someone 
was working on another feature in a branch, this introduces a lot of 
unnecessary noise for them to deal with.  Ideally they would only have to merge 
and resolve conflicts once per feature.  Of course we will never achieve the ideal 
ratio of 1, but I think we can easily make the commits-per-feature/bug ratio 
much lower than it currently is.  I know [~ctubbsii] worked on namespaces in a 
branch for a long period of time, merging in changes and resolving conflicts, I 
am not sure how painful this was.

For a feature like this that touches a lot of existing code, there is the 
option of refactoring w/o changing functionality and merging that into master.  
 Of course the refactoring would be done to make the functionality changes in 
the branch easier.   So it's a multi-step process w/ the goal of always leaving 
master in a state where it's ready for release testing and minimizing the number 
of merges for other feature branches.  This approach would also make code 
reviews of commits to master much easier.   I am going to try doing this for 
the 1.7 features I work on.

bq.  but was certainly not anything I was thinking about when I was changing 
thousands of lines 

I was trying to determine how we can flush out more potential issues before 
changing thousands of lines. If we can get as many people as possible to 
carefully review the design that would probably do the trick.  I wonder if 
voting on design docs for new features would help.  Voting would motivate me to 
carefully review a design because I would not want to vote until I had done so.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-02-01 Thread Eric Newton (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888661#comment-13888661
 ] 

Eric Newton commented on ACCUMULO-118:
--

bq.  I think this feature was merged in before it was complete

Probably.  But it was a pretty massive change, and maintaining it as a patch 
set, even with git's help, would have been very hard.

bq. I did not realize all of the problems absolute paths could cause

Nor would we have if it was not merged in.

bq. should have started with administrative use cases

I think we are getting better at this.  For example, I can think of lots of 
ways that the initial WAL implementation caused a lot of grief for unsuspecting 
administrators.  We fixed this after it was released into the wild based on 
feedback from the administrators. Ultimately these were fixed by moving the WAL 
to HDFS, and then ferreting out all the settings to make HDFS an appropriate 
store for the WAL.

I think the use case of "what if administrators change the URL of a NN?" is a 
reasonable one, but was certainly not anything I was thinking about when I was 
changing thousands of lines of code to use full paths.  The more subtle issues 
of determining aliases for namespaces (hdfs://example:9000 vs 
hdfs://example.com:9000), and recognizing real namespaces under viewfs are the 
sort of subtle things that we will only find through actual use.
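
To illustrate the alias problem, here is a minimal Python sketch. It is not how Accumulo compares paths; namespace_of, same_namespace and the aliases table are hypothetical names, and the file paths are made up. It only shows that two URIs for the same namenode look like different namespaces under plain string comparison unless something maps one onto the other.

```python
from typing import Dict
from urllib.parse import urlsplit


def namespace_of(uri: str) -> str:
    """Return the scheme and authority of a file URI, e.g. 'hdfs://example:9000'."""
    parts = urlsplit(uri)
    return f"{parts.scheme}://{parts.netloc}".lower()


def same_namespace(a: str, b: str, aliases: Dict[str, str]) -> bool:
    """Compare two file URIs by namespace, after mapping known aliases
    (a hypothetical, administrator-supplied table) to a canonical form."""
    na, nb = namespace_of(a), namespace_of(b)
    return aliases.get(na, na) == aliases.get(nb, nb)


short = "hdfs://example:9000/accumulo/tables/1/f.rf"
full = "hdfs://example.com:9000/accumulo/tables/1/f.rf"

# Without an alias table the two names are treated as different namespaces.
print(same_namespace(short, full, {}))
# With an explicit alias they are recognized as the same namespace.
print(same_namespace(short, full, {"hdfs://example:9000": "hdfs://example.com:9000"}))
```

The alias table has to come from somewhere, which is exactly the administrative burden the paragraph above is pointing at.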

My initial goal of using concrete paths to simplify debugging might have been 
the wrong choice.  Using some kind of indirect configuration that points to a 
real namespace (like viewfs) may have been better.  But, that requires that you 
value "administrators should be able to easily move a NN to a new URL."  The 
ability to do this with the old relative paths was not a design goal, so much 
as a useful result of using the shortest name possible for each file.
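
The trade-off between concrete paths and indirect configuration can be sketched as follows. This is a hypothetical Python illustration, not Accumulo's design: resolve, rewrite_prefix, the volume table, and the host names nn-old and nn-new are all made up for the example.

```python
from typing import Dict, Tuple


def resolve(ref: Tuple[str, str], volume_table: Dict[str, str]) -> str:
    """Indirect scheme: a file is stored as (volume name, relative path) and
    turned into a real URI through a configurable table."""
    name, rel = ref
    return volume_table[name].rstrip("/") + "/" + rel.lstrip("/")


def rewrite_prefix(path: str, old: str, new: str) -> str:
    """Concrete scheme: every stored full path that starts with the old
    namenode URI has to be rewritten when the namenode moves."""
    if path.startswith(old):
        return new + path[len(old):]
    return path


# Indirect: moving the namenode is a single change to the table.
table = {"vol1": "hdfs://nn-old:8020/accumulo"}
ref = ("vol1", "tables/1/f.rf")
before = resolve(ref, table)
table["vol1"] = "hdfs://nn-new:8020/accumulo"
after = resolve(ref, table)

# Concrete: the same move means touching each stored path.
moved = rewrite_prefix(before, "hdfs://nn-old:8020", "hdfs://nn-new:8020")
```

Both routes end at the same URI. The difference is where the change has to be made, which is also why concrete paths are easier to read when debugging.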

bq. These really seem to be the long pole in the tent for the 1.6 release 

Seems to me to not be so far behind namespaces. Constructive criticism includes 
suggestions on how to make things better.  Working code is even more 
constructive.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-01-31 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887940#comment-13887940
 ] 

Keith Turner commented on ACCUMULO-118:
---

bq. This suite of tickets is really starting to concern me.

I am also concerned, I am worried about administrators having a really bad 
experience (bricked Accumulo instance).   In hindsight I think this feature was 
merged in before it was complete and should not have been merged in.  What can 
we learn from this to make development go more smoothly in the future?

Personally I think if a new feature is incomplete (in terms of tests, usability, 
documentation, etc), breaks other features of Accumulo, or introduces new 
problems it should not be merged in.  I did not realize all of the problems 
absolute paths could cause until well after it was merged in.   In hindsight I 
think the design of this feature should have started with administrative use 
cases. Maybe that would have brought to attention the problems w/ absolute 
paths much earlier.   

bq.  I would like to start the conversation about marking this item and items 
around it as @Experimental

I wish it were that simple.  Whether you use multiple namenodes or not, Accumulo 
will use absolute paths in 1.6 and administrators do not have a reliable way to 
manage those paths.  I have a fix for ACCUMULO-1832 in the works, I should be 
done w/ that very soon.  I think that fix will address my major concern w/ this 
issue.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2014-01-30 Thread John Vines (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887170#comment-13887170
 ] 

John Vines commented on ACCUMULO-118:
-

This suite of tickets is really starting to concern me. These really seem to be 
the long pole in the tent for the 1.6 release and I'm starting to question how 
badly they're holding back the release. We have a mix of standing improvements 
that are encompassed with this feature with tickets for bugs found and bugs 
that are theoretically possible.

I would like to start the conversation about marking this item and items around 
it as @Experimental and minimizing impact from the work done so far on single 
node instances.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-12-18 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852117#comment-13852117
 ] 

Keith Turner commented on ACCUMULO-118:
---

Need to fix ACCUMULO-1771 before fixing this



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-09-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764775#comment-13764775
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1465b6e32a6b8b1c480908f0fd39e825b20c3d21 in branch refs/heads/master 
from [~keith_turner]
[ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=1465b6e ]

ACCUMULO-118 made export table generate fully qualified path for export file in 
distcp file
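
As a rough illustration of what "fully qualified" means here, the Python sketch below prepends a default file system URI to a path that has no scheme of its own. It is not the code from the commit; fully_qualify, the default_fs parameter, and the example host names are hypothetical.

```python
from urllib.parse import urlsplit


def fully_qualify(path: str, default_fs: str) -> str:
    """Return path unchanged if it already carries a scheme; otherwise
    prefix it with the default file system URI."""
    if urlsplit(path).scheme:
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


# A bare path is ambiguous once more than one namenode is involved.
print(fully_qualify("/export/t1/data.rf", "hdfs://nn1:8020"))
# A path that already names its file system is left alone.
print(fully_qualify("hdfs://nn2:8020/export/t1/data.rf", "hdfs://nn1:8020"))
```

A copy tool reading a list of such paths can then find each file regardless of which file system it treats as its default.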




[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692404#comment-13692404
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1496225 from [~ecn]
[ https://svn.apache.org/r1496225 ]

ACCUMULO-118 added getConnector method for VolumeManager test



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689408#comment-13689408
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1495105 from [~ecn]
[ https://svn.apache.org/r1495105 ]

ACCUMULO-118 fix rat check



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689396#comment-13689396
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1495099 from [~ecn]
[ https://svn.apache.org/r1495099 ]

ACCUMULO-118 add a note to the user's manual about multi-volume configuration



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689393#comment-13689393
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1495096 from [~ecn]
[ https://svn.apache.org/r1495096 ]

ACCUMULO-118 there's only one default volume



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689387#comment-13689387
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1495087 from [~ecn]
[ https://svn.apache.org/r1495087 ]

ACCUMULO-118 move unit test into test area



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689277#comment-13689277
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1495020 from [~ecn]
[ https://svn.apache.org/r1495020 ]

ACCUMULO-118 add unit test for multiple volumes



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688403#comment-13688403
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1494761 from [~ecn]
[ https://svn.apache.org/r1494761 ]

ACCUMULO-118 cleanup sandbox



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688387#comment-13688387
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1494759 from [~ecn]
[ https://svn.apache.org/r1494759 ]

ACCUMULO-118 merge to trunk



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688356#comment-13688356
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1494746 from [~ecn]
[ https://svn.apache.org/r1494746 ]

ACCUMULO-118 fix last TODO



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688319#comment-13688319
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1494736 from [~ecn]
[ https://svn.apache.org/r1494736 ]

ACCUMULO-118 use the correct data range for making file delete markers for 
!METADATA merges



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688089#comment-13688089
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1494671 from [~ecn]
[ https://svn.apache.org/r1494671 ]

ACCUMULO-118 merge trunk to branch



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688052#comment-13688052
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1494648 from [~ecn]
[ https://svn.apache.org/r1494648 ]

ACCUMULO-118 fix fs chooser, enabled multi-volume where the first volume is not 
the same as the HDFS config



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687274#comment-13687274
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1494340 from [~ecn]
[ https://svn.apache.org/r1494340 ]

ACCUMULO-118 cleanup TODOs simplify VolumeManager interface, add some basic 
documentation



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687215#comment-13687215
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1494310 from [~ecn]
[ https://svn.apache.org/r1494310 ]

ACCUMULO-118 use a configurable class to select volumes for new files
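
For readers following along, here is a minimal sketch of what a configurable
volume-selection class could look like. It is not taken from the commit: the
names VolumeChooserSketch, RandomChooserSketch and ChooserLoader are
hypothetical, and the real Accumulo interface and configuration key may differ.
It only illustrates the general idea of loading a strategy by class name and
asking it to pick one of the configured volumes for a new file.

```java
import java.util.Random;

// Hypothetical strategy interface; not the actual Accumulo API.
interface VolumeChooserSketch {
  // Returns one entry of the given (non-empty) array of volume URIs.
  String choose(String[] volumes);
}

// One possible strategy: pick a volume uniformly at random.
class RandomChooserSketch implements VolumeChooserSketch {
  private final Random random = new Random();

  @Override
  public String choose(String[] volumes) {
    return volumes[random.nextInt(volumes.length)];
  }
}

// Hypothetical loader: instantiates the chooser named in configuration.
class ChooserLoader {
  // Assumes the named class has a no-argument constructor.
  static VolumeChooserSketch load(String className) {
    try {
      return Class.forName(className)
          .asSubclass(VolumeChooserSketch.class)
          .getDeclaredConstructor()
          .newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("cannot load chooser " + className, e);
    }
  }
}
```

Keeping the strategy behind a small interface means a site could swap in a
different policy (for example, one based on free space) by changing only the
configured class name.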



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687126#comment-13687126
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1494279 from [~ecn]
[ https://svn.apache.org/r1494279 ]

ACCUMULO-118 merge trunk to sandbox



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687002#comment-13687002
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1494226 from [~ecn]
[ https://svn.apache.org/r1494226 ]

ACCUMULO-118 switch to using "volume" instead of "namespace"



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13685920#comment-13685920
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1493916 from [~ecn]
[ https://svn.apache.org/r1493916 ]

ACCUMULO-118 merge root, fix bunches of issues with gc



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13685547#comment-13685547
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1493756 from [~ecn]
[ https://svn.apache.org/r1493756 ]

ACCUMULO-118 fixed log recovery, du



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-06 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677252#comment-13677252
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1490362 from [~ecn]
[ https://svn.apache.org/r1490362 ]

ACCUMULO-118 add apache license header to new files



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-06 Thread Eric Newton (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677217#comment-13677217
 ] 

Eric Newton commented on ACCUMULO-118:
--

Right now I'm using "instance.namespaces" to name the file system roots to use.
It was noticed recently that the term "namespace" conflicts with the concept of
"table namespaces". Does anyone have a better term for these file system roots?

 * instance.dfs.uris
 * instance.dfs.dirs
 * instance.dfs.roots
 * instance.dfs.filesystems
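
As an illustration of what such a property might hold, here is a minimal
sketch that parses a list of file system roots. It assumes the value is a
comma-separated list of full URIs, which the comment above does not state;
the class name FileSystemRoots and the host names in the usage note are
hypothetical and not part of Accumulo.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Accumulo.
class FileSystemRoots {
  // Splits a comma-separated property value (assumed format) into root URIs.
  static List<URI> parse(String propertyValue) {
    List<URI> roots = new ArrayList<>();
    for (String part : propertyValue.split(",")) {
      String trimmed = part.trim();
      if (trimmed.isEmpty()) {
        continue; // tolerate stray commas and empty values
      }
      URI uri = URI.create(trimmed);
      if (uri.getScheme() == null) {
        // A bare path would not identify which file system it lives on.
        throw new IllegalArgumentException("root must be a full URI: " + trimmed);
      }
      roots.add(uri);
    }
    return roots;
  }
}
```

For example, parsing "hdfs://nn1:8020/accumulo, hdfs://nn2:8020/accumulo"
would yield two roots, one per NameNode. Rejecting values without a scheme
reflects the issue's suggestion to use full path names.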





[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-06 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677197#comment-13677197
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1490338 from [~ecn]
[ https://svn.apache.org/r1490338 ]

ACCUMULO-118 handle different uris



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-06 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677189#comment-13677189
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1490337 from [~ecn]
[ https://svn.apache.org/r1490337 ]

ACCUMULO-118 merge trunk into sandbox



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-06 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677002#comment-13677002
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1490273 from [~ecn]
[ https://svn.apache.org/r1490273 ]

ACCUMULO-118 merge trunk into sandbox

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-06 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676974#comment-13676974
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1490262 from [~ecn]
[ https://svn.apache.org/r1490262 ]

ACCUMULO-118 merge trunk to branch

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676155#comment-13676155
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1489980 from [~ecn]
[ https://svn.apache.org/r1489980 ]

ACCUMULO-118 merge trunk into sandbox

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-06-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676132#comment-13676132
 ] 

ASF subversion and git services commented on ACCUMULO-118:
--

Commit 1489969 from [~ecn]
[ https://svn.apache.org/r1489969 ]

ACCUMULO-118 support multiple namespaces for tables

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-28 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668603#comment-13668603
 ] 

Keith Turner commented on ACCUMULO-118:
---

I was looking at some docs on viewfs.  If possible, I am thinking we should not 
do anything that would preclude using viewfs.  It seems like if URIs were 
supported for tablet dirs and files (along with a way to choose a tablet dir), 
this would almost be enough to support viewfs.

{noformat}
  1;m srv:dir viewfs://clusterX/accumulo1/tables/abc
  1;m file:viewfs://ns1/accumulo1/tables/abc/F002.rf []196,1

  1< srv:dir viewfs://clusterX/accumulo2/tables/abc
  1< file:viewfs://ns1/accumulo2/tables/abc/F003.rf []196,1
{noformat}

If we want to further develop our own indirection layer, then maybe we should 
define our own URI prefix.   Something like ans://.  How independent should 
this URI be?  Something like ans:/// would assume that 
you know where to look  up.  If the URI were like 
ans://++/ then it would be more 
self-contained.  I do not think it's necessary to make it self-contained; it's 
for internal use and would be translated as needed.

I was thinking about how bulk import will work in this federated world.  Below 
is one way this could work.

 * Client calls import dir w/ /foo1
 * Accumulo client code uses local config to convert /foo1 to URI hdfs://nn1/foo1
 * hdfs://nn1/foo1 is passed to Accumulo server code via thrift
 * Accumulo server code looks at URI to determine where to move to, determines 
it has accumulo dir hdfs://nn1/accumulo.
 * moves files in hdfs://nn1/foo1 to hdfs://nn1/accumulo/tables/abc
 * Replaces hdfs://nn1/accumulo/tables/abc with ans://ns1/accumulo/tables/abc
 * Does bulk import of files in ans://ns1/accumulo/tables/abc

Is this how it should work?  The scenario above implies that Accumulo needs a 
dir on each namenode and a way of mapping URIs to the appropriate Accumulo dir.  
We need to work through this scenario w/ viewfs also.  
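
The URI rewriting in the steps above could be sketched roughly as follows, in Python for brevity. The ans:// scheme and the hdfs://nn1/accumulo directory come from the comment; the function names, the ACCUMULO_DIRS and NAMESPACES tables, and the default_fs parameter are hypothetical, and no files are actually moved.

```python
from urllib.parse import urlsplit

# Hypothetical table: namenode authority -> Accumulo base dir on that namenode.
ACCUMULO_DIRS = {"nn1": "hdfs://nn1/accumulo"}
# Hypothetical table: namenode authority -> logical namespace name.
NAMESPACES = {"nn1": "ns1"}


def qualify(path, default_fs):
    """Client side: turn a bare path such as /foo1 into a full URI."""
    if urlsplit(path).scheme:
        return path  # already a URI, leave it alone
    return default_fs.rstrip("/") + path


def plan_bulk_import(import_uri, table_id):
    """Server side: choose the destination on the same namenode as the
    import dir, and the logical ans:// form that would be recorded."""
    authority = urlsplit(import_uri).netloc
    if authority not in ACCUMULO_DIRS:
        raise ValueError("no Accumulo dir configured on " + authority)
    dest = ACCUMULO_DIRS[authority] + "/tables/" + table_id
    logical = "ans://" + NAMESPACES[authority] + urlsplit(dest).path
    return dest, logical
```

Keeping the move on the same namenode as the import directory means it stays a rename rather than a copy, which is why the sketch looks up the Accumulo dir by the authority of the incoming URI.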




> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-28 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668434#comment-13668434
 ] 

Dave Marion commented on ACCUMULO-118:
--

[~afuchs] I think they are related in that they are both solutions to the same 
problem. However, I think you can still use both solutions together.

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-28 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668428#comment-13668428
 ] 

Dave Marion commented on ACCUMULO-118:
--

bq. Would avoiding storing direct pointers to namenodes in the metadata table be 
sufficient to satisfy this? Always have a level of indirection like viewfs? 

Yes, a simple direct mapping is what I would like to see. Using a hash function 
is not simple or direct and I don't see why it's needed. If I have a backup of 
the files from ns1 on ns2 and vice versa (think hot backup) and ns2 fails, then 
I can just change the mapping so that both ns1 and ns2 point to the same Hadoop 
HDFS instance. No update of the metadata table needed.
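
A minimal sketch of that remapping, assuming logical file references of the form namespace://ns/path as in the earlier example in this thread. The resolve function, the host names, and the ports are hypothetical.

```python
def resolve(logical_uri, mapping):
    """Translate namespace://<ns>/<path> into a concrete HDFS URI."""
    prefix = "namespace://"
    if not logical_uri.startswith(prefix):
        raise ValueError("not a logical URI: " + logical_uri)
    ns, _, path = logical_uri[len(prefix):].partition("/")
    return mapping[ns].rstrip("/") + "/" + path


# Hypothetical namespace -> namenode mapping.
mapping = {"ns1": "hdfs://nn1:9000", "ns2": "hdfs://nn2:9000"}
ref = "namespace://ns2/accumulo/tables/abc/F002.rf"
before = resolve(ref, mapping)

# The HDFS instance behind ns2 is lost; its files were backed up to the
# instance behind ns1, so only the mapping changes.
mapping["ns2"] = mapping["ns1"]
after = resolve(ref, mapping)
```

The stored reference is the same string before and after the failover, which is the property being asked for: the metadata table is never rewritten.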

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-28 Thread Adam Fuchs (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668380#comment-13668380
 ] 

Adam Fuchs commented on ACCUMULO-118:
-

How does this relate to ACCUMULO-722? Seems like if we implemented 
ACCUMULO-722 then we wouldn't need this, but this is a much easier ticket, 
right?

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-28 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668377#comment-13668377
 ] 

Keith Turner commented on ACCUMULO-118:
---

Would also be nice to outline an upgrade strategy in the design doc.  

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-28 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668366#comment-13668366
 ] 

Keith Turner commented on ACCUMULO-118:
---

bq.  I would rather see a mapping of namespace prefix to NN in the 
configuration (ns1 = hdfs://host:port, ns2 = hdfs://host:port)

I agree, I think making this explicit and straightforward is the way to go. 
Although, storing the namespace config in accumulo-site.xml seems error-prone.  
In the worst case node1 defines ns1=hdfs://nn3 and node2 defines 
ns2=hdfs://nn4.  I would advocate only storing this mapping in zookeeper.   

bq.  I'm thinking forward to table file load balancing across namespaces and 
backups (see my comment from 3/Apr/12). 

[~dlmarion] Would avoiding storing direct pointers to namenodes in the metadata 
table be sufficient to satisfy this?  Always have a level of indirection like 
viewfs?  

[~ecn] Would there be any reason for a single tablet to spread its files across 
multiple name nodes?  Tablets currently have a directory column that tells a 
tablet where to create new files.  This could be converted to an absolute 
path/url. When a tablet creates a new file, it uses this path.  There may be 
some small efficiency gain when opening multiple files for a tablet if all of the 
calls go to the same namenode.

{noformat}
   1< srv:dir  namespace://ns1/accumulo/tables/abc
   1< file:namespace://ns1/accumulo/tables/abc/F002.rf []196,1
{noformat}

bq. Perhaps a per-table configuration of the hash function?

Could possibly have a plugin that's called to choose the value of srv:dir for a 
new tablet.  The input to this function could be the KeyExtent and list of 
available namespaces, and it could return a namespace/url.  The default 
implementation could hash+mod the tablet's end row and use that as an index into the 
list of namespaces.
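
A rough sketch of such a default chooser, in Python for brevity. It takes only the end row rather than a full KeyExtent, and the function names and the use of CRC32 as the hash are assumptions made for illustration.

```python
import zlib


def choose_namespace(end_row, namespaces):
    """Pick a namespace for a new tablet by hashing its end row."""
    if not namespaces:
        raise ValueError("no namespaces configured")
    # The last tablet of a table has no end row; treat it as empty bytes.
    key = end_row if end_row is not None else b""
    # crc32 is stable across processes, unlike Python's built-in hash().
    return namespaces[zlib.crc32(key) % len(namespaces)]


def tablet_dir(table_id, end_row, namespaces):
    """Build the srv:dir value for a new tablet."""
    ns = choose_namespace(end_row, namespaces)
    return "namespace://" + ns + "/accumulo/tables/" + table_id
```

Because the choice is made once, when the tablet is created, and then stored in srv:dir, adding a namespace to the list later only affects new tablets; existing tablets keep the directory already recorded for them.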

[~ecn] once a design is settled on, I think it would be useful if the design 
doc outlined how this new feature will interact with bulk import, clone table, 
export/import table, and offline map reduce.   


> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-24 Thread Christopher Tubbs (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666516#comment-13666516
 ] 

Christopher Tubbs commented on ACCUMULO-118:


{quote}Personally I am not a fan of the hash idea. I would rather...{quote}

Perhaps a per-table configuration of the hash function?

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-24 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666314#comment-13666314
 ] 

Dave Marion commented on ACCUMULO-118:
--

Personally I am not a fan of the hash idea. I would rather see a mapping of 
namespace prefix to NN in the configuration (ns1 = hdfs://host:port, ns2 = 
hdfs://host:port). I'm thinking forward to table file load balancing across 
namespaces and backups (see my comment from 3/Apr/12). If for example you 
quiesced the database and performed a backup, then you could change the 
namespace mapping such that ns1 and ns2 point to the same hdfs://host:port if 
for some reason you lost the 2nd hdfs instance (it crashed, you wanted to 
remove it, etc). 

This could also allow for an upgrade of Hadoop while Accumulo is still running. Think about 
the scenario where ns1 is on racks 1&2 and ns2 is on racks 3&4 of a cluster and 
the files of table T are spread across ns1 and ns2. You could change the 
configuration of the table file load balancer (new feature) so that it puts new files 
on ns2. You recompact the table so now all new files are on ns2. When done for 
all tables (and walogs), then you can shut down ns1 and upgrade to a new version 
of Hadoop.
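
The "when done" condition in that scenario could be checked with something like the sketch below. It assumes file and walog references are stored as namespace://ns/path strings; both function names are hypothetical.

```python
def files_on(namespace, file_refs):
    """Return the references that still live in the given namespace."""
    prefix = "namespace://" + namespace + "/"
    return [ref for ref in file_refs if ref.startswith(prefix)]


def can_retire(namespace, file_refs):
    """True once no table file or walog reference points at the namespace."""
    return not files_on(namespace, file_refs)
```

In the example above, ns1 would be safe to shut down only once can_retire("ns1", ...) holds for the references of every table and every write-ahead log.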

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-23 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665879#comment-13665879
 ] 

Josh Elser commented on ACCUMULO-118:
-

{quote}
With a different viewfs implementation, we could use a hash to
determine the namespace and hide the whole switch between file
systems:

hash("/accumulo/tables/1/default_tablet/A0003892.rf") % 2 -> hdfs://nn1:12345
hash("/accumulo/tables/1/default_tablet/F0004567.rf") % 2 -> hdfs://nn2:12345

Unfortunately, renaming a file might force it to a new namespace. 
{quote}

I bet you could be tricky with the actual hash+modulo in a way that 
guarantees a rename wouldn't change namespaces.
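
One possible reading of that idea, offered only as an illustration and not as anything proposed in this thread: hash the parent directory instead of the full path, so that a rename within the same tablet directory cannot move a file to a different namenode. The namenode URIs are the ones from the quoted example; the function name and the use of CRC32 are assumptions.

```python
import posixpath
import zlib

NAMENODES = ["hdfs://nn1:12345", "hdfs://nn2:12345"]


def namenode_for(path):
    """Pick a namenode from the parent directory, ignoring the file name."""
    parent = posixpath.dirname(path)
    # crc32 is stable across processes, unlike Python's built-in hash().
    return NAMENODES[zlib.crc32(parent.encode("utf-8")) % len(NAMENODES)]
```

The guarantee only covers renames that stay inside one directory; moving a file to another directory can still change its namespace.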

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
> Attachments: ACCUMULO-118-01.txt
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-23 Thread Eric Newton (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665392#comment-13665392
 ] 

Eric Newton commented on ACCUMULO-118:
--

Yes, I'm working on it now.

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-23 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665235#comment-13665235
 ] 

Keith Turner commented on ACCUMULO-118:
---

bq.  It's time to start thinking about 1.6, and I intend for this to be part of 
that release. My goal is a working branch in 2 months.

Do you plan to post a design doc outlining your approach to tackling this?  

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.



[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-22 Thread Eric Newton (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664756#comment-13664756
 ] 

Eric Newton commented on ACCUMULO-118:
--

Yes, [~elserj] that's what I mean.  It's time to start thinking about 1.6, and 
I intend for this to be part of that release.  My goal is a working branch in 2 
months.

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-22 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664749#comment-13664749
 ] 

Josh Elser commented on ACCUMULO-118:
-

I believe this means that Eric intends to not release 1.6 without this.

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode

2013-05-22 Thread David Medinets (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664741#comment-13664741
 ] 

David Medinets commented on ACCUMULO-118:
-

Why was this switched to a blocker? What is it blocking?

> accumulo could work across HDFS instances, which would help it to scale past 
> a single namenode
> --
>
> Key: ACCUMULO-118
> URL: https://issues.apache.org/jira/browse/ACCUMULO-118
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master, tserver
>Reporter: Eric Newton
>Assignee: Eric Newton
>Priority: Blocker
> Fix For: 1.6.0
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>
> Consider using full path names to files, which would allow the servers to 
> access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to 
> break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.
