The simplest thing to do would be to store the row key from TableMain as
the column qualifier in the TableRLookup row.  Then you don't even need to
append to an existing value; you're just adding a new column where the
qualifier stores the row ID and the value can be empty.  So instead of using
a single cell value as your collection, you use the entire column family (or
a subset of it) as your collection.

So, from your example, you would have:

TableRLookup:

user1:
    row1=
    row2=
    row6=
user2:
    row3=
    row7=
user3:
    row4=
    row8=
user4:
    row5=
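
To make the layout concrete, here is a minimal sketch in plain Python.  It
only models the cells in memory (row key -> {qualifier: value}) and does not
use the HBase client API; the names rlookup, add_index_entry and rows_for
are made up for illustration.

```python
from collections import defaultdict

# In-memory stand-in for TableRLookup: row key -> {qualifier: value}.
# This models the data layout only; it is not the HBase client API.
rlookup = defaultdict(dict)

def add_index_entry(user, main_row_key, value=""):
    """Add one column: qualifier = TableMain row key, value may stay empty."""
    rlookup[user][main_row_key] = value

def rows_for(user):
    """Return the TableMain row keys indexed under user, sorted by qualifier."""
    return sorted(rlookup.get(user, {}))

# Initial load from the example.
for row, user in [("row1", "user1"), ("row2", "user1"),
                  ("row3", "user2"), ("row4", "user3")]:
    add_index_entry(user, row)

# Incremental update: each new row is one more column, no read-modify-write.
for row, user in [("row5", "user4"), ("row6", "user1"),
                  ("row7", "user2"), ("row8", "user3")]:
    add_index_entry(user, row)
```

Writing the same entry twice just overwrites the same column, so the
collection behaves like a set rather than a list.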


If you have other existing columns stored in these rows and need to
distinguish the reverse index columns from the others, just use a short
prefix: rl_row1, rl_row2, etc.  HBase will sort the cells within a given
row/family by qualifier, which will keep the rl_ cells together and,
depending on your use case, may help in selecting ranges within the
collection.
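
As a rough sketch of why the prefix helps, the Python below sorts the
qualifiers of one row the way HBase would and then reads only the contiguous
rl_ run.  The "name" and "zip" columns and the index_entries helper are
invented for the example; this is a model of the ordering, not client code.

```python
from bisect import bisect_left

PREFIX = "rl_"

def index_entries(row_cells, prefix=PREFIX):
    """Return the row IDs stored under prefixed qualifiers, in sorted order."""
    qualifiers = sorted(row_cells)            # HBase already keeps these sorted
    start = bisect_left(qualifiers, prefix)   # first qualifier >= prefix
    found = []
    for qualifier in qualifiers[start:]:
        if not qualifier.startswith(prefix):
            break                             # past the end of the rl_ run
        found.append(qualifier[len(prefix):])
    return found

# One TableRLookup row that also holds unrelated (hypothetical) columns.
user1_row = {"name": "", "rl_row6": "", "rl_row1": "", "zip": "", "rl_row2": ""}
```

Because the rl_ cells sit next to each other, the scan can stop at the first
qualifier that does not match instead of examining every column in the row.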

You can also store additional data as the cell value, which may be useful
for additional sorting or filtering of your index entries without having to
go back to the original rows in TableMain.
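
For instance, assuming you stored a date string as the value of each index
cell (the dates and the rows_since helper below are invented purely for
illustration), you could filter the index without touching TableMain.  Again
this is a plain Python model of the cells, not HBase client code.

```python
# Index cells for one user: qualifier = TableMain row key,
# value = a hypothetical ISO date attached to that index entry.
user1_index = {
    "row1": "2011-04-10",
    "row2": "2011-04-11",
    "row6": "2011-04-12",
}

def rows_since(index_cells, cutoff):
    """Return row keys whose stored date is on or after cutoff."""
    return sorted(q for q, value in index_cells.items() if value >= cutoff)
```

ISO dates compare correctly as strings, which is why a plain >= works here.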

If your needs are more complicated than this, then you could create a
coprocessor to perform the append atomically within a single cell and to
support other set operations as well (contains, intersection, etc).

But just by looking at both column qualifiers and values as data, you can
support a lot of uses with nothing more than standard HBase operations.


Gary



On Tue, Apr 12, 2011 at 4:52 PM, Stack <[email protected]> wrote:

> Sounds like a job for a coprocessor; the cell-append-coprocessor.  You
> load it on a particular column family and anything put to this
> processor appends to the existing cell value, if a value already
> exists.
> St.Ack
>
> On Tue, Apr 12, 2011 at 8:54 AM, Vishal Kapoor
> <[email protected]> wrote:
> > rebuilding the whole reverse lookup table should be expensive if I am
> > looking for a million new rows every day in the master table,
> > reading a row manually and then writing the appended row should be a
> > solution but will be a pain.
> >
> > for a file backed system doing a append should be possible?
> >
> > Vishal Kapoor
> >
> > On Tue, Apr 12, 2011 at 11:34 AM, Buttler, David <[email protected]> wrote:
> >> You have the keys for both tables, is there any reason you can't do a get, local append, put?
> >> If you do it in batch, then running a reduce job that collects all of the keys for a given value would be fairly efficient.
> >>
> >> Dave
> >>
> >>
> >> -----Original Message-----
> >> From: Vishal Kapoor [mailto:[email protected]]
> >> Sent: Tuesday, April 12, 2011 8:29 AM
> >> To: [email protected]
> >> Subject: Append value to a cell?
> >>
> >> Do we have any API which can append text values or row Ids to a cell.
> >>
> >> I want to do a control break report and want to append row Ids to a
> >> cell value...
> >> here is an example.
> >>
> >> TableMain
> >> row1 : user1
> >> row2: user1
> >> row3 : user2
> >> row4 : user3
> >>
> >>
> >> reverse lookup.
> >> TableRLookup.
> >> user1 : row1,row2
> >> user2: row3
> >> user3: row4.
> >>
> >> now I get some more data.
> >>
> >> row5 : user4
> >> row6 : user1
> >> row7 : user2
> >> row8 : user3
> >>
> >> now I want my lookup to get updated incrementally, not the entire run.
> >> TableRLookup.
> >> user1 : row1,row2,row6
> >> user2: row3,row7
> >> user3: row4,row8
> >> user4: row5
> >>
> >> my rowIds are String.,
> >>
> >> how can we do this without rebuilding the entire reverse lookup time
> >> from ground up.
> >>
> >> thanks,
> >> Vishal Kapoor
> >>
> >
>
