Brian (I'm guessing that's your name from your email address),

Please be more specific about your table design. For example, "column" is a vague word in HBase, since it may refer to a column family or to a column key inside a column family. Also, what kind of load do you expect to have?
Answering these questions may also help you understand HBase.

Thx,

J-D

On Fri, Jul 18, 2008 at 4:41 PM, imbmay <[EMAIL PROTECTED]> wrote:

> I want to use hbase to maintain a very large dataset which needs to be
> updated pretty much continuously. I'm creating a record for each entity and
> including a creation timestamp column as well as between 10 and 1000
> additional columns named for distinct events related to the record entity.
> Being new to hbase the approach I've taken is to create a map/reduce app
> that for each input record:
>
> Does a lookup in the table using HTable get(row, column) on the timestamp
> column to determine if there is an existing row for the entity.
> If there is no existing record for the entity, the event history for the
> entity is added to the table with one column added per unique event id.
> If there is an existing record for the entity, it just adds the most recent
> event to the table.
>
> I'd like feedback as to whether this is a reasonable approach in terms of
> general performance and reliability or if there is a different pattern
> better suited to hbase with map/reduce or if I should even be using
> map/reduce for this.
>
> Thanks in advance.
>
> --
> View this message in context:
> http://www.nabble.com/Table-Updates-with-Map-Reduce-tp18537368p18537368.html
> Sent from the HBase User mailing list archive at Nabble.com.
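To make the family/key distinction and the per-record steps quoted above concrete, here is a minimal sketch in Python. It does not use the HBase client at all: the table is modeled as a plain dict, and the names `apply_record`, `info:created`, and the `event` family are assumptions made up for illustration, not part of the original design. Columns are written as "family:key", so each event id becomes a column key inside one family rather than a family of its own.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for an HBase table (NOT the HBase API):
# row key -> {"family:key": value}
Table = Dict[str, Dict[str, str]]

CREATED = "info:created"  # assumed column: family "info", key "created"
EVENT_FAMILY = "event"    # assumed family with one column key per event id


def apply_record(table: Table, entity: str, created: str,
                 events: List[Tuple[str, str]]) -> int:
    """Apply one input record and return how many event columns were written.

    `events` is the entity's history as (event_id, payload), oldest first.
    """
    row = table.get(entity)
    if row is None or CREATED not in row:
        # No existing row: store the creation timestamp and the full history.
        row = table.setdefault(entity, {})
        row[CREATED] = created
        to_write = events
    else:
        # Existing row: only the most recent event is added.
        to_write = events[-1:]
    for event_id, payload in to_write:
        row[f"{EVENT_FAMILY}:{event_id}"] = payload
    return len(to_write)


if __name__ == "__main__":
    table: Table = {}
    apply_record(table, "entity-1", "2008-07-18", [("e1", "a"), ("e2", "b")])
    apply_record(table, "entity-1", "2008-07-18",
                 [("e1", "a"), ("e2", "b"), ("e3", "c")])
    print(sorted(table["entity-1"]))
```

In this sketch every input record costs one read before any write. Whether that read-then-write pattern is acceptable against a real table depends on the load, which is why the question about expected load matters.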
