Hi. Thank you for the quick response. About question 3, I want to clarify myself. For example, I have a row (the latest version) that I need to update. I read the row, perform some operations on some cells, and now I want to write the result back. Before I update, I want to check whether another user (another application instance) has already changed this specific row, because otherwise my update will overwrite his changes and his data will be lost. To avoid this, I want to check that the row (the specific cells) I'm going to update still has the same timestamp that I hold, meaning nobody has changed it in the meantime.
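As a rough illustration of the check described above, here is a minimal sketch in plain Java. It does not use the HBase client API: VersionedStore, Cell, put, get and checkAndPut are hypothetical names that only model timestamped cells in memory, and the single lock stands in for whatever row-level locking the real system would provide.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a table of timestamped cells; NOT the HBase API. */
final class VersionedStore {

    /** A cell value together with the timestamp it was written with. */
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final Map<String, Cell> cells = new HashMap<>();
    private long clock = 0; // logical clock: every write gets a strictly larger timestamp

    /** Unconditional write; returns the timestamp assigned to the new version. */
    synchronized long put(String key, String value) {
        long ts = ++clock;
        cells.put(key, new Cell(value, ts));
        return ts;
    }

    /** Returns the latest version of the cell, or null if it does not exist. */
    synchronized Cell get(String key) {
        return cells.get(key);
    }

    /**
     * Writes newValue only if the cell still carries expectedTimestamp.
     * The check and the write happen under one lock, so no other writer
     * can slip in between them. Returns true if the write was applied.
     */
    synchronized boolean checkAndPut(String key, long expectedTimestamp, String newValue) {
        Cell current = cells.get(key);
        if (current == null || current.timestamp != expectedTimestamp) {
            return false; // someone else changed (or removed) the cell
        }
        cells.put(key, new Cell(newValue, ++clock));
        return true;
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.put("row1/col", "v1");

        Cell mine = store.get("row1/col");          // this instance reads the cell
        store.put("row1/col", "other user's edit"); // another writer gets in first

        boolean applied = store.checkAndPut("row1/col", mine.timestamp, "my edit");
        System.out.println("applied = " + applied);
        System.out.println("value   = " + store.get("row1/col").value);
    }
}
```

In this sketch the stale write is rejected and the other user's value survives; the caller would then re-read the cell and decide whether to retry.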
Best Regards.

On Thu, Sep 18, 2008 at 7:50 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:

> Slava,
>
> Answers in-line.
>
> J-D
>
> On Wed, Sep 17, 2008 at 2:49 PM, Slava Gorelik <[EMAIL PROTECTED]> wrote:
>
> > Hi. A few small questions:
> >
> > 1) BatchUpdate.getTimestamp()
> > <http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/io/BatchUpdate.html#getTimestamp()>
> > - If I understand correctly, this method should return the timestamp
> > that the row will be committed with. But how will the BatchUpdate know
> > the timestamp? Shouldn't this timestamp only be known after the row is
> > written? Anyway, the value returned is always the same and not correct.
>
> If you do not specify a timestamp, the value returned will be
> HConstants.LATEST_TIMESTAMP, which is Long.MAX_VALUE. HBase interprets
> this as "if BU.timestamp = LATEST_TIMESTAMP, replace it with the current
> timestamp". The timestamp returned will be different if you created the
> BatchUpdate with a specified timestamp; see my answer to your second
> question.
>
> > 2) Delete Cell - I saw in the FAQ that I need to add a delete record
> > and commit it with exactly the same timestamp as the original row, but
> > I didn't find any commit method that takes a timestamp.
>
> See the BatchUpdate constructor
> <http://hadoop.apache.org/hbase/docs/r0.2.1/api/org/apache/hadoop/hbase/io/BatchUpdate.html#BatchUpdate%28java.lang.String,%20long%29>
> that uses a timestamp.
>
> > 3) For my update operation I need to check whether the row that my
> > application holds still contains the most recent data, and only in
> > that case will I update some cells. To do this I need to lock the row
> > -> check the timestamp of the particular cell -> update it if the
> > timestamp is the same one the application holds. All those operations,
> > if they are performed on HTable, will take a number of RPCs. I think
> > that if it's possible to do those operations directly on the
> > HRegionServer, it will help me get rid of all the extra RPCs. Is there
> > some way to work with the specific HRegionServer that the row belongs
> > to? If yes, how can I get the HRegionServer for this specific row?
>
> It is best to leave how HBase works abstracted away in the client, or
> this could be a mess. For example, you would have to reimplement the
> finding of a region server for a region, with retries. Instead of
> updating by deleting/inserting, you should just do a put so it will be
> inserted with the current timestamp and, by default, HBase retrieves the
> cell with the latest timestamp for a get or a scan. How HBase works is
> very different from your typical RDBMS ;)
>
> > Thank You and Best Regards.
> > Slava.
