Wes,

The timestamp is used for versioning.
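
A rough sketch of what versioning by timestamp means, written as a toy in-memory model rather than against the HBase API. The class name, method names, and the millisecond clock are all made up for illustration; the only behaviour taken from this thread is that each value is stored under a stamp, that stamps may not lie in the future, and that only the N newest versions survive a cleanup pass.

```python
import time


class VersionedCell:
    """Toy model of one cell holding several timestamped versions.

    Hypothetical illustration only; this is not how HBase is implemented.
    """

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.versions = {}  # timestamp in ms -> value

    def put(self, value, ts=None):
        now = int(time.time() * 1000)
        if ts is None:
            ts = now  # default: stamp with the current time
        if ts > now:
            # Stamps at or before now are accepted, future stamps are not.
            raise ValueError("timestamp may not be in the future")
        self.versions[ts] = value

    def get(self, as_of=None):
        """Return the newest value, or the newest one stamped at or before as_of."""
        candidates = [ts for ts in self.versions if as_of is None or ts <= as_of]
        return self.versions[max(candidates)] if candidates else None

    def compact(self):
        """Drop everything except the max_versions newest entries."""
        keep = sorted(self.versions, reverse=True)[: self.max_versions]
        self.versions = {ts: self.versions[ts] for ts in keep}
```

Reading with `get()` returns the latest version, `get(as_of=...)` reads the cell as it looked at an earlier stamp, and `compact()` stands in for the point at which surplus old versions are discarded.
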

There have been arguments recently, around the 0.20 changes, about whether the user should be allowed to set this stamp manually or whether it should always be generated server-side from the current time. The current decision is to allow the user to set the stamp manually on insertion, to any value at or before now (but not in the future). This lets us ensure that, when doing a flush, no entry in the storefile has a stamp later than the flush stamp.

In the canonical use case for HBase, web crawling, timestamps are used to version and date each crawl. You could then set HBase to keep the 10 most recent versions, and older ones would be deleted on major compactions.

At the other extreme, you could set the timestamp yourself, so that each individual column in a family becomes a time-ordered list of whatever you want. In practice, however, I've found that it makes more sense to encode stamps in your row keys or column names.

Hope that helps.

JG

> -----Original Message-----
> From: Wes Chow [mailto:[email protected]]
> Sent: Wednesday, April 01, 2009 5:56 AM
> To: [email protected]
> Subject: timestamp uses
>
> So far, few if any of the schema designs I've come across have really
> talked about using the timestamp field and HBase's automatic deletion of
> old cells in a smart way.
>
> What is the timestamp typically used for? Snapshotting? Implementing
> more complicated transactions than HBase natively supports?
>
> Wes
