I realized that there is a problem. I read some comments and realized that the TTL is applied only when a compaction is executed. That means if some regions hold data older than the TTL, and no new data is written to those regions, then the old data will never be removed, because no compaction will ever happen for that data after it is stored.

Unfortunately, this is a fairly common situation for log data. Once it is written, it is seldom touched again, so there will be no compaction against that data. It would be nice to use TTL to retire it, but because there is no compaction for those old regions, the data will never be retired.

I wonder whether the above scenario is correct and, if it is, whether there is a solution for this other than periodically issuing a client-side delete request?

Jimmy

--------------------------------------------------
From: "Jonathan Gray" <jg...@facebook.com>
Sent: Monday, September 13, 2010 6:06 PM
To: <user@hbase.apache.org>
Subject: RE: what is the unit of the TTL in hbase and how often hbase remove expired regions?

The unit is seconds as outlined in HColumnDescriptor. It's a little confusing because server-side everything is milliseconds. On the server, it is converted from the user-configured seconds to milliseconds.
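To make the arithmetic concrete, here is a minimal sketch of that seconds-to-milliseconds conversion applied to the TTL=600000 from the original question. The class `TtlUnits` and the method `ttlSecondsToMillis` are made up for illustration and are not part of HBase; only the conversion itself is shown.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, NOT part of HBase: it only illustrates the unit conversion.
class TtlUnits {
    // The TTL is configured in seconds; the server side works in milliseconds.
    static long ttlSecondsToMillis(int ttlSeconds) {
        return TimeUnit.SECONDS.toMillis(ttlSeconds);
    }

    public static void main(String[] args) {
        int configured = 600000; // value used in the original question
        long millis = ttlSecondsToMillis(configured);
        // Interpreted as seconds, 600000 is several days, not 10 minutes.
        System.out.println(configured + " s = " + millis + " ms, about "
                + TimeUnit.SECONDS.toHours(configured) + " whole hours");
        // A 10-minute TTL would be configured as this many seconds.
        System.out.println("10 minutes = " + TimeUnit.MINUTES.toSeconds(10) + " s");
    }
}
```

So a TTL of 600000 is read as roughly a week, which would explain why nothing appeared to expire after 10 minutes.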

Also, HBase will never expire "regions", rather it expires individual versions of cells according to their timestamps.

This is not enforced periodically, it is actually enforced constantly, so you should *never see* an expired cell. This does not mean it does not still exist on disk, it means it will not be visible in user queries. On a major compaction (default every 24 hours) HBase will actually delete the expired cells.
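The difference between "not visible" and "deleted from disk" can be sketched with a toy model. This is not HBase code: the class `TtlModel`, its methods, and the exact expiry comparison are assumptions made for illustration, and cells are reduced to bare timestamps in milliseconds.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, NOT HBase code: a "cell" here is just a timestamp in milliseconds.
class TtlModel {
    final List<Long> onDisk = new ArrayList<>();
    final long ttlMillis;

    // TTL is supplied in seconds and converted to milliseconds internally.
    TtlModel(long ttlSeconds) {
        this.ttlMillis = ttlSeconds * 1000L;
    }

    void put(long timestampMillis) {
        onDisk.add(timestampMillis);
    }

    // Assumed expiry rule for this sketch: older than the TTL means expired.
    boolean expired(long timestampMillis, long nowMillis) {
        return nowMillis - timestampMillis > ttlMillis;
    }

    // Read path: expired cells are skipped on every read but remain stored.
    List<Long> scan(long nowMillis) {
        List<Long> visible = new ArrayList<>();
        for (long ts : onDisk) {
            if (!expired(ts, nowMillis)) {
                visible.add(ts);
            }
        }
        return visible;
    }

    // Major compaction: expired cells are physically dropped.
    void majorCompact(long nowMillis) {
        onDisk.removeIf(ts -> expired(ts, nowMillis));
    }
}
```

In this model, a cell written at time 0 with a 600-second TTL stops showing up in `scan` once more than 600000 ms have passed, yet it still occupies space in `onDisk` until `majorCompact` runs.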

JG

-----Original Message-----
From: Jinsong Hu [mailto:jinsong...@hotmail.com]
Sent: Monday, September 13, 2010 3:52 PM
To: user@hbase.apache.org
Subject: what is the unit of the TTL in hbase and how often hbase remove expired regions?

Hi,
  I want to find out what the unit for TTL is in HBase. I googled around and found some people saying it is microseconds, and I thought it was milliseconds, as that is the Java default. Then I searched the HBase code and saw some test code treating the unit as seconds.
  I used TTL=600000. If the unit is milliseconds, that means 10 minutes. However, I continued to insert records into this table and found that the regions older than 10 minutes were not removed.
  The first question I have is: what is the unit for TTL? The second question is: how often does HBase check all regions and remove expired regions?

Jimmy

