Yes.

I have already used the approach that Nicolas suggested.

As for the approach Lars suggested, exporting the content of the table,
I am not sure it is a good idea. Since I can't control when compactions
run, completeness of the data cannot be guaranteed: if a compaction
happens between two export operations, the deleted data will be lost.
BTW, if I understand Lars's description correctly, the deleted data
might also be removed during a minor compaction!

Thanks

Yong
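
To make the concern above concrete, here is a minimal toy model in
Python. It is only a sketch and not the HBase API: ToyStore, Cell, and
run are hypothetical names, and the model assumes a major compaction
drops delete markers together with the puts they mask, unless a
KEEP_DELETED_CELLS-like flag is set (version-count and TTL expiry are
ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: str
    ts: int
    kind: str          # "put" or "delete"
    value: str = ""


class ToyStore:
    """Toy model of a single column in one store; NOT the HBase API."""

    def __init__(self, keep_deleted_cells=False):
        self.keep_deleted_cells = keep_deleted_cells
        self.cells = []

    def put(self, row, ts, value):
        self.cells.append(Cell(row, ts, "put", value))

    def delete(self, row, ts):
        # A delete only writes a marker; it masks puts with ts <= marker ts.
        self.cells.append(Cell(row, ts, "delete"))

    def raw_export(self):
        # Stand-in for a raw scan/export: everything physically stored.
        return set(self.cells)

    def major_compact(self):
        if self.keep_deleted_cells:
            return  # simplification: nothing expires in this toy model
        markers = [c for c in self.cells if c.kind == "delete"]

        def masked(cell):
            return any(m.row == cell.row and cell.ts <= m.ts for m in markers)

        # Drop the markers themselves and every put they mask.
        self.cells = [c for c in self.cells
                      if c.kind == "put" and not masked(c)]


def run(keep_deleted_cells):
    """Return every cell seen by two exports with a compaction in between."""
    store = ToyStore(keep_deleted_cells)
    store.put("r1", 1, "a")
    first = store.raw_export()
    store.put("r2", 2, "b")    # written after the first export
    store.delete("r2", 3)      # and deleted before the second one
    store.major_compact()      # compaction runs between the two exports
    second = store.raw_export()
    return first | second


lost = run(keep_deleted_cells=False)
kept = run(keep_deleted_cells=True)
```

In this model, `lost` contains no cell for row "r2" at all, while `kept`
still holds both the put and its delete marker, which is the difference
between periodic exports alone and keeping deleted cells around.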


On Thu, Jan 26, 2012 at 11:52 PM, lars hofhansl <lhofha...@yahoo.com> wrote:
> If you are planning to use trunk (what will be 0.94) you can also enable 
> KEEP_DELETED_CELLS for your column families.
> That will keep deleted cells around (until they get removed because of # of 
> versions, or TTL).
>
> Also note that version # and TTL checks are also performed during minor 
> compactions and even during memstore flushes, and hence cells might be 
> removed on those occasions as well.
>
> If you have time and space, you can also back up your tables into text files
> (using export) and crunch them there (I added support for HBASE-4536 in
> export as well).
>
> -- Lars
>
>
> ----- Original Message -----
> From: yonghu <yongyong...@gmail.com>
> To: user@hbase.apache.org; lars hofhansl <lhofha...@yahoo.com>
> Cc:
> Sent: Thursday, January 26, 2012 1:22 PM
> Subject: Re: the occasion of the major compact?
>
> Yes. I read this blog post:
> http://hadoop-hbase.blogspot.com/2011/12/raw-scans.html. I thought
> that if I could disable major compaction, I could use the approach
> described in the blog. Otherwise, the major compaction will remove the
> deleted data.
>
> Thanks!
>
> Yong
>
> On Thu, Jan 26, 2012 at 10:11 PM, lars hofhansl <lhofha...@yahoo.com> wrote:
>> Unless you have HBASE-4536 (only in trunk, though) or are parsing the HFiles 
>> yourself you have no way of actually getting to the deleted data.
>>
>> -- Lars
>>
>>
>>
>> ----- Original Message -----
>> From: yonghu <yongyong...@gmail.com>
>> To: user@hbase.apache.org
>> Cc:
>> Sent: Thursday, January 26, 2012 1:00 PM
>> Subject: Re: the occasion of the major compact?
>>
>> Nicolas,
>>
>> In my use case, I want to extract the deleted data. Hence, if I
>> disable major compaction, I can prevent HBase from actually
>> deleting the data. After extracting the deleted data, I can issue a
>> major compaction myself.
>>
>> Regards
>>
>> Yong
>>
>> On Thu, Jan 26, 2012 at 8:02 PM, Nicolas Spiegelberg
>> <nspiegelb...@fb.com> wrote:
>>> Yong,
>>>
>>> Can you please explain why you want to disable major compactions?  What
>>> are the problems that you're currently seeing, or what are you worried will
>>> happen if a major compaction is allowed to occur?  Right now, there is
>>> only an extremely small subset of cases where you must explicitly disable
>>> compactions.  The use cases I know of are very complicated and require
>>> building StoreFile analysis tools underneath HBase, which I'm pretty sure
>>> you don't need.
>>>
>>> Please also read my follow-up commentary explaining major compaction
>>> logic:
>>> http://search-hadoop.com/m/JR9sK1xnbj21
>>> http://search-hadoop.com/m/X7W7q1xnbj21
>>>
>>>
>>> The vast majority of users need features completely unrelated to
>>> compactions.  The compaction algorithm is an easy target to worry about.
>>>
>>>
>>> On 1/26/12 7:06 AM, "yonghu" <yongyong...@gmail.com> wrote:
>>>
>>>>Hello Mikael,
>>>>
>>>>I think disabling major compaction for the time-based and client-issued
>>>>triggers is not a problem. The problem is the size-based trigger. The
>>>>mailing list thread only talks about minor compaction in that case,
>>>>not major compaction, if I understand it right. So I would like to know
>>>>whether someone can tell me how to disable size-based major
>>>>compaction.
>>>>
>>>>Thanks
>>>>
>>>>Yong
>>>>I saw a description indicating that the size of a store file can also
>>>>trigger a major compaction.
>>>>
>>>>On Thu, Jan 26, 2012 at 3:54 PM, Mikael Sitruk <mikael.sit...@gmail.com>
>>>>wrote:
>>>>> Yong hi
>>>>>
>>>>> As far as I know, setting hbase.hregion.majorcompaction to 0 will
>>>>> disable the time-based trigger only.
>>>>> Clients are always able to invoke a major compaction, no matter what
>>>>> the value of hbase.hregion.majorcompaction is.
>>>>>
>>>>> Perhaps client invocation of compaction can be disabled with the
>>>>> security package.
>>>>>
>>>>> Anyway, I'm digging into 0.92; I hope to get those insights soon.
>>>>>
>>>>> Mikael.S
>>>>>
>>>>> On Thu, Jan 26, 2012 at 4:39 PM, yonghu <yongyong...@gmail.com> wrote:
>>>>>
>>>>>> Thanks for your response.
>>>>>>
>>>>>> I know that a major compaction can be triggered by the client, by time,
>>>>>> or by size. In my situation, I have to disable major compaction
>>>>>> entirely. So, if I set 'hbase.hregion.majorcompaction' to 0, will it
>>>>>> disable all three triggers, or do I have to disable each one
>>>>>> separately? BTW, my HBase version is 0.92.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Yong
>>>>>>
>>>>>> On Thu, Jan 26, 2012 at 3:09 PM, Mikael Sitruk <mikael.sit...@gmail.com> wrote:
>>>>>> > Look at the thread http://search-hadoop.com/m/GHUWQ1xnbj21; it explains a
>>>>>> > lot about major compaction and its enhancements across versions.
>>>>>> >
>>>>>> > Mikael.S
>>>>>> >
>>>>>> >
>>>>>> > On Thu, Jan 26, 2012 at 3:51 PM, Damien Hardy <dha...@figarocms.fr> wrote:
>>>>>> >
>>>>>> >> On 26/01/2012 14:43, yonghu wrote:
>>>>>> >> > Hello,
>>>>>> >> >
>>>>>> >> > I read this blog http://outerthought.org/blog/465-ot.html. It mentions
>>>>>> >> > that a major compaction occurs every 24 hours. My question is
>>>>>> >> > whether there are any other conditions that can trigger a major
>>>>>> >> > compaction. For example, when the size of a store file reaches
>>>>>> >> > the threshold (I think this will cause a minor compaction or a region
>>>>>> >> > split, not a major compaction, but I am not quite sure).
>>>>>> >> >
>>>>>> >> > Thanks!
>>>>>> >> >
>>>>>> >> > Yong
>>>>>> >>
>>>>>> >> Hello,
>>>>>> >> I think also when there is a massive delete on the table, or a change
>>>>>> >> to a table attribute like TTL (which is likely to remove a lot of
>>>>>> >> versions/rows) or COMPRESSION, which can save a lot of disk space in
>>>>>> >> each region.
>>>>>> >>
>>>>>> >> Cheers,
>>>>>> >>
>>>>>> >> --
>>>>>> >> Damien
>>>>>> >>
>>>>>> >>
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Mikael.S
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Mikael.S
>>>
>>
>
