Can you define 'come due'?

The NPE occurs at the first isMajorCompaction() test in the main loop of 
MajorCompactionChecker.
That cycle is executed every 2.78 hours.
Yet I know that I've kept healthy QA test data up and running for much longer 
than that.
 

James Kennedy
Project Manager
Troove Inc.

On 2011-02-10, at 10:46 PM, Ryan Rawson wrote:

> I am speaking off the cuff here, but the major compaction algorithm
> attempts to keep the number of major compactions to a minimum by
> checking the timestamp of the file. So it's possible that the other
> regions just 'didn't come due' yet.
> 
> -ryan
> 
> On Thu, Feb 10, 2011 at 10:42 PM, James Kennedy
> <[email protected]> wrote:
>> I've tested HBase 0.90 + HBase-trx 0.90.0 and I've run it over old data from 
>> 0.89x using a variety of seeded unit test/QA data and cluster configurations.
>> 
>> But when it came time to upgrade some production data I got snagged on 
>> HBASE-3524. The gist of it is in Ryan's last points:
>> 
>> * compaction is "optional", meaning if it fails no data is lost, so you
>> should probably be fine.
>> 
>> * Older versions of the code did not write out time tracker data and
>> that is why your older files were giving you NPEs.
>> 
>> Makes sense.  But why did I not encounter this with my initial data upgrades 
>> on very similar data pkgs?
>> 
>> So I applied Ryan's patch, which simply assigns a default value 
>> (Long.MIN_VALUE) when a StoreFile lacks a timeRangeTracker and I "fixed" the 
>> data by forcing major compactions on the regions affected.  Preliminary 
>> poking has not shown any instability in the data since.
>> 
>> But I confess that I just don't have the time right now to really dig into 
>> the code and validate that there are no more gotchas or data corruption 
>> that could have resulted.
>> 
>> I guess the questions that I have for the team are:
>> 
>> * What state would 9 out of 50 tables be in to miss the new 0.90.0 
>> timeRangeTracker injection before the first major compaction check?
>> * Where else is the new TimeRangeTracker used?  Could a StoreFile with a 
>> null timeRangeTracker have corrupted the data in other subtler ways?
>> * What other upgrade-related data changes might not have completed elsewhere?
>> 
>> Thanks,
>> 
>> James Kennedy
>> Project Manager
>> Troove Inc.
>> 
>> 
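
For readers following the thread, the kind of guard described above (falling back to Long.MIN_VALUE when a StoreFile has no timeRangeTracker) can be sketched roughly as follows. This is only an illustration of the null-guard idea and not the actual HBase-3524 patch: the class names CompactionGuardSketch, TimeRange and FileInfo, and both helper methods, are hypothetical stand-ins rather than HBase APIs.

```java
import java.util.Arrays;
import java.util.List;

public class CompactionGuardSketch {

    /** Hypothetical stand-in for a store file's time-range metadata. */
    static final class TimeRange {
        final long minimumTimestamp;
        TimeRange(long minimumTimestamp) { this.minimumTimestamp = minimumTimestamp; }
    }

    /** Hypothetical stand-in for a store file; timeRange is null for files written by older code. */
    static final class FileInfo {
        final TimeRange timeRange;
        FileInfo(TimeRange timeRange) { this.timeRange = timeRange; }
    }

    /** Null-safe read: returns Long.MIN_VALUE instead of throwing an NPE when the metadata is absent. */
    static long minimumTimestampOrDefault(FileInfo file) {
        return file.timeRange == null ? Long.MIN_VALUE : file.timeRange.minimumTimestamp;
    }

    /** Smallest minimum timestamp across the files, using the null-safe read. */
    static long oldestTimestamp(List<FileInfo> files) {
        long oldest = Long.MAX_VALUE;
        for (FileInfo file : files) {
            oldest = Math.min(oldest, minimumTimestampOrDefault(file));
        }
        return oldest;
    }

    public static void main(String[] args) {
        List<FileInfo> files = Arrays.asList(
            new FileInfo(new TimeRange(1297400000000L)),
            new FileInfo(null)); // file from an older version, no time-range metadata
        // The file without metadata pulls the result down to the default value.
        System.out.println(oldestTimestamp(files) == Long.MIN_VALUE);
    }
}
```

With a default this low, a file lacking the metadata is treated as arbitrarily old in this sketch, so a timestamp-based check would not throw on it. Whether that matches the behaviour of the real patch in every code path is exactly the open question raised in the thread.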
