On Mon, Nov 9, 2015 at 10:52 PM, WangYQ <wangyongqiang0...@163.com> wrote:

> Guys, I have some questions about seq_id.
>
> 1. What is the motivation for adding seq_id to bulk loaded files?
>
>
Bulk loaded files either get no sequence id -- and so they are considered
AFTER all current edits in the Store -- or the bulk loaded file gets the
highest current sequence id so its edits are ordered to appear BEFORE any
current edit. The sequence id is assigned to the file as a whole and applies
to all of its constituent edits.
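
To make the ordering concrete, here is a toy model, not HBase code. FileEntry,
NO_SEQ_ID, readOrder and visibleValue are hypothetical names, and using -1 to
stand for "no sequence id" is an assumption of this sketch. Files are ordered
highest sequence id first, so a file with no id sorts after everything and a
file with the highest id sorts before everything.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SeqIdOrderingSketch {
    /** Assumed marker for a file that carries no sequence id. */
    static final long NO_SEQ_ID = -1L;

    /** Hypothetical stand-in for a store file; not an HBase class. */
    static final class FileEntry {
        final String name;
        final long seqId;   // one id for the whole file, shared by all its edits
        final String value; // value this file holds for the single cell we look at

        FileEntry(String name, long seqId, String value) {
            this.name = name;
            this.seqId = seqId;
            this.value = value;
        }
    }

    /** Returns a copy of the files ordered highest sequence id first. */
    static List<FileEntry> readOrder(List<FileEntry> files) {
        List<FileEntry> sorted = new ArrayList<>(files);
        sorted.sort(Comparator.comparingLong((FileEntry f) -> f.seqId).reversed());
        return sorted;
    }

    /** The value from the file that sorts first, i.e. BEFORE all others. */
    static String visibleValue(List<FileEntry> files) {
        return readOrder(files).get(0).value;
    }

    public static void main(String[] args) {
        List<FileEntry> files = new ArrayList<>();
        files.add(new FileEntry("flush-a", 3L, "v3"));
        files.add(new FileEntry("flush-b", 7L, "v7"));

        // Bulk load without a sequence id: sorts AFTER the current edits.
        files.add(new FileEntry("bulk-no-id", NO_SEQ_ID, "bulk"));
        System.out.println(visibleValue(files));

        // Bulk load with an id above every existing one: sorts BEFORE them.
        files.add(new FileEntry("bulk-with-id", 8L, "bulk-new"));
        System.out.println(visibleValue(files));
    }
}
```

In this model the first call prints the value from the file with id 7 and the
second prints the value from the bulk file with id 8.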


> 2. Why introduce the conf hbase.mapreduce.bulkload.assign.sequenceNumbers
> in class LoadIncrementalHFiles? If this is true, every time we bulk load,
> we should first do a flush.
>
>
This is the switch for whether bulk files are ordered first or last.



> 3. Now if we compact two files whose seq_ids are 3 and 7, the compacted
> file will get its seq_id from the region, maybe 12.
> If I instead give the compacted file seq_id=7, the largest seq_id among
> the compacted files, will that cause any serious problems?
>
>
IIUC, there is no problem. The sequence id assigned is always higher than
that of any edit that may exist in current files (and therefore beyond the
highest possible sequence id a compaction could produce).
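
A small worked sketch of that argument, using the numbers from the question.
This is not HBase code: compactedSeqId and nextAssignedSeqId are hypothetical
helpers, and "region highest + 1" is only an assumed stand-in for "an id
beyond every edit in current files". I am also reading "the sequence id
assigned" as the id handed to a newly bulk loaded file.

```java
public class CompactionSeqIdSketch {
    /** Proposed rule from the question: compacted file keeps the largest input id. */
    static long compactedSeqId(long... inputSeqIds) {
        if (inputSeqIds.length == 0) {
            throw new IllegalArgumentException("need at least one input file");
        }
        long max = inputSeqIds[0];
        for (long id : inputSeqIds) {
            max = Math.max(max, id);
        }
        return max;
    }

    /** Assumed model: a newly assigned id lies beyond the region's current highest. */
    static long nextAssignedSeqId(long regionHighestSeqId) {
        return regionHighestSeqId + 1;
    }

    public static void main(String[] args) {
        long compacted = compactedSeqId(3L, 7L);    // 7, the max of the inputs
        long assigned = nextAssignedSeqId(11L);     // 12, beyond every existing edit
        // The compacted file can never overtake a newly assigned id.
        System.out.println(compacted + " < " + assigned + " : " + (compacted < assigned));
    }
}
```

Since the compacted file's id is taken from ids that already exist, it stays
at or below the region's highest, so a later assigned id still sorts beyond it.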

St.Ack



> The version is HBase 0.98.10.
>
> thanks