Actually I want compactions, and I think I also want splits, because the
data is going to grow over time.
So perhaps we should trigger splits and compactions manually... but I am
wondering whether it would be desirable for HBase to trigger them just as if
I had loaded my data using puts. LoadIncrementalHFiles is just an alternative
way to load a large batch of data into HBase. But semantically, shouldn't it
have the same (or similar) effect on the table as calling many individual
puts?

Regarding the alternative HLog implementation: yes, I was thinking of writing
and closing the file for every WAL write. But if the only writes are caused
by table structure changes such as splits and reassignments, then that is
rare and the performance penalty could be tolerable. I was wondering whether
anybody had tried that before.

Thanks -Andreas.

On Thu, Jun 23, 2011 at 12:09 AM, Andrew Purtell <[email protected]> wrote:

> > From: Andreas Neumann <[email protected]>
> > we will use LoadIncrementalHFiles, are you saying that this
> > will never cause a split?
>
> Create the table with the region split threshold set to Long.MAX_VALUE and
> a set of pre-split points that partitions the key space as evenly as
> possible.
>
> Use HBase's TotalOrderPartitioner over row keys and HFileOutputFormat to
> build HFiles that fit within the predefined splits.
>
> The result will never split.
>
> If you do not modify the table(s), there will be no compaction activity
> either.
>
>  - Andy
>
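
The recipe quoted above hinges on choosing pre-split points that divide the
key space evenly. As a rough sketch of that step only, here is a
self-contained computation of evenly spaced boundaries over an 8-byte key
space, similar in spirit to what HBase's split utilities do. The class name
`PreSplitPoints` and the fixed 8-byte key width are my own assumptions for
illustration, not HBase API:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Sketch (not HBase API): compute evenly spaced pre-split keys over a
// fixed-width 8-byte key space, which could then be passed to table
// creation as the initial region boundaries.
public class PreSplitPoints {

    // Returns numRegions - 1 split keys dividing [0, 2^64) as evenly as possible.
    static List<byte[]> splitPoints(int numRegions) {
        List<byte[]> points = new ArrayList<>();
        BigInteger range = BigInteger.ONE.shiftLeft(64); // full 8-byte key space
        for (int i = 1; i < numRegions; i++) {
            BigInteger boundary = range.multiply(BigInteger.valueOf(i))
                                       .divide(BigInteger.valueOf(numRegions));
            // Right-align the big-endian magnitude into an 8-byte key,
            // dropping BigInteger's optional leading sign byte.
            byte[] key = new byte[8];
            byte[] raw = boundary.toByteArray();
            int copy = Math.min(raw.length, 8);
            System.arraycopy(raw, raw.length - copy, key, 8 - copy, copy);
            points.add(key);
        }
        return points;
    }

    public static void main(String[] args) {
        for (byte[] p : splitPoints(4)) {
            StringBuilder sb = new StringBuilder();
            for (byte b : p) sb.append(String.format("%02x", b));
            System.out.println(sb); // prints 4000..., 8000..., c000...
        }
    }
}
```

With boundaries like these fixed at table-creation time and the region size
threshold set to Long.MAX_VALUE, the HFiles built per partition fit inside
their regions and no split is ever triggered, as Andy describes.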
