Re: about HBASE-48

stack Sat, 10 Oct 2009 23:56:16 -0700

On Sat, Oct 10, 2009 at 10:54 PM, Anty <[email protected]> wrote:

> Hi stack,
>     I ran some tests on the bulk load tools from HBASE-48.
>


Thanks for trying it out.


> I took the files made by the TestHFileOutputFormat test and passed them to
> the script you wrote. It did work, but something seems unusual. For each
> region, the STARTKEY and ENDKEY are nearly the same; the ENDKEY is bigger
> than the STARTKEY by roughly 1, e.g.
>  STARTKEY=>'0000009447', ENDKEY=>'0000009448';
>  STARTKEY=>'0000020476', ENDKEY=>'0000020477';
> ...
>
>
Did you write your own partitioner or just use the default hash partitioner?



>        I also have some doubts about TestHFileOutputFormat. The default
> partitioner is the hash partitioner; however, the hash partitioner can't
> meet the requirements of TestHFileOutputFormat. Just as you said, we need to
> ensure a total ordering of all keys, and we need to supply a partitioner
> that does total ordering (but you didn't add a new partitioner in
> TestHFileOutputFormat).
>

This is broken then, as you point out.  We should make something like what is
described in https://issues.apache.org/jira/browse/HBASE-1901 for
TestHFileOutputFormat?
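
To make the difference concrete, here is a minimal sketch of range (total
order) partitioning next to hash partitioning. It is plain Java with no Hadoop
or HBase dependency; the class name RangePartitionSketch, the split points, and
the sample keys are all assumptions made up for illustration, not code from
HBASE-48 or HBASE-1901. The hash variant mirrors the usual
(hash & Integer.MAX_VALUE) % n formula.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch only: not HBase or Hadoop code.
class RangePartitionSketch {
    // Sorted split points. Partition i receives keys k with
    // splits[i-1] <= k < splits[i]; there are splits.length + 1 partitions.
    private final byte[][] splits;

    RangePartitionSketch(byte[][] sortedSplits) {
        this.splits = sortedSplits;
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Range partitioning: binary search for the first split point greater
    // than the key. Every key in partition i sorts before every key in
    // partition i + 1, so the outputs are totally ordered across partitions.
    int partitionFor(byte[] key) {
        int lo = 0;
        int hi = splits.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compare(key, splits[mid]) < 0) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }

    // Hash partitioning for contrast: neighbouring keys can land in any
    // partition, so each output file may span the whole key range.
    static int hashPartitionFor(byte[] key, int numPartitions) {
        return (Arrays.hashCode(key) & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        byte[][] splits = {
            "0000010000".getBytes(StandardCharsets.UTF_8),
            "0000020000".getBytes(StandardCharsets.UTF_8)
        };
        RangePartitionSketch p = new RangePartitionSketch(splits);
        for (String k : new String[] {"0000009447", "0000015000", "0000020476"}) {
            byte[] key = k.getBytes(StandardCharsets.UTF_8);
            System.out.println(k + " range=" + p.partitionFor(key)
                + " hash=" + hashPartitionFor(key, 3));
        }
    }
}
```

With range partitioning each output file covers one contiguous slice of the
key space, which is what a region needs. Choosing good split points (for
example by sampling the input) is a separate problem this sketch does not
address.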




>   So I think TestHFileOutputFormat uses the hash partitioner, which does not
> do total ordering. Different regions would have overlapping rows, which is
> not correct for HBase. And I found that the firstKey and lastKey of the files
> made by TestHFileOutputFormat do indeed overlap.
>    Are the bulk tools just the beginning, needing further improvement? I
> think the bulk tools are very useful.
>
>
Can you help us improve it?  What do you think we need to do next
(hbase-901?)
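
As one possible next step, the check Anty describes (whether the first and
last keys of the generated files interleave) could be automated before
loading. Below is a rough sketch under stated assumptions: keys are treated as
ASCII strings like the ones quoted above, and KeyRangeOverlapSketch, Range,
and the file names are hypothetical, not part of the HBASE-48 script. Real
HBase keys are raw bytes, so an actual check would compare bytes instead.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch only: not part of the HBASE-48 tooling.
class KeyRangeOverlapSketch {
    // First and last key of one generated file (assumed ASCII strings).
    static final class Range {
        final String name;
        final String firstKey;
        final String lastKey;

        Range(String name, String firstKey, String lastKey) {
            this.name = name;
            this.firstKey = firstKey;
            this.lastKey = lastKey;
        }
    }

    // Returns the names of files whose first key falls at or before the
    // largest last key of any file that starts earlier. An empty result
    // means the key ranges are disjoint.
    static List<String> findOverlaps(List<Range> files) {
        List<Range> sorted = new ArrayList<>(files);
        sorted.sort(Comparator.comparing((Range r) -> r.firstKey));
        List<String> overlapping = new ArrayList<>();
        String maxLast = null; // largest last key seen so far
        for (Range r : sorted) {
            if (maxLast != null && r.firstKey.compareTo(maxLast) <= 0) {
                overlapping.add(r.name);
            }
            if (maxLast == null || r.lastKey.compareTo(maxLast) > 0) {
                maxLast = r.lastKey;
            }
        }
        return overlapping;
    }

    public static void main(String[] args) {
        List<Range> files = new ArrayList<>();
        files.add(new Range("part-0", "0000000005", "0000020000"));
        files.add(new Range("part-1", "0000000010", "0000030000"));
        System.out.println("overlapping: " + findOverlaps(files));
    }
}
```

If the list comes back non-empty, the files were not produced with a total
ordering across reducers and should not be turned into regions as they are.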

Thanks for writing, Anty Rao.
St.Ack
