Re: Writing Custom - KeyComparator !!!

Nick Dimiduk Wed, 27 Aug 2014 11:27:13 -0700

You might also have a look at using OrderedBytes [0] instead of Bytes for
encoding your values to byte[]. This is the kind of use-case those encoders
are intended to support.
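
To illustrate the idea behind such order-preserving encoders, here is a
minimal sketch in plain Java. It does not use OrderedBytes or any other
HBase API. The class and method names (OrderPreservingKeys, encodeInt,
decodeInt, compositeKey, compareKeys) are hypothetical, and the 0x00
separator assumes that country and city names never contain a zero byte.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: shows how values can be encoded so
// that plain unsigned byte-by-byte comparison yields the desired order.
public class OrderPreservingKeys {

    // Encode a signed int as 4 big-endian bytes with the sign bit flipped,
    // so negative values sort before positive ones under unsigned comparison.
    public static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(4).putInt(value ^ Integer.MIN_VALUE).array();
    }

    // Reverse of encodeInt.
    public static int decodeInt(byte[] key) {
        return ByteBuffer.wrap(key).getInt() ^ Integer.MIN_VALUE;
    }

    // Build a <COUNTRY><CITY> key from variable-length names. The 0x00
    // separator (an assumption of this sketch) sorts below any text byte,
    // so "India" orders before "Indiana" regardless of the city.
    public static byte[] compositeKey(String country, String city) {
        byte[] c = country.getBytes(StandardCharsets.UTF_8);
        byte[] t = city.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(c.length + 1 + t.length)
                .put(c).put((byte) 0).put(t).array();
    }

    // Lexicographic comparison: unsigned bytes, left to right, with a
    // shorter key sorting before any longer key it is a prefix of.
    public static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        int[] values = {22000, 2, -5, 11, 1, 23, 11000, 22};
        byte[][] keys = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            keys[i] = encodeInt(values[i]);
        }
        // Sort the encoded keys purely by their bytes, then decode.
        Arrays.sort(keys, OrderPreservingKeys::compareKeys);
        for (byte[] key : keys) {
            System.out.println(decodeInt(key));
        }
    }
}
```

Because the ordering is carried by the encoding itself, the default
byte-wise row ordering can stay as it is and no custom comparator is
needed.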


Thanks,
Nick

[0]: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/OrderedBytes.html


On Wed, Aug 27, 2014 at 10:20 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> A brief search for KeyComparator using http://search-hadoop.com/ didn't
> turn up any previous discussion on using a custom KeyComparator.
> I would suggest conforming to the best practices of row key design and
> leaving a custom KeyComparator as a last resort.
>
> Cheers
>
>
> On Wed, Aug 27, 2014 at 9:24 AM, @Sanjiv Singh <sanjiv.is...@gmail.com>
> wrote:
>
> > Hi Ted,
> >
> > Yes, definitely, I can make it a fixed-width country code.
> >
> > The example I chose is just one use case with a specific ordering
> > need. I am wondering whether we can use any user object as the row key
> > and have the ordering of rows within HBase defined explicitly by a
> > custom KeyComparator.
> >
> > Regards
> > Sanjiv Singh
> > Mob :  +091 9990-447-339
> >
> >
> > On Wed, Aug 27, 2014 at 9:20 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> >> Sanjiv:
> >> Is there a reason you chose the full country name?
> >> The row key is stored with every KeyValue in the row, so choosing an
> >> abbreviation would reduce storage cost.
> >>
> >> Cheers
> >>
> >>
> >> On Wed, Aug 27, 2014 at 8:38 AM, @Sanjiv Singh <sanjiv.is...@gmail.com>
> >> wrote:
> >>
> >>> Hi Ted,
> >>>
> >>> Yes, it would work for country codes such as IND for India or AUS for
> >>> Australia.
> >>>
> >>> But in my use case it is the full country name (not just a three-letter
> >>> country code).
> >>>
> >>> Regards
> >>> Sanjiv Singh
> >>> Mob :  +091 9990-447-339
> >>>
> >>>
> >>> On Wed, Aug 27, 2014 at 8:34 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >>>
> >>>> Sanjiv:
> >>>> Is the country code of fixed width?
> >>>>
> >>>> If so, as long as the country is the prefix, rows would be sorted by it
> >>>> first.
> >>>>
> >>>> Cheers
> >>>>
> >>>>
> >>>> On Wed, Aug 27, 2014 at 8:00 AM, @Sanjiv Singh <sanjiv.is...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi JM,
> >>>>>
> >>>>> Thanks for the link. I agree with you that it can be done when the key
> >>>>> is an integer.
> >>>>>
> >>>>> The reason I am asking about a custom KeyComparator is that sometimes the
> >>>>> key is not just an integer or a single value; it can be a composition of
> >>>>> multiple values, like <COUNTRY><CITY>, where the key is made up of two
> >>>>> values: COUNTRY and CITY.
> >>>>>
> >>>>> I want to order the rows first by COUNTRY, then by CITY.
> >>>>>
> >>>>> How can we do that?
> >>>>>
> >>>>> I hope this example illustrates my use case.
> >>>>>
> >>>>>
> >>>>> Regards
> >>>>> Sanjiv Singh
> >>>>> Mob :  +091 9990-447-339
> >>>>>
> >>>>>
> >>>>> On Wed, Aug 27, 2014 at 5:42 PM, Jean-Marc Spaggiari <
> >>>>> jean-m...@spaggiari.org> wrote:
> >>>>>
> >>>>> > Hi Sanjiv!!!! ;)
> >>>>> >
> >>>>> > If you want your keys to be ordered as integers, why not simply store
> >>>>> > them as integers rather than as strings? HBase orders rows
> >>>>> > lexicographically by the bytes of the row key, and you cannot change
> >>>>> > that. Yes, you can implement a key comparator if you want, but I don't
> >>>>> > think it is going to change anything in this situation.
> >>>>> >
> >>>>> > You might want to take a look at this:
> >>>>> > http://hbase.apache.org/book/rowkey.design.html
> >>>>> >
> >>>>> > Just put your values that way:
> >>>>> >
> >>>>> >       int myKey = 22000;
> >>>>> >       // Bytes.toBytes(int) yields 4 big-endian bytes, so non-negative keys sort numerically
> >>>>> >       Put put = new Put(Bytes.toBytes(myKey));
> >>>>> >
> >>>>> > And that will solve your ordering problem.
> >>>>> >
> >>>>> > JM
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > 2014-08-27 6:09 GMT-04:00 @Sanjiv Singh <sanjiv.is...@gmail.com>:
> >>>>> >
> >>>>> >>  Hi All,
> >>>>> >>
> >>>>> >> As we know, all rows are always sorted lexicographically by their row
> >>>>> >> key. In lexicographical order, keys are compared at the binary level,
> >>>>> >> byte by byte, from left to right.
> >>>>> >>
> >>>>> >> See the example below, where the row key is an integer value and the
> >>>>> >> scan output shows the lexicographical order of the rows in the table.
> >>>>> >>
> >>>>> >> hbase(main):001:0> scan 'table1'
> >>>>> >> ROW        COLUMN+CELL
> >>>>> >> 1          column=cf1:, timestamp=1297073325971 ...
> >>>>> >> 11         column=cf1:, timestamp=1297073337383 ...
> >>>>> >> 11000      column=cf1:, timestamp=1297073340493 ...
> >>>>> >> 2          column=cf1:, timestamp=1297073329851 ...
> >>>>> >> 22         column=cf1:, timestamp=1297073344482 ...
> >>>>> >> 22000      column=cf1:, timestamp=1297073333504 ...
> >>>>> >> 23         column=cf1:, timestamp=1297073349875 ...
> >>>>> >>
> >>>>> >> I want to see these rows ordered as integers, not the default way. I
> >>>>> >> can pad the keys with '0' to get the proper sort order (I don't like
> >>>>> >> that).
> >>>>> >>
> >>>>> >> I want these rows sorted as integers not just in the output of a scan
> >>>>> >> or get, but also so that rows with consecutive integer row keys are
> >>>>> >> stored in the same block.
> >>>>> >>
> >>>>> >> Now the questions are:
> >>>>> >>
> >>>>> >>    - Can we define our own custom KeyComparator?
> >>>>> >>    - If yes, can we enforce it for the PUT method, so that rows would be
> >>>>> >>    stored according to the new KeyComparator?
> >>>>> >>    - Can we plug in this comparator during the SCAN method to change the
> >>>>> >>    order of the result rows?
> >>>>> >>
> >>>>> >> I hope I have explained the problem well. I look forward to your
> >>>>> >> valuable responses.
> >>>>> >>
> >>>>> >>
> >>>>> >> Regards
> >>>>> >> Sanjiv Singh
> >>>>> >> Mob :  +091 9990-447-339
> >>>>> >>
> >>>>> >
> >>>>> >
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
>
