You might also have a look at using OrderedBytes [0] instead of Bytes for encoding your values to byte[]. This is the kind of use-case those encoders are intended to support.
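To illustrate the idea behind order-preserving encodings, here is a rough sketch using only the Java standard library. It is not the OrderedBytes API; the class and method names (RowKeySketch, encodeInt, encodeCountryCity, compare) are made up for the example. An int is written big-endian with its sign bit flipped so that byte-wise comparison matches numeric order, and a variable-width COUNTRY is followed by a 0x00 terminator so that rows sort by COUNTRY first and then by CITY. It assumes the names never contain a 0x00 byte.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: shows how a key encoding can make
// plain byte-by-byte ordering agree with the ordering you actually want.
final class RowKeySketch {

    // Big-endian int with the sign bit flipped, so unsigned byte-wise
    // comparison of the result matches signed numeric comparison.
    static byte[] encodeInt(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ 0x80000000).array();
    }

    // COUNTRY, a 0x00 terminator, then CITY. The terminator makes a country
    // that is a prefix of another sort first, whatever the city is.
    // Assumption: neither string contains a 0x00 byte.
    static byte[] encodeCountryCity(String country, String city) {
        byte[] c = country.getBytes(StandardCharsets.UTF_8);
        byte[] t = city.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[c.length + 1 + t.length];
        System.arraycopy(c, 0, out, 0, c.length);
        out[c.length] = 0x00;
        System.arraycopy(t, 0, out, c.length + 1, t.length);
        return out;
    }

    // Unsigned, left-to-right byte comparison, the same kind of ordering
    // that is applied to row keys by default.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        Integer[] keys = {1, 11, 11000, 2, 22, 22000, 23};
        // Sort the ints by comparing their encoded bytes only.
        Arrays.sort(keys, (x, y) -> compare(encodeInt(x), encodeInt(y)));
        System.out.println(Arrays.toString(keys)); // prints [1, 2, 11, 22, 23, 11000, 22000]
    }
}
```

With keys built this way the default byte-by-byte row ordering already gives the desired sort, so no custom comparator is needed. OrderedBytes offers maintained encoders for this purpose; check its javadoc for the actual method signatures.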

Thanks,
Nick

[0]: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/OrderedBytes.html

On Wed, Aug 27, 2014 at 10:20 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> A brief search for KeyComparator using http://search-hadoop.com/ didn't
> turn up previous discussion on using a custom KeyComparator.
> I would suggest conforming to best practices of row key design and
> leaving a custom KeyComparator as a last resort.
>
> Cheers
>
>
> On Wed, Aug 27, 2014 at 9:24 AM, @Sanjiv Singh <sanjiv.is...@gmail.com>
> wrote:
>
> > Hi Ted,
> >
> > Yes, definitely, I can make it a fixed-width country code.
> >
> > The example I chose is just one use-case with a specific ordering
> > need. I am wondering whether we can use any user object as the row key,
> > with the ordering of rows within HBase defined explicitly by a custom
> > KeyComparator.
> >
> > Regards
> > Sanjiv Singh
> > Mob : +091 9990-447-339
> >
> >
> > On Wed, Aug 27, 2014 at 9:20 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> >> Sanjiv:
> >> Is there a reason for you to choose the full country name?
> >> The row key is stored for every KeyValue in the same row, so choosing
> >> an abbreviation would reduce storage cost.
> >>
> >> Cheers
> >>
> >>
> >> On Wed, Aug 27, 2014 at 8:38 AM, @Sanjiv Singh <sanjiv.is...@gmail.com>
> >> wrote:
> >>
> >>> Hi Ted,
> >>>
> >>> Yes, it would work for a country code like IND for 'india', AUS for
> >>> 'australia'.
> >>>
> >>> But in my use-case, it's the full country name (not just a
> >>> three-letter country code).
> >>>
> >>> Regards
> >>> Sanjiv Singh
> >>> Mob : +091 9990-447-339
> >>>
> >>>
> >>> On Wed, Aug 27, 2014 at 8:34 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >>>
> >>>> Sanjiv:
> >>>> Is the country code of fixed width?
> >>>>
> >>>> If so, as long as country is the prefix, it would be sorted first.
> >>>>
> >>>> Cheers
> >>>>
> >>>>
> >>>> On Wed, Aug 27, 2014 at 8:00 AM, @Sanjiv Singh <sanjiv.is...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi JM,
> >>>>>
> >>>>> Thanks for the link... I agree with you that it can be done when the
> >>>>> key is an integer.
> >>>>>
> >>>>> The reason I am asking for a custom KeyComparator is that sometimes
> >>>>> the key is not just an integer or a single value; it can be a
> >>>>> composition of multiple values like <COUNTRY><CITY>, where the key is
> >>>>> made up of two values, one is COUNTRY and the other is CITY.
> >>>>>
> >>>>> I want to order the rows first by COUNTRY, then by CITY.
> >>>>>
> >>>>> How can we do that?
> >>>>>
> >>>>> I hope I have chosen a good example that emphasizes my use-case.
> >>>>>
> >>>>> Regards
> >>>>> Sanjiv Singh
> >>>>> Mob : +091 9990-447-339
> >>>>>
> >>>>>
> >>>>> On Wed, Aug 27, 2014 at 5:42 PM, Jean-Marc Spaggiari
> >>>>> <jean-m...@spaggiari.org> wrote:
> >>>>>
> >>>>> > Hi Sanjiv!!!! ;)
> >>>>> >
> >>>>> > If you want your keys to be ordered as Integers, why do you not
> >>>>> > simply store them as Integers and not as Strings? HBase orders the
> >>>>> > rows alphabetically, and you cannot change that. Yes, you can
> >>>>> > implement a key comparator if you want, but I don't think it's
> >>>>> > going to change anything in this situation.
> >>>>> >
> >>>>> > You might want to take a look at this:
> >>>>> > http://hbase.apache.org/book/rowkey.design.html
> >>>>> >
> >>>>> > Just put your values that way:
> >>>>> >
> >>>>> > int myKey = 22000;
> >>>>> > Put put = new Put(Bytes.toBytes(myKey));
> >>>>> >
> >>>>> > And that will solve your ordering problem.
> >>>>> >
> >>>>> > JM
> >>>>> >
> >>>>> >
> >>>>> > 2014-08-27 6:09 GMT-04:00 @Sanjiv Singh <sanjiv.is...@gmail.com>:
> >>>>> >
> >>>>> >> Hi All,
> >>>>> >>
> >>>>> >> As we know, all rows are always sorted lexicographically by their
> >>>>> >> row key. In lexicographical order, each key is compared at the
> >>>>> >> binary level, byte by byte and from left to right.
> >>>>> >>
> >>>>> >> See the example below, where the row key is some integer value and
> >>>>> >> the output of scan shows the lexicographical order of rows in the
> >>>>> >> table.
> >>>>> >>
> >>>>> >> hbase(main):001:0> scan 'table1'
> >>>>> >> ROW      COLUMN+CELL
> >>>>> >> 1        column=cf1:, timestamp=1297073325971 ...
> >>>>> >> 11       column=cf1:, timestamp=1297073337383 ...
> >>>>> >> 11000    column=cf1:, timestamp=1297073340493 ...
> >>>>> >> 2        column=cf1:, timestamp=1297073329851 ...
> >>>>> >> 22       column=cf1:, timestamp=1297073344482 ...
> >>>>> >> 22000    column=cf1:, timestamp=1297073333504 ...
> >>>>> >> 23       column=cf1:, timestamp=1297073349875 ...
> >>>>> >>
> >>>>> >> I want to see these rows ordered as integers, not the default way.
> >>>>> >> I can pad keys with '0' to get a proper sorting order (I don't
> >>>>> >> like it).
> >>>>> >>
> >>>>> >> I want to see these rows sorted as integers, not just in the
> >>>>> >> output of the scan or get method, but also to store rows with
> >>>>> >> consecutive integer row keys in the same block.
> >>>>> >>
> >>>>> >> Now the question is:
> >>>>> >>
> >>>>> >>    - Can we define our own custom KeyComparator?
> >>>>> >>    - If yes, can we enforce it for the PUT method, so that rows
> >>>>> >>    would be stored according to the new KeyComparator?
> >>>>> >>    - Can we plug in this comparator during the SCAN method to
> >>>>> >>    change the order of the result rows?
> >>>>> >>
> >>>>> >> I hope I have explained the problem well; looking forward to your
> >>>>> >> valuable response.
> >>>>> >>
> >>>>> >>
> >>>>> >> Regards
> >>>>> >> Sanjiv Singh
> >>>>> >> Mob : +091 9990-447-339