Hi Ted, thanks a lot for this. It's exactly what I need.

Lukas

2013/3/4 Ted Yu <yuzhih...@gmail.com>

> What HBase version are you planning to use?
>
> In 0.94, you can refer to:
>
> src/main/java/org/apache/hadoop/hbase/regionserver/KeyPrefixRegionSplitPolicy.java
>
> You can write a policy which splits along category boundaries.
>
> There are other split policies in case you're interested:
>
> ./src/main/java/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.java
> ./src/main/java/org/apache/hadoop/hbase/regionserver/DelimitedKeyPrefixRegionSplitPolicy.java
> ./src/main/java/org/apache/hadoop/hbase/regionserver/IncreasingToUpperBoundRegionSplitPolicy.java
>
> Cheers
>
> On Mon, Mar 4, 2013 at 12:55 PM, Lukáš Drbal <lukas.dr...@gmail.com> wrote:
>
> > Hi Jilal,
> > thanks for the response, but can you please give me a link or explain it
> > more?
> > I don't know what you mean by regular expression splitting. My data is
> > not fixed and will grow over time.
> >
> > Thanks.
> >
> > Regards
> >
> > Lukas Drbal
> >
> >
> > 2013/3/4 Jilal Oussama <jilal.ouss...@gmail.com>
> >
> > > You can split in your application using a regular expression on the
> > > underscore char if the language supports them (like splitting the data
> > > of a CSV file)
> > >
> > >
> > > 2013/3/4 Lukáš Drbal <lukas.dr...@gmail.com>
> > >
> > > > Hi,
> > > >
> > > > I have one question about rowkey design and presplitting a table.
> > > >
> > > > My use case:
> > > > I need to store a lot of comments, where each comment belongs to one
> > > > article and each article has one category.
> > > >
> > > > What I need:
> > > > 1) read one comment by id (where I know commentId, articleId and
> > > > categoryId)
> > > > 2) read all comments for an article (I know categoryId and articleId)
> > > > 3) read all comments for a category (I know categoryId)
> > > >
> > > > From this read pattern I see one good rowkey:
> > > > <categoryId>_<articleId>_<commentId>
> > > >
> > > > But here I don't have a fixed-size rowkey, so I don't know how to
> > > > define a split pattern. How can this be solved?
> > > > These ids come from an external system and grow very fast, so adding
> > > > something like "padding" to each part is hard.
> > > >
> > > > Maybe I can use a hash function for each part,
> > > > md5(<categoryId>)_md5(<articleId>)_md5(<commentId>), but this rowkey is
> > > > very long (3*32+2 bytes), and I don't have experience with such long
> > > > rowkeys.
> > > >
> > > > Can someone give me a suggestion, please?
> > > >
> > > > Regards
> > > >
> > > > Lukas Drbal
> >
> > --
> > Save The World - http://www.worldcommunitygrid.org/
> > http://www.worldcommunitygrid.org/stat/viewMemberInfo.do?userName=LesTR
> >
> > LesTR
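
For anyone following the thread, here is a minimal sketch in plain Java of how the <categoryId>_<articleId>_<commentId> rowkey could serve the three read patterns, and of what "splitting along category boundaries" might look like. It does not use the HBase API: a TreeSet stands in for HBase's sorted rows, and the class and every helper name (RowKeySketch, rowKey, articlePrefix, categoryPrefix, stopKey, scanPrefix, categoryBoundary) are hypothetical. It assumes the ids are plain ASCII digits, so String ordering matches byte ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class RowKeySketch {
    // Hypothetical helper: builds <categoryId>_<articleId>_<commentId>.
    public static String rowKey(long categoryId, long articleId, long commentId) {
        return categoryId + "_" + articleId + "_" + commentId;
    }

    // Prefix for "all comments of one article". The trailing '_' keeps
    // article 7 from also matching article 70.
    public static String articlePrefix(long categoryId, long articleId) {
        return categoryId + "_" + articleId + "_";
    }

    // Prefix for "all comments of one category".
    public static String categoryPrefix(long categoryId) {
        return categoryId + "_";
    }

    // Exclusive upper bound for a prefix range: bump the last character.
    // Assumes a non-empty ASCII prefix.
    public static String stopKey(String prefix) {
        int last = prefix.length() - 1;
        return prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
    }

    // Simulated range read over sorted rows: [prefix, stopKey(prefix)).
    public static List<String> scanPrefix(TreeSet<String> rows, String prefix) {
        return new ArrayList<>(rows.subSet(prefix, stopKey(prefix)));
    }

    // Illustration of the idea only, not HBase's code: cut a candidate
    // split key at the first '_' so a split never lands inside a category.
    public static String categoryBoundary(String candidateSplitKey) {
        int i = candidateSplitKey.indexOf('_');
        return i < 0 ? candidateSplitKey : candidateSplitKey.substring(0, i);
    }

    public static void main(String[] args) {
        TreeSet<String> rows = new TreeSet<>();
        rows.add(rowKey(1, 7, 100));
        rows.add(rowKey(1, 7, 101));
        rows.add(rowKey(1, 70, 5));
        rows.add(rowKey(12, 7, 1));

        // Pattern 2: all comments for article 7 in category 1.
        System.out.println(scanPrefix(rows, articlePrefix(1, 7)));
        // prints [1_7_100, 1_7_101]

        // Pattern 3: all comments for category 1 (category 12 is excluded).
        System.out.println(scanPrefix(rows, categoryPrefix(1)));
        // prints [1_70_5, 1_7_100, 1_7_101]

        // A mid-region key truncated to its category part.
        System.out.println(categoryBoundary("12_7_1"));
        // prints 12
    }
}
```

Pattern 1 (one comment by id) is a direct lookup on the full key from rowKey. Because the delimiter is part of each prefix, variable-length ids do not need padding for the range reads to stay correct, although rows will sort lexicographically rather than numerically.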