Here are some links that helped me design my keys:

http://www.appfirst.com/blog/best-practices-for-managing-hbase-in-a-high-write-environment/
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
http://hbase.apache.org/book/rowkey.design.html
http://opentsdb.net/docs/build/html/user_guide/backends/hbase.html
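
As a rough illustration of the bucket-prefix idea discussed in the second link, here is a minimal sketch in plain Python (no HBase client involved). It builds the YYYY / YYYYMM / YYYYMMDD / YYYYMMDDHH keys from the original question and prepends a small bucket prefix so that sequential keys do not all land in the same region. The names time_keys, salted_key, scan_prefixes and the bucket count of 8 are my own assumptions for illustration, not anything HBase or HBaseWD defines.

```python
import zlib
from datetime import datetime

NUM_BUCKETS = 8  # hypothetical bucket count; tune to your cluster


def time_keys(ts):
    """Return the yearly, monthly, daily and hourly row keys for a timestamp."""
    formats = ("%Y", "%Y%m", "%Y%m%d", "%Y%m%d%H")
    return [ts.strftime(fmt) for fmt in formats]


def salted_key(key, buckets=NUM_BUCKETS):
    """Prefix a key with a deterministic bucket derived from the key itself.

    Because the bucket is computed from the key, a reader can recompute it
    for a point lookup without having to store it anywhere.
    """
    bucket = zlib.crc32(key.encode("utf-8")) % buckets
    return f"{bucket:02d}-{key}"


def scan_prefixes(buckets=NUM_BUCKETS):
    """All bucket prefixes; a time-range scan has to be run once per prefix."""
    return [f"{b:02d}-" for b in range(buckets)]


if __name__ == "__main__":
    for key in time_keys(datetime(2014, 4, 29, 15)):
        print(salted_key(key))
```

The trade-off is that a range scan over a time period becomes one scan per bucket, with the results merged on the client. Also note that a given key always maps to the same bucket, so this spreads out consecutive hourly keys but does nothing for a single row that takes all the increments.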

Some fun bedtime reading. :)

cheers,
liam

On Tue, Apr 29, 2014 at 3:51 PM, Software Dev <[email protected]> wrote:
> Someone mentioned hotspotting in another post. I guess I could
> reverse the row keys to prevent this?
>
> On Tue, Apr 29, 2014 at 3:34 PM, Software Dev <[email protected]>
> wrote:
> > Hey all. I have some questions regarding row key and column design.
> >
> > We want to calculate some metrics based on our page views, broken down
> > by hour, day, month and year. We also want this broken down by country,
> > with the ability to filter by some other attributes such as the
> > sex of the user or whether or not the user is logged in. Note that
> > these will all be increments.
> >
> > So we have the initial row key design as:
> >
> > YYYY - Row key for yearly totals
> > YYYYMM - Row key for monthly totals
> > YYYYMMDD - Row key for daily totals
> > YYYYMMDDHH - Row key for hourly totals
> >
> > I think this may make sense, as it will be easy to do a range scan over
> > a time period.
> >
> > Now for my column design. We were thinking along these lines:
> >
> > daily:US - Daily counts for the US
> > hourly:CA - Hourly counts for Canada
> > ... and so on
> >
> > Now this seems like it would work, but it fails when we add in the
> > requirement of filtering results based on some other attributes. Say we
> > wanted to be able to filter based on sex (M or F), and/or filter based
> > on logged-in status (Online or Offline), and/or filter based on some
> > other attribute, or perform no filtering at all. How would I go about
> > accomplishing this?
> >
> > Thanks for any input/pointers.
