So the whole point of getting the region locations is to ensure that there is one thread per region server?

On Wed, Oct 5, 2011 at 4:42 PM, lars hofhansl <lhofha...@yahoo.com> wrote:
> Hi Sam,
>
> There were some attempts to build this in. In the end I think the exact
> patterns are different based on what one is trying to achieve.
> Currently what you can do is get all the region locations
> (HTable.getRegionLocations). From the HRegionInfos you can
> get the regions' start and end keys.
> Now you can issue parallel scans for as many regions as you want (by
> creating a Scan object with start and stop row set to the region's
> start and end key).
> You probably want to group the regions by regionserver and have one thread
> per region server, or something.
>
> -- Lars
>
> ________________________________
> From: Sam Seigal <selek...@yahoo.com>
> To: hbase-u...@hadoop.apache.org
> Sent: Wednesday, October 5, 2011 4:29 PM
> Subject: Using Scans in parallel
>
> Hi,
>
> Is there a known way to do Scans in parallel (in different
> threads even) and then sort/combine the output?
>
> For a row key like:
>
> prefix-event_type-event_id
> prefix-event_type-event_id
>
> I want to declare two scan objects (for say event_id_type foo)
>
> Scan 1 => 0-foo
> Scan 2 => 1-foo
>
> execute the scans in parallel (maybe even in different threads) and
> then merge the results?
>
> Thank you,
>
> Sam
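For illustration, here is a rough sketch of the pattern Lars describes: group the key ranges by region server, run one worker thread per server, then merge and sort the rows. It is plain Java with no HBase dependency, so everything in it is a stand-in and an assumption on my part: `Region`, `scan` and `parallelScan` are hypothetical names, and the "table" is an in-memory `TreeMap`. With the real client, the ranges would come from HTable.getRegionLocations and each range would become a Scan with its start and stop row set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelScanSketch {

    /** Hypothetical stand-in for a region: key range [startKey, endKey) on one server.
     *  An empty endKey means "no upper bound", as for the last region of a table. */
    static final class Region {
        final String server;
        final String startKey;
        final String endKey;

        Region(String server, String startKey, String endKey) {
            this.server = server;
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Stand-in for a scan bounded by a start row (inclusive) and stop row (exclusive). */
    static List<String> scan(NavigableMap<String, String> table, Region r) {
        NavigableMap<String, String> range = r.endKey.isEmpty()
                ? table.tailMap(r.startKey, true)
                : table.subMap(r.startKey, true, r.endKey, false);
        return new ArrayList<>(range.keySet());
    }

    /** Scans all regions with one thread per server and returns the row keys sorted. */
    static List<String> parallelScan(NavigableMap<String, String> table, List<Region> regions) {
        // Group the regions by the server that hosts them.
        Map<String, List<Region>> byServer = new TreeMap<>();
        for (Region r : regions) {
            byServer.computeIfAbsent(r.server, s -> new ArrayList<>()).add(r);
        }

        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, byServer.size()));
        try {
            // One task per server; each task scans that server's regions in turn.
            // The table is only read here, never modified, so sharing it is safe.
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<Region> serverRegions : byServer.values()) {
                futures.add(pool.submit(() -> {
                    List<String> rows = new ArrayList<>();
                    for (Region r : serverRegions) {
                        rows.addAll(scan(table, r));
                    }
                    return rows;
                }));
            }

            // Combine the per-server results and sort them by row key.
            List<String> merged = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                merged.addAll(f.get());
            }
            Collections.sort(merged);
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The same shape would cover Sam's two-prefix case: treat "0-foo" and "1-foo" as two ranges and merge the two result lists. The final sort is the simplest way to combine; if each range already comes back in key order, a k-way merge would avoid re-sorting.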