Lars,

We tried, but I didn't know there was such a contention issue. We have two different column families. The first one contains data that is partially used as a filter, and the actual data lives in the second column family.
So, the outer scanner (the first one) goes through the table and picks out the keys that contain the required data. Then these keys are passed to the inner (second) scanner.

BTW, the second scanner utilizes FuzzyRowFilter:
http://blog.sematext.com/2012/08/09/consider-using-fuzzyrowfilter-when-in-need-for-secondary-indexes-in-hbase/

We have a pretty small cluster - only 18 mappers, but it looks like that's enough to get contention =)

On Thu, Dec 20, 2012 at 10:51 PM, lars hofhansl <lhofha...@yahoo.com> wrote:

> Cool.
>
> You probably made it less likely that your scanners will scan the same
> HFile in parallel.
>
> -- Lars
>
>
> ________________________________
> From: Eugeny Morozov <emoro...@griddynamics.com>
> To: user@hbase.apache.org; lars hofhansl <lhofha...@yahoo.com>
> Sent: Thursday, December 20, 2012 2:32 AM
> Subject: Re: Many scanner opening
>
> Lars,
>
> Cool stuff! Thanks a lot! I'm not sure I can apply the patch, cause we're
> using CDH-4.1.1, but increasing size of internal scanner does the trick -
> decreased number of scanners.
> At least temporarily it's good enough.
>
> Thanks!
>
> On Wed, Dec 19, 2012 at 6:23 AM, lars hofhansl <lhofha...@yahoo.com>
> wrote:
>
> > You might have run into HBASE-7336.
> > (Not available in any official release, yet)
> >
> > If you're using 0.94 (and probably 0.92) you can just apply this patch
> > (it's safe and simple).
> >
> >
> > ________________________________
> > From: Eugeny Morozov <emoro...@griddynamics.com>
> > To: user@hbase.apache.org
> > Sent: Tuesday, December 18, 2012 12:01 AM
> > Subject: Many scanner opening
> >
> > Hello!
> >
> > We faced an issue recently that the more map tasks are completed, the
> > longer it takes to complete one more map task.
> >
> > In our architecture we have two scanners to read the table. The first one,
> > which is called 'outer' scanner is reading table and filter some rowkeys.
> > These rowkeys are used as a filter for second scanner - 'internal'. Thus we
> > constantly open 'internal' scanner with different filters.
> >
> > As an additional symptoms we see that our cluster practically does nothing
> > - there is no CPU loading, no disk loading, no network, etc. Most of the
> > time it means we are waiting on some locks, but I'm not sure.
> >
> > I would appreciate any ideas or suggestions to understand the case.
> > Thank you in advance.
> > --
> > Evgeny Morozov
> > Developer Grid Dynamics
> > Skype: morozov.evgeny
> > www.griddynamics.com
> > emoro...@griddynamics.com
>
> --
> Evgeny Morozov
> Developer Grid Dynamics
> Skype: morozov.evgeny
> www.griddynamics.com
> emoro...@griddynamics.com

--
Evgeny Morozov
Developer Grid Dynamics
Skype: morozov.evgeny
www.griddynamics.com
emoro...@griddynamics.com
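
P.S. To make the "fewer inner scanners" idea from this thread concrete, here is a minimal sketch in plain Java. It is not our actual job code and it does not call the HBase API. The class InnerScanBatching and its methods partition and fuzzyMatches are hypothetical names made up for illustration. It assumes that "increasing size of internal scanner" means handing more outer-scanner keys to each inner scan, and it models fuzzy matching in a simplified way: mask byte 0 means the position is fixed, mask byte 1 means any byte is accepted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: groups the row keys selected by the
// outer scanner so that one inner scan can serve a whole batch of keys.
class InnerScanBatching {

    // Splits keys into consecutive batches of at most batchSize elements.
    static List<List<byte[]>> partition(List<byte[]> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<byte[]>> batches = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += batchSize) {
            int end = Math.min(start + batchSize, keys.size());
            batches.add(new ArrayList<>(keys.subList(start, end)));
        }
        return batches;
    }

    // Simplified model of a fuzzy row match (an assumption, not HBase code):
    // mask[i] == 0 -> row[i] must equal pattern[i]; mask[i] == 1 -> any byte.
    static boolean fuzzyMatches(byte[] row, byte[] pattern, byte[] mask) {
        if (pattern.length != mask.length || row.length < pattern.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && row[i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Ten made-up keys standing in for the outer scanner's output.
        List<byte[]> keys = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            keys.add(new byte[] {0, (byte) i});
        }
        List<List<byte[]>> batches = partition(keys, 4);
        System.out.println("keys: " + keys.size()
                + ", inner scans needed: " + batches.size());
    }
}
```

With a batch size of four, the ten keys above need three inner scans instead of ten. In a real job each batch would presumably become one scan whose filter carries all the patterns of that batch, which is the part this sketch leaves out.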