Thanks Tim,

I suspect that it should work unless you get so many connections trying to hit 
the same region that you overwhelm its ability to handle the scans properly.
(Or there was a problem in the OP's code.)

Scans should be 'dirty reads' imho.


-Mike
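
For anyone who wants to try the check discussed further down in this thread (a unique integer per row, a bunch of unfiltered scans in parallel, compare the sums), here is a minimal sketch in plain Java. It is only a stand-in under stated assumptions: the "table" is an in-memory ConcurrentSkipListMap rather than a real HBase table, and the names ParallelScanCheck, buildTable, scanAndSum and runScans are made up for illustration. To test a real cluster, scanAndSum would have to be replaced with an actual scan through the HBase client.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelScanCheck {

    // ASSUMPTION: in-memory stand-in for a small single-region table.
    // Each row key maps to a unique integer (1, 2, 3, ...).
    public static ConcurrentSkipListMap<String, Long> buildTable(int rows) {
        ConcurrentSkipListMap<String, Long> table = new ConcurrentSkipListMap<>();
        for (int i = 1; i <= rows; i++) {
            table.put(String.format("row-%08d", i), (long) i);
        }
        return table;
    }

    // One "scanner": a full, unfiltered pass that sums the value of every row.
    public static long scanAndSum(Map<String, Long> table) {
        long sum = 0;
        for (long value : table.values()) {
            sum += value;
        }
        return sum;
    }

    // Runs the given number of scanners in parallel threads over the same
    // table and returns the sum each one computed.
    public static List<Long> runScans(Map<String, Long> table, int scanners)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(scanners);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < scanners; i++) {
                tasks.add(() -> scanAndSum(table));
            }
            List<Long> sums = new ArrayList<>();
            for (Future<Long> result : pool.invokeAll(tasks)) {
                sums.add(result.get());
            }
            return sums;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int rows = 10_000;
        int scanners = 20;
        long expected = (long) rows * (rows + 1) / 2;  // 1 + 2 + ... + rows
        Set<Long> distinct = new HashSet<>(runScans(buildTable(rows), scanners));
        System.out.println("expected=" + expected + " distinct sums=" + distinct);
    }
}
```

If every scanner reports the same sum, n*(n+1)/2 for rows numbered 1..n, the scans agree; any differing sum points at rows that were missed or read twice.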

> Date: Thu, 22 Apr 2010 18:57:01 +0200
> Subject: Re: multiple scanners on same table will cause problem? Scan results change among different tries.
> From: timrobertson...@gmail.com
> To: hbase-user@hadoop.apache.org
> 
> Attached is a quickly hacked test for parallel scanning threads.  You
> might want to increase the amount of data in the test though to test
> properly.
> It seems to pass consistently for me.
> 
> Note it uses a shared HTable object across threads, but the API states:
> "Used to communicate with a single HBase table. This class is not
> thread safe for writes. Gets, puts, and deletes take out a row lock
> for the duration of their operation. Scans (currently) do not respect
> row locking."
> 
> But I am not doing any writes in the test.
> 
> Cheers,
> Tim
> 
> 
> 
> On Thu, Apr 22, 2010 at 4:22 PM, Michael Segel
> <michael_se...@hotmail.com> wrote:
> >
> >
> > Tim,
> >
> > Even without his code, it should be pretty straightforward to 
> > duplicate this.
> >
> > Create the table with a sequence as a column in a column family.
> > Then write a non-m/r job that has multiple threads that connect to
> > HBase and see what they get when they hit the small table in a single 
> > region.
> >
> > If you can duplicate the problem, that would be the test code for the jira.
> >
> > -Mike
> >
> >> Date: Thu, 22 Apr 2010 16:13:31 +0200
> >> Subject: Re: multiple scanners on same table will cause problem? Scan results change among different tries.
> >> From: timrobertson...@gmail.com
> >> To: hbase-user@hadoop.apache.org
> >>
> >> Could you please post your code that is doing the scanning, Steven?
> >>
> >>
> >>
> >> On Thu, Apr 22, 2010 at 3:50 PM, Michael Segel
> >> <michael_se...@hotmail.com> wrote:
> >> >
> >> > Ok...
> >> >
> >> > This is something that I think we'll need input from a major 
> >> > contributor...
> >> >
> >> > It looks like there may be an issue with respect to row locking...
> >> >
> >> > I guess the questions to ask are:
> >> >
> >> > - How does HBase handle row level locking?
> >> > -Concurrent reads/fetches of the same row?
> >> >
> >> > To be honest and fair, HBase is still an immature product when compared 
> >> > to databases, and there are going to be some issues that need to be 
> >> > fleshed out.  (Let's see where we are in 20+ years ;-)
> >> >
> >> > I wish I knew more about the internals of HBase, but there are only so 
> >> > many hours in the day and my wife forces me to work so I can keep up 
> >> > with her spending. ;-) (And if any of you happen to ever meet her, 
> >> > please don't bring this up, she'll kill me. :-D   )
> >> >
> >> > Let's see what St.Ack or Andrew have to say. This might be a JIRA issue.
> >> >
> >> > Thx
> >> >
> >> > -Mike
> >> >
> >> >
> >> >
> >> >> Date: Thu, 22 Apr 2010 20:17:12 +0800
> >> >> Subject: Re: multiple scanners on same table will cause problem? Scan results change among different tries.
> >> >> From: steven.zhuang.1...@gmail.com
> >> >> To: hbase-user@hadoop.apache.org
> >> >>
> >> >> hi, Michael,
> >> >>
> >> >> Sorry for not making the question clear. There are multiple scanners
> >> >> scanning a single table, so it may be the case that multiple scanners
> >> >> are reading from a single region.
> >> >> Please see answers inline.
> >> >>
> >> >> On Thu, Apr 22, 2010 at 8:08 PM, Michael Segel 
> >> >> <michael_se...@hotmail.com> wrote:
> >> >>
> >> >> >
> >> >> > I'm sorry, but are you trying to say that you have multiple scanners
> >> >> > trying to read from a single region and the result sets do not match?
> >> >> >
> >> >> Yes, the result sets do not match.
> >> >>
> >> >> > I guess it would be an easy test: enter a bunch of rows into a
> >> >> > region and have a unique integer for each row. (1,2,3,...)
> >> >> > Then run a bunch of unfiltered scans in parallel, and generate a sum
> >> >> > from the scan. If any of the sums do not match, then you have a
> >> >> > potential issue with concurrency/row locking and row isolation level.
> >> >> > How does HBase handle row level locking and isolation levels?
> >> >> >
> >> >> I have iterated over the rows/column families/cells and printed the
> >> >> content of each cell, and found that some cells are missing in some
> >> >> scan result sets.
> >> >>
> >> >> > -Mike
> >> >> >
> >> >> > > Date: Thu, 22 Apr 2010 17:07:47 +0800
> >> >> > > Subject: multiple scanners on same table will cause problem? Scan results change among different tries.
> >> >> > > From: steven.zhuang.1...@gmail.com
> >> >> > > To: hbase-user@hadoop.apache.org
> >> >> > >
> >> >> > > hi, All,
> >> >> > >           Has anybody scanned one table using multiple scanners at
> >> >> > > the same time and found inconsistent results?
> >> >> > >           I am querying a table using dozens (20-120) of scanners
> >> >> > > in parallel (multiple threads), trying to take advantage of the
> >> >> > > multiple cores. But I found the scan results aren't consistent
> >> >> > > across several runs. I have checked my code, and there seems to be
> >> >> > > no bug in it. So I guess the problem may come from HBase itself.
> >> >> > >           My HBase version is 0.20.3.
> >> >> >
> >> >
> >
                                          