Re: Deadlock when mapping a table?

Stack Sat, 10 Apr 2010 19:41:34 -0700

On Sat, Apr 10, 2010 at 4:38 PM, Joost Ouwerkerk <[email protected]> wrote:
> We're mapping a table with about 2 million rows in 100 regions on 40 nodes.
> In each map, we're doing a random read on the same table.  We're
> encountering a situation that looks a lot like deadlock.  When the job is
> launched, some of the tasktrackers appear to get blocked in doing the first
> random read.  The only trace we get is an eventual Unknown Scanner Exception
> in the RegionServer log, at which point the task is actually reported as
> successfully completed by MapReduce (1 row processed).  There is no error in
> the task's log.  The job completes as SUCCESSFUL with an incomplete number
> of rows.  In the worst case scenario, we've actually seen ALL the
> tasktrackers encounter this problem; the job completes successfully with 100
> rows processed (1 per region).



Any chance of a thread dump from the problematic RS at the time?  Can
you even figure out which one is the culprit?  There is a known deadlock
that can happen when writing (HBASE-2322), but this seems like something
else.  If it's a deadlock, the JVM can often recognize it as such, and it
will be detailed at the tail of the thread dump.  Todd has also been
messing with jcarder (sp?).  That found HBASE-2322, but that's all it
found, I believe (I need to run it on the next release candidate before
it becomes a release candidate).  Maybe you're running into very slow
reads because you don't have HBASE-2180?
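
For reference, the deadlock detection that shows up at the tail of a
thread dump is also available programmatically through the standard
java.lang.management API.  The following is only a minimal sketch: the
class name DeadlockCheck and the method describeDeadlocks are made up
for illustration and are not part of HBase, and the check only sees
threads in the JVM it runs in.  For a live RegionServer, running jstack
against the process id (or sending it SIGQUIT) is usually the simpler
way to get the same information.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper (not part of HBase): reports threads that the JVM
// considers deadlocked, similar to the summary at the end of a thread dump.
public class DeadlockCheck {

    /** Returns one line per deadlocked thread, or an empty string if none. */
    public static String describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Covers both monitor locks and java.util.concurrent ownable
        // synchronizers; returns null when no deadlock is detected.
        long[] ids = threads.findDeadlockedThreads();
        if (ids == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : threads.getThreadInfo(ids, true, true)) {
            if (info == null) {
                continue; // thread exited between the two calls
            }
            sb.append(info.getThreadName())
              .append(" is waiting on ").append(info.getLockName())
              .append(" held by ").append(info.getLockOwnerName())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String report = describeDeadlocks();
        System.out.println(report.isEmpty() ? "No deadlock detected" : report);
    }
}
```

An empty result does not rule out a hang: threads that are merely
blocked on slow I/O or waiting on a remote call are not reported as
deadlocked, so the full thread dump is still worth capturing.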

St.Ack



>
> When we remove the code that does the random read in the map, there are no
> problems.
>
> Anyone?  This is driving me crazy because I can't reproduce it locally (it
> only seems to be a problem in a distributed environment with many nodes) and
> because there is no stacktrace besides the scanner exception (which is
> clearly a symptom, not a cause).
>
> j
>
