I will add this to the troubleshooting page of the wiki, good job Andrew. As a next step, maybe HBase should handle this in a better way?
J-D

On Mon, Jul 21, 2008 at 8:09 AM, Renaud Delbru <[EMAIL PROTECTED]> wrote:
> Hi Andrew,
>
> Thanks for your explanations, they give me more insight into how HBase
> works internally.
>
> The cause of the problem was in fact excessive load on the servers.
> Processes were stuck due to excessive swapping. I have reduced the number
> of parallel Hadoop tasks from 24 to 12 and then ran a Hadoop job for
> testing. There are no more RPC timeout errors in the logs, and HBase stays
> stable during the whole process (no regionserver went down).
>
> Thanks Andrew and Jean-Daniel for the help and advice.
> --
> Renaud Delbru
>
> Andrew Purtell wrote:
>> [...] excessive load, with processes stuck in disk wait, and/or threads
>> starved for CPU time, etc. Processes can be stuck in disk wait while
>> being swapped in or out also, so excessive RAM/heap is a third
>> possibility.
>>
>>> I have two questions, to try to understand better what is happening:
>>>
>>> 1. Why does HBase force the regionserver to restart? In which
>>> particular cases is this triggered?
>>
>> A regionserver restarts itself when there is a serious internal error.
>> There are many potential causes of a restart. In your case the channel
>> to DFS failed.
>>
>>> 2. Why is the region server not correctly restarted? HBase forces the
>>> region server to restart, but in fact the region server is just shut
>>> down; the HRegionServer process is still alive. As a consequence, the
>>> master cannot connect to the region server, and any insert is refused.
>>
>> It is the client that is attempting to talk to the region server, e.g.
>> to insert. Clients get region location assignments from the master,
>> cache them, and then interact with the regionservers directly. Only
>> after a certain number of retries following a regionserver failure will
>> the client go back to the master and ask whether a region has received
>> a new assignment. If the system is operating properly, the regionserver
>> would have stopped reporting to the master, the master would have
>> declared it dead, and the master would have reassigned the affected
>> regions to other regionservers. The client then discovers the
>> reassignment, contacts the newly assigned regionserver(s), and progress
>> continues (eventually).
>>
>>> If this is related to a timeout answer from one region server or from
>>> the master, is it possible to modify some options to tell HBase to
>>> wait until it gets an answer, and not to force the region server to
>>> restart?
>>
>> If a regionserver has encountered a fatal error, there is no other
>> meaningful option but to restart.
>>
>>> Thanks for your help.
>>>
>>> --
>>> Renaud Delbru
>>>
>>> Andrew Purtell wrote:
>>>> Actually I missed this:
>>>>
>>>>> 2008-07-19 07:31:45,505 WARN org.apache.hadoop.dfs.DFSClient:
>>>>> DataStreamer Exception: java.net.SocketTimeoutException: timed out
>>>>> waiting for rpc response
>>>>
>>>> So it was a socket timeout. I think you need to spread the load by
>>>> adding more DFS datanodes at a minimum and probably also HBase
>>>> regionservers, or reduce concurrency.
>>>>
>>>> - Andy
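The client retry behaviour Andrew describes above is driven by client-side settings. Below is a minimal sketch of an upload using the 0.2-era HTable/BatchUpdate client API with those knobs set explicitly; the property names, the exact constructor signatures, and the "field:content" column qualifier are assumptions for illustration, not something confirmed in this thread (the "field" family does appear in the attached log).

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;

public class UploadSketch {
  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();
    // How many times the client retries against a region before going back
    // to the master for a fresh region assignment (property name assumed).
    conf.setInt("hbase.client.retries.number", 5);
    // Pause between retries, in milliseconds (property name assumed).
    conf.setLong("hbase.client.pause", 10 * 1000);

    HTable table = new HTable(conf, "page-repository");
    // Row key is the document URL, as in the thread.
    BatchUpdate update = new BatchUpdate("http://my-url/AAA");
    update.put("field:content", "document contents".getBytes("UTF-8"));
    table.commit(update);
  }
}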
>>>> --- On Sat, 7/19/08, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>>>>> From: Andrew Purtell <[EMAIL PROTECTED]>
>>>>> Subject: Re: Replay of hlog required, forcing regionserver restart
>>>>> To: [email protected]
>>>>> Date: Saturday, July 19, 2008, 9:54 AM
>>>>>
>>>>> Hello Renaud,
>>>>>
>>>>> I think your cluster is overloaded.
>>>>>
>>>>>> WARN org.apache.hadoop.hbase.util.Sleeper: We slept 30837ms, ten
>>>>>> times longer than scheduled: 3000
>>>>>> 2008-07-18 03:55:57,783 WARN org.apache.hadoop.hbase.util.Sleeper:
>>>>>> We slept 583701ms, ten times longer than scheduled: 3000
>>>>>
>>>>> See https://issues.apache.org/jira/browse/HBASE-616. I think these
>>>>> types of messages imply excessive load, with processes stuck in disk
>>>>> wait, and/or threads starved for CPU time, etc. Processes can be
>>>>> stuck in disk wait while being swapped in or out also, so excessive
>>>>> RAM/heap is a third possibility. Given a 3 node cluster with a full
>>>>> complement of Hadoop and HBase daemons and 24 tasks, perhaps all
>>>>> three load dimensions are redlined. :-)
>>>>>
>>>>> It's hard to say what exactly was the root cause of the exception on
>>>>> the DFS client channel which brought down the regionserver:
>>>>>
>>>>>> java.io.IOException: Premeture EOF from inputStream
>>>>>>   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100)
>>>>>
>>>>> This could have been a socket timeout, or a fault on the data node.
>>>>> Does anything useful/meaningful appear in the datanode log?
>>>>>
>>>>> - Andy
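Since the eventual fix in this thread was cutting the concurrent upload tasks from 24 to 12, here is a rough sketch of where that concurrency is controlled, assuming the Hadoop 0.17-era JobConf API. The per-node slot value in the comment is an assumption, not taken from the thread.

import org.apache.hadoop.mapred.JobConf;

public class UploadJobConfig {
  // Returns the job configuration with the uploader parallelism dialed down.
  public static JobConf configure(JobConf job) {
    // Hint for the total number of map tasks in the upload job (24 -> 12).
    job.setNumMapTasks(12);
    // The hard per-node cap is a tasktracker setting, not a job setting:
    // set mapred.tasktracker.map.tasks.maximum in hadoop-site.xml on each
    // node (for example 4 slots per node on a 3-node cluster) and restart
    // the tasktrackers. The per-node value of 4 is an assumption.
    return job;
  }
}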
>>>>> --- On Sat, 7/19/08, Renaud Delbru <[EMAIL PROTECTED]> wrote:
>>>>>> From: Renaud Delbru <[EMAIL PROTECTED]>
>>>>>> Subject: Re: Replay of hlog required, forcing regionserver restart
>>>>>> To: [email protected]
>>>>>> Date: Saturday, July 19, 2008, 7:46 AM
>>>>>>
>>>>>> The cluster is currently composed of 3 nodes. The cluster is also
>>>>>> used for executing Hadoop jobs. One of the nodes acts as the Hadoop
>>>>>> master, an HMaster and an HRegionServer.
>>>>>> The current settings are: Xmx=1024M for the Hadoop master,
>>>>>> Xmx=2048M for the HMaster, and Xmx=2048M for each HRegionServer.
>>>>>> The inserts are somewhat randomised. We have 24 Hadoop tasks
>>>>>> uploading various documents. The row key is the URL of the
>>>>>> document, and each Hadoop task has to upload a range of URLs (for
>>>>>> example, http://my-url/AAA - http://my-url/BBB). Thus, I assume
>>>>>> that each Hadoop task will hit a different region and that the
>>>>>> insert load is distributed among the nodes.
>>>>>>
>>>>>> Thanks.
>>>>>> --
>>>>>> Renaud Delbru
>>>>>>
>>>>>> Jean-Daniel Cryans wrote:
>>>>>>> Ok then, at least having the nodes synchronized keeps you safe
>>>>>>> from other problems. By the way, how many nodes do you have?
>>>>>>>
>>>>>>> One other thing to check is your writing pattern. If you insert
>>>>>>> data with sequential row keys (an example would be designing your
>>>>>>> row keys as an incremented primary key equivalent, left padded
>>>>>>> with zeros), these inserts will always hit a single region, so the
>>>>>>> load will not be distributed. If that's your case, try randomizing
>>>>>>> the inserts.
>>>>>>>
>>>>>>> I will also look at your logs Monday morning.
>>>>>>>
>>>>>>> Thx,
>>>>>>> J-D
>>>>>>>
>>>>>>> On Sat, Jul 19, 2008 at 5:47 AM, Renaud Delbru <[EMAIL PROTECTED]> wrote:
>>>>>>>> Hi Jean-Daniel,
>>>>>>>>
>>>>>>>> I synchronised all the nodes to the same ntp server yesterday.
>>>>>>>> This morning, another regionserver went down for the same reason
>>>>>>>> (log attached).
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>> --
>>>>>>>> Renaud Delbru
>>>>>>>>
>>>>>>>> Jean-Daniel Cryans wrote:
>>>>>>>>> Renaud,
>>>>>>>>>
>>>>>>>>> Yes, the impact is important. Try fixing it first, because
>>>>>>>>> typically heavy write load problems only occur when there is a
>>>>>>>>> very small number of nodes and regions. We have seen this many
>>>>>>>>> times.
>>>>>>>>>
>>>>>>>>> Happy fixing!
>>>>>>>>> J-D
>>>>>>>>>
>>>>>>>>> On Fri, Jul 18, 2008 at 5:20 PM, Renaud Delbru <[EMAIL PROTECTED]> wrote:
>>>>>>>>>> Hi Jean-Daniel,
>>>>>>>>>>
>>>>>>>>>> My nodes' clocks are desynchronised (each node has a different
>>>>>>>>>> time). Could it have an impact on HBase?
>>>>>>>>>>
>>>>>>>>>> At the moment of the failure, the node had about 55 regions
>>>>>>>>>> (with hbase.hregion.max.filesize = 128M).
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>> --
>>>>>>>>>> Renaud Delbru
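For the sequential-row-key problem Jean-Daniel describes above, one common workaround is to salt the key so that consecutive keys spread across many regions instead of one. A small illustrative sketch follows; the bucket scheme and prefix width are assumptions, not something from this thread, and note that salting changes the table's scan order.

import java.security.MessageDigest;

public class SaltedKey {
  // Prefix the natural key (a URL here) with a short hash so that
  // consecutive keys no longer all land on the same region.
  public static String saltedRowKey(String url) throws Exception {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    byte[] digest = md5.digest(url.getBytes("UTF-8"));
    // Two hex characters of the digest give 256 buckets across the key space.
    String bucket = String.format("%02x", digest[0] & 0xff);
    return bucket + "/" + url;
  }

  public static void main(String[] args) throws Exception {
    // Prints the bucketed key, e.g. "<xx>/http://my-url/AAA"; range scans by
    // URL prefix then have to fan out over all buckets.
    System.out.println(saltedRowKey("http://my-url/AAA"));
  }
}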
>>>>>>>>>>
>>>>>>>>>> Jean-Daniel Cryans wrote:
>>>>>>>>>>> Renaud,
>>>>>>>>>>>
>>>>>>>>>>> Just to make sure, are all your nodes' clocks synchronized?
>>>>>>>>>>> Also, at the moment of that failure, how many regions did the
>>>>>>>>>>> table have?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> J-D
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 18, 2008 at 5:11 AM, Renaud Delbru <[EMAIL PROTECTED]> wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I am using HBase 0.2.0-dev, Hudson Build #208.
>>>>>>>>>>>> I am experiencing a problem with HBase that looks like the
>>>>>>>>>>>> issue https://issues.apache.org/jira/browse/HBASE-15. During
>>>>>>>>>>>> an upload-intensive task, one of the regionservers went down,
>>>>>>>>>>>> forcing a replay of the hlog. This is not the first time we
>>>>>>>>>>>> have experienced this problem; it already happened on another
>>>>>>>>>>>> cluster node a few days ago, with the same settings.
>>>>>>>>>>>> I have attached the regionserver log. I don't notice anything
>>>>>>>>>>>> particular in the hadoop datanode log.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks in advance for any advice.
>>>>>>>>>>>> --
>>>>>>>>>>>> Renaud Delbru
>>>>>>>>>>>>
>>>>>>>>>>>> 2008-07-18 03:40:51,861 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 30008
>>>>>>>>>>>> 2008-07-18 03:40:52,073 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_140.203.154.176_1216023782181_60020/hlog.dat.1216348851959
>>>>>>>>>>>> 2008-07-18 03:43:36,257 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 30008
>>>>>>>>>>>> 2008-07-18 03:43:37,020 INFO org.apache.hadoop.hbase.regionserver.HLog: New log writer created at /hbase/log_140.203.154.176_1216023782181_60020/hlog.dat.1216349016755
>>>>>>>>>>>> 2008-07-18 03:45:12,963 INFO org.apache.hadoop.hbase.regionserver.HRegion: starting compaction on region page-repository,http://dbpedia.org/resource/Judah_ibn_Ghayyat,1215919264401
>>>>>>>>>>>> 2008-07-18 03:45:58,868 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 30837ms, ten times longer than scheduled: 3000
>>>>>>>>>>>> 2008-07-18 03:55:57,783 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 583701ms, ten times longer than scheduled: 3000
>>>>>>>>>>>> 2008-07-18 03:55:57,787 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 21011
>>>>>>>>>>>> 2008-07-18 03:55:58,410 WARN org.apache.hadoop.dfs.DFSClient: Exception while reading from blk_-6424749704096696952 of /hbase/page-repository/942413588/field/mapfiles/7570604533190594376/data from 140.203.154.176:50010: java.io.IOException: Premeture EOF from ...
>
> [Message clipped]
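The Sleeper warnings in the log above come from a simple self-check: a periodic thread measures how long it actually slept versus how long it asked to sleep, and warns when it is ten times over schedule, which is how stalls from swapping or long GC pauses become visible. Below is an illustrative sketch of that kind of check, assuming only the behaviour implied by the log message; it is not the actual org.apache.hadoop.hbase.util.Sleeper source.

public class SleeperCheck {
  // Sleep for a fixed period and complain if we wake up an order of
  // magnitude late; long stalls from swapping or GC show up this way.
  public static void sleepAndCheck(long periodMs) throws InterruptedException {
    long start = System.currentTimeMillis();
    Thread.sleep(periodMs);
    long slept = System.currentTimeMillis() - start;
    if (slept > 10 * periodMs) {
      System.err.println("We slept " + slept
          + "ms, ten times longer than scheduled: " + periodMs);
    }
  }
}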
