Hi Cyril,

BTW, have you checked dfs.datanode.max.xcievers and ulimit -n? When underconfigured they can cause this type of error, even if it seems that's not the case here...
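A quick way to verify both settings (a sketch only — the config path below assumes a CDH4-style layout under /etc/hadoop/conf, and the recommended values are the usual HBase guidance, not something measured on this cluster):

```shell
# Open-file limit for the user that runs the datanode/regionserver daemons;
# run this as that user. HBase docs commonly recommend 10240 or more.
ulimit -n

# Transceiver ceiling on each datanode. The property name really is spelled
# "xcievers"; old defaults were very low (256), 4096 is a common setting.
grep -A1 'dfs.datanode.max.xcievers' /etc/hadoop/conf/hdfs-site.xml
```

If either value is at its default, bumping it and restarting the datanodes is cheap to try before digging further.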
Cheers,
N.

On Fri, Jul 6, 2012 at 11:31 AM, Cyril Scetbon <cyril.scet...@free.fr> wrote:
> The file is now missing, but I have tried with another one and you can see the error:
>
> shell> hdfs dfs -ls "/hbase/.logs/hb-d11,60020,1341097456894-splitting/hb-d11%2C60020%2C1341097456894.1341421613446"
> Found 1 items
> -rw-r--r--   4 hbase supergroup   0 2012-07-04 17:06 /hbase/.logs/hb-d11,60020,1341097456894-splitting/hb-d11%2C60020%2C1341097456894.1341421613446
> shell> hdfs dfs -cat "/hbase/.logs/hb-d11,60020,1341097456894-splitting/hb-d11%2C60020%2C1341097456894.1341421613446"
> 12/07/06 09:27:51 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 3 times
> 12/07/06 09:27:55 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 2 times
> 12/07/06 09:27:59 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 1 times
> cat: Could not obtain the last block locations.
>
> I'm using Hadoop 2.0 from the Cloudera package (CDH4) with HBase 0.92.1
>
> Regards
> Cyril SCETBON
>
> On Jul 5, 2012, at 11:44 PM, Jean-Daniel Cryans wrote:
>
>> Interesting... Can you read the file? Try a "hadoop dfs -cat" on it
>> and see if it goes to the end of it.
>>
>> It could also be useful to see a bigger portion of the master log; for
>> all I know maybe it handles it somehow and there's a problem
>> elsewhere.
>>
>> Finally, which Hadoop version are you using?
>>
>> Thx,
>>
>> J-D
>>
>> On Thu, Jul 5, 2012 at 1:58 PM, Cyril Scetbon <cyril.scet...@free.fr> wrote:
>>> yes:
>>>
>>> /hbase/.logs/hb-d12,60020,1341429679981-splitting/hb-d12%2C60020%2C1341429679981.134143064971
>>>
>>> I did an fsck and here is the report:
>>>
>>>  Status: HEALTHY
>>>  Total size:    618827621255 B (Total open files size: 868 B)
>>>  Total dirs:    4801
>>>  Total files:   2825 (Files currently being written: 42)
>>>  Total blocks (validated):      11479 (avg. block size 53909541 B) (Total open file blocks (not validated): 41)
>>>  Minimally replicated blocks:   11479 (100.0 %)
>>>  Over-replicated blocks:        1 (0.008711561 %)
>>>  Under-replicated blocks:       0 (0.0 %)
>>>  Mis-replicated blocks:         0 (0.0 %)
>>>  Default replication factor:    4
>>>  Average block replication:     4.0000873
>>>  Corrupt blocks:                0
>>>  Missing replicas:              0 (0.0 %)
>>>  Number of data-nodes:          12
>>>  Number of racks:               1
>>> FSCK ended at Thu Jul 05 20:56:35 UTC 2012 in 795 milliseconds
>>>
>>>
>>> The filesystem under path '/hbase' is HEALTHY
>>>
>>> Cyril SCETBON
>>>
>>> On Jul 5, 2012, at 7:59 PM, Jean-Daniel Cryans wrote:
>>>
>>>> Does this file really exist in HDFS?
>>>>
>>>> hdfs://hb-zk1:54310/hbase/.logs/hb-d12,60020,1341429679981-splitting/hb-d12%2C60020%2C1341429679981.1341430649711
>>>>
>>>> If so, did you run fsck in HDFS?
>>>>
>>>> It would be weird if HDFS doesn't report anything bad but somehow the
>>>> clients (like HBase) can't read it.
>>>>
>>>> J-D
>>>>
>>>> On Thu, Jul 5, 2012 at 12:45 AM, Cyril Scetbon <cyril.scet...@free.fr> wrote:
>>>>> Hi,
>>>>>
>>>>> I can no longer start my cluster correctly and get messages like
>>>>> http://pastebin.com/T56wrJxE (taken on one region server)
>>>>>
>>>>> I suppose HBase is not designed for being stopped entirely, only for having some
>>>>> nodes go down??? HDFS is not complaining; it's only HBase that can't
>>>>> start correctly :(
>>>>>
>>>>> I suppose some data has not been flushed, and it's not really important
>>>>> for me.
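Note the "Files currently being written: 42" in that fsck report — the un-splittable WALs are likely among them, still held open by a dead writer's lease. A way to confirm (a sketch; the -openforwrite and -files flags are standard fsck options, the path is from this thread):

```shell
# List files under the HBase WAL directory that HDFS still considers open
# for write. A WAL stuck in OPENFORWRITE state is consistent with the
# "Could not obtain the last block locations" error when reading it.
hdfs fsck /hbase/.logs -openforwrite -files | grep OPENFORWRITE
```

An open file's last block has no finalized length, which is also why `hdfs dfs -ls` can show a size of 0 for a WAL that actually received edits.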
>>>>> Is there a way to fix these errors even if I lose data?
>>>>>
>>>>> thanks
>>>>>
>>>>> Cyril SCETBON