Hi Cyril,

BTW, have you checked dfs.datanode.max.xcievers and ulimit -n? When underconfigured they can cause this type of error, even if it seems that's not the case here...
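A quick way to verify both settings (a sketch only — the config path below assumes a CDH4-style layout under /etc/hadoop/conf, and the recommended values are the usual HBase guidance, not something measured on this cluster):

```shell
# Open-file limit for the user that runs the datanode/regionserver daemons;
# run this as that user. HBase docs commonly recommend 10240 or more.
ulimit -n

# Transceiver ceiling on each datanode. The property name really is spelled
# "xcievers"; old defaults were very low (256), 4096 is a common setting.
grep -A1 'dfs.datanode.max.xcievers' /etc/hadoop/conf/hdfs-site.xml
```

If either value is at its default, bumping it and restarting the datanodes is cheap to try before digging further.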
Cheers,
N.

On Fri, Jul 6, 2012 at 11:31 AM, Cyril Scetbon <cyril.scet...@free.fr> wrote:
> The file is now missing, but I have tried with another one and you can see the error:
>
> shell> hdfs dfs -ls "/hbase/.logs/hb-d11,60020,1341097456894-splitting/hb-d11%2C60020%2C1341097456894.1341421613446"
> Found 1 items
> -rw-r--r--   4 hbase supergroup   0 2012-07-04 17:06 /hbase/.logs/hb-d11,60020,1341097456894-splitting/hb-d11%2C60020%2C1341097456894.1341421613446
> shell> hdfs dfs -cat "/hbase/.logs/hb-d11,60020,1341097456894-splitting/hb-d11%2C60020%2C1341097456894.1341421613446"
> 12/07/06 09:27:51 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 3 times
> 12/07/06 09:27:55 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 2 times
> 12/07/06 09:27:59 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 1 times
> cat: Could not obtain the last block locations.
>
> I'm using Hadoop 2.0 from the Cloudera package (CDH4) with HBase 0.92.1
>
> Regards
> Cyril SCETBON
>
> On Jul 5, 2012, at 11:44 PM, Jean-Daniel Cryans wrote:
>
>> Interesting... Can you read the file? Try a "hadoop dfs -cat" on it
>> and see if it goes to the end of it.
>>
>> It could also be useful to see a bigger portion of the master log; for
>> all I know maybe it handles it somehow and there's a problem
>> elsewhere.
>>
>> Finally, which Hadoop version are you using?
>>
>> Thx,
>>
>> J-D
>>
>> On Thu, Jul 5, 2012 at 1:58 PM, Cyril Scetbon <cyril.scet...@free.fr> wrote:
>>> yes:
>>>
>>> /hbase/.logs/hb-d12,60020,1341429679981-splitting/hb-d12%2C60020%2C1341429679981.134143064971
>>>
>>> I did an fsck and here is the report:
>>>
>>>  Status: HEALTHY
>>>  Total size:    618827621255 B (Total open files size: 868 B)
>>>  Total dirs:    4801
>>>  Total files:   2825 (Files currently being written: 42)
>>>  Total blocks (validated):      11479 (avg. block size 53909541 B) (Total open file blocks (not validated): 41)
>>>  Minimally replicated blocks:   11479 (100.0 %)
>>>  Over-replicated blocks:        1 (0.008711561 %)
>>>  Under-replicated blocks:       0 (0.0 %)
>>>  Mis-replicated blocks:         0 (0.0 %)
>>>  Default replication factor:    4
>>>  Average block replication:     4.0000873
>>>  Corrupt blocks:                0
>>>  Missing replicas:              0 (0.0 %)
>>>  Number of data-nodes:          12
>>>  Number of racks:               1
>>> FSCK ended at Thu Jul 05 20:56:35 UTC 2012 in 795 milliseconds
>>>
>>>
>>> The filesystem under path '/hbase' is HEALTHY
>>>
>>> Cyril SCETBON
>>>
>>> On Jul 5, 2012, at 7:59 PM, Jean-Daniel Cryans wrote:
>>>
>>>> Does this file really exist in HDFS?
>>>>
>>>> hdfs://hb-zk1:54310/hbase/.logs/hb-d12,60020,1341429679981-splitting/hb-d12%2C60020%2C1341429679981.1341430649711
>>>>
>>>> If so, did you run fsck in HDFS?
>>>>
>>>> It would be weird if HDFS doesn't report anything bad but somehow the
>>>> clients (like HBase) can't read it.
>>>>
>>>> J-D
>>>>
>>>> On Thu, Jul 5, 2012 at 12:45 AM, Cyril Scetbon <cyril.scet...@free.fr> wrote:
>>>>> Hi,
>>>>>
>>>>> I can no longer start my cluster correctly and get messages like
>>>>> http://pastebin.com/T56wrJxE (taken on one region server)
>>>>>
>>>>> I suppose HBase is not designed for being stopped entirely, only for having some
>>>>> nodes go down??? HDFS is not complaining; it's only HBase that can't
>>>>> start correctly :(
>>>>>
>>>>> I suppose some data has not been flushed, and it's not really important
>>>>> for me.
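Note the "Files currently being written: 42" in that fsck report — the un-splittable WALs are likely among them, still held open by a dead writer's lease. A way to confirm (a sketch; the -openforwrite and -files flags are standard fsck options, the path is from this thread):

```shell
# List files under the HBase WAL directory that HDFS still considers open
# for write. A WAL stuck in OPENFORWRITE state is consistent with the
# "Could not obtain the last block locations" error when reading it.
hdfs fsck /hbase/.logs -openforwrite -files | grep OPENFORWRITE
```

An open file's last block has no finalized length, which is also why `hdfs dfs -ls` can show a size of 0 for a WAL that actually received edits.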
>>>>> Is there a way to fix these errors even if I lose data?
>>>>>
>>>>> thanks
>>>>>
>>>>> Cyril SCETBON