Re: LeaseExpiredException and too many xceiver
Config on most Y! clusters sets dfs.datanode.max.xcievers to a large value, something like 1k to 2k. You could try that.

Raghu.

Nathan Marz wrote:

Looks like the exception on the datanode got truncated a little bit. Here's the full exception:

2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(10.100.11.115:50010, storageID=DS-2129547091-10.100.11.115-50010-1225485937590, infoPort=50075, ipcPort=50020):DataXceiver: java.io.IOException: xceiverCount 257 exceeds the limit of concurrent xcievers 256
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
        at java.lang.Thread.run(Thread.java:619)

On Oct 31, 2008, at 2:49 PM, Nathan Marz wrote:

Hello,

We are seeing some really bad errors on our Hadoop cluster. After reformatting the whole cluster, the first job we run immediately fails with "Could not find block locations..." errors. In the namenode logs, we see a ton of errors like:

2008-10-31 14:20:44,799 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 7276, call addBlock(/tmp/dustintmp/shredded_dataunits/_t$
org.apache.hadoop.dfs.LeaseExpiredException: No lease on /tmp/dustintmp/shredded_dataunits/_temporary/_attempt_200810311418_0002_m_23_0$
        at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1166)
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1097)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)

In the datanode logs, we see a ton of errors like:

2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(10.100.11.115:50010, storageID=DS-2129547091-10.100.11.1$ of concurrent xcievers 256
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
        at java.lang.Thread.run(Thread.java:619)

Anyone have any ideas on what may be wrong?

Thanks,
Nathan Marz
Rapleaf
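[For reference, Raghu's suggestion amounts to raising this property in the cluster's site configuration and restarting the datanodes. A minimal sketch follows; the filename hadoop-site.xml matches the 0.18.x-era Hadoop this thread appears to be running (given the org.apache.hadoop.dfs package names), and the value 2048 is an illustrative choice within the 1k-2k range he mentions, not a recommendation from this thread.]

```xml
<!-- hadoop-site.xml: raise the datanode's concurrent transfer-thread limit -->
<!-- from its default of 256 (the limit hit in the exception above).        -->
<property>
  <!-- Note: the property name really is spelled "xcievers" in Hadoop. -->
  <name>dfs.datanode.max.xcievers</name>
  <!-- Illustrative value within the suggested 1k-2k range. -->
  <value>2048</value>
</property>
```

[Datanodes must be restarted for the new limit to take effect.]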
Re: LeaseExpiredException and too many xceiver
Looks like the exception on the datanode got truncated a little bit. Here's the full exception:

2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(10.100.11.115:50010, storageID=DS-2129547091-10.100.11.115-50010-1225485937590, infoPort=50075, ipcPort=50020):DataXceiver: java.io.IOException: xceiverCount 257 exceeds the limit of concurrent xcievers 256
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
        at java.lang.Thread.run(Thread.java:619)

On Oct 31, 2008, at 2:49 PM, Nathan Marz wrote:

Hello,

We are seeing some really bad errors on our Hadoop cluster. After reformatting the whole cluster, the first job we run immediately fails with "Could not find block locations..." errors. In the namenode logs, we see a ton of errors like:

2008-10-31 14:20:44,799 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 7276, call addBlock(/tmp/dustintmp/shredded_dataunits/_t$
org.apache.hadoop.dfs.LeaseExpiredException: No lease on /tmp/dustintmp/shredded_dataunits/_temporary/_attempt_200810311418_0002_m_23_0$
        at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1166)
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1097)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)

In the datanode logs, we see a ton of errors like:

2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(10.100.11.115:50010, storageID=DS-2129547091-10.100.11.1$ of concurrent xcievers 256
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
        at java.lang.Thread.run(Thread.java:619)

Anyone have any ideas on what may be wrong?

Thanks,
Nathan Marz
Rapleaf
LeaseExpiredException and too many xceiver
Hello,

We are seeing some really bad errors on our Hadoop cluster. After reformatting the whole cluster, the first job we run immediately fails with "Could not find block locations..." errors. In the namenode logs, we see a ton of errors like:

2008-10-31 14:20:44,799 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 7276, call addBlock(/tmp/dustintmp/shredded_dataunits/_t$
org.apache.hadoop.dfs.LeaseExpiredException: No lease on /tmp/dustintmp/shredded_dataunits/_temporary/_attempt_200810311418_0002_m_23_0$
        at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1166)
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1097)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)

In the datanode logs, we see a ton of errors like:

2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(10.100.11.115:50010, storageID=DS-2129547091-10.100.11.1$ of concurrent xcievers 256
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
        at java.lang.Thread.run(Thread.java:619)

Anyone have any ideas on what may be wrong?

Thanks,
Nathan Marz
Rapleaf