Re: LeaseExpiredException and too many xceiver

2008-10-31 Thread Raghu Angadi


The config on most Y! clusters sets dfs.datanode.max.xcievers to a large
value, something like 1k to 2k. You could try that.
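
For reference, a minimal sketch of what that setting looks like, assuming a
0.18-era conf/hadoop-site.xml (later releases carry the same property in
hdfs-site.xml); 2048 is just an illustrative value in that 1k-2k range, and
the datanodes need a restart to pick it up:

<property>
  <name>dfs.datanode.max.xcievers</name>
  <!-- default is 256; each concurrent block reader/writer on a datanode
       holds one xceiver thread, so busy jobs can exhaust the default -->
  <value>2048</value>
</property>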


Raghu.

Nathan Marz wrote:
Looks like the exception on the datanode got truncated a little bit. 
Here's the full exception:


[...]



Re: LeaseExpiredException and too many xceiver

2008-10-31 Thread Nathan Marz
Looks like the exception on the datanode got truncated a little bit.  
Here's the full exception:


2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode:
DatanodeRegistration(10.100.11.115:50010,
storageID=DS-2129547091-10.100.11.115-50010-1225485937590,
infoPort=50075, ipcPort=50020):DataXceiver: java.io.IOException:
xceiverCount 257 exceeds the limit of concurrent xcievers 256
    at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
    at java.lang.Thread.run(Thread.java:619)


On Oct 31, 2008, at 2:49 PM, Nathan Marz wrote:


[...]




LeaseExpiredException and too many xceiver

2008-10-31 Thread Nathan Marz

Hello,

We are seeing some really bad errors on our Hadoop cluster. After
reformatting the whole cluster, the first job we run immediately fails
with "Could not find block locations..." errors. In the namenode logs,
we see a ton of errors like:


2008-10-31 14:20:44,799 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 5 on 7276, call addBlock(/tmp/dustintmp/shredded_dataunits/_t$
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/tmp/dustintmp/shredded_dataunits/_temporary/_attempt_200810311418_0002_m_23_0$
    at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1166)
    at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1097)
    at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)



In the datanode logs, we see a ton of errors like:

2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode:  
DatanodeRegistration(10.100.11.115:50010,  
storageID=DS-2129547091-10.100.11.1$

of concurrent xcievers 256
    at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
    at java.lang.Thread.run(Thread.java:619)



Anyone have any ideas on what may be wrong?

Thanks,
Nathan Marz
Rapleaf