Hi Brian / Todd,

-bash-3.2# cat /proc/sys/kernel/random/entropy_avail
128

So I did

rngd -r /dev/urandom -o /dev/random -f -t 1 &

and it **seems** to be working. The web page now shows the nodes as present, and the logs indicate that the clients have started correctly, but I have not yet tried to run any jobs.
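In case it helps anyone else hitting this, here is roughly how I'm double-checking it and keeping it across reboots. The rc.local bit is just a guess at where it belongs on our netboot image (and I drop -f there so rngd daemonizes):

# quick sanity check that the pool is actually refilling now
cat /proc/sys/kernel/random/entropy_avail

# hypothetical: make the workaround survive a reboot of the netbooted image
echo 'rngd -r /dev/urandom -o /dev/random -t 1' >> /etc/rc.local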

Thanks for all your help!!

-Nick

Brian Bockelman wrote:
Hey Nick,

Try this:
cat /proc/sys/kernel/random/entropy_avail

Is it a small number (<300)?

Basically, one way Linux generates entropy is via input from the keyboard. So, as soon as you log into the NFS booted server, you've given it enough entropy for HDFS to start up.

Here's a relevant-looking link:

http://rackerhacker.com/2007/07/01/check-available-entropy-in-linux/
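If you want to watch it live, something like the following on a stuck node while the datanode starts should show the pool sitting near zero (assuming watch is on your image):

watch -n 1 cat /proc/sys/kernel/random/entropy_avail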

Brian

On Sep 29, 2009, at 1:27 PM, Nick Rathke wrote:

Great. I'll look at this fix. Here is what I got based on Brian's info.

lsof -p gave me:

java 12739 root 50r CHR 1,8 3335 /dev/random
java 12739 root 51r CHR 1,9 3325 /dev/urandom

.
.
.
.

java 12739 root 66r CHR 1,8 3335 /dev/random

Both do exist in /dev, and securerandom.source was already set to

securerandom.source=file:/dev/urandom

I have also checked that the permissions on said file are the same between the NFS-booted nodes and the local-OS nodes.
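For reference, this is roughly how I compared the two kinds of nodes; $JAVA_HOME here is just whatever JRE the datanode actually launches with, so treat the path as an assumption:

grep '^securerandom.source' $JAVA_HOME/jre/lib/security/java.security
ls -l /dev/random /dev/urandom    # run on an NFS-booted node and a local-OS node and compare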
-Nick



Todd Lipcon wrote:
Yep, this is a common problem. The fix that Brian outlined helps a lot, but if you are *really* strapped for random bits, you'll still block. This is because even if you've set the random source, it still uses the real /dev/random to grab a seed for the PRNG, at least on my system.

On systems where I know I don't care about true randomness, I also use this
trick:

http://www.chrissearle.org/blog/technical/increase_entropy_26_kernel_linux_box

It's very handy for boxes running Hudson that start and stop multi-node pseudo-distributed Hadoop clusters regularly.
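Another knob that gets suggested a lot (I haven't verified it on every JVM) is the java.security.egd system property; the odd-looking /dev/./urandom spelling is the commonly reported way to stop the Sun JVM from special-casing /dev/urandom and falling back to the blocking seed generator. A sketch, assuming a stock 0.20-style conf/hadoop-env.sh:

# force SecureRandom seeding from urandom for the Hadoop daemons only
export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.egd=file:/dev/./urandom"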

-Todd

On Tue, Sep 29, 2009 at 10:16 AM, Brian Bockelman <bbock...@cse.unl.edu> wrote:


Hey Nick,

Strange. It appears that the Jetty server has stalled while trying to read from /dev/random. Is it possible that some part of /dev isn't initialized before the datanode is launched?

Can you confirm this using "lsof -p <process ID>" ?
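Something along these lines should do it; the grep is only there to cut the noise, and <process ID> is the DataNode pid from jps or ps:

lsof -p <process ID> | grep -E '/dev/u?random'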

I've copied and pasted a solution I found in a forum via Google below.

Brian

Edit $JAVA_HOME/jre/lib/security/java.security and change the property:

securerandom.source=file:/dev/random

to:

securerandom.source=file:/dev/urandom
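If you want to roll that out in one shot, here is a rough sketch, assuming the stock java.security location; back the file up first, and note this changes the default for every Java app using that JRE:

cp $JAVA_HOME/jre/lib/security/java.security{,.bak}
sed -i 's|^securerandom.source=file:/dev/random|securerandom.source=file:/dev/urandom|' $JAVA_HOME/jre/lib/security/java.security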


On Sep 29, 2009, at 11:26 AM, Nick Rathke wrote:

Thanks. Here it is in all of its glory...

-Nick


2009-09-29 09:15:53
Full thread dump Java HotSpot(TM) 64-Bit Server VM (14.2-b01 mixed mode):

"263851...@qtp0-1" prio=10 tid=0x00002aaaf846a000 nid=0x226b in
Object.wait() [0x0000000041d24000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaade3587f8> (a
org.mortbay.thread.QueuedThreadPool$PoolThread)
at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:565)
- locked <0x00002aaade3587f8> (a
org.mortbay.thread.QueuedThreadPool$PoolThread)

"1837007...@qtp0-0" prio=10 tid=0x00002aaaf84d4000 nid=0x226a in
Object.wait() [0x0000000041b22000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaade3592b8> (a
org.mortbay.thread.QueuedThreadPool$PoolThread)
at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:565)
- locked <0x00002aaade3592b8> (a
org.mortbay.thread.QueuedThreadPool$PoolThread)

"refreshUsed-/tmp/hadoop-root/dfs/data" daemon prio=10
tid=0x00002aaaf8456000 nid=0x2269 waiting on condition [0x0000000041c23000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.hadoop.fs.DU$DURefreshThread.run(DU.java:80)
at java.lang.Thread.run(Thread.java:619)

"RMI TCP Accept-0" daemon prio=10 tid=0x00002aaaf834d800 nid=0x225a
runnable [0x000000004171e000]
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
- locked <0x00002aaade358040> (a java.net.SocksSocketImpl)
at java.net.ServerSocket.implAccept(ServerSocket.java:453)
at java.net.ServerSocket.accept(ServerSocket.java:421)
at
sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:34)
at
sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)
at
sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
at java.lang.Thread.run(Thread.java:619)

"Low Memory Detector" daemon prio=10 tid=0x00000000535f5000 nid=0x2259
runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x00000000535f1800 nid=0x2258 waiting
on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x00000000535ef000 nid=0x2257 waiting
on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00000000535ec800 nid=0x2256
waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00000000535cf800 nid=0x2255 in
Object.wait() [0x0000000041219000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaade3472f0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x00002aaade3472f0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x00000000535c8000 nid=0x2254 in
Object.wait() [0x0000000041118000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaade3a2018> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
- locked <0x00002aaade3a2018> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x0000000053554000 nid=0x2245 runnable
[0x0000000040208000]
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:199)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- locked <0x00002aaade1e5870> (a java.io.BufferedInputStream)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- locked <0x00002aaade1e29f8> (a java.io.BufferedInputStream)
at
sun.security.provider.SeedGenerator$URLSeedGenerator.getSeedByte(SeedGenerator.java:453)
at
sun.security.provider.SeedGenerator.getSeedBytes(SeedGenerator.java:123)
at
sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:118)
at
sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
at
sun.security.provider.SecureRandom.engineNextBytes(SecureRandom.java:171)
- locked <0x00002aaade1e2500> (a sun.security.provider.SecureRandom)
at java.security.SecureRandom.nextBytes(SecureRandom.java:433)
- locked <0x00002aaade1e2830> (a java.security.SecureRandom)
at java.security.SecureRandom.next(SecureRandom.java:455)
at java.util.Random.nextLong(Random.java:284)
at
org.mortbay.jetty.servlet.HashSessionIdManager.doStart(HashSessionIdManager.java:139)
at
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
- locked <0x00002aaade1e21c0> (a java.lang.Object)
at
org.mortbay.jetty.servlet.AbstractSessionManager.doStart(AbstractSessionManager.java:168)
at
org.mortbay.jetty.servlet.HashSessionManager.doStart(HashSessionManager.java:67)
at
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
- locked <0x00002aaade334c00> (a java.lang.Object)
at
org.mortbay.jetty.servlet.SessionHandler.doStart(SessionHandler.java:115)
at
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
- locked <0x00002aaade334b18> (a java.lang.Object)
at
org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
at
org.mortbay.jetty.handler.ContextHandler.startContext(ContextHandler.java:537)
at org.mortbay.jetty.servlet.Context.startContext(Context.java:136)
at
org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1234)
at
org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:517) at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:460)
at
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
- locked <0x00002aaade334ab0> (a java.lang.Object)
at
org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
at
org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
at
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
- locked <0x00002aaade332c30> (a java.lang.Object)
at
org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
at org.mortbay.jetty.Server.doStart(Server.java:222)
at
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
- locked <0x00002aaab44191a0> (a java.lang.Object)
at org.apache.hadoop.http.HttpServer.start(HttpServer.java:460)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:375)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)

"VM Thread" prio=10 tid=0x00000000535c1800 nid=0x2253 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x000000005355e000 nid=0x2246
runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000053560000 nid=0x2247
runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000053561800 nid=0x2248
runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000053563800 nid=0x2249
runnable

"GC task thread#4 (ParallelGC)" prio=10 tid=0x0000000053565800 nid=0x224a
runnable

"GC task thread#5 (ParallelGC)" prio=10 tid=0x0000000053567000 nid=0x224b
runnable

"GC task thread#6 (ParallelGC)" prio=10 tid=0x0000000053569000 nid=0x224c
runnable

"GC task thread#7 (ParallelGC)" prio=10 tid=0x000000005356b000 nid=0x224d
runnable

"GC task thread#8 (ParallelGC)" prio=10 tid=0x000000005356c800 nid=0x224e
runnable

"GC task thread#9 (ParallelGC)" prio=10 tid=0x000000005356e800 nid=0x224f
runnable

"GC task thread#10 (ParallelGC)" prio=10 tid=0x0000000053570800 nid=0x2250
runnable

"GC task thread#11 (ParallelGC)" prio=10 tid=0x0000000053572000 nid=0x2251
runnable

"GC task thread#12 (ParallelGC)" prio=10 tid=0x0000000053574000 nid=0x2252
runnable

"VM Periodic Task Thread" prio=10 tid=0x00002aaaf835f800 nid=0x225b
waiting on condition

JNI global references: 715

Heap
 PSYoungGen      total 5312K, used 5185K [0x00002aaaddde0000, 0x00002aaade5a0000, 0x00002aaaf2b30000)
  eden space 4416K, 97% used [0x00002aaaddde0000,0x00002aaade210688,0x00002aaade230000)
  from space 896K, 100% used [0x00002aaade320000,0x00002aaade400000,0x00002aaade400000)
  to   space 960K, 0% used [0x00002aaade230000,0x00002aaade230000,0x00002aaade320000)
 PSOldGen        total 5312K, used 1172K [0x00002aaab4330000, 0x00002aaab4860000, 0x00002aaaddde0000)
  object space 5312K, 22% used [0x00002aaab4330000,0x00002aaab44550b8,0x00002aaab4860000)
 PSPermGen       total 21248K, used 13354K [0x00002aaaaef30000, 0x00002aaab03f0000, 0x00002aaab4330000)
  object space 21248K, 62% used [0x00002aaaaef30000,0x00002aaaafc3a818,0x00002aaab03f0000)




Brian Bockelman wrote:


Hey Nick,

I believe the mailing list stripped out your attachment.

Brian

On Sep 29, 2009, at 10:22 AM, Nick Rathke wrote:

Hi,

Here is the dump. I looked it over and unfortunately it is pretty meaningless to me at this point. Any help deciphering it would be greatly appreciated.

I have also now disabled the IB interface on my two test systems; unfortunately that had no impact.

-Nick


Todd Lipcon wrote:


Hi Nick,

Figure out the pid of the DataNode process using either "jps" or straight "ps auxw | grep DataNode", and then kill -QUIT <pid>. That should cause the node to dump its stack to its stdout. That'll either end up in the .out file in your logs directory, or on your console, depending on how you started the daemon.
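For instance, something like this usually works for me; the log file name pattern is just what a stock 0.20-style install produces, so adjust for your layout:

# find the DataNode pid and ask the JVM for a thread dump
DN_PID=$(jps | awk '/DataNode/ {print $1}')
kill -QUIT "$DN_PID"
# the dump typically lands in the daemon's .out file (name will vary)
tail -n 200 $HADOOP_HOME/logs/hadoop-*-datanode-*.out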

-Todd

On Mon, Sep 28, 2009 at 9:11 PM, Nick Rathke <n...@sci.utah.edu>
wrote:


Hi Todd,

Unfortunately it never returns on the stuck node, though it gives good info on a running node.

-bash-3.2# curl http://127.0.0.1:50075/stacks

If I do a stop-all on the master I get

curl: (52) Empty reply from server

on the stuck node.

If I do this in a browser I can see that it is **trying** to connect; if I kill the java processes I get "Server not found", but as long as the java processes are running I just get a blank page.

Should I try a TCP dump and see if I can see packets flowing? Would that be of any help?
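Here is roughly what I had in mind if it's worth running; eth0 and the 50010 data-transfer default are assumptions on my part (50075 is the web UI port from the curl above):

tcpdump -i eth0 -nn 'port 50075 or port 50010'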

-Nick



Todd Lipcon wrote:


Hi Nick,

Can you curl http://127.0.0.1:50075/stacks on one of the stuck nodes and paste the result?

Sometimes that can give an indication as to where things are getting stuck.

-Todd

On Mon, Sep 28, 2009 at 7:21 PM, Nick Rathke <n...@sci.utah.edu>
wrote:




FYI I get the same hanging behavior if I follow the Hadoop quick start for a single-node baseline configuration (no modified conf files).

-Nick



Brian Bockelman wrote:




Hey Nick,

Do you have any error messages appearing in the log files?

Brian

On Sep 28, 2009, at 2:06 PM, Nick Rathke wrote:

Ted Dunning wrote:



I think that the last time you asked this question, the suggestion was to look at DNS and make sure that everything is exactly correct in the net-boot configuration. Hadoop is very sensitive to network routing and naming details.

So,

a) in your net-boot, how are IP addresses assigned?

We assign static IPs based on a node's MAC address via DHCP, so that when a node is netbooted or booted with a local OS it gets the same IP and hostname.

b) how are DNS names propagated?

Cluster DNS names are mixed in with our facility DNS servers. All nodes have proper forward and reverse DNS lookups.

c) how have you guaranteed that (a) and (b) are exactly consistent?

By host MAC address. I have also manually confirmed this.

d) how have you guaranteed that every node can talk to every other node both by name and IP address?

Local cluster DNS / DHCP, plus all nodes have all other nodes' host names and IPs in /etc/hosts. I have compared all the config files for DNS, DHCP, and /etc/hosts to make sure all the information is the same.

e) have you assured yourself that any reverse mapping that exists is correct?

Yes, and tested.



One more bit of information: the system boots on a 1Gb network; all other network traffic, i.e. MPI and NFS to data volumes, is via IB.

The IB network also has proper forward/reverse DNS entries. IB IP addresses are set up at boot time via a script that takes the host IP and a fixed offset to calculate the address for the IB interface. I have also confirmed that the IB IP addresses match our DNS.

-Nick


On Mon, Sep 28, 2009 at 9:45 AM, Nick Rathke <n...@sci.utah.edu>
wrote:



I am hoping that someone can help with this issue. I have a 64-node cluster that we would like to run Hadoop on; most of the nodes are netbooted via NFS.

Hadoop runs fine on nodes IF the node uses a local OS install, but it doesn't work when nodes are netbooted. Under netboot I can see that the slaves have the correct Java processes running, but the Hadoop web pages never show the nodes as available. The Hadoop logs on the nodes also show that everything is running and started up correctly.

On the few nodes that have a local OS installed everything works just fine and I can run the test jobs without issue (so far).

I am using the identical Hadoop install and configuration between netbooted nodes and non-netbooted nodes.

Has anyone encountered this type of issue?












--
Nick Rathke
Scientific Computing and Imaging Institute
Sr. Systems Administrator
n...@sci.utah.edu
www.sci.utah.edu
801-587-9933
801-557-3832

"I came I saw I made it possible" Royal Bliss - Here They Come
