Gopi, The namenode is essentially running bash -c "id -Gn hduser" and waiting for the response. You could try executing from the shell it to see if it does take a long time in Azure, or if the output is too complex to parse.
Regards, Marcos In 04-04-2013 09:33, Gopi Krishna M wrote: in case anybody has some idea on what I should look for in debugging this.. below is a strace of the simple bin/hadoop dfs -ls / command run on the machine running namenode % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 49.56 0.000853 45 19 14 execve 36.20 0.000623 1 925 rt_sigprocmask 5.75 0.000099 2 62 brk 5.29 0.000091 9 10 getgid 3.20 0.000055 1 81 rt_sigaction 0.00 0.000000 0 71 read 0.00 0.000000 0 3 write 0.00 0.000000 0 49 11 open 0.00 0.000000 0 87 close 0.00 0.000000 0 71 31 stat 0.00 0.000000 0 36 fstat 0.00 0.000000 0 12 lseek 0.00 0.000000 0 72 mmap 0.00 0.000000 0 36 mprotect 0.00 0.000000 0 9 munmap 0.00 0.000000 0 19 rt_sigreturn 0.00 0.000000 0 2 2 ioctl 0.00 0.000000 0 40 31 access 0.00 0.000000 0 19 pipe 0.00 0.000000 0 8 dup2 0.00 0.000000 0 2 getpid 0.00 0.000000 0 20 clone 0.00 0.000000 0 38 19 wait4 0.00 0.000000 0 2 uname 0.00 0.000000 0 24 2 fcntl 0.00 0.000000 0 16 getdents 0.00 0.000000 0 2 readlink 0.00 0.000000 0 2 gettimeofday 0.00 0.000000 0 5 getrlimit 0.00 0.000000 0 10 getuid 0.00 0.000000 0 10 geteuid 0.00 0.000000 0 10 getegid 0.00 0.000000 0 2 getppid 0.00 0.000000 0 2 getpgrp 0.00 0.000000 0 5 arch_prctl 0.00 0.000000 0 2 1 futex 0.00 0.000000 0 1 set_tid_address 0.00 0.000000 0 10 2 openat 0.00 0.000000 0 1 set_robust_list ------ ----------- ----------- --------- --------- ---------------- 100.00 0.001721 1795 113 total On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <mgo...@gmail.com<mailto:mgo...@gmail.com>> wrote: Hi.. I have installed hadoop 0.23.5 and is working fine on two of my installations. In a new installation on Windows Azure VMs, I am seeing an inordinate delay (of 1 minute) when doing a simple dfs ls command. This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre command : bin/hadoop dfs -ls / With log set to DEBUG, following is the output on the console. The logs in bold show the sleep/timeout/delay of 1 minute. Any pointers on what to check for / what is the reason for this delay? 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache. hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of successful kerberos logins and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops) 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache. hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of failed kerberos logins and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops) 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics 13/03/31 05:49:55 DEBUG security.Groups: Creating new Groups object 13/03/31 05:49:55 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUni xGroupsMapping; cacheTimeout=300000 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hduser 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI loginUser:hduser (auth:SIMPLE) 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms. 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for protocol ClientProtocol 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000 Thanks Gopi