[ https://issues.apache.org/jira/browse/HDFS-6254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976299#comment-13976299 ]

huang ken commented on HDFS-6254:
---------------------------------

test.cpp:
#include <stdio.h>
#include <stdlib.h>
#include "hdfs.h"

int main()
{
        hdfsFS fs = NULL;
        fs = hdfsConnect("172.16.19.222", 8020); // remote host is unavailable
        if (!fs)
        {
                printf("HDFS connect error!\n");
                return -1;
        }
        else
        {
                printf("HDFS connect OK.\n");
        }
        if (hdfsCreateDirectory(fs, "/tmp/root/") != 0)
        {
                printf("Create dir error!\n");
                return -1;
        }
        else
        {
                printf("Create dir OK.\n");
                return 0;
        }
}
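
(Two small asides about the test program itself: it never releases the handle with hdfsDisconnect(), and libhdfs reports the underlying cause of a failure through errno, so the error branches can say more than just "error". A slightly more defensive variant, just a sketch using the same endpoint and path as above:

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include "hdfs.h"

int main()
{
        errno = 0;
        hdfsFS fs = hdfsConnect("172.16.19.222", 8020);
        if (!fs)
        {
                // libhdfs is documented to set errno on failure
                printf("HDFS connect error: %s\n", strerror(errno));
                return -1;
        }
        printf("HDFS connect OK.\n");
        if (hdfsCreateDirectory(fs, "/tmp/root/") != 0)
        {
                printf("Create dir error: %s\n", strerror(errno));
                hdfsDisconnect(fs); // release the handle even on failure
                return -1;
        }
        printf("Create dir OK.\n");
        hdfsDisconnect(fs); // the original test leaks the connection handle
        return 0;
}
)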

What puzzles me are the following two scenarios:

# Scenario 1: compile test.cpp & run
[root@datanode test]# ./a.out 
2014-04-22 09:48:34,106 WARN  util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HDFS connect OK.
hdfsCreateDirectory(/tmp/root/): FileSystem#mkdirs error:
java.net.ConnectException: Call From datanode2/172.16.18.238 to 172.16.19.222:8020 failed on connection exception: java.net.ConnectException: Connection timed out; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
        at org.apache.hadoop.ipc.Client.call(Client.java:1351)
        at org.apache.hadoop.ipc.Client.call(Client.java:1300)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at com.sun.proxy.$Proxy9.mkdirs(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy9.mkdirs(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:467)
        at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2394)
        at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2365)
        at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:817)
        at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:813)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:813)
        at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:806)
        at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1933)
Caused by: java.net.ConnectException: Connection timed out
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:547)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
        at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
        at org.apache.hadoop.ipc.Client.call(Client.java:1318)
        ... 19 more
Create dir error!
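
(Scenario 1 appears to be explained by the fact that hdfsConnect() in Hadoop 2.x does not actually open a socket to the NameNode: it only constructs the client-side FileSystem object, and the first real RPC, here the mkdirs call, is what attempts the TCP connection, hence "connect OK" followed by the ConnectException above. If an early reachability check is wanted, one option, just a sketch replacing the connect check in main(), is to force a real RPC immediately after connecting:

        /* Sketch: force an RPC right after connecting, so an unreachable
         * NameNode is reported at connect time. hdfsExists() returns 0 only
         * when the path exists and the NameNode answered; "/" always exists
         * on a running HDFS. */
        hdfsFS fs = hdfsConnect("172.16.19.222", 8020);
        if (!fs || hdfsExists(fs, "/") != 0)
        {
                printf("HDFS connect error!\n");
                return -1;
        }
)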

# Scenario 2: compile test.cpp & run under gdb
[root@datanode2 test3]# gdb a.out 
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/test3/a.out...done.
(gdb) l
1       //g++ -g test.cpp -L/home/hkx/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/lib/native -I/home/hkx/hadoop-2.2.0-src/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs -I/home/yjx/JDK/jdk1.6.0_24/include -I/home/yjx/JDK/jdk1.6.0_24/include/linux -lhdfs
2       #include <stdio.h>
3       #include <stdlib.h>
4       #include "hdfs.h"
5
6       int main()
7       {
8               hdfsFS fs = NULL;
9               fs = hdfsConnect("172.16.19.222", 8020);
10              if (!fs)
(gdb) b test.cpp:8
Breakpoint 1 at 0x40071c: file test.cpp, line 8.
(gdb) r
Starting program: /root/test3/a.out 
[Thread debugging using libthread_db enabled]

Breakpoint 1, main () at test.cpp:8
8               hdfsFS fs = NULL;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6_3.6.x86_64 jdk-1.7.0_55-fcs.x86_64 libgcc-4.4.6-4.el6.x86_64 libstdc++-4.4.6-4.el6.x86_64
(gdb) n
9               fs = hdfsConnect("172.16.19.222", 8020);
(gdb) p fs
$1 = (hdfsFS) 0x0
(gdb) n
[New Thread 0x7ffff23aa700 (LWP 3218)]
[New Thread 0x7ffff22a9700 (LWP 3219)]
[New Thread 0x7ffff21a8700 (LWP 3220)]
[New Thread 0x7ffff20a7700 (LWP 3221)]
[New Thread 0x7ffff0740700 (LWP 3222)]
[New Thread 0x7ffff063f700 (LWP 3223)]
[New Thread 0x7ffff053e700 (LWP 3224)]
[New Thread 0x7ffff043d700 (LWP 3225)]
[New Thread 0x7ffff033c700 (LWP 3226)]
[New Thread 0x7ffff023b700 (LWP 3227)]
[New Thread 0x7ffff013a700 (LWP 3228)]
[New Thread 0x7fffebfff700 (LWP 3229)]

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff24155ca in ?? ()
(gdb) bt
#0  0x00007ffff24155ca in ?? ()
#1  0x0000000000615000 in ?? ()
#2  0x00000000f506d712 in ?? ()
#3  0x0000000000615000 in ?? ()
#4  0x00000000f5084c68 in ?? ()
#5  0x000000000000001b in ?? ()
#6  0x0000000000000000 in ?? ()
(gdb) quit
A debugging session is active.

        Inferior 1 [process 3214] will be killed.

Quit anyway? (y or n) y

In scenario 1, hdfsConnect() returns OK and the failure only surfaces later, in 
hdfsCreateDirectory(). In scenario 2, the program receives SIGSEGV during the 
connect step whenever it runs under gdb (no matter whether the remote host is 
available or not), so it seems the program cannot be debugged with gdb.
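
(One likely explanation for scenario 2, an assumption on my part rather than something verified in this thread: libhdfs starts a JVM inside the process via JNI, and HotSpot deliberately triggers and handles SIGSEGV itself, e.g. for implicit null-pointer checks. Outside gdb the JVM's own signal handler absorbs these; under gdb, the debugger stops on the first SIGSEGV before the JVM ever sees it. It is worth telling gdb to pass the signals through before running:

(gdb) handle SIGSEGV nostop noprint pass
(gdb) handle SIGBUS nostop noprint pass
(gdb) run

If the crash still reproduces with the signals passed through, that would point at a genuine libhdfs fault rather than JVM signal housekeeping.)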

> hdfsConnect segment fault where namenode not connected
> ------------------------------------------------------
>
>                 Key: HDFS-6254
>                 URL: https://issues.apache.org/jira/browse/HDFS-6254
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: libhdfs
>    Affects Versions: 2.2.0
>         Environment: Linux Centos 64bit
>            Reporter: huang ken
>
> When the namenode is not started, the libhdfs client causes a segmentation 
> fault while connecting.


