[ https://issues.apache.org/jira/browse/HADOOP-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644100#action_12644100 ]
Brian Bockelman commented on HADOOP-4541:
-----------------------------------------
Dummy me, I forgot the greatest debugging tool of them all: gdb.
Here's the stack trace of one of these suckers in action:
(gdb) where
#0 0x0000002a975a0a47 in CollectedHeap::allocate_from_tlab_slow () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#1 0x0000002a97529551 in CollectedHeap::common_mem_allocate_noinit () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#2 0x0000002a9793374b in typeArrayKlass::allocate () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#3 0x0000002a9769f5e8 in jni_NewByteArray () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#4 0x0000002a971ef363 in hdfsWrite (fs=Variable "fs" is not available.) at hdfs.c:590
#5 0x0000002a970ea97a in globus_l_gfs_hdfs_dump_buffers (hdfs_handle=0x6de080) at globus_gridftp_server_hdfs.c:507
That dump_buffers call (internal to my application) worries me; give me a few
hours to step through things in GDB. If errors aren't propagated correctly in
that particular function, it could lead to an application-side infinite loop.
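
To make the concern concrete, here is a minimal sketch of a flush loop that cannot spin when the underlying write fails. It is not the code from globus_gridftp_server_hdfs.c; the names write_fn, write_all, failing_write and short_write are hypothetical and exist only for illustration. The callback follows what I understand to be the hdfsWrite convention (bytes written on success, -1 on error), which is an assumption worth checking against hdfs.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical write callback: returns the number of bytes written,
 * or -1 on error (assumed to mirror the hdfsWrite convention). */
typedef long (*write_fn)(void *ctx, const char *buf, long len);

/* Write all `len` bytes of `buf`, retrying after short writes.
 * Returns `len` on success and -1 on the first failure. A return of 0
 * from the callback is also treated as a failure, so the loop always
 * either makes progress or exits. */
long write_all(write_fn do_write, void *ctx, const char *buf, long len)
{
    long done = 0;

    assert(do_write != NULL && buf != NULL && len >= 0);

    while (done < len) {
        long n = do_write(ctx, buf + done, len - done);
        if (n <= 0) {
            return -1; /* propagate the error instead of retrying forever */
        }
        done += n;
    }
    return done;
}

/* Stub that always fails; `ctx` points to an int call counter. */
long failing_write(void *ctx, const char *buf, long len)
{
    (void)buf;
    (void)len;
    ++*(int *)ctx;
    return -1;
}

/* Stub that accepts at most 3 bytes per call; `ctx` is a call counter. */
long short_write(void *ctx, const char *buf, long len)
{
    (void)buf;
    ++*(int *)ctx;
    return len < 3 ? len : 3;
}
```

With failing_write the loop gives up after a single call, while short_write still completes over several calls. A loop that ignored the -1 and simply retried would be one way to get the repeated stack traces described in this issue, though whether dump_buffers actually does that is still unverified.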
> Infinite loop in error handler for libhdfs
> ------------------------------------------
>
> Key: HADOOP-4541
> URL: https://issues.apache.org/jira/browse/HADOOP-4541
> Project: Hadoop Core
> Issue Type: Bug
> Components: libhdfs
> Affects Versions: 0.18.1
> Reporter: Brian Bockelman
> Attachments: libhdfs_failure.txt
>
>
> If there is a problem writing out to HDFS, libhdfs gets stuck in an infinite
> loop.
> Unfortunately, my program reroutes stderr/stdout to /dev/null, so I am
> attaching the strace output. You can see the Java stack traces, which are
> written over and over.