[ 
https://issues.apache.org/jira/browse/HDFS-7010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163095#comment-14163095
 ] 

Zhanwei Wang commented on HDFS-7010:
------------------------------------

Hi [~wheat9]

Your concern is reasonable.

bq. This approach is highly platform-specific and compiler-specific
Yes. Currently I use backtrace to dump the call stack and have implemented two 
ways of demangling C++ symbols: one for ELF, and another that uses libcxxabi 
for other platforms or compilers. I actually work on Mac OS with Clang, and it 
works fine. 
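
As a rough illustration of the demangling step only, here is a minimal sketch 
built on {{abi::__cxa_demangle}} from {{<cxxabi.h>}}, which both libstdc++ and 
libc++abi provide. The helper name {{Demangle}} is hypothetical and this is not 
the code in the patch.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper (not from the patch): turn an Itanium-ABI mangled name
// into a readable one, falling back to the input when demangling fails.
std::string Demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so the caller must release it with free.
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);
    return result;
}
```

For example, {{Demangle("_Z3foov")}} yields {{foo()}}.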

The interface for printing the call stack is very simple: {{extern const 
std::string PrintStack(int skip, int maxDepth)}}. Its implementation can vary 
across platforms. For example, {{StackWalk64}} could be used on Windows to get 
the call stack. On platforms where printing the call stack is hard, it can 
simply be disabled by using the code in my previous comment.
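
To make the shape of that interface concrete, here is a sketch under stated 
assumptions: it uses {{backtrace}} and {{backtrace_symbols}} from 
{{<execinfo.h>}} where glibc or Mac OS provides them and returns an empty 
string elsewhere. The {{SKETCH_HAVE_BACKTRACE}} macro and the empty-string 
fallback are illustrative choices of mine, not what the patch does, and 
demangling is left out.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Assumption: execinfo.h is available on glibc and Mac OS; elsewhere the
// function degrades to returning an empty string.
#if defined(__GLIBC__) || defined(__APPLE__)
#define SKETCH_HAVE_BACKTRACE 1
#include <cstdlib>
#include <execinfo.h>
#endif

// Skips the innermost `skip` frames and reports at most `maxDepth` frames,
// one symbol line per frame.
const std::string PrintStack(int skip, int maxDepth) {
    assert(skip >= 0 && maxDepth >= 0);
    if (maxDepth == 0) {
        return std::string();
    }
#ifdef SKETCH_HAVE_BACKTRACE
    std::vector<void*> frames(static_cast<size_t>(skip) + maxDepth);
    int depth = backtrace(frames.data(), static_cast<int>(frames.size()));
    // backtrace_symbols returns one malloc'ed array; free only the array.
    char** symbols = backtrace_symbols(frames.data(), depth);
    if (symbols == nullptr) {
        return std::string();
    }
    std::ostringstream out;
    for (int i = skip; i < depth; ++i) {
        out << symbols[i] << '\n';
    }
    std::free(symbols);
    return out.str();
#else
    (void)skip;
    return std::string();
#endif
}
```

A caller would typically pass {{skip = 1}} so that {{PrintStack}} itself does 
not appear in the output.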

bq. how does it work across different version of libstdc++
If the different versions of libstdc++ are ABI compatible, it should be fine. 
Otherwise, other problems, such as core dumps when throwing or catching 
exceptions, will show up before printing the call stack does.  

bq. How does backtracing work? Note that -fomit-frame-pointer is turned on when 
the binary is compiled with -O2. The backtrace might not work at all.
Yes. That is why -fno-omit-frame-pointer is used. 

bq. How many symbols can get in the backtrace? Note that in the end only public 
symbols (i.e., FileSystem / InputStream / OutputStream) are exported in the 
binary – there are not many symbols in the stacktrace

From our current experience of using libhdfs3 in production, a function can be 
printed as long as it is not inlined, both on Mac OS with Clang and on RHEL 5/6 
with GCC 4.4.2.

bq. What about following the common approach here, that is, making the DWARF 
symbols available to the developer and let the demangling happens at the 
developer side. I agree it's not as ideal as Java which have precise exception, 
but it at least gives you some debugability when shipping a well-optimized 
library to the end user.

Printing the call stack is useful not only for debugging but also for logging. 
If a customer reports an issue together with the call stack, it saves a lot of 
time in finding the root cause. It benefits both developers and users.

I know that printing the call stack is not a good way to do this and that it 
has poor portability. Based on our testing and usage so far, it has not caused 
trouble yet and provides much benefit. I think Colin's suggestion is better, 
but it needs time and effort to implement. Once someone has improved the 
exceptions in libhdfs3 as Colin said:

bq. add more identifying information to each exception to know where it came 
from

Printing the call stack this way should then be retired, and I'd like to remove 
it at that time.
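
One possible reading of that suggestion, as a sketch only: an exception type 
that records the file, line and function of the throw site. The names 
{{LocatedError}} and {{THROW_LOCATED}} are hypothetical and do not exist in 
libhdfs3.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical exception type: carries the throw site so that a log line
// identifies the origin without a stack trace.
class LocatedError : public std::runtime_error {
public:
    LocatedError(const std::string& msg, const char* file, int line,
                 const char* func)
        : std::runtime_error(Format(msg, file, line, func)),
          file_(file), line_(line) {}

    const char* file() const { return file_; }
    int line() const { return line_; }

private:
    // Builds "message (at file:line in function)".
    static std::string Format(const std::string& msg, const char* file,
                              int line, const char* func) {
        assert(file != nullptr && func != nullptr);
        std::ostringstream out;
        out << msg << " (at " << file << ':' << line << " in " << func << ')';
        return out.str();
    }

    const char* file_;
    int line_;
};

// Captures the location at the point of use; works on any platform and
// optimization level because it relies only on standard predefined names.
#define THROW_LOCATED(msg) \
    throw LocatedError((msg), __FILE__, __LINE__, __func__)
```

Unlike a backtrace, this only records the innermost throw site, so it trades 
depth of information for portability.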



 

> boot up libhdfs3 project
> ------------------------
>
>                 Key: HDFS-7010
>                 URL: https://issues.apache.org/jira/browse/HDFS-7010
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Zhanwei Wang
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-7010-pnative.003.patch, 
> HDFS-7010-pnative.004.patch, HDFS-7010-pnative.004.patch, HDFS-7010.patch
>
>
> boot up libhdfs3 project with CMake, Readme and license file.
> Integrate google mock and google test



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
