On 01/06/2016 10:43 AM, Shuai Wang wrote:
Dear list,
I am writing to ask how to use DynInst to recognize *function entry
points (memory addresses) in stripped binaries*.
I successfully installed the 32-bit DynInst 9.10, and I use a DynInst
script that iterates over all the functions with the following code to
*dump all the function entry point addresses from stripped binaries*.
.......
    vector<BPatch_module *> * modules = appImage->getModules();
......
    vector<BPatch_function *> * funcs = (*module_iter)->getProcedures();
    vector<BPatch_function *>::iterator func_iter;
    for(func_iter = funcs->begin(); func_iter != funcs->end(); ++func_iter) {
        char functionName[1024];
        (*func_iter)->getName(functionName, 1024);
        cout << "-- Function : " << functionName << " --" << endl;
......
I extract the function entry point addresses from the function names.
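As a rough illustration of that extraction step, here is a minimal sketch. It assumes that functions without symbols get a synthetic name made of a fixed prefix followed by the entry address in hexadecimal (for example "targ8048a10"); that naming convention is an assumption, not something confirmed in this thread, so the prefix is left as a parameter. The helper name entryFromName is hypothetical. If the BPatch_function interface in your version exposes the address directly (getBaseAddr() is the call I would check first), that is probably more robust than parsing names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: recover an entry address from a synthetic function name.
// Assumption: the name is <prefix><hex address>, e.g. "targ8048a10".
// Returns true and writes the address to `addr` on success; returns false
// (leaving `addr` untouched) if the name does not match that shape.
bool entryFromName(const std::string& name, std::uint64_t& addr,
                   const std::string& prefix = "targ") {
    // The name must start with the prefix and have at least one digit after it.
    if (name.size() <= prefix.size() ||
        name.compare(0, prefix.size(), prefix) != 0)
        return false;

    std::uint64_t value = 0;
    for (std::size_t i = prefix.size(); i < name.size(); ++i) {
        const char c = name[i];
        unsigned digit;
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return false;             // not a hex digit: a real symbol name
        if (value >> 60) return false; // another digit would overflow 64 bits
        value = (value << 4) | digit;
    }
    addr = value;
    return true;
}
```

Names that fail to parse (for instance real symbols left in the dynamic symbol table) would need their addresses from another source, so it is worth counting how many functions fall into that bucket.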
I tested some coreutils binaries compiled with LLVM at the O2
optimization level, and the precision and recall are generally very
good: *Precision: 0.99; Recall: 0.91*
According to this paper
<ftp://ftp.cs.wisc.edu/paradyn/papers/Williams15Dyninst.pdf>, Section
6.2, DynInst achieves on average over 0.97 precision and 0.93 recall
on 32-bit ELF binaries, which is very consistent with my test. Still,
I am not sure whether I did everything correctly.
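For reference, the precision and recall figures above can be computed from two sets of addresses, as in the sketch below. It assumes a ground-truth set of entry points is available from somewhere else (for example the symbol table of an unstripped build of the same binary); how the ground truth was obtained is not stated in this thread. The names PrecisionRecall and scoreEntries are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

struct PrecisionRecall {
    double precision; // true positives / all reported entries
    double recall;    // true positives / all ground-truth entries
};

// Compare the entry addresses reported by the tool against a ground-truth set.
// An empty denominator yields 0.0 rather than a division by zero.
PrecisionRecall scoreEntries(const std::set<std::uint64_t>& reported,
                             const std::set<std::uint64_t>& truth) {
    std::size_t truePositives = 0;
    for (std::uint64_t addr : reported)
        if (truth.count(addr) != 0)
            ++truePositives;

    PrecisionRecall result;
    result.precision = reported.empty()
        ? 0.0 : static_cast<double>(truePositives) / reported.size();
    result.recall = truth.empty()
        ? 0.0 : static_cast<double>(truePositives) / truth.size();
    return result;
}
```

Using sets means an address reported twice is counted once, which matters if the same function shows up in more than one module.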
So here are my questions:
1. Since DynInst uses a machine learning method to recognize
functions, it seems to need a training process before recognition. I
didn't do any training (although the results are pretty good). Is
there anything in particular I have to do before using DynInst?
The training step has been done once and the resulting model is baked
into the Dyninst code base. Your experimental setup should be correct.
2. If a "pre-trained" model is already installed in DynInst 9.10,
what kind of binaries was this model trained on? For example,
can I use it to test 32-bit ELF binaries compiled with LLVM at O3,
or with ICC at O3?
Dyninst was trained on the test set of binaries produced by the BAP
group at CMU, which includes binutils and coreutils binaries built with
gcc and icc at O0 through O3 (as well as Windows binaries, though those
of course produce a separate model). I expect the model to generalize
decently to LLVM binaries, and we'd be interested to hear your results.
Our initial indications are that these models, applied to modern
compiler versions, are not terribly sensitive to the toolchain used.
--bw
Am I clear enough? I would appreciate it if anyone could give me some help!
Sincerely,
Shuai
_______________________________________________
Dyninst-api mailing list
Dyninst-api@cs.wisc.edu
https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api