On Mon, 2021-07-05 at 21:45 +0530, Ankur Saini wrote:
> I forgot to send the daily report yesterday, so this one covers the
> work done on both days.
>
> AIM:
>
> - make the analyzer call the function with the updated call-string
>   representation (even the ones that don't have a superedge)
> - make the analyzer figure out the point of return from a function
>   called without a superedge
> - make the analyzer figure out the correct point to return to in the
>   caller function
> - make an enode and eedge representing the return call
> - test the changes on the example I created before
> - speculate about what GCC generates for a vfunc call and discuss how
>   we can use it to our advantage
>
> ---
>
> PROGRESS (changes can be seen on the
> "refs/users/arsenic/heads/analyzer_extension" branch of the
> repository):
>
> - Thanks to the new call-string representation, I was able to push
>   calls that don't have a superedge onto the call stack, and could
>   see the calls happening via the function pointer.
>
> - To detect the returning point of the function, I used the fact that
>   such supernodes contain an EXIT bb, have no return superedge, and
>   still have a pending call stack.
>
> - The next part was to find the destination node of the return. For
>   this I again made use of the new call string and created a custom
>   accessor to get the caller and callee supernodes of the return
>   call, then extracted the gcall* from the caller supernode to update
>   the program state.
>
> - Now that I had the next state and next point, it was time to put
>   the final piece of the puzzle in place and create the exploded node
>   and edge representing the returning call.
>
> - I tested the changes on the following program, where the analyzer
>   was earlier giving a false positive because it did not detect the
>   call via a function pointer:
>
> ```
> #include <stdio.h>
> #include <stdlib.h>
>
> void fun(int *int_ptr)
> {
>     free(int_ptr);
> }
>
> int test()
> {
>     int *int_ptr = (int*)malloc(sizeof(int));
>     void (*fun_ptr)(int *) = &fun;
>     (*fun_ptr)(int_ptr);
>
>     return 0;
> }
>
> void test_2()
> {
>     test();
> }
> ```
> (compiler explorer link: https://godbolt.org/z/9KfenGET9)
>
> The results showed success: the analyzer was now able to detect,
> call and return from the function that was called via the function
> pointer, and no longer reported the memory leak it was reporting
> before. : )

This is great; well done!

It would be good to turn the above into a regression test. I think you
can do that by simply adding it to gcc/testsuite/gcc.dg/analyzer. You
could also add a case where fun_ptr is called twice, and check that it
reports it as a double-free (and add a dg-warning directive to verify
that it correctly complains).

I wonder if your branch has already fixed:
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100546

>
> - I think I should point this out: in the process I created a lot of
>   custom functions to access/alter some data, which was not possible
>   before.
>
> - Now that calls via function pointer are taken care of, it was time
>   to see what exactly GCC generates when a function is dispatched
>   dynamically and, as planned earlier, I went to ipa-devirt.c (GCC's
>   devirtualizer implementation) to investigate.
>
> - Although I didn't understand everything that was happening there,
>   here are some of the findings I thought might be interesting for
>   the project:
>
>     a polymorphic call is made through an OBJ_TYPE_REF, which
>     contains otr_type (the type of the class whose method is called)
>     and otr_token (the index into the virtual table where the address
>     is taken)
>
>     the devirtualizer builds a type inheritance graph to keep track
>     of the entire inheritance hierarchy
>
>     the most interesting function I found was
>     "possible_polymorphic_call_targets()", which returns the vector
>     of all possible targets of a polymorphic call represented by a
>     call edge or a gcall
>
>     what I understood is that the devirtualizer searches these
>     polymorphic calls, picks out the calls which are more likely to
>     be called (known as likely calls), and turns them into
>     speculative calls, which are later turned into direct calls
>
> - Another thing I was curious to know was how the analyzer would
>   behave when it encounters a polymorphic call, now that we are
>   splitting the superedges at every call.
>
>     the results were interesting: I was able to see the analyzer
>     splitting supernodes for the calls right away, but this time they
>     were not connected via an intraprocedural edge, making the
>     analyzer crash at the callsite (I will look more into it
>     tomorrow)
>
>     the example I used was:
> ```
> struct A
> {
>     virtual int foo (void)
>     {
>         return 42;
>     }
> };
>
> struct B: public A
> {
>     int foo (void)
>     {
>         return 0;
>     }
> };
>
> int test()
> {
>     struct B b, *bptr=&b;
>     bptr->foo();
>     return bptr->foo();
> }
> ```
> (compiler explorer link: https://godbolt.org/z/d986ab7MY)
>

I can see the crash in gdb:

In state_purge_per_ssa_name::process_point, when
  if (snode->m_returning_call)
the code assumes that there will be a cgraph_edge, which isn't the case
anymore; it will need to go from the "return" supernode to the "call"
supernode (both within the caller function).

> ---
>
> STATUS AT THE END OF THE DAY:
>
> - make the analyzer call the function with the updated call-string
>   representation (even the ones that don't have a superedge) (done)
> - make the analyzer figure out the point of return from a function
>   called without a superedge (done)
> - make the analyzer figure out the correct point to return to in the
>   caller function (done)
> - make an enode and eedge representing the return call (done)
> - test the changes on the example I created before (done)
> - speculate about what GCC generates for a vfunc call and discuss how
>   we can use it to our advantage (done)
>

Good work; looks promising.

Dave
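
A rough sketch of the regression test suggested above, assuming it
would be dropped into gcc/testsuite/gcc.dg/analyzer. The function names
single_call and double_call are hypothetical, and the exact warning
text and the line it lands on are assumptions rather than output taken
from the branch:

```c
/* Hypothetical testcase: checks that calls made through a function
   pointer are followed by the analyzer.  Not taken from the branch.  */

#include <stdlib.h>

void fun(int *int_ptr)
{
    /* Assumed diagnostic text and location; adjust both to whatever
       the analyzer actually emits.  */
    free(int_ptr); /* { dg-warning "double-'free' of 'int_ptr'" } */
}

/* Frees once through the pointer: neither a leak nor a double-free
   should be attributed to this caller.  */
void single_call(void)
{
    int *int_ptr = (int *)malloc(sizeof(int));
    void (*fun_ptr)(int *) = &fun;
    (*fun_ptr)(int_ptr);
}

/* Frees twice through the pointer: the second call is the one that
   should trigger the double-free report.  */
void double_call(void)
{
    int *int_ptr = (int *)malloc(sizeof(int));
    void (*fun_ptr)(int *) = &fun;
    (*fun_ptr)(int_ptr);
    (*fun_ptr)(int_ptr);
}
```

Keeping both callers in one file means a single run covers the call
and the return through the pointer; if either is missed, the expected
warning should go missing or an unexpected leak report should appear.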