On Wed, Dec 13, 2017 at 07:50:44PM +0000, bpr via Digitalmars-d-announce wrote:
> On Tuesday, 5 December 2017 at 18:20:40 UTC, Seb wrote:
[...]
> Of the projects in [2], I like the general purpose betterC libraries
> most, and I think it's something where students could make a real
> impact in that time period.
[...]
> > [2] https://wiki.dlang.org/GSOC_2018_Ideas

The "Who's (using) who?" project can use my symbol dependency tool as a
starting point:

	https://github.com/quickfur/symdep

Basically, as it stands, it can extract the list of symbols from the
program, and which symbol references which other symbols, where "A
references B" means the disassembled code between A and the next symbol
in the executable contains a reference to an address somewhere between B
and the next symbol after B. This is done by inspecting the output of
the `objdump` tool.

A list of dependencies can be produced in either text format or in
GraphViz .dot format, which can be passed to graphviz or neato to
produce a graphical chart of symbol dependencies.

As of now, the following are possible points of improvement:

- Make it work on Windows and other OSes that don't have the `objdump`
  utility;

- Add better capability to limit the output to a subgraph of the full
  graph. Because of the huge number of symbols in a typical D program,
  outputting the entire dependency graph will produce a graph far too
  large to be easily understood. Currently, symdep has the capability
  of restricting the output to the subgraph of symbols reachable from a
  certain given symbol (useful for answering "what does function foo
  call?"), or the subgraph of symbols NOT reachable from a certain given
  symbol (e.g., "what are the symbols that aren't reachable from
  _Dmain?"). However, in medium-to-large D programs, the resulting
  subgraph is still far too large to be useful, so a better way of
  selecting a subgraph would be nice. Perhaps adding a maximum
  recursion depth to the existing subgraph functions might be a good
  start, i.e., "what are the symbols referenced by _Dmain up to 3 levels
  down the call chain / reference graph?".

- Better accuracy for dependency detection. Currently, symdep may not
  produce the most accurate results: if a module contains private /
  static symbols that don't export a public symbol in the executable,
  symdep won't know that a reference is actually to that private symbol,
  and will blindly assume that it's referencing the closest public
  symbol that comes before the private symbol in the executable. This
  makes the output graph inaccurate. Also, some references that go
  through indirection may not be detected correctly, e.g., if function F
  calls function G via a function pointer table or thunk. (I think the
  function table case should still work, as long as the function table
  itself has a public symbol; it will just show up in the output as
  F -> tableSym -> G. But this has not been rigorously tested.)

- Currently, symdep does not distinguish between code symbols and data
  symbols. For its stated purpose (i.e., finding unexpected
  dependencies on Phobos modules that seemingly aren't used), this is
  not necessarily a bad thing. But being able to tell the difference
  would help make the output more readable, e.g., by using different
  node shapes for code vs. data symbols; it would also allow subgraph
  queries to be restricted to a particular node type (show me the call
  graph vs. show me the data dependency graph), etc.


T

-- 
Dogs have owners ... cats have staff. -- Krista Casada
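
P.S. To make the "A references B" rule and the depth-limit idea a bit
more concrete, here is a rough sketch. It is written in Python purely
for brevity and is NOT taken from symdep's source; the names
`owner_symbol` and `reachable_within`, the data layout, and the sample
addresses are all made up for illustration. It assumes the symbol table
has already been parsed into (start address, name) pairs sorted by
address, and the references into a map from each symbol to the symbols
it references.

```python
from bisect import bisect_right
from collections import deque


def owner_symbol(symbols, addr):
    """Return the name of the symbol whose address range contains addr.

    symbols is a list of (start_address, name) pairs sorted by address;
    each symbol's range is assumed to run up to the start of the next one.
    Returns None if addr lies before the first symbol.
    """
    starts = [start for start, _ in symbols]
    i = bisect_right(starts, addr) - 1
    return symbols[i][1] if i >= 0 else None


def reachable_within(graph, root, max_depth):
    """Breadth-first walk from root, following at most max_depth edges.

    graph maps a symbol name to the names it references; max_depth is
    assumed to be non-negative. Returns a dict mapping each reached
    symbol to its distance from root.
    """
    depth = {root: 0}
    queue = deque([root])
    while queue:
        sym = queue.popleft()
        if depth[sym] == max_depth:
            continue  # do not expand past the depth limit
        for ref in graph.get(sym, ()):
            if ref not in depth:
                depth[ref] = depth[sym] + 1
                queue.append(ref)
    return depth


# Made-up data, for illustration only.
symbols = [(0x1000, "_Dmain"), (0x1040, "foo"), (0x10A0, "bar")]
graph = {"_Dmain": ["foo"], "foo": ["bar"], "bar": ["baz"], "baz": []}
```

Note that a lookup like `owner_symbol` has exactly the accuracy problem
described above: an address inside a private symbol with no entry in the
table gets attributed to the nearest preceding public symbol. With the
sample graph, `reachable_within(graph, "_Dmain", 2)` reaches `foo` and
`bar` but stops before `baz`.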