CCing the list as this is a pretty good summary of where we are with treehydra.
David Mandelin wrote:
> See ~dmandelin/outparams. Especially analysis.js, but outparams.js is
> the proper script to use in treehydra. Currently, it performs a
> liveness analysis, although it only understands a few GIMPLE
> expression types. But the main data-flow-analysis framework is in
> place. Once this liveness analysis is working, it shouldn't be too
> much harder to get ESP & the outparams analysis itself.
>
> So, now that I've had a couple of days to work with it, the API does
> not seem that bad. And I've been able to get to all the stuff I need
> in this current Treehydra. So that's all pretty cool.

Sweet!

> Here's some random thoughts on stuff I think I'm going to need going
> forward. Let me know if you want any bugs filed for them. I certainly
> can, but they're still at the random thoughts stage now.
>
> 1. Nice debug printing for Treehydra objects.
>
> This is essential. I have some approximation of this in
> do_dehydra_dump, but it could certainly be improved. Also, I think we
> want a nicely readable representation for important objects -- any
> *_DECL, *_EXPR, *_STATEMENT, CFG, or basic block, at least. Again, I
> have some starting points in my scripts. Hopefully, we can write a
> 'tlabel' function or some such that can automatically detect the kind
> of object and dispatch to the right display function. It would also be
> sort of nice to have this be an object method, but that's not very
> important.

Committed some initial code and filed
https://bugzilla.mozilla.org/show_bug.cgi?id=425458
JavaScript prototypes are awesome for this. I bet people never imagined
GIMPLE looking as good as we'll have it in treehydra :)

> 2. Map and Set data structures.
>
> I checked out your latest version, and it's good enough that I stole
> your idea for generating hash codes, but what I need is kind of
> different.
> And, in the end I had to abandon that hash code technique,
> as it turns out you can have 2 distinct treehydra objects for the same
> program variable, but I need them to act as the same key. (And by the
> way, having an extra hashcode method on Object kind of messes up clean
> iteration of little objects, so I recommend removing that. Your key
> generation policy is perfectly appropriate for certain applications,
> but I strongly recommend a less sneaky mechanism.)

Sounds like you should override the hashcode() function for those cases.
Need to file a bug on implementing the function in C to avoid an extra
property.

> I have my own Map and Set in analysis.js. Map is the important one,
> Set is implemented using Map in the obvious way. The key things about
> this Map that I need are:
>
> - customizable "hashcode" generation (it's actually a string key(1),
>   but that's not crucial. The crucial point is control over the
>   definition of key(2) equality. Ugh, I overloaded 'key' in my design.
>   key(2) is keys of the hash table ADT, key(1) is the thing used in the
>   implementation to look up the key(2)-value pair.)
>
> - being able to iterate accurately over the contents (no extra junk
>   for methods or whatever).
>
> In my design, the keys have to be unique, i.e., two objects have the
> same key iff they should be considered equal. Obviously, this is not
> appropriate for all situations, but it works so far for me.
>
> For debugging purposes, they need to be able to cooperate with the
> stringizing functions, probably through another customization parameter.
>
> 3. Instruction decoding.
>
> In general, analyses are going to want a simpler view of the
> instructions than the GIMPLE tree. Exactly what that means overall is
> hard for me to say, but there is at least one set of APIs I will need
> for sure. Namely, I want these accessors for any GIMPLE instruction:
>
> defs - Return/iterate the set of variables written to by the
> instruction.
> uses - Return/iterate the set of variables read from by the instruction.
> op - Return the operator or function name invoked by the instruction.
>
> As an example of the usage, in liveness analysis, immediately before
> any instruction X, defs(X) are not live, and uses(X) are live. So
> clearly this is going to make liveness analysis a lot easier to
> express than groveling over each instruction type.
>
> There's a lot of fiddly bits in this area that make it hard to define
> exactly. For example, I'm not sure what the API should look like when
> pointers come into play, but so far having INDIRECT_REF appear in the
> defs or uses list seems ok to me. Also, some analyses, such as
> liveness, care about strong vs. weak defs, so we might want to have
> both available. (strong def -> variable is definitely assigned, weak
> def -> may be assigned. strong_defs(X) are not live before X, but
> weak_defs(X) are live before X if they are live after X.)

Don't have any genius ideas for this one. Guess we'll just wait and see
how this evolves.

Taras
_______________________________________________
Dev-static-analysis mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-static-analysis
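
For reference, here is a minimal sketch of the kind of Map requested in item 2: the caller supplies the function that defines key equality, iteration sees only the stored pairs, and a second function handles stringizing. This is not the code in analysis.js. The names KeyedMap, KeyedSet, keyFn and labelFn are made up for illustration, and the sketch assumes an ES5 or later JavaScript engine (for Object.create), which may be newer than what Treehydra embeds.

```javascript
// Hypothetical sketch, not the Map from analysis.js.
// keyFn maps a key object to a string; two keys count as equal
// iff keyFn returns the same string for both.
function KeyedMap(keyFn, labelFn) {
  this.keyFn = keyFn;
  this.labelFn = labelFn || String;
  // No prototype on the backing store, so for-in sees only real entries.
  this.entries = Object.create(null);
}

KeyedMap.prototype.put = function (key, value) {
  this.entries[this.keyFn(key)] = { key: key, value: value };
};

KeyedMap.prototype.get = function (key) {
  var e = this.entries[this.keyFn(key)];
  return e ? e.value : undefined;
};

KeyedMap.prototype.has = function (key) {
  return this.keyFn(key) in this.entries;
};

// Calls fn(key, value) once per stored pair.
KeyedMap.prototype.forEach = function (fn) {
  for (var k in this.entries) {
    var e = this.entries[k];
    fn(e.key, e.value);
  }
};

// Debug printing goes through the caller-supplied labelFn.
KeyedMap.prototype.toString = function () {
  var parts = [];
  var label = this.labelFn;
  this.forEach(function (key, value) {
    parts.push(label(key) + " -> " + value);
  });
  return "{" + parts.join(", ") + "}";
};

// Set built on Map: each member maps to true.
function KeyedSet(keyFn, labelFn) {
  this.map = new KeyedMap(keyFn, labelFn);
}
KeyedSet.prototype.add = function (x) { this.map.put(x, true); };
KeyedSet.prototype.has = function (x) { return this.map.has(x); };
KeyedSet.prototype.forEach = function (fn) {
  this.map.forEach(function (key) { fn(key); });
};

// Usage: two distinct objects standing for the same program variable.
var byName = function (v) { return v.name; };
var m = new KeyedMap(byName, byName);
var a = { name: "x", id: 1 };
var b = { name: "x", id: 2 };
m.put(a, "live");
m.put(b, "dead"); // overwrites the entry for a, since both key to "x"
```

Because the methods live on KeyedMap.prototype rather than on Object.prototype, ordinary small objects still iterate cleanly, which is the concern raised about the extra hashcode method.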

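Likewise for item 3, a rough sketch of how liveness could be written once defs and uses are available. The instruction records below are hypothetical stand-ins with made-up field names (op, strongDefs, weakDefs, uses); a real analysis would have to compute them by decoding GIMPLE, which is the hard part and is not shown. It handles a single straight-line block only and assumes an ES5 or later engine.

```javascript
// Hypothetical instruction records; the field names are assumptions,
// not an existing Treehydra API.
// Transfer function for one instruction X, walking backwards:
//   live_before(X) = (live_after(X) - strong_defs(X)) + uses(X)
// Weak defs are left alone: they stay live before X if live after X.
function liveBefore(insn, liveAfter) {
  var live = Object.create(null);
  for (var v in liveAfter) live[v] = true; // copy the incoming set
  insn.strongDefs.forEach(function (d) { delete live[d]; });
  // Uses are added last, so a variable both read and written by X
  // (as in x = x + 1) is still live before X.
  insn.uses.forEach(function (u) { live[u] = true; });
  return live;
}

// Returns, for each instruction in the block, the sorted list of
// variables live immediately before it.
function liveness(block, liveOut) {
  var result = new Array(block.length);
  var live = liveOut;
  for (var i = block.length - 1; i >= 0; i--) {
    live = liveBefore(block[i], live);
    result[i] = Object.keys(live).sort();
  }
  return result;
}

var block = [
  // a = p
  { op: "assign", strongDefs: ["a"], weakDefs: [], uses: ["p"] },
  // *q = a, which may or may not overwrite b
  { op: "store", strongDefs: [], weakDefs: ["b"], uses: ["a", "q"] },
  // c = a + b
  { op: "plus", strongDefs: ["c"], weakDefs: [], uses: ["a", "b"] }
];
var liveSets = liveness(block, { c: true });
```

In this example b is still live before the store, because the store is only a weak def of b and b is read afterwards.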