Re: Treehydra etc.

Taras Glek Thu, 27 Mar 2008 11:16:38 -0700

CCing the list as this is a pretty good summary of where we are with 
treehydra.


David Mandelin wrote:
> See ~dmandelin/outparams. Especially analysis.js, but outparams.js is 
> the proper script to use in treehydra. Currently, it performs a 
> liveness analysis, although it only understands a few GIMPLE 
> expression types. But the main data-flow-analysis framework is in 
> place. Once this liveness analysis is working, it shouldn't be too 
> much harder to get ESP & the outparams analysis itself.
>
> So, now that I've had a couple of days to work with it, the API does 
> not seem that bad. And I've been able to get to all the stuff I need 
> in this current Treehydra. So that's all pretty cool.
Sweet!
>
> Here are some random thoughts on stuff I think I'm going to need going 
> forward. Let me know if you want any bugs filed for them. I certainly 
> can, but they're still at the random thoughts stage now.
>
> 1. Nice debug printing for Treehydra objects.
>
> This is essential. I have some approximation of this in 
> do_dehydra_dump, but it could certainly be improved. Also, I think we 
> want a nicely readable representation for important objects -- any 
> *_DECL, *_EXPR, *_STATEMENT, CFG, or basic block, at least. Again, I 
> have some starting points in my scripts. Hopefully, we can write a 
> 'tlabel' function or some such that can automatically detect the kind 
> of object and dispatch to the right display function. It would also be 
> sort of nice to have this be an object method, but that's not very 
> important.
Committed some initial code and filed
https://bugzilla.mozilla.org/show_bug.cgi?id=425458

JavaScript prototypes are awesome for this. I bet people never imagined 
GIMPLE looking as good as we'll have it in treehydra :)
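
A minimal sketch of the 'tlabel' dispatch idea from item 1, including the
object-method variant via a shared prototype. The node shape (a `kind`
string plus `name` or `operands`), the `labelers` table, and `TreeNode`
are all made up for illustration; real Treehydra objects are laid out
differently, and this is not the code committed for the bug above.

```javascript
// Hypothetical node shape: { kind: "VAR_DECL", name: "x" } and so on.
// Each entry pairs a test on the kind name with a display function.
var labelers = [
  [/_DECL$/, function (t) { return t.name; }],
  [/_EXPR$/, function (t) {
    return t.kind + "(" + t.operands.map(function (o) { return tlabel(o); }).join(", ") + ")";
  }],
];

// Detect the kind of object and dispatch to the matching display function.
function tlabel(t) {
  for (var i = 0; i < labelers.length; i++) {
    if (labelers[i][0].test(t.kind)) return labelers[i][1](t);
  }
  return "<" + t.kind + ">";   // fallback for kinds without a labeler
}

// Object-method flavour: hang toString on a shared prototype.
function TreeNode(props) {
  for (var p in props) this[p] = props[p];
}
TreeNode.prototype.toString = function () { return tlabel(this); };

var x = new TreeNode({ kind: "VAR_DECL", name: "x" });
var y = new TreeNode({ kind: "VAR_DECL", name: "y" });
var sum = new TreeNode({ kind: "PLUS_EXPR", operands: [x, y] });

var sumLabel = String(sum);                       // "PLUS_EXPR(x, y)"
var blockLabel = tlabel({ kind: "BASIC_BLOCK" }); // "<BASIC_BLOCK>"
```

Keeping the dispatch in a table means a new kind of object only needs one
more entry, and anything unrecognised still prints something usable.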
>
> 2. Map and Set data structures.
>
> I checked out your latest version, and it's good enough that I stole 
> your idea for generating hash codes, but what I need is kind of 
> different. And, in the end I had to abandon that hash code technique, 
> as it turns out you can have 2 distinct treehydra objects for the same 
> program variable, but I need them to act as the same key. (And by the 
> way, having an extra hashcode method on Object kind of messes up clean 
> iteration of little objects, so I recommend removing that. Your key 
> generation policy is perfectly appropriate for certain applications, 
> but I strongly recommend a less sneaky mechanism.)
Sounds like you should override the hashcode() function for those cases. 
I need to file a bug on implementing the function in C to avoid the extra 
property.
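
To illustrate the iteration problem with an extra method on Object, here
is a small sketch in plain JavaScript. It assumes an engine that supports
Object.defineProperty (ES5), which may not match the engine Treehydra
embeds; the non-enumerable variant is only one possible way to hide the
property and is not necessarily what a C implementation would do.

```javascript
// A plain method on Object.prototype shows up in every for-in loop.
Object.prototype.hashcode = function () { return "h"; };
var seenEnumerable = [];
for (var k in { a: 1 }) seenEnumerable.push(k);
delete Object.prototype.hashcode;

// A non-enumerable property stays callable but is skipped by for-in.
Object.defineProperty(Object.prototype, "hashcode", {
  value: function () { return "h"; },
  enumerable: false,
  configurable: true,   // so it can be removed again below
  writable: true,
});
var seenHidden = [];
for (var k2 in { a: 1 }) seenHidden.push(k2);
var stillCallable = ({}).hashcode();
delete Object.prototype.hashcode;
```

After running this, seenEnumerable contains "hashcode" next to "a", while
seenHidden contains only "a".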
>
> I have my own Map and Set in analysis.js. Map is the important one, 
> Set is implemented using Map in the obvious way. The key things about 
> this Map that I need are:
>
> - customizable "hashcode" generation. (It's actually a string key(1), 
> but that's not crucial. The crucial point is control over the 
> definition of key(2) equality. Ugh, I overloaded 'key' in my design: 
> key(2) is a key of the hash table ADT, and key(1) is the string the 
> implementation uses to look up the key(2)-value pair.)
>
> - being able to iterate accurately over the contents (no extra junk 
> for methods or whatever).
>
> In my design, the keys have to be unique, i.e., two objects have the 
> same key iff they should be considered equal. Obviously, this is not 
> appropriate for all situations, but it works so far for me.
>
> For debugging purposes, they need to be able to cooperate with the 
> stringizing functions, probably through another customization parameter.
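
A minimal sketch of a Map with the properties listed in item 2: a
caller-supplied key function that defines equality, iteration that yields
only the stored entries, and a label function for debug printing. The
names KeyedMap, KeyedSet, keyFunc and labelFunc are invented here; this
is not the code in analysis.js.

```javascript
// keyFunc maps a key object to a string; two keys are treated as equal
// exactly when keyFunc returns the same string for both.
// labelFunc (optional) is used only for debug printing.
function KeyedMap(keyFunc, labelFunc) {
  this.keyFunc = keyFunc;
  this.labelFunc = labelFunc || String;
  this.entries = Object.create(null);   // no inherited properties to trip over
}
KeyedMap.prototype.put = function (k, v) {
  this.entries[this.keyFunc(k)] = { key: k, value: v };
};
KeyedMap.prototype.get = function (k) {
  var e = this.entries[this.keyFunc(k)];
  return e ? e.value : undefined;
};
KeyedMap.prototype.has = function (k) {
  return this.keyFunc(k) in this.entries;
};
KeyedMap.prototype.forEach = function (f) {
  for (var s in this.entries) f(this.entries[s].key, this.entries[s].value);
};
KeyedMap.prototype.toString = function () {
  var parts = [], label = this.labelFunc;
  this.forEach(function (k, v) { parts.push(label(k) + " -> " + v); });
  return "{" + parts.join(", ") + "}";
};

// Set implemented on top of Map.
function KeyedSet(keyFunc, labelFunc) { this.map = new KeyedMap(keyFunc, labelFunc); }
KeyedSet.prototype.add = function (k) { this.map.put(k, true); };
KeyedSet.prototype.has = function (k) { return this.map.has(k); };

// Two distinct objects standing for the same program variable.
var byName = function (v) { return v.name; };
var m = new KeyedMap(byName, byName);
m.put({ name: "rv", id: 1 }, "live");
m.put({ name: "rv", id: 2 }, "dead");   // same key, so this overwrites
var mapText = String(m);                // "{rv -> dead}"
```

Because entries live in their own object rather than on the map itself,
forEach never sees the map's methods.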
>
>
> 3. Instruction decoding.
>
> In general, analyses are going to want a simpler view of the 
> instructions than the GIMPLE tree. Exactly what that means overall is 
> hard for me to say, but there is at least one set of APIs I will need 
> for sure. Namely, I want these accessors for any GIMPLE instruction:
>
>  defs - Return/iterate the set of variables written to by the 
> instruction.
>  uses - Return/iterate the set of variables read from by the instruction.
>  op   - Return the operator or function name invoked by the instruction.
>
> As an example of the usage, in liveness analysis, immediately before 
> any instruction X, uses(X) are live, and defs(X) are not live unless 
> they are also in uses(X). So clearly this is going to make liveness 
> analysis a lot easier to express than groveling over each instruction 
> type.
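
A minimal sketch of how defs/uses/op accessors could drive the liveness
step described above. The instruction records here are hypothetical
stand-ins with the defs and uses already decoded; extracting them from
real GIMPLE trees is the hard part and is not shown.

```javascript
// Hypothetical instruction shape, standing in for decoded GIMPLE:
//   { op: "plus", defs: ["a"], uses: ["b", "c"] }   // a = b + c
function defs(insn) { return insn.defs; }
function uses(insn) { return insn.uses; }
function op(insn) { return insn.op; }

// Backward transfer function for one instruction:
//   liveBefore = (liveAfter - defs) U uses
// Uses are added last, so a variable that is both read and written
// (x = x + 1) is live before the instruction.
function liveBefore(insn, liveAfter) {
  var live = {};
  for (var v in liveAfter) live[v] = true;
  defs(insn).forEach(function (v) { delete live[v]; });
  uses(insn).forEach(function (v) { live[v] = true; });
  return live;
}

// Walk a straight-line block backwards from what is live at its exit.
function liveAtEntry(block, liveOut) {
  var live = liveOut;
  for (var i = block.length - 1; i >= 0; i--) live = liveBefore(block[i], live);
  return Object.keys(live).sort();
}

var block = [
  { op: "plus", defs: ["a"], uses: ["b", "c"] },   // a = b + c
  { op: "mul",  defs: ["d"], uses: ["a", "a"] },   // d = a * a
  { op: "plus", defs: ["b"], uses: ["b", "d"] },   // b = b + d
];
var entry = liveAtEntry(block, { b: true });       // ["b", "c"]
```

The transfer function never inspects the instruction type, which is the
point of the accessors: all per-type knowledge sits behind defs and uses.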
>
> There are a lot of fiddly bits in this area that make it hard to define 
> exactly. For example, I'm not sure what the API should look like when 
> pointers come into play, but so far having INDIRECT_REF appear in the 
> defs or uses list seems ok to me. Also, some analyses, such as 
> liveness, care about strong vs. weak defs, so we might want to have 
> both available. (strong def -> variable is definitely assigned, weak 
> def -> may be assigned. strong_defs(X) are not live before X, but 
> weak_defs(X) are live before X if they are live after X.)
Don't have any genius ideas for this one. Guess we'll just wait and see 
how this evolves.
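
For what it's worth, the strong vs. weak def rule quoted above can be
sketched as a small change to the liveness transfer function: only strong
defs kill liveness. The strongDefs/weakDefs field names and the
instruction records are assumptions made for this example, not a proposed
Treehydra API.

```javascript
// Hypothetical instruction shape: strongDefs are definitely assigned,
// weakDefs may be assigned (for example through a pointer).
function liveBeforeWithWeakDefs(insn, liveAfter) {
  var live = {};
  for (var v in liveAfter) live[v] = true;
  // Only strong defs kill liveness. Weak defs are left alone, so they
  // stay live before the instruction if they were live after it.
  insn.strongDefs.forEach(function (v) { delete live[v]; });
  insn.uses.forEach(function (v) { live[v] = true; });
  return Object.keys(live).sort();
}

// *p = x, where p may point to a or b
var store = { op: "store", strongDefs: [], weakDefs: ["a", "b"], uses: ["p", "x"] };
var beforeStore = liveBeforeWithWeakDefs(store, { a: true });   // ["a", "p", "x"]

// a = x
var assign = { op: "assign", strongDefs: ["a"], weakDefs: [], uses: ["x"] };
var beforeAssign = liveBeforeWithWeakDefs(assign, { a: true }); // ["x"]
```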

Taras

_______________________________________________
Dev-static-analysis mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-static-analysis
