(moved off of -meta) "David L. Nicol" wrote: > > Steve Fink wrote (and I edited slightly): > > > <groan> I can't figure out why so many people misinterpret my RFC12 > > as requiring a solution to the halting problem. > > a large class of incompletely expressed > suggestions appear to get grouped into > > "This requires solving the halting problem!" > > without providing further explanation. > > Looking at your RFC12, I imagine the problem is, that it is impossible > to know at compile time if a variable is going to get used or not. When > exactly is the warning supposed to be generated? Compile time. It's pretty much the same as your taint checking. An assignment to a variable means that it is initialized immediately after that statement, as well as at every other location in the code that will only get executed if that assignment will have been executed first. So certainly everything following that assignment in the same control scope. I avoid the halting problem by never requiring to know whether a variable _will_ be used, only whether it _can_ be. Under pessimistic assumptions like "any boolean expression can be either true or false in a valid run of a program". Which had better be true! In the case of taintedness, it's those conservative assumptions that require some of the work to be done at runtime. It's standard semantic analysis. Both your taintedness analysis and my reachability analyses can be fully described by specifying what things generate the characteristic you're analyzing, what things block (in the literature, "kill") it, and the transfer rules. It's often not the best way of implementing it, since the fully general framework can be pretty slow, but it's a concise way of describing things that you'll find in many textbooks. They talk in terms of maintaining GEN and KILL sets and describing when to add, subtract, union, and intersect the incoming & outgoing GENs and KILLs. Yours: taint-vulnerable, tainted Mine: initialized, undefined, maybe-undefined, reachable, maybe-reachable The implementations of both our RFCs would probably share a lot of code. Some sort of framework would be useful. And these sorts of things are absolutely necessary for nontrivial optimizations, too. Fortunately, both our RFCs are also ones that don't really _change_ the language; they have to be either be implemented in the core or just have deep access to the op/parse tree, but they can always be added later. Well, barring really wacky ideas. (use weirdness qw(all-lexicals-are-now-global-ha-ha))
