On Mon, Aug 5, 2019 at 3:13 AM Jaroslav Tulach <jaroslav.tul...@gmail.com> wrote:
> Lively discussion. I can't pretend I read it all, but...

But read the readme on the GitHub site about "deep" mode, which addresses your objection to hashing only method signatures. I think I anticipated your objection.

The problem worth solving here is the case where module A has an implementation dependency on module B, which rarely changes, and a new release of B (as part of the IDE - this is the most common case) breaks module A, even though NO CODE HAS CHANGED in module B. That is a silly situation. Git commit IDs won't solve it. Build dates, we already know, can't solve it.

What solves it perfectly is a hash of only those elements that the compiler turns into code. License header changes don't affect it. Renaming private variables doesn't affect it. Reformatting doesn't affect it. But alter the control flow in any way, and you get a different hash.

I think the more interesting application of this (and the reason I started playing with this idea a few years ago) is if you keep hashes for every public method. You upgrade a library and wonder how much you should worry that something broke, and how much QA you should plan for. Wouldn't it be nicer if you could literally know what percentage of the code you actually touch has changed? Stick that together with eigenvector centrality (a measure of connector-ness - most connected through, not to) values for the things that changed, and you could compute a simple score for the risk of a library upgrade as it applies to YOUR code and the things it really calls.

-Tim

>
> I'd like to remind you that:
> - specification version is used to depend on features
> - implementation version is used to depend on bug-to-bug compatibility
>
> E.g. use specification version to depend on a properly versioned API. Use
> implementation version to work around bugs in behavior.
>
> From this point of view it makes sense to make implementation version = git hash.
> Checksum of available methods isn't really appropriate, as bugs in
> behavior happen in the method bodies, not in signatures.
>
> -jt
>
>
> On Sun, Aug 4, 2019 at 11:27, Tim Boudreau <niftin...@gmail.com> wrote:
>
> > Well, since I mentioned I'd poked at writing a Java code signature hashing
> > tool, I figured why not finish it, so, if anyone's interested, here you
> > go. The readme explains what it does:
> >
> > https://github.com/timboudreau/signature-hash
> >
> > Wrote a few tests that pass (same hash with differently formatted code, or
> > code with or without a private method), and tried it on a few projects
> > making changes. Jan Lahoda, you're probably one of the few people on the
> > planet who could look at this and see if there's anything glaringly wrong
> > with it :-)
> >
> > It builds to a command-line executable JAR, but could easily be made into
> > an Ant or Maven plugin or whatever. Hashing the signatures of things that
> > actually get *called* seems like it would be far superior to git hashes,
> > where inconsequential changes would result in a false-incompatibility
> > reading.
> >
> > -Tim
> >
>

--
http://timboudreau.com
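
For anyone who wants to play with the per-method idea above, here is a rough sketch of the "what percentage of the code you actually touch has changed" score. It is written in Python only for brevity and is not part of signature-hash: the function name `upgrade_risk`, the dictionaries of per-method hashes, and the optional centrality weights are all hypothetical, and it assumes you already have some way to produce one hash per public method for the old and new releases.

```python
from typing import Dict, Optional, Set


def upgrade_risk(
    old_hashes: Dict[str, str],
    new_hashes: Dict[str, str],
    called: Set[str],
    centrality: Optional[Dict[str, float]] = None,
) -> float:
    """Return the (optionally weighted) fraction of called methods that changed.

    old_hashes / new_hashes map a method identifier to its code hash in the
    old and new library release. `called` is the set of library methods your
    own code actually reaches. `centrality` is an assumed, precomputed weight
    per method (for example an eigenvector centrality value); methods without
    a weight count as 1.0.
    """
    weights = centrality or {}
    total = 0.0
    changed = 0.0
    for method in called:
        weight = weights.get(method, 1.0)
        total += weight
        old = old_hashes.get(method)
        new = new_hashes.get(method)
        # A method missing from either release is treated as changed.
        if old is None or new is None or old != new:
            changed += weight
    return changed / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical hashes for three public methods of a library.
    old = {"Lib.a()": "h1", "Lib.b()": "h2", "Lib.c()": "h3"}
    new = {"Lib.a()": "h1", "Lib.b()": "hX", "Lib.c()": "hY"}
    # Our code only calls a() and b(); c() changed but we never touch it.
    print(upgrade_risk(old, new, {"Lib.a()", "Lib.b()"}))  # prints 0.5
```

Leaving `centrality` out gives the plain percentage; passing weights makes a change in a heavily connected method count for more than a change in a leaf.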