On Mon, Aug 5, 2019 at 3:13 AM Jaroslav Tulach <jaroslav.tul...@gmail.com>
wrote:

> Lively discussion. I can't pretend I read it all, but...


But read the readme on the github site about "deep" mode, which addresses
your objection to hashing only method signatures. I think I anticipated
your objection.

The problem that is worth solving here is the case where module A has an
implementation dependency on module B which rarely changes, and a new
release of B (as part of the IDE - this is the most common case) breaks
module A, even though NO CODE HAS CHANGED in module B.

That is a silly situation. Git commit IDs won't solve it. Build dates we
already know can't solve it.

What solves it perfectly is a hash only of those elements that the compiler
turns into code. License header changes don't affect it. Renaming private
variables doesn't affect it. Reformatting doesn't affect it. But alter the
control flow in any way, and you get a different hash.
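
To make that concrete, here is a rough sketch of the idea, and NOT how the
signature-hash tool itself works: it simply hashes the token stream of a
source file with comments and whitespace thrown away. The class and method
names (CodeHashSketch, hash) are made up for illustration. Being token-based,
it only covers the license-header and reformatting cases; ignoring renamed
private variables would need a real parse tree.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: hash only the tokens of a Java source string,
// ignoring comments and whitespace.
final class CodeHashSketch {

    static String hash(String source) {
        try {
            StreamTokenizer t = new StreamTokenizer(new StringReader(source));
            // Start from a blank syntax table so that numbers, '-' and '.'
            // get no special treatment that would depend on spacing.
            t.resetSyntax();
            t.wordChars('a', 'z');
            t.wordChars('A', 'Z');
            t.wordChars('0', '9');
            t.wordChars('_', '_');
            t.wordChars('$', '$');
            t.whitespaceChars(0, ' ');
            t.quoteChar('"');
            t.quoteChar('\'');
            t.slashSlashComments(true);   // skip // comments
            t.slashStarComments(true);    // skip /* */ comments

            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            while (t.nextToken() != StreamTokenizer.TT_EOF) {
                String token;
                if (t.ttype == StreamTokenizer.TT_WORD) {
                    token = "w:" + t.sval;          // identifier, keyword, digits
                } else if (t.ttype == '"' || t.ttype == '\'') {
                    token = "s:" + t.sval;          // string or char literal body
                } else {
                    token = "c:" + (char) t.ttype;  // operator or punctuation
                }
                digest.update(token.getBytes(StandardCharsets.UTF_8));
                digest.update((byte) 0);            // separator between tokens
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String a = "/* License A */\nclass X { int f(int a) { return a + 1; } }";
        String b = "// other header\nclass X {\n  int f(int a) {\n    return a+1;\n  }\n}";
        System.out.println(hash(a).equals(hash(b)));
    }
}
```

The two inputs in main differ only in their header comment and layout, so
they should hash the same; change the 1 to a 2 and the hash changes.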

I think the more interesting application of this (and the reason I started
playing with this idea a few years ago) is if you keep hashes for every
public method. You upgrade a library and wonder how much you should worry
that something broke, and how much QA you should plan for. Wouldn't it be
nicer if you could literally know what percentage of the code you actually
touch has changed? Stick that together with eigenvector centrality (a
measure of connector-ness - most connected through, not to) values for the
things that changed, and you could compute a simple score for the risk of a
library upgrade as it applies to YOUR code and the things it really calls.
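
A back-of-the-envelope version of that score might look like the following.
Everything here is an assumption for illustration: the names
(UpgradeRiskSketch, changedFraction, weightedRisk), the idea that you already
have a per-method hash for the old and new library versions, and that the
centrality values were computed elsewhere and are simply passed in.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch: score a library upgrade against the methods
// your own code actually calls.
final class UpgradeRiskSketch {

    // A method counts as changed if its hash differs or it disappeared.
    static boolean changed(String method, Map<String, String> oldHashes,
                           Map<String, String> newHashes) {
        return !Objects.equals(oldHashes.get(method), newHashes.get(method));
    }

    // Fraction of the called methods whose hash changed, from 0.0 to 1.0.
    static double changedFraction(Set<String> called,
                                  Map<String, String> oldHashes,
                                  Map<String, String> newHashes) {
        if (called.isEmpty()) {
            return 0.0;
        }
        long count = called.stream()
                .filter(m -> changed(m, oldHashes, newHashes))
                .count();
        return (double) count / called.size();
    }

    // Same idea, but each method is weighted by a precomputed centrality
    // value, so changes to heavily "connected-through" methods count more.
    static double weightedRisk(Set<String> called,
                               Map<String, String> oldHashes,
                               Map<String, String> newHashes,
                               Map<String, Double> centrality) {
        double total = 0.0;
        double changedWeight = 0.0;
        for (String m : called) {
            double w = centrality.getOrDefault(m, 0.0);
            total += w;
            if (changed(m, oldHashes, newHashes)) {
                changedWeight += w;
            }
        }
        return total == 0.0 ? 0.0 : changedWeight / total;
    }

    public static void main(String[] args) {
        Set<String> called = Set.of("a", "b", "c", "d");
        Map<String, String> oldH = Map.of("a", "1", "b", "2", "c", "3", "d", "4");
        Map<String, String> newH = Map.of("a", "1", "b", "9", "c", "3", "d", "4");
        Map<String, Double> cent = Map.of("a", 0.1, "b", 0.6, "c", 0.2, "d", 0.1);
        System.out.println(changedFraction(called, oldH, newH));
        System.out.println(weightedRisk(called, oldH, newH, cent));
    }
}
```

In the example only one of four called methods changed, but it is the most
central one, so the weighted score comes out well above the plain fraction.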

-Tim


>
> I'd like to remind you that:
> - specification version is used to depend on features
> - implementation version is used to depend on bug-to-bug compatibility
>
> E.g. use specification version to depend on properly versioned API. Use
> implementation version to work around bugs in behavior.
>
> From this point of view it makes sense to make implementation version = git
> hash. Checksum of available methods isn't really appropriate as bugs in
> behavior happen in the method bodies, not in signatures.
>
> -jt
>
>
> On Sun, Aug 4, 2019 at 11:27, Tim Boudreau <niftin...@gmail.com>
> wrote:
>
> > Well, since I mentioned I'd poked at writing a Java code signature hashing
> > tool, I figured why not finish it, so, if anyone's interested, here you
> > go.  The readme explains what it does:
> >
> > https://github.com/timboudreau/signature-hash
> >
> > Wrote a few tests that pass (same hash with differently formatted code, or
> > code with or without a private method), and tried it on a few projects
> > making changes.  Jan Lahoda, you're probably one of the few people on the
> > planet that could look at this and see if there's anything glaringly wrong
> > with it :-)
> >
> > It builds to a command-line executable JAR, but could easily be made into
> > an Ant or Maven plugin or whatever.  Hashing the signatures of things that
> > actually get *called* seems like it would be far superior to git hashes,
> > where inconsequential changes would result in a false-incompatibility
> > reading.
> >
> > -Tim
> >
>
-- 
http://timboudreau.com
