Hi Eric,

On Tue, Feb 23, 2016 at 01:56:07PM -0600, Eric W. Biederman wrote:
> 
> Fengguag Wu, Xiaolong Ye, have you attempted to use the truncated
> sha1 of the file the patch applies to?  Git already places a file sha1
> at the top of a patch.  See the index line?
> 
> > diff --git a/fs/namespace.c b/fs/namespace.c
> > index eccd925c6e82..3c3f8172c734 100644

Yes, we have evaluated using that index. The conclusion is that it
helps make a better guess, but it is still guesswork and far from
perfect.

A simple count shows that only about 1/5 of the files change between
two major kernel releases:

        wfg /c/linux% git ls-files | wc -l
        52915
        wfg /c/linux% git diff --name-only v4.3 v4.4 | wc -l
        10606

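A file that is untouched during a release cycle has the same blob ID
in every commit of that cycle. That means a huge number of candidate
base tree IDs match the given blob IDs.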

> > --- a/fs/namespace.c
> > +++ b/fs/namespace.c
> 
> As I understand it you are aiming for making a good guess what the patch
> or patches apply to, having a set of file hashes looks like it would
> give you that.
> 
> All it should take is to iterate over a patchset and for each file in
> the patchset capture the first file hash.  Then in the smallish set of
> maintainer trees see if that set of file hashes matches any of their
> recent commits.  You should be able to prune the set of possible
> maintainer trees even more by looking at the mailling list or lists
> the patch was submitted to.
 
We actually started with that idea half a year ago. Yes, it helps
narrow down the list of candidate maintainer trees. The chances
improve when the patchset modifies multiple files, and because some
files are modified more frequently than others. However, it is still
fundamentally guesswork. The best choice is to ask for an explicit
"base tree ID".
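
For reference, the hash-collection step described above (capture the
first file hash for each file in a patch) can be sketched roughly as
follows. This is only an illustration, not what the 0day robot runs:
extract_preimage_blobs is a hypothetical helper name, and the sketch
assumes file paths without spaces.

```shell
# Sketch: read a patch on stdin and print "<path> <pre-image blob prefix>"
# for every file it touches.
# extract_preimage_blobs is a hypothetical helper, not a git command.
# Assumption: paths contain no whitespace.
extract_preimage_blobs() {
    awk '
        # "diff --git a/<path> b/<path>": remember the path, minus "a/"
        /^diff --git / { path = $3; sub(/^a\//, "", path) }
        # "index <old>..<new> <mode>": <old> is the pre-image blob prefix
        /^index [0-9a-f]+[.][.][0-9a-f]+/ {
            split($2, ids, "[.][.]")
            print path, ids[1]
        }
    '
}
```

Each printed prefix can then be compared against the output of
`git rev-parse <commit>:<path>` for a candidate commit. As noted
above, a match only narrows the candidates; it does not identify a
unique base.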

> Before we talk about adding anything more I think we need a clear
> picture of what you have tried with what already exists.  A decade ago
> part of the problem was that not everyone used git.  At best it will
> take a little while before everyone upgrades to a version of git diff
> containing your changes, and if possibly even longer if they have to
> start specifying an additional option when a diff is generated.

That's a valid concern. It may take a year or more before the new
feature reaches a reasonable share of users.

To speed up the process, we could advertise the new git option in the
0day robot's error reports. Since we catch errors in ~10 LKML patches
each day, within months most kernel developers should have received
the tips on how to set it up and enable the feature by default.

Thanks,
Fengguang