Re: [PATCH v5 5/9] patch-id: document new behaviour
On Thu, Apr 24, 2014 at 03:12:14PM -0700, Junio C Hamano wrote:
> Michael S. Tsirkin m...@redhat.com writes:
>
> >> > +--unstable::
> >> > +	Use a non-symmetrical sum of hashes, such that reordering
> >>
> >> What is a non-symmetrical sum?
> >
> > Non-symmetrical combination function is better?
>
> I do not think either is very good X-.
>
> The primary points to convey for --stable are:
>
>  - Two patches produced by comparing the same two trees with two
>    different settings for -Oorderfile will result in the same patch
>    signature, thereby allowing the computed result to be used as a
>    key to index some metainformation about the change between the
>    two trees;
>
>  - It will produce a result different from what the plain vanilla
>    patch-id has always produced, even when used on a diff output
>    taken without any use of -Oorderfile, thereby making existing
>    databases keyed by patch-ids unusable.
>
> The fact that we happened to use a patch-id that catches that
> somebody reordered the same patch into a different file order and
> declares that they are two different changes is more a historical
> accident than a designed goal.  I would even say that we would have
> used the stable version from the beginning if we thought that
> -Oorderfile would be widely used when these two features both
> appeared.
>
> Even though I was the guilty one who introduced it, I'd admit that
> -Oorderfile has merely been a curiosity from its inception and has
> been a failed experiment, not in the sense that the feature does not
> work as advertised (it does), but in the sense that it is not widely
> used at all (evidenced by the lack of complaints about the missing
> diff.orderfile for a long time).  With -Oorderfile being a failed
> experiment, the instability did not matter, so it has stuck.
>
> The only two things worth mentioning about --unstable, if our future
> direction is to see diff.orderfile and --stable a lot more widely
> used, are:
>
>  (1) it keeps producing the same patch-id as existing versions of
>      Git, so users with existing databases (who do not deal with
>      reordered patches) may want to use it; and perhaps
>
>  (2) it will not consider a patch taken with -Oorderfile and another
>      taken without it from the same source to be the same patch.
>
> Mathematically speaking, mentioning "non-symmetrical" might be one
> way of expressing the latter point (2), but stressing that alone
> without mentioning (1) misses the point.  (2) is _not_ a designed
> feature, so it is not very interesting.  Unless you have an existing
> database, there is no reason to use --unstable.  On the other hand,
> (1) is a very relevant thing to mention, as we are talking about a
> feature that, if unused, may break existing users' data.

OK, I did just that. Please take a look.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5 5/9] patch-id: document new behaviour
Clarify that patch ID can now be a sum of hashes, not a hash.
Document how command line and config options affect the
behaviour.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 Documentation/git-patch-id.txt | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-patch-id.txt b/Documentation/git-patch-id.txt
index 312c3b1..e21b79b 100644
--- a/Documentation/git-patch-id.txt
+++ b/Documentation/git-patch-id.txt
@@ -8,14 +8,14 @@ git-patch-id - Compute unique ID for a patch
 SYNOPSIS
 [verse]
-'git patch-id' patch
+'git patch-id' [--stable | --unstable] patch
 DESCRIPTION
 ---
-A patch ID is nothing but a SHA-1 of the diff associated with a patch, with
-whitespace and line numbers ignored.  As such, it's reasonably stable, but at
-the same time also reasonably unique, i.e., two patches that have the same patch
-ID are almost guaranteed to be the same thing.
+A patch ID is nothing but a sum of SHA-1 of the diff hunks associated with a
+patch, with whitespace and line numbers ignored.  As such, it's reasonably
+stable, but at the same time also reasonably unique, i.e., two patches that
+have the same patch ID are almost guaranteed to be the same thing.
 IOW, you can use this thing to look for likely duplicate commits.
@@ -27,6 +27,19 @@ This can be used to make a mapping from patch ID to commit ID.
 OPTIONS
 ---
+
+--stable::
+	Use a symmetrical sum of hashes as the patch ID.
+	With this option, reordering file diffs that make up a patch or
+	splitting a diff up to multiple diffs that touch the same path
+	does not affect the ID.
+	This is the default if patchid.stable is set to true.
+
+--unstable::
+	Use a non-symmetrical sum of hashes, such that reordering
+	or splitting the patch does affect the ID.
+	This is the default.
+
 patch::
 	The diff to create the ID of.
-- MST
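Editorial illustration of the "symmetrical sum" wording in the patch above. This is a minimal Python sketch, not Git's implementation: the helper names (`symmetric_id`, `chained_id`) and the toy diffs are hypothetical, and a single running SHA-1 is used here only as a stand-in for an order-dependent combination. It shows why adding per-file digests makes the result independent of file order, while feeding everything through one running hash does not.

```python
import hashlib

MODULUS = 1 << 160  # SHA-1 digests are 160 bits wide


def symmetric_id(file_diffs):
    """Order-independent ID: add the per-file SHA-1 digests modulo 2**160.

    Addition is commutative, so the order of file_diffs does not matter.
    """
    total = 0
    for diff in file_diffs:
        digest = hashlib.sha1(diff.encode()).digest()
        total = (total + int.from_bytes(digest, "big")) % MODULUS
    return f"{total:040x}"


def chained_id(file_diffs):
    """Order-dependent ID: feed every file diff into one running SHA-1."""
    running = hashlib.sha1()
    for diff in file_diffs:
        running.update(diff.encode())
    return running.hexdigest()


# Two toy per-file diffs (hypothetical content).
diff_foo = "--- a/foo.c\n+++ b/foo.c\n-int x;\n+long x;\n"
diff_bar = "--- a/bar.c\n+++ b/bar.c\n-return 0;\n+return 1;\n"

forward = [diff_foo, diff_bar]
reordered = [diff_bar, diff_foo]

same_symmetric = symmetric_id(forward) == symmetric_id(reordered)
same_chained = chained_id(forward) == chained_id(reordered)
print(same_symmetric, same_chained)  # prints: True False
```

The symmetric variant gives up the ability to notice reordering, which is exactly the trade-off being discussed in this thread.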
Re: [PATCH v5 5/9] patch-id: document new behaviour
Michael S. Tsirkin wrote:

>  Documentation/git-patch-id.txt | 23 ++-
>  1 file changed, 18 insertions(+), 5 deletions(-)

Ah, there's the documentation.  Please squash this with the patch that
introduces the new behavior so they can be reviewed together more
easily (both now and later when people do archeology).

[...]
> +--stable::
> +	Use a symmetrical sum of hashes as the patch ID.
> +	With this option, reordering file diffs that make up a patch or
> +	splitting a diff up to multiple diffs that touch the same path
> +	does not affect the ID.
> +	This is the default if patchid.stable is set to true.

This doesn't explain to me why I would want to use --stable versus
--unstable.  Maybe an EXAMPLES section would help?

The only reason I can think of to use --unstable is for compatibility
with historical patch-ids.  Is there any other reason?

At this point in the series there is no patchid.stable configuration.

> +--unstable::
> +	Use a non-symmetrical sum of hashes, such that reordering

What is a non-symmetrical sum?

Thanks,
Jonathan
Re: [PATCH v5 5/9] patch-id: document new behaviour
On Thu, Apr 24, 2014 at 10:33:25AM -0700, Jonathan Nieder wrote:
> Michael S. Tsirkin wrote:
>
> >  Documentation/git-patch-id.txt | 23 ++-
> >  1 file changed, 18 insertions(+), 5 deletions(-)
>
> Ah, there's the documentation.  Please squash this with the patch that
> introduces the new behavior so they can be reviewed together more
> easily (both now and later when people do archeology).
>
> [...]
> > +--stable::
> > +	Use a symmetrical sum of hashes as the patch ID.
> > +	With this option, reordering file diffs that make up a patch or
> > +	splitting a diff up to multiple diffs that touch the same path
> > +	does not affect the ID.
> > +	This is the default if patchid.stable is set to true.
>
> This doesn't explain to me why I would want to use --stable versus
> --unstable.  Maybe an EXAMPLES section would help?
>
> The only reason I can think of to use --unstable is for compatibility
> with historical patch-ids.  Is there any other reason?
>
> At this point in the series there is no patchid.stable configuration.
>
> > +--unstable::
> > +	Use a non-symmetrical sum of hashes, such that reordering
>
> What is a non-symmetrical sum?

Non-symmetrical combination function is better?

> Thanks,
> Jonathan
Re: [PATCH v5 5/9] patch-id: document new behaviour
Michael S. Tsirkin m...@redhat.com writes:

>> > +--unstable::
>> > +	Use a non-symmetrical sum of hashes, such that reordering
>>
>> What is a non-symmetrical sum?
>
> Non-symmetrical combination function is better?

I do not think either is very good X-.

The primary points to convey for --stable are:

 - Two patches produced by comparing the same two trees with two
   different settings for -Oorderfile will result in the same patch
   signature, thereby allowing the computed result to be used as a
   key to index some metainformation about the change between the
   two trees;

 - It will produce a result different from what the plain vanilla
   patch-id has always produced, even when used on a diff output
   taken without any use of -Oorderfile, thereby making existing
   databases keyed by patch-ids unusable.

The fact that we happened to use a patch-id that catches that
somebody reordered the same patch into a different file order and
declares that they are two different changes is more a historical
accident than a designed goal.  I would even say that we would have
used the stable version from the beginning if we thought that
-Oorderfile would be widely used when these two features both
appeared.

Even though I was the guilty one who introduced it, I'd admit that
-Oorderfile has merely been a curiosity from its inception and has
been a failed experiment, not in the sense that the feature does not
work as advertised (it does), but in the sense that it is not widely
used at all (evidenced by the lack of complaints about the missing
diff.orderfile for a long time).  With -Oorderfile being a failed
experiment, the instability did not matter, so it has stuck.

The only two things worth mentioning about --unstable, if our future
direction is to see diff.orderfile and --stable a lot more widely
used, are:

 (1) it keeps producing the same patch-id as existing versions of
     Git, so users with existing databases (who do not deal with
     reordered patches) may want to use it; and perhaps

 (2) it will not consider a patch taken with -Oorderfile and another
     taken without it from the same source to be the same patch.

Mathematically speaking, mentioning "non-symmetrical" might be one way
of expressing the latter point (2), but stressing that alone without
mentioning (1) misses the point.  (2) is _not_ a designed feature, so
it is not very interesting.  Unless you have an existing database,
there is no reason to use --unstable.  On the other hand, (1) is a
very relevant thing to mention, as we are talking about a feature
that, if unused, may break existing users' data.
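Editorial illustration of the compatibility point (1) above. This is a rough Python sketch under stated assumptions: `legacy_style_id` and `stable_style_id` are hypothetical stand-ins, not the functions Git uses, and the "database" is just a dictionary keyed by ID. It shows that keys recorded with one scheme cannot be looked up with the other, and that only the order-independent scheme survives a reordered diff.

```python
import hashlib


def legacy_style_id(file_diffs):
    """Hypothetical order-dependent ID: one running SHA-1 over all file diffs."""
    running = hashlib.sha1()
    for diff in file_diffs:
        running.update(diff.encode())
    return running.hexdigest()


def stable_style_id(file_diffs):
    """Hypothetical order-independent ID: per-file SHA-1 digests summed mod 2**160."""
    total = sum(int(hashlib.sha1(d.encode()).hexdigest(), 16) for d in file_diffs)
    return f"{total % (1 << 160):040x}"


# A toy two-file patch and the same patch emitted in a different file order,
# as a different orderfile setting might produce.
patch = [
    "--- a/foo.c\n+++ b/foo.c\n-int x;\n+long x;\n",
    "--- a/bar.c\n+++ b/bar.c\n-return 0;\n+return 1;\n",
]
reordered = list(reversed(patch))

# An existing database whose keys were computed with the legacy-style ID.
database = {legacy_style_id(patch): "notes recorded for this change"}

found_same_order = legacy_style_id(patch) in database       # existing key still matches
found_reordered = legacy_style_id(reordered) in database    # reordering changes the key
found_new_scheme = stable_style_id(patch) in database       # new scheme, old keys: no match

# A database built from scratch with the order-independent ID.
new_database = {stable_style_id(patch): "notes recorded for this change"}
found_reordered_new = stable_style_id(reordered) in new_database

print(found_same_order, found_reordered, found_new_scheme, found_reordered_new)
```

Under these assumptions, someone with an existing database would keep using the legacy-style ID, while a fresh database loses nothing by using the order-independent one.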