Re: [PATCHv2 0/7] Fix and generalize version sort reordering
On Tue, Dec 20, 2016 at 09:50:42AM +0100, SZEDER Gábor wrote: > > It just seems like the whole thing would conceptually easier if we > > pre-parsed the versions into a sequence of elements, then the comparison > > between any two elements would just walk that sequence. The benefit > > there is that you can implement whatever rules you like for the parsing > > (like "prefer longer suffixes to shorter"), but you know the comparison > > will always be consistent. > > I considered parsing tagnames into prefix, version number and suffix, > and then work from that, but decided against it. > > versioncmp() is taken from glibc, so I assume that it's thoroughly > tested, even in corner cases (e.g. multiple leading zeros). > Furthermore, I think it's a good thing that by default (i.e. without > suffix reordering) our version sort orders the same way as glibc's > version sort does. Introducing a different algorithm would risk bugs > in the more subtle cases. Fair enough. If it's working, I agree there is risk in changing things. And if you're willing to deal with the bugs, then I'm happy to stand back. :) > Then there are all the weird release suffixes out there, and I didn't > want to decide on a policy for splitting them sanely; don't know > whether there exist any universal rules for this splitting at > all. E.g. one of the packages here has the following version (let's > ignore the fact that because of the '~' this is an invalid refname in > git): I have a hunch that any policy you'd have to set for splitting is going to end up becoming a policy you'll have to use when comparing (or you risk violating the transitivity of your comparison function). But that's just a hunch, not a proof. Again, I'm happy to defer to you if you're the one working on it. -Peff
Re: [PATCHv2 0/7] Fix and generalize version sort reordering
On Wed, Dec 14, 2016 at 6:08 PM, Jeff Kingwrote: > On Thu, Dec 08, 2016 at 03:23:54PM +0100, SZEDER Gábor wrote: > >> > With my patches it looks like this: >> > >> > $ git -c versionsort.prereleasesuffix=-pre \ >> > -c versionsort.prereleasesuffix=-prerelease \ >> > tag -l --sort=version:refname >> > v1.0.0-prerelease1 >> > v1.0.0-pre1 >> > v1.0.0-pre2 >> > v1.0.0 >> >> And when there happen to be more than one matching suffixes starting >> at the same earliest position, then we should pick the longest of >> them. The new patch 6/7 implements this behavior. > > The whole approach taken by the suffix code (before your patches) leaves > me with the nagging feeling that the comparison is not always going to > be transitive (i.e., that "a < b && b < c" does not always imply "a < > c"), which is going to cause nonsensical sorting results. > > And that may be part of the issue your 6/7 fixes. > > I spent some time playing with this the other day, though, and couldn't > come up with a specific example that fails the condition above. > > It just seems like the whole thing would conceptually easier if we > pre-parsed the versions into a sequence of elements, then the comparison > between any two elements would just walk that sequence. The benefit > there is that you can implement whatever rules you like for the parsing > (like "prefer longer suffixes to shorter"), but you know the comparison > will always be consistent. I considered parsing tagnames into prefix, version number and suffix, and then work from that, but decided against it. versioncmp() is taken from glibc, so I assume that it's thoroughly tested, even in corner cases (e.g. multiple leading zeros). Furthermore, I think it's a good thing that by default (i.e. without suffix reordering) our version sort orders the same way as glibc's version sort does. Introducing a different algorithm would risk bugs in the more subtle cases. Then there are all the weird release suffixes out there, and I didn't want to decide on a policy for splitting them sanely; don't know whether there exist any universal rules for this splitting at all. E.g. one of the packages here has the following version (let's ignore the fact that because of the '~' this is an invalid refname in git): 1.1.0~rc1-2ubuntu7-1linuxmint1 Now, it's clear that the version number is "1.1.0", and the user should configure the suffix "~rc" for prerelease reordering. But what about the rest? How should we split it "into a sequence of elements", is it { "1.1.0", "~rc1", "-2ubuntu7", "-1linuxmint1" } or { "1.1.0", "~rc1-2", "ubuntu7-1", "linuxmint1" }? What if there is a hard-working developer who is involved in a lot of Debian derivatives (and derivatives of derivatives...), and, for whatever reason, wants to put derivative-specific versions in a particular order? With my series, or conceptually even with master if it weren't buggy, it's possible to specify the order of suffixes of suffixes, and that dev could do this: $ git -c versionsort.suffix=-rc -c versionsort.suffix=linuxmint -c versionsort.suffix=YADoaDD tag -l --sort=version:refname '1.1.0*' 1.1.0-rc1-2ubuntu7-1linuxmint1 1.1.0-rc1-2ubuntu7-1YADoaDD2 1.1.0 1.1.0-2ubuntu7-1linuxmint1 1.1.0-2ubuntu7-1YADoaDD2 and would get Linux Mint-specific tags before "Yet Another Derivative of a Debian Derivative"-specific ones. Not sure whether this is relevant in practice, but I think it's a nice property nonetheless. (Btw, just for fun, I also found a package version 2.0.0~beta2+isreally1.8.6-0ubuntu1. "isreally". Oh yeah :) > It would also be more efficient, I think (it seems like the sort is > O(nr_tags * lg(nr_tags) * nr_suffixes) due to parsing suffixes in the > comparator). Though that probably doesn't matter much in practice. I don't think there will be more than only a few configured suffixes in any repository. However, if you consider O as "number of starts_with() invocations", then there is an additional suffix_length factor. But then again, these suffixes tend to be short. > I dunno. I think maybe your 6/7 has converged on an equivalent behavior. > And I am certainly not volunteering to re-write it, so if what you have > works, I'm not opposed to it. > > -Peff
Re: [PATCHv2 0/7] Fix and generalize version sort reordering
Jeff Kingwrites: > On Thu, Dec 08, 2016 at 03:23:54PM +0100, SZEDER Gábor wrote: > >> > With my patches it looks like this: >> > >> > $ git -c versionsort.prereleasesuffix=-pre \ >> > -c versionsort.prereleasesuffix=-prerelease \ >> > tag -l --sort=version:refname >> > v1.0.0-prerelease1 >> > v1.0.0-pre1 >> > v1.0.0-pre2 >> > v1.0.0 >> >> And when there happen to be more than one matching suffixes starting >> at the same earliest position, then we should pick the longest of >> them. The new patch 6/7 implements this behavior. > > The whole approach taken by the suffix code (before your patches) leaves > me with the nagging feeling that the comparison is not always going to > be transitive (i.e., that "a < b && b < c" does not always imply "a < > c"), which is going to cause nonsensical sorting results. > > And that may be part of the issue your 6/7 fixes. > > I spent some time playing with this the other day, though, and couldn't > come up with a specific example that fails the condition above. > > It just seems like the whole thing would conceptually easier if we > pre-parsed the versions into a sequence of elements, then the comparison > between any two elements would just walk that sequence. The benefit > there is that you can implement whatever rules you like for the parsing > (like "prefer longer suffixes to shorter"), but you know the comparison > will always be consistent. > > It would also be more efficient, I think (it seems like the sort is > O(nr_tags * lg(nr_tags) * nr_suffixes) due to parsing suffixes in the > comparator). Though that probably doesn't matter much in practice. > > I dunno. I think maybe your 6/7 has converged on an equivalent behavior. > And I am certainly not volunteering to re-write it, so if what you have > works, I'm not opposed to it. I also had worries about transitiveness but couldn't break it after trying for some time. I find your pre-parsing suggestion a great one, not from the point of view of performance, but because I would imagine that the resulting logic would become a lot easier to understand.
Re: [PATCHv2 0/7] Fix and generalize version sort reordering
On Thu, Dec 08, 2016 at 03:23:54PM +0100, SZEDER Gábor wrote: > > With my patches it looks like this: > > > > $ git -c versionsort.prereleasesuffix=-pre \ > > -c versionsort.prereleasesuffix=-prerelease \ > > tag -l --sort=version:refname > > v1.0.0-prerelease1 > > v1.0.0-pre1 > > v1.0.0-pre2 > > v1.0.0 > > And when there happen to be more than one matching suffixes starting > at the same earliest position, then we should pick the longest of > them. The new patch 6/7 implements this behavior. The whole approach taken by the suffix code (before your patches) leaves me with the nagging feeling that the comparison is not always going to be transitive (i.e., that "a < b && b < c" does not always imply "a < c"), which is going to cause nonsensical sorting results. And that may be part of the issue your 6/7 fixes. I spent some time playing with this the other day, though, and couldn't come up with a specific example that fails the condition above. It just seems like the whole thing would conceptually easier if we pre-parsed the versions into a sequence of elements, then the comparison between any two elements would just walk that sequence. The benefit there is that you can implement whatever rules you like for the parsing (like "prefer longer suffixes to shorter"), but you know the comparison will always be consistent. It would also be more efficient, I think (it seems like the sort is O(nr_tags * lg(nr_tags) * nr_suffixes) due to parsing suffixes in the comparator). Though that probably doesn't matter much in practice. I dunno. I think maybe your 6/7 has converged on an equivalent behavior. And I am certainly not volunteering to re-write it, so if what you have works, I'm not opposed to it. -Peff
[PATCHv2 0/7] Fix and generalize version sort reordering
> After having slept on this a couple of times After having slept on it a few (erm...) more times... > I think the longest > matching prerelease suffix should determine the sorting order. > > A release tag usually consists of an optional prefix, e.g. 'v' or > 'snapshot-', followed by the actual version number, followed by an > optional suffix. In the contrived example quoted above this suffix > is '-foo-bar', thus it feels wrong to match '-bar' against it, when > the user explicitly configured '-foo-bar' as prerelease suffix as > well. At the risk of overthinking it: I think it would be more correct to say that the earliest starting matching prerelease suffix should determine the sorting order in the first place. > Then here is a more realistic case for sorting based on the longest > matching suffix, where > >- a longer suffix starts with the shorter one, >- and the longer suffix comes after the shorter one in the > configuration. > > With my patches it looks like this: > > $ git -c versionsort.prereleasesuffix=-pre \ > -c versionsort.prereleasesuffix=-prerelease \ > tag -l --sort=version:refname > v1.0.0-prerelease1 > v1.0.0-pre1 > v1.0.0-pre2 > v1.0.0 And when there happen to be more than one matching suffixes starting at the same earliest position, then we should pick the longest of them. The new patch 6/7 implements this behavior. Changes since v1: - Documentation, in-code comment and commit message updates. - Use different tagnames and suffixes in the tests, keeping the new tests simpler and changes to existing tests smaller. - Added a test of questionable value to try to check that we don't read memory before the tagname strings in case of an unusually long suffix. - Removed unnecessary {}. - Added two patches on top to address the corner cases and to introduce 'versionsort.suffix' for the resulting more general suffix reordering behavior. The interdiff at the bottom is between v1 and only the first five patches of v2, i.e. not including the two new patches. I think it's easier to judge the changes this way. SZEDER Gábor (7): t7004-tag: delete unnecessary tags with test_when_finished t7004-tag: use test_config helper t7004-tag: add version sort tests to show prerelease reordering issues versioncmp: pass full tagnames to swap_prereleases() versioncmp: cope with common part overlapping with prerelease suffix versioncmp: use earliest-longest contained suffix to determine sorting order versioncmp: generalize version sort suffix reordering Documentation/config.txt | 44 Documentation/git-tag.txt | 4 +- t/t7004-tag.sh| 132 +++--- versioncmp.c | 68 +--- 4 files changed, 196 insertions(+), 52 deletions(-) -- 2.11.0.78.g5a2d011 diff --git a/Documentation/config.txt b/Documentation/config.txt index 0bcb6790d..c1a9616e9 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -3008,8 +3008,12 @@ versionsort.prereleaseSuffix:: This variable can be specified multiple times, once per suffix. The order of suffixes in the config file determines the sorting order (e.g. if "-pre" appears before "-rc" in the config file then 1.0-preXX -is sorted before 1.0-rcXX). The sorting order between different -suffixes is undefined if they are in multiple config files. +is sorted before 1.0-rcXX). +If more than one suffixes match the same tagname, then that tagname will +be sorted according to the matching suffix which comes first in the +configuration. +The sorting order between different suffixes is undefined if they are +in multiple config files. web.browser:: Specify a web browser that may be used by some commands. diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh index f0cfe1fa3..c7aaace8c 100755 --- a/t/t7004-tag.sh +++ b/t/t7004-tag.sh @@ -1511,14 +1511,14 @@ test_expect_success 'invalid sort parameter in configuratoin' ' ' test_expect_success 'version sort with prerelease reordering' ' - test_config versionsort.prereleaseSuffix -beta && - git tag foo1.6-beta1 && - git tag foo1.6-beta2 && + test_config versionsort.prereleaseSuffix -rc && + git tag foo1.6-rc1 && + git tag foo1.6-rc2 && git tag -l --sort=version:refname "foo*" >actual && cat >expect <<-\EOF && foo1.3 - foo1.6-beta1 - foo1.6-beta2 + foo1.6-rc1 + foo1.6-rc2 foo1.6 foo1.10 EOF @@ -1526,54 +1526,49 @@ test_expect_success 'version sort with prerelease reordering' ' ' test_expect_success 'reverse version sort with prerelease reordering' ' - test_config versionsort.prereleaseSuffix -beta && + test_config versionsort.prereleaseSuffix -rc && git tag -l --sort=-version:refname "foo*" >actual && cat >expect <<-\EOF && foo1.10 foo1.6 - foo1.6-beta2 -