On 20/3/19 11:13 pm, Maarten de Vries wrote: > On Wed, 20 Mar 2019 at 12:19, Richard Dodd <[email protected]> wrote: >> >> I'm just trying to get a mathematically sound implementation of all the >> properties (traits in rust parlance) that the version exhibits. If there is >> an equivalence relation and a hash function as above, then I can store >> versions in a hash set/map. If there is a total ordering, then I can store >> versions in a binary tree set/map. I'm making a library, so I want to allow >> as much as possible without allowing anything unsound. Rust's type system >> allows me to make these promises in code, rather than just documenting them. >> >> You can see my implementation at >> https://github.com/derekdreery/alpm-experimental/blob/master/src/version.rs, >> if you're interested. >> >> >> On Sat, Mar 16, 2019 at 2:02 PM Richard Dodd <[email protected]> >> wrote: >> >>> I'm writing to ask about the version comparison function `rpmvercmp` in >>> `libalpm/version.cL83`. The algorithm is complex and I'm trying to >>> understand its behavior. I want to hash using the package name and version >>> as a key, and so I need a hash function where `i1 == i2 => hash(i1) == >>> hash(i2)` according to the version comparison operation. I'll describe how >>> the algorithm behaves and then ask my questions. >>> >>> The algorithm works on a byte string and uses ascii comparison rules (no >>> unicode). >>> >>> - First, split the input up into blocks of *alpha*, *digit* or >>> *non-alphanum*. >>> - For each pair of blocks >>> - If the types are different, then *non-alphanum* is newer than >>> *numeric*, which is newer than *alpha* >>> - If the types are the same, then the rules are >>> - For *non-alphanum*, compare lengths, longer is newer, equal >>> lengths are equal segments (so *--* and *::* are the same) >>> - For *alpha* just do a lexicographic comparison (so *b* is newer >>> than *a* etc.) >>> - For *numeric*, do a numeric comparison. (this can be done by >>> skipping leading zeros, then comparing length, then comparing >>> lexicographically, to avoid overflow on integer conversion) >>> - If one input is longer than the other, and all sections so far have >>> been equal, then if the next section of the longer is *alpha*, it is older, >>> and if it is *numeric* it is newer. (so "1a" is older than "1", "a1" is >>> newer than "a"). >>> - If the inputs have the same number of sections that are all equal, >>> then they are equal. >>> >>> Questions: >>> >>> 1. Is the algorithm correctly described here. >>> 2. This should mean that if I hash length for *non-numeric*, the string >>> with stripped zeros for *numeric*, and just the string for *alpha*, the has >>> law should be upheld. Is this correct? >>> >>> Thanks >>> Rich >>> >>> >>> > > I have a somewhat abandoned rust implementation of the version > comparison algorithm too: > https://github.com/de-vri-es/pacman-repo-tools/blob/master/src/version/compare.rs > > Assuming my implementation is correct, it's does define a total ordering. > > Maybe most interesting for you there is the test cases. I'm pretty > sure I verified all of them against `/usr/bin/vercmp` (the binary > executable shipped with pacman). You could even write the test cases > to actually run vercmp (at the cost of portability). Or you could use > vercmp to do some fuzz testing against randomly generated inputs. I > imagine you would quickly find most discrepancies, so you could > at-least use it to verify your algorithm against the reference > implementation. >
Note pacman has a test suite for version comparison. A
