On Sat, Mar 16, 2019, 10:02 Richard Dodd <[email protected]> wrote:
> I'm writing to ask about the version comparison function `rpmvercmp` in > `libalpm/version.cL83`. The algorithm is complex and I'm trying to > understand its behavior. I want to hash using the package name and version > as a key, and so I need a hash function where `i1 == i2 => hash(i1) == > hash(i2)` according to the version comparison operation. I'll describe how > the algorithm behaves and then ask my questions. > Could you explain what your actual goal is? Having ordering in a hash function sounds extremely odd. Generally, you care about associativity *or* ordering, not both. The algorithm works on a byte string and uses ascii comparison rules (no > unicode). > > - First, split the input up into blocks of *alpha*, *digit* or > *non-alphanum*. > - For each pair of blocks > - If the types are different, then *non-alphanum* is newer than > *numeric*, which is newer than *alpha* > - If the types are the same, then the rules are > - For *non-alphanum*, compare lengths, longer is newer, equal lengths > are equal segments (so *--* and *::* are the same) > - For *alpha* just do a lexicographic comparison (so *b* is newer > than *a* etc.) > - For *numeric*, do a numeric comparison. (this can be done by > skipping leading zeros, then comparing length, then comparing > lexicographically, to avoid overflow on integer conversion) > - If one input is longer than the other, and all sections so far have > been equal, then if the next section of the longer is *alpha*, it is older, > and if it is *numeric* it is newer. (so "1a" is older than "1", "a1" is > newer than "a"). > - If the inputs have the same number of sections that are all equal, then > they are equal. > > Questions: > > 1. Is the algorithm correctly described here. > 2. This should mean that if I hash length for *non-numeric*, the string > with stripped zeros for *numeric*, and just the string for *alpha*, the has > law should be upheld. Is this correct? > > Thanks > Rich >
