Hi Denis

I personally think that this is not a discussion about probabilities but
about fail-safe and robust implementation.

I understand that you do not want to add logic to the id generation that
can mitigate collisions (e.g. a collision bit), because the probability
that this logic ever solves a problem is "low" and the code must be
maintained. I agree that whether this feature is useful and necessary is a
question of probabilities.

Yet I think that anybody who hits a collision, which may still happen with
enough bad luck, wants the system to behave robustly and fail-safe, and
never to corrupt stored data. IMHO this is therefore not a question of
probabilities but a question of fail-safe implementation, and thus a
question of software quality.
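
To make concrete what I mean by fail-safe, here is a minimal sketch. It is
not XWiki code: the class CollisionSafeStore, the exception
IdCollisionError and the id_func parameter are all hypothetical names I
made up for illustration. The only idea is that the full name is stored
next to the hashed id and compared before anything is overwritten.

```python
class IdCollisionError(Exception):
    """Raised when two different names map to the same id (hypothetical)."""


class CollisionSafeStore:
    """Toy in-memory store keyed by a hashed id (illustration only)."""

    def __init__(self, id_func):
        # id_func maps a name to its id; any hash function can be plugged in.
        self._id_func = id_func
        self._rows = {}  # id -> (name, content)

    def save(self, name, content):
        key = self._id_func(name)
        existing = self._rows.get(key)
        # Refuse to overwrite a row that belongs to a different name.
        if existing is not None and existing[0] != name:
            raise IdCollisionError(
                f"id {key} already used by {existing[0]!r}, refusing {name!r}"
            )
        self._rows[key] = (name, content)

    def load(self, name):
        row = self._rows.get(self._id_func(name))
        # Never return content that was stored under a different name.
        if row is None or row[0] != name:
            return None
        return row[1]


# Usage with a deliberately weak id function (len) to force a collision.
store = CollisionSafeStore(id_func=len)
store.save("ab", "first")
try:
    store.save("cd", "second")
except IdCollisionError as error:
    print("detected:", error)
```

With such a check the second save fails loudly and the first document
stays intact, however unlikely the collision was.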

Just my 5 cents.
Regards,
Fabian


2018-02-08 9:24 GMT+01:00 Denis Gervalle <[email protected]>:

> Hi Marc and Thomas,
>
> I followed your discussion with great interest. I agree that Thomas' very
> light proposal is good to put in place, since it has almost no negative
> impact and only benefits. I think there is also a possibility to mitigate
> the object issue with something similar (checking the integrity of what we
> get, to at least detect an issue), but that's not perfect of course.
>
> That said, I would like to point you to this interesting question on
> StackOverflow
> (https://stackoverflow.com/questions/22029012/probability-of-64bit-hash-code-collisions)
> and remind you that, based on the Birthday Paradox, with the release of 4.x
> we raised our worrying threshold of documents/objects from 65535 to more
> than 4 billion… and it took a while (4 versions of XWiki) before we had the
> strong feeling that we needed to raise it. So, while the worrying threshold
> was really low before 4.x, the actual occurrence of a collision was already
> rare.
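
For reference, the two thresholds can be reproduced with the usual
birthday approximation. This is only a sketch: I read 65535 and "more than
4 billion" as roughly 2^16 and 2^32, i.e. the square roots of a 32-bit and
a 64-bit id space, and the function name collision_probability is my own.

```python
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance of at least one collision among n_items ids.

    Uses the birthday approximation p ~ 1 - exp(-n(n-1) / (2 * 2**bits)),
    which assumes the ids are uniformly distributed.
    """
    exponent = n_items * (n_items - 1) / (2 * 2**bits)
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-exponent)


# At the "worrying threshold" (about sqrt of the id space) the chance of a
# collision is roughly 39% for both id sizes.
print(collision_probability(2**16, 32))
print(collision_probability(2**32, 64))

# A wiki with a million documents is far below the 64-bit threshold.
print(collision_probability(10**6, 64))
```

So the numbers support your point about how rare a collision is; my point
is only about what happens when one occurs anyway.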
>
> My own experience was that the risk before 4.x was really high with
> generated names, much higher than with names used by real users. When I was
> hit by that issue, I remember feeling really bad about it. This is also
> probably why you raised this thread. The previous hash was too small and
> also had a questionable distribution.
>
> The MD5 algorithm, like many crypto hashes, is particularly well suited for
> providing a good distribution
> (http://michiel.buddingh.eu/distribution-of-hash-values); truncating it to
> 64 bits may degrade this, but I doubt it would be significant for us. So,
> personally, I feel really comfortable with the current implementation, and
> I think you can sleep in peace as well.
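
As an illustration of the truncation, a sketch of how a 64-bit id could be
cut from an MD5 digest, plus a crude distribution check on generated names.
I am assuming here that the id is simply the first 8 bytes of the digest of
the UTF-8 encoded name; the real key construction may differ, and the names
id64 and bucket_counts are hypothetical.

```python
import hashlib


def id64(name: str) -> int:
    """Signed 64-bit id from the first 8 bytes of the MD5 digest (assumed)."""
    digest = hashlib.md5(name.encode("utf-8")).digest()  # 16 bytes
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


def bucket_counts(names, buckets_bits: int = 4):
    """Count how many ids fall into each bucket given by their top bits."""
    counts = [0] * (2**buckets_bits)
    for name in names:
        unsigned = id64(name) & 0xFFFFFFFFFFFFFFFF  # two's complement view
        counts[unsigned >> (64 - buckets_bits)] += 1
    return counts


# Generated, very similar names are the case that hurt with the old hash.
generated = [f"Space.Page{i}" for i in range(10000)]
print(bucket_counts(generated))
```

If the truncated hash is well distributed, the 16 counts should all be
close to 10000 / 16 = 625, even for such similar names.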
>
> Just my thoughts about not raising fears when they are no longer really
> justified.
> Regards,
>
> --
> Denis Gervalle
> SOFTEC sa - CEO
>
