Hi,

On 2019-05-22 12:43, Sam Hartman wrote:

> So, I think it's problematic to apply old assumptions to new areas. The
> reproducible builds world has gotten a lot further with bit-for-bit
> identical builds than I ever imagined they would.
I overhauled the reproducibility section and lowered the reproducibility standard from "bit-by-bit" to "numerically", which is the most practical choice for now. We can always raise the bar in the future if the state of reproducibility improves.

> However, what's actually needed in the deep learning context is weaker
> than bit-for-bit identical. What we need is a way to validate that two
> models are identical for some equality predicate that meets our security
> and safety (and freedom) concerns. Parallel computation in the
> training, the sort of floating point issues you point to, and a lot of
> other things may make bit-for-bit identical models hard to come by.

Indeed. I call this "numerically reproducible":
https://salsa.debian.org/lumin/deeplearning-policy#neural-network-reproducibility

> Obviously we need to validate the correctness of whatever comparison
> function we use. The checksums match is relatively easy to validate.
> Something that for example understood floating point numbers would have
> a greater potential for bugs than an implementation of say sha256.
>
> So, yeah, bit-for-bit identical is great if we can get it. But
> validating these models is important enough that if we need to use a
> different equality predicate it's still worth doing.

For now, we just need to compare the digits and the curves: train twice without any modification, and check whether the training curves and the resulting numbers are the same. Further measures, I think, depend on how this field evolves.
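To make the "compare the digits" idea concrete, here is a minimal sketch of a numerical equality predicate for two training runs. The function name, the tolerance values, and the example arrays are my own illustrative assumptions, not part of the policy draft; it simply checks that corresponding weight arrays from two runs agree within a floating-point tolerance instead of demanding identical bits.

```python
# Hypothetical sketch: "numerically reproducible" as an equality
# predicate with a tolerance, rather than bit-for-bit identity.
# Tolerances and data below are illustrative assumptions.
import numpy as np

def numerically_equal(weights_a, weights_b, rtol=1e-5, atol=1e-8):
    """Return True if two lists of weight arrays match within tolerance."""
    if len(weights_a) != len(weights_b):
        return False
    return all(np.allclose(a, b, rtol=rtol, atol=atol)
               for a, b in zip(weights_a, weights_b))

# Two unmodified training runs should pass; a perturbed run should fail.
run1 = [np.array([0.1, 0.2]), np.array([0.3])]
run2 = [np.array([0.1, 0.2]), np.array([0.3])]
run3 = [np.array([0.1, 0.2 + 1e-3]), np.array([0.3])]

print(numerically_equal(run1, run2))  # True
print(numerically_equal(run1, run3))  # False
```

As Sam notes, such a floating-point comparator has more room for bugs than a plain sha256 check, so the predicate itself would need validation before being relied on.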