Re: [tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw
> Date: Wed, 12 Dec 2018 17:47:05 +0200 > From: George Kadianakis > > I followed your suggestion and turned the tests into deterministic by > sampling from a deterministic randomness source. I verified that all the > crypto_rand() call outputs are now the same between the 32-bit mingw > build and the 64-bit gcc one: [...] Although it is worthwhile to separate (a) a nondeterministic entropy source from (b) a deterministic PRNG with seeds you can save to get reproducible results...it turns out all that was a red herring, and the culprit was an incautious linear interpolation I had written in the binning for stochastic tests. Fortunately the problematic code is limited to the tests and does not run in the tor daemon! Lerping -- computing t |---> a + (b - a)*t = (1 - t)*a + t*b -- is deceptive in its apparent simplicity, and it just happened that no trouble arose in binary64 arithmetic, which is what you get on just about every machine on the planet _except_ 32-bit x86 in binary80 mode -- which mingw32 uses. (And the m68k floating-point coprocessor, but one doesn't see those much these days.) Fortunately you don't have to use Windows to run tests -- `gcc -m32' on an amd64 system should do just as well, and you can make sure you're in binary80 mode with uint32_t cwsave, cw = 0x037f; asm volatile("fnstcw %0; fldcw %1" : "=m"(cwsave) : "m"(cw)); ... asm volatile("fldcw %0" : : "m"(cwsave)); (On NetBSD, this is spelled `fp_prec_t p = fpsetprec(FP_PE); ...; fpsetprec(p)', but I'm not sure anyone else has adopted that.) Binary80 arithmetic tickled a problem in the lerp used for binning -- a problem which I took some pains to avoid in sample_uniform_interval (see the long theorem in the comments there). By `binning' I mean dividing the support of each distribution into histogram bins, starting at the 1st percentile and ending at the 99th percentile, and uniformly spaced in the support with a linear interpolation to pick the bin boundaries. We then use the CDF or SF to compute the probability a sample falls into each bin, count a lot of samples, and use a psi test to assess whether the sample counts are close enough to the expected counts. (There's nothing magic about uniform spacing; I just pulled it out of my arse, and it helped me catch bugs.) When beta is small, the log-logistic distribution has a very fat tail, so the 1st percentile and the 99th percentile are very far away (e.g., for alpha=e and beta=1/10, the 1st percentile is ~2.5e-20 and the 99th percentile is ~3e+20). The naive lerp I had written in bin_cdfs to compute an equispaced partition between these two points overshot in this case in binary80 arithmetic, which led to an intermediate NaN computation, which made the test fail 100% of the time. The attached patch uses a slightly less naive lerp. It's not necessarily monotonic but that doesn't really matter here since we're only examining 100 histogram bins -- overcounting by the width of one or two consecutive floating-point numbers is essentially inconsequential at this scale for these tests. >From 1338258a0d6a76bbe30bf03b90acff4559c24120 Mon Sep 17 00:00:00 2001 From: Taylor R Campbell Date: Thu, 13 Dec 2018 02:46:27 + Subject: [PATCH] Use a lerp that never overshoots when determining bin boundaries. In binary80 arithmetic, the bins for the log-logistic distribution, which has a very fat tail, were computed with a value that overshot the bounds, leading to a NaN intermediate. One defensive way to avoid this would be to change the cdf_* and sf_* functions for distributions of bounded support -- like log-logistic, which is supported only on (0, +\infty) -- to return -/+inf for points outside the bounds. But for testing purposes it might be better not to work defensively like that because it might mask upstream problems. --- src/test/test_prob_distr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/test/test_prob_distr.c b/src/test/test_prob_distr.c index 002497e08..16919e0f4 100644 --- a/src/test/test_prob_distr.c +++ b/src/test/test_prob_distr.c @@ -1002,14 +1002,14 @@ bin_cdfs(const struct dist *dist, double lo, double hi, double *logP, size_t n) logP[0] = log(CDF(x_1) - 0); /* 0 = CDF(-inf) */ for (i = 1; i < n2; i++) { x_0 = x_1; -x_1 = lo + i*w; +x_1 = (i <= n/2 ? lo + i*w : hi - (n - 2 - i)*w); logP[i] = log(CDF(x_1) - CDF(x_0)); } x_0 = hi; logP[n - 1] = log(SF(x_0) - 0); /* 0 = SF(+inf) = 1 - CDF(+inf) */ for (i = 1; i < n - n2; i++) { x_1 = x_0; -x_0 = hi - i*w; +x_0 = (i <= n/2 ? hi - i*w : lo + (n - 2 - i)*w); logP[n - i - 1] = log(SF(x_0) - SF(x_1)); } #undef SF -- 2.19.1 ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw
George Kadianakis writes: > Hello Riastradh, > > as discussed on IRC, Appveyor recently started failing the stochastic > tests of LogLogistic on 32-bit builds: > https://github.com/torproject/tor/pull/576 > https://ci.appveyor.com/project/torproject/tor/builds/20897462 > > I managed to reproduce the breakage by cross-compiling Tor and running > the tests with wine, using this script of ahf: > https://github.com/ahf/tor-win32/ > > Here are my findings: > > The following two test cases are breaking 100% reproducibly: > > ok = test_stochastic_log_logistic_impl(M_E, 1e-1); > ok = test_stochastic_log_logistic_impl(exp(-10), 1e-2); > And here are some updates: I followed your suggestion and turned the tests into deterministic by sampling from a deterministic randomness source. I verified that all the crypto_rand() call outputs are now the same between the 32-bit mingw build and the 64-bit gcc one: https://github.com/asn-d6/tor/commit/3d8c86c2f08ad2cc7ed030bbf8e11b110351f5c8 I then focused on the test_stochastic_log_logistic_impl(M_E, 1e-1) test case and tried to figure out where the deviation was happening between 64-bit gcc and 32-bit mingw... That took a while but I finally got some figures. Check out my commit that adds some printfs as well: https://github.com/asn-d6/tor/commit/36999c640fe824ab9fb85b5d2cd15017a97a532f So using the output from that that commit I noticed that many times log_logistic_sample() would give different outputs in these two systems. In particular sometimes the x value would differ even with the same (s, p0) pair, and other times the x value would be the same but the final alpha*pow(x,1/beta) value would differ. Even tho this is the case, the test would only fail for certain values for beta (as mentioned in my previous email). I now inline various such failure cases and one correct one: Case #1 (same x, different sample value): mingw-32: beta: 0x1.ap-4 s: 3122729323, p0: 0x1.68d18a44b82fbp-1 x: 0x1.d686a1e7fa35p+0 alpha*pow(x, 1/beta): 0x1.2affd5bfff433p+10 gcc-64: beta: 0x1.ap-4 s: 3122729323, p0: 0x1.68d18a44b82fbp-1 x: 0x1.d686a1e7fa35p+0 alpha*pow(x, 1/beta): 0x1.2affd5bfff434p+10 Case #2 (same x, different sample value): mingw-32: beta: 0x1.ap-4 s: 738208646, p0: 0x1.a1ecd53def5d3p-2 x: 0x1.068987864c2aep-2 alpha*pow(x, 1/beta): 0x1.bfba380255bb8p-19 linux: beta: 0x1.ap-4 s: 738208646, p0: 0x1.a1ecd53def5d3p-2 x: 0x1.068987864c2aep-2 alpha*pow(x, 1/beta): 0x1.bfba380255bb9p-19 Case #3 (different x, different sample value): mingw-32: beta: 0x1.ap-4 s: 95364755, p0: 0x1.575b5ea720e3cp-1 x: 0x1.fb7949976ab04p+0 alpha*pow(x, 1/beta): 0x1.3e605e169e8cbp+11 gcc-64: beta: 0x1.ap-4 s: 95364755, p0: 0x1.575b5ea720e3cp-1 x: 0x1.fb7949976ab03p+0 alpha*pow(x, 1/beta): 0x1.3e605e169e8c5p+11 Case #4 (different x, different sample value): mingw-32: beta: 0x1.ap-4 s: 2082443965, p0: 0x1.530a8759113bp-2 x: 0x1.42989e50ac641p+2 alpha*pow(x, 1/beta): 0x1.b724d48bf0f6cp+24 gcc-64: beta: 0x1.ap-4 s: 2082443965, p0: 0x1.530a8759113bp-2 x: 0x1.42989e50ac64p+2 alpha*pow(x, 1/beta): 0x1.b724d48bf0f5ep+24 Case #5 (different x, different sample value): mingw-32: beta: 0x1.ap-4 s: 443038967, p0: 0x1.b0124b971bbf3p-4 x: 0x1.1f5b72f5f6a3ep+4 alpha*pow(x, 1/beta): 0x1.143a16cdae94fp+43 gcc-64: beta: 0x1.ap-4 s: 443038967, p0: 0x1.b0124b971bbf3p-4 x: 0x1.1f5b72f5f6a3fp+4 alpha*pow(x, 1/beta): 0x1.143a16cdae958p+43 Case #6 (same sample value): mingw-32: beta: 0x1.ap-4 s: 2932701594, p0: 0x1.b407f600e6d87p-1 x: 0x1.7bb183ccc47efp-1 alpha*pow(x, 1/beta): 0x1.181016f03c09p-3 gcc-64: beta: 0x1.ap-4 s: 2932701594, p0: 0x1.b407f600e6d87p-1 x: 0x1.7bb183ccc47efp-1 alpha*pow(x, 1/beta): 0x1.181016f03c09p-3 ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev