Re: [tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw
> Date: Thu, 13 Dec 2018 03:25:30 + > From: Taylor R Campbell > > Binary80 arithmetic tickled a problem in the lerp used for binning -- > [...] Correction: while making the lerp less naive addresses this problem, it also arises only when binary80 arithmetic and binary64 arithmetic are mixed, which you get on 32-bit x86 in C double with the x87 unit in binary80 mode so (a) double is binary64, but (b) intermediate expressions are evaluated in binary80. The attached program shows this, by putting either one (bad) or two (good) intermediate expressions in double variables. On x86, if you compile it with -mfpmath=387 (default on 32-bit), you'll see the bad result, a negative answer; if you compile it with -mfpmath=sse (default on 64-bit), you'll see only good results, zero. Convert everything to use long double (and %Le) instead so that all the arithmetic and intermediate quantities are binary80, and it's fine. % cc -o loser -mfpmath=387 loser.c && ./loser bad -2.45760e+04 good 0.0e+00 % cc -o loser -mfpmath=sse loser.c && ./loser bad 0.0e+00 good 0.0e+00 (This is why I don't like x87 and the automagic evaluation of expressions in higher precision...) #include #ifdef __NetBSD__ #include #else #include #define FP_PS 0 #define FP_PRS 1 #define FP_PD 2 #define FP_PE 3 int fpsetprec(int nprec) { uint32_t ocw, ncw; int oprec; asm volatile("fnstcw %0" : "=m"(ocw)); oprec = (ocw >> 8) & 3; ncw = (ocw & ~(uint32_t)(3 << 8)) | (uint32_t)((nprec & 3) << 8); asm volatile("fldcw %0" : : "m"(ncw)); return oprec; } #endif int main(void) { volatile double lo = 2.4608250784829636e-20; volatile double hi = 3.0026742508190853e+20; volatile double w, d; volatile size_t n = 100; fpsetprec(FP_PE); w = (hi - lo)/(n - 2); d = (n - 2)*w; printf("bad %.17e\n", hi - (n - 2)*w); printf("good %.17e\n", hi - d); fflush(stdout); return ferror(stdout); } ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw
> Date: Wed, 12 Dec 2018 17:47:05 +0200 > From: George Kadianakis > > I followed your suggestion and turned the tests into deterministic by > sampling from a deterministic randomness source. I verified that all the > crypto_rand() call outputs are now the same between the 32-bit mingw > build and the 64-bit gcc one: [...] Although it is worthwhile to separate (a) a nondeterministic entropy source from (b) a deterministic PRNG with seeds you can save to get reproducible results...it turns out all that was a red herring, and the culprit was an incautious linear interpolation I had written in the binning for stochastic tests. Fortunately the problematic code is limited to the tests and does not run in the tor daemon! Lerping -- computing t |---> a + (b - a)*t = (1 - t)*a + t*b -- is deceptive in its apparent simplicity, and it just happened that no trouble arose in binary64 arithmetic, which is what you get on just about every machine on the planet _except_ 32-bit x86 in binary80 mode -- which mingw32 uses. (And the m68k floating-point coprocessor, but one doesn't see those much these days.) Fortunately you don't have to use Windows to run tests -- `gcc -m32' on an amd64 system should do just as well, and you can make sure you're in binary80 mode with uint32_t cwsave, cw = 0x037f; asm volatile("fnstcw %0; fldcw %1" : "=m"(cwsave) : "m"(cw)); ... asm volatile("fldcw %0" : : "m"(cwsave)); (On NetBSD, this is spelled `fp_prec_t p = fpsetprec(FP_PE); ...; fpsetprec(p)', but I'm not sure anyone else has adopted that.) Binary80 arithmetic tickled a problem in the lerp used for binning -- a problem which I took some pains to avoid in sample_uniform_interval (see the long theorem in the comments there). By `binning' I mean dividing the support of each distribution into histogram bins, starting at the 1st percentile and ending at the 99th percentile, and uniformly spaced in the support with a linear interpolation to pick the bin boundaries. We then use the CDF or SF to compute the probability a sample falls into each bin, count a lot of samples, and use a psi test to assess whether the sample counts are close enough to the expected counts. (There's nothing magic about uniform spacing; I just pulled it out of my arse, and it helped me catch bugs.) When beta is small, the log-logistic distribution has a very fat tail, so the 1st percentile and the 99th percentile are very far away (e.g., for alpha=e and beta=1/10, the 1st percentile is ~2.5e-20 and the 99th percentile is ~3e+20). The naive lerp I had written in bin_cdfs to compute an equispaced partition between these two points overshot in this case in binary80 arithmetic, which led to an intermediate NaN computation, which made the test fail 100% of the time. The attached patch uses a slightly less naive lerp. It's not necessarily monotonic but that doesn't really matter here since we're only examining 100 histogram bins -- overcounting by the width of one or two consecutive floating-point numbers is essentially inconsequential at this scale for these tests. >From 1338258a0d6a76bbe30bf03b90acff4559c24120 Mon Sep 17 00:00:00 2001 From: Taylor R Campbell Date: Thu, 13 Dec 2018 02:46:27 + Subject: [PATCH] Use a lerp that never overshoots when determining bin boundaries. In binary80 arithmetic, the bins for the log-logistic distribution, which has a very fat tail, were computed with a value that overshot the bounds, leading to a NaN intermediate. One defensive way to avoid this would be to change the cdf_* and sf_* functions for distributions of bounded support -- like log-logistic, which is supported only on (0, +\infty) -- to return -/+inf for points outside the bounds. But for testing purposes it might be better not to work defensively like that because it might mask upstream problems. --- src/test/test_prob_distr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/test/test_prob_distr.c b/src/test/test_prob_distr.c index 002497e08..16919e0f4 100644 --- a/src/test/test_prob_distr.c +++ b/src/test/test_prob_distr.c @@ -1002,14 +1002,14 @@ bin_cdfs(const struct dist *dist, double lo, double hi, double *logP, size_t n) logP[0] = log(CDF(x_1) - 0); /* 0 = CDF(-inf) */ for (i = 1; i < n2; i++) { x_0 = x_1; -x_1 = lo + i*w; +x_1 = (i <= n/2 ? lo + i*w : hi - (n - 2 - i)*w); logP[i] = log(CDF(x_1) - CDF(x_0)); } x_0 = hi; logP[n - 1] = log(SF(x_0) - 0); /* 0 = SF(+inf) = 1 - CDF(+inf) */ for (i = 1; i < n - n2; i++) { x_1 = x_0; -x_0 = hi - i*w; +x_0 = (i <= n/2 ? hi - i*w : lo + (n - 2 - i)*w); logP[n - i - 1] = log(SF(x_0) - SF(x_1)); } #undef SF -- 2.19.1 ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Re: [tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw
George Kadianakis writes: > Hello Riastradh, > > as discussed on IRC, Appveyor recently started failing the stochastic > tests of LogLogistic on 32-bit builds: > https://github.com/torproject/tor/pull/576 > https://ci.appveyor.com/project/torproject/tor/builds/20897462 > > I managed to reproduce the breakage by cross-compiling Tor and running > the tests with wine, using this script of ahf: > https://github.com/ahf/tor-win32/ > > Here are my findings: > > The following two test cases are breaking 100% reproducibly: > > ok = test_stochastic_log_logistic_impl(M_E, 1e-1); > ok = test_stochastic_log_logistic_impl(exp(-10), 1e-2); > And here are some updates: I followed your suggestion and turned the tests into deterministic by sampling from a deterministic randomness source. I verified that all the crypto_rand() call outputs are now the same between the 32-bit mingw build and the 64-bit gcc one: https://github.com/asn-d6/tor/commit/3d8c86c2f08ad2cc7ed030bbf8e11b110351f5c8 I then focused on the test_stochastic_log_logistic_impl(M_E, 1e-1) test case and tried to figure out where the deviation was happening between 64-bit gcc and 32-bit mingw... That took a while but I finally got some figures. Check out my commit that adds some printfs as well: https://github.com/asn-d6/tor/commit/36999c640fe824ab9fb85b5d2cd15017a97a532f So using the output from that that commit I noticed that many times log_logistic_sample() would give different outputs in these two systems. In particular sometimes the x value would differ even with the same (s, p0) pair, and other times the x value would be the same but the final alpha*pow(x,1/beta) value would differ. Even tho this is the case, the test would only fail for certain values for beta (as mentioned in my previous email). I now inline various such failure cases and one correct one: Case #1 (same x, different sample value): mingw-32: beta: 0x1.ap-4 s: 3122729323, p0: 0x1.68d18a44b82fbp-1 x: 0x1.d686a1e7fa35p+0 alpha*pow(x, 1/beta): 0x1.2affd5bfff433p+10 gcc-64: beta: 0x1.ap-4 s: 3122729323, p0: 0x1.68d18a44b82fbp-1 x: 0x1.d686a1e7fa35p+0 alpha*pow(x, 1/beta): 0x1.2affd5bfff434p+10 Case #2 (same x, different sample value): mingw-32: beta: 0x1.ap-4 s: 738208646, p0: 0x1.a1ecd53def5d3p-2 x: 0x1.068987864c2aep-2 alpha*pow(x, 1/beta): 0x1.bfba380255bb8p-19 linux: beta: 0x1.ap-4 s: 738208646, p0: 0x1.a1ecd53def5d3p-2 x: 0x1.068987864c2aep-2 alpha*pow(x, 1/beta): 0x1.bfba380255bb9p-19 Case #3 (different x, different sample value): mingw-32: beta: 0x1.ap-4 s: 95364755, p0: 0x1.575b5ea720e3cp-1 x: 0x1.fb7949976ab04p+0 alpha*pow(x, 1/beta): 0x1.3e605e169e8cbp+11 gcc-64: beta: 0x1.ap-4 s: 95364755, p0: 0x1.575b5ea720e3cp-1 x: 0x1.fb7949976ab03p+0 alpha*pow(x, 1/beta): 0x1.3e605e169e8c5p+11 Case #4 (different x, different sample value): mingw-32: beta: 0x1.ap-4 s: 2082443965, p0: 0x1.530a8759113bp-2 x: 0x1.42989e50ac641p+2 alpha*pow(x, 1/beta): 0x1.b724d48bf0f6cp+24 gcc-64: beta: 0x1.ap-4 s: 2082443965, p0: 0x1.530a8759113bp-2 x: 0x1.42989e50ac64p+2 alpha*pow(x, 1/beta): 0x1.b724d48bf0f5ep+24 Case #5 (different x, different sample value): mingw-32: beta: 0x1.ap-4 s: 443038967, p0: 0x1.b0124b971bbf3p-4 x: 0x1.1f5b72f5f6a3ep+4 alpha*pow(x, 1/beta): 0x1.143a16cdae94fp+43 gcc-64: beta: 0x1.ap-4 s: 443038967, p0: 0x1.b0124b971bbf3p-4 x: 0x1.1f5b72f5f6a3fp+4 alpha*pow(x, 1/beta): 0x1.143a16cdae958p+43 Case #6 (same sample value): mingw-32: beta: 0x1.ap-4 s: 2932701594, p0: 0x1.b407f600e6d87p-1 x: 0x1.7bb183ccc47efp-1 alpha*pow(x, 1/beta): 0x1.181016f03c09p-3 gcc-64: beta: 0x1.ap-4 s: 2932701594, p0: 0x1.b407f600e6d87p-1 x: 0x1.7bb183ccc47efp-1 alpha*pow(x, 1/beta): 0x1.181016f03c09p-3 ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
[tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw
Hello Riastradh, as discussed on IRC, Appveyor recently started failing the stochastic tests of LogLogistic on 32-bit builds: https://github.com/torproject/tor/pull/576 https://ci.appveyor.com/project/torproject/tor/builds/20897462 I managed to reproduce the breakage by cross-compiling Tor and running the tests with wine, using this script of ahf: https://github.com/ahf/tor-win32/ Here are my findings: The following two test cases are breaking 100% reproducibly: ok = test_stochastic_log_logistic_impl(M_E, 1e-1); ok = test_stochastic_log_logistic_impl(exp(-10), 1e-2); The breakage seems to be because of the beta parameter. In particular, it seems like the test will break with any beta <= 0.26, and will succeed with a beta >= 0.27. The space in between is still unclear ;) I haven't managed to figure out what's actually offending the test but I can reproduce this so I can do some digging if you have any ideas. FWIW, I haven't noticed any other stochastic test breakage. Cheers! ___ tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev