Re: [tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw

2018-12-12 Thread Taylor R Campbell
> Date: Wed, 12 Dec 2018 17:47:05 +0200
> From: George Kadianakis 
> 
> I followed your suggestion and turned the tests into deterministic by
> sampling from a deterministic randomness source. I verified that all the
> crypto_rand() call outputs are now the same between the 32-bit mingw
> build and the 64-bit gcc one: [...]

Although it is worthwhile to separate (a) a nondeterministic entropy
source from (b) a deterministic PRNG with seeds you can save to get
reproducible results...it turns out all that was a red herring, and
the culprit was an incautious linear interpolation I had written in
the binning for stochastic tests.  Fortunately the problematic code is
limited to the tests and does not run in the tor daemon!

Lerping -- computing t |---> a + (b - a)*t = (1 - t)*a + t*b -- is
deceptive in its apparent simplicity, and it just happened that no
trouble arose in binary64 arithmetic, which is what you get on just
about every machine on the planet _except_ 32-bit x86 in binary80 mode
-- which mingw32 uses.  (And the m68k floating-point coprocessor, but
one doesn't see those much these days.)  Fortunately you don't have to
use Windows to run tests -- `gcc -m32' on an amd64 system should do
just as well, and you can make sure you're in binary80 mode with

   uint32_t cwsave, cw = 0x037f;
   asm volatile("fnstcw %0; fldcw %1" : "=m"(cwsave) : "m"(cw));
   ...
   asm volatile("fldcw %0" : : "m"(cwsave));

(On NetBSD, this is spelled `fp_prec_t p = fpsetprec(FP_PE); ...;
fpsetprec(p)', but I'm not sure anyone else has adopted that.)

Binary80 arithmetic tickled a problem in the lerp used for binning --
a problem which I took some pains to avoid in sample_uniform_interval
(see the long theorem in the comments there).  By `binning' I mean
dividing the support of each distribution into histogram bins,
starting at the 1st percentile and ending at the 99th percentile, and
uniformly spaced in the support with a linear interpolation to pick
the bin boundaries.

We then use the CDF or SF to compute the probability a sample falls
into each bin, count a lot of samples, and use a psi test to assess
whether the sample counts are close enough to the expected counts.
(There's nothing magic about uniform spacing; I just pulled it out of
my arse, and it helped me catch bugs.)

When beta is small, the log-logistic distribution has a very fat tail,
so the 1st percentile and the 99th percentile are very far away (e.g.,
for alpha=e and beta=1/10, the 1st percentile is ~2.5e-20 and the 99th
percentile is ~3e+20).  The naive lerp I had written in bin_cdfs to
compute an equispaced partition between these two points overshot in
this case in binary80 arithmetic, which led to an intermediate NaN
computation, which made the test fail 100% of the time.

The attached patch uses a slightly less naive lerp.  It's not
necessarily monotonic but that doesn't really matter here since we're
only examining 100 histogram bins -- overcounting by the width of one
or two consecutive floating-point numbers is essentially
inconsequential at this scale for these tests.
>From 1338258a0d6a76bbe30bf03b90acff4559c24120 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell 
Date: Thu, 13 Dec 2018 02:46:27 +
Subject: [PATCH] Use a lerp that never overshoots when determining bin
 boundaries.

In binary80 arithmetic, the bins for the log-logistic distribution,
which has a very fat tail, were computed with a value that overshot
the bounds, leading to a NaN intermediate.

One defensive way to avoid this would be to change the cdf_* and sf_*
functions for distributions of bounded support -- like log-logistic,
which is supported only on (0, +\infty) -- to return -/+inf for
points outside the bounds.  But for testing purposes it might be
better not to work defensively like that because it might mask
upstream problems.
---
 src/test/test_prob_distr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/test/test_prob_distr.c b/src/test/test_prob_distr.c
index 002497e08..16919e0f4 100644
--- a/src/test/test_prob_distr.c
+++ b/src/test/test_prob_distr.c
@@ -1002,14 +1002,14 @@ bin_cdfs(const struct dist *dist, double lo, double hi, 
double *logP, size_t n)
   logP[0] = log(CDF(x_1) - 0); /* 0 = CDF(-inf) */
   for (i = 1; i < n2; i++) {
 x_0 = x_1;
-x_1 = lo + i*w;
+x_1 = (i <= n/2 ? lo + i*w : hi - (n - 2 - i)*w);
 logP[i] = log(CDF(x_1) - CDF(x_0));
   }
   x_0 = hi;
   logP[n - 1] = log(SF(x_0) - 0); /* 0 = SF(+inf) = 1 - CDF(+inf) */
   for (i = 1; i < n - n2; i++) {
 x_1 = x_0;
-x_0 = hi - i*w;
+x_0 = (i <= n/2 ? hi - i*w : lo + (n - 2 - i)*w);
 logP[n - i - 1] = log(SF(x_0) - SF(x_1));
   }
 #undef SF
-- 
2.19.1

___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


Re: [tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw

2018-12-12 Thread George Kadianakis
George Kadianakis  writes:

> Hello Riastradh,
>
> as discussed on IRC, Appveyor recently started failing the stochastic
> tests of LogLogistic on 32-bit builds:
>  https://github.com/torproject/tor/pull/576
>  https://ci.appveyor.com/project/torproject/tor/builds/20897462
>
> I managed to reproduce the breakage by cross-compiling Tor and running
> the tests with wine, using this script of ahf: 
> https://github.com/ahf/tor-win32/
>
> Here are my findings:
>
> The following two test cases are breaking 100% reproducibly:
>
> ok = test_stochastic_log_logistic_impl(M_E, 1e-1);
> ok = test_stochastic_log_logistic_impl(exp(-10), 1e-2);
>

And here are some updates:

I followed your suggestion and turned the tests into deterministic by
sampling from a deterministic randomness source. I verified that all the
crypto_rand() call outputs are now the same between the 32-bit mingw
build and the 64-bit gcc one:
  
https://github.com/asn-d6/tor/commit/3d8c86c2f08ad2cc7ed030bbf8e11b110351f5c8

I then focused on the test_stochastic_log_logistic_impl(M_E, 1e-1) test
case and tried to figure out where the deviation was happening between
64-bit gcc and 32-bit mingw... That took a while but I finally got some
figures. Check out my commit that adds some printfs as well:
  
https://github.com/asn-d6/tor/commit/36999c640fe824ab9fb85b5d2cd15017a97a532f

So using the output from that that commit I noticed that many times
log_logistic_sample() would give different outputs in these two
systems. In particular sometimes the x value would differ even with the
same (s, p0) pair, and other times the x value would be the same but the
final alpha*pow(x,1/beta) value would differ. Even tho this is the case,
the test would only fail for certain values for beta (as mentioned in my
previous email).

I now inline various such failure cases and one correct one:

Case #1 (same x, different sample value):

 mingw-32:
beta: 0x1.ap-4
s: 3122729323, p0: 0x1.68d18a44b82fbp-1
x: 0x1.d686a1e7fa35p+0
alpha*pow(x, 1/beta): 0x1.2affd5bfff433p+10

 gcc-64:
beta: 0x1.ap-4
s: 3122729323, p0: 0x1.68d18a44b82fbp-1
x: 0x1.d686a1e7fa35p+0
alpha*pow(x, 1/beta): 0x1.2affd5bfff434p+10

Case #2 (same x, different sample value):

 mingw-32:
beta: 0x1.ap-4
s: 738208646, p0: 0x1.a1ecd53def5d3p-2
x: 0x1.068987864c2aep-2
alpha*pow(x, 1/beta): 0x1.bfba380255bb8p-19

 linux:
beta: 0x1.ap-4
s: 738208646, p0: 0x1.a1ecd53def5d3p-2
x: 0x1.068987864c2aep-2
alpha*pow(x, 1/beta): 0x1.bfba380255bb9p-19
 
Case #3 (different x, different sample value):

 mingw-32:
beta: 0x1.ap-4
s: 95364755, p0: 0x1.575b5ea720e3cp-1
x: 0x1.fb7949976ab04p+0
alpha*pow(x, 1/beta): 0x1.3e605e169e8cbp+11

 gcc-64:
beta: 0x1.ap-4
s: 95364755, p0: 0x1.575b5ea720e3cp-1
x: 0x1.fb7949976ab03p+0
alpha*pow(x, 1/beta): 0x1.3e605e169e8c5p+11

Case #4 (different x, different sample value):

 mingw-32:
beta: 0x1.ap-4
s: 2082443965, p0: 0x1.530a8759113bp-2
x: 0x1.42989e50ac641p+2
alpha*pow(x, 1/beta): 0x1.b724d48bf0f6cp+24

 gcc-64:
beta: 0x1.ap-4
s: 2082443965, p0: 0x1.530a8759113bp-2
x: 0x1.42989e50ac64p+2
alpha*pow(x, 1/beta): 0x1.b724d48bf0f5ep+24

Case #5 (different x, different sample value):

 mingw-32:
beta: 0x1.ap-4
s: 443038967, p0: 0x1.b0124b971bbf3p-4
x: 0x1.1f5b72f5f6a3ep+4
alpha*pow(x, 1/beta): 0x1.143a16cdae94fp+43

 gcc-64:
beta: 0x1.ap-4
s: 443038967, p0: 0x1.b0124b971bbf3p-4
x: 0x1.1f5b72f5f6a3fp+4
alpha*pow(x, 1/beta): 0x1.143a16cdae958p+43

Case #6 (same sample value):

 mingw-32:
beta: 0x1.ap-4
s: 2932701594, p0: 0x1.b407f600e6d87p-1
x: 0x1.7bb183ccc47efp-1
alpha*pow(x, 1/beta): 0x1.181016f03c09p-3

 gcc-64:
beta: 0x1.ap-4
s: 2932701594, p0: 0x1.b407f600e6d87p-1
x: 0x1.7bb183ccc47efp-1
alpha*pow(x, 1/beta): 0x1.181016f03c09p-3
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev