Re: [tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw

2018-12-13 Thread Taylor R Campbell
> Date: Thu, 13 Dec 2018 03:25:30 +
> From: Taylor R Campbell 
> 
> Binary80 arithmetic tickled a problem in the lerp used for binning --
> [...]

Correction: while making the lerp less naive addresses this problem,
it also arises only when binary80 arithmetic and binary64 arithmetic
are mixed, which you get on 32-bit x86 in C double with the x87 unit
in binary80 mode so (a) double is binary64, but (b) intermediate
expressions are evaluated in binary80.

The attached program shows this, by putting either one (bad) or two
(good) intermediate expressions in double variables.  On x86, if you
compile it with -mfpmath=387 (default on 32-bit), you'll see the bad
result, a negative answer; if you compile it with -mfpmath=sse
(default on 64-bit), you'll see only good results, zero.  Convert
everything to use long double (and %Le) instead so that all the
arithmetic and intermediate quantities are binary80, and it's fine.

% cc -o loser -mfpmath=387 loser.c && ./loser
bad -2.45760e+04
good 0.0e+00
% cc -o loser -mfpmath=sse loser.c && ./loser
bad 0.0e+00
good 0.0e+00

(This is why I don't like x87 and the automagic evaluation of
expressions in higher precision...)
#include 

#ifdef __NetBSD__

#include 

#else

#include 

#define FP_PS   0
#define FP_PRS  1
#define FP_PD   2
#define FP_PE   3

int
fpsetprec(int nprec)
{
uint32_t ocw, ncw;
int oprec;

asm volatile("fnstcw %0" : "=m"(ocw));

oprec = (ocw >> 8) & 3;
ncw = (ocw & ~(uint32_t)(3 << 8)) | (uint32_t)((nprec & 3) << 8);

asm volatile("fldcw %0" : : "m"(ncw));

return oprec;
}

#endif

int
main(void)
{
volatile double lo = 2.4608250784829636e-20;
volatile double hi = 3.0026742508190853e+20;
volatile double w, d;
volatile size_t n = 100;

fpsetprec(FP_PE);
w = (hi - lo)/(n - 2);
d = (n - 2)*w;
printf("bad %.17e\n", hi - (n - 2)*w);
printf("good %.17e\n", hi - d);

fflush(stdout);
return ferror(stdout);
}
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


Re: [tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw

2018-12-12 Thread Taylor R Campbell
> Date: Wed, 12 Dec 2018 17:47:05 +0200
> From: George Kadianakis 
> 
> I followed your suggestion and turned the tests into deterministic by
> sampling from a deterministic randomness source. I verified that all the
> crypto_rand() call outputs are now the same between the 32-bit mingw
> build and the 64-bit gcc one: [...]

Although it is worthwhile to separate (a) a nondeterministic entropy
source from (b) a deterministic PRNG with seeds you can save to get
reproducible results...it turns out all that was a red herring, and
the culprit was an incautious linear interpolation I had written in
the binning for stochastic tests.  Fortunately the problematic code is
limited to the tests and does not run in the tor daemon!

Lerping -- computing t |---> a + (b - a)*t = (1 - t)*a + t*b -- is
deceptive in its apparent simplicity, and it just happened that no
trouble arose in binary64 arithmetic, which is what you get on just
about every machine on the planet _except_ 32-bit x86 in binary80 mode
-- which mingw32 uses.  (And the m68k floating-point coprocessor, but
one doesn't see those much these days.)  Fortunately you don't have to
use Windows to run tests -- `gcc -m32' on an amd64 system should do
just as well, and you can make sure you're in binary80 mode with

   uint32_t cwsave, cw = 0x037f;
   asm volatile("fnstcw %0; fldcw %1" : "=m"(cwsave) : "m"(cw));
   ...
   asm volatile("fldcw %0" : : "m"(cwsave));

(On NetBSD, this is spelled `fp_prec_t p = fpsetprec(FP_PE); ...;
fpsetprec(p)', but I'm not sure anyone else has adopted that.)

Binary80 arithmetic tickled a problem in the lerp used for binning --
a problem which I took some pains to avoid in sample_uniform_interval
(see the long theorem in the comments there).  By `binning' I mean
dividing the support of each distribution into histogram bins,
starting at the 1st percentile and ending at the 99th percentile, and
uniformly spaced in the support with a linear interpolation to pick
the bin boundaries.

We then use the CDF or SF to compute the probability a sample falls
into each bin, count a lot of samples, and use a psi test to assess
whether the sample counts are close enough to the expected counts.
(There's nothing magic about uniform spacing; I just pulled it out of
my arse, and it helped me catch bugs.)

When beta is small, the log-logistic distribution has a very fat tail,
so the 1st percentile and the 99th percentile are very far away (e.g.,
for alpha=e and beta=1/10, the 1st percentile is ~2.5e-20 and the 99th
percentile is ~3e+20).  The naive lerp I had written in bin_cdfs to
compute an equispaced partition between these two points overshot in
this case in binary80 arithmetic, which led to an intermediate NaN
computation, which made the test fail 100% of the time.

The attached patch uses a slightly less naive lerp.  It's not
necessarily monotonic but that doesn't really matter here since we're
only examining 100 histogram bins -- overcounting by the width of one
or two consecutive floating-point numbers is essentially
inconsequential at this scale for these tests.
>From 1338258a0d6a76bbe30bf03b90acff4559c24120 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell 
Date: Thu, 13 Dec 2018 02:46:27 +
Subject: [PATCH] Use a lerp that never overshoots when determining bin
 boundaries.

In binary80 arithmetic, the bins for the log-logistic distribution,
which has a very fat tail, were computed with a value that overshot
the bounds, leading to a NaN intermediate.

One defensive way to avoid this would be to change the cdf_* and sf_*
functions for distributions of bounded support -- like log-logistic,
which is supported only on (0, +\infty) -- to return -/+inf for
points outside the bounds.  But for testing purposes it might be
better not to work defensively like that because it might mask
upstream problems.
---
 src/test/test_prob_distr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/test/test_prob_distr.c b/src/test/test_prob_distr.c
index 002497e08..16919e0f4 100644
--- a/src/test/test_prob_distr.c
+++ b/src/test/test_prob_distr.c
@@ -1002,14 +1002,14 @@ bin_cdfs(const struct dist *dist, double lo, double hi, 
double *logP, size_t n)
   logP[0] = log(CDF(x_1) - 0); /* 0 = CDF(-inf) */
   for (i = 1; i < n2; i++) {
 x_0 = x_1;
-x_1 = lo + i*w;
+x_1 = (i <= n/2 ? lo + i*w : hi - (n - 2 - i)*w);
 logP[i] = log(CDF(x_1) - CDF(x_0));
   }
   x_0 = hi;
   logP[n - 1] = log(SF(x_0) - 0); /* 0 = SF(+inf) = 1 - CDF(+inf) */
   for (i = 1; i < n - n2; i++) {
 x_1 = x_0;
-x_0 = hi - i*w;
+x_0 = (i <= n/2 ? hi - i*w : lo + (n - 2 - i)*w);
 logP[n - i - 1] = log(SF(x_0) - SF(x_1));
   }
 #undef SF
-- 
2.19.1

___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


Re: [tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw

2018-12-12 Thread George Kadianakis
George Kadianakis  writes:

> Hello Riastradh,
>
> as discussed on IRC, Appveyor recently started failing the stochastic
> tests of LogLogistic on 32-bit builds:
>  https://github.com/torproject/tor/pull/576
>  https://ci.appveyor.com/project/torproject/tor/builds/20897462
>
> I managed to reproduce the breakage by cross-compiling Tor and running
> the tests with wine, using this script of ahf: 
> https://github.com/ahf/tor-win32/
>
> Here are my findings:
>
> The following two test cases are breaking 100% reproducibly:
>
> ok = test_stochastic_log_logistic_impl(M_E, 1e-1);
> ok = test_stochastic_log_logistic_impl(exp(-10), 1e-2);
>

And here are some updates:

I followed your suggestion and turned the tests into deterministic by
sampling from a deterministic randomness source. I verified that all the
crypto_rand() call outputs are now the same between the 32-bit mingw
build and the 64-bit gcc one:
  
https://github.com/asn-d6/tor/commit/3d8c86c2f08ad2cc7ed030bbf8e11b110351f5c8

I then focused on the test_stochastic_log_logistic_impl(M_E, 1e-1) test
case and tried to figure out where the deviation was happening between
64-bit gcc and 32-bit mingw... That took a while but I finally got some
figures. Check out my commit that adds some printfs as well:
  
https://github.com/asn-d6/tor/commit/36999c640fe824ab9fb85b5d2cd15017a97a532f

So using the output from that that commit I noticed that many times
log_logistic_sample() would give different outputs in these two
systems. In particular sometimes the x value would differ even with the
same (s, p0) pair, and other times the x value would be the same but the
final alpha*pow(x,1/beta) value would differ. Even tho this is the case,
the test would only fail for certain values for beta (as mentioned in my
previous email).

I now inline various such failure cases and one correct one:

Case #1 (same x, different sample value):

 mingw-32:
beta: 0x1.ap-4
s: 3122729323, p0: 0x1.68d18a44b82fbp-1
x: 0x1.d686a1e7fa35p+0
alpha*pow(x, 1/beta): 0x1.2affd5bfff433p+10

 gcc-64:
beta: 0x1.ap-4
s: 3122729323, p0: 0x1.68d18a44b82fbp-1
x: 0x1.d686a1e7fa35p+0
alpha*pow(x, 1/beta): 0x1.2affd5bfff434p+10

Case #2 (same x, different sample value):

 mingw-32:
beta: 0x1.ap-4
s: 738208646, p0: 0x1.a1ecd53def5d3p-2
x: 0x1.068987864c2aep-2
alpha*pow(x, 1/beta): 0x1.bfba380255bb8p-19

 linux:
beta: 0x1.ap-4
s: 738208646, p0: 0x1.a1ecd53def5d3p-2
x: 0x1.068987864c2aep-2
alpha*pow(x, 1/beta): 0x1.bfba380255bb9p-19
 
Case #3 (different x, different sample value):

 mingw-32:
beta: 0x1.ap-4
s: 95364755, p0: 0x1.575b5ea720e3cp-1
x: 0x1.fb7949976ab04p+0
alpha*pow(x, 1/beta): 0x1.3e605e169e8cbp+11

 gcc-64:
beta: 0x1.ap-4
s: 95364755, p0: 0x1.575b5ea720e3cp-1
x: 0x1.fb7949976ab03p+0
alpha*pow(x, 1/beta): 0x1.3e605e169e8c5p+11

Case #4 (different x, different sample value):

 mingw-32:
beta: 0x1.ap-4
s: 2082443965, p0: 0x1.530a8759113bp-2
x: 0x1.42989e50ac641p+2
alpha*pow(x, 1/beta): 0x1.b724d48bf0f6cp+24

 gcc-64:
beta: 0x1.ap-4
s: 2082443965, p0: 0x1.530a8759113bp-2
x: 0x1.42989e50ac64p+2
alpha*pow(x, 1/beta): 0x1.b724d48bf0f5ep+24

Case #5 (different x, different sample value):

 mingw-32:
beta: 0x1.ap-4
s: 443038967, p0: 0x1.b0124b971bbf3p-4
x: 0x1.1f5b72f5f6a3ep+4
alpha*pow(x, 1/beta): 0x1.143a16cdae94fp+43

 gcc-64:
beta: 0x1.ap-4
s: 443038967, p0: 0x1.b0124b971bbf3p-4
x: 0x1.1f5b72f5f6a3fp+4
alpha*pow(x, 1/beta): 0x1.143a16cdae958p+43

Case #6 (same sample value):

 mingw-32:
beta: 0x1.ap-4
s: 2932701594, p0: 0x1.b407f600e6d87p-1
x: 0x1.7bb183ccc47efp-1
alpha*pow(x, 1/beta): 0x1.181016f03c09p-3

 gcc-64:
beta: 0x1.ap-4
s: 2932701594, p0: 0x1.b407f600e6d87p-1
x: 0x1.7bb183ccc47efp-1
alpha*pow(x, 1/beta): 0x1.181016f03c09p-3
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


[tor-dev] prob_distr.c: LogLogistic fails stochastic tests on 32-bits mingw

2018-12-11 Thread George Kadianakis
Hello Riastradh,

as discussed on IRC, Appveyor recently started failing the stochastic
tests of LogLogistic on 32-bit builds:
 https://github.com/torproject/tor/pull/576
 https://ci.appveyor.com/project/torproject/tor/builds/20897462

I managed to reproduce the breakage by cross-compiling Tor and running
the tests with wine, using this script of ahf: https://github.com/ahf/tor-win32/

Here are my findings:

The following two test cases are breaking 100% reproducibly:

ok = test_stochastic_log_logistic_impl(M_E, 1e-1);
ok = test_stochastic_log_logistic_impl(exp(-10), 1e-2);

The breakage seems to be because of the beta parameter. In particular,
it seems like the test will break with any beta <= 0.26, and will
succeed with a beta >= 0.27. The space in between is still unclear ;)

I haven't managed to figure out what's actually offending the test but I
can reproduce this so I can do some digging if you have any ideas.

FWIW, I haven't noticed any other stochastic test breakage.

Cheers!
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev