Re: svn commit: r241755 - head/lib/msun/src

2012-10-22 Thread Warner Losh

On Oct 21, 2012, at 10:06 PM, Steve Kargl wrote:

 On Sun, Oct 21, 2012 at 09:08:49PM -0600, Warner Losh wrote:
 Feel free to fix them however.  I added the comments because
 the algorithms weren't quite the same...  If you have a better
 way, feel free to back my stuff out on the way to it.
 
 Warner
 
 
 Please back your commits out. 
 
 Two of the three people who actually appear to be
 working on libm have now requested the back-out.

OK.  While it would be just as easy for you guys to commit the new stuff over 
mine, I'll revert them.

Warner

___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: svn commit: r241755 - head/lib/msun/src

2012-10-22 Thread Warner Losh

On Oct 21, 2012, at 10:06 PM, Steve Kargl wrote:

 On Sun, Oct 21, 2012 at 09:08:49PM -0600, Warner Losh wrote:
 Feel free to fix them however.  I added the comments because
 the algorithms weren't quite the same...  If you have a better
 way, feel free to back my stuff out on the way to it.
 
 Warner
 
 
 Please back your commits out. 
 
 Two of the three people who actually appear to be
 working on libm have now requested the back-out.

Done.  Have fun with whatever you have in mind to replace it.

Warner

___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: svn commit: r241755 - head/lib/msun/src

2012-10-22 Thread Warner Losh

On Oct 22, 2012, at 5:14 AM, Bruce Evans wrote:

 On Sun, 21 Oct 2012, Warner Losh wrote:
 
 Feel free to fix them however.  I added the comments because the algorithms 
 weren't quite the same...  If you have a better way, feel free to back my 
 stuff out on the way to it.
 
 But the algorithms are identical to a fault.  Inside the functions, all
 lines except 1 in each correspond exactly, and the exception is a style
 bug.  Only about 30 lines in each are not lexically identical.  The
 non-lexical differences are for things like different magic numbers.

Except that's not true.  For expf, only the first two terms of the 5 are 
computed in Remes expansion.  That's why I bothered to document the difference. 
 For logf, there are at least two additional terms in the intermediate form.

The differences here are:
/* logf */
t1= w*(Lg2+w*Lg4); 
t2= z*(Lg1+w*Lg3); 
/* log */
t1= w*(Lg2+w*(Lg4+w*Lg6));
t2= z*(Lg1+w*(Lg3+w*(Lg5+w*Lg7)));
 
which to even the most casual observer are clearly different.

 The old fdlibm comments aren't too careful about keeping magic numbers
 out of the algorithm description so that the algorithm description is
 as general as possible.  The precise magic numbers are often critical
 to the details of the implementation of the algorithm but not really
 to the algorithm itself.

The current code isn't careful at all about magic numbers.  Which ones are 
magic hex constants for IEEE stuff, and which ones are hex numbers for the 
exact representation of important constants?

Anyway, it is clear you guys don't want them in there so I've reverted them.  
I'm totally baffled by this request because these algorithms are similar not 
identical, but since you guys are the active maintainers, I'll accede to your 
wishes after expressing my utter bafflement at the justifications.

Warner
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: svn commit: r241755 - head/lib/msun/src

2012-10-22 Thread Steve Kargl
On Mon, Oct 22, 2012 at 06:59:04AM -0600, Warner Losh wrote:
 
 On Oct 21, 2012, at 10:06 PM, Steve Kargl wrote:
 
  On Sun, Oct 21, 2012 at 09:08:49PM -0600, Warner Losh wrote:
  Feel free to fix them however.  I added the comments because
  the algorithms weren't quite the same...  If you have a better
  way, feel free to back my stuff out on the way to it.
  
  Warner
  
  
  Please back your commits out. 
  
  Two of the three people who actually appear to be
  working on libm have now requested the back-out.
 
 OK.  While it would be just as easy for you guys to commit the new stuff over 
 mine, I'll revert them.
 
 Warner
 

Thanks.

BTW, besides bde's technical points, your change made
our sources different from OpenBSD, NetBSD, and new
project openlibm.  Diffing against the other trees
would become cluttered.

-- 
Steve
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: svn commit: r241755 - head/lib/msun/src

2012-10-22 Thread Warner Losh

On Oct 22, 2012, at 7:40 AM, Steve Kargl wrote:

 On Mon, Oct 22, 2012 at 06:59:04AM -0600, Warner Losh wrote:
 
 On Oct 21, 2012, at 10:06 PM, Steve Kargl wrote:
 
 On Sun, Oct 21, 2012 at 09:08:49PM -0600, Warner Losh wrote:
 Feel free to fix them however.  I added the comments because
 the algorithms weren't quite the same...  If you have a better
 way, feel free to back my stuff out on the way to it.
 
 Warner
 
 
 Please back your commits out. 
 
 Two of the three people who actually appear to be
 working on libm have now requested the back-out.
 
 OK.  While it would be just as easy for you guys to commit the new stuff 
 over mine, I'll revert them.
 
 Warner
 
 
 Thanks.
 
 BTW, besides bde's technical points, your change made
 our sources different from OpenBSD, NetBSD, and new
 project openlibm.  Diffing against the other trees
 would become cluttered.

BDE's technical points vary in quality and are difficult to argue with since 
they are so nit-picky. :(  I'd be happy to work through them, but some of the 
issues I just fundamentally disagree with.  Since I backed out the comments, 
I've decided not to spend the time arguing, but do think that documenting the 
differences between the precisions would be good.  I started down this path 
because I thought expf was broken because it didn't match exp exactly...

However, since he's implementing a new one, wouldn't that also have diffability 
issues too?

Warner
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: svn commit: r241755 - head/lib/msun/src

2012-10-22 Thread Bruce Evans

On Mon, 22 Oct 2012, Warner Losh wrote:


On Oct 22, 2012, at 5:14 AM, Bruce Evans wrote:


On Sun, 21 Oct 2012, Warner Losh wrote:


Feel free to fix them however.  I added the comments because the algorithms 
weren't quite the same...  If you have a better way, feel free to back my stuff 
out on the way to it.


But the algorithms are identical to a fault.  Inside the functions, all
lines except 1 in each correspond exactly, and the exception is a style
bug.  Only about 30 lines in each are not lexically identical.  The
non-lexical differences are for things like different magic numbers.


Except that's not true.  For expf, only the first two terms of the 5 are 
computed in Remes expansion.  That's why I bothered to document the difference. 
 For logf, there are at least two additional terms in the intermediate form.


No, they both use a Remez-type polynomial approximation.  The only
differences are that the approximation for logf needs _fewer_ terms,
and in fact has 3 fewer (except in the Cygnus version it had the
same number of terms with the lower ones useless at best), and that
the grouping of the terms is different for efficiency.

I say Remez-type because it is unclear if anyone actually uses pure
Remez.  I use my own idea of what it is, without having looked at
documentation of the original algorithm.  I probably have different
tweaks than normal.  fdlibm's approximations often seem to be too
inaccurate to have been generated by full Remez.  The ones here in
the Cygnus version were generated by starting with the double precision
coeffs and blindly rounding them to float precision.   This is a bad
non-Remez method (but pure Remez doesn't understand limited precision
AFAIK).  In practice it gave 6 extra bits using Lg1-Lg7, where my
coeffs generated for float precision give 10 extra bits using only
Lg1-Lg4.  The extra coeffs tend to be actively harmful (and IIRC they
were here), because when Remez generates the lower coeffs it tunes
them for the lower coeffs actually working, all in infinite precision,
and after rounding to finite precision the delicate tuning for this
just doesn't work.

The comments actually say special Remes.  Who knows what that is? :-)
Even running same algorithm, there is a lot of noise in the lower bits
for all except the first few coeffs, so it is hard to duplicate coeffs
produced by other special versions.


The differences here are:
/* logf */
   t1= w*(Lg2+w*Lg4);
   t2= z*(Lg1+w*Lg3);
/* log */
   t1= w*(Lg2+w*(Lg4+w*Lg6));
   t2= z*(Lg1+w*(Lg3+w*(Lg5+w*Lg7)));

which to even the most casual observer are clearly different.


Too casual :-).  Both use a moderately efficient grouping of terms for
efficiency.  Both are more optimal than the simple Horner's method
grouping which would look similar even to a too-casual observer:

R = s*s*(Lg1+s*(Lg2+s*(...+Lg[4or7])...))

Neither is highly optimized, but the logf one is is better, mainly because
a power of 2 number of terms is easiest to get right and a small number of
terms is especially easy to get right.  In fact, the logf one is just the
log one with higher terms omitted.  Simply omitting higher terms is usually
too simple, but here it happens to improve the grouping.

The grouping is a general polynomial evaluation technique and isn't
part of Remez.  The details of it aren't documented at all.  In fact,
the pseudo-code doesn't even mention z completely: it says R(z) ~=
Lg1*s**2 + ... +Lg7*s**14 (using superscripts for the powers) where
the code says t2 = z*(Lg1+w*(Lg3+...)... R = t2+t1.

Many other details involve unmentioned parts of general polynymonal
evaluation technique (mainly, add terms from high to low for accuracy,
but violate this rule as much as possible for efficiency, and avoid
underflow by filtering out small args early and by not calculating
high powers directly).


The old fdlibm comments aren't too careful about keeping magic numbers
out of the algorithm description so that the algorithm description is
as general as possible.  The precise magic numbers are often critical
to the details of the implementation of the algorithm but not really
to the algorithm itself.


The current code isn't careful at all about magic numbers.  Which ones are 
magic hex constants for IEEE stuff, and which ones are hex numbers for the 
exact representation of important constants?


None for the hex FP constants.  Hex FP constants are for the programmer's
convenience, but I are just style bugs where I use put them in e_expf.c.
Decimal FP constants work with any C compiler.  They are only pseudocode
in comments in fdlibm, and no one every used their values from there AFAIK,
since C compilers were common when fdlibm was released.  The style bug is
to mix C99 hex FP constants in old fdlibm (Cygnus) code.  Though only in
comments, it looks strange since the active code still laboriously spells
out (float) casts to get float constants.  Even C90 has float constants
via the F suffix.  But I recently decided 

Re: svn commit: r241755 - head/lib/msun/src

2012-10-21 Thread Warner Losh
Feel free to fix them however.  I added the comments because the algorithms 
weren't quite the same...  If you have a better way, feel free to back my stuff 
out on the way to it.

Warner

On Oct 19, 2012, at 11:43 PM, Bruce Evans wrote:

 On Fri, 19 Oct 2012, Warner Losh wrote:
 
 Log:
 Document the methods used to compute logf. Taken and edited from the
 double version, with adaptations for the differences between it and
 the float version.
 
 Please back this out.
 
 This was intentionally left out.  It is painful to maintain large
 comments that should be identical for all precisions.  Double precision
 is primary, and only comments that depend on the precision should be
 given in other precisions, except for short comments to to right of
 coulde whose non-duplication would just give larger diffs.  If you
 want to duplicate them, then they would often have to be quadruplicated
 in 4 files for the 4 supported precisions:
 - double precision
 - float precision
 - long double with MANT_DIG = 64 and 16 other bits
 - long double with MANT_DIG = 113 and 15 other bits
 Only 4 now, but there could easily be variations for long doubles.
 
 log* will be replaced by my version soon.  One of the organizational
 things that I'm currently struggling with is how not to duplicate the
 comments in them.  Currently I duplicate then in 4 files, and don't
 quite duplicate then in a 5th file for optimized float precision.
 Maintaining them has been very painful.  They are about 3 times as large
 as the comments here, and have more and more-delicate precision-dependent
 details, so trimming them is harder than here.  They do cover all of
 log*(), log1p*(), log2*() and log10*() in the same file, so there is
 not as much duplication as for the different e_log*.c files.
 
 Another organizational problem is how to organize the functions.  I
 currently have 4 primary interfaces per file (should have a couple
 more for kernels), and 1 file for each precision, and currently use
 inlines and ifdefs to avoid these inlines, but the inlines are
 not properly optimized for all cases of interest, and break debugging
 in most cases, so I plan to switch to more macro hackery (like multiple
 #include __FILE__'s with ifdefs to select what is compiled for each
 #include) .  Even more macro hackery would let me put support for all
 precisions in the same files, so duplicated comments would be even less
 useful, but the precision-dependent ones would be even harder to write
 and read.
 
 We are developing inverse trig functions, and handle the comments and
 other things by generating code for all precisions from the double
 precision case.  Large comments are stripped as part of the generation.
 This is easier to write but not so good to read.  This is only feasible
 because the code is not very precision-dependent.
 
 Modified: head/lib/msun/src/e_logf.c
 ==
 --- head/lib/msun/src/e_logf.c   Fri Oct 19 22:21:01 2012
 (r241754)
 +++ head/lib/msun/src/e_logf.c   Fri Oct 19 22:46:48 2012
 (r241755)
 @@ -19,6 +19,57 @@ __FBSDID($FreeBSD$);
 #include math.h
 #include math_private.h
 
 +/* __ieee754_log(x)
 
 Here the comment doesn't even match the code.  This is __ieee754_logf(x),
 not __ieee754_log().  Technically, this is a comment that depends on the
 precision, so it should be given separately, but the differences for type
 suffixes are especially painful to maintain in comments and especially
 useless to document in comments.
 
 + * Return the logrithm of x
 
 This duplicates e_log.c to a fault.  Fixing this spelling error would
 require touching multiple files.
 
 + *
 + * Method :
 + *   1. Argument Reduction: find k and f such that
 + *  x = 2^k * (1+f),
 + * where  sqrt(2)/2  1+f  sqrt(2) .
 + *
 + *   2. Approximation of log(1+f).
 + *  Let s = f/(2+f) ; based on log(1+f) = log(1+s) - log(1-s)
 + *   = 2s + 2/3 s**3 + 2/5 s**5 + .,
 + *   = 2s + s*R
 
 Nothing new here.  I don't want to see it again.
 
 + *  We use a special Reme algorithm on [0,0.1716] to generate
 + *  a polynomial of degree 8 to approximate R The maximum error
 + *  of this polynomial approximation is bounded by 2**-34.24. In
 + *  other words,
 
 Lots more spelling and grammar errors.  I'm not sure if the correct spelling
 is Remez or Remes, but I'm sure it isn't Reme.  One sentence break is
 missing a . and the others are consistently too short.
 
 The 2**-34.24 number is precision-dependent and is documented in more
 completely below using my automatically generated comment that gives a
 consistent format.  (I haven't merged this improvement back into e_log.c.)
 Now there is duplication even within the same file :-(.
 
 + *  2  4  6  8
 + *  R(z) ~ Lg1*s +Lg2*s +Lg3*s +Lg4*s
 
 The number of coeffs is indeed precision-dependent, but uninteresting.
 My 

Re: svn commit: r241755 - head/lib/msun/src

2012-10-21 Thread Steve Kargl
On Sun, Oct 21, 2012 at 09:08:49PM -0600, Warner Losh wrote:
 Feel free to fix them however.  I added the comments because
 the algorithms weren't quite the same...  If you have a better
 way, feel free to back my stuff out on the way to it.
 
 Warner
 

Please back your commits out. 

Two of the three people who actually appear to be
working on libm have now requested the back-out.

-- 
Steve
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: svn commit: r241755 - head/lib/msun/src

2012-10-20 Thread Bruce Evans

On Sat, 20 Oct 2012, Bruce Evans wrote:


On Fri, 19 Oct 2012, Warner Losh wrote:


Log:
 Document the methods used to compute logf. Taken and edited from the
 double version, with adaptations for the differences between it and
 the float version.


Please back this out.

This was intentionally left out.  It is painful to maintain large
comments that should be identical for all precisions.  Double precision
is primary, and only comments that depend on the precision should be
given in other precisions, except for short comments to to right of
coulde whose non-duplication would just give larger diffs.  If you
want to duplicate them, then they would often have to be quadruplicated
in 4 files for the 4 supported precisions:
- double precision
- float precision
- long double with MANT_DIG = 64 and 16 other bits
- long double with MANT_DIG = 113 and 15 other bits
Only 4 now, but there could easily be variations for long doubles.


PS: in long double trig functions, I handle this carefully by giving
references to the file containing the main description of the algorithm.
Old Cygnus float functions are no so careful, and I didn't want to churn
them for this.  Example from k_cosl.c:

% #include sys/cdefs.h
% __FBSDID($FreeBSD: src/lib/msun/ld80/k_cosl.c,v 1.1 2008/02/17 07:32:14 das Exp 
$);
% 
% /*

%  * ld80 version of k_cos.c.  See ../src/k_cos.c for most comments.
%  */

Most details are here.  They are described in the same style as the
e_log.c comments that you duplicated.  Here they are in the kernel
instead of in the main file.  Another problem is that they really
do belong in the kernel, but are harder to read there.  We already
have kernels for log and logf (k_log.h and k_logf.c).  These
unfortunately have to duplicate the code, but they are careful to
not duplicate comments and carefully point to the main file for the
comments.  The kernels are only used for log10*() and log2*(), so
strictly whatever the kernels do doesn't affect log() and logf().
You didn't duplicate the comments into these files.  If you did,
there would already be 4 or 6 copies of them without even long double
precision:
- e_log.c, e_logf.c
- k_log.h, k_logf.c
- e_log10.c, e_log10f.c, e_log2.c, e_log2f.c (if we don't use kernels
  and/or don't put the comments in the kernels).

% 
% #include math_private.h
% 
% /*

%  * Domain [-0.7854, 0.7854], range ~[-2.43e-23, 2.425e-23]:
%  * |cos(x) - c(x)|  2**-75.1
%  *
%  * The coefficients of c(x) were generated by a pari-gp script using
%  * a Remez algorithm that searches for the best higher coefficients
%  * after rounding leading coefficients to a specified precision.

This documents some precision-dependent details too verbosely.  It goes
without saying that we generate minimax polynomials by some method.
If we want to document the method(s), then it should be once per
method and not placed in the source files whose polynomials happened
to be generated using the methods.

%  *
%  * Simpler methods like Chebyshev or basic Remez barely suffice for
%  * cos() in 64-bit precision, because we want the coefficient of x^2
%  * to be precisely -0.5 so that multiplying by it is exact, and plain
%  * rounding of the coefficients of a good polynomial approximation only
%  * gives this up to about 64-bit precision.  Plain rounding also gives
%  * a mediocre approximation for the coefficient of x^4, but a rounding
%  * error of 0.5 ulps for this coefficient would only contribute ~0.01
%  * ulps to the final error, so this is unimportant.  Rounding errors in
%  * higher coefficients are even less important.
%  *
%  * In fact, coefficients above the x^4 one only need to have 53-bit
%  * precision, and this is more efficient.  We get this optimization
%  * almost for free from the complications needed to search for the best
%  * higher coefficients.
%  */

This documents in excessive detail the limits of some methods.  When it
was written, I didn't know the limits very well.  This is essentially
a way of saying don't document your methods here, else it would expand
to be as verbose as this comment, if not to a book.  I don't want to
read this book in every source file.

Bruce
___
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to svn-src-head-unsubscr...@freebsd.org


Re: svn commit: r241755 - head/lib/msun/src

2012-10-19 Thread Bruce Evans

On Fri, 19 Oct 2012, Warner Losh wrote:


Log:
 Document the methods used to compute logf. Taken and edited from the
 double version, with adaptations for the differences between it and
 the float version.


Please back this out.

This was intentionally left out.  It is painful to maintain large
comments that should be identical for all precisions.  Double precision
is primary, and only comments that depend on the precision should be
given in other precisions, except for short comments to to right of
coulde whose non-duplication would just give larger diffs.  If you
want to duplicate them, then they would often have to be quadruplicated
in 4 files for the 4 supported precisions:
- double precision
- float precision
- long double with MANT_DIG = 64 and 16 other bits
- long double with MANT_DIG = 113 and 15 other bits
Only 4 now, but there could easily be variations for long doubles.

log* will be replaced by my version soon.  One of the organizational
things that I'm currently struggling with is how not to duplicate the
comments in them.  Currently I duplicate then in 4 files, and don't
quite duplicate then in a 5th file for optimized float precision.
Maintaining them has been very painful.  They are about 3 times as large
as the comments here, and have more and more-delicate precision-dependent
details, so trimming them is harder than here.  They do cover all of
log*(), log1p*(), log2*() and log10*() in the same file, so there is
not as much duplication as for the different e_log*.c files.

Another organizational problem is how to organize the functions.  I
currently have 4 primary interfaces per file (should have a couple
more for kernels), and 1 file for each precision, and currently use
inlines and ifdefs to avoid these inlines, but the inlines are
not properly optimized for all cases of interest, and break debugging
in most cases, so I plan to switch to more macro hackery (like multiple
#include __FILE__'s with ifdefs to select what is compiled for each
#include) .  Even more macro hackery would let me put support for all
precisions in the same files, so duplicated comments would be even less
useful, but the precision-dependent ones would be even harder to write
and read.

We are developing inverse trig functions, and handle the comments and
other things by generating code for all precisions from the double
precision case.  Large comments are stripped as part of the generation.
This is easier to write but not so good to read.  This is only feasible
because the code is not very precision-dependent.


Modified: head/lib/msun/src/e_logf.c
==
--- head/lib/msun/src/e_logf.c  Fri Oct 19 22:21:01 2012(r241754)
+++ head/lib/msun/src/e_logf.c  Fri Oct 19 22:46:48 2012(r241755)
@@ -19,6 +19,57 @@ __FBSDID($FreeBSD$);
#include math.h
#include math_private.h

+/* __ieee754_log(x)


Here the comment doesn't even match the code.  This is __ieee754_logf(x),
not __ieee754_log().  Technically, this is a comment that depends on the
precision, so it should be given separately, but the differences for type
suffixes are especially painful to maintain in comments and especially
useless to document in comments.


+ * Return the logrithm of x


This duplicates e_log.c to a fault.  Fixing this spelling error would
require touching multiple files.


+ *
+ * Method :
+ *   1. Argument Reduction: find k and f such that
+ *  x = 2^k * (1+f),
+ * where  sqrt(2)/2  1+f  sqrt(2) .
+ *
+ *   2. Approximation of log(1+f).
+ *  Let s = f/(2+f) ; based on log(1+f) = log(1+s) - log(1-s)
+ *   = 2s + 2/3 s**3 + 2/5 s**5 + .,
+ *   = 2s + s*R


Nothing new here.  I don't want to see it again.


+ *  We use a special Reme algorithm on [0,0.1716] to generate
+ *  a polynomial of degree 8 to approximate R The maximum error
+ *  of this polynomial approximation is bounded by 2**-34.24. In
+ *  other words,


Lots more spelling and grammar errors.  I'm not sure if the correct spelling
is Remez or Remes, but I'm sure it isn't Reme.  One sentence break is
missing a . and the others are consistently too short.

The 2**-34.24 number is precision-dependent and is documented in more
completely below using my automatically generated comment that gives a
consistent format.  (I haven't merged this improvement back into e_log.c.)
Now there is duplication even within the same file :-(.


+ *  2  4  6  8
+ *  R(z) ~ Lg1*s +Lg2*s +Lg3*s +Lg4*s


The number of coeffs is indeed precision-dependent, but uninteresting.
My constent format for poly coeffs doesn't mention them in comments.
It is enough to spell the constants for them in a consistent way
(consistent across all sources, not just log*).


+ *  (the values of Lg1 to Lg7 are listed in the program)


Here the comment doesn't even match the code.  There is no Lg5 to
Lg7 for float precision.


+