Re: PPC64: Poor StrictMath performance due to non-optimized compilation

Chris Plummer Thu, 17 Nov 2016 13:49:44 -0800

On 11/17/16 1:33 PM, joe darcy wrote:

Hi Gustavo,
On 11/17/2016 10:31 AM, Gustavo Romero wrote:
Hi Joe,

Thanks a lot for your valuable comments.

On 17-11-2016 15:35, joe darcy wrote:
Currently, optimization for building fdlibm is disabled, except for the
"solaris" OS target [1].
The reason is that, historically, the Solaris compilers have had sufficient discipline and control regarding floating-point semantics and compiler optimizations to still implement the Java-mandated results when optimization was enabled. The gcc family of compilers, for example, has lacked such discipline.
Oh, I see. Thanks for clarifying that. I was wondering exactly why fdlibm optimization is off even for x86_64 since, AFAICS regarding gcc 5 only, it does not affect the precision, even if setting -O3 does not improve the performance
as much as on PPC64.
The fdlibm code relies on aliasing a two-element array of int with a double to do bit-level reads and writes of floating-point values. As I understand it, the C spec allows compilers to assume values of different types don't overlap in memory. The compilation environment has to be configured in such a way that the C compiler disables code generation and optimization techniques that would run afoul of these fdlibm coding practices.

This is the strict aliasing issue, right? It's a long-standing problem with fdlibm that kept getting worse as gcc got smarter. IIRC, compiling with -fno-strict-aliasing fixes it, but it's been more than 12 years since I last dealt with fdlibm and compiler aliasing issues.
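
For readers unfamiliar with the idiom under discussion, here is a minimal sketch in C. The comment shows roughly the pointer-cast pattern fdlibm uses on little-endian targets; the functions below it are hypothetical helpers (not part of fdlibm or the JDK) that do the same bit-level access through memcpy, which does not depend on the compiler's type-based alias analysis. The sketch assumes a 64-bit IEEE 754 double.

```c
#include <stdint.h>
#include <string.h>

/* fdlibm-style access, shown for contrast only. Reading a double through
 * an int pointer is what conflicts with type-based alias analysis:
 *
 *     #define __HI(x) *(1 + (int *)&x)
 *     #define __LO(x) *(int *)&x
 */

/* Hypothetical helper: upper 32 bits (sign, exponent, top of mantissa). */
static inline uint32_t high_word(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined byte copy */
    return (uint32_t)(bits >> 32);
}

/* Hypothetical helper: lower 32 bits of the mantissa. */
static inline uint32_t low_word(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (uint32_t)bits;
}

/* Hypothetical helper: rebuild a double from its two 32-bit words. */
static inline double from_words(uint32_t hi, uint32_t lo)
{
    uint64_t bits = ((uint64_t)hi << 32) | lo;
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, high_word(1.0) yields 0x3ff00000 and from_words(0x40000000, 0) yields 2.0. Rewriting fdlibm this way is only one option; leaving the source untouched and building with -fno-strict-aliasing, as mentioned above, is the other.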


Chris

As a consequence, on PPC64 (Linux) StrictMath methods like, but not limited to, sin(), cos(), and tan() perform very poorly in comparison to the same methods
in the Math class [2]:
If you are doing your work against JDK 9, note that the pow, hypot, and cbrt fdlibm methods required by StrictMath have been ported to Java (JDK-8134780: Port fdlibm to Java). I have intentions to port the remaining methods to Java, but it is unclear whether or not this will occur for JDK 9.
Yes, I'm doing my work against 9. So is there any problem if I proceed with my change? I understand that there is no conflict as JDK-8134780 progresses and replaces the StrictMath methods by their counterparts in Java. Please advise.
If I manage to finish the fdlibm C -> Java port in JDK 9, the changes you are proposing would eventually be removed as unneeded since the C code wouldn't be there to get compiled anymore.
Is it intended to downport JDK-8134780 to 8?
Such a backport would be technically possible, but we at Oracle don't currently plan to do so.
Methods in the Math class, such as pow, are often intrinsified and use a different algorithm, so a straight performance comparison may not be as fair or meaningful in those cases.
I agree. It's just that the issue on StrictMath methods was first noted due to that huge gap (Math vs StrictMath) on PPC64, which is not prominent on x64.
Depending on how Math.{sin, cos} is implemented on PPC64, compiling the fdlibm sin/cos with more aggressive optimizations should not be expected to close the performance gap. In particular, if Math.{sin, cos} is an intrinsic on PPC64 (I haven't checked the sources) that uses a platform-specific feature (say, fused multiply-add instructions), then just compiling fdlibm more aggressively wouldn't necessarily make up that gap.
To allow cross-platform and cross-release reproducibility, StrictMath is specified to use the particular fdlibm algorithms, which precludes using better algorithms developed more recently. If we were to start with a clean slate today, to get such reproducibility we would specify correctly-rounded behavior of all those methods, but such an approach was much less technically tractable 20+ years ago without the benefit of the research that has been done in the interim, such as the work of Prof. Muller and associates: https://lipforge.ens-lyon.fr/projects/crlibm/.
Accumulating the results of the functions and comparing the sums is not a sufficiently robust way of checking whether the optimized versions are indeed equivalent to the non-optimized ones. The specification of StrictMath requires a particular result for each set of floating-point arguments, and sums can round away low-order bits that differ.
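
To illustrate the point about sums hiding differences, here is a small sketch in C. All function names are hypothetical and chosen for this example; it assumes a 64-bit IEEE 754 double. One function compares accumulated sums, the other compares each result bit for bit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a double (assumes a 64-bit IEEE 754 double). */
static inline uint64_t raw_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Weak check: compares only the accumulated sums of two result arrays. */
static inline int sums_match(const double *a, const double *b, size_t n)
{
    double sa = 0.0, sb = 0.0;
    for (size_t i = 0; i < n; i++) {
        sa += a[i];
        sb += b[i];
    }
    return sa == sb;
}

/* Strict check: index of the first element whose bits differ, or n if
 * every element is bitwise identical. */
static inline size_t first_bit_mismatch(const double *a, const double *b,
                                        size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (raw_bits(a[i]) != raw_bits(b[i]))
            return i;
    }
    return n;
}
```

With results {1.0e16, 0.5} and {1.0e16, 0.5 + 0x1p-53}, both sums round to 1.0e16, so sums_match reports agreement even though the second elements differ by one unit in the last place; first_bit_mismatch reports index 1. Note that a raw bit comparison also distinguishes 0.0 from -0.0 and NaNs with different payloads, so a real test may want to treat NaN results separately.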
That's a really good point, thanks for letting me know about that. I'll re-test my
change under that perspective.
Running the JDK math library regression tests and corresponding JCK tests is recommended for work in this area.
Got it. By "the JDK math library regression tests", which test
suite do you mean exactly? The jtreg tests?
Specifically, the regression tests under test/java/lang/Math and test/java/lang/StrictMath in the jdk repository. There are some other math library tests in the hotspot repo, but I don't know where they are offhand.
A note on methodology: when I've been writing tests for my port, I've tried to include test cases that exercise all the branch points in the code. Due to the large input space (~2^64 for a single-argument method), random sampling alone is an inefficient way to try to find differences in behavior.
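
One way to build branch-directed inputs, sketched in C under some assumptions: fdlibm code commonly branches on the high-order 32-bit word of the argument, so a test can probe the doubles immediately below, at, and above each threshold. The helper names are hypothetical, the sketch assumes a 64-bit IEEE 754 double, and the actual threshold constants would have to be read from the fdlibm source for each method.

```c
#include <stdint.h>
#include <string.h>

/* Build a double from a raw 64-bit pattern (assumes IEEE 754 binary64). */
static inline double from_bits(uint64_t bits)
{
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Hypothetical helper: given the high word of a positive, finite, nonzero
 * threshold (low word taken as zero), return the double that lies `step`
 * representable values away from it. Use step = -1, 0, +1 to probe just
 * below, exactly at, and just above the branch point.
 * Precondition: the sign bit of hi_word is clear. */
static inline double threshold_neighbor(uint32_t hi_word, int step)
{
    int64_t bits = (int64_t)((uint64_t)hi_word << 32);
    return from_bits((uint64_t)(bits + step));
}
```

For instance, with the purely illustrative high word 0x3ff00000 (the value 1.0), steps -1, 0, and +1 give 1.0 - 0x1p-53, 1.0, and 1.0 + 0x1p-52. Three such inputs per branch point cover both sides of each comparison far more cheaply than random sampling of the input space.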
For testing against JCK/TCK I'll need some help on that.
I believe the JCK/TCK does have additional test cases relevant here.

HTH; thanks,

-Joe
