Hi Sergey, thank you for looking at this problem.
I confirm that your simple patch improves performance on my laptop (Ubuntu 16.04, gcc 5.4 like yours, Intel i4700 CPU) when I run the ellipse JMH test.

- ojdk9 without patch:
Benchmark                    (size)  Mode  Cnt   Score   Error  Units
EllipseRdrTest.drawEllipse      100  avgt    6   0,233 ± 0,007  ms/op
EllipseRdrTest.drawEllipse      500  avgt    6   1,203 ± 0,004  ms/op
EllipseRdrTest.drawEllipse      900  avgt    6   2,361 ± 0,458  ms/op
*EllipseRdrTest.drawEllipse    1400  avgt    6   4,023 ± 0,028  ms/op*
EllipseRdrTest.fillEllipse      100  avgt    6   0,198 ± 0,010  ms/op
EllipseRdrTest.fillEllipse      500  avgt    6   1,858 ± 0,046  ms/op
*EllipseRdrTest.fillEllipse     900  avgt    6   4,962 ± 0,393  ms/op*
*EllipseRdrTest.fillEllipse    1400  avgt    6  10,475 ± 0,035  ms/op*

- ojdk9 with patch:
Benchmark                    (size)  Mode  Cnt   Score   Error  Units
EllipseRdrTest.drawEllipse      100  avgt    6   0,232 ± 0,006  ms/op
EllipseRdrTest.drawEllipse      500  avgt    6   1,203 ± 0,021  ms/op
EllipseRdrTest.drawEllipse      900  avgt    6   2,355 ± 0,467  ms/op
*EllipseRdrTest.drawEllipse    1400  avgt    6   3,835 ± 0,632  ms/op*
EllipseRdrTest.fillEllipse      100  avgt    6   0,191 ± 0,010  ms/op
EllipseRdrTest.fillEllipse      500  avgt    6   1,793 ± 0,029  ms/op
*EllipseRdrTest.fillEllipse     900  avgt    6   4,741 ± 0,062  ms/op*
*EllipseRdrTest.fillEllipse    1400  avgt    6   8,810 ± 0,100  ms/op*

- reference jdk8 with marlin 0.7.4 (comparable):
Benchmark                    (size)  Mode  Cnt   Score   Error  Units
EllipseRdrTest.drawEllipse      100  avgt    6   0,231 ± 0,002  ms/op
EllipseRdrTest.drawEllipse      500  avgt    6   1,199 ± 0,013  ms/op
EllipseRdrTest.drawEllipse      900  avgt    6   2,282 ± 0,006  ms/op
EllipseRdrTest.drawEllipse     1400  avgt    6   3,600 ± 0,133  ms/op
EllipseRdrTest.fillEllipse      100  avgt    6   0,189 ± 0,001  ms/op
EllipseRdrTest.fillEllipse      500  avgt    6   1,777 ± 0,009  ms/op
EllipseRdrTest.fillEllipse      900  avgt    6   4,856 ± 0,110  ms/op
EllipseRdrTest.fillEllipse     1400  avgt    6  10,252 ± 0,302  ms/op

If needed, I can also run the test against Oracle JDK9 EA builds.

Cheers & Happy holidays,
Laurent

2016-12-21 15:44 GMT+01:00 Sergey Bylokhov <[email protected]>:

> Hi, Laurent.
> Can you please check the next patch:
> ==========
> diff -r 8a61c000a194 make/lib/Awt2dLibraries.gmk
> --- a/make/lib/Awt2dLibraries.gmk Tue Dec 20 09:52:14 2016 -0800
> +++ b/make/lib/Awt2dLibraries.gmk Wed Dec 21 17:33:36 2016 +0300
> @@ -222,6 +222,7 @@
>  # applies to debug builds.
>  ifeq ($(TOOLCHAIN_TYPE), gcc)
>    BUILD_LIBAWT_debug_mem.c_CFLAGS := -w
> +  LIBAWT_CFLAGS += -fgcse-after-reload
>  endif
>
>
>  $(eval $(call SetupNativeCompilation,BUILD_LIBAWT, \
> ==========
>
> It seems that this is the simplest version that produces good
> performance results and is safe enough to be integrated. On my system
> (Ubuntu, gcc 5.4) it speeds up the default rasterizer from 8.400 to
> 6.200 ms/op (+- 20%). The default rasterizer in OracleJDK 8u112 scores
> 6.500.
> The fix does not affect the public jdk9 (which is built by RE with gcc
> 4.9.2); gcc 4.9.2 seems to produce good results with and without this
> option.
>
> We should also be wary of compiler options that are a win on one
> processor family and a loss on another. Anything that schedules
> instructions may be specific to a particular generation of CPUs, for
> instance. Or for i5 vs i7 vs M(obile)...
>
> ...jim
>
> On 10/2/15 9:10 AM, Laurent Bourgès wrote:
>
> Sergey,
>
> thanks for the information:
>
> I tried your gcc options on my Ubuntu 14.04 (gcc 4.8.4) and it is
> actually slightly faster: 10% on my fill ellipse test (450ms vs 490ms).
>
> I tested with your JMH test, and the difference becomes bigger at size
> 1400.
>
> Interesting; I will try too.
>
> Do you know which gcc compiler and options are used to build JavaSE EA?
>
> I guess that the compiler options in the makefile are the same,
> plus some default gcc options:
> jdk8:
> gcc (GCC) 4.3.0 20080428 (Red Hat-8) C compiler version 4.3.0-8)
>
> jdk9:
> gcc-4.8.2 - OEL5.5
>
> However, the gcc compilers are different: 4.3 vs 4.8.2!
>
> So it may be worth comparing their different optimization options; I
> guess somebody has already looked at that!
>
> Moreover, the Linux distribution may define default options.
>
> I will try to figure out all compiler options (command line + defaults)
> on my machine.
>
> It is not simple to find an option that will help everyone.
> The two options I suggested are the minimum subset of -O3 needed to get
> the maximum performance, and both seem reasonable. Actually, if I change
> -O2 to -O3 (OPTIMIZATION := LOW => OPTIMIZATION := HIGHEST),
> performance becomes worse.
>
> That is often the case with O3, but your patch seems a good win with
> only 2 enabled options.
>
> What is your build environment?
>
> Ubuntu 14.04 gcc 4.8.4
>
> I have the same, and I finally got my gcc options:
> gcc -c -Q -O2 --help=common
>
> Here are the differences between O2 and O3 with gcc 4.8.4:
>
> gcc -c -Q -O3 --help=optimizers > /tmp/O3-opts
> gcc -c -Q -O2 --help=optimizers > /tmp/O2-opts
> diff /tmp/O2-opts /tmp/O3-opts | grep enabled
>
> *> -fgcse-after-reload [enabled]
> *> -finline-functions [enabled]
> > -fipa-cp-clone [enabled]
> > -fpredictive-commoning [enabled]
> > -ftree-loop-distribute-patterns [enabled]
> > -ftree-partial-pre [enabled]
> *> -ftree-vectorize [enabled]
> *> -funswitch-loops [enabled]
> > -fvect-cost-model [enabled]
>
> So we could evaluate some of these options and see which is the best
> compromise for libawt on gcc 4.8!
>
> Regards,
> Laurent

--
Laurent Bourgès
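
PS: One way to read the two ojdk9 tables at the top of this message is as a relative gain per benchmark size. The Python sketch below is only an illustration: the scores are copied from those tables, and the helper names (parse_score, relative_gain) are hypothetical and not part of the JMH test. It ignores the reported error intervals, so a gain smaller than its error (for example drawEllipse at size 1400) should not be read as significant.

```python
# Mean JMH scores (ms/op) copied from the two ojdk9 tables in this message.
# Keys are (benchmark, size); values are (without patch, with patch).
# Scores keep the decimal comma printed by JMH in this locale.
SCORES = {
    ("drawEllipse", 100): ("0,233", "0,232"),
    ("drawEllipse", 500): ("1,203", "1,203"),
    ("drawEllipse", 900): ("2,361", "2,355"),
    ("drawEllipse", 1400): ("4,023", "3,835"),
    ("fillEllipse", 100): ("0,198", "0,191"),
    ("fillEllipse", 500): ("1,858", "1,793"),
    ("fillEllipse", 900): ("4,962", "4,741"),
    ("fillEllipse", 1400): ("10,475", "8,810"),
}


def parse_score(text: str) -> float:
    """Convert a decimal-comma score such as '10,475' to a float."""
    return float(text.replace(",", "."))


def relative_gain(before: str, after: str) -> float:
    """Time saved by the patch, as a percentage of the unpatched score."""
    b = parse_score(before)
    a = parse_score(after)
    return (b - a) / b * 100.0


if __name__ == "__main__":
    for (name, size), (before, after) in SCORES.items():
        print(f"{name:12s} {size:5d}  {relative_gain(before, after):6.1f} %")
```

With these numbers the largest gain is on fillEllipse at size 1400, about 15.9%, which matches the case where the unpatched ojdk9 build was furthest from its patched score.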
