RE: minimal gcc version for jdk/jdk and usage of -Wno-int-in-bool-context flag in HS build

2018-11-02 Thread Baesken, Matthias
Hi Magnus / Erik, thanks for explaining!

Best regards, Matthias

> -----Original Message-----
> From: Magnus Ihse Bursie 
> Sent: Thursday, November 1, 2018 11:36
> To: Baesken, Matthias ; Aleksey Shipilev
> ; 'build-dev@openjdk.java.net'  d...@openjdk.java.net>
> Subject: Re: minimal gcc version for jdk/jdk and usage of -Wno-int-in-bool-
> context flag in HS build
> 
> 
> 
> On 2018-10-31 17:17, Baesken, Matthias wrote:
> > Hi Aleksey, I tried with gcc 6.2, not 6.3; maybe this makes a
> > difference.
> >
> > Unfortunately I cannot see your exact gcc/g++ compile calls from the
> logfile.
> > Do you have "-Wall" set, which seems to be necessary to enable the
> additional flag?
> >
> > https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
> >
> > -Wint-in-bool-context
> > Warn for suspicious use of integer values where boolean values are
> expected, such as conditional expressions (?:) using non-boolean integer
> constants in boolean context, like if (a <= b ? 2 : 3).
> > Or left shifting of signed integers in boolean context, like for (a = 0; 1 
> > << a;
> a++);. Likewise for all kinds of multiplications regardless of the data type.
> > This warning is enabled by -Wall.
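The construct the warning targets can be reproduced in a few lines. In this sketch (demo file names are made up), the ternary expression is itself the if-condition, so the branch is always taken regardless of `a` and `b`; gcc 7 and later with -Wall flag it, older gcc compiles silently:

```shell
# Minimal reproduction of the int-in-bool-context pattern: the whole
# expression "a <= b ? 2 : 3" is the condition, and both 2 and 3 are
# nonzero, so the if-branch is always taken.
cat > demo.c <<'EOF'
#include <stdio.h>
int main(void) {
    int a = 5, b = 1;
    if (a <= b ? 2 : 3)   /* probably meant: if (a <= b) ... */
        puts("branch taken");
    else
        puts("branch not taken");
    return 0;
}
EOF
gcc -Wall -o demo demo.c 2> warnings.txt
./demo    # prints "branch taken" for any values of a and b
```

With a recent enough gcc, warnings.txt will contain the -Wint-in-bool-context diagnostic for the if-condition.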
> 
> Since JDK-8211029, we have -Wall enabled for hotspot on gcc.
> 
> I can only reiterate what Erik says. GCC has, for a long time,
> silently ignored -Wno-XXX for any XXX it does not support or
> understand. *However*, if *another* failure occurs, then gcc prints out
> warnings for the unknown -Wno-XXX flags as well. So those are just red
> herrings: if you see them, you will also see another error, and that is
> your real problem.
> 
> /Magnus
> 
> >
> >
> > The configure flag --with-extra-cflags is not used by us.
> >
> > Best regards, Matthias
> >
> >
> >



Re: RFR(M) 8152988: [AOT] Update test batch definitions to include aot-ed java.base module mode into hs-comp testing

2018-11-02 Thread Magnus Ihse Bursie

On 2018-11-02 05:32, Ekaterina Pavlova wrote:

yes, it is at the same location:
 http://cr.openjdk.java.net/~epavlova//8152988/webrev.01/

I see.

In the future, please do not update webrevs "in place" (except perhaps 
for minor changes like a typo in a comment). This invalidates the 
history of the webrev, and makes it impossible to follow the discussion 
in the mailing list. Instead, create a new version of the webrev each 
time (webrev.02, ...).


Looks good now. Thanks for hanging in there, even though it has taken 
quite some time!


I noticed a single minor thing, there's a double space here:
+    $1_JAOTC_OPTS +=  --compile-with-assertions

If you want to, you can remove it. You do not need any more webrevs for 
that.


/Magnus



-katya

On 11/1/18 3:15 PM, Magnus Ihse Bursie wrote:

Do you have an updated webrev?

/Magnus

On 1 Nov 2018, at 22:28, Ekaterina Pavlova 
 wrote:


Magnus,

thanks for the review.
I fixed everything you pointed to.

thanks,
-katya


On 11/1/18 6:46 AM, Magnus Ihse Bursie wrote:
Hi Katya,
Sorry for the late response. :-(

On 2018-10-29 21:24, Ekaterina Pavlova wrote:
Vladimir,
I added "-XX:+UseAOTStrictLoading" to "java -version", see 
make/RunTests.gmk.
Also added "--compile-with-assertions" to jaotc flags in case the "-ea" 
flag was present in vm options
(otherwise the AOTed library will warn that it does not have java 
assertions in compiled code).

This code:
+ ifneq ($$(findstring -ea, $$($1_VM_OPTIONS)), )
Using findstring is somewhat brittle here, since it will find 
substrings as well. Better to use filter, which matches an exact 
word.
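The difference is easy to see with a throwaway makefile (the option name -eagerinit is made up to show the substring trap):

```shell
cat > demo.mk <<'EOF'
VM_OPTIONS := -eagerinit -Xmx1g
$(info findstring: [$(findstring -ea,$(VM_OPTIONS))])
$(info filter:     [$(filter -ea,$(VM_OPTIONS))])
all: ;
EOF
make -f demo.mk
# prints:
#   findstring: [-ea]   <- false positive, matched inside -eagerinit
#   filter:     []      <- no whole word -ea present
```

findstring does a raw substring search over the expanded text, while filter compares whole whitespace-separated words, which is what flag matching needs here.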


I will add -XX:+UseAOTStrictLoading and 
"-Xlog:aot+class+fingerprint=trace -Xlog:aot+class+load=trace"

directly to AOT testing tasks in mach5 jobs.

Magnus,
I changed
  "$$(addprefix -vmoption:, $$($1_AOT_OPTIONS))"
to
+  ifneq ($$($1_AOT_OPTIONS), )
+    $1_JTREG_BASIC_OPTIONS += -vmoptions:"$$($1_AOT_OPTIONS)"
+  endif

Please review updated webrev:
http://cr.openjdk.java.net/~epavlova//8152988/webrev.01/
Some fallout from my and Erik's discussion still needs implementing 
in RunTests.gmk:

$$(addprefix --compile-commands , $$($1_AOT_CCLIST)) \
Please rewrite as:
$$(addprefix --compile-commands$(SPACE), $$($1_AOT_CCLIST)) \
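The $(SPACE) is needed because addprefix glues the prefix directly onto each word, and a literal trailing space in the prefix argument would be stripped; make codebases therefore smuggle it in via a variable. A minimal sketch (file names made up; the OpenJDK build defines SPACE with this same idiom):

```shell
cat > demo.mk <<'EOF'
EMPTY :=
SPACE := $(EMPTY) $(EMPTY)
CCLIST := a.txt b.txt
$(info without: [$(addprefix --compile-commands,$(CCLIST))])
$(info with:    [$(addprefix --compile-commands$(SPACE),$(CCLIST))])
all: ;
EOF
make -f demo.mk
# prints:
#   without: [--compile-commandsa.txt --compile-commandsb.txt]
#   with:    [--compile-commands a.txt --compile-commands b.txt]
```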
I'm still not happy with the always printed "java -version". :( 
Vladimir agreed that it would be reasonable to redirect this into a 
file. This can be done by e.g.

+ $$(call ExecuteWithLog, $$@.check, \
+ $$(FIXPATH) $$(JDK_IMAGE_DIR)/bin/java \
+ $$($1_VM_OPTIONS) -XX:+PrintAOT -XX:+UseAOTStrictLoading 
-XX:AOTLibrary=$$@ -version \

+ > $$@.verify-aot
+ )
This will redirect the output to $($1_AOT_LIB).verify-aot in 
test-support/aot.
I still would like to see the if statement around SetupAot moved 
out, like this:

+ ifneq ($$(GTEST_AOT_MODULES), )
+   $$(eval $$(call SetupAot, $1, \
+       MODULES := $$(GTEST_AOT_MODULES), \
+       VM_OPTIONS := $$(GTEST_VM_OPTIONS) $$(GTEST_JAVA_OPTIONS), \
+   ))
+ endif
and similarly for JTREG. This is coupled with removing the "ifneq 
($$($1_MODULES), )" in SetupAotBody (and do not forget to un-indent 
two spaces).

/Magnus


thanks,
-katya



On 10/23/18 4:40 PM, Vladimir Kozlov wrote:

On 10/22/18 9:36 AM, Ekaterina Pavlova wrote:
Vladimir,

thanks a lot for the review.
"-XX:+PrintAOT" was only passed to "java -version" which is done 
before the testing to verify that AOTed libraries work.


If it is only for -version then using  -XX:+PrintAOT is okay.

It also could perhaps be useful debug info in case some tests 
start to fail.
But I can change -XX:+PrintAOT to -XX:+UseAOTStrictLoading if 
you think it is more than enough.


-XX:+UseAOTStrictLoading is more important for tests which have 
different flags specified and may invalidate the AOTLibrary 
configuration. We would want to catch tests which will not use 
AOT libraries, at least while we test these changes of yours.




Do you suggest to add "-Xlog:aot+class+fingerprint=trace 
-Xlog:aot+class+load=trace" to "java -version" only

or to java flags used during tests execution as well?


These options are the next step after -XX:+UseAOTStrictLoading. We 
were asked before how we can guarantee that AOTed methods are 
used by tests. One way is PrintAOT, which should show published 
AOT methods. Another way is the Xlog flags, which have less output 
but provide at least some level of information that AOTed classes 
are used and not skipped.


Thanks,
Vladimir




-katya


On 10/19/18 10:40 AM, Vladimir Kozlov wrote:
I share the concern about the output of -XX:+PrintAOT - it prints all 
AOT classes and methods. For AOTed java.base the output will be 
almost the same for all tests (depending on which java.base 
classes are used).


I would suggest instead to use -XX:+UseAOTStrictLoading, which 
makes the VM exit(1) if it can't load an AOT library for some reason 
(mismatched configuration, e.g. different GCs).


Also -Xlog:aot+class+fingerprint=trace to trace cases of 
mismatching class fingerprints.
Also -Xlog:aot+class+load=trace to trace AOT class loading or 
skipping them if 

Re: RFR(S): remove ExecuteInternalVMTests and VerboseInternalVMTests flags

2018-11-02 Thread Magnus Ihse Bursie

Hi Igor,

On 2018-11-01 22:15, Igor Ignatyev wrote:

http://cr.openjdk.java.net/~iignatyev//8213058/webrev.00/index.html
What release are you targeting? This does not look like a patch that can 
be applied to jdk/jdk, at least not after JDK-8210958.


When removing the "hotspot-internal" test, please also update 
make/common/FindTests.gmk, so it does not get added to ALL_NAMED_TESTS. 
Also, please update the test documentation at doc/testing.md and 
testing.html. (Update the markdown file, then you can regenerate the 
html file using "make update-build-docs".)


/Magnus


174 lines changed: 0 ins; 170 del; 4 mod;

Hi all,

could you please review this small clean up which removes 
ExecuteInternalVMTests and VerboseInternalVMTests flags and related make 
targets and tests?

8177708[1-3] is to convert the last of internal vm tests, so the whole 
InternalVMTests can be removed.

JBS: https://bugs.openjdk.java.net/browse/JDK-8213058
webrev: http://cr.openjdk.java.net/~iignatyev//8213058/webrev.00/index.html

testing: tier1, build all regular platforms
[1] https://bugs.openjdk.java.net/browse/JDK-8177708
[2] http://cr.openjdk.java.net/~iignatyev//8177708/webrev.00/index.html
[3] 
http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-October/030633.html

Thanks,
-- Igor




Re: RFR(S): remove ExecuteInternalVMTests and VerboseInternalVMTests flags

2018-11-02 Thread Magnus Ihse Bursie



On 2018-11-02 00:37, Igor Ignatyev wrote:

on second thought, removing these macros isn't that big; here is an incremental 
webrev: http://cr.openjdk.java.net/~iignatyev//8213058/webrev.0-1
Besides removing test_log, it also includes fixes in build and doc needed due to rebasing.

Erik, could you please re-review build changes?

http://cr.openjdk.java.net/~iignatyev//8213058/webrev.01/ 
 is the whole webrev.


Sorry, I missed your updated webrev. You still need to fix 
FindTests.gmk, though.


/Magnus





Thanks,
-- Igor



On Nov 1, 2018, at 4:23 PM, Igor Ignatyev  wrote:

Hi David,

removing usage of test_log will just mix "unneeded" changes w/ this clean up. 
As TestReservedSpace_test, TestReserveMemorySpecial_test, TestVirtualSpace_test, and 
TestMetaspaceUtils_test are to be removed by 8213269[*], I don't think we need to pay 
much attention to their code.

[*] https://bugs.openjdk.java.net/browse/JDK-8213269 


-- Igor


On Nov 1, 2018, at 4:16 PM, David Holmes  wrote:

Hi Igor,

There's no point having empty test_log macros that do nothing. The macro and 
all uses should just be deleted ... unless you plan on adding some other form 
of logging for this?

Thanks,
David

On 2/11/2018 7:15 AM, Igor Ignatyev wrote:

http://cr.openjdk.java.net/~iignatyev//8213058/webrev.00/index.html

174 lines changed: 0 ins; 170 del; 4 mod;

Hi all,
could you please review this small clean up which removes 
ExecuteInternalVMTests and VerboseInternalVMTests flags and related make 
targets and tests?
8177708[1-3] is to convert the last of internal vm tests, so the whole 
InternalVMTests can be removed.
JBS: https://bugs.openjdk.java.net/browse/JDK-8213058
webrev: http://cr.openjdk.java.net/~iignatyev//8213058/webrev.00/index.html
testing: tier1, build all regular platforms
[1] https://bugs.openjdk.java.net/browse/JDK-8177708
[2] http://cr.openjdk.java.net/~iignatyev//8177708/webrev.00/index.html
[3] 
http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-October/030633.html
Thanks,
-- Igor




Re: Stop using precompiled headers for Linux?

2018-11-02 Thread Magnus Ihse Bursie

On 2018-11-02 00:53, Ioi Lam wrote:
Maybe precompiled.hpp can be periodically (weekly?) updated by a 
robot, which parses the dependency files generated by gcc and picks 
the most popular N files?
I think that's tricky to implement automatically. However, I've done 
more or less that, and I've got some wonderful results! :-)


I'd still like to run some more tests, but preliminary data indicates 
that there is much to be gained by having a more sensible list of files 
in the precompiled header.


The fewer files we have on this list, the less likely it is to become 
(drastically) outdated. So I don't think we need to do this 
automatically, but perhaps manually every now and then when we feel 
build times are increasing.


/Magnus



- Ioi


On 11/1/18 4:38 PM, David Holmes wrote:
It's not at all obvious to me that the way we use PCH is the 
right/best way to use it. We dump every header we think it would be 
good to precompile into precompiled.hpp and then ask gcc to 
precompile that single file. The result is a ~250MB file that has to be read in 
and processed for every source file! That doesn't seem very efficient 
to me.


Cheers,
David

On 2/11/2018 3:18 AM, Erik Joelsson wrote:

Hello,

My point here, which wasn't very clear, is that Mac and Linux seem 
to lose just as much real compile time. The big difference in these 
tests was rather the number of cpus in the machine (32 threads in 
the linux box vs 8 on the mac). The total amount of work done 
increased when PCH was disabled; that's the user time. Here is my 
theory on why the real (wall clock) time was not consistent with 
user time between these experiments:


With pch the time line (simplified) looks like this:

1. Single thread creating PCH
2. All cores compiling C++ files

When disabling pch it's just:

1. All cores compiling C++ files

To gain speed with PCH, the time spent in 1 must be less than the 
time saved in 2. The potential time saved in 2 goes down as the 
number of cpus goes up. I'm pretty sure that if I repeated the 
experiment on Linux on a smaller box (typically one we use in CI), 
the results would look similar to Macosx, and similarly, if I had 
access to a much bigger mac, it would behave like the big Linux box. 
This is why I'm saying this should be done for both or none of these 
platforms.


In addition to this, the experiment only built hotspot. If we 
would instead build the whole JDK, then the time wasted in 1 in the 
PCH case would be negated to a large extent by other build targets 
running concurrently, so for a full build, PCH is still providing 
value.


The question here is: if the value of PCH isn't very big, 
perhaps it's not worth it, given that it's also creating as much grief as 
described here. There is no doubt that there is value, however. And 
given the examination done by Magnus, it seems this value could be 
increased.


The main reason why we haven't disabled PCH in CI before is this: we 
really really want to get CI builds fast. We don't have a ton of 
over capacity to just throw at it. PCH made builds faster, so we 
used them. My other reason is consistency between builds. Supporting 
multiple different modes of building creates the potential for 
inconsistencies. For that reason I would definitely not support 
having PCH on by default, but turned off in our CI/dev-submit. We 
pick one or the other as the official build configuration, and we 
stick with the official build configuration for all builds of any 
official capacity (which includes CI).


In the current CI setup, we have a bunch of tiers that execute one 
after the other. The jdk-submit currently only runs tier1. In tier2 
I've put slowdebug builds with PCH disabled, just to help verify a 
common developer configuration. These builds are not meant to be 
used for testing or anything like that, they are just run for 
verification, which is why this is ok. We could argue that it would 
make sense to move the linux-x64-slowdebug without pch build to 
tier1 so that it's included in dev-submit.


/Erik

On 2018-11-01 03:38, Magnus Ihse Bursie wrote:



On 2018-10-31 00:54, Erik Joelsson wrote:
Below are the corresponding numbers from a Mac, (Mac Pro (Late 
2013), 3.7 GHz, Quad-Core Intel Xeon E5, 16 GB). To be clear, the 
-npch is without precompiled headers. Here we see a slight 
degradation when disabling on both user time and wall clock time. 
My guess is that the user time increase is about the same, but 
because of a lower cpu count, the extra load is not as easily 
covered.


These tests were run with just building hotspot. This means that 
the precompiled header is generated alone on one core while 
nothing else is happening, which would explain this degradation in 
build speed. If we were instead building the whole product, we 
would see a better correlation between user and real time.


Given the very small benefit here, it could make sense to disable 
precompiled headers by default for Linux and Mac, just as we did 
with

Re: Stop using precompiled headers for Linux?

2018-11-02 Thread Magnus Ihse Bursie



On 2018-11-02 11:39, Magnus Ihse Bursie wrote:

On 2018-11-02 00:53, Ioi Lam wrote:
Maybe precompiled.hpp can be periodically (weekly?) updated by a 
robot, which parses the dependencies files generated by gcc, and pick 
the most popular N files?
I think that's tricky to implement automatically. However, I've done 
more or less, that, and I've got some wonderful results! :-)


Ok, I'm done running my tests.

TL;DR: I've managed to reduce wall-clock time from 2m 45s (with pch) or 
2m 23s (without pch), to 1m 55s. The cpu time spent went from 52m 27s 
(with pch) or 55m 30s (without pch) to 41m 10s. This is a huge gain for 
our automated builds! And a clear improvement even for the ordinary 
developer.


The list of included header files is reduced to just 37. The winning 
combination was to include all header files that were included in more 
than 130 different files, but to exclude all files with the name 
"*.inline.hpp". Hopefully, a further gain of not pulling in the 
*.inline.hpp files is that the risk of pch/non-pch failures will diminish.


However, these 37 files in turn pull in an additional 201 header files. 
Of these, three are *.inline.hpp:
share/jfr/recorder/checkpoint/types/traceid/jfrTraceIdBits.inline.hpp, 
os_cpu/linux_x86/bytes_linux_x86.inline.hpp and 
os_cpu/linux_x86/copy_linux_x86.inline.hpp. This looks like a problem 
with the header files to me.


With some exceptions (mostly related to JFR), these additional 200 files 
have "generic" looking names (like share/gc/g1/g1_globals.hpp), which 
indicate to me that it is reasonable to have them in this list, just as 
the list of the original 37 tended to be quite general and high-level 
includes. However, some files (like 
share/jfr/instrumentation/jfrEventClassTransformer.hpp) has maybe leaked 
in where they should not really be. It might be worth letting a hotspot 
engineer spend some cycles to check up these files and see if anything 
can be improved.


Caveats: I have only run this on my local linux build with the default 
server JVM configuration. Other machines will have different sweet 
spots. Other JVM variants/feature combinations will have different sweet 
spots. And, most importantly, I have not tested this at all on Windows. 
Nevertheless, I'm almost prepared to suggest a patch that uses this 
selection of files if running on gcc, just as is, because of the speed 
improvements I measured.


And some data:

Here is the log from my runs. The "on or above" number is the cutoff I 
used: a header was selected if at least that many files included it. 
As you can see, there is not much difference between cutoffs between 
130-150, or (without the inline files) between 110 and 150. (There were 
a lot of additional inline files in the positions below 130.) With all 
else equal, I'd prefer a solution with fewer files. That is less likely 
to go bad.
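A sketch of how such a frequency list can be derived (the paths, the current-directory starting point, and the 130 cutoff are assumptions; this presupposes a build done with -MMD so the compiler emits .d dependency files):

```shell
# Count, across all .d dependency files under the current directory,
# how many dependency files mention each non-inline hotspot header,
# then keep the headers above the cutoff.
find . -name '*.d' -exec cat {} + \
  | tr ' \\' '\n\n' \
  | grep 'hotspot/share/.*\.hpp$' \
  | grep -v '\.inline\.hpp$' \
  | sort | uniq -c | sort -rn \
  | awk '$1 > 130 { print $2 }'
```

Each .d file lists every header a translation unit pulled in, so the uniq -c count approximates "number of source files that include this header", which is the selection criterion described above.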


real    2m45.623s
user    52m27.813s
sys    5m27.176s
hotspot with original pch

real    2m23.837s
user    55m30.448s
sys    3m39.739s
hotspot without pch

real    1m59.533s
user    42m50.019s
sys    3m0.893s
hotspot new pch on or above 250

real    1m58.937s
user    42m18.994s
sys    3m0.245s
hotspot new pch on or above 200

real    2m0.729s
user    42m16.636s
sys    2m57.125s
hotspot new pch on or above 170

real    1m58.064s
user    42m9.618s
sys    2m57.635s
hotspot new pch on or above 150

real    1m58.053s
user    42m9.796s
sys    2m58.732s
hotspot new pch on or above 130

real    2m3.364s
user    42m54.818s
sys    3m2.737s
hotspot new pch on or above 100

real    2m6.698s
user    44m30.434s
sys    3m12.015s
hotspot new pch on or above 70

real    2m0.598s
user    41m17.810s
sys    2m56.258s
hotspot new pch on or above 150 without inline

real    1m55.981s
user    41m10.076s
sys    2m51.983s
hotspot new pch on or above 130 without inline

real    1m56.449s
user    41m10.667s
sys    2m53.808s
hotspot new pch on or above 110 without inline

And here is the "winning" list (which I declared as "on or above 130, 
without inline"). I encourage everyone to try this on their own system, 
and report back the results!


#ifndef DONT_USE_PRECOMPILED_HEADER
# include "classfile/classLoaderData.hpp"
# include "classfile/javaClasses.hpp"
# include "classfile/systemDictionary.hpp"
# include "gc/shared/collectedHeap.hpp"
# include "gc/shared/gcCause.hpp"
# include "logging/log.hpp"
# include "memory/allocation.hpp"
# include "memory/iterator.hpp"
# include "memory/memRegion.hpp"
# include "memory/resourceArea.hpp"
# include "memory/universe.hpp"
# include "oops/instanceKlass.hpp"
# include "oops/klass.hpp"
# include "oops/method.hpp"
# include "oops/objArrayKlass.hpp"
# include "oops/objArrayOop.hpp"
# include "oops/oop.hpp"
# include "oops/oopsHierarchy.hpp"
# include "runtime/atomic.hpp"
# include "runtime/globals.hpp"
# include "runtime/handles.hpp"
# include "runtime/mutex.hpp"
# include "runtime/orderAccess.hpp"
# include "runtime/os.hpp"
# include "runtime/thread.hpp"
# include "runtime/time

Re: Stop using precompiled headers for Linux?

2018-11-02 Thread Magnus Ihse Bursie

On 2018-11-02 12:14, Magnus Ihse Bursie wrote:
Caveats: I have only run this on my local linux build with the default 
server JVM configuration. Other machines will have different sweet 
spots. Other JVM variants/feature combinations will have different 
sweet spots. And, most importantly, I have not tested this at all on 
Windows. Nevertheless, I'm almost prepared to suggest a patch that 
uses this selection of files if running on gcc, just as is, because of 
the speed improvements I measured. 


I've started running tests on other platforms. Unfortunately, I don't 
have access to quite as powerful machines, so everything takes much 
longer. For the moment, I've only tested my "BKM" (best known method) 
from linux, to see if it works.


For xcode/macos I got:
real    4m21,528s
user    27m28,623s
sys    2m18,244s
hotspot with original pch

real    4m28,867s
user    29m10,685s
sys    2m14,456s
hotspot without pch

real    3m6,322s
user    19m3,000s
sys    1m41,252s
hotspot with new BKM pch

So obviously this is a nice improvement even here. I could probably 
experiment a bit and see if there is an even better fit with a different 
selection of header files, but even without that, I'd say this patch is 
by itself as good for clang as it is for gcc.


For windows I got:
real    6m39.035s
user    0m58.580s
sys 2m48.138s
hotspot with original pch

real    10m29.227s
user    1m6.909s
sys    2m24.108s
hotspot without pch

real    6m56.262s
user    0m57.563s
sys    2m27.514s
hotspot with new BKM pch

I'm not sure what's going on with the user time numbers here. Presumably 
cygwin cannot get to the real Windows time data. What I can see is the 
huge difference in wall clock time between PCH and no PCH. I can also 
see that the new trimmed BKM list retains most of that improvement, but 
is actually somewhat slower than the original list. I'm currently 
rerunning with a larger set on Windows, to see if this helps improve 
things. I can certainly live with a precompiled.hpp that includes some 
additional files on Windows.


/Magnus




Re: RFR(M) 8152988: [AOT] Update test batch definitions to include aot-ed java.base module mode into hs-comp testing

2018-11-02 Thread Dmitry Chuyko

Hi Katya,

Just FYI, non-tiered .so libs from java.base and Graal are compiled and 
linked on aarch64 as expected. I.e. like


jaotc -J-Xmx4g --info --compile-with-assertions --output java.base.so 
--module java.base --compile-commands java.base-list.txt


So it will be good to have this change for our testing on ARMs.

-Dmitry

On 10/18/18 9:29 AM, Ekaterina Pavlova wrote:

Hi All,

Please review the changes which add support to build AOTed jdk modules 
and then use them during the testing.
Note, building of the AOTed libraries needs to be done on the same machine 
where the testing is going to be started.

So this cannot be done at jdk build time.

Updated files:
- make/RunTests.gmk, make/RunTestsPrebuilt.gmk, 
make/RunTestsPrebuiltSpec.gmk


  Main changes which add make procedures and targets to build AOT-ed 
libraries.
  New environment variable TEST_OPTS_AOT_MODULES is introduced to 
specify list of modules to build AOT-ed libraries for.

  Ex: TEST_OPTS_AOT_MODULES=java.base java.logging

- make/conf/jib-profiles.js
  Added Devkit installation on all platforms required to properly 
build AOTed libraries.


- test/hotspot/jtreg/compiler/aot/scripts/java.base-list.txt
test/hotspot/jtreg/compiler/aot/scripts/jdk.internal.vm.compiler-list.txt
  These lists were updated based on the latest jaotc error status.


    JBS: https://bugs.openjdk.java.net/browse/JDK-8152988
 webrev: http://cr.openjdk.java.net/~epavlova//8152988/webrev.01/
testing: Tested by running subset of hotspot, jdk and jck tests with 
TEST_OPTS_AOT_MODULES=java.base jdk.internal.vm.compiler

 with "make run-test" and in Mach5.

thanks,
-katya

p.s.
 Thanks a lot, Erik, for huge help with porting the original patch, done in 
the old testing framework (test/TestCommon.gmk), to the new one (make/RunTests*).




Re: Stop using precompiled headers for Linux?

2018-11-02 Thread Thomas Stüfe
Hi Magnus,

your winning variant gives me a nice boost on my thinkpad:

pch, standard:
real17m52.367s
user52m20.730s
sys 4m53.711s

pch, your variant:
real15m0.514s
user46m6.466s
sys 2m38.371s

(non-pch is ~19-20 minutes wall-clock time)

With those numbers, I might start using pch again on low powered machines.

.. Thomas



On Fri, Nov 2, 2018 at 12:14 PM Magnus Ihse Bursie
 wrote:
>
>
> On 2018-11-02 11:39, Magnus Ihse Bursie wrote:
> > On 2018-11-02 00:53, Ioi Lam wrote:
> >> Maybe precompiled.hpp can be periodically (weekly?) updated by a
> >> robot, which parses the dependencies files generated by gcc, and pick
> >> the most popular N files?
> > I think that's tricky to implement automatically. However, I've done
> > more or less, that, and I've got some wonderful results! :-)
>
> Ok, I'm done running my tests.
>
> TL;DR: I've managed to reduce wall-clock time from 2m 45s (with pch) or
> 2m 23s (without pch), to 1m 55s. The cpu time spent went from 52m 27s
> (with pch) or 55m 30s (without pch) to 41m 10s. This is a huge gain for
> our automated builds! And a clear improvement even for the ordinary
> developer.
>
> The list of included header files is reduced to just 37. The winning
> combination was to include all header files that was included in more
> than 130 different files, but to exclude all files with the name
> "*.inline.hpp". Hopefully, a further gain of not pulling in the
> *.inline.hpp files is that the risk of pch/non-pch failures will diminish.
>
> However, these 37 files in turn pull in an additional 201 header files.
> Of these, three are *.inline.hpp:
> share/jfr/recorder/checkpoint/types/traceid/jfrTraceIdBits.inline.hpp,
> os_cpu/linux_x86/bytes_linux_x86.inline.hpp and
> os_cpu/linux_x86/copy_linux_x86.inline.hpp. This looks like a problem
> with the header files to me.
>
> With some exceptions (mostly related to JFR), these additional 200 files
> have "generic" looking names (like share/gc/g1/g1_globals.hpp), which
> indicate to me that it is reasonable to have them in this list, just as
> the list of the original 37 tended to be quite general and high-level
> includes. However, some files (like
> share/jfr/instrumentation/jfrEventClassTransformer.hpp) has maybe leaked
> in where they should not really be. It might be worth letting a hotspot
> engineer spend some cycles to check up these files and see if anything
> can be improved.
>
> Caveats: I have only run this on my local linux build with the default
> server JVM configuration. Other machines will have different sweet
> spots. Other JVM variants/feature combinations will have different sweet
> spots. And, most importantly, I have not tested this at all on Windows.
> Nevertheless, I'm almost prepared to suggest a patch that uses this
> selection of files if running on gcc, just as is, because of the speed
> improvements I measured.
>
> And some data:
>
> Here is my log from my runs. The "on or above" means the cutoff I used
> for how many files that needed to include the files that were selected.
> As you can see, there is not much difference between cutoffs between
> 130-150, or (without the inline files) between 110 and 150. (There were
> a lot of additional inline files in the positions below 130.) With all
> other equal, I'd prefer a solution with fewer files. That is less likely
> to go bad.
>
> real2m45.623s
> user52m27.813s
> sys5m27.176s
> hotspot with original pch
>
> real2m23.837s
> user55m30.448s
> sys3m39.739s
> hotspot without pch
>
> real1m59.533s
> user42m50.019s
> sys3m0.893s
> hotspot new pch on or above 250
>
> real1m58.937s
> user42m18.994s
> sys3m0.245s
> hotspot new pch on or above 200
>
> real2m0.729s
> user42m16.636s
> sys2m57.125s
> hotspot new pch on or above 170
>
> real1m58.064s
> user42m9.618s
> sys2m57.635s
> hotspot new pch on or above 150
>
> real1m58.053s
> user42m9.796s
> sys2m58.732s
> hotspot new pch on or above 130
>
> real2m3.364s
> user42m54.818s
> sys3m2.737s
> hotspot new pch on or above 100
>
> real2m6.698s
> user44m30.434s
> sys3m12.015s
> hotspot new pch on or above 70
>
> real2m0.598s
> user41m17.810s
> sys2m56.258s
> hotspot new pch on or above 150 without inline
>
> real1m55.981s
> user41m10.076s
> sys2m51.983s
> hotspot new pch on or above 130 without inline
>
> real1m56.449s
> user41m10.667s
> sys2m53.808s
> hotspot new pch on or above 110 without inline
>
> And here is the "winning" list (which I declared as "on or above 130,
> without inline"). I encourage everyone to try this on their own system,
> and report back the results!
>
> #ifndef DONT_USE_PRECOMPILED_HEADER
> # include "classfile/classLoaderData.hpp"
> # include "classfile/javaClasses.hpp"
> # include "classfile/systemDictionary.hpp"
> # include "gc/shared/collectedHeap.hpp"
> # include "gc/shared/gcCause.hpp"
> # include "logging/log.hpp"
> # include "memory/allo

Re: Stop using precompiled headers for Linux?

2018-11-02 Thread Erik Joelsson

Nice work!

What exactly are you measuring, "make hotspot" or some other target?

If we can find a reasonable set of extra files for the windows pch that 
restores all or most of the performance, that would of course be 
preferable. I doubt we will find a significantly better selection on Mac 
compared to Linux though.


/Erik


On 2018-11-02 07:00, Magnus Ihse Bursie wrote:

On 2018-11-02 12:14, Magnus Ihse Bursie wrote:
Caveats: I have only run this on my local linux build with the 
default server JVM configuration. Other machines will have different 
sweet spots. Other JVM variants/feature combinations will have 
different sweet spots. And, most importantly, I have not tested this 
at all on Windows. Nevertheless, I'm almost prepared to suggest a 
patch that uses this selection of files if running on gcc, just as 
is, because of the speed improvements I measured. 


I've started running tests on other platforms. Unfortunately, I don't 
have access to quite as powerful machines, so everything takes much 
longer. For the moment, I've only tested my "BKM" (best known method) 
from linux, to see if it works.


For xcode/macos I got:
real    4m21,528s
user    27m28,623s
sys    2m18,244s
hotspot with original pch

real    4m28,867s
user    29m10,685s
sys    2m14,456s
hotspot without pch

real    3m6,322s
user    19m3,000s
sys    1m41,252s
hotspot with new BKM pch

So obviously this is a nice improvement even here. I could probably 
try around a bit and see if there is an even better fit with a 
different selections of header files, but even without that, I'd say 
this patch is by itself as good for clang as it is for gcc.


For windows I got:
real    6m39.035s
user    0m58.580s
sys 2m48.138s
hotspot with original pch

real    10m29.227s
user    1m6.909s
sys    2m24.108s
hotspot without pch

real    6m56.262s
user    0m57.563s
sys    2m27.514s
hotspot with new BKM pch

I'm not sure what's going on with the user time numbers here. 
Presumably cygwin cannot get to the real Windows time data. What I can 
see is the huge difference in wall clock time between PCH and no PCH. 
I can also see that the new trimmed BKM list retains most of that 
improvement, but is actually somewhat slower than the original list. 
I'm currently rerunning with a larger set on Windows, to see if this 
helps improve things. I can certainly live with a precompiled.hpp that 
includes some additional files on Windows.


/Magnus
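The suspicion about the Windows user times can be made concrete with a back-of-the-envelope calculation (my own sketch; the figures are taken from the runs quoted above). On a saturated parallel build, user time should exceed wall-clock time by roughly the core count, yet the cygwin-reported Windows ratio comes out far below 1, so child-process CPU time is evidently not being accounted for:

```python
def parallelism(wall_s, user_s):
    """CPU-seconds consumed per wall-clock second; ~ core count for a saturated parallel build."""
    return user_s / wall_s

# macOS no-PCH run quoted above: 29m10.685s user over 4m28.867s wall
mac = parallelism(4 * 60 + 28.867, 29 * 60 + 10.685)
# Windows no-PCH run: only 1m6.909s user over 10m29.227s wall
win = parallelism(10 * 60 + 29.227, 1 * 60 + 6.909)
print(f"macOS: {mac:.1f}x, Windows: {win:.2f}x")  # prints "macOS: 6.5x, Windows: 0.11x"
```

A ratio of 0.11 on a multi-core machine is impossible for the build as a whole, which supports the reading that only the wall-clock column is trustworthy on Windows.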






Re: Stop using precompiled headers for Linux?

2018-11-02 Thread Aleksey Shipilev
On 11/02/2018 12:14 PM, Magnus Ihse Bursie wrote:
> And here is the "winning" list (which I declared as "on or above 130, without 
> inline"). I encourage
> everyone to try this on their own system, and report back the results!
> 
> #ifndef DONT_USE_PRECOMPILED_HEADER
> # include "classfile/classLoaderData.hpp"
> # include "classfile/javaClasses.hpp"
> # include "classfile/systemDictionary.hpp"
> # include "gc/shared/collectedHeap.hpp"
> # include "gc/shared/gcCause.hpp"
> # include "logging/log.hpp"
> # include "memory/allocation.hpp"
> # include "memory/iterator.hpp"
> # include "memory/memRegion.hpp"
> # include "memory/resourceArea.hpp"
> # include "memory/universe.hpp"
> # include "oops/instanceKlass.hpp"
> # include "oops/klass.hpp"
> # include "oops/method.hpp"
> # include "oops/objArrayKlass.hpp"
> # include "oops/objArrayOop.hpp"
> # include "oops/oop.hpp"
> # include "oops/oopsHierarchy.hpp"
> # include "runtime/atomic.hpp"
> # include "runtime/globals.hpp"
> # include "runtime/handles.hpp"
> # include "runtime/mutex.hpp"
> # include "runtime/orderAccess.hpp"
> # include "runtime/os.hpp"
> # include "runtime/thread.hpp"
> # include "runtime/timer.hpp"
> # include "services/memTracker.hpp"
> # include "utilities/align.hpp"
> # include "utilities/bitMap.hpp"
> # include "utilities/copy.hpp"
> # include "utilities/debug.hpp"
> # include "utilities/exceptions.hpp"
> # include "utilities/globalDefinitions.hpp"
> # include "utilities/growableArray.hpp"
> # include "utilities/macros.hpp"
> # include "utilities/ostream.hpp"
> # include "utilities/ticks.hpp"
> #endif // !DONT_USE_PRECOMPILED_HEADER

"make clean hotspot" times on my TR 2950X Linux x86_64 build node:

 no PCH: {134s, 135s, 135s} wall, ~59m user
old PCH: {136s, 136s, 135s} wall, ~55m user
new PCH: {111s, 108s, 108s} wall, ~45m user

I am all for shallower PCH, even knowing I would disable it for my builds 
anyway :)

Thanks,
-Aleksey
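For anyone who wants to derive their own candidate list, the "on or above 130" criterion can be approximated by counting how often each header is directly included across the sources. The sketch below is an assumption about how such a selection could be made (the actual data presumably came from Magnus's own tooling); the helper names and the direct-include-only counting are illustrative:

```python
import os
import re

INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"')

def include_counts(src_root):
    """Count how many .cpp/.hpp files directly #include each quoted header."""
    counts = {}
    for dirpath, _, filenames in os.walk(src_root):
        for name in filenames:
            if not name.endswith((".cpp", ".hpp")):
                continue
            with open(os.path.join(dirpath, name), errors="replace") as f:
                for line in f:
                    m = INCLUDE_RE.match(line)
                    if m:
                        counts[m.group(1)] = counts.get(m.group(1), 0) + 1
    return counts

def pch_candidates(counts, threshold=130, skip_inline=True):
    """Headers included at least `threshold` times, optionally excluding *.inline.hpp."""
    return sorted(h for h, n in counts.items()
                  if n >= threshold and not (skip_inline and h.endswith(".inline.hpp")))
```

Running `pch_candidates(include_counts("src/hotspot"), 130)` against a checkout would then yield a list in the spirit of the one above; transitive includes are ignored here, so the real numbers would differ.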



Re: RFR(S): remove ExecuteInternalVMTests and VerboseInternalVMTests flags

2018-11-02 Thread Igor Ignatyev
Hi Magnus,

> You still need to fix FindTests.gmk, though.
thanks for spotting that. is [*] enough, or did I miss something else?

-- Igor

[*]
> diff -r c6d128f60997 make/common/FindTests.gmk
> --- a/make/common/FindTests.gmk Thu Nov 01 16:33:43 2018 -0700
> +++ b/make/common/FindTests.gmk Fri Nov 02 10:02:22 2018 -0700
> @@ -79,7 +79,7 @@
>  ALL_NAMED_TESTS += $(addprefix make-, $(MAKE_TEST_TARGETS))
>  
>  # Add special tests
> -ALL_NAMED_TESTS += hotspot-internal failure-handler make
> +ALL_NAMED_TESTS += failure-handler make
>  
>  
> 


> On Nov 2, 2018, at 1:27 AM, Magnus Ihse Bursie 
>  wrote:
> 
> 
> On 2018-11-02 00:37, Igor Ignatyev wrote:
>> on second thought, removing these macros isn't that big, here is an 
>> incremental webrev: 
>> http://cr.openjdk.java.net/~iignatyev//8213058/webrev.0-1 
>> . 
>> besides removing test_log, it also includes fixes in build and doc needed due 
>> to rebasing.
>> 
>> Erik, could you please re-review build changes?
>> 
>> http://cr.openjdk.java.net/~iignatyev//8213058/webrev.01/ 
>>  is the whole 
>> webrev.
> 
> Sorry, I missed your updated webrev. You still need to fix FindTests.gmk, 
> though.
> 
> /Magnus
> 
> 
> 
>> 
>> Thanks,
>> -- Igor
>> 
>> 
>>> On Nov 1, 2018, at 4:23 PM, Igor Ignatyev  wrote:
>>> 
>>> Hi David,
>>> 
>>> removing usage of test_log will just mix "unneeded" changes w/ this clean 
>>> up. as TestReservedSpace_test, TestReserveMemorySpecial_test, 
>>> TestVirtualSpace_test, and TestMetaspaceUtils_test are to be removed by 
>>> 8213269[*], I don't think we need to pay much attention to their code.
>>> 
>>> [*] https://bugs.openjdk.java.net/browse/JDK-8213269 
>>> 
>>> 
>>> -- Igor
>>> 
 On Nov 1, 2018, at 4:16 PM, David Holmes  wrote:
 
 Hi Igor,
 
 There's no point having empty test_log macros that do nothing. The macro 
 and all uses should just be deleted ... unless you plan on adding some 
 other form of logging for this?
 
 Thanks,
 David
 
 On 2/11/2018 7:15 AM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8213058/webrev.00/index.html
>> 174 lines changed: 0 ins; 170 del; 4 mod;
> Hi all,
> could you please review this small clean up which removes 
> ExecuteInternalVMTests and VerboseInternalVMTests flags and related make 
> targets and tests?
> 8177708[1-3] is to convert the last of internal vm tests, so the whole 
> InternalVMTests can be removed.
> JBS: https://bugs.openjdk.java.net/browse/JDK-8213058
> webrev: 
> http://cr.openjdk.java.net/~iignatyev//8213058/webrev.00/index.html
> testing: tier1, build all regular platforms
> [1] https://bugs.openjdk.java.net/browse/JDK-8177708
> [2] http://cr.openjdk.java.net/~iignatyev//8177708/webrev.00/index.html
> [3] 
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-October/030633.html
> Thanks,
> -- Igor
> 



Re: RFR(S): remove ExecuteInternalVMTests and VerboseInternalVMTests flags

2018-11-02 Thread Magnus Ihse Bursie


> 2 nov. 2018 kl. 18:03 skrev Igor Ignatyev :
> 
> Hi Magnus,
> 
>> You still need to fix FindTests.gmk, though.
> thanks for spotting that. is [*] enough, or did I miss something else?

This looks enough. Thanks!

/Magnus

> 
> -- Igor
> 
> [*]
>> diff -r c6d128f60997 make/common/FindTests.gmk
>> --- a/make/common/FindTests.gmk Thu Nov 01 16:33:43 2018 -0700
>> +++ b/make/common/FindTests.gmk Fri Nov 02 10:02:22 2018 -0700
>> @@ -79,7 +79,7 @@
>> ALL_NAMED_TESTS += $(addprefix make-, $(MAKE_TEST_TARGETS))
>> 
>> # Add special tests
>> -ALL_NAMED_TESTS += hotspot-internal failure-handler make
>> +ALL_NAMED_TESTS += failure-handler make
>> 
>> 
> 
> 
>> On Nov 2, 2018, at 1:27 AM, Magnus Ihse Bursie 
>>  wrote:
>> 
>> 
>>> On 2018-11-02 00:37, Igor Ignatyev wrote:
>>> on second thought, removing these macros isn't that big, here is an 
>>> incremental webrev: 
>>> http://cr.openjdk.java.net/~iignatyev//8213058/webrev.0-1 
>>> . 
>>> besides removing test_log, it also includes fixes in build and doc needed 
>>> due to rebasing.
>>> 
>>> Erik, could you please re-review build changes?
>>> 
>>> http://cr.openjdk.java.net/~iignatyev//8213058/webrev.01/ 
>>>  is the whole 
>>> webrev.
>> 
>> Sorry, I missed your updated webrev. You still need to fix FindTests.gmk, 
>> though.
>> 
>> /Magnus
>> 
>> 
>> 
>>> 
>>> Thanks,
>>> -- Igor
>>> 
>>> 
 On Nov 1, 2018, at 4:23 PM, Igor Ignatyev  wrote:
 
 Hi David,
 
 removing usage of test_log will just mix "unneeded" changes w/ this clean 
 up. as TestReservedSpace_test, TestReserveMemorySpecial_test, 
 TestVirtualSpace_test, and TestMetaspaceUtils_test are to be removed by 
 8213269[*], I don't think we need to pay much attention to their code.
 
 [*] https://bugs.openjdk.java.net/browse/JDK-8213269 
 
 
 -- Igor
 
> On Nov 1, 2018, at 4:16 PM, David Holmes  wrote:
> 
> Hi Igor,
> 
> There's no point having empty test_log macros that do nothing. The macro 
> and all uses should just be deleted ... unless you plan on adding some 
> other form of logging for this?
> 
> Thanks,
> David
> 
>> On 2/11/2018 7:15 AM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8213058/webrev.00/index.html
>>> 174 lines changed: 0 ins; 170 del; 4 mod;
>> Hi all,
>> could you please review this small clean up which removes 
>> ExecuteInternalVMTests and VerboseInternalVMTests flags and related make 
>> targets and tests?
>> 8177708[1-3] is to convert the last of internal vm tests, so the whole 
>> InternalVMTests can be removed.
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8213058
>> webrev: 
>> http://cr.openjdk.java.net/~iignatyev//8213058/webrev.00/index.html
>> testing: tier1, build all regular platforms
>> [1] https://bugs.openjdk.java.net/browse/JDK-8177708
>> [2] http://cr.openjdk.java.net/~iignatyev//8177708/webrev.00/index.html
>> [3] 
>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-October/030633.html
>> Thanks,
>> -- Igor
>> 
> 



Re: Stop using precompiled headers for Linux?

2018-11-02 Thread Magnus Ihse Bursie



On 2018-11-02 17:21, Erik Joelsson wrote:

Nice work!

What exactly are you measuring, "make hotspot" or some other target?

Yes, "make hotspot".

If we can find a reasonable set of extra files for the windows pch 
that restores all or most of the performance, that would of course be 
preferable. I doubt we will find a significantly better selection on 
Mac compared to Linux though.
It seems the best selection for Mac more or less exactly equals Linux. 
Which is nice. For Windows, I was able to more or less precisely match 
the original behaviour with the Linux set + the four inline.hpp files I 
removed for Linux:


# include "oops/oop.inline.hpp"
# include "memory/allocation.inline.hpp"
# include "oops/access.inline.hpp"
# include "runtime/handles.inline.hpp"

Then I got from the original:
real    6m39.035s
user    0m58.580s
sys    2m48.138s
hotspot with original pch

to:

real    6m18.645s
user    0m55.963s
sys    2m28.264s
hotspot with new pch, BKM (on and above 130), including inline

Quite good for just adding four more files depending on the Windows 
platform.


By adding yet some more include files (and keeping the inline files), I 
was able to improve Windows compile time somewhat more:

real    6m7.355s
user    0m55.718s
sys    2m26.153s
hotspot with new pch on and above 110, including inline

Then I also added this set:
// 130-110
# include "runtime/thread.inline.hpp"
# include "utilities/bitMap.inline.hpp"
# include "oops/arrayOop.inline.hpp"
# include "gc/shared/gcId.hpp"
# include "runtime/mutexLocker.hpp"
# include "oops/objArrayOop.inline.hpp"
# include "classfile/javaClasses.inline.hpp"
# include "memory/referenceType.hpp"
# include "oops/weakHandle.hpp"
# include "oops/compressedOops.inline.hpp"
# include "gc/shared/barrierSet.hpp"
# include "utilities/stack.hpp"
# include "gc/g1/g1YCTypes.hpp"
# include "memory/padded.hpp"
# include "logging/logHandle.hpp"

This starts to look a bit specialized (the g1 files are likely to need 
#ifdef guards etc), so maybe it's not worth it.


/Magnus





/Erik


On 2018-11-02 07:00, Magnus Ihse Bursie wrote:

On 2018-11-02 12:14, Magnus Ihse Bursie wrote:
Caveats: I have only run this on my local linux build with the 
default server JVM configuration. Other machines will have different 
sweet spots. Other JVM variants/feature combinations will have 
different sweet spots. And, most importantly, I have not tested this 
at all on Windows. Nevertheless, I'm almost prepared to suggest a 
patch that uses this selection of files if running on gcc, just as 
is, because of the speed improvements I measured. 


I've started running tests on other platforms. Unfortunately, I don't 
have access to quite as powerful machines, so everything takes much 
longer. For the moment, I've only tested my "BKM" (best known method) 
from linux, to see if it works.


For xcode/macos I got:
real    4m21,528s
user    27m28,623s
sys    2m18,244s
hotspot with original pch

real    4m28,867s
user    29m10,685s
sys    2m14,456s
hotspot without pch

real    3m6,322s
user    19m3,000s
sys    1m41,252s
hotspot with new BKM pch

So obviously this is a nice improvement even here. I could probably 
try around a bit and see if there is an even better fit with a 
different selection of header files, but even without that, I'd say 
this patch is by itself as good for clang as it is for gcc.


For windows I got:
real    6m39.035s
user    0m58.580s
sys    2m48.138s
hotspot with original pch

real    10m29.227s
user    1m6.909s
sys    2m24.108s
hotspot without pch

real    6m56.262s
user    0m57.563s
sys    2m27.514s
hotspot with new BKM pch

I'm not sure what's going on with the user time numbers here. 
Presumably cygwin cannot get to the real Windows time data. What I 
can see is the huge difference in wall clock time between PCH and no 
PCH. I can also see that the new trimmed BKM list retains most of 
that improvement, but is actually somewhat slower than the original 
list. I'm currently rerunning with a larger set on Windows, to see if 
this helps improve things. I can certainly live with a 
precompiled.hpp that includes some additional files on Windows.


/Magnus








Re: Stop using precompiled headers for Linux?

2018-11-02 Thread Magnus Ihse Bursie




On 2018-11-02 15:00, Magnus Ihse Bursie wrote:
So obviously this is a nice improvement even here. I could probably 
try around a bit and see if there is an even better fit with a 
different selection of header files, but even without that, I'd say 
this patch is by itself as good for clang as it is for gcc. 


I could not improve matters much. There were only very small differences 
between setting the limit at 110, 130 (BKM), 150 or 200:


real    3m6,322s
user    19m3,000s
sys    1m41,252s
hotspot new pch BKM (on and above 130)

real    3m6,309s
user    19m2,764s
sys    1m41,488s
hotspot new pch on and above 110

real    3m7,515s
user    19m6,470s
sys    1m42,345s
hotspot new pch on and above 150

real    3m8,497s
user    19m15,851s
sys    1m42,494s
hotspot new pch on and above 200

/Magnus