Re: Loading classes with many methods is very expensive
Hi Stanimir, On 10/30/2014 12:00 AM, Stanimir Simeonoff wrote: Hi Peter, The removal of value wrapper is a clever approach to reduce the new instances created although it feels very unnatural (at least to me). Number of objects created is practically the same. Previous approach used same MethodList instances for keys and values. New approach uses just Key instances, values are Methods themselves. It's allways 1 additional object per Method, but new approach might use a couple of more Map.Entry objects (when multiple methods with same signature are present). New approach is different in that it was designed to keep the insertion order of Method(s) (LinkedHashMap), but you found out this is not true (below)... Small optimization; eagerly calculate the hash in the c'tor, hash = 149 method.getReturnType().hashCode() ^ 150 System.identityHashCode(method.getName()) ^ 151 Arrays.hashCode(method.getParameterTypes() and so the real int hashCode(){return seq+hash;} Right, that would be an time/space trade-off. It might not be on 64bit arch - just on 32bit. I should check. Perhaps hashCode should not depend on seq (see below). Also I am not sure if method.getName().hashCode() would be better or not, if previously identityHashCode is not invoked on that string and depending on the configured hashCode (-XX:hashCode) generator System.identityHashCode might be slow or worse call C random() which doesn't scale. Setting the hashCode requires a CAS too. CAS is of course one-time off but still Thanks for pointing that out. I'll use String.hashCode() then. About the order of the returned methods: if you remove and then put from/to the LHM the order of iteration is going to be greatly altered. Is that ok? Excelent point! I might get away with excluding seq from hashCode computation and only use it in equals(). This way Key(s) could be modified in-place (hashCode would not change, entry would stay in same bucket), it would just become equal to some other Key. Modification operations would keep the invariant that there are no duplicate Key(s) at any time in the Map. All entries sharing same method signature would end-up in the same bucket (and maybe share it with entries that just happen to have the same hashCode mod capacity) and we would get a similar linked list of entries as with previous approach - only without another level of indirection... Regards, Peter Stanimir On Wed, Oct 29, 2014 at 7:16 PM, Peter Levart peter.lev...@gmail.com wrote: On 10/29/2014 02:08 PM, Joel Borggrén-Franck wrote: Hi Peter, I'm not entirely convinced this is a bug. The lookup order for getMethod has for a long time been walk up superclasses and return what you find there first without even looking at interfaces. It might be desirable to change that but I'm not sure. Hi Joel, It has been for a long time like that as you say, but for a long time Java did not have default methods. It's unexpected for getMethod() to return a method that is not contained in getMethods() result. Anyway, I have created a bug to track this: https://bugs.openjdk.java.net/browse/JDK-8062389 For next iteration of the getMethods() O(n^2) fix, I used a slightly different approach, which you might like more or not. Instead of using linked-lists of Method objects as values of a LinkedHashMap I created a special Key object, holding a reference to Method object and an additional 'seq' int field, which discriminates among methods with same signature. Values of LinkedHashMap are Method objects themselves: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.03/ I have encapsulated this functionality into a package-private java.lang.MethodTable. The implementation of this API can be easily changed to using linked-lists as values of LinkedHashMap if desired. The performance characteristics are similar, with hardly measurable advantage of this latest approach as can be seen from http://cr.openjdk.java.net/~ plevart/jdk9-dev/Class.getMethods/GetAllRtMethods.java benchmark: Original code: 19658 classes loaded in 2.071013902 seconds. 494392 methods obtained in 1.089983418 seconds. 494392 methods obtained in 0.952488497 seconds. 494392 methods obtained in 0.912878317 seconds. 494392 methods obtained in 0.940293784 seconds. 494392 methods obtained in 0.987640733 seconds. 494392 methods obtained in 0.925393355 seconds. 494392 methods obtained in 0.89397002 seconds. 494392 methods obtained in 0.915042463 seconds. 494392 methods obtained in 0.897669082 seconds. 494392 methods obtained in 0.878140502 seconds. Patched code: 19658 classes loaded in 2.153024197 seconds. 494392 methods obtained in 0.875651469 seconds. 494392 methods obtained in 0.791937742 seconds. 494392 methods obtained in 0.780995693 seconds. 494392 methods obtained in 0.759593461 seconds. 494392 methods obtained in 0.766528355 seconds. 494392 methods obtained in 0.756567663 seconds. 494392
Re: Loading classes with many methods is very expensive
On 10/30/2014 10:29 AM, Peter Levart wrote: I might get away with excluding seq from hashCode computation and only use it in equals(). This way Key(s) could be modified in-place (hashCode would not change, entry would stay in same bucket), it would just become equal to some other Key. Modification operations would keep the invariant that there are no duplicate Key(s) at any time in the Map. All entries sharing same method signature would end-up in the same bucket (and maybe share it with entries that just happen to have the same hashCode mod capacity) and we would get a similar linked list of entries as with previous approach - only without another level of indirection... That would not work. I would have to have the Key object accessible. Since Map does not have a getKey(key) operation, I would have to store the Key as a value too. And we're back to MethodList approach... Peter
Re: Loading classes with many methods is very expensive
Guys, can you please start a new RFR thread for Peter's change? You have a bug ID now, and the discussion should be searchable by bug ID. Also, you are spamming two aliases (corelibs + hotspot-runtime) instead of one ;) -Aleksey. On 10/30/2014 12:43 PM, Peter Levart wrote: On 10/30/2014 10:29 AM, Peter Levart wrote: I might get away with excluding seq from hashCode computation and only use it in equals(). This way Key(s) could be modified in-place (hashCode would not change, entry would stay in same bucket), it would just become equal to some other Key. Modification operations would keep the invariant that there are no duplicate Key(s) at any time in the Map. All entries sharing same method signature would end-up in the same bucket (and maybe share it with entries that just happen to have the same hashCode mod capacity) and we would get a similar linked list of entries as with previous approach - only without another level of indirection... That would not work. I would have to have the Key object accessible. Since Map does not have a getKey(key) operation, I would have to store the Key as a value too. And we're back to MethodList approach... Peter
Re: Loading classes with many methods is very expensive
Sorry, I apologize. I'll start a new thread on core-libs only. Peter On 10/30/2014 10:47 AM, Aleksey Shipilev wrote: Guys, can you please start a new RFR thread for Peter's change? You have a bug ID now, and the discussion should be searchable by bug ID. Also, you are spamming two aliases (corelibs + hotspot-runtime) instead of one ;) -Aleksey. On 10/30/2014 12:43 PM, Peter Levart wrote: On 10/30/2014 10:29 AM, Peter Levart wrote: I might get away with excluding seq from hashCode computation and only use it in equals(). This way Key(s) could be modified in-place (hashCode would not change, entry would stay in same bucket), it would just become equal to some other Key. Modification operations would keep the invariant that there are no duplicate Key(s) at any time in the Map. All entries sharing same method signature would end-up in the same bucket (and maybe share it with entries that just happen to have the same hashCode mod capacity) and we would get a similar linked list of entries as with previous approach - only without another level of indirection... That would not work. I would have to have the Key object accessible. Since Map does not have a getKey(key) operation, I would have to store the Key as a value too. And we're back to MethodList approach... Peter
Re: Loading classes with many methods is very expensive
Hi Joel, I found an inconsistency between getMethod() and getMethods() results that is present in current JDK8/9 code and in my latest webrev.02. The following program: import java.util.stream.Collectors; import java.util.stream.Stream; public class GetMethodTest { static void test(Class? clazz) throws Exception { System.out.println(clazz.getName() + .class.getMethods(): + Stream .of(clazz.getMethods()) .filter(m - m.getDeclaringClass() != Object.class) .collect(Collectors.toList())); System.out.println(clazz.getName() + .class.getMethod(\m\): + clazz.getMethod(m)); System.out.println(); } public static void main(String[] args) throws Exception { test(I.class); test(J.class); test(A.class); test(B.class); } } interface I { void m(); } interface J extends I { default void m() {} } abstract class A implements I {} abstract class B extends A implements J {} prints: I.class.getMethods(): [public abstract void I.m()] I.class.getMethod(m): public abstract void I.m() J.class.getMethods(): [public default void J.m()] J.class.getMethod(m): public default void J.m() A.class.getMethods(): [public abstract void I.m()] A.class.getMethod(m): public abstract void I.m() B.class.getMethods(): [public default void J.m()] B.class.getMethod(m): public abstract void I.m() B.class.getMethods() reports default method J.m() (which I think is correct), but B.class.getMethod(m) reports the abstract I.m() inherited from A, because here the getMethod0() algorithm stops searching for and consolidating any methods in (super)interfaces. Do you agree that this is a bug? Regards, Peter On 10/27/2014 02:45 PM, Joel Borggrén-Franck wrote: Hi Peter, As always, thanks for doing this! It has been on my todolist for a while but never quite bubbling up to the top. I don’t have time to look att his right now, but I expect to have some free time next week, but i have two short comments First, I have been thinking about moving MethodArray to its’s own top-level class, isn’t it about time? Second I would expect testing for the missing cases you uncovered (good catch!). I’ll try to get back to you asap. cheers /Joel On 26 okt 2014, at 23:53, Peter Levart peter.lev...@gmail.com wrote: On 10/26/2014 09:25 PM, Peter Levart wrote: 19657 classes loaded in 1.987373401 seconds. 494141 methods obtained in 1.02493941 seconds. vs. 19657 classes loaded in 2.084409717 seconds. 494124 methods obtained in 0.915928578 seconds. Hi, As you might have noticed, the number of methods obtained from patched code differed from original code. I have investigated this and found that original code treats abstract class methods the same as abstract interface methods as far as multiple inheritance is concerned (it keeps them together in the returned array). So I fixed this and here's new webrev which behaves the same as original code: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.02/ Comparing original vs. patched code still shows speed-up: Original: 19657 classes loaded in 1.980493029 seconds. 494141 methods obtained in 0.976318927 seconds. 494141 methods obtained in 0.886504437 seconds. 494141 methods obtained in 0.911153722 seconds. 494141 methods obtained in 0.880550509 seconds. 494141 methods obtained in 0.875526704 seconds. 494141 methods obtained in 0.877258894 seconds. 494141 methods obtained in 0.871794344 seconds. 494141 methods obtained in 0.884159644 seconds. 494141 methods obtained in 0.892648522 seconds. 494141 methods obtained in 0.884581841 seconds. Patched: 19657 classes loaded in 2.055697675 seconds. 494141 methods obtained in 0.853922188 seconds. 494141 methods obtained in 0.776203794 seconds. 494141 methods obtained in 0.858774803 seconds. 494141 methods obtained in 0.778178867 seconds. 494141 methods obtained in 0.760043997 seconds. 494141 methods obtained in 0.756352444 seconds. 494141 methods obtained in 0.740826372 seconds. 494141 methods obtained in 0.744264782 seconds. 494141 methods obtained in 0.73805894 seconds. 494141 methods obtained in 0.746852752 seconds. 55 java/lang/reflect jtreg tests still pass. As they did before, which means that we don't have a coverage for such cases. I'll see where I can add such a case (EnumSet for example, which inherits from Set interface and AbstractColection class via two different paths, so Set.size()/iterator() and AbstractCollection.size()/iterator() are both returned from getMethods())... Regards, Peter
Re: Loading classes with many methods is very expensive
Hi Peter, I’m not entirely convinced this is a bug. The lookup order for getMethod has for a long time been walk up superclasses and return what you find there first without even looking at interfaces. It might be desirable to change that but I’m not sure. cheers /Joel On 29 okt 2014, at 12:26, Peter Levart peter.lev...@gmail.com wrote: Hi Joel, I found an inconsistency between getMethod() and getMethods() results that is present in current JDK8/9 code and in my latest webrev.02. The following program: import java.util.stream.Collectors; import java.util.stream.Stream; public class GetMethodTest { static void test(Class? clazz) throws Exception { System.out.println(clazz.getName() + .class.getMethods(): + Stream .of(clazz.getMethods()) .filter(m - m.getDeclaringClass() != Object.class) .collect(Collectors.toList())); System.out.println(clazz.getName() + .class.getMethod(\m\): + clazz.getMethod(m)); System.out.println(); } public static void main(String[] args) throws Exception { test(I.class); test(J.class); test(A.class); test(B.class); } } interface I { void m(); } interface J extends I { default void m() {} } abstract class A implements I {} abstract class B extends A implements J {} prints: I.class.getMethods(): [public abstract void I.m()] I.class.getMethod(m): public abstract void I.m() J.class.getMethods(): [public default void J.m()] J.class.getMethod(m): public default void J.m() A.class.getMethods(): [public abstract void I.m()] A.class.getMethod(m): public abstract void I.m() B.class.getMethods(): [public default void J.m()] B.class.getMethod(m): public abstract void I.m() B.class.getMethods() reports default method J.m() (which I think is correct), but B.class.getMethod(m) reports the abstract I.m() inherited from A, because here the getMethod0() algorithm stops searching for and consolidating any methods in (super)interfaces. Do you agree that this is a bug? Regards, Peter On 10/27/2014 02:45 PM, Joel Borggrén-Franck wrote: Hi Peter, As always, thanks for doing this! It has been on my todolist for a while but never quite bubbling up to the top. I don’t have time to look att his right now, but I expect to have some free time next week, but i have two short comments First, I have been thinking about moving MethodArray to its’s own top-level class, isn’t it about time? Second I would expect testing for the missing cases you uncovered (good catch!). I’ll try to get back to you asap. cheers /Joel On 26 okt 2014, at 23:53, Peter Levart peter.lev...@gmail.com wrote: On 10/26/2014 09:25 PM, Peter Levart wrote: 19657 classes loaded in 1.987373401 seconds. 494141 methods obtained in 1.02493941 seconds. vs. 19657 classes loaded in 2.084409717 seconds. 494124 methods obtained in 0.915928578 seconds. Hi, As you might have noticed, the number of methods obtained from patched code differed from original code. I have investigated this and found that original code treats abstract class methods the same as abstract interface methods as far as multiple inheritance is concerned (it keeps them together in the returned array). So I fixed this and here's new webrev which behaves the same as original code: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.02/ Comparing original vs. patched code still shows speed-up: Original: 19657 classes loaded in 1.980493029 seconds. 494141 methods obtained in 0.976318927 seconds. 494141 methods obtained in 0.886504437 seconds. 494141 methods obtained in 0.911153722 seconds. 494141 methods obtained in 0.880550509 seconds. 494141 methods obtained in 0.875526704 seconds. 494141 methods obtained in 0.877258894 seconds. 494141 methods obtained in 0.871794344 seconds. 494141 methods obtained in 0.884159644 seconds. 494141 methods obtained in 0.892648522 seconds. 494141 methods obtained in 0.884581841 seconds. Patched: 19657 classes loaded in 2.055697675 seconds. 494141 methods obtained in 0.853922188 seconds. 494141 methods obtained in 0.776203794 seconds. 494141 methods obtained in 0.858774803 seconds. 494141 methods obtained in 0.778178867 seconds. 494141 methods obtained in 0.760043997 seconds. 494141 methods obtained in 0.756352444 seconds. 494141 methods obtained in 0.740826372 seconds. 494141 methods obtained in 0.744264782 seconds. 494141 methods obtained in 0.73805894 seconds. 494141 methods obtained in 0.746852752 seconds. 55 java/lang/reflect jtreg tests still pass. As they did before, which means that we don't have a coverage for such cases. I'll see where I can add such a case (EnumSet for example, which inherits from Set interface and
Re: Loading classes with many methods is very expensive
On 10/29/2014 02:08 PM, Joel Borggrén-Franck wrote: Hi Peter, I’m not entirely convinced this is a bug. The lookup order for getMethod has for a long time been walk up superclasses and return what you find there first without even looking at interfaces. It might be desirable to change that but I’m not sure. Hi Joel, It has been for a long time like that as you say, but for a long time Java did not have default methods. It's unexpected for getMethod() to return a method that is not contained in getMethods() result. Anyway, I have created a bug to track this: https://bugs.openjdk.java.net/browse/JDK-8062389 For next iteration of the getMethods() O(n^2) fix, I used a slightly different approach, which you might like more or not. Instead of using linked-lists of Method objects as values of a LinkedHashMap I created a special Key object, holding a reference to Method object and an additional 'seq' int field, which discriminates among methods with same signature. Values of LinkedHashMap are Method objects themselves: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.03/ I have encapsulated this functionality into a package-private java.lang.MethodTable. The implementation of this API can be easily changed to using linked-lists as values of LinkedHashMap if desired. The performance characteristics are similar, with hardly measurable advantage of this latest approach as can be seen from http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/GetAllRtMethods.java benchmark: Original code: 19658 classes loaded in 2.071013902 seconds. 494392 methods obtained in 1.089983418 seconds. 494392 methods obtained in 0.952488497 seconds. 494392 methods obtained in 0.912878317 seconds. 494392 methods obtained in 0.940293784 seconds. 494392 methods obtained in 0.987640733 seconds. 494392 methods obtained in 0.925393355 seconds. 494392 methods obtained in 0.89397002 seconds. 494392 methods obtained in 0.915042463 seconds. 494392 methods obtained in 0.897669082 seconds. 494392 methods obtained in 0.878140502 seconds. Patched code: 19658 classes loaded in 2.153024197 seconds. 494392 methods obtained in 0.875651469 seconds. 494392 methods obtained in 0.791937742 seconds. 494392 methods obtained in 0.780995693 seconds. 494392 methods obtained in 0.759593461 seconds. 494392 methods obtained in 0.766528355 seconds. 494392 methods obtained in 0.756567663 seconds. 494392 methods obtained in 0.739177848 seconds. 494392 methods obtained in 0.729245613 seconds. 494392 methods obtained in 0.74081083 seconds. 494392 methods obtained in 0.731749505 seconds. Martin's ManyMethodsBenchmark shows this algorithm has O(n) time complexity too: Original: Base class load time: 131.95 ms getDeclaredMethods: 65521 methods, 32.00 ms total time, 0.0005 ms per method getMethods: 65530 methods, 44.24 ms total time, 0.0007 ms per method Derived class load time: 32525.23 ms getDeclaredMethods: 65521 methods, 30.37 ms total time, 0.0005 ms per method getMethods: 65530 methods, 7897.03 ms total time, 0.1205 ms per method Patched: Base class load time: 129.72 ms getDeclaredMethods: 65521 methods, 32.76 ms total time, 0.0005 ms per method getMethods: 65530 methods, 42.68 ms total time, 0.0007 ms per method Derived class load time: 31620.47 ms getDeclaredMethods: 65521 methods, 30.49 ms total time, 0.0005 ms per method getMethods: 65530 methods, 88.23 ms total time, 0.0013 ms per method I have also run Martin's LoadAllClassesAndMethods test (Thanks Martin, I changed it slightly so that exceptions are collected and reported at the end instead of bailing-out on first exception). Original LoadAllClassesAndMethods write classlist.txt: class load: 23052 classes, 1563.75 ms total time, 0.0678 ms per class 4 exceptions encountered: java.lang.IncompatibleClassChangeError: jdk.nashorn.internal.codegen.CompilationPhase$10 and jdk.nashorn.internal.codegen.CompilationPhase$10$1 disagree on InnerClasses attribute java.lang.IncompatibleClassChangeError: jdk.nashorn.internal.codegen.CompilationPhase$5 and jdk.nashorn.internal.codegen.CompilationPhase$5$1 disagree on InnerClasses attribute java.lang.IncompatibleClassChangeError: java.security.ProtectionDomain$3 and java.security.ProtectionDomain$3$1 disagree on InnerClasses attribute java.lang.IncompatibleClassChangeError: java.lang.Compiler and java.lang.Compiler$1 disagree on InnerClasses attribute getMethods: 23052 classes, 831.87 ms total time, 0.0361 ms per class Patched LoadAllClassesAndMethods diff classlist.txt: class load: 23051 classes, 1596.58 ms total time, 0.0693 ms per class 5 exceptions encountered: java.lang.IncompatibleClassChangeError: jdk.nashorn.internal.codegen.CompilationPhase$10 and jdk.nashorn.internal.codegen.CompilationPhase$10$1 disagree on InnerClasses attribute java.lang.IncompatibleClassChangeError: jdk.nashorn.internal.codegen.CompilationPhase$5 and
Re: inconsistency between Class.getMethod and Class.getMethods Was: Loading classes with many methods is very expensive
Hi Peter, On 29 Oct 2014, at 18:16, Peter Levart peter.lev...@gmail.com wrote: On 10/29/2014 02:08 PM, Joel Borggrén-Franck wrote: Hi Peter, I’m not entirely convinced this is a bug. The lookup order for getMethod has for a long time been walk up superclasses and return what you find there first without even looking at interfaces. It might be desirable to change that but I’m not sure. Hi Joel, It has been for a long time like that as you say, but for a long time Java did not have default methods. It's unexpected for getMethod() to return a method that is not contained in getMethods() result. Anyway, I have created a bug to track this: https://bugs.openjdk.java.net/browse/JDK-8062389 Thanks. For the record I don’t necessarily disagree, I just haven’t had time to research this yet. I think this is a subset of https://bugs.openjdk.java.net/browse/JDK-8029459 but might also be worth changing separately in 9. If it is a bug, it might not be worth fixing in 8u. cheers /Joel
Re: Loading classes with many methods is very expensive
Hi Peter, The removal of value wrapper is a clever approach to reduce the new instances created although it feels very unnatural (at least to me). Small optimization; eagerly calculate the hash in the c'tor, hash = 149 method.getReturnType().hashCode() ^ 150 System.identityHashCode(method.getName()) ^ 151 Arrays.hashCode(method.getParameterTypes() and so the real int hashCode(){return seq+hash;} Also I am not sure if method.getName().hashCode() would be better or not, if previously identityHashCode is not invoked on that string and depending on the configured hashCode (-XX:hashCode) generator System.identityHashCode might be slow or worse call C random() which doesn't scale. Setting the hashCode requires a CAS too. CAS is of course one-time off but still About the order of the returned methods: if you remove and then put from/to the LHM the order of iteration is going to be greatly altered. Is that ok? Stanimir On Wed, Oct 29, 2014 at 7:16 PM, Peter Levart peter.lev...@gmail.com wrote: On 10/29/2014 02:08 PM, Joel Borggrén-Franck wrote: Hi Peter, I'm not entirely convinced this is a bug. The lookup order for getMethod has for a long time been walk up superclasses and return what you find there first without even looking at interfaces. It might be desirable to change that but I'm not sure. Hi Joel, It has been for a long time like that as you say, but for a long time Java did not have default methods. It's unexpected for getMethod() to return a method that is not contained in getMethods() result. Anyway, I have created a bug to track this: https://bugs.openjdk.java.net/browse/JDK-8062389 For next iteration of the getMethods() O(n^2) fix, I used a slightly different approach, which you might like more or not. Instead of using linked-lists of Method objects as values of a LinkedHashMap I created a special Key object, holding a reference to Method object and an additional 'seq' int field, which discriminates among methods with same signature. Values of LinkedHashMap are Method objects themselves: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.03/ I have encapsulated this functionality into a package-private java.lang.MethodTable. The implementation of this API can be easily changed to using linked-lists as values of LinkedHashMap if desired. The performance characteristics are similar, with hardly measurable advantage of this latest approach as can be seen from http://cr.openjdk.java.net/~ plevart/jdk9-dev/Class.getMethods/GetAllRtMethods.java benchmark: Original code: 19658 classes loaded in 2.071013902 seconds. 494392 methods obtained in 1.089983418 seconds. 494392 methods obtained in 0.952488497 seconds. 494392 methods obtained in 0.912878317 seconds. 494392 methods obtained in 0.940293784 seconds. 494392 methods obtained in 0.987640733 seconds. 494392 methods obtained in 0.925393355 seconds. 494392 methods obtained in 0.89397002 seconds. 494392 methods obtained in 0.915042463 seconds. 494392 methods obtained in 0.897669082 seconds. 494392 methods obtained in 0.878140502 seconds. Patched code: 19658 classes loaded in 2.153024197 seconds. 494392 methods obtained in 0.875651469 seconds. 494392 methods obtained in 0.791937742 seconds. 494392 methods obtained in 0.780995693 seconds. 494392 methods obtained in 0.759593461 seconds. 494392 methods obtained in 0.766528355 seconds. 494392 methods obtained in 0.756567663 seconds. 494392 methods obtained in 0.739177848 seconds. 494392 methods obtained in 0.729245613 seconds. 494392 methods obtained in 0.74081083 seconds. 494392 methods obtained in 0.731749505 seconds. Martin's ManyMethodsBenchmark shows this algorithm has O(n) time complexity too: Original: Base class load time: 131.95 ms getDeclaredMethods: 65521 methods, 32.00 ms total time, 0.0005 ms per method getMethods: 65530 methods, 44.24 ms total time, 0.0007 ms per method Derived class load time: 32525.23 ms getDeclaredMethods: 65521 methods, 30.37 ms total time, 0.0005 ms per method getMethods: 65530 methods, 7897.03 ms total time, 0.1205 ms per method Patched: Base class load time: 129.72 ms getDeclaredMethods: 65521 methods, 32.76 ms total time, 0.0005 ms per method getMethods: 65530 methods, 42.68 ms total time, 0.0007 ms per method Derived class load time: 31620.47 ms getDeclaredMethods: 65521 methods, 30.49 ms total time, 0.0005 ms per method getMethods: 65530 methods, 88.23 ms total time, 0.0013 ms per method I have also run Martin's LoadAllClassesAndMethods test (Thanks Martin, I changed it slightly so that exceptions are collected and reported at the end instead of bailing-out on first exception). Original LoadAllClassesAndMethods write classlist.txt: class load: 23052 classes, 1563.75 ms total time, 0.0678 ms per class 4 exceptions encountered:
Re: Loading classes with many methods is very expensive
On 10/27/2014 01:53 AM, Peter Levart wrote: As you might have noticed, the number of methods obtained from patched code differed from original code. I have investigated this and found that original code treats abstract class methods the same as abstract interface methods as far as multiple inheritance is concerned (it keeps them together in the returned array). So I fixed this and here's new webrev which behaves the same as original code: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.02/ This seems to be a candidate fix for https://bugs.openjdk.java.net/browse/JDK-8061950, right? If so, please assign yourself there. It might be advisable to start the separate RFR thread for a fix? My comments for the first reading of the patch: * Why MethodList maintains the linked list? Doesn't O(n) MethodList.add() lead to quadratic complexity again? Should we instead use the ArrayListMethod for storing method synonyms? * Please convert some of the indexed loops into for-each loops for better readability? E.g: for (int i = 0; i intfcsMethods.length; i++) { for (Method m : intfcsMethods[i]) { ...to: for (Method[] ms : intfcsMethods) { for (Method m : ms) { * I would think the code would be more readable if MethodList.m was having a more readable name, e.g. MethodList.base or MethodList.image or MethodList.primary? * Formatting and logic in MethodList.consolidateWith gives me chills. I think at very least introducing variables for m.getDeclaringClass() and ml.m.getDeclaringClass() improve readability. Also, moving the comments at the end of the line should improve readability and better highlight the boolean expression you are spelling out. Below is the straight-forward rewrite: MethodList consolidateWith(MethodList ml) { Method mineM = m; Method otherM = ml.m; Class? mine = mineM.getDeclaringClass(); Class? other = otherM.getDeclaringClass(); if (other == mine // other is same method as mine || !Modifier.isAbstract(mineM.getModifiers()) // OR, mine is non-abstract method... (other.isAssignableFrom(mine) // ...which overrides other || other.isInterface() !mine.isInterface())) { // ...class method which wins over interface // just return return this; } if (!Modifier.isAbstract(otherM.getModifiers()) // other is non-abstract method... (mine.isAssignableFrom(other) // ...which overrides mine || mine.isInterface() !other.isInterface())) {// ...class method which wins over interface // knock m out and continue return (next == null) ? ml : next.consolidateWith(ml); } // else leave m in and continue next = (next == null) ? ml : next.consolidateWith(ml); return this; } * Should we use just methods.put() here? 2851 MethodList ml0 = methods.putIfAbsent(ml, ml); * I wonder if the code would be simpler if we push the MapMethodList, MethodList maintenance to some internal utility class? That would probably simplify the loops in privateGetPublicMethods? Thanks, -Aleksey.
Re: Loading classes with many methods is very expensive
On Mon, Oct 27, 2014 at 1:23 PM, Aleksey Shipilev aleksey.shipi...@oracle.com wrote: On 10/27/2014 01:53 AM, Peter Levart wrote: As you might have noticed, the number of methods obtained from patched code differed from original code. I have investigated this and found that original code treats abstract class methods the same as abstract interface methods as far as multiple inheritance is concerned (it keeps them together in the returned array). So I fixed this and here's new webrev which behaves the same as original code: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.02/ This seems to be a candidate fix for https://bugs.openjdk.java.net/browse/JDK-8061950, right? If so, please assign yourself there. It might be advisable to start the separate RFR thread for a fix? My comments for the first reading of the patch: * Why MethodList maintains the linked list? Doesn't O(n) MethodList.add() lead to quadratic complexity again? Should we instead use the ArrayListMethod for storing method synonyms? But N would be the amount of exactly the same overridden methods from superclasses and declared in interfaces. Usually there would be zero or 1-2. So the in-place linked list conserves memory, the extra indirection of ArrayList for n=1 would also be slower. IMO, it's a well designed approach. The only case where it'd actually matter is hundreds of superclasses each overriding all of the their methods. Stanimir * Please convert some of the indexed loops into for-each loops for better readability? E.g: for (int i = 0; i intfcsMethods.length; i++) { for (Method m : intfcsMethods[i]) { ...to: for (Method[] ms : intfcsMethods) { for (Method m : ms) { * I would think the code would be more readable if MethodList.m was having a more readable name, e.g. MethodList.base or MethodList.image or MethodList.primary? * Formatting and logic in MethodList.consolidateWith gives me chills. I think at very least introducing variables for m.getDeclaringClass() and ml.m.getDeclaringClass() improve readability. Also, moving the comments at the end of the line should improve readability and better highlight the boolean expression you are spelling out. Below is the straight-forward rewrite: MethodList consolidateWith(MethodList ml) { Method mineM = m; Method otherM = ml.m; Class? mine = mineM.getDeclaringClass(); Class? other = otherM.getDeclaringClass(); if (other == mine // other is same method as mine || !Modifier.isAbstract(mineM.getModifiers()) // OR, mine is non-abstract method... (other.isAssignableFrom(mine) // ...which overrides other || other.isInterface() !mine.isInterface())) { // ...class method which wins over interface // just return return this; } if (!Modifier.isAbstract(otherM.getModifiers()) // other is non-abstract method... (mine.isAssignableFrom(other) // ...which overrides mine || mine.isInterface() !other.isInterface())) {// ...class method which wins over interface // knock m out and continue return (next == null) ? ml : next.consolidateWith(ml); } // else leave m in and continue next = (next == null) ? ml : next.consolidateWith(ml); return this; } * Should we use just methods.put() here? 2851 MethodList ml0 = methods.putIfAbsent(ml, ml); * I wonder if the code would be simpler if we push the MapMethodList, MethodList maintenance to some internal utility class? That would probably simplify the loops in privateGetPublicMethods? Thanks, -Aleksey.
Re: Loading classes with many methods is very expensive
Hi Aleksey, On 10/27/2014 12:23 PM, Aleksey Shipilev wrote: On 10/27/2014 01:53 AM, Peter Levart wrote: As you might have noticed, the number of methods obtained from patched code differed from original code. I have investigated this and found that original code treats abstract class methods the same as abstract interface methods as far as multiple inheritance is concerned (it keeps them together in the returned array). So I fixed this and here's new webrev which behaves the same as original code: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.02/ This seems to be a candidate fix for https://bugs.openjdk.java.net/browse/JDK-8061950, right? If so, please assign yourself there. It might be advisable to start the separate RFR thread for a fix? That's exactly the issue I was looking for. I assigned it to myself. Thanks for looking at the code... My comments for the first reading of the patch: * Why MethodList maintains the linked list? Doesn't O(n) MethodList.add() lead to quadratic complexity again? Should we instead use the ArrayListMethod for storing method synonyms? It's not quadratic, but more O(n*m) where n is the number of methods and m is the number of methods with same signature inherited via different paths from abstract class and/or (super)interfaces. This is typically 1, rarely 2, almost never 3 or more. For example: abstract class A implements I, J, K {} interface I { void m(); } interface J { void m(); } interface K { void m(); } Here the linked list with signature 'void m()' will contain 3 elements. When a particular method for a particular signature is found in a (super)interface, it has to be compared with potentially all methods having same signature and modifications performed while traversing are: do nothing, remove element, append element. So linked-list is perfect for that purpose. LinkedList.add() is only called when an abstract superclass brings multiple methods with same signature and all of them are inherited by abstract subclass - very very rarely is there more than 1 method with same signature inherited from superclass. I merged the functionality of the method-signature-key with the linked-list node, holding just a reference to Method object and a link to 'next' node. I think this is the most compact temporary datastructure representation for that purpose that leverages standard LinkedHashMap. * Please convert some of the indexed loops into for-each loops for better readability? E.g: for (int i = 0; i intfcsMethods.length; i++) { for (Method m : intfcsMethods[i]) { ...to: for (Method[] ms : intfcsMethods) { for (Method m : ms) { Good catch. I'll do this. * I would think the code would be more readable if MethodList.m was having a more readable name, e.g. MethodList.base or MethodList.image or MethodList.primary? Just MethodList.method will be ok - it's nothing special with this method. It just happens to be 1st in list of methods with same signature. With introduction of local variables (later down), the conditions will be shortened in MethodList.consolidateWith() and short field name is not needed. * Formatting and logic in MethodList.consolidateWith gives me chills. I think at very least introducing variables for m.getDeclaringClass() and ml.m.getDeclaringClass() improve readability. Also, moving the comments at the end of the line should improve readability and better highlight the boolean expression you are spelling out. Below is the straight-forward rewrite: MethodList consolidateWith(MethodList ml) { Method mineM = m; Method otherM = ml.m; Class? mine = mineM.getDeclaringClass(); Class? other = otherM.getDeclaringClass(); if (other == mine // other is same method as mine || !Modifier.isAbstract(mineM.getModifiers()) // OR, mine is non-abstract method... (other.isAssignableFrom(mine) // ...which overrides other || other.isInterface() !mine.isInterface())) { // ...class method which wins over interface // just return return this; } if (!Modifier.isAbstract(otherM.getModifiers()) // other is non-abstract method... (mine.isAssignableFrom(other) // ...which overrides mine || mine.isInterface() !other.isInterface())) {// ...class method which wins over interface // knock m out and continue return (next == null) ? ml : next.consolidateWith(ml); } // else leave m in and continue next = (next == null) ? ml : next.consolidateWith(ml); return this; } Excellent. I'll take your formatting if you don't mind ;-) * Should we use just methods.put() here? 2851
Re: Loading classes with many methods is very expensive
Hi Peter, As always, thanks for doing this! It has been on my todolist for a while but never quite bubbling up to the top. I don’t have time to look att his right now, but I expect to have some free time next week, but i have two short comments First, I have been thinking about moving MethodArray to its’s own top-level class, isn’t it about time? Second I would expect testing for the missing cases you uncovered (good catch!). I’ll try to get back to you asap. cheers /Joel On 26 okt 2014, at 23:53, Peter Levart peter.lev...@gmail.com wrote: On 10/26/2014 09:25 PM, Peter Levart wrote: 19657 classes loaded in 1.987373401 seconds. 494141 methods obtained in 1.02493941 seconds. vs. 19657 classes loaded in 2.084409717 seconds. 494124 methods obtained in 0.915928578 seconds. Hi, As you might have noticed, the number of methods obtained from patched code differed from original code. I have investigated this and found that original code treats abstract class methods the same as abstract interface methods as far as multiple inheritance is concerned (it keeps them together in the returned array). So I fixed this and here's new webrev which behaves the same as original code: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.02/ Comparing original vs. patched code still shows speed-up: Original: 19657 classes loaded in 1.980493029 seconds. 494141 methods obtained in 0.976318927 seconds. 494141 methods obtained in 0.886504437 seconds. 494141 methods obtained in 0.911153722 seconds. 494141 methods obtained in 0.880550509 seconds. 494141 methods obtained in 0.875526704 seconds. 494141 methods obtained in 0.877258894 seconds. 494141 methods obtained in 0.871794344 seconds. 494141 methods obtained in 0.884159644 seconds. 494141 methods obtained in 0.892648522 seconds. 494141 methods obtained in 0.884581841 seconds. Patched: 19657 classes loaded in 2.055697675 seconds. 494141 methods obtained in 0.853922188 seconds. 494141 methods obtained in 0.776203794 seconds. 494141 methods obtained in 0.858774803 seconds. 494141 methods obtained in 0.778178867 seconds. 494141 methods obtained in 0.760043997 seconds. 494141 methods obtained in 0.756352444 seconds. 494141 methods obtained in 0.740826372 seconds. 494141 methods obtained in 0.744264782 seconds. 494141 methods obtained in 0.73805894 seconds. 494141 methods obtained in 0.746852752 seconds. 55 java/lang/reflect jtreg tests still pass. As they did before, which means that we don't have a coverage for such cases. I'll see where I can add such a case (EnumSet for example, which inherits from Set interface and AbstractColection class via two different paths, so Set.size()/iterator() and AbstractCollection.size()/iterator() are both returned from getMethods())... Regards, Peter
Re: Loading classes with many methods is very expensive
Hi Joel, Thanks for peeking into the code. On 10/27/2014 02:45 PM, Joel Borggrén-Franck wrote: Hi Peter, As always, thanks for doing this! It has been on my todolist for a while but never quite bubbling up to the top. I don’t have time to look att his right now, but I expect to have some free time next week, but i have two short comments First, I have been thinking about moving MethodArray to its’s own top-level class, isn’t it about time? Well, you're the second (first was Aleksey) to comment on factoring away the logic into a separate class. I'll do that then. Second I would expect testing for the missing cases you uncovered (good catch!). I plan to add a case or two to appropriate test. I'm thinking about test/java/lang/Class/getMethods/StarInheritance.java ... I’ll try to get back to you asap. Thank you, Peter cheers /Joel On 26 okt 2014, at 23:53, Peter Levart peter.lev...@gmail.com wrote: On 10/26/2014 09:25 PM, Peter Levart wrote: 19657 classes loaded in 1.987373401 seconds. 494141 methods obtained in 1.02493941 seconds. vs. 19657 classes loaded in 2.084409717 seconds. 494124 methods obtained in 0.915928578 seconds. Hi, As you might have noticed, the number of methods obtained from patched code differed from original code. I have investigated this and found that original code treats abstract class methods the same as abstract interface methods as far as multiple inheritance is concerned (it keeps them together in the returned array). So I fixed this and here's new webrev which behaves the same as original code: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.02/ Comparing original vs. patched code still shows speed-up: Original: 19657 classes loaded in 1.980493029 seconds. 494141 methods obtained in 0.976318927 seconds. 494141 methods obtained in 0.886504437 seconds. 494141 methods obtained in 0.911153722 seconds. 494141 methods obtained in 0.880550509 seconds. 494141 methods obtained in 0.875526704 seconds. 494141 methods obtained in 0.877258894 seconds. 494141 methods obtained in 0.871794344 seconds. 494141 methods obtained in 0.884159644 seconds. 494141 methods obtained in 0.892648522 seconds. 494141 methods obtained in 0.884581841 seconds. Patched: 19657 classes loaded in 2.055697675 seconds. 494141 methods obtained in 0.853922188 seconds. 494141 methods obtained in 0.776203794 seconds. 494141 methods obtained in 0.858774803 seconds. 494141 methods obtained in 0.778178867 seconds. 494141 methods obtained in 0.760043997 seconds. 494141 methods obtained in 0.756352444 seconds. 494141 methods obtained in 0.740826372 seconds. 494141 methods obtained in 0.744264782 seconds. 494141 methods obtained in 0.73805894 seconds. 494141 methods obtained in 0.746852752 seconds. 55 java/lang/reflect jtreg tests still pass. As they did before, which means that we don't have a coverage for such cases. I'll see where I can add such a case (EnumSet for example, which inherits from Set interface and AbstractColection class via two different paths, so Set.size()/iterator() and AbstractCollection.size()/iterator() are both returned from getMethods())... Regards, Peter
Re: Loading classes with many methods is very expensive
Hi Peter, On 10/27/2014 03:45 PM, Peter Levart wrote: I merged the functionality of the method-signature-key with the linked-list node, holding just a reference to Method object and a link to 'next' node. I think this is the most compact temporary datastructure representation for that purpose that leverages standard LinkedHashMap. I see. I wonder when somebody come with a stress test using multiple interfaces with the same signature method :) Seems very unlikely. * Formatting and logic in MethodList.consolidateWith gives me chills. I think at very least introducing variables for m.getDeclaringClass() and ml.m.getDeclaringClass() improve readability. Also, moving the comments at the end of the line should improve readability and better highlight the boolean expression you are spelling out. Below is the straight-forward rewrite: ... Excellent. I'll take your formatting if you don't mind ;-) Sure. Check if I had captured it right first. I think we need to be more strict with parentheses to avoid precedence overlooks: (A || B C) should better be ((A || B) C) or (A || (B C)) * Should we use just methods.put() here? 2851 MethodList ml0 = methods.putIfAbsent(ml, ml); It really doesn't matter as the exception is thrown if (ml0 != null) and LinkedHashMap is forgotten... if (ml0 == null), put and putIfAbsent are equivalent. I used putIfAbsent to make the 3 loops (declared methods, superclass methods, (super)interface methods) more uniform. Okay. * I wonder if the code would be simpler if we push the MapMethodList, MethodList maintenance to some internal utility class? That would probably simplify the loops in privateGetPublicMethods? I thought MethodList.add()/consolidateWith() were close to those utility methods. I could extract the logic in each of 3 loops to 3 void methods taking 'methods' Map and a Method instance, but I don't want to pollute the method namespace in j.l.Class any more than necessary and making a special inner class for that is an overkill. Previously the holder for such methods was MethodArray, but it's gone now. I could subclass LinkedHashMap if I wanted to avoid an unnecessary indirection, but that's not a good practice. It's not that bad as it is - 38 lines of code for 3 loops with comments and spacing. I could improve comments though... Yes. I had to read the entire thing a couple of times to understand how this works. Containing most of the logic into MethodList seems like a way to improve this. Not sure though. Thanks, -Aleksey.
Re: Loading classes with many methods is very expensive
I'm having trouble keeping up with this thread I started, but I did update the test/benchmark http://cr.openjdk.java.net/~martin/webrevs/openjdk9/Class.getMethods-benchmark/ I added a test *LoadAllClassesAndMethods *that does what its name says. Although originally written as just a benchmark, it also checks en passant that all classes in all JRE jar files can be loaded. Interestingly, openjdk9 passes this test, but proprietary Oracle JDK fails it. I think this is a useful packaging-level test of the entire JDK as well. java.lang.NoClassDefFoundError: Failed to load jfxrt.jar(com/sun/deploy/uitoolkit/impl/fx/FXApplet2Adapter$1.class) at LoadAllClassesAndMethods.loadAllExtensionClasses(LoadAllClassesAndMethods.java:178) at LoadAllClassesAndMethods.methodsByClass(LoadAllClassesAndMethods.java:198) at LoadAllClassesAndMethods.main(LoadAllClassesAndMethods.java:65) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.sun.javatest.regtest.MainWrapper$MainThread.run(MainWrapper.java:94) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NoClassDefFoundError: com/sun/deploy/uitoolkit/Applet2Adapter at java.lang.ClassLoader.defineClass1(Native Method) Jeremy is independently working on yet another performance problem exposed by *LoadAllClassesAndMethods* On Mon, Oct 27, 2014 at 8:48 AM, Aleksey Shipilev aleksey.shipi...@oracle.com wrote: Hi Peter, On 10/27/2014 03:45 PM, Peter Levart wrote: I merged the functionality of the method-signature-key with the linked-list node, holding just a reference to Method object and a link to 'next' node. I think this is the most compact temporary datastructure representation for that purpose that leverages standard LinkedHashMap. I see. I wonder when somebody come with a stress test using multiple interfaces with the same signature method :) Seems very unlikely. * Formatting and logic in MethodList.consolidateWith gives me chills. I think at very least introducing variables for m.getDeclaringClass() and ml.m.getDeclaringClass() improve readability. Also, moving the comments at the end of the line should improve readability and better highlight the boolean expression you are spelling out. Below is the straight-forward rewrite: ... Excellent. I'll take your formatting if you don't mind ;-) Sure. Check if I had captured it right first. I think we need to be more strict with parentheses to avoid precedence overlooks: (A || B C) should better be ((A || B) C) or (A || (B C)) * Should we use just methods.put() here? 2851 MethodList ml0 = methods.putIfAbsent(ml, ml); It really doesn't matter as the exception is thrown if (ml0 != null) and LinkedHashMap is forgotten... if (ml0 == null), put and putIfAbsent are equivalent. I used putIfAbsent to make the 3 loops (declared methods, superclass methods, (super)interface methods) more uniform. Okay. * I wonder if the code would be simpler if we push the MapMethodList, MethodList maintenance to some internal utility class? That would probably simplify the loops in privateGetPublicMethods? I thought MethodList.add()/consolidateWith() were close to those utility methods. I could extract the logic in each of 3 loops to 3 void methods taking 'methods' Map and a Method instance, but I don't want to pollute the method namespace in j.l.Class any more than necessary and making a special inner class for that is an overkill. Previously the holder for such methods was MethodArray, but it's gone now. I could subclass LinkedHashMap if I wanted to avoid an unnecessary indirection, but that's not a good practice. It's not that bad as it is - 38 lines of code for 3 loops with comments and spacing. I could improve comments though... Yes. I had to read the entire thing a couple of times to understand how this works. Containing most of the logic into MethodList seems like a way to improve this. Not sure though. Thanks, -Aleksey.
Re: Loading classes with many methods is very expensive
On 27.10.2014 22:01, Martin Buchholz wrote: I'm having trouble keeping up with this thread I started, but I did update the test/benchmark http://cr.openjdk.java.net/~martin/webrevs/openjdk9/Class.getMethods-benchmark/ http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/Class.getMethods-benchmark/ I added a test *LoadAllClassesAndMethods *that does what its name says. Although originally written as just a benchmark, In our recent Nashorn classloading work, we did this benchmark: http://cr.openjdk.java.net/~shade/8053904/ClassLoaderBenchmark.java ...which, I think can be easily exploded to call getMethods() as well. Jeremy is independently working on yet another performance problem exposed by *LoadAllClassesAndMethods* Does Jeremy have a reportable outline? Submit the bug to make this thread short :) Thanks, -Aleksey.
Re: Loading classes with many methods is very expensive
Jeremy has already filed a bug (8062116), and is just finishing up the patch (I jumped the gun a bit by attaching it to the bug - it has a few bugs in it). I'll probably have something in reasonable shape to review tomorrow (I've submitted it internally, and I want to let it spend a day or so having our test infrastructure throw things at it). It's actually a JVMTI performance regression. The data structure that is being used to store jmethodids isn't great. I've refactored it a bit to improve its performance, but it should really be a closed hash table. There isn't a handy one in HS, so I just made it O(1) for writes. It's still O(n) for reads, though, which is nasty. We can have the real conversation when I post the RFR, though. Jeremy On Mon, Oct 27, 2014 at 12:45 PM, Aleksey Shipilev aleksey.shipi...@oracle.com wrote: On 27.10.2014 22:01, Martin Buchholz wrote: I'm having trouble keeping up with this thread I started, but I did update the test/benchmark http://cr.openjdk.java.net/~martin/webrevs/openjdk9/Class.getMethods-benchmark/ http://cr.openjdk.java.net/%7Emartin/webrevs/openjdk9/Class.getMethods-benchmark/ I added a test *LoadAllClassesAndMethods *that does what its name says. Although originally written as just a benchmark, In our recent Nashorn classloading work, we did this benchmark: http://cr.openjdk.java.net/~shade/8053904/ClassLoaderBenchmark.java ...which, I think can be easily exploded to call getMethods() as well. Jeremy is independently working on yet another performance problem exposed by *LoadAllClassesAndMethods* Does Jeremy have a reportable outline? Submit the bug to make this thread short :) Thanks, -Aleksey.
Re: Loading classes with many methods is very expensive
Hi all, I revamped the Class.getMethods() implementation so that it is faster for common cases and O(n) at the same time, which makes Martin's test happy (at least in part that measures getMethods() speed - the class loading / linkage in VM is a separate issue). With the following test that loads all classes from rt.jar and calls getMethods() on each of them: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/GetAllRtMethods.java And system property 'sun.reflect.noCaches=true' (so that we exercise the logic in every loop - not just 1st), original code prints: 19657 classes loaded in 1.987373401 seconds. 494141 methods obtained in 1.02493941 seconds. 494141 methods obtained in 0.905235658 seconds. 494141 methods obtained in 0.914434303 seconds. 494141 methods obtained in 0.887212805 seconds. 494141 methods obtained in 0.888929483 seconds. 494141 methods obtained in 0.883309141 seconds. 494141 methods obtained in 0.88341098 seconds. 494141 methods obtained in 0.897397146 seconds. 494141 methods obtained in 0.885677466 seconds. 494141 methods obtained in 0.895834176 seconds. Patched code does the same about 10% faster: 19657 classes loaded in 2.084409717 seconds. 494124 methods obtained in 0.915928578 seconds. 494124 methods obtained in 0.785342465 seconds. 494124 methods obtained in 0.784852619 seconds. 494124 methods obtained in 0.793450205 seconds. 494124 methods obtained in 0.849915078 seconds. 494124 methods obtained in 0.77835511 seconds. 494124 methods obtained in 0.764144701 seconds. 494124 methods obtained in 0.754122383 seconds. 494124 methods obtained in 0.747961897 seconds. 494124 methods obtained in 0.7489937 seconds. Martin's test prints on my computer with original code the following: Base class load time: 177.80 ms getDeclaredMethods: Methods: 65521, Total time: 35.79 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 50.15 ms, Time per method: 0.0008 ms Derived class load time: 34015.70 ms getDeclaredMethods: Methods: 65521, Total time: 33.82 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 8122.00 ms, Time per method: 0.1239 ms And with patched code this: Base class load time: 157.16 ms getDeclaredMethods: Methods: 65521, Total time: 65.77 ms, Time per method: 0.0010 ms getMethods: Methods: 65530, Total time: 44.64 ms, Time per method: 0.0007 ms Derived class load time: 33996.69 ms getDeclaredMethods: Methods: 65521, Total time: 32.63 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 92.12 ms, Time per method: 0.0014 ms Here's a webrev of the patch: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.01/ Patched code is simpler (65 lines gone) and I hope, easier to understand and change (I think a change in spec is coming in JDK9 which will handle abstract interface methods the same way as default, right Joel?) I also took the liberty to eliminate some redundant array and Field/Method/Constructor copies. get[Method0,Field0,Counstuctor0] now return 'root' objects. Copying is performed in methods that call them and expose the objects to any code outside java.lang.Class. Also, findFields() and findMethods() don't do copying of Field/Method objects themselves, but rather live that to methods that call them. getInterfaces() method now delegates to getInterfaces(boolean copyArray) so that internally, not array copying is performed. All 55 java/lang/reflect jtreg tests pass with this patch. Regards, Peter
Re: Loading classes with many methods is very expensive
On 10/26/2014 11:21 PM, Stanimir Simeonoff wrote: Great effort. From first glance: the hashCode and equals of MethodList use m.getParameterTypes() which is cloned. I'd rather pay the collision costs and rely on the default hashCode/equals of Method itself (hashCode ignores the parameters). Possibly hashCode can include the returnType but equals should be direct call to Method.equals. Can't do that as Method.equals also compares declaringClass, which is not part of method signature. We could use SharedSecrets to access the internal array of parameterTypes directly. Regards, Peter Stanimir On Sun, Oct 26, 2014 at 10:25 PM, Peter Levart peter.lev...@gmail.com mailto:peter.lev...@gmail.com wrote: Hi all, I revamped the Class.getMethods() implementation so that it is faster for common cases and O(n) at the same time, which makes Martin's test happy (at least in part that measures getMethods() speed - the class loading / linkage in VM is a separate issue). With the following test that loads all classes from rt.jar and calls getMethods() on each of them: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/GetAllRtMethods.java http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/Class.getMethods/GetAllRtMethods.java And system property 'sun.reflect.noCaches=true' (so that we exercise the logic in every loop - not just 1st), original code prints: 19657 classes loaded in 1.987373401 seconds. 494141 methods obtained in 1.02493941 seconds. 494141 methods obtained in 0.905235658 seconds. 494141 methods obtained in 0.914434303 seconds. 494141 methods obtained in 0.887212805 seconds. 494141 methods obtained in 0.888929483 seconds. 494141 methods obtained in 0.883309141 seconds. 494141 methods obtained in 0.88341098 seconds. 494141 methods obtained in 0.897397146 seconds. 494141 methods obtained in 0.885677466 seconds. 494141 methods obtained in 0.895834176 seconds. Patched code does the same about 10% faster: 19657 classes loaded in 2.084409717 tel:2.084409717 seconds. 494124 methods obtained in 0.915928578 seconds. 494124 methods obtained in 0.785342465 seconds. 494124 methods obtained in 0.784852619 seconds. 494124 methods obtained in 0.793450205 seconds. 494124 methods obtained in 0.849915078 seconds. 494124 methods obtained in 0.77835511 seconds. 494124 methods obtained in 0.764144701 seconds. 494124 methods obtained in 0.754122383 seconds. 494124 methods obtained in 0.747961897 seconds. 494124 methods obtained in 0.7489937 seconds. Martin's test prints on my computer with original code the following: Base class load time: 177.80 ms getDeclaredMethods: Methods: 65521, Total time: 35.79 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 50.15 ms, Time per method: 0.0008 ms Derived class load time: 34015.70 ms getDeclaredMethods: Methods: 65521, Total time: 33.82 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 8122.00 ms, Time per method: 0.1239 ms And with patched code this: Base class load time: 157.16 ms getDeclaredMethods: Methods: 65521, Total time: 65.77 ms, Time per method: 0.0010 ms getMethods: Methods: 65530, Total time: 44.64 ms, Time per method: 0.0007 ms Derived class load time: 33996.69 ms getDeclaredMethods: Methods: 65521, Total time: 32.63 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 92.12 ms, Time per method: 0.0014 ms Here's a webrev of the patch: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.01/ http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/Class.getMethods/webrev.01/ Patched code is simpler (65 lines gone) and I hope, easier to understand and change (I think a change in spec is coming in JDK9 which will handle abstract interface methods the same way as default, right Joel?) I also took the liberty to eliminate some redundant array and Field/Method/Constructor copies. get[Method0,Field0,Counstuctor0] now return 'root' objects. Copying is performed in methods that call them and expose the objects to any code outside java.lang.Class. Also, findFields() and findMethods() don't do copying of Field/Method objects themselves, but rather live that to methods that call them. getInterfaces() method now delegates to getInterfaces(boolean copyArray) so that internally, not array copying is performed. All 55 java/lang/reflect jtreg tests pass with this patch. Regards, Peter
Re: Loading classes with many methods is very expensive
On Mon, Oct 27, 2014 at 12:36 AM, Peter Levart peter.lev...@gmail.com wrote: On 10/26/2014 11:21 PM, Stanimir Simeonoff wrote: Great effort. From first glance: the hashCode and equals of MethodList use m.getParameterTypes() which is cloned. I'd rather pay the collision costs and rely on the default hashCode/equals of Method itself (hashCode ignores the parameters). Possibly hashCode can include the returnType but equals should be direct call to Method.equals. Can't do that as Method.equals also compares declaringClass, which is not part of method signature. We could use SharedSecrets to access the internal array of parameterTypes directly. Opps, very true. If SharedSecrets is not available, get the parameters in the constructor, so clone happens once only. Method.clone() should be comparable to the cost of a LHM node, so the overhead is not so high. I doubt the JIT can EA clone() which would be the best case scenario. I was wondering if the computation of size is better than just array copy (as LHM.iterator is effectively linked list), however for small number of methods it would keep all the references in L1 so that would be better in the common case. Stanimir Regards, Peter Stanimir On Sun, Oct 26, 2014 at 10:25 PM, Peter Levart peter.lev...@gmail.com wrote: Hi all, I revamped the Class.getMethods() implementation so that it is faster for common cases and O(n) at the same time, which makes Martin's test happy (at least in part that measures getMethods() speed - the class loading / linkage in VM is a separate issue). With the following test that loads all classes from rt.jar and calls getMethods() on each of them: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/GetAllRtMethods.java And system property 'sun.reflect.noCaches=true' (so that we exercise the logic in every loop - not just 1st), original code prints: 19657 classes loaded in 1.987373401 seconds. 494141 methods obtained in 1.02493941 seconds. 494141 methods obtained in 0.905235658 seconds. 494141 methods obtained in 0.914434303 seconds. 494141 methods obtained in 0.887212805 seconds. 494141 methods obtained in 0.888929483 seconds. 494141 methods obtained in 0.883309141 seconds. 494141 methods obtained in 0.88341098 seconds. 494141 methods obtained in 0.897397146 seconds. 494141 methods obtained in 0.885677466 seconds. 494141 methods obtained in 0.895834176 seconds. Patched code does the same about 10% faster: 19657 classes loaded in 2.084409717 seconds. 494124 methods obtained in 0.915928578 seconds. 494124 methods obtained in 0.785342465 seconds. 494124 methods obtained in 0.784852619 seconds. 494124 methods obtained in 0.793450205 seconds. 494124 methods obtained in 0.849915078 seconds. 494124 methods obtained in 0.77835511 seconds. 494124 methods obtained in 0.764144701 seconds. 494124 methods obtained in 0.754122383 seconds. 494124 methods obtained in 0.747961897 seconds. 494124 methods obtained in 0.7489937 seconds. Martin's test prints on my computer with original code the following: Base class load time: 177.80 ms getDeclaredMethods: Methods: 65521, Total time: 35.79 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 50.15 ms, Time per method: 0.0008 ms Derived class load time: 34015.70 ms getDeclaredMethods: Methods: 65521, Total time: 33.82 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 8122.00 ms, Time per method: 0.1239 ms And with patched code this: Base class load time: 157.16 ms getDeclaredMethods: Methods: 65521, Total time: 65.77 ms, Time per method: 0.0010 ms getMethods: Methods: 65530, Total time: 44.64 ms, Time per method: 0.0007 ms Derived class load time: 33996.69 ms getDeclaredMethods: Methods: 65521, Total time: 32.63 ms, Time per method: 0.0005 ms getMethods: Methods: 65530, Total time: 92.12 ms, Time per method: 0.0014 ms Here's a webrev of the patch: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.01/ Patched code is simpler (65 lines gone) and I hope, easier to understand and change (I think a change in spec is coming in JDK9 which will handle abstract interface methods the same way as default, right Joel?) I also took the liberty to eliminate some redundant array and Field/Method/Constructor copies. get[Method0,Field0,Counstuctor0] now return 'root' objects. Copying is performed in methods that call them and expose the objects to any code outside java.lang.Class. Also, findFields() and findMethods() don't do copying of Field/Method objects themselves, but rather live that to methods that call them. getInterfaces() method now delegates to getInterfaces(boolean copyArray) so that internally, not array copying is performed. All 55 java/lang/reflect jtreg tests pass with this patch. Regards, Peter
Re: Loading classes with many methods is very expensive
On 10/26/2014 09:25 PM, Peter Levart wrote: 19657 classes loaded in 1.987373401 seconds. 494141 methods obtained in 1.02493941 seconds. vs. 19657 classes loaded in 2.084409717 seconds. 494124 methods obtained in 0.915928578 seconds. Hi, As you might have noticed, the number of methods obtained from patched code differed from original code. I have investigated this and found that original code treats abstract class methods the same as abstract interface methods as far as multiple inheritance is concerned (it keeps them together in the returned array). So I fixed this and here's new webrev which behaves the same as original code: http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getMethods/webrev.02/ Comparing original vs. patched code still shows speed-up: Original: 19657 classes loaded in 1.980493029 seconds. 494141 methods obtained in 0.976318927 seconds. 494141 methods obtained in 0.886504437 seconds. 494141 methods obtained in 0.911153722 seconds. 494141 methods obtained in 0.880550509 seconds. 494141 methods obtained in 0.875526704 seconds. 494141 methods obtained in 0.877258894 seconds. 494141 methods obtained in 0.871794344 seconds. 494141 methods obtained in 0.884159644 seconds. 494141 methods obtained in 0.892648522 seconds. 494141 methods obtained in 0.884581841 seconds. Patched: 19657 classes loaded in 2.055697675 seconds. 494141 methods obtained in 0.853922188 seconds. 494141 methods obtained in 0.776203794 seconds. 494141 methods obtained in 0.858774803 seconds. 494141 methods obtained in 0.778178867 seconds. 494141 methods obtained in 0.760043997 seconds. 494141 methods obtained in 0.756352444 seconds. 494141 methods obtained in 0.740826372 seconds. 494141 methods obtained in 0.744264782 seconds. 494141 methods obtained in 0.73805894 seconds. 494141 methods obtained in 0.746852752 seconds. 55 java/lang/reflect jtreg tests still pass. As they did before, which means that we don't have a coverage for such cases. I'll see where I can add such a case (EnumSet for example, which inherits from Set interface and AbstractColection class via two different paths, so Set.size()/iterator() and AbstractCollection.size()/iterator() are both returned from getMethods())... Regards, Peter
Re: Loading classes with many methods is very expensive
55 java/lang/reflect jtreg tests still pass. As they did before, which means that we don't have a coverage for such cases. I'll see where I can add such a case (EnumSet for example, which inherits from Set interface and AbstractColection class via two different paths, so Set.size()/iterator() and AbstractCollection.size()/iterator() are both returned from getMethods())... Reminds me the case when a package private class implementing a public interface cannot have its methods invoked via object.getClass().getMethod(name, classes).invoke(object, args) outside the package. Also I can't find any spec. on how multiple equivalent methods should be returned. Regards, Peter
Re: Loading classes with many methods is very expensive
Hi Martin, On 23.10.2014 02:53, Martin Buchholz wrote: If you have a class with ~64k methods with a superclass that also has ~64k methods, class loading that single class will cost you ~30sec and calling Class.getMethods another ~10sec. Both are unacceptably slow. I think both are due to O(N^2) algorithms, the first in hotspot, and the second in Class.java. Interesting, this is the profile: http://cr.openjdk.java.net/~shade/8061949/calltree-1.txt ...and I submitted two issues as the result of this quick performance investigation. We also need better benchmarks for this. https://bugs.openjdk.java.net/browse/JDK-8061949 https://bugs.openjdk.java.net/browse/JDK-8061950 Nashorn also generates the mammoth classes on many cases, and therefore solving both should (at least marginally) help there as well. -Aleksey.
Re: Loading classes with many methods is very expensive
On 10/23/2014 01:57 AM, Stanimir Simeonoff wrote: Class.java can easily be improved by adding an auxiliary HasMapString, Integer/int[] indexing the position in the array by name and then checking only few overloaded signatures in case of a match. The auxiliary map can be deployed only past some threshold of methods, so it should not affect the common case. Alternatively sorting the array and using binary search is quite trivial to implement as well. Java Class.getMethods() implementation is complicated by the fact that, although not specified, the order of methods in returned array is important. Once it changed, if I remember correctly, and broke many programs, so it had to be restored... Peter Btw, really nice benchmark. Stanimir On Thu, Oct 23, 2014 at 1:53 AM, Martin Buchholz marti...@google.com wrote: Here at Google we have both kinds of scalability problems - loading classes from a classpath with 10,000 jars, and loading a single class file packed with the maximal number of methods. This message is about the latter. If you have a class with ~64k methods with a superclass that also has ~64k methods, class loading that single class will cost you ~30sec and calling Class.getMethods another ~10sec. Both are unacceptably slow. I think both are due to O(N^2) algorithms, the first in hotspot, and the second in Class.java. I have the start of a fix for Class.java, but it makes the common case slower. A really good fix is harder to find. In general, I think Class.java could benefit from some performance-oriented rework. Is anyone else working on class loading performance, especially in hotspot? Here's the benchmark (that could perhaps be integrated into openjdk even without a fix) http://cr.openjdk.java.net/~martin/webrevs/openjdk9/Class.getMethods-benchmark/test/java/lang/Class/getMethods/ManyMethodsBenchmark.java.html Base class load time: 186.44 ms getDeclaredMethods: Methods: 65521, Total time: 43.27 ms, Time per method: 0.0007 ms getMethods: Methods: 65530, Total time: 60.82 ms, Time per method: 0.0009 ms Derived class load time: 33440.13 ms getDeclaredMethods: Methods: 65521, Total time: 39.71 ms, Time per method: 0.0006 ms getMethods: Methods: 65530, Total time: 11582.54 ms, Time per method: 0.1768 ms
Re: Loading classes with many methods is very expensive
Peter, Which hotspot algorithm is the one you identified as slow? Would that be for loading the class or for Class.getMethods? thanks, Karen On Oct 23, 2014, at 9:37 AM, Peter Levart wrote: On 10/23/2014 01:57 AM, Stanimir Simeonoff wrote: Class.java can easily be improved by adding an auxiliary HasMapString, Integer/int[] indexing the position in the array by name and then checking only few overloaded signatures in case of a match. The auxiliary map can be deployed only past some threshold of methods, so it should not affect the common case. Alternatively sorting the array and using binary search is quite trivial to implement as well. Java Class.getMethods() implementation is complicated by the fact that, although not specified, the order of methods in returned array is important. Once it changed, if I remember correctly, and broke many programs, so it had to be restored... Peter Btw, really nice benchmark. Stanimir On Thu, Oct 23, 2014 at 1:53 AM, Martin Buchholz marti...@google.com wrote: Here at Google we have both kinds of scalability problems - loading classes from a classpath with 10,000 jars, and loading a single class file packed with the maximal number of methods. This message is about the latter. If you have a class with ~64k methods with a superclass that also has ~64k methods, class loading that single class will cost you ~30sec and calling Class.getMethods another ~10sec. Both are unacceptably slow. I think both are due to O(N^2) algorithms, the first in hotspot, and the second in Class.java. I have the start of a fix for Class.java, but it makes the common case slower. A really good fix is harder to find. In general, I think Class.java could benefit from some performance-oriented rework. Is anyone else working on class loading performance, especially in hotspot? Here's the benchmark (that could perhaps be integrated into openjdk even without a fix) http://cr.openjdk.java.net/~martin/webrevs/openjdk9/Class.getMethods-benchmark/test/java/lang/Class/getMethods/ManyMethodsBenchmark.java.html Base class load time: 186.44 ms getDeclaredMethods: Methods: 65521, Total time: 43.27 ms, Time per method: 0.0007 ms getMethods: Methods: 65530, Total time: 60.82 ms, Time per method: 0.0009 ms Derived class load time: 33440.13 ms getDeclaredMethods: Methods: 65521, Total time: 39.71 ms, Time per method: 0.0006 ms getMethods: Methods: 65530, Total time: 11582.54 ms, Time per method: 0.1768 ms
Re: Loading classes with many methods is very expensive
I was under the impression the declaring order has been abandoned in 1.7 which I considered quite a let down as allowed ordering the function calls in JMX in a non-chaotic way. On Thu, Oct 23, 2014 at 4:37 PM, Peter Levart peter.lev...@gmail.com wrote: On 10/23/2014 01:57 AM, Stanimir Simeonoff wrote: Class.java can easily be improved by adding an auxiliary HasMapString, Integer/int[] indexing the position in the array by name and then checking only few overloaded signatures in case of a match. The auxiliary map can be deployed only past some threshold of methods, so it should not affect the common case. Alternatively sorting the array and using binary search is quite trivial to implement as well. Java Class.getMethods() implementation is complicated by the fact that, although not specified, the order of methods in returned array is important. Once it changed, if I remember correctly, and broke many programs, so it had to be restored... Peter Btw, really nice benchmark. Stanimir On Thu, Oct 23, 2014 at 1:53 AM, Martin Buchholz marti...@google.com wrote: Here at Google we have both kinds of scalability problems - loading classes from a classpath with 10,000 jars, and loading a single class file packed with the maximal number of methods. This message is about the latter. If you have a class with ~64k methods with a superclass that also has ~64k methods, class loading that single class will cost you ~30sec and calling Class.getMethods another ~10sec. Both are unacceptably slow. I think both are due to O(N^2) algorithms, the first in hotspot, and the second in Class.java. I have the start of a fix for Class.java, but it makes the common case slower. A really good fix is harder to find. In general, I think Class.java could benefit from some performance-oriented rework. Is anyone else working on class loading performance, especially in hotspot? Here's the benchmark (that could perhaps be integrated into openjdk even without a fix) http://cr.openjdk.java.net/~martin/webrevs/openjdk9/Class. getMethods-benchmark/test/java/lang/Class/getMethods/ ManyMethodsBenchmark.java.html Base class load time: 186.44 ms getDeclaredMethods: Methods: 65521, Total time: 43.27 ms, Time per method: 0.0007 ms getMethods: Methods: 65530, Total time: 60.82 ms, Time per method: 0.0009 ms Derived class load time: 33440.13 ms getDeclaredMethods: Methods: 65521, Total time: 39.71 ms, Time per method: 0.0006 ms getMethods: Methods: 65530, Total time: 11582.54 ms, Time per method: 0.1768 ms
Re: Loading classes with many methods is very expensive
Hi Martin, On 23 okt 2014, at 00:53, Martin Buchholz marti...@google.com wrote: Here at Google we have both kinds of scalability problems - loading classes from a classpath with 10,000 jars, and loading a single class file packed with the maximal number of methods. This message is about the latter. If you have a class with ~64k methods with a superclass that also has ~64k methods, class loading that single class will cost you ~30sec and calling Class.getMethods another ~10sec. Both are unacceptably slow. I think both are due to O(N^2) algorithms, the first in hotspot, and the second in Class.java. Throw in lots of interfaces with lots of default methods and i suspect this can get even worse. I spent some time thinking about this when fixing an issue with reflection for default methods, reviewed here [1]. I have the start of a fix for Class.java, but it makes the common case slower. A really good fix is harder to find. In general, I think Class.java could benefit from some performance-oriented rework. Is anyone else working on class loading performance, especially in hotspot? We have been thinking about replacing the duplication of the method lookup logic in j.l.Class with a call to the VM which should already have the information needed. I’m not sure why the logic was duplicated on the library side way back when this was written. This isn’t something we are actively working on though. As an aside, I often found myself wanting for the actual method descriptor when working with this, but considering there can be *a lot* of instances of Method/Constructor adding a descriptor field to Executable wasn’t an obvious win to me. cheers /Joel [1] http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-May/026782.html
Re: Loading classes with many methods is very expensive
On 10/23/2014 03:48 PM, Karen Kinnear wrote: Peter, Which hotspot algorithm is the one you identified as slow? Would that be for loading the class or for Class.getMethods? thanks, Karen Martin did it. And Alexey profiled it. I just noted that Java part of Class.getMethods(), should it be rewritten, might be required to keep the order of Method objects returned in the array although it is not specified. Class.getMethods() is based on native Class methods which return declared methods. They are not slow. The Java algorithm that combines those into a compiled list of public methods (which includes inherited from supertypes) is slow for large number of methods since it is O(n^2). Regards, Peter On Oct 23, 2014, at 9:37 AM, Peter Levart wrote: On 10/23/2014 01:57 AM, Stanimir Simeonoff wrote: Class.java can easily be improved by adding an auxiliary HasMapString, Integer/int[] indexing the position in the array by name and then checking only few overloaded signatures in case of a match. The auxiliary map can be deployed only past some threshold of methods, so it should not affect the common case. Alternatively sorting the array and using binary search is quite trivial to implement as well. Java Class.getMethods() implementation is complicated by the fact that, although not specified, the order of methods in returned array is important. Once it changed, if I remember correctly, and broke many programs, so it had to be restored... Peter Btw, really nice benchmark. Stanimir On Thu, Oct 23, 2014 at 1:53 AM, Martin Buchholz marti...@google.com wrote: Here at Google we have both kinds of scalability problems - loading classes from a classpath with 10,000 jars, and loading a single class file packed with the maximal number of methods. This message is about the latter. If you have a class with ~64k methods with a superclass that also has ~64k methods, class loading that single class will cost you ~30sec and calling Class.getMethods another ~10sec. Both are unacceptably slow. I think both are due to O(N^2) algorithms, the first in hotspot, and the second in Class.java. I have the start of a fix for Class.java, but it makes the common case slower. A really good fix is harder to find. In general, I think Class.java could benefit from some performance-oriented rework. Is anyone else working on class loading performance, especially in hotspot? Here's the benchmark (that could perhaps be integrated into openjdk even without a fix) http://cr.openjdk.java.net/~martin/webrevs/openjdk9/Class.getMethods-benchmark/test/java/lang/Class/getMethods/ManyMethodsBenchmark.java.html Base class load time: 186.44 ms getDeclaredMethods: Methods: 65521, Total time: 43.27 ms, Time per method: 0.0007 ms getMethods: Methods: 65530, Total time: 60.82 ms, Time per method: 0.0009 ms Derived class load time: 33440.13 ms getDeclaredMethods: Methods: 65521, Total time: 39.71 ms, Time per method: 0.0006 ms getMethods: Methods: 65530, Total time: 11582.54 ms, Time per method: 0.1768 ms
Re: Loading classes with many methods is very expensive
On 10/23/2014 05:06 PM, Joel Borggrén-Franck wrote: Hi Martin, On 23 okt 2014, at 00:53, Martin Buchholz marti...@google.com wrote: Here at Google we have both kinds of scalability problems - loading classes from a classpath with 10,000 jars, and loading a single class file packed with the maximal number of methods. This message is about the latter. If you have a class with ~64k methods with a superclass that also has ~64k methods, class loading that single class will cost you ~30sec and calling Class.getMethods another ~10sec. Both are unacceptably slow. I think both are due to O(N^2) algorithms, the first in hotspot, and the second in Class.java. Throw in lots of interfaces with lots of default methods and i suspect this can get even worse. I spent some time thinking about this when fixing an issue with reflection for default methods, reviewed here [1]. Hi Joel, I may have found another issue: interface A4 extends B4, C4 {} interface B4 extends D4 { void m(); } interface C4 extends D4 {} interface D4 { void m(); } Calling A4.class.getMethods() returns B4.m, D4.m - this is supposed to be OK as per JDK 7 spec. Changing both methods to default methods: interface A5 extends B5, C5 {} interface B5 extends D5 { default void m() {} } interface C5 extends D5 {} interface D5 { default void m() {}; } A5.class.getMethods() returns B5.m - this is new in JDK 8 spec for default methods. Now see the following two examples (just one method is default, the other is abstract): interface A6 extends B6, C6 {} interface B6 extends D6 { void m(); } interface C6 extends D6 {} interface D6 { default void m() {}; } A6.class.getMethods() returns B6.m, D6.m // B.m, D.m interface A7 extends B7, C7 {} interface B7 extends D7 { default void m() {} } interface C7 extends D7 {} interface D7 { void m(); } A7.class.getMethods() returns B7.m Do last two examples give expected result? Why are A6 and A7 different? What is the rule here? I would expect A6.class.getMethods() only return the D6.m default method. This is the non-abstract method that gets used when implementing an interface. If A6 example is really showing a bug, then I may have a solution that fixes it *and* is faster than current code for getMethods() + it is O(n)... Is there a test that validates correctness of getMethods() or at least a set of interfaces and/or classes to exercise the algorithm and compare it to a different implementation? I have the start of a fix for Class.java, but it makes the common case slower. A really good fix is harder to find. In general, I think Class.java could benefit from some performance-oriented rework. Is anyone else working on class loading performance, especially in hotspot? We have been thinking about replacing the duplication of the method lookup logic in j.l.Class with a call to the VM which should already have the information needed. I’m not sure why the logic was duplicated on the library side way back when this was written. This isn’t something we are actively working on though. Does VM have enough information to quickly compile the list of public methods (getMethods() equivalent) or only to look-up the public method (getMethod() equivalent)? Regards, Peter As an aside, I often found myself wanting for the actual method descriptor when working with this, but considering there can be *a lot* of instances of Method/Constructor adding a descriptor field to Executable wasn’t an obvious win to me. cheers /Joel [1] http://mail.openjdk.java.net/pipermail/core-libs-dev/2014-May/026782.html
Re: Loading classes with many methods is very expensive
On 10/23/2014 05:44 PM, Peter Levart wrote: interface A6 extends B6, C6 {} interface B6 extends D6 { void m(); } interface C6 extends D6 {} interface D6 { default void m() {}; } A6.class.getMethods() returns B6.m, D6.m Ah, B6.m re-abstracts the default method D6.m. I can see the rule here. Never mind my previous question. Regards, Peter
Re: Loading classes with many methods is very expensive
On 23 Oct 2014, at 17:44, Peter Levart peter.lev...@gmail.com wrote: Is there a test that validates correctness of getMethods() or at least a set of interfaces and/or classes to exercise the algorithm and compare it to a different implementation? There is :) http://hg.openjdk.java.net/jdk9/dev/jdk/file/8b4aa51c8744/test/java/lang/reflect/DefaultMethodMembers/FilterNotMostSpecific.java IIRC 75% or so of the cases passes both before and after the change. I have highlighted a number of interesting cases with line comments. cheers /Joel
Loading classes with many methods is very expensive
Here at Google we have both kinds of scalability problems - loading classes from a classpath with 10,000 jars, and loading a single class file packed with the maximal number of methods. This message is about the latter. If you have a class with ~64k methods with a superclass that also has ~64k methods, class loading that single class will cost you ~30sec and calling Class.getMethods another ~10sec. Both are unacceptably slow. I think both are due to O(N^2) algorithms, the first in hotspot, and the second in Class.java. I have the start of a fix for Class.java, but it makes the common case slower. A really good fix is harder to find. In general, I think Class.java could benefit from some performance-oriented rework. Is anyone else working on class loading performance, especially in hotspot? Here's the benchmark (that could perhaps be integrated into openjdk even without a fix) http://cr.openjdk.java.net/~martin/webrevs/openjdk9/Class.getMethods-benchmark/test/java/lang/Class/getMethods/ManyMethodsBenchmark.java.html Base class load time: 186.44 ms getDeclaredMethods: Methods: 65521, Total time: 43.27 ms, Time per method: 0.0007 ms getMethods: Methods: 65530, Total time: 60.82 ms, Time per method: 0.0009 ms Derived class load time: 33440.13 ms getDeclaredMethods: Methods: 65521, Total time: 39.71 ms, Time per method: 0.0006 ms getMethods: Methods: 65530, Total time: 11582.54 ms, Time per method: 0.1768 ms
Re: Loading classes with many methods is very expensive
Class.java can easily be improved by adding an auxiliary HasMapString, Integer/int[] indexing the position in the array by name and then checking only few overloaded signatures in case of a match. The auxiliary map can be deployed only past some threshold of methods, so it should not affect the common case. Alternatively sorting the array and using binary search is quite trivial to implement as well. Btw, really nice benchmark. Stanimir On Thu, Oct 23, 2014 at 1:53 AM, Martin Buchholz marti...@google.com wrote: Here at Google we have both kinds of scalability problems - loading classes from a classpath with 10,000 jars, and loading a single class file packed with the maximal number of methods. This message is about the latter. If you have a class with ~64k methods with a superclass that also has ~64k methods, class loading that single class will cost you ~30sec and calling Class.getMethods another ~10sec. Both are unacceptably slow. I think both are due to O(N^2) algorithms, the first in hotspot, and the second in Class.java. I have the start of a fix for Class.java, but it makes the common case slower. A really good fix is harder to find. In general, I think Class.java could benefit from some performance-oriented rework. Is anyone else working on class loading performance, especially in hotspot? Here's the benchmark (that could perhaps be integrated into openjdk even without a fix) http://cr.openjdk.java.net/~martin/webrevs/openjdk9/Class.getMethods-benchmark/test/java/lang/Class/getMethods/ManyMethodsBenchmark.java.html Base class load time: 186.44 ms getDeclaredMethods: Methods: 65521, Total time: 43.27 ms, Time per method: 0.0007 ms getMethods: Methods: 65530, Total time: 60.82 ms, Time per method: 0.0009 ms Derived class load time: 33440.13 ms getDeclaredMethods: Methods: 65521, Total time: 39.71 ms, Time per method: 0.0006 ms getMethods: Methods: 65530, Total time: 11582.54 ms, Time per method: 0.1768 ms