On Wed, 17 Nov 2021 18:48:14 GMT, Volker Simonis <simo...@openjdk.org> wrote:

> In JDK 9, function objects implemented as anonymous inner classes in 
> java.util.regex.Pattern have been replaced by lambda functions (see 
> [JDK-8151481](https://bugs.openjdk.java.net/browse/JDK-8151481)). This led to 
> a significant performance regression (up to ~8X) for patterns with negated 
> character classes (e.g. "[^A-Za-z0-9]").
> 
> JDK 8
> -----
> Benchmark                  (patternString)                                    
>   (text)  Mode  Cnt    Score     Error  Units
> FindPattern.testFind          [^A-Za-z0-9]  
> abcdefghijklmnop1234567890ABCDEFGHIJKLMNOP  avgt    3  241.359 ± 142.079  
> ns/op
> FindPattern.testFind          [^A-Za-z0-9]  
> ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  avgt    3  754.492 ± 181.505  
> ns/op
> FindPattern.testFind           [A-Za-z0-9]  
> abcdefghijklmnop1234567890ABCDEFGHIJKLMNOP  avgt    3  738.998 ± 402.188  
> ns/op
> FindPattern.testFind           [A-Za-z0-9]  
> ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  avgt    3  260.550 ± 118.717  
> ns/op
> 
> JDK 17
> ------
> Benchmark                  (patternString)                                    
>   (text)  Mode  Cnt     Score     Error  Units
> FindPattern.testFind          [^A-Za-z0-9]  
> abcdefghijklmnop1234567890ABCDEFGHIJKLMNOP  avgt    3  1921.954 ±  90.698  
> ns/op
> FindPattern.testFind          [^A-Za-z0-9]  
> ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  avgt    3  2496.115 ± 555.147  
> ns/op
> FindPattern.testFind           [A-Za-z0-9]  
> abcdefghijklmnop1234567890ABCDEFGHIJKLMNOP  avgt    3   873.154 ± 178.640  
> ns/op
> FindPattern.testFind           [A-Za-z0-9]  
> ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  avgt    3   349.939 ±  33.585  
> ns/op
> 
> JDK 17 fixed
> -------------------
> Benchmark                  (patternString)                                    
>   (text)  Mode  Cnt    Score     Error  Units
> FindPattern.testFind          [^A-Za-z0-9]  
> abcdefghijklmnop1234567890ABCDEFGHIJKLMNOP  avgt    3  343.541 ± 181.147  
> ns/op
> FindPattern.testFind          [^A-Za-z0-9]  
> ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  avgt    3  744.226 ± 114.802  
> ns/op
> FindPattern.testFind           [A-Za-z0-9]  
> abcdefghijklmnop1234567890ABCDEFGHIJKLMNOP  avgt    3  753.297 ± 216.331  
> ns/op
> FindPattern.testFind           [A-Za-z0-9]  
> ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,  avgt    3  240.142 ±  13.339  
> ns/op
> 
> 
> ## TL;DR
> 
> The reason for this regression is trashing of the `_secondary_super_cache` 
> entry in the generated Lambda classes. During matching we alternately call 
> methods from different interfaces on the generated Lambda class objects with 
> the effect that  the quick type check against the `_secondary_super_cache` 
> continuously fails. Consequently we have to take the slow path which does a 
> linear search over all secondary super types and updates 
> `_secondary_super_cache`.
> 
> This was the "short" description but as I've personally found the details 
> quite intricate I've wrote a much more detailed description of the problem at 
> the end of this PR.
> 
> The fix of the problem is quite simple. Move the Lambda-generating methods 
> out of the function interfaces. As a result the calls to the lambda functions 
> become static and don't depend on the exact type of the Lambda classes any 
> more.
> 
> I've also removed the overloaded `union(CharPredicate... predicates)` version 
> from `BmpCharPredicate`. First, it is not used anywhere anyway but second and 
> more important, its implementation is plain wrong. You can't cast a 
> `CharPredicate` into a `BmpCharPredicate` - this will deterministically lead 
> to a `ClassCastException` at runtime. That said, I think I understand the 
> motivation behind that overloaded version of `union(..)`. It could be use to 
> minimize the number of union nodes (i.e. predicates) for character classes 
> with more than two ranges (e.g. [a-zA-Z0-9]). But I leave that optimization 
> and a correct implementation of `union(CharPredicate... predicates)` for a 
> future change.
> 
> ## Long, but maybe worth reading :)
> 
> Currently, when a `Pattern` is compiled, we build a tree of `Pattern$Node`s. 
> Each `Node` has a `match()` method to match its part of the pattern. For 
> matching a character we use the general `CharProperty` node and the derived 
> `BmpCharProperty` node which only match characters from the [Basic 
> Multilingual 
> Plane](https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane)
>  (i.e. characters which can be encoded in 16 bits). These nodes take a 
> `CharPredicate` or `BmpCharPredicate` argument respectively which are used to 
> decide if the character in question will be matched. `BmpCharPredicate` is a 
> specialization of and derived from `CharPredicate`. Since JDK 9 
> `CharPredicate`/`BmpCharPredicate` are `@FunctionalInterface`s which are 
> implemented by various Lambda functions. They all implement the `boolean 
> CharPredicate::is(int ch)` function which decides if a given character 
> matches. `CharPredicate`s can be combined to a tree to match more complex 
> character classes.
> 
> In order to match a character class like "[^A-Za-z0-9]" the corresponding 
> compiled `Pattern` will contain a `CharProperty` node with the following 
> `CharPredicate` tree:
> 
> 
>           Negate
>             |
>             |
>             |
>           Union
>             |\
>           | \
>           |  ___________________________________________
>           |                                              |
>           |                                              |
>         Union                                          Range (0-9)
>             |\
>           | \
>           |  ___________________________________________
>           |                                              |
>           |                                              |
>         Range (A-Z)                                    Range (a-z)
> 
> 
> Every node in this tree implements `CharPredicate` or `BmpCharPredicate`. 
> They are currently all implemented as Lambda functions (see 
> [Pattern.java](https://github.com/openjdk/jdk/blob/e5ffdf9120c14b38e4c8794888d2002e2686ebfc/src/java.base/share/classes/java/util/regex/Pattern.java#L5608)):
> 
> 
> @FunctionalInterface
> static interface CharPredicate {
>     boolean is(int ch);
> 
>     default CharPredicate union(CharPredicate p) {
>         return ch -> is(ch) || p.is(ch);
>     }
>     default CharPredicate negate() {
>         return ch -> !is(ch);
>     }
>     ...  
> }
> static interface BmpCharPredicate extends CharPredicate {
>     default CharPredicate union(CharPredicate p) {
>         if (p instanceof BmpCharPredicate)
>             return (BmpCharPredicate)(ch -> is(ch) || p.is(ch));
>         return ch -> is(ch) || p.is(ch);
>     }
>     ...
> }
> static CharPredicate Range(int lower, int upper) {
>     if (upper < Character.MIN_HIGH_SURROGATE ||
>         lower > Character.MAX_LOW_SURROGATE &&
>         upper < Character.MIN_SUPPLEMENTARY_CODE_POINT)
>         return (BmpCharPredicate)(ch -> inRange(lower, ch, upper));
>     return ch -> inRange(lower, ch, upper);
> }
> 
> 
> `Range` is a Lambda function which captures a `lower` and an `upper` bound 
> and calls `isRange()` for a given character to check if this character falls 
> inside the predefined range. A `Union` captures two existing `CharPredicate`s 
> and evaluates if at least one of them is true for a given character.
> 
> Because `Range(int lower, int upper){..}` is a static method of the `Pattern` 
> class, the Lambda it returns will trigger the generation of a synthetic 
> Lambda class (i.e. `Pattern$$Lambda$20`) which implements `CharPredicate` or 
> `BmpCharPredicate` (depending on `lower`/`upper`) and calls a synthetic 
> Lambda method (i.e. `Pattern::lambda$Range$10()`) which will be injected into 
> the `Pattern` class.
> 
> For `Union`/`Negate` things are a little more complicated. These Lambdas are 
> generated from default methods in the `CharPredicate`/`BmpCharPredicate` 
> interfaces. So for a `Union` of two `BmpCharPredicate`s 
> `BmpCharPredicate::union()` will generate a synthetic Lambda class (i.e. 
> `Pattern$BmpCharPredicate$$Lambda$21`) which calls a Lambda method (i.e. 
> `Pattern$BmpCharPredicate::lambda$union$2()`) which will be injected into the 
> `BmpCharPredicate` interface.
> 
> The `Negate` and `Union` Lambda class also capture one respectively two 
> `CharPredicate`/`BmpCharPredicate` objects. I.e. for `Negate` it's the 
> implicit `this` parameter which is always a `CharPredicate`, for `Union` it's 
> the implicit `this` parameter which is either a `CharPredicate` or a 
> `BmpCharPredicate` object (depending on which overloaded version of `union()` 
> we call) and the additional `CharPredicate` object which is passed in as a 
> parameter. 
> 
> With this information we can refine our previously depicted predicate tree as 
> follows:
> 
> 
>           Negate
> (CharPredicate$$Lambda$22)
>             |
>             |
>             |
>           Union
> (BmpCharPredicate$$Lambda$21)
>             |\
>           | \
>           |  ___________________________________________
>           |                                              |
>           |                                              |
>         Union                                          Range 0-9
> (BmpCharPredicate$$Lambda$21)                     (Pattern$$Lambda$20)
>             |\
>           | \
>           |  ___________________________________________
>           |                                              |
>           |                                              |
>         Range A-Z                                      Range a-z
>      (Pattern$$Lambda$20)                         (Pattern$$Lambda$20)
> 
> 
> When we start to match the first character with this pattern, we'll arrive at 
> the following stack trace:
> 
> 
> [1]  java.util.regex.Pattern.inRange (Pattern.java:5,728)
> [2]  java.util.regex.Pattern.lambda$Range$10 (Pattern.java:5,738)
> [3]  java.util.regex.Pattern$$Lambda$20/0x0000000080001c60.is (null)
> [4]  java.util.regex.Pattern$BmpCharPredicate.lambda$union$2 
> (Pattern.java:5,636)
> [5]  
> java.util.regex.Pattern$BmpCharPredicate$$Lambda$21/0x0000000080080000.is 
> (null)
> [6]  java.util.regex.Pattern$BmpCharPredicate.lambda$union$2 
> (Pattern.java:5,636)
> [7]  
> java.util.regex.Pattern$BmpCharPredicate$$Lambda$21/0x0000000080080000.is 
> (null)
> [8]  java.util.regex.Pattern$CharPredicate.lambda$negate$3 
> (Pattern.java:5,623)
> [9]  java.util.regex.Pattern$CharPredicate$$Lambda$22/0x0000000080080278.is 
> (null)
> [10] java.util.regex.Pattern$CharProperty.match (Pattern.java:3,940)
> [11] java.util.regex.Pattern$StartS.match (Pattern.java:3,651)
> [12] java.util.regex.Matcher.search (Matcher.java:1,728)
> [13] java.util.regex.Matcher.find (Matcher.java:745)
> 
> 
> We first call `is()` on a `Negate` node which is actually a 
> `CharPredicate$$Lambda$22` (frame [9] in the stack trace):
> 
> final class java.util.regex.Pattern$CharPredicate$$Lambda$22 implements 
> java.util.regex.Pattern$CharPredicate {
>   private final java.util.regex.Pattern$CharPredicate arg$1;
>   public boolean is(int) {
>     0: aload_0
>     1: getfield      #15                 // Field 
> arg$1:Ljava/util/regex/Pattern$CharPredicate;
>     4: iload_1
>     5: invokeinterface #20,  2           // InterfaceMethod 
> java/util/regex/Pattern$CharPredicate.lambda$negate$3:(I)Z
>    10: ireturn
> 
> 
> When the `Negate` predicate was created it captured another predicate, a 
> `Union` predicate of exact type `BmpCharPredicate$$Lambda$21` in this case, 
> and stored it in its private `arg$1` field. Next 
> `java.util.regex.Pattern$CharPredicate::lambda$negate$3` will be called on 
> this `BmpCharPredicate$$Lambda$21` object. `lambda$negate$3` is a synthetic 
> method which was injected into the `java.util.regex.Pattern$CharPredicate` 
> interface. The JVM will check for this call that `arg$1` implements 
> `Pattern$CharPredicate`. `arg$1` is actually of type 
> `BmpCharPredicate$$Lambda$21` but `BmpCharPredicate$$Lambda$21` has 
> `Pattern$CharPredicate` as one of its "secondary subtypes". The JVM will 
> detect this and will **store `Pattern$CharPredicate` in the 
> `_secondary_super_cache` field of `BmpCharPredicate$$Lambda$21`'s instance 
> klass** for quicker lookup in the future. Finally, `lambda$negate$3` itself 
> calls `is()` on itself and returns the negated result of that call:
> 
> 
>   private boolean Pattern$CharPredicate::lambda$negate$3(int) {
>     0: aload_0
>     1: iload_1
>     2: invokeinterface #12,  2           // InterfaceMethod 
> java/util/regex/Pattern$CharPredicate.is:(I)Z
>     7: ifne          14
>    10: iconst_1
>    11: goto          15
>    14: iconst_0
>    15: ireturn
> 
> 
> Now remember that the current predicate object on which we've invoked 
> `lambda$negate$3` is really a `Union` predicate (i.e. an instance of 
> `BmpCharPredicate$$Lambda$21`) which overrides `is()` so we will be actually 
> calling `BmpCharPredicate$$Lambda$21::is()` (frame [7] in the stack trace):
> 
> final class java.util.regex.Pattern$BmpCharPredicate$$Lambda$21 implements 
> java.util.regex.Pattern$BmpCharPredicate {
>   private final java.util.regex.Pattern$BmpCharPredicate arg$1;
>   private final java.util.regex.Pattern$CharPredicate arg$2;
>   public boolean is(int) {
>     0: aload_0
>     1: getfield      #17                 // Field 
> arg$1:Ljava/util/regex/Pattern$BmpCharPredicate;
>     4: aload_0
>     5: getfield      #19                 // Field 
> arg$2:Ljava/util/regex/Pattern$CharPredicate;
>     8: iload_1
>     9: invokeinterface #25,  3           // InterfaceMethod 
> java/util/regex/Pattern$BmpCharPredicate.lambda$union$2:(Ljava/util/regex/Pattern$CharPredicate;I)Z
>    14: ireturn
> 
> 
> When the `Pattern$BmpCharPredicate$$Lambda$21` predicate was created it 
> captured two other predicates in its private `arg$1` and `arg$2` fields. On 
> the first one in `arg$1`, `BmpCharPredicate$$Lambda$21::is()` will now invoke 
> the `Pattern$BmpCharPredicate::lambda$union$2(..)` method (frame [6]). 
> Because this lambda method was created from the 
> `Pattern$BmpCharPredicate::union(..)` default method, it was injected into 
> the `Pattern$BmpCharPredicate` class. Therefore the JVM will now have to make 
> sure that `arg$1`'s class implements `Pattern$BmpCharPredicate`. 
> Unfortunately, the `_secondary_super_cache` field of 
> `BmpCharPredicate$$Lambda$21`'s instance klass is currently containing the 
> `Pattern$CharPredicate` interface. So we'll have to take the slow path again 
> and search all of `BmpCharPredicate$$Lambda$21` secondary supertypes. At the 
> end of this search we will find **`Pattern$BmpCharPredicate` as an 
> implemented supertype and store it in the `_secondary_super_cache` field of 
> `BmpChar
 Predicate$$Lambda$21`'s instance klass** (overwriting the previous 
`Pattern$CharPredicate` value).
> 
> `BmpCharPredicate$$Lambda$21`'s secondary supertype cached in the 
> `_secondary_super_cache` field will now constantly bounce between 
> `Pattern$BmpCharPredicate` and `Pattern$CharPredicate` forcing every check to 
> go through the slow path and updating the field. This is the major reason of 
> the performance regression observed for patterns with negated character 
> classes like "[^A-Za-z0-9]". It can be clearly seen in a JMH profile:
> 
> 
> ....[Hottest 
> Regions]...............................................................................
>  24.72%   c2, level 4  
> java.util.regex.Pattern$BmpCharPredicate$$Lambda$21.0x00000007c0149fc8::is, 
> version 609 (138 bytes) 
>  19.95%   c2, level 4  
> java.util.regex.Pattern$CharPredicate$$Lambda$22.0x00000007c014a240::is, 
> version 621 (102 bytes) 
> 
> ....[Hottest Region 
> 1]..............................................................................
> c2, level 4, 
> java.util.regex.Pattern$BmpCharPredicate$$Lambda$21.0x00000007c0149fc8::is, 
> version 609 (138 bytes) 
>   0.48%       0x00007fffe0482fcf:   mov    0x10(%rsi),%r11d             
> ;*getfield arg$2 {reexecute=0 rethrow=0 return_oop=0}
>                                                                         ; - 
> java.util.regex.Pattern$BmpCharPredicate$$Lambda$21/0x00000007c0149fc8::is@5
>   0.08%       0x00007fffe0482fd3:   mov    0xc(%rsi),%r10d              
> ;*getfield arg$1 {reexecute=0 rethrow=0 return_oop=0}
>                                                                         ; - 
> java.util.regex.Pattern$BmpCharPredicate$$Lambda$21/0x00000007c0149fc8::is@1
>               0x00007fffe0482fd7:   nopw   0x0(%rax,%rax,1)
>   0.26%       0x00007fffe0482fe0:   mov    0x8(%r12,%r10,8),%ecx        ; 
> implicit exception: dispatches to 0x00007fffe0483071
>   0.37%       0x00007fffe0482fe5:   movabs $0x7c01498f8,%rax            ;   
> {metadata(&apos;java/util/regex/Pattern$BmpCharPredicate&apos;)}
>   0.04%       0x00007fffe0482fef:   xor    %rsi,%rsi
>   0.03%       0x00007fffe0482ff2:   lea    (%rsi,%rcx,8),%rsi
>   1.77%       0x00007fffe0482ff6:   mov    0x20(%rsi),%r9
>   0.01%       0x00007fffe0482ffa:   nopw   0x0(%rax,%rax,1)
>   0.29%       0x00007fffe0483000:   cmp    %rax,%r9
>   0.26%  ╭    0x00007fffe0483003:   jne    0x00007fffe048302b           
> ;*invokeinterface lambda$union$2 {reexecute=0 rethrow=0 return_oop=0}
>          │                                                              ; - 
> java.util.regex.Pattern$BmpCharPredicate$$Lambda$21/0x00000007c0149fc8::is@9
>          │ ↗  0x00007fffe0483005:   lea    (%r12,%r10,8),%rsi           
> ;*getfield arg$1 {reexecute=0 rethrow=0 return_oop=0}
>          │ │                                                            ; - 
> java.util.regex.Pattern$BmpCharPredicate$$Lambda$21/0x00000007c0149fc8::is@1
>          │ │  0x00007fffe0483009:   mov    %r11,%rdx
>   0.36%  │ │  0x00007fffe048300c:   shl    $0x3,%rdx                    
> ;*getfield arg$2 {reexecute=0 rethrow=0 return_oop=0}
>          │ │                                                            ; - 
> java.util.regex.Pattern$BmpCharPredicate$$Lambda$21/0x00000007c0149fc8::is@5
>   0.27%  │ │  0x00007fffe0483010:   mov    %r8d,%ecx
>   0.41%  │ │  0x00007fffe0483013:   callq  0x00007fffd8fa6700           ; 
> ImmutableOopMap {}
>          │ │                                                            
> ;*invokeinterface lambda$union$2 {reexecute=0 rethrow=0 return_oop=0}
>          │ │                                                            ; - 
> java.util.regex.Pattern$BmpCharPredicate$$Lambda$21/0x00000007c0149fc8::is@9
>          │ │                                                            ;   
> {optimized virtual_call}
>   0.59%  │ │  0x00007fffe0483018:   add    $0x20,%rsp
>   0.01%  │ │  0x00007fffe048301c:   pop    %rbp
>   0.50%  │ │  0x00007fffe048301d:   cmp    0x338(%r15),%rsp             ;   
> {poll_return}
>          │ │  0x00007fffe0483024:   ja     0x00007fffe0483084
>   0.10%  │ │  0x00007fffe048302a:   retq   
>          ↘ │  0x00007fffe048302b:   push   %rax
>            │  0x00007fffe048302c:   mov    %rax,%rax
>            │  0x00007fffe048302f:   mov    0x28(%rsi),%rdi
>   0.89%    │  0x00007fffe0483033:   mov    (%rdi),%ecx
>            │  0x00007fffe0483035:   add    $0x8,%rdi
>            │  0x00007fffe0483039:   test   %rax,%rax
>   8.19%    │  0x00007fffe048303c:   repnz scas %es:(%rdi),%rax
>   7.41%    │  0x00007fffe048303f:   pop    %rax
>           ╭│  0x00007fffe0483040:   jne    0x00007fffe048304a
>   0.69%   ││  0x00007fffe0483046:   mov    %rax,0x20(%rsi)
>   0.00%   ↘╰  0x00007fffe048304a:   je     0x00007fffe0483005
> 
> ....[Hottest Region 
> 2]..............................................................................
> c2, level 4, 
> java.util.regex.Pattern$CharPredicate$$Lambda$22.0x00000007c014a240::is, 
> version 621 (102 bytes) 
>   0.17%       0x00007fffe04897c8:   sub    $0x20,%rsp                   
> ;*synchronization entry
>                                                                         ; - 
> java.util.regex.Pattern$CharPredicate$$Lambda$22/0x00000007c014a240::is@-1
>   0.01%       0x00007fffe04897cc:   mov    0xc(%rsi),%r11d              
> ;*getfield arg$1 {reexecute=0 rethrow=0 return_oop=0}
>                                                                         ; - 
> java.util.regex.Pattern$CharPredicate$$Lambda$22/0x00000007c014a240::is@1
>   0.37%       0x00007fffe04897d0:   mov    0x8(%r12,%r11,8),%r10d       ; 
> implicit exception: dispatches to 0x00007fffe0489849
>   0.00%       0x00007fffe04897d5:   movabs $0x7c0149710,%rax            ;   
> {metadata(&apos;java/util/regex/Pattern$CharPredicate&apos;)}
>   0.12%       0x00007fffe04897df:   xor    %rsi,%rsi
>   0.16%       0x00007fffe04897e2:   lea    (%rsi,%r10,8),%rsi
>   1.07%       0x00007fffe04897e6:   mov    0x20(%rsi),%r10
>   0.00%       0x00007fffe04897ea:   cmp    %rax,%r10
>   0.34%  ╭    0x00007fffe04897ed:   jne    0x00007fffe048980b           
> ;*invokeinterface lambda$negate$3 {reexecute=0 rethrow=0 return_oop=0}
>          │                                                              ; - 
> java.util.regex.Pattern$CharPredicate$$Lambda$22/0x00000007c014a240::is@5
>          │ ↗  0x00007fffe04897ef:   lea    (%r12,%r11,8),%rsi
>   0.31%  │ │  0x00007fffe04897f3:   callq  0x00007fffd8fa93e0           ; 
> ImmutableOopMap {}
>          │ │                                                            
> ;*invokeinterface lambda$negate$3 {reexecute=0 rethrow=0 return_oop=0}
>          │ │                                                            ; - 
> java.util.regex.Pattern$CharPredicate$$Lambda$22/0x00000007c014a240::is@5
>          │ │                                                            ;   
> {optimized virtual_call}
>          │ │  0x00007fffe04897f8:   add    $0x20,%rsp
>   0.25%  │ │  0x00007fffe04897fc:   pop    %rbp
>   0.00%  │ │  0x00007fffe04897fd:   cmp    0x338(%r15),%rsp             ;   
> {poll_return}
>          │ │  0x00007fffe0489804:   ja     0x00007fffe0489858
>   0.03%  │ │  0x00007fffe048980a:   retq   
>          ↘ │  0x00007fffe048980b:   push   %rax
>   0.00%    │  0x00007fffe048980c:   mov    %rax,%rax
>            │  0x00007fffe048980f:   mov    0x28(%rsi),%rdi
>   0.91%    │  0x00007fffe0489813:   mov    (%rdi),%ecx
>   0.00%    │  0x00007fffe0489815:   add    $0x8,%rdi
>            │  0x00007fffe0489819:   test   %rax,%rax
>   7.59%    │  0x00007fffe048981c:   repnz scas %es:(%rdi),%rax
>   7.68%    │  0x00007fffe048981f:   pop    %rax
>           ╭│  0x00007fffe0489820:   jne    0x00007fffe048982a
>   0.65%   ││  0x00007fffe0489826:   mov    %rax,0x20(%rsi)
>           ↘╰  0x00007fffe048982a:   je     0x00007fffe04897ef
> 
> 
> There might be ways to handle such situations more intelligently in the 
> `LambdaMetafactory` when creating the lambda classes or in HotSpot when doing 
> the type checks for a call. But this must be left for a future enhancement. 
> For now I propose to fix this right in the Java code of `Pattern.java`. 
> There's no reason to create the lambda functions right in the default methods 
> of the functional interfaces. We can just as well move the lambda creation 
> into static helper functions in the `Pattern` class. This way the lambda 
> helper functions like `lambda$negate$3` and `lambda$union$2` can be invoked 
> with an `invokestatic` bytecode on the `Patter` class and avoid the secondary 
> subtype checks which caused the problems in the original implementation.
> 
> Notice that the current `invokeinterface` calls in the Lambda classes `is()` 
> methods where only introduced in JDK 15 with [JDK-8238358: Implementation of 
> JEP 371: Hidden Classes](https://bugs.openjdk.java.net/browse/JDK-8238358). 
> Before that (e.g. in JDK 11), `LambdaMetafactory` created `invokespecial` 
> calls to the `lambda$negate$3`/`lambda$union$2` lambda methods. With regards 
> to this issue, that makes no difference because the JVM does the same 
> secondary sub-type checking for both `invokespecial` and `invokeinterface`. 
> However, [JDK-8238358](https://bugs.openjdk.java.net/browse/JDK-8238358) has 
> introduced another regression in C1's [interface 
> call](https://wiki.openjdk.java.net/display/HotSpot/InterfaceCalls) 
> implementation which is currently tracked under [JDK-8274983: Pattern.matcher 
> performance regression after 
> JDK-8238358](https://bugs.openjdk.java.net/browse/JDK-8274983). These changes 
> will workaround that regressions described in 
> [JDK-8274983](https://bugs.openjdk.java.net/
 browse/JDK-8274983) for some patterns but can obviously not solve the C1 
weakness in general.

Hi Volker,
good finding and a sensible way to fix the performance regression, given that 
the type checking mechanism in hotspot is as it is.

-------------

Marked as reviewed by clanger (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/6434

Reply via email to