Re: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v3]

Jason Tatton Mon, 21 Sep 2020 05:46:46 -0700

> This is an implementation of the indexOf(char) intrinsic for StringLatin1 (1 
> byte encoded Strings). It is provided for
> x86 and ARM64. The implementation is greatly inspired by the indexOf(char) 
> intrinsic for StringUTF16. To incorporate it
> I had to make a small change to StringLatin1.java (refactor of functionality 
> to intrisified private method) as well as
> code for C2.  Submitted to: hotspot-compiler-dev and core-libs-dev as this 
> patch contains a change to hotspot and
> java/lang/StringLatin1.java  https://bugs.openjdk.java.net/browse/JDK-8173585
> 
> Details of testing:
> ============
> I have created a jtreg test 
> “compiler/intrinsics/string/TestStringLatin1IndexOfChar” to cover this new 
> intrinsic. Note
> that, particularly for the x86 implementation of the intrinsic, the code path 
> taken is dependent upon the length of the
> input String. Hence the test has been designed to cover all these cases. In 
> summary they are:
> - A “short” string of < 16 characters.
> - A SIMD String of 16 – 31 characters.
> - A AVX2 SIMD String of 32 characters+.
> 
> Hardware used for testing:
> -----------------------------
> 
> - Intel Xeon CPU E5-2680 (JVM did not recognize this as having AVX2 support) 
> • Intel i7 processor (with AVX2 support).
> - AWS Graviton 2 (ARM 64 processor).
> 
> I also ran; ‘run-test-tier1’ and ‘run-test-tier2’ for: x86_64 and aarch64.
> 
> Possible future enhancements:
> ====================
> For the x86 implementation there may be two further improvements we can make 
> in order to improve performance of both
> the StringUTF16 and StringLatin1 indexOf(char) intrinsics:
> 1. Make use of AVX-512 instructions.
> 2. For “short” Strings (see below), I think it may be possible to modify the 
> existing algorithm to still use SSE SIMD
> instructions instead of a loop.
> Benchmark results:
> ============
> **Without** the new StringLatin1 indexOf(char) intrinsic:
> 
> | Benchmark  | Mode | Cnt | Score | Error | Units |
> | ------------- | ------------- |------------- |------------- |------------- 
> |------------- |
> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **26,389.129** | ± 182.581 
> | ns/op |
> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 17,885.383 | ± 435.933  | 
> ns/op |
> 
> 
> **With** the new StringLatin1 indexOf(char) intrinsic:
> 
> | Benchmark  | Mode | Cnt | Score | Error | Units |
> | ------------- | ------------- |------------- |------------- |------------- 
> |------------- |
> | IndexOfBenchmark.latin1_mixed_char | avgt | 5 | **17,875.185** | ± 407.716  
> | ns/op |
> | IndexOfBenchmark.utf16_mixed_char | avgt | 5 | 18,292.802 | ± 167.306  | 
> ns/op |
> 
> 
> The objective of the patch is to bring the performance of StringLatin1 
> indexOf(char) in line with StringUTF16
> indexOf(char) for x86 and ARM64. We can see above that this has been 
> achieved. Similar results were obtained when
> running on ARM.


Jason Tatton has updated the pull request incrementally with one additional 
commit since the last revision:

  Add missing newline to end of vmSymbols.cpp

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/71/files
  - new: https://git.openjdk.java.net/jdk/pull/71/files/b85a7fb4..c8cc441e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=71&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=71&range=01-02

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/71.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/71/head:pull/71

PR: https://git.openjdk.java.net/jdk/pull/71

Re: RFR: 8173585: Intrinsify StringLatin1.indexOf(char) [v3]

Reply via email to