Hello, 

I've reworked the patch in order to decide about char[] / byte[] lazily.

This allows to dodge performance impact of reflective calls to 
String.isLatin1() for non-latin Strings
and at the same time keep the benefits of discrimination between char[] / 
byte[].

I've collected resutls for all the versions of my patch into the table below. 
As you can see 
StringBuilder reduces complexity but significantly slows down the case then 
non-latin Strings are joined.

Exclamation marks denote the cases where the last implementation clearly looses 
in performance
agains original StringJoiner.

Regards,
Sergey Tsypanov

P.S.

Also I think the further improvement is possible here: Tagir mentioned decoding 
of byte[] happening in
constructor of String(byte[]). The constructor is not aware about how the bytes 
are encoded and
has to decode them. However in our case when came to `new String(byte[])` we 
are sure that the bytes
are of Latin1. So we can skip decoding by calling StringLatin1.newString(). 
However, it copies incoming
byte[] which is again redundant in our particular case.


Benchmark                              (count)  (latin)  (length)          
Original           Patched1                 SB          Patched2        Units
stringJoiner                                 1     true         1      26.9 ±   
0.7       48.8 ±   2.2       33.4 ±   0.2      42.6 ±   1.0        ns/op   !
stringJoiner                                 1     true         5      30.5 ±   
1.0       46.1 ±   2.1       33.0 ±   0.1      42.3 ±   1.1        ns/op   !
stringJoiner                                 1     true        10      31.2 ±   
0.6       47.3 ±   1.3       34.3 ±   0.3      46.6 ±   1.4        ns/op   !
stringJoiner                                 1     true       100      62.5 ±   
3.3       79.9 ±   4.8       44.4 ±   0.1      63.5 ±   2.1        ns/op
stringJoiner                                 5     true         1      78.2 ±   
1.6      110.3 ±   2.9       87.8 ±   0.8      93.4 ±   1.7        ns/op   !
stringJoiner                                 5     true         5      94.2 ±   
8.7      116.6 ±   0.7       88.2 ±   0.9      91.4 ±   0.9        ns/op
stringJoiner                                 5     true        10      95.3 ±   
6.9      100.1 ±   0.4       91.6 ±   0.6      93.7 ±   0.4        ns/op
stringJoiner                                 5     true       100     188.0 ±  
10.2      136.0 ±   0.4      126.1 ±   0.7     135.3 ±   0.9        ns/op
stringJoiner                                10     true         1     160.3 ±   
4.5      172.9 ±   0.8      177.6 ±   0.8     172.1 ±   0.6        ns/op   !
stringJoiner                                10     true         5     169.0 ±   
4.7      180.2 ±   9.1      179.4 ±   1.0     171.7 ±   0.6        ns/op
stringJoiner                                10     true        10     205.7 ±  
16.4      182.7 ±   1.1      189.5 ±   1.2     178.3 ±   0.5        ns/op
stringJoiner                                10     true       100     366.5 ±  
17.0      284.5 ±   3.1      290.0 ±   0.8     282.1 ±   0.9        ns/op
stringJoiner                               100     true         1    1117.6 ±  
11.1     2123.7 ±  11.1     1563.8 ±   2.8    1379.5 ±   4.4        ns/op   !
stringJoiner                               100     true         5    1270.7 ±  
40.2     2163.6 ±  12.4     1592.4 ±   4.0    1426.1 ±   4.4        ns/op   !
stringJoiner                               100     true        10    1364.4 ±  
14.0     2283.8 ±  16.1     1773.7 ±  57.7    1525.0 ±   3.2        ns/op   !
stringJoiner                               100     true       100    3592.9 ± 
164.8     3535.2 ±  29.9     2899.2 ±  51.0    3043.9 ±  85.6        ns/op

stringJoiner                                 1    false         1      35.6 ±   
1.2       59.1 ±   3.0       52.7 ±   1.2      44.9 ±   0.9        ns/op   !
stringJoiner                                 1    false         5      39.3 ±   
1.2       52.6 ±   2.5       54.4 ±   1.6      37.4 ±   0.9        ns/op
stringJoiner                                 1    false        10      42.2 ±   
1.6       53.6 ±   0.3       52.2 ±   1.0      43.2 ±   1.0        ns/op
stringJoiner                                 1    false       100      70.5 ±   
1.8       86.4 ±   0.4       78.6 ±   1.2      75.8 ±   2.2        ns/op   !
stringJoiner                                 5    false         1      89.0 ±   
3.5      102.2 ±   1.0      116.3 ±   3.8      85.3 ±   0.1        ns/op
stringJoiner                                 5    false         5      87.6 ±   
0.7      106.5 ±   1.2      115.2 ±   2.9      90.7 ±   0.6        ns/op   !
stringJoiner                                 5    false        10     109.0 ±   
5.6      116.5 ±   1.2      126.5 ±   0.5      97.2 ±   0.3        ns/op
stringJoiner                                 5    false       100     324.0 ±  
16.5      221.9 ±   0.5      288.9 ±   0.5     227.3 ±   0.7        ns/op
stringJoiner                                10    false         1     183.9 ±   
5.9      204.7 ±   5.5      261.2 ±   7.7     168.4 ±   0.7        ns/op
stringJoiner                                10    false         5     198.7 ±   
9.7      202.4 ±   1.5      253.3 ±   6.7     178.1 ±   0.6        ns/op
stringJoiner                                10    false        10     196.7 ±   
6.9      226.7 ±   6.4      274.3 ±   7.0     191.8 ±   0.5        ns/op
stringJoiner                                10    false       100     535.8 ±   
2.3      553.0 ±   5.6      677.3 ±   6.4     538.8 ±   2.3        ns/op
stringJoiner                               100    false         1    1674.6 ± 
122.1     1940.8 ±  16.2     2212.5 ±  32.2    1418.9 ±   2.9        ns/op
stringJoiner                               100    false         5    1791.9 ±  
58.1     2158.1 ±  12.0     2492.8 ±  30.9    1583.9 ±   3.9        ns/op
stringJoiner                               100    false        10    2124.1 ± 
193.3     2364.0 ±  25.2     2611.7 ±  17.5    1861.8 ±   5.9        ns/op
stringJoiner                               100    false       100    4323.4 ±  
29.2     4675.5 ±  11.8     5501.3 ±  21.1    4172.3 ±  12.7        ns/op

stringJoiner:·gc.alloc.rate.norm             1     true         1     120.0 ±   
0.0      120.0 ±   0.0      144.0 ±   0.0     120.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             1     true         5     128.0 ±   
0.0      120.0 ±   0.0      144.0 ±   0.0     120.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             1     true        10     144.0 ±   
0.0      136.0 ±   0.0      160.0 ±   0.0     136.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             1     true       100     416.0 ±   
0.0      312.0 ±   0.0      336.0 ±   0.0     312.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             5     true         1     144.0 ±   
0.0      136.0 ±   0.0      160.0 ±   0.0     136.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             5     true         5     200.0 ±   
0.0      168.0 ±   0.0      192.0 ±   0.0     168.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             5     true        10     272.0 ±   
0.0      216.0 ±   0.0      240.0 ±   0.0     216.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             5     true       100    1632.0 ±   
0.0     1128.0 ±   0.0     1152.0 ±   0.0    1128.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm            10     true         1     256.0 ±   
0.0      232.0 ±   0.0      256.0 ±   0.0     232.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm            10     true         5     376.0 ±   
0.0      316.8 ±   4.9      336.0 ±   0.0     312.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm            10     true        10     520.0 ±   
0.0      408.0 ±   0.0      432.0 ±   0.0     408.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm            10     true       100    3224.1 ±   
0.0     2236.9 ±  21.2     2240.1 ±   0.0    2216.1 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm           100     true         1    1748.1 ±   
4.0     1592.2 ±   0.0     1568.2 ±   0.0    1544.2 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm           100     true         5    2948.2 ±   
4.0     2392.3 ±   0.0     2368.2 ±   0.0    2344.2 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm           100     true        10    4444.3 ±   
4.0     3384.3 ±   0.0     3364.3 ±   4.0    3336.3 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm           100     true       100   31441.4 ±   
0.0    21385.4 ±   0.0    21365.1 ±   4.1   21353.2 ±   6.6        B/op

stringJoiner:·gc.alloc.rate.norm             1    false         1     144.0 ±   
0.0      144.0 ±   0.0      192.0 ±   0.0     144.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             1    false         5     160.0 ±   
0.0      160.0 ±   0.0      208.0 ±   0.0     160.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             1    false        10     184.0 ±   
0.0      184.0 ±   0.0      240.0 ±   0.0     184.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             1    false       100     640.0 ±   
0.0      640.0 ±   0.0      784.0 ±   0.0     640.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             5    false         1     184.0 ±   
0.0      184.0 ±   0.0      240.0 ±   0.0     184.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             5    false         5     280.0 ±   
0.0      280.0 ±   0.0      349.6 ±   2.4     280.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             5    false        10     400.0 ±   
0.0      400.0 ±   0.0      496.0 ±   0.0     400.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm             5    false       100    2664.1 ±   
0.0     2664.1 ±   0.0     3216.1 ±   0.0    2664.1 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm            10    false         1     320.0 ±   
0.0      334.4 ±   7.4      384.0 ±   0.0     320.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm            10    false         5     520.0 ±   
0.0      520.0 ±   0.0      624.0 ±   0.0     520.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm            10    false        10     760.0 ±   
0.0      769.6 ±   6.5      912.0 ±   0.0     760.0 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm            10    false       100    5264.2 ±   
0.0     5273.8 ±   6.5     6320.3 ±   0.0    5264.2 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm           100    false         1    2204.2 ±   
4.0     2216.3 ±   0.0     2436.3 ±   6.8    2168.2 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm           100    false         5    4196.3 ±   
6.2     4216.4 ±   0.0     4832.4 ±   6.6    4168.3 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm           100    false        10    6696.5 ±   
5.4     6712.6 ±   0.0     7844.7 ±   4.0    6664.5 ±   0.0        B/op
stringJoiner:·gc.alloc.rate.norm           100    false       100   51702.0 ±   
4.1    51714.4 ±   0.0    61838.6 ±   6.2   51666.1 ±   0.0        B/op


10.02.2020, 14:08, "Tagir Valeev" <[email protected]>:
> Hello!
>
> In many tests, I see little or no performance improvements. E.g.:
> stringJoiner 100 10 1768.8 ±
> 160.6 1760.8 ± 111.4 ns/op
>
> How would you explain that this code change doesn't improve the
> performance for given count and length?
>
> Also, you are using `new String(bytes)` assuming that the platform
> charset is latin1-compatible. This is not always true, so your code
> would produce incorrect results depending on this setting. In general,
> decoding via charset is tons of extra work. Instead, I would
> experiment with pre-sized StringBuilder inside compactElts, instead of
> char[] array. While it does some extra checks, StringBuilder already
> can use the compact string representation, so if it's properly
> pre-sized, no extra memory will be allocated.
>
> Finally, if you optimize for the case when X is true you should always
> benchmark what happens when X is false. It's great that in some cases
> we see a speedup for latin1 strings. But what if not all of the
> strings are latin1? Is there any performance degradation? If yes, can
> we tolerate it?
>
> With best regards,
> Tagir Valeev.
>
> On Mon, Feb 10, 2020 at 3:22 PM Сергей Цыпанов
> <[email protected]> wrote:
>>  Hello,
>>
>>  I've reworked the code, patch is attached. Could you please review my 
>> solution regarding usage of SahredSecrets?
>>
>>  P.S. After I've created the patch it came to my mind that instead of 
>> checking all Strings when calling StringJoiner.add()
>>  we can check them in toString() method and fail-fast in case at least one 
>> of them is non-latin. This likely to reduce
>>  regression related to usage of reflection.
>>
>>  Regards,
>>  Sergey Tsypanov
>>
>>  05.02.2020, 23:21, "[email protected]" <[email protected]>:
>>  > ----- Mail original -----
>>  >> De: "Сергей Цыпанов" <[email protected]>
>>  >> À: "Remi Forax" <[email protected]>, "core-libs-dev" 
>> <[email protected]>
>>  >> Envoyé: Mercredi 5 Février 2020 22:12:34
>>  >> Objet: Re: [PATCH] Enhancement proposal for java.util.StringJoiner
>>  >
>>  >> Hello,
>>  >>
>>  >>> If you want to optimize StringJoiner, the best way to do it is to use 
>> the shared
>>  >>> secret mechanism so a java.util class can see implementation details of 
>> a
>>  >>> java.lang class without exposing those details publicly.
>>  >>> As an example, take a look to EnumSet and its implementations.
>>  >>
>>  >> I've looked into SharedSecrets, it seems there's no ready-to-use method 
>> for
>>  >> accessing package-private method. Do you mean it's necessary to add 
>> particular
>>  >> functionality to JavaLangReflectionAccess as they did for JavaLangAccess 
>> in
>>  >> order to deal with EnumSet?
>>  >
>>  > yes !
>>  > crossing package boundary in a non public way is not free,
>>  > but given that StringJoiner is used quite often (directly or indirectly 
>> using Collectors.joining()), it may worth the cost.
>>  >
>>  >> Regards,
>>  >> Sergey
>>  >
>>  > Regards,
>>  > Rémi
>>  >
>>  >> 04.02.2020, 12:12, "Remi Forax" <[email protected]>:
>>  >>> ----- Mail original -----
>>  >>>> De: "Сергей Цыпанов" <[email protected]>
>>  >>>> À: "jonathan gibbons" <[email protected]>, "core-libs-dev"
>>  >>>> <[email protected]>
>>  >>>> Envoyé: Mardi 4 Février 2020 08:53:31
>>  >>>> Objet: Re: [PATCH] Enhancement proposal for java.util.StringJoiner
>>  >>>
>>  >>>> Hello,
>>  >>>
>>  >>> Hi Sergey,
>>  >>>
>>  >>>> I'd probably agree about a new class in java.lang, but what is wrong 
>> about
>>  >>>> exposing package-private method
>>  >>>> which doesn't modify the state of the object and has no side effects?
>>  >>>
>>  >>> You can not change the implementation anymore,
>>  >>> by example if instead of having a split between latin1 and non latin1, 
>> we decide
>>  >>> in the future to split between utf8 and non utf8.
>>  >>>
>>  >>> If you want to optimize StringJoiner, the best way to do it is to use 
>> the shared
>>  >>> secret mechanism so a java.util class can see implementation details of 
>> a
>>  >>> java.lang class without exposing those details publicly.
>>  >>> As an example, take a look to EnumSet and its implementations.
>>  >>>
>>  >>> regards,
>>  >>> Rémi
>>  >>>
>>  >>>> 04.02.2020, 00:58, "Jonathan Gibbons" <[email protected]>:
>>  >>>>> Sergey,
>>  >>>>>
>>  >>>>> It is equally bad to create a new class in the java.lang package as it
>>  >>>>> is to add a new public method to java.lang.String.
>>  >>>>>
>>  >>>>> -- Jon
>>  >>>>>
>>  >>>>> On 2/3/20 2:38 PM, Сергей Цыпанов wrote:
>>  >>>>>> Hello,
>>  >>>>>>
>>  >>>>>> as of JDK14 java.util.StringJoiner still uses char[] as a storage of 
>> glued
>>  >>>>>> Strings.
>>  >>>>>>
>>  >>>>>> This applies for the cases when all joined Strings as well as 
>> delimiter, prefix
>>  >>>>>> and suffix contain only ASCII symbols.
>>  >>>>>>
>>  >>>>>> As a result when StringJoiner.toString() is invoked, byte[] stored 
>> in String is
>>  >>>>>> inflated in order to fill in char[] and
>>  >>>>>> finally char[] is compressed when constructor of String is called:
>>  >>>>>>
>>  >>>>>> String delimiter = this.delimiter;
>>  >>>>>> char[] chars = new char[this.len + addLen];
>>  >>>>>> int k = getChars(this.prefix, chars, 0);
>>  >>>>>> if (size > 0) {
>>  >>>>>> k += getChars(elts[0], chars, k); // inflate byte[] -> char[]
>>  >>>>>>
>>  >>>>>> for(int i = 1; i < size; ++i) {
>>  >>>>>> k += getChars(delimiter, chars, k);
>>  >>>>>> k += getChars(elts[i], chars, k);
>>  >>>>>> }
>>  >>>>>> }
>>  >>>>>>
>>  >>>>>> k += getChars(this.suffix, chars, k);
>>  >>>>>> return new String(chars); // compress char[] -> byte[]
>>  >>>>>>
>>  >>>>>> This can be improved by detecting cases when String.isLatin1() 
>> returns true for
>>  >>>>>> all involved Strings.
>>  >>>>>>
>>  >>>>>> I've prepared a patch along with benchmark proving that this change 
>> is correct
>>  >>>>>> and brings improvement.
>>  >>>>>> The only concern I have is about String.isLatin1(): as far as String 
>> belongs to
>>  >>>>>> java.lang and StringJoiner to java.util
>>  >>>>>> package-private String.isLatin1() cannot be directly accessed, we 
>> need to make
>>  >>>>>> it public for successful compilation.
>>  >>>>>>
>>  >>>>>> Another solution is to create an intermediate utility class located 
>> in java.lang
>>  >>>>>> which delegates the call to String.isLatin1():
>>  >>>>>>
>>  >>>>>> package java.lang;
>>  >>>>>>
>>  >>>>>> public class StringHelper {
>>  >>>>>> public static boolean isLatin1(String str) {
>>  >>>>>> return str.isLatin1();
>>  >>>>>> }
>>  >>>>>> }
>>  >>>>>>
>>  >>>>>> This allows to keep java.lang.String intact and have access to it's
>>  >>>>>> package-private method outside of java.lang package.
>>  >>>>>>
>>  >>>>>> Below I've added results of benchmarking for specified case (all 
>> Strings are
>>  >>>>>> Latin1). The other case (at least one String is UTF-8) uses existing 
>> code so
>>  >>>>>> there will be only a tiny regression due to several if-checks.
>>  >>>>>>
>>  >>>>>> With best regards,
>>  >>>>>> Sergey Tsypanov
>>  >>>>>>
>>  >>>>>> (count) (length) Original Patched Units
>>  >>>>>> stringJoiner 1 1 26.7 ± 1.3 38.2 ± 1.1 ns/op
>>  >>>>>> stringJoiner 1 5 27.4 ± 0.0 40.5 ± 2.2 ns/op
>>  >>>>>> stringJoiner 1 10 29.6 ± 1.9 38.4 ± 1.9 ns/op
>>  >>>>>> stringJoiner 1 100 61.1 ± 6.9 47.6 ± 0.6 ns/op
>>  >>>>>> stringJoiner 5 1 91.1 ± 6.7 83.6 ± 2.0 ns/op
>>  >>>>>> stringJoiner 5 5 96.1 ± 10.7 85.6 ± 1.1 ns/op
>>  >>>>>> stringJoiner 5 10 105.5 ± 14.3 84.7 ± 1.1 ns/op
>>  >>>>>> stringJoiner 5 100 266.6 ± 30.1 139.6 ± 14.0 ns/op
>>  >>>>>> stringJoiner 10 1 190.7 ± 23.0 162.0 ± 2.9 ns/op
>>  >>>>>> stringJoiner 10 5 200.0 ± 16.9 167.5 ± 11.0 ns/op
>>  >>>>>> stringJoiner 10 10 216.4 ± 12.4 164.8 ± 1.7 ns/op
>>  >>>>>> stringJoiner 10 100 545.3 ± 49.7 282.2 ± 12.0 ns/op
>>  >>>>>> stringJoiner 100 1 1467.0 ± 90.3 1302.0 ± 18.5 ns/op
>>  >>>>>> stringJoiner 100 5 1491.8 ± 166.2 1493.0 ± 135.4 ns/op
>>  >>>>>> stringJoiner 100 10 1768.8 ± 160.6 1760.8 ± 111.4 ns/op
>>  >>>>>> stringJoiner 100 100 3654.3 ± 113.1 3120.9 ± 175.9 ns/op
>>  >>>>>>
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 1 1 120.0 ± 0.0 120.0 ± 0.0 B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 1 5 128.0 ± 0.0 120.0 ± 0.0 B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 1 10 144.0 ± 0.0 136.0 ± 0.0 B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 1 100 416.0 ± 0.0 312.0 ± 0.0 B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 5 1 144.0 ± 0.0 136.0 ± 0.0 B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 5 5 200.0 ± 0.0 168.0 ± 0.0 B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 5 10 272.0 ± 0.0 216.0 ± 0.0 B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 5 100 1632.0 ± 0.0 1128.0 ± 0.0 B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 10 1 256.0 ± 0.0 232.0 ± 0.0 B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 10 5 376.0 ± 0.0 312.0 ± 0.0 B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 10 10 520.0 ± 0.0 408.0 ± 0.0 B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 10 100 3224.1 ± 0.0 2216.1 ± 0.0 
>> B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 100 1 1760.2 ± 14.9 1544.2 ± 0.0 
>> B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 100 5 2960.3 ± 14.9 2344.2 ± 0.0 
>> B/op
>>  >>>>>> stringJoiner:·gc.alloc.rate.norm 100 10 4440.4 ± 0.0 3336.3 ± 0.0 
>> B/op
>>  >> >> >> stringJoiner:·gc.alloc.rate.norm 100 100 31449.3 ± 12.2 21346.7 ± 
>> 14.7 B/op
diff --git a/src/java.base/share/classes/java/util/StringJoiner.java b/src/java.base/share/classes/java/util/StringJoiner.java
--- a/src/java.base/share/classes/java/util/StringJoiner.java
+++ b/src/java.base/share/classes/java/util/StringJoiner.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2013, 2018, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2013, 2020, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -24,6 +24,8 @@
  */
 package java.util;
 
+import jdk.internal.access.SharedSecrets;
+
 /**
  * {@code StringJoiner} is used to construct a sequence of characters separated
  * by a delimiter and optionally starting with a supplied prefix
@@ -60,9 +62,12 @@
  *
  * @see java.util.stream.Collectors#joining(CharSequence)
  * @see java.util.stream.Collectors#joining(CharSequence, CharSequence, CharSequence)
- * @since  1.8
-*/
+ * @since 1.8
+ */
 public final class StringJoiner {
+
+    private static final boolean isLatin1Available = SharedSecrets.getJavaLangStringAccess().isLatin1("");
+
     private final String prefix;
     private final String delimiter;
     private final String suffix;
@@ -77,7 +82,7 @@
     private int len;
 
     /**
-     * When overridden by the user to be non-null via {@link setEmptyValue}, the
+     * When overridden by the user to be non-null via {@link #setEmptyValue}, the
      * string returned by toString() when no elements have yet been added.
      * When null, prefix + suffix is used as the empty value.
      */
@@ -153,6 +158,13 @@
         return len;
     }
 
+    @SuppressWarnings("deprecation")
+    private static int getBytes(String s, byte[] bytes, int start) {
+        int len = s.length();
+        s.getBytes(0, len, bytes, start);
+        return len;
+    }
+
     /**
      * Returns the current value, consisting of the {@code prefix}, the values
      * added so far separated by the {@code delimiter}, and the {@code suffix},
@@ -173,20 +185,42 @@
             compactElts();
             return size == 0 ? "" : elts[0];
         }
-        final String delimiter = this.delimiter;
+        if (isLatin1Available && allLatin1(elts, size)) {
+            return bytesToString(elts, size, addLen);
+        }
+        return charsToString(elts, size, addLen);
+    }
+
+    private String charsToString(String[] elts, int size, int addLen) {
         final char[] chars = new char[len + addLen];
         int k = getChars(prefix, chars, 0);
         if (size > 0) {
+            final String delimiter = this.delimiter;
             k += getChars(elts[0], chars, k);
             for (int i = 1; i < size; i++) {
                 k += getChars(delimiter, chars, k);
                 k += getChars(elts[i], chars, k);
             }
         }
-        k += getChars(suffix, chars, k);
+        getChars(suffix, chars, k);
         return new String(chars);
     }
 
+    private String bytesToString(String[] elts, int size, int addLen) {
+        final byte[] bytes = new byte[len + addLen];
+        int k = getBytes(prefix, bytes, 0);
+        if (size > 0) {
+            final String delimiter = this.delimiter;
+            k += getBytes(elts[0], bytes, k);
+            for (int i = 1; i < size; i++) {
+                k += getBytes(delimiter, bytes, k);
+                k += getBytes(elts[i], bytes, k);
+            }
+        }
+        getBytes(suffix, bytes, k);
+        return new String(bytes);
+    }
+
     /**
      * Adds a copy of the given {@code CharSequence} value as the next
      * element of the {@code StringJoiner} value. If {@code newElement} is
@@ -239,18 +273,39 @@
 
     private void compactElts() {
         if (size > 1) {
-            final char[] chars = new char[len];
-            int i = 1, k = getChars(elts[0], chars, 0);
-            do {
-                k += getChars(delimiter, chars, k);
-                k += getChars(elts[i], chars, k);
-                elts[i] = null;
-            } while (++i < size);
-            size = 1;
-            elts[0] = new String(chars);
+            if (isLatin1Available && allLatin1(elts, size)) {
+                compactBytes();
+            } else {
+                compactChars();
+            }
         }
     }
 
+    private void compactChars() {
+        final char[] chars = new char[len];
+        int i = 1, k = getChars(elts[0], chars, 0);
+        do {
+            k += getChars(delimiter, chars, k);
+            k += getChars(elts[i], chars, k);
+            elts[i] = null;
+        } while (++i < size);
+        size = 1;
+        elts[0] = new String(chars);
+    }
+
+    private void compactBytes() {
+        final byte[] bytes = new byte[len];
+        int i = 1;
+        int k = getBytes(elts[0], bytes, 0);
+        do {
+            k += getBytes(delimiter, bytes, k);
+            k += getBytes(elts[i], bytes, k);
+            elts[i] = null;
+        } while (++i < size);
+        size = 1;
+        elts[0] = new String(bytes);
+    }
+
     /**
      * Returns the length of the {@code String} representation
      * of this {@code StringJoiner}. Note that if
@@ -265,4 +320,15 @@
         return (size == 0 && emptyValue != null) ? emptyValue.length() :
             len + prefix.length() + suffix.length();
     }
+
+    private static boolean allLatin1(String[] strings, int size) {
+        for (int i = 0; i < size; i++) {
+            String str = strings[i];
+            if (!SharedSecrets.getJavaLangStringAccess().isLatin1(str)) {
+                return false;
+            }
+        }
+        return true;
+    }
+
 }
diff --git a/src/java.base/share/classes/jdk/internal/access/JavaLangStringAccess.java b/src/java.base/share/classes/jdk/internal/access/JavaLangStringAccess.java
new file mode 100644
--- /dev/null
+++ b/src/java.base/share/classes/jdk/internal/access/JavaLangStringAccess.java
@@ -0,0 +1,41 @@
+/*
+ * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 only, as
+ * published by the Free Software Foundation.  Oracle designates this
+ * particular file as subject to the "Classpath" exception as provided
+ * by Oracle in the LICENSE file that accompanied this code.
+ *
+ * This code is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * version 2 for more details (a copy is included in the LICENSE file that
+ * accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License version
+ * 2 along with this work; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
+ * or visit www.oracle.com if you need additional information or have any
+ * questions.
+ */
+
+package jdk.internal.access;
+
+/**
+ * An interface which gives access to internals of {@link String}.
+ */
+public interface JavaLangStringAccess {
+
+    /**
+     * Returns true in case all characters of argument {@link String} are ASCII symbols
+     *
+     * @param str String to be checked
+     * @return whether all characters of argument {@link String} are ASCII symbols
+     */
+    boolean isLatin1(String str);
+
+}
diff --git a/src/java.base/share/classes/jdk/internal/access/SharedSecrets.java b/src/java.base/share/classes/jdk/internal/access/SharedSecrets.java
--- a/src/java.base/share/classes/jdk/internal/access/SharedSecrets.java
+++ b/src/java.base/share/classes/jdk/internal/access/SharedSecrets.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2002, 2019, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2002, 2020, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -58,6 +58,7 @@
     private static JavaLangModuleAccess javaLangModuleAccess;
     private static JavaLangRefAccess javaLangRefAccess;
     private static JavaLangReflectAccess javaLangReflectAccess;
+    private static JavaLangStringAccess javaLangStringAccess;
     private static JavaIOAccess javaIOAccess;
     private static JavaIOFileDescriptorAccess javaIOFileDescriptorAccess;
     private static JavaIOFilePermissionAccess javaIOFilePermissionAccess;
@@ -139,6 +140,17 @@
         return javaLangReflectAccess;
     }
 
+    public static JavaLangStringAccess getJavaLangStringAccess() {
+        if (javaLangStringAccess == null) {
+            unsafe.ensureClassInitialized(StringAccess.class);
+        }
+        return javaLangStringAccess;
+    }
+
+    public static void setJavaLangStringAccess(JavaLangStringAccess stringAccess) {
+        javaLangStringAccess = stringAccess;
+    }
+
     public static void setJavaNetUriAccess(JavaNetUriAccess jnua) {
         javaNetUriAccess = jnua;
     }
diff --git a/src/java.base/share/classes/jdk/internal/access/StringAccess.java b/src/java.base/share/classes/jdk/internal/access/StringAccess.java
new file mode 100644
--- /dev/null
+++ b/src/java.base/share/classes/jdk/internal/access/StringAccess.java
@@ -0,0 +1,68 @@
+/*
+ * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 only, as
+ * published by the Free Software Foundation.  Oracle designates this
+ * particular file as subject to the "Classpath" exception as provided
+ * by Oracle in the LICENSE file that accompanied this code.
+ *
+ * This code is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * version 2 for more details (a copy is included in the LICENSE file that
+ * accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License version
+ * 2 along with this work; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
+ * or visit www.oracle.com if you need additional information or have any
+ * questions.
+ */
+
+package jdk.internal.access;
+
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.util.Objects;
+
+/**
+ * Package-private class implementing the
+ * jdk.internal.access.JavaLangStringAccess interface,
+ * allowing non-members of java.lang package
+ * to access internals of {@link java.lang.String}.
+ * */
+class StringAccess implements jdk.internal.access.JavaLangStringAccess {
+
+    static {
+        SharedSecrets.setJavaLangStringAccess(new StringAccess());
+    }
+
+    private final Method isLatin1 = initIsLatin1Method();
+
+    private StringAccess() {
+    }
+
+    @Override
+    public boolean isLatin1(String str) {
+        Objects.requireNonNull(str);
+        try {
+            return (boolean) isLatin1.invoke(str);
+        } catch (IllegalAccessException | InvocationTargetException e) {
+            throw new RuntimeException(e);
+        }
+    }
+
+    private static Method initIsLatin1Method() {
+        try {
+            final Method isLatin1 = String.class.getDeclaredMethod("isLatin1");
+            isLatin1.setAccessible(true);
+            return isLatin1;
+        } catch (NoSuchMethodException e) {
+            throw new Error(e);
+        }
+    }
+}

Reply via email to