Integrated: 8364320: String encodeUTF8 latin1 with negatives

2025-08-19 Thread Brett Okken
jdk.org/jdk/commit/3bbaa772b0bb94694940156ec0ce421f87f02be7 Stats: 9 lines in 1 file changed: 6 ins; 1 del; 2 mod 8364320: String encodeUTF8 latin1 with negatives Reviewed-by: liach, rriggs - PR: https://git.openjdk.org/jdk/pull/26597

Re: RFR: 8364320: String encodeUTF8 latin1 with negatives [v2]

2025-08-18 Thread duke
On Fri, 1 Aug 2025 13:15:29 GMT, Brett Okken wrote: >> As suggested on mailing list, when encoding latin1 bytes to utf-8, we can >> count the leading positive bytes and in the case where there is a negative, >> we can copy all the positive values to the target byte[] prior to processing >> the

Re: RFR: 8364320: String encodeUTF8 latin1 with negatives [v2]

2025-08-18 Thread Roger Riggs
On Fri, 1 Aug 2025 13:15:29 GMT, Brett Okken wrote: >> As suggested on mailing list, when encoding latin1 bytes to utf-8, we can >> count the leading positive bytes and in the case where there is a negative, >> we can copy all the positive values to the target byte[] prior to processing >> the

Re: RFR: 8364320: String encodeUTF8 latin1 with negatives

2025-08-11 Thread Brett Okken
On Fri, 1 Aug 2025 16:12:46 GMT, Chen Liang wrote: >> Benchmark on win64 >> >> Baseline: >> >> >> Benchmark (charsetName) Mode Cnt Score >> Error Units >> StringEncode.encodeAllMixed UTF-8 avgt 10 20067.519 ± >> 528.152 ns/op >> Str

Re: RFR: 8364320: String encodeUTF8 latin1 with negatives [v2]

2025-08-10 Thread Chen Liang
On Fri, 1 Aug 2025 13:15:29 GMT, Brett Okken wrote: >> As suggested on mailing list, when encoding latin1 bytes to utf-8, we can >> count the leading positive bytes and in the case where there is a negative, >> we can copy all the positive values to the target byte[] prior to processing >> the

Re: RFR: 8364320: String encodeUTF8 latin1 with negatives

2025-08-01 Thread Chen Liang
On Fri, 1 Aug 2025 12:39:05 GMT, Brett Okken wrote: >> As suggested on mailing list, when encoding latin1 bytes to utf-8, we can >> count the leading positive bytes and in the case where there is a negative, >> we can copy all the positive values to the target byte[] prior to processing >> the

Re: RFR: 8364320: String encodeUTF8 latin1 with negatives [v2]

2025-08-01 Thread Roger Riggs
On Fri, 1 Aug 2025 13:15:29 GMT, Brett Okken wrote: >> As suggested on mailing list, when encoding latin1 bytes to utf-8, we can >> count the leading positive bytes and in the case where there is a negative, >> we can copy all the positive values to the target byte[] prior to processing >> the

Re: RFR: 8364320: String encodeUTF8 latin1 with negatives [v2]

2025-08-01 Thread Brett Okken
> As suggested on mailing list, when encoding latin1 bytes to utf-8, we can > count the leading positive bytes and in the case where there is a negative, > we can copy all the positive values to the target byte[] prior to processing > the remaining data 1 byte at a time. > > https://mail.openjd

Re: RFR: 8364320: String encodeUTF8 latin1 with negatives

2025-08-01 Thread Chen Liang
On Fri, 1 Aug 2025 12:34:15 GMT, Brett Okken wrote: > As suggested on mailing list, when encoding latin1 bytes to utf-8, we can > count the leading positive bytes and in the case where there is a negative, > we can copy all the positive values to the target byte[] prior to processing > the rem

Re: RFR: 8364320: String encodeUTF8 latin1 with negatives

2025-08-01 Thread Brett Okken
On Fri, 1 Aug 2025 12:34:15 GMT, Brett Okken wrote: > As suggested on mailing list, when encoding latin1 bytes to utf-8, we can > count the leading positive bytes and in the case where there is a negative, > we can copy all the positive values to the target byte[] prior to processing > the rem

RFR: 8364320: String encodeUTF8 latin1 with negatives

2025-08-01 Thread Brett Okken
As suggested on mailing list, when encoding latin1 bytes to utf-8, we can count the leading positive bytes and in the case where there is a negative, we can copy all the positive values to the target byte[] prior to processing the remaining data 1 byte at a time. https://mail.openjdk.org/piperm
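
The approach described in this RFR can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name Latin1ToUtf8Sketch and the helper countLeadingPositives are hypothetical stand-ins, and the worst-case sizing of the output buffer is an assumption made for simplicity.

```java
import java.util.Arrays;

// Hypothetical sketch; names here are illustrative, not JDK internals.
class Latin1ToUtf8Sketch {

    // Assumed helper: number of leading bytes that are >= 0 (plain ASCII).
    static int countLeadingPositives(byte[] val) {
        int i = 0;
        while (i < val.length && val[i] >= 0) {
            i++;
        }
        return i;
    }

    static byte[] encode(byte[] val) {
        int positives = countLeadingPositives(val);
        if (positives == val.length) {
            // No negative bytes: latin1 and UTF-8 are identical here.
            return val.clone();
        }
        // Worst case: every remaining byte expands to two UTF-8 bytes.
        byte[] dst = new byte[positives + (val.length - positives) * 2];
        // Bulk-copy the ASCII prefix instead of re-scanning it byte by byte.
        System.arraycopy(val, 0, dst, 0, positives);
        int dp = positives;
        for (int sp = positives; sp < val.length; sp++) {
            byte b = val[sp];
            if (b < 0) {
                // U+0080..U+00FF encodes as two bytes: 110000xx 10xxxxxx.
                dst[dp++] = (byte) (0xC0 | ((b & 0xFF) >> 6));
                dst[dp++] = (byte) (0x80 | (b & 0x3F));
            } else {
                dst[dp++] = b;
            }
        }
        return dp == dst.length ? dst : Arrays.copyOf(dst, dp);
    }
}
```

The point of the sketch is that the leading ASCII run is handled once with a bulk copy, and only the tail after the first negative byte goes through the per-byte loop.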

Re: String encodeUTF8 latin1 with negatives

2025-07-28 Thread Brett Okken
re-libs-dev on behalf of Brett > Okken > *Sent:* Monday, July 28, 2025 4:59 PM > *To:* Roger Riggs > *Cc:* core-libs-dev@openjdk.org > *Subject:* Re: String encodeUTF8 latin1 with negatives > > Roger, > > For a String, the byte[] val is immutable, right? > And even th

Re: String encodeUTF8 latin1 with negatives

2025-07-28 Thread Chen Liang
: Roger Riggs Cc: core-libs-dev@openjdk.org Subject: Re: String encodeUTF8 latin1 with negatives Roger, For a String, the byte[] val is immutable, right? And even the current behavior of checking for negatives and then cloning would not be safe in the face of a concurrent modification, right? Is

Re: String encodeUTF8 latin1 with negatives

2025-07-28 Thread Brett Okken
Roger, For a String, the byte[] val is immutable, right? And even the current behavior of checking for negatives and then cloning would not be safe in the face of a concurrent modification, right? Is there something else going on here which I am missing? Thanks, Brett On Mon, Jul 28, 2025 at 3:1

Re: String encodeUTF8 latin1 with negatives

2025-07-28 Thread Roger Riggs
Hi Brett, Extra care is needed if the input array might be modified concurrently with the method execution. When control flow decisions are made based on array contents, the integrity of the result depends on reading each byte of the array exactly once. Regards, Roger On 7/27/25 4:45 PM,
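
One way to read the "exactly once" advice is sketched below, under the assumption that the caller supplies a destination of at least twice the source length. The class and method names (ReadOnceSketch, encodeInto) are hypothetical. Each source byte is loaded into a local variable a single time, and both the branch and the output use that local, so the decision and the written bytes cannot disagree even if the source array changes during the loop.

```java
// Hypothetical sketch; not the JDK implementation.
class ReadOnceSketch {

    // Encodes latin1 bytes from src into dst as UTF-8 and returns the
    // number of bytes written. Assumes dst.length >= 2 * src.length.
    static int encodeInto(byte[] src, byte[] dst) {
        int dp = 0;
        for (int sp = 0; sp < src.length; sp++) {
            byte b = src[sp]; // the only read of src[sp]
            if (b < 0) {
                dst[dp++] = (byte) (0xC0 | ((b & 0xFF) >> 6));
                dst[dp++] = (byte) (0x80 | (b & 0x3F));
            } else {
                dst[dp++] = b;
            }
        }
        return dp;
    }
}
```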

String encodeUTF8 latin1 with negatives

2025-07-27 Thread Brett Okken
In String.encodeUTF8, when the coder is latin1, there is a call to StringCoding.hasNegatives to determine if any special handling is needed. If not, a clone of the val is returned. If there are negative values, it then loops, from the beginning, through all the values to handle any individual negat
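
For orientation, the behavior described in this message can be approximated as below. This is a simplified sketch, not the JDK source: BaselineEncodeSketch and its hasNegatives method are hypothetical stand-ins for the internal code, and the buffer sizing is an assumption.

```java
import java.util.Arrays;

// Hypothetical sketch of the two-pass shape described above.
class BaselineEncodeSketch {

    // Stand-in for the internal check: true if any byte is negative.
    static boolean hasNegatives(byte[] val) {
        for (byte b : val) {
            if (b < 0) {
                return true;
            }
        }
        return false;
    }

    static byte[] encode(byte[] val) {
        if (!hasNegatives(val)) {
            // Pure ASCII: the latin1 bytes are already valid UTF-8.
            return val.clone();
        }
        // Otherwise start over from index 0 and handle each byte in turn.
        byte[] dst = new byte[val.length * 2];
        int dp = 0;
        for (byte b : val) {
            if (b < 0) {
                dst[dp++] = (byte) (0xC0 | ((b & 0xFF) >> 6));
                dst[dp++] = (byte) (0x80 | (b & 0x3F));
            } else {
                dst[dp++] = b;
            }
        }
        return Arrays.copyOf(dst, dp);
    }
}
```

In this shape, any ASCII prefix before the first negative byte is examined twice: once by the check and once by the loop.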