Sounds good to me

On Fri, Apr 29, 2022 at 11:35, Mark Struberg <[email protected]> wrote:
> Does that mean we are good to go and ship another release?
>
> Would love to do that and then update TomEE and Meecrowave with it (tests
> do look good in both).
> Wdyt?
> Would start the release train in 2h, love getting it out before the
> weekend.
>
> LieGrue,
> strub
>
> > On 28.04.2022 at 08:27, David Blevins <[email protected]> wrote:
> >
> >> On Apr 26, 2022, at 10:55 AM, David Blevins <[email protected]> wrote:
> >>
> >> I'd need to check on the character encoding issue you mention. In my
> >> mind the original code and the current code are both trying to create a
> >> string of max snippet length. If they don't do that, it's a bug.
> >
> > So I dug into this and it looks like counting bytes is very flawed and
> > counting chars is as perfect as it gets in Java.
> >
> > It looks like even with UTF-8 you can have a single character be
> > anywhere from 1 to 4 bytes. The character `ñ` has a string length of 1 but
> > a byte length of 2. If you grabbed the first 3 bytes of "mañana" you'd get
> > "ma�..."
> >
> > If you create a UTF-8 string from a four-byte UTF-8 character you get of
> > course 4 bytes in the OutputStream, but you also get a string instance that
> > claims to be of length 2, not 1. If you call substring(0,1) on that you get
> > an unprintable result.
> >
> > So we fixed a bug in the switch from OutputStream to Writer. Any issues
> > that remain come from counting chars passed to the Writer and are shared by
> > java.lang.String, so users should not be surprised if they sometimes see a
> > funny character at the end of the snippet.
> >
> > -David
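
For reference, the byte-versus-char behaviour described in the quoted message can be reproduced with a small Java sketch. The class and method names (`SnippetLengthDemo`, `truncateByBytes`, `truncateByChars`) are made up for illustration and are not the project's actual snippet code. The strings use Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not taken from the project under discussion.
class SnippetLengthDemo {

    // Truncate by UTF-8 bytes: can cut a multi-byte sequence in half,
    // in which case the decoder substitutes U+FFFD for the broken tail.
    static String truncateByBytes(String s, int maxBytes) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        byte[] head = Arrays.copyOf(bytes, Math.min(maxBytes, bytes.length));
        return new String(head, StandardCharsets.UTF_8);
    }

    // Truncate by chars (UTF-16 code units), as java.lang.String counts them.
    // Safe for characters like U+00F1, but can still split a surrogate pair.
    static String truncateByChars(String s, int maxChars) {
        return s.substring(0, Math.min(maxChars, s.length()));
    }

    public static void main(String[] args) {
        String word = "ma\u00F1ana"; // "manana" with n-tilde
        System.out.println(word.length());                                // 6
        System.out.println(word.getBytes(StandardCharsets.UTF_8).length); // 7
        System.out.println(truncateByBytes(word, 3).equals("ma\uFFFD"));  // true
        System.out.println(truncateByChars(word, 3).equals("ma\u00F1"));  // true

        String emoji = "\uD83D\uDE00"; // U+1F600, four bytes in UTF-8
        System.out.println(emoji.getBytes(StandardCharsets.UTF_8).length); // 4
        System.out.println(emoji.length());                                // 2
        System.out.println(emoji.codePointCount(0, emoji.length()));       // 1
        // substring(0, 1) leaves a lone high surrogate, which is not printable.
        System.out.println(
            Character.isHighSurrogate(truncateByChars(emoji, 1).charAt(0))); // true
    }
}
```

The last case is the "funny character at the end of the snippet" mentioned above: counting chars handles `ñ` correctly, but a character outside the Basic Multilingual Plane still occupies two chars and can be split at the cut point.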
