Fixing FastInputStream and JavaBin parsing may actually be as simple as removing the unsigned byte masks from the InputStream itself and making sure all of the masking happens at the site of the bit shifts/logic. (Some of the fake InputStream classes in Jetty also mask bytes to unsigned values, the HTTP client's InputStream response handler for example; I'm not sure whether that would need to go too.) Then the always-readFully behavior could likely be removed, and byte reads could become a readAndCheckByte that checks for -1.
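
A minimal sketch of that idea, assuming a plain java.io.InputStream underneath. The class name CheckedReads and the readInt helper are made up for illustration and are not Solr's FastInputStream API; readAndCheckByte is only the name floated above. The -1 check happens on the int returned by read(), before the cast, and the 0xff masks are applied only where the bytes are shifted together:

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, for illustration only.
final class CheckedReads {

    // Read one byte, treating -1 from read() as end of stream.
    // The check is done on the int, so a data byte of 0xFF (which
    // becomes -1 after the cast) is never mistaken for EOF.
    static byte readAndCheckByte(InputStream in) throws IOException {
        int b = in.read();
        if (b == -1) {
            throw new EOFException("unexpected end of stream");
        }
        return (byte) b;
    }

    // Assemble a big-endian int. The bytes stay signed until the
    // point of the shift, where each one is masked to avoid sign
    // extension leaking into the higher bits.
    static int readInt(InputStream in) throws IOException {
        byte b1 = readAndCheckByte(in);
        byte b2 = readAndCheckByte(in);
        byte b3 = readAndCheckByte(in);
        byte b4 = readAndCheckByte(in);
        return ((b1 & 0xff) << 24)
             | ((b2 & 0xff) << 16)
             | ((b3 & 0xff) << 8)
             |  (b4 & 0xff);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {0, 0, 1, 0});
        System.out.println(readInt(in)); // prints 256
    }
}
```

With something like this, a short read surfaces as an EOFException at the byte where the data ran out, rather than relying on a readFully-style loop to have filled the buffer first.
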
MRM

On Sat, Jun 12, 2021 at 2:01 PM Mark Miller <markrmil...@gmail.com> wrote:

> There are lots of actionable items here from my view.
>
> So many that I don’t see any general importance to any one of them on a
> large scale.
>
> I run into the same problem if someone were to say, let’s make sure all
> the tests are solid for the upcoming release. Well, what does that mean?
> With little effort I can eke out tons of fails that are not usually
> prominent. I know tons of issues great and small that the tests almost
> never reveal that are worse than many of the fails. I know plenty of
> behavior and fails nix each other out. That poor performance and various
> behavior keeps some tests happy but doesn’t necessarily translate to the real
> world.
>
> I can’t approach it as let’s make sure the tests are solid. But I can
> approach from directions like, let’s look at possible regressions in tests
> or behavior. Pain points in tests for a release, etc.
>
> For cross node communication, it’s the same. It’s not really an issue of
> “what are the problems”, but what are the problems that are currently in
> the way of what has to be supported. How much are they practically
> affecting the current expectations.
>
> My knowledge and approaches to many of these problems can be a bit
> difficult to translate. I’ve kind of ripped down javabin and streams and IO
> and built them back up. I use a mix of inputstreams that both stream and
> read/write data types and byte buffers and byte arrays via offheap buffers
> directly. JavaBin and transaction logs may use var int / zig zag encodings
> or direct byte buffer put/gets or intermix them. IO might happen
> sequentially like a stream or perhaps in parallel. The http client and
> jetty server do async and async IO and the off heap buffers for all of this
> or things like the bytes and reads/writes that back char sequences are all
> from my ByteBuffer pool.
> All to say, there are no direct translations for a
> lot of actionable items, nor is there a master list of what the practical
> effects and ramifications are in fairly different worlds.
>
> So, IMO, there are enough actionable items as to make most of them of
> relatively low concern or priority to me. But if the list starts getting hit
> with many users having problems with connection resets and server stalls
> and javabin EOFException oddities or corrupt javabin outputs or something
> I’m supporting does, I’m ready to take action.
>
> I don’t know how important the strange FastInputStream and JavaBin
> behavior is, or what to do about it. I have looked at other data input
> stream impls and they don’t behave this way. FastInputStreams also affect
> replication and transaction logs and various things.
>
> At the end of the day, you just have to make sure bit transformations and
> byte to int auto casting don’t screw you. I have not found this to
> require the off-contract readFully behavior or lack of -1 EOF checking that
> you don’t find elsewhere.
>
> I have ripped all of these pieces apart and gutted a lot of it. I didn’t
> find FastInputStream or FastInputWriter and the like fast. I often found
> 3-4 or 5 layers of buffering depending on what was layered on what path.
> Each buffer allocating a byte array to be gc’d. 98% of the time writing and
> reading in memory or to Jetty ByteBuffer buffers, making the layered
> byte arrays being pumped out in between entirely not fast. And most of
> this, stream limited and oriented while sitting on top of super fast and
> efficient NIO at all the end points. My reading is that they are called
> fast because mostly they remove synchronized methods - which have almost no
> cost uncontended anyway. So I take issue with the naming.
>
> Anyway, I literally ripped a bunch of that chain out and reworked how
> most of the parts do IO - generally with no byte array gc pumping at all
> (the javabin utf conversions are also major offenders) and without any
> unnecessary layers or pretence of old style IO streams that don’t exist. If
> I need to stream something, I stream it, if I don’t, bye bye stream api,
> hello to the byte buffers that are actually there and all the bulk and
> direct off heap and concurrent access that allows for.
>
> And then http2 behavior and the client is the same thing. Had to rip and
> tear and rebuild and unbuild and expand and take over parts.
>
> So what’s actionable? Depending on what you want or need, 10000 things.
> But what’s necessary? I’m open to finding out based on the problems being
> hit, the likelihood and the frequency.
>
> MRM
>
> On Thu, Jun 10, 2021 at 8:58 AM Gus Heck <gus.h...@gmail.com> wrote:
>
>> On Wed, Jun 9, 2021 at 12:25 AM Mark Miller <markrmil...@gmail.com>
>> wrote:
>>
>>> Instead of respecting stream contracts, JavaBin is given how much to
>>> read and it diligently works to read to that point or in some cases its
>>> own end marker.
>>>
>> That sounds actionable, should it be reading the available contents into
>> an array and letting javabin work with the array instead? Or perhaps to
>> avoid that, a javabin2 format needs a prefix/header section indicating
>> exactly how much needs to be read to keep http2 happy? I'd guess from your
>> comment that this currently gets into problems due to JVM version/OS or
>> client/server version mismatches changing the length of something? Haven't
>> looked at javabin code in a long while so I'm mostly guessing...
>>
>> -Gus
>>
> --
> - Mark
>
> http://about.me/markrmiller
>
--
- Mark

http://about.me/markrmiller