Re: ConcurrentUpdate issue of solr 8.8.1

Mark Miller Sun, 13 Jun 2021 16:39:41 -0700

Fixing FastInputStream and JavaBin parsing may actually be as simple as
removing unsigned byte masks from the InputStream itself (some jetty fake
InputStream classes also unsign shift > 0 - the http client InputStream
response handler for example - not sure if that would need to go) and
making sure all the masks happen at the site of bit shifts/logic. Then the
always-readFully behavior could likely be removed and byte reads could
become a readAndCheckByte that checks for -1.
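
As a rough sketch of that idea (this is not Solr's actual code: the class
name CheckedReads and both readInt helpers are hypothetical, and only the
readAndCheckByte name comes from the note above), the mask can sit next to
the shift when working on raw bytes, and a stream read can fail loudly on -1
instead of folding it into the value:

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; not taken from FastInputStream or JavaBin.
class CheckedReads {

    // Returns the next byte as 0..255, or throws if the stream has ended.
    static int readAndCheckByte(InputStream in) throws IOException {
        int b = in.read(); // InputStream.read() returns 0..255, or -1 at EOF
        if (b == -1) {
            throw new EOFException("stream ended in the middle of a value");
        }
        return b;
    }

    // Big-endian int assembled from four checked single-byte reads.
    static int readInt(InputStream in) throws IOException {
        return (readAndCheckByte(in) << 24)
             | (readAndCheckByte(in) << 16)
             | (readAndCheckByte(in) << 8)
             |  readAndCheckByte(in);
    }

    // Same value from a raw byte[]: Java bytes are signed, so each byte is
    // masked right where it is shifted, before it widens to int.
    static int readInt(byte[] buf, int off) {
        return ((buf[off]     & 0xff) << 24)
             | ((buf[off + 1] & 0xff) << 16)
             | ((buf[off + 2] & 0xff) << 8)
             |  (buf[off + 3] & 0xff);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x00, 0x00, (byte) 0x80, (byte) 0xff};

        System.out.println(readInt(new ByteArrayInputStream(data))); // 33023
        System.out.println(readInt(data, 0));                        // 33023

        // Without masks, sign extension corrupts the result.
        int unmasked = (data[0] << 24) | (data[1] << 16) | (data[2] << 8) | data[3];
        System.out.println(unmasked);                                // -1

        // A truncated stream raises EOFException instead of yielding garbage.
        try {
            readInt(new ByteArrayInputStream(new byte[] {1, 2}));
        } catch (EOFException e) {
            System.out.println("EOF detected: " + e.getMessage());
        }
    }
}
```

With the -1 check in place, a short stream raises EOFException rather than
producing a garbage int, which is how java.io.DataInputStream already
behaves.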


MRM

On Sat, Jun 12, 2021 at 2:01 PM Mark Miller <markrmil...@gmail.com> wrote:

> There are lots of actionable items here from my view.
>
> So many that I don’t see any general importance to any one of them on a
> large scale.
>
> I run into the same problem if someone were to say, let’s make sure all
> the tests are solid for the upcoming release. Well what does that mean?
> With little effort I can eke out tons of fails that are not usually
> prominent. I know tons of issues great and small that the tests almost
> never reveal that are worse than many of the fails. I know plenty of
> behavior and fails nix each other out. That poor performance and various
> behavior keeps some tests happy but doesn’t necessarily translate to the real
> world.
>
> I can’t approach it as let’s make sure the tests are solid. But I can
> approach from directions like, let’s look at possible regressions in tests
> or behavior. Pain points in tests for a release, etc.
>
> For cross node communication, it’s the same. It’s not really an issue of
> “what are the problems”, but what are the problems that are currently in
> the way of what has to be supported. How much are they practically
> affecting the current expectations.
>
> My knowledge and approaches to many of these problems can be a bit
> difficult to translate. I’ve kind of ripped down javabin and streams and IO
> and built them back up. I use a mix of inputstreams that both stream and
> read/write data types and byte buffers and byte arrays via offheap buffers
> directly. JavaBin and transaction logs may use var int / zig zag encodings
> or direct byte buffer put/gets or intermix them. IO might happen
> sequentially like a stream or perhaps in parallel. The http client and
> jetty server do async and async IO and the off heap buffers for all of this
> or things like the bytes and reads/writes that back char sequences are all
> from my ByteBuffer pool. All to say, there are no direct translations for a
> lot of actionable items, nor is there a master list of what the practical
> effects and ramifications are in fairly different worlds.
>
> So IMO, there are enough actionable items as to make most of them of
> relatively low concern or priority to me. But if the list starts getting hit
> with many users having problems with connection resets and server stalls
> and javabin EOFException oddities or corrupt javabin outputs or something
> I’m supporting does, I’m ready to take action.
>
> I don’t know how important or what to do about the strange FastInputStream
> and JavaBin. I have looked at other data input stream impls and they don’t
> behave this way. FastInputStreams also affect replication and transaction
> logs and various things.
>
> At the end of the day, you just have to make sure bit transformations and
> byte-to-int auto casting don’t screw you. I have not found this to
> require the off-contract readFully behavior or lack of -1 EOF checking that
> you don’t find elsewhere.
>
> I have ripped all of these pieces apart and gutted a lot of it. I didn’t
> find FastInputStream or FastInputWriter and the like fast. I found often
> 3-4 or 5 layers of buffering depending on what was layered on what path.
> Each buffer allocating a byte array to be gc’d. 98% of the time writing and
> reading in memory or to *Jetty* ByteBuffer *buffers*, making the layered
> byte arrays being pumped out in between entirely not fast. And most of
> this, stream limited and oriented while sitting on top of super fast and
> efficient NIO at all the end points. My reading is that they are called
> fast because mostly they remove synchronized methods - which have almost no
> cost uncontended anyway. So I take issue with the naming.
>
> Anyway, I literally ripped a bunch of that chain out and reworked how
> most of the parts do IO - generally with no byte array gc pumping at all
> (the javabin utf conversions are also major offenders) and without any
> unnecessary layers or pretence of old style IO streams that don’t exist. If
> I need to stream something, I stream it, if I don’t, bye bye stream api,
> hello to the byte buffers that are actually there and all the bulk and
> direct off heap and concurrent access that allows for.
>
> And then http2 behavior and the client is the same thing. Had to rip and
> tear and rebuild and unbuild and expand and take over parts.
>
> So what’s actionable. Depending on what you want or need, 10000 things.
> But what’s necessary. I’m open to finding out based on the problems being
> hit, the likelihood and the frequency.
>
> MRM
>
>
>
> On Thu, Jun 10, 2021 at 8:58 AM Gus Heck <gus.h...@gmail.com> wrote:
>
>>
>> On Wed, Jun 9, 2021 at 12:25 AM Mark Miller <markrmil...@gmail.com>
>> wrote:
>>
>>>
>>> Instead of respecting stream contracts, JavaBin is given how much to
>>> read and it diligently works to read to that point or in some cases its
>>> own end marker.
>>>
>>>
>> That sounds actionable, should it be reading the available contents into
>> an array and letting javabin work with the array instead? Or perhaps to
>> avoid that, a javabin2 format needs a prefix/header section indicating
>> exactly how much needs to be read to keep http2 happy? I'd guess from your
>> comment that this currently gets into problems due to JVM version/os or
>> client/server version mismatches changing the length of something? Haven't
>> looked at javabin code in a long while so I'm mostly guessing...
>>
>> -Gus
>>
> --
> - Mark
>
> http://about.me/markrmiller
>
-- 
- Mark

http://about.me/markrmiller
