> On Apr 24, 2022, at 11:09 PM, Romain Manni-Bucau <[email protected]> wrote:
>
> On Sun, Apr 24, 2022 at 22:30, David Blevins <[email protected]> wrote:
>
>> All,
>>
>> I added more tests and found that most the optimizations were not
>> happening due to buffering.
>>
>> Essentially there are two buffers between Snippet.Buffer and
>> Snippet.SnippetOutputStream. The SnippetOutputStream had the
>> responsibility to tell the code up the stack when we've reached the max
>> snippet length. Since all the bytes were buffered, it would see nothing
>> until the very end and we'd end up serializing the full json text anyway.
>>
>> One is the 64k buffer in JsonGeneratorImpl and the other is an 8k buffer
>> in the JVM implementation code of OutputStreamWriter. Since the
>> OutputStreamWriter buffer is hardcoded, we can't solve this by adjusting
>> buffer sizes and have no choice but to aggressively call flush() to ensure
>> SnippetOutputStream has the bytes and can do its job.
>>
>
> Not sure I get that since if you close in a finally block the generator, it
> will flush the actual output and all will be good.
> But can be to call tostring to early rather than a buffering issue

It's difficult to explain, but I'll do my best, and thanks for the patience if my attempt is poor.

The code is designed with the assumption that as json is serialized there will be write calls made on SnippetOutputStream, which then counts the bytes and can eventually tell Snippet.Buffer to stop making more json via the 'snippet.terminate()' calls.

In practice this doesn't happen. What does happen is that the entire json document, up to a limit of 64k, is created before any calls reach SnippetOutputStream. This is because JsonGeneratorImpl holds a buffer (64k by default) and does not call any writes on the Writer instance it's holding until that buffer has filled or close is called.

My first instinct was to reduce that 64k buffer to the snippet max length and solve the problem that way. The trick is that there is yet another buffer, held internally by OutputStreamWriter, and it recreates the issue. That buffer is hardcoded to 8k. So even if we adjust the JsonGeneratorImpl buffer size, the entire json document, up to a limit of 8k, will still be created before any calls reach SnippetOutputStream.

Certainly 8k is better than 64k, which is better than potentially 1GB of json, but I wanted to get close to the spirit of what we were both after originally: we avoid serializing a lot of json only to throw most of it away and show just a chunk of it.

The only way to do that is the flush() calls. That's the only way to ensure SnippetOutputStream gets the json data as we serialize it.

Hope some of this helps.

-David
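
To illustrate the effect described above, here is a minimal sketch of the OutputStreamWriter half of the problem. It is not the actual Snippet code: CountingStream is a hypothetical stand-in for SnippetOutputStream, FlushDemo and chunksWritten are made-up names, and the loop writing fixed 10-character chunks stands in for the json generator. The writer's internal buffer size is a JDK implementation detail, so the sketch keeps the total output small enough to fit in it.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for SnippetOutputStream: it only counts bytes
// and reports once a configured limit has been reached.
class CountingStream extends OutputStream {
    private final int limit;
    private int count;

    CountingStream(int limit) {
        this.limit = limit;
    }

    @Override
    public void write(int b) {
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) {
        count += len;
    }

    boolean limitReached() {
        return count >= limit;
    }
}

class FlushDemo {
    // Produces 10-character chunks until the stream reports that the limit
    // was reached or maxChunks were produced. Returns how many chunks were
    // actually produced, i.e. how much work was done before we could stop.
    static int chunksWritten(int limit, int maxChunks, boolean flushEachChunk)
            throws IOException {
        CountingStream out = new CountingStream(limit);
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        int chunks = 0;
        while (chunks < maxChunks && !out.limitReached()) {
            writer.write("0123456789");
            chunks++;
            if (flushEachChunk) {
                // Push the writer's internally buffered bytes down to 'out'
                // so the counter can see them right away.
                writer.flush();
            }
        }
        writer.close();
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        // Limit of 50 bytes, at most 20 chunks (200 bytes in total).
        System.out.println("without flush: " + chunksWritten(50, 20, false));
        System.out.println("with flush:    " + chunksWritten(50, 20, true));
    }
}
```

Without the flush, the counting stream sees no bytes until close(), so all 20 chunks are produced even though the limit is 50 bytes. With a flush after each chunk, the loop stops after 5 chunks. The same reasoning applies one layer up to the generator's own buffer.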
