> On Apr 25, 2022, at 12:23 PM, Romain Manni-Bucau <[email protected]> wrote:
>
> Oh, got it, thanks for re-explaining.
>
> Did you try a simple heuristic while visiting to estimate the size (3 chars
> for a number, no escaping for strings, just the length, etc.)? That should
> let us cut the visiting short fast enough without hacking any lower-level
> buffer strategy, and without calling flush and moving the array data too
> often. Otherwise we can do a custom generator factory from the provider in
> the mapper, propagating properties, but I'm less a fan of that option
> because it complicates the user customization of the generator (today we
> can output other stuff than JSON that way).

The most optimized solution I can imagine is setting the buffer size of JsonGeneratorImpl to the snippet size, then adding an option to JsonGeneratorImpl so that it can be told to call flush() on the Writer when it reaches its buffer limit.

That said, I think we should shift focus to getting this used by as many exceptions as we can and updating the documentation, then come back to it. With the default snippet size of 50, no real harm can be done by the flushes.

My $0.02 at least.

-David

>
> On Mon, Apr 25, 2022 at 19:47, David Blevins <[email protected]> wrote:
>
>>> On Apr 24, 2022, at 11:09 PM, Romain Manni-Bucau <[email protected]> wrote:
>>>
>>> On Sun, Apr 24, 2022 at 22:30, David Blevins <[email protected]> wrote:
>>>
>>>> All,
>>>>
>>>> I added more tests and found that most of the optimizations were not
>>>> happening due to buffering.
>>>>
>>>> Essentially there are two buffers between Snippet.Buffer and
>>>> Snippet.SnippetOutputStream. The SnippetOutputStream had the
>>>> responsibility to tell the code up the stack when we've reached the max
>>>> snippet length. Since all the bytes were buffered, it would see nothing
>>>> until the very end and we'd end up serializing the full json text anyway.
>>>>
>>>> One is the 64k buffer in JsonGeneratorImpl and the other is an 8k buffer
>>>> in the JVM implementation code of OutputStreamWriter. Since the
>>>> OutputStreamWriter buffer is hardcoded, we can't solve this by adjusting
>>>> buffer sizes and have no choice but to aggressively call flush() to ensure
>>>> SnippetOutputStream has the bytes and can do its job.
>>>>
>>>
>>> Not sure I get that, since if you close the generator in a finally block,
>>> it will flush the actual output and all will be good.
>>> But it may be that toString is called too early rather than a buffering
>>> issue.
>>
>> It's difficult to explain, but I'll do my best and thanks for the patience
>> if my attempt is poor.
>>
>> The code is designed with the assumption that as json is serialized there
>> will be write calls made on SnippetOutputStream, which then counts the
>> bytes and can eventually tell Snippet.Buffer to stop making more json via
>> the 'snippet.terminate()' calls. In practice this doesn't happen.
>>
>> In practice what does happen is the entire json document, up to a limit of
>> 64k, will be created before any calls reach SnippetOutputStream. This is
>> because JsonGeneratorImpl is holding a buffer (64k by default) and does not
>> call any writes on the Writer instance it's holding until that buffer has
>> filled or close is called.
>>
>> My first instinct was to reduce that 64k buffer to the snippet max length
>> and solve this problem that way. The trick with that is there is yet
>> another buffer being held internally by OutputStreamWriter and it recreates
>> the issue. That buffer is hardcoded to be 8k. So even if we adjust the
>> JsonGeneratorImpl buffer size, in practice what happens is the entire json
>> document, up to a limit of 8k, will be created before any calls reach
>> SnippetOutputStream.
>>
>> Certainly 8k is better than 64k which is better than potentially 1GB of
>> json, but I wanted to try and get close to the spirit of what we were both
>> after originally which is that we avoid serializing a lot of json only to
>> throw most of it away and show just a chunk of it.
>>
>> The only way to do that is the flush() calls. That's the only way to
>> ensure SnippetOutputStream is getting the json data as we serialize it.
>>
>> Hope some of this helps.
>>
>>
>> -David
>>
>>
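
For anyone who wants to see the OutputStreamWriter part of this in isolation, here is a minimal sketch using only JDK classes. CountingOutputStream is a hypothetical stand-in for SnippetOutputStream and bytesSeen is an illustrative helper; neither is the actual project code, and JsonGeneratorImpl is left out entirely.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class FlushDemo {

    /** Hypothetical stand-in for SnippetOutputStream: it only counts the bytes that reach it. */
    static final class CountingOutputStream extends OutputStream {
        private int count;

        @Override
        public void write(int b) {
            count++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            count += len;
        }

        int count() {
            return count;
        }
    }

    /**
     * Writes the text through an OutputStreamWriter and reports how many bytes
     * the underlying stream has received at that point, with or without flush().
     */
    static int bytesSeen(String text, boolean flush) {
        CountingOutputStream sink = new CountingOutputStream();
        Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8);
        try {
            writer.write(text);
            if (flush) {
                writer.flush(); // pushes the encoder's internal byte buffer down to the sink
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.count();
    }

    public static void main(String[] args) {
        String json = "{\"name\":\"example\"}";
        System.out.println("without flush: " + bytesSeen(json, false));
        System.out.println("with flush:    " + bytesSeen(json, true));
    }
}
```

With a short document like this one, the unflushed case typically reports no bytes at all, because the writer is still holding them. The exact internal buffer size is a JDK implementation detail, so treat the sketch as an illustration of the effect rather than a guarantee about any particular limit.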
