Hmm, you know what, from the size of the impl which mainly delegate to Charset.newEncoder()... I think we should just implement our own SizedOutputStreamWriter(Charset, OutputStream, int buffer)....even java.nio.channels.Channels#newWriter(java.nio.channels.WritableByteChannel, java.nio.charset.CharsetEncoder, int) workaround does not work cause the buffer size is a min for StreamEncoder so just doing our own looks way simpler overall. Would it help?
That said I just checked and looks like org.apache.johnzon.core.JsonGeneratorFactoryImpl#createGenerator(java.io.OutputStream) has a bug to not fallback on a configured encoding - we do it for the reader but not generator which should reuse the org.apache.johnzon.encoding property , this is a nasty bug making us not symmetric - nor functional - with some API :(. Will try to have a look tomorrow if I get time. Guess we'll end up dropping org.apache.johnzon.core.JsonGeneratorImpl#JsonGeneratorImpl(java.io.OutputStream, org.apache.johnzon.core.BufferStrategy.BufferProvider<char[]>, boolean). Romain Manni-Bucau @rmannibucau <https://twitter.com/rmannibucau> | Blog <https://rmannibucau.metawerx.net/> | Old Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book <https://www.packtpub.com/application-development/java-ee-8-high-performance> Le mar. 26 avr. 2022 à 19:55, David Blevins <[email protected]> a écrit : > The critical reason for the switch from OutputStream to Writer is the > hardcoded 8k buffer inside OutputStreamWriter, which prevents our > optimization from working. We'd still have been writing up to 8k. I > switched to Writer to be in front of that buffer rather than behind it. > > I'd need to check on the character encoding issue you mention. In my mind > the original code and current code is trying to create a string of max > snippet length. If it doesn't do that, it's a bug. > > > -David > > > > On Apr 26, 2022, at 10:10 AM, Romain Manni-Bucau <[email protected]> > wrote: > > > > Hi David, > > > > Just reviewed the PR, if you can quickly go through the questions/points > i > > can follow up tomorrow to finish it if you don't have time anymore. > > Long story short i'm a bit "??" on the move from stream to writer and I > > don't see how it works, rest is about being picky so I could handle it > > after the merge. > > > > Romain Manni-Bucau > > @rmannibucau <https://twitter.com/rmannibucau> | Blog > > <https://rmannibucau.metawerx.net/> | Old Blog > > <http://rmannibucau.wordpress.com> | Github < > https://github.com/rmannibucau> | > > LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book > > < > https://www.packtpub.com/application-development/java-ee-8-high-performance > > > > > > > > Le mar. 26 avr. 2022 à 19:02, David Blevins <[email protected]> a > > écrit : > > > >> I really like it. I pushed a version of this that gets to the heart of > >> the idea. I need to move on to the messages themselves as I have a hard > >> stop approaching me, but I am supportive if you want to keep refining > it. > >> > >> Ideally, we merge and you make any changes you feel are necessary and I > >> make second PR for rolling this out to the messages. > >> > >> > >> -David > >> > >>> On Apr 26, 2022, at 12:02 AM, Romain Manni-Bucau < > [email protected]> > >> wrote: > >>> > >>> Hey David, > >>> > >>> Just got an idea we can maybe refine to merge both points. > >>> I'll try to draft it high level (understand a Writer can be an > >> OutputStream > >>> or the opposite in your impl) so don't get too picky about the details > >>> please ;). > >>> > >>> High level the idea is to make JsonGeneratorFactoryImpl to be able to > >>> handle an outputstream or writer which would provide its buffer size. > >>> > >>> Here are the idea steps I can envision: > >>> > >>> 1. Create a new interface in johnzon-core `public interface > >>> MetadataProvider { int bufferSize(); }` > >>> 2. Create an OutputStream implementation of MetadataProvider - let's > call > >>> it MetadataProviderOutputStream for this list - if it helps it can be a > >>> MetadataProviderSnippetOutputStream directly which would just be a > >> subclass > >>> of SnippetOutputStream, think it would make it simpler. > >>> 3. Create a Supplier<OutputStream> (factory) in MapperBuilder - where > we > >>> create the Snippet instance - which will basically test the class name > of > >>> the generator factory - no reflection there please to stay compatible > >> with > >>> all our downstream consumers without changes - and if it > >>> is org.apache.johnzon.core.JsonGeneratorFactoryImpl then it will > create a > >>> MetadataProviderOutputStream else a plain OutputStream and the feature > >> will > >>> be disabled (for now at least) - note that it can need a small > >> indirection > >>> like a MetadataProviderOutputStreamSupplier to not trigger the load of > >> the > >>> MetadataProvider on MapperBuilder class init/verification. This will > also > >>> use the snippet max size indeed. Here we want to ensure we can run > >> without > >>> johnzon-core as JSON-P impl. > >>> 4. Pass the supplier created at 3 to the Snippet class next to the > >>> generator factory > >>> 5. Change JsonGeneratorFactoryImpl#createGenerator(OutputStream) to > test > >> if > >>> it is a MetadataProvider instance to handle it (small trick there is to > >> try > >>> to cache the snippet buffer to keep reusing the buffer caching - and > only > >>> when needed): > >>> > >>> public class JsonGeneratorFactoryImpl extends AbstractJsonFactory > >>> implements JsonGeneratorFactory { > >>> public static final String GENERATOR_BUFFER_LENGTH = > >>> "org.apache.johnzon.default-char-buffer-generator"; > >>> public static final int DEFAULT_GENERATOR_BUFFER_LENGTH = > >>> Integer.getInteger(GENERATOR_BUFFER_LENGTH, 64 * 1024); //64k > >>> > >>> static final Collection<String> SUPPORTED_CONFIG_KEYS = asList( > >>> JsonGenerator.PRETTY_PRINTING, GENERATOR_BUFFER_LENGTH, > >>> BUFFER_STRATEGY > >>> ); > >>> //key caching currently disabled > >>> private final boolean pretty; > >>> private final Buffer buffer; > >>> private volatile Buffer firstCustomBuffer; > >>> > >>> public JsonGeneratorFactoryImpl(final Map<String, ?> config) { > >>> super(config, SUPPORTED_CONFIG_KEYS, null); > >>> this.pretty = getBool(JsonGenerator.PRETTY_PRINTING, false); > >>> > >>> final int bufferSize = getInt(GENERATOR_BUFFER_LENGTH, > >>> DEFAULT_GENERATOR_BUFFER_LENGTH); > >>> if (bufferSize <= 0) { > >>> throw new IllegalArgumentException("buffer length must be > >>> greater than zero"); > >>> } > >>> this.buffer = new > >>> Buffer(getBufferProvider().newCharProvider(bufferSize), bufferSize); > >>> } > >>> > >>> @Override > >>> public JsonGenerator createGenerator(final Writer writer) { > >>> return new JsonGeneratorImpl(writer, buffer.provider, pretty); > >>> } > >>> > >>> @Override > >>> public JsonGenerator createGenerator(final OutputStream out) { > >>> if (MetadataProvider.class.isInstance(out)) { // mainly for > >> Snippet > >>> handling since generators don't have properties > >>> final int bufferSize = > >>> MetadataProvider.class.cast(out).bufferSize(); > >>> if (buffer.size != bufferSize) { > >>> // most of the time it will be a single aside buffer so > >>> keep its factory to reuse the provider cache > >>> if (firstCustomBuffer == null) { > >>> synchronized (this) { > >>> if (firstCustomBuffer == null) { > >>> firstCustomBuffer = new > >>> Buffer(getBufferProvider().newCharProvider(bufferSize), bufferSize); > >>> } > >>> } > >>> } > >>> Buffer customBuffer = firstCustomBuffer; > >>> if (customBuffer.size != bufferSize) { // don't use the > >>> cache, shouldn't occur often > >>> return new JsonGeneratorImpl(out, > >>> getBufferProvider().newCharProvider(bufferSize), pretty); > >>> } > >>> return new JsonGeneratorImpl(out, customBuffer.provider, > >>> pretty); > >>> } // else just reuse the original buffer since it matches and > >>> leverages its cache > >>> } > >>> return new JsonGeneratorImpl(out, buffer.provider, pretty); > >>> } > >>> > >>> @Override > >>> public JsonGenerator createGenerator(final OutputStream out, final > >>> Charset charset) { > >>> return new JsonGeneratorImpl(out,charset, buffer.provider, > >> pretty); > >>> } > >>> > >>> @Override > >>> public Map<String, ?> getConfigInUse() { > >>> return Collections.unmodifiableMap(internalConfig); > >>> } > >>> > >>> private static final class Buffer { > >>> private final BufferStrategy.BufferProvider<char[]> provider; > >>> private final int size; > >>> > >>> private Buffer(final BufferStrategy.BufferProvider<char[]> > >>> bufferProvider, final int bufferSize) { > >>> this.provider = bufferProvider; > >>> this.size = bufferSize; > >>> } > >>> } > >>> } > >>> > >>> 6. you use the outputstream supplier in Snippet and done > >>> > >>> Wdyt? > >>> > >>> Romain Manni-Bucau > >>> @rmannibucau <https://twitter.com/rmannibucau> | Blog > >>> <https://rmannibucau.metawerx.net/> | Old Blog > >>> <http://rmannibucau.wordpress.com> | Github < > >> https://github.com/rmannibucau> | > >>> LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book > >>> < > >> > https://www.packtpub.com/application-development/java-ee-8-high-performance > >>> > >>> > >>> > >>> Le mar. 26 avr. 2022 à 07:05, Romain Manni-Bucau < > [email protected]> > >> a > >>> écrit : > >>> > >>>> > >>>> > >>>> Le mar. 26 avr. 2022 à 00:44, David Blevins <[email protected]> > a > >>>> écrit : > >>>> > >>>>> Since we're now getting the snippet max length via the config, could > we > >>>>> construct the JsonGeneratorImpl directly and pass in a buffer of the > >>>>> correct size? > >>>>> > >>>> > >>>> From the JsonProvider we can - mapper must not depend on johnzon-core > - > >> at > >>>> the condition to promote the other properties too AND not do it if > user > >>>> provided a custom generator factory which would just be reused in this > >>>> case, so not sure it solves the issue. > >>>> > >>>> > >>>>> -David > >>>>> > >>>>> > >>>>>> On Apr 25, 2022, at 2:19 PM, Romain Manni-Bucau < > >> [email protected]> > >>>>> wrote: > >>>>>> > >>>>>> No worries, got it. > >>>>>> > >>>>>> One issue copying the generator is that it will still miss the flush > >> for > >>>>>> all small objects and for all other cases the heuristic + a simple > >>>>>> truncation if needed would work with almost no other downsides than > >> not > >>>>>> being exact but goal was to limit the overflow/cpu/mem so all good > >> IMHO. > >>>>>> > >>>>>> Le lun. 25 avr. 2022 à 22:57, David Blevins < > [email protected]> > >> a > >>>>>> écrit : > >>>>>> > >>>>>>>> On Apr 25, 2022, at 10:47 AM, David Blevins < > >> [email protected]> > >>>>>>> wrote: > >>>>>>>> > >>>>>>>> The trick with that is there is yet another buffer being held > >>>>> internally > >>>>>>> by ObjectOutputStream and it recreates the issue. > >>>>>>> > >>>>>>> Just read this again -- this was supposed to be > "OutputStreamWriter" > >>>>> not > >>>>>>> ObjectOutputStream. Too many years of EJB :) > >>>>>>> > >>>>>>> > >>>>>>> -David > >>>>>>> > >>>>>>> > >>>>> > >>>>> > >> > >> > >
