The critical reason for the switch from OutputStream to Writer is the hardcoded 8k buffer inside OutputStreamWriter, which prevents our optimization from working. We'd still have been writing up to 8k. I switched to Writer to be in front of that buffer rather than behind it.
I'd need to check on the character encoding issue you mention. In my mind the original code and current code is trying to create a string of max snippet length. If it doesn't do that, it's a bug. -David > On Apr 26, 2022, at 10:10 AM, Romain Manni-Bucau <[email protected]> > wrote: > > Hi David, > > Just reviewed the PR, if you can quickly go through the questions/points i > can follow up tomorrow to finish it if you don't have time anymore. > Long story short i'm a bit "??" on the move from stream to writer and I > don't see how it works, rest is about being picky so I could handle it > after the merge. > > Romain Manni-Bucau > @rmannibucau <https://twitter.com/rmannibucau> | Blog > <https://rmannibucau.metawerx.net/> | Old Blog > <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | > LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book > <https://www.packtpub.com/application-development/java-ee-8-high-performance> > > > Le mar. 26 avr. 2022 à 19:02, David Blevins <[email protected]> a > écrit : > >> I really like it. I pushed a version of this that gets to the heart of >> the idea. I need to move on to the messages themselves as I have a hard >> stop approaching me, but I am supportive if you want to keep refining it. >> >> Ideally, we merge and you make any changes you feel are necessary and I >> make second PR for rolling this out to the messages. >> >> >> -David >> >>> On Apr 26, 2022, at 12:02 AM, Romain Manni-Bucau <[email protected]> >> wrote: >>> >>> Hey David, >>> >>> Just got an idea we can maybe refine to merge both points. >>> I'll try to draft it high level (understand a Writer can be an >> OutputStream >>> or the opposite in your impl) so don't get too picky about the details >>> please ;). >>> >>> High level the idea is to make JsonGeneratorFactoryImpl to be able to >>> handle an outputstream or writer which would provide its buffer size. >>> >>> Here are the idea steps I can envision: >>> >>> 1. Create a new interface in johnzon-core `public interface >>> MetadataProvider { int bufferSize(); }` >>> 2. Create an OutputStream implementation of MetadataProvider - let's call >>> it MetadataProviderOutputStream for this list - if it helps it can be a >>> MetadataProviderSnippetOutputStream directly which would just be a >> subclass >>> of SnippetOutputStream, think it would make it simpler. >>> 3. Create a Supplier<OutputStream> (factory) in MapperBuilder - where we >>> create the Snippet instance - which will basically test the class name of >>> the generator factory - no reflection there please to stay compatible >> with >>> all our downstream consumers without changes - and if it >>> is org.apache.johnzon.core.JsonGeneratorFactoryImpl then it will create a >>> MetadataProviderOutputStream else a plain OutputStream and the feature >> will >>> be disabled (for now at least) - note that it can need a small >> indirection >>> like a MetadataProviderOutputStreamSupplier to not trigger the load of >> the >>> MetadataProvider on MapperBuilder class init/verification. This will also >>> use the snippet max size indeed. Here we want to ensure we can run >> without >>> johnzon-core as JSON-P impl. >>> 4. Pass the supplier created at 3 to the Snippet class next to the >>> generator factory >>> 5. Change JsonGeneratorFactoryImpl#createGenerator(OutputStream) to test >> if >>> it is a MetadataProvider instance to handle it (small trick there is to >> try >>> to cache the snippet buffer to keep reusing the buffer caching - and only >>> when needed): >>> >>> public class JsonGeneratorFactoryImpl extends AbstractJsonFactory >>> implements JsonGeneratorFactory { >>> public static final String GENERATOR_BUFFER_LENGTH = >>> "org.apache.johnzon.default-char-buffer-generator"; >>> public static final int DEFAULT_GENERATOR_BUFFER_LENGTH = >>> Integer.getInteger(GENERATOR_BUFFER_LENGTH, 64 * 1024); //64k >>> >>> static final Collection<String> SUPPORTED_CONFIG_KEYS = asList( >>> JsonGenerator.PRETTY_PRINTING, GENERATOR_BUFFER_LENGTH, >>> BUFFER_STRATEGY >>> ); >>> //key caching currently disabled >>> private final boolean pretty; >>> private final Buffer buffer; >>> private volatile Buffer firstCustomBuffer; >>> >>> public JsonGeneratorFactoryImpl(final Map<String, ?> config) { >>> super(config, SUPPORTED_CONFIG_KEYS, null); >>> this.pretty = getBool(JsonGenerator.PRETTY_PRINTING, false); >>> >>> final int bufferSize = getInt(GENERATOR_BUFFER_LENGTH, >>> DEFAULT_GENERATOR_BUFFER_LENGTH); >>> if (bufferSize <= 0) { >>> throw new IllegalArgumentException("buffer length must be >>> greater than zero"); >>> } >>> this.buffer = new >>> Buffer(getBufferProvider().newCharProvider(bufferSize), bufferSize); >>> } >>> >>> @Override >>> public JsonGenerator createGenerator(final Writer writer) { >>> return new JsonGeneratorImpl(writer, buffer.provider, pretty); >>> } >>> >>> @Override >>> public JsonGenerator createGenerator(final OutputStream out) { >>> if (MetadataProvider.class.isInstance(out)) { // mainly for >> Snippet >>> handling since generators don't have properties >>> final int bufferSize = >>> MetadataProvider.class.cast(out).bufferSize(); >>> if (buffer.size != bufferSize) { >>> // most of the time it will be a single aside buffer so >>> keep its factory to reuse the provider cache >>> if (firstCustomBuffer == null) { >>> synchronized (this) { >>> if (firstCustomBuffer == null) { >>> firstCustomBuffer = new >>> Buffer(getBufferProvider().newCharProvider(bufferSize), bufferSize); >>> } >>> } >>> } >>> Buffer customBuffer = firstCustomBuffer; >>> if (customBuffer.size != bufferSize) { // don't use the >>> cache, shouldn't occur often >>> return new JsonGeneratorImpl(out, >>> getBufferProvider().newCharProvider(bufferSize), pretty); >>> } >>> return new JsonGeneratorImpl(out, customBuffer.provider, >>> pretty); >>> } // else just reuse the original buffer since it matches and >>> leverages its cache >>> } >>> return new JsonGeneratorImpl(out, buffer.provider, pretty); >>> } >>> >>> @Override >>> public JsonGenerator createGenerator(final OutputStream out, final >>> Charset charset) { >>> return new JsonGeneratorImpl(out,charset, buffer.provider, >> pretty); >>> } >>> >>> @Override >>> public Map<String, ?> getConfigInUse() { >>> return Collections.unmodifiableMap(internalConfig); >>> } >>> >>> private static final class Buffer { >>> private final BufferStrategy.BufferProvider<char[]> provider; >>> private final int size; >>> >>> private Buffer(final BufferStrategy.BufferProvider<char[]> >>> bufferProvider, final int bufferSize) { >>> this.provider = bufferProvider; >>> this.size = bufferSize; >>> } >>> } >>> } >>> >>> 6. you use the outputstream supplier in Snippet and done >>> >>> Wdyt? >>> >>> Romain Manni-Bucau >>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >>> <https://rmannibucau.metawerx.net/> | Old Blog >>> <http://rmannibucau.wordpress.com> | Github < >> https://github.com/rmannibucau> | >>> LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book >>> < >> https://www.packtpub.com/application-development/java-ee-8-high-performance >>> >>> >>> >>> Le mar. 26 avr. 2022 à 07:05, Romain Manni-Bucau <[email protected]> >> a >>> écrit : >>> >>>> >>>> >>>> Le mar. 26 avr. 2022 à 00:44, David Blevins <[email protected]> a >>>> écrit : >>>> >>>>> Since we're now getting the snippet max length via the config, could we >>>>> construct the JsonGeneratorImpl directly and pass in a buffer of the >>>>> correct size? >>>>> >>>> >>>> From the JsonProvider we can - mapper must not depend on johnzon-core - >> at >>>> the condition to promote the other properties too AND not do it if user >>>> provided a custom generator factory which would just be reused in this >>>> case, so not sure it solves the issue. >>>> >>>> >>>>> -David >>>>> >>>>> >>>>>> On Apr 25, 2022, at 2:19 PM, Romain Manni-Bucau < >> [email protected]> >>>>> wrote: >>>>>> >>>>>> No worries, got it. >>>>>> >>>>>> One issue copying the generator is that it will still miss the flush >> for >>>>>> all small objects and for all other cases the heuristic + a simple >>>>>> truncation if needed would work with almost no other downsides than >> not >>>>>> being exact but goal was to limit the overflow/cpu/mem so all good >> IMHO. >>>>>> >>>>>> Le lun. 25 avr. 2022 à 22:57, David Blevins <[email protected]> >> a >>>>>> écrit : >>>>>> >>>>>>>> On Apr 25, 2022, at 10:47 AM, David Blevins < >> [email protected]> >>>>>>> wrote: >>>>>>>> >>>>>>>> The trick with that is there is yet another buffer being held >>>>> internally >>>>>>> by ObjectOutputStream and it recreates the issue. >>>>>>> >>>>>>> Just read this again -- this was supposed to be "OutputStreamWriter" >>>>> not >>>>>>> ObjectOutputStream. Too many years of EJB :) >>>>>>> >>>>>>> >>>>>>> -David >>>>>>> >>>>>>> >>>>> >>>>> >> >>
smime.p7s
Description: S/MIME cryptographic signature
