Hi Raf,

I suggest you re-measure the codec times using System.nanoTime(), which should give you more accurate results for benchmarking. Please note that on Windows platforms the accuracy is much worse than on POSIX platforms.
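For what it's worth, a minimal timing harness along those lines could look like the sketch below. The ITERATIONS constant mirrors the hundred-million-iteration run quoted later in this mail, and encodeOnce() is just a hypothetical stand-in for the codec call under test, not anything from the attached Codec.java:

    public class CodecTiming {
        static final int ITERATIONS = 100_000_000;   // same order as the run quoted below

        public static void main(String[] args) {
            long start = System.nanoTime();           // monotonic clock, preferred over currentTimeMillis()
            long sink = 0;
            for (int i = 0; i < ITERATIONS; i++) {
                sink += encodeOnce(i);                // stand-in for the real encode/decode step
            }
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
            System.out.println("encode: " + elapsedMillis + " millis (sink=" + sink + ")");
        }

        // Hypothetical placeholder for the codec call; it returns a value that feeds
        // into 'sink' so the JIT cannot eliminate the loop body as dead code.
        static int encodeOnce(int i) {
            return Integer.numberOfLeadingZeros(i);
        }
    }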
Thanks,
Alex Kritikos
Software AG

On 7 May 2014, at 5:36 AM, Rafael Schloming <r...@alum.mit.edu> wrote:

> I've put together some mock-ups of a few different codec strategies, both to
> compare them from an API/usability perspective and to get a rough idea of some
> of the performance implications of the different choices. Please see the
> attached code for the full details. I'll summarize the different strategies
> below.
>
> The SimpleEncoder is pretty straightforward; the only real point here is to
> use basic types to represent values and thereby minimize the amount of
> intermediate memory and CPU required to use the codec.
>
> The DispatchingDecoder works similarly to a SAX-style parser. It basically
> iterates over the encoded content and dispatches values to a handler.
>
> The StreamingDecoder is similar to the DispatchingDecoder, except that instead
> of an internal "bytecode" loop calling out to a handler, it is externally
> driven by calling into the decoder. This appears to be marginally slower than
> the DispatchingDecoder in the particular scenario in the mock-up; however, it
> may have some API benefits, e.g. conversions can be done on demand and it is
> possible to skip over uninteresting data rather than parsing it.
>
> The mock-up also includes the same data being encoded/decoded using the
> existing codec (with Clebert's patch).
>
> Fair warning: the data I chose to encode/decode is completely arbitrary and
> not intended to be representative at all. That said, the numbers I'm getting
> suggest to me that we can do a whole lot better than the current codec if we
> start with something simple and keep it that way. Here is the output I'm
> getting for a run with a hundred million iterations:
>
> simple encode: 4416 millis
> dispatching decode: 3049 millis
> streaming decode: 3243 millis
> existing encode: 9515 millis
> existing decode: 13931 millis
>
> Another factor to consider is the difficulty of quantifying the impact of
> generating lots of garbage. In a small benchmark like this there isn't a lot
> of memory pressure, so extra garbage doesn't have much impact; however, in a
> real application it would translate into increased GC cycles and so might be
> more of a factor. What I can say from watching memory usage under the
> profiler is that at any given point there are typically hundreds of megs
> worth of garbage Integer and UUID instances lying around when the existing
> codec is running. None of the alternative strategies I've included generates
> any garbage.
>
> --Rafael
>
> <Codec.java>
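For anyone following along, here is a rough illustration of the two decoder shapes Rafael describes above: a push/handler style and a pull/streaming style, both passing primitives so no wrapper objects (Integer, UUID, ...) are allocated per value. The interface and method names are illustrative assumptions, not the actual API in the attached Codec.java:

    import java.nio.ByteBuffer;

    // Push style, in the spirit of the DispatchingDecoder: the decoder walks the
    // encoded buffer and calls out to a handler for each value it finds.
    interface ValueHandler {
        void onInt(int value);
        void onLong(long value);
        void onUuid(long msb, long lsb);   // two longs instead of a java.util.UUID object
    }

    interface DispatchingStyleDecoder {
        void decode(ByteBuffer encoded, ValueHandler handler);
    }

    // Pull style, in the spirit of the StreamingDecoder: the caller drives decoding,
    // converts values on demand, and can skip data it doesn't care about.
    interface StreamingStyleDecoder {
        void init(ByteBuffer encoded);
        boolean hasNext();
        byte peekType();     // inspect the next wire type without consuming it
        int nextInt();
        long nextLong();
        void skipValue();    // advance past a value without fully parsing it
    }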