Thanks for looking into this. I will investigate what it would take to solve
these performance regressions.

-Bryan

On Wed, Jan 13, 2010 at 9:12 PM, Chad Walters <[email protected]>wrote:

> I gathered the following data using the thrift-protobuf-compare benchmarks
> (multiple runs with each change). It's probably not very representative of
> any real world workload, but it's what I have to work with at the moment.
>
> Original (some pre-0.1 version of Thrift)
>                        ,   Object create,       Serialize,  /w Same Object,
>     Deserialize, and Check Media,   and Check All,      Total Time,
> Serialized Size
> thrift                  ,       318.93000,      4675.50000,
>  6708.00000,      5040.00000,             NaN,             NaN,
>  9715.50000,        353
> thrift                  ,       352.25500,      4556.00000,
>  6658.50000,      5086.00000,             NaN,             NaN,
>  9642.00000,        353
> thrift                  ,       350.59000,      4572.00000,
>  6748.50000,      5067.00000,             NaN,             NaN,
>  9639.00000,        353
>
> Thrift 0.2
>                        ,   Object create,       Serialize,  /w Same Object,
>     Deserialize, and Check Media,   and Check All,      Total Time,
> Serialized Size
> thrift                  ,       352.36500,      4506.00000,
>  6686.00000,      5783.00000,             NaN,             NaN,
> 10641.36500,        353
> thrift                  ,       329.13000,      4462.50000,
>  6757.00000,      5770.50000,             NaN,             NaN,
> 10562.13000,        353
> thrift                  ,       331.42000,      4503.00000,
>  6803.00000,      5768.50000,             NaN,             NaN,
> 10602.92000,        353
>
> Thrift 0.2 w/TCompactProtocol
>                        ,   Object create,       Serialize,  /w Same Object,
>     Deserialize, and Check Media,   and Check All,      Total Time,
> Serialized Size
> thrift                  ,       324.50500,      4467.00000,
>  6426.00000,      5703.00000,             NaN,             NaN,
> 10494.50500,        234
> thrift                  ,       323.00500,      4418.00000,
>  6423.00000,      5703.00000,             NaN,             NaN,
> 10444.00500,        234
> thrift                  ,       330.57000,      4388.50000,
>  5244.50000,      5704.00000,             NaN,             NaN,
> 10423.07000,        234
>
> Thrift 0.2 w/TCompactProtocol + read() fix
>                        ,   Object create,       Serialize,  /w Same Object,
>     Deserialize, and Check Media,   and Check All,      Total Time,
> Serialized Size
> thrift                  ,       327.55000,      4201.00000,
>  6163.50000,      4988.50000,             NaN,             NaN,
>  9517.05000,        234
> thrift                  ,       326.82500,      4194.00000,
>  5223.00000,      4975.50000,             NaN,             NaN,
>  9496.32500,        234
> thrift                  ,       324.36000,      4191.00000,
>  6217.00000,      4978.50000,             NaN,             NaN,
>  9493.86000,        234
>
> Thrift 0.2 w/TCompactProtocol + read() fix + enum fix
>                        ,   Object create,       Serialize,  /w Same Object,
>     Deserialize, and Check Media,   and Check All,      Total Time,
> Serialized Size
> thrift                  ,       346.90500,      4130.00000,
>  5311.00000,      4637.50000,             NaN,             NaN,
>  9114.40500,        234
> thrift                  ,       310.77500,      4168.50000,
>  5448.00000,      4718.00000,             NaN,             NaN,
>  9197.27500,        234
> thrift                  ,       314.56000,      4147.50000,
>  5540.50000,      4645.50000,             NaN,             NaN,
>  9107.56000,        234
>
>
> The "read() fix" reverts read() to use a switch on the field.id int rather
> than mapping to the enum and switching on that. The "enum fix" gets rid of
> the map in the enum classes and implements findByValue() as switch. I just
> applied these fixes by hand in the generated files.
>
> I'll go ahead and open issues on both of these items.
>
> Chad
>
> -----Original Message-----
> From: Bryan Duxbury [mailto:[email protected]]
> Sent: Tuesday, January 12, 2010 11:05 PM
> To: [email protected]
> Subject: Re: Perf regression in Java bindings?
>
> Chad -
>
> This is an interesting observation. Do you have a sense of how much of a
> difference it makes?
>
> I think you should open a ticket for this regression. Your proposed
> solutions sound viable and easy, so I can probably whip something up prett
> quickly.
>
> -Bryan
>
> On Tue, Jan 12, 2010 at 10:54 PM, Chad Walters
> <[email protected]>wrote:
>
> > I was playing around with upgrading some benchmarks to Thrift 0.2 and I
> > think that THRIFT-623 <https://issues.apache.org/jira/browse/THRIFT-623>
> > may have introduced a performance regression.
> >
> > Struct deserialization now does a map lookup to go from an integer to an
> > enumeration, checks for null in case the integer was not known, and then
> > does a switch on the value of the enumeration.
> >
> > But it appears to me that this mapping is entirely fixed at compile time.
> > Furthermore the enumeration does not appear to be used for anything other
> > than performing the switch (in the struct reading routine at least).
> >
> > This leads me to ask:
> > 1. Why not just switch on the integer and have a default case for unknown
> > values?
> > 2. Why have a dynamic map for a mapping that is known at compile time?
> > Wouldn't it be faster to just provide the mapping via a switch (when it
> is
> > needed at all)?
> >
> > Perhaps I am missing some subtlety but I thought I'd ask...
> >
> > Chad
> >
> >
> >
>

Reply via email to