Thanks for looking into this. I will investigate what it would take to solve these performance regressions.
-Bryan On Wed, Jan 13, 2010 at 9:12 PM, Chad Walters <[email protected]>wrote: > I gathered the following data using the thrift-protobuf-compare benchmarks > (multiple runs with each change). It's probably not very representative of > any real world workload, but it's what I have to work with at the moment. > > Original (some pre-0.1 version of Thrift) > , Object create, Serialize, /w Same Object, > Deserialize, and Check Media, and Check All, Total Time, > Serialized Size > thrift , 318.93000, 4675.50000, > 6708.00000, 5040.00000, NaN, NaN, > 9715.50000, 353 > thrift , 352.25500, 4556.00000, > 6658.50000, 5086.00000, NaN, NaN, > 9642.00000, 353 > thrift , 350.59000, 4572.00000, > 6748.50000, 5067.00000, NaN, NaN, > 9639.00000, 353 > > Thrift 0.2 > , Object create, Serialize, /w Same Object, > Deserialize, and Check Media, and Check All, Total Time, > Serialized Size > thrift , 352.36500, 4506.00000, > 6686.00000, 5783.00000, NaN, NaN, > 10641.36500, 353 > thrift , 329.13000, 4462.50000, > 6757.00000, 5770.50000, NaN, NaN, > 10562.13000, 353 > thrift , 331.42000, 4503.00000, > 6803.00000, 5768.50000, NaN, NaN, > 10602.92000, 353 > > Thrift 0.2 w/TCompactProtocol > , Object create, Serialize, /w Same Object, > Deserialize, and Check Media, and Check All, Total Time, > Serialized Size > thrift , 324.50500, 4467.00000, > 6426.00000, 5703.00000, NaN, NaN, > 10494.50500, 234 > thrift , 323.00500, 4418.00000, > 6423.00000, 5703.00000, NaN, NaN, > 10444.00500, 234 > thrift , 330.57000, 4388.50000, > 5244.50000, 5704.00000, NaN, NaN, > 10423.07000, 234 > > Thrift 0.2 w/TCompactProtocol + read() fix > , Object create, Serialize, /w Same Object, > Deserialize, and Check Media, and Check All, Total Time, > Serialized Size > thrift , 327.55000, 4201.00000, > 6163.50000, 4988.50000, NaN, NaN, > 9517.05000, 234 > thrift , 326.82500, 4194.00000, > 5223.00000, 4975.50000, NaN, NaN, > 9496.32500, 234 > thrift , 324.36000, 4191.00000, > 6217.00000, 4978.50000, NaN, NaN, > 9493.86000, 234 > > Thrift 0.2 w/TCompactProtocol + read() fix + enum fix > , Object create, Serialize, /w Same Object, > Deserialize, and Check Media, and Check All, Total Time, > Serialized Size > thrift , 346.90500, 4130.00000, > 5311.00000, 4637.50000, NaN, NaN, > 9114.40500, 234 > thrift , 310.77500, 4168.50000, > 5448.00000, 4718.00000, NaN, NaN, > 9197.27500, 234 > thrift , 314.56000, 4147.50000, > 5540.50000, 4645.50000, NaN, NaN, > 9107.56000, 234 > > > The "read() fix" reverts read() to use a switch on the field.id int rather > than mapping to the enum and switching on that. The "enum fix" gets rid of > the map in the enum classes and implements findByValue() as switch. I just > applied these fixes by hand in the generated files. > > I'll go ahead and open issues on both of these items. > > Chad > > -----Original Message----- > From: Bryan Duxbury [mailto:[email protected]] > Sent: Tuesday, January 12, 2010 11:05 PM > To: [email protected] > Subject: Re: Perf regression in Java bindings? > > Chad - > > This is an interesting observation. Do you have a sense of how much of a > difference it makes? > > I think you should open a ticket for this regression. Your proposed > solutions sound viable and easy, so I can probably whip something up prett > quickly. > > -Bryan > > On Tue, Jan 12, 2010 at 10:54 PM, Chad Walters > <[email protected]>wrote: > > > I was playing around with upgrading some benchmarks to Thrift 0.2 and I > > think that THRIFT-623 <https://issues.apache.org/jira/browse/THRIFT-623> > > may have introduced a performance regression. > > > > Struct deserialization now does a map lookup to go from an integer to an > > enumeration, checks for null in case the integer was not known, and then > > does a switch on the value of the enumeration. > > > > But it appears to me that this mapping is entirely fixed at compile time. > > Furthermore the enumeration does not appear to be used for anything other > > than performing the switch (in the struct reading routine at least). > > > > This leads me to ask: > > 1. Why not just switch on the integer and have a default case for unknown > > values? > > 2. Why have a dynamic map for a mapping that is known at compile time? > > Wouldn't it be faster to just provide the mapping via a switch (when it > is > > needed at all)? > > > > Perhaps I am missing some subtlety but I thought I'd ask... > > > > Chad > > > > > > >
