(Hoping to find time to look at this soon.)

On Sun, Jun 6, 2010 at 8:19 AM, Evan Jones <ev...@mit.edu> wrote:
> My patch to improve string encoding performance is now available as the
> following code review. The result: 13%-27% improvement on the ProtoBench
> files included in SVN. This is faster than the JDK because it significantly
> reduces memory allocations (JDK best case: 5X string length; my best case:
> string length + 64 bytes). It also eliminates a copy, but it also adds a
> copy of the String data, so that probably is about equal.
>
> http://codereview.appspot.com/949044
>
> This patch was designed to not change the lite runtime at all, so there is
> this weird hacky class called FastStringEncoder, that really contains
> methods that should be added to CodedOutputStream.
>
> I think it would be a good idea to include this patch in the protocol
> buffer library, although there is a risk that my UTF-8 encoding code may
> have bugs in it. Hence, I won't be disappointed if this is rejected for the
> protocol buffer distribution, but I will try to maintain the patch.
>
> I have more detailed performance results, if anyone cares.
>
> Evan
>
>
> Detailed results for speed messages:
>
> ORIGINAL
> Benchmarking benchmarks.GoogleSpeed$SpeedMessage1 with file google_message1.dat
> Serialize to byte string: 21006530 iterations in 32.088s; 142.34642MB/s
> Serialize to byte array: 19310791 iterations in 29.529s; 142.19565MB/s
> Serialize to memory stream: 19679249 iterations in 32.203s; 132.87619MB/s
> Serialize to /dev/null with FileOutputStream: 15728640 iterations in 29.929s; 114.27044MB/s
> Serialize to /dev/null reusing FileOutputStream: 14796462 iterations in 27.534s; 116.848595MB/s
> Serialize to /dev/null with FileChannel: 18961591 iterations in 31.51s; 130.84625MB/s
> Serialize to /dev/null reusing FileChannel: 19157904 iterations in 30.755s; 135.44632MB/s
>
> Benchmarking benchmarks.GoogleSpeed$SpeedMessage2 with file google_message2.dat
> Serialize to byte string: 46108 iterations in 26.724s; 139.15257MB/s
> Serialize to byte array: 50547 iterations in 28.874s; 141.19029MB/s
> Serialize to memory stream: 48282 iterations in 29.776s; 130.77818MB/s
> Serialize to /dev/null with FileOutputStream: 50505 iterations in 28.799s; 141.44037MB/s
> Serialize to /dev/null reusing FileOutputStream: 51478 iterations in 30.064s; 138.09926MB/s
> Serialize to /dev/null with FileChannel: 51328 iterations in 29.668s; 139.53477MB/s
> Serialize to /dev/null reusing FileChannel: 48454 iterations in 27.46s; 142.31332MB/s
>
> OPTIMIZED
> Benchmarking benchmarks.GoogleSpeed$SpeedMessage1 with file google_message1.dat
> Serialize to byte string: 24207218 iterations in 29.098s; 180.89088MB/s
> Serialize to byte array: 24480373 iterations in 29.937s; 177.8053MB/s
> Serialize to memory stream: 22928046 iterations in 30.515s; 163.37613MB/s
> Serialize to /dev/null with FileOutputStream: 20242779 iterations in 29.626s; 148.57033MB/s
> Serialize to /dev/null reusing FileOutputStream: 19803135 iterations in 27.7s; 155.44943MB/s
> Serialize to /dev/null with FileChannel: 25135661 iterations in 34.242s; 159.61221MB/s
> Serialize to /dev/null reusing FileChannel: 22421439 iterations in 29.61s; 164.64934MB/s
>
> Benchmarking benchmarks.GoogleSpeed$SpeedMessage2 with file google_message2.dat
> Serialize to byte string: 58071 iterations in 29.694s; 157.72736MB/s
> Serialize to byte array: 56888 iterations in 29.112s; 157.60321MB/s
> Serialize to memory stream: 53171 iterations in 29.709s; 144.34547MB/s
> Serialize to /dev/null with FileOutputStream: 58154 iterations in 29.968s; 156.5086MB/s
> Serialize to /dev/null reusing FileOutputStream: 57880 iterations in 29.779s; 156.75984MB/s
> Serialize to /dev/null with FileChannel: 55803 iterations in 28.881s; 155.83382MB/s
> Serialize to /dev/null reusing FileChannel: 59563 iterations in 30.668s; 156.64175MB/s
>
> Size messages:
>
> ORIGINAL
> Benchmarking benchmarks.GoogleSize$SizeMessage1 with file google_message1.dat
> Serialize to byte string: 2789755 iterations in 29.686s; 20.433807MB/s
> Serialize to byte array: 2748801 iterations in 29.597s; 20.194382MB/s
> Serialize to memory stream: 2702515 iterations in 28.65s; 20.510603MB/s
> Serialize to /dev/null with FileOutputStream: 2716518 iterations in 29.376s; 20.107351MB/s
> Serialize to /dev/null reusing FileOutputStream: 2507755 iterations in 28.299s; 19.268545MB/s
> Serialize to /dev/null with FileChannel: 2809689 iterations in 31.171s; 19.599386MB/s
> Serialize to /dev/null reusing FileChannel: 2764260 iterations in 29.827s; 20.151354MB/s
>
> Benchmarking benchmarks.GoogleSize$SizeMessage2 with file google_message2.dat
> Serialize to byte string: 6530 iterations in 27.688s; 19.021206MB/s
> Serialize to byte array: 7303 iterations in 30.9s; 19.061596MB/s
> Serialize to memory stream: 6918 iterations in 30.389s; 18.360332MB/s
> Serialize to /dev/null with FileOutputStream: 7154 iterations in 31.094s; 18.556187MB/s
> Serialize to /dev/null reusing FileOutputStream: 6707 iterations in 28.757s; 18.810535MB/s
> Serialize to /dev/null with FileChannel: 6887 iterations in 28.743s; 19.324774MB/s
> Serialize to /dev/null reusing FileChannel: 7373 iterations in 31.919s; 18.629936MB/s
>
> OPTIMIZED
> Benchmarking benchmarks.GoogleSize$SizeMessage1 with file google_message1.dat
> Serialize to byte string: 3432701 iterations in 29.986s; 24.891575MB/s
> Serialize to byte array: 3455325 iterations in 30.373s; 24.73638MB/s
> Serialize to memory stream: 3398582 iterations in 30.742s; 24.038122MB/s
> Serialize to /dev/null with FileOutputStream: 2932259 iterations in 28.331s; 22.504812MB/s
> Serialize to /dev/null reusing FileOutputStream: 2779893 iterations in 26.785s; 22.566872MB/s
> Serialize to /dev/null with FileChannel: 3129454 iterations in 28.526s; 23.854078MB/s
> Serialize to /dev/null reusing FileChannel: 3183935 iterations in 28.779s; 24.056MB/s
>
> Benchmarking benchmarks.GoogleSize$SizeMessage2 with file google_message2.dat
> Serialize to byte string: 6497 iterations in 26.656s; 19.657772MB/s
> Serialize to byte array: 7231 iterations in 29.827s; 19.552631MB/s
> Serialize to memory stream: 6643 iterations in 27.582s; 19.424726MB/s
> Serialize to /dev/null with FileOutputStream: 7078 iterations in 27.844s; 20.501957MB/s
> Serialize to /dev/null reusing FileOutputStream: 7434 iterations in 30.969s; 19.360287MB/s
> Serialize to /dev/null with FileChannel: 6988 iterations in 29.144s; 19.338385MB/s
> Serialize to /dev/null reusing FileChannel: 7279 iterations in 30.338s; 19.3509MB/s
> Deserialize from byte string: 5254 iterations in 29.942s; 14.152257MB/s
> Deserialize from byte array: 5429 iterations in 30.481s; 14.3650465MB/s
> Deserialize from memory stream: 6156 iterations in 32.337s; 15.353779MB/s
>
> --
> Evan Jones
> http://evanjones.ca/
>
> --
> You received this message because you are subscribed to the Google Groups "Protocol Buffers" group.
> To post to this group, send email to proto...@googlegroups.com.
> To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.
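
For anyone following the thread, the allocation-saving idea in the quoted message (encode into a buffer that is kept around between calls, rather than allocating fresh arrays for every string) can be sketched roughly as below. This is only an illustration built on the standard java.nio.charset.CharsetEncoder API. The class name ReusableUtf8Encoder and its methods are hypothetical, and this is not the FastStringEncoder code from the patch, which per the message contains its own UTF-8 encoding code.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: NOT the FastStringEncoder class from the patch.
final class ReusableUtf8Encoder {
    // One encoder per instance; REPLACE mirrors what String.getBytes does
    // with malformed input such as an unpaired surrogate.
    private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);

    // Output buffer kept between calls; it grows only when a string needs more room.
    private ByteBuffer out = ByteBuffer.allocate(64);

    /**
     * Encodes s as UTF-8 into the internal buffer and returns the number of
     * bytes written. The bytes stay valid only until the next call.
     */
    int encode(String s) {
        // UTF-8 needs at most 3 bytes per UTF-16 char (a surrogate pair is
        // 2 chars and encodes to 4 bytes), so this bound is always enough.
        int maxBytes = s.length() * 3;
        if (out.capacity() < maxBytes) {
            out = ByteBuffer.allocate(maxBytes);
        }
        out.clear();
        encoder.reset();
        // CharBuffer.wrap creates a small view object; it does not copy the chars.
        CoderResult result = encoder.encode(CharBuffer.wrap(s), out, true);
        if (result.isUnderflow()) {
            result = encoder.flush(out);
        }
        if (!result.isUnderflow()) {
            throw new IllegalStateException("unexpected coder result: " + result);
        }
        return out.position();
    }

    /** Backing array; only the first encode(...) bytes are meaningful. */
    byte[] buffer() {
        return out.array();
    }

    public static void main(String[] args) {
        ReusableUtf8Encoder enc = new ReusableUtf8Encoder();
        for (String s : new String[] {"hello", "h\u00e9llo", "\u20ac"}) {
            System.out.println(s.length() + " chars -> " + enc.encode(s) + " bytes");
        }
    }
}
```

A caller would copy the first `encode(s)` bytes of `buffer()` into its output stream before encoding the next string. Whether this approach beats `String.getBytes` on a given JVM is something to measure with the benchmarks rather than assume.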