OK.  If your message composition (or parsing, on the receiving end) takes a
lot of time, you might look into how much of that is due to memory
allocation.  It is usually a significant fraction.  Two good ways
to reduce it:
1) If your app builds many messages over time and most of them have roughly
the same "shape" (i.e. which fields are set, the sizes of repeated fields,
etc. are usually similar), then clear and reuse the same message
object rather than allocating a new one each time.  The cleared object keeps
the memory it already allocated, so later messages avoid that allocation.

2) Use tcmalloc:
It is often faster than your system's malloc, particularly for
multi-threaded C++ apps.  All C++ servers at Google use this.
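
A minimal sketch of the reuse pattern from point 1, assuming a hand-written
stand-in rather than generated protobuf code: Neighbor, Batch and
CountGrowths are made-up names for illustration, and a std::vector plays the
role of the repeated field.  It only counts how often the backing storage has
to grow when one object is cleared and reused versus when a new object is
created for every batch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for generated message classes (not protobuf API).
struct Neighbor {
    std::string id;
    float confidence;
};

struct Batch {
    std::vector<Neighbor> neighbors;
    // Mimics the clear-and-reuse idea: drops the contents, keeps the storage.
    void Clear() { neighbors.clear(); }
};

// Builds `rounds` batches of `per_round` items and returns how many times
// the vector's backing storage had to grow (i.e. reallocate).
std::size_t CountGrowths(std::size_t rounds, std::size_t per_round, bool reuse) {
    std::size_t growths = 0;
    Batch reused;  // lives across all rounds
    for (std::size_t r = 0; r < rounds; ++r) {
        Batch fresh;  // a brand-new object every round
        Batch& batch = reuse ? reused : fresh;
        batch.Clear();
        assert(batch.neighbors.empty());
        for (std::size_t i = 0; i < per_round; ++i) {
            const std::size_t before = batch.neighbors.capacity();
            batch.neighbors.push_back(Neighbor{std::string(16, 'x'), 0.5f});
            if (batch.neighbors.capacity() != before) ++growths;
        }
    }
    return growths;
}
```

With reuse, the storage only grows while the first batch is filled, so
CountGrowths(10, 1000, true) should match the count for a single round; with
reuse set to false the same growth is repeated in every round.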

On Mon, Jul 13, 2009 at 11:50 PM, Alex Black <a...@alexblack.ca> wrote:

> Kenton: I made a mistake with these numbers - pls ignore them - I'll
> revisit tomorrow.
> Thx.
> -----Original Message-----
> From: protobuf@googlegroups.com [mailto:proto...@googlegroups.com] On
> Behalf Of Alex Black
> Sent: Tuesday, July 14, 2009 2:05 AM
> To: Protocol Buffers
> Subject: Re: Performance: Sending a message with ~150k items, approx 3.3mb,
> can I do better than 100ms?
> ok, I took I/O out of the picture by serializing each message into a
> pre-allocated buffer, and this time I did a more thorough measurement.
> Benchmark 1: Complete scenario
> - average time 262ms (100 runs)
> Benchmark 2: Same as # 1 but no IO
> - average time 250ms (100 runs)
> Benchmark 3: Same as 2 but with serialization commented out
> - average time 251ms (100 runs)
> Benchmark 4: Same as 3 but with message composition commented out too (no
> protobuf calls)
> - average time 185 ms (100 runs)
> So from this I conclude:
> - My initial #s were wrong
> - My timings vary too much for each run to really get accurate averages
> - IO takes about 10ms
> - Serialization takes ~0ms
> - Message composition and setting of fields takes ~66ms
> My message composition is in a loop, the part in the loop looks like:
>                        uuid_t relatedVertexId;
>                        myProto::IdConfidence* neighborIdConfidence =
>                            pNodeWithNeighbors->add_neighbors();
>                        // Set the vertex id
>                        neighborIdConfidence->set_id(
>                            (const void*) relatedVertexId, 16);
>                        // set the confidence
>                        neighborIdConfidence->set_confidence( confidence );
>                        currentBatchSize++;
>                        if ( currentBatchSize == BatchSize )
>                        {
>                                // Flush out this batch
>                                //stream << getNeighborsResponse;
>                                getNeighborsResponse.Clear();
>                                currentBatchSize = 0;
>                        }
> On Jul 14, 1:27 am, Kenton Varda <ken...@google.com> wrote:
> > Oh, I didn't even know you were including composition in there.  My
> > benchmarks are only for serialization of already-composed messages.
> > But this still doesn't tell us how much time is spent on network I/O vs.
> > protobuf serialization.  My guess is that once you factor that out,
> > your performance is pretty close to the benchmarks.
> >
> > On Mon, Jul 13, 2009 at 10:11 PM, Alex Black <a...@alexblack.ca> wrote:
> >
> > > If I comment out the actual serialization and sending of the message
> > > (so I am just composing messages, and clearing them each batch) then
> > > the 100ms drops to about 50ms.
> >
> > > On Jul 14, 12:36 am, Alex Black <a...@alexblack.ca> wrote:
> > > > > I'm sending a message with ~150k repeated items in it, total
> > > > > size is about 3.3mb, and it's taking me about 100ms to serialize it
> > > > > and send it out.
> >
> > > > Can I expect to do any better than this? What could I look into to
> > > > improve this?
> > > > - I have "option optimize_for = SPEED;" set in my proto file
> > > > - I'm compiling with -O3
> > > > - I'm sending my message in batches of 1000
> > > > - I'm using C++, on ubuntu, x64
> > > > > - I'm testing all on one machine (e.g. client and server are on
> > > > > one machine)
> >
> > > > My message looks like:
> >
> > > > > message NodeWithNeighbors
> > > > > {
> > > > >         required Id nodeId = 1;
> > > > >         repeated IdConfidence neighbors = 2;
> > > > > }
> > >
> > > > > message GetNeighborsResponse
> > > > > {
> > > > >         repeated NodeWithNeighbors nodesWithNeighbors = 1;
> > > > > }
> > >
> > > > > message IdConfidence
> > > > > {
> > > > >         required bytes id = 1;
> > > > >         required float confidence = 2;
> > > > > }
> >
> > > > Where "bytes id" is used to send 16byte IDs (uuids).
> >
> > > > I'm writing each message (batch) out like this:
> >
> > > >         CodedOutputStream codedOutputStream(&m_ProtoBufStream);
> >
> > > >         // Write out the size of the message
> > > >         codedOutputStream.WriteVarint32(message.ByteSize());
> > > >         // Ask the message to serialize itself to our stream
> > > > adapter,
> > > which
> > > > ultimately calls Write on us
> > > >         // which we then call Write on our composed stream
> > > >         message.SerializeWithCachedSizes(&codedOutputStream);
> >
> > > > > In my stream implementation I'm buffering every 16kb, and calling
> > > > > send on the socket once I have 16kb.
> >
> > > > Thanks!
> >
> > > > - Alex
> >
