If you can find a way to make it faster, please send a patch! :)

On Wed, Jul 15, 2009 at 4:46 PM, Alex Black <a...@alexblack.ca> wrote:
> Thanks, yes performance seems really good, though I wouldn't mind seeing
> the Java deserialization be faster.
>
> ------------------------------
> *From:* Kenton Varda [mailto:ken...@google.com]
> *Sent:* Tuesday, July 14, 2009 8:06 PM
> *To:* Alex Black
> *Cc:* protobuf@googlegroups.com
> *Subject:* Re: Performance: Sending a message with ~150k items, approx
> 3.3mb, can I do better than 100ms?
>
> So, 172 MB/s for composition + serialization. Sounds about right.
>
> On Tue, Jul 14, 2009 at 10:46 AM, Alex Black <a...@alexblack.ca> wrote:
>
>> Thanks for those tips. I am using tcmalloc, and I'm re-using the message
>> for each batch, e.g. I fill it up with say 500 items, send it out, clear
>> it, and re-use it.
>>
>> Here are my (hopefully accurate) timings, each done 100 times and averaged:
>>
>> 1. Baseline (just loop through the data on the server), no protobuf: 191 ms
>> 2. Compose messages and serialize them, no I/O or deserialization: 213 ms
>> 3. Same as #2, but with I/O to a dumb Java client: 265 ms
>> 4. Same as #3, but add Java protobuf deserialization: 323 ms
>>
>> So from this it looks like:
>> - composing and serializing the messages takes 22 ms
>> - sending the data over sockets takes 52 ms
>> - deserializing the data in Java with protobuf takes 58 ms
>>
>> The amount of data being sent is 3,959,368 bytes in 158,045 messages
>> (composed in batches of 1000).
>>
>> - Alex
>>
>> ------------------------------
>> *From:* Kenton Varda [mailto:ken...@google.com]
>> *Sent:* Tuesday, July 14, 2009 3:26 AM
>> *To:* Alex Black
>> *Cc:* Protocol Buffers
>> *Subject:* Re: Performance: Sending a message with ~150k items, approx
>> 3.3mb, can I do better than 100ms?
>>
>> OK. If your message composition (or parsing, on the receiving end)
>> takes a lot of time, you might look into how much of that is due to
>> memory allocation. Usually this is a pretty significant fraction.
>> Two good ways to improve that:
>>
>> 1) If your app builds many messages over time and most of them have
>> roughly the same "shape" (i.e. which fields are set, the sizes of
>> repeated fields, etc. are usually similar), then you should clear and
>> reuse the same message object rather than allocate a new one each time.
>> This way it will reuse the same memory, avoiding allocation.
>>
>> 2) Use tcmalloc: http://google-perftools.googlecode.com
>> It is often faster than your system's malloc, particularly for
>> multi-threaded C++ apps. All C++ servers at Google use this.
>>
>> On Mon, Jul 13, 2009 at 11:50 PM, Alex Black <a...@alexblack.ca> wrote:
>>
>>> Kenton: I made a mistake with these numbers - please ignore them - I'll
>>> revisit tomorrow.
>>>
>>> Thx.
>>>
>>> -----Original Message-----
>>> From: protobuf@googlegroups.com [mailto:proto...@googlegroups.com] On
>>> Behalf Of Alex Black
>>> Sent: Tuesday, July 14, 2009 2:05 AM
>>> To: Protocol Buffers
>>> Subject: Re: Performance: Sending a message with ~150k items, approx
>>> 3.3mb, can I do better than 100ms?
>>>
>>> OK, I took I/O out of the picture by serializing each message into a
>>> pre-allocated buffer, and this time I did a more thorough measurement.
>>> Benchmark 1: Complete scenario
>>> - average time 262 ms (100 runs)
>>>
>>> Benchmark 2: Same as #1 but no I/O
>>> - average time 250 ms (100 runs)
>>>
>>> Benchmark 3: Same as #2 but with serialization commented out
>>> - average time 251 ms (100 runs)
>>>
>>> Benchmark 4: Same as #3 but with message composition commented out too
>>> (no protobuf calls)
>>> - average time 185 ms (100 runs)
>>>
>>> So from this I conclude:
>>> - My initial #s were wrong
>>> - My timings vary too much for each run to really get accurate averages
>>> - I/O takes about 10 ms
>>> - Serialization takes ~0 ms
>>> - Message composition and setting of fields takes ~66 ms
>>>
>>> My message composition is in a loop; the part in the loop looks like:
>>>
>>>     uuid_t relatedVertexId;
>>>
>>>     myProto::IdConfidence* neighborIdConfidence =
>>>         pNodeWithNeighbors->add_neighbors();
>>>
>>>     // Set the vertex id
>>>     neighborIdConfidence->set_id((const void*) relatedVertexId, 16);
>>>     // Set the confidence
>>>     neighborIdConfidence->set_confidence(confidence);
>>>
>>>     currentBatchSize++;
>>>
>>>     if (currentBatchSize == BatchSize)
>>>     {
>>>         // Flush out this batch
>>>         //stream << getNeighborsResponse;
>>>         getNeighborsResponse.Clear();
>>>         currentBatchSize = 0;
>>>     }
>>>
>>> On Jul 14, 1:27 am, Kenton Varda <ken...@google.com> wrote:
>>> > Oh, I didn't even know you were including composition in there. My
>>> > benchmarks are only for serialization of already-composed messages.
>>> > But this still doesn't tell us how much time is spent on network I/O
>>> > vs. protobuf serialization. My guess is that once you factor that
>>> > out, your performance is pretty close to the benchmarks.
>>> >
>>> > On Mon, Jul 13, 2009 at 10:11 PM, Alex Black <a...@alexblack.ca> wrote:
>>> >
>>> > > If I comment out the actual serialization and sending of the message
>>> > > (so I am just composing messages, and clearing them each batch),
>>> > > then the 100 ms drops to about 50 ms.
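[Editor's note: Kenton's tip about clearing and reusing the same message object is the same allocation-reuse trick `std::vector` gives you: `clear()` empties the container but keeps its capacity, so later batches reuse the already-allocated buffer. The sketch below is an illustrative stand-in for the batching loop above, not the thread's actual code; protobuf's `Clear()` retains allocated memory in an analogous (though not identical) way.]

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for the protobuf batch message (GetNeighborsResponse):
// a vector of (id, confidence) entries.
using Batch = std::vector<std::pair<int, float>>;

// Compose `total` items in batches of `batch_size`, "flushing" and
// clearing the batch when full. Returns how many times the batch's
// backing storage grew, i.e. how many allocations composing cost.
std::size_t ComposeInBatches(std::size_t total, std::size_t batch_size) {
    Batch batch;
    std::size_t growths = 0;
    std::size_t last_capacity = batch.capacity();
    for (std::size_t i = 0; i < total; ++i) {
        batch.emplace_back(static_cast<int>(i), 0.5f);
        if (batch.capacity() != last_capacity) {
            ++growths;  // the buffer was reallocated
            last_capacity = batch.capacity();
        }
        if (batch.size() == batch_size) {
            // Serialize + send would happen here. clear() then empties
            // the batch but keeps its capacity, so every later batch
            // reuses the same memory with no further allocation.
            batch.clear();
        }
    }
    return growths;
}
```

Composing 5000 items this way costs exactly as many allocations as composing the first 1000: all growth happens while filling the first batch, and every batch after that is allocation-free.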
>>> > > On Jul 14, 12:36 am, Alex Black <a...@alexblack.ca> wrote:
>>> > > > I'm sending a message with ~150k repeated items in it; the total
>>> > > > size is about 3.3 MB, and it's taking me about 100 ms to
>>> > > > serialize it and send it out.
>>> > > >
>>> > > > Can I expect to do any better than this? What could I look into
>>> > > > to improve this?
>>> > > > - I have "option optimize_for = SPEED;" set in my proto file
>>> > > > - I'm compiling with -O3
>>> > > > - I'm sending my message in batches of 1000
>>> > > > - I'm using C++, on Ubuntu, x64
>>> > > > - I'm testing all on one machine (e.g. client and server are on
>>> > > > one machine)
>>> > > >
>>> > > > My messages look like:
>>> > > >
>>> > > >     message NodeWithNeighbors
>>> > > >     {
>>> > > >         required Id nodeId = 1;
>>> > > >         repeated IdConfidence neighbors = 2;
>>> > > >     }
>>> > > >
>>> > > >     message GetNeighborsResponse
>>> > > >     {
>>> > > >         repeated NodeWithNeighbors nodesWithNeighbors = 1;
>>> > > >     }
>>> > > >
>>> > > >     message IdConfidence
>>> > > >     {
>>> > > >         required bytes id = 1;
>>> > > >         required float confidence = 2;
>>> > > >     }
>>> > > >
>>> > > > where "bytes id" is used to send 16-byte IDs (UUIDs).
>>> > > >
>>> > > > I'm writing each message (batch) out like this:
>>> > > >
>>> > > >     CodedOutputStream codedOutputStream(&m_ProtoBufStream);
>>> > > >
>>> > > >     // Write out the size of the message
>>> > > >     codedOutputStream.WriteVarint32(message.ByteSize());
>>> > > >     // Ask the message to serialize itself to our stream
>>> > > >     // adapter, which ultimately calls Write on us, which we
>>> > > >     // then call Write on our composed stream
>>> > > >     message.SerializeWithCachedSizes(&codedOutputStream);
>>> > > >
>>> > > > In my stream implementation I'm buffering every 16 KB, and
>>> > > > calling send on the socket once I have 16 KB.
>>> > > >
>>> > > > Thanks!
>>> > > > - Alex

--
You received this message because you are subscribed to the Google Groups
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/protobuf?hl=en
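[Editor's note: the framing in the original question — `WriteVarint32(message.ByteSize())` followed by the message body — is protobuf's standard base-128 varint length prefix. Real code should use `CodedOutputStream::WriteVarint32` and `CodedInputStream::ReadVarint32`; the hand-rolled sketch below is only to show what the prefix looks like on the wire and what a receiver has to decode. Names are illustrative.]

```cpp
#include <cstdint>
#include <vector>

// Append a base-128 varint: 7 payload bits per byte, least-significant
// group first, high bit set on every byte except the last.
void WriteVarint32(uint32_t value, std::vector<uint8_t>& out) {
    while (value >= 0x80) {
        out.push_back(static_cast<uint8_t>(value) | 0x80);
        value >>= 7;
    }
    out.push_back(static_cast<uint8_t>(value));
}

// Decode the varint starting at `pos`, advancing `pos` past it.
uint32_t ReadVarint32(const std::vector<uint8_t>& in, std::size_t& pos) {
    uint32_t result = 0;
    int shift = 0;
    while (in[pos] & 0x80) {
        result |= static_cast<uint32_t>(in[pos++] & 0x7F) << shift;
        shift += 7;
    }
    result |= static_cast<uint32_t>(in[pos++]) << shift;
    return result;
}
```

For the payload sizes in this thread the prefix overhead is negligible: a full 3,959,368-byte response needs only a 4-byte prefix (values up to 2^21 - 1 fit in 3 bytes, up to 2^28 - 1 in 4).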