So, 172 MB/s for composition + serialization.  Sounds about right.
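That rate checks out against the totals Alex reports below (my arithmetic, not stated explicitly in the thread): composition + serialization took about 213 ms - 191 ms = 22 ms for 3,959,368 bytes, and 3,959,368 bytes / 0.022 s ≈ 180 MB/s, which is about 172 MiB/s.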
On Tue, Jul 14, 2009 at 10:46 AM, Alex Black <a...@alexblack.ca> wrote:

> Thanks for those tips. I am using tcmalloc, and I'm re-using a message
> for each batch, e.g. I fill it up with say 500 items, send it out,
> clear it, re-use it.
>
> Here are my hopefully accurate timings, each done 100 times, averaged:
>
> 1. Baseline (just loops through the data on the server), no protobuf: 191ms
> 2. Compose messages, serialize them, no I/O or deserialization: 213ms
> 3. Same as #2 but with I/O to a dumb Java client: 265ms
> 4. Same as #3 but add Java protobuf deserialization: 323ms
>
> So from this it looks like:
> - composing and serializing the messages takes 22ms
> - sending the data over sockets takes 52ms
> - deserializing the data in Java with protobuf takes 58ms
>
> The amount of data being sent is: 3,959,368 bytes in 158,045 messages
> (composed in batches of 1000).
>
> - Alex
>
> ------------------------------
> From: Kenton Varda [mailto:ken...@google.com]
> Sent: Tuesday, July 14, 2009 3:26 AM
> To: Alex Black
> Cc: Protocol Buffers
> Subject: Re: Performance: Sending a message with ~150k items, approx
> 3.3mb, can I do better than 100ms?
>
> OK.  If your message composition (or parsing, on the receiving end)
> takes a lot of time, you might look into how much of that is due to
> memory allocation.  Usually this is a pretty significant fraction.
> Two good ways to improve that:
>
> 1) If your app builds many messages over time and most of them have
> roughly the same "shape" (i.e. which fields are set, the size of
> repeated fields, etc. are usually similar), then you should clear and
> reuse the same message object rather than allocate a new one each
> time.  This way it will reuse the same memory, avoiding allocation.
>
> 2) Use tcmalloc:
> http://google-perftools.googlecode.com
> It is often faster than your system's malloc, particularly for
> multi-threaded C++ apps.  All C++ servers at Google use this.
>
> On Mon, Jul 13, 2009 at 11:50 PM, Alex Black <a...@alexblack.ca> wrote:
>
>> Kenton: I made a mistake with these numbers - pls ignore them - I'll
>> revisit tomorrow.
>>
>> Thx.
>>
>> -----Original Message-----
>> From: protobuf@googlegroups.com [mailto:proto...@googlegroups.com] On
>> Behalf Of Alex Black
>> Sent: Tuesday, July 14, 2009 2:05 AM
>> To: Protocol Buffers
>> Subject: Re: Performance: Sending a message with ~150k items, approx
>> 3.3mb, can I do better than 100ms?
>>
>> ok, I took I/O out of the picture by serializing each message into a
>> pre-allocated buffer, and this time I did a more thorough measurement.
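A minimal sketch of the two ideas just mentioned: reusing one message object (Kenton's suggestion 1), and serializing into a pre-allocated buffer. This is my own illustration rather than code from the thread; it borrows the thread's IdConfidence message, and the function name and buffer handling are invented.

    #include <string>
    #include <vector>
    #include "myproto.pb.h"  // assumed: generated from the .proto quoted
                             // later in this thread

    // Reuse one message across iterations (Clear() keeps its heap
    // storage) and serialize into a buffer that, once grown, is reused.
    void SerializeItems(int n, std::vector<char>* buffer) {
        myProto::IdConfidence item;        // constructed once, reused
        const std::string id(16, '\0');    // placeholder 16-byte uuid
        size_t offset = 0;
        for (int i = 0; i < n; ++i) {
            item.set_id(id);
            item.set_confidence(0.5f);
            const int size = item.ByteSize();
            if (offset + size > buffer->size())
                buffer->resize(offset + size);  // the only place that allocates
            item.SerializeToArray(&(*buffer)[offset], size);
            offset += size;
            item.Clear();  // does not free: storage is kept for the next pass
        }
    }

In proto2's generated C++ API, Clear() on string and repeated fields keeps the underlying capacity, which is what makes the reuse pay off.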
>>
>> Benchmark 1: Complete scenario
>> - average time 262ms (100 runs)
>>
>> Benchmark 2: Same as #1 but no I/O
>> - average time 250ms (100 runs)
>>
>> Benchmark 3: Same as #2 but with serialization commented out
>> - average time 251ms (100 runs)
>>
>> Benchmark 4: Same as #3 but with message composition commented out
>> too (no protobuf calls)
>> - average time 185ms (100 runs)
>>
>> So from this I conclude:
>> - My initial #s were wrong
>> - My timings vary too much for each run to really get accurate averages
>> - I/O takes about 10ms
>> - Serialization takes ~0ms
>> - Message composition and setting of fields takes ~66ms
>>
>> My message composition is in a loop; the part in the loop looks like:
>>
>>     uuid_t relatedVertexId;
>>
>>     myProto::IdConfidence* neighborIdConfidence =
>>         pNodeWithNeighbors->add_neighbors();
>>
>>     // Set the vertex id
>>     neighborIdConfidence->set_id((const void*) relatedVertexId, 16);
>>     // Set the confidence
>>     neighborIdConfidence->set_confidence( confidence );
>>
>>     currentBatchSize++;
>>
>>     if ( currentBatchSize == BatchSize )
>>     {
>>         // Flush out this batch
>>         //stream << getNeighborsResponse;
>>         getNeighborsResponse.Clear();
>>         currentBatchSize = 0;
>>     }
>>
>> On Jul 14, 1:27 am, Kenton Varda <ken...@google.com> wrote:
>> > Oh, I didn't even know you were including composition in there.  My
>> > benchmarks are only for serialization of already-composed messages.
>> > But this still doesn't tell us how much time is spent on network
>> > I/O vs. protobuf serialization.  My guess is that once you factor
>> > that out, your performance is pretty close to the benchmarks.
>> >
>> > On Mon, Jul 13, 2009 at 10:11 PM, Alex Black <a...@alexblack.ca> wrote:
>> >
>> > > If I comment out the actual serialization and sending of the
>> > > message (so I am just composing messages, and clearing them each
>> > > batch) then the 100ms drops to about 50ms.
>> >
>> > > On Jul 14, 12:36 am, Alex Black <a...@alexblack.ca> wrote:
>> > > > I'm sending a message with ~150k repeated items in it, total
>> > > > size is about 3.3mb, and it's taking me about 100ms to
>> > > > serialize it and send it out.
>> >
>> > > > Can I expect to do any better than this? What could I look into
>> > > > to improve this?
>> > > > - I have "option optimize_for = SPEED;" set in my proto file
>> > > > - I'm compiling with -O3
>> > > > - I'm sending my message in batches of 1000
>> > > > - I'm using C++, on Ubuntu, x64
>> > > > - I'm testing all on one machine (e.g. client and server are on
>> > > >   one machine)
>> >
>> > > > My message looks like:
>> >
>> > > >     message NodeWithNeighbors
>> > > >     {
>> > > >         required Id nodeId = 1;
>> > > >         repeated IdConfidence neighbors = 2;
>> > > >     }
>> >
>> > > >     message GetNeighborsResponse
>> > > >     {
>> > > >         repeated NodeWithNeighbors nodesWithNeighbors = 1;
>> > > >     }
>> >
>> > > >     message IdConfidence
>> > > >     {
>> > > >         required bytes id = 1;
>> > > >         required float confidence = 2;
>> > > >     }
>> >
>> > > > Where "bytes id" is used to send 16-byte IDs (uuids).
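Since those ids travel as raw bytes, filling one item can look like the sketch below. This is my illustration, not thread code: uuid_generate comes from libuuid (the uuid_t in the loop above suggests it is already in use), and the helper name is invented.

    #include <uuid/uuid.h>   // libuuid; uuid_t is unsigned char[16]
    #include "myproto.pb.h"

    // Pack a 16-byte uuid and a confidence into one IdConfidence.
    void FillIdConfidence(myProto::IdConfidence* item, float confidence) {
        uuid_t id;
        uuid_generate(id);               // or copy an existing id
        item->set_id(id, sizeof(id));    // raw 16 bytes, not the 36-char text form
        item->set_confidence(confidence);
    }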
>> > > > I'm writing each message (batch) out like this:
>> >
>> > > >     CodedOutputStream codedOutputStream(&m_ProtoBufStream);
>> >
>> > > >     // Write out the size of the message
>> > > >     codedOutputStream.WriteVarint32(message.ByteSize());
>> > > >     // Ask the message to serialize itself to our stream
>> > > >     // adapter, which ultimately calls Write on us; we then
>> > > >     // call Write on our composed stream
>> > > >     message.SerializeWithCachedSizes(&codedOutputStream);
>> >
>> > > > In my stream implementation I'm buffering every 16kb, and
>> > > > calling send on the socket once I have 16kb.
>> >
>> > > > Thanks!
>> >
>> > > > - Alex
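For reference, that framing (a varint length prefix followed by the payload) can be wrapped in a small helper. The sketch below is mine, not code from the thread: the name WriteDelimited is invented, and it writes to a std::string rather than Alex's custom stream. Note the ordering, which Alex's snippet also relies on: ByteSize() must run first, because SerializeWithCachedSizes() trusts the sizes that call cached.

    #include <string>
    #include <google/protobuf/io/coded_stream.h>
    #include <google/protobuf/io/zero_copy_stream_impl_lite.h>
    #include <google/protobuf/message_lite.h>

    using google::protobuf::MessageLite;
    using google::protobuf::io::CodedOutputStream;
    using google::protobuf::io::StringOutputStream;

    // Write one length-prefixed message: varint size, then the payload.
    bool WriteDelimited(const MessageLite& message, std::string* out) {
        StringOutputStream raw(out);    // zero-copy adapter over the string
        CodedOutputStream coded(&raw);
        coded.WriteVarint32(message.ByteSize());   // also caches sub-sizes
        message.SerializeWithCachedSizes(&coded);  // reuses those cached sizes
        return !coded.HadError();
    }

On the Java side, the generated writeDelimitedTo / parseDelimitedFrom methods in recent protobuf releases use this same varint framing, so a receiver like the test client above can read batches without hand-rolled size handling.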