Here is the link to a forum post that explains why I have to set the limit.

Excerpt from the link:

"The problem is that CodedInputStream has internal counter of how many
bytes are read so far with the same object.

In my case, there are a lot of small messages saved in the same file.
I do not read them at once and therefore do not care about large
messages, limits. I am safe.

So, the problem can be easily solved by calling:

CodedInputStream input_stream(...);
input_stream.SetTotalBytesLimit(1e9, 9e8);

My use-case is really about storing extremely large number (up to 1e9)
of small messages ~ 10K each. "
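
To make the excerpt concrete, here is a toy model of a reader that keeps one running byte count for its whole lifetime. It uses only the standard library and is not the protobuf implementation; `CountingReader`, `Consume`, `SetTotalLimit` and `MessagesBeforeLimit` are names invented for this sketch, and the limit values in the usage note are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a stream reader that counts every byte it has
// ever handed out, no matter how many separate messages those bytes were.
class CountingReader {
public:
    explicit CountingReader(std::uint64_t total_limit) : limit_(total_limit) {}

    // Pretend to consume n bytes. Fails once the cumulative total would
    // exceed the limit, even if each individual message is tiny.
    bool Consume(std::uint64_t n) {
        if (total_read_ + n > limit_) return false;
        total_read_ += n;
        return true;
    }

    // Raising the limit lets the same object keep reading.
    void SetTotalLimit(std::uint64_t limit) { limit_ = limit; }

    std::uint64_t total_read() const { return total_read_; }

private:
    std::uint64_t limit_;
    std::uint64_t total_read_ = 0;
};

// Consume up to max_messages messages of msg_size bytes each and return
// how many were accepted before the reader refused.
std::uint64_t MessagesBeforeLimit(CountingReader& reader,
                                  std::uint64_t msg_size,
                                  std::uint64_t max_messages) {
    std::uint64_t accepted = 0;
    while (accepted < max_messages && reader.Consume(msg_size)) {
        ++accepted;
    }
    return accepted;
}
```

With an assumed 64 MiB limit and 10 KiB messages, such a reader stops after a few thousand messages even though no single message is large; raising the limit on the same object lets it continue, which is the effect the `SetTotalBytesLimit` call in the excerpt is after.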

My problem is the same as the one described above, so I will have to set
the limits on the CodedInputStream object.
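
For reference, this is roughly the shape of the loop when a single reader is kept alive for the whole file. It is a standard-library sketch only, with a `std::istream` standing in for the coded input object; `Frame`, `ReadFrame`, `ReadAllFrames`, `AppendLittleEndian32` and `kMaxPayloadBytes` are made-up names, and the payload is returned as raw bytes rather than parsed into a protobuf message.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Assumed sanity cap on one object's size, so a corrupt length field
// cannot trigger a huge allocation.
const std::uint32_t kMaxPayloadBytes = 1u << 20;

struct Frame {
    std::uint32_t type = 0;   // object type tag
    std::string payload;      // raw serialized bytes of the object
};

// Read a 32-bit little-endian integer; returns false on EOF or short read.
bool ReadLittleEndian32(std::istream& in, std::uint32_t* value) {
    unsigned char b[4];
    if (!in.read(reinterpret_cast<char*>(b), 4)) return false;
    *value = static_cast<std::uint32_t>(b[0]) |
             (static_cast<std::uint32_t>(b[1]) << 8) |
             (static_cast<std::uint32_t>(b[2]) << 16) |
             (static_cast<std::uint32_t>(b[3]) << 24);
    return true;
}

// Helper for building test data in the same layout.
void AppendLittleEndian32(std::string* out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        out->push_back(static_cast<char>((v >> (8 * i)) & 0xFF));
    }
}

// Read one [type][length][payload] frame from the shared stream.
bool ReadFrame(std::istream& in, Frame* frame) {
    std::uint32_t len = 0;
    if (!ReadLittleEndian32(in, &frame->type)) return false;
    if (!ReadLittleEndian32(in, &len)) return false;
    if (len > kMaxPayloadBytes) return false;
    frame->payload.resize(len);
    return len == 0 || static_cast<bool>(in.read(&frame->payload[0], len));
}

// The stream is created once by the caller and reused for every frame,
// so its buffering is not thrown away between objects.
std::vector<Frame> ReadAllFrames(std::istream& in) {
    std::vector<Frame> frames;
    Frame f;
    while (ReadFrame(in, &f)) frames.push_back(f);
    return frames;
}
```

The point of the sketch is only the structure: the reader is constructed once outside the loop and passed down, and each object is delimited by its stored length instead of by creating a fresh reader per object.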


On Jan 16, 10:26 am, alok <> wrote:
> I was actually doing that initially, but I kept getting an error like
> "Maximum length for a message is reached" (I don't have the exact error
> string at the moment). This was because my input binary file is large
> and it reaches the limit for the coded input very quickly.
> I saw a post on the forum (or maybe on Stack Exchange) which suggested
> that I should create a new coded_input object for each message.
> Otherwise I have to reset the limits on the coded input object. A user
> on that thread said that it is cheap to create and destroy coded_input
> objects, since these objects are not big.
> Anyway, I will try it again by resetting the limits on this object.
> But then, would this be causing the slowness? I will try it and let you
> know the results.
> Regards,
> Alok
> On Jan 16, 9:46 am, Daniel Wright <> wrote:
> > You're making a new CodedInputStream for each message -- I think that gives
> > very poor buffering behavior.  You should just pass coded_input to
> > ReadAllMessages and keep reusing it.
> > Cheers
> > Daniel
> > On Sun, Jan 15, 2012 at 4:41 PM, alok <> wrote:
> > > Daniel,
> > > I am hoping that my code is incorrect, but I am not sure what is wrong
> > > or what is really causing this slowness.
> > > @Henner Zeller, sorry, I forgot to include the object length in the
> > > example above. I do store the object length for each object. I don't
> > > have issues reading all the objects; the code is working fine. I just
> > > want to make the code run faster now.
> > > I am attaching my code here...
> > > The file format is:
> > > File header
> > > Record1, Record2, Record3
> > > Each record contains n objects of the types defined in the proto file.
> > > The 1st object has a header which contains the number of objects in
> > > the record.
> > > <code>
> > > proto file
> > > message HeaderMessage {
> > >   required double timestamp = 1;
> > >   required string ric_code = 2;
> > >   required int32 count = 3;
> > >   required int32 total_message_size = 4;
> > > }
> > > message QuoteMessage {
> > >   enum Side {
> > >     ASK = 0;
> > >     BID = 1;
> > >   }
> > >   required Side type = 1;
> > >   required int32 level = 2;
> > >   optional double price = 3;
> > >   optional int64 size = 4;
> > >   optional int32 count = 5;
> > >   optional HeaderMessage header = 6;
> > > }
> > > message CustomMessage {
> > >   required string field_name = 1;
> > >   required double value = 2;
> > >   optional HeaderMessage header = 3;
> > > }
> > > message TradeMessage {
> > >   optional double price = 1;
> > >   optional int64 size = 2;
> > >   optional int64 AccumulatedVolume = 3;
> > >   optional HeaderMessage header = 4;
> > > }
> > > message AlphaMessage {
> > >   required int32 level = 1;
> > >   required double alpha = 2;
> > >   optional double stddev = 3;
> > >   optional HeaderMessage header = 4;
> > > }
> > > </code>
> > > <code>
> > > Reading records from binary file
> > > bool ReadNextRecord(CodedInputStream *coded_input, stdext::hash_set<std::string> instruments)
> > > {
> > >        uint32 count, objtype, objlen;
> > >        int i;
> > >        int objectsread = 0;
> > >        HeaderMessage *hMsg = NULL;
> > >        TradeMessage tMsg;
> > >        QuoteMessage qMsg;
> > >        CustomMessage cMsg;
> > >        AlphaMessage aMsg;
> > >        while(1)
> > >        {
> > >                if(!coded_input->ReadLittleEndian32(&objtype)) {
> > >                        return false;
> > >                }
> > >                if(!coded_input->ReadLittleEndian32(&objlen)) {
> > >                        return false;
> > >                }
> > >                CodedInputStream::Limit lim = coded_input->PushLimit(objlen);
> > >                switch(objtype)
> > >                {
> > >                case 2:
> > >                        qMsg.ParseFromCodedStream(coded_input);
> > >                        if(qMsg.has_header())
> > >                        {
> > >                                //hMsg =
> > >                                hMsg = new HeaderMessage();
> > >                                hMsg->Clear();
> > >                                hMsg->Swap(qMsg.mutable_header());
> > >                        }
> > >                        objectsread++;
> > >                        break;
> > >                case 3:
> > >                        tMsg.ParseFromCodedStream(coded_input);
> > >                        if(tMsg.has_header())
> > >                        {
> > >                                //hMsg = tMsg.mutable_header();
> > >                                hMsg = new HeaderMessage();
> > >                                hMsg->Clear();
> > >                                hMsg->Swap(tMsg.mutable_header());
> > >                        }
> > >                        objectsread++;
> > >                        break;
> > >                case 4:
> > >                        aMsg.ParseFromCodedStream(coded_input);
> > >                        if(aMsg.has_header())
> > >                        {
> > >                                //hMsg = aMsg.mutable_header();
> > >                                hMsg = new HeaderMessage();
> > >                                hMsg->Clear();
> > >                                hMsg->Swap(aMsg.mutable_header());
> > >                        }
> > >                        objectsread++;
> > >                        break;
> > >                case 5:
> > >                        cMsg.ParseFromCodedStream(coded_input);
> > >                        if(cMsg.has_header())
> > >                        {
> > >                                //hMsg = cMsg.mutable_header();
> > >                                hMsg = new HeaderMessage();
> > >                                hMsg->Clear();
> > >                                hMsg->Swap(cMsg.mutable_header());
> > >                        }
> > >                        objectsread++;
> > >                        break;
> > >                default:
> > >                        cout << "Invalid object type " << objtype << endl;
> > >                        return false;
> > >                        break;
> > >                }
> > >                coded_input->PopLimit(lim);
> > >                if(objectsread == hMsg->count()) break;
> > >        }
> > >        return true;
> > > }
> > > void ReadAllMessages(ZeroCopyInputStream *raw_input, stdext::hash_set<std::string> instruments)
> > > {
> > >        int item_count = 0;
> > >        while(1)
> > >        {
> > >                CodedInputStream in(raw_input);
> > >                if(!ReadNextRecord(&in, instruments))
> > >                        break;
> > >                item_count++;
> > >        }
> > >        cout << "Finished reading file. Total " << item_count << " items read." << endl;
> > > }
> > > int _tmain(int argc, _TCHAR* argv[])
> > > {
> > >        ZeroCopyInputStream *raw_input;
> > >        CodedInputStream *coded_input;
> > >        stdext::hash_set<std::string> instruments;
> > >        string filename = "S:/users/aaj/sandbox/tickdata/bin/hk/2011/2011.01.04.bin";
> > >        int fd = _open(filename.c_str(), _O_BINARY | O_RDONLY);
> > >        if( fd == -1 )
> > >        {
> > >                printf( "Error opening the file. \n" );
> > >                exit( 1 );
> > >        }
> > >        raw_input = new FileInputStream(fd);
> > >        coded_input = new CodedInputStream(raw_input);
> > >        uint32 magic_no;
> > >        coded_input->ReadLittleEndian32(&magic_no);
> > >        cout << "HEADER: " << "\t" << magic_no<<endl;
> > >        cout << "Reading data objects.." << endl;
> > >        delete coded_input;
> > >        cout << td << '\n';
> > >        ReadAllMessages(raw_input, instruments);
> > >        cout << td << '\n';
> > >        delete raw_input;
> > >        _close(fd);
> > >        google::protobuf::ShutdownProtobufLibrary();
> > >        return 0;
> > > }
> > > </code>
> > > On Jan 14, 3:37 am, Henner Zeller <> wrote:
> > > > On Fri, Jan 13, 2012 at 11:22, Daniel Wright <> wrote:
> > > > > It's extremely unlikely that text parsing is faster than binary parsing on
> > > > > pretty much any message.  My guess is that there's something wrong in the
> > > > > way you're reading the binary file -- e.g. no buffering, or possibly a bug
> > > > > where you hand the protobuf library multiple messages concatenated together.
> > > > In particular, the
> > > >    object type, object, object type object ..
> > > > doesn't seem to include headers that describe the length of the
> > > > following message, but such a separator is needed.
> > > > >  It'd be easier to comment if you post the code.
> > > > > Cheers
> > > > > Daniel
> > > > > On Fri, Jan 13, 2012 at 1:22 AM, alok