Any suggestions? Experiences?

regards,
Alok

On Jan 11, 1:16 pm, alok <alok.jad...@gmail.com> wrote:
> My point is: should I have one message, something like
>
> message Record {
>   required HeaderMessage header = 1;
>   optional TradeMessage trade = 2;
>   repeated QuoteMessage quotes = 3;   // 0 or more
>   repeated CustomMessage customs = 4; // 0 or more
>
> }
>
> or should I rather keep my file as a plain sequence of
> object type, object, object type, object, ...
> without worrying about the concept of a record?
>
> Each message in the file is usually a header plus one type of message
> (trade, quote or custom), and mostly only 1 quote or custom message,
> not more.
>
> Which would be faster to decode?
>
> Regards,
> Alok
>
> On Jan 11, 12:41 pm, alok <alok.jad...@gmail.com> wrote:
>
> > Hi everyone,
>
> > My program takes longer to read binary files than text files. I
> > think the issue is with the structure of the binary files that I
> > have designed. (Or is it possible that binary decoding is slower
> > than text file parsing?)
>
> > The data file is a large text file with 1 record per row, up to
> > 1.2 GB. The binary file is around 900 MB.
>
> >  - Reading the text file takes 3 minutes.
> >  - Reading the binary file takes 5 minutes.
>
> > I saw some very strange behavior.
> >  - Just to see how long it takes to skim through the binary file, I
> > started reading the header on each message, which holds the length
> > of the message, and then skipped that many bytes using the Skip()
> > function of the coded_input object. After making this change, I
> > expected reading through the file to take less time, but it took
> > more than 10 minutes. Is skipping not the same as adding n bytes to
> > the file pointer? Is it slower to skip an object than to read it?
>
> > Are there any guidelines on how the structure should be designed to
> > get the best performance?
>
> > My current structure is shown below:
>
> > message HeaderMessage {
> >   required double timestamp = 1;
> >   required string ric_code = 2;
> >   required int32 count = 3;
> >   required int32 total_message_size = 4;
>
> > }
>
> > message QuoteMessage {
> >   enum Side {
> >     ASK = 0;
> >     BID = 1;
> >   }
> >   required Side type = 1;
> >   required int32 level = 2;
> >   optional double price = 3;
> >   optional int64 size = 4;
> >   optional int32 count = 5;
> >   optional HeaderMessage header = 6;
>
> > }
>
> > message CustomMessage {
> >   required string field_name = 1;
> >   required double value = 2;
> >   optional HeaderMessage header = 3;
>
> > }
>
> > message TradeMessage {
> >   optional double price = 1;
> >   optional int64 size = 2;
> >   optional int64 AccumulatedVolume = 3;
> >   optional HeaderMessage header = 4;
>
> > }
>
> > The binary file format is
> > object type, object, object type, object, ...
>
> > The 1st object of a record holds the header, which gives the number
> > of objects n in that record. The next n-1 objects do not hold a
> > header, since they all belong to the same record (same update time).
> > The (n+1)th object belongs to a new record and holds the header for
> > that record.
>
> > Any advice?
>
> > Regards,
> > Alok
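
On the Skip() question in the quoted post: here is a rough sketch of what "skipping by length" looks like when the skip really is just a seek. It does not use the protobuf library at all; it is plain C++ standard library, and the frame layout (1-byte type, varint length, payload) and the names WriteFrame and CountFramesOfType are made up for illustration. If a loop like this is fast on your file but the coded_input version is slow, that would suggest the cost is in how the underlying stream implements skipping rather than in the file layout, though I have not measured it.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical framing, for illustration only:
//   [1-byte object type][varint payload length][payload bytes]

// Write an unsigned value as a base-128 varint (7 bits per byte,
// high bit set on every byte except the last).
void WriteVarint(std::ostream& out, uint64_t v) {
  while (v >= 0x80) {
    out.put(static_cast<char>((v & 0x7F) | 0x80));
    v >>= 7;
  }
  out.put(static_cast<char>(v));
}

// Read a varint. Returns false on EOF or if it runs past 10 bytes.
bool ReadVarint(std::istream& in, uint64_t* v) {
  *v = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    int c = in.get();
    if (c == std::istream::traits_type::eof()) return false;
    *v |= static_cast<uint64_t>(c & 0x7F) << shift;
    if ((c & 0x80) == 0) return true;
  }
  return false;
}

// Append one frame to the stream.
void WriteFrame(std::ostream& out, uint8_t type, const std::string& payload) {
  out.put(static_cast<char>(type));
  WriteVarint(out, payload.size());
  out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
}

// Walk the stream and count frames of one type. Payloads are never
// read: each one is skipped by moving the read position forward.
int CountFramesOfType(std::istream& in, uint8_t wanted) {
  int count = 0;
  for (;;) {
    int type = in.get();
    if (type == std::istream::traits_type::eof()) break;
    uint64_t len = 0;
    if (!ReadVarint(in, &len)) break;
    in.seekg(static_cast<std::streamoff>(len), std::ios::cur);
    if (!in) break;  // truncated frame: the seek went past the end
    if (type == wanted) ++count;
  }
  return count;
}
```

The same function should work on a std::ifstream opened with std::ios::binary in place of an in-memory stream, which makes it easy to time against the real file.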

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.