Re: [protobuf] Message limit
Thanks folks, that was very useful. Right now I have a sequence of messages since we're processing serially. RecordIO seems like a great idea. Is the "framing format" just multiple messages in a file with an inverted index at the beginning?

- Delip

On Tue, Jan 12, 2010 at 2:35 PM, Kenton Varda wrote:
> So to rephrase what I said: you should break up your message into multiple pieces that you store / send one at a time. Usually very large messages are actually lists of smaller messages, so instead of using one big repeated field, store each message separately. When storing to a file, it's probably advantageous to use a "framing" format that lets you store multiple "records" such that you can seek to any particular record quickly -- a large repeated field doesn't provide this anyway, so you need something else (we have some code internally that we call RecordIO).
> BTW, we would love to open source the libraries I mentioned; it's just a matter of finding the time to get it done.
>
> On Tue, Jan 12, 2010 at 11:29 AM, Kenton Varda wrote:
>> Dang it, I got my mailing lists mixed up and referred to some things we haven't released open source. Sigh.
>>
>> On Tue, Jan 12, 2010 at 11:28 AM, Kenton Varda wrote:
>>> But you should consider a design that doesn't require you to send enormous messages. Protocol buffers are not well-optimized for this sort of use. For data stored on disk, consider storing multiple records in a RecordIO file. For data passed over Stubby, consider streaming it in multiple pieces.
>>>
>>> On Tue, Jan 12, 2010 at 9:40 AM, Jason Hsueh wrote:
>>>> The limit applies to the data source from which a message is parsed. So if you want to parse a serialization of Foo, it applies to Foo. But if you parse a bunch of Bar messages one by one, and add them individually to Foo, then the limit only applies to each individual Bar.
You can change the limit in your code if you create your own CodedInputStream and call its SetTotalBytesLimit method in C++, or its Java equivalent, setSizeLimit.

On Tue, Jan 12, 2010 at 8:41 AM, Delip Rao wrote:
> Hi,
>
> I'm trying to understand protobuf message size limits. Is the 64M message limit fixed, or can it be changed via some compile option? If I have a message Foo defined as:
>
>   message Foo {
>     repeated Bar bars = 1;
>   }
>
> will the limit apply to Foo or just to the individual Bars?
>
> Thanks,
> Delip

--
You received this message because you are subscribed to the Google Groups "Protocol Buffers" group. To post to this group, send email to proto...@googlegroups.com. To unsubscribe from this group, send email to protobuf+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/protobuf?hl=en.
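[Editor's note: RecordIO itself was never open-sourced, so as a rough illustration of what a "framing" format buys you, here is a minimal sketch that stores each record behind a 4-byte length prefix. The helper names (`write_record`, `read_records`) are made up for illustration, and the payloads stand in for `message.SerializeToString()` output; this is not RecordIO's actual layout.]

```python
# Minimal "framing" sketch: each record is a 4-byte little-endian
# length followed by the record's bytes. Because each record is parsed
# on its own, the per-message size limit applies per record, not to
# the whole file.
import io
import struct

def write_record(stream, payload: bytes) -> None:
    stream.write(struct.pack("<I", len(payload)))
    stream.write(payload)

def read_records(stream):
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # clean end of stream
        (size,) = struct.unpack("<I", header)
        yield stream.read(size)

buf = io.BytesIO()
for payload in (b"first", b"second", b"third"):
    write_record(buf, payload)
buf.seek(0)
print(list(read_records(buf)))  # [b'first', b'second', b'third']
```

Recording each record's start offset while writing would additionally let a reader seek straight to any record, which is the random-access property Kenton mentions.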
[protobuf] Re: Issue 122 in protobuf: Two test failures on Windows
Comment #14 on issue 122 by briford.wylie: Two test failures on Windows
http://code.google.com/p/protobuf/issues/detail?id=122

Hi, thanks for the tip. Worked fine.
[protobuf] In python - How would I send and receive a PB in the POST payload of http request?
I'm fairly new to Python and very new to protocol buffers. Any pointers in the right direction would be helpful.

thx
Re: [protobuf] In python - How would I send and receive a PB in the POST payload of http request?
You'd need to use a separate HTTP library for that. Protobuf itself doesn't provide HTTP integration, but once you have the bytes from the HTTP payload you can use protobuf to parse them.

On Wed, Jan 13, 2010 at 10:04 AM, Rich wrote:
> I'm fairly new to python and very new to protocol buffers. Any points
> in the right direction would be helpful.
>
> thx
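[Editor's note: since any HTTP library will do, here is a self-contained sketch of the round trip using only the Python standard library. The echo handler, the URL, and the payload are all hypothetical; the raw bytes stand in for a real message's `SerializeToString()` output, and on the receiving side you would hand the request body to `ParseFromString()`.]

```python
# Sketch: round-tripping raw bytes (a stand-in for a serialized
# protobuf message) through an HTTP POST, using an in-process server.
import http.server
import threading
import urllib.request

class EchoHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        body = self.rfile.read(length)  # here you would ParseFromString(body)
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

payload = b"\x08\x96\x01"  # e.g. message.SerializeToString()
req = urllib.request.Request(
    "http://127.0.0.1:%d/" % server.server_port, data=payload,
    headers={"Content-Type": "application/x-protobuf"})
with urllib.request.urlopen(req) as resp:
    echoed = resp.read()
print(echoed == payload)  # True
server.shutdown()
```

The key point from Kenton's answer: protobuf only sees bytes, so the HTTP layer's job is just to deliver the body intact (an opaque content type such as application/x-protobuf or application/octet-stream is a common convention, not something protobuf mandates).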
Re: [protobuf] Message limit
I actually don't know how the format works, but that is certainly one possibility.

On Wed, Jan 13, 2010 at 5:19 AM, Delip Rao wrote:
> Thanks folks, that was very useful. Right now I have a sequence of
> messages since we're processing serially. RecordIO seems like a great
> idea. Is the "framing format" just multiple messages in a file with an
> inverted index at the beginning?
>
> - Delip
[protobuf] STL_HASH.m4
Hello guys,

I see that Google Protocol Buffers now supports unordered_map via a new modification in hash.h, but I am confused about where exactly stl_hash.m4 looks for unordered_map by default. Can we make it look in a different directory? The xlc compiler on AIX is installed under XYZ/vacpp/include, which is different from the default /usr/include directory.

I tried to run m4 with stl_hash.m4 as input and XYZ/vacpp/include as the include directory, but it failed, saying "end quote is not provided". Is there any way I can make stl_hash.m4 look in an include directory other than /usr/include?

Thanks & Regards,
Vikram
Re: [protobuf] STL_HASH.m4
stl_hash.m4 should automatically look in whatever directory your compiler uses. If for some reason your compiler does not automatically look in the directory you want, then you should add the proper CXXFLAGS to make it look there, e.g.:

  ./configure CXXFLAGS=-I/XYZ/vacpp/include

(-I is GCC's flag for this; your compiler's may be different.)
[protobuf] How can I reset a FileInputStream?
Hello Kenton,

currently I have the following problem: I have a very big file with many small messages serialized with protobuf. Each message contains its own separator and thus can be found even in an unsynchronized stream. I move through this file using lseek64, because FileInputStream::Skip only works in the forward direction and FileInputStream::BackUp can move back only up to the current buffer boundary. Since I am the owner of the file descriptor also used by FileInputStream, I can seek to any position in the file. However, after seeking, my FileInputStream is obviously in an unusable state and has to be reset.

Currently the only feasible solution is to replace the current FileInputStream object with a new one -- which somehow seems quite inefficient! Wouldn't it make sense to add a member function which resets a FileInputStream to the state of a freshly opened and repositioned file descriptor? Or is there any other solution for randomly accessing the raw content of the file, say by wrapping seek?

Regards,
Jacob
Re: [protobuf] How can I reset a FileInputStream?
On Wed, Jan 13, 2010 at 3:02 PM, Jacob Rief wrote:
> Currently the only feasible solution is to replace the current
> FileInputStream object by a new one - which, somehow is quite
> inefficient!

What makes you think it is inefficient? It does mean the buffer has to be re-allocated, but with a decent malloc implementation that shouldn't take long. Certainly the actual reading from the file would take longer. Have you seen performance problems with this approach?

> Wouldn't it make sense to add a member function which resets a
> FileInputStream to the state of a natively opened and repositioned
> file descriptor?

If there really is a performance problem with allocating new objects, then sure.
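[Editor's note: the pattern Kenton endorses -- reposition the descriptor, then simply build a fresh stream object around it -- can be illustrated with a plain-Python analogue. The length-prefixed record format and the helper `read_record_at` are hypothetical; this is not protobuf's actual FileInputStream API, just a sketch of why a fresh wrapper per seek is cheap relative to the I/O it performs.]

```python
# Write length-prefixed records, remembering each record's byte offset,
# then read one back by seeking and wrapping the file afresh -- the
# analogue of discarding an old FileInputStream and constructing a new
# one around the repositioned file descriptor.
import os
import struct
import tempfile

path = os.path.join(tempfile.mkdtemp(), "records.bin")
offsets = []
with open(path, "wb") as f:
    for payload in (b"alpha", b"beta", b"gamma"):
        offsets.append(f.tell())
        f.write(struct.pack("<I", len(payload)))
        f.write(payload)

def read_record_at(path: str, offset: int) -> bytes:
    # A fresh file object per random read: the allocation cost is tiny
    # compared to the disk read it wraps.
    with open(path, "rb") as f:
        f.seek(offset)
        (size,) = struct.unpack("<I", f.read(4))
        return f.read(size)

print(read_record_at(path, offsets[2]))  # b'gamma'
```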
Re: [protobuf] Re: WriteDelimited/parseDelimited in python
(I have this on the back burner as I'm kind of swamped, but I do want to get this submitted at some point, hopefully within a week.)

On Tue, Jan 5, 2010 at 3:57 AM, Graham Cox wrote:
> I was saying the user *could* do that, and that it's currently what I'm doing in my server-side code. The reason being, as you said, if you naively read from a stream and the message isn't all present, then you need to block until it is with the way that the Java code works at present. If you are using it for client-side code then likely this is not an issue in the slightest, but a server that needs to handle many clients at once just cannot block on one of them...
>
> As to your other alternative, (a), I would suggest that this leaves too much of the underlying network protocol bare to the caller. This will make it very difficult to change the way that delimiting messages happens in the future, should such a thing be required. If, for example, it is decided to go from having the length prefixed to having a special delimiting sequence after the message, then all current calling code will need to be changed. It might be that this is considered a low enough level library that this is acceptable, but that would be a Google decision...
>
> One more alternative would be how the asn1c library works for parsing ASN.1 streams into objects, which is to be resumable. The decoder reads all the data it is given and tries to build the object from it. If it doesn't have enough data yet, then it does what it can, remembers where it got to, and returns to the user, who can then supply more data when it becomes available. If the entire message does parse from the data provided, then it returns the amount of data consumed so that the user can discard it (reading from the stream directly makes this slightly cleaner still).
> At present, the protobuf libraries (any of them) cannot support this method of decoding an object, and it is not a trivial change to make it possible, but it does, IMO, give a much cleaner and easier-to-use interface.
>
> --
> Graham Cox
>
> On Tue, Jan 5, 2010 at 1:32 AM, Kenton Varda wrote:
>> Make sure to "reply all" so that the group is CC'd.
>>
>> So you are saying that the user should read whatever data is on the socket, then attempt to parse it, and if it fails, assume that it's because there is more data to read? Seems rather wasteful. I think what we ideally want is either:
>> (a) Provide a way for the caller to read the size independently, so that they can then make sure to read that many bytes from the input before parsing.
>> (b) Provide a method that reads from a stream, so that the protobuf library can automatically take care of reading all necessary bytes.
>>
>> Option (b) is obviously cleaner but has a few problems:
>> - We have to choose a particular stream interface to support. While the Python "file-like" interface is pretty common, I'm not sure if it's universal for this kind of task.
>> - If not all bytes of the message are available yet, we'd have to block. This might be fine most of the time, but would be unacceptable for some uses.
>>
>> Thoughts?
>>
>> On Mon, Jan 4, 2010 at 3:09 PM, Graham Cox wrote:
>>> I'm using it for reading/writing to sockets in my functional tests -- works well enough there...
>>> In my Java-side server code, I read from the socket into a byte buffer, then deserialize the byte buffer into protobuf objects, throwing away the data that has been deserialized.
>>> The Python "MergeDelimitedFromString" function also returns the number of bytes that were processed to build up the protobuf object, so the user could easily do the same -- read the socket onto the end of a buffer, and then, while the buffer is successfully deserializing into objects, throw away the first x bytes as appropriate...
>>>
>>> Just a thought :)
>>>
>>> On Mon, Jan 4, 2010 at 9:57 PM, Kenton Varda wrote:
>>>> Hmm, it occurs to me that this currently is not useful for reading from a socket or similar stream, since the caller has to make sure to read an entire message before trying to parse it, but the caller doesn't actually know how long the message is (because the code that determines this is encapsulated). Any thoughts on this?
>>>>
>>>> On Mon, Jan 4, 2010 at 12:11 PM, Kenton Varda wrote:
>>>>> Mostly looks good. There are some style issues (e.g. lines over 80 chars) but I can clean those up myself.
>>>>>
>>>>> You'll need to sign the contributor license agreement:
>>>>>
>>>>> http://code.google.com/legal/individual-cla-v1.0.html -- if you own copyright on this change.
>>>>> http://code.google.com/legal/corporate-cla-v1.0.html -- if your employer does.
>>>>>
>>>>> Please let me know after you've done this and then I can submit these.
>>>>>
>>>>> On Fri, Jan 1, 2010 at 12:53 PM, Graham wrote:
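[Editor's note: the delimited format under discussion prefixes each message with its serialized size encoded as a base-128 varint, which is what Java's writeDelimitedTo/parseDelimitedFrom emit. Here is a pure-Python sketch of that framing, with raw byte strings standing in for real serialized messages; the helper names are made up for illustration.]

```python
# Sketch of varint-delimited framing: each payload is prefixed with
# its length as a base-128 varint (low 7 bits per byte, high bit set
# on all but the last byte).
import io

def encode_varint(n: int) -> bytes:
    out = bytearray()
    while True:
        bits = n & 0x7F
        n >>= 7
        out.append(bits | (0x80 if n else 0))
        if not n:
            return bytes(out)

def write_delimited(stream, payload: bytes) -> None:
    stream.write(encode_varint(len(payload)))
    stream.write(payload)

def read_delimited(stream):
    """Read one length-prefixed payload, or None at end of stream."""
    shift, size = 0, 0
    while True:
        byte = stream.read(1)
        if not byte:
            return None  # clean end of stream
        size |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            break
        shift += 7
    return stream.read(size)

buf = io.BytesIO()
write_delimited(buf, b"x" * 200)  # a length of 200 needs a 2-byte varint
write_delimited(buf, b"hello")
buf.seek(0)
print(len(read_delimited(buf)), read_delimited(buf))  # 200 b'hello'
```

Reading the varint byte-by-byte is exactly the part Kenton's option (a) would expose to callers: once the length is known, a non-blocking server can wait until that many payload bytes have arrived before attempting a parse.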