Re: [protobuf] Message limit

2010-01-13 Thread Delip Rao
Thanks folks, that was very useful. Right now I have a sequence of
messages since we're processing serially. RecordIO seems like a great
idea. Is the "framing format" just multiple messages in a file with an
inverted index at the beginning?

- Delip

On Tue, Jan 12, 2010 at 2:35 PM, Kenton Varda  wrote:
> So to rephrase what I said:  you should break your message up into multiple
> pieces that you store/send one at a time.  Usually very large messages are
> actually lists of smaller messages, so instead of using one big repeated
> field, store each message separately.  When storing to a file, it's probably
> advantageous to use a "framing" format that lets you store multiple
> "records" such that you can seek to any particular record quickly -- using a
> large repeated field doesn't provide this anyway, so you need something else
> (we have some code internally that we call RecordIO).
> BTW, we would love to open source the libraries I mentioned, it's just a
> matter of finding the time to get it done.
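
To make the idea concrete, here is a minimal length-delimited framing sketch
in C++. This is just one plausible layout, not Google's RecordIO (whose
format is not public): each record is a varint length prefix followed by the
serialized bytes, and the offsets collected at write time form a simple
index a reader can use to seek straight to any record.

  #include <stdint.h>
  #include <string>
  #include <vector>
  #include <google/protobuf/io/coded_stream.h>

  using google::protobuf::io::CodedOutputStream;

  // Append one serialized message as [varint length][payload], recording its
  // starting byte offset so a reader can later seek directly to it.
  void AppendRecord(CodedOutputStream* out, const std::string& serialized,
                    std::vector<long long>* index, long long* offset) {
    index->push_back(*offset);
    uint32_t size = static_cast<uint32_t>(serialized.size());
    out->WriteVarint32(size);
    out->WriteString(serialized);
    *offset += CodedOutputStream::VarintSize32(size) + size;
  }

Because each record is parsed on its own, no single parse ever needs to
approach the 64 MB limit discussed below.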
>
> On Tue, Jan 12, 2010 at 11:29 AM, Kenton Varda  wrote:
>>
>> Dang it, I got my mailing lists mixed up and referred to some things we
>> haven't released open source.  Sigh.
>>
>> On Tue, Jan 12, 2010 at 11:28 AM, Kenton Varda  wrote:
>>>
>>> But you should consider a design that doesn't require you to send
>>> enormous messages.  Protocol buffers are not well-optimized for this sort of
>>> use.  For data stored on disk, consider storing multiple records in a
>>> RecordIO file.  For data passed over Stubby, consider streaming it in
>>> multiple pieces.
>>>
>>> On Tue, Jan 12, 2010 at 9:40 AM, Jason Hsueh  wrote:

 The limit applies to the data source from which a message is parsed. So
 if you want to parse a serialization of Foo, it applies to the whole Foo.
 But if you parse a bunch of Bar messages one by one, and add them
 individually to Foo, then the limit only applies to each individual Bar.
 You can change the limit in your code if you create your own
 CodedInputStream and call its SetTotalBytesLimit method in C++, or its Java
 equivalent, setSizeLimit.
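
For concreteness, a hedged C++ sketch of raising the limit (it assumes the
Foo message from the original question; in the 2.x headers
SetTotalBytesLimit takes the new limit plus a warning threshold):

  #include <fcntl.h>
  #include <unistd.h>
  #include <google/protobuf/io/coded_stream.h>
  #include <google/protobuf/io/zero_copy_stream_impl.h>

  using google::protobuf::io::CodedInputStream;
  using google::protobuf::io::FileInputStream;

  // Parse one large serialized Foo from 'path', allowing up to 256 MB
  // instead of the default 64 MB; a warning is logged past 128 MB.
  bool ParseBigFoo(const char* path, Foo* foo) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return false;
    FileInputStream raw(fd);
    CodedInputStream coded(&raw);
    coded.SetTotalBytesLimit(256 << 20, 128 << 20);
    bool ok = foo->ParseFromCodedStream(&coded);
    close(fd);
    return ok;
  }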

 On Tue, Jan 12, 2010 at 8:41 AM, Delip Rao  wrote:
>
> Hi,
>
> I'm trying to understand protobuf message size limits. Is the 64M
> message limit fixed or can it be changed via some compile option? If I
> have a message Foo defined as:
>
> message Foo {
>  repeated Bar bars = 1;
> }
>
> Will the limit apply to Foo or just the individual Bars?
>
> Thanks,
> Delip




[protobuf] Re: Issue 122 in protobuf: Two test failures on Windows

2010-01-13 Thread protobuf


Comment #14 on issue 122 by briford.wylie: Two test failures on Windows
http://code.google.com/p/protobuf/issues/detail?id=122

Hi, thanks for the tip. Worked fine.

--
You received this message because you are listed in the owner
or CC fields of this issue, or because you starred this issue.
You may adjust your issue notification preferences at:
http://code.google.com/hosting/settings
-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.




[protobuf] In python - How would I send and receive a PB in the POST payload of http request?

2010-01-13 Thread Rich
I'm fairly new to Python and very new to protocol buffers.  Any pointers
in the right direction would be helpful.

thx




Re: [protobuf] In python - How would I send and receive a PB in the POST payload of http request?

2010-01-13 Thread Kenton Varda
You'd need to use a separate HTTP library for that.  Protobuf itself doesn't
provide HTTP integration, but once you have the bytes from the HTTP payload
you can use protobuf to parse them.
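
The question asks about Python, but the flow is the same in any language:
serialize the message to bytes, send the bytes as the POST body, and parse
the response body back into a message. As an illustration only, here is that
shape sketched in C++ with libcurl (the URL and the UserRecord message type
are invented for the example; in Python any HTTP library works the same way,
with SerializeToString() producing the request body and ParseFromString()
consuming the response):

  #include <string>
  #include <curl/curl.h>

  // Append the HTTP response body into a std::string.
  static size_t CollectBody(char* data, size_t size, size_t nmemb, void* out) {
    static_cast<std::string*>(out)->append(data, size * nmemb);
    return size * nmemb;
  }

  bool PostAndParse(const UserRecord& request, UserRecord* response) {
    std::string body, reply;
    if (!request.SerializeToString(&body)) return false;

    CURL* curl = curl_easy_init();
    if (!curl) return false;
    curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/records");
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body.data());
    curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, (long)body.size());
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, CollectBody);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &reply);
    CURLcode rc = curl_easy_perform(curl);
    curl_easy_cleanup(curl);

    // With the raw payload bytes in hand, protobuf does the rest.
    return rc == CURLE_OK && response->ParseFromString(reply);
  }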

On Wed, Jan 13, 2010 at 10:04 AM, Rich  wrote:

> I'm fairly new to Python and very new to protocol buffers.  Any pointers
> in the right direction would be helpful.
>
> thx
>



Re: [protobuf] Message limit

2010-01-13 Thread Kenton Varda
I actually don't know how the format works, but that is certainly one
possibility.

On Wed, Jan 13, 2010 at 5:19 AM, Delip Rao  wrote:

> Thanks folks, that was very useful. Right now I have a sequence of
> messages since we're processing serially. RecordIO seems like a great
> idea. Is the "framing format" just multiple messages in a file with an
> inverted index at the beginning?
>
> - Delip



[protobuf] STL_HASH.m4

2010-01-13 Thread vikram
Hello Guys,

 I see that Google Protocol Buffers now supports unordered_map with the
new modification in hash.h. But I am confused about where exactly
stl_hash.m4 looks for unordered_map by default. Can we make it look in a
different directory? The xlc compiler on AIX is installed under
XYZ/vacpp/include, which is different from the default /usr/include
directory.

I tried to run m4 with stl_hash.m4 as input and XYZ/vacpp/include as the
include directory, but it failed, saying "end quote is not provided".
Is there any way I can make stl_hash.m4 look into a different include
directory than /usr/include?

Thanks & Regards,
Vikram




Re: [protobuf] STL_HASH.m4

2010-01-13 Thread Kenton Varda
stl_hash.m4 should automatically look in whatever directory your compiler
uses.  If for some reason your compiler does not automatically look in the
directory you want, then you should add the proper CXXFLAGS to make it look
there, e.g.:

  ./configure CXXFLAGS=-I/XYZ/vacpp/include

(-I is GCC's flag for this; your compiler may be different.)

On Wed, Jan 13, 2010 at 12:20 PM, vikram  wrote:

> Hello Guys,
>
> I see that Google Protocol Buffers now supports unordered_map with the
> new modification in hash.h. But I am confused about where exactly
> stl_hash.m4 looks for unordered_map by default. Can we make it look in a
> different directory? The xlc compiler on AIX is installed under
> XYZ/vacpp/include, which is different from the default /usr/include
> directory.
>
> I tried to run m4 with stl_hash.m4 as input and XYZ/vacpp/include as the
> include directory, but it failed, saying "end quote is not provided".
> Is there any way I can make stl_hash.m4 look into a different include
> directory than /usr/include?
>
> Thanks & Regards,
> Vikram
>



[protobuf] How can I reset a FileInputStream?

2010-01-13 Thread Jacob Rief
Hello Kenton,

currently I have the following problem: I have a very big file with
many small messages serialized with Protobuf. Each message contains
its own separator and thus can be found even in an unsynchronized
stream. I move through this file using lseek64, because
FileInputStream::Skip only works in the forward direction and
FileInputStream::BackUp can only move back up to the current buffer
boundary. Since I own the file descriptor, which is also used by
FileInputStream, I can seek to any position in the file.
However, after seeking, my FileInputStream is obviously in an unusable
state and has to be reset. Currently the only feasible solution is to
replace the current FileInputStream object with a new one - which,
somehow, is quite inefficient!

Wouldn't it make sense to add a member function which resets a
FileInputStream to the state of a freshly opened and repositioned file
descriptor? Or is there another way to randomly access the raw content
of the file, say by wrapping seek?

Regards,
Jacob




Re: [protobuf] How can I reset a FileInputStream?

2010-01-13 Thread Kenton Varda
On Wed, Jan 13, 2010 at 3:02 PM, Jacob Rief  wrote:

> Currently the only feasible solution is to
> replace the current FileInputStream object with a new one - which,
> somehow, is quite inefficient!
>

What makes you think it is inefficient?  It does mean the buffer has to be
re-allocated, but with a decent malloc implementation that shouldn't take
long.  Certainly the actual reading from the file would take longer.  Have
you seen performance problems with this approach?


> Wouldn't it make sense to add a member function which resets a
> FileInputStream to the state of a natively opened and repositioned
> file descriptor?


If there really is a performance problem with allocating new objects, then
sure.
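
For reference, a minimal sketch of the replace-the-object approach, assuming
the stream does not own the descriptor (SetCloseOnDelete was never called),
so deleting it leaves the fd open:

  #include <sys/types.h>
  #include <unistd.h>
  #include <google/protobuf/io/zero_copy_stream_impl.h>

  using google::protobuf::io::FileInputStream;

  // Drop the stream's now-stale buffer state, seek the raw descriptor,
  // and wrap it in a fresh stream positioned at 'target'.
  FileInputStream* Reposition(int fd, off64_t target, FileInputStream* old) {
    delete old;
    lseek64(fd, target, SEEK_SET);
    return new FileInputStream(fd);
  }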



Re: [protobuf] Re: WriteDelimited/parseDelimited in python

2010-01-13 Thread Kenton Varda
(I have this on the back burner as I'm kind of swamped, but I do want to get
this submitted at some point, hopefully within a week.)

On Tue, Jan 5, 2010 at 3:57 AM, Graham Cox  wrote:

> I was saying the user *could* do that, and that it's currently what I'm
> doing in my server-side code. The reason being, as you said, if you naively
> read from a stream and the message isn't all present then you need to block
> until it is with the way that the Java code works at present. If you are
> using it for client-side code then likely this is not an issue in the
> slightest, but a server that needs to be able to handle many clients at once
> just can not block on one of them...
>
> As to your other alternative, (a), I would suggest that this leaves too
> much of the underlying network protocol bare to the caller. This will make
> it very difficult to change the way that delimiting messages happens in the
> future should such a thing be required. If - for example - it is decided to
> go from having the length prefixed to having a special delimiting sequence
> after the message then it will cause all current calling code to need to be
> changed. It might be that this is considered a low enough level library that
> this is acceptable, but that would be a Google decision...
>
> One more alternative would be how the asn1c library works for parsing ASN.1
> streams into objects, which is to be resumable. The decoder reads all the
> data it is given, and tries to build the object from this. If it doesn't
> have enough data yet then it does what it can, remembers where it got to and
> returns back to the user who can then supply more data when it becomes
> available. If the entire message does parse from the data provided then
> return back to the user the amount of data consumed so that they can discard
> this (reading from the stream directly makes this slightly cleaner still).
> At present, the Protobuf libraries (any of them) can not support this method
> of decoding an object, and it is not a trivial change to make it possible to
> do, but it does - IMO - give a much cleaner and easier to use method of use.
> --
> Graham Cox
>
> On Tue, Jan 5, 2010 at 1:32 AM, Kenton Varda  wrote:
>
>> Make sure to "reply all" so that the group is CC'd.
>>
>> So you are saying that the user should read whatever data is on the
>> socket, then attempt to parse it, and if it fails, assume that it's because
>> there is more data to read?  Seems rather wasteful.  I think what we ideally
>> want is either:
>> (a) Provide a way for the caller to read the size independently, so that
>> they can then make sure to read that many bytes from the input before
>> parsing.
>> (b) Provide a method that reads from a stream, so that the protobuf
>> library can automatically take care of reading all necessary bytes.
>>
>> Option (b) is obviously cleaner but has a few problems:
>> - We have to choose a particular stream interface to support.  While the
>> Python "file-like" interface is pretty common I'm not sure if it's universal
>> for this kind of task.
>> - If not all bytes of the message are available yet, we'd have to block.
>>  This might be fine most of the time, but would be unacceptable for some
>> uses.
>>
>> Thoughts?
>>
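
A sketch of what option (a) could look like on the C++ side, assuming a
hypothetical ReadExact() helper that blocks until exactly n bytes have
arrived on the socket:

  #include <stdint.h>
  #include <string>
  #include <google/protobuf/io/coded_stream.h>
  #include <google/protobuf/message.h>

  using google::protobuf::io::CodedInputStream;

  // Hypothetical: blocks until exactly 'n' bytes are read; false on error.
  bool ReadExact(int sock, char* buf, size_t n);

  bool ReadDelimited(int sock, google::protobuf::Message* msg) {
    // Read one byte at a time until the varint length prefix is complete
    // (a varint's final byte has its high bit clear).
    std::string prefix;
    char b;
    do {
      if (!ReadExact(sock, &b, 1)) return false;
      prefix.push_back(b);
    } while (b & 0x80);

    uint32_t size = 0;
    CodedInputStream in(
        reinterpret_cast<const uint8_t*>(prefix.data()), prefix.size());
    if (!in.ReadVarint32(&size)) return false;

    // The caller now knows the exact message size, so it can read that many
    // bytes before handing them to the parser -- no blocking inside protobuf.
    std::string body(size, '\0');
    if (size > 0 && !ReadExact(sock, &body[0], size)) return false;
    return msg->ParseFromString(body);
  }
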
>> On Mon, Jan 4, 2010 at 3:09 PM, Graham Cox  wrote:
>>
>>> I'm using it for reading/writing to sockets in my functional tests -
>>> works well enough there...
>>> In my Java-side server code, I read from the socket into a byte buffer,
>>> then deserialize the byte buffer into Protobuf objects, throwing away the
>>> data that has been deserialized. The python "MergeDelimitedFromString"
>>> function also returns the number of bytes that were processed to build up
>>> the Protobuf object, so the user could easily do the same - read the socket
>>> onto the end of a buffer, and then while the buffer is successfully
>>> deserializing into objects throw away the first x bytes as appropriate...
>>>
>>> Just a thought :)
>>>
>>> On Mon, Jan 4, 2010 at 9:57 PM, Kenton Varda  wrote:
>>>
 Hmm, it occurs to me that this currently is not useful for reading from
 a socket or similar stream since the caller has to make sure to read an
 entire message before trying to parse it, but the caller doesn't actually
 know how long the message is (because the code that determines this is
 encapsulated).  Any thoughts on this?

 On Mon, Jan 4, 2010 at 12:11 PM, Kenton Varda wrote:

> Mostly looks good.  There are some style issues (e.g. lines over 80
> chars) but I can clean those up myself.
>
> You'll need to sign the contributor license agreement:
>
> http://code.google.com/legal/individual-cla-v1.0.html -- If you own
> copyright on this change.
> http://code.google.com/legal/corporate-cla-v1.0.html -- If your
> employer does.
>
> Please let me know after you've done this and then I can submit these.
>
>
> On Fri, Jan 1, 2010 at 12:53 PM, Graham  wrote:
>