Re: Deserializing a message for which I don't have the .proto (in Java)

2009-06-23 Thread Kenton Varda
You can use UnknownFieldSet, but be warned that the interface for that class
is likely to change in a future version (because the current design is
somewhat inefficient).  If you just want to print the contents, you should
be fine -- just parse into an UnknownFieldSet and then call its toString()
method.  Those parts won't change.
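
For comparison, here is a minimal sketch of the hand-rolled route the question mentions, using only the JDK so that it does not depend on the UnknownFieldSet interface. The class WireDump and its methods are hypothetical names invented for this illustration and are not part of the protobuf API. It decodes only the top-level fields, prints length-delimited values as hex without guessing whether they are strings or sub-messages, and rejects the group wire types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the protobuf library: decodes the
// top-level (field number, wire type, value) tuples of a serialized message.
class WireDump {
    private final byte[] buf;
    private int pos = 0;

    private WireDump(byte[] buf) { this.buf = buf; }

    // Reads a base-128 varint: 7 payload bits per byte, low-order group first.
    private long readVarint() {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            if (pos >= buf.length) throw new IllegalArgumentException("truncated varint");
            byte b = buf[pos++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
        }
        throw new IllegalArgumentException("varint too long");
    }

    // Reads a little-endian fixed-width integer of the given size (4 or 8 bytes).
    private long readFixed(int bytes) {
        if (buf.length - pos < bytes) throw new IllegalArgumentException("truncated fixed value");
        long result = 0;
        for (int i = 0; i < bytes; i++) {
            result |= (long) (buf[pos++] & 0xFF) << (8 * i);
        }
        return result;
    }

    // Returns one line of text per top-level field found in the message bytes.
    static List<String> dump(byte[] data) {
        WireDump in = new WireDump(data);
        List<String> out = new ArrayList<>();
        while (in.pos < data.length) {
            long tag = in.readVarint();
            long field = tag >>> 3;          // upper bits: field number
            int wireType = (int) (tag & 7);  // low three bits: wire type
            String value;
            switch (wireType) {
                case 0: value = Long.toString(in.readVarint()); break;
                case 1: value = "0x" + Long.toHexString(in.readFixed(8)); break;
                case 5: value = "0x" + Long.toHexString(in.readFixed(4)); break;
                case 2: {
                    long len = in.readVarint();
                    if (len < 0 || len > data.length - in.pos) {
                        throw new IllegalArgumentException("bad length " + len);
                    }
                    StringBuilder hex = new StringBuilder();
                    for (int i = 0; i < len; i++) {
                        hex.append(String.format("%02x", data[in.pos++] & 0xFF));
                    }
                    value = hex.toString();
                    break;
                }
                default:
                    throw new IllegalArgumentException("unsupported wire type " + wireType);
            }
            out.add("field=" + field + " wireType=" + wireType + " value=" + value);
        }
        return out;
    }
}
```

For example, WireDump.dump applied to the three bytes 08 96 01 yields a single entry for field 1, wire type 0, value 150.
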
On Tue, Jun 23, 2009 at 7:46 AM, Toph  wrote:

>
> Hi folks,
>
> I understand that protocol buffers messages are not fully self-
> describing.
> However, the message contains the field number, wire type, and value,
> right?
>
> In Java, how can I take a byte[] (array of bytes) that represents a
> message and deserialize it into a list of tuples that contain the
> field number, wire type, and value?  I really just want to print out
> these tuples, even if the value is binary.
>
> Do any of the classes in the Java API provide a mechanism to do this?
> I know I could just take the documentation of the encoding and write
> code myself, but I am hoping that one of the API classes exists in a
> usable or near-usable form to do this for me.  Will CodedInputStream
> do it?  Can I use any of the parseFrom() or mergeFrom() methods?
>
> Note that I do not have the corresponding .proto file in source form,
> compiled form, or available to transmit along with any messages
>
> Thanks
>
>
> >
>

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---



Re: encoding of embedded messages and repeated elements

2009-06-23 Thread Kenton Varda
The advantage of writing the length is that a parser can skip the entire
sub-message easily without having to parse its contents.  Otherwise, we
would probably use the "group" encoding for sub-messages, where a special
end tag marks the end of the message.
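
The difference can be sketched in a few lines of Java. This is an illustration under assumptions, not the library's parser: SkipDemo and its members are hypothetical names, and the tagsRead counter exists only to show how much work each encoding costs a parser that wants to skip a field.

```java
// Hypothetical sketch, not the protobuf library's parser.
class SkipDemo {
    final byte[] buf;
    int pos = 0;
    int tagsRead = 0; // how many tags had to be decoded so far

    SkipDemo(byte[] buf) { this.buf = buf; }

    // Reads a base-128 varint: 7 payload bits per byte, low-order group first.
    long readVarint() {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            if (pos >= buf.length) throw new IllegalArgumentException("truncated varint");
            byte b = buf[pos++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
        }
        throw new IllegalArgumentException("varint too long");
    }

    long readTag() {
        tagsRead++;
        return readVarint();
    }

    void advance(long n) {
        if (n < 0 || n > buf.length - pos) throw new IllegalArgumentException("bad length " + n);
        pos += (int) n;
    }

    // Skips the value of a field whose tag has just been read.
    void skipValue(long tag) {
        long field = tag >>> 3;
        switch ((int) (tag & 7)) {
            case 0: readVarint(); break;
            case 1: advance(8); break;
            case 5: advance(4); break;
            case 2:
                // Length-delimited: one jump, the contents are never inspected.
                advance(readVarint());
                break;
            case 3:
                // Group: every nested field must be decoded until the end tag appears.
                while (true) {
                    long inner = readTag();
                    if ((inner & 7) == 4) {
                        if ((inner >>> 3) != field) {
                            throw new IllegalArgumentException("mismatched end group");
                        }
                        return;
                    }
                    skipValue(inner);
                }
            default:
                throw new IllegalArgumentException("unexpected wire type " + (tag & 7));
        }
    }
}
```

Skipping a two-field sub-message stored as a length-delimited field costs one tag read and one jump. Skipping the same contents stored as a group costs four tag reads: the start tag, both inner field tags, and the end tag.
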

On Tue, Jun 23, 2009 at 9:06 AM, etorri  wrote:

>
>
> Hello,
>
> The "length delimited" encoding basically tells that the following N
> bytes belong to this field. Wouldn't it be easier to instead use the
> number of elements that belong to the embedded message (repeated
> element).
>
> Now (as far as I have understood) the message needs to be built from
> fragments and then collected together as the lengths are not known
> beforehand and it would be expensive to calculate the byte-length of
> the embedded message.
>
> Instead, it would be relatively inexpensive to calculate just the
> number of following elements that make the embedded message before
> starting to encode it.
>
> This would enable streaming of PB or encoding and sending the elements
> right as they are encoded.
>
> Sorry if I misunderstood something. I have just started looking at PB.
> >
>




Re: encoding of embedded messages and repeated elements

2009-06-23 Thread etorri



ok thanks. it is as it is.

(just looking at the feasibility of implementing the PB in Ada for my
own projects :-)



On Jun 23, 6:17 pm, Alek Storm  wrote:
> Hi etorri,
>
> Embedded messages and strings have the exact same wire format.  When parsing
> a message, it's impossible to know whether you're parsing one or the other,
> and since strings have to be encoded using their length in bytes, we can't
> do something different for embedded messages.
>
> Cheers,
> Alek
>
>
>
> On Tue, Jun 23, 2009 at 9:06 AM, etorri  wrote:
>
> > Hello,
>
> > The "length delimited" encoding basically tells that the following N
> > bytes belong to this field. Wouldn't it be easier to instead use the
> > number of elements that belong to the embedded message (repeated
> > element).
>
> > Now (as far as I have understood) the message needs to be built from
> > fragments and then collected together as the lengths are not known
> > beforehand and it would be expensive to calculate the byte-length of
> > the embedded message.
>
> > Instead, it would be relatively inexpensive to calculate just the
> > number of following elements that make the embedded message before
> > starting to encode it.
>
> > This would enable streaming of PB or encoding and sending the elements
> > right as they are encoded.
>
> > Sorry if I misunderstood something. I have just started looking at PB.
>
> --
> Alek Storm



Re: encoding of embedded messages and repeated elements

2009-06-23 Thread Alek Storm
Hi etorri,

Embedded messages and strings have the exact same wire format.  When parsing
a message, it's impossible to know whether you're parsing one or the other,
and since strings have to be encoded using their length in bytes, we can't
do something different for embedded messages.
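
To make this concrete, here is a small Java sketch. The class LengthDelimited and its methods are hypothetical names for this illustration, not library API. A single encoder serves every wire-type-2 field, because on the wire such a field is only a tag, a byte length, and raw bytes.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not part of the protobuf library.
class LengthDelimited {
    // Appends a base-128 varint: 7 payload bits per byte, low-order group first.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes any wire-type-2 field: tag, byte length, then the raw payload.
    // Nothing in the output records whether the payload is a string, raw
    // bytes, or a serialized sub-message.
    static byte[] field(int fieldNumber, byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, ((long) fieldNumber << 3) | 2);
        writeVarint(out, payload.length);
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }
}
```

For instance, field 2 holding the string "hi" encodes as 12 02 68 69. A sub-message in field 2 that contains only field 13 with the varint value 105 serializes to the same payload bytes, 68 69, so the enclosing field is byte-for-byte identical. Only the .proto definition tells a reader which interpretation is intended.
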

Cheers,
Alek

On Tue, Jun 23, 2009 at 9:06 AM, etorri  wrote:

>
>
> Hello,
>
> The "length delimited" encoding basically tells that the following N
> bytes belong to this field. Wouldn't it be easier to instead use the
> number of elements that belong to the embedded message (repeated
> element).
>
> Now (as far as I have understood) the message needs to be built from
> fragments and then collected together as the lengths are not known
> beforehand and it would be expensive to calculate the byte-length of
> the embedded message.
>
> Instead, it would be relatively inexpensive to calculate just the
> number of following elements that make the embedded message before
> starting to encode it.
>
> This would enable streaming of PB or encoding and sending the elements
> right as they are encoded.
>
> Sorry if I misunderstood something. I have just started looking at PB.
> >
>


-- 
Alek Storm




encoding of embedded messages and repeated elements

2009-06-23 Thread etorri


Hello,

The "length delimited" encoding basically tells that the following N
bytes belong to this field. Wouldn't it be easier to instead use the
number of elements that belong to the embedded message (repeated
element)?

Now (as far as I have understood) the message needs to be built from
fragments and then collected together as the lengths are not known
beforehand and it would be expensive to calculate the byte-length of
the embedded message.

Instead, it would be relatively inexpensive to calculate just the
number of following elements that make the embedded message before
starting to encode it.

This would enable streaming of PB or encoding and sending the elements
right as they are encoded.

Sorry if I misunderstood something. I have just started looking at PB.



Re: PB's vs ASN.1

2009-06-23 Thread Christopher Smith

No, but in short the advantages over ASN.1 can be summed up as
"simpler, and for most cases, more efficient".


On 6/23/09, Jon M  wrote:
>
> Hello,
>
> The system I am currently working on uses ASN.1 at the heart of the
> client/server communication. I am evaluating PB's for another part of
> the system that hasn't been implemented yet and was curious if anyone
> can point me to any articles/blogs comparing and contrasting PB's and
> ASN.1?
>
> Thanks,
> Jon
> >
>

-- 
Sent from my mobile device

Chris




Deserializing a message for which I don't have the .proto (in Java)

2009-06-23 Thread Toph

Hi folks,

I understand that protocol buffers messages are not fully self-
describing.
However, the message contains the field number, wire type, and value,
right?

In Java, how can I take a byte[] (array of bytes) that represents a
message and deserialize it into a list of tuples that contain the
field number, wire type, and value?  I really just want to print out
these tuples, even if the value is binary.

Do any of the classes in the Java API provide a mechanism to do this?
I know I could just take the documentation of the encoding and write
code myself, but I am hoping that one of the API classes exists in a
usable or near-usable form to do this for me.  Will CodedInputStream
do it?  Can I use any of the parseFrom() or mergeFrom() methods?

Note that I do not have the corresponding .proto file in source form,
compiled form, or available to transmit along with any messages

Thanks





PB's vs ASN.1

2009-06-23 Thread Jon M

Hello,

The system I am currently working on uses ASN.1 at the heart of the
client/server communication. I am evaluating PB's for another part of
the system that hasn't been implemented yet and was curious if anyone
can point me to any articles/blogs comparing and contrasting PB's and
ASN.1?

Thanks,
Jon



Re: valgrind issues

2009-06-23 Thread Kenton Varda
Did you call ShutdownProtobufLibrary() before checking for leaks?

As it says, the memory in question is still reachable, so whether or not it
is a "leak" is debatable.  ShutdownProtobufLibrary() will go around and
delete all the objects the library has allocated.  It's a huge waste of time
if you're about to call exit() anyway, but it should make valgrind shut up.
Or maybe you can tell valgrind to ignore reachable memory?

On Mon, Jun 22, 2009 at 11:23 PM, Monty Taylor  wrote:

>
> Hey guys,
>
> We're valgrinding drizzle at the moment and see a lot of:
>
> ==3378== 40 bytes in 1 blocks are still reachable in loss record 14 of 121
> ==3378==    at 0x4A06D5C: operator new(unsigned long) (vg_replace_malloc.c:230)
> ==3378==    by 0x5894A8: drizzled::message::protobuf_BuildDesc_table_2eproto_AssignGlobalDescriptors(google::protobuf::FileDescriptor const*) (table.pb.cc:173)
> ==3378==    by 0x4C64F13: google::protobuf::DescriptorBuilder::BuildFile(google::protobuf::FileDescriptorProto const&, void (*)(google::protobuf::FileDescriptor const*)) (descriptor.cc:2391)
> ==3378==    by 0x4C65BBF: google::protobuf::DescriptorPool::InternalBuildGeneratedFile(void const*, int, void (*)(google::protobuf::FileDescriptor const*)) (descriptor.cc:1962)
> ==3378==    by 0x5883B9: _GLOBAL__I_table.pb.cc (table.pb.cc:580)
> ==3378==    by 0x708675: (within /home/brian/merge/drizzled/drizzled)
> ==3378==    by 0x411BCA: (within /home/brian/merge/drizzled/drizzled)
> ==3378==    by 0x3171402B1F: (within /lib64/libpthread-2.9.so)
> ==3378==    by 0x708604: __libc_csu_init (in /home/brian/merge/drizzled/drizzled)
> ==3378==    by 0x3170C1E501: (below main) (libc-start.c:179)
>
> I'm not sure whether they are valid and something I should report to
> you, or whether they are invalid and something I should suppress. Any
> thoughts?
>
> Thanks!
> Monty
>
> >
>
