Bug report and nearing completion hprotoc on par with protocol-buffers 2.0.2

2008-11-10 Thread Chris Kuklewicz

Kenton,

I am nearly ready with the Haskell update to protoc [1] that will
support the user defined options introduced by protocol-buffers 2.0.2
(protoc).

To have a hope of testing my code, I have redesigned my processing to
have hprotoc produce a binary FileDescriptorSet that I could compare
to the output of protoc.  This should also allow hprotoc to consume
the binary FileDescriptorSet output of protoc.

I have a few questions and two bug reports against protoc-2.0.2 that
all arise from me examining the FileDescriptorSet output with protoc's
decoding:

BUG * The user-defined options have the wrong value for some 32-bit
values.  protoc stores them as 64-bit values:
unittest_custom_options.proto: optional int32 message_opt1 = 7739036;
unittest_custom_options.proto: option (message_opt1) = -56;
protoc:  7739036: 18446744073709551560
hprotoc: 7739036: 4294967240
There is another problem, which is visible in the raw output:
unittest_custom_options.proto:
message DummyMessageContainingEnum {
  enum TestEnumType {
    TEST_OPTION_ENUM_TYPE1 = 22;
    TEST_OPTION_ENUM_TYPE2 = -23;
  }
}
protoc:
  2 {
    1: TEST_OPTION_ENUM_TYPE2
    2: 18446744073709551593
  }
hprotoc:
  2 {
    1: TEST_OPTION_ENUM_TYPE2
    2: 4294967273
  }
The negative enum value reveals that it is stored as a 64-bit number
instead of a 32-bit one. This makes the already inefficient negative
values about twice as large as they would otherwise be, and threatens
to cause errors when read by other implementations that expect only
32 bits.
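
A minimal Python sketch of the size difference described above. The helper
names (encode_varint, as_unsigned) are hypothetical and are not part of
protoc or hprotoc; the sketch only reproduces the numbers shown in this
report and the standard base-128 varint layout.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (low group first)."""
    if value < 0:
        raise ValueError("convert to an unsigned representation first")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more groups follow: set continuation bit
        else:
            out.append(low7)
            return bytes(out)


def as_unsigned(value: int, bits: int) -> int:
    """Two's-complement view of a signed integer at the given width."""
    return value & ((1 << bits) - 1)


for n in (-56, -23):
    u32 = as_unsigned(n, 32)  # value hprotoc printed in this report
    u64 = as_unsigned(n, 64)  # value protoc printed in this report
    print(n, u32, len(encode_varint(u32)), u64, len(encode_varint(u64)))
```

For -56 this gives 4294967240 in 5 bytes at 32 bits and
18446744073709551560 in 10 bytes at 64 bits.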

BUG * The user-defined options from unittest_custom.proto have
repetitions in the output from protoc that are not present in
the .proto file.  Not all fields are repeated (apparently just the
fixed width ones), but this looks dangerous in the presence of
repeated fields. Example from the raw output from protoc:
  4 {
    1: CustomOptionMinIntegerValues
    7 {
      7706090: 0
      7705709: 18446744071562067968
      7705542: 9223372036854775808
      7704880: 0
      7702367: 0
      7701568: 4294967295
      7700863: 18446744073709551615
      7700307: 0x
      7700307: 0x
      7700194: 0x
      7700194: 0x
      7698645: 0x8000
      7698645: 0x8000
      7685475: 0x8000
      7685475: 0x8000
    }
  }


* The default_value for bytes and string types is stored differently.
The bytes are stored in a raw form at the same escaping level as the
proto file.  A string is stored after the escape codes have been
interpreted.
** Why, oh why, are they stored with different escape conventions?
** Is this documented anywhere?

* The name field of the FileDescriptorProto seems to be the file
path passed on the command line or the file path in the import
statement.
** I have not checked, but if I were on Windows would the file path
from the command line have \ instead of / ?
** Is this documented anywhere?

Thanks for your attention,
  Chris

[1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/protocol-buffers
--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
Protocol Buffers group.
To post to this group, send email to protobuf@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en
-~--~~~~--~~--~--~---



Re: Bug report and nearing completion hprotoc on par with protocol-buffers 2.0.2

2008-11-10 Thread Kenton Varda
On Mon, Nov 10, 2008 at 9:12 AM, Chris Kuklewicz [EMAIL PROTECTED] wrote:

> BUG * The user-defined options have the wrong value for some 32-bit
> values.


Looks like you figured this out.  Yeah, the idea is that 32-bit varints and
64-bit varints should always be compatible.  So, if you write a 32-bit
negative number as a varint, it needs to be sign-extended to 64 bits so that
if it is read as a 64-bit varint you still get the correct result.
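
A minimal Python sketch of that compatibility argument. The function names
are hypothetical, and the two byte strings are the 64-bit (sign-extended)
and 32-bit-only varint encodings of -56, written out by hand for
illustration.

```python
def decode_varint(data: bytes) -> int:
    """Decode a base-128 varint into an unsigned integer."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # continuation bit clear: last group
            return result
        shift += 7
    raise ValueError("truncated varint")


def to_signed(value: int, bits: int) -> int:
    """Truncate to `bits` and reinterpret as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


# -56 sign-extended to 64 bits before encoding (10 bytes).
sign_extended = bytes([0xC8] + [0xFF] * 8 + [0x01])
# -56 encoded from its 32-bit pattern only (5 bytes).
truncated = bytes([0xC8, 0xFF, 0xFF, 0xFF, 0x0F])

for name, wire in (("sign-extended", sign_extended), ("32-bit only", truncated)):
    raw = decode_varint(wire)
    print(name, "as int32:", to_signed(raw, 32), "as int64:", to_signed(raw, 64))
```

The sign-extended form reads back as -56 at either width. The 5-byte form
reads as -56 only when the reader truncates to 32 bits; a 64-bit reader sees
4294967240.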

The whole negative varints problem was a mistake made in an early version of
protocol buffers that unfortunately we're stuck with now.


> BUG * The user-defined options from unittest_custom.proto have
> repetitions in the output from protoc that are not present in
> the .proto file.  Not all fields are repeated (apparently just the
> fixed width ones), but this looks dangerous in the presence of
> repeated fields.


Thanks, I'm looking into this.


> * The default_value for bytes and string types is stored differently.
> The bytes are stored in a raw form at the same escaping level as the
> proto file.  A string is stored after the escape codes have been
> interpreted.
> ** Why, oh why, are they stored with different escape conventions?


The default_value field of FieldDescriptorProto is a string.  Strings can
only contain structurally-valid UTF-8 text.  So, the default values for
other strings can be represented just fine with no escaping, but raw bytes
need to be escaped somehow such that they are valid UTF-8.  In retrospect,
this may not have been the best format.
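
A minimal Python sketch of why escaping makes arbitrary bytes safe to store
in a string field. The function escape_bytes is hypothetical, and the
C-style octal escape set is an assumption made for illustration; the
comments in descriptor.proto are the authority on what protoc actually
writes.

```python
# Assumed escape table (C-style); not taken from the protoc sources.
_SIMPLE_ESCAPES = {
    0x0A: "\\n",
    0x0D: "\\r",
    0x09: "\\t",
    0x22: '\\"',
    0x27: "\\'",
    0x5C: "\\\\",
}


def escape_bytes(raw: bytes) -> str:
    """Render arbitrary bytes as printable ASCII, which is always valid UTF-8."""
    parts = []
    for b in raw:
        if b in _SIMPLE_ESCAPES:
            parts.append(_SIMPLE_ESCAPES[b])
        elif 0x20 <= b < 0x7F:
            parts.append(chr(b))        # printable ASCII passes through
        else:
            parts.append("\\%03o" % b)  # everything else as a 3-digit octal escape
    return "".join(parts)


raw_default = b"\x00\x01\xff"           # not valid UTF-8 on its own
escaped = escape_bytes(raw_default)
escaped.encode("ascii")                 # succeeds: the escaped form is plain ASCII
print(escaped)
```

A string default needs no such step, because its interpreted value is
already valid UTF-8 text.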


> ** Is this documented anywhere?


Yes, in the comments in descriptor.proto.


> * The name field of the FileDescriptorProto seems to be the file
> path passed on the command line or the file path in the import
> statement.
> ** I have not checked, but if I were on Windows would the file path
> from the command line have \ instead of / ?


It will always be a forward slash.

The path is actually not taken from the command line or from import
statements.  The path of each file is its location relative to the source
tree defined by the --proto_path (or -I) flag.  The goal here is to have a
canonical name for every file.  Note that this also implies that file names
cannot contain . or .. components and cannot be absolute paths.
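
A minimal Python sketch of that naming rule, for anyone writing their own
implementation. The function canonical_proto_name is hypothetical and is
not how libprotoc resolves files; it only illustrates the constraints
stated above (relative to the --proto_path root, forward slashes, no . or
.. components, never absolute).

```python
import posixpath


def canonical_proto_name(proto_path: str, file_path: str) -> str:
    """Return file_path relative to proto_path, using forward slashes only."""
    # Treat backslashes as separators so Windows-style input is accepted.
    root = posixpath.normpath(proto_path.replace("\\", "/"))
    path = posixpath.normpath(file_path.replace("\\", "/"))
    rel = posixpath.relpath(path, root)
    # A result of "." or one that climbs out of the root has no canonical name.
    if rel in (".", "..") or rel.startswith("../"):
        raise ValueError(f"{file_path!r} is not inside {proto_path!r}")
    return rel


print(canonical_proto_name("src", "src/./google/protobuf/descriptor.proto"))
print(canonical_proto_name("src", "src\\google\\protobuf\\descriptor.proto"))
```

Both calls print google/protobuf/descriptor.proto. A file outside the root
raises ValueError instead of producing a name with .. in it.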


> ** Is this documented anywhere?


I guess not as well as it should be.  descriptor.proto describes the name
field as "file name, relative to the root of the source tree", but that's
not precise enough for someone trying to write their own implementation.
Sorry, my intent was never for people to write their own compiler; I hoped
everyone would reuse libprotoc.




Regarding support for Wince

2008-11-10 Thread mpprasad

Does Protocol Buffers support the WinCE platform?

Thanks
Prasad.