Re: [protobuf] Feature proposal: mapped fields

2010-10-06 Thread Igor Gatis
  // EXPERIMENTAL.  DO NOT USE.
  // For "map" fields, the name of the field in the enclosed type that
  // is the key for this map.  For example, suppose we have:
  //   message Item {
  //     required string name = 1;
  //     required string value = 2;
  //   }
  //   message Config {
  //     repeated Item items = 1 [experimental_map_key="name"];
  //   }
  // In this situation, the map key for Item will be set to "name".
  // TODO: Fully-implement this, then remove the "experimental_" prefix.
  optional string experimental_map_key = 9;

Interesting. It basically pins a field within the repeated message to be the
key. I like that; it's very much aligned with my ideas, and it allows other
key types fairly easily. I guess it would make sense to support string, int
and enum keys by default. I like the syntax I described better, though. :)
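
A minimal sketch of what "pinning a field as the key" could mean for user
code, written in Python for brevity. Item and index_by are hypothetical
stand-ins, not protobuf API, and the last-entry-wins policy is an assumption,
since the option's semantics were never finalized:

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical stand-in for the Item message in the comment above."""
    name: str
    value: str


def index_by(items, key_field):
    """Build a dict from a repeated field, keyed by the named field.

    Assumption: a later entry overwrites an earlier one with the same key.
    """
    return {getattr(item, key_field): item for item in items}


items = [
    Item("host", "localhost"),
    Item("port", "8080"),
    Item("host", "example.org"),
]
by_name = index_by(items, "name")
```

The key field is looked up by name, which is also what would let the same
mechanism work for int or enum keys.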

On Wed, Oct 6, 2010 at 12:07 PM, David Yu wrote:

> In descriptor.proto, you'll see an experimental map field. It's not usable
> at the moment.
> In the meantime, you could always simulate a map serialization using a
> repeated message with odd field numbers as $key and even as $value
> (sequential).

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To post to this group, send email to proto...@googlegroups.com.
To unsubscribe from this group, send email to 
protobuf+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/protobuf?hl=en.



Re: [protobuf] Feature proposal: mapped fields

2010-10-06 Thread Igor Gatis
On Wed, Oct 6, 2010 at 11:40 AM, Evan Jones wrote:

> On Oct 6, 2010, at 9:23 , Igor Gatis wrote:
>
>> It would be nice to have mapped fields, e.g. key-value pairs.
>>
>
> I think that map support would probably be useful. I've basically created
> my own maps in protocol buffers a couple times, either by using two repeated
> fields, or a repeated field of a custom "pair" type. In these cases, it
> would have been nice to be able to use the Protocol Buffer as a map
> directly, rather than needing to transfer the data to some other object that
> actually implements the map. I would be interested to hear the opinion of
> the Google maintainers. I'm assuming that there are probably many
> applications inside Google that exchange map-like messages.
>
> This would be a big change, although it wouldn't be an impossible one, I
> don't think. I think it could be implemented as "syntactic sugar" over a
> repeated Pair message.


The syntactic sugar is what I meant by implementing it with repeated pairs.


> I think the biggest challenge is that maps are a "higher level" abstraction
> than repeated fields, which leads to many design challenges:
>
> * Are the maps ordered or unordered?
>   * If ordered, how are keys compared? This needs to be consistent
>     across programming languages.
>   * If unordered, how are hash values computed? This could result in a
>     message being parsed and re-serialized differently, if different
>     languages compute the hashes differently.
>

Actually, I have a different opinion. The higher level of abstraction is not
a problem. It's a dictionary, i.e. string -> object. It does not matter
whether the list is ordered or not, and it does not matter how the hash is
computed. Serialization is simple: a list of Pairs, whose order does not
matter.

A given implementation may decide to read one pair at a time and insert it
into a hashmap. The downside is that a big list will cause the hashmap to
grow many times. A solution could be to read the flat list first to find out
its size, then create the map from it.
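
As a rough illustration of both points, here is a Python sketch in which
pairs_to_map is a hypothetical helper. Python's dict cannot be pre-sized, so
the sketch only records the size where an implementation in another language
could reserve capacity before inserting (an assumption about how one might
do it, not a description of protobuf itself):

```python
def pairs_to_map(pairs):
    """Materialize a flat list of (key, value) pairs as a dict.

    Returns the dict and the number of pairs seen on the wire.
    """
    buffered = list(pairs)      # first pass: read the flat list, size now known
    expected = len(buffered)    # a pre-sizable hashmap would reserve this much
    result = dict(buffered)     # second pass: build the map
    return result, expected


a, n_a = pairs_to_map([("x", 1), ("y", 2)])
b, n_b = pairs_to_map([("y", 2), ("x", 1)])
```

Both calls produce equal dictionaries even though the pairs arrive in a
different order, which is the sense in which the wire order does not matter.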

>   * For both, how are "unknown" fields handled?
> * Do the maps support repeated keys?
>   * If not, what happens when parsing a message with repeated keys?
>

Hm... I'd say no. A simple (unique) string-to-object mapping seems to fit
lots of people's needs. Perhaps, though, values could be repeated. But that
is easily achieved with a wrapper message that has a repeated field.
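
A small Python sketch of that idea. FooList and parse_unique are hypothetical
names, and the two duplicate-key policies shown are assumptions, since the
thread leaves open what a parser should do with a repeated key:

```python
from dataclasses import dataclass, field


@dataclass
class FooList:
    """Hypothetical wrapper message: one map value holding repeated items."""
    values: list = field(default_factory=list)


def parse_unique(pairs, on_duplicate="last_wins"):
    """Build a unique-key map from (key, value) pairs.

    on_duplicate="last_wins" keeps the latest value for a repeated key;
    on_duplicate="error" rejects the input instead.
    """
    result = {}
    for key, value in pairs:
        if key in result and on_duplicate == "error":
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


parsed = parse_unique([("a", FooList([1, 2])), ("a", FooList([3]))])
```

With the default policy the second entry for "a" replaces the first; passing
on_duplicate="error" raises ValueError for the same input.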


>
> Other message protocols contain map-like structures: JSON, Thrift, and
> Avro. Avro only supports string keys. JSON only supports primitive keys.
> Thrift has a similar note about maps:
>
> http://wiki.apache.org/thrift/ThriftTypes
>
>> For maximal compatibility, the key type for map should be a basic type
>> rather than a struct or container type. There are some languages which do
>> not support more complex key types in their native map types. In addition
>> the JSON protocol only supports key types that are base types.
>
> Evan
>
> --
> Evan Jones
> http://evanjones.ca/
>
>




Re: [protobuf] Feature proposal: mapped fields

2010-10-06 Thread David Yu
In descriptor.proto, you'll see an experimental map field. It's not usable
at the moment.
In the meantime, you could always simulate a map serialization using a
repeated message with odd field numbers as $key and even as $value
(sequential).
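
One possible reading of the odd/even scheme, sketched in Python. The helpers
to_numbered_fields and from_numbered_fields are hypothetical, and plain
(field_number, payload) tuples stand in for real wire-format fields:

```python
def to_numbered_fields(mapping):
    """Flatten a dict into (field_number, payload) tuples.

    Each key gets an odd field number and its value the next even number.
    """
    fields = []
    for i, (key, value) in enumerate(mapping.items()):
        fields.append((2 * i + 1, key))
        fields.append((2 * i + 2, value))
    return fields


def from_numbered_fields(fields):
    """Rebuild the dict; raises KeyError if a key has no matching value field."""
    by_number = dict(fields)
    result = {}
    for number in sorted(by_number):
        if number % 2 == 1:
            result[by_number[number]] = by_number[number + 1]
    return result


flat = to_numbered_fields({"a": 1, "b": 2})
restored = from_numbered_fields(flat)
```

Under this reading the number of entries is bounded by the field numbers the
message declares, which is one reason it only simulates a map.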



-- 
When the cat is away, the mouse is alone.
- David Yu




Re: [protobuf] Feature proposal: mapped fields

2010-10-06 Thread Evan Jones

On Oct 6, 2010, at 9:23, Igor Gatis wrote:

> It would be nice to have mapped fields, e.g. key-value pairs.


I think that map support would probably be useful. I've basically  
created my own maps in protocol buffers a couple times, either by  
using two repeated fields, or a repeated field of a custom "pair"  
type. In these cases, it would have been nice to be able to use the  
Protocol Buffer as a map directly, rather than needing to transfer the  
data to some other object that actually implements the map. I would be  
interested to hear the opinion of the Google maintainers. I'm assuming  
that there are probably many applications inside Google that exchange  
map-like messages.


This would be a big change, although I don't think it would be an
impossible one. It could be implemented as "syntactic sugar" over a
repeated Pair message. I think the biggest challenge is that maps are a
"higher level" abstraction than repeated fields, which leads to many
design challenges:


* Are the maps ordered or unordered?
  * If ordered, how are keys compared? This needs to be consistent
    across programming languages.
  * If unordered, how are hash values computed? This could result in a
    message being parsed and re-serialized differently, if different
    languages compute the hashes differently.
  * For both, how are "unknown" fields handled?
* Do the maps support repeated keys?
  * If not, what happens when parsing a message with repeated keys?
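
To illustrate the ordering question, here is a Python sketch of one possible
canonical order: sorting entries by the UTF-8 bytes of the key. The helper
canonical_pairs is hypothetical, and choosing byte order is an assumption
made for the example, not something protobuf specifies:

```python
def canonical_pairs(mapping):
    """Return the map's entries sorted by the UTF-8 encoding of each key.

    Comparing encoded bytes avoids relying on any one language's native
    string comparison.
    """
    return sorted(mapping.items(), key=lambda kv: kv[0].encode("utf-8"))


first = canonical_pairs({"b": 1, "a": 2})
second = canonical_pairs({"a": 2, "b": 1})
```

Both calls return the same list whatever the insertion order, so
re-serializing from that list would be stable across parse and re-serialize
cycles.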


Other message protocols contain map-like structures: JSON, Thrift, and  
Avro. Avro only supports string keys. JSON only supports primitive  
keys.  Thrift has a similar note about maps:


http://wiki.apache.org/thrift/ThriftTypes

> For maximal compatibility, the key type for map should be a basic
> type rather than a struct or container type. There are some
> languages which do not support more complex key types in their
> native map types. In addition the JSON protocol only supports key
> types that are base types.



Evan

--
Evan Jones
http://evanjones.ca/




[protobuf] Feature proposal: mapped fields

2010-10-06 Thread Igor Gatis
Not sure whether this has been discussed before. In any case...

It would be nice to have mapped fields, i.e. key-value pairs. They would work
similarly to repeated fields, which are implicit maps, i.e. messages keyed by
0..N. Mapped fields would move from 0..N keys to int or string keys.
Integers are very compact, which is very attractive in terms of wire format,
but settling for integer keys is not really much better than 0..N keys.
Thus, strings seem more suitable as keys of mapped fields, and each item of
a mapped field could be defined by the following template-like proto
message:

message KeyValuePair_of_SomeMessageType {
  required string key = 1;
  optional SomeMessageType value = 2;
}


Let's pick an example. Consider the following messages:

message Foo {
  optional int32 int_field = 1;
  ...
}

message Bar {
  mapped Foo foo = 1;
}

Internally, protobuf would read the above code as something like:

message Foo {
  optional int32 int_field = 1;
  ...
}

// Known at code generation time only.
message KeyValuePair_of_Foo {
  required string key = 1;
  optional Foo value = 2;
}

message Bar {
  repeated KeyValuePair_of_Foo foo = 1;
}


And generated C++ code for Bar would look like:

int32 foo_size() const;
bool has_foo(const string& key) const;
const Foo& foo(const string& key) const;
Foo* mutable_foo(const string& key);
void put_foo(const string& key, const Foo& foo);
void remove_foo(const string& key);
const RepeatedPtrField<string>& foo_keys() const;
const RepeatedPtrField<Foo>& foo_values() const;
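
A rough Python sketch of how accessors like these could sit on top of a
repeated list of pairs. MappedField and its method names are hypothetical and
only mirror the proposed C++ accessors; keeping a side index next to the pair
list is an assumption about one possible implementation:

```python
class MappedField:
    """Dict-like view over a repeated list of (key, value) pairs."""

    def __init__(self):
        self._pairs = []   # stands in for the repeated KeyValuePair field
        self._index = {}   # key -> position in self._pairs

    def size(self):
        return len(self._pairs)

    def has(self, key):
        return key in self._index

    def get(self, key):
        return self._pairs[self._index[key]][1]

    def put(self, key, value):
        if key in self._index:
            # Existing key: overwrite in place so keys stay unique.
            self._pairs[self._index[key]] = (key, value)
        else:
            self._index[key] = len(self._pairs)
            self._pairs.append((key, value))

    def remove(self, key):
        pos = self._index.pop(key, None)
        if pos is None:
            return
        last = self._pairs.pop()
        if pos < len(self._pairs):
            # Fill the hole with the former last pair and re-index it.
            self._pairs[pos] = last
            self._index[last[0]] = pos

    def keys(self):
        return [k for k, _ in self._pairs]

    def values(self):
        return [v for _, v in self._pairs]


foo = MappedField()
foo.put("a", 1)
foo.put("b", 2)
foo.put("c", 3)
foo.remove("a")
```

Removal swaps the last pair into the freed slot, so it avoids shifting the
list but does not preserve insertion order.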


Thoughts?

-Gatis




Re: [protobuf] Python installation does not build plugin_pb2

2010-10-06 Thread Louis-Marie
Thanks for your answer.

To give you a little bit more information, here's what I'm trying to
do. I want to deliver a tool using a custom protoc plugin implemented
in Python, so that end users can generate code from their own proto
files.

There should be nothing special to do before using this plugin, but
since it depends on plugin_pb2, I first need to find the plugin.proto
file (using pkg-config), compile it, and put it in some appropriate
location.
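
For illustration, the find-and-compile step could be scripted roughly as
follows. This is a sketch that assumes pkg-config and protoc are on PATH and
that plugin.proto is installed under the include directory reported by
pkg-config; the helper names are hypothetical:

```python
import os
import subprocess


def protobuf_include_dir():
    """Ask pkg-config for the include directory of the protobuf install."""
    out = subprocess.check_output(
        ["pkg-config", "--variable=includedir", "protobuf"], text=True
    )
    return out.strip()


def plugin_proto_command(include_dir, out_dir):
    """Build the protoc command line that compiles plugin.proto to Python."""
    proto = os.path.join(
        include_dir, "google", "protobuf", "compiler", "plugin.proto"
    )
    return [
        "protoc",
        f"--proto_path={include_dir}",
        f"--python_out={out_dir}",
        proto,
    ]
```

Running subprocess.check_call(plugin_proto_command(protobuf_include_dir(),
out_dir)) with an existing out_dir would then write plugin_pb2.py below
out_dir/google/protobuf/compiler/, which still leaves open how that directory
should coexist with the installed google.protobuf package.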

Also, I can't put it in its "natural" parent Python package
(google.protobuf), which is already provided by the protobuf installation.
On the other hand, the C++ code is generated (and, I guess, compiled
into the shared library), which makes it a little more
straightforward to write C++ plugins.

What would be the best way to achieve what I'm trying to do? If the
python library size increase is not acceptable there, couldn't the
plugin_pb2 file be generated in an independent location, so that one
could still rely on it being present on any protobuf install?

Thanks for your advice,

Louis-Marie


2010/10/6 Kenton Varda:
> It's not generated because none of the python implementation actually uses
> it.  So, generating it and including it in the egg would just increase the
> library size for everyone, when most people don't need it.
> What makes you feel uncomfortable about generating it yourself?
>
> On Fri, Oct 1, 2010 at 6:01 AM, Louis-Marie wrote:
>>
>> Hi all,
>>
>> It looks like python installation of protocol buffers does not
>> generate the google.protobuf.compiler.plugin_pb2 python file, while
>> google.protobuf.descriptor_pb2 is explicitly generated by
>> protobuf/python/setup.py
>>
>>  generate_proto("../src/google/protobuf/descriptor.proto")
>>
>> Shouldn't the plugin.proto file be compiled and installed the same
>> way? Maybe I am missing something there, but I feel very uncomfortable
>> recompiling it when I need to write a plugin.
>>
>> Thanks,
>>
>> Louis-Marie