Re: Why we shouldn't use RPC Frameworks for the New Geode Protocol
I don't like the idea of using someone else's messaging software for several reasons:

* we may have needs that are beyond its capabilities
* it may impose something like a broker or endpoint IDs that we have to deal with
* it introduces more overhead per message and we will have no control over that overhead
* it can introduce backward compatibility issues that we have no control over
* it won't fit into our configuration and management infrastructure

However, I do like the idea of exploring the use of other serialization software if it has good multi-language support.

On 4/10/2017 at 1:38 PM, Michael William Dodge wrote:

MQTT is an interesting idea but from the wiki page it sounds like it depends on a broker. In that architecture, would the server be the broker as well as a publisher and subscriber? Would the locator be the broker? Or would we introduce a separate broker, either third-party or bespoke?

Sarge

On 10 Apr, 2017, at 13:21, Michael Stolz wrote:

I am wondering why we are leaning so heavily toward RPC IDL rather than messaging standards.

One of the big features of the client-server discussion around Geode is the ability to register interest and run Continuous Queries. Both of these behave more like messaging than RPCs.

Beyond that all that Geode really does is puts and gets and function calls. A put is analogous to a publish. A get is similar to a subscribe. A function call can be implemented as a put on a special topic that has a callback attached to it. In fact that's how we used to do server side functions before we added the function execution api.

The other thing we are constantly battling with is being able to push more and more data into Geode from clients faster and faster. That too lends itself to a messaging protocol.

In fact, I remember that last year we were already having some discussions about maybe developing a client based on MQTT. That would make GemFire a natural for the Internet-of-Things and for mobile devices.
I wonder if it would be sufficient for a full-blown .Net GemFire client.

One of our goals in this thread is to be able to have clients in many languages. Well, there are at least 75 different language bindings for MQTT already out in the wild.

MQTT is agnostic about what format the payload is in, so we could support PDX if we choose to, or ProtoBufs or FlatBuffers or whatever else for payload serialization.

Thoughts?

--
Mike Stolz
Principal Engineer, GemFire Product Manager
Mobile: +1-631-835-4771

On Mon, Apr 10, 2017 at 2:39 PM, Galen M O'Sullivan wrote:

On Mon, Apr 10, 2017 at 10:47 AM, Bruce Schuchardt wrote:

I agree that key/value serialization is a separate issue and more related to storage than communications. The thing I'm struggling with is how to specify message metadata in an RPC IDL. I'm thinking of things like an eventID, transaction info, security principal, etc. The basic client/server messaging doesn't need to know the details of this information - just that it exists and maybe the ID of each piece of metadata.

Is there any reason that this data couldn't be packed into, say, Thrift types? It's all numbers, right?
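The put-as-publish, get-as-subscribe, function-as-topic-callback mapping Mike describes can be sketched with a toy in-memory broker. This is purely illustrative (no real MQTT client is used, and every name below is invented), but it shows the shape of the idea:

```python
# Toy in-memory "broker" illustrating the mapping Mike describes:
# put -> publish, get -> subscribe (of a retained message), and a
# function call -> a put on a special topic with a callback attached.
# A real client would use an MQTT library instead of this class.

class ToyBroker:
    def __init__(self):
        self.store = {}        # topic -> last payload (like a retained message)
        self.callbacks = {}    # topic -> callback, for "function" topics

    def publish(self, topic, payload):          # analogous to put
        self.store[topic] = payload
        cb = self.callbacks.get(topic)
        if cb:
            return cb(payload)                  # server-side "function" fires

    def subscribe(self, topic):                 # analogous to get
        return self.store.get(topic)

    def register_function(self, topic, fn):     # callback on a special topic
        self.callbacks[topic] = fn

broker = ToyBroker()
broker.publish("region/customers/42", b"serialized-value")    # put
value = broker.subscribe("region/customers/42")               # get
broker.register_function("fn/echo", lambda p: p.upper())
result = broker.publish("fn/echo", b"hello")                  # function call
```

Note that the broker never interprets the payload bytes, which is exactly the property that would let PDX, ProtoBufs, or FlatBuffers ride on top.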
Re: Why we shouldn't use RPC Frameworks for the New Geode Protocol
MQTT is an interesting idea but from the wiki page it sounds like it depends on a broker. In that architecture, would the server be the broker as well as a publisher and subscriber? Would the locator be the broker? Or would we introduce a separate broker, either third-party or bespoke? Sarge > On 10 Apr, 2017, at 13:21, Michael Stolz wrote: > > I am wondering why we are leaning so heavily toward RPC IDL rather than > messaging standards. > > One of the big features of the client-server discussion around Geode is the > ability to register interest and run Continuous Queries. > Both of these behave more like messaging than RPCs. > > Beyond that all that Geode really does is puts and gets and function calls. > A put is analogous to a publish. A get is similar to a subscribe. A > function call can be implemented as a put on a special topic that has a > callback attached to it. In fact that's how we used to do server side > functions before we added the function execution api. > > The other thing we are constantly battling with is being able to push more > and more data into Geode from clients faster and faster. > That too lends itself to a messaging protocol. > > In fact, I remember that last year we were already having some discussions > about maybe developing a client based on MQTT. > That would make GemFire a natural for the Internet-of-Things and for mobile > devices. > I wonder if it would be sufficient for a full-blown .Net GemFire client. > > One of our goals in this thread is to be able to have clients in many > languages. > Well, there are at least 75 different language bindings for MQTT already > out in the wild. > > MQTT is agnostic about what format the payload is in, so we could support > PDX if we choose to, or ProtoBufs or FlatBuffers or whatever else for > payload serialization. > > Thoughts? 
> > > -- > Mike Stolz > Principal Engineer, GemFire Product Manager > Mobile: +1-631-835-4771 > > On Mon, Apr 10, 2017 at 2:39 PM, Galen M O'Sullivan > wrote: > >> On Mon, Apr 10, 2017 at 10:47 AM, Bruce Schuchardt >> >> wrote: >> >>> I agree that key/value serialization is a separate issue and more related >>> to storage than communications. The thing I'm struggling with is how to >>> specify message metadata in an RPC IDL. I'm thinking of things like an >>> eventID, transaction info, security principal, etc. The basic >>> client/server messaging doesn't need to know the details of this >>> information - just that it exists and maybe the ID of each piece of >>> metadata. >>> >> >> Is there any reason that this data couldn't be packed into, say, Thrift >> types? It's all numbers, right? >>
Re: Why we shouldn't use RPC Frameworks for the New Geode Protocol
I am wondering why we are leaning so heavily toward RPC IDL rather than messaging standards.

One of the big features of the client-server discussion around Geode is the ability to register interest and run Continuous Queries. Both of these behave more like messaging than RPCs.

Beyond that all that Geode really does is puts and gets and function calls. A put is analogous to a publish. A get is similar to a subscribe. A function call can be implemented as a put on a special topic that has a callback attached to it. In fact that's how we used to do server side functions before we added the function execution api.

The other thing we are constantly battling with is being able to push more and more data into Geode from clients faster and faster. That too lends itself to a messaging protocol.

In fact, I remember that last year we were already having some discussions about maybe developing a client based on MQTT. That would make GemFire a natural for the Internet-of-Things and for mobile devices. I wonder if it would be sufficient for a full-blown .Net GemFire client.

One of our goals in this thread is to be able to have clients in many languages. Well, there are at least 75 different language bindings for MQTT already out in the wild.

MQTT is agnostic about what format the payload is in, so we could support PDX if we choose to, or ProtoBufs or FlatBuffers or whatever else for payload serialization.

Thoughts?

--
Mike Stolz
Principal Engineer, GemFire Product Manager
Mobile: +1-631-835-4771

On Mon, Apr 10, 2017 at 2:39 PM, Galen M O'Sullivan wrote: > On Mon, Apr 10, 2017 at 10:47 AM, Bruce Schuchardt > > wrote: > > > I agree that key/value serialization is a separate issue and more related > > to storage than communications. The thing I'm struggling with is how to > > specify message metadata in an RPC IDL. I'm thinking of things like an > > eventID, transaction info, security principal, etc.
The basic > > client/server messaging doesn't need to know the details of this > > information - just that it exists and maybe the ID of each piece of > > metadata. > > > > Is there any reason that this data couldn't be packed into, say, Thrift > types? It's all numbers, right? >
Re: Why we shouldn't use RPC Frameworks for the New Geode Protocol
On Mon, Apr 10, 2017 at 10:47 AM, Bruce Schuchardt wrote: > I agree that key/value serialization is a separate issue and more related > to storage than communications. The thing I'm struggling with is how to > specify message metadata in an RPC IDL. I'm thinking of things like an > eventID, transaction info, security principal, etc. The basic > client/server messaging doesn't need to know the details of this > information - just that it exists and maybe the ID of each piece of > metadata. > Is there any reason that this data couldn't be packed into, say, Thrift types? It's all numbers, right?
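To make Galen's "it's all numbers" point concrete: the metadata Bruce lists (eventID, transaction info, security principal) could be packed into a small fixed binary header that the messaging layer carries without understanding. A hedged Python sketch; the field names and widths below are invented for illustration and are not Geode's actual wire layout:

```python
import struct

# Hypothetical fixed-width metadata header -- this layout is invented
# for illustration and is not Geode's actual wire format.
#   >   big-endian
#   Q   event id        (uint64)
#   Q   transaction id  (uint64)
#   H   principal id    (uint16)
HEADER = struct.Struct(">QQH")

def pack_metadata(event_id, txn_id, principal_id):
    """Pack the numeric metadata into an opaque 18-byte blob."""
    return HEADER.pack(event_id, txn_id, principal_id)

def unpack_metadata(buf):
    """Recover the numbers; the transport never needs to do this."""
    return HEADER.unpack(buf[:HEADER.size])

blob = pack_metadata(12345, 67, 2)
```

The transport only needs to know that such a blob exists and how long it is; interpreting the fields stays a server/client concern.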
Re: Why we shouldn't use RPC Frameworks for the New Geode Protocol
I agree that key/value serialization is a separate issue and more related to storage than communications. The thing I'm struggling with is how to specify message metadata in an RPC IDL. I'm thinking of things like an eventID, transaction info, security principal, etc. The basic client/server messaging doesn't need to know the details of this information - just that it exists and maybe the ID of each piece of metadata.

On 4/7/2017 at 5:23 PM, Jacob Barrett wrote:

You are confusing message protocol with user data serialization. The two are not related. Look at HTTP, the message protocol is pretty simple, PUT, GET, etc., but it does nothing with the data being PUT/GET. On a GET the message protocol has a field that specifies the Content-Type and Content-Encoding and some other metadata. So the GET could get HTML, JPEG, etc. but the protocol doesn't care and doesn't know anything special about that type of the data it puts. The structure for JPEG is not defined in the HTTP protocol at all.

So relate that to what we want to do. Our message protocol defines a PUT and a GET operation, some metadata perhaps and a section for the data. It should have no restriction or care on how that data was serialized. The protocol does not define in any way the structure of the data being PUT or GET.

Separating that concern then, does your argument still stand that RPC frameworks do not work for the new Geode protocol?

On Fri, Apr 7, 2017 at 3:11 PM Galen M O'Sullivan wrote:

I think the main selling point of an RPC framework/IDL is ease-of-use for defined remote communications that look like function calls. If you have calls you're making to remote servers asking them to do work, you can fairly trivially define the interface and then call through. You can then use native types in function calls and they transparently get transformed and sent across the wire.
The RPC protocols I've seen are based on the idea that the types that can be sent will be predefined -- otherwise it's hard to describe with an IDL.

However, we want to support storing unstructured data, or at least data structures that are defined (from the cluster's point of view) at runtime -- one of the main selling points of Geode is PDX serialization, which lets us store arbitrary object structures in the cache. If we were to use an RPC framework we have all the commands accept byte arrays and include some meta-information. This loses us the ease-of-use.

What's left in the protocol then is the calls and the number of arguments they accept, and what order we put those (and the serialized arguments) in on the wire. I don't think we gain much by using a preexisting RPC language, and we lose control over the wire format and message structure. If we want to be able to make the protocol really fast, and customized to our use case; if we want to implement asynchronous requests, futures, etc. then we have to write wrappers for a given language anyways, and packing those things through an RPC framework like Thrift or gRPC will be an extra layer of confusing complexity.

Best,
Galen
Re: Why we shouldn't use RPC Frameworks for the New Geode Protocol
I have used plenty of RPC protocols that pass raw binary as input/output values because the types are not predefined. All the frameworks mentioned support byte arrays. -Jake On Mon, Apr 10, 2017 at 10:36 AM Galen M O'Sullivan wrote: > Hi Jacob, > > The message protocol is conflated with user data serialization in pretty > much all of these frameworks I've seen. If we define some RPC in Thrift, we > have to specify the type of data that gets passed to the call. The type of > that data is specified using the Thrift IDL, meaning it's a > Thrift-serialized object. We could have all the remote procedure calls take > and return byte arrays, but then I'm not sure what the benefit of using > Thrift is. > > The nice thing about HTTP is that it's specified what it looks like on the > wire and there's a defined mechanism for negotiating content type. The RPC > frameworks I've seen aren't as clearly specified[1]. We would also have to > write our own mechanism for negotiating encoding. It seems to me like we > would be doing more work than we would be gaining. > > Another factor that will also constrain our serialization choices as well > as RPC frameworks is that a lot of serialization libraries expect all the > types to be known at runtime. We currently allow users to serialize POJOs > or anything that will fit in PDX, and we want to keep that capability. > > [1]: I think I could reconstruct BERT based on the Erlang docs, but Thrift > is harder. gRPC is basically Protobuf-encoded data over HTTP/2. We could > put binary data in protobufs, but I'd rather not. > > On Fri, Apr 7, 2017 at 5:23 PM, Jacob Barrett wrote: > > > You are confusing message protocol with user data serialization. The two > > are not related. Look at HTTP, the message protocol is pretty simple, > PUT, > > GET, etc., but it does nothing with the data being PUT/GET. On a GET the > > message protocol has a field that specifies the Content-Type and > > Content-Encoding and some other metadata.
So the GET could get HTML, > JPEG, > > etc. but the protocol doesn't care and doesn't know anything special > about > > that type of the data it puts. The structure for JPEG is not defined in > the > > HTTP protocol at all. > > > > So relate that to what we want to do. Our message protocol defines a PUT > > and a GET operation, some metadata perhaps and a section for the data. It > > should have no restriction or care on how that data was serialized. The > > protocol does not define in any way the structure of the data being PUT > or > > GET. > > > > Separating that concern then, does your argument still stand that RPC > > frameworks do not work for the new Geode protocol? > > > > On Fri, Apr 7, 2017 at 3:11 PM Galen M O'Sullivan > > > wrote: > > > > > I think the main selling point of an RPC framework/IDL is ease-of-use > for > > > defined remote communications that look like function calls. If you > have > > > calls you're making to remote servers asking them to do work, you can > > > fairly trivially define the interface and then call through. You can > then > > > use native types in function calls and they transparently get > transformed > > > and sent across the wire. > > > > > > The RPC protocols I've seen are based on the idea that the types that > can > > > be sent will be predefined -- otherwise it's hard to describe with an > > IDL. > > > > > > However, we want to support storing unstructured data, or at least data > > > structures that are defined (from the cluster's point of view) at > runtime > > > -- one of the main selling points of Geode is PDX serialization, which > > lets > > > us store arbitrary object structures in the cache. If we were to use an > > RPC > > > framework we have all the commands accept byte arrays and include some > > > meta-information. This loses us the ease-of-use.
> > > > > > What's left in the protocol then is the calls and the number of > arguments > > > they accept, and what order we put those (and the serialized arguments) > > in > > > on the wire. I don't think we gain much by using a preexisting RPC > > > language, and we lose control over the wire format and message > structure. > > > If we want to be able to make the protocol really fast, and customized > to > > > our use case; if we want to implement asynchronous requests, futures, > > etc. > > > then we have to write wrappers for a given language anyways, and > packing > > > those things through an RPC framework like Thrift or gRPC will be an > > extra > > > layer of confusing complexity. > > > > > > Best, > > > Galen > > > > > >
Re: Why we shouldn't use RPC Frameworks for the New Geode Protocol
Hi Jacob, The message protocol is conflated with user data serialization in pretty much all of these frameworks I've seen. If we define some RPC in Thrift, we have to specify the type of data that gets passed to the call. The type of that data is specified using the Thrift IDL, meaning it's a Thrift-serialized object. We could have all the remote procedure calls take and return byte arrays, but then I'm not sure what the benefit of using Thrift is. The nice thing about HTTP is that it's specified what it looks like on the wire and there's a defined mechanism for negotiating content type. The RPC frameworks I've seen aren't as clearly specified[1]. We would also have to write our own mechanism for negotiating encoding. It seems to me like we would be doing more work than we would be gaining. Another factor that will also constrain our serialization choices as well as RPC frameworks is that a lot of serialization libraries expect all the types to be known at runtime. We currently allow users to serialize POJOs or anything that will fit in PDX, and we want to keep that capability. [1]: I think I could reconstruct BERT based on the Erlang docs, but Thrift is harder. gRPC is basically Protobuf-encoded data over HTTP/2. We could put binary data in protobufs, but I'd rather not. On Fri, Apr 7, 2017 at 5:23 PM, Jacob Barrett wrote: > You are confusing message protocol with user data serialization. The two > are not related. Look at HTTP, the message protocol is pretty simple, PUT, > GET, etc., but it does nothing with the data being PUT/GET. On a GET the > message protocol has a field that specifies the Content-Type and > Content-Encoding and some other metadata. So the GET could get HTML, JPEG, > etc. but the protocol doesn't care and doesn't know anything special about > that type of the data it puts. The structure for JPEG is not defined in the > HTTP protocol at all. > > So relate that to what we want to do.
Our message protocol defines a PUT > and a GET operation, some metadata perhaps and a section for the data. It > should have no restriction or care on how that data was serialized. The > protocol does not define in any way the structure of the data being PUT or > GET. > > Separating that concern then, does your argument still stand that RPC > frameworks do not work for the new Geode protocol? > > On Fri, Apr 7, 2017 at 3:11 PM Galen M O'Sullivan > wrote: > > > I think the main selling point of an RPC framework/IDL is ease-of-use for > > defined remote communications that look like function calls. If you have > > calls you're making to remote servers asking them to do work, you can > > fairly trivially define the interface and then call through. You can then > > use native types in function calls and they transparently get transformed > > and sent across the wire. > > > > The RPC protocols I've seen are based on the idea that the types that can > > be sent will be predefined -- otherwise it's hard to describe with an > IDL. > > > > However, we want to support storing unstructured data, or at least data > > structures that are defined (from the cluster's point of view) at runtime > > -- one of the main selling points of Geode is PDX serialization, which > lets > > us store arbitrary object structures in the cache. If we were to use an > RPC > > framework we have all the commands accept byte arrays and include some > > meta-information. This loses us the ease-of-use. > > > > What's left in the protocol then is the calls and the number of arguments > > they accept, and what order we put those (and the serialized arguments) > in > > on the wire. I don't think we gain much by using a preexisting RPC > > language, and we lose control over the wire format and message structure. > > If we want to be able to make the protocol really fast, and customized to > > our use case; if we want to implement asynchronous requests, futures, > etc.
> > then we have to write wrappers for a given language anyways, and packing > > those things through an RPC framework like Thrift or gRPC will be an > extra > > layer of confusing complexity. > > > > Best, > > Galen > > >
Re: Why we shouldn't use RPC Frameworks for the New Geode Protocol
You are confusing message protocol with user data serialization. The two are not related. Look at HTTP, the message protocol is pretty simple, PUT, GET, etc., but it does nothing with the data being PUT/GET. On a GET the message protocol has a field that specifies the Content-Type and Content-Encoding and some other metadata. So the GET could get HTML, JPEG, etc. but the protocol doesn't care and doesn't know anything special about that type of the data it puts. The structure for JPEG is not defined in the HTTP protocol at all. So relate that to what we want to do. Our message protocol defines a PUT and a GET operation, some metadata perhaps and a section for the data. It should have no restriction or care on how that data was serialized. The protocol does not define in any way the structure of the data being PUT or GET. Separating that concern then, does your argument still stand that RPC frameworks do not work for the new Geode protocol? On Fri, Apr 7, 2017 at 3:11 PM Galen M O'Sullivan wrote: > I think the main selling point of an RPC framework/IDL is ease-of-use for > defined remote communications that look like function calls. If you have > calls you're making to remote servers asking them to do work, you can > fairly trivially define the interface and then call through. You can then > use native types in function calls and they transparently get transformed > and sent across the wire. > > The RPC protocols I've seen are based on the idea that the types that can > be sent will be predefined -- otherwise it's hard to describe with an IDL. > > However, we want to support storing unstructured data, or at least data > structures that are defined (from the cluster's point of view) at runtime > -- one of the main selling points of Geode is PDX serialization, which lets > us store arbitrary object structures in the cache. If we were to use an RPC > framework we have all the commands accept byte arrays and include some > meta-information.
This loses us the ease-of-use. > > What's left in the protocol then is the calls and the number of arguments > they accept, and what order we put those (and the serialized arguments) in > on the wire. I don't think we gain much by using a preexisting RPC > language, and we lose control over the wire format and message structure. > If we want to be able to make the protocol really fast, and customized to > our use case; if we want to implement asynchronous requests, futures, etc. > then we have to write wrappers for a given language anyways, and packing > those things through an RPC framework like Thrift or gRPC will be an extra > layer of confusing complexity. > > Best, > Galen >
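Jacob's HTTP analogy can be sketched as a tiny envelope: the protocol defines the operation, a content-type tag, and a length, while the payload stays opaque bytes, mirroring how HTTP's Content-Type header lets a GET carry HTML or JPEG without the protocol knowing either format. The layout below is hypothetical, invented only to illustrate the separation of concerns:

```python
import struct

# Hypothetical message envelope, modeled on HTTP's Content-Type idea.
# The protocol layer defines op codes and a content-type tag; the
# payload bytes themselves are never interpreted here. The op codes,
# content-type codes, and header layout are all invented for this sketch.

OPS = {"PUT": 1, "GET": 2}
CONTENT_TYPES = {"raw": 0, "pdx": 1, "protobuf": 2}

def encode(op, content_type, payload: bytes) -> bytes:
    # header: op (uint8), content-type (uint8), payload length (uint32)
    header = struct.pack(">BBI", OPS[op], CONTENT_TYPES[content_type], len(payload))
    return header + payload          # payload is opaque to the protocol

def decode(buf: bytes):
    op, ctype, length = struct.unpack(">BBI", buf[:6])
    return op, ctype, buf[6:6 + length]

msg = encode("PUT", "pdx", b"\x01\x02opaque-serialized-value")
op, ctype, payload = decode(msg)
```

Swapping PDX for ProtoBufs, FlatBuffers, or anything else would only change the content-type tag, never the envelope itself, which is the independence Jacob is arguing for.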
Re: Why we shouldn't use RPC Frameworks for the New Geode Protocol
> On Apr 7, 2017, at 3:11 PM, Galen M O'Sullivan wrote: > > I think the main selling point of an RPC framework/IDL is ease-of-use for > defined remote communications that look like function calls. If you have > calls you're making to remote servers asking them to do work, you can > fairly trivially define the interface and then call through. You can then > use native types in function calls and they transparently get transformed > and sent across the wire. > > The RPC protocols I've seen are based on the idea that the types that can > be sent will be predefined -- otherwise it's hard to describe with an IDL. > > However, we want to support storing unstructured data, or at least data > structures that are defined (from the cluster's point of view) at runtime > -- one of the main selling points of Geode is PDX serialization, which lets > us store arbitrary object structures in the cache. If we were to use an RPC > framework we have all the commands accept byte arrays and include some > meta-information. This loses us the ease-of-use. IMO, data encoding should be a separate concern from message definition. I would advocate that any approach we take should view the key/value fields as opaque bytes. By keeping data encoding and message definition separate, we can evolve them independently. > > What's left in the protocol then is the calls and the number of arguments > they accept, and what order we put those (and the serialized arguments) in > on the wire. I don't think we gain much by using a preexisting RPC > language, and we lose control over the wire format and message structure. > If we want to be able to make the protocol really fast, and customized to > our use case; if we want to implement asynchronous requests, futures, etc. > then we have to write wrappers for a given language anyways, and packing > those things through an RPC framework like Thrift or gRPC will be an extra > layer of confusing complexity. I believe gRPC supports an async API.
I would really love to see benchmark data for a get() message using gRPC. Comparing that same message sent using an async Netty implementation would be instructive :-) > > Best, > Galen Anthony
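A benchmark along the lines Anthony suggests could start from a harness like the one below, which times two stand-in encoders for a get(key) message. Neither stand-in is gRPC or Netty (those would need full client/server stacks); the codecs here are placeholders that only show the shape of the comparison:

```python
import json
import struct
import time

# Two placeholder encoders for a get(key) message. These are NOT gRPC
# or Netty -- they stand in for "compact binary" vs "self-describing"
# encodings so the harness itself can be shown end to end.
def encode_binary(key: bytes) -> bytes:
    return struct.pack(">I", len(key)) + key

def encode_json(key: bytes) -> bytes:
    return json.dumps({"op": "get", "key": key.hex()}).encode()

def time_codec(encode, key, n=100_000):
    """Return wall-clock seconds to encode the message n times."""
    start = time.perf_counter()
    for _ in range(n):
        encode(key)
    return time.perf_counter() - start

key = b"customer/42"
t_bin = time_codec(encode_binary, key)
t_json = time_codec(encode_json, key)
print(f"binary: {t_bin:.4f}s  json: {t_json:.4f}s")
```

A real comparison would replace the encoders with an actual gRPC stub call and a Netty round trip, and measure full request/response latency rather than encode time alone.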
Why we shouldn't use RPC Frameworks for the New Geode Protocol
I think the main selling point of an RPC framework/IDL is ease-of-use for defined remote communications that look like function calls. If you have calls you're making to remote servers asking them to do work, you can fairly trivially define the interface and then call through. You can then use native types in function calls and they transparently get transformed and sent across the wire.

The RPC protocols I've seen are based on the idea that the types that can be sent will be predefined -- otherwise it's hard to describe with an IDL.

However, we want to support storing unstructured data, or at least data structures that are defined (from the cluster's point of view) at runtime -- one of the main selling points of Geode is PDX serialization, which lets us store arbitrary object structures in the cache. If we were to use an RPC framework we have all the commands accept byte arrays and include some meta-information. This loses us the ease-of-use.

What's left in the protocol then is the calls and the number of arguments they accept, and what order we put those (and the serialized arguments) in on the wire. I don't think we gain much by using a preexisting RPC language, and we lose control over the wire format and message structure. If we want to be able to make the protocol really fast, and customized to our use case; if we want to implement asynchronous requests, futures, etc. then we have to write wrappers for a given language anyways, and packing those things through an RPC framework like Thrift or gRPC will be an extra layer of confusing complexity.

Best,
Galen