(+dgq, who is working on the grpclb affinity design)

FYI, Abhishek told me yesterday that he's gotten some push-back from Sanjay
on this approach and is now exploring alternatives, such as possibly
pushing the problem up to the layer above gRPC and handling it in VTK.  I
think there are some drawbacks to that proposal (most notably the fact that
there would not be a centralized way to change the configuration for all
clients), but I'm going to hold off on further work on this design until we
see what that investigation yields.

That having been said, here are a few comments related to some of the
recent discussion.

First, the reason for not pushing the parameters for modifying the field
value up into the protobuf layer is that we wanted this mechanism to be
agnostic to the serialization mechanism used for the payload.  If, for
example, a user were using Thrift instead of protobuf, they should be able
to use the same mechanism.  And really, modifying the resulting value does
not have anything to do with what data structure the value originally came
from, so there's no reason that we need to link the two.
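
To make that concrete, here's a rough sketch of the kind of extraction spec
I have in mind (the Python and all of the names here are purely illustrative,
not from the doc):

    # Hypothetical sketch: the spec names only a field path and a target
    # header, so nothing in it depends on protobuf specifically.
    extraction_spec = {
        "field_path": "shelf.name",
        "header_name": "x-payload-field",
    }

    def extract_header(get_field, spec):
        # get_field is whatever accessor the codec (protobuf, Thrift, ...)
        # exposes for reading a field by path.
        return spec["header_name"], get_field(spec["field_path"])

    # Trivial dict-backed "payload" standing in for a real codec:
    payload = {"shelf.name": "shelves/123/books/456"}
    print(extract_header(payload.get, extraction_spec))
    # -> ('x-payload-field', 'shelves/123/books/456')

Any codec that can resolve a field path can plug into this; the spec itself
never mentions protobuf.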

That said, I am fine with the idea of restricting this mechanism to
extracting only the full field value and making other mechanisms
(e.g., grpclb affinity code or server-side request routing code such as
that in the GFEs) responsible for modifying the field value as needed.  I
like the fact that this makes it more obvious that extracting the payload
field and modifying the resulting value are two separate operations.  This
does mean that the grpclb affinity mechanism and the server-side
request routing mechanism would each have to implement their own code to
modify the value, but that also adds flexibility, since it would let each
of them implement it differently if needed.  And making the server do the
value modification probably doesn't add a lot of overhead there, since
we're still avoiding the need for the server to deserialize the request
payload in order to figure out how to route the request.
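
As a rough illustration of that split (again, everything here is
hypothetical and just for the sake of discussion), the affinity code would
own any manipulation of the value, operating only on the header produced by
the extraction step:

    import re

    # Hypothetical sketch: the LB policy derives its affinity key from the
    # already-extracted header value; the extraction mechanism itself never
    # modifies the value.
    AFFINITY_KEY_PATTERN = re.compile(r"^shelves/([^/]+)")

    def affinity_key(headers):
        value = headers.get("x-payload-field", "")
        match = AFFINITY_KEY_PATTERN.match(value)
        return match.group(1) if match else value

    print(affinity_key({"x-payload-field": "shelves/123/books/456"}))
    # -> '123'

The server-side routing code could apply a completely different rule to the
same header, which is exactly the flexibility described above.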

The only catch I can think of is that we would probably need to make sure
that we apply some sort of length limitation, so that we don't wind up
sending a huge amount of data when the server only needs a small prefix.
But that doesn't seem like a problem.
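
For example (again just a sketch, and the specific limit is made up), the
cap could be applied when the header is produced, so the client never sends
more than the routing layer could plausibly need:

    # Hypothetical sketch: cap the extracted value before it goes on the
    # wire.  (A real implementation would also need to avoid splitting
    # multi-byte characters.)
    MAX_EXTRACTED_VALUE_BYTES = 128  # illustrative number only

    def capped_header(name, value):
        encoded = value.encode("utf-8") if isinstance(value, str) else value
        return name, encoded[:MAX_EXTRACTED_VALUE_BYTES]

    name, value = capped_header("x-payload-field", "shelves/123/books/456")
    # value is at most 128 bytes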

If we do wind up going forward with this proposal, I will change the doc
accordingly.  I will update this thread once I know more.

On Thu, Feb 2, 2017 at 4:18 PM, Louis Ryan <lr...@google.com> wrote:

>
>
> On Thu, Feb 2, 2017 at 3:34 PM, Craig Tiller <ctil...@google.com> wrote:
>
>> Alternative proposal:
>> - this mechanism only allows single field extraction, with no string
>> processing
>>
>
> This seems like a very reasonable restriction if we're allowing more than
> one header value to be produced. Given that this allows the extraction
> rule to be opaque to gRPC (if not to real implementations), it can evolve
> over time as CEL or other data-type expression languages allow.
>
>
>> - when we do the affinity based load balancing design, we allow some
>> level of string manipulation there
>>
>> Advantages:
>> - this feature becomes universal (if we need to provide a smaller binary,
>> we can drop load balancers, but keep support for dos protection via this
>> mechanism)
>> - we minimize the blast radius of possible implementations in wrapped
>> languages
>>
>> On Thu, Feb 2, 2017 at 3:02 PM Craig Tiller <ctil...@google.com> wrote:
>>
>>> Alfred linked me to some docs explaining this.
>>>
>>> FTR this seems like very use-case specific bloat being piled onto gRPC,
>>> with a clearly stated aim to pile more bloat onto the mechanism in the
>>> future.
>>>
>>> My recommendation would be to drop the delimiter and find some other
>>> solution for these use-cases.
>>>
>>> On Thu, Feb 2, 2017 at 2:54 PM Craig Tiller <ctil...@google.com> wrote:
>>>
>>> Can you describe your actual use case for this, and why it can't be
>>> accomplished server side? (that seems to be completely missing here, or
>>> I've missed it somewhere).
>>>
>>> Also, what is CEL?
>>>
>>> On Thu, Feb 2, 2017 at 2:46 PM Alfred Fuller <arful...@google.com>
>>> wrote:
>>>
>>> Why not make the headerName a key in an outer map, instead of using a
>>> list? That would enforce uniqueness.
>>>
>>>
>>> On Thu, Feb 2, 2017 at 1:23 PM Alfred Fuller <arful...@google.com>
>>> wrote:
>>>
>>> Interestingly, most of the time, the extracted fields are also bound to
>>> the URL for REST, in which case the strings are URL-encoded and the bytes
>>> are base64-encoded. So it would seem to make sense to follow suit.
>>>
>>> On Thu, Feb 2, 2017 at 1:19 PM Alfred Fuller <arful...@google.com>
>>> wrote:
>>>
>>> I think this type of config is very service-centric, not
>>> interface-centric. For example, look at the LRO API. Different services
>>> use the same protobuf, but have different resource name formats. If the
>>> requirement were to copy a field verbatim, it 'might' work to define the
>>> same extraction for all services that implement LRO, but the requirement
>>> here is that the client is
>>> able to uniquely identify an 'affinity key', so the protocol must be able
>>> to extract the exact substring needed.
>>>
>>> I can also see some services implementing a routing layer that is happy
>>> to inspect everything, and in those cases, no extraction is needed even
>>> though the same interface is exposed.
>>>
>>> Also, this eliminates the problems of clients using old configuration
>>> with new infrastructure.
>>>
>>> So I would say this is not interface configuration, and it should not be
>>> in the interface definition (i.e., the proto files).
>>>
>>> On Thu, Feb 2, 2017 at 1:00 PM Louis Ryan <lr...@google.com> wrote:
>>>
>>> I'm a little confused by this spec. Given that protobuf is backed by a
>>> significant amount of code-generation and expression-handling machinery,
>>> why is the spec putting responsibility for the extraction and formatting
>>> of these headers on the gRPC runtime?
>>>
>>> Is there some requirement to be able to configure the field extraction
>>> behavior dynamically for APIs that did not anticipate the need for it when
>>> they were launched? If so, it seems like that responsibility should rest
>>> entirely on the protobuf runtime, and the spec for gRPC should just talk
>>> about how to talk to it to get something extracted. The same would be true
>>> for other encoding runtimes.
>>>
>>> Given that protobuf has (or will have) a standard expression language, it
>>> seems like that is what should be passed up to the protobuf runtime.
>>>
>>> On Wed, Feb 1, 2017 at 4:28 PM, Alfred Fuller <arful...@google.com>
>>> wrote:
>>>
>>> +Trevor Schroeder <trev...@google.com> what is easiest for the GFE and
>>> friends? Base64-decoding a header, URL-unescaping a header, or something
>>> else?
>>>
>>> (I like the idea of %-encoding for strings and base64-encoding bytes,
>>> though if I had to pick only one, it would probably be base64, as the
>>> blowup from that is fixed and known.)
>>>
>>> On Wed, Feb 1, 2017 at 10:02 AM Mark D. Roth <r...@google.com> wrote:
>>>
>>> On Wed, Feb 1, 2017 at 9:57 AM, Eric Anderson <ej...@google.com> wrote:
>>>
>>> On Wed, Feb 1, 2017 at 9:38 AM, Mark D. Roth <r...@google.com> wrote:
>>>
>>> Right off the cuff, I can think of a few possible options here:
>>>
>>> 1. Always base64-encode the extracted values.
>>> 2. Do base64 encoding only when non-ASCII characters are actually
>>> present.
>>> 3. Simply strip out non-ASCII characters.
>>>
>>>
>>> There's also the option of encoding the special characters. Say, with
>>> %-encoding. We're doing this for status messages. I think we are sad each
>>> time we have to do this, but it frequently seems to be the least-bad
>>> solution.
>>>
>>> Note this would get pretty strange (from a parsing perspective) when
>>> values are binary, not text. So we may want a different solution for binary.
>>>
>>>
>>> You're right, that is another viable option.  And I do like the fact
>>> that it makes things easier to debug in the common case, where the string
>>> is mostly ASCII but may have a small number of non-ASCII characters.
>>>
>>> The downside of this approach is that, as you pointed out, we would
>>> probably want a separate solution for binary data.  That would mean two
>>> different code-paths and probably some way to indicate which one we want to
>>> use for each header extraction spec.
>>>
>>> Alfred, what are your thoughts on all of this?
>>>
>>>
>>> --
>>> Mark D. Roth <r...@google.com>
>>> Software Engineer
>>> Google, Inc.
>>>
>>>
>>>
>


-- 
Mark D. Roth <r...@google.com>
Software Engineer
Google, Inc.
