On Fri, Jan 20, 2017 at 11:36 AM, 'Mark D. Roth' via grpc.io <
grpc-io@googlegroups.com> wrote:

> On Fri, Jan 20, 2017 at 9:23 AM, Julien Boeuf <jbo...@google.com> wrote:
>
>>
>>
>> On Fri, Jan 20, 2017 at 7:17 AM, 'Mark D. Roth' via grpc.io <
>> grpc-io@googlegroups.com> wrote:
>>
>>> On Thu, Jan 19, 2017 at 3:06 PM, 'Julien Boeuf' via grpc.io <
>>> grpc-io@googlegroups.com> wrote:
>>>
>>>> +stubblefield since he expressed interest.
>>>>
>>>> Thanks Mark for the reply. Please see inline.
>>>>
>>>> Cheers,
>>>>
>>>>      Julien.
>>>>
>>>> On Thu, Jan 19, 2017 at 8:08 AM, Mark D. Roth <r...@google.com> wrote:
>>>>
>>>>> Julien,
>>>>>
>>>>> The gRFC process
>>>>> <https://github.com/grpc/proposal/blob/master/README.md> says that
>>>>> all discussion should happen in this thread, rather than in the PR.  So
>>>>> I'll reply to your comments here.
>>>>>
>>>> Ack. Sorry about that.
>>>>
>>>>
>>>>>
>>>>> I agree with you that the proxy mapper could set the HTTP CONNECT
>>>>> argument to a server name instead of to an IP address.  However, that
>>>>> would not be enough to address the case where the servers' DNS
>>>>> information is not available, at least not in the general case, because
>>>>> the client still needs to know the set of server addresses in order to
>>>>> open the right set of connections to load-balance across.
>>>>>
>>>>> As you and I have discussed, in the specific case where the grpclb
>>>>> load balancing policy is in use, then you could in principle make this
>>>>> work, because the set of server addresses will actually come from the load
>>>>> balancers instead of from the name resolver.  However, this would require
>>>>> a number of additional hacks:
>>>>>
>>>>>    - The name resolver would somehow have to know that when you
>>>>>    request a load balanced name, it should return the address of the
>>>>>    proxy but with the "is_balancer" bit set.
>>>>>
>>>> Correct. This can be done using a naming convention, which is a
>>>> reasonable thing to do.
>>>>
>>>>
>>>>>
>>>>>    - The proxy mapper would need some way to differentiate between
>>>>>    the connections to the load balancers and the connections to the
>>>>>    backend servers, so that it could set the HTTP CONNECT argument to
>>>>>    the server name for the load balancer connections and to the IP
>>>>>    address for the backend server connections.
>>>>>
>>>> Yes, that is correct. A way to do that is to work hand in hand with
>>>> the resolver, which would set a well-known / invalid IP address in the
>>>> case of the balancer connection (e.g. a link-local address) so that the
>>>> proxy mapper can process it as a special case. It's not great, but it
>>>> would certainly work.
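
A minimal sketch of the sentinel idea described above, written in Python rather than in gRPC's own code. The sentinel value 169.254.0.1, the `ProxyTarget` type, and the `map_for_proxy` function are hypothetical names chosen for illustration; the thread does not specify any of them, and the sketch assumes IPv4 addresses only.

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical sentinel: a link-local address that the resolver would return
# for a balancer name it cannot resolve. This is not a value defined by gRPC.
BALANCER_SENTINEL = ipaddress.ip_address("169.254.0.1")


@dataclass(frozen=True)
class ProxyTarget:
    proxy_address: str     # where the TCP connection is actually opened
    connect_argument: str  # what goes on the HTTP CONNECT request line


def map_for_proxy(resolved_ip: str, server_name: str, port: int,
                  proxy_address: str) -> ProxyTarget:
    """Choose the HTTP CONNECT argument for one resolved address."""
    ip = ipaddress.ip_address(resolved_ip)
    if ip == BALANCER_SENTINEL:
        # Balancer connection: pass the name and let the proxy resolve it.
        return ProxyTarget(proxy_address, f"{server_name}:{port}")
    # Backend connection: the address came from the balancer, so pass it as is.
    return ProxyTarget(proxy_address, f"{resolved_ip}:{port}")
```

The special case lives in two places (the resolver that emits the sentinel and the mapper that recognizes it), which is the coupling discussed in this thread.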
>>>>
>>>>
>>>>>
>>>>>    - The proxy itself would have to know how to resolve the internal
>>>>>    name of the load balancers.
>>>>>
>>>> Yes, this is totally reasonable and is one of the benefits of using
>>>> HTTP CONNECT. In fact, we are using that very feature for the http_proxy
>>>> env var case today.
>>>>
>>>>
>>>>> And even once all of those hacks are implemented, this approach still
>>>>> only works for the case where the grpclb load balancing policy is in use.
>>>>> If you want to use something like round_robin instead, it won't work
>>>>> at all.
>>>>>
>>>> IMO, it is OK. I don't believe that round robin is very useful if you
>>>> have grpclb at your disposal. If your client is not able to properly
>>>> resolve names, then round-robin is out of the equation to begin with.
>>>>
>>>>
>>>>>
>>>>> I continue to believe that running a gRPC-level proxy is a better
>>>>> solution for this use-case.
>>>>>
>>>> I agree that this could work. However, this is no silver bullet. Here
>>>> are some issues I have with this scheme.
>>>> 1. This requires the deployment of a full gRPC proxy in the path as
>>>> opposed to a more standard HTTP CONNECT proxy.
>>>>
>>>
>>> That's true, but I think that having this kind of proxy would be fairly
>>> useful in other scenarios as well.
>>>
>>>
>>>> 2. More importantly, it requires the termination of the secure session
>>>> at the proxy which means that the proxy has to be fully trusted.
>>>>
>>>
>>> Is this a significant problem, given that the proxy would be under the
>>> control of the same organization as the servers?
>>>
>> This is not necessarily the case. And even if it is the same
>> organization, such a proxy would be able to impersonate any of these
>> connections which basically makes it "root" on all gRPC backends that
>> accept connections through it: this is something that we are trying hard to
>> avoid.
>> On the other hand, even if an HTTP CONNECT proxy runs in a more
>> privileged environment since it has access to name resolution, it is not
>> able to impersonate clients and as such a compromised proxy has a limited
>> blast radius.
>> Since a proxy lives on the edge of 2 security zones (e.g. less trusted on
>> the client side, and more trusted on the backend side), it is very much
>> subject to attacks as it is exposed on the less trusted side.
>>
>
> That's a good point.  I guess the trade-off here is that in the gRPC-level
> proxy case, you would no longer expose individual servers to attacks, since
> they'd be hidden behind the proxy.  But a successful attack on any
> individual service would only compromise that service, not every service
> behind the proxy, so perhaps this is a worthwhile trade-off.
>
>
>>
>>
>>>
>>>
>>>> 3. Even if the proxy is fully trusted, you will need a way to:
>>>> - carry the whole authentication information of the client from the
>>>> proxy to the backend (e.g. attributes, restrictions, etc.).
>>>> - depending on your transport security protocol, you may or may not
>>>> have access to something like Server Name Indication (SNI:
>>>> https://en.wikipedia.org/wiki/Server_Name_Indication) which would be
>>>> needed in this kind of deployment.
>>>>
>>>
>>> Won't having those capabilities be useful in other scenarios too?
>>>
>>>
>>> I certainly agree that there's some work that needs to be done for the
>>> gRPC-level proxy approach.  However, it seems like that work would yield a
>>> set of tools that would be generally useful in other situations -- in
>>> effect, we'd be creating new building blocks that we could compose in
>>> different ways in the future to solve other problems.  In contrast, the
>>> hacks described above that would be necessary to do the work on the gRPC
>>> client would only be useful in this particular scenario, and they would
>>> actually complicate the existing code instead of providing new building
>>> blocks that can be reused later.
>>>
>> For me, the biggest 'hack' is the link-local IP address (or a marker that
>> IP resolution did not work). For the rest, I don't believe that these are
>> hacks. I also believe that the implications on the code are not that bad:
>> the proxy mapper will have to return the parameters for the HTTP CONNECT
>> (which it may have to do anyway if custom headers are needed in the CONNECT
>> request) as opposed to returning just a new IP address and letting the
>> framework do the HTTP CONNECT.
>>
>
> A proxy mapper will pretty much always need to return the HTTP CONNECT
> argument anyway, so that's not a problem from my perspective.
>
> I agree with you that the main code hack here is having some sort of
> "sentinel" address returned by the resolver, and that hack has to live in
> two places: both the resolver and the proxy mapper.  But in addition, this
> is still only a partial solution, because it will work with grpclb but not
> with round_robin, and it will not allow access to the service config
> information.
>
> Actually, could we resolve this by externally publishing a DNS record for
> the service name that points to the proxy address and has the is_balancer
> bit set?  It wouldn't have to expose anything about the internal network
> architecture; it would just be an externally facing record to point the
> client to the proxy.  It could even include service config information.  If
> we did that, then the resolver would not need to do anything special; the
> only thing we'd need would be the proxy mapper to redirect requests for
> internal addresses through the proxy.  This would essentially reduce the
> problem so that this would look a lot more like case 2.  What do you think?
>
> (This reminds me that I still need to put together a gRFC for how the
> is_balancer bit is going to be encoded in DNS.  But for the purposes of
> this discussion, let's assume that problem is solved.)
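
A rough sketch of the proxy mapper behavior suggested above, where the mapper only has to redirect connections for internal addresses through the proxy. The proxy endpoint, the internal address range, and the `map_address` function are hypothetical and chosen for illustration; this is Python, not gRPC's actual proxy mapper interface.

```python
import ipaddress
from typing import Optional, Tuple

# Hypothetical values: the proxy endpoint and the internal ranges that are
# reachable only through it. Neither comes from the thread.
PROXY_ADDRESS = "203.0.113.10:3128"
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def map_address(address: str, port: int) -> Optional[Tuple[str, str]]:
    """Return (proxy_address, connect_argument) if the address must go
    through the proxy, or None if the client can connect to it directly."""
    ip = ipaddress.ip_address(address)
    if any(ip in network for network in INTERNAL_NETWORKS):
        return PROXY_ADDRESS, f"{address}:{port}"
    return None
```

Under this assumption the resolver needs no special case: it returns whatever the externally published record contains, and only the mapper knows which addresses are internal.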
>
I guess that could indeed work. Just to make sure I understand correctly:
1. Load balanced name (lb_name) is resolved by DNS to (proxy_ip_addr, lb =
true) (and maybe service config).
2. proxy mapper recognizes the proxy_ip_addr and returns instructions to
form a CONNECT request to this proxy_ip_addr which would look like:
CONNECT <lb_name> HTTP/1.1
Host: <lb_name>
<Custom headers>
3. The proxy resolves the real name and talks to the load balancer. From
now on, we have end to end communication between the client and the LB.
4. The client receives IP addresses of backends from the LB Channel.
5. The proxy mapper recognizes these IP addresses and issues a CONNECT
request to the proxy_ip_addr (mapped) which would look like:
CONNECT <backend_ip> HTTP/1.1
Host: <backend_ip>
<Custom headers>
6. The proxy makes a connection to the <backend_ip>. From now on, we have
end to end communication between the client and the backend.
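
The two CONNECT requests in steps 2 and 5 could be formatted roughly as follows. This is a sketch under stated assumptions: `build_connect_request` is a hypothetical helper, the host name, IP address, and header are made-up example values, and port 443 is assumed because the CONNECT request target takes the host:port form, while the steps above leave the port out.

```python
from typing import Dict, Optional


def build_connect_request(target: str, port: int,
                          custom_headers: Optional[Dict[str, str]] = None) -> bytes:
    """Format an HTTP/1.1 CONNECT request for a host name or IPv4 address."""
    authority = f"{target}:{port}"
    lines = [f"CONNECT {authority} HTTP/1.1", f"Host: {authority}"]
    for name, value in (custom_headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line ends the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# Step 2: tunnel to the load balancer by name; the proxy resolves the name.
lb_request = build_connect_request("lb.internal.example", 443)

# Step 5: tunnel to a backend using the IP address the balancer returned.
backend_request = build_connect_request("10.1.2.3", 443, {"X-Example": "1"})
```

In both cases the bytes are sent to the proxy address; only the CONNECT argument differs between the balancer and backend connections.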


>
>
>>
>>
>>>
>>> I think we are in agreement that either approach could be made to work.
>>> However, I think the gRPC-level proxy approach is cleaner and provides more
>>> long-term benefit.
>>>
>> I don't think that these 2 approaches are equivalent in terms of
>> security. While the gRPC-level proxy could be useful, it may not fulfill
>> some security requirements as I tried to explain above. On the other hand,
>> the trust that we put on a TCP-level proxy is much more tunable.
>>
>
> You're right that there are trade-offs here.  I will update the gRFC to
> document this once we figure out the details of the client-side approach.
>
Thanks! Very much appreciated.


>
>
>>
>>
>>
>>
>>>
>>>
>>>>
>>>>
>>>>>
>>>>> On Wed, Jan 18, 2017 at 2:18 PM, Julien Boeuf <jbo...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks I saw this. I'll comment on the doc.
>>>>>>
>>>>>> BTW, I'm at an offsite today (and I was yesterday) but this is really
>>>>>> high on my priority list.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>>     Julien.
>>>>>>
>>>>>> On Wed, Jan 18, 2017 at 2:12 PM, Mark D. Roth <r...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I've created a gRFC describing how HTTP CONNECT proxies will be
>>>>>>> supported in gRPC:
>>>>>>>
>>>>>>> https://github.com/grpc/proposal/pull/4
>>>>>>>
>>>>>>> Please keep discussion in this thread.  Thanks!
>>>>>>>
>>>>>>> --
>>>>>>> Mark D. Roth <r...@google.com>
>>>>>>> Software Engineer
>>>>>>> Google, Inc.
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "grpc.io" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to grpc-io+unsubscr...@googlegroups.com.
>>>> To post to this group, send email to grpc-io@googlegroups.com.
>>>> Visit this group at https://groups.google.com/group/grpc-io.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/grpc-io/CAAvOVOd%3DXK0Mw1E9L0hnb7Tb5RaDESvVb%3D9S7GE99HfR4w1djg%40mail.gmail.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
>

