On 3/13/2017 7:31 PM, William A Rowe Jr wrote:
> On Sat, Mar 11, 2017 at 1:33 PM, Daniel Ruggeri <drugg...@primary.net> wrote:
>> On 2/20/2017 10:58 AM, William A Rowe Jr wrote:
>>> On Sat, Feb 18, 2017 at 4:44 PM, Daniel Ruggeri <drugg...@primary.net> wrote:
>>>> Hi, Bill;
>>>>    I've replied about the pre_connection situation - hoping someone can
>>>> give the proposed patch a test as I don't have a handy H2 testbed.
>>> Yup! Will review that thread - it's the -1 half (as opposed to a general
>>> -0 half for a 'pause' request while I was trying to get to reviewing the
>>> original commit.)
>> No worries at all. Reviews are important.
>>
>> I am curious what you mean about -1 half vs -0 half, though. Is that
>> -1.5 vs -.5? :-)
> The entire -1 reservation was simply till a fix is in place... My -0
> reservation (or 'half') was just a beg for time to review...  if you
> applied it literally it would be -0.5 :)
>
>>>> On the other comment, can you help me understand what redundant code is
>>>> happening per-request? When manipulating the request, there are only
>>>> four things happening differently:
>>>> 1. A check that we have data stored away from the connection filter
>>>> 2. A check that the connection data has a client IP
>>>> 3. The assignment of the data to the request_rec's structure and logging
>>>> at TRACE1
>>>> 4. If no data was found, a check to see if it was optional and a logging
>>>> statement/return according to that result
>>> AIUI; the directives are all configured per-Server, the PROXY protocol data
>>> is fixed for the lifespan of the Connection.  The PROXY protocol is
>>> significantly more binding than either x-f-f or even x-remoteip. I'm not
>>> even sure where the 'optional' scheme originated; if present when not
>>> allowed, that's a probable abuse pattern, and when not present when
>>> honored, that too indicates some malfunction and traffic shouldn't proceed
>>> IMO. I don't know that the optional list should be shipped, it's far too
>>> simple to create a completely insecure setup that won't raise eyebrows.
>>> The PROXY protocol reference spec states the connection (by origin or
>>> destination IP) follows the PROXY protocol, or it does not.
>> Sorry to mix threads. I just replied a moment ago with a bit of
>> reasoning behind the Optional use case. While it's possible that a
>> server admin could mistakenly enable something they don't intend to or
>> open things up more than they should, that's applicable any time someone
>> enables some sort of authnz. I'm happy to reinforce this point in the
>> docs for the Optional case but I still think enough utility is there to
>> include it.
> Is OK - Read your replies to my questions in that thread and they were
> very clear, thanks.
>
> Your objection in this thread centered around being able to connect both
> as a test/monitoring and as a consumer of the site passing through HAProxy.
> I'd expect that to be a binary decision based on the origin IP/netmask?
> Not sure that is settled, we can dive deeper into that subject, but you are
> not wrong that a given vhost needs to be monitored without PROXY protocol
> and trafficked via PROXY+httpd. I'm hoping the by-immediate-peer IP is
> sufficient to accomplish that as a binary decision, per the spec you refer
> to below.

At first blush, I don't see why not. Inspecting the client IP and
determining if it is in a certain network range would be an
easy/efficient toggle for whether the filter gets injected or not. I'm
sure someone could dream up a use case, but since this filter is dealing
with layer 4 information it seems reasonable for layer 4 to be the
deciding point whether or not to use it.
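
As a rough sketch of that toggle (Python purely for illustration, this is
not httpd code; the function name and the network list are made up for the
example), the decision is just a membership test of the immediate peer
address against a configured set of ranges:

```python
import ipaddress

# Hypothetical ranges whose connections are expected to speak PROXY protocol.
PROXY_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("2001:db8::/32"),
]

def expects_proxy_header(peer_ip, networks=PROXY_NETWORKS):
    """Return True if the immediate peer falls inside a PROXY-enabled range."""
    addr = ipaddress.ip_address(peer_ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed lists are safe to scan.
    return any(addr in net for net in networks)
```

A monitoring host outside those ranges would get a plain connection, while
anything arriving from the load balancer's range would get the filter.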


>
>>> Beyond that concern, I'm wondering if we shouldn't be using the *original*
>>> design of mod_remoteip, changing the conn_rec client_addr/client_ip (and
>>> null out remote_host/logname) and never alter it between requests.
>>>
>>> We can leave a conn pool note behind for the per-req processing, to retrieve
>>> the proxy IP into a req variable if desired during the rest of the remoteip
>>> request phase, but the remaining per-req code and processing is near
>>> insignificant.
>>>
>>> Thoughts?
>>>
>>>> This should all be quite straightforward per request... In fact, it's a
>>>> much shorter logical path and less work than having to parse the
>>>> X-Forwarded-For header.
>>> So I was unspooling how we would handle stacked variables.
>>>
>>> Any PROXY protocol is the nearest hop; if multiple PROXY protocol header
>>> lines occurred, the closest would be transmitted first, etc.
>> I'm not sure if multiple PROXY lines are permitted. Looking at section
>> 4.1, I think the intent is that PROXY-aware servers would continue
>> propagating the original client IP address in any PROXY headers they emit.
>> For example, in the diagram in section 4.1, PX2 should emit a PROXY
>> header to the backend server that has the client IP it received from the
>> PROXY header in PX1.
>>
>> Ref: http://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
> So I read nothing that prohibits it... and you?

No, I did not. My reading of the examples and descriptions is that there
would be one header, but there's certainly no RFC2119-like grammar to
remove all doubt.
I can follow up with the author if you think it's vague enough to warrant it.
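
To illustrate the "one header" reading (a sketch only, in Python for
brevity; the function name and the returned dict keys are my own, and it
handles just the human-readable v1 format), a receiver would consume
exactly one line and hand everything after it to the next layer:

```python
import ipaddress

def split_proxy_v1(data):
    """Consume exactly one PROXY v1 line; return (info, remaining_bytes)."""
    line, sep, rest = data.partition(b"\r\n")
    if not sep or not line.startswith(b"PROXY "):
        raise ValueError("missing PROXY v1 header")
    fields = line.decode("ascii").split(" ")
    if fields[1] == "UNKNOWN":
        # Sender could not describe the connection; no usable addresses.
        return None, rest
    if len(fields) != 6 or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("malformed PROXY v1 header")
    _, _, src, dst, sport, dport = fields
    info = {
        "client_ip": str(ipaddress.ip_address(src)),
        "server_ip": str(ipaddress.ip_address(dst)),
        "client_port": int(sport),
        "server_port": int(dport),
    }
    return info, rest
```

Under this reading a second "PROXY ..." line would simply be left in the
remaining bytes and reach the HTTP parser as (invalid) request data.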


>
>>> All local x-remoteip style values would be the next most distant hop; very
>>> similar to the haproxy protocol, it indicates some absolutely trusted edge
>>> router/balancer.
>>>
>>> Any x-f-f that occurs would reflect all the next most distant hops. Finally,
>>> any 'Forwarded' headers (rfc7239) are the most distant hops. I'm basing
>>> that conclusion on the fact that all 'Forwarded'-aware intermediaries which
>>> construct a 'Forwarded' header would not carry the x-f-f, but concatenate
>>> these as closer than the nearest 'Forwarded'-aware hop. So the presence
>>> of an x-f-f header indicates the presence of a 'Forwarded'-unaware agent
>>> between this incoming connection and the closest 'Forwarded'-aware agent.
>> Yep, I follow the thought process and agree.
>> This assumes that the intermediary isn't being clever or dumb by...
>> * Sending the traffic as it received it (so not technically complying
>> with any of the methods of propagating client and intermediary info)
>> * Sending an appropriate 7239 header, but blindly passing X-Forwarded-For
>> * Rewriting both headers to contain the same data in their expected formats
>>
>> FWIW, I feel the struggle of unwrapping all of this, too. At $dayjob,
>> because of the potential silliness of various intermediaries, we chose
>> to create a custom header that is always written (dropped if it comes to
>> us) when our edge devices receive a connection.
> That is the way to always handle it. I do see the PROXY protocol as
> ALWAYS trusted (otherwise, it would be a 400 response to nonsense
> request input.)
>
>
>>> I'm not suggesting these two enhancements, PROXY and RFC7239 are
>>> intertwined, we can certainly ship them in different releases, but I was
>>> having problems working out X-F-F vs Forwarded until I was working
>>> through the PROXY logic and came to the conclusion above, and am
>>> looking for others to sanity-check my logic on this.
>> Actually, you bring up a really good point that I had not explicitly
>> considered. If a backend server is presented with a PROXY header AND an
>> HTTP header (either a X-Forwarded-For or Forwarded), which header should
>> it use? In theory there can be many intermediate hops that can add to
>> the data.
>> This is important for us on two fronts:
>> * For mod_remoteip, we'd have to decide which to use. The current method
>> is to prefer PROXY.
>> * If we add PROXY support to mod_proxy, we have to decide which to propagate
> IMO, we /never/ add PROXY support to mod_proxy under any circumstance.
> It is a crude hack for a specific use case... injecting metadata on a
> vanilla tunnel.
> The only application of that brute-force might be for mod_proxy_connect.
Sure. I'm heavily neutral, if that even makes sense, on the topic. Just
something to keep in mind. I know there's been some experimentation (or
even use in anger that I don't know about) with a vanilla TCP proxy in
httpd. With several cloud providers (AWS and GCP confirmed) offering
PROXY support for their TCP load balancers, it feels like an emerging
de facto standard.

> We can do several things here, already we support RemoteIPProxiesHeader
> and the PROXY step should be part of that header. We can expand this to
> present a proper RFC7239 Forwarded header, or use X-F-F formatting.
> The X-F-F value still presents the back end server with all the untrusted
> hops of the proxy chain, the RemoteIPProxiesHeader retains all of the
> trusted hops, and the r->useragent is the identity presented by the most
> remote trusted hop. Of course, the PROXY hop is one of those trusted IPs.
>
> I'm thinking we want the option to preserve this r->useragent hop in the
> RemoteIPHeader to be presented in mod_proxy_http requests, and in
> creating this option, choose to format this as legacy (existing behavior),
> X-F-F syntax, or RFC7239 Forwarded syntax. Everything simplified out
> of the RemoteIPHeader is optionally preserved in RemoteIPProxiesHeader.
>
> We support X-F-F to some extent today, but not properly. But because we
> are an HTTP server which can mangle HTTP request metadata, and our
> proxy connections are not remote connection-bound, we should probably
> apply the logic above to generate an RFC7239 Forwarded header. This
> is where we probably collapse all
> Whoops, sorry...
>
> "Where we should probably collapse all" trusted proxy data into the alternate
> header, and relay all remaining untrusted X-F-F/Forwarded data on to the
> client as 'you deal with this'.
>
> Or add a flag to recombine it all and let the backend reprocess it all, but
> the entire point of putting httpd somewhere in the chain is to deduplicate
> entire point of putting httpd somewhere in the chain is to deduplicate and
> eliminate useless data and CPU time.
>
> This all requires further consideration, let's dig deeper into that, after we
> ensure that PROXY protocol is an implicitly 'trusted' IP agent, and any and
> all SEGV's are resolved.

I agree with the notion that a device presenting a PROXY header should
be considered trusted. My thought process is that if you are using PROXY
at all, the webserver must be behind some kind of hop intended to be the
termination point for your traffic and you care what the real IP address
is. It seems self-evident to say, but an admin wouldn't enable this
functionality unless they had such a need. Therefore, they are knowingly
telling the webserver (implicit form of trust rather than explicit) that
the incoming header from whatever is upstream should be consumed and used.
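
To make that trust model concrete, here is a simplified sketch (Python for
illustration only; it is not mod_remoteip's actual code, and the function
name, arguments and return shape are assumptions of mine). The address from
the PROXY header is accepted as given, and X-Forwarded-For is then unwound
from the nearest hop outward only while each hop is a configured trusted
proxy:

```python
import ipaddress

def resolve_useragent_ip(proxy_src, xff, trusted):
    """Return (useragent_ip, remaining_untrusted_hops).

    proxy_src: source address taken from the PROXY header (implicitly trusted)
    xff:       raw X-Forwarded-For value, most distant hop first
    trusted:   CIDR strings for proxies allowed to vouch for the next hop
    """
    nets = [ipaddress.ip_network(n) for n in trusted]
    current = ipaddress.ip_address(proxy_src)
    hops = [h.strip() for h in xff.split(",") if h.strip()]
    # Step outward one hop at a time, but only while the current hop is
    # itself a trusted proxy; stop at the first address we cannot vouch for.
    while hops and any(current in n for n in nets):
        current = ipaddress.ip_address(hops.pop())
    return str(current), hops
```

With a trusted list of 10.0.0.0/8, a PROXY source of 10.0.0.5 and an
X-Forwarded-For of "203.0.113.7, 198.51.100.2", the walk stops at
198.51.100.2 and leaves 203.0.113.7 as untrusted data.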

Expanding on the first topic, preserving this information in case
mod_proxy wants to share it downstream is a very good idea. It's a bit
beyond the scope of introducing initial PROXY functionality (read as: my
own time as I build and teach a course this semester), but if you have a
few cycles I'm happy to review when you can get to it.

Otherwise, I'm not aware of any remaining crashes/failures/challenges.
I've tested it quite a bit on my own with valid/invalid traffic and
Ruediger did an awesome job of hypothesizing potential failure cases to
catch before they become shipped bugs. I appreciate the time and
attention since I know you're also busy - just hoping we can get this
folded in relatively soon.

-- 
Daniel Ruggeri
