Re: [PREVIEW] 1xx response forwarding

2010-08-21 Thread Amos Jeffries

Alex Rousskov wrote:

On Mon, 16 Aug 2010 15:53:42 -0600, Alex Rousskov
rouss...@measurement-factory.com wrote:

 Hello,

  We need to forward 1xx control messages from servers to 
clients. I

 see two implementation options:

 1. Use Store. Squid client side expects responses via storeClientCopy,
 so we will be using the usual/normal code paths. Multiple 1xx 
responses

 may be handled with relative ease. The 1xx responses in Store will be
 treated kind of as regular response headers, except they will not be
 cached and such. The code will need to skip them until they reach 
the

 socket-writing client.

 2. Bypass Store. Contact fwdStart caller (e.g., clientReplyContext)
 directly and give it a 1xx response to forward. Store code remains
 unchanged. It may be difficult to get from the fwdStart caller to the
 client socket and comm_write. It will be difficult to handle multiple
 1xx responses or a regular response that arrives before we are done 
with

 writing 1xx response (all unusual, but can happen!).

 Both approaches may have to deal with crazy offset management,
 clientStreams manipulations, and other client-side mess.


 Do you see any other options? Which option is the best?


On 08/16/2010 04:06 PM, Amos Jeffries wrote:
My earlier plan if I did it was to do (2). The complication only 
occurs at

one point, finding the client FD.
comm_write() should not be altering offset of the higher level store 
stuff

directly. If it is that is a bug to be fixed.

Pipelining the responses one at a time with a simple block on further
reply passing-on until the existing header set has been finished with 
gets

around any trickiness with multiple or early real responses.


The block on further reply passing is far from simple because it needs 
to deal with two async jobs.


On 08/18/2010 02:11 PM, Henrik Nordström wrote:

mån 2010-08-16 klockan 15:53 -0600 skrev Alex Rousskov:


Both approaches may have to deal with crazy offset management,
clientStreams manipulations, and other client-side mess.


Yes.

For now I think we need to bypass store to make this sane, and it's
probably also a step in the right direction in general.


Thank you both for your feedback!

The attached patch implements the Bypass Store design and forwards 1xx 
control messages to clients that are likely to be able to handle such 
messages.


The patch appears to pass initial tests but more testing and a sync with 
trunk are needed. There is also one XXX that I still need to resolve, 
but it requires some code from the bug #2583 (pure virtual call) fix. I 
will switch to committing that fix now.


Meanwhile, if you have a chance, please review the overall direction of 
the patch. Preamble has more notes.


The patch removes the ignore_expect_100 feature because we now forward 
100 Continue messages. Is everybody OK with that removal?


No. Bad naming maybe, but its primarily there to suppress 417 responses 
being generated in the negative case to broken clients. Even with 
working 1xx support internally we may still be required to suppress 
these (ie when forwarding to a known 1.0 peer).



Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.6
  Beta testers wanted for 3.2.0.1


Re: [PREVIEW] 1xx response forwarding

2010-08-20 Thread Henrik Nordström
tor 2010-08-19 klockan 10:41 -0600 skrev Alex Rousskov:

 The patch removes the ignore_expect_100 feature because we now forward 
 100 Continue messages. Is everybody OK with that removal?

May need to keep/resurrect it when adding next hop version check as
required by Expect..

Regards
Henrik



Re: [PREVIEW] 1xx response forwarding

2010-08-20 Thread Alex Rousskov

On 08/20/2010 06:30 AM, Henrik Nordström wrote:

tor 2010-08-19 klockan 10:41 -0600 skrev Alex Rousskov:


The patch removes the ignore_expect_100 feature because we now forward
100 Continue messages. Is everybody OK with that removal?


May need to keep/resurrect it when adding next hop version check as
required by Expect..


Good point. The next hop version check is better done on the server side 
though, right? We may not yet know the next hop when accepting the request.


BTW, most things in HTTP are URI- and not hostname-based. I wonder what 
server or next hop means when checking for supported versions. Do we 
look just at the host name:port and hope that it reflects the version of 
everything running there?


Thanks,

Alex.


Re: [PREVIEW] 1xx response forwarding

2010-08-20 Thread Henrik Nordström
Some aspects of http is hop-by-hop not end-to-end. Processing of Expect is one 
such thing. Transfer encoding and message delimiting another.

We just look at what we know about the nexthop we select. Actual URI is pretty 
irrelevant unless used as selecting factor for selecting the nexthop.

Yes proper Expect processing needs to be done in our client (server side in our 
terminology).

regards
Henrik
- Ursprungsmeddelande -
 On 08/20/2010 06:30 AM, Henrik Nordström wrote:
  tor 2010-08-19 klockan 10:41 -0600 skrev Alex Rousskov:
  
   The patch removes the ignore_expect_100 feature because we now
   forward 100 Continue messages. Is everybody OK with that removal?
  
  May need to keep/resurrect it when adding next hop version check as
  required by Expect..
 
 Good point. The next hop version check is better done on the server side 
 though, right? We may not yet know the next hop when accepting the
 request.
 
 BTW, most things in HTTP are URI- and not hostname-based. I wonder what 
 server or next hop means when checking for supported versions. Do we 
 look just at the host name:port and hope that it reflects the version of 
 everything running there?
 
 Thanks,
 
 Alex.



Re: [PREVIEW] 1xx response forwarding

2010-08-20 Thread Alex Rousskov

On 08/20/2010 08:36 AM, Henrik Nordström wrote:


Some aspects of http is hop-by-hop not end-to-end. Processing of Expect is one 
such thing. Transfer encoding and message delimiting another.


Sure. We can consider the next hop == origin server case to avoid 
distractions. I am only wondering whether http://host/path1 and 
http://host/path2 responses are guaranteed to have the same protocol 
version. I do not think HTTP gives such guarantees, and yet its 
requirements imply that remembering versions using host names should work.


Alex.


We just look at what we know about the nexthop we select. Actual URI is pretty 
irrelevant unless used as selecting factor for selecting the nexthop.

Yes proper Expect processing needs to be done in our client (server side in our 
terminology).

regards
Henrik
- Ursprungsmeddelande -

On 08/20/2010 06:30 AM, Henrik Nordström wrote:

tor 2010-08-19 klockan 10:41 -0600 skrev Alex Rousskov:


The patch removes the ignore_expect_100 feature because we now
forward 100 Continue messages. Is everybody OK with that removal?


May need to keep/resurrect it when adding next hop version check as
required by Expect..


Good point. The next hop version check is better done on the server side
though, right? We may not yet know the next hop when accepting the
request.

BTW, most things in HTTP are URI- and not hostname-based. I wonder what
server or next hop means when checking for supported versions. Do we
look just at the host name:port and hope that it reflects the version of
everything running there?

Thanks,

Alex.




Re: [PREVIEW] 1xx response forwarding

2010-08-20 Thread Henrik Nordström
See RFC on use and meaning of HTTP version numbers.

fre 2010-08-20 klockan 08:58 -0600 skrev Alex Rousskov:
 On 08/20/2010 08:36 AM, Henrik Nordström wrote:
 
  Some aspects of http is hop-by-hop not end-to-end. Processing of Expect is 
  one such thing. Transfer encoding and message delimiting another.
 
 Sure. We can consider the next hop == origin server case to avoid 
 distractions. I am only wondering whether http://host/path1 and 
 http://host/path2 responses are guaranteed to have the same protocol 
 version. I do not think HTTP gives such guarantees, and yet its 
 requirements imply that remembering versions using host names should work.
 
 Alex.
 
  We just look at what we know about the nexthop we select. Actual URI is 
  pretty irrelevant unless used as selecting factor for selecting the nexthop.
 
  Yes proper Expect processing needs to be done in our client (server side in 
  our terminology).
 
  regards
  Henrik
  - Ursprungsmeddelande -
  On 08/20/2010 06:30 AM, Henrik Nordström wrote:
  tor 2010-08-19 klockan 10:41 -0600 skrev Alex Rousskov:
 
  The patch removes the ignore_expect_100 feature because we now
  forward 100 Continue messages. Is everybody OK with that removal?
 
  May need to keep/resurrect it when adding next hop version check as
  required by Expect..
 
  Good point. The next hop version check is better done on the server side
  though, right? We may not yet know the next hop when accepting the
  request.
 
  BTW, most things in HTTP are URI- and not hostname-based. I wonder what
  server or next hop means when checking for supported versions. Do we
  look just at the host name:port and hope that it reflects the version of
  everything running there?
 
  Thanks,
 
  Alex.




Re: [PREVIEW] 1xx response forwarding

2010-08-20 Thread Alex Rousskov

On 08/20/2010 09:26 AM, Henrik Nordström wrote:

See RFC on use and meaning of HTTP version numbers.


The only relevant RFC text I can find is an informal discussion that 
HTTP version is tied to a message sender, an undefined concept. 
However, even if we replace message sender with client or server, my 
assertion that HTTP does not guarantee that one host:port corresponds to 
one client or server appears to be valid.


In other words, RFC requirement to maintain a database of server 
versions relies on an undocumented assumption that we can identify 
servers by some short, often reused component of a request (such as 
host:port). That assumption holds for most but not all current 
real-world environments.


Alex.



fre 2010-08-20 klockan 08:58 -0600 skrev Alex Rousskov:

On 08/20/2010 08:36 AM, Henrik Nordström wrote:


Some aspects of http is hop-by-hop not end-to-end. Processing of Expect is one 
such thing. Transfer encoding and message delimiting another.


Sure. We can consider the next hop == origin server case to avoid
distractions. I am only wondering whether http://host/path1 and
http://host/path2 responses are guaranteed to have the same protocol
version. I do not think HTTP gives such guarantees, and yet its
requirements imply that remembering versions using host names should work.

Alex.


We just look at what we know about the nexthop we select. Actual URI is pretty 
irrelevant unless used as selecting factor for selecting the nexthop.

Yes proper Expect processing needs to be done in our client (server side in our 
terminology).

regards
Henrik
- Ursprungsmeddelande -

On 08/20/2010 06:30 AM, Henrik Nordström wrote:

tor 2010-08-19 klockan 10:41 -0600 skrev Alex Rousskov:


The patch removes the ignore_expect_100 feature because we now
forward 100 Continue messages. Is everybody OK with that removal?


May need to keep/resurrect it when adding next hop version check as
required by Expect..


Good point. The next hop version check is better done on the server side
though, right? We may not yet know the next hop when accepting the
request.

BTW, most things in HTTP are URI- and not hostname-based. I wonder what
server or next hop means when checking for supported versions. Do we
look just at the host name:port and hope that it reflects the version of
everything running there?

Thanks,

Alex.






Re: [PREVIEW] 1xx response forwarding

2010-08-20 Thread Henrik Nordström
fre 2010-08-20 klockan 13:00 -0600 skrev Alex Rousskov:
 On 08/20/2010 09:26 AM, Henrik Nordström wrote:
  See RFC on use and meaning of HTTP version numbers.
 
 The only relevant RFC text I can find is an informal discussion that 
 HTTP version is tied to a message sender, an undefined concept. 
 However, even if we replace message sender with client or server, my 
 assertion that HTTP does not guarantee that one host:port corresponds to 
 one client or server appears to be valid.

RFC 2145 section 2.3 Which version number to send in a message

   An HTTP client SHOULD send a request version equal to the highest
   version for which the client is at least conditionally compliant

   An HTTP server SHOULD send a response version equal to the highest
   version for which the server is at least conditionally compliant

   An HTTP server MAY send a lower response version, if it is known or
   suspected that the client incorrectly implements the HTTP
   specification, but this should not be the default, and this SHOULD
   NOT be done if the request version is HTTP/1.1 or greater.


Note: Proxy servers are both servers and clients depending on which side
you look at.

RFC 2616 3.2.2 http URL

  The semantics are that the identified resource is located at
  the server listening for TCP connections on that port of that host


Remember that use of NAT, TCP/IP load balancers etc is pretty much
outside all normal TCP/IP specifications. IP derived specifications
assumes end-to-end semantics at IP level unless otherwise explicitly
stated, where relevant.

Or put in other words, if you use NAT or TCP/IP load balancing or
similar techniques making several different servers answer on the same
ip:port then it's your responsibility to make sure your server as a
whole acts in a coherent manner. As far as specifications is concerned
it's still a single server, even if it internally splits the load across
several physically distinct servers. Many implementations gets bitten by
this at various levels, most notably for the HTTP specifications is
ETag, Content-Location and Location mismatches. HTTP version is in this
same category.

Regards
Henrik



[PREVIEW] 1xx response forwarding

2010-08-19 Thread Alex Rousskov

On Mon, 16 Aug 2010 15:53:42 -0600, Alex Rousskov
rouss...@measurement-factory.com wrote:

 Hello,

  We need to forward 1xx control messages from servers to clients. I
 see two implementation options:

 1. Use Store. Squid client side expects responses via storeClientCopy,
 so we will be using the usual/normal code paths. Multiple 1xx responses
 may be handled with relative ease. The 1xx responses in Store will be
 treated kind of as regular response headers, except they will not be
 cached and such. The code will need to skip them until they reach the
 socket-writing client.

 2. Bypass Store. Contact fwdStart caller (e.g., clientReplyContext)
 directly and give it a 1xx response to forward. Store code remains
 unchanged. It may be difficult to get from the fwdStart caller to the
 client socket and comm_write. It will be difficult to handle multiple
 1xx responses or a regular response that arrives before we are done with
 writing 1xx response (all unusual, but can happen!).

 Both approaches may have to deal with crazy offset management,
 clientStreams manipulations, and other client-side mess.


 Do you see any other options? Which option is the best?


On 08/16/2010 04:06 PM, Amos Jeffries wrote:

My earlier plan if I did it was to do (2). The complication only occurs at
one point, finding the client FD.
comm_write() should not be altering offset of the higher level store stuff
directly. If it is that is a bug to be fixed.

Pipelining the responses one at a time with a simple block on further
reply passing-on until the existing header set has been finished with gets
around any trickiness with multiple or early real responses.


The block on further reply passing is far from simple because it needs 
to deal with two async jobs.


On 08/18/2010 02:11 PM, Henrik Nordström wrote:

mån 2010-08-16 klockan 15:53 -0600 skrev Alex Rousskov:


Both approaches may have to deal with crazy offset management,
clientStreams manipulations, and other client-side mess.


Yes.

For now I think we need to bypass store to make this sane, and it's
probably also a step in the right direction in general.


Thank you both for your feedback!

The attached patch implements the Bypass Store design and forwards 1xx 
control messages to clients that are likely to be able to handle such 
messages.


The patch appears to pass initial tests but more testing and a sync with 
trunk are needed. There is also one XXX that I still need to resolve, 
but it requires some code from the bug #2583 (pure virtual call) fix. I 
will switch to committing that fix now.


Meanwhile, if you have a chance, please review the overall direction of 
the patch. Preamble has more notes.


The patch removes the ignore_expect_100 feature because we now forward 
100 Continue messages. Is everybody OK with that removal?



Thank you,

Alex.

Compliance: Forward 1xx control messages to clients that support them.
Take 0, which needs more work.

The patch removes ignore_expect_100 squid.conf option because we can safely
forward Expect: 100-continue headers to servers because we can forward
100 Continue control messages to the expecting clients.

We now forward 1xx control messages to all HTTP/1.1 clients and to HTTP/1.0
clients that sent an Expect: 100-continue header. RFC 2616 requires clients
to accept 1xx control messages, even if they did not send Expect headers.

We still respond with 417 Expectation Failed to requests with expectations
other than 100-continue.

Implementation notes: 

We forward control messages one-at-a-time and stop processing the server
response while the 1xx message is being written to client, to avoid
server-driven DoS attacks with large number of 1xx messages.

1xx forwarding is done via async calls from HttpStateData to
ConnStateData/ClientSocketContext. The latter then calls back to notify
HttpStateData that the message was written out to client. If any one of the
two async messages is not fired, HttpStateData will get stuck unless it is
destroyed due to an external event/error. The code assumes such event/error
will always happen because when ConnStateData/ClientSocketContext is gone,
HttpStateData job should be terminated. This requires more testing/thought.

XXX: The patch is not finished. We need to cbdata-protect the
HttpRequest::clientConnection member and re-sync with trunk.


=== added file 'src/HttpControlMsg.h'
--- src/HttpControlMsg.h	1970-01-01 00:00:00 +
+++ src/HttpControlMsg.h	2010-08-18 19:36:04 +
@@ -0,0 +1,57 @@
+/*
+ * $Id$
+ */
+
+#ifndef SQUID_HTTP_CONTROL_MSG_H
+#define SQUID_HTTP_CONTROL_MSG_H
+
+#include HttpReply.h
+#include base/AsyncCall.h
+
+class HttpControlMsg;
+
+/* 
+ * This API exists to throttle forwarding of 1xx messages from the server
+ * side (Source == HttpStateData) to the client side (Sink == ConnStateData).
+ *
+ * Without throttling, Squid would have to drop some 1xx responses to
+ * avoid DoS attacks that send many 1xx responses without reading them.
+ * Dropping 1xx responses