Re: next version of content-encoding / gzip design doc

2004-03-03 Thread Henrik Nordstrom
On Wed, 3 Mar 2004, Jon Kay wrote:

 Because current browser implementations treat Content-Encoding much as
 though it was Transfer-Encoding, we will implement Content-Encoding and
 Accept-Encoding as though they were actually the Transfer-Encoding and
 TE described in the HTTP specifications.

This part I do not understand.

Content-Encoding and Transfer-Encoding are fundamentally different in 
their operation, far beyond the hop-by-hop vs end-to-end difference. You 
cannot interchange one for the other.

It is not safe to assume a client accepts gzip TE just because it 
accepts gzip content-encoding. For one thing, the message format is 
completely different.

 Etags of replies encoded by Squid will be modified to turn them into
 weak tags if they are not already so.

Why do you oppose creating new unique ETags?

 There will be a configuration option to turn off content-encoding.

Granted, and this will default to off in the standard distribution, like 
any other option which violates the semantically transparent HTTP proxy 
requirements.

 Content-Encoding Implementation

No comments there.

 Objects will be stored both in unencoded and encoded formats. An object
 will stay in the format in which Squid receives it until requested by a
 client requesting a different Content-Encoding which Squid supports
 (this could be immediate). Once this happens, the object will be
 streamed coded into a different StoreEntry and on to the client.

Ok.

 A new store_dup module will be created to manage dup store_entries and
 make sure duplicate entries are invalidated when a new version of an
 object is read. It consists of a circular list of StoreEntry pointers
 named dupnext and dupprev. When a new duplicate encoding (or
 decoding) of an object is created, it's added to the list. When any
 StoreEntry is invalidated or updated, all dups are invalidated.

Looks a little too complex to me.


Wouldn't something simpler like the following work:

Modify the store key to account for content encoding.

Add an internal meta object listing the known content encodings of a given 
object. When a new encoding is added rewrite this object to add the new 
encoding name.

On cache hits, iterate over the known acceptable encodings until a match
is found in the cache.

In recoded objects include a meta header indicating the identity of the
original object and disregard the recoded object on a cache hit if it no
longer matches the original.

From what I can tell the above would also work for adding server-driven
Content-Encoding negotiation to the proxy to complement the use of Vary 
(which most mod_gzip servers do not support btw).
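
A minimal sketch of the lookup outlined above, in C. Everything in it is
hypothetical: toy_cache, make_key, cache_add and cache_lookup are
illustrative names, not Squid's store API, and the key layout is only an
assumption. A real implementation would hash the key and keep the list of
known encodings in the internal meta object rather than scanning an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 16
#define KEY_LEN 256

/* Hypothetical cache entry: the key folds the content encoding into the URL. */
struct toy_entry {
    char key[KEY_LEN];     /* "<encoding> <url>" */
    char origin_etag[64];  /* identity of the original object it was derived from */
};

struct toy_cache {
    struct toy_entry entries[MAX_ENTRIES];
    size_t count;
};

/* Build a store key that accounts for the content encoding. */
static void make_key(char *out, size_t outlen, const char *url, const char *encoding)
{
    assert(out != NULL && url != NULL && encoding != NULL);
    snprintf(out, outlen, "%s %s", encoding, url);
}

/* Record one encoding of an object together with the original's identity. */
static int cache_add(struct toy_cache *c, const char *url,
                     const char *encoding, const char *origin_etag)
{
    if (c->count >= MAX_ENTRIES)
        return -1;
    struct toy_entry *e = &c->entries[c->count++];
    make_key(e->key, sizeof(e->key), url, encoding);
    snprintf(e->origin_etag, sizeof(e->origin_etag), "%s", origin_etag);
    return 0;
}

/*
 * Walk the encodings the client accepts, in preference order, and return
 * the first cached variant whose recorded origin still matches the current
 * original. Recoded entries that no longer match are disregarded.
 */
static const struct toy_entry *cache_lookup(const struct toy_cache *c, const char *url,
                                            const char *const *accepted, size_t n_accepted,
                                            const char *current_etag)
{
    char key[KEY_LEN];
    for (size_t i = 0; i < n_accepted; i++) {
        make_key(key, sizeof(key), url, accepted[i]);
        for (size_t j = 0; j < c->count; j++) {
            const struct toy_entry *e = &c->entries[j];
            if (strcmp(e->key, key) == 0 &&
                strcmp(e->origin_etag, current_etag) == 0)
                return e;
        }
    }
    return NULL;
}
```

The point of the sketch is that no linked list of duplicates is needed:
a stale recoding simply stops matching once the original changes.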

Regards
Henrik



Re: coss and squid3

2004-03-03 Thread Robert Collins
On Wed, 2004-03-03 at 20:39, Adrian Chadd wrote:
 Hi,
 
 coss, as it stands in squid-3, is completely unusable.
 It bombs out because some of the callback data types aren't
 actually cbdata allocated anymore.

Oh crap. I'm more and more tempted to import my IO rework, it works
there.

Rob
-- 
GPG key available at: http://www.robertcollins.net/keys.txt.




Re: coss and squid3

2004-03-03 Thread Adrian Chadd
On Wed, Mar 03, 2004, Robert Collins wrote:
  coss, as it stands in squid-3, is completely unusable.
  It bombs out because some of the callback data types aren't
  actually cbdata allocated anymore.
 
 Oh crap. I'm more and more tempted to import my IO rework, it works
 there.

Heh. Please. :)



Adrian



Re: next version of content-encoding / gzip design doc

2004-03-03 Thread garana


Hi there,

I'm back with this task (again).

Jon: you understand Squid far better than I do. I can start helping with 
content compression by writing GzipCoder, if you want.

(Already discussed) About TE/Transfer-Encoding vs Accept-Encoding/Content-Encoding: 
implementing this as Content-Encoding (even if it bends the standards) seems to be the 
reasonable choice, since TE/Transfer-Encoding is not supported by most common browsers.

If implemented as Content-Encoding the following headers should be altered before 
encoding:

Content-Length: deleted (could be updated to the actual gzipped content length, but that is 
too much trouble, I guess).
ETag: modified (appending CEgz, for instance?)
Vary: append Accept-Encoding (if not already there)
Connection: if client is 1.1, could be set to keep-alive, but Transfer-Encoding 
chunked should be added/checked.
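
A rough sketch in C of the ETag and Vary changes listed above. The function
names (etag_for_encoding, vary_add_accept_encoding) are made up for
illustration and are not existing Squid functions; the "CEgz" suffix is only
the example mentioned above, and the ETag validation is deliberately minimal.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Derive an ETag for the encoded variant by inserting a suffix before the
 * closing quote, e.g. "abc" becomes "abcCEgz". A W/ prefix is preserved.
 * Returns 0 on success, -1 if the tag is malformed or the buffer too small.
 */
static int etag_for_encoding(const char *etag, const char *suffix,
                             char *out, size_t outlen)
{
    assert(etag != NULL && suffix != NULL && out != NULL);
    size_t len = strlen(etag);
    if (len < 2 || etag[len - 1] != '"')
        return -1;
    if (len - 1 + strlen(suffix) + 2 > outlen)
        return -1;
    /* Everything except the closing quote, then the suffix, then the quote. */
    snprintf(out, outlen, "%.*s%s\"", (int)(len - 1), etag, suffix);
    return 0;
}

/* Case-insensitive comparison of the n bytes at s against token. */
static int token_equals(const char *s, size_t n, const char *token)
{
    if (strlen(token) != n)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)s[i]) != tolower((unsigned char)token[i]))
            return 0;
    return 1;
}

/* Return 1 if a comma-separated header value contains the given token. */
static int list_has_token(const char *value, const char *token)
{
    const char *p = value;
    while (*p) {
        while (*p == ' ' || *p == '\t' || *p == ',')
            p++;
        const char *start = p;
        while (*p && *p != ',')
            p++;
        const char *end = p;
        while (end > start && (end[-1] == ' ' || end[-1] == '\t'))
            end--;
        if (token_equals(start, (size_t)(end - start), token))
            return 1;
    }
    return 0;
}

/* Append Accept-Encoding to a Vary value unless it is already covered. */
static int vary_add_accept_encoding(const char *vary, char *out, size_t outlen)
{
    int n;
    if (vary == NULL || *vary == '\0')
        n = snprintf(out, outlen, "Accept-Encoding");
    else if (list_has_token(vary, "Accept-Encoding") || list_has_token(vary, "*"))
        n = snprintf(out, outlen, "%s", vary);
    else
        n = snprintf(out, outlen, "%s, Accept-Encoding", vary);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Content-Length and Connection handling are left out of the sketch, since
they depend on how the reply is framed towards the client.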

Hope this sheds some light on the possible encoding options.

Regards,

-- 
Gonzalo Arana
Ingenieria
UOLSinectis

Florida 537 Piso 6, Buenos Aires, Argentina 
+54-11-4321-9110 ext 2543
http://www.uolsinectis.com.ar/



2.5 and delay pools

2004-03-03 Thread Adrian Chadd

hi,

I'm still having issues with squid-2.5 and delay pools.
The FDSET stuff is _very_ broken when you're using >1024 fds.
Here is a simple patch to 2.5 only which removes the whole
fdset thing.

I'd like to commit this so the delay pools stuff in 2.5
works for >1024 fds without _lots_ of messing about.
Can I please have someone comment?

Thanks!


Adrian

styx:/usr/local/src/squid-2/squid-2.5/src# cvs -z9 diff -u delay_pools.c
Index: delay_pools.c
===
RCS file: /squid/squid/src/delay_pools.c,v
retrieving revision 1.19.2.8
diff -u -r1.19.2.8 delay_pools.c
--- delay_pools.c   18 Jun 2003 23:53:35 -  1.19.2.8
+++ delay_pools.c   4 Mar 2004 06:31:44 -
@@ -89,7 +89,7 @@
 typedef union _delayPool delayPool;

 static delayPool *delay_data = NULL;
-static fd_set delay_no_delay;
+static int delay_no_delay[SQUID_MAXFD];
 static time_t delay_pools_last_update = 0;
 static hash_table *delay_id_ptr_hash = NULL;
 static long memory_used = 0;
@@ -134,7 +134,7 @@
 delayPoolsInit(void)
 {
     delay_pools_last_update = getCurrentTime();
-    FD_ZERO(&delay_no_delay);
+    bzero(delay_no_delay, sizeof(delay_no_delay));
     cachemgrRegister("delay", "Delay Pool Levels", delayPoolStats, 0, 1);
 }

@@ -283,19 +283,19 @@
 void
 delaySetNoDelay(int fd)
 {
-    FD_SET(fd, &delay_no_delay);
+    delay_no_delay[fd] = 1;
 }

 void
 delayClearNoDelay(int fd)
 {
-    FD_CLR(fd, &delay_no_delay);
+    delay_no_delay[fd] = 0;
 }

 int
 delayIsNoDelay(int fd)
 {
-    return FD_ISSET(fd, &delay_no_delay);
+    return (delay_no_delay[fd] == 1);
 }

 static delay_id



Re: 2.5 and delay pools

2004-03-03 Thread Henrik Nordstrom
On Wed, 3 Mar 2004, Adrian Chadd wrote:

 I'm still having issues with squid-2.5 and delay pools.
 The FDSET stuff is _very_ broken when you're using >1024 fds.

More likely the way FD_SETSIZE is extended is broken for your libc
headers..

You need to remove far more fd_set references if this is the problem.  
There are also several delay pool related fd_set usages in comm_poll, and a 
few other places I think.

To verify the FD_SETSIZE extension you can use the following

assert(sizeof(fd_set) >= (SQUID_MAXFD + 7) / 8);

If this triggers you are in deep trouble. We probably SHOULD add this 
somewhere to make sure the problem is quickly detected if it should 
happen.
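
As a sketch of how such a startup check could look, assuming one bit per
descriptor in fd_set. The helper names below are hypothetical and not part
of Squid; the caller would pass SQUID_MAXFD as maxfd.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Number of bytes needed to hold one bit per descriptor. */
static size_t fd_set_bytes_needed(size_t maxfd)
{
    return (maxfd + 7) / 8;
}

/* Returns 1 when fd_set, as compiled, has room for maxfd descriptors. */
static int fd_set_is_large_enough(size_t maxfd)
{
    return sizeof(fd_set) >= fd_set_bytes_needed(maxfd);
}

/* Startup check: abort early if the FD_SETSIZE extension did not take effect. */
static void require_fd_set_capacity(size_t maxfd)
{
    assert(fd_set_is_large_enough(maxfd));
}
```

Calling require_fd_set_capacity(SQUID_MAXFD) once during initialisation
would turn silent memory corruption into an immediate, obvious failure.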

Regards
Henrik



Re: 2.5 and delay pools

2004-03-03 Thread Adrian Chadd
On Thu, Mar 04, 2004, Henrik Nordstrom wrote:
 On Wed, 3 Mar 2004, Adrian Chadd wrote:
 
  I'm still having issues with squid-2.5 and delay pools.
  The FDSET stuff is _very_ broken when you're using >1024 fds.
 
 More likely the way FD_SETSIZE is extended is broken for your libc
 headers..

I agree, but it's becoming a pain to work around this.

 You need to remove far more fd_set references if this is the problem.  
 There are also several delay pool related fd_set usages in comm_poll, and a 
 few other places I think.

Ok. I must've missed them. Let me go through the codebase and remove
all references to fd_set when you're not actually using select().




Adrian



Re: 2.5 and delay pools

2004-03-03 Thread Adrian Chadd
On Thu, Mar 04, 2004, Adrian Chadd wrote:

  You need to remove far more fd_set references if this is the problem.  
  There are also several delay pool related fd_set usages in comm_poll, and a 
  few other places I think.
 
 Ok. I must've missed them. Let me go through the codebase and remove
 all references to fd_set when you're not actually using select().

Ok, the only use I can see is in the slowfds use. The other use of
fd_set is in the select() codepath.

Here's the patch I'd like to commit:

Index: delay_pools.c
===
RCS file: /squid/squid/src/delay_pools.c,v
retrieving revision 1.19.2.8
diff -u -r1.19.2.8 delay_pools.c
--- delay_pools.c   18 Jun 2003 23:53:35 -  1.19.2.8
+++ delay_pools.c   4 Mar 2004 07:47:21 -
@@ -89,7 +89,7 @@
 typedef union _delayPool delayPool;

 static delayPool *delay_data = NULL;
-static fd_set delay_no_delay;
+static int delay_no_delay[SQUID_MAXFD];
 static time_t delay_pools_last_update = 0;
 static hash_table *delay_id_ptr_hash = NULL;
 static long memory_used = 0;
@@ -134,7 +134,7 @@
 delayPoolsInit(void)
 {
     delay_pools_last_update = getCurrentTime();
-    FD_ZERO(&delay_no_delay);
+    bzero(delay_no_delay, sizeof(delay_no_delay));
     cachemgrRegister("delay", "Delay Pool Levels", delayPoolStats, 0, 1);
 }

@@ -283,19 +283,19 @@
 void
 delaySetNoDelay(int fd)
 {
-    FD_SET(fd, &delay_no_delay);
+    delay_no_delay[fd] = 1;
 }

 void
 delayClearNoDelay(int fd)
 {
-    FD_CLR(fd, &delay_no_delay);
+    delay_no_delay[fd] = 0;
 }

 int
 delayIsNoDelay(int fd)
 {
-    return FD_ISSET(fd, &delay_no_delay);
+    return (delay_no_delay[fd] == 1);
 }

 static delay_id
 static delay_id
Index: comm_select.c
===
RCS file: /squid/squid/src/comm_select.c,v
retrieving revision 1.53.2.7
diff -u -r1.53.2.7 comm_select.c
--- comm_select.c   11 May 2003 17:30:13 -  1.53.2.7
+++ comm_select.c   4 Mar 2004 07:47:21 -
@@ -310,7 +310,7 @@
 {
     struct pollfd pfds[SQUID_MAXFD];
 #if DELAY_POOLS
-    fd_set slowfds;
+    char slowfds[SQUID_MAXFD];
 #endif
     PF *hdl = NULL;
     int fd;
@@ -332,7 +332,7 @@
         /* Handle any fs callbacks that need doing */
         storeDirCallback();
 #if DELAY_POOLS
-        FD_ZERO(&slowfds);
+        bzero(slowfds, sizeof(slowfds));
 #endif
         if (commCheckICPIncoming)
             comm_poll_icp_incoming();
@@ -358,7 +358,7 @@
 #if DELAY_POOLS
                 case -1:
                     events |= POLLRDNORM;
-                    FD_SET(i, &slowfds);
+                    slowfds[i] = 1;
                     break;
 #endif
                 default:
@@ -437,7 +437,7 @@
                 if (NULL == (hdl = F->read_handler))
                     (void) 0;
 #if DELAY_POOLS
-                else if (FD_ISSET(fd, &slowfds))
+                else if (slowfds[fd])
                     commAddSlowFd(fd);
 #endif
                 else {



Re: 2.5 and delay pools

2004-03-03 Thread Henrik Nordstrom
On Thu, 4 Mar 2004, Adrian Chadd wrote:

  More likely the way FD_SETSIZE is extended is broken for your libc
  headers..
 
 I agree, but it's becoming a pain to work around this.

Please verify the assert I sent. If it triggers we at least know this is 
the problem.

Which libc are you using?

  You need to remove far more fd_set references if this is the problem.  
  There are also several delay pool related fd_set usages in comm_poll, and a 
  few other places I think.
 
 Ok. I must've missed them. Let me go through the codebase and remove
 all references to fd_set when you're not actually using select().

I prefer not to touch the comm loop within the 2.5 cycle.

Regards
Henrik