Re: squid-2 dataflow and copying changes

2008-04-05 Thread Adrian Chadd
On Sat, Apr 05, 2008, Henrik Nordstrom wrote:

> It would also be good to look into a temporary fix for 2.7 as the above
> is probably not 2.7 material.

Well, it's sort of related - the one or two places where I'm currently passing
in a buffer via storeClientCopy() can be turned into a storeClientRef() or
a storeClientCopy(sc, NULL (buffer), ...), so the store client is signaled to
return a large buffer as required.

Then I can implement the fix as "allocate buffer for reading in the first
chunk of data and grow the buffer (or over-allocate it in the first place,
letting the VM do the talking), and then either pass the buffer back to the
caller OR copy the first 4k out."

> > This'll involve bringing over various parts of my Squid-2 work in s26_adri
> > into the main tree. I'll initially bring over some of the code reshuffling
> > and the string related changes. Once those are stable and functioning well,
> > I'll look at bringing over some other parts of the work - referenced strings,
> > writev() for request/reply information.
> 
> Interesting. Do you have any performance metrics comparing this with
> 2.7?

Only what I pulled from a few months ago - the improvements aren't all that great
until I start avoiding string buffer copying and reduplication. CPU-wise, I was
saving roughly 15% CPU between 2.7 and s26_adri; quite a bit less than expected.

But then, the whole point of this particular exercise is to lay the foundation for
further code and dataflow improvements to improve performance and scalability.
It's hard to fix anything right now (in either tree!) because the codebase needs
massive tidying up and no one here knows exactly what to target in the medium to
long term, and as I stated all along, this is precisely what I was trying to attack
in the first place.



Adrian
-- 
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -


[MERGE] Support large response headers

2008-04-05 Thread Henrik Nordstrom
Resubmitting, as I haven't received any comments on the previous merge
request.

This patch fixes Bug #2001 by reworking how the client_side_*.cc code
deals with response headers. Instead of parsing the headers again it
clones the already parsed header and blindly skips the unparsed headers
in the data stream.

3.0 version submitted separately.
# Bazaar merge directive format 2 (Bazaar 0.90)
# revision_id: [EMAIL PROTECTED]
#   63h3likn45namhsa
# target_branch: file:///data/bzr/squid3/trunk/
# testament_sha1: 861409ffadcaf9c00dc89f6cc2eb48d9dcb8bb25
# timestamp: 2008-04-06 03:37:48 +0200
# base_revision_id: [EMAIL PROTECTED]
#   yff1f2iecpkhoq4t
# 
# Begin patch
=== modified file 'src/HttpReply.cc'
--- src/HttpReply.cc	2008-02-09 01:27:59 +
+++ src/HttpReply.cc	2008-03-15 19:09:52 +
@@ -551,3 +551,17 @@
 }
 }
 }
+
+HttpReply *
+HttpReply::clone() const
+{
+HttpReply *rep = new HttpReply();
+rep->header.append(&header);
+rep->hdrCacheInit();
+rep->hdr_sz = hdr_sz;
+rep->http_ver = http_ver;
+rep->pstate = pstate;
+rep->protocol = protocol;
+rep->sline = sline;
+return rep;
+}

=== modified file 'src/HttpReply.h'
--- src/HttpReply.h	2008-03-16 00:50:50 +
+++ src/HttpReply.h	2008-03-16 00:53:56 +
@@ -127,6 +127,11 @@
 
 void packHeadersInto(Packer * p) const;
 
+/// Clone this reply.
+/// Could be done as a copy-contructor but we do not want to
+/// accidently copy a HttpReply..
+HttpReply *clone() const;
+
 private:
 /* initialize */
 void init();

=== modified file 'src/client_side_reply.cc'
--- src/client_side_reply.cc	2008-03-20 23:20:58 +
+++ src/client_side_reply.cc	2008-03-30 13:34:01 +
@@ -352,68 +352,47 @@
 sendClientOldEntry();
 }
 
-// we have a partial reply from the origin
-else if (STORE_PENDING == http->storeEntry()->store_status && 0 == status) {
-// header is too large, send old entry
-
-if (reqsize >= HTTP_REQBUF_SZ) {
-debugs(88, 3, "handleIMSReply: response from origin is too large '" << http->storeEntry()->url() << "', sending old entry to client" );
-http->logType = LOG_TCP_REFRESH_FAIL;
-sendClientOldEntry();
-}
-
-// everything looks fine, we're just waiting for more data
-else {
-debugs(88, 3, "handleIMSReply: incomplete headers for '" << http->storeEntry()->url() << "', waiting for more data" );
-reqofs = reqsize;
-waitForMoreData();
-}
-}
-
-// we have a reply from the origin
+HttpReply *old_rep = (HttpReply *) old_entry->getReply();
+
+// origin replied 304
+
+if (status == HTTP_NOT_MODIFIED) {
+	http->logType = LOG_TCP_REFRESH_UNMODIFIED;
+
+	// update headers on existing entry
+	HttpReply *old_rep = (HttpReply *) old_entry->getReply();
+	old_rep->updateOnNotModified(http->storeEntry()->getReply());
+	old_entry->timestampsSet();
+
+	// if client sent IMS
+
+	if (http->request->flags.ims) {
+	// forward the 304 from origin
+	debugs(88, 3, "handleIMSReply: origin replied 304, revalidating existing entry and forwarding 304 to client");
+	sendClientUpstreamResponse();
+	} else {
+	// send existing entry, it's still valid
+	debugs(88, 3, "handleIMSReply: origin replied 304, revalidating existing entry and sending " <<
+		   old_rep->sline.status << " to client");
+	sendClientOldEntry();
+	}
+}
+
+// origin replied with a non-error code
+else if (status > HTTP_STATUS_NONE && status < HTTP_INTERNAL_SERVER_ERROR) {
+	// forward response from origin
+	http->logType = LOG_TCP_REFRESH_MODIFIED;
+	debugs(88, 3, "handleIMSReply: origin replied " << status << ", replacing existing entry and forwarding to client");
+	sendClientUpstreamResponse();
+}
+
+// origin replied with an error
 else {
-HttpReply *old_rep = (HttpReply *) old_entry->getReply();
-
-// origin replied 304
-
-if (status == HTTP_NOT_MODIFIED) {
-http->logType = LOG_TCP_REFRESH_UNMODIFIED;
-
-// update headers on existing entry
-HttpReply *old_rep = (HttpReply *) old_entry->getReply();
-old_rep->updateOnNotModified(http->storeEntry()->getReply());
-old_entry->timestampsSet();
-
-// if client sent IMS
-
-if (http->request->flags.ims) {
-// forward the 304 from origin
-debugs(88, 3, "handleIMSReply: origin replied 304, revalidating existing entry and forwarding 304 to client");
-sendClientUpstreamResponse();
-} else {
-// send existing entry, it's still valid
-debugs(88, 3, "handleIMSReply: origin replied 304, revalidating existing entry and sending " <<
-   old_rep->sline.status << " to client");
-sendClientOldEntry();
-}
-}
-
-// origin replied with a non-error code
- 

[MERGE-3.0] Support large response headers

2008-04-05 Thread Henrik Nordstrom
Updated 3.0 patch. I had forgotten to merge one related changeset from trunk
adding MemBuf::space(size).
# Bazaar merge directive format 2 (Bazaar 0.90)
# revision_id: [EMAIL PROTECTED]
#   00au6rconzwzcbdv
# target_branch: file:///data/bzr/squid3/branches/SQUID_3_0/
# testament_sha1: beea03b1b1acfed3b0a9f4df661b66b86fe9d102
# timestamp: 2008-04-06 02:39:24 +0200
# source_branch: http://www.henriknordstrom.net/bzr/squid3/hno\
#   /largeresp-3.0
# base_revision_id: [EMAIL PROTECTED]
#   kfyafifqcq0retgb
# 
# Begin patch
=== modified file 'src/HttpReply.cc'
--- src/HttpReply.cc	2007-11-26 20:09:54 +
+++ src/HttpReply.cc	2008-03-30 14:29:57 +
@@ -496,3 +496,17 @@
 
 return expectBody;
 }
+
+HttpReply *
+HttpReply::clone() const
+{
+HttpReply *rep = new HttpReply();
+rep->header.append(&header);
+rep->hdrCacheInit();
+rep->hdr_sz = hdr_sz;
+rep->http_ver = http_ver;
+rep->pstate = pstate;
+rep->protocol = protocol;
+rep->sline = sline;
+return rep;
+}

=== modified file 'src/HttpReply.h'
--- src/HttpReply.h	2007-08-13 23:20:50 +
+++ src/HttpReply.h	2008-03-30 14:29:57 +
@@ -122,6 +122,11 @@
 
 void packHeadersInto(Packer * p) const;
 
+/// Clone this reply.
+/// Could be done as a copy-contructor but we do not want to
+/// accidently copy a HttpReply..
+HttpReply *clone() const;
+
 private:
 /* initialize */
 void init();

=== modified file 'src/MemBuf.h'
--- src/MemBuf.h	2006-08-21 06:50:40 +
+++ src/MemBuf.h	2008-04-06 00:35:11 +
@@ -1,7 +1,7 @@
 
 
 /*
- * $Id: MemBuf.h,v 1.8 2006/08/21 00:50:41 robertc Exp $
+ * $Id: MemBuf.h,v 1.9 2007/12/21 23:48:04 hno Exp $
  *
  *
  * SQUID Web Proxy Cache  http://www.squid-cache.org/
@@ -60,6 +60,7 @@
 
 // these space-related methods assume no growth and allow 0-termination
 char *space() { return buf + size; } // space to add data
+char *space(mb_size_t required) { if (size + required > capacity) grow(size + required); return buf + size; } // space to add data
 
 mb_size_t spaceSize() const;
 bool hasSpace() const { return size+1 < capacity; }

=== modified file 'src/client_side_reply.cc'
--- src/client_side_reply.cc	2008-03-14 04:45:16 +
+++ src/client_side_reply.cc	2008-03-30 14:29:57 +
@@ -352,68 +352,47 @@
 sendClientOldEntry();
 }
 
-// we have a partial reply from the origin
-else if (STORE_PENDING == http->storeEntry()->store_status && 0 == status) {
-// header is too large, send old entry
-
-if (reqsize >= HTTP_REQBUF_SZ) {
-debugs(88, 3, "handleIMSReply: response from origin is too large '" << http->storeEntry()->url() << "', sending old entry to client" );
-http->logType = LOG_TCP_REFRESH_FAIL;
-sendClientOldEntry();
-}
-
-// everything looks fine, we're just waiting for more data
-else {
-debugs(88, 3, "handleIMSReply: incomplete headers for '" << http->storeEntry()->url() << "', waiting for more data" );
-reqofs = reqsize;
-waitForMoreData();
-}
-}
-
-// we have a reply from the origin
+HttpReply *old_rep = (HttpReply *) old_entry->getReply();
+
+// origin replied 304
+
+if (status == HTTP_NOT_MODIFIED) {
+	http->logType = LOG_TCP_REFRESH_UNMODIFIED;
+
+	// update headers on existing entry
+	HttpReply *old_rep = (HttpReply *) old_entry->getReply();
+	old_rep->updateOnNotModified(http->storeEntry()->getReply());
+	old_entry->timestampsSet();
+
+	// if client sent IMS
+
+	if (http->request->flags.ims) {
+	// forward the 304 from origin
+	debugs(88, 3, "handleIMSReply: origin replied 304, revalidating existing entry and forwarding 304 to client");
+	sendClientUpstreamResponse();
+	} else {
+	// send existing entry, it's still valid
+	debugs(88, 3, "handleIMSReply: origin replied 304, revalidating existing entry and sending " <<
+		   old_rep->sline.status << " to client");
+	sendClientOldEntry();
+	}
+}
+
+// origin replied with a non-error code
+else if (status > HTTP_STATUS_NONE && status < HTTP_INTERNAL_SERVER_ERROR) {
+	// forward response from origin
+	http->logType = LOG_TCP_REFRESH_MODIFIED;
+	debugs(88, 3, "handleIMSReply: origin replied " << status << ", replacing existing entry and forwarding to client");
+	sendClientUpstreamResponse();
+}
+
+// origin replied with an error
 else {
-HttpReply *old_rep = (HttpReply *) old_entry->getReply();
-
-// origin replied 304
-
-if (status == HTTP_NOT_MODIFIED) {
-http->logType = LOG_TCP_REFRESH_UNMODIFIED;
-
-// update headers on existing entry
-HttpReply *old_rep = (HttpReply *) old_entry->getReply();
-old_rep->updateOnNotModified(http->storeEntry()->getReply());
-old_entry->timestampsSet();
-
-// if client sent IMS
-
-if (http->request->flags.ims) {
-// forwar

Re: squid-2 dataflow and copying changes

2008-04-05 Thread Henrik Nordstrom
Sat 2008-04-05 at 15:05 +0800, Adrian Chadd wrote:

> One of my contracts is to improve dataflow and reduce copying in Squid-2.
> I've got one major bug in Squid-2.HEAD / Squid-2.7 to fix (>4k response
> headers for disk objects) which will probably be fixed as part of this
> work.

Ok.

It would also be good to look into a temporary fix for 2.7 as the above
is probably not 2.7 material.

> This'll involve bringing over various parts of my Squid-2 work in s26_adri
> into the main tree. I'll initially bring over some of the code reshuffling
> and the string related changes. Once those are stable and functioning well,
> I'll look at bringing over some other parts of the work - referenced strings,
> writev() for request/reply information.

Interesting. Do you have any performance metrics comparing this with
2.7?

> I'll probably begin work on this (in parallel with my client-side delay pools
> work) in the next week or so.

Ok.

2.HEAD is always open for submissions of tested features (and
bugfixes..).

Regards
Henrik



Re: Getting back in to Squid

2008-04-05 Thread Henrik Nordstrom
Thu 2008-04-03 at 10:43 +1100, Benno Rice wrote:

> Some of you may remember me from the original https accelerator  
> support code way back when.  I'm looking to get back into doing some  
> squid work in various areas and so thought getting back on to the dev  
> list would be a good idea. =)

Welcome back!

A lot has happened since then, but there is plenty to get your hands
dirty with as well, both old and new..

Regards
Henrik



Getting back in to Squid

2008-04-05 Thread Benno Rice

Hi there,

Some of you may remember me from the original https accelerator  
support code way back when.  I'm looking to get back into doing some  
squid work in various areas and so thought getting back on to the dev  
list would be a good idea. =)


Look forward to helping out!

--
Benno Rice
[EMAIL PROTECTED]





Re: Revision 8885

2008-04-05 Thread Henrik Nordstrom
Sun 2008-04-06 at 00:45 +1300, Amos Jeffries wrote:

> The ./html/ folder generated does not get emptied between runs. So if 
> the code has changed, we may get new or dead files, and every time the 
> graphs are regenerated with new hashes they leave dead files behind.

Such things are supposed to get cleaned by "make clean".

Regards
Henrik



Bug season

2008-04-05 Thread Amos Jeffries
I've just spent a short while reviewing the port-queue and bugzilla 
'unspecified version' lists.


There are a few that might be closed with some agreements on how to do so.


BUG 1356: handling of DNS when in offline_mode.
  http://www.squid-cache.org/bugs/show_bug.cgi?id=1356

see Thread "offline_mode".


BUG 2286:
http://www.squid-cache.org/bugs/show_bug.cgi?id=2286

 - I think we can close this one now. Do you agree Christos?

Christos (and other non-committers), if you have a bug like this for 
squid-3 and you are sure the attached patch fixes it, feel free to 
assign the bug to me for commit and closure. Otherwise I'll just commit 
and wait for the assignee or reporter to close it.



BUG 2242: RADIUS overflow
 - The one line patch for squid-3 should be back-ported to close this.
http://www.squid-cache.org/Versions/v3/HEAD/changesets/b8872.patch


BUG 1876: disallowed URL characters
  http://www.squid-cache.org/bugs/show_bug.cgi?id=1876

- Sounds invalid.



3.0 PORTING QUEUE:
  http://www.squid-cache.org/Versions/v3/HEAD/changesets/merge.html

I've ported back the entries which clearly need porting. I'd like some 
input regarding whether to back-port the following changes:


  - testheaders.sh alterations (b8892 etc)

We have two options here. First is the entire change, .sh script and 
all. Second is to just port the code changes, leaving the tester itself 
for 3.1-only.


  -


Amos
--
Please use Squid 2.6.STABLE19 or 3.0.STABLE4


Re: Revision 8885

2008-04-05 Thread Amos Jeffries

Henrik Nordstrom wrote:

Wed 2008-04-02 at 21:47 +1300, Amos Jeffries wrote:

Henrik, I'm a little confused as to what you are trying to do here:

   http://www.squid-cache.org/Versions/v3/HEAD/changesets/b8885.patch

The Makefile did everything to build both the html and the dyn properly.
How else were you expecting doxygen to be used?


I was calling doxygen manually and getting a bit confused by the result.

I would prefer the settings be reversed generating normal html by
default, and with the dyn version not being built unless asked for. The
dyn version is very specific for being included in the website only.

Regards
Henrik



We have a side case I found with multiple runs we need to keep in mind.

The ./html/ folder generated does not get emptied between runs. So if 
the code has changed, we may get new or dead files, and every time the 
graphs are regenerated with new hashes they leave dead files behind.


That adds up to a lot of dead files eventually, which is the reason I had 
the tmp directory intermediate moved around after a rebuild.


Amos
--
Please use Squid 2.6.STABLE19 or 3.0.STABLE4


Re: $(SHELL) for test-suite/testheaders.sh

2008-04-05 Thread Amos Jeffries

Alex Rousskov wrote:

Amos,

Could you please change Makefile.ams so that
test-suite/testheaders.sh is executed using $(SHELL)? For example,

$(SHELL) $(top_srcdir)/test-suite/testheaders.sh ...

There may be a better, automake-assisted way of running shell scripts,
but I do not know it.

Without SHELL or equivalent, if bash is not installed in /bin/, the user
of "make check" gets an error:



gmake[2]: Entering directory `/home/rousskov/Edit/squid3/ecap/lib'
gmake  testHeaders tests/testAll
gmake[3]: Entering directory `/home/rousskov/Edit/squid3/ecap/lib'
../test-suite/testheaders.sh "g++ -DHAVE_CONFIG_H -I. -I../include -I../include -I../include 
-I/usr/local/include   -Werror -Wall -Wpointer-arith -Wwrite-strings -Wcomments  -g " 
"../include" || exit 1
/usr/local/bin/bash: ../test-suite/testheaders.sh: /usr/bin/bash: bad 
interpreter: No such file or directory
gmake[3]: *** [testHeaders] Error 1


It would be nice to have the name of the testheaders.sh script as a
Makefile.am variable (perhaps even stored in one common.am file which is
included by all Makefiles), but that can wait.



Done.

I dream of it really being completely recursive and only run from one 
Makefile.am. But today the per-directory compiler options are still 
getting in the way. Maybe after the great library changes it will be 
possible.


Amos
--
Please use Squid 2.6.STABLE19 or 3.0.STABLE4


Re: offline_mode

2008-04-05 Thread Kinkie
Still, option 1 is simple and inexpensive, so it seems reasonable to me.

   K

On Sat, Apr 5, 2008 at 8:56 AM, Adrian Chadd <[EMAIL PROTECTED]> wrote:
> I think its something that should be better fleshed out, yes.
>
>  There's probably bigger fish to fry in the codebase though.
>
>
>
>  Adrian
>
>
>
>  On Sat, Apr 05, 2008, Amos Jeffries wrote:
>  > Bug 1356 brings up some confusion over what the true meaning of
>  > 'offline' is for this option.
>  >
>  > As Henrik pointed out fairly early in the discussion there, squid's
>  > actual behavior is to 'aggressively cache', not a true
>  > "network-disconnected" mode.
>  >
>  > I'm proposing that the option have one or both of two things done to it:
>  >
>  > 1) simply renamed to something more descriptive ie cache_aggressive on/off.
>  >
>  > 2) that it be kept as 'offline' but changed to accept a range of flags
>  > indicating which of the external-interface engines inside squid are to
>  > behave as if they were offline. ie "offline cache dns auth"
>  >
>  > When any given feature is marked for offline it behaves in a true "no
>  > packet leaves the interface" fashion.
>  > ie
>  >   - 'dns' means no new DNS lookups are performed.
>  >   - 'cache' means no refreshes are performed (RFC 2616 allows for
>  >     stale objects already in cache to be used with certain response types)
>  >   - etc.
>  >
>  > We may need to exempt localhost traffic from offline mode explicitly,
>  > but we will need to be consistent about it.
>  >
>  > The existing 'aggressive caching' behaviour may be possible by other
>  > means with cache ACL or refresh_patterns. Or a 'cache=semi' style tag
>  > might be chosen.
>  >
>  > Or, we may choose to do both of the above changes.
>  >
>  > Amos
>  > --
>  > Please use Squid 2.6.STABLE19 or 3.0.STABLE4
>
>  --
>  - Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
>  - $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -
>



-- 
 /kinkie