Re: [PATCH] Comm cleanup (Async ConnOpener)

2010-07-18 Thread Amos Jeffries
On Sun, 18 Jul 2010 19:15:37 -0600, Alex Rousskov
 wrote:
> On 07/18/2010 04:52 AM, Amos Jeffries wrote:
>> This updated my previous submission for the comm connection opener
model.
>> 
>> diff since last time:
>> 
>> - removed the multi-path pathways. They are now isolated at the upper
>> level. ConnOpener now handles only one Comm::Connection.
>> 
>> - implemented ConnOpener as an AsyncJob child.
>> 
>> 
>> What has changed from HEAD:
>> 
>> ConnectionDetails objects have been renamed Comm::Connection and been
>> extended to hold the FD and Squids' socket flags.
> ...
>>  On COMM_ERR_CONNECT and COMM_TIMEOUT the conn will be !isOpen() or
NULL
>> and should not be relied on.
> 
> If the connection pointer is NULL, a job opening multiple connections
> would not be able to tell which connection just failed. Let's not hide
> connection pointer from the initiator.

Understood. I'm not sure exactly why I wrote NULL as a case there.

> 
> * Can Adaptation::Icap::Xaction::closeConnection set connection to NULL?
> 

Yes.

Since I got to know AsyncJob a bit better now I think the concept of
"connection" pointer in Xaction is maybe a little wrong. Check me on this
but here are a few state things I found working with FwdState and the
InternalServer from the PAC stuff...


In the Xaction code "connection" is meant to indicate both that a
connection is being attempted and what that connection is (its FD). right?

In the async time between connect being started and connect being
completed the FD-state of connection switches from being <0 to being >=0.
This starts a race condition between the ConnOpener finishing and any
caller abort/timeout handler which may try to close an half-open conn->fd
itself. Which would trigger the ConnOpener to open a new socket on it's
next call and re-start the connection setup, then die on its AsyncCall
callback (for some reason the dialer GetParams fails with a NULL params
object if the caller has died but the call is still scheduled without
cancelling).

The solution I use in the ConnOpener itself (for earlyabort and timeout)
is to have the caller hold a AsyncCall::Pointer to the call is told the
ConnOpener to make. And use that ptrs NULL / not-NULL state in the abort
code to identify if a ConnOpener is running or not. IF a ConnOpener is
running then the call (for which we now have a ptr) can be cancelled and
the conn 'erased' (Comm::ConnectionPointer set to NULL along with the held
AsyncCall::Pointer). This leaves the ConnOpener to finish and no callback
happens, ConnOpener completing then re-closes the FD which was attempted.


> 
> * FwdState::complete may shift paths (via reforward). If that happens,
> the following will close a different/new connection or assert/throw (or
> even segfault in the current patch):
> 
>  fwd->complete();
> -comm_close(fd);
> +fwd->conn()->close();
> 
> Overall, exposing connection pointer to Servers seems wrong, especially
> because the pointer may change from one call to another. A Server that
> needs that pointer, should store it at initialization time, just like it
> stores fd in the current code.
> 
> You have "TODO: store Comm::Connection instead of FD". This is an XXX
> and not just a TODO because of the above concerns.

Ah, so XXX to you means something other than a personal TODO marker?


> 
> 
> That is all I had time for, sorry.
> 

No prob. thats heaps for me to work on tonight :)



Re: [PATCH] Comm cleanup (Async ConnOpener)

2010-07-18 Thread Alex Rousskov
On 07/18/2010 04:52 AM, Amos Jeffries wrote:
> This updated my previous submission for the comm connection opener model.
> 
> diff since last time:
> 
> - removed the multi-path pathways. They are now isolated at the upper
> level. ConnOpener now handles only one Comm::Connection.
> 
> - implemented ConnOpener as an AsyncJob child.
> 
> 
> What has changed from HEAD:
> 
> ConnectionDetails objects have been renamed Comm::Connection and been
> extended to hold the FD and Squids' socket flags.
...
>  On COMM_ERR_CONNECT and COMM_TIMEOUT the conn will be !isOpen() or NULL
> and should not be relied on.

If the connection pointer is NULL, a job opening multiple connections
would not be able to tell which connection just failed. Let's not hide
connection pointer from the initiator.



* Can Adaptation::Icap::Xaction::closeConnection set connection to NULL?

* Connection::close should set fd to -1 only when open (since we have to
spent time doing the check, let's use the result).

* declare ConnOpener::start() as virtual, especially since you declare
other API methods virtual.

* ConnOpener::start(), doneAll(), and swanSong() should remain protected
as they are in the AsyncJob.

* IMO, "" distract and encourage others to put even more
fanciful decorations into the code.

* ConnOpener::operator = should return a reference to ConnOpener, not
just ConnOpener.


* You document private and protected methods in .h files where they are
declared. Did not we agree to document private and protected methods in
.cc files, where they are defined? Or was I doing it wrong for the last
few patches?

* s/solo/conn/ and adjust its documentation since ConnOpener no longer
handles multiple connections?

* Move connect_timeout declaration lower, just above connstart to reduce
padding?

* s/struct _calls/struct Calls/


* Naming consistency:

  s/connstart/connStart/
  s/earlyabort/earlyAbort/
  s/connect_timeout/connectTimeout/
  s/fail_retries/failRetries/


* cbdataReferenceDone() already checks for NULL pointer. You can save
two lines and some cycles if you do not do it in
Comm::Connection::~Connection().

* If you do not want Connections to be copied and assigned, make the
corresponding methods private. It would produce errors earlier and the
error messages will be a little more clear.

* No need to pre-declare "class Connection" in
src/comm/ListenStateData.h because you already included comm/forward.h.

* Do you need "Comm::" prefixes inside Comm namespace in comm/forward.h?


* Please remove the following line. It is a waste of CPU cycles, not to
mention a blunt layering violation:

memset(&calls, 0, sizeof(calls));


* Please remove these redundant and inconsistent lines in
ConnOpener::~ConnOpener():

+solo = NULL;
+calls.earlyabort = NULL;
+calls.timeout = NULL;

* I do not recommend putting debugging in doneAll(). The best place to
report state is in status(), which is called automatically before and
after every job call (and can be used in the debugging statements inside
the job, of course).

* doneAll() should call its parent, even if the current parent always
returns true.

* Should ConnOpener::swanSong() call comm_close instead of just close()?

* ConnOpener::callCallback() clears the close handler and timeout
depending on solo state, which is kind of wrong, but the best thing to
do is to remove that cleanup code from callCallback() completely because
the same code is already present in swanSong() and because it is
unrelated to calling the callback.

* It is better not to not assert() inside async job code. Use Must() and
let Squid kill the job instead of killing Squid. One could make an
exception for before-start() code and for the destructor, but even those
are debatable.

* solo->fd > 0 is inconsistent with other checks. I would recommend
replacing (solo != NULL && solo->fd >= 0) with an inlined
hasConnection() or similar method to avoid these kind of problems.

* The /* handle connecting to one single path */ comment is obsolete.

* It may not be a good idea to separate calls.earlyabort and
calls.timeout existence from the existence of the corresponding Comm
close or timeout handler. Set them when setting the Comm handler. Clear
them when clearing the handler.

* I still think that keeping an array of unopened connections and using
  the first entry as "real connection" in FwdState is wrong. However, if
you insist on doing that, let's at least hide that ugliness using a few
simple inlined methods:

  - s/paths.size() > 0 && paths[0]->fd > -1/connectedToServer()/
  - s/paths[0]->/serverConnection()/

The serverConnection() method should throw or assert if paths are empty.

Actually, you already have conn() method so you can just fix that: make
it return a reference to the pointer and check that paths are not empty.
this is especially important since we call conn() from outside the
FwdState object.


* "if (paths[0]->fd > -1)" statements in the reforwarding code do not
check that the paths are not 

Build failed in Hudson: 3.HEAD-amd64-CentOS-5.3 #659

2010-07-18 Thread noc
See 

Changes:

[Amos Jeffries ] Correction documentation of QoS 
disable-preserve-miss

[Alex Rousskov ] Added debugging scripts that 
work with detailed cache.log

scripts/find-alive.pl: pinpoint objects that are still alive, to find leaks
scripts/trace-job.pl: find cache.log lines that correspond to a given job
scripts/trace-master.pl: trace jobs related to a single master transaction

The scripts require maintenance as the logging format changes, but
they often simplify debugging by extracting relevant information from
tons of poorly structured cache.log data.

--
Started by an SCM change
Building on master
[workspace] $ bzr log --show-ids -r -1
[workspace] $ bzr pull --overwrite http://bzr.squid-cache.org/bzr/squid3/trunk/
http://bzr.squid-cache.org/bzr/squid3/trunk is permanently redirected to 
http://bzr.squid-cache.org/bzr/squid3/trunk/
 M  src/cf.data.pre
All changes applied successfully.
Now on revision 10644.
[workspace] $ bzr log --show-ids -r -1
[workspace] $ bzr log -v -r 
revid:rouss...@measurement-factory.com-20100716223742-xiaw4wr97k3e8ttx..revid:amosjeffr...@squid-cache.org-20100719001430-mw16xajd1iz83lk8
 --long --show-ids
[workspace] $ /bin/sh -xe /tmp/hudson7991644986974526702.sh
+ ./test-builds.sh --verbose --use-config-cache
TESTING: layer-00-bootstrap
total 8
drwxr-xr-x  2 hudson hudson 4096 Jul 19 03:01 .
drwxr-xr-x 26 hudson hudson 4096 Jul 19 03:01 ..
BUILD: .././test-suite/buildtests/layer-00-bootstrap.opts
automake (1.9) : automake
autoconf (2.61) : autoconf
libtool  (1.5) : libtool
Bootstrapping 
compat/Makefile.am:12: Libtool library used but `LIBTOOL' is undefined
compat/Makefile.am:12: 
compat/Makefile.am:12: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
compat/Makefile.am:12: to `configure.in' and run `aclocal' and `autoconf' again.
src/Makefile.am:153: Libtool library used but `LIBTOOL' is undefined
src/Makefile.am:153: 
src/Makefile.am:153: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
src/Makefile.am:153: to `configure.in' and run `aclocal' and `autoconf' again.
src/acl/Makefile.am:4: Libtool library used but `LIBTOOL' is undefined
src/acl/Makefile.am:4: 
src/acl/Makefile.am:4: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
src/acl/Makefile.am:4: to `configure.in' and run `aclocal' and `autoconf' again.
src/adaptation/Makefile.am:15: Libtool library used but `LIBTOOL' is undefined
src/adaptation/Makefile.am:15: 
src/adaptation/Makefile.am:15: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
src/adaptation/Makefile.am:15: to `configure.in' and run `aclocal' and 
`autoconf' again.
src/adaptation/ecap/Makefile.am:4: Libtool library used but `LIBTOOL' is 
undefined
src/adaptation/ecap/Makefile.am:4: 
src/adaptation/ecap/Makefile.am:4: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
src/adaptation/ecap/Makefile.am:4: to `configure.in' and run `aclocal' and 
`autoconf' again.
src/adaptation/icap/Makefile.am:4: Libtool library used but `LIBTOOL' is 
undefined
src/adaptation/icap/Makefile.am:4: 
src/adaptation/icap/Makefile.am:4: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
src/adaptation/icap/Makefile.am:4: to `configure.in' and run `aclocal' and 
`autoconf' again.
src/auth/Makefile.am:7: Libtool library used but `LIBTOOL' is undefined
src/auth/Makefile.am:7: 
src/auth/Makefile.am:7: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
src/auth/Makefile.am:7: to `configure.in' and run `aclocal' and `autoconf' 
again.
src/base/Makefile.am:5: Libtool library used but `LIBTOOL' is undefined
src/base/Makefile.am:5: 
src/base/Makefile.am:5: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
src/base/Makefile.am:5: to `configure.in' and run `aclocal' and `autoconf' 
again.
src/comm/Makefile.am:4: Libtool library used but `LIBTOOL' is undefined
src/comm/Makefile.am:4: 
src/comm/Makefile.am:4: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
src/comm/Makefile.am:4: to `configure.in' and run `aclocal' and `autoconf' 
again.
src/esi/Makefile.am:4: Libtool library used but `LIBTOOL' is undefined
src/esi/Makefile.am:4: 
src/esi/Makefile.am:4: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
src/esi/Makefile.am:4: to `configure.in' and run `aclocal' and `autoconf' again.
src/eui/Makefile.am:4: Libtool library used but `LIBTOOL' is undefined
src/eui/Makefile.am:4: 
src/eui/Makefile.am:4: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
src/eui/Makefile.am:4: to `configure.in' and run `aclocal' and `autoconf' again.
src/fs/Makefile.am:3: Libtool library used but `LIBTOOL' is undefined
src/fs/Makefile.am:3: 
src/fs/Makefile.am:3: The usual way to define `LIBTOOL' is to add 
`AC_PROG_LIBTOOL'
src/fs/Makefile.am:3: to `configure.in' and run `aclocal' and `autoconf' again.
src/icmp/Makefile.am:25: Libtool library used but `LIBTOO

Re: Marking uncached packets with a netfilter mark value

2010-07-18 Thread Amos Jeffries
On Sun, 18 Jul 2010 15:11:52 +0100, Andrew Beverley 
wrote:
>>  So, do you have a clear use-case we can add to the wiki and commit
>> message?
> 
> I propose extending the current QualityOfService feature as follows. The
> existing http://wiki.squid-cache.org/Features/QualityOfService page
> should read:
> 
>   * Allows you to set a TOS/Diffserv value to mark local and peer hits.
>   * For platforms using netfilter, allows you to set a netfilter mark
> value instead of, or in addition to, a TOS value.
>   * Allows you to selectively set only sibling or parent requests 
>   * Allows any HTTP response towards clients to have the TOS value of
> the response coming from the remote server, or in the case of
> marking, the incoming connection's netfilter mark value. For this to
> work correctly with a TOS value, you will need to patch your linux
> kernel with the TOS preserving ZPH patch. The kernel patch can be
> downloaded from http://zph.bratcheda.org. No patch is needed for a
> netfilter mark.
>   * Allows you to mask certain bits in the TOS or mark received, before
> copying the value towards clients. 
> 
>>  qos_flows - adding an initial flag "tos"|"mark" which determines which
>> marking type is to be set. Followed by the current (or extended)
>> stream=value tags. Default to "tos" if missing for backward
compatibility
> 
> Agree with the above for the config file.
> 
>>  So we end up with:
>>qos_flows tos parent-hit=0xA sibling-hit=0xB
>>qos_flows mark local-miss=0x1
> 
> I propose just the addition of the tos|mark flag and leave the remainder
> of the options the same. I don't see any need to add a local-miss
> option, as the user can mark packets before they hit Squid.
> 
> To keep things simple, I propose that the patch is still enabled with
> --enable-zph-qos as with the current TOS patch. However, the mark patch
> will need the libnetfilter_conntrack library, so should a separate
> compiler flag be used instead?

I've been thinking of removing the "zph-" part of the option name for a
while. For now its fine as a main on/off switch for the QoS marking.

--enable/disable-linux-netfilter will also be involved in the logics. If
set to "no" then it override disables this netfilter feature.

A new --with-netfilter-conntrack option will be needed to set linking for
the particular library. With path as a optional parameter, and --without
meaning not to link (probably to prevent the code building as well).

> 
> Incidentally, there is a mistake in the documentation for the existing
> QOS patch. At http://www.squid-cache.org/Doc/config/qos_flows/ it
> states:
> 
> disable-preserve-miss
>   If set, any HTTP response towards clients will
>   have the TOS value of the response comming from the
>   remote server masked with the value of miss-mask.
> 
> This should read:
>   By default, the existing TOS value of the response coming from the
> remote server will be retained and masked with miss-mark. This option
> disables that feature.

Oops. Thanks. Fixed.

Amos


Re: Marking uncached packets with a netfilter mark value

2010-07-18 Thread Andrew Beverley
>  So, do you have a clear use-case we can add to the wiki and commit
> message?

I propose extending the current QualityOfService feature as follows. The
existing http://wiki.squid-cache.org/Features/QualityOfService page
should read:

  * Allows you to set a TOS/Diffserv value to mark local and peer hits.
  * For platforms using netfilter, allows you to set a netfilter mark
value instead of, or in addition to, a TOS value.
  * Allows you to selectively set only sibling or parent requests 
  * Allows any HTTP response towards clients to have the TOS value of
the response coming from the remote server, or in the case of
marking, the incoming connection's netfilter mark value. For this to
work correctly with a TOS value, you will need to patch your linux
kernel with the TOS preserving ZPH patch. The kernel patch can be
downloaded from http://zph.bratcheda.org. No patch is needed for a
netfilter mark.
  * Allows you to mask certain bits in the TOS or mark received, before
copying the value towards clients. 

>  qos_flows - adding an initial flag "tos"|"mark" which determines which
> marking type is to be set. Followed by the current (or extended)
> stream=value tags. Default to "tos" if missing for backward compatibility

Agree with the above for the config file.

>  So we end up with:
>qos_flows tos parent-hit=0xA sibling-hit=0xB
>qos_flows mark local-miss=0x1

I propose just the addition of the tos|mark flag and leave the remainder
of the options the same. I don't see any need to add a local-miss
option, as the user can mark packets before they hit Squid.

To keep things simple, I propose that the patch is still enabled with
--enable-zph-qos as with the current TOS patch. However, the mark patch
will need the libnetfilter_conntrack library, so should a separate
compiler flag be used instead?

Incidentally, there is a mistake in the documentation for the existing
QOS patch. At http://www.squid-cache.org/Doc/config/qos_flows/ it
states:

disable-preserve-miss
If set, any HTTP response towards clients will
have the TOS value of the response comming from the
remote server masked with the value of miss-mask.

This should read:
By default, the existing TOS value of the response coming from the
remote server will be retained and masked with miss-mark. This option
disables that feature.

Regards,

Andy