[squid-dev] Build failed in Jenkins: trunk-polygraph #901

2015-10-26 Thread noc
See 

--
Started by upstream project "trunk-matrix" build number 400
originally caused by:
 Started by an SCM change
Building remotely on polygraph (12.04 amd64-Ubuntu Ubuntu amd64-Ubuntu-12.04 
Ubuntu-12.04 amd64) in workspace 

$ bzr revision-info -d 
info result: bzr revision-info -d 
 returned 0. Command 
output: "14366 squ...@treenet.co.nz-20151026025330-zir5kn2r2sbvin1b
" stderr: ""
[trunk-polygraph] $ bzr update
 M  src/peer_digest.cc
All changes applied successfully.
Updated to revision 14366 of branch http://bzr.squid-cache.org/bzr/squid3/trunk
[trunk-polygraph] $ bzr switch http://bzr.squid-cache.org/bzr/squid3/trunk/
Tree is up to date at revision 14366.
Switched to branch: http://bzr.squid-cache.org/bzr/squid3/trunk/
[trunk-polygraph] $ bzr revert
$ bzr revision-info -d 
info result: bzr revision-info -d 
 returned 0. Command 
output: "14366 squ...@treenet.co.nz-20151026025330-zir5kn2r2sbvin1b
" stderr: ""
[trunk-polygraph] $ bzr log -v -r 
revid:squ...@treenet.co.nz-20151026025330-zir5kn2r2sbvin1b..revid:squ...@treenet.co.nz-20151026025330-zir5kn2r2sbvin1b
 --long --show-ids
Getting local revision...
$ bzr revision-info -d 
info result: bzr revision-info -d 
 returned 0. Command 
output: "14366 squ...@treenet.co.nz-20151026025330-zir5kn2r2sbvin1b
" stderr: ""
RevisionState revno:14366 
revid:squ...@treenet.co.nz-20151026025330-zir5kn2r2sbvin1b
[trunk-polygraph] $ /bin/sh -xe /tmp/hudson7364323057433622698.sh
+ cd /home/jenkins/squidperf
+ python SquidBasicPerf.py --audited 
http://build.squid-cache.org/job/trunk-polygraph/830/artifact/logs/test.lx 
--jjid 901 --svnurl http://bzr.squid-cache.org/bzr/squid3/trunk/ --jobname 
trunk-polygraph
Test is failed
Build step 'Execute shell' marked build as failure
Archiving artifacts
[description-setter] Could not determine description.
___
squid-dev mailing list
squid-dev@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-dev


Re: [squid-dev] [PATCH] %

2015-10-26 Thread Alex Rousskov
On 10/21/2015 03:14 AM, Amos Jeffries wrote:
> On 21/10/2015 4:42 p.m., Alex Rousskov wrote:
>> Hello,
>>
>> Connection stats, including %> connections.
>>
>> The code reusing a pconn was missing a hier.note() call, resulting in 0
>> values logged for %> connection) and probably other missing stats.
>>
>> Also refactored poorly copied statistics collection code to remove
>> duplication and always update to-server connection stats when the actual
>> connection becomes available.
>>
>> Positive side effect: Upon setsockopt(2) failures, the tos and nfmark
>> fields of a pinned connection were set to the desired (but not actually
>> applied) values, while persistent connection fields were left intact
>> (and, hence, stale). Both fields are now reset to zero on failures, for
>> both types of connections.
>>
> 
> +1.

Committed to trunk (r14367).


Thank you,

Alex.



[squid-dev] [PATCH] No reconfiguration during shutdown

2015-10-26 Thread Alex Rousskov
Hello,

To avoid crashes, prohibit pointless reconfiguration during shutdown.

Also consolidated and polished signal action handling code:

1. For any executed action X, clear do_X at the beginning of action X
   code because once we start X, we should accept/queue more X
   requests (or inform the admin if we reject them).

2. Delay any action X requested during startup or reconfiguration
   because the latter two actions modify global state that X depends
   on. Inform the admin that the requested action is being delayed.

3. Cancel any action X requested during shutdown. We cannot run X
   during shutdown because shutdown modifies global state that X
   depends on, and we never come back from shutdown so there is no
   point in delaying X. Inform the admin that the requested action is
   canceled.
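
The three rules above can be sketched as a small standalone model. This is
only an illustration, not the code in the attached patch: the Phase and
Verdict enums and the DecideSignalAction(), RunIfAllowed(), and
DemoSignalRules() names are hypothetical, and the "inform the admin" part is
left to the caller.

```cpp
#include <cassert>

// Hypothetical stand-ins for Squid's global state; this is a standalone
// model of the rules, not the code in src/main.cc.
enum class Phase { Startup, Running, Reconfiguring, ShuttingDown };
enum class Verdict { Execute, Delay, Cancel };

// Rules 2 and 3: decide what to do with a pending request for action X.
// Canceling clears the pending flag; delaying leaves it set so that the
// request is reconsidered on a later pass.
Verdict
DecideSignalAction(const Phase phase, volatile int &pendingFlag)
{
    switch (phase) {
    case Phase::ShuttingDown:
        pendingFlag = 0; // rule 3: we never come back from shutdown
        return Verdict::Cancel;
    case Phase::Startup:
    case Phase::Reconfiguring:
        return Verdict::Delay; // rule 2: global state is still changing
    case Phase::Running:
        break;
    }
    return Verdict::Execute;
}

// Rule 1: clear the flag before doing the work of action X so that
// requests arriving while X runs are recorded rather than lost.
bool
RunIfAllowed(const Phase phase, volatile int &pendingFlag, int &runCount)
{
    if (!pendingFlag)
        return false;
    if (DecideSignalAction(phase, pendingFlag) != Verdict::Execute)
        return false;
    pendingFlag = 0;
    ++runCount; // stands in for the real work of action X
    return true;
}

// Usage example: one request is delayed, then executed; a later request
// arriving during shutdown is canceled. Returns the number of executions.
int
DemoSignalRules()
{
    volatile int do_rotate = 1;
    int runs = 0;

    RunIfAllowed(Phase::Reconfiguring, do_rotate, runs); // delayed
    assert(do_rotate == 1 && runs == 0);

    RunIfAllowed(Phase::Running, do_rotate, runs); // executed
    assert(do_rotate == 0 && runs == 1);

    do_rotate = 1;
    RunIfAllowed(Phase::ShuttingDown, do_rotate, runs); // canceled
    assert(do_rotate == 0 && runs == 1);

    return runs;
}
```

In this model a delayed request costs nothing beyond leaving the flag set,
which is why delaying and canceling differ only in whether the flag is
cleared.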

Repeated failed attempts to fix crashes related to various overlapping
signal actions confirm that this code is a lot trickier than it looks.
This change introduces a more systematic/comprehensive approach to
resolving associated conflicts compared to previous ad hoc attempts.

For example, there were several changes related to bug 3574 (trunk
r14354), but trunk Squid still crashes if SIGHUP is received at the
"wrong" time. I hope this fix will kill the remaining similar bugs or at
least make future fixes easier.

http://bugs.squid-cache.org/show_bug.cgi?id=3574


One possible future work is to split shutdown into two states:

* scheduled (waiting for timeout to expire; may not affect some of the
  signal actions) and
* in-progress (blocks out all other actions).

Currently, the two states are merged into one in trunk code (there is
only one shutting_down global). This fix does not attempt to address
that deficiency. Factory does not plan to work on this in the
foreseeable future. Please feel free to solve this problem!
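
For anyone picking this up, the split could look roughly like the sketch
below. Everything in it is hypothetical: the ShutdownState and Action enums
and ActionAllowed() do not exist in trunk, and treating log rotation as an
action that a merely scheduled shutdown does not affect is an assumption
made only for illustration.

```cpp
// Hypothetical replacement for the single shutting_down global.
enum class ShutdownState { None, Scheduled, InProgress };

// A subset of the signal-controlled actions, for illustration only.
enum class Action { Reconfigure, Rotate };

// Whether a newly requested action may still run in the given state.
// Assumption: rotation stays allowed while shutdown is only scheduled,
// while reconfiguration is pointless once shutdown has been requested.
bool
ActionAllowed(const ShutdownState state, const Action action)
{
    switch (state) {
    case ShutdownState::None:
        return true;
    case ShutdownState::Scheduled:
        return action == Action::Rotate;
    case ShutdownState::InProgress:
        return false; // blocks out all other actions
    }
    return false;
}
```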


Amos, I have also attached a "bag10s" patch that may work better for the
v3.5 branch should you decide to apply this fix to v3.5 as well.


Thank you,

Alex.
To avoid crashes, prohibit pointless reconfiguration during shutdown.

Also consolidated and polished signal action handling code:

1. For any executed action X, clear do_X at the beginning of action X
   code because once we start X, we should accept/queue more X
   requests (or inform the admin if we reject them).

2. Delay any action X requested during startup or reconfiguration
   because the latter two actions modify global state that X depends
   on. Inform the admin that the requested action is being delayed.

3. Cancel any action X requested during shutdown. We cannot run X
   during shutdown because shutdown modifies global state that X
   depends on, and we never come back from shutdown so there is no
   point in delaying X. Inform the admin that the requested action is
   canceled.

The child signal handling action is exempt from rules #2 and #3
because its code does not depend on Squid state.

Repeated failed attempts to fix crashes related to various overlapping
actions confirm that this code is a lot trickier than it looks. This
change introduces a more systematic/comprehensive approach to
resolving associated conflicts compared to previous ad hoc attempts.

=== modified file 'src/main.cc'
--- src/main.cc	2015-10-12 01:38:02 +
+++ src/main.cc	2015-10-25 17:13:45 +
@@ -220,100 +220,126 @@ private:
 Auth::Scheme::FreeAll();
 #endif
 
 eventAdd("SquidTerminate", , NULL, 0, 1, false);
 }
 
 void doShutdown(time_t wait);
 void handleStoppedChild();
 
 #if KILL_PARENT_OPT
 bool parentKillNotified;
 pid_t parentPid;
 #endif
 };
 
 int
 SignalEngine::checkEvents(int)
 {
 PROF_start(SignalEngine_checkEvents);
 
-if (do_reconfigure) {
-if (!reconfiguring && configured_once) {
-mainReconfigureStart();
-do_reconfigure = 0;
-} // else wait until previous reconfigure is done
-} else if (do_rotate) {
+if (do_reconfigure)
+mainReconfigureStart();
+else if (do_rotate)
 mainRotate();
-do_rotate = 0;
-} else if (do_shutdown) {
+else if (do_shutdown)
 doShutdown(do_shutdown > 0 ? (int) Config.shutdownLifetime : 0);
-do_shutdown = 0;
-}
-if (do_handle_stopped_child) {
-do_handle_stopped_child = 0;
+if (do_handle_stopped_child)
 handleStoppedChild();
-}
 PROF_stop(SignalEngine_checkEvents);
 return EVENT_IDLE;
 }
 
+/// Decides whether the signal-controlled action X should be delayed, canceled,
+/// or executed immediately. Clears do_X (via signalVar) as needed.
+static bool
+AvoidSignalAction(const char *description, volatile int &signalVar)
+{
+const char *avoiding = "delaying";
+const char *currentEvent = "none";
+if (shutting_down) {
+currentEvent = "shutdown";
+avoiding = "canceling";
+signalVar = 0;
+}
+else if (!configured_once)
+currentEvent = "startup";
+else if (reconfiguring)
+

Re: [squid-dev] [PATCH] No reconfiguration during shutdown

2015-10-26 Thread Amos Jeffries
On 27/10/2015 5:00 a.m., Alex Rousskov wrote:
> Hello,
> 
> To avoid crashes, prohibit pointless reconfiguration during shutdown.
> 
> Also consolidated and polished signal action handling code:
> 
> 1. For any executed action X, clear do_X at the beginning of action X
>    code because once we start X, we should accept/queue more X
>    requests (or inform the admin if we reject them).
> 
> 2. Delay any action X requested during startup or reconfiguration
>    because the latter two actions modify global state that X depends
>    on. Inform the admin that the requested action is being delayed.
> 
> 3. Cancel any action X requested during shutdown. We cannot run X
>    during shutdown because shutdown modifies global state that X
>    depends on, and we never come back from shutdown so there is no
>    point in delaying X. Inform the admin that the requested action is
>    canceled.
> 
> Repeated failed attempts to fix crashes related to various overlapping
> signal actions confirm that this code is a lot trickier than it looks.
> This change introduces a more systematic/comprehensive approach to
> resolving associated conflicts compared to previous ad hoc attempts.
> 
> For example, there were several changes related to bug 3574 (trunk
> r14354), but trunk Squid still crashes if SIGHUP is received at the
> "wrong" time. I hope this fix will kill the remaining similar bugs or at
> least make future fixes easier.
> 
> http://bugs.squid-cache.org/show_bug.cgi?id=3574
> 

+1 on this patch.

Please apply with a "--fixes squid:3574" and bug reference in the commit
title.


> 
> One possible future work is to split shutdown into two states:
> 
> * scheduled (waiting for timeout to expire; may not affect some of the
>   signal actions) and
> * in-progress (blocks out all other actions).
> 
> Currently, the two states are merged into one in trunk code (there is
> only one shutting_down global). This fix does not attempt to address
> that deficiency. Factory does not plan to work on this in the
> foreseeable future. Please feel free to solve this problem!


I did (re-)discover that the final cycle through SignalEngine when the
shutdown timeout ends does indeed drain the AsyncQueue, but it does not
wait for any other types of pending I/O or FD events that might appear
during that drain. That is paving the way for the current swap.state
read/write crashes on shutdown.

I plan to work towards Runners doing all the shutdown handling and, in
particular, hooking in some components that are currently not paying any
attention to shutdown termination (i.e. the swap.state and DNS socket
FDs). Once that conversion is completed, we shall see what remains
that needs any async handling after the timeout ends.
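
To make the idea concrete, here is a rough sketch of components being told
about both ends of the shutdown timeout. It is not Squid's actual Runners
API: ShutdownRunner, RunnerRegistry, and SwapStateWriter are hypothetical
names used only to show the shape of the hooks.

```cpp
#include <vector>

// Hypothetical interface; not Squid's actual Runners API.
class ShutdownRunner
{
public:
    virtual ~ShutdownRunner() = default;
    virtual void startShutdown() = 0;  // the shutdown timeout begins
    virtual void finishShutdown() = 0; // the shutdown timeout has ended
};

// Keeps track of interested components and notifies them in order.
class RunnerRegistry
{
public:
    void add(ShutdownRunner &runner) { runners_.push_back(&runner); }

    void startShutdown() {
        for (auto *runner : runners_)
            runner->startShutdown();
    }

    // Finish in reverse registration order, mirroring destruction order.
    void finishShutdown() {
        for (auto it = runners_.rbegin(); it != runners_.rend(); ++it)
            (*it)->finishShutdown();
    }

private:
    std::vector<ShutdownRunner *> runners_;
};

// Example component standing in for the swap.state handling: it stops
// accepting new writes when shutdown starts and closes its file at the end.
class SwapStateWriter : public ShutdownRunner
{
public:
    void startShutdown() override { acceptingWrites = false; }
    void finishShutdown() override { fileOpen = false; }

    bool acceptingWrites = true;
    bool fileOpen = true;
};
```

A component hooked in this way gets a chance to stop scheduling new I/O
before the final drain, instead of finding out only when the process exits.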

Amos