Ahhh. The parallel processing in << common-src/bsdtcp-security.c,
function runbsdtcp >> *ALSO* does not connect asynchronously, and is ALSO
giving me a 3 minute TCP wait if the node is actually down. Both bits of
code have comments saying they are calling asynchronously, but neither one
is doing so yet.

I understand; it’s an effort to code that! I’ve only gotten as far as
realizing that both bsdtcp and krb5 go through this same spot in the code:
<< common-src/util.c, function connect_port >>

I *had* already realized that both bsdtcp and krb5 clients were giving me
3 minute TCP waits (and, as I found later, that wait is multiplied by the
“connect-tries” parameter in my config, which defaults to 3). I did try
setting that to 1 and lost all my other (up) clients anyway. It seems to
be the first wait that bothers everybody else.

So: TCP nodes (which means bsdtcp and krb5 and maybe others, but not bsd)
have a wait time that defaults to 3 minutes 9 seconds, times
“connect-tries”, if a node is offline.

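One plausible source of the odd 3 minutes 9 seconds figure, offered as an assumption rather than something checked on this server: older Linux kernels start with a 3 second SYN retransmission timeout, double it after each attempt, and retry 5 times by default (tcp_syn_retries). The small sketch below just adds that up; syn_wait_seconds is a made-up helper name.

```c
/* Total seconds a blocking connect() waits before giving up, assuming the
 * kernel sends one initial SYN and then `retries` retransmissions, waiting
 * initial_rto seconds after the first and doubling the wait each time. */
unsigned int syn_wait_seconds(unsigned int initial_rto, unsigned int retries)
{
    unsigned int total = 0;
    unsigned int rto = initial_rto;
    for (unsigned int attempt = 0; attempt <= retries; attempt++) {
        total += rto;   /* wait for an answer to this SYN */
        rto *= 2;       /* exponential backoff for the next one */
    }
    return total;
}
```

Under those assumptions, syn_wait_seconds(3, 5) is 3 + 6 + 12 + 24 + 48 + 96 = 189 seconds, which is exactly 3 minutes 9 seconds. With “connect-tries” at its default of 3 that comes to 567 seconds, about 9.5 minutes, for each client that is down.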
If I didn’t have any bsd clients (udp connections), would this not be
bothering me? Do you other people with only bsdtcp and/or krb5 clients
have no problems if a node is offline? (If so, then I’ll push to upgrade
those older clients, instead of trying to re-write the code!)

Deb Baddorf
Fermilab
On Sep 23, 2014, at 10:56 AM, Jean-Louis Martineau wrote:
> Debra,
>
> The patch created other problems and was later reverted (I don't remember
> what was the problem).
>
> You can try it if you want:
> In common-src/krb5-security.c, function runkrb5,
> change the last argument of stream_client from 0 to 1.
>
> Jean-Louis
>
> On 09/22/2014 04:19 PM, Debra S Baddorf wrote:
>> I seem to recall a patch for this, but I can’t find it now. It’s finally
>> happened firmly enough that I can reproduce it, on amcheck at least:
>>
>> amcheck (and sometimes amdump) hang indefinitely if the client is powered
>> down.
>>
>> amanda v3.3.6 server (happened on 3.3.3 too, so I upgraded, but it still
>> happens)
>> 2 clients - are powered down so version cannot matter!
>> both are (were) using auth=krb5
>> other krb5 clients work fine and have done so for more than a year
>>
>> The server’s log doesn’t have anything useful; I killed the process at
>> about 15:10.
>>
>>
>> server/daily/amcheck.20140922150805.debug
>> ::
>> Mon Sep 22 15:08:05 2014: thd-0x9550330: amcheck: pid 3893 ruid 0 euid 11
>> version 3.3.6: start at Mon Sep 22 15:08:05 2014
>> Mon Sep 22 15:08:05 2014: thd-0x9550330: amcheck: pid 3893 ruid 0 euid 11
>> version 3.3.6: rename at Mon Sep 22 15:08:05 2014
>> Mon Sep 22 15:08:05 2014: thd-0x9550330: amcheck-clients:
>> security_getdriver(name=krb5) returns 0x193240
>> Mon Sep 22 15:08:05 2014: thd-0x9550330: amcheck-clients:
>> security_handleinit(handle=0x96251c8, driver=0x193240 (KRB5))
>> Mon Sep 22 15:08:05 2014: thd-0x9550330: amcheck-clients:
>> security_streaminit(stream=0x9625750, driver=0x193240 (KRB5))
>> Mon Sep 22 15:08:05 2014: thd-0x9550330: amcheck-clients: make_socket
>> opening socket with family 2
>> Mon Sep 22 15:08:05 2014: thd-0x9550330: amcheck-clients: connect_port: Try
>> port 5: available - Success
>>
>> Can somebody point me to the patch I remember hearing about?
>>
>> Deb Baddorf
>> Fermilab
>