Re: [OpenAFS-devel] Re: idle dead timeout processing in clients

Simon Wilkinson Wed, 30 Nov 2011 13:15:03 -0800

On 30 Nov 2011, at 18:58, Andrew Deason wrote:

> On Wed, 30 Nov 2011 18:48:47 +0000
> Simon Wilkinson <[email protected]> wrote:
> 
>> The idle dead code isn't in any shipping versions of 1.4. Current 1.4
>> clients won't get RX_CALL_TIMEOUT, or RX_CALL_DEAD.
> 
> I'm not sure if we're talking about completely different things or what.
> The afs_BlackListOnce code exists in (shipping) 1.4 and, I mean, it
> certainly gets _called_. If I insert a sleep(10000) into the FetchStatus
> handler, the client will give an error (or failover to another site,
> etc); it won't just hang forever on the request.


Okay, so this is all a bit convoluted (isn't everything with RX!). There are 
two ways in which an idle dead timeout can be caused...

The relevant code from rxi_CheckCall() is:

    /* see if we have a non-activity timeout */
    if (call->startWait && idleDeadTime
        && ((call->startWait + idleDeadTime) < now) &&
        (call->flags & RX_CALL_READER_WAIT)) {
        if (call->state == RX_STATE_ACTIVE) {
            cerror = RX_CALL_TIMEOUT;
            goto mtuout;
        }
    }
    if (call->lastSendData && idleDeadTime && (conn->idleDeadErr != 0)
        && ((call->lastSendData + idleDeadTime) < now)) {
        if (call->state == RX_STATE_ACTIVE) {
            cerror = conn->idleDeadErr;
            goto mtuout;
        }
    }

The first block is in 1.4.x, and is enabled there - it returns RX_CALL_TIMEOUT,
which is handled by afs_BlackListOnce. The second block is only enabled on 1.6
and master, where conn->idleDeadErr is configured to be RX_CALL_DEAD.

The first block only fires on clients which have turned the call around, and
are now attempting to read from the fileserver. This is actually really fragile
- what RX_CALL_READER_WAIT actually means is that the application thread has
managed to push all of its packets into the RX layer, and is now blocked in
rx_Read(). In the current implementation this just means that the number of
transmitted packets left unacknowledged is less than 2x the current window
size. For pretty much every AFS-3 RPC other than StoreData, the flag tells us
nothing - we'll enter READER_WAIT immediately. What it does mean is that this
block is very unlikely to fire for StoreData, as for most chunk sizes we'll be
writing more packets than can be held in the buffer.

So, with StoreData we'll hit the second block. If the other end isn't reading 
packets out of RX (because it's blocked on I/O, for example), we won't be able 
to send any packets, and we'll trigger the timeout.
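
To make the difference between the two blocks concrete, here is a rough
standalone model of them. It is only a sketch: struct model_call,
model_check_idle() and the MODEL_* values are names I have made up for
illustration, not anything from rx.c, and passing idleDeadErr as 0 is just my
way of modelling a client where the second block is not enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the rx error codes; the values are made up
 * and are not the ones defined in rx.h. */
enum { MODEL_OK = 0, MODEL_CALL_TIMEOUT = 1, MODEL_CALL_DEAD = 2 };

/* Only the fields that the two checks in rxi_CheckCall() look at. */
struct model_call {
    long startWait;     /* when the reader started waiting; 0 = not waiting */
    long lastSendData;  /* when data was last sent; 0 = nothing sent yet */
    bool readerWait;    /* models (call->flags & RX_CALL_READER_WAIT) */
    bool active;        /* models call->state == RX_STATE_ACTIVE */
};

/* Returns the error the two blocks would raise at time 'now', or MODEL_OK.
 * idleDeadErr == 0 switches the second block off. */
static int
model_check_idle(const struct model_call *call, long now,
                 long idleDeadTime, int idleDeadErr)
{
    assert(idleDeadTime >= 0);

    /* First block: the reader has been waiting for longer than the timeout. */
    if (call->startWait && idleDeadTime
        && (call->startWait + idleDeadTime) < now
        && call->readerWait && call->active)
        return MODEL_CALL_TIMEOUT;

    /* Second block: we have not managed to send data within the timeout. */
    if (call->lastSendData && idleDeadTime && idleDeadErr != 0
        && (call->lastSendData + idleDeadTime) < now
        && call->active)
        return idleDeadErr;

    return MODEL_OK;
}
```

With idleDeadTime set to 60, a small RPC that entered READER_WAIT at time 100
gets MODEL_CALL_TIMEOUT once now passes 160. A stalled StoreData that last sent
data at time 100 and never reached READER_WAIT gets MODEL_CALL_DEAD at the same
point, but only if idleDeadErr is non-zero.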

It's this behaviour, coupled with the lack of error handling for CALL_DEAD, and 
the fact that we don't try and flush a full cache apart from when we're writing 
to it, that was the root cause of the original bug report.

However, all of this has exposed some real problems with the idle dead code as
it currently stands. I believe that some of them are the root cause of some
long-standing bug reports.

1) If you have an RPC with a small number of arguments (say CreateFile) the
client will end up in READER_WAIT as soon as it has transmitted the first
packet. If that CreateFile requires a callback break which takes longer than
the idle dead timeout, then the client will time out the call with
CALL_TIMEOUT. In the meantime, the server will complete the callback break, and
create the file. afs_Analyze will receive CALL_TIMEOUT and retry the operation,
the server will see that the file already exists, and return EEXIST. So, we
have an operation that has actually succeeded returning an error.

2) In cases where a fileserver is taking a long time to break callbacks, the
client can end up giving up due to idle dead timeouts, even if the server would
later be able to handle its request. In 1.4, we'll retry, and (possibly)
succeed. In 1.6, we'll tend to hit the second block first, and so fail. However,
just retrying has penalties ...

3) Idle dead is a big cause of call busy problems. It breaks the agreement
between the client's and the server's views of which call slots are empty. Take
the example of a client that has slots 2,3,4 busy with long-running store
operations. Slot 1 hits an idle dead timeout, and the client must retry. So, it
starts a new call to the server, but the only slot that's available is slot 1.
It starts a call on that slot, but that's then bounced back with CALL_BUSY by
the server, which still considers slot 1 to be in use.
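
Points 1 and 3 are easy to reproduce in a toy model. The following is only a
sketch of the two failure modes, not of the real client or fileserver: every
name in it (sim_server, sim_conn, the SIM_* codes and so on) is hypothetical,
and slot index 0 stands for "slot 1" in the description above.

```c
#include <assert.h>
#include <stdbool.h>

#define NSLOTS 4    /* call slots on one connection in this model */

/* Made-up result codes for the model. */
enum { SIM_OK = 0, SIM_EEXIST, SIM_CALL_BUSY, SIM_NO_SLOT };

/* ---- Point 1: a CreateFile that succeeds but reports EEXIST ---- */

struct sim_server { bool file_exists; };

/* The server always completes the create, however long it takes. */
static int
sim_server_create(struct sim_server *s)
{
    if (s->file_exists)
        return SIM_EEXIST;
    s->file_exists = true;
    return SIM_OK;
}

/* If the reply takes longer than the idle dead timeout, the client never
 * sees it: it gets a timeout instead and retries the operation once. */
static int
sim_client_create(struct sim_server *s, long reply_delay, long idle_dead)
{
    int code = sim_server_create(s);    /* first attempt runs on the server */
    if (reply_delay > idle_dead)
        code = sim_server_create(s);    /* retry after the client timed out */
    return code;
}

/* ---- Point 3: client and server disagree about free call slots ---- */

struct sim_conn {
    bool client_busy[NSLOTS];   /* slots the client believes are in use */
    bool server_busy[NSLOTS];   /* slots the server believes are in use */
};

/* The client starts a call on the first slot it believes is free; the
 * server bounces it if it still has a call running on that slot. */
static int
sim_start_call(struct sim_conn *c, int *slot)
{
    for (int i = 0; i < NSLOTS; i++) {
        if (c->client_busy[i])
            continue;
        *slot = i;
        if (c->server_busy[i])
            return SIM_CALL_BUSY;
        c->client_busy[i] = c->server_busy[i] = true;
        return SIM_OK;
    }
    *slot = -1;
    return SIM_NO_SLOT;
}

/* An idle dead timeout frees the slot on the client side only. */
static void
sim_client_idle_dead(struct sim_conn *c, int slot)
{
    assert(slot >= 0 && slot < NSLOTS);
    c->client_busy[slot] = false;
}
```

Calling sim_client_create() with a reply_delay larger than idle_dead returns
SIM_EEXIST even though the file was created. Likewise, after four successful
sim_start_call()s, a sim_client_idle_dead() on slot index 0 followed by another
sim_start_call() picks slot index 0 again and gets SIM_CALL_BUSY.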

Cheers,

Simon.

_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel
