Re: [OpenAFS] Re: DB servers "quorum" and OpenAFS tools

2014-01-24 Thread Harald Barth
> The problem is that you want the client to scan "quickly" to find a server > that is up, but because networks are not perfectly reliable and drop > packets all the time, it cannot know that a server is not up until that > server has failed to respond to multiple retransmissions of the request. > Those

Re: [OpenAFS] Re: DB servers "quorum" and OpenAFS tools

2014-01-24 Thread Simon Wilkinson
On 24 Jan 2014, at 07:48, Harald Barth wrote: > You are completely right if one must talk to that server. But I think > that AFS/RX sometimes hangs too long waiting for one server > instead of trying the next one. For example, for questions that could > be answered by any VLDB. I'm thinking

[OpenAFS] OpenAFS 1.7.2900 for windows report

2014-01-24 Thread Lars Schimmer
Hi! I just want to write a short report of success with OpenAFS 1.7.2900 for Windows. We have had some problems from time to time with roaming profiles not syncing on logout; the system stayed on the "logging out" screen forever. We did not find any evidence of this in the Windows logs. But si

RE: [OpenAFS] Re: mkdir() performance on AFS client

2014-01-24 Thread milek
> > The Windows cache manager even takes things a step further by > maintaining a negative cache for EACCES errors on {FID, user}. This > has avoided hitting the abort threshold limits triggered by Windows > that > assumes that if it can list a directory it must be able to read the > status of a

Re: [OpenAFS] Re: DB servers "quorum" and OpenAFS tools

2014-01-24 Thread Peter Grandi
>>> For example, in an ideal world putting more or fewer DB servers >>> in the client 'CellServDB' should not matter, as long as one >>> that belongs to the cell is up; again if the logic were for >>> all types of client: "scan quickly the list of potential DB >>> servers, find one that is up and bel

Re: [OpenAFS] Re: mkdir() performance on AFS client

2014-01-24 Thread Jeffrey Altman
On 1/24/2014 8:56 AM, mi...@task.gda.pl wrote: >> >> The Windows cache manager even takes things a step further by >> maintaining a negative cache for EACCES errors on {FID, user}. This >> has avoided hitting the abort threshold limits triggered by Windows >> that >> assumes that if it can list a

Re: [OpenAFS] DB servers "quorum" and OpenAFS tools

2014-01-24 Thread Neil Davies
Peter, to solve this you can't just use the round trip in its raw form; you need to understand it in terms of how the "delay and loss" accrued. It's a bit too long (and potentially off-topic) for this list, but briefly the way we perform this sort of analysis (in my day job) is to view it as quality

[OpenAFS] Re: DB servers "quorum" and OpenAFS tools

2014-01-24 Thread Andrew Deason
On Thu, 23 Jan 2014 21:55:15 + p...@afs.list.sabi.co.uk (Peter Grandi) wrote: > > Otherwise, when your network becomes congested, the > > retransmission of dropped packets will act as a runaway positive > > feedback loop, making the congestion worse and saturating the > > network. > > I am so

Re: [OpenAFS] Re: DB servers "quorum" and OpenAFS tools

2014-01-24 Thread Harald Barth
> I have long thought that we should be using multi for vldb lookups, > specifically to avoid the problems with down database servers. The situation is a little bit different for cache managers, which can remember which servers are down, and command line tools, which normally discover how the world

Re: [OpenAFS] Re: DB servers "quorum" and OpenAFS tools

2014-01-24 Thread Jeffrey Hutzelman
On Fri, 2014-01-24 at 08:01 +, Simon Wilkinson wrote: > On 24 Jan 2014, at 07:48, Harald Barth wrote: > > > You are completely right if one must talk to that server. But I think > > that AFS/RX sometimes hangs too long waiting for one server > > instead of trying the next one. For exam

Re: [OpenAFS] Re: DB servers "quorum" and OpenAFS tools

2014-01-24 Thread Brandon Allbery
On Fri, 2014-01-24 at 11:41 -0500, Jeffrey Hutzelman wrote: > The problem is the one-off clients that make _one RPC_ and then exit. > They have no opportunity to remember what didn't work last time. It Has it been considered to write a cache file somewhere (even a user dotfile) that could be used

Re: [OpenAFS] Re: DB servers "quorum" and OpenAFS tools

2014-01-24 Thread Jeffrey Altman
On 1/24/2014 11:45 AM, Brandon Allbery wrote: > On Fri, 2014-01-24 at 11:41 -0500, Jeffrey Hutzelman wrote: >> The problem is the one-off clients that make _one RPC_ and then exit. >> They have no opportunity to remember what didn't work last time. It > > Has it been considered to write a cache f

[OpenAFS] Re: DB servers "quorum" and OpenAFS tools

2014-01-24 Thread Andrew Deason
On Fri, 24 Jan 2014 11:41:35 -0500 Jeffrey Hutzelman wrote: > The problem is the one-off clients that make _one RPC_ and then exit. > They have no opportunity to remember what didn't work last time. It > might help some for these sorts of clients to use multi, if they're > doing read-only reques

Re: [OpenAFS] Re: 'afs/' principal rekeying instructions may be incomplete

2014-01-24 Thread Benjamin Kaduk
Sorry for the delayed response. It looks like Jeffrey's and Andrew's responses should have addressed the major issues. It would also be a little easier for me if the attribution of who wrote the quoted text was retained. On Thu, 23 Jan 2014, Peter Grandi wrote: ** Crucial details for compl