[OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Jeff Blaine

Can anyone point me at the docs where quorum election, IP
address numbering as it pertains to election, etc... lives?
I can't find what I am looking for on openafs.org

I seem to recall that the "highest IP is sync site" nonsense
(if I have that right) was addressed, but again, I cannot
find current info about the election logic.

Thanks for any info!
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Ken Hornstein
> Can anyone point me at the docs where quorum election, IP
> address numbering as it pertains to election, etc... lives?
> I can't find what I am looking for on openafs.org
>
> I seem to recall that the "highest IP is sync site" nonsense
> (if I have that right) was addressed, but again, I cannot
> find current info about the election logic.

There are two sources of documentation that I know about: A long-ago paper
by Mike Kazar, and the source code (which actually has reasonable comments).
I actually have a copy of the paper if you care.

The key source code you want is ${OPENAFS}/src/ubik/vote.c.  And in my
reading other than the support for clone servers nothing has changed in
terms of the quorum selection (it's the lowest IP address, actually).
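
A minimal sketch of the "lowest IP address wins" preference, assuming
addresses are compared numerically rather than as strings.  This is not
the code in vote.c; the function name and the addresses are hypothetical.

```python
import ipaddress

def lowest_address(servers):
    """Return the numerically lowest IPv4 address from a list of strings."""
    if not servers:
        return None
    # Compare as addresses: as plain strings "192.0.2.9" would sort
    # after "192.0.2.10".
    return str(min(ipaddress.IPv4Address(s) for s in servers))

db_servers = ["192.0.2.30", "192.0.2.9", "192.0.2.10"]
print(lowest_address(db_servers))
```

With these example addresses the preferred candidate is 192.0.2.9.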

--Ken


Re: [OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Jeff Blaine

> There are two sources of documentation that I know about: A long-ago paper
> by Mike Kazar, and the source code (which actually has reasonable comments).
> I actually have a copy of the paper if you care.
>
> The key source code you want is ${OPENAFS}/src/ubik/vote.c.  And in my
> reading other than the support for clone servers nothing has changed in
> terms of the quorum selection (it's the lowest IP address, actually).


Thanks Ken,

Yes, lowest, of course (sorry).

I can't view the .PS documents yet, but I'm not sure it's
necessary to view them if nothing has changed (I was sure
it had).

The lowest IP address favoritism decision is totally
arbitrary, no?

We're kind of screwed unless there's a way around it,
and really would not like to have to apply a local patch
with every rollout.

Andrew, Simon, Jeffrey, Derrick, et al...

Would a favor highest patch be accepted if it was controlled
via configure script, defaulting to the traditional behavior?


Re: [OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Jeffrey Altman
On 10/26/2011 1:49 PM, Jeff Blaine wrote:
> Would a favor highest patch be accepted if it was controlled
> via configure script, defaulting to the traditional behavior?

I would object.  A quorum requirement is that all servers are in
agreement with the server configuration and the quorum algorithm.  Any
change to the quorum algorithm needs to be exposed as part of the
negotiation so that a misconfigured server, or a server executing an
alternate algorithm, does not result in a failure to achieve quorum.
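
The concern above can be illustrated with a toy model.  It assumes a
plain strict-majority count and ignores Ubik's real vote counting and
timing rules; `elect`, the rule table, and the addresses are all
hypothetical.

```python
from collections import Counter
import ipaddress

def elect(servers, rule_for):
    """Toy election: each server votes using its own rule (min or max)."""
    votes = Counter(
        rule_for[s](servers, key=ipaddress.IPv4Address) for s in servers
    )
    candidate, count = votes.most_common(1)[0]
    # In this model a sync site needs a strict majority of all servers.
    return candidate if count > len(servers) / 2 else None

servers = ["192.0.2.10", "192.0.2.20", "192.0.2.30", "192.0.2.40"]
stock = {s: min for s in servers}                  # everyone favors lowest
half_patched = {**stock, "192.0.2.30": max, "192.0.2.40": max}

print(elect(servers, stock))          # all servers agree on one candidate
print(elect(servers, half_patched))   # split vote, no majority
```

In this model the stock cell elects 192.0.2.10, while the half-patched
cell returns None because neither candidate reaches a majority.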

One of the requirements for pushing patches upstream is that they must
not cause people to hang themselves due to inadvertent use.

Think about what you would need to do if you were running with this
patch locally.  Every sysadmin that upgrades these servers must remember
that the patch is in place (or how the servers were built/configured)
and not forget.  If you leave tomorrow, is the next sysadmin going to be
burned by this change when s/he attempts to install openafs distributed
binaries in your cell?

That is not to say that we don't need to improve things.  We know we do
and it has been talked about for nearly a decade.  However, it is also
hard and since it is hard it has repeatedly been put off.

Jeffrey Altman





Re: [OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Ken Hornstein
> The lowest IP address favoritism decision is totally arbitrary, no?

Absolutely, yes.  I think ... looking at the source code, the comparison
is done in 3 places in vote.c.  You could replace that with anything else.
I've always thought that an explicit ordering would make more sense, but
I never cared enough to actually write the code.

> We're kind of screwed unless there's a way around it,
> and really would not like to have to apply a local patch
> with every rollout.

Have you considered making the lowest server a clone?  Clones are
like other database servers, except that they can never be elected as a
sync site.  The (default) election winner then would be the next closest.
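
A minimal sketch of the clone workaround.  It assumes the notation I
recall from the CellServDB man page, where a clone's address is enclosed
in square brackets in the server CellServDB; please verify against the
man page before relying on it.  The cell name, addresses, and helper
names are hypothetical.

```python
import ipaddress

# Hypothetical server CellServDB; brackets mark a clone (assumed notation).
SERVER_CELLSERVDB = """\
>example.com    #Hypothetical cell
[192.0.2.10]    #db1.example.com
192.0.2.20      #db2.example.com
192.0.2.30      #db3.example.com
"""

def parse_server_cellservdb(text):
    """Return a list of (address, is_clone) pairs."""
    entries = []
    for line in text.splitlines():
        field = line.split("#", 1)[0].strip()
        if not field or field.startswith(">"):
            continue  # skip blank lines and the ">cellname" header
        is_clone = field.startswith("[") and field.endswith("]")
        entries.append((ipaddress.IPv4Address(field.strip("[]")), is_clone))
    return entries

def default_winner(entries):
    """Lowest address among servers that are not clones."""
    voters = [addr for addr, is_clone in entries if not is_clone]
    return min(voters) if voters else None

entries = parse_server_cellservdb(SERVER_CELLSERVDB)
print(default_winner(entries))
```

Here 192.0.2.10 is the lowest address but is marked as a clone, so the
default winner becomes 192.0.2.20.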

Also, it's not commonly understood but Ubik voting is what I like to
call Chicago style; the incumbent is always the winner of the election
even if he's not the best candidate.  Thus if you shut off the database
servers of the lowest IP address, once a new election takes place the
winner will be sync-site-for-life (unless he's out of service past the
Ubik change voting interval).
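
A rough sketch of the incumbent preference described above.  It is a
simplified model that ignores all of Ubik's timing rules, including the
voting interval; `choose_vote` and the addresses are hypothetical.

```python
import ipaddress

def choose_vote(candidates, incumbent=None):
    """Vote for the reachable incumbent, else the lowest address."""
    if incumbent is not None and incumbent in candidates:
        return incumbent
    if not candidates:
        return None
    return min(candidates, key=ipaddress.IPv4Address)

# The lowest-addressed server (192.0.2.10) is shut down for the election.
up = ["192.0.2.20", "192.0.2.30"]
winner = choose_vote(up)

# The lowest-addressed server comes back; the incumbent still wins.
up.append("192.0.2.10")
print(choose_vote(up, incumbent=winner))
```

In this model 192.0.2.20 stays the sync site after 192.0.2.10 returns,
and the lowest address only wins again once the incumbent is unreachable.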

Just trying to present possible solutions that don't involve code changes.

--Ken


Re: [OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Jeff Blaine

> Think about what you would need to do if you were running with this
> patch locally.  Every sysadmin that upgrades these servers must remember
> that the patch is in place (or how the servers were built/configured)
> and not forget.  If you leave tomorrow, is the next sysadmin going to be
> burned by this change when s/he attempts to install openafs distributed
> binaries in your cell?


You could make the same argument (that you're making) with
at least 5 other existing OpenAFS command-line or build-time
options.  Example: binaries built with --enable-namei-fileserver
(or without it), dropped on a server whose existing vice
partitions are in the other format.

Build/implementation decisions are encapsulated in build
scripts of ours.  Additionally, those decisions are documented
in our wiki.  If he/she hasn't read our internal documentation
about our cell, which is extensive and clear in our wiki, then
yes, he/she will get burned.

Just like he/she would with any other option for cell
or server configuration.


Re: [OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Ken Hornstein
> I would object.  A quorum requirement is that all servers are in
> agreement with the server configuration and the quorum algorithm.  Any
> change to the quorum algorithm needs to be exposed as part of the
> negotiation so that a misconfigured server, or a server executing an
> alternate algorithm, does not result in a failure to achieve quorum.

While I agree with that in theory, we don't have that today;
misconfigured servers can easily cause a quorum failure.  Also if the
server times don't match up that can easily cause a quorum failure (I'd
classify that as a misconfigured server as well).

As an aside: you start to see why this problem has never been fixed.
Fixing the basic problem is easy, but if you start talking about some
huge negotiation framework ... gaaah, it's too much.

--Ken


Re: [OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Jason Edgecombe

On 10/26/2011 02:23 PM, Jeff Blaine wrote:

>> Have you considered making the lowest server a clone?  Clones are
>> like other database servers, except that they can never be elected as a
>> sync site.  The (default) election winner then would be the next closest.
>
> YES!  Thank you!  I knew there was something added related to
> this topic.
>
> CLONES
>
> I will investigate.



FYI, the CellServDB man page has this info:
http://docs.openafs.org/Reference/5/CellServDB.html

Just search for "clone".

Jason


Re: [OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Jeffrey Altman
On 10/26/2011 2:21 PM, Jeff Blaine wrote:
>> Think about what you would need to do if you were running with this
>> patch locally.  Every sysadmin that upgrades these servers must remember
>> that the patch is in place (or how the servers were built/configured)
>> and not forget.  If you leave tomorrow, is the next sysadmin going to be
>> burned by this change when s/he attempts to install openafs distributed
>> binaries in your cell?
>
> You could make the same argument (that you're making) with
> at least 5 other existing OpenAFS command-line or build-time
> options.  Example: binaries built with --enable-namei-fileserver
> (or without it), dropped on a server whose existing vice
> partitions are in the other format.

We have spent the last five years removing compile time options.  Inode
vs Namei is a particularly bad example since two things have been
happening in the file server back-end processing:

1. Consolidation of code trees to permit more run-time functionality
selection

2. A decision to not accept additional back-end implementations such as
POSIX-Extended-Attributes or HostAFS without also abstracting back-ends
so that a file server can choose which back-end it wants to use at
run-time on a partition by partition basis.

One of the rationales for this is to permit sites to migrate from one
back-end to another on the same file server hardware without requiring
that all volumes be relocated to another server as a transition.

> Build/implementation decisions are encapsulated in build
> scripts of ours.  Additionally, those decisions are documented
> in our wiki.  If he/she hasn't read our internal documentation
> about our cell, which is extensive and clear in our wiki, then
> yes, he/she will get burned.
>
> Just like he/she would with any other option for cell
> or server configuration.

Then you can happily maintain the patch locally since it makes a change
to three lines of source code.

There has been discussion over the last several years about what such a
change should look like especially as we move to a world that includes a
mixture of IPv4 and IPv6 as well as the possibility that multiple
service instances could exist on the same machine with different port
numbers.  Such a configuration could be deployed today using DNS SRV
records for any of the database services.

I don't remember all of the details but I believe the agreed-upon
solution included:

 * UUIDs for each database service instance

 * Configuration data that would be deployed in conjunction with a new
   CellServDB format that would specify the ranking

 * Some hash of the configuration data that would be included in the
   votes to ensure that only votes that are cast on the same ballot
   are included in the resulting decision for those that agree upon
   the ballot.
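
The three pieces above could fit together roughly as follows.  This is
only a sketch under assumptions the thread does not specify: SHA-256 as
the hash, JSON as the encoding of the ranking, and placeholder UUID
strings.  The function names are hypothetical.

```python
import hashlib
import json
from collections import Counter

def ballot_hash(ranking):
    """Hash a ranking (ordered list of instance UUID strings)."""
    payload = json.dumps(ranking, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def tally(votes, my_ranking):
    """Count only votes cast on the same ballot as ours.

    `votes` is a list of (ballot_id, candidate_uuid) pairs.
    """
    mine = ballot_hash(my_ranking)
    return Counter(cand for ballot, cand in votes if ballot == mine)

ranking = ["uuid-b", "uuid-a", "uuid-c"]   # explicit order, not by address
stale = ["uuid-a", "uuid-b", "uuid-c"]     # server with old configuration

votes = [
    (ballot_hash(ranking), "uuid-b"),
    (ballot_hash(ranking), "uuid-b"),
    (ballot_hash(stale), "uuid-a"),
]
print(tally(votes, ranking))
```

The vote from the server holding the stale ranking carries a different
hash, so it is left out of the count instead of silently competing with
the other two.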

Where are we on this?  Well,

 * Simon Wilkinson [YFS] has spent time working on implementing the
   new CellServDB file format that was agreed to at the most recent
   AFS hackathon.

 * There is agreement that mixed version database servers are not
   supported within a cell and that ubik is not an afs3-standard
   protocol and as such does not require protocol standardization
   for the purpose of making changes.  That is not to say that OpenAFS
   will permit a change to be accepted without a solid protocol
   description but it does make it easier for OpenAFS to accept and
   roll out changes as part of a version number upgrade.

 * Filling in the rest of the pieces such as assigning UUIDs is not
   an overwhelming amount of work.

Anyone that is interested in contributing to this work with code or
financial support is welcome to contact me privately.

Jeffrey Altman


