Re: [OpenAFS] no quorum elected
Dear Rob,

I only have one db server, and it is also the NTP server. I synchronized the time on the AFS file servers to the NTP server, but I still have the same problem. Please help me. Thank you.

Best,
Sam

On Tue, Jun 3, 2008 at 12:31 PM, Robert Banz wrote:
> Verify that the time on your db servers is well synchronized.
>
> -rob
>
> On Jun 2, 2008, at 9:08 PM, TIARA System Man wrote:
>> Dear guys,
>>
>> I could not move volumes. These are the messages I encountered:
>>
>>     # vos move home.cfliu maat /vicepa fs /vicepc -verbose
>>     Could not lock entry for volume 536870972
>>     u: no quorum elected
>>     Recovery: Accessing VLDB.
>>     Recovery: Releasing lock on VLDB entry for volume 536870972 ... done
>>
>> I also read http://www.openafs.org/pipermail/openafs-info/2004-March/012699.html. I had almost the same problem, but I don't know how to solve it. Please give me hints. Thanks.
>>
>> Best,
>> Sam

-- 
Sam Tseng
Academia Sinica Institute of Astronomy and Astrophysics
Tel.: +886-2-33652200 ext 742
Fax: +886-2-23677849
Re: [OpenAFS] no quorum elected
On Jun 2, 2008, at 11:30 PM, TIARA System Man wrote:
> Thank you, Russ. I just checked my CellServDB files on each file server and found that one has the wrong db info in it. :$

It's generally good to have at least three DB servers (an odd number is important!). The two most common causes of the quorum error are not having a majority of the DB servers available, or having a time split between them. File it away for future reference!

-rob
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info
Re: [OpenAFS] no quorum elected
Thank you, Russ. I just checked my CellServDB files on each file server and found that one has the wrong db info in it. :$

Sorry to bother you guys. Thanks again.

Best,
Sam

On Tue, Jun 3, 2008 at 2:23 PM, Russ Allbery wrote:
> TIARA System Man writes:
>
>> I only have one db server, and it is also the NTP server. I synchronized the time on the AFS file servers to the NTP server, but I still have the same problem.
>
> If you only have one db server and it doesn't think it has quorum, something probably isn't configured right. Either your clients think you have other db servers you don't have, your db server thinks you have other db servers you don't have, or something else is making your db server think that it should be talking to other db servers.
>
> -- 
> Russ Allbery    http://www.eyrie.org/~eagle/
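The mismatch Sam found (one CellServDB listing db servers that the others do not) can be looked for mechanically. What follows is only a minimal sketch: it assumes the usual CellServDB layout, where a line starting with ">" names a cell and the lines after it list that cell's db server addresses with the host name in a "#" comment. The cell name, addresses, and host names are hypothetical, and `db_servers` is an illustrative helper, not part of OpenAFS.

```python
def db_servers(cellservdb_text, cell):
    """Return the set of db server addresses listed for `cell`."""
    servers = set()
    current = None
    for raw in cellservdb_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        if line.startswith(">"):
            names = line[1:].split()
            current = names[0] if names else None
        elif current == cell:
            servers.add(line.split()[0])
    return servers


# Hypothetical file contents: the client copy still lists a stale db server.
CLIENT = """\
>example.org            #Example cell
192.0.2.10              #afsdb1.example.org
192.0.2.11              #afsdb-old.example.org
"""
SERVER = """\
>example.org            #Example cell
192.0.2.10              #afsdb1.example.org
"""

only_on_client = db_servers(CLIENT, "example.org") - db_servers(SERVER, "example.org")
print(sorted(only_on_client))  # prints ['192.0.2.11']
```

In practice the two strings would be read from the client and server copies of the file on each machine; any address that appears in one set and not the other is worth a second look.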
Re: [OpenAFS] no quorum elected
Hi Robert!

On Mon, 2 Jun 2008, Robert Banz wrote:
> On Jun 2, 2008, at 11:30 PM, TIARA System Man wrote:
>> Thank you, Russ. I just checked my CellServDB files on each file server and found that one has the wrong db info in it. :$
>
> It's generally good to have at least three DB servers (an odd number is important!). The two most common causes of the quorum error are not having a majority of the DB servers available, or having a time split between them. File it away for future reference!

This, of course, is wrong in the case of AFS DB servers. The master server (usually the one with the lowest IP) has an additional half-vote, so no split-brain is possible here.

If you have 2 servers and the connection is severed, you have 1.5 votes on one side and 1 on the other. Since the cluster knows there are supposed to be 2.5 votes in total, the single slave server will refuse to accept changes (while happily continuing to serve requests with older data).

In the case of 4 servers (which we have in Cologne) you will get the exact same scenario, only with 2.5 to 2.0 votes. If only the master server is isolated, you get 1.5 to 3.0 votes. This will result in the three still-connected servers voting for a new master, and the old master will stop accepting changes since it knows that it is in the minority in this situation.

Cheers from Cologne,

Dipl. Chem. Dr. Stephan Wonczak
Zentrum fuer Angewandte Informatik (ZAIK)
Regionales Rechenzentrum der Universitaet zu Koeln (RRZK)
Universitaet zu Koeln, Robert-Koch-Strasse 10, 50931 Koeln
Tel: ++49/(0)221/478-5577, Fax: ++49/(0)221/478-5590
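The vote counting described in this message can be checked with a few lines of Python. This is a sketch of the arithmetic as stated above (one vote per server, an extra half-vote for the lowest address, quorum when a side holds more than half of the total), not Ubik's actual code. The addresses and the helper names `hosts`, `votes`, and `has_quorum` are made up for illustration.

```python
import ipaddress
from fractions import Fraction

HALF = Fraction(1, 2)


def hosts(*addrs):
    """Build a set of IP addresses that can be compared and ordered."""
    return frozenset(ipaddress.ip_address(a) for a in addrs)


def votes(side, cell):
    """Votes held by `side`, the servers that can still reach each other."""
    lowest = min(cell)  # the server with the lowest IP gets the extra half-vote
    return len(side) + (HALF if lowest in side else 0)


def has_quorum(side, cell):
    """True if `side` holds more than half of all votes in the cell."""
    total = len(cell) + HALF
    return votes(side, cell) > total / 2


two = hosts("10.0.0.1", "10.0.0.2")
four = hosts("10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4")

# Two servers, link severed: 1.5 votes against 1.0 out of 2.5.
print(has_quorum(hosts("10.0.0.1"), two), has_quorum(hosts("10.0.0.2"), two))

# Four servers split two and two: 2.5 votes against 2.0 out of 4.5.
print(has_quorum(hosts("10.0.0.1", "10.0.0.2"), four),
      has_quorum(hosts("10.0.0.3", "10.0.0.4"), four))

# Four servers, lowest-IP server isolated: 1.5 votes against 3.0.
print(has_quorum(hosts("10.0.0.1"), four),
      has_quorum(hosts("10.0.0.2", "10.0.0.3", "10.0.0.4"), four))
```

`Fraction` is used so the half-votes compare exactly; the model only answers "which side has quorum" and says nothing about how a new master is chosen.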
Re: [OpenAFS] no quorum elected
>> This, of course, is wrong in the case of AFS DB servers. The master server (usually the one with the lowest IP) has an additional half-vote, so no split-brain is possible here.
>
> When did we change this? All of the documentation I ever read said you needed three so you could have a quorum during such an outage...

It's one of those "it all depends on what you mean" kind of things.

With two database servers, if you lose the slave server, you will still maintain quorum. But if you lose the master server, you will NOT be able to elect quorum. So if you want to tolerate the loss of any single database server, you need a minimum of three.

But it doesn't matter if there are an even or odd number if the number is greater than three; as already mentioned, the master gets an extra half vote (I was under the impression the extra half vote was really put in place to handle the two-server case, but it also serves to make things work if you have an even number of servers and half of them are unavailable). As a side effect of the voting algorithm, at least half of your database servers will never be elected masters.

AFAICT, this is the way it's always been.

--Ken
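Ken's "minimum of three" can be confirmed by brute force. The sketch below assumes the same half-vote rule discussed in this thread and counts in half-votes to stay in integers; server 0 stands in for the lowest-IP server. The function names are illustrative only.

```python
def has_quorum(up, n):
    """`up` is the set of reachable servers out of 0..n-1; server 0 has the lowest IP."""
    half_votes = 2 * len(up) + (1 if 0 in up else 0)  # each server 2, lowest gets 3
    total_half_votes = 2 * n + 1
    return 2 * half_votes > total_half_votes          # strictly more than half


def survives_any_single_loss(n):
    """True if quorum is kept no matter which one of the n servers goes away."""
    everyone = set(range(n))
    return all(has_quorum(everyone - {lost}, n) for lost in everyone)


for n in range(1, 6):
    print(n, survives_any_single_loss(n))
```

With two servers the check fails only for the loss of server 0, which matches the description above: losing the slave is fine, losing the master is not.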
Re: [OpenAFS] no quorum elected
On Tue, Jun 3, 2008 at 1:23 PM, Robert Banz wrote:
> On Jun 3, 2008, at 3:53 AM, Stephan Wonczak wrote:
>> This, of course, is wrong in the case of AFS DB servers. The master server (usually the one with the lowest IP) has an additional half-vote, so no split-brain is possible here.
>
> When did we change this?

We didn't. It was like that when we found it.
Re: [OpenAFS] no quorum elected
On Jun 3, 2008, at 3:53 AM, Stephan Wonczak wrote:
> On Mon, 2 Jun 2008, Robert Banz wrote:
>> It's generally good to have at least three DB servers (an odd number is important!). The two most common causes of the quorum error are not having a majority of the DB servers available, or having a time split between them. File it away for future reference!
>
> This, of course, is wrong in the case of AFS DB servers. The master server (usually the one with the lowest IP) has an additional half-vote, so no split-brain is possible here.

When did we change this? All of the documentation I ever read said you needed three so you could have a quorum during such an outage...

-rob
Re: [OpenAFS] no quorum elected
Verify that the time on your db servers is well synchronized.

-rob

On Jun 2, 2008, at 9:08 PM, TIARA System Man wrote:
> Dear guys,
>
> I could not move volumes. These are the messages I encountered:
>
>     # vos move home.cfliu maat /vicepa fs /vicepc -verbose
>     Could not lock entry for volume 536870972
>     u: no quorum elected
>     Recovery: Accessing VLDB.
>     Recovery: Releasing lock on VLDB entry for volume 536870972 ... done
>
> I also read http://www.openafs.org/pipermail/openafs-info/2004-March/012699.html. I had almost the same problem, but I don't know how to solve it. Please give me hints. Thanks.
>
> Best,
> Sam
RE: [OpenAFS] no quorum elected
Try

    udebug (server name) 7004

How many servers are mentioned in the output? You should have only one server listed in your /usr/afs/etc/CellServDB file, and the only cell in that file should be yours.

IIRC udebug should say, if configured this way, something like "I'm sync site forever, single DB server". It shouldn't mention any servers other than the one you're running.

Kim

-----Original Message-----
On Behalf Of Steve Lenti
Sent: Wednesday, March 02, 2005 6:35 AM
To: openafs-info@openafs.org
Subject: RE: [OpenAFS] no quorum elected

> On Mar 2, 2005, at 6:40 AM, Steve Lenti wrote:
>> Trying an initial install on Suse 9.2. Everything works fine up until I have to add the initial users. Kaserver, vlserver, ptserver, buserver, and bosserver -noauth all up and running. Anyone know the problem I am having below?
>>
>>     ka> create afs
>>     initial_password:
>>     Verifying, please re-enter initial_password:
>>     Creating user afs : [u] no quorum elected, wait one second
>>     Creating user afs : [u] no quorum elected, wait one second
>>     ...
>
> Try synchronizing the time between your servers.
>
> Horst

This is being run all on the same server.

-Steve
Re: [OpenAFS] no quorum elected
On Mar 2, 2005, at 6:40 AM, Steve Lenti wrote:
> Trying an initial install on Suse 9.2. Everything works fine up until I have to add the initial users. Kaserver, vlserver, ptserver, buserver, and bosserver -noauth all up and running. Anyone know the problem I am having below?
>
>     ka> create afs
>     initial_password:
>     Verifying, please re-enter initial_password:
>     Creating user afs : [u] no quorum elected, wait one second
>     Creating user afs : [u] no quorum elected, wait one second
>     ...

Try synchronizing the time between your servers.

Horst
RE: [OpenAFS] no quorum elected
> On Mar 2, 2005, at 6:40 AM, Steve Lenti wrote:
>> Trying an initial install on Suse 9.2. Everything works fine up until I have to add the initial users. Kaserver, vlserver, ptserver, buserver, and bosserver -noauth all up and running. Anyone know the problem I am having below?
>>
>>     ka> create afs
>>     initial_password:
>>     Verifying, please re-enter initial_password:
>>     Creating user afs : [u] no quorum elected, wait one second
>>     Creating user afs : [u] no quorum elected, wait one second
>>     ...
>
> Try synchronizing the time between your servers.
>
> Horst

This is being run all on the same server.

-Steve
[OpenAFS] no quorum elected
Trying an initial install on Suse 9.2. Everything works fine up until I have to add the initial users. Kaserver, vlserver, ptserver, buserver, and bosserver -noauth all up and running. Anyone know the problem I am having below?

    ka> create afs
    initial_password:
    Verifying, please re-enter initial_password:
    Creating user afs : [u] no quorum elected, wait one second
    Creating user afs : [u] no quorum elected, wait one second
    ...

Thanks.

-Steve
Re: [OpenAFS] no quorum elected
Charles == Charles Clancy writes:

>> I don't even know what a quorum IS.
>> ...
>> IF it haven't anything to do with 'no quorum elected', then I don't know.

Charles> When you have multiple AFS servers, they need to coordinate with one another to maintain consistency among all the various databases. "No quorum elected" basically means they couldn't decide who should be in charge, and one side effect is that you end up with read-only databases (at least in my experience, things like pts createu or vos create fail, citing that error message). Frequently this error is caused by AFS servers with their system clocks too skewed relative to one another.

I must have the clock set EXACTLY for other reasons (Kerberos V), so I'm using the same reference clock on ALL of my machines, so the time is OK. I'll keep this in mind though.

Charles> It doesn't seem to have much to do with your problem, though.

Oki, thanx.
Re: [OpenAFS] no quorum elected
On 10 Jun 2002, Turbo Fredriksson wrote:
> Derrick == Derrick J Brashear writes:
>
> Derrick> Actually, if it's a real error, it's frequently caused by that. If you're running clients, it's usually that the sync site isn't in the CellServDB on the client. The simple answer is to look at the udebug output. For kaserver, udebug (serverhost) 7004; ptserver, 7002; vlserver, 7003.
>
> This is what I got from these commands. Don't quite know what it says, though...

It tells me the important thing:

    Host's addresses are: 192.168.1.4 192.168.1.254
    Host's 127.0.0.1 time is Mon Jun 10 08:55:34 2002
    Local time is Mon Jun 10 08:55:37 2002 (time differential 3 secs)
    Last yes vote for 127.0.0.1 was 0 secs ago (sync site);

The "Host's 127.0.0.1" and "Last yes vote for 127.0.0.1" lines should have a real IP address in them.

Taking the entry for your hostname off the localhost line in /etc/hosts would be my guess. If not, fix CellServDB. Something is causing references to 127.0.0.1 and you don't want 'em.
Re: [OpenAFS] no quorum elected
Derrick == Derrick J Brashear writes:

Derrick> It tells me the important thing:
Derrick>
Derrick>     Host's addresses are: 192.168.1.4 192.168.1.254
Derrick>     Host's 127.0.0.1 time is Mon Jun 10 08:55:34 2002
Derrick>     Local time is Mon Jun 10 08:55:37 2002 (time differential 3 secs)
Derrick>     Last yes vote for 127.0.0.1 was 0 secs ago (sync site);
Derrick>
Derrick> The "Host's 127.0.0.1" and "Last yes vote for 127.0.0.1" lines should have a real IP address in them.
Derrick>
Derrick> Taking the entry for your hostname off the localhost line in /etc/hosts would be my guess. If not, fix CellServDB. Something is causing references to 127.0.0.1 and you don't want 'em.

Isn't this because I used 'udebug localhost ...'?

- s n i p -
tuzjfi:~# head /etc/hosts
127.0.0.1 localhost
192.168.1.4 tuzjfi.bayour.com tuzjfi
- s n i p -

- s n i p -
tuzjfi:~# udebug tuzjfi.bayour.com 7002
Host's addresses are: 192.168.1.4 192.168.1.254
Host's 192.168.1.4 time is Mon Jun 10 13:29:24 2002
Local time is Mon Jun 10 13:29:25 2002 (time differential 1 secs)
Last yes vote for 192.168.1.4 was 0 secs ago (sync site);
Last vote started 0 secs ago (at Mon Jun 10 13:29:25 2002)
Local db version is 2.0
I am sync site forever (1 server)
Recovery state 1f
Sync site's db version is 2.0
0 locked pages, 0 of them for write

tuzjfi:~# udebug tuzjfi.bayour.com 7003
Host's addresses are: 192.168.1.4 192.168.1.254
Host's 192.168.1.4 time is Mon Jun 10 13:29:13 2002
Local time is Mon Jun 10 13:29:15 2002 (time differential 2 secs)
Last yes vote for 192.168.1.4 was 0 secs ago (sync site);
Last vote started 0 secs ago (at Mon Jun 10 13:29:15 2002)
Local db version is 1023694571.11
I am sync site forever (1 server)
Recovery state 1f
Sync site's db version is 1023694571.11
0 locked pages, 0 of them for write
- s n i p -

Now there's REAL (?) IP addresses there...
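For anyone who wants to pull the interesting fields out of udebug output like the above, here is a rough Python sketch. It only assumes the line shapes visible in this thread ("Host's ... time is", "time differential N secs", "Last yes vote for ...", "I am sync site forever (N server)"); other udebug versions may word things differently, and `summarize_udebug` is a made-up helper name.

```python
import ipaddress
import re


def summarize_udebug(text):
    """Extract a few fields from udebug output shaped like the samples in this thread."""
    queried = re.search(r"Host's (\S+) time is", text)
    voted = re.search(r"Last yes vote for (\S+) was", text)
    skew = re.search(r"time differential (-?\d+) secs", text)
    forever = re.search(r"I am sync site forever \((\d+) server\)", text)
    addr = queried.group(1) if queried else None
    return {
        "queried_as": addr,
        "queried_via_loopback": bool(addr) and ipaddress.ip_address(addr).is_loopback,
        "last_yes_vote_for": voted.group(1) if voted else None,
        "skew_secs": int(skew.group(1)) if skew else None,
        "sync_site_forever_servers": int(forever.group(1)) if forever else None,
    }


# The ptserver (port 7002) output quoted in the message above.
SAMPLE = """\
Host's addresses are: 192.168.1.4 192.168.1.254
Host's 192.168.1.4 time is Mon Jun 10 13:29:24 2002
Local time is Mon Jun 10 13:29:25 2002 (time differential 1 secs)
Last yes vote for 192.168.1.4 was 0 secs ago (sync site);
Last vote started 0 secs ago (at Mon Jun 10 13:29:25 2002)
Local db version is 2.0
I am sync site forever (1 server)
Recovery state 1f
Sync site's db version is 2.0
0 locked pages, 0 of them for write
"""

summary = summarize_udebug(SAMPLE)
print(summary)
```

A loopback address in `queried_as` is not necessarily a misconfiguration; as the follow-up in this thread notes, it can simply mean udebug was pointed at localhost.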
Re: [OpenAFS] no quorum elected
On 10 Jun 2002, Turbo Fredriksson wrote:
> Derrick> and you don't want 'em
>
> Isn't this because I used 'udebug localhost ...'?

Yeah, probably. There's a story as to why I was confused, but that's for another time.
Re: [OpenAFS] no quorum elected
On Sun, 9 Jun 2002, Charles Clancy wrote:
>> I don't even know what a quorum IS.
>> ...
>> IF it haven't anything to do with 'no quorum elected', then I don't know.
>
> When you have multiple AFS servers, they need to coordinate with one another to maintain consistency among all the various databases. "No quorum elected" basically means they couldn't decide who should be in charge, and one side effect is that you end up with read-only databases (at least in my experience, things like pts createu or vos create fail, citing that error message). Frequently this error is caused by AFS servers with their system clocks too skewed relative to one another.

Actually, if it's a real error, it's frequently caused by that. If you're running clients, it's usually that the sync site isn't in the CellServDB on the client.

The simple answer is to look at the udebug output:

    kaserver: udebug (serverhost) 7004
    ptserver: udebug (serverhost) 7002
    vlserver: udebug (serverhost) 7003
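The three port numbers are easy to mix up, so a small lookup helps when checking all the database services in one go. This sketch only builds the command lines in the positional form used in this thread (udebug, host, port); the host name is a placeholder, and `udebug_command` and `run_all` are illustrative helpers. Actually running the commands assumes udebug is installed and on the PATH.

```python
import shutil
import subprocess

# Ports as given in the message above.
UBIK_PORTS = {"ptserver": 7002, "vlserver": 7003, "kaserver": 7004}


def udebug_command(serverhost, service):
    """Command line for checking one database service on one host."""
    return ["udebug", serverhost, str(UBIK_PORTS[service])]


def run_all(serverhost):
    """Run udebug for each service and return {service: output}; needs udebug installed."""
    if shutil.which("udebug") is None:
        raise RuntimeError("udebug not found on PATH")
    return {
        service: subprocess.run(
            udebug_command(serverhost, service),
            capture_output=True, text=True, check=False,
        ).stdout
        for service in UBIK_PORTS
    }


for service in UBIK_PORTS:
    print(" ".join(udebug_command("afsdb1.example.org", service)))
```

Comparing the sync site and time differential reported on each port is usually enough to tell a clock problem from a CellServDB problem.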
Re: [OpenAFS] no quorum elected
Derek == Derek Atkins writes:

Derek> Ok, my other question is: is your CellServDB set properly? Both the client and server CellServDB files?

Tuzjfi is to be both client and server. It's to emulate papadoc's job (diskspaces, remember). And the CellServDB looks ok..

- s n i p -
tuzjfi:~# head /etc/openafs/CellServDB
>bayour.com
192.168.1.4 # tuzjfi.bayour.com
>grand.central.org #GCO Public CellServDB 29 Jan 2002
18.7.14.88 #grand-opening.mit.edu
128.2.191.224 #penn.central.org
>usatlas.bnl.gov#US Atlas Tier 1 Facility at BNL
130.199.48.32 #aafs01.usatlas.bnl.gov
130.199.48.33 #aafs02.usatlas.bnl.gov
130.199.48.34 #aafs03.usatlas.bnl.gov
>rpi.edu#Rensselaer Polytechnic Institute
- s n i p -
Re: [OpenAFS] no quorum elected
Turbo,

I specifically asked you for the contents of BOTH CellServDB files. Why did you send me only one? What does the _SERVER_ CellServDB look like?

-derek

Turbo Fredriksson writes:
> Derek> Ok, my other question is: is your CellServDB set properly? Both the client and server CellServDB files?
>
> Tuzjfi is to be both client and server. It's to emulate papadoc's job (diskspaces, remember). And the CellServDB looks ok..
>
> - s n i p -
> tuzjfi:~# head /etc/openafs/CellServDB
> >bayour.com
> 192.168.1.4 # tuzjfi.bayour.com
> >grand.central.org #GCO Public CellServDB 29 Jan 2002
> 18.7.14.88 #grand-opening.mit.edu
> 128.2.191.224 #penn.central.org
> >usatlas.bnl.gov#US Atlas Tier 1 Facility at BNL
> 130.199.48.32 #aafs01.usatlas.bnl.gov
> 130.199.48.33 #aafs02.usatlas.bnl.gov
> 130.199.48.34 #aafs03.usatlas.bnl.gov
> >rpi.edu#Rensselaer Polytechnic Institute
> - s n i p -

-- 
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/    PP-ASEL-IA    N1NWH
PGP key available