Re: [OpenAFS] no quorum elected

2008-06-03 Thread TIARA System Man
dear rob,

i only have one db server. the db server also is the ntp server. i
synchronized the time on afs file servers to ntp server.  it still has the
same problem.

please help me. thank you.

best,
sam

On Tue, Jun 3, 2008 at 12:31 PM, Robert Banz [EMAIL PROTECTED] wrote:


 Verify that the time on your db servers are well synchronized.

 -rob

 On Jun 2, 2008, at 9:08 PM, TIARA System Man wrote:

 dear guys,

 i could not move volumes. the following messages is what i encountered:

 # vos move home.cfliu maat /vicepa fs /vicepc -verbose

 Could not lock entry for volume 536870972
u: no quorum elected
 Recovery: Accessing VLDB.
 Recovery: Releasing lock on VLDB entry for volume 536870972 ... done

 i also read
 http://www.openafs.org/pipermail/openafs-info/2004-March/012699.html page.
 i had almost the same problem. but, i don't know how to solve it. please
 give me hints. thanks.

 best, sam

 --
 Sam Tseng
 Academia Sinica
 Institute of Astronomy and Astrophysics
 Tel.: +886-2-33652200 ext 742
 Fax: +886-2-23677849





-- 
Sam Tseng
Academia Sinica
Institute of Astronomy and Astrophysics
Tel.: +886-2-33652200 ext 742
Fax: +886-2-23677849


Re: [OpenAFS] no quorum elected

2008-06-03 Thread Robert Banz


On Jun 2, 2008, at 11:30 PM, TIARA System Man wrote:

thank you russ.. i just check my CellServDB files on each file  
server. i just found one has wrong db info in the file. :$




it's generally good to have at least three DB servers (an odd number  
is important!). The two most common causes of the quorum error are not  
having a majority of the DB servers available, or, having a time split  
between them.


File it away for future reference!

-rob
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] no quorum elected

2008-06-03 Thread TIARA System Man
thank you russ.. i just check my CellServDB files on each file server. i
just found one has wrong db info in the file. :$

sorry to bother you guys. thanks again.

best, sam

On Tue, Jun 3, 2008 at 2:23 PM, Russ Allbery [EMAIL PROTECTED] wrote:

 TIARA System Man [EMAIL PROTECTED] writes:

  i only have one db server. the db server also is the ntp server. i
  synchronized the time on afs file servers to ntp server.  it still has the
  same problem.

 If you only have one db server and it doesn't think it has quorum,
 something probably isn't configured right.  Either your clients think you
 have other db servers you don't have, your db server thinks you have other
 db servers you don't have, or something else is making your db server
 think that it should be talking to other db servers.

 --
 Russ Allbery ([EMAIL PROTECTED]) 
 http://www.eyrie.org/~eagle/
 




-- 
Sam Tseng
Academia Sinica
Institute of Astronomy and Astrophysics
Tel.: +886-2-33652200 ext 742
Fax: +886-2-23677849


Re: [OpenAFS] no quorum elected

2008-06-03 Thread Stephan Wonczak

  Hi Robert!

On Mon, 2 Jun 2008, Robert Banz wrote:



On Jun 2, 2008, at 11:30 PM, TIARA System Man wrote:

thank you russ.. i just check my CellServDB files on each file server. i 
just found one has wrong db info in the file. :$




it's generally good to have at least three DB servers (an odd number is 
important!). The two most common causes of the quorum error are not having a 
majority of the DB servers available, or, having a time split between them.


File it away for future reference!


  This, of course, is wrong in the case of AFS DB-Servers. The 
master-server (usually the one with the lowest IP) has an additional 
half-vote. So no split-brain possible here.
  If you have 2 servers, and connection is severed, you have 1.5 votes on 
one side, and 1 on the other. Since the cluster knows there are supposed 
to be 2.5 votes in total, the single slave server will refuse to accept 
changes (while happily continuing to serve requests with older data)
  In the case of 4 servers (which we have in Cologne) you will get the 
exact same scenario, only with 2.5 to 2.0 votes. If only the master 
server is isolated you get 1.5 to 3.0 votes. This will result in the three 
still connected servers voting for a new master, and the old master will 
stop accepting changes since it knows that it is in the minority in this 
situation.
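
A minimal sketch of the vote arithmetic described above, assuming one vote per DB server, an extra half vote for the lowest-IP server, and quorum meaning strictly more than half of all votes. The function names and the 10.0.0.x addresses are hypothetical; this models the description in this thread and is not the actual Ubik code.

```python
import ipaddress
from fractions import Fraction


def votes(partition, all_servers):
    """Votes held by a group of servers that can still reach each other.

    Assumption: one vote per server, plus half a vote if the group
    contains the lowest-IP server of the whole cell.
    """
    lowest = min(all_servers, key=ipaddress.ip_address)
    extra = Fraction(1, 2) if lowest in partition else Fraction(0)
    return len(partition) + extra


def has_quorum(partition, all_servers):
    """True if the group holds strictly more than half of all votes."""
    return votes(partition, all_servers) > votes(all_servers, all_servers) / 2


# Hypothetical addresses, only for illustration.
two = ["10.0.0.1", "10.0.0.2"]
four = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

for cell, group in [
    (two, ["10.0.0.1"]),                           # 1.5 of 2.5 votes
    (two, ["10.0.0.2"]),                           # 1.0 of 2.5 votes
    (four, ["10.0.0.1", "10.0.0.2"]),              # 2.5 of 4.5 votes
    (four, ["10.0.0.3", "10.0.0.4"]),              # 2.0 of 4.5 votes
    (four, ["10.0.0.2", "10.0.0.3", "10.0.0.4"]),  # 3.0 of 4.5 votes
]:
    print(len(cell), "servers, group", group, "quorum:", has_quorum(group, cell))
```

Fractions are used so that the half vote is compared exactly rather than through floating point.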


  Cheers from Cologne,

Dipl. Chem. Dr. Stephan Wonczak

Zentrum fuer Angewandte Informatik (ZAIK)
Regionales Rechenzentrum der Universitaet zu Koeln (RRZK)
Universitaet zu Koeln, Robert-Koch-Strasse 10, 50931 Koeln
Tel: ++49/(0)221/478-5577, Fax: ++49/(0)221/478-5590



Re: [OpenAFS] no quorum elected

2008-06-03 Thread Ken Hornstein
  This, of course, is wrong in the case of AFS DB-Servers. The master- 
 server (usually the one with the lowest IP) has an additional half- 
 vote. So no split-brain possible here.

When did we change this? All of the documentation I ever read said you  
needed three so you could have a quorum during such an outage...

It's one of those "it all depends on what you mean" kind of things.

With two database servers, if you lose the slave server, you will still
maintain quorum.  But if you lose the master server, you will NOT be able
to elect quorum.

So if you want to tolerate the loss of any single database server, you
need a minimum of three.  But it doesn't matter if there are an even or
odd number if the number is greater than three; as already mentioned,
the master gets an extra half vote (I was under the impression the
extra half vote was really put in place to handle the two server case,
but it also serves to make things work if you have an even number of servers
and half of them are unavailable).  As a side effect of the voting
algorithm, at least half of your database servers will never be elected
masters.
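
The "loss of any single database server" point can be checked with a small sketch. It assumes the same rule as stated in this thread (one vote each, an extra half vote for the lowest-IP server, quorum is strictly more than half of all votes). Servers are represented by hypothetical ranks, with rank 0 standing for the lowest IP, and votes are doubled so the arithmetic stays in integers.

```python
def has_quorum(up, n):
    """up: set of ranks still reachable; n: total number of DB servers.

    Rank 0 stands for the lowest-IP server (an assumption of this sketch).
    Votes are doubled: each server counts 2, the lowest-IP bonus counts 1.
    """
    mine = 2 * len(up) + (1 if 0 in up else 0)
    total = 2 * n + 1
    return 2 * mine > total


def tolerates_any_single_loss(n):
    """True if quorum survives no matter which one server goes away."""
    everyone = set(range(n))
    return all(has_quorum(everyone - {lost}, n) for lost in everyone)


for n in (2, 3, 4):
    print(n, "servers, survives any single loss:", tolerates_any_single_loss(n))
```

With two servers the check fails only because losing rank 0 leaves the remaining server without a majority, which matches the master/slave asymmetry described above.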

AFAICT, this is the way it's always been.

--Ken


Re: [OpenAFS] no quorum elected

2008-06-03 Thread Derrick Brashear
On Tue, Jun 3, 2008 at 1:23 PM, Robert Banz [EMAIL PROTECTED] wrote:

 On Jun 3, 2008, at 3:53 AM, Stephan Wonczak wrote:

  Hi Robert!

 On Mon, 2 Jun 2008, Robert Banz wrote:


 On Jun 2, 2008, at 11:30 PM, TIARA System Man wrote:

 thank you russ.. i just check my CellServDB files on each file server. i
 just found one has wrong db info in the file. :$

 it's generally good to have at least three DB servers (an odd number is
 important!). The two most common causes of the quorum error are not having a
 majority of the DB servers available, or, having a time split between them.

 File it away for future reference!

  This, of course, is wrong in the case of AFS DB-Servers. The
 master-server (usually the one with the lowest IP) has an additional
 half-vote. So no split-brain possible here.

 When did we change this?

We didn't. It was like that when we found it.


Re: [OpenAFS] no quorum elected

2008-06-03 Thread Robert Banz


On Jun 3, 2008, at 3:53 AM, Stephan Wonczak wrote:


 Hi Robert!

On Mon, 2 Jun 2008, Robert Banz wrote:



On Jun 2, 2008, at 11:30 PM, TIARA System Man wrote:

thank you russ.. i just check my CellServDB files on each file  
server. i just found one has wrong db info in the file. :$


it's generally good to have at least three DB servers (an odd  
number is important!). The two most common causes of the quorum  
error are not having a majority of the DB servers available, or,  
having a time split between them.


File it away for future reference!


 This, of course, is wrong in the case of AFS DB-Servers. The master- 
server (usually the one with the lowest IP) has an additional half- 
vote. So no split-brain possible here.


When did we change this? All of the documentation I ever read said you  
needed three so you could have a quorum during such an outage...


-rob


Re: [OpenAFS] no quorum elected

2008-06-02 Thread Robert Banz


Verify that the time on your db servers are well synchronized.

-rob

On Jun 2, 2008, at 9:08 PM, TIARA System Man wrote:


dear guys,

i could not move volumes. the following messages is what i  
encountered:


# vos move home.cfliu maat /vicepa fs /vicepc -verbose

Could not lock entry for volume 536870972
   u: no quorum elected
Recovery: Accessing VLDB.
Recovery: Releasing lock on VLDB entry for volume 536870972 ... done

i also read http://www.openafs.org/pipermail/openafs-info/2004-March/012699.html 
 page. i had almost the same problem. but, i don't know how to solve  
it. please give me hints. thanks.


best, sam

--
Sam Tseng
Academia Sinica
Institute of Astronomy and Astrophysics
Tel.: +886-2-33652200 ext 742
Fax: +886-2-23677849




RE: [OpenAFS] no quorum elected

2005-03-08 Thread Dexter Kimball
Try 

udebug <server name> 7004

How many servers are mentioned in the output?

You should have only one server listed in your /usr/afs/etc/CellServDB file,
and the only cell in that file should be yours.

IIRC udebug should say, if configured this way, something like "I'm sync
site forever, single DB server" -- it shouldn't mention any other servers
other than the one you're running.
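
A rough sketch for counting the DB servers listed per cell in a CellServDB file, assuming the usual layout where a line starting with ">" names a cell and the following lines each give one server address, with "#" starting a comment. The function name and the example.org / 192.0.2.x values are hypothetical.

```python
def parse_cellservdb(text):
    """Return {cell name: [server addresses]} from CellServDB-style text."""
    cells = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop trailing comment
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            if not parts:
                continue                      # ignore a bare ">" line
            current = parts[0]
            cells[current] = []
        elif current is not None:
            cells[current].append(line.split()[0])
    return cells


# Hypothetical single-server cell, as a server-side file might look.
sample = """\
>example.org        #hypothetical cell
192.0.2.10          #db1.example.org
"""

for cell, servers in parse_cellservdb(sample).items():
    print(cell, "has", len(servers), "db server(s):", servers)
```

For a single DB server setup, the server-side file would be expected to yield exactly one cell with exactly one address.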

Kim




 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Steve Lenti
 Sent: Wednesday, March 02, 2005 6:35 AM
 To: openafs-info@openafs.org
 Subject: RE: [OpenAFS] no quorum elected
 
 
  On Mar 2, 2005, at 6:40 AM, Steve Lenti wrote:
  
   Trying an initial install on Suse 9.2.  Everything works fine up until
   I have to add the initial users.  Kaserver, vlserver, ptserver,
   buserver, and bosserver -noauth all up and running.  Anyone know the
   problem I am having below?
  
  
   ka create afs
   initial_password:
   Verifying, please re-enter initial_password:
   Creating user afs  : [u] no quorum elected, wait one second
   Creating user afs  : [u] no quorum elected, wait one second
   ...
  
  Try synchronizing the time between your servers.
  
  Horst
  
 
 This is being run all on the same server.
 -Steve


Re: [OpenAFS] no quorum elected

2005-03-02 Thread Horst Birthelmer
On Mar 2, 2005, at 6:40 AM, Steve Lenti wrote:
Trying an initial install on Suse 9.2.  Everything works fine up until I
have to add the initial users.  Kaserver, vlserver, ptserver, buserver,
and bosserver -noauth all up and running.  Anyone know the problem I am
having below?

ka create afs
initial_password:
Verifying, please re-enter initial_password:
Creating user afs  : [u] no quorum elected, wait one second
Creating user afs  : [u] no quorum elected, wait one second
...
Try synchronizing the time between your servers.
Horst


RE: [OpenAFS] no quorum elected

2005-03-02 Thread Steve Lenti
 On Mar 2, 2005, at 6:40 AM, Steve Lenti wrote:
 
  Trying an initial install on Suse 9.2.  Everything works fine up until
  I have to add the initial users.  Kaserver, vlserver, ptserver,
  buserver, and bosserver -noauth all up and running.  Anyone know the
  problem I am having below?
 
 
  ka create afs
  initial_password:
  Verifying, please re-enter initial_password:
  Creating user afs  : [u] no quorum elected, wait one second
  Creating user afs  : [u] no quorum elected, wait one second
  ...
 
 Try synchronizing the time between your servers.
 
 Horst
 

This is being run all on the same server.
-Steve


[OpenAFS] no quorum elected

2005-03-01 Thread Steve Lenti
Trying an initial install on Suse 9.2.  Everything works fine up until I
have to add the initial users.  Kaserver, vlserver, ptserver, buserver,
and bosserver -noauth all up and running.  Anyone know the problem I am
having below?


ka create afs
initial_password:
Verifying, please re-enter initial_password:
Creating user afs  : [u] no quorum elected, wait one second
Creating user afs  : [u] no quorum elected, wait one second
...

Thanks.
-Steve


Re: [OpenAFS] no quorum elected

2002-06-10 Thread Turbo Fredriksson

 Charles == Charles Clancy [EMAIL PROTECTED] writes:

 I don't even know what a quorum IS.  ...  IF it haven't
 anything to do with 'no quorum elected', then I don't know.

Charles When you have multiple AFS servers, they need to
Charles coordinate with one another to maintain consistency among
Charles all the various databases.  No quorum elected basically
Charles means they couldn't decide who should be in charge, and
Charles one side-effect is that you end up with read-only
Charles databases (at least in my experience things like pts
Charles createu or vos create fail, citing that error
Charles message).  Frequently this error is caused by AFS servers
Charles with their system clocks too skewed relative to one
Charles another.

I must have the clock to be set EXACT for other reasons (Kerberos V),
so I'm using the same reference clock on ALL of my machines, so the time
is ok.

I'll keep this in mind though.

Charles It doesn't seem to have much to do with your problem,
Charles though.

Oki, thanx.
-- 
Waco, Texas Mossad assassination munitions Uzi Soviet SEAL Team 6
strategic Noriega Saddam Hussein Marxist Treasury 767 Peking SDI
[See http://www.aclu.org/echelonwatch/index.html for more about this]



Re: [OpenAFS] no quorum elected

2002-06-10 Thread Derrick J Brashear

On 10 Jun 2002, Turbo Fredriksson wrote:

  Derrick == Derrick J Brashear [EMAIL PROTECTED] writes:
 
 Derrick Actually if it's a real error it's frequently caused by
 Derrick that. If you're running clients it's usually the sync
 Derrick site isn't in the CellServDB on the client. Simple answer
 Derrick is look at udebug output.
 Derrick For kaserver, udebug (serverhost) 7004
 Derrick ptserver, 7002
 Derrick vlserver, 7003
 
 This is what I got from these commands. Don't quite know what it say though...

It tells me the important thing:

 Host's addresses are: 192.168.1.4 192.168.1.254
 Host's 127.0.0.1 time is Mon Jun 10 08:55:34 2002
 Local time is Mon Jun 10 08:55:37 2002 (time differential 3 secs)
 Last yes vote for 127.0.0.1 was 0 secs ago (sync site);

The "Host's 127.0.0.1"
and
"Last yes vote for 127.0.0.1"
lines should have a real IP address in them.

Take the entry for your hostname off the localhost line in /etc/hosts
would be my guess. If not, fix CellServDB. Something is causing references
to 127.0.0.1 and you don't want em
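
A small sketch of the check described here: scan udebug output for lines that mention a loopback address. The sample text is the output quoted in this message; the function name is hypothetical, and a hit can also simply mean udebug was pointed at localhost.

```python
import ipaddress
import re

# Dotted-quad candidates; validity is checked with ipaddress below.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def loopback_lines(udebug_text):
    """Return the lines of udebug output that mention a loopback address."""
    hits = []
    for line in udebug_text.splitlines():
        for candidate in IPV4.findall(line):
            try:
                addr = ipaddress.ip_address(candidate)
            except ValueError:
                continue                 # e.g. 999.1.1.1 is not an address
            if addr.is_loopback:
                hits.append(line.strip())
                break
    return hits


sample = """\
Host's addresses are: 192.168.1.4 192.168.1.254
Host's 127.0.0.1 time is Mon Jun 10 08:55:34 2002
Local time is Mon Jun 10 08:55:37 2002 (time differential 3 secs)
Last yes vote for 127.0.0.1 was 0 secs ago (sync site);
"""

for line in loopback_lines(sample):
    print("loopback reference:", line)
```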






Re: [OpenAFS] no quorum elected

2002-06-10 Thread Turbo Fredriksson

 Derrick == Derrick J Brashear [EMAIL PROTECTED] writes:

Derrick It tells me the important thing:

 Host's addresses are: 192.168.1.4 192.168.1.254
 Host's 127.0.0.1 time is Mon Jun 10 08:55:34 2002
 Local time is Mon Jun 10 08:55:37 2002 (time differential 3 secs)
 Last yes vote for 127.0.0.1 was 0 secs ago (sync site);

Derrick The Hosts' 127.0.0.1 and Last yes vote for 127.0.0.1
Derrick lines should have a real IP address in them.

Derrick Take the entry for your hostname off the localhost line
Derrick in /etc/hosts would be my guess. If not, fix
Derrick CellServDB. Something is causing references to 127.0.0.1
Derrick and you don't want em

Isn't this because I used 'udebug localhost ...'?
- s n i p -
tuzjfi:~# head /etc/hosts
127.0.0.1   localhost
192.168.1.4 tuzjfi.bayour.com   tuzjfi
- s n i p -


- s n i p -
tuzjfi:~# udebug tuzjfi.bayour.com 7002
Host's addresses are: 192.168.1.4 192.168.1.254
Host's 192.168.1.4 time is Mon Jun 10 13:29:24 2002
Local time is Mon Jun 10 13:29:25 2002 (time differential 1 secs)
Last yes vote for 192.168.1.4 was 0 secs ago (sync site);
Last vote started 0 secs ago (at Mon Jun 10 13:29:25 2002)
Local db version is 2.0
I am sync site forever (1 server)
Recovery state 1f
Sync site's db version is 2.0
0 locked pages, 0 of them for write

tuzjfi:~# udebug tuzjfi.bayour.com 7003
Host's addresses are: 192.168.1.4 192.168.1.254
Host's 192.168.1.4 time is Mon Jun 10 13:29:13 2002
Local time is Mon Jun 10 13:29:15 2002 (time differential 2 secs)
Last yes vote for 192.168.1.4 was 0 secs ago (sync site);
Last vote started 0 secs ago (at Mon Jun 10 13:29:15 2002)
Local db version is 1023694571.11
I am sync site forever (1 server)
Recovery state 1f
Sync site's db version is 1023694571.11
0 locked pages, 0 of them for write
- s n i p -

Now there's REAL (?) IP addresses there...
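
For reference, a sketch that pulls the two fields discussed in this thread out of udebug output like the above: the time differential and whether the server reports itself as sync site forever. The regular expressions only cover the wording shown in this message, and the function and key names are hypothetical.

```python
import re


def summarize(udebug_text):
    """Extract the clock skew and the single-server sync-site marker."""
    diff = re.search(r"time differential (-?\d+) secs", udebug_text)
    forever = re.search(r"I am sync site forever \((\d+) server\)", udebug_text)
    return {
        "time_differential_secs": int(diff.group(1)) if diff else None,
        "single_server_sync_site": forever is not None,
    }


sample = """\
Host's addresses are: 192.168.1.4 192.168.1.254
Host's 192.168.1.4 time is Mon Jun 10 13:29:24 2002
Local time is Mon Jun 10 13:29:25 2002 (time differential 1 secs)
Last yes vote for 192.168.1.4 was 0 secs ago (sync site);
Last vote started 0 secs ago (at Mon Jun 10 13:29:25 2002)
Local db version is 2.0
I am sync site forever (1 server)
Recovery state 1f
Sync site's db version is 2.0
0 locked pages, 0 of them for write
"""

print(summarize(sample))
```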
-- 
World Trade Center DES tritium supercomputer domestic disruption
iodine South Africa Clinton Noriega Mossad nitrate FBI president KGB
Ortega
[See http://www.aclu.org/echelonwatch/index.html for more about this]



Re: [OpenAFS] no quorum elected

2002-06-10 Thread Derrick J Brashear

On 10 Jun 2002, Turbo Fredriksson wrote:

 Derrick and you don't want em
 
 Isn't this because I used 'udebug localhost ...'?

Yeah, probably. There's a story as to why I was confused, but that's for
another time

 




Re: [OpenAFS] no quorum elected

2002-06-09 Thread Derrick J Brashear

On Sun, 9 Jun 2002, Charles Clancy wrote:

  I don't even know what a quorum IS.
  ...
  IF it haven't anything to do with 'no quorum elected', then I don't know.
 
 When you have multiple AFS servers, they need to coordinate with one
 another to maintain consistency among all the various databases.  No
 quorum elected basically means they couldn't decide who should be in
 charge, and one side-effect is that you end up with read-only databases
 (at least in my experience things like pts createu or vos create
 fail, citing that error message).  Frequently this error is caused by AFS
 servers with their system clocks too skewed relative to one another.

Actually if it's a real error it's frequently caused by that. If you're
running clients it's usually the sync site isn't in the CellServDB on the
client. Simple answer is look at udebug output. 
For kaserver, udebug (serverhost) 7004
ptserver, 7002
vlserver, 7003
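
The port list above can be turned into ready-to-run command lines with a short sketch. Only the three services and ports named in this message are included; the helper name is hypothetical and the host name is taken from elsewhere in this thread purely as an example.

```python
import shlex

# Ports as listed in this message.
UBIK_PORTS = {
    "kaserver": 7004,
    "ptserver": 7002,
    "vlserver": 7003,
}


def udebug_commands(serverhost):
    """Return {service: 'udebug <serverhost> <port>'} for each service."""
    host = shlex.quote(serverhost)   # keep odd host strings shell-safe
    return {svc: f"udebug {host} {port}" for svc, port in UBIK_PORTS.items()}


for service, command in sorted(udebug_commands("tuzjfi.bayour.com").items()):
    print(f"{service}: {command}")
```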






Re: [OpenAFS] no quorum elected

2002-06-06 Thread Turbo Fredriksson

 Derek == Derek Atkins [EMAIL PROTECTED] writes:

Derek Ok, my other question is: is your CellServDB set properly?
Derek Both the client and server CellServDB files?

Tuzjfi is to be both client and server. It's to emulate papadoc's job
(diskspaces, remember).

And the CellServDB looks ok..

- s n i p -
tuzjfi:~# head /etc/openafs/CellServDB
>bayour.com
192.168.1.4 # tuzjfi.bayour.com
>grand.central.org  #GCO Public CellServDB 29 Jan 2002
18.7.14.88  #grand-opening.mit.edu
128.2.191.224   #penn.central.org
>usatlas.bnl.gov    #US Atlas Tier 1 Facility at BNL
130.199.48.32   #aafs01.usatlas.bnl.gov
130.199.48.33   #aafs02.usatlas.bnl.gov
130.199.48.34   #aafs03.usatlas.bnl.gov
>rpi.edu            #Rensselaer Polytechnic Institute
- s n i p -

-- 
Ft. Bragg Albanian Rule Psix Khaddafi radar quiche jihad arrangements
SDI CIA $400 million in gold bullion North Korea Mossad killed smuggle
[See http://www.aclu.org/echelonwatch/index.html for more about this]



Re: [OpenAFS] no quorum elected

2002-06-06 Thread Derek Atkins

Turbo,

I specifically asked you for the contents of BOTH CellServDB files,
why did you send me only one?

What's the _SERVER_ CellServDB look like?

-derek

Turbo Fredriksson [EMAIL PROTECTED] writes:

  Derek == Derek Atkins [EMAIL PROTECTED] writes:
 
 Derek Ok, my other question is: is your CellServDB set properly?
 Derek Both the client and server CellServDB files?
 
 Tuzjfi is to be both client and server. It's to emulate papadoc's job
 (diskspaces, remember).
 
 And the CellServDB looks ok..
 
 - s n i p -
 tuzjfi:~# head /etc/openafs/CellServDB
 bayour.com
 192.168.1.4 # tuzjfi.bayour.com
 grand.central.org  #GCO Public CellServDB 29 Jan 2002
 18.7.14.88  #grand-opening.mit.edu
 128.2.191.224   #penn.central.org
 usatlas.bnl.gov#US Atlas Tier 1 Facility at BNL
 130.199.48.32   #aafs01.usatlas.bnl.gov
 130.199.48.33   #aafs02.usatlas.bnl.gov
 130.199.48.34   #aafs03.usatlas.bnl.gov
 rpi.edu#Rensselaer Polytechnic Institute
 - s n i p -
 
 -- 
 Ft. Bragg Albanian Rule Psix Khaddafi radar quiche jihad arrangements
 SDI CIA $400 million in gold bullion North Korea Mossad killed smuggle
 [See http://www.aclu.org/echelonwatch/index.html for more about this]

-- 
   Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
   Member, MIT Student Information Processing Board  (SIPB)
   URL: http://web.mit.edu/warlord/PP-ASEL-IA N1NWH
   [EMAIL PROTECTED]PGP key available