[OpenAFS] Couldn't get CPS for AnyUser, will try again in 30 seconds; code=5376.

2012-04-04 Brett Heroux
I see this in my FileLog. The symptoms are that everything looks fine, but
/afs mounts are not there. I can authenticate with Kerberos5 and aklog
gives me a token, and bos status is ok too.

Sun Apr  1 04:00:03 2012 File server starting
Sun Apr  1 04:00:03 2012 afs_krb_get_lrealm failed, using devicesoft.org.
Sun Apr  1 04:00:03 2012 VL_RegisterAddrs rpc failed; will retry periodically (code=5376, err=0)
Sun Apr  1 04:00:03 2012 Couldn't get CPS for AnyUser, will try again in 30 seconds; code=5376.
... repeats every 30 seconds

I had a power outage and got a new IP address on my server, had to swap
some NICs that were damaged, but I changed my CellServDB and think I am
over it (I use DHCP). This used to work and I would like very much for it
to work again.

I did google this and just found one unresolved thread. If you can help
with this I would appreciate it.

Thanks,

Brett Heroux


Re: [OpenAFS] Re: Couldn't get CPS for AnyUser, will try again in 30 seconds; code=5376.

2012-04-04 Brett Heroux

I'm pretty sure I restarted my servers already, but I did it again.

I have one db/fileserver and another fileserver. The db/fileserver is 
east-gateway. This is the udebug output.


root@east-gateway:~# bos status east-gateway
Instance buserver, currently running normally.
Instance ptserver, currently running normally.
Instance vlserver, currently running normally.
Instance fs, currently running normally.
Auxiliary status is: file server running.

root@east-gateway:~# udebug 7003 east-gateway
udebug: can't resolve port name east-gateway

Brett Heroux

On 4/4/2012 10:33 AM, Andrew Deason wrote:

> On Wed, 4 Apr 2012 10:24:19 -0500
> Brett Heroux wrote:
>
> > Sun Apr  1 04:00:03 2012 File server starting
> > Sun Apr  1 04:00:03 2012 afs_krb_get_lrealm failed, using devicesoft.org.
> > Sun Apr  1 04:00:03 2012 VL_RegisterAddrs rpc failed; will retry periodically (code=5376, err=0)
> > Sun Apr  1 04:00:03 2012 Couldn't get CPS for AnyUser, will try again in 30 seconds; code=5376.
>
> $ translate_et 5376
> 5376 (u).0 = no quorum elected
>
> Your database servers claim to be out of sync. How many dbservers do you
> have? Can you run 'udebug 7003' for each of them?
>
> > I had a power outage and got a new IP address on my server, had to
> > swap some NICs that were damaged, but I changed my CellServDB and
> > think I am over it (I use DHCP). This used to work and I would like
> > very much for it to work again.
>
> If you changed the server-side CellServDB after the servers started, you
> need to restart the server processes to pick up the changes.
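
The `5376 (u).0` line in the translate_et output reads as "error table u, message 0". A minimal sketch of that split, assuming the usual com_err layout in which the low 8 bits of a code are the message index within its table (the function name `split_et_code` is made up for this note):

```shell
# Split a com_err-style error code into its table base and message index.
# Assumption: the low 8 bits hold the index, the rest identifies the table.
split_et_code() {
    code=$1
    index=$(( code & 255 ))
    base=$(( code - index ))
    echo "base=$base index=$index"
}

split_et_code 5376   # prints: base=5376 index=0
```

Both FileLog messages carry the same code, so they are the same Ubik "no quorum elected" condition reported by two different callers rather than two separate problems.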



___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: Couldn't get CPS for AnyUser, will try again in 30 seconds; code=5376.

2012-04-05 Brett Heroux
The CellServDB on my system is:

>devicesoft.org #EFS
74.222.253.110   #east-gateway.devicesoft.org

It resides in /etc/openafs and /etc/openafs/server on both the
db/fileserver and the other fileserver.

The output from udebug is:

root@east-gateway:~# udebug east-gateway 7003
Host's addresses are: 74.222.253.110
Host's 74.222.253.110 time is Thu Apr  5 14:13:10 2012
Local time is Thu Apr  5 14:13:12 2012 (time differential 2 secs)
Last yes vote for 74.222.253.110 was 0 secs ago (sync site);
Last vote started 0 secs ago (at Thu Apr  5 14:13:12 2012)
Local db version is 1333587500.2
I am sync site forever (1 server)
Recovery state 1f
Sync site's db version is 1333587500.2
0 locked pages, 0 of them for write
Last time a new db version was labelled was:
 65690 secs ago (at Wed Apr  4 19:58:22 2012)

Thanks for your help.

Brett Heroux

On Thu, Apr 5, 2012 at 12:02 PM, Andrew Deason wrote:

> On Wed, 04 Apr 2012 20:06:05 -0500
> Brett Heroux wrote:
>
> > I have one db/fileserver and another fileserver. The db/fileserver is
> > east-gateway. This is the udebug output.
>
> I don't think you can get a quorum error with just one dbserver. What's
> in /usr/afs/etc/CellServDB? (or wherever the server-side CellServDB is)
>
> > root@east-gateway:~# udebug 7003 east-gateway
> > udebug: can't resolve port name east-gateway
>
> As Brandon mentioned, this is backwards. Seeing this with the arguments
> the right way around would still be helpful...
>
> --
> Andrew Deason
> adea...@sinenomine.net
>
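
With the server name first and the port second, the same check can be repeated for each Ubik database service. The sketch below is only an illustration: it assumes 7002 (ptserver) and 7003 (vlserver) are the ports of interest, and the `UDEBUG` and `DBSERVERS` variables and the `check_ubik` name are invented here, not part of OpenAFS.

```shell
# Query each Ubik database port on each db server, server name first.
# UDEBUG can be overridden (for example with echo) to preview the calls.
UDEBUG=${UDEBUG:-udebug}
DBSERVERS=${DBSERVERS:-east-gateway}   # the single db server in this thread

check_ubik() {
    for host in $DBSERVERS; do
        for port in 7002 7003; do      # 7002 = ptserver, 7003 = vlserver
            echo "== $host port $port =="
            "$UDEBUG" "$host" "$port" || echo "udebug failed for $host:$port"
        done
    done
}
```

Running `check_ubik` on the db server prints one udebug report per port; setting `UDEBUG=echo` first shows the argument order without contacting anything.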


Re: [OpenAFS] Re: Couldn't get CPS for AnyUser, will try again in 30 seconds; code=5376.

2012-04-06 Brett Heroux
The pt_util output looks good, it just gives all the users.

Still would appreciate help.

Thanks,

Brett Heroux

On Thu, Apr 5, 2012 at 2:43 PM, Andrew Deason wrote:

> On Thu, 5 Apr 2012 14:17:30 -0500
> Brett Heroux wrote:
>
> > The output from udebug is:
>
> This all looks fine. I probably should have asked for udebug on port
> 7002; that is what looks weird:
>
> $ udebug 74.222.253.110 7002
> [...]
> Local db version is 0.134632965
> I am sync site forever (1 server)
> Recovery state 1f
> Sync site's db version is 0.134632965
> 0 locked pages, 0 of them for write
>
> That is not a normal db version number; maybe your ptdb has been
> corrupted. Can you read it from local disk using pt_util? If you run
> 'pt_util -user' as root, it should spit out a list of all users in the
> database. Does it do that, or does it complain about some error?
>
> Do you see anything in PtLog? (I assume this is in /var/log/openafs, or
> wherever openafs logs are for you)
>
> --
> Andrew Deason
> adea...@sinenomine.net
>
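
For comparison, the healthy port 7003 version earlier in the thread, 1333587500.2, starts with a Unix timestamp that falls within a couple of seconds of the "Wed Apr  4 19:58:22 2012" label time udebug reported, while the port 7002 version starts with 0. A rough sketch of a filter that flags that case, assuming a zero first half is the oddity being pointed at (`check_db_version` is a made-up helper, not an OpenAFS tool):

```shell
# Read udebug output on stdin and inspect the "Local db version is E.C" line.
# Assumption: a healthy version starts with a non-zero timestamp-like epoch.
check_db_version() {
    awk '/^Local db version is / {
        split($5, v, "[.]")
        if (v[1] + 0 == 0)
            print "suspect: epoch is 0 (version " $5 ")"
        else
            print "ok: epoch " v[1] ", counter " v[2]
        found = 1
    }
    END { if (!found) print "no db version line found" }'
}
```

Typical use would be `udebug east-gateway 7002 | check_db_version`. It only reads the report and makes no changes to the database.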


Re: [OpenAFS] Re: Couldn't get CPS for AnyUser, will try again in 30 seconds; code=5376.

2012-04-07 Brett Heroux

That did it.

1) stop the ptserver
2) back up prdb files
3) remove prdb files
4) pt_util -user -group -members -name -system -prdb ./prdb.DB0 -datafile /tmp/t
5) pt_util -user -group -members -name -system -w -datafile /tmp/t
6) start the ptserver
7) joy

I think the -system was unnecessary, but Thank You So Much Andrew.
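
The steps above can be sketched as a script. This is an illustration under stated assumptions, not the exact commands that were run: the database directory, the backup directory, the use of `bos stop`/`bos start` with `-localauth` (run as root on the db server), and the explicit `-prdb` on the load step are all guesses to adjust locally. It prints the commands by default and only executes them with `DRY_RUN=0`.

```shell
# Sketch of the rebuild sequence above. Every path and the server name are
# assumptions; adjust them to the local install before setting DRY_RUN=0.
SERVER=${SERVER:-east-gateway}               # db server named in the thread
DB_DIR=${DB_DIR:-/var/lib/openafs/db}        # assumed prdb location
BACKUP_DIR=${BACKUP_DIR:-/root/prdb-backup}  # hypothetical backup directory
DUMP=${DUMP:-/tmp/t}                         # dump file name used in the thread
DRY_RUN=${DRY_RUN:-1}                        # default: print, do not execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

rebuild_prdb() {
    run bos stop "$SERVER" ptserver -wait -localauth || return 1
    run mkdir -p "$BACKUP_DIR" || return 1
    run cp -p "$DB_DIR/prdb.DB0" "$DB_DIR/prdb.DBSYS1" "$BACKUP_DIR/" || return 1
    run rm "$DB_DIR/prdb.DB0" "$DB_DIR/prdb.DBSYS1" || return 1
    # Dump from the backed-up copy, then load into a fresh database file.
    run pt_util -user -group -members -name -system \
        -prdb "$BACKUP_DIR/prdb.DB0" -datafile "$DUMP" || return 1
    run pt_util -user -group -members -name -system -w \
        -prdb "$DB_DIR/prdb.DB0" -datafile "$DUMP" || return 1
    run bos start "$SERVER" ptserver -localauth
}

rebuild_prdb
```

Reviewing the printed commands before a real run is the point of the dry-run default, since step 3 deletes the live database files.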

Brett Heroux

On 4/7/2012 12:16 PM, Andrew Deason wrote:

> On Fri, 6 Apr 2012 14:59:40 -0500
> Brett Heroux wrote:
>
> > The pt_util output looks good, it just gives all the users.
> >
> > Still would appreciate help.
>
> Would you be willing to provide your prdb.DB0? It just contains things
> like usernames, groups, group memberships, ids, etc. It shouldn't
> contain very sensitive information, unless any of your usernames or
> group memberships etc. are sensitive.
>
> If you want something quicker to get stuff up and running, one thing
> that will probably work is to recreate the ptdb. One way to do this is
> to stop the ptserver, move the prdb.DB0 and prdb.DBSYS1 files out of the
> way, and use pt_util to dump the information from them to a temporary
> file. Then use pt_util to load that information into a new prdb.DB0, and
> start up the ptserver again. Just make sure you back up
> prdb.DB0/prdb.DBSYS1 first.
>
> (See the pt_util documentation for that; I'm in too much of a hurry to
> provide more detailed info:
> <http://docs.openafs.org/Reference/8/pt_util.html>)


