I won't call it "fixed", but with much help from the guys in #openafs,
we did get things working.
The problem appears to be a ulimit setting:
nas1:~# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
max nice                        (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) unlimited
max rt priority                 (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
The stack size limit is set to 8192 kbytes. We had to change that to
unlimited (ulimit -s unlimited), and then things started working.
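For anyone who wants to check their own box first, here is a minimal
sketch of looking at the limit before changing it. The function name
show_stack_limits is made up for illustration; -S and -H are the usual
shell ulimit flags for the soft and hard limits.

```shell
#!/bin/sh
# Print the soft and hard stack limits for the current shell.
# show_stack_limits is a hypothetical helper name, not part of OpenAFS.
show_stack_limits() {
    echo "soft stack limit: $(ulimit -S -s)"
    echo "hard stack limit: $(ulimit -H -s)"
}

show_stack_limits

# To raise the limit for this session only (this fails if the hard
# limit is lower and you are not root):
#   ulimit -s unlimited
```

The change only lasts for the shell it is run in and whatever that
shell starts afterwards.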
Ed, if you see this...any thoughts on what might cause this?
I've been instructed to file a bug report with openafs-bugs, and another
with Debian against the package, since the /etc/init.d/openafs-fileserver
script has to be modified to run ulimit -s unlimited at each startup (the
setting only applies per session). Speculation as to the cause is welcome.
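Until the package is sorted out, the workaround amounts to something
like the sketch below. This is not the actual Debian init script, just
an illustration of the idea: run_with_stack_limit is a hypothetical
helper, and the command it wraps here is a stand-in for whatever line
really starts the server.

```shell
#!/bin/sh
# run_with_stack_limit LIMIT COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND in a subshell whose stack limit has
# been set to LIMIT, so the calling shell's own limits are left untouched.
run_with_stack_limit() {
    limit=$1
    shift
    (
        # Warn but carry on if the limit cannot be changed
        # (for example, when the hard limit is too low).
        ulimit -s "$limit" 2>/dev/null ||
            echo "warning: could not set stack limit to $limit" >&2
        exec "$@"
    )
}

# Example: show the stack limit a child process would see.
run_with_stack_limit unlimited sh -c 'ulimit -s'
```

The simpler alternative is a bare ulimit -s unlimited near the top of
the init script, which then applies to everything the script starts.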
Please don't think of this as a small thing. I've spent well over 40
hours, along with the help of several people, tracking this down!
Tony Shadwick
OSS Solutions
Tony Shadwick wrote:
I've been bouncing in and out of #OpenAFS for the last week trying to
get this working, and I've been working with Coraid support and all to
no avail. It appears something is up with pthreads, but Coraid support
ran a test and pthreads work in the kernel. Rather than copy and paste
the whole long deal, here's the page I have on my site with all of the
info:
http://www.numbski.com/hacks/coraid/openafs-on-cln22.html
In that log you'll see I've tried using both afs-newcell and the script
found at Debian World.
Here are the logs without and with fileserver -d 99 turned on (I know,
bad log level, didn't know until afterwards though):
nas1:/var/log/openafs# cat /var/log/openafs/FileLog
Thu Mar 29 13:52:06 2007 File server starting
Thu Mar 29 13:52:06 2007 afs_krb_get_lrealm failed, using
oss-solutions.com.
Thu Mar 29 13:52:06 2007 Set thread id 14 for FSYNC_sync
Thu Mar 29 13:52:06 2007 Partition /vicepa: attaching volumes
Thu Mar 29 13:52:06 2007 Partition /vicepa: attached 0 volumes; 0
volumes not attached
Thu Mar 29 13:52:06 2007
: Assertion failed! file ../viced/viced.c, line 1956.
and with logging turned up:
nas1:/var/log/openafs# cat FileLog
Thu Mar 29 14:03:02 2007 File server starting
Thu Mar 29 14:03:02 2007 afs_krb_get_lrealm failed, using
oss-solutions.com.
Thu Mar 29 14:03:02 2007 VL_RegisterAddrs rpc failed; will retry
periodically (code=5376, err=0)
Thu Mar 29 14:03:02 2007 Set thread id 14 for FSYNC_sync
Thu Mar 29 14:03:02 2007 Partition /vicepa: attaching volumes
Thu Mar 29 14:03:02 2007 Partition /vicepa: attached 0 volumes; 0
volumes not attached
Thu Mar 29 14:03:02 2007 Starting pthreads
Thu Mar 29 14:03:02 2007 Starting five minute check process
Thu Mar 29 14:03:02 2007 Set thread id 15 for 'FiveMinuteCheckLWP'
Thu Mar 29 14:03:02 2007
: Assertion failed! file ../viced/viced.c, line 1958.
The code in question:
1954 assert(pthread_create
1955 (&serverPid, &tattr, (void *)FiveMinuteCheckLWP,
1956 &fiveminutes) == 0);
1957 assert(pthread_create
1958 (&serverPid, &tattr, (void *)HostCheckLWP, &fiveminutes) == 0);
1959 assert(pthread_create
1960 (&serverPid, &tattr, (void *)FsyncCheckLWP, &fiveminutes) == 0);
1961 #else /* AFS_PTHREAD_ENV */
1962 ViceLog(5, ("Starting LWP\n"));
1963 assert(LWP_CreateProcess
1964 (FiveMinuteCheckLWP, stack * 1024, LWP_MAX_PRIORITY - 2,
1965 (void *)&fiveminutes, "FiveMinuteChecks",
1966 &serverPid) == LWP_SUCCESS);
Totally lost, frustrated and confused. Any devs wish to take pity on me
and help? This is an AMD64 box running Debian.
Tony Shadwick
OSS Solutions
_______________________________________________
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info