Re: [OpenAFS] /afs area is hanging

2009-05-14 Thread Harald Barth

> If you wipe out your cache, AFS does tend to perform badly. I
> recommend against it ;)

Could afsd do something like chmod 0, or use something fancier, to
prevent userland processes from getting at the cache?

Harald.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] /afs area is hanging

2009-05-13 Thread Jason Edgecombe

Mark Henry wrote:

> This is a happy day.  Derrick it looks like your skills have pointed us to the
> solution.  I mentioned that we use a loopback device to mount our afs cache
> filesystem.  That device was /dev/loop0.  Well, after the direction that you
> gave us we found that /dev/loop0 was also being used as a method of restricting
> font cache for a different app.  When the app would run the afs cache was
> getting clobbered and the afs hang would follow.  We have moved the afs cache
> to a new place now and it looks like this problem has been solved.  Thank you
> all on openafs.org that helped us with this issue.  Thank you Derrick for the
> key piece of info that has solved this one.

I'm curious what the backing store for /dev/loop0 is in your setup. What
advantages do you get from running this way?


Is this so you can store the cache in a ramdisk?

Thanks,
Jason


Re: [OpenAFS] /afs area is hanging

2009-05-13 Thread Derrick Brashear
On Wed, May 13, 2009 at 11:16 AM, Mark Henry  wrote:
>> that may well be related to the problem. what cron jobs do you have?
>
>> Derrick
>
> This is a happy day.  Derrick it looks like your skills have pointed us to the
> solution.  I mentioned that we use a loopback device to mount our afs cache
> filesystem.  That device was /dev/loop0.  Well, after the direction that you
> gave us we found that /dev/loop0 was also being used as a method of restricting
> font cache for a different app.  When the app would run the afs cache was
> getting clobbered and the afs hang would follow.  We have moved the afs cache
> to a new place now and it looks like this problem has been solved.  Thank you
> all on openafs.org that helped us with this issue.  Thank you Derrick for the
> key piece of info that has solved this one.

If you wipe out your cache, AFS does tend to perform badly. I
recommend against it ;)

That said, we can (and possibly also will) improve the behavior a bit.

Derrick


Re: [OpenAFS] /afs area is hanging

2009-05-13 Thread Mark Henry
> that may well be related to the problem. what cron jobs do you have?

> Derrick

This is a happy day.  Derrick it looks like your skills have pointed us to the
solution.  I mentioned that we use a loopback device to mount our afs cache
filesystem.  That device was /dev/loop0.  Well, after the direction that you
gave us we found that /dev/loop0 was also being used as a method of restricting
font cache for a different app.  When the app would run the afs cache was
getting clobbered and the afs hang would follow.  We have moved the afs cache
to a new place now and it looks like this problem has been solved.  Thank you
all on openafs.org that helped us with this issue.  Thank you Derrick for the
key piece of info that has solved this one.
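
The conflict described above (two consumers pinned to the same loop device) can be looked for mechanically. What follows is only a minimal sketch, assuming both consumers are declared in fstab-style text with an explicit `loop=` option; the second entry in the example (`/fontcacheFS`, `/var/fontcache`) is hypothetical and not taken from this thread, and the function name is made up for illustration. A device attached directly with losetup would not be caught this way.

```python
from collections import defaultdict


def shared_loop_devices(fstab_text):
    """Return {loop_device: [mount points]} for devices named by more than one entry."""
    users = defaultdict(list)
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        mount_point, options = fields[1], fields[3]
        for opt in options.split(","):
            if opt.startswith("loop="):
                users[opt[len("loop="):]].append(mount_point)
    return {dev: mps for dev, mps in users.items() if len(mps) > 1}


# First line is the entry quoted in this thread; second line is hypothetical.
example = """\
/AFSvirtualFS   /usr/vice/cache   ext3 defaults,loop=/dev/loop0 0 0
/fontcacheFS    /var/fontcache    ext3 defaults,loop=/dev/loop0 0 0
"""

print(shared_loop_devices(example))
# prints {'/dev/loop0': ['/usr/vice/cache', '/var/fontcache']}
```

An empty result only means no clash is visible in the text that was checked.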



On May 12, 2009, at 6:25 PM, Mark Henry wrote:

>>> I take it it's not likely that something is deleting files out from
>>> under AFS inside this filesystem?
>>
>> Not that I am aware of.  We did notice that some of the files were missing in
>> one of the system's cache areas.  One of the systems does not have the
>> following files:  CacheItems, CellItems, VolumeItems and the dir lost+found.
>>
>> Mark Henry

Mark Henry



_
"This message and any attachments are solely for the intended recipient and may 
contain confidential or privileged information. If you are not the intended 
recipient, any disclosure, copying, use, or distribution of the information 
included in this message and any attachments is prohibited. If you have 
received this communication in error, please notify us by reply e-mail and 
immediately and permanently delete this message and any attachments. Thank 
you." 
_


Re: [OpenAFS] /afs area is hanging

2009-05-12 Thread Derrick Brashear

that may well be related to the problem. what cron jobs do you have?

Derrick


On May 12, 2009, at 6:25 PM, Mark Henry wrote:

>> I take it it's not likely that something is deleting files out from
>> under AFS inside this filesystem?
>
> Not that I am aware of.  We did notice that some of the files were missing in
> one of the system's cache areas.  One of the systems does not have the
> following files:  CacheItems, CellItems, VolumeItems and the dir lost+found.
>
> Mark Henry





Re: [OpenAFS] /afs area is hanging

2009-05-12 Thread Mark Henry
> I take it it's not likely that something is deleting files out from
> under AFS inside this filesystem?

Not that I am aware of.  We did notice that some of the files were missing in
one of the system's cache areas.  One of the systems does not have the
following files:  CacheItems, CellItems, VolumeItems and the dir lost+found.
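
A check like the one described above can be scripted. This is a minimal sketch, assuming the cache lives at /usr/vice/cache (the mount point from the fstab entry quoted elsewhere in this thread) and that the three file names mentioned here are the ones to look for; the function name is made up for illustration. lost+found is left out because it belongs to the ext3 filesystem rather than to AFS.

```python
import os

# File names as listed in this thread; treat the list as an assumption.
EXPECTED_CACHE_FILES = ("CacheItems", "CellItems", "VolumeItems")


def missing_cache_files(cache_dir, expected=EXPECTED_CACHE_FILES):
    """Return the expected cache files that are absent from cache_dir."""
    return [
        name
        for name in expected
        if not os.path.exists(os.path.join(cache_dir, name))
    ]


if __name__ == "__main__":
    missing = missing_cache_files("/usr/vice/cache")
    if missing:
        print("missing from cache directory:", ", ".join(missing))
    else:
        print("all expected cache files are present")
```

Run from cron on each client, something like this would show whether the files disappear before the hang or only after it.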

Mark Henry





Re: [OpenAFS] /afs area is hanging

2009-05-12 Thread Derrick Brashear
On Tue, May 12, 2009 at 6:05 PM, Mark Henry  wrote:
>
>> yes, that's very interesting. you've probably told us what we need to.
>> you got an oops in the thread holding the lock, the machine will never
>> recover. what filesystem is behind your afs cache?
>
> We use a virtual ext3 filesystem mounted on a loopback device.  We have done
> this with many afs clients and it seems to be working.  Here is our fstab
> entry:
>
> /AFSvirtualFS     /usr/vice/cache   ext3 defaults,loop=/dev/loop0 0 0

I take it it's not likely that something is deleting files out from
under AFS inside this filesystem?


Re: [OpenAFS] /afs area is hanging

2009-05-12 Thread Mark Henry
> yes, that's very interesting. you've probably told us what we need to.
> you got an oops in the thread holding the lock, the machine will never
> recover. what filesystem is behind your afs cache?

We use a virtual ext3 filesystem mounted on a loopback device.  We have 
done this with many afs clients and it seems to be working.  Here is our 
fstab entry:

/AFSvirtualFS /usr/vice/cache   ext3 defaults,loop=/dev/loop0 0 0

Mark Henry





Re: [OpenAFS] /afs area is hanging

2009-05-12 Thread Derrick Brashear
On Tue, May 12, 2009 at 4:37 PM, Mark Henry  wrote:
>
>
>
>> I took the liberty to paste the interesting parts to
>> http://pastebin.com/m53578cd5. Notice the bottom, which was the original
>> bottom as well. Mark, you've been asked to look at dmesg before this, so I
>> suppose this didn't happen before you tried this call-tracing?
>
>> Besides, it would be interesting if an upgrade to 1.4.10 makes the problem
>> go away. Can you try that?
>
> We upgraded to 1.4.10.  We got the same errors.  Here is the cmdebug output:
>
> => cmdebug HOSTNAME
> ** Cache entry @ 0x172d1880 for 2.536870937.28.1820 [afs.dev.infoprint.com]
>     locks: (none_waiting, write_locked(pid:-246839824 at:681))
>                3 bytes  DV            0  refcnt     2
>     callback 22174d40   expires 1242158668
>     0 opens     0 writers
>     mount point
>     states (0x5), stat'd, read-only

which is afs_HandleLink, called from afs_lookup. Memcache or disk
cache? (different implementations of that function)
>
> The volume # 536870937 above just happens to be root.cell but it has been
> different based on which dir I do the ls in.
>
>> Finally, it looks like ls and sshd got locked up trying to determine your
>> client machine's home cell. I can see that happening (only) if no cell has
>> been set at that point. The output of fs wscell would be interesting in
>> this situation, but I'm not sure whether that would lock up as well (and if
>> it is at all helpful).
>
> We tried the fs wscell command.  It worked fine if the fs command was local
> and hung if the fs was being retrieved from afs.

well, if afs is unhappy, one presumes afs is unhappy.

> Also, here is a bit of interesting output from dmesg when the system is
> hung:

ah, that'd be disk cache.

> AssertProcessEntry: pohm_main, pid=6518
> openafs: Can't open inode 95550

yes, that's very interesting. you've probably told us what we need to.
you got an oops in the thread holding the lock, the machine will never
recover. what filesystem is behind your afs cache?

> [ cut here ]
> kernel BUG at
> /compile/openafs-1.4.10/src/libafs/MODLOAD-2.6.22.5-31-default-MP/osi_file.c:87!
> invalid opcode:  [1] SMP
> last sysfs file: /class/scsi_host/host0/model
> CPU 6
> Modules linked in: tun iptable_filter ip_tables x_tables amk ipv6 libafs(P)
> microcode firmware_class usbhid hid ff_memless tp_dd af_packet apparmor ext2
> loop dm_mod parport_pc parport bnx2 rtc_cmos rtc_core i2c_i801 rtc_lib
> ide_cd i2c_core cdrom shpchp tg3 pci_hotplug container button sg ehci_hcd
> uhci_hcd usbcore sd_mod ata_piix libata edd ext3 mbcache jbd fan aacraid
> scsi_mod piix ide_core thermal processor
> Pid: 6519, comm: ASCIIMast Tainted: P      N 2.6.22.5-31-default #1
> RIP: 0010:[]  []
> :libafs:osi_UFSOpen+0x155/0x1f2
> RSP: :81031de299c8  EFLAGS: 00010296
> RAX: 0023 RBX: 81031a92b418 RCX: 0001
> RDX: 804bdfe8 RSI: 0096 RDI: 804bdfe0
> RBP: 81042145f000 R08: 0001 R09: 810001086bc0
> R10: 0046 R11: 81042fa74ec0 R12: 81042af37000
> R13: 0001753e R14: 0001753e R15: 0003
> FS:  () GS:81042ee9f1c0() knlGS:
> CS:  0010 DS: 002b ES: 002b CR0: 8005003b
> CR2: f5d6050c CR3: 00039e0b9000 CR4: 06e0
> Process ASCIIMast (pid: 6519, threadinfo 81031de28000, task
> 81031ad770c0)
> Stack:  c2000770eb90 c2000770caa8 81042a631000 8104172d1880
> 81039800839c 8832f678 0003 0003
>  c2000770eb90 8104172d1880 
> Call Trace:
> [] :libafs:afs_UFSHandleLink+0xf7/0x1bd
> [] :libafs:afs_lookup+0xbb2/0x115f
> [] :libafs:afs_linux_dentry_revalidate+0x422/0x434
> [] :libafs:afs_linux_lookup+0x85/0x1ca
> [] :libafs:PagInCred+0x30/0xa9
> [] do_lookup+0xc4/0x1ae
> [] __link_path_walk+0x36c/0xd8b
> [] dput+0x26/0x115
> [] __link_path_walk+0xc38/0xd8b
> [] link_path_walk+0x58/0xe0
> [] do_filp_open+0x1c/0x3d
> [] do_path_lookup+0x1ab/0x227
> [] __path_lookup_intent_open+0x56/0x97
> [] open_namei+0x7a/0x674
> [] do_filp_open+0x1c/0x3d
> [] do_sys_open+0x44/0xc1
> [] ia32_sysret+0x0/0xa
>
>
> We still can't seem to get this system to stop hanging in the /afs area.
>
> Mark Henry
>
>

Re: [OpenAFS] /afs area is hanging

2009-05-12 Thread Mark Henry
> I took the liberty to paste the interesting parts to
> http://pastebin.com/m53578cd5. Notice the bottom, which was the original
> bottom as well. Mark, you've been asked to look at dmesg before this, so I
> suppose this didn't happen before you tried this call-tracing?

> Besides, it would be interesting if an upgrade to 1.4.10 makes the problem
> go away. Can you try that?

We upgraded to 1.4.10.  We got the same errors.  Here is the cmdebug output:

=> cmdebug HOSTNAME
** Cache entry @ 0x172d1880 for 2.536870937.28.1820 [afs.dev.infoprint.com]
    locks: (none_waiting, write_locked(pid:-246839824 at:681))
               3 bytes  DV            0  refcnt     2
    callback 22174d40   expires 1242158668
    0 opens     0 writers
    mount point
    states (0x5), stat'd, read-only

The volume # 536870937 above just happens to be root.cell but it has been 
different based on which dir I do the ls in.
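
When cmdebug output gets long, the write-locked entries can be pulled out with a short script. The sketch below assumes the layout shown in this message; reading the four dotted numbers as cell index, volume, vnode and unique is an assumption, with only the volume position confirmed by the remark above. The function and field names are made up for illustration.

```python
import re

# "** Cache entry @ <addr> for <a>.<b>.<c>.<d> [cell name]"
ENTRY_RE = re.compile(
    r"\*\* Cache entry @ (0x[0-9a-f]+) for (\d+)\.(\d+)\.(\d+)\.(\d+)"
    r"(?:\s+\[([^\]]+)\])?"
)
# "write_locked(pid:<pid> at:<n>)"; the pid may print as a negative number
LOCK_RE = re.compile(r"write_locked\(pid:(-?\d+) at:(\d+)\)")


def write_locked_entries(cmdebug_text):
    """Return one dict per cache entry whose lock line reports write_locked."""
    entries, current = [], None
    for line in cmdebug_text.splitlines():
        m = ENTRY_RE.search(line)
        if m:
            current = {
                "address": m.group(1),
                "cell_index": int(m.group(2)),  # assumed meaning
                "volume": int(m.group(3)),
                "vnode": int(m.group(4)),       # assumed meaning
                "unique": int(m.group(5)),      # assumed meaning
                "cell_name": m.group(6),
            }
            continue
        m = LOCK_RE.search(line)
        if m and current is not None:
            entries.append(dict(current, pid=int(m.group(1)), at=int(m.group(2))))
    return entries


sample = """\
** Cache entry @ 0x172d1880 for 2.536870937.28.1820 [afs.dev.infoprint.com]
    locks: (none_waiting, write_locked(pid:-246839824 at:681))
    mount point
"""

for entry in write_locked_entries(sample):
    print(entry["volume"], entry["pid"], entry["cell_name"])
```

The volume number can then be matched against the volumes involved, and the pid compared with the process IDs in the kernel log.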

> Finally, it looks like ls and sshd got locked up trying to determine your
> client machine's home cell. I can see that happening (only) if no cell has
> been set at that point. The output of fs wscell would be interesting in
> this situation, but I'm not sure whether that would lock up as well (and if
> it is at all helpful).

We tried the fs wscell command.  It worked fine if the fs command was 
local and hung if the fs was being retrieved from afs.

Also, here is a bit of interesting output from dmesg when the system is 
hung:

AssertProcessEntry: pohm_main, pid=6518
openafs: Can't open inode 95550
[ cut here ]
kernel BUG at /compile/openafs-1.4.10/src/libafs/MODLOAD-2.6.22.5-31-default-MP/osi_file.c:87!
invalid opcode:  [1] SMP 
last sysfs file: /class/scsi_host/host0/model
CPU 6 
Modules linked in: tun iptable_filter ip_tables x_tables amk ipv6 
libafs(P) microcode firmware_class usbhid hid ff_memless tp_dd af_packet 
apparmor ext2 loop dm_mod parport_pc parport bnx2 rtc_cmos rtc_core 
i2c_i801 rtc_lib ide_cd i2c_core cdrom shpchp tg3 pci_hotplug container 
button sg ehci_hcd uhci_hcd usbcore sd_mod ata_piix libata edd ext3 
mbcache jbd fan aacraid scsi_mod piix ide_core thermal processor
Pid: 6519, comm: ASCIIMast Tainted: P  N 2.6.22.5-31-default #1
RIP: 0010:[]  [] :libafs:osi_UFSOpen+0x155/0x1f2
RSP: :81031de299c8  EFLAGS: 00010296
RAX: 0023 RBX: 81031a92b418 RCX: 0001
RDX: 804bdfe8 RSI: 0096 RDI: 804bdfe0
RBP: 81042145f000 R08: 0001 R09: 810001086bc0
R10: 0046 R11: 81042fa74ec0 R12: 81042af37000
R13: 0001753e R14: 0001753e R15: 0003
FS:  () GS:81042ee9f1c0() knlGS:
CS:  0010 DS: 002b ES: 002b CR0: 8005003b
CR2: f5d6050c CR3: 00039e0b9000 CR4: 06e0
Process ASCIIMast (pid: 6519, threadinfo 81031de28000, task 81031ad770c0)
Stack:  c2000770eb90 c2000770caa8 81042a631000 8104172d1880
 81039800839c 8832f678 0003 0003
  c2000770eb90 8104172d1880 
Call Trace:
 [] :libafs:afs_UFSHandleLink+0xf7/0x1bd
 [] :libafs:afs_lookup+0xbb2/0x115f
 [] :libafs:afs_linux_dentry_revalidate+0x422/0x434
 [] :libafs:afs_linux_lookup+0x85/0x1ca
 [] :libafs:PagInCred+0x30/0xa9
 [] do_lookup+0xc4/0x1ae
 [] __link_path_walk+0x36c/0xd8b
 [] dput+0x26/0x115
 [] __link_path_walk+0xc38/0xd8b
 [] link_path_walk+0x58/0xe0
 [] do_filp_open+0x1c/0x3d
 [] do_path_lookup+0x1ab/0x227
 [] __path_lookup_intent_open+0x56/0x97
 [] open_namei+0x7a/0x674
 [] do_filp_open+0x1c/0x3d
 [] do_sys_open+0x44/0xc1
 [] ia32_sysret+0x0/0xa


We still can't seem to get this system to stop hanging in the /afs area.
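
For anyone sifting through a long dmesg capture, an oops like the one above can be reduced to its BUG location and the chain of frames. This is a rough sketch that assumes the line formats visible in this message (the address brackets may be empty, as in this archive, or filled in); the function name and the "kernel" label for frames without a module prefix are choices made for illustration.

```python
import re

# "kernel BUG at <file>:<line>!" (the path may have wrapped onto the next line)
BUG_RE = re.compile(r"kernel BUG at\s+(\S+):(\d+)!")
# " [<addr>] :module:function+0xoff/0xsize" or " [<addr>] function+0xoff/0xsize"
FRAME_RE = re.compile(
    r"^\s*\[[^\]]*\]\s+(?::(\w+):)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+",
    re.MULTILINE,
)


def summarize_oops(text):
    """Return ((file, line) or None, [(module, function), ...]) for an oops."""
    bug = BUG_RE.search(text)
    location = (bug.group(1), int(bug.group(2))) if bug else None
    frames = [(m.group(1) or "kernel", m.group(2)) for m in FRAME_RE.finditer(text)]
    return location, frames


sample = """\
kernel BUG at /compile/openafs-1.4.10/src/libafs/MODLOAD-2.6.22.5-31-default-MP/osi_file.c:87!
Call Trace:
 [] :libafs:afs_UFSHandleLink+0xf7/0x1bd
 [] :libafs:afs_lookup+0xbb2/0x115f
 [] do_lookup+0xc4/0x1ae
"""

location, frames = summarize_oops(sample)
print(location)
for module, function in frames:
    print(module, function)
```

Here the first libafs frame is afs_UFSHandleLink, the disk-cache variant of the function discussed elsewhere in this thread.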

Mark Henry




Re: [OpenAFS] /afs area is hanging

2009-05-07 Thread Felix Frank

On Thu, 7 May 2009, Ted Creedon wrote:

> Typical for 3 servers all of which hung after "ls /afs"
>
> I niced afsd to get enough keyboard control to see that klogd and syslog-ng
> are using all the cycles filling /var/log/messages with rx carps.
>
> I think that a massive error caused by mis-keying should not hang windows
> and linux clients...


Must be a different problem then, Mark reported no suspicious log entries
whatsoever.

On Thu, 7 May 2009, Simon Wilkinson wrote:

> On 7 May 2009, at 21:51, Mark Henry wrote:
>
>> I ran the 'echo t' command recommended below once afs hung again.  It
>> definitely put some output in dmesg.  The only D states that were listed
>> were bash sessions (I think).  It looks like they were sessions that I
>> opened after the user told me that afs was hung again.  I tried to attach
>> the dmesg output and cmdebug output but the email was rejected because the
>> log files were way too big.  Any ideas of what to try next?  Or is there
>> anything in particular that I should look at in the cmdebug or dmesg
>> output?  Thanks,
>
> Put the files somewhere we can fetch them - either on the web, or in AFS.
> Your cmdebug output really shouldn't be that long, though (you want to run
> 'cmdebug' _not_ 'cmdebug -long')


My CC came through fine. The dmesg output looks mangled, however, it
starts in the middle of a trace. I hope nothing vital is missing, but I
failed to find afsd, which was mildly disappointing.

I took the liberty to paste the interesting parts to
http://pastebin.com/m53578cd5. Notice the bottom, which was the original
bottom as well. Mark, you've been asked to look at dmesg before this, so I
suppose this didn't happen before you tried this call-tracing?

Besides, it would be interesting if an upgrade to 1.4.10 makes the problem
go away. Can you try that?

Finally, it looks like ls and sshd got locked up trying to determine your
client machine's home cell. I can see that happening (only) if no cell has
been set at that point. The output of fs wscell would be interesting in
this situation, but I'm not sure whether that would lock up as well (and if
it is at all helpful).


Cheers
 - Felix

sshd  S 61bfbfdb6613 0  8543  1 (NOTLB)
 8103489399d8 0086  8830848d
  810348939988 80617800 80617800
 8061d210 80617800 80617800 883082f3
Call Trace:
 [] :libafs:afs_IsPrimaryCell+0x2c/0x36
 [] :libafs:afs_GetCellStale+0x42/0x4b
 [] :libafs:afs_osi_SleepSig+0xc5/0x161
 [] default_wake_function+0x0/0xe
 [] :libafs:EvalMountPoint+0x2fa/0x3d1
 [] :libafs:afs_osi_Sleep+0x69/0xba
 [] :libafs:Afs_Lock_Obtain+0xba/0x1a2
 [] :libafs:afs_lookup+0xe67/0x115f
 [] :libafs:afs_pag_match+0x0/0x17
 [] :libafs:afs_linux_dentry_revalidate+0x178/0x434
 [] igrab+0x25/0x34
 [] :libafs:afs_FindVCache+0x222/0x4f5
 [] :libafs:afs_AccessOK+0x4a/0x14c
 [] mutex_lock+0xd/0x1e
 [] iput+0x42/0x7b
 [] :libafs:afs_PutVCache+0x93/0x10d
 [] :libafs:afs_PutFakeStat+0x3d/0x42
 [] :libafs:afs_access+0x382/0x3be
 [] __d_lookup+0xbc/0x10e
 [] do_lookup+0x157/0x1ae
 [] __link_path_walk+0x36c/0xd8b
 [] link_path_walk+0x58/0xe0
 [] audit_syscall_entry+0x138/0x17a
 [] do_path_lookup+0x1ab/0x227
 [] getname+0x14c/0x1af
 [] __user_walk_fd+0x37/0x4c
 [] sys_faccessat+0x9c/0x148
 [] tracesys+0xdc/0xe1

ls    S 61c6b4bd29d2 0  8960  1 (NOTLB)
 8103f4e23968 0082  8830848d
  8103f4e23918 80617800 80617800
 8061d210 80617800 80617800 883082f3
Call Trace:
 [] :libafs:afs_IsPrimaryCell+0x2c/0x36
 [] :libafs:afs_GetCellStale+0x42/0x4b
 [] :libafs:afs_osi_SleepSig+0xc5/0x161
 [] default_wake_function+0x0/0xe
 [] :libafs:EvalMountPoint+0x2fa/0x3d1
 [] :libafs:afs_osi_Sleep+0x69/0xba
 [] :libafs:Afs_Lock_Obtain+0xba/0x1a2
 [] :libafs:afs_lookup+0xe67/0x115f
 [] :libafs:afs_AccessOK+0x4a/0x14c
 [] :libafs:afs_linux_dentry_revalidate+0x178/0x434
 [] igrab+0x25/0x34
 [] :libafs:afs_FindVCache+0x222/0x4f5
 [] :libafs:afs_AccessOK+0x4a/0x14c
 [] mutex_lock+0xd/0x1e
 [] iput+0x42/0x7b
 [] :libafs:afs_PutVCache+0x93/0x10d
 [] :libafs:afs_PutFakeStat+0x3d/0x42
 [] :libafs:afs_access+0x382/0x3be
 [] __d_lookup+0xbc/0x10e
 [] do_lookup+0x157/0x1ae
 [] __link_path_walk+0x8d9/0xd8b
 [] :libafs:afs_TraverseCells+0x4a/0x9b
 [] link_path_walk+0x58/0xe0
 [] current_fs_time+0x3b/0x40
 [] __mark_inode_dirty+0xdf/0x17c
 [] do_path_lookup+0x1ab/0x227
 [] getname+0x14c/0x1af
 [] __user_walk_fd+0x37/0x4c
 [] vfs_lstat_fd+0x18/0x47
 [] __mark_inode_dirty+0xdf/0x17c
 [] sys_newlstat+0x19/0x31
 [] tracesys+0x71/0xe1
 [] tracesys+0xdc/0xe1

bash  D 61d03f4a273d 0  9098   9096 (NOTLB)
 81035ddbfe78 0082  8832198d
 810341013c9c 81035ddbfe28 80617800 80617800
 8061d210 80617800 80617800 8102e3f31780
Call Trace:
 [] :libafs:afs_getatt

Fwd: [OpenAFS] /afs area is hanging

2009-05-07 Thread Ted Creedon
-- Forwarded message --
From: Ted Creedon 
Date: Thu, May 7, 2009 at 2:57 PM
Subject: Re: [OpenAFS] /afs area is hanging
To: Mark Henry 


Use select, cut and paste to post typical portions of dmesg and /var/log/messages

I fixed my problem by recreating all the principals using "kadmin.local -e
des-cbc-crc:normal"
which forces all the keys to be single des.

Typical for 3 servers all of which hung after "ls /afs"

I niced afsd to get enough keyboard control to see that klogd and syslog-ng
are using all the cycles filling /var/log/messages with rx carps.

I think that a massive error caused by mis-keying should not hang windows
and linux clients...

A key check diagnostic would certainly help.

1.5.59 win and 1.4.10 linux on 3 suse 10.2 and 11.1 server boxes, one is
dual homed

Seems to work fine now.


On Thu, May 7, 2009 at 1:51 PM, Mark Henry  wrote:

>
> I ran the 'echo t' command recommended below once afs hung again.  It
> definitely put some output in dmesg.  The only D states that were listed
> were bash sessions (I think).  It looks like they were sessions that I
> opened after the user told me that afs was hung again.  I tried to attach
> the dmesg output and cmdebug output but the email was rejected because the
> log files were way too big.  Any ideas of what to try next?  Or is there
> anything in particular that I should look at in the cmdebug or dmesg output.
>  Thanks,
>
> Mark Henry
>
>
>
> Felix Frank, 05/04/2009 11:46 PM
> To: Mark Henry
> cc: openafs-info@openafs.org
> Subject: Re: [OpenAFS] /afs area is hanging
>
>
>
>
> On Mon, 4 May 2009, Mark Henry wrote:
>
> > I tried the -fakestat-all option and it did not work.
>
> Weird.
>
> > I have searched /var/log/messages.  I have checked the config files.  User
> > authentication works fine (even when the system is hanging).  If I run the
> > command 'ls -l /afs' that window is hung (or any other command that
> > references afs).  If an afs user logs in when the system is in a bad state
> > the session immediately hangs because it can't cd to the afs home dir.  I
> > don't know what to do other than reboot.  Can someone tell me what else to
> > try to find out why this system is hanging?  Thanks,
>
> You can find out just which call gets stuck by issuing an
> 'echo t >>/proc/sysrq-trigger'. Call traces for all processes can then be
> found in dmesg. The broken processes are probably in a D state.
>
> This will probably not identify the root cause, but may give a clue about
> what's going on.
>
> HTH
>  - Felix
>
>
>


Re: [OpenAFS] /afs area is hanging

2009-05-07 Thread Simon Wilkinson


On 7 May 2009, at 21:51, Mark Henry wrote:



> I ran the 'echo t' command recommended below once afs hung again.
> It definitely put some output in dmesg.  The only D states that were
> listed were bash sessions (I think).  It looks like they were
> sessions that I opened after the user told me that afs was hung
> again.  I tried to attach the dmesg output and cmdebug output but
> the email was rejected because the log files were way too big.  Any
> ideas of what to try next?  Or is there anything in particular that
> I should look at in the cmdebug or dmesg output?  Thanks,

Put the files somewhere we can fetch them - either on the web, or in
AFS. Your cmdebug output really shouldn't be that long, though (you
want to run 'cmdebug' _not_ 'cmdebug -long')


S.



Re: [OpenAFS] /afs area is hanging

2009-05-07 Thread Mark Henry
I ran the 'echo t' command recommended below once afs hung again.  It 
definitely put some output in dmesg.  The only D states that were listed 
were bash sessions (I think).  It looks like they were sessions that I 
opened after the user told me that afs was hung again.  I tried to attach 
the dmesg output and cmdebug output but the email was rejected because the 
log files were way too big.  Any ideas of what to try next?  Or is there 
anything in particular that I should look at in the cmdebug or dmesg
output?  Thanks,

Mark Henry




Felix Frank  
05/04/2009 11:46 PM

To
Mark Henry 
cc
openafs-info@openafs.org
Subject
Re: [OpenAFS] /afs area is hanging






On Mon, 4 May 2009, Mark Henry wrote:

> I tried the -fakestat-all option and it did not work.

Weird.

> I have searched /var/log/messages.  I have checked the config files.  User
> authentication works fine (even when the system is hanging).  If I run the
> command 'ls -l /afs' that window is hung (or any other command that
> references afs).  If an afs user logs in when the system is in a bad state
> the session immediately hangs because it can't cd to the afs home dir.  I
> don't know what to do other than reboot.  Can someone tell me what else to
> try to find out why this system is hanging?  Thanks,

You can find out just which call gets stuck by issuing an
'echo t >>/proc/sysrq-trigger'. Call traces for all processes can then be
found in dmesg. The broken processes are probably in a D state.

This will probably not identify the root cause, but may give a clue about
what's going on.

HTH
  - Felix




Re: [OpenAFS] /afs area is hanging

2009-05-05 Thread Mark Henry
Felix,

Thanks for the reply.  I am waiting for it to crash again to run the echo 
. command below.

There also seems to be a separate issue with the LDAP server timing out.
These messages show up in /var/log/messages.  It makes the afs area really
slow but not totally hung.  I then restart nscd and the errors stop and
all is well again.  This seems to be a separate issue from the permanent
afs hang taking place.

Mark Henry




Felix Frank  
05/04/2009 11:46 PM

To
Mark Henry 
cc
openafs-info@openafs.org
Subject
Re: [OpenAFS] /afs area is hanging






On Mon, 4 May 2009, Mark Henry wrote:

> I tried the -fakestat-all option and it did not work.

Weird.

> I have searched /var/log/messages.  I have checked the config files.  User
> authentication works fine (even when the system is hanging).  If I run the
> command 'ls -l /afs' that window is hung (or any other command that
> references afs).  If an afs user logs in when the system is in a bad state
> the session immediately hangs because it can't cd to the afs home dir.  I
> don't know what to do other than reboot.  Can someone tell me what else to
> try to find out why this system is hanging?  Thanks,

You can find out just which call gets stuck by issuing an
'echo t >>/proc/sysrq-trigger'. Call traces for all processes can then be
found in dmesg. The broken processes are probably in a D state.

This will probably not identify the root cause, but may give a clue about
what's going on.

HTH
  - Felix




Re: [OpenAFS] /afs area is hanging

2009-05-04 Thread Felix Frank

On Mon, 4 May 2009, Mark Henry wrote:


> I tried the -fakestat-all option and it did not work.

Weird.

> I have searched /var/log/messages.  I have checked the config files.  User
> authentication works fine (even when the system is hanging).  If I run the
> command 'ls -l /afs' that window is hung (or any other command that
> references afs).  If an afs user logs in when the system is in a bad state
> the session immediately hangs because it can't cd to the afs home dir.  I
> don't know what to do other than reboot.  Can someone tell me what else to
> try to find out why this system is hanging?  Thanks,


You can find out just which call gets stuck by issuing an
'echo t >>/proc/sysrq-trigger'. Call traces for all processes can then be
found in dmesg. The broken processes are probably in a D state.

This will probably not identify the root cause, but may give a clue about
what's going on.

HTH
 - Felix
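
The D-state check Felix mentions can also be done on its own, without the
full sysrq dump. What follows is a minimal sketch, assuming a Linux /proc
filesystem and Python 3; the names parse_stat and uninterruptible_tasks are
made up for illustration and are not part of OpenAFS or any other tool. It
only shows which processes are in state D; the kernel call traces still have
to come from the sysrq 't' dump in dmesg.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split around the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left])
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux-style system; nothing to report
    hung = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the entry was unreadable
        if state == "D":
            hung.append((pid, comm))
    return hung


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

If the hang is in the AFS client, the stuck 'ls' or login shells would be
expected to show up in this list; an empty list while /afs is unresponsive
would point somewhere else.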


Re: [OpenAFS] /afs area is hanging

2009-05-04 Thread Mark Henry
Also, I will try to get the cmdebug output just in case there is something 
in there beyond what I can see.

Mark Henry




Jeffrey Altman wrote on 04/29/2009 11:22 AM:

Mark Henry wrote:
> Thank you all for your responses.  I am trying the -fakestat-all option on the
> afsd daemon.  We will see if it works.  Everything works fine for awhile after
> reboot and then it is just a matter of time before everything that touches afs
> hangs.  Hopefully this -fakestat-all option helps.
> 
> Mark

When the cache manager hangs, execute "cmdebug " and send the
output.





Re: [OpenAFS] /afs area is hanging

2009-05-04 Thread Mark Henry
I was not able to get the cmdebug output this last time the system crashed 
because the user rebooted the system.  We have tried running cmdebug 
several times when it is in its hung state and all entries say 
none_waiting or similar.  The output looks the same as a working afs 
client.

I tried the -fakestat-all option and it did not work.

Here is more background on the issue:  I have an OpenAFS client that works 
after each reboot and then eventually hangs when the afs area is accessed. 
 I am running openafs 1.4.9 that I compiled on the system (I have tried 
many versions of the openafs client with the same results).  The OS is 
OpenSUSE 10.3.  I have two other systems with the same OS that are working 
fine.

I have searched /var/log/messages.  I have checked the config files.  User 
authentication works fine (even when the system is hanging).  If I run the 
command 'ls -l /afs' that window is hung (or any other command that 
references afs).  If an afs user logs in when the system is in a bad state 
the session immediately hangs because it can't cd to the afs home dir.  I 
don't know what to do other than reboot.  Can someone tell me what else to 
try to find out why this system is hanging?  Thanks,

Mark Henry




Jeffrey Altman wrote on 04/29/2009 11:22 AM:

Mark Henry wrote:
> Thank you all for your responses.  I am trying the -fakestat-all option on the
> afsd daemon.  We will see if it works.  Everything works fine for awhile after
> reboot and then it is just a matter of time before everything that touches afs
> hangs.  Hopefully this -fakestat-all option helps.
> 
> Mark

When the cache manager hangs, execute "cmdebug " and send the
output.





Re: [OpenAFS] /afs area is hanging

2009-04-29 Thread Jeffrey Altman
Mark Henry wrote:
> Thank you all for your responses.  I am trying the -fakestat-all option on the
> afsd daemon.  We will see if it works.  Everything works fine for awhile after
> reboot and then it is just a matter of time before everything that touches afs
> hangs.  Hopefully this -fakestat-all option helps.
> 
> Mark

When the cache manager hangs, execute "cmdebug " and send the
output.





Re: [OpenAFS] /afs area is hanging

2009-04-29 Thread Mark Henry
Thank you all for your responses.  I am trying the -fakestat-all option on the
afsd daemon.  We will see if it works.  Everything works fine for awhile after
reboot and then it is just a matter of time before everything that touches afs
hangs.  Hopefully this -fakestat-all option helps.

Mark




Re: FW: [OpenAFS] /afs area is hanging

2009-04-29 Thread Sean O'Malley

I hate to state the obvious as I am sure you checked this already, but it
really sounds like a firewall issue or something on the network is
blocking traffic rather than the afs client itself.

Sean


On Wed, 29 Apr 2009, Wheeler, JF (Jonathan) wrote:



> -Original Message-
> > From: openafs-info-ad...@openafs.org On Behalf Of Mark Henry
> > Sent: 28 April 2009 23:22
> >
> > I have an OpenAFS client that works after each reboot and then eventually
> > hangs when the afs area is accessed.  I am running openafs 1.4.9 that I
> > compiled on the system (I have tried many versions of the openafs client
> > with the same results).  The OS is OpenSUSE 10.3.  I have two other systems
> > with the same OS that are working fine.
> >
> > I have searched /var/log/messages.  I have checked the config files.  User
> > authentication works fine (even when the system is hanging).  If I run the
> > command 'ls -l /afs' that window is hung.  If an afs user logs in when the
> > system is in a bad state the session immediately hangs because it can't cd
> > to the afs home dir.  I don't know what to do other than reboot.  Can
> > someone tell me what else to try to find out why this system is hanging?
> > Thanks,
>
> If you really are using the command "ls -l /afs", I have always
> understood that this is getting information about the whole of AFS
> space, that is, the root directories for all AFS cells; naturally this
> will take a long time.  This may not solve your problem, but the command
> as given is a bad (TM) idea.  Of course, the command "ls -l
> /afs/CELLNAME" where CELLNAME is the name of your cell is a much better
> idea.
>
> Jonathan Wheeler
> e-Science Centre
> Rutherford Appleton Laboratory
> (Cell rl.ac.uk)
>

--
  Sean O'Malley, Information Technologist
  Michigan State University
-



Re: FW: [OpenAFS] /afs area is hanging

2009-04-29 Thread Felix Frank

On Wed, 29 Apr 2009, Wheeler, JF (Jonathan) wrote:


> -Original Message-
> > From: openafs-info-ad...@openafs.org On Behalf Of Mark Henry
> > Sent: 28 April 2009 23:22
> >
> > I have an OpenAFS client that works after each reboot and then eventually
> > hangs when the afs area is accessed.  I am running openafs 1.4.9 that I
> > compiled on the system (I have tried many versions of the openafs client
> > with the same results).  The OS is OpenSUSE 10.3.  I have two other systems
> > with the same OS that are working fine.
> >
> > I have searched /var/log/messages.  I have checked the config files.  User
> > authentication works fine (even when the system is hanging).  If I run the
> > command 'ls -l /afs' that window is hung.  If an afs user logs in when the
> > system is in a bad state the session immediately hangs because it can't cd
> > to the afs home dir.  I don't know what to do other than reboot.  Can
> > someone tell me what else to try to find out why this system is hanging?
> > Thanks,
>
> If you really are using the command "ls -l /afs", I have always
> understood that this is getting information about the whole of AFS
> space, that is, the root directories for all AFS cells; naturally this
> will take a long time.  This may not solve your problem, but the command
> as given is a bad (TM) idea.  Of course, the command "ls -l
> /afs/CELLNAME" where CELLNAME is the name of your cell is a much better
> idea.


It depends on your root.afs volume. With dynroot (as far as I understand),
the involved effort is not great, and doing a plain listing of a static
root.afs shouldn't take ages either.

Most importantly, even if there is a longer batch of accesses that makes one
process take forever, this should not freeze the entire machine's AFS
connectivity, right?

What happens if a user w/ AFS home logs on before ls -l /afs is run? Does
this not generate the bad state?

Cheers
 - Felix


FW: [OpenAFS] /afs area is hanging

2009-04-29 Thread Wheeler, JF (Jonathan)
-Original Message-
> From: openafs-info-ad...@openafs.org On Behalf Of Mark Henry
> Sent: 28 April 2009 23:22
>
> I have an OpenAFS client that works after each reboot and then eventually
> hangs when the afs area is accessed.  I am running openafs 1.4.9 that I
> compiled on the system (I have tried many versions of the openafs client
> with the same results).  The OS is OpenSUSE 10.3.  I have two other systems
> with the same OS that are working fine.
>
> I have searched /var/log/messages.  I have checked the config files.  User
> authentication works fine (even when the system is hanging).  If I run the
> command 'ls -l /afs' that window is hung.  If an afs user logs in when the
> system is in a bad state the session immediately hangs because it can't cd
> to the afs home dir.  I don't know what to do other than reboot.  Can
> someone tell me what else to try to find out why this system is hanging?
> Thanks,

If you really are using the command "ls -l /afs", I have always
understood that this is getting information about the whole of AFS
space, that is, the root directories for all AFS cells; naturally this
will take a long time.  This may not solve your problem, but the command
as given is a bad (TM) idea.  Of course, the command "ls -l
/afs/CELLNAME" where CELLNAME is the name of your cell is a much better
idea.

Jonathan Wheeler 
e-Science Centre 
Rutherford Appleton Laboratory
(Cell rl.ac.uk)


Re: [OpenAFS] /afs area is hanging

2009-04-28 Thread David Bear
You might also try the -fakestat option when starting afsd.

On Tue, Apr 28, 2009 at 3:22 PM, Mark Henry wrote:

> I have an OpenAFS client that works after each reboot and then eventually
> hangs when the afs area is accessed.  I am running openafs 1.4.9 that I
> compiled on the system (I have tried many versions of the openafs client
> with the same results).  The OS is OpenSUSE 10.3.  I have two other systems
> with the same OS that are working fine.
>
> I have searched /var/log/messages.  I have checked the config files.  User
> authentication works fine (even when the system is hanging).  If I run the
> command 'ls -l /afs' that window is hung.  If an afs user logs in when the
> system is in a bad state the session immediately hangs because it can't cd
> to the afs home dir.  I don't know what to do other than reboot.  Can
> someone tell me what else to try to find out why this system is hanging?
> Thanks,
>
> Mark



-- 
David Bear
College of Public Programs at ASU
602-464-0424


Re: [OpenAFS] /afs area is hanging

2009-04-28 Thread S.J.Chun
What do you get if you type "dmesg"? 

- Original Message -
   From: Mark Henry 
   To: openafs-info@openafs.org
   Sent: 09-04-29 07:22:05
   Subject: [OpenAFS] /afs area is hanging

  
I have an OpenAFS client that works after each reboot and then eventually hangs
when the afs area is accessed.  I am running openafs 1.4.9 that I compiled on
the system (I have tried many versions of the openafs client with the same
results).  The OS is OpenSUSE 10.3.  I have two other systems with the same OS
that are working fine.

I have searched /var/log/messages.  I have checked the config files.  User
authentication works fine (even when the system is hanging).  If I run the
command 'ls -l /afs' that window is hung.  If an afs user logs in when the
system is in a bad state the session immediately hangs because it can't cd to
the afs home dir.  I don't know what to do other than reboot.  Can someone tell
me what else to try to find out why this system is hanging?  Thanks,

Mark




[OpenAFS] /afs area is hanging

2009-04-28 Thread Mark Henry

I have an OpenAFS client that works after each reboot and then eventually hangs
when the afs area is accessed.  I am running openafs 1.4.9 that I compiled on
the system (I have tried many versions of the openafs client with the same
results).  The OS is OpenSUSE 10.3.  I have two other systems with the same OS
that are working fine.

I have searched /var/log/messages.  I have checked the config files.  User
authentication works fine (even when the system is hanging).  If I run the
command 'ls -l /afs' that window is hung.  If an afs user logs in when the
system is in a bad state the session immediately hangs because it can't cd to
the afs home dir.  I don't know what to do other than reboot.  Can someone tell
me what else to try to find out why this system is hanging?  Thanks,

Mark
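
The symptom described above (a shell that never returns from 'ls -l /afs')
can be tested for without giving up a login session each time. What follows
is a minimal sketch in Python 3 using only the standard library;
responds_within is a hypothetical helper name, not an OpenAFS tool. It starts
the command as a child process, polls it for a limited time, and gives up
without waiting if the child is still blocked.

```python
import subprocess
import sys
import time


def responds_within(argv, timeout_s, poll_s=0.05):
    """Run argv and report whether it exits within timeout_s seconds."""
    child = subprocess.Popen(argv, stdout=subprocess.DEVNULL,
                             stderr=subprocess.DEVNULL)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if child.poll() is not None:
            return True  # the command finished, whatever its exit code
        time.sleep(poll_s)
    if child.poll() is not None:
        return True  # finished during the last sleep
    # Ask the child to die but do not wait for it: a task blocked inside
    # the kernel ignores signals until the blocking call returns, so
    # waiting here could hang this script as well.
    child.kill()
    return False


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "/afs"
    ok = responds_within(["ls", "-l", target], 10.0)
    print(target, "responded" if ok else "did not respond within 10s")
```

Run from cron or a loop, a probe like this could record roughly when the
hang begins, which may help when matching it against other activity on the
machine. The 10 second limit is an arbitrary choice.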

