On Fri, 25 Feb 2005, Michael Niksch wrote:
>> Michael: You might want to try my patch and see how much things improve for you.
>
> As far as I can tell, the machine dies just as quickly as it did without the patch. The kernel dump also looks similar.

That's odd, it changed things considerably for me on a 32bit machine running AIX 5.1...
I enabled kernel memory debugging (bosdebug -M ; bosboot -a ; reboot) in order to try to further pinpoint the problem, and these are my findings:
* The machine still doesn't dump when I use my "large stack" patch on 1.3.78 (it should dump if the xmalloc debug thingie detects an error).
* Going back to an unpatched 1.3.78, it dumps with the xmalloc debug message "A program has tried to access freed xmalloc memory".

In both cases the crash occurs after obtaining an AFS token and then trying to access AFS using that token. The "kdb stat" output from both dumps is at the bottom of this post.

Does this give a hint about what's wrong? I'm wading through the source at random without finding anything obvious. All those #ifdefs make the thing rather hard to read :/
Also, I'm no kdb guru and I haven't found any good howtos either. If anyone knows how to extract more useful info out of the thing please holler...
Dump 1:
------------------8<-------------------------
(0)> stat
SYSTEM_CONFIGURATION:
POWER_RS2 machine with 1 cpu(s) (32-bit registers)

SYSTEM STATUS:
sysname... AIX
nodename.. n11
release... 1
version... 5
machine... 000030638100
nid....... 00306381
time of crash: Thu Mar 10 11:01:48 2005
age of system: 11 min., 50 sec.
xmalloc debug: enabled
Debug kernel error message: A program has tried to access freed xmalloc memory.
Address at fault was 0x3DC0E000

CRASH INFORMATION:
CPU 0 CSA 2FF3B400 at time of crash, error code for LEDs: 30000000
pvthread+005B80 STACK:
[08DBC714]memset+000054 ()
[08D99EE8]rxi_Alloc+000140 (00002F34)
[08DE979C]rxkad_NewClientSecurityObject+000080 (00000000, 307DFA24, 00000001, 00000030,
307DF9C0)
[08DD0BFC]afs_ConnBySA_7_5+000398 (??, ??, ??, ??, ??, ??, ??, ??)
[08DD0814]afs_Conn+0002A0 (??, ??, ??)
[08DF4B54]afs_DoBulkStat+000BC8 (3D8A68F8, 00000600, 2FF3A8C0)
[08DF359C]afs_lookup+000ED0 (3D8A68F8, 2FF3AB48, 2FF3AB44, 35B46600)
[08DB2CDC]afs_gn_lookup+00004C (3D8A68F8, 2FF3AB44, 2FF3AB48, 00000082,
00000000, 35B46600)
[08DACFD0]vn_lookup+00009C (3D8A68F8, 2FF3AB44, 2FF3AB48, 00000082,
00000000, 35B46600)
[002EEE2C]vnop_lookup+000018 (??, ??, ??, ??, ??, ??)
[002C6E74]lookuppn+000494 (??, ??, ??, ??, ??, ??)
[002C7390]lookupname_cur+000090 (??, ??, ??, ??, ??, ??, ??)
[003295F0]statx+000234 (20003F38, 2FF21708, 00000080, 00000009)
[00003A50].sys_call+000000 ()
Not a valid VMM address @ D01E469C
------------------8<-------------------------
Dump 2:
------------------8<-------------------------
(0)> stat
SYSTEM_CONFIGURATION:
POWER_RS2 machine with 1 cpu(s) (32-bit registers)

SYSTEM STATUS:
sysname... AIX
nodename.. n11
release... 1
version... 5
machine... 000030638100
nid....... 00306381
time of crash: Thu Mar 10 11:38:18 2005
age of system: 34 min., 16 sec.
xmalloc debug: enabled
Debug kernel error message: A program has tried to access freed xmalloc memory.
Address at fault was 0x3453B000

CRASH INFORMATION:
CPU 0 CSA 2FF3B400 at time of crash, error code for LEDs: 30000000
pvthread+005280 STACK:
[09E50714]memset+000054 ()
[09E2DEE8]rxi_Alloc+000140 (00002F34)
[09E7D79C]rxkad_NewClientSecurityObject+000080 (00000000, 307DFAE4, 00000001, 00000030,
307DF7C0)
[09E64BFC]afs_ConnBySA_7_5+000398 (??, ??, ??, ??, ??, ??, ??, ??)
[09E64814]afs_Conn+0002A0 (??, ??, ??)
[09E629FC]afs_FetchStatus+000054 (??, ??, ??, ??)
[09E5FEA4]afs_GetVCache+000478 (??, ??, ??, ??)
[09E47F4C]afs_root_nolock+0000E0 (307BAA10, 2FF3AB44)
[09E47A9C]afs_root+000074 (307BAA10, 2FF3AB44)
[09E41E18]vfs_root+000084 (307BAA10, 2FF3AB44, 35A86400)
[003241D0]vfs_root+000018 (??, ??, ??)
[002C6FE8]lookuppn+000608 (??, ??, ??, ??, ??, ??)
[002C7390]lookupname_cur+000090 (??, ??, ??, ??, ??, ??, ??)
[003295F0]statx+000234 (2FF22CF7, 2FF21A08, 00000080, 00000009)
[00003A50].sys_call+000000 ()
Not a valid VMM address @ D01E469C
------------------8<-------------------------
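
For comparing the two stacks without eyeballing them, something like the following might help. It is only a rough sketch, not part of kdb or OpenAFS: the helper names parse_frames and common_innermost are made up for illustration, and the regular expression assumes frames always look like "[ADDRESS]symbol+OFFSET" as they do in the dumps above. The embedded text is just the first six frames of each dump.

```python
import re

# Assumed frame format, taken from the dumps above: "[08DBC714]memset+000054 (".
# Captures the symbol name and the hexadecimal offset inside that symbol.
FRAME_RE = re.compile(r"\[[0-9A-Fa-f]{8}\]([.\w]+)\+([0-9A-Fa-f]+)")


def parse_frames(stack_text):
    """Return (symbol, offset) pairs in the order kdb printed them (innermost first)."""
    # finditer is used so that a frame starting in the middle of a wrapped
    # line is still picked up.
    return [(m.group(1), int(m.group(2), 16)) for m in FRAME_RE.finditer(stack_text)]


def common_innermost(frames_a, frames_b):
    """Return the leading run of frames that is identical in both stacks."""
    shared = []
    for a, b in zip(frames_a, frames_b):
        if a != b:
            break
        shared.append(a)
    return shared


# First six frames of each dump, copied from the kdb output above.
DUMP1 = """
[08DBC714]memset+000054 ()
[08D99EE8]rxi_Alloc+000140 (00002F34)
[08DE979C]rxkad_NewClientSecurityObject+000080 (00000000, 307DFA24, 00000001, 00000030,
307DF9C0) [08DD0BFC]afs_ConnBySA_7_5+000398 (??, ??, ??, ??, ??, ??, ??, ??)
[08DD0814]afs_Conn+0002A0 (??, ??, ??)
[08DF4B54]afs_DoBulkStat+000BC8 (3D8A68F8, 00000600, 2FF3A8C0)
"""

DUMP2 = """
[09E50714]memset+000054 ()
[09E2DEE8]rxi_Alloc+000140 (00002F34)
[09E7D79C]rxkad_NewClientSecurityObject+000080 (00000000, 307DFAE4, 00000001, 00000030,
307DF7C0) [09E64BFC]afs_ConnBySA_7_5+000398 (??, ??, ??, ??, ??, ??, ??, ??)
[09E64814]afs_Conn+0002A0 (??, ??, ??)
[09E629FC]afs_FetchStatus+000054 (??, ??, ??, ??)
"""

if __name__ == "__main__":
    for symbol, offset in common_innermost(parse_frames(DUMP1), parse_frames(DUMP2)):
        print(f"{symbol}+{offset:06X}")
```

On these excerpts the shared frames run from memset up through afs_Conn, i.e. both crashes go through the same rxi_Alloc call made from rxkad_NewClientSecurityObject, and only the callers above afs_Conn differ.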
/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | [EMAIL PROTECTED]
---------------------------------------------------------------------------
"In English, Data." - Crusher
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
