I apologize for the wide margins, but I have some system output I need to post. I wonder if anyone has had Samba VMS on a VAX hang up on them after about two weeks. This process is causing BACKUP to hang, and any process that tries to run TCPIP hangs as well.

The system in question is a VAXstation 4000/60 with 32 MB RAM running VMS 7.1 and TCPIP 5.1, ECO 5. NMBD is going into MUTEX and hanging up my system so badly that I have to reboot it. Take a look at SHOW SYSTEM:

OpenVMS V7.1  on node ORFF  27-OCT-2003 15:02:53.91  Uptime  16 16:48:37
  Pid    Process Name    State  Pri       I/O       CPU       Page flts  Pages
20200081 SWAPPER         HIB    16         0   0 00:00:44.72         0      0
20200086 CONFIGURE       HIB    10         6   0 00:00:03.62      6644    167
20200088 IPCACP          HIB    10         6   0 00:00:00.13      6019    101
20200089 ERRFMT          HIB     8     11036   0 00:00:34.25      1784    119
2020008A CACHE_SERVER    HIBO   16    -- swapped out --                   121
2020008B CLUSTER_SERVER  HIB     8        11   0 00:00:00.05       192    281
2020008C OPCOM           HIB     7     22437   0 00:00:47.87      6013    169
2020008D AUDIT_SERVER    HIB    10     44564   0 00:00:53.23      3290    397
2020008E JOB_CONTROL     HIB    10     28477   0 00:00:41.15     10181    163
2020008F QUEUE_MANAGER   HIB     9      8537   0 00:00:43.73     13399    549
20200090 SECURITY_SERVER HIB    10      7540   0 00:02:08.84    131238    603
20200091 SMISERVER       HIB     9        35   0 00:00:00.66      7951     71
20200092 TP_SERVER       HIB     9     96640   0 00:08:43.59     36033    158
20200093 TCPIP$TNS2      HIBO    4    -- swapped out --                   381
20200094 TCPIP$TNS1      HIBO    4    -- swapped out --                   399
20200095 TCPIP$INETACP   HIB     8     25761   0 00:01:03.77     16604    589
20200096 TCPIP$BIND_1    LEF     9   1182971   0 00:48:14.88    189439   1813  N
20200097 TCPIP$PORTM_1   LEF    10       110   0 00:00:00.74      7484     59  N
20200098 TCPIP$FTP_1     LEF    10       207   0 00:00:01.29      8642   1172  N
20200099 TCPIP$LBROKER_1 LEF     9   3381919   0 00:56:27.58    203993    691  N
2020009A TCPIP$METRIC_1  LEF    10    556766   0 00:13:09.13     43868    183  N
2020009B TCPIP$NFS_1     HIB     8       152   0 00:00:21.60     12167     59  N
2020009C TCPIP$MOUNTD_1  LEF    10       240   0 00:00:01.36      7027     65  N
2020009D TCPIP$NTP_1     LEF     9   1481698   0 00:02:06.88    101058    339  N
2020009F TCPIP$POP_1     HIB    10     25064   0 00:01:55.96     25131   1207  N
202000A0 SMTP_ORFF_01    HIB     6     20353   0 00:02:13.01     28122   2003
202000A3 TNT_SERVER      HIB     6     10137   0 00:09:12.23    212289   1268
20204824 SMBD_BG1152     RWAST   8       259   0 00:00:01.98      2635   2660  N
202000A5 CircleMUD       LEF     6    192364   0 00:01:37.58     24704    489
202000A6 NMBD            MUTEX   9    690146   0 03:31:10.51    117217    744
20203C27 ZAP_BRANAGEN    LEF     8      5674   0 00:00:10.00      2477    616
202000AB TNT1202000A3    LEFO    1    -- swapped out --                   495  S
202000AF SYSTEM          LEF     5      2715   0 00:00:29.42     30293    268
202049B0 CYRIL           LEF     5     35421   0 00:04:36.08      2859   1940
20204A31 OPERAGOST       RWAST   6       579   0 00:00:01.80      1289    329
20203EB2 XOO6            LEF     9     11859   0 00:00:45.03      4273   1742
20204833 SWAT_BG1177     RWAST   6       135   0 00:00:01.62      2232   1892  N
20204934 _VTA1256:       CUR     4       821   0 00:00:04.02      3887    322
20204A39 TCPIP$SM_BG3533 LEF     8       143   0 00:00:01.57      2328   1517  N
20204743 BATCH_553       LEF     6      1360   0 00:00:05.19      1391   1288  B

Action taken: first, Samba wasn't responding, so I tried to SWAT in to restart it. SWAT hung up in the middle of bringing up the web page. So I went in and tried to kill NMBD. Of course this didn't work. I tried killing the SWAT and SMBD processes in the hope that would free up something. They just went into RWAST. I tried to run TCPIP so I could disable SAMBA, but it hung up before giving a prompt, putting that process into RWAST as well.

Here's what NMBD looks like with SHOW PROCESS in SDA:

Process index: 0026   Name: NMBD   Extended PID: 202000A6
---------------------------------------------------------
Status : 00140023 res,delpen,respen,phdres,login
Status2: 00000001 quantum_resched

PCB address               81EE6B40        JIB address                 81E86600
PHD address               83639800        Swapfile disk address       00000000
Master internal PID       00010026        Subprocess count            0
Internal PID              00010026        Creator internal PID        00000000
Extended PID              202000A6        Creator extended PID        00000000
State                     MUTEX           Termination mailbox         0000
Current priority          9               AST's enabled               KESU
Base priority             4               AST's active                ES
UIC                       [00001,000004]  AST's remaining             21
Mutex count               0               Buffered I/O count/limit    16/18
Waiting EF cluster        0               Direct I/O count/limit      17/18
Starting wait time        1B011B1B        BUFIO byte count/limit      128/896
Event flag wait mask      81E86600        # open files allowed left   10
Local EF cluster 0        C0000001        Timer entries allowed left  8
Local EF cluster 1        80000000        Active page table count     0
Global cluster 2 pointer  00000000        Process WS page count       560
Global cluster 3 pointer  00000000        Global WS page count        184

and SHOW PROCESS /CHANNEL:

Process index: 0026   Name: NMBD   Extended PID: 202000A6
---------------------------------------------------------
%SDA-W-NOACCESS, process not accessible (swapped out or suspended)

Process active channels
-----------------------
Channel  Window    Status  Device/file accessed
-------  ------    ------  --------------------
0010     00000000          DKA0:
%SDA-E-NOREAD, unable to access location 8363B7EC

Here's SHOW PROCESS for the SMBD process that was left running:

Process index: 0024   Name: SMBD_BG1152   Extended PID: 20204824
----------------------------------------------------------------
Status : 00240023 res,delpen,respen,phdres,netwrk
Status2: 00000001 quantum_resched

PCB address               81EF9100        JIB address                 81EAC700
PHD address               8368B800        Swapfile disk address       00000000
Master internal PID       00900024        Subprocess count            0
Internal PID              00900024        Creator internal PID        00000000
Extended PID              20204824        Creator extended PID        00000000
State                     RWAST           Termination mailbox         0013
Current priority          8               AST's enabled               KESU
Base priority             6               AST's active                S
UIC                       [00001,000004]  AST's remaining             4195
Mutex count               0               Buffered I/O count/limit    511/512
Waiting EF cluster        0               Direct I/O count/limit      4094/4096
Starting wait time        1B011919        BUFIO byte count/limit      ******/2046848
Event flag wait mask      00000001        # open files allowed left   294
Local EF cluster 0        80000000        Timer entries allowed left  30
Local EF cluster 1        80000000        Active page table count     0
Global cluster 2 pointer  00000000        Process WS page count       2321
Global cluster 3 pointer  00000000        Global WS page count        339

WHAT'S UP WITH THE ASTERISKS IN BUFIO?

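
For anyone who wants to compare the count/limit pairs in the SHOW PROCESS output above across processes, here is a minimal Python sketch that pulls them out of pasted SDA text. It is not part of SDA or any VMS tooling; the function name quota_pairs and the list of field names are illustrative assumptions based only on the labels visible in this post. A run of asterisks is returned as None, since the display gives no number there, and the sketch makes no claim about what the asterisks mean.

```python
import re

# Field labels as they appear in the SDA SHOW PROCESS text quoted in this post.
# The pattern accepts either a number or a run of asterisks before the slash.
PAIR = re.compile(
    r"(Buffered I/O|Direct I/O|BUFIO byte) count/limit\s+(\*+|\d+)/(\d+)"
)

def quota_pairs(sda_text):
    """Return {field: (count, limit)}; count is None where SDA printed asterisks."""
    pairs = {}
    for name, count, limit in PAIR.findall(sda_text):
        value = None if count.startswith("*") else int(count)
        pairs[name] = (value, int(limit))
    return pairs

# Sample text copied from the SMBD_BG1152 display (hypothetical variable name).
sample = (
    "Mutex count 0 Buffered I/O count/limit 511/512 "
    "Waiting EF cluster 0 Direct I/O count/limit 4094/4096 "
    "Starting wait time 1B011919 BUFIO byte count/limit ******/2046848"
)

for field, (count, limit) in quota_pairs(sample).items():
    shown = "not displayed (asterisks)" if count is None else str(count)
    print(f"{field}: count={shown}, limit={limit}")
```

Running the same function over the NMBD display would give the three pairs 16/18, 17/18 and 128/896, which makes the two processes easy to set side by side.
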
And SHOW PROCESS /CHANNEL:

Process active channels
-----------------------
Channel  Window    Status  Device/file accessed
-------  ------    ------  --------------------
0010     00000000          DKA0:
0020     81DD4FC0          DKA0:[SAMBA.BIN]SMBD.EXE;1
0030     81DCB700          DKA0:[VMS$COMMON.SYSLIB]SECURESHRP.EXE;1 (section file)
0040     81DCE080          DKA0:[VMS$COMMON.SYSLIB]SECURESHR.EXE;1 (section file)
0050     81DD06C0          DKA0:[VMS$COMMON.SYSLIB]LIBRTL.EXE;1 (section file)
0060     81DC8940          DKA0:[VMS$COMMON.SYSEXE]DCL.EXE;1 (section file)
0070     81DC5040          DKA0:[VMS$COMMON.SYSLIB]UVMTHRTL.EXE;1 (section file)
0080     81DE1340          DKA0:[VMS$COMMON.SYSLIB]DCLTABLES.EXE;81 (section file)
0090     81F0F600  Busy    DKA0:[SYS0.SYSMGR]SMBD_STARTUP.LOG;210
00A0     81E9B300          DKA0:[SAMBA.BIN]SMBD_STARTUP.COM;7
00B0     81DD1780          DKA0:[VMS$COMMON.SYSLIB]DECC$SHR.EXE;3 (section file)
00C0     81DD1980          DKA0:[VMS$COMMON.SYSLIB]CMA$TIS_SHR.EXE;1 (section file)
00D0     81DD1740          DKA0:[VMS$COMMON.SYSLIB]UCX$IPC_SHR.EXE;1 (section file)
00E0     81DCFF40          DKA0:[VMS$COMMON.SYSLIB]TCPIP$ACCESS_SHR.EXE;1 (section file)

Process index: 0024   Name: SMBD_BG1152   Extended PID: 20204824
----------------------------------------------------------------
Channel  Window    Status  Device/file accessed
-------  ------    ------  --------------------
00F0     00000000          BG1152:
0100     81E8F540          DKA0:[VMS$COMMON.SYSEXE]RIGHTSLIST.DAT;1
0110     81E5C780          DKA0:[SAMBA.PRIVATE]SECRETS.TDB;1
0120     81E9B480          DKA0:[SAMBA]LOG.SMBD;1
0130     00000000  Busy    DKA0:

DKA0: is my system disk, so dismounting it just to free this process isn't an option!

I don't get it. I assume NMBD is what's hung up, but I don't know what it means by "%SDA-E-NOREAD, unable to access location 8363B7EC", and it doesn't show anything busy! It also doesn't look like either process has exhausted any quota.

----------------------------------
Stephen Eickhoff
[EMAIL PROTECTED]
----------------------------------