Re: [Veritas-bu] Strange hung job problem
Dave Is it just this one client? Wondered if the client needed a bounce? Sounds like the activity from the client is not being passed back to the Master. If its one client - Reboot it perhaps if possible? Simon -Original Message- From: veritas-bu-boun...@mailman.eng.auburn.edu [mailto:veritas-bu-boun...@mailman.eng.auburn.edu] On Behalf Of Dave Markham Sent: Tuesday, April 20, 2010 11:02 AM To: veritas-bu@mailman.eng.auburn.edu Subject: Re: [Veritas-bu] Strange hung job problem Well i dont think the tape has unloaded, but it's getting stranger . This is the final 20 odd lines from bpbkar log on the windows client. This to be says the job has finished and ties in with the start time of 22:00. 22:32:46.684: [4448.2988] <2> tar_base::backup_finish: TAR - backup: 13574 files 22:32:46.684: [4448.2988] <2> tar_base::backup_finish: TAR - backup: file data: 650849865 bytes 6 gigabytes 22:32:46.684: [4448.2988] <2> tar_base::backup_finish: TAR - backup: image data: 703843328 bytes 6 gigabytes 22:32:46.684: [4448.2988] <2> tar_base::backup_finish: TAR - backup: elapsed time: 1590 secs 4494522 bps 22:32:46.700: [4448.2988] <2> tar_base::V_vTarMsgW: INF - EXIT STATUS 0: the requested operation was successfully completed 22:33:09.700: [4448.2988] <4> tar_backup::OVPC_WaitForServerComplete: INF - 'INF - Server status = 0' received 22:33:10.028: [4448.2988] <2> tar_base::V_vTarMsgW: INF - Active Directory: transcation logs truncated 22:33:10.028: [4448.2988] <16> dtcp_write: TCP - failure: send socket (1840) (TCP 10058: Can't send after socket shutdown) 22:33:10.028: [4448.2988] <16> dtcp_write: TCP - failure: attempted to send 6 bytes 22:33:10.028: [4448.2988] <16> dtcp_write: TCP - failure: send socket (1840) (TCP 10058: Can't send after socket shutdown) 22:33:10.028: [4448.2988] <16> dtcp_write: TCP - failure: attempted to send 45 bytes 22:33:10.028: [4448.2988] <4> OVStopCmd: INF - EXIT - status = 0 22:33:10.028: [4448.2988] <2> ov_log::V_GlobalLog: INF - BEDS_Term() Enter InitFlags:0x1 22:33:10.090: [4448.2988] <4> OVShutdown: INF - Finished process 22:33:10.137: [4448.2988] <4> WinMain: INF - Exiting C:\Program Files\VERITAS\NetBackup\bin\bpbkar32.exe 22:33:12.137: [4448.2988] <4> ov_log::OVClose: INF - Closing log file: C:\Program Files\VERITAS\NetBackup\logs\BPBKAR\041910.LOG Yet on the Master the job is still active and says its about 90% through (although i don't use the gui a great deal). Anyway on the gui job details it says it started loading a new tape P805L2 around 01:18 this morning??? There are also WRN messages from bperror on the master for the client in question with time stamps after 22:33. The messages i have seen before and am not too concerned about, but how can the master still be doing things when the client bpbkar process has finished? # bperror -all -client -U --extract--- 04/19/2010 23:14:44from client : WRN - Removable Storage Management: unable to export database (WIN32 21: Unknown error) 04/19/2010 23:14:45from client : ERR - failure reading file: C:\WINDOWS\system32\NtmsData\NTMSDATA (WIN32 33: Unknown error) 04/19/2010 23:18:08from client : ERR - failure reading file: C:\WINDOWS\system32\wbem\Logs\FrameWork.log (WIN32 13: Unknown error) 04/20/2010 01:18:34 begin writing backup id _1271711139, copy 1, fragment 180, to media id P805L2 on drive index 0 These messages keep repeating in the bprd log on the master every 10 odd minutes which i'm not sure if is related.:- 10:35:30.456 [5622] <2> launch: /usr/openv/netbackup/bin/bpsched, pid=465 10:35:31.822 [5622] <2> childterm: pid=465 exit=200, signo=0 core=no 10:35:31.822 [5622] <2> schild: child(ren) not ready to terminate 10:35:31.822 [5622] <2> listen_loop: select() interrupted 10:45:29.742 [5622] <2> get_behavior_rec: ? 10:45:29.743 [5622] <2> get_behavior_rec: MD5 digest line 10:45:30.826 [5622] <2> launch: /usr/openv/netbackup/bin/bpsched, pid=698 10:45:32.188 [5622] <2> childterm: pid=698 exit=200, signo=0 core=no 10:45:32.188 [5622] <2> schild: child(ren) not ready to terminate 10:45:32.188 [5622] <2> listen_loop: select() interrupted Jobs are being multiplexed and everything else seems to have finished apart from this client. There are a few "db_lock_media: unable to lock media at offset 14 (P843L2)" type messages for the first tape in use for the backups in bptm log It successfully mounts the next tape it requires P805L2 (must have ran out of space on P843) and other jobs which were still multiplexing seem to write ok and finish. Any other areas to look? Jeff Cleverley wrote: > Dave, > &g
Re: [Veritas-bu] Strange hung job problem
Well i dont think the tape has unloaded, but it's getting stranger . This is the final 20 odd lines from bpbkar log on the windows client. This to be says the job has finished and ties in with the start time of 22:00. 22:32:46.684: [4448.2988] <2> tar_base::backup_finish: TAR - backup: 13574 files 22:32:46.684: [4448.2988] <2> tar_base::backup_finish: TAR - backup: file data: 650849865 bytes 6 gigabytes 22:32:46.684: [4448.2988] <2> tar_base::backup_finish: TAR - backup: image data: 703843328 bytes 6 gigabytes 22:32:46.684: [4448.2988] <2> tar_base::backup_finish: TAR - backup: elapsed time: 1590 secs 4494522 bps 22:32:46.700: [4448.2988] <2> tar_base::V_vTarMsgW: INF - EXIT STATUS 0: the requested operation was successfully completed 22:33:09.700: [4448.2988] <4> tar_backup::OVPC_WaitForServerComplete: INF - 'INF - Server status = 0' received 22:33:10.028: [4448.2988] <2> tar_base::V_vTarMsgW: INF - Active Directory: transcation logs truncated 22:33:10.028: [4448.2988] <16> dtcp_write: TCP - failure: send socket (1840) (TCP 10058: Can't send after socket shutdown) 22:33:10.028: [4448.2988] <16> dtcp_write: TCP - failure: attempted to send 6 bytes 22:33:10.028: [4448.2988] <16> dtcp_write: TCP - failure: send socket (1840) (TCP 10058: Can't send after socket shutdown) 22:33:10.028: [4448.2988] <16> dtcp_write: TCP - failure: attempted to send 45 bytes 22:33:10.028: [4448.2988] <4> OVStopCmd: INF - EXIT - status = 0 22:33:10.028: [4448.2988] <2> ov_log::V_GlobalLog: INF - BEDS_Term() Enter InitFlags:0x1 22:33:10.090: [4448.2988] <4> OVShutdown: INF - Finished process 22:33:10.137: [4448.2988] <4> WinMain: INF - Exiting C:\Program Files\VERITAS\NetBackup\bin\bpbkar32.exe 22:33:12.137: [4448.2988] <4> ov_log::OVClose: INF - Closing log file: C:\Program Files\VERITAS\NetBackup\logs\BPBKAR\041910.LOG Yet on the Master the job is still active and says its about 90% through (although i don't use the gui a great deal). Anyway on the gui job details it says it started loading a new tape P805L2 around 01:18 this morning??? There are also WRN messages from bperror on the master for the client in question with time stamps after 22:33. The messages i have seen before and am not too concerned about, but how can the master still be doing things when the client bpbkar process has finished? # bperror -all -client -U --extract--- 04/19/2010 23:14:44from client : WRN - Removable Storage Management: unable to export database (WIN32 21: Unknown error) 04/19/2010 23:14:45from client : ERR - failure reading file: C:\WINDOWS\system32\NtmsData\NTMSDATA (WIN32 33: Unknown error) 04/19/2010 23:18:08from client : ERR - failure reading file: C:\WINDOWS\system32\wbem\Logs\FrameWork.log (WIN32 13: Unknown error) 04/20/2010 01:18:34begin writing backup id _1271711139, copy 1, fragment 180, to media id P805L2 on drive index 0 These messages keep repeating in the bprd log on the master every 10 odd minutes which i'm not sure if is related.:- 10:35:30.456 [5622] <2> launch: /usr/openv/netbackup/bin/bpsched, pid=465 10:35:31.822 [5622] <2> childterm: pid=465 exit=200, signo=0 core=no 10:35:31.822 [5622] <2> schild: child(ren) not ready to terminate 10:35:31.822 [5622] <2> listen_loop: select() interrupted 10:45:29.742 [5622] <2> get_behavior_rec: ? 10:45:29.743 [5622] <2> get_behavior_rec: MD5 digest line 10:45:30.826 [5622] <2> launch: /usr/openv/netbackup/bin/bpsched, pid=698 10:45:32.188 [5622] <2> childterm: pid=698 exit=200, signo=0 core=no 10:45:32.188 [5622] <2> schild: child(ren) not ready to terminate 10:45:32.188 [5622] <2> listen_loop: select() interrupted Jobs are being multiplexed and everything else seems to have finished apart from this client. There are a few "db_lock_media: unable to lock media at offset 14 (P843L2)" type messages for the first tape in use for the backups in bptm log It successfully mounts the next tape it requires P805L2 (must have ran out of space on P843) and other jobs which were still multiplexing seem to write ok and finish. Any other areas to look? Jeff Cleverley wrote: > Dave, > > Do you know if the tape library completes the media unload properly? > I believe that the log may show the backup completed but the session > for the master doesn't complete until the media has been unloaded. > You may check things like the bptm or bprd logs on your master also. > > Jeff > > On Mon, Apr 19, 2010 at 8:24 AM, Dave Markham wrote: > >> Guys anyone remember seeing something like this back on NBU 5. >> >> I'm running NBU 5.0MP7 on a Unix Server attached to L25 tape library. >> >> I have a windows client which works fine some times and then other times >> just sits there and hangs everything up. It never finishes and needs >> cancelling manually. >> >> Windows client is
Re: [Veritas-bu] Strange hung job problem
The only time I have seen this is when the windows server ran out of room on the c drive. -Original Message- From: veritas-bu-boun...@mailman.eng.auburn.edu [mailto:veritas-bu-boun...@mailman.eng.auburn.edu] On Behalf Of Dave Markham Sent: Monday, April 19, 2010 9:24 AM To: veritas-bu@mailman.eng.auburn.edu Subject: [Veritas-bu] Strange hung job problem Guys anyone remember seeing something like this back on NBU 5. I'm running NBU 5.0MP7 on a Unix Server attached to L25 tape library. I have a windows client which works fine some times and then other times just sits there and hangs everything up. It never finishes and needs cancelling manually. Windows client is Win Server 2003 SP2 and has NBU 5.0mp7 client. Now according to the bpkar log on the client the job has finished successfully, but on the master/media server the job stays hung as though its not registered it finishing. I have some logs if anyone could help out?? Cheers ___ Veritas-bu maillist - Veritas-bu@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu ___ Veritas-bu maillist - Veritas-bu@mailman.eng.auburn.edu http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu