We recently moved some MSSQL databases from server A (16 GB RAM) to server B (4 GB RAM); the backup scripts call for $ALL.
The job tries to start all 15 databases: it starts 8 (which back up successfully), fails on 9 through 15, and the parent job just hangs. No error code is returned to NetBackup. If we lower the buffers from 3 to 2 and the stripes from 2 to 1, it works fine.

The issues we have are:
1) how to calculate buffers and stripes, and
2) why this is allowed to lock up and fail with no exit error code.

Here is detail from the log and Symantec support comments:

I think I found the root cause of the backup hanging. I looked through the dbclient log and see the following:
---
15:14:18.320 [7976.4920] <16> writeToServer: ERR - send() to server on socket failed:
15:14:18.320 [7976.4920] <16> dbc_put: ERR - failed sending data to server
15:14:18.445 [7976.4920] <16> VxBSASendData: ERR - Could not do a bsa_put().
15:14:18.445 [7976.4920] <16> DBthreads::dbclient: ERR - Error in VxBSASendData: 1.
---
Above we have a socket failure. This results in a failure to update the thread, which sets up the failure below:
---
15:14:18.445 [7976.4920] <16> CDBbackrec::ProcessVxBSAerror: ERR - Error in DBthreads::dbclient: 6.
15:14:18.445 [7976.4920] <1> CDBbackrec::ProcessVxBSAerror: CONTINUATION: - The system cannot find the file specified.
15:14:18.445 [7976.4920] <16> DBthreads::dbclient: ERR - Error in VxBSAEndData: 6.
15:14:18.445 [7976.4920] <1> DBthreads::dbclient: CONTINUATION: - The handle used to associate this call with a previous VxBSAInit() call is invalid.
---
At this point the application panics. See the entries below:
---
15:14:18.461 [7976.7632] <16> DBthreads::dbclient: ERR - Error in CompleteCommand: 0x80770004.
15:14:18.461 [7976.7632] <16> DBthreads::dbclient: ERR - A panic close was issued to dbclient #2.
15:14:18.461 [7976.6932] <16> DBthreads::dbclient: ERR - Error in CompleteCommand: 0x80770004.
15:14:18.523 [7976.6932] <16> DBthreads::dbclient: ERR - A panic close was issued to dbclient #1.
---
I'm not sure you can call this a bug.
I suppose the code could be a little more robust and have a timeout set for the bsa_put() and/or the VxBSAInit() function call.

David McMullin
_______________________________________________
Veritas-bu maillist - Veritas-bu@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
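
For what it's worth, the timeout idea from the support comments can be illustrated with a small generic sketch. This is not NetBackup code and does not use the VxBSA API: the function name send_with_timeout and the return codes 1 and 2 are made up for illustration. It only shows the general pattern of bounding a socket send so that a stalled peer produces a nonzero result instead of an indefinite hang.

```python
import socket


def send_with_timeout(sock, payload, timeout_s):
    """Send payload on sock; return 0 on success, nonzero on failure.

    Hypothetical return codes (illustration only):
      1 = socket-level error (for example, the peer closed the connection)
      2 = the send did not complete within timeout_s seconds
    """
    # The timeout bounds the total time sendall() may block.
    sock.settimeout(timeout_s)
    try:
        sock.sendall(payload)
    except socket.timeout:
        # Must be caught before OSError, since socket.timeout is a subclass of it.
        return 2
    except OSError:
        return 1
    return 0


if __name__ == "__main__":
    writer, reader = socket.socketpair()
    # Small payload fits in the socket buffer, so this succeeds.
    print(send_with_timeout(writer, b"hello", 1.0))
    # Nobody reads from `reader`, so a large payload stalls and times out.
    print(send_with_timeout(writer, b"x" * (64 * 1024 * 1024), 0.5))
    writer.close()
    reader.close()
```

The point of the pattern is that the caller always gets a status back, so a parent job could report a failure rather than wait forever.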