Thomas, this really looks quite similar to the DBLOG666 error we experienced a few days ago. In both cases the log file seems to be (at least partially) corrupted.
IBM services suggested a point-in-time restore of the DB. This leads to data loss for the time between the last DB backup and the power failure.

They also offered a different solution, smart but time-consuming: as the DB could be fine and only the log is corrupted (the error is LOGREAD388, not DBREADxxx), you could get around it with DUMPDB -> LOADFORMAT -> LOADDB -> AUDITDB. This way you avoid data loss (with a restore, all data migrated from disk pools to sequential pools since the backup would be lost in particular); the only price you pay is time.

Your DB and log sizes are very close to ours. Our DB was 80% full, that is 360,000,000 DB entries.

DUMPDB is fast; you can wait at the console for completion (change the devconfig so that the dump fits into 1 or 2 files).

LOADFORMAT takes about as long as DUMPDB. Provide around 25% more space than you had before, just to avoid an unpleasant surprise.

LOADDB takes time; in our environment it took more than 10 hours. Upon completion you see the number of DB entries.

AUDITDB is slow, too. You can estimate how long it will take, as the process lists how many entries have been processed, and from LOADDB you know how many more there are to come.

I'm not very happy with this behaviour of the TSM DB. You have a DB that is fine and a log with one error, and you lose your DB; that is unacceptable. One purpose of combining DB and log is to make TSM crash-resistant, not to create multiple points of failure.

Good luck with your server repair!

Best wishes,
Michael Bartl

On 19 Feb 2008 at 19:01, Thomas Denier wrote:
Our data center fire suppression system was inspected earlier this morning. The inspector somehow managed to trigger the fire suppression system. He was able to abort the activation before Halon was discharged, but not before the electrical power to the data center was cut off.

We have gotten the TSM server host (zSeries Linux) back up, but we have not been able to bring up the TSM server (5.3.4.0). It fails during initialization with the following messages:

ANR0200I Recovery log assigned capacity is 10800 megabytes.
ANR0201I Database assigned capacity is 66800 megabytes.
ANR0306I Recovery log volume mount in progress.
ANR0353I Recovery log analysis pass in progress.
ANR9999D pkthread.c(570): ThreadId<0> Run-time assertion failed: "Cmp64( scanLsn, LOGV->headLsn ) != GREATERTHAN", Thread 0, File logread.c, Line 398.
ANR7824S Server operation terminated.
ANR7823S Internal error LOGREAD388 detected.

I already have a Severity 1 call in to IBM.

We have mirrored recovery log volumes. I tried renaming the primary volumes to force the server to use the copies. The server failed with the same messages. I have since renamed the primary log volumes back to their original names. Even so, the TSM server now generates messages like the following:

ANR0215W Recovery log volume /tsmlog01/logvol is in the offline state - VARY ON required.

I have no idea how I am supposed to vary on the volumes when the TSM server won't start.

Is there anything else I should try while waiting to hear from IBM?
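
A note on the AUDITDB estimate mentioned in the reply: comparing the entries processed so far with the total reported by LOADDB comes down to simple arithmetic. The following is a minimal sketch in Python, assuming the audit proceeds at a roughly constant rate. The function name and the progress figures (90,000,000 entries after 3 hours) are hypothetical illustrations, not TSM output; only the 360,000,000 total comes from the reply.

```python
def estimate_remaining_hours(total_entries, processed_entries, elapsed_hours):
    """Linear estimate of the hours left, assuming a constant processing rate."""
    if processed_entries <= 0 or elapsed_hours <= 0:
        raise ValueError("need some progress and elapsed time before estimating")
    if processed_entries > total_entries:
        raise ValueError("processed entries exceed the total")
    rate = processed_entries / elapsed_hours  # entries per hour so far
    return (total_entries - processed_entries) / rate


# Total taken from the LOADDB summary; progress values are made up for illustration.
remaining = estimate_remaining_hours(360_000_000, 90_000_000, 3.0)
print(f"about {remaining:.1f} hours remaining")  # prints "about 9.0 hours remaining"
```

The estimate is only as good as the constant-rate assumption; if the audit slows down on later parts of the database, the real duration will be longer.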