Re: hidden flags to EXPIRE INV and CLEANUP EXPTABLE
Correction: In the SHOW NODE against the subkeys, the KEY is the NODE_NAME. Field 1 is still the node number and field 2 is PLATFORM_NAME.

Also, beware of using SHOW NODE on wrong or random pages.

On 06.04.20 at 22:58 [EMAIL PROTECTED] wrote:

Date: Thu, 20 Apr 2006 22:58:34 -0500
From: Josh Davis [EMAIL PROTECTED]
Reply-To: ADSM: Dist Stor Manager ADSM-L@VM.MARIST.EDU
To: ADSM-L@VM.MARIST.EDU
Subject: hidden flags to EXPIRE INV and CLEANUP EXPTABLE

UNDOCUMENTED OPTIONS FOR EXPIRE INVENTORY:

There are two undocumented/unsupported options for EXPIRE INV: BEGINNODEID and ENDNODEID. These accept the decimal node number of a node and can be used to expire a specific node's filespaces, or a specific range of nodes.

WHY WOULD YOU EVER WANT TO USE THESE?

EXPIRE INV won't check for a filespace lock before parsing a filespace. As such, if you're running expiration and a node is backing up a filespace when expire inventory gets to it, expire inventory will wait indefinitely.

When this happens, CANCEL EXPIRATION or CANCEL PROC will register as Cancel Pending but will hang there until the lock is released. Officially there's supposed to be a resource timeout, but IBM wasn't able to give details on how long this is.

HOW TO FIND THE NODE NUMBERS:

Node numbers are sequential, starting at 1, and are in REG_TIME order. Deletions leave gaps.

The short way would be a SELECT statement. Supposedly this can be done, but I couldn't figure out the column name. IBM doesn't like to give info regarding undocumented/unsupported options since that might make them liable to support or defend them in the future.

The long way is to use SHOW commands. Use SHOW OBJDIR to find the btree node for the Nodes table. This SHOULD be 38. SHOW NODE 38 (hopefully) will show the top level of the tree. On average, there are about 11 second-level leaf nodes per first-level leaf node. If you do SHOW NODE on each subtree and save the output to a file, you'll have the raw data for the nodes table.

In the data section, field 1 is the node number in hex, and field 2 is the node name in all-caps ASCII.

OTHER USES FOR THE NODEID:

The node ID can be used with SHOW LOCKS and SHOW THREADS to find out which node is holding the lock preventing expire inv from continuing. From there, you can kill a session so that expire inv can continue or be cancelled.

It can also be converted to decimal so you can run EXPIRE INV BEGINNODE=10 ENDNODE=20 or similar to operate only on a specific subset of nodes. This could be used to avoid nodes which have long-running transactions, to quick-expire a huge bunch of data that was just deleted, or to set up scheduled expirations for heavy-expire nodes.

These same flags work on CLEANUP EXPTABLE. Since there is no way to cancel CLEANUP EXPTABLE, running it on a small subset of nodes can help if you suspect you're not expiring all that you should be, but don't want to risk having to shut down TSM to abort it when you're 50 million objects in and it's a week after you started it.

WHY I'M SHARING THIS INFO:

I've opened a DCR requesting that EXPIRE INVENTORY be given an option to allow detection and skipping of locked filespaces, and that it be implemented without killing expiration or the session/process holding the filespace lock.

The FITS request number is MR0420061821 if you or anyone wants to be added to the notify/me-too list for this. If your sales rep doesn't know where/how to get to FITS, it's on D03DB004.boulder.ibm.com. I think it's under m_dir (marketing).

This was way longer than I anticipated, but seemed useful enough to risk sharing.

-- Josh
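The hex-to-decimal step described in the post can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, the hex values are invented sample input rather than real SHOW NODE output, and the BEGINNODEID/ENDNODEID spelling is the one given at the top of the post (the example command in the post writes BEGINNODE/ENDNODE). Since the options are undocumented, the exact spelling the server accepts is an assumption.

```python
def hex_node_id_to_decimal(field1: str) -> int:
    """Convert a node number read in hex (e.g. '1a' or '0x1A') to decimal."""
    return int(field1.strip(), 16)


def expire_command(begin_id: int, end_id: int) -> str:
    """Build an EXPIRE INV command limited to a range of node numbers.

    The option names are the undocumented ones named in the post; they are
    not confirmed by any IBM documentation.
    """
    if begin_id < 1 or end_id < begin_id:
        raise ValueError("expected 1 <= begin_id <= end_id")
    return f"EXPIRE INV BEGINNODEID={begin_id} ENDNODEID={end_id}"


if __name__ == "__main__":
    # Invented sample values, not taken from a real server.
    first = hex_node_id_to_decimal("0A")
    last = hex_node_id_to_decimal("14")
    print(expire_command(first, last))
```

The sketch only produces the command text; it would still have to be entered on the server console or through an administrative client.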
Re: hidden flags to EXPIRE INV and CLEANUP EXPTABLE
UNDOCUMENTED OPTIONS FOR EXPIRE INVENTORY: There are two undocumented/unsupported options for EXPIRE INV; BEGINNODEID and ENDNODEID. These accept the decimal node number of a node and can be used to expire a specific node's filespaces, or a specific range. WHY WOULD YOU EVER WANT TO USE THESE? EXPIRE INV won't check for filespace lock before parsing a filespace.

We are currently using the CLEANUP EXPTABLE cmd with the BEGINNODEID/ENDNODEID options ... UNDER THE DIRECTION OF TSM SUPPORT.

We are on TSM V5.3.2. Back when we were on V5.1 (last year), we were getting ANR messages that prevented us from deleting old nodes out of TSM. We did bunches of stuff with support, but the fix was to move to V5.3. This we were planning, and did. As part of migrating to V5.3 we ran the cleanup procedure for Win system objects. After upgrading we still couldn't delete the nodes out of TSM.

The solution was to run the CLEANUP EXPTABLE command. We did this on our test system for each of our production databases. Once started, this command cannot be stopped without halting the TSM server. Running it against a copy of a production DB on our test server took well over 2 weeks for one TSM server, and just about 2 weeks for our other TSM server. After running it, we were able to delete the nodes.

OK, now we are told by the support center to run it on production. Remember ... you cannot run expiration while this is running. We told them we could not go without expiration for that long, and we can't just halt our TSM server at any time to stop it!!

The answer was to get the NODEIDs for all the nodes and run CLEANUP EXPTABLE for one node at a time using BEGINNODEID/ENDNODEID. We tried this on one node ... it worked, but we have well over 500 nodes on each TSM server.

I've written a script that automatically works through a file of NODEIDs. After each node is cleaned up, it runs expiration for some amount of time. The longer the cleanup runs, the longer expiration runs.
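For anyone wanting to build something similar, the loop described above might look roughly like this in Python. The actual script is ksh and is not shown in this thread, so none of this is rick's code. The function names, the 0.5 ratio, and the minimum/maximum limits are all assumptions for illustration; the post only says that a longer cleanup is followed by a longer expiration. The sketch builds command strings and leaves running them (for example through dsmadmc) to a caller-supplied function, which is assumed to block until the server process has finished. It also assumes the expiration window is set with the DURATION= option of EXPIRE INVENTORY, which takes minutes.

```python
import time
from typing import Callable, Iterable


def cleanup_command(node_id: int) -> str:
    """CLEANUP EXPTABLE restricted to one node (undocumented options)."""
    return f"CLEANUP EXPTABLE BEGINNODEID={node_id} ENDNODEID={node_id}"


def expiration_minutes(cleanup_seconds: float, ratio: float = 0.5,
                       minimum: int = 10, maximum: int = 240) -> int:
    """Scale the expiration window with the time the cleanup took.

    ratio, minimum and maximum are hypothetical tuning knobs.
    """
    minutes = int(cleanup_seconds * ratio / 60)
    return max(minimum, min(maximum, minutes))


def expire_command(minutes: int) -> str:
    """Expiration limited to a number of minutes."""
    return f"EXPIRE INVENTORY DURATION={minutes}"


def process_nodes(node_ids: Iterable[int],
                  run: Callable[[str], None],
                  clock: Callable[[], float] = time.monotonic) -> None:
    """Clean up one node at a time, with some expiration after each one."""
    for node_id in node_ids:
        started = clock()
        # Expiration cannot run during cleanup, so run() must block here.
        run(cleanup_command(node_id))
        elapsed = clock() - started
        run(expire_command(expiration_minutes(elapsed)))
```

Passing the clock in as a parameter is only there so the timing logic can be exercised without a TSM server; a real run would use the default.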
I run this script Monday thru Friday, cleaning up one node at a time with some normal expiration between each cleanup. On the weekend I put our normal expiration processing back in place to get a couple of good full expiration runs.

It took several months to work through the first TSM server. If I understand the output of the command, it fixed almost 15 million errors on this TSM server. The 2nd TSM server has been going through this process since Feb 1st and is finally getting close to finishing: it has processed 509 out of 526 nodes.

Anyway, this is one reason to use these options. If anyone has a similar problem, I would be happy to send them this script (ksh). It's highly specific to our environment, but could be adapted easily.

rick