Re: export nodes causes TSM server crash
John,

Thanks for the script to check whether any exports are still running before starting the next batch job of 15 exports. It is currently running, so we will see whether it makes life easier for TSM.

I managed to get another crash of the TSM server when the export of the first 15 nodes was started. No other activities were being performed on the TSM server, except for the Tivoli Operational Reporting tool, which monitors the TSM activities every hour. I've stopped the reporting service as well, to see if it has anything to do with the crash.

No news from IBM support so far. I'll update the thread when I know more.

regards,
Kurt
Re: export nodes causes TSM server crash
I agree that the TSM server shouldn't ever crash, but just because it shouldn't crash doesn't necessarily mean you should try to run 75 or 100 or 1000 exports concurrently either. Until a fix is produced, I would just limit your concurrent exports to what you know works without committing a self-imposed denial of service attack on your TSM server.

Here is what I would do with your scripts that have the exports separated into groups of 15 nodes each:

1. Kick off the first one as is.
2. Modify all the other scripts to first check for any export processes still running, and if there are, then have those scripts reschedule themselves.

i.e.:

    select * from processes where upper(process)='EXPORT NODE'
    if (rc_ok) goto reschedule
    exit
    :reschedule
    del sched type=a
    def sched type=a cmd="run " active=yes startt=NOW+0:30 perunits=onetime
    exit

__
John Monahan
Consultant
Infrastructure Solutions Group
Computech Resources, Inc.
Office: 952-833-0930 ext 109
Cell: 952-221-6938
http://www.computechresources.com
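The check-and-reschedule step described in this reply can be illustrated outside TSM with a minimal Python sketch. It is not the TSM script itself: `count_running_exports` is a hypothetical callable standing in for the SELECT against the PROCESSES table, and the 30-minute retry interval is taken from the `startt=NOW+0:30` value in the reply.

```python
from typing import Callable, List, Tuple


def next_action(count_running_exports: Callable[[], int]) -> str:
    """Decide what a batch script should do each time it fires.

    count_running_exports is a hypothetical stand-in for a query that
    counts EXPORT NODE processes still active on the server.
    """
    return "reschedule" if count_running_exports() > 0 else "start_batch"


def simulate_gate(poll_results: List[int], retry_minutes: int = 30) -> Tuple[int, int]:
    """Replay a sequence of poll results.

    Returns (number of reschedules, total minutes waited) up to the first
    poll that finds no running exports.
    """
    reschedules = 0
    for running in poll_results:
        if next_action(lambda: running) == "start_batch":
            break
        reschedules += 1
    return reschedules, reschedules * retry_minutes
```

With this gate, a later batch only starts once a poll finds no export still running, so the number of concurrent exports should stay at or below the size of one batch.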
Re: export nodes causes TSM server crash
John,

The export of just 15 nodes was tested earlier on, and that group already contained the larger nodes. At that time the TSM server was just slow (high CPU consumption and a lot of disk I/O, which is normal of course). It worked fine.

The export of all of the nodes at the same time causes an immediate crash of the TSM server. I did not mean to start all the exports at once, but I had not noticed that the PARALLEL/SERIAL commands would not work because the exports are started in the background. So I changed the script to work in groups of 15 nodes.

The export of the nodes in groups of 15 caused a new crash when the last group export was started. A few of the earlier exports were still running at that time; the nodes in the last group were rather small.

A support call was logged of course. The question is what causes the TSM server crash. Apart from the PK_EXCEPTION and PK_THREAD messages in the application log, nothing else is found.

I just have to wait for some news from the labs at this time, and will contact them again tomorrow.

regards,
Kurt
Re: export nodes causes TSM server crash
The server should not crash. Ever. End of story. Call support.
Re: export nodes causes TSM server crash
Let me see if I understand you correctly. The export works fine when only 15 nodes are running, but after 2 hours, when the second set of 15 nodes kicks in (while some from the first group of 15 are still running), that is when your server crashes? Or does your server crash with only 15 nodes running an export?

__
John Monahan
Consultant
Infrastructure Solutions Group
Computech Resources, Inc.
Office: 952-833-0930 ext 109
Cell: 952-221-6938
http://www.computechresources.com
Re: export nodes causes TSM server crash
Henrik,

Too many servers over LAN/WAN are in the picture to take a monthly archive in just a weekend. And because there are also many TDP backups, the export was chosen instead of the backup set (which is only possible for a TSM BA client). The export could be taken relative to a certain date too.

A PMR has been opened at TSM support too of course.

regards,
Kurt
Re: export nodes causes TSM server crash
Hi,

You confuse me, what are you trying to achieve? Monthly exports could easily be replaced with archives or backup sets.

//Henrik
export nodes causes TSM server crash
Hello everybody,

I've got a TSM server 5.3.2.2 running on Windows 2003 Enterprise Edition SP1 (7 GB RAM, Xeon 3.2 GHz CPU) that has about 100 TSM clients defined. Each month an export of each TSM node with the active backup data will be taken to disk (DS4100 with SATA disks of 250 GB). The disk storage pool that contains the backups is on the DS4100 too.

I scheduled the export of the TSM nodes this past weekend with a few scripts. I first tried to launch just one script that took the export in blocks of 15 nodes using the PARALLEL and SERIAL commands. However, as the export is started in the background, all of the 75 exports were started immediately. This caused a TSM server crash. After restarting the TSM server, no errors are found in the activity log, except a message that no more than 16 commands can be started in one PARALLEL statement. The last normal message about the export is written in the log, and the next messages are from when the server is started again.

I then split up the export myself: a script started the export of 15 nodes, and 4 administrative schedules were defined that each triggered the export of 15 additional nodes at 2-hour intervals. The TSM server crashed once more.

Is this a known issue when the export of a lot of nodes is started? Am I overlooking some parameters here? Can the export be started in a better way using TSM scripting?

Using 'export server' instead of an 'export node' for each TSM node is not an option, as the import of a single node would then take too much time.

thanks in advance,
Kurt
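The batching arithmetic in this message (75 exports, groups of 15, one initial script plus 4 schedules spaced 2 hours apart) can be written out as a short Python sketch. The node names are hypothetical, since the thread does not list them, and the limit of 16 is the PARALLEL limit reported in the activity log; nothing here talks to a TSM server.

```python
from typing import List

PARALLEL_LIMIT = 16  # limit on commands per PARALLEL statement, as reported above


def make_batches(nodes: List[str], size: int = 15) -> List[List[str]]:
    """Split the node list into consecutive groups of at most `size` nodes."""
    if not 0 < size <= PARALLEL_LIMIT:
        raise ValueError("batch size must be between 1 and PARALLEL_LIMIT")
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


def start_offsets_hours(batch_count: int, interval_hours: int = 2) -> List[int]:
    """Fixed start times: batch 0 at hour 0, each later batch one interval later."""
    return [i * interval_hours for i in range(batch_count)]


# Hypothetical node names, for illustration only.
nodes = [f"NODE{i:03d}" for i in range(1, 76)]
groups = make_batches(nodes)
offsets = start_offsets_hours(len(groups))
```

Here `groups` holds 5 batches of 15 nodes and `offsets` is `[0, 2, 4, 6, 8]`. Fixed offsets like these start each batch whether or not the earlier exports have finished, so batches can overlap if an export runs longer than 2 hours.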