Re: Using the DBI::TSM module, memory problem
Hi, I have now tried different approaches for my large TSM query: DBI: Allen Rout confirmed my suspicions in a previous post. This one loads the whole output of the sql into memory before processing. For my large query I killed the program when it reached 1GB. ODBC: This one seems to kick off the query and start processing immediately. However it still uses a lot of memory if not as much as DBI. My query topped out at around 400 MB. I am not sure what is going on under the hood here. Command pipe: open OUT, "dsmadmc -noconfirm -tabd -id=xxx -password=xxx select |"; This worked well, perl just using 4MB of memory. No memory problems on the TSM-server either. Thank you to all who responded! Best regards, Hans Chr. Riksheim -Original Message- From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Andrew Raibeck Sent: Monday, December 04, 2006 3:42 PM To: ADSM-L@VM.MARIST.EDU Subject: Re: Using the DBI::TSM module, memory problem I cannot remember if I ever tried using DBI, but I have been able to get Perl to work with the TSM ODBC driver by using the Roth ODBC package for Perl. Go to the URL in my sig, then do a search on: odbc perl to find an article on using Perl with the TSM ODBC driver. I believe my example in the article does not have the memory consumption issue that you describe. Regards, Andy Andy Raibeck IBM Software Group Tivoli Storage Manager Client Development Internal Notes e-mail: Andrew Raibeck/Tucson/[EMAIL PROTECTED] Internet e-mail: [EMAIL PROTECTED] IBM Tivoli Storage Manager support web page: http://www.ibm.com/software/sysmgmt/products/support/IBMTivoliStorageManager.html The only dumb question is the one that goes unasked. The command line is your friend. "Good enough" is the enemy of excellence. "ADSM: Dist Stor Manager" wrote on 12/04/2006 06:21:07 AM: > Hi, > > anybody else using the perl module DBI::TSM? > > I've recently been using it for reporting purposes and find it quite > useful. However I have some problems with the perl memory consumption. > This may very well be caused by my lack of knowledge of DBI, but I > haven't found a remedy yet. > > The problem is that I want to process the output of sql statements > line by line as it is returned by the TSM-server, but it seems that > DBI requires the output of the whole sql to be stored in memory before > I can process it. For large queries the memory consumption will be > huge. > > Example(I know this is a simple one where it could be all handled in > the sql itself, this is just for explaining my problem): > > > # Set up the database handle: > $dbh=DBI->connect("DBI::TSM:myserver", 'admin', 'password¨'); > > # SQL that will return a zillion rows > $sql='select description as "DESC" from archives where node_name='MYNODE''; > > $sth=$dbh->prepare($sql); > > $sth->execute; #This is the one that slurps my memory > > while($row_ref)=$sth->fetchrow_hashref){ > #this processes each row that is already stored in memory > $archive_names{$row_ref->{'DESC'}}=''; > } > > > I guess I could avoid the problem by using "dsmadmc" directly and > write the output to a file that i could later have perl parse, but > that would not be a very elegant solution. I want it to work with DBI. > > > Any ideas? > > > Best regards, > > Hans Chr. Riksheim > > > > > > > This > email originates from Steria AS, Biskop Gunnerus' gate 14a, N-0051 > OSLO, > http://www.steria.no. This email and any attachments may contain > confidential/intellectual property/copyright information and is only > for the use of the addressee(s). You are prohibited from copying, > forwarding, disclosing, saving or otherwise using it in any way if you > are not the addressee(s) or responsible for delivery. If you receive > this email by mistake, please advise the sender and cancel it > immediately. Steria may monitor the content of emails within its > network to ensure compliance with its policies and procedures. Any > email is susceptible to alteration and its integrity cannot be > assured. Steria shall not be liable if the message is altered, > modified, falsified, or even edited. />This email originates from Steria AS, > Biskop Gunnerus' gate 14a, N-0051 OSLO, http://www.steria.no. This email and > any attachments may contain confidential/intellectual property/copyright > information and is only for the use of the addressee(s). You are prohibited > from copying, forwarding, disclosing, saving or otherwise using it in any way > if you are not the addressee(s) or responsible for delivery. If you
Re: Re: Using the DBI::TSM module, memory problem
Hans Christian Riksheim wrote: > > I essentially want to "stream" the output of the sql through a filter > while it is returned from the TSM-server, not glob it up in memory > first and then process it later. Have not found out yet with DBI. Some > database guy suggested use of cursors which make it possible to fetch > one row at a time and have the database keep track of your progress, > but I don't know the first thing about cursors and I don't think the > TSM database is designed for those needs anyway. I haven't used the DBI::TSM module, nor am I a database guy. But the TSM implementation of SQL SELECT does support the use of cursors. I've seen various tools use them (there's a STORServer tool that uses them, and I wouldn't be surprised if TSM Operational Reporting used them). Not being a database guy, though, I couldn't tell you _how_ to use them, or if DBI::TSM supports the use of them. -- Hello World.David Bronder - Systems Admin Segmentation Fault ITS-SPA, Univ. of Iowa Core dumped, disk trashed, quota filled, soda warm. [EMAIL PROTECTED]
Re: Using the DBI::TSM module, memory problem
Allen S. Rout a écrit : On Mon, 4 Dec 2006 19:46:07 +0100, Hans Christian Riksheim <[EMAIL PROTECTED]> said: I essentially want to "stream" the output of the sql through a filter while it is returned from the TSM-server, not glob it up in memory first and then process it later. Have not found out yet with DBI. And you won't. TSM::Function.pm: 138 sub tsm_execute { 139 my ($sth,$statement)[EMAIL PROTECTED]; 140 [...] 160 while (<$ch>) { 161 $errstr.=$_ if m/^[A-Z][A-Z][A-Z]\d\d\d\d[^I]/; 162 chomp; In Perl, that means it's going to read until it's done. I've noodled around a bit thinking about ways to buffer or stall that reading, but I can't guaruntee that you won't just have the entire stream pile up somewhere else in the system; will the server stop the SELECT process just becaus the thing listening to dsmadmc's STDOUT has stalled? Ew. My recommendation is that, if you're going to do big dataset stuff, don't bother with PERL for your primary analysis (and I'm a perl addict, BTW). do something like dsmadmc -dataonly=yes -comma | awk [...] Many of the questions I ask of my server come down to dsmadmc | cut [somehow] | sort | uniq -c | sort -rn which streams nicely. Or you can even dsmadmc | bzip2 -c > tempfile and then mess with that. Remember, database selects tend to compress VERY nicely. Some database guy suggested use of cursors which make it possible to fetch one row at a time and have the database keep track of your progress, but I don't know the first thing about cursors and I don't think the TSM database is designed for those needs anyway. Agreed. The ODBC path you're already working at is probably a better solution for the big queries. - Allen S. Rout Hello, I had two problems when I wrote this module. First, some people use this module on Windows platform. So, I store all the result of dsmadmc command in memory with qx() statement. Second, DBI need to return the dataset as a memory perl structure. For instance, for AIX, il you need to retrieve more than 256Mo you could get few problems. It's not a Perl limitation but the or my implementation for DBI (any good Idea is accepted). In Perl, you could do it as is (only on UNIX) : my $cmd = "dsmadmc "; open my $cmd_h, "$cmd |" or die "Cannot fork: $!\n"; while (<$cmd_h>) { # You have one line in $_ # Do what you want with $_ but don't store it in memory # write result in an other file... print it to STDOUT, ... } close $cmd_h or die "Error in '$cmd' command: $!\n";
Re: Using the DBI::TSM module, memory problem
>> On Mon, 4 Dec 2006 19:46:07 +0100, Hans Christian Riksheim <[EMAIL >> PROTECTED]> said: > I essentially want to "stream" the output of the sql through a > filter while it is returned from the TSM-server, not glob it up in > memory first and then process it later. Have not found out yet with > DBI. And you won't. TSM::Function.pm: 138 sub tsm_execute { 139 my ($sth,$statement)[EMAIL PROTECTED]; 140 [...] 160 while (<$ch>) { 161 $errstr.=$_ if m/^[A-Z][A-Z][A-Z]\d\d\d\d[^I]/; 162 chomp; In Perl, that means it's going to read until it's done. I've noodled around a bit thinking about ways to buffer or stall that reading, but I can't guaruntee that you won't just have the entire stream pile up somewhere else in the system; will the server stop the SELECT process just becaus the thing listening to dsmadmc's STDOUT has stalled? Ew. My recommendation is that, if you're going to do big dataset stuff, don't bother with PERL for your primary analysis (and I'm a perl addict, BTW). do something like dsmadmc -dataonly=yes -comma | awk [...] Many of the questions I ask of my server come down to dsmadmc | cut [somehow] | sort | uniq -c | sort -rn which streams nicely. Or you can even dsmadmc | bzip2 -c > tempfile and then mess with that. Remember, database selects tend to compress VERY nicely. > Some database guy suggested use of cursors which make it possible to > fetch one row at a time and have the database keep track of your > progress, but I don't know the first thing about cursors and I don't > think the TSM database is designed for those needs anyway. Agreed. The ODBC path you're already working at is probably a better solution for the big queries. - Allen S. Rout
Re: Using the DBI::TSM module, memory problem
Hi, I use the 64-bit version of perl which can handle the amount of memory. But the memory required is so great that the server starts swapping and the execution rate goes to a standstill(together with the rest of the things running on that AIX-server). The same will probably happen with DBM. I essentially want to "stream" the output of the sql through a filter while it is returned from the TSM-server, not glob it up in memory first and then process it later. Have not found out yet with DBI. Some database guy suggested use of cursors which make it possible to fetch one row at a time and have the database keep track of your progress, but I don't know the first thing about cursors and I don't think the TSM database is designed for those needs anyway. I will try the ODBC-module Andrew suggested and let you know if I can make it work in a stream/filter fashion. Best regards Hans Christian Riksheim Fra: ADSM: Dist Stor Manager på vegne av Richard Sims Sendt: ma 04.12.2006 15:46 Til: ADSM-L@VM.MARIST.EDU Emne: Re: Using the DBI::TSM module, memory problem On Dec 4, 2006, at 9:21 AM, Hans Christian Riksheim wrote: > The problem is that I want to process the output of sql statements > line by line as it is returned by the TSM-server, but it seems that > DBI requires the output of the whole sql to be stored in memory > before I can process it. For large queries the memory consumption > will be huge. Hans - I faced this problem with a reporting program which had to operate on a very large data set, which caused perl to exhibit flakey problems (Illegal Instruction) as it exhausted its memory - even with Unix Resource Limits high. I initially approached dealing with this via perl's dbm capabilities, which allows hashes to be on disk rather than in memory. Still, that was not enough. I eventually resolved the issue by switching to 64-bit perl, which provided the elbow room necessary. Richard Sims This email originates from Steria AS, Biskop Gunnerus' gate 14a, N-0051 OSLO, http://www.steria.no. This email and any attachments may contain confidential/intellectual property/copyright information and is only for the use of the addressee(s). You are prohibited from copying, forwarding, disclosing, saving or otherwise using it in any way if you are not the addressee(s) or responsible for delivery. If you receive this email by mistake, please advise the sender and cancel it immediately. Steria may monitor the content of emails within its network to ensure compliance with its policies and procedures. Any email is susceptible to alteration and its integrity cannot be assured. Steria shall not be liable if the message is altered, modified, falsified, or even edited.
Re: Using the DBI::TSM module, memory problem
On Dec 4, 2006, at 9:21 AM, Hans Christian Riksheim wrote: The problem is that I want to process the output of sql statements line by line as it is returned by the TSM-server, but it seems that DBI requires the output of the whole sql to be stored in memory before I can process it. For large queries the memory consumption will be huge. Hans - I faced this problem with a reporting program which had to operate on a very large data set, which caused perl to exhibit flakey problems (Illegal Instruction) as it exhausted its memory - even with Unix Resource Limits high. I initially approached dealing with this via perl's dbm capabilities, which allows hashes to be on disk rather than in memory. Still, that was not enough. I eventually resolved the issue by switching to 64-bit perl, which provided the elbow room necessary. Richard Sims
Re: Using the DBI::TSM module, memory problem
I cannot remember if I ever tried using DBI, but I have been able to get Perl to work with the TSM ODBC driver by using the Roth ODBC package for Perl. Go to the URL in my sig, then do a search on: odbc perl to find an article on using Perl with the TSM ODBC driver. I believe my example in the article does not have the memory consumption issue that you describe. Regards, Andy Andy Raibeck IBM Software Group Tivoli Storage Manager Client Development Internal Notes e-mail: Andrew Raibeck/Tucson/[EMAIL PROTECTED] Internet e-mail: [EMAIL PROTECTED] IBM Tivoli Storage Manager support web page: http://www.ibm.com/software/sysmgmt/products/support/IBMTivoliStorageManager.html The only dumb question is the one that goes unasked. The command line is your friend. "Good enough" is the enemy of excellence. "ADSM: Dist Stor Manager" wrote on 12/04/2006 06:21:07 AM: > Hi, > > anybody else using the perl module DBI::TSM? > > I've recently been using it for reporting purposes and find it quite > useful. However I have some problems with the perl memory > consumption. This may very well be caused by my lack of knowledge of > DBI, but I haven't found a remedy yet. > > The problem is that I want to process the output of sql statements > line by line as it is returned by the TSM-server, but it seems that > DBI requires the output of the whole sql to be stored in memory > before I can process it. For large queries the memory consumption > will be huge. > > Example(I know this is a simple one where it could be all handled in > the sql itself, this is just for explaining my problem): > > > # Set up the database handle: > $dbh=DBI->connect("DBI::TSM:myserver", 'admin', 'password¨'); > > # SQL that will return a zillion rows > $sql='select description as "DESC" from archives where node_name='MYNODE''; > > $sth=$dbh->prepare($sql); > > $sth->execute; #This is the one that slurps my memory > > while($row_ref)=$sth->fetchrow_hashref){ > #this processes each row that is already stored in memory > $archive_names{$row_ref->{'DESC'}}=''; > } > > > I guess I could avoid the problem by using "dsmadmc" directly and > write the output to a file that i could later have perl parse, but > that would not be a very elegant solution. I want it to work with DBI. > > > Any ideas? > > > Best regards, > > Hans Chr. Riksheim > > > > > > > This > email originates from Steria AS, Biskop Gunnerus' gate 14a, N-0051 OSLO, > http://www.steria.no. This email and any attachments may contain > confidential/intellectual property/copyright information and is only > for the use of the addressee(s). You are prohibited from copying, > forwarding, disclosing, saving or otherwise using it in any way if > you are not the addressee(s) or responsible for delivery. If you > receive this email by mistake, please advise the sender and cancel > it immediately. Steria may monitor the content of emails within its > network to ensure compliance with its policies and procedures. Any > email is susceptible to alteration and its integrity cannot be > assured. Steria shall not be liable if the message is altered, > modified, falsified, or even edited.