Re: Using the DBI::TSM module, memory problem

2006-12-07 Thread Hans Christian Riksheim
Hi,

I have now tried different approaches for my large TSM query:

DBI:
Allen Rout confirmed my suspicions in a previous post. This one loads the whole 
output of the sql into memory before processing. For my large query I killed 
the program when it reached 1GB.

ODBC:
This one seems to kick off the query and start processing immediately. However 
it still uses a lot of memory if not as much as DBI. My query topped out at 
around 400 MB. I am not sure what is going on under the hood here.

Command pipe:
open OUT, "dsmadmc -noconfirm -tabd -id=xxx -password=xxx select  |";
This worked well, perl just using 4MB of memory. No memory problems on the 
TSM-server either.


Thank you to all who responded!


Best regards,

Hans Chr. Riksheim





-Original Message-
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED] On Behalf Of Andrew 
Raibeck
Sent: Monday, December 04, 2006 3:42 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: Using the DBI::TSM module, memory problem

I cannot remember if I ever tried using DBI, but I have been able to get Perl 
to work with the TSM ODBC driver by using the Roth ODBC package for Perl. Go to 
the URL in my sig, then do a search on:

   odbc perl

to find an article on using Perl with the TSM ODBC driver. I believe my example 
in the article does not have the memory consumption issue that you describe.

Regards,

Andy

Andy Raibeck
IBM Software Group
Tivoli Storage Manager Client Development Internal Notes e-mail: Andrew 
Raibeck/Tucson/[EMAIL PROTECTED] Internet e-mail: [EMAIL PROTECTED]

IBM Tivoli Storage Manager support web page: 
http://www.ibm.com/software/sysmgmt/products/support/IBMTivoliStorageManager.html

The only dumb question is the one that goes unasked.
The command line is your friend.
"Good enough" is the enemy of excellence.

"ADSM: Dist Stor Manager"  wrote on 12/04/2006
06:21:07 AM:

> Hi,
> 
> anybody else using the perl module DBI::TSM?
> 
> I've recently been using it for reporting purposes and find it quite 
> useful. However I have some problems with the perl memory consumption. 
> This may very well be caused by my lack of knowledge of DBI, but I 
> haven't found a remedy yet.
> 
> The problem is that I want to process the output of sql statements 
> line by line as it is returned by the TSM-server, but it seems that 
> DBI requires the output of the whole sql to be stored in memory before 
> I can process it. For large queries the memory consumption will be 
> huge.
> 
> Example(I know this is a simple one where it could be all handled in 
> the sql itself, this is just for explaining my problem):
> 
> 
> # Set up the database handle:
> $dbh=DBI->connect("DBI::TSM:myserver", 'admin', 'password¨');
> 
> # SQL that will return a zillion rows
> $sql='select description as "DESC" from archives where
node_name='MYNODE'';
> 
> $sth=$dbh->prepare($sql);
> 
> $sth->execute;  #This is the one that slurps my memory
> 
> while($row_ref)=$sth->fetchrow_hashref){
>   #this processes each row that is already stored in memory
>   $archive_names{$row_ref->{'DESC'}}='';
> }
> 
> 
> I guess I could avoid the problem by using "dsmadmc" directly and 
> write the output to a file that i could later have perl parse, but 
> that would not be a very elegant solution. I want it to work with DBI.
> 
> 
> Any ideas?
> 
> 
> Best regards,
> 
> Hans Chr. Riksheim
> 
> 
> 
> 
> 
> 
> This 
> email originates from Steria AS, Biskop Gunnerus' gate 14a, N-0051 
> OSLO,

> http://www.steria.no. This email and any attachments may contain 
> confidential/intellectual property/copyright information and is only 
> for the use of the addressee(s). You are prohibited from copying, 
> forwarding, disclosing, saving or otherwise using it in any way if you 
> are not the addressee(s) or responsible for delivery. If you receive 
> this email by mistake, please advise the sender and cancel it 
> immediately. Steria may monitor the content of emails within its 
> network to ensure compliance with its policies and procedures. Any 
> email is susceptible to alteration and its integrity cannot be 
> assured. Steria shall not be liable if the message is altered, 
> modified, falsified, or even edited. />This email originates from Steria AS, 
> Biskop Gunnerus' gate 14a, N-0051 OSLO, http://www.steria.no. This email and 
> any attachments may contain confidential/intellectual property/copyright 
> information and is only for the use of the addressee(s). You are prohibited 
> from copying, forwarding, disclosing, saving or otherwise using it in any way 
> if you are not the addressee(s) or responsible for delivery. If you 

Re: Re: Using the DBI::TSM module, memory problem

2006-12-05 Thread David Bronder
Hans Christian Riksheim wrote:
>
> I essentially want to "stream" the output of the sql through a filter
> while it is returned from the TSM-server, not glob it up in memory
> first and then process it later. Have not found out yet with DBI. Some
> database guy suggested use of cursors which make it possible to fetch
> one row at a time and have the database keep track of your progress,
> but I don't know the first thing about cursors and I don't think the
> TSM database is designed for those needs anyway.

I haven't used the DBI::TSM module, nor am I a database guy.  But the
TSM implementation of SQL SELECT does support the use of cursors.  I've
seen various tools use them (there's a STORServer tool that uses them,
and I wouldn't be surprised if TSM Operational Reporting used them).

Not being a database guy, though, I couldn't tell you _how_ to use them,
or if DBI::TSM supports the use of them.

--
Hello World.David Bronder - Systems Admin
Segmentation Fault ITS-SPA, Univ. of Iowa
Core dumped, disk trashed, quota filled, soda warm.   [EMAIL PROTECTED]


Re: Using the DBI::TSM module, memory problem

2006-12-05 Thread Laurent Bendavid

Allen S. Rout a écrit :

On Mon, 4 Dec 2006 19:46:07 +0100, Hans Christian Riksheim <[EMAIL PROTECTED]> 
said:
  


  

I essentially want to "stream" the output of the sql through a
filter while it is returned from the TSM-server, not glob it up in
memory first and then process it later. Have not found out yet with
DBI.



And you won't.

TSM::Function.pm:

138 sub tsm_execute {
139 my ($sth,$statement)[EMAIL PROTECTED];
140
[...]
160 while (<$ch>) {
161 $errstr.=$_ if m/^[A-Z][A-Z][A-Z]\d\d\d\d[^I]/;
162 chomp;

In Perl, that means it's going to read until it's done.  I've noodled
around a bit thinking about ways to buffer or stall that reading, but
I can't guaruntee that you won't just have the entire stream pile up
somewhere else in the system;  will the server stop the SELECT process
just becaus the thing listening to dsmadmc's STDOUT has stalled? Ew.

My recommendation is that, if you're going to do big dataset stuff,
don't bother with PERL for your primary analysis (and I'm a perl
addict, BTW).  do something like


dsmadmc -dataonly=yes -comma | awk  [...]

Many of the questions I ask of my server come down to

dsmadmc | cut [somehow] | sort | uniq -c | sort -rn

which streams nicely.  Or you can even

dsmadmc | bzip2 -c > tempfile

and then mess with that.  Remember, database selects tend to compress
VERY nicely.



  

Some database guy suggested use of cursors which make it possible to
fetch one row at a time and have the database keep track of your
progress, but I don't know the first thing about cursors and I don't
think the TSM database is designed for those needs anyway.



Agreed. The ODBC path you're already working at is probably a better
solution for the big queries.


- Allen S. Rout



Hello,

I had two problems when I wrote this module.

First, some people use this module on Windows platform. So, I store all 
the result of dsmadmc command in memory with qx() statement.


Second, DBI need to return the dataset as a memory perl structure. For 
instance, for AIX, il you need to retrieve more than 256Mo you could get 
few problems.


It's not a Perl limitation but the or my implementation for DBI (any 
good Idea is accepted).


In Perl, you could do it as is (only on UNIX) :

my $cmd = "dsmadmc ";

open my $cmd_h, "$cmd |" or die "Cannot fork: $!\n";
while (<$cmd_h>) {
   # You have one line in $_
   # Do what you want with $_ but don't store it in memory
   # write result in an other file... print it to STDOUT, ...
}
close $cmd_h or die "Error in '$cmd' command: $!\n";


Re: Using the DBI::TSM module, memory problem

2006-12-04 Thread Allen S. Rout
>> On Mon, 4 Dec 2006 19:46:07 +0100, Hans Christian Riksheim <[EMAIL 
>> PROTECTED]> said:

> I essentially want to "stream" the output of the sql through a
> filter while it is returned from the TSM-server, not glob it up in
> memory first and then process it later. Have not found out yet with
> DBI.

And you won't.

TSM::Function.pm:

138 sub tsm_execute {
139 my ($sth,$statement)[EMAIL PROTECTED];
140
[...]
160 while (<$ch>) {
161 $errstr.=$_ if m/^[A-Z][A-Z][A-Z]\d\d\d\d[^I]/;
162 chomp;

In Perl, that means it's going to read until it's done.  I've noodled
around a bit thinking about ways to buffer or stall that reading, but
I can't guaruntee that you won't just have the entire stream pile up
somewhere else in the system;  will the server stop the SELECT process
just becaus the thing listening to dsmadmc's STDOUT has stalled? Ew.

My recommendation is that, if you're going to do big dataset stuff,
don't bother with PERL for your primary analysis (and I'm a perl
addict, BTW).  do something like


dsmadmc -dataonly=yes -comma | awk  [...]

Many of the questions I ask of my server come down to

dsmadmc | cut [somehow] | sort | uniq -c | sort -rn

which streams nicely.  Or you can even

dsmadmc | bzip2 -c > tempfile

and then mess with that.  Remember, database selects tend to compress
VERY nicely.



> Some database guy suggested use of cursors which make it possible to
> fetch one row at a time and have the database keep track of your
> progress, but I don't know the first thing about cursors and I don't
> think the TSM database is designed for those needs anyway.

Agreed. The ODBC path you're already working at is probably a better
solution for the big queries.


- Allen S. Rout


Re: Using the DBI::TSM module, memory problem

2006-12-04 Thread Hans Christian Riksheim
Hi,
 
I use the 64-bit version of perl which can handle the amount of memory. But the 
memory required is so great that the server starts swapping and the execution 
rate goes to a standstill(together with the rest of the things running on that 
AIX-server). The same will probably happen with DBM.
 
I essentially want to "stream" the output of the sql through a filter while it 
is returned from the TSM-server, not glob it up in memory first and then 
process it later. Have not found out yet with DBI. Some database guy suggested 
use of cursors which make it possible to fetch one row at a time and have the 
database keep track of your progress, but I don't know the first thing about 
cursors and I don't think the TSM database is designed for those needs anyway.
 
I will try the ODBC-module Andrew suggested and let you know if I can make it 
work in a stream/filter fashion.
 
 
Best regards
 
Hans Christian Riksheim
 
 
 
 



Fra: ADSM: Dist Stor Manager på vegne av Richard Sims
Sendt: ma 04.12.2006 15:46
Til: ADSM-L@VM.MARIST.EDU
Emne: Re: Using the DBI::TSM module, memory problem



On Dec 4, 2006, at 9:21 AM, Hans Christian Riksheim wrote:

> The problem is that I want to process the output of sql statements
> line by line as it is returned by the TSM-server, but it seems that
> DBI requires the output of the whole sql to be stored in memory
> before I can process it. For large queries the memory consumption
> will be huge.

Hans -

I faced this problem with a reporting program which had to operate on
a very large data set, which caused perl to exhibit flakey problems
(Illegal Instruction) as it exhausted its memory - even with Unix
Resource Limits high.  I initially approached dealing with this via
perl's dbm capabilities, which allows hashes to be on disk rather
than in memory.  Still, that was not enough.  I eventually resolved
the issue by switching to 64-bit perl, which provided the elbow room
necessary.

Richard Sims


This email 
originates from Steria AS, Biskop Gunnerus' gate 14a, N-0051 OSLO, 
http://www.steria.no. This email and any attachments may contain 
confidential/intellectual property/copyright information and is only for the 
use of the addressee(s). You are prohibited from copying, forwarding, 
disclosing, saving or otherwise using it in any way if you are not the 
addressee(s) or responsible for delivery. If you receive this email by mistake, 
please advise the sender and cancel it immediately. Steria may monitor the 
content of emails within its network to ensure compliance with its policies and 
procedures. Any email is susceptible to alteration and its integrity cannot be 
assured. Steria shall not be liable if the message is altered, modified, 
falsified, or even edited.


Re: Using the DBI::TSM module, memory problem

2006-12-04 Thread Richard Sims

On Dec 4, 2006, at 9:21 AM, Hans Christian Riksheim wrote:


The problem is that I want to process the output of sql statements
line by line as it is returned by the TSM-server, but it seems that
DBI requires the output of the whole sql to be stored in memory
before I can process it. For large queries the memory consumption
will be huge.


Hans -

I faced this problem with a reporting program which had to operate on
a very large data set, which caused perl to exhibit flakey problems
(Illegal Instruction) as it exhausted its memory - even with Unix
Resource Limits high.  I initially approached dealing with this via
perl's dbm capabilities, which allows hashes to be on disk rather
than in memory.  Still, that was not enough.  I eventually resolved
the issue by switching to 64-bit perl, which provided the elbow room
necessary.

   Richard Sims


Re: Using the DBI::TSM module, memory problem

2006-12-04 Thread Andrew Raibeck
I cannot remember if I ever tried using DBI, but I have been able to get 
Perl to work with the TSM ODBC driver by using the Roth ODBC package for 
Perl. Go to the URL in my sig, then do a search on:

   odbc perl

to find an article on using Perl with the TSM ODBC driver. I believe my 
example in the article does not have the memory consumption issue that you 
describe.

Regards,

Andy

Andy Raibeck
IBM Software Group
Tivoli Storage Manager Client Development
Internal Notes e-mail: Andrew Raibeck/Tucson/[EMAIL PROTECTED]
Internet e-mail: [EMAIL PROTECTED]

IBM Tivoli Storage Manager support web page: 
http://www.ibm.com/software/sysmgmt/products/support/IBMTivoliStorageManager.html

The only dumb question is the one that goes unasked.
The command line is your friend.
"Good enough" is the enemy of excellence.

"ADSM: Dist Stor Manager"  wrote on 12/04/2006 
06:21:07 AM:

> Hi,
> 
> anybody else using the perl module DBI::TSM?
> 
> I've recently been using it for reporting purposes and find it quite
> useful. However I have some problems with the perl memory 
> consumption. This may very well be caused by my lack of knowledge of
> DBI, but I haven't found a remedy yet.
> 
> The problem is that I want to process the output of sql statements 
> line by line as it is returned by the TSM-server, but it seems that 
> DBI requires the output of the whole sql to be stored in memory 
> before I can process it. For large queries the memory consumption 
> will be huge.
> 
> Example(I know this is a simple one where it could be all handled in
> the sql itself, this is just for explaining my problem):
> 
> 
> # Set up the database handle:
> $dbh=DBI->connect("DBI::TSM:myserver", 'admin', 'password¨');
> 
> # SQL that will return a zillion rows
> $sql='select description as "DESC" from archives where 
node_name='MYNODE'';
> 
> $sth=$dbh->prepare($sql);
> 
> $sth->execute;  #This is the one that slurps my memory
> 
> while($row_ref)=$sth->fetchrow_hashref){
>   #this processes each row that is already stored in memory
>   $archive_names{$row_ref->{'DESC'}}='';
> }
> 
> 
> I guess I could avoid the problem by using "dsmadmc" directly and 
> write the output to a file that i could later have perl parse, but 
> that would not be a very elegant solution. I want it to work with DBI.
> 
> 
> Any ideas?
> 
> 
> Best regards,
> 
> Hans Chr. Riksheim
> 
> 
> 
> 
> 
> 
> This 
> email originates from Steria AS, Biskop Gunnerus' gate 14a, N-0051 OSLO, 

> http://www.steria.no. This email and any attachments may contain 
> confidential/intellectual property/copyright information and is only
> for the use of the addressee(s). You are prohibited from copying, 
> forwarding, disclosing, saving or otherwise using it in any way if 
> you are not the addressee(s) or responsible for delivery. If you 
> receive this email by mistake, please advise the sender and cancel 
> it immediately. Steria may monitor the content of emails within its 
> network to ensure compliance with its policies and procedures. Any 
> email is susceptible to alteration and its integrity cannot be 
> assured. Steria shall not be liable if the message is altered, 
> modified, falsified, or even edited.