Re: TSM performance very poor, Recovery log is being pinned

2007-08-01 Thread Robert Clark
Switching to RLV:

Some of the CPU time that would go to OS filesystem overhead is freed.
Could this be used to run one more TSM instance?

Memory that would be used for filesystem cache is freed. Could TSM use
this for buffers?

I don't know what effect either would have on overall throughput or efficiency,
but it would be fascinating to find out.

Doesn't the relative ease of setting up RLVs depend on the filesystem they
are compared with? On extent-based filesystems (like NTFS), doesn't dsmfmt
finish very quickly?

[RC]



Re: TSM performance very poor, Recovery log is being pinned

2007-08-01 Thread Richard Sims

On Jul 31, 2007, at 11:59 PM, Stuart Lamble wrote:

> I am not going to enter into a debate about the relative merits of
> raw volumes versus files on filesystems, as I have insufficient
> direct knowledge to judge either way (I'm trusting a more senior
> colleague to make the right call there. :)

I'll jump in anyway...
The pure simplicity of RLVs makes them a joy to implement, compared
to the time-consuming work entailed in implementing a TSM volume
within a file system.  Their simplicity almost makes them mandatory
where rapid disaster recovery is vital, as there's far less TSM
server set-up time getting in the way of recovering your
organization's functionality.  However, RLV access amounts to
unbuffered I/O, and that exacts a performance penalty relative to
file system volumes, where read-ahead provides a nice boost when the
task at hand involves stepping through the volume.  I use RLVs, and
it's apparent that Migration is relatively sluggish, as in a disk
storage pool struggling to stay empty enough to handle all the
incoming client backup data so as to prevent some backups having to
go directly to tape.  The TSM Performance Tuning Guide cautions about
this.

  Richard Sims, Sr. Systems Programmer at Boston University


Re: TSM performance very poor, Recovery log is being pinned

2007-07-31 Thread Stuart Lamble

On 29/07/2007, at 10:03 PM, Stapleton, Mark wrote:

> From: ADSM: Dist Stor Manager on behalf of Craig Ross
>
>> TSM is installed on Solaris 10
>
> This is something that popped right out for me. Do you have your
> storage pools located on raw logical volumes or mounted
> filesystems? If the latter, that might be your problem. Solaris has
> traditionally had incredibly poor throughput performance on mounted
> filesystems.
>
> You might give thought to rebuilding those storage pools on raw
> logical volumes. Of course, that will require that you completely
> flush all data from your disk storage pools to tape storage pools
> first, so as not to lose client data.


A small trap for young players: TSM has constraints in place to stop
it writing to cylinder 0 of a raw volume on Solaris. If you direct
TSM at slice 2, or some other slice that includes cylinder 0, it will
barf, and the error message is rather cryptic (sorry for the
vagueness; it's been a year or so since I bumped into this. At the
time, I was working on the storage pool volume level, but I would
expect to see similar behaviour for a DB or log volume.)

Workaround is simple: make slice 0 include the entire disk starting
from cylinder 1, and use slice 0 as the raw volume.
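
One way to spot the trap before pointing TSM at a slice is to look at the
partition map and flag any slice whose first sector is 0, since such a slice
includes cylinder 0. The sketch below is only an illustration: it assumes the
usual prtvtoc column layout (partition, tag, flags, first sector, sector
count, last sector), and the function names and sample numbers are
hypothetical, not taken from the original poster's system.

```shell
#!/bin/sh
# Print the numbers of slices that start at sector 0 (and so include
# cylinder 0). Reads prtvtoc-style text on stdin; lines starting with
# "*" are comments in that format and are skipped.
slices_with_cyl0() {
    awk '$1 !~ /^\*/ && NF >= 6 && $4 == 0 { print $1 }'
}

# Hypothetical sample map: slice 0 starts at cylinder 1 (sector 16065),
# slice 2 is the conventional whole-disk slice starting at sector 0.
sample_vtoc() {
cat <<'EOF'
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      0    00      16065  71103690  71119754
       2      5    00          0  71119755  71119754
EOF
}

# Lists the slices to avoid as raw TSM volumes in the sample map.
sample_vtoc | slices_with_cyl0
```

On a real system the input would come from something like
prtvtoc /dev/rdsk/<device>s2 rather than the sample function.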

I am not going to enter into a debate about the relative merits of
raw volumes versus files on filesystems, as I have insufficient
direct knowledge to judge either way (I'm trusting a more senior
colleague to make the right call there. :)


Re: TSM performance very poor, Recovery log is being pinned

2007-07-31 Thread Wanda Prather
In the TSM admin guide for AIX, look up "raw volumes".
It has examples.


> Evening. I have been watching these comments with interest, as we are
> currently in the process of building a new TSM server.
> Discussing this with colleagues, we are baffled by how you create the
> TSM log or DB on a raw presented LUN without first creating at least a
> JFS for a mount point. We are running AIX 5.3.

Re: TSM performance very poor, Recovery log is being pinned

2007-07-31 Thread Richard Rhodes
To create a db with raw volumes . . . .

create a vg:  mkvg  . . . .
create log vols:  mklv for each log vol
create dbvols:  mklv for each db vol

create db . . .  Here is the script/cmd I used to create a tsm db on the
raw vols

rsfebkup7p.fenetwork.com:/tsmdata/tsm3/config==>cat z_tsm3_s1_format_db.ksh
#!/bin/ksh
#
# Set the language
#
export LANG=en_US

#
# Max out size of data area
#
ulimit -d unlimited

#
# Allow the server to pack shared memory segments
#
export EXTSHM=ON

# setup to run tsm3
cd /tsmdata/tsm3/config
export PATH=${PATH}:/usr/tivoli/tsm/server/bin
export DSMSERV_DIR=/usr/tivoli/tsm/server/bin
export DSMSERV_CONFIG=/tsmdata/tsm3/config/dsmserv.opt
export DSMSERV_ACCOUNTING_DIR=/tsmdata/tsm3/config

dsmserv_tsm3 format 3 /dev/rtsm3log01 \
  /dev/rtsm3log02 \
  /dev/rtsm3log03 \
9 /dev/rtsm3db01 \
  /dev/rtsm3db02 \
  /dev/rtsm3db03 \
  /dev/rtsm3db04 \
  /dev/rtsm3db05 \
  /dev/rtsm3db06 \
  /dev/rtsm3db07 \
  /dev/rtsm3db08 \
  /dev/rtsm3db09
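
The mkvg/mklv steps are elided in the message above. As a rough sketch only,
the logical volume creation could be generated as a dry run like this. The
volume group name (tsm3vg) and the logical partition counts are hypothetical;
the LV names are chosen so that their raw character devices (/dev/r<lvname>
on AIX) match the paths passed to the format command. The use of "-t raw" as
the LV type is an assumption to check against the AIX mklv documentation.
The function only prints commands; it does not run them.

```shell
#!/bin/sh
# gen_mklv VG PREFIX COUNT LPS
# Print one mklv command per logical volume, named PREFIX01..PREFIXnn.
# Nothing is executed; review the output before running it by hand.
gen_mklv() {
    vg=$1; prefix=$2; count=$3; lps=$4
    i=1
    while [ "$i" -le "$count" ]; do
        # -y sets the LV name; the raw device will be /dev/r<name>
        printf 'mklv -y %s%02d -t raw %s %s\n' "$prefix" "$i" "$vg" "$lps"
        i=$((i + 1))
    done
}

# Three log volumes and nine DB volumes, as in the format command above.
# The LP counts (16 and 40) are placeholders, not sizing advice.
gen_mklv tsm3vg tsm3log 3 16
gen_mklv tsm3vg tsm3db 9 40
```

No filesystem or mount point is involved: dsmserv is simply handed the
/dev/r* character devices.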





Re: TSM performance very poor, Recovery log is being pinned

2007-07-31 Thread Mark Scott
Evening. I have been watching these comments with interest, as we are
currently in the process of building a new TSM server.
Discussing this with colleagues, we are baffled by how you create the
TSM log or DB on a raw presented LUN without first creating at least a
JFS for a mount point. We are running AIX 5.3.

Look forward to your response

Regards 


Re: TSM performance very poor, Recovery log is being pinned

2007-07-31 Thread Craig Ross
Thanks guys.

All this advice is much appreciated.
For the record, my TSM server seems to have returned to more normal
routines.
However, the log is still being pinned: it fills to about 12% and then
flushes. Not ideal, but acceptable.
I am starting to wonder whether my install has always had the log pinning
issue and I just did not know!
I will keep a close eye on it, but I feel I will ultimately install a 2nd
TSM server and migrate some nodes to it.
I did find a badly configured batch of new SAN storage which was promoting
the slowdown/pinning. I have stopped using this particular disk and
performance has returned to normal. This was not obvious before because
the server was busy: I could not pinpoint it easily because, once I stopped
using the trouble spot, TSM did not jump back into life straight away; it
had to process the backlog of requests. And of course I could not afford to
leave the server on go-slow for long periods.

As a result, I am currently reviewing all newly installed storage to ensure
it is running optimally.

I will propose to management that we install new fast disk for the log and
DB, as the pinning is still an issue.

However, this is where the debate continues. I have experimented with DB
volumes, log volumes and storage volumes on both FS and raw volumes and
have not seen any performance difference on the same disks, i.e. I have
deleted volumes, recreated them as RAW and as FS, and seen no difference.
I have read material on both sides of this story and neither seems more
convincing than the other, except that I remember reading somewhere that
the only eyebrow raiser was that with RAW on Solaris you can have issues.
I cannot remember the exact issue, but the potential is there. So, after
testing both formats and finding no difference, I use FS for everything.
If this is an indisputable mistake, please let me know.
Also, we have ten 10 GB volumes for the DB. Should I create more, smaller
ones?

While I am here, I have another cloudy area. Since upgrading the server
from 5.1 to 5.3, installing the IBM tape device driver and adding 4 LTO3
drives, I swear my 6 old LTO1 drives are running slower than before. Are
there any gotchas when installing IBM Tape to get drives running well?

The 6 LTO1 drives are SCSI attached and the 4 new LTO3 drives are fibre.

Cheers

A happier TSM administrator the last 2 days  :>





Re: TSM performance very poor, Recovery log is being pinned

2007-07-30 Thread Roger Deschner
I think you are right about the Log - it need not be spread across
multiple volumes. It's only got one writer.

Your RAID type can affect the performance of the Disk Storage Pools and
the Database dramatically. In particular, RAID5 is very poorly suited
to the Disk Storage Pools, because their workload is 50% writes. RAID5
is also not ideal for the Database, though it can be tolerated for the
Log. RAID10 is much better.
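
To put rough numbers on the RAID5 point: a small write on RAID5 is commonly
costed at 4 back-end disk I/Os (read data, read parity, write data, write
parity), against 2 for a mirrored RAID10 write. The sketch below just does
that arithmetic for a 50% write mix. The function name and the I/O counts
are illustrative assumptions, and large full-stripe writes on RAID5 can
avoid much of this penalty, so treat it as a worst-case estimate rather
than a measurement.

```shell
#!/bin/sh
# backend_ios READS WRITES PENALTY
# Back-end disk I/Os for a front-end mix of READS and WRITES, where each
# write costs PENALTY disk I/Os (4 for RAID5 small writes, 2 for RAID10).
backend_ios() {
    reads=$1; writes=$2; penalty=$3
    echo $(( reads + writes * penalty ))
}

# 1000 front-end I/Os, half of them writes:
backend_ios 500 500 4   # RAID5  -> 2500
backend_ios 500 500 2   # RAID10 -> 1500
```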

You should be using fast disks, not SATA, for the primary Disk Storage
Pools. I've got 10,000rpm IBM SSA disks for these.

I use RAID10 for the Disk Storage Pools. I use JBOD disks with TSM
mirroring for the Log and Database. This is slightly slower than OS
mirroring or RAID-array mirroring, but it is somewhat safer. Each
physical volume for Storage Pools and Database is broken into many
Logical Volumes.

You should be saving your fastest disks for the Database. I've got
15,000rpm disks for the Database. When I moved the Database from
10,000rpm disks to 15,000rpm disks, everything in TSM got noticeably
faster. For instance, DB backups now take 1/3 less time. RAID boxes just
get in the way for the Database; it really runs best on JBOD disks with
TSM doing the mirroring.

Here's a controversial paper written by a guy at Oracle. He says you
should "Stripe And Mirror Everything" (S.A.M.E.) I've read and reread
this several times, and while I definitely do not agree with everything
said, it does raise some very interesting points that definitely apply
to TSM. For one thing he strongly advocates RAID10, as do I.
http://www.oracle.com/technology/deploy/availability/pdf/oow2000_same.pdf

Most of my Log pinning problems have been caused by clients. If a client
suffers a networking problem (typically a half-duplex vs. full-duplex
conflict) and if that client tries to back up a large file such as a
movie, that can pin the log on our system until it fills completely.
Minimum throughput controls in TSM can help here, though it can still
happen. I wrote a daemon that watches the Log fullness and if it gets to
about 70% it cancels the session that has the Log pinned. I still have
problems, because the cancel command can take hours to work if the
client is backing up a large file slowly. If the Log gets to 95% it does
a TSM shutdown command, which is vastly easier to recover from than a
100% full log. At least with a full TSM shutdown, our novice sysadmins'
first impulse, which is to try to restart it, is generally a good thing
to do. It usually restarts with an empty Log in these cases, so they can
claim, "I fixed it!" without knowing the underlying complexities.
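
The watcher's policy can be sketched roughly as below. This is not the
daemon described in the message, only an illustration of its two thresholds
(about 70% to cancel the pinning session, 95% to shut the server down). The
function and variable names are hypothetical. How the percentage is obtained
is left out: a real watcher would have to get it from the server, for
example by parsing QUERY LOG output through the administrative client, and
would then issue CANCEL SESSION or HALT itself.

```shell
#!/bin/sh
# Thresholds taken from the message above.
CANCEL_PCT=70   # cancel the session that has the log pinned
HALT_PCT=95     # shut the server down before the log reaches 100%

# log_action PCT
# Map an integer recovery-log utilisation percentage to an action name.
# Printing the action (rather than running it) keeps this sketch harmless.
log_action() {
    pct=$1
    if [ "$pct" -ge "$HALT_PCT" ]; then
        echo halt
    elif [ "$pct" -ge "$CANCEL_PCT" ]; then
        echo cancel
    else
        echo ok
    fi
}

# Show the decision at a few sample utilisation levels.
for p in 12 70 95; do
    echo "$p% -> $(log_action "$p")"
done
```

Checking the halt threshold first matters: at 95% the cancel has already had
its chance and, as noted above, may take hours to take effect.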

Roger Deschner  University of Illinois at Chicago [EMAIL PROTECTED]
= "Standards are great. That's why there are so many of them." =




On Mon, 30 Jul 2007, Andrew Carlson wrote:

>Could you elaborate on why the log should be in smaller volumes?  I have
>always heard the DB should, because it opens multiple threads with multiple
>volumes, but since the log is sequentially written to for the most part, I
>can't figure out why that should be in multiple volumes.  Thanks.

Re: TSM performance very poor, Recovery log is being pinned

2007-07-30 Thread Andrew Carlson
Could you elaborate on why the log should be in smaller volumes?  I have
always heard the DB should, because it opens multiple threads with multiple
volumes, but since the log is sequentially written to for the most part, I
can't figure out why that should be in multiple volumes.  Thanks.

On 7/30/07, Charles A Hart <[EMAIL PROTECTED]> wrote:
>
> Your DB and Log should be RAW as well, and in small vols (i.e. a 12GB
> log should be in 2-3GB vols; DB vols, depending on the size of the DB,
> should be 5-10GB). Also try to make sure the raw logical vols are
> evenly spread across as many LUNs as possible.
>
> Charles Hart



--
Andy Carlson
---
Gamecube:$150,PSO:$50,Broadband Adapter: $35, Hunters License: $8.95/month,
The feeling of seeing the red box with the item you want in it:Priceless.


Re: TSM performance very poor, Recovery log is being pinned

2007-07-30 Thread Charles A Hart
Your DB and Log should be RAW as well, and in small vols (i.e., a 12GB log
should be in 2-3GB vols; DB vols, depending on the size of the DB, should be
5-10GB vols). Also try to make sure the raw logical vols are evenly spread
across as many LUNs as possible.
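
The sizing advice above can be sketched as a small planning helper. This is only an illustration of the arithmetic, assuming round-robin placement is what "evenly spread" means; the function name, the volume names, and the LUN labels are all hypothetical and nothing here talks to TSM or the OS.

```python
def plan_volumes(total_gb, vol_gb, luns):
    """Split total_gb into volumes of at most vol_gb, assigned round-robin to LUNs.

    Returns a list of (volume_name, size_gb, lun) tuples. The names are
    placeholders for illustration, not real device paths.
    """
    if total_gb <= 0 or vol_gb <= 0 or not luns:
        raise ValueError("total_gb and vol_gb must be positive and luns non-empty")
    plan = []
    remaining = total_gb
    index = 0
    while remaining > 0:
        size = min(vol_gb, remaining)      # last volume may be smaller
        lun = luns[index % len(luns)]      # round-robin keeps the spread even
        plan.append((f"vol{index:02d}", size, lun))
        remaining -= size
        index += 1
    return plan


if __name__ == "__main__":
    # Example from the advice above: a 12GB log cut into 3GB volumes.
    for name, size, lun in plan_volumes(12, 3, ["lun0", "lun1", "lun2", "lun3"]):
        print(f"{name}: {size}GB on {lun}")
```

With four LUNs and a 12GB log in 3GB pieces, each LUN ends up holding exactly one log volume.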

Charles Hart


Re: TSM performance very poor, Recovery log is being pinned

2007-07-29 Thread Stapleton, Mark
From: ADSM: Dist Stor Manager on behalf of Craig Ross
>TSM is installed on Solaris 10

This is something that popped right out for me. Do you have your storage pools 
located on raw logical volumes or mounted filesystems? If the latter, that 
might be your problem. Solaris has traditionally had incredibly poor throughput 
performance on mounted filesystems.

You might give thought to rebuilding those storage pools on raw logical 
volumes. Of course, that will require that you completely flush all data from 
your disk storage pools to tape storage pools first, so as not to lose client 
data.

--
Mark Stapleton ([EMAIL PROTECTED])
Berbee Information Networks (a CDW company)


Re: TSM performance very poor, Recovery log is being pinned

2007-07-27 Thread Craig Ross
Thanks for all the input, guys.

Firstly, sorry for the lack of detail.

TSM is installed on Solaris 10. No, I did not do any benchmarking, as we were
not replacing any existing setup, just adding more. I have 6 LTO drives
already installed and about 17TB of SAN storage which is Primary Random. I
have since added 4 new LTO 3 drives (with different device classes), and they
run better than the LTO 1 drives when the server is not log pinning. Of the
new 15TB of SATA, approx 1TB is Primary Random storage and the remainder is
SEQ file, and it's all FS, not RAW. I have had heavy discussions over RAW vs
FS and have not been able to find a definitive answer.

I don't think clients are causing the issue: any TSM process can pin the
log, from migrations and DB backups to client sessions, once the server
starts getting busy.

Currently (sorry, I'm not in front of the installation) I guess that of the
15TB I have about 60 sequential volumes across 4 stgpools, and I still have
more to define.

I have not had clients utilize this new storage yet; all I have done is
start to migrate data into these stgpools to release some of the legacy
stgpools.


The transport to the DB and Recovery Log is SAN; however, yesterday I created
a local copy and this did not improve things.

The stgpools are on WMS SAN and AMS500 SAN, both SATA disk, all across
Fibre!!

The SAN engineer saw the expected performance out of the disks when
installing them.

I also don't see it being maxsessions, because I can pin the log with 3 or 4
sessions and 3 processes!


I think it's safe to say it's configuration somewhere, because now that I
think about it, it's not taking much load to pin the log: load which TSM
normally copes with OK!!

The next step may be to remove the new disks. Will I just need to unmount
the FS, or will I need to migrate data off the new storage and delete the
volumes and new stgpools?

Thanks



Re: TSM performance very poor, Recovery log is being pinned

2007-07-27 Thread Ian-IT Smith
Do the client backup sessions pin the log? What is the throughput on the
actual client sessions, and are these backups direct to disk? If the
sessions are cancelled, does the system come back to life?

15 TB of SATA sounds like a lot of storage. How has this been
added/configured? What raw throughput do you get on these disks outside of
TSM itself?

You say the LTO3 drives are new. Do you have existing LTO3 drives? Have
you configured them correctly with a new device class etc. if you are mixing
LTO generations in the library?

I have seen this type of pinning/dramatic slowdown before. I saw it
manifest itself by the server hitting the maxsessions limit, as all the
sessions were running so slowly to the disk pool.

Lots of questions, I know, but as you have made multiple changes at the
same time, it's going to be difficult to nail down without additional info.

Ian Smith
---
Core Engineering - Storage


Re: TSM performance very poor, Recovery log is being pinned

2007-07-27 Thread Lawrence Clark
Assuming the SATA disks are on AIX, were the logical volumes that were set up
to hold the TSM volumes defined as JFS2?



Re: TSM performance very poor, Recovery log is being pinned

2007-07-27 Thread Robert Clark

Is the SATA set up as disk storage pools? Is it filesystem or raw
logical volumes?

What is the OS? vmstat or top/topas may give some ideas.

What is the network transport? Fast ethernet?

[RC]



Re: TSM performance very poor, Recovery log is being pinned

2007-07-27 Thread Joy Hanna
You might also check your diskpool volume count. If it's low, assuming
you're using raw logical volumes, you might want to try decreasing the
size of your volumes and thereby increasing the count of your volumes.
A small number of large volumes does not allow several clients or
processes to stream data to the storage pools efficiently.


Joy Hanna
Enterprise Storage Group
I.T. Computer Operations
(503)745-7748
[EMAIL PROTECTED]

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[EMAIL PROTECTED]] On Behalf Of Craig Ross
Sent: Friday, July 27, 2007 2:49 AM
To: ADSM-L@VM.MARIST.EDU
Subject: [ADSM-L] TSM performance very poor, Recovery log is being pinned

Ten days ago I added 15TB of SATA storage and a new fabric with 4 new LTO
drives to our 3584 library. The TSM DB is approx 90GB.

A few days ago I noticed processing had ground to a halt. After digging
around I found that as soon as the server gets busy (maybe 4 processes and
8 or so sessions) the recovery log begins to pin, as shown by
"sh logpinned", and the database gets locks, as shown by running "sh locks".
As a result the server suffers!
Today I have stopped using the new LTO 3 drives and the SATA storage, and
things are coping better but still worse than before: as soon as load is
increased the log pins and processing slows drastically.

Are there any steps I can take which will help my scenario?
Would a DB UNLOAD/RELOAD help that much?

For reference: the recovery log has heaps of room and the DB has heaps of
room (90GB DB with 100GB of room).

Any advice is much appreciated.


Re: TSM performance very poor, Recovery log is being pinned

2007-07-27 Thread Richard Sims

Craig -

You need to perform analysis to identify problem cause, where the TSM
Problem Determination Guide and Performance Tuning Guide will help.

Log pinning is due to prolonged transactions, and is aggravated by
sluggish networking and sluggish TSM servicing of transactions (often
due to underlying disk/tape issues).
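
To illustrate why one prolonged transaction is enough to pin a log that otherwise has plenty of room, here is a toy model of a circular log. It is a sketch under simplifying assumptions (fixed-size records, space reclaimed only up to the oldest still-open transaction) and is not a description of TSM's internal implementation; the class and method names are hypothetical.

```python
class ToyRecoveryLog:
    """Toy circular log: space before the oldest open transaction is reusable."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.head = 0       # next log sequence number (LSN) to be written
        self.active = {}    # transaction id -> LSN at which it began

    def begin(self, txn):
        self.active[txn] = self.head

    def tail(self):
        # The tail cannot move past the start of the oldest open transaction.
        return min(self.active.values(), default=self.head)

    def used(self):
        return self.head - self.tail()

    def pinned_by(self):
        # The oldest open transaction is the one holding the tail in place.
        if not self.active:
            return None
        return min(self.active, key=self.active.get)

    def write(self, txn, records=1):
        if txn not in self.active:
            raise KeyError(f"transaction {txn!r} has not begun")
        if self.used() + records > self.capacity:
            raise RuntimeError(f"log full, pinned by {self.pinned_by()!r}")
        self.head += records

    def commit(self, txn):
        del self.active[txn]


if __name__ == "__main__":
    log = ToyRecoveryLog(capacity=10)
    log.begin("slow-session")        # e.g. a session stalled on slow disk
    log.write("slow-session")
    log.begin("migration")
    log.write("migration", 5)
    log.commit("migration")          # committed, but its space is not freed
    print(log.used(), log.pinned_by())
    log.commit("slow-session")       # tail jumps forward once the pin is gone
    print(log.used(), log.pinned_by())
```

In this model, short transactions that commit promptly cost nothing in the long run, while a single slow one keeps everything written after it in the log until it finishes.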

You can quickly see if your TSM server is "behind" in its rate of
servicing incoming client data by inspecting the TCP receive queue
packets backlog.  In AIX that can be done via the command:
  netstat | head -2 ; netstat | grep -vi dns | grep tcp
If the various entries show a large receive queue value, then it is
likely that your networking is good, but that TSM is not keeping up
with the incoming, as may be caused by the underlying disk, tape, and
I/O path technology that it is using.
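
For anyone who wants to script the receive-queue check described above, the following is a minimal sketch that picks out TCP connections with a backlog from captured netstat text. It assumes the column layout in which Proto, Recv-Q and Send-Q are the first three fields (other platforms, Solaris among them, may order the columns differently); the function name, the threshold, and the sample host names are hypothetical.

```python
def backlogged_connections(netstat_text, min_recv_q=1):
    """Return (recv_q, local, remote) for TCP lines with Recv-Q >= min_recv_q.

    Assumes lines shaped like: Proto Recv-Q Send-Q Local Foreign State.
    Results are sorted with the largest backlog first.
    """
    rows = []
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 5 or not fields[0].lower().startswith("tcp"):
            continue                      # skip headers and non-TCP lines
        try:
            recv_q = int(fields[1])
        except ValueError:
            continue                      # not the layout this sketch expects
        if recv_q >= min_recv_q:
            rows.append((recv_q, fields[3], fields[4]))
    return sorted(rows, reverse=True)


# Made-up sample output, for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q  Local Address     Foreign Address    (state)
tcp4       0      0  tsmsrv.1500       client1.40112      ESTABLISHED
tcp4   65160      0  tsmsrv.1500       client2.51234      ESTABLISHED
tcp4    8192      0  tsmsrv.1500       client3.33001      ESTABLISHED
"""

if __name__ == "__main__":
    for recv_q, local, remote in backlogged_connections(SAMPLE, min_recv_q=4096):
        print(f"{recv_q:>8}  {local} <- {remote}")
```

What counts as a "large" receive queue depends on the socket buffer sizes in use, so the threshold is left as a parameter rather than a fixed value.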

If your clients have recently started backing up very large files
(digital movies is a stereotypical case), then that would certainly
contribute to what you're seeing.  A quick look at TSM accounting
data or ANE Activity Log messages would give a sense of that, and
Query CONTent with a negative Count value on the collocated tape
volumes that the clients are writing to will show biggies.  Query SESSion
during client activity will also help identify consumptive sessions.

Before you gave TSM the new LTO 3 and SATA hardware, I would hope
that you benchmarked it first, to assure that it was providing the
performance you would need in production, and thus uncover any issues
with it beforehand.  A bad RAID choice in disk implementation will
also slow throughput.  Old microcode may have performance-impairing
defects.  A mismatched device driver can cause operational delays.

Don't waste your time or jeopardize your server in doing a TSM db
unload/reload.

You may want to confer with your operating system people to have them
help narrow down the problem area, where they are familiar with all
the specifics of your environment.

   Richard Sims