[no subject]

2021-06-04 Thread J. Eric Wonderley
I have SP running using directory containers.  Does anyone know how it
generates the SUR usage?

I have a cronjob that reports what the server did in the last 24h, and once
a week I get this output:

SUR_ARCH_OCCUPANCY      1         0.00
SUR_OCCUPANCY           1    125290.75
SUR_RET_OCCUPANCY       1         0.00

I suspect it is doing something like:
gen dedupstats "directorypool" * and then q dedupstats "directorypool" *,
then going through every node, computing how much unique data is in each
node, and summing it up.  I can't seem to find a v8.1.x macro that does
this anywhere on the web.
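For reference, a minimal sketch of what I suspect it does.  GENERATE
DEDUPSTATS and QUERY DEDUPSTATS are documented admin commands, but the
SELECT against a DEDUPSTATS view and its column names are my assumption,
not something I have confirmed:

/* hypothetical dsmadmc macro: sur_usage.mac */
/* refresh per-node dedup statistics for the pool */
generate dedupstats directorypool * wait=yes
/* sum unique (non-shared) data per node; view and column names assumed */
select node_name, sum(unique_size_mb) as unique_mb -
  from dedupstats where stgpool_name='DIRECTORYPOOL' -
  group by node_name

Run it with: dsmadmc -id=admin -password=xxx macro sur_usage.mac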


Re: AW: [ADSM-L] anyone running SPv8.1.8 server on rhel7.8?

2020-04-07 Thread J. Eric Wonderley
It was the semaphore settings:

This got it back up:
[root@barge ~]# sysctl -w kernel.sem="250 256000 32 16384"
kernel.sem = 250 256000 32 16384
[root@barge ~]# ps -ef | grep db2
root  3494 1  0 Apr06 ?00:00:04
/opt/tivoli/tsm/db2/bin/db2fmcd
root 21406 30720  0 08:56 pts/000:00:00 grep --color=auto db2
[root@barge ~]# kill -HUP 3494
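Note that sysctl -w is runtime-only; a sketch of making it survive reboots
(the drop-in file name is my choice, not from the thread):

# /etc/sysctl.d/99-tsm-db2.conf, values are SEMMSL SEMMNS SEMOPM SEMMNI
kernel.sem = 250 256000 32 16384

then reload all config files without a reboot:

[root@barge ~]# sysctl --system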

Thanks guys!

On Tue, Apr 7, 2020 at 8:46 AM Uwe Schreiber 
wrote:

> The cause could be a misconfiguration of the Linux kernel SEMAPHORE
> settings?
> What is the output of "ipcs -l"?
>
> The suggested settings by IBM are:
>
> kernel.sem (SEMMNI) 16384
> kernel.sem (SEMMSL) 250
> kernel.sem (SEMMNS) 256000
> kernel.sem (SEMOPM) 32
>
>
> Regards, Uwe
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager  On Behalf Of J.
> Eric Wonderley
> Sent: Tuesday, April 7, 2020 14:07
> To: ADSM-L@VM.MARIST.EDU
> Subject: [ADSM-L] anyone running SPv8.1.8 server on rhel7.8?
>
> I upgraded our target server from rhel7.6 to rhel7.8 and now dsmserv
> doesn't come up.  The server worked fine on 7.6.
>
> After moving to rhel7.8 I now see this fail on an interactive startup:
> ANR0990I Server restart-recovery in progress.
> ANR0152I Database manager successfully started.
> ANR0172I rdbdb.c(2519): Error encountered performing action
> ActivateDatabase.
> ANR0162W Supplemental database diagnostic information:  -1225:SQLSTATE
> 57049: The operating system process limit has been reached.
> :-1225
> (SQL1225N  The request failed because an operating system process, thread,
> or swap space limit was reached.  SQLSTATE=57049 ).
> ANR0171I dbiconn.c(1936): Error detected on 0:1, database in evaluation
> mode.
> ANR0169E An unexpected error has occurred and the IBM Spectrum Protect
> server is stopping.
> ANR0162W Supplemental database diagnostic information:  -1:57049:-1225
> ([IBM][CLI Driver] SQL1225N  The request failed because an operating system
> process, thread, or swap space limit was reached.
> SQLSTATE=57049
> ).
>
> Transaction hash table contents (slots=256):
>   *** no transactions found ***
>
> Lock hash table contents (slots=3002):
> Note: Enabling trace class TMTIMER will provide additional timing info on
> the following locks
>   *** no locks found ***
>
>
> Per the IBM website I'm still in bounds on OS version.
>
> I also observe:
> [root@barge ~]# ipcs -l
>
> ------ Messages Limits --------
> max queues system wide = 32000
> max size of message (bytes) = 65536
> default max size of queue (bytes) = 65536
>
> ------ Shared Memory Limits --------
> max number of segments = 4096
> max seg size (kbytes) = 263874788
> max total shared memory (kbytes) = 17179869184
> min seg size (bytes) = 1
>
> ------ Semaphore Limits --------
> max number of arrays = 128
> max semaphores per array = 250
> max semaphores system wide = 256000
> max ops per semop call = 32
> semaphore max value = 32767
>
> Any ideas?
>
>
>
> Thanks much
>


anyone running SPv8.1.8 server on rhel7.8?

2020-04-07 Thread J. Eric Wonderley
I upgraded our target server from rhel7.6 to rhel7.8 and now dsmserv
doesn't come up.  The server worked fine on 7.6.

After moving to rhel7.8 I now see this fail on an interactive startup:
ANR0990I Server restart-recovery in progress.
ANR0152I Database manager successfully started.
ANR0172I rdbdb.c(2519): Error encountered performing action
ActivateDatabase.
ANR0162W Supplemental database diagnostic information:  -1225:SQLSTATE
57049: The operating system process limit has been reached.
:-1225
(SQL1225N  The request failed because an operating system process, thread,
or swap space limit was reached.  SQLSTATE=57049
).
ANR0171I dbiconn.c(1936): Error detected on 0:1, database in evaluation
mode.
ANR0169E An unexpected error has occurred and the IBM Spectrum Protect
server is stopping.
ANR0162W Supplemental database diagnostic information:  -1:57049:-1225
([IBM][CLI Driver] SQL1225N  The request failed because an
operating system process, thread, or swap space limit was reached.
SQLSTATE=57049
).

Transaction hash table contents (slots=256):
  *** no transactions found ***

Lock hash table contents (slots=3002):
Note: Enabling trace class TMTIMER will provide additional timing info on
the following locks
  *** no locks found ***


Per the IBM website I'm still in bounds on OS version.

I also observe:
[root@barge ~]# ipcs -l

------ Messages Limits --------
max queues system wide = 32000
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 263874788
max total shared memory (kbytes) = 17179869184
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 256000
max ops per semop call = 32
semaphore max value = 32767
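For comparison, the live semaphore tuple can also be read in one line;
sysctl prints it in SEMMSL SEMMNS SEMOPM SEMMNI order, so the state above
corresponds to:

[root@barge ~]# sysctl kernel.sem
kernel.sem = 250 256000 32 128

That last field (SEMMNI, the "max number of arrays" above) is the one
sitting at 128 against IBM's suggested 16384.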

Any ideas?



Thanks much


server does lots of automated container movement

2019-12-16 Thread J. Eric Wonderley
I have a source and a target server, both v8.1.8, participating in
replication.

Anyhow, I see lots of automated container moves on the target.  Is this
normal?  I don't see it on the source server:

Protect: BARGE>q proc

 Process  Process Description   Process Status
  Number
--------  --------------------  ---------------------------------------------
   8,476  Database Backup       TYPE=FULL in progress. Bytes backed up:
                                1,264 GB. Current output volume(s):
                                /dbbackup2/76524613.DBV,
                                /dbbackup3/76524614.DBV,
                                /dbbackup4/76524615.DBV,
                                /dbbackup1/76524617.DBV.

   8,487  Move Container        Moving container /orion_c8/b7/b71f.dcf.
          (Automatic)           4,673 MB in 21,131 data extent(s) moved.
                                0 bytes in 0 data extent(s) not moved.
                                Elapsed time: 0 Days, 0 Hours, 4 Minutes.

   8,488  Move Container        Moving container /orion_c12/b7/b728.dcf.
          (Automatic)           4,527 MB in 15,017 data extent(s) moved.
                                0 bytes in 0 data extent(s) not moved.
                                Elapsed time: 0 Days, 0 Hours, 4 Minutes.

   8,489  Move Container        Moving container /orion_c8/b7/b732.dcf.
          (Automatic)           4,700 MB in 19,105 data extent(s) moved.
                                0 bytes in 0 data extent(s) not moved.
                                Elapsed time: 0 Days, 0 Hours, 4 Minutes.

   8,490  Move Container        Moving container /orion_c9/b7/b733.dcf.
          (Automatic)           4,624 MB in 24,212 data extent(s) moved.
                                0 bytes in 0 data extent(s) not moved.
                                Elapsed time: 0 Days, 0 Hours, 4 Minutes.

   8,491  Move Container        Moving container /orion_c7/b7/b737.dcf.
          (Automatic)           4,637 MB in 19,409 data extent(s) moved.
                                0 bytes in 0 data extent(s) not moved.
                                Elapsed time: 0 Days, 0 Hours, 4 Minutes.

I also have these default settings on both servers:
Protect: BARGE>q opt DEFRAGCNTRTRIGGER
Session established with server BARGE: Linux/x86_64
  Server Version 8, Release 1, Level 8.000
  Server date/time: 12/16/2019 15:15:53  Last access: 12/16/2019 14:54:24

Server Option          Option Setting
-------------------    --------------
DefragCntrTrigger      90

Protect: BARGE>q opt DEFRAGFSTRIGGER

Server Option          Option Setting
-------------------    --------------
DefragFsTrigger        95
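For anyone chasing the same behavior, a hedged way to look at what those
triggers act on: QUERY CONTAINER is a documented command whose detailed
output shows free space versus total size per container (the container path
here is just taken from the process output above):

Protect: BARGE>q container /orion_c8/b7/b71f.dcf f=d

My reading, not an IBM statement, is that these undocumented Defrag*Trigger
options set the utilization thresholds at which the server starts such
automatic container moves.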


Re: Kernel parameters

2019-12-09 Thread J. Eric Wonderley
Hi Eric:

I'm running v8.1.8 with containers...here's what I have:

vm.dirty_background_ratio = 10
vm.dirty_ratio = 40
vm.dirty_expire_centisecs = 3000
vm.dirty_writeback_centisecs = 500
net.core.rmem_max = 212992
net.core.wmem_max = 212992
net.ipv4.tcp_rmem = 4096 87380 6291456
net.ipv4.tcp_wmem = 4096 16384 4194304
net.core.netdev_max_backlog = 1000
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.tcp_no_metrics_save = 0
[e1derley@yacht ~]$ free
              total        used        free      shared  buff/cache   available
Mem:      528077736    17168096     1300116   394717640   509609524   114504416
Swap:        511996       48068      463928
[e1derley@yacht ~]$ lscpu
Architecture:  x86_64
CPU op-mode(s):32-bit, 64-bit
Byte Order:Little Endian
CPU(s):48
On-line CPU(s) list:   0-47
Thread(s) per core:2
Core(s) per socket:12
Socket(s): 2
NUMA node(s):  2
Vendor ID: GenuineIntel
CPU family:6
Model: 79
Model name:Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Stepping:  1
CPU MHz:   2499.975
CPU max MHz:   2900.0000
CPU min MHz:   1200.0000
BogoMIPS:  4400.22
Virtualization:VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache:  256K
L3 cache:  30720K
NUMA node0 CPU(s):
0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46
NUMA node1 CPU(s):
1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2
x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm
abm 3dnowprefetch epb cat_l3 cdp_l3 intel_pt ibrs ibpb stibp tpr_shadow
vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms
invpcid rtm cqm rdt_a rdseed adx smap xsaveopt cqm_llc cqm_occup_llc
cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts spec_ctrl intel_stibp

Lots of memory and a DB on SSDs are the most important things.

Eric Wonderley


On Mon, Dec 9, 2019 at 8:38 AM Sasa Drnjevic  wrote:

> Hi Eric.
>
> Not using directory containers, but hope it would still help.
>
> Two servers (repl source and target) - each ISP v8.1 with 24 cores and
> 128 GB RAM. Backups stage to rnd disk volumes and then migrate to LTO7.
> All devices are on FC SAN.
>
> --
> 270 active nodes
> 4 TB / 1.5 mil files avg daily ingest
> --
> vm.dirty_background_ratio = 10
> vm.dirty_ratio = 40
> vm.dirty_expire_centisecs = 3000
> vm.dirty_writeback_centisecs = 500
> net.core.rmem_max = 212992
> net.core.wmem_max = 212992
> net.ipv4.tcp_rmem = 4096    87380   6291456
> net.ipv4.tcp_wmem = 4096    16384   4194304
> net.core.netdev_max_backlog = 1000
> net.ipv4.tcp_moderate_rcvbuf = 1
> net.ipv4.tcp_no_metrics_save = 0
> --
>
> I believe those are all RHEL7 defaults.
>
> Never had any performance issues.
>
> Rgds,
>
> --
> Sasa Drnjevic
> www.srce.unizg.hr/en/
>
>
>
>
>
> On 2019-12-09 8:48, Loon, Eric van (ITOP NS) - KLM wrote:
> > Hi guys,
> >
> > I received no response to my question down below. I know it's just a
> > matter of cut-and-paste, so I'd really appreciate it if you could have
> > a look at my request.
> > Thank you very much in advance!
> >
> > Kind regards,
> > Eric van Loon
> > Air France/KLM Storage & Backup
> >
> > From: Loon, Eric van (ITOP NS) - KLM
> > Sent: Thursday, 5 December 2019 14:22
> > To: ADSM-L 
> > Subject: Kernel parameters
> >
> > Hi guys,
> >
> > Our TSM 7.1 servers (with directory containers) were suffering from a
> > very slow response for a long time. We already applied the kernel
> > parameters specified in the installation guide and the SP Blueprints,
> > but since we recently changed several other kernel parameters in one
> > go, things improved a lot. It's still not great, but at least it's not
> > taking ages anymore to set up a new client session.
> > I'm interested in the values you are using on your Linux servers,
> > especially for servers with a large workload on directory containers.
> > If this is the case, can you please send me the output of the following
> > commands:
> >
> > sysctl vm.dirty_background_ratio
> > sysctl vm.dirty_ratio
> > sysctl vm.dirty_expire_centisecs
> > sysctl vm.dirty_writeback_centisecs
> > sysctl net.core.rmem_max
> > sysctl net.core.wmem_max
> > sysctl net.ipv4.tcp_rmem
> > sysctl net.ipv4.tcp_wmem
> > sysctl net.core.netdev_max_backlog
> > sysctl net.ipv4.tcp_moderate_rcvbuf
> > sysctl net.ipv4.tcp_no_metrics_save
> >
> > Thank you VERY much for your help in advance!
> >
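For convenience, sysctl accepts multiple keys in one invocation, so the
whole list Eric asks for can be collected in one shot:

sysctl vm.dirty_background_ratio vm.dirty_ratio vm.dirty_expire_centisecs \
  vm.dirty_writeback_centisecs net.core.rmem_max net.core.wmem_max \
  net.ipv4.tcp_rmem net.ipv4.tcp_wmem net.core.netdev_max_backlog \
  net.ipv4.tcp_moderate_rcvbuf net.ipv4.tcp_no_metrics_save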

backup qumulo filer

2019-11-13 Thread J. Eric Wonderley
We are backing up one with the BA client over NFS.

It's fairly large, so this is slow.

I don't believe it's capable of NDMP.  Does anyone back up Qumulo using some
other method?  I know I could set up several NFS mounts and break up the
directory tree...but I think this would be somewhat complicated.
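A hedged sketch of the mount-splitting approach mentioned above:
VIRTUALMOUNTPOINT and RESOURCEUTILIZATION are standard Unix BA-client
options, but the paths and stanza values here are made up:

* dsm.sys (client system options file), illustrative stanza
SErvername  tsmprod
   COMMMethod          TCPip
   TCPServeraddress    tsm.example.edu
   * present each big subtree as its own filespace so runs can be split up
   VIRTUALMountpoint   /qumulo/projects
   VIRTUALMountpoint   /qumulo/home
   * allow more parallel producer/consumer sessions per dsmc run
   RESOURceutilization 10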


Re: TSM server performance continuing

2019-08-21 Thread J. Eric Wonderley
I get all zeros in the b column...

We have these settings on a Dell R730:
ipcs -l

------ Messages Limits --------
max queues system wide = 516096
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536

------ Shared Memory Limits --------
max number of segments = 129024
max seg size (kbytes) = 528077736
max total shared memory (kbytes) = 17179869184
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 129024
max semaphores per array = 250
max semaphores system wide = 256000
max ops per semop call = 32
semaphore max value = 32767


# Put changes in /etc/sysctl.conf
kernel.shmmni =  49152
kernel.shmmax = 206158430208
kernel.sem = 250 256000 32 49152
kernel.msgmni = 196608
kernel.msgmax = 65536
kernel.msgmnb = 65536
kernel.randomize_va_space = 0
vm.swappiness = 0
vm.overcommit_memory = 0
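To apply these to a running system without a reboot, standard usage is:

sysctl -p /etc/sysctl.conf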




Eric



540-392-1742 (Cell)


On Wed, Aug 21, 2019 at 11:44 AM Loon, Eric van (ITOP NS) - KLM <
eric-van.l...@klm.com> wrote:

> Hi guys,
>
> A few weeks ago I already wrote about the severe performance issues we
> have with our TSM 7.1 servers. In the 'old days' we used to back up our
> clients to TSM 6.3 servers with Data Domains attached. Smaller clients
> backed up through the LAN, large ones through the SAN.
> Our newer servers use LAN-only with directory containers and the
> performance of these servers really sucks. Setting up a session takes
> sometimes almost one minute and a q stg also takes 30 to 50 seconds. I
> noticed that performance is OK when there are no TDP for Oracle sessions
> running, but as soon as they are started the performance starts to drop
> drastically.
> We are really lost on where to look for the cause. I sent numerous logs
> and traces to IBM, but I guess they are out of ideas too since I don't hear
> anything back from them lately. The only thing is that support notices
> delays in DB2, but they don't know why...
> What I noticed on my TSM server is that as soon as there is a load on the
> server, the blocked queue starts to rise. I would like to know if that's
> something to focus on or not.
> Can some of you please run the "vmstat 1" command on their Linux server
> (preferably one with directory containers too) and let me know if you too
> see values other than 0 in the B column?
> Thank you very much for your help in advance!
>
> Kind regards,
> Eric van Loon
> Air France/KLM Storage & Backup
>


Re: NDMP backup to container pool

2019-06-13 Thread J. Eric Wonderley
Yes...I saw that link, and for versions above 8.1.3 it supposedly works.

We have done FILE device-class pools in the past and found performance to be
poor...especially server-side dedup.  I was curious what others have
observed.

On Thu, Jun 13, 2019 at 3:00 PM Michael Prix  wrote:

> Hello,
>
>  well, as always, it depends:
> https://www-01.ibm.com/support/docview.wss?uid=swg22012594
>
> --
> Michael Prix
>
> On Thu, 2019-06-13 at 13:13 -0400, J. Eric Wonderley wrote:
> > Is this activity supported with SP?
> >
> > I recall in the past it was not, but you could go to devc=file
> >
> > Thanks much
>


NDMP backup to container pool

2019-06-13 Thread J. Eric Wonderley
Is this activity supported with SP?

I recall in the past it was not, but you could go to devc=file

Thanks much


Re: backup db failure on 817 rhel7.6

2019-05-20 Thread J. Eric Wonderley
Jonas:
Yes...I used sharedmem too.

Unfortunately I used COMMethod instead of COMMMethod...that killed it.
Support helped me fix it...working now.
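For reference, a hedged sketch of the dbbkapi option file in question; the
server name, port, and log path are illustrative, and the server's
dsmserv.opt needs matching COMMMETHOD SHAREDMEM / SHMPORT entries:

* /opt/tivoli/tsm/server/bin/dbbkapi/dsm.sys, illustrative stanza
SErvername  TSMDBMGR_TSMINST1
   COMMMethod     sharedmem
   SHMPort        1510
   ERRORLOGName   /tsminst1/tsminst1/tsmdbmgr.log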

Thanks

Eric

On Fri, May 17, 2019 at 3:10 AM Jansen, Jonas 
wrote:

> Hi,
>
> do you use CA signed certificates?
> We had a similar issue, because TSM only knows about 4 CAs. We worked
> around this problem by using shared memory connections for database
> backup. This SWG was published:
> https://www-01.ibm.com/support/docview.wss?uid=swg21664425
>
> ---
> Jonas Jansen
>
> IT Center
> Group: Server & Storage
> Department: Systems & Operations
> RWTH Aachen University
> Seffenter Weg 23
> 52074 Aachen
> Tel: +49 241 80-28784
> Fax: +49 241 80-22134
> jan...@itc.rwth-aachen.de
> www.itc.rwth-aachen.de
>
>
> -Original Message-
> From: ADSM: Dist Stor Manager  On Behalf Of Michael
> Prix
> Sent: Thursday, May 16, 2019 10:00 PM
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: [ADSM-L] backup db failure on 817 rhel7.6
>
> Hello,
>
> check whether tsmdbmgr.opt in the instance directory and
> /opt/tivoli/tsm/server/bin/dbbkapi/dsm.sys contain valid data.
>
> --
> Michael Prix
>
> On Thu, 2019-05-16 at 15:51 -0400, J. Eric Wonderley wrote:
> > I'm getting this failure:
> > ANR2017I Administrator SERVER_CONSOLE issued command: BACKUP DB
> > devc=dbfile type=full ANR0984I Process 3 for Database Backup started
> > in the BACKGROUND at
> > 04:33:11 PM.
> > ANR4559I Backup DB is in progress.
> > ANR2280I Full database backup started as process 3.
> > IBM Spectrum Protect:PROFESSOR>
> > ANR8340I FILE volume /dbbackup/55101192.DBV mounted.
> > ANR0513I Process 3 opened output volume /dbbackup/55101192.DBV.
> > ANR1360I Output volume /dbbackup/55101192.DBV opened (sequence number 1).
> > ANR4626I Database backup will use 1 streams for processing with the
> > number originally requested 1.
> > ANR2968E Database backup terminated. Db2 sqlcode: -2033. Db2 sqlerrmc:
> 400
> >  .
> > ANR1361I Output volume /dbbackup/55101192.DBV closed.
> > ANR0515I Process 3 closed volume /dbbackup/55101192.DBV.
> > ANR0985I Process 3 for Database Backup running in the BACKGROUND
> > completed with completion state FAILURE at 04:33:12 PM.
> > ANR1893E Process 3 for Database Backup completed with a completion
> > state of FAILURE.
> >
> > IBM Spectrum Protect Server for Linux/x86_64 - Version 8, Release 1,
> > Level
> > 7.000
> >
> > Has anyone else experienced this before?  It's very repeatable and I
> > cannot seem to find the cause.
>


backup db failure on 817 rhel7.6

2019-05-16 Thread J. Eric Wonderley
I'm getting this failure:
ANR2017I Administrator SERVER_CONSOLE issued command: BACKUP DB devc=dbfile
type=full
ANR0984I Process 3 for Database Backup started in the BACKGROUND at
04:33:11 PM.
ANR4559I Backup DB is in progress.
ANR2280I Full database backup started as process 3.
IBM Spectrum Protect:PROFESSOR>
ANR8340I FILE volume /dbbackup/55101192.DBV mounted.
ANR0513I Process 3 opened output volume /dbbackup/55101192.DBV.
ANR1360I Output volume /dbbackup/55101192.DBV opened (sequence number 1).
ANR4626I Database backup will use 1 streams for processing with the number
originally requested 1.
ANR2968E Database backup terminated. Db2 sqlcode: -2033. Db2 sqlerrmc: 400
 .
ANR1361I Output volume /dbbackup/55101192.DBV closed.
ANR0515I Process 3 closed volume /dbbackup/55101192.DBV.
ANR0985I Process 3 for Database Backup running in the BACKGROUND completed
with completion
state FAILURE at 04:33:12 PM.
ANR1893E Process 3 for Database Backup completed with a completion state of
FAILURE.

IBM Spectrum Protect Server for Linux/x86_64 - Version 8, Release 1, Level
7.000

Has anyone else experienced this before?  It's very repeatable and I cannot
seem to find the cause.


convert stgpool timeout errors?

2019-05-03 Thread J. Eric Wonderley
Anyone ever see fail/timeout errors like this?:
04/29/2019 12:04:20  ANR3691E The transaction failed for process CONVERT
                     STGPOOL, process number 40838, because of a transaction
                     timeout. (SESSION: 50394593, PROCESS: 40838)
04/29/2019 12:04:20  ANR0985I Process 40838 for CONVERT STGPOOL running in
                     the BACKGROUND completed with completion state FAILURE
                     at 12:04:20 PM. (SESSION: 50394593, PROCESS: 40838)
04/29/2019 12:04:20  ANR1893E Process 40838 for CONVERT STGPOOL completed
                     with a completion state of FAILURE. (SESSION: 50394593,
                     PROCESS: 40838)
04/29/2019 12:04:20  ANR0514I Session 50394593 closed volume
                     /archivefile_2/archfile020. (SESSION: 50394593)
04/29/2019 12:04:20  ANR0985I Process 40837 for CONVERT STGPOOL running in
                     the BACKGROUND completed with completion state FAILURE
                     at 12:04:20 PM. (SESSION: 50394593, PROCESS: 40837)
04/29/2019 12:04:20  ANR1893E Process 40837 for CONVERT STGPOOL completed
                     with a completion state of FAILURE. (SESSION: 50394593,
                     PROCESS: 40837)

I'm trying to convert a file pool to container.  I have a ticket opened
with IBM but it's moving sort of slowly...
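For context, the conversion command would be along these lines; CONVERT
STGPOOL and its MAXPROCESS/DURATION parameters are documented, but the pool
names and values here are illustrative:

convert stgpool archivefile newcontainerpool maxprocess=4 duration=60

A shorter DURATION window means each pass commits less work, which is one
knob to try against transaction timeouts (my speculation, not IBM guidance).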

Thanks


Re: best way to avoid long rollback

2019-03-11 Thread J. Eric Wonderley
We are running RHEL 7 on a Dell R730, and we just did a full and a
dbsnap...they usually take about 1.5h for both to complete.

We typically then shut down with systemctl stop tsminst1.

I think last time we stopped TSM about an hour after the DB finished its
backups, and likely restarted TSM about an hour after it stopped.  It takes
roughly an hour for us to do all of the patching, Tripwire, reboot, etc.
that we do.
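For reference, a sketch of that quiesce sequence as admin-console commands;
all three are standard, and the device class name is the one from our
earlier db-backup posts:

disable sessions all
backup db devclass=dbfile type=full wait=yes
halt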

On Mon, Mar 11, 2019 at 12:19 PM Zoltan Forray  wrote:

> I have to ask what OS/hardware/ISP are you running?  What procedure are you
> using to prep for the OS patching  (we stop client sessions/all admin
> processes - do a full DB backup - halt the server)
>
> Our offsite replica server is RHEL 7 on Dell R740xd with 192GB and 3TB SSD
> with the DB currently at 2.3TB used.  We patch monthly and never had it
> take more than 15-minutes from OS reboot to ISP server being available!
>
> On Mon, Mar 11, 2019 at 11:41 AM J. Eric Wonderley 
> wrote:
>
> > We have a pair of tsm servers doing backup and replication.  Each has a
> > database over 1TB on ssd and 512G of memory
> >
> > Our organization likes to do OS patch maintenance every 90d, and doing
> > this requires a stop and restart of DB2.  When would it be best to do
> > maintenance to shorten the rollback time?
> >
> > I would think after completing the db backups.  Last time we did
> > maintenance about 1h after backups completed it took >2h for the db to
> > come up.
> >
> > Thanks
> >
>
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit http://phishing.vcu.edu/
>


best way to avoid long rollback

2019-03-11 Thread J. Eric Wonderley
We have a pair of tsm servers doing backup and replication.  Each has a
database over 1TB on ssd and 512G of memory

Our organization likes to do OS patch maintenance every 90d, and doing this
requires a stop and restart of DB2.  When would it be best to do
maintenance to shorten the rollback time?

I would think after completing the db backups.  Last time we did
maintenance about 1h after backups completed it took >2h for the db to come
up.

Thanks


container with damaged extents

2019-01-31 Thread J. Eric Wonderley
Looking through my actlog I noticed messages like these:
01/30/2019 08:59:07  ANR4847W REPLICATE NODE detected an extent with ID
                     9421329176063255953 on container /orion_c3/18/1843.ncf
                     that is marked damaged. (SESSION: 65110496, PROCESS:
                     31257)
01/30/2019 08:59:08  ANR4847W REPLICATE NODE detected an extent with ID
                     391953834519091250 on container /orion_c2/11/112d.ncf
                     that is marked damaged. (SESSION: 65110496, PROCESS:
                     31257)
01/30/2019 08:59:08  ANR4847W REPLICATE NODE detected an extent with ID
                     9421329176063255953 on container /orion_c3/18/1843.ncf
                     that is marked

Also this:
Protect: YACHT>q damaged orioncontain

Storage Pool Name    Non-Deduplicated    Deduplicated    Cloud Orphaned
                         Extent Count    Extent Count      Extent Count
-----------------    ----------------    ------------    --------------
ORIONCONTAIN                       31               0

What is typically done to resolve this?  I do replication and my target
doesn't show any damaged extents.
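Since the replica is clean, a hedged sketch of the usual recovery direction
(repair damaged extents on the source from the replication target); both
commands exist at this server level, but verify the syntax for your release:

audit container stgpool=orioncontain action=scanall
repair stgpool orioncontain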


Re: rman work with stgpooldir container pools?

2018-12-14 Thread J. Eric Wonderley
Yes...I've see that one before.
However my bad.   The error was coming from a non-container server.  As it
turned out we were out of disk space.

Thanks

On Fri, Dec 14, 2018 at 12:58 PM Deschner, Roger Douglas 
wrote:

> Check your MAXSCRATCH settings for that storage pool. This is a gotcha
> that has bitten me in the past.
>
> Roger Deschner
> University of Illinois at Chicago
> "I have not lost my mind; it is backed up on tape somewhere."
>
> ________
> From: J. Eric Wonderley 
> Sent: Thursday, December 13, 2018 11:46
> Subject: rman work with stgpooldir container pools?
>
> We are getting failures that claim we are out of space when we are NOT
> out of space:
> ORA-19511: non RMAN, but media manager or vendor specific failure, error
> text:
> ANS1311E (RC11) Server out of data storage space
>


rman work with stgpooldir container pools?

2018-12-13 Thread J. Eric Wonderley
We are getting failures that claim we are out of space when we are NOT out
of space:
ORA-19511: non RMAN, but media manager or vendor specific failure, error
text:
ANS1311E (RC11) Server out of data storage space
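For reference, Roger's MAXSCRATCH check amounts to something like the
following, with the pool name illustrative; in the detailed output, compare
"Maximum Scratch Volumes Allowed" against "Number of Scratch Volumes Used":

q stgpool oracle_file_pool f=d
upd stgpool oracle_file_pool maxscratch=500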


running dsmadmc in non-root account

2018-12-12 Thread J. Eric Wonderley
I have TSM v8.1.5 on Linux, and dsmadmc does not run for a non-root user:
[e1derley@yacht ~]$ dsmadmc
ANS1398E Initialization functions cannot open one of the IBM Spectrum
Protect logs or a related file: /reports/dsmerror.log. errno = 13,
Permission denied
[e1derley@yacht ~]$ ll /reports/dsmerror.log
-rw-r--r-- 1 root root 86762 Dec 12 07:11 /reports/dsmerror.log

I have TSM v7.1.4 on AIX, and there it does run.

Has something changed on purpose, or is this a configuration issue?
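One workaround sketch, assuming the cause is just that /reports/dsmerror.log
is root-owned (errno 13 is EACCES): point DSM_LOG at a user-writable
directory, or loosen permissions on the existing log.  DSM_LOG is a
standard client environment variable:

[e1derley@yacht ~]$ export DSM_LOG=$HOME
[e1derley@yacht ~]$ dsmadmc

or, as root:

[root@yacht ~]# chmod 666 /reports/dsmerror.log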


do locked nodes participate in expiration?

2018-11-14 Thread J. Eric Wonderley
We're trying to lock out clients from a server we're moving off of, but we
also want data to expire for the locked-out clients.

Maybe we need to rename the nodes instead?
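For reference, the two options being weighed, as commands (node names
illustrative):

lock node oldclient
rename node oldclient oldclient_retired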


virtualbox causing ANS1029E

2018-09-07 Thread J. Eric Wonderley
I have a client that runs a guest, and inside the guest he has the BA
client installed.  There is a NAT between the host and guest.
He also has a BA client running on the host.

We are seeing this error in the host logs.  The logs are quiet for about
10 minutes, then lots of these errors appear.

Also, this is a laptop using DHCP.

Anyone have experience with this setup?  Advice?


setting up ssl certs for replication

2018-08-22 Thread J. Eric Wonderley
I think I want to set up my cert list to look like:
[tsminst1@barge tsminst1]$ gsk8capicmd_64 -cert -list -db cert.kdb -stashed
Certificates found
* default, - personal, ! trusted, # secret key
! 198.82.161.20:1500:0
*- "TSM Server SelfSigned SHA Key"

This machine can ping server 198.82.161.20

I cleared out the certs and I'm trying to recreate them...because the other
server 198.82.161.20 cannot ping this server.
In the instance directory I get this error:
[tsminst1@barge tsminst1]$ gsk8capicmd_64 -cert -create -setdefault -db
cert.kdb -stashed -label "TSM Server SelfSigned SHA Key" -file
/tsminst1/tsminst1/ocert256.arm
CTGSK3044W No value for parameter "-dn" was provided.

How do I go about creating my certificate list like the one I had above?
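The CTGSK3044W just says the -create call needs a distinguished name.  A
hedged retry sketch follows; the DN value is illustrative and the flag
spellings are from GSKit as I recall them, so verify against
gsk8capicmd_64's own help output:

[tsminst1@barge tsminst1]$ gsk8capicmd_64 -cert -create -setdefault \
    -db cert.kdb -stashed -label "TSM Server SelfSigned SHA Key" \
    -dn "CN=barge,OU=tsminst1" -size 2048 -sigalg SHA256WithRSA

and then re-extract the public .arm file to exchange with the partner
server:

[tsminst1@barge tsminst1]$ gsk8capicmd_64 -cert -extract -db cert.kdb \
    -stashed -label "TSM Server SelfSigned SHA Key" \
    -target /tsminst1/tsminst1/ocert256.arm -format ascii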


Eric