Re: [pca] interesting patch problem with Oracle Access Manager Version: 11.1.1.5.0

2013-07-16 Thread Brookins, Neil (Philadelphia)
In the past, I've been bitten by just about everything that could go wrong at
one time or another.
Over the years I've developed contingencies that avoid these problems as best
as possible.

One of my past issues was not being able to get patches onto a prod box in a
timely manner, usually due to an external issue out of my control.
The process I follow now makes this problem impossible -- because the patches
are staged in advance.

Sorry about the length of this post. Hopefully something I say here helps 
someone.

Here is the general flow I follow for patching...
0) Download the patches from Oracle to a shared NFS mount point (./pca
--download ...).
1) Install patches onto a VM/host that no one uses, just to see if it breaks
really badly (such as not booting up).
2) Back out the patches to see if the back-out fails (using Live Upgrade to
activate the previous boot environment).
3) Re-install the same patches on the same VM/host.
4) Install patches from the shared NFS mount point onto all Lab/DEV boxes.
5) Stage patches from step #0 to QA/DR via SFTP.
6) Install patches in QA/DR.
7) Stage patches from step #0 to Prod via SFTP.
8) Install patches in Prod.

As soon as step #4 has been tested, we know that the patches are good, and the 
list of patches is frozen in a folder as a bunch of .zip files.
We can then do steps #5 and #7 right away -- which may be several weeks before 
#6 and #8 occur.

Points to note: Prod servers only allow incoming port 22, not outbound 80 and
443, so they can't use an HTTP caching proxy server to get patches.
Each patch's .zip file only goes across the WAN once per datacenter, via SFTP.
Then, from that one point, it is copied via SFTP to each host over the LAN.
Also, only step #0 uses the Internet to access Oracle's web site. Step #0 runs
pca --fromfiles=DIR ... multiple times, once for each host. This way we get
all needed patches for all hosts at one time in advance. Later, in steps #6 and
#8, PCA does not need to pull anything via port 80/443 because all patches are
already in the --patchdir=DIR location.

If new patches become available after step #1 and before step #8, they are NOT
included in the patches installed in step #8.
For steps #1-8 above, we use pca --nocheckxref --xrefdir=DIR --patchdir=DIR,
where DIR is the same location as the patches stored in steps #0, #5, and #7.
This prevents PCA from changing the list of patches after they are frozen.
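
The download and install passes described above could be scripted roughly as follows. This is a minimal sketch that only prints the pca command lines instead of running them. The STAGE path and the per-host input directories are hypothetical, and the `--install` option and `missing` operand are assumed from common pca usage rather than taken from this post.

```shell
#!/bin/sh
# Sketch only: print (do not run) the pca command lines for the staged workflow.
# STAGE is a hypothetical frozen directory holding patchdiag.xref and the .zip files.
STAGE=${STAGE:-/net/patches/frozen}

# Step 0: one download pass per host, driven by the files collected from that
# host (each argument is a hypothetical directory usable with --fromfiles).
download_cmds() {
    for hostdir in "$@"; do
        echo "pca --fromfiles=$hostdir --xrefdir=$STAGE --patchdir=$STAGE --download missing"
    done
}

# Steps 1-8: install only from the frozen directory, never refreshing the xref.
install_cmd() {
    echo "pca --nocheckxref --xrefdir=$STAGE --patchdir=$STAGE --install missing"
}

# Example: two hypothetical hosts, then the install line used on every box.
download_cmds /net/patches/hosts/hostA /net/patches/hosts/hostB
install_cmd
```

Printing the commands first makes it easy to review exactly what would run against the frozen directory before anything touches a host.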

Neil G. Brookins
Identity and Authentication Solutions - IT Global Solutions
Towers Watson
1500 Market Street | Philadelphia, PA 19102
Phone: +1 215 246 6046
neil.brook...@towerswatson.com


Re: [pca] interesting patch problem with Oracle Access Manager Version: 11.1.1.5.0

2013-07-15 Thread Martin Paul

On 13.07.2013 21:49, Paul Lanken wrote:

 Under normal circumstances I am able to download a collection of missing
 patches from the oracle patch server with PCA just fine.  Not sure if what
 I am seeing is abnormal however I normally just use PCA to --download
 the missing patches and then deal with things later.


That's definitely abnormal. Strange thing is that this is the first 
report I ever got about such an issue.


Right this morning I downloaded more than 50 patches within 15 minutes, 
and it was fine. Like many others I download all my patches via a pca 
local caching proxy, so all requests come from the same IP and use the 
same MOS account - no problem at all.


So either this is a new issue (maybe Oracle changed their server setup - 
Don, are you listening?) or it was a temporary problem.


Can you please try again in the next few days and report whether the problem
persists? If so, please use the --debug option with pca and send the
output of one of the failing patch downloads.


Martin.



Re: [pca] interesting patch problem with Oracle Access Manager Version: 11.1.1.5.0

2013-07-15 Thread Jan Holzhueter
Hi,

On 15.07.13 12:34, Martin Paul wrote:
 On 13.07.2013 21:49, Paul Lanken wrote:

 Right this morning I downloaded more than 50 patches within 15 minutes,
 and it was fine. Like many others I download all my patches via a pca
 local caching proxy, so all requests come from the same IP and use the
 same MOS account - no problem at all.
 
 So either this is a new issue (maybe Oracle changed their server setup -
 Don, are you listening?) or it was a temporary problem.

I hit it this weekend too, since Oracle released a lot of new patches :)

So I had to clean up my proxy to not have bad patches. (Maybe some check
if the downloaded file is a zipfile would be nice.)
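
A check along those lines could look like the sketch below. It is not the check that pca itself performs; it simply tests for the ZIP local-file-header signature at the start of each file, on the assumption that real patch archives are non-empty zip files and that `dd`, `od -An -tx1` and `tr` behave as on a POSIX system.

```shell
#!/bin/sh
# is_zip FILE: succeed only if FILE begins with the ZIP local-file-header
# signature "PK\003\004" (hex 50 4b 03 04).
is_zip() {
    sig=$(dd if="$1" bs=4 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "504b0304" ]
}

# bad_patches DIR: list the *.zip files in DIR that are not really zip archives,
# for example HTML error pages saved under a patch file name.
bad_patches() {
    for f in "$1"/*.zip; do
        [ -f "$f" ] || continue
        is_zip "$f" || echo "$f"
    done
}
```

Running `bad_patches` against the proxy cache or the patch directory gives a list to review before deleting anything.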

Looks like it was a temporary problem. This morning after the cleanup the
download of the broken patches went fine. (Not sure about the number,
but it was more than 10.)

Greetings
Jan







Re: [pca] interesting patch problem with Oracle Access Manager Version: 11.1.1.5.0

2013-07-15 Thread Martin Paul

On 15.07.2013 14:23, Jan Holzhueter wrote:

 So I had to clean up my proxy to not have bad patches. (Maybe some check
 if the downloaded file is a zipfile would be nice.)

There is a check in the pca code, but it seems to have failed in this
case.

 Looks like it was a temporary problem. This morning after the cleanup the
 download of the broken patches went fine. (Not sure about the number,
 but it was more than 10.)

Thanks for the report, good to hear that!

Martin.



Re: [pca] interesting patch problem with Oracle Access Manager Version: 11.1.1.5.0

2013-07-15 Thread Paul Lanken
On Mon, Jul 15, 2013 at 8:33 AM, Martin Paul martin.p...@univie.ac.at wrote:
 On 15.07.2013 14:23, Jan Holzhueter wrote:

 So I had to clean up my proxy to not have bad patches. (Maybe some check
 if the downloaded file is a zipfile would be nice.)

 There is a check in the pca code, but it seems as if it failed in this case.

 Looks like it was a temporary problem. This morning after the cleanup the
 download of the broken patches went fine. (Not sure about the number,
 but it was more than 10.)

 Thanks for the report, good to hear that!

I can confirm that the problem seemed to simply go away all on its own
sometime last night.  I did have a downtime window for Saturday night but
missed it entirely because I could not get the patches in a reasonable
fashion.  Something odd and clearly, based on this mailing list, an obscure
issue never seen before.

So please let me share an idea, because I found myself in a sysadmin situation
where I had a downtime window approved on production hardware and was watching
it slip past.  I felt a bit of stress that night; however, Neil G. Brookins
( see this thread ) changed my life with his posting.  I was wondering what to
do with some of the bigger Niagara servers which have thirty zones or more.
Normally I get a downtime window starting Friday night and run all weekend to
apply patches. Stupid me.

Regardless, I was stuck a bit. I saw a few patches being downloaded and then a
whole stack of files that all looked like NN-RR.zip ( where N and R are
digits ) but the files were just html error messages.  I thought I could
sed/awk/cut/tr script-fu my way around this and send the PCA process into a
loop until I had all the patches. I even wrote a script to get the list of
missing patches first and then post-process that with awk to create another
script which used PCA to pull down the patches one at a time with a sleep 30
in between each call to PCA.  Ugly, and it also did not help.

So there I was, coffee cup in hand, looking at the clock, and then an idea
struck me.

- idea to use patchdiag.CHECKSUMS file to verify patch integrity

I am sure you have heard it all before.  Hardly new.  There is a public file
available at this url ( from my panicky scripting that night ) :

# do not pass user and passwd for this ... it is public
# Be silent about this. The info is nice to have and not a need at this time.
$PCA_WGET --output-file=/dev/null --quiet --continue \
  --no-check-certificate \
  --output-document="$PCA_XREFDIR/patchdiag.CHECKSUMS" \
  https://getupdates.oracle.com/reports/CHECKSUMS > /dev/null 2>&1

# wget may leave an empty file behind on failure, so test for non-empty
if [ ! -s "$PCA_XREFDIR/patchdiag.CHECKSUMS" ]; then
    /bin/printf "WARN : patchdiag.CHECKSUMS could not be fetched.\n"
fi

So then in that file, for any given patch ( I guess ) there is an MD5 hash but
it is on a separate line from the patch filename.

So for this patch :

148931-03 Jul/12/13 Oracle Solaris Studio 12.3: Patch for Performance Library

I see this :

# /usr/xpg4/bin/grep -n 148931-03 pca_data/xref/patchdiag.CHECKSUMS
366421:148931-03.zip

# cat -n pca_data/xref/patchdiag.CHECKSUMS | head -366424 | tail -4
366421  148931-03.zip
366422  MD5: b848434ebf59e507e347cb99ff7eb1ba
366423  SysV Sum: 63367   256161
366424  Sum: 46010   256161

So some script-fu with sed and tr etc etc can get me the MD5 hash.

 and down the rabbit hole I go to verify the patches etc etc.
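
For what it is worth, the lookup does not need much sed or tr. The sketch below assumes the layout shown in the excerpt above (a line holding only the file name, followed by a line of the form "MD5: <hash>") and a POSIX shell such as ksh or /usr/xpg4/bin/sh. The helper names are hypothetical, and `md5_of` assumes either Solaris digest(1) or md5sum is available.

```shell
#!/bin/sh
# checksum_for NAME FILE: print the MD5 recorded for NAME in a CHECKSUMS-style
# listing (name on its own line, "MD5: <hash>" on a following line).
checksum_for() {
    awk -v name="$1" '
        found && $1 == "MD5:" { print $2; exit }
        found && NF == 1      { exit }      # reached the next entry without an MD5
        $1 == name && NF == 1 { found = 1 }
    ' "$2"
}

# md5_of FILE: print the MD5 of FILE. Solaris 10 ships digest(1); md5sum is
# the fallback assumed for other systems.
md5_of() {
    if command -v digest >/dev/null 2>&1; then
        digest -a md5 "$1"
    else
        md5sum "$1" | awk '{ print $1 }'
    fi
}

# verify_patch ZIP CHECKSUMS: succeed only if the MD5 of ZIP matches the listing.
verify_patch() {
    want=$(checksum_for "$(basename "$1")" "$2")
    have=$(md5_of "$1")
    [ -n "$want" ] && [ "$want" = "$have" ]
}
```

A loop such as `for f in "$PCA_PATCHDIR"/*.zip; do verify_patch "$f" "$PCA_XREFDIR/patchdiag.CHECKSUMS" || echo "BAD: $f"; done` would then flag the HTML files, since their hashes cannot match.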

I am sure you have heard it all before.

Paul



Re: [pca] interesting patch problem with Oracle Access Manager Version: 11.1.1.5.0

2013-07-13 Thread Brookins, Neil (Philadelphia)
I've noticed the same long patch install time as you (24 hours or more when
using many zones).
However, my (end user) downtime is only about 5 minutes. That's the time it
takes to reboot.
We use Live Upgrade to make a clone of the OS and patch the clone, which is not
the OS copy that's in active use.
Then I activate the patched copy just before I reboot into that version.

Using this method, we keep running live in production -- without seeing any
instability -- while the patches install without changing the OS that is in use.
After rebooting, if anyone complains that a patch broke something, all we do is
activate the old boot environment again, do a reboot,
and we are back up and running without the patches again in another 5 minutes.
This ability to back out the patches quickly when they break stuff is
important to us in production.
I can't imagine having our production systems down for over 24 hours. That's
simply not possible with customers around the world using it 24x7.
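
As a rough illustration of that sequence, the sketch below prints a possible command plan without executing anything. lucreate, lumount, luumount and luactivate are the Solaris Live Upgrade tools; the boot environment name, mount point and patch directory are hypothetical, and the use of pca with `--root` against the mounted clone is an assumption, not something stated in this post.

```shell
#!/bin/sh
# Sketch only: print the command sequence for patching an inactive boot
# environment (BE). Nothing is executed here.
lu_patch_plan() {
    be=$1; mnt=$2; dir=$3
    echo "lucreate -n $be"        # clone the running OS
    echo "lumount $be $mnt"       # mount the clone
    echo "pca --root=$mnt --nocheckxref --xrefdir=$dir --patchdir=$dir --install missing"
    echo "luumount $be"
    echo "luactivate $be"         # boot the clone on the next restart
    echo "init 6"                 # the short outage
}

# Back-out: reactivate the previous BE and reboot again.
lu_backout_plan() {
    echo "luactivate $1"
    echo "init 6"
}

lu_patch_plan patched-be /mnt/patched-be /net/patches/frozen
lu_backout_plan previous-be
```

The exact lucreate options depend on the root file system layout, so the printed lines are a starting point for review rather than something to paste into production.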

Regarding slow patch install times, make sure that /etc/patch/pdo.conf has a
line that says num_proc=16.
If you don't do that, it will install patches one zone at a time, which never
uses more than 1 CPU.
With that setting, it will patch the global zone first, then all 16 non-global
zones at the same time.
This can make it go much faster. A Sun V890 with 8 UltraSPARC IV CPUs has 16
cores, so num_proc=16 is probably the value you want to set here.
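
A quick way to confirm the setting before a patch run might be the sketch below. It assumes pdo.conf uses plain key=value lines with "#" comments; the function names are hypothetical, and on Solaris nawk or /usr/xpg4/bin/awk may be needed in place of awk.

```shell
#!/bin/sh
# pdo_num_proc [FILE]: print the num_proc value set in a pdo.conf-style file.
# Defaults to the path named above.
pdo_num_proc() {
    awk -F= '$1 == "num_proc" { print $2 }' "${1:-/etc/patch/pdo.conf}"
}

# pdo_is_parallel [FILE]: succeed only if num_proc is set to more than 1.
pdo_is_parallel() {
    n=$(pdo_num_proc "$1")
    [ -n "$n" ] && [ "$n" -gt 1 ]
}
```

For example, `pdo_is_parallel || echo "zones will be patched one at a time"` gives a warning before the long install starts.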

Neil G. Brookins
Identity and Authentication Solutions - IT Global Solutions
Towers Watson
1500 Market Street | Philadelphia, PA 19102
Phone: +1 215 246 6046
neil.brook...@towerswatson.com

-Original Message-
From: pca-boun...@lists.univie.ac.at [mailto:pca-boun...@lists.univie.ac.at] On 
Behalf Of Paul Lanken
Sent: Saturday, July 13, 2013 3:50 PM
To: pca@lists.univie.ac.at
Subject: [pca] interesting patch problem with Oracle Access Manager Version: 
11.1.1.5.0

Under normal circumstances I am able to download a collection of missing
patches from the oracle patch server with PCA just fine.  Not sure if what
I am seeing is abnormal however I normally just use PCA to --download
the missing patches and then deal with things later.  This allows me to
get a feel for my downtime given that Solaris 10 with a pack of zones takes
simply forever to patch, usually downtime is 24 to 48 hours on a Sun Fire
V890 with 16 or more zones.  So I have to schedule things for once or
twice a year.

Anyway, the problem I see is that after the tenth patch ( or thereabouts )
is downloaded via /usr/sfw/bin/wget and PCA into an archive directory that
is specified by the env var PCA_PATCHDIR I end up with a few normal
looking patches and a pile of strange things that are clearly not patches :

# cd archive
# ls -l
total 498921
-rw-r--r-- 1 root root 4287550 Apr 30 01:49 119315-28.zip
-rw-r--r-- 1 root root 96753460 Apr 20 01:55 119757-27.zip
-rw-r--r-- 1 root root 1070808 Jul 13 02:35 119906-21.zip
-rw-r--r-- 1 root root 73405 Jul 13 02:36 120199-21.zip
.
.
.
-rw-r--r-- 1 root root 1994351 Jul 13 02:44 148561-05.zip
-rw-r--r-- 1 root root 2083 Jul 13 19:26 148689-01.zip
-rw-r--r-- 1 root root 2083 Jul 13 19:26 148691-01.zip
-rw-r--r-- 1 root root 2083 Jul 13 19:26 148693-01.zip
-rw-r--r-- 1 root root 2083 Jul 13 19:26 149022-04.zip
-rw-r--r-- 1 root root 2083 Jul 13 19:26 149063-01.zip
-rw-r--r-- 1 root root 2083 Jul 13 19:26 149067-01.zip
-rw-r--r-- 1 root root 2083 Jul 13 19:26 149112-01.zip
-rw-r--r-- 1 root root 2083 Jul 13 19:26 149167-02.zip
-rw-r--r-- 1 root root 2083 Jul 13 19:26 149171-02.zip

These little files are all html generated by Oracle Access Manager
Version: 11.1.1.5.0 and have a warning in them :

The user has already reached the
maximum allowed number of sessions.
Please close one of the existing sessions
before trying to login again.


So I see that after patch number ten or twenty the Oracle site simply disallows
access perhaps.  Not sure really.

So then I tried to download the patches one at a time.  There are a lot of them
but again after about ten or so I see this on the screen :

Using /root/pca_data/xref/patchdiag.xref from Jul/12/13
Host: neptune (SunOS 5.10/Generic_14-05/sparc/sun4v)
List: 149729-02 (1/0)

Patch  IR   CR RSB Age Synopsis
------ -- - -- --- --- ----------------------------------------
149729 -- < 02 ---   2 SunOS 5.10: ike patch

Looking for 149729-02 (1/1)
Trying Oracle
Trying https://getupdates.oracle.com/ (1/3)
Done
--
Download Summary: 1 total, 1 successful, 0 skipped, 0 failed


So that patch should be fine ... however it isn't.  It is the same little
piece of html.

Not sure what the correct action to take here is .. any thoughts?

Paul



Re: [pca] interesting patch problem with Oracle Access Manager Version: 11.1.1.5.0

2013-07-13 Thread French, David
Though I don't know for sure, my guess would be session cookies.  By default,
wget keeps and reuses cookies only within a single invocation, so each new
invocation starts fresh.  Since pca kicks off a new wget process with each call
to download(), nothing is saved across invocations.

Maybe you can try to add this to your pca.conf file:

wgetopt=--save-cookies=/tmp/PCA_Cookies.txt --load-cookies=/tmp/PCA_Cookies.txt

The idea is that each invocation would then use the same cookies and hopefully
stay in the same session, alleviating your issue.
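
One caveat worth checking: GNU wget's --save-cookies does not write session cookies to the jar unless --keep-session-cookies is also given, so the suggestion above may need that extra flag. The sketch below builds the option string under that assumption. The function name is hypothetical, and whether the wget in /usr/sfw/bin supports the flag is untested here.

```shell
#!/bin/sh
# cookie_opts [JAR]: print wget options that share one cookie jar across
# invocations. The jar is created empty, readable only by its owner, if it
# does not exist yet.
cookie_opts() {
    jar=${1:-/tmp/PCA_Cookies.txt}
    if [ ! -f "$jar" ]; then
        ( umask 077; : > "$jar" )
    fi
    echo "--load-cookies=$jar --save-cookies=$jar --keep-session-cookies"
}

# To produce the pca.conf line in the form used above:
#   echo "wgetopt=$(cookie_opts)"
```

Creating the jar up front avoids a warning from wget about a missing cookie file on the first download.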

Good luck, like I said I'm not sure it will help, but worth a shot.

--Dave
