Even stranger. With 6.0 I might expect something like this, but 5.1 MP4 has been rock solid for us. Hmm...

On Wed, 14 Feb 2007, Hampus Lind wrote:

5.1 MP4



Hampus Lind
Rikspolisstyrelsen
National Police Board
Tel dir: +46 (0)8 - 401 99 43
Tel mob: +46 (0)70 - 217 92 66
E-mail: [EMAIL PROTECTED]


-----Original Message-----
From: Justin Piszcz [mailto:[EMAIL PROTECTED]]
Sent: 14 February 2007 23:11
To: Hampus Lind
Cc: 'Steven L. Sesar'; 'Bahnmiller, Bryan';
Veritas-bu@mailman.eng.auburn.edu
Subject: Re: SV: SV: SV: SV: SV: [Veritas-bu] Serious master issue...

Also are you using 5.x or 6.0?

On Wed, 14 Feb 2007, Hampus Lind wrote:

I will try that tomorrow, but I don’t think the problem resides there.

iostat and sar don’t show any strange values; sar -d 1 10 reports under 50%
average usage.

Still, I will try with the fastest FC array/disk we have...


Hampus Lind


-----Original Message-----
From: Justin Piszcz [mailto:[EMAIL PROTECTED]]
Sent: 14 February 2007 23:05
To: Hampus Lind
Cc: 'Steven L. Sesar'; 'Bahnmiller, Bryan';
Veritas-bu@mailman.eng.auburn.edu
Subject: Re: SV: SV: SV: SV: [Veritas-bu] Serious master issue...

Is it possible for you to move the db/images volume to another set of
disks/RAID array?

Then: ln -s /other/location/db/images /usr/openv/netbackup/db/images

That would rule out your array/FC.
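
The relocation suggested above could be sketched roughly like this. It is only an illustration, assuming NetBackup is completely shut down first and that the destination has enough free space; `relocate_dir` is a made-up helper name and `/other/location` is the placeholder from the message, not a real path.

```shell
#!/bin/sh
# relocate_dir SRC DEST: copy SRC to DEST, keep the original as SRC.orig,
# and leave a symlink at SRC pointing to DEST.
# Hypothetical helper; run it only while NetBackup is completely stopped.
relocate_dir() {
    src=$1
    dest=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    [ -e "$dest" ] && { echo "destination exists: $dest" >&2; return 1; }
    mkdir -p "$(dirname "$dest")" || return 1
    cp -Rp "$src" "$dest" || return 1     # copy first so the original stays intact
    mv "$src" "$src.orig" || return 1     # keep the old tree until the copy is verified
    ln -s "$dest" "$src"                  # the old path now resolves to the new location
}

# Example with the paths from the message (the destination is a placeholder):
# relocate_dir /usr/openv/netbackup/db/images /other/location/db/images
```

Keeping the `.orig` tree around makes it easy to back out: remove the symlink and rename the directory back.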

On Wed, 14 Feb 2007, Hampus Lind wrote:

Because of the heavy IO produced by all my bpdbm processes, there is no way
that I can find anything in those logs...

But support has got them all and says everything seems normal. So what can I
do? I am helpless...

Hampus Lind


-----Original Message-----
From: Justin Piszcz [mailto:[EMAIL PROTECTED]]
Sent: 14 February 2007 23:01
To: Hampus Lind
Cc: 'Steven L. Sesar'; 'Bahnmiller, Bryan';
Veritas-bu@mailman.eng.auburn.edu
Subject: Re: SV: SV: SV: [Veritas-bu] Serious master issue...

With VERBOSE = 5

cd /usr/openv/netbackup/logs
tail -f */*date_of_today*

Do you see anything weird relating to memory or corruption?
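
Since the logs grow very quickly at VERBOSE = 5, filtering them may be more practical than tailing everything. A rough sketch, assuming each process has its own subdirectory with one file per day named log.mmddyy (that naming is an assumption here, and `scan_logs` is a made-up helper):

```shell
#!/bin/sh
# scan_logs LOGDIR STAMP: print lines mentioning memory or corruption from
# every */log.STAMP file under LOGDIR, each prefixed with its file name.
# The log.mmddyy naming is an assumption about the log layout.
scan_logs() {
    logdir=$1
    stamp=$2
    for f in "$logdir"/*/log."$stamp"; do
        [ -f "$f" ] || continue
        grep -i -E 'memory|corrupt' "$f" /dev/null   # /dev/null forces the file-name prefix
    done
}

# Example: scan_logs /usr/openv/netbackup/logs "$(date +%m%d%y)"
```

The search pattern is only a starting point; widen it to whatever error strings turn up.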


On Wed, 14 Feb 2007, Hampus Lind wrote:

I can't tell. I think it has been there for a while and has gotten worse
with time.



Hampus Lind


-----Original Message-----
From: Justin Piszcz [mailto:[EMAIL PROTECTED]]
Sent: 14 February 2007 22:58
To: Hampus Lind
Cc: 'Steven L. Sesar'; 'Bahnmiller, Bryan';
Veritas-bu@mailman.eng.auburn.edu
Subject: Re: SV: SV: [Veritas-bu] Serious master issue...

When did this problem happen? Out of the blue or after a patch?

On Wed, 14 Feb 2007, Hampus Lind wrote:

I have run a couple of tests, and it seems that if I want any info at all
from bpdbm -consistency 2, I have to shut down NetBackup and then run the
check when everything is down.

Even then it takes forever. Sometimes it gets further than other times...


Hampus Lind


-----Original Message-----
From: Justin Piszcz [mailto:[EMAIL PROTECTED]]
Sent: 14 February 2007 22:47
To: Hampus Lind
Cc: 'Steven L. Sesar'; 'Bahnmiller, Bryan';
Veritas-bu@mailman.eng.auburn.edu
Subject: Re: SV: [Veritas-bu] Serious master issue...

Another option is to turn off backups, move the old images out of the way
one by one, and find what is causing the consistency check to choke. Does it
stop on one set of images, or does it run through them all but just very
slowly?
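
Moving images aside one set at a time could look something like the sketch below. It assumes the images directory holds one subdirectory per client and that NetBackup is stopped while directories are moved; `park_client` and `unpark_client` are made-up helper names, and the holding directory is whatever scratch location you choose.

```shell
#!/bin/sh
# park_client IMAGES HOLD CLIENT: move one client's image directory out of
# IMAGES into HOLD so the next consistency run skips it.
# unpark_client IMAGES HOLD CLIENT: move it back afterwards.
# Hypothetical helpers; keep HOLD on the same filesystem so mv is a rename.
park_client() {
    [ -d "$1/$3" ] || { echo "no image directory for $3" >&2; return 1; }
    mkdir -p "$2" && mv "$1/$3" "$2/$3"
}

unpark_client() {
    [ -d "$2/$3" ] || { echo "$3 is not parked" >&2; return 1; }
    mv "$2/$3" "$1/$3"
}

# Example (client name and holding directory are placeholders):
# park_client   /usr/openv/netbackup/db/images /usr/openv/netbackup/db/images.hold client01
# unpark_client /usr/openv/netbackup/db/images /usr/openv/netbackup/db/images.hold client01
```

Parking one client, rerunning the check, and timing it should show whether a single set of images is responsible or whether everything is just slow.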

On Wed, 14 Feb 2007, Hampus Lind wrote:

NBCC doesn’t look at the image db, and they keep saying we have a problem
there. But I don’t know how we can fix it, or even collect the info from the
db, when bpdbm -consistency 2 won’t run.



Hampus Lind

-----Original Message-----
From: Steven L. Sesar [mailto:[EMAIL PROTECTED]]
Sent: 14 February 2007 20:53
To: Hampus Lind
Cc: 'Justin Piszcz'; 'Bahnmiller, Bryan';
Veritas-bu@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Serious master issue...



bpdbm -consistency 2 is useless to you, based on the amount of data that you
back up nightly and my own presumption of how long backups run in your
environment. It will take longer to run than your backup domain will remain
idle. If I recall, they have a process which does a better job at finding
catalog/db corruption/inconsistency. I think that it's called NBCC.

The problem with NBCC is similar, though. You send them the output of three
commands:

vmquery -a, bpmedialist -ls, and bpimmedia
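
Collecting those outputs in one place could be sketched like this. The `capture` helper is made up for illustration, the output directory is a placeholder, and the example assumes the three commands are on PATH; the command lines themselves are exactly the ones listed above.

```shell
#!/bin/sh
# capture OUTDIR NAME CMD [ARGS...]: run CMD and save its stdout and stderr
# to OUTDIR/NAME.out; the return status is the command's exit status.
# Hypothetical helper for gathering command output into one directory.
capture() {
    outdir=$1
    name=$2
    shift 2
    mkdir -p "$outdir" || return 1
    "$@" > "$outdir/$name.out" 2>&1
}

# Example for the three commands named above (output directory is a placeholder):
# out=/var/tmp/nbcc.$(date +%Y%m%d%H%M)
# capture "$out" vmquery     vmquery -a
# capture "$out" bpmedialist bpmedialist -ls
# capture "$out" bpimmedia   bpimmedia
```

Using a timestamped directory per pass keeps the before and after outputs separate when the process has to be repeated.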

Then, they munge the output of the above commands through a reporting tool
that Symantec will NOT share with end users. At some point later in the day
(hopefully, sooner rather than later), they will send you a report. You must
then take certain actions to correct any discrepancies found. The backup
system must be completely idle during this time. Restores are ok, but no
backup activity can be taking place.

Afterwards, you'll run those commands again, they'll generate the report
again, and you'll see how you're doing. It may take you several passes to
get things squared away.

The problem is that most of us don't have a completely idle backup
infrastructure - at least not for long enough for this process to complete.
I didn't when I was an NBU customer. Once you take backups, the reports
become obsolete, as do the results of bpdbm -consistency 2.

It would not surprise me if bpdbm was leaking memory on your platform.

--Steve


Hampus Lind wrote:

Hi,

I can't do anything...

bpdbm -consistency 2 has been running for over 12 hours and hasn't checked
more than 4-5 clients.

It was the first thing support told me: your db is corrupted. So I tried to
run the bpdbm -consistency 2 check. The check found some issues, like
expired images which were not removed, etc. But when I was about to remove
them manually, the NetBackup db clean process had already taken care of
them.

So, as I understand it, you can have some level of corruption in your db
which NBU cleans out when the clean job runs.

I am not compressing my catalogs.

Thanks,

Hampus Lind


-----Original Message-----
From: Justin Piszcz [mailto:[EMAIL PROTECTED]]
Sent: 14 February 2007 20:31
To: Hampus Lind
Cc: 'Bahnmiller, Bryan'; Veritas-bu@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Serious master issue...

Have you run the check_db_consistency? There is a command that checks to
make sure your images are not corrupted!

I would recommend checking that.

Also, are you running compression on your catalogs?


On Wed, 14 Feb 2007, Hampus Lind wrote:



Thanks Bryan,

It happens directly after reboot.

The thing is:

- I have deactivated all policies
- Stopped our media server
- And then restarted NetBackup on the master.

So there is absolutely no activity going on (no backup, no user backup, no
restore, no staging), only internal NetBackup work…

As soon as NetBackup on the master becomes active, it starts bpdbm process
after bpdbm process. They consume 100% of both my CPUs and read/write
heavily to the /usr/openv/netbackup/db filesystem.

With no activity at all after a clean start, we have about 42 bpdbm
processes and nearly as many bprd processes…

I can't figure this one out, and support points to disk config or something
else that sounds good in their ears…

Thanks for all help,



Hampus Lind

-----Original Message-----
From: Bahnmiller, Bryan [mailto:[EMAIL PROTECTED]]
Sent: 14 February 2007 20:04
To: Hampus Lind
Subject: RE: [Veritas-bu] Serious master issue...



Hampus,

How quickly does this behaviour start happening after a recycle/reboot? I
worked with an N4000 master running 11i. We did have 8 cpus and 8 GB RAM. We
were running over 15,000 backup jobs daily though. Our catalog was over
400GB. (Catalog was on EMC DMX disk.) Running good old 3.4 we would have to
reboot the system almost every week. If you can cleanly re-cycle NetBackup -
shut it down, kill all NBU processes, and then restart it - that should be
almost as good.

Here we are running NBU 5.1mp4 on a Win2K3 master - 2 cpus, 4 GB RAM. (I
inherited the system - not my choice.) We run about 5000 jobs per day, we
have a 280 GB catalog on EMC Clariion. The system will stay stable for 2
weeks pretty easily. 4 weeks starts pushing things. So we usually reboot our
Windows master and media servers every 2 weeks.

It seems like you will have cumulative problems with NetBackup that can
build up over time. It is way more pronounced on busy systems. We have
another NetBackup system that has 1 Master and 1 Media server. It runs about
40 jobs per day max. I hardly ever have to reboot those servers.



     Bryan

Bryan Bahnmiller
ISD Business Continuity
Pier 1 Imports, Inc
817-252-8570






_____


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Hampus Lind
Sent: Wednesday, February 14, 2007 12:17 PM
To: Veritas-bu@mailman.eng.auburn.edu
Subject: Re: [Veritas-bu] Serious master issue...
Importance: High

All,

Now I have been transferred to USA support… God bless America!

They have told me that they haven’t seen such a big installation in over a
year. Strange: I have about 200 clients and back up a couple of TB per day.
I was under the impression that this was a kinda small installation?

However, they have told me that this is perfectly normal behaviour for
NetBackup: that it produces heavy disk IO and eats all CPU power. And I was
really stupid and told them that I also had a case open with HP earlier
about this disk IO problem, so now Symantec support is pointing all their
fingers at HP and our disk setup.

Our DB is about 60-65 GB and resides on a StorageTek Flexline 380 disk array
(SAN). We run RAID 5 on 146GB FC drives. I don’t really see the bottleneck
there, but I will create a RAID 5 on 73GB 15K FC drives just to shut
NetBackup support up…

We run a two-CPU HP rp2470 with HP-UX 11.11 as the master server. Shouldn’t
this be enough for this installation?

Ooh well…

If support can't help me, what should I do? I am desperate!





Hampus Lind

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Hampus Lind
Sent: 14 February 2007 12:48
To: Veritas-bu@mailman.eng.auburn.edu
Subject: [Veritas-bu] Serious master issue...
Priority: High



Hi,

We have a serious issue here with our master server. The problem occurred a
couple of weeks ago, or at least I found out about it then.

I was looking at IOs and SCSI queue depth on my master (HP-UX 11.11) when I
saw that we had 4000-6000 SCSI commands in the queue, and a disk utilisation
of 100% for the /usr/openv/netbackup/db disk.

I have patched HP-UX to the latest patch bundle and we run NBU 5.1 MP4.

HP support said that bpdbm was leaking memory.

Veritas support is still investigating. But we have about 30 bpdbm and bprd
processes active on our master, which eat both my CPUs and produce tons of
IO against our db disk.

I activated VERBOSE = 5 on the master, and after 15 minutes the bpdbm log
had reached the file size limit on our filesystem, 2 GB…

Has anyone had similar problems?

Thanks and regards,



Hampus Lind







_______________________________________________
Veritas-bu maillist  -  Veritas-bu@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu







--
===================================

  Steven L. Sesar
  Lead Operating Systems Programmer/Analyst
  UNIX Application Services R101
  The MITRE Corporation
  202 Burlington Road - MS K101
  Bedford, MA 01730
  tel: (781) 271-7702
  fax: (781) 271-2600
  mobile: (617) 519-8933
  email: [EMAIL PROTECTED]

===================================




