Because of the heavy IO produced by all my bpdbm processes, there is no way that I can find anything in those logs...
But support has got them all and says everything seems normal.. So what can I do..? I am helpless...

Hampus Lind
Rikspolisstyrelsen
National Police Board
Tel dir: +46 (0)8 - 401 99 43
Tel mob: +46 (0)70 - 217 92 66
E-mail: [EMAIL PROTECTED]

-----Original message-----
From: Justin Piszcz [mailto:[EMAIL PROTECTED]
Sent: 14 February 2007 23:01
To: Hampus Lind
Cc: 'Steven L. Sesar'; 'Bahnmiller, Bryan'; Veritas-bu@mailman.eng.auburn.edu
Subject: Re: SV: SV: SV: [Veritas-bu] Serious master issue...

With VERBOSE = 5

cd /usr/openv/netbackup/logs
tail -f */*date_of_today*

Do you see anything weird relating to memory or corruption?

On Wed, 14 Feb 2007, Hampus Lind wrote:

> I can't tell.... I think it has been there for a while and got worse with
> time..
>
> Hampus Lind
>
> -----Original message-----
> From: Justin Piszcz [mailto:[EMAIL PROTECTED]
> Sent: 14 February 2007 22:58
> To: Hampus Lind
> Cc: 'Steven L. Sesar'; 'Bahnmiller, Bryan'; Veritas-bu@mailman.eng.auburn.edu
> Subject: Re: SV: SV: [Veritas-bu] Serious master issue...
>
> When did this problem happen? Out of the blue or after a patch?
>
> On Wed, 14 Feb 2007, Hampus Lind wrote:
>
>> I have run a couple of tests... And it seems that if I want any info at
>> all from bpdbm -consistency 2 I have to shut down NetBackup and then run
>> the check when everything is down.
>>
>> Even then it takes forever.. Sometimes it gets further than other times...
>>
>> Hampus Lind
>>
>> -----Original message-----
>> From: Justin Piszcz [mailto:[EMAIL PROTECTED]
>> Sent: 14 February 2007 22:47
>> To: Hampus Lind
>> Cc: 'Steven L. Sesar'; 'Bahnmiller, Bryan'; Veritas-bu@mailman.eng.auburn.edu
>> Subject: Re: SV: [Veritas-bu] Serious master issue...
>>
>> Another option is to turn off backups, move the old images out of the way
>> one by one and find what is causing the consistency check to choke. Does
>> it stop on one set of images or does it run through them all but just
>> very slowly?
>>
>> On Wed, 14 Feb 2007, Hampus Lind wrote:
>>
>>> The NBCC doesn't look at the image db, and they keep saying we have a
>>> problem there.. But I don't know how we can fix it or even collect the
>>> info from the db when bpdbm -consistency 2 won't run..
>>>
>>> Hampus Lind
>>>
>>> -----Original message-----
>>> From: Steven L. Sesar [mailto:[EMAIL PROTECTED]
>>> Sent: 14 February 2007 20:53
>>> To: Hampus Lind
>>> Cc: 'Justin Piszcz'; 'Bahnmiller, Bryan'; Veritas-bu@mailman.eng.auburn.edu
>>> Subject: Re: [Veritas-bu] Serious master issue...
>>>
>>> bpdbm -consistency 2 is useless to you, based on the amount of data that
>>> you back up nightly and my own presumption of how long backups run in
>>> your environment. It will take longer to run than your backup domain will
>>> remain idle. If I recall, they have a process which does a better job at
>>> finding catalog/db corruption/inconsistency. I think that it's called
>>> NBCC.
>>>
>>> The problem with NBCC is similar, though. You send them the output of
>>> three commands:
>>>
>>> vmquery -a, bpmedialist -ls, and bpimmedia
>>>
>>> Then, they munge the output of the above commands through a reporting
>>> tool that Symantec will NOT share with end users. At some point later in
>>> the day (hopefully, sooner rather than later), they will send you a
>>> report. You must then take certain actions to correct any discrepancies
>>> found.
>>> The backup system must be completely idle during this time. Restores are
>>> ok, but no backup activity can be taking place.
>>>
>>> Afterwards, you'll run those commands again, they'll generate the report
>>> again, and you'll see how you're doing. It may take you several passes to
>>> get things squared away.
>>>
>>> The problem is that most of us don't have a completely idle backup
>>> infrastructure - at least for long enough for this process to complete. I
>>> didn't when I was an NBU customer. Once you take backups, the reports
>>> become obsolete, as do the results of bpdbm -consistency 2.
>>>
>>> It would not surprise me if bpdbm was leaking memory on your platform.
>>>
>>> --Steve
>>>
>>> Hampus Lind wrote:
>>>
>>> Hi,
>>>
>>> I can't do anything....
>>>
>>> bpdbm -consistency 2 has been running for over 12 hours and hasn't
>>> checked more than 4-5 clients.
>>>
>>> It was the first thing support told me. Your db is corrupted... So I
>>> tried to run the bpdbm -consistency 2 check. The check found some issues,
>>> like expired images which were not removed etc. But when I was about to
>>> remove them manually, the NetBackup db clean process had already taken
>>> care of them..
>>>
>>> So as I understand it, you can have some level of corruption in your db
>>> which NBU cleans out when the clean job runs.
>>>
>>> I am not compressing my catalogs.
>>>
>>> Thanks,
>>>
>>> Hampus Lind
>>>
>>> -----Original message-----
>>> From: Justin Piszcz [mailto:[EMAIL PROTECTED]
>>> Sent: 14 February 2007 20:31
>>> To: Hampus Lind
>>> Cc: 'Bahnmiller, Bryan'; Veritas-bu@mailman.eng.auburn.edu
>>> Subject: Re: [Veritas-bu] Serious master issue...
>>>
>>> Have you run the check_db_consistency? There is a command that checks to
>>> make sure your images are not corrupted!
>>>
>>> I would recommend checking that.
>>>
>>> Also, are you running compression on your catalogs?
>>>
>>> On Wed, 14 Feb 2007, Hampus Lind wrote:
>>>
>>> Thanks Bryan,
>>>
>>> It happens directly after reboot..
>>>
>>> The thing is:
>>>
>>> - I have deactivated all policies
>>> - Stopped our media server
>>> - And then restarted NetBackup on the master.
>>>
>>> So there is absolutely no activity going on (no backup, no user backup,
>>> no restore, no staging), only internal NetBackup work.
>>>
>>> As soon as NetBackup on the master gets active, it starts bpdbm process
>>> after bpdbm process. They consume 100% of both my CPUs and write/read
>>> heavily to the /usr/openv/netbackup/db filesystem.
>>>
>>> When I have no activity at all after a clean start, we have about 42
>>> bpdbm processes and nearly as many bprd processes.
>>>
>>> I can't figure this one out, and support points to disk config or
>>> something else that sounds good in their ears.
>>>
>>> Thanks for all help,
>>>
>>> Hampus Lind
>>>
>>> -----Original message-----
>>> From: Bahnmiller, Bryan [mailto:[EMAIL PROTECTED]
>>> Sent: 14 February 2007 20:04
>>> To: Hampus Lind
>>> Subject: RE: [Veritas-bu] Serious master issue...
>>>
>>> Hampus,
>>>
>>> How quickly does this behaviour start happening after a recycle/reboot? I
>>> worked with an N4000 master running 11i. We did have 8 cpus and 8 GB RAM.
>>> We were running over 15,000 backup jobs daily though. Our catalog was
>>> over 400GB. (Catalog was on EMC DMX disk.) Running good old 3.4 we would
>>> have to reboot the system almost every week.
>>> If you can cleanly re-cycle NetBackup - shut it down, kill all NBU
>>> processes, and then restart it, that should be almost as good.
>>>
>>> Here we are running NBU 5.1mp4 on a Win2K3 master - 2 cpus, 4 GB RAM. (I
>>> inherited the system - not my choice.) We run about 5000 jobs per day, we
>>> have a 280 GB catalog on EMC Clariion. The system will stay stable for 2
>>> weeks pretty easily. 4 weeks starts pushing things. So we usually reboot
>>> our Windows master and media servers every 2 weeks.
>>>
>>> It seems like you will have cumulative problems with NetBackup that can
>>> build up over time. It is way more pronounced on busy systems. We have
>>> another NetBackup system that has 1 Master and 1 Media server. It runs
>>> about 40 jobs per day max. I hardly ever have to reboot those servers.
>>>
>>> Bryan
>>>
>>> Bryan Bahnmiller
>>> ISD Business Continuity
>>> Pier 1 Imports, Inc
>>> 817-252-8570
>>>
>>> _____
>>>
>>> From: [EMAIL PROTECTED]
>>> [mailto:[EMAIL PROTECTED] On Behalf Of Hampus Lind
>>> Sent: Wednesday, February 14, 2007 12:17 PM
>>> To: Veritas-bu@mailman.eng.auburn.edu
>>> Subject: Re: [Veritas-bu] Serious master issue...
>>> Importance: High
>>>
>>> All,
>>>
>>> Now I have been transferred to USA support. God bless America!
>>>
>>> They have told me that they haven't seen such a big installation in over
>>> a year. Strange, I have about 200 clients and back up a couple of TB per
>>> day.. I was under the impression that this was kind of a small
>>> installation..??
>>>
>>> However, they have told me that this is perfectly normal behaviour with
>>> NetBackup. That it produces heavy disk IO and eats all CPU power.
>>> And I was really stupid and told them that I also had a case with HP
>>> earlier on this disk IO problem, so now Symantec support are pointing all
>>> their fingers at HP and our disk setup.
>>>
>>> Our DB is about 60-65 GB and resides on a StorageTek Flexline 380 disk
>>> array (SAN). We run RAID 5 on 146GB FC drives.. I don't really see the
>>> bottleneck there, but I will create a RAID 5 on 73GB 15K FC drives just
>>> to shut NetBackup support up.
>>>
>>> We run a two-CPU HP rp2470 with HP-UX 11.11 as a master server. Shouldn't
>>> this be enough for this installation?
>>>
>>> Ooh well
>>>
>>> If support can't help me, what should I do?? I am desperate!!!
>>>
>>> Hampus Lind
>>>
>>> -----Original message-----
>>> From: [EMAIL PROTECTED]
>>> [mailto:[EMAIL PROTECTED] On behalf of Hampus Lind
>>> Sent: 14 February 2007 12:48
>>> To: Veritas-bu@mailman.eng.auburn.edu
>>> Subject: [Veritas-bu] Serious master issue...
>>> Priority: High
>>>
>>> Hi,
>>>
>>> We have a serious issue here with our master server. The problem occurred
>>> a couple of weeks ago, or at least I found out about it then..
>>>
>>> I was looking at IOs and SCSI queue depth on my master (HP-UX 11.11) when
>>> I saw that we had 4000-6000 SCSI commands in queue, and a disk
>>> utilisation of 100% for the /usr/openv/netbackup/db disk.
>>>
>>> I have patched HP-UX to the latest patch bundle and we run NBU 5.1 MP4.
>>>
>>> HP support said that bpdbm was leaking memory.
>>>
>>> Veritas support is still investigating..
>>> But we have about 30 bpdbm and bprd processes active on our master which
>>> eat both my CPUs and produce tons of IO against our db disk.
>>>
>>> I activated VERBOSE = 5 on the master, and after 15 minutes the bpdbm log
>>> had reached the file size limit on our filesystem, 2 GB.
>>>
>>> Has anyone had similar problems?
>>>
>>> Thanks and regards,
>>>
>>> Hampus Lind
>>>
>>> _______________________________________________
>>> Veritas-bu maillist - Veritas-bu@mailman.eng.auburn.edu
>>> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>>>
>>> --
>>> ===================================
>>>
>>> Steven L. Sesar
>>> Lead Operating Systems Programmer/Analyst
>>> UNIX Application Services R101
>>> The MITRE Corporation
>>> 202 Burlington Road - MS K101
>>> Bedford, MA 01730
>>> tel: (781) 271-7702
>>> fax: (781) 271-2600
>>> mobile: (617) 519-8933
>>> email: [EMAIL PROTECTED]
>>>
>>> ===================================
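
A note on the log question raised in this thread: with VERBOSE = 5 the bpdbm log reportedly hit 2 GB in 15 minutes, so watching it with tail -f for "anything weird relating to memory or corruption" is hard to do by eye. One possible approach is to stream the same set of files (the */*date_of_today* pattern under the logs directory) and keep only lines that match a few keywords. The sketch below is just that, a sketch: the function name scan_logs, the keyword pattern and the max_samples parameter are my own assumptions, and nothing here reflects actual NetBackup message formats.

```python
import glob
import os
import re
from collections import Counter

# Assumed keywords: the thread only says "memory or corruption".
# Adjust to whatever strings actually show up in your logs.
DEFAULT_PATTERN = re.compile(r"memory|corrupt", re.IGNORECASE)


def scan_logs(log_root, date_tag, pattern=DEFAULT_PATTERN, max_samples=20):
    """Stream every file matching <log_root>/*/*<date_tag>* and pick out
    lines that match `pattern`.

    Returns (counts, samples):
      counts  - Counter mapping file path -> number of matching lines
      samples - up to `max_samples` tuples of (path, line_number, text)
    """
    counts = Counter()
    samples = []
    # glob.escape keeps odd characters in the inputs from acting as wildcards.
    wanted = os.path.join(glob.escape(log_root), "*",
                          "*" + glob.escape(date_tag) + "*")
    for path in sorted(glob.glob(wanted)):
        # Read line by line so a multi-GB log never has to fit in memory;
        # errors="replace" stops stray bytes from aborting the scan.
        with open(path, "r", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if pattern.search(line):
                    counts[path] += 1
                    if len(samples) < max_samples:
                        samples.append((path, line_number, line.rstrip("\n")))
    return counts, samples
```

Calling scan_logs("/usr/openv/netbackup/logs", date_tag), where date_tag plays the same role as date_of_today in the tail command above, gives a per-file hit count plus a handful of sample lines to forward to support. It only reads the logs, so it should add read IO but no writes to an already busy master.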