Re: [Bacula-users] How to discard "tape" (file actually) on failed backup?
> > Job "serverA" runs, and auto-labels a new file as > "/store/serverA-434"... but the job fails because ServerA is > unreachable. So bacula times out, and then proceeds with backup job > "serverB", but it uses the file it's already created for serverA's > backup. > > Any way around this? you could create two pools (daily-serverA / dailay-serverB) instead of one for both. - Thomas -- Lotusphere 2011 Register now for Lotusphere 2011 and learn how to connect the dots, take your collaborative environment to the next level, and enter the era of Social Business. http://p.sf.net/sfu/lotusphere-d2d ___ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users
[Bacula-users] How to discard "tape" (file actually) on failed backup?
Hi bacula-users,

We have configured disk volumes and auto-labeling in such a way that a new
file is created for every backup job, and the job ID is part of the filename:

--
Pool {
  Name = daily
  Pool Type = Backup
  Recycle = no
  Use Volume Once = yes
  AutoPrune = yes                  # Prune expired volumes
  Volume Retention = 8 days        # one week + 1 day margin
  Label Format = "${JobName}-${JobId}"
  Maximum Volume Jobs = 1
  Volume Use Duration = 23h
  ActionOnPurge = Truncate
}
--

This works great, and it's really easy to identify which file is created by
which backup job -- except when a backup fails, because by then the volume has
already been created and named.

I.e., say I have 2 backup jobs, "serverA" and "serverB". Job "serverA" runs,
and auto-labels a new file as "/store/serverA-434"... but the job fails
because ServerA is unreachable. So bacula times out, and then proceeds with
backup job "serverB", but it uses the file it has already created for
serverA's backup.

Any way around this?

Thanks,
D
[Bacula-users] bacula, webmin & FQDN
I've set up a very basic Ubuntu server to run as a headless file server, only
to be controlled through SSH or Webmin. This is running fine on a local
network. Next job is to get some basic local disk backup working and then add
remote backup.

I was drawn to Bacula because it seems to be highly recommended everywhere and
also has a good Webmin module, which seems to be lacking for any other backup
system I have found.

Now, as far as I can tell, the default installation (apt-get install bacula)
sets me up with Director, Storage and local File daemons, and a bacula MySQL
database; and should more-or-less *just work*... but it doesn't, although
Webmin reports all services running OK... it gets stuck "waiting on Storage
File" according to Webmin.

I understand I need to change the address somewhere in the configuration and
have edited bacula-dir.conf:

# Definition of file storage device
Storage {
  Name = File
  # Do not use "localhost" here
  Address = xyz-server
  ...

but I'm not sure if there are other places where I should remove localhost;
and I wonder if "xyz-server" is not really a suitable address, though
according to "hostname" it is my FQDN:

a...@xyz-server:~$ sudo hostname --fqd
xyz-server

My default Storage Device is set to automatically label and mount media, but
there are no Volumes in my Pools. I must admit I am very hazy on the meaning
of the relationships between Storage/Volumes/Pools, especially when I only
have file storage and no tape drive (which bacula seems to have realised OK).

I was hoping that the basics would run OK and I would then be able to unravel
it and see how it works. But with it not working, I have very little notion of
what's wrong. I presume that all that can be wrong is that I have an
unsuitable/confusing FQDN or that the address is entered in the wrong place.
I've uninstalled and reinstalled a few times to be sure the only things I have
changed are this address name and suitable paths for the backup file and the
directories to be backed up.
We will be setting up a "proper" domain name with dyndns.org... might this
help?

Is there a more suitable backup system for our modest needs (never likely to
have more than one server to back up) which has a nice web-based interface? I
would be happy with rsync, but console-only admin scares me a bit - so I
thought I'd easily find something with a simple GUI... but maybe not so
simple. I fear I could have learned a lot about rsync if I'd spent my time
learning that instead of trying to get my head around this!

Any suggestions gratefully received :?

+--
|This was sent by braeds...@gmail.com via Backup Central.
+--
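For what it's worth, the directive being edited here is the address the
Director hands to File daemons so they can reach the Storage daemon. A sketch
of the relevant resource (SDPort 9103 is the stock default; "xyz-server" is
just the hostname from the post, and the Device/Media Type names are
assumptions matching a typical default install):

```
# bacula-dir.conf
Storage {
  Name = File
  Address = xyz-server     # must resolve, from every client, to a real
                           # interface address -- not to 127.0.0.1
  SDPort = 9103
  Password = "..."         # must match the Director entry in bacula-sd.conf
  Device = FileStorage
  Media Type = File
}
```

One common Ubuntu gotcha worth checking: /etc/hosts often maps the hostname to
127.0.1.1, which can work for a single all-in-one server but breaks as soon as
a remote File daemon is told to connect there. Checking what `getent hosts
xyz-server` returns is a quick way to rule that out.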
Re: [Bacula-users] backup slowdown (mysqld) after tape autochange
On Tue, December 14, 2010 11:48 am, Robert Wirth wrote:
> Hi,
>
> strange problem. Here's some hardware where Bacula has been running
> successfully for ca. 5 years. It was release 1.38.11 under Solaris 10 x86.
>
> Last month, we had a system disk crash on the backup system. No backup
> data has been lost. We just had to reinstall the backup system. Since this
> was our only Solaris x86 system, we decided to migrate to Linux and to a
> newer Bacula release. Until the repaired hardware was present, we started
> with a virtualized new system, just for the daily incremental backups to
> disk volumes.
>
> Since most of our actual systems are Ubuntu Hardy server LTS, we chose
> Bacula 2.2.8 of this distribution as our new version (well, it's old, but
> 1.38.11 was running well, and 2.2.8 was the default).
>
> We upgraded Bacula's mysql database with the corresponding script from
> 1.38.11 to 2.2.8. We imported the updated DB using mysqldump into the new
> system, which has MySQL 5.1.41 and Linux kernel 2.6.32. The virtualized
> system worked well all the time.
>
> Now, the hardware version of the system is ready, and a yearly full
> backup, which goes directly to tape, is imminent.
>
> And now, the strange things are coming...
>
> /* The system is a 2x2 core AMD Opteron system, 4 GB RAM, 6x LSI SCSI U320
> Megaraid with separated channels for external disks, tape readers and
> autochanger. 23 TB disk storage on external RAIDs, autochanger and
> HP readers for LTO-3 tapes. System: see above. */
>
> NOW BACKING UP...
>
> Starting a bunch of full backup jobs which fit into 1 SINGLE TAPE
> produces NO PROBLEMS: the jobs start, run and write, and terminate
> within a usual span of time. In so doing, I can back up a dozen
> systems with totally 360 GB on one tape in a few hours.
>
> FACING THE PROBLEM...
> Starting a bunch of full backup jobs that DO NOT FIT into 1 single
> tape proceeds as follows (with a fresh tape forced by setting the
> former one to read-only):
>
> - first, the jobs run well and write their data to the first fresh tape
>   of the corresponding pool. Speed is similar to what we knew from the
>   old OS.
>
> - when the tape is full with around 600 GB of data, it is marked as
>   Full, unloaded, and the next free tape of the pool is loaded.
>
> - from this moment on, writing to the new fresh tape becomes incredibly
>   slow (4 GB/hour) and mysqld constantly has 95%-100% CPU load.
>   No other process has a significant load, and the mysql load isn't
>   reflected in the system's load values:
>
> Cpu(s): 3.3%us, 2.2%sy, 0.0%ni, 91.6%id, 2.1%wa, 0.1%hi, 0.7%si, 0.0%st
> Mem:  3961616k total, 3850072k used, 111544k free, 17532k buffers
> Swap: 3906552k total, 0k used, 3906552k free, 3579956k cached
>
>  PID  USER   PR  NI  VIRT  RES  SHR   S  %CPU  %MEM  TIME+      COMMAND
>  1356 mysql  20   0  144m  31m   2376 S  98    0.8   163:57.79  mysqld
>  1    root   20   0  2620  948   528  S  0     0.0   0:00.63    init
>  2    root   20   0  0     0     0    S  0     0.0   0:00.00    kthreadd
>
> The only further effect I can see is that the table "bacula.JobMedia" is
> growing. No errors in the system log, no mysql errors, nor in Bacula's log.
>
> What I mainly don't understand is why this happens after a tape change.
> The MaxSpoolSize is 32GB, and I'm backing up 7 systems. Each of them
> had several spool steps during the first tape.
>
> From the view of Bacula and its program logic, what has changed when
> the tape has been changed? I guess it's all the same: spooling data,
> writing them to tape and updating the catalog, regardless of first,
> second or later tape...?!?

What do you see under Running Jobs in the 'status dir' output before and
after the first tape has filled? If you have only the 'after' just now, that
might be interesting.
-- 
Dan Langille -- http://langille.org/
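Not something Dan asked for, but while the slow phase is in progress the MySQL
side can be inspected directly. This is a generic diagnostic sketch; the index
at the end is a hypothetical tuning step whose column choice is a guess, not a
confirmed fix for this problem:

```sql
-- Run in the mysql client while a job is crawling:
SHOW FULL PROCESSLIST;            -- shows which catalog query mysqld is busy with
SHOW INDEX FROM bacula.JobMedia;  -- does JobMedia have any usable index?

-- Hypothetical: if the busy query filters JobMedia on unindexed columns,
-- an index like this may help (verify against the actual query first):
CREATE INDEX job_media_job_idx ON bacula.JobMedia (JobId, MediaId);
```

Since "bacula.JobMedia" is the table reported as growing, seeing the exact
statement mysqld spends its time on would narrow things down considerably.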
[Bacula-users] backup slowdown (mysqld) after tape autochange
Hi,

strange problem. Here's some hardware where Bacula has been running
successfully for ca. 5 years. It was release 1.38.11 under Solaris 10 x86.

Last month, we had a system disk crash on the backup system. No backup data
has been lost. We just had to reinstall the backup system. Since this was our
only Solaris x86 system, we decided to migrate to Linux and to a newer Bacula
release. Until the repaired hardware was present, we started with a
virtualized new system, just for the daily incremental backups to disk
volumes.

Since most of our actual systems are Ubuntu Hardy server LTS, we chose Bacula
2.2.8 of this distribution as our new version (well, it's old, but 1.38.11
was running well, and 2.2.8 was the default).

We upgraded Bacula's mysql database with the corresponding script from
1.38.11 to 2.2.8. We imported the updated DB using mysqldump into the new
system, which has MySQL 5.1.41 and Linux kernel 2.6.32. The virtualized
system worked well all the time.

Now, the hardware version of the system is ready, and a yearly full backup,
which goes directly to tape, is imminent.

And now, the strange things are coming...

/* The system is a 2x2 core AMD Opteron system, 4 GB RAM, 6x LSI SCSI U320
Megaraid with separated channels for external disks, tape readers and
autochanger. 23 TB disk storage on external RAIDs, autochanger and HP readers
for LTO-3 tapes. System: see above. */

NOW BACKING UP...

Starting a bunch of full backup jobs which fit into 1 SINGLE TAPE produces NO
PROBLEMS: the jobs start, run and write, and terminate within a usual span of
time. In so doing, I can back up a dozen systems with totally 360 GB on one
tape in a few hours.

FACING THE PROBLEM...

Starting a bunch of full backup jobs that DO NOT FIT into 1 single tape
proceeds as follows (with a fresh tape forced by setting the former one to
read-only):

- first, the jobs run well and write their data to the first fresh tape of
  the corresponding pool. Speed is similar to what we knew from the old OS.
- when the tape is full with around 600 GB of data, it is marked as Full,
  unloaded, and the next free tape of the pool is loaded.

- from this moment on, writing to the new fresh tape becomes incredibly slow
  (4 GB/hour) and mysqld constantly has 95%-100% CPU load. No other process
  has a significant load, and the mysql load isn't reflected in the system's
  load values:

Cpu(s): 3.3%us, 2.2%sy, 0.0%ni, 91.6%id, 2.1%wa, 0.1%hi, 0.7%si, 0.0%st
Mem:  3961616k total, 3850072k used, 111544k free, 17532k buffers
Swap: 3906552k total, 0k used, 3906552k free, 3579956k cached

 PID  USER   PR  NI  VIRT  RES  SHR   S  %CPU  %MEM  TIME+      COMMAND
 1356 mysql  20   0  144m  31m   2376 S  98    0.8   163:57.79  mysqld
 1    root   20   0  2620  948   528  S  0     0.0   0:00.63    init
 2    root   20   0  0     0     0    S  0     0.0   0:00.00    kthreadd

The only further effect I can see is that the table "bacula.JobMedia" is
growing. No errors in the system log, no mysql errors, nor in Bacula's log.

What I mainly don't understand is why this happens after a tape change. The
MaxSpoolSize is 32GB, and I'm backing up 7 systems. Each of them had several
spool steps during the first tape.

From the view of Bacula and its program logic, what has changed when the tape
has been changed? I guess it's all the same: spooling data, writing them to
tape and updating the catalog, regardless of first, second or later
tape...?!?

Regards,
Robert

+++ German Research Center for Artificial Intelligence +++
Dipl.-Inform. Robert V. Wirth, Campus D3_2, D-66123 Saarbruecken
@office: +49-681-85775-5078 / -5572 +++ @fax: +49-681-85775-5020
mailto:robert.wi...@dfki.de ++ http://www.dfki.de/~wirth

Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH - Firmensitz
Trippstadter Strasse 122, D-67663 Kaiserslautern
Geschaeftsfuehrung (executive board):
- Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
- Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats (supervisory board chairman):
- Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
Re: [Bacula-users] waiting on max Storage jobs
On Mon, December 13, 2010 9:18 pm, Dan Langille wrote:
> From time to time, I get a job stuck. But I'm not sure why. Several
> other jobs have already gone through this storage device since this item
> was queued. Confused...
>
> Running Jobs:
> Console connected at 14-Dec-10 02:03
>  JobId Level   Name                                     Status
> ======
>  41810 Increme CopyMaildirToTape.2010-12-13_11.15.00_57 is waiting on
>        max Storage jobs

Looking today, I see three such queued jobs:

Running Jobs:
Console connected at 14-Dec-10 16:30
 JobId Level   Name                                     Status
======
 41810 Increme CopyMaildirToTape.2010-12-13_11.15.00_57 is waiting on max Storage jobs
 41916 Increme CopyMaildirToTape.2010-12-14_09.15.00_47 is waiting on max Storage jobs
 41920 Increme CopyMaildirToTape.2010-12-14_11.15.00_51 is waiting on max Storage jobs

If I look through the messages log, I see entries for the last two jobs:

14-Dec 09:15 bacula-dir JobId 41916: No JobIds found to copy.
14-Dec 11:15 bacula-dir JobId 41920: No JobIds found to copy.

If a Copy job finds no jobs to copy, it appears to get stuck.

-- 
Dan Langille -- http://langille.org/
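For what it's worth, stuck entries like these can be cleared from bconsole.
The job IDs below are the ones from the listing above; whether cancelling
(rather than fixing why the Copy job selects nothing) is the right remedy is
left open:

```
* cancel jobid=41810
* cancel jobid=41916
* cancel jobid=41920
* status dir            # verify the queue is clear again
```

That only clears the symptom, of course; the underlying question is why a Copy
job with "No JobIds found to copy" keeps holding a storage slot.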
Re: [Bacula-users] job scheduling snafu
> On Tue, 14 Dec 2010 09:26:40 +0100, Hugo said:
>
> hi list
>
> I had to do a full backup of our archive, which is about 5.5 TB big. I ran
> into some problems a few times, like the hardcoded 6 days job running limit,
> so I split the archive, moved one of the folders out of my backup dir and
> started the job. After that, I moved it back and started the backup again,
> thinking that bacula would back it up incrementally. I was wrong about that,
> since it is an archive, so the dates on the files are all older than the
> date the last job was run. Then I tried "Accurate = on" on the job, but I
> got a lot of DB errors like:
>
> 10-Dec 13:20 appendix-dir JobId 80: Fatal error: sql_get.c:998 sql_get.c:998
> query SELECT
> MediaId,VolumeName,VolJobs,VolFiles,VolBlocks,VolBytes,VolMounts,VolErrors,VolWrites,MaxVolBytes,VolCapacityBytes,MediaType,VolStatus,PoolId,VolRetention,VolUseDuration,MaxVolJobs,MaxVolFiles,Recycle,Slot,FirstWritten,LastWritten,InChanger,EndFile,EndBlock,VolParts,LabelType,LabelDate,StorageId,Enabled,LocationId,RecycleCount,InitialWrite,ScratchPoolId,RecyclePoolId,VolReadTime,VolWriteTime,ActionOnPurge
> FROM Media WHERE VolumeName='BA1014L2' failed:
> server sent data ("D" message) without prior row description ("T" message)
> could not receive data from server: Operation now in progress
>
> I think they happened because the SELECT took a really long time (2.11 min.)
> to get the backed-up files from the DB (lots of files, 3.22 TB) and bacula
> somehow choked on the data.
>
> So, I decided to set "Accurate = no" again, and touch all the files in the
> folder I moved back (1.9 TB) to the archive dir to 2010.12.12 19:44. I then
> manually started a job this Sunday at 2010.12.12 19:55. It completed
> yesterday, 2010.12.13, at 22:17.
>
> My problem is: since it should be an incremental archive backup, I scheduled
> it to run on Mondays at 06:66 AM, so the job scheduled itself again while it
> was running.
> It was a funny listed job; it kept throwing an error if I tried to get
> details, saying that I should try to list the jobs with the option
> "catalog=all". The manually called job completed successfully, but the other
> one, auto-scheduled at 6 AM a day before, started just after that. Now I
> canceled it, but it already wrote on 4 tapes (LTO2).
> I suspect it started because bacula thought that the touched files weren't
> saved, since it didn't yet complete the job.
> Shouldn't bacula take the last file dates from when the job starts, not from
> when it was scheduled?

It does take the date from when it starts, but I suspect the second job
started and was then suspended until the first job finished. It doesn't
reread the date in that case.

> What should I do with the last 4 written tapes? Should I now purge the files
> written by the job, or even delete the job? Or just ignore it, and live with
> the fact that I wasted 4 tapes..
> I just want to bring the tapes to the bank and be done with it already :)

Yes, you can use the delete jobid command to delete the unwanted incremental
job and the purge volume command to make the 4 tapes reusable. However, I
would seriously worry about the integrity of the backup, given this confusion
about dates. Consider redoing everything with two separate jobs and two
separate filesets, split so that the jobs are small enough to run in a
reasonable time.

__Martin
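Martin's two commands, as they would look in bconsole (the job ID and volume
names below are placeholders, since the real ones aren't given in the thread):

```
* delete jobid=NNN        # NNN = the unwanted auto-scheduled incremental
* purge volume=XXX001L2   # repeat once per tape the canceled job wrote to
* list volumes            # purged tapes should now show as reusable
```

Note that purge ignores retention periods, so it's worth double-checking the
volume names before running it.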
Re: [Bacula-users] Large scale disk-to-disk Bacula deployment
'Marcello Romani' wrote:
> On 01/12/2010 16:04, Henrik Johansen wrote:
>> Hi folks,
>>
>> I did prepare a paper for this year's "Bacula Konferenz 2010" about doing
>> large scale, high performance disk-to-disk backups with Bacula, but
>> unfortunately my workload prohibited me from submitting.
>>
>> I have turned the essence of the paper into a few blog posts which will
>> explain our setup, why we chose Bacula over the competition (IBM,
>> Symantec and CommVault) and give some real world numbers from our Bacula
>> deployment.
>>
>> The first post is out now if people should be interested and can be found
>> here:
>>
>> http://blog.myunix.dk/2010/12/01/large-scale-disk-to-disk-backups-using-bacula/
>>
>> The remaining posts will follow over the next month or so.
>
> Very interesting. I'm looking seriously at bacula, although for a much
> smaller setup than yours (to say the least). Your first post got me very
> interested. I hope to read soon the other chapters of the tale...

Part V is now online and can be found here:

http://blog.myunix.dk/2010/12/14/large-scale-disk-to-disk-backups-using-bacula-part-v/

-- 
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group A/S
ScanNet
[Bacula-users] job scheduling snafu
hi list

I had to do a full backup of our archive, which is about 5.5 TB big. I ran
into some problems a few times, like the hardcoded 6 days job running limit,
so I split the archive, moved one of the folders out of my backup dir and
started the job. After that, I moved it back and started the backup again,
thinking that bacula would back it up incrementally. I was wrong about that,
since it is an archive, so the dates on the files are all older than the date
the last job was run. Then I tried "Accurate = on" on the job, but I got a
lot of DB errors like:

10-Dec 13:20 appendix-dir JobId 80: Fatal error: sql_get.c:998 sql_get.c:998
query SELECT
MediaId,VolumeName,VolJobs,VolFiles,VolBlocks,VolBytes,VolMounts,VolErrors,VolWrites,MaxVolBytes,VolCapacityBytes,MediaType,VolStatus,PoolId,VolRetention,VolUseDuration,MaxVolJobs,MaxVolFiles,Recycle,Slot,FirstWritten,LastWritten,InChanger,EndFile,EndBlock,VolParts,LabelType,LabelDate,StorageId,Enabled,LocationId,RecycleCount,InitialWrite,ScratchPoolId,RecyclePoolId,VolReadTime,VolWriteTime,ActionOnPurge
FROM Media WHERE VolumeName='BA1014L2' failed:
server sent data ("D" message) without prior row description ("T" message)
could not receive data from server: Operation now in progress

I think they happened because the SELECT took a really long time (2.11 min.)
to get the backed-up files from the DB (lots of files, 3.22 TB) and bacula
somehow choked on the data.

So, I decided to set "Accurate = no" again, and touch all the files in the
folder I moved back (1.9 TB) to the archive dir to 2010.12.12 19:44. I then
manually started a job this Sunday at 2010.12.12 19:55. It completed
yesterday, 2010.12.13, at 22:17.

My problem is: since it should be an incremental archive backup, I scheduled
it to run on Mondays at 06:66 AM, so the job scheduled itself again while it
was running. It was a funny listed job; it kept throwing an error if I tried
to get details, saying that I should try to list the jobs with the option
"catalog=all".
The manually called job completed successfully, but the other one,
auto-scheduled at 6 AM a day before, started just after that. Now I canceled
it, but it already wrote on 4 tapes (LTO2). I suspect it started because
bacula thought that the touched files weren't saved, since the first job
hadn't yet completed.

Shouldn't bacula take the last file dates from when the job starts, not from
when it was scheduled? What should I do with the last 4 written tapes? Should
I now purge the files written by the job, or even delete the job? Or just
ignore it, and live with the fact that I wasted 4 tapes.. I just want to
bring the tapes to the bank and be done with it already :)

greets
hugo.-
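The bulk timestamp reset described above could be done roughly like this
(only the timestamp is taken from the post; the path is illustrative, so a
throwaway directory stands in for the real archive folder):

```shell
# Illustrative sketch: reset mtimes of every file under the folder that was
# moved back. ARCHIVE stands in for the real archive path, which isn't given.
ARCHIVE=$(mktemp -d)                         # demo dir instead of the real one
echo demo > "$ARCHIVE/file1"

# Set mtime of all files to 2010-12-12 19:44 (touch -t CCYYMMDDhhmm)
find "$ARCHIVE" -type f -exec touch -t 201012121944 {} +
```

Using `find ... -exec ... +` batches many files per touch invocation, which
matters when resetting 1.9 TB worth of files.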