Re: [Bacula-users] Catastrophic overflow block problems
On 1/17/2013 2:10 PM, Ruth Ivimey-Cook wrote:
> Josh Fisher wrote:
>> You don't.
>
> I find it very strange that returning "device full" from a volume write can reasonably be interpreted as "device not quite full".
>
>> The trick is to define a maximum volume size and number of volumes on the drive so that it is impossible to reach 100% of the physical drive's capacity. This will prevent the i/o error, and Bacula will instead hit end of volume and seek another volume. Of course, if no existing volumes can be recycled yet, then there simply isn't enough space on the drive. In that case, it is easy to add another drive to an existing autochanger, since vchanger allows for multiple simultaneous "magazine" drives.
>
> I don't understand how to do this then without defining the number of volumes so low that I waste huge amounts of space on the drives as a matter of course.

One way is to partition the drives. Keeping volumes of the same size on the same partition allows specifying the exact number of volumes. Each partition is a magazine, and any number of partitions can be used simultaneously. For example, break a 1 TB drive into two partitions, one 200 GB partition holding 10 volumes in a pool with a max volume size of ~20 GB for incremental jobs, and an 800 GB partition holding 8 volumes in a pool with max volume size of 100 GB for full jobs. Etc.

> A little more detail about what I'm doing:
>
> * Some backups are assigned longer retention times than others - e.g. some full backups live for a year, some incrs live for just 3 months.
> * I have various max volume sizes from 20GB to 400GB, assigned to each file pool depending on the likely size of a backup (e.g. incrs are likely smaller than full) so that a volume will expire in a reasonable time - I don't want 100GB of backups to be kept alive (and using space) because they are in the same volume as more recent backups that haven't expired yet.
> * I have set up 24 volumes per disk so that, should the volumes be shorter 90GB ones, I don't (on average) run out of volumes too quickly.
> * The result is that most disks are reasonably full most of the time, which is good.
>
> To be honest, I wish Bacula had a "disk mode" in which the concept of volumes was mostly eliminated: devices had backup pools and backups within them and it would be backups that were recycled. It would make much more sense for a random-access medium.

True, but Bacula must also work with tape drives, and that would be a very extensive rewrite.

> Would an alternative solution be to adapt the vchanger program so that it monitored disk space and returned device full "early"?

No, because vchanger only runs very briefly when Bacula requests a volume be "loaded" or "unloaded". It basically points Bacula to the particular volume file it is to use and then exits. Bacula reads/writes the file directly, so there is no interaction between vchanger and Bacula when the data is actually being written.

> Ruth
> --
> Software Manager & Engineer
> Tel: 01223 414180
> Blog: http://www.ivimey.org/blog
> LinkedIn: http://uk.linkedin.com/in/ruthivimeycook/

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
Re: [Bacula-users] Catastrophic overflow block problems
On 2013-01-17 11:06, Ruth Ivimey-Cook wrote:
> Hi,
>
> I am sometimes getting these errors in my bacula backups:
>
> Fatal error: device.c:192 Catastrophic error. Cannot write overflow
> block to device "DiskStorage-drive-0"
>
> and it is more likely on the larger volume backups. It seemingly
> results from bacula trying to write an additional block to a disk
> drive that is already 100% full. How can I stop bacula from believing
> this is a valid thing to do?

Disk space is outside the scope of the Bacula project. It is the responsibility of the sysadmin to manage disk space. The other post mentioned how to restrict a Pool to a maximum size per Volume and a maximum number of Volumes per Pool.

--
Dan Langille - http://langille.org/
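The limit described here (maximum size per Volume times maximum number of Volumes) can be compared against the filesystem that actually holds the volume files. What follows is only a minimal Python sketch, not part of Bacula or vchanger: `pool_fits`, `check_mount`, and the `reserve_bytes` headroom are hypothetical names for illustration, and the numbers passed in are assumed to mirror whatever limits the pool is configured with.

```python
import shutil

GB = 10**9  # decimal gigabytes; an assumption made for this sketch


def pool_fits(total_bytes, max_volume_bytes, max_volumes, reserve_bytes=0):
    """Return True if every volume can grow to its maximum size
    while still leaving reserve_bytes free on the filesystem."""
    worst_case = max_volume_bytes * max_volumes
    return worst_case <= total_bytes - reserve_bytes


def check_mount(path, max_volume_bytes, max_volumes, reserve_bytes=0):
    """Apply pool_fits to the filesystem that contains path."""
    total = shutil.disk_usage(path).total
    return pool_fits(total, max_volume_bytes, max_volumes, reserve_bytes)


# Hypothetical check: 8 volumes capped at 100 GB each on the current filesystem.
print(check_mount(".", 100 * GB, 8))
```

A check like this could run from cron or before a job starts; it only reports whether the configured worst case fits and does nothing to enforce the limits themselves.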
Re: [Bacula-users] Catastrophic overflow block problems
On 1/17/2013 11:06 AM, Ruth Ivimey-Cook wrote:
> Hi,
>
> I am sometimes getting these errors in my bacula backups:
>
> Fatal error: device.c:192 Catastrophic error. Cannot write overflow block to device "DiskStorage-drive-0"
>
> and it is more likely on the larger volume backups. It seemingly results from bacula trying to write an additional block to a disk drive that is already 100% full. How can I stop bacula from believing this is a valid thing to do?

You don't. The trick is to define a maximum volume size and number of volumes on the drive so that it is impossible to reach 100% of the physical drive's capacity. This will prevent the i/o error, and Bacula will instead hit end of volume and seek another volume. Of course, if no existing volumes can be recycled yet, then there simply isn't enough space on the drive. In that case, it is easy to add another drive to an existing autochanger, since vchanger allows for multiple simultaneous "magazine" drives.
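The arithmetic behind this rule is simple enough to sketch. The Python below is illustrative only: the function names are hypothetical, decimal gigabytes are assumed, and filesystem overhead is ignored, so real limits should be set somewhat lower than these figures. The inputs come from the setup in the original post (24 volumes per magazine on disks of 1 TB to 2 TB).

```python
GB = 10**9  # decimal gigabytes; an assumption made for this sketch


def volumes_that_fit(capacity_bytes, max_volume_bytes):
    """Largest number of full-size volumes that fit in the given capacity."""
    return capacity_bytes // max_volume_bytes


def largest_safe_volume(capacity_bytes, num_volumes):
    """Largest per-volume cap that keeps num_volumes within the capacity."""
    return capacity_bytes // num_volumes


# 24 volumes per disk, as in the original post.
print(largest_safe_volume(1000 * GB, 24) // GB)  # 41
print(largest_safe_volume(2000 * GB, 24) // GB)  # 83
```

With 24 volumes on a disk, any per-volume cap above roughly 41 GB (1 TB disk) or 83 GB (2 TB disk) leaves room for the disk to fill before Bacula reaches end of volume.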
[Bacula-users] Catastrophic overflow block problems
Hi,

I am sometimes getting these errors in my bacula backups:

Fatal error: device.c:192 Catastrophic error. Cannot write overflow block to device "DiskStorage-drive-0"

and it is more likely on the larger volume backups. It seemingly results from bacula trying to write an additional block to a disk drive that is already 100% full. How can I stop bacula from believing this is a valid thing to do?

Background: I have bacula set up on my local network to back up a file server and a number of workstations. The file server is also the bacula director and is running Fedora 15 and "bacula-common-5.0.3-28.fc15.x86_64". Bacula is writing backups to an iSCSI disk group (not array) over ethernet; there are 6 disks of 1TB to 2TB size and these are managed using "vchanger" 0.8.6, with 6 magazines each with 24 virtual volumes. The file server has 3.5TB of files and other workstations add about another 1TB.

More-complete log:

17-Jan 14:49 helva-sd JobId 3417: Recycled volume "DiskPool1_0006_0017" on device "DiskStorage-drive-0" (/var/spool/bacula/vchanger/0/drive0), all previous data lost.
17-Jan 14:49 helva-sd JobId 3417: New volume "DiskPool1_0006_0017" mounted on device "DiskStorage-drive-0" (/var/spool/bacula/vchanger/0/drive0) at 17-Jan-2013 14:49.
17-Jan 14:49 helva-sd JobId 3417: End of Volume "DiskPool1_0006_0017" at 0:216 on device "DiskStorage-drive-0" (/var/spool/bacula/vchanger/0/drive0). Write of 64512 bytes got 3879.
17-Jan 14:49 helva-sd JobId 3417: End of medium on Volume "DiskPool1_0006_0017" Bytes=217 Blocks=0 at 17-Jan-2013 14:49.
17-Jan 14:49 helva-sd JobId 3417: 3307 Issuing autochanger "unload slot 89, drive 0" command.
17-Jan 14:49 helva-dir JobId 3417: Using Volume "DiskPool1_0006_0018" from 'Scratch' pool.
17-Jan 14:49 helva-sd JobId 3417: 3301 Issuing autochanger "loaded? drive 0" command.
17-Jan 14:49 helva-sd JobId 3417: 3302 Autochanger "loaded? drive 0", result: nothing loaded.
17-Jan 14:49 helva-sd JobId 3417: 3304 Issuing autochanger "load slot 90, drive 0" command.
17-Jan 14:49 helva-sd JobId 3417: 3305 Autochanger "load slot 90, drive 0", status is OK.
17-Jan 14:49 helva-sd JobId 3417: Recycled volume "DiskPool1_0006_0018" on device "DiskStorage-drive-0" (/var/spool/bacula/vchanger/0/drive0), all previous data lost.
17-Jan 14:49 helva-sd JobId 3417: New volume "DiskPool1_0006_0018" mounted on device "DiskStorage-drive-0" (/var/spool/bacula/vchanger/0/drive0) at 17-Jan-2013 14:49.
17-Jan 14:49 helva-sd JobId 3417: End of Volume "DiskPool1_0006_0018" at 0:216 on device "DiskStorage-drive-0" (/var/spool/bacula/vchanger/0/drive0). Write of 64512 bytes got 3879.
17-Jan 14:49 helva-sd JobId 3417: Fatal error: device.c:192 Catastrophic error. Cannot write overflow block to device "DiskStorage-drive-0" (/var/spool/bacula/vchanger/0/drive0). ERR=No space left on device
17-Jan 14:49 helva-fd JobId 3417: Error: bsock.c:393 Write error sending 65562 bytes to Storage daemon:helva.cam.ivimey.org:9103: ERR=Connection reset by peer
17-Jan 14:49 helva-fd JobId 3417: Fatal error: backup.c:1024 Network send error to SD. ERR=Connection reset by peer
17-Jan 14:49 helva-sd JobId 3417: Fatal error: device.c:192 Catastrophic error. Cannot write overflow block to device "DiskStorage-drive-0" (/var/spool/bacula/vchanger/0/drive0). ERR=No space left on device
17-Jan 14:49 helva-sd JobId 3417: Fatal error: device.c:192 Catastrophic error. Cannot write overflow block to device "DiskStorage-drive-0" (/var/spool/bacula/vchanger/0/drive0). ERR=No space left on device
17-Jan 14:49 helva-sd JobId 3417: Fatal error: device.c:192 Catastrophic error. Cannot write overflow block to device "DiskStorage-drive-0" (/var/spool/bacula/vchanger/0/drive0). ERR=No space left on device
17-Jan 14:49 helva-sd JobId 3417: Fatal error: device.c:192 Catastrophic error. Cannot write overflow block to device "DiskStorage-drive-0" (/var/spool/bacula/vchanger/0/drive0). ERR=No space left on device
17-Jan 14:49 helva-sd JobId 3417: Job write elapsed time = 14:30:55, Transfer rate = 12.06 M Bytes/second
17-Jan 14:49 helva-dir JobId 3417: Error: Bacula helva-dir 5.0.3 (04Aug10): 17-Jan-2013 14:49:32
  Build OS:               x86_64-redhat-linux-gnu redhat
  JobId:                  3417
  Job:                    Helva_Home.2013-01-17_00.17.26_23
  Backup Level:           Full
  Client:                 "helva-fd" 5.0.3 (04Aug10) x86_64-redhat-linux-gnu,redhat,
  FileSet:                "Home" 2010-12-07 13:37:32
  Pool:                   "Normal-Full-18w" (From Job FullPool override)
  Catalog:                "MyCatalog" (From Client resource)
  Storage:                "DiskStorage" (From command line)
  Scheduled time:         17-Jan-2013 00:17:26
  Start time:             17-Jan-2013 00:17:28
  End time:               17-Jan-2013 14:49:32
  Elapsed time:           14 hours 32 mins 4 secs
  Priority: