Hello everyone,

First, let me start by saying that I am using Josh's excellent vchanger
program in conjunction with removable SATA drives, and have documented my
configuration several years back here: http://www.revpol.com/offsitebackups

This configuration has worked exceedingly well for me without many issues for
several years now.

However, since upgrading my perfectly running 7.4.7 to 9.0.3, things seem to
changed with regards to volume or device selection and/or contention. I have
to spend a lot of time every day managing backups, manually loading the
requested volumes into drive devices over and over for just about every job.

This has been going on immediately after the upgrade. It is really
unmanageable, and backups will all just sit, waiting for me to move volumes
around.

Josh, in case you can think of anything that might change the interaction
between Bacula and vchanger, I am using version 1.0.1 since it has been
released, and as I mentioned, it has worked fine for me until I made the
upgrade to 9.0.3.

I am attaching a text file describing the issue in case the following text
wraps horrible as I fully expect it to. :-/

Let me demonstrate what is going on:

> Currently, the Director says it wants volume "c0_0001_0041" on device 
> "speedy_drv_9" of the autochanger, but it says that this volume is supposedly 
> in device another device "speedy_drv_3".
> There are 10 jobs "running" but at least two are held up (the ones on device 
> 3 and device 9):
> ----8<----
> *m                                                                            
>     
> 01-Oct 10:12 speedy-sd JobId 17125: Warning: Volume "c0_0001_0041" wanted on 
> "speedy_drv_9" (/var/lib/bacula/vchanger/speedy-file/9) is in use by device 
> "speedy_drv_3" (/var/lib/bacula/vchanger/speedy-file/3)                 
> 01-Oct 10:12 speedy-sd JobId 17125: Warning: mount.c:210 Open of File device 
> "speedy_drv_9" (/var/lib/bacula/vchanger/speedy-file/9) Volume "c0_0001_0041" 
> failed:
> ERR=file_dev.c:190 Could not 
> open(/var/lib/bacula/vchanger/speedy-file/9,OPEN_READ_WRITE,0640): ERR=No 
> such file or directory
> ----8<----
> 
> Those two libes are logged every 3 minutes.
> 
> 
> 
> OK, let's see what the storage daemon thinks is going on:
> ----8<----                                                                    
>                                 
> *s storage=speedy-file                                                        
>              
> Connecting to Storage daemon speedy-file at speedy.revpol.com:9103            
>                       
>                                                                               
>    
> speedy-sd Version: 9.0.3 (08 August 2017) x86_64-pc-linux-gnu gentoo          
>                       
> Daemon started 29-Sep-17 21:53. Jobs: run=48, running=9.        
>  Heap: heap=180,224 smbytes=2,964,523 max_bytes=3,615,430 bufs=621 
> max_bufs=881                     
>  Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0 newbsr=0
>  Res: ndevices=10 nautochgr=1
> [...]
> Writing: Incremental Backup job upsmon JobId=17135 Volume="c0_0001_0062"     
> <-----------------Wait a minute this is not c0_0001_0041!
>     pool="Offsite-eSATA-Inc" device="speedy_drv_3" 
> (/var/lib/bacula/vchanger/speedy-file/3)  <--- Drive 3
>     spooling=0 despooling=0 despool_wait=0                                    
>                       
>     Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0                            
>       
>     FDReadSeqNo=6 in_msg=6 out_msg=4 fd=14  
> 
> Writing: Incremental Backup job dns2 JobId=17125 Volume="c0_0001_0041"       
> <---------------- Here, it is already shown as in drive 9
>     pool="Offsite-eSATA-Inc" device="speedy_drv_9" 
> (/var/lib/bacula/vchanger/speedy-file/9)
>     spooling=0 despooling=0 despool_wait=0                                    
>    
>     Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0
>     FDReadSeqNo=6 in_msg=6 out_msg=5 fd=23
> ----8<----
> 
> 
> Ok, there is also an error reported by the Driector in that previous message 
> about this "volume" (well, vchanger symlink):
> ----8<----
> ERR=file_dev.c:190 Could not 
> open(/var/lib/bacula/vchanger/speedy-file/9,OPEN_READ_WRITE,0640): ERR=No 
> such file or directory
> ----8<----
> 
> 
> First, let's see what 'slot' this volume is in:
> ----8<----
> *list volume=c0_0001_0041
> [...]
> slot 110 is shown in the output.
> ----8<----
> 
> 
> Hmmm, OK, let's look at the vchanger stuff the SD itself.
> ----8<----
> speedy bacula #  ls -la /var/lib/bacula/vchanger/speedy-file/9
> ls: cannot access '/var/lib/bacula/vchanger/speedy-file/9': No such file or 
> directory
> 
> speedy bacula #  ls -la /var/lib/bacula/vchanger/speedy-file/
> total 45
> drwxr-xr-x 2 bacula bacula 592 Oct  1 10:33 .
> drwxr-xr-x 3 bacula bacula 144 Apr  9 19:32 ..
> lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:49 0 -> 
> /mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0008
> lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:51 1 -> 
> /mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0023
> lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:50 3 -> 
> /mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0041
> lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:50 4 -> 
> /mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0037
> lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:50 5 -> 
> /mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0062
> lrwxrwxrwx 1 bacula bacula  70 Oct  1 10:10 6 -> 
> /mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0064
> lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:51 7 -> 
> /mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0033
> lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:50 8 -> 
> /mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0048
> -rw-r----- 1 bacula bacula  64 Oct  1 10:33 bay_state-0
> -rw-r----- 1 bacula bacula  63 Oct  1 10:33 bay_state-5
> -rw-r----- 1 bacula bacula  71 Oct  1 09:49 drive_state-0
> -rw-r----- 1 bacula bacula  71 Oct  1 09:51 drive_state-1
> -rw-r----- 1 bacula bacula  71 Oct  1 09:50 drive_state-3
> -rw-r----- 1 bacula bacula  71 Oct  1 09:50 drive_state-4
> -rw-r----- 1 bacula bacula  71 Oct  1 09:50 drive_state-5
> -rw-r----- 1 bacula bacula  71 Oct  1 10:10 drive_state-6
> -rw-r----- 1 bacula bacula  71 Oct  1 09:51 drive_state-7
> -rw-r----- 1 bacula bacula  71 Oct  1 09:50 drive_state-8
> -rw-r----- 1 bacula bacula  18 Oct  1 10:33 dynamic.conf
> ----8<----
> 
> 
> OK, fair enough, there does not seem to be a device 9 symlink.. Let's 
> interact with vchanger and see if 110 is in 9:
> ----8<----
> # vchanger -u bacula -g bacula /etc/bacula/include/vchanger-REVPOL.conf 
> loaded 110 9 9
> 0
> ----8<----
> 
> Ok, so vchanger also thinks this volume is not in drive 9.
> 
> Is it in drive 3?
> ----8<----
> # vchanger -u bacula -g bacula /etc/bacula/include/vchanger-REVPOL.conf 
> loaded 110 3 3
> 110
> ----8<----
> 
> YES, it is, and as we can also see from the ls output above, drive 3 is 
> symlinked to this volume. Hmmmmm
> 
> 
> On the SD, let's force a mount of this volume into drive 9 with vchanger:
> ----8<----
> speedy bacula # vchanger -u bacula -g bacula 
> /etc/bacula/include/vchanger-REVPOL.conf load 110 9 9 
> speedy bacula # vchanger -u bacula -g bacula 
> /etc/bacula/include/vchanger-REVPOL.conf loaded 110 9 9
> 110
> ----8<----
> 
> 
> Now, I just need to wait a few minutes, and JobId=17125 on drive 9 mounts 
> this volume and the job continues:
> ----8<----
> 01-Oct 10:39 speedy-sd JobId 17125: Volume "c0_0001_0041" previously written, 
> moving to end of data.
> 
> ----8<----
> 
> 
> And at the same time the job running on drive 3 reports that it has unloaded 
> this voume:
> ----8<----
> 01-Oct 10:39 speedy-sd JobId 17135: 3307 Issuing autochanger "unload Volume 
> c0_0001_0041, Slot 110, Drive 3" command.
> ----8<----
> 
> 
> This has been my fate for about 18 jobs per day for the last several weeks 
> since I upgraded to 9.0.3. 
> 
> Any help? Any ideas?


-- 
Bill Arlofski
http://www.revpol.com/bacula
-- Not responsible for anything below this line --
Currently, the Director says it wants volume "c0_0001_0041" on device 
"speedy_drv_9" of the autochanger, but it says that this volume is supposedly 
in device another device "speedy_drv_3".
There are 10 jobs "running" but at least two are held up (the ones on device 3 
and device 9):
----8<----
*m                                                                              
  
01-Oct 10:12 speedy-sd JobId 17125: Warning: Volume "c0_0001_0041" wanted on 
"speedy_drv_9" (/var/lib/bacula/vchanger/speedy-file/9) is in use by device 
"speedy_drv_3" (/var/lib/bacula/vchanger/speedy-file/3)                 
01-Oct 10:12 speedy-sd JobId 17125: Warning: mount.c:210 Open of File device 
"speedy_drv_9" (/var/lib/bacula/vchanger/speedy-file/9) Volume "c0_0001_0041" 
failed:
ERR=file_dev.c:190 Could not 
open(/var/lib/bacula/vchanger/speedy-file/9,OPEN_READ_WRITE,0640): ERR=No such 
file or directory
----8<----

Those two libes are logged every 3 minutes.



OK, let's see what the storage daemon thinks is going on:
----8<----                                                                      
                              
*s storage=speedy-file                                                          
           
Connecting to Storage daemon speedy-file at speedy.revpol.com:9103              
                    
                                                                                
 
speedy-sd Version: 9.0.3 (08 August 2017) x86_64-pc-linux-gnu gentoo            
                    
Daemon started 29-Sep-17 21:53. Jobs: run=48, running=9.        
 Heap: heap=180,224 smbytes=2,964,523 max_bytes=3,615,430 bufs=621 max_bufs=881 
                    
 Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0 newbsr=0
 Res: ndevices=10 nautochgr=1
[...]
Writing: Incremental Backup job upsmon JobId=17135 Volume="c0_0001_0062"     
<-----------------Wait a minute this is not c0_0001_0041!
    pool="Offsite-eSATA-Inc" device="speedy_drv_3" 
(/var/lib/bacula/vchanger/speedy-file/3)  <--- Drive 3
    spooling=0 despooling=0 despool_wait=0                                      
                    
    Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0                              
    
    FDReadSeqNo=6 in_msg=6 out_msg=4 fd=14  

Writing: Incremental Backup job dns2 JobId=17125 Volume="c0_0001_0041"       
<---------------- Here, it is already shown as in drive 9
    pool="Offsite-eSATA-Inc" device="speedy_drv_9" 
(/var/lib/bacula/vchanger/speedy-file/9)
    spooling=0 despooling=0 despool_wait=0                                      
 
    Files=0 Bytes=0 AveBytes/sec=0 LastBytes/sec=0
    FDReadSeqNo=6 in_msg=6 out_msg=5 fd=23
----8<----


Ok, there is also an error reported by the Driector in that previous message 
about this "volume" (well, vchanger symlink):
----8<----
ERR=file_dev.c:190 Could not 
open(/var/lib/bacula/vchanger/speedy-file/9,OPEN_READ_WRITE,0640): ERR=No such 
file or directory
----8<----


First, let's see what 'slot' this volume is in:
----8<----
*list volume=c0_0001_0041
[...]
slot 110 is shown in the output.
----8<----


Hmmm, OK, let's look at the vchanger stuff the SD itself.
----8<----
speedy bacula #  ls -la /var/lib/bacula/vchanger/speedy-file/9
ls: cannot access '/var/lib/bacula/vchanger/speedy-file/9': No such file or 
directory

speedy bacula #  ls -la /var/lib/bacula/vchanger/speedy-file/
total 45
drwxr-xr-x 2 bacula bacula 592 Oct  1 10:33 .
drwxr-xr-x 3 bacula bacula 144 Apr  9 19:32 ..
lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:49 0 -> 
/mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0008
lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:51 1 -> 
/mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0023
lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:50 3 -> 
/mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0041
lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:50 4 -> 
/mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0037
lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:50 5 -> 
/mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0062
lrwxrwxrwx 1 bacula bacula  70 Oct  1 10:10 6 -> 
/mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0064
lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:51 7 -> 
/mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0033
lrwxrwxrwx 1 bacula bacula  70 Oct  1 09:50 8 -> 
/mnt/eSATA-1_Backups/2501438d-f468-42ab-a52e-ae6b0853dac7/c0_0001_0048
-rw-r----- 1 bacula bacula  64 Oct  1 10:33 bay_state-0
-rw-r----- 1 bacula bacula  63 Oct  1 10:33 bay_state-5
-rw-r----- 1 bacula bacula  71 Oct  1 09:49 drive_state-0
-rw-r----- 1 bacula bacula  71 Oct  1 09:51 drive_state-1
-rw-r----- 1 bacula bacula  71 Oct  1 09:50 drive_state-3
-rw-r----- 1 bacula bacula  71 Oct  1 09:50 drive_state-4
-rw-r----- 1 bacula bacula  71 Oct  1 09:50 drive_state-5
-rw-r----- 1 bacula bacula  71 Oct  1 10:10 drive_state-6
-rw-r----- 1 bacula bacula  71 Oct  1 09:51 drive_state-7
-rw-r----- 1 bacula bacula  71 Oct  1 09:50 drive_state-8
-rw-r----- 1 bacula bacula  18 Oct  1 10:33 dynamic.conf
----8<----


OK, fair enough, there does not seem to be a device 9 symlink.. Let's interact 
with vchanger and see if 110 is in 9:
----8<----
# vchanger -u bacula -g bacula /etc/bacula/include/vchanger-REVPOL.conf loaded 
110 9 9
0
----8<----

Ok, so vchanger also thinks this volume is not in drive 9.

Is it in drive 3?
----8<----
# vchanger -u bacula -g bacula /etc/bacula/include/vchanger-REVPOL.conf loaded 
110 3 3
110
----8<----

YES, it is, and as we can also see from the ls output above, drive 3 is 
symlinked to this volume. Hmmmmm


On the SD, let's force a mount of this volume into drive 9 with vchanger:
----8<----
speedy bacula # vchanger -u bacula -g bacula 
/etc/bacula/include/vchanger-REVPOL.conf load 110 9 9 
speedy bacula # vchanger -u bacula -g bacula 
/etc/bacula/include/vchanger-REVPOL.conf loaded 110 9 9
110
----8<----


Now, I just need to wait a few minutes, and JobId=17125 on drive 9 mounts this 
volume and the job continues:
----8<----
01-Oct 10:39 speedy-sd JobId 17125: Volume "c0_0001_0041" previously written, 
moving to end of data.

----8<----


And at the same time the job running on drive 3 reports that it has unloaded 
this voume:
----8<----
01-Oct 10:39 speedy-sd JobId 17135: 3307 Issuing autochanger "unload Volume 
c0_0001_0041, Slot 110, Drive 3" command.
----8<----


This has been my fate for about 18 jobs per day for the last several weeks 
since I upgraded to 9.0.3. 

Any help? Any ideas?






















------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Reply via email to