Upgrade to 1.37.37. I don't know about the first problem, but the second is fixed according to my tests and user feedback in version 1.37.37.

I upgraded to 1.37.37, but the same failures occurred tonight. They did not happen on the first night (full backups), which is when I would expect a timeout issue to be more likely. I also notice that every job that fails has its level listed as "Full (upgraded from Incremental)", which is odd because the last full backup listed for some of these clients is "OK". Below is an example from the logs.

30-Aug 20:50 sioux-dir: Start Backup JobId 109, Job=sirius.2005-08-30_20.00.32
30-Aug 21:07 sioux-sd: sirius.2005-08-30_20.00.32 Fatal error: acquire.c:359 Wanted Volume "000002", but device "IBM-LTO2" (/dev/nst0) is busy writing on "000003" .
30-Aug 21:11 sirius-fd: sirius.2005-08-30_20.00.32 Fatal error: c:\cygwin\home\kern\bacula\k\src\win32\filed\../../filed/job.c:1597 Bad response to Append Data command. Wanted 3000 OK data
, got 3903 Error append data

30-Aug 21:07 sioux-dir: sirius.2005-08-30_20.00.32 Error: Bacula 1.37.37 (24Aug05): 30-Aug-2005 21:07:59
  JobId:                  109
  Job:                    sirius.2005-08-30_20.00.32
  Backup Level:           Full (upgraded from Incremental)
  Client:                 "sirius-fd" Windows XP,MVS,NT 5.1.2600
  FileSet:                "sirius-fileset" 2005-08-27 03:36:12
  Pool:                   "Full-Pool"
  Storage:                "Dell-PowerVault-132T"
  Scheduled time:         30-Aug-2005 20:00:31
  Start time:             30-Aug-2005 20:50:27
  End time:               30-Aug-2005 21:07:59
  Priority:               10
  FD Files Written:       0
  SD Files Written:       0
  FD Bytes Written:       0
  SD Bytes Written:       0
  Rate:                   0.0 KB/s
  Software Compression:   None
  Volume name(s):
  Volume Session Id:      30
  Volume Session Time:    1125416678
  Last Volume Bytes:      337,686,490,410
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  Error
  Termination:            *** Backup Error ***
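
To collect the conflicting volumes from many failures like this one, something along these lines could be used. It is only a sketch: the regular expression is modelled on the single storage daemon message quoted above, so other Bacula versions may word the error differently, and the name `volume_conflict` is made up for illustration and is not part of Bacula.

```python
import re

# Pattern modelled on the one SD error message quoted above (an assumption:
# other Bacula releases may phrase this message differently).
CONFLICT_RE = re.compile(
    r'Wanted Volume "(?P<wanted>[^"]+)", but device "(?P<device>[^"]+)" '
    r'\((?P<path>[^)]+)\) is busy writing on\s+"(?P<busy>[^"]+)"'
)


def volume_conflict(log_text):
    """Return (wanted, busy, device) from the first matching error, or None."""
    # Mail clients wrap long log lines, so collapse all whitespace runs first.
    flat = " ".join(log_text.split())
    match = CONFLICT_RE.search(flat)
    if match is None:
        return None
    return match.group("wanted"), match.group("busy"), match.group("device")
```

Fed the sioux-sd line above, this returns the wanted volume "000002", the busy volume "000003" and the device "IBM-LTO2", which makes it easier to see whether every upgraded job is asking for a volume other than the one currently mounted.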

As for the other error, a few clients failed with it as well. Below is the log for one of them. After this failure occurred, I manually ran a backup from bconsole and it completed successfully; no services were restarted between the two backups.

30-Aug 20:34 sioux-dir: Start Backup JobId 106, Job=auriga.2005-08-30_20.00.29
30-Aug 20:44 sioux-dir: auriga.2005-08-30_20.00.29 Fatal error: authenticate.c:99 Unable to authenticate with Storage daemon. Possible causes:
Passwords or names not the same or
Maximum Concurrent Jobs exceeded on the SD or
SD networking messed up (restart daemon).
Please see http://www.bacula.org/rel-manual/faq.html#AuthorizationErrors for help.
30-Aug 20:44 sioux-dir: auriga.2005-08-30_20.00.29 Error: Bacula 1.37.37 (24Aug05): 30-Aug-2005 20:44:27
  JobId:                  106
  Job:                    auriga.2005-08-30_20.00.29
  Backup Level:           Incremental, since=2005-08-27 03:04:07
  Client:                 "auriga-fd" Windows XP,MVS,NT 5.1.2600
  FileSet:                "auriga-fileset" 2005-08-27 03:04:07
  Pool:                   "Inc-Pool"
  Storage:                "Dell-PowerVault-132T"
  Scheduled time:         30-Aug-2005 20:00:28
  Start time:             30-Aug-2005 20:34:26
  End time:               30-Aug-2005 20:44:27
  Priority:               10
  FD Files Written:       0
  SD Files Written:       0
  FD Bytes Written:       0
  SD Bytes Written:       0
  Rate:                   0.0 KB/s
  Software Compression:   None
  Volume name(s):
  Volume Session Id:      0
  Volume Session Time:    0
  Last Volume Bytes:      0
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:
  SD termination status:
  Termination:            *** Backup Error ***
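
The timing in this report can be checked with a short script. This is a minimal sketch, assuming the "Key: value" layout and the timestamp format shown in the report above; the helper names `report_fields` and `elapsed_seconds` are hypothetical and not part of Bacula.

```python
from datetime import datetime

# Timestamp layout as it appears in the job report, e.g. "30-Aug-2005 20:34:26".
# Parsing "%b" assumes an English (C) locale for the month abbreviation.
REPORT_TIME_FORMAT = "%d-%b-%Y %H:%M:%S"


def report_fields(report_text):
    """Split 'Key:   value' report lines into a dict, cutting at the first colon."""
    fields = {}
    for line in report_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def elapsed_seconds(report_text):
    """Seconds between the report's 'Start time' and 'End time' fields."""
    fields = report_fields(report_text)
    start = datetime.strptime(fields["Start time"], REPORT_TIME_FORMAT)
    end = datetime.strptime(fields["End time"], REPORT_TIME_FORMAT)
    return int((end - start).total_seconds())
```

For the auriga report above (start 20:34:26, end 20:44:27) this gives 601 seconds, so the job sat for almost exactly ten minutes before the authentication error. That would be consistent with some kind of timeout, though I can't confirm that this is the cause.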


Thanks,
Thomas

Looks like I found another problem with 1.37.37. It seems I can't cancel jobs.
status dir shows:

Running Jobs:
 JobId Level   Name                       Status
======================================================================
   122 Full    cap1build.2005-08-31_20.00.05 is waiting on Storage Dell-PowerVault-132T
   130 Full    ibm-8af0a5d734d.2005-08-31_20.00.13 is waiting on Storage Dell-PowerVault-132T
   132 Full    pegasus.2005-08-31_20.00.15 is waiting on Storage Dell-PowerVault-132T
   137 Full    ws-hq-r210-1.2005-08-31_20.00.20 is running
   138 Increme volans.2005-08-31_20.00.21 is waiting on Storage Dell-PowerVault-132T
   139 Increme aquariuslt.2005-08-31_20.00.22 is waiting on max Storage jobs

cancel gives me:

Select Job:
     1: JobId=122 Job=cap1build.2005-08-31_20.00.05
     2: JobId=125 Job=socrates.2005-08-31_20.00.08
     ...        
     22: JobId=154 Job=catalog.2005-08-31_20.01.00
Choose Job to cancel (1-22): 2
3902 Job socrates.2005-08-31_20.00.08 not found.

I guess for now I'm gonna have to roll back to 1.37.30, as it seems to be the last stable (enough) version. Am I really the only one with all these issues in 1.37.37? Just my luck :)

Thanks,
Thomas

