Re: [Bacula-users] Virtual tapes or virtual disks
> I have a RAID5 array of about 40 TB. A separate RAID controller card handles the disks. I'm planning to use the normal ext4 file system. It's standard and well known, though most probably not the fastest. That should not have any great impact, as there is a 4 TB NVMe SSD drive, which offsets the slow physical disk performance.

Hi, I'd recommend that if you're going to use RAID, you at least use a RAID-6 configuration. You don't want to risk losing all your backups if you have a drive fail and then, during the rebuild of the RAID-5, you hit another drive failure/error.

cheers,
--tom

___ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users
Re: [Bacula-users] New Bacula Server with multiple disks
> Can Bacula use my 4 disks in the same way, filling up backup1 and then using backup2, etc.?

The short answer is yes. We've been doing this for over a decade, using symlinks to create one logical Bacula storage area that then points off to 40-50 disks' worth of volume data on each server.

In general, I would agree with the RAID recommendation given the few drives that you have. One option, if you can afford it, would be to double your disk count and create a RAID 10. Since, at the time, we could not afford RAID setups for the number of disks and backup servers that we have, I created an application that "stripes" our completed backup volume data across all the JBOD disks on a given server; thus, if we lose one disk, it lessens the likelihood that we lose an entire sequence of backup data. It also helps to test the drives and root out suspect drives before they totally fail, which allows us to then copy all the good backup volumes off of a suspect drive and take it out of circulation.

cheers,
--tom
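As an illustration of the symlink approach (all paths below are hypothetical, used only for the sketch - a real setup would point the links at volume files on separate JBOD mounts, with the SD's Archive Device aimed at the single logical directory):

```shell
# One logical storage directory whose volume files are symlinks to
# volumes spread across several physical disks (hypothetical paths).
mkdir -p /tmp/bacula-demo/storage /tmp/bacula-demo/disk1 /tmp/bacula-demo/disk2
touch /tmp/bacula-demo/disk1/Vol-0001 /tmp/bacula-demo/disk2/Vol-0002
ln -sf /tmp/bacula-demo/disk1/Vol-0001 /tmp/bacula-demo/storage/Vol-0001
ln -sf /tmp/bacula-demo/disk2/Vol-0002 /tmp/bacula-demo/storage/Vol-0002
ls -l /tmp/bacula-demo/storage
```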
Re: [Bacula-users] areas for improvement?
> Bacula DOES NOT LIKE and does not handle network interruptions _at all_ if backups are in progress. This _will_ cause backups to abort - and these aborted backups are _not_ resumable.

Hi,

My feeble two cents is that this has been a bit of an Achilles' heel for us, even though we are a LAN backup environment (i.e. backups don't leave our local network). We are still running an older, somewhat customized/modified version of community Bacula, so I have not explored the restarting of stopped jobs that has come with newer versions.

Given that, I can recall that when we initially deployed our backups-to-disk setup, I would see backups of large file systems (e.g. 1 TB) write three quarters of their data to volumes and then error out due to some random network interruption. I didn't like the idea that this meant, say, 750 GB worth of our volume space was taken up by an errored/incomplete job that would never be used. Because of this, I had to implement spooling, which typically people would only do if their backups were then being written to sequential media (tape). So, we now spool all jobs to dedicated spool disks, and then Bacula writes that data to the disk data volumes. It fixed the "cruft" issue and made large backups more stable (along with other options).

But I can imagine a scenario where we would not have had to do this if Bacula could more easily recover from network glitches and automatically restart jobs where it last left off (thinking along the lines of checkpointing in an RDBMS). As someone else said, this would require non-trivial changes to Bacula (i.e. I won't be making those changes to our version :) ) and the devil would be in the details in practice. Still, if it were put to a vote, I'd probably vote for this as a nice feature to have.

cheers,
--tom
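For reference, spooling is enabled per Job, with the spool location set on the Storage daemon's Device; a minimal sketch (names, paths and sizes are placeholders, not our actual configuration):

```
# bacula-dir.conf - spool this job's data before writing it to volumes
Job {
  Name = "BigBackup"
  JobDefs = "DefaultJob"
  Spool Data = yes
}

# bacula-sd.conf - where spooled data lands (ideally a dedicated disk)
Device {
  Name = FileStorage
  Spool Directory = /spool/bacula
  Maximum Spool Size = 500G
}
```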
Re: [Bacula-users] Bacula and memory usage
> Hello. I am used to this principle with Linux, but I don't understand why it just takes it when Bacula is working and slows down the server so much that I can no longer access it via ssh.

How is your storage allocated on the server? i.e. how are things partitioned with regard to your backup disks and your database? If your DB is located on the same physical disks as your OS and/or your actual backup data, then you could see such "freeze-ups" while Bacula is running due to I/O limitations. I find it helps to separate the OS, the DB data, and any Bacula storage volumes so they are all on separate disk devices if possible - separate controllers is even better.

--tom
Re: [Bacula-users] Bacula and memory usage
> %Cpu(s): 0.1 us, 0.2 sy, 0.0 ni, 52.9 id, 46.5 wa, 0.0 hi, 0.2 si, 0.0 st
> KiB Mem : 29987532 total, 220092 free, 697356 used, 29070084 buff/cache
> KiB Swap: 15138812 total, 15138812 free, 0 used. 28880936 avail Mem

It looks like your memory is being used by the Linux file cache. This is typical, and if the system needs the memory for something else, it will use it. As mentioned in my previous e-mail, can you run status within the director (bconsole) and see what the clients are doing while the backups are running? Is Bacula actually backing anything up? The first thing to determine is whether there is a problem/malfunction or whether your backups are simply taking too long to run (due to total data, number of files, etc.).

--tom
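For reference, from bconsole that check looks something like this (the client and storage names are placeholders):

```
*status dir
*status client=somehost-fd
*status storage=File
```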
Re: [Bacula-users] Bacula and memory usage
Hi,

How many files and how much total space on each client? 6 TB is not necessarily a huge total amount, but you may want to consider splitting each client job into smaller chunks. Also, what does the status of the jobs show? Does it show that it is indeed backing up data? Unfortunately, if they are not close to finishing, you most likely are going to run into the hard limit on job run time (6 days?) and the jobs will be canceled. I'm assuming that this hard-coded limitation is still in the 7.0.5 code base. Also, to avoid queuing up additional backup runs for the same job, you may want to look into the various options that allow one to cancel jobs if they are already running, already queued, etc.

--tom

On 1/27/20 2:11 PM, Jean Mark Orfali wrote:

Hello,

Thank you for your reply. Here is the missing information. My Bacula server and the four clients are Linux CentOS 7 servers. I use Webmin version 1.941 to access Bacula. The Bacula version is 7.0.5. The SQL server is MariaDB version 5.5.64. The server has 30 TB of hard drive and 30 GB of memory. Backups are saved in a directory directly on the backup server. No backup is kept on the client side. At the moment there is 6 TB of data to back up. On each of the 4 clients I have an incremental backup task scheduled every day at 11 p.m. Right now I have 4 backups that have been running for 5 days and 14 waiting. Here is the server configuration information. Thank you so much!

bacula-dir.conf:

#
# Default Bacula Director Configuration file
#
#  The only thing that MUST be changed is to add one or more
#   file or directory names in the Include directive of the
#   FileSet resource.
#
#  For Bacula release 7.0.5 (28 July 2014) -- redhat Enterprise release
#
#  You might also want to change the default email address
#   from root to your address. See the "mail" and "operator"
#   directives in the Messages resource.
#
Director {                            # define myself
  Name = bacula-dir
  DIRport = 9101
  QueryFile = "/etc/bacula/query.sql"
  WorkingDirectory = /var/spool/bacula
  PidDirectory = "/var/run"
  Maximum Concurrent Jobs = 100
  Password = ""                       # Console password
  Messages = Daemon
}

#
# Define the main nightly save backup job
#   By default, this job will back up to disk in /tmp
#Job {
#  Name = "BackupClient2"
#  Client = bacula2-fd
#  JobDefs = "DefaultJob"
#}

#Job {
#  Name = "BackupClient1-to-Tape"
#  JobDefs = "DefaultJob"
#  Storage = LTO-4
#  Spool Data = yes                   # Avoid shoe-shine
#  Pool = Default
#}
#}

# Backup the catalog database (after the nightly save)
#
# Standard Restore template, to be changed by Console program
#   Only one such job is needed for all Jobs/Clients/Storage ...
#

# List of files to be backed up
FileSet {
  Name = "Full Set"
  Include {
    Options {
      signature = MD5
      compression = GZIP
    }
#
#  Put your list of files here, preceded by 'File =', one per line
#    or include an external list with:
#
#File = \" -s \"Bacula: %t %e of %c %l\" %r"
  operatorcommand = "/usr/sbin/bsmtp -h 51.79.119.27 -f \"\(Bacula\) \<%r\>\" -s \"Bacula: Intervention needed for %j\" %r"
  mail = root@51.79.119.27 = all, !skipped
  operator = root@51.79.119.27 = mount
  console = all, !skipped, !saved
#
# WARNING! the following will create a file that you must cycle from
#  time to time as it will grow indefinitely. However, it will
#  also keep all your messages if they scroll off the console.
#
  append = "/var/log/bacula/bacula.log" = all, !skipped
  catalog = all, !skipped, !saved
}

#
# Message delivery for daemon messages (no job).
Messages {
  Name = Daemon
  mailcommand = "/usr/sbin/bsmtp -h 51.79.119.27 -f \"\(Bacula\) \<%r\>\" -s \"Bacula daemon message\" %r"
  mail = root@51.79.119.27 = all, !skipped
  console = all, !skipped, !saved
  append = "/var/log/bacula/bacula.log" = all, !skipped
}

# Default pool definition
Pool {
  Name = Default
  Pool Type = Backup
  Recycle = yes                       # Bacula can automatically recycle Volumes
  AutoPrune = yes                     # Prune expired volumes
  Volume Retention = 365 days         # one year
  Maximum Volume Bytes = 50G          # Limit Volume size to something reasonable
  Maximum Volumes = 100               # Limit number of Volumes in Pool
}

# File Pool definition
Pool {
  Name = File
  Pool Type = Backup
  Label Format = Local-
  Recycle = yes                       # Bacula can automatically recycle Volumes
  AutoPrune = yes                     # Prune expired volumes
  Volume Retention = 365 days         # one year
  Maximum Volume Bytes = 50G          # Limit Volume size to something reasonable
  Maximum Volumes = 100               # Limit number of Volumes in Pool
# Label Format = "Vol-"               # Auto label
}

# Scratch pool definition
Pool {
  Name = Scratch
  Pool Type = Backup
}

#
# Restricted console used by tray-monitor to get the status of the
Re: [Bacula-users] Ubuntu 18.04 / Bacula 9.0.6 and Postgres 10
Hi Kern, yes, I know - I should have mentioned that we're still running an earlier version of Bacula. But my main point was that Postgres 10 doesn't seem to have any issues for us.

cheers,
--tom

On 09/07/2018 02:41 PM, Kern Sibbald wrote:
> On 09/07/2018 12:05 PM, Thomas Lohman wrote:
>>> FWIW we have not seen any compatibility problems in v.10, but we're not using it with bacula. All I can see in bacula is /usr/libexec/bacula/create_postgresql_database:
>> We've been using Bacula with Postgres 10.x on RH Enterprise 7.5 for a few months now with no issues. The only change to Bacula I made was adding a 10 option to the above-mentioned file.
> Bacula version 9.2.x corrects the option issue you mentioned.
>
> Best regards, Kern
Re: [Bacula-users] Ubuntu 18.04 / Bacula 9.0.6 and Postgres 10
> FWIW we have not seen any compatibility problems in v.10, but we're not using it with bacula. All I can see in bacula is /usr/libexec/bacula/create_postgresql_database:

We've been using Bacula with Postgres 10.x on RH Enterprise 7.5 for a few months now with no issues. The only change to Bacula I made was adding a 10 option to the above-mentioned file.

--tom
Re: [Bacula-users] Incremental backups stacking up behind long-running job
> One of the queued backups is the next incremental backup of "archive". My expectation was that the incremental backup would run only some hours after the full backup finishes, so the difference is really small and it only takes some minutes and only requires a small amount of tape storage. The problem now is that bacula does its check for whether there already is a full backup of "archive" available when adding the job to the queue and not when running it. Since the full backup has not finished yet, there is none, and bacula turns the second incremental backup (and probably the third one) into a full backup as well.
>
> I'm currently running bacula 5.2.6, so my question is if anybody knows a solution to this problem (apart from manually cancelling the queued incremental jobs) or if an upgrade to bacula 7 might solve the problem. The upgrade to 7.4 is planned for the future already.

I believe that the problem you're describing is the same one I had a number of years ago when running 5.2.x. I fixed it and submitted a patch, I believe, so my guess is that this should now be fixed and should not be an issue in 7.4.x:

http://bugs.bacula.org/view.php?id=1882

In addition, there are options to cancel new jobs if there are already running jobs, etc. Please see the following Job options:

Allow Duplicate Jobs = yes/no
Cancel Lower Level Duplicates = yes/no
Cancel Queued Duplicates = yes/no
Cancel Running Duplicates = yes/no

--tom
Re: [Bacula-users] Multiple full backups in same month
> The question now is: does bacula decide whether it will upgrade jobs when it queues the jobs or when it starts them? According to the logs above I think it is when it starts. To my mind it's upgraded when it's queued... I hope I'm wrong :)

Hi, it is done when the job is queued to run. So, if you see it listed under Running Jobs in bconsole, then it's already been decided. Queued to run isn't necessarily the same as when the job actually starts, due to other factors/settings.

hope this helps,
--tom
Re: [Bacula-users] Multiple full backups in same month
On 25/06/15 13:21, Silver Salonen wrote:
>> But why did it upgrade the other incrementals in the queue if the first incremental was upgraded to full?
> Because the algorithm is broken. It should only make that decision when the job exits the queue. I filed a bug against this a long time ago. It still isn't fixed.

I believe Alan is right and you're experiencing this bug, or something similar, depending on what configuration parameters you have set:

http://bugs.bacula.org/view.php?id=1882

I fixed this particular issue described in the bug report referenced above, which we ran into in 5.2.13, along with some other things, but never got those into the main code base. We're still running 5.2.13 and I have not had the time to port my changes to 7.0.x, but you might be able to look at my changes to 5.2.13 and make the equivalent changes in 7.0.x.

--tom
Re: [Bacula-users] Multiple full backups in same month
> Ok, so the option Allow Duplicate Jobs = no can at least prevent multiple full backups of the same server in a row, as stated before?

As others mentioned, I think it may help in your case, but it may not completely solve the problem that you saw. It looks like you had 5 instances of the same job queued up at the same time. Disallowing duplicate jobs would mean the last 4 would be canceled once queued (but after being upgraded to Full). Now, if we assume your original Full job actually ended up running and completing successfully, your next instance of this job will still get upgraded to Full, I suspect, since it's going to see the canceled jobs as newer than that successful Full. The problem, I think, is what I described in bug 1882:

> The original 5.2.13 behavior when determining if a failed job needs to be rerun was to look at the start time of the most recent successful backup. From there it would then see if any job had started since then and failed. As pointed out, this creates an issue when you have FULL jobs that tend to run longer than the time period between normal backups for those jobs, i.e. the job laps itself, so to speak. Any new jobs would be upgraded to FULLs and then canceled since the original FULL was still running (this assumes that duplicate jobs are not allowed). But once the original FULL finished, Bacula was grabbing its start time and then seeing those canceled FULL jobs that happened since the successful FULL was started. To me, it seems like looking at the end time of that successful job makes more sense.

The change I made was to have Bacula look at the real end time of the last successful job and then see if any jobs have failed since that time. This fixed these types of issues for us.
Sorry that this probably doesn't help you fix it right now if you're running 7.0.x, but I think it does explain the behavior that you're seeing, and it also suggests that the issue is still there in 7.0.x.

And just for completeness, these are the related settings that we run with:

Allow Duplicate Jobs = no
Cancel Lower Level Duplicates = yes
Cancel Queued Duplicates = yes
Cancel Running Duplicates = no
Rerun Failed Levels = yes

hope this helps,
--tom
Re: [Bacula-users] Multiple full backups in same month
No, because the end time of Full job #1 occurred after the end time of the failed job #2. Bacula doesn't see any failed jobs occurring after the end time of successful job #1, which is all it cares about - at least in our patched version.

--tom

> Wouldn't this changed behavior run into the problem that cancelled duplicates are still seen as failed jobs and therefore jobs would still be upgraded? E.g.:
>
> 1. Full starts
> 2. Incr is queued, upgraded to Full and cancelled.
> 3. Full ends
> 4. Incr is queued, checks that Full job no. 1 finished OK, but then checks that Incr-Full job no. 2 failed - thus it's still upgraded to Full and started.
>
> -- Silver
Re: [Bacula-users] how to debug a job
> Even though, IMHO, spooling disk backups is just muda (a Japanese term): http://en.wikipedia.org/wiki/Muda_(Japanese_term)

Not necessarily - if you have a number of backups that tend to flake out halfway through for whatever reason (network, client issues, user issues, etc.), then by spooling backups and then de-spooling sequentially to disk, you save your disk volumes from filling up with unnecessary cruft - which, depending on how everything is configured for you, could cause problems. If the community version could restart backups from the aborted point, then this probably wouldn't be a potential issue.

cheers,
--tom
Re: [Bacula-users] Schedule question
> Is there a quick way to set the schedule to be every other week (to create full backups every 14 days, i.e. on even weeks since 01.01.1971, for example)? If there is no predefined keyword, is there a way to trigger this based on the result of an external command?

Hi, you may also want to look at the Max Full Interval option, which allows one to specify the maximum number of days between FULLs for a job.

hope this helps,
--tom
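A sketch of that option in a Job resource (the job name is a placeholder; 14 days matches the every-other-week example above):

```
Job {
  Name = "BackupClient1"
  JobDefs = "DefaultJob"
  Max Full Interval = 14 days  # upgrade to Full if the last Full is older than this
}
```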
[Bacula-users] pruning of virtual full jobs
This is probably a question for Kern, or perhaps it would be better posted to bacula-devel, but I'll send it here since others may have experienced this or have comments on it.

Assume you are running Virtual Fulls every X days (via the Max Full Interval for Virtual Fulls) and also have retention periods for clients/volumes set. When a client that comes and goes is ready for a new Virtual Full, it's possible that there have been no new Incremental/Differential backups since the last Virtual Full. So, it simply makes a new copy of the last Virtual Full, which makes sense. When you then run a prune of that client, it will look at the JobTDate of the Virtual Full job and see the date of the original last real backup for that client, and, depending on the retention defined, will delete the job information, which then leads to an error on the client's next backup attempt. At this point, you have to get the client in and do a new Full for that job.

The issue really seems to be whether or not, for Virtual Fulls, pruning should use the real job termination time rather than the job termination time that gets dragged forward from the last real backup that was done. It seems to me that it should, but I can see an argument the other way as well, since the actual data you're storing has aged past your retention periods.

--tom
Re: [Bacula-users] How to do Cross Replication Site1=Site2 (DR)
> First let me thank you all for your responses, I really appreciate them. As Joe said, I think the problem here is the Bacula job ids. Is there any way to tell bacula to start from (let's say) job id 900? I think that's an easy way to fix the whole problem, as I will be able

I am not familiar enough with MySQL and its workings, but with Postgres, the jobid column in the job table is defined as a sequence - job_jobid_seq. When this is first created, it can be seeded with whatever starting value you wish. e.g.

\d job_jobid_seq
        Sequence "public.job_jobid_seq"
    Column     |  Type   |        Value
---------------+---------+---------------------
 sequence_name | name    | job_jobid_seq
 last_value    | bigint  | 328864
 start_value   | bigint  | 1
 increment_by  | bigint  | 1
 max_value     | bigint  | 9223372036854775807
 min_value     | bigint  | 1
 cache_value   | bigint  | 1
 log_cnt       | bigint  | 31
 is_cycled     | boolean | f
 is_called     | boolean | t

So, you could have one server start at 1 and another start at some number that you know the first server will never reach (assuming you want them to have unique job id sets forever).

hope this helps,
--tom
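For example, seeding the second server's sequence could look something like this sketch (assuming PostgreSQL and the stock Bacula schema; the starting value is illustrative, not a recommendation):

```sql
-- Run against the second director's catalog so its job ids never
-- collide with the first server's range (value is illustrative).
ALTER SEQUENCE job_jobid_seq RESTART WITH 1000000;
-- The next jobid handed out will now come from the new range:
SELECT nextval('job_jobid_seq');
```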
Re: [Bacula-users] File volumes and scratch pool
> My volumes are of type file, so using new volumes vs. recycling expired ones just fills up the file system with old data. It makes it hard to manage and forecast filesystem space needs. I have never understood Bacula's desire to override my policy and insist on preserving data that I have already defined as useless.

If one of the issues is getting rid of old data that goes beyond the retention period, then one should be able to use the truncate-volume-on-purge directive and then set up a way to ask Bacula to purge those volumes once they are moved into your recycle pool (via a separate job/script that runs the appropriate bconsole commands). As far as I understand things, Bacula won't do the truncate automatically when it marks the volume as purged and moves it into the recycle pool.

Bacula will still use new, never-before-used volumes when it grabs one from the recycle pool (although I suspect that if you knew what you were doing, you could get around that by updating the proper timestamps/attributes on the media records for the truncated volumes so they would appear as new). But if the used volumes are truncated, then they won't fill up the file system, and the backup data should be deleted.

hope this helps,
--tom
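A minimal sketch of the pieces involved (the pool and storage names are placeholders; the bconsole command is what a scheduled script could run to truncate already-purged volumes):

```
# bacula-dir.conf - mark volumes in this pool as truncatable once purged
Pool {
  Name = File
  Pool Type = Backup
  Action On Purge = Truncate
}
```

and then, from bconsole:

```
*purge volume action=truncate pool=File storage=File
```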
Re: [Bacula-users] v7.0.4 migrate: StartTime older than SchedTime
> StartTime does not get updated when migrating a job. Is this a bug or is it the way it is supposed to be?

I believe that this is the way it is supposed to work. When copying/migrating a job, or when creating a Virtual Full job from previous jobs, the start time of the new job gets set to the start time of the copied/migrated job - or, in the case of a Virtual Full, to the start time of the last backup used to create the Virtual Full. This, I believe, is because that start time is used when determining what needs to be backed up if you're doing another backup that will be based off of that job. This can cause issues if you're assuming StartTime is the real start time of a job, as you've discovered. I went ahead and added a realstarttime attribute to jobs as part of some of my patches/extensions, but those were for 5.2.13 and not the latest release, 7.0.x.

--tom
Re: [Bacula-users] Socket terminated message after backup complete
> According to http://www.baculasystems.com/windows-binaries-for-bacula-community-users, 6.0.6 is still the latest version. Does this mean the bug was never fixed there, or is it the text on that page that needs updating? Or is it something else entirely, and is it not this bug that's hitting me?

Hi, it's possible that there may be other scenarios where that particular bug occurs, or it's also possible that the patch to the community code did not make it into the enterprise version that you're using. I am not sure. Kern may be able to answer.

--tom
Re: [Bacula-users] Socket terminated message after backup complete
> Because traffic is going through those firewalls, I had already configured keepalive packets (heartbeat) at 300 seconds. In my first tests, backups *did* fail because that was missing. Now they don't seem to fail anymore, but there's that socket terminated message every now and then that doesn't belong there.

Hi,

This seems like the problem that you're having:

http://bugs.bacula.org/view.php?id=1925

I believe this was fixed in community client version 5.2.12, and I can verify that we no longer see these warning/error messages on clients that have been upgraded to >= 5.2.12. We still see it on Windows machines that are running 5.2.10. I don't know which version of the Enterprise client has this fix in it. The messages themselves are mainly harmless, so you can ignore them if you want to.

--tom
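The 300-second keepalive mentioned above is set with the Heartbeat Interval directive; a sketch for the client side (the daemon name is a placeholder - the same directive also exists for the SD and DIR):

```
# bacula-fd.conf
FileDaemon {
  Name = somehost-fd
  Heartbeat Interval = 300  # send a keepalive on the SD/DIR sockets every 300 seconds
}
```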
Re: [Bacula-users] Fatal error: Authorization key rejected by Storage daemon
I've seen this error before, on and off, on one particular client. Nothing changes with regard to the configuration, and yet the error will crop up. Usually a combination of the following fixes it: cancel/restart the job, restart the Bacula client, or restart the Bacula storage daemon. Since it only happens with this one client, I haven't bothered to try and figure out why exactly. I'd be interested if anyone has any thoughts on what causes this error to randomly occur. --tom

I have a problem with my Bacula server and my FD on my client-test server (centos6-fd). When I try to run a job with BAT I get the following error: centos6-fd Fatal error: Authorization key rejected by Storage daemon. Please see http://www.bacula.org/en/rel-manual/Bacula_Freque_Asked_Questi.html#SECTION0026 for help. bacula.local-dir Start Backup JobId 156, Job=BackupCentos6.2014-06-03_16.19.34_08 Using Device LTO-4 to write. bacula.local-dir Fatal error: Bad response to Storage command: wanted 2000 OK storage, got 2902 Bad storage From my server I can telnet to the client on ports 9102 and 9103. From my client I can telnet to my server on ports 9101, 9102 and 9103. So I thought it was a password mistake, but I use the same password everywhere. Any idea/suggestion please? Benjamin ___ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users
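For anyone chasing this class of error: the Director-to-SD authorization depends on the password in the Storage resource of bacula-dir.conf matching the password in the Director resource of bacula-sd.conf (it is not the client password). A minimal sketch of the pairing, with hypothetical names, addresses and passwords:

```conf
# bacula-dir.conf -- password the Director presents to the Storage daemon
Storage {
  Name = LTO-4
  Address = bacula.local        # hypothetical
  SDPort = 9103
  Password = "sd-secret"        # must match the Director resource below
  Device = LTO-4
  Media Type = LTO-4
}

# bacula-sd.conf -- which Director may connect, and with what password
Director {
  Name = bacula.local-dir
  Password = "sd-secret"        # same value as in the Storage resource
}
```

A stale daemon holding an old key can produce the same symptom even when both files agree, which would fit the "restart fixes it" pattern described above.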
Re: [Bacula-users] Delete files from failed jobs
Thank you. So the only way is to configure the volume to be used by only one job; then if a job fails I can delete the entire volume. I'll try this. Hi, you can also choose to spool jobs before they are written to your actual volumes. This way, if jobs tend to fail in the middle for whatever reason, no space will be wasted inside your volumes. --tom
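For reference, spooling is enabled per Job with the Spool Data directive; a minimal sketch with a hypothetical job name (the other Job directives are elided):

```conf
Job {
  Name = "ExampleJob"           # hypothetical
  # ... Client, FileSet, Schedule, Storage, Pool as usual ...
  Spool Data = yes              # stage job data in the spool area first
  Spool Attributes = yes        # batch-insert file attributes into the catalog
}
```

With this in place, a job that dies mid-transfer leaves its partial data in the spool directory rather than inside a volume.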
Re: [Bacula-users] Restore from an incremental job: No Full backup before ... found
I guess I will go with Sven's suggestion, or does anyone have any other recommendation on running a weekly backup with a 7-day archive? Hi, this may be the same as Sven's recommendation, but if you want to guarantee the ability to restore data as it was 7 days ago, then you'll need to set your retention period to 14 days. An example may illustrate best: May 3rd - Full; May 4th-9th - Incrementals; May 10th - Full; May 11th - Incremental; May 12th - restore request for the data as it was on May 9th. With only a 7-day retention period, by the time May 12th comes around, you've potentially lost your May 3rd Full. Whether or not you've actually lost the data depends on whether the volume it resides on has actually been overwritten/re-used yet; how things behave, of course, will depend on your exact configuration. If it has not been overwritten, then you do have options. I have never used it, but you could try using a volume scanning tool (i.e. bscan) to re-create the DB meta-data for the jobs on that volume. Another option would be to restore your DB back to May 9th on another computer (i.e. a spare/test Bacula server) and then use it to get at the data. I've done the latter with success when someone wanted some data that was older than our restore window. cheers, --tom
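The retention advice above lives in the Pool resource; a hedged sketch with a hypothetical pool name (adjust the other directives to your setup):

```conf
Pool {
  Name = FilePool               # hypothetical
  Pool Type = Backup
  Volume Retention = 14 days    # keeps the older Full restorable for a 7-day window
  AutoPrune = yes               # prune expired jobs/files when a volume is needed
  Recycle = yes                 # allow pruned volumes to be reused
}
```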
Re: [Bacula-users] SOLVED: catalog problem: duplicate key value violates unique constraint fileset_pkey
It did. Thanks a lot for your help - I highly appreciate it. If we ever run into each other in real life, please remind me that I owe you some beer... No problem :) - glad that you got it working. --tom
Re: [Bacula-users] catalog problem: duplicate key value violates unique constraint fileset_pkey
I tried that, but it fails: Enter SQL query: alter sequence fileset_filesetid_seq restart with 76; Query failed: ERROR: must be owner of relation fileset_filesetid_seq I ran this under bconsole, i.e. as user bacula - is this not the right thing to do? Wolfgang, as someone already pointed out, it sounds like the owner of your Bacula database sequences is another user - more than likely the Postgres superuser, which is probably named something like 'postgres' on your system. You will need to connect to the database as that user in order to have update privileges on the sequences. hope this helps, --tom
Re: [Bacula-users] catalog problem: duplicate key value violates unique constraint fileset_pkey
My guess is that during the migration from MySQL to Postgres, the sequences in Bacula did not get seeded right and probably are starting with a seed value of 1. The filesetid field in the fileset table is automatically populated by the fileset_filesetid_seq sequence. Run the following two queries and see what the results are - in particular, see what the last_value is for the sequence. It should be equal to the max value from the fileset table, which it is in my Bacula database. If not, you'll need to manually fix it via an SQL command against the sequence:

select max(filesetid) from fileset;
select * from fileset_filesetid_seq;

hope this helps, --tom Hello, I've tried to switch a Bacula configuration that has been running for years from MySQL to PostgreSQL. Everything worked apparently fine (I did the same before with two other installations, where the very same steps worked, too), but when trying to run jobs in the new PostgreSQL environment, some jobs fail with errors like this: 13-Jan 22:13 XXX-dir JobId 1: Error: sql_create.c:741 Create DB FileSet record INSERT INTO FileSet (FileSet,MD5,CreateTime) VALUES ('YYY root','zD/PtXx6xx/IEHZH8X5OJB','2014-01-13 22:13:59') failed. ERR=ERROR: duplicate key value violates unique constraint fileset_pkey DETAIL: Key (filesetid)=(1) already exists. 13-Jan 22:13 XXX-dir JobId 1: Error: Could not create FileSet YYY root record. ERR=sql_create.c:741 Create DB FileSet record INSERT INTO FileSet (FileSet,MD5,CreateTime) VALUES ('YYY root','zD/PtXx6xx/IEHZH8X5OJB','2014-01-13 22:13:59') failed. ERR=ERROR: duplicate key value violates unique constraint fileset_pkey DETAIL: Key (filesetid)=(1) already exists. Not all jobs are failing like this, only some. Is there a way to check the DB for consistency (or, even better, to repair it)? What could cause such issues, and what could be done to fix them?
I don't know if it's related, but maybe I should note that in the old setup (with a MySQL DB) I occasionally had jobs failing with errors like this: 30-Dec 00:05 XXX-dir JobId 70535: Start Backup JobId 70535, Job=AAA-Root.2013-12-30_00.05.02_02 30-Dec 00:05 XXX-dir JobId 70535: Using Device LTO3-1 to write. 30-Dec 00:19 ZZZ-sd JobId 70535: Fatal error: askdir.c:340 NULL Volume name. This shouldn't happen!!! 30-Dec 00:19 ZZZ-sd JobId 70535: Spooling data ... 30-Dec 00:06 AAA-fd JobId 70535: /work is a different filesystem. Will not descend from / into it. 30-Dec 00:21 ZZZ-sd JobId 70535: Elapsed time=00:01:13, Transfer rate=0 Bytes/second 30-Dec 00:06 AAA-fd JobId 70535: Error: bsock.c:429 Write error sending 8 bytes to Storage daemon:ZZZ:9103: ERR=Connection reset by peer 30-Dec 00:06 AAA-fd JobId 70535: Fatal error: xattr.c:98 Network send error to SD. ERR=Connection reset by peer Out of 30+ jobs running each night, only one would fail about once per week, and it was always one of the same 2 candidates - all others never showed any such problem. I have been wondering if there was some DB issue for these jobs, which is one of the reasons for switching to PostgreSQL. But maybe this is totally unrelated... Any help welcome. Thanks in advance. Best regards, Wolfgang Denk
Re: [Bacula-users] catalog problem: duplicate key value violates unique constraint fileset_pkey
Wolfgang, Dear Thomas, in message 52d555c5.9070...@mtl.mit.edu you wrote: My guess is that during the migration from MySQL to Postgres, the sequences in Bacula did not get seeded right and probably are starting with a seed value of 1. Do you have any idea why this would happen? Is this something I can influence? Are there any other variables that might be hit by similar issues? I can't say exactly why it happened to you, but my guess would be that this problem could hit anyone porting from MySQL to Postgres. I'm not familiar with the Bacula procedure for doing that (if you used one), but any Postgres sequence creations during the Postgres DB setup would more than likely be created with a default starting value of 1 - and if you've already got data in your database (migrated over from MySQL), then all sequences would need to be seeded properly. The bad news for you may be that almost all of the Bacula tables have sequences to generate their id fields: client, file, filename, path, job, jobmedia, fileset, media and pool. I believe in each case the 'id' field is the primary key, which means it will be unique - thus any inserts should fail with an error, ensuring that your database doesn't get into a strange funky state with multiple records having the same id. It may also be that you get lucky and avoid that for tables such as file, job and filename, because if your database has been around a while, restarting those counters back at 1 may not overlap with any existing/current data (e.g. if all old jobs have been purged, then restarting at 1 shouldn't cause problems - depending on your configuration, of course). With that said, if it were me, I'd re-seed all the sequences to where the id left off for each of the tables to avoid possible future insert errors/conflicts.
select max(filesetid) from fileset; select * from fileset_filesetid_seq; This is what I get:

Enter SQL query: select max(filesetid) from fileset;
  max = 75
Enter SQL query: select * from fileset_filesetid_seq;
  sequence_name = fileset_filesetid_seq, last_value = 4, start_value = 1, increment_by = 1, max_value = 9,223,372,036,854,775,807, min_value = 1, cache_value = 1, log_cnt = 32, is_cycled = f, is_called = t

Sorry, my DB/SQL knowledge is somewhat limited (read: non-existent). Could you please be so kind and tell me how I could fix that? Well, if your DB knowledge is limited, then you may want to consult someone at your location who may be able to assist. Given that, I'll say the next part with the usual use-at-your-own-risk disclaimer. To change the last_value field of a Postgres sequence, you need to use the Postgres alter sequence command, e.g.

alter sequence fileset_filesetid_seq restart with 76;

After that, the next fileset record created should get an id value of 76. This may be dependent on your version of Postgres. I am using 9.1.x and am looking at the following documentation: http://www.postgresql.org/docs/9.1/static/sql-altersequence.html I would then redo the above procedure for each of the sequences for each of the Bacula tables (querying to get the max value currently used and then resetting the last_value field to max value + 1). hope this helps and good luck, --tom
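As an alternative to ALTER SEQUENCE, Postgres's setval() function can derive the new value directly from the table, which avoids computing max + 1 by hand. A sketch for the fileset table (run as the sequence owner; repeat the pattern for each Bacula table and its sequence):

```sql
-- setval() with is_called left at its default of true means the
-- next nextval() call will return max(filesetid) + 1.
SELECT setval('fileset_filesetid_seq',
              (SELECT max(filesetid) FROM fileset));
```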
Re: [Bacula-users] Restores fail because of multiple storages
That seems a workable solution, but creating a symbolic link for every volume required by a restore job introduces a manual operation that would be better to avoid, especially if a lot of incremental volumes are involved. We use symbolic links here and have never had any problems. All volumes are created ahead of time, so the links are created at the same time. It may not be the most elegant solution, but it's certainly workable, and for us it eliminated the issues we were having with vchanger mistakenly marking volumes in error, which then had to be corrected manually. hope this helps, --tom
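To make the symlink approach concrete, here is a small sketch (all paths are hypothetical, using a temp directory as a stand-in for real mount points): volumes live on separate physical disks and are linked into the one directory the SD's Archive Device would point at, so they are in place before any restore needs them.

```shell
#!/bin/sh
# Sketch only: present volume files spread across several disks as one
# logical Bacula storage directory via symlinks. Paths are examples.
set -e

base=$(mktemp -d)                    # stand-in for real mount points
mkdir -p "$base/disk1" "$base/disk2" "$base/storage"

# Volumes are pre-created on the physical disks...
touch "$base/disk1/Vol-0001" "$base/disk2/Vol-0002"

# ...then linked into the single directory the SD Device points at.
for vol in "$base"/disk*/Vol-*; do
    ln -s "$vol" "$base/storage/$(basename "$vol")"
done

ls "$base/storage"
```

Because the links exist from the moment each volume is created, no per-restore linking step is needed.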
Re: [Bacula-users] Restores fail because of multiple storages
10-dic 17:46 thisdir-sd JobId 762: acquire.c:121 Changing read device. Want Media Type=JobName_diff have=JobName_full device=JobName_full (/path/to/storage/JobName_full) I think that you want to make sure the Media Type for each Storage Device is File. It looks like you've defined them to be different. It might help if you were to post your storage configuration, which would allow folks to see the details of your setup. hope this helps, --tom
Re: [Bacula-users] Bacula Console - shows conflicting info
25-Nov 13:38 home-server-dir JobId 144: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer 25-Nov 13:38 home-server-dir JobId 144: Fatal error: No Job status returned from FD. 25-Nov 13:38 home-server-dir JobId 144: Error: Bacula home-server-dir 5.2.5 (26Jan12): I am not sure what your exact configuration is, but my guess/hunch is that your jobs are being spooled to the server, and while they are then being de-spooled to your volumes, the connection back to the client is cut off for whatever reason (the Connection reset by peer error). This, I think, would explain why the client may in fact believe it finished OK while the server doesn't. That is probably technically a bug and not a feature. :) Look at the Bacula Heartbeat Interval option if you are not using it already and see if that helps to keep the connection alive. hope this helps, --tom
Re: [Bacula-users] ERROR Spooling/Backups with large amounts of data from windows server 2012
- heartbeat: enabling on the SD (60 seconds) and net.ipv4.tcp_keepalive_time also set to 60 In glancing at your error (Connection reset by peer) and your config files, I didn't see the Heartbeat Interval setting in all the places it may need to be. Make sure it is in all of the following locations: the Director definition for the server Director daemon, the Storage definition for the server Storage daemon, and the FileDaemon definition for the client File daemon. That error typically means the network/socket connection between the file daemon and the storage daemon was closed unexpectedly at one end, or by something in between blocking/dropping it. I have also seen that error suddenly pop up on Windows clients for no obvious reason, where a reboot of the Windows box fixed it. --tom
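Sketched out, the three placements look like this (resource names are hypothetical; the interval is in seconds):

```conf
# bacula-dir.conf -- Director resource
Director {
  Name = server-dir             # hypothetical
  Heartbeat Interval = 60
}

# bacula-sd.conf -- Storage resource
Storage {
  Name = server-sd
  Heartbeat Interval = 60
}

# bacula-fd.conf on the client -- FileDaemon resource
FileDaemon {
  Name = client-fd
  Heartbeat Interval = 60
}
```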
Re: [Bacula-users] FW: Client last backup time report
We do something like this by running a job within Bacula every morning that scans all client configuration files, builds a list of expected current jobs/clients, and then queries the Bacula DB to see when/if they've been successfully backed up (i.e. marked with a T). If it's been more than the specified number of days, they are added to a list which is then mailed to whatever address is specified (e.g. the IT system folks). The content of the message looks something like this: WARNING -- Bacula has not backed up: (1) Job: foobar for Client: foobar-host in the past 10 days I suspect that this utility is fairly specific to our configuration structure, so I'm not sure it could be of direct help to you, but I figured I'd throw it out there as an example: what you want to do is pretty straightforward, and there are a lot of ways to implement it. :) --tom I need to create a report with the last time a good backup was run for each client. We are looking for anyone who has not backed up recently, so it would be nice if the report could be set for clients that have not had a successful backup in 1 week, or even a variable amount of time. I am assuming this would be SQL. Grepping (or anything else) Bacula's 'list jobs' output would not work, since a client that has not even started a backup would not be listed there. (Our backups are kicked off by remotely calling a script on each client that starts the FD. We have several 'waves' of backups when departments are not here or would be least affected by the backup.) We envision the report being something like:

Name       Last Backup       F/D/I  JobFiles  JobBytes  JobStatus
COMPUTER1  2013-11-06 23:59  I      291       29,056    T
LAPTOP2    2013-10-20 10:30  D      17        89,423    T
COMPUTER2  2013-10-19 17:05  I      0         0         E

Anyone else doing something like this, or can you point me to some examples? Thanks in advance
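One possible starting point for the catalog side of such a report, against the standard Bacula catalog schema (note that, as the original poster observes, a query like this cannot list clients with no job records at all, so it still needs to be joined against an externally maintained expected-client list):

```sql
-- Most recent successful ('T') job per client, oldest first,
-- so stale clients float to the top of the report.
SELECT c.name AS client, MAX(j.endtime) AS last_good_backup
FROM job j
JOIN client c ON c.clientid = j.clientid
WHERE j.jobstatus = 'T'
GROUP BY c.name
ORDER BY last_good_backup;
```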
Re: [Bacula-users] Problem with Bacula 5.2.5 and Windows client 5.2.10.
We are having a problem between a Bacula server version 5.2.5 (SD and Dir) and a Windows client running Bacula-fd 5.2.10. While this may not be your problem, in general I recall it is best to keep the client versions less than or equal to the server versions. --tom
Re: [Bacula-users] Fwd: Spooling attrs takes forever
Yes, for disk storage, it does not make much sense to have data spooling turned on. I would suggest always turning attribute spooling on (default off) so that attributes are inserted in batch mode (much faster), and, if possible, ensuring that the working directory, where attributes are spooled, is on a different drive from the Archive Directory. Of course, this last suggestion is most often not possible. One reason to turn on data spooling even if you use disk storage for your volumes is if you tend to have hosts that abruptly get pulled off the network during backups, or otherwise have hiccups that cause backups to fail. With spooling, you shouldn't get volumes filling up with data from partially completed, failed backups. --tom
[Bacula-users] rescheduling jobs and max sched time
Hi, we have jobs whose time, either sitting and waiting or running, we want to limit to a certain number of hours. In addition, we want these jobs to reschedule on error - essentially, start the job at time X, keep trying to run, but end after Y hours no matter what. I've found that if you use Reschedule On Error together with Max Run Sched Time, the latter will use the latest scheduled time as opposed to when the job was initially scheduled. The database schedule time seems to stay the originally scheduled time, since it's really the same job as far as that is concerned. This all seems to make sense, but it doesn't accomplish what we want to do. I was wondering if I'm missing existing options, or will need to extend Bacula with a new Max Run Init Sched Time option which would use that initial scheduled time when determining if the job should be ended. thanks, --tom
Re: [Bacula-users] Client side FS detection
One idea I can think of is using a list of filesystem types that matter. That way you can handle most things and also exclude cluster filesystems like ocfs2 that are best backed up with a different job and a separate fd. This is what we do for our UNIX systems. We actually define each file system as its own job and have things set up so that if a mismatch occurs between what is found on a client and what is being backed up, it is reported and can be fixed. You're right that a problem with this approach arises if your clients may be attaching storage that uses unexpected file system types. For us, that isn't really a problem, since the policy is that we back up what is fixed on the computer, and each computer is set up by us as well. hope this helps, --tom
Re: [Bacula-users] Network error with FD during Backup: ERR=Connection reset by peer
Yesterday I waited for the job to finish the first tape and then wait for me to insert the next one. I opened Wireshark to see if there was a heartbeat during the wait - and there was none. During the job the heartbeat was active. From what you wrote, the heartbeat should be active while waiting for a tape. Could you try to confirm that (have a look at the code)? Marcus, I think that you should be seeing heartbeats in this case. What version of the Storage Daemon are you running? I am looking at 5.2.10 and up as far as the code goes. Can you run it in debug mode? If so, set the debug level to 400 and you should get some messages in the output if the heartbeat logic is working. The heartbeat is sent from inside this method:

/*
 * Wait for SysOp to mount a tape on a specific device.
 * Returns: W_ERROR, W_TIMEOUT, W_POLL, W_MOUNT, or W_WAKE
 */
int wait_for_sysop(DCR *dcr)

Inside that method, there is a particular debug line:

Dmsg0(dbglvl, "Send heartbeat to FD.\n");

Anyhow, if you're not seeing this debug output, then it is not sending a heartbeat for whatever reason. If you do see it, then it is sending, so the problem lies elsewhere if the heartbeat is still not arriving at its destination. hope this helps, --tom
Re: [Bacula-users] Network error with FD during Backup: ERR=Connection reset by peer
I could now check whether the Bacula FD-to-SD connection timed out because of the network switches. This was not the case; my job still cancels. My experience is that the heartbeat setting has not helped us with our Connection Reset by Peer issues, which occur occasionally. Something more is going on than a typical network timeout. Can someone tell me how and when the heartbeat should occur? Is it active when no job is running? In my config I set the following line for dir, sd and fd: Heartbeat Interval = 5 This should result in a heartbeat every 5 seconds? The heartbeats are only set up when a job with a client is initiated, so there should be no activity when no job is running. When you initiate a job with the client, the director sets up a connection with the client, telling it what storage daemon to use. The client then initiates a connection back to that storage daemon. If you have the heartbeat settings in place as you do, then you should see heartbeat packets sent from the client back to the director in order to keep that connection alive while the data is being sent to the storage daemon. In addition, you may see heartbeat packets sent from the storage daemon to the client. I'd have to re-read the code, but I believe this is used in the scenario where the storage daemon is waiting for a volume to write the data to (i.e. operator intervention); if the heartbeat setting is on, the storage daemon will send heartbeats back to the client to keep the connection alive while it waits. Also of note: 5 seconds is the minimum feasible setting. The heartbeat thread wakes up every 5 seconds to check whether it needs to send a heartbeat to the director, so anything less than that really isn't going to do anything. hope this helps, --tom
Re: [Bacula-users] Network error with FD during Backup: ERR=Connection reset by peer
Tom: how did you restart the job? Did you have a script, or do you do it by hand? There are Job options to reschedule jobs on error:

Reschedule On Error = yes
Reschedule Interval = 30 minutes
Reschedule Times = 18

The above will reschedule the job 30 minutes after a failure, and it'll try to do that 18 times before finally giving up. These options come in handy if you're backing up laptops or other computers that may not be on your network 24x7. hope this helps, --tom
Re: [Bacula-users] Network error with FD during Backup: ERR=Connection reset by peer
2012-09-19 22:58:45 bacula-dir JobId 13962: Start Backup JobId 13962, Job=nina_systemstate.2012-09-19_21.50.01_31 2012-09-19 22:58:46 bacula-dir JobId 13962: Using Device FileStorageLocal 2012-09-19 23:02:41 nina-fd JobId 13962: DIR and FD clocks differ by 233 seconds, FD automatically compensating. 2012-09-19 23:02:45 nina-fd JobId 13962: shell command: run ClientRunBeforeJob C:/backup/bacula/systemstate.cmd 2012-09-19 23:03:40 bacula-dir JobId 13962: Sending Accurate information. 2012-09-19 23:05:12 bacula-dir-sd JobId 13962: Job write elapsed time = 00:01:21, Transfer rate = 2.517 M Bytes/second 2012-09-19 23:09:06 nina-fd JobId 13962: shell command: run ClientAfterJob C:/backup/bacula/systemstate.cmd cleanup 2012-09-19 23:05:17 bacula-dir JobId 13962: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer We have seen that same error (Connection reset by peer) occasionally for many months. Some are normal - Mac/Windows desktops/laptops that get rebooted or removed from the network during a backup, etc. But sometimes we see this error with UNIX servers that are up 24x7. We suspect that it is network related, since we've had similar errors with print servers and non-Bacula backup servers, but we have yet to pin it down. We restart failed jobs in Bacula, so typically the job completes OK even after initially getting this error on the first try. I'd be curious to know if others get these errors occasionally, and what version of Bacula you're running. --tom
Re: [Bacula-users] backup through firewall - timeout
> Hi folks. I've got a problem whereby my email and web servers sometimes fail to back up. These two servers are inside the DMZ and back up to the server inside my LAN. The problem appears to be the inactivity on the connection after the data has been backed up, while the database is being updated. Does anyone have any suggestions on what I can do? Gary

Gary,

Take a look at the Heartbeat Interval options for the client and storage configurations. More than likely your firewall/router is dropping the connection due to inactivity. How fast it does this will depend on the configuration and the network load, so you may need to experiment with different interval settings.

hope this helps,
--tom
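The Heartbeat Interval directive goes in the daemon resources on each side of the connection; a sketch of the three places it can be set (resource names and the 60-second value are illustrative):

```conf
# bacula-fd.conf, on the client
FileDaemon {
  Name = mail-fd
  Heartbeat Interval = 60
}

# bacula-sd.conf, on the storage server
Storage {
  Name = backup-sd
  Heartbeat Interval = 60
}

# bacula-dir.conf, on the director
Director {
  Name = backup-dir
  Heartbeat Interval = 60
}
```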
[Bacula-users] Heartbeat Interval errors
Since adding Heartbeat Interval (set to 15 seconds) on our clients' FileDaemon definition, as well as the Director definition in bacula-dir.conf and the Storage definition in bacula-sd.conf, it has fixed some of the firewall timeout issues that we've had backing up some clients. But we've also started getting some of the following errors during each backup cycle (even though the backup finishes OK each time):

client-fd JobId 79326: Error: bsock.c:346 Socket is terminated=1 on call to client:xx.xx.xx.xx:36387

My best guess is that the client is trying to send a ping down the connection, but in the time that it decided to do this, the backup finished and the connection was closed. I was wondering if anyone else who uses this option has seen this error, whether it should perhaps be considered a bug, or whether there is something we can do in our configuration to fix it.

thanks,
--tom
Re: [Bacula-users] Problem with bat from Bacula 5.2.10
> bat ERROR in lib/smartall.c:121 Failed ASSERT: nbytes > 0

This particular message is generated because some calling method is passing in a 0 to the SmartAlloc methods as the number of bytes to allocate. This is not allowed via an ASSERT condition at the top of the actual smalloc() method in the smartall.c file. I'd think that you'd need to do some kind of trace to see where the problem is originating.

--tom
Re: [Bacula-users] Problem with bat from Bacula 5.2.10
bat ERROR in lib/smartall.c:121 Failed ASSERT: nbytes > 0

This particular message is generated because some calling method is passing in a 0 to the SmartAlloc methods as the number of bytes to allocate. This is not allowed via an ASSERT condition at the top of the actual smalloc() method in the smartall.c file. I'd think that you'd need to do some kind of trace to see where the problem is originating.

Hm, the question is what should I trace and how? Bat, the director or something other?

The bat executable is the one that you'd trace to see what it is doing. I don't know how much info bat may put out if you run it in some kind of debug mode, but that may be enough, assuming there is such a mode. I suspect you'll need to somehow find out exactly what it's doing that is causing it to try and allocate 0 bytes of memory. If you can get a specific cause, then the Bacula bug folks may be able to track it down and fix it more easily.

hope this helps,
--tom
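For illustration, the guard being described looks roughly like this (a hypothetical sketch, not the actual smartall.c code; `smalloc_sketch` is an invented name):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch of the guard described above: smartall.c's
 * smalloc() ASSERTs that the request is non-zero, so any caller
 * asking for 0 bytes aborts with "Failed ASSERT: nbytes > 0". */
static void *smalloc_sketch(size_t nbytes)
{
    assert(nbytes > 0);        /* a 0-byte request aborts here */
    return malloc(nbytes);
}
```

Tracing bat would mean finding which call path reaches this assert with a zero request.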
Re: [Bacula-users] BAT and qt version
I downloaded the latest stable QT open source version (4.8.2 at the time) and built it before building Bacula 5.2.10. Bat seems to work fine with it. If you do this, just be aware that the first time you build it, it will probably find the older 4.6.x RH QT libraries and embed their location in the shared library path, so when you go to use it, it won't work. The first time I built it, I told it to explicitly look in its own source tree for its libraries (by setting LDFLAGS), installed that version, and then re-built it again, telling it to now look in the install directory.

--tom

> I tried to compile bacula-5.2.10 with BAT on a RHEL6.2 server. I found that BAT did not get installed because it needs qt version 4.7.4 or higher, but RHEL6.2 has version qt-4.6.2-24 as the latest. I would like to know what the others are doing about this issue? Uthra
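The two-pass build described above might look something like this (a sketch only; the paths, version, and configure flags are my assumptions, not exact commands from the message):

```sh
# Pass 1: point the linker at the freshly built Qt's own tree so the
# older system Qt 4.6.x libraries don't end up in the library path.
cd bacula-5.2.10
LDFLAGS="-L/usr/local/src/qt-4.8.2/lib" ./configure --enable-bat
make && make install

# Pass 2: rebuild, now pointing at the installed Qt location.
LDFLAGS="-L/usr/local/qt-4.8.2/lib" ./configure --enable-bat
make && make install
```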
[Bacula-users] bacula working state job listing
This may be a stupid question, but is the working-state data that is cached on the client, and used to display the recent job history of a client from the tray monitor, limited to the most recent 10 job events? Or is there a way to configure this to show and/or cache more than just 10?

thanks,
--tom
[Bacula-users] restores to Windows machines
Hi, We're running 5.2.10 for both Windows 7 clients and our servers. My system admins have noticed that during restores of files to a Windows 7 client, the restored files are all hidden, which requires them to then go in and uncheck the "hide protected operating system files" option. At that point, the files are visible to the user. Typically, they do a restore and specify a restore directory of C:/RestoredFiles or something along those lines. So, in that directory on the client, one sees a C and then the rest of the restored path/files underneath it. The problem seems to be that the permissions on that C sub-directory in C:\RestoredFiles are what cause everything to be hidden. Of the folks here who back up Windows clients, have you seen this problem, and does anyone know of any fixes for it on the Bacula side?

thanks,
--tom
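As a possible client-side workaround (my assumption, not something suggested in the thread), the hidden and system attributes can be cleared recursively on the restore directory from a Windows command prompt:

```bat
:: Hypothetical workaround: clear the hidden (-h) and system (-s)
:: attributes on everything under the restore directory.
:: /s recurses into subdirectories, /d also processes folders.
attrib -h -s C:\RestoredFiles\* /s /d
```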
Re: [Bacula-users] max run time
This actually is a hardcoded sanity check in the code itself. Search the mailing lists from the past year; I'm pretty sure I posted where in the code this was and what needed to be changed. We have no jobs that run more than a few days and have not made such changes ourselves, so I can't guarantee it'll fix your problems completely - all I know is that overcoming the 6 day limit definitely will mean making a few tweaks to the code. You may want to submit a bug report and make the case that such a sanity check should be removed, or have a configurable way to override it.

hope this helps,
--tom

> but still they are terminated after 6 days:
>
> 14-Jul 20:27 cbe-dir JobId 39969: Fatal error: Network error with FD during Backup: ERR=Interrupted system call
> 14-Jul 20:27 cbe-dir JobId 39969: Fatal error: No Job status returned from FD.
> 14-Jul 20:27 cbe-dir JobId 39969: Error: Watchdog sending kill after 518426 secs to thread stalled reading File
>
> I'd like to know how to fix this. I've seen comments on the mailing list in the past that running backups that take more than 6 days is insane. They're wrong in my environment. I don't want to hear that again. I have a genuine reason for running very long backups and I need to know how to make it work.
Re: [Bacula-users] Restores to Windows host fail and file daemon crashes on 5.2.9
> I am running version 5.2.9 on my director and file daemon. I am able to back up successfully, but when I attempt to restore data onto the 32bit Windows 2003 file daemon, the bacula service terminates on the 2003 server and the restore job fails. I can choose a Linux file daemon as the target for the data and the data is restored, but if I choose the Windows 2003 32bit file daemon, the file daemon crashes. What can I do to troubleshoot this further?

Yes, this sounds like the same problem a number of sites, including us, have had. I suspect it will work fine if you put 5.0.3 on the Windows client. Also, looking at the bug tracker e-mails, I believe Kern may have fixed this issue in 5.2.10, which will be the next minor release.

--tom
Re: [Bacula-users] Restore dies every time
> Restores to the Windows client systematically crash the FD on the client without restoring anything. This seems to be a known, as yet unsolved problem. There are several posts on this on the list.

Yes, we have the same problem. For now, we have rolled back our Windows clients to 5.0.3, which works fine. I opened a bug report for this, but I don't think that they were able to reproduce it, so they wanted a complete stack trace of the dying client, which I don't have time to do at the moment. I believe the bug was closed, but I'd be happy to re-open it if anyone has a complete trace of the dead FD. Or feel free to open a new report, since there is obviously a bug in there somewhere, given the number of people experiencing this.

--tom
Re: [Bacula-users] Bad interaction between cancel duplicates and rerun failed jobs
Jon, I believe I posted this same issue back in April and didn't get any replies. I never did submit it as a bug, but it does seem to be a bug to me. http://sourceforge.net/mailarchive/forum.php?thread_name=4F8ECD71.8080203%40mtl.mit.edu&forum_name=bacula-users Perhaps I'll go ahead and post a bacula bug report and see what they say about this scenario.

cheers,
--tom

> So I've got a full backup job that takes more than a day to complete. To keep a second full backup from getting started while the first one is still completing, I've set the following in the Job definitions:
>
> Allow Duplicate Jobs = no
> Cancel Queued Duplicates = yes
>
> However, to handle network connection issues or clients being missing when their scheduled backup time comes around, I have this setting in the Job definitions as well:
>
> Rerun Failed Levels = yes
>
> It seems that the duplicate job handling marks the level as failed, so that when the first backup finishes, the next backup that wants to run should be an incremental, but gets upgraded to a full because of the duplicate jobs that were canceled. Anyone know a way around this?
[Bacula-users] question on re-running of failed levels
Before I submit this as a possible bug, I just wanted to see if perhaps it is the expected behavior for Bacula. We have a few long running jobs that take 24 hours to do a Full backup. Because of this, we have the following set:

Allow Duplicate Jobs = no
Cancel Lower Level Duplicates = yes
Cancel Queued Duplicates = yes

In addition, we also have Rerun Failed Levels set to 'yes', since sometimes our computers are not accessible when a Differential or Full runs. So, what I have seen happen recently is the following scenario:

April 16th 5am - Full runs for Job X
April 17th 5am - Job X runs again and is canceled
April 17th 3pm - Original job X finishes successfully
April 18th 5am - Job X runs again and does a Full again

The April 18th job should only run an Incremental, but it appears that because we have Rerun Failed Levels set to 'yes', it sees the April 17th 5am failure and decides that it needs to rerun the Full, even though the April 16th 5am job did successfully finish after the April 17th 5am failure/cancellation. Given these settings, should one expect it to see that successful job and not rerun the Full? Has anyone else seen this behavior? FYI, we are running Bacula 5.2.6 on the director/storage side and 5.0.3 on this particular client.

thanks,
--tom
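For reference, the combination of directives being discussed, collected into one Job resource fragment (the directives are real; the job name is illustrative):

```conf
# bacula-dir.conf -- illustrative fragment
Job {
  Name = "job-x"
  # Don't let a second scheduled run start while the Full is still active
  Allow Duplicate Jobs = no
  Cancel Lower Level Duplicates = yes
  Cancel Queued Duplicates = yes
  # Re-run a Differential/Full at its original level if it failed
  Rerun Failed Levels = yes
}
```

The interaction described above is that a run canceled by the duplicate-handling directives appears to count as a "failed level" for Rerun Failed Levels.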
Re: [Bacula-users] catalog pg_dump fails after 5.2.2 upgrade
The update postgres script for 5.2.x is missing these two lines, which you can run manually from within psql (connect to the bacula db as your Postgres admin db user):

grant all on RestoreObject to ${bacula_db_user};
grant select, update on restoreobject_restoreobjectid_seq to ${bacula_db_user};

That should solve your problem, I think.

--tom

At this point I'm unclear where the permissions problem exists. Within PostgreSQL. The PostgreSQL user does not have permissions on that table… This is not a Unix permissions issue. Thanks in advance for further clues. dn

I am not using 5.2.2, so I did the version table as an example of what it should look like.

bacula-# \l
        List of databases
   Name    | Owner  | Encoding
-----------+--------+-----------
 bacula    | bacula | SQL_ASCII
 postgres  | pgsql  | UTF8
 template0 | pgsql  | UTF8
 template1 | pgsql  | UTF8
(4 rows)

User bacula's shell is defined as /sbin/nologin, so I think it's user pgsql that's doing the work (at least it was prior to the upgrade). User bacula cannot launch psql, nor can I su to that user because of the nologin setting. What permissions do I need to change to get this dump working? Thanks again! dn

I have restarted all bacula and postgresql daemons since the upgrade. I have not changed any permissions in the /home/bacula directory. Thanks in advance for troubleshooting clues. dn
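A concrete session might look like this, with ${bacula_db_user} filled in (assuming the catalog user is named bacula, as in the \l output; substitute your actual user):

```sql
-- Run as the PostgreSQL admin user, connected to the bacula database.
GRANT ALL ON RestoreObject TO bacula;
GRANT SELECT, UPDATE ON restoreobject_restoreobjectid_seq TO bacula;
```

After this, pg_dump run as that user should be able to read the RestoreObject table and its sequence.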
Re: [Bacula-users] seeking advice re. splitting up large backups -- dynamic filesets to prevent duplicate jobs and reduce backup time
> In an effort to work around the fact that bacula kills long-running jobs, I'm about to partition my backups into smaller sets. For example, instead of backing up:

Since we may end up having jobs that run for more than 6 days, I was pretty curious to see where in the code (release 5.0.3) this insanity check was happening. Looking at your previous thread's error message, I was able to track down these checks to the jcr_timeout_check routine in jcr.c. But after a brief look at the code, it looks to me like this only occurs if the socket connection is essentially stuck and no reads/writes are occurring over it (thus the reason Kern probably labeled it an insanity check). This explains why other folks have said that they do have jobs that have run 6 days.

Are you actually seeing an active job (i.e. it's in the middle of writing data from the client when it's killed)? Could it be that it is in the middle of de-spooling a very large job (and/or waiting for operator intervention) when this occurs? I could see that happening, since no traffic is flowing over the connection to the client but the job is still active, and thus the client connection probably is as well.

In any event, if you have access to the source code (5.0.3 - which is what I'm looking at) and are comfortable making changes to it, then I believe all you need to do is change line 75 in lib/bsock.c and line 687 in lib/bnet.c to something longer than 6 days. This may be simpler than re-working your entire backup scheme to avoid the issue.

--tom
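As a rough illustration of the kind of hardcoded limit involved (the constant and function names here are invented for the sketch; the real values live around line 75 of lib/bsock.c and line 687 of lib/bnet.c in 5.0.3):

```c
#include <assert.h>

/* Hypothetical sketch of the watchdog sanity limit: a socket that
 * stays completely idle longer than ~6 days is presumed stuck and
 * the job is killed (cf. "kill after 518426 secs" in the log). */
#define SIX_DAYS_SECS (6 * 24 * 60 * 60)   /* 518400 seconds */

static int socket_presumed_stuck(long idle_secs)
{
    return idle_secs > SIX_DAYS_SECS;
}
```

Raising the limit would mean changing the 6-day constant at the two source locations mentioned above and rebuilding.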
Re: [Bacula-users] Segmentation fault of Storage Daemon when client is not available
Just to follow up on this in case others have this issue. I was able to rebuild bacula with the -g compiler option to get some debugging information. The scenario that causes the SD to crash with a SEGFAULT is not consistently reproducible, which makes me think of some kind of race condition. But in any event, I was finally able to get a trace in gdb, and the crash occurs in the same spot that others have reported in the URLs referenced below - namely in the deflate zlib method being called from openssl.

The solution, I'm hoping, if you're using TLS, is to turn TLS off for communication between the director and the storage daemon (to do this, comment out all of your TLS options in any Storage definitions in the Director configuration, and just in the Director definition in the SD configuration). In addition, I was also able to set up the Director so that if the SD does die, it takes care of restarting it and any failed jobs are re-queued (using the Reschedule On Error options).

thanks again,
--tom

Hi, We've been seeing our Bacula Storage Daemon die with a segmentation fault when a client can't be reached for backup. We have two servers and have observed this behavior on both of them. Some searching has revealed that others seem to have (or had) this same issue. https://bugs.launchpad.net/ubuntu/+source/bacula/+bug/622742 That looks similar to some existing bacula bug reports: http://bugs.bacula.org/view.php?id=1568 http://bugs.bacula.org/view.php?id=1343 The behavior is not consistent, i.e. sometimes it continues working normally if a client can't be contacted, but eventually it'll snag on one and die. In addition, I've now had one of our storage daemons running in the foreground with debugging set to the max and, of course, that one has now gone two days without seg faulting even though there have been half a dozen non-responsive clients. We're currently running 5.0.3 built from source for both clients and servers.
I'm wondering if anyone else here has experienced this problem and/or has any pointers to a work around. While things can be set up to automatically restart the storage daemon if it dies, the main problem is that any backups Bacula was in the middle of doing end with an error and have to be manually rescheduled/run, or just wait until the next time their job comes up to run.
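The workaround in the follow-up (disabling TLS between the Director and the SD) amounts to commenting out the TLS directives in two specific places; a sketch with illustrative resource names and paths:

```conf
# bacula-dir.conf -- in each Storage definition, comment out TLS
Storage {
  Name = backup-sd
  Address = sd.example.org
  # TLS Enable = yes
  # TLS Require = yes
  # TLS Certificate = /etc/bacula/certs/dir.crt
  # TLS Key = /etc/bacula/certs/dir.key
  # TLS CA Certificate File = /etc/bacula/certs/ca.crt
}

# bacula-sd.conf -- likewise, but only in the Director definition
Director {
  Name = backup-dir
  # TLS Enable = yes
  # TLS Require = yes
}
```

Client-to-daemon TLS elsewhere in the configuration is left as-is; only the dir-to-sd path is affected.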
[Bacula-users] Segmentation fault of Storage Daemon when client is not available
Hi, We've been seeing our Bacula Storage Daemon die with a segmentation fault when a client can't be reached for backup. We have two servers and have observed this behavior on both of them. Some searching has revealed that others seem to have (or had) this same issue. https://bugs.launchpad.net/ubuntu/+source/bacula/+bug/622742 The behavior is not consistent, i.e. sometimes it continues working normally if a client can't be contacted, but eventually it'll snag on one and die. In addition, I've now had one of our storage daemons running in the foreground with debugging set to the max and, of course, that one has now gone two days without seg faulting even though there have been half a dozen non-responsive clients. We're currently running 5.0.3 built from source for both clients and servers. I'm wondering if anyone else here has experienced this problem and/or has any pointers to a work around. While things can be set up to automatically restart the storage daemon if it dies, the main problem is that any backups Bacula was in the middle of doing end with an error and have to be manually rescheduled/run, or just wait until the next time their job comes up to run.

thanks,
--tom