Hi,

26.02.2009 09:46, Gilles Guillotin wrote:
> 
> 
> Hi all,
> 
> This is my first post on this mailing-list, representing
> ASPerience.

Did you discuss your patch with the developers previously?

> We made some enhancements to Bacula 2.4.0 and created a patch for
> this release which should be easy to port to later releases.
> 
> This patch has been created in order to optimize the communications
> between the File daemon and the Director.

I understand it increases the communication as well...

> With the new features, Bacula can back up clients whose IP address
> changes, such as laptops.

Hmm... I wonder if any new feature is really needed for that. There 
are ways to assign static hostnames even to machines with changing IP 
addresses, and there is the setip command which can well be used with 
a named and ACLed console connection...

> There are fewer error messages when a job is
> canceled because of the absence of the File Daemon. The
> communication between FD and DIR becomes bidirectional, so
> connections are more frequent.
> 
> New features for the DIR:
> - When the DIR starts, it tries to connect to the FD. If the
>   connection is successful, a presence parameter in the Client
>   resource changes to "yes". Otherwise the presence parameter keeps
>   its value "no".
> - When the DIR is going to start a new job, it checks the presence
>   parameter. If the client is present, the DIR starts the job;
>   otherwise it waits for the client for a time specified in the
>   Client resource in bacula-dir.conf (this parameter is named
>   "WaitTimer"). It checks whether the client is connected at a
>   regular interval (attribute "PresenceTimer" in bacula-dir.conf).

So the DIR does not connect to the FD blindly (as it does today)? That
would be a big disadvantage IMO...
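
If I read the description correctly, the waiting step would behave roughly like the sketch below. It is Python rather than Bacula's C, and wait_for_client, is_present and the injected sleep function are names I made up for illustration; only WaitTimer, PresenceTimer and JSAutomaticallyCanceled come from your mail. I have not checked it against the patch.

```python
import time
from typing import Callable


def wait_for_client(is_present: Callable[[], bool],
                    wait_timer: float,
                    presence_timer: float,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll for the client until it shows up or wait_timer seconds pass.

    Returns "run" as soon as is_present() is true, otherwise "canceled"
    (a stand-in for the JSAutomaticallyCanceled job status).
    All names here are hypothetical; this is not code from the patch.
    """
    waited = 0.0
    while True:
        if is_present():
            return "run"
        if waited >= wait_timer:
            return "canceled"
        # Never sleep past the end of the WaitTimer window.
        step = min(presence_timer, wait_timer - waited)
        sleep(step)
        waited += step
```

With WaitTimer = 60 and PresenceTimer = 15, a client that never appears is checked at 0, 15, 30, 45 and 60 seconds before the job is given up.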

> If the client never connects during the "WaitTimer" time, the job
> is marked as "JSAutomaticallyCanceled" in the Catalog.
> "JSAutomaticallyCanceled" is a new parameter defined in jcr.h; it
> means that the job was canceled because the File daemon never
> connected.
> - A new file named fd_server.c has been created. It allows the DIR
>   to listen for File Daemon connections (the default port is 9104,
>   parameter DIRportFD in the Director resource of bacula-dir.conf).

Argh... that needs to be officially assigned, and I doubt the Bacula 
Project will get a third IANA port today... definitely not 9104 which 
is already assigned to PeerWire.

Also, it requires modifying firewall and tunnel settings for 
installations where those exist between DIR and FD. This can be a 
major inconvenience for existing installations on an upgrade.

Thus I would suggest you let this procedure happen on the existing 
ports (which would obviously be the DIR one...)

> The parameter MaxClientsPresence, defined in the Director resource
> in bacula-dir.conf, defines how many File Daemons the DIR can listen
> to simultaneously.
> - Authentication functions are also implemented in authenticate.c
>   in src/dird and src/filed.
>
> New features for the FD:
> - The FD must know the address of the Director, which is stored in
>   the Director resource in bacula-fd.conf.

What about clients that are backed up by more than one DIR?

> It also knows on which port it can contact the DIR (default 9104).
> - When the FD starts, it tries to connect to the DIR. If the
>   connection is successful, a presence parameter in the Client
>   resource of the Director daemon changes to "yes". Otherwise the
>   presence parameter keeps its default value "no". For
>   authentication it uses the existing password between the File
>   Daemon and the Director. The File Daemon gives its new address to
>   the DIR, so if the client is a laptop, jobs can be run with any IP.

The latter can be achieved with bconsole and setip.

> - When the File Daemon stops, it warns the DIR that it is going
>   away. After this warning, presence_parameter = 0: the DIR knows
>   the client is absent. This feature doesn't work on Windows
>   systems.

That would need to be fixed.

> Perhaps the FD does not terminate on Windows in the same way as it
> stops on Linux. At least, on Windows, Bacula does not enter the
> function "terminate_filed" in filed.c, so the presence parameter
> keeps its value of 1. ----> Perhaps there is a possible improvement
> to make here.
>
> For the connections at the start of the two daemons, there is a
> retry_interval defined as 10 seconds (if the connection fails, retry
> after 10 seconds) and a max_retry_time defined as 20 seconds
> (abandon the connection after 20 seconds).
> 
> Normally, old configurations work fine even though the files are
> patched. If there are no configuration files when the patch is
> applied, they are created with a new configuration (Presence
> parameter, PresenceTimer, WaitTimer, Address of the Director...).
> Otherwise you must modify the configuration files: if the Presence
> parameter in the Client resource in bacula-dir.conf and the address
> attribute in the Director resource in bacula-fd.conf don't exist,
> Bacula will run with its classic behaviour.

Hmm... I don't understand if you say that any existing configuration 
will work as expected or not. Can you clarify this?

Also, since the DIR needs to contact all its clients at startup, have
you done any performance tests? I suspect that, with several hundred
clients, a huge delay might result...
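
To put a rough number on that worry, here is a back-of-the-envelope sketch in Python. It assumes (and this is only my assumption, I have not looked at how the patch actually does it) that the DIR gives up on each unreachable client after the max_retry_time of 20 seconds you mention, and that it probes a fixed number of clients at a time. The function name and the parallel_probes parameter are hypothetical.

```python
import math


def worst_case_startup_delay(unreachable_clients: int,
                             max_retry_time: float = 20.0,
                             parallel_probes: int = 1) -> float:
    """Upper bound, in seconds, on time spent probing absent clients.

    Assumes every absent client costs the full max_retry_time and that
    at most parallel_probes clients are probed at once. Both are
    illustrative assumptions, not taken from the patch.
    """
    if unreachable_clients < 0 or parallel_probes < 1:
        raise ValueError("need unreachable_clients >= 0 and parallel_probes >= 1")
    batches = math.ceil(unreachable_clients / parallel_probes)
    return batches * max_retry_time


# 300 absent clients, probed one after another: 6000 s, i.e. 100 minutes.
print(worst_case_startup_delay(300) / 60)
```

Under these assumptions, probing 20 clients at a time would bring the same case down to 15 batches of 20 seconds, i.e. 5 minutes.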

>  Here is an example of a new configuration:
> 
> 
> 1/ In "bacula-dir.conf"
> 
> Director {
>   Name = localhost-dir
>   DIRport = 9101
>   DIRportFD = 9104
>   QueryFile = "/usr/bin/query.sql"
>   WorkingDirectory = "/var/bacula/working"
>   PidDirectory = "/var/bacula/working"
>   Maximum Concurrent Jobs = 1
>   Password = "*******************"
>   Messages = Daemon
>   MaxClientsPresence = 20  # NEW: how many clients the DIR can listen to simultaneously
> }

What's the default here, and what happens if there are more clients 
than allowed by this?

> 
> Client {
>   Name = localhost-fd
>   Address = localhost
>   FDPort = 9102
>   Catalog = MyCatalog
>   Password = "****************"
>   File Retention = 30 days
>   Job Retention = 6 months
>   AutoPrune = yes
>   Presence = yes           # NEW: the presence parameter exists
>   PresenceTimer = 15       # NEW: maximum time to verify the client presence
>   WaitTimer = 60 minutes   # NEW: maximum time to wait for the client
>   # PresenceTimer and WaitTimer are given in seconds by default. Minutes,
>   # hours, days... can be used, as with other time parameters in Bacula.
> }
> 
> 
> 2/ In "bacula-fd.conf"
> 
> Director {
>   Name = localhost-dir
>   Address = localhost
>   DIRport = 9104   # NEW
>   Password = "*****************"
> }
> 
> 
> Here is an explanation of a typical communication between the FD
> and the DIR:
> 
> 1/ Starting daemons:
> 
> 1.1/ DIR starts before FD (most frequent situation)
> 
> DIR starts;
> DIR tries to connect to FD;
> if (FD connected) {
>     presence_parameter = 1;
> }
> FD starts;
> FD tries to connect to DIR;
> if (DIR connected) {
>     presence_parameter = 1;
>     FD gives its new address to DIR;
> }
>
> 1.2/ FD starts before DIR
>
> FD starts;
> FD tries to connect to DIR;
> if (DIR connected) {
>     presence_parameter = 1;
>     FD gives its new address to DIR;
> }
> DIR starts;
> DIR tries to connect to FD;
> if (FD connected) {
>     presence_parameter = 1;
> }
> 
> 
> 2/ Starting job (Backup, Restore):
>
> DIR checks FD presence;
> if (FD hasn't got presence_parameter) {    ----> old configuration
>     run job like old configuration;
> } else {                                   ----> new configuration
>     if (FD present) {
>         run job;
>     } else {
>         while (WaitTimer hasn't expired) {
>             check FD connection at every PresenceTimer interval;
>             if (FD connects) {
>                 run job;
>             }
>         }
>         Job marked as JSAutomaticallyCanceled;
>     }
> }
> 
> The patch can be downloaded at :
> http://docs.asperience.fr/bacula-2.4.0_ASP.patch
> 
> 
> Regards,
> 

Well, I believe I understand what you wanted to solve here, but 
somehow I don't see why you didn't use the existing functionality.

Rerun jobs on failure, limit the time to start a job, dynamic DNS 
updates for mobile clients, and the setip command of bconsole should 
be sufficient to achieve what you need.

Can you clarify why those are *not* sufficient for you?

Cheers,

Arno

-- 
Arno Lehmann
IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
www.its-lehmann.de

_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
