Re: [Bacula-users] [Bacula-devel] Bacula Status Report
On Wed, Oct 15, 2014 at 12:49:33PM +0200, Kern Sibbald wrote:
> Hello,
>
> On 14-10-13 09:38 PM, Pasi Kärkkäinen wrote:
> > On Fri, Sep 05, 2014 at 11:23:31PM +0200, Kern Sibbald wrote:
> > > Hello,
> >
> > Hello,
> >
> > > I have posted a Bacula Status report to the www.bacula.org site. It discusses the following items:
> > >
> > > 1. Bacula Release Status
> > > 2. Windows Binaries
> >
> > Hmm.. so there are no community windows binaries anymore?
>
> There have not been any for quite a number of years -- since 5.2.10.

Yep. I assume those 5.2.10 windows (client) binaries should still work with the latest Bacula server?

> > Are the windows binaries shipped by Bacula Systems proprietary or opensource?
>
> They are proprietary but they can be obtained freely by Bacula open source users, and hopefully by everyone for personal use by the end of the year.

OK.

> The principal reason for only one version of the Windows binaries was that for several years I was way too overloaded with work and had to reduce somewhere, and since Windows is especially painful, that is where I reduced. I am now much less overloaded, so hope to correct a number of such problems by the end of the year.

I can see that.. any plans to opensource the windows agent, at least the core components for basic file backups?

Thanks for the reply!

-- Pasi

> Best regards,
> Kern
>
> > Thanks,
> >
> > -- Pasi
> >
> > > 3. Bacula Enterprise
> > > 4. Vacation
> > > 5. Bareos
> > >
> > > The following is a link to the report.
> > > http://blog.bacula.org/bacula-status-report-30-august-2014/
> > >
> > > Best regards,
> > > Kern

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
Re: [Bacula-users] [Bacula-devel] Bacula Status Report
On Fri, Sep 05, 2014 at 11:23:31PM +0200, Kern Sibbald wrote:
> Hello,

Hello,

> I have posted a Bacula Status report to the www.bacula.org site. It discusses the following items:
>
> 1. Bacula Release Status
> 2. Windows Binaries

Hmm.. so there are no community windows binaries anymore? Are the windows binaries shipped by Bacula Systems proprietary or opensource?

Thanks,

-- Pasi

> 3. Bacula Enterprise
> 4. Vacation
> 5. Bareos
>
> The following is a link to the report.
> http://blog.bacula.org/bacula-status-report-30-august-2014/
>
> Best regards,
> Kern
Re: [Bacula-users] [Bacula-devel] Bacula Status report
On Fri, Jul 23, 2010 at 05:54:03PM +0200, Kern Sibbald wrote:
> 2. New release cycle:
> The little code we currently have for the next major release is in the SF bacula git repository under Branch-5.1.
>
> We are considering moving to a regular 6 month release cycle. The advantage of such a cycle is that it gets features out to you faster. The disadvantage is that it doesn't work so well in small projects like Bacula if there are not sufficient contributions.
>
> Such a release would consist of the following points:
> - A release every 6 months

Hopefully this means 'major' release every 6 months.

> - The deadline is not absolute and could be extended to 9 months if there were insufficient new submissions.
> - There will be far fewer or no bug fix updates as they are not really needed if we can maintain a 6 month cycle.

I've read your comments about this from the other emails, but I think it's important to release new *minor* versions for known/important bugs, assuming the fixes are available.

> - Two months before the projected release we will decide if there are sufficient new features to release
> - The release count down will consist of 3 phases
>   1. We will add all new approved features
>      The first 4 months after a release this phase will go into effect for the next release
>   2. Only very small new features (a few lines) will be added
>      Two months before the final release this phase will go into effect. Note, this phase can be delayed 3 months if insufficient new features are submitted
>   3. Only bug fixes
>      This phase will go into effect one month before the release
>
> Under this scheme, we are currently in Phase 3 for the 5.0.3 release, and the next major release (5.2.0) would be made before mid-January 2011, and is currently under development in Branch-5.1 on Source Forge.

So if 5.0.3 ends up having some bad bug, 5.0.4 should be released before 5.2.0. i.e. put most development effort into the major/master branch, but still maintain the stable branch.

-- Pasi

> I would appreciate comments on this proposed new deadline release cycle.
>
> 3. New bugs tracking database
> Sometime in early August (possibly slightly before) we will be moving the current Mantis based bug tracking system to a new RT based system hosted by Bacula Systems. The upside is that the RT system is far more powerful, flexible and adaptable, and most important of all, it allows email responses to bugs. The downside is that it is a bit more complicated (as are most things that have more features) and that it will require everyone to re-register for the new system.
>
> In addition, if you don't want to rely on just the community to furnish bug fixes, you will be able to subscribe to a bug fix service that is more professional and has a guaranteed response time (not to be mistaken for a guaranteed fix time). More on this when the service is ready for production.
>
> 4. New Bacula server
> The current Bacula Community server is as you probably know generously offered by UKFast. However, the hardware is starting to age, so they have graciously provided us with a new machine that we will be putting in place in the next few weeks. We don't expect that you will notice any differences, but the hardware running www.bacula.org should be more stable.
>
> 5. New Bacula source distribution server
> You may or may not be aware that we have not always been pleased with the services offered by Source Forge. The uploading is complicated by lines dropping (I have *never* seen this elsewhere), their user interface is horrible, we don't get good statistics, being US based, they block direct access to our code from a number of countries such as Cuba, ...
>
> So, probably in September or October we will be moving our Bacula project off of Source Forge to a new server provided by UKFast. There is still a *lot* of work to be done to make this work -- principally getting up a good and suitable interface for users -- more as this develops.
>
> As mentioned above, I would appreciate any comments you might have, particularly on the proposed new release cycle.
>
> Best regards,
> Kern
>
> ___
> Bacula-devel mailing list
> bacula-de...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-devel
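The three-phase countdown in the proposal above can be read as a simple function of how far the project is into a cycle. The following is only a sketch of that reading, not anything from the Bacula project: the function name release_phase and its parameters are made up for illustration, and it assumes phase 2 starts exactly two months before the planned release and phase 3 exactly one month before it.

```python
def release_phase(months_since_release: float, cycle_months: float = 6.0) -> int:
    """Return 1, 2 or 3 for the phase of the proposed release countdown.

    Hypothetical helper: cycle_months is 6 by default and can be set to 9
    for the extended deadline mentioned in the proposal.
    """
    if not 0 <= months_since_release <= cycle_months:
        raise ValueError("months_since_release must lie within the cycle")
    months_left = cycle_months - months_since_release
    if months_left > 2:
        return 1  # phase 1: all approved new features
    if months_left > 1:
        return 2  # phase 2: only very small new features
    return 3      # phase 3: bug fixes only


# Walk through a normal 6 month cycle.
for month in (0, 3, 4.5, 5.5):
    print(month, release_phase(month))
```

With the default 6 month cycle this gives four months of phase 1, one month of phase 2 and one month of phase 3; passing cycle_months=9 shifts the start of phase 2 by three months, which matches the "can be delayed 3 months" note.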
Re: [Bacula-users] [Bacula-devel] Copy jobs between different SDs?
On Wed, May 13, 2009 at 01:06:53PM +0200, Kern Sibbald wrote:
> On Wednesday 13 May 2009 12:52:03 Pasi Kärkkäinen wrote:
> > Hello!
> >
> > What's the status of supporting copy jobs between different Storage Daemons? Any plans to implement?
>
> Obviously we have our own ideas of what to implement when, but before making any decisions, we are waiting for the results of the user voting. If you are interested in implementing it, please let us know as I have long ago designed it (in my head).

Feel free to describe it here.. We can at least discuss it now, I'm not sure if I have time to implement it right now..

-- Pasi
Re: [Bacula-users] [Bacula-devel] Bacula project voting / Feature request
On Tue, May 05, 2009 at 10:53:35PM +0200, Kern Sibbald wrote:
> Hello,
>
> As many of you know, once we release a major version, we start implementing features for the next version. To allow the Bacula user community to have input to the development process, we accumulate Feature Requests, then the community votes on which features they would like to see.
>
> Here are a few of the suggested rules for the upcoming voting process:
>
> 1. The list currently comprises some 40 different feature requests, so there is, in my opinion, no need to continue accumulating more feature requests.

Hmm.. I guess I haven't submitted my feature requests yet ;)

- Relabel (disk) volume on re-use, or whatever would be the appropriate name..

    Maximum Volume Jobs = 1
    Use Volume Once = Yes
    Recycle = Yes
    Volume Retention = 14 days
    Label Format = ${Client}-${Level}-${NumVols:p/4/0/r}-${Year}_${Month}_${Day}-${Hour}_${Minute}
    Relabel On Re-use = Yes

With configuration like this it would be really easy to check/monitor the disk volumes also from the shell, you'd instantly see with basic tools like ls, du, what's taking disk space etc..

At the moment a disk volume gets its name when it's first used, and the name is not changed when the volume is re-used.. so the name is incorrect after recycling and re-using the volume..

I guess this feature is mostly useful with disk volumes.

-- Pasi
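To make the naming idea above concrete, here is a rough sketch of what a volume name built from that Label Format could look like. It is not Bacula code: volume_label and its arguments are hypothetical, and it assumes the ${NumVols:p/4/0/r} expression produces a zero-padded four-digit number.

```python
from datetime import datetime


def volume_label(client: str, level: str, num_vols: int, when: datetime) -> str:
    """Build a name shaped like Client-Level-NNNN-Year_Month_Day-Hour_Minute.

    Hypothetical helper that only mimics the layout of the Label Format
    in the message; Bacula does its own variable expansion.
    """
    return f"{client}-{level}-{num_vols:04d}-{when:%Y_%m_%d}-{when:%H_%M}"


# Example values are invented for illustration.
print(volume_label("web01", "Full", 7, datetime(2009, 5, 5, 22, 53)))
```

If volumes were renamed like this on every re-use, a plain ls of the storage directory would show which client, level and date each file currently holds, which is the monitoring benefit described in the message.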
Re: [Bacula-users] bacula hang issue. was: bacula sometimes, gets stuck when volume wanted is already in a different drive
On Mon, Mar 02, 2009 at 10:56:41AM +0200, Silver Salonen wrote:
> Thanks a lot for the notice, dude!
>
> I configured my Bacula to use 30+ devices again on saturday and it has been OK for 2 days now (I don't think it would have been previously). If I'll have any problems, I'll let the list know :)

2.4.4-sd-deadlock.patch also fixed my problems with Bacula 2.4.4 SD getting stuck!

-- Pasi

> --
> Silver
>
> On Friday 27 February 2009 23:35:31 Bob Hetzel wrote:
> > Silver,
> >
> > I recently obtained a patch for the bug I was running into which may be similar to your bug. Do you compile your own bacula? If so, the bug is # 1213
> >
> > The direct link to the case is at...
> > http://bugs.bacula.org/view.php?id=1213
> >
> > The patch attached to the case is 2.4.4-sd-deadlock.patch
> >
> > Bob
> >
> > > From: Silver Salonen sil...@ultrasoft.ee
> > > Subject: Re: [Bacula-users] bacula hang issue. was: bacula sometimes getsstuck when volume wanted is already in a different drive
> > > To: bacula-users@lists.sourceforge.net
> > > Message-ID: 200902051319.44057.sil...@ultrasoft.ee
> > > Content-Type: text/plain; charset=utf-8
> > >
> > > OK, so.. it seems I'm on my own again.. anyone else experiencing this problem?
> > >
> > > The problem (once again): all the jobs that are not waiting for execution (or for any other resource), are waiting on storage.
> > >
> > > And I still can't understand how can this be a support request and why it can't be considered a bug :S Could anyone else check the current information and see why it's not a bug?
> > >
> > > PS. I'm sorry I can't let it go.. but my backups are hung every night :(
> > >
> > > --
> > > Silver
Re: [Bacula-users] bacula hang issue. was: bacula sometimes gets stuck when volume wanted is already in a different drive
On Tue, Feb 17, 2009 at 10:12:44AM +0200, Silver Salonen wrote:
> Hello.
>
> I just wanted to inform the list that I worked around the issue by minimizing the number of devices/storages. The problem with this is that there may be only as many parallel jobs as the number of devices.

Can you be more specific about the configuration that was problematic? What was your Maximum concurrent jobs (in both bacula-dir and bacula-sd)? How many and what kind of devices did you have configured in your SD?

I'm also seeing this kind of problem with Bacula 2.4.4 and wondering how to fix it..

> I've created separate devices for the clients having biggest backups, currently the number is 4 and it hasn't caused the problem so far..

OK. I guess you're talking about disk devices? What number is 4?

> PS. If the issue reappears I'll create another thread for that and start debugging it correctly.. currently I just hope it gets solved by itself :)

Hehe :)

-- Pasi
Re: [Bacula-users] [Bacula-devel] Bacula 2.4.4 Released
On Sat, Jan 10, 2009 at 01:13:54PM -0500, Scott Barninger wrote:
> Bacula-2.4 RPM Release Notes
> 10 January 2009
> D. Scott Barninger
> barninger at fairfieldcomputers dot com
>
> Release 2.4.4-1

Hello.

I tried this:

    rpmbuild --rebuild --define "build_centos5 1" --define "build_x86_64 1" --define "build_client_only 1" bacula-2.4.4-1.src.rpm

And got error:

    ...
    Checking for unpackaged file(s): /usr/lib/rpm/check-files /var/tmp/bacula-root
    error: Installed (but unpackaged) file(s) found:
       /usr/lib64/bacula/bconsole
    ...
    RPM build errors:
        Installed (but unpackaged) file(s) found:
        /usr/lib64/bacula/bconsole

What are the correct options for building bacula-fd (client) only?

Thanks!

-- Pasi
[Bacula-users] udev by-id symlinks randomly missing for tape drives on centos5
Hello!

I'm having problems with udev /dev/tape/by-id/ symlinks.. it seems symlinks to tape drives are sometimes (randomly) missing after reboot.

The server in question has an IBM TS3200 tape library connected with 2 drives in it.. so /proc/scsi/scsi shows 3 devices: 2 tape drives, and 1 medium changer (tape library).

So I should have 3 symlinks in the /dev/tape/by-id/ directory.. 2 symlinks to tape drives, and 1 symlink to mediumchanger/tapelibrary.

Running 'udevtrigger' _sometimes_ creates the missing links, and sometimes it doesn't. Sometimes if all the symlinks are there to begin with, and you run 'udevtrigger', some of the tape drive symlinks might get removed too.. The symlink for the mediumchanger seems to be there always though..

Has anyone seen this behaviour? How to fix it?

Whenever I run 'cat /proc/scsi/scsi' I can see all of the devices listed there. IBM 'itdt' tape drive / tapelibrary diagnostic tool works OK too, and always shows all the devices.

I don't have any errors in the dmesg/syslog, and the devices work OK if I use /dev/nst* (tape drives) or /dev/sg* (mediumchanger/tapelibrary) devicenames to access them.

I'd like to be able to use /dev/tape/by-id/foo symlinks to make sure the correct devices are accessed.

The server in question is running CentOS 5.2 x86_64 (with latest updates installed).

Any thoughts?

-- Pasi
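One way to catch the missing symlinks described above right after boot is to compare the device count reported in /proc/scsi/scsi with the entries in /dev/tape/by-id/. The script below is a minimal sketch under stated assumptions, not a tested monitoring tool: the helper names are made up, the SAMPLE text is an invented stand-in for real /proc/scsi/scsi output (vendor and model strings are placeholders), and it assumes the "Type:" lines read "Sequential-Access" for tape drives and "Medium Changer" for the library.

```python
import os
import re


def count_scsi_types(proc_text: str) -> dict:
    """Count devices per SCSI type found on the 'Type:' lines."""
    counts = {}
    for match in re.finditer(r"^\s*Type:\s+(.+?)\s+ANSI", proc_text, re.MULTILINE):
        kind = match.group(1)
        counts[kind] = counts.get(kind, 0) + 1
    return counts


def list_by_id(directory: str = "/dev/tape/by-id") -> list:
    """Return the symlink names in the by-id directory (empty if it is absent)."""
    if not os.path.isdir(directory):
        return []
    return sorted(os.listdir(directory))


# Invented sample in the shape of /proc/scsi/scsi; not real output.
SAMPLE = """Attached devices:
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: EXAMPLE  Model: DRIVE-A          Rev: 0000
  Type:   Sequential-Access                ANSI SCSI revision: 03
Host: scsi1 Channel: 00 Id: 00 Lun: 01
  Vendor: EXAMPLE  Model: CHANGER          Rev: 0000
  Type:   Medium Changer                   ANSI SCSI revision: 05
Host: scsi1 Channel: 00 Id: 01 Lun: 00
  Vendor: EXAMPLE  Model: DRIVE-B          Rev: 0000
  Type:   Sequential-Access                ANSI SCSI revision: 03
"""

counts = count_scsi_types(SAMPLE)
expected_links = counts.get("Sequential-Access", 0) + counts.get("Medium Changer", 0)
found = list_by_id()
print(f"expected {expected_links} symlinks, found {len(found)}: {found}")
```

On a real system the SAMPLE string would be replaced by the contents of /proc/scsi/scsi; a mismatch between the two numbers after a reboot would flag the problem before a backup job tries to open a missing device path.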
Re: [Bacula-users] udev by-id symlinks randomly missing for tape drives on centos5
On Thu, Feb 05, 2009 at 09:12:33PM +0200, Pasi Kärkkäinen wrote:
> Hello!
>
> I'm having problems with udev /dev/tape/by-id/ symlinks.. it seems symlinks to tape drives are sometimes (randomly) missing after reboot.
>
> The server in question has an IBM TS3200 tape library connected with 2 drives in it.. so /proc/scsi/scsi shows 3 devices: 2 tape drives, and 1 medium changer (tape library).
>
> So I should have 3 symlinks in the /dev/tape/by-id/ directory.. 2 symlinks to tape drives, and 1 symlink to mediumchanger/tapelibrary.
>
> Running 'udevtrigger' _sometimes_ creates the missing links, and sometimes it doesn't. Sometimes if all the symlinks are there to begin with, and you run 'udevtrigger', some of the tape drive symlinks might get removed too.. The symlink for the mediumchanger seems to be there always though..
>
> Has anyone seen this behaviour? How to fix it?
>
> Whenever I run 'cat /proc/scsi/scsi' I can see all of the devices listed there. IBM 'itdt' tape drive / tapelibrary diagnostic tool works OK too, and always shows all the devices.
>
> I don't have any errors in the dmesg/syslog, and the devices work OK if I use /dev/nst* (tape drives) or /dev/sg* (mediumchanger/tapelibrary) devicenames to access them.
>
> I'd like to be able to use /dev/tape/by-id/foo symlinks to make sure the correct devices are accessed.
>
> The server in question is running CentOS 5.2 x86_64 (with latest updates installed).
>
> Any thoughts?

http://rhn.redhat.com/errata/RHBA-2009-0076.html

"These updated packages fix the following bugs:

* race conditions in scsi_id, which may have resulted in devices not being created as per udev rules, have been resolved."

in addition to other fixes.. wondering if it could be that bug?

Or is this more likely just some (default) configuration issue? I haven't changed any udev configuration.. or actually I haven't done _any_ special configuration for the tape library.

-- Pasi
Re: [Bacula-users] Performance problems in migration from disk to tape
On Fri, Jan 23, 2009 at 10:05:53AM +0200, Ari Suutari wrote:
> Hi,
>
> My configuration is roughly like this: I back up about 10 hosts to disk volume using bacula and migrate the backups to tape once a week. Backups work ok, the resulting volume file on disk is currently about 25 Gb. I have also some backups going directly to tape, performance there is also ok.
>
> But the weekly migration job, which moves backups from disk volume to tape is really slow. For example:
>
>   Elapsed time:      7 mins 14 secs
>   SD Files Written:  2
>   SD Bytes Written:  3,913,732 (3.913 MB)
>
> A relatively small backup job, only a couple of megabytes took more than 7 minutes. When I looked at the machine, it was doing heavy disk io, tape is mostly idle.
>
> This sounds a little bit similar as issue discussed here earlier:
> http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg31142.html
>
> I wonder if there was any solution, configuring things differently maybe ?

What version of Bacula are you running? Which OS? What kind of hardware do you have? :)

-- Pasi
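To put the reported migration numbers in perspective, the effective rate can be worked out from the elapsed time and byte count quoted above. This is just the arithmetic as a small sketch; the function name is made up for illustration.

```python
def throughput_bytes_per_sec(total_bytes: int, minutes: int, seconds: int) -> float:
    """Average throughput over the whole job, in bytes per second."""
    elapsed_seconds = minutes * 60 + seconds
    return total_bytes / elapsed_seconds


# Figures from the job report in the message: 3,913,732 bytes in 7 mins 14 secs.
rate = throughput_bytes_per_sec(3_913_732, 7, 14)
print(f"{rate / 1000:.1f} kB/s")
```

That works out to roughly 9 kB/s, which is far below what either a disk or a tape drive can sustain, so the time is presumably being spent on something other than moving the job's own data (the heavy disk I/O noted in the message points the same way).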
Re: [Bacula-users] Volume Relabelling
On Mon, Jan 12, 2009 at 03:13:47PM -0500, John Drescher wrote:
> On Mon, Jan 12, 2009 at 2:54 PM, Pasi Kärkkäinen pa...@iki.fi wrote:
> > On Wed, Jan 07, 2009 at 08:54:14AM +0200, Alex Ehrlich wrote:
> > > Hello,
> > >
> > > Can a (disk) volume be automatically relabelled when it is recycled?
> >
> > I don't think that's supported atm.. I'd like to have that feature too. It shouldn't be too hard to implement..
>
> This is not in bacula. You are free to script that with bconsole.

Yep. Although it would be nicer if Bacula did it automatically instead of some custom script..

Maybe I'll take a look at this some day..

-- Pasi
Re: [Bacula-users] Volume Relabelling
On Mon, Jan 19, 2009 at 04:47:47PM +, Alan Brown wrote:
> On Mon, 12 Jan 2009, Pasi Kärkkäinen wrote:
> > > Can a (disk) volume be automatically relabelled when it is recycled?
> >
> > I don't think that's supported atm.. I'd like to have that feature too. It shouldn't be too hard to implement..
>
> A recycled volume is still only a candidate for being written on.
>
> Volumes are labelled when they're moved from recycle to append.
>
> The decision to not relabel tapes until the last possible moment is deliberate. This gives a last-possible-moment for recovering data off a tape which has been wiped from the database, using the various bacula command line utilities.
>
> It also reduces excessive tape handling. Bear in mind that media like LTO has a chip onboard and counts each load/unload cycle towards the end of the tape's lifetime (162 cycles(*)) even if only the very beginning of the tape has been read/written.
>
> (Yes, this means the lifetime warranty on LTO media has different real-world durations depending on the purposes the tapes are put to. Other media has similar limitations)

Thanks for your comments. This could be an option to enable only for disk volumes..

If I end up implementing this, I'll make sure the default behaviour is not changed, and you need to manually enable 'relabeling' if you want it.

-- Pasi
Re: [Bacula-users] Volume Relabelling
On Wed, Jan 07, 2009 at 08:54:14AM +0200, Alex Ehrlich wrote:
> Hello,
>
> Can a (disk) volume be automatically relabelled when it is recycled?

I don't think that's supported atm.. I'd like to have that feature too. It shouldn't be too hard to implement..

-- Pasi
Re: [Bacula-users] [Bacula-devel] Copy jobs in Bacula version 3.0.0
On Mon, Dec 15, 2008 at 09:52:39PM +0100, Kern Sibbald wrote:
> Hello,
>
> I've been discussing with Eric how we might handle Copy jobs in our development version. Currently, Copy jobs are implemented, and they work much like Migration jobs (share 99% of the code). The difference is that Migration jobs purge the original backup job and keep only the Migrated data. With a Copy Job, the original backup job remains and there is a second identical job that contains the copied data. The only difference between the original and the Copy job is that they will be in different Pools.
>
> Now this poses a few problems for doing restores such as:
>
> 1. It is possible that a simple restore will choose JobIds from both the original and the Copy Job.
>
> 2. There is no easy mechanism for the user to select whether he/she wants to restore from the original backup or the Copy (or Copies).
>
> So for the moment, the situation is not really satisfactory (one of the reasons the code is not yet released).
>
> We have a number of ideas for different ways to solve the above problems, many have already been discussed on the mailing lists, and we will probably implement a number of the ideas put forward, either before or after the release (depending on the time we have and the complexity of the proposal -- e.g. using the Location table and Costs ...).
>
> A few things seem obvious:
>
> 1. Any restore where Bacula automatically selects the jobs to be restored (e.g. a restore to the current state -- #5 on the restore prompt menu) should be done by default using the original backups.

.. if they're still available.

I'm using Bacula to backup to disk, and then after the backup-to-disk job I copy from disk volumes to tapes. Backups are stored on disk volumes for X days, and after that disk volumes are recycled/reused.

Backups are stored on tapes for a much longer time before they're recycled.

> 2. If a job has been copied, Bacula should probably display an information message during the restore that indicates that the JobIds to be used have Copies.

Yep.

> 3. The restore command should allow the user to select any Copy job or jobs.

Definitely.

> I propose that we modify the Jobs to look like the following:
>
> +-------+--------------+---------------------+------+-------+----------+
> | JobId | Name         | StartTime           | Type | Level | JobFiles |
> +-------+--------------+---------------------+------+-------+----------+
> |     1 | CopyJobSave  | 2008-12-15 20:38:28 | B    | F     |     7020 |
> |     5 | CopyJobSave  | 2008-12-15 20:38:28 | C    | F     |     7020 |
> |     2 | CopyJobSave  | 2008-12-15 20:38:32 | B    | I     |      999 |
> |     6 | CopyJobSave  | 2008-12-15 20:38:32 | C    | I     |      999 |
> |     3 | copy-job     | 2008-12-15 20:41:05 | c    | F     |        0 |
> |     4 | copy-job     | 2008-12-15 20:41:50 | c    | F     |        0 |
> |     8 | RestoreFiles | 2008-12-15 20:42:39 | R    | F     |     7020 |
> +-------+--------------+---------------------+------+-------+----------+
>
> Now here, JobIds 5 and 6 no longer appear to Bacula like they are Backup jobs, rather they are marked as Type=C (i.e. a Copy job), and so they will never be considered for restoration by default. I've given the Copy control jobs the Type 'c' to distinguish them from the real copy -- they exist just to record the actual time the copy was made.

Looks good.

> Now by modifying the restore code in Bacula, we should be able to provide features that allow the user to know that there is a Copy of JobId 1 (i.e. 5) and a copy of JobId 2 (i.e. 6) and allow him/her to choose which copy to use.
>
> In my opinion, this would simplify future handling of Copy jobs allowing a lot more flexibility and avoiding confusion. However, it will cause some minor changes (nothing serious, I believe) for users already using the new code.

That's not a problem.. something we need to deal with.. being beta-testers :)

> Note, instead of selecting pairs (original Job, Copy), since all the jobids are unique, you could simply just type in the desired jobids. For the above example: 7,6,3,4.
>
> Any comments?

Thanks for working on this!

-- Pasi
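The Type column proposed in the message above lends itself to a simple selection rule: restores default to Type B jobs, and copies are offered on request. The sketch below illustrates only that rule and is not how Bacula implements it. In particular, pairing a copy with its original by matching Name and StartTime is an assumption made here because the example rows happen to share those values; the message does not say how the two are linked in the catalog.

```python
# Rows taken from the example table in the message (Level and JobFiles omitted).
JOBS = [
    {"JobId": 1, "Name": "CopyJobSave", "StartTime": "2008-12-15 20:38:28", "Type": "B"},
    {"JobId": 5, "Name": "CopyJobSave", "StartTime": "2008-12-15 20:38:28", "Type": "C"},
    {"JobId": 2, "Name": "CopyJobSave", "StartTime": "2008-12-15 20:38:32", "Type": "B"},
    {"JobId": 6, "Name": "CopyJobSave", "StartTime": "2008-12-15 20:38:32", "Type": "C"},
    {"JobId": 3, "Name": "copy-job", "StartTime": "2008-12-15 20:41:05", "Type": "c"},
    {"JobId": 4, "Name": "copy-job", "StartTime": "2008-12-15 20:41:50", "Type": "c"},
    {"JobId": 8, "Name": "RestoreFiles", "StartTime": "2008-12-15 20:42:39", "Type": "R"},
]


def default_restore_candidates(jobs):
    """JobIds a restore would consider by default: original backups only."""
    return [job["JobId"] for job in jobs if job["Type"] == "B"]


def copies_of(job_id, jobs):
    """JobIds of Type C jobs assumed to be copies of the given backup.

    Assumption for illustration: a copy shares Name and StartTime
    with its original, as in the example table.
    """
    original = next(job for job in jobs if job["JobId"] == job_id)
    return [
        job["JobId"]
        for job in jobs
        if job["Type"] == "C"
        and job["Name"] == original["Name"]
        and job["StartTime"] == original["StartTime"]
    ]


print(default_restore_candidates(JOBS))
print(copies_of(1, JOBS), copies_of(2, JOBS))
```

A restore prompt built on this could show the default JobIds and add a note wherever copies_of returns a non-empty list, which covers points 2 and 3 of the proposal.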
Re: [Bacula-users] minimize iowait
On Wed, Dec 10, 2008 at 04:38:58PM -0600, Lukasz Szybalski wrote:
> On Wed, Dec 10, 2008 at 4:06 PM, John Drescher [EMAIL PROTECTED] wrote:
> > On Wed, Dec 10, 2008 at 4:10 PM, Lukasz Szybalski [EMAIL PROTECTED] wrote:
> > > Hello,
> > > My bacula server is pretty busy, and I notice that at times the IOwait reaches 40%.
> > >
> > > Currently I use: top, free, iostat -k 5 /dev/md5
> > >
> > > Are there any tools/commands on linux that can tell me what is the status of Input output?
> > >
> > > I would like to know:
> > > - How much IO are my hard drives capable of?
> > > - What is an average? max? min?
> > > - Is IO a bottleneck on my performance?
> > > - Monitoring tool names and exact commands would be appreciated.
> > >
> > > Sorry to bring these questions here but I don't know any other list that would have the technical people that have experience and are knowledgeable on the topic.
> >
> > Is this a software raid 5 or 6?
>
> Software raid5 using mdadm

What kind of drives? How many of them?

A single SATA 7200 rpm drive is capable of around 100-150 random IOs per second..

You can check http://www.storagereview.com and go to Performance Database and select for example IOMeter File Server - 128 I/O and click Sort to see some results..

SSD drives seem to beat the crap out of SAS and especially SATA drives :)

15k rpm U320 SCSI drives seem to be over 400 IOs per second.. a lot faster than SATA disks.

Anyway, the point was that even when a single SATA drive might have a nice sequential throughput, it won't do very well with random IOs..

If you're running multiple backup jobs at the same time you're pretty much having random IO patterns..

Try benchmarking your md-raid-array and see how many IOs you can get out of it with random IO patterns. Try for example with LTP disktest. With LTP disktest you can try different IO sizes, different numbers of threads, different read/write ratios etc..

Hopefully that helps..

-- Pasi
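As a rough cross-check on the per-drive figures in the reply above, random IOPS for a spinning disk can be estimated from seek time plus rotational latency. This is a back-of-envelope sketch, not a benchmark: the seek times used below (8.5 ms for a 7200 rpm SATA drive, 3.5 ms for a 15k rpm drive) are assumed typical values, not numbers from this thread, and the model covers one outstanding request at a time.

```python
def estimate_random_iops(rpm: int, avg_seek_ms: float) -> float:
    """Estimate random IOs per second for one request at a time.

    Each IO is modelled as an average seek plus half a platter rotation.
    """
    rotational_latency_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + rotational_latency_ms)


# Assumed seek times; real drives vary.
for label, rpm, seek_ms in (("7200 rpm SATA", 7200, 8.5), ("15k rpm SCSI", 15000, 3.5)):
    print(f"{label}: about {estimate_random_iops(rpm, seek_ms):.0f} IOPS")
```

Benchmarks that keep many requests queued, such as the 128 I/O test mentioned above, can report higher numbers than this simple model because the drive is able to reorder requests, so the estimate is best read as a lower bound on the same order of magnitude. Either way it shows why several concurrent backup jobs on a few SATA disks quickly become seek-bound.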
Re: [Bacula-users] bacula-sd volume data errors during copy from disk to tape
On Wed, Nov 12, 2008 at 07:38:40PM +0200, Pasi Kärkkäinen wrote:

Hello list!

I'm testing Bacula 2.5.19 (upcoming 3.0.0) and copying jobs from disk pools to tape. I'm getting some errors during the copy process.. has anyone else seen these?:

bacula-sd JobId 2994: Start Copying JobId 2994, Job=CopyPool3UncopiedToTape.2008-11-12_16.40.09.26
bacula-sd JobId 2994: Using Device IBM-LTO3-Drive
bacula-sd JobId 2994: Ready to read from volume Pool3-Vol-0090 on device FSDevice3 (/mnt/backup1/pool03).
bacula-sd JobId 2994: Forward spacing Volume Pool3-Vol-0090 to file:block 0:218.
bacula-sd JobId 2994: Error: block.c:1098 Volume data error at 0:2594608255! Short block of 2944 bytes on device FSDevice3 (/mnt/backup1/pool03) discarded.
bacula-sd JobId 2994: Error: read_record.c:148 block.c:1098 Volume data error at 0:2594608255! Short block of 2944 bytes on device FSDevice3 (/mnt/backup1/pool03) discarded.
bacula-sd JobId 2994: End of file 0 on device FSDevice3 (/mnt/backup1/pool03), Volume Pool3-Vol-0090
bacula-sd JobId 2994: End of Volume at file 0 on device FSDevice3 (/mnt/backup1/pool03), Volume Pool3-Vol-0090
bacula-sd JobId 2994: Ready to read from volume Pool3-Vol-0091 on device FSDevice3 (/mnt/backup1/pool03).
bacula-sd JobId 2994: Forward spacing Volume Pool3-Vol-0091 to file:block 0:218.
bacula-sd JobId 2994: Error: block.c:1098 Volume data error at 0:2314368047! Short block of 3024 bytes on device FSDevice3 (/mnt/backup1/pool03) discarded.
bacula-sd JobId 2994: Error: read_record.c:148 block.c:1098 Volume data error at 0:2314368047! Short block of 3024 bytes on device FSDevice3 (/mnt/backup1/pool03) discarded.
bacula-sd JobId 2994: End of file 0 on device FSDevice3 (/mnt/backup1/pool03), Volume Pool3-Vol-0091
bacula-sd JobId 2994: End of Volume at file 0 on device FSDevice3 (/mnt/backup1/pool03), Volume Pool3-Vol-0091
bacula-sd JobId 2994: Ready to read from volume Pool3-Vol-0092 on device FSDevice3 (/mnt/backup1/pool03).
bacula-sd JobId 2994: Forward spacing Volume Pool3-Vol-0092 to file:block 0:218.

However, the job terminates with:

SD Files Written: 102,756
SD Bytes Written: 14,278,163,609 (14.27 GB)
SD Errors: 0
SD termination status: OK
Termination: Copying OK

So.. does someone know what's going on? Any ideas?

I'm still seeing these every now and then.. wondering what they mean for real.

-- Pasi
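For anyone who wants to check how often these short-block messages show up per volume, here is a rough Python sketch. It is not part of Bacula; the two regular expressions are simply modelled on the log lines quoted in this message, so the wording they expect is an assumption that may not hold for other versions. The function name `short_blocks_per_volume` is made up for this example.

```python
import re

# Patterns modelled on the bacula-sd lines quoted in this message; the exact
# wording may differ in other Bacula versions, so treat them as assumptions.
READY = re.compile(r"Ready to read from volume (\S+) on device")
# Anchored on "Error: block.c" so the read_record.c line, which repeats the
# same message, is not counted a second time.
SHORT_BLOCK = re.compile(
    r"Error: block\.c:\d+ Volume data error at \d+:\d+! "
    r"Short block of (\d+) bytes"
)


def short_blocks_per_volume(lines):
    """Return {volume: [short block sizes]}, using the last announced volume."""
    result = {}
    current = None
    for line in lines:
        ready = READY.search(line)
        if ready:
            current = ready.group(1)
            continue
        short = SHORT_BLOCK.search(line)
        if short:
            result.setdefault(current, []).append(int(short.group(1)))
    return result


SAMPLE = [
    "bacula-sd JobId 2994: Ready to read from volume Pool3-Vol-0090 on device FSDevice3 (/mnt/backup1/pool03).",
    "bacula-sd JobId 2994: Error: block.c:1098 Volume data error at 0:2594608255! Short block of 2944 bytes on device FSDevice3 (/mnt/backup1/pool03) discarded.",
    "bacula-sd JobId 2994: Error: read_record.c:148 block.c:1098 Volume data error at 0:2594608255! Short block of 2944 bytes on device FSDevice3 (/mnt/backup1/pool03) discarded.",
    "bacula-sd JobId 2994: Ready to read from volume Pool3-Vol-0091 on device FSDevice3 (/mnt/backup1/pool03).",
    "bacula-sd JobId 2994: Error: block.c:1098 Volume data error at 0:2314368047! Short block of 3024 bytes on device FSDevice3 (/mnt/backup1/pool03) discarded.",
]

if __name__ == "__main__":
    for volume, sizes in short_blocks_per_volume(SAMPLE).items():
        print(volume, sizes)
```

Feeding it a saved job log (one message per line) gives a quick per-volume summary, which might help to see whether the short block always sits at the end of a volume.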
Re: [Bacula-users] bacula-sd hanging after tape gets full + unload
On Thu, Nov 13, 2008 at 02:26:20PM +0200, Pasi Kärkkäinen wrote:

Hello list!

I'm using Bacula 2.5.19 and trying 'copy jobs' feature to copy jobs from disk volumes/pools to tape.

Sometimes bacula-sd seems to get stuck.. it hangs without doing anything. Now it happened when tape got full and Bacula started to change the tape on the drive (using autoloader):

bacula-sd JobId 3082: Start Copying JobId 3082, Job=CopyPool4UncopiedToTape.2008-11-13_10.53.04.54
bacula-sd JobId 3082: Using Device IBM-LTO3-Drive
bacula-sd JobId 3082: Ready to read from volume Pool4-Vol-0127 on device FSDevice4 (/mnt/backup1/pool04).
bacula-sd JobId 3082: Forward spacing Volume Pool4-Vol-0127 to file:block 0:218.
bacula-sd JobId 3082: End of Volume 756NNNL3 at 764:10067 on device IBM-LTO3-Drive (/dev/nst0). Write of 64512 bytes got -1.
bacula-sd JobId 3082: Re-read of last block succeeded.
bacula-sd JobId 3082: End of medium on Volume 756NNNL3 Bytes=725,237,130,240 Blocks=11,241,894 at 13-Nov-2008 11:51.
bacula-sd JobId 3082: 3307 Issuing autochanger unload slot 3, drive 0 command.

nothing happens after this

*sta
Status available for:
  1: Director
  2: Storage
  3: Client
  4: All
Select daemon type for status (1-4): 2
...
Device status:
Autochanger IBM-LTO3-AutoChanger with devices:
  IBM-LTO3-Drive (/dev/nst0)
Device FSDevice0 (/mnt/backup1/pool00) is not open.
Device FSDevice1 (/mnt/backup1/pool01) is not open.
Device FSDevice2 (/mnt/backup1/pool02) is not open.
Device FSDevice3 (/mnt/backup1/pool03) is not open.
Device FSDevice4 (/mnt/backup1/pool04) is mounted with:
  Volume: Pool4-Vol-0127
  Pool: Pool4
  Media type: File4
  Total Bytes Read=1,649,507,328 Blocks Read=25,569 Bytes/block=64,512
  Positioned at File=0 Block=1,649,507,534
Device IBM-LTO3-Drive (/dev/nst0) is not open.
  Device is being initialized.
  Drive 0 is not loaded.

Used Volume status:

hangs here and nothing happens

I can exit bconsole by pressing CTRL+C multiple times.. if I restart bconsole and run that again, it gets stuck again..

I tried 'strace -p pid' to see what bacula-sd is doing:

# strace -p 7339
Process 7339 attached - interrupt to quit
select(5, [4], NULL, NULL, NULL unfinished ...
Process 7339 detached

So.. bacula-sd seems to be stuck on select() ..

Running 'mtx' seems to work fine.. at the same time when bacula-sd is stuck.

# mtx -f /dev/sg3 status
Storage Changer /dev/sg3:1 Drives, 8 Slots ( 0 Import/Export )
Data Transfer Element 0:Empty
Storage Element 1:Full :VolumeTag=179MMML3
Storage Element 2:Full :VolumeTag=658NNNL3
Storage Element 3:Full :VolumeTag=756NNNL3
Storage Element 4:Full :VolumeTag=177MMML3
Storage Element 5:Full :VolumeTag=655NNNL3
Storage Element 6:Full :VolumeTag=656NNNL3
Storage Element 7:Full :VolumeTag=657NNNL3
Storage Element 8:Full :VolumeTag=CLNU38L1

Any ideas how to fix this? Other than restarting Bacula.. I don't see any IO errors in dmesg and/or messages.

Replying myself.. this was a bug in Bacula 2.5 SVN version, and it can be fixed with this patch:
http://www.mail-archive.com/[EMAIL PROTECTED]/msg03646.html

-- Pasi
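As a side note, the 'mtx status' output above is easy to check from a script, for example to confirm that the drive really is empty and that the full tape went back to its slot. The following Python sketch is only an illustration: the regular expressions are modelled on the output pasted in this message (other changers print extra fields, e.g. import/export slots, which this does not handle), and the function names `parse_mtx_status` and `find_slot` are invented for the example.

```python
import re

# Modelled on the 'mtx -f /dev/sg3 status' output pasted in this message.
DRIVE = re.compile(r"Data Transfer Element (\d+):(Empty|Full)")
SLOT = re.compile(r"Storage Element (\d+):(Empty|Full)(?:\s*:VolumeTag=\s*(\S+))?")


def parse_mtx_status(text):
    """Return (drives, slots): drive number -> state, slot number -> tag or None."""
    drives = {int(num): state for num, state in DRIVE.findall(text)}
    slots = {}
    for num, state, tag in SLOT.findall(text):
        # An empty slot, or a full slot without a barcode, maps to None.
        slots[int(num)] = tag if (state == "Full" and tag) else None
    return drives, slots


def find_slot(slots, volume_tag):
    """Return the slot number holding volume_tag, or None if it is not in a slot."""
    for num, tag in slots.items():
        if tag == volume_tag:
            return num
    return None


MTX_SAMPLE = """\
Storage Changer /dev/sg3:1 Drives, 8 Slots ( 0 Import/Export )
Data Transfer Element 0:Empty
Storage Element 1:Full :VolumeTag=179MMML3
Storage Element 2:Full :VolumeTag=658NNNL3
Storage Element 3:Full :VolumeTag=756NNNL3
Storage Element 4:Full :VolumeTag=177MMML3
Storage Element 5:Full :VolumeTag=655NNNL3
Storage Element 6:Full :VolumeTag=656NNNL3
Storage Element 7:Full :VolumeTag=657NNNL3
Storage Element 8:Full :VolumeTag=CLNU38L1
"""

if __name__ == "__main__":
    drives, slots = parse_mtx_status(MTX_SAMPLE)
    print("drive states:", drives)
    print("756NNNL3 is in slot", find_slot(slots, "756NNNL3"))
```

With the sample above, the drive is reported empty and 756NNNL3 is back in slot 3, which matches the observation that the changer itself behaves while bacula-sd is stuck.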
Re: [Bacula-users] bacula hang waiting for storage
On Wed, Dec 03, 2008 at 01:06:47PM +0200, Pasi Kärkkäinen wrote:
On Wed, Dec 03, 2008 at 10:37:05AM +, Alan Brown wrote:
On Tue, 2 Dec 2008, Julien Cigar wrote:

Yes the SCSI card is an Adaptec

Which model and revision? Several older Adaptec scsi HBAs are _very_ sensitive to scsi bus termination and length issues.

Hmm.. I have pretty long SCSI cable.. Maybe I should try with a shorter one.. requires moving stuff around though.. Tape library is the only device in the SCSI bus, and it has terminator in the other SCSI connector. So that _should_ be fine..

I ordered new SCSI HBA, so I can try with the (shorter) SCSI cable that came with the tape library. My current HBA doesn't have correct connectors, so I need to swap it. Let's see if that helps..

-- Pasi
Re: [Bacula-users] bacula hang waiting for storage
On Wed, Dec 03, 2008 at 11:26:40PM +0100, Arno Lehmann wrote:

After I had a look at the release notes for 2.4.4-b1 and the bug tracker I suggest you try the 2.4.4-b1 SD - as far as I can see, bug 1192 *could* be what you found.

It should be sufficient to ./configure and make the 2.4.4 version and simply run the newly-created SD with the existing configuration, so there shouldn't be a need to upgrade your whole Bacula setup, or even to touch the existing configuration.

If it isn't, it's time to prepare a new bug report, I guess...

Hmm.. 1192 could be it, yeah.. I'll try to patch my bacula-sd with that patch.

-- Pasi
Re: [Bacula-users] bacula hang waiting for storage
On Thu, Dec 04, 2008 at 11:22:32AM +0200, Pasi Kärkkäinen wrote:
On Wed, Dec 03, 2008 at 11:26:40PM +0100, Arno Lehmann wrote:

After I had a look at the release notes for 2.4.4-b1 and the bug tracker I suggest you try the 2.4.4-b1 SD - as far as I can see, bug 1192 *could* be what you found.

It should be sufficient to ./configure and make the 2.4.4 version and simply run the newly-created SD with the existing configuration, so there shouldn't be a need to upgrade your whole Bacula setup, or even to touch the existing configuration.

If it isn't, it's time to prepare a new bug report, I guess...

Hmm.. 1192 could be it, yeah.. I'll try to patch my bacula-sd with that patch.

Didn't help :(

I think I'll continue on bacula-devel with some tracebacks etc..

-- Pasi
Re: [Bacula-users] bacula hang waiting for storage
On Thu, Dec 04, 2008 at 02:08:38PM +0200, Pasi Kärkkäinen wrote:
On Thu, Dec 04, 2008 at 11:22:32AM +0200, Pasi Kärkkäinen wrote:
On Wed, Dec 03, 2008 at 11:26:40PM +0100, Arno Lehmann wrote:

After I had a look at the release notes for 2.4.4-b1 and the bug tracker I suggest you try the 2.4.4-b1 SD - as far as I can see, bug 1192 *could* be what you found.

It should be sufficient to ./configure and make the 2.4.4 version and simply run the newly-created SD with the existing configuration, so there shouldn't be a need to upgrade your whole Bacula setup, or even to touch the existing configuration.

If it isn't, it's time to prepare a new bug report, I guess...

Hmm.. 1192 could be it, yeah.. I'll try to patch my bacula-sd with that patch.

Didn't help :(

I think I'll continue on bacula-devel with some tracebacks etc..

And now it's fixed! :)

Eric found the bug after looking at my tracebacks.. The fix/patch is in this email:
http://www.mail-archive.com/[EMAIL PROTECTED]/msg03646.html

It fixed the problem for me. I'm now able to get jobs started without bacula-sd (and bconsole) getting stuck. I'm currently in the process of doing more complete testing.. at least initial results are good.

-- Pasi
Re: [Bacula-users] bacula hang waiting for storage
On Wed, Dec 03, 2008 at 10:37:05AM +, Alan Brown wrote:
On Tue, 2 Dec 2008, Julien Cigar wrote:

Yes the SCSI card is an Adaptec

Which model and revision? Several older Adaptec scsi HBAs are _very_ sensitive to scsi bus termination and length issues.

Hmm.. I have pretty long SCSI cable.. Maybe I should try with a shorter one.. requires moving stuff around though.. Tape library is the only device in the SCSI bus, and it has terminator in the other SCSI connector. So that _should_ be fine..

-- Pasi
Re: [Bacula-users] bacula hang waiting for storage
On Thu, Nov 27, 2008 at 05:53:41PM +0100, Arno Lehmann wrote:

Hi,

27.11.2008 15:10, Pasi Kärkkäinen wrote:
On Thu, Nov 27, 2008 at 08:14:45AM +0100, Arno Lehmann wrote:

Hi,

26.11.2008 21:22, Bob Hetzel wrote:

I've got bacula currently in a hung state with the following interesting info. When I run a status storage produces the following...

Is your Bacula still stuck? If so, and you have gdb installed, and a Bacula with debug symbols, now might be a good time to see what it's doing...

...

I have also seen this lately.. but that was with Bacula 2.5.18. I could make that hang happen multiple times, but I'm not totally sure what caused that..

Well, if you can recreate the issue it's worth the effort building Bacula with debug information so you get usable backtraces. If the problem happens again, you can use gdb to create a backtrace, showing the developers more details about what happens and thus enabling them to fix the issue. I would recommend that now.

And now it's stuck again.. Last output in bconsole:

01-Dec 20:01 bacula-sd JobId 4231: Forward spacing Volume Pool4-Vol-0111 to file:block 0:218.
01-Dec 20:04 bacula-sd JobId 4231: Error: block.c:568 Write error at 509:3263 on device IBM-LTO3-Drive (/dev/nst0). ERR=Input/output error.
01-Dec 20:04 bacula-sd JobId 4231: Error: Error writing final EOF to tape. This Volume may not be readable. dev.c:1723 ioctl MTWEOF error on IBM-LTO3-Drive (/dev/nst0). ERR=Input/output error.
01-Dec 20:04 bacula-sd JobId 4231: End of medium on Volume 807NNNL3 Bytes=482,782,454,784 Blocks=7,483,606 at 01-Dec-2008 20:04.
01-Dec 20:04 bacula-sd JobId 4231: 3307 Issuing autochanger unload slot 7, drive 0 command.

bconsole is still usable after this.. sta director shows a lot of jobs waiting for execution (since this was a 'copy pool uncopied jobs to tape'-job), but nothing happens really.

sta storage makes bconsole hang.. last output:

Device status:
Autochanger IBM-LTO3-AutoChanger with devices:
  IBM-LTO3-Drive (/dev/nst0)
Device FSDevice0 (/mnt/backup1/pool00) is not open.
Device FSDevice1 (/mnt/backup1/pool01) is not open.
Device FSDevice2 (/mnt/backup1/pool02) is not open.
Device FSDevice3 (/mnt/backup1/pool03) is not open.
Device FSDevice4 (/mnt/backup1/pool04) is mounted with:
  Volume: Pool4-Vol-0111
  Pool: *unknown*
  Media type: File4
  Total Bytes Read=3,848,656,896 Blocks Read=59,658 Bytes/block=64,512
  Positioned at File=0 Block=3,848,592,601
Device IBM-LTO3-Drive (/dev/nst0) is not open.
  Device is being initialized.
  Drive 0 is not loaded.

Used Volume status:

hangs here, have to kill the bconsole

What kind of backtrace do you want? From which daemon? bacula-sd?

-- Pasi
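For reference, the gdb step suggested above can be scripted so the backtrace is captured while the daemon is still hung. This is only a sketch, assuming gdb is installed and bacula-sd was built with debug symbols; the helper names are hypothetical, and the only gdb options used are the standard `-p`, `-batch` and `-ex`. Which daemon the developers actually want traced is the open question in this message, so the PID is left to the caller.

```python
import subprocess


def gdb_backtrace_argv(pid):
    """Build a gdb command line that attaches to pid and dumps all thread stacks."""
    # -p attaches to the running process, -batch exits once the commands are
    # done, and -ex runs the given gdb command.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid):
    """Run gdb against pid and return its output (usually needs root)."""
    done = subprocess.run(gdb_backtrace_argv(pid), capture_output=True, text=True)
    return done.stdout


if __name__ == "__main__":
    # 7339 was the bacula-sd PID in an earlier message; substitute the real one.
    print(" ".join(gdb_backtrace_argv(7339)))
```

Without debug symbols the output is mostly raw addresses, which is why rebuilding Bacula with debug information was recommended first.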
Re: [Bacula-users] bacula hang waiting for storage
On Thu, Nov 27, 2008 at 05:53:41PM +0100, Arno Lehmann wrote:

Hi,

27.11.2008 15:10, Pasi Kärkkäinen wrote:
On Thu, Nov 27, 2008 at 08:14:45AM +0100, Arno Lehmann wrote:

Hi,

26.11.2008 21:22, Bob Hetzel wrote:

I've got bacula currently in a hung state with the following interesting info. When I run a status storage produces the following...

Is your Bacula still stuck? If so, and you have gdb installed, and a Bacula with debug symbols, now might be a good time to see what it's doing...

...

I have also seen this lately.. but that was with Bacula 2.5.18. I could make that hang happen multiple times, but I'm not totally sure what caused that..

Well, if you can recreate the issue it's worth the effort building Bacula with debug information so you get usable backtraces. If the problem happens again, you can use gdb to create a backtrace, showing the developers more details about what happens and thus enabling them to fix the issue. I would recommend that now.

And here's my earlier post about this problem:
http://www.mail-archive.com/[EMAIL PROTECTED]/msg03587.html

Ulrich has the same problem too:
http://www.mail-archive.com/[EMAIL PROTECTED]/msg03591.html

-- Pasi
Re: [Bacula-users] bacula hang waiting for storage
On Tue, Dec 02, 2008 at 03:30:06PM +0100, Nils Blanck-Wehde wrote: Hi! Just wanted to let you know that I came across the exact same error Error writing final EOF to tape. This Volume may not be readable. a couple of times with 2.4.2 using a Quantum DLT VS1 drive connected to Adaptec 29160LP. I don't think that the tape is really defective as bacula states. I could do working backups on these tapes later. Maybe its a problem with positioning (forwarding) the tape to the right position? If there is still interest in this issue I might search for the corresponding job-output. I also think the tape itself is fine, since this has happened many times now.. I don't think all of the tapes are bad. I'm also using Adaptec 29160 SCSI HBA to connect to the tape library. I wonder what would be the best way to debug this.. Now after bacula hang the first time I'm not able to get it running again.. it always just hangs when I do sta storage.. and nothing happens for real. I guess the tape drive/library is in some bad state? Or the SCSI driver? Let's see if rebooting the server gets it running again.. -- Pasi Nils Pasi Kärkkäinen schrieb: On Thu, Nov 27, 2008 at 05:53:41PM +0100, Arno Lehmann wrote: Hi, 27.11.2008 15:10, Pasi Kärkkäinen wrote: On Thu, Nov 27, 2008 at 08:14:45AM +0100, Arno Lehmann wrote: Hi, 26.11.2008 21:22, Bob Hetzel wrote: I've got bacula currently in a hung state with the following interesting info. When I run a status storage produces the following... Is your Bacula still stuck? If so, and you have gdb installed, and a Bacula with debug symbols, now might be a good time to see what it's doing... ... I have also seen this lately.. but that was with Bacula 2.5.18. I could make that hang happen multiple times, but I'm not totally sure what caused that.. Well, if you can recreate the issue it's worth the effort building Bacula with debug information so you get usable backtraces. 
If the problem happens again, you can use gdb to create a backtrace, showing the developers more details about what happens and thus enabling them to fix the issue. I would recommend that now. And now it's stuck again.. Last output in bconsole: 01-Dec 20:01 bacula-sd JobId 4231: Forward spacing Volume Pool4-Vol-0111 to file:block 0:218. 01-Dec 20:04 bacula-sd JobId 4231: Error: block.c:568 Write error at 509:3263 on device IBM-LTO3-Drive (/dev/nst0). ERR=Input/output error. 01-Dec 20:04 bacula-sd JobId 4231: Error: Error writing final EOF to tape. This Volume may not be readable. dev.c:1723 ioctl MTWEOF error on IBM-LTO3-Drive (/dev/nst0). ERR=Input/output error. 01-Dec 20:04 bacula-sd JobId 4231: End of medium on Volume 807NNNL3 Bytes=482,782,454,784 Blocks=7,483,606 at 01-Dec-2008 20:04. 01-Dec 20:04 bacula-sd JobId 4231: 3307 Issuing autochanger unload slot 7, drive 0 command. bconsole is still usable after this.. sta director shows a lot of jobs waiting for execution (since this was a 'copy pool uncopied jobs to tape'-job), but nothing happens really. sta storage makes bconsole hang.. last output: Device status: Autochanger IBM-LTO3-AutoChanger with devices: IBM-LTO3-Drive (/dev/nst0) Device FSDevice0 (/mnt/backup1/pool00) is not open. Device FSDevice1 (/mnt/backup1/pool01) is not open. Device FSDevice2 (/mnt/backup1/pool02) is not open. Device FSDevice3 (/mnt/backup1/pool03) is not open. Device FSDevice4 (/mnt/backup1/pool04) is mounted with: Volume: Pool4-Vol-0111 Pool:*unknown* Media type: File4 Total Bytes Read=3,848,656,896 Blocks Read=59,658 Bytes/block=64,512 Positioned at File=0 Block=3,848,592,601 Device IBM-LTO3-Drive (/dev/nst0) is not open. Device is being initialized. Drive 0 is not loaded. Used Volume status: hangs here, have to kill the bconsole What kind of backtrace do you want? From which daemon? bacula-sd? 
-- Pasi
Re: [Bacula-users] bacula hang waiting for storage
On Tue, Dec 02, 2008 at 04:48:24PM +0200, Pasi Kärkkäinen wrote:
On Tue, Dec 02, 2008 at 03:30:06PM +0100, Nils Blanck-Wehde wrote:

Hi!

Just wanted to let you know that I came across the exact same error

Error writing final EOF to tape. This Volume may not be readable.

a couple of times with 2.4.2 using a Quantum DLT VS1 drive connected to Adaptec 29160LP. I don't think that the tape is really defective as bacula states. I could do working backups on these tapes later. Maybe its a problem with positioning (forwarding) the tape to the right position? If there is still interest in this issue I might search for the corresponding job-output.

I also think the tape itself is fine, since this has happened many times now.. I don't think all of the tapes are bad. I'm also using Adaptec 29160 SCSI HBA to connect to the tape library.

I wonder what would be the best way to debug this..

Now after bacula hang the first time I'm not able to get it running again.. it always just hangs when I do sta storage.. and nothing happens for real. I guess the tape drive/library is in some bad state? Or the SCSI driver?

Let's see if rebooting the server gets it running again..

Hmm.. it looks like rebooting the server didn't solve this problem.

Now after the reboot, when I start a job (copy uncopied jobs from disk to tape), and then check the status of the (tape) storage, I get the same hang as earlier:

Device status:
Autochanger IBM-LTO3-AutoChanger with devices:
  IBM-LTO3-Drive (/dev/nst0)
Device FSDevice0 (/mnt/backup1/pool00) is not open.
Device FSDevice1 (/mnt/backup1/pool01) is not open.
Device FSDevice2 (/mnt/backup1/pool02) is not open.
Device FSDevice3 (/mnt/backup1/pool03) is not open.
Device FSDevice4 (/mnt/backup1/pool04) is mounted with:
  Volume: Pool4-Vol-0102
  Pool: *unknown*
  Media type: File4
  Total Bytes Read=0 Blocks Read=0 Bytes/block=0
  Positioned at File=0 Block=0
Device IBM-LTO3-Drive (/dev/nst0) is not open.
  Device is being initialized.
  Drive 0 status unknown.

Used Volume status:

hangs here, nothing happens in bconsole

Device is being initialized.. so I guess the tape drive has gone into some bad state? I don't think the tape drive is bad, since it was actually just replaced with a new one. I had the same problems with the old tape drive.

mtx -f /dev/sg3 status seems to work fine.. the tape drive is empty, no tapes in it..

Any ideas? I guess I have to reboot the tape library..

-- Pasi
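Since the status query itself hangs, one way to notice the condition without getting a console stuck is to run the query under a timeout. The sketch below is generic Python and not a Bacula tool: `run_with_timeout` is a made-up helper, and the idea of feeding "status storage" to bconsole on standard input is an assumption about the local setup rather than something shown in this thread.

```python
import subprocess
import sys


def run_with_timeout(argv, stdin_text="", timeout=30.0):
    """Run argv, feeding stdin_text; return stdout, or None if it did not finish."""
    try:
        done = subprocess.run(
            argv,
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left
        # hanging the way an interactive console would be.
        return None
    return done.stdout


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; on the backup server the
    # call might look like run_with_timeout(["bconsole"], "status storage\n"),
    # which is an assumption about how bconsole is invoked there.
    output = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout=10.0)
    print("hung" if output is None else output.strip())
```

A None result would then be the cue to grab a backtrace of bacula-sd before restarting anything.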
Re: [Bacula-users] bacula hang waiting for storage
On Tue, Dec 02, 2008 at 03:56:33PM +0100, Julien Cigar wrote: Same problem here with a Sony SDX-700C Thanks for the report. Do you also have Adaptec SCSI HBA? -- Pasi On Tue, 2008-12-02 at 15:30 +0100, Nils Blanck-Wehde wrote: Hi! Just wanted to let you know that I came across the exact same error Error writing final EOF to tape. This Volume may not be readable. a couple of times with 2.4.2 using a Quantum DLT VS1 drive connected to Adaptec 29160LP. I don't think that the tape is really defective as bacula states. I could do working backups on these tapes later. Maybe its a problem with positioning (forwarding) the tape to the right position? If there is still interest in this issue I might search for the corresponding job-output. Nils Pasi Kärkkäinen schrieb: On Thu, Nov 27, 2008 at 05:53:41PM +0100, Arno Lehmann wrote: Hi, 27.11.2008 15:10, Pasi Kärkkäinen wrote: On Thu, Nov 27, 2008 at 08:14:45AM +0100, Arno Lehmann wrote: Hi, 26.11.2008 21:22, Bob Hetzel wrote: I've got bacula currently in a hung state with the following interesting info. When I run a status storage produces the following... Is your Bacula still stuck? If so, and you have gdb installed, and a Bacula with debug symbols, now might be a good time to see what it's doing... ... I have also seen this lately.. but that was with Bacula 2.5.18. I could make that hang happen multiple times, but I'm not totally sure what caused that.. Well, if you can recreate the issue it's worth the effort building Bacula with debug information so you get usable backtraces. If the problem happens again, you can use gdb to create a backtrace, showing the developers more details about what happens and thus enabling them to fix the issue. I would recommend that now. And now it's stuck again.. Last output in bconsole: 01-Dec 20:01 bacula-sd JobId 4231: Forward spacing Volume Pool4-Vol-0111 to file:block 0:218. 01-Dec 20:04 bacula-sd JobId 4231: Error: block.c:568 Write error at 509:3263 on device IBM-LTO3-Drive (/dev/nst0). 
ERR=Input/output error. 01-Dec 20:04 bacula-sd JobId 4231: Error: Error writing final EOF to tape. This Volume may not be readable. dev.c:1723 ioctl MTWEOF error on IBM-LTO3-Drive (/dev/nst0). ERR=Input/output error. 01-Dec 20:04 bacula-sd JobId 4231: End of medium on Volume 807NNNL3 Bytes=482,782,454,784 Blocks=7,483,606 at 01-Dec-2008 20:04. 01-Dec 20:04 bacula-sd JobId 4231: 3307 Issuing autochanger unload slot 7, drive 0 command. bconsole is still usable after this.. sta director shows a lot of jobs waiting for execution (since this was a 'copy pool uncopied jobs to tape'-job), but nothing happens really. sta storage makes bconsole hang.. last output: Device status: Autochanger IBM-LTO3-AutoChanger with devices: IBM-LTO3-Drive (/dev/nst0) Device FSDevice0 (/mnt/backup1/pool00) is not open. Device FSDevice1 (/mnt/backup1/pool01) is not open. Device FSDevice2 (/mnt/backup1/pool02) is not open. Device FSDevice3 (/mnt/backup1/pool03) is not open. Device FSDevice4 (/mnt/backup1/pool04) is mounted with: Volume: Pool4-Vol-0111 Pool:*unknown* Media type: File4 Total Bytes Read=3,848,656,896 Blocks Read=59,658 Bytes/block=64,512 Positioned at File=0 Block=3,848,592,601 Device IBM-LTO3-Drive (/dev/nst0) is not open. Device is being initialized. Drive 0 is not loaded. Used Volume status: hangs here, have to kill the bconsole What kind of backtrace do you want? From which daemon? bacula-sd? -- Pasi
Re: [Bacula-users] bacula hang waiting for storage
On Tue, Dec 02, 2008 at 06:11:11PM +0200, Pasi Kärkkäinen wrote: On Tue, Dec 02, 2008 at 03:56:33PM +0100, Julien Cigar wrote: Same problem here with a Sony SDX-700C Thanks for the report. Do you also have Adaptec SCSI HBA? And which OS? I'm running CentOS 5.2 x86 32bit. -- Pasi On Tue, 2008-12-02 at 15:30 +0100, Nils Blanck-Wehde wrote: Hi! Just wanted to let you know that I came across the exact same error Error writing final EOF to tape. This Volume may not be readable. a couple of times with 2.4.2 using a Quantum DLT VS1 drive connected to Adaptec 29160LP. I don't think that the tape is really defective as bacula states. I could do working backups on these tapes later. Maybe its a problem with positioning (forwarding) the tape to the right position? If there is still interest in this issue I might search for the corresponding job-output. Nils Pasi Kärkkäinen schrieb: On Thu, Nov 27, 2008 at 05:53:41PM +0100, Arno Lehmann wrote: Hi, 27.11.2008 15:10, Pasi Kärkkäinen wrote: On Thu, Nov 27, 2008 at 08:14:45AM +0100, Arno Lehmann wrote: Hi, 26.11.2008 21:22, Bob Hetzel wrote: I've got bacula currently in a hung state with the following interesting info. When I run a status storage produces the following... Is your Bacula still stuck? If so, and you have gdb installed, and a Bacula with debug symbols, now might be a good time to see what it's doing... ... I have also seen this lately.. but that was with Bacula 2.5.18. I could make that hang happen multiple times, but I'm not totally sure what caused that.. Well, if you can recreate the issue it's worth the effort building Bacula with debug information so you get usable backtraces. If the problem happens again, you can use gdb to create a backtrace, showing the developers more details about what happens and thus enabling them to fix the issue. I would recommend that now. And now it's stuck again.. Last output in bconsole: 01-Dec 20:01 bacula-sd JobId 4231: Forward spacing Volume Pool4-Vol-0111 to file:block 0:218. 
01-Dec 20:04 bacula-sd JobId 4231: Error: block.c:568 Write error at 509:3263 on device IBM-LTO3-Drive (/dev/nst0). ERR=Input/output error. 01-Dec 20:04 bacula-sd JobId 4231: Error: Error writing final EOF to tape. This Volume may not be readable. dev.c:1723 ioctl MTWEOF error on IBM-LTO3-Drive (/dev/nst0). ERR=Input/output error. 01-Dec 20:04 bacula-sd JobId 4231: End of medium on Volume 807NNNL3 Bytes=482,782,454,784 Blocks=7,483,606 at 01-Dec-2008 20:04. 01-Dec 20:04 bacula-sd JobId 4231: 3307 Issuing autochanger unload slot 7, drive 0 command. bconsole is still usable after this.. sta director shows a lot of jobs waiting for execution (since this was a 'copy pool uncopied jobs to tape'-job), but nothing happens really. sta storage makes bconsole hang.. last output: Device status: Autochanger IBM-LTO3-AutoChanger with devices: IBM-LTO3-Drive (/dev/nst0) Device FSDevice0 (/mnt/backup1/pool00) is not open. Device FSDevice1 (/mnt/backup1/pool01) is not open. Device FSDevice2 (/mnt/backup1/pool02) is not open. Device FSDevice3 (/mnt/backup1/pool03) is not open. Device FSDevice4 (/mnt/backup1/pool04) is mounted with: Volume: Pool4-Vol-0111 Pool:*unknown* Media type: File4 Total Bytes Read=3,848,656,896 Blocks Read=59,658 Bytes/block=64,512 Positioned at File=0 Block=3,848,592,601 Device IBM-LTO3-Drive (/dev/nst0) is not open. Device is being initialized. Drive 0 is not loaded. Used Volume status: hangs here, have to kill the bconsole What kind of backtrace do you want? From which daemon? bacula-sd? -- Pasi
Re: [Bacula-users] bacula hang waiting for storage
On Tue, Dec 02, 2008 at 06:56:34PM +0100, Julien Cigar wrote: Yes the SCSI card is an Adaptec (I replaced it today with a QSI Logic to see if I have better results). The OS is FreeBSD 7.0-p6 (32 bits) with the ahc driver. What's strange is that I can write many jobs without any problems, but then it suddenly fails, always with the same error (Error writing final EOF to tape.) I posted a message on the freebsd-scsi mailing list some days ago, but I didn't get any answer : http://lists.freebsd.org/pipermail/freebsd-scsi/2008-November/003706.html I'm less and less sure that it's a driver/OS issue, but rather a Bacula bug (but I could be wrong). Thanks for the info. Could this be related?: [Bacula-users] FreeBSD, Bacula, and a Dell Autochanger 122T SCSI Timeouts http://sourceforge.net/mailarchive/forum.php?thread_name=1228142720.2805.223.camel%40soundwave.ws.pitbpa0.priv.collaborativefusion.comforum_name=bacula-users -- Pasi Best regards, Julien On Tue, 2008-12-02 at 18:18 +0200, Pasi Kärkkäinen wrote: On Tue, Dec 02, 2008 at 06:11:11PM +0200, Pasi Kärkkäinen wrote: On Tue, Dec 02, 2008 at 03:56:33PM +0100, Julien Cigar wrote: Same problem here with a Sony SDX-700C Thanks for the report. Do you also have Adaptec SCSI HBA? And which OS? I'm running CentOS 5.2 x86 32bit. -- Pasi On Tue, 2008-12-02 at 15:30 +0100, Nils Blanck-Wehde wrote: Hi! Just wanted to let you know that I came across the exact same error Error writing final EOF to tape. This Volume may not be readable. a couple of times with 2.4.2 using a Quantum DLT VS1 drive connected to Adaptec 29160LP. I don't think that the tape is really defective as bacula states. I could do working backups on these tapes later. Maybe its a problem with positioning (forwarding) the tape to the right position? If there is still interest in this issue I might search for the corresponding job-output. 
Nils Pasi Kärkkäinen schrieb: On Thu, Nov 27, 2008 at 05:53:41PM +0100, Arno Lehmann wrote: Hi, 27.11.2008 15:10, Pasi Kärkkäinen wrote: On Thu, Nov 27, 2008 at 08:14:45AM +0100, Arno Lehmann wrote: Hi, 26.11.2008 21:22, Bob Hetzel wrote: I've got bacula currently in a hung state with the following interesting info. When I run a status storage produces the following... Is your Bacula still stuck? If so, and you have gdb installed, and a Bacula with debug symbols, now might be a good time to see what it's doing... ... I have also seen this lately.. but that was with Bacula 2.5.18. I could make that hang happen multiple times, but I'm not totally sure what caused that.. Well, if you can recreate the issue it's worth the effort building Bacula with debug information so you get usable backtraces. If the problem happens again, you can use gdb to create a backtrace, showing the developers more details about what happens and thus enabling them to fix the issue. I would recommend that now. And now it's stuck again.. Last output in bconsole: 01-Dec 20:01 bacula-sd JobId 4231: Forward spacing Volume Pool4-Vol-0111 to file:block 0:218. 01-Dec 20:04 bacula-sd JobId 4231: Error: block.c:568 Write error at 509:3263 on device IBM-LTO3-Drive (/dev/nst0). ERR=Input/output error. 01-Dec 20:04 bacula-sd JobId 4231: Error: Error writing final EOF to tape. This Volume may not be readable. dev.c:1723 ioctl MTWEOF error on IBM-LTO3-Drive (/dev/nst0). ERR=Input/output error. 01-Dec 20:04 bacula-sd JobId 4231: End of medium on Volume 807NNNL3 Bytes=482,782,454,784 Blocks=7,483,606 at 01-Dec-2008 20:04. 01-Dec 20:04 bacula-sd JobId 4231: 3307 Issuing autochanger unload slot 7, drive 0 command. bconsole is still usable after this.. sta director shows a lot of jobs waiting for execution (since this was a 'copy pool uncopied jobs to tape'-job), but nothing happens really. sta storage makes bconsole hang.. 
last output: Device status: Autochanger IBM-LTO3-AutoChanger with devices: IBM-LTO3-Drive (/dev/nst0) Device FSDevice0 (/mnt/backup1/pool00) is not open. Device FSDevice1 (/mnt/backup1/pool01) is not open. Device FSDevice2 (/mnt/backup1/pool02) is not open. Device FSDevice3 (/mnt/backup1/pool03) is not open. Device FSDevice4 (/mnt/backup1/pool04) is mounted with: Volume: Pool4-Vol-0111
Re: [Bacula-users] bacula hang waiting for storage
On Tue, Dec 02, 2008 at 08:25:32PM +0100, Nils Blanck-Wehde wrote:

Julien, hi Pasi, it looks like we have similar problems. But to be honest I have no real clue what's causing them. Right now it could be anything from an HBA driver problem to a bug in Bacula. I will just give you some details about our setup so maybe we can find similarities.

Hmm.. IIRC I haven't seen any kernel errors, but I'll have to re-check..

I just looked in my old system log. One time the "Error writing final EOF" happened, I got the following kernel message:

scsi 2:0:5:1: scsi: Device offlined - not ready after error recovery

The exact text of this message seems to come from the Adaptec aic7xxx kernel module, so if you use another host controller, the message may vary. You will find the complete snippet of my syslog attached.

Our system setup: Bacula 2.4.2 (fschwarz EL5 rpms) on CentOS 5.2 (kernel 2.6.18-92.1.10.el5 #1 SMP Tue Aug 5 07:41:53 EDT 2008 i686 i686 i386 GNU/Linux).

We are using a Quantum Superloader 3 DLT connected to an Adaptec 29160LP PCI64 controller sitting in a 32-bit PCI slot. Our backup server has a Jetway VIA C7 Mini-ITX board. Due to the 1U chassis the HBA sits in a 90° PCI riser card. I think that the riser card COULD lead to timing problems but I am not sure. If you guys use a similar setup we might be able to narrow the phenomenon down.

I'm currently running Bacula 2.5.19 on CentOS 5.2 x86. I have seen this problem when I run 'copy from disk pool to tape' jobs.. not sure if it happens with normal backup-to-tape jobs too. What kind of jobs are you running when it happens?

Have you been running/using some other backup software on the same hardware?

Do you guys know which tools could be used to check/verify the tape drive? btape?
-- Pasi
Re: [Bacula-users] bacula hang waiting for storage
On Thu, Nov 27, 2008 at 05:53:41PM +0100, Arno Lehmann wrote:

Hi,

27.11.2008 15:10, Pasi Kärkkäinen wrote:
On Thu, Nov 27, 2008 at 08:14:45AM +0100, Arno Lehmann wrote:

Hi,

26.11.2008 21:22, Bob Hetzel wrote:

I've got bacula currently in a hung state with the following interesting info. When I run a status storage, it produces the following...

Is your Bacula still stuck? If so, and you have gdb installed, and a Bacula with debug symbols, now might be a good time to see what it's doing...

...

I have also seen this lately.. but that was with Bacula 2.5.18. I could make that hang happen multiple times, but I'm not totally sure what caused that..

Well, if you can recreate the issue it's worth the effort building Bacula with debug information so you get usable backtraces. If the problem happens again, you can use gdb to create a backtrace, showing the developers more details about what happens and thus enabling them to fix the issue. I would recommend that now.

Now my Bacula server is in a state where it always hangs when I try to run a 'copy from disk pool to tape' job. I rebooted the server, but it didn't help. Bacula still hangs when it starts to use the tape library/drive.

It all started with:

Error writing final EOF to tape. This Volume may not be readable.
dev.c:1723 ioctl MTWEOF error on IBM-LTO3-Drive (/dev/nst0). ERR=Input/output error.

So.. now when I run a job, bacula-sd seems to get stuck.. and bconsole hangs too after I run sta 2 to check the status of the tape storage.

What should I try to sort this out? I assume it would get fixed by rebooting the tape library, but I guess I shouldn't do that yet, so that this can still be troubleshot..
-- Pasi
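The gdb backtrace suggested earlier in this thread could be collected roughly as follows. This is only a minimal sketch, assuming gdb is installed and bacula-sd was built with debug symbols; the helper names are made up for illustration, and the PID is a placeholder that has to be looked up on the affected host (for example with ps).

```python
import shlex
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints a
    backtrace of every thread, and then exits (batch mode)."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid):
    """Run gdb against a live process and return everything it printed.

    Attaching to a daemon usually needs root privileges.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    # 7339 is a placeholder PID; only the command line is printed here,
    # nothing is attached to.
    print(shlex.join(gdb_backtrace_command(7339)))
```

Only the command is printed by default, so the sketch can be tried on any machine; capture_backtrace() would be the call to make on the host where bacula-sd is actually hung.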
Re: [Bacula-users] bacula hang waiting for storage
On Thu, Nov 27, 2008 at 08:14:45AM +0100, Arno Lehmann wrote: Hi, 26.11.2008 21:22, Bob Hetzel wrote: I've got bacula currently in a hung state with the following interesting info. When I run a status storage produces the following... Is your Bacula still stuck? If so, and you have gdb installed, and a Bacula with debug symbols, now might be a good time to see what it's doing... Automatically selected Storage: Dell-PV136T Connecting to Storage daemon Dell-PV136T at gyrus:9103 gyrus-sd Version: 2.4.3 (10 October 2008) i686-pc-linux-gnu suse 10.2 Daemon started 25-Nov-08 19:20, 59 Jobs run since started. Heap: heap=3,756,032 smbytes=3,519,564 max_bytes=3,684,397 bufs=555 max_bufs=557 Sizes: boffset_t=8 size_t=4 int32_t=4 int64_t=8 Running Jobs: Writing: Incremental Backup job axh93-gx270 JobId=45634 Volume=LTO261L2 pool=Default device=IBMLTO2-3 (/dev/nst2) spooling=0 despooling=0 despool_wait=1 Files=78 Bytes=21,123,239 Bytes/sec=2,337 FDReadSeqNo=970 in_msg=750 out_msg=9 fd=20 Writing: Incremental Backup job bxn4-gx280 JobId=45641 Volume=LTO261L2 pool=Default device=IBMLTO2-3 (/dev/nst2) spooling=0 despooling=0 despool_wait=1 Files=155 Bytes=2,925,138,595 Bytes/sec=323,648 FDReadSeqNo=45,916 in_msg=45480 out_msg=9 fd=35 Writing: Incremental Backup job cdking JobId=45646 Volume=LTO261L2 pool=Default device=IBMLTO2-3 (/dev/nst2) spooling=0 despooling=0 despool_wait=1 Files=88 Bytes=11,846,912 Bytes/sec=1,310 FDReadSeqNo=920 in_msg=672 out_msg=9 fd=23 Writing: Incremental Backup job ceg3-d810 JobId=45648 Volume=LTO253L2 pool=Default device=IBMLTO2-2 (/dev/nst1) spooling=0 despooling=1 despool_wait=0 Files=35 Bytes=1,391,695,993 Bytes/sec=176,588 FDReadSeqNo=21,542 in_msg=21439 out_msg=9 fd=36 Writing: Incremental Backup job clifford3 JobId=45651 Volume=LTO261L2 pool=Default device=IBMLTO2-3 (/dev/nst2) spooling=0 despooling=0 despool_wait=0 Files=0 Bytes=0 Bytes/sec=0 FDReadSeqNo=6 in_msg=6 out_msg=4 fd=32 Writing: Incremental Backup job cxj57-gx270 JobId=45657 
Volume=LTO261L2 pool=Default device=IBMLTO2-3 (/dev/nst2) spooling=0 despooling=0 despool_wait=0 Files=0 Bytes=0 Bytes/sec=0 FDReadSeqNo=6 in_msg=6 out_msg=4 fd=33 Writing: Incremental Backup job dxa2-d630 JobId=45665 Volume=LTO261L2 pool=Default device=IBMLTO2-3 (/dev/nst2) spooling=0 despooling=0 despool_wait=0 Files=0 Bytes=0 Bytes/sec=0 FDReadSeqNo=6 in_msg=6 out_msg=4 fd=17 Writing: Incremental Backup job educationdean JobId=45667 Volume= pool=Default device=IBMLTO2-1 (/dev/nst0) spooling=0 despooling=0 despool_wait=0 Files=0 Bytes=0 Bytes/sec=0 FDSocket closed Jobs waiting to reserve a drive: 3605 JobId=45667 wants free drive but device IBMLTO2-1 (/dev/nst0) is busy. [terminated jobs info snipped out] Device status: Autochanger Dell-PV136T with devices: IBMLTO2-1 (/dev/nst0) IBMLTO2-2 (/dev/nst1) IBMLTO2-3 (/dev/nst2) Device IBMLTO2-1 (/dev/nst0) is mounted with: Volume: LTO342L2 Pool:Default Media type: LTO-2 Slot 32 is loaded in drive 0. Total Bytes=11,991,168,000 Blocks=185,874 Bytes/block=64,512 Positioned at File=14 Block=0 Device IBMLTO2-2 (/dev/nst1) is mounted with: Volume: LTO253L2 Pool:Default Media type: LTO-2 Slot 48 is loaded in drive 1. Total Bytes=2,193,408 Blocks=33 Bytes/block=66,466 Positioned at File=1 Block=0 Device IBMLTO2-3 (/dev/nst2) is not open. Device is being initialized. Drive 2 status unknown. Used Volume status: [nothing further and the bconsole program hangs here] That alone would be a bug, I guess... Note that the last Writing line has no volume listed. The odd thing is that there actually is a tape in IBMLTO2-1. There's no tape in drive IBMLTO2-3. The pool apparently needs another appendable volume and there are several available in the Scratch pool but bacula is stuck. I tried to mount a volume into the empty drive and got back the following... *mount slot=61 drive=2 Automatically selected Storage: Dell-PV136T 3001 Device IBMLTO2-3 (/dev/nst2) is doing acquire. 
Does anybody have any idea what to do to further troubleshoot this? I have had some other instances of bacula getting hung up and so I have already previously applied the 2.4.3-orphaned-jobs.patch Sounds like it's worth a bug report - especially if you can re-create the problem. I cc'ed this to Eric, who - I believe - has been working on this sort of problems recently. I have also seen this lately.. but that was with Bacula 2.5.18. I could make that hang happen multiple times, but I'm not totally sure what caused that.. -- Pasi
[Bacula-users] bacula-sd hanging after tape gets full + unload
Hello list! I'm using Bacula 2.5.19 and trying 'copy jobs' feature to copy jobs from disk volumes/pools to tape. Sometimes bacula-sd seems to get stuck.. it hangs without doing anything. Now it happened when tape got full and Bacula started to change the tape on the drive (using autoloader): bacula-sd JobId 3082: Start Copying JobId 3082, Job=CopyPool4UncopiedToTape.2008-11-13_10.53.04.54 bacula-sd JobId 3082: Using Device IBM-LTO3-Drive bacula-sd JobId 3082: Ready to read from volume Pool4-Vol-0127 on device FSDevice4 (/mnt/backup1/pool04). bacula-sd JobId 3082: Forward spacing Volume Pool4-Vol-0127 to file:block 0:218. bacula-sd JobId 3082: End of Volume 756NNNL3 at 764:10067 on device IBM-LTO3-Drive (/dev/nst0). Write of 64512 bytes got -1. bacula-sd JobId 3082: Re-read of last block succeeded. bacula-sd JobId 3082: End of medium on Volume 756NNNL3 Bytes=725,237,130,240 Blocks=11,241,894 at 13-Nov-2008 11:51. bacula-sd JobId 3082: 3307 Issuing autochanger unload slot 3, drive 0 command. nothing happens after this *sta Status available for: 1: Director 2: Storage 3: Client 4: All Select daemon type for status (1-4): 2 ... Device status: Autochanger IBM-LTO3-AutoChanger with devices: IBM-LTO3-Drive (/dev/nst0) Device FSDevice0 (/mnt/backup1/pool00) is not open. Device FSDevice1 (/mnt/backup1/pool01) is not open. Device FSDevice2 (/mnt/backup1/pool02) is not open. Device FSDevice3 (/mnt/backup1/pool03) is not open. Device FSDevice4 (/mnt/backup1/pool04) is mounted with: Volume: Pool4-Vol-0127 Pool:Pool4 Media type: File4 Total Bytes Read=1,649,507,328 Blocks Read=25,569 Bytes/block=64,512 Positioned at File=0 Block=1,649,507,534 Device IBM-LTO3-Drive (/dev/nst0) is not open. Device is being initialized. Drive 0 is not loaded. Used Volume status: hangs here and nothing happens I can exit bconsole by pressing CTRL+C multiple times.. if I restart bconsole and run that again, it gets stuck again.. 
I tried 'strace -p pid' to see what bacula-sd is doing:

# strace -p 7339
Process 7339 attached - interrupt to quit
select(5, [4], NULL, NULL, NULL <unfinished ...>
Process 7339 detached

So.. bacula-sd seems to be stuck in select()..

Running 'mtx' works fine at the same time as bacula-sd is stuck:

# mtx -f /dev/sg3 status
Storage Changer /dev/sg3:1 Drives, 8 Slots ( 0 Import/Export )
Data Transfer Element 0:Empty
Storage Element 1:Full :VolumeTag=179MMML3
Storage Element 2:Full :VolumeTag=658NNNL3
Storage Element 3:Full :VolumeTag=756NNNL3
Storage Element 4:Full :VolumeTag=177MMML3
Storage Element 5:Full :VolumeTag=655NNNL3
Storage Element 6:Full :VolumeTag=656NNNL3
Storage Element 7:Full :VolumeTag=657NNNL3
Storage Element 8:Full :VolumeTag=CLNU38L1

Any ideas how to fix this? Other than restarting Bacula.. I don't see any IO errors in dmesg and/or messages.

-- Pasi
Re: [Bacula-users] slow performance copy/migrate disk to tape
On Wed, Nov 05, 2008 at 01:00:19PM +0100, Ulrich Leodolter wrote: On Wed, 2008-11-05 at 13:38 +0200, Pasi Kärkkäinen wrote: On Tue, Nov 04, 2008 at 10:21:54PM +0100, Ulrich Leodolter wrote: Job { Name = CopyDiskToTape Type = Copy Client = dir-fd Level = Full # must be defined, but is ignored FileSet = Full Set # must be defined, but is ignored Pool = DiskBackup Storage = File Messages = Standard Selection Type = PoolUncopiedJobs Maximum Concurrent Jobs = 10 # SpoolData = yes } PoolUncopiedJobs is based on an SQLQuery i posted in the devel list. i am running 2.5.17 svn at the server side Btw could you post the subject of that email, or archive link to it.. I could't find it with some searching.. My initial posting [Bacula-devel] Copy Jobs Selection Implemented by Marco van Wieringen [EMAIL PROTECTED] as noted in [Bacula-devel] Implementation of acls and extended attributes OK, it looks like 'PoolUncopiedJobs' Selection Type is included in 2.5 SVN version of Bacula, based on this email: [Bacula-devel] Implementation of acls and extended attributes: http://www.mail-archive.com/[EMAIL PROTECTED]/msg03471.html - support for copy jobs that help for disk-to-disk-to-tape backups (implemented in bacula as the pooluncopiedjobs copy job which implements a SQL query first send to the bacula list by Ulrich Leodolter) Your initial posting: [Bacula-devel] Copy Jobs Selection: http://www.mail-archive.com/[EMAIL PROTECTED]/msg02398.html But I'm still confused what is the exact SQL query that will be used.. Basicly I have a lot of jobs in the catalog/database, that have been ran, but the data/files of those jobs doesn't exist anymore on the disk volumes.. because of automatic disk volume recycling. so I need to copy only the jobs that still are on disk volumes.. 
I guess I should take a look at the code :)

-- Pasi
Re: [Bacula-users] copy/migrate uncopied jobs from disk to tape
On Thu, Nov 06, 2008 at 01:14:44PM +0200, Pasi Kärkkäinen wrote: On Wed, Nov 05, 2008 at 01:00:19PM +0100, Ulrich Leodolter wrote: On Wed, 2008-11-05 at 13:38 +0200, Pasi Kärkkäinen wrote: On Tue, Nov 04, 2008 at 10:21:54PM +0100, Ulrich Leodolter wrote: Job { Name = CopyDiskToTape Type = Copy Client = dir-fd Level = Full # must be defined, but is ignored FileSet = Full Set # must be defined, but is ignored Pool = DiskBackup Storage = File Messages = Standard Selection Type = PoolUncopiedJobs Maximum Concurrent Jobs = 10 # SpoolData = yes } PoolUncopiedJobs is based on an SQLQuery i posted in the devel list. i am running 2.5.17 svn at the server side Btw could you post the subject of that email, or archive link to it.. I could't find it with some searching.. My initial posting [Bacula-devel] Copy Jobs Selection Implemented by Marco van Wieringen [EMAIL PROTECTED] as noted in [Bacula-devel] Implementation of acls and extended attributes OK, it looks like 'PoolUncopiedJobs' Selection Type is included in 2.5 SVN version of Bacula, based on this email: [Bacula-devel] Implementation of acls and extended attributes: http://www.mail-archive.com/[EMAIL PROTECTED]/msg03471.html - support for copy jobs that help for disk-to-disk-to-tape backups (implemented in bacula as the pooluncopiedjobs copy job which implements a SQL query first send to the bacula list by Ulrich Leodolter) Your initial posting: [Bacula-devel] Copy Jobs Selection: http://www.mail-archive.com/[EMAIL PROTECTED]/msg02398.html But I'm still confused what is the exact SQL query that will be used.. Basicly I have a lot of jobs in the catalog/database, that have been ran, but the data/files of those jobs doesn't exist anymore on the disk volumes.. because of automatic disk volume recycling. so I need to copy only the jobs that still are on disk volumes.. I guess I should take a look at the code:) Found it. 
src/dird/migrate.c:

const char *sql_jobids_of_pool_uncopied_jobs =
   "SELECT DISTINCT Job.JobId,Job.StartTime FROM Job,Pool"
   " WHERE Pool.Name = '%s' AND Pool.PoolId = Job.PoolId"
   " AND Job.Type = 'B' AND Job.JobStatus = 'T'"
   " AND Job.JobId NOT IN"
   " (SELECT PriorJobId FROM Job WHERE"
   " Type = 'B' AND Job.JobStatus = 'T'"
   " AND PriorJobId != 0)"
   " ORDER by Job.StartTime";

and the find_jobids_of_pool_uncopied_jobs() function.

Running that query manually with the mysql client gives correct-looking job IDs (and a correct-looking number of them). So I guess I'll try it for real :)

Thanks!

-- Pasi
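To see what the quoted query selects, it can be run against a throwaway SQLite database. This is only a sketch under stated assumptions: the two tables contain just the columns the query touches (the real Bacula catalog has many more), every row is invented, and job 4 is simply given a PriorJobId of 1 so that the subquery as quoted matches it; how Bacula really records copy jobs is not modelled. The function and constant names are hypothetical.

```python
import sqlite3

# Same logic as the query quoted from src/dird/migrate.c, with the pool
# name passed as a bound parameter instead of the '%s' placeholder.
UNCOPIED_JOBS_SQL = """
SELECT DISTINCT Job.JobId, Job.StartTime
  FROM Job, Pool
 WHERE Pool.Name = ?
   AND Pool.PoolId = Job.PoolId
   AND Job.Type = 'B'
   AND Job.JobStatus = 'T'
   AND Job.JobId NOT IN
       (SELECT PriorJobId FROM Job
         WHERE Type = 'B' AND Job.JobStatus = 'T' AND PriorJobId != 0)
 ORDER BY Job.StartTime
"""


def make_toy_catalog():
    """Create an in-memory database with a made-up, minimal catalog."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Pool (PoolId INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Job (
            JobId      INTEGER PRIMARY KEY,
            PoolId     INTEGER,
            Type       TEXT,
            JobStatus  TEXT,
            StartTime  TEXT,
            PriorJobId INTEGER DEFAULT 0
        );
        INSERT INTO Pool VALUES (1, 'DiskBackup'), (2, 'DiskCopy');
        INSERT INTO Job VALUES
            (1, 1, 'B', 'T', '2008-11-01 20:00:00', 0),  -- already copied (see job 4)
            (2, 1, 'B', 'T', '2008-11-02 20:00:00', 0),  -- not copied yet
            (3, 1, 'B', 'E', '2008-11-03 20:00:00', 0),  -- status is not 'T'
            (4, 2, 'B', 'T', '2008-11-04 09:00:00', 1);  -- refers back to job 1
    """)
    return conn


def uncopied_job_ids(conn, pool_name):
    """Return the JobIds the query selects for `pool_name`, oldest first."""
    rows = conn.execute(UNCOPIED_JOBS_SQL, (pool_name,)).fetchall()
    return [job_id for job_id, _start_time in rows]


if __name__ == "__main__":
    print(uncopied_job_ids(make_toy_catalog(), "DiskBackup"))
```

With the toy rows, job 1 is skipped because another terminated job names it as PriorJobId, job 3 is skipped because its status is not 'T', and only job 2 remains for the DiskBackup pool.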
Re: [Bacula-users] slow performance copy/migrate disk to tape
On Tue, Nov 04, 2008 at 10:21:54PM +0100, Ulrich Leodolter wrote:

Job {
  Name = CopyDiskToTape
  Type = Copy
  Client = dir-fd
  Level = Full          # must be defined, but is ignored
  FileSet = "Full Set"  # must be defined, but is ignored
  Pool = DiskBackup
  Storage = File
  Messages = Standard
  Selection Type = PoolUncopiedJobs
  Maximum Concurrent Jobs = 10
  # SpoolData = yes
}

PoolUncopiedJobs is based on an SQL query I posted on the devel list. I am running 2.5.17 SVN on the server side.

Btw, could you post the subject of that email, or an archive link to it? I couldn't find it with some searching..

Thanks!

-- Pasi
Re: [Bacula-users] slow performance copy/migrate disk to tape
On Tue, Nov 04, 2008 at 09:02:46PM +0100, Ulrich Leodolter wrote:

Hi,

Problem: Migrate/Copy jobs from disk pool (DiskBackup) to tape (DiskCopy) only get an overall speed of 10-20 MB/s. Full backup job size varies from 10-50 GB.

Pool setup is simple, just one pool for full and incremental backups to disk (automatic recycling works well):

Pool {
  Name = DiskBackup
  Pool Type = Backup
  Recycle = yes
  RecyclePool = DiskBackup
  AutoPrune = yes
  Volume Retention = 15 days
  Volume Use Duration = 6 days
  Maximum Volume Bytes = 4G
  Label Format = Backup-
  Next Pool = DiskCopy
}

The DiskCopy pool goes to an LTO4 tape device. DiskBackup goes to an external SATA RAID (6T ext3).

Which RAID level is your external SATA RAID?

Concurrency for DiskBackup jobs is 15, so jobs are spread over the DiskBackup volumes (maybe that's the main problem). I can read/write continuously on both devices at about 70 MB/s (SATA is not fast).

So your SATA RAID can do max 70 MB/sec.. How much can your LTO4 do?

I tried spooling to a local SAS RAID, but overall speed is lower than writing directly to tape. Despooling from the SAS RAID to tape runs at the tape's maximum speed.

So SAS RAID to tape is faster than SATA RAID to tape?

I need some performance tuning tips, maybe:
Limit jobs per volume in DiskBackup pools?
Split DiskBackup into DiskFull and DiskIncr pools?

First determine the bottleneck and then work around it..

Btw. could you post your 'copy' job? I'm in the process of trying it out but I'm still stuck trying to implement a 'copy only uncopied jobs' feature..

-- Pasi
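To put the rates in this message in perspective, here is a small worked calculation of how long one job takes at the observed copy speed versus the 70 MB/s both devices are said to sustain. It is a sketch only: the function name is made up, and decimal units (1 GB = 1000 MB) are assumed.

```python
def transfer_minutes(size_gb, rate_mb_per_s):
    """Minutes needed to move `size_gb` gigabytes at `rate_mb_per_s`.

    Assumes decimal units: 1 GB = 1000 MB.
    """
    seconds = size_gb * 1000 / rate_mb_per_s
    return seconds / 60


if __name__ == "__main__":
    # Largest job size and the rates mentioned in the message.
    for rate in (10, 20, 70):
        minutes = transfer_minutes(50, rate)
        print(f"50 GB at {rate} MB/s: {minutes:.1f} minutes")
```

A 50 GB job needs roughly 83 minutes at 10 MB/s but about 12 minutes at 70 MB/s, which is why finding the bottleneck matters before tuning pools.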
Re: [Bacula-users] slow performance copy/migrate disk to tape
On Tue, Nov 04, 2008 at 10:21:54PM +0100, Ulrich Leodolter wrote:
On Tue, 2008-11-04 at 22:38 +0200, Pasi Kärkkäinen wrote:
On Tue, Nov 04, 2008 at 09:02:46PM +0100, Ulrich Leodolter wrote:

Hi,

Problem: Migrate/Copy jobs from disk pool (DiskBackup) to tape (DiskCopy) only get an overall speed of 10-20 MB/s. Full backup job size varies from 10-50 GB.

Pool setup is simple, just one pool for full and incremental backups to disk (automatic recycling works well):

Pool {
  Name = DiskBackup
  Pool Type = Backup
  Recycle = yes
  RecyclePool = DiskBackup
  AutoPrune = yes
  Volume Retention = 15 days
  Volume Use Duration = 6 days
  Maximum Volume Bytes = 4G
  Label Format = Backup-
  Next Pool = DiskCopy
}

The DiskCopy pool goes to an LTO4 tape device. DiskBackup goes to an external SATA RAID (6T ext3).

Which RAID level is your external SATA RAID?

RAID 5

Concurrency for DiskBackup jobs is 15, so jobs are spread over the DiskBackup volumes (maybe that's the main problem). I can read/write continuously on both devices at about 70 MB/s (SATA is not fast).

So your SATA RAID can do max 70 MB/sec.. How much can your LTO4 do?

70 MB/sec

Have you tried dd if=/dev/zero of=/dev/tape bs=bigblocksize to get the actual max tape drive performance?

I tried spooling to a local SAS RAID, but overall speed is lower than writing directly to tape. Despooling from the SAS RAID to tape runs at the tape's maximum speed.

So SAS RAID to tape is faster than SATA RAID to tape?

About 180 MB/s continuous write.

Yep. Over 2x compared to the SATA RAID. So clearly it's the RAID that's limiting your performance.

I need some performance tuning tips, maybe:
Limit jobs per volume in DiskBackup pools?
Split DiskBackup into DiskFull and DiskIncr pools?

First determine the bottleneck and then work around it..

That's why I am asking :-)

I tried to monitor the disk using dstat (http://dag.wieers.com/home-made/dstat/). It reports 70 MB/s read while the Bacula copy job is running.

It looks like some overhead is in Bacula itself:

avg iostat while Bacula CopyDiskToTape:
user 9.12  system 2.79  iowait 17.00

avg iostat while EMC Networker (Legato) Clone Disk to Tape (on the same hardware):
user 2.14  system 2.02  iowait 12.84

As you can see, user and iowait are higher for Bacula. Total copy job size is about 800 GB (one LTO4 tape). Networker backup jobs to disk run in parallel (max 10), so I don't think backup file sets are contiguous on disk.

Hmm.. have you adjusted the block sizes Bacula uses? iostat is also a useful tool.. I guess it's part of the sysstat package.

Btw. could you post your 'copy' job? I'm in the process of trying it out but I'm still stuck trying to implement a 'copy only uncopied jobs' feature..

-- Pasi

Job {
  Name = CopyDiskToTape
  Type = Copy
  Client = dir-fd
  Level = Full          # must be defined, but is ignored
  FileSet = "Full Set"  # must be defined, but is ignored
  Pool = DiskBackup
  Storage = File
  Messages = Standard
  Selection Type = PoolUncopiedJobs
  Maximum Concurrent Jobs = 10
  # SpoolData = yes
}

PoolUncopiedJobs is based on an SQL query I posted on the devel list. I am running 2.5.17 SVN on the server side.

Oh nice, I missed that. I'll try it out :)

-- Pasi
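The raw sequential write test discussed in this message (dd with a large block size) can also be approximated in a few lines of Python. This is a minimal sketch, not a benchmark tool: the function name, block size and byte count are arbitrary assumptions, and by default it writes to a scratch file in a temporary directory. Pointing it at a tape device would overwrite whatever is on the loaded tape.

```python
import os
import tempfile
import time


def measure_write_mb_per_s(path, block_size=256 * 1024,
                           total_bytes=32 * 1024 * 1024):
    """Write zero-filled blocks to `path` and return the rate in MB/s."""
    block = bytes(block_size)  # zero bytes, similar to reading /dev/zero
    written = 0
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as out:
        while written < total_bytes:
            written += out.write(block)
        try:
            # Include the flush to stable storage in the timing.
            os.fsync(out.fileno())
        except OSError:
            # Some device nodes do not support fsync; ignore that here.
            pass
    elapsed = time.perf_counter() - start
    return written / elapsed / 1e6


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "throughput.bin")
        rate = measure_write_mb_per_s(target)
        print(f"{rate:.1f} MB/s sequential write to {target}")
```

Comparing the figure for the spool or disk-pool filesystem with the rate the copy job actually achieves gives a rough idea of whether the disk side or something else is the limit.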
Re: [Bacula-users] Bacula VSS problems / ERR=Access is denied
On Thu, Oct 09, 2008 at 06:16:51AM -0700, yistoneriver wrote:

My colleague and I looked into this issue and found that we could reproduce this behavior under the following conditions:

1) FD runs on Windows 2003 Server
2) VSS is enabled
3) Only one file-list is listed in the FileSet resource
4) The file-list is a folder (directory)

So, a workaround is to add a dummy file-list. I hope this will help.

Hi! Actually yes, I think this was exactly the situation I had. There was only one directory specified for backups! The problem went away when we added more directories/disks.. So yeah, it sounds like a bug.. Thanks a lot for pointing this out!

-- Pasi

-Yuji

Pasi Kärkkäinen wrote:
On Mon, Jun 02, 2008 at 07:09:58PM +0300, Pasi Kärkkäinen wrote:
On Mon, Jun 02, 2008 at 09:40:23AM -0400, John Drescher wrote:

btw. is there a way to make bacula automatically rerun jobs failing like this?

Although I have never done this, the following two Bacula directives look like they could help:

Run After Failed Job
Rerun Failed Levels

http://www.bacula.org/en/dev-manual/Configuring_Director.html

Hmm.. it almost looks like I get that VSS error 50% of the time.. i.e. every second time it works and every second time it doesn't.

What's interesting is that the job itself terminates with "OK with warnings".. it's definitely not OK because it does absolutely nothing.. 0 files backed up when that "Access is denied" happens..

Yeah, did some 8 more test runs and 4 of them failed with that VSS ERR=Access is denied error..

Is it possible that winbacula leaves VSS in some bad state? That would maybe explain why it works every second time.. or is it more likely that there is something on the server causing this?
-- Pasi