Re: [Bacula-users] LTO-4 Drive issues
skipunk wrote:
> You're right, your setting is close to mine. I would run 1 backup at a time if
> I could pull the speeds that I get from a local backup, so currently I'll be
> pushing more backups at once until I can find the issues causing this.

Using concurrent backups and spooling is a _big_ win when running incrementals, as otherwise these will always cause shoe-shining if pointed directly at the tape drive.

Once you're using spooling, you can run multiple simultaneous backups and shorten the backup window even further.

My spool disk area is about 300 GB with a max chunk size of 100 GB (recently converted to SSD for extra speed). A lot of the machine full backups end up in one piece on the tape as a result. (Data backups are for 1 TB partitions and always end up in several chunks.)

--
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
Re: [Bacula-users] LTO-4 Drive issues
On 26.05.2010 01:55, skipunk wrote:
> The nic's in the server are Broadcom's netextreme II's. I would
> assume that they would support checksum offload and large tcp packet
> offload, but I'm really not sure.

They do support all offload options. But those options won't boost you from 40 MByte/s to 100 MByte/s; they help servers that are already heavily loaded, where local memory and CPU throughput are the limit. If your CPU is mostly idle, offloading won't buy you much. (At least with 1 Gbit NICs. With 10 Gbit NICs this is a whole different picture.)

But beware: some Linux versions had problems with offloading, so while benchmarking your network with netio and iperf it may be worthwhile to toggle some offloading options to see whether throughput and/or CPU load changes.

Regards, Sven.
Re: [Bacula-users] LTO-4 Drive issues
I am getting anywhere from 4 to 40 M Bytes/second transfer rate.

Sample from log (several lines omitted):

23-May 04:05 ip-bacula-sd JobId 3834: 3307 Issuing autochanger "unload slot 10, drive 0" command.
23-May 04:38 ip-bacula-sd JobId 3834: Job write elapsed time = 00:29:38, Transfer rate = 21.01 M bytes/second
23-May 04:41 ip-bacula-dir JobId 3834: Bacula ip-bacula-dir 3.0.1 (30Apr09): 23-May-2010 04:41:21
  Build OS:           x86_64-unknown-linux-gnu redhat
  Backup Level:       Full
  Client:             "Linux" 3.0.1 (30Apr09) i686-pc-linux-gnu,redhat,Enterprise release
  FileSet:            "Full Set" 2009-06-30 15:35:14
  Storage:            "TL2000" (From Job resource)
  Scheduled time:     23-May-2010 04:05:00
  Elapsed time:       34 mins 28 secs
  Priority:           10
  FD Files Written:   321,952
  SD Files Written:   321,952
  FD Bytes Written:   37,318,560,278 (37.31 GB)
  SD Bytes Written:   37,364,390,306 (37.36 GB)
  Rate:               18045.7 KB/s

From your replies it seems the issue is not the backup device, but someone mentioned tools for the device; what you can use is itdt-dcr. This program comes bundled in the downloads of firmware upgrades from Dell. Go to support.dell.com and use the form to get to TL2000 updates, or:
http://support.dell.com/support/downloads/download.aspx?c=us&cs=04&l=en&s=bsd&releaseid=R266766&SystemID=PWV_TL2000&servicetag=&os=LIN4&osl=en&deviceid=14489&devlib=0&typecnt=0&vercnt=10&catid=-1&impid=-1&formatcnt=0&libid=37&typeid=-1&dateid=-1&formatid=-1&source=-1&fileid=393643

There is also a web interface built into the TL2000 that can be turned on.

Also keep in mind that your Dell TL2000 may show up as IBM Model ULT3580-TD4; mine does, because it was made by IBM for Dell.

My settings are like yours below, except I don't have any of the lines below Autoselect. I have Maximum Concurrent Jobs = 20, but each client is set to 1.

All my servers have at least 2 ethernet ports, and I use one on each server set to a static IP in the 192.168.0.x range. All of these are connected to a gigabit switch that is not on the main network, and the backup server is also connected to this switch. I only do one server backup at a time, though.

hth
Thomas

On Sunday 23 May 2010 21:22:11 skipunk wrote:
> Janusz
>
> Both nics reported back at 100 Mb/s
> I'm beginning to think the network is the bottleneck.
>
> I ran a backup on my server with bacula. backing up the file I have been
> using. A 3.4Gb in 2 min and the only reason for it taking so long was
> because of a tape change in the library. So locally we are fine.
>
> Here is the autochanger and drive settings in bacula-sd.conf
>
> Autochanger {
>   Name = Autochanger
>   Device = Drive-1
>   Changer Command = "/etc/bacula/mtx-changer %c %o %S %a %d"
>   Changer Device = /dev/sg4
> }
>
> Device {
>   Name = Drive-1 #
>   Drive Index = 0
>   Media Type = LTO-4
>   Archive Device = /dev/nst0
>   AutomaticMount = yes; # when device opened, read it
>   AlwaysOpen = yes;
>   RemovableMedia = yes;
>   RandomAccess = no;
>   AutoChanger = yes
>   LabelMedia = no;
>   Autoselect = yes;
>   Maximum File Size = 5GB
>   Maximum Network Buffer Size = 262144
>   Maximum Block Size = 262144
>   Spool Directory = /spool
>   Maximum Spool Size = 200G
>   Maximum Job Spool Size = 20G
>
>   Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
> }

--
Thomas McMillan Grant Bennett
Operations & Systems Analyst
University Library
Appalachian State University
P O Box 32026
Boone, North Carolina 28608
(828) 262 6587
Library Systems Help Desk: https://www.library.appstate.edu/help/
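The two rates in the job report quoted in this message can be reproduced from the byte counts and elapsed times it prints. The sketch below is only a cross-check of that arithmetic; it assumes the report uses decimal units (1 KB = 1000 bytes), which is an inference from the fact that the numbers line up, not something stated in the thread.

```python
# Figures copied from the job report quoted in this thread.
FD_BYTES = 37_318_560_278        # "FD Bytes Written"
SD_BYTES = 37_364_390_306        # "SD Bytes Written"
JOB_ELAPSED_S = 34 * 60 + 28     # "Elapsed time: 34 mins 28 secs"
SD_WRITE_S = 29 * 60 + 38        # "Job write elapsed time = 00:29:38"


def rate(byte_count: int, seconds: int) -> float:
    """Average throughput in bytes per second."""
    return byte_count / seconds


# Assumption: decimal units (KB = 1000 bytes, MB = 1,000,000 bytes).
job_rate_kb = rate(FD_BYTES, JOB_ELAPSED_S) / 1000
sd_rate_mb = rate(SD_BYTES, SD_WRITE_S) / 1_000_000

print(f"Job rate:      {job_rate_kb:.1f} KB/s")   # Job rate:      18045.7 KB/s
print(f"SD write rate: {sd_rate_mb:.2f} MB/s")    # SD write rate: 21.01 MB/s
```

Both figures are averages over the whole job, so they say nothing about how evenly the drive was fed during the run.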
Re: [Bacula-users] LTO-4 Drive issues
> Morty Abzug wrote:
>
> > This sort of problem is typical of a speed/duplex mismatch. That is,
> > either your switch or your server is hardcoding the speed of your
> > network port, and the other end is using different settings.
>
> A lot of network admins still believe the best way to run a network is
> with port speeds locked.
>
> IMO that should have been deprecated for around a decade as
> autonegotiation works better except in certain edge cases (mainly
> involving ancient equipment or cabling)
>
> Hardcoding port settings causes far more problems than it solves.

I hear you. I've lost count of the number of times I've had that argument. If autonegotiation doesn't work then something is broken, and it should be fixed by fixing or replacing the hardware, not by tinkering with the speed and duplex settings. Maybe 10-15 years ago, but not now. I have never seen a case that justified locking the speed and duplex that wasn't caused by some other underlying issue (firmware upgrade required, bad cable, etc).

James
Re: [Bacula-users] LTO-4 Drive issues
Morty Abzug wrote:
> This sort of problem is typical of a speed/duplex mismatch. That is,
> either your switch or your server is hardcoding the speed of your
> network port, and the other end is using different settings.

A lot of network admins still believe the best way to run a network is with port speeds locked.

IMO that should have been deprecated for around a decade, as autonegotiation works better except in certain edge cases (mainly involving ancient equipment or cabling).

Hardcoding port settings causes far more problems than it solves.

AB
Re: [Bacula-users] LTO-4 Drive issues
skipunk wrote:
>
> I am aware that more nic's do not increase throughput. Basically a
> backup server has to be in place tonight and I really don't want to
> start from scratch and now reaching for anything that will resolve
> the issue within the next few hours.

1. Test the LTO drive with tar/dd. I think you already did that and the performance was OK.
2. Test the I/O throughput of your data server, the server that you want to back up. Use a tool like bonnie++, tiobench...
3. Test the network performance between your data server (file daemon) and your bacula-sd server with tools like netperf, netio...

Nothing right now sounds like a Bacula problem. You first have to find the bottleneck, which could be the network or something else.

Ralf
Re: [Bacula-users] LTO-4 Drive issues
> After returning to my office this morning, I had found the server was
> connected to a 10/100 switch. I had moved it to the rack and connected it to
> the switch 10/100/1000 and nic 1 pulls 1000 on ethtools
> nic 2 pulls 100 on ethtools.
>
> After further investigation, it looks like nic 2 is capped at 100 mb/s.
>
> Even running on 1 nic, my speeds have jumped up to a max of 8 Mb/s with an avg
> of 3 - 4 Mb/s when doing backups over the net.
>
> A little better but not where it should be.
>
> I'm at a loss for the moment.

iperf should be able to tell you what the link is capable of.

Also, be aware that server network cards often support checksum offload and large TCP packet offload. These can give you a significant performance boost, but if there are any bugs in the driver or firmware they cause all sorts of strange problems.

James
Re: [Bacula-users] LTO-4 Drive issues
On Mon, May 24, 2010 at 05:22:52PM -0400, skipunk wrote:
> We are running a Gigabit network. All servers have gigabit connection.

Are you using managed switches, such as Cisco Catalyst switches? Those can be configured per port to use lower speeds. If one end is hardcoded to 100/full and the other end is configured for auto, the auto end will often end up with 100/half. The mismatch will result in less than 100 Mbps effective throughput.

- Morty
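The mismatch described above shows up on the server side as a half-duplex or lower-than-expected link in `ethtool` output. Here is a minimal sketch of a checker for that output, assuming the usual `Speed:`, `Duplex:` and `Auto-negotiation:` lines; the function names, the 1000 Mb/s expectation and the sample text are illustrative and were not captured from the poster's server.

```python
import re


def parse_ethtool(text: str) -> dict:
    """Pull speed (Mb/s), duplex and auto-negotiation out of `ethtool <iface>` output."""
    info = {"speed_mbit": None, "duplex": None, "autoneg": None}
    for line in text.splitlines():
        key, _, value = line.strip().partition(":")
        value = value.strip()
        if key == "Speed":
            match = re.match(r"(\d+)Mb/s", value)
            # Anything that is not "<number>Mb/s" (e.g. link down) stays None.
            info["speed_mbit"] = int(match.group(1)) if match else None
        elif key == "Duplex":
            info["duplex"] = value.lower()
        elif key == "Auto-negotiation":
            info["autoneg"] = value.lower()
    return info


def link_warnings(info: dict, expected_mbit: int = 1000) -> list:
    """Return human-readable warnings for settings that suggest a mismatch."""
    warnings = []
    if info["duplex"] == "half":
        warnings.append("half duplex: possible speed/duplex mismatch")
    if info["speed_mbit"] is not None and info["speed_mbit"] < expected_mbit:
        warnings.append(f"link at {info['speed_mbit']} Mb/s, expected {expected_mbit}")
    if info["autoneg"] == "off":
        warnings.append("auto-negotiation is off on this end")
    return warnings


# Hand-written sample resembling ethtool output (hypothetical, not real data).
SAMPLE = """\
Settings for eth0:
\tSpeed: 100Mb/s
\tDuplex: Half
\tAuto-negotiation: on
"""

for warning in link_warnings(parse_ethtool(SAMPLE)):
    print(warning)
```

To use it against a live interface, the text could come from something like `subprocess.run(["ethtool", "eth0"], capture_output=True, text=True).stdout`. The switch port has to be checked separately, since a mismatch involves both ends.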
Re: [Bacula-users] LTO-4 Drive issues
On Mon, May 24, 2010 at 02:04:15PM -0400, skipunk wrote:
> Even running on 1 nic, my speeds have jumped up to a max of 8 Mb/s with an
> avg of 3 - 4 Mb/s when doing backups over the net.
>
> A little better but not where it should be.
>
> I'm at a loss for the moment.

Test your network connection using some other tool between the server in question and some other system, in a way that doesn't involve your disk or your tape, just the network. iperf is handy, or you can use dd with netcat or socat.

This sort of problem is typical of a speed/duplex mismatch. That is, either your switch or your server is hardcoding the speed of your network port, and the other end is using different settings.

- Morty
Re: [Bacula-users] LTO-4 Drive issues
On Mon, May 24, 2010 at 6:26 PM, skipunk wrote:
>
> I am aware that more nic's do not increase throughput. Basically a backup
> server has to be in place tonight and I really don't want to start from
> scratch and now reaching for anything that will resolve the issue within the
> next few hours.

1. Enable spooling. Use a 5 to 20 GB spool file.
2. Disable all 100 Mbit network cards.
3. Increase your concurrency to 5 to 10 clients.

John
Re: [Bacula-users] LTO-4 Drive issues
On 24.05.2010 22:31, skipunk wrote:
> What would it take to clear up the network bottleneck. I was looking
> at another server that came with our gaming system. Same server
> (memory, hd, etc) and tape library, running on win2k3 and netbackup
> with no issues. The only difference is, it's running 4 nic's. So it's
> leaving me a bit baffled that the only real difference is OS and 2
> more nics and my system is experiencing a bottle neck.

You do know that under Linux attaching more than one NIC to a LAN does not increase the throughput, don't you? You have to use bonding in either 802.3ad or ALB mode to achieve this. The first one uses LACP and needs a switch that is specially configured, while the second does not need any configuration in the switch. With ALB you can even connect the NICs to different switches, as long as the (V)LAN you connect your server to is available on both.

Regards, Sven.
Re: [Bacula-users] LTO-4 Drive issues
> What would it take to clear up the network bottleneck.

Upgrade to gigabit networking. A 24-port gigabit switch can be had for under $200 US. However, that assumes your desktops can support gigabit. Most motherboards have come with gigabit for the last 3 years; however, Dell, HP ... have been selling desktops with 100 Mbit ports. It will cost $15 to $35 per gigabit NIC if these need to be upgraded.

John
Re: [Bacula-users] LTO-4 Drive issues
skipunk wrote:
>
> What would it take to clear up the network bottleneck. I was looking at
> another server that came with our gaming system. Same server (memory, hd,
> etc) and tape library, running on win2k3 and netbackup with no issues. The
> only difference is, it's running 4 nic's. So it's leaving me a bit baffled
> that the only real difference is OS and 2 more nics and my system is
> experiencing a bottle neck.

It's probably the extra 2 NICs that are making the difference.

Unless you can keep the tape drive (and I think you mentioned more than one) continuously fed with data at better than 40 MB/s or so per drive, the tape stops, backs up and starts again, and this has a huge impact on throughput because of the time this cycle takes to complete.

Once any initial spool is exhausted, the drive will just stutter along in bursts.

Regards,

Richard
Re: [Bacula-users] LTO-4 Drive issues
On Mon, May 24, 2010 at 3:16 PM, Richard Scobie wrote:
> skipunk wrote:
>>
>> After returning to my office this morning, I had found the server was
>> connected to a 10/100 switch. I had moved it to the rack and connected it
>> to the switch 10/100/1000 and nic 1 pulls 1000 on ethtools
>> nic 2 pulls 100 on ethtools.
>>
>> After further investigation, it looks like nic 2 is capped at 100 mb/s.
>>
>> Even running on 1 nic, my speeds have jumped up to a max of 8 Mb/s with an
>> avg of 3 - 4 Mb/s when doing backups over the net.
>>
>> A little better but not where it should be.
>>
>> I'm at a loss for the moment.
>
> I think you mentioned in a previous post that writing a local, multi GB
> file to the drive went at full speed, so your drive is setup OK.
>
> Given it appears that the network is the bottleneck, you will need to
> copy all backup data to the machine local to the drive and then backup
> to tape.
>
> The setup in use here rsyncs remote machines to a fast local RAID5 and
> then bacula is used to tape from there.

Spooling with a 5 to 10 GB spool file, along with running multiple concurrent jobs, could help if the server can have at least one gigabit link to the switch. Without that, your backup jobs will likely take days to complete.

John
Re: [Bacula-users] LTO-4 Drive issues
skipunk wrote:
>
> After returning to my office this morning, I had found the server was
> connected to a 10/100 switch. I had moved it to the rack and connected it to
> the switch 10/100/1000 and nic 1 pulls 1000 on ethtools
> nic 2 pulls 100 on ethtools.
>
> After further investigation, it looks like nic 2 is capped at 100 mb/s.
>
> Even running on 1 nic, my speeds have jumped up to a max of 8 Mb/s with an
> avg of 3 - 4 Mb/s when doing backups over the net.
>
> A little better but not where it should be.
>
> I'm at a loss for the moment.

I think you mentioned in a previous post that writing a local, multi-GB file to the drive went at full speed, so your drive is set up OK.

Given that the network appears to be the bottleneck, you will need to copy all backup data to the machine local to the drive and then back up to tape.

The setup in use here rsyncs remote machines to a fast local RAID5, and then Bacula is used to write to tape from there.

Regards,

Richard
Re: [Bacula-users] LTO-4 Drive issues
> Both nics reported back at 100 Mb/s
> I'm beginning to think the network is the bottleneck.

That is not going to work well for a tape drive that needs 100 MB/s or more. Even with spooling, you will always be waiting on the slow network.

John
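To put numbers on the gap between a 100 Mb/s link and a drive that wants 100 MB/s, here is a small worked calculation. It is a sketch under stated assumptions: the ceiling ignores protocol overhead (so real throughput will be lower), and the 1 TB figure is the data set size mentioned elsewhere in this thread, taken as decimal terabytes.

```python
def link_ceiling_mb_s(link_mbit_s: float) -> float:
    """Theoretical ceiling of a link in MB/s (8 bits per byte, no protocol overhead)."""
    return link_mbit_s / 8


def hours_to_move(total_bytes: float, rate_mb_s: float) -> float:
    """Hours needed to move total_bytes at a sustained rate given in MB/s."""
    return total_bytes / (rate_mb_s * 1_000_000) / 3600


ONE_TB = 1_000_000_000_000  # assumption: decimal terabyte

for mbit in (100, 1000):
    ceiling = link_ceiling_mb_s(mbit)
    print(f"{mbit} Mbit/s link: at most {ceiling:.1f} MB/s, "
          f"{hours_to_move(ONE_TB, ceiling):.1f} h for 1 TB")
# 100 Mbit/s link: at most 12.5 MB/s, 22.2 h for 1 TB
# 1000 Mbit/s link: at most 125.0 MB/s, 2.2 h for 1 TB
```

Even the best case for a 100 Mb/s link is well below the roughly 40 MB/s that other replies in this thread give as the point where the drive starts shoe-shining.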
Re: [Bacula-users] LTO-4 Drive issues
> Status update. I just finished the first backup job which was 3.6 Gb. It
> took 2.5 hours to run. avg speeds finished around 500k/s.
>
> I've gone as far again to reboot the library and the server and still no
> changes. I'm not sure what i'm over looking. I have 1T of data to backup
> on one of the servers and at this rate it will take far too long to
> complete. Really pointless to even attempt.

If you've already tested using dd/tar/whatever from the local server and it's still lagging, this question is obsolete, but, just a blind shot: what's your Ethernet speed? The rate you describe looks like 10 Mbit.

  ethtool eth0 | grep -i speed

(or bond0, vlan33, whichever interface applies)
Re: [Bacula-users] LTO-4 Drive issues
What is your blocking factor on the drive? Assuming the hardware is OK, it sounds like you do not have your drive configured properly for variable block sizes.

Issue:

  mt-st -f /dev/st0 setblk 0
  mt-st -f /dev/st0 defblksize 0
  mt-st -f /dev/st0 defcompression 0
  mt-st -f /dev/st0 compression 0

(or mt, depending on what tape package you have on Fedora). The first two set your drive to variable block; the next two turn off compression (to get raw performance for a test).

Then run btape against your drive. An alternative would be to use tar with something like:

  tar -cv -b1024 -f /dev/st0 <file>

where <file> is a multi-GB file, preferably not fragmented.

Another thing to take a look at is your st driver settings. Under some versions of Linux this is in /etc/modprobe.d/mt-st. On my Ubuntu box here I use:

  options st buffer_kbs=2048 max_sg_segs=128

For Bacula you want to modify your bacula-sd.conf for your tape device. The main items would be:

  Maximum Block Size = 262144
  Maximum File Size = 5GB
  Maximum Network Buffer Size = 262144

And set your spool size as large as you can.

On 05/23/2010 00:06, skipunk wrote:
> I found another site showing the use of byte size and dd and after playing
> with it a bit i can hold 115 MB/s easy and backup my 3.5 GB file like
> nothing.
>
> I'm looking through Bacula documents, I don't recall if this can be set or
> not in the config files.
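The tar test and the Bacula setting suggested in this message use different record sizes, which is worth keeping in mind when comparing their throughput. A short worked calculation follows; it relies on tar's `-b` option counting 512-byte units, and the constant names are only illustrative.

```python
TAR_BLOCKING_FACTOR = 1024   # from "tar -cv -b1024"
TAR_UNIT_BYTES = 512         # tar's -b option counts 512-byte units
BACULA_MAX_BLOCK = 262144    # "Maximum Block Size" suggested in this message

# Size of each record tar hands to the tape driver.
tar_record_bytes = TAR_BLOCKING_FACTOR * TAR_UNIT_BYTES

print(f"tar record size:   {tar_record_bytes // 1024} KiB")   # tar record size:   512 KiB
print(f"Bacula block size: {BACULA_MAX_BLOCK // 1024} KiB")   # Bacula block size: 256 KiB
```

So the tar test writes records twice the size of the largest block Bacula would write with this setting.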
Re: [Bacula-users] LTO-4 Drive issues
> I was not able to copy/backup files using tar. I may have been doing
> something wrong.
>
> I had a job in progress using dd. a 3.3Gb file which i started 40 min ago
> and it's still running.

Check your dmesg output for errors. That is horribly slow.

John
Re: [Bacula-users] LTO-4 Drive issues
skipunk wrote:
> Status update. I just finished the first backup job which was 3.6 Gb. It
> took 2.5 hours to run. avg speeds finished around 500k/s.
>
> I've gone as far again to reboot the library and the server and still no
> changes. I'm not sure what i'm over looking. I have 1T of data to backup on
> one of the servers and at this rate it will take far too long to complete.
> Really pointless to even attempt.

This throughput is so low that there almost has to be some hardware issue. Is your tape drive sharing the same SAS controller as other storage?

In order to take full advantage of LTO-4 speed, spooling all data to a fast drive array is mandatory. When backing up highly compressible data using on-drive compression, the RAID array I use for spooling feeds the drive at around 180 MB/s.

Regards,

Richard
Re: [Bacula-users] LTO-4 Drive issues
skipunk wrote:
> Any suggestions would be appreciated.

The other replies have suggested running the standard tests. You didn't say anything about hardware, and I found that to be very important when I changed from DAT to DLT tapes, which initially ran just as glacially as the DAT.

The first thing I did was set up a RAID 5 array as the spool area. Perhaps the major aspect was that this went from 10Mb/s hardware to 40Mb/s capable hardware. I also moved the tape drive off to a separate SCSI bus, again 40Mb/s capable.

The slow part was the spooling, and after checking networking and upgrading from 10Mb/s to 100Mb/s (negligible difference), it was traced back to the individual slow file servers (SO).

So I then started to run parallel jobs and divided the spool area equally.

Note that a large spool file is not necessarily the best. Smaller spool files might result in faster job times, as they minimise the chances of one job interfering with another while waiting for the tape drive. The trade-off is longer recovery times, as more tape has to be run through for all the chunks.

Last was to carefully arrange the job starts. I queue the two big jobs first and leave the third job slot to the jumble of shorter jobs, which start a few minutes later.

Just 2c for consideration.

--
Terry Collins {:-)}
Bicycles, Appropriate Technology, Natural Environment, Welding
Re: [Bacula-users] LTO-4 Drive issues
skipunk wrote:
>
> Hoping someone could help me out. My department recently purchased
> a Dell PowerVault TL2000 autochanger connected via SAS5.
>
> We are upgrading from a spectralogic AIT4 system.
>
> The sad part is the LTO-4 drive is running much slower than the AIT
> system. I'm avg around 714k/s. I have been reading where others
> complain about only 35 - 50 M/s instead of 120+. At this point I'd
> love to 35 - 50 M/s.
>
> I'm running Fedora 12 & Bacula 5.0.2
> This is all on a new server. The AIT system is running on Fedora 10 & Bacula
> 5.0.0
>
> All the config files are similar if not identical except for
> hardware settings. I just don't seem to understand why this is so
> slow. Any suggestions would be appreciated.

Did you test the drive with tar, dd or other system tools? Did you test the drive with btape? Anything in the kernel log file?

http://www.bacula.org/de/dev-manual/Testing_Your_Tape_Drive.html

I'd begin with the basic tests with dd and tar. Create a 5 GB file with random data (/dev/urandom), put the file on a fast disk and write it to tape with dd. Everything below ~40 MB/s will do harm to your drive and tapes, because the drive will start shoe-shining.

http://en.wikipedia.org/wiki/Tape_drive#Problems

Thus you should use spooling for backup jobs that can't deliver this minimum data rate.

Ralf
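The random-file write test described above can be sketched as a small timing helper. This is not a replacement for dd or btape: the function names are made up for illustration, the 256 KiB block size is borrowed from the `Maximum Block Size = 262144` setting quoted elsewhere in this thread, and the demo only writes to temporary files. Whether block-sized unbuffered writes from Python to a tape device node behave like dd on a given system is an assumption to verify before relying on the numbers.

```python
import os
import tempfile
import time

BLOCK_SIZE = 262144      # 256 KiB, same figure as "Maximum Block Size" quoted in the thread
SHOE_SHINE_MB_S = 40     # rough threshold given in the message above


def make_random_file(path: str, size_bytes: int) -> None:
    """Fill a file with incompressible data, similar to reading /dev/urandom with dd."""
    with open(path, "wb") as out:
        remaining = size_bytes
        while remaining > 0:
            chunk = os.urandom(min(BLOCK_SIZE, remaining))
            out.write(chunk)
            remaining -= len(chunk)


def timed_copy(src: str, dst: str) -> float:
    """Copy src to dst in BLOCK_SIZE writes and return the average rate in MB/s."""
    start = time.perf_counter()
    copied = 0
    # buffering=0 so each write() call goes straight to the destination.
    with open(src, "rb") as fin, open(dst, "wb", buffering=0) as fout:
        while True:
            block = fin.read(BLOCK_SIZE)
            if not block:
                break
            fout.write(block)
            copied += len(block)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return copied / 1_000_000 / elapsed


if __name__ == "__main__":
    # Demo against temporary files only; pointing dst at a tape device is left to the reader.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "random.bin")
        dst = os.path.join(tmp, "copy.bin")
        make_random_file(src, 8 * BLOCK_SIZE)
        mb_s = timed_copy(src, dst)
        verdict = "ok" if mb_s >= SHOE_SHINE_MB_S else "below the shoe-shining threshold"
        print(f"{mb_s:.1f} MB/s: {verdict}")
```

For a meaningful result the source file should be several GB and sit on a disk that is itself faster than the drive, as the message above recommends; otherwise the test measures the disk, not the tape.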
Re: [Bacula-users] LTO-4 Drive issues
> Hi all,
>
> Hoping someone could help me out. My department recently purchased a Dell
> PowerVault TL2000 autochanger connected via SAS5.
>
> We are upgrading from a spectralogic AIT4 system.
>
> The sad part is the LTO-4 drive is running much slower than the AIT system.
> I'm avg around 714k/s. I have been reading where others complain about only 35
> - 50 M/s instead of 120+. At this point I'd love to 35 - 50 M/s.
>
> I'm running Fedora 12 & Bacula 5.0.2
> This is all on a new server. The AIT system is running on Fedora 10 & Bacula
> 5.0.0
>
> All the config files are similar if not identical except for hardware
> settings. I just don't seem to understand why this is so slow. Any suggestions
> would be appreciated.

If it were an HP drive, you would use HP Library & Tape Tools to run a bunch of tests on it and tell you whether everything is good. Dell probably has a similar tool. That would at least tell you whether the hardware is okay or not.

James