Re: [BackupPC-users] experiences with very large pools?
> I think I'll have to look for a different solution, I just can't imagine a pool with 10 TB.
>
> * I have recently taken my DRBD mirror off-line and copied the BackupPC directory structure to both XFS-without-DRBD and an EXT4 file system for testing. Performance of the XFS file system was not much different with or without DRBD (a fat fiber link helps there). The first traversal of the pool on the EXT4 partition is about 66% through after about 96 hours.
>
> nice ;)
>
> Ralf

You may want to look at this thread:
http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17234.html

--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev

___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
Re: [BackupPC-users] experiences with very large pools?
Ralf Gross wrote:
> I think I'll have to look for a different solution, I just can't imagine a pool with 10 TB.

BackupPC's usual scaling issues are with the number of files/links more than with total size, so the problems may be different when you work with huge files. I thought someone had posted here about using NFS with a common archive and several servers running the backups, but I've forgotten the details about how he avoided conflicts and managed it. Maybe this would be the place to look at OpenSolaris with ZFS's new block-level de-dup and a simpler rsync copy.

--
Les Mikesell
lesmikes...@gmail.com
Re: [BackupPC-users] experiences with very large pools?
Gerald Brandt schrieb:
> > I think I'll have to look for a different solution, I just can't imagine a pool with 10 TB.
> > [...]
>
> You may want to look at this thread:
> http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17234.html

I've seen this thread, but the pool sizes there are at most in the lower TB region.

Ralf
Re: [BackupPC-users] experiences with very large pools?
You would need to move up to 15K RPM drives to have a very large array, and the cost will grow exponentially trying to get such a large array. As Les said, look at a ZFS array with block-level dedup. I have a 3 TB setup right now, and I have been running a backup against a Unix server and 2 Linux servers in my main office here to see how the dedup works:

opensolaris:~$ zpool list
NAME      SIZE  ALLOC   FREE  CAP  DEDUP   HEALTH  ALTROOT
rpool      74G  5.77G  68.2G   7%  1.00x   ONLINE  -
storage  3.06T  1.04T  2.02T  66%  19.03x  ONLINE  -

This is just rsync 3 pulling data over to a directory /storage/host1, which is a ZFS fileset off pool storage, one per host. My script is very simple at this point:

zfs snapshot storage/ho...@`date +%Y.%m.%d-%M.%S`
rsync -aHXA --exclude-from=/etc/backups/host1excludes.conf host1:/ /storage/host1

To build the pool and fileset:

format        # lists all available disks
zpool status  # shows which disks are already in pools
zpool create storage mirror disk1 disk2 disk3 etc etc spare disk11 cache disk12 log disk13
# The cache disk is a high-RPM disk or SSD, basically a massive buffer for IO caching.
# The log is a transaction log; it doesn't need much size, but IO matters, so use a high-RPM disk or a smaller SSD.
# Cache and log are optional and are mainly for performance improvements when using slower storage drives like my 7200 RPM SATA drives.
zfs create -o dedup=on -o compression=on storage/host1   # or dedup=verify

Dedup is very, very good for writes BUT requires a big CPU; don't re-purpose your old P3 for this. Compression is actually going to help your write performance, assuming you have a fast CPU: it reduces the IO load, and ZFS will re-order writes on the fly. Dedup is all in-line, so it reduces IO load for anything with common blocks. It is also block-level, not file-level, so a large file with slight changes will get deduped. Dedup+compression really needs a fast dual-core or quad-core.
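The snapshot-then-rsync cycle above could be wrapped into a minimal cron script along these lines. This is a sketch, not the poster's actual script: the host list, the excludes-file layout, and the `%H.%M` timestamp (the message uses `%M.%S`, which looks like a typo) are illustrative assumptions; the rsync flags and the snapshot-before-sync idea come straight from the message.

```shell
#!/bin/sh
# Minimal nightly backup cycle, one ZFS fileset per host:
# freeze yesterday's state as a snapshot, then rsync today's data over it.
# Host names and paths are illustrative.
for host in host1 host2; do
    # Snapshot the fileset before overwriting it with fresh data
    zfs snapshot "storage/${host}@$(date +%Y.%m.%d-%H.%M)"
    # Pull the live filesystem; -H keeps hard links, -X/-A keep xattrs and ACLs
    rsync -aHXA \
        --exclude-from="/etc/backups/${host}excludes.conf" \
        "${host}:/" "/storage/${host}/"
done
```

Run from cron once a day; each run leaves a browsable read-only copy under /storage/&lt;host&gt;/.zfs/snapshot/.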
If you look at my zpool list above you can see my dedup at 19x and usage at 1.04T, which effectively means I'm getting ~19 TB in 1 TB worth of space. My servers have relatively few files that change, and the large files get appended to, so I really only store the changes. Snapshots are almost instant and can be browsed at /storage/host1/.zfs/snapshot/; they are labeled by the @`date xxx`, so I get folders for the dates. These are read-only snapshots and can be shared via Samba or NFS:

opensolaris:/storage/host1/.zfs/snapshot# zfs list -t snapshot
NAME
rpool/ROOT/opensola...@install   270M  -  3.26G  -
storage/ho...@2010.02.19-48.33

zfs set sharesmb=on storage/ho...@2010.02.19-48.33
  -or-
zfs set sharenfs=on storage/ho...@2010.02.19-48.33

If you don't want to go pure OpenSolaris, then look at Nexenta. It is a functional OpenSolaris-Debian/Ubuntu hybrid with ZFS, and it has dedup. It does not currently share via iSCSI, so keep that in mind. I believe it also uses a full Samba package for SMB shares, while OpenSolaris can use the native CIFS server, which is faster than Samba. OpenSolaris can also join Active Directory, though you need to extend your AD schema. If you do, you can give a privileged user UID and GID mappings in AD, and then you can access the windows1/C$ shares. I would create a backup user and add them to restricted groups in GP to be local administrators on the machines (but not domain admins). You would probably want to figure out how to do a VSS snapshot and rsync that over instead of the active filesystem, because you will get tons of file locks if you don't.

Good luck

On Fri, Feb 19, 2010 at 6:51 AM, Les Mikesell lesmikes...@gmail.com wrote:
> Ralf Gross wrote:
> > I think I'll have to look for a different solution, I just can't imagine a pool with 10 TB.
>
> BackupPC's usual scaling issues are with the number of files/links more than total size, so the problems may be different when you work with huge files.
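As a quick sanity check on those zpool numbers (1.04T allocated at a 19.03x dedup ratio), the implied amount of logical data stored is:

```shell
# Physical allocation x reported dedup ratio = logical data actually stored.
awk 'BEGIN { printf "%.1f TB logical in 1.04 TB physical\n", 1.04 * 19.03 }'
# prints: 19.8 TB logical in 1.04 TB physical
```

That matches the "19 TB in 1 TB worth of space" claim above.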
> I thought someone had posted here about using NFS with a common archive and several servers running the backups, but I've forgotten the details about how he avoided conflicts and managed it. Maybe this would be the place to look at OpenSolaris with ZFS's new block-level de-dup and a simpler rsync copy.
Re: [BackupPC-users] experiences with very large pools?
Les Mikesell schrieb:
> BackupPC's usual scaling issues are with the number of files/links more than total size, so the problems may be different when you work with huge files. [...] Maybe this would be the place to look at OpenSolaris with ZFS's new block-level de-dup and a simpler rsync copy.

ZFS sounds nice, but we have no experience with OpenSolaris or ZFS. And I heard in the past that not all of ZFS's features are ready for production.

Bit off topic: right now I'm looking for a cheap storage solution based on Supermicro chassis with 36 drive bays (server) or 45 drive bays (expansion unit) in 4 HU. Frightening: that would be 810 TB in one rack ((36 + 45) HDDs x 2 TB x 5 units, 40 HU) with 5 servers. The only problems are power, cooling and backup.

Ralf
Re: [BackupPC-users] experiences with very large pools?
On 2/19/2010 9:42 AM, Ralf Gross wrote:
> ZFS sounds nice, but we have no experience with OpenSolaris or ZFS.

That's something that could be fixed.

> And I heard in the past that not all of ZFS's features are ready for production.

In the past, nothing worked on any OS.

> Bit off topic: right now I'm looking for a cheap storage solution based on Supermicro chassis with 36 drive bays (server) or 45 drive bays (expansion unit) in 4 HU. Frightening: that would be 810 TB in one rack with 5 servers. The only problems are power, cooling and backup.

What's generating that kind of data? Can you make whatever it is write copies to 2 different places so you don't have to deal with finding the differences in something that size for incrementals? Or perhaps store it in time-slice volumes so you know where the changes you need to back up each day will be?

--
Les Mikesell
lesmikes...@gmail.com
Re: [BackupPC-users] experiences with very large pools?
Ralf Gross ralf-li...@ralfgross.de wrote on 02/19/2010 10:42:35 AM:
> Bit off topic: right now I'm looking for a cheap storage solution based on Supermicro chassis with 36 drive bays (server) or 45 drive bays (expansion unit) in 4 HU. Frightening: that would be 810 TB in one rack with 5 servers. The only problems are power, cooling and backup.

That's why companies like EMC and NetApp get big money for selling you nearly the *exact* same hardware: but with software and services designed to handle things like... backup. With storage sets of that size, there's really very little you can do outside of snapshots, volume management, and lots and lots of disk (and chassis and processor and power and ...) redundancy. Simply traversing a file system of that size is going to take more time than you have for a backup window. If you want anything approaching daily backups, you can't do it at the filesystem level. :( And even for things like off-site backup, it's far easier to have a smaller version of your big array off-site and sync a snapshot periodically (taking advantage of the logging/COW filesystem of the array system) than it is to try to traverse an entire 800 TB filesystem (or multiple filesystems that add up to 800 TB).

Tim Massey
Re: [BackupPC-users] experiences with very large pools?
Les Mikesell schrieb:
> > ZFS sounds nice, but we have no experience with OpenSolaris or ZFS.
>
> That's something that could be fixed.

Sure, but it's something I can't estimate right now.

> > And I heard in the past that not all of ZFS's features are ready for production.
>
> In the past, nothing worked on any OS.

That's a bit harsh...

> What's generating that kind of data? Can you make whatever it is write copies to 2 different places so you don't have to deal with finding the differences in something that size for incrementals? Or perhaps store it in time-slice volumes so you know where the changes you need to back up each day will be?

The data is mainly uncompressed raw video data (AFAIK HDF; I don't work with the data myself). Users come with external HDDs and copy the data onto the Samba file servers. Right now we back up 70 TB to tape, but I would like to get rid of tapes. For the large RAID volumes there is also no regular backup; we only do it on demand. But this should change now...
Ralf
Re: [BackupPC-users] experiences with very large pools?
Timothy J Massey schrieb:
> That's why companies like EMC and NetApp get big money for selling you nearly the *exact* same hardware: but with software and services designed to handle things like... backup. [...] Simply traversing a file system of that size is going to take more time than you have for a backup window. If you want anything approaching daily backups, you can't do it at the filesystem level. :(

You are absolutely right. I hope we will realize part of the storage with e.g. NetApp and only a small part with a cheap solution which doesn't need backup, or only on a best-effort basis. The data is not changing much; most of the files just lie there and will not be read again.

Ralf
Re: [BackupPC-users] experiences with very large pools?
dan schrieb:
> [...detailed ZFS/dedup setup and the Nexenta suggestion trimmed...]

Thanks for your detailed reply. I'll have a look at Nexenta; right now www.nexenta.org seems to be down.

Ralf
Re: [BackupPC-users] experiences with very large pools?
On 2/19/2010 10:28 AM, Ralf Gross wrote:
> Thanks for your detailed reply. I'll have a look at Nexenta; right now www.nexenta.org seems to be down.

You'd want www.nexenta.com for the NexentaStor product; nexenta.org is for the desktop flavor. With ZFS you could probably build 2 identical systems and use an incremental snapshot send/receive for the backup, but I have no idea how that scales. I'd expect it would be much faster than traversing the directories, though.

--
Les Mikesell
lesmikes...@gmail.com
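A sketch of that send/receive scheme, assuming a pool/fileset layout like the one shown earlier in the thread; the remote name `backupbox`, the target pool `tank`, and the snapshot names are illustrative assumptions:

```shell
# Day 0: create a baseline snapshot and send the whole fileset once.
zfs snapshot storage/host1@2010.02.19
zfs send storage/host1@2010.02.19 | ssh backupbox zfs receive -F tank/host1

# Every day after: snapshot again and send only the blocks that changed
# between the two snapshots (-i = incremental).
zfs snapshot storage/host1@2010.02.20
zfs send -i storage/host1@2010.02.19 storage/host1@2010.02.20 \
    | ssh backupbox zfs receive tank/host1
```

No directory traversal happens on either side; the cost is proportional to the changed blocks, which is why it should beat an rsync walk at this scale.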
Re: [BackupPC-users] experiences with very large pools?
Ralf Gross wrote:
> Gerald Brandt schrieb:
> > You may want to look at this thread:
> > http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17234.html
>
> I've seen this thread, but the pool sizes there are at most in the lower TB region.

Not all of them...

http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17240.html

Chris
Re: [BackupPC-users] experiences with very large pools?
Chris Robertson wrote:
> Not all of them...
> http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17240.html

Sorry for the noise... I was looking at the size of the full backups, not the pool. On a side note, that's some serious compression, de-duplication, or a massive problem with the pool.

Chris
Re: [BackupPC-users] experiences with very large pools?
Chris Robertson schrieb:
> Sorry for the noise... I was looking at the size of the full backups, not the pool. On a side note, that's some serious compression, de-duplication, or a massive problem with the pool.

I stumbled across this too.

Ralf
Re: [BackupPC-users] experiences with very large pools?
Chris Robertson schrieb:
> Ralf Gross wrote:
> > [...original question trimmed...]
>
> In one way, and compared to some, my backup set is pretty small (the pool is 791.45 GB). In another dimension, I think it is one of the larger, comprising 20,874,602 files. The breadth of my pool leads to...
>
> -bash-3.2$ df -i /data/
> Filesystem  Inodes      IUsed     IFree       IUse%  Mounted on
> /dev/drbd0  1932728448  47240613  1885487835  3%     /data
>
> ...nearly 50 million inodes used (so somewhere close to 30 million hard links). XFS holds up surprisingly well to this abuse*, but the strain shows. Traversing the whole pool takes three days. Attempting to grow my tail (the number of backups I keep) causes serious performance degradation as I approach 55 million inodes. Just an anecdote to be aware of.

I think I'll have to look for a different solution, I just can't imagine a pool with 10 TB.

> * I have recently taken my DRBD mirror off-line and copied the BackupPC directory structure to both XFS-without-DRBD and an EXT4 file system for testing. Performance of the XFS file system was not much different with or without DRBD (a fat fiber link helps there). The first traversal of the pool on the EXT4 partition is about 66% through after about 96 hours.

nice ;)

Ralf
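Taking the EXT4 figure above at face value (about 66% of the pool traversed after about 96 hours), the projected full traversal works out to roughly six days:

```shell
# 96 hours covered 66% of the pool; scale up to 100%.
awk 'BEGIN { t = 96 / 0.66; printf "%.0f hours total (about %.0f days)\n", t, t / 24 }'
# prints: 145 hours total (about 6 days)
```

That is roughly twice the three days the same pool takes on XFS, which puts the "nice ;)" in perspective.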
Re: [BackupPC-users] experiences with very large pools?
Ralf Gross wrote:
> [...original question trimmed...]

In one way, and compared to some, my backup set is pretty small (the pool is 791.45 GB). In another dimension, I think it is one of the larger, comprising 20,874,602 files. The breadth of my pool leads to...

-bash-3.2$ df -i /data/
Filesystem  Inodes      IUsed     IFree       IUse%  Mounted on
/dev/drbd0  1932728448  47240613  1885487835  3%     /data

...nearly 50 million inodes used (so somewhere close to 30 million hard links). XFS holds up surprisingly well to this abuse*, but the strain shows. Traversing the whole pool takes three days. Attempting to grow my tail (the number of backups I keep) causes serious performance degradation as I approach 55 million inodes. Just an anecdote to be aware of.

Chris

* I have recently taken my DRBD mirror off-line and copied the BackupPC directory structure to both XFS-without-DRBD and an EXT4 file system for testing. Performance of the XFS file system was not much different with or without DRBD (a fat fiber link helps there). The first traversal of the pool on the EXT4 partition is about 66% through after about 96 hours.

--
SOLARIS 10 is the OS for Data Centers - provides features such as DTrace, Predictive Self Healing and Award Winning ZFS. Get Solaris 10 NOW
http://p.sf.net/sfu/solaris-dev2dev
[BackupPC-users] experiences with very large pools?
Hi,

I'm faced with growing storage demands in my department. In the near future we will need several hundred TB, mostly large files. At the moment we already have 80 TB of data which gets backed up to tape.

Providing the primary storage is not the big problem; my biggest concern is backing up the data. One solution would be a NetApp setup with snapshots. On the other hand, that is a very expensive solution, and the data will be written once and then only read. In short: it should be a cheap solution, but the data should be backed up. And it would be nice if we could abandon tape backups...

My idea is to use some big RAID 6 arrays for the primary data and create LUNs in slices of max. 10 TB with XFS filesystems. BackupPC would be ideal for backup because of the pool feature (we already use BackupPC for a smaller amount of data).

Does anyone have experience with BackupPC and a pool size of 50 TB? I'm not sure how well this will work; I see that BackupPC needs 45 h to back up 3.2 TB of data right now, mostly small files. I don't like very large filesystems, but I don't see how this will scale with either multiple BackupPC servers and smaller filesystems (well, more than one server will be needed anyway, but I don't want to run 20 or more servers...) or (if possible) multiple BackupPC instances on the same server, each with its own pool filesystem.

So, is anyone using BackupPC in such an environment?

Ralf
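A back-of-the-envelope extrapolation from the numbers above (3.2 TB in 45 h, roughly 20 MB/s) shows why a single 50 TB pool is worrying:

```shell
# Current throughput is 3.2 TB / 45 h; how long would one 50 TB pass take?
awk 'BEGIN { rate = 3.2 / 45; t = 50 / rate; printf "%.0f hours (about %.0f days)\n", t, t / 24 }'
# prints: 703 hours (about 29 days)
```

Even if the rate improved somewhat with larger files, a full pass at this speed is measured in weeks, not days, which is the motivation for splitting across servers or instances.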