Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Gerald Brandt

 
 I think I have to look for a different solution; I just can't imagine a 
 pool with > 10 TB. 
 
 
  * I have recently taken my DRBD mirror off-line and copied the BackupPC 
  directory structure to both XFS-without-DRBD and an EXT4 file system for 
  testing. Performance of the XFS file system was not much different 
  with or without DRBD (a fat fiber link helps there). The first 
  traversal of the pool on the EXT4 partition is about 66% complete 
  after about 96 hours. 
 
 nice ;) 
 
 Ralf 
 


You may want to look at this thread 
http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17234.html 



Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Les Mikesell
Ralf Gross wrote:

 I think I have to look for a different solution; I just can't imagine a
 pool with > 10 TB.

Backuppc's usual scaling issues are with the number of files/links more than 
total size, so the problems may be different when you work with huge files.  I 
thought someone had posted here about using nfs with a common archive and 
several servers running the backups but I've forgotten the details about how he 
avoided conflicts and managed it.  Maybe this would be the place to look at 
opensolaris with zfs's new block-level de-dup and a simpler rsync copy.

-- 
   Les Mikesell
lesmikes...@gmail.com




Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Ralf Gross
Gerald Brandt schrieb:
 
  
  I think I have to look for a different solution; I just can't imagine a 
  pool with > 10 TB. 
  
  
   * I have recently taken my DRBD mirror off-line and copied the BackupPC 
   directory structure to both XFS-without-DRBD and an EXT4 file system for 
   testing. Performance of the XFS file system was not much different 
   with or without DRBD (a fat fiber link helps there). The first 
   traversal of the pool on the EXT4 partition is about 66% complete 
   after about 96 hours. 
  
  nice ;) 
 
 You may want to look at this thread 
 http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17234.html
  

I've seen this thread, but the pool sizes there are max. in the lower
TB region.

Ralf



Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread dan
you would need to move up to 15K RPM drives to make a very large array
perform, and the cost will grow exponentially trying to build an array that size.

as Les said, look at a ZFS array with block-level dedup.  I have a 3TB setup
right now and have been running backups against a unix server and 2
linux servers in my main office here to see how the dedup works:

opensolaris:~$ zpool list
NAME      SIZE  ALLOC   FREE   CAP   DEDUP  HEALTH  ALTROOT
rpool      74G  5.77G  68.2G    7%   1.00x  ONLINE  -
storage  3.06T  1.04T  2.02T   66%  19.03x  ONLINE  -

this is just rsync 3 pulling data over to a directory, /storage/host1, which
is a zfs fileset on the pool storage (one fileset per host).

my script is very simple at this point

zfs snapshot storage/host1@`date +%Y.%m.%d-%M.%S`
rsync -aHXA --exclude-from=/etc/backups/host1excludes.conf host1:/ /storage/host1
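
For reference, a slightly expanded sketch of the same snapshot-then-rsync idea,
wrapped in a small shell script; the dataset name, excludes path and date
format are taken from the commands above, and the exit-code check is just an
illustration:

#!/bin/sh
# one pull-style backup run for a single host:
# snapshot the previous state first, then rsync the new data over it
HOST=host1
SNAP=`date +%Y.%m.%d-%M.%S`
zfs snapshot storage/${HOST}@${SNAP} || exit 1
rsync -aHXA --exclude-from=/etc/backups/${HOST}excludes.conf \
      ${HOST}:/ /storage/${HOST}/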

to build the pool and fileset:
format            # lists all available disks
zpool status      # tells you which disks are already in pools
zpool create storage mirror disk1 disk2 disk3 etc etc spare disk11 cache disk12 log disk13
# the cache disk is a high-RPM disk or SSD, basically a massive buffer for IO caching
# the log is a transaction log; it doesn't need a lot of size, but IO matters,
# so use a high-RPM disk or a smaller SSD
# cache and log are optional and are mainly for performance improvements when
# using slower storage drives like my 7200RPM SATA drives
zfs create -o dedup=on -o compression=on storage/host1    # or dedup=verify
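
once the fileset exists you can check what dedup and compression are actually
buying you; these are standard zpool/zfs properties, using the pool and
fileset names from above:

zpool get dedupratio storage            # overall dedup factor for the pool
zfs get compressratio storage/host1     # compression factor for one fileset
zfs get used,referenced storage/host1   # space consumed vs. data referenced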

dedup is very, very good for writes BUT requires a big CPU.  don't re-purpose
your old P3 for this.
compression is actually going to help your write performance, assuming you
have a fast CPU.  it will reduce the IO load, and zfs will re-order writes on
the fly.
dedup is all in-line, so it reduces IO load for anything with common blocks.
it is also block level, not file level, so a large file with slight changes
will still get deduped.

dedup+compression really needs a fast dual core or quad core.

if you look at my zpool list above you can see my dedup ratio at 19x and usage
at 1.04T, which effectively means I'm storing roughly 19TB of data in 1TB
worth of space.  my servers have relatively few files that change, and the
large files get appended to, so I really only store the changes.

snapshots are almost instant and can be browsed under
/storage/host1/.zfs/snapshot/; they are labeled by the @`date xxx` part, so I
get folders for the dates.  these are read-only snapshots and can be shared
via samba or nfs.
zfs list -t snapshot

opensolaris:/storage/host1/.zfs/snapshot# zfs list -t snapshot
NAME                             USED  AVAIL  REFER  MOUNTPOINT
rpool/ROOT/opensolaris@install   270M      -  3.26G  -
storage/host1@2010.02.19-48.33

zfs set sharesmb=on storage/host1@2010.02.19-48.33
-or-
zfs set sharenfs=on storage/host1@2010.02.19-48.33
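
restoring a single file is then just a copy out of the read-only snapshot
directory; a minimal example, using the snapshot name from the listing above
(the /etc/fstab path is only an illustration):

zfs list -r -t snapshot storage/host1
cp /storage/host1/.zfs/snapshot/2010.02.19-48.33/etc/fstab /tmp/fstab.restored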


if you don't want to go pure opensolaris then look at nexenta.  it is a
functional opensolaris-debian/ubuntu hybrid with ZFS, and it has dedup.  it
does not currently share via iscsi, so keep that in mind.  I believe it also
uses a full samba package for samba shares, while opensolaris can use the
native CIFS server, which is faster than samba.

opensolaris can also join Active Directory, but you need to extend your AD
schema.  If you do, you can give a privileged user UID and GID mappings in AD,
and then you can access the windows1/C$ shares.  I would create a backup
user and add it via restricted groups in Group Policy as a local administrator
on the machines (but not a domain admin).  You would probably want to figure
out how to take a VSS snapshot and rsync that over instead of the active
filesystem, because you will get tons of file-lock problems if you don't.

good luck




On Fri, Feb 19, 2010 at 6:51 AM, Les Mikesell lesmikes...@gmail.com wrote:

 Ralf Gross wrote:
 
  I think I have to look for a different solution; I just can't imagine a
  pool with > 10 TB.

 Backuppc's usual scaling issues are with the number of files/links more
 than
 total size, so the problems may be different when you work with huge files.
  I
 thought someone had posted here about using nfs with a common archive and
 several servers running the backups but I've forgotten the details about
 how he
 avoided conflicts and managed it.  Maybe this would be the place to look at
 opensolaris with zfs's new block-level de-dup and a simpler rsync copy.




Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Ralf Gross
Les Mikesell schrieb:
 Ralf Gross wrote:
 
  I think I have to look for a different solution; I just can't imagine a
  pool with > 10 TB.
 
 Backuppc's usual scaling issues are with the number of files/links more than 
 total size, so the problems may be different when you work with huge files.  
 I 
 thought someone had posted here about using nfs with a common archive and 
 several servers running the backups but I've forgotten the details about how 
 he 
 avoided conflicts and managed it.  Maybe this would be the place to look at 
 opensolaris with zfs's new block-level de-dup and a simpler rsync copy.

ZFS sounds nice, but we have no experience with opensolaris or ZFS.
And I heard in the past that not all of ZFS's features are ready for
production.

bit off topic:
Right now I'm looking for a cheap storage solution that is based
on Supermicro chassis with 36 drive bays (server) or 45 drive bays
(expansion unit) in 4 HU. Frightening, that would be 810 TB in one
rack ((36 + 45) HDDs x 5 x 2 TB, 40 HU) with 5 servers. The only problems
are power, cooling and backup.

Ralf



Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Les Mikesell
On 2/19/2010 9:42 AM, Ralf Gross wrote:
 Les Mikesell schrieb:
 Ralf Gross wrote:

 I think I have to look for a different solution; I just can't imagine a
 pool with > 10 TB.

 Backuppc's usual scaling issues are with the number of files/links more than
 total size, so the problems may be different when you work with huge files.  
 I
 thought someone had posted here about using nfs with a common archive and
 several servers running the backups but I've forgotten the details about how 
 he
 avoided conflicts and managed it.  Maybe this would be the place to look at
 opensolaris with zfs's new block-level de-dup and a simpler rsync copy.

 ZFS sounds nice, but we have no experience with opensolaris or ZFS.

That's something that could be fixed.

 And I heard in the past that not all of ZFS's features are ready for
 production.

In the past, nothing worked on any OS.

 bit off topic:
 Right now I'm looking for a cheap storage solution that is based
 on supermicro chassis with 36 drive bays (server) or 45 drive bays
 (expansion unit) in 4 HU. Frightening, that would be 810 TB in one
 Rack (36 + 45 HDDs x 5 x 2 TB, 40 HU) with 5 servers. Only problem is
 power, cooling and backup.

What's generating that kind of data?  Can you make whatever it is write 
copies to 2 different places so you don't have to deal with finding the 
differences in something that size for incrementals?  Or perhaps store 
it in time-slice volumes so you know where the changes you need to back 
up each day will be?

-- 
   Les Mikesell
lesmikes...@gmail.com






Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Timothy J Massey
Ralf Gross ralf-li...@ralfgross.de wrote on 02/19/2010 10:42:35 AM:

 bit off topic:
 Right now I'm looking for a cheap storage solution that is based
 on supermicro chassis with 36 drive bays (server) or 45 drive bays
 (expansion unit) in 4 HU. Frightening, that would be 810 TB in one
 Rack (36 + 45 HDDs x 5 x 2 TB, 40 HU) with 5 servers. Only problem is
 power, cooling and backup.

That's why companies like EMC and NetApp get big money for selling you 
nearly the *exact* same hardware:  but with software and services designed 
to handle things like...backup.

With storage sets of that size, there's really very little you can do 
outside of snapshots, volume management and lots and lots of disk (and 
chassis and processor and power and ...) redundancy.  Simply traversing a 
file system of that size is going to take more time than you have for a 
backup window.  If you want anything approaching daily backups, you can't 
do it at the filesystem level.  :(

And even for things like off-site backup, it's far easier to have a 
smaller version of your big array off-site and sync a snapshot 
periodically (taking advantage of the logging/COW filesystem of the array 
system) than it is to try to traverse an entire 800TB filesystem (or 
multiple filesystems that add up to 800TB).

Tim Massey




Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Ralf Gross
Les Mikesell schrieb:
 On 2/19/2010 9:42 AM, Ralf Gross wrote:
  Les Mikesell schrieb:
  Ralf Gross wrote:
 
  I think I have to look for a different solution; I just can't imagine a
  pool with > 10 TB.
 
  Backuppc's usual scaling issues are with the number of files/links more 
  than
  total size, so the problems may be different when you work with huge 
  files.  I
  thought someone had posted here about using nfs with a common archive and
  several servers running the backups but I've forgotten the details about 
  how he
  avoided conflicts and managed it.  Maybe this would be the place to look at
  opensolaris with zfs's new block-level de-dup and a simpler rsync copy.
 
  ZFS sounds nice, but we have no experience with opensolaris or ZFS.
 
 That's something that could be fixed.

sure, but it's something I can't estimate right now.

 
  And I heard in the past that not all of ZFS's features are ready for
  production.
 
 In the past, nothing worked on any OS.

that's a bit hard...
 
  bit off topic:
  Right now I'm looking for a cheap storage solution that is based
  on supermicro chassis with 36 drive bays (server) or 45 drive bays
  (expansion unit) in 4 HU. Frightening, that would be 810 TB in one
  Rack (36 + 45 HDDs x 5 x 2 TB, 40 HU) with 5 servers. Only problem is
  power, cooling and backup.
 
 What's generating that kind of data?  Can you make whatever it is write 
 copies to 2 different places so you don't have to deal with finding the 
 differences in something that size for incrementals?  Or perhaps store 
 it in time-slice volumes so you know where the changes you need to back 
 up each day will be?

The data is mainly uncompressed raw video data (HDF, AFAIK; I don't
work with the data myself). Users come in with external HDDs and copy the
data onto the samba file servers.

Right now we back up 70 TB to tape, but I would like to get rid of
tapes. For the large RAID volumes there is no regular backup either; we
only do it on request. But this should change now...



Ralf



Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Ralf Gross
Timothy J Massey schrieb:
 Ralf Gross ralf-li...@ralfgross.de wrote on 02/19/2010 10:42:35 AM:
 
  bit off topic:
  Right now I'm looking for a cheap storage solution that is based
  on supermicro chassis with 36 drive bays (server) or 45 drive bays
  (expansion unit) in 4 HU. Frightening, that would be 810 TB in one
  Rack (36 + 45 HDDs x 5 x 2 TB, 40 HU) with 5 servers. Only problem is
  power, cooling and backup.
 
 That's why companies like EMC and NetApp get big money for selling you 
 nearly the *exact* same hardware:  but with software and services designed 
 to handle things like...backup.
 
 With storage sets of that size, there's really very little you can do 
 outside of snapshots, volume management and lots and lots of disk (and 
 chassis and processor and power and ...) redundancy.  Simply traversing a 
 file system of that size is going to take more time than you have for a 
 backup window.  If you want anything approaching daily backups, you can't 
 do it at the filesystem level.  :(
 
 And even for things like off-site backup, it's far easier to have a 
 smaller version of your big array off-site and sync a snapshot 
 periodically (taking advantage of the logging/COW filesystem of the array 
 system) than it is to try to traverse an entire 800TB filesystem (or 
 multiple filesystems that add up to 800TB).

You are absolutely right. I hope we will realize part of the storage
with e.g. NetApp, and only a small part with a cheap solution which
doesn't need backup, or only in a best-effort way.

The data is not changing much, most of the files just lie there and
will not be read again.

Ralf



Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Ralf Gross
dan schrieb:
 [... detailed ZFS/dedup setup description snipped; see dan's mail above ...]

Thanks for your detailed reply. I'll have a look at nexenta; right now
www.nexenta.org seems to be down.

Ralf



Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Les Mikesell
On 2/19/2010 10:28 AM, Ralf Gross wrote:


 [... dan's nexenta and Active Directory suggestions snipped ...]

 Thanks for your detailed reply. I'll have a look at nexenta; right now
 www.nexenta.org seems to be down.

You'd want www.nexenta.com for the nexentastor product.  nexenta.org is 
for the desktop flavor.  With zfs you could probably build 2 identical 
systems and use an incremental snapshot send/receive for the backup but 
I have no idea how that scales.  I'd expect it would be much faster than 
traversing the directories, though.
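
For reference, an incremental send/receive between two such boxes would look
roughly like this; the second host (backup2), the target pool (tank) and the
snapshot names are made-up examples, and I have no idea how it behaves at
this scale either:

# initial full replication of the backup fileset to the second box
zfs snapshot storage/host1@2010.02.19
zfs send storage/host1@2010.02.19 | ssh backup2 zfs receive -F tank/host1

# later runs only ship the blocks changed between the two snapshots
zfs snapshot storage/host1@2010.02.20
zfs send -i storage/host1@2010.02.19 storage/host1@2010.02.20 | \
    ssh backup2 zfs receive tank/host1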

-- 
   Les Mikesell
lesmikes...@gmail.com




Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Chris Robertson
Ralf Gross wrote:
 Gerald Brandt schrieb: 
   
 You may want to look at this thread 
 http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17234.html
  
 

 I've seen this thread, but the pool sizes there are max. in the lower
 TB region.

 Ralf
   

Not all of them...

http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17240.html

Chris




Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Chris Robertson
Chris Robertson wrote:
 Ralf Gross wrote:
   
 Gerald Brandt schrieb: 
   
 
 You may want to look at this thread 
 http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17234.html
  
 
   
 I've seen this thread, but the pool sizes there are max. in the lower
 TB region.

 Ralf
   
 

 Not all of them...

 http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17240.html

 Chris

Sorry for the noise...  I was looking at the size of the full backups, 
not the pool.  On a side note, that's some serious compression, 
de-duplication, or a massive problem with the pool.


Chris




Re: [BackupPC-users] experiences with very large pools?

2010-02-19 Thread Ralf Gross
Chris Robertson schrieb:
 Chris Robertson wrote:
  Ralf Gross wrote:

  Gerald Brandt schrieb: 

  
  You may want to look at this thread 
  http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17234.html
   
  

  I've seen this thread, but the pool sizes there are max. in the lower
  TB region.
 
  Ralf

  
 
  Not all of them...
 
  http://www.mail-archive.com/backuppc-users@lists.sourceforge.net/msg17240.html
 
  Chris
 
 Sorry for the noise...  I was looking at the size of the full backups, 
 not the pool.  On a side note, that's some serious compression, 
 de-duplication, or a massive problem with the pool.

I stumbled across this too.

Ralf



Re: [BackupPC-users] experiences with very large pools?

2010-02-18 Thread Ralf Gross
Chris Robertson schrieb:
 Ralf Gross wrote:
  [... original question snipped; see the bottom of the thread ...]
 
 In one way, and compared to some my backup set is pretty small (pool is 
 791.45GB).  In another dimension, I think it is one of the larger 
 (comprising 20874602 files).  The breadth of my pool leads to...
 
 -bash-3.2$ df -i /data/
 Filesystem      Inodes    IUsed      IFree IUse% Mounted on
 /dev/drbd0  1932728448 47240613 1885487835    3% /data
 
 ...nearly 50 million inodes used (so somewhere close to 30 million hard 
 links).  XFS holds up surprisingly well to this abuse*, but the strain 
 shows.  Traversing the whole pool takes three days.  Attempting to grow 
 my tail (the number of backups I keep) causes serious performance 
 degradation as I approach 55 million inodes.
 
 Just an anecdote to be aware of.

I think I have to look for a different solution; I just can't imagine a
pool with > 10 TB.

 
 * I have recently taken my DRBD mirror off-line and copied the BackupPC 
 directory structure to both XFS-without-DRBD and an EXT4 file system for 
 testing.  Performance of the XFS file system was not much different 
 with or without DRBD (a fat fiber link helps there).  The first 
 traversal of the pool on the EXT4 partition is about 66% complete 
 after about 96 hours.

nice ;)

Ralf



Re: [BackupPC-users] experiences with very large pools?

2010-02-16 Thread Chris Robertson
Ralf Gross wrote:
 Hi,

 I'm faced with the growing storage demands in my department. In the
 near future we will need several hundred TB, mostly large files. ATM
 we already have 80 TB of data which gets backed up to tape.

 Providing the primary storage is not the big problem. My biggest
 concern is the backup of the data. One solution would be using a
 NetApp solution with snapshots. On the other hand, this is a very
 expensive solution, and the data will be written once but then only read
 again. In short: it should be a cheap solution, but the data should be
 backed up. And it would be nice if we could abandon tape backups...

 My idea is to use some big RAID 6 arrays for the primary data, create
 LUNs in slices of max. 10 TB with XFS filesystems.

 Backuppc would be ideal for backup, because of the pool feature (we
 already use backuppc for a smaller amount of data).

 Has anyone experience with backuppc and a pool size of > 50 TB? I'm
 not sure how well this will work. I see that backuppc needs 45h to
 back up 3.2 TB of data right now, mostly small files.

 I don't like very large filesystems, but I don't see how this will
 scale with either multiple backuppc servers and smaller filesystems
 (well, more than one server will be needed anyway, but I don't want to
 run 20 or more servers...) or (if possible) with multiple backuppc
 instances on the same server, each with its own pool filesystem.

 So, anyone using backuppc in such an environment?
   

In one way, and compared to some my backup set is pretty small (pool is 
791.45GB).  In another dimension, I think it is one of the larger 
(comprising 20874602 files).  The breadth of my pool leads to...

-bash-3.2$ df -i /data/
Filesystem      Inodes    IUsed      IFree IUse% Mounted on
/dev/drbd0  1932728448 47240613 1885487835    3% /data

...nearly 50 million inodes used (so somewhere close to 30 million hard 
links).  XFS holds up surprisingly well to this abuse*, but the strain 
shows.  Traversing the whole pool takes three days.  Attempting to grow 
my tail (the number of backups I keep) causes serious performance 
degradation as I approach 55 million inodes.

Just an anecdote to be aware of.
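
If you want to see where those inodes go on your own pool, something along
these lines gives a rough picture. The /data/BackupPC paths are just examples
for a typical pool layout, and the find runs will themselves take a long time
on a pool this size:

df -i /data                                           # inode usage, as above
find /data/BackupPC/cpool -type f -links +1 | wc -l   # pool files linked from at least one backup
find /data/BackupPC/cpool -type f -links 1 | wc -l    # pool files nothing references any more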

 Ralf

Chris

* I have recently taken my DRBD mirror off-line and copied the BackupPC 
directory structure to both XFS-without-DRBD and an EXT4 file system for 
testing.  Performance of the XFS file system was not much different 
with or without DRBD (a fat fiber link helps there).  The first 
traversal of the pool on the EXT4 partition is about 66% complete 
after about 96 hours.




[BackupPC-users] experiences with very large pools?

2010-02-15 Thread Ralf Gross
Hi,

I'm faced with the growing storage demands in my department. In the
near future we will need several hundred TB, mostly large files. ATM
we already have 80 TB of data which gets backed up to tape.

Providing the primary storage is not the big problem. My biggest
concern is the backup of the data. One solution would be using a
NetApp solution with snapshots. On the other hand, this is a very
expensive solution, and the data will be written once but then only read
again. In short: it should be a cheap solution, but the data should be
backed up. And it would be nice if we could abandon tape backups...

My idea is to use some big RAID 6 arrays for the primary data, create
LUNs in slices of max. 10 TB with XFS filesystems.

Backuppc would be ideal for backup, because of the pool feature (we
already use backuppc for a smaller amount of data).

Has anyone experience with backuppc and a pool size of > 50 TB? I'm
not sure how well this will work. I see that backuppc needs 45h to
back up 3.2 TB of data right now, mostly small files.

I don't like very large filesystems, but I don't see how this will
scale with either multiple backuppc servers and smaller filesystems
(well, more than one server will be needed anyway, but I don't want to
run 20 or more servers...) or (if possible) with multiple backuppc
instances on the same server, each with its own pool filesystem.

So, anyone using backuppc in such an environment?

Ralf
