On 04/06/10 02:42 PM, Phil Stracchino wrote:
> On 04/06/10 02:37, Craig Ringer wrote:
>> Is this insane? Or a viable approach to tackling some of the
>> complexities of faking tape backup on disk as Bacula currently tries
>> to do?
>
> Well, just off the top of my head, the first thing that comes to mind
> is that the only ways such a scheme is not going to result in massive
> disk fragmentation are:
>
> (a) it's built on top of a custom filesystem with custom device
> drivers to allow pre-positioning of volumes spaced across the disk
> surface, in which case it's going to be horribly slow because it's
> going to spend almost all its time seeking track-to-track; or
>
> (b) it writes to raw devices and one volume is one spindle, in which
> case you pretty much lose all the flexibility of using disk storage,
> and you need large numbers of spindles for the large numbers of
> concurrent volumes you want. For all practical purposes, you would be
> replacing "simulating tape on disk" with using disks as though they
> were tapes.
>
> You could possibly simplify some of the issues involved in (a) by
> making it a FUSE userspace filesystem, but then you add the two
> drawbacks that (1) it's probably going to be slow, because userspace
> filesystems usually are, and (2) it'll only be workable on Linux.
>
> Now, all you're going to gain from this is non-interleaved disk
> volumes, and that's basically going to help you only during restores.
> So you're sacrificing the common case to optimize for the rare case.
That depends on what you need, actually. Some people are fine with
slower backups as long as they get fast restores.

There are a number of reasons why you might want to segregate backups
into a one-volume-per-client or a one-volume-per-job relationship:

1. Keeping the size of a volume down for manageability.

2. The ability to migrate certain client data WITHOUT relying on
   Bacula to do it for you (think zfs send / receive, rsync, etc).

3. A hard quota for limiting the disk consumption of a given client.

(See the sketches in the PS below for what such a setup might look
like.)

Some other aspects involve performance and/or deduplication, but those
are highly dependent on the underlying infrastructure.

> You mention spool files, but the obvious question there is, if you're
> backing up to disk anyway, why use spooling at all? The purpose of
> disk spooling was to buffer between clients and tape devices. When
> backing up to disk, there's really not a lot of point in spooling at
> all. What you really want is de-interleaving. Correct?

Spooling to a sufficiently large RAM disk is plausible and would serve
the same purpose as spooling does for tape devices.

> As has already been discussed, you can achieve this end by creating
> multiple storage devices on the same disk pool and assigning one
> storage device per client, but this will result in massive disk
> fragmentation - and, honestly, you'll be no better off.

That depends largely on the underlying filesystem and thus should not
be subject to such generalization.

> If what you want is to de-interleave your backups, then look into the
> Migration function. You can allow backups to run normally, then
> Migrate one job at a time to a new device, which will give you
> non-interleaved jobs on the output volume. But you're still not
> guaranteed that the output volume will be unfragmented, because you
> don't have control over the disk space allocation scheme; and you're
> still sacrificing the common case to optimize for the rare case.

--
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group A/S
ScanNet
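PS: For a concrete starting point, this is roughly what the
one-volume-per-client setup from the list above could look like. It is
a minimal, untested sketch - all resource names, addresses, paths,
passwords and limits are placeholders; only the directive names are
standard Bacula:

# bacula-sd.conf: one file device per client. Pointing Archive Device
# at a per-client directory (for example its own ZFS dataset) means
# that one client's data can be replicated with zfs send/receive or
# rsync, entirely outside Bacula.
Device {
  Name = client1-dev
  Media Type = client1-file
  Device Type = File
  Archive Device = /backup/client1
  LabelMedia = yes
  Random Access = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen = no
}

# bacula-dir.conf: a Storage resource pointing at that device, plus a
# dedicated pool and job per client.
Storage {
  Name = client1-storage
  Address = sd.example.com
  SD Port = 9103
  Password = "sd-password"
  Device = client1-dev
  Media Type = client1-file
}

Pool {
  Name = client1-pool
  Pool Type = Backup
  Label Format = "client1-"
  Maximum Volume Bytes = 10G   # keeps individual volumes manageable
  Maximum Volumes = 20         # together these two act as a hard quota
  AutoPrune = yes
  Recycle = yes
  Volume Retention = 30 days
}

Job {
  Name = "backup-client1"
  Type = Backup
  Client = client1-fd
  FileSet = "Full Set"
  Schedule = "WeeklyCycle"
  Storage = client1-storage
  Pool = client1-pool
  Messages = Standard
}

Maximum Volumes times Maximum Volume Bytes puts a hard ceiling on the
client's disk consumption, and because the client's volumes live in
their own directory, a zfs send or rsync of that directory moves
exactly one client's data.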
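The RAM-disk spooling mentioned above is mostly a matter of pointing
the storage daemon's spool directory at a tmpfs mount and capping its
size. Again a hedged sketch, reusing the placeholder names from the
previous one - the mount point and sizes are made up:

# bacula-sd.conf: same device as before, but with a spool area on a
# RAM-backed filesystem (e.g. tmpfs mounted at /bacula-spool).
Device {
  Name = client1-dev
  Media Type = client1-file
  Device Type = File
  Archive Device = /backup/client1
  Spool Directory = /bacula-spool  # the tmpfs mount
  Maximum Spool Size = 8G          # must fit inside the RAM disk
  LabelMedia = yes
  Random Access = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen = no
}

# bacula-dir.conf: spooling is switched on per job.
Job {
  Name = "backup-client1"
  Type = Backup
  Client = client1-fd
  FileSet = "Full Set"
  Schedule = "WeeklyCycle"
  Storage = client1-storage
  Pool = client1-pool
  SpoolData = yes        # spool to the RAM disk, then despool to the volume
  Messages = Standard
}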
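And finally the Migration function quoted above. The source pool's
Next Pool directive determines where migrated jobs end up, and
Selection Type = Job migrates one job at a time, which is what produces
the de-interleaved output. Untested as well; names are placeholders:

# bacula-dir.conf
Pool {
  Name = backup-pool
  Pool Type = Backup
  Next Pool = clean-pool     # destination for migrated jobs
}

Pool {
  Name = clean-pool
  Pool Type = Backup
  Label Format = "clean-"
  Storage = clean-storage    # a Storage/Device pair for the output volumes
}

Job {
  Name = "migrate-jobs"
  Type = Migrate
  Pool = backup-pool             # pool whose jobs are examined
  Selection Type = Job
  Selection Pattern = ".*"       # regex on job names; narrow as needed
  # Client and FileSet must be present to satisfy the config parser,
  # but they do not influence which jobs get migrated.
  Client = client1-fd
  FileSet = "Full Set"
  Messages = Standard
}

Kick it off from bconsole with "run job=migrate-jobs". Each selected
job is read and rewritten sequentially onto clean-pool's volumes, so
the migrated copy is no longer interleaved - though, as Phil points
out, block-level fragmentation is still up to the filesystem's
allocator.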