[i]And would drive storage requirements through the roof!![/i] The interesting part is, Nathan, that you're probably wrong.
First, though: some of my contacts in the enterprise gladly spent millions on third-party applications running on Microsoft to do exactly that. [But we all know that SUN is famous for almost always missing the departing train.]

I have no proof for what I state; my hypothesis is just the opposite. Increasing backup frequency requires more storage, I think we can all agree. My assumption is therefore that, at some moment in time:

1. cron-like jobs will consume more and more resources (undoubtedly), and
2. as the density of cron-like (time-line) backups increases, the amount of metadata will come to exceed the amount of actual changes.

Of course, you are right that it depends on the applications. But I guess that, very roughly, an hourly backup (TimeMachine) is already close to that point, at least on an average home box. (Of course, you don't include /tmp in *any* such observations!)

The disadvantages of something like TimeMachine are manifold; worst of all, it works on the level of files. Someone mentioned that doing it through the application was more useful. Welcome to the 20th century. That argument is wrong, by the way, since such logs (RDBMS) are high-level, usually human-readable append operations performed on the file system through the application and the operating system. Yes, they *are* useful, very useful, for a limited number of specific operations. But the argument fades within the context we discuss here: 'backup' is not 'archive'; it is comprehensive. So doing CDP on that level is suicide for system resources (cycles) as well as for storage space.

Back to my hypothesis: increasing backup frequency increases the amount of data stored (my hypothesis: metadata beyond 'change data'), while offering 'only' a near-CDP experience. Once the time slots are done away with, the only data to be stored is the actual 'change data', meaning that the amount of metadata is greatly reduced. And one can achieve real CDP.
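For concreteness, the cron-driven, time-slot style of near-CDP discussed above might look like this on ZFS. This is only a hypothetical sketch; the dataset name 'tank/home' and the snapshot naming scheme are made up for illustration, not taken from anyone's actual setup:

```shell
# Hypothetical crontab entry: one ZFS snapshot per hour, named by timestamp.
# 'tank/home' is an example dataset. Note that '%' must be escaped in crontab.
0 * * * * /usr/sbin/zfs snapshot tank/home@`date +\%Y\%m\%d-\%H\%M`
```

Every such snapshot carries its own fixed bookkeeping; the argument above is that as the interval shrinks, that fixed per-snapshot overhead comes to dominate the shrinking amount of change data each snapshot actually captures.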
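The hypothesis can be put into a back-of-envelope model. Assume a fixed metadata cost per snapshot and a fixed amount of data changed per day; then the change data captured per snapshot shrinks as frequency rises, while the metadata cost does not. All numbers below (64 KiB of metadata per snapshot, 100 MiB changed per day) are invented illustrative values, not measurements:

```python
# Back-of-envelope sketch of the metadata-vs-change-data hypothesis:
# with a fixed bookkeeping cost per snapshot, raising the snapshot
# frequency shrinks the change data per interval until metadata
# dominates each increment. The constants are assumptions, not data.

metadata_per_snapshot = 64 * 1024     # bytes of bookkeeping per snapshot (assumed)
daily_change = 100 * 1024 * 1024      # bytes actually modified per day (assumed)

for snapshots_per_day in (1, 24, 1440, 86400):
    change_per_snapshot = daily_change / snapshots_per_day
    metadata_share = metadata_per_snapshot / (metadata_per_snapshot + change_per_snapshot)
    print(f"{snapshots_per_day:6d} snaps/day: "
          f"metadata is {metadata_share:.0%} of each increment")
```

Under these assumed numbers the metadata share is negligible at one snapshot per day but dominates at one snapshot per second, which is the crossover point the argument above gestures at.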
A prerequisite, though, is probably that the notion of a 'file change' is not handled on a system-wide level, but locally to the change, i.e. to the 'write' process. And, of course, it is obligatory to perform it on a block level, incrementally, you name it.

I challenge you to follow up on this matter. My interest arose from a presentation I'll be giving shortly at a conference. While preparing, it was obvious to me that ZFS can do this (CDP). Just wanted to make sure; surprise, surprise. Now I am really interested in proving my hypothesis that, regardless of the technology involved, the 'change data', i.e. the amount of actual changes to be stored for CDP, is below the amount of data (and resources) required for near-CDP on the level of files.

Contact me offline if you feel like sponsoring this research.

Uwe

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss