On Fri, Nov 09, 2007 at 03:47:10PM -0500, Daniel Ouellet wrote:
> Ted Unangst wrote:
> >On 11/9/07, Daniel Ouellet <[EMAIL PROTECTED]> wrote:
> >>Just for example, a source file that is badly sparse doesn't really have
> >>allocated disk blocks yet, but when copied over via scp or rsync it will
> >>actually use that space on the destination servers. All the servers are
> >>identical (or supposed to be, anyway), but what is happening is that the
> >>copies of them are running out of space at times in the copy process.
> >>When it is copying them, it may easily use twice the amount of space in
> >>the process, sadly filling up the destinations, and then the sync process
> >>stops, making the distribution of the load unusable. I need to increase
> >>the capacity, yes, except that it will take me time to do so.
> >
> >so what are you going to do when you find these sparse files?
>
> So far, when I find them (not all of them, but the huge space-wasting
> ones), I delete them and replace them with the original one, or even with the
I am confused by what you say. A sparse file does NOT waste space; it
REDUCES disk usage compared to a non-sparse (dense?) file with the same
contents.

> one copy using rsync -S back to the original reduces its size by 1/2 and

If the size is reduced, it is not the same file. Please be more accurate
in your description. A file's size is not the same as its disk usage.

> more at times. So, yes, very inefficient, but manageable anyway. It's
> a plaster for now, if you want. Don't get me wrong: sparse files cause
> no problem whatsoever when they stay on the same system. It's when you
> need to move them around servers, especially across Internet-connected
> locations, and keep them in sync as much as possible in as short a time
> as possible that it becomes unmanageable. That's really the issue at
> hand. Not that sparse files are bad in any way; keeping them in sync
> across multiple systems is, however.

You cannot blame sparse files for that. If the same file were not
sparse, your problem would be at least as big.

	-Otto

> I was looking for a more intelligent way to do it. (;> Like finding
> the ones above some level of sparseness, say 25%, and then compacting
> them at the source to be non-sparse again, or something similar. It
> doesn't need to be done for every single one, even if that might be a
> good thing in special cases, not all obviously.
>
> The problem is that some customers end up running out of space and I
> really didn't know, plus the huge factor of wasted bandwidth, filling
> up their connections transferring empty files if you like, and taking
> much longer in sync time than it otherwise would if you synced them
> as is.
>
> It is still an interesting problem, after I found out what it really was.
>
> I hope this explained the issue somewhat better.
>
> Thanks for the feedback nevertheless.
>
> Daniel
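
[Editor's note: the size-versus-disk-usage distinction Otto draws can be
demonstrated directly. The sketch below is illustrative, not from the
thread: the path and sizes are made up, and it assumes a filesystem that
supports sparse files (as most Unix filesystems do). Truncating a file
out to a large length without writing any data creates a hole, so the
apparent size and the allocated blocks diverge.]

```python
import os
import tempfile

# Hypothetical path for illustration only.
path = os.path.join(tempfile.mkdtemp(), "sparse.img")

# Create a 100 MB "hole": set the length without writing any data.
with open(path, "wb") as f:
    f.truncate(100 * 1024 * 1024)

st = os.stat(path)
apparent = st.st_size            # what ls -l (and a naive copy) sees
allocated = st.st_blocks * 512   # actual disk usage, as du reports it

print("apparent size:", apparent, "bytes")
print("allocated on disk:", allocated, "bytes")
```

A plain byte-for-byte copy (scp, or rsync without -S) reads those holes
as zeros and writes them out as real blocks on the destination, which is
exactly why the copies end up larger than the originals.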
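
[Editor's note: Daniel's idea of flagging files above some level of
sparseness, say 25%, could be sketched as below. The helper names
`sparseness` and `find_sparse` are hypothetical, and the code assumes
the POSIX convention that st_blocks is counted in 512-byte units. For
the transfer itself, the rsync -S/--sparse option Daniel mentions is
what recreates holes on the destination instead of writing out zeros.]

```python
import os

def sparseness(path):
    """Fraction of a file's apparent size not backed by disk blocks."""
    st = os.stat(path)
    if st.st_size == 0:
        return 0.0
    allocated = st.st_blocks * 512  # POSIX: st_blocks is in 512-byte units
    # Clamp at 0: block rounding can make allocation exceed the size.
    return max(0.0, 1.0 - allocated / st.st_size)

def find_sparse(root, threshold=0.25):
    """Yield paths under root whose sparseness exceeds the threshold."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            p = os.path.join(dirpath, name)
            try:
                if sparseness(p) > threshold:
                    yield p
            except OSError:
                continue  # skip files that vanish mid-scan
```

Such a scan would only identify candidates for compacting at the
source; it does not by itself make the copies smaller.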