Ted Unangst wrote:
On 11/9/07, Daniel Ouellet <[EMAIL PROTECTED]> wrote:
Just for example, a source file that is badly sparse doesn't really have
disk blocks allocated yet, but when it is copied over via scp or rsync it
will actually use that space on the destination servers. All the servers
are identical (or supposed to be, anyway), but the copies are running out
of space at times during the copy process. While the files are being
copied, they can easily use twice the amount of space, which sadly fills
up the destinations; then the sync process stops, making the distribution
of the load unusable. I need to increase the capacity, yes, except that
it will take me time to do so.
so what are you going to do when you find these sparse files?
So far, when I find them (not all of them, just the ones wasting huge
amounts of space), I delete them and replace them with the original, or
even with a copy made using rsync -S back to the original, which reduces
the size by half or more at times. So yes, it's very inefficient, but
manageable anyway. It's a plaster for now, if you want. Don't get me
wrong: sparse files cause no problem whatsoever when they stay on the
same system. It's when you need to move them between servers, especially
across Internet-connected locations, and keep them in sync as much as
possible in as short a time as possible, that it becomes unmanageable.
That's really the issue at hand. Not that sparse files are bad in any
way. Keeping them in sync across multiple systems is, however.
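To make concrete why a sparse-aware copy ends up smaller, here is a minimal Python sketch of the general idea: skip over blocks that are entirely zero instead of writing them, so the destination filesystem can leave holes. It is only an illustration and not how rsync -S is actually implemented; the function name copy_sparse and the 64 KB block size are assumptions made for this example, and whether holes are really left unallocated depends on the destination filesystem.

```python
import os


def copy_sparse(src, dst, block_size=64 * 1024):
    """Copy src to dst, seeking over all-zero blocks instead of writing them.

    Hypothetical helper for illustration only; block_size is an arbitrary choice.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(block_size)
            if not chunk:
                break
            if chunk.count(0) == len(chunk):
                # Block is all zero bytes: move the write position forward
                # without writing, which leaves a hole where supported.
                fout.seek(len(chunk), os.SEEK_CUR)
            else:
                fout.write(chunk)
        # If the file ends in a hole, nothing was written at the end,
        # so set the final length explicitly to match the source.
        fout.truncate(fin.tell())
```

The copy has the same length and the same contents as the source, since holes read back as zeros; only the number of allocated disk blocks can differ.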
I was looking to see if there was a more intelligent way to do it. (;> Like
finding files above some level of sparseness, let's say 25%, and then
compacting them at the source so they are non-sparse again, or something
similar. It doesn't need to handle every single file, even if that might
be a good thing in special cases, though obviously not all.
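The "find the ones above some level of sparseness" step could be sketched roughly as follows in Python. This is a hedged illustration, not a tested tool: the names sparse_fraction and find_sparse are made up for the example, the 25% default simply mirrors the figure mentioned above, and it assumes a Unix-like system where st_blocks is reported in 512-byte units.

```python
import os
import stat


def sparse_fraction(apparent_size, allocated_blocks, block_unit=512):
    """Fraction of the apparent size that has no disk blocks behind it.

    Assumes allocated_blocks is counted in block_unit-byte units
    (512 bytes for st_blocks on typical Unix systems).
    """
    if apparent_size <= 0:
        return 0.0
    allocated = allocated_blocks * block_unit
    # Allocation can exceed the apparent size (block rounding), so clamp at 0.
    return max(0.0, 1.0 - allocated / apparent_size)


def find_sparse(root, threshold=0.25):
    """Yield (path, fraction) for regular files at least `threshold` sparse."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue
            frac = sparse_fraction(st.st_size, st.st_blocks)
            if frac >= threshold:
                yield path, frac
```

Looping over find_sparse("/some/dir") would then list the candidates worth handling with a sparse-aware copy before the sync runs. Filesystems with compression or delayed allocation may report block counts that skew the fraction, so the numbers are a hint rather than an exact measurement.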
The problem is that some customers end up running out of space and I
really didn't know, plus there is the huge amount of wasted bandwidth:
filling up their connections transferring empty data, if you like, and
taking much longer to sync than it otherwise would if you sync the files
as is.
It's still an interesting problem, now that I have found out what it really was.
I hope this explains the issue somewhat better.
Thanks for the feedback nevertheless.
Daniel