Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Robert Milkowski

Ross Walker wrote:

> On Sep 30, 2009, at 10:40 AM, Brian Hubbleday wrote:
>
>> Just realised I missed a rather important word out there, that could
>> confuse.
>>
>> So the conclusion I draw from this is that the --incremental--
>> snapshot simply contains every written block since the last snapshot
>> regardless of whether the data in the block has changed or not.
>
> It's because ZFS is a COW file system so each block written is a new
> block.


It doesn't matter here whether it is COW or not.
He is probably, in effect, writing a brand new file to the file system.
All file systems (save perhaps for some with dedup) would behave the same
here.



--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Toby Thain


On 30-Sep-09, at 10:48 AM, Brian Hubbleday wrote:

> I had a 50 MB ZFS volume that was an iSCSI target. This was mounted
> into a Windows system (NTFS) and shared on the network. I used
> notepad.exe on a remote system to add/remove a few bytes at the end
> of a 25 MB file.


I'm astonished that's even possible with notepad.

I agree with Richard: it looks like your workflow needs attention.
Making random edits to very large, remotely stored flat files with
super-simplistic tools seems to defy five decades of data
management technology...


--T


--
This message posted from opensolaris.org


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread erik.ableson
Depending on the data you're dealing with, you can compress the
snapshots inline with the send/receive operations by piping the data
through gzip. Given that we've been talking about 500 MB text files,
this seems a very likely solution. There was some mention in the
Kernel Keynote in Australia of inline deduplication, i.e.
compression :-), in the zfs send stream. But there remains the question
of references to deduplicated blocks that no longer exist on the
destination.
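
As a rough illustration of the "pipe it through gzip" idea, here is a
minimal Python sketch of a streaming gzip filter. It is not part of ZFS;
the helper name gzip_chunks and the sample log text are assumptions made
up for this example, and real send streams will compress more or less
well depending on their content.

```python
import zlib
from typing import Iterable, Iterator


def gzip_chunks(chunks: Iterable[bytes], level: int = 6) -> Iterator[bytes]:
    """Compress a stream of byte chunks into gzip format, piece by piece.

    Hypothetical helper: it stands in for the gzip stage of a pipeline,
    so the whole stream never has to be held in memory at once.
    """
    # wbits=31 (16 + 15) asks zlib for a gzip header and trailer.
    compressor = zlib.compressobj(level=level, wbits=31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:
            yield out
    # Emit whatever is still buffered, plus the gzip trailer.
    yield compressor.flush()


if __name__ == "__main__":
    # Simulated text-heavy stream (an assumption, not a real send stream).
    sample = [b"2009-09-30 10:40:00 some repetitive log line\n" * 1000] * 4
    raw_size = sum(len(c) for c in sample)
    packed_size = sum(len(c) for c in gzip_chunks(sample))
    print(f"raw: {raw_size} bytes, gzip: {packed_size} bytes")
```

Text like the files discussed here tends to compress well, which is why
the gzip stage can pay off on a slow WAN link; already-compressed data
would gain little.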


Note that ZFS deduplication will eventually help reduce the
overall volume you have to handle: while the text editor's output
goes to different physical blocks, many of those blocks
will be identical to previously stored blocks (which are also kept,
since they exist in snapshots), so the send/receive operations
will consist of many more block references than complete blocks.


Erik

PS - this is pretty much the operational mode of all products that use  
snapshots.  It's even worse on a lot of other storage systems where  
the snapshot content must be written to a specific reserved volume  
(which is often very small compared to the main data store) rather  
than the host pool. Until deduplication becomes the standard method of  
managing blocks, the volume of data required by this use case will not  
change.


On 30 Sep 2009, at 16:35, Brian Hubbleday wrote:

> I took binary dumps of the snapshots taken in between the edits and
> this showed that there was actually very little change in the block
> structure; however, the incremental snapshots were very large. So the
> conclusion I draw from this is that the snapshot simply contains
> every written block since the last snapshot regardless of whether
> the data in the block has changed or not.
>
> Okay, so snapshots work this way; I'm simply suggesting that things
> could be better.




Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Scott Meilicke
It is more cost, but a WAN Accelerator (Cisco WAAS, Riverbed, etc.) would be a 
big help.

Scott


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Ross Walker

On Sep 30, 2009, at 10:40 AM, Brian Hubbleday wrote:

> Just realised I missed a rather important word out there, that could
> confuse.
>
> So the conclusion I draw from this is that the --incremental--
> snapshot simply contains every written block since the last snapshot
> regardless of whether the data in the block has changed or not.


It's because ZFS is a COW file system so each block written is a new  
block.


-Ross



Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Brian Hubbleday
I had a 50 MB ZFS volume that was an iSCSI target. This was mounted into a
Windows system (NTFS) and shared on the network. I used notepad.exe on a remote
system to add/remove a few bytes at the end of a 25 MB file.


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Brian Hubbleday
Just realised I missed a rather important word out there, that could confuse.

So the conclusion I draw from this is that the --incremental-- snapshot simply 
contains every written block since the last snapshot regardless of whether the 
data in the block has changed or not.


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Brian Hubbleday
I took binary dumps of the snapshots taken in between the edits and this showed
that there was actually very little change in the block structure; however, the
incremental snapshots were very large. So the conclusion I draw from this is
that the snapshot simply contains every written block since the last snapshot
regardless of whether the data in the block has changed or not.

Okay, so snapshots work this way; I'm simply suggesting that things could be
better.


Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Casper . Dik

>On Sep 30, 2009, at 5:48 AM, Brian Hubbleday wrote:
>
>> I am looking to use OpenSolaris/ZFS to create an iSCSI SAN to  
>> provide storage for a collection of virtual systems and replicate to  
>> an offsite device.
>>
>> While testing the environment I was surprised to see the size of the  
>> incremental snapshots, which I need to send/receive over a WAN  
>> connection, considering this is supposed to be block level  
>> replication. Having run several tests making minor changes to large  
>> files, I now see what is happening. When I use a text editor to  
>> modify a file, the whole file is written back to disk and so the  
>> snapshot includes every written block, whether that block contains  
>> the same information as before or not.
>
>Yep, that is how most text editors work.

And dedup will not help you in that case either: unless you only append to
the end, or insert exactly a 128K block in the middle of the file, everything
after the edit is shifted, so the blocks from that point onward will all be
different.
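
To make the alignment point concrete, here is a minimal Python sketch
(not ZFS code) that cuts a buffer into fixed 128K blocks and hashes each
one, roughly the way a block-level comparison would see the data. The
helper names block_hashes and shared_blocks, the choice of SHA-256 and
the random test data are all assumptions for illustration only.

```python
import hashlib
import os
from typing import List

BLOCK_SIZE = 128 * 1024  # the 128K block size mentioned above


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Hash each fixed-size block of data (the last block may be short)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def shared_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Count blocks of `new` whose hash already occurs among blocks of `old`."""
    known = set(block_hashes(old, block_size))
    return sum(1 for h in block_hashes(new, block_size) if h in known)


if __name__ == "__main__":
    original = os.urandom(8 * BLOCK_SIZE)  # 8 distinct blocks of random data

    # Case 1: append a few bytes at the end; existing blocks stay aligned.
    appended = original + b"tail"
    # Case 2: insert two bytes near the start; every later block shifts.
    shifted = original[:10] + b"xx" + original[10:]
    # Case 3: insert exactly one whole block on a block boundary.
    aligned = original[:BLOCK_SIZE] + os.urandom(BLOCK_SIZE) + original[BLOCK_SIZE:]

    for name, new in (("append", appended), ("insert 2 bytes", shifted),
                      ("insert 128K block", aligned)):
        total = len(block_hashes(new))
        print(f"{name}: {shared_blocks(original, new)} of {total} blocks already known")
```

With data like this, the append and the aligned insert leave the old
blocks recognisable, while the two-byte insert changes the contents of
every block, which is the case where dedup has nothing to match.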

>> Would it be possible to develop the incremental snapshot process so  
>> that it only contains changed written blocks rather than every  
>> written block? Certainly in my environment, where we have large files  
>> (>500 MB), the amount sent over the WAN would be  
>> drastically reduced.
>
>That is how snapshots work. But your application (text editor) writes  
>new data.
>Maybe you can find another way to edit the files.

What type of changes are being made?

Casper



Re: [zfs-discuss] Incremental snapshot size

2009-09-30 Thread Richard Elling

On Sep 30, 2009, at 5:48 AM, Brian Hubbleday wrote:

> I am looking to use OpenSolaris/ZFS to create an iSCSI SAN to
> provide storage for a collection of virtual systems and replicate to
> an offsite device.
>
> While testing the environment I was surprised to see the size of the
> incremental snapshots, which I need to send/receive over a WAN
> connection, considering this is supposed to be block level
> replication. Having run several tests making minor changes to large
> files, I now see what is happening. When I use a text editor to
> modify a file, the whole file is written back to disk and so the
> snapshot includes every written block, whether that block contains
> the same information as before or not.

Yep, that is how most text editors work.

> Would it be possible to develop the incremental snapshot process so
> that it only contains changed written blocks rather than every
> written block? Certainly in my environment, where we have large files
> (>500 MB), the amount sent over the WAN would be
> drastically reduced.

That is how snapshots work. But your application (text editor) writes
new data.

Maybe you can find another way to edit the files.
 -- richard
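
One possible reading of "another way to edit the files" is to change
only the bytes that actually differ instead of saving the whole file
again. The Python sketch below contrasts the two at the application
level; the function names are hypothetical, and whether fewer bytes
written here really means fewer changed blocks in a snapshot depends on
the file system layers in between (NTFS on top of the zvol in this
thread), which this sketch does not model.

```python
import os
import tempfile


def rewrite_whole_file(path: str, new_contents: bytes) -> int:
    """Replace the file wholesale, as a simple editor's save might do.

    Returns the number of bytes written.
    """
    with open(path, "wb") as f:
        return f.write(new_contents)


def append_in_place(path: str, extra: bytes) -> int:
    """Append only the new bytes; this call does not rewrite existing data."""
    with open(path, "ab") as f:
        return f.write(extra)


def patch_in_place(path: str, offset: int, replacement: bytes) -> int:
    """Overwrite len(replacement) bytes at offset, leaving the rest untouched."""
    with open(path, "r+b") as f:
        f.seek(offset)
        return f.write(replacement)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        full = rewrite_whole_file(path, b"a" * 1_000_000)
        tail = append_in_place(path, b"a few more bytes")
        fix = patch_in_place(path, 0, b"ZZ")
        print(f"full save wrote {full} bytes, append {tail}, patch {fix}")
    finally:
        os.remove(path)
```

The append and patch calls hand the operating system only the changed
bytes, whereas the full save hands over the entire file every time.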
