Re: Files on disk

2010-07-22 Thread Edmund R. MacKenty
On Wednesday 21 July 2010 18:26, Sterling James wrote:
>What's compression set to? I know that has other implications, also.  Look
>at the makesparsefile option for restore.
>
>"Tivoli Storage Manager backs up a sparse file as a regular file if client
>compression is off. Set the compression option to yes to enable file
>compression when backing up sparse files to minimize network transaction
>time and maximize server storage space. "

I don't know jack about TSM, but based only on that quote and this thread so 
far I have to wonder what happens during a restore.  If it's using 
compression to deal with sparse files, it's probably still compressing all 
those empty blocks, right?  So on restore, does it decompress them and write 
blocks of zeros out instead of re-creating a sparse file?  If that's the 
case, then it will still try to restore that 26GB sparse file to use 26GB of 
DASD, even if it compressed it down to 200MB on the server because of all the 
blocks of zeros in it.
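
As a rough illustration of that concern, here is a minimal Python sketch. It is not TSM's compression, which this thread does not describe; it uses zlib as a stand-in, and the function name and sizes are assumptions made up for the example. It compresses a run of zero bytes and then decompresses it again, the way a sparse-unaware restore would.

```python
import zlib

def zero_run_roundtrip(n_bytes):
    """Compress n_bytes of zeros, then decompress them again.

    Returns (compressed_size, restored_size) in bytes.
    """
    zeros = bytes(n_bytes)                  # what a reader sees in a hole
    compressed = zlib.compress(zeros)       # tiny: zeros compress very well
    restored = zlib.decompress(compressed)  # back to the full length
    return len(compressed), len(restored)

if __name__ == "__main__":
    packed, unpacked = zero_run_roundtrip(16 * 1024 * 1024)
    print(f"compressed: {packed} bytes, restored: {unpacked} bytes")
```

The compressed copy is small, but decompression yields every zero byte again, so unless the restore step deliberately skips those bytes it writes the file out at its full length.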

Has anyone investigated that problem?
- MacK.
-
Edmund R. MacKenty
Software Architect
Rocket Software
275 Grove Street · Newton, MA 02466-2272 · USA
Tel: +1.617.614.4321
Email: m...@rs.com
Web: www.rocketsoftware.com  

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z, visit
http://wiki.linuxvm.org/


Re: Files on disk

2010-07-22 Thread David Boyes
> Does that imply then that a TSM backed up sparse file could not be
> restored to the same device it came off of? Would TSM attempt to
> restore all 26G?

Yes, and yes. Bitten by that one on Solaris too. 



Re: Files on disk

2010-07-21 Thread Sterling James
What's compression set to? I know that has other implications, also.  Look 
at the makesparsefile option for restore.

"Tivoli Storage Manager backs up a sparse file as a regular file if client 
compression is off. Set the compression option to yes to enable file 
compression when backing up sparse files to minimize network transaction 
time and maximize server storage space. "

Thanks








Re: Files on disk

2010-07-21 Thread Shane G
I might just add that despite its manpage assertion, rsync isn't too
intelligent about it at all.
My (non z) testing indicated that if you re-use the same target file, after
the initial run "cp" is significantly more efficient. The initial run for
both is comparable as the target needs to be created.
Didn't test tar as it wasn't relevant (to me) at that juncture.

Shane 


On Thu, Jul 22nd, 2010 at 3:06 AM, "Edmund R. MacKenty" wrote:

> Rick pointed out that rsync and tar have options that deal with sparse
> files intelligently: when they copy a sparse file, they do not write out
> blocks of all zeros.  Instead, they seek past such "empty" blocks to avoid
> writing to them, thus creating a sparse output file.  That's how a proper
> Linux file copy is done.  The cp command also does that.



Re: Files on disk

2010-07-21 Thread Edmund R. MacKenty
On Wednesday 21 July 2010 17:48, Dave Jones wrote:
>Does that imply then that a TSM backed up sparse file could not be
>restored to the same device it came off of? Would TSM attempt to restore
>all 26G?

I would expect so.  If it doesn't know enough to preserve the sparseness of a 
file as it backs it up, I doubt it would be making a file sparse again upon 
restore just because some blocks contain all zeros.  I'd look for some 
configuration option that makes it aware of sparse files.
- MacK.


Re: Files on disk

2010-07-21 Thread Dave Jones

Does that imply then that a TSM backed up sparse file could not be
restored to the same device it came off of? Would TSM attempt to restore
all 26G?

On 07/21/2010 12:14 PM, David Boyes wrote:
>>> Sparse files. OK. Then the next question, how can I store a 26G file in
>>> a machine that isn't that large?
>
> Because sparse files are just a bunch of pointers -- the data doesn't really
> exist. It's just diddling around with the block pointers in the file's inode.
>
>>> And to add to this, why does the
>>> filesystem backup really dump 26G into our TSM server?
>
> TSM doesn't process sparse files correctly on many platforms. In this case
> it's using APIs that don't recognize sparse files and so it's dumping all
> 26G.
>
> -- db



--
Dave Jones
V/Soft
www.vsoft-software.com
Houston, TX
281.578.7544



Re: Files on disk

2010-07-21 Thread Berry van Sleeuwen
Thank you all for your replies. It's clear to me now: we were dumping zeros.

Regards, Berry.



Re: Files on disk

2010-07-21 Thread David Boyes
>> Sparse files. OK. Then the next question, how can I store a 26G file in
>> a machine that isn't that large?

Because sparse files are just a bunch of pointers -- the data doesn't really
exist. It's just diddling around with the block pointers in the file's inode.

>> And to add to this, why does the
>> filesystem backup really dump 26G into our TSM server?

TSM doesn't process sparse files correctly on many platforms. In this case
it's using APIs that don't recognize sparse files and so it's dumping all
26G. 

-- db



Re: Files on disk

2010-07-21 Thread Edmund R. MacKenty
On Wednesday 21 July 2010 11:59, Berry van Sleeuwen wrote:
>Sparse files. OK. Then the next question, how can I store a 26G file in
>a machine that isn't that large? And to add to this, why does the
>filesystem backup really dump 26G into our TSM server?

Because it isn't really using 26GB of disk space.  The *length* of the file is 
26GB because the program writing it seeked out that far and wrote something.  
But it didn't write all the data between zero and 26GB, so Linux didn't 
allocate disk space for the parts of the file that were never written to.  
Run "du -h /var/log/lastlog" to see just how little disk space that file 
uses.  Here's what it says on my system, for example:

# ll -h /var/log/lastlog
-rw-r--r-- 1 root tty 1.2M Jul 20 08:30 /var/log/lastlog
# du -h /var/log/lastlog
48K /var/log/lastlog

So even though the file is 1.2MB long, it's only using up 48KB (12 blocks of 
4KB each) of disk space.  The file is "sparse" because it does not have blocks 
allocated for its entire length.

The backup dumps a 26GB file because when a program reads a part of a sparse 
file that was never written, it gets back a block of all zeros.  So TSM is 
reading all that unallocated space, and writing out lots of blocks of zeros 
to the backup file.  Thus the backup file is not a sparse file, because TSM 
wrote every block of that 26GB.  Perhaps there's some TSM option to get it to 
recognise sparse files?
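
The effect is easy to reproduce. The following Python sketch creates a file with a hole and then reads it the way a sparse-unaware program would. The file name, sizes and function name are made up for illustration, and it says nothing about how TSM itself reads files.

```python
import os
import tempfile

def read_through_hole(hole_size=1 << 20):
    """Create a file with a hole, then read all of it with plain read().

    Returns (bytes_read, zero_bytes_seen_in_the_hole).
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.dat")   # hypothetical file name
        with open(path, "wb") as f:
            f.seek(hole_size)    # move ahead without writing anything
            f.write(b"end")      # only these 3 bytes are actually written
        with open(path, "rb") as f:
            data = f.read()      # the never-written range reads back as zeros
    return len(data), data[:hole_size].count(0)

if __name__ == "__main__":
    total, zeros = read_through_hole()
    print(f"read {total} bytes, of which {zeros} were zeros from the hole")
```

A program that simply forwards everything it reads will therefore ship the whole hole as real data, which matches the 26GB that arrived on the backup server.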

Rick pointed out that rsync and tar have options that deal with sparse files 
intelligently: when they copy a sparse file, they do not write out blocks of 
all zeros.  Instead, they seek past such "empty" blocks to avoid writing to 
them, thus creating a sparse output file.  That's how a proper Linux file 
copy is done.  The cp command also does that.
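
The seek-past-zeros technique can be sketched in a few lines of Python. This is an illustration of the idea only, not the code that rsync, tar or cp use. The 4096-byte block size and the names sparse_copy and demo are assumptions for the example.

```python
import os
import tempfile

BLOCK = 4096  # assumed block size for this sketch

def sparse_copy(src, dst, block=BLOCK):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zero_block = bytes(block)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(block)
            if not chunk:
                break
            if chunk == zero_block[:len(chunk)]:
                fout.seek(len(chunk), os.SEEK_CUR)  # skip ahead, leaving a hole
            else:
                fout.write(chunk)
        # A file that ends in zeros got no final write, so set its length here.
        fout.truncate(fout.tell())

def demo():
    """Copy a file containing a hole; return (contents_equal, src_size, dst_size)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.dat")
        dst = os.path.join(tmp, "dst.dat")
        with open(src, "wb") as f:
            f.write(b"head")
            f.seek(64 * BLOCK)   # leave a hole between the two writes
            f.write(b"tail")
        sparse_copy(src, dst)
        with open(src, "rb") as a, open(dst, "rb") as b:
            same = a.read() == b.read()
        return same, os.stat(src).st_size, os.stat(dst).st_size
```

The copy has the same length and the same contents as the source, but the skipped ranges are never written, so a filesystem that supports holes need not allocate blocks for them. Note that this sketch also turns blocks of real zeros into holes, which is harmless for the data but changes the allocation.
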
- MacK.


Re: Files on disk

2010-07-21 Thread Richard Troth
RSYNC is your friend.

rsync -a -S sourcedir/. targetdir/.

where "-S" means "handle sparse files intelligently".  GNU tar has the same
option for handling sparse files: -S (or --sparse).

-- R;   <><





On Wed, Jul 21, 2010 at 11:59, Berry van Sleeuwen wrote:
> Hello Edmund,
>
> Sparse files. OK. Then the next question, how can I store a 26G file in
> a machine that isn't that large? And to add to this, why does the
> filesystem backup really dump 26G into our TSM server?
>
> So it looks like the data is going somewhere.
>
> Berry.
>
> Op 21-07-10 15:41, Edmund R. MacKenty schreef:
>>
>> Because they are sparse files.  Linux only allocates blocks for a file that
>> have actually been written, so if a process creates a file and seeks a couple
>> of gigabytes into it before the first write, the file size is reported as
>> over 2GB, but it really only uses the blocks actually written after that
>> point.  Use du(1) to report the actual space used by those files.


Re: Files on disk

2010-07-21 Thread Berry van Sleeuwen
Hello Edmund,

Sparse files. OK. Then the next question, how can I store a 26G file in
a machine that isn't that large? And to add to this, why does the
filesystem backup really dump 26G into our TSM server?

So it looks like the data is going somewhere.

Berry.

Op 21-07-10 15:41, Edmund R. MacKenty schreef:
>
> Because they are sparse files.  Linux only allocates blocks for a file that
> have actually been written, so if a process creates a file and seeks a couple
> of gigabytes into it before the first write, the file size is reported as
> over 2GB, but it really only uses the blocks actually written after that
> point.  Use du(1) to report the actual space used by those files.


Re: Files on disk

2010-07-21 Thread Edmund R. MacKenty
On Wednesday 21 July 2010 05:03, van Sleeuwen, Berry wrote:
>On a SLES8 guest we have found that file /var/log/lastlog is reported to
>be 26G. Also the /var/log/faillog is reported to be 2G. But, the /var is
>located on a 3390 model 3. So that disk, that also contains other
>directories, is only 2.3 G. Command df shows that the / is 83% in use.
>
>How can it be that files can grow larger than the disk they reside on?
>And why would df report on 83% instead of 100% usage?

Because they are sparse files.  Linux only allocates blocks for a file that 
have actually been written, so if a process creates a file and seeks a couple 
of gigabytes into it before the first write, the file size is reported as 
over 2GB, but it really only uses the blocks actually written after that 
point.  Use du(1) to report the actual space used by those files.
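
A minimal Python sketch of that behaviour follows. The function name, file name and 64MB offset are made up for the example. It relies on st_blocks, which is a Unix field counted in 512-byte units, and the result depends on the filesystem supporting holes.

```python
import os
import tempfile

def make_sparse(path, offset, payload=b"x"):
    """Seek to offset without writing, then write payload.

    Returns (length_in_bytes, allocated_bytes).
    """
    with open(path, "wb") as f:
        f.seek(offset)     # nothing is written for the skipped range
        f.write(payload)   # only this data needs disk blocks
    st = os.stat(path)
    # st_size is the length that ls reports; st_blocks is what du adds up.
    return st.st_size, st.st_blocks * 512

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        length, allocated = make_sparse(os.path.join(tmp, "demo.log"),
                                        64 * 1024 * 1024)
        print(f"length: {length} bytes, allocated: {allocated} bytes")
```

On a filesystem that supports holes, the allocated figure should come out far smaller than the length, which is the same gap seen between the reported file size and the du(1) output.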

IIRC, sparse files are used for these logs because they are in a kind of 
record-oriented format, where the position in the file is the record key.  
That's why you need to use lastlog(8) and faillog(8) to look at those files: 
they are not plain text files the way /var/log/messages is.
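
To make "the position in the file is the record key" concrete, here is a hedged Python sketch of a fixed-size record file indexed by uid. The 292-byte layout (4-byte login time, 32-byte terminal name, 256-byte host name) is an assumption modelled on the usual Linux struct lastlog, not something confirmed in this thread, and all function names are made up.

```python
import io
import struct

# Assumed record layout: 4-byte time, 32-byte line, 256-byte host = 292 bytes.
RECORD = struct.Struct("=i32s256s")

def record_offset(uid):
    """The uid is the key: record N lives at byte N * record size."""
    return uid * RECORD.size

def write_record(f, uid, when, line, host):
    f.seek(record_offset(uid))              # may jump far past the current end
    f.write(RECORD.pack(when, line, host))  # short strings are padded with NULs

def read_record(f, uid):
    f.seek(record_offset(uid))
    raw = f.read(RECORD.size)
    if len(raw) < RECORD.size:
        return None                         # past the end: no such record
    when, line, host = RECORD.unpack(raw)
    return when, line.rstrip(b"\0"), host.rstrip(b"\0")

if __name__ == "__main__":
    buf = io.BytesIO()                      # in-memory stand-in for the log file
    write_record(buf, 1000, 1279700000, b"pts/0", b"example.host")
    print(read_record(buf, 1000))
    print(read_record(buf, 5))              # never written: reads as all zeros
```

Under that assumed record size, one record written for a uid near 90 million would land roughly 26GB into the file, so a single login by a very high uid could be enough to produce a huge apparent length while only a few blocks are allocated.
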
- MacK.