Re: NTFS fragmentation under Cygwin not NT/XP; redux

2006-11-28 Thread Christopher Faylor
On Mon, Nov 27, 2006 at 07:44:55PM -0800, Linda Walsh wrote:
Christopher Faylor wrote:
I was hoping that this discussion about ext3 would die a natural death but
it looks like I have to make the observation that this really has nothing
to do with Cygwin
---
   Don't know what cygwin you are talking about, but the one  I
download from cygwin.com  seems to have several utils that deal with
ext2/ext3.  If  ext2/ext3 performance relative to NTFS is a verboten
discussion and has nothing to do with Cygwin, then perhaps
these utils shouldn't be in Cygwin??   

Was the discussion about how these utilities create fragmented ext3
filesystems under Cygwin?  No.  The message that I responded to was
dumping the contents of an ext3 filesystem and talking about how to look
at segments on ext3.

We often redirect general "how do I use the tools" (e.g., bash, gcc,
make) discussions to more appropriate mailing lists.  There is nothing
cygwin-specific that I can see in the last couple of messages.

How can we begin to determine how well or poorly cygwin on top of NT
does if we aren't allowed to discuss how well ext2/ext3 perform.
For whatever reasons, they are the only non-NT file systems
cygwin seems to have utilities for.

Maybe I was premature in declaring this off-topic but it certainly seems
to me that you are discussing something that is not going to be of very
much interest to anyone here when you start discussing how to interpret
the output of debugfs and, additionally, I haven't seen any indication
that cygwin is doing anything particularly wrong.

So, I withdraw my objection but please be cognizant of the fact that
this is a cygwin list, not a "how do I use particular standard utilities
that cygwin supplies" list.

cgf

--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Problem reports:   http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/



Re: NTFS fragmentation redux

2006-11-27 Thread Vladimir Dergachev
On Saturday 25 November 2006 10:12 pm, Linda Walsh wrote:
 Vladimir Dergachev wrote:
  This is curious - how do you find out fragmentation of ext3 file ? I do
  not know of a utility to tell me that.

 ---
   There's a debugfs for ext2/ext3 that allows you to dump all of the
 segments associated with an inode.  ls -i dumps the inode number.
 A quick hack (attached) displays segments for either extX or (using
 xfs_bmap) xfs. I couldn't find a similar tool for jfs or reiser (at least
 not in my distro).


Cool, thank you !

fragfilt.ext does not quite work for me. Also, looking at the code, it is not 
obvious whether it takes indirect blocks into account - but I am not that 
fluent in perl, so, perhaps I missed it.

Here is a piece of output from my debugfs:


(IND):118948480, (898060-898435):118948488-118948863, 
(898436-898660):118949376-118949600, (898661-899083):118949612-118950034, 
(IND):118950035, (899084-900107):118950036-118951059, (IND):118951060, 
(900108-901131):118951061-118952084, (IND):118952085, 
(901132-902155):118952086-118953109, (IND):118953110, 
(902156-902741):118953111-118953696, (902742-903179):118953701-118954138, 
(IND):118954139, (903180-903760):118954140-118954720, 
(903761-904203):118955745-118956187, (IND):118956188, 
(904204-904783):118956189-118956768, (904784-905227):118957813-118958256, 
(IND):118958257, (905228-906251):118958258-118959281, (IND):118959282, 
(906252-906760):118959283-118959791

  From indirect observation ext3  does not have fragmentation nearly that
  bad until the filesystem is close to full or I would not be able to reach
  sequential read speeds (the all-seeks speed is about 6 MB/sec for me, I
  was getting 40-50 MB/sec). This was on much larger files though.

 ---
 On an empty partition, I created a deterministic pathological case. Lots
 of little files all separated by holes.  ext3 (default mount) just
 allocated 4k blocks in a first-come, first-served manner.  XFS apparently
 looked for larger allocation units as the file was larger than 4K.
 In that regard, it's similar to NT.  I believe both use a form of B-Tree
 to manage free space.

I see. Well, I was concerned with the non-pathological case of having lots of 
contiguous free space and the apparent inability of NTFS to handle slowly grown 
files (i.e. writes in append mode). Common usage cases are logfiles and 
downloads.


  Which journal option was the filesystem mounted with ?

 ---
   I can't see how that would matter, but default. For speed of
 testing, I mounted both with noatime,async; xfs also got
 nodiratime and logbufs=8 (or deletes take way too long).

Thank you, just wanted to cover all possibilities.


  I actually implemented a workaround that calls fsutil file createnew
  FILESIZE to preallocate space and then write data in append mode
  (after doing seek 0).

 ---
   I wonder if it does the same thing as dd or if it uses
 the special call to tell the OS what to expect.  FWIW,
 cp used some smallish number of blocks (4 or 8, I think), so
 it is almost guaranteed to give you just about the worst possible
 fragmentation! :-)  Most likely the other file utils will
 give similar allocation performance (not so good).

I believe it is a special call that tells the filesystem to reserve needed 
space, but does not write anything to disk. I wonder whether it leaks 
information from deleted files.
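
For reference, a rough shell approximation of the same idea (the original
test used a Tcl script; the file names and the 30 MB size here are just
placeholders, and whether the reserved allocation survives being overwritten
is exactly what that test measures):

# reserve ~30 MB up front (fsutil takes the size in bytes), then overwrite
# the reserved region in place without truncating the file
fsutil file createnew b.dat 31457280
dd if=a.dat of=b.dat bs=1M conv=notrunc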

Btw, I found out that IE writes files downloaded from the web into the 
temporary directory - and they end up all broken into tiny pieces - but, after 
that, it *copies* them to the actual location (instead of doing a move, as 
would be reasonable). The copy ends up not being fragmented because, my guess 
is, IE now knows the size and asks for it.

 best

Vladimir Dergachev





Re: NTFS fragmentation redux

2006-11-27 Thread Christopher Faylor
On Mon, Nov 27, 2006 at 02:25:00PM -0500, Vladimir Dergachev wrote:
On Saturday 25 November 2006 10:12 pm, Linda Walsh wrote:
 Vladimir Dergachev wrote:
  This is curious - how do you find out fragmentation of ext3 file ? I do
  not know of a utility to tell me that.

 ---
  There's a debugfs for ext2/ext3 that allows you to dump all of the
 segments associated with an inode.  ls -i dumps the inode number.
 A quick hack (attached) displays segments for either extX or (using
 xfs_bmap) xfs. I couldn't find a similar tool for jfs or reiser (at least
 not in my distro).


Cool, thank you !

fragfilt.ext does not quite work for me. Also, looking at the code, it is not 
obvious whether it takes indirect blocks into account - but I am not that 
fluent in perl, so, perhaps I missed it.

Here is a piece of output from my debugfs:


(IND):118948480, (898060-898435):118948488-118948863, 
(898436-898660):118949376-118949600, (898661-899083):118949612-118950034, 
(IND):118950035, (899084-900107):118950036-118951059, (IND):118951060, 
(900108-901131):118951061-118952084, (IND):118952085, 
(901132-902155):118952086-118953109, (IND):118953110, 
(902156-902741):118953111-118953696, (902742-903179):118953701-118954138, 
(IND):118954139, (903180-903760):118954140-118954720, 
(903761-904203):118955745-118956187, (IND):118956188, 
(904204-904783):118956189-118956768, (904784-905227):118957813-118958256, 
(IND):118958257, (905228-906251):118958258-118959281, (IND):118959282, 
(906252-906760):118959283-118959791

I was hoping that this discussion about ext3 would die a natural death but
it looks like I have to make the observation that this really has nothing
to do with Cygwin.

Please take this somewhere else.

cgf




Re: NTFS fragmentation under Cygwin not NT/XP; redux

2006-11-27 Thread Linda Walsh

Christopher Faylor wrote:

I was hoping that this discussion about ext3 would die a natural death but
it looks like I have to make the observation that this really has nothing
to do with Cygwin

---
   Don't know what cygwin you are talking about, but the one  I
download from cygwin.com  seems to have several utils that deal with
ext2/ext3.  If  ext2/ext3 performance relative to NTFS is a verboten
discussion and has nothing to do with Cygwin, then perhaps
these utils shouldn't be in Cygwin??   


 uname -sro
CYGWIN_NT-5.1 1.5.22(0.156/4/2) Cygwin
 apropos ext2
debugfs  (8)  - ext2/ext3 file system debugger
dumpe2fs (8)  - dump ext2/ext3 filesystem information
e2fsck   (8)  - check a Linux ext2/ext3 file system
e2fsck [fsck](8)  - check a Linux ext2/ext3 file system
e2image  (8)  - Save critical ext2/ext3 filesystem data to a file
e2label  (8)  - Change the label on an ext2/ext3 filesystem
mke2fs   (8)  - create an ext2/ext3 filesystem
mke2fs [mkfs](8)  - create an ext2/ext3 filesystem
resize2fs(8)  - ext2/ext3 file system resizer
tune2fs  (8)  - adjust tunable filesystem parameters on ext2/ext3 filesystems
law> apropos ext3
debugfs  (8)  - ext2/ext3 file system debugger
dumpe2fs (8)  - dump ext2/ext3 filesystem information
e2fsck   (8)  - check a Linux ext2/ext3 file system
e2fsck [fsck](8)  - check a Linux ext2/ext3 file system
e2image  (8)  - Save critical ext2/ext3 filesystem data to a file
e2label  (8)  - Change the label on an ext2/ext3 filesystem
mke2fs   (8)  - create an ext2/ext3 filesystem
mke2fs [mkfs](8)  - create an ext2/ext3 filesystem
resize2fs(8)  - ext2/ext3 file system resizer
tune2fs  (8)  - adjust tunable filesystem parameters on ext2/ext3 filesystems
--
The discussion has been about NTFS and how NT-native apps handle
fragmentation compared to Cygwin.

Cygwin's performance with NTFS (both default and optimized) was being
compared to linux's allocation performance on xfs and ext2(3).

Two responses clarifying the test conditions on ext2/3 were asked, and
then you declare the whole discussion as having nothing to do
with Cygwin?  Perhaps, not caring for the topic, you haven't been
reading it?  The discussion has everything to do with the performance
of NTFS, the underlying filesystem that cygwin recommends, vs. ext2/ext3.


As for your assertion that ext2/ext3 have nothing to do with cygwin,
the cygwin distribution contains/provides several utilities (as
shown above) for creating, checking, debugging, resizing,
tuning, imaging, labeling and dumping ext2/ext3 file systems.

The data I processed was output from debugfs -- a utility that also
exists on cygwin.

How can we begin to determine how well or poorly cygwin on top of NT
does if we aren't allowed to discuss how well ext2/ext3 perform.
For whatever reasons, they are the only non-NT file systems
cygwin seems to have utilities for.

-l








Re: NTFS fragmentation redux

2006-11-25 Thread Linda Walsh

Vladimir Dergachev wrote:
This is curious - how do you find out fragmentation of ext3 file ? I do not 
know of a utility to tell me that. 

---
There's a debugfs for ext2/ext3 that allows you to dump all of the
segments associated with an inode.  ls -i dumps the inode number.
A quick hack (attached) displays segments for either extX or (using xfs_bmap) 
xfs.
I couldn't find a similar tool for jfs or reiser (at least not in my distro).
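
For a single file the same check can be done by hand; a minimal sketch,
assuming the ext3 filesystem lives on /dev/hdc1 as in the attached script
(adjust the device, and /mnt/data/somefile is only a placeholder):

# ls -i gives the inode number; debugfs wants it wrapped in angle brackets
inode=$(ls -i /mnt/data/somefile | cut -d' ' -f1)
PAGER=cat debugfs -R "stat <$inode>" /dev/hdc1 2>/dev/null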

From indirect observation ext3  does not have fragmentation nearly that bad 
until the filesystem is close to full or I would not be able to reach 
sequential read speeds (the all-seeks speed is about 6 MB/sec for me, I was 
getting 40-50 MB/sec). This was on much larger files though.

---
On an empty partition, I created a deterministic pathological case. Lots
of little files all separated by holes.  ext3 (default mount) just
allocated 4k blocks in a first-come, first-served manner.  XFS apparently
looked for larger allocation units as the file was larger than 4K.
In that regard, it's similar to NT.  I believe both use a form of B-Tree
to manage free space.
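
One way to construct a similar pathological layout (a sketch of the general
idea only, not necessarily the exact procedure used above; the mount point
and counts are arbitrary):

# fill an empty test filesystem with 4k files, then free every other one
# to leave single-block holes scattered across the disk
cd /mnt/test
for i in $(seq 1 10000); do
    dd if=/dev/zero of=small.$i bs=4k count=1 2>/dev/null
done
rm -f small.*[02468]
# now write one large file and check (e.g. with the attached script or
# filefrag) how badly the allocator scatters it into those holes
dd if=/dev/zero of=big.dat bs=1M count=100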


Which journal option was the filesystem mounted with ?

---
I can't see how that would matter, but default. For speed of
testing, I mounted both with noatime,async; xfs also got
nodiratime and logbufs=8 (or deletes take way too long).

I actually implemented a workaround that calls fsutil file createnew 
FILESIZE to preallocate space and then write data in append mode

(after doing seek 0).

---
I wonder if it does the same thing as dd or if it uses
the special call to tell the OS what to expect.  FWIW,
cp used some smallish number of blocks (4 or 8, I think), so
it is almost guaranteed to give you just about the worst possible
fragmentation! :-)  Most likely the other file utils will
give similar allocation performance (not so good).

-linda

#!/bin/bash
# For each filename given, look up its inode and pipe the debugfs block map
# through fragfilt.ext (below).  The device (/dev/hdc1) is hardcoded.

export debugfs=$(which debugfs)

if (($# < 1)); then echo "need filename" >&2; exit 1; fi

while [[ -n $1 ]]; do
    if [ ! -e "$1" ]; then echo "Name \"$1\" doesn't exist. Ignoring" >&2
    else
        inode=$(ls -i "$1" | cut -d' ' -f1)
        echo -n "$1: "
        PAGER=cat $debugfs /dev/hdc1 -R "stat <$inode>" 2>/dev/null |
            fragfilt.ext
    fi
    shift
done

# vim:ts=4:sw=4:
#!/usr/bin/perl -w
# fragfilt.ext: read "debugfs -R 'stat <inode>'" output on stdin and report
# how many discontiguous extents (fragments) the block list contains.

my $lineno=0;
my $frags=0;

while (<>) {
    ++$lineno;
    chomp;
    /^ \( \d+[^\)]* \) :\d+[^,]*,/x && do {
        my @block_ranges;
        my @blocknums;
        my @this_range;

        s/\([^\)\n]+\)://g;
        s/,//g;
        @block_ranges = split / /;
        foreach (@block_ranges) {
            s/-/../;
            @this_range=eval;
            push @blocknums, @this_range;
        }
        my $bn=$blocknums[0];
        print "$#blocknums blocks";
        my $frags=1;    # NB: this "my" shadows the outer $frags above, so
                        # the summary below always sees the outer value (0)
        for ($i=1; $i < $#blocknums; ++$i) {
            my $nbn=$blocknums[$i];
#           print "bn=$bn, nbn=$nbn; ";
            if ($bn+1 != $nbn) {
#               print "Hiccup, skip ($bn - $nbn)\n";
                ++$frags;
            }
            $bn=$nbn;
        }
        last;
#       next;
    };
}
if ($frags==0) {
    print "No fragments in file, (length = 0?)\n";
    exit 1;
}
if ($frags==1) {
    print ", fully defragmented\n";
} else {
    print " in $frags fragments\n";
}

# vim:ts=4:sw=4

RE: NTFS fragmentation

2006-08-06 Thread Charli Li
-Original Message-
From: Robert Pendell
Sent: Saturday, August 05, 2006 7:50 PM
To: Cygwin Mailing List
Subject: Re: NTFS fragmentation


Vladimir Dergachev wrote:
 Also, I tried the following experiment - found a 17 MB file in
 ibiblio.org and downloaded it with Firefox. The file ended up
 fragmented into more than 200 pieces. Tried the same file with IE
 - no fragmentation.

 It could be, of course, that Firefox is compiled with cygwin, but
 I have not found cygwin.dll anywhere in its installation directory.

IE moves the files from your Temporary Internet Files to where your
defined destination is once the download is complete.  Firefox however
skips that and just writes it to the destination.  That is why you see
the fragmentation in Firefox and not IE.  The move that IE does isn't
always noticeable.  The box will only come up if it takes more than a
few seconds but occasionally you see it say Moving FILE from Temporary
Internet Files to DEST (replacing FILE and DEST as appropriate).  The
message is probably different but you get the idea.

--
Robert Pendell
[EMAIL PROTECTED]

Thawte Web of Trust Notary
CAcert Assurer


When you actually open the file using IE, it just downloads the file into
Temporary Internet Files and opens it.  But in Fx, it downloads it into the
preset folder, but when it is just about to open, it moves the file into the
temp folder (C:\Documents and Settings\cygwin\Local Settings\temp, replacing
"cygwin" with your username), then opens it.
---
Sure, Fx is compiled USING cygwin but not using GCC.  The build is driven by
a makefile (using make) and configure (autoconf-2.13), but the tools are
Microsoft Visual C++ tools found in C:\Program Files\Microsoft Visual
Studio\VC\bin\.  Those tools are cl (compiler), link (linker), rc (resource
compiler), and vcvars32.bat (environment setup).

Charli





Re: NTFS fragmentation

2006-08-05 Thread Jim Lawson
  From: Vladimir Dergachev [EMAIL PROTECTED]
 To: cygwin@cygwin.com
 Subject: Re: NTFS fragmentation
 Date: Thu, 3 Aug 2006 14:54:33 -0400
 
 On Thursday 03 August 2006 2:37 pm, Dave Korn wrote:
  On 03 August 2006 18:50, Vladimir Dergachev wrote:
   On Thursday 03 August 2006 5:18 am, Dave Korn
 wrote:
   On 03 August 2006 00:46, Vladimir Dergachev
 wrote:
  
  
   Hi Vladimir,
  
   Please CC me - I am not on the list.
  
 Done :)
  
  
    I guess this means that sequential writes are officially broken
    on NTFS.

    Anyone has any idea for a workaround ? It would be nice if a
    simple tar zcvf a.tgz * does not result in a completely
    fragmented file.

   I can only think of one thing worth trying off the top of my
   head: what happens if you open a file (in non-sparse mode) and
   immediately seek to the file size, then seek back to the start
   and actually write the contents?  Or perhaps after seeking to
   the end you'd need to write (at least) a single byte, then seek
   back to the beginning?

  I am not sure that I understand, if one creates the file and then
  seeks to +1G, wouldn't the file pointer be still at 0 as the
  filesize is 0 ?

  What I am thinking about is modifying cygwin's open and write
  calls so that they preallocate files in chunks of 10MB
  (configurable by an environment variable).

  This way we still get some fragmentation, but it would not be so
  bad - assuming 50MB/sec disk read speed reading 10MB will take
  200ms, while a seek is at worst 20ms (usually around 10-15ms).
 
  best
 
 Vladimir
 Dergachev

 
It turns out that to actually allocate the file
blocks, you need to write some data. Seeking to the
desired size doesn't (or didn't used to) actually
allocate the intervening blocks. As Dave suggests, you
need to seek to the end and actually write something
to get the file blocks allocated. If you try this for
a very large file (several Gigabytes), you had better
be prepared to go and have a nice meal while you wait
for the block allocation to complete. Windows'
security policy requires that the blocks not only be
allocated, but that they be written with data as well
- ostensibly to prevent malicious code from reading
old data it shouldn't have access to.

Granted, there are better ways to do this - zero-fill
on attempts to read from allocated but uninitialized
file space or at the very least, throw some kind of
exception when an application attempts to read
uninitialized file data. Since Windows supports sparse
files, the basic mechanism is there somewhere.

Windows doesn't (or didn't use to) allow preallocation
of files without actually writing data UNLESS you know
the proper incantation to prove you're a good guy
(your application needs to do a dance to grant itself
the "SeManageVolumePrivilege" privilege so it can
issue the SetFileValidData call).
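
For completeness, a sketch of that incantation using fsutil rather than C
(assuming your fsutil has the setvaliddata subcommand, which wraps
SetFileValidData; it must be run by an account holding
SeManageVolumePrivilege, e.g. an administrator, and the 1 GiB size is
arbitrary):

# reserve 1 GiB, then declare the whole range valid so Windows skips the
# zero-fill; the file then exposes whatever happened to be on disk there,
# which is why the privilege is required
fsutil file createnew bigfile.dat 1073741824
fsutil file setvaliddata bigfile.dat 1073741824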






Re: NTFS fragmentation

2006-08-05 Thread Robert Pendell
Vladimir Dergachev wrote:
 Also, I tried the following experiment - found a 17 MB file in
 ibiblio.org and downloaded it with FireFox. The file ended up fragmented
 into more than 200 pieces. Tried the same file with IE - no fragmentation.
 
 It could be, of course, that Firefox is compiled with cygwin, but I have not 
 found cygwin.dll anywhere in its installation directory.

IE moves the files from your Temporary Internet Files to where your
defined destination is once the download is complete.  Firefox however
skips that and just writes it to the destination.  That is why you see
the fragmentation in Firefox and not IE.  The move that IE does isn't
always noticeable.  The box will only come up if it takes more than a
few seconds but occasionally you see it say "Moving FILE from Temporary
Internet Files to DEST" (replacing FILE and DEST as appropriate).  The
message is probably different but you get the idea.

-- 
Robert Pendell
[EMAIL PROTECTED]

Thawte Web of Trust Notary
CAcert Assurer





RE: NTFS fragmentation

2006-08-03 Thread Dave Korn
On 03 August 2006 00:46, Vladimir Dergachev wrote:


Hi Vladimir,


 Please CC me - I am not on the list.

  Done :)

 
 PS I'll try writing a C program when time permits - any suggestions on what
 API besides regular open/write/close to use ?

  I think you might want to go straight to ZwCreateFile in the native API, and
see if you can pin the difference in behaviour down to some change in the
flags passed to that call.

  Actually, maybe the most informative thing would be to look at the device IO
controls sent by both testcases, using filemon or similar.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today





Re: NTFS fragmentation

2006-08-03 Thread Christian Franke

Vladimir Dergachev wrote:

...
Also, I tried the following experiment - found a 17 MB file in ibiblio.org and 
downloaded it with FireFox. The file ended up fragmented into more than 200 
pieces. Tried the same file with IE - no fragmentation.
  


The difference is probably that IE initially creates the file with full 
size and then overwrites it. This is at least the case if you copy files 
with explorer, copy, xcopy or CopyFileEx().


FireFox, Cygwin's cp and most other programs use regular sequential 
write. This may lead to fragmentation when the disk has less free space.
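
An easy way to see the difference (a sketch; sample.bin stands for any large
existing file, and Sysinternals' contig -a is only one way to count
fragments - the defragmenter's analyze report used elsewhere in the thread
works as well):

# copy the same file once through a native Windows copy and once through
# Cygwin's cp, then compare how many fragments each copy ends up in
cmd /c copy sample.bin native_copy.bin
cp sample.bin cygwin_copy.bin
contig -a native_copy.bin
contig -a cygwin_copy.bin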


Christian





Re: NTFS fragmentation

2006-08-03 Thread Vladimir Dergachev
On Thursday 03 August 2006 5:18 am, Dave Korn wrote:
 On 03 August 2006 00:46, Vladimir Dergachev wrote:


 Hi Vladimir,

  Please CC me - I am not on the list.

   Done :)


   Actually, maybe the most informative thing would be to look at the device
 IO controls sent by both testcases, using filemon or similar.

Thank you for the suggestion !

I used filemon and discovered that all three programs (ntfs_test.tcl, Firefox 
and IE) use sequential access, but IE writes the file first to Temporary 
Internet Files folder and then copies it.

If one runs analyze from the defragmenter while IE is still downloading the
file, the file in the Temporary Internet Files folder is just as fragmented
as other files.

I guess this means that sequential writes are officially broken on NTFS.

Anyone has any idea for a workaround ? It would be nice if a simple
tar zcvf a.tgz * does not result in a completely fragmented file.

  thank you

  Vladimir Dergachev



 cheers,
   DaveK






Re: NTFS fragmentation

2006-08-03 Thread Vladimir Dergachev
On Thursday 03 August 2006 5:35 am, Christian Franke wrote:
 Vladimir Dergachev wrote:
  ...
  Also, I tried the following experiment - found a 17 MB file in
  ibiblio.org and downloaded it with FireFox. The file ended up fragmented
  into more than 200 pieces. Tried the same file with IE - no
  fragmentation.

 The difference is probably that IE initially creates the file with full
 size and then overwrites it. This is at least the case if you copy files
 with explorer, copy, xcopy or CopyFileEx().

 FireFox, Cygwin's cp and most other programs use regular sequential
 write. This may lead to fragmentation when the disk has less space.

Well, what I see is that the file is completely fragmented on a 400 GB disk, 
which is 40% full and has been recently defragmented.

By completely fragmented I mean as if each write ends up in its own separate 
place on disk and NTFS does not even check whether there is free space after 
the last written block. Honestly, FAT was more efficient - at least when 
written anew there was no fragmentation.

best

Vladimir Dergachev


 Christian






RE: NTFS fragmentation

2006-08-03 Thread Dave Korn
On 03 August 2006 18:50, Vladimir Dergachev wrote:

 On Thursday 03 August 2006 5:18 am, Dave Korn wrote:
 On 03 August 2006 00:46, Vladimir Dergachev wrote:
 
 
 Hi Vladimir,
 
 Please CC me - I am not on the list.
 
   Done :)
 
 
   Actually, maybe the most informative thing would be to look at the device
 IO controls sent by both testcases, using filemon or similar.
 
 Thank you for the suggestion !
 
 I used filemon and discovered that all three programs (ntfs_test.tcl,
 Firefox and IE) use sequential access, but IE writes the file first to
 Temporary Internet Files folder and then copies it.
 
 If one runs analyze from defragmenter while IE is still downloading the file
 the file in the Temporary Internet Files folder is just as fragmented as
 other files.
 
 I guess this means that sequential writes are officially broken on NTFS.
 
 Anyone has any idea for a workaround ? It would be nice if a simple
 tar zcvf a.tgz * does not result in a completely fragmented file.


  I can only think of one thing worth trying off the top of my head: what
happens if you open a file (in non-sparse mode) and immediately seek to the
file size, then seek back to the start and actually write the contents?  Or
perhaps after seeking to the end you'd need to write (at least) a single byte,
then seek back to the beginning?
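
From a Cygwin shell, the quickest approximation of that experiment is
something like the sketch below (a.dat is a placeholder for the real data
and 30 MB is arbitrary) - though note that the single far-off write may just
produce a sparse file through Cygwin, which is part of what would need
checking:

# step 1: "preallocate" by writing one byte at offset 30MB-1
dd if=/dev/zero of=c.dat bs=1 count=1 seek=$((30*1024*1024 - 1))
# step 2: go back and write the real contents from the start, without
# truncating the file created in step 1
dd if=a.dat of=c.dat bs=1M conv=notrunc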



cheers,
  DaveK
-- 
Can't think of a witty .sigline today





Re: NTFS fragmentation

2006-08-03 Thread Vladimir Dergachev
On Thursday 03 August 2006 2:37 pm, Dave Korn wrote:
 On 03 August 2006 18:50, Vladimir Dergachev wrote:
  On Thursday 03 August 2006 5:18 am, Dave Korn wrote:
  On 03 August 2006 00:46, Vladimir Dergachev wrote:
 
 
  Hi Vladimir,
 
  Please CC me - I am not on the list.
 
Done :)
 
 
  I guess this means that sequential writes are officially broken on NTFS.
 
  Anyone has any idea for a workaround ? It would be nice if a simple
  tar zcvf a.tgz * does not result in a completely fragmented file.

   I can only think of one thing worth trying off the top of my head: what
 happens if you open a file (in non-sparse mode) and immediately seek to the
 file size, then seek back to the start and actually write the contents?  Or
 perhaps after seeking to the end you'd need to write (at least) a single
 byte, then seek back to the beginning?


I am not sure that I understand, if one creates the file and then seeks to 
+1G, wouldn't the file pointer be still at 0 as the filesize is 0 ?

What I am thinking about is modifying cygwin's open and write calls so that 
they preallocate files in chunks of 10MB (configurable by an environment 
variable). 

This way we still get some fragmentation, but it would not be so bad -
assuming a 50MB/sec disk read speed, reading 10MB will take 200ms, while a
seek is at worst 20ms (usually around 10-15ms).
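
(Spelling that estimate out: at 50MB/sec a 10MB chunk takes 10/50 = 0.2s to
read; adding one worst-case 20ms seek per chunk gives 0.22s, i.e. roughly a
10% throughput loss, or about 45MB/sec instead of 50MB/sec - compared with a
badly fragmented file where almost every block can cost a seek.)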

 best

Vladimir Dergachev





Re: NTFS fragmentation

2006-08-03 Thread Corinna Vinschen
On Aug  3 14:54, Vladimir Dergachev wrote:
 On Thursday 03 August 2006 2:37 pm, Dave Korn wrote:
 What I am thinking about is modifying cygwin's open and write calls so that 
 they preallocate files in chunks of 10MB (configurable by an environment 
 variable). 

No.


Corinna

-- 
Corinna Vinschen  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader  cygwin AT cygwin DOT com
Red Hat




RE: NTFS fragmentation

2006-08-02 Thread Gary R. Van Sickle
[cc'ing you per your request]

 From:  Vladimir Dergachev
 Sent: Wednesday, August 02, 2006 5:33 PM
 Subject: NTFS fragmentation
 
 
 Hi all, 
 
I have encountered a rather puzzling fragmentation 
 that occurs when writing files using Cygwin. 
 
What happens is that if one creates a new file and 
 writes data to it (whether via a command line redirect or 
 with a Tcl script - have not tried C
 yet) the file ends up heavily fragmented. 
 
In contrast, native Windows utilities do not exhibit 
 this issue.
 
Someone suggested to me that Windows requires an 
 expected file length to be passed at the time of open, thus I 
 searched on Google and found fsutil program that allows to 
 reserve space on the filesystem.
 
I attached a small Tcl script that, when run, creates 
 two 30 MB files - one using regular open/write pair (and 
 which is fragmented into about 300 pieces on my system) and 
 one using fsutil/open in append mode/seek 0 method.
 
   To see the problem, defragment your system, run the test 
 script and then run analyze and ask to view the report. You will 
 see a.dat at the top of the list, while b.dat never appears in 
 the report. 
 
Despite the workaround, it is still kinda hard for me 
 to believe that anyone has designed a filesystem that needs 
 to know what is the file size going to be - especially for a 
 single program writing on an almost empty disk. Perhaps there 
 is some sort of environment variable that I need to set ?
 
 Any suggestions and comments would be greatly 
 appreciated.
 Please CC me - I am not on the list.
 
thank you very much
 
 Vladimir Dergachev

I'll try your test case when I get a chance, but my WAG is that you're
seeing the effects of Cygwin's creation of sparse files by default for any
file beyond a certain size.  I unfortunately do not recall what that size
is.  What happens as you change FILE_SIZE and/or BUFFER_SIZE in your script,
to maybe a small multiple of your cluster size?
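
The cluster size itself is easy to look up; a minimal sketch, assuming the
files live on the c: volume (fsutil needs administrator rights):

# look for "Bytes Per Cluster" in the output; 4096 is the common NTFS default
fsutil fsinfo ntfsinfo c: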

-- 
Gary R. Van Sickle
 






Re: NTFS fragmentation

2006-08-02 Thread Larry Hall (Cygwin)

Gary R. Van Sickle wrote:

[cc'ing you per your request]


From:  Vladimir Dergachev
Sent: Wednesday, August 02, 2006 5:33 PM
Subject: NTFS fragmentation


Hi all, 

   I have encountered a rather puzzling fragmentation 
that occurs when writing files using Cygwin. 

   What happens is that if one creates a new file and 
writes data to it (whether via a command line redirect or 
with a Tcl script - have not tried C
yet) the file ends up heavily fragmented. 

   In contrast, native Windows utilities do not exhibit 
this issue.


   Someone suggested to me that Windows requires an 
expected file length to be passed at the time of open, thus I 
searched on Google and found fsutil program that allows to 
reserve space on the filesystem.


   I attached a small Tcl script that, when run, creates 
two 30 MB files - one using regular open/write pair (and 
which is fragmented into about 300 pieces on my system) and 
one using fsutil/open in append mode/seek 0 method.


  To see the problem, defragment your system, run the test 
script and then run analyze and ask to view the report. You will 
see a.dat at the top of the list, while b.dat never appears in 
the report. 

   Despite the workaround, it is still kinda hard for me 
to believe that anyone has designed a filesystem that needs 
to know what is the file size going to be - especially for a 
single program writing on an almost empty disk. Perhaps there 
is some sort of environment variable that I need to set ?


Any suggestions and comments would be greatly 
appreciated.
Please CC me - I am not on the list.


   thank you very much

Vladimir Dergachev


I'll try your test case when I get a chance, but my WAG is that you're
seeing the effects of Cygwin's creation of sparse files by default for any
file beyond a certain size.  I unfortunately do not recall what that size
is.  What happens as you change FILE_SIZE and/or BUFFER_SIZE in your script,
to maybe a small multiple of your cluster size?




This clicked with me as well.  I was thinking first though to try it with
straight C, to remove the possibility of some TCL pollution.  Otherwise a
quick Google of "sparse" for cygwin dot com turns up lots of relevant hits,
utilities, and technical details.  Should be pretty easy to determine if the
resulting files are sparse or not with this info.


--
Larry Hall  http://www.rfk.com
RFK Partners, Inc.  (508) 893-9779 - RFK Office
216 Dalton Rd.  (508) 893-9889 - FAX
Holliston, MA 01746




Re: NTFS fragmentation

2006-08-02 Thread Vladimir Dergachev
Hi Gary and Larry, 

  Thank you for your comments, replies below:

On Wednesday 02 August 2006 7:08 pm, you wrote:
  Any suggestions and comments would be greatly
  appreciated.
  Please CC me - I am not on the list.
 
 thank you very much
 
  Vladimir Dergachev

 I'll try your test case when I get a chance, but my WAG is that you're
 seeing the effects of Cygwin's creation of sparse files by default for any
 file beyond a certain size.  I unfortunately do not recall what that size
 is.  What happens as you change FILE_SIZE and/or BUFFER_SIZE in your
 script, to maybe a small multiple of your cluster size?

I tried buffer_size of 10K, 100K, 1M and 10M - no big difference, except a 
small decrease in number of fragments for 10M value - could be noise..

I also tried a smaller file size - 3M, the number of fragments decreased to 
33, roughly proportionally to size.

Unfortunately, I do not know what the cluster size is.

With regard to sparse files, the intent here is to open a file, write data to 
it and then close. No seeks involved, much less void regions. I do understand 
that internally cygwin could do something different. 

I have not found a utility to identify a sparse file yet - if you happen to 
have a link I would greatly appreciate it.
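
For the record, fsutil itself can answer the sparse question; a minimal
sketch (the file name is a placeholder):

# prints whether the sparse attribute is set on the file
fsutil sparse queryflag a.dat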

Also, I tried the following experiment - found a 17 MB file in ibiblio.org and 
downloaded it with FireFox. The file ended up fragmented into more than 200 
pieces. Tried the same file with IE - no fragmentation.

It could be, of course, that Firefox is compiled with cygwin, but I have not 
found cygwin.dll anywhere in its installation directory.

 thank you

Vladimir Dergachev

PS I'll try writing a C program when time permits - any suggestions on what 
API besides regular open/write/close to use ?





Re: NTFS fragmentation

2006-08-02 Thread Larry Hall (Cygwin)

Vladimir Dergachev wrote:
Hi Gary and Larry, 


  Thank you for your comments, replies below:

On Wednesday 02 August 2006 7:08 pm, you wrote:

Any suggestions and comments would be greatly
appreciated.
Please CC me - I am not on the list.

   thank you very much

Vladimir Dergachev

I'll try your test case when I get a chance, but my WAG is that you're
seeing the effects of Cygwin's creation of sparse files by default for any
file beyond a certain size.  I unfortunately do not recall what that size
is.  What happens as you change FILE_SIZE and/or BUFFER_SIZE in your
script, to maybe a small multiple of your cluster size?


I tried buffer_size of 10K, 100K, 1M and 10M - no big difference, except a 
small decrease in number of fragments for 10M value - could be noise..


I also tried a smaller file size - 3M, the number of fragments decreased to 
33, roughly proportionally to size.


Unfortunately, I do not know what the cluster size is.

With regard to sparse files, the intent here is to open a file, write data to 
it and then close. No seeks involved, much less void regions. I do understand 
that internally cygwin could do something different. 

I have not found a utility to identify a sparse file yet - if you happen to 
have a link I would greatly appreciate it.


Also, I tried the following experiment - found a 17 MB file in ibiblio.org and 
downloaded it with FireFox. The file ended up fragmented into more than 200 
pieces. Tried the same file with IE - no fragmentation.


It could be, of course, that Firefox is compiled with cygwin, but I have not 
found cygwin.dll anywhere in its installation directory.



If you pulled it from Mozilla.org, it ain't Cygwin-based.  That would point to
a more general, non-Cygwin problem.


PS I'll try writing a C program when time permits - any suggestions on what 
API besides regular open/write/close to use ?



I would recommend making a POSIX API version and a straight Win32 version.
But if what you said about Firefox is true, you should see a similar problem
even using MinGW (www.mingw.org) or the '-mno-cygwin' flag.  Again, that would
point to this being a non-Cygwin problem, though still quite an annoying one.



--
Larry Hall  http://www.rfk.com
RFK Partners, Inc.  (508) 893-9779 - RFK Office
216 Dalton Rd.  (508) 893-9889 - FAX
Holliston, MA 01746




Re: NTFS fragmentation

2006-08-02 Thread Christopher Faylor
On Wed, Aug 02, 2006 at 09:11:03PM -0400, Larry Hall (Cygwin) wrote:
If you pulled it from Mozilla.org, it ain't Cygwin-based.  That would
point to a more general, non-Cygwin problem.

Especially since tclsh.exe is just barely a cygwin program and I wouldn't
be surprised if it didn't even use Cygwin's open or write functions.

cgf




Re: NTFS fragmentation

2006-08-02 Thread Larry Hall (Cygwin)

Christopher Faylor wrote:

On Wed, Aug 02, 2006 at 09:11:03PM -0400, Larry Hall (Cygwin) wrote:

If you pulled it from Mozilla.org, it ain't Cygwin-based.  That would
point to a more general, non-Cygwin problem.


Especially since tclsh.exe is just barely a cygwin program and I wouldn't
be surprised if it didn't even use Cygwin's open or write functions.




Another good point...


--
Larry Hall  http://www.rfk.com
RFK Partners, Inc.  (508) 893-9779 - RFK Office
216 Dalton Rd.  (508) 893-9889 - FAX
Holliston, MA 01746
