Re: NTFS fragmentation under Cygwin not NT/XP; redux
On Mon, Nov 27, 2006 at 07:44:55PM -0800, Linda Walsh wrote: Christopher Faylor wrote: I was hoping that this discussion about ext3 would die a natural death but it looks like I have to make the observation that this really has nothing to do with Cygwin --- Don't know what cygwin you are talking about, but the one I download from cygwin.com seems to have several utils that deal with ext2/ext3. If ext2/ext3 performance relative to NTFS is a verboten discussion and has nothing to do with Cygwin, then perhaps these utils shouldn't be in Cygwin?? Was the discussion about how these utilities create fragmented ext3 filesystems under Cygwin? No. The message that I responded to was dumping the contents of an ext3 filesystem and talking about how to look at segments on ext3. We often redirect general "how do I use the tools" (e.g., bash, gcc, make) discussions to more appropriate mailing lists. There is nothing cygwin-specific that I can see in the last couple of messages. How can we begin to determine how well or poorly cygwin on top of NT does if we aren't allowed to discuss how well ext2/ext3 perform? For whatever reasons, they are the only non-NT file systems cygwin seems to have utilities for. Maybe I was premature in declaring this off-topic, but it certainly seems to me that you are discussing something that is not going to be of very much interest to anyone here when you start discussing how to interpret the output of debugfs and, additionally, I haven't seen any indication that cygwin is doing anything particularly wrong. So, I withdraw my objection, but please be cognizant of the fact that this is a cygwin list, not a "how do I use particular standard utilities that cygwin supplies" list. cgf -- Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple Problem reports: http://cygwin.com/problems.html Documentation: http://cygwin.com/docs.html FAQ: http://cygwin.com/faq/
Re: NTFS fragmentation redux
On Saturday 25 November 2006 10:12 pm, Linda Walsh wrote: Vladimir Dergachev wrote: This is curious - how do you find out fragmentation of ext3 file ? I do not know of a utility to tell me that. --- There's a debugfs for ext2/ext3 that allows you to dump all of the segments associated with an inode. ls -i dumps the inode number. A quick hack (attached) displays segments for either extX or (using xfs_bmap) xfs. I couldn't find a similar tool for jfs or reiser (at least not in my distro). Cool, thank you ! fragfilt.ext does not quite work for me. Also, looking at the code, it is not obvious whether it takes indirect blocks into account - but I am not that fluent in perl, so, perhaps I missed it. Here is a piece of output from my debugfs: (IND):118948480, (898060-898435):118948488-118948863, (898436-898660):118949376-118949600, (898661-899083):118949612-118950034, (IND):118950035, (899084-900107):118950036-118951059, (IND):118951060, (900108-901131):118951061-118952084, (IND):118952085, (901132-902155):118952086-118953109, (IND):118953110, (902156-902741):118953111-118953696, (902742-903179):118953701-118954138, (IND):118954139, (903180-903760):118954140-118954720, (903761-904203):118955745-118956187, (IND):118956188, (904204-904783):118956189-118956768, (904784-905227):118957813-118958256, (IND):118958257, (905228-906251):118958258-118959281, (IND):118959282, (906252-906760):118959283-118959791 From indirect observation ext3 does not have fragmentation nearly that bad until the filesystem is close to full or I would not be able to reach sequential read speeds (the all-seeks speed is about 6 MB/sec for me, I was getting 40-50 MB/sec). This was on much larger files though. --- On an empty partition, I created a deterministic pathological case. Lots little files all separated by holes. ext3 (default mount) just allocated 4k blocks in a first come-first serve manner. XFS apparently looked for larger allocation units as the file was larger than 4K. 
In that regard, it's similar to NT. I believe both use a form of B-Tree to manage free space. I see. Well, I was concerned with the non-pathological case of having lots of contiguous free space and the apparent inability of NTFS to handle slowly grown files (i.e. writes in append mode). Common usage cases are logfiles and downloads. Which journal option was the filesystem mounted with ? --- I can't see how that would matter, but default. For speed of test, I mounted both with noatime,async; xfs also got nodiratime and logbufs=8 (or deletes take way too long). Thank you, just wanted to cover all possibilities. I actually implemented a workaround that calls fsutil file createnew FILESIZE to preallocate space and then write data in append mode (after doing seek 0). --- I wonder if it does the same thing as dd or if it uses the special call to tell the OS what to expect. FWIW, cp used some smallish number of blocks (4 or 8, I think), so it is almost guaranteed to give you about the worst possible fragmented file! :-) Most likely the other file utils will give similar allocation performance (not so good). I believe it is a special call that tells the filesystem to reserve needed space, but does not write anything to disk. I wonder whether it leaks information from deleted files. Btw, I found out that IE writes files downloaded from the web into the temporary directory - and they end up all broken in tiny pieces, but, after that, it *copies* them to the actual location (instead of doing a move as would be reasonable). The copy ends up not being fragmented as, my guess, IE now knows its size and asks for it. best Vladimir Dergachev
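The preallocate-then-overwrite workaround described above (fsutil file createnew to reserve space, then write the data over it) can be sketched on a POSIX system roughly as follows. This is only an illustrative analogue, not what fsutil actually executes: the filename and size are made up, and fallocate(1) stands in for the "special call" since it likewise asks the filesystem to reserve blocks up front, with truncate(1) as a weaker fallback that only sets the logical size.

```shell
# Reserve the file's final size in one call, then overwrite it in place.
SIZE=$((1024*1024))                       # 1 MiB for illustration; the thread used 30 MB
fallocate -l "$SIZE" b.dat 2>/dev/null || truncate -s "$SIZE" b.dat
head -c "$SIZE" /dev/urandom |            # stand-in for the real data stream
    dd of=b.dat conv=notrunc status=none  # overwrite without truncating the reservation
stat -c '%s bytes, %b blocks' b.dat
```

Because the allocator sees the full size in a single request instead of a stream of small appends, it has a chance to pick one large extent, which is the whole point of the workaround.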
Re: NTFS fragmentation redux
On Mon, Nov 27, 2006 at 02:25:00PM -0500, Vladimir Dergachev wrote: On Saturday 25 November 2006 10:12 pm, Linda Walsh wrote: Vladimir Dergachev wrote: This is curious - how do you find out fragmentation of ext3 file ? I do not know of a utility to tell me that. --- There's a debugfs for ext2/ext3 that allows you to dump all of the segments associated with an inode. ls -i dumps the inode number. A quick hack (attached) displays segments for either extX or (using xfs_bmap) xfs. I couldn't find a similar tool for jfs or reiser (at least not in my distro). Cool, thank you ! fragfilt.ext does not quite work for me. Also, looking at the code, it is not obvious whether it takes indirect blocks into account - but I am not that fluent in perl, so, perhaps I missed it. Here is a piece of output from my debugfs: [debugfs block list snipped; quoted in full above] I was hoping that this discussion about ext3 would die a natural death but it looks like I have to make the observation that this really has nothing to do with Cygwin. Please take this somewhere else. cgf
Re: NTFS fragmentation under Cygwin not NT/XP; redux
Christopher Faylor wrote: I was hoping that this discussion about ext3 would die a natural death but it looks like I have to make the observation that this really has nothing to do with Cygwin --- Don't know what cygwin you are talking about, but the one I download from cygwin.com seems to have several utils that deal with ext2/ext3. If ext2/ext3 performance relative to NTFS is a verboten discussion and has nothing to do with Cygwin, then perhaps these utils shouldn't be in Cygwin??

law> uname -sro
CYGWIN_NT-5.1 1.5.22(0.156/4/2) Cygwin
law> apropos ext2
debugfs (8)       - ext2/ext3 file system debugger
dumpe2fs (8)      - dump ext2/ext3 filesystem information
e2fsck (8)        - check a Linux ext2/ext3 file system
e2fsck [fsck] (8) - check a Linux ext2/ext3 file system
e2image (8)       - Save critical ext2/ext3 filesystem data to a file
e2label (8)       - Change the label on an ext2/ext3 filesystem
mke2fs (8)        - create an ext2/ext3 filesystem
mke2fs [mkfs] (8) - create an ext2/ext3 filesystem
resize2fs (8)     - ext2/ext3 file system resizer
tune2fs (8)       - adjust tunable filesystem parameters on ext2/ext3 filesystems
law> apropos ext3
(same ten entries as above)

-- The discussion has been about NTFS and how NT-native apps handle fragmentation compared to Cygwin. Cygwin's performance with NTFS (both default and optimized) was being compared to Linux's allocation performance on xfs and ext2(3).
Two responses clarifying the test conditions on ext2/3 were posted, and then you declare the whole discussion as having nothing to do with Cygwin? Perhaps, not caring for the topic, you haven't been reading it? The discussion has everything to do with the performance of NTFS, the underlying filesystem that cygwin recommends, vs. ext2/ext3. As for your assertion that ext2/ext3 have nothing to do with cygwin, the cygwin distribution contains/provides several utilities (as shown above) for creating, checking, debugging, resizing, tuning, imaging, labeling and dumping ext2/ext3 file systems. The data I processed was output from debugfs -- a utility that also exists on cygwin. How can we begin to determine how well or poorly cygwin on top of NT does if we aren't allowed to discuss how well ext2/ext3 perform? For whatever reasons, they are the only non-NT file systems cygwin seems to have utilities for. -l
Re: NTFS fragmentation redux
Vladimir Dergachev wrote: This is curious - how do you find out fragmentation of ext3 file ? I do not know of a utility to tell me that. --- There's a debugfs for ext2/ext3 that allows you to dump all of the segments associated with an inode. ls -i dumps the inode number. A quick hack (attached) displays segments for either extX or (using xfs_bmap) xfs. I couldn't find a similar tool for jfs or reiser (at least not in my distro). From indirect observation ext3 does not have fragmentation nearly that bad until the filesystem is close to full or I would not be able to reach sequential read speeds (the all-seeks speed is about 6 MB/sec for me, I was getting 40-50 MB/sec). This was on much larger files though. --- On an empty partition, I created a deterministic pathological case. Lots of little files all separated by holes. ext3 (default mount) just allocated 4k blocks in a first-come, first-served manner. XFS apparently looked for larger allocation units as the file was larger than 4K. In that regard, it's similar to NT. I believe both use a form of B-Tree to manage free space. Which journal option was the filesystem mounted with ? --- I can't see how that would matter, but default. For speed of test, I mounted both with noatime,async; xfs also got nodiratime and logbufs=8 (or deletes take way too long). I actually implemented a workaround that calls fsutil file createnew FILESIZE to preallocate space and then write data in append mode (after doing seek 0). --- I wonder if it does the same thing as dd or if it uses the special call to tell the OS what to expect. FWIW, cp used some smallish number of blocks (4 or 8, I think), so it is almost guaranteed to give you about the worst possible fragmented file! :-) Most likely the other file utils will give similar allocation performance (not so good). -linda

#!/bin/bash
export debugfs=$(which debugfs)
if (($# < 1)); then echo "need filename" >&2; exit 1; fi
while [[ -n $1 ]]; do
	if [ ! -e "$1" ]; then
		echo "Name \"$1\" doesn't exist.  Ignoring" >&2
	else
		inode=$(ls -i "$1" | cut -d' ' -f1)
		echo -n "$1: "
		PAGER=cat $debugfs /dev/hdc1 -R "stat <$inode>" 2>/dev/null | fragfilt.ext
	fi
	shift
done
# vim:ts=4:sw=4:

#!/usr/bin/perl -w
my $lineno = 0;
my $frags  = 0;
while (<>) {
	++$lineno;
	chomp;
	/^ \( \d+ [^\)]* \) : \d+ [^,]*,/x && do {
		my @block_ranges;
		my @blocknums;
		my @this_range;
		s/\([^\)\n]+\)://g;             # strip the "(logical range):" prefixes
		s/,//g;
		@block_ranges = split / /;
		foreach (@block_ranges) {
			s/-/../;                    # turn "a-b" into the Perl range "a..b"
			@this_range = eval;
			push @blocknums, @this_range;
		}
		my $bn = $blocknums[0];
		print scalar(@blocknums), " blocks";
		$frags = 1;
		for ($i = 1; $i <= $#blocknums; ++$i) {
			my $nbn = $blocknums[$i];
			# print "bn=$bn, nbn=$nbn\n";
			if ($bn + 1 != $nbn) {
				# print "Hiccup, skip ($bn - $nbn)\n";
				++$frags;
			}
			$bn = $nbn;
		}
		last;   # next;
	};
}
if ($frags == 0) {
	print "No fragments in file (length = 0?)\n";
	exit 1;
}
if ($frags == 1) {
	print ", fully defragmented\n";
} else {
	print " in $frags fragments\n";
}
# vim:ts=4:sw=4
RE: NTFS fragmentation
-----Original Message----- From: Robert Pendell Sent: Saturday, August 05, 2006 7:50 PM To: Cygwin Mailing List Subject: Re: NTFS fragmentation Vladimir Dergachev wrote: Also, I tried the following experiment - found a 17 MB file in ibiblio.org and downloaded it with Firefox. The file ended up fragmented into more than 200 pieces. Tried the same file with IE - no fragmentation. It could be, of course, that Firefox is compiled with cygwin, but I have not found cygwin.dll anywhere in its installation directory. IE moves the files from your Temporary Internet Files to where your defined destination is once the download is complete. Firefox however skips that and just writes it to the destination. That is why you see the fragmentation in Firefox and not IE. The move that IE does isn't always noticeable. The box will only come up if it takes more than a few seconds, but occasionally you see it say "Moving FILE from Temporary Internet Files to DEST" (replacing FILE and DEST as appropriate). The message is probably different but you get the idea. -- Robert Pendell [EMAIL PROTECTED] Thawte Web of Trust Notary CAcert Assurer When you actually open the file using IE, it just downloads the file into Temporary Internet Files and opens it. But in Fx, it downloads it into the preset folder, but when it is just about to open, it moves the file into the temp folder (C:\Documents and Settings\cygwin\Local Settings\temp, replacing cygwin with your username), then opens it. --- Sure, Fx is compiled USING cygwin but not using GCC. The build is driven by a makefile (using make) and configure (autoconf-2.13), but the tools are Microsoft Visual C++ tools found in C:\Program Files\Microsoft Visual Studio\VC\bin\. Those tools are cl (compiler), link (linker), rc (resource compiler), and vcvars32.bat (environment setup). Charli
Re: NTFS fragmentation
From: Vladimir Dergachev [EMAIL PROTECTED] To: cygwin@cygwin.com Subject: Re: NTFS fragmentation Date: Thu, 3 Aug 2006 14:54:33 -0400 On Thursday 03 August 2006 2:37 pm, Dave Korn wrote: On 03 August 2006 18:50, Vladimir Dergachev wrote: On Thursday 03 August 2006 5:18 am, Dave Korn wrote: On 03 August 2006 00:46, Vladimir Dergachev wrote: Hi Vladimir, Please CC me - I am not on the list. Done :) I guess this means that sequential writes are officially broken on NTFS. Anyone has any idea for a workaround ? It would be nice if a simple tar zcvf a.tgz * does not result in a completely fragmented file. I can only think of one thing worth trying off the top of my head: what happens if you open a file (in non-sparse mode) and immediately seek to the file size, then seek back to the start and actually write the contents? Or perhaps after seeking to the end you'd need to write (at least) a single byte, then seek back to the beginning? I am not sure that I understand, if one creates the file and then seeks to +1G, wouldn't the file pointer be still at 0 as the filesize is 0 ? What I am thinking about is modifying cygwin's open and write calls so that they preallocate files in chunks of 10MB (configurable by an environment variable). This way we still get some fragmentation, but it would not be so bad - assuming 50MB/sec disk read speed reading 10MB will take 200ms, while a seek is at worst 20ms (usually around 10-15ms). best Vladimir Dergachev It turns out that to actually allocate the file blocks, you need to write some data. Seeking to the desired size doesn't (or didn't used to) actually allocate the intervening blocks. As Dave suggests, you need to seek to the end and actually write something to get the file blocks allocated. If you try this for a very large file (several Gigabytes), you had better be prepared to go and have a nice meal while you wait for the block allocation to complete. 
Windows' security policy requires that the blocks not only be allocated, but that they be written with data as well - ostensibly to prevent malicious code from reading old data it shouldn't have access to. Granted, there are better ways to do this - zero-fill on attempts to read from allocated but uninitialized file space or, at the very least, throw some kind of exception when an application attempts to read uninitialized file data. Since Windows supports sparse files, the basic mechanism is there somewhere. Windows doesn't (or didn't use to) allow preallocation of files without actually writing data UNLESS you know the proper incantation to prove you're a good guy (your application needs to do a dance to grant itself the SeManageVolumePrivilege privilege so it can issue the SetFileValidData call).
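The zero-fill-on-read behavior suggested above is in fact what POSIX filesystems do: a file extended without writing reads back as zeros, so stale disk data never leaks to applications. A quick demonstration of that contract (the filenames are made up for illustration):

```shell
# Extend a file to 64 KiB without writing any data, then verify that the
# uninitialized region reads back as zeros rather than leftover disk contents.
truncate -s 65536 sparse.dat
head -c 65536 /dev/zero > zeros.dat
cmp -s sparse.dat zeros.dat && echo "uninitialized region reads as zeros"
```

This is why preallocation on Unix-like systems doesn't need a privilege dance: the filesystem promises zeros for unwritten space, so reserving blocks can never expose deleted data.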
Re: NTFS fragmentation
Vladimir Dergachev wrote: Also, I tried the following experiment - found a 17 MB file in ibiblio.org and downloaded it with FireFox. The file ended up fragmented into more than 200 pieces. Tried the same file with IE - no fragmentation. It could be, of course, that Firefox is compiled with cygwin, but I have not found cygwin.dll anywhere in its installation directory. IE moves the files from your Temporary Internet Files to where your defined destination is once the download is complete. Firefox however skips that and just writes it to the destination. That is why you see the fragmentation in Firefox and not IE. The move that IE does isn't always noticeable. The box will only come up if it takes more than a few seconds but occasionally you see it say Moving FILE from Temporary Internet Files to DEST (replacing FILE and DEST as appropriate). The message is probably different but you get the idea. -- Robert Pendell [EMAIL PROTECTED] Thawte Web of Trust Notary CAcert Assurer
RE: NTFS fragmentation
On 03 August 2006 00:46, Vladimir Dergachev wrote: Hi Vladimir, Please CC me - I am not on the list. Done :) PS I'll try writing a C program when time permits - any suggestions on what API besides regular open/write/close to use ? I think you might want to go straight to ZwCreateFile in the native API, and see if you can pin the difference in behaviour down to some change in the flags passed to that call. Actually, maybe the most informative thing would be to look at the device IO controls sent by both testcases, using filemon or similar. cheers, DaveK -- Can't think of a witty .sigline today
Re: NTFS fragmentation
Vladimir Dergachev wrote: ... Also, I tried the following experiment - found a 17 MB file in ibiblio.org and downloaded it with FireFox. The file ended up fragmented into more than 200 pieces. Tried the same file with IE - no fragmentation. The difference is probably that IE initially creates the file with full size and then overwrites it. This is at least the case if you copy files with explorer, copy, xcopy or CopyFileEx(). FireFox, Cygwin's cp and most other programs use regular sequential write. This may lead to fragmentation when the disk is low on space. Christian
Re: NTFS fragmentation
On Thursday 03 August 2006 5:18 am, Dave Korn wrote: On 03 August 2006 00:46, Vladimir Dergachev wrote: Hi Vladimir, Please CC me - I am not on the list. Done :) Actually, maybe the most informative thing would be to look at the device IO controls sent by both testcases, using filemon or similar. Thank you for the suggestion ! I used filemon and discovered that all three programs (ntfs_test.tcl, Firefox and IE) use sequential access, but IE writes the file first to Temporary Internet Files folder and then copies it. If one runs analyze from defragmenter while IE is still downloading the file the file in the Temporary Internet Files folder is just as fragmented as other files. I guess this means that sequential writes are officially broken on NTFS. Anyone has any idea for a workaround ? It would be nice if a simple tar zcvf a.tgz * does not result in a completely fragmented file. thank you Vladimir Dergachev cheers, DaveK
Re: NTFS fragmentation
On Thursday 03 August 2006 5:35 am, Christian Franke wrote: Vladimir Dergachev wrote: ... Also, I tried the following experiment - found a 17 MB file in ibiblio.org and downloaded it with FireFox. The file ended up fragmented into more than 200 pieces. Tried the same file with IE - no fragmentation. The difference is probably that IE initially creates the file with full size and then overwrites it. This is at least the case if you copy files with explorer, copy, xcopy or CopyFileEx(). FireFox, Cygwin's cp and most other programs use regular sequential write. This may lead to fragmentation when the disk has less space. Well, what I see is that the file is completely fragmented on a 400 GB disk, which is 40% full and has been recently defragmented. By completely fragmented I mean as if each write ends up in its own separate place on disk and NTFS does not even check whether there is free space after the last written block. Honestly, FAT was more efficient - at least when written anew there was no fragmentation. best Vladimir Dergachev Christian
RE: NTFS fragmentation
On 03 August 2006 18:50, Vladimir Dergachev wrote: On Thursday 03 August 2006 5:18 am, Dave Korn wrote: On 03 August 2006 00:46, Vladimir Dergachev wrote: Hi Vladimir, Please CC me - I am not on the list. Done :) Actually, maybe the most informative thing would be to look at the device IO controls sent by both testcases, using filemon or similar. Thank you for the suggestion ! I used filemon and discovered that all three programs (ntfs_test.tcl, Firefox and IE) use sequential access, but IE writes the file first to Temporary Internet Files folder and then copies it. If one runs analyze from defragmenter while IE is still downloading the file the file in the Temporary Internet Files folder is just as fragmented as other files. I guess this means that sequential writes are officially broken on NTFS. Anyone has any idea for a workaround ? It would be nice if a simple tar zcvf a.tgz * does not result in a completely fragmented file. I can only think of one thing worth trying off the top of my head: what happens if you open a file (in non-sparse mode) and immediately seek to the file size, then seek back to the start and actually write the contents? Or perhaps after seeking to the end you'd need to write (at least) a single byte, then seek back to the beginning? cheers, DaveK -- Can't think of a witty .sigline today
Re: NTFS fragmentation
On Thursday 03 August 2006 2:37 pm, Dave Korn wrote: On 03 August 2006 18:50, Vladimir Dergachev wrote: On Thursday 03 August 2006 5:18 am, Dave Korn wrote: On 03 August 2006 00:46, Vladimir Dergachev wrote: Hi Vladimir, Please CC me - I am not on the list. Done :) I guess this means that sequential writes are officially broken on NTFS. Anyone has any idea for a workaround ? It would be nice if a simple tar zcvf a.tgz * does not result in a completely fragmented file. I can only think of one thing worth trying off the top of my head: what happens if you open a file (in non-sparse mode) and immediately seek to the file size, then seek back to the start and actually write the contents? Or perhaps after seeking to the end you'd need to write (at least) a single byte, then seek back to the beginning? I am not sure that I understand, if one creates the file and then seeks to +1G, wouldn't the file pointer be still at 0 as the filesize is 0 ? What I am thinking about is modifying cygwin's open and write calls so that they preallocate files in chunks of 10MB (configurable by an environment variable). This way we still get some fragmentation, but it would not be so bad - assuming 50MB/sec disk read speed reading 10MB will take 200ms, while a seek is at worst 20ms (usually around 10-15ms). best Vladimir Dergachev
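The chunked-preallocation idea above can be sketched in shell, purely for illustration — the actual proposal concerns Cygwin's internal open/write path, and the function name, chunk size, and files below are all made up. The idea: before each append, reserve a chunk of disk space past the current end of file (without changing the logical size), so that a run of small sequential writes lands in a pre-reserved extent instead of whatever block the allocator picks next.

```shell
CHUNK=$((1024*1024))   # 1 MiB here; the post proposes 10 MB, configurable by env var

# Append the contents of $2 to $1, first reserving CHUNK bytes of disk space
# past the current end of file (keep-size, so the logical length is unchanged).
append_prealloc() {
    size=$(stat -c %s "$1" 2>/dev/null || echo 0)
    fallocate -n -o "$size" -l "$CHUNK" "$1" 2>/dev/null || : # best effort
    cat "$2" >> "$1"
}

printf 'hello ' > part1.txt
printf 'world\n' > part2.txt
: > out.dat
append_prealloc out.dat part1.txt
append_prealloc out.dat part2.txt
cat out.dat
```

The `|| :` matters: not every filesystem supports keep-size allocation, and the append must still succeed when the reservation fails, which mirrors how a Cygwin-internal version would have to degrade gracefully.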
Re: NTFS fragmentation
On Aug 3 14:54, Vladimir Dergachev wrote: On Thursday 03 August 2006 2:37 pm, Dave Korn wrote: What I am thinking about is modifying cygwin's open and write calls so that they preallocate files in chunks of 10MB (configurable by an environment variable). No. Corinna -- Corinna Vinschen Please, send mails regarding Cygwin to Cygwin Project Co-Leader cygwin AT cygwin DOT com Red Hat
RE: NTFS fragmentation
[cc'ing you per your request] From: Vladimir Dergachev Sent: Wednesday, August 02, 2006 5:33 PM Subject: NTFS fragmentation Hi all, I have encountered a rather puzzling fragmentation that occurs when writing files using Cygwin. What happens is that if one creates a new file and writes data to it (whether via a command line redirect or with a Tcl script - have not tried C yet) the file ends up heavily fragmented. In contrast, native Windows utilities do not exhibit this issue. Someone suggested to me that Windows requires an expected file length to be passed at the time of open, thus I searched on Google and found fsutil program that allows to reserve space on the filesystem. I attached a small Tcl script that, when run, creates two 30 MB files - one using regular open/write pair (and which is fragmented into about 300 pieces on my system) and one using fsutil/open in append mode/seek 0 method. To see the problem defragment your system, run the test script and then run analyze and ask to view report. You will see a.dat at top of the list, while b.dat never appears in the report. Despite the workaround, it is still kinda hard for me to believe that anyone has designed a filesystem that needs to know what is the file size going to be - especially for a single program writing on an almost empty disk. Perhaps there is some sort of environment variable that I need to set ? Any suggestions and comments would be greatly appreciated. Please CC me - I am not on the list. thank you very much Vladimir Dergachev I'll try your test case when I get a chance, but my WAG is that you're seeing the effects of Cygwin's creation of sparse files by default for any file beyond a certain size. I unfortunately do not recall what that size is. What happens as you change FILE_SIZE and/or BUFFER_SIZE in your script, to maybe a small multiple of your cluster size? -- Gary R. 
Van Sickle
Re: NTFS fragmentation
Gary R. Van Sickle wrote: [cc'ing you per your request] From: Vladimir Dergachev Sent: Wednesday, August 02, 2006 5:33 PM Subject: NTFS fragmentation Hi all, I have encountered a rather puzzling fragmentation that occurs when writing files using Cygwin. What happens is that if one creates a new file and writes data to it (whether via a command line redirect or with a Tcl script - have not tried C yet) the file ends up heavily fragmented. In contrast, native Windows utilities do not exhibit this issue. Someone suggested to me that Windows requires an expected file length to be passed at the time of open, thus I searched on Google and found fsutil program that allows to reserve space on the filesystem. I attached a small Tcl script that, when run, creates two 30 MB files - one using regular open/write pair (and which is fragmented into about 300 pieces on my system) and one using fsutil/open in append mode/seek 0 method. To see the problem defragment your system, run the test script and then run analyze and ask to view report. You will see a.dat at top of the list, while b.dat never appears in the report. Despite the workaround, it is still kinda hard for me to believe that anyone has designed a filesystem that needs to know what is the file size going to be - especially for a single program writing on an almost empty disk. Perhaps there is some sort of environment variable that I need to set ? Any suggestions and comments would be greatly appreciated. Please CC me - I am not on the list. thank you very much Vladimir Dergachev I'll try your test case when I get a chance, but my WAG is that you're seeing the effects of Cygwin's creation of sparse files by default for any file beyond a certain size. I unfortunately do not recall what that size is. What happens as you change FILE_SIZE and/or BUFFER_SIZE in your script, to maybe a small multiple of your cluster size? This clicked with me as well. 
I was thinking first though to try it with straight C, to remove the possibility of some TCL pollution. Otherwise a quick Google of "sparse" for cygwin dot com turns up lots of relevant hits, utilities, and technical details. Should be pretty easy to determine if the resulting files are sparse or not with this info. -- Larry Hall http://www.rfk.com RFK Partners, Inc. (508) 893-9779 - RFK Office 216 Dalton Rd. (508) 893-9889 - FAX Holliston, MA 01746
Re: NTFS fragmentation
Hi Gary and Larry,

Thank you for your comments; replies below.

On Wednesday 02 August 2006 7:08 pm, you wrote:
> I'll try your test case when I get a chance, but my WAG is that you're
> seeing the effects of Cygwin's creation of sparse files by default for
> any file beyond a certain size. I unfortunately do not recall what
> that size is. What happens as you change FILE_SIZE and/or BUFFER_SIZE
> in your script, to maybe a small multiple of your cluster size?

I tried a BUFFER_SIZE of 10K, 100K, 1M and 10M - no big difference,
except a small decrease in the number of fragments for the 10M value,
which could be noise. I also tried a smaller FILE_SIZE of 3M; the
number of fragments decreased to 33, roughly proportionally to size.
Unfortunately, I do not know what the cluster size is.

With regard to sparse files, the intent here is to open a file, write
data to it, and then close it. No seeks are involved, much less void
regions. I do understand that internally Cygwin could be doing
something different. I have not found a utility to identify a sparse
file yet - if you happen to have a link, I would greatly appreciate it.

Also, I tried the following experiment: I found a 17 MB file on
ibiblio.org and downloaded it with Firefox. The file ended up
fragmented into more than 200 pieces. I tried the same file with IE -
no fragmentation. It could be, of course, that Firefox is compiled
with Cygwin, but I have not found cygwin1.dll anywhere in its
installation directory.

thank you

Vladimir Dergachev

PS I'll try writing a C program when time permits - any suggestions on
what API to use besides regular open/write/close?
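On the PS about which API to use: for the plain POSIX version, regular open/write/close is exactly right, and a C translation of the Tcl script needs nothing more exotic. A minimal sketch - the function name and parameters are illustrative, not from the attached script, though FILE_SIZE and BUFFER_SIZE are meant to mirror the script's knobs:

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create 'path' and write 'file_size' bytes of dummy data in
 * 'buf_size'-byte chunks, mirroring what the Tcl test script does.
 * Returns 0 on success, -1 on error. */
static int write_test_file(const char *path, size_t file_size,
                           size_t buf_size)
{
    char buf[64 * 1024];
    if (buf_size > sizeof buf)
        buf_size = sizeof buf;      /* cap chunk size for a stack buffer */
    memset(buf, 'x', buf_size);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t remaining = file_size;
    while (remaining > 0) {
        size_t chunk = remaining < buf_size ? remaining : buf_size;
        if (write(fd, buf, chunk) != (ssize_t)chunk) {
            close(fd);
            return -1;
        }
        remaining -= chunk;
    }
    return close(fd);
}
```

Compiled once with Cygwin's gcc and once with MinGW (or -mno-cygwin), the same source would isolate whether the fragmentation follows the Cygwin DLL or Windows itself.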
Re: NTFS fragmentation
Vladimir Dergachev wrote:

> Also, I tried the following experiment: I found a 17 MB file on
> ibiblio.org and downloaded it with Firefox. The file ended up
> fragmented into more than 200 pieces. I tried the same file with IE -
> no fragmentation. It could be, of course, that Firefox is compiled
> with Cygwin, but I have not found cygwin1.dll anywhere in its
> installation directory.

If you pulled it from mozilla.org, it ain't Cygwin-based. That would
point to a more general, non-Cygwin problem.

> PS I'll try writing a C program when time permits - any suggestions on
> what API to use besides regular open/write/close?

I would recommend making a POSIX API version and a straight Win32
version. But if what you said about Firefox is true, you should see a
similar problem even using MinGW (www.mingw.org) or '-mno-cygwin'.
Again, that would point to this being a non-Cygwin problem, though
still quite an annoying one.

--
Larry Hall                              http://www.rfk.com
RFK Partners, Inc.                      (508) 893-9779 - RFK Office
216 Dalton Rd.                          (508) 893-9889 - FAX
Holliston, MA 01746
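For the POSIX side of Larry's suggested comparison, there is even a rough analog of the fsutil trick from the original post: posix_fallocate() tells the filesystem the file's final length before any data is written, which is exactly the hint the original post wished open() could take. A hedged sketch - this is my own illustration, not code from the thread, and whether Cygwin of that era supported posix_fallocate is an open question:

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Like a plain write loop, but first reserve the file's full extent so
 * the filesystem can try to allocate it contiguously - the POSIX
 * analog of the fsutil workaround. Returns 0 on success, -1 on error. */
static int write_preallocated(const char *path, off_t file_size)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Reserve the extent up front; unlike ftruncate(), this allocates
     * real blocks rather than creating a sparse file. */
    if (posix_fallocate(fd, 0, file_size) != 0) {
        close(fd);
        return -1;
    }

    char buf[4096];
    memset(buf, 'x', sizeof buf);
    for (off_t done = 0; done < file_size; done += (off_t)sizeof buf) {
        size_t chunk = sizeof buf;
        if ((off_t)chunk > file_size - done)
            chunk = (size_t)(file_size - done);
        if (write(fd, buf, chunk) != (ssize_t)chunk) {
            close(fd);
            return -1;
        }
    }
    return close(fd);
}
```

On the Win32 side the equivalent would be CreateFile followed by SetFilePointer/SetEndOfFile before writing, which is roughly what the fsutil-based half of the Tcl script achieves externally.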
Re: NTFS fragmentation
On Wed, Aug 02, 2006 at 09:11:03PM -0400, Larry Hall (Cygwin) wrote:
> If you pulled it from mozilla.org, it ain't Cygwin-based. That would
> point to a more general, non-Cygwin problem.

Especially since tclsh.exe is just barely a Cygwin program, and I
wouldn't be surprised if it didn't even use Cygwin's open or write
functions.

cgf
Re: NTFS fragmentation
Christopher Faylor wrote:
> Especially since tclsh.exe is just barely a Cygwin program, and I
> wouldn't be surprised if it didn't even use Cygwin's open or write
> functions.

Another good point...

--
Larry Hall                              http://www.rfk.com
RFK Partners, Inc.                      (508) 893-9779 - RFK Office
216 Dalton Rd.                          (508) 893-9889 - FAX
Holliston, MA 01746