Re: [PERFORM] fsync vs open_sync

2004-09-09 Thread Mark Wong
On Sun, Sep 05, 2004 at 12:16:42AM -0500, Steve Bergman wrote:
> On Sat, 2004-09-04 at 23:47 -0400, Christopher Browne wrote:
> > The world rejoiced as [EMAIL PROTECTED] ("Merlin Moncure") wrote:
> > > Ok, you were right.  I made some tests and NTFS is just not very
> > > good in the general case.  I've seen some benchmarks for Reiser4
> > > that are just amazing.
> > 
> > Reiser4 has been sounding real interesting.
> > 
> 
> Are these independent benchmarks, or the benchmarketing at namesys.com?
> Note that the APPEND, MODIFY, and OVERWRITE phases have been turned off
> on the mongo tests and the other tests have been set to a lexical (non
> default for mongo) mode.  I've done some mongo benchmarking myself and
> reiser4 loses to ext3 (data=ordered) in the excluded tests.  APPEND
> phase performance is absolutely *horrible*.  So they just turned off the
> phases in which reiser4 lost and published the remaining results as
> proof that "reiser4 is the fastest filesystem".
> 
> See: http://marc.theaimsgroup.com/?l=reiserfs&m=109363302000856
> 
> 
> -Steve Bergman
> 
> 
> 

Reiser4 also isn't optimized for lots of fsyncs (unless that's been done
recently).  I believe they mention fsync performance in their release
notes.  I've seen this dramatically hurt performance with our OLTP
workload.

-- 
Mark Wong - - [EMAIL PROTECTED]
Open Source Development Lab Inc - A non-profit corporation
12725 SW Millikan Way - Suite 400 - Beaverton, OR 97005
(503) 626-2455 x 32 (office)
(503) 626-2436  (fax)
http://developer.osdl.org/markw/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] fsync vs open_sync

2004-09-05 Thread Pierre-Frédéric Caillaud
I trust ReiserFS 3.
I wouldn't trust version 4 for another year or two.

On Sun, 05 Sep 2004 07:41:29 -0400, Geoffrey <[EMAIL PROTECTED]> wrote:
> Christopher Browne wrote:
> > I'm not sure what all SuSE supports; they're about the only other Linux
> > vendor that EMC would support, and I don't expect that Reiser4 yet
> > fits into the "supportable" category :-(.
>
> I use quite a bit of SuSE, and although I don't know their official
> position on Reiser file systems, I do know that it is the default when
> installing, so I'd suggest you might check into it.





Re: [PERFORM] fsync vs open_sync

2004-09-05 Thread Pierre-Frédéric Caillaud
Were you upset by my message? I'll try to clarify.

> I understood from your email that you are a Windows hater

	Well, no, not really. I use Windows every day and it has its strengths. I
still don't think the average (non-geek) person can really use Linux as a
desktop OS. The problem I have with Windows is that I think it could be
made much faster without too much effort (mainly some tweaking in the
disk I/O field), but Microsoft doesn't do it. Why? I can't understand this.

> > in Linux.  You can write 1 files in one second and the HDD is still
> > idle... then when it decides to flush it all goes to disk in one burst.
>
> You can not trust your data in this.

	That's why I mentioned that it did not relate to database-type
performance. If the computer crashes while writing these files, some may
be partially written, some not at all, some okay... the only certainty is
about filesystem integrity. But it's exactly the same on all journaling
filesystems (including NTFS). Thus, with equal reliability, the faster
wins. Maybe, with Reiser4, we will see real filesystem transactions, and
maybe this will translate into higher postgres performance...

> > I've had my computers shut down violently by power failures and no
> > reiserfs problems so far. NTFS is very crash proof too. My Windows
> > machine bluescreens twice a day and still no data loss ;)
>
> If you have the BSOD twice a day then you have a broken driver or broken
> HW. CPU overclocked ?

	I think this machine has crap hardware. In fact this example was meant to
emphasize the reliability of NTFS: it is indeed remarkable that no data
loss occurs even on such a crap machine. I know Windows has become quite
reliable now.





Re: [PERFORM] fsync vs open_sync

2004-09-05 Thread Geoffrey
Christopher Browne wrote:
> I'm not sure what all SuSE supports; they're about the only other Linux
> vendor that EMC would support, and I don't expect that Reiser4 yet
> fits into the "supportable" category :-(.

I use quite a bit of SuSE, and although I don't know their official
position on Reiser file systems, I do know that it is the default when
installing, so I'd suggest you might check into it.

--
Until later, Geoffrey   Registered Linux User #108567
AT&T Certified UNIX System Programmer - 1995


Re: [PERFORM] fsync vs open_sync

2004-09-04 Thread Steve Bergman
On Sat, 2004-09-04 at 23:47 -0400, Christopher Browne wrote:
> The world rejoiced as [EMAIL PROTECTED] ("Merlin Moncure") wrote:
> > Ok, you were right.  I made some tests and NTFS is just not very
> > good in the general case.  I've seen some benchmarks for Reiser4
> > that are just amazing.
> 
> Reiser4 has been sounding real interesting.
> 

Are these independent benchmarks, or the benchmarketing at namesys.com?
Note that the APPEND, MODIFY, and OVERWRITE phases have been turned off
on the mongo tests and the other tests have been set to a lexical (non
default for mongo) mode.  I've done some mongo benchmarking myself and
reiser4 loses to ext3 (data=ordered) in the excluded tests.  APPEND
phase performance is absolutely *horrible*.  So they just turned off the
phases in which reiser4 lost and published the remaining results as
proof that "reiser4 is the fastest filesystem".

See: http://marc.theaimsgroup.com/?l=reiserfs&m=109363302000856


-Steve Bergman





Re: [PERFORM] fsync vs open_sync

2004-09-04 Thread Christopher Browne
The world rejoiced as [EMAIL PROTECTED] ("Merlin Moncure") wrote:
> Ok, you were right.  I made some tests and NTFS is just not very
> good in the general case.  I've seen some benchmarks for Reiser4
> that are just amazing.

Reiser4 has been sounding real interesting.

The killer problem is thus:

  "We must caution that just as Linux 2.6 is not yet as stable as
  Linux 2.4, it will also be some substantial time before V4 is as
  stable as V3."

In practice, there's a further problem.

We have some systems at work we need to connect to EMC disk arrays;
that's something that isn't supported by EMC unless you're using a
whole set of pieces that are "officially supported."

RHAT doesn't want to talk to you about support for anything other than
ext3.

I'm not sure what all SuSE supports; they're about the only other Linux
vendor that EMC would support, and I don't expect that Reiser4 yet
fits into the "supportable" category :-(.

The upshot is that we'd only consider using stuff like Reiser4 on "toy"
systems, and, quite frankly, that means they'll have "toy" disk as
opposed to the good stuff :-(.

And frankly, we're too busy with issues nearer to our hearts than
testing out ReiserFS.  :-(
-- 
output = ("cbbrowne" "@" "cbbrowne.com")
http://cbbrowne.com/info/emacs.html
"Linux!  Guerrilla Unix Development Venimus, Vidimus, Dolavimus."
-- <[EMAIL PROTECTED]> Mark A. Horton KA4YBR



Re: [PERFORM] fsync vs open_sync

2004-09-04 Thread Cott Lang
Another possibly useless datapoint on this thread for anyone who's
curious ... open_sync absolutely stinks over NFS at least on Linux. :)







Re: [PERFORM] fsync vs open_sync

2004-09-04 Thread Gaetano Mendola
Pierre-Frédéric Caillaud wrote:
> 22 KB files, 1000 of them :
> open(), read(), close() : 10.000 files/s
> open(), write(), close() : 4.000 files/s
>
> This is quite far from database FS activity, but it's still
> amazing, although the disk doesn't even get used. Which is what I like
> in Linux.  You can write 1 files in one second and the HDD is still
> idle... then when it decides to flush it all goes to disk in one burst.

You can not trust your data in this.

> I've had my computers shut down violently by power failures and no
> reiserfs problems so far. NTFS is very crash proof too. My windows
> machine bluescreens twice a day and still no data loss ;)

If you have the BSOD twice a day then you have a broken driver or broken
HW. CPU overclocked ?

I understood from your email that you are a Windows hater, try to post
something here:
http://ihatelinux.blogspot.com/
:-)

Regards
Gaetano Mendola


Re: [PERFORM] fsync vs open_sync

2004-09-03 Thread Pierre-Frédéric Caillaud
> > There is also the fact that NTFS is a very slow filesystem, and
> > Linux is a lot better than Windows for everything disk, caching and
> > IO related. Try to copy some files in NTFS and in ReiserFS...
>
> I'm not so sure I would agree with such a blanket generalization.  I
> find NTFS to be very fast, my main complaint is fragmentation issues...
> I bet NTFS is better than ext3 at most things (I do agree with you
> about the cache, though).
>
> Ok, you were right.  I made some tests and NTFS is just not very good in
> the general case.  I've seen some benchmarks for Reiser4 that are just
> amazing.

	As a matter of fact I was again amazed today.
	I was looking into a way to cache database queries for a website (not  
yet) written in Python. The purpose was to cache long queries like those  
used to render forum pages (which is the typical slow query, selecting  
from a big table where records are rather random and LIMIT is used to cut  
the result in pages).
	I wanted to save a serialized (python pickled) representation of the data  
to disk to avoid reissuing the query every time.
	In the end it took about 1 ms to load or save the data for a page with 40  
posts... then I wondered, how much does it take just to read or write the  
file ?
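
A minimal sketch of the kind of pickle-on-disk query cache described above. The poster's actual code is not shown in the thread, so the function names, the cache key format, and the choice to hash keys into file names are all assumptions made for illustration.

```python
import hashlib
import os
import pickle
import tempfile


def cache_path(cache_dir, key):
    """Map an arbitrary cache key to a file name (hypothetical naming scheme)."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".pickle")


def load_or_compute(cache_dir, key, compute):
    """Return the cached value for key, running compute() only on a miss."""
    path = cache_path(cache_dir, key)
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        pass  # cache miss: fall through and run the slow query

    value = compute()
    # Write to a temporary file and rename it into place, so a concurrent
    # reader never sees a half-written pickle.
    fd, tmp = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(value, f, protocol=pickle.HIGHEST_PROTOCOL)
    os.replace(tmp, path)
    return value


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Stand-in for the slow forum-page query: 40 fake posts.
        slow_query = lambda: [{"id": i, "body": "post %d" % i} for i in range(40)]
        page = load_or_compute(d, "forum:1:page:1", slow_query)
        print(len(page), "posts loaded")
```

Note that nothing here calls fsync(), so a freshly written entry may live only in the OS page cache for a while. For a cache that can always be rebuilt from the database, that is usually acceptable.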

ReiserFS 3.6, Athlon XP 2.5G+, 512Mb DDR400
7200 RPM IDE Drive with 8MB Cache
This would be considered a very underpowered server...
22 KB files, 1000 of them :
open(), read(), close() : 10.000 files/s
open(), write(), close() : 4.000 files/s
	This is quite far from database FS activity, but it's still amazing,  
although the disk doesn't even get used. Which is what I like in Linux.  
You can write 1 files in one second and the HDD is still idle... then  
when it decides to flush it all goes to disk in one burst.
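
The measurement above can be approximated with a short script. This is a sketch under assumptions, not the poster's benchmark: the file names, the random payload, and the use of a temporary directory are invented here, and only the file size (22 KB) and count (1000) come from the post. Like the original test, it never calls fsync(), so it mostly measures the page cache and not the disk.

```python
import os
import tempfile
import time


def bench_small_files(n_files=1000, size=22 * 1024):
    """Time open/write/close and open/read/close over n_files small files."""
    payload = os.urandom(size)
    with tempfile.TemporaryDirectory() as d:
        paths = [os.path.join(d, "f%05d.bin" % i) for i in range(n_files)]

        start = time.perf_counter()
        for p in paths:
            with open(p, "wb") as f:   # open(), write(), close()
                f.write(payload)
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        total = 0
        for p in paths:
            with open(p, "rb") as f:   # open(), read(), close()
                total += len(f.read())
        read_s = time.perf_counter() - start

    return {
        "write_files_per_s": n_files / write_s,
        "read_files_per_s": n_files / read_s,
        "bytes_read": total,
    }


if __name__ == "__main__":
    for name, value in bench_small_files().items():
        print(name, round(value))
```

Results depend heavily on available RAM and on whether the files were just written, since reads that follow writes are served from cache.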

	I did make benchmarks some time ago and found that what sets Linux apart  
from Windows in terms of filesystems is :
	- very high performance filesystems like ReiserFS
	This is the obvious part; although with a huge amount of data in
small files accessed randomly, ReiserFS is faster but not 10x, maybe
something like 2x NTFS. I trust Reiser4 to offer better performance, but
not right now. Also ReiserFS lacks a defragmenter, and it gets slower
after 1-2 years (compared to 1-2 weeks with NTFS this is still not that
bad, but I'd like to defragment and I can't). Reiser4 will apparently fix
that with a background defragger etc.

	- caching.
	Linux disk caching is amazing. When copying a large file to the same disk
on Windows, the drive head seeks back and forth a lot, as if the OS can't
decide between reading and writing. Linux, on the other hand, reads and
writes in large chunks and loses a lot less time seeking. Even when reading
two files at the same time, Linux reads ahead in large chunks (very little
performance loss) whereas Windows seeks a lot. Read-ahead and write-back
thus make it a lot faster than 2x NTFS for everyday tasks like copying
files, backing up, making archives, grepping, serving files, etc...
	My windows box was able to saturate a 100Mbps ethernet while serving one  
large FTP file on the LAN (not that impressive, it's only 10 MB/s hey!).  
However, when several simultaneous clients were trying to download  
different files which were not in the disk cache, all hell broke loose :  
lots of seeking, and bandwidth dropped to 30 Mbits/s. Not enough  
read-ahead...
	The Linux box, serving FTP, with half the RAM (256 MB), had no problem
pushing the 100 Mbits/s with something like 10 simultaneous connections.  
The amusing part is that I could not use the Windows box to test it  
because it would choke at such a "high" IO concurrency (writing 10  
MBytes/s to several files at once, my god).
	Of course the files which had been downloaded to the Windows box were cut  
in as many fragments as the number of disk seeks during the download...  
several hundred fragments each... my god...

	What amazes me is that it must just be some parameter somewhere and the  
Microsoft guys probably could have easily changed the read-ahead  
thresholds and time between seeks when in a multitasking environment, but  
they didn't. Why ?

	Thus people are forced to buy 1RPM SCSI drives for their LAN servers  
when an IDE raid, used with Linux, could push nearly a Gigabit...

	For database, this is different, as we're concerned about large files,  
and fsync() times... but it seems reiserfs still wins over ext3 so...

	About NTFS vs EXT3 : ext3 dies if you put a lot of files in the same  
directory. It's fast but still outperformed by reiser.

	I saw XFS fry eight 7 harddisk RAID bays. The computer was rebooted with  
the Reset button a few times because a faulty SCSI cable in the eighth  
RAID bay was making it hang. The 7 bays had no problem. When it went back  
up, all the bays were in mayhem. XFSrepair just vomited over itself and we  
got plenty of files with random data in them. Fortunately there was a  
ca

Re: [PERFORM] fsync vs open_sync

2004-09-03 Thread Merlin Moncure
> > There is also the fact that NTFS is a very slow filesystem, and
> > Linux is a lot better than Windows for everything disk, caching and
> > IO related. Try to copy some files in NTFS and in ReiserFS...
>
> I'm not so sure I would agree with such a blanket generalization.  I find
> NTFS to be very fast, my main complaint is fragmentation issues...I bet
> NTFS is better than ext3 at most things (I do agree with you about the
> cache, though).

Ok, you were right.  I made some tests and NTFS is just not very good in the general 
case.  I've seen some benchmarks for Reiser4 that are just amazing.

Merlin



Re: [PERFORM] fsync vs open_sync

2004-08-13 Thread Merlin Moncure
> There is also the fact that NTFS is a very slow filesystem, and
> Linux is a lot better than Windows for everything disk, caching and
> IO related. Try to copy some files in NTFS and in ReiserFS...

I'm not so sure I would agree with such a blanket generalization.  I find NTFS to be
very fast, my main complaint is fragmentation issues...I bet NTFS is better than ext3
at most things (I do agree with you about the cache, though).

I think in a very general sense the open source stuff is higher quality but Microsoft
benefits from a very tight vertical integration of the system.  They added
ReadFileScatter and WriteFileGather to the win32 api specifically to make SQL Server
run faster and SQL server is indeed very, very good at i/o.

SQL Server keeps a one file database with blocks collected and written asynchronously. 
 It's a very tight system because they have control over every layer of the system.

Know your enemy.

That said, I think transaction based file I/O is 'the way' and, if implemented on
Reiser4, would be a faster I/O methodology than what is offered on windows/ntfs.

Merlin



Re: [PERFORM] fsync vs open_sync

2004-08-13 Thread Pierre-Frédéric Caillaud

> What caught my attention initially was the 300+/sec insert performance.
> On 8.0/NTFS/fsync=on, I can't break 100/sec on a 10k rpm ATA disk.  My
> hardware seems to be more or less in the same league as psql's, so I was
> naturally curious if this was a NT/Unix issue, a 7.4/8.0 issue, or a
> combination of both.

	There is also the fact that NTFS is a very slow filesystem, and Linux is
a lot better than Windows for everything disk, caching and IO related. Try
to copy some files in NTFS and in ReiserFS...



Re: [PERFORM] fsync vs open_sync

2004-08-13 Thread pgsql
>> OSDL did some testing and found Ext3 to be perhaps the worst FS for
>> PostgreSQL -- although this testing was with the default options.  Ext3
>> involved an almost 40% write performance penalty compared with Ext2,
>> whereas the penalty for ReiserFS and JFS was less than 10%.
>>
>> This concurs with my personal experience.
>
> I'm really curious to see if you guys have compared insert performance
> results between 7.4 and 8.0.  As you probably know the system sync()
> call was replaced with a looping fsync on open file handles.  This may
> have some interesting interactions with the WAL sync method.
>
> What caught my attention initially was the 300+/sec insert performance.
> On 8.0/NTFS/fsync=on, I can't break 100/sec on a 10k rpm ATA disk.  My
> hardware seems to be more or less in the same league as psql's, so I was
> naturally curious if this was a NT/Unix issue, a 7.4/8.0 issue, or a
> combination of both.

The system on which I can get 300 inserts per second is a battery backed
up XEON system with 512M RAM, a Promise PDC DMA ATA card, and some fast
disks with write caching enabled.

(We are not worried about write caching because we have a UPS. Since all
non-redundant systems are evaluated on probability of error, we decided
that a combined power failure and UPS failure was sufficiently less
likely than a system crash with file system corruption or a hard disk
failure.)
>
> A 5ms seek time disk would be limited to 200 transaction commits/sec if
> each transaction commit has at least 1 seek.  Are there some
> circumstances where a transaction commit does not generate a physical
> seek?
>
> Maybe ext3 is not the worst filesystem after all!
>
> Merlin
>




Re: [PERFORM] fsync vs open_sync

2004-08-11 Thread Merlin Moncure
> OSDL did some testing and found Ext3 to be perhaps the worst FS for
> PostgreSQL -- although this testing was with the default options.  Ext3
> involved an almost 40% write performance penalty compared with Ext2,
> whereas the penalty for ReiserFS and JFS was less than 10%.
>
> This concurs with my personal experience.

I'm really curious to see if you guys have compared insert performance
results between 7.4 and 8.0.  As you probably know the system sync()
call was replaced with a looping fsync on open file handles.  This may
have some interesting interactions with the WAL sync method.
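
For readers unfamiliar with the two strategies in the subject line, here is a rough sketch of what "fsync" and "open_sync" mean at the system-call level. It is not PostgreSQL's code: the 8 kB block size, the record count, and the file names are assumptions chosen for illustration, and `os.O_SYNC` is not available on every platform (the sketch falls back to no flag where it is missing, which makes that run meaningless as a timing).

```python
import os
import tempfile
import time

# O_SYNC is not defined on every platform; fall back to 0 (no flag) there.
O_SYNC = getattr(os, "O_SYNC", 0)


def write_then_fsync(path, block, count):
    """'fsync' style: ordinary write(), then an explicit fsync() per record."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def write_open_sync(path, block, count):
    """'open_sync' style: file opened with O_SYNC, so each write() is synchronous."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_SYNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def compare(count=50, block_size=8192):
    """Return elapsed seconds for both strategies on freshly created files."""
    block = b"\0" * block_size
    with tempfile.TemporaryDirectory() as d:
        return {
            "fsync": write_then_fsync(os.path.join(d, "a.log"), block, count),
            "open_sync": write_open_sync(os.path.join(d, "b.log"), block, count),
        }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print("%-10s %.4f s" % (name, seconds))
```

Which one wins depends on the filesystem, the kernel, and whether the drive's write cache is enabled, which is consistent with the widely varying numbers reported in this thread.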

What caught my attention initially was the 300+/sec insert performance.
On 8.0/NTFS/fsync=on, I can't break 100/sec on a 10k rpm ATA disk.  My
hardware seems to be more or less in the same league as psql's, so I was
naturally curious if this was a NT/Unix issue, a 7.4/8.0 issue, or a
combination of both.

A 5ms seek time disk would be limited to 200 transaction commits/sec if
each transaction commit has at least 1 seek.  Are there some
circumstances where a transaction commit does not generate a physical
seek?  
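
The arithmetic behind that limit, as a small sketch. The function names are made up for illustration; the 5 ms and 300/sec figures are the ones quoted in this thread.

```python
def max_commits_per_second(seek_time_ms, seeks_per_commit=1):
    """Upper bound on commits/sec if every commit pays for physical seeks."""
    return 1000.0 / (seek_time_ms * seeks_per_commit)


def implied_latency_ms(commits_per_second):
    """Average time available per commit at a given observed commit rate."""
    return 1000.0 / commits_per_second


if __name__ == "__main__":
    print(max_commits_per_second(5.0))        # 200.0
    print(round(implied_latency_ms(300), 2))  # 3.33
```

An observed 300 inserts/sec leaves about 3.33 ms per commit, which is less than one 5 ms seek. So at least some commits must be completing without a physical seek, for example because a write cache acknowledges them early or because several commits share one flush.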

Maybe ext3 is not the worst filesystem after all!

Merlin
