Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-10-06 Thread Curt Sampson

On 26 Sep 2002, Neil Conway wrote:

 The fact that ext2 defaults to asynchronous mode and UFS (at least on
 the BSDs) defaults to synchronous mode seems like a total non-issue to
 me. Is there any more to the alleged difference in reliability?

It was sort of pointed out here, but perhaps not made completely
clear, that Berkley FFS defaults to synchronous meta-data updates,
but asynchronous data updates. You can also specify entirely
synchronous or entirely asynchronous updates. Linux ext2fs supports
only these last two modes, which is the problem.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-10-03 Thread Mike Benoit

Some of you may be interested in this seemingly exhaustive benchmark
between ext2/3, ReiserFS, JFS, and XFS.

http://www.osdl.org/presentations/lwe-jgfs.pdf



---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-10-03 Thread Greg Copeland

Hey, excellent.  Thanks!

Based on that, it appears that XFS is a pretty good FS to use.  For me,
the real surprise was how well reiserfs performed.

Greg

On Thu, 2002-10-03 at 18:09, Mike Benoit wrote:
 Some of you may be interested in this seemingly exhaustive benchmark
 between ext2/3, ReiserFS, JFS, and XFS.
 
 http://www.osdl.org/presentations/lwe-jgfs.pdf
 
 
 
 ---(end of broadcast)---
 TIP 6: Have you searched our list archives?
 
 http://archives.postgresql.org




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-10-03 Thread Bruce Momjian

Greg Copeland wrote:
-- Start of PGP signed section.
 Hey, excellent.  Thanks!
 
 Based on that, it appears that XFS is a pretty good FS to use.  For me,
 the real surprise was how well reiserfs performed.
 

OK, hardware performance paper updated:

---

File system choice is particularly difficult on Linux because there are
so many file system choices, and none of them are optimal: ext2 is not
entirely crash-safe, ext3, xfs, and jfs are journal-based, and Reiser is
optimized for small files and does journalling. The journalling file
systems can be significantly slower than ext2 but when crash recovery is
required, ext2 isn't an option. If ext2 must be used, mount it with sync
enabled. Some people recommend xfs or an ext3 filesystem mounted with
data=writeback.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-28 Thread James Maes

Has there been any thought of providing RAW disk support to bypass the fs?

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]]On Behalf Of Bruce Momjian
Sent: Thursday, September 26, 2002 3:57 PM
To: Neil Conway
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]
Subject: Re: [HACKERS] [GENERAL] Performance while loading data and
indexing


Neil Conway wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  The paper does recommend ext3, but the differences between file systems
  are very small.

 Well, I only did a very rough benchmark (a few runs of pgbench), but
 the results I found were drastically different: ext2 was significantly
 faster (~50%) than ext3-writeback, which was in turn significantly
 faster (~25%) than ext3-ordered.

  Also, though ext3 is slower, turning fsync off should make ext3 function
  similar to ext2.

 Why would that be?

OK, I changed the text to:

File system choice is particularly difficult on Linux because there are
so many file system choices, and none of them are optimal: ext2 is not
entirely crash-safe, ext3, xfs, and jfs are journal-based, and Reiser is
optimized for small files and does journalling. The journalling file
systems can be significantly slower than ext2 but when crash recovery is
required, ext2 isn't an option.

--
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-27 Thread Florian Weimer

Tom Lane [EMAIL PROTECTED] writes:

 We'd be happiest with a filesystem that journals its own metadata and
 not the user data in the file(s).  I dunno if there are any.

Most journalling file systems work this way.  Data journalling is not
very widespread, AFAIK.

-- 
Florian Weimer[EMAIL PROTECTED]
University of Stuttgart   http://CERT.Uni-Stuttgart.DE/people/fw/
RUS-CERT  fax +49-711-685-5898

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Martijn van Oosterhout

I'll preface this by saying that while I have a large database, it doesn't
require quite the performace you're talking about here.

On Thu, Sep 26, 2002 at 02:05:44PM +0530, Shridhar Daithankar wrote:
 1) Database load time from flat file using copy is very high
 2) Creating index takes huge amount of time.
 3) Any suggsestions for runtime as data load and query will be going in 
 parallel.

You're loading all the data in one copy. I find that INSERTs are mostly
limited by indexes. While index lookups are cheap, they are not free and
each index needs to be updated for each row.

I fond using partial indexes to only index the rows you actually use can
help with the loading. It's a bit obscure though.

As for parallel loading, you'll be limited mostly by your I/O bandwidth.
Have you measured it to take sure it's up to speed?

 Now the details. Note that this is a test run only..
 
 Platform:- 4x Xeon2.4GHz/4GB RAM/4x48 SCSI RAID5/72 GB SCSI
 RedHat7.2/PostgreSQL7.1.3
 
 Database in flat file: 
 125,000,000 records of around 100 bytes each. 
 Flat file size 12GB
 
 Load time: 14581 sec/~8600 rows persec/~ an MB of data per sec.
 Create unique composite index on 2 char and a timestamp field:  25226 sec.
 Database size on disk: 26GB
 Select query: 1.5 sec. for approx. 150 rows.

So you're loading at a rate of 860KB per sec. That's not too fast. How many
indexes are active at that time? Triggers and foreign keys also take their
toll.

 Important postgresql.conf settings
 
 sort_mem = 12000
 shared_buffers = 24000
 fsync=true (Sad but true. Left untouched.. Will that make a difference on 
 SCSI?)
 wal_buffers = 65536 
 wal_files = 64 

fsync IIRC only affects the WAL buffers now but it may be quite expensive,
especially considering it's running on every transaction commit. Oh, your
WAL files are on a seperate disk from the data?

 Initial flat data load: 250GB of data. This has gone up since last query. It 
 was 150GB earlier..
 Ongoing inserts: 5000/sec. 
 Number of queries: 4800 queries/hour
 Query response time: 10 sec.

That looks quite acheivable.

 1)  Instead of copying from a single 12GB data file,  will a parallel copy from 
 say 5 files will speed up the things? 

Limited by I/O bandwidth. On linux vmstat can tell you how many blocks are
being loaded and stored per second. Try it. As long as sync() doesn't get
done too often, it should be help.

 Couple MB of data per sec. to disk is just not saturating it. It's a RAID 5 
 setup..

No, it's not. You should be able to do better.

 2) Sort mem.=12K i.e. 94MB, sounds good enough to me. Does this need further 
 addition to improve create index performance?

Should be fine. Admittedly your indexes are taking rather long to build.

 3) 5K concurrent inserts with an index on, will this need a additional CPU 
 power? Like deploying it on dual RISC CPUs etc? 

It shouldn't. Do you have an idea of what your CPU usage is? ps aux should
give you a decent idea.

 4) Query performance is not a problem. Though 4.8K queries per sec. expected 
 response time from each query is 10 sec. But my guess is some serius CPU power 
 will be chewed there too..

Should be fine.

 5)Will upgrading to 7.2.2/7.3 beta help?

Possibly, though it may be wirth it just for the features/bugfixes.

 All in all, in the  test, we didn't see the performance where hardware is 
 saturated to it's limits. So effectively we are not able to get postgresql 
 making use of it. Just pushing WAL and shared buffers does not seem to be the 
 solution.
 
 If you guys have any suggestions. let me know.  I need them all..

Find the bottleneck: CPU, I/O or memory?

 Mysql is almost out because it's creating index for last 17 hours. I don't 
 think it will keep up with 5K inserts per sec. with index. SAP DB is under 
 evaluation too. But postgresql is most favourite as of now because it works. So 
 I need to come up with solutions to problems that will occur in near future..
 ;-)

17 hours! Ouch. Either way, you should be able to do much better. Hope this
helps,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 There are 10 kinds of people in the world, those that can do binary
 arithmetic and those that can't.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Justin Clift

Hi Shridhar,

Shridhar Daithankar wrote:
snip
 3) Any suggsestions for runtime as data load and query will be going in
 parallel.

That sounds unusual.  From reading this, it *sounds* like you'll be
running queries against an incomplete dataset, or maybe just running the
queries that affect the tables loaded thus far (during the initial
load).

snip
 fsync=true (Sad but true. Left untouched.. Will that make a difference on
 SCSI?)

Definitely.  Have directly measured a ~ 2x tps throughput increase on
FreeBSD when leaving fsync off whilst performance measuring stuff
recently (PG 7.2.2).  Like anything it'll depend on workload, phase of
moon, etc, but it's a decent indicator.

snip
 Now questions.
 
 1)  Instead of copying from a single 12GB data file,  will a parallel copy from
 say 5 files will speed up the things?

Not sure yet.  Haven't get done enough performance testing (on the cards
very soon though).

 Couple MB of data per sec. to disk is just not saturating it. It's a RAID 5
 setup..

fsync = off would help during the data load, but not a good idea if
you're going to be running queries against it at the same time.

Am still getting the hang of performance tuning stuff.  Have a bunch of
Ultra160 hardware for the Intel platform, and am testing against it as
time permits.

Not as high end as I'd like, but it's a start.

:-)

Regards and best wishes,

Justin Clift

snip
 Bye
  Shridhar

-- 
My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there.
   - Indira Gandhi

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Shridhar Daithankar

On 26 Sep 2002 at 19:17, Justin Clift wrote:
 Shridhar Daithankar wrote:
 snip
  3) Any suggsestions for runtime as data load and query will be going in
  parallel.
 
 That sounds unusual.  From reading this, it *sounds* like you'll be
 running queries against an incomplete dataset, or maybe just running the
 queries that affect the tables loaded thus far (during the initial
 load).

That's correct. Load the data so far and keep inserting data as and when it 
generates.

They don't mind running against data so far. It's not very accurate stuff 
IMO...

  fsync=true (Sad but true. Left untouched.. Will that make a difference on
  SCSI?)
 
 Definitely.  Have directly measured a ~ 2x tps throughput increase on
 FreeBSD when leaving fsync off whilst performance measuring stuff
 recently (PG 7.2.2).  Like anything it'll depend on workload, phase of
 moon, etc, but it's a decent indicator.

I didn't know even that matters with SCSI..Will check out..

 fsync = off would help during the data load, but not a good idea if
 you're going to be running queries against it at the same time.

That's OK for the reasons mentioned above. It wouldn't be out of place to 
expect a UPS to such an installation...

Bye
 Shridhar

--
Hoare's Law of Large Problems:  Inside every large problem is a small problem 
struggling to get out.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Shridhar Daithankar

On 26 Sep 2002 at 19:05, Martijn van Oosterhout wrote:

 On Thu, Sep 26, 2002 at 02:05:44PM +0530, Shridhar Daithankar wrote:
  1) Database load time from flat file using copy is very high
  2) Creating index takes huge amount of time.
  3) Any suggsestions for runtime as data load and query will be going in 
  parallel.
 
 You're loading all the data in one copy. I find that INSERTs are mostly
 limited by indexes. While index lookups are cheap, they are not free and
 each index needs to be updated for each row.
 
 I fond using partial indexes to only index the rows you actually use can
 help with the loading. It's a bit obscure though.
 
 As for parallel loading, you'll be limited mostly by your I/O bandwidth.
 Have you measured it to take sure it's up to speed?

Well. It's like this, as of now.. CreateDB-create table-create index-Select.

So loading is not slowed by index. As of your hint of vmstat, will check it 
out.
 So you're loading at a rate of 860KB per sec. That's not too fast. How many
 indexes are active at that time? Triggers and foreign keys also take their
 toll.

Nothing except the table where data os  loaded..

 fsync IIRC only affects the WAL buffers now but it may be quite expensive,
 especially considering it's running on every transaction commit. Oh, your
 WAL files are on a seperate disk from the data?

No. Same RAID 5 disks..

 It shouldn't. Do you have an idea of what your CPU usage is? ps aux should
 give you a decent idea.

I guess we forgot to monitor system parameters. Next on my list is running 
vmstat, top and tuning bdflush.
 
 Find the bottleneck: CPU, I/O or memory?

Understood..
 
  Mysql is almost out because it's creating index for last 17 hours. I don't 
  think it will keep up with 5K inserts per sec. with index. SAP DB is under 
  evaluation too. But postgresql is most favourite as of now because it works. So 
  I need to come up with solutions to problems that will occur in near future..
  ;-)
 
 17 hours! Ouch. Either way, you should be able to do much better. Hope this
 helps,

Heh.. no wonder this evaluation is taking more than 2 weeks.. Mysql was running 
out of disk space while creating index and crashin. An upgrade to mysql helped 
there but no numbers as yet..

Thanks once again...
Bye
 Shridhar

--
Boren's Laws:   (1) When in charge, ponder. (2) When in trouble, delegate.  (3) 
When in doubt, mumble.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Justin Clift

Shridhar Daithankar wrote:
snip
   fsync=true (Sad but true. Left untouched.. Will that make a difference on
   SCSI?)
 
  Definitely.  Have directly measured a ~ 2x tps throughput increase on
  FreeBSD when leaving fsync off whilst performance measuring stuff
  recently (PG 7.2.2).  Like anything it'll depend on workload, phase of
  moon, etc, but it's a decent indicator.
 
 I didn't know even that matters with SCSI..Will check out..

Cool.  When testing it had FreeBSD 4.6.2 installed on one drive along
with the PostgreSQL 7.2.2 binaries, it had the data on a second drive
(mounted as /pgdata), and it had the pg_xlog directory mounted on a
third drive.  Swap had it's own drive as well.

Everything is UltraSCSI, etc.  Haven't yet tested for a performance
difference through moving the indexes to another drive after creation
though.  That apparently has the potential to help as well.

:-)

Regards and best wishes,

Justin Clift

-- 
My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there.
   - Indira Gandhi

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Justin Clift

Shridhar Daithankar wrote:
 
 On 26 Sep 2002 at 19:05, Martijn van Oosterhout wrote:
snip
  fsync IIRC only affects the WAL buffers now but it may be quite expensive,
  especially considering it's running on every transaction commit. Oh, your
  WAL files are on a seperate disk from the data?
 
 No. Same RAID 5 disks..

Not sure if this is a good idea.  Would have to think deeply about the
controller and drive optimisation/load characteristics.

If it's any help, when I was testing recently with WAL on a separate
drive, the WAL logs were doing more readwrites per second than the main
data drive.  This would of course be affected by the queries you are
running against the database.  I was just running Tatsuo's TPC-B stuff,
and the OSDB AS3AP tests.

 I guess we forgot to monitor system parameters. Next on my list is running
 vmstat, top and tuning bdflush.

That'll just be the start of it for serious performance tuning and
learning how PostgreSQL works.  :)

snip
 Thanks once again...
 Bye
  Shridhar

-- 
My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there.
   - Indira Gandhi

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Hans-Jürgen Schönig

I have seen various benchmarks where XFS seems to perform best when it 
comes to huge amounts of data and many files (due to balanced internal 
b+ trees).
also, XFS seems to be VERY mature and very stable.
ext2/3 don't seem to be that fast in most of the benchmarks.

i did some testing with reiser some time ago. the problem is that it 
seems to restore a very historic consistent snapshot of the data. XFS 
seems to be much better in this respect.

i have not tested JFS yet (but on this damn AIX beside me)
from my point of view i strongly recommend XFS (maybe somebody from 
RedHat should think about it).

Hans


Neil Conway wrote:

Bruce Momjian [EMAIL PROTECTED] writes:
  

The paper does recommend ext3, but the differences between file systems
are very small.



Well, I only did a very rough benchmark (a few runs of pgbench), but
the results I found were drastically different: ext2 was significantly
faster (~50%) than ext3-writeback, which was in turn significantly
faster (~25%) than ext3-ordered.

  

Also, though ext3 is slower, turning fsync off should make ext3 function
similar to ext2.



Why would that be?

Cheers,

Neil

  



-- 
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at http://www.postgresql.at, cluster.postgresql.at 
http://cluster.postgresql.at, www.cybertec.at 
http://www.cybertec.at, kernel.cybertec.at http://kernel.cybertec.at


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Greg Copeland

I tend to agree with this though I have nothing to back up it with.  My
impression is that XFS does very well for large files.  Accepting that
as fact?, my impression is that XFS historically does well for
database's.  Again, I have nothing to back that up other than hear-say
and conjecture.

Greg


On Thu, 2002-09-26 at 15:55, Hans-Jürgen Schönig wrote:
 I have seen various benchmarks where XFS seems to perform best when it 
 comes to huge amounts of data and many files (due to balanced internal 
 b+ trees).
 also, XFS seems to be VERY mature and very stable.
 ext2/3 don't seem to be that fast in most of the benchmarks.
 
 i did some testing with reiser some time ago. the problem is that it 
 seems to restore a very historic consistent snapshot of the data. XFS 
 seems to be much better in this respect.
 
 i have not tested JFS yet (but on this damn AIX beside me)
 from my point of view i strongly recommend XFS (maybe somebody from 
 RedHat should think about it).
 
 Hans
 
 
 Neil Conway wrote:
 
 Bruce Momjian [EMAIL PROTECTED] writes:
   
 
 The paper does recommend ext3, but the differences between file systems
 are very small.
 
 
 
 Well, I only did a very rough benchmark (a few runs of pgbench), but
 the results I found were drastically different: ext2 was significantly
 faster (~50%) than ext3-writeback, which was in turn significantly
 faster (~25%) than ext3-ordered.
 
   
 
 Also, though ext3 is slower, turning fsync off should make ext3 function
 similar to ext2.
 
 
 
 Why would that be?
 
 Cheers,
 
 Neil
 
   
 
 
 
 -- 
 *Cybertec Geschwinde u Schoenig*
 Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
 Tel: +43/1/913 68 09; +43/664/233 90 75
 www.postgresql.at http://www.postgresql.at, cluster.postgresql.at 
 http://cluster.postgresql.at, www.cybertec.at 
 http://www.cybertec.at, kernel.cybertec.at http://kernel.cybertec.at
 
 
 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Tom Lane

Shridhar Daithankar [EMAIL PROTECTED] writes:
 RedHat7.2/PostgreSQL7.1.3

I'd suggest a newer release of Postgres ... 7.1.3 is pretty old ...

 Create unique composite index on 2 char and a timestamp field:  25226 sec.

What do you mean by char exactly?  If it's really char(N), how much
are you paying in padding space?  There are very very few cases where
I'd not say to use varchar(N), or text, instead.  Also, does it have to
be character data?  If you could use an integer or float datatype
instead the index operations should be faster (though I can't say by
how much).  Have you thought carefully about the order in which the
composite index columns are listed?

 sort_mem = 12000

To create an index of this size, you want to push sort_mem as high as it
can go without swapping.  12000 sounds fine for the global setting, but
in the process that will create the index, try setting sort_mem to some
hundreds of megs or even 1Gb.  (But be careful: the calculation of space
actually used by CREATE INDEX is off quite a bit in pre-7.3 releases
:-(.  You should probably expect the actual process size to grow to two
or three times what you set sort_mem to.  Don't let it get so big as to
swap.)

 wal_buffers = 65536 

The above is a complete waste of memory space, which would be better
spent on letting the kernel expand its disk cache.  There's no reason
for wal_buffers to be more than a few dozen.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Shridhar Daithankar

On 26 Sep 2002 at 10:33, Tom Lane wrote:

 Shridhar Daithankar [EMAIL PROTECTED] writes:
  RedHat7.2/PostgreSQL7.1.3
 
 I'd suggest a newer release of Postgres ... 7.1.3 is pretty old ...

I agree.. downloadind 7.2.2 right away..

  Create unique composite index on 2 char and a timestamp field:  25226 sec.
 
 What do you mean by char exactly?  If it's really char(N), how much
 are you paying in padding space?  There are very very few cases where
 I'd not say to use varchar(N), or text, instead.  Also, does it have to
 be character data?  If you could use an integer or float datatype
 instead the index operations should be faster (though I can't say by
 how much).  Have you thought carefully about the order in which the
 composite index columns are listed?

I have forwarded the idea of putting things into number. If it causes speedup 
in index lookup/creation, it would do. Looks like bigint is the order of the 
day..

 
  sort_mem = 12000
 
 To create an index of this size, you want to push sort_mem as high as it
 can go without swapping.  12000 sounds fine for the global setting, but
 in the process that will create the index, try setting sort_mem to some
 hundreds of megs or even 1Gb.  (But be careful: the calculation of space
 actually used by CREATE INDEX is off quite a bit in pre-7.3 releases
 :-(.  You should probably expect the actual process size to grow to two
 or three times what you set sort_mem to.  Don't let it get so big as to
 swap.)

Great. I was skeptical to push it beyond 100MB. Now I can push it to corners..

  wal_buffers = 65536 
 
 The above is a complete waste of memory space, which would be better
 spent on letting the kernel expand its disk cache.  There's no reason
 for wal_buffers to be more than a few dozen.

That was a rather desparate move. Nothing was improving performance and then we 
started pushing numbers.. WIll get it back..  Same goes for 64 WAL files.. A GB 
looks like waste to me..

I might have found the bottleneck, although by accident. Mysql was running out 
of space while creating index. So my friend shut down mysql and tried to move 
things by hand to create links. He noticed that even things like cp were 
terribly slow and it hit us.. May be the culprit is the file system. Ext3 in 
this case. 

My friend argues for ext2 to eliminate journalling overhead but I favour 
reiserfs personally having used it in pgbench with 10M rows on paltry 20GB IDE 
disk for 25 tps..

We will be attempting raiserfs and/or XFS if required. I know how much speed 
difference exists between resiserfs and ext2. Would not be surprised if 
everythng just starts screaming in one go..

Bye
 Shridhar

--
Cropp's Law:The amount of work done varies inversly with the time spent in the 
 
office.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Shridhar Daithankar

On 26 Sep 2002 at 10:42, Tom Lane wrote:

 Justin Clift [EMAIL PROTECTED] writes:
  If it's any help, when I was testing recently with WAL on a separate
  drive, the WAL logs were doing more readwrites per second than the main
  data drive.
 
 ... but way fewer seeks.  For anything involving lots of updating
 transactions (and certainly 5000 separate insertions per second would
 qualify; can those be batched??), it should be a win to put WAL on its
 own spindle, just to get locality of access to the WAL.

Probably they will be a single transcation. If possible we will bunch more of 
them together.. like 5 seconds of data pushed down in a single transaction but 
not sure it's possible..

This is bit like replication but from live oracle machine to postgres, from 
information I have. So there should be some chance of tuning there..

Bye
 Shridhar

--
Langsam's Laws: (1) Everything depends. (2) Nothing is always.  (3) Everything 
is sometimes.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Denis Perchine

On Thursday 26 September 2002 21:52, Shridhar Daithankar wrote:

 I might have found the bottleneck, although by accident. Mysql was running
 out of space while creating index. So my friend shut down mysql and tried
 to move things by hand to create links. He noticed that even things like cp
 were terribly slow and it hit us.. May be the culprit is the file system.
 Ext3 in this case.

 My friend argues for ext2 to eliminate journalling overhead but I favour
 reiserfs personally having used it in pgbench with 10M rows on paltry 20GB
 IDE disk for 25 tps..

 We will be attempting raiserfs and/or XFS if required. I know how much
 speed difference exists between resiserfs and ext2. Would not be surprised
 if everythng just starts screaming in one go..

As it was found by someone before any non-journaling FS is faster than
journaling one. This due to double work done by FS and database.

Try it on ext2 and compare.

--
Denis


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Justin Clift

Shridhar Daithankar wrote:
snip
 My friend argues for ext2 to eliminate journalling overhead but I favour
 reiserfs personally having used it in pgbench with 10M rows on paltry 20GB IDE
 disk for 25 tps..

If it's any help, the setup I mentioned before with differnt disks for
the data and the WAL files was getting an average of about 72 tps with
200 concurrent users on pgbench.  Haven't tuned it in a hard core way at
all, and it only has 256MB DDR RAM in it at the moment (single CPU
AthonXP 1600).  These are figures made during the 2.5k+ test runs of
pgbench done when developing pg_autotune recently.

As a curiosity point, how predictable are the queries you're going to be
running on your database?  They sound very simple and very predicatable.

The pg_autotune tool might be your friend here.  It can deal with
arbitrary SQL instead of using the pg_bench stuff of Tatsuos, and it can
also deal with an already loaded database.  You'd just have to tweak the
names of the tables that it vacuums and the names of the indexes that it
reindexes between each run, to get some idea of your overall server
performance at different load points.

Probably worth taking a good look at if you're not afraid of editing
variables in C code.  :)
 
 We will be attempting raiserfs and/or XFS if required. I know how much speed
 difference exists between resiserfs and ext2. Would not be surprised if
 everythng just starts screaming in one go..

We'd all probably be interested to hear this.  Added the PostgreSQL
Performance mailing list to this thread too, Just In Case. (wow that's
a lot of cross posting now).

Regards and best wishes,

Justin Clift
 
 Bye
  Shridhar
 
 --
 Cropp's Law:The amount of work done varies inversly with the time spent in the
 office.
 
 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly

-- 
My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there.
   - Indira Gandhi

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Shridhar Daithankar

On 27 Sep 2002 at 1:12, Justin Clift wrote:

 Shridhar Daithankar wrote:
 As a curiosity point, how predictable are the queries you're going to be
 running on your database?  They sound very simple and very predicatable.

Mostly predictable selects. Not a domain expert on telecom so not very sure. 
But in my guess prepare statement in 7.3 should come pretty handy. i.e. by the 
time we finish evaluation and test deployment, 7.3 will be out in next couple 
of months to say so. So I would recommend doing it 7.3 way only..
 
 The pg_autotune tool might be your friend here.  It can deal with
 arbitrary SQL instead of using the pg_bench stuff of Tatsuos, and it can
 also deal with an already loaded database.  You'd just have to tweak the
 names of the tables that it vacuums and the names of the indexes that it
 reindexes between each run, to get some idea of your overall server
 performance at different load points.
 
 Probably worth taking a good look at if you're not afraid of editing
 variables in C code.  :)

Gladly. We started with altering pgbench here for testing and rapidly settled 
to perl generated random queries. Once postgresql wins the evaluation match and 
things come to implementation, pg_autotune would be a handy tool. Just that 
can't do it right now. Have to fight mysql and SAP DB before that..

BTW any performance figures on SAP DB? People here are as it frustrated with it 
with difficulties in setting it up. But still..
  

  We will be attempting raiserfs and/or XFS if required. I know how much speed
  difference exists between resiserfs and ext2. Would not be surprised if
  everythng just starts screaming in one go..
 
 We'd all probably be interested to hear this.  Added the PostgreSQL
 Performance mailing list to this thread too, Just In Case. (wow that's
 a lot of cross posting now).

I know..;-) Glad that PG list does not have strict policies like no non-
subscriber posting or no attachments.. etc.. 

IMO reiserfs, though journalling one, is faster than ext2 etc. because the way 
it handles metadata. Personally I haven't come across ext2 being faster than 
reiserfs on few machine here for day to day use.

I guess I should have a freeBSD CD handy too.. Just to give it a try. If it 
comes down to a better VM.. though using 2.4.19 here.. so souldn't matter 
much..

I will keep you guys posted on file system stuff... Glad that we have much 
flexibility with postgresql..

Bye
 Shridhar

--
Bilbo's First Law:  You cannot count friends that are all packed up in barrels.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Greg Copeland

On Thu, 2002-09-26 at 09:52, Shridhar Daithankar wrote:
 My friend argues for ext2 to eliminate journalling overhead but I favour 
 reiserfs personally having used it in pgbench with 10M rows on paltry 20GB IDE 
 disk for 25 tps..
 
 We will be attempting raiserfs and/or XFS if required. I know how much speed 
 difference exists between resiserfs and ext2. Would not be surprised if 
 everythng just starts screaming in one go..
 

I'm not sure about reiserfs or ext3 but with XFS, you can create your
log on another disk.  Also worth noting is that you can also configure
the size and number of log buffers.  There are also some other
performance type enhancements you can fiddle with if you don't mind
risking time stamp consistency in the event of a crash.  If your setup
allows for it, you might want to consider using XFS in this
configuration.

While I have not personally tried moving XFS' log to another device,
I've heard that performance gains can be truly stellar.  Assuming memory
allows, twiddling with the log buffering is said to allow for large
strides in performance as well.

If you do try this, I'd love to hear back about your results and
impressions.

Greg




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Bruce Momjian

Shridhar Daithankar wrote:
 I might have found the bottleneck, although by accident. Mysql was running out 
 of space while creating index. So my friend shut down mysql and tried to move 
 things by hand to create links. He noticed that even things like cp were 
 terribly slow and it hit us.. May be the culprit is the file system. Ext3 in 
 this case. 

I just added a file system and multi-cpu section to my performance
tuning paper:

http://www.ca.postgresql.org/docs/momjian/hw_performance/

The paper does recommend ext3, but the differences between file systems
are very small. If you are seeing 'cp' as slow, I wonder if it may be
something more general, like poorly tuned hardware or something. You can
use 'dd' to throw some data around the file system and see if that is
showing slowness;  compare those numbers to another machine that has
different hardware/OS.

Also, though ext3 is slower, turning fsync off should make ext3 function
similar to ext2.  That would be an interesting test if you suspect ext3.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Doug cNaught

Greg Copeland [EMAIL PROTECTED] writes:

 I'm not sure about reiserfs or ext3 but with XFS, you can create your
 log on another disk.  Also worth noting is that you can also configure
 the size and number of log buffers.  There are also some other
 performance type enhancements you can fiddle with if you don't mind
 risking time stamp consistency in the event of a crash.  If your setup
 allows for it, you might want to consider using XFS in this
 configuration.

You can definitely put the ext3 log on a different disk with 2.4
kernels.  

Also, if you put the WAL logs on a different disk from the main
database, and mount that partition with 'data=writeback' (ie
metadata-only journaling) ext3 should be pretty fast, since WAL files
are preallocated and there will therefore be almost no metadata
updates.

You should be able to mount the main database with data=ordered (the
default) for good performance and reasonable safety.

I think putting WAL on its own disk(s) is one of the keys here.

-Doug

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Greg Copeland

On Thu, 2002-09-26 at 11:41, Bruce Momjian wrote:
 Shridhar Daithankar wrote:
  I might have found the bottleneck, although by accident. Mysql was running out 
  of space while creating index. So my friend shut down mysql and tried to move 
  things by hand to create links. He noticed that even things like cp were 
  terribly slow and it hit us.. May be the culprit is the file system. Ext3 in 
  this case. 
 
 I just added a file system and multi-cpu section to my performance
 tuning paper:
 
   http://www.ca.postgresql.org/docs/momjian/hw_performance/
 
 The paper does recommend ext3, but the differences between file systems
 are very small. If you are seeing 'cp' as slow, I wonder if it may be
 something more general, like poorly tuned hardware or something. You can
 use 'dd' to throw some data around the file system and see if that is
 showing slowness;  compare those numbers to another machine that has
 different hardware/OS.
 
 Also, though ext3 is slower, turning fsync off should make ext3 function
 similar to ext2.  That would be an interesting test if you suspect ext3.

I'm curious as to why you recommended ext3 versus some other (JFS,
XFS).  Do you have tests which validate that recommendation or was it a
simple matter of getting the warm fuzzies from familiarity?

Greg




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Greg Copeland

On Thu, 2002-09-26 at 11:41, Bruce Momjian wrote:
 Shridhar Daithankar wrote:
  I might have found the bottleneck, although by accident. Mysql was running out 
  of space while creating index. So my friend shut down mysql and tried to move 
  things by hand to create links. He noticed that even things like cp were 
  terribly slow and it hit us.. May be the culprit is the file system. Ext3 in 
  this case. 
 
 I just added a file system and multi-cpu section to my performance
 tuning paper:
 
   http://www.ca.postgresql.org/docs/momjian/hw_performance/
 
 The paper does recommend ext3, but the differences between file systems
 are very small. If you are seeing 'cp' as slow, I wonder if it may be
 something more general, like poorly tuned hardware or something. You can
 use 'dd' to throw some data around the file system and see if that is
 showing slowness;  compare those numbers to another machine that has
 different hardware/OS.


That's a good point.  Also, if you're using IDE, you do need to verify
that you're using DMA and proper PIO mode if at possible.  Also, big
performance improvements can be seen by making sure your IDE bus speed
has been properly configured.  The drivetweak-gtk and hdparm utilities
can make huge difference in performance.  Just be sure you know what the
heck your doing when you mess with those.

Greg




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Bruce Momjian

Greg Copeland wrote:
  The paper does recommend ext3, but the differences between file systems
  are very small. If you are seeing 'cp' as slow, I wonder if it may be
  something more general, like poorly tuned hardware or something. You can
  use 'dd' to throw some data around the file system and see if that is
  showing slowness;  compare those numbers to another machine that has
  different hardware/OS.
  
  Also, though ext3 is slower, turning fsync off should make ext3 function
  similar to ext2.  That would be an interesting test if you suspect ext3.
 
 I'm curious as to why you recommended ext3 versus some other (JFS,
 XFS).  Do you have tests which validate that recommendation or was it a
 simple matter of getting the warm fuzzies from familiarity?

I used the attached email as a reference.  I just changed the wording to
be:

File system choice is particularly difficult on Linux because there are
so many file system choices, and none of them are optimal: ext2 is not
entirely crash-safe, ext3 and xfs are journal-based, and Reiser is
optimized for small files. Fortunately, the journaling file systems
aren't significantly slower than ext2 so they are probably the best
choice.

so I don't specifically recommend ext3 anymore.  As I remember, ext3 is
good only in that it can read ext2 file systems.  I think XFS may be the
best bet.

Can anyone clarify if data=writeback is safe for PostgreSQL. 
Specifically, are the data files recovered properly or is this option
only for a filesystem containing WAL?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073


From [EMAIL PROTECTED] Sun Apr 14 
10:26:49 2002
Return-path: [EMAIL PROTECTED]
Received: from postgresql.org (postgresql.org [64.49.215.8])
by candle.pha.pa.us (8.11.6/8.10.1) with ESMTP id g3EEQmS21621
for [EMAIL PROTECTED]; Sun, 14 Apr 2002 10:26:48 -0400 (EDT)
Received: from postgresql.org (postgresql.org [64.49.215.8])
by postgresql.org (Postfix) with SMTP id 94C5B475C9B
for [EMAIL PROTECTED]; Sun, 14 Apr 2002 10:26:43 -0400 (EDT)
Received: from sphinx.mythic-beasts.com (sphinx.mythic-beasts.com [195.82.107.246])
by postgresql.org (Postfix) with ESMTP id 1E27547517F
for [EMAIL PROTECTED]; Sun, 14 Apr 2002 10:25:31 -0400 (EDT)
Received: from matthew (helo=localhost)
by sphinx.mythic-beasts.com with local-esmtp (Exim 3.33 #2)
id 16wkwR-p2-00; Sun, 14 Apr 2002 15:25:19 +0100
Date: Sun, 14 Apr 2002 15:25:19 +0100 (BST)
From: Matthew Kirkwood [EMAIL PROTECTED]
X-X-Sender: [EMAIL PROTECTED]
To: Riebs, Andy [EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Subject: Re: [GENERAL] Performance Tuning Document?
In-Reply-To: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Precedence: bulk
Sender: [EMAIL PROTECTED]
Status: OR

On Thu, 28 Mar 2002, Matthew Kirkwood wrote:

[ oops, forgot to send this ages ago ]

   Under the crossSectionTests(Mixed IR) part of an OSDB run, a
   large number of shared_buffers causes severe slowdown on one of
   the tests -- it goes from a little over 200 seconds to nearly
   2000.

  --postgresql=no_hash_index

 Ah, that would make a lot of sense.  I'll do a run again with
 that option and see what turns up.

That was right on the nose.  The numbers are much better now.

My initial interest was in benchmarking different filesystems
on Linux.  In case anyone is interested, here are today's
numbers:

tuning? single  ir  cs-ir   oltpcs-oltp
(sec)   (tps)   (sec)   (tps)   (sec)
ext3 kn 841.28   61.52  203.33  407.58  159.72
ext3-wb  kn 841.19   63.73  217.19  406.30  160.88
ext3-jd  kn 839.96   58.96  203.02  307.85  159.89
jfs  kn 840.53   62.74  205.90  348.33  177.70
minixkn 841.51   62.12  201.44  343.87  176.68
ext2 kn 840.72   65.02  205.40  338.20  182.22

ext3-wb is ext3 with the data=writeback mount option.  ext3-jd
is ext3 with data=journal and a 200Mb journal instead of the
usual 32Mb one.  All filesystems were mounted noatime.

postgresql.conf for all these runs looks like:

tcpip_socket = true
shared_buffers = 10240
max_fsm_relations = 100
max_fsm_pages = 1
max_locks_per_transaction = 256
wal_buffers = 10240
sort_mem = 512
vacuum_mem = 81920

Without hash indexes, it looks like only OLTP loads can
differentiate the filesystems.  Sometime (once I have got
a more recent kernel going) I'll try a dataset larger than
memory.

Matthew.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])




---(end 

Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Bruce Momjian

Neil Conway wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  The paper does recommend ext3, but the differences between file systems
  are very small.
 
 Well, I only did a very rough benchmark (a few runs of pgbench), but
 the results I found were drastically different: ext2 was significantly
 faster (~50%) than ext3-writeback, which was in turn significantly
 faster (~25%) than ext3-ordered.

Wow.  That leaves no good Linux file system alternatives.  PostgreSQL
just wants an ordinary file system that has reliable recovery from a
crash.

  Also, though ext3 is slower, turning fsync off should make ext3 function
  similar to ext2.
 
 Why would that be?

I assumed it was the double fsync for the normal and journal that made
the journalling file systems slog.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Bruce Momjian

Neil Conway wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  The paper does recommend ext3, but the differences between file systems
  are very small.
 
 Well, I only did a very rough benchmark (a few runs of pgbench), but
 the results I found were drastically different: ext2 was significantly
 faster (~50%) than ext3-writeback, which was in turn significantly
 faster (~25%) than ext3-ordered.
 
  Also, though ext3 is slower, turning fsync off should make ext3 function
  similar to ext2.
 
 Why would that be?

OK, I changed the text to:

File system choice is particularly difficult on Linux because there are
so many file system choices, and none of them are optimal: ext2 is not
entirely crash-safe, ext3, xfs, and jfs are journal-based, and Reiser is
optimized for small files and does journalling. The journalling file
systems can be significantly slower than ext2 but when crash recovery is
required, ext2 isn't an option.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Neil Conway

Bruce Momjian [EMAIL PROTECTED] writes:
 Wow.  That leaves no good Linux file system alternatives.
 PostgreSQL just wants an ordinary file system that has reliable
 recovery from a crash.

I'm not really familiar with the reasoning behind ext2's reputation as
recovering poorly from crashes; if we fsync a WAL record to disk
before we lose power, can't we recover reliably, even with ext2?

   Also, though ext3 is slower, turning fsync off should make ext3
   function similar to ext2.
  
  Why would that be?
 
 I assumed it was the double fsync for the normal and journal that
 made the journalling file systems slog.

Well, a journalling file system would need to write a journal entry
and flush that to disk, even if fsync is disabled -- whereas without
fsync enabled, ext2 doesn't have to flush anything to disk. ISTM that
the performance advantage of ext2 over ext3 is should be even larger
when fsync is not enabled.

Cheers,

Neil

-- 
Neil Conway [EMAIL PROTECTED] || PGP Key ID: DB3C29FC


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Bruce Momjian

Neil Conway wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Wow.  That leaves no good Linux file system alternatives.
  PostgreSQL just wants an ordinary file system that has reliable
  recovery from a crash.
 
 I'm not really familiar with the reasoning behind ext2's reputation as
 recovering poorly from crashes; if we fsync a WAL record to disk
 before we lose power, can't we recover reliably, even with ext2?
 
Also, though ext3 is slower, turning fsync off should make ext3
function similar to ext2.
   
   Why would that be?
  
  I assumed it was the double fsync for the normal and journal that
  made the journalling file systems slog.
 
 Well, a journalling file system would need to write a journal entry
 and flush that to disk, even if fsync is disabled -- whereas without
 fsync enabled, ext2 doesn't have to flush anything to disk. ISTM that
 the performance advantage of ext2 over ext3 is should be even larger
 when fsync is not enabled.

Yes, it is still double-writing.  I just thought that if that wasn't
happening while the db was waiting for a commit that it wouldn't be too
bad.

Is it just me or do all the Linux file systems seem like they are
lacking something when PostgreSQL is concerned?  We just want a UFS-like
file system on Linux and no one has it.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Greg Copeland

On Thu, 2002-09-26 at 16:03, Neil Conway wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Wow.  That leaves no good Linux file system alternatives.
  PostgreSQL just wants an ordinary file system that has reliable
  recovery from a crash.
 
 I'm not really familiar with the reasoning behind ext2's reputation as
 recovering poorly from crashes; if we fsync a WAL record to disk
 before we lose power, can't we recover reliably, even with ext2?

Well, I have experienced data loss from ext2 before.  Also, recovery
from crashes on large file systems take a very, very long time.  I can't
imagine anyone running a production database on an ext2 file system
having 10's or even 100's of GB.  Ouch.  Recovery would take forever! 
Even recovery on small file systems (2-8G) can take extended periods of
time.  Especially so on IDE systems.  Even then manual intervention is
not uncommon.

While I can't say that x, y or z is the best FS to use on Linux, I can
say that ext2 is probably an exceptionally poor choice from a
reliability and/or uptime perspective.

Greg




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Neil Conway

Greg Copeland [EMAIL PROTECTED] writes:
 On Thu, 2002-09-26 at 16:03, Neil Conway wrote:
  I'm not really familiar with the reasoning behind ext2's
  reputation as recovering poorly from crashes; if we fsync a WAL
  record to disk before we lose power, can't we recover reliably,
  even with ext2?
 
 Well, I have experienced data loss from ext2 before.  Also, recovery
 from crashes on large file systems take a very, very long time.

Yes, but wouldn't you face exactly the same issues if you ran a
UFS-like filesystem in asynchronous mode? Albeit it's not the default,
but performance in synchronous mode is usually pretty poor.

The fact that ext2 defaults to asynchronous mode and UFS (at least on
the BSDs) defaults to synchronous mode seems like a total non-issue to
me. Is there any more to the alleged difference in reliability?

Cheers,

Neil

-- 
Neil Conway [EMAIL PROTECTED] || PGP Key ID: DB3C29FC


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Doug McNaught

Bruce Momjian [EMAIL PROTECTED] writes:

 Can anyone clarify if data=writeback is safe for PostgreSQL. 
 Specifically, are the data files recovered properly or is this option
 only for a filesystem containing WAL?

data=writeback means that no data is journaled, just metadata (which
is like XFS or Reiser).  An fsync() call should still do what it
normally does, commit the writes to disk before returning.

data=journal journals all data and is the slowest and safest.
data=ordered writes out data blocks before committing a journal
transaction, which is faster than full data journaling (since data
doesn't get written twice) and almost as safe.  data=writeback is
noted to keep obsolete data in the case of some crashes (since the
data may not have been written yet) but a completed fsync() should
ensure that the data is valid.

So I guess I'd probably use data=ordered for an all-on-one-fs
installation, and data=writeback for a WAL-only drive.

Hope this helps...

-Doug

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Tom Lane

Neil Conway [EMAIL PROTECTED] writes:
 I'm not really familiar with the reasoning behind ext2's reputation as
 recovering poorly from crashes; if we fsync a WAL record to disk
 before we lose power, can't we recover reliably, even with ext2?

Up to a point.  We do assume that the filesystem won't lose checkpointed
(sync'd) writes to data files.  To the extent that the filesystem is
vulnerable to corruption of its own metadata for a file (indirect blocks
or whatever ext2 uses), that's not a completely safe assumption.

We'd be happiest with a filesystem that journals its own metadata and
not the user data in the file(s).  I dunno if there are any.

Hmm, maybe this is why Oracle likes doing their own filesystem on a raw
device...

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Doug McNaught

Tom Lane [EMAIL PROTECTED] writes:

 We'd be happiest with a filesystem that journals its own metadata and
 not the user data in the file(s).  I dunno if there are any.

ext3 with data=writeback?  (See my previous message to Bruce).

-Doug

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Bruce Momjian

Neil Conway wrote:
 Greg Copeland [EMAIL PROTECTED] writes:
  On Thu, 2002-09-26 at 16:03, Neil Conway wrote:
   I'm not really familiar with the reasoning behind ext2's
   reputation as recovering poorly from crashes; if we fsync a WAL
   record to disk before we lose power, can't we recover reliably,
   even with ext2?
  
  Well, I have experienced data loss from ext2 before.  Also, recovery
  from crashes on large file systems take a very, very long time.
 
 Yes, but wouldn't you face exactly the same issues if you ran a
 UFS-like filesystem in asynchronous mode? Albeit it's not the default,
 but performance in synchronous mode is usually pretty poor.

Yes, before UFS had soft updates, the synchronous nature of UFS made it
slower than ext2, but now with soft updates, that performance difference
is gone so you have two files systems, ext2 and ufs, similar peformance,
but one is crash-safe and the other is not.

And, when comparing the journalling file systems, you have UFS vs.
XFS/ext3/JFS/Reiser, and UFS is faster.  The only thing the journalling
file system give you is more rapid reboot, but frankly, if your OS goes
down often enough so that is an issue, you have bigger problems than
fsync time.

The big problem is that Linux went from non-crash safe right to
crash-safe and reboot quick.  We need a middle ground, which is where
UFS/soft updates is.

 The fact that ext2 defaults to asynchronous mode and UFS (at least on
 the BSDs) defaults to synchronous mode seems like a total non-issue to
 me. Is there any more to the alleged difference in reliability?

The reliability problem isn't alleged.  ext2 developers admits ext2
isn't 100% crash-safe.  They will say it is usually crash-safe, but that
isn't good enough for PostgreSQL.

I wish I was wrong.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Bruce Momjian

Doug McNaught wrote:
 Tom Lane [EMAIL PROTECTED] writes:
 
  We'd be happiest with a filesystem that journals its own metadata and
  not the user data in the file(s).  I dunno if there are any.
 
 ext3 with data=writeback?  (See my previous message to Bruce).

OK, so that makes ext3 crash safe without lots of overhead?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Rod Taylor

On Thu, 2002-09-26 at 17:39, Bruce Momjian wrote:
 Neil Conway wrote:
  Greg Copeland [EMAIL PROTECTED] writes:
   On Thu, 2002-09-26 at 16:03, Neil Conway wrote:
I'm not really familiar with the reasoning behind ext2's
reputation as recovering poorly from crashes; if we fsync a WAL
record to disk before we lose power, can't we recover reliably,
even with ext2?
   
   Well, I have experienced data loss from ext2 before.  Also, recovery
   from crashes on large file systems take a very, very long time.
  
  Yes, but wouldn't you face exactly the same issues if you ran a
  UFS-like filesystem in asynchronous mode? Albeit it's not the default,
  but performance in synchronous mode is usually pretty poor.
 
 Yes, before UFS had soft updates, the synchronous nature of UFS made it
 slower than ext2, but now with soft updates, that performance difference
 is gone so you have two files systems, ext2 and ufs, similar peformance,
 but one is crash-safe and the other is not.

Note entirely true.  ufs is both crash-safe and quick-rebootable.  You
do need to fsck at some point, but not prior to mounting it.  Any
corrupt blocks are empty, and are easy to avoid.

Someone just needs to implement a background fsck that will run on a
mounted filesystem.

-- 
  Rod Taylor


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Bruce Momjian

Rod Taylor wrote:
  Yes, before UFS had soft updates, the synchronous nature of UFS made it
  slower than ext2, but now with soft updates, that performance difference
  is gone so you have two files systems, ext2 and ufs, similar peformance,
  but one is crash-safe and the other is not.
 
 Note entirely true.  ufs is both crash-safe and quick-rebootable.  You
 do need to fsck at some point, but not prior to mounting it.  Any
 corrupt blocks are empty, and are easy to avoid.

I am assuming you need to mount the drive as part of the reboot.  Of
course you can boot fast with any file system if you don't have to mount
it.  :-)

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Rod Taylor

On Thu, 2002-09-26 at 17:47, Bruce Momjian wrote:
 Rod Taylor wrote:
   Yes, before UFS had soft updates, the synchronous nature of UFS made it
   slower than ext2, but now with soft updates, that performance difference
   is gone so you have two files systems, ext2 and ufs, similar peformance,
   but one is crash-safe and the other is not.
  
  Note entirely true.  ufs is both crash-safe and quick-rebootable.  You
  do need to fsck at some point, but not prior to mounting it.  Any
  corrupt blocks are empty, and are easy to avoid.
 
 I am assuming you need to mount the drive as part of the reboot.  Of
 course you can boot fast with any file system if you don't have to mount
 it.  :-)

Sorry, poor explanation.

Background fsck (when implemented) would operate on a currently mounted
(and active) file system.  The only reason fsck is required prior to
reboot now is because no-one had done the work.

http://www.freebsd.org/cgi/man.cgi?query=fscksektion=8manpath=FreeBSD+5.0-current

See the first paragraph of the above.
-- 
  Rod Taylor


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Bruce Momjian

Rod Taylor wrote:
 On Thu, 2002-09-26 at 17:47, Bruce Momjian wrote:
  Rod Taylor wrote:
Yes, before UFS had soft updates, the synchronous nature of UFS made it
slower than ext2, but now with soft updates, that performance difference
is gone so you have two files systems, ext2 and ufs, similar peformance,
but one is crash-safe and the other is not.
   
   Note entirely true.  ufs is both crash-safe and quick-rebootable.  You
   do need to fsck at some point, but not prior to mounting it.  Any
   corrupt blocks are empty, and are easy to avoid.
  
  I am assuming you need to mount the drive as part of the reboot.  Of
  course you can boot fast with any file system if you don't have to mount
  it.  :-)
 
 Sorry, poor explanation.
 
 Background fsck (when implemented) would operate on a currently mounted
 (and active) file system.  The only reason fsck is required prior to
 reboot now is because no-one had done the work.
 
 http://www.freebsd.org/cgi/man.cgi?query=fscksektion=8manpath=FreeBSD+5.0-current
 
 See the first paragraph of the above.

Oh, yes, I have heard of that missing feature.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Doug McNaught

Bruce Momjian [EMAIL PROTECTED] writes:

 Doug McNaught wrote:
  Tom Lane [EMAIL PROTECTED] writes:
  
   We'd be happiest with a filesystem that journals its own metadata and
   not the user data in the file(s).  I dunno if there are any.
  
  ext3 with data=writeback?  (See my previous message to Bruce).
 
 OK, so that makes ext3 crash safe without lots of overhead?

Metadata is journaled so you shouldn't lose data blocks or directory
entries.  Some data blocks (that haven't been fsync()'ed) may have old
or wrong data in them, but I think that's the same as ufs, right?  And
WAL replay should take care of that.

It'd be very interesting to do some tests of the various journaling
modes.  I have an old K6 that I might be able to turn into a
hit-the-reset-switch-at-ramdom-times machine.  What kind of tests
should be run?

-Doug

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Tom Lane

Doug McNaught [EMAIL PROTECTED] writes:
 data=writeback means that no data is journaled, just metadata (which
 is like XFS or Reiser).  An fsync() call should still do what it
 normally does, commit the writes to disk before returning.
 data=journal journals all data and is the slowest and safest.
 data=ordered writes out data blocks before committing a journal
 transaction, which is faster than full data journaling (since data
 doesn't get written twice) and almost as safe.  data=writeback is
 noted to keep obsolete data in the case of some crashes (since the
 data may not have been written yet) but a completed fsync() should
 ensure that the data is valid.

Thanks for the explanation.

 So I guess I'd probably use data=ordered for an all-on-one-fs
 installation, and data=writeback for a WAL-only drive.

Actually I think the ideal thing for Postgres would be data=writeback
for both data and WAL drives.  We can handle loss of un-fsync'd data
for ourselves in both cases.

Of course, if you keep anything besides Postgres data files on a
partition, you'd possibly want the more secure settings.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] [GENERAL] Performance while loading data and indexing

2002-09-26 Thread Yury Bokhoncovich

Hello!

On Thu, 26 Sep 2002, Bruce Momjian wrote:

  I'm not really familiar with the reasoning behind ext2's reputation as
  recovering poorly from crashes; if we fsync a WAL record to disk

On relatively big volumes ext2 recovery can end up in formatting the fs 
under certain cirrumstances.;-)

   I assumed it was the double fsync for the normal and journal that
   made the journalling file systems slog.
  
  Well, a journalling file system would need to write a journal entry
  and flush that to disk, even if fsync is disabled -- whereas without
  fsync enabled, ext2 doesn't have to flush anything to disk. ISTM that
  the performance advantage of ext2 over ext3 is should be even larger
  when fsync is not enabled.
 
 Yes, it is still double-writing.  I just thought that if that wasn't
 happening while the db was waiting for a commit that it wouldn't be too
 bad.
 
 Is it just me or do all the Linux file systems seem like they are
 lacking something when PostgreSQL is concerned?  We just want a UFS-like
 file system on Linux and no one has it.

mount -o sync an ext2 volume on Linux - and you can get a UFS-like fs.:)
mount -o async an FFS volume on FreeBSD  - and you can get boost in fs 
performance.
Personally me always mount ext2 fs where Pg is living with sync option.
Fsync in pg is off (since 6.3), this way successfully pass thru a few 
serious crashes on various systems (mostly on power problems).
If fsync is on in Pg, performance gets so-oh-oh-oh-oh slowly!=)
I just have done upgrade from 2.2 kernel on ext2 to ext3 capable 2.4 one
so I'm planning to do some benchmarking. Roughly saying w/o benchmarks, 
the performance have been degraded in 2/3 proportion.
But better safe then sorry.

-- 
WBR, Yury Bokhoncovich, Senior System Administrator, NOC of F1 Group.
Phone: +7 (3832) 106228, ext.140, E-mail: [EMAIL PROTECTED]
Unix is like a wigwam -- no Gates, no Windows, and an Apache inside.



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])