Re: [PERFORM] Speeding up Aggregates

2003-10-09 Thread Christopher Kings-Lynne

Say, what do people think about a comment board thing like php.net has
attached to the documentation. People can add comments that show up directly
on the bottom of the documentation for each function. I find it's mostly full
of junk but skimming the comments often turns up one or two relevant warnings,
especially when I'm wondering why something's not behaving the way I expect.
I thought we had that:

http://www.postgresql.org/docs/7.3/interactive/functions-aggregate.html

...and someone has already made the comment.

Chris



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Speeding up Aggregates

2003-10-09 Thread Greg Stark

Dror Matalon <[EMAIL PROTECTED]> writes:

> Ouch. I just double checked and you're right. Is this considered a bug,
> or just an implementation issue? 

Call it a wishlist bug. The problem is it would be a hard feature to implement
properly. And none of the people paid to work on postgres by various companies
seem to have this on their to-do lists. So don't expect it in the near future.

> While I've seen this hint a few times in the lists, it seems like it's
> one of those magic incantations that those in the know, know about, and
> that people new to postgres are going to be surprised by the need to use
> this idiom.

Yup. Though it's in the FAQ and comes up on the mailing list about once a week
or so, so it's hard to see how to document it any better. Perhaps a warning
specifically on the min/max functions in the documentation?


Say, what do people think about a comment board thing like php.net has
attached to the documentation. People can add comments that show up directly
on the bottom of the documentation for each function. I find it's mostly full
of junk but skimming the comments often turns up one or two relevant warnings,
especially when I'm wondering why something's not behaving the way I expect.

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Bruce Momjian

Isn't it great how you have the same directory on every host so you can
download once and run the same tests easily.


Neil Conway wrote:
> $ uname -a
> Linux spe170 2.4.17-64 #1 Sat Mar 16 17:31:44 MST 2002 parisc64 unknown
> $ gcc --version
> 3.0.4
> 
> 'make check' passes

I didn't know there was a pa-risc-64 chip.

> BTW, this platform doesn't have any code written for native spinlocks.
> 
> (4)
> 
> $ uname -a
> Linux spe156 2.4.18-mckinley-smp #1 SMP Thu Jul 11 12:51:02 MDT 2002
> ia64 unknown
> $ gcc --version
> 
> When you compile PostgreSQL without changing the CFLAGS configure picks,
> the initdb required for 'make check' fails with:
> 
> [...]
> initializing pg_depend... ok
> creating system views... ok
> loading pg_description... ok
> creating conversions... ERROR:  could not identify operator 679
> 
> I tried to compile PostgreSQL with CFLAGS='-O0' to see if the above
> resulted from an optimization-induced compiler error, but I got the
> following error:
> 
> $ gcc -O0 -Wall -Wmissing-prototypes -Wmissing-declarations
> -I../../../../src/include -D_GNU_SOURCE   -c -o xlog.o xlog.c
> ../../../../src/include/storage/s_lock.h: In function `tas':
> ../../../../src/include/storage/s_lock.h:125: error: inconsistent
> operand constraints in an `asm'
> 
> Whereas this works fine:
> 
> $ gcc -O2 -Wall -Wmissing-prototypes -Wmissing-declarations
> -I../../../../src/include -D_GNU_SOURCE   -c -o xlog.o xlog.c
> $
> 
> BTW, line 138 of s_lock.h is:
> 
> #if defined(__arm__) || defined(__arm__)

Fix just committed.  Thanks.

> That seems a little redundant.
> 
> Anyway, I tried running initdb after compiling all of pgsql with '-O0',
> except for the files that included s_lock.h, but make check still
> failed:
> 
> creating information schema... ok
> vacuuming database template1...
> /house/neilc/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/initdb: 
> line 882: 22035 Segmentation fault  (core dumped) "$PGPATH"/postgres $PGSQL_OPT 
> template1 >/dev/null <<EOF
> ANALYZE;
> VACUUM FULL FREEZE;
> EOF
> 
> The core file seems to indicate a stack overflow due to an infinitely
> recursive function:
> 
> (gdb) bt 25
> #0  0x40645dc0 in hash_search ()
> #1  0x40616930 in RelationSysNameCacheGetRelation ()
> #2  0x40616db0 in RelationSysNameGetRelation ()
> #3  0x40082e40 in relation_openr ()
> #4  0x40083910 in heap_openr ()
> #5  0x4060e6b0 in ScanPgRelation ()
> #6  0x40611d60 in RelationBuildDesc ()
> #7  0x40616e70 in RelationSysNameGetRelation ()
> #8  0x40082e40 in relation_openr ()
> #9  0x40083910 in heap_openr ()
> #10 0x4060e6b0 in ScanPgRelation ()
> #11 0x40611d60 in RelationBuildDesc ()
> #12 0x40616e70 in RelationSysNameGetRelation ()
> #13 0x40082e40 in relation_openr ()
> #14 0x40083910 in heap_openr ()
> #15 0x4060e6b0 in ScanPgRelation ()
> #16 0x40611d60 in RelationBuildDesc ()
> #17 0x40616e70 in RelationSysNameGetRelation ()
> #18 0x40082e40 in relation_openr ()
> #19 0x40083910 in heap_openr ()
> #20 0x4060e6b0 in ScanPgRelation ()
> #21 0x40611d60 in RelationBuildDesc ()
> #22 0x40616e70 in RelationSysNameGetRelation ()
> #23 0x40082e40 in relation_openr ()
> #24 0x40083910 in heap_openr ()
> (More stack frames follow...)
> 
> (It also dumps core in the same place during initdb if CFLAGS='-O' is
> specified.)
> 
> So it looks like the Itanium port is a little broken. Does anyone have
> an idea what needs to be done to fix it?

My guess is that the compiler itself is broken --- what else could it
be?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Speeding up Aggregates

2003-10-09 Thread Dror Matalon
On Thu, Oct 09, 2003 at 08:35:22PM -0500, Bruno Wolff III wrote:
> On Thu, Oct 09, 2003 at 17:44:46 -0700,
>   Dror Matalon <[EMAIL PROTECTED]> wrote:
> > 
> > How is doing order by limit 1 faster than doing max()? Seems like the
> > optimizer will need to sort or scan the data set either way. That part
> > didn't actually make a difference in my specific case.
> 
> max() will never be evaluated by using an index to find the greatest value.
> So in many cases using order by and limit 1 is faster.

Ouch. I just double checked and you're right. Is this considered a bug,
or just an implementation issue? 

While I've seen this hint a few times in the lists, it seems like it's
one of those magic incantations that those in the know, know about, and
that people new to postgres are going to be surprised by the need to use
this idiom.

Regards,

Dror

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Neil Conway
On Wed, 2003-10-08 at 21:44, Bruce Momjian wrote:
> Agreed.  Do we set them all to -O2, then remove it from the ones we
> don't get successful reports on?

I took the time to compile CVS tip with a few different machines from
HP's TestDrive program, to see if there were any regressions using the
new optimization flags:

(1) (my usual dev machine)

$ uname -a
Linux tokyo 2.4.19-xfs #1 Mon Jan 20 19:12:29 EST 2003 i686 GNU/Linux
$ gcc --version
gcc (GCC) 3.3.2 20031005 (Debian prerelease)

'make check' passes

(2)

$ uname -a
Linux spe161 2.4.18-smp #1 SMP Sat Apr 6 21:42:22 EST 2002 alpha unknown
$ gcc --version
gcc (GCC) 3.3.1

'make check' passes

(3)

$ uname -a
Linux spe170 2.4.17-64 #1 Sat Mar 16 17:31:44 MST 2002 parisc64 unknown
$ gcc --version
3.0.4

'make check' passes

BTW, this platform doesn't have any code written for native spinlocks.

(4)

$ uname -a
Linux spe156 2.4.18-mckinley-smp #1 SMP Thu Jul 11 12:51:02 MDT 2002
ia64 unknown
$ gcc --version

When you compile PostgreSQL without changing the CFLAGS configure picks,
the initdb required for 'make check' fails with:

[...]
initializing pg_depend... ok
creating system views... ok
loading pg_description... ok
creating conversions... ERROR:  could not identify operator 679

I tried to compile PostgreSQL with CFLAGS='-O0' to see if the above
resulted from an optimization-induced compiler error, but I got the
following error:

$ gcc -O0 -Wall -Wmissing-prototypes -Wmissing-declarations
-I../../../../src/include -D_GNU_SOURCE   -c -o xlog.o xlog.c
../../../../src/include/storage/s_lock.h: In function `tas':
../../../../src/include/storage/s_lock.h:125: error: inconsistent
operand constraints in an `asm'

Whereas this works fine:

$ gcc -O2 -Wall -Wmissing-prototypes -Wmissing-declarations
-I../../../../src/include -D_GNU_SOURCE   -c -o xlog.o xlog.c
$

BTW, line 138 of s_lock.h is:

#if defined(__arm__) || defined(__arm__)

That seems a little redundant.

Anyway, I tried running initdb after compiling all of pgsql with '-O0',
except for the files that included s_lock.h, but make check still
failed:

creating information schema... ok
vacuuming database template1...
/house/neilc/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/initdb: 
line 882: 22035 Segmentation fault  (core dumped) "$PGPATH"/postgres $PGSQL_OPT 
template1 >/dev/null  <

Re: [PERFORM] OFFTOPIC: PostgreSQL vs MySQL

2003-10-09 Thread Dennis Bjorklund
On Thu, 9 Oct 2003, David Griffiths wrote:

> > > "PostgreSQL supports constraints. MySQL doesn't; programmers need to
> > > take care of that from the client side"
> > > Again, InnoDB supports constraints.
> >
> > Really?  This is news.   We did some tests on constraints on InnoDB, and
> > found that while they parsed, they were not actually enforced. Was 
> > our test in error?
> 
> You may have turned them off to load data? I've run into constraints
> when my data-load script missed some rows in address_type. When it went
> to do the address_list table, all rows that had the missing address_type
> failed, as they should. I saw no weakness in the constraints.

It sounds like you talk about foreign keys only, while the previous writer 
talks about other constraints as well. For example, in postgresql you 
can do:

CREATE TABLE foo (
  x int,

  CONSTRAINT bar CHECK (x > 5)
);

and then

# INSERT INTO foo VALUES (4);
ERROR:  ExecInsert: rejected due to CHECK constraint "bar" on "foo"


I don't know MySQL, but I've got the impression from other posts on the
lists that innodb supports foreign keys only. I might be wrong though.

-- 
/Dennis




Re: [PERFORM] Speeding up Aggregates

2003-10-09 Thread Bruno Wolff III
On Thu, Oct 09, 2003 at 17:44:46 -0700,
  Dror Matalon <[EMAIL PROTECTED]> wrote:
> 
> How is doing order by limit 1 faster than doing max()? Seems like the
> optimizer will need to sort or scan the data set either way. That part
> didn't actually make a difference in my specific case.

max() will never be evaluated by using an index to find the greatest value.
So in many cases using order by and limit 1 is faster.
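For readers finding this thread in the archives, the rewrite being discussed looks like the following (table and column names are made up for illustration; run against your own schema):

```sql
-- Always a sequential scan in PostgreSQL of this era: the aggregate is
-- opaque to the planner, so it cannot use an index on dtstamp.
SELECT max(dtstamp) FROM items;

-- Equivalent result, but the planner can walk an index on dtstamp
-- backwards and stop after the first row it finds.
SELECT dtstamp FROM items ORDER BY dtstamp DESC LIMIT 1;
```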

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Speeding up Aggregates

2003-10-09 Thread Dror Matalon
On Thu, Oct 09, 2003 at 07:07:00PM -0400, Greg Stark wrote:
> Dror Matalon <[EMAIL PROTECTED]> writes:
> 
> > Actually what finally solved the problem is repeating the 
> > dtstamp > last_viewed
> > in the sub select
> 
> That will at least convince the optimizer to use an index range lookup. But it
> still will have to scan every record that matches channel==$1, link==$2, and
> dtstamp>$3.
> 
> The trick of using limit 1 will be faster still as it only has to retrieve a
> single record using the index. But you have to be sure to convince it to use

How is doing order by limit 1 faster than doing max()? Seems like the
optimizer will need to sort or scan the data set either way. That part
didn't actually make a difference in my specific case.


> the index and the way to do that is to list exactly the same columns in the
> ORDER BY as are in the index definition. 
> 
> Even if some of the leading columns are redundant because they'll be constant
> for all of the records retrieved. The optimizer doesn't know to ignore those.

The main problem in my case was that the optimizer was doing the max()
on all 700 rows, rather than the filtered rows. It wasn't until I put the
"dtstamp > last_viewed" condition in the sub select as well as in the main
query that it realized it could first filter out the 696 rows and then do
the max() on the 4 rows that satisfied this constraint.

That was the big saving.

Hope this all makes sense,

Dror
> 
> > > (This is the thing i pointed out previously in
> > > <[EMAIL PROTECTED]> on Feb 13th 2003 on pgsql-general)
> 
> -- 
> greg
> 

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com



Re: [PERFORM] further testing on IDE drives

2003-10-09 Thread Bruce Momjian

How did this drive come configured by default?  Write cache disabled?

---

scott.marlowe wrote:
> On Thu, 2 Oct 2003, scott.marlowe wrote:
> 
> > I was testing to get some idea of how to speed up the speed of pgbench 
> > with IDE drives and the write caching turned off in Linux (i.e. hdparm -W0 
> > /dev/hdx).
> > 
> > The only parameter that seems to make a noticeable difference was setting 
> > wal_sync_method = open_sync.  With it set to either fsync, or fdatasync, 
> > the speed with pgbench -c 5 -t 1000 ran from 11 to 17 tps.  With open_sync 
> > it jumped to the range of 45 to 52 tps.  With write cache on I was getting 
> > 280 to 320 tps.  So, now instead of being 20 to 30 times slower, I'm only 
> > about 5 times slower, much better.
> > 
> > Now I'm off to start a "pgbench -c 10 -t 1" and pull the power cord 
> > and see if the data gets corrupted with write caching turned on, i.e. do 
> > my hard drives have the ability to write at least some of their cache 
> > during spin down.
> 
> OK, back from testing.
> 
> Information:  Dual PIV system with a pair of 80 gig IDE drives, model 
> number: ST380023A (seagate).  File system is ext3 and is on a separate 
> drive from the OS.
> 
> These drives DO NOT write cache when they lose power.  Testing was done by 
> issuing a 'hdparm -W0/1 /dev/hdx' command where x is the real drive 
> letter, and 0 or 1 was chosen in place of 0/1.  Then I'd issue a 'pgbench 
> -c 50 -t 1' command, wait for a few minutes, then pull the power 
> cord.
> 
> I'm running RH linux 9.0 stock install, kernel: 2.4.20-8smp.
> 
> Three times pulling the plug with 'hdparm -W0 /dev/hdx' resulted in a 
> machine that would boot up, recover with journal, and a database that came 
> up within about 30 seconds, with all the accounts still intact.
> 
> Switching the caching back on with 'hdparm -W1 /dev/hdx' and doing the 
> same 'pgbench -c 50 -t 1' resulted in a corrupted database each 
> time.
> 
> Also, I tried each of the following fsync methods: fsync, fdatasync, and
> open_sync with write caching turned off.  Each survived a power off test 
> with no corruption of the database.  fsync and fdatasync result in 11 to 
> 17 tps with 'pgbench -c 5 -t 500' while open_sync resulted in 45 to 55 
> tps, as mentioned in the previous post.
> 
> I'd be interested in hearing from other folks which sync method works 
> for them and whether or not there are any IDE drives out there that can 
> write their cache to the platters on power off when caching is enabled.
> 
> 
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] further testing on IDE drives

2003-10-09 Thread Bruce Momjian
scott.marlowe wrote:
> I was testing to get some idea of how to speed up the speed of pgbench 
> with IDE drives and the write caching turned off in Linux (i.e. hdparm -W0 
> /dev/hdx).
> 
> The only parameter that seems to make a noticeable difference was setting 
> wal_sync_method = open_sync.  With it set to either fsync, or fdatasync, 
> the speed with pgbench -c 5 -t 1000 ran from 11 to 17 tps.  With open_sync 
> it jumped to the range of 45 to 52 tps.  With write cache on I was getting 
> 280 to 320 tps.  So, now instead of being 20 to 30 times slower, I'm only 
> about 5 times slower, much better.
> 
> Now I'm off to start a "pgbench -c 10 -t 1" and pull the power cord 
> and see if the data gets corrupted with write caching turned on, i.e. do 
> my hard drives have the ability to write at least some of their cache 
> during spin down.

Is this a reason we should switch to open_sync as a default, if it is
available, rather than fsync?  I think we are doing a single write before
fsync a lot more often than we are doing multiple writes before fsync.
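For reference, the setting under discussion is a one-line change in postgresql.conf; whether it helps or hurts is hardware- and kernel-dependent, so benchmark before adopting it:

```
# postgresql.conf (excerpt)
wal_sync_method = open_sync   # write WAL with O_SYNC instead of write()+fsync()
```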

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] backup/restore - another area.

2003-10-09 Thread Greg Stark


Jeff <[EMAIL PROTECTED]> writes:

> Idea #1:
> Use an LVM and take a snapshot - archive that.
> From the way I see it, the downside is the LVM will use a lot of space
> until the snapshot is removed.  Also PG may be in a slightly inconsistent
> state - but this should "appear" to PG the same as if the power went out.
> 
> For restore, simply unarchive this snapshot and point postgres at it. Let
> it recover and you are good to go.
> 
> Little overhead from what I see...
> I'm leaning towards this method the more I think of it.

I don't quite follow your #2 so I can only comment on the above idea of using
an LVM snapshot. If you have the hardware and the LVM-fu to be able to do this
properly I would recommend it.

We actually used to do this with veritas even on Oracle which has full online
backup support simply because it was much much faster and the snapshot could
be backed up during peak times without any significant performance impact.
That's partly because Veritas and Hitachi storage systems kick butt though.
Depending on the systems you're considering you may or may not have nearly the
same success.

Note, you should *test* this backup. You're depending on some subtle semantics
with this. If you do it slightly wrong or the LVM does something slightly
wrong and you end up with an inconsistent snapshot or missing some critical
file the whole backup could be useless.

Also, I wouldn't consider this a replacement for having a pg_dump export. In a
crisis when you want to restore everything *exactly* the way things were you
want the complete filesystem snapshot. But if you just want to load a table
the way it was the day before to compare, or if you want to load a test box to
do some performance testing, or whatever, you'll need the logical export.
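A minimal sketch of the snapshot procedure being described, assuming an LVM2 setup with the data directory on a logical volume named /dev/vg0/pgdata and free space in the volume group (all names and sizes are hypothetical; as noted above, test the restore path before trusting it):

```shell
# Create a copy-on-write snapshot while postgres keeps running.
lvcreate --snapshot --size 5G --name pgsnap /dev/vg0/pgdata

# Mount it read-only and archive the frozen view of the data directory.
mount -o ro /dev/vg0/pgsnap /mnt/pgsnap
tar czf /backups/pgdata-$(date +%Y%m%d).tar.gz -C /mnt/pgsnap .

# Drop the snapshot so it stops accumulating copy-on-write overhead.
umount /mnt/pgsnap
lvremove -f /dev/vg0/pgsnap
```

On restore, postgres started against the unpacked directory replays WAL as if recovering from a power failure, which is exactly the "appear the same as if the power went out" property the original post relies on.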

-- 
greg




Re: [PERFORM] Speeding up Aggregates

2003-10-09 Thread Greg Stark
Dror Matalon <[EMAIL PROTECTED]> writes:

> Actually what finally solved the problem is repeating the 
> dtstamp > last_viewed
> in the sub select

That will at least convince the optimizer to use an index range lookup. But it
still will have to scan every record that matches channel==$1, link==$2, and
dtstamp>$3.

The trick of using limit 1 will be faster still as it only has to retrieve a
single record using the index. But you have to be sure to convince it to use
the index and the way to do that is to list exactly the same columns in the
ORDER BY as are in the index definition. 

Even if some of the leading columns are redundant because they'll be constant
for all of the records retrieved. The optimizer doesn't know to ignore those.
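Spelled out with hypothetical names, and assuming an index on (channel, link, dtstamp), the advice above looks like this:

```sql
-- The index the query is expected to use:
CREATE INDEX items_chan_link_dt ON items (channel, link, dtstamp);

-- To get the "max" via the index, name every index column in the ORDER BY,
-- including channel and link, which the WHERE clause pins to constants:
SELECT dtstamp
  FROM items
 WHERE channel = $1 AND link = $2 AND dtstamp > $3
 ORDER BY channel DESC, link DESC, dtstamp DESC
 LIMIT 1;
```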

> > (This is the thing i pointed out previously in
> > <[EMAIL PROTECTED]> on Feb 13th 2003 on pgsql-general)

-- 
greg




Re: [PERFORM] Compare rows

2003-10-09 Thread Gaetano Mendola
Greg Spiegelberg wrote:
> Josh Berkus wrote:
>> As you can see, the NULLs are not stored, making this system much more
>> efficient on storage space.
>>
>> Tomorrow I'll (hopefully) write up how to query this for
>> comparisons.  It would help if you gave a little more details about
>> what specific comparison you're doing, e.g. between tables or table to
>> value, comparing just the last value or all rows, etc.
>
> Got it.  I can see how it would be more efficient in storing.  At this
> point it would require a lot of query and code rewrites to handle it.
> Fortunately, we're looking for alternatives for the next revision and
> we're leaving ourselves open for a rewrite, much to the boss's chagrin.

I'm not sure about the saving in storage.  See Hannu Krosing's
arguments.

Regards,
Gaetano Mendola
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] OFFTOPIC: PostgreSQL vs MySQL

2003-10-09 Thread David Griffiths
> Thanks for being considerate, thorough, and honest about your opinions.
> Particularly that you didn't simply depart in a huff.

Why would I depart in a huff? I was just trying to make a few objective
observations.

I really have no biases; I like what I've seen in MySQL, and I like a lot
of the more Oracle-like features in Postgres.

> > 4) we looked at MySQL first (we needed replication, and eRServer had not
> > been open-sourced when we started looking)
>
> I can't do anything about that, now can I?

My point was that it's since been open-sourced; it just means I've looked
longer at MySQL, as it had replication when we started looking.

> Have you checked these pages?  They've been posted on this list numerous
> times:
> http://techdocs.postgresql.org
> http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
> http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html

Those are much more instructive; I'm curious - why aren't they in the
administrator's section of the docs?

> We've been working on this on the advocacy list  that is, giving an
> accurate listing of PostgreSQL features not posessed by MySQL (same for
> Oracle and DB2 as well, MySQL is just easier to start because we don't
> have to worry about being sued).  I'd appreciate it if you'd take an
> interest in that document and revise anything which is inaccurate or
> pejorative.

I might be able to provide some insight, but I've only been working with
MySQL for a month or so (Oracle for about 8 years).

> > "PostgreSQL supports constraints. MySQL doesn't; programmers need to
> > take care of that from the client side"
> > Again, InnoDB supports constraints.
>
> Really?  This is news.  We did some tests on constraints on InnoDB, and
> found that while they parsed, they were not actually enforced.  Was our
> test in error?

You may have turned them off to load data? I've run into constraints when my
data-load script missed some rows in address_type. When it went to do the
address_list table, all rows that had the missing address_type failed, as
they should. I saw no weakness in the constraints.


> > Maybe the Postgres community needs an anti-FUD individual or two; people
> > that know both databases, and can provide the proper information for
> > answering questions like this. A section in the docs would help as well.
> > Yes, I know many of the people advocating Postgres do not want to
> > compare themselves to MySQL (but rather to Oracle, Sybase, DB2, etc) ,
> > but the volume of responses on a thread like this indicates that the
> > comparison is going to happen regardless. Better to nip it in the bud
> > quickly than let it go on over 3-4 days.
>
> Would you care to volunteer?   We'd be glad to have you.

Maybe once all this database testing is done; it's extra work on top of an
already heavy load (add a new baby, and free time goes right down the toilet).

I need to figure out my performance issues with Postgres, finish my
benchmark suite, test a bunch of databases, argue with the CTO, and then
start migrating.

I'll be sure to post my results to the [EMAIL PROTECTED] along with
the tests.

David.



Re: [PERFORM] Compare rows, SEMI-SUMMARY

2003-10-09 Thread Greg Spiegelberg
Per Josh's recommendation to implement a Vertical Child Table I came
up with 3 possible tables to handle the 3 possible value types: varchar,
numeric and bigint.  Each table has 7 columns: 1 to denote the time the
data was collected, 4 which identify where the data came from, 1 to
tell me the value name and the last being the value itself.
                 OLD          NEW
tables           1            3
columns          642          7 each
indexes          ~1200        39
views            37           ?
rows             1700-3000    30,000
query on table   0.01 sec     0.06 sec
query on view    0.02 sec     ?
Not too bad.  Guess there were a few 0's and NULL's out there, eh?

642 * 1,700= 1,091,400 cells
3 * 7 * 30,000 =   630,000 cells
   461,400 NULL's and 0's using the big 'ol table
I can get around in this setup, however, I would appreciate some help
in recreating my views.  The views used to be there simply as an initial
filter and to hide all the 0's and NULL's.  If I can't do this I will
be revisiting and testing possibly hundreds of programs and scripts.
Any takers?
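As a sketch of the layout described above (column names are guesses for illustration, not the actual schema), one of the three value tables plus a view that restores the old "hide the 0's and NULL's" behavior might look like:

```sql
-- One of the three vertical child tables, here the numeric one:
-- 1 time column + 4 source-identifying columns + name + value = 7 columns.
CREATE TABLE vals_numeric (
    collected  timestamp   NOT NULL,
    site       varchar(32) NOT NULL,
    host       varchar(64) NOT NULL,
    device     varchar(64) NOT NULL,
    port       varchar(32) NOT NULL,
    name       varchar(64) NOT NULL,
    value      numeric
);

-- A view acting as the initial filter the old views provided.
CREATE VIEW vals_numeric_nonzero AS
    SELECT collected, site, host, device, port, name, value
      FROM vals_numeric
     WHERE value IS NOT NULL AND value <> 0;
```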

Greg

--
Greg Spiegelberg
 Sr. Product Development Engineer
 Cranel, Incorporated.
 Phone: 614.318.4314
 Fax:   614.431.8388
 Email: [EMAIL PROTECTED]
Cranel. Technology. Integrity. Focus.




Re: [pgsql-advocacy] [PERFORM] OFFTOPIC: PostgreSQL vs MySQL

2003-10-09 Thread Josh Berkus
Scott,

> any chance of getting the perf.html file from varlena folded into the main 
> documentation tree somewhere?  it's a great document, and it would 
> definitely help if the tuning section of the main docs said "For a more 
> thorough examination of postgresql tuning see this:" and pointed to it.

Actually, I'm working on that this weekend.

-- 
-Josh Berkus

__AGLIO DATABASE SOLUTIONS___
Josh Berkus
   Complete information technology  [EMAIL PROTECTED]
and data management solutions   (415) 565-7293
   for law firms, small businesses   fax 621-2533
and non-profit organizations.   San Francisco




Re: [PERFORM] PostgreSQL vs MySQL

2003-10-09 Thread Robert Treat
On Thu, 2003-10-09 at 13:30, David Griffiths wrote:
> I also have to admit a bit of irritation reading this thread; there is a
> fair number of incorrect statements on this thread that, while not
> wrong, definitely aren't right:
>  
> "Speed depends on the nature of use and the complexity of queries.  If
> you are doing updates of related tables, ACID is of vital importance and
> MySQL doesn't provide it."
> MySQL has ACID in InnoDB. 

Actually it only kinda sorta has ACID.  As Jeff mentioned, and it can be
expanded upon, mysql has a nasty habit of transforming invalid data into
something that will insert into a table without telling you about it. I
think Josh mentioned reports that it ignores some constraint
definitions.  And then there's the whole issue that mixing MyISAM and
InnoDB tables completely breaks the ability to roll back transactions...

>  
> "using InnoDB tables (the only way to have foreign keys, transactions,
> and row level locking for MySQL) makes MySQL slower and adds complexity
> to tuning the database"
> Adding this: "innodb_flush_method=O_DSYNC" to the my.cnf made InnoDB as
> fast as MyISAM in our tests. It doesn't turn off disk flushing; it's
> just a flush method that might work better with different kernels and
> drives; it's one of those "play with this and see if it helps"
> parameters; there are lots of those in Postgres, it seems. There are 10
> variables for tuning InnoDB (and you don't have to tune for MyISAM, so
> it's actually a six-of-one, half-dozen-of-the-other). Setup between the
> two seems to be about the same.

Well, I've yet to see MySQL benchmark themselves vs. the big boys using
InnoDB tables, I'm only guessing that it's because those tables are
slower. (Well, guessing and calling upon experience.) Sure, there may be
workarounds, but that does add a certain complexity. (Bonus for us,
PostgreSQL is just complex from the get go :-P )

>  
> "PostgreSQL supports constraints. MySQL doesn't; programmers need to
> take care of that from the client side"
> Again, InnoDB supports constraints.
>  

We've seen evidence it doesn't. If they've fixed this great. Of course
I'll quote from the mysql docs 

"InnoDB allows you to drop any table even though that would break the
foreign key constraints which reference the table."  

Last I knew, it did this silently and without warning. There are other
issues as well, so its support is relative...

> "Transactions: We've been here before. Suffice to say, MySQL+InnoDB is
> almost there. Plain ol' MySQL doesn't have it, which tells you something
> about their philosophy towards database design."
> InnoDB supports transactions very nicely, has the equivalent of WAL, and
> one thing I really like: a tablespace (comprised of data files that can
> be spread around multiple hard drives), and in a month or so, InnoDB
> will support multiple tablespaces.
>  

Just don't mix InnoDB and MyISAM tables together or you could end up in
a world of trouble... it's unfortunate that this breaks one of the main
keys to building a DBMS, namely hiding implementation details from the
end users. 

> Maybe the Postgres community needs an anti-FUD individual or two; people
> that know both databases, and can provide the proper information for
> answering questions like this. 

Well, among the major advocacy folk we do have a mantra about no FUD,
but these are public lists so we can't really stop people from posting. 
Of course this overlooks the fact that different people interpret
different information differently. (heh)  Take this quote I saw posted
in a non postgresql forum a while back: "MySQL doesn't fully support
subqueries" which of course created a slew of posts about FUD and
postgresql users being idiots. Of course, when the poster responded back
with the question "Can mysql do subselects in the SELECT, FROM, and
WHERE clauses like postgresql, and nest subselects within those
subselects?" it stopped everyone in their tracks...

> A section in the docs would help as well.

In the docs, no; on techdocs, maybe. 

> Yes, I know many of the people advocating Postgres do not want to
> compare themselves to MySQL (but rather to Oracle, Sybase, DB2, etc) ,
> but the volume of responses on a thread like this indicates that the
> comparison is going to happen regardless. Better to nip it in the bud
> quickly than let it go on over 3-4 days.
>  

It was due to the help of postgresql users that the following site has
become available: http://sql-info.de/mysql/gotchas.html
I'd suggest you look it over if you're trying to evaluate a switch from
Oracle to MySQL.

And anyone is welcome, actually encouraged, to correct erroneous
information they see posted about any system on these lists. God bless
if you're willing to try and follow every list every day to watch for
these types of posts. 


> One last observation: someone looking at both databases, reading those
> posts, might get a bad impression of Postgres based on the inconsistency
> and incorrectness of some of the statements 

Re: [PERFORM] PostgreSQL vs MySQL

2003-10-09 Thread Jason Hihn
I concur 100%. PostgreSQL was big and scary and MySQL seemed cute and
cuddly, warm and fuzzy. Then I took my undergrad CS RDBMS course (a course
that focused on designing the backend software), and only then was I ready
to appreciate and wield the battle axe that is PostgreSQL.

He also let me use PostgreSQL for my final project (the standard was
Oracle). I got an A. :)

I do have to admit that I prefer OSS (and its docs) to proprietary. I
had some Informix work and that was not fun at all. So even though MySQL
is pink fuzzy bunnies, PostgreSQL is at least a brown fuzzy bunny [to me
anyway].

> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of
> scott.marlowe
> Sent: Thursday, October 09, 2003 3:26 PM
> To: Jeff
> Cc: David Griffiths; [EMAIL PROTECTED]
> Subject: Re: [PERFORM] PostgreSQL vs MySQL
>
>
> On Thu, 9 Oct 2003, Jeff wrote:
>
> > On Thu, 9 Oct 2003, David Griffiths wrote:
> >
> > > 1) the MySQL docs are better (sorry - I found them easier to read, and
> > > more comprehensive; I had an easier time finding the answers I needed)
> >
> > Huh. I had the opposite experience. Each to his own.
> > I think everybody agrees PG needs a better tuning doc (or
> pointers to it,
> > or something).
>
> I think the issue is that Postgresql documentation is oriented
> towards DBA
> types, who already understand databases in general, so they can find what
> they want, while MySQL docs are oriented towards dbms newbies, who don't
> know much, if anything, about databases.
>
>
> ---(end of broadcast)---
> TIP 9: the planner will ignore your desire to choose an index scan if your
>   joining column's datatypes do not match
>


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] PostgreSQL vs MySQL

2003-10-09 Thread scott.marlowe
On Thu, 9 Oct 2003, Jeff wrote:

> On Thu, 9 Oct 2003, David Griffiths wrote:
> 
> > 1) the MySQL docs are better (sorry - I found them easier to read, and
> > more comprehensive; I had an easier time finding the answers I needed)
> 
> Huh. I had the opposite experience. Each to his own.
> I think everybody agrees PG needs a better tuning doc (or pointers to it,
> or something).

I think the issue is that Postgresql documentation is oriented towards DBA 
types, who already understand databases in general, so they can find what 
they want, while MySQL docs are oriented towards dbms newbies, who don't 
know much, if anything, about databases.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[PERFORM] backup/restore - another area.

2003-10-09 Thread Jeff
Boy, I must be getting annoying by now huh?

Anyway, after the joys of Solaris being fast I'm moving onto another area
- backup & restore.  I've been checking the archives and haven't seen any
"good" tips for backing up big databases (and more importantly,
restoring).

I've noticed while doing a backup (with both -Fc and the regular recipe) that
my IO is nowhere near being stressed.  According to vmstat, it sits
around reading about 512kB/sec (with occasional spikes) and every 5-6
seconds it writes out a 3MB hunk.

So as a test I decided to cp a 1GB file and got a constant read speed of
20MB/sec; the writes were more sporadic (buffering, most likely)
and it would write out 60MB every 3 seconds.

And then, on the restore, I notice similar things - IO hardly being
stressed at all... reading in at ~512kB/sec and every now and then writing
out a few MB.


So, I've been thinking of various backup/restore strategies... some I'm
sure people already use, some would need code written and may be controversial.

Idea #1:
Use an LVM and take a snapshot - archive that.
From the way I see it, the downside is the LVM will use a lot of space
until the snapshot is removed.  Also PG may be in a slightly inconsistent
state - but this should "appear" to PG the same as if the power went out.

For restore, simply unarchive this snapshot and point postgres at it. Let
it recover and you are good to go.

Little overhead from what I see...
I'm leaning towards this method the more I think of it.
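
A minimal sketch of what Idea #1 could look like, assuming an LVM2 setup; the volume group (vg0), logical volume (pgdata), mount points, and snapshot size are all hypothetical, so the commands are only echoed here rather than executed:

```shell
# Dry-run sketch of an LVM-snapshot backup of $PGDATA.
# All device names, mount points, and sizes below are hypothetical.
run() { echo "would run: $*"; }

# 1. Snapshot the volume holding $PGDATA. The 1G of copy-on-write
#    space is a guess; it must cover writes made during the backup.
run lvcreate --snapshot --size 1G --name pgsnap /dev/vg0/pgdata

# 2. Mount the snapshot read-only and archive it.
run mount -o ro /dev/vg0/pgsnap /mnt/pgsnap
run tar czf /backup/pgdata-snapshot.tar.gz -C /mnt/pgsnap .

# 3. Drop the snapshot so it stops consuming copy-on-write space.
run umount /mnt/pgsnap
run lvremove -f /dev/vg0/pgsnap
```

On restore, unpack the archive into a fresh $PGDATA and start the postmaster; it will run crash recovery, just as if the power had gone out at snapshot time.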

Idea #2:

a new program/internal "system". Let's call it pg_backup. It would generate
a very fast backup (that restores very fast) at the expense of disk space.
Pretty much what we would do is write out new copies of all the pages in
the db - both indexes and tables.

The pros of this are that it does not depend on an LVM and is therefore
accessible on all platforms.  It also has the other benefits mentioned
above, except speed.

For a restore PG would need something like a 'restore mode' where we can
just have it pump pages into it somehow. It would not have to build
indexes, check constraints, and all that, because by definition the backup
would contain valid data.

The downside for both of these is that the backup is only good for that
version of PG on that architecture.  Speaking from the Informix world, this is
how it is - it has a fast backup & fast restore that does essentially #2,
and then it has export/import options (which work like our current pg_dump and
restore).

And oh yeah - I've tried disabling fsync on load, and while it did go faster
it was only 2 minutes faster (9m vs 11m).

Any thoughts on this? What do you other folk with big dbs do?

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [pgsql-advocacy] [PERFORM] OFFTOPIC: PostgreSQL vs MySQL

2003-10-09 Thread scott.marlowe
On Thu, 9 Oct 2003, Josh Berkus wrote:

> David Griffiths wrote: 
> > With regards to #1, I'd like to specifically mention tuning - the docs
> > at http://www.postgresql.org/docs/7.3/static/runtime-config.html
> >   give a
> 
> Have you checked these pages?  They've been posted on this list numerous 
> times:
> http://techdocs.postgresql.org
> http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
> http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
> 
> Also, the runtime docs are being improved in 7.4:
> http://developer.postgresql.org/docs/postgres/runtime-config.html
> ... and I'm still working on more general "how to" text.

any chance of getting the perf.html file from varlena folded into the main 
documentation tree somewhere?  it's a great document, and it would 
definitely help if the tuning section of the main docs said "For a more 
thorough examination of postgresql tuning see this:" and pointed to it.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Bruce Momjian
Christopher Browne wrote:
> [EMAIL PROTECTED] (Bruce Momjian) writes:
> > 5) How can I compile for optimum performance?
> >
> > Try using the "-fast" compile flag.  The binaries might not be portable to
> > other Solaris systems, and you might need to compile everything that links
> > to PostgreSQL with "-fast", but PostgreSQL will run significantly faster,
> > 50% faster on some tests.
> 
> You might also mention something like the following:
> 
>   If you are compiling using GCC, you will quite likely want to add in
>   the "-O2" compile flag.

We already do that by default in current CVS for gcc, and -O for
non-gcc.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Christopher Browne
[EMAIL PROTECTED] (Bruce Momjian) writes:
> 5) How can I compile for optimum performance?
>
> Try using the "-fast" compile flag.  The binaries might not be portable to
> other Solaris systems, and you might need to compile everything that links
> to PostgreSQL with "-fast", but PostgreSQL will run significantly faster,
> 50% faster on some tests.

You might also mention something like the following:

  If you are compiling using GCC, you will quite likely want to add in
  the "-O2" compile flag.
-- 
let name="cbbrowne" and tld="libertyrms.info" in String.concat "@" [name;tld];;

Christopher Browne
(416) 646 3304 x124 (land)

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [pgsql-advocacy] [PERFORM] OFFTOPIC: PostgreSQL vs MySQL

2003-10-09 Thread Josh Berkus
Nick,

> Josh- It would be great to have a link to those last two excellent resources
> from the techdocs area- perhaps from the "optimizing" section in
> http://techdocs.postgresql.org/oresources.php. Who should we suggest this
> to? (I submitted these using the form in that area, but you may have better
> connections.)

This is my  responsibility;  I'll add it to the list.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] IMMUTABLE function's flag do not work: 7.3.4, plpgsql

2003-10-09 Thread Gaetano Mendola
Andriy Tkachuk wrote:
On Wed, 8 Oct 2003, Tom Lane wrote:


Andriy Tkachuk <[EMAIL PROTECTED]> writes:

At second. calc_total() is immutable function:
but it seems that it's not cached in one session:
It's not supposed to be.


but it's written in the doc:

  IMMUTABLE indicates that the function always  returns  the  same
  result when given the same argument values; that is, it does not
  do database lookups or otherwise use  information  not  directly
  present in its parameter list. If this option is given, any call
  of the function with all-constant arguments can  be  immediately
  replaced with the function value.
The doc says "can be", not "must" or "will be".



Regards
Gaetano Mendola
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Any 7.4 w32 numbers in yet?

2003-10-09 Thread Bruce Momjian
Jason Hihn wrote:
> I am very interested in the non-Cygwin windows port. Looking over the 7.4
> beta release, it looks like the code made it in. I read through the win32
> related docs, to find out that they are out-of-date instructions (11/2002).
> I do hope these get updated with the native windows stuff.
> 
> But I came here to ask more about the performance of pg-w32. Did it take a
> hit? Is it faster (than Cygwin, than Unix)? Stability? I saw there were some
> mailings about file-moving race conditions, links and such.

See:

http://momjian.postgresql.org/main/writings/pgsql/win32.html

We don't have it running yet.  It will be running in 7.5.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


[PERFORM] Any 7.4 w32 numbers in yet?

2003-10-09 Thread Jason Hihn
I am very interested in the non-Cygwin windows port. Looking over the 7.4
beta release, it looks like the code made it in. I read through the win32
related docs, to find out that they are out-of-date instructions (11/2002).
I do hope these get updated with the native windows stuff.

But I came here to ask more about the performance of pg-w32. Did it take a
hit? Is it faster (than Cygwin, than Unix)? Stability? I saw there were some
mailings about file-moving race conditions, links and such.

Thanks.

Jason Hihn
Paytime Payroll



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [pgsql-advocacy] [PERFORM] OFFTOPIC: PostgreSQL vs MySQL

2003-10-09 Thread Nick Fankhauser
> Have you checked these pages?  They've been posted on this list numerous
> times:
> http://techdocs.postgresql.org
> http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
> http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
>

Josh- It would be great to have a link to those last two excellent resources
from the techdocs area- perhaps from the "optimizing" section in
http://techdocs.postgresql.org/oresources.php. Who should we suggest this
to? (I submitted these using the form in that area, but you may have better
connections.)

-Nick



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Bruce Momjian
Jeff wrote:
> We're keeping the -O2 for gcc in the template and moving the mention of
> -fast to the FAQ, correct?

gcc gets -O2, non-gcc gets -O, and -fast is in the FAQ, yea.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] go for a script! / ex: PostgreSQL vs. MySQL

2003-10-09 Thread Sean Chittenden
> Yeah, I had similar thought to Oliver's and suspected that this
> would be the answer.  Also, while it's not too hard to do this for a
> single platform, it gets complicated once you start looking at
> different ones.
> 
> Josh, let me know when you're ready to do this. I'll try to help,
> although my perl's kind of rusty. Also, can you even assume perl for
> a postgres install? Does Solaris, for instance come with perl?

Um, why not wait until the C version of initdb is committed, then
stake out a section that'll allow us to submit patches to have initdb
autotune to our hearts' content?  There's a tad bit of precedent in
having shared buffers automatically set in initdb; why not continue
with it?  I know under FreeBSD initdb will have some #ifdefs to wrap
around the syscall sysctl() to get info about kernel bits.  Talking
about how to handle this gracefully for a gazillion different
platforms might be a more useful discussion at this point, because I'm
sure people on their native OS will be able to contribute the necessary
patches to extract info from their OS so that initdb can make useful
decisions.  Or, lastly, does anyone think that this should be in a
different, external program?  -sc

-- 
Sean Chittenden

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Jeff
We're keeping the -O2 for gcc in the template and moving the mention of
-fast to the FAQ, correct?

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] OFFTOPIC: PostgreSQL vs MySQL

2003-10-09 Thread Josh Berkus
David,

Thanks for being considerate, thorough, and honest about your opinions.   
Particularly that you didn't simply depart in a huff.

> 1) the MySQL docs are better (sorry - I found them easier to read, and
> more comprehensive; I had an easier time finding the answers I needed)

I can believe that.   MySQL AB has paid documentation writers; we don't.   

> 2) there are more web pages devoted to MySQL (probably because it has a
> bit more market share)

Particularly among web developers.

> 3) there are more books on MySQL at the bookstore (I haven't had a
> chance to pick up Bruce's book yet; it might be all the book I'd ever
> need)

Bruce's book is out of date -- released in 1998.  I recommend Korry Douglas' 
book instead, just because of its up-to-date nature (printed late 2002 or 
early 2003).

> 4) we looked at MySQL first (we needed replication, and eRServer had not
> been open-sourced when we started looking)

I can't do anything about that, now can I?

> With regards to #1, I'd like to specifically mention tuning - the docs
> at http://www.postgresql.org/docs/7.3/static/runtime-config.html
>   give a

Have you checked these pages?  They've been posted on this list numerous 
times:
http://techdocs.postgresql.org
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html

Also, the runtime docs are being improved in 7.4:
http://developer.postgresql.org/docs/postgres/runtime-config.html
... and I'm still working on more general "how to" text.

> "I guess my point is simply this: instead of saying: "okay we use
> default settings that will run on _old_ hardware too" we should go for a
> little script that creates a "still safe but much better" config file.
> There's just no point in setting SHARED_BUFFERS to something like 16
> (what's the current default?) if the PC has >= 1 GB of RAM. Setting it
> to 8192 would still be safe, but 512 times better...  ;-) (IIRC 8192
> would take 64 MB of RAM, which should be safe if you leave the default
> MAX_CONNECTIONS.)" 

You'll be interested to know that SHARED_BUFFERS is actually addressed in the 
initdb script in 7.4.  However, many OSes have low limits on per-process 
memory that require the admin to modify the system configuration before 
postgresql.conf can be adjusted properly.  This makes writing a 
multi-platform tuning script a significant effort, and to date none of the 
people complaining about it the loudest have volunteered to do the work.

To reiterate my point above, PostgreSQL is a 100% volunteer Open Source 
project.  MySQL is a commercial company which distributes its products via 
Open Source licensing.  That makes some things easier for them than for us 
(and vice-versa, of course).

> I also have to admit a bit of irritation reading this thread; there is a
> fair number of incorrect statements on this thread that, while not
> wrong, definitely aren't right:

We've been working on this on the advocacy list: that is, giving an 
accurate listing of PostgreSQL features not possessed by MySQL (same for 
Oracle and DB2 as well; MySQL is just easier to start with because we don't 
have to worry about being sued).   I'd appreciate it if you'd take an 
interest in that document and revise anything which is inaccurate or 
pejorative.

Also, keep in mind that many members of the PostgreSQL community have "an axe 
to grind" about MySQL.  This is not only because of MySQL's eclipsing us in 
the popular press as "THE open source database"; it is also because prominent 
individuals at MySQL AB, particularly Monty and David Axmark, have in the 
past signaled their intent to rub out all other OSS databases, starting with 
PostgreSQL.   While this says little about the MySQL community, it does make 
members of our community very touchy when the "M" word comes up.

I quote the rest of your debunking for the benefit of the readers on the 
Advocacy list, with a couple of comments:

> "Speed depends on the nature of use and the complexity of queries.  If
> you are doing updates of related tables, ACID is of vital importance and
> MySQL doesn't provide it."
> MySQL has ACID in InnoDB. I've found that MySQL is actually very fast on
> complex queries w/InnoDB (six tables, 1 million rows, two of the joins
> are outer-joins. In fact, I can get InnoDB to be almost as fast as
> MyISAM. Complex updates are also very very fast. We have not tried
> flooding either database with dozens of complex statements from multiple
> clients; that's coming soon, and from what I've read, MySQL won't do too
> well.
>
> "using InnoDB tables (the only way to have foreign keys, transactions,
> and row level locking for MySQL) makes MySQL slower and adds complexity
> to tuning the database"
> Adding this: "innodb_flush_method=O_DSYNC" to the my.cnf made InnoDB as
> fast as MyISAM in our tests. It doesn't turn off disk flushing; it's
> just a flush method that might work

Re: [PERFORM] PostgreSQL vs MySQL

2003-10-09 Thread Jeff
On Thu, 9 Oct 2003, David Griffiths wrote:

> 1) the MySQL docs are better (sorry - I found them easier to read, and
> more comprehensive; I had an easier time finding the answers I needed)

Huh. I had the opposite experience. Each to his own.
I think everybody agrees PG needs a better tuning doc (or pointers to it,
or something).

> "Speed depends on the nature of use and the complexity of queries.  If
> you are doing updates of related tables, ACID is of vital importance and
> MySQL doesn't provide it."

I don't know if you looked at my presentation, but in preparation for it I
checked out MySQL 4.0.x [most recent stable]. I found that it violates the C
in ACID in some places, i.e. you can insert a date of 0000-00-00 and have it
sit there and be fine.  Perhaps this is the fault of mysql's timestamp
type.

> MyISAM. Complex updates are also very very fast. We have not tried
> flooding either database with dozens of complex statements from multiple
> clients;

You don't need complex statements to topple mysql over in high
concurrency. I was doing fairly simple queries with 20 load generators -
it didn't like it.  Not at all (mysql: 650 seconds pg: 220)

> 3) I see alot more corrupt-database bugs on the MySQL lists (most are
> MyISAM, but a few InnoDB bugs pop up from time to time) - way more than
> I see on the Postgres lists.

I saw this as well. I was seeing things in the changelog as late as
september (this year) about fixing bugs that cause horrific corruption.
That doesn't make me feel comfy.  Remember - in reality InnoDB is still
very new.  The PG stuff has been tinkered with for years.  I like
innovation and new things, but in some cases, I prefer the old code
that has been looked at for years.


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Bruce Momjian
Andrew Sullivan wrote:
> On Thu, Oct 09, 2003 at 01:04:23PM -0400, Jeff wrote:
> > 
> > So you think we should leave PG alone and let it run horrifically slowly?
> > Do you have a better idea of how to do this?
> 
> Given the point in the release cycle, mightn't the FAQ_Solaris or
> some other place be better for this for now?  I agree with the
> concern.  I'd rather have slow'n'stable than fast-but-broken.

FAQ added.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Compare rows

2003-10-09 Thread Hannu Krosing
Josh Berkus wrote on Thu, 09.10.2003 at 08:36:
> Chris,

> > The need to do a lot of joins would likely hurt performance somewhat,
> > as well as the way that it greatly increases the number of rows.
> > Although you could always split it into several tables, one for each
> > "value_type", and UNION them into a view...
> 
> It increases the number of rows, yes, but *decreases* the storage size of data 
> by eliminating thousands ... or millions ... of NULL fields. 

I'm not sure I buy that.

Null fields take exactly 1 *bit* to store (or more exactly, if you have
any null fields in a tuple then one 32-bit int for each 32 fields is used
for the NULL bitmap), whereas the same field in a "vertical" table takes 4
bytes for the primary key and 1-4 bytes for the category key, plus a tuple
header per value, plus the necessary indexes. So if you have more than one
non-null field per tuple you will certainly lose on storage. 

> How would splitting the vertical values into dozens of separate tables help things?

If you put each category in a separate table you save the 1-4 bytes per
value for the category key, but still store the primary key and tuple
header *per value*.

You may still get better performance for single-column comparisons, as
fewer pages must be touched.
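
A back-of-the-envelope sketch of this arithmetic; the row counts, the ~32 bytes of per-tuple overhead, and the 4-byte key/value sizes are illustrative assumptions, not measured numbers, and indexes are left out entirely:

```shell
# Hypothetical schema: 100,000 logical rows, 642 attribute columns,
# of which on average 80 are non-null per row. All sizes are assumed.
ROWS=100000
COLS=642
NONNULL=80
HDR=32            # assumed per-tuple overhead (header + item pointer)
VAL=4             # assume 4-byte values

# Wide table: one tuple per row; NULLs cost only the bitmap.
BITMAP=$(( (COLS + 7) / 8 ))
WIDE=$(( ROWS * (HDR + BITMAP + NONNULL * VAL) ))

# Vertical (EAV) table: one tuple per non-null value, each carrying
# a 4-byte primary key, a 4-byte category key, and the value, plus
# the per-tuple header.
VERT=$(( ROWS * NONNULL * (HDR + 4 + 4 + VAL) ))

echo "wide:     $WIDE bytes"
echo "vertical: $VERT bytes"
```

Under these assumptions the vertical layout comes out roughly eight times larger, because the per-tuple header is paid once per value instead of once per row; that is the point being made above.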

> Personally, I'd rather have a table with 3 columns and 8 million rows than a 
> table with 642 columns and 100,000 rows.  Much easier to deal with.

Same here ;)

--
Hannu


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] PostgreSQL vs MySQL

2003-10-09 Thread David Griffiths
This is a timely thread for myself, as I'm in the 
middle of testing both databases as an Oracle replacement.
 
As of this moment, I know more about MySQL (tuning, 
setup, features) than I do about Postgres. Not because I like MySQL more, but 
because
 
1) the MySQL docs are better (sorry - I found 
them easier to read, and more comprehensive; I had an easier time finding the 
answers I needed)
2) there are more web pages devoted to MySQL 
(probably because it has a bit more market share)
3) there are more books on MySQL at the 
bookstore (I haven't had a chance to pick up Bruce's book yet; it might be all 
the book I'd ever need)
4) we looked at MySQL first (we needed 
replication, and eRServer had not been open-sourced when we started 
looking)
 
With regards to #1, I'd like to specifically 
mention tuning - the docs at http://www.postgresql.org/docs/7.3/static/runtime-config.html give 
a basic explanation of the different options, but much more is needed for 
tuning. I'm running into a problem with an update statement (that uses a select 
in a sub-query) in Postgres - it's taking hours to run (the equivalent, using a 
multi-table update statement in MySQL instead of a sub-query, takes all of 2 
seconds). I'll be posting it later once I do more reading to make sure I've done 
as much as I can to solve it myself.
 
I really agree with this post:
 
"I guess my point is simply this: instead of 
saying: "okay we use default settings that will run on _old_ hardware too" we 
should go for a little script that creates a "still safe but much better" config 
file. There's just no point in setting SHARED_BUFFERS to something like 16 
(what's the current default?) if the PC has >= 1 GB of RAM. Setting it to 
8192 would still be safe, but 512 times better...  ;-) (IIRC 8192 would 
take 64 MB of RAM, which should be safe if you leave the default 
MAX_CONNECTIONS.)" It provides examples, and some real numbers to help someone 
new to the database take an initial crack at tuning. Remember, you're trying to 
compete with the big guys (Oracle, etc), so providing info that an Oracle DBA 
needs is pretty critical. I'm currently at a complete loss for tuning Postgres 
(it seems to do things very differently than both Oracle and 
MySQL).
 
 
I also have to admit a bit of irritation reading 
this thread; there is a fair number of incorrect statements on this thread that, 
while not wrong, definitely aren't right:
 
"Speed depends on the nature of use and the 
complexity of queries.  If you are doing updates of related tables, ACID is 
of vital importance and MySQL doesn't provide it."
MySQL has ACID in InnoDB. I've found that MySQL is 
actually very fast on complex queries w/InnoDB (six tables, 1 million rows, two 
of the joins are outer-joins. In fact, I can get InnoDB to be almost as fast as 
MyISAM. Complex updates are also very very fast. We have not tried flooding 
either database with dozens of complex statements from multiple clients; that's 
coming soon, and from what I've read, MySQL won't do too well.
 
"using InnoDB tables (the only way to have foreign 
keys, transactions, and row level locking for MySQL) makes MySQL slower 
and adds complexity to tuning the database"
Adding this: "innodb_flush_method=O_DSYNC" to the 
my.cnf made InnoDB as fast as MyISAM in our tests. It doesn't turn off disk 
flushing; it's just a flush method that might work better with different kernels 
and drives; it's one of those "play with this and see if it helps" parameters; 
there are lots of those in Postgres, it seems. There are 10 variables for tuning 
InnoDB (and you don't have to tune for MyISAM, so it's actually a six-of-one, 
half-dozen-of-the-other). Setup between the two seems to be about the 
same.
 
"PostgreSQL supports constraints. MySQL doesn't; 
programmers need to take care of that from the client side"
Again, InnoDB supports constraints.
 
"Transactions: We've been here before. Suffice to 
say, MySQL+InnoDB is almost there. Plain ol' MySQL doesn't have it, which tells 
you something about their philosophy towards database design."
InnoDB supports transactions very nicely, has the 
equivalent of WAL, and one thing I really like: a tablespace (comprised of data 
files that can be spread around multiple hard drives), and in a month or so, 
InnoDB will support multiple tablespaces.
 
 
To be fair, here are a few MySQL "bad things" that weren't mentioned:
 
1) InnoDB can't do a hot-backup with the basic 
backup tools. To hot-backup an InnoDB database, you need to pay $450 US per 
database per year ($1150 per database perpetual) for a proprietary hot-backup 
tool
2) InnoDB can't do full-text 
searching.
3) I see alot more corrupt-database bugs on the 
MySQL lists (most are MyISAM, but a few InnoDB bugs pop up from time to time) - 
way more than I see on the Postgres lists.
4) There are some really cranky people on the MySQL 
lists; the Postgres lists seem to be much more effective (esp. with people like 
Tom Lane). Maybe it's be

Re: [PERFORM] go for a script! / ex: PostgreSQL vs. MySQL

2003-10-09 Thread Dror Matalon

Yeah, I had a similar thought to Oliver's and suspected that this would be
the answer. 
Also, while it's not too hard to do this for a single platform, it gets
complicated once you start looking at different ones.

Josh, let me know when you're ready to do this. I'll try to help,
although my perl's kind of rusty. Also, can you even assume perl for a
postgres install? Does Solaris, for instance come with perl?

Dror

On Thu, Oct 09, 2003 at 09:56:11AM -0700, Josh Berkus wrote:
> Oliver,
> 
> > I think instead of thinking about where to put the
> > information about tuning, someone should provide a
> > "pgsql-autotune". Maybe even a shell script would do the
> > trick.
> 
> Well, you see, there's the issue.  "I think someone."  Lots of people have 
> spoken in favor of an "auto-conf" script; nobody so far has stepped forward 
> to get it done for 7.4, and I doubt we have time now.
> 
> I'll probably create a Perl script in a month or so, but not before that 
> 
> -- 
> Josh Berkus
> Aglio Database Solutions
> San Francisco
> 
> ---(end of broadcast)---
> TIP 4: Don't 'kill -9' the postmaster

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Andrew Sullivan
On Thu, Oct 09, 2003 at 01:04:23PM -0400, Jeff wrote:
> 
> So you think we should leave PG alone and let it run horrifically slowly?
> Do you have a better idea of how to do this?

Given the point in the release cycle, mightn't the FAQ_Solaris or
some other place be better for this for now?  I agree with the
concern.  I'd rather have slow'n'stable than fast-but-broken.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias Canada                Toronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Bruce Momjian
Jeff wrote:
> On Thu, 9 Oct 2003, Kenneth Marshall wrote:
> 
> > Jeff,
> >
> > My first concern with the -fast option is that it makes an executable
> > that is specific for the platform on which the compilation is run
> > unless other flags are given. My second concern is the effect it has
> > on IEEE floating point behavior w.r.t. rounding, error handling, 
> > And my third concern is that if you use -fast, all other code must
> > be compiled and linked with the -fast option for correct operation,
> > this includes any functional languages such as perl, python, R,...
> > That is a pretty big requirement for a default compilation flag.
> >
> > Ken Marshall
> >
> 
> So you think we should leave PG alone and let it run horrifically slowly?
> Do you have a better idea of how to do this?
> 
> And do you have evidence apps compiled with -fast linked to non -fast
> (or gcc compiled) have problems?

I have updated the Solaris FAQ:


5) How can I compile for optimum performance?

Try using the "-fast" compile flag.  The binaries might not be portable to
other Solaris systems, and you might need to compile everything that links
to PostgreSQL with "-fast", but PostgreSQL will run significantly faster,
50% faster on some tests.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Jeff
On Thu, 9 Oct 2003, Kenneth Marshall wrote:

> Jeff,
>
> My first concern with the -fast option is that it makes an executable
> that is specific for the platform on which the compilation is run
> unless other flags are given. My second concern is the effect it has
> on IEEE floating point behavior w.r.t. rounding, error handling, 
> And my third concern is that if you use -fast, all other code must
> be compiled and linked with the -fast option for correct operation,
> this includes any functional languages such as perl, python, R,...
> That is a pretty big requirement for a default compilation flag.
>
> Ken Marshall
>

So you think we should leave PG alone and let it run horrifically slowly?
Do you have a better idea of how to do this?

And do you have evidence apps compiled with -fast linked to non -fast
(or gcc compiled) have problems?


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] go for a script! / ex: PostgreSQL vs. MySQL

2003-10-09 Thread Josh Berkus
Oliver,

> I think instead of thinking about where to put the
> information about tuning, someone should provide a
> "pgsql-autotune". Maybe even a shell script would do the
> trick.

Well, you see, there's the issue.  "I think someone."  Lots of people have 
spoken in favor of an "auto-conf" script; nobody so far has stepped forward 
to get it done for 7.4, and I doubt we have time now.

I'll probably create a Perl script in a month or so, but not before that 

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Jeff
On Thu, 9 Oct 2003, Bruce Momjian wrote:

> > 52 seconds to 19-20 seconds
>
> Wow, that's dramatic.  Do you want to propose some flags for non-gcc
> Solaris?  Is -fast the only one?  Is there one that suppresses those
> warnings or are they OK?
>

Well. As I said, I didn't see an obvious way to hide those warnings.
I'd love to make those warnings go away.  That is why I suggested perhaps
printing a message to ensure the user knows that warnings may be printed
when using sunsoft.

-fast should be all you need - it picks the "best settings" to use for the
platform that is doing the compile.


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Bruce Momjian
Jeff wrote:
> On Thu, 9 Oct 2003, Bruce Momjian wrote:
> 
> >
> > What is the performance win for the -fast flag again?
> >
> > ---
> >
> 52 seconds to 19-20 seconds

Wow, that's dramatic.  Do you want to propose some flags for non-gcc
Solaris?  Is -fast the only one?  Is there one that suppresses those
warnings or are they OK?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Jeff
On Thu, 9 Oct 2003, Bruce Momjian wrote:

>
> What is the performance win for the -fast flag again?
>
> ---
>
52 seconds to 19-20 seconds


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Linux filesystem shootout

2003-10-09 Thread scott.marlowe
On Thu, 9 Oct 2003, Shridhar Daithankar wrote:

> Kaarel wrote:
> >>>http://www.ussg.iu.edu/hypermail/linux/kernel/0310.1/0208.html
> >>>
> >>>Shridhar
> >>>
> >>>
> > I feel incompetent when it comes to file systems. Yet everybody would like to 
> > have the best file system if given the choice...so do I :) Here I am looking at 
> > those tables seeing JFS having more green cells than others. The more green the 
> > better right? So based on these tests JFS ought to be the one?
> 
> Yes and no. Yes for the results. No for the tests that weren't run.
> 
> Database load is quite different. It's a mixture of read and write load, with 
> dynamics varying from one extreme to the other.
> 
> All it says is that if you want to choose a good file system for postgresql, look 
> at JFS first..:-)
> 
>   Besides, all the tests were done on files bigger than 1GB. If single file 
> size is restricted to 1GB, it might produce a different result set. And 
> postgresql does not exceed the 1GB limit per file.
> 
> So still, quite a few unknowns there..

Absolutely.  For instance, one file system may be faster on a RAID card 
with battery backed cache, while another may be faster on an IDE drive 
with write cache disabled, while another may be faster on software RAID1, 
while another might be faster on software RAID5.

If you haven't tested different file systems on your setup, you don't 
really know which will be faster until you do.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Bruce Momjian

What is the performance win for the -fast flag again?

---

Jeff wrote:
> On Thu, 9 Oct 2003, Bruce Momjian wrote:
> 
> >
> > So you want -fast added as default for non-gcc Solaris?  You mentioned
> > there is a warning generated that we have to deal with?
> >
> 
>  Yeah, suncc generates a warning for _every_ file that says:
> Warning: -xarch=native has been explicitly specified, or implicitly
> specified by a macro option, -xarch=native on this architecture implies
> -xarch=v8plusa which generates code that does not run on pre-UltraSPARC
> processors
> 
> And then I get various warnings here and there...
> 
> lots of "statement not reached" as in ecpg's type.c module
> The offending code is a big switch statement like:
> case ECPGt_bool:
> return ("ECPGt_bool");
> break;
> 
> And then any function that uses PG_RETURN_NULL generates "warning:
> end-of-loop code not reached"
> 
> and a bunch of "constant promoted to unsigned long long"
> 
> 
> And some places such as in fe-exec.c have code like this:
> buflen = strlen(strtext);   /* will shrink, also we discover
> if
> 
> where strtext is an unsigned char * which generates warning: argument #1
> is incompatible with prototype:
> 
> and then various other type mismatches here and there.
> 
> I skimmed through the manpage.. it doesn't look like we can suppress
> these..
> 
> 
> Not sure we want it to look like we have bad code if someone uses cc.
> perhaps issue a ./configure notice or something?
> 
> gcc compiles things fine.
> 
> 
> --
> Jeff Trout <[EMAIL PROTECTED]>
> http://www.jefftrout.com/
> http://www.stuarthamm.net/
> 
> 
> 
> ---(end of broadcast)---
> TIP 5: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faqs/FAQ.html
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Compare rows

2003-10-09 Thread Greg Spiegelberg
Josh Berkus wrote:
Greg,


You lost me on that one.  What's a "vertical child table"?


Currently, you store data like this:

id  address  uptime  speed  memory  tty
3   67.92    0.3     11.2   37      6
7   69.5     1.1     NULL   15      NULL
9   65.5     0.1     NULL   94      2
The most efficient way for you to store data would be like this:

main table
id  address
3   67.92
7   69.5
9   65.5
child table
id  value_type  value
3   uptime  0.3
3   speed   11.2
3   memory  37
3   tty 6
7   uptime  1.1
7   memory  15
9   uptime  0.1
9   memory  94
9   tty 2
As you can see, the NULLs are not stored, making this system much more 
efficient on storage space.

Tommorrow I'll (hopefully) write up how to query this for comparisons.   It 
would help if you gave a little more details about what specific comparison 
you're doing, e.g. between tables or table to value, comparing just the last 
value or all rows, etc.
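
To make the querying concrete, here is one way a comparison could be written against such a vertical child table. This is a hedged sketch, not the promised write-up: the table and column names (`main_table`, `child_table`, `value_type`, `value`) follow the example schema above, and the specific metrics chosen are illustrative.

```sql
-- Pivot the vertical child table back into wide rows using conditional
-- aggregation; rows with no stored value come back as NULL, just as in
-- the original wide table.
SELECT m.id,
       m.address,
       MAX(CASE WHEN c.value_type = 'uptime' THEN c.value END) AS uptime,
       MAX(CASE WHEN c.value_type = 'memory' THEN c.value END) AS memory
FROM   main_table m
JOIN   child_table c ON c.id = m.id
GROUP  BY m.id, m.address;

-- Or compare a single metric between two rows directly with a self-join:
SELECT a.value - b.value AS memory_diff
FROM   child_table a, child_table b
WHERE  a.id = 3 AND b.id = 9
  AND  a.value_type = 'memory'
  AND  b.value_type = 'memory';
```

An index on `child_table (id, value_type)` would presumably be wanted for either form.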

Got it.  I can see how it would be more efficient in storing.  At this
point it would require a lot of query and code rewrites to handle it.
Fortunately, we're looking for alternatives for the next revision and
we're leaving ourselves open for a rewrite much to the boss's chagrin.
I will be spinning up a test server soon and may attempt a quick
implementation.  I may make value_type a foreign key on a table that
includes a full and/or brief description of the key.  Problem I'll have
then will be categorizing all those keys into disk, cpu, memory, user,
and all the other data categories since it's in one big table rather
than specialized tables.
Greg

--
Greg Spiegelberg
 Sr. Product Development Engineer
 Cranel, Incorporated.
 Phone: 614.318.4314
 Fax:   614.431.8388
 Email: [EMAIL PROTECTED]
Cranel. Technology. Integrity. Focus.


---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Linux filesystem shootout

2003-10-09 Thread Bill Moran
Kaarel wrote:

http://www.ussg.iu.edu/hypermail/linux/kernel/0310.1/0208.html

Shridhar

I feel incompetent when it comes to file systems. Yet everybody would 
like to have the best file system if given the choice...so do I :) Here 
I am looking at those tables seeing JFS having more green cells than 
others. The more green the better right? So based on these tests JFS 
ought to be the one?
Those tests seem to align with the ones I did recently:
http://www.potentialtech.com/wmoran/postgresql.php#results
There were less filesystems involved, and the data is less comprehensive,
but probably a little easier to understand (i.e. -> fastest filesystem
at the top of the graph, slowest at the bottom).
I've been telling people that JFS is fastest.  This is definitely
oversimplified, since the "shoot out" shows that it's not _always_
fastest, but for people who just want to make a good initial choice,
and won't do their own testing to find out what's fastest in their
configuration (for whatever reason), I think JFS is the safest bet.
Since it's a journalling filesystem as well, it should have good
recoverability in the event of catastrophe, but I haven't tested
that.
--
Bill Moran
Potential Technologies
http://www.potentialtech.com
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] Linux filesystem shootout

2003-10-09 Thread Shridhar Daithankar
Kaarel wrote:
http://www.ussg.iu.edu/hypermail/linux/kernel/0310.1/0208.html

Shridhar

I feel incompetent when it comes to file systems. Yet everybody would like to 
have the best file system if given the choice...so do I :) Here I am looking at 
those tables seeing JFS having more green cells than others. The more green the 
better right? So based on these tests JFS ought to be the one?
Yes and no. Yes for the results. No for the tests that weren't run.

Database load is quite different. It's a mixture of read and write load, with 
dynamics varying from one extreme to the other.

All it says is that if you want to choose a good file system for postgresql, look 
at JFS first..:-)

 Besides, all the tests were done on files bigger than 1GB. If single file 
size is restricted to 1GB, it might produce a different result set. And 
postgresql does not exceed the 1GB limit per file.

So still, quite a few unknowns there..

The best thing could be to repeat those benchmarks on $PGDATA with your live data 
inside it. That could mimic the load pretty well..

 Shridhar

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] Linux filesystem shootout

2003-10-09 Thread Kaarel

http://www.ussg.iu.edu/hypermail/linux/kernel/0310.1/0208.html

Shridhar


I feel incompetent when it comes to file systems. Yet everybody would
like to have the best file system if given the choice...so do I :) Here
I am looking at those tables seeing JFS having more green cells than
others. The more green the better right? So based on these tests JFS
ought to be the one?

Kaarel


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Jeff
On Thu, 9 Oct 2003, Bruce Momjian wrote:

>
> So you want -fast added as default for non-gcc Solaris?  You mentioned
> there is a warning generated that we have to deal with?
>

 Yeah, suncc generates a warning for _every_ file that says:
Warning: -xarch=native has been explicitly specified, or implicitly
specified by a macro option, -xarch=native on this architecture implies
-xarch=v8plusa which generates code that does not run on pre-UltraSPARC
processors

And then I get various warnings here and there...

lots of "statement not reached" as in ecpg's type.c module
The offending code is a big switch statement like:
case ECPGt_bool:
return ("ECPGt_bool");
break;

And then any function that uses PG_RETURN_NULL generates "warning:
end-of-loop code not reached"

and a bunch of "constant promoted to unsigned long long"


And some places such as in fe-exec.c have code like this:
buflen = strlen(strtext);   /* will shrink, also we discover
if

where strtext is an unsigned char * which generates warning: argument #1
is incompatible with prototype:

and then various other type mismatches here and there.

I skimmed through the manpage.. it doesn't look like we can suppress
these..


Not sure we want it to look like we have bad code if someone uses cc.
perhaps issue a ./configure notice or something?

gcc compiles things fine.


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Andrew Sullivan
On Wed, Oct 08, 2003 at 02:31:29PM -0400, Bruce Momjian wrote:
> Well, this is really embarassing.  I can't imagine why we would not set
> at least -O on all platforms.  Looking at the template files, I see
> these have no optimization set:

I think gcc _used_ to generate bad code on SPARC if you set any
optimisation.  We tested it on Sol7 with gcc 2.95 more than a year
ago, and tried various settings.  -O2 worked, but other items were
really bad.  Some of them would pass regression but cause strange
behaviour, random coredumps, &c.  A little digging demonstrated that
anything beyond -O2 just didn't work for gcc at the time.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias Canada                Toronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Bruce Momjian

So you want -fast added as default for non-gcc Solaris?  You mentioned
there is a warning generated that we have to deal with?

---

Jeff wrote:
> On Wed, 8 Oct 2003, Neil Conway wrote:
> 
> > Hey Jeff,
> >
> > On Wed, 2003-10-08 at 11:46, Jeff wrote:
> > > Yeah - like I expected it was able to generate much better code for
> > > _bt_checkkeys which was the #1 function in gcc on both sun & linux.
> >
> > If you get a minute, would it be possible to compare the performance of
> > your benchmark under linux/gcc and solaris/gcc when PostgreSQL is
> > compiled with "-O3"?
> >
> Sun:
> gcc:
> none: 60 seconds
> -O: 21 seconds
> -O2: 20 seconds
> -O3: 19 seconds
> 
> suncc:
> none: 52 seconds
> -fast: 20 secondsish.
> 
> -fast is actually a macro that expands to the "best settings" for the
> platform that is doing the compilation.
> 
> 
> Linux:
> -O2: 35
> -O3: 40
> Odd.. I wonder why it took longer. Perhaps gcc built some bad code?
> I thought the results were odd there so I ran the test many times.. same
> results! Swapped the binaries back (so -O2 was running) and boom. back to
> 35.
> 
> Sun gcc -O2 and suncc -fast both pass make check.
> 
> 
> --
> Jeff Trout <[EMAIL PROTECTED]>
> http://www.jefftrout.com/
> http://www.stuarthamm.net/
> 
> 
> 
> ---(end of broadcast)---
> TIP 3: if posting/reading through Usenet, please send an appropriate
>   subscribe-nomail command to [EMAIL PROTECTED] so that your
>   message can get through to the mailing list cleanly
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Compare rows

2003-10-09 Thread Greg Spiegelberg
Christopher Browne wrote:
Wow, that takes me back to a paper I have been looking for for
_years_.
Some time in the late '80s, probably '88 or '89, there was a paper
presented in Communications of the ACM that proposed using this sort
of "hypernormalized" schema as a way of having _really_ narrow schemas
that would be exceedingly expressive.  They illustrated an example of
an address table that could hold full addresses with a schema with
only about half a dozen columns, the idea being that you'd have
several rows linked together.
I'd be interested in the title / author when you remember.
I'm kinda sick.  I like reading on most computer theory,
designs, algorithms, database implementations, etc.  Usually
how I get into trouble too with 642 column tables though. :)
--
Greg Spiegelberg
 Sr. Product Development Engineer
 Cranel, Incorporated.
 Phone: 614.318.4314
 Fax:   614.431.8388
 Email: [EMAIL PROTECTED]
Cranel. Technology. Integrity. Focus.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Serious Problem with the windows and postgres configuration

2003-10-09 Thread shyamperi



DISCLAIMER: This message contains privileged and confidential information and is
intended only for the individual named.If you are not the intended
recipient you should not disseminate,distribute,store,print, copy or
deliver this message.Please notify the sender immediately by e-mail if
you have received this e-mail by mistake and delete this e-mail from
your system.
Dear all,
There is a problem I am facing while connecting to the postgresql database server, which is installed on a remote machine. When I check the logs
at the database end, PG_recvbuf is reaching EOF, and at my program level I get a Java socket exception.
I need some help regarding this, as it is not allowing my programs to execute.
Thanking You
-
Warm Regards
Shyam Peri

II Floor, Punja Building,
M.G.Road,
Ballalbagh,
Mangalore-575003 
Ph : 91-824-2451001/5
Fax : 91-824-2451050 




---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Sun performance - Major discovery!

2003-10-09 Thread Jeff
On Wed, 8 Oct 2003, Neil Conway wrote:

> Hey Jeff,
>
> On Wed, 2003-10-08 at 11:46, Jeff wrote:
> > Yeah - like I expected it was able to generate much better code for
> > _bt_checkkeys which was the #1 function in gcc on both sun & linux.
>
> If you get a minute, would it be possible to compare the performance of
> your benchmark under linux/gcc and solaris/gcc when PostgreSQL is
> compiled with "-O3"?
>
Sun:
gcc:
none: 60 seconds
-O: 21 seconds
-O2: 20 seconds
-O3: 19 seconds

suncc:
none: 52 seconds
-fast: 20 secondsish.

-fast is actually a macro that expands to the "best settings" for the
platform that is doing the compilation.


Linux:
-O2: 35
-O3: 40
Odd.. I wonder why it took longer. Perhaps gcc built some bad code?
I thought the results were odd there so I ran the test many times.. same
results! Swapped the binaries back (so -O2 was running) and boom. back to
35.

Sun gcc -O2 and suncc -fast both pass make check.


--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Linux filesystem shootout

2003-10-09 Thread Grega Bremec
I should at least read the URLs before re-posting info.

My bad, I'm utterly sorry about this... :-(

Cheers,
-- 
Grega Bremec
Sistemska administracija in podpora
grega.bremec-at-noviforum.si
http://najdi.si/
http://www.noviforum.si/


pgp0.pgp
Description: PGP signature


Re: [PERFORM] Linux filesystem shootout

2003-10-09 Thread Grega Bremec
...and on Thu, Oct 09, 2003 at 04:42:53PM +0530, Shridhar Daithankar used the keyboard:
>
> http://www.ussg.iu.edu/hypermail/linux/kernel/0310.1/0208.html
> 
> Shridhar

My $0.1:

I just stumbled across an interesting filesystem comparison table today,
comparing ext2/ext3/reiser/reiser4/jfs/xfs on a single UP P2/450 machine
with an old UDMA2 Seagate.

Now however archaic this box may have been, I think that the tests still
bear some objectivity, as it's a comparison test and not some "how much
can we squeeze out of xyz" type of bragging.

The tests were done using bonnie++ and IOZone and are essentially just a
couple of tables listing the average results achieved by each of those
tests.

Also, ext3, reiser and reiser4 were tested in a couple of different
configurations (reiser4 extents, reiser notail, ext3 journal, ordered and
writeback mode).

Oh, i shouldn't forget - the address is http://fsbench.netnation.com/ :)

Cheers,
-- 
Grega Bremec
Sistemska administracija in podpora
grega.bremec-at-noviforum.si
http://najdi.si/
http://www.noviforum.si/


pgp0.pgp
Description: PGP signature


Re: [PERFORM] Compare rows

2003-10-09 Thread Christopher Browne
The world rejoiced as [EMAIL PROTECTED] (Josh Berkus) wrote:
> Chris,
>> Some time in the late '80s, probably '88 or '89, there was a paper
>> presented in Communications of the ACM that proposed using this sort
>> of "hypernormalized" schema as a way of having _really_ narrow schemas
>> that would be exceedingly expressive.  They illustrated an example of
> 
>> The entertaining claim was that they felt they could model the
>> complexities of the operations of any sort of company using not
>> more than 50 tables.  It seemed somewhat interesting, at the time;
>> it truly resonated as Really Interesting when I saw SAP R/3, with
>> its bloat of 1500-odd tables.
>
> One can always take things too far.  Trying to make everying 100%
> dynamic so that you can cram your whole database into 4 tables is
> going too far; so is the kind of bloat that produces systems like
> SAP, which is more based on legacy than design (I analyzed a large
> commercial billing system once and was startled to discover that 1/4
> of its 400 tables and almost half of the 40,000 collective columns
> were not used and present only for backward compatibility).

With R/3, the problem is that there are hundreds (now thousands) of
developers trying to coexist on the same code base, with the result
tables containing nearly-the-same fields are strewn all over.

It's _possible_ that the design I saw amounted to nothing more than a
clever hack for implementing LDAP atop a relational database, but they
seemed to have something slightly more to say than that.
-- 
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','ntlug.org').
http://www3.sympatico.ca/cbbrowne/emacs.html
Why does the word "lisp" have an "s" in it? 

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] Linux filesystem shootout

2003-10-09 Thread Shridhar Daithankar
http://www.ussg.iu.edu/hypermail/linux/kernel/0310.1/0208.html

Shridhar



---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] go for a script! / ex: PostgreSQL vs. MySQL

2003-10-09 Thread Paul Thomas
On 09/10/2003 09:29 Oliver Scheit wrote:
Hi guys,

I followed the discussion and here are my 0.2$:

I think instead of thinking about where to put the
information about tuning, someone should provide a
"pgsql-autotune". Maybe even a shell script would do the
trick.
It's not so hard to find out how much memory is installed,
and IMHO SHARED_BUFFERS, SORT_MEM and EFFECTIVE_CACHE_SIZE
depend heavily on this. A "cat /proc/sys/kernel/shmmax"
would give some valuable information on linux boxes;
there is probably other stuff for different OSes.
random_page_cost could be set after probing the harddisks,
maybe even do a hdparm -tT if they seem to be ATA, not SCSI.
Now, let's pretend the script finds out there is 1 GB RAM,
it could ask something like "Do you want to optimize the
settings for postgres (other applications may suffer from
having not enough RAM) or do you want to use moderate
settings?"
Something like this, you get the idea.


ISTR reading that 7.4 will use a default of shared_buffers = 1000 if the 
machine can support it (most can). This alone should make a big difference 
in out-of-the-box performance.

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller Business |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] PostgreSQL vs. MySQL

2003-10-09 Thread Andrew Sullivan
On Wed, Oct 08, 2003 at 01:28:53PM -0400, Bruce Momjian wrote:
> 
> Agreed.  Text added to install docs:

[&c.]

I think this is just right.  It tells a user where to find the info
needed, doesn't reproduce it all over the place, and still points out
that this is something you'd better do.  Combined with the new
probe-to-set-shared-buffers bit at install time, I think the reports
of 400 billion times worse performance than MySQL will probably
diminish.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias Canada                Toronto, Ontario Canada
<[EMAIL PROTECTED]>  M2P 2A8
 +1 416 646 3304 x110


---(end of broadcast)---
TIP 8: explain analyze is your friend


[PERFORM] go for a script! / ex: PostgreSQL vs. MySQL

2003-10-09 Thread Oliver Scheit
Hi guys,

I followed the discussion and here are my 0.2$:

I think instead of thinking about where to put the
information about tuning, someone should provide a
"pgsql-autotune". Maybe even a shell script would do the
trick.

It's not so hard to find out how much memory is installed,
and IMHO SHARED_BUFFERS, SORT_MEM and EFFECTIVE_CACHE_SIZE
depend heavily on this. A "cat /proc/sys/kernel/shmmax"
would give some valuable information on linux boxes;
there is probably other stuff for different OSes.

random_page_cost could be set after probing the harddisks,
maybe even do a hdparm -tT if they seem to be ATA, not SCSI.

Now, let's pretend the script finds out there is 1 GB RAM,
it could ask something like "Do you want to optimize the
settings for postgres (other applications may suffer from
having not enough RAM) or do you want to use moderate
settings?"

Something like this, you get the idea.

This would give new users a much more usable start than
the current default settings and would still leave all
the options to do fine-tuning later.

I guess my point is simply this:
instead of saying: "okay we use default settings that will
run on _old_ hardware too" we should go for a little script
that creates a "still safe but much better" config file.
There's just no point in setting SHARED_BUFFERS to something
like 16 (what's the current default?) if the PC has >= 1 GB
of RAM. Setting it to 8192 would still be safe, but 512 times
better...  ;-) (IIRC 8192 would take 64 MB of RAM, which
should be safe if you leave the default MAX_CONNECTIONS.)

As said before: just my $0.2

My opinion on this case is Open Source. Feel free to modify
and add.  :-)

regards,
Oli

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] IMMUTABLE function's flag do not work: 7.3.4, plpgsql

2003-10-09 Thread Andriy Tkachuk
On Wed, 8 Oct 2003, Tom Lane wrote:

> Andriy Tkachuk <[EMAIL PROTECTED]> writes:
> > At second. calc_total() is immutable function:
> > but it seems that it's not cached in one session:
>
> It's not supposed to be.

but it's written in the doc:

  IMMUTABLE indicates that the function always  returns  the  same
  result when given the same argument values; that is, it does not
  do database lookups or otherwise use  information  not  directly
  present in its parameter list. If this option is given, any call
  of the function with all-constant arguments can  be  immediately
  replaced with the function value.

I meant that the result of calc_total() is not "immediately replaced with the
function value" as the doc says; rather, it takes as long as the first function
call in the session did (with the same arguments).

Maybe I misunderstand something?
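
A sketch of the distinction being discussed (an assumption about the behavior, not an authoritative statement; `calc_total`'s real body is not shown in the thread, so a stand-in is used): IMMUTABLE lets the planner fold a call with all-constant arguments into a literal at plan time, but PostgreSQL keeps no cross-call result cache, so repeated calls with non-constant arguments each execute the function body.

```sql
-- Stand-in body; the real calc_total is not shown in the thread.
CREATE FUNCTION calc_total(int) RETURNS int AS '
    SELECT $1 * 2;
' LANGUAGE SQL IMMUTABLE;

-- All-constant argument: the planner may evaluate calc_total(21) once
-- and replace it with a literal in the plan.
SELECT * FROM orders WHERE amount = calc_total(21);

-- Non-constant argument: nothing can be folded, so the function body
-- runs for every row -- no memoization across calls, even in one session.
SELECT calc_total(amount) FROM orders;
```

So the doc's "immediately replaced with the function value" applies only to the plan-time constant-folding case, which would explain the timings observed.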

Thank you,
 Andriy Tkachuk.

http://www.imt.com.ua


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster