Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
BTW, http://stats.distributed.net/~decibel/base.log is a test I ran; select count(*) was ~6x faster than explain analyze select *.

On Tue, Apr 26, 2005 at 07:46:52PM -0700, Kevin Brown wrote:

Josh Berkus wrote:

Jim, Kevin,

Hrm... I was about to suggest timing just the query (and not output/data transfer time) using explain analyze, but then I remembered that explain analyze can incur some non-trivial overhead with the timing calls. Is there a way to run the query but have psql ignore the output? If so, you could use \timing.

Would timing SELECT COUNT(*) FROM (query) work?

Just \timing would work fine; PostgreSQL doesn't return anything until it has the whole result set.

Hmm... does \timing show the amount of elapsed time between query start and the first results handed to it by the database (even if the database itself has prepared the entire result set for transmission by that time), or between query start and the last result handed to it by the database? Because if it's the latter, then things like server-client network bandwidth are going to affect the results that \timing shows, and it won't necessarily give you a good indicator of how well the database backend is performing.

I would expect that timing SELECT COUNT(*) FROM (query) would give you an idea of how the backend is performing, because the amount of result-set data that has to go over the wire is trivial.

Each is, of course, useful in its own right, and you want to be able to measure both (so, for instance, you can get an idea of just how much your network affects the overall performance of your queries).

That's why MSSQL vs. PostgreSQL timing comparisons are deceptive unless you're careful: MSSQL returns the results one block at a time, and reports execution time as the time required to return the *first* block, as opposed to Postgres, which reports the time required to return the whole dataset.

Interesting. I had no idea MSSQL did that, but I can't exactly say I'm surprised.
:-)

--
Kevin Brown [EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

--
Jim C. Nasby, Database Consultant [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?
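The timing options discussed in this thread can be sketched as a psql session. This is a sketch, not a definitive recipe; the table name "bigtab" is hypothetical:

```sql
-- \timing reports elapsed time at the psql client, so it includes
-- network transfer of the whole result set.
\timing on

-- Discard the output so client-side rendering doesn't pollute the number:
\o /dev/null
SELECT * FROM bigtab;
\o

-- Wrap the query in COUNT(*) so only one row crosses the wire; this
-- times mostly backend work (plus the cost of counting):
SELECT count(*) FROM (SELECT * FROM bigtab) AS sub;

-- EXPLAIN ANALYZE runs the query with per-node instrumentation; the
-- timing calls themselves can add the non-trivial overhead noted above:
EXPLAIN ANALYZE SELECT * FROM bigtab;
```

Comparing the three numbers separates backend execution time from transfer time and from instrumentation overhead.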
Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Thanks, we will try that. We are working on a test suite for the way our app gets data (ODBC). We plan to include updates, inserts, and selects, and all three at once, with a log of the results. Then we should use a stress-test tool to see how it works with multiple instances (I used Microsoft's tool last time I did stress testing).

Joel Fradkin
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel. 941-753-7111 ext 305
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology and advanced tools.
© 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
This email message is for the use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message, including attachments.

-----Original Message-----
From: Marko Ristola [mailto:[EMAIL PROTECTED]
Sent: Sunday, April 24, 2005 2:15 AM
To: Joel Fradkin
Cc: 'Mohan, Ross'; [EMAIL PROTECTED]; pgsql-performance@postgresql.org
Subject: Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

Here is how you can receive all one billion rows in pieces of 2048 rows. This changes PostgreSQL and ODBC behaviour. Change the ODBC data source configuration in the following way:

Fetch = 2048
UseDeclareFetch = 1

It does not create core dumps on 32-bit computers with billions of rows! This is a bit slower than fetching all rows at once; scalability sometimes means a bit less speed :(

With UseDeclareFetch=1 you might get even 150 thousand rows per second. With UseDeclareFetch=0 the backend might be able to send about 200 thousand rows per second. These high numbers come when all the results are already in memory and no disc accesses are needed. These are about the peak speeds with VARCHAR, without Unicode, on an Athlon64 home computer. With a sequential disc scan, a more typical fetching speed is about 50-100 thousand rows per second. PostgreSQL ODBC row-fetching speed is very good. Perhaps with better discs, with RAID10, the current upper limit of about 200 thousand rows per second could be achieved?? So the in-memory examples show that the hard disc is normally the bottleneck, and it is on the server side.

My experiments were done in Linux; in Windows, the speed might differ by a constant factor. These speeds depend on very many factors, even on sequential scan. ODBC speed is affected by the number of columns fetched and the types of the columns: integers are processed faster than textual or date columns.

The network latency cost is decreased with UseDeclareFetch=1 by increasing the Fetch=2048 parameter: with Fetch=1 you get bad performance with lots of rows, but if you fetch data from the server once per 2048 rows, the network latency is paid only once per 2048-row block.

Regards,
Marko Ristola

Joel Fradkin wrote:

Hate to be dumb, but unfortunately I am. Could you give me an idea what I should be using, or is there a good resource for me to check out? I have been spending so much time with config and moving data, converting etc., I never looked at the ODBC settings (didn't even think about it until Josh brought it up). I did ask him for his advice, but would love a second opinion. Our data is a bit of a mixture; some records have text items, most are varchars and integers with a bit of booleans mixed in. I am running 8.0.2, so not sure if the protocol is ODBC or Postgres? Thanks for responding, I appreciate any help.

Joel Fradkin

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Mohan, Ross
Sent: Thursday, April 21, 2005 10:01 AM
To: [EMAIL PROTECTED]
Subject: Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

Joel, thanks. A couple of things jump out there for me: not a problem for a routine ODBC connection, but perhaps in the "lotsa stuff" context of your current explorations it might be relevant? I am completely shooting from the hip here, but... if it were my goose to cook, I'd be investigating:

Session(StringConn) = DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=;PWD=;ReadOnly=0;Protocol=6.4;
|| Protocol? Is this related to version? Is the driver way old?
FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;ShowSystemTables=0;ConnSettings=;Fetch=100;
|| Fetch: great for OLTP, lousy for batch?
Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=8190;
|| What ARE the datatypes and sizes in your particular case?
Debug=0;
|| A run with Debug=1 probably would spit up something interesting.
CommLog=0;Optimizer=1;
|| Optimizer? That's a new one on me.
Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=;ExtraSysTablePrefixes=dd_;LFConversion=1;Up
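Marko's suggested settings map onto the ODBC data source like this. This is a hypothetical sketch of an odbc.ini entry; the DSN name and connection values are illustrative, and only the last two lines are the actual change he recommends:

```ini
; Hypothetical odbc.ini entry illustrating the cursor-based fetch settings.
[wazagua]
Driver          = PostgreSQL
Servername      = 192.168.123.252
Port            = 5432
Database        = wazagua
UseDeclareFetch = 1     ; stream rows through a server-side cursor
Fetch           = 2048  ; rows fetched per round trip
```

The same two keys can be appended to an ADO-style connection string (`...;UseDeclareFetch=1;Fetch=2048;`) as in the examples elsewhere in this thread.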
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
In practice, we have watched Windows evolve in such a fashion with respect to multiuser support, and, in effect, it has never really gotten it. Microsoft started by hacking something on top of MS-DOS, and by the time enough applications had enough dependencies on the way that worked, it had essentially become impossible for them to migrate properly to a multiuser model, since applications are normally designed with the myopic "this is MY computer!" model of the world.

Completely false. NT was a complete rewrite (1993ish) and was inherently multi-user, with even the GDI running as a user-level process (no longer, however). The NT kernel was scalable and portable, running on the Alpha, MIPS, etc. However, you do have a point with applications... many win32 developers have a very bad habit of expecting their apps to install and run as root. However, this is generally not a problem with Microsoft stuff. In short, the problem is really people, not the technology.

Merlin
Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Tried changing the settings and saw no change in a test using ASP. The test does several selects on views and tables. It actually seemed to take a bit longer.

Joel Fradkin
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel. 941-753-7111 ext 305
[EMAIL PROTECTED]
www.wazagua.com
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
I am waiting to hear back from Josh on using cursors and trying to flatten long-running views. I am a little disappointed I have not understood enough to get my analyzer to use the proper plan; we had to set seqscan off to get the select from response_line to work fast, and I had to turn off merge joins to get assoc list to work fast. Once I am up I can try to learn more about it. I am so glad there are so many folks here willing to take time to educate us newbs.

I am not a big fan of tweaking the optimizer, because you are robbing Peter to pay Paul, so to speak. pg 8.1 may come out with new optimizer tweaks and you'll have to do it all over again. If the optimizer is not 'getting' your view, there are a few different approaches to fixing the problem. I am also not a big fan of de-normalizing your database; essentially you are lighting a fuse that may blow up later. Here are some general approaches to planner optimization that can help out in tricky situations.

1. Split up views. Often overlooked, but this can provide good enhancements. If your view is based on 3 or more tables, or has left/right joins, consider breaking it up into two or more views. Views can be based on views, and it is easier to force the planner to pick good plans this way. If you can find other uses for the component views in other queries, so much the better.

2. Materialize your view. Use lazy materialization, i.e. you query the view into a table at scheduled times. Now we are trading disk space and coherence for performance... this may not fit your requirements, but the nice thing about it is that it will help give us the 'ideal plan' running time which we are shooting for.

3. pl/pgsql. Using combinations of loops, refcursors, and queries, you can cut code that should give you comparable performance to the ideal plan. If you can do the actual work here as well (no data returned to client), you get a tremendous win. Also, pl/pgsql works really well for recursive sets and other things that are difficult to run in the context of a single query. Just be aware of the disadvantages:
a. not portable
b. maintenance overhead
c. requires a relatively high developer skill set

I will go out on a limb and say that mastering the above approaches can provide the solution to virtually any performance problem within the limits of your hardware and the problem complexity. Based on your questions, it sounds to me like your #1 problem is your developer skill set relative to your requirements. However, this is easily solvable... just keep attacking the problem, and don't be afraid to bring in outside help (which you've already done; that's a start!).

Merlin
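Approach #2 above can be as simple as rebuilding a table from the view on a schedule. A minimal sketch, assuming a view named response_line_view; the table, index, and column names here are hypothetical:

```sql
-- Lazy materialization: trade disk space and freshness for the
-- "ideal plan" running time.  All object names are hypothetical.
CREATE TABLE response_line_flat AS
SELECT * FROM response_line_view;

-- Rebuild at scheduled times (e.g. from cron); the transaction keeps
-- readers from ever seeing an empty table.
BEGIN;
TRUNCATE response_line_flat;
INSERT INTO response_line_flat SELECT * FROM response_line_view;
COMMIT;

-- Index the flattened table for the queries that will hit it:
CREATE INDEX response_line_flat_case_idx ON response_line_flat (caseid);
```

Queries against the flat table then get simple, stable plans, at the cost of the data being as stale as the last rebuild.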
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Merlin Moncure) wrote:

In practice, we have watched Windows evolve in such a fashion with respect to multiuser support, and, in effect, it has never really gotten it. Microsoft started by hacking something on top of MS-DOS, and by the time enough applications had enough dependencies on the way that worked, it had essentially become impossible for them to migrate properly to a multiuser model, since applications are normally designed with the myopic "this is MY computer!" model of the world.

Completely false. NT was a complete rewrite (1993ish) and was inherently multi-user, with even the GDI running as a user-level process (no longer, however). The NT kernel was scalable and portable, running on the Alpha, MIPS, etc.

Completely irrelevant. When Win32 was deployed, the notion that more than a tiny fraction of the users would be running Win32 apps on multiuser platforms was absolutely laughable. It continued to be laughable until well into this century, when Microsoft ceased to sell systems based on MS-DOS.

However, you do have a point with applications... many win32 developers have a very bad habit of expecting their apps to install and run as root. However, this is generally not a problem with Microsoft stuff. In short, the problem is really people, not the technology.

Reality is that it is all about the applications. Microsoft spent _years_ pushing people from MS-DOS to Windows 3.1 to WfW to Windows 95, and had to do a lot of hard pushing. The result was that a lot of vendors built Win32 applications for Windows 95. None of those systems supported multiple users, so the usage of and experience with Win32 pointed everyone toward the construction of single-user applications. At that point, whether Windows NT did or didn't support multiple users became irrelevant. Usage patterns had to be oriented towards single-user operation, because that's all Win32 could be used to support for the vast majority that _weren't_ running Windows NT.

--
let name="cbbrowne" and tld="gmail.com" in String.concat "@" [name;tld];;
http://linuxfinances.info/info/x.html
"But what can you do with it?" -- ubiquitous cry from Linux-user partner.
-- Andy Pearce, [EMAIL PROTECTED]
Re: [ODBC] [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Joel, thanks. A couple of things jump out there for me: not a problem for a routine ODBC connection, but perhaps in the "lotsa stuff" context of your current explorations it might be relevant? I am completely shooting from the hip here, but... if it were my goose to cook, I'd be investigating:

Session(StringConn) = DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=;PWD=;ReadOnly=0;Protocol=6.4;
|| Protocol? Is this related to version? Is the driver way old?
FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;ShowSystemTables=0;ConnSettings=;Fetch=100;
|| Fetch: great for OLTP, lousy for batch?
Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=8190;
|| What ARE the datatypes and sizes in your particular case?
Debug=0;
|| A run with Debug=1 probably would spit up something interesting.
CommLog=0;Optimizer=1;
|| Optimizer? That's a new one on me.
Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableCursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseServerSidePrepare=0
|| That's about all I can see, prima facie. I'll be very curious to know if ODBC is any part of your performance equation.

HTH, Ross

-----Original Message-----
From: Joel Fradkin [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 21, 2005 10:54 AM
To: Mohan, Ross
Cc: [EMAIL PROTECTED]; PostgreSQL Perform
Subject: RE: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

Here is the connect string I am using. It could be horrid, as I cut it from an ODBC program.

Session(StringConn) = DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=;PWD=;ReadOnly=0;Protocol=6.4;FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;ShowSystemTables=0;ConnSettings=;Fetch=100;Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=8190;Debug=0;CommLog=0;Optimizer=1;Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=0;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableCursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseServerSidePrepare=0

Joel Fradkin

-----Original Message-----
From: Mohan, Ross [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 21, 2005 9:42 AM
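For reference, UseDeclareFetch=1 (recommended earlier in this thread) makes the driver issue roughly the following SQL instead of one monolithic SELECT. A sketch; the table and cursor names are hypothetical:

```sql
-- Roughly what the psqlODBC driver does with UseDeclareFetch=1, Fetch=2048:
BEGIN;
DECLARE odbc_cur CURSOR FOR SELECT * FROM bigtab;
FETCH 2048 FROM odbc_cur;   -- repeated until fewer than 2048 rows return
-- ... the application consumes each 2048-row block ...
CLOSE odbc_cur;
COMMIT;
```

Because only one block is held client-side at a time, memory stays bounded regardless of result-set size, which is why the billion-row fetch stops core-dumping on 32-bit clients, at the cost of one extra round trip per block.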
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
I would very, very strongly encourage you to run multi-user tests before deciding on MySQL. MySQL is nowhere near as capable when it comes to concurrent operations as PostgreSQL is. From what others have said, it doesn't take many concurrent operations for it to just fall over. I can't speak from experience, because I avoid MySQL like the plague, though. :)

I am just testing the water so to speak; if it can't handle single-user tests, then multiple-user tests are kind of a waste of time. I am trying to de-normalize my view into a table to see if I can get my app to work. It is a good idea anyway, but it raises a ton of questions about dealing with the data after a case is closed, etc.; also about multiple child relationships like merchandise and payments, etc.

I did do a test of all three (MSSQL, MySQL, and Postgres) in Aqua Studio, all on the same machine running the servers, and found Postgres beat out MySQL, but like any other test it may have been an issue with Aqua Studio and MySQL. In any case, I have not made a decision to use MySQL; I am still researching fixes for Postgres.

I am waiting to hear back from Josh on using cursors and trying to flatten long-running views. I am a little disappointed I have not understood enough to get my analyzer to use the proper plan; we had to set seqscan off to get the select from response_line to work fast, and I had to turn off merge joins to get assoc list to work fast. Once I am up I can try to learn more about it. I am so glad there are so many folks here willing to take time to educate us newbs.
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Joel Fradkin wrote:

I would very, very strongly encourage you to run multi-user tests before deciding on MySQL. MySQL is nowhere near as capable when it comes to concurrent operations as PostgreSQL is. From what others have said, it doesn't take many concurrent operations for it to just fall over. I can't speak from experience, because I avoid MySQL like the plague, though. :)

I am just testing the water so to speak; if it can't handle single-user tests, then multiple-user tests are kind of a waste of time.

Joel, I think you are missing the point of the above comment. The comment, as I read it, is: o.k., you are having problems with PostgreSQL, BUT MySQL isn't going to help you, and you will see that in multi-user tests. MySQL is known to work very well on small databases without a lot of concurrent sessions; I don't think anybody here would argue that. Where MySQL runs into trouble is larger databases with lots of concurrent connections.

I am a little disappointed I have not understood enough to get my analyzer to use the proper plan; we had to set seqscan off to get the select from response_line to work fast, and I had to turn off merge joins to get assoc list to work fast. Once I am up I can try to learn more about it. I am so glad there are so many folks here willing to take time to educate us newbs.

Sincerely,
Joshua D. Drake
Command Prompt, Inc.
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
I am just testing the water so to speak; if it can't handle single-user tests, then multiple-user tests are kind of a waste of time.

At the risk of being even more pedantic, let me point out that if you are going to be running your application with multiple users, the reverse is even more true: "If it can't handle multiple-user tests, then single-user tests are kind of a waste of time."

brew

==========================
Strange Brew ([EMAIL PROTECTED])
Check out my Stock Option Covered Call website http://www.callpix.com
and my Musician's Online Database Exchange http://www.TheMode.com
==========================
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Centuries ago, Nostradamus foresaw when [EMAIL PROTECTED] (Joel Fradkin) would write:

I am just testing the water so to speak; if it can't handle single-user tests, then multiple-user tests are kind of a waste of time.

I would suggest that if multi-user functionality is needed, then starting with single-user tests is a similar waste of time. There's good reason to look at it this way... It is all too common for people to try to start building things with primitive functionality, and then try to evolve the system into what they need. It is possible for that to work, if the base covers enough of the necessary functionality.

In practice, we have watched Windows evolve in such a fashion with respect to multiuser support, and, in effect, it has never really gotten it. Microsoft started by hacking something on top of MS-DOS, and by the time enough applications had enough dependencies on the way that worked, it had essentially become impossible for them to migrate properly to a multiuser model, since applications are normally designed with the myopic "this is MY computer!" model of the world.

You may not need _total_ functionality in the beginning, but, particularly for multiuser support, which has deep implications for applications, it needs to be there In The Beginning.

--
output = reverse("moc.liamg" @ "enworbbc")
http://linuxdatabases.info/info/lisp.html
"A CONS is an object which cares." -- Bernie Greenberg.
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andreas Pflug
Sent: 21 April 2005 14:06
To: Joel Fradkin
Cc: 'John A Meinel'; josh@agliodbs.com; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

Beware! From the data, I can see that you're probably using pgAdmin3. The time to execute your query, including transfer of all data to the client, is 17s in this example, while displaying it (i.e. pure GUI and memory-alloc stuff) takes 72s. Execute to a file to avoid this.

Perhaps we should add a guru hint there for longer runtimes?

Regards,
Dave
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Quoting Alvaro Herrera [EMAIL PROTECTED]: One further question is: is this really a meaningful test? I mean, in production are you going to query 300,000 rows regularly? And is the system always going to be used by only one user? I guess the question is if this big select is representative of the load you expect in production. While there may be some far-out queries that nobody would try, you might be surprised what becomes the norm for queries, as soon as the engine feasibly supports them. SQL is used for warehouse and olap apps, as a data queue, and as the co-ordinator or bridge for (non-SQL) replication apps. In all of these, you see large updates, large result sets and volatile tables (large to me means over 20% of a table and over 1M rows). To answer your specific question: yes, every 30 mins, in a data redistribution app that makes a 1M-row query, and writes ~1000 individual update files, of overlapping sets of rows. It's the kind of operation SQL doesn't do well, so you have to rely on one big query to get the data out. My 2c -- Dreams come true, not free. -- S.Sondheim, ITW
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Hrm... I was about to suggest using explain analyze for timing just the query (and not output/data transfer time), but then I remembered that explain analyze can incur some non-trivial overhead with the timing calls. Is there a way to run the query but have psql ignore the output? If so, you could use \timing. In any case, it's not valid to use pgadmin to time things. -- Jim C. Nasby, Database Consultant [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 Windows: Where do you want to go today? Linux: Where do you want to go tomorrow? FreeBSD: Are you guys coming, or what?
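[Editor's sketch] One way to do what Jim describes, timing the query in psql while discarding the output, looks roughly like this against a live server; the view name is hypothetical:

```sql
-- In psql: turn on client-side timing, then discard the rows
-- by sending query output to /dev/null.
\timing
\o /dev/null
SELECT * FROM myview;   -- hypothetical view
\o

-- Alternative: wrap the query so almost no data crosses the wire,
-- which isolates backend execution time from transfer time.
SELECT count(*) FROM (SELECT * FROM myview) AS sub;
```

Note that \timing with \o /dev/null still includes the time for the server to ship every row to psql; only the count(*) form removes the transfer cost from the measurement, which is exactly the distinction discussed later in this thread.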
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
On Fri, Apr 22, 2005 at 01:51:08PM -0400, Joel Fradkin wrote: I just finished testing Postgres, MYSQL, and MSSQL on my machine (2 gigs internal XP). I have adjusted the postgres config to what I think is an ok place and have mysql default and mssql default. Using Aqua studio, a program that hooks to all three, I have found (returning 331,640 records on all 3 databases):

          Initial exec   Second exec   Returning
MSSQL     468ms          16ms          2 mins 3 secs
MYSQL     14531ms        6625ms        2 mins 42 secs
Postgres  52120ms        11702ms       2 mins 15 secs

One further question is: is this really a meaningful test? I mean, in production are you going to query 300,000 rows regularly? And is the system always going to be used by only one user? I guess the question is if this big select is representative of the load you expect in production. What happens if you execute the query more times? Do the times stay the same as the second run? -- Alvaro Herrera ([EMAIL PROTECTED]) Use it up, wear it out, make it do, or do without
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
On Fri, Apr 22, 2005 at 05:04:19PM -0400, Joel Fradkin wrote: And is the system always going to be used by only one user? No we have 400+ concurrent users I guess the question is if this big select is representative of the load you expect in production. Yes we see many times on the two processor box running MSSQL large return sets using 100% cpu for 5-30 seconds. What happens if you execute the query more times? Do the times stay the same as the second run? I will definitely have to do pressure testing prior to going live in production. I have not done concurrent tests as honestly single user tests are failing, so multiple user testing is not something I need yet. I would very, very strongly encourage you to run multi-user tests before deciding on mysql. Mysql is nowhere near as capable when it comes to concurrent operations as PostgreSQL is. From what others have said, it doesn't take many concurrent operations for it to just fall over. I can't speak from experience because I avoid mysql like the plague, though. :) Likewise, MSSQL will probably look better single-user than it will multi-user. Unless you're going to only access the database single-user, it's just not a valid test case (and by the way, this is true no matter what database you're looking at. Multiuser access is where you uncover your real bottlenecks.) -- Jim C. Nasby, Database Consultant [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 Windows: Where do you want to go today? Linux: Where do you want to go tomorrow? FreeBSD: Are you guys coming, or what?
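[Editor's sketch] For the multi-user testing Jim recommends, pgbench (shipped in PostgreSQL's contrib) is one rough starting point against a live cluster; the scale factor, client count, transaction count, and database name below are arbitrary assumptions, not tuned values:

```shell
# Initialize a test database at scale factor 10 (about 1M account rows)
pgbench -i -s 10 testdb

# Hammer it with 50 concurrent clients, 1000 transactions each
pgbench -c 50 -t 1000 testdb
```

pgbench's canned TPC-B-style transactions won't match this application's reporting views, but even the canned workload exposes concurrency bottlenecks that no single-user test will show.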
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
One further question is: is this really a meaningful test? I mean, in production are you going to query 300,000 rows regularly? It is a query snippet if you will, as the view I posted for audit, and cases where tables are joined, are more likely to be run. Josh and I worked over this until we got explain analyze on the linux box to 1 sec. I was just using this as a test as I don't have my views set up on MYSQL. So many of my reports pull huge data sets (comprised of normalized joins). I am thinking I probably have to modify to using a non-normalized table, and Josh is sending me information on using cursors instead of selects. And is the system always going to be used by only one user? No we have 400+ concurrent users I guess the question is if this big select is representative of the load you expect in production. Yes we see many times on the two processor box running MSSQL large return sets using 100% cpu for 5-30 seconds. What happens if you execute the query more times? Do the times stay the same as the second run? I will definitely have to do pressure testing prior to going live in production. I have not done concurrent tests as honestly single user tests are failing, so multiple user testing is not something I need yet. Joel -- Alvaro Herrera ([EMAIL PROTECTED]) Use it up, wear it out, make it do, or do without
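[Editor's sketch] The cursors-instead-of-selects idea Josh mentions would look roughly like this; the view name and batch size are made up for illustration:

```sql
BEGIN;

-- Declare a cursor over the big result set instead of pulling
-- all rows into the client at once.
DECLARE report_cur CURSOR FOR
    SELECT * FROM myview;       -- hypothetical reporting view

-- Fetch and process rows in batches; the application loops on this
-- until a FETCH returns no rows.
FETCH 1000 FROM report_cur;

CLOSE report_cur;
COMMIT;
```

The payoff is that the client's memory use stays flat and the application can start rendering the first batch while the server is still producing the rest, which is essentially what makes MSSQL's first-block timing look so fast in the comparisons above.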
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Are you using 8.0.2? I hope so, because there were some Win32 performance changes related to fsync() in that release. --- Joel Fradkin wrote: I just finished testing Postgres, MYSQL, and MSSQL on my machine (2 gigs internal XP). I have adjusted the postgres config to what I think is an ok place and have mysql default and mssql default. Using Aqua studio, a program that hooks to all three, I have found (returning 331,640 records on all 3 databases):

          Initial exec   Second exec   Returning
MSSQL     468ms          16ms          2 mins 3 secs
MYSQL     14531ms        6625ms        2 mins 42 secs
Postgres  52120ms        11702ms       2 mins 15 secs

Not sure if this proves your point on PGadmin versus MYSQL query tool versus MSSQL Query tool, but it certainly seems encouraging. I am going to visit Josh's tests he wanted me to run on the LINUX server. Joel Fradkin -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
I just finished testing Postgres, MYSQL, and MSSQL on my machine (2 gigs internal XP). I have adjusted the postgres config to what I think is an ok place and have mysql default and mssql default. Using Aqua studio, a program that hooks to all three, I have found (returning 331,640 records on all 3 databases):

          Initial exec   Second exec   Returning
MSSQL     468ms          16ms          2 mins 3 secs
MYSQL     14531ms        6625ms        2 mins 42 secs
Postgres  52120ms        11702ms       2 mins 15 secs

Not sure if this proves your point on PGadmin versus MYSQL query tool versus MSSQL Query tool, but it certainly seems encouraging. I am going to visit Josh's tests he wanted me to run on the LINUX server. Joel Fradkin
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
John A Meinel [EMAIL PROTECTED] writes: Joel Fradkin wrote: Postgres was on the second run Total query runtime: 17109 ms. Data retrieval runtime: 72188 ms. 331640 rows retrieved. How were you measuring data retrieval time? I suspect he's using pgadmin. We've seen reports before suggesting that pgadmin can be amazingly slow, eg here http://archives.postgresql.org/pgsql-performance/2004-10/msg00427.php where the *actual* data retrieval time as shown by EXPLAIN ANALYZE was under three seconds, but pgadmin claimed the query runtime was 22 sec and data retrieval runtime was 72 sec. The problem is that pgAdmin takes your query results and puts them in a grid. The grid is not designed to be used in that way for large datasets. The time complexity is not linear and really breaks down around 10k-100k rows depending on various factors. pgAdmin users just have to become used to it and use limit or the filter feature at appropriate times. The ms sql enterprise manager uses cursors, which has its own set of nasty issues (no mvcc). In fairness, unless you are running with the \a switch, psql adds a fair amount of time to the query too. Joel: Postgres was on the second run Total query runtime: 17109 ms. Data retrieval runtime: 72188 ms. 331640 rows retrieved. The Data retrieval runtime is time spent by pgAdmin formatting, etc. The query runtime is the actual timing figure you should be concerned with (you are not comparing apples to apples). I can send you a utility I wrote in Delphi which adds only a few seconds overhead for a 360k result set. Or, go into psql, throw the \a switch, and run the query. Or: psql -A -c "select * from myview where x ..." > output.txt. It should finish the above in 16-17 sec plus the time to write out the file. Joel, I have a lot of experience with all three databases you are evaluating and you are making a huge mistake switching to mysql. You can make a decent case for ms sql, but it's quite expensive at your level of play as you know. 
Merlin
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Joel Fradkin wrote: I did think of something similar just loading the data tables with junk records and I may visit that idea with Josh. I did just do some comparisons on timing of a plain select * from tbl where indexed column = x and it was considerably slower than both MSSQL and MYSQL, so I am still a bit confused. This still might be a configuration issue (I ran on my 2 gig desktop and the 8 gig Linux box; comparisons were all run on the same machines as far as MSSQL, MYSQL, and Postgres go). I turned off postgres when running MYSQL and turned off MYSQL when running postgres; MSSQL had one of the two running while I tested it. For the 360,000 records returned MYSQL did it in 40 seconds first run and 17 seconds second run. MSSQL did it in 56 seconds first run and 16 seconds second run. Postgres was on the second run Total query runtime: 17109 ms. Data retrieval runtime: 72188 ms. 331640 rows retrieved. Beware! From the data, I can see that you're probably using pgAdmin3. The time to execute your query including transfer of all data to the client is 17s in this example, while displaying it (i.e. pure GUI and memory alloc stuff) takes 72s. Execute to a file to avoid this. Regards, Andreas
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Why is MYSQL returning 360,000 rows, while Postgres is only returning 330,000? This may not be important at all, though. I also assume you are selecting from a plain table, not a view. Yes plain table. Difference in rows is one of the datasets had sears data in it. It (speed differences found) is much worse on some of my views, which is what forced me to start looking at other options. I suppose knowing your work_mem, and shared_buffers settings would be useful. I have posted my configs, but will add the Tampa to the bottom again. My desktop has # - Memory - shared_buffers = 8000 # min 16, at least max_connections*2, 8KB each work_mem = 8000 #1024 # min 64, size in KB maintenance_work_mem = 16384 # min 1024, size in KB #max_stack_depth = 2048 # min 100, size in KB # - Free Space Map - max_fsm_pages = 3 #2 # min max_fsm_relations*16, 6 bytes each max_fsm_relations = 1000 # min 100, ~50 bytes each # - Planner Cost Constants - effective_cache_size = 8 #1000 # typically 8KB each random_page_cost = 2 # units are one sequential page fetch cost How were you measuring data retrieval time? And how does this compare to what you were measuring on the other machines? It might be possible that what you are really measuring is just the time it takes psql to load up all the data into memory, and then print it out. And since psql defaults to measuring entry lengths for each column, this may not be truly comparable. It *looks* like it only takes 18s for postgres to get the data, but then it is taking 72s to transfer the data to you. That would be network latency, or something like that, not database latency. And I would say that 18s is very close to 16 or 17 seconds. This was run on the machine with the database (as were MYSQL and MSSQL). The PG timing was from PGADMIN and the 18 secs was the second run; the first run was the same time to return the data plus 70 secs to do the first part, about 147 secs all told, compared to the 40 seconds first run of MYSQL and 56 seconds MSSQL. 
MYSQL was done in their query tool, it returns the rows as well, and MSSQL was done in their query analyzer. All three tools appear to use a similar approach. Just an FYI: doing an explain analyze of my problem view took much longer than actually returning the data in MSSQL and MYSQL. I have done extensive testing with MYSQL (just this table and two of my problem views). I am not using the transactional version, because I need the best speed. I don't know what commands you were issuing, or how you measured, though. You might be using some other interface (like ODBC), which I can't say too much about. John =:- This is the Linux box config. # - # PostgreSQL configuration file # - # # This file consists of lines of the form: # # name = value # # (The '=' is optional.) White space may be used. Comments are introduced # with '#' anywhere on a line. The complete list of option names and # allowed values can be found in the PostgreSQL documentation. The # commented-out settings shown in this file represent the default values. # # Please note that re-commenting a setting is NOT sufficient to revert it # to the default value, unless you restart the postmaster. # # Any option can also be given as a command line switch to the # postmaster, e.g. 'postmaster -c log_connections=on'. Some options # can be changed at run-time with the 'SET' SQL command. # # This file is read on postmaster startup and when the postmaster # receives a SIGHUP. If you edit the file on a running system, you have # to SIGHUP the postmaster for the changes to take effect, or use # pg_ctl reload. Some settings, such as listen_address, require # a postmaster shutdown and restart to take effect. #--- # FILE LOCATIONS #--- # The default values of these variables are driven from the -D command line # switch or PGDATA environment variable, represented here as ConfigDir. 
# data_directory = 'ConfigDir' # use data in another directory #data_directory = '/pgdata/data' # hba_file = 'ConfigDir/pg_hba.conf'# the host-based authentication file # ident_file = 'ConfigDir/pg_ident.conf' # the IDENT configuration file # If external_pid_file is not explicitly set, no extra pid file is written. # external_pid_file = '(none)' # write an extra pid file #--- # CONNECTIONS AND AUTHENTICATION #--- # - Connection Settings - #listen_addresses = 'localhost' # what IP interface(s) to listen on; # defaults to localhost, '*' = any listen_addresses = '*' port = 5432 max_connections = 100 # note: increasing max_connections costs about 500
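[Editor's sketch] As a reference point for the memory settings being discussed, a commonly suggested starting shape for an 8.0.x server with several GB of RAM dedicated to PostgreSQL looks like the fragment below; every number is an assumption to be validated against the actual machine and workload, not a recommendation for this specific box:

```
# Hypothetical postgresql.conf starting point, 8 GB dedicated box, 8.0.x units
shared_buffers = 25000          # 8KB pages, ~200 MB
work_mem = 16384                # KB, per sort/hash operation, per backend
maintenance_work_mem = 131072   # KB, for VACUUM and CREATE INDEX
effective_cache_size = 750000   # 8KB pages, ~6 GB of expected OS file cache
random_page_cost = 2            # below the default 4 for well-cached data
max_fsm_pages = 200000          # raise if VACUUM VERBOSE reports a shortage
```

Note work_mem is per sort or hash node per backend, so with 400+ concurrent users it multiplies quickly; that interaction is exactly why these values need testing under the multi-user load discussed earlier in the thread.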
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
FWIW, ODBC has variables to tweak, as well. fetch/buffer sizes, and the like. Maybe one of the ODBC cognoscenti here can chime in more concretely -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joel Fradkin Sent: Thursday, April 21, 2005 10:36 AM To: 'Tom Lane'; 'John A Meinel' Cc: 'Postgresql Performance' Subject: Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon I suspect he's using pgadmin. Yup I was, but I did try running on the linux box in psql, but it was running to the screen and took forever because of that. The real issue is returning to my app using ODBC is very slow (have not tested the ODBC for MYSQL; MSSQL is ok (the two proc dell is running out of steam but been good until this year when we about doubled our demand by adding sears as a client)). Using odbc to postgres on some of the views (Josh from Command is having me do some very specific testing) is timing out with a 10 minute time limit. These are pages that still respond using MSSQL (this is where production is using the dual proc and the test is using the 4 proc). I have a tool that hooks to all three databases so I can try it with that and see if I get different responses. Joel
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
I suspect he's using pgadmin. Yup I was, but I did try running on the linux box in psql, but it was running to the screen and took forever because of that. The real issue is returning to my app using ODBC is very slow (have not tested the ODBC for MYSQL; MSSQL is ok (the two proc dell is running out of steam but been good until this year when we about doubled our demand by adding sears as a client)). Using odbc to postgres on some of the views (Josh from Command is having me do some very specific testing) is timing out with a 10 minute time limit. These are pages that still respond using MSSQL (this is where production is using the dual proc and the test is using the 4 proc). I have a tool that hooks to all three databases so I can try it with that and see if I get different responses. Joel
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Here is the connect string I am using. It could be horrid as I cut it from an ODBC program. Session(StringConn) = DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=;PWD=;ReadOnly=0;Protocol=6.4;FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;ShowSystemTables=0;ConnSettings=;Fetch=100;Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=8190;Debug=0;CommLog=0;Optimizer=1;Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=0;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableCursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseServerSidePrepare=0 Joel Fradkin -Original Message- From: Mohan, Ross [mailto:[EMAIL PROTECTED] Sent: Thursday, April 21, 2005 9:42 AM To: [EMAIL PROTECTED] Subject: RE: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon FWIW, ODBC has variables to tweak, as well. fetch/buffer sizes, and the like. Maybe one of the ODBC cognoscenti here can chime in more concretely
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Joel, thanks. A couple of things jump out there for me, not a problem for a routine ODBC connection, but perhaps in the lotsa stuff context of your current explorations, it might be relevant? I am completely shooting from the hip, here, but...if it were my goose to cook, I'd be investigating Session(StringConn) = DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=; PWD=;ReadOnly=0;Protocol=6.4; || Protocol? Is this related to version? is the driver way old? FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0; ShowSystemTables=0;ConnSettings=;Fetch=100; || Fetch great for OLTP, lousy for batch? Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=8190; || what ARE the datatypes and sizes in your particular case? Debug=0; || a run with debug=1 probably would spit up something interesting CommLog=0;Optimizer=1; || Optimizer? that's a new one on me Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableCursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseServerSidePrepare=0 || that's about all I can see, prima facie. I'll be very curious to know if ODBC is any part of your performance equation. HTH, Ross -Original Message- From: Joel Fradkin [mailto:[EMAIL PROTECTED] Sent: Thursday, April 21, 2005 10:54 AM To: Mohan, Ross Cc: [EMAIL PROTECTED]; PostgreSQL Perform Subject: RE: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon Here is the connect string I am using. It could be horrid as I cut it from an ODBC program. 
Session(StringConn) = DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=;PWD=;ReadOnly=0;Protocol=6.4;FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;ShowSystemTables=0;ConnSettings=;Fetch=100;Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=8190;Debug=0;CommLog=0;Optimizer=1;Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=0;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableCursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseServerSidePrepare=0 Joel Fradkin -Original Message- From: Mohan, Ross [mailto:[EMAIL PROTECTED] Sent: Thursday, April 21, 2005 9:42 AM To: [EMAIL PROTECTED] Subject: RE: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon FWIW, ODBC has variables to tweak, as well. fetch/buffer sizes, and the like. Maybe one of the ODBC cognoscenti here can chime in more concretely
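[Editor's sketch] Of the psqlODBC settings Ross annotates, the two most likely to matter for these large result sets are UseDeclareFetch and Fetch. A batch-friendly variant of the connection string might change just these; the values are illustrative guesses, not tested recommendations:

```
; Hypothetical psqlODBC settings for large result sets
UseDeclareFetch=1   ; stream rows through a server-side cursor instead of
                    ; materializing the entire result set in the driver
Fetch=2000          ; rows per FETCH when UseDeclareFetch=1; larger batches
                    ; mean fewer round trips than the small default
```

With UseDeclareFetch=0 (as in the string above), the driver pulls the whole 300k+ row result into client memory before the application sees the first row, which matches the timeout symptoms Joel describes.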
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Joel, I did not see any marked improvement, but I don't think my issues are related to the hardware. If you won't believe it, then we certainly can't convince you. AFAIK your bad view is a bad query plan made worse by the Dell's hardware problems. I am giving up on postgres and three developers two months of work and trying MYSQL. I'd suggest testing your *whole* application and not just this one query. And remember that you need to test InnoDB tables if you want transactions. I have posted several items and not got a response (not that I expect folks to drop everything). I want to thank everyone who has been of help and there are several. Hmmm ... I see about 25 responses to some of your posts on this list. Including ones by some of our head developers. That's more than you'd get out of a paid MSSQL support contract, I know from experience. If you want anything more, then you'll need a do-or-die contract with a support company. If your frustration is because you can't find this kind of help then I completely understand ... I have a waiting list for performance contracts myself. (and, if you hired me the first thing I'd tell you is to junk the Dell) I really like the environment and feel I have learned a lot in the past few months, but bottom line for me is speed. We bought a 30K Dell 6650 to get better performance. Would have been smart to ask on this list *before* buying the Dell, hey? Even a Google of this mailing list would have been informative. I chose PG because MSSQL was 70K to license. I believe the MYSQL will be 250.00 to license for us, but I may choose the 4k platinum support just to feel safe about having someone to touch base with in the event of an issue. Hmmm ... you're willing to pay MySQL $4k but expect the PG community to solve all your problems with free advice and a couple $100 with CMD? I sense an apples vs. barca loungers comparison here ... 
I am not sure I am walking away feeling real good about postgres, because it just should not take a rocket scientist to get it to work, and I used to think I was fairly smart and could figure stuff out and I hate admitting defeat (especially since we have everything working with postgres now). While I understand your frustration (I've been frustrated more than a few times with issues that stump me on Linux, for example) it's extremely unfair to lash out at a community that has provided you a lot of free advice because the advice hasn't fixed everything yet. By my reading, you first raised your query issue 6 days ago. 6 days is not a lot of time for getting *free* troubleshooting help by e-mail. Certainly it's going to take more than 6 days to port to MySQL. -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Sorry if you feel I am lashing out at a community. Just to say it again, I am very appreciative of all the help everyone has supplied. I am running on more than just the 4 proc Dell (in fact my tests have been mostly on desktops). I have MSSQL running on a 2 proc dell which until my load increased (over aprx 2 years) was just fine. I totally agree that there are better solutions based on this list's comments, but I have all Dell hardware now and resist trying different vendors just to suit Postgres. I was under the impression there were still issues with 64bit postgres and Linux (or at least were when I purchased). I believed I could make my next acquisition Opteron-based hardware. Again I am not at all trying to criticize anyone, so please accept my apology if I somehow came across with that attitude. I am very disappointed at this point. My views may not be that great (although I am not saying that either), but they run ok on MSSQL and appear to run ok on MYSQL. I wish I did understand what I am doing wrong because I do not wish to revisit engineering our application for MYSQL. I would have spent more $ with Command, but he does need my database to help me and I am not able to do that. I agree testing the whole app is the only way to see and unfortunately it is a time consuming bit. I do not have to spend 4k on MYSQL, that is if I want to have their premium support. I can spend $250.00 a server for the commercial license if I find the whole app does run well. I just loaded the data last night and only had time to convert one view this morning. I am sure it is something I do not understand and not a problem with postgres. I also am willing to take time to get more knowledgeable, but my time is running out and I feel honestly stupid. I have been in the process of converting for over two months and have said several times these lists are a godsend. It was never my intention to make you feel like I was flaming anyone involved. 
On the contrary, I feel many have taken time to look at my questions and given excellent advice. I know I check the archives so hopefully that time will help others after me. I may yet find that MYSQL is not a good fit as well. I have my whole app converted at this point and find pg works well for a lot of my usage. There are some key reporting views that need to retrieve many rows with many joins that just take too long to pull the data. I told my boss just now that if I de-normalize many of these data sets (like the 6 main groups of data) the reporting may work, but as is, many of my web pages are timing out (these are pages that still work on MSSQL and the 2 proc machine). Thanks again for all the help and know I truly appreciate what time everyone has spent on my issues. I may find that revisiting the datasets is a way to make PG work, or as you mentioned maybe I can get someone with more knowledge to step in locally. I did ask Tom if he knew of anyone; maybe someone else on the list is aware of a professional in the Tampa FL area. Realistically I don't think a $30k Dell is something that needs to be junked. I am pretty sure if I got MSSQL running on it, it would outperform my two proc box. I can agree it may not have been the optimal platform. My decision is not based solely on the performance on the 4 proc box. Joel Fradkin -Original Message- From: Josh Berkus [mailto:[EMAIL PROTECTED] Sent: Wednesday, April 20, 2005 1:54 PM To: Joel Fradkin Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon Joel, I did not see any marked improvement, but I don't think my issues are related to the hardware. If you won't believe it, then we certainly can't convince you. AFAIK your bad view is a bad query plan made worse by the Dell's hardware problems. I am giving up on postgres and three developers two months of work and trying MYSQL. I'd suggest testing your *whole* application and not just this one query. 
And remember that you need to test InnoDB tables if you want transactions. I have posted several items and not got a response (not that I expect folks to drop everything). I want to thank everyone who has been of help and there are several. Hmmm ... I see about 25 responses to some of your posts on this list. Including ones by some of our head developers. That's more than you'd get out of a paid MSSQL support contract, I know from experience. If you want anything more, then you'll need a do-or-die contract with a support company. If your frustration is because you can't find this kind of help than I completely understand ... I have a waiting list for performance contracts myself. (and, if you hired me the first thing I'd tell you is to junk the Dell) I really like the environment and feel I have learned a lot in the past few months, but bottom line for me is speed. We bought a 30K Dell 6650 to get better performance. Would have been smart to ask
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Joel, I have MSSQL running on a 2 proc dell which until my load increased (over aprx 2 years) was just fine. I totally agree that there are better solutions based on this list's comments, but I have all Dell hardware now and resist trying different vendors just to suit Postgres. I was under the impression there were still issues with 64bit postgres and Linux (or at least were when I purchased). I believed I could make my next acquisition Opteron-based hardware. Yeah, sorry, the Dell stuff is a sore point with me. You can't imagine the number of conversations I have that go like this: We're having a severe performance problem with PostgreSQL What hardware/OS are you using? Dell *650 with RHAS 3.0 BTW, which Update version is your RHAS? If you're on Update3, you can grab more performance right there by upgrading to Update4. Again I am not at all trying to criticize anyone, so please accept my apology if I somehow came across with that attitude. I am very disappointed at this point. My views may not be that great (although I am not saying that either), but they run ok on MSSQL and appear to run ok on MYSQL. Yeah. I think you'll find a few things that are vice-versa. For that matter, I can point to a number of queries we run better than Oracle, and a number we don't. Your particular query problem seems to stem from some bad estimates. Can you post an EXPLAIN ANALYZE based on all the advice people have given you so far? I wish I did understand what I am doing wrong because I do not wish to revisit engineering our application for MYSQL. I can imagine. I would have spent more $ with Command, but he does need my database to help me and I am not able to do that. Yes. For that matter, it'll take longer to troubleshoot on this list because of your security concerns. I agree testing the whole app is the only way to see and unfortunately it is a time consuming bit. I do not have to spend 4k on MYSQL, that is if I want to have their premium support. 
I can spend $250.00 a server for the commercial license if I find the whole app does run well. I just loaded the data last night and only had time to convert one view this morning. I am sure it is something I do not understand and not a problem with Postgres. I am also willing to take time to get more knowledgeable, but my time is running out and I feel honestly stupid.

You're not. You have a real query problem, and it will require further troubleshooting to solve. Some of us make a pretty handsome living solving these kinds of problems; it takes a lot of expert knowledge.

It was never my intention to make you feel like I was flaming anyone involved. On the contrary, I feel many have taken time to look at my questions and given excellent advice. I know I check the archives, so hopefully that time will help others after me.

Well, I overreacted too. Sorry!

I may find that revisiting the datasets is a way to make PG work, or, as you mentioned, maybe I can get someone with more knowledge to step in locally. I did ask Tom if he knew of anyone; maybe someone else on the list is aware of a professional in the Tampa, FL area.

Well, Robert Treat is in Florida, but I'm pretty sure he's busy full-time.

Realistically, I don't think a $30k Dell is something that needs to be junked. I am pretty sure if I got MSSQL running on it, it would outperform my two-proc box. I can agree it may not have been the optimal platform. My decision is not based solely on the performance of the 4-proc box.

Oh, certainly it's too late to buy a Sunfire or eServer instead. You just could have gotten far more bang for the buck with some expert advice, that's all. But don't bother with Dell support any further; they don't really have the knowledge to help you. So ... new EXPLAIN ANALYZE?

-- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
He is running RHAS4, which is the latest 2.6.x kernel from RH. I believe it should have done away with the RHAS3.0 Update 3 I/O issue. anjan

-Original Message- From: Josh Berkus [mailto:[EMAIL PROTECTED] Sent: Wednesday, April 20, 2005 4:23 PM To: Joel Fradkin Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Joel Fradkin wrote: ... I would have spent more $ with Command, but he does need my database to help me and I am not able to do that. ...

What if someone were to write an anonymization script? Something that changes the data in the database but leaves all of the relational information. It could turn all strings into some sort of hashed version, so you don't give out any identifiable information. It could even modify relational entries, as long as it updated both ends, and this wouldn't affect the actual performance at all. I don't think this would be very hard to write, especially if you can give a list of the tables and which columns need to be modified. This would probably be a generally useful script to have for cases like this, where databases are confidential but need to be tuned by someone else.

Would that be reasonable? I would think that by renaming columns, hashing the data in the columns, and renaming tables, most of the proprietary information is removed without removing the database information.

John =:-
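Such a script could be done almost entirely inside PostgreSQL, since the built-in md5() function (available from 7.4 on) is deterministic: equal inputs hash to equal outputs, so foreign-key relationships still match after the rewrite. A minimal sketch; the table and column names (tblclient, clientname, contactname) are made-up placeholders, not Joel's actual schema:

```sql
BEGIN;

-- Hash identifiable strings; md5() is deterministic, so any join or
-- WHERE clause that matched before the rewrite still matches after it.
UPDATE tblclient
   SET clientname  = md5(clientname),
       contactname = md5(contactname);

-- Renaming tables hides the schema's meaning without changing its
-- shape; indexes and constraints follow the rename automatically.
ALTER TABLE tblclient RENAME TO t1;

COMMIT;
```

One caveat: ANALYZE should be re-run afterwards, and since hashed values have a different distribution than the originals, the planner's estimates (and thus the plans being debugged) may shift somewhat.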
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
On Apr 20, 2005, at 4:22 PM, Josh Berkus wrote: ... But don't bother with Dell support any further, they don't really have the knowledge to help you.

FWIW, I have a $20k Dell box (PE2650 with 14-disk external PowerVault RAID enclosure) which I'm phasing out for a dual Opteron box because it can't handle the load. It will be re-purposed as a backup system. Damn waste of money, but complaining customers can cost more... Trust me, it is likely your Dell hardware, as moving to the Opteron system has improved performance tremendously with fewer disks. Same amount of RAM and other basic configurations. Both have LSI-based RAID cards, even.
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
I did think of something similar, just loading the data tables with junk records, and I may visit that idea with Josh.

I did just do some comparisons on timing of a plain select * from tbl where indexed column = x, and it was considerably slower than both MSSQL and MySQL, so I am still a bit confused. This still might be a configuration issue (I ran on my 2 GB desktop and the 8 GB Linux box; comparisons were all run on the same machines for MSSQL, MySQL, and Postgres). I turned off Postgres when running MySQL and turned off MySQL when running Postgres; MSSQL had one of the two running while I tested it.

For the 360,000 records returned, MySQL did it in 40 seconds first run and 17 seconds second run. MSSQL did it in 56 seconds first run and 16 seconds second run. Postgres on the second run was: Total query runtime: 17109 ms. Data retrieval runtime: 72188 ms. 331640 rows retrieved. So, like 89 seconds on the second run; the first run was 147 secs all told. These are all on my 2 GB desktop running XP. I can post the config. I noticed that Postgres was using 70% of the CPU while MSSQL was at 100%.

Joel Fradkin
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Joel Fradkin wrote: ... I noticed the postgres was using 70% of the cpu while MSSQL was 100%. ...

Why is MySQL returning 360,000 rows while Postgres is only returning 331,640? This may not be important at all, though. I also assume you are selecting from a plain table, not a view. Knowing your work_mem and shared_buffers settings would be useful, too.

How were you measuring data retrieval time? And how does this compare to what you were measuring on the other machines? It might be possible that what you are really measuring is just the time it takes psql to load all the data into memory and then print it out. And since psql defaults to measuring entry lengths for each column, this may not be truly comparable. It *looks* like it only takes 18s for Postgres to get the data, but then it is taking 72s to transfer the data to you. That would be network latency, or something like that, not database latency.
And I would say that 18s is very close to 16 or 17 seconds. I don't know what commands you were issuing, or how you measured, though. You might be using some other interface (like ODBC), which I can't say too much about. John =:-
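One way to separate backend execution time from transfer-and-rendering overhead, as suggested earlier in the thread, is to wrap the query in a count(*) so only a single number crosses the wire. A sketch of a psql session; tblresponse and the clientnum filter are placeholder names, not the actual schema:

```sql
\timing

-- Full result set: the elapsed time includes shipping all ~330,000
-- rows to the client plus the client's own formatting work.
SELECT * FROM tblresponse WHERE clientnum = 'abc';

-- Same scan and filter, but only one count crosses the wire, so this
-- is close to pure backend execution time.
SELECT count(*)
  FROM (SELECT * FROM tblresponse WHERE clientnum = 'abc') AS t;
```

If the two timings differ by tens of seconds, the gap is in the network or the client tool, not in the database engine.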
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Joel, Ok, please try this:

ALTER TABLE tblresponseheader ALTER COLUMN clientnum SET STATISTICS 1000;
ALTER TABLE tblresponseheader ALTER COLUMN locationid SET STATISTICS 1000;
ALTER TABLE tbllocation ALTER COLUMN clientnum SET STATISTICS 1000;
ALTER TABLE tbllocation ALTER COLUMN divisionid SET STATISTICS 1000;
ALTER TABLE tbllocation ALTER COLUMN regionid SET STATISTICS 1000;
ANALYZE tblresponseheader;
ANALYZE tbllocation;

Then run the EXPLAIN ANALYZE again. (on Linux)

-- --Josh Josh Berkus Aglio Database Solutions San Francisco
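Whether the per-column targets actually took effect can be checked in the system catalogs; a sketch (pg_attribute.attstattarget holds the override set by SET STATISTICS, and is -1 for columns still using the default):

```sql
-- List the columns of tbllocation that carry an explicit statistics
-- target, i.e. the ones changed by the ALTER TABLE commands above.
SELECT attname, attstattarget
  FROM pg_attribute
 WHERE attrelid = 'tbllocation'::regclass
   AND attstattarget > 0;
```

The larger target only influences plans after the ANALYZE runs, since that is when the expanded statistics are actually gathered.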
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
John A Meinel [EMAIL PROTECTED] writes: Joel Fradkin wrote: Postgres was on the second run Total query runtime: 17109 ms. Data retrieval runtime: 72188 ms. 331640 rows retrieved. How were you measuring data retrieval time? I suspect he's using pgadmin. We've seen reports before suggesting that pgadmin can be amazingly slow, eg here http://archives.postgresql.org/pgsql-performance/2004-10/msg00427.php where the *actual* data retrieval time as shown by EXPLAIN ANALYZE was under three seconds, but pgadmin claimed the query runtime was 22 sec and data retrieval runtime was 72 sec. I wouldn't be too surprised if that time was being spent formatting the data into a table for display inside pgadmin. It is a GUI after all, not a tool for pushing vast volumes of data around. It'd be interesting to check the runtimes for the same query with LIMIT 3000, ie, see if a tenth as much data takes a tenth as much processing time or not. The backend code should be pretty darn linear in this regard, but maybe pgadmin isn't. regards, tom lane
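Tom's linearity check can be run directly from psql: time the full query, then the same query with a LIMIT, and see whether elapsed time shrinks roughly in proportion to the row count. A sketch with the same placeholder names as above (not the actual schema):

```sql
\timing

-- Full run: the client must receive and render all ~330,000 rows.
SELECT * FROM tblresponse WHERE clientnum = 'abc';

-- Reduced run: if the total time does not drop roughly in proportion
-- to the row count, the extra time lives in the client tool's
-- per-row handling, not in the backend.
SELECT * FROM tblresponse WHERE clientnum = 'abc' LIMIT 3000;
```

Running both in psql rather than pgadmin also removes the GUI's table-rendering cost from the comparison entirely.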