Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread Harald Lau (Sector-X)
Mark,

 MSSQL Machine:
 That Explain Analyze command doesn't work for MSSQL,

try this:
set showplan_all on
go
select ...
go

Harald

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables

2004-09-14 Thread Harald Lau (Sector-X)
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

 From: Mischa Sandberg [EMAIL PROTECTED]
 
  If your company is currently happy with MySQL, there probably are
  other (nontechnical) reasons to stick with it. I'm impressed that
  you'd consider reconsidering PG.
 
 I'd like to second Mischa on that issue.

Though both of you are right from my point of view, I don't think
it's very useful to discuss this item here.

Having once migrated a MySQL-DB to PG I can confirm, that in fact
chances are good you will be unhappy if you adopt the MySQL
data-model and the SQL 1:1.
As well as PG has to be much more configured and optimized than
MySQL.
As well as the client-application is supposed to be modified to a
certain extend, particularly if you want to take over some -or some
more- business-logic from client to database.

But, from what Mark stated so far I'm sure he is not going to migrate
his app just for fun, resp. without having considered this.

 NEVER reimplement an existing system unless the project includes
 substantial functional imporovement.

or monetary issues
I know one big database that was migrated from Oracle to PG and
another from SQLServer to PG because of licence-costs. Definitely
there are some more.
That applies to MySQL, too; licence policy is somewhat obscure to me,
but under certain circumstances you have to pay

regards Harald

-BEGIN PGP SIGNATURE-
Version: PGPfreeware 6.5.3 for non-commercial use http://www.pgp.com

iQA/AwUBQUb+O8JpD/drhCuMEQJCZACgqdJsrWjOwdP779PFaFMjxdgvqkwAoIPc
jPONy6urLRLf3vylVjVlEyci
=/1Ka
-END PGP SIGNATURE-


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] no index-usage on aggregate-functions?

2004-06-29 Thread Harald Lau (Sector-X)
Hi,

I've experienced that PG up to current release does not make use of an index when 
aggregating. Which of course may result in unacceptable answering times

This behaviour is reproducable on any table with any aggregat function in all of my 
databases on every machine (PostgreSQL 7.4.2 on i386-redhat-linux-gnu and PostgreSQL 
7.2.1 on i686-pc-linux-gnu)

f.e. querying against a 2.8-mio-records (2.800.000) table the_table
SELECT count(*) FROM the_table
= Seq scan - takes about 12 sec

SELECT Avg(num_found) AS NumFound FROM the_table --(index on num_found)
= Seq scan - takes about 10 sec

SELECT Sum(num_found) AS TotalFound FROM the_table --(index on num_found)
= Seq scan - takes about 11 sec

SELECT Max(date_) AS LatestDate FROM the_table --(index on date_)
= Seq scan - takes about 14 sec

But
SELECT date_ AS LatestDate FROM the_table ORDER BY date_ DESC LIMIT 1;
= Index scan - takes 0.18 msec

MS SQLServer 2000: Use of an appropriate index _whenever_ aggregating.

Am I doing something wrong?

Greetings Harald

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] no index-usage on aggregate-functions?

2004-06-29 Thread Harald Lau (Sector-X)
@Chris:

  SELECT count(*) FROM the_table
  = Seq scan - takes about 12 sec
 This cannot be made O(1) in postgres due to MVCC.  You just have to live 
 with it.

bad news
BTW: in this case you could workaround
select reltuples from pg_class where relname='the_table'
(yes, I know: presumes a regular vacuum analyse)

 Average and sum can never use an index AFAIK, in any db server.  You 
 need information from every row.

Take a look at the SQLSrv-pendant:
create index x_1 on the_table (num_found)
select avg(num_found) from the_table
- Index Scan(OBJECT:([midata].[dbo].[THE_TABLE].[x_1])

(I'm not sure what Oracle does - have to re-install it first ...)


@Scott:
 Yes, you're expecting an MVCC database to behave like a row locking
 database.

h...
So, it seems that PG is not s well suited for a datawarehouse and/or performing 
extensive statistics/calculations/reportings on large tables, is it?

Greetings Harald

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] no index-usage on aggregate-functions?

2004-06-29 Thread Harald Lau (Sector-X)
 Note that there ARE other options.  While the inability to provide a
 speedy count is a cost of using an MVCC system, the ability to allow
 thousands of readers to run while updates are happening underneath them
 more than makes up for the slower aggregate performance.

IMO this depends on the priority of your application resp. the customers intentions 
and wishes

 This, however, is not paradise.

you can't have it all ;-)

 On the contrary, it makes it GREAT for datawarehousing.  Not because any
 one child process will be super fast, but because ALL the child
 processes will run reasonably fast, even under very heavy read and write
 load.

What I meant with datawarehouse are many db's at many locations whose data are to be 
collected in one central db in order to mix em up, sum up or do anything equivalent.
But in fact my quite heavy-read/write-accessed db is running really fast since 1 1/2 
years now
Even though still on PG 7.2
The one and only bottleneck are the statistics and the reports - and the tables are 
getting larger and larger ...

  Note that if you've got the memory for the hash agg algo to fire
 into shared memory, it's pretty darned fast now,

yes, I've noticed here on the testing server

 so if the data (mostly)
 fit into kernel cache you're gold.  And 12 gig Intel boxes aren't that
 expensive, compared to an Oracle license.

*that's* the point ...

Anyway: Greetings and thanks for your answers
Harald

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html