Re: [PERFORM] Evaluation of PG performance vs MSDE/MSSQL 2000 (not 2005)

2007-12-06 Thread Mark Cave-Ayland
On Thu, 2007-12-06 at 08:50 +, Dave Page wrote:

> EnterpriseDB Postgres is essentially a packaging and bundling project
> whose aim is to provide consistent, easy-to-use installers for
> Windows, Mac and Linux that allow users to get started with Postgres,
> Slony, PostGIS, pgAdmin, phpPgAdmin etc.
> 
> EnterpriseDB Advanced Server is an entirely different product - that's
> the one that includes the Oracle compatibility and replication/migration
> tools etc.

Ah indeed - my mistake for not realising EnterpriseDB Postgres was a
different product from EnterpriseDB Advanced Server
(postgres.enterprisedb.com turned out to be quite an enlightening read).


ATB,

Mark.

-- 
ILande - Open Source Consultancy
http://www.ilande.co.uk





Re: [PERFORM] Evaluation of PG performance vs MSDE/MSSQL 2000 (not 2005)

2007-12-05 Thread Mark Cave-Ayland
On Wed, 2007-12-05 at 00:13 -0800, Robert Bernabe wrote:
> Hi All,
> I've been tasked to evaluate PG as a possible replacement for our
> MS SQL 2000 solution. Our solution is 100% stored procedure/function
> centric. It's a report generation system whose sole task is to produce
> text files filled with processed data that is post-processed by a
> secondary system. Basically options are selected via a web interface
> and all these parameters are passed to the stored procedure, which
> then runs and in the process calls other stored procedures until
> eventually a single formatted text file is produced.
> I decided on Fedora Core 7 and the 8.3 Beta release of Enterprise
> DB PostgreSQL. I decided to port one stored procedure plus its several
> supporting stored procedures into pl/pgsql from T-SQL and compare the
> performance by measuring how long each system takes to produce the
> text file. For this test, the output to the text file was discarded
> and the stored procedure/function would end once the final temporary
> table is filled with the information that is eventually dumped into
> the text file. 
> 
> Windows 2000 Professional + MSDE (/MS SQL) Box   vs.   FC7 +
> EnterpriseDB PG Box
> 
> Note that both boxes have EXACTLY the same hardware (not VMWARE or
> anything) 
> AMD X2 3800
> 2 G RAM DDR 400
> 80 G Barracuda Sata
> 
> The data was copied to the Linux box and checked lightly for
> consistency against the Windows box (number of tables, columns and
> records); everything matched. After data transfer to the Linux box, I
> ran REINDEX and ANALYZE. 
> 
> For the actual run, the following tables were used and I'm displaying
> the results of ANALYZE.
> 
> INFO:  analyzing "public.AreaDefn"
> INFO:  "AreaDefn": scanned 15 of 15 pages, containing 2293 live rows
> and 0 dead rows; 2293 rows in sample, 2293 estimated total rows
> INFO:  analyzing "public.AreaDefn2"
> INFO:  "AreaDefn2": scanned 30 of 30 pages, containing 3439 live rows
> and 0 dead rows; 3000 rows in sample, 3439 estimated total rows
> INFO:  analyzing "public.Areas"
> INFO:  "Areas": scanned 2 of 2 pages, containing 164 live rows and 0
> dead rows; 164 rows in sample, 164 estimated total rows
> INFO:  analyzing "public.Brands"
> INFO:  "Brands": scanned 1 of 1 pages, containing 11 live rows and 0
> dead rows; 11 rows in sample, 11 estimated total rows
> INFO:  analyzing "public.Categories"
> INFO:  "Categories": scanned 1 of 1 pages, containing 26 live rows and
> 0 dead rows; 26 rows in sample, 26 estimated total rows
> INFO:  analyzing "public.CategoryDefn"
> INFO:  "CategoryDefn": scanned 1 of 1 pages, containing 133 live rows
> and 0 dead rows; 133 rows in sample, 133 estimated total rows
> INFO:  analyzing "public.CategoryDefn2"
> INFO:  "CategoryDefn2": scanned 2 of 2 pages, containing 211 live rows
> and 0 dead rows; 211 rows in sample, 211 estimated total rows
> INFO:  analyzing "public.CategorySets"
> INFO:  "CategorySets": scanned 1 of 1 pages, containing 3 live rows
> and 0 dead rows; 3 rows in sample, 3 estimated total rows
> INFO:  analyzing "public.CATemplateGroup"
> INFO:  analyzing "public.Channels"
> INFO:  "Channels": scanned 1 of 1 pages, containing 7 live rows and 0
> dead rows; 7 rows in sample, 7 estimated total rows
> INFO:  analyzing "public.ClientCodes"
> INFO:  analyzing "public.Clients"
> INFO:  "Clients": scanned 7 of 7 pages, containing 366 live rows and 0
> dead rows; 366 rows in sample, 366 estimated total rows
> INFO:  analyzing "public.Customers"
> INFO:  "Customers": scanned 2 of 2 pages, containing 129 live rows and
> 0 dead rows; 129 rows in sample, 129 estimated total rows
> INFO:  analyzing "public.Databases"
> INFO:  "Databases": scanned 1 of 1 pages, containing 1 live rows and 0
> dead rows; 1 rows in sample, 1 estimated total rows
> INFO:  analyzing "public.DataSources"
> INFO:  "DataSources": scanned 1 of 1 pages, containing 8 live rows and
> 0 dead rows; 8 rows in sample, 8 estimated total rows
> INFO:  analyzing "public.DateToWeekConversion"
> INFO:  "DateToWeekConversion": scanned 4 of 4 pages, containing 371
> live rows and 0 dead rows; 371 rows in sample, 371 estimated total
> rows
> INFO:  analyzing "public.Divisions"
> INFO:  "Divisions": scanned 1 of 1 pages, containing 1 live rows and 0
> dead rows; 1 rows in sample, 1 estimated total rows
> INFO:  analyzing "public.MetricTable"
> INFO:  "MetricTable": scanned 1 of 1 pages, containing 48 live rows
> and 0 dead rows; 48 rows in sample, 48 estimated total rows
> INFO:  analyzing "public.Offtake"
> INFO:  "Offtake": scanned 3000 of 13824 pages, containing 141000 live
> rows and 0 dead rows; 3000 rows in sample, 649728 estimated total rows
> INFO:  analyzing "public.SKUs"
> INFO:  "SKUs": scanned 3 of 3 pages, containing 73 live rows and 0
> dead rows; 73 rows in sample, 73 estimated total rows
> INFO:  analyzing "public.SMSDefaults"
> INFO:  "SMSDefaults": scanned 1 of 1 pages, containing 43 live rows
> and 0 dead rows; 43 rows in sample, 43 estimated total rows
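
As a rough illustration of the kind of port being discussed - the actual
procedures are not shown in the thread - a T-SQL report procedure that
fills a temporary table might translate to PL/pgSQL along these lines
(the function, parameter and column names here are hypothetical):

  -- Illustrative sketch only: the names below are invented, not taken
  -- from the real report system. T-SQL's "SELECT ... INTO #tmp" pattern
  -- becomes an explicit temporary table in PL/pgSQL. Note that utility
  -- statements in PL/pgSQL cannot reference function parameters
  -- directly, so the table is created first and then filled with an
  -- INSERT, which can.
  CREATE OR REPLACE FUNCTION generate_report(p_client integer,
                                             p_week   integer)
  RETURNS void AS $$
  BEGIN
      CREATE TEMPORARY TABLE report_lines (
          sku_id    integer,
          total_qty bigint
      ) ON COMMIT DROP;

      INSERT INTO report_lines (sku_id, total_qty)
      SELECT o.sku_id, sum(o.quantity)
      FROM   "Offtake" o
      WHERE  o.client_id = p_client
      AND    o.week_no   = p_week
      GROUP  BY o.sku_id;

      -- Further steps (calls to the supporting functions, updates to
      -- the temp table) would follow here, mirroring the chain of
      -- stored procedures in the original T-SQL design; the caller
      -- would dump report_lines to the text file.
  END;
  $$ LANGUAGE plpgsql;

Timing such a function from psql is then just a matter of \timing
followed by SELECT generate_report(...), and the INFO lines quoted above
are what ANALYZE VERBOSE prints after a data load.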

Re: [PERFORM] [postgis-users] Is my query planner failing me, or vice versa?

2005-12-20 Thread Mark Cave-Ayland
> -Original Message-
> From: Tom Lane [mailto:[EMAIL PROTECTED]
> Sent: Wed 12/14/2005 9:36 PM
> To:   Gregory S. Williamson
> Cc:   pgsql-performance@postgresql.org; PostGIS Users Discussion
> Subject:  Re: [PERFORM] [postgis-users] Is my query planner failing me,
> or vice versa?
> "Gregory S. Williamson" <[EMAIL PROTECTED]> writes:
> > Forgive the cross-posting, but I found myself wondering if there might
> > not be some future way of telling the planner that a given table
> > (column?) has a high likelihood of being TOASTed.
> 
> What would you expect the planner to do with the information, exactly?
> 
> We could certainly cause ANALYZE to record some estimate of this, but
> I'm not too clear on what happens after that...
> 
>   regards, tom lane
>
>
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:postgis-users-
> [EMAIL PROTECTED] On Behalf Of Gregory S. Williamson
> Sent: 15 December 2005 12:03
> To: Tom Lane
> Cc: pgsql-performance@postgresql.org; PostGIS Users Discussion
> Subject: RE: [PERFORM] [postgis-users] Is my query planner failing me, or
> vice versa?
> 
> Well, what does random_page_cost do internally?
> 
> I don't think I'd expect postgres to be able to *do* anything in
> particular, any more than I would expect it to "do" something about slow
> disk I/O or having limited cache. But it might be useful to EXPLAIN
> ANALYZE in estimating the cost of retrieving such data.
> 
> Admittedly, this is not as clear-cut as preferring a sequential scan to
> indexed reads when there are either very few rows or a huge number, but
> it strikes me as useful to me, the DBA, to have this factoid thrust in
> front of me when considering why a given query is slower than I might
> like. Perhaps an added cost based on this factor and the
> random_page_cost value, since lots of TOAST data and a high access time
> would indicate to my (ignorant!) mind that retrieval would be slower,
> especially over large data sets.
> 
> Forgive my ignorance ... obviously I am but a humble user. grin.
> 
> G
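
To the question of what random_page_cost does internally: it is a fixed
planner cost constant, set by the DBA rather than measured, that scales
the estimated cost of each non-sequentially fetched page. For example:

  -- random_page_cost defaults to 4.0 and is not derived from actual
  -- disk timings; it only scales the planner's cost arithmetic.
  SHOW random_page_cost;

  -- Lowering it for a session biases the planner towards index scans,
  -- a common tuning step when the working set is well cached:
  SET random_page_cost = 2.0;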


As I understood from the original discussions with Markus/Tom, the problem
was that the optimizer didn't consider the VacAttrStats stawidth value when
calculating the cost of a sequential scan. I don't know
if this is still the case though - Tom will probably have a rough idea
already whereas I would need to spend some time sifting through the source.

However, I do know that the PostGIS statistics collector does store the
average detoasted geometry size in stawidth during ANALYZE, so the value is
there if it can be used.


Kind regards,

Mark.


WebBased Ltd
17 Research Way
Plymouth
PL6 8BT

T: +44 (0)1752 797131
F: +44 (0)1752 791023

http://www.webbased.co.uk   
http://www.infomapper.com
http://www.swtc.co.uk  



