Re: [PERFORM] Random Page Cost and Planner

2010-05-26 Thread David Jarvis
Hi, Rob. I tried bumping the effective_cache_size. It made no difference. My latest attempt at forcing PostgreSQL to use the indexes involved two loops: one to loop over the stations, the other to extract the station data from the measurement table. The outer loop executes in 1.5 seconds. The
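
For readers following along, the two-loop approach described above might look roughly like the following PL/pgSQL sketch. This is not the poster's actual code; the table names climate.station and climate.measurement come from later messages in the thread, while station_id and the other column names are assumed for illustration.

    CREATE OR REPLACE FUNCTION yearly_amounts_by_station()
    RETURNS void AS $$
    DECLARE
        s RECORD;
    BEGIN
        -- Outer loop: iterate over the stations of interest (the poster
        -- reports this part runs in about 1.5 seconds).
        FOR s IN SELECT id FROM climate.station LOOP
            -- Inner step: pull that one station's rows from the measurement
            -- table, hoping the planner uses a per-station index.
            PERFORM extract(YEAR FROM m.taken), avg(m.amount)
            FROM climate.measurement m
            WHERE m.station_id = s.id
            GROUP BY extract(YEAR FROM m.taken);
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;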

Re: [PERFORM] Random Page Cost and Planner

2010-05-26 Thread Alexey Klyukin
On May 26, 2010, at 6:50 AM, David Jarvis wrote: That said, when using the following condition, the query is fast (1 second): extract(YEAR FROM sc.taken_start) = 1963 AND extract(YEAR FROM sc.taken_end) = 2009 AND - Index Scan using measurement_013_stc_idx
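
As a general aside (not necessarily what was done in this thread): a predicate of the form extract(YEAR FROM some_date_column) = constant cannot use a plain index on the column itself, but it can use an expression index on that exact expression. A minimal sketch, assuming taken_start is a date column and with the index name invented:

    -- Hypothetical expression index so that a condition such as
    -- "extract(YEAR FROM sc.taken_start) = 1963" can be satisfied by an index scan.
    CREATE INDEX station_category_start_year_idx
        ON climate.station_category ((extract(YEAR FROM taken_start)));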

Re: [PERFORM] shared_buffers advice

2010-05-26 Thread Cédric Villemain
2010/5/24 Konrad Garus konrad.ga...@gmail.com: 2010/3/11 Paul McGarry p...@paulmcgarry.com: I'm basically wondering how the postgresql cache (ie shared_buffers) and the OS page_cache interact. The general advice seems to be to assign 1/4 of RAM to shared buffers. I don't have a good
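
For reference, the "1/4 of RAM" rule of thumb maps onto postgresql.conf roughly as follows; the numbers below are purely illustrative and are not from this thread:

    # Illustrative postgresql.conf excerpt for a dedicated 8GB machine:
    shared_buffers = 2GB          # roughly 1/4 of physical RAM
    effective_cache_size = 6GB    # planner estimate of shared_buffers + OS page cache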

Re: [PERFORM] Random Page Cost and Planner

2010-05-26 Thread tv

Re: [PERFORM] Random Page Cost and Planner

2010-05-26 Thread Kevin Grittner
David Jarvis thanga...@gmail.com wrote: It sounds as though the active portion of your database is pretty much cached in RAM. True? I would not have thought so; there are seven tables, each with 39 to 43 million rows. The machine has 4GB of RAM. In that case, modifying seq_page_cost or
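
For anyone experimenting along the same lines, both costs can be tried per-session before changing postgresql.conf; the values below are illustrative only:

    -- Try cheaper random I/O in one session and re-run EXPLAIN ANALYZE:
    SET seq_page_cost = 1.0;       -- the default
    SET random_page_cost = 2.0;    -- down from the default of 4.0
    -- Undo if it does not help:
    RESET random_page_cost;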

Re: [PERFORM] performance of temporary vs. regular tables

2010-05-26 Thread Joachim Worringen
On 25.05.2010 12:41, Andres Freund wrote: On Tuesday 25 May 2010 11:00:24 Joachim Worringen wrote: Thanks. So, the Write-Ahead-Logging (being used or not) does not matter? It does matter quite significantly in my experience, both from an I/O and a CPU overhead perspective. O.k., looks as if
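
For context on why WAL shows up here: writes to a regular table are WAL-logged, while writes to a temporary table are not (temporary tables are private to the session and use local buffers). A minimal illustration with invented table names:

    -- Regular table: every insert/update is written to the WAL.
    CREATE TABLE work_queue (id integer, payload text);

    -- Temporary table: same shape, but its writes bypass the WAL.
    CREATE TEMPORARY TABLE work_queue_tmp (id integer, payload text);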

[PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL

2010-05-26 Thread Eliot Gable
I have been Googling for answers on this for a while, and have not been able to find anything satisfactory. Imagine that you have a stored procedure which is currently written using PL/PGSQL. This stored procedure performs lots of long, complex SQL queries (95% SELECT statements, 5% INSERT or
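
To make the comparison concrete, the kind of function being described is one whose runtime is dominated by the SQL it issues rather than by procedural logic; everything in the sketch below is invented for illustration:

    CREATE OR REPLACE FUNCTION summarize_orders(cust integer)
    RETURNS numeric AS $$
    DECLARE
        total numeric;
    BEGIN
        -- The bulk of the work: set-oriented SQL that the executor runs the
        -- same way regardless of the calling language.
        SELECT sum(amount) INTO total
        FROM orders
        WHERE customer_id = cust;

        -- The occasional write.
        INSERT INTO order_audit (customer_id, checked_at)
        VALUES (cust, now());

        RETURN total;
    END;
    $$ LANGUAGE plpgsql;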

Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL

2010-05-26 Thread Stephen Frost
* Eliot Gable (egable+pgsql-performa...@gmail.com) wrote: Would a query such as this obtain any performance improvement by being re-written using C? I wouldn't expect the queries called by the pl/pgsql function to be much faster if called through SPI from C instead. I think the question you

Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL

2010-05-26 Thread Eliot Gable
Thanks for the quick follow-up. So, you are saying that if I can do SPI in _PG_init, then I could prepare all my queries there and they would be prepared once for the entire function when it is loaded? That would certainly achieve what I want. Does anybody know whether I can do SPI in _PG_init?
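
The question above is specifically about SPI from C, but for readers unfamiliar with the underlying idea, the SQL-level analogue of "prepare once, execute many times" looks like this (table and statement names are invented; prepared statements last for the lifetime of the session):

    -- Planned once per session...
    PREPARE get_user (integer) AS
        SELECT * FROM users WHERE id = $1;

    -- ...then executed repeatedly without re-parsing the statement:
    EXECUTE get_user(42);
    EXECUTE get_user(43);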

Re: [PERFORM] Random Page Cost and Planner

2010-05-26 Thread David Jarvis
Hi, Alexey. Is it necessary to get the data as far as 1900 all the time ? Maybe there is a possibility to aggregate results from the past years if they are constant. This I have done. I created another table (station_category) that associates stations with when they started to take
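
A minimal sketch of what such an association table might look like; the real definition is not shown in the thread, and the column names below are either guessed from conditions quoted elsewhere (taken_start, taken_end) or invented (category_id):

    CREATE TABLE climate.station_category (
        station_id  integer NOT NULL,   -- references climate.station
        category_id integer NOT NULL,   -- which kind of measurement
        taken_start date    NOT NULL,   -- first date measurements were taken
        taken_end   date    NOT NULL,   -- last date measurements were taken
        PRIMARY KEY (station_id, category_id)
    );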

Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL

2010-05-26 Thread Stephen Frost
* Eliot Gable (egable+pgsql-performa...@gmail.com) wrote: Thanks for the quick follow-up. So, you are saying that if I can do SPI in _PG_init, then I could prepare all my queries there and they would be prepared once for the entire function when it is loaded? That would certainly achieve what

Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL

2010-05-26 Thread Eliot Gable
Ah, that clears things up. Yes, the connections are more or less persistent. I have a connection manager which doles connections out to the worker threads and reclaims them when the workers are done with them. It dynamically adds new connections based on load. Each worker obtains a connection from

Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL

2010-05-26 Thread Stephen Frost
* Eliot Gable (egable+pgsql-performa...@gmail.com) wrote: Since PostgreSQL is written in C, I assume there is no such additional overhead. I assume that the PL/PGSQL implementation at its heart also uses SPI to perform those executions. Is that a fair statement? Right, but I also wouldn't

Re: [PERFORM] performance of temporary vs. regular tables

2010-05-26 Thread Grzegorz Jaśkiewicz
WAL matters in performance. Hence why it is advisable to have it on a separate drive :)

Re: [PERFORM] PostgreSQL Function Language Performance: C vs PL/PGSQL

2010-05-26 Thread Craig James
On 5/26/10 9:47 AM, Stephen Frost wrote: * Eliot Gable (egable+pgsql-performa...@gmail.com) wrote: Since PostgreSQL is written in C, I assume there is no such additional overhead. I assume that the PL/PGSQL implementation at its heart also uses SPI to perform those executions. Is that a fair

Re: [PERFORM] Random Page Cost and Planner

2010-05-26 Thread tv
Hi, And this is what happens in the queries above - the first query covers years 1963-2009, while the second one covers 1900-2009. Given the fact this table contains ~40m rows, the first query returns about 0.01% (3k rows) while the second one returns almost 50% of the data (18m rows). So I
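
The quoted selectivity figures can be sanity-checked with simple arithmetic (row counts rounded as in the message):

    -- Roughly 3k of ~40M rows versus 18M of ~40M rows:
    SELECT round(3000.0     / 40000000 * 100, 4) AS pct_first,    -- 0.0075%, the "about 0.01%"
           round(18000000.0 / 40000000 * 100, 1) AS pct_second;   -- 45%, the "almost 50%"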

Re: [PERFORM] Random Page Cost and Planner

2010-05-26 Thread David Jarvis
Hi, Kevin. below something in the range of 1.5 to 2 is probably not going to be a good choice for the mix as a whole. Good to know; thanks. This should probably be set to something on the order of 3GB. This will help the optimizer make more intelligent choices about when use of the index
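
For reference, on the 4GB machine discussed in this thread that suggestion corresponds to something like the following (shown as a per-session SET for experimentation; the permanent setting belongs in postgresql.conf):

    -- Illustrative: tell the planner roughly how much data the OS page cache
    -- plus shared_buffers can be expected to hold on a 4GB box.
    SET effective_cache_size = '3GB';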

Re: [PERFORM] Random Page Cost and Planner

2010-05-26 Thread David Jarvis
Hi, sc.taken_end = '1996-12-31'::date AND m.taken BETWEEN sc.taken_start AND sc.taken_end AND category of data at a certain time. But I'm afraid this makes the planning much more difficult, as the select from measurements depend on the data returned by other parts of the query (rows from

Re: [PERFORM] Random Page Cost and Planner

2010-05-26 Thread David Jarvis
I was told to try OVERLAPS instead of checking years. The query is now: SELECT extract(YEAR FROM m.taken) AS year, avg(m.amount) as amount FROM climate.city c, climate.station s, climate.station_category sc, climate.measurement m WHERE c.id = 5148 AND
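
For readers who have not used it, OVERLAPS is a standard SQL predicate that PostgreSQL supports natively; a standalone illustration with made-up dates (the thread's actual rewritten condition is cut off in the preview above):

    -- (start1, end1) OVERLAPS (start2, end2) is true when the two periods overlap:
    SELECT (DATE '1963-01-01', DATE '2009-12-31')
           OVERLAPS (DATE '1996-01-01', DATE '1996-12-31');   -- returns true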