[PERFORM] How does PG know if data is in memory?

2010-09-29 Thread Fabrício dos Anjos Silva
Hi, After reading lots of documentation, I still don't understand fully how PG knows if some needed data is in memory or in second storage. While choosing the best query plan, the optimizer must take this into account. Does PG consider this? If so, how does it know? I presume it chec

Re: [PERFORM] How does PG know if data is in memory?

2010-09-29 Thread Kevin Grittner
Fabrício dos Anjos Silva wrote: > After reading lots of documentation, I still don't understand > fully how PG knows if some needed data is in memory or in second > storage. > Does PG consider this? No. > When setting seq_page_cost and random_page_cost, do I have to > consider the probabili

Re: [PERFORM] How does PG know if data is in memory?

2010-09-29 Thread Samuel Gendler
2010/9/29 Fabrício dos Anjos Silva > > >When setting seq_page_cost and random_page_cost, do I have to consider > the probability that data will be in memory? Or does seq_page_cost mean > "sequential access on disk" and random_page_cost mean "random access on > disk"? > > The reason seq_page_c

Re: [PERFORM] How does PG know if data is in memory?

2010-10-01 Thread Fabrício dos Anjos Silva
Thank you all for the replies. If PG does not know whether needed data is in memory, how does it estimate cost? There is a huge difference between access time in memory and in secondary storage. Not taking this into account results in almost "useless" estimates. I am not saying that PG does

Re: [PERFORM] How does PG know if data is in memory?

2010-10-01 Thread Craig Ringer
On 1/10/2010 7:12 PM, Fabrício dos Anjos Silva wrote: Thank you all for the replies. If PG does not know whether needed data is in memory, how does it estimate cost? There is a huge difference between access time in memory and in secondary storage. Not taking this into account results i

Re: [PERFORM] How does PG know if data is in memory?

2010-10-01 Thread Kevin Grittner
Craig Ringer wrote: > Because the query often only wants a small subset of the data, and > whole relations are rarely fully cached, it's not enough to know > that "some of relation X is cached", it has to know if the cached > parts are the parts that'll be required, or at least an > approximatio

Re: [PERFORM] How does PG know if data is in memory?

2010-10-01 Thread Pierre C
It sounds horrendously complicated to keep track of to me, and in the end it won't make query execution any faster, it'll just potentially help the planner pick a better plan. I wonder if that'd be worth the extra CPU time spent managing the cache and cache content stats, and using those ca

Re: [PERFORM] How does PG know if data is in memory?

2010-10-01 Thread Fabrício dos Anjos Silva
Craig, I agree with you. Not completely, but I do. I'm just stuck in a situation where I can't figure out what values to use for the parameters. I can't even think of a way on how to test and discover that. I followed Josh Berkus' GUC spreadsheet and some tutorials on PG wiki, but how

Re: [PERFORM] How does PG know if data is in memory?

2010-10-01 Thread Tom Lane
"Kevin Grittner" writes: > I agree with that, but I think there's an even more insidious issue > here. Biasing plans heavily toward using what is already in cache > could have a destabilizing effect on performance. Not to mention the destabilizing effect on the plans themselves. Behavior like th

Re: [PERFORM] How does PG know if data is in memory?

2010-10-01 Thread Samuel Gendler
2010/10/1 Fabrício dos Anjos Silva >Craig, > >I agree with you. Not completely, but I do. > >I'm just stuck in a situation where I can't figure out what values to > use for the parameters. I can't even think of a way on how to test and > discover that. >I followed Josh Berkus' GUC

Re: [PERFORM] How does PG know if data is in memory?

2010-10-02 Thread Craig Ringer
On 3/10/2010 7:39 AM, Richard Troy wrote: I can't speak for modern "OpenVMS", but "back in the day", VMS had a very effective memory management strategy which, in effect, made it as if all memory was a cache for disk. It did this by means of a mechanism by which to identify all potentially reach

Re: [PERFORM] How does PG know if data is in memory?

2010-10-03 Thread Greg Smith
Samuel Gendler wrote: As to your question about increasing shared_buffers to be some significant proportion of available RAM - apparently, that is not a good idea. I've seen advice that said you shouldn't go above 8GB for shared_buffers and I've also seen 12GB suggested as an upper limit, too

Re: [PERFORM] How does PG know if data is in memory?

2010-10-03 Thread Greg Smith
Fabrício dos Anjos Silva wrote: After reading lots of documentation, I still don't understand fully how PG knows if some needed data is in memory or in second storage. While choosing the best query plan, the optimizer must take this into account. Does PG consider this? If so, how does it know?

Re: [PERFORM] How does PG know if data is in memory?

2010-10-03 Thread Greg Smith
Craig Ringer wrote: If some kind of cache awareness was to be added, I'd be interested in seeing a "hotness" measure that tracked how heavily a given relation/index has been accessed and how much has been read from it recently. A sort of age-scaled blocks-per-second measure that includes both

Re: [PERFORM] How does PG know if data is in memory?

2010-10-03 Thread Greg Smith
Fabrício dos Anjos Silva wrote: Is there any automated test tool? A can compile a list of real-world queries, and provide an exact copy of my db server just for testing. But how do I do it? Write a bunch of scripts? Is there any serious tool that try different parameters, run a load test, proce

Re: [PERFORM] How does PG know if data is in memory?

2010-10-03 Thread Greg Smith
Fabrício dos Anjos Silva wrote: If someone could point good books about PG tuning, I would appreciate that. I found some yet to be released books about PG 9. Any comments about them? The largest treatment of the subject already in print I'm aware of is in the Korry and Susan Douglas "PostgreS

Re: [PERFORM] How does PG know if data is in memory?

2010-10-04 Thread Cédric Villemain
2010/10/4 Greg Smith : > Craig Ringer wrote: >> >> If some kind of cache awareness was to be added, I'd be interested in >> seeing a "hotness" measure that tracked how heavily a given relation/index >> has been accessed and how much has been read from it recently. A sort of >> age-scaled blocks-per

Re: [PERFORM] How does PG know if data is in memory?

2010-10-04 Thread Jeremy Harris
On 10/04/2010 04:22 AM, Greg Smith wrote: I had a brain-storming session on this subject with a few of the hackers in the community in this area a while back I haven't had a chance to do something with yet (it exists only as a pile of scribbled notes so far). There's a couple of ways to collect

Re: [PERFORM] How does PG know if data is in memory?

2010-10-11 Thread Robert Haas
On Mon, Oct 4, 2010 at 6:47 PM, Jeremy Harris wrote: > On 10/04/2010 04:22 AM, Greg Smith wrote: >> >> I had a brain-storming session on this subject with a few of the hackers >> in the community in this area a while back I haven't had a chance to do >> something with yet (it exists only as a pile

Re: [PERFORM] How does PG know if data is in memory?

2010-10-11 Thread gnuoytr
message >Date: Mon, 11 Oct 2010 22:59:28 -0400 >From: pgsql-performance-ow...@postgresql.org (on behalf of Robert Haas >) >Subject: Re: [PERFORM] How does PG know if data is in memory? >To: Jeremy Harris >Cc: pgsql-performance@postgresql.org > >On Mon, Oct 4, 2010 a

Re: [PERFORM] How does PG know if data is in memory?

2010-10-12 Thread Robert Haas
On Mon, Oct 11, 2010 at 11:11 PM, wrote: > An approach that works can be found in DB2, and likely elsewhere. > > The key is that tablespaces/tables/indexes/buffers are all attached through > the bufferpool (the DB2 term).  A tablespace/bufferpool match is defined.   > Then tables and indexes are

Re: [PERFORM] How does PG know if data is in memory?

2010-10-12 Thread gnuoytr
thout that, I'm not sure what benefit there is to tablespaces, other than a sort of RAID-lite. Robert Original message >Date: Tue, 12 Oct 2010 08:34:23 -0400 >From: pgsql-performance-ow...@postgresql.org (on behalf of Robert Haas >) >Subject: Re: [PERFORM] How does P

Re: [PERFORM] How does PG know if data is in memory?

2010-10-12 Thread Kevin Grittner
wrote: > An approach that works can be found in DB2, and likely elsewhere. > > The key is that tablespaces/tables/indexes/buffers are all > attached through the bufferpool (the DB2 term). A tablespace/ > bufferpool match is defined. Then tables and indexes are assigned > to the tablespace (an

Re: [PERFORM] How does PG know if data is in memory?

2010-10-12 Thread gnuoytr
evin Grittner" >) >Subject: Re: [PERFORM] How does PG know if data is in memory? >To: , > > wrote: > >> An approach that works can be found in DB2, and likely elsewhere. >> >> The key is that tablespaces/tables/indexes/buffers are all >> attache

Re: [PERFORM] How does PG know if data is in memory?

2010-10-12 Thread Kevin Grittner
wrote: > -- None of that is anything for amateurs to play with. > > Not jam a stick in anybody's eye, but shouldn't database pros not > be amateurs? While many PostgreSQL installations are managed by professional DBAs, or programmers or consultants with a deep enough grasp of the issues to tu

Re: [PERFORM] How does PG know if data is in memory?

2010-10-12 Thread Ron Mayer
Kevin Grittner wrote: > > ...Sybase named caches...segment off portions of the memory for > specific caches... bind specific database > objects (tables and indexes) to specific caches. ... > > When I posted to the list about it, the response was that LRU > eviction was superior to any tuning an

Re: [PERFORM] How does PG know if data is in memory?

2010-10-12 Thread Robert Haas
On Tue, Oct 12, 2010 at 10:35 AM, Kevin Grittner wrote: > (1) Heavily used data could be kept fully cached in RAM and not > driven out by transient activity. We've attempted to address this problem by adding logic to prevent the buffer cache from being trashed by vacuums, bulk loads, and sequenti

Re: [PERFORM] How does PG know if data is in memory?

2010-10-14 Thread Cédric Villemain
2010/10/13 Ron Mayer : > Kevin Grittner wrote: >> >> ...Sybase named caches...segment off portions of the memory for >> specific caches... bind specific database >> objects (tables and indexes) to specific caches. ... >> >> When I posted to the list about it, the response was that LRU >> eviction w

Re: [PERFORM] How does PG know if data is in memory?

2010-10-21 Thread Bruce Momjian
Kevin Grittner wrote: > wrote: > > > -- None of that is anything for amateurs to play with. > > > > Not jam a stick in anybody's eye, but shouldn't database pros not > > be amateurs? > > While many PostgreSQL installations are managed by professional > DBAs, or programmers or consultants with

Re: [PERFORM] How does PG know if data is in memory?

2010-10-27 Thread Francisco Reyes
Greg Smith writes: heard privately from two people who have done similar experiments on Linux and found closer to 8GB to be the point where performance started So on a machine with 72GB is 8GB still the recommended value? Usually have only 10 to 20 connections. -- Sent via pgsql-performanc

Re: [PERFORM] How does PG know if data is in memory?

2010-10-28 Thread Cédric Villemain
2010/10/28 Francisco Reyes : > Greg Smith writes: > >> heard privately from two people who have done similar experiments on Linux >> and found closer to 8GB to be the point where performance started > > So on a machine with 72GB is 8GB still the recommended value? Yes, as a maximum, not a minimum.