Re: [PERFORM] sequential scan on select distinct

2004-10-06 Thread Pierre-Frédéric Caillaud
You could try :
explain analyze select land from customer_dim group by land;
It will be a lot faster but I can't make it use the index on my machine...
Example :
	create table dummy as (select id, id%255 as number from a large table  
with 1M rows);
	so we have a table with 255 (0-254) distinct number values.


=> explain analyze select distinct number from dummy;
 Unique  (cost=69.83..74.83 rows=200 width=4) (actual  
time=13160.490..14414.004 rows=255 loops=1)
   ->  Sort  (cost=69.83..72.33 rows=1000 width=4) (actual  
time=13160.483..13955.792 rows=1000000 loops=1)
 Sort Key: number
 ->  Seq Scan on dummy  (cost=0.00..20.00 rows=1000 width=4)  
(actual time=0.052..1759.145 rows=1000000 loops=1)
 Total runtime: 14442.872 ms

=>  Horribly slow because it has to sort 1M rows for the Unique.

=> explain analyze select number from dummy group by number;
 HashAggregate  (cost=22.50..22.50 rows=200 width=4) (actual  
time=1875.214..1875.459 rows=255 loops=1)
   ->  Seq Scan on dummy  (cost=0.00..20.00 rows=1000 width=4) (actual  
time=0.107..1021.014 rows=1000000 loops=1)
 Total runtime: 1875.646 ms

=>	A lot faster because it HashAggregates instead of sorting (but still  
seq scans).
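	The difference between the two plans can be sketched outside the database.  
Below is an illustrative Python model of the two de-duplication strategies  
(not PostgreSQL executor code, just the idea) :

```python
# Illustrative model of the two plans above -- NOT PostgreSQL internals.
# Sort+Unique sorts every row before dropping duplicates (O(n log n));
# HashAggregate makes one pass, remembering values seen so far (O(n)).
import random

def distinct_via_sort_unique(values):
    # Sort first, then drop adjacent duplicates, like the Unique node.
    out = []
    for v in sorted(values):
        if not out or out[-1] != v:
            out.append(v)
    return out

def distinct_via_hash_aggregate(values):
    # Single pass with a hash table of seen values, like HashAggregate.
    seen = set()
    out = []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out

rows = [random.randrange(255) for _ in range(100_000)]  # few distinct values
assert distinct_via_sort_unique(rows) == sorted(distinct_via_hash_aggregate(rows))
```

	The hash version never pays the n log n sort on the full row count, which  
is exactly why the GROUP BY plan wins when there are few distinct values.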


Now :
create index dummy_idx on dummy(number);
Let's try again.

explain analyze select distinct number from dummy;
 Unique  (cost=0.00..35301.00 rows=200 width=4) (actual  
time=0.165..21781.732 rows=255 loops=1)
   ->  Index Scan using dummy_idx on dummy  (cost=0.00..32801.00  
rows=1000000 width=4) (actual time=0.162..21154.752 rows=1000000 loops=1)
 Total runtime: 21782.270 ms

=> Index-scans the whole table. Argh. I should have ANALYZEd.

explain analyze select number from dummy group by number;
 HashAggregate  (cost=17402.00..17402.00 rows=200 width=4) (actual  
time=1788.425..1788.668 rows=255 loops=1)
   ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)  
(actual time=0.048..960.063 rows=1000000 loops=1)
 Total runtime: 1788.855 ms
=>	Still the same...

Let's make a function :
The function starts at the lowest number and advances to the next number  
in the index until they are all exhausted.

CREATE OR REPLACE FUNCTION sel_distinct()
	RETURNS SETOF INTEGER
	LANGUAGE plpgsql
	AS '
DECLARE
	pos INTEGER;
BEGIN
	SELECT INTO pos number FROM dummy ORDER BY number ASC LIMIT 1;
	IF NOT FOUND THEN
		RAISE NOTICE ''no records.'';
		RETURN;
	END IF;
	
	LOOP
		RETURN NEXT pos;
		SELECT INTO pos number FROM dummy WHERE number > pos ORDER BY number ASC  
LIMIT 1;
		IF NOT FOUND THEN
			RETURN;
		END IF;
	END LOOP;
END;
';

explain analyze select * from sel_distinct();
 Function Scan on sel_distinct  (cost=0.00..12.50 rows=1000 width=4)  
(actual time=215.472..215.696 rows=255 loops=1)
 Total runtime: 215.839 ms

That's better !

Why not use DESC instead of ASC ?
CREATE OR REPLACE FUNCTION sel_distinct()
	RETURNS SETOF INTEGER
	LANGUAGE plpgsql
	AS '
DECLARE
	pos INTEGER;
BEGIN
	SELECT INTO pos number FROM dummy ORDER BY number DESC LIMIT 1;
	IF NOT FOUND THEN
		RAISE NOTICE ''no records.'';
		RETURN;
	END IF;
	
	LOOP
		RETURN NEXT pos;
		SELECT INTO pos number FROM dummy WHERE number < pos ORDER BY number DESC  
LIMIT 1;
		IF NOT FOUND THEN
			RETURN;
		END IF;
	END LOOP;
END;
';

explain analyze select * from sel_distinct();
 Function Scan on sel_distinct  (cost=0.00..12.50 rows=1000 width=4)  
(actual time=13.500..13.713 rows=255 loops=1)
 Total runtime: 13.857 ms

	Hum hum ! Again, a lot better !
	Index scan backwards seems a lot faster than index scan forwards. Why, I  
don't know, but here you go from 15 seconds to 14 milliseconds...
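	For what it's worth, the strategy the plpgsql function implements -- jump  
to the smallest key, then repeatedly seek the first key greater than the  
current one -- can be modeled in Python, with a sorted list standing in for  
the btree index (an illustrative sketch, not PostgreSQL code) :

```python
# Sketch of the "skip index scan" the function emulates. The sorted list
# plays the role of the btree index; bisect_right plays the role of
# 'SELECT number FROM dummy WHERE number > pos ORDER BY number LIMIT 1'.
import bisect

def skip_scan_distinct(index):
    """Return distinct keys from a sorted list, skipping over duplicates."""
    out = []
    i = 0
    while i < len(index):
        out.append(index[i])
        # jump directly past every duplicate of the current key
        i = bisect.bisect_right(index, index[i])
    return out

# one probe per distinct value, no matter how many rows share that value
assert skip_scan_distinct([0, 0, 0, 1, 1, 2, 5, 5]) == [0, 1, 2, 5]
```

	The work done is proportional to the number of distinct values, not the  
number of rows, which is why 255 probes beat scanning 1M entries.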

	I don't know WHY (oh why) postgres does not use this kind of strategy  
when distinct'ing an indexed field... Anybody got an idea ?


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Excessive context switching on SMP Xeons

2004-10-06 Thread Alan Stange
Greg Stark wrote:
Alan Stange [EMAIL PROTECTED] writes:
 

A few quick random observations on the Xeon v. Opteron comparison:
- running a dual Xeon with hyperthreading turned on really isn't the same as
having a quad cpu system. I haven't seen postgresql-specific benchmarks, but
the general case has been that HT is a benefit in a few particular workloads
but with no benefit in general.
   

Part of the FUD with hyperthreading did have a kernel of truth that lay in
older kernels' schedulers. For example, with Linux until recently the kernel
could easily end up scheduling two processes on the two virtual processors of
one single physical processor, leaving the other physical processor totally
idle.
With modern kernels' schedulers I would expect hyperthreading to live up to
its billing of adding 10% to 20% performance. I.e., a dual Xeon machine with
hyperthreading won't be as fast as four processors, but it should be 10-20%
faster than a dual Xeon without hyperthreading.
As with all things, that will only help if you're bound by the right limited
resource to begin with. If you're I/O bound it isn't going to help. I would
expect Postgres, with its heavy demand on memory bandwidth and shared memory,
could potentially benefit more than usual from being able to context switch
during pipeline stalls.
 

All true.   I'd be surprised if HT on an older 2.8 GHz Xeon with only a 
512K cache sees any real benefit.   The dual Xeon is already memory 
starved; further increase the memory pressure on the caches (because 
the 512K is now shared by two virtual processors) and you probably 
won't see a gain.  It's memory stalls all around.  To be clear, the 
context switch in this case isn't a kernel context switch but a virtual 
cpu context switch.

The probable reason we see dual Opteron boxes way outperforming dual 
Xeon boxes is exactly because of Postgresql's heavy demand on memory.  
The Opterons have a much better memory system.

A quick search on google or digging around in the comp.arch archives 
will provide lots of details.  HP's web site has (had?) some 
benchmarks comparing these systems.  HP sells both Xeon and Opteron 
systems, so the comparisons were quite fair.  Their numbers showed the 
Opteron handily outperforming the Xeons.

-- Alan


Re: [PERFORM] stats on cursor and query execution troubleshooting

2004-10-06 Thread Tom Lane
"Alban Médici (NetCentrex)" [EMAIL PROTECTED] writes:
 I'm looking for the statistics on memory, CPU, and filesystem access while
 executing some regular SQL query, and I want to compare them to the
 same kind of results while executing a cursor function.

I think your second query is finding all the disk pages it needs in
kernel disk cache, because they were all read in by the first query.
This has little to do with cursor versus non cursor, and everything
to do with hitting recently-read data again.

regards, tom lane



Re: [PERFORM] Excessive context switching on SMP Xeons

2004-10-06 Thread Bill Montgomery
Josh Berkus wrote:
I'd be thrilled to test it too, if for no other reason that to determine
whether what I'm experiencing really is the CS problem.
   

Hmmm ... Gavin's patch is built against 8.0, and any version of the patch 
would require linux 2.6, probably 2.6.7 minimum.   Can you test on that linux 
version?   Do you have the resources to back-port Gavin's patch?   
 

I don't currently have any SMP Xeon systems running a 2.6 kernel, but it 
could be arranged. As for back-porting the patch to 7.4.5, probably so, 
but I'd have to see it first.

tps = 369.717832 (including connections establishing)
tps = 370.852058 (excluding connections establishing)
   

Doesn't seem too bad to me.   Have anything to compare it to?
 

Yes, about 280 tps on the same machine with the data directory on a 
3-disk RAID 5 w/ a 128MB cache, rather than the SSD. I was expecting a 
much larger increase, given that the RAID does about 3MB/s of random 8k 
writes, and the SSD device does about 70MB/s of random 8k writes. Said 
differently, I thought my CPU bottleneck would be much higher, as to 
allow for more than a 30% increase in pgbench TPS when I took the IO 
bottleneck out of the equation. (That said, I'm not tuning for pgbench, 
but it is a useful comparison that everyone on the list is familiar 
with, and takes out the possibility that my app just has a bunch of 
poorly written queries).

What's in your postgresql.conf?
 

Some relevant parameters:
shared_buffers = 16384
sort_mem = 2048
vacuum_mem = 16384
max_fsm_pages = 20
max_fsm_relations = 1
fsync = true
wal_sync_method = fsync
wal_buffers = 32
checkpoint_segments = 6
effective_cache_size = 262144
random_page_cost = 0.25
Everything else is left at the default (or not relevant to this post). 
Anything blatantly stupid in there for my setup?

Thanks,
Bill Montgomery


Re: [PERFORM] sequential scan on select distinct

2004-10-06 Thread Ole Langbehn
On Wednesday, 6 October 2004 12:19, Pierre-Frédéric Caillaud wrote:
  You could try :

  explain analyze select land from customer_dim group by land;
  It will be a lot faster but I can't make it use the index on my machine...
this already speeds up my queries to about 1/4th of the time, which is about 
the range of mysql and oracle.

  Example :

 [..]

  Hum hum ! Again, a lot better !
  Index scan backwards seems a lot faster than index scan forwards. Why, I
 don't know, but here you go from 15 seconds to 14 milliseconds...
thanks for this very extensive answer, it helped me a lot.

  I don't know WHY (oh why) postgres does not use this kind of strategy
 when distinct'ing an indexed field... Anybody got an idea ?
That's the big question I still would like to see answered too. Can anyone 
tell us?

TiA
-- 
Ole Langbehn



Re: [PERFORM] sequential scan on select distinct

2004-10-06 Thread Greg Stark

Pierre-Frédéric Caillaud [EMAIL PROTECTED] writes:

   I don't know WHY (oh why) postgres does not use this kind of strategy
 when distinct'ing an indexed field... Anybody got an idea ?

Well, there are two questions here. Why, given the current plans available, does
postgres choose a sequential scan instead of an index scan? And why isn't
this kind of skip index scan available?

Postgres chooses a sequential scan with a sort (or hash aggregate) over an
index scan because it expects it to be faster. sequential scans are much
faster than random access scans of indexes, plus index scans need to read many
more blocks. If you're finding the index scan to be just as fast as sequential
scans you might consider lowering random_page_cost closer to 1.0. But note
that you may be getting fooled by a testing methodology where more things are
cached than would be in production.

why isn't a skip index scan plan available? Well, nobody's written the code
yet. It would be part of the same code needed to get an index scan used for:

select y,min(x) from bar group by y

And possibly also related to the TODO item:

Use index to restrict rows returned by multi-key index when used with
non-consecutive keys to reduce heap accesses

For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and col3 =
9, spin though the index checking for col1 and col3 matches, rather than
just col1


Note that the optimizer would have to make a judgement call based on the
expected number of distinct values. If you had much more than 256 distinct
values then your plpgsql function wouldn't have performed well at all.

-- 
greg




Re: [PERFORM] Excessive context switching on SMP Xeons

2004-10-06 Thread SZUCS Gábor
Hmmm...

I may be mistaken (I think last time I read about optimization params was in
the 7.3 docs), but doesn't RPC < 1 mean that random reads are faster than
sequential reads? In your case, do you really think reading randomly is 4x
faster than reading sequentially? Doesn't seem to make sense, even with a
zillion-disk array. Theoretically.

Also not sure, but sort_mem and vacuum_mem seem to be too small to me.

G.
%--- cut here ---%
\end

- Original Message - 
From: Bill Montgomery [EMAIL PROTECTED]
Sent: Wednesday, October 06, 2004 5:45 PM


 Some relevant parameters:
 shared_buffers = 16384
 sort_mem = 2048
 vacuum_mem = 16384
 max_fsm_pages = 20
 max_fsm_relations = 1
 fsync = true
 wal_sync_method = fsync
 wal_buffers = 32
 checkpoint_segments = 6
 effective_cache_size = 262144
 random_page_cost = 0.25




Re: [PERFORM] sequential scan on select distinct

2004-10-06 Thread Pierre-Frédéric Caillaud
There are even three questions here :
	- given that 'SELECT DISTINCT field FROM table' is exactly
the same as 'SELECT field FROM table GROUP BY field', postgres could
transform the first into the second and avoid itself a (potentially
killer) sort.
	On my example the table was not too large, but on a very large table,  
sorting all the values and then distinct'ing them does not look too  
appealing.

	Currently Postgres does Sort+Unique, but there could be a DistinctSort  
instead of a Sort, that is, a thing that sorts and removes the duplicates  
at the same time. Not much more complicated to code than a sort, and much  
faster in this case.
	Or there could be a DistinctHash, which would be similar or rather  
identical to a HashAggregate and would again skip the sort.
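	To make the DistinctSort idea concrete, here is a toy Python sketch  
(purely hypothetical -- no such executor node exists) : a merge sort whose  
merge step drops duplicates as it goes, so duplicates disappear early  
instead of surviving until a final Unique pass.

```python
# Hypothetical "DistinctSort": a merge sort whose merge step discards
# duplicates, so every intermediate result is sorted AND unique, and
# duplicates are eliminated early instead of carried through the sort.
def merge_unique(a, b):
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            v, i = a[i], i + 1
        else:
            v, j = b[j], j + 1
        if not out or out[-1] != v:   # drop duplicates during the merge
            out.append(v)
    for v in a[i:] + b[j:]:           # drain whichever side remains
        if not out or out[-1] != v:
            out.append(v)
    return out

def distinct_sort(values):
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    return merge_unique(distinct_sort(values[:mid]), distinct_sort(values[mid:]))

assert distinct_sort([3, 1, 3, 2, 1, 1]) == [1, 2, 3]
```

	With few distinct values the intermediate runs shrink almost immediately,  
which is where the win over Sort followed by Unique would come from.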

	It would (as a bonus) speed up queries like UNION (not ALL), that kind of  
things. For example :

 explain (select number from dummy) union (select number from dummy);
 Unique  (cost=287087.62..297087.62 rows=200 width=4)
   ->  Sort  (cost=287087.62..292087.62 rows=2000000 width=4)
         Sort Key: number
         ->  Append  (cost=0.00..49804.00 rows=2000000 width=4)
               ->  Subquery Scan *SELECT* 1  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)
               ->  Subquery Scan *SELECT* 2  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)

This is scary !
I can rewrite it as such (and the planner could, too) :
explain select * from ((select number from dummy) union all (select number  
from dummy)) as foo group by number;
 HashAggregate  (cost=74804.00..74804.00 rows=200 width=4)
   ->  Subquery Scan foo  (cost=0.00..69804.00 rows=2000000 width=4)
         ->  Append  (cost=0.00..49804.00 rows=2000000 width=4)
               ->  Subquery Scan *SELECT* 1  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)
               ->  Subquery Scan *SELECT* 2  (cost=0.00..24902.00 rows=1000000 width=4)
                     ->  Seq Scan on dummy  (cost=0.00..14902.00 rows=1000000 width=4)

which avoids a large sort...
However there must be cases in which performing a sort is faster, like  
when there are a lot of distinct values and the HashAggregate becomes huge  
too.

Well there are two questions here. Why given the current plans available  
does
postgres choose a sequential scan instead of an index scan. And why isn't
	Well, because it needs to get all the rows in the table in order.
	In this case seq scan+sort is about twice as fast as index scan.
	Interestingly, once I ANALYZEd the table, postgres chooses to  
index-scan, which is slower.

there this kind of skip index scan available.
It would be really nice to have a skip index scan available.
	I have an other idea, lets call it the indexed sequential scan :
	When pg knows there are a lot of rows to access, it will ignore the index  
and seqscan. This is because index access is very random, thus slow.  
However postgres could implement an indexed sequential scan where :
	- the page numbers for the matching rows are looked up in the index
	(this is fast as an index has good locality)
	- the page numbers are grouped so we have a list of pages with one and  
only one instance of each page number
	- the list is then sorted so we have page numbers in-order
	- the pages are loaded in sorted order (doing a kind of partial  
sequential scan) which would be faster than reading them randomly.
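	The page-gathering step can be sketched in Python (illustrative only,  
hypothetical names) : collect the page number of every matching index  
entry, de-duplicate, and sort, so the heap can then be read in physical  
order.

```python
# Sketch of the proposed "indexed sequential scan": instead of fetching
# heap pages in whatever order the index yields them, gather the page
# numbers, de-duplicate them, and sort, then read the pages sequentially.
def pages_in_read_order(index_entries):
    """index_entries: (key, heap_page) pairs matching the query."""
    return sorted({page for _key, page in index_entries})

# four index hits touching three pages, returned in on-disk order
entries = [(7, 912), (7, 14), (7, 912), (7, 330)]
assert pages_in_read_order(entries) == [14, 330, 912]
```

	(Readers of later PostgreSQL releases will recognize this as roughly what  
bitmap index scans do.)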

Other ideas later

Postgres chooses a sequential scan with a sort (or hash aggregate) over an
index scan because it expects it to be faster. sequential scans are much
faster than random access scans of indexes, plus index scans need to read many
more blocks. If you're finding the index scan to be just as fast as sequential
scans you might consider lowering random_page_cost closer to 1.0. But note
that you may be getting fooled by a testing methodology where more things are
cached than would be in production.

why isn't a skip index scan plan available? Well, nobody's written the code
yet. It would be part of the same code needed to get an index scan used for:

select y,min(x) from bar group by y
And possibly also related to the TODO item:
Use index to restrict rows returned by multi-key index when used with
non-consecutive keys to reduce heap accesses
For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and col3 =
9, spin through the index checking for col1 and col3 matches, rather than
just col1

Note that the optimizer would have to make a judgement call based on the
expected number of distinct values. If you had much more than 256 distinct
values then your plpgsql function wouldn't have performed well at all.



Re: [PERFORM] Comparing user attributes with bitwise operators

2004-10-06 Thread Patrick Clery
Another problem I should note is that when I first insert all the data into 
the people_attributes table (the int[] table), the GiST index is not used:

THE INDEX:
people_attributes_search gist ((ARRAY[age, gender, orientation, children, 
drinking, education, ethnicity, eyecolor, haircolor, hairstyle, height, 
income, occupation, relation, religion, smoking, want_children, weight] 
+ seeking + languages))

PART OF THE QUERY PLAN:
Seq Scan on people_attributes pa  (cost=0.00..0.00 rows=1 width=20)
 Filter: (((ARRAY[age, gender, orientation, children, 
drinking, education, ethnicity, eyecolor, haircolor, hairstyle, height, 
income, occupation, relation, religion, smoking, want_children, weight] + 
seeking) + languages) @@ '( ( 4 | 5 ) | 6 ) & 88 & 48 & ( 69 | 70 ) & 92 & 
( ( ( ( ( ( ( ( ( ( ( ( ( 95 | 96 ) | 97 ) | 98 ) | 99 ) | 100 ) | 101 ) | 
102 ) | 103 ) | 104 ) | 105 ) | 106 ) | 107 ) | 108 ) & 
( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( 
( ( ( ( ( ( ( ( ( ( 190 
| 191 ) | 192 ) | 193 ) | 194 ) | 195 ) | 196 ) | 197 ) | 198 ) | 199 ) | 
200 ) | 201 ) | 202 ) | 203 ) | 204 ) | 205 ) | 206 ) | 207 ) | 208 ) | 209 ) 
| 210 ) | 211 ) | 212 ) | 213 ) | 214 ) | 215 ) | 216 ) | 217 ) | 218 ) | 
219 ) | 220 ) | 221 ) | 222 ) | 223 ) | 224 ) | 225 ) | 226 ) | 227 ) | 228 ) 
| 229 ) | 230 ) | 231 ) | 232 ) | 233 ) | 234 ) | 235 ) | 236 ) | 237 ) | 
238 ) | 239 ) | 240 ) | 241 ) | 242 ) | 243 )'::query_int)


So I run VACUUM ANALYZE people_attributes, then run again:

PART OF THE QUERY PLAN:
Index Scan using people_attributes_pkey on people_attributes pa  
(cost=0.00..5.32 rows=1 width=20)
 Index Cond: (pa.person_id = outer.person_id)
 Filter: (((ARRAY[age, gender, orientation, children, drinking, 
education, ethnicity, eyecolor, haircolor, hairstyle, height, income, 
occupation, relation, religion, smoking, want_children, weight] + seeking) + 
languages) @@ '( ( 4 | 5 ) | 6 ) & 88 & 48 & ( 69 | 70 ) & 92 & 
( ( ( ( ( ( ( ( ( ( ( ( ( 95 | 96 ) | 97 ) | 98 ) | 99 ) | 100 ) | 101 ) | 
102 ) | 103 ) | 104 ) | 105 ) | 106 ) | 107 ) | 108 ) & 
( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( 
( ( ( ( ( ( ( ( ( ( 190 
| 191 ) | 192 ) | 193 ) | 194 ) | 195 ) | 196 ) | 197 ) | 198 ) | 199 ) | 
200 ) | 201 ) | 202 ) | 203 ) | 204 ) | 205 ) | 206 ) | 207 ) | 208 ) | 209 ) 
| 210 ) | 211 ) | 212 ) | 213 ) | 214 ) | 215 ) | 216 ) | 217 ) | 218 ) | 
219 ) | 220 ) | 221 ) | 222 ) | 223 ) | 224 ) | 225 ) | 226 ) | 227 ) | 228 ) 
| 229 ) | 230 ) | 231 ) | 232 ) | 233 ) | 234 ) | 235 ) | 236 ) | 237 ) | 
238 ) | 239 ) | 240 ) | 241 ) | 242 ) | 243 )'::query_int)

Still not using the index. I'm trying to DROP INDEX and recreate it, but the 
query just stalls. I remember last time this situation happened that I just 
dropped and recreated the index, and voila it was using the index again. Now 
I can't seem to get this index to drop. Here's the table structure:


    Column     |   Type    |     Modifiers      
---------------+-----------+--------------------
 person_id | integer   | not null
 askmecount| integer   | not null default 0
 age   | integer   | not null
 gender| integer   | not null
 bodytype  | integer   | not null
 children  | integer   | not null
 drinking  | integer   | not null
 education | integer   | not null
 ethnicity | integer   | not null
 eyecolor  | integer   | not null
 haircolor | integer   | not null
 hairstyle | integer   | not null
 height| integer   | not null
 income| integer   | not null
 languages | integer[] | not null
 occupation| integer   | not null
 orientation   | integer   | not null
 relation  | integer   | not null
 religion  | integer   | not null
 smoking   | integer   | not null
 want_children | integer   | not null
 weight| integer   | not null
 seeking   | integer[] | not null
Indexes:
people_attributes_pkey PRIMARY KEY, btree (person_id)
people_attributes_search gist ((ARRAY[age, gender, orientation, 
children, drinking, education, 
ethnicity, eyecolor, haircolor, hairstyle, height, income, occupation, 
relation, religion, smoking, want_children, weight] + seeking + languages))
Foreign-key constraints:
    people_attributes_weight_fkey FOREIGN KEY (weight) REFERENCES 
attribute_values(value_id) ON DELETE RESTRICT
    people_attributes_person_id_fkey FOREIGN KEY (person_id) REFERENCES 
people(person_id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
    people_attributes_age_fkey FOREIGN KEY (age) REFERENCES 
attribute_values(value_id) ON DELETE RESTRICT
    people_attributes_gender_fkey FOREIGN KEY (gender) REFERENCES 
attribute_values(value_id) ON DELETE RESTRICT
    people_attributes_bodytype_fkey FOREIGN KEY (bodytype) REFERENCES 
attribute_values(value_id) ON DELETE RESTRICT
    people_attributes_children_fkey FOREIGN KEY (children) REFERENCES 
people_attributes_children_fkey FOREIGN KEY (children) REFERENCES 

Re: [PERFORM] Comparing user attributes with bitwise operators

2004-10-06 Thread Patrick Clery
Err... I REINDEX'ed it and it is now using the index. :)

I'd still appreciate if anyone could tell me why this needs to be
reindexed. Is the index not updated when the records are inserted?

 On Wednesday 06 October 2004 12:55, I wrote:
  Another problem I should note is that when I first insert all the data
  into the people_attributes table (the int[] table), the GiST index is
  not used:
 
  [...]

Re: [PERFORM] Comparing user attributes with bitwise operators

2004-10-06 Thread Greg Stark

Patrick Clery [EMAIL PROTECTED] writes:

 PART OF THE QUERY PLAN:
 Index Scan using people_attributes_pkey on people_attributes pa (cost=0.00..5.32 
 rows=1 width=20)
  Index Cond: (pa.person_id = outer.person_id)
  Filter: (((ARRAY[age, gender, orientation, children, drinking, 

You'll probably have to show the rest of the plan for anyone to have much idea
what's going on. It seems to be part of a join of some sort and the planner is
choosing to drive the join from the wrong table. This may make it awkward to
force the right plan using enable_seqscan or anything like that. But GiST
indexes don't have very good selectivity estimates so I'm not sure you can
hope for the optimizer to guess right on its own.

 Is it all the foreign keys that are stalling the drop? I have done VACUUM 
 ANALYZE on the entire db. Could anyone offer some insight as to why this 
 index is not being used or why the index is not dropping easily?

I don't think foreign keys cause problems dropping indexes. Foreign key
constraints are just checked whenever there's an insert/update/delete. Perhaps
you're just underestimating the size of this index and the amount of time
it'll take to delete it? Or are there queries actively executing using the
index while you're trying to delete it? Or a vacuum running?

-- 
greg




Re: [PERFORM] sequential scan on select distinct

2004-10-06 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 why isn't a skip index scan plan available? Well, nobody's written the code
 yet.

I don't really think it would be a useful plan anyway.  What *would* be
useful is to support HashAggregate as an implementation alternative for
DISTINCT --- currently I believe we only consider that for GROUP BY.
The DISTINCT planning code is fairly old and crufty and hasn't been
redesigned lately.

regards, tom lane



Re: [PERFORM] sequential scan on select distinct

2004-10-06 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 Greg Stark [EMAIL PROTECTED] writes:
  why isn't a skip index scan plan available? Well, nobody's written the code
  yet.
 
 I don't really think it would be a useful plan anyway.  

Well it would clearly be useful in this test case, which has a small number of
distinct values in a large table, and an index on the column. His plpgsql
function that emulates such a plan is an order of magnitude faster than the
hash aggregate plan even though it has to do entirely separate index scans for
each key value.

I'm not sure where the break-even point would be, but it would probably be
pretty low. Probably somewhere around the order of 1% distinct values in the
table. That might be uncommon, but certainly not impossible.

But regardless of how uncommon it is, it could be considered important in
another sense: when you need it there really isn't any alternative. It's an
algorithmic improvement with no bound on the performance difference. Nothing
short of using a manually maintained materialized view would bring the
performance into the same ballpark.

So even if it's only useful occasionally, not having the plan available can
leave postgres with no effective plan for what should be an easy query.


-- 
greg




[PERFORM] The never ending quest for clarity on shared_buffers

2004-10-06 Thread Doug Y
Hello,
  We recently upgraded the OS from RH 7.2 (2.4 kernel) to SuSE 9.1 (2.6 
kernel), and PostgreSQL from 7.3.4 to 7.4.2

  One of the quirks I've noticed is how the queries don't always have the 
same explain plans on the new psql... but that's a different email I think.

  My main question is I'm trying to convince the powers that be to let me 
use persistent DB connections (from apache 2 / php), and my research has 
yielded conflicting documentation about the shared_buffers setting... real 
shocker there :)

  For idle persistent connections, does each of them allocate the memory 
specified by this setting (shared_buffers * 8k), or is it one pool used by 
all the connections (which seems the logical conclusion based on the name 
SHARED_buffers)? Personally I'm more inclined to think the latter choice, 
but I've seen references that alluded to both cases, but never a definitive 
answer.

  For what it's worth, shared_buffers is currently set to 50000 (on a 4G 
system). Also, effective_cache_size is 125000. max_connections is 256, so I 
don't want to end up with a possible 100G (50k * 8k * 256) of memory tied 
up... not that it would be possible, but you never know.
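To make the arithmetic concrete, here is a quick back-of-the-envelope check
(plain Python, not from the original mail; the settings are the ones quoted
above). The shared pool is counted once no matter how many backends attach;
only a mistaken per-connection reading yields the feared 100G figure:

```python
BLOCK_SIZE = 8192          # PostgreSQL page size in bytes

shared_buffers = 50000     # pages -- one pool shared by ALL backends
max_connections = 256

# Correct reading: shared_buffers is allocated once, in shared memory.
shared_pool_gb = shared_buffers * BLOCK_SIZE / 2**30
print("shared pool: %.2f GB" % shared_pool_gb)        # ~0.38 GB total

# Mistaken per-connection reading: multiply by max_connections.
per_conn_gb = shared_buffers * BLOCK_SIZE * max_connections / 2**30
print("if per-connection: %.0f GB" % per_conn_gb)     # ~98 GB, the "100G" fear
```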

  I typically never see more than a dozen or so concurrent connections to 
the db (serving 3 web servers), so I'm thinking of actually using something 
like pgpool to keep about 10 per web server, rather than use traditional 
persistent connections of 1 per Apache child, which would probably average 
about 50 per web server.

Thanks.


Re: [PERFORM] sequential scan on select distinct

2004-10-06 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes:
> But regardless of how uncommon it is, it could be considered important in
> another sense: when you need it there really isn't any alternative. It's an
> algorithmic improvement with no bound on the performance difference.

[ shrug... ]  There are an infinite number of special cases for which
that claim could be made.  The more we load down the planner with
seldom-useful special cases, the *lower* the overall performance will
be, because we'll waste cycles checking for the special cases in every
case ...

In this particular case, it's not merely a matter of the planner, either.
You'd need some new type of plan node in the executor, so there's a
pretty fair amount of added code bulk that will have to be written and
then maintained.

I'm open to being persuaded that this is worth doing, but the bar is
going to be high; I think there are a lot of other more-profitable ways
to invest our coding effort and planning cycles.

regards, tom lane



[PERFORM] Data warehousing requirements

2004-10-06 Thread Gabriele Bartolini
Hi guys,
   I just discussed my problem on IRC. I am building a Web usage 
mining system based on Linux, PostgreSQL and C++, made up of an OLTP 
database which feeds several multi-purpose data warehouses about users' 
behaviour on HTTP servers.

   I modelled every warehouse using the star schema, with a fact table and 
then 'n' dimension tables linked using a surrogate ID.

   Discussing with the guys on the chat, I came to these conclusions 
regarding the warehouse's performance:

1) don't use referential integrity in the facts table
2) use INTEGER and avoid SMALLINT and NUMERIC types for dimensions' IDs
3) use an index for every dimension's ID in the fact table
   As far as administration is concerned: run VACUUM ANALYSE daily and 
VACUUM FULL periodically.
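A minimal DDL sketch of points 1-3 (illustrative only; the table and column
names are invented, not Gabriele's actual schema):

```sql
CREATE TABLE dim_page (
    page_id integer PRIMARY KEY,  -- 2) INTEGER surrogate key, not SMALLINT/NUMERIC
    url     text
);

CREATE TABLE fact_hits (
    page_id  integer,             -- 1) no REFERENCES clause: skip RI on the fact table
    hit_date date,
    hits     integer
);

-- 3) one index per dimension ID in the fact table
CREATE INDEX fact_hits_page_idx ON fact_hits (page_id);
```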

   Is there anything else I should keep in mind?
   Also, I was looking for advice regarding hardware requirements for a 
data warehouse system that needs to satisfy online queries. I have indeed 
no idea at the moment. I can only predict about 4 million records a month 
in the fact table; does that make sense, or is it too much?

   Data needs to be easily backed up and possibly replicated.
   With this in mind, what hardware architecture should I look for? How 
many hard disks do I need, of what kind, and which RAID level would you 
suggest I adopt (5 or 10, I think)?

Thank you so much,
-Gabriele
--
Gabriele Bartolini: Web Programmer, ht://Dig  IWA/HWG Member, ht://Check 
maintainer
Current Location: Prato, Toscana, Italia
[EMAIL PROTECTED] | http://www.prato.linux.it/~gbartolini | ICQ#129221447
"Leave every hope, ye who enter!", Dante Alighieri, Divine Comedy, The 
Inferno



Re: [PERFORM] The never ending quest for clarity on shared_buffers

2004-10-06 Thread Paul Ramsey
Doug Y wrote:
>   For idle persistent connections, does each of them allocate the memory
> specified by this setting (shared_buffers * 8k), or is it one pool used
> by all the connections (which seems the logical conclusion based on the
> name SHARED_buffers)? Personally I'm more inclined toward the latter,
> but I've seen references alluding to both cases, and never a
> definitive answer.
The shared_buffers are shared (go figure) :).  It is all one pool shared 
by all connections.  The sort_mem and vacuum_mem are *per*connection* 
however, so when allocating that size you have to take into account your 
expected number of concurrent connections.

Paul


Re: [PERFORM] Excessive context switching on SMP Xeons

2004-10-06 Thread Gaetano Mendola
Alan Stange wrote:
> A few quick random observations on the Xeon v. Opteron comparison:
> [SNIP]
> I don't care to go into the whole debate of Xeon v. Opteron here.   We
> also have a lot of dual Xeon systems. In every comparison I've done with
> our codes, the dual Opteron clearly outperforms the dual Xeon, when
> running on one and both cpus.
Here http://www6.tomshardware.com/cpu/20030422/ both were tested and there is
a database performance section; unfortunately they used MySQL.
Regards
Gaetano Mendola



Re: [PERFORM] Caching of Queries

2004-10-06 Thread Tatsuo Ishii
> > I don't know what you are exactly referring to in the above URL
> > when you are talking about potential pitfalls of pooling.
> > Please explain more.
>
> Sorry, I wasn't implying that pgpool doesn't deal with the issues, just that
> some people aren't necessarily aware of them up front.  For instance, pgpool
> does an 'abort transaction' and a 'reset all' in lieu of a full reconnect
> (of course, since a full reconnect is exactly what we are trying to avoid).
> Is this enough to guarantee that a given pooled connection behaves
> exactly as a non-pooled connection would from a client perspective?  For
> instance, temporary tables are usually dropped at the end of a session, so a
> client (badly coded perhaps) that does not already use persistent
> connections might be confused when the sequence 'connect, create temp table
> foo ..., disconnect, connect, create temp table foo ...' results in the
> error 'Relation 'foo' already exists'.

First, it's not a problem particular to pgpool. As far as I know, any
connection pooling solution has exactly the same problem. Second, it's
easy to fix if PostgreSQL provides functionality such as: drop all
temporary tables, if any. I think we should implement it if we agree
that connection pooling should be implemented outside the PostgreSQL
engine itself. I think the core team agrees with this.
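One partial workaround that already exists in 7.3+ (a sketch, not something
proposed in the thread) is to scope the temporary table to the transaction,
so a pooled connection never hands it to the next client:

```sql
BEGIN;
-- dropped automatically at COMMIT/ROLLBACK, so the reused pooled
-- connection starts clean for the next client
CREATE TEMP TABLE foo (id integer) ON COMMIT DROP;
-- ... work with foo ...
COMMIT;
```

This only helps clients that wrap their temp-table work in a transaction, of
course, which is why a server-side "drop all temporary tables" would still be
useful for the badly-coded client described above.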
--
Tatsuo Ishii



Re: [PERFORM] The never ending quest for clarity on shared_buffers

2004-10-06 Thread Neil Conway
On Thu, 2004-10-07 at 08:26, Paul Ramsey wrote:
> The shared_buffers are shared (go figure) :).  It is all one pool shared
> by all connections.

Yeah, I thought this was pretty clear. Doug, can you elaborate on where
you saw the misleading docs?

> The sort_mem and vacuum_mem are *per*connection* however, so when
> allocating that size you have to take into account your
> expected number of concurrent connections.

Allocations of size `sort_mem' can actually happen several times
within a *single* connection (if the query plan happens to involve
a number of sort steps or hash tables) -- the limit is on the amount of
memory used by a single sort/hash table. So choosing the
right figure is actually a little more complex than that.
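Neil's point can be turned into a rough worst-case bound (an illustrative
calculation, not from the thread; the per-sort limit and counts are made-up
numbers): the budget is connections times sort-steps-per-plan times sort_mem,
not just connections times sort_mem:

```python
sort_mem_kb = 1024       # hypothetical sort_mem setting (1 MB per sort/hash)
connections = 50         # expected concurrent connections
steps_per_plan = 3       # one plan may run several sorts/hash tables at once

# Naive per-connection estimate vs. the worst case Neil describes.
naive_kb = connections * sort_mem_kb
worst_kb = connections * steps_per_plan * sort_mem_kb

print(naive_kb // 1024, "MB")   # 50 MB -- one sort per connection
print(worst_kb // 1024, "MB")   # 150 MB -- several sorts per plan
```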

-Neil





Re: [PERFORM] Excessive context switching on SMP Xeons

2004-10-06 Thread Alan Stange
Here's a few numbers from the Opteron 250.  If I get some time I'll post 
a more comprehensive comparison including some other systems.

The system is a Sun v20z: dual Opteron 250, 2.4GHz, Linux 2.6, 8 GB 
memory.   I did a compile and install of pg 8.0 beta 3, created a 
database on a tmpfs file system, and ran pgbench. Everything was out 
of the box, meaning I did not tweak any config files.

I used this for pgbench:
$ pgbench -i -s 32
and this for the pgbench invocations (with -c varied per the client counts below):
$ pgbench -s 32 -c 1 -t 1 -v
clients   tps
      1  1290
      2  1780
      4  1760
      8  1680
     16  1376
     32   904

How are these results useful?  In some sense, this is a speed of light 
number for the Opteron 250.   You'll never go faster on this system with 
a real storage subsystem involved instead of a tmpfs file system.   It's 
also a set of numbers that anyone else can reproduce as we don't have to 
deal with any differences in file systems, disk subsystems, networking, 
etc.   Finally, it's a set of results that anyone else can compute on 
Xeons or other systems and make simple (and naive) comparisons.

Just to stay on topic:   vmstat reported about 30K cs/second while 
running the 1- and 2-client cases.

-- Alan