Re: [GENERAL] Slow select

2009-12-17 Thread Sam Mason
On Wed, Dec 16, 2009 at 05:18:12PM -0800, yuliada wrote:
> Sam Mason wrote:
> > How about combining all 1000 selects into one?
> 
> I can't combine these selects into one; I need to run them one after
> another.

Hum, difficult.  What other information is in the row that you need
back?  Can you turn the table structure around somehow, so that the
"value" is the primary key and hence only a single row needs to be found
each time?
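For example (an untested sketch; "item_id" is a guess at your payload
column, and array_agg needs 8.4):

  CREATE TABLE stringvalue_by_lower (
    value_lower text PRIMARY KEY,   -- lower(value), deduplicated
    item_ids    integer[]           -- whatever you need back from the row
  );

  INSERT INTO stringvalue_by_lower
  SELECT lower(value), array_agg(item_id)
  FROM bn_stringvalue
  GROUP BY lower(value);

Each lookup is then a single primary-key probe:

  SELECT item_ids FROM stringvalue_by_lower
  WHERE value_lower = lower('esr');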

Other than that, I think you just need faster disks.

> "Bitmap Heap Scan on bn_stringvalue v  (cost=228.40..8688.70 rows=2172 
> width=90) (actual time=1129.767..1781.403 rows=104 loops=1)"
> "  Recheck Cond: (lower((value)::text) = 'esr'::text)"
> "  ->  Bitmap Index Scan on idx_stringv  (cost=0.00..227.86 rows=2172 
> width=0) (actual time=1107.974..1107.974 rows=104 loops=1)"
> "Index Cond: (lower((value)::text) = 'esr'::text)"
> "Total runtime: 1781.566 ms"

It looks like it's doing reasonable things.  I assume you've got a
single disk servicing this: 1781ms / (104 rows * 2 seeks) ≈ 8.6ms
average seek time.

Clustering on "value" may help, but it's going to take a while.  Whether
it's worth it depends on how common this operation is compared to the
others.
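If you decide to try it, it's just (8.4 syntax; on 8.3 and earlier it's
"CLUSTER idx_stringv ON bn_stringvalue"):

  -- rewrites the whole table in index order; takes an exclusive lock
  CLUSTER bn_stringvalue USING idx_stringv;
  ANALYZE bn_stringvalue;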

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] Slow select

2009-12-16 Thread yuliada

If I search for something which is not in the db, like 'dfsgsdfgsdfgdsfg',
it always works fast. I suspect that the speed depends on the number of
rows returned, but I don't know exactly...




Re: [GENERAL] Slow select

2009-12-16 Thread yuliada


Sam Mason wrote:
> 
> Wouldn't this be "lower(value) = lower(?)" ?
> 

Yes, I use it as "lower(value) = lower(?)"; I typed an inaccurate example.


Sam Mason wrote:
> 
> So each query is taking approx 300ms?  How much data does each one
> return?
> 

No more than 1000 rows.


Sam Mason wrote:
> 
> How about combining all 1000 selects into one?
> 

I can't combine these selects into one; I need to run them one after
another.


Grzegorz Jaśkiewicz wrote:
> 
> show us explain select * 
> 

"Bitmap Heap Scan on bn_stringvalue v  (cost=228.40..8688.70 rows=2172
width=90) (actual time=1129.767..1781.403 rows=104 loops=1)"
"  Recheck Cond: (lower((value)::text) = 'esr'::text)"
"  ->  Bitmap Index Scan on idx_stringv  (cost=0.00..227.86 rows=2172
width=0) (actual time=1107.974..1107.974 rows=104 loops=1)"
"Index Cond: (lower((value)::text) = 'esr'::text)"
"Total runtime: 1781.566 ms"

Thanks




Re: [GENERAL] Slow select

2009-12-16 Thread Sam Mason
On Wed, Dec 16, 2009 at 04:56:16AM -0800, yuliada wrote:
> I have a table with a column of character varying(100).  There are about
> 150,000,000 rows in the table.  The index was created as
> 
> CREATE INDEX idx_stringv
>   ON bn_stringvalue
>   USING btree
>   (lower(value::text));
> 
> I'm trying to execute queries like 'select * from stringvalue where
> value=lower(?)'.

Wouldn't this be "lower(value) = lower(?)" ?

> Making 1000 selects takes about 4-5 min.

So each query is taking approx 300ms?  How much data does each one
return?

> I did VACUUM and ANALYZE on this table and checked that the query plan
> uses the index.  What can I do to make it faster?

How about combining all 1000 selects into one?  Maybe something like:

  SELECT * FROM stringvalue
  WHERE lower(value) = ANY (ARRAY ['a','b','c']);
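Or, if your driver can bind arrays, pass all the keys in as a single
parameter (a sketch; the cast depends on your column type):

  SELECT * FROM stringvalue
  WHERE lower(value) = ANY ($1::text[]);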

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] Slow select

2009-12-16 Thread Grzegorz Jaśkiewicz
show us explain select * 

-- 
GJ



[GENERAL] Slow select

2009-12-16 Thread yuliada

I have a table with a column of character varying(100).  There are about
150,000,000 rows in the table.  The index was created as

CREATE INDEX idx_stringv
  ON bn_stringvalue
  USING btree
  (lower(value::text));

I'm trying to execute queries like 'select * from stringvalue where
value=lower(?)'.  Making 1000 selects takes about 4-5 min.  I did VACUUM
and ANALYZE on this table and checked that the query plan uses the index.
What can I do to make it faster?

Thanks in advance, Yulia




Re: [GENERAL] slow select in big table

2009-04-03 Thread Tom Lane
rafalak  writes:
> QUERY PLAN without changes
> Aggregate  (cost=98018.96..98018.97 rows=1 width=4) (actual time=64049.326..64049.328 rows=1 loops=1)
>   ->  Bitmap Heap Scan on tbl_photos_keywords  (cost=533.23..97940.02 rows=31577 width=4) (actual time=157.787..63905.939 rows=119154 loops=1)
>         Recheck Cond: (keyword_id = 14)
>         ->  Bitmap Index Scan on keyword_id  (cost=0.00..525.33 rows=31577 width=0) (actual time=120.876..120.876 rows=119154 loops=1)
>               Index Cond: (keyword_id = 14)

> enable_seqscan = off

This is a bad idea (and it did not affect your plan anyway).

> random_page_cost = 1.0

This might or might not be a good idea, depending on whether your
database fits in RAM or not.

> QUERY PLAN with changes
> Aggregate  (cost=30546.30..30546.31 rows=1 width=4) (actual time=1710.100..1710.102 rows=1 loops=1)
>   ->  Index Scan using keyword_id on tbl_photos_keywords  (cost=0.00..30467.36 rows=31577 width=4) (actual time=0.150..1573.843 rows=119154 loops=1)
>         Index Cond: (keyword_id = 14)

It's hard to believe that this plan consistently beats the other one,
especially by that much.  I think what really happened here is that the
data was all cached in the second test, because the first test had
already read it all in.  You need to test both cases (cached and not)
to get a clearer picture of what you're doing.

regards, tom lane



Re: [GENERAL] slow select in big table

2009-04-03 Thread Sam Mason
On Fri, Apr 03, 2009 at 01:20:33AM -0700, rafalak wrote:
> QUERY PLAN without changes
> Aggregate  (cost=98018.96..98018.97 rows=1 width=4) (actual time=64049.326..64049.328 rows=1 loops=1)
>   ->  Bitmap Heap Scan on tbl_photos_keywords  (cost=533.23..97940.02 rows=31577 width=4) (actual time=157.787..63905.939 rows=119154 loops=1)
>         Recheck Cond: (keyword_id = 14)
>         ->  Bitmap Index Scan on keyword_id  (cost=0.00..525.33 rows=31577 width=0) (actual time=120.876..120.876 rows=119154 loops=1)
>               Index Cond: (keyword_id = 14)

Pulling in 120k rows from disk is always going to take a long time.

> Total runtime: 64049.686 ms

If they're scattered randomly across the table it's going to take a lot
longer even than this; assuming an 8ms average seek time that's about 30
minutes (two seeks for each row), which is why PG was preferring to pick
a bitmap heap scan over an index scan.  The reason it only took one
minute is that the data isn't randomly distributed.

> QUERY PLAN with changes

> Aggregate  (cost=30546.30..30546.31 rows=1 width=4) (actual time=1710.100..1710.102 rows=1 loops=1)
>   ->  Index Scan using keyword_id on tbl_photos_keywords  (cost=0.00..30467.36 rows=31577 width=4) (actual time=0.150..1573.843 rows=119154 loops=1)
>         Index Cond: (keyword_id = 14)

> Total runtime: 1710.185 ms

Is the cache hot now? i.e. did you run the tests in that order and hence
the first run pulled the data off disk and into memory where the second
test could use it without hitting the disk?

If you're running a recent Linux, you can tell it to drop the disk
cache:

  http://linux-mm.org/Drop_Caches
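Per that page, something like (as root; sync first so dirty pages are
written out before the cache is dropped):

  sync
  echo 3 > /proc/sys/vm/drop_caches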

If you do this and then restart PG (because PG has its own caches) you
should be able to see what's going on better.

> What else can be changed?

Disks are slow and it's the seek time which is killing you; as your
dataset isn't too big you could upgrade the memory in the machine so the
entire table can be in memory at once and then you won't have a problem.

If you want to optimise this case you could also look at CLUSTERing this
table on the keyword.
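On your 8.3 that would be something like (it takes an exclusive lock and
rewrites the whole table, so expect it to take a while):

  CLUSTER keyword_id ON tbl_photos_keywords;
  ANALYZE tbl_photos_keywords;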

-- 
  Sam  http://samason.me.uk/



Re: [GENERAL] slow select in big table

2009-04-03 Thread rafalak
> shared_buffers = 810MB
> temp_buffers = 128MB
> work_mem = 512MB
> maintenance_work_mem = 256MB
> max_stack_depth = 7MB
> effective_cache_size = 800MB

QUERY PLAN without changes
Aggregate  (cost=98018.96..98018.97 rows=1 width=4) (actual time=64049.326..64049.328 rows=1 loops=1)
  ->  Bitmap Heap Scan on tbl_photos_keywords  (cost=533.23..97940.02 rows=31577 width=4) (actual time=157.787..63905.939 rows=119154 loops=1)
        Recheck Cond: (keyword_id = 14)
        ->  Bitmap Index Scan on keyword_id  (cost=0.00..525.33 rows=31577 width=0) (actual time=120.876..120.876 rows=119154 loops=1)
              Index Cond: (keyword_id = 14)

Total runtime: 64049.686 ms

shared_buffers = 810MB
temp_buffers = 128MB
work_mem = 12MB
maintenance_work_mem = 256MB
max_stack_depth = 7MB
enable_seqscan = off
seq_page_cost = 1.0
random_page_cost = 1.0
effective_cache_size = 800MB

QUERY PLAN with changes
Aggregate  (cost=30546.30..30546.31 rows=1 width=4) (actual time=1710.100..1710.102 rows=1 loops=1)
  ->  Index Scan using keyword_id on tbl_photos_keywords  (cost=0.00..30467.36 rows=31577 width=4) (actual time=0.150..1573.843 rows=119154 loops=1)
        Index Cond: (keyword_id = 14)

Total runtime: 1710.185 ms

What else can be changed?
Thanks for the help.



Re: [GENERAL] slow select in big table

2009-04-02 Thread Scott Marlowe
On Thu, Apr 2, 2009 at 2:48 PM, rafalak  wrote:
> Hello, I have a big table:
> 80mln records, ~6GB of data, 2 columns (int, int)
>
> If the query
> select count(col1) from tab where col2=1234;
> returns few records (1-10) the time is good (30-40ms),
> but when it returns >1000 records the time is >12s.
>
>
> How can I increase performance?
>
>
> my postgresql.conf
> shared_buffers = 810MB
> temp_buffers = 128MB
> work_mem = 512MB
> maintenance_work_mem = 256MB
> max_stack_depth = 7MB
> effective_cache_size = 800MB

Try lowering random_page_cost close to the setting of seq_page_cost
(i.e. just over 1 on a default seq_page_cost) and see if that helps.
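You can try it per-session first, before touching postgresql.conf:

  SET random_page_cost = 1.1;   -- seq_page_cost defaults to 1.0
  EXPLAIN ANALYZE SELECT count(col1) FROM tab WHERE col2 = 1234;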



Re: [GENERAL] slow select in big table

2009-04-02 Thread Abbas
On Fri, Apr 3, 2009 at 2:18 AM, rafalak  wrote:

> Hello, I have a big table:
> 80mln records, ~6GB of data, 2 columns (int, int)
>
> If the query
> select count(col1) from tab where col2=1234;
> returns few records (1-10) the time is good (30-40ms),
> but when it returns >1000 records the time is >12s.
>
>
> How can I increase performance?
>
>
> my postgresql.conf
> shared_buffers = 810MB
> temp_buffers = 128MB
> work_mem = 512MB
> maintenance_work_mem = 256MB
> max_stack_depth = 7MB
> effective_cache_size = 800MB
>
>
> db 8.3.7
> server: Athlon dual-core 2.0GHz, 2GB RAM, SATA
>
>
Does the table have indexes?
Decreasing work_mem may also increase performance.
Monitor these changes by examining the query plan with EXPLAIN.

Regards,
Abbas.


[GENERAL] slow select in big table

2009-04-02 Thread rafalak
Hello, I have a big table:
80mln records, ~6GB of data, 2 columns (int, int)

If the query

select count(col1) from tab where col2=1234;

returns few records (1-10) the time is good (30-40ms),
but when it returns >1000 records the time is >12s.

How can I increase performance?


my postgresql.conf
shared_buffers = 810MB
temp_buffers = 128MB
work_mem = 512MB
maintenance_work_mem = 256MB
max_stack_depth = 7MB
effective_cache_size = 800MB


db 8.3.7
server: Athlon dual-core 2.0GHz, 2GB RAM, SATA




Re: [GENERAL] Slow SELECT

2003-10-13 Thread Tom Lane
Mat <[EMAIL PROTECTED]> writes:
> On Fri, 2003-10-03 at 17:50, Tom Lane wrote:
>> Well, it seems to be running at about 5 msec/row, which would be quite
>> respectable if each fetch required another disk seek.  I'm wondering why
>> you are (apparently) not managing to get more than one row per page
>> fetched.  What are your configuration settings --- particularly
>> shared_buffers?  Could we see the output of VACUUM VERBOSE for this
>> table?

> Lines from postgresql.conf that don't start with a '#':
> shared_buffers = 126976   #992 MB

As someone else pointed out, that is way too large (unless maybe you
have 4Gb of RAM, and even then I'd not counsel making shared_buffers
that large).

> INFO:  --Relation public.meta--
> INFO:  Pages 685043: Changed 0, Empty 8; Tup 5999170: Vac 0, Keep 0,
> UnUsed 5999170.
> Total CPU 18.06s/3.61u sec elapsed 612.91 sec.

This shows you've got less than 9 tuples per 8k disk page on average.
Is the table very wide?  If it doesn't seem to you that the tuples
should occupy 1K apiece, it might be that you need to run a VACUUM FULL
to get rid of some excess free space.  (If so, it's a sign that you need
to increase the FSM settings in postgresql.conf and/or run plain VACUUM
more frequently, so that the free space doesn't get away from you
again.)
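Concretely, something like this (VACUUM FULL takes an exclusive lock,
and the FSM numbers below are only illustrative - size them to your
cluster):

  VACUUM FULL VERBOSE meta;

and in postgresql.conf:

  max_fsm_pages = 1000000
  max_fsm_relations = 1000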

regards, tom lane



Re: [GENERAL] Slow SELECT

2003-10-09 Thread Shridhar Daithankar
Mat wrote:
> Lines from postgresql.conf that don't start with a '#':
>
> tcpip_socket = true
> shared_buffers = 126976   # 992 MB
> sort_mem = 36864          # 36 MB
> vacuum_mem = 73696        # 72 MB

I would suggest scaling shared_buffers down to 128 or 64MB and setting
effective_cache_size correctly.  That should help.
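For example, with 4GB of RAM (illustrative numbers; this release takes
both settings in 8k pages):

  shared_buffers = 16384          # 128 MB
  effective_cache_size = 327680   # ~2.5 GB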

HTH

 Shridhar



[GENERAL] Slow SELECT

2003-10-03 Thread psql-mail
I am running a SELECT to get all tuples within a given date range. This
query is much slower than I expected - am I missing something?

I have a table 'meta' with a column 'in_date' of type timestamp(0), and I
am trying to select all records within a given date range. I have an
index on 'in_date' and I also have an index on date(in_date). The queries
I am doing are between dates rather than timestamps.

Immediately prior to running the queries shown below, a VACUUM ANALYZE
was run.

The query planner seems to be out by a factor of 10 for cost and number
of rows. Is this because of the slow performance?

I have 6 million records, with dates spread fairly evenly between the
end of 2001 and now.

I was very surprised to see the query take over 20 minutes when using
the date(in_date) index, and more surprised to see the seq scan over
in_date using timestamps take only 10 minutes.

Both are taking too long in my opinion! I was hoping for less than 10
seconds. Is this too optimistic?

Any suggestions much appreciated.

I am using RH AS 3 on an IBM x450 quad-Xeon ia64 with 4GB of memory (1GB
shared buffers for the postmaster). When running queries, the processor
the backend runs on sits at 15-20% usage and iowait goes up to 80-99%
(fibre-attached RAID 0 - yes, I know it's not resilient).

testdb=# EXPLAIN ANALYZE SELECT item_id, in_date FROM meta
         WHERE date(in_date) >= '2002-03-01' AND date(in_date) < '2002-04-01'
         ORDER BY in_date DESC;

                                QUERY PLAN
--------------------------------------------------------------------------
 Sort  (cost=122755.65..122830.64 rows=29996 width=50) (actual time=1248326.17..1248608.39 rows=261305 loops=1)
   Sort Key: in_date
   ->  Index Scan using meta_in_date_date_index on meta  (cost=0.00..120525.09 rows=29996 width=50) (actual time=0.00..1244835.94 rows=261305 loops=1)
         Index Cond: ((date(in_date) >= '2002-03-01'::date) AND (date(in_date) < '2002-04-01'::date))
 Total runtime: 1248887.70 msec
(5 rows)


Here are the stats on the in_date column if they're any use...

testdb=# SELECT * FROM pg_stats WHERE tablename = 'meta' AND attname = 'in_date';

schemaname        | public
tablename         | meta
attname           | in_date
null_frac         | 0
avg_width         | 8
n_distinct        | -1
most_common_vals  |
most_common_freqs |
histogram_bounds  | {"2001-10-18 17:28:23","2001-12-28 19:31:06","2002-03-14 19:59:08","2002-05-27 08:28:04","2002-07-31 14:06:06","2002-10-09 19:09:49","2002-12-21 03:58:46","2003-03-02 21:41:37","2003-05-09 16:12:39","2003-07-22 05:13:18","2003-09-30 13:48:04"}
correlation       | -0.000184019
(1 row)


Here is the same query as above but using timestamp(0)'s instead of
dates.

testdb=# EXPLAIN ANALYZE SELECT item_id, in_date FROM meta
         WHERE in_date >= '2002-03-01' AND in_date < '2002-04-01'
         ORDER BY in_date DESC;

                                QUERY PLAN
--------------------------------------------------------------------------
 Sort  (cost=797371.98..797995.09 rows=249246 width=50) (actual time=616906.25..617183.58 rows=261305 loops=1)
   Sort Key: in_date
   ->  Seq Scan on meta  (cost=0.00..775030.55 rows=249246 width=50) (actual time=19.53..611541.03 rows=261305 loops=1)
         Filter: ((in_date >= '2002-03-01 00:00:00'::timestamp without time zone) AND (in_date < '2002-04-01 00:00:00'::timestamp without time zone))
 Total runtime: 617446.29 msec
(5 rows)




Re: [GENERAL] slow SELECT ... LIMIT query

2001-07-18 Thread Tom Lane

What query plans are you getting for these various combinations?

regards, tom lane




[GENERAL] slow SELECT ... LIMIT query

2001-07-18 Thread Simon Stanlake

Hi,

I have a fairly large table (1 million records) with the following
structure...

sampleid int4
unitid int4
datetimestamp timestamp
data1 float8
data2 float8

btree indexes on sampleid, unitid, and datetimestamp.

I want to be able to pull out the most recent record for a certain unit.

The query that seemed most reasonable was:
SELECT * FROM MYTABLE WHERE UNITID = unit_id ORDER BY DATETIMESTAMP DESC
LIMIT 1;

Some strange results...

1) For units that have a lot of records (100K) the query is reasonably fast
(~0.5 seconds), but for units without many records (100) the query is
REALLY SLOW, like 15 seconds.  The explain plan says INDEX SCAN BACKWARDS,
so I guess it is using the index.

2) When I take away the LIMIT 1 and run the query, it returns the results
almost immediately no matter how many records the unit has.

It looks like the indexes are not being used properly when the LIMIT clause
is inserted.  I didn't do anything fancy when I created the indexes, just
the standard CREATE INDEX statement.  I ran VACUUM ANALYZE and everything.

Is this common when you use the LIMIT clause?  Is there a workaround?
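Would a two-column index help (untested sketch - ordering by both columns
so the planner can walk the index backwards and stop at the first row for
that unit)?

  CREATE INDEX mytable_unit_ts ON mytable (unitid, datetimestamp);

  SELECT * FROM mytable WHERE unitid = 1234
  ORDER BY unitid DESC, datetimestamp DESC LIMIT 1;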

Thanks,
Simon





Re: [GENERAL] Slow SELECT...IN statements

2001-03-23 Thread Bruce Momjian

> The FAQ states in entry 4.23 that SELECT ... IN statements are slow and
> recommends rewriting them with EXISTS instead.  It also states that this
> will be resolved in some future version.
> I didn't find any entries about that in the TODO list; does anybody know
> when this will be fixed?

It will be fixed when we do the query tree rewrite, which is on the TODO
list, hopefully for 7.2.
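In the meantime, the rewrite the FAQ has in mind looks roughly like this
(made-up table names):

  -- instead of:
  SELECT * FROM tab WHERE col IN (SELECT subcol FROM subtab);
  -- write:
  SELECT * FROM tab
  WHERE EXISTS (SELECT 1 FROM subtab WHERE subtab.subcol = tab.col);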

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026




[GENERAL] Slow SELECT...IN statements

2001-03-23 Thread Jan Wessely

The FAQ states in entry 4.23 that SELECT ... IN statements are slow and
recommends rewriting them with EXISTS instead.  It also states that this
will be resolved in some future version.
I didn't find any entries about that in the TODO list; does anybody know
when this will be fixed?

PS: The mailing list archives' search engine wasn't working, so please
forgive me if that was already asked and answered on this list.

--
Jan Wessely
Vienna Knowledge Net
mailto:[EMAIL PROTECTED]

