Re: [PERFORM] difficulties with time based queries

2009-04-08 Thread Rainer Mager
So, I defragged my disk and reran my original query and it got a little
better, but still far higher than I'd like. I then rebuilt (dropped and
recreated) the ad_log_date_all index and reran the query and it is quite a
bit better:

# explain analyze select * from ad_log where date(start_time) < date('2009-03-31') and date(start_time) >= date('2009-03-30');

                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on ad_log  (cost=64770.21..3745596.62 rows=2519276 width=32) (actual time=1166.479..13862.107 rows=2275167 loops=1)
   Recheck Cond: ((date(start_time) < '2009-03-31'::date) AND (date(start_time) >= '2009-03-30'::date))
   ->  Bitmap Index Scan on ad_log_date_all  (cost=0.00..64140.39 rows=2519276 width=0) (actual time=1143.582..1143.582 rows=2275167 loops=1)
         Index Cond: ((date(start_time) < '2009-03-31'::date) AND (date(start_time) >= '2009-03-30'::date))
 Total runtime: 14547.885 ms

During the query the disk throughput peaked at 30MB/s and was mostly
around 20MB/s, much better.

So, a few questions:

What can I do to prevent the index from getting bloated, or in whatever
state it was in?
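
For reference, the index can also be rebuilt in place rather than dropped
and recreated, and its on-disk size is easy to watch over time -- a sketch,
using the index name from the plan above (note that REINDEX blocks writes
to the table while it runs):

REINDEX INDEX ad_log_date_all;
SELECT pg_size_pretty(pg_relation_size('ad_log_date_all'));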

What else can I do to further improve queries on this table? Someone
suggested posting details of my conf file. Which settings are most likely to
be useful for this?
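
One way to pull the settings people usually ask about straight from the
running server, rather than digging through the conf file (a sketch):

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size',
               'random_page_cost', 'maintenance_work_mem',
               'checkpoint_segments');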

Any other suggestions?

Thanks,

--Rainer



Re: [PERFORM] Nested query performance issue

2009-04-08 Thread Glenn Maynard
(I didn't notice that I ended up with "score.score" in this test case.  Oops.)

2009/4/8 Віталій Тимчишин :
> How about
>
> select s1.*
> from score s1 join score s2 on s1.game_id=s2.game_id and s2.score >=
> s1.score
> group by s1.*
> having count(s2.*) <= N

I can see what this is doing, but I'm getting:

ERROR:  could not identify an ordering operator for type score
HINT:  Use an explicit ordering operator or modify the query.

I'm not sure why; if I replace s1.* and s2.* with s1.id and s2.id it
works, but then I only get IDs.
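
A rough workaround is to group by the primary key and join back for the
full rows -- a sketch, untested:

SELECT s.*
FROM score s
JOIN (
    SELECT s1.id
    FROM score s1
    JOIN score s2 ON s1.game_id = s2.game_id AND s2.score >= s1.score
    GROUP BY s1.id
    HAVING count(*) <= 5   -- N = 5
) best ON best.id = s.id;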

Unfortunately, with N = 1 this takes 8100ms (vs. 950ms and 25ms)...

-- 
Glenn Maynard

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Nested query performance issue

2009-04-08 Thread Віталій Тимчишин
2009/4/9 Glenn Maynard 

> (This is related to an earlier post on -sql.)
>
> I'm querying for the N high scores for each game, with two tables:
> scores and games.
>
> CREATE TABLE game (id SERIAL NOT NULL PRIMARY KEY);
> CREATE TABLE score (id SERIAL NOT NULL PRIMARY KEY, score REAL,
> game_id INTEGER REFERENCES game (id));
> -- test data: 1000 games, 10 scores
> INSERT INTO game (id) select generate_series(1,1000);
> INSERT INTO score (game_id, score) select game.id, random() from game,
> generate_series(1,100);
> CREATE INDEX score_idx1 ON score (game_id, score desc);
> ANALYZE;
>

How about

select s1.*
from score s1 join score s2 on s1.game_id=s2.game_id and s2.score >=
s1.score
group by s1.*
having count(s2.*) <= N

Note: you can have problems if there are tied scores - you may lose the
last group that overlaps N
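
One way to sidestep that is to break ties on the primary key, e.g. (a sketch):

select s1.id
from score s1 join score s2
  on s1.game_id = s2.game_id
 and (s2.score, s2.id) >= (s1.score, s1.id)
group by s1.id
having count(*) <= 5   -- N = 5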

In any case, you don't need to join game, since all you need is the game_id
you already have in score.
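
For instance, the second query from the original mail could drop game
entirely (a sketch; same result assuming every score.game_id points at an
existing game):

select s.* from score s
where s.id in (
    select s2.id from score s2
    where s2.game_id = s.game_id
    order by s2.score desc limit 1
);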

P.S. EXPLAIN ANALYZE could help

Best regards, Vitalii Tymchyshyn


[PERFORM] Nested query performance issue

2009-04-08 Thread Glenn Maynard
(This is related to an earlier post on -sql.)

I'm querying for the N high scores for each game, with two tables:
scores and games.

CREATE TABLE game (id SERIAL NOT NULL PRIMARY KEY);
CREATE TABLE score (id SERIAL NOT NULL PRIMARY KEY, score REAL,
game_id INTEGER REFERENCES game (id));
-- test data: 1000 games, 10 scores
INSERT INTO game (id) select generate_series(1,1000);
INSERT INTO score (game_id, score) select game.id, random() from game,
generate_series(1,100);
CREATE INDEX score_idx1 ON score (game_id, score desc);
ANALYZE;

This query retrieves the single highest score for each game, but
doesn't allow any more than that--I can't get the top five scores for
each game.  However, it's very fast: on the above test data, it runs
in 25ms on my system.  With 100 scores, it takes 40ms.

SELECT s.* FROM score s
WHERE s.id IN (
   -- Get the high scoring score ID for each game:
   SELECT
   (
   -- Get the high score for game g:
   SELECT s2.id FROM score s2 WHERE s2.game_id = g.id ORDER BY
s2.score DESC LIMIT 1
   )
   FROM game g
);


This rewrite allows getting the top N scores.  Unfortunately, this one
takes 950ms for the same data.  With 100 scores, it takes 14800ms.

SELECT s.* FROM score s, game g
WHERE s.game_id = g.id AND
 s.id IN (
SELECT s2.id FROM score s2 WHERE s2.game_id=g.id ORDER BY s2.score
DESC LIMIT 1
 );


This seems simple: for each game, search for the highest score, and
then scan the tree to get the next N-1 highest scores.  The first
version does just that, but the second one is doing a seq scan over
score.

I do want to be able to use a LIMIT higher than 1, which only works
with the second form.  Any suggestions of how to get the efficient
scanning of the first while being able to use a LIMIT greater than 1?

(It'd even be faster to make several calls to the first version,
varying an OFFSET to get each high score--but that's terrible.)
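
For reference: once 8.4's window functions are available, top-N-per-game
can be expressed directly -- a sketch, not an option on 8.3:

SELECT * FROM (
    SELECT s.*,
           rank() OVER (PARTITION BY s.game_id ORDER BY s.score DESC) AS pos
    FROM score s
) ranked
WHERE pos <= 5;   -- top N = 5 per game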

-- 
Glenn Maynard

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-08 Thread Dimitri Fontaine

Hi,

Ok I need to answer some more :)

On 8 Apr 09, at 20:20, Jeff wrote:
> To add a table with a pk you edit slon_tools.conf and add something
> along the lines of:
>
> "someset" => {
> "set_id" => 5,
> "table_id" => 5,
> "pkeyedtables" => [ "tacos", "burritos", "gorditas" ]
> }
>
> then you just run
>
> [create tables on slave(s)]
> slonik_create_set someset;
> slonik_subscribe_set 1 2;


 $ londiste.py setup.ini provider add schema.table
 $ londiste.py setup.ini subscriber add schema.table

Note that both of those commands are to be run from the same host (often
enough, the slave); if you have more than one slave, issue the second one
only on the remaining ones.


> there are other handy scripts in there as well for failing over,
> adding tables, merging, etc. that hide a lot of the suck.
> Especially the suck of adding a node and creating the store paths.


There are no sets in Londiste, so you just don't manage them. You add
tables to queues (referencing the provider, in fact) and the subscriber
is free to subscribe to only a subset of the provider queue's tables.
And any table can participate in more than one queue at any time too,
of course.


> I'm running slony on a rather write intensive system, works fine,
> just make sure you've got beefy IO.  One sucky thing though is if a
> slave is down sl_log can grow very large (I've had it get over 30M
> rows, the slave was only down for hours) and this causes major cpu
> churn while the queries slon issues sift through tons of data.  But,
> to be fair, that'll hurt any replication system.


This can happen in Londiste too; just set pgq_lazy_fetch to a
reasonable value and Londiste will use a cursor to fetch the events,
lowering the load. Events are just tuples in an INSERT-only table,
which is TRUNCATEd away once it is no longer in use. PGQ uses 3 tables
to store events and rotates which one new events are inserted into,
allowing it to use TRUNCATE rather than DELETE. And PostgreSQL is quite
efficient at managing this :)

  
http://wiki.postgresql.org/wiki/Londiste_Tutorial#Londiste_is_eating_all_my_CPU_and_lag_is_raising


Oh and some people asked what Londiste with failover and DDL would  
look like. Here's what the API being cooked looks like at the moment:

 $ londiste setup.ini execute myddl.script.sql

 $ londiste conf/londiste_db3.ini change-provider --provider=rnode1
 $ londiste conf/londiste_db1.ini switchover --target=rnode2

But I'm not the one who should be unveiling all of this, which is  
currently being prepared to reach alpha soon'ish.


Regards,
--
dim


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-08 Thread Jeff


On Apr 7, 2009, at 1:18 PM, Andrew Sullivan wrote:


> I should have stated that differently.  First, you're right that if
> you don't know where to look or what to look for, you can easily be
> unaware of nodes being out of sync.  What's not a problem with Slony


_$cluster.sl_status on the origin is a very handy tool to see your
slaves, how many syncs behind they are, and whatnot.  Maybe I'm lucky,
but I haven't gotten into a funky state that didn't cause the alarms I
have watching sl_status to go nuts.



> Complexity seems to be the major evil here.


Yes.  Slony is massively complex.



Configuring slony by hand using slonik commands does suck horribly.
But the included altperl tools, along with slon_tools.conf, remove a
HUGE amount of that suck.


To add a table with a pk you edit slon_tools.conf and add something  
along the lines of:


"someset" => {
"set_id" => 5,
"table_id" => 5,
"pkeyedtables" => [ "tacos", "burritos", "gorditas" ]
}

then you just run

[create tables on slave(s)]
slonik_create_set someset;
slonik_subscribe_set 1 2;

there are other handy scripts in there as well for failing over,  
adding tables, merging, etc. that hide a lot of the suck.  Especially  
the suck of adding a node and creating the store paths.


I'm running slony on a rather write intensive system, works fine, just  
make sure you've got beefy IO.  One sucky thing though is if a slave  
is down sl_log can grow very large (I've had it get over 30M rows, the  
slave was only down for hours) and this causes major cpu churn while  
the queries slon issues sift through tons of data.  But, to be fair,  
that'll hurt any replication system.


--
Jeff Trout 
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] bad query plans for ~ "^string" (and like "string%") (8.3.6)

2009-04-08 Thread Tom Lane
Marinos Yannikos  writes:
> Marinos Yannikos wrote:
>> (what exactly does ANALYZE look at for text columns? in our case, about 
>> 7% of the rows match the index condition, so it seems that left-anchored 
>> regexp/like matches are not evaluated using the gathered 
>> most-common-value list at all)

> oops, I think I gave myself the answer there. Of course the 
> most-common-value list will not help if all the values that match the 
> "bad" index condition exist only once, but have a common prefix...

The costing is really done off the range condition ((e >= 'ean'::text)
AND (e < 'eao'::text) in your example).  I wouldn't think it would have
such a hard time getting a good rowcount estimate for that.  Maybe you
need to bump up the statistics target for that column?
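
For example, something along these lines (a sketch, using the table and
column names from the quoted plan):

ALTER TABLE m ALTER COLUMN e SET STATISTICS 1000;  -- or whatever target suits
ANALYZE m;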

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] bad query plans for ~ "^string" (and like "string%") (8.3.6)

2009-04-08 Thread Robert Haas
On Wed, Apr 8, 2009 at 9:42 AM, Marinos Yannikos  wrote:
> It seems that ANALYZE does not really sample text column values as much as
> it could. We have some very bad query plans resulting from this:
>
> ...
>         ->  Bitmap Index Scan on m_pkey  (cost=0.00..28.61 rows=102 width=0)
> (actual time=171.824..171.824 rows=683923 loops=1)
>               Index Cond: ((e >= 'ean'::text) AND (e < 'eao'::text)
>
> This gets even worse for longer strings, where we know that many matching
> rows exist:
>
> # explain analyze select substring(e,5) from m where id=257421 and e ~
> '^ean=';
>                                                        QUERY PLAN
> ---
>  Index Scan using m_pkey on m  (cost=0.00..12.50 rows=1 width=60) (actual
> time=1623.795..1703.958 rows=18 loops=1)
>   Index Cond: ((e >= 'ean='::text) AND (e < 'ean>'::text))
>   Filter: ((e ~ '^ean='::text) AND (id = 257421))
>  Total runtime: 1703.991 ms
> (4 rows)
>
> Here it would be much better to use the existing index on "id" (btree) first
> because the current index condition selects 683k rows whereas the result
> contains 18 rows. Using the index on id would yield 97 rows to filter.
>
> Is it possible to work around this problem somehow, other than adding
> partial indexes for the ~ / LIKE condition (when it's constant) or a
> 2-dimensional index?
>
> (what exactly does ANALYZE look at for text columns? in our case, about 7%
> of the rows match the index condition, so it seems that left-anchored
> regexp/like matches are not evaluated using the gathered most-common-value
> list at all)

What are you using for default_statistics_target?

You can see the gathered data in pg_statistic.
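
Or, more readably, via the pg_stats view -- a sketch, using the table and
column names from your example:

SHOW default_statistics_target;

SELECT attname, n_distinct, most_common_vals, histogram_bounds
FROM pg_stats
WHERE tablename = 'm' AND attname = 'e';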

...Robert

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] bad query plans for ~ "^string" (and like "string%") (8.3.6)

2009-04-08 Thread Marinos Yannikos

Marinos Yannikos wrote:
> (what exactly does ANALYZE look at for text columns? in our case, about
> 7% of the rows match the index condition, so it seems that left-anchored
> regexp/like matches are not evaluated using the gathered
> most-common-value list at all)


oops, I think I gave myself the answer there. Of course the 
most-common-value list will not help if all the values that match the 
"bad" index condition exist only once, but have a common prefix...


Perhaps Postgres could sample the first few characters separately for 
such queries, but it's probably not worth it.


Regards,
 Marinos

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] bad query plans for ~ "^string" (and like "string%") (8.3.6)

2009-04-08 Thread Marinos Yannikos
It seems that ANALYZE does not really sample text column values as much 
as it could. We have some very bad query plans resulting from this:


...
         ->  Bitmap Index Scan on m_pkey  (cost=0.00..28.61 rows=102 width=0) (actual time=171.824..171.824 rows=683923 loops=1)
               Index Cond: ((e >= 'ean'::text) AND (e < 'eao'::text))

This gets even worse for longer strings, where we know that many 
matching rows exist:


# explain analyze select substring(e,5) from m where id=257421 and e ~ '^ean=';
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Index Scan using m_pkey on m  (cost=0.00..12.50 rows=1 width=60) (actual time=1623.795..1703.958 rows=18 loops=1)
   Index Cond: ((e >= 'ean='::text) AND (e < 'ean>'::text))
   Filter: ((e ~ '^ean='::text) AND (id = 257421))
 Total runtime: 1703.991 ms
(4 rows)

Here it would be much better to use the existing index on "id" (btree) 
first because the current index condition selects 683k rows whereas the 
result contains 18 rows. Using the index on id would yield 97 rows to 
filter.


Is it possible to work around this problem somehow, other than adding 
partial indexes for the ~ / LIKE condition (when it's constant) or a 
2-dimensional index?
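
Roughly, those two workarounds would look like this (just a sketch, index
names made up):

CREATE INDEX m_ean_partial_idx ON m (id) WHERE e ~ '^ean=';
CREATE INDEX m_id_e_idx ON m (id, e);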


(what exactly does ANALYZE look at for text columns? in our case, about 
7% of the rows match the index condition, so it seems that left-anchored 
regexp/like matches are not evaluated using the gathered 
most-common-value list at all)


Regards,
 Marinos

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-08 Thread Marinos Yannikos

Heikki Linnakangas wrote:
> Lists wrote:
>> Server is a dual core xeon 3GB ram and 2 mirrors of 15k SAS drives (1
>> for most data, 1 for wal and a few tables and indexes)
>>
>> In total all databases on the server are about 10G on disk (about 2GB
>> in pgdump format).
>
> I'd suggest buying as much RAM as you can fit into the server. RAM is
> cheap, and with a database of that size more cache could have a dramatic
> effect.


I'll second this. Although it doesn't really answer the original 
question, you have to keep in mind that for read-intensive workloads, 
caching will give you the biggest benefit by far, orders of magnitude 
more than replication solutions unless you want to spend a lot of $ on 
hardware (which I take it you don't if you are reluctant to add new 
disks). Keeping the interesting parts of the DB completely in RAM makes
a big difference; common older (P4-based) Xeon boards can usually be
upgraded to 12-16GB RAM, newer ones to anywhere between 16 and 192GB ...


As for replication solutions - Slony I wouldn't recommend (tried it for 
workloads with large writes - bad idea), but PgQ looks very solid and 
you could either use Londiste or build your own very fast non-RDBMS 
slaves using PgQ by keeping the data in an optimized format for your 
queries (e.g. if you don't need joins - use TokyoCabinet/Berkeley DB).


Regards,
 Marinos


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] plpgsql arrays

2009-04-08 Thread Matthew Wakeling

On Tue, 7 Apr 2009, Tom Lane wrote:

> Subsequent discussion showed that the problem was Matthew hadn't found
> that page.  I guess that at least the DECLARE CURSOR reference page
> ought to have something like "if you are trying to use cursors in
> plpgsql, see ".  Matthew, where *were* you looking exactly?


The DECLARE CURSOR page, and then guessing the INTO bit because that's how 
SELECT works.


Matthew

--
for a in past present future; do
  for b in clients employers associates relatives neighbours pets; do
  echo "The opinions here in no way reflect the opinions of my $a $b."
done; done

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Best replication solution?

2009-04-08 Thread Mark Kirkwood

Andrew Sullivan wrote:


> I should have stated that differently.  First, you're right that if
> you don't know where to look or what to look for, you can easily be
> unaware of nodes being out of sync.  What's not a problem with Slony
> is that the nodes can get out of internally consistent sync state: if
> you have a node that is badly lagged, at least it represents, for
> sure, an actual point in time of the origin set's history.  Some of
> the replication systems aren't as careful about this, and it's
> possible to get the replica into a state that never happened on the
> origin.  That's much worse, in my view.
>
> In addition, it is not possible that Slony's system tables report the
> replica as being up to date without them actually being so, because
> the system tables are updated in the same transaction as the data is
> sent.  It's hard to read those tables, however, because you have to
> check every node and understand all the states.


Yes, and nicely explained!


(on Londiste DDL + slave chaining)...
> Well, those particular features -- which are indeed the source of much
> of the complexity in Slony -- were planned in from the beginning.
> Londiste aimed to be simpler, so it would be interesting to see
> whether those features could be incorporated without the same
> complication.

Yeah, that's the challenge!

Personally I would like DDL to be possible without any special wrappers 
or precautions, as the usual (accidental) breakage I end up looking at 
in Slony is because someone (or an app's upgrade script) has performed 
an ALTER TABLE directly on the master schema...


Cheers

Mark


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance