Re: [PERFORM] Performance bug in prepared statement binding in 9.2?

2013-09-25 Thread Jeff Janes
On Wed, Sep 11, 2013 at 2:10 PM, Andres Freund and...@2ndquadrant.com wrote:

 On 2013-09-11 15:06:23 -0400, Andrew Dunstan wrote:
 
  One thing that this made me wonder is why we don't have
 transaction_timeout,
  or maybe transaction_idle_timeout.

 Because it's harder than it sounds, at least if you want to support
 idle-in-transactions. Note that we do not support pg_cancel_backend()
 for those yet...


So we are left with pg_terminate_backend in a cron job.  That mostly seems
to work, because usually apps that abandon connections in the
idle-in-transaction state will never check back on them anyway, but cancel
would be nicer.
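
For reference, a minimal sketch of the kind of cron-driven cleanup described
above, on 9.2 (the one-hour cutoff, and terminating rather than cancelling, are
arbitrary example choices):

-- terminate backends that have sat idle in a transaction for over an hour
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - state_change > interval '1 hour';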



 Also, I think it might lead to papering over actual issues with
 applications leaving transactions open. I don't really see a valid
 reason for an application needing cancelling of long idle transactions.


Some of us make a living, at least partially, by papering over issues with
3rd party applications that we have no control over!

Cheers,

Jeff


Re: [PERFORM] Planner performance extremely affected by an hanging transaction (20-30 times)?

2013-09-25 Thread Jeff Janes
On Tue, Sep 24, 2013 at 3:35 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Kevin Grittner kgri...@ymail.com writes:
  Are we talking about the probe for the end (or beginning) of an
  index?  If so, should we even care about visibility of the row
  related to the most extreme index entry?  Should we even go to the
  heap during the plan phase?

 Consider the case where some transaction inserted a wildly out-of-range
 value, then rolled back.  If we don't check validity of the heap row,
 we'd be using that silly endpoint value for planning purposes ---
 indefinitely.



Would it really be indefinite?  Would it be different from if someone
inserted a wild value, committed, then deleted it and committed that?  It
seems like eventually the histogram would have to get rebuilt with the
ability to shrink the range.

To get really complicated, it could stop at an in-progress tuple and use
its value for immediate purposes, but suppress storing it in the histogram
(storing only committed, not in-progress, values).

Cheers,

Jeff


Re: [PERFORM] Planner performance extremely affected by an hanging transaction (20-30 times)?

2013-09-25 Thread Jeff Janes
On Tue, Sep 24, 2013 at 10:43 AM, Josh Berkus j...@agliodbs.com wrote:

 On 09/24/2013 08:01 AM, jes...@krogh.cc wrote:
  This stuff is a 9.2 feature, right? What was the original problem to be
  addressed?

 Earlier, actually.  9.1?  9.0?

 The problem addressed was that, for tables with a progressive value
 like a sequence or a timestamp, the planner tended to estimate 1 row any
 time the user queried the 10,000 most recent rows due to the stats being
 out-of-date.  This resulted in some colossally bad query plans for a
 very common situation.

 So there's no question that the current behavior is an improvement,
 since it affects *only* users who have left an idle transaction open for
 long periods of time, something you're not supposed to do anyway.


Some transactions just take a long time to complete their work.  If the
first thing one does is insert these poisoned values and then go on to do other
intensive work on other tables, it can do some serious damage without ever being
idle.



  Not
 that we shouldn't fix it (and backport the fix), but we don't want to
 regress to the prior planner behavior.

 However, a solution is not readily obvious:


The mergejoinscansel code is almost pathologically designed to exercise
this case (which seems to be what is happening for the original poster) because
it systematically probes the highest and lowest values from one table
against the other.  If they have the same range, that means it will always
be testing the upper limit.  Perhaps mergejoinscansel could pass a flag to
prevent the look-up from happening.  My gut feeling is that mergejoin
would not be very sensitive to the progressive-value issue, but I can't
really back that up.  On the other hand, if we could just make getting the
actual value faster, then everyone would be better off.



 On 09/20/2013 03:01 PM, Jeff Janes wrote:
  3) Even worse, asking if a given transaction has finished yet can be a
  serious point of system-wide contention, because it takes the
  ProcArrayLock, once per row which needs to be checked.  So you have 20
  processes all fighting over the ProcArrayLock, each doing so 1000
 times per
  query.

 Why do we need a procarraylock for this?  Seems like the solution would
 be not to take a lock at all; the information on transaction commit is
 in the clog, after all.


My understanding is that you are not allowed to check the clog until after
you verify the transaction is no longer in progress, otherwise you open up
race conditions.

Cheers,

Jeff


Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?

2013-09-25 Thread Sam Wong
Hi All and Merlin,

So here is the explain analyze output.

--
Query A -- single row output, but very slow query
--
SELECT min(log_id) FROM event_log
WHERE event='S-Create' AND
insert_time>'2013-09-15' and insert_time<'2013-09-16'

http://explain.depesz.com/s/3H5
Result  (cost=134.48..134.49 rows=1 width=0) (actual
time=348370.719..348370.720 rows=1 loops=1)
  Output: $0
  InitPlan 1 (returns $0)
->  Limit  (cost=0.00..134.48 rows=1 width=8) (actual
time=348370.712..348370.713 rows=1 loops=1)
  Output: uco.event_log.log_id
  ->  Index Scan using event_log_pkey on uco.event_log
(cost=0.00..1525564.02 rows=11344 width=8) (actual
time=348370.709..348370.709 rows=1 loops=1)
Output: uco.event_log.log_id
Index Cond: (uco.event_log.log_id IS NOT NULL)
Filter: ((uco.event_log.insert_time > '2013-09-15
00:00:00'::timestamp without time zone) AND (uco.event_log.insert_time <
'2013-09-16 00:00:00'::timestamp without time zone) AND (uco.event_log.event
= 'S-Create'::text))
Rows Removed by Filter: 43249789
Total runtime: 348370.762 ms

--
Query B -- multiple row output, fast query, but I could get what I want from
the first output row
--
SELECT log_id FROM event_log
WHERE event='S-Create' AND
insert_time>'2013-09-15' and insert_time<'2013-09-16'
ORDER BY log_id

http://explain.depesz.com/s/s6P
Sort  (cost=41015.85..41021.52 rows=11344 width=8) (actual
time=3651.695..3652.160 rows=6948 loops=1)
  Output: log_id
  Sort Key: event_log.log_id
  Sort Method: quicksort  Memory: 518kB
  ->  Bitmap Heap Scan on uco.event_log  (cost=311.42..40863.05 rows=11344
width=8) (actual time=448.349..3645.465 rows=6948 loops=1)
Output: log_id
Recheck Cond: ((event_log.event = 'S-Create'::text) AND
(event_log.insert_time > '2013-09-15 00:00:00'::timestamp without time zone)
AND (event_log.insert_time < '2013-09-16 00:00:00'::timestamp without time
zone))
->  Bitmap Index Scan on event_data_search  (cost=0.00..310.86
rows=11344 width=0) (actual time=447.670..447.670 rows=6948 loops=1)
  Index Cond: ((event_log.event = 'S-Create'::text) AND
(event_log.insert_time > '2013-09-15 00:00:00'::timestamp without time zone)
AND (event_log.insert_time < '2013-09-16 00:00:00'::timestamp without time
zone))
Total runtime: 3652.535 ms

P.S. If I put a LIMIT 1 at the end of this query, it gets an identical
plan to Query A.

--
My observation:
In Query A, the lower-bound cost estimate of the Index Scan node is way off. It
won't produce its first output row right at 0.00, because the filter has to be
applied first.

Thanks,
Sam





Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?

2013-09-25 Thread Merlin Moncure
On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire klaussfre...@gmail.com wrote:
 On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong s...@hellosam.net wrote:
 This event_log table has 4 million rows.

 “log_id” is the primary key (bigint),

 there is a composite index “event_data_search” over (event::text,
 insert_time::datetime).


 I think you need to add log_id to that composite index to get pg to use it.

hurk: OP is hitting two statistics misses (one of them massive) that are
combining to gobsmack you.

your solution unfortunately won't work: you can't combine two range
searches in a single index scan.  it would probably work if you write it
like this.  If insert_time is a timestamp, not a timestamptz, we can
convert it to date to get what I think he wants (as long as his
queries are along date boundaries).

how about:
CREATE INDEX ON event_log(event_id, (insert_time::date), log_id);

EXPLAIN ANALYZE
  SELECT * FROM event_log
  WHERE
(event_id, insert_time::date, log_id) >= ('S-Create',
'2013-09-15'::date, 0)
AND event_id = 'S-Create' AND insert_time::date < '2013-09-16'::date
  ORDER BY
event_id, insert_time::date, log_id
  LIMIT 1

if insert_time is a timestamptz, we can materialize the date into the
table to get around that (the timestamptz-to-date cast is a stable expression).
If date boundary handling is awkward, our best bet is probably to hack
the planner with a CTE.  Note the above query will smoke the CTE-based
one.

WITH data AS
(
  SELECT log_id FROM event_log
  WHERE event='S-Create' AND
  insert_time>'2013-09-15' and insert_time<'2013-09-16'
)
SELECT * from data ORDER BY log_id LIMIT 1;

merlin




[PERFORM] earthdistance query performance

2013-09-25 Thread AI Rumman
Hi,

I have a table with zip_code and latitude and longitude.

\d zip_code_based_lng_lat
   Table "public.zip_code_based_lng_lat"
 Column |          Type          | Modifiers
--------+------------------------+-----------
 zip    | character varying(100) |
 state  | character varying(100) |
 city   | character varying(100) |
 type   | character varying(100) |
 lat    | character varying(100) |
 lng    | character varying(100) |
Indexes:
    "zip_code_based_lng_lat_zipidx" btree (zip)

I need to find the closest entries by distance (using the radius formula) for a
zip code provided by the user.

I build the query like:

select *,
earth_distance(q2_c1, q1.c1) as d
from
(
select *, ll_to_earth(lat::float,lng::float)  as c1  from
zip_code_based_lng_lat
) as q1,
(
select ll_to_earth(lat::float,lng::float) q2_c1 from zip_code_based_lng_lat
where zip='18938'
) as q2
order by d
limit 10


 Limit  (cost=216010.21..216010.24 rows=10 width=55) (actual
time=38296.185..38296.191 rows=10 loops=1)
   ->  Sort  (cost=216010.21..216415.74 rows=162212 width=55) (actual time=38296.182..38296.182 rows=10 loops=1)
         Sort Key: (sec_to_gc(cube_distance((ll_to_earth((public.zip_code_based_lng_lat.lat)::double precision, (public.zip_code_based_lng_lat.lng)::double precision))::cube, (ll_to_earth((public.zip_code_based_lng_lat.lat)::double precision, (public.zip_code_based_lng_lat.lng)::double precision))::cube)))
         Sort Method: top-N heapsort  Memory: 27kB
         ->  Nested Loop  (cost=0.00..212504.87 rows=162212 width=55) (actual time=3.244..38052.444 rows=81106 loops=1)
               ->  Seq Scan on zip_code_based_lng_lat  (cost=0.00..817.90 rows=81106 width=38) (actual time=0.025..50.669 rows=81106 loops=1)
               ->  Materialize  (cost=0.00..0.32 rows=2 width=17) (actual time=0.000..0.001 rows=1 loops=81106)
                     ->  Index Scan using zip_code_based_lng_lat_zipidx on zip_code_based_lng_lat  (cost=0.00..0.31 rows=2 width=17) (actual time=0.080..0.084 rows=1 loops=1)
                           Index Cond: ((zip)::text = '18938'::text)
 Total runtime: 38296.360 ms


The result is fine. But it is too slow.
I am using Postgresql 9.2 with following parameters:

shared_buffers = 6GB
work_mem = 500 MB
seq_page_cost = 0.01
random_page_cost = 0.01

Any ideas to improve it?

Thanks.


Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?

2013-09-25 Thread Claudio Freire
On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire klaussfre...@gmail.com 
 wrote:
 On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong s...@hellosam.net wrote:
 This event_log table has 4 million rows.

 “log_id” is the primary key (bigint),

 there is a composite index “event_data_search” over (event::text,
 insert_time::datetime).


 I think you need to add log_id to that composite index to get pg to use it.

 hurk: OP is hitting two statistics misses (one of them massive) that are
 combining to gobsmack you.

 your solution unfortunately won't work: you can't combine two range
 searches in a single index scan.  it would probably work if you write it
 like this.  If insert_time is a timestamp, not a timestamptz, we can
 convert it to date to get what I think he wants (as long as his
 queries are along date boundaries).


I was thinking an index over:

(event, date_trunc('day', insert_time), log_id)

And the query like

SELECT min(log_id) FROM event_log
WHERE event='S-Create' AND
date_trunc('day',insert_time) = '2013-09-15'


That's a regular simple range scan over the index.




Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?

2013-09-25 Thread Merlin Moncure
On Wed, Sep 25, 2013 at 10:20 AM, Claudio Freire klaussfre...@gmail.com wrote:
 On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire klaussfre...@gmail.com 
 wrote:
 On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong s...@hellosam.net wrote:
 This event_log table has 4 million rows.

 “log_id” is the primary key (bigint),

 there is a composite index “event_data_search” over (event::text,
 insert_time::datetime).


 I think you need to add log_id to that composite index to get pg to use it.

  hurk: OP is hitting two statistics misses (one of them massive) that are
  combining to gobsmack you.

  your solution unfortunately won't work: you can't combine two range
  searches in a single index scan.  it would probably work if you write it
  like this.  If insert_time is a timestamp, not a timestamptz, we can
 convert it to date to get what I think he wants (as long as his
 queries are along date boundaries).


 I was thinking an index over:

 (event, date_trunc('day', insert_time), log_id)

 And the query like

 SELECT min(log_id) FROM event_log
 WHERE event='S-Create' AND
 date_trunc('day',insert_time) = '2013-09-15'


 That's a regular simple range scan over the index.

*) date_trunc has the same problems as ::date: it is a stable expression
only for timestamptz.  also, the index will be bigger since you're
still indexing the timestamp

*) row-wise comparison search might be faster and is generalized to
return N records, not just one.

merlin




[PERFORM] Troubleshooting query performance issues

2013-09-25 Thread Jim Garrison
I spent about a week optimizing a query in our performance-testing environment, 
which has hardware similar to production.

I was able to refactor the query and reduce the runtime from hours to about 40 
seconds, through the use of CTEs and a couple of new indexes.

The database was rebuilt and refreshed with very similar data from
production, but now the query takes hours again.

In the query plan, it is clear that the row count estimates are WAY too low, 
even though the statistics are up to date.  Here's a sample query plan:

CTE Scan on stef  (cost=164.98..165.00 rows=1 width=38)
  CTE terms
    ->  Nested Loop  (cost=0.00..62.40 rows=1 width=12)
          ->  Index Scan using term_idx1 on term t  (cost=0.00..52.35 rows=1 width=12)
                Index Cond: (partner_id = 497)
                Filter: (recalculate_district_averages_yn AND (NOT is_deleted_yn))
          ->  Index Scan using growth_measurement_window_fk1 on growth_measurement_window gw  (cost=0.00..10.04 rows=1 width=4)
                Index Cond: (term_id = t.term_id)
                Filter: (test_window_complete_yn AND (NOT is_deleted_yn) AND ((growth_window_type)::text = 'DISTRICT'::text))
  CTE stef
    ->  Nested Loop  (cost=0.00..102.58 rows=1 width=29)
          Join Filter: ((ssef.student_id = terf.student_id) AND (ssef.grade_id = terf.grade_id))
          ->  Nested Loop  (cost=0.00..18.80 rows=3 width=28)
                ->  CTE Scan on terms t  (cost=0.00..0.02 rows=1 width=8)
                ->  Index Scan using student_school_enrollment_fact_idx2 on student_school_enrollment_fact ssef  (cost=0.00..18.74 rows=3 width=20)
                      Index Cond: ((partner_id = t.partner_id) AND (term_id = t.term_id))
                      Filter: primary_yn
          ->  Index Scan using test_event_result_fact_idx3 on test_event_result_fact terf  (cost=0.00..27.85 rows=4 width=25)
                Index Cond: ((partner_id = t.partner_id) AND (term_id = t.term_id))
                Filter: growth_event_yn

The estimates in the first CTE are correct, but in the second, the scan on 
student_school_enrollment_fact will return about 1.5 million rows, and the scan 
on test_event_result_fact actually returns about 1.1 million.  The top level 
join should return about 900K rows.  I believe the fundamental issue is that 
the CTE stef outer nested loop should be a merge join instead, but I cannot 
figure out why the optimizer is estimating one row when it has the statistics 
to correctly estimate the count.

What would cause PG to so badly estimate the row counts?

I've already regenerated the indexes and re-analyzed the tables involved.

What else can I do to find out why it's running so slowly?




Re: [PERFORM] earthdistance query performance

2013-09-25 Thread Merlin Moncure
On Wed, Sep 25, 2013 at 10:05 AM, AI Rumman rumman...@gmail.com wrote:
 Hi,

 I have a table with zip_code and latitude and longitude.

 \d zip_code_based_lng_lat
 Table public.zip_code_based_lng_lat
  Column |  Type  | Modifiers
 ++---
  zip| character varying(100) |
  state  | character varying(100) |
  city   | character varying(100) |
  type   | character varying(100) |
  lat| character varying(100) |
  lng| character varying(100) |
 Indexes:
 zip_code_based_lng_lat_zipidx btree (zip)

 I need to find the closest distance using the radius formula using a
 zip_code provided by user.

 I build the query like:

 select *,
 earth_distance(q2_c1, q1.c1) as d
 from
 (
 select *, ll_to_earth(lat::float,lng::float)  as c1  from
 zip_code_based_lng_lat
 ) as q1,
 (
 select ll_to_earth(lat::float,lng::float) q2_c1 from zip_code_based_lng_lat
 where zip='18938'
 ) as q2
 order by d
 limit 10


  Limit  (cost=216010.21..216010.24 rows=10 width=55) (actual
 time=38296.185..38296.191 rows=10 loops=1)
-  Sort  (cost=216010.21..216415.74 rows=162212 width=55) (actual
 time=38296.182..38296.182 rows=10 loops=1)
  Sort Key:
 (sec_to_gc(cube_distance((ll_to_earth((public.zip_code_based_lng_lat.lat)::double
 precision, (public.zip_code_based_lng_lat.lng)::double precision))::cube,
 (ll
 _to_earth((public.zip_code_based_lng_lat.lat)::double precision,
 (public.zip_code_based_lng_lat.lng)::double precision))::cube)))
  Sort Method: top-N heapsort  Memory: 27kB
  -  Nested Loop  (cost=0.00..212504.87 rows=162212 width=55)
 (actual time=3.244..38052.444 rows=81106 loops=1)
-  Seq Scan on zip_code_based_lng_lat  (cost=0.00..817.90
 rows=81106 width=38) (actual time=0.025..50.669 rows=81106 loops=1)
-  Materialize  (cost=0.00..0.32 rows=2 width=17) (actual
 time=0.000..0.001 rows=1 loops=81106)
  -  Index Scan using zip_code_based_lng_lat_zipidx on
 zip_code_based_lng_lat  (cost=0.00..0.31 rows=2 width=17) (actual
 time=0.080..0.084 rows=1 loops=1)
Index Cond: ((zip)::text = '18938'::text)
  Total runtime: 38296.360 ms

your problem is that the sort is happening before the limit.  you need to
reconfigure your query so that it's compatible with nearest-neighbor
search (which was introduced in 9.1).

merlin
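
A minimal sketch of one indexable formulation, using the earth_box bounding-box
approach from the earthdistance docs. This assumes lat/lng have been converted
to float8 columns (they are varchar(100) in the table above); the 100 km radius
(100000 m) is an arbitrary example that only bounds the candidate set:

CREATE INDEX zip_ll_gist ON zip_code_based_lng_lat
    USING gist (ll_to_earth(lat, lng));

SELECT z.*,
       earth_distance(ll_to_earth(z.lat, z.lng), o.c) AS d
FROM zip_code_based_lng_lat z,
     (SELECT ll_to_earth(lat, lng) AS c
      FROM zip_code_based_lng_lat
      WHERE zip = '18938') o
WHERE earth_box(o.c, 100000) @> ll_to_earth(z.lat, z.lng)
ORDER BY d
LIMIT 10;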




Re: [PERFORM] Troubleshooting query performance issues

2013-09-25 Thread bricklen
On Wed, Sep 25, 2013 at 8:58 AM, Jim Garrison jim.garri...@nwea.org wrote:

 I spent about a week optimizing a query in our performance-testing
 environment, which has hardware similar to production.

 I was able to refactor the query and reduce the runtime from hours to
 about 40 seconds, through the use of CTEs and a couple of new indexes.

 The database was rebuilt and refreshed with the very similar data from
 production, but now the query takes hours again.

 In the query plan, it is clear that the row count estimates are WAY too
 low, even though the statistics are up to date.  Here's a sample query plan:

 CTE Scan on stef  (cost=164.98..165.00 rows=1 width=38)
   CTE terms
 -  Nested Loop  (cost=0.00..62.40 rows=1 width=12)
   -  Index Scan using term_idx1 on term t  (cost=0.00..52.35
 rows=1 width=12)
 Index Cond: (partner_id = 497)
 Filter: (recalculate_district_averages_yn AND (NOT
 is_deleted_yn))
   -  Index Scan using growth_measurement_window_fk1 on
 growth_measurement_window gw  (cost=0.00..10.04 rows=1 width=4)
 Index Cond: (term_id = t.term_id)
 Filter: (test_window_complete_yn AND (NOT is_deleted_yn)
 AND ((growth_window_type)::text = 'DISTRICT'::text))
   CTE stef
 -  Nested Loop  (cost=0.00..102.58 rows=1 width=29)
   Join Filter: ((ssef.student_id = terf.student_id) AND
 (ssef.grade_id = terf.grade_id))
   -  Nested Loop  (cost=0.00..18.80 rows=3 width=28)
 -  CTE Scan on terms t  (cost=0.00..0.02 rows=1 width=8)
 -  Index Scan using student_school_enrollment_fact_idx2
 on student_school_enrollment_fact ssef  (cost=0.00..18.74 rows=3 width=20)
   Index Cond: ((partner_id = t.partner_id) AND
 (term_id = t.term_id))
   Filter: primary_yn
   -  Index Scan using test_event_result_fact_idx3 on
 test_event_result_fact terf  (cost=0.00..27.85 rows=4 width=25)
 Index Cond: ((partner_id = t.partner_id) AND (term_id =
 t.term_id))
 Filter: growth_event_yn

 The estimates in the first CTE are correct, but in the second, the scan on
 student_school_enrollment_fact will return about 1.5 million rows, and the
 scan on test_event_result_fact actually returns about 1.1 million.  The top
 level join should return about 900K rows.  I believe the fundamental issue
 is that the CTE stef outer nested loop should be a merge join instead, but
 I cannot figure out why the optimizer is estimating one row when it has the
 statistics to correctly estimate the count.

 What would cause PG to so badly estimate the row counts?

 I've already regenerated the indexes and re-analyzed the tables involved.

 What else can I do to find out why it's running so slowly?


More details about the environment would probably be helpful:
https://wiki.postgresql.org/wiki/Slow_Query_Questions
Are you able to swap out the CTE for a temp table and index that (+analyze)
to compare against the CTE version?
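
A rough sketch of that temp-table substitution for the first CTE, using the
names from the plan above (the column list and predicates are illustrative,
lifted from the "terms" CTE plan; adjust to the real query):

CREATE TEMP TABLE terms_tmp AS
SELECT t.partner_id, t.term_id
FROM term t
JOIN growth_measurement_window gw ON gw.term_id = t.term_id
WHERE t.partner_id = 497
  AND t.recalculate_district_averages_yn
  AND NOT t.is_deleted_yn
  AND gw.test_window_complete_yn
  AND NOT gw.is_deleted_yn
  AND gw.growth_window_type = 'DISTRICT';

CREATE INDEX terms_tmp_idx ON terms_tmp (partner_id, term_id);
ANALYZE terms_tmp;

-- then reference terms_tmp in the main query wherever the "terms" CTE was
-- used, so the planner sees real statistics for that intermediate result.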


Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?

2013-09-25 Thread Sam Wong
 -Original Message-
 From: pgsql-performance-ow...@postgresql.org
 [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Merlin
 Moncure
 Sent: Wednesday, September 25, 2013 23:55
 To: Claudio Freire
 Cc: Sam Wong; postgres performance list
 Subject: Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?
 
 On Wed, Sep 25, 2013 at 10:20 AM, Claudio Freire klaussfre...@gmail.com
 wrote:
  On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure mmonc...@gmail.com
 wrote:
  On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire
klaussfre...@gmail.com
 wrote:
  On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong s...@hellosam.net wrote:
  This event_log table has 4 million rows.
 
  log_id is the primary key (bigint),
 
  there is a composite index event_data_search over (event::text,
  insert_time::datetime).
 
 
  I think you need to add log_id to that composite index to get pg to
use it.
 
  hurk: OP is two statistics misses (one of them massive that are
  combing to gobsmack you).
 
  your solution unfortuantely wont work: you can't combine two range
  searches in a single index scan.  it would probably work if you it
  like this.  If insert_time is a timestamp, not a timestamptz, we can
  convert it to date to get what I think he wants (as long as his
  queries are along date boundaries).
 
 
  I was thinking an index over:
 
  (event, date_trunc('day', insert_time), log_id)
 
  And the query like
 
  SELECT min(log_id) FROM event_log
  WHERE event='S-Create' AND
  date_trunc('day',insert_time) = '2013-09-15'
 
 
  That's a regular simple range scan over the index.
 
 *) date_trunc has same problems as ::date: it is stable expression only
for
 timestamptz.  also, the index will be bigger since you're still indexing
 timestamp
 
 *) row wise comparison search might be faster and is generalized to return
N
 records, not jut one.
 
 merlin
I'm afraid it's not anything about the composite index (because query B
works fine and its plan is as expected).
BTW, the timestamp is without time zone.

I just thought of another way to demonstrate the issue.
Both queries return the same one-row result. log_id is the bigint primary
key, and event_data_search is still indexed as described.
---
Fast query
---
with Q AS (
select event
from event_log 
WHERE log_id>1 and log_id<5
order by event
)
SELECT * FROM Q LIMIT 1

Limit  (cost=2521.82..2521.83 rows=1 width=32) (actual time=88.342..88.342
rows=1 loops=1)
  Output: q.event
  Buffers: shared hit=93 read=622
  CTE q
->  Sort  (cost=2502.07..2521.82 rows=39502 width=6) (actual
time=88.335..88.335 rows=1 loops=1)
  Output: event_log.event
  Sort Key: event_log.event
  Sort Method: quicksort  Memory: 3486kB
  Buffers: shared hit=93 read=622
  ->  Index Scan using event_log_pkey on uco.event_log
(cost=0.00..1898.89 rows=39502 width=6) (actual time=13.918..65.573
rows=3 loops=1)
Output: event_log.event
Index Cond: ((event_log.log_id > 101) AND
(event_log.log_id < 105))
Buffers: shared hit=93 read=622
  ->  CTE Scan on q  (cost=0.00..237.01 rows=39502 width=32) (actual
time=88.340..88.340 rows=1 loops=1)
Output: q.event
Buffers: shared hit=93 read=622
Total runtime: 89.039 ms

---
Slow Query
---
Result  (cost=1241.05..1241.05 rows=1 width=0) (actual
time=1099.532..1099.533 rows=1 loops=1)
  Output: $0
  Buffers: shared hit=49029 read=57866
  InitPlan 1 (returns $0)
->  Limit  (cost=0.00..1241.05 rows=1 width=6) (actual
time=1099.527..1099.527 rows=1 loops=1)
  Output: uco.event_log.event
  Buffers: shared hit=49029 read=57866
  ->  Index Scan using event_data_search on uco.event_log
(cost=0.00..49024009.79 rows=39502 width=6) (actual time=1099.523..1099.523
rows=1 loops=1)
Output: uco.event_log.event
Index Cond: (uco.event_log.event IS NOT NULL)
Filter: ((uco.event_log.log_id > 101) AND
(uco.event_log.log_id < 105))
Rows Removed by Filter: 303884
Buffers: shared hit=49029 read=57866
Total runtime: 1099.568 ms
(Note: things got buffered, so it goes down to 1 second, but comparing the
buffer counts with the query above, this is a few orders of magnitude slower.)

---
The fast CTE query works: it completes with an index scan over
event_log_pkey, which is what I am expecting, and that is good.
But it's much more straightforward to write it as the slow query, and I would
expect the planner to give the same optimal plan for both.

The slow-query plan has an estimated total cost of 1241.05, and the
fast-query plan 2521.83,
hence the planner prefers the former - the scan over the
event_data_search index - but this choice doesn't make sense to me.

This is the point I want to bring up: why would the planner do that, instead
of scanning over event_log_pkey?

Looking at the estimated cost of the slow-query plan, the Index Scan node's
lower-bound estimate is 0.00. IIRC,

Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?

2013-09-25 Thread Sam Wong
 -Original Message-
 From: pgsql-performance-ow...@postgresql.org
 [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Sam Wong
 Sent: Thursday, September 26, 2013 0:34
 To: 'postgres performance list'
 Subject: Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?
 
  -Original Message-
  From: pgsql-performance-ow...@postgresql.org
  [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Merlin
  Moncure
  Sent: Wednesday, September 25, 2013 23:55
  To: Claudio Freire
  Cc: Sam Wong; postgres performance list
  Subject: Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?
 
  On Wed, Sep 25, 2013 at 10:20 AM, Claudio Freire
  klaussfre...@gmail.com
  wrote:
   On Wed, Sep 25, 2013 at 10:29 AM, Merlin Moncure
   mmonc...@gmail.com
  wrote:
   On Tue, Sep 24, 2013 at 4:56 PM, Claudio Freire
 klaussfre...@gmail.com
  wrote:
   On Tue, Sep 24, 2013 at 6:24 AM, Sam Wong s...@hellosam.net
 wrote:
   This event_log table has 4 million rows.
  
   log_id is the primary key (bigint),
  
   there is a composite index event_data_search over (event::text,
   insert_time::datetime).
  
  
   I think you need to add log_id to that composite index to get pg
   to
 use it.
  
   hurk: OP is two statistics misses (one of them massive that are
   combing to gobsmack you).
  
   your solution unfortuantely wont work: you can't combine two range
   searches in a single index scan.  it would probably work if you it
   like this.  If insert_time is a timestamp, not a timestamptz, we
   can convert it to date to get what I think he wants (as long as his
   queries are along date boundaries).
  
  
   I was thinking an index over:
  
   (event, date_trunc('day', insert_time), log_id)
  
   And the query like
  
   SELECT min(log_id) FROM event_log
   WHERE event='S-Create' AND
   date_trunc('day',insert_time) = '2013-09-15'
  
  
   That's a regular simple range scan over the index.
 
  *) date_trunc has same problems as ::date: it is stable expression
  only
 for
  timestamptz.  also, the index will be bigger since you're still
  indexing timestamp
 
  *) row wise comparison search might be faster and is generalized to
  return
 N
  records, not jut one.
 
  merlin
 I'm afraid it's not anything about composite index. (Because the query B
works
 fine and the plan is as expected) BTW, the timestamp is without timezone.
 
 I just thought of another way that demonstrate the issue.
 Both query returns the same one row result. log_id is the bigint primary
key,
 event_data_search is still that indexed.
 ---
 Fast query
 ---
Excuse me, this is the actual query.

with Q AS (
select event
from event_log 
WHERE log_id>101 and log_id<105
order by event
)
SELECT * FROM Q LIMIT 1
 
 Limit  (cost=2521.82..2521.83 rows=1 width=32) (actual time=88.342..88.342
 rows=1 loops=1)
   Output: q.event
   Buffers: shared hit=93 read=622
   CTE q
 -  Sort  (cost=2502.07..2521.82 rows=39502 width=6) (actual
 time=88.335..88.335 rows=1 loops=1)
   Output: event_log.event
   Sort Key: event_log.event
   Sort Method: quicksort  Memory: 3486kB
   Buffers: shared hit=93 read=622
   -  Index Scan using event_log_pkey on uco.event_log
 (cost=0.00..1898.89 rows=39502 width=6) (actual time=13.918..65.573
 rows=3 loops=1)
 Output: event_log.event
 Index Cond: ((event_log.log_id  101) AND
 (event_log.log_id  105))
 Buffers: shared hit=93 read=622
   -  CTE Scan on q  (cost=0.00..237.01 rows=39502 width=32) (actual
 time=88.340..88.340 rows=1 loops=1)
 Output: q.event
 Buffers: shared hit=93 read=622
 Total runtime: 89.039 ms
 
 ---
 Slow Query
 ---
And the query for this... I must have forgotten to paste it.

select min(event)
from event_log 
WHERE log_id>101 and log_id<105
 Result  (cost=1241.05..1241.05 rows=1 width=0) (actual
 time=1099.532..1099.533 rows=1 loops=1)
   Output: $0
   Buffers: shared hit=49029 read=57866
   InitPlan 1 (returns $0)
 -  Limit  (cost=0.00..1241.05 rows=1 width=6) (actual
 time=1099.527..1099.527 rows=1 loops=1)
   Output: uco.event_log.event
   Buffers: shared hit=49029 read=57866
   -  Index Scan using event_data_search on uco.event_log
 (cost=0.00..49024009.79 rows=39502 width=6) (actual
 time=1099.523..1099.523
 rows=1 loops=1)
 Output: uco.event_log.event
 Index Cond: (uco.event_log.event IS NOT NULL)
 Filter: ((uco.event_log.log_id  101) AND
 (uco.event_log.log_id  105))
 Rows Removed by Filter: 303884
 Buffers: shared hit=49029 read=57866 Total runtime:
 1099.568 ms
 (Note: Things got buffered so it goes down to 1 second, but comparing to
the
 buffer count with the query above this is a few orders slower than that)
 
 ---
 The CTE fast query works, it is completed with index scan over
 event_log_pkey, which is what I am expecting, and it is good.
 But it's much 

Re: [PERFORM] Slow plan for MAX/MIN or LIMIT 1?

2013-09-25 Thread Claudio Freire
On Wed, Sep 25, 2013 at 12:54 PM, Merlin Moncure mmonc...@gmail.com wrote:
 I was thinking an index over:

 (event, date_trunc('day', insert_time), log_id)

 And the query like

 SELECT min(log_id) FROM event_log
 WHERE event='S-Create' AND
 date_trunc('day',insert_time) = '2013-09-15'


 That's a regular simple range scan over the index.

 *) date_trunc has same problems as ::date: it is stable expression
 only for timestamptz.  also, the index will be bigger since you're
 still indexing timestamp


Ah, yes, good point.




Re: [PERFORM] Why is n_distinct always -1 for range types?

2013-09-25 Thread Josh Berkus
On 09/19/2013 01:47 PM, Josh Berkus wrote:
 Test:
 
 1. create a table with a range type column.
 2. insert 1000 identical values into that column.
 3. analyze
 4. n-distinct will still be listed as -1 (unique) for the column.
 
 Why?
 

Anyone?
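
For reference, a minimal script of the steps quoted above (object names are
arbitrary; per the report, pg_stats shows n_distinct = -1 for the column even
though every value is identical):

CREATE TABLE range_test (r int4range);
INSERT INTO range_test
SELECT int4range(1, 10) FROM generate_series(1, 1000);
ANALYZE range_test;

SELECT n_distinct FROM pg_stats
WHERE tablename = 'range_test' AND attname = 'r';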

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [PERFORM] Planner performance extremely affected by an hanging transaction (20-30 times)?

2013-09-25 Thread Josh Berkus
On 09/25/2013 12:06 AM, Jeff Janes wrote:
 Why do we need a procarraylock for this?  Seems like the solution would
 be not to take a lock at all; the information on transaction commit is
 in the clog, after all.

 
 My understanding is that you are not allowed to check the clog until after
 you verify the transaction is no longer in progress, otherwise you open up
 race conditions.

In this particular case, I'd argue that we don't care about race
conditions -- it's a plan estimate.  We certainly care about them a lot
less than lock-blocks.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [PERFORM] Planner performance extremely affected by an hanging transaction (20-30 times)?

2013-09-25 Thread Andres Freund
On 2013-09-25 00:06:06 -0700, Jeff Janes wrote:
  On 09/20/2013 03:01 PM, Jeff Janes wrote:
   3) Even worse, asking if a given transaction has finished yet can be a
   serious point of system-wide contention, because it takes the
   ProcArrayLock, once per row which needs to be checked.  So you have 20
   processes all fighting over the ProcArrayLock, each doing so 1000
  times per
   query.

That should be gone in master; we don't use SnapshotNow anymore, which
had those TransactionIdIsInProgress() calls you're probably referring
to. The lookups discussed in this thread now use the statement's
snapshot, and all of those have their own copy of the currently running
transactions.

  Why do we need a procarraylock for this?  Seems like the solution would
  be not to take a lock at all; the information on transaction commit is
  in the clog, after all.

More clog accesses would hardly improve the situation.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [PERFORM] Why is n_distinct always -1 for range types?

2013-09-25 Thread Peter Geoghegan
On Thu, Sep 19, 2013 at 1:47 PM, Josh Berkus j...@agliodbs.com wrote:
 4. n-distinct will still be listed as -1 (unique) for the column.

 Why?


Because of this:

https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/rangetypes_typanalyze.c#L205

We only collect and use histograms of lower and upper bounds for range
types, and the fraction of empty ranges.

-- 
Regards,
Peter Geoghegan




Re: [PERFORM] Planner performance extremely affected by an hanging transaction (20-30 times)?

2013-09-25 Thread Jeff Janes
On Wed, Sep 25, 2013 at 10:53 AM, Andres Freund and...@2ndquadrant.com wrote:

 On 2013-09-25 00:06:06 -0700, Jeff Janes wrote:
   On 09/20/2013 03:01 PM, Jeff Janes wrote: 3) Even worse, asking if a
   given transaction has finished yet can be a
serious point of system-wide contention, because it takes the
ProcArrayLock, once per row which needs to be checked.  So you have
 20
processes all fighting over the ProcArrayLock, each doing so 1000
   times per
query.

 That should be gone in master, we don't use SnapshotNow anymore which
 had those TransactionIdIsInProgress() calls you're probably referring
 to. The lookups discussed in this thread now use the statement's
 snapshot. And all those have their own copy of the currently running
 transactions.



See HeapTupleSatisfiesMVCC, near line 943 of tqual.c:

else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
return false;
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
HeapTupleHeaderGetXmin(tuple));


If we guarded that check by moving up line 961 to before 943:

if (XidInMVCCSnapshot(HeapTupleHeaderGetXmin(tuple), snapshot))
return false;   /* treat as still in progress */

Then we could avoid the contention, as that check only refers to local
memory.

As far as I can tell, the only downside of doing that is that, since hint
bits might be set later, it is possible some dirty pages will get written
unhinted and then re-dirtied by the hint bit setting, when more aggressive
setting would have only one combined dirty write instead.  But that seems
rather hypothetical, and if it really is a problem we should probably
tackle it directly rather than by barring other optimizations.

Cheers,

Jeff


Re: [PERFORM] Planner performance extremely affected by an hanging transaction (20-30 times)?

2013-09-25 Thread Andres Freund
On 2013-09-25 11:17:51 -0700, Jeff Janes wrote:
 On Wed, Sep 25, 2013 at 10:53 AM, Andres Freund and...@2ndquadrant.com wrote:
 
  On 2013-09-25 00:06:06 -0700, Jeff Janes wrote:
On 09/20/2013 03:01 PM, Jeff Janes wrote: 3) Even worse, asking if a
given transaction has finished yet can be a
 serious point of system-wide contention, because it takes the
 ProcArrayLock, once per row which needs to be checked.  So you have
  20
 processes all fighting over the ProcArrayLock, each doing so 1000
times per
 query.
 
  That should be gone in master, we don't use SnapshotNow anymore which
  had those TransactionIdIsInProgress() calls you're probably referring
  to. The lookups discussed in this thread now use the statement's
  snapshot. And all those have their own copy of the currently running
  transactions.

 See HeapTupleSatisfiesMVCC, near line 943 of tqual.c:
 
 else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
 return false;
 else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
 SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
 HeapTupleHeaderGetXmin(tuple));
 

Hm, sorry, misremembered things a bit there.

 If we guarded that check by moving up line 961 to before 943:
 
 if (XidInMVCCSnapshot(HeapTupleHeaderGetXmin(tuple), snapshot))
 return false;   /* treat as still in progress */
 
 Then we could avoid the contention, as that check only refers to local
 memory.

That wouldn't be correct afaics - the current TransactionIdIsInProgress
callsite is only reached when HEAP_XMIN_COMMITTED is not set. So you
would have to duplicate it.

 As far as I can tell, the only downside of doing that is that, since hint
 bits might be set later, it is possible some dirty pages will get written
 unhinted and then re-dirtied by the hint bit setting, when more aggressive
 setting would have only one combined dirty write instead.  But that seems
 rather hypothetical, and if it really is a problem we should probably
 tackle it directly rather than by barring other optimizations.

I am - as evidenced - too tired to think about this properly, but I
think you might be right here.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

