Re: [PERFORM] Parallel Select query performance and shared buffers

2013-12-04 Thread Amit Kapila
On Wed, Dec 4, 2013 at 11:49 PM, Metin Doslu  wrote:
> Here are some extra information:
>
> - When we increased NUM_BUFFER_PARTITIONS to 1024, this problem is
> disappeared for 8 core machines and come back with 16 core machines on
> Amazon EC2. Would it be related with PostgreSQL locking mechanism?

  I think here there is a good chance of improvement with the patch
suggested by Andres in this thread, but
  still i think it might not completely resolve the current problem as
there will be overhead of associating data
  with shared buffers.

  Currently NUM_BUFFER_PARTITIONS is fixed, so may be auto tuning it
based on some parameter's can
  help such situations.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Not same plan between static and prepared query

2013-06-10 Thread Amit Kapila
On Sunday, June 09, 2013 8:45 PM Yuri Levinsky wrote:
> Amit,
> It's very strength for me to hear that PostgreSQL generate execution
> plan for prepared statements during execution, I always was thinking
> that the purpose of the prepared statement is to eliminate such
> behavior. 

It doesn't always choose to generate a new plan, rather it is a calculative
decision.
As far as I understand, it generates custom plan (based on bound parameters)
for 5 times and then generates generic plan (not based on bound parameters),
after that it compares that if the cost of generic plan is less than 10%
more expensive than average custom plan, then it will choose generic plan.

> Can it lead to  some performance degradation in case of heavy
> "update batch", that can run for millions of different values? 

Ideally it should not degrade performance.
What kind of update you have and does the values used for execute can vary
plan too much every time?

> Is it
> some way to give some kind of query hint that will eliminate execution
> path recalculations during heavy updates and instruct regarding correct
> execution plan?

Currently there doesn't exist any way to give any hint.
 
> Sincerely yours,
> 
> 
> Yuri Levinsky, DBA
> Celltick Technologies Ltd., 32 Maskit St., Herzliya 46733, Israel
> Mobile: +972 54 6107703, Office: +972 9 9710239; Fax: +972 9 9710222
> 
> -Original Message-
> From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-
> ow...@postgresql.org] On Behalf Of Amit Kapila
> Sent: Thursday, June 06, 2013 1:41 PM
> To: 'Ghislain ROUVIGNAC'; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Not same plan between static and prepared query
> 
> 
> On Thursday, June 06, 2013 1:56 PM Ghislain ROUVIGNAC wrote:
> > Hello,
> 
> 
> > We have a strange issue related to a prepared statement.
> 
> 
> > We have two equals queries where the sole difference is in the limit.
> > - The first is hard coded with limit 500.
> > - The second is prepared with limit $1 ($1 is bound to 500).
> 
> 
> > PostgreSQL give us two different plans with a huge execution time for
> > the
> prepared query:
> 
> It can generate different plan for prepared query, because optimizer
> uses default selectivity in case of bound parameters (in your case
> limit $1).
> 
> 
> > We met the same behaviour with both :
> > - PostgreSQL 8.4.8 on Windows 2008 (Prod)
> > - PostgreSQL 8.4.8 and 8.4.17 on Windows 7 (Dev)
> 
> From PostgreSQL 9.2, it generates plan for prepared query during
> execution (Execute command) as well.
> So I think you will not face this problem in PostgreSQL 9.2 and above.
> 
> With Regards,
> Amit Kapila.
> 
> 
> 
> --
> Sent via pgsql-performance mailing list (pgsql-
> performa...@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
> 
> This mail was received via Mail-SeCure System.
> 
> 
> 
> 
> --
> Sent via pgsql-performance mailing list (pgsql-
> performa...@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Not same plan between static and prepared query

2013-06-06 Thread Amit Kapila

On Thursday, June 06, 2013 1:56 PM Ghislain ROUVIGNAC wrote:
> Hello,


> We have a strange issue related to a prepared statement.


> We have two equals queries where the sole difference is in the limit.
> - The first is hard coded with limit 500.
> - The second is prepared with limit $1 ($1 is bound to 500).


> PostgreSQL give us two different plans with a huge execution time for the
prepared query:

It can generate different plan for prepared query, because optimizer uses
default selectivity in case of bound parameters (in your case limit $1).


> We met the same behaviour with both :
> - PostgreSQL 8.4.8 on Windows 2008 (Prod)
> - PostgreSQL 8.4.8 and 8.4.17 on Windows 7 (Dev)

>From PostgreSQL 9.2, it generates plan for prepared query during execution
(Execute command) as well.
So I think you will not face this problem in PostgreSQL 9.2 and above.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance bug in prepared statement binding in 9.2?

2013-05-31 Thread Amit Kapila
On Thursday, May 30, 2013 11:36 PM Josh Berkus wrote:
> Amit,
> 
> > I think it might be because of choosing custom plan option due to
> which it might be generating new plan during exec_bind_message().
> > exec_bind_message()->GetCachedPlan()->choose_custom_plan(). If it
> chooses custom plan, then it will regenerate the plan which can cause
> extra cost
> > observed in test.
> > Though there is calculation that it should not choose custom plan
> always, but still I guess the variation observed in the test can be due
> to this reason.
> 
> This is why I'm asking them to run tests on 9.1.  If 9.1 doesn't
> exhibit
> this behavior, then customplan is liable to be at fault.
> 
> HOWEVER, that doesn't explain why creating a plan for a query during
> application operation would take 80ms, but only 1.2ms when I do it
> interactively.

When you say interactively, does it mean that you are using psql to test the 
same?

> FYI, per questions from IRC: the times for each "cycle" in my data are
> cumulative minutes.  Each cycle runs around 500,000 queries, so that's
> the aggregate across all queries.

Today I tried to see the changes between 8.4 and 9.1 for bind path in server. 
Following is summary of whatever I could see the differences

1. 4 new parameters are added to ParamListInfo, for which palloc is done in 
exec_bind_message
2. changed function for converting client to server encoding, but it seems for 
bind path, it will still follow same path as for 8.4
2. small change in RevalidateCachedPlan() for new hook added in ParamListInfo
3. standard_ExecutorStart(), changes to setup After Statement Trigger context
4. InitPlan has some changes for FOR UPDATE/FOR SELECT statements and junk 
filter case (update/delete statements)

>From the changes, it doesn't seem that any of such changes can cause the 
>problem you have seen.

Do you think it can be due to
a. JDBC - communication, encoding or some other changes
b. can we assume that plans generated for all statements are same, if not it 
might have some cost for query plan initialization (InitPlan) but again it 
should not be that big cost.

How do measure  individual bind time cost (is the cost of only server side or 
it includes client bind or ..)?

With Regards,
Amit Kapila.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance bug in prepared statement binding in 9.2?

2013-05-30 Thread Amit Kapila
On Thursday, May 30, 2013 5:45 AM Josh Berkus wrote:
> Folks,
> 
> I'm seeing what may be a major performance bug in BIND in 9.2.4.
> 
> We have a client who has an application which uses
> Tomcat+Hibernate+JDBC.  They are in the process of upgrading this
> application from 8.4.17 to 9.2.4.  As part of this, they have been
> doing
> performance testing, and 9.2 is coming out MUCH worse than 8.4.  The
> problem appears to be bind/plan time.
> 
> Their application does not use prepared queries usefully, doing
> parse,bind,execute on every query cycle.
> 
> Here's timings overall for 29 test cycles (cycle 1 has been omitted).
> As you can see, parse+execute times are pretty much constant, as are
> application think times, but bind times vary quite a lot.  In 8.4, the
> 29 cycles are constantly 4.5min to 5.75min long.  In 9.2, which is the
> chart below, they are all over the place.

I think it might be because of choosing custom plan option due to which it 
might be generating new plan during exec_bind_message().
exec_bind_message()->GetCachedPlan()->choose_custom_plan(). If it chooses 
custom plan, then it will regenerate the plan which can cause extra cost
observed in test.
Though there is calculation that it should not choose custom plan always, but 
still I guess the variation observed in the test can be due to this reason.

To test if this is the cause, we might hack the code such that it always 
chooses generic plan, so that it doesn't need to generate plan again.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very slow inner join query Unacceptable latency.

2013-05-23 Thread Amit Kapila

On Thursday, May 23, 2013 10:51 PM fburgess wrote:
> serverdb=# set enable_hashjoin=off;
> SET
> serverdb=# explain select count(*) as y0_ from SARS_ACTS this_ inner join 
> SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.ID where tr1.ALGORITHM='SMAT';

> QUERY PLAN
> --
> Aggregate  (cost=7765563.69..7765563.70 rows=1 width=0) 
>  Nested Loop  (cost=0.00..776.35 rows=3336 width=0)
> -> Index Scan using idx_sars_acts_run_algorithm on sars_acts_run tr1_ 
>  (cost=0.00..44.32 rows=650 width=8) 
>  Index Cond:  ((algorithm)::text = 'SMAT'::text)
> -> Index Scan using idx_sars_acts_run_id_end_time on sars_acts this_  
> (cost=0.00..11891.29 rows=4452 width=8) 
>  Index Cond:  (SARS_RUN_ID=tr1_.ID)
>(6 rows)

>serverdb=# \timing
>TIming is on.

>serverdb=# select count(*) as y0_ from SARS_ACTS this_ inner join 
>SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.ID where tr1.ALGORITHM='SMAT';
> y0_
>--
>1481710
>(1 row)

> Time: 85069.416 ms < 1.4 minutes <-- not great, but much better!

> Subsequently, runs in the milliseconds once cached.

If I see the plan from your other mail as below where Hash join is selected, 
the cost of Nested Loop is much more, that is the reason why optimizer would 
have selected 
Hash Join. 

serverdb=# explain analyze select count(*) as y0_ from SARS_ACTS this_ inner 
join SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.ID where 
tr1_.ALGORITHM='SMAT';
QUERY PLAN
--
Aggregate  (cost=3983424.05..3983424.06 rows=1 width=0) (actual 
time=1358298.003..1358298.004 rows=1 loops=1)
  -> Hash Join  (cost=44.93..3983415.81 rows=3297 width=0) (actual 
time=2593.768..1358041.205 rows 1481710 loops=1)


It is quite surprising that after optimizer decided the cost of some plan (Hash 
Join) to be lower but actual execution cost of same is more. 
There might be some problem with cost calculation model of Hash Join for some 
cases.

By the way which version of PostgreSQL you are using?

> But what negative impact is disabling hash joins?

I think using it as a temporary fix might be okay, but keeping such code in 
your application might be risky for you, because as the data changes in your 
tables, it could be quite possible that
in future Hash Join might be the best and cheapest way.

Can you try reproducing it with small data or else can you attach your schema 
and data for the tables/indexes used in query?

With Regards,
Amit Kapila.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very slow inner join query Unacceptable latency.

2013-05-22 Thread Amit Kapila
On Wednesday, May 22, 2013 10:03 PM fburgess wrote:

> I did perform a explain analyze on the query.

Explain analyze doesn't help to collect statistics. You should use Analyze 
.

Ideally optimizer should have slected the best plan, but just to check you can 
once try with

SET enable_hashjoin=off;

And see what is the plan it chooses and does it pick up index scan on larger 
table?

Could you please output of \d SARS_ACTS and \d SARS_ACTS_RUN?


With Regards,
Amit Kapila.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Very slow inner join query Unacceptable latency.

2013-05-21 Thread Amit Kapila
On Wednesday, May 22, 2013 3:24 AM fburgess wrote:

> The SARS_ACTS table currently has 37,115,515 rows

> we have indexed: idx_sars_acts_acts_run_id ON SARS_ACTS USING btree 
> (sars_run_id)
> we have pk constraint on the SARS_ACTS_RUN table; sars_acts_run_pkey PRIMARY 
> KEY (id )

> serverdb=# explain select count(*) as y0_ from SARS_ACTS this_ inner join 
> SARS_ACTS_RUN tr1_ on this_.SARS_RUN_ID=tr1_.ID where tr1_.ALGORITHM='SMAT';
>QUERY PLAN
> --
> Aggregate  (cost=4213952.17..4213952.18 rows=1 width=0)
>  -> Hash Join  (cost=230573.06..4213943.93 rows=3296 width=0)
>   Hash Cond:  (this_.SARS_RUN_ID=tr1_.ID)
>   ->  Seq Scan om sars_acts this_  (cost=0.00..3844241.84 rows=37092284 
> width=8)
>   ->  Hash  (cost=230565.81..230565.81 rows=580 width=8)
>  -> Seq Scan on sars_acts_run tr1_  (cost=0.00..230565.81 
> rows=580 width=8)
>   Filter:  ((algorithm)::text = 'SMAT'::text)
> (7 rows)



> This query executes in approximately 5.3 minutes to complete, very very slow, 
> our users are not happy.

> I did add an index on SARS_ACTS_RUN.ALGORITHM column but it didn't improve 
> the run time. 
> The planner just changed the "Filter:" to an "Index Scan:" improving the cost 
> of the Seq Scan 
> on the sars_acts_run table, but the overall run time remained the same. It 
> seems like the bottleneck 
> is in the Seq Scan on the sars_acts table.

>  -> Seq Scan on sars_acts_run tr1_  (cost=0.00..230565.81 
> rows=580 width=8)
>   Filter:  ((algorithm)::text = 'SMAT'::text)

> Does anyone have suggestions about how to speed it up?

Could you please once trying Analyzing both tables and then run the query to 
check which plan it uses:

Analyze SARS_ACTS;
Analyze SARS_ACTS_RUN;


With Regards,
Amit Kapila.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow query even with aggressive auto analyze

2013-02-09 Thread Amit kapila
On Friday, February 08, 2013 6:06 PM Karolis Pocius wrote:


> I've tried changing autovacuum_analyze_scale_factor as well as setting
> job_batches table to auto analyze every 500 changes (by setting scale
> factor to 0 and threshold to 500), but I still keep running into that
> issue, sometimes minutes after the table was analyzed.

> I checked pg_locks to see if anything had granted=false, but that
>doesn't seem to be the case.

> This issue is occurring on two separate instances 9.0.4 and 9.1.4 - both
> have nearly identical settings, just run on a different hardware.

> Config changes http://pgsql.privatepaste.com/8acfb9d136

> Any ideas what is going wrong here?

I think you can verify in Logs whether analyze is happening as per your 
expectation.
You can set log_autovacuum_min_duration = 0, so that auto_analyze can be logged 
everytime it happens.

With Regards,
Amit Kapila.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance