[PERFORM] near identical queries have vastly different plans

2011-06-30 Thread Samuel Gendler
Here's the setup:

I'm cross joining two dimensions before left outer joining to a fact table
so that I can throw a default value into the result set wherever a value is
missing from the fact table.  I have a time dimension and a sensor dimension.
I want the cross join to cover only a subset of rows from each dimension,
so I have some filters in the ON clause of the inner join.

I've got 2 nearly identical queries that perform incredibly differently.
 The only difference between them is that the fast query uses two text
columns from the sensor dimension when filtering the inner join while the
slow query uses bigint id values, where the ids correlate to the text
strings in the fast query.  The string columns are device_name and
dpoint_name, where device_name is unique but many devices have dpoints with
the same name.  The bigint columns are device_id and dpoint_id, where both
device_id and dpoint_id map to a single row.  There are indexes on all of
them, so the only real difference is that an index on dpoint_name will have
more rows for a given value than the index on dpoint_id because dpoints with
the same name will still have different ids if attached to different
devices.  In both queries, exactly 35 rows from the sensor dimension will be
returned.  Note also that I'm aggregating fact table rows via avg() because
I have duplicate rows in the fact table, but I want to extract only a single
row for each time and sensor row.  The left outer join allows me to populate
any missing rows with a default value and the aggregation via avg() combines
duplicate rows so that they appear only once.
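
For reference, the overall shape of the query is roughly the following.  This
is only a sketch - the table and column names here are hypothetical stand-ins,
not my actual schema:

-- Sketch of the structure described above (hypothetical names).  The join
-- with no condition relating t and s is effectively a filtered cross join;
-- the left outer join pulls in fact rows where they exist, avg() collapses
-- duplicate fact rows, and COALESCE supplies the default for missing rows.
SELECT t.tstamp,
       s.device_name,
       s.dpoint_name,
       COALESCE(avg(f.value), 0) AS value
  FROM time_dim t
  JOIN sensor_dim s
    ON t.tstamp BETWEEN '2011-06-01' AND '2011-06-08'
   AND s.device_name = 'dev1'
   AND s.dpoint_name IN ('temp', 'humidity')
  LEFT OUTER JOIN fact f
    ON f.time_fk = t.local_key
   AND f.sensor_fk = s.sensor_id
 GROUP BY t.tstamp, s.device_name, s.dpoint_name;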

I can easily just use the fast query, but my concern is that without
understanding why the queries are executing differently, I might suddenly
discover my code using the slow query plan instead of the fast one at some
point in the future, even when using the varchar columns instead of the
bigint ids for filtering.  They differ in execution speed by about 5x (which
translates to many minutes), so that would be a big problem.  If I could
figure out either a query structure or an index structure which will force
the fast query plan, I'd be much happier.  So that is what I am looking for
- an explanation of how I might convince the planner to always use the fast
plan.

It's a CentOS host - quad-core AMD Opteron 1.6GHz, 2GB of RAM, single 75GB
disk with everything on it.  I'm not looking for outright performance, just
relative performance between the two queries.  DB config was taken wholesale
from pg_tune with no changes, IIRC.  It isn't a production box.  I have yet
to spec out production hardware for this application, so I don't know what
it will eventually be.

# select version();
                                   version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 8.4.5 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48), 64-bit

             name             | current_setting
------------------------------+-----------------
 checkpoint_completion_target | 0.9
 checkpoint_segments  | 64
 default_statistics_target| 100
 effective_cache_size | 1408MB
 lc_collate   | en_US.UTF-8
 lc_ctype | en_US.UTF-8
 log_directory| pg_log
 log_filename | postgresql-%a.log
 log_rotation_age | 1d
 log_rotation_size| 0
 log_truncate_on_rotation | on
 logging_collector| on
 maintenance_work_mem | 240MB
 max_connections  | 20
 max_stack_depth  | 2MB
 port | 5432
 server_encoding  | UTF8
 shared_buffers   | 480MB
 TimeZone | UTC
 wal_buffers  | 32MB
 work_mem | 48MB


time dimension looks like this:

   Column   |            Type             | Modifiers
------------+-----------------------------+-----------
 time_zone| character varying(64)   |
 tstamp   | timestamp without time zone |
 tstamptz | timestamp with time zone|
 local_key| bigint  |
 utc_key  | bigint  |
Indexes:
"idx_time_ny_local_key" btree (local_key)
"idx_time_ny_tstamp" btree (tstamp) CLUSTER
"idx_time_ny_tstamptz" btree (tstamptz)
"idx_time_ny_utc_key" btree (utc_key)

plus lots of other columns (approx 25 columns, mostly integer) that aren't
relevant to this query.  It has 420,480 rows, where each row is 300 seconds
after the previous row.  local_key and utc_key are bigint columns in the
form yyyyMMddHHmm (utc_key in UTC time and the other in local time for the
time zone represented by the table), and tstamp is the same value as an
actual timestamp type.  tstamptz is just the same timestamp stored as a
timestamp with time zone.
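
As an aside, assuming the key format above, the bigint keys can be derived
directly from the timestamp columns; a sketch (the table name here is a
guess based on the index names):

-- Sketch only: derive yyyyMMddHHmm-style bigint keys from the timestamps.
-- "time_ny" is a guessed table name; the columns are as shown above.
SELECT to_char(tstamp, 'YYYYMMDDHH24MI')::bigint                      AS local_key,
       to_char(tstamptz AT TIME ZONE 'UTC', 'YYYYMMDDHH24MI')::bigint AS utc_key
  FROM time_ny
 LIMIT 5;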

Re: [PERFORM] near identical queries have vastly different plans

2011-06-30 Thread Samuel Gendler
On Thu, Jun 30, 2011 at 1:53 AM, Samuel Gendler
wrote:

> If I could figure out either a query structure or an index structure which
> will force the fast query plan, I'd be much happier.  So that is what I am
> looking for - an explanation of how I might convince the planner to always
> use the fast plan.
>
>
For the record, "set enable_nestloop=false" does force a more effective plan
when using the 'slow' query.  It is not quite identical in structure - it
materializes the other side of the query, resulting in about 10% less
performance - but it is close enough that I'm tempted to disable nestloop
whenever I run the query in the hope that it will prevent the planner from
switching to the really awful plan.  I know that's kind of a drastic
measure, so hopefully someone out there will suggest a config fix which
accomplishes the same thing without requiring special handling for this
query, but at least it works (for now).
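
If I do end up relying on that, the override can at least be scoped to just
this one query rather than the whole session - a sketch:

-- Sketch: SET LOCAL reverts at commit, so nothing else on the connection
-- is affected by the planner override.
BEGIN;
SET LOCAL enable_nestloop = off;
-- ... run the report query here ...
COMMIT;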

Incidentally, upgrading to 9.0.x is not out of the question if it is
believed that doing so might help here.  I'm only running 8.4 because I've
got another project in production on 8.4 and I don't want to have to deal
with running both versions on my development laptop.  But that's a pretty
weak reason for not upgrading, I know.

--sam


Re: [PERFORM] is parallel union all possible over dblink?

2011-06-30 Thread Greg Spiegelberg
On Thu, Jun 30, 2011 at 3:02 AM, Svetlin Manavski <
svetlin.manav...@gmail.com> wrote:

> I am now a bit puzzled after the initial satisfaction by Marinos' reply.
>
> 1. What do you mean exactly by "to ensure your UNION succeeds"?  The dblink
> docs do not mention any issues with using the suggested
> dblink_send_query() + dblink_get_result() directly.  What problems should I
> expect when using them as suggested by Marinos?
>
>
Admittedly, I hadn't used those specific dblink functions and imagined that
dblink_get_result() would fail if the query on the connection wasn't
finished.  After some experimentation, it appears that it's perfectly happy
to block until the query finishes executing.
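
For anyone following along, the pattern in question looks roughly like this
(a sketch - the connection strings and remote queries are made up):

-- Sketch: fire queries at two remote nodes asynchronously, then UNION ALL
-- the results.  dblink_get_result() blocks until each remote query is done.
SELECT dblink_connect('node1', 'host=db1 dbname=app');
SELECT dblink_connect('node2', 'host=db2 dbname=app');

SELECT dblink_send_query('node1', 'SELECT id, total FROM sales_2011_05');
SELECT dblink_send_query('node2', 'SELECT id, total FROM sales_2011_06');

SELECT * FROM dblink_get_result('node1') AS t(id bigint, total numeric)
UNION ALL
SELECT * FROM dblink_get_result('node2') AS t(id bigint, total numeric);

SELECT dblink_disconnect('node1');
SELECT dblink_disconnect('node2');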



> 2. If I understand your method correctly, it is not applicable from inside
> a stored procedure, is it? I need to keep all the business logic within
> PostgreSQL and provide just a clean interface to a simple GUI layer.
>
>
Then dblink is your answer.  My suggestion applies if you were implementing
a solution in the application.



> 3. Unfortunately, GridSQL and Pgpool-II do not seem to be mature, stable
> products suitable for use in commercial software. Neither one provides clear
> documentation. GridSQL has been discontinued and it is not clear what kind
> of future it will have. I have not tried GridSQL, but I did try Pgpool-II.
> It is disappointing that it may stop working correctly just because of the
> way you write the query (e.g. using uppercase in a field name or a named
> field in GROUP BY, etc.). Even worse, when it does not recognize something
> in the parallel query, it just returns an incorrect result (from only the
> local DB) rather than raising an exception. So I guess Pgpool-II in its
> current state is good only for very simple applications that are not
> expected to be reliable at all.
>
>
I don't think GridSQL is discontinued.  It appears, though, that
EnterpriseDB has open-sourced it and moved it to
http://sourceforge.net/projects/gridsql/.  The project isn't incredibly
active, but there has been some activity as recently as last month.

Sorry for the confusion.

Greg


Re: [PERFORM] Poor performance when joining against inherited tables

2011-06-30 Thread Robert Haas
On Wed, May 11, 2011 at 4:47 PM, Lucas Madar  wrote:
> On 05/11/2011 09:38 AM, Robert Haas wrote:
>>>
>>> However, if I disable seqscan (set enable_seqscan=false), I get the
>>> following plan:
>>>
>>>                                   QUERY PLAN
>>> ----------------------------------------------------------------------------
>>>  Hash Join  (cost=10001298843.53..290002337961.71 rows=8643757 width=506)
>>>   Hash Cond: (f.id = objects.id)
>>>   ->    Append  (cost=100.00..29536334.43 rows=8643757
>>> width=20)
>>>         ->    Seq Scan on item f  (cost=100.00..126.30
>>> rows=1630 width=20)
>>>         ->    Index Scan using xxx_pkey on itemXX f  (cost=0.00..10.60
>>> rows=90
>>> width=20)
>>>         ->    Index Scan using yyy_pkey on itemYY f  (cost=0.00..25.24
>>> rows=266 width=20)
>>>         ->    Index Scan using zzz_pkey on itemZZ f  (cost=0.00..9.28
>>> rows=2
>>> width=20)
>>>         ...
>>>   ->    Hash  (cost=999347.17..999347.17 rows=3941949 width=490)
>>>         ->    Index Scan using objects_pkey on objects
>>> (cost=0.00..999347.17
>>> rows=3941949 width=490)
>>>
>>> This seems like a much more sensible query plan.
>>
>> I don't think so.  Scanning the index to extract all the rows in a
>> table is typically going to be a lot slower than a sequential scan.
>>
>
> Compared to the previous query plan (omitted in this e-mail, in which the
> planner was scanning all the item tables sequentially), the second query is
> much more desirable. It takes about 12 seconds to complete, versus the other
> query which I canceled after six hours. However, what you propose seems to
> make even more sense.

I was just looking at this email again, and had another thought:
perhaps the tables in question are badly bloated.  In your situation,
it seems that the plan didn't change much when you set
enable_seqscan=off: it just replaced full-table seq-scans with
full-table index-scans, which should be slower.  But if you have a
giant table that's mostly empty space, then following the index
pointers to the limited number of blocks that contain any useful data
might be faster than scanning all the empty space.  If you still have
these tables around somewhere, you could test this hypothesis by
running CLUSTER on all the tables and see whether the seq-scan gets
faster.
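
If you want to try that, it would look something like the following (a
sketch - substitute your real table and index names, which I'm guessing from
the plan above; CLUSTER ... USING needs 8.3 or later):

-- Sketch: rewrite each child table in index order, reclaiming dead space,
-- then refresh planner statistics.
CLUSTER itemXX USING xxx_pkey;
CLUSTER itemYY USING yyy_pkey;
CLUSTER itemZZ USING zzz_pkey;
ANALYZE;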

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [PERFORM] near identical queries have vastly different plans

2011-06-30 Thread Samuel Gendler
On Thu, Jun 30, 2011 at 1:53 AM, Samuel Gendler
wrote:

> If I could figure out either a query structure or an index structure which
> will force the fast query plan, I'd be much happier.  So that is what I am
> looking for - an explanation of how I might convince the planner to always
> use the fast plan.
>
>
I eventually noticed that constraint_exclusion didn't seem to be working and
remembered that it only works when the filter is on the partitioned table
itself, not when the table is being filtered via a join.  Adding a WHERE
clause that limits f.time_fk to the appropriate range not only fixed the
constraint_exclusion behaviour, but also caused the planner to produce the
same plan for both versions of the query - a plan an order of magnitude
faster than the previous fastest plan.  It went from 20 seconds to just
under 2 seconds.
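
Concretely, the change amounts to restating the time range directly against
the partitioned fact table.  A sketch, using the same hypothetical names as
the sketch in my first message, with the range pushed into a subquery so the
padded rows from the outer join are preserved (the bounds shown are
placeholders for the range already implied by the time filter):

-- Sketch: the WHERE clause inside the subquery sits directly on the
-- partitioned fact table, so constraint_exclusion can prune partitions.
SELECT t.tstamp,
       s.device_name,
       s.dpoint_name,
       COALESCE(avg(f.value), 0) AS value
  FROM time_dim t
  JOIN sensor_dim s
    ON t.tstamp BETWEEN '2011-06-01' AND '2011-06-08'
   AND s.device_name = 'dev1'
   AND s.dpoint_name IN ('temp', 'humidity')
  LEFT OUTER JOIN (
        SELECT time_fk, sensor_fk, value
          FROM fact
         WHERE time_fk BETWEEN 201106010000 AND 201106080000
       ) f
    ON f.time_fk = t.local_key
   AND f.sensor_fk = s.sensor_id
 GROUP BY t.tstamp, s.device_name, s.dpoint_name;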