Re: [GENERAL] Partitioning and ORM tools

2016-03-29 Thread Brian Fehrle

Here is a working example of trigger-based partitioning with a view and
a 'do instead' trigger that works with ORM tools relying on the
affected-rows return (example attached).

The key things that make it work are:

1. RETURN NEW; (in the function after inserting into the partition)

2. INSTEAD OF INSERT (in the trigger)


example:
insert into data_log_view (date, thingy) values ('2015-01-02', 'test');
INSERT 0 1
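
For reference, here is a minimal sketch of the wiring that produces the
"INSERT 0 1" above. The view and function names are taken from the attached
dump (which, as archived, cuts off before any trigger definition); the
trigger name is hypothetical, and INSTEAD OF triggers on views require
PostgreSQL 9.1 or later.

CREATE VIEW data_log_view AS
    SELECT data_log_sid, date, thingy FROM data_log;

CREATE FUNCTION insert_trigger() RETURNS trigger
    LANGUAGE plpgsql AS $$
BEGIN
    -- route the row into the appropriate partition
    INSERT INTO part.data_log_2015 VALUES (NEW.*);
    -- returning NEW (not NULL) is what makes the command tag report one row
    RETURN NEW;
END;
$$;

-- the piece the truncated dump does not show: fire the function INSTEAD OF
-- the insert on the view (trigger name is hypothetical)
CREATE TRIGGER data_log_view_insert
    INSTEAD OF INSERT ON data_log_view
    FOR EACH ROW EXECUTE PROCEDURE insert_trigger();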



On 3/24/16 8:28 AM, CS DBA wrote:



On 03/23/2016 02:48 AM, Chris Travers wrote:



On Wed, Mar 23, 2016 at 9:39 AM, Chris Travers <chris.trav...@gmail.com> wrote:

Use a view with a DO INSTEAD trigger. That will allow you to
return the tuple properly.

On Tue, Mar 22, 2016 at 7:40 PM, CS DBA wrote:

Hi All;

We set up partitioning for a large table but had to back off because
the return status (i.e. "INSERT 0 1") comes back as "INSERT 0 0" when
inserting into the partitioned table, which causes the ORM tool to
assume the insert inserted 0 rows.  Is there a standard / best-practice
workaround for this?


Apologies for the top post above.

Just noting additionally that the view with DO INSTEAD approach was
suggested to me by Matt Trout (major contributor to the DBIx::Class
ORM in Perl).

I have used it.  It works well.  I think it is the best practice there.


Thanks in advance








--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP. Robust and Flexible.  No
vendor lock-in.
http://www.efficito.com/learn_more




--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No
vendor lock-in.
http://www.efficito.com/learn_more


All;

Thanks for the great ideas, I'll let you know where we end up.





Brian Fehrle, Database Administrator II | comScore, Inc. (NASDAQ:SCOR)
bfeh...@comscore.com | CO

--
-- PostgreSQL database dump
--

SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;

--
-- Name: part; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA part;


ALTER SCHEMA part OWNER TO postgres;

--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner: 
--

CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;


--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner: 
--

COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';


SET search_path = public, pg_catalog;

--
-- Name: insert_trigger(); Type: FUNCTION; Schema: public; Owner: postgres
--

CREATE FUNCTION insert_trigger() RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
INSERT INTO part.data_log_2015 VALUES (NEW.*);
RETURN NEW;
END;
$$;


ALTER FUNCTION public.insert_trigger() OWNER TO postgres;

SET default_tablespace = '';

SET default_with_oids = false;

--
-- Name: data_log; Type: TABLE; Schema: public; Owner: postgres; Tablespace: 
--

CREATE TABLE data_log (
data_log_sid integer NOT NULL,
date timestamp without time zone NOT NULL,
thingy character varying
);


ALTER TABLE data_log OWNER TO postgres;

--
-- Name: data_log_data_log_sid_seq; Type: SEQUENCE; Schema: public; Owner: 
postgres
--

CREATE SEQUENCE data_log_data_log_sid_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;


ALTER TABLE data_log_data_log_sid_seq OWNER TO postgres;

--
-- Name: data_log_data_log_sid_seq; Type: SEQUENCE OWNED BY; Schema: public; 
Owner: postgres
--

ALTER SEQUENCE data_log_data_log_sid_seq OWNED BY data_log.data_log_sid;


SET search_path = part, pg_catalog;

--
-- Name: data_log_2015; Type: TABLE; Schema: part; Owner: postgres; Tablespace: 
--

CREATE TABLE data_log_2015 (
data_log_sid integer DEFAULT 
nextval('public.data_log_data_log_sid_seq'::regclass),
date timestamp without time zone,
thingy character varying
)
INHERITS (public.data_log);


ALTER TABLE data_log_2015 OWNER TO postgres;

SET search_path = public, pg_catalog;

--
-- Name: data_log_view; Type: VIEW; Schema: public; Owner: postgres
--

CREATE VIEW data_log_view AS
 SELECT data_log.data_log_sid,
data_log.date,
data_log.thingy
   FROM data_log;


ALTER TABLE data_log_view OWNER TO postgres;

--
-- Name: data_log_sid; Type: DEFAULT; Schema: public; Owner: postgres
--

ALTER TABLE ONLY data_log ALTER COLUMN data_log_sid SET DEFAULT
nextval('data_log_data_log_sid_seq'::regclass);

Re: [GENERAL] scripted 'pg_ctl start' hangs and never finishes, goes

2012-03-28 Thread Brian Fehrle
Interestingly enough, when using a perl system() call instead of qx() or
backticks, it doesn't have this behavior. So I've got it working now
with some modifications.


I'm still interested in why I've seen this behavior: (a) from a perl qx()
call and (b) from an ssh connection from another server.


- Brian F

On 03/28/2012 04:31 PM, Brian Fehrle wrote:

Hi all,

OS: Linux 64bit
PostgreSQL Version: 9.0.5 installed from source.





- Brian F







[GENERAL] scripted 'pg_ctl start' hangs and never finishes, goes

2012-03-28 Thread Brian Fehrle

Hi all,

OS: Linux 64bit
PostgreSQL Version: 9.0.5 installed from source.

I'm writing up a process that will bring down a warm standby cluster,
tarball the data directory, then bring the warm standby back up. I'm
having an issue where starting the database with pg_ctl results in the
command never exiting. The warm standby does come back online and starts
recovering WAL files (evident in the log), but the command just does
not exit. When I Ctrl-C from the script, the database receives a "fast
shutdown".


Basic script logic:
pg_ctl -D /path/to/datadir stop -m fast

cd /path/to/datadir/
tar -czvf /backups/mydatabase.tar.gz  *

pg_ctl -D /path/to/datadir start


Originally, I was performing the 'pg_ctl start' over ssh from another 
box, but I ran into this issue and just assumed it had something to do 
with doing it over ssh. Now I'm doing it on the actual database box from 
a perl script and I've started having the same issue.


I'm testing this on a very small database, 2 megs in size. When I 
execute each event manually, it works just fine.


Actual perl code:
my $output = qx(/bin/pg_ctl -D $dataDir start 2>&1);

One last thing: while the command is 'hung', I search for a running
pg_ctl process and come back with:

[postgres@gridpoint_4 bin]$ ps aux | grep pg_ctl
postgres   601  0.0  0.0      0     0 pts/3    Z+   15:26   0:00 [pg_ctl] <defunct>

postgres   619  0.0  0.0  61180   748 pts/2S+   15:26   0:00 grep pg_ctl


Below is the log from the warm standby as the actions take place.

LOG:  received fast shutdown request
LOG:  shutting down
LOG:  database system is shut down
LOG:  database system was shut down in recovery at 2012-03-28 15:02:43 PDT
LOG:  starting archive recovery
LOG:  restored log file "00010057" from archive
LOG:  redo starts at 0/57000240
LOG:  consistent recovery state reached at 0/5800



LOG:  received fast shutdown request
LOG:  shutting down
LOG:  database system is shut down


Any thoughts on what the issue could be? This has happened in the same
environment whether I'm doing it from within perl on the actual cluster,
or over an ssh command such as  ssh user@standby "pg_ctl -D
/path/to/data/ start". What both have in common is that pg_ctl becomes a
child process of something other than my own shell; could that be the issue?


Thanks in advance,

- Brian F





Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

2011-11-01 Thread Brian Fehrle

Update on this:

We did a switchover to another machine with the same hardware; however,
this system was running with some older parameters we had set in the
postgresql.conf file.


So we went from 400 max_connections to 200, and from 160MB work_mem to
200MB. So far this other system seems to be running OK.


Other than the obvious fact that each connection has a certain amount of 
memory usage, is there something else to watch for when increasing 
connections to numbers like 400? When we had the issue of the system 
jumping to 100% cpu usage, even at that point our number of backends to 
the cluster was at MAX 250, but generally in the 175 range, so well 
below our 400 max_connections we allow. So could this be the culprit?


I'll be watching the cluster as we run on the new configuration (with 
only 200 max_connections).


- Brian F

On 10/27/2011 03:22 PM, Brian Fehrle wrote:

On 10/27/2011 02:50 PM, Tom Lane wrote:

Brian Fehrle  writes:

Hi all, need some help/clues on tracking down a performance issue.
PostgreSQL version: 8.3.11
I've got a system that has 32 cores and 128 gigs of ram. We have
connection pooling set up, with about 100 - 200 persistent connections
open to the database. Our applications then use these connections to
query the database constantly, but when a connection isn't currently
executing a query, it's idle. On average, at any given time, there are
3 - 6 connections that are actually executing a query, while the rest
are idle.
About once a day, queries that normally take just a few seconds slow way
down, and start to pile up, to the point where instead of just having
3-6 queries running at any given time, we get 100 - 200. The whole
system comes to a crawl, and looking at top, the CPU usage is 99%.

This is jumping to a conclusion based on insufficient data, but what you
describe sounds a bit like the sinval queue contention problems that we
fixed in 8.4.  Some prior reports of that:
http://archives.postgresql.org/pgsql-performance/2008-01/msg1.php
http://archives.postgresql.org/pgsql-performance/2010-06/msg00452.php

If your symptoms match those, the best fix would be to update to 8.4.x
or later, but a stopgap solution would be to cut down on the number of
idle backends.

regards, tom lane
That sounds somewhat close to the issue I am seeing. The main
differences are that my spike lasts much longer than a few minutes and
can only be resolved by restarting the cluster. Also, that second link
shows top output where much of the CPU is in 'user', rather than in
'sys' as in my case.


Is there anything more I can look at to get more info on this 'sinval
queue contention problem'?


Also, having my cpu usage high in 'sys' rather than 'us', could that 
be a red flag? Or is that normal?


- Brian F





Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

2011-10-27 Thread Brian Fehrle

On 10/27/2011 01:48 PM, Scott Marlowe wrote:

On Thu, Oct 27, 2011 at 12:39 PM, Brian Fehrle wrote:

Looking at top, I see no SWAP usage, very little IOWait, and there are a
large number of postmaster processes at 100% cpu usage (makes sense, at this
point there are 150 or so queries currently executing on the database).

  Tasks: 713 total,  44 running, 668 sleeping,   0 stopped,   1 zombie
Cpu(s):  4.4%us, 92.0%sy,  0.0%ni,  3.0%id,  0.0%wa,  0.0%hi,  0.3%si,
  0.2%st
Mem:  134217728k total, 131229972k used,  2987756k free,   462444k buffers
Swap:  8388600k total,  296k used,  8388304k free, 119029580k cached

OK, a few points.  1: You've got a zombie process.  Find out what's
causing that, it could be a trigger of some type for this behaviour.
2: You're 92% sys.  That's bad.  It means the OS is chewing up 92% of
your 32 cores doing something.  what tasks are at the top of the list
in top?

Out of the top 50 processes in top, 48 of them are postmasters, one is 
syslog, and one is psql. Each of the postmasters has a high %CPU, the 
top ones being 80% and higher, the rest anywhere between 30% and 60%. 
Would the running postmaster 'queries' contribute to the sys CPU usage, 
or should they show up under the 'us' CPU usage?




Try running vmstat 10 for a minute or so then look at cs and int
columns.  If cs or int is well over 100k there could be an issue with
thrashing, where your app is making some change to the db that
requires all backends to be awoken at once and the machine just falls
over under the load.


We've restarted the postgresql cluster, so the issue is not happening at 
this moment, but running vmstat 10 had my 'cs' averaging around 3K and 
'in' averaging around 9.5K.


- Brian F



Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

2011-10-27 Thread Brian Fehrle
Also, the database isn't restarting itself; it simply becomes 
unresponsive / slow to respond, to the point where just sshing to the 
box takes about 30 seconds if not longer. Performing a pg_ctl restart on 
the cluster resolves the issue.


I looked through the logs for any segmentation faults; none found. In 
fact, the only entries in my log that seem 'bad' are the following:


Oct 27 08:53:18  postgres[17517]: [28932839-1] 
user=,db= ERROR:  deadlock detected
Oct 27 11:49:22  postgres[608]: [19-1] user=,db= 
ERROR:  could not serialize access due to concurrent update


I don't believe these occurred too close to the slowdown.

- Brian F

On 10/27/2011 02:09 PM, Brian Fehrle wrote:

On 10/27/2011 01:48 PM, Scott Marlowe wrote:

On Thu, Oct 27, 2011 at 12:39 PM, Brian Fehrle wrote:
Looking at top, I see no SWAP usage, very little IOWait, and there 
are a
large number of postmaster processes at 100% cpu usage (makes sense, 
at this

point there are 150 or so queries currently executing on the database).

  Tasks: 713 total,  44 running, 668 sleeping,   0 stopped,   1 zombie
Cpu(s):  4.4%us, 92.0%sy,  0.0%ni,  3.0%id,  0.0%wa,  0.0%hi,  0.3%si,
  0.2%st
Mem:  134217728k total, 131229972k used,  2987756k free,   462444k 
buffers
Swap:  8388600k total,  296k used,  8388304k free, 119029580k 
cached

OK, a few points.  1: You've got a zombie process.  Find out what's
causing that, it could be a trigger of some type for this behaviour.
2: You're 92% sys.  That's bad.  It means the OS is chewing up 92% of
your 32 cores doing something.  what tasks are at the top of the list
in top?

Out of the top 50 processes in top, 48 of them are postmasters, one is 
syslog, and one is psql. Each of the postmasters has a high %CPU, the 
top ones being 80% and higher, the rest anywhere between 30% and 60%. 
Would the running postmaster 'queries' contribute to the sys CPU usage, 
or should they show up under the 'us' CPU usage?




Try running vmstat 10 for a minute or so then look at cs and int
columns.  If cs or int is well over 100k there could be an issue with
thrashing, where your app is making some change to the db that
requires all backends to be awoken at once and the machine just falls
over under the load.


We've restarted the postgresql cluster, so the issue is not happening 
at this moment, but running vmstat 10 had my 'cs' averaging around 3K 
and 'in' averaging around 9.5K.


- Brian F





Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

2011-10-27 Thread Brian Fehrle

On 10/27/2011 02:27 PM, Scott Mead wrote:



On Thu, Oct 27, 2011 at 2:39 PM, Brian Fehrle <bri...@consistentstate.com> wrote:


Hi all, need some help/clues on tracking down a performance issue.

PostgreSQL version: 8.3.11

I've got a system that has 32 cores and 128 gigs of ram. We have
connection pooling set up, with about 100 - 200 persistent
connections open to the database. Our applications then use these
connections to query the database constantly, but when a
connection isn't currently executing a query, it's idle. On
average, at any given time, there are 3 - 6 connections that are
actually executing a query, while the rest are idle.


Remember, when you read pg_stat_activity, it is showing you query 
activity from that exact specific moment in time.  Just because it 
looks like only 3-6 connections are executing, doesn't mean that 200 
aren't actually executing < .1ms statements.  With such a beefy box, I 
would see if you can examine any stats from your connection pooler to 
find out how many connections are actually getting used.


Correct, we're getting a few hundred transactions per second, but under 
normal operation, polling pg_stat_activity will show the average of 3 - 
6 queries that were running at that moment, and those queries run for an 
average of 5 - 7 seconds. So my belief is that something happens to the 
system where either a) We get a ton more queries than normal from the 
application (currently hunting down data to support this), or b) the 
overall speed of the system slows down so that all queries increase in 
time so much that polling pg_stat_activity lets me actually see them.
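
For what it's worth, that kind of sampling can be scripted against 8.3's
pg_stat_activity (a rough sketch; it only reflects the instant it runs, so it
needs to be executed repeatedly, e.g. from cron or a watch loop):

-- count backends actually running a statement right now (8.3 column names,
-- where idle backends report '<IDLE>' in current_query)
SELECT now() AS sampled_at,
       count(*) AS total_backends,
       sum(CASE WHEN current_query NOT IN ('<IDLE>', '<IDLE> in transaction')
                THEN 1 ELSE 0 END) AS active_backends
FROM pg_stat_activity;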




About once a day, queries that normally take just a few seconds
slow way down, and start to pile up, to the point where instead of
just having 3-6 queries running at any given time, we get 100 -
200. The whole system comes to a crawl, and looking at top, the
CPU usage is 99%.

Looking at top, I see no SWAP usage, very little IOWait, and there
are a large number of postmaster processes at 100% cpu usage
(makes sense, at this point there are 150 or so queries currently
executing on the database).

 Tasks: 713 total,  44 running, 668 sleeping,   0 stopped,   1 zombie
Cpu(s):  4.4%us, 92.0%sy,  0.0%ni,  3.0%id,  0.0%wa,  0.0%hi,
 0.3%si,  0.2%st
Mem:  134217728k total, 131229972k used,  2987756k free,   462444k
buffers
Swap:  8388600k total,  296k used,  8388304k free, 119029580k
cached


In the past, we noticed that autovacuum was hitting some large
tables at the same time this happened, so we turned autovacuum off
to see if that was the issue, and it still happened without any
vacuums running.

That was my next question :)


We also ruled out checkpoints being the cause.

How exactly did you rule this out?  Just because a checkpoint is over 
doesn't mean that it hasn't had a negative effect on the OS cache.  If 
you're stuck going to disk, that could be hurting you (that being 
said, you do point to a low I/O wait above, so you're probably correct 
in ruling this out).


Checkpoint settings were at the install defaults: 5 minute timeout, 0.5 
completion target, and 30s warning. Looking at the logs, we were getting 
a checkpoint every 5 minutes on the dot.


I looked at the data in pg_stat_bgwriter and noticed that buffers 
written by checkpoints are nearly 4X those written by the background 
writer. So I made some changes to get more written by the background 
writer, including increasing the checkpoint timeout to 30 minutes and 
reducing the bgwriter delay from 200ms to 50ms.
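
Those counters can be compared directly (a sketch against pg_stat_bgwriter,
which has carried these cluster-wide buffer counters since 8.3):

SELECT checkpoints_timed,
       checkpoints_req,
       buffers_checkpoint,   -- buffers written during checkpoints
       buffers_clean,        -- buffers written by the background writer
       buffers_backend       -- buffers written directly by backends
FROM pg_stat_bgwriter;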


Checkpoints now happen 30 minutes apart on the dot, and there was not a 
checkpoint in progress the last time this major slowdown occurred.




I'm currently digging through some statistics I've been gathering
to see if traffic increased at all, or remained the same when the
slowdown occurred. I'm also digging through the logs from the
postgresql cluster (I increased verbosity yesterday), looking for
any clues. Any suggestions or clues on where to look for this to
see what can be causing a slowdown like this would be greatly
appreciated.

Are you capturing table-level stats from pg_stat_user_[tables | 
indexes]?  Just because a server doesn't look busy doesn't mean that 
you're not doing 1000 index scans per second returning 1000 tuples 
each time.



I am not grabbing any of those at the moment, I'll look into those.
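
A starting point for those table-level stats might be something like the
following (a rough sketch against the standard statistics views; the ordering
and limits are arbitrary):

-- tables with the heaviest sequential-scan activity
SELECT relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch
  FROM pg_stat_user_tables
 ORDER BY seq_tup_read DESC
 LIMIT 10;

-- most heavily used indexes
SELECT relname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
  FROM pg_stat_user_indexes
 ORDER BY idx_scan DESC
 LIMIT 10;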

- Brian F

--Scott

Thanks,
   - Brian F







Re: [GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

2011-10-27 Thread Brian Fehrle

On 10/27/2011 02:50 PM, Tom Lane wrote:

Brian Fehrle  writes:

Hi all, need some help/clues on tracking down a performance issue.
PostgreSQL version: 8.3.11
I've got a system that has 32 cores and 128 gigs of ram. We have
connection pooling set up, with about 100 - 200 persistent connections
open to the database. Our applications then use these connections to
query the database constantly, but when a connection isn't currently
executing a query, it's idle. On average, at any given time, there are
3 - 6 connections that are actually executing a query, while the rest
are idle.
About once a day, queries that normally take just a few seconds slow way
down, and start to pile up, to the point where instead of just having
3-6 queries running at any given time, we get 100 - 200. The whole
system comes to a crawl, and looking at top, the CPU usage is 99%.

This is jumping to a conclusion based on insufficient data, but what you
describe sounds a bit like the sinval queue contention problems that we
fixed in 8.4.  Some prior reports of that:
http://archives.postgresql.org/pgsql-performance/2008-01/msg1.php
http://archives.postgresql.org/pgsql-performance/2010-06/msg00452.php

If your symptoms match those, the best fix would be to update to 8.4.x
or later, but a stopgap solution would be to cut down on the number of
idle backends.

regards, tom lane
That sounds somewhat close to the issue I am seeing. The main
differences are that my spike lasts much longer than a few minutes and
can only be resolved by restarting the cluster. Also, that second link
shows top output where much of the CPU is in 'user', rather than in
'sys' as in my case.


Is there anything more I can look at to get more info on this 'sinval
queue contention problem'?


Also, having my cpu usage high in 'sys' rather than 'us', could that be 
a red flag? Or is that normal?


- Brian F



[GENERAL] Server hitting 100% CPU usage, system comes to a crawl.

2011-10-27 Thread Brian Fehrle

Hi all, need some help/clues on tracking down a performance issue.

PostgreSQL version: 8.3.11

I've got a system that has 32 cores and 128 gigs of ram. We have 
connection pooling set up, with about 100 - 200 persistent connections 
open to the database. Our applications then use these connections to 
query the database constantly, but when a connection isn't currently 
executing a query, it's idle. On average, at any given time, there are 
3 - 6 connections that are actually executing a query, while the rest 
are idle.


About once a day, queries that normally take just a few seconds slow way 
down, and start to pile up, to the point where instead of just having 
3-6 queries running at any given time, we get 100 - 200. The whole 
system comes to a crawl, and looking at top, the CPU usage is 99%.


Looking at top, I see no SWAP usage, very little IOWait, and there are a 
large number of postmaster processes at 100% cpu usage (makes sense, at 
this point there are 150 or so queries currently executing on the database).


 Tasks: 713 total,  44 running, 668 sleeping,   0 stopped,   1 zombie
Cpu(s):  4.4%us, 92.0%sy,  0.0%ni,  3.0%id,  0.0%wa,  0.0%hi,  0.3%si,  
0.2%st

Mem:  134217728k total, 131229972k used,  2987756k free,   462444k buffers
Swap:  8388600k total,  296k used,  8388304k free, 119029580k cached


In the past, we noticed that autovacuum was hitting some large tables at 
the same time this happened, so we turned autovacuum off to see if that 
was the issue, and it still happened without any vacuums running.


We also ruled out checkpoints being the cause.

I'm currently digging through some statistics I've been gathering to see 
if traffic increased at all, or remained the same when the slowdown 
occurred. I'm also digging through the logs from the postgresql cluster 
(I increased verbosity yesterday), looking for any clues. Any 
suggestions or clues on where to look for this to see what can be 
causing a slowdown like this would be greatly appreciated.


Thanks,
- Brian F



Re: [GENERAL] Query performance help with 'shadow table' approach.

2011-09-14 Thread Brian Fehrle

On 09/14/2011 01:10 AM, Alban Hertroys wrote:

On 13 Sep 2011, at 23:44, Brian Fehrle wrote:


These queries basically do a 'select max(primary_key_column) from table group by 
column1, column2'. Because of the group by, we end up with a sequential 
scan of the entire table, which proves to be costly.

That seems to suggest that a row whose primary key has the max value is "special" in 
some way. Making those rows more easily distinguishable from "normal" rows seems 
like a good idea here.


Since the table has a ton of columns, I set up a smaller table that will house 
a copy of some of the data that the query uses, the primary key column, and the 
two columns I do my 'group by' on.

That's one way to distinguish these special rows from the rest. You could also 
mark them as special using an extra column and/or create an expression-based 
index over just those rows.
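
One way to read that suggestion is a marker column plus a partial index; a
purely hypothetical sketch (invented column and index names):

-- hypothetical flag, maintained by the application or a trigger, marking the
-- row holding max(primary_key) per (column1, column2)
ALTER TABLE mytable ADD COLUMN is_latest boolean NOT NULL DEFAULT false;

-- partial index covering only those "special" rows
CREATE INDEX mytable_latest_idx ON mytable (column1, column2) WHERE is_latest;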

However, especially with the below section in mind, it would appear your data 
could be normalised a bit more (your splitting off that shadow table is a step 
in doing so, in fact).

I'm also wondering, does your primary key have actual meaning? It would appear 
to just indicate the order in which the records were created (I'm assuming it's 
a serial type surrogate PK, and not a natural one).


It isn't a serial type, and the id increment is handled by the application.

This shadow table will also only contain one row for every column1 and column2 
combination (due to the group by), and for those rows, will have the max of the 
primary key. Even with this, the 'shadow' table will have about 14 million 
rows, compared to the 15 million in the main table.

Don't (column1, column2) make up a key then? I get the feeling you should split 
your table in 3 sections:
Table 1: main lookup (PK: pkey_sid)
Table 2: Variation lookup (PK: (column1, column2), FK: pkey_sid)
Table 3: Data (FK: the above)
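
A hypothetical DDL sketch of that split, just to make the shape concrete
(all data types, and every name other than pkey_sid, column1, and column2,
are invented):

-- Table 1: main lookup
CREATE TABLE main_lookup (
    pkey_sid bigint PRIMARY KEY
);

-- Table 2: variation lookup
CREATE TABLE variation_lookup (
    column1  bigint,
    column2  bigint,
    pkey_sid bigint REFERENCES main_lookup (pkey_sid),
    PRIMARY KEY (column1, column2)
);

-- Table 3: the wide payload columns, keyed by the lookups above
CREATE TABLE wide_data (
    pkey_sid bigint REFERENCES main_lookup (pkey_sid),
    column1  bigint,
    column2  bigint,
    -- ... plus the couple hundred payload columns ...
    FOREIGN KEY (column1, column2) REFERENCES variation_lookup (column1, column2)
);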

A (column1, column2) combination could occur multiple times, such as 4 
rows where column1 = 54 and column2 = 86. In those cases with multiple 
rows, I just want the one with the max(primary_key).


I'm looking into options like this, but at this moment changing the base 
table structure is out of the question; adding tables along the side to 
try to speed things up is OK. I'm trying not to cause changes in the 
application.

So the issue here comes in retrieving the needed data from my main table. The 
result is estimated at 409,600 rows, and retrieving the primary keys that are 
associated with those rows is actually really easy. However, when we take those 
409,600 rows back to the main table to retrieve the other columns I need, the 
planner just does a sequential scan, as that is most likely going to be faster 
than hitting the index and then retrieving the columns I need for all 400K+ rows.

Is that estimate accurate? If not, see Ondrej's suggestions.

That is only about 1/30th of your table. I don't think a seqscan makes sense 
here unless your data is distributed badly.

Yeah, the more I look at it, the more I think it's postgres _thinking_ 
that it's faster to do a sequential scan. I'll be playing with the 
random_page_cost that Ondrej suggested, and schedule a time when I can 
do some explain analyzes (production server and all).
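
That experiment can be done per session without touching postgresql.conf; a
sketch using the query from the original post (the value shown is just an
example):

-- try a lower random page cost for this session only (default is 4.0)
SET random_page_cost = 2;

EXPLAIN ANALYZE
SELECT T2.pkey_sid, T2.column54, T2.column44, T2.column67
  FROM public.mytable AS T2
  JOIN public.mytable_shadow AS T3 ON (T2.pkey_sid = T3.pkey_sid)
 WHERE T3.column1 >= 1072310434 AND T3.column1 <= 1074124834;

-- put the planner setting back for the rest of the session
RESET random_page_cost;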

Things to note:
1. If I reduce my where clause's range, then the sequential scan turns into an 
index scan, but sadly this can't always be done.

Does it make sense to CLUSTER your data in some sense? That would improve the 
data distribution issue and would probably push the threshold for a seqscan up 
some.

Cheers,

Alban Hertroys


Thanks, I'll be reporting back in with my next findings.

- Brian F

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.






[GENERAL] Query performance help with 'shadow table' approach.

2011-09-13 Thread Brian Fehrle

Hi all,
I've got a large table that has 15 million + rows in it, and a set 
of queries I've been trying to speed up. The table has a primary key 
column, and a couple hundred other columns.


These queries basically do a 'select max(primary_key_column) from table 
group by column1, column2'. Because of the group by, we end up with 
a sequential scan of the entire table, which proves to be costly.


Since the table has a ton of columns, I set up a smaller table that will 
house a copy of some of the data that the query uses, the primary key 
column, and the two columns I do my 'group by' on. My application is 
smart enough to update this 'shadow' table whenever the main table is 
updated, so it will accurately mirror the other table. This shadow table 
will also only contain one row for every column1 and column2 combination 
(due to the group by), and for those rows, will have the max of the 
primary key. Even with this, the 'shadow' table will have about 14 
million rows, compared to the 15 million in the main table.
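
For concreteness, the shadow table described above might look roughly like
this (the DDL and data types are assumptions; only the table and column names
come from this post and the plan below):

CREATE TABLE mytable_shadow (
    pkey_sid bigint NOT NULL,        -- max(pkey_sid) for this (column1, column2) pair
    column1  bigint NOT NULL,        -- data types assumed
    column2  bigint NOT NULL,
    PRIMARY KEY (column1, column2)   -- its index (mytable_shadow_pkey) serves the
                                     -- column1 range condition in the plan below
);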


Here is an example query that I'm working with:
postgres=# explain select T2.pkey_sid, T2.column54, T2.column44, T2.column67 FROM

public.mytable AS T2
JOIN public.mytable_shadow AS T3
ON (T2.pkey_sid = T3.pkey_sid)
WHERE T3.column1 >= 1072310434 AND T3.column1 <= 1074124834;
                                         QUERY PLAN
----------------------------------------------------------------------------------------------
 Hash Join  (cost=118310.65..2250928.27 rows=409600 width=8)
   Hash Cond: (t2.pkey_sid = t3.pkey_sid)
   ->  Seq Scan on mytable t2  (cost=0.00..2075725.51 rows=15394251 width=8)
   ->  Hash  (cost=113190.65..113190.65 rows=409600 width=8)
         ->  Bitmap Heap Scan on mytable_shadow t3  (cost=12473.65..113190.65 rows=409600 width=8)
               Recheck Cond: ((1072310434 <= column1) AND (column1 <= 1074124834))
               ->  Bitmap Index Scan on mytable_shadow_pkey  (cost=0.00..12371.25 rows=409600 width=0)
                     Index Cond: ((1072310434 <= column1) AND (column1 <= 1074124834))
(8 rows)

So the issue here comes in retrieving the needed data from my main 
table. The result is estimated at 409,600 rows, and retrieving the 
primary keys that are associated with those rows is actually really 
easy. However, when we take those 409,600 rows back to the main table to 
retrieve the other columns I need, the planner just does a sequential 
scan, as that is most likely going to be faster than hitting the index 
and then retrieving the columns I need for all 400K+ rows.


Things to note:
1. If I reduce my where clause's range, then the sequential scan turns 
into an index scan, but sadly this can't always be done.
2. I have appropriate indexes where they need to be. The issue is in the 
query planner not using them due to it (I assume) just being faster to 
scan the whole table when the data set it needs is as large as it is.
3. Without this shadow table, my query would look _something_ like this 
(the idea being: retrieve a certain set of columns from the rows with 
the max(primary key) based on my group by):
select pkey_sid, column54, column44, column67 from public.mytable 
where pkey_sid in (select max(pkey_sid) from public.mytable group by 
column1, column2);



So I need to see how I can speed this up. Is my approach misguided, or 
are there other ways I can go about it? Any thoughts, suggestions, or 
info would be greatly appreciated. I tried to explain it all clearly; 
if I can be more clear, let me know.


Thanks,
- Brian F



