Re: [GENERAL] Question About Roles

2014-07-02 Thread Gregory Haase
On Wed, Jul 2, 2014 at 7:50 AM, Rich Shepard 
wrote:

> On Wed, 2 Jul 2014, David G Johnston wrote:
>
>  or if you want to do it as part of creating a new user:
>>
>> CREATE ROLE new_management_user
>> [other stuff here]
>> IN ROLE management;
>>
>> http://www.postgresql.org/docs/9.2/interactive/sql-createrole.html
>>
>
> David,
>
>   I'll have to think deeply about what this is doing. Initially, I read it
> as assigning a new user's role to an existing group's role; that's the
> opposite of what I want.
>
> Thanks,
>
> Rich
>
>
If it makes you feel better, remember that CREATE USER is an alias for
CREATE ROLE that includes LOGIN by default. So if you simply swap one word
in your command, what you want becomes a little clearer:

CREATE USER new_management_user
[other stuff here]
IN ROLE management;
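
For illustration only (the second user name and the passwords below are made
up, not from this thread), a minimal sketch: the IN ROLE clause and a
separate GRANT are two equivalent ways to add the member, and the GRANT form
reads in the direction Rich expected:

-- the group role
CREATE ROLE management NOLOGIN;

-- membership granted at creation time:
CREATE USER new_management_user PASSWORD 'secret' IN ROLE management;

-- or granted afterwards:
CREATE USER another_management_user PASSWORD 'secret';
GRANT management TO another_management_user;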


-Greg Haase


Re: [GENERAL] json datatype and table bloat?

2013-11-04 Thread Gregory Haase
While I agree that an equality operator doesn't really make sense for json
operationally, there are certain maintenance reasons why it may come in
handy. Removing duplicate records comes to mind.

Other than adding basic stats to necessary columns, I would say that the
equality operator is really one of the most basic tenets of a relational
database and should probably exist for any data type - regardless of its
given usefulness.
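
As a hedged workaround sketch (not from the thread), using the
bad_table_json table defined in the quoted mail below: since json has no
equality or ordering operator here, you can compare the text rendering
instead, with the caveat that textually different json can still be
semantically equal.

-- Remove duplicate rows whose json payload renders to identical text,
-- keeping one arbitrary survivor per (id, detail_type, details) group.
DELETE FROM bad_table_json
 WHERE ctid IN (
       SELECT ctid
         FROM (SELECT ctid,
                      row_number() OVER (PARTITION BY id, detail_type,
                                                      details::text) AS rn
                 FROM bad_table_json) d
        WHERE d.rn > 1);

-- The same cast also lets the quoted MAX()-based pivot run:
SELECT id
      ,MAX(CASE WHEN detail_type = 'a' THEN details::text END)::json AS a
      ,MAX(CASE WHEN detail_type = 'b' THEN details::text END)::json AS b
      ,MAX(CASE WHEN detail_type = 'c' THEN details::text END)::json AS c
  FROM bad_table_json
 GROUP BY id;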

Greg Haase
On Nov 4, 2013 6:31 PM, "ajeli...@gmail.com"  wrote:

> >>> Along the lines of the equality operator; I have run into issues trying
> >>> to pivot a table/result set with a json type due to what seemed to be no
> >>> equality operator.
> >>
> >> For the curious, and also use-case considerations for development, would
> >> you be able to share what it is you are doing (and how) that combines
> >> full json documents with pivoting?
> >>
> >> Compound types holding source data for a pivot seem problematic since
> >> generally all the pivot components are single-valued and, for data,
> >> often numerical.
>
> >would also like to see this. json type has completely displaced
> >crosstab in my usage. I don't typically pivot json though: I pivot the
> >raw data then transform to json.  With limited exceptions I consider
> >storing json in actual table rows to be an anti-pattern (but it should
> >still work if you do it).
>
> I could not figure out what I was doing last month to reproduce this. So I
> did a small pivot poc, and it is erroring on the max function. So it is
> probably not the same issue. My guess is I tried using the GREATEST
> function as a hail mary (which would not have worked) and got the following
> message: ERROR:  could not identify a comparison function for type json and
> then thought/hoped it was the same thing when reading the emails.
>
> CREATE TABLE bad_table_json(id int, detail_type text, details json);
> INSERT INTO bad_table_json values(1, 'a', '{"a":1}'::json);
> INSERT INTO bad_table_json values(1, 'b', '{"a":1}'::json);
> INSERT INTO bad_table_json values(1, 'c', '{"a":1}'::json);
>
> SELECT id
>   ,MAX(CASE WHEN detail_type = 'a' THEN details END) AS a
>   ,MAX(CASE WHEN detail_type = 'b' THEN details END) AS b
>   ,MAX(CASE WHEN detail_type = 'c' THEN details END) AS c
>   FROM bad_table_json
>  GROUP BY id
>


Re: [GENERAL] json datatype and table bloat?

2013-11-01 Thread Gregory Haase
I spent some more time on this today, and I realized that the issue isn't
that there are records in the toast table. The issue is that there are NO
records in the toast table. Apparently, all the json values we are inserting
are too small to get toasted.

I set up a separate benchmark locally:
create table test_json_stats
(
test_json_stats_id serial,
json_data json,
insert_timestamp timestamp default now() not null
);

created a file called "json_bench" with the following:
BEGIN;
insert into test_json_stats (json_data) values ('{"counters": [ {
"first":"1","second":"2"}, { "first":"3","second":"4"}, {
"first":"5","second":"6"}, { "first":"7","second":"8"}, {
"first":"9","second":"10"}, { "first":"11","second":"12"}, {
"first":"13","second":"14"}, { "first":"15","second":"16"}, {
"first":"17","second":"18"}, { "first":"19","second":"20"}, {
"first":"21","second":"22"}, { "first":"23","second":"24"}, {
"first":"25","second":"26"}, { "first":"27","second":"28"}, {
"first":"29","second":"30"}, { "first":"31","second":"32"}, {
"first":"33","second":"34"}, { "first":"35","second":"36"}, {
"first":"37","second":"38"}, { "first":"39","second":"40"}, {
"first":"41","second":"42"}, { "first":"43","second":"44"}, {
"first":"45","second":"46"} ] }');
END;

Then ran pgbench:
pgbench -c 5 -t 200 -f json_bench greg

Vacuum verbose shows that the test_json_stats table has over a million pages
and that the toast table exists with zero pages:
INFO:  vacuuming "public.test_json_stats"
INFO:  "test_json_stats": found 0 removable, 0 nonremovable row versions in
0 out of 1010011 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.01u sec elapsed 0.01 sec.
INFO:  vacuuming "pg_toast.pg_toast_51822"
INFO:  index "pg_toast_51822_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_51822": found 0 removable, 0 nonremovable row versions in
0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

The json_data column is not accounted for in pg_stats:
select attname, null_frac, avg_width from pg_stats where tablename =
'test_json_stats';
      attname       | null_frac | avg_width
--------------------+-----------+-----------
 test_json_stats_id |         0 |         4
 insert_timestamp   |         0 |         8
(2 rows)

So I'm not sure if I'd actually qualify this as a "bug", but it appears
that there is currently no way to get stats on a json data type.

I subsequently inserted a very large json value into the table that consumed
2 pages in pg_toast_51822, but there still doesn't appear to be any way to
get stats on the column.
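
In the meantime, a minimal sketch of where to look instead of pg_stats
(assuming the test table above): pg_class tracks page counts for both the
main relation and its toast relation, even when the planner stats are
missing.

SELECT c.relname,
       c.relpages,
       pg_size_pretty(pg_relation_size(c.oid)) AS heap_size,
       t.relname  AS toast_relname,
       t.relpages AS toast_relpages
  FROM pg_class c
  LEFT JOIN pg_class t ON t.oid = c.reltoastrelid
 WHERE c.relname = 'test_json_stats';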

Greg Haase




On Tue, Oct 29, 2013 at 2:55 PM, Gregory Haase wrote:

> Tom is correct: Vacuum verbose shows that there is an associated toast
> table. Neither the check_postgres.pl script nor the query on
> http://wiki.postgresql.org/wiki/Show_database_bloat appears to take this
> into consideration. Both rely on null_frac and avg_width from pg_stats to
> estimate how big the table should be. I'm not sure how you would factor the
> toast table into that estimate.
>
> -G
>
>
> On Tue, Oct 29, 2013 at 2:05 PM, Tom Lane  wrote:
>
>> Gregory Haase  writes:
>> > I've isolated the problem to the json field not showing up in pg_stats,
>> > which affects the calculation of the avg row size in the bloat query.
>>
>> > I'm not sure if this is a json issue or some other kind of issue.
>>
>> Possibly your "bloat query" is failing to consider the toast table
>> associated with this table?  If the json values are large they'd
>> mostly be in the toast table not the main table.
>>
>> (It's unfortunate that VACUUM FULL doesn't tell you about what's
>> in the toast table.  I'd try just VACUUM VERBOSE here, without the
>> FULL, to get more info.)
>>
>> regards, tom lane
>>
>
>


Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Gregory Haase
Tom is correct: Vacuum verbose shows that there is an associated toast
table. Neither the check_postgres.pl script nor the query on
http://wiki.postgresql.org/wiki/Show_database_bloat appears to take this
into consideration. Both rely on null_frac and avg_width from pg_stats to
estimate how big the table should be. I'm not sure how you would factor the
toast table into that estimate.
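
A hedged sketch of one way to fold toast into that estimate (relation names
are the sanitized ones from this thread): compare the heap alone against
heap plus toast using the catalog size functions, rather than relying only
on avg_width from pg_stats.

SELECT c.relname,
       pg_size_pretty(pg_relation_size(c.oid))  AS heap_size,
       -- total minus the table's own indexes; this still counts the toast
       -- index, which is close enough for a rough check
       pg_size_pretty(pg_total_relation_size(c.oid)
                      - pg_indexes_size(c.oid)) AS heap_plus_toast_size
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE n.nspname = 'table_schema'
   AND c.relname = 'table_name';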

-G


On Tue, Oct 29, 2013 at 2:05 PM, Tom Lane  wrote:

> Gregory Haase  writes:
> > I've isolated the problem to the json field not showing up in pg_stats,
> > which affects the calculation of the avg row size in the bloat query.
>
> > I'm not sure if this is a json issue or some other kind of issue.
>
> Possibly your "bloat query" is failing to consider the toast table
> associated with this table?  If the json values are large they'd
> mostly be in the toast table not the main table.
>
> (It's unfortunate that VACUUM FULL doesn't tell you about what's
> in the toast table.  I'd try just VACUUM VERBOSE here, without the
> FULL, to get more info.)
>
> regards, tom lane
>


Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Gregory Haase
I've isolated the problem to the json field not showing up in pg_stats,
which affects the calculation of the avg row size in the bloat query.

I'm not sure if this is a json issue or some other kind of issue.

db_name=# select c.column_name, c.data_type from information_schema.columns
c where table_name = 'table_name' and not exists (select 1 from pg_stats s
where c.table_name = s.tablename and c.column_name = s.attname);
 column_name | data_type
-------------+-----------
 criteria    | json
(1 row)

-G


On Tue, Oct 29, 2013 at 1:51 PM, Tom Lane  wrote:

> John R Pierce  writes:
> > On 10/29/2013 12:41 PM, Gregory Haase wrote:
> >> db_name=# VACUUM FULL VERBOSE table_schema.table_name;
> >> INFO:  vacuuming "table_schema.table_name"
> >> INFO:  "table_name": found 2 removable, 29663 nonremovable row
> >> versions in 1754 pages
> >> DETAIL:  0 dead row versions cannot be removed yet.
> >> CPU 0.07s/0.10u sec elapsed 0.30 sec.
>
> > is there an old transaction pending?   that 'masks' vacuum from touching
> > any tuples newer than the start of that transaction.
>
> If old transactions were the problem, vacuum would be reporting that
> some-large-number of dead row versions couldn't be removed yet.
>
> There doesn't seem to be anything obviously wrong here.
>
> regards, tom lane
>


Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Gregory Haase
One more thing I just tried:

create table table_schema.table_name_new (like table_schema.table_name);
insert into table_schema.table_name_new select * from table_schema.table_name;

The new table shows the same amount of wasted bytes and pages as the old.

So, based on that, I think I'm going to throw out any notion of updates or
deletes as the cause of bloat on this particular table.

-G


On Tue, Oct 29, 2013 at 12:53 PM, Gregory Haase wrote:

> So, between yesterday and today we actually failed over to our hot-standby
> instance and the issue hasn't changed. I don't think you can have a pending
> transaction across streaming replication.
>
>
> On Tue, Oct 29, 2013 at 12:49 PM, John R Pierce wrote:
>
>>  On 10/29/2013 12:41 PM, Gregory Haase wrote:
>>
>> db_name=# VACUUM FULL VERBOSE table_schema.table_name;
>> INFO:  vacuuming "table_schema.table_name"
>> INFO:  "table_name": found 2 removable, 29663 nonremovable row versions
>> in 1754 pages
>> DETAIL:  0 dead row versions cannot be removed yet.
>> CPU 0.07s/0.10u sec elapsed 0.30 sec.
>>
>>
>> is there an old transaction pending?   that 'masks' vacuum from touching
>> any tuples newer than the start of that transaction.
>>
>>
>>
>> --
>> john r pierce  37N 122W
>> somewhere on the middle of the left coast
>>
>>
>


Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Gregory Haase
So, between yesterday and today we actually failed over to our hot-standby
instance and the issue hasn't changed. I don't think you can have a pending
transaction across streaming replication.


On Tue, Oct 29, 2013 at 12:49 PM, John R Pierce  wrote:

>  On 10/29/2013 12:41 PM, Gregory Haase wrote:
>
> db_name=# VACUUM FULL VERBOSE table_schema.table_name;
> INFO:  vacuuming "table_schema.table_name"
> INFO:  "table_name": found 2 removable, 29663 nonremovable row versions
> in 1754 pages
> DETAIL:  0 dead row versions cannot be removed yet.
> CPU 0.07s/0.10u sec elapsed 0.30 sec.
>
>
> is there an old transaction pending?   that 'masks' vacuum from touching
> any tuples newer than the start of that transaction.
>
>
>
> --
> john r pierce  37N 122W
> somewhere on the middle of the left coast
>
>


Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Gregory Haase
Following up.

I don't see any rolled back transactions in the logs.

What really troubles me is that vacuum full doesn't actually fix the
problem. If there were bad data that had been corrected via mass updates,
I'd expect the bloat issue to be fixed by a vacuum full.

When I run the vacuum back to back, this is what I get:

db_name=# VACUUM FULL VERBOSE table_schema.table_name;
INFO:  vacuuming "table_schema.table_name"
INFO:  "table_name": found 2 removable, 29663 nonremovable row versions in
1754 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.07s/0.10u sec elapsed 0.30 sec.
VACUUM
db_name=# VACUUM FULL VERBOSE table_schema.table_name;
INFO:  vacuuming "table_schema.table_name"
INFO:  "table_name": found 0 removable, 29663 nonremovable row versions in
1754 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.09s/0.09u sec elapsed 0.32 sec.
VACUUM

I think the question to address may be: "Why does the check_postgres query
think there should only be 334 pages instead of 1754?"

The tbloat and wastedbytes calculations provided in the query from this
page: http://wiki.postgresql.org/wiki/Show_database_bloat do seem to
correlate with the wasted bytes reported by nagios, though.
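
A rough sketch of what that "should be" estimate boils down to (assumptions:
8 kB pages, a ~24-byte tuple header, and the 29663-row count from the vacuum
output above). With the json column absent from pg_stats, the summed
avg_width comes out far too small, which drives the expected page count down:

SELECT ceil(29663 * (24 + sum(avg_width)) / 8192.0) AS shouldbe_pages
  FROM pg_stats
 WHERE schemaname = 'table_schema'
   AND tablename  = 'table_name';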

Greg Haase




On Tue, Oct 29, 2013 at 7:06 AM, Merlin Moncure  wrote:

> On Tue, Oct 29, 2013 at 5:38 AM, Chris Travers 
> wrote:
> >
> >
> >
> > On Mon, Oct 28, 2013 at 4:17 PM, Gregory Haase 
> > wrote:
> >>
> >> I have a table that is triggering my nagios database bloat alert
> >> regularly. Usually, I have to give it the vacuum full more than once to
> get
> >> it under the threshold. Today I tried repeatedly and cannot get the
> alert to
> >> resolve.
> >>
> >> I had a discussion with one of the primary developers about how the
> table
> >> is utilized, and it turns out they are basically only ever inserting
> into
> >> it. This sort of flies in the face of conventional wisdom about bloat
> being
> >> caused by frequent updates and deletes.
> >
> >
> > As I understand it, vacuuming only removes the tuples removed by delete
> and
> > update operations.
>
> well, or by rolled back transactions.  we have to wonder if the OP has
> a lot of queries trying to insert and failing.  maybe check the log?
>
> merlin
>


[GENERAL] json datatype and table bloat?

2013-10-28 Thread Gregory Haase
I have a table that is triggering my nagios database bloat alert regularly.
Usually, I have to give it the vacuum full more than once to get it under
the threshold. Today I tried repeatedly and cannot get the alert to resolve.

I had a discussion with one of the primary developers about how the table
is utilized, and it turns out they are basically only ever inserting into
it. This sort of flies in the face of conventional wisdom about bloat being
caused by frequent updates and deletes.

We were looking at it, and one of the things that struck me is that this
table has a column with a json datatype. I looked through
information_schema.columns and there is only one other table with a json
datatype, and I recall having bloat issues with this table in the past as
well.

I'm wondering if the json datatype is just naturally more bloated than
other types, or if the query in the check_postgres.pl nagios script is
not accurate, or if maybe my thresholds are simply too low?

The table design itself is pretty simple:
 id         | integer                     | not null default nextval('table_schema.table_name_id_seq'::regclass)
 type       | character varying(255)      |
 criteria   | json                        |
 created_at | timestamp without time zone | not null
 updated_at | timestamp without time zone | not null
Indexes:
    "table_name_pkey" PRIMARY KEY, btree (id)


The nagios output looks like this (sanitized):
POSTGRES_BLOAT WARNING: DB "db_name" (host:host.domain.com) (db db_name)
table table_schema.table_name rows:29305 pages:1733 shouldbe:330 (5.3X)
wasted size:11493376 (10 MB)

Thoughts?

Greg Haase


Re: [GENERAL] PostgreSQL Point In Time Recovery

2013-10-25 Thread Gregory Haase
Before going through something like delayed replication, you really want to
consider using ZFS or LVM and taking regular snapshots on your hot or warm
standby. In the event of an accidental table drop, you can just roll back
to the prior snapshot and then do PITR from there.

Greg Haase


On Fri, Oct 25, 2013 at 11:14 PM, Jayadevan wrote:

> Alan Hodgson wrote
> > Well, yeah. The point was that you possibly could run it for a while to
> > "catch up" without taking a new base backup if you desired. You should
> > also keep copies of it for PITR.
>
> Something like this - delayed replication - might help. I could say lag by
> 12 hours, or 1 transactions...
>
>
>


Re: [GENERAL] What does \timing measure?

2013-09-24 Thread Gregory Haase
I'm curious what the following would show:

time echo '\\timing  select 1 from dual' | $PG_ROOT/bin/psql test >>
out.txt

Greg Haase
On Sep 24, 2013 4:01 PM, "Daniel Tahara"  wrote:

>
> On Tue, Sep 24, 2013 at 6:37 PM, Daniel Tahara wrote:
>
>> time echo '\\timing  select msg from test' | $PG_ROOT/bin/psql >>
>> out.txt
>
>
> This should be:
> time echo '\\timing  select msg from test' | $PG_ROOT/bin/psql test >>
> out.txt
>
> Sorry for the typo, and thanks for the help.
>
> Daniel
>
> Computer Science
> Yale University, Class of 2014
> daniel.tah...@yale.edu
> (646) 397-6379
>


Re: [GENERAL] Partitioning V schema

2013-09-20 Thread Gregory Haase
I would look at how PostGIS handles the TIGER census data for guidance. It's
a similar, massive data set.

Greg Haase
On Sep 20, 2013 9:47 AM, "Jeff Janes"  wrote:

> On Thu, Sep 19, 2013 at 12:02 AM, Dave Potts wrote:
>
>> Hi List
>>
>> I am looking for some general advice about the best way of splitting a
>> large data table. I have 2 different choices: partitioning or different
>> schemas.
>>
>
>
> I don't think there is much of a choice there.  If you put them in
> different schemas, then you are inherently partitioning the data.  It's just
> a question of how you name your partitions, which is more of a naming issue
> than a performance issue.
>
>
>>
>> The data table refers to the number of houses that can be included in a
>> city; as such, there are a large number of records.
>>
>>
>> I am wondering, if I decided to partition the table, whether the update
>> speed/access might be faster than just declaring a different schema per
>> city.
>>
>
> If you partition based on city, then there should be no meaningful
> difference.  If you partition based on something else, you would have to
> describe what it is partitioned on, and what your access patterns are like.
>
> Cheers,
>
> Jeff
>


Re: [GENERAL] ZFS snapshots - to use pg_start_backup() and pg_stop_backup() or no?

2013-09-17 Thread Gregory Haase
Thanks.

If anyone is interested, I added PostgreSQL support to the zfstools project,
and the owner merged my pull request:
https://github.com/bdrewery/zfstools

I know there are probably 1000 ways to do this out there, but we were
already using this code to manage our MySQL zfs snapshotting process, so
extending it became the path of least resistance.

The tricky part for me was separating the pg_stop_backup command from the
first two commands so that you don't get into a situation where you start
backup, fail on zfs snapshot create, and then pg_stop_backup is not
executed. Maybe not the most elegant, but I'd rather have a failed
pg_stop_backup command than a pg_start_backup command that is left open
indefinitely.


Greg Haase


On Tue, Sep 17, 2013 at 12:33 PM, Vick Khera  wrote:

>
> On Wed, Sep 11, 2013 at 8:00 PM, Gregory Haase wrote:
>
>> Typically how fast is a crash recovery for a ~1TB database with heavy
>> OLTP load? Are we talking several seconds, several minutes, several hours?
>>
>
> This will depend on how fast your file system is, how frequent your
> checkpoints are (this is tunable), and how many WAL segments you allow
> (also tunable). The trade off is if you do less frequent checkpointing and
> have more WAL segments, you can boost your write speeds, but the cost is
> longer recovery.  Ideally you want to tune the number of WAL segments to be
> just the right number to keep from forcing checkpoints before your
> configured timeout to run a checkpoint, and you configure your checkpoint
> time to whatever duration of time you need to keep your recovery time as
> short as you want.
>


Re: [GENERAL] ZFS snapshots - to use pg_start_backup() and pg_stop_backup() or no?

2013-09-11 Thread Gregory Haase
Good point on not needing to shell out. I think my process was a mental
holdover from the fact that MySQL releases 'flush tables with read lock' on
client disconnect.

Typically how fast is a crash recovery for a ~1TB database with heavy OLTP
load? Are we talking several seconds, several minutes, several hours?

Thanks,

-G


On Wed, Sep 11, 2013 at 4:46 PM, Steven Schlansker wrote:

>
> On Sep 11, 2013, at 4:29 PM, Gregory Haase 
> wrote:
>
> > I was trying to figure out how to get the following syntax to work:
> >
> > echo "select pg_start_backup('zfs_snapshot'); \\! zfs snapshot
> zroot/zpgsql@test; \\ select pg_stop_backup();" | psql postgres
>
> I do:
>
> psql -c "select pg_start_backup('whatever');" && zfs snapshot pool/fs@sn &&
> psql -c "select pg_stop_backup();"
>
> That way no need to shell out from psql :)
>
> >
> > The above command successfully starts the backup and creates the
> snapshot but then fails to stop the backup. I've tried various combinations
> of \ and \\ here with different whitespace and I just can't seem to find a
> combination that works. I don't understand the proper use of \\ (described
> as the separator metacommand).
>
> Keep in mind that echo "\\"  will actually only echo '\' because \ is a
> shell escape as well...
>
> >
> > However, in my research, I noted that a bunch of people seem to just not
> even bother with pg_start_backup/pg_stop_backup and I guess aren't that
> worried about the crash recovery process if they need to perform a restore.
> I also noticed the omission of the start/stop backup functions from the File
> System Level Backup page:
> http://www.postgresql.org/docs/9.2/static/backup-file.html
> >
> > Are pg_start_backup() and pg_stop_backup() even necessary?
> >
>
> If all of your Postgres files are part of *the same* consistent snapshot
> (i.e. are on one FS that gets snapshotted), then the start/stop backup
> should not be necessary.  It will just look like a server crash instead.
>
> pg_start_backup is used when you do not have filesystem snapshotting
> available, and is described in detail on the next manual page:
>
> http://www.postgresql.org/docs/9.2/static/continuous-archiving.html
>
>


[GENERAL] ZFS snapshots - to use pg_start_backup() and pg_stop_backup() or no?

2013-09-11 Thread Gregory Haase
I was trying to figure out how to get the following syntax to work:

echo "select pg_start_backup('zfs_snapshot'); \\! zfs snapshot
zroot/zpgsql@test; \\ select pg_stop_backup();" | psql postgres

The above command successfully starts the backup and creates the snapshot
but then fails to stop the backup. I've tried various combinations of \ and
\\ here with different whitespace and I just can't seem to find a
combination that works. I don't understand the proper use of \\ (described
as the separator metacommand).

However, in my research, I noted that a bunch of people seem to just not
even bother with pg_start_backup/pg_stop_backup and I guess aren't that
worried about the crash recovery process if they need to perform a restore.
I also noticed the omission of the start/stop backup functions from the File
System Level Backup page:
http://www.postgresql.org/docs/9.2/static/backup-file.html

Are pg_start_backup() and pg_stop_backup() even necessary?

It would be nice to understand the proper syntax for the psql pipe,
regardless of whether or not it is even necessary in this case.

Thanks,

Greg Haase


[GENERAL] Generic function for partitioning function?

2013-09-03 Thread Gregory Haase
I am working on a date-based partitioning framework, and I would really like
to have a single function that could be used as the trigger for any table
that needs to be partitioned by day. I am working in a Rails environment, so
every table has a created_at datetime field.

I created my generic function:

create or replace function day_partition_insert_trigger()
returns trigger as $$
declare
    ins_tbl varchar;
begin
    -- route the row to the child table named <schema>.<parent>_YYYYMMDD,
    -- based on the row's created_at date
    ins_tbl := TG_TABLE_SCHEMA || '.' || TG_TABLE_NAME || '_' ||
               to_char(NEW.created_at, 'YYYYMMDD');
    execute 'insert into ' || ins_tbl || ' select ($1).*' using NEW;
    return null;
end;
$$ language plpgsql;

And then I assigned the function to two different test tables to make sure
it would work:

create trigger insert_daily_trigger before insert on testdailyone
    for each row execute procedure day_partition_insert_trigger();
create trigger insert_daily_trigger before insert on testdailytwo
    for each row execute procedure day_partition_insert_trigger();

Inserts work fine, and I was able to validate records are being inserted
into the correct child tables.
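
For context, a minimal sketch of the setup the generic trigger assumes (the
column list and the child name here are hypothetical, not from my actual
test): one child table per day, named <parent>_YYYYMMDD and inheriting from
the parent.

create table testdailyone (
    id         serial,
    payload    text,
    created_at timestamp not null default now()
);

-- one child per day; the trigger routes rows here by created_at
create table testdailyone_20130903 (
    check (created_at >= '2013-09-03' and created_at < '2013-09-04')
) inherits (testdailyone);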

I began to wonder if there would be a performance degradation, so I changed
the testdailytwo trigger function to the typical if/elsif approach described
in the partitioning documentation and then ran pgbench against both tables.

I noticed that with 7 partitions, the if/elsif version was slightly faster
(~8%). However, with 30 partitions, the if/elsif version became slower. I'd
sort of expected this.
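
For reference, a minimal sketch of the conventional trigger I compared
against (child names are hypothetical), along the lines of the partitioning
documentation:

create or replace function testdailytwo_insert_trigger()
returns trigger as $$
begin
    if (NEW.created_at >= '2013-09-04' and NEW.created_at < '2013-09-05') then
        insert into testdailytwo_20130904 values (NEW.*);
    elsif (NEW.created_at >= '2013-09-03' and NEW.created_at < '2013-09-04') then
        insert into testdailytwo_20130903 values (NEW.*);
    else
        raise exception 'created_at out of range for testdailytwo partitions';
    end if;
    return null;
end;
$$ language plpgsql;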

So, my conclusion is that the generic function will work, and it will make
administration (even automated administration) of partitioned tables much
simpler.

My question is: is there a compelling reason why I should NOT do this? I
must confess, it seems so straightforward that I feel like I must be
missing something.

Thanks,

Greg Haase