Re: [HACKERS] [PROPOSAL] Temporal query processing with range types

2017-10-06 Thread Paul A Jungwirth
On Fri, Jul 22, 2016 at 4:15 AM, Anton Dignös  wrote:
> We would like to contribute to PostgreSQL a solution that supports the query
> processing of "at each time point". The basic idea is to offer two new
> operators, NORMALIZE and ALIGN, whose purpose is to adjust (or split) the
> ranges of tuples so that subsequent queries can use the usual grouping and
> equality conditions to get the intended results.

I just wanted to chime in and say that the work these people have done
is *amazing*. I read two of their papers yesterday [1, 2], and if you
are interested in temporal data, I encourage you to read them too. The
first one is only 12 pages and quite readable. After that the second
is easy because it covers a lot of the same ground but adds "scaling"
of values when a tuple is split, and some other interesting points.
Their contributions could be used to implement SQL:2011 syntax but go
way beyond that.

Almost every project I work on could use temporal database support,
but there is nothing available in the Open Source world. The
temporal_tables extension [3] offers transaction-time support, which
is great for auditing, but it has no valid-time support (aka
application-time or state-time). Same with Magnus Hagander's TARDIS
approach [4], Chronomodel [5] (an extension to the Rails ORM), or any
other project I've seen. But valid-time is the more valuable
dimension, because it tells you the history of the thing itself (not
just when the database was changed). Also nothing is even attempting
full bitemporal support.

The ideas behind temporal data are covered extensively in Snodgrass's
1999 book [6], which shows how valuable it is to handle temporal data
in a principled way, rather than ad hoc. But that book also
demonstrates how complex the queries become to do things like temporal
foreign key constraints and temporal joins. I was sad to learn that
his proposed TSQL2 was rejected as a standard back in the 90s,
although the critiques by C. J. Date [7] have some merit. In
particular, since TSQL2 used *statement* modifiers, some of the
behavior was unclear or bad when using subqueries, views, and
set-returning functions. It makes more sense to have temporal
*operators*, so alongside inner join you have temporal inner join, and
likewise with temporal left outer join, temporal
union/intersection/difference, temporal aggregation, etc. (I think the
drawbacks of TSQL2 came from pursuing an unachievable goal, which was
to enable seamlessly converting existing non-temporal tables to
temporal without breaking any queries.)

Another unsatisfactory approach at historical data, from the industry
rather than academia, is in chapter 4 and elsewhere of Ralph Kimball's
*Data Warehouse Toolkit* [8]. His first suggestion (Type 1 Dimensions)
is to ignore the problem and overwrite old data with new. His Type 2
approach (make a new row) is better but loses the continuity between
the old row and the new. Type 3 fixes that but supports only one
change, not several. And anyway his ideas are tailored to star-schema
designs so are not as broadly useful. Workarounds like bridge tables
and "put the data in the fact table" are even more wedded to a
star-schema approach. But I think his efforts do show how valuable
historical data is, and how hard it is to handle without built-in
support.

As far as I can tell SQL:2011 avoids the statement modifier problem
(I'm not 100% sure), but it is quite limited, mostly covering
transaction-time semantics and not giving any way to do valid-time
outer joins or aggregations. It is clearly an early first step.
Unfortunately the syntax feels (to me) crippled by over-specificity,
like it will have a hard time growing to support all the things you'd
want to do.

The research by Dignös et al shows how you can define temporal
variants for every operator in the relational algebra, and then
implement them by using just two transformations (ALIGN and NORMALIZE)
combined with the existing non-temporal operators. It has a strong
theoretical basis and avoids the TSQL2 problems with composability.
And unlike SQL:2011 it has a great elegance and completeness I haven't
seen anywhere else.

I believe with range types the approach was to build up useful
primitives rather than jumping straight to a less-factored full
implementation of temporal features. (This in spite of SQL:2011
choosing to model begin/end times as separate columns, not as ranges.
:-) It seems to me the Dignös work follows the same philosophy. Their
ALIGN and NORMALIZE could be used to implement SQL:2011 features, but
they are also useful for much more. In their papers they actually
suggest that these transformations need not be exposed to end-users,
although it was convenient to have access to them for their own
research. I think it'd be great if Postgres's SQL dialect supported
them though, since SQL:2011 leaves out so much.

Anyway, I wanted to thank them for their excellent work, their
generosity, and also their perseverance. ([1] is from 2012 and

Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-05 Thread Paul A Jungwirth
On Sat, Aug 5, 2017 at 6:22 PM, Robert Haas  wrote:
> On Sat, Aug 5, 2017 at 4:28 PM, Paul A Jungwirth
>  wrote:
>> I don't have an opinion on the urgency of back-porting a fix, but if
>> pg_stop_backup(boolean) allows for inconsistent backups, it does sound
>> like a problem on 9.6 too.
>
> It doesn't.  The talk about inconsistent backups is, I think, not a
> very precise way of speaking.

Thank you for the reassurance and the detailed explanation!

Paul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-05 Thread Paul A Jungwirth
On Sat, Aug 5, 2017 at 1:28 PM, Paul A Jungwirth
 wrote:
> On Sat, Aug 5, 2017 at 7:51 AM, Robert Haas  wrote:
>> After refreshing my memory further, I take it back.  pg_stop_backup()
>> doesn't even have a second argument on v9.6, so back-porting this fix
>> to 9.6 is a meaningless thing; there's nothing to fix.
>
> According to the docs at
> https://www.postgresql.org/docs/9.6/static/functions-admin.html there
> is a one-arg version in 9.6.

I'm sorry, I just realized you said a *second* argument. Apologies for
the distraction!

Paul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-05 Thread Paul A Jungwirth
On Sat, Aug 5, 2017 at 7:51 AM, Robert Haas  wrote:
> After refreshing my memory further, I take it back.  pg_stop_backup()
> doesn't even have a second argument on v9.6, so back-porting this fix
> to 9.6 is a meaningless thing; there's nothing to fix.

According to the docs at
https://www.postgresql.org/docs/9.6/static/functions-admin.html there
is a one-arg version in 9.6.

I've been watching this thread since I noticed it a few days ago. I
have a system on 9.6 with replication, where the standby uses
`archive_mode: always` so that I can run WAL-E backups from there
instead of from the master. I'm a little worried about my backup
integrity now (It sounds like this might be a "it usually works"
situation). Also, WAL-E currently uses the no-arg pg_stop_backup, but
I was contemplating offering a patch to use non-exclusive backups when
available, both to avoid stopping the standby and also to generate a
tablespace description (which WAL-E requires when restoring with
tablespaces). Just thought I'd offer a real use-case for the
non-exclusive-mode functions on 9.6.

I don't have an opinion on the urgency of back-porting a fix, but if
pg_stop_backup(boolean) allows for inconsistent backups, it does sound
like a problem on 9.6 too.

Paul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New partitioning - some feedback

2017-07-10 Thread Paul A Jungwirth
> Also, there seems to be at least some preference
> for excluding partitions by default from the \d listing.

As another user of partitions I'll chime in and say that would be very
nice! On the other hand, with pre-10 partitions you do see all the
child tables with `\d`, so showing declarative partitions seems more
consistent with the old functionality.

On the third hand with pre-10 partitions I can put the child tables
into a separate schema to de-clutter `\d`, but I don't see any way to
do that with declarative partitions. Since there is no workaround it
makes it a bit more important for partitions not to be so noisy.

Paul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] First Aggregate Funtion?

2015-07-20 Thread Paul A Jungwirth
> The above implementation of "first" aggregate returns the first non-NULL item
> value.

I'm curious what advantages this approach has over these FIRST/LAST
functions from the Wiki?:

https://wiki.postgresql.org/wiki/First/last_%28aggregate%29

Also to get the "first non-null value" you can apply an ordering to
just the aggregate function, e.g.:

select first(id order by start_time nulls last) from events;

If you want speed you should probably write a C version.

Is there something I'm missing?

Also since we're on the hackers list is this a proposal to add these
functions to core Postgres?

Yours,
Paul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for UUIDs

2015-06-25 Thread Paul A Jungwirth
On Thu, Jun 25, 2015 at 8:06 AM, Tom Lane  wrote:
> Paul A Jungwirth  writes:
>> I'm interested in adding GiST support for the UUID column type
>> . . . . So I'm curious where this change would go?
> btree_gist, I'd think

Okay, thank you for your answer! I was worried about the effects of
having btree_gist depend on uuid-ossp. People won't have to say
`CREATE EXTENSION "uuid-ossp"` if they only want `btree_gist`, right?

> the overhead of an extension version bump will probably
> exceed the useful payload :-(

Sorry to put more work on your plate. :-) I'm trying to pick something
easy to get my feet wet.

Yours,
Paul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] GiST support for UUIDs

2015-06-24 Thread Paul A Jungwirth
Hello,

I'm interested in adding GiST support for the UUID column type from
the uuid-ossp extension. This has been requested and attempted before:


http://dba.stackexchange.com/questions/83604/optimizing-postgres-row-overlap-constraints-involving-uuids-and-gist

http://www.postgresql.org/message-id/flat/cah3i69njj9ax1fzhwt6uouzcmbqnadbwhmaphrqzzlwifb2...@mail.gmail.com#cah3i69njj9ax1fzhwt6uouzcmbqnadbwhmaphrqzzlwifb2...@mail.gmail.com

I've used Postgres for a long time, but I've only dabbled a bit in the
source code. So I'm curious where this change would go? The btree_gist
extension? The uuid-ossp extension? Somewhere else?

If anyone has any advice I'd be grateful to hear it.

Thanks,
Paul


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers