Re: [GENERAL] Incrementally refreshed materialized view

2016-09-29 Thread Nguyễn Trần Quốc Vinh
We came to C and discontinued the version generating code in
PL/pgSQL because of
- Our testing showed that triggers in C give better performance than the
ones in PL/pgSQL. Our opinion may be wrong.
- If we can generate triggers and other updating codes in C, we may
integrate it into PostgreSQL source codes. We may be wrong too.

:)

We plan to do by the same way for queries with outer-joins next year. With
recursive queries we have no plan because we don't see any effective update
algorithm. The worst is that we have no fund to do with matviews and
PostgreSQL. All that just for fun! We have too many things to do each day.

TS. Nguyễn Trần Quốc Vinh
---
Chủ nhiệm khoa Tin học
Trường ĐH Sư phạm - ĐH Đà Nẵng
Website: http://it.ued.vn ; http://www.ued.vn
; http://www.ued.udn.vn
LLKH: http://scv.ued.vn/~ntquocvinh 
ĐT: 0511.6-512-586
DĐ: 0914.78-08-98

Nguyen Tran Quoc Vinh, PhD
Dean
Faculty of Information Technology
Danang University of Education
Website: http://it.ued.udn.vn; http://www.ued.vn ;
http://www.ued.udn.vn
SCV: http://scv.ued.vn/~ntquocvinh 
Phone: (+84) 511.6-512-586
Mobile: (+84) 914.78-08-98

On Thu, Sep 29, 2016 at 8:56 PM, Nguyễn Trần Quốc Vinh  wrote:

> Dear Adam.
>
> There are read-me files within each projects. They support only
> inner-joins and aggregations (sum, count, min, max, avg). The updating
> algorithm for matviews with min/max must be differ from without ones. When
> there are min and/or max, we have to use the base table, otherwise, not.
>
> About 15 years ago we implemented for SPJ matvew-queries to generate
> triggers in PL/pgSQL. We developed that project for queries with
> aggregations about 4 years ago. Unfortunately we lost the last versions. We
> have now only the version with some error when there are aggregations. For
> SPJ queries it works well. We don't know is it useful or not. If yes, we
> can upload for sharing. We didn't share it because we thought that our work
> is not good enough for public use.
>
> The projects that generate C-codes was "finished" 2 years ago.
>
> Concern the project that generates triggers (doing incremental update) in
> PL/pgSQL, we implemented the algorithm likes your suggestion in this
> thread, i.e. at least one key of the base tables are added  automatically
> into the mat-view queries for further incremental updates.
>
> TS. Nguyễn Trần Quốc Vinh
> ---
> Chủ nhiệm khoa Tin học
> Trường ĐH Sư phạm - ĐH Đà Nẵng
> Website: http://it.ued.vn ; http://www.ued.vn
> ; http://www.ued.udn.vn
> LLKH: http://scv.ued.vn/~ntquocvinh 
> ĐT: 0511.6-512-586
> DĐ: 0914.78-08-98
> 
> Nguyen Tran Quoc Vinh, PhD
> Dean
> Faculty of Information Technology
> Danang University of Education
> Website: http://it.ued.udn.vn; http://www.ued.vn 
> ; http://www.ued.udn.vn
> SCV: http://scv.ued.vn/~ntquocvinh 
> Phone: (+84) 511.6-512-586
> Mobile: (+84) 914.78-08-98
>
> On Thu, Sep 29, 2016 at 8:37 PM, Adam Brusselback <
> adambrusselb...@gmail.com> wrote:
>
>> On Thu, Sep 29, 2016 at 8:10 AM, Nguyễn Trần Quốc Vinh <
>> ntquocv...@gmail.com> wrote:
>>
>>> Dear,
>>>
>>> As it was recommended, we pushed our projects into github:
>>> https://github.com/ntqvinh/PgMvIncrementalUpdate.
>>>
>>> 1) Synchronous incremental update
>>> - For-each-row triggers are generated for all changing events on all
>>> underlying tables.
>>>
>>> 2) Asynchronous (deferred) incremental update
>>> - Triggers are generated for all changing events on all underlying
>>> tables to collect all changed rows
>>> - Other codes are generated for each matview-query to update the matview.
>>>
>>>  We hope that our projects may be helpful for someone!
>>>
>>
>> Very interesting. Does this support materialized views with recursive
>> queries? What about left joins? (not) exists? Aggregates? Window functions?
>> In reading up on the implementations in other databases, I was surprised by
>> some of the limitations imposed by DB2 / Oracle / Sql Server.
>>
>> I'm trying to look through the code base to answer my questions, but it's
>> large enough that it may be easier to just ask first.
>>
>
>


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-29 Thread Nguyễn Trần Quốc Vinh
Dear Adam.

There are read-me files within each projects. They support only inner-joins
and aggregations (sum, count, min, max, avg). The updating algorithm for
matviews with min/max must be differ from without ones. When there are min
and/or max, we have to use the base table, otherwise, not.

About 15 years ago we implemented for SPJ matvew-queries to generate
triggers in PL/pgSQL. We developed that project for queries with
aggregations about 4 years ago. Unfortunately we lost the last versions. We
have now only the version with some error when there are aggregations. For
SPJ queries it works well. We don't know is it useful or not. If yes, we
can upload for sharing. We didn't share it because we thought that our work
is not good enough for public use.

The projects that generate C-codes was "finished" 2 years ago.

Concern the project that generates triggers (doing incremental update) in
PL/pgSQL, we implemented the algorithm likes your suggestion in this
thread, i.e. at least one key of the base tables are added  automatically
into the mat-view queries for further incremental updates.

TS. Nguyễn Trần Quốc Vinh
---
Chủ nhiệm khoa Tin học
Trường ĐH Sư phạm - ĐH Đà Nẵng
Website: http://it.ued.vn ; http://www.ued.vn
; http://www.ued.udn.vn
LLKH: http://scv.ued.vn/~ntquocvinh 
ĐT: 0511.6-512-586
DĐ: 0914.78-08-98

Nguyen Tran Quoc Vinh, PhD
Dean
Faculty of Information Technology
Danang University of Education
Website: http://it.ued.udn.vn; http://www.ued.vn ;
http://www.ued.udn.vn
SCV: http://scv.ued.vn/~ntquocvinh 
Phone: (+84) 511.6-512-586
Mobile: (+84) 914.78-08-98

On Thu, Sep 29, 2016 at 8:37 PM, Adam Brusselback  wrote:

> On Thu, Sep 29, 2016 at 8:10 AM, Nguyễn Trần Quốc Vinh <
> ntquocv...@gmail.com> wrote:
>
>> Dear,
>>
>> As it was recommended, we pushed our projects into github:
>> https://github.com/ntqvinh/PgMvIncrementalUpdate.
>>
>> 1) Synchronous incremental update
>> - For-each-row triggers are generated for all changing events on all
>> underlying tables.
>>
>> 2) Asynchronous (deferred) incremental update
>> - Triggers are generated for all changing events on all underlying tables
>> to collect all changed rows
>> - Other codes are generated for each matview-query to update the matview.
>>
>>  We hope that our projects may be helpful for someone!
>>
>
> Very interesting. Does this support materialized views with recursive
> queries? What about left joins? (not) exists? Aggregates? Window functions?
> In reading up on the implementations in other databases, I was surprised by
> some of the limitations imposed by DB2 / Oracle / Sql Server.
>
> I'm trying to look through the code base to answer my questions, but it's
> large enough that it may be easier to just ask first.
>


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-29 Thread Adam Brusselback
On Thu, Sep 29, 2016 at 8:10 AM, Nguyễn Trần Quốc Vinh  wrote:

> Dear,
>
> As it was recommended, we pushed our projects into github:
> https://github.com/ntqvinh/PgMvIncrementalUpdate.
>
> 1) Synchronous incremental update
> - For-each-row triggers are generated for all changing events on all
> underlying tables.
>
> 2) Asynchronous (deferred) incremental update
> - Triggers are generated for all changing events on all underlying tables
> to collect all changed rows
> - Other codes are generated for each matview-query to update the matview.
>
>  We hope that our projects may be helpful for someone!
>

Very interesting. Does this support materialized views with recursive
queries? What about left joins? (not) exists? Aggregates? Window functions?
In reading up on the implementations in other databases, I was surprised by
some of the limitations imposed by DB2 / Oracle / Sql Server.

I'm trying to look through the code base to answer my questions, but it's
large enough that it may be easier to just ask first.


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-29 Thread Nguyễn Trần Quốc Vinh
Dear,

As it was recommended, we pushed our projects into github:
https://github.com/ntqvinh/PgMvIncrementalUpdate.

1) Synchronous incremental update
- For-each-row triggers are generated for all changing events on all
underlying tables.

2) Asynchronous (deferred) incremental update
- Triggers are generated for all changing events on all underlying tables
to collect all changed rows
- Other codes are generated for each matview-query to update the matview.

 We hope that our projects may be helpful for someone!




TS. Nguyễn Trần Quốc Vinh
---
Chủ nhiệm khoa Tin học
Trường ĐH Sư phạm - ĐH Đà Nẵng
Website: http://it.ued.vn ; http://www.ued.vn
; http://www.ued.udn.vn
LLKH: http://scv.ued.vn/~ntquocvinh 
ĐT: 0511.6-512-586
DĐ: 0914.78-08-98

Nguyen Tran Quoc Vinh, PhD
Dean
Faculty of Information Technology
Danang University of Education
Website: http://it.ued.udn.vn; http://www.ued.vn ;
http://www.ued.udn.vn
SCV: http://scv.ued.vn/~ntquocvinh 
Phone: (+84) 511.6-512-586
Mobile: (+84) 914.78-08-98

On Tue, Sep 27, 2016 at 12:23 PM, hariprasath nallasamy <
hariprasathnallas...@gmail.com> wrote:

> We also tried to achieve incremental refresh of materialized view and our
> solution doesn't solve all of the use cases.
>
> Players:
> 1) WAL
> 2) Logical decoding
> 3) replication slots
> 4) custom background worker
>
> Two kinds of approaches :
> 1. Deferred refresh (oracle type of creating log table for each base
> tables with its PK and agg's columns old and new values)
>   a) Log table for each base table has to be created and this log
> table will keep track of delta changes.
>   b) UDF is called to refresh the view incrementally - this will
> run original materialized view query with the tracked delta PK's in their
> where clause. so only rows that are modified/inserted will be touched.
>   c) Log table will keep track of changed rows from the data given by
> replication slot which uses logical decoding to decode from WAL.
>   d) Shared memory is used to maintain the relationship between the
> view and its base table. In case of restart they are pushed to maintenance
> table.
>
> 2. RealTime refresh (update the view whenever we get any change-sets
> related to that base tables)
>   a) Delta data from the replication slot will be applied to view by
> checking the relationship between our delta data and the view definiton.
> Here also shared memory and maintenance table are used.
>   b) Work completed only for materialized views having single table.
>
> Main disadvantage :
> 1) Data inconsistency when master failure and also slave doesn't have
> replication slot as of now. But 2ndquard guys try to create slots in slave
> using some concepts of failover slots. But that doesn't come along with PG
> :(.
> 2) Sum, count and avg are implemented for aggregates(single table) and for
> other aggs full refresh comes to play a role.
> 3) Right join implementation requires more queries to run on the top of
> MV's.
>
> So we are on a long way to go and dono whether this is the right path.
>
> Only deferred refresh was pushed to github.
> https://github.com/harry-2016/MV_IncrementalRefresh
>
> I wrote a post regarding that in medium.
> https://medium.com/@hariprasathnallsamy/postgresql-materialized-view-
> incremental-refresh-44d1ca742599
>
>
>


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread hariprasath nallasamy
We also tried to achieve incremental refresh of materialized view and our
solution doesn't solve all of the use cases.

Players:
1) WAL
2) Logical decoding
3) replication slots
4) custom background worker

Two kinds of approaches :
1. Deferred refresh (oracle type of creating log table for each base tables
with its PK and agg's columns old and new values)
  a) Log table for each base table has to be created and this log table
will keep track of delta changes.
  b) UDF is called to refresh the view incrementally - this will
run original materialized view query with the tracked delta PK's in their
where clause. so only rows that are modified/inserted will be touched.
  c) Log table will keep track of changed rows from the data given by
replication slot which uses logical decoding to decode from WAL.
  d) Shared memory is used to maintain the relationship between the
view and its base table. In case of restart they are pushed to maintenance
table.

2. RealTime refresh (update the view whenever we get any change-sets
related to that base tables)
  a) Delta data from the replication slot will be applied to view by
checking the relationship between our delta data and the view definiton.
Here also shared memory and maintenance table are used.
  b) Work completed only for materialized views having single table.

Main disadvantage :
1) Data inconsistency when master failure and also slave doesn't have
replication slot as of now. But 2ndquard guys try to create slots in slave
using some concepts of failover slots. But that doesn't come along with PG
:(.
2) Sum, count and avg are implemented for aggregates(single table) and for
other aggs full refresh comes to play a role.
3) Right join implementation requires more queries to run on the top of
MV's.

So we are on a long way to go and dono whether this is the right path.

Only deferred refresh was pushed to github.
https://github.com/harry-2016/MV_IncrementalRefresh

I wrote a post regarding that in medium.
https://medium.com/@hariprasathnallsamy/postgresql-materialized-view-incremental-refresh-44d1ca742599


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Kevin Grittner
On Mon, Sep 26, 2016 at 3:16 PM, Adam Brusselback
 wrote:

> Well I feel like I've learned a ton already reading through the links you
> provided earlier and that example above.

Yeah, I know that example can really help show what will happen
"under the covers", and make it more concrete.  The theory that
it's based on seems brilliant to me.  That fact that it's based on
relational algebra (RA) means that it is much more likely to be
implemented in a complete and correct manner than something ad hoc.
 I know I started at this with a try (similar to yours) at
analyzing from scratch, then went and searched the literature.
When I came back to my ad hoc notes, the RA pointed out some holes
in the logic where corner cases would have been wrong because of
missed details.  RA is well thought out and solid; it seems to me
to be the perfect thing to underlie a solution to this problem.

> I'm very interested in getting this into core. I'll look into what I need to
> do to review. Not crazy familiar with C, as I mainly do Java development.
> I'll see if I can help in any way though.

Just testing it and reviewing the documentation for errors,
omissions, or just plain hard-to-follow language would be a big
help.  Please post back about any problems getting things patched
and build.

> The main reason I was working on an alternative is because I need something
> now rather than in a couple years, but I've been dealing with manually
> creating the few I do need for my database. What I proposed above was just
> me thinking about what could be done with things as they are. Obviously it's
> junk compared to a real solution in-core.  Would you consider my approach
> even worth trying, or should I just suck it up and do things manually for
> now and put that effort into getting incremental refresh into core?

Oh, I've used plain tables and triggers many times myself.  If you
need something now, you kind of have to go that route.  The
algorithms I cited do provide an interesting potential alternative
for how to go about that, although operating a row at a time you
probably won't approach the speed of statement-level set logic for
statements that affect very many rows.  :-(

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Adam Brusselback
On Mon, Sep 26, 2016 at 3:21 PM, Kevin Grittner  wrote:

> On Mon, Sep 26, 2016 at 2:04 PM, Rakesh Kumar
>  wrote:
>
> > Does PG have a concept of MV log, from where it can detect the
> > delta changes and apply  incremental changes quickly.
>
> That is what I am trying to work toward with the patch I cited in
> an earlier post.  Once some variation of that is in, the actual
> incremental maintenance can be build on top of it.  To see an
> example of what would be done with such a delta relation for a
> simple MV, using the count algorithm, see below:
>

Well I feel like I've learned a ton already reading through the links you
provided earlier and that example above.

I'm very interested in getting this into core. I'll look into what I need
to do to review. Not crazy familiar with C, as I mainly do Java
development. I'll see if I can help in any way though.

The main reason I was working on an alternative is because I need something
now rather than in a couple years, but I've been dealing with manually
creating the few I do need for my database. What I proposed above was just
me thinking about what could be done with things as they are. Obviously
it's junk compared to a real solution in-core.  Would you consider my
approach even worth trying, or should I just suck it up and do things
manually for now and put that effort into getting incremental refresh into
core?


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Kevin Grittner
On Mon, Sep 26, 2016 at 2:04 PM, Rakesh Kumar
 wrote:

> Does PG have a concept of MV log, from where it can detect the
> delta changes and apply  incremental changes quickly.

That is what I am trying to work toward with the patch I cited in
an earlier post.  Once some variation of that is in, the actual
incremental maintenance can be build on top of it.  To see an
example of what would be done with such a delta relation for a
simple MV, using the count algorithm, see below:


--
--
-- This file demonstrates how incremental maintenance of materialized views
-- is intended to work using the counting algorithm.  TEMP tables are used
-- to mock up both snapshots of relations and delta relations which will
-- be created automatically "behind the scenes" as part of the algorithm.
--
-- Delta relations for both base tables and materialized views will need a
-- "count(t)" column to track the number of alternative derivations of the
-- tuple.  The column is actually named that in this example.  Where the
-- count is needed by the algorithm and a relation doesn't have it
-- (because, for example, it is a base table), 1 is used.  This is kept in
-- the example for purposes of illustration.  While it is explicitly added
-- to the materialized views for this example, the idea would be that
-- specifying these materialized views using DISTINCT would implicitly add
-- the "count(t)" column when incremental maintenance was specified.
--
-- The logic for a JOIN is that the resulting row should have the product
-- of multiplying the two input row counts.  Calculating a delta for that
-- requires two such joins which are then processed by a UNION with special
-- semantics.  Those semantics are achieved in this example by feeding
-- UNION ALL results to an outer SELECT which uses GROUP BY and HAVING
-- clauses.
--
-- This example dodges some complex and performance-sapping problems that
-- occur when duplicate rows may be present.  It does so with a combination
-- of a PRIMARY KEY declaration in the base table and GROUP BY clauses in
-- the materialized views.
--
-- The temporary relation names in the example are chosen for clarity of
-- the example.  If temporary relation names are actually needed in the
-- implementation, they would need to be different, probably based on the
-- related permanent relation OID, for length.
--
--

-- Set up the base table used for these materialized views.
CREATE TABLE link
(
  src  text not null,
  dst  text not null,
  primary key (src, dst)
);
INSERT INTO link
  VALUES ('a','b'),('b','c'),('b','e'),('a','d'),('d','c');

-- Display and capture "before" image.
SELECT * FROM link ORDER BY 1, 2;
CREATE TEMP TABLE link1 AS SELECT * FROM link;

-- Create and display the initial state of the "hop" MV.
CREATE MATERIALIZED VIEW hop AS
  SELECT t1.src, t2.dst, count(*) AS "count(t)"
FROM link t1
JOIN link t2 ON (t2.src = t1.dst)
GROUP BY t1.src, t2.dst;
SELECT * FROM hop;

-- Execute a transaction that modifies the base table.
-- The assumption is that the changes will be applied shortly after commit.
BEGIN;
DELETE FROM link WHERE src = 'b' AND dst = 'e';
INSERT INTO link VALUES ('c','h'),('f','g');
COMMIT;

-- Display and capture "after" image.
SELECT * FROM link ORDER BY 1, 2;
CREATE TEMP TABLE link2 AS SELECT * FROM link;

-- Mock up the delta(link) table which will get created "behind the scenes".
CREATE TEMP TABLE "Δ(link)" (src text NOT NULL, dst text NOT NULL,
"count(t)" int NOT NULL);
INSERT INTO "Δ(link)" VALUES ('b','e',-1),('c','h',1),('f','g',1);
SELECT * FROM "Δ(link)" ORDER BY 1, 2;

-- Given link before and after images, and delta(link), calculate delta(hop).
-- This must happen while the "before" and "after" snapshots are still
registered,
-- but not necessarily in the same transaction as the modifications to
the base table.
CREATE TEMP TABLE "Δ(hop)"
  (src text NOT NULL, dst text NOT NULL, "count(t)" int NOT NULL);
INSERT INTO "Δ(hop)"
  SELECT src, dst, sum("count(t)") AS "count(t)"
FROM (
   SELECT delta.src, before.dst, delta."count(t)" * 1
 FROM "Δ(link)" delta
 JOIN link1 before ON (before.src = delta.dst)
   UNION ALL
   SELECT after.src, delta.dst, 1 * delta."count(t)"
 FROM link2 after
 JOIN "Δ(link)" delta ON (delta.src = after.dst)
 ) x(src, dst, "count(t)")
GROUP BY src, dst
HAVING sum("count(t)") <> 0;
SELECT * FROM "Δ(hop)" ORDER BY 1, 2;

-- Once the MV delta has been generated, the snapshots can be released.
-- We're using temp tables for demonstration purposes, so drop those,
-- and the base table's delta.
DROP TABLE link1, link2, "Δ(link)";

-- At some later time the MV delta is processed "behind the scenes".
-- We can't do the demonstration maintenance against the MV, so copy it.
CREATE TEMP TABLE hop2 AS SELECT * FROM

Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Rakesh Kumar
> Of course 9.5 is the current release so the answer is Yes, since 9.5

https://www.postgresql.org/docs/current/static/sql-creatematerializedview.html

What am I missing. I don't see any support for incremental refresh.

Just in case we are not speaking the same thing:  When a MV is created for the 
first time on a base table, all further updates on the base table, will be 
handled by only applying the delta changes to the base table. That is called 
incremental refresh. Check how DB2 does it:

http://www.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r977.html

Or does " REFRESH MATERIALIZED VIEW." does it only incremental refresh.

Does PG have a concept of MV log, from where it can detect the delta changes 
and apply  incremental changes quickly.


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Kevin Grittner
On Mon, Sep 26, 2016 at 1:22 PM, Melvin Davidson  wrote:
> On Mon, Sep 26, 2016 at 1:56 PM, Adam Brusselback  
> wrote:
>>
>> I am working on a plan to implement incrementally refreshed
>> materialized "views" with the existing functionality in
>> Postgres.
>>
>> Below is the plan for doing that:

>> [design from scratch, incomplete]

>> I am however stuck on: How do we know what to refresh?

>> Pretty much, I need to figure out how to follow the joins in the
>> view back to whatever key was defined as the "refresh key" for
>> each dependent table.  I know about the
>> information_schema.view_column_usage, but I don't think that'll
>> get me everything I need.
>>
>> I'd really appreciate any help with this, as i'd love a better
>> way to get eagerly refreshed materialized views in Postgres
>> rather than doing everything manually as I have to now.
>>
>> If I can provide any more info please let me know.

> I am a bit curious. Why are you reinventing the wheel?
> What is wrong with:
>
> REFRESH MATERIALIZED VIEW [ CONCURRENTLY ] name
> [ WITH [ NO ] DATA ]
>
> https://www.postgresql.org/docs/9.4/static/sql-refreshmaterializedview.html
>
> Can't you do that in a cron job?

Well, that is not *incremental* maintenance -- the entire query is
executed, with the resulting relation either replacing the previous
contents of the matview or "diffed" against the previous contents
(so that the difference can be applied with transactional
semantics), depending on whether CONCURRENTLY was specified.

The OP is still reinventing the wheel though.  A summary of
available techniques as of the mid-90s can be found here:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.2254&rep=rep1&type=pdf

With some detail for what to me look like the two most promising
techniques here:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.31.3208&rep=rep1&type=pdf

The first step in using either of those techniques (counting or
DRed) is to capture a delta relation to feed into the relational
algebra used by these techniques.  As a first step in that
direction I have been floating a patch to implement the
SQL-standard "transition tables" feature for AFTER triggers.

https://commitfest.postgresql.org/10/778/

If you want to help that effort, reading the thread and reviewing
the patch would be good.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Adam Brusselback
On Mon, Sep 26, 2016 at 2:35 PM, Rob Sargent  wrote:

> Of course 9.5 is the current release so the answer is Yes, since 9.5
>
> It seems like there is some confusion about what we're talking about. I am
talking about incremental updates to a sort of "fake" materialized view
(implemented as a table).  This is totally different than what we currently
have implemented for actual materialized views (REFRESH CONCURRENTLY).
Concurrent refresh just minimizes the time the view is locked by building a
second table in the background and merging the changes between them.


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Adam Brusselback
I require eagerly refreshed materialized views for my use case, which is
something Postgres does not currently support.  I need my updates to a
table the view refers to visible within the same transaction, and often it
is a single change to one row which will only effect a single row in the
view.  If I used materialized views as they're currently implemented, that
would run the entire query and replace what needs to change, but it has to
run that entire query to do so.  For my use case, that is totally out of
the question to do for every insert / update / delete that could effect the
view.

For example, if I had a account balance materialized view that pretty much
summed transactions for a specific user from two different tables and
displayed a "balance" for every user, I would want that to only run the
query for the refresh for the specific user(s) that just had data
inserted/updated/deleted.  Not every user in the system after every
statement.

I've pretty much implemented this manually for some specific views which
performed horribly in Postgres (but would work fine in SQL Server for
example). I am looking to do this in a generic way so it's easier to
implement when necessary, and can be packaged as an extension for others
who may need to use it.

Obviously if we had better support for statement level triggers (so we
could reference all the NEW / OLD values touched by a statement) this would
be lower overhead, but that is one of the things holding up incrementally
refreshed materialized views from being implemented in the first place. I
just thought up a way to do it which gets around not having better
statement level triggers and wanted to see where I could get with things as
they are.


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Rob Sargent

Of course 9.5 is the current release so the answer is Yes, since 9.5

On 09/26/2016 12:29 PM, Rakesh Kumar wrote:


*Does PG support INCREMENTAL MV ? Looks like not (until 9.5)*





Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Rakesh Kumar
Does PG support INCREMENTAL MV ? Looks like not (until 9.5)


Re: [GENERAL] Incrementally refreshed materialized view

2016-09-26 Thread Melvin Davidson
On Mon, Sep 26, 2016 at 1:56 PM, Adam Brusselback  wrote:

> Hello all,
> I am working on a plan to implement incrementally refreshed materialized
> "views" with the existing functionality in Postgres.
>
> Below is the plan for doing that:
>
> Trigger based eagerly updated materialized tables for Postgres 9.5
>>
>>
>>
>> High level plan:
>>
>> Have a view definition stored in the database which we can use for
>> reference.  Create functions which will read that view definition, and
>> create a materialized table with all the same columns as the reference
>> view, create triggers on all tables the view depends on to keep the
>> materialized table fresh within a transaction.  All queries would hit the
>> materialized table, the view is just there so we know what dependencies to
>> track, and have an easy way to update the materialized table.
>>
>>
>>
>> How do we actually do the refresh?
>>
>> 1.   A refresh key is defined for the materialized table.
>>
>> 2.   Every dependent table must roll up to that refresh key so we
>> know what rows to refresh.
>>
>> 3.   That key should be able to be referenced in the views where
>> clause performantly so we can refresh just the rows that match the refresh
>> key using the view.
>>
>> 4.   The refresh will be done by deleting any existing rows with the
>> key, and inserting new ones with the key from the view.
>>
>> How do we know what to refresh?
>>
>> 1.   A before statement trigger to create a temp table to log all
>> changes.
>>
>> 2.   A for each row trigger to log the rows modified by DML.
>>
>> a.   This should be done at the refresh key level.
>>
>>i.  We
>> need to figure out a way to generate queries to roll up things multiple
>> levels on the dependency chain until we get to the refresh key.  Not sure
>> at all how to do that.
>>
>> 3.   An after statement trigger to run a refresh on the materialized
>> table, looking at only the rows touched by the DML.
>>
>
> I am however stuck on: How do we know what to refresh?  -> Step 2
> Pretty much, I need to figure out how to follow the joins in the view back
> to whatever key was defined as the "refresh key" for each dependent table.
> I know about the information_schema.view_column_usage, but I don't think
> that'll get me everything I need.
>
> I'd really appreciate any help with this, as i'd love a better way to get
> eagerly refreshed materialized views in Postgres rather than doing
> everything manually as I have to now.
>
> If I can provide any more info please let me know.
> Thanks,
> -Adam
>


*I am a bit curious. Why are you reinventing the wheel?*

*What is wrong with:*

*REFRESH MATERIALIZED VIEW [ CONCURRENTLY ] name
[ WITH [ NO ] DATA ]*



*https://www.postgresql.org/docs/9.4/static/sql-refreshmaterializedview.html
*



*Can't you do that in a cron job?*
-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.