Re: [HACKERS] - GSoC - snapshot materialized view (work-in-progress) patch

2010-07-12 Thread Pavel Baroš

Dne 9.7.2010 21:33, Robert Haas napsal(a):

2010/7/8 Pavel Barošbaro...@seznam.cz:

Description of patch:
1) can create MV, and is created uninitialized with data
   CREATE MATERIALIZED VIEW mvname AS SELECT ...


This doesn't seem acceptable.  It should populate it on creation.



Yes, it would be better, in addition, true is, this behavior will be 
required if is expected to implement incremental MV in the close future.



2) can refresh MV
   ALTER MATERIALIZED VIEW mvname REFRESH

3) MV cannot be modified by DML commands (INSERT, UPDATE and DELETE are not
permitted)

4) index can be created and used with MV

5) pg_dump is repaired, in previous patch dump threw error, now dont, but it
is sort of dummy, I want to reach state, where refreshing command will be
posed after all COPY statements (when all data are in tables). In this patch
REFRESH command is right behind CREATE MV command.


Hmm... ISTM that you probably need some kind of dependency stuff in
here to make the materialized view get created after the tables it
depends on have been populated with data.  It needs to work with
parallel restore, too.  I'm not sure exactly how the dependency stuff
in pg_dump works, though.



never mind in case MV will be populated on creation.


A subtle point here is that if you dump and restore a database
containing a materialized view, the new database might not be quite
the same as the old one, because the materialized view might have been
out of date before, and when you recreate it, it'll get refreshed.
I'm not sure there's much we can/should do about that, though.



yes, it is interesting, of course, there can be real-life example, where 
population on creating is needed and is not, and I'm thinking of 
solution similar to Oracle or DB2. Add some option to creating MV, that 
enable/disable population on creating:


http://www.ibm.com/developerworks/data/library/techarticle/dm-0708khatri/

Oracle:
  CREATE MATERIALIZED VIEW mvname
  [ BUILD [IMMEDIATE | DEFERRED] ]
  AS SELECT ..

DB2:
  CREATE TABLE mvname
  AS SELECT ...
  [ INITIALLY DEFERRED | IMMEDIATE ]


6) psql works too, new command \dm[S+] was added to the list
  \d[S+] [PATTERN]   - lists all db objects like tables, view, materialized
view and sequences
  \dm[S+] [PATTERN]  - lists all materialized views



I also noticed I forgot handle options \dp and \dpp, this should be OK 
in next version of patch.



7) there are some docs too, but I guess it is not enough, at least my
english will need to correct


If we're going to treat materialized views as a separate object type,
you probably need to break out the docs for CREATE MATERIALIZED VIEW,
ALTER MATERIALIZED VIEW, and DROP MATERIALIZED VIEW into their own
pages, rather than having then mixed up with corresponding pages for
regular views.



Yeah, that was problem I just solved like that here, but I confess this 
would be better.




In progress:
- regression tests
- behavior of various ALTER commands, ie SET STATISTIC, CLUSTER ON,
ENABLE/DISABLE RULE, etc.


This isn't right:

rhaas=# create view v as select * from t;
CREATE VIEW
rhaas=# alter view v refresh;
ERROR:  unrecognized alter table type: 41



I know, cases like that will be more than that. Thats why I work on good 
tests now.



Please add your patch here, so that it will be reviewed during the
about-to-begin CommitFest.

https://commitfest.postgresql.org/action/commitfest_view/open



OK, but will you help me with that form? Do you think I can fill it like 
that? I'm not sure about few fields ..


Name: Snapshot materialized views
CommitFest Topic: [ Miscellaneous | SQL Features ] ???
Patch Status: Needs review
Author:   me
Reviewers:You?
Commiters:who?

and I quess fields 'Date Closed' and 'Message-ID for Original Patch' 
will be filled later.



thanks a lot


Pavel Baros


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] - GSoC - snapshot materialized view (work-in-progress) patch

2010-07-12 Thread Pavel Baroš

Dne 9.7.2010 21:33, Robert Haas napsal(a):

2010/7/8 Pavel Barošbaro...@seznam.cz:

Description of patch:
1) can create MV, and is created uninitialized with data
   CREATE MATERIALIZED VIEW mvname AS SELECT ...


This doesn't seem acceptable.  It should populate it on creation.



Yes, it would be better, in addition, true is, this behavior will be 
required if is expected to implement incremental MV in the close future.



2) can refresh MV
   ALTER MATERIALIZED VIEW mvname REFRESH

3) MV cannot be modified by DML commands (INSERT, UPDATE and DELETE 
are not

permitted)

4) index can be created and used with MV

5) pg_dump is repaired, in previous patch dump threw error, now dont, 
but it
is sort of dummy, I want to reach state, where refreshing command 
will be
posed after all COPY statements (when all data are in tables). In 
this patch

REFRESH command is right behind CREATE MV command.


Hmm... ISTM that you probably need some kind of dependency stuff in
here to make the materialized view get created after the tables it
depends on have been populated with data.  It needs to work with
parallel restore, too.  I'm not sure exactly how the dependency stuff
in pg_dump works, though.



never mind in case MV will be populated on creation.


A subtle point here is that if you dump and restore a database
containing a materialized view, the new database might not be quite
the same as the old one, because the materialized view might have been
out of date before, and when you recreate it, it'll get refreshed.
I'm not sure there's much we can/should do about that, though.



yes, it is interesting, of course, there can be real-life example, where 
population on creating is needed and is not, and I'm thinking of 
solution similar to Oracle or DB2. Add some option to creating MV, that 
enable/disable population on creating:


http://www.ibm.com/developerworks/data/library/techarticle/dm-0708khatri/

Oracle:
  CREATE MATERIALIZED VIEW mvname
  [ BUILD [IMMEDIATE | DEFERRED] ]
  AS SELECT ..

DB2:
  CREATE TABLE mvname
  AS SELECT ...
  [ INITIALLY DEFERRED | IMMEDIATE ]


6) psql works too, new command \dm[S+] was added to the list
  \d[S+] [PATTERN]   - lists all db objects like tables, view, 
materialized

view and sequences
  \dm[S+] [PATTERN]  - lists all materialized views



I also noticed I forgot handle options \dp and \dpp, this should be OK 
in next version of patch.



7) there are some docs too, but I guess it is not enough, at least my
english will need to correct


If we're going to treat materialized views as a separate object type,
you probably need to break out the docs for CREATE MATERIALIZED VIEW,
ALTER MATERIALIZED VIEW, and DROP MATERIALIZED VIEW into their own
pages, rather than having then mixed up with corresponding pages for
regular views.



Yeah, that was problem I just solved like that here, but I confess this 
would be better.




In progress:
- regression tests
- behavior of various ALTER commands, ie SET STATISTIC, CLUSTER ON,
ENABLE/DISABLE RULE, etc.


This isn't right:

rhaas=# create view v as select * from t;
CREATE VIEW
rhaas=# alter view v refresh;
ERROR:  unrecognized alter table type: 41



I know, cases like that will be more than that. Thats why I work on good 
tests now.



Please add your patch here, so that it will be reviewed during the
about-to-begin CommitFest.

https://commitfest.postgresql.org/action/commitfest_view/open



OK, but will you help me with that form? Do you think I can fill it like 
that? I'm not sure about few fields ..


Name: Snapshot materialized views
CommitFest Topic: [ Miscellaneous | SQL Features ] ???
Patch Status: Needs review
Author:   me
Reviewers:You?
Commiters:who?

and I quess fields 'Date Closed' and 'Message-ID for Original Patch' 
will be filled later.



thanks a lot


Pavel Baros



Re: [HACKERS] GSoC - code of implementation of materialized views

2010-06-29 Thread Pavel Baroš

Robert Haas napsal(a):

2010/6/25 Pavel Baros baro...@seznam.cz:
  

On http://github.com/pbaros/postgres can be seen changes and my attempt to
implement materialized views. The first commit to the repository implements
following:

Materialized view can be created, dropped and used in SELECT statement.

CREATE MATERIALIZED VIEW mvname AS SELECT ...;
DROP MATERIALIZED VIEW mvname [CASCADE];
SELECT * FROM mvname;

also works:
COMMENT ON MATERIALIZED VIEW mvname IS 'etc.';
SELECT pg_get_viewdef(mvname);
  

... also you can look at enclosed patch.



So, this patch doesn't actually seem to do very much.  It doesn't
appear that creating the materialized view actually populates it with
any data; and the refresh command doesn't work either.  So it appears
that you can create a materialized view, but it won't actually
contain any data - which doesn't seem at all useful.

  


Yeah, it is my fault, I did not mentioned that this patch is not final. 
It is only small part of whole implementation. I wanted to show just 
this, because I think that is the part that should not change much. And 
to show I did something, I am not ignoring GSoC. Now I can fully focus 
on the program.


Most of the problems you mentioned (except pg_dump) I have implemented 
and I will post it to HACKERS soon. Until now I've not had much time, 
because I just finished my BSc. studies yesterday.


And again, sorry for misunderstanding.

Pavel Baros


Some other problems:

- The command tag for CREATE MATERIALIZED VIEW should return CREATE
MATERIALIZED VIEW rather than CREATE VIEW, since we're treating it as
a separate object type.  I note that dropping a materialized view
already uses DROP MATERIALIZED VIEW, so right now it isn't
symmetrical.
- Using \d with no argument doesn't list materialized views.
- Using \d with a materialized view as an argument doesn't work
properly - the first line says something like ?m? public.m instead
of materialized view public.m.
- Using \d+ with a materialized view as an argument should probably
should the view definition.
- Using \dd doesn't list comments on materialized views.
- Commenting on a column of a materialized view should probably be allowed.
- pg_dump fails with a message like this: failed sanity check, parent
table OID 24604 of pg_rewrite entry OID 24607 not found
- ALTER MATERIALIZED VIEW name OWNER TO role, RENAME TO role, and SET
SCHEMA schema either fall to work or fail to parse (plan ALTER VIEW
also doesn't work on a materialized view)
- ALTER MATERIALIZED VIEW name SET/DROP DEFAULT also doesn't work,
which is OK: it shouldn't work.  But the error message needs work.
- The error message CREATE OR REPLACE on materialized view is not
support! shouldn't end with an exclamation point.
- The parser token OptMater should probably be called OptMaterialized
or opt_materialized, rather than abbreviating.
- There are no docs.
- There are no tests.

  



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC - Materialized Views - is stale or fresh?

2010-06-14 Thread Pavel Baroš

Heikki Linnakangas napsal(a):

On 12/06/10 17:18, Pavel Baros wrote:

I am curious how could I solve the problem:

During refreshing I would like to know, if MV is stale or fresh? And I
had an idea:

In fact, MV need to know if its last refresh (transaction id) is older
than any INSERT, UPDATE, DELETE transaction launched against source
tables. So if MV has information about last (highest) xmin in source
tables, it could simply compare its own xmin to xmins (xmax for deleted
rows) from source tables and decide, if is stale or fresh.

Whole realization could look like this:
1. Make new column in pg_class (or somewhere in pg_stat* ?):
pg_class.rellastxid (of type xid)

2. After each INSERT, UPDATE, DELETE statement (transaction)
pg_class.rellastxid would be updated. That should not be time- or
memory- consuming (not so much) since pg_class is cached, I guess.


rellastxid would have to be updated at every insert/update/delete. It 
would become a big bottleneck. That's not going to work.


Why do you need to know if a MV is stale?




Sorry I did not mention it. If we knew MV is fresh, there is no neeed to 
refresh MV and so it would prevent useless rebuilding of MV. So I 
thought there is room for saving some work.


Anyway, I realized, this idea do not cover all the cases how to find out 
MV is stale or fresh. For example, when updating a row of source table 
of MV, that do not participate in MV, in that case refreshing of MV 
would be useless too.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers