Re: [HACKERS] - GSoC - snapshot materialized view (work-in-progress) patch
Dne 9.7.2010 21:33, Robert Haas napsal(a): 2010/7/8 Pavel Barošbaro...@seznam.cz: Description of patch: 1) can create MV, and is created uninitialized with data CREATE MATERIALIZED VIEW mvname AS SELECT ... This doesn't seem acceptable. It should populate it on creation. Yes, it would be better, in addition, true is, this behavior will be required if is expected to implement incremental MV in the close future. 2) can refresh MV ALTER MATERIALIZED VIEW mvname REFRESH 3) MV cannot be modified by DML commands (INSERT, UPDATE and DELETE are not permitted) 4) index can be created and used with MV 5) pg_dump is repaired, in previous patch dump threw error, now dont, but it is sort of dummy, I want to reach state, where refreshing command will be posed after all COPY statements (when all data are in tables). In this patch REFRESH command is right behind CREATE MV command. Hmm... ISTM that you probably need some kind of dependency stuff in here to make the materialized view get created after the tables it depends on have been populated with data. It needs to work with parallel restore, too. I'm not sure exactly how the dependency stuff in pg_dump works, though. never mind in case MV will be populated on creation. A subtle point here is that if you dump and restore a database containing a materialized view, the new database might not be quite the same as the old one, because the materialized view might have been out of date before, and when you recreate it, it'll get refreshed. I'm not sure there's much we can/should do about that, though. yes, it is interesting, of course, there can be real-life example, where population on creating is needed and is not, and I'm thinking of solution similar to Oracle or DB2. Add some option to creating MV, that enable/disable population on creating: http://www.ibm.com/developerworks/data/library/techarticle/dm-0708khatri/ Oracle: CREATE MATERIALIZED VIEW mvname [ BUILD [IMMEDIATE | DEFERRED] ] AS SELECT .. DB2: CREATE TABLE mvname AS SELECT ... [ INITIALLY DEFERRED | IMMEDIATE ] 6) psql works too, new command \dm[S+] was added to the list \d[S+] [PATTERN] - lists all db objects like tables, view, materialized view and sequences \dm[S+] [PATTERN] - lists all materialized views I also noticed I forgot handle options \dp and \dpp, this should be OK in next version of patch. 7) there are some docs too, but I guess it is not enough, at least my english will need to correct If we're going to treat materialized views as a separate object type, you probably need to break out the docs for CREATE MATERIALIZED VIEW, ALTER MATERIALIZED VIEW, and DROP MATERIALIZED VIEW into their own pages, rather than having then mixed up with corresponding pages for regular views. Yeah, that was problem I just solved like that here, but I confess this would be better. In progress: - regression tests - behavior of various ALTER commands, ie SET STATISTIC, CLUSTER ON, ENABLE/DISABLE RULE, etc. This isn't right: rhaas=# create view v as select * from t; CREATE VIEW rhaas=# alter view v refresh; ERROR: unrecognized alter table type: 41 I know, cases like that will be more than that. Thats why I work on good tests now. Please add your patch here, so that it will be reviewed during the about-to-begin CommitFest. https://commitfest.postgresql.org/action/commitfest_view/open OK, but will you help me with that form? Do you think I can fill it like that? I'm not sure about few fields .. Name: Snapshot materialized views CommitFest Topic: [ Miscellaneous | SQL Features ] ??? Patch Status: Needs review Author: me Reviewers:You? Commiters:who? and I quess fields 'Date Closed' and 'Message-ID for Original Patch' will be filled later. thanks a lot Pavel Baros -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] - GSoC - snapshot materialized view (work-in-progress) patch
Dne 9.7.2010 21:33, Robert Haas napsal(a): 2010/7/8 Pavel Barošbaro...@seznam.cz: Description of patch: 1) can create MV, and is created uninitialized with data CREATE MATERIALIZED VIEW mvname AS SELECT ... This doesn't seem acceptable. It should populate it on creation. Yes, it would be better, in addition, true is, this behavior will be required if is expected to implement incremental MV in the close future. 2) can refresh MV ALTER MATERIALIZED VIEW mvname REFRESH 3) MV cannot be modified by DML commands (INSERT, UPDATE and DELETE are not permitted) 4) index can be created and used with MV 5) pg_dump is repaired, in previous patch dump threw error, now dont, but it is sort of dummy, I want to reach state, where refreshing command will be posed after all COPY statements (when all data are in tables). In this patch REFRESH command is right behind CREATE MV command. Hmm... ISTM that you probably need some kind of dependency stuff in here to make the materialized view get created after the tables it depends on have been populated with data. It needs to work with parallel restore, too. I'm not sure exactly how the dependency stuff in pg_dump works, though. never mind in case MV will be populated on creation. A subtle point here is that if you dump and restore a database containing a materialized view, the new database might not be quite the same as the old one, because the materialized view might have been out of date before, and when you recreate it, it'll get refreshed. I'm not sure there's much we can/should do about that, though. yes, it is interesting, of course, there can be real-life example, where population on creating is needed and is not, and I'm thinking of solution similar to Oracle or DB2. Add some option to creating MV, that enable/disable population on creating: http://www.ibm.com/developerworks/data/library/techarticle/dm-0708khatri/ Oracle: CREATE MATERIALIZED VIEW mvname [ BUILD [IMMEDIATE | DEFERRED] ] AS SELECT .. DB2: CREATE TABLE mvname AS SELECT ... [ INITIALLY DEFERRED | IMMEDIATE ] 6) psql works too, new command \dm[S+] was added to the list \d[S+] [PATTERN] - lists all db objects like tables, view, materialized view and sequences \dm[S+] [PATTERN] - lists all materialized views I also noticed I forgot handle options \dp and \dpp, this should be OK in next version of patch. 7) there are some docs too, but I guess it is not enough, at least my english will need to correct If we're going to treat materialized views as a separate object type, you probably need to break out the docs for CREATE MATERIALIZED VIEW, ALTER MATERIALIZED VIEW, and DROP MATERIALIZED VIEW into their own pages, rather than having then mixed up with corresponding pages for regular views. Yeah, that was problem I just solved like that here, but I confess this would be better. In progress: - regression tests - behavior of various ALTER commands, ie SET STATISTIC, CLUSTER ON, ENABLE/DISABLE RULE, etc. This isn't right: rhaas=# create view v as select * from t; CREATE VIEW rhaas=# alter view v refresh; ERROR: unrecognized alter table type: 41 I know, cases like that will be more than that. Thats why I work on good tests now. Please add your patch here, so that it will be reviewed during the about-to-begin CommitFest. https://commitfest.postgresql.org/action/commitfest_view/open OK, but will you help me with that form? Do you think I can fill it like that? I'm not sure about few fields .. Name: Snapshot materialized views CommitFest Topic: [ Miscellaneous | SQL Features ] ??? Patch Status: Needs review Author: me Reviewers:You? Commiters:who? and I quess fields 'Date Closed' and 'Message-ID for Original Patch' will be filled later. thanks a lot Pavel Baros
Re: [HACKERS] GSoC - code of implementation of materialized views
Robert Haas napsal(a): 2010/6/25 Pavel Baros baro...@seznam.cz: On http://github.com/pbaros/postgres can be seen changes and my attempt to implement materialized views. The first commit to the repository implements following: Materialized view can be created, dropped and used in SELECT statement. CREATE MATERIALIZED VIEW mvname AS SELECT ...; DROP MATERIALIZED VIEW mvname [CASCADE]; SELECT * FROM mvname; also works: COMMENT ON MATERIALIZED VIEW mvname IS 'etc.'; SELECT pg_get_viewdef(mvname); ... also you can look at enclosed patch. So, this patch doesn't actually seem to do very much. It doesn't appear that creating the materialized view actually populates it with any data; and the refresh command doesn't work either. So it appears that you can create a materialized view, but it won't actually contain any data - which doesn't seem at all useful. Yeah, it is my fault, I did not mentioned that this patch is not final. It is only small part of whole implementation. I wanted to show just this, because I think that is the part that should not change much. And to show I did something, I am not ignoring GSoC. Now I can fully focus on the program. Most of the problems you mentioned (except pg_dump) I have implemented and I will post it to HACKERS soon. Until now I've not had much time, because I just finished my BSc. studies yesterday. And again, sorry for misunderstanding. Pavel Baros Some other problems: - The command tag for CREATE MATERIALIZED VIEW should return CREATE MATERIALIZED VIEW rather than CREATE VIEW, since we're treating it as a separate object type. I note that dropping a materialized view already uses DROP MATERIALIZED VIEW, so right now it isn't symmetrical. - Using \d with no argument doesn't list materialized views. - Using \d with a materialized view as an argument doesn't work properly - the first line says something like ?m? public.m instead of materialized view public.m. - Using \d+ with a materialized view as an argument should probably should the view definition. - Using \dd doesn't list comments on materialized views. - Commenting on a column of a materialized view should probably be allowed. - pg_dump fails with a message like this: failed sanity check, parent table OID 24604 of pg_rewrite entry OID 24607 not found - ALTER MATERIALIZED VIEW name OWNER TO role, RENAME TO role, and SET SCHEMA schema either fall to work or fail to parse (plan ALTER VIEW also doesn't work on a materialized view) - ALTER MATERIALIZED VIEW name SET/DROP DEFAULT also doesn't work, which is OK: it shouldn't work. But the error message needs work. - The error message CREATE OR REPLACE on materialized view is not support! shouldn't end with an exclamation point. - The parser token OptMater should probably be called OptMaterialized or opt_materialized, rather than abbreviating. - There are no docs. - There are no tests. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC - Materialized Views - is stale or fresh?
Heikki Linnakangas napsal(a): On 12/06/10 17:18, Pavel Baros wrote: I am curious how could I solve the problem: During refreshing I would like to know, if MV is stale or fresh? And I had an idea: In fact, MV need to know if its last refresh (transaction id) is older than any INSERT, UPDATE, DELETE transaction launched against source tables. So if MV has information about last (highest) xmin in source tables, it could simply compare its own xmin to xmins (xmax for deleted rows) from source tables and decide, if is stale or fresh. Whole realization could look like this: 1. Make new column in pg_class (or somewhere in pg_stat* ?): pg_class.rellastxid (of type xid) 2. After each INSERT, UPDATE, DELETE statement (transaction) pg_class.rellastxid would be updated. That should not be time- or memory- consuming (not so much) since pg_class is cached, I guess. rellastxid would have to be updated at every insert/update/delete. It would become a big bottleneck. That's not going to work. Why do you need to know if a MV is stale? Sorry I did not mention it. If we knew MV is fresh, there is no neeed to refresh MV and so it would prevent useless rebuilding of MV. So I thought there is room for saving some work. Anyway, I realized, this idea do not cover all the cases how to find out MV is stale or fresh. For example, when updating a row of source table of MV, that do not participate in MV, in that case refreshing of MV would be useless too. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers