[ https://issues.apache.org/jira/browse/CASSANDRA-13547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067734#comment-16067734 ]
Krishna Dattu Koneru edited comment on CASSANDRA-13547 at 6/29/17 4:56 AM: --------------------------------------------------------------------------- Thanks [~jasonstack] ! I will try to rework on {{1. Missing Update}} to address your comment. About {quote} Using the greater timestamp from view's columns(pk+non-pk) in base row will later shadow entire row in view if there is a normal column in base as primary key in view. {quote} This looks like a nasty problem. This patch does not cause this. This is a existing behaviour that any updates to view's pk columns will make old row marked as tombstone (at highest timestamp of all columns in base row) and will create a new row with updated pk. See view row timestamps in the below example : Existing behaviour without patch: {code} INSERT INTO test (a, b, c, d) VALUES (1, 1, 1, 1) using timestamp 0; test : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=1 ts=0], [d=1 ts=0] mv_test1 : [1]@0 Row[info=[ts=0] ]: 1 | [c=1 ts=0], [d=1 ts=0] {code} {code} UPDATE test using timestamp 5 set c = 0 WHERE a=1; test : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=0 ts=5], [d=1 ts=0] mv_test1 : [1]@0 Row[info=[ts=0] ]: 1 | [c=0 ts=5], [d=1 ts=0] {code} {code} UPDATE test using timestamp 1 set b = 0 WHERE a=1; test : [1]@0 Row[info=[ts=0] ]: | [b=0 ts=1], [c=0 ts=5], [d=1 ts=0] mv_test1 : [1]@0 Row[info=[ts=1] ]: 0 | [c=0 ts=5], [d=1 ts=0] [1]@39 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498707556(shadowable) ]: 1 | {code} {code} UPDATE test using timestamp 2 set b = 1 WHERE a=1; table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=2], [c=0 ts=5], [d=1 ts=0] View (before compaction) [1]@0 Row[info=[ts=1] ]: 0 | [c=0 ts=5], [d=1 ts=0] [1]@39 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498707556(shadowable) ]: 1 | [1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498707897(shadowable) ]: 0 | [1]@30 Row[info=[ts=2] ]: 1 | [c=0 ts=5], [d=1 ts=0] View (after compaction) [1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498707897(shadowable) ]: 0 | [1]@31 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498707556(shadowable) ]: 1 | {code} With this patch : {code} INSERT INTO test (a, b, c, d) VALUES (1, 1, 1, 1) using timestamp 0; table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=1 ts=0], [d=1 ts=0] view : [1]@0 Row[info=[ts=0] ]: 1 | [c=1 ts=0], [d=1 ts=0] {code} {code} UPDATE test using timestamp 5 set c = 0 WHERE a=1; UPDATE test using timestamp 1 set b = 0 WHERE a=1; table : [1]@0 Row[info=[ts=0] ]: | [b=0 ts=1], [c=0 ts=5], [d=1 ts=0] view : [1]@0 Row[info=[ts=5] ]: 0 | [c=0 ts=5], [d=1 ts=0] /* row ts=5 because of this patch */ [1]@38 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498708886(shadowable) ]: 1 | {code} {code} UPDATE test using timestamp 2 set b = 1 WHERE a=1; table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=2], [c=0 ts=5], [d=1 ts=0] View (before compaction) [1]@0 Row[info=[ts=5] ]: 0 | [c=0 ts=5], [d=1 ts=0] /* row ts=5 because of this patch */ [1]@38 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498708886(shadowable) ]: 1 | [1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498709103(shadowable) ]: 0 | [1]@30 Row[info=[ts=5] ]: 1 | [c=0 ts=5], [d=1 ts=0] /*-- row ts=5 because of this patch */ View (after compaction) [1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498709103(shadowable) ]: 0 | [1]@31 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498708886(shadowable) ]: 1 | {code} I am not sure yet how to fix this issue ... given that if live row and tombstone have same timestamp , tombstone wins. Another problem is that all view deletes are marked as shadowable. But then that is a different problem and I belive it is being fixed in https://issues.apache.org/jira/browse/CASSANDRA-13409 . was (Author: krishna.koneru): Thanks [~jasonstack] ! I will try to rework on {{1. Missing Update}} to address your comment. About {quote} Using the greater timestamp from view's columns(pk+non-pk) in base row will later shadow entire row in view if there is a normal column in base as primary key in view. {quote} This looks like a nasty problem. This patch does not cause this. This is a existing behaviour that any updates to view's pk columns will make old row marked as tombstone (at highest timestamp of all columns in base row) and will create a new row with updated pk. See view row timestamps in the below example : Existing behaviour without patch: {code} INSERT INTO test (a, b, c, d) VALUES (1, 1, 1, 1) using timestamp 0; test : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=1 ts=0], [d=1 ts=0] mv_test1 : [1]@0 Row[info=[ts=0] ]: 1 | [c=1 ts=0], [d=1 ts=0] {code} {code} UPDATE test using timestamp 5 set c = 0 WHERE a=1; test : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=0 ts=5], [d=1 ts=0] mv_test1 : [1]@0 Row[info=[ts=0] ]: 1 | [c=0 ts=5], [d=1 ts=0] {code} {code} UPDATE test using timestamp 1 set b = 0 WHERE a=1; test : [1]@0 Row[info=[ts=0] ]: | [b=0 ts=1], [c=0 ts=5], [d=1 ts=0] mv_test1 : [1]@0 Row[info=[ts=1] ]: 0 | [c=0 ts=5], [d=1 ts=0] [1]@39 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498707556(shadowable) ]: 1 | {code} {code} UPDATE test using timestamp 2 set b = 1 WHERE a=1; table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=2], [c=0 ts=5], [d=1 ts=0] View (before compaction) [1]@0 Row[info=[ts=1] ]: 0 | [c=0 ts=5], [d=1 ts=0] [1]@39 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498707556(shadowable) ]: 1 | [1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498707897(shadowable) ]: 0 | [1]@30 Row[info=[ts=2] ]: 1 | [c=0 ts=5], [d=1 ts=0] View (after compaction) [1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498707897(shadowable) ]: 0 | [1]@31 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498707556(shadowable) ]: 1 | {code} With this patch : {code} INSERT INTO test (a, b, c, d) VALUES (1, 1, 1, 1) using timestamp 0; table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=0], [c=1 ts=0], [d=1 ts=0] view : [1]@0 Row[info=[ts=0] ]: 1 | [c=1 ts=0], [d=1 ts=0] {code} {code} UPDATE test using timestamp 5 set c = 0 WHERE a=1; UPDATE test using timestamp 1 set b = 0 WHERE a=1; table : [1]@0 Row[info=[ts=0] ]: | [b=0 ts=1], [c=0 ts=5], [d=1 ts=0] view : [1]@0 Row[info=[ts=5] ]: 0 | [c=0 ts=5], [d=1 ts=0] /* row ts=5 because of this patch */ [1]@38 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498708886(shadowable) ]: 1 | {code} {code} UPDATE test using timestamp 2 set b = 1 WHERE a=1; table : [1]@0 Row[info=[ts=0] ]: | [b=1 ts=2], [c=0 ts=5], [d=1 ts=0] View (before compaction) [1]@0 Row[info=[ts=5] ]: 0 | [c=0 ts=5], [d=1 ts=0] /* row ts=5 because of this patch */ [1]@38 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498708886(shadowable) ]: 1 | [1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498709103(shadowable) ]: 0 | [1]@30 Row[info=[ts=5] ]: 1 | [c=0 ts=5], [d=1 ts=0] /*-- row ts=5 because of this patch */ View (after compaction) [1]@0 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498709103(shadowable) ]: 0 | [1]@31 Row[info=[ts=-9223372036854775808] del=deletedAt=5, localDeletion=1498708886(shadowable) ]: 1 | {code} I am not sure yet how to fix this issue ... given that if live row and tombstone have same timestamp , tombstone wins. Another problem is that these tombstones should not be marked as shadowable. But then that is a different problem and I belive it is being fixed in https://issues.apache.org/jira/browse/CASSANDRA-13409 . > Filtered materialized views missing data > ---------------------------------------- > > Key: CASSANDRA-13547 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13547 > Project: Cassandra > Issue Type: Bug > Components: Materialized Views > Environment: Official Cassandra 3.10 Docker image (ID 154b919bf8ce). > Reporter: Craig Nicholson > Assignee: Krishna Dattu Koneru > Priority: Blocker > Labels: materializedviews > Fix For: 3.11.x > > > When creating a materialized view against a base table the materialized view > does not always reflect the correct data. > Using the following test schema: > {code:title=Schema|language=sql} > DROP KEYSPACE IF EXISTS test; > CREATE KEYSPACE test > WITH REPLICATION = { > 'class' : 'SimpleStrategy', > 'replication_factor' : 1 > }; > CREATE TABLE test.table1 ( > id int, > name text, > enabled boolean, > foo text, > PRIMARY KEY (id, name)); > CREATE MATERIALIZED VIEW test.table1_mv1 AS SELECT id, name, foo > FROM test.table1 > WHERE id IS NOT NULL > AND name IS NOT NULL > AND enabled = TRUE > PRIMARY KEY ((name), id); > CREATE MATERIALIZED VIEW test.table1_mv2 AS SELECT id, name, foo, enabled > FROM test.table1 > WHERE id IS NOT NULL > AND name IS NOT NULL > AND enabled = TRUE > PRIMARY KEY ((name), id); > {code} > When I insert a row into the base table the materialized views are updated > appropriately. (+) > {code:title=Insert row|language=sql} > cqlsh> INSERT INTO test.table1 (id, name, enabled, foo) VALUES (1, 'One', > TRUE, 'Bar'); > cqlsh> SELECT * FROM test.table1; > id | name | enabled | foo > ----+------+---------+----- > 1 | One | True | Bar > (1 rows) > cqlsh> SELECT * FROM test.table1_mv1; > name | id | foo > ------+----+----- > One | 1 | Bar > (1 rows) > cqlsh> SELECT * FROM test.table1_mv2; > name | id | enabled | foo > ------+----+---------+----- > One | 1 | True | Bar > (1 rows) > {code} > Updating the record in the base table and setting enabled to FALSE will > filter the record from both materialized views. (+) > {code:title=Disable the row|language=sql} > cqlsh> UPDATE test.table1 SET enabled = FALSE WHERE id = 1 AND name = 'One'; > cqlsh> SELECT * FROM test.table1; > id | name | enabled | foo > ----+------+---------+----- > 1 | One | False | Bar > (1 rows) > cqlsh> SELECT * FROM test.table1_mv1; > name | id | foo > ------+----+----- > (0 rows) > cqlsh> SELECT * FROM test.table1_mv2; > name | id | enabled | foo > ------+----+---------+----- > (0 rows) > {code} > However a further update to the base table setting enabled to TRUE should > include the record in both materialzed views, however only one view > (table1_mv2) gets updated. (-) > It appears that only the view (table1_mv2) that returns the filtered column > (enabled) is updated. (-) > Additionally columns that are not part of the partiion or clustering key are > not updated. You can see that the foo column has a null value in table1_mv2. > (-) > {code:title=Enable the row|language=sql} > cqlsh> UPDATE test.table1 SET enabled = TRUE WHERE id = 1 AND name = 'One'; > cqlsh> SELECT * FROM test.table1; > id | name | enabled | foo > ----+------+---------+----- > 1 | One | True | Bar > (1 rows) > cqlsh> SELECT * FROM test.table1_mv1; > name | id | foo > ------+----+----- > (0 rows) > cqlsh> SELECT * FROM test.table1_mv2; > name | id | enabled | foo > ------+----+---------+------ > One | 1 | True | null > (1 rows) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org