[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029741#comment-15029741 ]

Benjamin Lerer commented on CASSANDRA-8505:
-------------------------------------------

Thanks for the review.

> Invalid results are returned while secondary index are being build
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-8505
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8505
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>             Fix For: 2.2.x, 3.0.x
>
>
> If you request an index creation and then execute a query that uses the index,
> the results returned might be invalid until the index is fully built. This is
> caused by the fact that the table column will be marked as indexed before the
> index is ready.
> The following unit tests can be used to reproduce the problem:
> {code}
>     @Test
>     public void testIndexCreatedAfterInsert() throws Throwable
>     {
>         createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>
>         createIndex("CREATE INDEX ON %s(b)");
>
>         assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                    row(0, 1, 1),
>                    row(1, 1, 4));
>     }
>
>     @Test
>     public void testIndexCreatedBeforeInsert() throws Throwable
>     {
>         createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>         createIndex("CREATE INDEX ON %s(b)");
>
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>
>         assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                    row(0, 1, 1),
>                    row(1, 1, 4));
>     }
> {code}
> The first test will fail while the second will work.
> In my opinion the first test should reject the request as invalid (as if the
> index did not exist) until the index is fully built.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15028468#comment-15028468 ]

Sam Tunnicliffe commented on CASSANDRA-8505:
--------------------------------------------

+1
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15026852#comment-15026852 ]

Jim Witschey commented on CASSANDRA-8505:
-----------------------------------------

[~blerer] Noted, thanks for the warning.
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15026596#comment-15026596 ]

Benjamin Lerer commented on CASSANDRA-8505:
-------------------------------------------

{quote}Do you have any thoughts on that?{quote}
It makes sense to me. I have pushed the fixes for the 2 comments [here|https://github.com/apache/cassandra/compare/trunk...blerer:8505-2.2] and [here|https://github.com/apache/cassandra/compare/trunk...blerer:8505-3.0].
I fixed the DTest for {{secondary_indexes_test.TestSecondaryIndexesOnCollections.test_map_indexes}} [here|https://github.com/riptano/cassandra-dtest/pull/685/files]

[~philipthompson], [~mambocab] The changes in this ticket might cause some dtests to fail randomly (if a lot of data was inserted before the index was created). I had a look at the DTests but I might have missed some. If some tests using secondary indexes start failing once this ticket is committed, do not hesitate to assign them to me.
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15022173#comment-15022173 ]

Sam Tunnicliffe commented on CASSANDRA-8505:
--------------------------------------------

Looks pretty good to me, I just have a small nit & one question/suggestion about naming:

Firstly, the catch blocks for {{TombstoneOverwhelmingException}} and {{IndexNotAvailableException}} in {{MessageDeliveryTask::run}} can be combined into one multi-catch. Also, that catch for {{IndexNotAvailableException}} was only added in 3.0, and it seems it could also be done for the 2.2 branch.

On the question of naming, I wonder if perhaps the names of the new methods which check the state of the indexes ought to reflect the fact that the readiness is in regard to querying (i.e. an index will process updates as soon as it's created; the new flag just guards against its use in queries). So, on the 2.2 branch, we could rename {{SI::isReady}} to {{SI::isQueryable}} and on 3.0 {{SIM::isIndexReady}} -> {{SIM::isIndexQueryable}}. Do you have any thoughts on that?
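The multi-catch suggestion in the review comment above can be illustrated with a small stand-alone sketch. The two exception classes and the {{runHandler}} method below are hypothetical stand-ins declared only so the example compiles on its own; they are not Cassandra's {{TombstoneOverwhelmingException}}, {{IndexNotAvailableException}} or {{MessageDeliveryTask::run}}, and the returned strings merely mark which path was taken.

```java
// Hypothetical stand-ins for the two Cassandra exceptions named in the review.
class TombstoneOverwhelmingSketchException extends RuntimeException {}
class IndexNotAvailableSketchException extends RuntimeException {}

class MultiCatchSketch
{
    /**
     * Runs a handler and reports the outcome. Both failure types are handled
     * by a single multi-catch block instead of two identical catch blocks.
     */
    static String runHandler(Runnable handler)
    {
        try
        {
            handler.run();
            return "OK";
        }
        catch (TombstoneOverwhelmingSketchException | IndexNotAvailableSketchException e)
        {
            // A real delivery task would presumably send a failure response here;
            // this sketch only signals that the shared catch block was reached.
            return "FAILURE";
        }
    }
}
```

Any other exception type still propagates to the caller, so combining the two blocks should not change which errors are reported as failures.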
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011314#comment-15011314 ]

Benjamin Lerer commented on CASSANDRA-8505:
-------------------------------------------

I have pushed the fixes for 2.2 and 3.0 [here|https://github.com/apache/cassandra/compare/trunk...blerer:8505-2.2] and [here|https://github.com/apache/cassandra/compare/trunk...blerer:8505-3.0].

* The unit test results for 2.2 are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-2.2-testall/5/]
* The dtest results for 2.2 are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-2.2-dtest/4/]
* The unit test results for 3.0 are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-3.0-testall/3/]
* The dtest results for 3.0 are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-3.0-dtest/3/]
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001195#comment-15001195 ]

Sam Tunnicliffe commented on CASSANDRA-8505:
--------------------------------------------

We could certainly do that in a utest (and we have plenty of tests with such custom indexes), but not in a dtest, as it would require the custom index to be on the classpath. Naturally, a utest won't exercise the distributed side of things, but it's still better than no testing, so +1
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001168#comment-15001168 ]

Tyler Hobbs commented on CASSANDRA-8505:
----------------------------------------

It occurred to me that we could also create a custom secondary index that delayed the build completion, either by waiting for some sort of signal or by sleeping.
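The "wait for some sort of signal" variant of that idea could look roughly like the sketch below. It is only an illustration under stated assumptions: {{BlockingBuildSketch}} and its methods are hypothetical names and do not implement Cassandra's secondary index interfaces; the point is just that a latch lets a test hold the build open for as long as it needs to issue queries against the unfinished index.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical test helper; it does not implement any Cassandra index interface.
class BlockingBuildSketch
{
    // Released by the test once it has finished querying the unbuilt index.
    private final CountDownLatch release = new CountDownLatch(1);
    private volatile boolean built = false;

    /** Simulated initial build task: blocks until the test sends the signal. */
    Runnable initialBuildTask()
    {
        return () -> {
            try
            {
                release.await();
                built = true;
            }
            catch (InterruptedException e)
            {
                // Preserve the interrupt status and leave the index unbuilt.
                Thread.currentThread().interrupt();
            }
        };
    }

    /** The signal: lets the pending build task finish. */
    void allowBuildToComplete()
    {
        release.countDown();
    }

    boolean isBuilt()
    {
        return built;
    }
}
```

A test would start the build task on another thread, assert that queries are rejected while {{isBuilt()}} is false, then call {{allowBuildToComplete()}} and assert that the same query now succeeds. Compared with sleeping, the latch avoids any dependence on how fast the machine is.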
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001055#comment-15001055 ]

Sam Tunnicliffe commented on CASSANDRA-8505:
--------------------------------------------

bq. What about accepting either ReadFailureException or the complete, correct result? If index building got faster we might stop hitting the ReadFailureException case, but at least the test wouldn't flap.

Yes, absolutely. Even then, though, we're not going to be certain exactly what's being tested (if anything): e.g. if index building gets faster but we also introduce a regression with the {{ReadFailureException}}, we'd never know. But like I say, I can't think of anything better.
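The "accept either outcome" check discussed above might be sketched as follows. Everything here is an assumption made for illustration: {{ReadFailureSketchException}} stands in for the driver's {{ReadFailureException}}, rows are modelled as plain strings, and {{assertFailureOrRows}} is a hypothetical helper rather than part of any existing test framework.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the driver's ReadFailureException.
class ReadFailureSketchException extends RuntimeException {}

class EitherOutcomeSketch
{
    /**
     * Passes if the query is rejected with the read-failure stand-in or returns
     * exactly the expected rows; a partial or wrong result fails the check.
     * Returns which outcome occurred so the test can log it.
     */
    static String assertFailureOrRows(Supplier<List<String>> query, List<String> expected)
    {
        List<String> actual;
        try
        {
            actual = query.get();
        }
        catch (ReadFailureSketchException e)
        {
            return "REJECTED";
        }
        if (!expected.equals(actual))
            throw new AssertionError("Expected " + expected + " but got " + actual);
        return "COMPLETE";
    }
}
```

Returning the observed outcome lets a test record how often each branch is hit, which goes a small way towards the concern raised above that a test like this may silently stop exercising the rejection path.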
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001018#comment-15001018 ]

Benjamin Lerer commented on CASSANDRA-8505:
-------------------------------------------

I think we do not really have a choice. I could easily reproduce the problem with a unit test on my machine but CI is usually much slower. To reproduce it with cqlsh I had to put a breakpoint in the building task.
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000927#comment-15000927 ]

Tyler Hobbs commented on CASSANDRA-8505:
----------------------------------------

bq. It would be good to have some test coverage of this, although the best I could come up with is a dtest which inserts many rows, then adds the index and queries immediately expecting ReadFailureException, which is fairly lame and fragile.

What about accepting either {{ReadFailureException}} or the complete, correct result? If index building got faster we might stop hitting the {{ReadFailureException}} case, but at least the test wouldn't flap.
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000767#comment-15000767 ] Sam Tunnicliffe commented on CASSANDRA-8505: I think the various index states can be reduced to a simple ready/not ready check. What's more unless we intend to change the established behaviour fairly significantly, once an index moves to a ready state it never moves back to being not ready. The only times when we modify the status in the system table are when the index is removed (in which case we have no problem with being able to query using it) or during a rebuild. In the latter case though, we probably shouldn't reject queries (and we don't currently), as an index rebuild is incremental. That is, we don't scrap the existing index tables and rebuild everything from scratch, just write new index SSTables to supercede the old ones. So although it's certainly possible to get incorrect results during a rebuild (because of missing/stale entries), the results only get more correct as the rebuild progresses. Changing this so that all queries against that index return errors until all rebuilds complete seems like a step backwards. It seems more reasonable to reject queries until the initial build has been performed, as per the example in the description, but this only requires a simple boolean to track state between instantiating/registering the index and its initial build task completing (if one is required). It would be good to have some test coverage of this, although the best I could come up with is a dtest which inserts many rows, then adds the index and queries immediately expecting ReadFailureException, which is fairly lame and fragile. A couple of points specific to the 3.0 patch: * The fix for CASSANDRA-10595 has been lost. If an index doesn't register itself in {{createIndex}}, don't ask it for an initalization task, just set {{initialBuildTask == null}}. 
* {{SIM::reloadIndex}} has changed since the patch was created (due to CASSANDRA-10604) - I think that no changes to this method are now required. I did notice though that the current implementation actually makes a redundant call to {{getMetadataReloadTask}}, so if you could fix that while you're here, that'd be great.

bq. Secondary index and their build/not build status are node-local. By consequence it is not possible to know on a coordinator node if the index is fully build. It can be built on the coordinator but still building on other nodes

For future reference on this point, we also have CASSANDRA-9967 which has a very similar intent.

> Invalid results are returned while secondary index are being build
> --
>
> Key: CASSANDRA-8505
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8505
> Project: Cassandra
> Issue Type: Bug
> Components: Coordination
> Reporter: Benjamin Lerer
> Assignee: Benjamin Lerer
> Fix For: 2.2.x, 3.0.x
>
>
> If you request an index creation and then execute a query that use the index
> the results returned might be invalid until the index is fully build. This is
> caused by the fact that the table column will be marked as indexed before the
> index is ready.
> The following unit tests can be use to reproduce the problem:
> {code}
> @Test
> public void testIndexCreatedAfterInsert() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>
>     createIndex("CREATE INDEX ON %s(b)");
>
>     assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                row(0, 1, 1),
>                row(1, 1, 4));
> }
>
> @Test
> public void testIndexCreatedBeforeInsert() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>     createIndex("CREATE INDEX ON %s(b)");
>
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>     assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                row(0, 1, 1),
>                row(1, 1, 4));
> }
> {code}
> The first test will fail while the second will work.
> In my o
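
The "simple boolean" proposed in the comment above could look roughly like the following. This is a minimal sketch under stated assumptions, not code from either patch: the class {{IndexReadiness}} and all of its method names are hypothetical. It assumes the flag only ever moves from not ready to ready, and that a rebuild deliberately leaves it untouched because rebuilds are incremental.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper, not part of Cassandra: tracks whether an index has
// completed its initial build and may therefore serve queries.
final class IndexReadiness
{
    // false only between registration and completion of the initial build
    private final AtomicBoolean ready;

    IndexReadiness(boolean needsInitialBuild)
    {
        // an index that requires no initial build task is queryable at once
        this.ready = new AtomicBoolean(!needsInitialBuild);
    }

    // called when the initial build task finishes; the transition is one-way
    void markInitialBuildComplete()
    {
        ready.set(true);
    }

    // a rebuild writes new index data alongside the old, so the index keeps
    // serving queries: the flag is intentionally not reset here
    void onRebuildStarted()
    {
    }

    boolean isQueryable()
    {
        return ready.get();
    }
}
```

With this shape, only the window between creating the index and finishing its first build causes queries to be rejected; later rebuilds never do.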
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980790#comment-14980790 ]

Benjamin Lerer commented on CASSANDRA-8505:
---

Secondary indexes and their built/not built status are node-local. Consequently, it is not possible to know on a coordinator node whether the index is fully built: it can be built on the coordinator but still building on other nodes. Furthermore, an index rebuild can be triggered at any time. Therefore the only moment where we can check whether the index is ready is at query execution time.

The first problem with rejecting index queries at execution time is that some {{ALLOW FILTERING}} queries that could have been processed without an index will be rejected. As the {{ALLOW FILTERING}} information is not passed with the command, we have no way to know whether the query should be executed using filtering. On the other hand, currently, if an index exists but is not built, Cassandra might silently return the wrong results. Consequently, rejecting the query is still an improvement, in my opinion, and we can create a new ticket to improve the situation in the future.

The second problem is about communicating the error back to the coordinator node. CASSANDRA-7886 added a mechanism for that but it is not perfect. The user will receive a {{ReadFailureException}} but will have to look in the logs to find the root cause of the problem. Ideally this mechanism should be improved so that the error message can be passed to the {{ReadFailureException}}. The other problem with the mechanism is that it has only been available since {{2.2}}, so I could not create a patch for {{2.1}}.
The patch for {{2.2}} is [here|https://github.com/apache/cassandra/compare/trunk...blerer:8505-2.2] and the patch for {{3.0}} is [here|https://github.com/apache/cassandra/compare/trunk...blerer:8505-3.0]

Both patches keep the index state in memory and throw an Exception if the index is not ready when a request arrives. The patches also shortcut the building of an index if the base table is empty. This optimisation prevents many of the existing index tests from failing.

* The unit test results for {{2.2}} are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-2.2-testall/3/]
* The dtest results for {{2.2}} are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-2.2-dtest/3/]
* The unit test results for {{3.0}} are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-3.0-testall/1/]
* The dtest results for {{3.0}} are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-3.0-dtest/1/]

The {{secondary_indexes_test.TestSecondaryIndexesOnCollections.test_map_indexes}} dtest fails in {{2.2}} because it does not wait for the index to be built before querying it. I will provide a patch for the dtest.

> Invalid results are returned while secondary index are being build
> --
>
> Key: CASSANDRA-8505
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8505
> Project: Cassandra
> Issue Type: Bug
> Reporter: Benjamin Lerer
> Assignee: Benjamin Lerer
> Fix For: 2.1.x
>
>
> If you request an index creation and then execute a query that use the index
> the results returned might be invalid until the index is fully build. This is
> caused by the fact that the table column will be marked as indexed before the
> index is ready.
> The following unit tests can be use to reproduce the problem:
> {code}
> @Test
> public void testIndexCreatedAfterInsert() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>
>     createIndex("CREATE INDEX ON %s(b)");
>
>     assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                row(0, 1, 1),
>                row(1, 1, 4));
> }
>
> @Test
> public void testIndexCreatedBeforeInsert() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>     createIndex("CREATE INDEX ON %s(b)");
>
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1,
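
The behaviour described in the comment above (in-memory index state, rejection at query execution time, and a shortcut for empty base tables) might be sketched as follows. This is an illustration under assumptions, not the code of the linked patches: {{NodeLocalIndexState}}, {{IndexNotReadyException}} and every method name are hypothetical, and how the replica reports the failure back to the coordinator is not modelled.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical exception type; the real patches may signal this differently.
final class IndexNotReadyException extends RuntimeException
{
    IndexNotReadyException(String indexName)
    {
        super("Index " + indexName + " is not built yet on this node");
    }
}

// Hypothetical node-local, in-memory record of which indexes are built.
final class NodeLocalIndexState
{
    private final Set<String> built = ConcurrentHashMap.newKeySet();

    // Registers a new index. If the base table holds no data there is
    // nothing to build, so the index is considered built immediately.
    void register(String indexName, boolean baseTableIsEmpty)
    {
        if (baseTableIsEmpty)
            built.add(indexName);
    }

    // Called when the initial build task for the index completes.
    void markBuilt(String indexName)
    {
        built.add(indexName);
    }

    // Called on the replica at query execution time, before the index is read.
    void checkQueryable(String indexName)
    {
        if (!built.contains(indexName))
            throw new IndexNotReadyException(indexName);
    }
}
```

Because the state lives on each node, the check can only happen where the read is executed, which is why the coordinator cannot reject the query up front.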
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305197#comment-14305197 ]

Sylvain Lebresne commented on CASSANDRA-8505:
-

We probably need to write the patch before deciding (in theory, it would be rather simple to have SelectStatement check if the index is built, but in practice doing so means a query to a system table and that's probably not acceptable performance-wise, so we'd need to cache the info (probably directly in ColumnDefinition?), but we'll have to check how hairy that is exactly). That said, even if we decide it's too hairy for 2.0, we should at least fix it in 2.1.

> Invalid results are returned while secondary index are being build
> --
>
> Key: CASSANDRA-8505
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8505
> Project: Cassandra
> Issue Type: Bug
> Reporter: Benjamin Lerer
> Fix For: 2.1.4
>
>
> If you request an index creation and then execute a query that use the index
> the results returned might be invalid until the index is fully build. This is
> caused by the fact that the table column will be marked as indexed before the
> index is ready.
> The following unit tests can be use to reproduce the problem:
> {code}
> @Test
> public void testIndexCreatedAfterInsert() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>
>     createIndex("CREATE INDEX ON %s(b)");
>
>     assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                row(0, 1, 1),
>                row(1, 1, 4));
> }
>
> @Test
> public void testIndexCreatedBeforeInsert() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>     createIndex("CREATE INDEX ON %s(b)");
>
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>     assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                row(0, 1, 1),
>                row(1, 1, 4));
> }
> {code}
> The first test will fail while the second will work.
> In my opinion the first test should reject the request as invalid (as if the
> index was not existing) until the index is fully build.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
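
The caching idea raised in the comment above, avoiding a system-table read on every query, could be sketched like this. It is only an illustration under assumptions: {{BuiltStatusCache}} is a hypothetical class, the {{Function}} passed to it stands in for whatever system-table lookup would really be used, and the comment's suggestion of caching in ColumnDefinition is not modelled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache of per-index built status.
final class BuiltStatusCache
{
    private final Map<String, Boolean> cache = new ConcurrentHashMap<>();

    // stand-in for the expensive system-table read (an assumption)
    private final Function<String, Boolean> systemTableLookup;

    BuiltStatusCache(Function<String, Boolean> systemTableLookup)
    {
        this.systemTableLookup = systemTableLookup;
    }

    // Consults the backing lookup only the first time an index is asked about.
    boolean isBuilt(String indexName)
    {
        return cache.computeIfAbsent(indexName, systemTableLookup);
    }

    // Must be called when a build completes; otherwise a cached "false"
    // would keep rejecting queries against an index that is now ready.
    void markBuilt(String indexName)
    {
        cache.put(indexName, Boolean.TRUE);
    }

    // Drops the entry, for example when the index is removed.
    void invalidate(String indexName)
    {
        cache.remove(indexName);
    }
}
```

The trade-off is the usual one for caches: the hot path becomes a map lookup, but every state change has to update or invalidate the entry.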
[jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build
[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304304#comment-14304304 ]

Philip Thompson commented on CASSANDRA-8505:

Is this too big of a change in existing behavior to put into 2.0 or 2.1?

> Invalid results are returned while secondary index are being build
> --
>
> Key: CASSANDRA-8505
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8505
> Project: Cassandra
> Issue Type: Bug
> Reporter: Benjamin Lerer
> Fix For: 3.0
>
>
> If you request an index creation and then execute a query that use the index
> the results returned might be invalid until the index is fully build. This is
> caused by the fact that the table column will be marked as indexed before the
> index is ready.
> The following unit tests can be use to reproduce the problem:
> {code}
> @Test
> public void testIndexCreatedAfterInsert() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>
>     createIndex("CREATE INDEX ON %s(b)");
>
>     assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                row(0, 1, 1),
>                row(1, 1, 4));
> }
>
> @Test
> public void testIndexCreatedBeforeInsert() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>     createIndex("CREATE INDEX ON %s(b)");
>
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>     execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>     execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>     assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                row(0, 1, 1),
>                row(1, 1, 4));
> }
> {code}
> The first test will fail while the second will work.
> In my opinion the first test should reject the request as invalid (as if the
> index was not existing) until the index is fully build.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)