[ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980790#comment-14980790 ]
Benjamin Lerer commented on CASSANDRA-8505:
-------------------------------------------

Secondary indexes and their built/not-built status are node-local. Consequently, a coordinator node cannot know whether an index is fully built: it can be built on the coordinator but still building on other nodes. Furthermore, an index rebuild can be triggered at any time. Therefore, the only moment at which we can check whether the index is ready is query execution time.

The first problem with rejecting index queries at execution time is that some {{ALLOW FILTERING}} queries that could have been processed without an index will be rejected. As the {{ALLOW FILTERING}} information is not passed with the command, we have no way to know whether the query should be executed using filtering. On the other hand, currently, if an index exists but is not built, Cassandra might silently return wrong results. Rejecting the query is therefore still an improvement, in my opinion, and we can create a new ticket to improve the situation in the future.

The second problem is about communicating the error back to the coordinator node. CASSANDRA-7886 added a mechanism for that, but it is not perfect. The user will receive a {{ReadFailureException}} but will have to look in the logs to find the root cause of the problem. Ideally this mechanism should be improved so that the error message can be passed to the {{ReadFailureException}}. The other problem with the mechanism is that it is only available since {{2.2}}, so I could not create a patch for {{2.1}}.

The patch for {{2.2}} is [here|https://github.com/apache/cassandra/compare/trunk...blerer:8505-2.2] and the patch for {{3.0}} is [here|https://github.com/apache/cassandra/compare/trunk...blerer:8505-3.0]

Both patches keep the index state in memory and throw an exception if the index is not ready when a request arrives. The patches also shortcut the building of an index if the base table is empty.
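As a rough illustration of the approach described above (this is not the code from the patches), the sketch below keeps a node-local, in-memory flag per index, rejects a read when the index is not built, and marks an index as built immediately when the base table is empty. The names {{IndexReadiness}}, {{IndexNotReadyException}}, {{register}}, {{markBuilt}}, {{markRebuilding}} and {{checkReady}} are all hypothetical and do not exist in Cassandra.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: none of these names are part of Cassandra.
final class IndexReadiness
{
    // Hypothetical exception raised on the replica when an index is not ready.
    static final class IndexNotReadyException extends RuntimeException
    {
        IndexNotReadyException(String indexName)
        {
            super("Secondary index '" + indexName + "' is not built on this node");
        }
    }

    // Node-local state: true once the named index is fully built.
    private final Map<String, Boolean> built = new ConcurrentHashMap<>();

    // Registers a new index. An empty base table has nothing to index,
    // so the build is skipped and the index is marked as built right away.
    void register(String indexName, boolean baseTableEmpty)
    {
        built.put(indexName, baseTableEmpty);
    }

    // Called when an index build (or rebuild) completes.
    void markBuilt(String indexName)
    {
        built.put(indexName, true);
    }

    // Called when a rebuild is triggered; reads are rejected until it completes.
    void markRebuilding(String indexName)
    {
        built.put(indexName, false);
    }

    // Called at query execution time; unknown indexes are treated as not ready.
    void checkReady(String indexName)
    {
        if (!built.getOrDefault(indexName, false))
            throw new IndexNotReadyException(indexName);
    }

    public static void main(String[] args)
    {
        IndexReadiness state = new IndexReadiness();
        state.register("idx_b", false); // base table already contains rows
        try
        {
            state.checkReady("idx_b");
        }
        catch (IndexNotReadyException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
        state.markBuilt("idx_b");
        state.checkReady("idx_b"); // no exception once the build has finished
        System.out.println("accepted");
    }
}
```

Because the flag lives on each node, the check in this sketch can only run where the read is executed, which matches the reasoning above about why the coordinator cannot perform it.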
The empty-table shortcut prevents many of the existing index tests from failing.

* The unit test results for {{2.2}} are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-2.2-testall/3/]
* The dtest results for {{2.2}} are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-2.2-dtest/3/]
* The unit test results for {{3.0}} are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-3.0-testall/1/]
* The dtest results for {{3.0}} are [here|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-8505-3.0-dtest/1/]

The {{secondary_indexes_test.TestSecondaryIndexesOnCollections.test_map_indexes}} dtest fails in {{2.2}} because it does not wait for the index to be built before querying it. I will provide a patch for the dtest.

> Invalid results are returned while secondary index are being build
> ------------------------------------------------------------------
>
> Key: CASSANDRA-8505
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8505
> Project: Cassandra
> Issue Type: Bug
> Reporter: Benjamin Lerer
> Assignee: Benjamin Lerer
> Fix For: 2.1.x
>
>
> If you request an index creation and then execute a query that uses the index,
> the results returned might be invalid until the index is fully built. This is
> caused by the fact that the table column will be marked as indexed before the
> index is ready.
> The following unit tests can be used to reproduce the problem:
> {code}
>     @Test
>     public void testIndexCreatedAfterInsert() throws Throwable
>     {
>         createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>
>         // Rows exist before the index is created, so the index must be built from existing data.
>         createIndex("CREATE INDEX ON %s(b)");
>
>         assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                    row(0, 1, 1),
>                    row(1, 1, 4));
>     }
>
>     @Test
>     public void testIndexCreatedBeforeInsert() throws Throwable
>     {
>         createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>         // The index exists before any row is inserted.
>         createIndex("CREATE INDEX ON %s(b)");
>
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>
>         assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                    row(0, 1, 1),
>                    row(1, 1, 4));
>     }
> {code}
> The first test will fail while the second will work.
> In my opinion the first test should reject the request as invalid (as if the
> index did not exist) until the index is fully built.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)