[ https://issues.apache.org/jira/browse/CASSANDRA-12245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16271895#comment-16271895 ]
Paulo Motta edited comment on CASSANDRA-12245 at 11/30/17 12:37 AM:
--------------------------------------------------------------------

bq. Good catch! Done here. I have added an overloaded version of ViewBuilderTask.stop to throw the CompactionInterruptedException only if the stop call comes from a different place than ViewBuilder. That is, the exception is not thrown in the case of a schema change (such as a drop), when the current build should be stopped without errors and maybe restarted.

Good job! One thing I noticed is that even though the builder task and the view builder are aborted, the other tasks of the same builder keep running. At least until we have the ability to start and stop view builders, I think that stopping a subtask should also abort the other subtasks of the same view builder, since the view builder will not complete anyway. What do you think? I've done this [here|https://github.com/pauloricardomg/cassandra/commit/81853218eee702b778ba801426ba19d48336cf77] and the tests didn't need any change. I've also extended {{SplitterTest}} with a couple more test cases [here|https://github.com/pauloricardomg/cassandra/commit/428a990d6b3d79df9a4848d0f0f87502e72e470e].

bq. I have added a couple of dtests here. test_resume_stopped_build uses `nodetool stop VIEW_BUILD` to interrupt the running task of an ongoing view build and verifies that the unmarked build is resumed after restarting the nodes. test_drop_with_stopped_build verifies that a view with interrupted tasks can still be dropped, which is something that has been problematic while writing the patch.

The tests look good, but they sometimes failed on my machine for two reasons: the view builder task finished on some nodes before they were stopped, and {{_wait_for_view_build_start}} did not guarantee that the view builder had started on all nodes before {{nodetool stop VIEW_BUILD}} was issued. I fixed this [in this commit|https://github.com/pauloricardomg/cassandra-dtest/commit/667315e42bd2b7d04ac038e79149f1b0e63ba0f2]. I also extended {{test_resume_stopped_build}} to verify that the view was not built after the abort ([here|https://github.com/pauloricardomg/cassandra-dtest/commit/f4c3ad7ac9e4ea64576d669a1cf30b0ef4e02a3f]).

I've rebased and submitted a new CI run with the suggestions above [here|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/pauloricardomg-12245-trunk-dtest/] and [here|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/pauloricardomg-12245-trunk-testall/].

Besides these minor nits, I'm happy with the latest version of the patch and tests. If you agree with the suggestions above and CI looks good, feel free to incorporate them into your branches and commit. Excellent job and thanks for your patience! :)

was (Author: pauloricardomg):

bq. Good catch! Done here. I have added an overloaded version of ViewBuilderTask.stop to throw the CompactionInterruptedException only if the stop call comes from a different place than ViewBuilder. That is, the exception is not thrown in the case of a schema change (such as a drop), when the current build should be stopped without errors and maybe restarted.

Good job! One thing I noticed is that even though the builder task and the view builder are aborted, the other tasks of the same builder keep running. At least until we have the ability to start and stop view builders, I think that stopping a subtask should also abort the other subtasks of the same view builder, since the view builder will not complete anyway. What do you think?
I've done this [here|https://github.com/pauloricardomg/cassandra/commit/81853218eee702b778ba801426ba19d48336cf77] and the tests didn't need any change. I've also extended {{SplitterTest}} with a couple more test cases [here|https://github.com/pauloricardomg/cassandra/commit/428a990d6b3d79df9a4848d0f0f87502e72e470e].

bq. I have added a couple of dtests here. test_resume_stopped_build uses `nodetool stop VIEW_BUILD` to interrupt the running task of an ongoing view build and verifies that the unmarked build is resumed after restarting the nodes. test_drop_with_stopped_build verifies that a view with interrupted tasks can still be dropped, which is something that has been problematic while writing the patch.

The tests look good, but they sometimes failed on my machine for two reasons: the view builder task finished on some nodes before they were stopped, and {{_wait_for_view_build_start}} did not guarantee that the view builder had started on all nodes before {{nodetool stop VIEW_BUILD}} was issued. I fixed this [in this commit|https://github.com/pauloricardomg/cassandra-dtest/commit/fc62dc849d5a4d5e24d2bada6e6f8ce0f2d32b4d]. I also extended {{test_resume_stopped_build}} to verify that the view was not built after the abort ([here|https://github.com/pauloricardomg/cassandra-dtest/commit/6e38919d3c64a54688ae97bcf03611fff7d59dfe]).

I've rebased and submitted a new CI run with the suggestions above [here|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/pauloricardomg-12245-trunk-dtest/] and [here|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/pauloricardomg-12245-trunk-testall/].

Besides these minor nits, I'm happy with the latest version of the patch and tests. If you agree with the suggestions above and CI looks good, feel free to incorporate them into your branches and commit. Excellent job and thanks for your patience!
:)

> initial view build can be parallel
> ----------------------------------
>
>                 Key: CASSANDRA-12245
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12245
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Materialized Views
>            Reporter: Tom van der Woerdt
>            Assignee: Andrés de la Peña
>             Fix For: 4.x
>
>
> On a node with lots of data (~3TB) building a materialized view takes several
> weeks, which is not ideal. It's doing this in a single thread.
> There are several potential ways this can be optimized :
> * do vnodes in parallel, instead of going through the entire range in one
> thread
> * just iterate through sstables, not worrying about duplicates, and include
> the timestamp of the original write in the MV mutation. since this doesn't
> exclude duplicates it does increase the amount of work and could temporarily
> surface ghost rows (yikes) but I guess that's why they call it eventual
> consistency. doing it this way can avoid holding references to all tables on
> disk, allows parallelization, and removes the need to check other sstables
> for existing data. this is essentially the 'do a full repair' path

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
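
For readers following the thread, here is a minimal Python sketch of the two ideas discussed above: splitting the build range into sub-ranges that run in parallel (the first optimization in the issue description), and having a stop request on one subtask abort its siblings (the behaviour proposed in the comment). It is not the Cassandra implementation. The names {{split_range}}, {{ParallelBuildSketch}} and {{process}} are hypothetical, and tokens are modelled as plain integers purely for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def split_range(start, end, parts):
    """Split the half-open range [start, end) into `parts` contiguous sub-ranges."""
    if parts < 1 or end < start:
        raise ValueError("need parts >= 1 and start <= end")
    size = end - start
    bounds = [start + (size * i) // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(parts)]


class ParallelBuildSketch:
    """Hypothetical stand-in for a view builder that owns several subtasks."""

    def __init__(self, start, end, parts):
        self.ranges = split_range(start, end, parts)
        # One flag shared by every subtask: stopping one stops them all.
        self._stopped = threading.Event()

    def stop(self):
        """Request an abort of the whole build, not just the calling subtask."""
        self._stopped.set()

    def _run_subtask(self, token_range, process):
        for token in range(*token_range):
            # Check the shared flag before each unit of work.
            if self._stopped.is_set():
                return False
            process(token)
        return True

    def run(self, process):
        """Run one subtask per sub-range; return True only if all completed."""
        with ThreadPoolExecutor(max_workers=len(self.ranges)) as pool:
            results = list(
                pool.map(lambda r: self._run_subtask(r, process), self.ranges)
            )
        return all(results)
```

Calling {{stop()}} from any thread sets the shared flag, so each subtask exits before its next unit of work and {{run}} reports the build as incomplete. A single shared flag is the simplest way to express "the view builder will not complete anyway"; the actual patch may coordinate the subtasks differently.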