[jira] [Comment Edited] (CASSANDRA-12245) initial view build can be parallel

Paulo Motta (JIRA) Wed, 29 Nov 2017 16:38:32 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-12245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16271895#comment-16271895 ]


Paulo Motta edited comment on CASSANDRA-12245 at 11/30/17 12:37 AM:
--------------------------------------------------------------------

bq. Good catch! Done here. I have added an overloaded version of 
ViewBuilderTask.stop to throw the CompactionInterruptedException only if the 
stop call comes from a different place than ViewBuilder. That is, the exception 
is not thrown in the case of a schema change (such as a drop), when the current 
build should be stopped without errors and maybe restarted. 

Good job! One thing I noticed is that even though the builder task and the view 
builder are aborted, the other tasks of the same builder keep running. At least 
until we have the ability to start and stop view builders, I think that 
stopping a subtask should also abort the other subtasks of the same view 
builder - since the view builder will not complete anyway. What do you think? 
I've done this 
[here|https://github.com/pauloricardomg/cassandra/commit/81853218eee702b778ba801426ba19d48336cf77]
 and the tests didn't need any change. I've also extended {{SplitterTest}} with 
a couple more test cases 
[here|https://github.com/pauloricardomg/cassandra/commit/428a990d6b3d79df9a4848d0f0f87502e72e470e].
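
To make the suggestion concrete, here is a minimal Python sketch of the idea that stopping one subtask should also abort its siblings. It is only an illustration, not the Java code in the linked commit: the names {{BuildTask}}, {{Builder}}, {{stop}} and {{abort}} are hypothetical and do not correspond to Cassandra's actual classes.

```python
import threading


class BuildTask:
    """Hypothetical stand-in for one sub-range task of a view build."""

    def __init__(self, name, builder):
        self.name = name
        self.builder = builder
        self.stopped = threading.Event()

    def stop(self, propagate=True):
        # Mark this task as stopped. When the stop comes from outside the
        # builder, notify the builder so the sibling tasks are stopped too.
        self.stopped.set()
        if propagate:
            self.builder.abort(origin=self)


class Builder:
    """Hypothetical stand-in for the builder that owns all subtasks."""

    def __init__(self, task_names):
        self.tasks = [BuildTask(name, self) for name in task_names]
        self.aborted = False

    def abort(self, origin=None):
        # The build cannot complete once any subtask is stopped, so stop
        # every other subtask without propagating back to the builder.
        self.aborted = True
        for task in self.tasks:
            if task is not origin:
                task.stop(propagate=False)


builder = Builder(["range-0", "range-1", "range-2"])
builder.tasks[1].stop()  # an external stop hits a single subtask
```

After the single {{stop()}} call, every task in {{builder.tasks}} is flagged as stopped and the builder is marked as aborted, which is the behaviour proposed above.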

bq. I have added a couple of dtests here. test_resume_stopped_build uses 
`nodetool stop VIEW_BUILD` to interrupt the running task of an ongoing view 
build and verifies that the unmarked build is resumed after restarting the 
nodes. test_drop_with_stopped_build verifies that a view with interrupted tasks 
can still be dropped, which is something that has been problematic while 
writing the patch.

The tests look good, but they sometimes failed on my machine because the view 
builder task finished on some nodes before they were stopped, and 
{{_wait_for_view_build_start}} did not guarantee that the view builder had 
started on all nodes before issuing {{nodetool stop VIEW_BUILD}}, so I fixed 
this [on this commit|https://github.com/pauloricardomg/cassandra-dtest/commit/667315e42bd2b7d04ac038e79149f1b0e63ba0f2].
I also extended {{test_resume_stopped_build}} to verify that the view was not 
built after the abort 
([here|https://github.com/pauloricardomg/cassandra-dtest/commit/f4c3ad7ac9e4ea64576d669a1cf30b0ef4e02a3f]).
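
The general shape of such a fix is to wait until every node reports a started build before sending the stop. The sketch below shows one way to do that in plain Python. It is not the code in the linked dtest commit: {{wait_until_all}} and {{fake_build_started}} are hypothetical helpers, and the per-node check is simulated rather than querying a real cluster.

```python
import time


def wait_until_all(nodes, predicate, timeout=30.0, interval=0.1):
    """Poll until predicate(node) has returned True once for every node."""
    deadline = time.monotonic() + timeout
    pending = list(nodes)
    while pending:
        # Keep only the nodes that have not yet satisfied the predicate.
        pending = [node for node in pending if not predicate(node)]
        if not pending:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError("still waiting for: %s" % ", ".join(pending))
        time.sleep(interval)
    return True


# Simulated cluster: each node reports "started" after a different number
# of polls, standing in for a real check against the node.
polls = {"node1": 0, "node2": 0, "node3": 0}
needed = {"node1": 1, "node2": 2, "node3": 3}


def fake_build_started(node):
    polls[node] += 1
    return polls[node] >= needed[node]


ok = wait_until_all(polls, fake_build_started, timeout=5.0, interval=0.01)
```

Only once {{wait_until_all}} returns would the test go on to issue the stop on each node. Note that this covers only the "not started yet" half of the race. Making sure the build does not finish before the stop arrives needs a separate measure, such as a larger data set or a slowed-down build.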

I've rebased and submitted a new CI run with the suggestions above 
[here|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/pauloricardomg-12245-trunk-dtest/]
 and 
[here|http://jenkins-cassandra.datastax.lan/view/Dev/view/adelapena/job/pauloricardomg-12245-trunk-testall/].
 

Besides these minor nits, I'm happy with the latest version of the patch and 
tests. If you agree with the suggestions above and CI looks good, feel free to 
incorporate them into your branches and commit. Excellent job and thanks for 
your patience! :)



> initial view build can be parallel
> ----------------------------------
>
>                 Key: CASSANDRA-12245
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12245
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Materialized Views
>            Reporter: Tom van der Woerdt
>            Assignee: Andrés de la Peña
>             Fix For: 4.x
>
>
> On a node with lots of data (~3TB), building a materialized view takes several 
> weeks, which is not ideal. It's doing this in a single thread.
> There are several potential ways this can be optimized:
>  * do vnodes in parallel, instead of going through the entire range in one 
> thread
>  * just iterate through sstables, not worrying about duplicates, and include 
> the timestamp of the original write in the MV mutation. Since this doesn't 
> exclude duplicates, it does increase the amount of work and could temporarily 
> surface ghost rows (yikes), but I guess that's why they call it eventual 
> consistency. Doing it this way can avoid holding references to all tables on 
> disk, allows parallelization, and removes the need to check other sstables 
> for existing data. This is essentially the 'do a full repair' path
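
As an illustration of the first option in the description, the Python sketch below splits one token range into contiguous sub-ranges and hands each to a worker thread. It is a rough sketch under stated assumptions, not how the patch implements it: {{split_range}} and {{build_subrange}} are hypothetical names, the work per sub-range is a placeholder, and the bounds assume the Murmur3 token range of -2^63 to 2^63 - 1.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, end, parts):
    """Split the range (start, end] into `parts` contiguous sub-ranges."""
    width = end - start
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def build_subrange(token_range):
    # Placeholder for the real work (reading base-table rows in this range
    # and writing the view mutations); it just returns the range width.
    left, right = token_range
    return right - left


# Assumed bounds of the Murmur3 partitioner token range.
MIN_TOKEN, MAX_TOKEN = -2**63, 2**63 - 1

ranges = split_range(MIN_TOKEN, MAX_TOKEN, 4)
with ThreadPoolExecutor(max_workers=4) as pool:
    covered = sum(pool.map(build_subrange, ranges))
```

The sub-ranges are contiguous and together cover the whole range, so each worker can build its part of the view independently of the others.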



