[ 
https://issues.apache.org/jira/browse/CASSANDRA-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139795#comment-16139795
 ] 

Paulo Motta commented on CASSANDRA-13069:
-----------------------------------------

Finally getting back to this after a while, sorry for the delay! Going back to 
the question of the last review round:

{quote}
bq. As far as I can tell, the goal of using a local batchlog is to guarantee 
eventual consistency of a the base table and its views. That is, no matter what 
happens for a given update, either both that update and all the related view 
updates get eventually applied, or none of it is. So I don't understand why:
1. we don't include local view mutations in the batchlog in SP.mutateMV.
2. the base table mutation isn't included in said batchlog alongside it's 
related view updates.
{quote}

I ended up writing a [trunk 
patch|https://github.com/pauloricardomg/cassandra/commit/00a0bef75129a396aeee2cedc89d6a6ec66f610e]
 to include both local and base table mutations in the batchlog as suggested, 
but then looking at the original code I figured that whatever failure happens 
during views update (but before the local base or views are persisted) is 
safeguarded by the [base tablecommit log 
write|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/Keyspace.java#L595]
 prior to the view update, so I don't think we actually need to include the 
local mutations in the batchlog.

Given that remote view writes are [fire and 
forget|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageProxy.java#L843],
  the most probable cause of failure during local view writing would be a 
crash, and in that case the commit log replay will re-apply the base mutations 
and views. The two downsides I can think of are:
a) We cannot ensure the commit log is actually persisted before the crash, 
unless batch commit log is used.
b) The user may have durable_writes=false
c) I'm not sure if many other non-crash failures are possible in this case 
(fail local base and view mutations), besides full/corrupted disk, in which 
case you are screwed anyway, but if they happen you'll need to wait until the 
next restart/commitlog replay to have your base-views consistent.

There's not much we can do about a), so it seems we'll just need to live with 
this and rely on repair to fix potential inconsistencies? b) is actually a 
problem even with including local mutations in the batchlog write, unless we 
force batchlog writes to be durable. c) is not a big deal unless there are 
other legitimate non-crash scenarios which I'm not aware of.

Adding the local base mutation to the view batchlog requires the following 
changes:
a) Special case the batchlog write-path to [skip writing the base 
mutation|https://github.com/pauloricardomg/cassandra/commit/00a0bef75129a396aeee2cedc89d6a6ec66f610e#diff-b8ec5deb7141558cf8f868460077be31R94],
 since that will be written by the calling thread after the view updates, and 
[ack 
it|https://github.com/pauloricardomg/cassandra/commit/00a0bef75129a396aeee2cedc89d6a6ec66f610e#diff-9ca8d303d375e01a14d42154eaf46e37R556]
 after the base table write is done so the batchlog can be clean.
b) Special case the batchlog [replay 
path|https://github.com/pauloricardomg/cassandra/commit/00a0bef75129a396aeee2cedc89d6a6ec66f610e#diff-c77d5f1027ee5e8a49e106903a4ca937R458]
 to avoid generating views when replaying base table mutations in the case of 
view batchlogs.

So, unless there's a good reason not mentioned above to include the local 
mutations on the view batchlog, I'd prefer to keep the current approach of 
writing only remote view mutations in the view batchlog. If you agree with 
that, I [added a 
comment|https://github.com/pauloricardomg/cassandra/commit/8314ced9f1077224008e79e4c23d00d3277073d5#diff-71f06c193f5b5e270cf8ac695164f43aR732]
 in the original patch explaining why the local mutations are not included in 
the local batchlog to avoid confusion in the future. Please find the updated 
patch below:

||3.0||trunk||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-13069]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-13069]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:13069]|
|[testall|http://jenkins-cassandra.datastax.lan/view/Dev/view/paulomotta/job/pauloricardomg-3.0-13069-testall/lastCompletedBuild/testReport/]|[testall|http://jenkins-cassandra.datastax.lan/view/Dev/view/paulomotta/job/pauloricardomg-trunk-13069-testall/lastCompletedBuild/testReport/]|
|[dtest|http://jenkins-cassandra.datastax.lan/view/Dev/view/paulomotta/job/pauloricardomg-3.0-13069-dtest/lastCompletedBuild/testReport/]|[dtest|http://jenkins-cassandra.datastax.lan/view/Dev/view/paulomotta/job/pauloricardomg-trunk-13069-dtest/lastCompletedBuild/testReport/]|


I've added 2 
[dtests|https://github.com/pauloricardomg/cassandra-dtest/commit/f0b7983dd26cd269c8f629f6de8ddd585e644b29#diff-62ba429edee6a4681782f078246c9893R1575]
 to check that base and view are consistent on crash before and after the view 
is applied, which detected that recovered batchlog from commitlogs is not 
replayed straight away due to the batchlog timeout, so I removed that wait on 
the [first 
replay|https://github.com/pauloricardomg/cassandra/commit/8314ced9f1077224008e79e4c23d00d3277073d5#diff-c77d5f1027ee5e8a49e106903a4ca937R200].
 I also updated the MV test to verify that the view batchlog is [empty after 
replay|https://github.com/pauloricardomg/cassandra-dtest/commit/f0b7983dd26cd269c8f629f6de8ddd585e644b29#diff-62ba429edee6a4681782f078246c9893R124].

I did a few simplifications in the MV and batchlog write path in the [other 
patch|https://github.com/pauloricardomg/cassandra/tree/trunk-13069-add-local-to-view-batchlog]
 which may be worth keeping, so if we decide keeping the current approach of 
skipping local mutations on the batchlog I will open a new ticket with the 
refactoring suggestions.

> Local batchlog for MV may not be correctly written on node movements
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-13069
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13069
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Materialized Views
>            Reporter: Sylvain Lebresne
>            Assignee: Paulo Motta
>
> Unless I'm really reading this wrong, I think the code 
> [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageProxy.java#L829-L843],
>  which comes from CASSANDRA-10674, isn't working properly.
> More precisely, I believe we can have both paired and unpaired mutations, so 
> that both {{if}} can be taken, but if that's the case, the 2nd write to the 
> batchlog will basically overwrite (remove) the batchlog write of the 1st 
> {{if}} and I don't think that's the intention. In practice, this means 
> "paired" mutation won't be in the batchlog, which mean they won't be replayed 
> at all if they fail.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to