[ 
https://issues.apache.org/jira/browse/CASSANDRA-15399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17072219#comment-17072219
 ] 

David Capwell commented on CASSANDRA-15399:
-------------------------------------------

bq.  does it mean that feature is not going to happen either or I am free to 
dig into it?

I would argue that CASSANDRA-15566 is not a regression, so I have been meaning 
to bring it up on the ML and propose leaving it out of 4.0.  This JIRA is a 
feature, so with the freeze in mind it is not right for it to go in, which is 
mostly why I stopped.

I linked CASSANDRA-15406 to this ticket since the repair coordinator would 
benefit from it, and being able to expose that state more clearly would help 
operators.  I only now see your comments in CASSANDRA-15406; sorry for not 
replying sooner.  I will reply there.

bq. I do not want to reinvent the wheel so you telling me what parts of the 
code are common (as you say in 15406) would be very beneficial and I would be 
grateful for that a lot.

Repair has 3 main phases: setup, validate, and sync/streaming.  When repair 
hits issues or is slow, it is desirable to know which phase it is in and how 
fast progress is being made.  In this ticket I added validation tracking, which 
tracks the rate of progress and % completed.  I have not worked on sync since I 
am less familiar with it and there are already more tools to monitor it (there 
are streaming stats; they are not tied to a repair, but you could use them to 
detect that streaming isn't happening, which you can't do with validation).
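
As a rough illustration of the two numbers mentioned above (% completed and 
rate of progress), here is a minimal sketch.  ValidationProgress is a 
hypothetical class written for this example, not the class from the patch; the 
field names only mirror the estimated_partitions and partitions_processed 
columns proposed in the issue description, and the caller is assumed to pass in 
timestamps.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical tracker for one validation task; an illustration only. */
final class ValidationProgress {
    private final long estimatedPartitions;
    private final long startNanos;
    private final AtomicLong partitionsProcessed = new AtomicLong();

    ValidationProgress(long estimatedPartitions, long startNanos) {
        this.estimatedPartitions = estimatedPartitions;
        this.startNanos = startNanos;
    }

    /** Called once per partition as validation walks the data. */
    void onPartitionProcessed() {
        partitionsProcessed.incrementAndGet();
    }

    /** Between 0.0 and 100.0; capped because the estimate may be too low. */
    float progressPercentage() {
        if (estimatedPartitions <= 0) {
            return 0.0f;
        }
        float pct = 100.0f * partitionsProcessed.get() / estimatedPartitions;
        return Math.min(100.0f, pct);
    }

    /** Average rate since the task started, in partitions per second. */
    double partitionsPerSecond(long nowNanos) {
        long elapsedNanos = nowNanos - startNanos;
        if (elapsedNanos <= 0) {
            return 0.0;
        }
        return partitionsProcessed.get() * 1_000_000_000.0 / elapsedNanos;
    }
}
```

A percentage that stops moving while the task is still marked as running is the 
kind of signal that is otherwise hard to get for validation.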

Repair has two main code paths for interacting with streaming:

* org.apache.cassandra.repair.RepairMessageVerbHandler#doVerb - if the 
coordinator needs to ask another node to stream data; look for *SYNC_* verbs
* org.apache.cassandra.repair.RepairJob#standardSyncing - an individual job (a 
single repair is 1+ jobs).  This delegates to LocalSyncTask, 
AsymmetricRemoteSyncTask, and SymmetricRemoteSyncTask to perform the sync.

What would be desirable is to track which streams are happening on each node 
for a specific repair job, and how streaming is progressing (which is what 
CASSANDRA-15406 looks to offer).  This ticket keeps the state in-memory only 
(so rebooting the node can't leave the metadata out of sync), but also makes 
sure the data lives far past the actual events.  The main reason is to allow 
operators to see historic data, but it would also let repair coordination 
access the state to learn what other nodes think (for example, if validation is 
done on a node but the coordinator has not seen it yet, it could re-ask for the 
tree to be sent).
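
To make the "in-memory only, but kept far past the actual events" idea 
concrete, here is a minimal sketch.  RepairStateRegistry, its Phase values 
(including the SUCCESS and FAILURE terminal states), and the retention 
parameter are all assumptions made for this example; they are not the names or 
the design used in the patch.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory registry of repair state; an illustration only. */
final class RepairStateRegistry {
    /** The three phases from the comment plus assumed terminal states. */
    enum Phase { SETUP, VALIDATE, SYNC, SUCCESS, FAILURE }

    private static final class Entry {
        final Phase phase;
        final long lastUpdatedAtMillis;

        Entry(Phase phase, long lastUpdatedAtMillis) {
            this.phase = phase;
            this.lastUpdatedAtMillis = lastUpdatedAtMillis;
        }
    }

    // Nothing is persisted, so a restart simply starts from an empty map.
    private final Map<UUID, Entry> states = new ConcurrentHashMap<>();
    private final long retentionMillis;

    RepairStateRegistry(long retentionMillis) {
        this.retentionMillis = retentionMillis;
    }

    void update(UUID id, Phase phase, long nowMillis) {
        states.put(id, new Entry(phase, nowMillis));
    }

    /** Lets an operator (or a coordinator) ask what this node thinks. */
    Optional<Phase> phaseOf(UUID id) {
        return Optional.ofNullable(states.get(id)).map(e -> e.phase);
    }

    /** Drops finished entries older than the retention window; active ones stay. */
    void expire(long nowMillis) {
        states.values().removeIf(e ->
            isTerminal(e.phase) && nowMillis - e.lastUpdatedAtMillis > retentionMillis);
    }

    private static boolean isTerminal(Phase phase) {
        return phase == Phase.SUCCESS || phase == Phase.FAILURE;
    }
}
```

With something like this, a coordinator that has not received a tree could look 
up the participant's phase and decide whether re-asking makes sense.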

> Add ability to track state in repair
> ------------------------------------
>
>                 Key: CASSANDRA-15399
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15399
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Repair
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> To enhance the visibility in repair, we should expose internal state via 
> virtual tables; the state should include coordinator as well as participant 
> state (validation, sync, etc.)
> I propose the following tables:
> repairs - high level summary of the global state of repair; this should be 
> called on the coordinator.
> {code:sql}
> CREATE TABLE repairs (
>   id uuid,
>   keyspace_name text,
>   table_names frozen<list<text>>,
>   ranges frozen<list<text>>,
>   coordinator text,
>   participants frozen<list<text>>,
>   state text,
>   progress_percentage float,
>   last_updated_at_millis bigint,
>   duration_micro bigint,
>   failure_cause text,
>   PRIMARY KEY ( (id) )
> )
> {code}
> repair_tasks - represents RepairJob and participants state.  This will show 
> if validations are running on participants and the progress they are making; 
> this should be called on the coordinator.
> {code:sql}
> CREATE TABLE repair_tasks (
>   id uuid,
>   session_id uuid,
>   keyspace_name text,
>   table_name text,
>   ranges frozen<list<text>>,
>   coordinator text,
>   participant text,
>   state text,
>   state_description text,
>   progress_percentage float, -- between 0.0 and 100.0
>   last_updated_at_millis bigint,
>   duration_micro bigint,
>   failure_cause text,
>   PRIMARY KEY ( (id), session_id, table_name, participant )
> )
> {code}
> repair_validations - shows the state of the validation task and is updated 
> periodically while validation is running; this should be called on the 
> participants.
> {code:sql}
> CREATE TABLE repair_validations (
>   id uuid,
>   session_id uuid,
>   ranges frozen<list<text>>,
>   keyspace_name text,
>   table_name text,
>   initiator text,
>   state text,
>   progress_percentage float,
>   queue_duration_ms bigint,
>   runtime_duration_ms bigint,
>   total_duration_ms bigint,
>   estimated_partitions bigint,
>   partitions_processed bigint,
>   estimated_total_bytes bigint,
>   failure_cause text,
>   PRIMARY KEY ( (id), session_id, table_name )
> )
> {code}
> The main reason for exposing virtual tables rather than exposing through 
> durable tables is to make sure what is exposed is accurate.  In cases of 
> write failures or node failures, the durable tables could become inaccurate 
> and could add edge cases where the repair is not running but the tables say 
> it is; by relying on repair's internal in-memory bookkeeping, these problems 
> go away.
> This jira does not try to solve the following:
> 1) repair resiliency - there are edge cases where repair hits an error and 
> runs forever (at least from nodetool's perspective).
> 2) repair stream tracking - I have not learned the streaming side yet, and 
> from what I see multiple implementations exist, so it seems like a large 
> scope.  My hope is to punt on it in this jira and tackle it separately.


