[ https://issues.apache.org/jira/browse/CASSANDRA-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997265#comment-13997265 ]

Ben Chan commented on CASSANDRA-5483:
-------------------------------------

I seem to be having problems with JIRA email notifications. The May 12 
notification arrived fine, the May 7 one never arrived, and the May 8 one 
arrived on May 13 but was dated May 10. Moving on...

{quote}
Just skip system keyspace entirely and save the logspam (use Keyspace.nonSystem 
instead of Keyspace.all)
{quote}

This patch has grown enough parts to be a little unwieldy. Just to be clear, 
the output from [https://gist.github.com/lyubent/bfc133fe92ef1afb9dd4] is the 
verbose output from {{nodetool}}, which means there is some extra output aside 
from the traces themselves. (TODO to self: make the nodetool verbose output 
optional.) That particular message comes from 
{{src/java/org/apache/cassandra/tools/NodeProbe.java}}, in a part of the code 
untouched by this patch. I can go ahead and nuke that particular message for 
the system keyspace.

{quote}
How does {{Endpoints /127.0.0.2 and /127.0.0.1 are consistent for events}} 
scale up to more replicas? Should we switch to using {{\[..]}} notation instead?
{quote}

That's {{n * (n - 1)}} differences calculated for {{n}} replicas, so {{n * (n - 1)}} 
"are consistent" messages. I haven't dug deep enough into the code to be 
certain, but on the face of it, it seems like there should be some (possibly 
not-simple) way to reduce this to {{O(n * log\(n))}}. Enough speculation, 
though.

One edge case for the proposed notation would be a consistency partition:

{noformat}
A == B == C
A != D
D == E == F

=>

# We need a separate message for each partition.
Endpoints [A, B, C] are consistent for events
Endpoints [D, E, F] are consistent for events
{noformat}

Even with the edge case, it seems messy, but doable. You do lose trace timing 
information on the calculation of individual differences (the consistent ones, 
at least). On the other hand, comparing matching merkle trees should be a 
consistently fast operation, so you're probably not missing out on too much 
information.
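
Just to make the proposed notation concrete, here's a small standalone sketch 
(illustration only, not part of the patch) that groups endpoints into 
consistency partitions with a union-find over the pairwise "are consistent" 
results, then emits one message per partition:

{noformat}
import java.util.*;

// Illustration only: derive "[A, B, C] are consistent" style messages from
// pairwise comparison results by grouping endpoints into consistency
// partitions with a union-find.
public class ConsistencyPartitions
{
    // Union-find parent pointers, keyed by endpoint name.
    private static final Map<String, String> parent = new HashMap<>();

    private static String find(String x)
    {
        parent.putIfAbsent(x, x);
        String p = parent.get(x);
        if (!p.equals(x))
            parent.put(x, p = find(p));
        return p;
    }

    // Record that two endpoints compared consistent (matching merkle trees).
    private static void markConsistent(String a, String b)
    {
        parent.put(find(a), find(b));
    }

    public static void main(String[] args)
    {
        // The edge case from above: A == B == C and D == E == F, but A != D.
        markConsistent("A", "B"); markConsistent("B", "C");
        markConsistent("D", "E"); markConsistent("E", "F");

        // Group endpoints by the representative of their partition.
        Map<String, List<String>> partitions = new TreeMap<>();
        for (String endpoint : Arrays.asList("A", "B", "C", "D", "E", "F"))
            partitions.computeIfAbsent(find(endpoint), k -> new ArrayList<>()).add(endpoint);

        // One message per partition instead of one per consistent pair:
        //   Endpoints [A, B, C] are consistent for events
        //   Endpoints [D, E, F] are consistent for events
        for (List<String> group : partitions.values())
            System.out.println("Endpoints " + group + " are consistent for events");
    }
}
{noformat}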

{quote}
I'm a little lost in the commands and sessions, e.g. does {{\[2014-05-08 
23:27:45,368] Session completed successfully}} refer to session 
3617e3f0-d6ef-11e3-a493-7d438369d7fc or 36a49390-d6ef-11e3-a493-7d438369d7fc? 
Is there exactly one session per command? If so let's merge the "starting 
repair command" + "new session" output, and the "completed" + "finished".
{quote}

Each repair command seems to consist of multiple repair sessions (one per 
range). The sessions go semi-sequentially; there's a semi-random overlap 
between the end of one session and the start of another, like so (using small 
integers instead of UUIDs, and some labels on the left for readability):

{noformat}
[command #1           ] Starting repair command #1
[command #1, session 1] New session 1 will sync range ...
[command #1, session 1] Requesting merkle tree for ...
[command #1, session 1] Received merkle tree for ...
[command #1, session 2] New session 2 will sync range ...
[command #1, session 2] Requesting merkle tree for ...
[command #1, session 1] Endpoints ... consistent.
[command #1, session 1] Session 1 completed successfully
[command #1, session 2] Received merkle tree for ...
[command #1, session 2] Endpoints ... consistent.
[command #1, session 3] New session 3 will sync range ...
[command #1, session 2] Session 2 completed successfully
[command #1, session 3] Requesting merkle tree for ...
[command #1, session 3] Received merkle tree for ...
[command #1, session 3] Endpoints ... consistent.
[command #1, session 3] Session 3 completed successfully
[command #1           ] Repair command #1 finished
{noformat}

Most of the time it's obvious from context, but during that overlap, having the 
repair session UUID helps to disambiguate. I suspect the overlap is even 
greater (and more confusing) when you have heavy streaming.
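
As an aside, here's a trivial standalone sketch (illustration only, nothing 
from the patch) of why the UUID matters during overlaps: lines that carry a 
session UUID can be bucketed mechanically, while a bare "Session completed 
successfully" can't be attributed once two sessions are in flight. The lines 
are abbreviated from the trace further down.

{noformat}
import java.util.*;
import java.util.regex.*;

// Illustration only: bucket trace lines by the session UUID they mention.
public class GroupBySession
{
    private static final Pattern UUID_RE = Pattern.compile(
        "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");

    public static void main(String[] args)
    {
        List<String> lines = Arrays.asList(
            "New session c84ae130-db1a-11e3-b0e7-f94811c7b860 will sync range ...",
            "New session ca7d5190-db1a-11e3-b0e7-f94811c7b860 will sync range ...",
            "Session completed successfully",
            "Repair session c84ae130-db1a-11e3-b0e7-f94811c7b860 for range ... finished",
            "Repair session ca7d5190-db1a-11e3-b0e7-f94811c7b860 for range ... finished");

        Map<String, List<String>> bySession = new LinkedHashMap<>();
        for (String line : lines)
        {
            Matcher m = UUID_RE.matcher(line);
            // Lines without a UUID can't be attributed to a session during overlap.
            String key = m.find() ? m.group() : "(no session id)";
            bySession.computeIfAbsent(key, k -> new ArrayList<>()).add(line);
        }

        bySession.forEach((session, grouped) ->
            System.out.println(session + ": " + grouped.size() + " line(s)"));
    }
}
{noformat}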

{quote}
Why do we log Repair command #1 finished with no merkle trees requested for 
db.tweet? Is it because all sstables are already repaired? If so we should log 
that.
{quote}

I've never encountered a trace like that in my testing. I always seem to get 
merkle trees exchanged (see the log below), even if no streaming is needed. I'm 
hoping lyubent can provide enough information for me to recreate this 
situation locally.

{quote}
Does this actually show any streaming? If so I'm missing it.
{quote}

lyubent's sample run didn't need streaming, so there was no streaming to trace. 
Here's how I usually test streaming (using [^ccm-repair-test] and yukim's 
method); clearing node1's data between the two repair runs forces the second 
one to stream:

{noformat}
./ccm-repair-test -kR &&
ccm node1 stop &&
ccm node1 clear &&
ccm node1 start &&
./ccm-repair-test -rt
{noformat}

Note that this sample run uses the codebase as of 
[^5483-v12-02-cassandra-yaml-ttl-doc.patch]; I haven't got around to doing the 
May 12 changes yet.

I should also warn you (if recent feedback is any guide) that the trace 
messages for streaming are completely unpolished, though I think that just 
removing the superlong filenames entirely might get them close to acceptable.
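
For illustration, one possible trimming (standalone sketch; not what the 
current patch does) is to keep just the file name and drop the rest of the 
path, using one of the paths from the trace below:

{noformat}
import java.nio.file.Paths;

// Illustration only: shorten a streaming trace message by keeping just the
// SSTable file name instead of the full on-disk path.
public class TrimSSTablePath
{
    public static void main(String[] args)
    {
        // One of the superlong paths from the streaming trace below.
        String full = "/home/usrbincc/.ccm/test-5483-7Fz/node1/data/s1/"
                    + "users-b78c3a60db1a11e384aff94811c7b860/s1-users-tmp-ka-2-Data.db";

        String trimmed = Paths.get(full).getFileName().toString();
        System.out.println(trimmed + " 87/87 bytes(100%) received from idx:0/127.0.0.2");
        // -> s1-users-tmp-ka-2-Data.db 87/87 bytes(100%) received from idx:0/127.0.0.2
    }
}
{noformat}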

{noformat}
[2014-05-13 23:49:42,561] Nothing to repair for keyspace 'system'
[2014-05-13 23:49:42,624] Starting repair command #1, repairing 3 ranges for 
keyspace s1.{users} (seq=true, full=true)
[2014-05-13 23:49:42,730] New session c84ae130-db1a-11e3-b0e7-f94811c7b860 will 
sync range (-3074457345618258603,3074457345618258602]
[2014-05-13 23:49:42,973] Requesting merkle trees for users (to [/127.0.0.2, 
/127.0.0.3, /127.0.0.1])
[2014-05-13 23:49:44,058] /127.0.0.2: Sending completed merkle tree to 
/127.0.0.1 for s1/users
[2014-05-13 23:49:44,616] Received merkle tree for users from /127.0.0.2
[2014-05-13 23:49:45,565] /127.0.0.3: Sending completed merkle tree to 
/127.0.0.1 for s1/users
[2014-05-13 23:49:45,706] Received merkle tree for users from /127.0.0.3
[2014-05-13 23:49:46,378] Received merkle tree for users from /127.0.0.1
[2014-05-13 23:49:46,412] New session ca7d5190-db1a-11e3-b0e7-f94811c7b860 will 
sync range (-9223372036854775808,-3074457345618258603]
[2014-05-13 23:49:46,483] Requesting merkle trees for users (to [/127.0.0.2, 
/127.0.0.3, /127.0.0.1])
[2014-05-13 23:49:46,674] Endpoints /127.0.0.2 and /127.0.0.3 are consistent 
for users
[2014-05-13 23:49:46,774] Endpoints /127.0.0.3 and /127.0.0.1 are consistent 
for users
[2014-05-13 23:49:46,781] Endpoints /127.0.0.2 and /127.0.0.1 are consistent 
for users
[2014-05-13 23:49:46,817] Session completed successfully
[2014-05-13 23:49:47,046] /127.0.0.2: Sending completed merkle tree to 
/127.0.0.1 for s1/users
[2014-05-13 23:49:47,116] Received merkle tree for users from /127.0.0.2
[2014-05-13 23:49:47,528] /127.0.0.3: Sending completed merkle tree to 
/127.0.0.1 for s1/users
[2014-05-13 23:49:47,668] Received merkle tree for users from /127.0.0.3
[2014-05-13 23:49:48,149] Received merkle tree for users from /127.0.0.1
[2014-05-13 23:49:48,180] Endpoints /127.0.0.2 and /127.0.0.1 are consistent 
for users
[2014-05-13 23:49:48,181] New session cb8ac9f0-db1a-11e3-b0e7-f94811c7b860 will 
sync range (3074457345618258602,-9223372036854775808]
[2014-05-13 23:49:48,184] Endpoints /127.0.0.2 and /127.0.0.3 are consistent 
for users
[2014-05-13 23:49:48,185] Endpoints /127.0.0.3 and /127.0.0.1 are consistent 
for users
[2014-05-13 23:49:48,209] Session completed successfully
[2014-05-13 23:49:48,258] Requesting merkle trees for users (to [/127.0.0.2, 
/127.0.0.3, /127.0.0.1])
[2014-05-13 23:49:48,525] /127.0.0.2: Sending completed merkle tree to 
/127.0.0.1 for s1/users
[2014-05-13 23:49:48,531] Received merkle tree for users from /127.0.0.2
[2014-05-13 23:49:48,610] /127.0.0.3: Sending completed merkle tree to 
/127.0.0.1 for s1/users
[2014-05-13 23:49:48,653] Received merkle tree for users from /127.0.0.3
[2014-05-13 23:49:48,750] Received merkle tree for users from /127.0.0.1
[2014-05-13 23:49:48,776] Repair session c84ae130-db1a-11e3-b0e7-f94811c7b860 
for range (-3074457345618258603,3074457345618258602] finished
[2014-05-13 23:49:48,777] Repair session ca7d5190-db1a-11e3-b0e7-f94811c7b860 
for range (-9223372036854775808,-3074457345618258603] finished
[2014-05-13 23:49:48,786] Endpoints /127.0.0.2 and /127.0.0.3 are consistent 
for users
[2014-05-13 23:49:48,794] Endpoints /127.0.0.2 and /127.0.0.1 have 1 range(s) 
out of sync for users
[2014-05-13 23:49:48,798] Endpoints /127.0.0.3 and /127.0.0.1 have 1 range(s) 
out of sync for users
[2014-05-13 23:49:48,832] Performing streaming repair of 1 ranges with 
/127.0.0.3
[2014-05-13 23:49:48,850] Performing streaming repair of 1 ranges with 
/127.0.0.2
[2014-05-13 23:49:49,216] Streaming session with /127.0.0.2 prepared.
[2014-05-13 23:49:49,221] Streaming session with /127.0.0.3 prepared.
[2014-05-13 23:49:49,292] 
/home/usrbincc/.ccm/test-5483-7Fz/node1/data/s1/users-b78c3a60db1a11e384aff94811c7b860/s1-users-tmp-ka-2-Data.db
 87/87 bytes(100%) received from idx:0/127.0.0.2
[2014-05-13 23:49:49,306] 
/home/usrbincc/.ccm/test-5483-7Fz/node1/data/s1/users-b78c3a60db1a11e384aff94811c7b860/s1-users-tmp-ka-1-Data.db
 87/87 bytes(100%) received from idx:0/127.0.0.3
[2014-05-13 23:49:49,355] Streaming session with /127.0.0.2 completed 
successfully.
[2014-05-13 23:49:49,404] Streaming session with /127.0.0.3 completed 
successfully.
[2014-05-13 23:49:49,443] Session completed successfully
[2014-05-13 23:49:49,445] Repair session cb8ac9f0-db1a-11e3-b0e7-f94811c7b860 
for range (3074457345618258602,-9223372036854775808] finished
[2014-05-13 23:49:49,445] Repair command #1 finished
[2014-05-13 23:49:49,531] Starting repair command #2, repairing 2 ranges for 
keyspace system_traces.{events, sessions} (seq=true, full=true)
[2014-05-13 23:49:49,541] New session cc5aeb30-db1a-11e3-b0e7-f94811c7b860 will 
sync range (-3074457345618258603,3074457345618258602]
[2014-05-13 23:49:49,650] Requesting merkle trees for events (to [/127.0.0.3, 
/127.0.0.1])
[2014-05-13 23:49:49,920] Received merkle tree for events from /127.0.0.3
[2014-05-13 23:49:49,933] /127.0.0.3: Sending completed merkle tree to 
/127.0.0.1 for system_traces/events
[2014-05-13 23:49:50,190] Received merkle tree for events from /127.0.0.1
[2014-05-13 23:49:50,253] Endpoints /127.0.0.3 and /127.0.0.1 have 1 range(s) 
out of sync for events
[2014-05-13 23:49:50,267] Performing streaming repair of 1 ranges with 
/127.0.0.3
[2014-05-13 23:49:50,311] Requesting merkle trees for sessions (to [/127.0.0.3, 
/127.0.0.1])
[2014-05-13 23:49:50,515] Streaming session with /127.0.0.3 prepared.
[2014-05-13 23:49:50,575] 
/home/usrbincc/.ccm/test-5483-7Fz/node1/data/system_traces/events-8826e8e9e16a372887533bc1fc713c25/system_traces-events-tmp-ka-3-Data.db
 1054/1054 bytes(100%) received from idx:0/127.0.0.3
[2014-05-13 23:49:50,599] 
/home/usrbincc/.ccm/test-5483-7Fz/node1/flush/system_traces/events-8826e8e9e16a372887533bc1fc713c25/system_traces-events-ka-2-Data.db
 815/815 bytes(100%) sent to idx:0/127.0.0.3
[2014-05-13 23:49:50,652] 
/home/usrbincc/.ccm/test-5483-7Fz/node1/data/system_traces/events-8826e8e9e16a372887533bc1fc713c25/system_traces-events-tmp-ka-4-Data.db
 4254/4254 bytes(100%) received from idx:0/127.0.0.3
[2014-05-13 23:49:50,669] 
/home/usrbincc/.ccm/test-5483-7Fz/node1/flush/system_traces/events-8826e8e9e16a372887533bc1fc713c25/system_traces-events-ka-1-Data.db
 4335/4335 bytes(100%) sent to idx:0/127.0.0.3
[2014-05-13 23:49:50,776] Received merkle tree for sessions from /127.0.0.3
[2014-05-13 23:49:50,876] /127.0.0.3: Sending completed merkle tree to 
/127.0.0.1 for system_traces/sessions
[2014-05-13 23:49:50,920] Streaming session with /127.0.0.3 completed 
successfully.
[2014-05-13 23:49:51,312] Received merkle tree for sessions from /127.0.0.1
[2014-05-13 23:49:51,321] New session cd6aad80-db1a-11e3-b0e7-f94811c7b860 will 
sync range (3074457345618258602,-9223372036854775808]
[2014-05-13 23:49:51,339] Endpoints /127.0.0.3 and /127.0.0.1 are consistent 
for sessions
[2014-05-13 23:49:51,425] Session completed successfully
[2014-05-13 23:49:51,442] Requesting merkle trees for events (to [/127.0.0.2, 
/127.0.0.1])
[2014-05-13 23:49:51,613] Received merkle tree for events from /127.0.0.2
[2014-05-13 23:49:51,627] /127.0.0.2: Sending completed merkle tree to 
/127.0.0.1 for system_traces/events
[2014-05-13 23:49:51,734] Received merkle tree for events from /127.0.0.1
[2014-05-13 23:49:51,779] Requesting merkle trees for sessions (to [/127.0.0.2, 
/127.0.0.1])
[2014-05-13 23:49:51,803] Endpoints /127.0.0.2 and /127.0.0.1 are consistent 
for events
[2014-05-13 23:49:51,935] /127.0.0.2: Sending completed merkle tree to 
/127.0.0.1 for system_traces/sessions
[2014-05-13 23:49:51,972] Received merkle tree for sessions from /127.0.0.2
[2014-05-13 23:49:52,243] Received merkle tree for sessions from /127.0.0.1
[2014-05-13 23:49:52,258] Repair session cc5aeb30-db1a-11e3-b0e7-f94811c7b860 
for range (-3074457345618258603,3074457345618258602] finished
[2014-05-13 23:49:52,268] Endpoints /127.0.0.2 and /127.0.0.1 are consistent 
for sessions
[2014-05-13 23:49:52,279] Session completed successfully
[2014-05-13 23:49:52,283] Repair session cd6aad80-db1a-11e3-b0e7-f94811c7b860 
for range (3074457345618258602,-9223372036854775808] finished
[2014-05-13 23:49:52,283] Repair command #2 finished
{noformat}


> Repair tracing
> --------------
>
>                 Key: CASSANDRA-5483
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5483
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Yuki Morishita
>            Assignee: Ben Chan
>            Priority: Minor
>              Labels: repair
>         Attachments: 5483-full-trunk.txt, 
> 5483-v06-04-Allow-tracing-ttl-to-be-configured.patch, 
> 5483-v06-05-Add-a-command-column-to-system_traces.events.patch, 
> 5483-v06-06-Fix-interruption-in-tracestate-propagation.patch, 
> 5483-v07-07-Better-constructor-parameters-for-DebuggableThreadPoolExecutor.patch,
>  5483-v07-08-Fix-brace-style.patch, 
> 5483-v07-09-Add-trace-option-to-a-more-complete-set-of-repair-functions.patch,
>  5483-v07-10-Correct-name-of-boolean-repairedAt-to-fullRepair.patch, 
> 5483-v08-11-Shorten-trace-messages.-Use-Tracing-begin.patch, 
> 5483-v08-12-Trace-streaming-in-Differencer-StreamingRepairTask.patch, 
> 5483-v08-13-sendNotification-of-local-traces-back-to-nodetool.patch, 
> 5483-v08-14-Poll-system_traces.events.patch, 
> 5483-v08-15-Limit-trace-notifications.-Add-exponential-backoff.patch, 
> 5483-v09-16-Fix-hang-caused-by-incorrect-exit-code.patch, 
> 5483-v10-17-minor-bugfixes-and-changes.patch, 
> 5483-v10-rebased-and-squashed-471f5cc.patch, 5483-v11-01-squashed.patch, 
> 5483-v11-squashed-nits.patch, 5483-v12-02-cassandra-yaml-ttl-doc.patch, 
> ccm-repair-test, cqlsh-left-justify-text-columns.patch, 
> prerepair-vs-postbuggedrepair.diff, test-5483-system_traces-events.txt, 
> trunk@4620823-5483-v02-0001-Trace-filtering-and-tracestate-propagation.patch, 
> trunk@4620823-5483-v02-0002-Put-a-few-traces-parallel-to-the-repair-logging.patch,
>  tr...@8ebeee1-5483-v01-001-trace-filtering-and-tracestate-propagation.txt, 
> tr...@8ebeee1-5483-v01-002-simple-repair-tracing.txt, 
> v02p02-5483-v03-0003-Make-repair-tracing-controllable-via-nodetool.patch, 
> v02p02-5483-v04-0003-This-time-use-an-EnumSet-to-pass-boolean-repair-options.patch,
>  v02p02-5483-v05-0003-Use-long-instead-of-EnumSet-to-work-with-JMX.patch
>
>
> I think it would be nice to log repair stats and results like query tracing 
> stores traces to system keyspace. With it, you don't have to lookup each log 
> file to see what was the status and how it performed the repair you invoked. 
> Instead, you can query the repair log with session ID to see the state and 
> stats of all nodes involved in that repair session.


