[ 
https://issues.apache.org/jira/browse/CASSANDRA-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180004#comment-14180004
 ] 

Ben Chan commented on CASSANDRA-5483:
-------------------------------------

Thanks; "nodetool repair -full" (in conjunction with "ccm clear") does indeed 
force streaming repair.

[^5483-v17-00.patch] fixes a few trace calls that got lost/misplaced in the 
rebase.
[^5483-v17-01.patch] reorganizes some code; functionally identical.

{noformat}
# This bash code should work out-of-the-box with a stock setup.
# Build the nodetool command line (host, JMX port) from "ccm <node> show".
JMXGET='/jmx_port/{p=$2;} /binary/{split($2,a,/\047/);h=a[2];}
  END{printf("bin/nodetool -h %s -p %s\n",h,p);}'
ccm_nodetool() { local N=$1; shift; $(ccm "$N" show | awk -F= "$JMXGET") "$@"; }
# Download each URL unless already present; git-apply it if it ends in .patch.
dl_apply_maybe() { for url; do { [ -e "$(basename "$url")" ] || curl -sO "$url"; } &&
  ! [ "${url%.patch}" = "$url" ] && git apply "$(basename "$url")"; done; }

NEW_BRANCH=$(date +5483-17--%Y%m%d-%H%M%S)
W=https://issues.apache.org/jira/secure/attachment

git checkout -b $NEW_BRANCH 49833b9 &&
dl_apply_maybe \
  $W/12633156/ccm-repair-test \
  $W/12675963/5483-v17.patch \
  $W/12675963/5483-v17-00.patch \
  $W/12676340/5483-v17-01.patch &&
ant clean && ant &&
chmod +x ./ccm-repair-test && ./ccm-repair-test -kR &&
ccm node1 stop && ccm node1 clear && ccm node1 start &&
ccm_nodetool node1 repair -tr -full &&
ccm node1 showlog | grep "Performing streaming repair"
{noformat}
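
To make the JMXGET helper less opaque, here is a minimal sketch of what it 
extracts. The sample text is a hypothetical stand-in for "ccm node1 show" 
output (the layout and values are assumed, not captured from a real cluster), 
and sample_show is an illustrative name that is not part of the script above.

```shell
# Hypothetical sample of "ccm node1 show" output (format assumed, values illustrative).
sample_show() {
  cat <<'EOF'
node1: UP
       cluster=test
       binary=('127.0.0.1', 9042)
       storage=('127.0.0.1', 7000)
       jmx_port=7100
EOF
}

# Same extraction logic as JMXGET: with -F=, $2 is the text after the "=".
# \047 is the octal escape for a single quote, so split() isolates the host
# between the first pair of quotes on the "binary" line.
JMXGET='/jmx_port/{p=$2;} /binary/{split($2,a,/\047/);h=a[2];}
  END{printf("bin/nodetool -h %s -p %s\n",h,p);}'

nodetool_cmd=$(sample_show | awk -F= "$JMXGET")
echo "$nodetool_cmd"
```

With this sample input the helper yields "bin/nodetool -h 127.0.0.1 -p 7100", 
which ccm_nodetool then runs with whatever arguments it was given.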

----

Note that I switched some code to a different thread in order to facilitate 
trace handling -- see the diff hunk near the end of 
StorageService#createRepairTask. I needed certain calls to happen in the parent 
repair thread, and this seemed to be the simplest way.

As far as I can tell, there shouldn't be any differences in functionality or 
concurrency level (i.e. number of tasks that are executing concurrently), but 
someone should examine that section just to make sure.

----

The issue with unfiltered traces (see my previous message) still remains, but 
if push comes to shove, you can treat it as just a special case of having a 
trace call where it's not needed, or missing one where it *is* needed.

In other words, a cosmetic problem. Fixing it should not require any changes 
or additions to the wire protocol or the JMX interface.


> Repair tracing
> --------------
>
>                 Key: CASSANDRA-5483
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5483
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Yuki Morishita
>            Assignee: Ben Chan
>            Priority: Minor
>              Labels: repair
>             Fix For: 3.0
>
>         Attachments: 5483-full-trunk.txt, 
> 5483-v06-04-Allow-tracing-ttl-to-be-configured.patch, 
> 5483-v06-05-Add-a-command-column-to-system_traces.events.patch, 
> 5483-v06-06-Fix-interruption-in-tracestate-propagation.patch, 
> 5483-v07-07-Better-constructor-parameters-for-DebuggableThreadPoolExecutor.patch,
>  5483-v07-08-Fix-brace-style.patch, 
> 5483-v07-09-Add-trace-option-to-a-more-complete-set-of-repair-functions.patch,
>  5483-v07-10-Correct-name-of-boolean-repairedAt-to-fullRepair.patch, 
> 5483-v08-11-Shorten-trace-messages.-Use-Tracing-begin.patch, 
> 5483-v08-12-Trace-streaming-in-Differencer-StreamingRepairTask.patch, 
> 5483-v08-13-sendNotification-of-local-traces-back-to-nodetool.patch, 
> 5483-v08-14-Poll-system_traces.events.patch, 
> 5483-v08-15-Limit-trace-notifications.-Add-exponential-backoff.patch, 
> 5483-v09-16-Fix-hang-caused-by-incorrect-exit-code.patch, 
> 5483-v10-17-minor-bugfixes-and-changes.patch, 
> 5483-v10-rebased-and-squashed-471f5cc.patch, 5483-v11-01-squashed.patch, 
> 5483-v11-squashed-nits.patch, 5483-v12-02-cassandra-yaml-ttl-doc.patch, 
> 5483-v13-608fb03-May-14-trace-formatting-changes.patch, 
> 5483-v14-01-squashed.patch, 
> 5483-v15-02-Hook-up-exponential-backoff-functionality.patch, 
> 5483-v15-03-Exact-doubling-for-exponential-backoff.patch, 
> 5483-v15-04-Re-add-old-StorageService-JMX-signatures.patch, 
> 5483-v15-05-Move-command-column-to-system_traces.sessions.patch, 
> 5483-v15.patch, 5483-v17-00.patch, 5483-v17-01.patch, 5483-v17.patch, 
> ccm-repair-test, cqlsh-left-justify-text-columns.patch, 
> prerepair-vs-postbuggedrepair.diff, test-5483-system_traces-events.txt, 
> trunk@4620823-5483-v02-0001-Trace-filtering-and-tracestate-propagation.patch, 
> trunk@4620823-5483-v02-0002-Put-a-few-traces-parallel-to-the-repair-logging.patch,
>  tr...@8ebeee1-5483-v01-001-trace-filtering-and-tracestate-propagation.txt, 
> tr...@8ebeee1-5483-v01-002-simple-repair-tracing.txt, 
> v02p02-5483-v03-0003-Make-repair-tracing-controllable-via-nodetool.patch, 
> v02p02-5483-v04-0003-This-time-use-an-EnumSet-to-pass-boolean-repair-options.patch,
>  v02p02-5483-v05-0003-Use-long-instead-of-EnumSet-to-work-with-JMX.patch
>
>
> I think it would be nice to log repair stats and results the way query 
> tracing stores traces to the system keyspace. With it, you don't have to look 
> up each log file to see the status and performance of the repair you invoked. 
> Instead, you can query the repair log by session ID to see the state and 
> stats of all nodes involved in that repair session.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
