[ 
https://issues.apache.org/jira/browse/DISPATCH-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ken Giusti reassigned DISPATCH-2059:
------------------------------------

    Assignee: Jiri Daněk

> Support running router under rr during test execution
> -----------------------------------------------------
>
>                 Key: DISPATCH-2059
>                 URL: https://issues.apache.org/jira/browse/DISPATCH-2059
>             Project: Qpid Dispatch
>          Issue Type: Wish
>          Components: Tests
>    Affects Versions: 1.15.0
>            Reporter: Jiri Daněk
>            Assignee: Jiri Daněk
>            Priority: Major
>
> Dispatch has an environment variable {{QPID_DISPATCH_RUNNER}} which,
> according to a comment, is intended for running tests under valgrind. That
> comment is outdated, because memory checking is now handled differently, in
> {{RuntimeChecks.cmake}}. One tool that would make sense to wrap dispatch
> with is rr, the record-replay debugger from Mozilla
> (https://rr-project.org/).
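
A minimal sketch of how a runner variable like this can wrap the router command. The source only says the variable exists, so this is an assumption about how a harness might consume it, not Dispatch's actual test code. The helper name `wrap_with_runner` and the file name `router.conf` are hypothetical.

```python
import os
import shlex


def wrap_with_runner(argv, env=os.environ):
    """Prefix argv with the command named in QPID_DISPATCH_RUNNER, if set.

    Hypothetical helper: the real test harness may read the variable
    differently.
    """
    runner = env.get("QPID_DISPATCH_RUNNER", "")
    # shlex.split lets the variable carry arguments, e.g. "rr record".
    return shlex.split(runner) + list(argv)


if __name__ == "__main__":
    # With QPID_DISPATCH_RUNNER="rr record" the router would start under rr.
    example_env = {"QPID_DISPATCH_RUNNER": "rr record"}
    print(wrap_with_runner(["qdrouterd", "-c", "router.conf"], env=example_env))
```

With the variable unset, the command is returned unchanged, so a normal test run is unaffected.
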
> I've previously tried rr with (very) limited success in DISPATCH-782.
> [~aconway] considered it while working on DISPATCH-902 and used it on other 
> issues.
> An earlier attempt to use rr
> (https://issues.apache.org/jira/browse/DISPATCH-739?focusedCommentId=15983719&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15983719)
> did not survive in mainline to the present day.
> I have two problems with rr:
> # Dispatch system-tests send SIGTERM to the subprocess they started, which
> is rr. What is necessary is to signal rr's children instead; killing rr
> abruptly terminates the recording. When I press ^C on
> {{rr record qdrouterd -c ...}} in a terminal, the signal correctly reaches
> the child. I am not sure where the difference in the test comes from.
> Explicitly killing only the children in the system test does the right
> thing. Sadly, doing that requires hacks: Python's subprocess module offers
> no easy way to query children. The os module has some ways; psutil is the
> easiest, but that's a third-party dependency.
> # The CLion debugger disconnects during replay when qdrouterd gets SIGTERM,
> even though the router handles that signal and continues running (cleanup).
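
For the first problem, here is a rough sketch of signalling only the direct children of the wrapper process, without psutil. It assumes Linux with `/proc` mounted. The function names are hypothetical and this is not what the Dispatch system tests currently do. It finds direct children only, so anything the router itself spawns would be missed.

```python
import os
import signal


def parse_ppid(stat_text):
    """Return the parent pid from the contents of /proc/<pid>/stat."""
    # Format is "pid (comm) state ppid ...". comm may contain spaces or
    # parentheses, so split on the last ')' before reading the fields.
    fields = stat_text.rsplit(")", 1)[1].split()
    return int(fields[1])


def child_pids(parent_pid):
    """List pids whose parent is parent_pid by scanning /proc (Linux only)."""
    children = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as stat_file:
                if parse_ppid(stat_file.read()) == parent_pid:
                    children.append(int(entry))
        except (OSError, IndexError, ValueError):
            continue  # the process exited or its stat line was unreadable
    return children


def terminate_children(parent_pid, sig=signal.SIGTERM):
    """Send sig to each direct child of parent_pid; return the pids found."""
    pids = child_pids(parent_pid)
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # child exited between the scan and the signal
    return pids
```

A test teardown could call `terminate_children(popen.pid)` and then wait on the `Popen` object, so that rr sees its child exit and can finish the recording itself.
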
> One awesome feature of rr is that a recording can be replayed many times,
> backwards and forwards, and all memory addresses stay the same on every
> replay. This means one can set {{watch -l *0x0000000}} watchpoints on
> specific memory locations and use the {{reverse-cont}} gdb command. (rr
> emulates the gdb UI; if I understand correctly, it is actually a wrapper
> over gdb.)
> h3. Chaos mode
> rr has a {{--chaos}} switch, which tries to explore different thread
> schedules so as to reveal more crashes; that could be useful.
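
One way chaos mode might be used is to record the same command repeatedly until a run fails, keeping that recording for replay. The sketch below assumes that `rr record` exits non-zero when the recorded program fails. The function names and the retry limit are hypothetical.

```python
import subprocess


def chaos_command(argv):
    """Build an `rr record --chaos` command line around argv."""
    return ["rr", "record", "--chaos", *argv]


def record_until_failure(argv, max_runs=20, run=subprocess.call):
    """Record argv under chaos mode until one run exits non-zero.

    Returns the 1-based attempt number that failed, or None if every run
    passed. `run` is injectable so the loop can be exercised without rr.
    """
    for attempt in range(1, max_runs + 1):
        if run(chaos_command(argv)) != 0:
            return attempt
    return None
```

The failing run's trace can then be inspected with `rr replay`.
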



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
