live4thee opened a new pull request, #3261:
URL: https://github.com/apache/brpc/pull/3261
Otherwise a segfault may occur due to the race below:
```txt
Socket::OnInputEvent()             |
`-- ProcessEvent (bthread)         |
                                   |
    [ bthread queued ]             | QP error -> SetFailed -> HC -> WaitAndReset()
                                   |   Reset() -> _sbuf.clear()
                                   |   CheckHealth() -> Revive()
                                   |
                                   | Socket is now Addressable!
RdmaEndpoint::PollCq()             |
  Socket::Address() OK!            |
  RdmaEndpoint::HandleCompletion()
    _sbuf[_sq_sent++].clear()   <= BOOM! CQ is not drained but _sbuf is cleared.
```
Another possible fix is to add a _generation_ field to RdmaEndpoint, such that:
- each RdmaEndpoint::Reset() advances the _generation_ by 1;
- RdmaEndpoint::PollCq(m, orig_gen) needs to compare orig_gen against the current _generation_.

However, that would change the existing interface, and we need to drain the CQ anyway.
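
For illustration only, the generation-based alternative could look roughly like the sketch below. `EndpointSketch` and `DemoStaleCompletionRejected` are hypothetical names, not brpc classes or functions; `_sbuf` is reduced to a vector of ints, and the scenario is replayed on a single thread, so it shows only the stale-completion check and not the synchronization a real implementation would need.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for RdmaEndpoint; NOT the actual brpc class.
class EndpointSketch {
public:
    explicit EndpointSketch(size_t depth) : _sbuf(depth, 1) {}

    uint64_t generation() const {
        return _generation.load(std::memory_order_acquire);
    }

    // "Each Reset() advances the generation by 1."
    void Reset() {
        _sbuf.clear();
        _sq_sent = 0;
        _generation.fetch_add(1, std::memory_order_acq_rel);
    }

    // "PollCq(m, orig_gen) compares the generation."
    // Returns false when the completion belongs to an older generation
    // (or when there is no slot left), so _sbuf is never touched.
    bool HandleCompletion(uint64_t orig_gen) {
        if (orig_gen != generation()) return false;  // stale completion
        if (_sq_sent >= _sbuf.size()) return false;  // defensive bound check
        _sbuf[_sq_sent++] = 0;                       // "release" the slot
        return true;
    }

private:
    std::vector<int> _sbuf;
    size_t _sq_sent = 0;
    std::atomic<uint64_t> _generation{0};
};

// Replays the race from the diagram in order, on one thread.
inline bool DemoStaleCompletionRejected() {
    EndpointSketch ep(4);
    const uint64_t orig_gen = ep.generation();  // captured when the bthread is queued
    ep.Reset();                                 // health check resets the endpoint
    return !ep.HandleCompletion(orig_gen);      // the late completion is ignored
}
```

The caller has to capture the generation before the event is queued and pass it through, which is the interface change mentioned above.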
### What problem does this PR solve?
Issue Number: [3252](https://github.com/apache/brpc/issues/3252)
Problem Summary: see above.
### What is changed and the side effects?
Changed:
1. drain the CQ after moving the QP into the RESET state;
2. do not reuse the QP if the drain fails.
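
As a rough sketch of the two changes above (not the code in this PR), the control flow might look like the following. `MockQp`, `MockCq`, and `ResetAndDrain` are hypothetical stand-ins so the example runs without RDMA hardware; with libibverbs the corresponding calls would presumably be `ibv_modify_qp` (to `IBV_QPS_RESET`) and `ibv_poll_cq`.

```cpp
#include <cassert>
#include <deque>

// Hypothetical mock of a completion queue; not a libibverbs or brpc type.
struct MockCq {
    std::deque<int> pending;  // outstanding completions
    bool broken = false;      // simulates a polling error

    // Returns the number of completions polled (>= 0), or -1 on error.
    int Poll(int max_entries) {
        if (broken) return -1;
        int n = 0;
        while (n < max_entries && !pending.empty()) {
            pending.pop_front();
            ++n;
        }
        return n;
    }
};

enum class QpState { RTS, RESET };

// Hypothetical mock of a queue pair.
struct MockQp {
    QpState state = QpState::RTS;
};

// Returns true if the QP may be reused, false if it should be discarded.
bool ResetAndDrain(MockQp& qp, MockCq& cq) {
    qp.state = QpState::RESET;  // 1. move the QP into the RESET state
    const int kBatch = 16;
    for (;;) {                  //    then drain whatever is left in the CQ
        const int n = cq.Poll(kBatch);
        if (n < 0) return false;  // 2. drain failed: do not reuse the QP
        if (n == 0) return true;  //    CQ is empty: safe to reuse
    }
}
```

A caller would recreate the QP when `ResetAndDrain` returns false instead of reviving the socket on top of it.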
Side effects:
- Performance effects: n/a
- Breaking backward compatibility: none
---
### Check List:
- Please make sure your changes are compilable.
- When providing us with a new feature, it is best to add related tests.
- Please follow [Contributor Covenant Code of
Conduct](https://github.com/apache/brpc/blob/master/CODE_OF_CONDUCT.md).