pdolif opened a new pull request, #25778:
URL: https://github.com/apache/pulsar/pull/25778

   Fixes #25678
   
   ### Motivation
   
   The V5 scalable topic producer with exclusive access mode initially claims 
all active segments. The test 
`V5ProducerAccessModeTest.testExclusiveRejectsSecondProducer` verifies that a 
second exclusive producer is rejected. Therefore, it starts two exclusive 
producers sequentially and asserts that the second producer creation fails. 
Currently, the test is flaky, and it happens that from the client perspective, 
both producers seem to be created successfully. The reason for that is a race 
condition between `eagerAttachInitialAsync` and `onLayoutChange` in 
`ScalableTopicProducer`.
   
   For the second producer, the `ProducerBuilderV5` creates the 
`DagWatchClient`, starts it, and calls `eagerAttachInitialAsync` once it 
receives the initial layout. Concurrently, `onLayoutChange` may receive the 
initial layout. Both methods try to claim the active segments and there is a 
race condition between the two who first creates the segment producer(s).
   
   `eagerAttachInitialAsync`:
   ```java
   for (var seg : activeSegments) {
       if (segmentProducers.containsKey(seg.segmentId())) {
           continue; // -> if onLayoutChange created the producer, the segment 
is skipped here and the producer creation completes successfully
       }
       try {
           getOrCreateSegmentProducer(seg.segmentId());
       } catch (PulsarClientException e) {
           throw new java.util.concurrent.CompletionException(e);
       }
   }
   ```
   
   `onLayoutChange`:
   ```java
   for (var seg : newLayout.activeSegments()) {
       if (segmentProducers.containsKey(seg.segmentId())) {
           continue;
       }
       try {
           getOrCreateSegmentProducer(seg.segmentId());
       } catch (PulsarClientException e) {
           // if onLayoutChange creates the producer, errors are not visible to 
the producer creation flow
           log.warn()...
       }
   }
   ```
   
   ### Modifications
   
   - Skip the active segments claim in `onLayoutChange` for the initial change. 
The initial claim is done by `eagerAttachInitialAsync`.
   - Never skip the `getOrCreateSegmentProducer` call in 
`eagerAttachInitialAsync` to avoid issues if a second layout change is executed 
before `eagerAttachInitialAsync` created the producer(s).
   
   
   ### Verifying this change
   
   - [x] Make sure that the change passes the CI checks.
   
   ### Does this pull request potentially affect one of the following parts:
   
   *If the box was checked, please highlight the changes*
   
   - [ ] Dependencies (add or upgrade a dependency)
   - [ ] The public API
   - [ ] The schema
   - [ ] The default values of configurations
   - [ ] The threading model
   - [ ] The binary protocol
   - [ ] The REST endpoints
   - [ ] The admin CLI options
   - [ ] The metrics
   - [ ] Anything that affects deployment


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to