fapaul commented on a change in pull request #18790:
URL: https://github.com/apache/flink/pull/18790#discussion_r809110464

##########
File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/SourceOperator.java
##########
@@ -423,6 +422,16 @@ private DataInputStatus emitNextNotReading(DataOutput<OUT> output) throws Except
         }
     }
+    private void initializeMainOutput(DataOutput<OUT> output) {
+        currentMainOutput = eventTimeLogic.createMainOutput(output, this::onWatermarkEmitted);
+        initializeLatencyMarkerEmitter(output);
+        lastInvokedOutput = output;
+        outputPendingSplits.forEach(
+                split -> currentMainOutput.createOutputForSplit(split.splitId()));

Review comment:
   I am wondering whether it might be problematic that you initialize all the outputs up front but do not release them. Previously the outputs were initialized one by one here [1] and released when moving to the next split [2].

   [1] https://github.com/apache/flink/blob/106280e10a96d729943985986198b942446197d9/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/SourceReaderBase.java#L327
   [2] https://github.com/apache/flink/blob/106280e10a96d729943985986198b942446197d9/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/SourceReaderBase.java#L195

##########
File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/SourceOperator.java
##########
@@ -516,7 +525,14 @@ public void handleOperatorEvent(OperatorEvent event) {
             checkWatermarkAlignment();
         } else if (event instanceof AddSplitEvent) {
             try {
-                sourceReader.addSplits(((AddSplitEvent<SplitT>) event).splits(splitSerializer));
+                List<SplitT> newSplits = ((AddSplitEvent<SplitT>) event).splits(splitSerializer);

Review comment:
   Is `handleOperatorEvent` executed by the mailbox thread, like the other methods?

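To make the concern in the first `SourceOperator.java` comment above concrete, here is a minimal, self-contained sketch of the create/release pairing. It does not use Flink classes: `SplitOutput`, `SplitOutputRegistry`, `activeOutputs` and `SplitOutputDemo` are hypothetical names introduced only for illustration, and the registry is an assumed simplification of whatever bookkeeping sits behind `createOutputForSplit`. The point it illustrates is only that an output created eagerly for every pending split stays registered until something explicitly releases it.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a per-split output; not a Flink class.
final class SplitOutput {
    final String splitId;

    SplitOutput(String splitId) {
        this.splitId = splitId;
    }
}

// Hypothetical registry that models pairing each create with a release.
final class SplitOutputRegistry {
    private final Map<String, SplitOutput> outputs = new LinkedHashMap<>();

    // Returns the existing output for the split, or creates and registers one.
    SplitOutput createOutputForSplit(String splitId) {
        return outputs.computeIfAbsent(splitId, SplitOutput::new);
    }

    // Drops the output for a split; a no-op if none is registered.
    void releaseOutputForSplit(String splitId) {
        outputs.remove(splitId);
    }

    // Number of outputs that are still registered.
    int activeOutputs() {
        return outputs.size();
    }
}

final class SplitOutputDemo {
    public static void main(String[] args) {
        SplitOutputRegistry registry = new SplitOutputRegistry();

        // Eager initialization: one output per pending split, created up front.
        for (String splitId : Arrays.asList("split-a", "split-b")) {
            registry.createOutputForSplit(splitId);
        }
        System.out.println(registry.activeOutputs()); // prints 2

        // Without this call, the output for a finished split would stay registered.
        registry.releaseOutputForSplit("split-a");
        System.out.println(registry.activeOutputs()); // prints 1
    }
}
```

In this model, eager creation is harmless as long as every split that finishes still triggers a matching release; whether the PR keeps that pairing is exactly what the comment asks.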
##########
File path: flink-connectors/flink-connector-base/src/test/java/org/apache/flink/connector/base/source/reader/SourceReaderBaseTest.java
##########
@@ -239,6 +257,77 @@ void testPollNextReturnMoreAvailableWhenAllSplitFetcherCloseWithLeftoverElementI
                 .isEqualTo(InputStatus.MORE_AVAILABLE);
     }
+    @Test
+    void testPerSplitWatermark() throws Exception {
+        MockSplitReader mockSplitReader =
+                MockSplitReader.newBuilder()
+                        .setNumRecordsPerSplitPerFetch(3)
+                        .setBlockingFetch(true)
+                        .build();
+
+        MockSourceReader reader =
+                new MockSourceReader(
+                        new FutureCompletingBlockingQueue<>(),
+                        () -> mockSplitReader,
+                        new Configuration(),
+                        new TestingReaderContext());
+
+        SourceOperator<Integer, MockSourceSplit> sourceOperator =
+                createTestOperator(
+                        reader,
+                        WatermarkStrategy.forGenerator(
+                                (context) -> new OnEventWatermarkGenerator()),
+                        true);
+
+        MockSourceSplit splitA = new MockSourceSplit(0, 0, 3);
+        splitA.addRecord(100);
+        splitA.addRecord(200);
+        splitA.addRecord(300);
+
+        MockSourceSplit splitB = new MockSourceSplit(1, 0, 3);
+        splitB.addRecord(150);
+        splitB.addRecord(250);
+        splitB.addRecord(350);
+
+        AddSplitEvent<MockSourceSplit> addSplitsEvent =
+                new AddSplitEvent<>(Arrays.asList(splitA, splitB), new MockSourceSplitSerializer());
+        sourceOperator.handleOperatorEvent(addSplitsEvent);
+        WatermarkCollectingDataOutput output = new WatermarkCollectingDataOutput();
+
+        // First 3 records from split A should not generate any watermarks
+        CommonTestUtils.waitUtil(

Review comment:
   Why do you need this test loop? Can't you call `emitNext` the correct number of times?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org