cadonna commented on code in PR #18835:
URL: https://github.com/apache/kafka/pull/18835#discussion_r1954155941
##########
streams/src/test/java/org/apache/kafka/streams/tests/SmokeTestDriver.java:
##########
@@ -521,13 +521,16 @@ private static boolean verify(final PrintStream
resultStream,
final Map<String, Map<String,
LinkedList<ConsumerRecord<String, Number>>>> events,
final Function<String, Number>
keyToExpectation,
final boolean printResults) {
+ resultStream.printf("verifying topic '%s'%n", topic);
final Map<String, LinkedList<ConsumerRecord<String, Number>>>
observedInputEvents = events.get("data");
final Map<String, LinkedList<ConsumerRecord<String, Number>>>
outputEvents = events.getOrDefault(topic, emptyMap());
if (outputEvents.isEmpty()) {
- resultStream.println(topic + " is empty");
+ resultStream.println("missing result data; topic '" + topic + "'
is empty, expected " + inputData.size() + " keys");
return false;
} else {
- resultStream.printf("verifying %s with %d keys%n", topic,
outputEvents.size());
+ if (outputEvents.size() < inputData.size()) {
+ resultStream.println("missing result data; got " +
inputData.size() + " keys, expected: " + outputEvents.size() + " keys");
+ }
if (outputEvents.size() != inputData.size()) {
Review Comment:
Fair enough!
I meant something like the following:
```java
if (outputEvents.size() != inputData.size()) {
    if (outputEvents.size() < inputData.size()) {
        // fewer output keys than input keys means result data is missing
        resultStream.println("fail: missing result data; got " +
            outputEvents.size() + " keys, expected: " + inputData.size() + " keys");
        return false;
    } else {
        // distinguish between eos and alos
        // requires passing processingGuarantee to verify()
    }
}
```
##########
tests/kafkatest/tests/streams/streams_smoke_test.py:
##########
@@ -109,5 +109,7 @@ def test_streams(self, processing_guarantee, crash,
metadata_quorum):
if crash and processing_guarantee == 'at_least_once':
self.driver.node.account.ssh("grep -E
'SUCCESS|PROCESSED-MORE-THAN-GENERATED' %s" % self.driver.STDOUT_FILE,
allow_fail=False)
+ # fail if we find "missing result data" output in the stdout file;
while we can tolerate duplication, we cannot tolerate data loss
+ self.driver.node.account.ssh("[ ! `grep 'missing result data'" %
self.driver.STDOUT_FILE % "` ]", allow_fail=False)
else:
self.driver.node.account.ssh("grep SUCCESS %s" %
self.driver.STDOUT_FILE, allow_fail=False)
Review Comment:
It is a bit odd that we grep for `SUCCESS|PROCESSED-MORE-THAN-GENERATED`.
The two outputs are independent: a run can print both of them or just one.
The issue that led to this PR was that the output contained both
`PROCESSED-MORE-THAN-GENERATED` and `FAILURE`. Because the test found
`PROCESSED-MORE-THAN-GENERATED`, it treated the run as successful even though
the verification had failed.
If we can ensure that under ALOS the test still prints `SUCCESS` even when
it processed some records multiple times, we could simplify this check to a
plain grep for `SUCCESS`.
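To illustrate the masking, here is a minimal Python sketch that models the two grep patterns with `re.search`. The sample stdout strings and the function names `current_check` and `simplified_check` are hypothetical and are not part of the test suite; only the marker strings come from the discussion above.

```python
import re

# Hypothetical stdout contents, modelled on the failing run described above.
FAILED_RUN = "PROCESSED-MORE-THAN-GENERATED\nFAILURE\n"
# Hypothetical ALOS run with duplicates that still ends in SUCCESS.
CLEAN_RUN = "PROCESSED-MORE-THAN-GENERATED\nSUCCESS\n"


def current_check(stdout: str) -> bool:
    """Model of grep -E 'SUCCESS|PROCESSED-MORE-THAN-GENERATED':
    passes as soon as either marker appears anywhere in the output."""
    return re.search(r"SUCCESS|PROCESSED-MORE-THAN-GENERATED", stdout) is not None


def simplified_check(stdout: str) -> bool:
    """Model of grep SUCCESS: passes only if the driver printed SUCCESS."""
    return re.search(r"SUCCESS", stdout) is not None


if __name__ == "__main__":
    for name, out in (("failed run", FAILED_RUN), ("clean run", CLEAN_RUN)):
        print(name, current_check(out), simplified_check(out))
```

With these inputs the alternation pattern accepts the failed run, while the plain `SUCCESS` pattern rejects it. That only helps if the driver reliably prints `SUCCESS` under ALOS when duplicates occur, which is the assumption stated above.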