dariuszseweryn commented on PR #10053: URL: https://github.com/apache/nifi/pull/10053#issuecomment-3079256710
> The ConsumeKafka Processor does produce different output FlowFiles when records have different associated Schemas. However, instead of comparing whether a Record Schema is compatible with the Schema from the first Record, it groups output based on Record Schema equality. It still sends invalid records to a parse.failure Relationship, but it provides a logical grouping based on common Record Schema references. This fits the scenario of embedded schema references, and can also work for scenarios where schema inference produces different Record Schema results for different Records. This approach would also avoid some of the performance concerns related to evaluating a KinesisRecord multiple times, and avoids the need for changes to Record Writers.

The downside of such an approach is that offset tracking becomes impossible without per-row sequence/sub-sequence numbers. Was this considered in `ConsumeKafka`?
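
To make the offset concern concrete, here is a minimal sketch of grouping by schema equality. It is an illustration only, not the NiFi implementation: `ParsedRecord`, `schemaId`, and `groupBySchema` are hypothetical names, and a plain string stands in for a Record Schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaGroupingSketch {
    // Hypothetical stand-in for a parsed record: its source offset,
    // an identifier for its schema, and its payload.
    record ParsedRecord(long offset, String schemaId, String value) {}

    // Groups records by schema equality, keeping schemas in first-seen order.
    // Each group would correspond to one output FlowFile.
    static Map<String, List<ParsedRecord>> groupBySchema(List<ParsedRecord> records) {
        Map<String, List<ParsedRecord>> groups = new LinkedHashMap<>();
        for (ParsedRecord r : records) {
            groups.computeIfAbsent(r.schemaId(), k -> new ArrayList<>()).add(r);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<ParsedRecord> batch = List.of(
                new ParsedRecord(0, "A", "a0"),
                new ParsedRecord(1, "B", "b0"),
                new ParsedRecord(2, "A", "a1"));

        // Print the offsets that end up in each schema group.
        groupBySchema(batch).forEach((schema, group) ->
                System.out.println(schema + " -> offsets "
                        + group.stream().map(ParsedRecord::offset).toList()));
    }
}
```

In this sketch the group for schema `A` holds offsets 0 and 2 while `B` holds offset 1, so no single group covers a contiguous offset range. That interleaving is why a per-record sequence or sub-sequence number would be needed to track progress per output.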
