exceptionfactory commented on PR #10053:
URL: https://github.com/apache/nifi/pull/10053#issuecomment-3079218340

   Thanks for the substantive review of options and current discussion 
@dariuszseweryn, that is helpful framing.
   
   Reviewing the options, and considering similar issues in the `ConsumeKafka` 
Processor, the use case related to an embedded schema is a good candidate for 
consideration. Solving that scenario could also benefit the schema inference 
scenario.
   
   Here is an additional approach to consider:
   
   The `ConsumeKafka` Processor produces separate output FlowFiles when records 
have different associated Schemas. However, instead of checking whether a 
Record Schema is compatible with the Schema from the first Record, it groups output 
based on Record Schema equality. It still sends invalid records to a 
`parse.failure` Relationship, but it provides a logical grouping based on 
common Record Schema references. This fits the scenario of embedded schema 
references, and can also work for scenarios where schema inference produces 
different Record Schema results for different Records. This approach would also 
avoid some of the performance concerns related to evaluating a `KinesisRecord` 
multiple times, and would avoid the need for changes to Record Writers.
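
   As a rough illustration of the grouping idea, here is a minimal standalone 
sketch. It does not use NiFi classes: `SimpleSchema`, `ParsedRecord`, 
`GroupingResult`, `groupBySchema`, and `parseToy` are hypothetical names made up 
for this example, and the toy `key=value` parser only stands in for a real 
Record Reader. Each input is parsed once, valid records are bucketed by schema 
equality, and unparseable inputs are collected separately, similar in spirit to 
routing to `parse.failure`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

class SchemaGrouping {

    // Hypothetical stand-in for a Record Schema: field name -> type name.
    // Record equality compares the field maps, so equal schemas share a group.
    record SimpleSchema(Map<String, String> fields) { }

    // Hypothetical stand-in for a parsed Record paired with its schema.
    record ParsedRecord(SimpleSchema schema, Map<String, String> values) { }

    // One group per distinct schema, plus raw inputs that failed to parse.
    record GroupingResult(Map<SimpleSchema, List<ParsedRecord>> groups,
                          List<String> parseFailures) { }

    // Parses each raw input exactly once and groups results by schema equality.
    static GroupingResult groupBySchema(List<String> rawInputs,
                                        Function<String, Optional<ParsedRecord>> parser) {
        Map<SimpleSchema, List<ParsedRecord>> groups = new LinkedHashMap<>();
        List<String> failures = new ArrayList<>();
        for (String raw : rawInputs) {
            Optional<ParsedRecord> parsed = parser.apply(raw);
            if (parsed.isPresent()) {
                ParsedRecord record = parsed.get();
                groups.computeIfAbsent(record.schema(), s -> new ArrayList<>()).add(record);
            } else {
                failures.add(raw);
            }
        }
        return new GroupingResult(groups, failures);
    }

    // Toy parser (assumption for the example only): "k=v,k=v" pairs, where an
    // all-digit value infers type "int" and anything else infers "string".
    static Optional<ParsedRecord> parseToy(String raw) {
        Map<String, String> fields = new LinkedHashMap<>();
        Map<String, String> values = new LinkedHashMap<>();
        for (String pair : raw.split(",")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2 || kv[0].isEmpty()) {
                return Optional.empty();
            }
            fields.put(kv[0], kv[1].matches("-?\\d+") ? "int" : "string");
            values.put(kv[0], kv[1]);
        }
        return Optional.of(new ParsedRecord(new SimpleSchema(fields), values));
    }
}
```

   In a Processor, each entry in `groups` would presumably map to one output 
FlowFile written with that group's schema, so the Record Writer never has to 
handle a schema change mid-stream.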
   
   If that sounds viable, it is probably worth creating a new pull request, 
since it presents a more substantive direction than the initial focus here on 
improving error handling. However, it seems like such an approach gets closer 
to resolving the core issues.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]