pcgrenier commented on a change in pull request #3573: NIFI-6419: Fixed
AvroWriter single record with external schema result…
URL: https://github.com/apache/nifi/pull/3573#discussion_r301634665
##########
File path:
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/avro/WriteAvroResultWithExternalSchema.java
##########
@@ -78,14 +78,19 @@ protected void onBeginRecordSet() throws IOException {
@Override
public Map<String, String> writeRecord(final Record record) throws IOException {
// If we are not writing an active record set, then we need to ensure that we write the
- // schema information.
+ // schema information at the beginning and call flush at the end.
if (!isActiveRecordSet()) {
flush();
schemaAccessWriter.writeHeader(recordSchema, getOutputStream());
}
final GenericRecord rec = AvroTypeUtil.createAvroRecord(record, avroSchema);
datumWriter.write(rec, encoder);
+
+ if (!isActiveRecordSet()) {
+ flush();
Review comment:
I think we can agree to disagree. In general, though, most writer classes rely
on the buffered output stream to flush itself whenever its buffer fills, which
gives a configured/optimal writing strategy, and then flush the remaining data
on close. If that is the case here, then every flush outside of close is
misplaced, in my opinion. If the class using the writer needs the data flushed,
it should be required to call flush() explicitly. If you look at the [CSV
counterpart](https://github.com/apache/nifi/blob/rel/nifi-1.9.2/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/csv/WriteCSVResult.java),
it never flushes and depends on the printer's close method. The same could be
done here by calling close on the buffered output stream.
I want to stress that this is mostly an opinion. My main concern is data loss,
since it has caused a lot of headaches.
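
To make the buffering argument concrete, here is a minimal, self-contained sketch using only `java.io`. It is not the NiFi code: the class `BufferedFlushSketch` and the helper `visibleBytes` are hypothetical names made up for illustration. It shows that a small write sits in the buffer until close (or an explicit flush), while a write at least as large as the buffer goes to the underlying stream right away.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration class, not part of NiFi.
final class BufferedFlushSketch {

    /**
     * Writes payload through a BufferedOutputStream with the given buffer size
     * and returns { bytes visible in the sink right after write(),
     *               bytes visible in the sink after close() }.
     */
    static int[] visibleBytes(final byte[] payload, final int bufferSize) {
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        final int afterWrite;
        // try-with-resources closes the buffered stream, which flushes it.
        try (OutputStream out = new BufferedOutputStream(sink, bufferSize)) {
            out.write(payload);
            afterWrite = sink.size();
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] {afterWrite, sink.size()};
    }

    public static void main(final String[] args) {
        // Small record, large buffer: nothing reaches the sink until close.
        final int[] small = visibleBytes(new byte[6], 8192);
        System.out.println("small: afterWrite=" + small[0] + " afterClose=" + small[1]);

        // Record at least as large as the buffer: written through immediately.
        final int[] large = visibleBytes(new byte[10], 4);
        System.out.println("large: afterWrite=" + large[0] + " afterClose=" + large[1]);
    }
}
```

The first case is also where the data-loss concern comes from: if the caller never closes or flushes the stream, those buffered bytes never reach the destination.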