Hello. I recently asked the community a question about processing CSV files, and I received helpful advice about using processors such as ConvertRecord and QueryRecord together with Record Readers and RecordSetWriters. I've done that, and I thank all who replied.
My incoming CSV files arrive with different headers because they represent widely different data sets; the header structure is not known in advance. I therefore configure a QueryRecord processor with a CSVReader whose Schema Access Strategy is *Use String Fields From Header*, and a CSVRecordSetWriter whose Schema Access Strategy is *Infer Record Schema*.

Now I want to use that QueryRecord processor to characterize the various fields with SQL: record counts, min and max values, things of that nature. But in all the examples I find on YouTube and in open source, the authors presume knowledge of the fields in advance, e.g. a property named `year` whose value is `SELECT "year" FROM FLOWFILE`. We simply don't have that luxury, that awareness in advance; after all, that's the very reason we infer the schema in the reader and writer configuration. The fields are more often than not going to differ from file to file, so hard-wiring them into QueryRecord is not a flexible enough flow solution. We need to pick them up from the inferred schema that the Reader and Writer services identified.

What syntax or notation can we use in the QueryRecord SQL to say "for each field found in the header, execute this SQL against that field"? I guess what I'm looking for is iteration through all the inferred schema fields, with dynamic substitution of the field name into the SQL. Has anyone faced this same challenge? How did you solve it? Is there another way to approach this problem? Thank you in advance, Jim
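P.S. To make concrete what I'm after, here is a rough sketch *outside* NiFi, in plain Python with sqlite3 standing in for QueryRecord's `FLOWFILE` table. All the names and the sample data are hypothetical; the point is only the shape of the behavior I want: build the COUNT/MIN/MAX query from whatever header arrives, rather than hard-wiring field names into the SQL.

```python
import csv
import io
import sqlite3

# Hypothetical sample FlowFile content; in reality the header
# (and therefore the field names) is not known in advance.
csv_text = """year,price
2019,10.5
2020,12.0
2021,9.75
"""

# Read the header and rows without assuming any field names.
reader = csv.reader(io.StringIO(csv_text))
header = next(reader)
rows = list(reader)

# Stand-in for QueryRecord's FLOWFILE table: an in-memory SQL table
# whose columns come straight from the header. (Everything is TEXT
# here, so MIN/MAX are lexicographic; in a real flow the inferred
# schema would type the fields.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FLOWFILE (%s)" % ", ".join(f'"{c}" TEXT' for c in header))
conn.executemany(
    "INSERT INTO FLOWFILE VALUES (%s)" % ", ".join("?" for _ in header), rows
)

# The dynamic part I can't express in QueryRecord: build the
# characterization SQL from whatever fields the header contains,
# instead of hard-wiring names like "year".
select_parts = ["COUNT(*) AS record_count"]
for name in header:
    select_parts.append(f'MIN("{name}") AS "min_{name}"')
    select_parts.append(f'MAX("{name}") AS "max_{name}"')
sql = f"SELECT {', '.join(select_parts)} FROM FLOWFILE"

cursor = conn.execute(sql)
stats = dict(zip([d[0] for d in cursor.description], cursor.fetchone()))
print(sql)
print(stats)
```

If the header were `year,price`, the generated SQL would select `record_count`, `min_year`, `max_year`, `min_price`, and `max_price` without anyone ever typing those names. That per-header SQL generation is what I'd like QueryRecord (or some other approach) to do for me.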