We have a task requiring that we transform incoming CSV files to JSON. The
CSVs vary in schema.

There are a number of interesting flow examples out there illustrating how
one can set up a flow to handle the case where the CSV schema is well known
and fixed, but none for the generalized case.

In our use case, the structure of the incoming CSV files is not known in
advance. Our NiFi flow must be generalized, because we cannot configure and
rely on a schema registry service that defines one specific, fixed Avro
schema. An Avro schema registry seems to presume that the CSV structure is
known in advance. We don't have that luxury: the CSVs arrive from many
different providers, so their schemas are unknown until the files show up.

What is the best way to get around this challenge? Does anyone know of an
example where NiFi builds the schema on the fly as CSVs arrive for
processing, dynamically defining the Avro schema for the CSV?
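
To make concrete what I mean by building the schema on the fly, here is a
rough sketch in plain Python, outside of NiFi. It is only an illustration of
the behaviour we are after, not a NiFi configuration. It assumes the first
line of every CSV is a header row and that typing every column as a nullable
string is acceptable; the function names (sanitize_name, schema_from_header,
csv_to_json) and the record name "csv_record" are made up for this example.

```python
import csv
import io
import json
import re


def sanitize_name(column):
    """Turn a CSV column header into a legal Avro field name."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", column.strip())
    # Avro names must start with a letter or underscore.
    if not name or name[0].isdigit():
        name = "_" + name
    return name


def schema_from_header(header, record_name="csv_record"):
    """Build an Avro record schema (as a dict) from the header row.

    Assumption: every column is typed as a nullable string, because
    nothing else is known about the file in advance.
    """
    return {
        "type": "record",
        "name": record_name,
        "fields": [
            {"name": sanitize_name(col), "type": ["null", "string"], "default": None}
            for col in header
        ],
    }


def csv_to_json(text):
    """Derive the schema from the header, then map each row to a dict."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    schema = schema_from_header(header)
    names = [field["name"] for field in schema["fields"]]
    records = [dict(zip(names, row)) for row in reader]
    return schema, records


if __name__ == "__main__":
    sample = "id,first name\n1,Ann\n2,Bob\n"
    schema, records = csv_to_json(sample)
    print(json.dumps(schema, indent=2))
    print(json.dumps(records, indent=2))
```

The point is that the schema is derived per file from whatever header
arrives, so nothing about the columns has to be registered beforehand.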

Thanks in advance for any thoughts.
