RE: [EXT] NiFi 1.3: Simplest way possible of creating CSV files from SQL queries

Joe Witt Tue, 01 Aug 2017 06:03:39 -0700

There are some great points here.  Am on a phone right now so will be very
brief.  The current 1.4.0 snapshot on master has some nice improvements for
ease of use with record handling.  This includes writers inheriting schema
from the reader and others.


Thanks

On Aug 1, 2017 1:20 AM, "Peter Wicks (pwicks)" <pwi...@micron.com> wrote:

> I hate to respond with “me too”, but I haven’t seen a response and this
> kind of simplification is of interest to me.
>
>
>
> The PutDatabaseRecord processor already does something similar, and I have
> only needed the AvroReader processor without a schema registry.
>
>
>
>
>
> *From:* Márcio Faria [mailto:faria.mar...@ymail.com]
> *Sent:* Tuesday, July 25, 2017 11:09 AM
> *To:* Users <users@nifi.apache.org>
> *Subject:* [EXT] NiFi 1.3: Simplest way possible of creating CSV files
> from SQL queries
>
>
>
> Hi,
>
>
>
> I'm looking for the simplest way possible of creating CSV files from SQL
> queries using Apache NiFi 1.3.
>
>
>
> The flow I currently have (the files are to be SFTP'ed to a remote server):
>
>
>
> ExecuteSQL -> UpdateAttribute -> ConversionRecord [3 CSs] -> PutSFTP
>
>
>
> The concept of SchemaRegistry is new to me, but if I understood it
> correctly in order for the ConversionRecord to work properly is necessary
> to have 3 Controller Services ([3 CSs]) associated with it:
>
>    - AvroSchemaRegistry, with the schema defined in Avro Schema (JSON);
>    - AvroReader, referring to the above schema;
>    - CSVRecordSetWriter, also referring to the same schema.
>
>
>
> It seems there are many benefits in using the schema registry, including
> versioning, validation, etc, but in my example a simpler configuration
> would be welcome.
>
>
>
> Isn't the schema already defined by ExecuteSQL? Can I have the
> ConversionRecord alone with no *dedicated* SchemaRegistry (property),
> AvroReader,(instance), or CSVRecordSetWriter (instance)? Of course, we'd
> still need to specify the output is a CSV, so perhaps a shared
> CSVRecordSetWriter that also gets its schema from the flow file would still
> be useful.
>
>
>
> By the way, would the Schema Access Strategy named "Use Embedded
> Avro Schema" be part of a simpler solution? How?
>
>
>
> In the same vein, what about having the schema-name property optionally
> defined by the ExecuteSQL itself, so we don't have to depend on the
> UpdateAttribute component?
>
>
>
> In summary, I'm wondering if it's possible to have 3 (+ 1 generic)
> components instead of 6 per query:
>
>
>
> ExecuteSQL -> ConversionRecord [CSVRecordSetWriter] -> PutSFTP
>
>
>
> That would make a difference when defining multiple conversions from SQL
> to CSV, or other equivalent flows.
>
>
>
> In addition, consider that someone might want to have maximum flexibility,
> meaning that it would be totally acceptable to change the query and get a
> different layout for the resulting CSV file, without having to change any
> SchemaRegistry, Reader, or Writer.
>
>
>
> I've found a few tickets out there covering a similar topic. In
> particular, [1] mentions the difficulty with more complex Avro data types.
> But I don't see that being a blocker when the data source is an
> old-fashioned SQL query.
>
>
>
> Recommendations?
>
>
>
> P.S.1 Maybe templates would save the effort, but since Controller Services
> are "global", I'm still wondering if having too many parts would make it
> more difficult to manage lots of flows than it could be.
>
>
>
> P.S.2 Will my 1st flow have a good performance? I'm wondering if another
> advantage of using SchemaRegistry etc is that it prevents the creation of
> too many records at once.
>
>
>
> Thank you,
>
>
>
> Marcio
>
>
>
> [1] NIFI-1372 Create ConvertAvroToCSV
> <https://issues.apache.org/jira/browse/NIFI-1372>
>
>
> [NIFI-1372] Create ConvertAvroToCSV - ASF JIRA
>
>
>

RE: [EXT] NiFi 1.3: Simplest way possible of creating CSV files from SQL queries

Reply via email to