Eduardo,

I believe if you have a Parameter Context set on a process group then those
values will be available when you evaluate Expression Language anywhere in
the processor / controller code, as long as your properties are referring
to the parameters and not variables.

Regards,
Matt

On Sun, Mar 17, 2024 at 9:38 PM Eduardo Fontes <[email protected]>
wrote:

> Hi Matt,
>
> I encountered some issues while attempting to implement what I had in mind.
> The main obstacle is that I'm unsure how to pass a "Parameter Context"
> parameter name as a variable, as the Parameter Context is evaluated during
> the startup phase of the processor/controller. I need this to pass
> sensitive values, such as passwords, without exposing them in FlowFile
> attributes as plain text. My initial idea was to obtain a parameter
> context from within the controller code, which is invoked from within the
> onTrigger function of a processor.
>
> Perhaps it would be better if the Parameter Context acted as Azure Key
> Vault or AWS Secrets Manager!
>
> Any thoughts?
>
> PS.: My plan B is to implement it using a Scripted Processor that: 1) gets
> the DB connection values from a secret vault in the cloud; 2) makes a DB
> connection; 3) reads the data and sends it to a processor relationship;
> 4) closes the connection.
>
> On Fri, Mar 15, 2024 at 5:23 PM Matt Burgess <[email protected]> wrote:
>
> > True, but my concern is that you might see performance issues with a new
> > connection each time, especially if the same value(s) come in many times
> > in a row (i.e. choosing the same connection config). Having a small cache
> > might afford you some speedups.
> >
> > Regards,
> > Matt
> >
> > On Sun, Mar 10, 2024 at 9:17 AM Eduardo Fontes <[email protected]>
> > wrote:
> >
> > > Hi Matt,
> > >
> > > I don't think I need a pool or a cache, since each DB connection will be
> > > used once per object (table/view). So I don't think it will be a problem
> > > to create a DB connection, read the object, and destroy the connection
> > > for each object.
> > >
> > > I'll try to implement this using the DBCPService controller service
> > > interface.
> > >
> > > Thanks for your consideration.
> > >
> > > Eduardo Fontes
> > >
> > > On Tue, Mar 5, 2024 at 11:10 PM Matt Burgess <[email protected]>
> > > wrote:
> > >
> > > > Eduardo,
> > > >
> > > > It doesn't sound like DBCPConnectionPoolLookup will work for you
> > > > because of all the different connection strings. I don't know if
> > > > there's a good reason why we couldn't create the BasicDataSource when
> > > > getConnection() is called, passing in a Map of FlowFile attributes
> > > > (that's how the Lookup version works). One issue I do see is with
> > > > "churn" if we're recreating the data source each time. At that point
> > > > it's not pooling connections. I suppose you could have an internal
> > > > cache of data sources but it would have to be bounded and/or
> > > > configurable and have a least-recently-used (LRU) eviction strategy.
> > > >
> > > > DBCPService is the name of the controller service interface that the
> > > > database processors use, but that's a misnomer since the API doesn't
> > > > mention pooling specifically. Instead you could have an implementation
> > > > that uses a cache vs a pooling approach. But Apache DBCP does handle a
> > > > lot of the management (validation, eviction, idle timeouts, etc.), so
> > > > unless there's no way to avoid the potential memory/performance issues
> > > > (like having 50+ controller services in a PG) you could try to wrangle
> > > > smaller pools per data source and cache those if that's ok for your
> > > > use case.
> > > >
> > > > My two cents,
> > > > Matt
> > > >
> > > > On Tue, Mar 5, 2024 at 7:25 PM Eduardo Fontes <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi Everybody!
> > > > >
> > > > > I'm thinking about making a generic ingestor with Apache NiFi, but I
> > > > > ran into some difficulties with the database connection pool (DBCP)
> > > > > controller service. It doesn't accept FlowFile attributes for its
> > > > > properties, especially the connection string, username, and password
> > > > > (for security reasons, the name of a sensitive parameter rather than
> > > > > the password itself).
> > > > >
> > > > > This is important because, as a generic ingestor, I might have
> > > > > hundreds of different connection strings, and I had a lot of problems
> > > > > when I tried to put 50 DBCP controllers in a Process Group.
> > > > >
> > > > > I'd rather not create a flow for each ingestion, just one flow for
> > > > > each database vendor.
> > > > >
> > > > > Does anyone have any suggestions on how I can achieve this? Would it
> > > > > be easy to create a parameterized DBCP controller (one that I could
> > > > > write myself)?
> > > > >
> > > > > Best regards.
> > > > >
> > > > > Eduardo Fontes
> > > > > Data Eng / System Analyst Sr.
> > > > >
> > > >
> > >
> >
>
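
For readers following the thread: the bounded, least-recently-used cache of
data sources that Matt suggests could look roughly like the sketch below. It
is not NiFi or Apache DBCP code. The class name LruResourceCache, the factory
function, and the idea of keying on a connection-config string are all
assumptions made for illustration; the value type V stands in for whatever
closeable data source or small per-config pool an implementation would hold.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical bounded LRU cache of closeable resources (for example, one
 * small connection pool per connection config). Illustration only.
 */
class LruResourceCache<K, V extends AutoCloseable> {
    private final Map<K, V> entries;
    private final Function<K, V> factory;

    LruResourceCache(int maxEntries, Function<K, V> factory) {
        this.factory = factory;
        // accessOrder = true keeps entries ordered from least to most
        // recently used, so the "eldest" entry is the LRU one.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > maxEntries) {
                    // Close the evicted resource so it does not leak.
                    closeQuietly(eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    /** Returns the cached resource for key, creating it on a miss. */
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = factory.apply(key);
            entries.put(key, value); // may evict and close the LRU entry
        }
        return value;
    }

    synchronized int size() {
        return entries.size();
    }

    private static void closeQuietly(AutoCloseable resource) {
        try {
            resource.close();
        } catch (Exception e) {
            // Ignored in this sketch; real code would log the failure.
        }
    }
}
```

With a limit of 2, requesting configs "a", "b", "a", "c" in that order evicts
and closes "b", since "a" was touched more recently. One caveat this sketch
does not handle: an evicted resource is closed immediately, even if a caller
is still using a connection obtained from it, so a real implementation would
need reference counting or a grace period before closing.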
