Eduardo, I believe if you have a Parameter Context set on a process group then those values will be available when you evaluate Expression Language anywhere in the processor / controller code, as long as your properties are referring to the parameters and not variables.
Regards,
Matt

On Sun, Mar 17, 2024 at 9:38 PM Eduardo Fontes <[email protected]> wrote:

> Hi Matt,
>
> I encountered some issues while attempting to implement what I had in
> mind. The main obstacle is that I'm unsure how to pass a "Parameter
> Context" parameter name as a variable, as the evaluation of P.C. occurs
> during the startup phase of the processor/controller. I require this
> functionality to transmit sensitive values, such as passwords, without
> exposing them in flowfile properties as plain text. My initial idea was
> to obtain a parameter context from within a controller code, which is
> invoked from within the onTrigger function of a processor.
>
> Perhaps it would be better if the Parameter Context acted as Azure Key
> Vault or AWS Secrets Manager!
>
> Any thoughts?
>
> PS.: My plan B is to implement it using a Scripted Processor that: 1) get
> values for db connection from a secret vault in the cloud; 2) make a DB
> connection; 3) Read the data and send to processor relationship; 4) close
> connection.
>
> On Fri, Mar 15, 2024 at 5:23 PM Matt Burgess <[email protected]> wrote:
>
> > True, but my concern is that you might see performance issues with a
> > new connection each time, especially if the same value(s) come in many
> > times in a row (i.e. choosing the same connection config). Having a
> > small cache might afford you some speedups.
> >
> > Regards,
> > Matt
> >
> > On Sun, Mar 10, 2024 at 9:17 AM Eduardo Fontes <[email protected]> wrote:
> >
> > > Hi Matt,
> > >
> > > I don't think I need a pool or a cache, since DB connection will be
> > > used once for an object (table/view). So I think that won't be a
> > > problem create a DB connection, read object and destroy connection,
> > > for each object.
> > >
> > > I'll try to implement this using DBCPService Controller Interface.
> > >
> > > Thanks for your consideration.
> > >
> > > Eduardo Fontes
> > >
> > > On Tue, Mar 5, 2024 at 11:10 PM Matt Burgess <[email protected]> wrote:
> > >
> > > > Eduardo,
> > > >
> > > > It doesn't sound like DBCPConnectionPoolLookup will work for you
> > > > because of all the different connection strings. I don't know if
> > > > there's a good reason why we couldn't create the BasicDataSource
> > > > when getConnection() is called, passing in a Map of FlowFile
> > > > attributes (that's how the Lookup version works). One issue I do
> > > > see is with "churn" if we're recreating the data source each time.
> > > > At that point it's not pooling connections. I suppose you could
> > > > have an internal cache of data sources but it would have to be
> > > > bounded and/or configurable and have a least-recently-used (LRU)
> > > > eviction strategy.
> > > >
> > > > DBCPService is the name of the controller service interface that
> > > > the database processors use, but that's a misnomer since the API
> > > > doesn't mention pooling specifically. Instead you could have an
> > > > implementation that uses a cache vs a pooling approach. But Apache
> > > > DBCP does handle a lot of the management (validation, eviction,
> > > > idle timeouts, etc.) so unless there's no way to avoid the
> > > > potential memory/performance issues (like having 50+ controller
> > > > services in a PG) you could try to wrangle smaller pools per data
> > > > source and cache those if that's ok for your use case.
> > > >
> > > > My two cents,
> > > > Matt
> > > >
> > > > On Tue, Mar 5, 2024 at 7:25 PM Eduardo Fontes <[email protected]> wrote:
> > > >
> > > > > Hi Everybody!
> > > > >
> > > > > I'm thinking about make a generic ingestor with Apache NiFi but
> > > > > I found some difficulties because of the DataBase Connection
> > > > > Pool controller. It doesn't accept flowfiles parameters for its
> > > > > properties, specially connection string, username and password
> > > > > (for security reasons, some sensitive parameter name instead
> > > > > password itself).
> > > > >
> > > > > This is important because, as a generic ingestor, I might have
> > > > > hundreds of different connection strings, and I had a lot of
> > > > > problems when I tried to put 50 DBCP controllers in a Process
> > > > > Group.
> > > > >
> > > > > I wouldn't like to create a flow for each ingestion, but one
> > > > > flow for each database vendor.
> > > > >
> > > > > Does anyone have any suggestions on how I can achieve this?
> > > > > Would it be easy to create a parameterized DBCP controller?
> > > > > (That I could do it myself)
> > > > >
> > > > > Best regards.
> > > > >
> > > > > Eduardo Fontes
> > > > > Data Eng / System Analyst Sr.
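
The bounded, least-recently-used cache of data sources that Matt describes in the thread could be sketched roughly as below. This is an illustration only, not code from NiFi or Apache DBCP: the class name `LruResourceCache`, its methods, and the idea of keying by connection string are all assumptions made for the example. It uses only `java.util`, and evicted entries are closed so that their connections can be released.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical bounded LRU cache of closeable resources (for example, one
 * small connection pool per connection string). Not part of NiFi or DBCP.
 */
final class LruResourceCache<K, V extends AutoCloseable> {
    private final int maxEntries;
    private final Function<K, V> factory;
    private final LinkedHashMap<K, V> entries;

    LruResourceCache(int maxEntries, Function<K, V> factory) {
        if (maxEntries < 1) {
            throw new IllegalArgumentException("maxEntries must be >= 1");
        }
        this.maxEntries = maxEntries;
        this.factory = factory;
        // accessOrder = true: iteration runs from least to most recently used.
        this.entries = new LinkedHashMap<>(16, 0.75f, true);
    }

    /** Returns the cached resource for the key, creating it on a miss. */
    synchronized V get(K key) {
        V value = entries.get(key); // a hit marks the entry as recently used
        if (value == null) {
            value = factory.apply(key);
            entries.put(key, value);
            evictIfNeeded();
        }
        return value;
    }

    /** Reports presence without changing the recency order. */
    synchronized boolean contains(K key) {
        return entries.containsKey(key);
    }

    synchronized int size() {
        return entries.size();
    }

    /** Removes and closes least-recently-used entries beyond the bound. */
    private void evictIfNeeded() {
        while (entries.size() > maxEntries) {
            Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
            V eldest = it.next().getValue();
            it.remove();
            try {
                eldest.close();
            } catch (Exception e) {
                // A real controller service would log this instead.
            }
        }
    }
}
```

In a custom controller service, the key would presumably be derived from the FlowFile attributes passed to `getConnection()`, and the factory would build a small pool for that configuration. One design question this sketch leaves open is what happens when an evicted resource is still in use by a caller; a real implementation would need to decide whether to defer the close or rely on the pool's own shutdown behavior.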
