Jeff,

That’s a great explanation, and it is a scenario we’ve often used as a thought 
exercise when planning other features of MiNiFi. I think what Andre suggested 
below would be the easiest and most reliable way to accomplish what you are 
looking for. UpdateAttribute will let you get as specific as you want by pulling 
the hostname or a value from the variable registry (or you could even run a 
stream command on the host or read a file on the system to get some unique 
identifier), and then all of your downstream processors have access to that 
attribute. You can also filter provenance data within NiFi using that 
discriminator.
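
To make the order of preference concrete, here is a rough Python sketch of the 
logic only. It is not NiFi or MiNiFi code: the attribute name "device.id", the 
function names, and the optional ID file are all assumptions made for 
illustration. In a real flow, UpdateAttribute and Expression Language would do 
this work.

```python
import socket
from pathlib import Path


def resolve_device_id(variables, id_file=None):
    """Return a device identifier, trying the most specific source first.

    `variables` is a plain dict standing in for a variable registry, and
    "device.id" is a hypothetical key chosen for this sketch.
    """
    # 1. A value assigned at provisioning time (variable-registry stand-in).
    value = (variables.get("device.id") or "").strip()
    if value:
        return value

    # 2. A file on the device that holds the ID (hypothetical, caller-supplied path).
    if id_file is not None and Path(id_file).is_file():
        value = Path(id_file).read_text().strip()
        if value:
            return value

    # 3. Fall back to the fully qualified hostname, similar in spirit to
    #    the ${hostname(true)} Expression Language function.
    return socket.getfqdn()


def tag(attributes, device_id):
    """Return a copy of a flowfile-style attribute map with the ID attached."""
    tagged = dict(attributes)
    tagged["device.id"] = device_id
    return tagged


if __name__ == "__main__":
    device_id = resolve_device_id({"device.id": "sensor-42"})
    print(tag({"filename": "rows.avro"}, device_id))
```

Whichever source supplies the value, everything downstream only ever sees one 
attribute, which is what lets the same configuration be used on every device.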

Good luck.


Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Apr 25, 2017, at 9:57 AM, Jeff Zemerick <jzemer...@apache.org> wrote:
> 
> Aldrin,
> 
> To simplify it, the situation is analogous to a deployment of temperature 
> sensors. Each sensor has a unique ID that is assigned by us at deployment 
> time and each sensor periodically adds a new row to a database table that is 
> stored on the sensor. Each sensor uses the same database schema so if you 
> combined all the rows you couldn't tell which rows originated from which 
> sensor. In NiFi, I need to do different things based on where the data 
> originated and I need to associate the sensor's ID with its data. (Such as 
> inserting the data into DynamoDB with the sensor ID as the Hash key and a 
> timestamp as the Range key.) The goal is to use the same MiNiFi configuration 
> for all devices.
> 
> I can easily use the ExecuteSQL processor to grab the new rows. But I need 
> some way to attach an attribute to the data that identifies where it 
> originated. That was what led to the initial question in this thread. The 
> Variable Registry along with the UpdateAttribute processor appears to satisfy 
> that need more cleanly than a custom processor.
> 
> I hope that explains the situation a bit!
> 
> Thanks,
> Jeff
> 
> 
> 
> On Tue, Apr 25, 2017 at 11:17 AM, Aldrin Piri <aldrinp...@gmail.com> wrote:
> Jeff,
> 
> Could you expand upon what a device id is in your case?  Something intrinsic 
> to the device? The agent?  Are these generated and assigned during 
> provisioning?   How are you making use of these when the data arrives at its 
> desired destination?
> 
> What you are expressing is certainly a common need.  I would welcome any 
> perspective on what your deployment looks like, so that we can understand 
> the use cases people are rolling out and let them guide the assumptions we 
> make during our development and design processes.
> 
> Thanks for diving in and exploring!
> --Aldrin
> 
> 
> On Tue, Apr 25, 2017 at 11:05 AM, Andre <andre-li...@fucs.org> wrote:
> Jeff,
> 
> That would be my next suggestion. :-)
> 
> Cheers
> 
> On Wed, Apr 26, 2017 at 1:04 AM, Jeff Zemerick <jzemer...@apache.org> wrote:
> It is possible. I will take a look to see if the hostname is sufficient for 
> the device ID.
> 
> I just learned about the Variable Registry. It seems if I use the Variable 
> Registry to store the device ID it would be available to the UpdateAttribute 
> processor. Is that correct?
> 
> Thanks,
> Jeff
> 
> 
> On Tue, Apr 25, 2017 at 10:48 AM, Andre <andre-li...@fucs.org> wrote:
> Jeff,
> 
> Would it be feasible for you to use UpdateAttribute (which I believe is part 
> of the MiNiFi core processors) with the ${hostname(true)} Expression Language 
> function?
> 
> More about it can be found here:
> 
> https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#hostname
> 
> Cheers
> 
> On Wed, Apr 26, 2017 at 12:39 AM, Jeff Zemerick <jzemer...@apache.org> wrote:
> When processing data in NiFi that was received via MiNiFi edge devices I need 
> to be able to identify the source of the data. All of the data on the edge 
> devices will be pulled from a database and will not contain any data that 
> self-identifies the source. My attempt to solve this was to write a processor 
> that reads a configuration file on the edge device to get its device ID and 
> put that ID as an attribute in the flowfile. This appears to work, but I was 
> wondering if there is a recommended approach?
> 
> Thanks,
> Jeff
> 
> 
> 
> 
> 

