Re: ReplaceText regex question

2018-08-17 Thread Ed B
JSON requires double quotes for string values. Since you are using PutDatabaseRecord, check your schema for the JSON data. Most probably there is a mismatch between the data/JSON structure and the schema. Regards, Ed. On Thu, Aug 16, 2018 at 9:43 AM Kuhfahl, Bob wrote: > I have to massage some
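Ed's point about double quotes can be demonstrated with a quick check in Python (the field names here are illustrative, not from the original data):

```python
import json

# Valid JSON: string keys and values use double quotes.
good = '{"name": "Bob", "city": "Austin"}'
record = json.loads(good)
print(record["name"])  # -> Bob

# Single-quoted strings are not valid JSON and are rejected by the parser.
bad = "{'name': 'Bob'}"
try:
    json.loads(bad)
except json.JSONDecodeError as e:
    print("rejected:", e.msg)
```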

Re: Creating an attribute

2018-08-17 Thread James McMahon
This sounds like just what I need. Thank you very much Matt. I'll dig in and give this a try. Thanks again to each of you guys who responded. Cheers, Jim On Fri, Aug 17, 2018 at 4:24 PM, Matt Burgess wrote: > Jim, > > You can use UpdateRecord for this, your input schema would have "last" > and

Re: Creating an attribute

2018-08-17 Thread Matt Burgess
Jim, You can use UpdateRecord for this, your input schema would have "last" and "first" in it (and I think you can have an optional "myKey" field so you can use the same schema for the writer), and the output schema would have all three fields in it. Then you'd set the Replacement Value Strategy
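Matt's suggestion of a single schema usable by both reader and writer might look like the following Avro schema sketch ("last", "first", and "myKey" come from the thread; making "myKey" a nullable union with a default is what makes it optional):

```json
{
  "type": "record",
  "name": "Person",
  "fields": [
    { "name": "last",  "type": "string" },
    { "name": "first", "type": "string" },
    { "name": "myKey", "type": ["null", "string"], "default": null }
  ]
}
```

Because "myKey" defaults to null, records without it still validate on read, while UpdateRecord can populate it on write.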

Re: Creating an attribute

2018-08-17 Thread James McMahon
I do appreciate your point, Tim and Lee. What if I do this instead: append select attributes to my data payload. Would that minimize the impact on RAM? Can I do that? More specifically, my data payload is a string representation of a JSON object, like so: {"last":"manson","first":"marilyn"} and I

Re: Creating an attribute

2018-08-17 Thread Lee Laim
Jim, I think the ExtractText processor with a large enough Maximum Capture Group Length (default: 1024) will do that. Though, I share Tim’s concerns when you scale up. Thanks, Lee > On Aug 17, 2018, at 11:52 AM, Timothy Tschampel > wrote: > > > This may not be applicable to your use case
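ExtractText applies a Java regex to the flowfile content and stores capture group 1 in an attribute; a pattern like `(?s)(.*)` captures the whole payload, newlines included. Python's `re` module behaves the same way for this pattern, so the idea can be sketched like this (the sample content is from the thread; the attribute name is illustrative):

```python
import re

# (?s) turns on DOTALL so "." also matches newlines, letting group 1
# capture the entire payload rather than just the first line.
content = '{"last":"manson","first":"marilyn"}\nsecond line'
match = re.match(r"(?s)(.*)", content)
payload_attribute = match.group(1)
print(payload_attribute == content)  # -> True: full payload captured
```

Note that ExtractText truncates the captured value at its Maximum Capture Group Length, so that property must be raised above the largest expected payload.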

Re: Creating an attribute

2018-08-17 Thread Timothy Tschampel
This may not be applicable to your use case depending on message volume / # of attributes; but I would avoid putting payloads into attributes for scalability reasons (especially RAM usage). > On Aug 17, 2018, at 10:47 AM, James McMahon wrote: > > I have flowfiles with data payloads that

Creating an attribute

2018-08-17 Thread James McMahon
I have flowfiles with data payloads that represent small strings of text (messages consumed from AMQP queues). I want to create an attribute that holds the entire payload for downstream use. How can I capture the entire data payload of a flowfile in a new attribute on the flowfile? Thank you in

Re: Large JSON File Best Practice Question

2018-08-17 Thread Joe Witt
Ben, I'm not sure that you could reliably convert the format of data and retain schema information unless both formats allow for explicit schema retention in them (as Avro does for instance). JSON doesn't really offer that. So when you say you want to convert even for unknown fields but there

Re: NiFi hangs at Computing Data Lineage... 100% for a specific flow

2018-08-17 Thread Daniel Watson
Mark, I just read somewhere that the EnforceOrder processor is not meant to be used when numbers will be "skipped". Since I'm using a timestamp it won't work for me. So I'll need to find another way to do that ordering. I looked at the PriorityAttributePrioritizer, but that doesn't seem to be able

Re: Detect a pattern in incoming json content

2018-08-17 Thread James McMahon
EvaluateJSONPath - I'll give that a try. I can't rule out such edge cases. Thank you very much Andy. On Thu, Aug 16, 2018 at 12:21 PM, Andy LoPresto wrote: > Jim, > > If the JSON can span multiple lines, you may also want to use > EvaluateJSONPath to extract the value of the *request* key and
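EvaluateJSONPath would use a JsonPath expression such as `$.request` against the flowfile content; in plain Python the equivalent is simply parsing the document and reading the key. A minimal sketch (the "request" key is from the thread; the rest of the document is made up for illustration):

```python
import json

# Multi-line JSON is no problem for a real parser, which is exactly why
# EvaluateJSONPath was suggested over line-oriented matching.
content = """{
    "request": "GET /index.html",
    "status": 200
}"""
doc = json.loads(content)
request = doc.get("request")   # JsonPath equivalent: $.request
print(request)
```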

Re: NiFi hangs at Computing Data Lineage... 100% for a specific flow

2018-08-17 Thread Daniel Watson
Mark, Thanks! That worked in allowing the DP page to load. Now it exposes the underlying problem, which is that the EnforceOrder processor seems to be adding hundreds of "attributes modified" events, even when no attributes are being modified. Does this have something to do with the flow files in

Re: NiFi hangs at Computing Data Lineage... 100% for a specific flow

2018-08-17 Thread Mark Payne
OK, my recommendation would be to change to the WriteAheadProvenanceRepository. You can do this by changing the value of the "nifi.provenance.repository.implementation" property in nifi.properties to "org.apache.nifi.provenance.WriteAheadProvenanceRepository" The WriteAhead implementation was
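The change Mark describes is a single property in nifi.properties:

```properties
# nifi.properties — switch the provenance repository implementation
nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository
```

A restart of NiFi is required for the new implementation to take effect.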

Re: NiFi hangs at Computing Data Lineage... 100% for a specific flow

2018-08-17 Thread Daniel Watson
Yes. On Fri, Aug 17, 2018 at 10:19 AM Mark Payne wrote: > OK, thanks. Are you using the default implementation of the Provenance > Repository? I.e., the PersistentProvenanceRepository? > > > On Aug 17, 2018, at 10:10 AM, Daniel Watson wrote: > > Mark, > > 1.7.0 > 06/19/2018 21:55:30 EDT >

Re: NiFi hangs at Computing Data Lineage... 100% for a specific flow

2018-08-17 Thread Mark Payne
OK, thanks. Are you using the default implementation of the Provenance Repository? I.e., the PersistentProvenanceRepository? On Aug 17, 2018, at 10:10 AM, Daniel Watson wrote: Mark, 1.7.0 06/19/2018 21:55:30 EDT Tagged nifi-1.7.0-RC1 Thanks On Fri, Aug 17,

Re: Design pattern advice needed

2018-08-17 Thread Mark Payne
Hey Bob, The InferAvroSchema processor actually works against JSON and CSV data. It is designed to infer an Avro Schema so that you can convert either of those into Avro if you want. So you can send it JSON data and it will infer the schema for you and put it in the "inferred.avro.schema"
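The idea behind InferAvroSchema can be roughed out in a few lines: walk a JSON record and map each value to an Avro primitive type. This is only a sketch of the concept, not NiFi's actual inference logic (which handles nesting, nullability, and CSV as well):

```python
import json

def infer_avro_schema(json_text, name="inferred"):
    """Map each top-level JSON value to an Avro primitive type (sketch only).

    Using type(v) rather than isinstance avoids the Python pitfall where
    bool is a subclass of int, so True would otherwise infer as "long".
    """
    type_map = {str: "string", bool: "boolean", int: "long", float: "double"}
    record = json.loads(json_text)
    fields = [{"name": k, "type": type_map.get(type(v), "string")}
              for k, v in record.items()]
    return {"type": "record", "name": name, "fields": fields}

# Sample record borrowed from the other thread; "age" is made up.
schema = infer_avro_schema('{"last": "manson", "first": "marilyn", "age": 49}')
print(json.dumps(schema, indent=2))
```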

Re: NiFi hangs at Computing Data Lineage... 100% for a specific flow

2018-08-17 Thread Daniel Watson
Mark, 1.7.0 06/19/2018 21:55:30 EDT Tagged nifi-1.7.0-RC1 Thanks On Fri, Aug 17, 2018 at 10:09 AM Mark Payne wrote: > Hi Daniel, > > What version of NiFi are you running? > > Thanks > -Mark > > > > On Aug 17, 2018, at 10:07 AM, Daniel Watson > wrote: > > > > Anyone have any issues with

Re: NiFi hangs at Computing Data Lineage... 100% for a specific flow

2018-08-17 Thread Mark Payne
Hi Daniel, What version of NiFi are you running? Thanks -Mark > On Aug 17, 2018, at 10:07 AM, Daniel Watson wrote: > > Anyone have any issues with the data lineage screen? My NiFi instance can't > compute the data lineage for a specific flow. It worked originally, then > after running

RE: [EXT] Design pattern advice needed

2018-08-17 Thread Paul Gibeault (pagibeault)
Bob, Even if you were able to manually create the NiFi flow for all 200 tables successfully, you may want to make some change to the flows down the road. You would have to perform 200 manual changes again. This may be mitigated slightly by breaking the flows into 2 pieces: Acquisition and

NiFi hangs at Computing Data Lineage... 100% for a specific flow

2018-08-17 Thread Daniel Watson
Anyone have any issues with the data lineage screen? My NiFi instance can't compute the data lineage for a specific flow. It worked originally, then after running some files through, it no longer will. It gets stuck at 100% and the UI shows nothing. The flow is... list files -> update att ->

Design pattern advice needed

2018-08-17 Thread Kuhfahl, Bob
Problem: * Source database with over 200 tables. * Current NiFi ‘system’ we are developing can extract data from those 200 tables into NiFi flows of JSON-formatted data, essentially separate flows for each table with an attribute that indicates the table name and other useful attributes

Re: List/Fetch pattern for QueryDatabaseTable

2018-08-17 Thread Matt Burgess
Merging Ed's suggestion with mine: GenerateFlowFile -> SplitText -> RPG -> Input Port -> ExtractText -> GenerateTableFetch (GTF) -> ExecuteSQL. If you want to fully distribute the fetch, you could have another RPG -> Input Port between GTF and ExecuteSQL. The ExtractText is there to pull the line into an attribute
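What the front of that flow produces can be sketched outside NiFi: the Custom Text is split into one flowfile per line, and each line lands in an attribute that downstream processors can reference. The table names and attribute names below are illustrative, not from the thread:

```python
# Simulate GenerateFlowFile -> SplitText -> ExtractText:
# one table name per "flowfile", stored as an attribute, which a
# downstream GenerateTableFetch / ExecuteSQL could then use.
custom_text = "customers\norders\ninvoices"      # GFF Custom Text property

flowfiles = []
for line in custom_text.splitlines():            # SplitText: one line each
    attrs = {"table.name": line.strip()}         # ExtractText: line -> attribute
    # e.g. a query a downstream processor might build from the attribute
    attrs["sql"] = f"SELECT * FROM {attrs['table.name']}"
    flowfiles.append(attrs)

print(flowfiles[1]["sql"])  # -> SELECT * FROM orders
```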

Re: List/Fetch pattern for QueryDatabaseTable

2018-08-17 Thread Ed B
There is a good reason to use ListDatabaseTables in cases when tables aren't known, or can be dynamically added to the source system. If you already know the list of tables you will be pulling data from, you can use GenerateFlowFile. In GFF, in the Custom Text property, specify a list of tables (one

RE: List/Fetch pattern for QueryDatabaseTable

2018-08-17 Thread Vos, Walter
Hi Matt, Thanks for that very thorough explanation. I'll be sure to present this to my devs in order to come to a way of working. When it comes to ListDatabaseTables, how can I use this when I don't want to query all tables from the database? Let's say there are some 30 tables in the database