Re: How to read a schema from a ScriptedReader

2020-04-08 Thread Matt Burgess
tions :) Regards, Matt [1] https://issues.apache.org/jira/browse/NIFI-7343 [2] https://issues.apache.org/jira/browse/NIFI-5115 On Wed, Apr 8, 2020 at 8:19 PM Matt Burgess wrote: > > Jairo, > We should probably move this to the dev list since we're getting into > the NiFi API,

Re: Terminate process turns ExecuteScript process invalid on nifi-1.11.3

2020-04-28 Thread Matt Burgess
Carlos, This is indeed a bug, although I'm not sure which change introduced the issue. I have written up NIFI-7404 [1] to describe the problem and cover the fix. The basic issue is that the wrong thread calls the method that adds the script engines, so it can't find the ones that are packaged in t

Re: ExecuteSQL Unable to resolve union for value

2020-05-06 Thread Matt Burgess
Trevor, What does your table look like and what DB are you using? On Wed, May 6, 2020 at 1:26 PM Trevor Dunn wrote: > > Hi I am using ExecuteSQL processor downstream of GenerateTableFetch to pull > data from a table. However when I run the flow I get error below. I > tried a couple of dif

Re: Possible memory leak in PutHive3Streaming?

2020-05-11 Thread Matt Burgess
Martin, There were two memory leaks a while back, one in NiFi code and one in Hive client code (brought in as a dependency for the Hive 3 components). NiFi has fixed their side in 1.9.0 (via NIFI-5841 [1]) and Hive has fixed their side in 3.1.1 (via HIVE-20979 [2]). Until NiFi 1.11.4, we were stil

Re: groovy script in nifi 1.11 : unable to loasd FastStringService

2020-05-11 Thread Matt Burgess
Chris, There's definitely something funky going on there, the script doesn't get the same classloader chain that the ScriptedRecordSetWriter (that loads the script) does, instead it gets one with the standard NAR as the parent instead of the scripting NAR. I'm looking into it now. BTW for scripte

Re: groovy script in nifi 1.11 : unable to loasd FastStringService

2020-05-11 Thread Matt Burgess
iters but since they are pretty much part of an "ecosystem" that might be a viable option. Definitely interested in any thoughts about how to proceed (I'm looking at you Payne lol). Regards, Matt On Mon, May 11, 2020 at 6:34 PM Matt Burgess wrote: > > Chris, > > There

Re: Combine Attributes & Content

2020-05-19 Thread Matt Burgess
Dweep, Depending on how complex the content JSON is, you might be able to use ReplaceText to smuggle the attributes into the text, but this can be tricky as you need to match on the opening JSON and the rest, and then replace it with the opening JSON, the attributes, then the rest in order to pres

Re: Route Attribute - Database down

2020-06-11 Thread Matt Burgess
Although the error attribute can help as a workaround, counting on a text value is probably not the best option (although it's pretty much all we have for now). I wrote up NIFI-7524 [1] to add a "retry" relationship to ExecuteSQL like we have for PutSQL and PutDatabaseRecord. It would route things

Re: Enrichment of record data with a REST API

2020-06-29 Thread Matt Burgess
Mike, I think you can use LookupRecord with a RestLookupService to do this. If it's missing features or it otherwise doesn't work for your use case, please let us know and/or write up whatever Jiras you feel are appropriate. Regards, Matt On Mon, Jun 29, 2020 at 4:56 PM Mike Thomsen wrote: > >

Re: Hive_1_1 Processors and Controllers Missing in NiFi 1.11.4

2020-07-07 Thread Matt Burgess
Harsha, There are two NARs associated with Hive components, nifi-hive-services-api-nar which has the Hive1_1ConnectionPool service (actually an interface, but that's under the hood), and the nifi-hive1_1-nar which has the processors that declare themselves as users of that interface (and the actua

Re: Hive_1_1 Processors and Controllers Missing in NiFi 1.11.4

2020-07-07 Thread Matt Burgess
manually load them again. > > > > Thank you, > Harsha > > Sent from Outlook <http://aka.ms/weboutlook> > -- > *From:* Matt Burgess > *Sent:* Tuesday, July 7, 2020 7:05 PM > *To:* users@nifi.apache.org > *Subject:* Re: Hive_1_1 Proces

Re: Processor Extensibility

2020-07-07 Thread Matt Burgess
This is probably better suited for the dev list (not sure if you're subscribed but please do, BCC'ing users and moving to dev), but the implementations (components and their NARs) are not designed to be subclassed for custom extensions outside the codebase, can you describe your use case (and custo

Re: Error with FetchFTP when filename has non-ASCII charachters

2020-07-14 Thread Matt Burgess
Luca, I'm guessing the issue is the same as the one in [1] but it just wasn't fixed for FetchFTP. Please feel free to write an improvement Jira [2] to add this to FetchFTP as well. Regards, Matt [1] https://issues.apache.org/jira/browse/NIFI-4137 [2] https://issues.apache.org/jira/browse/NIFI

Re: Retry logic for rest api - NIFI

2020-07-30 Thread Matt Burgess
Asmath, InvokeHttp routes the original flowfile to a number of different relationships based on things like the status code. For example if you're looking for a 2xx code but want to retry on that for some reason, you'd use the "Original" relationship. If you want a retryable code (5xx) you can use

Re: Get all available variables in the InvokeScriptedProcessor

2020-08-11 Thread Matt Burgess
Although this is an "unnatural" use of Groovy (and a conversation much better suited for the dev list :), it is possible to get at a map of defined variables (key and value). This counts on particular implementations of the API and that there is no SecurityManager installed in the JVM so Groovy ign

Re: Detect duplicate records

2020-08-15 Thread Matt Burgess
In addition to the SO answer, if you know all the fields in the record, you can use QueryRecord with SELECT DISTINCT field1,field2... FROM FLOWFILE. The SO answer might be more performant but is more complex, and QueryRecord will do the operations in-memory so it might not handle very large flowfil
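
To make the SELECT DISTINCT suggestion concrete, here is a minimal Python sketch that emulates the query with the standard-library sqlite3 module. It only illustrates the SQL semantics: QueryRecord runs its SQL inside NiFi against the flowfile's records, and the column names field1/field2 and the sample rows below are assumptions made for this example.

```python
import sqlite3


def distinct_records(records):
    """Return the unique (field1, field2) pairs from the given rows."""
    # In-memory table standing in for the FLOWFILE "table" (an assumption;
    # NiFi does not use SQLite for QueryRecord).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE FLOWFILE (field1 TEXT, field2 TEXT)")
    conn.executemany("INSERT INTO FLOWFILE VALUES (?, ?)", records)
    rows = conn.execute(
        "SELECT DISTINCT field1, field2 FROM FLOWFILE ORDER BY field1, field2"
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample data with one duplicate record.
sample = [("a", "1"), ("b", "2"), ("a", "1")]
unique = distinct_records(sample)
```

As in the reply, every field that defines a duplicate has to be listed in the SELECT, and the whole set is held in memory while it is deduplicated.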

Re: access property in ScriptedRecordSetWriter?

2020-08-21 Thread Matt Burgess
Dave, For ScriptedRecordSetWriter (and all the scripted Controller Services), you provide the properties yourself, rather than (like ExecuteScript) defining dynamic properties and referring to them from the script. I have an example [1] of using Record controller services from InvokeScriptedProces

Re: access property in ScriptedRecordSetWriter?

2020-08-21 Thread Matt Burgess
Dave, Your snippet is looking good on the inside, but as you want a ScriptedRecordSetWriter you will want to create that instead of a Processor, something like this: class GroovyRecordSetWriter implements RecordSetWriter { private int recordCount = 0 private final OutputStream out pri

Re: access property in ScriptedRecordSetWriter?

2020-08-21 Thread Matt Burgess
Oops copy paste error, the GroovyScriptedRecordSetWriterFactory has to extend AbstractControllerService Sent from my iPhone > On Aug 21, 2020, at 4:50 PM, David Early wrote: > >  > Matt, > > This is very cool of you, and I feel like this is close, but once again > hanging up on my inexperie

Re: Re[2]: access property in ScriptedRecordSetWriter?

2020-08-24 Thread Matt Burgess
Dave, You can't override the initialize() method as it is final, but note that it calls the init() method at the bottom, that's the method you can override, its signature is: protected void init(final ControllerServiceInitializationContext config) throws InitializationException so instead of "de

Re: Re[2]: access property in ScriptedRecordSetWriter?

2020-08-24 Thread Matt Burgess
The formatting got a bit wonky on the code snippet you provided, but if your GroovyRecordSetWriterFactory extends AbstractControllerService, it should have access to the getProperty() method. Try without the context, just "getProperty(CACHE_CLIENT).asControllerService(DistributedMapCacheClient)" O

Re: Re[4]: access property in ScriptedRecordSetWriter?

2020-08-24 Thread Matt Burgess
ointerException: Cannot invoke method > get() on null object > > The only "get()" is on line 69 above, def redis_entry = > mapCacheClient.get(mn, > StringSerializer, StringDeserializer) > > Basically, that means that the mapCacheClient is null, but we can't figu

Re: Data performance with FlowFile Repo's RocksDB

2020-09-10 Thread Matt Burgess
You can use a JsonTreeReader set to Infer Schema and use that in JoltTransformRecord. But if your payload is one big JSON object (rather than a top-level array of JSON objects), then you only have one record and should stick to JoltTransformJSON. If you do have an array, JoltTransformJSON will
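
The rule of thumb in this reply (a top-level array yields one record per element, a single top-level object yields one record) can be sketched in plain Python. This is only an illustration of that rule, not NiFi's reader code; the function name record_count and the sample payloads are assumptions.

```python
import json


def record_count(payload: str) -> int:
    """Count records the way the reply describes them."""
    data = json.loads(payload)
    # A top-level array contributes one record per element;
    # anything else (for example one big object) is a single record.
    return len(data) if isinstance(data, list) else 1


# Hypothetical payloads: an array of objects vs. an object wrapping an array.
array_payload = '[{"a": 1}, {"a": 2}]'
object_payload = '{"items": [{"a": 1}, {"a": 2}]}'
```

A payload that counts as a single record gains nothing from the record-based processor, which is why the reply points back to the whole-document JSON transform in that case.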

Re: NiFi V1.9.2 Performance

2020-09-24 Thread Matt Burgess
Nathan, If you have multiple JSON messages in one flow file, is it in one large array, or a top-level JSON object with an array inside? Also are you trying to transform each message or the whole thing (i.e. do you need to know about more than one message at a time)? If you have a top-level array a

Re: NiFi V1.9.2 Performance

2020-09-24 Thread Matt Burgess
"Field3": > "[&(1)].OutputField2", > > "Sub": { > > "0": { > > "SubField1": "[&(3)].OutputField4", > >

Re: How to split json subarrays and keep root

2020-09-28 Thread Matt Burgess
Jens, Try ForkRecord [1] with "Mode" set to "Extract" and "Include Parent Fields" set to "true", I think that does what you're looking to do. Regards, Matt [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.12.1/org.apache.nifi.processors.standard.ForkReco

Re: newbie attempting first flow with InvokeHTTP and PutDatabaseRecord

2020-09-29 Thread Matt Burgess
Eric, Depending on how large the JSON content is, you could use JoltTransformJSON to "hoist" the desired data to the top level. Given this example JSON: { "json": { "data": { "value": 3 } } } The spec would be: [ { "operation": "shift", "spec": { "json": {

Re: NiFi V1.9.2 Performance

2020-10-02 Thread Matt Burgess
It does seem to > be slightly better performing. > > > > We are trying to switch out the ConsumeKafka_0_10 and ConvertRecord > processors to the ConsumeKafkaRecord_0_10 processor based on feedback in > this chain as well. > > > > With the ConvertRecord processor, we u

Re: Hive NAR not loading because of snappy?

2020-10-13 Thread Matt Burgess
IIRC this is likely a permissions issue: Xerial Snappy tries to unzip the native library to the location pointed to by “java.io.tmpdir”, which on *nix defaults to /tmp. Does the NiFi user have write access to that directory? If not you can change the Java temp dir or set it specifically for Snap

Re: Hive NAR not loading because of snappy?

2020-10-13 Thread Matt Burgess
"$LD_LIBRARY_PATH:/tmp/snappy-1.0.5-libsnappyjava.so" && > /opt/nifi/bin/nifi.sh start ) > > but it's not something we want to do (in case that shared object disappears > from /tmp). > > > On 10/13/20 3:42 PM, Matt Burgess wrote: >> IIRC this is lik

Re: Hive NAR not loading because of snappy?

2020-10-13 Thread Matt Burgess
PM, Russell Bateman wrote: > >  No, we don't even use (nor have we ever used) Hive in our flows. It's just > there and we didn't want to modify the NiFi download. Should this not even > happen if we're not using it? > >> On 10/13/20 4:24 PM, Matt Burgess w

Re: Hive NAR not loading because of snappy?

2020-10-13 Thread Matt Burgess
d component to > be a required part of its installation since we don't produce the NiFi > download. We'd rather install it as it comes. > > On 10/13/20 4:45 PM, Matt Burgess wrote: >> Ouch! It does happen on the loading of the NAR to ensure the native library >

Re: Run Nifi in IntelliJ to debug?

2020-10-26 Thread Matt Burgess
Yes, that's a pretty common operation amongst NiFi developers. In conf/bootstrap.conf there's a section called Enable Remote Debugging and a commented-out line something like: java.arg.debug=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005 You can remove the comment from that li

Re: Run Nifi in IntelliJ to debug?

2020-10-26 Thread Matt Burgess
03 PM Matt Burgess wrote: > > Yes, that's a pretty common operation amongst NiFi developers. In > conf/bootstrap.conf there's a section called Enable Remote Debugging > and a commented-out line something like: > > java.arg.debug=-agentlib:jdwp=transport=dt_socket,server

Re: GetFile with putsql/executesql

2020-10-28 Thread Matt Burgess
Asmath, GetFile doesn't take an input connection, but if the attribute is going to contain a file to ingest, you can use FetchFile instead. To get an attribute from a database, take a look at LookupAttribute with a SimpleDatabaseLookupService. Depending on the query you were going to execute, you

Re: PrometheusReportingTask Metrics

2020-11-02 Thread Matt Burgess
David, The documentation for the metrics is in the "help" section of the datapoint definition, if you hit the REST endpoint you can see the descriptions, also they are listed in code [1]. Regards, Matt [1] https://github.com/apache/nifi/blob/main/nifi-nar-bundles/nifi-extension-utils/nifi-promet

Re: horizontal merge

2020-11-17 Thread Matt Burgess
Geoffrey, Where are the two flowfiles coming from? This use case is often handled in NiFi using LookupRecord with one of the LookupService implementations (REST, RDBMS, CSV, etc.). We don't currently have a mechanism (besides scripting) to do enrichment/lookups from flowfiles. For your script, yo

Re: schema index out of range

2020-12-02 Thread Matt Burgess
Satish, Can you provide some sample data that causes this issue? Thanks, Matt On Wed, Dec 2, 2020 at 5:18 AM naga satish wrote: > > Hi all, In record readers(CSVreader) when schema strategy is set to > InferSchema, sometimes it keeps on giving error. the error states that index > of "stud

Re: A "proxy" question from the irc channel

2015-11-08 Thread Matt Burgess
Would we have more participation with something like a Slack team? Apache Drill has one with #user and #dev channels, seems to work pretty well and has various integrations with other tools (email, GitHub, Jira, etc.) Sent from my iPhone > On Nov 8, 2015, at 12:41 PM, Tony Kurc wrote: > > The

Re: A "proxy" question from the irc channel

2015-11-08 Thread Matt Burgess
cting people. > >> On Sun, Nov 8, 2015 at 12:53 PM, Matt Burgess wrote: >> Would we have more participation with something like a Slack team? Apache >> Drill has one with #user and #dev channels, seems to work pretty well and >> has various integrations with other tool

Re: Nifi service fail to start - Removed custom processor

2015-11-10 Thread Matt Burgess
Perhaps the UI could have a placeholder for a missing processor, and the context menu or body could include more details as far as which processor it was looking for. Sent from my iPhone > On Nov 10, 2015, at 8:30 PM, Chakrader Dewaragatla > wrote: > > Thank you. I wish this process get simp

Re: Nifi service fail to start - Removed custom processor

2015-11-10 Thread Matt Burgess
t [1]? > > Thanks > Joe > > [1] https://issues.apache.org/jira/browse/NIFI-1052 > >> On Tue, Nov 10, 2015 at 9:18 PM, Matt Burgess wrote: >> Perhaps the UI could have a placeholder for a missing processor, and the >> context menu or body could include more details

Re: Expression language

2015-11-12 Thread Matt Burgess
Not sure if it would prove useful but I've started messing around with the Aho-Corasick algorithm in the hopes of the user being able to paste in some sample data and getting a regex out. If the data is "regular", the user wouldn't need to know an expression language, they would just need a rep

Re: Expression language

2015-11-12 Thread Matt Burgess
u've got ideas on > how to provide a more intuitive play - yes please. You will find an > implementation of aho corasick under the standard processors > (ScanContent) and the associated library under search tools. > Amazingly fast. > > Thanks! > Joe > >> On Thu,

Re: Mapping from JMS -> JSON -> SQL

2016-01-30 Thread Matt Burgess
If the JMS source is actual JSON then you can use EvaluateJsonPath (or SplitJson for arrays), you can craft the attributes to match the arguments to PutSQL and have a prepared statement within... I think :) ExecuteScript will be for those times you just can't presently connect the dots with exi

Re: Generate URL based on different conditions

2016-02-16 Thread Matt Burgess
Here's a Gist template that uses Joe's approach of RouteOnAttribute then UpdateAttribute to generate URLs with the use case you described: https://gist.github.com/mattyb149/8fd87efa1338a70c On Tue, Feb 16, 2016 at 9:51 PM, Joe Witt wrote: > Jeff, > > For each of the input files could it be t

Re: Using Apache Nifi and Tika to extract content from pdf

2016-02-20 Thread Matt Burgess
I have a blog post on how to do this with NiFi using a Groovy script in the ExecuteScript (new in 0.5.0) processor using PDFBox instead of Tika: http://funnifi.blogspot.com/2016/02/executescript-extract-text-metadata.html?m=1 Jython is also supported but can't yet use Java libraries (it uses Jyt

Re: Using Apache Nifi and Tika to extract content from pdf

2016-02-20 Thread Matt Burgess
can write up an improvement Jira with the initial findings. Regards, Matt > On Feb 20, 2016, at 2:18 PM, Russell Whitaker > wrote: > > Don't forget Clojure as well. > > Russell Whitaker > Sent from my iPhone > >> On Feb 20, 2016, at 7:44 AM, Matt Burges

Re: Using Apache Nifi and Tika to extract content from pdf

2016-02-20 Thread Matt Burgess
te(flowFile, {inputStream, outputStream -> > doc = PDDocument.load(inputStream) > info = doc.getDocumentInformation() > s.writeText(doc, new OutputStreamWriter(outputStream)) > } as StreamCallback > ) > > Thanks for your help. > > BR > Ralf > >

Re: Using Apache Nifi and Tika to extract content from pdf

2016-02-21 Thread Matt Burgess
umentation? Or where did I find some infos? > > Sorry for all my questions. > > BR and thanks. > > Ralf > > > Am 20.02.2016 um 22:27 schrieb Matt Burgess : > > I will update the blog to make these more clear. I used PDFBox 1.8.10 so > I'm not sure what e

Re: Auto installation of template

2016-02-26 Thread Matt Burgess
Michael, I don't think you can put a template into conf/templates and have it be picked up, I tried a couple of things and it looks like the system manages the things put in there. For the REST API, the documentation for uploading a template is missing because there are two ways to use POST and t

Re: javascript executescript processor

2016-03-01 Thread Matt Burgess
Mike, I have a blog containing a few posts on how to use ExecuteScript and InvokeScriptedProcessor: http://funnifi.blogspot.com One contains an example using JavaScript to get data from Hazelcast and update flowfile attributes: http://funnifi.blogspot.com/2016/02/executescript-using-modules.html

Re: javascript executescript processor

2016-03-02 Thread Matt Burgess
'm looking for - much appreciated ! >> >> Thanks, >> Mike >> >> On Tue, 1 Mar 2016 at 18:13, Matt Burgess wrote: >> >>> Mike, >>> >>> I have a blog containing a few posts on how to use ExecuteScript and >>> InvokeScript

Re: javascript executescript processor

2016-03-02 Thread Matt Burgess
ss/nifi/processors/NiFiUtils.java > > Sent from my iPhone > >> On Mar 2, 2016, at 1:40 PM, Matt Burgess wrote: >> >> Ask and ye shall receive ;) I realize most of my examples are in Groovy so >> it was a good idea to do some non-trivial stuff in another language, th

Re: ExecuteSQL Extract database tables multiple times.

2016-03-04 Thread Matt Burgess
Currently ExecuteSQL will put all available rows into a single flow file. There is a Jira case (https://issues.apache.org/jira/browse/NIFI-1251) to allow the user to break up the result set into flow files containing a specified number of records. I'm not sure why you get 26 flow files, although i

Re: ExecuteSQL and NiFi 0.5.1 - Error org.apache.avro.SchemaParseException: Empty name

2016-03-05 Thread Matt Burgess
That's on me, that commit went into 0.5.0 and looks like a negative logic error. I thought I had unit tested it but I guess not :( Sent from my iPhone > On Mar 5, 2016, at 6:57 PM, Bryan Bende wrote: > > I think this a legitimate bug that was introduced in 0.5.0. > > I created this ticket: h

Re: ExecuteSQL and NiFi 0.5.1 - Error org.apache.avro.SchemaParseException: Empty name

2016-03-05 Thread Matt Burgess
Actually on second thought it's not negative logic, it should be checking against tableNameFromMeta. Sent from my iPhone > On Mar 5, 2016, at 6:57 PM, Bryan Bende wrote: > > I think this a legitimate bug that was introduced in 0.5.0. > > I created this ticket: https://issues.apache.org/jira/

Re: ExecuteSQL Extract database tables multiple times.

2016-03-07 Thread Matt Burgess
a8b9-a77d0be27273] failed to process >>> due to org.apache.avro.SchemaParseException: Empty name; rolling back >>> session: org.apache.avro.SchemaParseException: Empty name >>> >>> 10:30:02 CET ERROR >>> ExecuteSQL[id=d32x32d7-c477-4b3b-a8b9-a77d0be27273] Pr

Re: ExecuteSQL Extract database tables multiple times.

2016-03-07 Thread Matt Burgess
, > > thanks for the reply. Is this fix also solving the issue with Microsoft > SQL Server? > Is there estimation at which time such a fix is available for the public? > > Thanks for your help. > > BR > Ralf > > > Am 07.03.2016 um 15:15 schrieb Matt Burgess : &

Re: javascript executescript processor

2016-03-07 Thread Matt Burgess
Looks like on Rhino you need a different syntax to import stuff: http://docs.oracle.com/javase/7/docs/technotes/guides/scripting/programmer_guide/#jsengine On Mon, Mar 7, 2016 at 11:26 AM, Matt Burgess wrote: > So that's weird since you're running NiFi on Java already and tr

Re: javascript executescript processor

2016-03-07 Thread Matt Burgess
t, > > Thanks for doing this - I've just tried to run the template and I get the > reference error: "Java" is not defined. I have JAVA_HOME set on my ubuntu > machine - just wondering if theres a new config setting I'm missing perhaps? > > Mike > >

Re: javascript executescript processor

2016-03-07 Thread Matt Burgess
va classes > in a script before. > > > > On 7 March 2016 at 17:03, Mike Harding wrote: > >> aaa ok cool. Given that org.apache.nifi.processor.io.StreamCallback is >> an interface do I need to include the underlying classes? >> >> On 7 March 2016 at 16

Re: DetectDuplicate : java.net.ConnectException

2016-03-19 Thread Matt Burgess
Arathi, You'll need to add another Controller Service, one of type DistributedMapCacheServer, set up on port 4557 (to match your DistributedMapCacheClientService), and enable/start it. Then you should be able to connect successfully. Regards, Matt On Thu, Mar 17, 2016 at 4:15 PM, Arathi Maddula

Re: NiFi: command-line interface ?

2016-03-20 Thread Matt Burgess
Dmitry, With regards to nifi-client (I am the author), that exception occurs when the flow has been changed externally and the shell has not recognized it. What is the result of the following command? nifi.currentVersion If it is -1, then I recommend restarting the shell. It should be a non-negativ

Re: NiFi: command-line interface ?

2016-03-20 Thread Matt Burgess
). I've got Gradle 2.3 whose version > option's output states Groovy at 2.3.9. > - Dmitry > > > >> On Sun, Mar 20, 2016 at 3:23 PM, Matt Burgess wrote: >> Dmitry, >> >> With regards to nifi-client (I am the author), that exce

Re: NiFi: command-line interface ?

2016-03-20 Thread Matt Burgess
- Dmitry > >> On Sun, Mar 20, 2016 at 3:49 PM, Matt Burgess wrote: >> Hmm looks like it is working properly, not sure why you're getting the 409 >> Conflict. I will look into it more. >> >> I also wanted to mention that you can make use of the nifi-client "

Re: NiFi: command-line interface ?

2016-03-20 Thread Matt Burgess
t in a valid state. > Returning Conflict response. > > Not sure why the state is "not valid". The GetFile processor seems fine to > me. All the processors in the flow are currently stopped. GetFile has input > files. I would assume this should be OK. > > > >

Re: Help on creating that flow that requires processing attributes in a flow content but need to preserve the original flow content

2016-03-21 Thread Matt Burgess
One way (in NiFi 0.5.0+) is to use the ExecuteScript processor, which gives you full control over the session and flowfile(s). For example if you had JSON in your "kafka.key" attribute such as "{"data": {"myKey": "myValue"}}" , you could use the following Groovy script to parse out the value of th
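
The Groovy script in this message is cut off in the archive, so the following is not a reconstruction of it. It is a small Python sketch of the parsing step alone, using the example attribute value from the message; the function name extract_my_key is an assumption, and none of the ExecuteScript session or flowfile handling is shown.

```python
import json


def extract_my_key(kafka_key: str) -> str:
    """Pull data.myKey out of a JSON string held in an attribute."""
    parsed = json.loads(kafka_key)
    return parsed["data"]["myKey"]


# Example attribute value taken from the message above.
value = extract_my_key('{"data": {"myKey": "myValue"}}')
```

Because only the attribute is parsed, the flowfile content is never read or rewritten, which is what lets the original content pass through untouched.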

Re: CSV/delimited to Parquet conversion via Nifi

2016-03-21 Thread Matt Burgess
Edmon, NIFI-1663 [1] was created to add ORC support to NiFi. If you have a target dataset that has been created with Parquet format, I think you can use ConvertCSVToAvro then StoreInKiteDataset to get flow files in Parquet format into Hive, HDFS, etc. Others in the community know a lot more about

Re: CSV/delimited to Parquet conversion via Nifi

2016-03-22 Thread Matt Burgess
e extra transform could be expensive. >> >>> On Mar 21, 2016 9:39 PM, "Dmitry Goldenberg" >>> wrote: >>> Since NiFi has ConvertJsonToAvro and ConvertCsvToAvro processors, would it >>> make sense to add a feature request for a ConvertJsonToParquet pro

Re: what is the PutElasticsearch Identifier Attribute for?

2016-03-23 Thread Matt Burgess
The Identifier Attribute property should contain the name of a Flow File attribute, which in turn contains the ID of the document to be put into Elasticsearch. Unfortunately it is a required property so having ES auto-generate it is not yet supported [1]. If you don't care what the ID is but need

Re: How to add python modules ?

2016-03-23 Thread Matt Burgess
Madhukar, Glad to hear you found a solution, I was just replying when your email came in. Although in ExecuteScript you have chosen "python" as the script engine, it is actually Jython that is being used to interpret the scripts, not your installed version of Python. The first line (shebang) is

Re: How to add python modules ?

2016-03-24 Thread Matt Burgess
ble > > -Madhu > > On Thu, Mar 24, 2016 at 12:34 AM, Madhukar Thota > wrote: > >> Hi Matt, >> >> Thank you for the input. I updated my config as you suggested and it >> worked like charm and also big thankyou for nice article. i used your >> articl

Re: REST Interface

2016-03-26 Thread Matt Burgess
The REST API is at /nifi-api not /nifi, the doc is somewhere but I am guessing we can do more to announce that in the relevant docs, thanks! Where do you think it would be helpful to add such reference(s)? Thanks, Matt > On Mar 26, 2016, at 2:47 PM, Uwe Geercken wrote: > > Just a quick one: I

Re: String conversion to Int, float double

2016-03-28 Thread Matt Burgess
Sounds good to me. I presume the processor would still put all attributes in the JSON content, but would use any dynamic properties solely for type coercion? Anything not listed would be treated like a String as it is now (to preserve current behavior). We'd need to document the possible values

Re: Common Attributes (FileSize)

2016-03-28 Thread Matt Burgess
Radhakrishna, The "fileSize" attribute should be available for every flow file. Can you describe how you are finding the default set of attributes and which ones you are finding? To test, I generated a flow file with the text "Hello" in it, then sent that to a LogAttribute processor, and got the

Re: EvaluateJsonPath and Json Field Name Starting with @ as the First Character

2016-03-29 Thread Matt Burgess
Hong, I was able to use EvaluateJsonPath with eventClass $.event.@class and the attribute had the correct value (see output from LogAttribute below): -- Standard FlowFile Attributes Key: 'entryDate' Value: 'Tue Mar 29 08:41:36 EDT 2016' Key: 'lineag

Re: How to add python modules ?

2016-03-30 Thread Matt Burgess
You are returning self.d from process() which is a void method. Needs to return None. Sent from my iPhone > On Mar 30, 2016, at 5:00 PM, Madhukar Thota wrote: > > Matt, > > I tried the following code but i am getting the following error. Can you help > me where i am doing wrong? > > Error:

Re: How to add python modules ?

2016-03-30 Thread Matt Burgess
Madhu, Since you won't be able to return your dictionary, another approach would be to create the dictionary from the main script and pass it into the callback constructor. Then process() can update it, and you can use the populated dictionary after process() returns to set attributes and such.
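
The pattern described here can be sketched in plain Python without any NiFi classes. In a real Jython script the callback would implement one of NiFi's stream callback interfaces and receive a Java stream; the class name AttributeCollector, the dictionary keys, and the io.BytesIO stand-in for the input stream are all assumptions made so the sketch runs on its own.

```python
import io


class AttributeCollector:
    """Stand-in callback: fills a dictionary owned by the caller."""

    def __init__(self, results):
        # Keep a reference to the caller's dictionary instead of returning data.
        self.results = results

    def process(self, input_stream):
        text = input_stream.read().decode("utf-8")
        words = text.split()
        self.results["length"] = str(len(text))
        self.results["first_word"] = words[0] if words else ""
        # process() is void on the Java side, so nothing is returned.
        return None


# The main script creates the dictionary, passes it in, and reads it afterwards.
attributes = {}
AttributeCollector(attributes).process(io.BytesIO(b"hello nifi"))
```

After process() returns, the main script still holds the populated dictionary and can use its entries to set flowfile attributes.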

Re: PutHDFS and LZ4 compression ERROR

2016-03-30 Thread Matt Burgess
I don't think the .so will help you on Windows, you'd need a .dll instead. > On Mar 30, 2016, at 11:39 AM, Thad Guidry wrote: > > Evidently, on Windows the PATH environment variable should also have the path > to your native libraries, so that java.library.path can find them. > > I added the p

Re: No controller service types found that are applicable for this property

2016-03-31 Thread Matt Burgess
Rajeshkumar, Since this is likely related to the custom processor code (either as a result of redeclaring/overriding the existing controller service or NAR dependencies or something else), you may find the d...@nifi.apache.org mailing list has an audience better suited to help you. In either case,

Re: HTTP and OAuth

2016-04-02 Thread Matt Burgess
Pierre, I'm no OAuth expert but maybe you could have a flow that hits the OAuth service for a token (scheduled for the same duration as the token lifetime), then stores it in a DistributedMapCache, then your other flows can fetch the token for the desired operations? Alternatively, if you are to p

Re: EvaluateJsonPath and Json Field Name Starting with @ as the First Character

2016-04-03 Thread Matt Burgess
ile > www.centricconsulting.com | @Centric <https://twitter.com/centric> > > On Tue, Mar 29, 2016 at 8:44 AM, Matt Burgess wrote: > >> Hong, >> >> I was able to use EvaluateJsonPath with eventClass $.event.@class

Re: ExecuteSQL to elasticsearch

2016-04-07 Thread Matt Burgess
Can you provide a sample JSON output from your ConvertAvroToJSON processor? It could help identify the location of any mapping/parser exceptions. Thanks, Matt On Thu, Apr 7, 2016 at 1:31 PM, Madhukar Thota wrote: > I am able to construct the dataflow with the following processors > > ExecuteSQL

Re: ExecuteSQL to elasticsearch

2016-04-07 Thread Matt Burgess
": "lab1", "CounterName": > "AvgDiskSecTransfer", "InstanceName": "C:", "MetricValue": > 9.60508652497083E-4}, > {"DateTime": "2016-04-07 17:22:00.0", "HostName": "lab1", "CounterName"

Re: Lua usage in ExecuteScript Processor

2016-04-20 Thread Matt Burgess
Madhu, I know very little about Lua, so I haven't tried making a Lua version of my JSON-to-JSON scripts/blogs (funnifi.blogspot.com), but here's something that works to get you started. The following Luaj script creates a flow file, writes to it, adds an attribute, then transfers it to success. Ho

Re: Is it possible to call a HIVE table from a ExecuteScript Processor?

2016-04-26 Thread Matt Burgess
Hive doesn't work with ExecuteSQL as its JDBC driver does not support all the JDBC API calls made by ExecuteSQL / PutSQL. However I am working on a Hive NAR to include ExecuteHiveQL and PutHiveQL processors (https://issues.apache.org/jira/browse/NIFI-981), there is a prototype pull request on GitH

Re: ReplaceText processor configuration help

2016-04-26 Thread Matt Burgess
Yes, I think you'll be better off with Aldrin's suggestion of ReplaceText. Then you can put the value of the attribute(s) directly into the content. For example, if you have two attributes "entities" and "users", and you want a JSON doc with those two objects inside, you can use ReplaceText with t

Re: ReplaceText processor configuration help

2016-04-26 Thread Matt Burgess
for some alternatives like using Groovy for JSON-to-JSON > conversion. But not sure how StandardCharsets.UTF_8 will work with > multi-byte languages. > > > On Tue, Apr 26, 2016 at 12:11 PM, Matt Burgess wrote: >> >> Yes, I think you'll be better off with Aldr

Re: Nifi parsing examples

2016-04-27 Thread Matt Burgess
If you can represent the expected string format as a regular expression, you can use the replaceAll() function [1] with back-references: ${url:replaceAll('(http://[a-zA-Z0-9]+:)[a-zA-Z0-9]+(@.*)','$1x$2')} original: http://username:p...@host.com after: http://username:xx...@host.com Note I h
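
The same back-reference idea can be tried outside NiFi. The sketch below uses Python's re module with the regular expression from the message; it is an analogue of the Expression Language call rather than NiFi code, and the function name mask_password and the sample URL in the usage note are assumptions (the archive obscures the original example URLs).

```python
import re

# Same pattern as in the replaceAll() call: group 1 is the scheme and user
# name up to the colon, group 2 is everything from the "@" onward.
PATTERN = re.compile(r"(http://[a-zA-Z0-9]+:)[a-zA-Z0-9]+(@.*)")


def mask_password(url: str) -> str:
    """Replace the password portion with a literal 'x'."""
    # \g<1> and \g<2> are Python's unambiguous spelling of $1 and $2.
    return PATTERN.sub(r"\g<1>x\g<2>", url)
```

For a hypothetical input such as http://username:secret@host.com this returns http://username:x@host.com; a string that does not match the pattern comes back unchanged.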

Re: Nifi parsing examples

2016-04-27 Thread Matt Burgess
Sorry that was just for example #2 :) > On Apr 27, 2016, at 3:59 PM, Matt Burgess wrote: > > If you can represent the expected string format as a regular > expression, you can use the replaceAll() function [1] with > back-references: > > ${url:replaceAll('(http:

Re: Is it possible to call a HIVE table from a ExecuteScript Processor?

2016-04-28 Thread Matt Burgess
to issue the PR for > this? > > > Cheers, > Mike > > On Tue, 26 Apr 2016 at 14:47, Matt Burgess wrote: >> >> Hive doesn't work with ExecuteSQL as its JDBC driver does not support >> all the JDBC API calls made by ExecuteSQL / PutSQL. However I am

Re: Doing development on nifi

2016-04-28 Thread Matt Burgess
Stéphane, Welcome to NiFi, glad to have you aboard! May I ask what version you are using? I believe as of at least 0.6.0, you can view the items in a queued connection. So for your example, you can have a GetHTTP into a SplitJson, but don't start the SplitJson, just the GetHTTP. You will see any

Re: ExecuteScript Processor Performance

2016-05-02 Thread Matt Burgess
Madhu, In addition to Joe's suggestions, currently ExecuteScript only allows for one task at a time, which is currently a pretty bad bottleneck if you are dealing with lots of throughput. However I have written up a Jira [1] for this and issued a PR [2] to fix it, feel free to try that out and/or

Re: Lua usage in ExecuteScript Processor

2016-05-04 Thread Matt Burgess
>> luajava.LuaState.LdoFile("common_log_format.lua"); >>> >>> >>> On Wed, Apr 20, 2016 at 4:29 PM, Madhukar Thota >>> wrote: >>>> >>>> Thanks Matt. This will be helpful to get started. I will definitely >>>> contribu

Re: Lua usage in ExecuteScript Processor

2016-05-04 Thread Matt Burgess
>> >> >>> I am trying to read the lua file this way, but its not working. How to >> >>> read the lua files from module directory and use it in execution? >> >>> >> >>> luajava.LuaState = luajava.LuaStateFactory.newLuaState() >> >

Re: PutElasticsearch error

2016-05-06 Thread Matt Burgess
Pierre is correct Sent from my iPhone > On May 6, 2016, at 5:20 PM, Pierre Villard > wrote: > > Hi Igor, > > I believe ES processor uses port 9300 (transport port) and not 9200 port > (http port) > > Pierre. > > 2016-05-06 23:16 GMT+02:00 Igor Kravzov : >> I configured ES processor with ES

Re: SelectHiveQL HiveConnectionPool issues

2016-05-09 Thread Matt Burgess
Your URL has a scheme of "mysql", try replacing with "hive2", and also maybe explicitly setting the port: jdbc:hive2://:1/default If that doesn't work, can you see if there is an error/stack trace in logs/nifi-app.log? Regards, Matt On Mon, May 9, 2016 at 12:04 PM, Mike Harding wrote: > Hi

Re: SelectHiveQL HiveConnectionPool issues

2016-05-09 Thread Matt Burgess
select to export it in Avro I get the following exception: > > [image: Inline images 1] > > I'm assuming this is happening because the underlying data on HDFS my hive > table is reading from is not Avro? its currently standard JSON. > > Thanks, > Mike > > >

Re: SplitJson configuration question

2016-05-11 Thread Matt Burgess
I believe $.* should work to split at the root. > On May 11, 2016, at 5:23 PM, Igor Kravzov wrote: > > Looks like I am missing something. How to configure SplitJson to split array > like below to individual JSON files. Basically split on "root" of array. > > [{ > "id":1, >"data":"data1"
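
What splitting at the root amounts to can be sketched in Python: each element of the top-level array becomes its own JSON document. This is an illustration of the intended result, not SplitJson itself. The function name split_root_array is an assumption, and since the quoted array is cut off after its first element, the second element in the sample is assumed as well.

```python
import json


def split_root_array(content: str) -> list:
    """Return one JSON document (as a string) per top-level array element."""
    elements = json.loads(content)
    if not isinstance(elements, list):
        raise ValueError("expected a top-level JSON array")
    return [json.dumps(element) for element in elements]


# First element follows the quoted question; the second one is assumed.
sample = '[{"id": 1, "data": "data1"}, {"id": 2, "data": "data2"}]'
parts = split_root_array(sample)
```

Each entry in parts corresponds to what would become the content of one outgoing flowfile.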
