Re: RemoveDistributedMapCache

2017-02-13 Thread Matt Burgess
eName, with no age > Off) ->CreteTableIfNotExists -> IncrementalLoadData –> > RemoveDistributedMapCache (tableName) > > > > Unfortunately there isn’t the processor RemoveDistributedMapCache, I could > handle this, thanks to Matt Burgess > (https://community.hort

Re: RemoveDistributedMapCache

2017-02-13 Thread Matt Burgess
able at the same time, what I wish to achieve is a > synchronized process for each table. > > Regards > Carlos > > -Original Message- > From: Matt Burgess [mailto:mattyb...@apache.org] > Sent: Monday, 13 February 2017 18:25 > To: users@nifi.apache.

Re: Execute script and python

2017-02-21 Thread Matt Burgess
ey Brian, > > One good resource around the ExecuteScript processor is Matt Burgess' blog > [1]. Matt wrote the ExecuteScript processor and has a bunch of how-to guides > there. > > [1] http://funnifi.blogspot.com/ > > Thanks, > Bryan > > On Tue, Feb 21, 2017 at 2

Re: new Nifi Processors

2017-02-28 Thread Matt Burgess
Uwe G has made his processors available (thank you!) via his own repo vs the official Apache NiFi repo; this may be directly related to your point about licensing. Having said that, he is of course at liberty to license those separate processors as he sees fit (assuming it is also in accordance wi

Re: Re: new Nifi Processors

2017-03-01 Thread Matt Burgess
ring for the benefit of > all of us. > > If I can, I will adjust whatever is necessary, so that the license is not a > hurdle for using the processors. Nifi is a really great product and I still > remember my first impression when I saw it..... > > Greetings, > > Uwe

Re: Re: Re: new Nifi Processors

2017-03-02 Thread Matt Burgess
> Basically the GPL license puts restrictions on how one can distribute in > practical terms. Meaning your work may live under GPL license as long as > it's not part of the official package. End users will have to download your > NAR themselves. > > Andrew > > > On W

Re: RuleEngine Processor

2017-03-15 Thread Matt Burgess
Uwe, I will do my best to look at this but it will probably be next week. It seems very interesting and my only excuse is being swamped by other things. Thank you for sharing this, hopefully others in the community will give it a go as well. Regards, Matt > On Mar 15, 2017, at 6:12 PM, Uwe Ge

Re: Re: RuleEngine Processor

2017-03-15 Thread Matt Burgess
d this, you will need to have a mysql/mariadb database, import the > schema and also run the rulemaintenance web application in tomcat, to create > the business logic. > > Hope that is not too much to ask for... > > Greetings and thanks again. > > Uwe > >

Re: Cannot get a connection, pool error Timeout waiting for idle object in PutSQL?

2017-03-22 Thread Matt Burgess
Prabhu, What are your settings for the DBCPConnectionPool controller service? The defaults are 8 Max Connections and 500 milliseconds for Max Wait Time. For 10 concurrent PutSQL tasks, the first 8 will likely get connections, and if none are returned in 500 milliseconds, then one of the other task
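The sizing argument above (8 connections, a 500 ms wait, 10 concurrent tasks) can be reproduced with a small stand-alone simulation. This is only a sketch: a semaphore stands in for the connection pool, simulate_pool and its hold parameter are hypothetical names, and nothing here calls NiFi or DBCP.

```python
import threading
import time


def simulate_pool(max_connections=8, max_wait=0.5, tasks=10, hold=1.0):
    """Return how many tasks give up waiting for a connection.

    max_connections and max_wait mirror the pool defaults quoted above;
    hold is an assumed time each task keeps its connection.
    """
    pool = threading.BoundedSemaphore(max_connections)
    timed_out = []
    guard = threading.Lock()

    def task(task_id):
        # Wait at most max_wait seconds for a free connection.
        if not pool.acquire(timeout=max_wait):
            with guard:
                timed_out.append(task_id)
            return
        try:
            time.sleep(hold)  # pretend to run the SQL statement
        finally:
            pool.release()    # hand the connection back to the pool

    threads = [threading.Thread(target=task, args=(i,)) for i in range(tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(timed_out)
```

When every task holds its connection longer than the wait time, the tasks beyond the pool size time out; raising the pool size or the wait time in the simulation removes the timeouts.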

Re: Cannot get a connection, pool error Timeout waiting for idle object in PutSQL?

2017-03-22 Thread Matt Burgess
I just noticed this answer on SO as well [1]. Regards, Matt [1] http://stackoverflow.com/questions/42942759/cannot-get-a-connection-pool-error-timeout-waiting-for-idle-object-in-putsql On Wed, Mar 22, 2017 at 10:33 AM, Matt Burgess wrote: > Prabhu, > > What are your setting

Re: ExecuteScript once at workflow inception

2017-03-28 Thread Matt Burgess
Jim, You can use InvokeScriptedProcessor [1] rather than ExecuteScript for this. ExecuteScript basically lets you provide an onTrigger() body, which is called when the ExecuteScript processor "has work to do". None of the other lifecycle methods are available. For InvokeScriptedProcessor, you act

Re: How does lib_community get included in NiFi nar search path?

2017-03-29 Thread Matt Burgess
Jim, You said your Python modules might be in NARs? If so I'm not sure InvokeScriptedProcessor can pick them up. Normally Python modules are installed or otherwise located in a directory, and you add that directory to the Module Directory property of InvokeScriptedProcessor, and it will do a sy

Re: UndeclaredThrowableException from InvokeScriptedProcessor

2017-04-03 Thread Matt Burgess
Jim, I'm not at my keyboard but I'm guessing it is a NullPointerException from returning None from getPropertyDescriptors(), try returning an empty PyMap instead. Regards, Matt Sent from my iPhone > On Apr 3, 2017, at 5:35 PM, James McMahon wrote: > > Good evening. I am trying to get this s

Re: UndeclaredThrowableException from InvokeScriptedProcessor

2017-04-03 Thread Matt Burgess
Sorry empty List. Sent from my iPhone > On Apr 3, 2017, at 5:52 PM, Matt Burgess wrote: > > Jim, > > I'm not at my keyboard but I'm guessing it is a NullPointerException from > returning None from getPropertyDescriptors(), try returning an empty PyMap >

Re: Options for increasing performance?

2017-04-05 Thread Matt Burgess
Jim, One quick thing you can try is to use GenerateFlowFile to send to your ExecuteScript instead of HandleHttpRequest, you can configure it to send whatever body with whatever attributes (such that you would get from HandleHttpRequest) and send files at whatever rate the processor is scheduled. T

Re: GetHDFS and triggering

2017-04-05 Thread Matt Burgess
Arnaud, Can you explain more about what you'd like to do via an INSERT query? Are you trying to accomplish #3 using Hive via JDBC? If so you should be able to use PutHiveQL rather than PutSQL. If you already have an external table in Hive and don't yet have the ORC table, you should be able to us

Re: Change data capture processor

2017-04-12 Thread Matt Burgess
I believe for the upcoming release only MySQL will be supported. True CDC involves interacting "underneath" the database/JDBC layer in many cases, often retrieving records from the transaction logs themselves. Each of the vendors has their own API and/or clients, each with their own configuratio

Re: translating records from MySQL database to Turtle

2017-04-13 Thread Matt Burgess
Bill, Can you share a little bit more detail as to your database setup? What kind of database (MySQL, Oracle, Postgres, e.g.) is it, and what does your table look like? Are you looking to do this once, or periodically, or incrementally as new rows are added? If incrementally, is there a column th

Re: translating records from MySQL database to Turtle

2017-04-14 Thread Matt Burgess
t. According to the README > file, I start nifi using: > > nifi.sh start > > > But after I do this localhost/nifi doesn't connect to the server. Am I > supposed to be running another command? For instance: > > nifi.sh run > nifi.sh install > > And what does

Re: HIVE Controller service and convertJSONtoSQL processor

2017-04-21 Thread Matt Burgess
That seems odd, it seems from the code [1] that it should recognize a Hive Connection Pool. Please feel free to file a Jira [2] with the details of your use case / execution; thank you in advance! Regards, Matt [1] https://github.com/apache/nifi/blob/rel/nifi-1.1.0/nifi-nar-bundles/ni

Re: Replace flowfile contents from InvokeScriptedProcessor?

2017-05-02 Thread Matt Burgess
Jim, you still can/should use something like a PyStreamCallback(). ExecuteScript is basically an onTrigger() body, so you can use the same approach inside your onTrigger() body in InvokeScriptedProcessor. Pass an instance of your PyStreamCallback into something like: flowfile = session.write(flow

Re: Replace flowfile contents from InvokeScriptedProcessor?

2017-05-02 Thread Matt Burgess
> PyStreamCallback()) > # this fails too: flowfile = session.write(self,flowfile, > PyStreamCallback()) > > Am I mistaken to configure PyStreamCallback as a second independent class? > Should it be a defined method within class UpdateAttributes() ? > > On Tue, May 2, 2017

Re: Issue with Groovy script

2017-05-04 Thread Matt Burgess
Mike, To follow up on Andy's question, you will likely need more than just the http-builder JAR, I don't believe it is shaded (aka "fat JAR"). I have the "http-builder-0.7-all.zip" unzipped to a folder, and it has the http-builder-0.7.jar at the root level, but then a "dependencies" folder as well

Re: Replace flowfile contents from InvokeScriptedProcessor?

2017-05-05 Thread Matt Burgess
etAttributes( > > session.transfer(flowfile, self.__rel_success) > session.commit() > except: > session.rollback() > raise > > processor = UpdateAttributes() > > On Tue, May 2, 2017

Re: Is it possible to reference python requests module in ExecuteScript?

2017-05-05 Thread Matt Burgess
Mike, I agree with Andy; one of the challenges with Jython is that all the modules (and their dependencies, and THEIR dependencies, and so on) must be Pure Python (i.e. not call native code). Most of the time I see the module reference error, it is because some imported module is a native (CPython

Re: Is it possible to reference python requests module in ExecuteScript?

2017-05-05 Thread Matt Burgess
Russell et al, I'd like to mention a couple of things here re: the context of the scripting processors: 1) The scripting processors were designed to use the JSR-223 spec for interfacing with a scripting language, in order to make adding new languages easier (ideally you just add the dependency to

Re: Is it possible to reference python requests module in ExecuteScript?

2017-05-06 Thread Matt Burgess
nally, I was promoting using some kind of AMQP to > solve our ETL needs. One day, we stumbled upon NiFi. Since then, I've fallen > in love with everything it does better than a home-brewed queue-messaging > approach tying together inevitably disparate ETL applications. I especially

Re: Is it possible to reference python requests module in ExecuteScript?

2017-05-06 Thread Matt Burgess
Giovanni, That should indeed be a good way to get better performance from the processor (the amount of improvement depends on which ScriptEngine you're using), although I'm not sure if I'm the one who pointed it out ;) The constant (1000) itself could be defined by a dynamic property, whose name bec

Re: Replace flowfile contents from InvokeScriptedProcessor?

2017-05-09 Thread Matt Burgess
LARRY_CURLEY_MOE") > > flowfile = session.write(flowfile,PyStrea > mCallback(PySet(flowfile.getAttributes( > > session.transfer(flowfile, self.__rel_success) > session.commit() > except: >

Re: Create a CovertCSVToJSON Processor

2017-06-06 Thread Matt Burgess
Venkat, In the meantime, I have a Groovy script for ExecuteScript [1] that will read the header and create an Avro schema (stored in the avro.schema attribute) so you can set the access strategy to Use Schema Text. It works like the Use Header Fields strategy in CSVReader, meaning all fields ar

Re: Expression language in QueryRecord

2017-06-07 Thread Matt Burgess
The documentation says that is true (see the Dynamic Properties section under [1]), although I haven't tried it myself. Regards, Matt [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.2.0/org.apache.nifi.processors.standard.QueryRecord/index.html On Wed,

Re: Expression language in QueryRecord

2017-06-07 Thread Matt Burgess
Giovanni, Expression Language is evaluated before the SQL query. So in your case you end up with a query that looks like: SELECT message, 2015-12-03T11:50:24-0500 AS lmt FROM FLOWFILE And it complains about the colon in the timestamp. Try putting the EL expression in quotes (I think doubl
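The substitution-before-parsing behaviour can be imitated outside NiFi. In this sketch, Python's string.Template stands in for Expression Language, the attribute name lmt is hypothetical, and the single-quote style in the second template is an assumption rather than something the message confirms.

```python
from string import Template


def render(query_template, attributes):
    """Substitute ${name} placeholders before the SQL text is parsed."""
    return Template(query_template).substitute(attributes)


attributes = {"lmt": "2015-12-03T11:50:24-0500"}  # hypothetical attribute

# Unquoted: the timestamp lands in the SQL as bare tokens.
unquoted = render("SELECT message, ${lmt} AS lmt FROM FLOWFILE", attributes)

# Quoted: the timestamp becomes a string literal in the final SQL.
quoted = render("SELECT message, '${lmt}' AS lmt FROM FLOWFILE", attributes)
```

The unquoted result is the query shown above, where the parser trips over the colon; the quoted variant hands the parser a single literal instead.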

Re: How to perform bulk insert into SQLServer from one machine to another?

2017-06-08 Thread Matt Burgess
Prabhu, From [1], the data file "must specify a valid path from the server on which SQL Server is running. If data_file is a remote file, specify the Universal Naming Convention (UNC) name. A UNC name has the form \\Systemname\ShareName\Path\FileName. For example, \\SystemX\DiskZ\Sales\update.txt

Re: How to perform bulk insert into SQLServer from one machine to another?

2017-06-08 Thread Matt Burgess
> Many thanks, > Prabhu > > On 08-Jun-2017 6:27 PM, "Matt Burgess" wrote: > > Prabhu, > > From [1], the data file "must specify a valid path from the server on > which SQL Server is running. If data_file is a remote file, specify > the Universal Nam

Re: How to update line with modified data in Jython?

2017-06-19 Thread Matt Burgess
Prabhu, I'm no Python/Jython master by any means, so I'm sure there's a better way to do this than what I came up with. Along the way I noticed some things about the input data and Jython vs Python: 1) Your "for line in text[1:]:" is skipping the first line, I assume in the "real" data there is a

Re: Nifi 1.3.0 - Problem with schema.name and ConsumeKafkaRecord_0_10 processor

2017-06-22 Thread Matt Burgess
Uwe, Since ConsumeKafkaRecord is a "source" processor, you won't be able to set schema.name as a FlowFile attribute. However you could set it as a property in a Variable Registry file and use that. Are your schemas dynamic based on the topic? If not, you likely don't need to use the schema.name a

Re: Nifi 1.3.0 - Problems with ConsumeKafkaRecord_0_10

2017-06-22 Thread Matt Burgess
Uwe, It looks like this error is directly related to your other question. This line from the stack trace: Caused by: java.lang.NullPointerException: null at org.apache.nifi.processors.kafka.pubsub.ConsumerLease.writeRecordData(ConsumerLease.java:458) Is when it calls record.getSchema(). No

Re: Nifi 1.3.0 - Problems with ConsumeKafkaRecord_0_10

2017-06-22 Thread Matt Burgess
On Thu, Jun 22, 2017 at 3:54 PM, Matt Burgess wrote: > Uwe, > > It looks like this error is directly related to your other question. > This line from the stack trace: > > Caused by: java.lang.NullPointerException: null > at > org.apache.nifi.processors.kafka.pubsub.Cons

Re: Re: Nifi 1.3.0 - Problem with schema.name and ConsumeKafkaRecord_0_10 processor

2017-06-22 Thread Matt Burgess
> Rgds, > > Uwe > > Sent: Thursday, 22 June 2017 at 21:47 > From: "Matt Burgess" > To: users@nifi.apache.org > Subject: Re: Nifi 1.3.0 - Problem with schema.name and > ConsumeKafkaRecord_0_10 processor > Uwe, > > Since ConsumeKafkaRecord is a "

Re: NiFi's JdbcCommon convertToAvroStream problem

2017-06-26 Thread Matt Burgess
Ben, If your driver is returning false for isSigned() but the value is negative, then something seems amiss. What version of the SQLite driver are you using? Up to at least 3.8.11.2, isSigned() always returns false [1], which is not good. It looks like you will have to use at least version 3.14.2

Re: QueryRecord and AvroReader: What is the expect syntax ?

2017-07-03 Thread Matt Burgess
Andre, I don't believe that QueryRecord supports RecordPath expressions as "column names" (but someone should correct me if I'm mistaken :) I think it takes a tabular view of the record, meaning each top-level field is a "column" for the purposes of querying. For JSON, the JsonPathReader lets you

Re: Jython in 1.3 ExecuteScript config?

2017-07-05 Thread Matt Burgess
Jim, The Jython script engine returns its name as "python" for some reason, so that's what's displayed as the option, but it is definitely Jython not Python. If you want to run your script as-is and use stdout for the results, you can use ExecuteStreamCommand for that, otherwise Joe's reference

Re: KeyError from Jython in ExecuteScript

2017-07-11 Thread Matt Burgess
Jim, You can check first if the key is in the results dictionary and then if the value is in the valid values dictionary: ... and 'environment' in result and result['environment'] in valid_environment ... if a missing value is "valid", then you can use "or" instead of "and" and put the clause in
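A minimal sketch of the guarded lookup described above. The contents of valid_environment are made up for illustration, and the second function is only one reading of the truncated "or" variant.

```python
# Hypothetical set of accepted values; the real dictionary is not shown above.
valid_environment = {"dev": True, "test": True, "prod": True}


def has_valid_environment(result):
    # "and" short-circuits, so result['environment'] is only evaluated
    # when the key exists and a KeyError cannot occur.
    return 'environment' in result and result['environment'] in valid_environment


def valid_or_missing(result):
    # Assumed reading of the "or" variant: a missing key counts as valid.
    return 'environment' not in result or result['environment'] in valid_environment
```

Both functions work the same way in Jython and CPython, since they use only dictionary membership tests.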

Re: Issue with GenerateTableFetch bulk load when using MS SQL Server database type

2017-07-24 Thread Matt Burgess
Bryan, This is likely a bug, I will investigate and write it up if so. In the meantime, are you planning on doing the bulk fetch in parallel (with a Remote Process Group into ExecuteSQL across a NiFi cluster)? If not, you may find that QueryDatabaseTable is a good alternative, it can be configu

Re: Issue with GenerateTableFetch bulk load when using MS SQL Server database type

2017-07-27 Thread Matt Burgess
suming it’s a bug? Ballpark. > > > > Thanks, > > Bryan > > > > From: Matt Burgess [mailto:mattyb...@gmail.com] > Sent: Monday 24 July 2017 14:40 > To: users@nifi.apache.org > Subject: Re: Issue with GenerateTableFetch bulk load when using MS SQL > Server database typ

Re: How to merge two rows data into single row in groovy script/nifi processors?

2017-08-16 Thread Matt Burgess
If your data is always set up such that the first line has a row of data and the second has additional data, you can set up a variable outside the eachLine(), then if the line number is even (because it's zero-based) you store the line and if it is odd you append it to the previous line and output
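The pairing logic described above (store the even-indexed line, append the odd-indexed one to it) looks roughly like this. It is written in Python rather than Groovy, merge_line_pairs is a hypothetical name, and the comma separator is an assumption about the data.

```python
def merge_line_pairs(text, sep=","):
    """Join each pair of consecutive lines into a single row."""
    merged = []
    previous = None  # the variable kept outside the per-line loop
    for index, line in enumerate(text.splitlines()):
        if index % 2 == 0:
            previous = line          # first half of a row: remember it
        else:
            merged.append(previous + sep + line)  # second half: emit the row
            previous = None
    if previous is not None:
        merged.append(previous)      # unpaired trailing line is kept as-is
    return merged
```

Keeping the unpaired last line is a design choice for this sketch; dropping it or raising an error may suit the real data better.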

Re: connecting to google bigQuery via simba jdbc

2017-08-17 Thread Matt Burgess
Margarita, Sorry to hear you're having trouble connecting. In this case, I believe it is the Database Driver Class Name that is the issue. According to [1], you'll want to use the following driver class name (rather than the DataSource one you are using now): com.simba.googlebigquery.jdbc42.Drive

Re: Defaulting to string for dirty data w/ Avro

2017-08-17 Thread Matt Burgess
Mike, String is usually the safe bet, InferAvroSchema and ExecuteSQL default to String if they can't figure out what type to use. Regards, Matt > On Aug 17, 2017, at 6:00 PM, Mike Thomsen wrote: > > Is it safe to choose "string" as a default type with Avro? I'm trudging > through some reall

Re: Upsert

2017-08-22 Thread Matt Burgess
Austin, What are you using for a record reader and schema for PutDatabaseRecord? In order to execute SQL using PutDatabaseRecord, you have to specify a "Field containing SQL", and the incoming record(s) must have a field with that name. The value of that field (for each record) will be executed.

Re: Upsert

2017-08-22 Thread Matt Burgess
ype": "string" > }, > { >"name": "PurchaseOrderPrice", >"type": "float" > }, > { >"name": "SupplierId", >"type": "string" > }, > { >"name

Re: Upsert

2017-08-22 Thread Matt Burgess
" > I am just having trouble really understanding what it is that I have to do > so that the query and the data will be understood by the processor. Do I need > different schemas? One like I sent and one for the query? > >> On Tue, Aug 22, 2017 at 11:38 AM, Matt Burgess wrote: >

Re: Upsert

2017-08-22 Thread Matt Burgess
> and were having trouble with it. The current setup actually runs pretty well > I'm just trying to figure out how to do an insert and then if the data already > exists in the table via the rfidnumber do an update instead. > >> On Tue, Aug 22, 2017 at 12:05 PM, Matt Burgess wrote:

Re: Upsert

2017-08-22 Thread Matt Burgess
t solve all cases for ACID DBs. Regards, Matt > On Aug 22, 2017, at 2:06 PM, Austin Duncan wrote: > > That's what we had originally but we felt it was kind of hacky. > >> On Tue, Aug 22, 2017 at 2:05 PM, Matt Burgess wrote: >> If the order of operations doe

Re: Upsert

2017-08-22 Thread Matt Burgess
il yet to check so im just letting > it run but it should work. I just am updating any files that fail and routing > it back to the putdatabaserecord processor. In my head it seems like that > should work. > >> On Tue, Aug 22, 2017 at 3:12 PM, Matt Burgess wrote: >> It's

Re: Getting multiple LookupService results in one call

2017-08-29 Thread Matt Burgess
Right now it's a single update per processor, you can provide multiple keys to do a compound lookup but it returns a single value. ExecuteScript is technically record-aware so you could script such a thing. Regards, Matt > On Aug 29, 2017, at 1:32 PM, Mike Thomsen wrote: > > Is it possible to

Re: NiFi PutHiveStreaming processor with Hive: Failed connecting to EndPoint

2017-09-05 Thread Matt Burgess
I responded to the SO post, the only time I've seen that error is when the NiFi user (and thus the Hive Streaming user) doesn't have permissions to write to the directory where Hive Streaming is writing the ORC files. Regards, Matt On Tue, Sep 5, 2017 at 1:16 PM, Papa Samba DIOP < papasamba.d...@

Re: QueryDatabaseTable - Schema

2017-09-11 Thread Matt Burgess
One thing about its limitations had to do with timing, the record-aware stuff happened after QDT. Would be great to have QDT use a record writer, then depending on the writer you could choose your schema output strategy as Koji outlined. I'm not sure if there is a JIRA for this or not (or any o

Re: Re: Re: QueryDatabaseTable - Deleted Records

2017-09-18 Thread Matt Burgess
Uwe, Is there anything in the V$ARCHIVED_LOG table [1] in your source database? If so you may be able to get some of that information from there. Also is LogMiner [2] enabled on the database? That's another way to be able to query the logs to get things like deletes. In general, there has to be s

Re: InvokeScriptedProcessor to set common flowfile attributes?

2017-09-20 Thread Matt Burgess
Jim, You should be able to do everything you want with InvokeScriptedProcessor and a handy utility class in NiFi called SynchronousFileWatcher [1]. It is used by various NiFi components such as the ScanAttribute [2] processor. You should be able to import it via: from org.apache.nifi.util.file.m

Re: ExecuteSQL question: how do I stop long running queries

2017-09-29 Thread Matt Burgess
Vikram, I'm not at my computer right now so I'm shooting from the hip, but depending on how complex your query is (meaning if it is very simple), take a look at QueryDatabaseTable and GenerateTableFetch, if you are looking to get all rows (versus incremental fetching), you can omit the maximum

Re: ExecuteSQL question: how do I stop long running queries

2017-09-29 Thread Matt Burgess
> >> >> >> Which version of NiFi would work best for RDBMS processors, I’ll check with >> platform folks if we can go version upgrade. >> >> >> >> Thanks, Appreciate your help >> >> >> >> From: Matt Burgess [mailto:m

Re: StoreInKiteDataset, wasb, and class path

2017-09-30 Thread Matt Burgess
The .name() method is commonly used to provide a more machine-friendly name, to help in future Internationalization (i18n) efforts. The .displayName() method is commonly used to provide a user-friendly name. Currently the display name is used in the UI when present, and if not present, the name is us

Re: Re: Nifi 1.4: problem with QueryRecord Precessor

2017-10-04 Thread Matt Burgess
All, The known issue Mark is referring to is NIFI-4349 [1], however it is not causing the problem; rather it is hiding the problem. If an error occurs (due to misconfiguration, schema errors, etc.), certain cleanup activities aren't being performed so the processor ends up with the error about proc

Re: A bag of groovy questions regarding the ExecuteScript processor

2017-10-04 Thread Matt Burgess
Giovanni, I second all of Andy's answers, they are spot-on. For the each() construct, they are "safe" in the sense that you will be working with one flow file at a time, but remember that there is only one "session". If you throw an Exception from inside the each(), then it will be caught by Execu

Re: A bag of groovy questions regarding the ExecuteScript processor

2017-10-05 Thread Matt Burgess
; } > > That way the fat jar will be much smaller but still executable by NiFi. > Without that a 15kb jar ends up being an 8mb fat jar. > > on-the-fly-reload) I'd rather hack the API than doing that :) Are there any > pointers/examples for this InvokeScriptedProcessor? It

Re: FTPS

2017-10-05 Thread Matt Burgess
Austin, There is an open Jira (NIFI-2278 [1]) to add this support to the existing processor(s), I believe if Apache Commons Net is used for the clients then we would just need to create an FTPSClient [2] instead of an FTPClient. In order to support prototyping such things, Apache Commons Net was a

Version 1.2.0 of nifi-script-tester released

2017-10-05 Thread Matt Burgess
All, I've just released version 1.2.0 of the nifi-script-tester [1], a utility that lets you test your Groovy, Jython, and Javascript scripts for use in the NiFi ExecuteScript processor. Here are the new features: - Upgraded code to NiFi 1.4.0 - Added support for incoming flow file attributes F

Re: convert avro schema to another schema

2017-10-11 Thread Matt Burgess
If you know the input and output schemas, you should be able to use UpdateRecord for this. You would have a user-defined property for each output field (including flattened field names), whose value would be the RecordPath to the field in the map. I believe for any fields that are in the input sche

Re: FTPS

2017-10-11 Thread Matt Burgess
ement this in nifi > > On Thu, Oct 5, 2017 at 3:08 PM, Matt Burgess wrote: >> >> Austin, >> >> There is an open Jira (NIFI-2278 [1]) to add this support to the >> existing processor(s), I believe if Apache Commons Net is used for the >> clients then we

Re: FTPS

2017-10-11 Thread Matt Burgess
7 at 1:08 PM, Matt Burgess wrote: >> >> Austin, >> >> Sorry I lost track of this thread. If you have a command-line FTPS >> client, then you can configure ExecuteProcess or ExecuteStreamCommand >> to run the same command you would from a shell. For the scripted

Re: CSV to XML in NiFi using ScriptedRecordSetWriter

2017-10-12 Thread Matt Burgess
Kiran, There is an example of a Groovy script for an XML writer [1] in the unit tests for ScriptedRecordSetWriter, this should be a pretty good place to get started, but please let us know if you have any questions or issues in making it work. Regards, Matt [1] https://github.com/apache/nifi/bl

Re: CSV to XML in NiFi using ScriptedRecordSetWriter

2017-10-13 Thread Matt Burgess
wants to ingest CSV. On the old version of nifi I would have used > TransformXml and XSLT to achieve this, should I still go down that route or > can you point me in the direction of a xml Scripted reader example? > > Kiran > > > > ---- Original Message >

Re: Avro timestamp problem

2017-10-19 Thread Matt Burgess
Uwe, I think you are running into either AVRO-2065 [1] or its related issue AVRO-1891 [2]. Hopefully they will fix it for 1.8.3 and we can upgrade after it is released. In the meantime, try a schema with just a union between null and long, then use QueryRecord to filter out all the records whose

Re: Python example of setState, getState?

2017-10-20 Thread Matt Burgess
I have an example (albeit a trivial one) of this in my ExecuteScript Cookbook post [1]. As far as a separate workaround, I can't tell from the description what you need to do differently than ListFile. It starts with no state, lists all the files, saves the time of the newest file in state, then

Re: PutElasticsearchHttp array

2017-10-20 Thread Matt Burgess
Pat, Are you trying to put the whole array in as a single document, or are you trying to put each element of the array in as a separate document? If the former, you could use ReplaceText to put the array into a JSON object. If the latter, you can use SplitJSON to split the array into individual e

Re: PutElasticsearchHttp array

2017-10-20 Thread Matt Burgess
Pat, If you match the entire text, you should be able to do something like the following as the replacement: { "array": $1 } I didn't try this, but I think it should put the array into a JSON object. Although an array may be a valid JSON "object", I don't think Elasticsearch accepts them as such
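The whole-text replacement can be tried outside NiFi with a regular expression. This sketch uses Python's re module as a stand-in for ReplaceText; wrap_array is a hypothetical helper, and the regex is an assumed equivalent of "match the entire text".

```python
import json
import re


def wrap_array(content):
    """Wrap the entire content in { "array": ... }, like the $1 replacement above."""
    return re.sub(
        r"(?s)\A(.*)\Z",  # capture the whole text, newlines included
        lambda m: '{ "array": ' + m.group(1) + " }",
        content,
        count=1,
    )


wrapped = wrap_array('[{"id": 1}, {"id": 2}]')
document = json.loads(wrapped)  # parses as a JSON object with one "array" key
```

A function is used as the replacement so that backslashes in the content are not interpreted as regex escapes.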

Re: Back pressure deadlock

2017-10-23 Thread Matt Burgess
Perhaps a quick(ish) win would be to implement a DeadlockDetectionReportingTask, where you could specify processor IDs (or names but that can get dicey) and it would monitor those processors for incoming connections that all have backpressure applied, and that the processor has not run for X amount

Re: Select Query With Aliases

2017-10-23 Thread Matt Burgess
Austin, What version of NiFi are you using? There was an issue with aliases (at least for MySQL) before NiFi 1.1.0, fixed by NIFI-3064 [1]. Also what database and driver version are you using? Since NiFi 1.1.0, we are following the JDBC 4 spec which says the drivers (when getColumnLabel is calle

Re: Select Query With Aliases

2017-10-23 Thread Matt Burgess
it says that spaces are an illegal character) > or is it possible for me to erase the first row and replace it with the > column names that I want. I am converting the avro schema into a csv. > > On Mon, Oct 23, 2017 at 3:03 PM, Matt Burgess wrote: >> >> Austin, >> >

Re: UpdateAttribute is missing JAXB dependency

2017-10-28 Thread Matt Burgess
Leandro, NiFi does not yet work with Java 9 as far as I know, your version was compiled against (and is intended to run against) Java 8. Regards, Matt > On Oct 28, 2017, at 3:36 PM, Leandro Lourenço > wrote: > > Hi, > > I'm having a strange issue with UpdateAttribute processor. > '' is inv

Re: Creating a record schema that has a date or timestamp field

2017-11-01 Thread Matt Burgess
Mike, Which Record Readers/Writers are you using? Do they have the option for "Date Format", and if so, are they filled in? Date Format defaults to empty, and the doc says it "[s]pecifies the format to use when reading/writing Date fields. If not specified, Date fields will be assumed to be number

Re: Reading flowfile in a stream callback

2017-11-03 Thread Matt Burgess
ode('utf-8'))) > > that omits the encoding, like so: > > outputStream.write(bytearray(some_binary)) ? > > Thank you very much in advance. -Jim > > On Thu, Nov 2, 2017 at 8:26 PM, Andy LoPresto wrote: >> >> James, >> >> The Python API should

Re: Enrichment flow using ScriptedLookup

2017-11-05 Thread Matt Burgess
Eric, Because LookupService implements ControllerService, you must implement initialize(ControllerServiceInitializationContext context), which Andy's script provides an empty body for. However that context object has a method called getLogger() on it, so you can override the initialize() method an

Re: ValidateRecord Processor

2017-11-05 Thread Matt Burgess
Seems like ValidateRecord might make a good two-birds-with-one-stone replacement for ConvertRecord :) -Matt > On Nov 5, 2017, at 3:46 PM, Mark Payne wrote: > > Hey Paul, > > That is accurate - the Record Writer chosen will not affect the validation > process. > The way that the processor wo

Re: Replace Text

2017-11-08 Thread Matt Burgess
Austin, If your data is not coming from something like ExecuteSQL (which Bryan mentioned) but you are defining a schema for it, there are a couple of options. First, what format is your data in? If CSV, you can configure a CSVReader to use your schema and ignore the header, effectively renaming th

Re: Output from PostHTTP

2017-11-08 Thread Matt Burgess
Jim, The content of the flow file is the body of the outgoing POST, so you could query provenance for the PostHttp processor, find the associated flow file(s), and (if the content is still available in the content repository) retrieve the content. Also the resolved URL for the POST (after evaluati

Re: Output from PostHTTP

2017-11-08 Thread Matt Burgess
he flowfile content. How do I set > that attribute to be my flowfile content? > > The challenge I seem to be having is that the service is not a nifi flow. > How do i feed to it the content body? > > On Wed, Nov 8, 2017 at 9:41 AM, Matt Burgess wrote: >> >> Jim,

Re: Nifi : 504 Gateway Time-Out Error

2017-11-09 Thread Matt Burgess
Aruna, The reason you can no longer log in is due to the Out Of Memory Error occurring on the JVM running NiFi. I believe you will need to restart NiFi in order to reconnect. For the original issue, PutDatabaseRecord uses the JDBC driver to set up a prepared statement along with rows of values. E

Re: Nifi : 504 Gateway Time-Out Error

2017-11-09 Thread Matt Burgess
you have given, which one is recommended? > > > > *From:* Matt Burgess [mailto:mattyb...@apache.org] > *Sent:* Thursday, November 09, 2017 12:14 PM > *To:* users@nifi.apache.org > *Subject:* Re: Nifi : 504 Gateway Time-Out Error > > > > Aruna, > > >

Re: csv to sql

2017-11-09 Thread Matt Burgess
Austin, Yes, that's exactly what PutDatabaseRecord is for. It is kind of like ConvertJSONToSQL + PutSQL, but it uses the record reader of your choice, so it doesn't have to be JSON. You can set up a CSVReader and a DBCPConnectionPool pointing at your PostgreSQL DB, set the verb (e.g. INSERT) and
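
As a rough illustration of the idea in this reply (read records from CSV, then push them through one prepared INSERT statement), here is a minimal Python sketch. It is an analogy only, not how PutDatabaseRecord is implemented: the function name `insert_csv_rows`, the `users` table, and the sample data are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; the header row supplies the column names.
CSV_TEXT = "id,name\n1,alice\n2,bob\n"


def insert_csv_rows(conn, table, csv_text):
    """Insert every CSV record into `table` using one parameterized statement."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames
    placeholders = ", ".join("?" for _ in columns)
    # Table and column names are interpolated here only because this is a
    # trusted, hard-coded sketch; the values themselves go through parameters.
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    rows = [tuple(record[c] for c in columns) for record in reader]
    conn.executemany(sql, rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the real database
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    print(insert_csv_rows(conn, "users", CSV_TEXT))
```

The point of the sketch is that the statement is built once from the record schema and the rows are bound as parameters, so the input format (CSV here) is independent of the SQL that gets executed.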

Re: Wait only if flagged?

2017-11-13 Thread Matt Burgess
Peter, I haven't tried this, but my knee-jerk reaction is to switch the roles of the "wait" and "success" relationships. Maybe you can send the "wait" relationship downstream and route the "success" one back to Wait. Then when the flag is "cleared", the flow files will start going to the "success"

Re: How to get DBCP service inside ScriptedLookupService

2017-11-14 Thread Matt Burgess
Eric, So I just learned a lot about the bowels of the context and initialization framework while digging into this issue, and needless to say we will need a better way of making this available to scripts. Here's some info: 1) The ControllerServiceInitializationContext object passed into initialize

Re: How to get DBCP service inside ScriptedLookupService

2017-11-14 Thread Matt Burgess
are multiple with > the same name. In fact, over 10 different iterations you could get 10 > different services instead of always > getting the same service. > > So I guess the question is: Is there a reason that the typical approach of > identifying the service in a > Property

Re: NIFI 1.4.0 - PutMongo - How to use composite key for "Update Query key" parameter ?

2017-11-15 Thread Matt Burgess
Thomas, You can file an Improvement or New Feature Jira [1] asking for the enhancement. Ironically Avro 1.4.0's Schema.parse() method does allow the dollar sign, but we use Avro 1.8.x now which is apparently more strict. I am toying around with a PegasusSchemaRegistry using the PegasusSchemaParse

Re: NIFI 1.4.0 - PutMongo - How to use composite key for "Update Query key" parameter ?

2017-11-16 Thread Matt Burgess
ght want to be careful about that because Avro 1.8 added the support for > logical types and removing that could break parts of the Record API like the > date/timestamp functionality. > > On Wed, Nov 15, 2017 at 3:41 PM, Matt Burgess wrote: >> >> Thomas, >> >>

Re: NIFI 1.4.0 - PutMongo - How to use composite key for "Update Query key" parameter ?

2017-11-16 Thread Matt Burgess
I wrote up an Improvement Jira to add the property to Validate Field Names to AvroSchemaRegistry: https://issues.apache.org/jira/browse/NIFI-4612 -Matt On Thu, Nov 16, 2017 at 9:15 AM, Matt Burgess wrote: > Mike, > > That's a very good point, I guess that would have to be emulate

Re: Is it possible to import class from NAR bundle in scripted processor?

2017-11-20 Thread Matt Burgess
The other NARs are not immediately available to the scripting NAR, and in general you have to put your processor in the same NAR as the base class, or put the base class and interfaces into an API JAR and share that somehow. IMO there's a little too much voodoo to try and make it work with

Re: Someone recommend a good Avro studdy guide for newbie?

2017-11-22 Thread Matt Burgess
Eric, If you're looking for examples on implementing a scripted record reader/writer, you can see the unit test examples [1] or Drew Lim's blog [2]. However I suspect you are looking to leverage an existing RecordReader/Writer from a scripted processor such as ExecuteScript or InvokeScriptedProce

Re: Hyphenated Tables and Columns names

2017-12-06 Thread Matt Burgess
Alberto, What version of NiFi are you using? As of version 1.1.0, QueryDatabaseTable has a "Normalize Table/Column Names" property that you can set to true, and it will replace all Avro-illegal characters with underscores. Regards, Matt On Wed, Dec 6, 2017 at 12:06 PM, Alberto Bengoa wrote: >
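
To make the "replace Avro-illegal characters with underscores" behavior concrete, here is a minimal Python sketch. It assumes only the Avro naming rule (a name starts with a letter or underscore and continues with letters, digits, or underscores); the function name `normalize_name_for_avro` and the sample names are hypothetical, and the leading-digit handling is an assumption drawn from that rule rather than from NiFi's source.

```python
import re


def normalize_name_for_avro(name: str) -> str:
    """Return `name` with Avro-illegal characters replaced by underscores."""
    # Anything outside [A-Za-z0-9_] (hyphens, spaces, dollar signs, ...) is replaced.
    normalized = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Assumption: an Avro name may not start with a digit, so prefix an underscore.
    if normalized and normalized[0].isdigit():
        normalized = "_" + normalized
    return normalized


if __name__ == "__main__":
    for raw in ["order-items", "unit-price", "2017-sales"]:
        print(raw, "->", normalize_name_for_avro(raw))
```

Note that downstream processors will see the normalized names, so any later SQL or schema that refers to the hyphenated originals would need to account for the renaming.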

Re: Nifi: how to transfer only last file from the flowFile list?

2017-12-07 Thread Matt Burgess
Sally, I don't think you want a FlowFileFilter here, as your smaller flow files will remain in the queue while the large enough ones get processed. Here's a script that I think does what you want it to, but please let me know if I've misunderstood your intent: def ffList = session.get(1000) def l
