Re: Kafka proxy with NiFi
Hi, I assume by "endpoint" you mean exposing the broker side of the Kafka API? As far as I know that is not possible. However, you could use the site-to-site API or another supported reliable protocol (HTTP, Beats, RELP, etc.) to feed data to NiFi and from there feed Kafka. How well this fits into your environment will largely depend on data volumes and your Kafka acknowledgement strategy.

On 9 Jun 2017 07:15, "Laurens Vets" wrote:
> Hi List,
>
> Is it possible to build a Kafka proxy with NiFi, such that NiFi will expose a
> Kafka endpoint and proxy all Kafka messages to another Kafka endpoint?
Re: NiFi 1.2.0 REST API problem
I have not been able to find anything useful in the logs, but I'm not sure I enabled all of the logging that I need to. I did find some more information, though: if I restart the NiFi service, the InvokeHTTP call starts working again, and we have two other flow patterns that exhibit the same issue. In each case it is always the second InvokeHTTP processor that fails.

This is how I have the logging set currently:

All of the other logging is set to its default value.

On Thu, Jun 8, 2017 at 12:20 PM, Matt Gilman wrote:
> Raymond,
>
> If you enable debug level logging, I believe that InvokeHTTP will log the
> request and response. It may be helpful in diagnosing this issue.
>
> [...]
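For reference, the logback.xml change Matt describes would be a logger entry like the one below (the logger name is taken from his message; exact placement inside conf/logback.xml may vary by NiFi version):

```xml
<!-- Per Matt's suggestion: enable DEBUG request/response logging for InvokeHTTP -->
<logger name="org.apache.nifi.processors.standard.InvokeHTTP" level="DEBUG"/>
```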
Re: Bulk inserting into HBase with NiFi
Mike,

Just out of curiosity, what would the original data for your example look like that produced that JSON? Is it a CSV with two lines, like:

ABC, XYZ
DEF, LMN

and then ExecuteScript is turning that into the JSON array?

As far as reading the JSON, I created a simple flow of GenerateFlowFile -> ConvertRecord -> LogAttribute, where ConvertRecord uses the JsonPathReader with $.value:

https://gist.github.com/bbende/3789a6907a9af09aa7c32413040e7e2b

LogAttribute ends up logging:

[ { "value" : "XYZ" }, { "value" : "LMN" } ]

which seems correct, given that it's reading in the JSON with a schema that only has the field "value" in it. Let me know if that is not what you are looking for.

On Thu, Jun 8, 2017 at 4:13 PM, Mike Thomsen wrote:
> [...]
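If the source data really is two-column CSV like the lines Bryan asks about, the ExecuteScript transformation discussed in this thread could be sketched like this (Python for illustration; Mike's actual script and column layout are not shown in the thread):

```python
import csv
import io
import json

def csv_to_json_array(csv_text):
    """Turn two-column CSV lines like 'ABC, XYZ' into the JSON array
    structure discussed in the thread: [{"key": ..., "value": ...}, ...]."""
    reader = csv.reader(io.StringIO(csv_text))
    records = [{"key": row[0].strip(), "value": row[1].strip()}
               for row in reader if row]
    return json.dumps(records)

result = csv_to_json_array("ABC, XYZ\nDEF, LMN")
```

This produces one larger flow file per input instead of one flow file per line, which is the batching the record-based approach is after.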
Re: Bulk inserting into HBase with NiFi
Bryan,

I have the processor somewhat operational now, but I'm running into a problem with the record readers. What I've done is basically this:

Ex. JSON:

[
  { "key": "ABC", "value": "XYZ" },
  { "key": "DEF", "value": "LMN" }
]

Avro schema:

{
  "type": "record",
  "name": "GenomeRecord",
  "fields": [
    { "name": "value", "type": "string" }
  ]
}

1. ExecuteScript iterates over a line and builds a JSON array as mentioned above.
2. PutHBaseRecord is wired to use a JsonPathReader that uses an AvroSchemaRegistry.
   - I put a lot of logging in and can verify it is identifying the schema based on the attribute on the flowfile and looking at the appropriate field while looping over the Record to turn it into a serializable form for a Put.
   - All I get are nulls.
3. My JsonPath has been variously $.value and $[*].value. It just does not seem to want to parse that JSON.

The strategy I was going for is to use the "key" attribute in each JSON object to set the row key for the Put.

Any ideas would be great.

Thanks,

Mike

On Wed, Jun 7, 2017 at 4:40 PM, Bryan Bende wrote:
> Mike,
>
> Glad to hear that the record API looks promising for what you are trying
> to do!
>
> Here are a couple of thoughts, and please correct me if I am not
> understanding your flow correctly...
>
> We should be able to make a generic PutHBaseRecord processor that uses
> any record reader to read the incoming flow file and then converts
> each record directly into a PutFlowFile (more on this in a minute).
>
> Once we have PutHBaseRecord, then there may be no need for you to
> convert your data from CSV to JSON (unless there is another reason I
> am missing) because you can send your CSV data directly into
> PutHBaseRecord configured with a CSVRecordReader.
>
> If you are doing other processing/enrichment while going from CSV to
> JSON, then you may be able to achieve some of the same things with
> processors like UpdateRecord, PartitionRecord, and LookupRecord.
> Essentially keeping the initial CSV intact and treating it like
> records through the entire flow.
>
> Now back to PutHBaseRecord and the question of how to go from a Record
> to a PutFlowFile...
>
> We basically need to know the rowId, column family, and then a list of
> column-qualifier/value pairs. I haven't fully thought this through yet,
> but...
>
> For the row id, we could have a similar strategy as PutHBaseJson,
> where the value comes from a "Row Id" property in the processor or
> from a "Row Id Record Path" which would evaluate the record path
> against the record and use that value for the row id.
>
> For column family, we could probably do the same as above, where it
> could be from a property or a record path.
>
> For the list of column-qualifier/value pairs, we can loop over all
> fields in the record (skipping the row id and family if using record
> fields) and then convert each one into a PutColumn. The bulk of the
> work here is going to be taking the value of a field and turning it
> into an appropriate byte[], so you'll likely want to use the type of
> the field to cast into an appropriate Java type and then figure out
> how to represent that as bytes.
>
> I know this was a lot of information, but I hope this helps, and let
> me know if anything is not making sense.
>
> Thanks,
>
> Bryan
>
> On Wed, Jun 7, 2017 at 3:56 PM, Mike Thomsen wrote:
>> Yeah, it's really getting hammered by the small files. I took a look at
>> the new record APIs and that looked really promising. In fact, I'm taking
>> a shot at creating a variant of PutHBaseJSON that uses the record API.
>> Looks fairly straightforward so far. My strategy is roughly like this:
>>
>> GetFile -> SplitText -> ExecuteScript -> RouteOnAttribute -> PutHBaseJSONRecord
>>
>> ExecuteScript generates a larger flowfile that contains a structure like
>> this now:
>>
>> [
>>   { "key": "XYZ", "value": "ABC" }
>> ]
>>
>> My intention is to have a JsonPathReader take that bigger flowfile, which
>> is a JSON array, and iterate over it as a bunch of records to turn into
>> Puts with the new HBase processor. I'm borrowing some code for wiring in
>> the reader from the QueryRecord processor.
>>
>> So my only question now is, what is the best way to serialize the Record
>> objects to JSON? The PutHBaseJson processor already has a Jackson setup
>> internally. Any suggestions on doing this in a way that doesn't tie me at
>> the hip to a particular reader implementation?
>>
>> Thanks,
>>
>> Mike
>>
>> On Wed, Jun 7, 2017 at 6:12 PM, Bryan Bende wrote:
>>> Mike,
>>>
>>> Just following up on this...
>>>
>>> I created this JIRA to track the idea of record-based HBase processors:
>>> https://issues.apache.org/jira/browse/NIFI-4034
>>>
>>> Also wanted to mention that with the existing processors, the main way
>>> to scale up would be to increase the concurrent tasks on PutHBaseJson
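As a rough illustration of the field-to-bytes step Bryan describes above (sketched in Python rather than the Java a real PutHBaseRecord would use; the exact encodings are assumptions, not NiFi's implementation):

```python
import struct

def field_to_bytes(value):
    """Serialize a record field based on its type, along the lines of
    'use the type of the field to cast into an appropriate Java type
    and then figure out how to represent that as bytes'."""
    if isinstance(value, bool):          # check bool first: bool subclasses int
        return b"\x01" if value else b"\x00"
    if isinstance(value, int):
        return struct.pack(">q", value)  # 8-byte big-endian long, HBase-style
    if isinstance(value, float):
        return struct.pack(">d", value)  # 8-byte big-endian double
    if isinstance(value, str):
        return value.encode("utf-8")
    raise TypeError(f"unsupported field type: {type(value).__name__}")

def record_to_columns(record, row_id_field="key"):
    """Build (qualifier, value) byte pairs for a Put, skipping the
    row id field as suggested in the thread."""
    return [(name.encode("utf-8"), field_to_bytes(val))
            for name, val in record.items() if name != row_id_field]
```

For the sample record {"key": "ABC", "value": "XYZ"} this yields a single column pair (b"value", b"XYZ"), with "ABC" reserved for the row key.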
Re: Saving controller services w/ templates?
Yeah, I just screwed up and didn't reference one.

On Thu, Jun 8, 2017 at 1:26 PM, Mike Thomsen wrote:
> I'll have to look again, but I scanned through the XML and didn't see
> either my avro schema registry or the jsonpath reader.
>
> [...]
Keytab Configuration for Nifi processor
Hi,

I have a NiFi 3-node cluster (installed via Hortonworks Data Flow, HDF) in a Kerberized environment. As part of the installation, Ambari created a nifi service keytab. Can I use this nifi.service.keytab for configuring processors like PutHDFS that talk to Hadoop services?

The nifi.service.keytab is machine specific and always expects principal names with machine information, e.g. nifi/HOSTNAME@REALM. If I configure my processor with nifi/NODE1_HOSTNAME@REALM, then I see a Kerberos authentication exception on the other two nodes.

How do I dynamically resolve the hostname so each node uses its own nifi service keytab?

Thanks,
Shashi
Re: Saving controller services w/ templates?
I'll have to look again, but I scanned through the XML and didn't see either my avro schema registry or the jsonpath reader.

Thanks,

Mike

On Thu, Jun 8, 2017 at 1:10 PM, Matt Gilman wrote:
> Mike,
>
> Currently, the services are saved if they are referenced by processors in
> your data flow. There is an existing JIRA [1] to always include them.
>
> [...]
Re: NiFi 1.2.0 REST API problem
Raymond,

If you enable debug level logging, I believe that InvokeHTTP will log the request and response. It may be helpful in diagnosing this issue. I think you could just set the bulletin level to DEBUG to see these messages as bulletins. Additionally, you can update your conf/logback.xml to enable DEBUG messages for org.apache.nifi.processors.standard.InvokeHTTP to see these messages in your logs/nifi-app.log.

Thanks

Matt

On Thu, Jun 8, 2017 at 1:01 PM, Raymond Rogers wrote:
> [...]
Re: Saving controller services w/ templates?
Mike,

Currently, the services are saved if they are referenced by processors in your data flow. There is an existing JIRA [1] to always include them.

Thanks

Matt

[1] https://issues.apache.org/jira/browse/NIFI-2895

On Thu, Jun 8, 2017 at 12:59 PM, Mike Thomsen wrote:
> Is it possible to save the controller services w/ a template?
Re: Saving controller services w/ templates?
Mike,

I believe templates include controller services by default, as long as one or more of the processors in the template references the controller service. Did that not happen for you?

Thanks,
James

On Thu, Jun 8, 2017 at 9:59 AM, Mike Thomsen wrote:
> Is it possible to save the controller services w/ a template?
Re: NiFi 1.2.0 REST API problem
No bulletins on any of the processors. All of the output flowfiles have 0 bytes and the error 401 in the attributes. All of the properties look correct, and I can copy the values from the non-working processor to the manually created processor and it works fine. When you export the SSL context service and re-import it, you have to reset the password on the trust store, and that is the only change I am making.

I will need to dig into the NiFi logs to check for any errors there.

On Thu, Jun 8, 2017 at 11:24 AM, Matt Gilman wrote:
> [...]
Saving controller services w/ templates?
Is it possible to save the controller services w/ a template? Thanks, Mike
Re: How to perform bulk insert into SQLServer from one machine to another?
You won't need/want NiFi for that part; instead you would need to log in to the machine running SQL Server, install an FTP daemon (such as ftpd), and then point the PutFTP processor in NiFi at the FTP server using the Hostname, Port, Username, Password, etc.

On Thu, Jun 8, 2017 at 12:18 PM, prabhu Mahendran wrote:
> [...]
Re: Detect whether a flowfile has a particular attribute
Jim,

This might be related: coincidentally, today we were talking with a coworker about the "advanced" button of UpdateAttribute and its ability to set attributes based on conditions. It's pretty powerful. [1] It might come in useful for your efforts.

[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-update-attribute-nar/1.2.0/org.apache.nifi.processors.attributes.UpdateAttribute/index.html

On Thu, Jun 8, 2017 at 9:54 AM James McMahon wrote:
> I do understand now. Thank you very much Mark. -Jim
>
> On Thu, Jun 8, 2017 at 9:34 AM, Mark Payne wrote:
>> Jim,
>>
>> The first expression will return false. None of the expressions below
>> will ever throw an Exception.
>>
>> You could even chain them together like
>> ${myAttribute:toLower():length():gt(4)}, and if myAttribute does not
>> exist, it will return false rather than throwing an Exception.
>>
>> Thanks
>> -Mark
>>
>> On Jun 8, 2017, at 9:32 AM, James McMahon wrote:
>>
>> So then if myAttribute does not even exist in a particular flowfile, the
>> first expression will return a null value rather than throw an error.
>> Thank you very much Mark. -Jim
>>
>> On Thu, Jun 8, 2017 at 8:44 AM, Mark Payne wrote:
>>
>>> Jim,
>>>
>>> You can use the expression:
>>>
>>> ${myAttribute:isNull()}
>>>
>>> Or, alternatively, depending on how you want to set up the route:
>>>
>>> ${myAttribute:notNull()}
>>>
>>> If you want to check whether the attribute contains 'True' somewhere
>>> within its value, then you can use:
>>>
>>> ${myAttribute:contains('True')}
>>>
>>> Thanks
>>> -Mark
>>>
>>>> On Jun 8, 2017, at 8:19 AM, James McMahon wrote:
>>>>
>>>> Good morning. I receive HTTP POSTs of various types of files. Some
>>>> have a particular attribute myAttribute, some do not. I want to route
>>>> the flowfiles to different workflow paths depending on the presence of
>>>> this attribute. Can I use RouteOnAttribute and the expression language
>>>> to do that, something like this:
>>>>
>>>> hasTheAttributeOfInterest ${anyAttribute("myAttribute":contains('True')}
>>>>
>>>> I ask because the expression guide did not say whether False is
>>>> returned or the processor throws an error if the attribute does not
>>>> exist in the flowfile. I may have missed that. I wanted to see if
>>>> anyone in the group has experience in this regard?
>>>>
>>>> Thanks in advance for your insights. -Jim
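Putting Mark's expressions together, the RouteOnAttribute routing properties for the two paths could look like the following (the relationship names are illustrative):

```
hasMyAttribute        ${myAttribute:notNull()}
myAttributeIsTrue     ${myAttribute:contains('True')}
```

Per Mark's explanation, contains('True') simply returns false when myAttribute is absent, so no separate error path is needed.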
Re: NiFi 1.2.0 REST API problem
Raymond,

When it's in a state that is not working, are there any bulletins on the second processor? When it's in that state and you view the configuration details for that processor, do the properties look correct and the same as when you manually re-add the processor through the UI? Specifically, I'm wondering about the SSL Context Service, since you mentioned that fixing it after an export/import process resolves the issue.

Any other issues in the logs/nifi-app.log or the logs/nifi-user.log?

Thanks

Matt

On Thu, Jun 8, 2017 at 11:59 AM, Raymond Rogers wrote:
> [...]
Re: How to perform bulk insert into SQLServer from one machine to another?
Matt,

Thanks for your wonderful response.

I think creating an FTP server is the best way for me to move the input file to the SQL machine and run the query. Can you please suggest a way to create an FTP server on the machine where SQL Server is installed, using NiFi?

Many thanks,
Prabhu

On 08-Jun-2017 6:27 PM, "Matt Burgess" wrote:

Prabhu,

From [1], the data file "must specify a valid path from the server on which SQL Server is running. If data_file is a remote file, specify the Universal Naming Convention (UNC) name. A UNC name has the form \\Systemname\ShareName\Path\FileName. For example, \\SystemX\DiskZ\Sales\update.txt". Can you expose the CSV file via a network drive/location? If not, can you place the file on the SQL Server using NiFi? For example, if there were an FTP server running on the SQL Server instance, you could use the PutFTP processor, then PutSQL after that to issue your BULK INSERT statement.

Regards,
Matt

[1] https://docs.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql

On Thu, Jun 8, 2017 at 8:11 AM, prabhu Mahendran wrote:
> I have a NiFi instance running on one machine and SQL Server on another
> machine.
>
> Here I am trying to perform a bulk insert operation with a BULK INSERT
> query in SQL Server, but I cannot insert data from one machine and move it
> into SQL Server on another machine.
>
> If I run NiFi and SQL Server on the same machine, then I can perform the
> bulk insert operation easily.
>
> I have configured GetFile -> ReplaceText (BulkInsertQuery) -> PutSQL
> processors.
>
> I have tried NiFi and SQL Server on a single machine and the bulk insert
> works, but it does not work when the two instances are on different
> machines.
>
> I need to get all the data from one machine and write a query to move that
> data into SQL Server running on another machine.
>
> The query below works when NiFi and SQL Server are on the same machine:
>
> BULK INSERT BI FROM 'C:\Directory\input.csv' WITH (FIRSTROW = 1,
> ROWTERMINATOR = '\n', FIELDTERMINATOR = ',', ROWS_PER_BATCH = 1)
>
> If I run that query from another machine, it says "FileNotFoundError",
> because "input.csv" is on the Host1 machine but the query runs on the SQL
> Server machine (Host2).
>
> Can anyone give me a suggestion on how to do this?
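Following the UNC-path requirement Matt quotes from the SQL Server docs, the ReplaceText step could generate a statement that references a share reachable from the SQL Server machine. A minimal sketch (the share name is a placeholder; the WITH options mirror Prabhu's original statement):

```python
def bulk_insert_statement(table, unc_path):
    """Build a BULK INSERT statement around a UNC path so the file
    reference resolves on the SQL Server host, not the NiFi host."""
    return (
        f"BULK INSERT {table} FROM '{unc_path}' "
        "WITH (FIRSTROW = 1, ROWTERMINATOR = '\\n', "
        "FIELDTERMINATOR = ',', ROWS_PER_BATCH = 1)"
    )

# \\NIFI-HOST\shared is a hypothetical share exposing the NiFi machine's files
stmt = bulk_insert_statement("BI", r"\\NIFI-HOST\shared\input.csv")
```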
Re: integration with Zeppelin
I do not believe NiFi has any specific features for Zeppelin yet, but it is possible to write custom Zeppelin code paragraphs that communicate with the NiFi API to pull data or inspect flow status. For an example, I recommend Pierre Villard's "US Presidential Election: tweet analysis using HDF/NiFi, Spark, Hive and Zeppelin" (https://community.hortonworks.com/articles/30213/us-presidential-election-tweet-analysis-using-hdfn.html).

The most typical usage is to have NiFi write to a durable store like HDFS, HBase, SQL, etc., and then have Zeppelin read from that data store. Would that not work for you?

Thanks,
James

On Thu, Jun 8, 2017 at 7:09 AM, Wojciech Indyk wrote:
> [...]
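As a small sketch of the "Zeppelin paragraph talking to the NiFi API" approach James mentions, a paragraph could GET the flow status endpoint and render the result. The host and port are placeholders, the endpoint path is the NiFi 1.x REST API as I recall it (check your version's API docs), and the HTTP request itself is omitted here:

```python
from urllib.parse import urljoin

def flow_status_url(base_url):
    """Build the NiFi REST endpoint for overall flow status; a Zeppelin
    paragraph could fetch this with any HTTP client and plot the counts."""
    return urljoin(base_url, "/nifi-api/flow/status")

url = flow_status_url("http://nifi-host:8080/")
```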
NiFi 1.2.0 REST API problem
We have a node.js service that automatically creates & manages NiFi groups using the REST API, which works great in NiFi 1.1.1. We are upgrading our NiFi instances to 1.2.0 and I have found that some of the processors exhibit odd behavior. We have a flow that connects to the Outlook 365 OWA service, generates an access token, and then uses that token in two different InvokeHTTP processors. The first processor always works and the second always returns an HTTP 401 error. If I delete and manually re-add the InvokeHTTP processor with the same configuration, it always works. If I export this flow from the NiFi web interface and then re-import it, fixing only the SSL context service, it works every time. Using our node.js service to create the exact same flow in NiFi 1.1.1, it always works. Thanks, Raymond
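When two InvokeHTTP processors are supposed to be identically configured but behave differently, it can help to diff their configurations straight from the REST API rather than eyeballing the UI. A rough Python sketch, assuming an unsecured NiFi instance and the /processors/{id} endpoint; the property names in the usage note below are only illustrative:

```python
import json
import urllib.request

NIFI = "http://localhost:8080/nifi-api"  # assumed unsecured instance

def get_processor(proc_id):
    """GET the full processor entity, including its configured properties."""
    with urllib.request.urlopen(f"{NIFI}/processors/{proc_id}") as resp:
        return json.load(resp)

def diff_properties(a, b):
    """Return {property: (value_in_a, value_in_b)} for every property
    whose value differs between the two processor entities."""
    pa = a["component"]["config"]["properties"]
    pb = b["component"]["config"]["properties"]
    return {k: (pa.get(k), pb.get(k))
            for k in set(pa) | set(pb) if pa.get(k) != pb.get(k)}
```

Comparing the failing processor against the manually re-created one this way (e.g. spotting a difference in a hypothetical "Basic Authentication Username" property) can narrow down whether the 401 comes from configuration drift or from runtime state that only a restart clears.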
integration with Zeppelin
Hi All! I am new here. I find NiFi to be a great project for data routing. Moreover, the extensibility of NiFi gives us great potential to expand the applications of NiFi routing. Recently I looked at Kylo, software for management of a data lake (based on NiFi and its templates). In this combination of NiFi and Kylo I feel the lack of a component able to do custom data analytics. I asked the Kylo community about possible integration of Kylo with Zeppelin here: https://groups.google.com/forum/#!topic/kylo-community/e6JdzneAnV0 The Kylo developers advised me to ask about possible integration here. I haven't found any NiFi processor for Zeppelin that enables working on the input from NiFi and handling the output from Zeppelin. The idea for such an integration comes from functionality I've seen in the Dataiku solution. There is a processor that is able to run arbitrary code in the middle of processing, with specific input and output variables. With such a capability, the data governance NiFi provides for collecting data could be extended to data governance for data processing, like training machine learning models, custom visualizations, etc. What do you think of this idea? Is there any NiFi processor that supports such an integration? Is it in line with the NiFi roadmap? -- Kind regards/ Pozdrawiam, Wojciech Indyk
Re: Detect whether a flowfile has a particular attribute
I do understand now. Thank you very much Mark. -Jim On Thu, Jun 8, 2017 at 9:34 AM, Mark Payne wrote: > Jim, > > The first expression will return false. None of the expressions below will > ever throw an Exception. > > You could even chain them together like > ${myAttribute:toLower():length():gt(4)} > and if myAttribute does not > exist, it will return false, rather than throwing an Exception. > > Thanks > -Mark > > > On Jun 8, 2017, at 9:32 AM, James McMahon wrote: > > So then if myAttribute does not even exist in a particular flowFile, the > first expression will return a null value rather than throw an error. Thank > you very much Mark. -Jim > > On Thu, Jun 8, 2017 at 8:44 AM, Mark Payne wrote: > >> Jim, >> >> You can use the expression: >> >> ${myAttribute:isNull()} >> >> Or, alternatively, depending on how you want to setup the route: >> >> ${myAttribute:notNull()} >> >> If you want to check if the attribute contains 'True' somewhere within >> its value, >> then you can use: >> >> ${myAttribute:contains('True')} >> >> Thanks >> -Mark >> >> >> > On Jun 8, 2017, at 8:19 AM, James McMahon wrote: >> > >> > Good morning. I receive HTTP POSTs of various types of files. Some have >> a particular attribute myAttribute, some do not. I want to route the >> flowfiles to different workflow paths depending on the presence of this >> attribute. Can I use RouteAttribute and the expression language to do that, >> something like this: >> > >> > hasTheAttributeOfInterest ${anyAttribute("myAttribute": >> contains('True')} >> > >> > I ask because the expression guide did not say whether a False is >> returned or the processor throws an error if the attribute does not exist >> in the flowfile. I may have missed that. I wanted to see if anyone in the >> group has experience in this regard? >> > >> > Thanks in advance for your insights. -Jim >> >> > >
Re: Detect whether a flowfile has a particular attribute
Jim, The first expression will return false. None of the expressions below will ever throw an Exception. You could even chain them together like ${myAttribute:toLower():length():gt(4)} and if myAttribute does not exist, it will return false, rather than throwing an Exception. Thanks -Mark On Jun 8, 2017, at 9:32 AM, James McMahon wrote: So then if myAttribute does not even exist in a particular flowFile, the first expression will return a null value rather than throw an error. Thank you very much Mark. -Jim On Thu, Jun 8, 2017 at 8:44 AM, Mark Payne wrote: Jim, You can use the expression: ${myAttribute:isNull()} Or, alternatively, depending on how you want to setup the route: ${myAttribute:notNull()} If you want to check if the attribute contains 'True' somewhere within its value, then you can use: ${myAttribute:contains('True')} Thanks -Mark > On Jun 8, 2017, at 8:19 AM, James McMahon wrote: > > Good morning. I receive HTTP POSTs of various types of files. Some have a > particular attribute myAttribute, some do not. I want to route the flowfiles > to different workflow paths depending on the presence of this attribute. Can > I use RouteAttribute and the expression language to do that, something like > this: > > hasTheAttributeOfInterest > ${anyAttribute("myAttribute":contains('True')} > > I ask because the expression guide did not say whether a False is returned or > the processor throws an error if the attribute does not exist in the > flowfile. I may have missed that. I wanted to see if anyone in the group has > experience in this regard? > > Thanks in advance for your insights. -Jim
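The null-safe behaviour Mark describes (a missing attribute short-circuits to false rather than throwing) can be mimicked outside NiFi. A rough Python analogue of routing on a flowfile's attribute map, purely illustrative and not NiFi code:

```python
def route_on_attribute(attributes, name="myAttribute"):
    """Mimic NiFi's ${myAttribute:contains('True')} semantics:
    a missing attribute yields False, never an exception."""
    value = attributes.get(name)      # None when the attribute is absent
    return value is not None and "True" in value

route_on_attribute({"myAttribute": "True"})   # -> True
route_on_attribute({"other": "x"})            # -> False, no error raised
```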
Re: Detect whether a flowfile has a particular attribute
Jim, You can use the expression: ${myAttribute:isNull()} Or, alternatively, depending on how you want to setup the route: ${myAttribute:notNull()} If you want to check if the attribute contains 'True' somewhere within its value, then you can use: ${myAttribute:contains('True')} Thanks -Mark > On Jun 8, 2017, at 8:19 AM, James McMahon wrote: > > Good morning. I receive HTTP POSTs of various types of files. Some have a > particular attribute myAttribute, some do not. I want to route the flowfiles > to different workflow paths depending on the presence of this attribute. Can > I use RouteAttribute and the expression language to do that, something like > this: > > hasTheAttributeOfInterest > ${anyAttribute("myAttribute":contains('True')} > > I ask because the expression guide did not say whether a False is returned or > the processor throws an error if the attribute does not exist in the > flowfile. I may have missed that. I wanted to see if anyone in the group has > experience in this regard? > > Thanks in advance for your insights. -Jim
Re: Set priority to files based on date time value stored on attribute
Hi Pierre, After converting those date-time values into integers (using the expression language), I was able to process the files as required by setting those integer values on the priority attribute and processing the files based on that priority. Thanks for your guidance. Regards, Manoj kumar R On Thu, Jun 8, 2017 at 8:50 AM, Pierre Villard wrote: > Hi Manoj, > > You may want to have a look at the EnforceOrder processor [1] or simply the > prioritizers [2] of the connections (it depends on how your workflow is > working). The idea would be to extract the date as an attribute of your > flow file, convert it into an integer (using expression language) and use it > to ensure order. > > [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.2.0/org.apache.nifi.processors.standard.EnforceOrder/index.html > [2] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#prioritization > > Hope this helps. > > > 2017-06-08 8:43 GMT+02:00 Manojkumar Ravichandran > : > >> Hi All, >> >> I need to process the files based on the date-time value stored in the >> attribute >> >> For example: >> >> If the incoming files contain the following date-time attribute values >> >> 2017/06/07 16:57:02 >> 2017/06/06 12:49:49 >> 2017/06/06 11:09:28 >> 2017/06/06 06:37:45 >> >> I need to process the files in order of time, oldest first >> >> First I want to access the file whose date-time attribute is the oldest, >> i.e. 2017/06/06 06:37:45 >> and then >> 2017/06/06 11:09:28 >> and then >> 2017/06/06 12:49:49 >> and so on >> >> How can I achieve the above scenario? >> >> Regards, >> Manoj kumar R >> > >
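The "convert the date to an integer" step that resolved this thread can be sketched outside NiFi as follows. Inside NiFi itself the equivalent would be an expression such as ${myDate:toDate('yyyy/MM/dd HH:mm:ss'):toNumber()} written into the priority attribute; the attribute name and the UTC assumption below are illustrative.

```python
from datetime import datetime, timezone

def to_priority(ts, fmt="%Y/%m/%d %H:%M:%S"):
    """Convert a 'yyyy/MM/dd HH:mm:ss' attribute value to epoch seconds.
    Older timestamps produce smaller integers, so a smallest-first
    prioritizer (e.g. NiFi's PriorityAttributePrioritizer) processes
    the oldest file first."""
    dt = datetime.strptime(ts, fmt).replace(tzinfo=timezone.utc)
    return int(dt.timestamp())

stamps = ["2017/06/07 16:57:02", "2017/06/06 12:49:49",
          "2017/06/06 11:09:28", "2017/06/06 06:37:45"]
ordered = sorted(stamps, key=to_priority)
# ordered[0] == "2017/06/06 06:37:45"  (the oldest, as desired)
```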
Detect whether a flowfile has a particular attribute
Good morning. I receive HTTP POSTs of various types of files. Some have a particular attribute myAttribute, some do not. I want to route the flowfiles to different workflow paths depending on the presence of this attribute. Can I use RouteAttribute and the expression language to do that, something like this: hasTheAttributeOfInterest ${anyAttribute("myAttribute":contains('True')} I ask because the expression guide did not say whether a False is returned or the processor throws an error if the attribute does not exist in the flowfile. I may have missed that. I wanted to see if anyone in the group has experience in this regard? Thanks in advance for your insights. -Jim
How to perform bulk insert into SQLServer from one machine to another?
I have a running NiFi instance on one machine and SQL Server on another machine. I am trying to perform a bulk insert operation with a BULK INSERT query in SQL Server, but I cannot insert data from one machine into SQL Server on the other machine. If I run NiFi and SQL Server on the same machine, the bulk insert operation works easily. I have configured GetFile -> ReplaceText (bulk insert query) -> PutSQL processors. With NiFi and SQL Server on a single machine the bulk insert works, but it does not work when the two instances are on different machines. I need to get all the data from one machine and write a query to move that data into SQL Server running on another machine. The query below works when NiFi and SQL Server are on the same machine: BULK INSERT BI FROM 'C:\Directory\input.csv' WITH (FIRSTROW = 1, ROWTERMINATOR = '\n', FIELDTERMINATOR = ',', ROWS_PER_BATCH = 1) If I run that query from another machine, it fails with "FileNotFoundError", because "input.csv" is on the Host1 machine but the query runs on the SQL Server machine (Host2). Can anyone give me a suggestion on how to do this?
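BULK INSERT always resolves its file path on the SQL Server machine, which is why it fails when the CSV lives only on the NiFi host. One workaround in the spirit of the existing GetFile -> ReplaceText -> PutSQL flow is to turn the CSV content itself into ordinary INSERT statements, which PutSQL can execute remotely over JDBC. A minimal sketch; the table and column names are hypothetical, and in practice parameterized statements (PutSQL's sql.args.N.* attributes) are safer than string interpolation:

```python
import csv
import io

def csv_to_inserts(csv_text, table="BI", columns=("col1", "col2", "col3")):
    """Turn CSV rows into individual INSERT statements that can run on a
    remote SQL Server, sidestepping BULK INSERT's local-file requirement.
    Table and column names are placeholder assumptions."""
    stmts = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Escape single quotes; real flows should prefer bind parameters.
        values = ", ".join("'" + v.replace("'", "''") + "'" for v in row)
        stmts.append(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({values});")
    return stmts
```

Other common alternatives are to place the CSV on a share reachable from Host2 (BULK INSERT accepts a UNC path) or to use PutDatabaseRecord-style batched inserts in newer NiFi versions.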
Re: Set priority to files based on date time value stored on attribute
Koji, One could convert the date to epoch format, which is incremental in nature. Would that help? On 8 Jun 2017 19:33, "Koji Kawamura" wrote: > Hi Manoj, > > I think EnforceOrder would not be useful in your case, as it expects > the order to increase one by one (without skips). > As Pierre suggested, I'd suggest using PriorityAttributePrioritizer. > > Thanks, > Koji > > On Thu, Jun 8, 2017 at 3:50 PM, Pierre Villard > wrote: > > Hi Manoj, > > > > You may want to have a look at the EnforceOrder processor [1] or simply the > > prioritizers [2] of the connections (it depends on how your workflow is > > working). The idea would be to extract the date as an attribute of your flow > > file, convert it into an integer (using expression language) and use it to > > ensure order. > > > > [1] > > https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.2.0/org.apache.nifi.processors.standard.EnforceOrder/index.html > > [2] > > https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#prioritization > > > > Hope this helps. > > > > > > 2017-06-08 8:43 GMT+02:00 Manojkumar Ravichandran < sendmailt...@gmail.com>: > >> > >> Hi All, > >> > >> I need to process the files based on the date-time value stored in the > >> attribute > >> > >> For example: > >> > >> If the incoming files contain the following date-time attribute values > >> > >> 2017/06/07 16:57:02 > >> 2017/06/06 12:49:49 > >> 2017/06/06 11:09:28 > >> 2017/06/06 06:37:45 > >> > >> I need to process the files in order of time, oldest first > >> > >> First I want to access the file whose date-time attribute is the oldest, > >> i.e. 2017/06/06 06:37:45 > >> and then > >> 2017/06/06 11:09:28 > >> and then > >> 2017/06/06 12:49:49 > >> and so on > >> > >> How can I achieve the above scenario? > >> > >> Regards, > >> Manoj kumar R > > > > >
Re: Set priority to files based on date time value stored on attribute
Hi Manoj, I think EnforceOrder would not be useful in your case, as it expects the order to increase one by one (without skips). As Pierre suggested, I'd suggest using PriorityAttributePrioritizer. Thanks, Koji On Thu, Jun 8, 2017 at 3:50 PM, Pierre Villard wrote: > Hi Manoj, > > You may want to have a look at the EnforceOrder processor [1] or simply the > prioritizers [2] of the connections (it depends on how your workflow is > working). The idea would be to extract the date as an attribute of your flow > file, convert it into an integer (using expression language) and use it to > ensure order. > > [1] > https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.2.0/org.apache.nifi.processors.standard.EnforceOrder/index.html > [2] > https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#prioritization > > Hope this helps. > > > 2017-06-08 8:43 GMT+02:00 Manojkumar Ravichandran : >> >> Hi All, >> >> I need to process the files based on the date-time value stored in the >> attribute >> >> For example: >> >> If the incoming files contain the following date-time attribute values >> >> 2017/06/07 16:57:02 >> 2017/06/06 12:49:49 >> 2017/06/06 11:09:28 >> 2017/06/06 06:37:45 >> >> I need to process the files in order of time, oldest first >> >> First I want to access the file whose date-time attribute is the oldest, >> i.e. 2017/06/06 06:37:45 >> and then >> 2017/06/06 11:09:28 >> and then >> 2017/06/06 12:49:49 >> and so on >> >> How can I achieve the above scenario? >> >> Regards, >> Manoj kumar R > >
Re: Set priority to files based on date time value stored on attribute
Hi Manoj, You may want to have a look at the EnforceOrder processor [1] or simply the prioritizers [2] of the connections (it depends on how your workflow is working). The idea would be to extract the date as an attribute of your flow file, convert it into an integer (using expression language) and use it to ensure order. [1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.2.0/org.apache.nifi.processors.standard.EnforceOrder/index.html [2] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#prioritization Hope this helps. 2017-06-08 8:43 GMT+02:00 Manojkumar Ravichandran: > Hi All, > > I need to process the files based on the date-time value stored in the > attribute > > For example: > > If the incoming files contain the following date-time attribute values > > 2017/06/07 16:57:02 > 2017/06/06 12:49:49 > 2017/06/06 11:09:28 > 2017/06/06 06:37:45 > > I need to process the files in order of time, oldest first > > First I want to access the file whose date-time attribute is the oldest, > i.e. 2017/06/06 06:37:45 > and then > 2017/06/06 11:09:28 > and then > 2017/06/06 12:49:49 > and so on > > How can I achieve the above scenario? > > Regards, > Manoj kumar R >
Set priority to files based on date time value stored on attribute
Hi All, I need to process files based on the date-time value stored in an attribute. For example: if the incoming files contain the following date-time attribute values 2017/06/07 16:57:02 2017/06/06 12:49:49 2017/06/06 11:09:28 2017/06/06 06:37:45 I need to process the files in order of time, oldest first. First I want to access the file whose date-time attribute is the oldest, i.e. 2017/06/06 06:37:45, and then 2017/06/06 11:09:28, and then 2017/06/06 12:49:49, and so on. How can I achieve the above scenario? Regards, Manoj kumar R