RE: MiNiFi agent cannot update flow configuration

2021-09-22 Thread Tomislav Novosel
Hi Matt,

I will try with different implementations of the Provenance repo; until now I used 
the default one: org.apache.nifi.provenance.MiNiFiPersistentProvenanceRepository.

Regarding the C2 server, I downloaded version 0.5.0 here: 
https://nifi.apache.org/minifi/download.html, at the bottom of the page.

Where can I download the 1.14.0 version?

Thanks,
Tom

-Original Message-
From: Matt Burgess  
Sent: 21 September 2021 21:51
To: users@nifi.apache.org
Subject: Re: MiNiFi agent cannot update flow configuration

Tom,

Which implementation of the Provenance Repository are you using? If not the 
VolatileProvenanceRepository, can you try that as a workaround? Also are you 
using the 1.14.0 version of the C2 server?

Regards,
Matt

On Tue, Sep 21, 2021 at 3:45 PM Tomislav Novosel 
 wrote:
>
> Hi to all,
>
>
> I'm using MiNiFi 1.14.0 with a configured change ingestor to pull from the 
> HTTP C2 server whenever there is a change in configuration (a change in the 
> NiFi flow that is supposed to be running on MiNiFi).
>
> The MiNiFi agent is running on a Raspberry Pi 3 with enough disk space.
>
>
> When I make a change and save the new template with the name 
> template_name.v2, C2 pulls it, saves it into the ./cache folder and sends it to 
> the MiNiFi agent.
>
>
> Then in MiNiFi agent log I have this error:
>
>
>
> 2021-09-21 12:54:26,456 ERROR [MiNiFi logging handler] 
> org.apache.nifi.minifi.StdErr Failed to start flow service: Unable to 
> load flow due to: java.lang.RuntimeException: Unable to create 
> Provenance Repository
> 2021-09-21 12:54:26,457 ERROR [MiNiFi logging handler] 
> org.apache.nifi.minifi.StdErr Shutting down...
> 2021-09-21 12:54:27,384 INFO [main] o.apache.nifi.minifi.bootstrap.RunMiNiFi 
> Swap file exists, MiNiFi failed trying to change configuration. Reverting to 
> old configuration.
> 2021-09-21 12:54:27,425 INFO [main] 
> o.apache.nifi.minifi.bootstrap.RunMiNiFi Replacing config file with 
> swap file and deleting swap file
> 2021-09-21 12:54:27,444 INFO [main] 
> o.apache.nifi.minifi.bootstrap.RunMiNiFi Successfully spawned the 
> thread to start Apache MiNiFi with PID 64002
> 2021-09-21 12:54:29,384 INFO [MiNiFi Bootstrap Command Listener] 
> o.apache.nifi.minifi.bootstrap.RunMiNiFi The thread to run Apache 
> MiNiFi is now running and listening for Bootstrap requests on port 
> 38889
>
>
>
> It cannot change the flow configuration because it cannot create the 
> Provenance Repository, and then it reverts to the old configuration of the flow.
>
> I tried deleting all the files in the ./provenance_repository folder and starting 
> it again, but the same thing happens.
>
>
>
> Does anybody know why this happens?
>
>
>
> Thanks in advance,
>
> Regards,
>
> Tom


MiNiFi agent cannot update flow configuration

2021-09-21 Thread Tomislav Novosel
Hi to all,

I'm using MiNiFi 1.14.0 with a configured change ingestor to pull from the HTTP C2 
server
whenever there is a change in configuration (a change in the NiFi flow that is 
supposed to be running
on MiNiFi).
The MiNiFi agent is running on a Raspberry Pi 3 with enough disk space.

When I make a change and save the new template with the name template_name.v2, 
C2 pulls it,
saves it into the ./cache folder and sends it to the MiNiFi agent.

Then in MiNiFi agent log I have this error:

2021-09-21 12:54:26,456 ERROR [MiNiFi logging handler] 
org.apache.nifi.minifi.StdErr Failed to start flow service: Unable to load flow 
due to: java.lang.RuntimeException: Unable to create Provenance Repository
2021-09-21 12:54:26,457 ERROR [MiNiFi logging handler] 
org.apache.nifi.minifi.StdErr Shutting down...
2021-09-21 12:54:27,384 INFO [main] o.apache.nifi.minifi.bootstrap.RunMiNiFi 
Swap file exists, MiNiFi failed trying to change configuration. Reverting to 
old configuration.
2021-09-21 12:54:27,425 INFO [main] o.apache.nifi.minifi.bootstrap.RunMiNiFi 
Replacing config file with swap file and deleting swap file
2021-09-21 12:54:27,444 INFO [main] o.apache.nifi.minifi.bootstrap.RunMiNiFi 
Successfully spawned the thread to start Apache MiNiFi with PID 64002
2021-09-21 12:54:29,384 INFO [MiNiFi Bootstrap Command Listener] 
o.apache.nifi.minifi.bootstrap.RunMiNiFi The thread to run Apache MiNiFi is now 
running and listening for Bootstrap requests on port 38889

It cannot change the flow configuration because it cannot create the Provenance 
Repository, and then it
reverts to the old configuration of the flow.
I tried deleting all the files in the ./provenance_repository folder and starting it 
again, but the same thing happens.

Does anybody know why this happens?

Thanks in advance,
Regards,
Tom


MiNiFi C2 server 0.5.0 bug or not?

2021-09-21 Thread Tomislav Novosel
Hi to all,

I am using MiNiFi C2 server version 0.5.0 and MiNiFi version 1.14.0.
The C2 server is configured to pull templates from a localhost NiFi installation,
and MiNiFi is configured with a configuration change ingestor to pull config from 
the C2 server.

After I created the MiNiFi flow on the NiFi canvas, including a Remote Process Group 
pointed at the same
NiFi installation (IP and port configured; the flow was tested on localhost and 
flowfiles arrive
on the input port), I saved the flow as a template.

The template defines the input port with these IDs:

  id: f43fd60b-017b-1000-ad38-01d2ac321927
  versionedComponentId: b4fbf2a8-46d0-3ac7-b833-21533794f7a7
  name: DataFromSensors
  parent group id: f43dd13c-017b-1000-df9a-acf01964ae4f

When the C2 server pulls the template, it apparently sets the wrong ID in the 
config.yml file: instead of putting the input port id,
it sets the versionedComponentId into config.yml, so the MiNiFi agent cannot 
accept that configuration and rejects it.

This is the Exception from MiNiFi log:

Caused by: 
org.apache.nifi.minifi.bootstrap.exception.InvalidConfigurationException: 
Failed to transform config file due to:[Connection with id 
8f6c9b69-5b31-313e-- has invalid destination id 
b4fbf2a8-46d0-3ac7-b833-21533794f7a7]
   at 
org.apache.nifi.minifi.bootstrap.util.ConfigTransformer.throwIfInvalid(ConfigTransformer.java:131)
   at 
org.apache.nifi.minifi.bootstrap.util.ConfigTransformer.transformConfigFile(ConfigTransformer.java:94)
   at 
org.apache.nifi.minifi.bootstrap.RunMiNiFi.performTransformation(RunMiNiFi.java:1693)

To fix it, I need to go to the cached config file in the ./cache folder of the C2 
server and set the right ID of the input port manually.

NOTE: I tried converting the template file manually using the minifi toolkit 1.14.0 
and it sets the correct id of the input port.

Is this a bug in the C2 server or intended behaviour? I found the same case for the 
minifi toolkit, but apparently it was fixed:
https://stackoverflow.com/questions/59214373/failing-in-minifi-tutorial-toolkit-error-connection-with-id-has-invalid-de

Thanks,
Tom


RE: Minifi 1.14.0 exception - sensitive props key

2021-09-20 Thread Tomislav Novosel
Hi Jeremy,

MiNiFi constructs the nifi.properties file from the config.yml located in the 
./conf folder, and this property needs to be set
manually in config.yml before startup:

Security Properties:
  Sensitive Props:
    key: ''

Thanks,
Tom

From: Jeremy Pemberton-Pigott 
Sent: 20 September 2021 16:58
To: users@nifi.apache.org
Subject: Re: Minifi 1.14.0 exception - sensitive props key

Hi Tom,

I recall that I used the NiFi 1.14.0 binary to set the key and then copied that 
over to the MiNiFi properties file to get it to work.

Linux only:
https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#updating-the-sensitive-properties-key
./bin/nifi.sh set-sensitive-properties-key 
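If you need a value to pass to that command, any sufficiently long random string works (as I recall, NiFi only requires a minimum length, so treat the exact constraint as something to verify for your version); a small sketch of generating one:

```python
import secrets

# Generate a 32-character hex key to pass to
#   ./bin/nifi.sh set-sensitive-properties-key <key>
# secrets draws from the OS CSPRNG, so the key is suitable for this use.
key = secrets.token_hex(16)
print(key)
```

The same value must then end up in nifi.properties (or config.yml for MiNiFi) on every node that shares the flow.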

Jeremy

On Mon, Sep 20, 2021 at 9:14 PM Tomislav Novosel 
<tomislav.novo...@clearpeaks.com> wrote:
According to the migration guidance for NiFi, the property 'Sensitive Properties Key' 
should be generated at startup. I believe this is the same behaviour for 
MiNiFi since they merged the codebases. Why is this happening?

BR,
Tom
From: Tomislav Novosel <tomislav.novo...@clearpeaks.com>
Sent: 20 September 2021 11:05
To: users@nifi.apache.org
Subject: Minifi 1.14.0 exception - sensitive props key

Hi to all,

I was using MiNiFi 0.5.0 running on Ubuntu, installed as a service.
I have now switched to MiNiFi 1.14.0: I disabled the minifi service, deleted the
installation folder and unpacked the MiNiFi 1.14.0 folder in the same place
where 0.5.0 was installed.

I started the new MiNiFi 1.14.0, but the service doesn't start properly,
failing with this exception:

java.lang.Exception: Unable to load flow due to: 
java.lang.IllegalArgumentException: NiFi Sensitive Properties Key 
[nifi.sensitive.props.key] is required
  at 
org.apache.nifi.headless.HeadlessNiFiServer.start(HeadlessNiFiServer.java:166)
  at org.apache.nifi.minifi.MiNiFi.<init>(MiNiFi.java:163)
  at org.apache.nifi.minifi.MiNiFi.<init>(MiNiFi.java:64)
  at org.apache.nifi.minifi.MiNiFi.main(MiNiFi.java:265)
Caused by: java.lang.IllegalArgumentException: NiFi Sensitive Properties Key 
[nifi.sensitive.props.key] is required
  at 
org.apache.nifi.encrypt.PropertyEncryptorFactory.getPropertyEncryptor(PropertyEncryptorFactory.java:42)
  at 
org.apache.nifi.headless.HeadlessNiFiServer.start(HeadlessNiFiServer.java:125)
  ... 3 common frames omitted

I tried to set the key manually, but the property in nifi.properties gets 
overwritten
every time I start the service.

I could not find anything about this exception. Can someone help? Why is this 
happening?

Thanks, regards,
Tom


RE: Minifi 1.14.0 exception - sensitive props key

2021-09-20 Thread Tomislav Novosel
According to the migration guidance for NiFi, the property 'Sensitive Properties Key' 
should be generated at startup. I believe this is the same behaviour for 
MiNiFi since they merged the codebases. Why is this happening?

BR,
Tom
From: Tomislav Novosel 
Sent: 20 September 2021 11:05
To: users@nifi.apache.org
Subject: Minifi 1.14.0 exception - sensitive props key

Hi to all,

I was using MiNiFi 0.5.0 running on Ubuntu, installed as a service.
I have now switched to MiNiFi 1.14.0: I disabled the minifi service, deleted the
installation folder and unpacked the MiNiFi 1.14.0 folder in the same place
where 0.5.0 was installed.

I started the new MiNiFi 1.14.0, but the service doesn't start properly,
failing with this exception:

java.lang.Exception: Unable to load flow due to: 
java.lang.IllegalArgumentException: NiFi Sensitive Properties Key 
[nifi.sensitive.props.key] is required
  at 
org.apache.nifi.headless.HeadlessNiFiServer.start(HeadlessNiFiServer.java:166)
  at org.apache.nifi.minifi.MiNiFi.<init>(MiNiFi.java:163)
  at org.apache.nifi.minifi.MiNiFi.<init>(MiNiFi.java:64)
  at org.apache.nifi.minifi.MiNiFi.main(MiNiFi.java:265)
Caused by: java.lang.IllegalArgumentException: NiFi Sensitive Properties Key 
[nifi.sensitive.props.key] is required
  at 
org.apache.nifi.encrypt.PropertyEncryptorFactory.getPropertyEncryptor(PropertyEncryptorFactory.java:42)
  at 
org.apache.nifi.headless.HeadlessNiFiServer.start(HeadlessNiFiServer.java:125)
  ... 3 common frames omitted

I tried to set the key manually, but the property in nifi.properties gets 
overwritten
every time I start the service.

I could not find anything about this exception. Can someone help? Why is this 
happening?

Thanks, regards,
Tom


Minifi 1.14.0 exception - sensitive props key

2021-09-20 Thread Tomislav Novosel
Hi to all,

I was using MiNiFi 0.5.0 running on Ubuntu, installed as a service.
I have now switched to MiNiFi 1.14.0: I disabled the minifi service, deleted the
installation folder and unpacked the MiNiFi 1.14.0 folder in the same place
where 0.5.0 was installed.

I started the new MiNiFi 1.14.0, but the service doesn't start properly,
failing with this exception:

java.lang.Exception: Unable to load flow due to: 
java.lang.IllegalArgumentException: NiFi Sensitive Properties Key 
[nifi.sensitive.props.key] is required
  at 
org.apache.nifi.headless.HeadlessNiFiServer.start(HeadlessNiFiServer.java:166)
  at org.apache.nifi.minifi.MiNiFi.<init>(MiNiFi.java:163)
  at org.apache.nifi.minifi.MiNiFi.<init>(MiNiFi.java:64)
  at org.apache.nifi.minifi.MiNiFi.main(MiNiFi.java:265)
Caused by: java.lang.IllegalArgumentException: NiFi Sensitive Properties Key 
[nifi.sensitive.props.key] is required
  at 
org.apache.nifi.encrypt.PropertyEncryptorFactory.getPropertyEncryptor(PropertyEncryptorFactory.java:42)
  at 
org.apache.nifi.headless.HeadlessNiFiServer.start(HeadlessNiFiServer.java:125)
  ... 3 common frames omitted

I tried to set the key manually, but the property in nifi.properties gets 
overwritten
every time I start the service.

I could not find anything about this exception. Can someone help? Why is this 
happening?

Thanks, regards,
Tom


PutDistributedMapCache

2021-07-09 Thread Tomislav Novosel
Hi to all,

I have this error bulletin message showing now and then on the
PutDistributedMapCache processor. The error disappears
after a few minutes, but now and then it shows up again,
and it is self-resolving (I have the failure relationship routed back to the 
processor).

I increased the connection timeout from 30 seconds to 60, but it doesn't help.

Why is this happening?


Thanks,
Tom


RE: NiFi configuration files changes

2021-06-30 Thread Tomislav Novosel
Hi Andrew,

thanks for the answer.

I looked at the CM API endpoints and there are only endpoints to retrieve service 
configuration according to this:
https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/configuring-clusters/topics/cm-api-get-configuration.html

There is nothing related to a change log.

Regards,
Tom

From: Andrew Grande 
Sent: 29 June 2021 18:35
To: users@nifi.apache.org
Subject: Re: NiFi configuration files changes

The physical files will get synchronized to the reference state from a central 
config management source (CM). There's no point watching them on the file 
system. If you need a change log for config files, I'd look into CM api to 
fetch those instead.
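If a change log is the goal, one approach is to periodically fetch the service config via the CM API and diff it against the previous snapshot. A minimal sketch of the diff step only (the property names below are made up for illustration, not real CM output):

```python
def diff_config(old, new):
    """Compare two {property: value} config snapshots.

    Returns (added, removed, changed), where `changed` maps a key to
    its (old_value, new_value) pair.
    """
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {k: (old[k], new[k])
               for k in old.keys() & new.keys() if old[k] != new[k]}
    return added, removed, changed

# Example with illustrative property names:
old = {"nifi.heap.size": "4g", "nifi.web.port": "8443"}
new = {"nifi.heap.size": "8g", "nifi.flow.dir": "/var/nifi/flow"}
```

The fetch itself would use the CM endpoint from the doc linked above; the diff result can then be forwarded to whatever system needs the notification.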

On Tue, Jun 29, 2021, 8:30 AM Tomislav Novosel 
<tomislav.novo...@clearpeaks.com> wrote:
Hi to all,

Is there a good way to capture NiFi configuration file changes 
(nifi.properties, authorizers.xml, etc.)
and forward those changes (or just a notification) to some other system or app?
Can I do it with NiFi itself?
The question is in the context of the Cloudera platform (CFM).

Thanks,
Regards,
Tom


NiFi configuration files changes

2021-06-29 Thread Tomislav Novosel
Hi to all,

Is there a good way to capture NiFi configuration file changes 
(nifi.properties, authorizers.xml, etc.)
and forward those changes (or just a notification) to some other system or app?
Can I do it with NiFi itself?
The question is in the context of the Cloudera platform (CFM).

Thanks,
Regards,
Tom


RE: Nifi Registry git persistence

2021-06-25 Thread Tomislav Novosel
Hi Bryan,

Thanks for the explanation.

Unfortunately, we have no possibility to integrate all the envs with 
one central NiFi Registry, nor to integrate dedicated NiFi Registries between
the envs (security reasons etc.; there is no connectivity between them).

Our approach is to sync the underlying git repos when transferring the 
flows to a higher logical env.

We also have a parameter context which needs to be env-specific. Is there a way 
to automate this, e.g. some kind of template file which can be populated
with specific env values?
Just to avoid manually modifying the parameter context file and then importing it 
into the higher env.
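One way to automate this is to keep the parameter context file as a template with placeholders and render it per environment before import. A sketch using Python's string.Template; the file shape and placeholder names below are illustrative, not the exact NiFi/Registry export format:

```python
import json
from string import Template

# Illustrative parameter-context template; $db_host and $db_port are
# per-environment placeholders, not a real NiFi file layout.
PARAMS_TEMPLATE = Template("""
{
  "name": "db-params",
  "parameters": {
    "db.host": "$db_host",
    "db.port": "$db_port"
  }
}
""")

def render_params(env_values):
    """Fill the placeholders and fail fast if the result is not valid JSON."""
    rendered = PARAMS_TEMPLATE.substitute(env_values)
    json.loads(rendered)  # raises on malformed output
    return rendered

prod = render_params({"db_host": "prod-db.internal", "db_port": "5432"})
```

The rendered file can then be imported with the Toolkit CLI as usual, one rendering per environment.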

Best regards,
Tom

-Original Message-
From: Bryan Bende  
Sent: 23 June 2021 16:34
To: users@nifi.apache.org
Subject: Re: Nifi Registry git persistence

There is no automatic pulling from git. The way Registry works is that there is 
a database (H2/MySQL/Postgres) that stores all the metadata and knowledge of 
what items exist, and then a persistence provider for each type of versioned 
item (flows, extension bundles). Git is just an implementation of a flow 
persistence provider, so it was really meant to be treated as just another 
place to store a blob. There are also file system and database providers for 
flow persistence.

When registry starts up, if the database is empty and git is configured, then 
it reads the git repo to rebuild the database. So the trick some people are 
using, is that in higher environments like prod where the registry is 
read-only, they periodically stop registry, delete the H2 database, git pull, 
and start again, and now everything is up to date.

On Wed, Jun 23, 2021 at 10:21 AM Tomislav Novosel 
 wrote:
>
> Hi Chris, Bryan,
>
> Thanks for the answers.
>
> How do I configure "follower" Registries to pull from the "leader" 
> automatically? What if one of the Registries goes down, how do I recover, and is 
> it tricky?
>
> We have three separated envs, with separate NiFi Registries. Is it even 
> required to have git as a persistence layer?
> What is the difference if we set up just the file persistence layer in comparison 
> to git?
>
> I know Toolkit CLI can be used in our case for transferring the flows 
> between registries, but I'm curious and want to know the pros and cons of having 
> one git repo for all the registries.
>
> Thanks,
> Tom
> -Original Message-
> From: Bryan Bende 
> Sent: 23 June 2021 14:58
> To: users@nifi.apache.org
> Subject: Re: Nifi Registry git persistence
>
> It wasn't really meant for that, but as others pointed out, if only one of 
> the registries is doing writes and the others are read-only for pulling into 
> nifi in another environment, then it can work.
>
> The best option is to have one central registry that all nifi's can access 
> (if you are willing to access the same git repo from all environments, then 
> why not the same nifi registry?).
>
> The second best option is the database persistence provider, with all 
> registries pointing at the same database.
>
> On Wed, Jun 23, 2021 at 7:56 AM Chris McKeever  wrote:
> >
> > Yes -- however:
> > - only one should be the authority
> > - see the older thread on how you need to have the read-only clients 
> > re-poll the remote git in order to pick up the latest changes.
> >
> > On Wed, Jun 23, 2021 at 5:25 AM Tomislav Novosel 
> >  wrote:
> >>
> >> Hi to all,
> >>
> >>
> >>
> >> is it possible to have one central Git repo for multiple NiFi 
> >> registries
> >>
> >> as a persistence layer?
> >>
> >> Is it possible to configure, for example, three NiFi registries 
> >> (through the providers.xml file)
> >>
> >> to communicate with one git repo?
> >>
> >>
> >>
> >> Thanks,
> >>
> >> Tom


RE: Nifi Registry git persistence

2021-06-23 Thread Tomislav Novosel
Hi Chris, Bryan,

Thanks for the answers.

How do I configure "follower" Registries to pull from the "leader" automatically? 
What if one of the Registries goes down, how do I recover, and is it tricky?

We have three separated envs, with separate NiFi Registries. Is it even 
required to have git as a persistence layer?
What is the difference if we set up just the file persistence layer in comparison to 
git?

I know Toolkit CLI can be used in our case for transferring the flows between 
registries, but I'm curious and want to know the pros and cons of having
one git repo for all the registries.

Thanks,
Tom
-Original Message-
From: Bryan Bende  
Sent: 23 June 2021 14:58
To: users@nifi.apache.org
Subject: Re: Nifi Registry git persistence

It wasn't really meant for that, but as others pointed out, if only one of the 
registries is doing writes and the others are read-only for pulling into nifi 
in another environment, then it can work.

The best option is to have one central registry that all nifi's can access (if 
you are willing to access the same git repo from all environments, then why not 
the same nifi registry?).

The second best option is the database persistence provider, with all 
registries pointing at the same database.

On Wed, Jun 23, 2021 at 7:56 AM Chris McKeever  wrote:
>
> Yes -- however:
> - only one should be the authority
> - see the older thread on how you need to have the read-only clients re-poll 
> the remote git in order to pick up the latest changes.
>
> On Wed, Jun 23, 2021 at 5:25 AM Tomislav Novosel 
>  wrote:
>>
>> Hi to all,
>>
>>
>>
>> i sit possible to have one central Git repo for multiple NiFi 
>> registries
>>
>> as persistance layer?
>>
>> Ii it possible to configure, for example, three NiFi registries 
>> (through providers.xml file)
>>
>> to communicate with one git repo?
>>
>>
>>
>> Thanks,
>>
>> Tom


Nifi Registry git persistence

2021-06-23 Thread Tomislav Novosel
Hi to all,

is it possible to have one central Git repo for multiple NiFi registries
as a persistence layer?
Is it possible to configure, for example, three NiFi registries (through 
the providers.xml file)
to communicate with one git repo?

Thanks,
Tom


Setting Param Context to PG

2021-06-17 Thread Tomislav Novosel
Hi to all,

How can I set a parameter context on all process groups at once?
I imported the root PG into another NiFi environment using a template,
but now I need to go manually into every nested PG and set the param context.

How do I set it on all PGs at once?

Can I use the toolkit or the NiFi REST API? In the toolkit there is a command to 
list PGs,
but not recursively, as far as I know.
If I could list them recursively, it would be easy to set the param context on all 
of them
from a shell script.

I want to avoid hand-rolling recursive scripts for listing PGs with the toolkit.
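For what it's worth, the recursion over the REST API is short. A sketch with the HTTP call abstracted out so it can be tested offline; it assumes GET /nifi-api/process-groups/{id}/process-groups returns a JSON object with a "processGroups" list of child-group entities (verify against your NiFi version):

```python
def collect_pg_ids(root_id, fetch_children):
    """Return root_id plus the ids of all nested process groups.

    fetch_children(pg_id) must return the list of child-PG dicts, i.e.
    the "processGroups" array from
    GET /nifi-api/process-groups/{pg_id}/process-groups.
    """
    ids = [root_id]
    for child in fetch_children(root_id):
        ids.extend(collect_pg_ids(child["id"], fetch_children))
    return ids

# Against a live instance, fetch_children could be implemented as follows
# (auth token and TLS handling omitted; host is a placeholder):
#
# import json, urllib.request
# def fetch_children(pg_id):
#     url = f"https://nifi-host:8443/nifi-api/process-groups/{pg_id}/process-groups"
#     with urllib.request.urlopen(url) as resp:
#         return json.load(resp)["processGroups"]
```

Each collected id can then be fed to the parameter-context update call (or the Toolkit CLI) in a loop.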

Thanks,
Regards,
Tom


Delete FlowFile content

2021-05-20 Thread Tomislav Novosel
Hi to all,

what is the best way to delete a flowfile's content so that the flowfile
keeps its attributes but no longer takes up space in the next relationship between
two processors?

I found this:

https://stackoverflow.com/questions/53312069/what-is-a-fastest-way-to-remove-nifi-flowfile-content

But then the flowfile will be stripped of its attributes and the lineage is lost.

I also tried ReplaceText to replace the content with an empty string, but I have 
large files
in the flow, so it is extremely slow.

Thanks,

Tom


Nifi registry - NiFi toolkit

2021-05-13 Thread Tomislav Novosel
Hi to all,

Is it possible to use NiFi Registry with the GitHub persistence provider instead of
the local file system provider for flow versioning, while still being able to use 
the nifi-toolkit
command "nifi pg-import" to import flows from the Registry into another NiFi env?

I configured the GitHub persistence provider for NiFi Registry and the flow 
snapshots are no
longer saved to a local folder on the path /flows_storage/bucket_id/flow_id/version;
they are pushed to GitHub as snapshots plus one .yml file with the details of 
the bucket and the flows inside.

In that case the nifi-toolkit command "nifi pg-import" complains that it cannot find 
the bucket and flow anymore.

Am I missing something in the config, or is that simply not possible?


Thanks,
Tom


RE: Some retry flowfile questions

2021-04-23 Thread Tomislav Novosel
Hi Harald, Mark,

I asked about RetryFlowFile the other day and its potential danger, but no 
answer yet.
My question was not really about penalty and yield; it was just meant to raise a 
consideration about the processor.

@Harald, if at this Retry point in your schema you are using the RetryFlowFile 
processor, there is sooner or later a
potential deadlock if you have a lot of files going through this point 
in your flow.

Imagine there is a big number of flowfiles and that “unreliable” endpoint you 
mentioned is sleeping
for a while: all the flowfiles go to the failure relationship, and after 
some time (depending on how you configured the
number of retries in the RetryFlowFile processor) the files go to the retry 
relationship to retry the endpoint again.

If both of those relationships are full to the backpressure threshold, there will 
be a deadlock,
and even if the endpoint wakes up, NiFi will not retry it.

Related to your “slow down” question: in RetryFlowFile there is an option to 
penalize flowfiles before sending them
to the retry relationship.

Thanks,
Regards,
Tom

From: Dobbernack, Harald (Key-Work) 
Sent: 23 April 2021 09:50
To: users@nifi.apache.org
Subject: RE: Some retry flowfile questions

Mark, thank you so much for this great explanation!
Harald

From: Mark Payne <marka...@hotmail.com>
Sent: Thursday, 22 April 2021 22:32
To: users@nifi.apache.org
Subject: Re: Some retry flowfile questions

Geoff,

The difference between penalization and yielding is whether the failure is 
data-dependent or not.

So, an easy way to think about this is to consider a scenario where you have a 
simple flow: GetFTP -> PutFTP.
Something else is picking up data from the FTP server that you’re putting to.

You know that sometimes the data will already exist with the same name, but you 
don’t want to overwrite it because it’s likely to actually be different data 
with a conflicting filename.
So you want to wait a while and try to push that file again. In the meantime, 
you want to continue pushing other files to the FTP server.
In this case, the processor would penalize that FlowFile so that it can 
continue working on other data.

On the other hand, if PutFTP were to get a connection failure, it’s not even 
able to connect to that FTP server, then it doesn’t make sense to penalize that 
FlowFile and move onto the next one and try to push it. It can’t connect, so it 
can’t make progress regardless of what data it has.
In this case, the processor should yield.

Note, however, that it is up to the processor developer to tell the processor 
to yield or to penalize the FlowFile. It’s not up to the creator of the data 
flow.

Does that help?

Thanks
-Mark

On Apr 22, 2021, at 2:08 PM, Greene (US), Geoffrey N 
<geoffrey.n.gre...@boeing.com> wrote:

We have a rest endpoint that is “unreliable”. It works sometimes.
When it doesn’t work, the solution seems to be to sleep for a while, then try 
again.

So I put in a retry processor:

http processor <-------- Retry
   |        \               ^
Success    Failure ---------+

So far, so good, that loop works.  But how do I handle the slow down?
Does the penalty / yield go on the retry? Or on the http? What's the 
difference? How do I know if I should YIELD or impose a penalty? I'm not sure 
I understand the differences here.

Thanks
Geoff



Harald Dobbernack

Key-Work Consulting GmbH | Kriegsstr. 100 | 76133 | Karlsruhe | Germany | 
www.key-work.de | 
Datenschutz
Fon: +49-721-78203-264 | E-Mail: 
harald.dobbern...@key-work.de

Key-Work Consulting GmbH, Karlsruhe, HRB 108695, HRG Mannheim
Geschäftsführer: Andreas Stappert, Tobin Wotring


Stopping processor after MAX number of retries

2021-02-26 Thread Tomislav Novosel
Hi guys,

I want to stop a processor after it exceeds a maximum number of retries.
For that I'm using the RetryFlowFile processor; after 5 retries it routes the
flowfile to retries_exceeded.

When that kicks in, I want to stop the processor that was retried 5 times.

What is the best approach? I have a few ideas:


  *   Execute a shell script which sends a request to nifi-api to set the processor 
state to STOPPED
  *   Put an InvokeHTTP processor in the flow to send the request

The downside is: what if the processor ID changes, e.g. when deploying to another 
env or after a
NiFi restart? I'm not sure about that.
Also, it is a NiFi cluster with authentication and SSL, which complicates 
things.
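For the nifi-api option, stopping a processor is (as far as I recall) a PUT to /nifi-api/processors/{id}/run-status carrying the processor's current revision; verify the exact contract against your NiFi version. A sketch of building the request body only, without the HTTP call:

```python
import json

def stop_processor_payload(revision_version, client_id=None):
    """Build the JSON body for PUT /nifi-api/processors/{id}/run-status.

    `revision_version` must match the processor's current revision,
    otherwise NiFi rejects the update with a conflict.
    """
    revision = {"version": revision_version}
    if client_id is not None:
        revision["clientId"] = client_id
    return json.dumps({"revision": revision, "state": "STOPPED"})

# Send it with InvokeHTTP or curl (bearer token needed on a secured cluster):
#   curl -X PUT -H 'Content-Type: application/json' -d '<body>' \
#        https://nifi-host:8443/nifi-api/processors/<processor-id>/run-status
```

The current revision (and the processor id itself) can be looked up first via GET /nifi-api/processors/{id}, which sidesteps the id-changes-after-redeploy concern if you resolve the processor by name at runtime.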

Maybe someone has a much simpler approach, with backpressure or something.

Regards,
Tom


RE: Groovy script

2021-02-24 Thread Tomislav Novosel
Hi Mike,

the attribute 'file_path' is not pointing to a folder only; it has the value 
/path/to/filename, so it is like /opt/data/folder/filename.txt. The attribute 
value is OK, I double-checked.

Tom
-Original Message-
From: Mike Thomsen  
Sent: 24 February 2021 18:00
To: users@nifi.apache.org
Subject: Re: Groovy script

If file_path is pointing to a folder as you said, it's going to check for the 
folder's existence. The fact that it's failing to return true there suggests 
that something is wrong with the path in the file_path attribute.

On Wed, Feb 24, 2021 at 11:47 AM Tomislav Novosel 
 wrote:
>
> Hi guys,
>
>
>
> I want to check if file exists with this groovy script:
>
>
>
> flowfile = session.get()
> if(!flowfile) return
>
> file_path = flowfile.getAttribute('file_path')
> File file = new File(file_path)
>
> if (file.exists()) {
>     session.transfer(flowfile, REL_FAILURE)
> } else {
>     session.transfer(flowfile, REL_SUCCESS)
> }
>
>
>
> and to route all files which exist to the FAILURE relationship, but all of 
> them go to SUCCESS. The file is definitely at the path in
>
> 'file_path', I checked.
>
>
>
> What am I doing wrong?
>
>
>
> Thanks,
>
>
>
> Tom


Groovy script

2021-02-24 Thread Tomislav Novosel
Hi guys,

I want to check if file exists with this groovy script:

flowfile = session.get()
if (!flowfile) return

file_path = flowfile.getAttribute('file_path')
File file = new File(file_path)

if (file.exists()) {
    session.transfer(flowfile, REL_FAILURE)
} else {
    session.transfer(flowfile, REL_SUCCESS)
}

and to route all files which exist to the FAILURE relationship, but all of them go 
to SUCCESS. The file is definitely at the path in
'file_path', I checked.

What am I doing wrong?

Thanks,

Tom


DMC and ListFile

2021-02-15 Thread Tomislav Novosel
Hi team,

is Redis a good alternative to use as a DMC with the ListFile processor in 
'Tracking Entities' mode?
In general, can ListFile use an external cache service located on a separate server?

Something similar is described here by Bryan:
https://bryanbende.com/development/2017/10/09/apache-nifi-redis-integration

Thanks,
Tom


Monitoring big directory tree

2021-02-02 Thread Tomislav Novosel
Hi guys,

I have following situation:

There is an SMB-mounted folder on one NiFi worker and it has many subfolders with 
subfolders (the depth of nesting is not known in advance).
If a new file appears in that directory tree, or a file is moved, or an old file is 
copied/moved, the modification timestamp changes. That is achieved
with some other tools, configs etc.

What is the best way to list new/updated/old-but-new files with NiFi, taking 
into account that there is an enormous number of subfolders and files in them (let's 
say millions)?

In the case of ListFile using 'Tracking Entities' I am concerned about the 
following:


  *   How big is the I/O if ListFile constantly checks the directory tree and 
all files?
It can be CRON-based so it doesn't run all the time, but if it is not, how does 
ListFile do that in the background?
  *   How big is the cache of listed entities, and what is in fact stored in the 
cache, just metadata or something more?
  *   What if the cache is not persisted and a NiFi restart occurs? Will the cache 
be inconsistent?
In case it is persisted, what if the restart occurs at the moment ListFile is 
checking new/old entities and their size, name, etc.?
What is the interval of persisting the cache, and is it related to the snapshots 
NiFi takes at configured intervals?

What is the best and most efficient way to do this? Maybe some extra tools 
or engines to use for finding the difference and
persisting the last known state, like Elasticsearch, or some DB maybe?
Or to construct the list of paths which need to be fetched using some Python 
scripts?

The constraint here is shared (mounted) folders, and even if the modification date 
is changed for every new/updated file,
how do we efficiently monitor a big directory tree, or efficiently trigger 
ListFile (the NiFi flow) to fetch new and updated files?
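As a rough model of what timestamp-based listing costs, a plain walk that compares modification times against a checkpoint looks like the sketch below; ListFile's internals differ, so this is only to reason about the I/O involved:

```python
import os

def changed_files(root, since_ts):
    """Yield paths under `root` modified after the `since_ts` checkpoint.

    Every run stats every file, which is the I/O cost any
    timestamp-based listing strategy pays on a huge tree.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) > since_ts:
                    yield path
            except OSError:
                pass  # file vanished between listing and stat
```

With millions of files, this full stat pass per run is exactly why offloading the "what changed" question to an external index (DB, Elasticsearch, or an event source on the file server side) starts to look attractive.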

In the case of 'Tracking Entities', maybe having a separate standalone NiFi 
instance on a separate server with a configured CacheServer
to serve as the cache is not a bad idea?

Thanks in advance,

Tom





NiFi user and access rights

2020-02-12 Thread Tomislav Novosel
Hi guys,

I'm having this situation inside my company's projects. We are using NiFi as a
DataFlow platform and there are multiple projects.
Every project has files on a shared disk/folder from which one NiFi
instance (a standalone instance) is reading data.
The NiFi instance service is running under one generic user which has read
rights on every shared folder/project, and that is fine.

As there will be more and more projects, that one generic user would need to
have read rights on all shared disks/folders of all projects. So which
is the better solution:


   1. To have one NiFi instance running with one generic user which has
   read rights on all shared disks/folders. From a security standpoint this is not
   ok, since the shared folders belong to various customers. The data volume and
   load are not too big for a single standalone NiFi instance.
   2. To have multiple NiFi instances on one server, each running under a
   different generic user, where every generic user maps to one customer's
   shared folder regarding read rights, a 1:1 relationship.

In the future there will be a need to secure the NiFi instances with SSL, and
perhaps to add more nodes and establish multi-tenancy.

Is there perhaps some third solution for this situation? How do you set up this
kind of data flow, where there are multiple data sources and security is
important?

Thanks in advance and best regards.

Tom


Merge and transform JSON

2020-01-15 Thread Tomislav Novosel
Hi Nifi Community,

I have a situation where I need to merge the JSON content of multiple flowfiles
into one JSON document (a single flowfile).

The JSON in every flowfile looks like this:

{
  "analyses": "prep_array",
  "args": "prep_array",
  "scriptId": "142",
  "libIds": "141",
  "job_name": "my_demo_job",
  "project_id": "23",
  "configId": "5",
  "file_id": "150",
  "analysis_name": "my_analysis_name",
  "source_id": "12",
  "customer_id": "16"
}

and every flowfile differs from the others in only one field: file_id. I want to
merge them all into one JSON with this structure:

{
  "analyses": "prep_array",
  "args": "prep_array",
  "scriptId": "142",
  "libIds": "141",
  "job_name": "my_demo_job",
  "project_id": "23",
  "configId": "5",
  "file_id": [
150,
151,
152,
153,
154
  ],
  "analysis_name": "my_analysis_name",
  "source_id": "12",
  "customer_id": "16"
}
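For reference, the merge itself is small once the payloads are gathered; a
minimal Python sketch (e.g. for an ExecuteScript or ExecuteStreamCommand
approach), assuming the individual JSON payloads have been collected into a list
of dicts (the function and variable names below are illustrative):

```python
import json

def merge_file_ids(payloads):
    """Collapse JSON objects that differ only in 'file_id' into one
    object whose 'file_id' is the list of all ids (as integers)."""
    merged = dict(payloads[0])  # copy the shared fields from the first payload
    merged["file_id"] = [int(p["file_id"]) for p in payloads]
    return merged

# Hypothetical payloads where only 'file_id' differs:
docs = [{"job_name": "my_demo_job", "file_id": "150"},
        {"job_name": "my_demo_job", "file_id": "151"}]
print(json.dumps(merge_file_ids(docs)))
# {"job_name": "my_demo_job", "file_id": [150, 151]}
```

One common way to gather the payloads in a flow is to first combine the
flowfiles with MergeContent (header "[", demarcator ",", footer "]" to produce a
JSON array) and then apply a script like this to the merged content.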

What is the best way to do it? It sounds a little bit tricky to me, but I hope
I'm wrong.

Please can someone give me advice or some guidelines?

Thanks in advance,
Tom


Re: Nifi errors - FetchFile and UnpackContent

2019-10-11 Thread Tomislav Novosel
Any more suggestions for this situation?

Thanks,
Tom

On Thu, 3 Oct 2019 at 19:54, Tomislav Novosel  wrote:

> Hi Jeff,
>
> None of this is applied in pipeline and FetchFile processor.
> It is not on cluster, it runs only on one standalone Nifi instance.
> Completion strategy is on None, nor deleting, nor Moving.
>
> Only thing that can be is that someone else uses the file at the same time
> because that shared disk is used by other people to
> who are reading the files and doing some analysis.
>
> Can that be also the cause for Truncated ZIP file on UnpackContent
> processor?
>
> I applied loopback relationship on that processors for failure flowfiles
> to retry on failure.
>
> Thanks.
> Tom
>
> On Thu, 3 Oct 2019 at 17:18, Jeff  wrote:
>
>> Hello Tomislav,
>>
>> Are these processors running in a multi-node cluster?  Is FetchFile
>> downstream from a ListFile processor that is scheduled to run on all nodes
>> versus Primary Node only?  Is FetchFile's Completion Strategy set to "Move
>> File" or "Delete File"?  Typically, source processors should be scheduled
>> to run on the primary node, otherwise when reading from the same source
>> across multiple nodes, for example a shared network drive, each source
>> processor might pull the same data.  In a situation like this, the same
>> file could be listed by each node, and the FetchFile processor on each node
>> may attempt to fetch the same file.
>>
>> If you set the source processor to run on Primary Node only, you can
>> load-balance the connection between the source processor and FetchFile to
>> distribute the load of fetching the files across the cluster.
>>
>> On Thu, Oct 3, 2019 at 2:32 AM Tomislav Novosel 
>> wrote:
>>
>>> Hi all,
>>>
>>> I'm getting errors from FetchFile and UnpackContent processors.
>>> I have pipeline where I fetch zip files as they come continuously on
>>> shared network drive
>>> with Minimum file age set to 30 sec to avoid fetching file before it is
>>> written to disk completely.
>>>
>>> Sometimes I get this error from FetchFile:
>>>
>>> FetchFile[id=c741187c-1172-1166-e752-1f79197a8029] Could not fetch file
>>> \\avl01\ATGRZ\TestFactory\02 Dep Service\01
>>> Processdata\Backup\dfs_atfexport\MANA38\ANA_12_BPE7347\ANA_12_BPE7347_TDL_HL_1\measurement_file.atf.zip
>>> from file system for
>>> StandardFlowFileRecord[uuid=e7a5e3c4-0981-4ff3-85ea-91e41f0c3c0e,claim=,offset=0,name=PEI_BPE7347_TDLHL1new_826_20191001161312.atf.zip,size=0]
>>> because the existence of the file cannot be verified; routing to failure
>>>
>>>
>>> And from UnpackContent sometimes I get this error:
>>>
>>>
>>> UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d] Unable to unpack
>>> StandardFlowFileRecord[uuid=4a019d58-fe45-4276-a161-e46cd8b1667c,claim=StandardContentClaim
>>> [resourceClaim=StandardResourceClaim[id=1570052741201-5000,
>>> container=default, section=904], offset=1651,
>>> length=28417768],offset=0,name=measurement.atf.zip,size=28417768] due to
>>> IOException thrown from
>>> UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d]:
>>> java.io.IOException: Truncated ZIP file; routing to failure:
>>>
>>> org.apache.nifi.processor.exception.ProcessException: IOException thrown
>>> from UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d]:
>>> java.io.IOException: Truncated ZIP file
>>>
>>>
>>> After getting this error from UnpackContent I tried to fetch file again
>>> and to unpack it. It went well, without any errors.
>>> So what does this errors mean? I spoke to colleagues who are using this
>>> files on the source side and they said files are ok, not corrupted or
>>> something.
>>>
>>> Please help or give advice.
>>>
>>> Thanks in advance.
>>> Tom
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>


Re: Nifi errors - FetchFile and UnpackContent

2019-10-03 Thread Tomislav Novosel
Hi Jeff,

None of this applies to my pipeline and FetchFile processor.
It is not on a cluster; it runs on a single standalone NiFi instance.
The Completion Strategy is set to None, neither deleting nor moving.

The only thing it could be is that someone else uses the file at the same time,
because that shared disk is used by other people
who read the files and do some analysis.

Can that also be the cause of the 'Truncated ZIP file' error on the
UnpackContent processor?

I applied a loopback relationship on those processors so that failure flowfiles
are retried.

Thanks.
Tom

On Thu, 3 Oct 2019 at 17:18, Jeff  wrote:

> Hello Tomislav,
>
> Are these processors running in a multi-node cluster?  Is FetchFile
> downstream from a ListFile processor that is scheduled to run on all nodes
> versus Primary Node only?  Is FetchFile's Completion Strategy set to "Move
> File" or "Delete File"?  Typically, source processors should be scheduled
> to run on the primary node, otherwise when reading from the same source
> across multiple nodes, for example a shared network drive, each source
> processor might pull the same data.  In a situation like this, the same
> file could be listed by each node, and the FetchFile processor on each node
> may attempt to fetch the same file.
>
> If you set the source processor to run on Primary Node only, you can
> load-balance the connection between the source processor and FetchFile to
> distribute the load of fetching the files across the cluster.
>
> On Thu, Oct 3, 2019 at 2:32 AM Tomislav Novosel 
> wrote:
>
>> Hi all,
>>
>> I'm getting errors from FetchFile and UnpackContent processors.
>> I have pipeline where I fetch zip files as they come continuously on
>> shared network drive
>> with Minimum file age set to 30 sec to avoid fetching file before it is
>> written to disk completely.
>>
>> Sometimes I get this error from FetchFile:
>>
>> FetchFile[id=c741187c-1172-1166-e752-1f79197a8029] Could not fetch file
>> \\avl01\ATGRZ\TestFactory\02 Dep Service\01
>> Processdata\Backup\dfs_atfexport\MANA38\ANA_12_BPE7347\ANA_12_BPE7347_TDL_HL_1\measurement_file.atf.zip
>> from file system for
>> StandardFlowFileRecord[uuid=e7a5e3c4-0981-4ff3-85ea-91e41f0c3c0e,claim=,offset=0,name=PEI_BPE7347_TDLHL1new_826_20191001161312.atf.zip,size=0]
>> because the existence of the file cannot be verified; routing to failure
>>
>>
>> And from UnpackContent sometimes I get this error:
>>
>>
>> UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d] Unable to unpack
>> StandardFlowFileRecord[uuid=4a019d58-fe45-4276-a161-e46cd8b1667c,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1570052741201-5000,
>> container=default, section=904], offset=1651,
>> length=28417768],offset=0,name=measurement.atf.zip,size=28417768] due to
>> IOException thrown from
>> UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d]:
>> java.io.IOException: Truncated ZIP file; routing to failure:
>>
>> org.apache.nifi.processor.exception.ProcessException: IOException thrown
>> from UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d]:
>> java.io.IOException: Truncated ZIP file
>>
>>
>> After getting this error from UnpackContent I tried to fetch file again
>> and to unpack it. It went well, without any errors.
>> So what does this errors mean? I spoke to colleagues who are using this
>> files on the source side and they said files are ok, not corrupted or
>> something.
>>
>> Please help or give advice.
>>
>> Thanks in advance.
>> Tom
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>


Nifi errors - FetchFile and UnpackContent

2019-10-03 Thread Tomislav Novosel
Hi all,

I'm getting errors from the FetchFile and UnpackContent processors.
I have a pipeline where I fetch zip files as they continuously arrive on a
shared network drive,
with Minimum File Age set to 30 sec to avoid fetching a file before it is
completely written to disk.

Sometimes I get this error from FetchFile:

FetchFile[id=c741187c-1172-1166-e752-1f79197a8029] Could not fetch file
\\avl01\ATGRZ\TestFactory\02 Dep Service\01
Processdata\Backup\dfs_atfexport\MANA38\ANA_12_BPE7347\ANA_12_BPE7347_TDL_HL_1\measurement_file.atf.zip
from file system for
StandardFlowFileRecord[uuid=e7a5e3c4-0981-4ff3-85ea-91e41f0c3c0e,claim=,offset=0,name=PEI_BPE7347_TDLHL1new_826_20191001161312.atf.zip,size=0]
because the existence of the file cannot be verified; routing to failure


And from UnpackContent sometimes I get this error:


UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d] Unable to unpack
StandardFlowFileRecord[uuid=4a019d58-fe45-4276-a161-e46cd8b1667c,claim=StandardContentClaim
[resourceClaim=StandardResourceClaim[id=1570052741201-5000,
container=default, section=904], offset=1651,
length=28417768],offset=0,name=measurement.atf.zip,size=28417768] due to
IOException thrown from
UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d]:
java.io.IOException: Truncated ZIP file; routing to failure:

org.apache.nifi.processor.exception.ProcessException: IOException thrown
from UnpackContent[id=0164106c-d3b7-1e3f-c770-6e6e07f9259d]:
java.io.IOException: Truncated ZIP file


After getting this error from UnpackContent I tried to fetch the file again and
unpack it. It went well, without any errors.
So what do these errors mean? I spoke to colleagues who use these files on the
source side, and they said the files are OK, not corrupted or
anything.
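Both symptoms are consistent with reading an archive while it is still being
written, or while another process holds it. Beyond Minimum File Age, one extra
guard is to verify that the archive is complete before unpacking; a minimal
sketch in Python (not part of the original flow, the function name is
illustrative):

```python
import zipfile

def is_complete_zip(path):
    """Return True only if 'path' is a fully written, readable zip archive."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() returns the first corrupt member name, or None if all
            # members pass their CRC check
            return zf.testzip() is None
    except (zipfile.BadZipFile, OSError):
        # a partially written file typically fails to open at all
        return False
```

A check like this could run in a script step between FetchFile and
UnpackContent, routing incomplete archives back for retry.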

Please help or give advice.

Thanks in advance.
Tom


Re: Reading only latest file

2019-08-27 Thread Tomislav Novosel
Yes, I've got the same idea. Thanks.
Tom

On Tue, 27 Aug 2019, 10:49 Arpad Boda,  wrote:

> Quite a special case, I would go for executescript proc and do the logic
> you need in Python.
>
> On Tue, Aug 27, 2019 at 8:45 AM Tomislav Novosel 
> wrote:
>
>> Hi Arpad,
>>
>> The thing is I don't have exact file creation frequency and it is not
>> always the same. Also, files was created months ago.
>>
>> Regards,
>> Tom
>>
>> On Mon, 26 Aug 2019 at 11:59, Arpad Boda  wrote:
>>
>>> Tom,
>>>
>>> What about ListFile->FetchFile flowchain, ListFile configured with a
>>> maximum file age lower than the frequency of the file creation you have?
>>>
>>> Regards,
>>> Arpad
>>>
>>> On Mon, Aug 26, 2019 at 10:19 AM Tomislav Novosel 
>>> wrote:
>>>
>>>> Any ideas?
>>>> Thanks,
>>>> Tom
>>>>
>>>> On Thu, 22 Aug 2019 at 16:06, Tomislav Novosel 
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I have scenario where I need to read only the latest(the youngest)
>>>>> file according to creation date. The files are:
>>>>>
>>>>> Load_Dump_1.001.2019-07-22_17-22-45994.ifl - creation date
>>>>> 2019-07-22T17:24:44+0200
>>>>> Load_Dump_1.001.2019-07-22_17-25-09132.ifl - creation date
>>>>> 2019-07-22T17:26:14+0200
>>>>>
>>>>> So I need to fetch only the second file which is the youngest.
>>>>> I have multiple folders with files and they are filtered by extension
>>>>> (.ifl).
>>>>>
>>>>> How can I filter them and fetch only the youngest .ifl file from every
>>>>> folder?
>>>>>
>>>>> BR,
>>>>> Tom
>>>>>
>>>>>


Re: Reading only latest file

2019-08-27 Thread Tomislav Novosel
Hi Arpad,

The thing is, I don't have an exact file creation frequency, and it is not
always the same. Also, the files were created months ago.

Regards,
Tom

On Mon, 26 Aug 2019 at 11:59, Arpad Boda  wrote:

> Tom,
>
> What about ListFile->FetchFile flowchain, ListFile configured with a
> maximum file age lower than the frequency of the file creation you have?
>
> Regards,
> Arpad
>
> On Mon, Aug 26, 2019 at 10:19 AM Tomislav Novosel 
> wrote:
>
>> Any ideas?
>> Thanks,
>> Tom
>>
>> On Thu, 22 Aug 2019 at 16:06, Tomislav Novosel 
>> wrote:
>>
>>> Hi all,
>>>
>>> I have scenario where I need to read only the latest(the youngest) file
>>> according to creation date. The files are:
>>>
>>> Load_Dump_1.001.2019-07-22_17-22-45994.ifl - creation date
>>> 2019-07-22T17:24:44+0200
>>> Load_Dump_1.001.2019-07-22_17-25-09132.ifl - creation date
>>> 2019-07-22T17:26:14+0200
>>>
>>> So I need to fetch only the second file which is the youngest.
>>> I have multiple folders with files and they are filtered by extension
>>> (.ifl).
>>>
>>> How can I filter them and fetch only the youngest .ifl file from every
>>> folder?
>>>
>>> BR,
>>> Tom
>>>
>>>


Re: Reading only latest file

2019-08-26 Thread Tomislav Novosel
Any ideas?
Thanks,
Tom

On Thu, 22 Aug 2019 at 16:06, Tomislav Novosel  wrote:

> Hi all,
>
> I have scenario where I need to read only the latest(the youngest) file
> according to creation date. The files are:
>
> Load_Dump_1.001.2019-07-22_17-22-45994.ifl - creation date
> 2019-07-22T17:24:44+0200
> Load_Dump_1.001.2019-07-22_17-25-09132.ifl - creation date
> 2019-07-22T17:26:14+0200
>
> So I need to fetch only the second file which is the youngest.
> I have multiple folders with files and they are filtered by extension
> (.ifl).
>
> How can I filter them and fetch only the youngest .ifl file from every
> folder?
>
> BR,
> Tom
>
>


Reading only latest file

2019-08-22 Thread Tomislav Novosel
Hi all,

I have a scenario where I need to read only the latest (youngest) file
according to creation date. The files are:

Load_Dump_1.001.2019-07-22_17-22-45994.ifl - creation date
2019-07-22T17:24:44+0200
Load_Dump_1.001.2019-07-22_17-25-09132.ifl - creation date
2019-07-22T17:26:14+0200

So I need to fetch only the second file, which is the youngest.
I have multiple folders with files, and they are filtered by extension
(.ifl).

How can I filter them and fetch only the youngest .ifl file from every
folder?
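Since ListFile alone cannot express "newest file per folder", one workaround
discussed in the replies is doing the logic in a script processor. A minimal
Python sketch of the selection logic (the function name and the use of st_mtime
are assumptions; true creation time may need st_ctime or st_birthtime depending
on the platform):

```python
import os

def youngest_per_folder(root, ext=".ifl"):
    """Map each directory under 'root' to its most recent file with the
    given extension, by modification time."""
    latest = {}
    for dirpath, _dirs, files in os.walk(root):
        candidates = [os.path.join(dirpath, f) for f in files if f.endswith(ext)]
        if candidates:
            latest[dirpath] = max(candidates, key=lambda p: os.stat(p).st_mtime)
    return latest
```

The resulting paths could then be emitted as flowfile attributes and handed to
FetchFile.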

BR,
Tom


JSON argument

2019-06-26 Thread Tomislav Novosel
Hi all,

I have a case where I'm triggering a Python script with the ExecuteStreamCommand
processor, and one of the parameters needs to be a JSON string, e.g.
'{"foo":"bar"}', so that when the script receives it, it can convert it into a
Python dictionary.

The problem is an incorrect handover of the parameter as a JSON string: NiFi
gives the script the parameter without double or single quotes, e.g.
{foo:bar}. Why is that? Is there any workaround?

I checked this with a simple test that writes the parameter value into a txt
file in the directory where the script is located.

I also tried the expression escapeJson(): for the given parameter
'{"foo":"bar"}', the expression converts it into '{\"foo\":\"bar\"}', but the
ExecuteStreamCommand processor receives it as '{\foo\:\bar\}'.
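One common workaround for argv quoting problems is to avoid the argument list
entirely: ExecuteStreamCommand streams the incoming flowfile content to the
command's stdin, so the JSON can be passed as content and parsed there. A
minimal receiving sketch (names are illustrative):

```python
import io
import json

def read_payload(stream):
    """Parse one JSON document from a stream into a Python dict."""
    return json.load(stream)

# In the real script the stream would be sys.stdin, fed by
# ExecuteStreamCommand streaming the flowfile content:
#     params = read_payload(sys.stdin)
demo = read_payload(io.StringIO('{"foo": "bar"}'))
print(demo["foo"])  # bar
```

This sidesteps both the shell's and the processor's handling of quotes in
command arguments.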

Thank you.
BR,
Tom


Re: InvokeHTTP with SSL

2019-06-10 Thread Tomislav Novosel
Yeah, I was thinking about that. But what if the service doesn't have any
certificates at all?
I think the service listens on the K8S cluster without SSL certs, inside our
corporate network.

BR,
Tom

On Mon, 3 Jun 2019 at 16:16, Bryan Bende  wrote:

> Hello,
>
> You should be specifying an SSL Context Service in the processor which
> points to a truststore that trusts the certificate of the service you
> are calling.
>
> Alternatively, if the CA certs system truststore trusts the service
> cert then it should also work.
>
> Thanks,
>
> Bryan
>
> On Mon, Jun 3, 2019 at 10:14 AM Tomislav Novosel 
> wrote:
> >
> > Hi all,
> >
> > I have a case where I need to send POST request on one enpoint which is
> located
> > on K8S cluster and behind reverse proxy. Only HTTPS can be used.
> > If I put value of endpoint using https:// I get error 'Unable to find
> valid certification path to requested target'.
> > I spoke to my admin/devops guy and he says there is no other way to
> access that endpoint other than URL he gave me.
> >
> > Is there a way to bypass SSL verification or something else?
> >
> > Thanks,
> > BR,
> > Tom
>


InvokeHTTP with SSL

2019-06-03 Thread Tomislav Novosel
Hi all,

I have a case where I need to send a POST request to an endpoint which is
located
on a K8S cluster, behind a reverse proxy. Only HTTPS can be used.
If I put in the endpoint value using https:// I get the error 'Unable to find
valid certification path to requested target'.
I spoke to my admin/devops guy and he says there is no other way to access
that endpoint than the URL he gave me.

Is there a way to bypass SSL verification or something else?
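Rather than bypassing verification, the usual fix matches Bryan's reply above:
put the service's certificate into a truststore and reference it from an SSL
Context Service on InvokeHTTP. A sketch of the commands (hostname, file names
and passwords are placeholders, not the actual environment):

```
# Grab the certificate the endpoint actually presents
# (myservice.example.com:443 is a placeholder):
openssl s_client -connect myservice.example.com:443 \
    -servername myservice.example.com </dev/null | openssl x509 -outform PEM > service.pem

# Import it into a truststore that an SSLContextService can point at:
keytool -importcert -noprompt -alias myservice \
    -file service.pem -keystore truststore.jks -storepass changeit
```

If the reverse proxy presents no certificate at all, HTTPS cannot work there,
and that would need to be fixed on the proxy side first.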

Thanks,
BR,
Tom


Re: MergeContent processor

2019-05-31 Thread Tomislav Novosel
Yes Mark, that helps a lot.

Thanks.
Tom

On Fri, 31 May 2019 at 14:49, Mark Payne  wrote:

> Tom,
>
> You have the Minimum Number of Entries set to 2 and Minimum Group Size set
> to 1 MB. That means that
> as soon as you have 2 files queued up that total at least 1 MB, it will
> create an output FlowFile. That output
> will contain as much data as it can, based on what is queued up at that
> moment. So if you pull in 2 files, then
> a few seconds later pull in 2 more, then a few seconds later 2 more, then
> 2 more, you're going to end up with
> 8 output FlowFiles, each containing 2 files.
>
> As I understand it, the idea is to periodically pull in everything in the
> directory, merge them together, and then
> move on. Later, you'll do another listing, merge those together, and move
> on, correct?
>
> The problem here is that, as you said, you don't know how many files are
> in the directory, so MergeContent doesn't
> know how many files to wait for, before merging. So one possibility would
> be to just set the Minimum Number of Entries
> and Minimum Group Size to something much larger. Then set "Max Bin Age" to
> say 30 seconds or 60 seconds. That
> way, as soon as MergeContent sees a single file, it will wait 30 seconds
> or 60 seconds or whatever you have set, and
> then merge together all of the files that it has queued up.
>
> Does that help?
>
> Thanks
> -Mark
>
>
> On May 31, 2019, at 8:27 AM, Tomislav Novosel 
> wrote:
>
> I forgot to mention that I put Correlation Attribute Name as attribute
> name which holds directory name from which that 8 files coming from.
> And there is not always 8 files, this is just for example. The number of
> files changes always as the files come in the folders.
>
> On Fri, 31 May 2019 at 14:24, Tomislav Novosel 
> wrote:
>
>> Hi all,
>>
>> I need to create one flowfile from multiple flowfiles which are of files
>> in one directory. So if I have e.g. 8 files in directory, I want to merge
>> them and the output I want is 1 flowfile for further processing ( I want to
>> extract folder name and path of that 8 files).
>>
>> I tried with MergeContent processor and this is my setup.
>>
>> 
>>
>> But every time i run the flow I get multiple flowfiles in the output
>> queue. Sometimes 2, sometimes 3 or even 4. What am I doing wrong?
>> Or is there some other way to do this in Nifi?
>>
>> Thanks in advance,
>> BR.
>> Tom
>>
>
>


Re: MergeContent processor

2019-05-31 Thread Tomislav Novosel
I forgot to mention that I set Correlation Attribute Name to the attribute which
holds the directory name that those 8 files come from.
And there are not always 8 files; that was just an example. The number of files
changes as files arrive in the folders.

On Fri, 31 May 2019 at 14:24, Tomislav Novosel  wrote:

> Hi all,
>
> I need to create one flowfile from multiple flowfiles which are of files
> in one directory. So if I have e.g. 8 files in directory, I want to merge
> them and the output I want is 1 flowfile for further processing ( I want to
> extract folder name and path of that 8 files).
>
> I tried with MergeContent processor and this is my setup.
>
> [image: image.png]
>
> But every time i run the flow I get multiple flowfiles in the output
> queue. Sometimes 2, sometimes 3 or even 4. What am I doing wrong?
> Or is there some other way to do this in Nifi?
>
> Thanks in advance,
> BR.
> Tom
>


MergeContent processor

2019-05-31 Thread Tomislav Novosel
Hi all,

I need to create one flowfile from multiple flowfiles corresponding to the files
in one directory. So if I have e.g. 8 files in a directory, I want to merge them,
and the output I want is 1 flowfile for further processing (I want to
extract the folder name and path of those 8 files).

I tried with MergeContent processor and this is my setup.

[image: image.png]

But every time I run the flow I get multiple flowfiles in the output queue.
Sometimes 2, sometimes 3, or even 4. What am I doing wrong?
Or is there some other way to do this in NiFi?

Thanks in advance,
BR.
Tom
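Mark's suggestion in the replies above boils down to making Max Bin Age the
flush trigger. A sketch of the relevant MergeContent properties (the values are
illustrative, not a recommendation for every flow):

```
Merge Strategy             : Bin-Packing Algorithm
Correlation Attribute Name : <attribute holding the directory name>
Minimum Number of Entries  : 10000   (deliberately higher than any batch)
Minimum Group Size         : 1 GB    (deliberately higher than any batch)
Max Bin Age                : 60 sec  (forces the bin to flush)
```

With the minimums set higher than any realistic batch, a bin only flushes when
it ages out, so all files listed in one pass end up in one output flowfile per
directory.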


Re: When do I download the older releases of Apache NiFi

2019-02-27 Thread Tomislav Novosel
Hi Vijay,

here you can find older releases:
https://archive.apache.org/dist/nifi/

BR,
Tom

On Wed, 27 Feb 2019, 18:52 Vijay Chhipa,  wrote:

> Hi,
>
> Noticed that with the release of NiFi 1.9.0  there is not a link to
> download version 1.7.1 and older,
>
> Is there a link to the archived versions?
>
> Thanks
>
>


Re: How to integrate Secured Ragistry with Secured Nifi

2019-02-19 Thread Tomislav Novosel
Any ideas?

Tom

On Tue, 19 Feb 2019 at 11:08, Mike Thomsen  wrote:

> Copy pasta.
>
> On Tue, Feb 19, 2019 at 12:51 AM Tomislav Novosel 
> wrote:
>
>> Hi Mike, Kevin
>>
>> Thank you for your answers, I appreciate it.
>>
>> @Mike, why are you setting WEB_HTTP_HOST and WEB_HTTP_PORT when you are
>> using secured nifi? Shouldnt that be empty and only HTTPS host and port
>> used?
>>
>> BR,
>> Tom
>>
>> On Mon, 18 Feb 2019, 23:56 Mike Thomsen >
>>> Tom,
>>>
>>> > Note: both Registry and Nifi are in Docker containers on the same
>>> node. I tried with IP address, but nothing.
>>>
>>> Each docker container has its own IP address. You need to link the two
>>> containers. I always use Docker Compose, so I can't help you on how to set
>>> it up manually. That said, I did a sample last year connecting a few
>>> different NiFi nodes and a registry w/ SSL here:
>>>
>>>
>>> https://github.com/MikeThomsen/nifi-docker-compose/blob/master/docker-compose-registry.yml
>>>
>>> I can't remember if I kept the LDAP docker container referenced in it,
>>> but you should be able to look at it and figure out how to link everything
>>> up from that with Docker Compose.
>>>
>>> Mike
>>>
>>> On Mon, Feb 18, 2019 at 12:00 PM Kevin Doran  wrote:
>>>
>>>>
>>>> Hi Tom,
>>>>
>>>> Given that you are getting a Connection refused exception and not an
>>>> HTTP 401 or 403, I suspect that the problem is networking related and not
>>>> authentication/authorization.
>>>>
>>>> Are the two docker containers on the same docker network? Can you
>>>> resolve/ping the Registry container from the NiFi container, and when you
>>>> create the Registry client in NiFi, are you using the hostname that the
>>>> NiFi server/container would use to address Registry (ie, not the host a
>>>> REgistry UI use might use if you are using port mapping to the docker
>>>> container with the host).
>>>>
>>>> Here is an example repo in which I have an example of connecting NiFi
>>>> and Registry and docker conatiners using docker-compose:
>>>>
>>>> https://github.com/kevdoran/flowops
>>>>
>>>> Hope this helps,
>>>> Kevin
>>>>
>>>>
>>>> On February 18, 2019 at 10:08:54, Tomislav Novosel (
>>>> to.novo...@gmail.com) wrote:
>>>> > Hi all,
>>>> >
>>>> > I generated standalone certificate with nifi-toolkit for my two Nifi
>>>> > instances and for Nifi registry instance. All are on the same domain
>>>> so I
>>>> > used one certificate and its credentials for properties file (trustore
>>>> > path, keystore path, keystore passw, trustore passw).
>>>> >
>>>> > Auth is configured through domain LDAP server and everything works.
>>>> >
>>>> > On both Nifi node and Registry node I configured authorizers.xml file
>>>> on
>>>> > property "Node Identity 1" with value from keystore.jks on "Owner"
>>>> > attribute.
>>>> >
>>>> > Owner: <>
>>>> > In Nifi registry I added that as user and gave rights to read and
>>>> modify
>>>> > buckets.
>>>> >
>>>> > When I add Registry Client on Nifi node and Hit Start version control
>>>> on
>>>> > process group I got error:
>>>> >
>>>> > Error
>>>> >
>>>> > Unable to obtain listing of buckets: java.net.ConnectException:
>>>> Connection
>>>> > refused (Connection refused)
>>>> >
>>>> >
>>>> > I missed something in configuration, please help me.
>>>> >
>>>> > Note: both Registry and Nifi are in Docker containers on the same
>>>> node. I
>>>> > tried with IP address, but nothing.
>>>> >
>>>> >
>>>> > Thank you,
>>>> >
>>>> > Tom
>>>> >
>>>>
>>>>


Re: How to integrate Secured Ragistry with Secured Nifi

2019-02-19 Thread Tomislav Novosel
Hi again,

@Kevin, I checked everything you mentioned: the containers are on the same
Docker network, and from the NiFi container I can ping the host where the NiFi
Registry is, although they are on the same server, so I believe that is
effectively localhost.
When adding the NiFi Registry client I'm using the hostname from
nifi-registry.properties, which is the same hostname the registry and NiFi
instances run on (because I added the --hostname flag when running the
containers, and that hostname is the same as the server hostname where the
containers are running).

I also checked yours and Mike's docker-compose files, and everything is
pretty much the same regarding hosts and ports.

BR,
Tom
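For reference, a minimal docker-compose sketch of the single-network setup Kevin
and Mike describe (image names, hostnames and ports are illustrative, not the
thread's actual configuration):

```
version: "3"
services:
  nifi:
    image: apache/nifi
    hostname: nifi.example.com
    networks: [flow]
  nifi-registry:
    image: apache/nifi-registry
    hostname: registry.example.com
    networks: [flow]
networks:
  flow: {}
```

Inside NiFi, the registry client URL should then use the registry's
service/container hostname (e.g. https://registry.example.com:18443), never
localhost, since localhost inside the NiFi container is the NiFi container
itself.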

On Tue, 19 Feb 2019 at 06:51, Tomislav Novosel  wrote:

> Hi Mike, Kevin
>
> Thank you for your answers, I appreciate it.
>
> @Mike, why are you setting WEB_HTTP_HOST and WEB_HTTP_PORT when you are
> using secured nifi? Shouldnt that be empty and only HTTPS host and port
> used?
>
> BR,
> Tom
>
> On Mon, 18 Feb 2019, 23:56 Mike Thomsen 
>> Tom,
>>
>> > Note: both Registry and Nifi are in Docker containers on the same node.
>> I tried with IP address, but nothing.
>>
>> Each docker container has its own IP address. You need to link the two
>> containers. I always use Docker Compose, so I can't help you on how to set
>> it up manually. That said, I did a sample last year connecting a few
>> different NiFi nodes and a registry w/ SSL here:
>>
>>
>> https://github.com/MikeThomsen/nifi-docker-compose/blob/master/docker-compose-registry.yml
>>
>> I can't remember if I kept the LDAP docker container referenced in it,
>> but you should be able to look at it and figure out how to link everything
>> up from that with Docker Compose.
>>
>> Mike
>>
>> On Mon, Feb 18, 2019 at 12:00 PM Kevin Doran  wrote:
>>
>>>
>>> Hi Tom,
>>>
>>> Given that you are getting a Connection refused exception and not an
>>> HTTP 401 or 403, I suspect that the problem is networking related and not
>>> authentication/authorization.
>>>
>>> Are the two docker containers on the same docker network? Can you
>>> resolve/ping the Registry container from the NiFi container, and when you
>>> create the Registry client in NiFi, are you using the hostname that the
>>> NiFi server/container would use to address Registry (ie, not the host a
>>> REgistry UI use might use if you are using port mapping to the docker
>>> container with the host).
>>>
>>> Here is an example repo in which I have an example of connecting NiFi
>>> and Registry and docker conatiners using docker-compose:
>>>
>>> https://github.com/kevdoran/flowops
>>>
>>> Hope this helps,
>>> Kevin
>>>
>>>
>>> On February 18, 2019 at 10:08:54, Tomislav Novosel (to.novo...@gmail.com)
>>> wrote:
>>> > Hi all,
>>> >
>>> > I generated standalone certificate with nifi-toolkit for my two Nifi
>>> > instances and for Nifi registry instance. All are on the same domain
>>> so I
>>> > used one certificate and its credentials for properties file (trustore
>>> > path, keystore path, keystore passw, trustore passw).
>>> >
>>> > Auth is configured through domain LDAP server and everything works.
>>> >
>>> > On both Nifi node and Registry node I configured authorizers.xml file
>>> on
>>> > property "Node Identity 1" with value from keystore.jks on "Owner"
>>> > attribute.
>>> >
>>> > Owner: <>
>>> > In Nifi registry I added that as user and gave rights to read and
>>> modify
>>> > buckets.
>>> >
>>> > When I add Registry Client on Nifi node and Hit Start version control
>>> on
>>> > process group I got error:
>>> >
>>> > Error
>>> >
>>> > Unable to obtain listing of buckets: java.net.ConnectException:
>>> Connection
>>> > refused (Connection refused)
>>> >
>>> >
>>> > I missed something in configuration, please help me.
>>> >
>>> > Note: both Registry and Nifi are in Docker containers on the same
>>> node. I
>>> > tried with IP address, but nothing.
>>> >
>>> >
>>> > Thank you,
>>> >
>>> > Tom
>>> >
>>>
>>>


Re: How to integrate Secured Ragistry with Secured Nifi

2019-02-18 Thread Tomislav Novosel
Hi Mike, Kevin

Thank you for your answers, I appreciate it.

@Mike, why are you setting WEB_HTTP_HOST and WEB_HTTP_PORT when you are
running secured NiFi? Shouldn't those be empty, with only the HTTPS host and
port used?

BR,
Tom

On Mon, 18 Feb 2019, 23:56 Mike Thomsen  wrote:

> Tom,
>
> > Note: both Registry and Nifi are in Docker containers on the same node.
> I tried with IP address, but nothing.
>
> Each docker container has its own IP address. You need to link the two
> containers. I always use Docker Compose, so I can't help you on how to set
> it up manually. That said, I did a sample last year connecting a few
> different NiFi nodes and a registry w/ SSL here:
>
>
> https://github.com/MikeThomsen/nifi-docker-compose/blob/master/docker-compose-registry.yml
>
> I can't remember if I kept the LDAP docker container referenced in it, but
> you should be able to look at it and figure out how to link everything up
> from that with Docker Compose.
>
> Mike
>
> On Mon, Feb 18, 2019 at 12:00 PM Kevin Doran  wrote:
>
>>
>> Hi Tom,
>>
>> Given that you are getting a Connection refused exception and not an HTTP
>> 401 or 403, I suspect that the problem is networking related and not
>> authentication/authorization.
>>
>> Are the two docker containers on the same docker network? Can you
>> resolve/ping the Registry container from the NiFi container, and when you
>> create the Registry client in NiFi, are you using the hostname that the
>> NiFi server/container would use to address Registry (ie, not the host a
>> REgistry UI use might use if you are using port mapping to the docker
>> container with the host).
>>
>> Here is an example repo in which I have an example of connecting NiFi and
>> Registry and docker conatiners using docker-compose:
>>
>> https://github.com/kevdoran/flowops
>>
>> Hope this helps,
>> Kevin
>>
>>
>> On February 18, 2019 at 10:08:54, Tomislav Novosel (to.novo...@gmail.com)
>> wrote:
>> > Hi all,
>> >
>> > I generated standalone certificate with nifi-toolkit for my two Nifi
>> > instances and for Nifi registry instance. All are on the same domain so
>> I
>> > used one certificate and its credentials for properties file (trustore
>> > path, keystore path, keystore passw, trustore passw).
>> >
>> > Auth is configured through domain LDAP server and everything works.
>> >
>> > On both Nifi node and Registry node I configured authorizers.xml file on
>> > property "Node Identity 1" with value from keystore.jks on "Owner"
>> > attribute.
>> >
>> > Owner: <>
>> > In Nifi registry I added that as user and gave rights to read and modify
>> > buckets.
>> >
>> > When I add Registry Client on Nifi node and Hit Start version control on
>> > process group I got error:
>> >
>> > Error
>> >
>> > Unable to obtain listing of buckets: java.net.ConnectException:
>> Connection
>> > refused (Connection refused)
>> >
>> >
>> > I missed something in configuration, please help me.
>> >
>> > Note: both Registry and Nifi are in Docker containers on the same node.
>> I
>> > tried with IP address, but nothing.
>> >
>> >
>> > Thank you,
>> >
>> > Tom
>> >
>>
>>


How to integrate Secured Registry with Secured Nifi

2019-02-18 Thread Tomislav Novosel
Hi all,

I generated a standalone certificate with nifi-toolkit for my two Nifi
instances and for the Nifi registry instance. All are on the same domain, so I
used one certificate and its credentials for the properties file (truststore
path, keystore path, keystore password, truststore password).

Auth is configured through domain LDAP server and everything works.

On both Nifi node and Registry node I configured authorizers.xml file on
property "Node Identity 1" with value from keystore.jks on "Owner"
attribute.

Owner: <

Can't open summary-systems diagnostics

2019-02-14 Thread Tomislav Novosel
Hi all,

I installed Nifi in cluster mode with two nodes, both secured with
Kerberos for user auth.

When I open summary -> system diagnostics I get the error
"An unexpected error has occurred" in the UI,
and in the log:

 ERROR [NiFi Web Server-80406] o.a.nifi.web.api.config.ThrowableMapper An
unexpected error has occurred: java.lang.NoClassDefFoundError: Could not
initialize class sun.nio.fs.LinuxNativeDispatcher.

Everything else is working as normal.

BR,
Tom


Re: Nifi registry Kerberos Auth with Docker

2019-02-13 Thread Tomislav Novosel
Hi Kevin,

Thank you for your suggestions. I have now got everything working.
As you described, everything is now exactly as in the files you mentioned.

One strange thing: at the first startup of the container, I could log into the
UI without problems, but I could not add new users and policies. After I
refreshed the UI in the browser, I was able to do that. So only after refreshing. ??

Also, I'm not able to modify the privileges of my initial admin user (I mean
for myself), but for a newly added user I can.

I read on some forums that the sync between Nifi and AD can be slow. I'm on
my company's domain and there are a couple of hundred users.

BR,
Tom

On Wed, 13 Feb 2019, 15:29 Kevin Doran  Hi Tom,
>
> How are you configuring the various config files? Through the docker
> container's environment variables, or through modifying those files
> directly? If modifying those files, are you injecting them through a volume
> or something like that? Trying to determine if there is something else at
> play here overwritting your settings on startup...
>
> It sounds like you are able to configure authentication/login
> successfully, and are just running into a snag on the authorization /
> initial admin side of things.
>
> Try this:
>
> 1. In authorizers.xml, set the "Initial User Identity 1" and "Initial
> Admin Identity" properties to exactly match the user identity recognized by
> NiFi (the one you see in the upper-right corner of the UI after logging
> in). Make sure whitespace and capitalization all agree.
>
> 2. Delete users.xml and authorizations.xml files and restart NiFI Registry.
>
> If all goes successfully, your users.xml file should be regenerated to
> hold a user with an identity matching "Initial User Identity 1", and
> authorizations.xml should be regenerated to hold the policies for the
> "Initial Admin Identity".
>
> If you get that working, you can improve things a bit by configuring the
> LdapUserGroupProvider to sync users and groups from LDAP, letting you set
> policies in the UI without having to manually create users that match the
> LDAP directory users.
>
> Hope this helps,
> Kevin
>
>
> On February 13, 2019 at 03:56:52, Tomislav Novosel (to.novo...@gmail.com)
> wrote:
> > Also, FYI.
> >
> > If I set for INITIAL_ADMIN_IDENTITY my user's full DN,
> cn=...,ou=...,dc=...
> > I can also login into UI, but there is no properties button upper right
> in
> > the UI.
> >
> > [image: 1.PNG]
> >
> > If I set only USERNEMA to be u21g46, I can see properties button, but I
> > can't add new users.
> >
> > BR,
> > Tom
> >
> > On Fri, 8 Feb 2019 at 16:03, Bryan Bende wrote:
> >
> > > Thinking about it more, I guess if you are not trying to do spnego
> > > then that message from the logs is not really an error. The registry
> > > UI always tries the spnego end-point first and if it returns the
> > > conflict response (as the log says) then you get sent to the login
> > > page.
> > >
> > > Maybe try turning on debug logging by editing logback.xml > >
> name="org.apache.nifi.registry" level="INFO"/> and changing to DEBUG.
> > >
> > > On Fri, Feb 8, 2019 at 9:51 AM Tomislav Novosel
> > > wrote:
> > > >
> > > > Hi Bryan,
> > > >
> > > > I don't have this properties populated in Nifi registry instance
> > > > outside Docker (as a service on linux server), and everything works.
> > > >
> > > > What are this properties up to?
> > > >
> > > > Regards,
> > > > Tom
> > > >
> > > >
> > > >
> > > > On Fri, 8 Feb 2019 at 15:25, Bryan Bende wrote:
> > > >>
> > > >> The message about "Kerberos service ticket login not supported by
> this
> > > >> NiFi Registry" means that one of the following properties is not
> > > >> populated:
> > > >>
> > > >> nifi.registry.kerberos.spnego.principal=
> > > >> nifi.registry.kerberos.spnego.keytab.location=
> > > >>
> > > >> On Fri, Feb 8, 2019 at 8:20 AM Tomislav Novosel
> > > wrote:
> > > >> >
> > > >> > Hi Daniel,
> > > >> >
> > > >> > Ok, I see. Thanks for the answer.
> > > >> >
> > > >> > I switched to official Nifi registry image. I succeeded to spin up
> > > registry in docker container and to
> > > >> > setup Kerberos provider in identity-providers.xml

Re: Nifi registry Kerberos Auth with Docker

2019-02-13 Thread Tomislav Novosel
Also, FYI.

If I set INITIAL_ADMIN_IDENTITY to my user's full DN, cn=...,ou=...,dc=...,
I can also log into the UI, but there is no properties button in the upper
right of the UI.

[image: 1.PNG]

If I set only USERNAME to u21g46, I can see the properties button, but I
can't add new users.

BR,
Tom

On Fri, 8 Feb 2019 at 16:03, Bryan Bende  wrote:

> Thinking about it more, I guess if you are not trying to do spnego
> then that message from the logs is not really an error. The registry
> UI always tries the spnego end-point first and if it returns the
> conflict response (as the log says) then you get sent to the login
> page.
>
> Maybe try turning on debug logging by editing logback.xml  name="org.apache.nifi.registry" level="INFO"/> and changing to DEBUG.
>
> On Fri, Feb 8, 2019 at 9:51 AM Tomislav Novosel 
> wrote:
> >
> > Hi Bryan,
> >
> > I don't have this properties populated in Nifi registry instance
> > outside Docker (as a service on linux server), and everything works.
> >
> > What are this properties up to?
> >
> > Regards,
> > Tom
> >
> >
> >
> > On Fri, 8 Feb 2019 at 15:25, Bryan Bende  wrote:
> >>
> >> The message about "Kerberos service ticket login not supported by this
> >> NiFi Registry" means that one of the following properties is not
> >> populated:
> >>
> >> nifi.registry.kerberos.spnego.principal=
> >> nifi.registry.kerberos.spnego.keytab.location=
> >>
> >> On Fri, Feb 8, 2019 at 8:20 AM Tomislav Novosel 
> wrote:
> >> >
> >> > Hi Daniel,
> >> >
> >> > Ok, I see. Thanks for the answer.
> >> >
> >> > I switched to official Nifi registry image. I succeeded to spin up
> registry in docker container and to
> >> > setup Kerberos provider in identity-providers.xml. Also I configured
> authorizers.xml as per afficial Nifi documentation.
> >> >
> >> > I already have the same setup with Kerberos, but not in Docker
> container. And everything works like a charm.
> >> >
> >> > When I enter credentials, login does not pass. This is app log:
> >> >
> >> > 2019-02-08 12:52:30,568 INFO [NiFi Registry Web Server-14]
> o.a.n.r.w.m.IllegalStateExceptionMapper java.lang.IllegalStateException:
> Kerberos service ticket login not supported by this NiFi Registry.
> Returning Conflict response.
> >> > 2019-02-08 12:52:30,644 INFO [NiFi Registry Web Server-13]
> o.a.n.r.w.s.NiFiRegistrySecurityConfig Client could not be authenticated
> due to:
> org.springframework.security.authentication.AuthenticationCredentialsNotFoundException:
> An Authentication object was not found in the SecurityContext Returning 401
> response.
> >> > 2019-02-08 12:52:50,557 INFO [NiFi Registry Web Server-14]
> o.a.n.r.w.m.UnauthorizedExceptionMapper
> org.apache.nifi.registry.web.exception.UnauthorizedException: The supplied
> client credentials are not valid.. Returning Unauthorized response.
> >> >
> >> > Not sure what is going on here.
> >> >
> >> > Regards,
> >> > Tom
> >> >
> >> >
> >> > On Fri, 8 Feb 2019 at 11:36, Daniel Chaffelson 
> wrote:
> >> >>
> >> >> Hi Tomislav,
> >> >> I created that build a long time ago before the official apache one
> was up, and it is out of date sorry.
> >> >> Can I suggest you switch to the official apache image that Kevin
> mentioned and try again? It is an up to date version and recommended by the
> community.
> >> >>
> >> >> On Thu, Feb 7, 2019 at 5:54 PM Tomislav Novosel <
> to.novo...@gmail.com> wrote:
> >> >>>
> >> >>> Hi Kevin,
> >> >>>
> >> >>> I'm using image from Docker hub on this link:
> >> >>> https://hub.docker.com/r/chaffelson/nifi-registry
> >> >>>
> >> >>> I think I know where is the problem. The problem is in config file
> where
> >> >>> http host and http port property remains even if I manually set
> https host and htpps port.
> >> >>> I deleted http host and http port to be empty, but when I started
> container again, those values are again there.
> >> >>>
> >> >>> I don't know what the author of image wanted to say with this:
> >> >>>
> >> >>> The Docker image can be built using the following command:
> >> >>>
> >> >>> .
> ~/Projects/nifi-dev/nifi-re

Re: Nifi registry Kerberos Auth with Docker

2019-02-13 Thread Tomislav Novosel
Hi all,

I gave up on Kerberos auth from Docker, it is a strange issue.
I then switched to LDAP auth from the Docker container and it works.

I'm using the official nifi image and I used the 'docker run' command from the site:
https://hub.docker.com/r/apache/nifi

But still, one issue remains... after I log in, I can't add new users or
modify them.

In the conf folder I see in authorizations.xml that my initial admin identity
user has the rights to do that.

My conf for authorizers.xml is this:



<userGroupProvider>
    <identifier>file-user-group-provider</identifier>
    <class>org.apache.nifi.registry.security.authorization.file.FileUserGroupProvider</class>
    <property name="Users File">./conf/users.xml</property>
    <property name="Initial User Identity 1">user1</property>
</userGroupProvider>

<accessPolicyProvider>
    <identifier>file-access-policy-provider</identifier>
    <class>org.apache.nifi.registry.security.authorization.file.FileAccessPolicyProvider</class>
    <property name="User Group Provider">file-user-group-provider</property>
    <property name="Authorizations File">./conf/authorizations.xml</property>
    <property name="Initial Admin Identity">user1</property>
</accessPolicyProvider>

<authorizer>
    <identifier>managed-authorizer</identifier>
    <class>org.apache.nifi.registry.security.authorization.StandardManagedAuthorizer</class>
    <property name="Access Policy Provider">file-access-policy-provider</property>
</authorizer>


In identity-providers.xml everything is good, I believe, as I can log into
the Nifi UI.

Also, when I open user1's properties in the Nifi UI I can see the privileges
of that initial user, and it has all the rights to create new users, policies, etc.

What am I missing?

Thanks,
Tom








On Fri, 8 Feb 2019 at 16:03, Bryan Bende  wrote:

> Thinking about it more, I guess if you are not trying to do spnego
> then that message from the logs is not really an error. The registry
> UI always tries the spnego end-point first and if it returns the
> conflict response (as the log says) then you get sent to the login
> page.
>
> Maybe try turning on debug logging by editing logback.xml  name="org.apache.nifi.registry" level="INFO"/> and changing to DEBUG.
>
> On Fri, Feb 8, 2019 at 9:51 AM Tomislav Novosel 
> wrote:
> >
> > Hi Bryan,
> >
> > I don't have this properties populated in Nifi registry instance
> > outside Docker (as a service on linux server), and everything works.
> >
> > What are this properties up to?
> >
> > Regards,
> > Tom
> >
> >
> >
> > On Fri, 8 Feb 2019 at 15:25, Bryan Bende  wrote:
> >>
> >> The message about "Kerberos service ticket login not supported by this
> >> NiFi Registry" means that one of the following properties is not
> >> populated:
> >>
> >> nifi.registry.kerberos.spnego.principal=
> >> nifi.registry.kerberos.spnego.keytab.location=
> >>
> >> On Fri, Feb 8, 2019 at 8:20 AM Tomislav Novosel 
> wrote:
> >> >
> >> > Hi Daniel,
> >> >
> >> > Ok, I see. Thanks for the answer.
> >> >
> >> > I switched to official Nifi registry image. I succeeded to spin up
> registry in docker container and to
> >> > setup Kerberos provider in identity-providers.xml. Also I configured
> authorizers.xml as per afficial Nifi documentation.
> >> >
> >> > I already have the same setup with Kerberos, but not in Docker
> container. And everything works like a charm.
> >> >
> >> > When I enter credentials, login does not pass. This is app log:
> >> >
> >> > 2019-02-08 12:52:30,568 INFO [NiFi Registry Web Server-14]
> o.a.n.r.w.m.IllegalStateExceptionMapper java.lang.IllegalStateException:
> Kerberos service ticket login not supported by this NiFi Registry.
> Returning Conflict response.
> >> > 2019-02-08 12:52:30,644 INFO [NiFi Registry Web Server-13]
> o.a.n.r.w.s.NiFiRegistrySecurityConfig Client could not be authenticated
> due to:
> org.springframework.security.authentication.AuthenticationCredentialsNotFoundException:
> An Authentication object was not found in the SecurityContext Returning 401
> response.
> >> > 2019-02-08 12:52:50,557 INFO [NiFi Registry Web Server-14]
> o.a.n.r.w.m.UnauthorizedExceptionMapper
> org.apache.nifi.registry.web.exception.UnauthorizedException: The supplied
> client credentials are not valid.. Returning Unauthorized response.
> >> >
> >> > Not sure what is going on here.
> >> >
> >> > Regards,
> >> > Tom
> >> >
> >> >
> >> > On Fri, 8 Feb 2019 at 11:36, Daniel Chaffelson 
> wrote:
> >> >>
> >> >> Hi Tomislav,
> >> >> I created that build a long time ago before the official apache one
> was up, and it is out of date sorry.
> >> >> Can I suggest you switch to the official apache image that Kevin
> mentioned and try again? It is an up to date version and recommended by the
> community.
> >> >>
> >> >> On Thu, Feb 7, 2019 at 5:54 PM Tomislav Novosel <
>

Re: Nifi registry Kerberos Auth with Docker

2019-02-09 Thread Tomislav Novosel
Yes, I also see this INFO log on my Nifi registry without Docker.

I configured logback to DEBUG inside my container and I found this exception
in app.log:

sun.security.krb5.KrbException: Cannot locate default realm

I started the container with the --add-host option to add all my domain
Kerberos FQDNs for the KDC server to the hosts file. I also tried to ping it
from the container and it is alive.

FYI, I created a volume so the container has access to the krb5.conf file.
The krb5.conf file is the same as for my Nifi registry running as a service,
where everything works. The file permissions are set as well.

Don't know what else it could be.

Any suggestions?

Thank you.

BR,
Tom
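
The "Cannot locate default realm" KrbException above generally means the JVM
inside the container cannot find (or parse) a krb5.conf that sets a default
realm. As a hedged sketch of the minimum that file needs (EXAMPLE.COM and
kdc.example.com are placeholders, not values from this thread):

```ini
# /etc/krb5.conf -- minimal configuration the JVM needs to resolve a realm
[libdefaults]
    default_realm = EXAMPLE.COM

[realms]
    EXAMPLE.COM = {
        kdc = kdc.example.com
        admin_server = kdc.example.com
    }
```

If the mounted file lives somewhere other than /etc/krb5.conf, the JVM
usually has to be told about it explicitly, e.g. via
-Djava.security.krb5.conf=/path/to/krb5.conf in the registry's JVM options.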

On Fri, 8 Feb 2019 at 16:03, Bryan Bende  wrote:

> Thinking about it more, I guess if you are not trying to do spnego
> then that message from the logs is not really an error. The registry
> UI always tries the spnego end-point first and if it returns the
> conflict response (as the log says) then you get sent to the login
> page.
>
> Maybe try turning on debug logging by editing logback.xml  name="org.apache.nifi.registry" level="INFO"/> and changing to DEBUG.
>
> On Fri, Feb 8, 2019 at 9:51 AM Tomislav Novosel 
> wrote:
> >
> > Hi Bryan,
> >
> > I don't have this properties populated in Nifi registry instance
> > outside Docker (as a service on linux server), and everything works.
> >
> > What are this properties up to?
> >
> > Regards,
> > Tom
> >
> >
> >
> > On Fri, 8 Feb 2019 at 15:25, Bryan Bende  wrote:
> >>
> >> The message about "Kerberos service ticket login not supported by this
> >> NiFi Registry" means that one of the following properties is not
> >> populated:
> >>
> >> nifi.registry.kerberos.spnego.principal=
> >> nifi.registry.kerberos.spnego.keytab.location=
> >>
> >> On Fri, Feb 8, 2019 at 8:20 AM Tomislav Novosel 
> wrote:
> >> >
> >> > Hi Daniel,
> >> >
> >> > Ok, I see. Thanks for the answer.
> >> >
> >> > I switched to official Nifi registry image. I succeeded to spin up
> registry in docker container and to
> >> > setup Kerberos provider in identity-providers.xml. Also I configured
> authorizers.xml as per afficial Nifi documentation.
> >> >
> >> > I already have the same setup with Kerberos, but not in Docker
> container. And everything works like a charm.
> >> >
> >> > When I enter credentials, login does not pass. This is app log:
> >> >
> >> > 2019-02-08 12:52:30,568 INFO [NiFi Registry Web Server-14]
> o.a.n.r.w.m.IllegalStateExceptionMapper java.lang.IllegalStateException:
> Kerberos service ticket login not supported by this NiFi Registry.
> Returning Conflict response.
> >> > 2019-02-08 12:52:30,644 INFO [NiFi Registry Web Server-13]
> o.a.n.r.w.s.NiFiRegistrySecurityConfig Client could not be authenticated
> due to:
> org.springframework.security.authentication.AuthenticationCredentialsNotFoundException:
> An Authentication object was not found in the SecurityContext Returning 401
> response.
> >> > 2019-02-08 12:52:50,557 INFO [NiFi Registry Web Server-14]
> o.a.n.r.w.m.UnauthorizedExceptionMapper
> org.apache.nifi.registry.web.exception.UnauthorizedException: The supplied
> client credentials are not valid.. Returning Unauthorized response.
> >> >
> >> > Not sure what is going on here.
> >> >
> >> > Regards,
> >> > Tom
> >> >
> >> >
> >> > On Fri, 8 Feb 2019 at 11:36, Daniel Chaffelson 
> wrote:
> >> >>
> >> >> Hi Tomislav,
> >> >> I created that build a long time ago before the official apache one
> was up, and it is out of date sorry.
> >> >> Can I suggest you switch to the official apache image that Kevin
> mentioned and try again? It is an up to date version and recommended by the
> community.
> >> >>
> >> >> On Thu, Feb 7, 2019 at 5:54 PM Tomislav Novosel <
> to.novo...@gmail.com> wrote:
> >> >>>
> >> >>> Hi Kevin,
> >> >>>
> >> >>> I'm using image from Docker hub on this link:
> >> >>> https://hub.docker.com/r/chaffelson/nifi-registry
> >> >>>
> >> >>> I think I know where is the problem. The problem is in config file
> where
> >> >>> http host and http port property remains even if I manually set
> https host and htpps port.
> >> >>> I deleted http host and http port to be empty, but when I started
> containe

Re: Nifi registry Kerberos Auth with Docker

2019-02-09 Thread Tomislav Novosel
Yes, I also see this INFO log on my Nifi registry without Docker.

I configured logback to DEBUG inside my container and I found this exception
in app.log:

sun.security.krb5.KrbException: Cannot locate default realm

I started the container with the --add-host option to add all my domain
Kerberos FQDNs for the KDC server to the hosts file. I also tried to ping it
from the container and it is alive.

Don't know what else it could be.

Any suggestions?

Thank you.

BR,
Tom


On Fri, 8 Feb 2019 at 16:03, Bryan Bende  wrote:

> Thinking about it more, I guess if you are not trying to do spnego
> then that message from the logs is not really an error. The registry
> UI always tries the spnego end-point first and if it returns the
> conflict response (as the log says) then you get sent to the login
> page.
>
> Maybe try turning on debug logging by editing logback.xml  name="org.apache.nifi.registry" level="INFO"/> and changing to DEBUG.
>
> On Fri, Feb 8, 2019 at 9:51 AM Tomislav Novosel 
> wrote:
> >
> > Hi Bryan,
> >
> > I don't have this properties populated in Nifi registry instance
> > outside Docker (as a service on linux server), and everything works.
> >
> > What are this properties up to?
> >
> > Regards,
> > Tom
> >
> >
> >
> > On Fri, 8 Feb 2019 at 15:25, Bryan Bende  wrote:
> >>
> >> The message about "Kerberos service ticket login not supported by this
> >> NiFi Registry" means that one of the following properties is not
> >> populated:
> >>
> >> nifi.registry.kerberos.spnego.principal=
> >> nifi.registry.kerberos.spnego.keytab.location=
> >>
> >> On Fri, Feb 8, 2019 at 8:20 AM Tomislav Novosel 
> wrote:
> >> >
> >> > Hi Daniel,
> >> >
> >> > Ok, I see. Thanks for the answer.
> >> >
> >> > I switched to official Nifi registry image. I succeeded to spin up
> registry in docker container and to
> >> > setup Kerberos provider in identity-providers.xml. Also I configured
> authorizers.xml as per afficial Nifi documentation.
> >> >
> >> > I already have the same setup with Kerberos, but not in Docker
> container. And everything works like a charm.
> >> >
> >> > When I enter credentials, login does not pass. This is app log:
> >> >
> >> > 2019-02-08 12:52:30,568 INFO [NiFi Registry Web Server-14]
> o.a.n.r.w.m.IllegalStateExceptionMapper java.lang.IllegalStateException:
> Kerberos service ticket login not supported by this NiFi Registry.
> Returning Conflict response.
> >> > 2019-02-08 12:52:30,644 INFO [NiFi Registry Web Server-13]
> o.a.n.r.w.s.NiFiRegistrySecurityConfig Client could not be authenticated
> due to:
> org.springframework.security.authentication.AuthenticationCredentialsNotFoundException:
> An Authentication object was not found in the SecurityContext Returning 401
> response.
> >> > 2019-02-08 12:52:50,557 INFO [NiFi Registry Web Server-14]
> o.a.n.r.w.m.UnauthorizedExceptionMapper
> org.apache.nifi.registry.web.exception.UnauthorizedException: The supplied
> client credentials are not valid.. Returning Unauthorized response.
> >> >
> >> > Not sure what is going on here.
> >> >
> >> > Regards,
> >> > Tom
> >> >
> >> >
> >> > On Fri, 8 Feb 2019 at 11:36, Daniel Chaffelson 
> wrote:
> >> >>
> >> >> Hi Tomislav,
> >> >> I created that build a long time ago before the official apache one
> was up, and it is out of date sorry.
> >> >> Can I suggest you switch to the official apache image that Kevin
> mentioned and try again? It is an up to date version and recommended by the
> community.
> >> >>
> >> >> On Thu, Feb 7, 2019 at 5:54 PM Tomislav Novosel <
> to.novo...@gmail.com> wrote:
> >> >>>
> >> >>> Hi Kevin,
> >> >>>
> >> >>> I'm using image from Docker hub on this link:
> >> >>> https://hub.docker.com/r/chaffelson/nifi-registry
> >> >>>
> >> >>> I think I know where is the problem. The problem is in config file
> where
> >> >>> http host and http port property remains even if I manually set
> https host and htpps port.
> >> >>> I deleted http host and http port to be empty, but when I started
> container again, those values are again there.
> >> >>>
> >> >>> I don't know what the author of image wanted to say with this:
> >> >>>
>

Re: Nifi registry Kerberos Auth with Docker

2019-02-08 Thread Tomislav Novosel
Hi Bryan,

I don't have these properties populated in my Nifi registry instance
outside Docker (running as a service on a linux server), and everything works.

What are these properties for?

Regards,
Tom
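
For context, the two properties Bryan lists below live in
nifi-registry.properties and are only needed for SPNEGO (Kerberos service
ticket) login from the browser; the form-based Kerberos login provider does
not require them. A hedged sketch with placeholder values (the principal and
keytab path are illustrative, not from this thread):

```properties
# nifi-registry.properties -- required only for SPNEGO ticket login;
# may be left empty when using the Kerberos identity provider login form.
nifi.registry.kerberos.spnego.principal=HTTP/registry.example.com@EXAMPLE.COM
nifi.registry.kerberos.spnego.keytab.location=/etc/security/keytabs/registry-spnego.keytab
nifi.registry.kerberos.spnego.authentication.expiration=12 hours
```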



On Fri, 8 Feb 2019 at 15:25, Bryan Bende  wrote:

> The message about "Kerberos service ticket login not supported by this
> NiFi Registry" means that one of the following properties is not
> populated:
>
> nifi.registry.kerberos.spnego.principal=
> nifi.registry.kerberos.spnego.keytab.location=
>
> On Fri, Feb 8, 2019 at 8:20 AM Tomislav Novosel 
> wrote:
> >
> > Hi Daniel,
> >
> > Ok, I see. Thanks for the answer.
> >
> > I switched to official Nifi registry image. I succeeded to spin up
> registry in docker container and to
> > setup Kerberos provider in identity-providers.xml. Also I configured
> authorizers.xml as per afficial Nifi documentation.
> >
> > I already have the same setup with Kerberos, but not in Docker
> container. And everything works like a charm.
> >
> > When I enter credentials, login does not pass. This is app log:
> >
> > 2019-02-08 12:52:30,568 INFO [NiFi Registry Web Server-14]
> o.a.n.r.w.m.IllegalStateExceptionMapper java.lang.IllegalStateException:
> Kerberos service ticket login not supported by this NiFi Registry.
> Returning Conflict response.
> > 2019-02-08 12:52:30,644 INFO [NiFi Registry Web Server-13]
> o.a.n.r.w.s.NiFiRegistrySecurityConfig Client could not be authenticated
> due to:
> org.springframework.security.authentication.AuthenticationCredentialsNotFoundException:
> An Authentication object was not found in the SecurityContext Returning 401
> response.
> > 2019-02-08 12:52:50,557 INFO [NiFi Registry Web Server-14]
> o.a.n.r.w.m.UnauthorizedExceptionMapper
> org.apache.nifi.registry.web.exception.UnauthorizedException: The supplied
> client credentials are not valid.. Returning Unauthorized response.
> >
> > Not sure what is going on here.
> >
> > Regards,
> > Tom
> >
> >
> > On Fri, 8 Feb 2019 at 11:36, Daniel Chaffelson 
> wrote:
> >>
> >> Hi Tomislav,
> >> I created that build a long time ago before the official apache one was
> up, and it is out of date sorry.
> >> Can I suggest you switch to the official apache image that Kevin
> mentioned and try again? It is an up to date version and recommended by the
> community.
> >>
> >> On Thu, Feb 7, 2019 at 5:54 PM Tomislav Novosel 
> wrote:
> >>>
> >>> Hi Kevin,
> >>>
> >>> I'm using image from Docker hub on this link:
> >>> https://hub.docker.com/r/chaffelson/nifi-registry
> >>>
> >>> I think I know where is the problem. The problem is in config file
> where
> >>> http host and http port property remains even if I manually set https
> host and htpps port.
> >>> I deleted http host and http port to be empty, but when I started
> container again, those values are again there.
> >>>
> >>> I don't know what the author of image wanted to say with this:
> >>>
> >>> The Docker image can be built using the following command:
> >>>
> >>> .
> ~/Projects/nifi-dev/nifi-registry/nifi-registry-docker/dockerhub/DockerBuild.sh
> >>>
> >>> What does this commend mean?
> >>>
> >>> And this:
> >>>
> >>> Note: The default version of NiFi-Registry specified by the Dockerfile
> is typically that of one that is unreleased if working from source. To
> build an image for a prior released version, one can override the
> NIFI_REGISTRY_VERSIONbuild-arg with the following command:
> >>>
> >>> docker build --build-arg=NIFI_REGISRTY_VERSION={Desired NiFi-Registry
> Version} -t apache/nifi-registry:latest .
> >>>
> >>> For this command above you need to have Dockerfile. I tried with
> Dockerfile from docker hub, but there are errors in execution on this line:
> >>>
> >>> ADD sh/ ${NIFI_REGISTRY_BASE_DIR}/scripts/
> >>>
> >>>  On the other hand, If I manage to get the image with first command, I
> will get Nifi registry version 0.1.0 which I don't want.
> >>>
> >>> I'm little bit confused here, sorry for longer mail.
> >>>
> >>> Thanks.
> >>>
> >>> Regards,
> >>> Tom
> >>>
> >>> On Thu, 7 Feb 2019 at 17:38, Kevin Doran  wrote:
> >>>>
> >>>> Hi Tom,
> >>>>
> >>>> Are you using the apache/nifi-registry image or a custom image for
> thi

Re: Nifi registry Kerberos Auth with Docker

2019-02-08 Thread Tomislav Novosel
Hi Daniel,

Ok, I see. Thanks for the answer.

I switched to the official Nifi registry image. I succeeded in spinning up the
registry in a docker container and setting up the Kerberos provider in
identity-providers.xml. I also configured authorizers.xml as per the official
Nifi documentation.

I already have the same setup with Kerberos, but not in a Docker container,
and everything works like a charm.

When I enter credentials, login does not pass. This is the app log:

2019-02-08 12:52:30,568 INFO [NiFi Registry Web Server-14]
o.a.n.r.w.m.IllegalStateExceptionMapper java.lang.IllegalStateException:
Kerberos service ticket login not supported by this NiFi Registry.
Returning Conflict response.
2019-02-08 12:52:30,644 INFO [NiFi Registry Web Server-13]
o.a.n.r.w.s.NiFiRegistrySecurityConfig Client could not be authenticated
due to:
org.springframework.security.authentication.AuthenticationCredentialsNotFoundException:
An Authentication object was not found in the SecurityContext Returning 401
response.
2019-02-08 12:52:50,557 INFO [NiFi Registry Web Server-14]
o.a.n.r.w.m.UnauthorizedExceptionMapper
org.apache.nifi.registry.web.exception.UnauthorizedException: The supplied
client credentials are not valid.. Returning Unauthorized response.

Not sure what is going on here.

Regards,
Tom


On Fri, 8 Feb 2019 at 11:36, Daniel Chaffelson  wrote:

> Hi Tomislav,
> I created that build a long time ago before the official apache one was
> up, and it is out of date sorry.
> Can I suggest you switch to the official apache image that Kevin mentioned
> and try again? It is an up to date version and recommended by the community.
>
> On Thu, Feb 7, 2019 at 5:54 PM Tomislav Novosel 
> wrote:
>
>> Hi Kevin,
>>
>> I'm using image from Docker hub on this link:
>> https://hub.docker.com/r/chaffelson/nifi-registry
>>
>> I think I know where is the problem. The problem is in config file where
>> http host and http port property remains even if I manually set https
>> host and htpps port.
>> I deleted http host and http port to be empty, but when I started
>> container again, those values are again there.
>>
>> I don't know what the author of image wanted to say with this:
>>
>> The Docker image can be built using the following command:
>>
>> . 
>> ~/Projects/nifi-dev/nifi-registry/nifi-registry-docker/dockerhub/DockerBuild.sh
>>
>> What does this commend mean?
>>
>> And this:
>>
>> Note: The default version of NiFi-Registry specified by the Dockerfile
>> is typically that of one that is unreleased if working from source. To
>> build an image for a prior released version, one can override the
>> NIFI_REGISTRY_VERSIONbuild-arg with the following command:
>>
>> docker build --build-arg=NIFI_REGISRTY_VERSION={Desired NiFi-Registry 
>> Version} -t apache/nifi-registry:latest .
>>
>> For this command above you need to have Dockerfile. I tried with
>> Dockerfile from docker hub, but there are errors in execution on this line:
>>
>> ADD sh/ ${NIFI_REGISTRY_BASE_DIR}/scripts/
>>
>>  On the other hand, If I manage to get the image with first command, I
>> will get Nifi registry version 0.1.0 which I don't want.
>>
>> I'm little bit confused here, sorry for longer mail.
>>
>> Thanks.
>>
>> Regards,
>> Tom
>>
>> On Thu, 7 Feb 2019 at 17:38, Kevin Doran  wrote:
>>
>>> Hi Tom,
>>>
>>> Are you using the apache/nifi-registry image or a custom image for this?
>>>
>>> Have you configured TLS?
>>> Can you share your complete conf dir (removing sensitive values such as
>>> password or domains)?
>>>
>>> Thanks,
>>> Kevin
>>>
>>>
>>> On February 7, 2019 at 05:57:37, Tomislav Novosel (to.novo...@gmail.com)
>>> wrote:
>>> > Hi all,
>>> >
>>> > I'm trying to configure NiFi Registry authentication with Kerberos while
>>> > NiFi Registry runs
>>> > inside a Docker container.
>>> >
>>> > I configured all the security properties in nifi-registry.properties, the login
>>> > identity provider and
>>> > authorizers.xml. Everything is the same as for NiFi Registry running as a
>>> > service outside a Docker container.
>>> >
>>> > When I open the UI in the browser and type in the login data, the login does not pass.
>>> >
>>> > In /logs/nifi-registry-app.log I see the error:
>>> >
>>> > An Authentication object was not found in the SecurityContext Returning
>>> > 401 response
>>> > java.lang.IllegalStateException: Access tokens are only issued over
>>> HTTPS
>>> >
>>> > The nifi.registry.web.https.host property defaults to the Docker container ID:
>>> > ae24ea32faef
>>> > nifi.registry.web.https.port=18080
>>> >
>>> > How can I resolve this?
>>> > Thanks.
>>> >
>>> >
>>> > BR,
>>> > Tom
>>> >
>>>
>>>


Re: Nifi registry Kerberos Auth with Docker

2019-02-07 Thread Tomislav Novosel
Hi Kevin,

I'm using the image from Docker Hub at this link:
https://hub.docker.com/r/chaffelson/nifi-registry

I think I know where the problem is. The problem is in the config file, where
the http host and http port properties remain even if I manually set the https
host and https port.
I cleared the http host and http port properties, but when I started the
container again, those values were there again.

I don't know what the author of the image meant by this:

The Docker image can be built using the following command:

. 
~/Projects/nifi-dev/nifi-registry/nifi-registry-docker/dockerhub/DockerBuild.sh

What does this command mean?
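
For context: the leading dot in the DockerBuild.sh command is the POSIX "source" operator. It runs the named script inside the current shell rather than a subshell, so any variables or functions the script defines remain set afterwards. A minimal self-contained illustration (the /tmp script below is a hypothetical stand-in, not from the thread):

```shell
# '.' (dot) executes a script in the *current* shell rather than a
# subshell, so anything the script defines persists afterwards.
# /tmp/env_demo.sh stands in for a script such as DockerBuild.sh.
echo 'GREETING=hello' > /tmp/env_demo.sh
. /tmp/env_demo.sh
echo "$GREETING"
```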

And this:

Note: The default version of NiFi-Registry specified by the Dockerfile is
typically that of one that is unreleased if working from source. To build
an image for a prior released version, one can override the
NIFI_REGISTRY_VERSION build-arg with the following command:

docker build --build-arg=NIFI_REGISTRY_VERSION={Desired NiFi-Registry
Version} -t apache/nifi-registry:latest .

For the command above you need to have a Dockerfile. I tried with the
Dockerfile from Docker Hub, but execution fails on this line:

ADD sh/ ${NIFI_REGISTRY_BASE_DIR}/scripts/

On the other hand, if I manage to build the image with the first command, I
will get NiFi Registry version 0.1.0, which I don't want.

I'm a little bit confused here, sorry for the long mail.

Thanks.

Regards,
Tom

On Thu, 7 Feb 2019 at 17:38, Kevin Doran  wrote:

> Hi Tom,
>
> Are you using the apache/nifi-registry image or a custom image for this?
>
> Have you configured TLS?
> Can you share your complete conf dir (removing sensitive values such as
> password or domains)?
>
> Thanks,
> Kevin
>
>
> On February 7, 2019 at 05:57:37, Tomislav Novosel (to.novo...@gmail.com)
> wrote:
> > Hi all,
> >
> > I'm trying to configure NiFi Registry authentication with Kerberos while
> > NiFi Registry runs
> > inside a Docker container.
> >
> > I configured all the security properties in nifi-registry.properties, the login
> > identity provider and
> > authorizers.xml. Everything is the same as for NiFi Registry running as a
> > service outside a Docker container.
> >
> > When I open the UI in the browser and type in the login data, the login does not pass.
> >
> > In /logs/nifi-registry-app.log I see the error:
> >
> > An Authentication object was not found in the SecurityContext Returning
> > 401 response
> > java.lang.IllegalStateException: Access tokens are only issued over HTTPS
> >
> > The nifi.registry.web.https.host property defaults to the Docker container ID:
> > ae24ea32faef
> > nifi.registry.web.https.port=18080
> >
> > How can I resolve this?
> > Thanks.
> >
> >
> > BR,
> > Tom
> >
>
>


Nifi registry Kerberos Auth with Docker

2019-02-07 Thread Tomislav Novosel
Hi all,

I'm trying to configure NiFi Registry authentication with Kerberos while
NiFi Registry runs
inside a Docker container.

I configured all the security properties in nifi-registry.properties, the login
identity provider and
authorizers.xml. Everything is the same as for NiFi Registry running as a
service outside a Docker container.

When I open the UI in the browser and type in the login data, the login does not pass.

In /logs/nifi-registry-app.log I see the error:

 An Authentication object was not found in the SecurityContext Returning
401 response
java.lang.IllegalStateException: Access tokens are only issued over HTTPS

The nifi.registry.web.https.host property defaults to the Docker container ID:
ae24ea32faef
nifi.registry.web.https.port=18080

How can I resolve this?
Thanks.


BR,
Tom
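
Editor's sketch of the nifi-registry.properties change this error points to: the plain-HTTP host and port must be left blank so that only HTTPS is served. The hostname and port values below are illustrative assumptions, not values from the thread:

```properties
# Blank these out: with an HTTP port set, the registry serves plain HTTP,
# and "Access tokens are only issued over HTTPS".
nifi.registry.web.http.host=
nifi.registry.web.http.port=

# Serve HTTPS instead (values below are illustrative).
nifi.registry.web.https.host=registry.example.com
nifi.registry.web.https.port=18443
```

A keystore and truststore (the nifi.registry.security.keystore and nifi.registry.security.truststore properties) must also be configured for the HTTPS listener to come up.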


Filter duplicates by modification date

2019-01-31 Thread Tomislav Novosel
Hi all,

I have a specific case: filtering duplicate files by modification date.

I want to fetch only the files with the maximum modification date (the newest
files) when they are duplicated by filename.

How can I achieve this?
Thanks in advance, I appreciate it a lot.

BR,
Tom
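
Outside NiFi, the rule described above — keep only the newest entry per filename — reduces to a max-by-mtime grouping. A minimal Python sketch (the data values are illustrative, not from the thread):

```python
def newest_per_name(files):
    """Keep only the entry with the greatest modification time
    for each filename. `files` is an iterable of (name, mtime) pairs."""
    newest = {}
    for name, mtime in files:
        if name not in newest or mtime > newest[name]:
            newest[name] = mtime
    return newest

listing = [("report.csv", 100), ("report.csv", 250), ("data.csv", 50)]
print(newest_per_name(listing))  # {'report.csv': 250, 'data.csv': 50}
```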


Re: Minimum file age

2019-01-31 Thread Tomislav Novosel
Hi all,

@Josef, what do you mean with the Wait and DetectDuplicate processors? How do I
delay fetching by the conversion time?
How can the Wait processor know that the file is converted completely? If the file
is listed again and the DetectDuplicate processor
caches the identifier, the Wait processor will pass the flowfile downstream. What
if the file is so big that it is listed three or four times?

Regards,
Tom

On Mon, 28 Jan 2019 at 11:09,  wrote:

> Hi Tom
>
>
>
> I suggest using a Wait processor (to delay the fetch) together with a
> DetectDuplicate processor. That way you will fetch the file only once,
> and only after it has been written completely (as long as you know the maximum
> time it takes to finish writing). I know it’s not nice, but that’s how we do
> it for the moment… I’m waiting for this feature as well :-(.
>
>
>
> Cheers Josef
>
>
>
>
>
> *From: *Arpad Boda 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Monday, 28 January 2019 at 10:17
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Minimum file age
>
>
>
> Hi,
>
>
>
> It’s on the way: https://issues.apache.org/jira/browse/NIFI-5977 :)
>
>
>
> Regards,
>
> Arpad
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Monday, 28 January 2019 at 09:19
> *To: *"users@nifi.apache.org" 
> *Subject: *Minimum file age
>
>
>
> Hi all,
>
>
>
> I'm having an issue with the ListSFTP processor in NiFi. When reading files from
> a folder where another process writes files, it lists the same file multiple
> times and ingests the file multiple times, because the modification date of the
> file changes rapidly as the other process writes to it.
>
>
>
> It appears that NiFi lists faster than the external process writes, so before
> the end of writing (the conversion of the file from one format to another), NiFi
> lists the file multiple times and then creates duplicates.
>
>
>
> There is no Minimum File Age property like in the ListFile processor.
>
>
>
> How can I make it wait until the file is converted completely, and only then
> list the file and pass it to the FetchSFTP processor?
>
>
>
> Thanks in advance,
>
> Tom.
>


Re: Modify Flowfile attributes

2019-01-30 Thread Tomislav Novosel
Yes, it works now as expected. It seems that *strptime* does not work as it
should.
I also tried with *strftime*, and NiFi gives me an error that the method does
not exist.
The reason was to avoid the hardcoded string "20" for the year definition:

week = date(year=int("20"+date_final[0:2]), month=int(date_final[2:4]),
day=int(date_final[4:6])).isocalendar()[1]

date_file = file_name.split("_")[6]
date_final = date_file.split(".")[0]
date_obj = datetime.strptime(date_final,'%y%m%d')
date_string = date_obj.strftime('%Y%m%d')
year_sliced = int(date_string[0:4])
month_sliced = int(date_string[4:6])
day_sliced = int(date_string[6:8])


week = date(year=year_sliced, month=month_sliced, day=day_sliced).
isocalendar()[1]
year = date(year=year_sliced, month=month_sliced, day=day_sliced).
isocalendar()[0]

So this is not a nice solution, but it works.
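
The whole workaround condenses to a few lines. A standalone CPython sketch, assuming the filename layout from the thread (the date sits in the 7th underscore-separated field as YYMMDD; the sample filenames are illustrative):

```python
from datetime import date, datetime

def iso_week_year(file_name):
    """Extract the YYMMDD date from the 7th '_' field of the filename
    and return (ISO week number, ISO year) via isocalendar()."""
    date_final = file_name.split("_")[6].split(".")[0]
    d = datetime.strptime(date_final, "%y%m%d")
    iso = date(d.year, d.month, d.day).isocalendar()
    return iso[1], iso[0]

# 2018-12-31 falls in ISO week 1 of ISO year 2019
print(iso_week_year("a_b_c_d_e_f_181231.parquet"))  # (1, 2019)
```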

Thank you very much, Arpad, for all the answers; this will be OK for now.
I also appreciate the answers and help from everyone else.

Regards,
Tom

On Wed, 30 Jan 2019 at 13:38, Arpad Boda  wrote:

> I know it’s a hack, but as your date format is fixed length (6 chars), the
> following should work:
>
>
>
> week_att = date(year=int(date_final[0:2]),
> month=int(date_final[2:4]), day=int(date_final[4:6])).isocalendar()[1]
>
>
>
> Yeah, it’s not a solution, but I wonder if skipping strptime fixes the
> problem or not.
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 13:31
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Yes, it is strange.
>
>
>
> If I do this:
>
> date_obj = datetime.strptime(‘20’ + date_final,'%Y%m%d')
>
>
>
> The result is the same.
>
> It gives me week 44 and year 118, but if I run the code locally it gives the
> correct week and year: week 1 and year 2019.
>
>
>
> Tom.
>
>
>
> On Wed, 30 Jan 2019 at 13:24, Arpad Boda  wrote:
>
> This sounds very strange.
>
>
>
> What happens if you do this:
>
>
>
> date_obj = datetime.strptime(‘20’ + date_final,'%Y%m%d')
>
>
>
> ?
>
>
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 12:44
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Arpad,
>
>
>
> I also tested my python code as standalone locally on my laptop, and the
> results are as expected, 1st week of 2019 for
>
> date 181231 which is my case.
>
>
>
> I also tried to add two variables marked as red as attributes to my
> flowfile, and the result is as expected, date_file
>
> has value 181231.parquet (parquet file is my case) and date_final has
> value 181231.
>
>
>
> So red variables are not the problem.
>
>
>
> Problem is in variables marked red:
>
>
>
> date_file = file_name.split("_")[6]
>
> date_final = date_file.split(".")[0]
>
> date_obj = datetime.strptime(date_final,'%y%m%d')
>
> date_year = date_obj.year
>
> date_day = date_obj.day
>
> date_month = date_obj.month
>
>
>
> So the Python code runs correctly locally, but on NiFi (Jython) it does not.
>
>
>
> Regards,
>
> Tom
>
>
>
>
>
> On Wed, 30 Jan 2019 at 12:33, Arpad Boda  wrote:
>
> Tom,
>
>
>
> Not sure we are on the same page.
>
>
>
> I tested the python code of yours as standalone, not in NiFi.
>
>
>
> As the Python code is fine (even with JPython), I think the issue is
> somewhere here:
>
>
>
> *date_file = file_name.split("_")[6]*
>
> *date_final = date_file.split(".")[0]*
>
> *date_obj = datetime.strptime(date_final,'%y%m%d')*
>
> My testing assumed “date_final” to be “181231”, which I guess doesn’t
> apply for your case.
>
>
>
> Could you *modify* your python code to add the two variables (marked red)
> as attributes to your flow file?
>
>
>
> Regards,
>
> Arpad
>
>
>
>
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 12:20
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Arpad,
>
>
>
> I tried to pass variables date_year, date_day and date_month to outgoing
> flowfile and I get unexpected values.
>
> For day I get 1, for year 118 and for month 11.
>
> And that gives week number 44 and year 118 according to my code.
>
>
>
> It is strange that my code works as expected on your 

Re: Modify Flowfile attributes

2019-01-30 Thread Tomislav Novosel
Yes, it is strange.

If I do this:
date_obj = datetime.strptime(‘20’ + date_final,'%Y%m%d')

The result is the same.
It gives me week 44 and year 118, but if I run the code locally it gives the
correct week and year: week 1 and year 2019.

Tom.

On Wed, 30 Jan 2019 at 13:24, Arpad Boda  wrote:

> This sounds very strange.
>
>
>
> What happens if you do this:
>
>
>
> date_obj = datetime.strptime(‘20’ + date_final,'%Y%m%d')
>
>
>
> ?
>
>
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 12:44
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Arpad,
>
>
>
> I also tested my python code as standalone locally on my laptop, and the
> results are as expected, 1st week of 2019 for
>
> date 181231 which is my case.
>
>
>
> I also tried to add two variables marked as red as attributes to my
> flowfile, and the result is as expected, date_file
>
> has value 181231.parquet (parquet file is my case) and date_final has
> value 181231.
>
>
>
> So red variables are not the problem.
>
>
>
> Problem is in variables marked red:
>
>
>
> date_file = file_name.split("_")[6]
>
> date_final = date_file.split(".")[0]
>
> date_obj = datetime.strptime(date_final,'%y%m%d')
>
> date_year = date_obj.year
>
> date_day = date_obj.day
>
> date_month = date_obj.month
>
>
>
> So the Python code runs correctly locally, but on NiFi (Jython) it does not.
>
>
>
> Regards,
>
> Tom
>
>
>
>
>
> On Wed, 30 Jan 2019 at 12:33, Arpad Boda  wrote:
>
> Tom,
>
>
>
> Not sure we are on the same page.
>
>
>
> I tested the python code of yours as standalone, not in NiFi.
>
>
>
> As the Python code is fine (even with JPython), I think the issue is
> somewhere here:
>
>
>
> *date_file = file_name.split("_")[6]*
>
> *date_final = date_file.split(".")[0]*
>
> *date_obj = datetime.strptime(date_final,'%y%m%d')*
>
> My testing assumed “date_final” to be “181231”, which I guess doesn’t
> apply for your case.
>
>
>
> Could you *modify* your python code to add the two variables (marked red)
> as attributes to your flow file?
>
>
>
> Regards,
>
> Arpad
>
>
>
>
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 12:20
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Arpad,
>
>
>
> I tried to pass variables date_year, date_day and date_month to outgoing
> flowfile and I get unexpected values.
>
> For day I get 1, for year 118 and for month 11.
>
> And that gives week number 44 and year 118 according to my code.
>
>
>
> It is strange that my code works as expected on your machine. I use Nifi
> 1.7.1
>
>
>
> Regards,
>
> Tom
>
>
>
> On Wed, 30 Jan 2019 at 11:25, Arpad Boda  wrote:
>
> Tom,
>
>
>
> Could you use logattribute processor and somehow log the value of your
> “date_final” variables?
>
>
>
> Tested your code with Jpython, with input string “181231” it works as
> expected (the result is 1st week of 2019).
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 11:10
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Yes, the values are correct. Attribute has value which is expected to be.
>
> i.e. for date 181231 in filename I get value 18231 for attribute
> week_extracted which is extracted from filename with split method.
>
>
>
> Tom.
>
>
>
> On Wed, 30 Jan 2019 at 10:59, Arpad Boda  wrote:
>
> Hi Tom,
>
>
>
> “that is exactly what I tried and date_final or date_file are applied to
> the attribute of outgoing flowfile, it works.”
>
>
>
> It works as they are strings, so not working would be a surprise. The
> question is: what are their values? 
>
>
>
> Regards,
>
> Arpad
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 10:53
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Hi Arpad,
>
>
>
> that is exactly what I tried and date_final or date_file are applied to
> the attribute of outgoing flowfile, it works.
>
> But if I put to attribute week_att, there is error: 

Re: Modify Flowfile attributes

2019-01-30 Thread Tomislav Novosel
Arpad,

I also tested my Python code standalone locally on my laptop, and the
results are as expected: 1st week of 2019 for
date 181231, which is my case.

I also tried to add the two variables marked red as attributes to my
flowfile, and the result is as expected: date_file
has the value 181231.parquet (a Parquet file in my case) and date_final has
the value 181231.

So the red variables are not the problem.

The problem is in the variables marked red:

date_file = file_name.split("_")[6]
date_final = date_file.split(".")[0]
date_obj = datetime.strptime(date_final,'%y%m%d')
date_year = date_obj.year
date_day = date_obj.day
date_month = date_obj.month

So the Python code runs correctly locally, but on NiFi (Jython) it does not.

Regards,
Tom


On Wed, 30 Jan 2019 at 12:33, Arpad Boda  wrote:

> Tom,
>
>
>
> Not sure we are on the same page.
>
>
>
> I tested the python code of yours as standalone, not in NiFi.
>
>
>
> As the Python code is fine (even with JPython), I think the issue is
> somewhere here:
>
>
>
> *date_file = file_name.split("_")[6]*
>
> *date_final = date_file.split(".")[0]*
>
> *date_obj = datetime.strptime(date_final,'%y%m%d')*
>
> My testing assumed “date_final” to be “181231”, which I guess doesn’t
> apply for your case.
>
>
>
> Could you *modify* your python code to add the two variables (marked red)
> as attributes to your flow file?
>
>
>
> Regards,
>
> Arpad
>
>
>
>
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 12:20
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Arpad,
>
>
>
> I tried to pass variables date_year, date_day and date_month to outgoing
> flowfile and I get unexpected values.
>
> For day I get 1, for year 118 and for month 11.
>
> And that gives week number 44 and year 118 according to my code.
>
>
>
> It is strange that my code works as expected on your machine. I use Nifi
> 1.7.1
>
>
>
> Regards,
>
> Tom
>
>
>
> On Wed, 30 Jan 2019 at 11:25, Arpad Boda  wrote:
>
> Tom,
>
>
>
> Could you use logattribute processor and somehow log the value of your
> “date_final” variables?
>
>
>
> Tested your code with Jpython, with input string “181231” it works as
> expected (the result is 1st week of 2019).
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 11:10
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Yes, the values are correct. Attribute has value which is expected to be.
>
> i.e. for date 181231 in filename I get value 18231 for attribute
> week_extracted which is extracted from filename with split method.
>
>
>
> Tom.
>
>
>
> On Wed, 30 Jan 2019 at 10:59, Arpad Boda  wrote:
>
> Hi Tom,
>
>
>
> “that is exactly what I tried and date_final or date_file are applied to
> the attribute of outgoing flowfile, it works.”
>
>
>
> It works as they are strings, so not working would be a surprise. The
> question is: what are their values? 
>
>
>
> Regards,
>
> Arpad
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 10:53
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Hi Arpad,
>
>
>
> that is exactly what I tried and date_final or date_file are applied to
> the attribute of outgoing flowfile, it works.
>
> But if I put to attribute week_att, there is error: week_att cannot be
> coerced as String, and if I put str_week it gives me week number 44.
>
>
>
> Tom
>
>
>
> On Wed, 30 Jan 2019 at 08:40, Arpad Boda  wrote:
>
> Tom,
>
>
>
> The Python code to get the week number for a datetime string seems to be
> correct.
>
>
>
> To help debugging could you stamp your “date_final” or “date_file”
> variable to an attribute, so we could see what’s the input?
>
> My gut feeling says there is some parsing magic going wrong here.
>
>
>
> Regards,
>
> Arpad
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Tuesday, 29 January 2019 at 20:13
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> With the following script I get week number 44 and year 118, which is a strange
> result.
> Week should be 1 and year 2019 for the date 2018-12-31.
>
> What 

Re: Modify Flowfile attributes

2019-01-30 Thread Tomislav Novosel
Arpad,

I tried to pass the variables date_year, date_day and date_month to the outgoing
flowfile, and I get unexpected values.
For the day I get 1, for the year 118 and for the month 11,
and that gives week number 44 and year 118 according to my code.

It is strange that my code works as expected on your machine. I use NiFi
1.7.1.

Regards,
Tom

On Wed, 30 Jan 2019 at 11:25, Arpad Boda  wrote:

> Tom,
>
>
>
> Could you use logattribute processor and somehow log the value of your
> “date_final” variables?
>
>
>
> Tested your code with Jpython, with input string “181231” it works as
> expected (the result is 1st week of 2019).
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 11:10
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Yes, the values are correct. Attribute has value which is expected to be.
>
> i.e. for date 181231 in filename I get value 18231 for attribute
> week_extracted which is extracted from filename with split method.
>
>
>
> Tom.
>
>
>
> On Wed, 30 Jan 2019 at 10:59, Arpad Boda  wrote:
>
> Hi Tom,
>
>
>
> “that is exactly what I tried and date_final or date_file are applied to
> the attribute of outgoing flowfile, it works.”
>
>
>
> It works as they are strings, so not working would be a surprise. The
> question is: what are their values? 
>
>
>
> Regards,
>
> Arpad
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 10:53
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Hi Arpad,
>
>
>
> that is exactly what I tried and date_final or date_file are applied to
> the attribute of outgoing flowfile, it works.
>
> But if I put to attribute week_att, there is error: week_att cannot be
> coerced as String, and if I put str_week it gives me week number 44.
>
>
>
> Tom
>
>
>
> On Wed, 30 Jan 2019 at 08:40, Arpad Boda  wrote:
>
> Tom,
>
>
>
> The Python code to get the week number for a datetime string seems to be
> correct.
>
>
>
> To help debugging could you stamp your “date_final” or “date_file”
> variable to an attribute, so we could see what’s the input?
>
> My gut feeling says there is some parsing magic going wrong here.
>
>
>
> Regards,
>
> Arpad
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Tuesday, 29 January 2019 at 20:13
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> With the following script I get week number 44 and year 118, which is a strange
> result.
> Week should be 1 and year 2019 for the date 2018-12-31.
>
> What is wrong here?
>
>
>
> Tom
>
>
>
> from datetime import datetime, timedelta, date
>
>
>
> flowFile = session.get()
>
> if (flowFile != None):
>
> file_name = flowFile.getAttribute('filename')
>
>
>
> date_file = file_name.split("_")[6]
>
> date_final = date_file.split(".")[0]
>
> date_obj = datetime.strptime(date_final,'%y%m%d')
>
> date_year = date_obj.year
>
> date_day = date_obj.day
>
> date_month = date_obj.month
>
>
>
>     week_att = date(year=date_year, month=date_month,
> day=date_day).isocalendar()[1]
>
> year_att = date(year=date_year, month=date_month,
> day=date_day).isocalendar()[0]
>
> str_week = str(week_att)
>
> str_year = str(year_att)
>
>
>
> flowFile = session.putAttribute(flowFile, "year_extracted", str_year)
>
> flowFile = session.putAttribute(flowFile, "week_extracted", str_week)
>
> session.transfer(flowFile, REL_SUCCESS)
>
> session.commit()
>
>
>
> On Tue, 29 Jan 2019 at 16:59, Tomislav Novosel 
> wrote:
>
> Thank you all for the answers. The reason why I want to do this with a Python
> script is the wrong calculation of the week number from the date. NiFi has that
> function in expression lang. (extracted_date:format("w", <>)).
> My time zone is GMT+2.
>
> If I set a date, for example 20180819, and the time zone in the function to GMT,
> I get week number 34, which is wrong. If I omit the time zone, I get week number
> 33, which is right. I'm not sure if that's a bug. You can test it for yourself,
> and if you do, please share your findings here; maybe I'm doing something
> wrong.
>
>
>
> On the other side, if I use Python, I'm more sure that I will get the correct
> week number, even for dates which overlap with week

Re: Modify Flowfile attributes

2019-01-30 Thread Tomislav Novosel
Yes, the values are correct; the attribute has the value it is expected to have,
i.e. for the date 181231 in the filename I get the value 181231 for the attribute
week_extracted, which is extracted from the filename with the split method.

Tom.

On Wed, 30 Jan 2019 at 10:59, Arpad Boda  wrote:

> Hi Tom,
>
>
>
> “that is exactly what I tried and date_final or date_file are applied to
> the attribute of outgoing flowfile, it works.”
>
>
>
> It works as they are strings, so not working would be a surprise. The
> question is: what are their values? 
>
>
>
> Regards,
>
> Arpad
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Wednesday, 30 January 2019 at 10:53
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> Hi Arpad,
>
>
>
> that is exactly what I tried and date_final or date_file are applied to
> the attribute of outgoing flowfile, it works.
>
> But if I put to attribute week_att, there is error: week_att cannot be
> coerced as String, and if I put str_week it gives me week number 44.
>
>
>
> Tom
>
>
>
> On Wed, 30 Jan 2019 at 08:40, Arpad Boda  wrote:
>
> Tom,
>
>
>
> The Python code to get the week number for a datetime string seems to be
> correct.
>
>
>
> To help debugging could you stamp your “date_final” or “date_file”
> variable to an attribute, so we could see what’s the input?
>
> My gut feeling says there is some parsing magic going wrong here.
>
>
>
> Regards,
>
> Arpad
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Tuesday, 29 January 2019 at 20:13
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> With the following script I get week number 44 and year 118, which is a strange
> result.
> Week should be 1 and year 2019 for the date 2018-12-31.
>
> What is wrong here?
>
>
>
> Tom
>
>
>
> from datetime import datetime, timedelta, date
>
>
>
> flowFile = session.get()
>
> if (flowFile != None):
>
> file_name = flowFile.getAttribute('filename')
>
>
>
> date_file = file_name.split("_")[6]
>
> date_final = date_file.split(".")[0]
>
> date_obj = datetime.strptime(date_final,'%y%m%d')
>
> date_year = date_obj.year
>
> date_day = date_obj.day
>
> date_month = date_obj.month
>
>
>
> week_att = date(year=date_year, month=date_month,
> day=date_day).isocalendar()[1]
>
> year_att = date(year=date_year, month=date_month,
> day=date_day).isocalendar()[0]
>
> str_week = str(week_att)
>
> str_year = str(year_att)
>
>
>
> flowFile = session.putAttribute(flowFile, "year_extracted", str_year)
>
> flowFile = session.putAttribute(flowFile, "week_extracted", str_week)
>
> session.transfer(flowFile, REL_SUCCESS)
>
> session.commit()
>
>
>
> On Tue, 29 Jan 2019 at 16:59, Tomislav Novosel 
> wrote:
>
> Thank you all for the answers. The reason why I want to do this with a Python
> script is the wrong calculation of the week number from the date. NiFi has that
> function in expression lang. (extracted_date:format("w", <>)).
> My time zone is GMT+2.
>
> If I set a date, for example 20180819, and the time zone in the function to GMT,
> I get week number 34, which is wrong. If I omit the time zone, I get week number
> 33, which is right. I'm not sure if that's a bug. You can test it for yourself,
> and if you do, please share your findings here; maybe I'm doing something
> wrong.
>
>
>
> On the other side, if I use Python, I'm more sure that I will get the correct
> week number, even for dates whose week number overlaps with the next
> year (e.g. 20181231).
>
>
>
> Since this calculation will be in production, I need a resilient workflow in the
> future, without errors.
>
>
>
> Regarding the script I sent above, I'm getting the error: "week cannot be coerced
> as string". I checked right at the beginning whether the session is null or not.
>
>
>
> On Tue, 29 Jan 2019, 16:26 Jerry Vinokurov 
> I wanted to add, since I've done this specific operation many times, that
> you can really just do this via the NiFi expression language, which I think
> is more "idiomatic" than having ExecuteScript processors all over the
> place. Basically, you would have an UpdateAttribute that set something
> called, say, date_extracted with an expression that looks something like
> ${filename:substringAfterLast('_'):toDate('yyyy.MM.dd')} (this is an
> approximation based on the above, modify as necessary for your purpose).
>

Re: Modify Flowfile attributes

2019-01-30 Thread Tomislav Novosel
Hi Arpad,

that is exactly what I tried, and date_final or date_file are applied to the
attribute of the outgoing flowfile; it works.
But if I put week_att into the attribute, there is an error: week_att cannot be
coerced as String, and if I put str_week it gives me week number 44.

Tom

On Wed, 30 Jan 2019 at 08:40, Arpad Boda  wrote:

> Tom,
>
>
>
> The Python code to get the week number for a datetime string seems to be
> correct.
>
>
>
> To help debugging could you stamp your “date_final” or “date_file”
> variable to an attribute, so we could see what’s the input?
>
> My gut feeling says there is some parsing magic going wrong here.
>
>
>
> Regards,
>
> Arpad
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Tuesday, 29 January 2019 at 20:13
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Modify Flowfile attributes
>
>
>
> With the following script I get week number 44 and year 118, which is a strange
> result.
> Week should be 1 and year 2019 for the date 2018-12-31.
>
> What is wrong here?
>
>
>
> Tom
>
>
>
> from datetime import datetime, timedelta, date
>
>
>
> flowFile = session.get()
>
> if (flowFile != None):
>
> file_name = flowFile.getAttribute('filename')
>
>
>
> date_file = file_name.split("_")[6]
>
> date_final = date_file.split(".")[0]
>
> date_obj = datetime.strptime(date_final,'%y%m%d')
>
> date_year = date_obj.year
>
> date_day = date_obj.day
>
> date_month = date_obj.month
>
>
>
> week_att = date(year=date_year, month=date_month,
> day=date_day).isocalendar()[1]
>
> year_att = date(year=date_year, month=date_month,
> day=date_day).isocalendar()[0]
>
> str_week = str(week_att)
>
> str_year = str(year_att)
>
>
>
> flowFile = session.putAttribute(flowFile, "year_extracted", str_year)
>
> flowFile = session.putAttribute(flowFile, "week_extracted", str_week)
>
> session.transfer(flowFile, REL_SUCCESS)
>
> session.commit()
>
>
>
> On Tue, 29 Jan 2019 at 16:59, Tomislav Novosel 
> wrote:
>
> Thank you all for the answers. The reason why I want to do this with a Python
> script is the wrong calculation of the week number from the date. NiFi has that
> function in expression lang. (extracted_date:format("w", <>)).
> My time zone is GMT+2.
>
> If I set a date, for example 20180819, and the time zone in the function to GMT,
> I get week number 34, which is wrong. If I omit the time zone, I get week number
> 33, which is right. I'm not sure if that's a bug. You can test it for yourself,
> and if you do, please share your findings here; maybe I'm doing something
> wrong.
>
>
>
> On the other side, if I use Python, I'm more sure that I will get the correct
> week number, even for dates whose week number overlaps into the next
> year (e.g. 20181231).
>
>
>
> Since this calc will be in production, I need resilient workflow in the
> future without errors.
>
>
>
> Regarding the script I sent above, I'm getting the error: "week cannot be coerced
> as string". I checked right at the beginning whether the session is null or not.
>
>
>
> On Tue, 29 Jan 2019, 16:26 Jerry Vinokurov 
> I wanted to add, since I've done this specific operation many times, that
> you can really just do this via the NiFi expression language, which I think
> is more "idiomatic" than having ExecuteScript processors all over the
> place. Basically, you would have an UpdateAttribute that set something
> called, say, date_extracted with an expression that looks something like
> ${filename:substringAfterLast('_'):toDate('.MM.dd')} (this is an
> approximation based on the above, modify as necessary for your purpose).
> Then you could use a second UpdateAttribute to extract various information
> from this date with the format command, e.g. ${date_extracted:format(' format expression here>')}. I don't think there's one for "week" but in
> general this is the approach I take when I need to do date munging.
>
>
>
> On Tue, Jan 29, 2019 at 10:06 AM Tomislav Novosel 
> wrote:
>
> Hi Matt, thanks for the suggestions. But performance is not crucial here.
>
> This is the code I tried, but I get the error: "AttributeError: 'NoneType' object
> has no attribute 'getAttribute' at line number 4".
>
> If I remove the code from line 6 to line 14, it works with some default
> attribute values for year_extracted and week_extracted; otherwise I get the
>
> error from above.
>
>
>
> Tom
>
>
>
> from datetime import datetime, timedelta, date
>
>
>
> flowFile = session.get

Re: Modify Flowfile attributes

2019-01-29 Thread Tomislav Novosel
With the following script I get week number 44 and year 118, which is a strange
result.
The week should be 1 and the year 2019 for the date 2018-12-31.
What is wrong here?

Tom

from datetime import datetime, timedelta, date

flowFile = session.get()
if (flowFile != None):
    file_name = flowFile.getAttribute('filename')

    date_file = file_name.split("_")[6]
    date_final = date_file.split(".")[0]
    date_obj = datetime.strptime(date_final, '%y%m%d')
    date_year = date_obj.year
    date_day = date_obj.day
    date_month = date_obj.month

    week_att = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
    year_att = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
    str_week = str(week_att)
    str_year = str(year_att)

    flowFile = session.putAttribute(flowFile, "year_extracted", str_year)
    flowFile = session.putAttribute(flowFile, "week_extracted", str_week)
    session.transfer(flowFile, REL_SUCCESS)
    session.commit()
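[Editor's note] The year-boundary behaviour this thread is chasing can be reproduced with plain stdlib datetime outside NiFi (a standalone sketch, not the poster's Jython script; a reported year of 118 usually means a 1900-offset tm_year leaked through somewhere). `isocalendar()` already handles the ISO week rollover; the key is that the strptime pattern must match the width of the date in the filename ('%y%m%d' for 181231, '%Y%m%d' for 20181231):

```python
from datetime import datetime

# Two-digit vs four-digit year tokens: pick the one matching the filename.
d6 = datetime.strptime('181231', '%y%m%d')    # 6-digit date -> %y
d8 = datetime.strptime('20181231', '%Y%m%d')  # 8-digit date -> %Y
assert d6.date() == d8.date()                 # both are 2018-12-31

# isocalendar() returns (ISO year, ISO week, ISO weekday).
iso_year, iso_week, _ = d6.date().isocalendar()
print(iso_year, iso_week)  # 2018-12-31 belongs to ISO week 1 of 2019
```

So with the right pattern, 2018-12-31 comes out as week 1 of ISO year 2019 with no extra logic.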

On Tue, 29 Jan 2019 at 16:59, Tomislav Novosel  wrote:

> Thank you all for the answers. The reason why I want to do this with a Python
> script is the wrong calculation of the week number from the date. NiFi has that
> function in expression lang. (extracted_date:format("w", <>)).
> My time zone is GMT+2.
> If I set the date, for example 20180819, and the time zone in the function to
> GMT, I get week number 34, which is wrong. If I omit the time zone, I get week
> number 33, which is right. I'm not sure if that's a bug. You can test it for
> yourself, and if you do, please share your findings here, maybe I'm doing
> something wrong.
>
> On the other side, if I use Python, I'm more sure that I will get the correct
> week number, even for dates whose week number overlaps into the next
> year (e.g. 20181231).
>
> Since this calc will be in production, I need a resilient workflow in the
> future without errors.
>
> Regarding the script I sent above, I'm getting the error: "week cannot be coerced
> as string". I checked right at the beginning whether the session is null or not.
>
> On Tue, 29 Jan 2019, 16:26 Jerry Vinokurov 
>> I wanted to add, since I've done this specific operation many times, that
>> you can really just do this via the NiFi expression language, which I think
>> is more "idiomatic" than having ExecuteScript processors all over the
>> place. Basically, you would have an UpdateAttribute that set something
>> called, say, date_extracted with an expression that looks something like
>> ${filename:substringAfterLast('_'):toDate('.MM.dd')} (this is an
>> approximation based on the above, modify as necessary for your purpose).
>> Then you could use a second UpdateAttribute to extract various information
>> from this date with the format command, e.g. ${date_extracted:format('<some format expression here>')}. I don't think there's one for "week" but in
>> general this is the approach I take when I need to do date munging.
>>
>> On Tue, Jan 29, 2019 at 10:06 AM Tomislav Novosel 
>> wrote:
>>
>>> Hi Matt, thanks for the suggestions. But performance is not crucial here.
>>> This is the code I tried, but I get the error: "AttributeError: 'NoneType'
>>> object has no attribute 'getAttribute' at line number 4".
>>> If I remove the code from line 6 to line 14, it works with some default
>>> attribute values for year_extracted and week_extracted; otherwise I get the
>>> error from above.
>>>
>>> Tom
>>>
>>> from datetime import datetime, timedelta, date
>>>
>>> flowFile = session.get()
>>> file_name = flowFile.getAttribute('filename')
>>>
>>> date_file = file_name.split("_")[6]
>>> date_final = date_file.split(".")[0]
>>> date_obj = datetime.strptime(date_final,'%y%m%d')
>>> date_year = date_obj.year
>>> date_day = date_obj.day
>>> date_month = date_obj.month
>>>
>>> week = date(year=date_year, month=date_month, day=date_day).isocalendar
>>> ()[1]
>>> year = date(year=date_year, month=date_month, day=date_day).isocalendar
>>> ()[0]
>>>
>>> if (flowFile != None):
>>> flowFile = session.putAttribute(flowFile, "year_extracted", year)
>>> flowFile = session.putAttribute(flowFile, "week_extracted", week)
>>> session.transfer(flowFile, REL_SUCCESS)
>>> session.commit()
>>>
>>> On Tue, 29 Jan 2019 at 15:53, Matt Burgess  wrote:
>>>
>>>> Tom,
>>>>
>>>> Keep in mind that you are using Jython not Python, which I mention
>>>> only to point out that it is *much* slower than the native Java
>>>> process

Re: Modify Flowfile attributes

2019-01-29 Thread Tomislav Novosel
Thank you all for the answers. The reason why I want to do this with a Python
script is the wrong calculation of the week number from the date. NiFi has that
function in expression lang. (extracted_date:format("w", <>)).
My time zone is GMT+2.
If I set the date, for example 20180819, and the time zone in the function to
GMT, I get week number 34, which is wrong. If I omit the time zone, I get week
number 33, which is right. I'm not sure if that's a bug. You can test it for
yourself, and if you do, please share your findings here, maybe I'm doing
something wrong.

On the other side, if I use Python, I'm more sure that I will get the correct
week number, even for dates whose week number overlaps into the next
year (e.g. 20181231).

Since this calc will be in production, I need a resilient workflow in the
future without errors.

Regarding the script I sent above, I'm getting the error: "week cannot be coerced
as string". I checked right at the beginning whether the session is null or not.

On Tue, 29 Jan 2019, 16:26 Jerry Vinokurov wrote:

> I wanted to add, since I've done this specific operation many times, that
> you can really just do this via the NiFi expression language, which I think
> is more "idiomatic" than having ExecuteScript processors all over the
> place. Basically, you would have an UpdateAttribute that sets something
> called, say, date_extracted with an expression that looks something like
> ${filename:substringAfterLast('_'):toDate('.MM.dd')} (this is an
> approximation based on the above, modify as necessary for your purpose).
> Then you could use a second UpdateAttribute to extract various information
> from this date with the format command, e.g. ${date_extracted:format('<some format expression here>')}. I don't think there's one for "week" but in
> general this is the approach I take when I need to do date munging.
>
> On Tue, Jan 29, 2019 at 10:06 AM Tomislav Novosel 
> wrote:
>
>> Hi Matt, thanks for the suggestions. But performance is not crucial here.
>> This is the code I tried, but I get the error: "AttributeError: 'NoneType' object
>> has no attribute 'getAttribute' at line number 4".
>> If I remove the code from line 6 to line 14, it works with some default
>> attribute values for year_extracted and week_extracted; otherwise I get the
>> error from above.
>>
>> Tom
>>
>> from datetime import datetime, timedelta, date
>>
>> flowFile = session.get()
>> file_name = flowFile.getAttribute('filename')
>>
>> date_file = file_name.split("_")[6]
>> date_final = date_file.split(".")[0]
>> date_obj = datetime.strptime(date_final,'%y%m%d')
>> date_year = date_obj.year
>> date_day = date_obj.day
>> date_month = date_obj.month
>>
>> week = date(year=date_year, month=date_month, day=date_day).isocalendar
>> ()[1]
>> year = date(year=date_year, month=date_month, day=date_day).isocalendar
>> ()[0]
>>
>> if (flowFile != None):
>> flowFile = session.putAttribute(flowFile, "year_extracted", year)
>> flowFile = session.putAttribute(flowFile, "week_extracted", week)
>> session.transfer(flowFile, REL_SUCCESS)
>> session.commit()
>>
>> On Tue, 29 Jan 2019 at 15:53, Matt Burgess  wrote:
>>
>>> Tom,
>>>
>>> Keep in mind that you are using Jython not Python, which I mention
>>> only to point out that it is *much* slower than the native Java
>>> processors such as UpdateAttribute, and slower than other scripting
>>> engines such as Groovy or Javascript/Nashorn.
>>>
>>> If performance/throughput is not a concern and you're more comfortable
>>> with Jython, then Jerry's suggestion of session.putAttribute(flowFile,
>>> attributeName, attributeValue) should do the trick. Note that if you
>>> are adding more than a couple attributes, it's probably better to
>>> create a dictionary (eventually/actually, a Java Map)
>>> of attribute name/value pairs, and use putAllAttributes(flowFile,
>>> attributes) instead, as it is more performant.
>>>
>>> Regards,
>>> Matt
>>>
>>> On Tue, Jan 29, 2019 at 9:25 AM Tomislav Novosel 
>>> wrote:
>>> >
>>> > Thanks for the answer.
>>> >
>>> > Yes I know I can handle that with Expression language and
>>> UpdateAttribute processor, but this is specific case on my work and I think
>>> Python
>>> > is better and more simple solution. I need to calc that with python
>>> script.
>>> >
>>> > Tom
>>> >
>>> > On Tue, 29 Jan 2019 at 15:18, John McGinn 
>>> wrote:
>>> >>
>>> >> Since you're script sho

Re: Modify Flowfile attributes

2019-01-29 Thread Tomislav Novosel
Hi Matt, thanks for the suggestions. But performance is not crucial here.
This is the code I tried, but I get the error: "AttributeError: 'NoneType' object
has no attribute 'getAttribute' at line number 4".
If I remove the code from line 6 to line 14, it works with some default
attribute values for year_extracted and week_extracted; otherwise I get the
error from above.

Tom

from datetime import datetime, timedelta, date

flowFile = session.get()
file_name = flowFile.getAttribute('filename')

date_file = file_name.split("_")[6]
date_final = date_file.split(".")[0]
date_obj = datetime.strptime(date_final, '%y%m%d')
date_year = date_obj.year
date_day = date_obj.day
date_month = date_obj.month

week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]

if (flowFile != None):
    flowFile = session.putAttribute(flowFile, "year_extracted", year)
    flowFile = session.putAttribute(flowFile, "week_extracted", week)
    session.transfer(flowFile, REL_SUCCESS)
    session.commit()
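[Editor's note] A sketch of how the script above could be corrected, per the advice later in this thread: putAttribute() needs string values (hence the "cannot be coerced as string" error), the None check must come before the first use of flowFile (hence the 'NoneType' error), and several attributes are cheaper to set via putAllAttributes(). The pure-Python part is pulled into a helper (week_attributes is my own hypothetical name) so it can be tested outside NiFi; session and REL_SUCCESS only exist inside an ExecuteScript processor, so those lines are shown as comments:

```python
from datetime import datetime

def week_attributes(file_name):
    """Build the attribute map from a filename shaped like 'a_b_c_d_e_f_181231.csv'."""
    date_final = file_name.split("_")[6].split(".")[0]
    date_obj = datetime.strptime(date_final, '%y%m%d')
    iso = date_obj.date().isocalendar()  # (ISO year, ISO week, ISO weekday)
    # Attribute values must be strings, not ints.
    return {"year_extracted": str(iso[0]), "week_extracted": str(iso[1])}

# Inside an ExecuteScript processor the body would be (untested sketch):
# flowFile = session.get()
# if flowFile is not None:
#     attrs = week_attributes(flowFile.getAttribute('filename'))
#     flowFile = session.putAllAttributes(flowFile, attrs)
#     session.transfer(flowFile, REL_SUCCESS)
```

For example, week_attributes("a_b_c_d_e_f_181231.csv") yields string values for both attributes, with the year-boundary date landing in ISO week 1 of 2019.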

On Tue, 29 Jan 2019 at 15:53, Matt Burgess  wrote:

> Tom,
>
> Keep in mind that you are using Jython not Python, which I mention
> only to point out that it is *much* slower than the native Java
> processors such as UpdateAttribute, and slower than other scripting
> engines such as Groovy or Javascript/Nashorn.
>
> If performance/throughput is not a concern and you're more comfortable
> with Jython, then Jerry's suggestion of session.putAttribute(flowFile,
> attributeName, attributeValue) should do the trick. Note that if you
> are adding more than a couple attributes, it's probably better to
> create a dictionary (eventually/actually, a Java Map)
> of attribute name/value pairs, and use putAllAttributes(flowFile,
> attributes) instead, as it is more performant.
>
> Regards,
> Matt
>
> On Tue, Jan 29, 2019 at 9:25 AM Tomislav Novosel 
> wrote:
> >
> > Thanks for the answer.
> >
> > Yes, I know I can handle that with the Expression Language and the
> > UpdateAttribute processor, but this is a specific case at my work and I
> > think Python is the better and simpler solution. I need to calc that with
> > a Python script.
> >
> > Tom
> >
> > On Tue, 29 Jan 2019 at 15:18, John McGinn 
> wrote:
> >>
> >> Since your script shows that "filename" is an attribute of your
> flowfile, you could use the UpdateAttribute processor.
> >>
> >> If you right click on UpdateAttribute and choose ShowUsage, then choose
> Expression Language Guide, it shows you the things you can handle.
> >>
> >> Something along the lines of ${filename:getDelimitedField(6,'_')}, if I
> >> understand the Groovy code correctly. I did a GenerateFlowFile to an
> >> UpdateAttribute processor setting filename to "1_2_3_4_5_6.2_abc", then
> >> sent that to another UpdateAttribute with the getDelimitedField() I listed
> >> and I received 6.2. Then another UpdateAttribute could parse the 6.2 for
> >> the second substring, or you might be able to chain them in the existing
> >> UpdateAttribute processor.
> >>
> >>
> >> 
> >> On Tue, 1/29/19, Tomislav Novosel  wrote:
> >>
> >>  Subject: Modify Flowfile attributes
> >>  To: users@nifi.apache.org
> >>  Date: Tuesday, January 29, 2019, 9:04 AM
> >>
> >>  Hi all,
> >>
> >>  I'm trying to calculate week number and date from filename using
> >>  ExecuteScript processor and Jython. Here is the python script.
> >>  How can I add calculated attributes week and year to the flowfile?
> >>
> >>  Please help, thank you.
> >>  Tom
> >>
> >>  P.S. Maybe I completely missed the mark with this script.
> >>  Feel free to correct me.
> >>
> >>  import json
> >>  import java.io
> >>  from org.apache.commons.io import IOUtils
> >>  from java.nio.charset import StandardCharsets
> >>  from org.apache.nifi.processor.io import StreamCallback
> >>  from datetime import datetime, timedelta, date
> >>
> >>  class PyStreamCallback(StreamCallback):
> >>      def __init__(self, flowfile):
> >>          self.ff = flowfile
> >>          pass
> >>      def process(self, inputStream, outputStream):
> >>          file_name = self.ff.getAttribute("filename")
> >>          date_file = file_name.split("_")[6]
> >>          date_final = date_file.split(".")[0]
> >>          date_obj = datetime.strptime(date_final, '%y%m%d')
> >>          date_year = date_obj.year
> >>          date_day = date_obj.day
> >>          date_month = date_obj.month
> >>
> >>          week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
> >>          year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
> >>
> >>  flowFile = session.get()
> >>  if (flowFile != None):
> >>      session.transfer(flowFile, REL_SUCCESS)
> >>      session.commit()
>


Re: Modify Flowfile attributes

2019-01-29 Thread Tomislav Novosel
Thanks for the answer.

Yes, I know I can handle that with the Expression Language and the UpdateAttribute
processor, but this is a specific case at my work and I think Python
is the better and simpler solution. I need to calc that with a Python script.

Tom

On Tue, 29 Jan 2019 at 15:18, John McGinn  wrote:

> Since your script shows that "filename" is an attribute of your
> flowfile, you could use the UpdateAttribute processor.
>
> If you right click on UpdateAttribute and choose ShowUsage, then choose
> Expression Language Guide, it shows you the things you can handle.
>
> Something along the lines of ${filename:getDelimitedField(6,'_')}, if I
> understand the Groovy code correctly. I did a GenerateFlowFile to an
> UpdateAttribute processor setting filename to "1_2_3_4_5_6.2_abc", then
> sent that to another UpdateAttribute with the getDelimitedField() I listed
> and I received 6.2. Then another UpdateAttribute could parse the 6.2 for
> the second substring, or you might be able to chain them in the existing
> UpdateAttribute processor.
>
>
> ----
> On Tue, 1/29/19, Tomislav Novosel  wrote:
>
>  Subject: Modify Flowfile attributes
>  To: users@nifi.apache.org
>  Date: Tuesday, January 29, 2019, 9:04 AM
>
>  Hi all,
>
>  I'm trying to calculate week number and date from filename using
>  ExecuteScript processor and Jython. Here is the python script.
>  How can I add calculated attributes week and year to the flowfile?
>
>  Please help, thank you.
>  Tom
>
>  P.S. Maybe I completely missed the mark with this script.
>  Feel free to correct me.
>
>  import json
>  import java.io
>  from org.apache.commons.io import IOUtils
>  from java.nio.charset import StandardCharsets
>  from org.apache.nifi.processor.io import StreamCallback
>  from datetime import datetime, timedelta, date
>
>  class PyStreamCallback(StreamCallback):
>      def __init__(self, flowfile):
>          self.ff = flowfile
>          pass
>      def process(self, inputStream, outputStream):
>          file_name = self.ff.getAttribute("filename")
>          date_file = file_name.split("_")[6]
>          date_final = date_file.split(".")[0]
>          date_obj = datetime.strptime(date_final, '%y%m%d')
>          date_year = date_obj.year
>          date_day = date_obj.day
>          date_month = date_obj.month
>
>          week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
>          year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
>
>  flowFile = session.get()
>  if (flowFile != None):
>      session.transfer(flowFile, REL_SUCCESS)
>      session.commit()
>


Modify Flowfile attributes

2019-01-29 Thread Tomislav Novosel
Hi all,

I'm trying to calculate the week number and date from the filename using the
ExecuteScript processor and Jython. Here is the python script.
How can I add the calculated attributes week and year to the flowfile?

Please help, thank you.
Tom

P.S. Maybe I completely missed the mark with this script. Feel free to correct me.


import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
from datetime import datetime, timedelta, date

class PyStreamCallback(StreamCallback):
    def __init__(self, flowfile):
        self.ff = flowfile
        pass
    def process(self, inputStream, outputStream):
        file_name = self.ff.getAttribute("filename")
        date_file = file_name.split("_")[6]
        date_final = date_file.split(".")[0]
        date_obj = datetime.strptime(date_final, '%y%m%d')
        date_year = date_obj.year
        date_day = date_obj.day
        date_month = date_obj.month

        week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
        year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]

flowFile = session.get()
if (flowFile != None):
    session.transfer(flowFile, REL_SUCCESS)
    session.commit()


Re: Minimum file age

2019-01-28 Thread Tomislav Novosel
Thank you for replies and suggestions.

Yeah, the lack of this feature is a little disturbing, but it's OK. For now, as I
have control over the remote server, I use a dotted file and remove the dot (".")
after the file is finished.
The "Ignore Dotted Files" property is set to true.

Cheers,
Tom

On Mon, 28 Jan 2019 at 11:09,  wrote:

> Hi Tom
>
>
>
> I suggest to use a Wait Processor (to delay the fetch) together with
> DetectDuplicate Processor. In that way you will fetch the file only once
> and after it has been written completely (as long as you know how long it
> takes in max. to finish writing). I know it’s not nice but that’s how we do
> it for the moment… I’m waiting for this feature as well :-(.
>
>
>
> Cheers Josef
>
>
>
>
>
> *From: *Arpad Boda 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Monday, 28 January 2019 at 10:17
> *To: *"users@nifi.apache.org" 
> *Subject: *Re: Minimum file age
>
>
>
> Hi,
>
>
>
> It’s on the way: https://issues.apache.org/jira/browse/NIFI-5977 :)
>
>
>
> Regards,
>
> Arpad
>
>
>
> *From: *Tomislav Novosel 
> *Reply-To: *"users@nifi.apache.org" 
> *Date: *Monday, 28 January 2019 at 09:19
> *To: *"users@nifi.apache.org" 
> *Subject: *Minimum file age
>
>
>
> Hi all,
>
>
>
> I'm having an issue with the ListSFTP processor in NiFi. When reading files from
> a folder where another process writes files, it lists the same file multiple
> times and ingests the file multiple times, because the modification date of the
> file changes rapidly as the other process writes to it.
>
>
>
> It appears that NiFi lists faster than the external process writes, so before
> the end of writing (conversion of the file from one format to another), NiFi
> lists the file multiple times and then creates duplicates.
>
>
>
> There is no Minimum File Age property like in the ListFile processor.
>
>
>
> How can I resolve this to wait for the moment when the file is converted
> completely and then list the file and pass it to the FetchSFTP processor?
>
>
>
> Thanks in advance,
>
> Tom.
>


Minimum file age

2019-01-28 Thread Tomislav Novosel
Hi all,

I'm having an issue with the ListSFTP processor in NiFi. When reading files from
a folder where another process writes files, it lists the same file multiple
times and ingests the file multiple times, because the modification date of the
file changes rapidly as the other process writes to it.

It appears that NiFi lists faster than the external process writes, so before
the end of writing (conversion of the file from one format to another), NiFi
lists the file multiple times and then creates duplicates.

There is no Minimum File Age property like in the ListFile processor.

How can I resolve this to wait for the moment when the file is converted
completely and then list the file and pass it to the FetchSFTP processor?

Thanks in advance,
Tom.


Nifi fetching files

2018-10-12 Thread Tomislav Novosel
Hi Nifi team,

my use case is to list and fetch files using the ListFile and FetchFile
processors from an intermediate folder which is also the destination of my
external .exe script.

Fetching from that folder has the completion strategy set to delete files.

How can I wait for, let's say, 10 minutes before files are fetched from that
folder, to avoid conflicts between the two processes (NiFi and the external
script) and exceptions like "Access denied", etc.?

Thanks,
Tom


Re: Putting data to Solr index using Nifi

2018-09-27 Thread Tomislav Novosel
Ok, I understand.
I succeeded in creating a Docker container w/ Solr in standalone mode
(non-cloud).
That's enough to test some features, but in the future we will use a
non-embedded ZooKeeper for production use.

Thanks again.
BR,
Tom

On Thu, 27 Sep 2018 at 15:34, Mike Thomsen  wrote:

> Tom,
>
> What I meant was a "non-production" sort of test. You definitely wouldn't
> want to build anything mission-critical using Solr w/ embedded ZooKeeper
> instead of multiple Solr nodes w/ a full production-grade ZooKeeper cluster.
>
> With that said, I'm pretty rusty w/ Solr (been using ElasticSearch instead
> for going on 2 years), but this should suffice:
>
> docker run --name SOMETHING -p 8983:8983 -d solr:latest
>
> That'll bring up a vanilla single node of non-Cloud Solr that you can mess
> with.
>
> Beyond that, you're going to need to dig into Solr documentation on doing
> a Docker deployment of a cluster.
>
> One caveat about Python is the last time my colleagues and I used Python
> w/ SolrCloud, the API we used skipped ZooKeeper and went directly to an
> enumerated list of Solr nodes that we provided. So I would not consider
> your experience with Python to be fungible with the Java APIs because they
> use ZooKeeper for SolrCloud.
>
> Mike
>
> On Thu, Sep 27, 2018 at 9:24 AM Tomislav Novosel 
> wrote:
>
>> Hi Mike, thanks for the answer.
>>
>> How should I configure the PutSolrContentStream processor to go directly to
>> Solr?
>> This is not a local test. The Docker host running the Solr container is on a
>> remote server.
>> I don't understand how I can access the Solr collection on the Docker host
>> address and Solr port :
>> from a Python script using pysolr, but from NiFi I can't do that.
>>
>> BR,
>> Tom
>>
>> On Thu, 27 Sep 2018 at 14:28, Mike Thomsen 
>> wrote:
>>
>>> I think I've run into similar problems with SolrCloud in the past w/
>>> Docker. SolrCloud stores the IP address it binds to in ZooKeeper, which is
>>> why you see the Docker internal IP address there and not localhost:8983
>>> since presumably you're using localhost: as the Solr Location. I
>>> think you can force Solr to use a particular IP address with an environment
>>> variable on startup.
>>>
>>> If this is a local test, you shouldn't have any problems skipping
>>> ZooKeeper and going straight to Solr since the interface is the same
>>> between cloud and non-cloud Solr for everything that matters here.
>>>
>>> On Thu, Sep 27, 2018 at 7:56 AM Tomislav Novosel 
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>>
>>>>
>>>> I'm trying to put data in Solr Index using Nifi. Solr is v7.5.0 and
>>>> Nifi is v1.6.0.
>>>>
>>>> I'm using PutSolrContentStream processor and Solr in Solrcloud mode
>>>> with embedded zookeeper
>>>>
>>>> Inside docker container. I exposed Solr admin port and zookeeper port
>>>> to be accessible through browser.
>>>>
>>>>
>>>>
>>>> I configured Nifi processor to Solr Cloud, gave collection name and
>>>> Solr location(hostname where docker with Solr container is
>>>>
>>>> and port to embedded zookeeper)
>>>>
>>>>
>>>>
>>>> After I tried to put data into collection, i got error:
>>>>
>>>>
>>>>
>>>> org.apache.solr.client.solrj.SolrServerException: No live SolrServers
>>>> available to handle this request:[
>>>> http://172.17.0.16:8983/solr/monitoringapi_shard1_replica_n1]
>>>>
>>>> at
>>>> org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:382)
>>>>
>>>> at
>>>> org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1291)
>>>>
>>>> at
>>>> org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1061)
>>>>
>>>> at
>>>> org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:997)
>>>>
>>>> at
>>>> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
>>>>
>>>> at
>>>> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:166)
>>>>
>>>> at
>>>> org.apache.nifi.processors.solr.Put

Re: Putting data to Solr index using Nifi

2018-09-27 Thread Tomislav Novosel
Hi Mike, thanks for the answer.

How should I configure the PutSolrContentStream processor to go directly to
Solr?
This is not a local test. The Docker host running the Solr container is on a
remote server.
I don't understand how I can access the Solr collection on the Docker host
address and Solr port :
from a Python script using pysolr, but from NiFi I can't do that.

BR,
Tom
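[Editor's note] The direct-HTTP path that works from Python talks to a single Solr node and bypasses ZooKeeper entirely, which is why pysolr succeeds where the CloudSolrClient (resolving the Docker-internal IP from ZooKeeper) fails. A minimal sketch, where the host, port, and collection names are placeholders and the commented pysolr calls are an untested sketch of that library's documented usage:

```python
def solr_core_url(host, port, collection):
    # pysolr (and any plain-HTTP Solr client) addresses one node directly,
    # so no ZooKeeper-published internal IPs are involved.
    return "http://%s:%d/solr/%s" % (host, port, collection)

url = solr_core_url("docker-host.example.com", 8983, "monitoringapi")
print(url)

# Untested sketch with pysolr (placeholder data):
# import pysolr
# solr = pysolr.Solr(url, timeout=10)
# solr.add([{"id": "doc1", "title": "hello"}])
# solr.commit()
```

The same direct-to-node addressing is what a non-cloud (Standard) Solr configuration in NiFi would use, as Mike suggests below.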

On Thu, 27 Sep 2018 at 14:28, Mike Thomsen  wrote:

> I think I've run into similar problems with SolrCloud in the past w/
> Docker. SolrCloud stores the IP address it binds to in ZooKeeper, which is
> why you see the Docker internal IP address there and not localhost:8983
> since presumably you're using localhost: as the Solr Location. I
> think you can force Solr to use a particular IP address with an environment
> variable on startup.
>
> If this is a local test, you shouldn't have any problems skipping
> ZooKeeper and going straight to Solr since the interface is the same
> between cloud and non-cloud Solr for everything that matters here.
>
> On Thu, Sep 27, 2018 at 7:56 AM Tomislav Novosel 
> wrote:
>
>> Hi all,
>>
>>
>>
>> I'm trying to put data in Solr Index using Nifi. Solr is v7.5.0 and Nifi
>> is v1.6.0.
>>
>> I'm using PutSolrContentStream processor and Solr in Solrcloud mode with
>> embedded zookeeper
>>
>> Inside docker container. I exposed Solr admin port and zookeeper port to
>> be accessible through browser.
>>
>>
>>
>> I configured Nifi processor to Solr Cloud, gave collection name and Solr
>> location(hostname where docker with Solr container is
>>
>> and port to embedded zookeeper)
>>
>>
>>
>> After I tried to put data into collection, i got error:
>>
>>
>>
>> org.apache.solr.client.solrj.SolrServerException: No live SolrServers
>> available to handle this request:[
>> http://172.17.0.16:8983/solr/monitoringapi_shard1_replica_n1]
>>
>> at
>> org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:382)
>>
>> at
>> org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1291)
>>
>> at
>> org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1061)
>>
>> at
>> org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:997)
>>
>> at
>> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
>>
>> at
>> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:166)
>>
>> at
>> org.apache.nifi.processors.solr.PutSolrContentStream$1.process(PutSolrContentStream.java:242)
>>
>> at
>> org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2207)
>>
>> at
>> org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2175)
>>
>> at
>> org.apache.nifi.processors.solr.PutSolrContentStream.onTrigger(PutSolrContentStream.java:199)
>>
>> at
>> org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>>
>> at
>> org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1147)
>>
>> at
>> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
>>
>> at
>> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
>>
>> at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>
>> at
>> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>>
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>>
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>
>> at java.lang.Thread.run(Thread.java:748)
>>
>> Caused by: org.apache.solr.client.solrj.SolrServerException: Server
>> refused connection at:
>> http://172.17.0.16:8983/solr/monitoringapi_

Putting data to Solr index using Nifi

2018-09-27 Thread Tomislav Novosel
Hi all,

I'm trying to put data into a Solr index using NiFi. Solr is v7.5.0 and NiFi is
v1.6.0.

I'm using the PutSolrContentStream processor and Solr in SolrCloud mode with
embedded ZooKeeper inside a Docker container. I exposed the Solr admin port and
the ZooKeeper port to be accessible through the browser.

I configured the NiFi processor for Solr Cloud, and gave the collection name and
the Solr location (the hostname of the Docker host with the Solr container, and
the port of the embedded ZooKeeper).

After I tried to put data into the collection, I got this error:



org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request:
[http://172.17.0.16:8983/solr/monitoringapi_shard1_replica_n1]
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:382)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1291)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1061)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:997)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:166)
    at org.apache.nifi.processors.solr.PutSolrContentStream$1.process(PutSolrContentStream.java:242)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2207)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2175)
    at org.apache.nifi.processors.solr.PutSolrContentStream.onTrigger(PutSolrContentStream.java:199)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1147)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.solr.client.solrj.SolrServerException: Server refused connection at: http://172.17.0.16:8983/solr/monitoringapi_shard1_replica_n1
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:599)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:261)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:250)
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:403)
    at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:355)
    ... 20 common frames omitted
Caused by: java.net.ConnectException: Connection timed out: connect
    at java.net.TwoStacksPlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)

at
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)

at
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)

at
java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)

at
java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)

at java.net.Socket.connect(Socket.java:589)

at
org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:117)

at
org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:177)

at
org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:304)

at
org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:611)

at
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:446)

at