Re: Finding slow down in processing

2024-01-10 Thread Lars Winderling
Hi Aaron, is the number of threads set sufficiently high? Once I set it too low 
by accident on a very powerful machine, and when we got more and more flows, at 
some point NiFi slowed down tremendously. By increasing threads to the 
recommended setting (a few per core, cf. admin docs) we got NiFi back to speed.
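
(Afaik this is set in the UI rather than nifi.properties; a rough sketch, with the exact numbers depending on cores and workload:)

    Global menu > Controller Settings > General
      Maximum Timer Driven Thread Count: 2-4 x CPU cores (e.g. 32-64 on a 16-core node)
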
Another cause of performance loss might be other workloads in the same cluster. 
With some cloud providers, you might also get throttled for high 
disk/resource/... usage. Just a thought.
Anything in the logs? Maybe your repositories for content, flowfiles etc. are 
full, and NiFi cannot cope with archiving and shuffling in the background. But 
there should be an indication in the logs.
Good luck, Lars
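
(On Aaron's diagnostics question: the bootstrap can dump a diagnostics bundle; a minimal sketch, assuming a standard 1.x install:)

    # writes thread dumps, repository usage and flow information to the given file
    ./bin/nifi.sh diagnostics diagnostics.txt
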

On 10 January 2024 18:09:07 CET, Joe Witt  wrote:
>Aaron,
>
>The usual suspects are memory consumption leading to high GC leading to
>lower performance over time, or back pressure in the flow, etc. But your
>description does not really fit either exactly.  Does your flow see a mix
>of large objects and smaller objects?
>
>Thanks
>
>On Wed, Jan 10, 2024 at 10:07 AM Aaron Rich  wrote:
>
>> Hi all,
>>
>>
>>
>> I’m running into an odd issue and hoping someone can point me in the right
>> direction.
>>
>>
>>
>> I have NiFi 1.19 deployed in a Kube cluster with all the repositories
>> volume mounted out. It was processing great with processors like
>> UpdateAttribute sending through 15K/5m and PutFile sending through 3K/5m.
>>
>>
>>
>> With nothing changing in the deployment, the performance has dropped to
>> UpdateAttribute doing 350/5m and PutFile 200/5m.
>>
>>
>>
>> I’m trying to determine what resource is suddenly dropping our performance
>> like this. I don’t see anything on the Kube monitoring that stands out and
>> I have restarted, cleaned repos, changed nodes but nothing is helping.
>>
>>
>>
>> I was hoping there is something from the NiFi POV that can help identify
>> the limiting resource. I'm not sure if there is additional
>> diagnostic/debug/etc information available beyond the node status graphs.
>>
>>
>>
>> Any help would be greatly appreciated.
>>
>>
>>
>> Thanks.
>>
>>
>>
>> -Aaron
>>


Re: [NIFI 1.23.2] Insecure Cipher Provider Algorithm

2023-12-13 Thread Lars Winderling

Hi Quentin,

I second these findings. I'm getting the same error on 1.23.2 using the 
same ciphers.

  deb: 11
  java: 17.0.7 Temurin

Best,
Lars
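
(For reference, the command in question; run from the NiFi home directory while NiFi is stopped:)

    # re-encrypts the sensitive values in the flow configuration with the new algorithm
    ./bin/nifi.sh set-sensitive-properties-algorithm NIFI_PBKDF2_AES_GCM_256
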

On 23-12-13 14:58, Quentin HORNEMAN GUTTON wrote:

Hello,

I’m facing an issue after upgrading NiFi 1.13.2 to 1.23.2.

I have a warn log with Insecure Cipher Provider Algorithm 
[PBEWITHMD5AND256BITAES-CBC-OPENSSL]. I tried to update algorithm with 
the set-sensitive-properties-algorithm command to 
NIFI_PBKDF2_AES_GCM_256 but I have an error message with « Decryption 
failed with algorithm » caused by « pad block corrupted ».


Do you have any information that could help me?

Best regards,

Quentin HORNEMAN GUTTON






Re: NIFI-REGISTRY Migration

2023-09-29 Thread Lars Winderling

Hi Minh,

you could try to employ URL aliasing as explained in the docs. This 
way, the registry will store only an 
alias instead of the actual registry address. When migrating to another 
host, make sure to assign the same alias to it, and it should be able to 
retrieve versioning information properly.
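
(A sketch of the alias mapping, afaik configured in the registry's conf/registry-aliases.xml; the names below are illustrative:)

    <aliases>
        <alias>
            <internal>PROD_REGISTRY</internal>
            <external>https://registry.example.com:18443</external>
        </alias>
    </aliases>
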
Another option would be to use a persistent host name (same A/AAAA 
entries, or a CNAME).

This way, I have migrated our registry already a few times.

I hope this helps.
Best,
Lars

On 23-09-29 10:44, e-soci...@gmx.fr wrote:

Hello all,
Has someone already migrated their nifi-registry from hostA to hostB 
and kept the version tracking for the NiFi clients 
pointing to hostB?

Is there any documentation about migrating nifi-registry?
regards
Minh






Re: NiFi not rolling logs

2023-07-10 Thread Lars Winderling

David,
great recommendations, thanks!
Best, Lars

On 23-07-10 15:59, David Handermann wrote:
It is worth noting that the totalSizeCap can be added at any time to 
an existing NiFi deployment; it is not necessary to wait for an 
updated release.


Also related, the nifi-deprecation.log default setting includes a 100 
MB totalSizeCap to avoid excessive repetition of deprecation messages, 
and that has worked well for several releases.


Regards,
David Handermann

On Sun, Jul 9, 2023 at 3:26 PM Mike Thomsen  
wrote:


The totalcapsize feature will help a lot from what I’ve seen.

Sent from my iPhone


On Jul 8, 2023, at 9:26 AM, Lars Winderling
 wrote:


Hi Mike,

thanks for the advice. Our NiFi instances are running for weeks,
if not months, often until the next update, so the startup
option won't bring much benefit, I fear. Or am I mistaken? But
looking forward to 1.23!


On 8 July 2023 13:40:15 CEST, Mike Thomsen
 wrote:

Lars,

You should also experiment with cleanHistoryOnStart. I did
some experimentation this morning where I set the maxHistory
to 1 (1 day vs the default of 30 which is 30 days), created a
few fake log files from previous days and NiFi immediately
cleared out those "old files" on startup. I have a Jira
ticket up to fix this for 1.x and 2.x and will likely have it
up today. Should definitely be ready for 1.23

On Sat, Jul 8, 2023 at 4:17 AM Lars Winderling
 wrote:

Dear NiFiers, we have been bugged so much by overflowing
logfiles, and nothing has ever helped. I thought it was
just my lack of skills...especially when NiFi has some
issues and keeps on spilling stacktraces with high
frequency to disk, it eats up space quickly. I have
created cronjobs that rotate logs every minute iff
required, and when almost no space is left, it simply
deletes old files. Will try totalSizeCap etc. Thank you
for the pointers! Best, Lars


On 8 July 2023 09:33:41 CEST, "Jens M. Kofoed"
 wrote:

Hi

Please have a look at this old jira:
https://issues.apache.org/jira/browse/NIFI-2203
I have had issues where a processor creates a log
message every 10ms, resulting in the disk being
full. For me it seems like the maxHistory setting
only affects how many files, as defined by the
rolling pattern, are kept. If you have defined it
like this:

${org.apache.nifi.bootstrap.config.log.dir}/nifi-app%d{yyyy-MM-dd}.%i.log
maxHistory only affects the days, not the
incremental %i files per day. So you can still
have thousands of files in one day.
The totalSizeCap will delete the oldest files if
the total size hits the cap setting.

The totalSizeCap has been added in the logback.xml
file for nifi-registry, where it sits inside the
rollingPolicy section. I could not get it to work
inside the rollingPolicy section in nifi but just
added it in the appender section. See my comment
in the jira:
https://issues.apache.org/jira/browse/NIFI-2203

Kind regards
Jens M. Kofoed

Den lør. 8. jul. 2023 kl. 04.27 skrev Mike Thomsen
:

Yeah, I'm working through some of it where I have
time. I plan to have a Jira up this weekend. I'm
wondering, though, if we shouldn't consider a
spike for switching to log4j2 in 2.X because I
saw a lot of complaints about logback being
inconsistent in honoring its settings.

On Fri, Jul 7, 2023 at 10:19 PM Joe Witt
 wrote:

H. Interesting.  Can you capture these
bits of fun in a jira?

Thanks

On Fri, Jul 7, 2023 at 7:17 PM Mike Thomsen
 wrote:

After doing some research, it appears
that <maxHistory> is a wonky setting
WRT how well it's honored by logback. I
let a GenerateFlowFile > LogAttribute
flow run for a long time, and it just
kept filling up. When I added
<totalSizeCap> that appeared to force
expected behavior on total log size. We
might want to add the following:

<cleanHistoryOnStart>true</cleanHistoryOnStart>
<totalSizeCap>50GB</totalSizeCap>
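
(For orientation, the standard logback placement of those two settings, sketched against NiFi's default rolling appender; Jens reports above that in NiFi he only got totalSizeCap working at the appender level, so verify where it takes effect in your deployment:)

    <appender name="APP_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>${org.apache.nifi.bootstrap.config.log.dir}/nifi-app.log</file>
        <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
            <fileNamePattern>${org.apache.nifi.bootstrap.config.log.dir}/nifi-app_%d{yyyy-MM-dd_HH}.%i.log</fileNamePattern>
            <maxFileSize>100MB</maxFileSize>
            <maxHistory>30</maxHistory>
            <cleanHistoryOnStart>true</cleanHistoryOnStart>
            <totalSizeCap>50GB</totalSizeCap>
        </rollingPolicy>
    </appender>
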

Re: NiFi not rolling logs

2023-07-08 Thread Lars Winderling
Hi Mike,

thanks for the advice. Our NiFi instances are running for weeks, if not months, 
often until the next update, so the startup option won't bring much benefit, I 
fear. Or am I mistaken? But looking forward to 1.23!

On 8 July 2023 13:40:15 CEST, Mike Thomsen  wrote:
>Lars,
>
>You should also experiment with cleanHistoryOnStart. I did some
>experimentation this morning where I set the maxHistory to 1 (1 day vs the
>default of 30 which is 30 days), created a few fake log files from previous
>days and NiFi immediately cleared out those "old files" on startup. I have
>a Jira ticket up to fix this for 1.x and 2.x and will likely have it up
>today. Should definitely be ready for 1.23
>
>On Sat, Jul 8, 2023 at 4:17 AM Lars Winderling 
>wrote:
>
>> Dear NiFiers, we have been bugged so much by overflowing logfiles, and
>> nothing has ever helped. I thought it was just my lack of
>> skills...especially when NiFi has some issues and keeps on spilling
>> stacktraces with high frequency to disk, it eats up space quickly. I have
>> created cronjobs that rotate logs every minute iff required, and when
>> almost no space is left, it simply deletes old files. Will try totalSizeCap
>> etc. Thank you for the pointers! Best, Lars
>>
>>
>> On 8 July 2023 09:33:41 CEST, "Jens M. Kofoed" 
>> wrote:
>>
>>> Hi
>>>
>>> Please have a look at this old jira:
>>> https://issues.apache.org/jira/browse/NIFI-2203
>>> I have had issues where a processor creates a log message every 10ms,
>>> resulting in the disk being full. For me it seems like the maxHistory
>>> setting only affects how many files, as defined by the rolling pattern,
>>> are kept. If you have defined it like this:
>>>
>>> ${org.apache.nifi.bootstrap.config.log.dir}/nifi-app%d{yyyy-MM-dd}.%i.log
>>> maxHistory only affects the days, not the incremental %i files per day. So
>>> you can still have thousands of files in one day.
>>> The totalSizeCap will delete the oldest files if the total size hits the
>>> cap setting.
>>>
>>> The totalSizeCap has been added in the logback.xml file for
>>> nifi-registry, where it sits inside the rollingPolicy section. I
>>> could not get it to work inside the rollingPolicy section in nifi but just
>>> added it in the appender section. See my comment in the jira:
>>> https://issues.apache.org/jira/browse/NIFI-2203
>>>
>>> Kind regards
>>> Jens M. Kofoed
>>>
>>> Den lør. 8. jul. 2023 kl. 04.27 skrev Mike Thomsen <
>>> mikerthom...@gmail.com>:
>>>
>>>> Yeah, I'm working through some of it where I have time. I plan to have a
>>>> Jira up this weekend. I'm wondering, though, if we shouldn't consider a
>>>> spike for switching to log4j2 in 2.X because I saw a lot of complaints
>>>> about logback being inconsistent in honoring its settings.
>>>>
>>>> On Fri, Jul 7, 2023 at 10:19 PM Joe Witt  wrote:
>>>>
>>>>> H.  Interesting.  Can you capture these bits of fun in a jira?
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Fri, Jul 7, 2023 at 7:17 PM Mike Thomsen 
>>>>> wrote:
>>>>>
>>>>>> After doing some research, it appears that <maxHistory> is a wonky
>>>>>> setting WRT how well it's honored by logback. I let a GenerateFlowFile >
>>>>>> LogAttribute flow run for a long time, and it just kept filling up. When I
>>>>>> added <totalSizeCap> that appeared to force expected behavior on total log
>>>>>> size. We might want to add the following:
>>>>>>
>>>>>> <cleanHistoryOnStart>true</cleanHistoryOnStart>
>>>>>> <totalSizeCap>50GB</totalSizeCap>
>>>>>>
>>>>>> On Fri, Jul 7, 2023 at 11:33 AM Michael Moser 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Mike,
>>>>>>>
>>>>>>> You aren't alone in experiencing this.  I think logback uses a
>>>>>>> pattern matcher on filename to discover files to delete.  If "something"
>>>>>>> happens which causes a gap in the date pattern, then the matcher will
>>>>>>> fail to pick up and delete files on the other side of that gap.
>>>>>>>
>>>>>>> Regards,
>>>>>>> -- Mike M
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jul 6, 2023 at 10:28 AM Mike Thomsen 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> We are using the stock configuration, and have noticed that we have
>>>>>>>> a lot of nifi-app* logs that are well beyond the historic data cap of 
>>>>>>>> 30
>>>>>>>> days in logback.xml; some of those logs go back to April. We also have 
>>>>>>>> a
>>>>>>>> bunch of 0 byte nifi-user logs and some of the other logs are 0 bytes 
>>>>>>>> as
>>>>>>>> well. It looks like logback is rotating based on time, but isn't 
>>>>>>>> cleaning
>>>>>>>> up. Is this expected behavior or a problem with the configuration?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Mike
>>>>>>>>
>>>>>>>


Re: NiFi not rolling logs

2023-07-08 Thread Lars Winderling
Dear NiFiers, we have been bugged so much by overflowing logfiles, and nothing 
has ever helped. I thought it was just my lack of skills...especially when NiFi 
has some issues and keeps on spilling stacktraces with high frequency to disk, 
it eats up space quickly. I have created cronjobs that rotate logs every minute 
iff required, and when almost no space is left, it simply deletes old files. 
Will try totalSizeCap etc. Thank you for the pointers! Best, Lars

On 8 July 2023 09:33:41 CEST, "Jens M. Kofoed"  wrote:
>Hi
>
>Please have a look at this old jira:
>https://issues.apache.org/jira/browse/NIFI-2203
>I have had issues where a processor creates a log message every 10ms,
>resulting in the disk being full. For me it seems like the maxHistory
>setting only affects how many files, as defined by the rolling pattern, are
>kept. If you have defined it like this:
>${org.apache.nifi.bootstrap.config.log.dir}/nifi-app%d{yyyy-MM-dd}.%i.log
>maxHistory only affects the days, not the incremental %i files per day. So you
>can still have thousands of files in one day.
>The totalSizeCap will delete the oldest files if the total size hits the cap
>setting.
>
>The totalSizeCap has been added in the logback.xml file for nifi-registry,
>where it sits inside the rollingPolicy section. I could not get
>it to work inside the rollingPolicy section in nifi but just added it
>in the appender section. See my comment in the jira:
>https://issues.apache.org/jira/browse/NIFI-2203
>
>Kind regards
>Jens M. Kofoed
>
>Den lør. 8. jul. 2023 kl. 04.27 skrev Mike Thomsen :
>
>> Yeah, I'm working through some of it where I have time. I plan to have a
>> Jira up this weekend. I'm wondering, though, if we shouldn't consider a
>> spike for switching to log4j2 in 2.X because I saw a lot of complaints
>> about logback being inconsistent in honoring its settings.
>>
>> On Fri, Jul 7, 2023 at 10:19 PM Joe Witt  wrote:
>>
>>> H.  Interesting.  Can you capture these bits of fun in a jira?
>>>
>>> Thanks
>>>
>>> On Fri, Jul 7, 2023 at 7:17 PM Mike Thomsen 
>>> wrote:
>>>
 After doing some research, it appears that <maxHistory> is a wonky
 setting WRT how well it's honored by logback. I let a GenerateFlowFile >
 LogAttribute flow run for a long time, and it just kept filling up. When I
 added <totalSizeCap> that appeared to force expected behavior on total log
 size. We might want to add the following:

 <cleanHistoryOnStart>true</cleanHistoryOnStart>
 <totalSizeCap>50GB</totalSizeCap>

 On Fri, Jul 7, 2023 at 11:33 AM Michael Moser 
 wrote:

> Hi Mike,
>
> You aren't alone in experiencing this.  I think logback uses a pattern
> matcher on filename to discover files to delete.  If "something" happens
> which causes a gap in the date pattern, then the matcher will fail to
> pick up and delete files on the other side of that gap.
>
> Regards,
> -- Mike M
>
>
> On Thu, Jul 6, 2023 at 10:28 AM Mike Thomsen 
> wrote:
>
>> We are using the stock configuration, and have noticed that we have a
>> lot of nifi-app* logs that are well beyond the historic data cap of 30 
>> days
>> in logback.xml; some of those logs go back to April. We also have a bunch
>> of 0 byte nifi-user logs and some of the other logs are 0 bytes as well. 
>> It
>> looks like logback is rotating based on time, but isn't cleaning up. Is
>> this expected behavior or a problem with the configuration?
>>
>> Thanks,
>>
>> Mike
>>
>


Re: BufferedReader best option to search through large flowfiles?

2023-06-05 Thread Lars Winderling

Hi Jim,

RouteText works in a line-by-line fashion, so that shouldn't exhaust 
memory (unless for /very/ long lines). Other processors such as 
ReplaceText have the option to choose whether you want to stream lines, 
or slurp the whole file at once.


Best,
Lars
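
(For orientation, a sketch of a RouteText configuration along those lines; the dynamic property name and regex are illustrative:)

    Routing Strategy     Route to 'matched' if line matches any condition
    Matching Strategy    Contains Regular Expression
    dt_line              \d{4}-\d{2}-\d{2}   (dynamic property defining the condition)
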

On 23-06-05 14:49, James McMahon wrote:
Thank you very much Mark and Lars. Ideally I do prefer to employ 
standard "out of the box" processors. In this case my requirement is 
to identify bounding dates across all content in the flowfile. As I 
match my DT patterns, I'll add the tokens to a groovy list that I can 
later sort and use to identify the extreme values. (I may actually 
throw out the extremes to ensure I'm not working with an outlier that 
is an error). I know how to make those manipulations in a groovy 
script. I don't know how to accomplish them using standard processors.


Mark, for future reference is there a risk when using RouteText that a 
huge flowfile might exhaust jvm or repo resources? Is there such a 
risk for the ExtractText, ReplaceText, and RouteOnContent processors 
mentioned by Lars?


Jim

On Mon, Jun 5, 2023 at 8:25 AM Mark Payne  wrote:

Jim,

Take a look at RouteText.

Thanks
-Mark


> On Jun 5, 2023, at 8:09 AM, James McMahon 
wrote:
>
> Hello. I have a requirement to scan for multiple regex patterns
in very large flowfiles. Given that my flowfiles can be very
large, I think my best approach is to employ an
ExecuteGroovyScript processor and a script using a BufferedReader
to scan the file one line at a time.
>
> I am concerned that I might exhaust jvm resources trying to
otherwise process large content if I try to handle it all at once.
Is a BufferedReader the right call? Does anyone recommend a better
approach?
>
> Thanks in advance,
> Jim







Re: BufferedReader best option to search through large flowfiles?

2023-06-05 Thread Lars Winderling

Hi James,

in case the NiFi processors such as ExtractText, ReplaceText and 
RouteOnContent (maybe multiple in a row/in parallel) do not match your 
use case, I'd definitely go with a buffered reader and line-wise 
processing. Afaik you can get it as easily as

    new File("/path/to/my/file").eachLine { line -> ... }

Enjoy your day and take care!
Best,
Lars
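
(Expanding on that: a sketch of the same line-wise idea against flowfile content in ExecuteGroovyScript; the date regex and attribute names are made up for illustration:)

    import org.apache.nifi.processor.io.InputStreamCallback
    import java.nio.charset.StandardCharsets

    def ff = session.get()
    if (!ff) return

    def dates = []
    session.read(ff, { inputStream ->
        // only one line is held in memory at a time
        inputStream.eachLine(StandardCharsets.UTF_8.name()) { line ->
            (line =~ /\d{4}-\d{2}-\d{2}/).each { m -> dates << m }
        }
    } as InputStreamCallback)

    if (dates) {
        dates.sort()
        ff = session.putAttribute(ff, 'dt.min', dates.first())
        ff = session.putAttribute(ff, 'dt.max', dates.last())
    }
    session.transfer(ff, REL_SUCCESS)
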

On 23-06-05 14:09, James McMahon wrote:
Hello. I have a requirement to scan for multiple regex patterns in 
very large flowfiles. Given that my flowfiles can be very large, I 
think my best approach is to employ an ExecuteGroovyScript processor 
and a script using a BufferedReader to scan the file one line at a time.


I am concerned that I might exhaust jvm resources trying to otherwise 
process large content if I try to handle it all at once. Is a 
BufferedReader the right call? Does anyone recommend a better approach?


Thanks in advance,
Jim






Re: [EXTERNAL] Re: Processor ID/Processor Name as a default attribute

2023-05-26 Thread Lars Winderling
Hi James, 

you might want to check the official user guide 
https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#data_provenance. In 
case you have specific questions, feel free to reach out to me directly!

Best, Lars

On 26 May 2023 12:20:38 CEST, James  wrote:
>Hi again
>
>It definitely could be a training issue on my part, working with Nifi for
>so long and STILL figuring things out and finding new cool things to try.
>
>My methodology is more about routing all errors (as an example) to a single
>point (processor) so there's 1 log file to monitor, and having an easy to
>trace "source processor" to look for.
>
>BUT, if you do things differently/better and feel like sharing some tips,
>I'd be very appreciative, even if you took this off the mailing list/direct
>mail.
>
>thanks again
>
>On Fri, May 26, 2023 at 11:54 AM Lars Winderling 
>wrote:
>
>> Hi folks, I think you can obtain the same information via provenance. Or
>> am I missing something? It sounds to me like you want to do manual debugging
>> instead of automating it, so provenance should be easily accessible. Enjoy
>> your day and take care! Best, Lars
>>
>>
>> On 26 May 2023 11:21:13 CEST, James  wrote:
>>
>>> Hi
>>>
>>> This would honestly make life SO much easier!
>>>
>>> I've worked around this by adding my own attributes that can get logged
>>> in a central "LogMessage" processor, but even that can prove tricky
>>> sometimes. And this solution would be so much cleaner.
>>>
>>> Just my 2c
>>>
>>> thanks
>>> James
>>>
>>> On Thu, May 25, 2023 at 5:11 PM Chirthani, Deepak Reddy <
>>> c-deepakreddy.chirth...@charter.com> wrote:
>>>
>>>> In a large dataflow with multiple processors, it will help identifying
>>>> which processor routed flowfile to failure relationship.
>>>>
>>>>
>>>>
>>>>
>>>> https://community.cloudera.com/t5/Support-Questions/NiFi-origin-processor-name-or-id-as-attribute/m-p/175408
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *Deepak Reddy* | Data Engineer
>>>> ​IT Centers of Excellence
>>>> 13736 Riverport Dr., Maryland Heights, MO 63043
>>>>
>>>>
>>>>
>>>> *From:* Joe Witt 
>>>> *Sent:* Wednesday, May 24, 2023 9:51 AM
>>>> *To:* users@nifi.apache.org
>>>> *Subject:* [EXTERNAL] Re: Processor ID/Processor Name as a default
>>>> attribute
>>>>
>>>>
>>>>
>>>> *CAUTION:* The e-mail below is from an external source. Please exercise
>>>> caution before opening attachments, clicking links, or following guidance.
>>>>
>>>> Hello
>>>>
>>>>
>>>>
>>>> Can you describe how you would use this information?
>>>>
>>>>
>>>>
>>>>  These kinds of details and more are present in provenance data now.
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>> On Wed, May 24, 2023 at 7:45 AM Chirthani, Deepak Reddy <
>>>> c-deepakreddy.chirth...@charter.com> wrote:
>>>>
>>>> Is there any chance where Processor_ID or Processor_Name can be added as
>>>> a default attribute for each flowfile so that its value should be the
>>>> ID/Name of the most recent processor it was processed by regardless of the
>>>> relationship it sends to?
>>>>
>>>>
>>>>
>>>> https://issues.apache.org/jira/browse/NIFI-4284
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>>
>>>> *Deepak Reddy* | Data Engineer
>>>> ​IT Centers of Excellence
>>>> 13736 Riverport Dr., Maryland Heights, MO 63043
>>>> <https://www.google.com/maps/search/13736+Riverport+Dr.,+Maryland+Heights,+MO+63043?entry=gmail&source=g>
>>>>
>>>>
>>>>
>>>> The contents of this e-mail message and
>>>> any attachments are intended solely for the
>>>> addressee(s) and may contain confidential
>>>> and/or legally privileged information. If you
>>>> are not the intended recipient of this message
>>>> or if this message has been addressed to you
>>>> in error, please immediately alert the sender
>>>> by reply e-mail and then delete this message
>>>> and any attachments. If you are not the
>>>> intended recipient, you are notified that
>>>> any use, dissemination, distribution, copying,
>>>> or storage of this message or any attachment
>>>> is strictly prohibited.
>>>>
>>>


Re: [EXTERNAL] Re: Processor ID/Processor Name as a default attribute

2023-05-26 Thread Lars Winderling
Hi folks, I think you can obtain the same information via provenance. Or am I 
missing something? It sounds to me like you want to do manual debugging instead 
of automating it, so provenance should be easily accessible. Enjoy your day and 
take care! Best, Lars

On 26 May 2023 11:21:13 CEST, James  wrote:
>Hi
>
>This would honestly make life SO much easier!
>
>I've worked around this by adding my own attributes that can get logged in
>a central "LogMessage" processor, but even that can prove tricky sometimes.
>And this solution would be so much cleaner.
>
>Just my 2c
>
>thanks
>James
>
>On Thu, May 25, 2023 at 5:11 PM Chirthani, Deepak Reddy <
>c-deepakreddy.chirth...@charter.com> wrote:
>
>> In a large dataflow with multiple processors, it will help identifying
>> which processor routed flowfile to failure relationship.
>>
>>
>>
>>
>> https://community.cloudera.com/t5/Support-Questions/NiFi-origin-processor-name-or-id-as-attribute/m-p/175408
>>
>>
>>
>>
>>
>>
>>
>>
>> *Deepak Reddy* | Data Engineer
>> ​IT Centers of Excellence
>> 13736 Riverport Dr., Maryland Heights, MO 63043
>>
>>
>>
>> *From:* Joe Witt 
>> *Sent:* Wednesday, May 24, 2023 9:51 AM
>> *To:* users@nifi.apache.org
>> *Subject:* [EXTERNAL] Re: Processor ID/Processor Name as a default
>> attribute
>>
>>
>>
>> *CAUTION:* The e-mail below is from an external source. Please exercise
>> caution before opening attachments, clicking links, or following guidance.
>>
>> Hello
>>
>>
>>
>> Can you describe how you would use this information?
>>
>>
>>
>>  These kinds of details and more are present in provenance data now.
>>
>>
>>
>> Thanks
>>
>>
>>
>> On Wed, May 24, 2023 at 7:45 AM Chirthani, Deepak Reddy <
>> c-deepakreddy.chirth...@charter.com> wrote:
>>
>> Is there any chance where Processor_ID or Processor_Name can be added as a
>> default attribute for each flowfile so that its value should be the ID/Name
>> of the most recent processor it was processed by regardless of the
>> relationship it sends to?
>>
>>
>>
>> https://issues.apache.org/jira/browse/NIFI-4284
>>
>>
>>
>> Thanks
>>
>>
>>
>>
>> *Deepak Reddy* | Data Engineer
>> ​IT Centers of Excellence
>> 13736 Riverport Dr., Maryland Heights, MO 63043
>> 
>>
>>
>>
>> The contents of this e-mail message and
>> any attachments are intended solely for the
>> addressee(s) and may contain confidential
>> and/or legally privileged information. If you
>> are not the intended recipient of this message
>> or if this message has been addressed to you
>> in error, please immediately alert the sender
>> by reply e-mail and then delete this message
>> and any attachments. If you are not the
>> intended recipient, you are notified that
>> any use, dissemination, distribution, copying,
>> or storage of this message or any attachment
>> is strictly prohibited.
>>


Re: Expected mergerecord performance

2022-12-20 Thread Lars Winderling
Hi Richard, 

it's not related, but for the logical type timestamp-millis you should use a 
"long" instead of a "string" (cf 
https://avro.apache.org/docs/1.11.1/specification/#timestamp-millisecond-precision)
 afaik.

Best, Lars
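
(i.e., a field along these lines:)

    { "name": "processing_timestamp", "type": { "type": "long", "logicalType": "timestamp-millis" } }
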

On 21 December 2022 08:29:54 CET, Richard Beare  wrote:
>I have found a way to force the schema to be used, but I've missed
>something in my configuration. When I use a default generic avro writer in
>my jolttransformrecord processor the queue of 259 entries (about 1.8M) is
>processed instantly.
>If I configure my avrowriter to use the schema text property and paste the
>following into the schema text field, the performance is terrible, taking
>tens of minutes to empty the same queue. Both are on the same 25ms run
>duration. I notice the slowness even with a "run once" of that processor. I did not
>notice the same behavior on my laptop. Is there likely to be some sort of
>querying remote sites going on behind the scenes that my server is failing
>to access due to firewalls etc? It seems really strange to me that it
>should be so slow with such tiny files, and the only commonality I can find
>is the custom schema. Is there something odd about it?
>
>{
>  "type": "record",
>  "namespace": "cogstack",
>  "name": "document",
>  "fields":
>  [
>{ "name": "doc_id", "type": "string" },
>{ "name": "doc_text", "type": "string", "default": "" },
>{ "name": "processing_timestamp", "type": { "type" : "string",
>"logicalType" : "timestamp-millis" } },
>{ "name": "metadata_x_ocr_applied", "type": "boolean" },
>{ "name": "metadata_x_parsed_by", "type": "string" },
>{ "name": "metadata_content_type", "type": ["null", "string"],
>"default": null },
>{ "name": "metadata_page_count", "type": ["null", "int"], "default":
>null },
>{ "name": "metadata_creation_date", "type": ["null", { "type" :
>"string", "logicalType" : "timestamp-millis" }], "default": null },
>{ "name": "metadata_last_modified", "type": ["null", { "type" :
>"string", "logicalType" : "timestamp-millis" }], "default": null }
>  ]
>}
>
>On Wed, Dec 21, 2022 at 2:05 PM Richard Beare 
>wrote:
>
>> I've made progress with Jolt and I think I'm close to achieving what I'm
>> after. I am missing one conceptual step, I think.
>>
>> I rearrange my json so that it conforms to the desired structure and I can
>> then write the results as avro. However, that is generic avro. How do I
>> ensure that I conform to the schema that has been defined for that part of
>> the flow?
>>
>> i.e. The part of the flow I'm replacing with jolt was a groovy script that
>> created a flowfile according to a schema. That schema is below. Is there a
>> way to utilise this definition in the jolttransformrecord processor, either
>> via specification of data types in the transform definition or by telling
>> the avro writer to use that specification. Or am I overthinking things here?
>> Thanks
>>
>> {
>>   "type": "record",
>>   "name": "document",
>>   "fields":
>>   [
>> { "name": "doc_id", "type": "string" },
>> { "name": "doc_text", "type": "string", "default": "" },
>> { "name": "processing_timestamp", "type": { "type" : "string",
>> "logicalType" : "timestamp-millis" } },
>> { "name": "metadata_x_ocr_applied", "type": "boolean" },
>> { "name": "metadata_x_parsed_by", "type": "string" },
>> { "name": "metadata_content_type", "type": ["null", "string"],
>> "default": null },
>> { "name": "metadata_page_count", "type": ["null", "int"], "default":
>> null },
>> { "name": "metadata_creation_date", "type": ["null", { "type" :
>> "string", "logicalType" : "timestamp-millis" }], "default": null },
>> { "name": "metadata_last_modified", "type": ["null", { "type" :
>> "string", "logicalType" : "timestamp-millis" }], "default": null }
>>   ]
>> }
>>
>>
>>
>> On Wed, Dec 21, 2022 at 9:06 AM Richard Beare 
>> wrote:
>>
>>> Thanks - I'll have a look at that. It is helpful to get guidance like
>>> this when the system is so large.
>>>
>>> On Wed, Dec 21, 2022 at 5:30 AM Matt Burgess 
>>> wrote:
>>>
 Thanks Vijay! I agree those processors should do the trick but there
 were things in the transformation between input and desired output
 whose origin I wasn't sure of. If you are setting constants you
 can use either a Shift or Default spec, if you are moving fields
 around you can use a Shift spec, and in general whether you end up
 with one spec or multiple, I find it's easiest to use a Chain spec (an
 array of specs) in the processor configuration. You can play around
 with the spec(s) at the Jolt playground [1]

 An important difference to note between JoltTransformJSON and
 JoltTransformRecord is that for the former, the spec is applied to the
 entire input (and it is entirely read into memory) where in
 JoltTransformRecord the spec is applied to each record.

 Regards,
 Matt

 [1] http://jolt-demo.appspot.com/#inception
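
(For illustration, a minimal chain spec of the kind described; the field names are made up, not from Richard's flow:)

    [
      { "operation": "shift",   "spec": { "id": "doc_id", "text": "doc_text" } },
      { "operation": "default", "spec": { "doc_text": "" } }
    ]
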

 On Tue,

Re: Trouble configuring logging

2022-09-28 Thread Lars Winderling
Hi Dylan,

although it's not related to your question, I would like to point out that your 
docker image (unless squashed to a single layer) will still include the layers 
containing files you later delete, afaik. If you put all your commands into a 
single RUN directive, you might save some space.

Best, Lars
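
(A sketch of the idea against the Dockerfile below; same commands, one layer:)

    RUN unzip nifi.zip && unzip nifi-toolkit.zip \
     && mv nifi-1.17.0 nifi && mv nifi-toolkit-1.17.0 nifi-toolkit \
     && rm --force nifi.zip nifi-toolkit.zip /opt/nifi/bin/*.bat /opt/nifi/conf/* /opt/nifi-toolkit/*.bat
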

On 28 September 2022 16:13:48 CEST, Dylan Klomparens 
 wrote:
>​Kevin, thank you for taking a look. I am using a custom Dockerfile (and 
>docker-compose.yaml) I wrote, pasted at the end of this email.
>
>I also ran NiFi natively, without Docker, on my desktop computer and observed 
>identical results. I initially suspected Docker could somehow be affecting 
>stdout somehow, but in testing I've not found any evidence of that.
>
>
>Dockerfile
>FROM openjdk:11
>
># These are environment variables that are not meant to be
># configurable, they are essential to the operation of the container.
>ENV NIFI_HOME /opt/nifi/
>ENV NIFI_LOG_DIR /persistent-storage/logs
>ENV NIFI_OVERRIDE_NIFIENV true
>
>WORKDIR /opt
>
># Unpack NiFi and Toolkit
>COPY nifi.zip /opt
>COPY nifi-toolkit.zip /opt
>RUN unzip nifi.zip
>RUN unzip nifi-toolkit.zip
>RUN mv nifi-1.17.0 nifi
>RUN mv nifi-toolkit-1.17.0 nifi-toolkit
>
># Clean out unused files
>RUN rm --force nifi.zip nifi-toolkit.zip
>RUN rm --force /opt/nifi/bin/*.bat
>RUN rm --force /opt/nifi/conf/*
>RUN rm --force /opt/nifi-toolkit/*.bat
>
>RUN mkdir /persistent-storage
>
>WORKDIR /opt/nifi
>
># Set configuration
>COPY bootstrap.conf bootstrap-notification-services.xml authorizers.xml 
>state-management.xml logback.xml /opt/nifi/conf/
>COPY initialize.sh /opt/nifi/bin/
>
>COPY postgres-jdbc-driver.jar /opt
>COPY snowflake-jdbc-driver.jar /opt
>COPY ZIP_codes_to_states.csv /opt
>
>CMD ["/opt/nifi/bin/nifi.sh", "run"]
>EXPOSE 8000/tcp
>
>
>docker-compose.yaml
>services:
>  nifi:
>container_name: nifi
>build: .
>image: nifi
>volumes: [/persistent-storage/:/persistent-storage/]
>ports: [8000:8000]
>environment: [INITIAL_ADMIN_IDENTITY:'*redacted*']
>logging:
>driver: "awslogs"
>options:
>awslogs-region: "*redacted*"
>awslogs-group: "*redacted*"
>awslogs-stream: "All logs"
>
>
>
>
>
>From: Kevin Doran 
>Sent: Wednesday, September 28, 2022 9:48 AM
>To: users@nifi.apache.org 
>Subject: Re: Trouble configuring logging
>
>
>[EXTERNAL]
>
>Dylan - I looked into this and am as yet unable to offer an explanation. Perhaps 
>others that are familiar with how org.apache.nifi.StdOut works can shed some 
>light, or else I will keep digging when I have a block of time. To help in my 
>understanding: Which Docker image are you using? Is it the apache/nifi image or 
>a custom one, and if custom, can you share the Dockerfile?
>
>Thanks,
>Kevin
>
>On Sep 27, 2022 at 10:21:12, Dylan Klomparens 
>mailto:dklompar...@foodallergy.org>> wrote:
>I am attempting to configure logging for NiFi. I have NiFi running in a Docker 
>container, which sends all console logs to AWS CloudWatch. Therefore, I am 
>configuring NiFi to send all logs to the console.
>
>The problem is, for some reason all log messages are coming from the 
>org.apache.nifi.StdOut logger. I cannot figure out why, since I would like 
>messages to be printed directly from the logger that is receiving them.
>
>It seems like messages are "passing through" loggers, which are ultimately 
>printed out from the org.apache.nifi.StdOut logger. Here is an example of one 
>log message:
>2022-09-27 10:08:01,849 INFO [NiFi logging handler] org.apache.nifi.StdOut 
>2022-09-27 10:08:01,848 INFO [pool-6-thread-1] 
>o.a.n.c.r.WriteAheadFlowFileRepository Initiating checkpoint of FlowFile 
>Repository
>
>Why would every single log message come from the StdOut logger? And how can I 
>have logs delivered from the logger they're supposedly originally coming from?
>
>My logback.xml configuration is below for reference.
>
>[The logback.xml was garbled in the archive. The recoverable fragments show a
>contextListener of class ch.qos.logback.classic.jul.LevelChangePropagator, an
>appender whose PatternLayoutEncoder uses the pattern
>"%date %level [%thread] %logger{40} %msg%n", and a long list of logger level
>overrides (org.apache.nifi.controller.repository.StandardProcessSession at WARN,
>org.apache.curator.framework.recipes.leader.LeaderSelector at OFF, and various
>others at INFO/ERROR), followed by a root logger.]
>


Re: ExecuteStreamCommand fails to extract archived files

2022-09-22 Thread Lars Winderling
Hi Jim,

have you checked whether NiFi can actually access the given paths (/bin/7za and 
/mnt/in)? You might use a script processor just to check both paths exist. If 
that all works, you might want to dump the processor config as json by either 
downloading the flow definition or manually fiddling with flow.json.gz (or 
flow.xml.gz). Maybe some non-printable characters, bad quotes or whatever show 
up. If the jvm tells you the given path does not exist, the jvm is probably 
right. So the path itself does not exist or is mistyped. Just my reasoning.

Good luck, Lars

On 22 September 2022 12:01:20 CEST, James McMahon  wrote:
>This continues to fail. I'm wondering if anyone has experience resolving
>this problem below?
>
>It appears that ExecuteStreamCommand cannot seem to find what is a shell
>script pointing to an executable, despite my having fully qualified its
>path (/bin/7za):
>
>ExecuteStreamCommand[id=...] Failed to process session due to
>java.io.IOException: Cannot run program "/bin/7za" (in directory /mnt/in")
>error=2, No such file or directory: java.io.IOException:error=2, No such
>file or directory
>causes: java.io.IOException: Cannot run program "/bin/7za" (in directory
>"/mnt/in")
>
>Here is my updated configuration of my ESC processor:
>Command Arguments Strategy     Command Arguments Property
>Command Arguments              e;-si;${filename}
>Command Path                   /bin/7za
>Ignore STDIN                   false
>Working Directory              /mnt/in  (this dir is owned by user nifi, which is
>                               the user the nifi jvm runs as)
>Argument Delimiter             ;
>Output Destination Attribute   No value set
>Max Attribute Length           256
>
>I've not found a great deal of info through Google that details how to fix
>this. There are a few articles that talk about ensuring the path includes
>the directory where the binary or script lives, but it's not clear what
>they mean. Do they mean where JAVA_HOME gets set in nifi.sh?
>
>I tried also dropping my 7za script directly into the bin dir where nifi.sh
>exists. My reasoning was that this would surely be a place in nifi's path.
>But this test did not work.
>
>Jim
>
>On Thu, Sep 22, 2022 at 4:15 AM stephen.hindmarch.bt.com via users <
>users@nifi.apache.org> wrote:
>
>> A quick read of the 7-zip command line guide suggests you need the option
>> “-si” to read from stdin.
>>
>>
>>
>> *Steve Hindmarch*
>>
>>
>>
>> *From:* James McMahon 
>> *Sent:* 21 September 2022 17:01
>> *To:* users@nifi.apache.org
>> *Subject:* Re: ExecuteStreamCommand fails to extract archived files
>>
>>
>>
>> Now I see. So in the configuration of the ExecuteStreamCommand there was
>> one thing we had not touched on, and it allowed this to work: Ignore STDIN
>> must be set to true. I did that, I set my Command Arguments to
>> e;/mnt/in/${filename} , and it worked as expected.
>>
>> Thanks again Mike.
>>
>> Jim
>>
>>
>>
>> On Wed, Sep 21, 2022 at 11:50 AM Mike Thomsen 
>> wrote:
>>
>> > ExecuteStreamCommand works on the contents of the incoming flowfile, is
>> that understanding correct?
>>
>> 7za can't read the file from stdin. That's the problem AFAICT in your
>> scenario.
>>
>> On Wed, Sep 21, 2022 at 11:26 AM James McMahon 
>> wrote:
>> >
>> > Thank you Mike. May I ask a few follow-up Qs after trying this and
>> failing still?
>> >
>> > ExecuteStreamCommand works on the contents of the incoming flowfile, is
>> that understanding correct? If so, then why does it matter where the file
>> sits on the filesystem if it will apply /bin/7za to the flowfile in the
>> stream?
>> >
>> > So I have /bin/7za in the /bin directory, it's an executable program,
>> and the user that the nif jvm is running as - user named nifi - has /bin in
>> its path.
>> >
>> > I have an archive file I created in directory /mnt/in, and it is named
>> testArchive.7z. I am successfully able to read that archive file in with a
>> ListFile / FetchFile, and do get it in my stream. These are its attributes:
>> > absolute.path   /mnt/in/
>> > filename   testArchive.7z
>> >
>> > Is this java io exception telling us that it can't find the /bin/7za
>> program, or it can't find the data itself? And if ExecuteStreamCommand is
>> supposed to be applying that command to the flowfile in the stream, why is
>> it important that the archive file exists on disk where
>> ExecuteStreamCommand can find it?
>> >
>> > On Wed, Sep 21, 2022 at 11:07 AM Mike Thomsen 
>> wrote:
>> >>
>> >> To do this, you need to do UpdateAttribute (to set the temp folder
>> >> location) -> PutFile -> ExecuteStreamCommand to ensure the flowfile's
>> >> contents are staged where 7za can find them.
>> >>
>> >> I think the appropriate parameter would be something like this:
>> >>
>> >> Command Arguments: e;${path}/${filename}
>> >>
>> >> Assuming ";" is the argument delimiter.
>> >>
>> >> On Wed, Sep 21, 2022 at 10:45 AM James McMahon 
>> wrote:
>> >> >
>> >> > Hello. I have a program /bin/7za that I need to apply to flowfiles
>> that were created by 7za. One of them is testArchive.7z.
>> >> >
>> >> > I

Re: Trigger content eviction manually?

2022-09-13 Thread Lars Winderling

Dear brothers (and sisters), NiFi did it!

With a dummy flow writing both small and large files to a queue, I 
completely jammed the content repo's disk.
Then I deleted all flowfiles again. On my test server, it reliably takes 
2 min to evict the content claims from disk. I repeated this a few times. 
First, NiFi tries to clean up, but cannot archive anything because it is out 
of disk space. Then it archives some; on the next run, it deletes all. With 
the old setting of 10 MB, it simply starved, at least until the next restart.


This is truly amazing. I mean, I totally love NiFi (my complete team 
does), and I check release notes and stuff – but missed out on that 
small detail. Really, amazing job (kudos to all NiFi devs). And thanks 
Joe (and Pat) for both your technical and emotional support :-)


Enjoy your day, see you on the list and may the flow be with you. Always.

Best, Lars
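
(In nifi.properties terms, the one-line change that fixed it, per Joe's suggestion earlier in the thread:)

    # before (starved until restart):
    nifi.content.claim.max.appendable.size=10 MB
    # after (matches the newer default Joe quoted):
    nifi.content.claim.max.appendable.size=50 KB
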

On 22-09-13 20:17, Joe Witt wrote:

Hahah Pat!

Lars

Ok great - very highly probably this is the issue.  Go with 50KB and
lets see what unfolds.

Thanks

On Tue, Sep 13, 2022 at 1:16 PM Patrick Timmins  wrote:

Ha! ... too funny!

You are a good father and NiFi brother

... God Bless!

Pat

On 9/13/2022 1:08 PM, Lars Winderling wrote:

…and guess what I did :-) the joys of remote working. just put my kids
to bed, and here you are!

# Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository

nifi.content.claim.max.appendable.size=10 MB
nifi.content.claim.max.flow.files=100
nifi.content.repository.directory.default=/srv/nifi-content/data/content-repository

nifi.content.repository.archive.max.retention.period=12 hours
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.archive.enabled=true
nifi.content.repository.always.sync=false
nifi.content.viewer.url=../nifi-content-viewer/

So we even use 10MB…
will check if lowering the value changes anything

On 22-09-13 20:04, Patrick Timmins wrote:

No, I agree.  Lars, please give up the rest of your evening and drive
back to work and report back with your findings ASAP.  It may be past
normal working hours in Germany, but you have NiFi brothers and
sisters around the world that are counting on you ... please don't
let us down.

:)  <- international smiley/joking symbol


On 9/13/2022 10:15 AM, Joe Witt wrote:

read that again and hopefully it was obvious I was joking.   But I am
looking forward to hearing what you learn.

Thanks

On Tue, Sep 13, 2022 at 10:10 AM Joe Witt  wrote:

Lars

I need you to drive back to work because now I am very vested in
the outcome :)

But yeah this was an annoying problem we saw hit some folks. Changing
that value after fixing the behavior was the answer.  I owe the
community a blog on this

Thanks

On Tue, Sep 13, 2022 at 9:57 AM Lars Winderling
 wrote:

Sorry, misread the jira. We're still on the old default value.
Thank you for being persistent about it. I will try it tomorrow
with the lower value and get back to you. Not at work atm, so I
can't paste the config values in detail.

On 13 September 2022 16:45:30 CEST, Joe Witt 
wrote:

Lars

You should not have to update to 1.17.  While I'm always fond of
people being on the latest, the issue I mentioned is fixed in
1.16.3.

HOWEVER, please do confirm your values.  The one I'd really focus
you on is
nifi.content.claim.max.appendable.size=50 KB

Our default before was like 1MB and what we'd see is we'd hang on to
large content way longer than we intended because some queue had one
tiny object in it.  So that value became really important.

If you're on 1MB change to 50KB and see what happens.

Thanks

On Tue, Sep 13, 2022 at 9:40 AM Lars Winderling
 wrote:

   I guess the issue you linked is related. I have seen similar
messages in the log occasionally, but didn't directly connect
it. Our config is pretty similar to the defaults; none of it
should directly cause the issue. Will give 1.17.0 a try and come
back if the issue persists. Your help is really appreciated,
thanks!

   On 13 September 2022 16:33:53 CEST, Joe Witt
 wrote:

   Lars

   The issue that came to mind is
   https://issues.apache.org/jira/browse/NIFI-10023 but that is
fixed in
   1.16.2 and 1.17.0 so that is why I asked.

   What is in your nifi.properties for
   # Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository

   nifi.content.claim.max.appendable.size=50 KB
nifi.content.repository.directory.default=./content_repository
nifi.content.repository.archive.max.retention.period=7 days
nifi.content.repository.archive.max.usage.percentage=50%
   nifi.content.repository.archive.enabled=true
   nifi.content.repository.always.sync=false

   Thanks

   On Tue, Sep 13, 2022 at 7:04 AM Lars Winderling
wrote:


I'm using 1.16.3 from upstream (no custom build) on java 11
temurin, debian 10, virtualized, no docker setup.

On 13 Sept

Re: Trigger content eviction manually?

2022-09-13 Thread Lars Winderling
…and guess what I did :-) the joys of remote working. just put my kids 
to bed, and here you are!


# Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.claim.max.appendable.size=10 MB
nifi.content.claim.max.flow.files=100
nifi.content.repository.directory.default=/srv/nifi-content/data/content-repository
nifi.content.repository.archive.max.retention.period=12 hours
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.archive.enabled=true
nifi.content.repository.always.sync=false
nifi.content.viewer.url=../nifi-content-viewer/

So we even use 10MB…
will check if lowering the value changes anything

On 22-09-13 20:04, Patrick Timmins wrote:
No, I agree.  Lars, please give up the rest of your evening and drive 
back to work and report back with your findings ASAP.  It may be past 
normal working hours in Germany, but you have NiFi brothers and 
sisters around the world that are counting on you ... please don't let 
us down.


:)  <- international smiley/joking symbol


On 9/13/2022 10:15 AM, Joe Witt wrote:

read that again and hopefully it was obvious I was joking.   But I am
looking forward to hearing what you learn.

Thanks

On Tue, Sep 13, 2022 at 10:10 AM Joe Witt  wrote:

Lars

I need you to drive back to work because now I am very vested in the 
outcome :)


But yeah this was an annoying problem we saw hit some folks. Changing
that value after fixing the behavior was the answer.  I owe the
community a blog on this

Thanks

On Tue, Sep 13, 2022 at 9:57 AM Lars Winderling
 wrote:
Sorry, misread the jira. We're still on the old default value. 
Thank you for being persistent about it. I will try it tomorrow 
with the lower value and get back to you. Not at work atm, so I 
can't paste the config values in detail.


On 13 September 2022 16:45:30 CEST, Joe Witt  
wrote:

Lars

You should not have to update to 1.17.  While I'm always fond of
people being on the latest, the issue I mentioned is fixed in 1.16.3.

HOWEVER, please do confirm your values.  The one I'd really focus 
you on is

nifi.content.claim.max.appendable.size=50 KB

Our default before was like 1MB and what we'd see is we'd hang on to
large content way longer than we intended because some queue had one
tiny object in it.  So that value became really important.

If you're on 1MB change to 50KB and see what happens.

Thanks

On Tue, Sep 13, 2022 at 9:40 AM Lars Winderling
 wrote:


  I guess the issue you linked is related. I have seen similar 
messages in the log occasionally, but didn't directly connect it. 
Our config is pretty similar to the defaults; none of it should 
directly cause the issue. Will give 1.17.0 a try and come back if 
the issue persists. Your help is really appreciated, thanks!


  On 13 September 2022 16:33:53 CEST, Joe Witt 
 wrote:


  Lars

  The issue that came to mind is
  https://issues.apache.org/jira/browse/NIFI-10023 but that is 
fixed in

  1.16.2 and 1.17.0 so that is why I asked.

  What is in your nifi.properties for
  # Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
  nifi.content.claim.max.appendable.size=50 KB
nifi.content.repository.directory.default=./content_repository
nifi.content.repository.archive.max.retention.period=7 days
nifi.content.repository.archive.max.usage.percentage=50%
  nifi.content.repository.archive.enabled=true
  nifi.content.repository.always.sync=false

  Thanks

  On Tue, Sep 13, 2022 at 7:04 AM Lars Winderling
   wrote:



   I'm using 1.16.3 from upstream (no custom build) on java 11 
temurin, debian 10, virtualized, no docker setup.


   On 13 September 2022 13:37:15 CEST, Joe Witt 
 wrote:



   Lars

   What version are you using?

   Thanks

   On Tue, Sep 13, 2022 at 3:11 AM Lars Winderling 
 wrote:



   Dear community,

   sometimes our content repository grows out of bounds. 
Since it has been separated on disk from the rest of NiFi, we 
can still use the NiFi UI and empty the respective queues. 
However, the disk remains jammed. Sometimes, it gets cleaned 
up after a few minutes, but most of the time we need to 
restart NiFi manually for the cleanup to happen.
   So. is there any way of triggering the content eviction 
manually without restarting NiFi?
   Btw. the respective files on disk are not archived in the 
content repository (thus not below */archive/*).


   Thanks in advance for your support!
   Best,
   Lars






Re: Trigger content eviction manually?

2022-09-13 Thread Lars Winderling
Sorry, misread the jira. We're still on the old default value. Thank you for 
being persistent about it. I will try it tomorrow with the lower value and get 
back to you. Not at work atm, so I can't paste the config values in detail.

On 13 September 2022 16:45:30 CEST, Joe Witt  wrote:
>Lars
>
>You should not have to update to 1.17.  While I'm always fond of
>people being on the latest, the issue I mentioned is fixed in 1.16.3.
>
>HOWEVER, please do confirm your values.  The one I'd really focus you on is
>nifi.content.claim.max.appendable.size=50 KB
>
>Our default before was like 1MB and what we'd see is we'd hang on to
>large content way longer than we intended because some queue had one
>tiny object in it.  So that value became really important.
>
>If you're on 1MB change to 50KB and see what happens.
>
>Thanks
>
>On Tue, Sep 13, 2022 at 9:40 AM Lars Winderling
> wrote:
>>
>> I guess the issue you linked is related. I have seen similar messages in 
>> the log occasionally, but didn't directly connect it. Our config is pretty 
>> similar to the defaults; none of it should directly cause the issue. Will 
>> give 1.17.0 a try and come back if the issue persists. Your help is really 
>> appreciated, thanks!
>>
>> On 13 September 2022 16:33:53 CEST, Joe Witt  wrote:
>>>
>>> Lars
>>>
>>> The issue that came to mind is
>>> https://issues.apache.org/jira/browse/NIFI-10023 but that is fixed in
>>> 1.16.2 and 1.17.0 so that is why I asked.
>>>
>>> What is in your nifi.properties for
>>> # Content Repository
>>> nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
>>> nifi.content.claim.max.appendable.size=50 KB
>>> nifi.content.repository.directory.default=./content_repository
>>> nifi.content.repository.archive.max.retention.period=7 days
>>> nifi.content.repository.archive.max.usage.percentage=50%
>>> nifi.content.repository.archive.enabled=true
>>> nifi.content.repository.always.sync=false
>>>
>>> Thanks
>>>
>>> On Tue, Sep 13, 2022 at 7:04 AM Lars Winderling
>>>  wrote:
>>>>
>>>>
>>>>  I'm using 1.16.3 from upstream (no custom build) on java 11 temurin, 
>>>> debian 10, virtualized, no docker setup.
>>>>
>>>>  On 13 September 2022 13:37:15 CEST, Joe Witt  wrote:
>>>>>
>>>>>
>>>>>  Lars
>>>>>
>>>>>  What version are you using?
>>>>>
>>>>>  Thanks
>>>>>
>>>>>  On Tue, Sep 13, 2022 at 3:11 AM Lars Winderling 
>>>>>  wrote:
>>>>>>
>>>>>>
>>>>>>  Dear community,
>>>>>>
>>>>>>  sometimes our content repository grows out of bounds. Since it has been 
>>>>>> separated on disk from the rest of NiFi, we can still use the NiFi UI 
>>>>>> and empty the respective queues. However, the disk remains jammed. 
>>>>>> Sometimes, it gets cleaned up after a few minutes, but most of the time 
>>>>>> we need to restart NiFi manually for the cleanup to happen.
>>>>>>  So. is there any way of triggering the content eviction manually 
>>>>>> without restarting NiFi?
>>>>>>  Btw. the respective files on disk are not archived in the content 
>>>>>> repository (thus not below */archive/*).
>>>>>>
>>>>>>  Thanks in advance for your support!
>>>>>>  Best,
>>>>>>  Lars


Re: Trigger content eviction manually?

2022-09-13 Thread Lars Winderling
I guess the issue you linked is related. I have seen similar messages in the 
log occasionally, but didn't directly connect it. Our config is pretty similar 
to the defaults; none of it should directly cause the issue. Will give 1.17.0 a 
try and come back if the issue persists. Your help is really appreciated, 
thanks!

On 13 September 2022 16:33:53 CEST, Joe Witt  wrote:
>Lars
>
>The issue that came to mind is
>https://issues.apache.org/jira/browse/NIFI-10023 but that is fixed in
>1.16.2 and 1.17.0 so that is why I asked.
>
>What is in your nifi.properties for
># Content Repository
>nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
>nifi.content.claim.max.appendable.size=50 KB
>nifi.content.repository.directory.default=./content_repository
>nifi.content.repository.archive.max.retention.period=7 days
>nifi.content.repository.archive.max.usage.percentage=50%
>nifi.content.repository.archive.enabled=true
>nifi.content.repository.always.sync=false
>
>Thanks
>
>On Tue, Sep 13, 2022 at 7:04 AM Lars Winderling
> wrote:
>>
>> I'm using 1.16.3 from upstream (no custom build) on java 11 temurin, debian 
>> 10, virtualized, no docker setup.
>>
>> On 13 September 2022 13:37:15 CEST, Joe Witt  wrote:
>>>
>>> Lars
>>>
>>> What version are you using?
>>>
>>> Thanks
>>>
>>> On Tue, Sep 13, 2022 at 3:11 AM Lars Winderling  
>>> wrote:
>>>>
>>>> Dear community,
>>>>
>>>> sometimes our content repository grows out of bounds. Since it has been 
>>>> separated on disk from the rest of NiFi, we can still use the NiFi UI and 
>>>> empty the respective queues. However, the disk remains jammed. Sometimes, 
>>>> it gets cleaned up after a few minutes, but most of the time we need to 
>>>> restart NiFi manually for the cleanup to happen.
>>>> So. is there any way of triggering the content eviction manually without 
>>>> restarting NiFi?
>>>> Btw. the respective files on disk are not archived in the content 
>>>> repository (thus not below */archive/*).
>>>>
>>>> Thanks in advance for your support!
>>>> Best,
>>>> Lars


Re: Trigger content eviction manually?

2022-09-13 Thread Lars Winderling
I'm using 1.16.3 from upstream (no custom build) on java 11 temurin, debian 10, 
virtualized, no docker setup.

On 13 September 2022 13:37:15 CEST, Joe Witt  wrote:
>Lars
>
>What version are you using?
>
>Thanks
>
>On Tue, Sep 13, 2022 at 3:11 AM Lars Winderling 
>wrote:
>
>> Dear community,
>>
>> sometimes our content repository grows out of bounds. Since it has been
>> separated on disk from the rest of NiFi, we can still use the NiFi UI and
>> empty the respective queues. However, the disk remains jammed. Sometimes,
>> it gets cleaned up after a few mintes, but most of the time we need to
>> restart NiFi manually, for the cleanup to happen.
>> So. is there any way of triggering the content eviction manually without
>> restarting NiFi?
>> Btw. the respective files on disk are not archived in the content
>> repository (thus not below */archive/*).
>>
>> Thanks in advance for your support!
>> Best,
>> Lars
>>
>>


Trigger content eviction manually?

2022-09-13 Thread Lars Winderling

Dear community,

sometimes our content repository grows out of bounds. Since it has been 
separated on disk from the rest of NiFi, we can still use the NiFi UI 
and empty the respective queues. However, the disk remains jammed. 
Sometimes, it gets cleaned up after a few minutes, but most of the time 
we need to restart NiFi manually for the cleanup to happen.
So, is there any way of triggering the content eviction manually without 
restarting NiFi?
Btw. the respective files on disk are not archived in the content 
repository (thus not below */archive/*).


Thanks in advance for your support!
Best,
Lars





Re: Duplicate UUID after clone events in NiFi 1.16.3

2022-09-01 Thread Lars Winderling
I have just found https://issues.apache.org/jira/browse/NIFI-10203, it's 
a known issue resolved in 1.17.0. Sorry for spamming the list.


On 22-09-01 13:41, Lars Winderling wrote:

Dear community,

we have recently upgraded from 1.14.0 to 1.16.3. And the automatic 
UUID generation on CLONE events seems to have changed. When sending a 
flowfile from a source (processor or port) using the same relationship 
to multiple sinks, the cloned flowfiles all have the same UUID. Before 
the upgrade, the flowfiles all received individual flowfiles on CLONE.
To be precise: one flowfile did retain the old UUID, the clones got 
new ones. Now all share the same.


I have created a sample flow consisting of GenerateFlowFile and 
ExecuteGroovyScript. The executed script reads as follows:

-

def ff = session.get()
if (!ff) return

// ERROR level makes the uuid show up as a processor bulletin
log.error("\n{}", [ff.getAttribute("uuid")] as Object[])

session.transfer(ff, REL_SUCCESS)

-

You can find an image and the flow definition at 
https://cloud.winderling.net/s/tzqJBqJzF4fawZN. Please excuse the 
slow initial loading time; it's a low-cost VM.


So my question is: is that a bug or a feature? Since duplicate UUIDs 
lead to many issues, I expect it's the former.


Thank you in advance!
Best,
Lars

---

OS: debian 10
NiFi: 1.16.3
java: temurin 11
openjdk version "11.0.16.1" 2022-08-12
OpenJDK Runtime Environment Temurin-11.0.16.1+1 (build 11.0.16.1+1)
OpenJDK 64-Bit Server VM Temurin-11.0.16.1+1 (build 11.0.16.1+1, mixed 
mode)









Duplicate UUID after clone events in NiFi 1.16.3

2022-09-01 Thread Lars Winderling

Dear community,

we have recently upgraded from 1.14.0 to 1.16.3. And the automatic UUID 
generation on CLONE events seems to have changed. When sending a 
flowfile from a source (processor or port) using the same relationship 
to multiple sinks, the cloned flowfiles all have the same UUID. Before 
the upgrade, the flowfiles all received individual flowfiles on CLONE.
To be precise: one flowfile did retain the old UUID, the clones got new 
ones. Now all share the same.


I have created a sample flow consisting of GenerateFlowFile and 
ExecuteGroovyScript. The executed script reads as follows:

-

def ff = session.get()
if (!ff) return

// ERROR level makes the uuid show up as a processor bulletin
log.error("\n{}", [ff.getAttribute("uuid")] as Object[])

session.transfer(ff, REL_SUCCESS)

-

You can find an image and the flow definition at 
https://cloud.winderling.net/s/tzqJBqJzF4fawZN. Please excuse the slow 
initial loading time; it's a low-cost VM.


So my question is: is that a bug or a feature? Since duplicate UUIDs 
lead to many issues, I expect it's the former.


Thank you in advance!
Best,
Lars

---

OS: debian 10
NiFi: 1.16.3
java: temurin 11
openjdk version "11.0.16.1" 2022-08-12
OpenJDK Runtime Environment Temurin-11.0.16.1+1 (build 11.0.16.1+1)
OpenJDK 64-Bit Server VM Temurin-11.0.16.1+1 (build 11.0.16.1+1, mixed mode)






Re: Requesting Apache NiFi Help/Guidelines

2022-06-20 Thread Lars Winderling
Hi Ben,

I think your regex should read krb5cc_.*, thus including the dot.
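
In a regular expression, the asterisk repeats the preceding token, so krb5cc_* 
on its own would only match "krb5cc" followed by underscores. A quick Groovy 
check of the difference (the file name is a made-up example):

// ==~ is Groovy's full-match operator: the pattern must cover the whole name
assert !('krb5cc_1000' ==~ /krb5cc_*/)  // no match: the digits are not covered
assert 'krb5cc_1000' ==~ /krb5cc_.*/    // '.' matches any character, '*' repeats it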

Best, Lars

On 20 June 2022 20:02:42 CEST, "Ben .T.George"  wrote:
>Hello,
>
>I have moved the files under /home/ansible, which is the home directory for the
>user ansible and the user I used to connect to the remote server; I also changed
>permissions.
>[image: image.png]
>
>And my filter is like this; I am not sure whether it is correct:
>[image: image.png]
>
>when I check the processor, it says files were found, 0 matching the filter
>
>[image: image.png]
>
>
>On Mon, Jun 20, 2022 at 8:44 PM Paul Grey  wrote:
>
>> Would it make a difference if you chmod the files from 0600 to 0660?  Is
>> "ansible" (from ListSFTP properties) a member of "Domain Users"?
>>
>> On Mon, Jun 20, 2022 at 1:37 PM Ben .T.George 
>> wrote:
>>
>>> Hello,
>>>
>>> I have checked the time on both servers; it's synced through NTP
>>>
>>> [image: image.png]
>>>
>>> Also, I was able to see the timestamps on the remote server from the sftp prompt
>>> (below are the files I was trying to get for testing)
>>> [image: image.png]
>>>
>>>
>>> On Mon, Jun 20, 2022 at 8:21 PM Mark Payne  wrote:
>>>
 Ben,

 If you run sftp from the command line, what do you see for the timestamps
 on those files?

 I am wondering if either there’s a big discrepancy between the time on
 the SFTP server and the time on the server where NiFi is running; or if the
 SFTP server is set up in such a way that it does not update the Last
 Modified Time for files.

 Thanks
 -Mark


 On Jun 20, 2022, at 12:54 PM, Ben .T.George 
 wrote:

 HI,

 I have not set any values to Min/Max age and size properties as I was
 not aware of it.

 

 What should I set for this?


 On Mon, Jun 20, 2022 at 7:42 PM Mark Payne  wrote:

> Ben,
>
> So in the message there, you can see that it found 11 files in the /tmp
> directory, but none of those files matched the filter. So you’ll get no
> output.
> What do you have set for the Minimum/Maximum File Age and for the
> Minimum/Maximum File Size properties? Those are used to filter the 
> results.
>
> Thanks
> -Mark
>
> On Jun 20, 2022, at 12:35 PM, Ben .T.George 
> wrote:
>
> HI
>
> Thanks for response,
>
> I am still very new to it and not sure how to explain; I can attach
> some screenshots
>
>
>  this connector is showing zero files
>
> 
>
> but when i test that process, file found
>
> 
>
> ListSFTP process
>
> 
>
>
> On Mon, Jun 20, 2022 at 7:21 PM Mark Payne 
> wrote:
>
>> Ben,
>>
>> ListSFTP -> FetchSFTP is how you’d want to get started. You’d then
>> want to connect that to a PutSFTP to send to Server B and another PutSFTP
>> to send to Server C.
>>
>> You said that it is not working as expected. What does that mean? Are
>> you seeing errors? Not seeing the data show up?
>>
>> Thanks
>> -Mark
>>
>>
>> On Jun 20, 2022, at 10:20 AM, Ben .T.George 
>> wrote:
>>
>>
>> Hello,
>>
>> I am very new to NiFi and trying to get more details about it.
>>
>> My end goal is to achieve some kind of SFTP solution, where I need to
>> transfer files from server A to Server B, then Server C, which is public.
>>
>> Can you please help me to achieve this, or explain the basics of it.
>>
>> I was trying to use ListSFTP and FetchSFTP, which does not work as
>> expected.
>>
>> Your help will be highly appreciated.
>>
>> Thanks & Regards,
>> Ben
>>
>>
>>
>
> --
> Yours Sincerely
> Ben.T.George
> Phone : +965 - 50629829 / 94100799
>
> *" Live like you will die tomorrow, learn like you will live forever "*
>
>
>

 --
 Yours Sincerely
 Ben.T.George
 Phone : +965 - 50629829 / 94100799

 *" Live like you will die tomorrow, learn like you will live forever "*



>>>
>>> --
>>> Yours Sincerely
>>> Ben.T.George
>>> Phone : +965 - 50629829 / 94100799
>>>
>>> *" Live like you will die tomorrow, learn like you will live forever "*
>>>
>>
>
>-- 
>Yours Sincerely
>Ben.T.George
>Phone : +965 - 50629829 / 94100799
>
>*" Live like you will die tomorrow, learn like you will live forever "*


Re: Re[2]: Minifi and ssl config on NiFi

2022-04-18 Thread Lars Winderling
Hi Dave, you could use a (custom) CA for your client certs, so only the 
CA cert would need to be trusted. And for policies, you could use an LDAP group 
and base policies on that.
The downside is that NiFi currently doesn't offer certificate revocation afaik, so 
it might not be applicable to you.

Best, Lars

On 18 April 2022 17:31:14 CEST, David Early  wrote:
>Matt,
>
>The problem is access policies on the input port on the main NiFi:
>
>We are using LDAP on the main NiFi, and when I create Site to Site comms 
>between NiFi instances I have to create a user in NiFi based on the 
>owner name in the cert from the remote.  Once I have that user, I have 
>to ADD that user to an access policy on the input port to allow that 
>port to receive data from the remote.  In addition, I have to do a 
>similar thing for a policy to allow the RPG to be able to get the list 
>of remote ports.
>
>The issue I am having with this is that it is a very manual process right 
>now: for each remote, I would need to get the cert, get the owner name, 
>create a user in the main NiFi and associate the user with the policy 
>for the input ports.
>
>My question was probably less about MiNiFi and more about how to 
>optimize the SSL relationships and if there was a shortcut I could use 
>to avoid having to do the user creation and custom policy mod for each 
>remote.
>
>Dave
>
>-- Original Message --
>From: "Matt Burgess" 
>To: users@nifi.apache.org
>Sent: 4/17/2022 9:48:29 AM
>Subject: Re: Minifi and ssl config on NiFi
>
>>MiNiFi is actually alive and well, we just moved it into the NiFi codebase. 
>>We’re actively developing a Command-and-Control (C2) capability to remotely 
>>update the flow on the agent for example.
>>
>>You can configure MiNiFi agents for SSL over Site-to-Site in order to talk to 
>>secure NiFi instances. Not sure about the need for a user but you would need 
>>to present a certificate the same as you would for connecting to the NiFi UI. 
>>Some security features still need to be implemented (like encrypted 
>>properties maybe) but you should definitely be able to do what you’re trying 
>>to do with MiNiFi, happy to help with any issues you may run into.
>>
>>Regards,
>>Matt
>>
>>
>>>  On Apr 17, 2022, at 11:40 AM, Jorge Machado  wrote:
>>>
>>>  I did this in the past and I ended up switching to NiFi. I think you should 
>>> do the same. MiNiFi is kind of “dead”, not being developed anymore. I found it 
>>> better to just switch to a single instance of NiFi.
>>>
>>>  Regards
>>>  Jorge
>>>
  On 17. Apr 2022, at 03:30, David Early  wrote:

  We are considering using several dozen minifi instances to gather data at 
 remote locations and send it to a cloud based central NiFi.

  The problem I am THINKING we will have is setting up ssl. The only way I 
 know of to set up ssl for site to site requires a user be configured for 
 the incoming data on the destination NiFi and permissions given to that 
 user to be able to send data.

  Am I missing something? Will we have to manually set up a user in the 
 cloud NiFi for each minifi instances so we can use ssl transport?

  Dave
>>>

Re: Handling floating point numbers

2021-08-26 Thread Lars Winderling

Hi Vibhath,

that is the usual problem with floating point numbers. The only ways to 
fix it that I can imagine are:


* store them as numeric/decimal values in your postgres db (might be 
hard to apply at this stage), or
* you might be able to round and cast them in your postgres-query to 
numeric/decimal values (thus fixing the number of digits) and apply an 
appropriate (avro) schema, or
* in NiFi, round to the exact number of digits and store as a string or 
numeric/decimal before saving to CSV


For the latter to work reliably, you will likely need a single processor 
(like ExecuteScript/ScriptedTransformRecord, or a custom processor) so 
that you can both round and store the numbers in a single session. If 
you split that up into multiple processors, you might end up with the 
same situation again.
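
A minimal ExecuteGroovyScript sketch of that single-session approach; the 
attribute name "measurement" and the scale of one decimal place are made-up 
assumptions for illustration:

import java.math.RoundingMode

def ff = session.get()
if (!ff) return

// hypothetical attribute holding the double value as text
def raw = ff.getAttribute('measurement')
if (raw != null) {
    // BigDecimal keeps an exact decimal representation after rounding
    def rounded = new BigDecimal(raw).setScale(1, RoundingMode.HALF_UP)
    ff = session.putAttribute(ff, 'measurement', rounded.toPlainString())
}
session.transfer(ff, REL_SUCCESS)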


This will of course only work if you know the exact number of digits in 
advance, since that can't be deduced from a floating point 
representation on its own.


I'm sorry if I have missed something.

Best,
Lars

On 8/26/21 5:12 PM, Vibhath Ileperuma wrote:

Hi All,

I have created a NiFi flow to query a PostgreSQL database and write 
data into CSV files. However, I noticed that the floating point values 
(Double values) can be changed slightly when writing to CSV files.
For example, the value 4313681553.3 was written as 4313681553.292. 
Since some of the values I'm extracting are very sensitive, I'm 
wondering if someone can suggest a way to extract the data exactly as 
they are.


Thank You
Vibhath








Re: Negative off heap memory?

2021-08-04 Thread Lars Winderling

Hi Mark,

thank you very much for your quick response. That totally explains it. I 
will make sure to define an upper bound, then.


Best,
Lars

On 8/4/21 7:27 PM, Mark Payne wrote:

Lars,

That’s interesting. The code that computes that is simple:
final MemoryUsage usage = mxBean.getNonHeapMemoryUsage();
return Ratio.of(usage.getUsed(), usage.getMax());
So it’s getting a MemoryUsage object from the JVM and returning 
usage.getUsed() / usage.getMax().
But the documentation for MemoryUsage.getMax() states that if no 
maximum value has been set, it will return -1.


So it looks like it means the non-heap usage is really 2.81816384E8 
for you. So 281 MB. And there is no max value set.
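
For what it's worth, the same values can be read outside NiFi; a quick Groovy 
sketch against the standard MemoryMXBean (output will of course differ per JVM):

import java.lang.management.ManagementFactory

def usage = ManagementFactory.memoryMXBean.nonHeapMemoryUsage
// getMax() returns -1 when no upper bound has been configured
println "non-heap used=${usage.used} max=${usage.max}"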


Thanks
-Mark


On Aug 4, 2021, at 1:19 PM, Lars Winderling 
<lars.winderl...@posteo.de> wrote:


Dear community,

using the QueryNiFiReportingTask on NiFi:1.13.2 on java:8, debian:10, 
with G1 enabled as GC, I see something like this:


  "jvm_heap_used" : 4.26262096E8,
  "jvm_heap_usage" : 0.2646583418051402,
*  "jvm_non_heap_usage" : -2.81816384E8,*

The non-heap usage is hugely negative. What does that mean? I can't 
seem to find any resources on the web, nor in the sources.

Thanks in advance for your support!

Best,
Lars








Negative off heap memory?

2021-08-04 Thread Lars Winderling

Dear community,

using the QueryNiFiReportingTask on NiFi:1.13.2 on java:8, debian:10, 
with G1 enabled as GC, I see something like this:


  "jvm_heap_used" : 4.26262096E8,
  "jvm_heap_usage" : 0.2646583418051402,
*  "jvm_non_heap_usage" : -2.81816384E8,*

The non-heap usage is hugely negative. What does that mean? I can't 
seem to find any resources on the web, nor in the sources.

Thanks in advance for your support!

Best,
Lars





Re: NiFi Queue Monitoring

2021-07-20 Thread Lars Winderling
Scott,
you could use TLS client cert auth, maybe including an appropriate identity mapping. 
Since you have been using LDAP, you may be able to use the DN as the cert subject as-is. 
Only be aware that whitespace handling in the subject DN might differ between 
NiFi and your LDAP. We're also running NiFi secured with an additional auth 
provider, but two-way TLS is always accepted by NiFi.
But maybe you could also employ a reporting task instead of polling the API.
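
For the polling route, a rough Groovy sketch using a client certificate; the 
keystore path, password, and host are placeholders, and /nifi-api/flow/status 
is one endpoint that reports cluster-wide queued counts:

import javax.net.ssl.KeyManagerFactory
import javax.net.ssl.SSLContext
import java.security.KeyStore

// placeholder keystore holding the monitoring user's client certificate
def password = 'changeit'.toCharArray()
def ks = KeyStore.getInstance('PKCS12')
new File('/opt/monitor/monitor.p12').withInputStream { ks.load(it, password) }

def kmf = KeyManagerFactory.getInstance(KeyManagerFactory.defaultAlgorithm)
kmf.init(ks, password)
def ctx = SSLContext.getInstance('TLS')
ctx.init(kmf.keyManagers, null, null)  // default trust store for the server cert

def conn = new URL('https://nifi.example.com:8443/nifi-api/flow/status').openConnection()
conn.setSSLSocketFactory(ctx.socketFactory)
println conn.inputStream.text  // JSON including queuedCount and queuedSize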
Best, Lars

On 20 July 2021 23:31:02 CEST, scott  wrote:
>Hi all,
>I'm trying to set up some monitoring of all queues in my NiFi instance, to
>catch problems before queues become full. One solution I am looking at is to use
>the
>API, but because I have a secure NiFi that uses LDAP, it seems to
>require a
>token that expires in 24 hours or so. I need this to be an automated
>solution, so that is not going to work. Has anyone else tackled this
>problem with a secure LDAP enabled cluster?
>
>Thanks,
>Scott


Re: Expression language within scripts in ExecuteScript

2021-07-17 Thread Lars Winderling
James,

maybe just use:

import socket
# hostname of the node executing the script
socket.gethostname()

It will give you the hostname rather than the IP, but that should also help 
with distinguishing between nodes.

Best, Lars 

On 17 July 2021 22:25:47 CEST, James McMahon  wrote:
>Looking closely at the flowFiles I am creating, in the subsequent output
>queue, I see they have a Node Address listed in FlowFile Details. It is
>not
>listed in the flowfile attributes.
>That is what I need to get at programmatically in my python script. How
>can
>I access Node Address?
>
>On Sat, Jul 17, 2021 at 2:59 PM James McMahon 
>wrote:
>
>> I have a single flowfile I generate on a periodic basis using a cron
>> scheduled GenerateFlowFile. This then flows into an ExecuteScript,
>where I
>> have a python script that will create multiple flowfiles from the
>one.
>>
>> My ExecuteScript is configured to run on all my cluster nodes. For
>each
>> instance of flowfile I am creating, I need to determine which cluster
>node
>> it associates with. There’s an expression language function called
>ip().
>> Can anyone tell me how to employ ${ip()} in my python to determine
>the
>> cluster node the newly created flowFile is associated with?
>>
>> I’d be using this after I execute
>> flowfile = session.create()
>>
>> Thanks in advance for your help.
>>


Re: A few quick questions

2021-05-28 Thread Lars Winderling
Hi Robert,

addressing your second question: I also use global contexts in places where 
this is appropriate. Still, for globally shared, non-sensitive values used within 
groups that have their own context, you could go with environment variables or 
system properties. For sensitive values, you need a parameter anyway. Hope that helps.
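
A minimal sketch of the system-property route, with a made-up property name and 
value (the java.arg index in bootstrap.conf just needs to be unused):

# conf/bootstrap.conf
java.arg.20=-Dshared.api.base.url=https://api.example.com

Expression Language falls back to JVM system properties and then environment 
variables when no matching flowfile attribute exists, so a processor property 
can reference it as ${shared.api.base.url}.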

Best,
Lars

On 28 May 2021 21:37:14 CEST, "Robert R. Bruno"  wrote:
>We recently moved to version 1.13.2 and are finally using the registry
>in
>earnest along with parameter contexts.  Being able to store sensitive
>values is amazing!
>
>Had two quick questions:
>
>1.  Any good way to turn on/off all controller services in a process
>group
>from the UI?
>
>2. Since you can only have one parameter context per process group, how
>are
>you all handling it when you have a specific parameter that is used among
>many
>process groups?  I am trying to avoid having one global parameter
>context,
>but perhaps that is the best practice?
>
>Any thoughts on eventually being able to select more than one parameter
>context for a process?  Perhaps the parameter context then would act as
>a
>namespace.  Just a thought.
>
>Thanks,
>Robert


Re: NIFI - Regex replace Help

2020-11-06 Thread Lars Winderling
Hi Asmath,

If I understood correctly, you could try a single search regex \\N|\" , i.e. 
simply join the two regexes with a pipe. I hope that helps.
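
A quick Groovy check of the alternation (the sample input is made up, since the 
screenshots didn't come through):

// one ReplaceText-style pass removing both literal \N tokens and double quotes
assert '\\N,"value"'.replaceAll(/\\N|"/, '') == ',value'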

Best, Lars

On 6 November 2020 17:35:05 CET, KhajaAsmath Mohammed  
wrote:
>Hi,
>
>I have e replacetext processor with below settings. I want to replace
>them
>with only one processor and use or condition in the search query. any
>help
>on how to do this?
>
>[image: image.png]
>
>[image: image.png]
>
>Thanks,
>Asmath