QuestDB Status-repo Issues

2024-06-04 Thread Phillip Lord
Hello all,

This is going out as a follow-up to something I noted in the Nifi Slack
channel.

I'm hitting some issues related to the QuestDB status-repo implementation.
I'm running a small NiFi (v1.23) cluster on 16-core EC2 instances.  I've set
all the appropriate properties for QuestDB, and it appears to be working...
mostly. Stats at the processor/connection/process-group level etc. all
return without issue.  However, when I go to the global menu and select
"node status history", it spins for a while as if trying to return the info
and then fails with an unexpected error. The logs did generate a lengthy
stack trace (see below).
I checked I/O levels of the attached gp3 EBS volume hosting the status-repo,
and levels are very low.  When monitoring I/O while attempting the db call,
I did see a brief spike of 100% utilization, maxing out the 3k IOPS limit of
the volume, before the call failed.  Curious, I increased the volume IOPS to
6k; same problem, but this time the burst consumed all 6k.  Next I bumped it
to the 16k max... this time IOPS briefly spiked to 7-10k, and the UI instead
returned an API 503 error on controller/status-history; no ERROR in the log
this time.  Has anyone experienced anything like this??  Again, the cluster
is largely idle during these testing periods, so not under heavy load at
all.
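For anyone reproducing this, the status-repo setup being described is controlled by a handful of nifi.properties entries. A sketch of what I have set (values are illustrative, and the property/class names are from memory — verify them against the Admin Guide for your version):

```
# Switch the component status history from the default volatile repo to QuestDB
# (implementation class name may differ slightly by version — check the Admin Guide)
nifi.components.status.repository.implementation=org.apache.nifi.controller.status.history.questdb.EmbeddedQuestDbStatusHistoryRepository
nifi.status.repository.questdb.persist.node.days=14
nifi.status.repository.questdb.persist.component.days=3
nifi.status.repository.questdb.persist.location=/mnt/gp3-volume/status_repository
```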


2024-05-21 18:52:41,073 ERROR
org.glassfish.jersey.server.ServerRuntime$Responder: An I/O error has
occurred while writing a response message entity to the container output
stream.
org.glassfish.jersey.server.internal.process.MappableException:
org.eclipse.jetty.io.EofException
at
org.glassfish.jersey.server.internal.MappableExceptionWrapperInterceptor.aroundWriteTo(MappableExceptionWrapperInterceptor.java:67)
at
org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:139)
at
org.glassfish.jersey.message.internal.MessageBodyFactory.writeTo(MessageBodyFactory.java:1116)
at
org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:649)
at
org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:380)
at
org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:370)
at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:259)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
at
org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:235)
at
org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:684)
at
org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:394)
at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:346)
at
org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:358)
at
org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:311)
at
org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)
at
org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1459)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
at
org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1656)
at
org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:352)
at
org.springframework.security.web.access.intercept.AuthorizationFilter.doFilter(AuthorizationFilter.java:100)
at
org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)
at
org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:126)
at
org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:120)
at
org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)
at
org.apache.nifi.web.security.log.AuthenticationUserFilter.doFilterInternal(AuthenticationUserFilter.java:57)
at
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
at
org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)
at
org.springframework.security.oauth2.server.resource.web.authentication.BearerTokenAuthenticationFilter.doFilterInternal(BearerTokenAuthenticationFilter.java:132)
at
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:117)
at
org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:361)
at
org.apache.nifi.web.security.NiFiAuthenticationFilter.authenticate(NiFiAuthenticationFilter.java:94)
at

Nifi with dfs?

2024-04-30 Thread Phillip Lord
Hello all,

I'm curious whether anyone has experience running NiFi with
distributed-file-system-backed repos?  In particular I'm looking at EC2 with
EFS-backed repositories.

I've done some work in the past on this... but it's been a while and wasn't
with EFS.  So I was curious whether anyone has tried this approach / has any
success stories using EFS.

Of course the typical approach is to utilize EBS volumes attached to your
EC2 instance... but I think being able to scale the repos as needed is a
great benefit... instead of reserving XX GB per node that may rarely be
utilized.

Appreciate any insights!

Thanks,
Phil


Re: Access instance name

2024-02-15 Thread Phillip Lord
Etienne,

Is this what you're looking for?

https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#hostname

Thanks,
Phil
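For a single-node instance, the hostname() Expression Language function from that guide can be used directly in UpdateAttribute; for example, add user-defined properties along these lines (property names here are just illustrative):

```
node.name = ${hostname()}        # simple hostname
node.fqdn = ${hostname(true)}    # fully qualified hostname
```

Downstream components (PutRecord etc.) can then reference ${node.name} wherever the node identity is needed.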

On Wed, Feb 14, 2024 at 5:41 AM Etienne Jouvin 
wrote:

> Hello.
>
> Imagine I use UpdateAttribute to set an attribute where I put the NiFi
> instance id or name.
> This attribute may be used afterwards, for example when I use PutRecord and
> I want to add information about the NiFi node that did the work.
>
> In cluster mode, you can have a node identifier in the cluster, node-1 /
> node-2 or something like this.
> But in a single-node architecture, how can I do this?
>
> Using Parameter Contexts may not work, because they will be shared across
> all cluster nodes.
>
> Regards
>
>
>
> On Wed, Feb 14, 2024 at 1:41 AM, Matt Burgess 
> wrote:
>
>> Etienne,
>>
>> What instance name / id are you referring to?
>>
>> On Tue, Feb 13, 2024 at 8:43 AM Etienne Jouvin 
>> wrote:
>>
>>> Hello all.
>>>
>>> Just a simple question: is there a way to access the instance name / id
>>> from a controller service, even in a non-clustered deployment?
>>>
>>> Regards
>>>
>>> Etienne Jouvin
>>>
>>


Re: Finding slow down in processing

2024-01-12 Thread Phillip Lord
Ditto...

@Aaron... so outside of the GenerateFlowFile -> PutFile, were there
additional components/dataflows handling data at the same time as the
"stress-test"?  These will all share the same thread pool.  So depending
upon your dataflow footprint and any variability in data volumes...
20 timer-driven threads could be exhausted pretty quickly.  This might cause
not only your "stress-test" to slow down but your other flows as well, as
components might be waiting for available threads to do their jobs.

Thanks,
Phil
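On the resizing behavior discussed in the quoted thread below: NiFi's scheduler internals aside, one generic way to "unstick" a pool is to rebuild it, which is roughly what toggling the setting accomplishes. A hypothetical Python sketch, illustrative only and not NiFi code:

```python
from concurrent.futures import ThreadPoolExecutor

class ResizablePool:
    """Illustrative only: 'resize' by draining and rebuilding the pool,
    which also discards any wedged worker threads in the old pool."""

    def __init__(self, size):
        self._size = size
        self._pool = ThreadPoolExecutor(max_workers=size)

    def resize(self, new_size):
        old = self._pool
        # Fresh pool with fresh workers takes over immediately...
        self._pool = ThreadPoolExecutor(max_workers=new_size)
        # ...while in-flight tasks on the old pool finish before it dies.
        old.shutdown(wait=True)
        self._size = new_size

    def submit(self, fn, *args):
        return self._pool.submit(fn, *args)

pool = ResizablePool(20)
result = pool.submit(lambda x: x * 2, 21).result()
print(result)  # 42
pool.resize(30)  # every worker thread is now brand new
```

If an internal worker really was stuck, a rebuild like this would explain why changing the value — in either direction — restored throughput.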

On Thu, Jan 11, 2024 at 3:44 PM Mark Payne  wrote:

> Aaron,
>
> Interestingly, up to version 1.21 of NiFi, if you increased the size of the
> thread pool, the increase took effect immediately. But if you decreased the
> size of the thread pool, the decrease didn’t take effect until you
> restarted NiFi. So that’s probably why you’re seeing the behavior you are.
> Even though you reset it to 10 or 20, it’s still running at 40.
>
> This was done due to issues with Java many years ago, where it caused
> problems to decrease the thread pool size.  So just recently we updated
> NiFi to immediately scale down the thread pools as well.
>
> Thanks
> -Mark
>
>
> On Jan 11, 2024, at 1:35 PM, Aaron Rich  wrote:
>
> So the good news is it's working now. I know what I did but I don't know
> why it worked so I'm hoping others can enlighten me based on what I did.
>
> TL;DR - "turn it off/turn it on" for Max Timer Driven Thread Count fixed
> performance. Max Timer Driven Thread Count was set to 20. I changed it to
> 30 - performance increased. I changed it again to 40 - it increased. I
> moved it back to 20 - performance was still up, back to what it originally
> was before ever slowing down.
>
> (this is long to give background and details)
> NiFi version: 1.19.1
>
> NiFi was deployed into a Kubernetes cluster as a single instance - no NiFi
> clustering. We would set a CPU request of 4, and limit of 8, memory request
> of 8, limit of 12. The repos are all volume-mounted out to SSD.
>
> The original deployment was as described above and Max Timer Driven Thread
> Count was set to 20. We ran a very simple data flow
> (GenerateFlowFile -> PutFile) AFAP (as fast as possible) to try to stress
> as much as possible before starting our other data flows. That ran for a
> week with no issue doing 20K/5m.
> We turned on the other data flows and everything was processing as
> expected, good throughput rates and things were happy.
> Then after 3 days the throughput dropped DRAMATICALLY (instead of 11K/5m in
> an UpdateAttribute, it went to 350/5m). The data being processed did not
> change in volume/cadence/velocity/etc.
> Rancher Cluster explorer dashboards didn't show resources standing out as
> limiting or constraining.
> Tried restarting workload in Kubernetes, and data flows were slow right
> from start - so there wasn't a ramp up or any degradation over time - it
> was just slow to begin.
> Tried removing all the repos/state so NiFi came up clean, in case it was
> the historical data that was the issue - still slow from start.
> Tried changing the node in the Kube cluster in case the node was bad -
> still slow from start.
> Removed CPU limit (allowing NiFi to potentially use all 16 cores on node)
> from deployment to see if there was CPU throttling happening that I wasn't
> able to see on the Grafana dashboards - still slow from start.
> While NiFi was running, I changed the Max Timer Driven Thread Count from
> 20->30; performance picked up. Changed it again from 30->40; performance
> picked up. I changed from 40->10; performance stayed up. I changed from
> 10->20; performance stayed up and was at the original amount from before
> the slowdown ever happened.
>
> So end of the day, the Max Timer Driven Thread Count is at exactly what it
> was before but the performance changed. It's like something was "stuck".
> It's very, very odd to me to see things be fine, degrade for days and
> through multiple environment changes/debugging, and then return to fine
> when I change a parameter to a different value->back to original value.
> Effectively, I "turned it off/turned it on" with the Max Timer Driven
> Thread Count value.
>
> My question is - what is happening under the hood when the Max Timer
> Driven Thread Count is changed? What does that affect? Is there something I
> could look at from Kubernetes' side potentially that would relate to that
> value?
>
> Could an internal NiFi thread have gotten stuck, and changing that value
> rebuilt the thread pool? If that is even possible, is there any way to know
> what caused the thread to "get stuck" in the first place?
>
> Any insight would be greatly appreciated!
>
> Thanks so much for all the suggestions and help on this.
>
> -Aaron
>
>
>
> On Wed, Jan 10, 2024 at 1:54 PM Aaron Rich  wrote:
>
>> Hi Joe,
>>
>> Nothing is load balanced- it's all basic queues.
>>
>> Mark,
>> I'm using NiFi 1.19.1.
>>
>> nifi.performance.tracking.percentage sounds like exactly what I might
>> need.  I'll give that a shot.
>>
>> Richard,
>> I hadn't 

Re: Hardware requirement for NIFI instance

2024-01-05 Thread Phillip Lord
Agreed on updating defaults... I've seen this play out before.

On Fri, Jan 5, 2024 at 10:10 AM Mark Payne  wrote:

> Thanks for following up. That actually makes sense. I don’t think Output
> Batch Size will play a very big role here. But Fetch Size, if I understand
> correctly, is essentially telling the JDBC Driver “Here’s how many rows you
> should pull back at once.” And so it’s going to buffer all of those rows
> into memory until it has written out all of them.
>
> So if you set Fetch Size = 0, it’s going to pull back all rows in your
> database into memory. To be honest, I cannot imagine a single scenario
> where that’s desirable. We should probably set the default to something
> reasonable like 1,000 or 10,000 at most. And in 2.0, where we have the
> ability to migrate old configurations we should automatically change any
> config that has Fetch Size of 0 to the default value.
>
> @Matt Burgess, et al., any concerns with that?
>
> Thanks
> -Mark
>
>
> On Jan 5, 2024, at 9:45 AM, e-soci...@gmx.fr wrote:
>
> So after some tests, here are the results; perhaps they could help someone.
>
> With NiFi (2 CPU / 8 GB RAM)
>
> I have tested with these pairs of properties:
>
> > 1 executeSQL with "select * from table"
> Output Batch Size : 1
> Fetch Size : 10
>
> > 2 executeSQL with "select * from table"
> Output Batch Size : 1
> Fetch Size : 20
>
> > 2 executeSQL with "select * from table"
> Output Batch Size : 1
> Fetch Size : 40
> and started 5 ExecuteSQL at the same time
>
> The 5 processors work perfectly and receive 5 Avro files of the same size.
> And during the test, the memory is stable and the Web UI works perfectly.
>
>
> FAILED TEST "OUT OF MEMORY" if the properties are :
>
> > 1 executeSQL with "select * from table"
> Output Batch Size : 0
> Fetch Size : 0
> Regards
>
>
> *Sent:* Friday, January 5, 2024 at 08:12
> *From:* "Matt Burgess" 
> *To:* users@nifi.apache.org
> *Subject:* Re: Hardware requirement for NIFI instance
> You may not need to merge if your Fetch Size is set appropriately. For
> your case I don't recommend setting Max Rows Per Flow File because you
> still have to wait for all the results to be processed before the
> FlowFile(s) get sent "downstream". Also if you set Output Batch Size
> you can't use Merge downstream as ExecuteSQL will send FlowFiles
> downstream before it knows the total count.
>
> If you have a NiFi cluster and not a standalone instance you MIGHT be
> able to represent your complex query using GenerateTableFetch and use
> a load-balanced connection to grab different "pages" of the table in
> parallel with ExecuteSQL. Those can be merged later as long as you get
> all the FlowFiles back to a single node. Depending on how complex your
> query is then it's a long shot but I thought I'd mention it just in
> case.
>
> Regards,
> Matt
>
>
> On Thu, Jan 4, 2024 at 1:41 PM Pierre Villard
>  wrote:
> >
> > You can merge multiple Avro flow files with MergeRecord with an Avro
> Reader and an Avro Writer
> >
> > On Thu, Jan 4, 2024 at 22:05,  wrote:
> >>
> >> And the important thing for us is to have only one Avro file per table.
> >>
> >> So is it possible to merge Avro files into one Avro file?
> >>
> >> Regards
> >>
> >>
> >> Sent: Thursday, January 4, 2024 at 19:01
> >> From: e-soci...@gmx.fr
> >> To: users@nifi.apache.org
> >> Cc: users@nifi.apache.org
> >> Subject: Re: Hardware requirement for NIFI instance
> >>
> >> Hello all,
> >>
> >> Thanks a lot for the reply.
> >>
> >> So for more details.
> >>
> >> All the properties for the ExecuteSQL are set by default, except "Set
> Auto Commit: false".
> >>
> >> The SQL command could not be simpler than "select * from
> ${db.table.fullname}"
> >>
> >> The NiFi versions are 1.16.3 and 1.23.2
> >>
> >> I have also tested the same SQL command on another NiFi (8 cores / 16 GB
> RAM) and it works.
> >> The result is an Avro file of 1.6 GB
> >>
> >> The detail about the output flowfile :
> >>
> >> executesql.query.duration
> >> 245118
> >> executesql.query.executiontime
> >> 64122
> >> executesql.query.fetchtime
> >> 180996
> >> executesql.resultset.index
> >> 0
> >> executesql.row.count
> >> 14961077
> >>
> >> File Size
> >> 1.62 GB
> >>
> >> Regards
> >>
> >> Minh
> >>
> >>
> >> Sent: Thursday, January 4, 2024 at 17:18
> >> From: "Matt Burgess" 
> >> To: users@nifi.apache.org
> >> Subject: Re: Hardware requirement for NIFI instance
> >> If I remember correctly, the default Fetch Size for Postgresql is to
> >> get all the rows at once, which can certainly cause the problem.
> >> Perhaps try setting Fetch Size to something like 1000 or so and see if
> >> that alleviates the problem.
> >>
> >> Regards,
> >> Matt
> >>
> >> On Thu, Jan 4, 2024 at 8:48 AM Etienne Jouvin 
> wrote:
> >> >
> >> > Hello.
> >> >
> >> > I also think the problem is more about the processor, I guess
> ExecuteSQL.
> >> >
> >> > You should play with the batch configuration and the commit flag to
> commit intermediate FlowFiles.
> >> >
> >> > The out of memory exception makes me believe the 
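An aside on the Fetch Size discussion above: the memory trade-off is easy to demonstrate with any cursor API. A hedged illustration using Python's stdlib sqlite3 — purely an analogy for JDBC's fetch size, not ExecuteSQL's actual code path — showing streaming in bounded batches versus materializing every row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i, f"row-{i}") for i in range(10_000)])

cur = conn.execute("SELECT * FROM t")

# Analogue of Fetch Size = 0: everything resident in memory at once.
# all_rows = cur.fetchall()   # 10,000 tuples held simultaneously

# Analogue of a sane Fetch Size (e.g. 1000): stream fixed-size batches.
count = 0
while True:
    batch = cur.fetchmany(1000)   # at most 1,000 rows in memory per pass
    if not batch:
        break
    count += len(batch)

print(count)  # 10000
```

With 14.9 million rows in the failing test above, the "fetch everything" mode is exactly what produces the out-of-memory behavior.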

Re: Nifi - Content-repo on AWS-EBS volumes

2023-12-15 Thread Phillip Lord
I just switched a cluster using 3 EBS volumes for the content repo from gp2
to gp3… it resolved definite I/O throughput issues.  The improvement from
gp3 was significant enough that I might actually reduce from 3 volumes to 2;
perhaps even a single volume would be sufficient.

Of course every use case is unique.
On Dec 15, 2023 at 5:37 PM -0500, Gregory M. Foreman 
, wrote:
> Mark:
>
> Got it. Thank you for the help.
>
> Greg
>
> > On Dec 15, 2023, at 4:14 PM, Mark Payne  wrote:
> >
> > Greg,
> >
> > Whether or not multiple content repos will have any impact depends very 
> > much on where your system’s bottleneck is. If your bottleneck is disk I/O, 
> > it will absolutely help. If your bottleneck is CPU, it won’t. If, for 
> > example, you’re running on bare metal and have 48 cores on your machine and 
> > you’re running with spinning disks, you’ll definitely want to use multiple 
> > spinning disks. But if you’re running in AWS on a VM that has 4 cores and 
> > you’re using gp3 EBS volumes, it’s unlikely that multiple content repos 
> > will help.
> >
> > Thanks
> > -Mark
> >
> >
> >
> > > On Dec 15, 2023, at 3:25 PM, Gregory M. Foreman 
> > >  wrote:
> > >
> > > Mark:
> > >
> > > I was just discussing multiple content repos on EBS volumes with a 
> > > colleague. I found your post from a long time ago:
> > >
> > > https://lists.apache.org/thread/nq3mpry0wppzrodmldrcfnxwzp3n1cjv
> > >
> > > “Re #2: I don't know that i've used any SAN to back my repositories other 
> > > than the EBS provided by Amazon EC2. In that environment, I found that 
> > > having one or having multiple repos was essentially equivalent.”
> > >
> > > Does that statement still hold true today? Essentially there is no real 
> > > performance benefit to having multiple content repos on multiple EBS 
> > > volumes?
> > >
> > > Thanks,
> > > Greg
> > >
> > >
> > >
> > > > On Dec 11, 2023, at 8:50 PM, Mark Payne  wrote:
> > > >
> > > > Hey Phil,
> > > >
> > > > NiFi will not spread the content of a single file over multiple 
> > > > partitions. It will write the content of FlowFile 1 to content repo 1, 
> > > > then write the next FlowFile to repo 2, etc. so it does round-robin but 
> > > > does not spread a single FlowFile across multiple repos.
> > > >
> > > > Thanks
> > > > -Mark
> > > >
> > > > Sent from my iPhone
> > > >
> > > > > On Dec 11, 2023, at 8:45 PM, Phillip Lord  
> > > > > wrote:
> > > > >
> > > > >
> > > > > Hello Nifi comrades,
> > > > >
> > > > > Here's my scenario...
> > > > > Let's say I have a Nifi cluster running on EC2 instances with 
> > > > > attached EBS volumes serving as their repos. They've split up their 
> > > > > content-repos into three content-repos per node(cont1, cont2, cont3). 
> > > > > Each being a dedicated EBS volume. My understanding is that the 
> > > > > content-claims for a single file can potentially span across more 
> > > > > than one of these repos.(correct me if I've lost my mind over the 
> > > > > years)
> > > > > For instance if you have a 1 MB file, and let's say your
> > > > > max.content.claim.size is 100KB, that's ten 100KB claims(ish)
> > > > > potentially split up across the 3 EBS volumes. So if NiFi is trying
> > > > > to move that file to S3 or something for instance... it needs to be
> > > > > read from each of the volumes.
> > > > > Whereas if it was a single EBS volume for the cont-repo... it would 
> > > > > read from the single volume, which I would think would be more 
> > > > > performant? Or does spreading out any IO contention across volumes 
> > > > > provide more of a benefit?
> > > > > I know there's different levels of EBS volumes... but not factoring 
> > > > > that in for right now.
> > > > >
> > > > > Appreciate any insight... trying to determine the best configuration.
> > > > >
> > > > > Thanks,
> > > > > Phil
> > > > >
> > > > >
> > >
> >
>


Re: ConsumeKafka to PublishKafka doesn't keep the order of the messages in the destination topic

2023-12-13 Thread Phillip Lord
 Perhaps try following this guidance in the docs???

Consumer Partition Assignment

By default, this processor will subscribe to one or more Kafka topics in
such a way that the topics to consume from are randomly assigned to the
nodes in the NiFi cluster. Consider a scenario where a single Kafka topic
has 8 partitions and the consuming NiFi cluster has 3 nodes. In this
scenario, Node 1 may be assigned partitions 0, 1, and 2. Node 2 may be
assigned partitions 3, 4, and 5. Node 3 will then be assigned partitions 6
and 7.

In this scenario, if Node 3 somehow fails or stops pulling data from Kafka,
partitions 6 and 7 may then be reassigned to the other two nodes. For most
use cases, this is desirable. It provides fault tolerance and allows the
remaining nodes to pick up the slack. However, there are cases where this
is undesirable.

One such case is when using NiFi to consume Change Data Capture (CDC) data
from Kafka. Consider again the above scenario. Consider that Node 3 has
pulled 1,000 messages from Kafka but has not yet delivered them to their
final destination. NiFi is then stopped and restarted, and that takes 15
minutes to complete. In the meantime, Partitions 6 and 7 have been
reassigned to the other nodes. Those nodes then proceeded to pull data from
Kafka and deliver it to the desired destination. After 15 minutes, Node 3
rejoins the cluster and then continues to deliver its 1,000 messages that
it has already pulled from Kafka to the destination system. Now, those
records have been delivered out of order.

The solution for this, then, is to assign partitions statically instead of
dynamically. In this way, we can assign Partitions 6 and 7 to Node 3
specifically. Then, if Node 3 is restarted, the other nodes will not pull
data from Partitions 6 and 7. The data will remain queued in Kafka until
Node 3 is restarted. By using this approach, we can ensure that the data
that already was pulled can be processed (assuming First In First Out
Prioritizers are used) before newer messages are handled.

In order to provide a static mapping of node to Kafka partition(s), one or
more user-defined properties must be added using the naming scheme
partitions.<hostname>, with the value being a comma-separated list of Kafka
partitions to use. For example, partitions.nifi-01=0, 3, 6, 9,
partitions.nifi-02=1, 4, 7, 10, and partitions.nifi-03=2, 5, 8, 11. The
hostname that is used can be the fully qualified hostname, the "simple"
hostname, or the IP address.
There must be an entry for each node in the cluster, or the Processor will
become invalid. If it is desirable for a node to not have any partitions
assigned to it, a Property may be added for the hostname with an empty
string as the value.

NiFi cannot readily validate that all Partitions have been assigned before
the Processor is scheduled to run. However, it can validate that no
partitions have been skipped. As such, if partitions 0, 1, and 3 are
assigned but not partition 2, the Processor will not be valid. However, if
partitions 0, 1, and 2 are assigned, the Processor will become valid, even
if there are 4 partitions on the Topic. When the Processor is started, the
Processor will immediately start to fail, logging errors, and avoid pulling
any data until the Processor is updated to account for all partitions. Once
running, if the number of partitions is changed, the Processor will
continue to run but not pull data from the newly added partitions. Once
stopped, it will begin to error until all partitions have been assigned.
Additionally, if partitions that are assigned do not exist (e.g.,
partitions 0, 1, 2, 3, 4, 5, 6, and 7 are assigned, but the Topic has only
4 partitions), then the Processor will begin to log errors on startup and
will not pull data.

In order to use a static mapping of Kafka partitions, the "Topic Name
Format" must be set to "names" rather than "pattern." Additionally, all
Topics that are to be consumed must have the same number of partitions. If
multiple Topics are to be consumed and have a different number of
partitions, multiple Processors must be used so that each Processor
consumes only from Topics with the same number of partitions.
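Putting the guidance above together for the worked example (3-node cluster, one 12-partition topic), the consumer processor would carry three user-defined properties — hostnames here are placeholders for your actual node names — plus "Topic Name Format" set to "names":

```
partitions.nifi-01=0, 3, 6, 9
partitions.nifi-02=1, 4, 7, 10
partitions.nifi-03=2, 5, 8, 11
```

Every node must have an entry (an empty value is allowed for a node that should own no partitions), and no partition number may be skipped.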

On Dec 13, 2023 at 8:59 AM -0500, edi mari , wrote:

Hi Pierre,
Yes, we tried the FIFO prioritizer on the queue, but it didn't help.
Some records in the target topic are ordered differently from the source
topic (which is critical with cleanup policy: compact).

Edi

On Wed, Dec 13, 2023 at 3:46 PM Pierre Villard 
wrote:

> Hi Edi,
>
> Did you try setting the FIFO prioritizer on the connection between the
> processors?
>
> Thanks,
> Pierre
>
> On Wed, Dec 13, 2023 at 14:19, edi mari  wrote:
>
>>
>> Hello ,
>> I'm using NiFi v1.20.0 to replicate 250 million messages between Kafka
>> topics.
>> The problem is that NiFi replicates messages in a non-sequential order,
>> resulting in the destination topic storing messages differently than the
>> source topic.
>>
>> for example
>> *source topic - partition 0*
>> offset:5 

Nifi - Content-repo on AWS-EBS volumes

2023-12-11 Thread Phillip Lord
Hello Nifi comrades,

Here's my scenario...
Let's say I have a Nifi cluster running on EC2 instances with attached EBS
volumes serving as their repos.  They've split up their content-repos into
three content-repos per node(cont1, cont2, cont3).  Each being a dedicated
EBS volume.  My understanding is that the content-claims for a single file
can potentially span across more than one of these repos.(correct me if
I've lost my mind over the years)
For instance, if you have a 1 MB file, and let's say your
max.content.claim.size is 100KB, that's ten 100KB claims(ish) potentially
split up across the 3 EBS volumes.  So if NiFi is trying to move that file
to S3 or something for instance... it needs to be read from each of the
volumes.
Whereas if it was a single EBS volume for the cont-repo... it would read
from the single volume, which I would think would be more performant?  Or
does spreading out any IO contention across volumes provide more of a
benefit?
I know there's different levels of EBS volumes... but not factoring that in
for right now.

Appreciate any insight... trying to determine the best configuration.

Thanks,
Phil
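For reference, the three-repo layout described above corresponds to nifi.properties entries along these lines (the paths and the cont1/cont2/cont3 names are illustrative; each suffix after nifi.content.repository.directory. is an arbitrary partition label):

```
nifi.content.repository.directory.cont1=/mnt/ebs1/content_repository
nifi.content.repository.directory.cont2=/mnt/ebs2/content_repository
nifi.content.repository.directory.cont3=/mnt/ebs3/content_repository
```

Consolidating to a single volume would just mean a single directory entry pointing at one mount.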


Re: TLSv1.3 SSLContext not available on Java 11 and RHEL8

2023-08-15 Thread Phillip Lord
Can you add the error here for more context?
On Aug 15, 2023 at 9:38 AM -0400, Mike Thomsen , wrote:
> As the subject line says, we're getting a weird error when trying to migrate 
> to RHEL8. We're already on Java 11 on RHEL7, but for some reason NiFi is 
> running into problems instantiating a TLSv1.3 SSLContext.
>
> Does anyone have any suggestions on what could be happening here?


Re: UI SocketTimeoutException - heavy IO

2023-07-12 Thread Phillip Lord
bucket 
> > > > > > > > 736a8f4b-19be-4c01-b2c3-901d9538c5ef due to: Connection refused 
> > > > > > > > (Connection refused)
> > > > > > > > 2023-03-22 10:52:13,960 ERROR [Timer-Driven Process Thread-62] 
> > > > > > > > o.a.nifi.groups.StandardProcessGroup Failed to synchronize 
> > > > > > > > StandardProcessGroup[identifier=920c3600-2954-1c8e-b121-6d7d3d393de6,name=Save
> > > > > > > >  Binary Data] with Flow Registry because could not retrieve 
> > > > > > > > version 1 of flow with identifier 
> > > > > > > > 7a8c82be-1707-4e7d-a5e7-bb3825e0a38f in bucket 
> > > > > > > > 736a8f4b-19be-4c01-b2c3-901d9538c5ef due to: Connection refused 
> > > > > > > > (Connection refused)
> > > > > > > > A clue?
> > > > > > > > -joe
> > > > > > > > On 3/22/2023 10:49 AM, Mark Payne wrote:
> > > > > > > > > Joe,
> > > > > > > > >
> > > > > > > > > 1.8 million FlowFiles is not a concern. But when you say 
> > > > > > > > > “Should I reduce the queue sizes?” it makes me wonder if 
> > > > > > > > > they’re all in a single queue?
> > > > > > > > > Generally, you should leave the backpressure threshold at the 
> > > > > > > > > default 10,000 FlowFile max. Increasing this can lead to huge 
> > > > > > > > > amounts of swapping, which will drastically reduce 
> > > > > > > > > performance and increase disk utilization very significantly.
> > > > > > > > >
> > > > > > > > > Also from the diagnostics, it looks like you’ve got a lot of 
> > > > > > > > > CPU cores, but you’re not using much. And based on the amount 
> > > > > > > > > of disk space available and the fact that you’re seeing 100% 
> > > > > > > > > utilization, I’m wondering if you’re using spinning disks, 
> > > > > > > > > rather than SSDs? I would highly recommend always running 
> > > > > > > > > NiFi with ssd/nvme drives. Absent that, if you have multiple 
> > > > > > > > > disk drives, you could also configure the content repository 
> > > > > > > > > to span multiple disks, in order to spread that load.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > > -Mark
> > > > > > > > >
> > > > > > > > > > On Mar 22, 2023, at 10:41 AM, Joe Obernberger 
> > > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > Thank you.  Was able to get in.
> > > > > > > > > > Currently there are 1.8 million flowfiles and 3.2 GB.  Is
> > > > > > > > > > this too much for a 3-node cluster with multiple spindles
> > > > > > > > > > each (SATA drives)?
> > > > > > > > > > Should I reduce the queue sizes?
> > > > > > > > > > -Joe
> > > > > > > > > > On 3/22/2023 10:23 AM, Phillip Lord wrote:
> > > > > > > > > > > Joe,
> > > > > > > > > > >
> > > > > > > > > > > If you need the UI to come back up, try setting the 
> > > > > > > > > > > autoresume setting in nifi.properties to false and 
> > > > > > > > > > > restart node(s).
> > > > > > > > > > > This will bring every component/controllerService up
> > > > > > > > > > > stopped/disabled and may provide some breathing room for
> > > > > > > > > > > the UI to become available again.
> > > > > > > > > > >
> > > > > > > > > > > Phil
> > > > > > > > > > > On Mar 22, 2023 at 10:20 AM -0400, Joe Obernberger 
> > > > > > > > > > > , wrote:
> > > > > > > > > > > > atop shows the disk as being all red with IO - 100% 
> > > > > > > > > > > > utilization. There
> > > > > > > > > > > > are a lot of flowfiles currently trying to run through, 
> > > > > > > > > > > > but I can't
> > > > > > > > > > > > monitor it because the UI won't load.
> > > > > > > > > > > >
> > > > > > > > > > > > -Joe
> > > > > > > > > > > >
> > > > > > > > > > > > On 3/22/2023 10:16 AM, Mark Payne wrote:
> > > > > > > > > > > > > Joe,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I’d recommend taking a look at garbage collection. It 
> > > > > > > > > > > > > is far more likely the culprit than disk I/O.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks
> > > > > > > > > > > > > -Mark
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mar 22, 2023, at 10:12 AM, Joe Obernberger 
> > > > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'm getting "java.net.SocketTimeoutException: 
> > > > > > > > > > > > > > timeout" from the user interface of NiFi when load 
> > > > > > > > > > > > > > is heavy. This is 1.18.0 running on a 3 node 
> > > > > > > > > > > > > > cluster. Disk IO is high and when that happens, I 
> > > > > > > > > > > > > > can't get into the UI to stop any of the processors.
> > > > > > > > > > > > > > Any ideas?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I have put the flowfile repository and content 
> > > > > > > > > > > > > > repository on different disks on the 3 nodes, but 
> > > > > > > > > > > > > > disk usage is still so high that I can't get in.
> > > > > > > > > > > > > > Thank you!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > -Joe
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > This email has been checked for viruses by AVG 
> > > > > > > > > > > > > > antivirus software.
> > > > > > > > > > > > > > www.avg.com


Re: UI SocketTimeoutException - heavy IO

2023-03-22 Thread Phillip Lord
Joe,

If you need the UI to come back up, try setting the autoresume setting in 
nifi.properties to false and restart node(s).
This will bring every component/controllerService up stopped/disabled and 
may provide some breathing room for the UI to become available again.

Phil
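
For reference, the flag being described here is presumably the auto-resume
property in nifi.properties (name assumed from a standard NiFi install);
a minimal sketch:

```properties
# nifi.properties — bring components up stopped/disabled after restart
nifi.flowcontroller.autoResumeState=false
```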
On Mar 22, 2023 at 10:20 AM -0400, Joe Obernberger 
, wrote:
> atop shows the disk as being all red with IO - 100% utilization. There
> are a lot of flowfiles currently trying to run through, but I can't
> monitor it because the UI won't load.
>
> -Joe
>
> On 3/22/2023 10:16 AM, Mark Payne wrote:
> > Joe,
> >
> > I’d recommend taking a look at garbage collection. It is far more likely 
> > the culprit than disk I/O.
> >
> > Thanks
> > -Mark
> >
> > > On Mar 22, 2023, at 10:12 AM, Joe Obernberger 
> > >  wrote:
> > >
> > > I'm getting "java.net.SocketTimeoutException: timeout" from the user 
> > > interface of NiFi when load is heavy. This is 1.18.0 running on a 3 node 
> > > cluster. Disk IO is high and when that happens, I can't get into the UI 
> > > to stop any of the processors.
> > > Any ideas?
> > >
> > > I have put the flowfile repository and content repository on different 
> > > disks on the 3 nodes, but disk usage is still so high that I can't get in.
> > > Thank you!
> > >
> > > -Joe
> > >
> > >
> > > --
> > > This email has been checked for viruses by AVG antivirus software.
> > > www.avg.com
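
Following up on Mark's garbage-collection suggestion above: a quick way to
check is to enable GC logging via NiFi's conf/bootstrap.conf. The argument
index below is arbitrary and the flag assumes Java 9+ unified logging:

```properties
# bootstrap.conf — hypothetical extra JVM argument to log GC activity
java.arg.20=-Xlog:gc*:file=./logs/gc.log:time,uptime
```

Long or frequent pauses in gc.log coinciding with the UI timeouts would point
at heap pressure rather than disk I/O.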


Re: Cluster modification

2023-03-14 Thread Phillip Lord
I see... Thanks Pierre.  This all makes sense... I'm going to assume the
reconnecting node uses the Cluster Coordinator to retrieve the latest
flow-version.  In Mark's video he demonstrated reconnecting a node through
the UI.  Should the same process occur when a node is not just
disconnected, but the NiFi on that node is actually stopped and requires
restarting?  So that on start-up the node ensures it's utilizing
the "up-to-date" cluster flow?

Thanks,
Phil

On Tue, Mar 14, 2023 at 9:56 AM Pierre Villard 
wrote:

> Hi Phil,
>
>
>> Now you're allowed to make changes... meaning the node that was removed
>> from the cluster now has a flow that is no longer in-sync with the cluster
>> and you have to remove that node's flow before it's able to rejoin the
>> cluster.  Was this intentional?  I know there are a lot of things coming
>> down the pipe with Nifi 2.0 but looking to understand the thought process
>> behind this...
>>
>
> This is not completely correct. The node should be able to rejoin the
> cluster and NiFi will automatically apply the flow changes to the rejoining
> node so that the flow definition is in-sync again across the cluster.
> This is definitely intentional to provide a better user experience and not
> switch to read-only when a node is disconnected.
>
> Note that there are cases where the node would not be able to rejoin the
> cluster in order to not cause any unwanted data loss: for example if a
> connection is deleted in the cluster, and there is data in this connection
> when the node tries to rejoin the cluster, we would not delete the data.
>
> I'm sure others can provide more details, but I also recommend watching
> Mark's video on this topic:
> https://youtu.be/8G6niPKntTc?t=709
>
> Thanks,
> Pierre
>
> Le mar. 14 mars 2023 à 14:46, Phillip Lord  a
> écrit :
>
>> Nifi Guardians,
>>
>> Can someone explain the motivation behind the somewhat recent change (I
>> believe 1.17ish) that now allows users to make canvas changes to a cluster
>> that is missing a node.  For instance, a cluster that is normally 3/3 is for
>> whatever reason now 2/3.  Previously you would get a warning and you were
>> unable to make changes to the 2/3 clustered canvas.  Which certainly
>> prevents future headaches.
>>
>> Now you're allowed to make changes... meaning the node that was removed
>> from the cluster now has a flow that is no longer in-sync with the cluster
>> and you have to remove that node's flow before it's able to rejoin the
>> cluster.  Was this intentional?  I know there are a lot of things coming
>> down the pipe with Nifi 2.0 but looking to understand the thought process
>> behind this...
>>
>> Thanks,
>> Phil
>>
>


Cluster modification

2023-03-14 Thread Phillip Lord
Nifi Guardians,

Can someone explain the motivation behind the somewhat recent change (I
believe 1.17ish) that now allows users to make canvas changes to a cluster
that is missing a node.  For instance, a cluster that is normally 3/3 is for
whatever reason now 2/3.  Previously you would get a warning and you were
unable to make changes to the 2/3 clustered canvas.  Which certainly
prevents future headaches.

Now you're allowed to make changes... meaning the node that was removed
from the cluster now has a flow that is no longer in-sync with the cluster
and you have to remove that node's flow before it's able to rejoin the
cluster.  Was this intentional?  I know there are a lot of things coming
down the pipe with Nifi 2.0 but looking to understand the thought process
behind this...

Thanks,
Phil


Re: Maximum number of processors?

2023-03-10 Thread Phillip Lord
Eric,

Just wanted to add some thoughts…

In order to help “manage” that many components I’d definitely recommend 
modifying the “nifi.bored.yield.duration” setting.  The default is 10ms… I’d 
recommend increasing this considerably if you’re planning to have 10’s of 
thousands of running components on a single canvas.  This controls how long a 
component with no work to do will yield before checking for work again… 
increasing the bored duration reduces the time spent on those idle checks.

It “might” introduce some additional latency to flows, but once a component 
understands it has data to work on, it will then continue to run based upon the 
component's run-schedule.
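
As a sketch, the change described above would look like the following (the
property name is standard; the value is purely illustrative):

```properties
# nifi.properties — default is 10 millis; larger values reduce idle scheduling churn
nifi.bored.yield.duration=250 millis
```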

Also… I’d recommend breaking your flows down into separate instances.  
And/or maybe looking to consolidate some functionality… I don’t know how you 
keep track of that many components, but it sounds like a headache :)

Thanks,
Phil
On Mar 10, 2023 at 11:44 AM -0500, Joe Witt , wrote:
> Every processor is attempted to be scheduled to run as often as it
> asks. If it asks to run every 0 seconds that translates to 'run
> pretty darn often/fast'. However, we don't actually invoke the code
> in most cases because the check for 'is work to do' will fail as there
> would be no flowfile sitting there. So you'd not really burn
> resources meaningfully in that model. This is part of why it scales
> so well as there are so many flows all on the same nodes all the time.
> But you might want to lower the scheduled run frequency of processors
> that source data as those will always say 'there is work to do'.
>
> Thanks
>
> On Fri, Mar 10, 2023 at 9:26 AM Eric Secules  wrote:
> >
> > Hi Joe,
> >
> > Thanks for the reply, the reasoning behind my use case for node-slicing of 
> > flows is the assumption that I would otherwise need several VM's with 
> > higher memory allocation for them to hold all of the flows and still have 
> > room for active flowfiles and also have processing capacity to handle the 
> > traffic. I expect traffic to have a daily peak and then taper off to 0 
> > activity. I certainly don't expect all processors to have flowfiles in 
> > their input queues at all times. A couple flows I expect to process a 
> > million flowfiles a day while others might see only a few hundred. They're 
> > all configured to run every 0 seconds. Does the scheduler try to run them 
> > all, or does it only run processors that have flowfiles in the input queue 
> > and processors that have no input?
> >
> > Thanks,
> > Eric
> >
> > On Thu, Mar 9, 2023 at 10:32 AM Joe Witt  wrote:
> > >
> > > Eric
> > >
> > > There is a practical limit in terms of memory, browser performance,
> > > etc... But there isn't otherwise any real hard limit set. We've
> > > seen flows with many 10s of thousands of processors that are part of
> > > what can be many dozens or hundreds of process groups. But the
> > > challenge that comes up is less about the number of components and the
> > > sheer reality of running that many different flows within a single
> > > host system. Now sometimes people doing flows like that don't have
> > > actual live/high volume streams through all of those all the time.
> > > Often that is used for more job/scheduled type flows that run
> > > periodically. That is different and can work out depending on time
> > > slicing/etc..
> > >
> > > The entire notion of how NiFi's clustering is designed and works is
> > > based on 'every node in the clustering being capable of doing any of
> > > the designed flows'. We do not have a design whereby we'd deploy
> > > certain flows on certain nodes such that other nodes wouldn't even
> > > know they exist. However, of course partitioning the work to be
> > > done across a cluster is a very common thing. For that we have
> > > concepts like 'primary node only' execution. Concepts like load
> > > balanced connections with attribute based affinity so that all data
> > > with a matching attribute end up on the same node/etc..
> > >
> > > It would be very interesting to understand more about your use case
> > > whereby you end up with 100s of thousands of processors and would want
> > > node slicing of flows in the cluster.
> > >
> > > Thanks
> > >
> > > On Wed, Mar 8, 2023 at 9:31 AM Eric Secules  wrote:
> > > >
> > > > Hello,
> > > >
> > > > Is there any upper limit on the number of processors that I can have in 
> > > > my nifi canvas? Would 10 still be okay? As I understand it, each 
> > > > processor takes up space on the heap as an instance of a class.
> > > >
> > > > If this is a problem my idea would be to use multiple unclustered nifi 
> > > > nodes and spread the flows evenly over them.
> > > >
> > > > It would be nice if I could use nifi clustering and set a maximum 
> > > > replication factor on a process group so that the flow inside it only 
> > > > executes on one or two of my clustered nifi nodes.
> > > >
> > > > Thanks,
> > > > Eric


Re: Execute DB2 stored procedure

2023-02-28 Thread Phillip Lord
agh yes... that makes sense.

Thanks Matt!

On Tue, Feb 28, 2023 at 11:23 AM Matt Burgess  wrote:

> Philip,
>
> Those are OUT parameters so ExecuteSQL/PutSQL doesn't know how to get
> the values out after calling the procedure. We'd likely want a
> separate processor like ExecuteStoredProcedure and would have to
> figure out how to handle OUT parameters, maybe adding those fields to
> the outgoing records or something.
>
> Regards,
> Matt
>
> On Tue, Feb 28, 2023 at 11:16 AM Phillip Lord 
> wrote:
> >
> > Thanks for replies...
> >
> > I'm trying PutSQL to call the following stored procedure...
> >
> > CREATE OR REPLACE PROCEDURE SMV.RUN_ALL_PS
> >  (IN   IN_RESET         CHAR(1),   ->  This will always be 'N' when called from nifi
> >   OUT  OUT_SQLSTATE     CHAR(5),
> >   OUT  OUT_RETURN_CODE  INTEGER,
> >   OUT  OUT_ERROR_TEXT   VARCHAR(1000),
> >   OUT  OUT_SQL_STMT     VARCHAR(3)
> >  )
> >
> >
> > so I'm trying this in PutSQL
> >
> > CALL MYPROCEDURE.PROC1('N', ?,?,?,?)
> >
> > and I need to supply sql arg attributes... like...
> >
> > sql.args.1.type = 1
> > sql.args.1.value = not sure what to put here
> > sql.args.2.type = 4
> > sql.args.2.value = not sure what to put here
> > etc...
> >
> > Am I on the right track?
> >
> > Thanks
> >
> >
> >
> >
> >
> >
> > On Mon, Feb 27, 2023 at 8:50 PM Matt Burgess 
> wrote:
> >>
> >> Stored procedures that take no output parameters and return ResultSets
> should work fine with ExecuteSQL, but for DBs that allow OUT and INOUT
> parameters, those won’t make it into the outgoing FlowFile (in either
> content or attributes).
> >>
> >> Regards,
> >> Matt
> >>
> >>
> >> On Feb 27, 2023, at 4:19 PM, Dmitry Stepanov 
> wrote:
> >>
> >> 
> >> We run our procedure using ExecuteSQL.
> >> Just make sure to use proper SQL syntax
> >>
> >> On February 27, 2023 2:09:19 p.m. Phillip Lord 
> wrote:
> >>>
> >>> Hello,
> >>>
> >>> Does anyone have any experience executing a DB2 stored procedure?
> Potentially using PutSQL? I don't think it can be done using ExecuteSQL,
> and I can likely use an executeStreamCommand to accomplish this.  But
> trying not to reinvent the wheel if I can just do it using a simple Nifi
> processor
> >>>
> >>> Thanks
> >>> Phil
> >>
> >>
>


Re: Execute DB2 stored procedure

2023-02-28 Thread Phillip Lord
Thanks for replies...

I'm trying PutSQL to call the following stored procedure...

CREATE OR REPLACE PROCEDURE SMV.RUN_ALL_PS
 (IN   IN_RESET         CHAR(1),   ->  This will always be 'N' when called from nifi
  OUT  OUT_SQLSTATE     CHAR(5),
  OUT  OUT_RETURN_CODE  INTEGER,
  OUT  OUT_ERROR_TEXT   VARCHAR(1000),
  OUT  OUT_SQL_STMT     VARCHAR(3)
 )


so I'm trying this in PutSQL

CALL MYPROCEDURE.PROC1('N', ?,?,?,?)

and I need to supply sql arg attributes... like...

sql.args.1.type = 1
sql.args.1.value = not sure what to put here
sql.args.2.type = 4
sql.args.2.value = not sure what to put here
etc...

Am I on the right track?

Thanks






On Mon, Feb 27, 2023 at 8:50 PM Matt Burgess  wrote:

> Stored procedures that take no output parameters and return ResultSets
> should work fine with ExecuteSQL, but for DBs that allow OUT and INOUT
> parameters, those won’t make it into the outgoing FlowFile (in either
> content or attributes).
>
> Regards,
> Matt
>
>
> On Feb 27, 2023, at 4:19 PM, Dmitry Stepanov  wrote:
>
> 
> We run our procedure using ExecuteSQL.
> Just make sure to use proper SQL syntax
>
> On February 27, 2023 2:09:19 p.m. Phillip Lord 
> wrote:
>
>> Hello,
>>
>> Does anyone have any experience executing a DB2 stored procedure?
>> Potentially using PutSQL? I don't think it can be done using ExecuteSQL,
>> and I can likely use an executeStreamCommand to accomplish this.  But
>> trying not to reinvent the wheel if I can just do it using a simple Nifi
>> processor
>>
>> Thanks
>> Phil
>>
>
>
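
To make the type codes in this thread concrete: the numeric values in PutSQL's
sql.args.N.type attributes are the constants from java.sql.Types, and retrieving
OUT parameters — the step Matt notes PutSQL does not perform — is done in plain
JDBC via CallableStatement. The sketch below uses the procedure signature from
the thread; callRunAllPs is illustrative only and is never executed here (it
would need a live DB2 connection):

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

public class StoredProcSketch {

    // Roughly how plain JDBC reads OUT parameters — what a hypothetical
    // ExecuteStoredProcedure processor would have to do. Not called in main().
    static String callRunAllPs(Connection conn) throws SQLException {
        try (CallableStatement cs =
                 conn.prepareCall("CALL SMV.RUN_ALL_PS(?, ?, ?, ?, ?)")) {
            cs.setString(1, "N");                       // IN  IN_RESET
            cs.registerOutParameter(2, Types.CHAR);     // OUT OUT_SQLSTATE
            cs.registerOutParameter(3, Types.INTEGER);  // OUT OUT_RETURN_CODE
            cs.registerOutParameter(4, Types.VARCHAR);  // OUT OUT_ERROR_TEXT
            cs.registerOutParameter(5, Types.VARCHAR);  // OUT OUT_SQL_STMT
            cs.execute();
            return cs.getString(2);                     // read an OUT value
        }
    }

    public static void main(String[] args) {
        // The sql.args.N.type codes discussed above are java.sql.Types constants:
        System.out.println(Types.CHAR);     // prints 1  (Phil's sql.args.1.type)
        System.out.println(Types.INTEGER);  // prints 4  (Phil's sql.args.2.type)
        System.out.println(Types.VARCHAR);  // prints 12
    }
}
```

This also shows why ExecuteSQL/PutSQL can't surface those values today: they
never call registerOutParameter or read the parameters back after execute().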


Execute DB2 stored procedure

2023-02-27 Thread Phillip Lord
Hello,

Does anyone have any experience executing a DB2 stored procedure?
Potentially using PutSQL? I don't think it can be done using ExecuteSQL,
and I can likely use an executeStreamCommand to accomplish this.  But
trying not to reinvent the wheel if I can just do it using a simple Nifi
processor

Thanks
Phil