[jira] [Commented] (NIFI-9820) Change PutKudu Property "Kudu Client Worker Count" Default Value

2022-04-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517010#comment-17517010
 ] 

ASF subversion and git services commented on NIFI-9820:
---

Commit daafd08aaebb3c4fcf149e25999d869b9fe94d3c in nifi's branch 
refs/heads/support/nifi-1.16 from David Handermann
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=daafd08aae ]

NIFI-9820 Reduced Kudu Client Worker Count default setting

- Changed default Kudu Client Worker Count to number of runtime-reported 
available processors

Signed-off-by: Joe Gresock 

This closes #5886.


> Change PutKudu Property "Kudu Client Worker Count" Default Value
> 
>
> Key: NIFI-9820
> URL: https://issues.apache.org/jira/browse/NIFI-9820
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Affects Versions: 1.15.3
> Reporter: Josef Zahner
> Assignee: David Handermann
> Priority: Minor
> Fix For: 1.17.0, 1.16.1
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The PutKudu processor property "Kudu Client Worker Count" has a suboptimal 
> default value. Please don't use the current "number of CPUs multiplied by 2" 
> behaviour, as it leads to a massive number of workers in our case with 
> physical servers. We have an 8-node cluster where each server has 64 CPUs, 
> and about 30 PutKudu processors configured, which means a lot of worker 
> threads by default just for Kudu.
> We have changed the number of worker threads in our case to the number of 
> concurrent tasks. Maybe it would be better to set it a bit higher than 
> that, but to be honest, I don't exactly understand the impact. It still 
> looks fast with the current config.
> *To sum it up, please set a low default value (e.g. 4 or 8) for the property 
> "Kudu Client Worker Count" and not a pseudo-dynamic one for the PutKudu 
> processor.*
> Btw., are there any suggestions for how big the number should be?
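
To put rough numbers on the setup described in the issue, the sketch below multiplies out the figures given there (64 CPUs per server, about 30 PutKudu processors). It assumes that each processor creates its own Kudu client and that the worker count is an upper bound on that client's worker threads, so the real thread count may well be lower. The class and method names are hypothetical and exist only for this illustration.

```java
// Back-of-the-envelope estimate only; not NiFi or Kudu code.
final class KuduWorkerEstimate {

    // Worker count a single processor would get when the default is
    // derived from the CPU count (multiplier 2 = old default, 1 = new default).
    static int workersPerProcessor(int cpus, int multiplier) {
        return cpus * multiplier;
    }

    // Upper bound on Kudu worker threads on one node, assuming one
    // Kudu client per PutKudu processor.
    static int workersPerNode(int cpus, int multiplier, int processors) {
        return workersPerProcessor(cpus, multiplier) * processors;
    }

    public static void main(String[] args) {
        int cpus = 64;        // CPUs per server, from the issue
        int processors = 30;  // approximate PutKudu processor count, from the issue

        System.out.println("CPUs x 2, per processor: " + workersPerProcessor(cpus, 2));
        System.out.println("CPUs x 2, per node:      " + workersPerNode(cpus, 2, processors));
        System.out.println("CPUs x 1, per node:      " + workersPerNode(cpus, 1, processors));
        // A fixed value of 4, as suggested in the issue, for comparison.
        System.out.println("fixed 4, per node:       " + 4 * processors);
    }
}
```

Under these assumptions the per-node ceiling drops by half with the CPU-count default, while a small fixed value reduces it by more than an order of magnitude.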



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (NIFI-9820) Change PutKudu Property "Kudu Client Worker Count" Default Value

2022-03-30 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17514764#comment-17514764
 ] 

ASF subversion and git services commented on NIFI-9820:
---

Commit b53fb87aa1af5638744f4b35d074c8ae433f98a2 in nifi's branch 
refs/heads/main from David Handermann
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=b53fb87 ]

NIFI-9820 Reduced Kudu Client Worker Count default setting

- Changed default Kudu Client Worker Count to number of runtime-reported 
available processors

Signed-off-by: Joe Gresock 

This closes #5886.




[jira] [Commented] (NIFI-9820) Change PutKudu Property "Kudu Client Worker Count" Default Value

2022-03-22 Thread Josef Zahner (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17510264#comment-17510264
 ] 

Josef Zahner commented on NIFI-9820:


Ah right, you are referring to the case where the NiFi version is older than 
1.14.0. 

Anyway, it's always a tradeoff. In our case we had a massive memory issue 
because the default was so high, and it would still be too high with the new 
value (default = number of CPUs). From my point of view it would be better to 
have worse performance by default than a crashing system like in our case. If 
the performance is worse, you notice it because the processor is slow and the 
queues fill up. So you understand relatively quickly where the issue could be 
(somewhere within the PutKudu processor), and it's an easy fix. With a high 
default, on the other hand, you get a memory warning and you don't see where 
it comes from. Without deep analysis, it isn't possible to find the culprit. 



[jira] [Commented] (NIFI-9820) Change PutKudu Property "Kudu Client Worker Count" Default Value

2022-03-21 Thread David Handermann (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17510099#comment-17510099
 ] 

David Handermann commented on NIFI-9820:


When upgrading from 1.14.0 and later, the existing value would carry through to 
a new version.  However, when upgrading from something older, such as 1.13.2, 
that instance would get the new default value.
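
The upgrade behaviour described above can be pictured with a small model: a stored value wins, and the default only applies when the flow has no stored value for the property. This is a simplified sketch, not NiFi's actual flow-loading code; the class, the method, and the use of a plain map for the persisted properties are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of default resolution; not NiFi's implementation.
final class WorkerCountResolution {

    // Property display name from the issue, used here as a hypothetical map key.
    static final String PROPERTY = "Kudu Client Worker Count";

    // Returns the stored value if the flow has one, otherwise the default
    // of the NiFi version that loads the flow.
    static int resolve(Map<String, String> persisted, int currentDefault) {
        String stored = persisted.get(PROPERTY);
        return stored != null ? Integer.parseInt(stored) : currentDefault;
    }

    public static void main(String[] args) {
        int newDefault = Runtime.getRuntime().availableProcessors();

        // Flow saved by 1.14.0 or later: the property exists and was stored.
        Map<String, String> savedWithProperty = new HashMap<>();
        savedWithProperty.put(PROPERTY, "128");

        // Flow saved by an older version such as 1.13.2: no stored value.
        Map<String, String> savedWithoutProperty = new HashMap<>();

        System.out.println("from 1.14.0+: " + resolve(savedWithProperty, newDefault));
        System.out.println("from older:   " + resolve(savedWithoutProperty, newDefault));
    }
}
```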



[jira] [Commented] (NIFI-9820) Change PutKudu Property "Kudu Client Worker Count" Default Value

2022-03-21 Thread Josef Zahner (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509972#comment-17509972
 ] 

Josef Zahner commented on NIFI-9820:


Anything is better than the value today :).

Correct, an upgrade to a newer NiFi should not change the property or have an 
impact on existing users. However, that isn't the case anyway, right? As soon 
as the processor is on the canvas, I would expect the flow.xml.gz file to 
store the value in any case. So an existing processor shouldn't be impacted if 
NiFi changes the default.



[jira] [Commented] (NIFI-9820) Change PutKudu Property "Kudu Client Worker Count" Default Value

2022-03-21 Thread David Handermann (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509924#comment-17509924
 ] 

David Handermann commented on NIFI-9820:


I agree that the NiFi use case is different than the default value set in the 
Kudu Client library.

Using a small default value is probably a safer approach, and would follow a 
similar pattern to other processors, as well as the default number of 
concurrent tasks.

On the other hand, for flows that have existing PutKudu processors, the 
challenge is to avoid introducing a negative impact.  For example, if an 
existing flow has a larger number of concurrent tasks, upgrading to a version 
of NiFi that defaults the Kudu Client Worker Count to 1 or 2 would have a 
negative impact on performance after the upgrade.  That was part of the reason 
for setting the default value to match the internal value from the Kudu Client 
library.  For this reason, changing the default value to the number of CPU 
cores, rather than that number multiplied by 2, seems a middle way forward.



[jira] [Commented] (NIFI-9820) Change PutKudu Property "Kudu Client Worker Count" Default Value

2022-03-21 Thread Josef Zahner (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509910#comment-17509910
 ] 

Josef Zahner commented on NIFI-9820:


Btw., we haven't tested what happens if we use the default value (128) in our 
case. The problem is that we would have to test the behaviour in our 
production system.

Wouldn't it be great to have a safe value to prevent a high memory issue by 
default? Other defaults are very small as well, like the number of concurrent 
tasks of a processor (defaults to 1). That's why I wrote a very small number 
in my initial post.



[jira] [Commented] (NIFI-9820) Change PutKudu Property "Kudu Client Worker Count" Default Value

2022-03-21 Thread Josef Zahner (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509905#comment-17509905
 ] 

Josef Zahner commented on NIFI-9820:


It's really a big advantage that we can change the value with a property since 
1.14.0. However, I could imagine that the internal default value from the Kudu 
Client library doesn't expect multiple clients (aka processors in NiFi) on one 
node, so it makes sense to use the number of CPUs there. In NiFi, however, 
it's a different case: it's very likely that you don't have just one client.

The question is why we should use a "dynamically" calculated value when every 
other property (e.g. FlowFiles per Batch) has a fixed default. The user has to 
test and find a good value for their setup anyway.



[jira] [Commented] (NIFI-9820) Change PutKudu Property "Kudu Client Worker Count" Default Value

2022-03-21 Thread David Handermann (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509900#comment-17509900
 ] 

David Handermann commented on NIFI-9820:


Thanks for raising this issue [~jzahner]. The current default value derives 
from the internal default value used within the Kudu Client library. As you 
have noted, however, configuring multiple PutKudu Processors results in memory 
usage problems.

Some NiFi flow configurations, where only a couple of PutKudu Processors are 
configured, can benefit from a larger number. Setting the property based on 
the number of concurrent tasks is a good approach in general.

Changing the default value to the number of reported CPU cores is one simple 
way forward.  Another option could be setting the default value to half the 
number of CPU cores, with a minimum value of 1. I am inclined to go with the 
first option, and I can put forward a pull request for further consideration.
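
The candidate defaults discussed here can be compared side by side. The sketch below is only an illustration of the arithmetic, using the standard `Runtime.getRuntime().availableProcessors()` call; the class and method names are hypothetical and are not taken from the NiFi or Kudu code.

```java
// Compares the candidate default values mentioned in the comment above.
final class WorkerCountDefaults {

    // Current behaviour described in the issue: CPU cores multiplied by 2.
    static int doubleCores(int cores) {
        return cores * 2;
    }

    // First option: the number of reported CPU cores.
    static int coreCount(int cores) {
        return cores;
    }

    // Second option: half the number of CPU cores, with a minimum value of 1.
    static int halfCores(int cores) {
        return Math.max(1, cores / 2);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("reported cores: " + cores);
        System.out.println("cores x 2:      " + doubleCores(cores));
        System.out.println("cores:          " + coreCount(cores));
        System.out.println("cores / 2:      " + halfCores(cores));
    }
}
```

The `Math.max(1, ...)` guard matters on single-core hosts, where integer division would otherwise yield 0 workers.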
