[jira] [Updated] (NIFI-7642) KafkaConsumers: Improve batching into fewer FlowFiles under high latency conditions

2020-07-15 Thread Alex Goos (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Goos updated NIFI-7642:

Summary: KafkaConsumers: Improve batching into fewer FlowFiles under high 
latency conditions  (was: KafkaConsumers: Improve batching into fewer FlowFiles 
under high latency conditios)

> KafkaConsumers: Improve batching into fewer FlowFiles under high latency 
> conditions
> ---
>
> Key: NIFI-7642
> URL: https://issues.apache.org/jira/browse/NIFI-7642
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.3.0, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.7.1, 1.9.0, 
> 1.10.0, 1.9.1, 1.9.2, 1.11.0, 1.11.1, 1.11.2, 1.11.4
>Reporter: Alex Goos
>Priority: Minor
>  Labels: kafka
>
> NIFI-3962 introduced two hard-coded magic numbers into KafkaConsumers:
>  * maxRecords is capped at 1000, regardless of the property setting
>  * the poll timeout is fixed at 10ms
> Under high-throughput, high-latency conditions, this leads to FlowFiles that 
> are too small.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (NIFI-7642) KafkaConsumers: Improve batching into fewer FlowFiles under high latency conditios

2020-07-15 Thread Alex Goos (Jira)


 [ 
https://issues.apache.org/jira/browse/NIFI-7642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Goos updated NIFI-7642:

Labels: kafka  (was: )

> KafkaConsumers: Improve batching into fewer FlowFiles under high latency 
> conditios
> --
>
> Key: NIFI-7642
> URL: https://issues.apache.org/jira/browse/NIFI-7642
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Extensions
>Affects Versions: 1.3.0, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.7.1, 1.9.0, 
> 1.10.0, 1.9.1, 1.9.2, 1.11.0, 1.11.1, 1.11.2, 1.11.4
>Reporter: Alex Goos
>Priority: Minor
>  Labels: kafka
>
> NIFI-3962 introduced two hard-coded magic numbers into KafkaConsumers:
>  * maxRecords is capped at 1000, regardless of the property setting
>  * the poll timeout is fixed at 10ms
> Under high-throughput, high-latency conditions, this leads to FlowFiles that 
> are too small.





[jira] [Created] (NIFI-7642) KafkaConsumers: Improve batching into fewer FlowFiles under high latency conditios

2020-07-15 Thread Alex Goos (Jira)
Alex Goos created NIFI-7642:
---

 Summary: KafkaConsumers: Improve batching into fewer FlowFiles 
under high latency conditios
 Key: NIFI-7642
 URL: https://issues.apache.org/jira/browse/NIFI-7642
 Project: Apache NiFi
  Issue Type: Bug
  Components: Extensions
Affects Versions: 1.11.4, 1.11.2, 1.11.1, 1.11.0, 1.9.2, 1.9.1, 1.10.0, 
1.9.0, 1.7.1, 1.8.0, 1.7.0, 1.6.0, 1.5.0, 1.4.0, 1.3.0
Reporter: Alex Goos


NIFI-3962 introduced two hard-coded magic numbers into KafkaConsumers:
 * maxRecords is capped at 1000, regardless of the property setting
 * the poll timeout is fixed at 10ms

Under high-throughput, high-latency conditions, this leads to FlowFiles that 
are too small.
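
The effect of the two constants can be illustrated with a toy model. This is a
sketch, not NiFi code: the function name, its parameters, and the assumption
that a poll which times out empty closes the current batch are illustrative
only.

```python
def records_per_batch(records_per_fetch, fetch_latency_ms, poll_timeout_ms, max_records):
    """Toy model: number of records collected into one batch (one FlowFile).

    Assumptions (not taken from the NiFi source): every fetch returns the same
    number of records, fetches arrive fetch_latency_ms apart, and a poll that
    times out empty closes the current batch.
    """
    if records_per_fetch <= 0:
        raise ValueError("records_per_fetch must be positive")
    batch = min(records_per_fetch, max_records)  # first fetch opens the batch
    while batch < max_records:
        if fetch_latency_ms > poll_timeout_ms:
            break  # next fetch is not ready in time: the poll returns empty
        batch = min(batch + records_per_fetch, max_records)
    return batch


# Hypothetical workload: 50 ms between fetches, 500 records per fetch.
print(records_per_batch(500, 50, 10, 1000))    # 10 ms timeout, cap 1000   -> 500
print(records_per_batch(500, 50, 100, 1000))   # longer timeout, same cap  -> 1000
print(records_per_batch(500, 50, 100, 10000))  # longer timeout, higher cap -> 10000
```

In this model, raising only the timeout still leaves the batch limited by the
cap, which is why both numbers matter.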





[jira] [Updated] (NIFI-5989) Improve PutKudu BatchSize handling

2019-01-31 Thread Alex Goos (JIRA)


 [ 
https://issues.apache.org/jira/browse/NIFI-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Goos updated NIFI-5989:

Attachment: 0001-NIFI-5989-PutKudu-Additional-FF-Queue-length-setting.patch

> Improve PutKudu BatchSize handling
> --
>
> Key: NIFI-5989
> URL: https://issues.apache.org/jira/browse/NIFI-5989
> Project: Apache NiFi
>  Issue Type: Improvement
>  Components: Extensions
>Affects Versions: 1.8.0
>Reporter: Alex Goos
>Priority: Major
>  Labels: kudu, nifi
> Fix For: 1.9.0
>
> Attachments: 
> 0001-NIFI-5989-PutKudu-Additional-FF-Queue-length-setting.patch
>
>
> The current "Batch size" property of PutKudu controls two things: the number 
> of FlowFiles pulled per onTrigger call and the size of the Kudu client 
> modification buffer.
> If the FlowFiles contain a considerable number of records, a 
> disproportionate amount of data is pulled in and deserialized into memory 
> when in AUTO_FLUSH_BACKGROUND mode.
> We propose introducing a separate setting for the batch size of FlowFiles.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (NIFI-5989) Improve PutKudu BatchSize handling

2019-01-31 Thread Alex Goos (JIRA)
Alex Goos created NIFI-5989:
---

 Summary: Improve PutKudu BatchSize handling
 Key: NIFI-5989
 URL: https://issues.apache.org/jira/browse/NIFI-5989
 Project: Apache NiFi
  Issue Type: Improvement
  Components: Extensions
Affects Versions: 1.8.0
Reporter: Alex Goos
 Fix For: 1.9.0


The current "Batch size" property of PutKudu controls two things: the number of 
FlowFiles pulled per onTrigger call and the size of the Kudu client 
modification buffer.

If the FlowFiles contain a considerable number of records, a disproportionate 
amount of data is pulled in and deserialized into memory when in 
AUTO_FLUSH_BACKGROUND mode.

We propose introducing a separate setting for the batch size of FlowFiles.
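
The coupling can be put as simple arithmetic. A minimal sketch, assuming every
pulled FlowFile is fully deserialized before the buffer is flushed; the
function name, parameter names, and workload numbers are hypothetical and do
not correspond to actual PutKudu properties.

```python
def records_in_memory(flowfiles_per_trigger, records_per_flowfile):
    """Upper bound on records deserialized in one trigger (toy model)."""
    return flowfiles_per_trigger * records_per_flowfile


batch_size = 100               # single "Batch size" value, as in the issue
records_per_flowfile = 10_000  # hypothetical workload

# Coupled: the same value sets the FlowFile count and the Kudu buffer size.
coupled = records_in_memory(batch_size, records_per_flowfile)

# Decoupled: a separate, smaller FlowFile batch size; the Kudu buffer stays at 100.
decoupled = records_in_memory(1, records_per_flowfile)

print(coupled, decoupled)  # 1000000 10000
```

With a separate FlowFile setting, the Kudu buffer can stay at its configured
size while far fewer records are held in memory per trigger.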


