[jira] [Commented] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

2020-08-31 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17187879#comment-17187879
 ] 

ASF subversion and git services commented on KUDU-1587:
---

Commit ee3bb83575a051c2feade1f8c159b2902a7160d5 in kudu's branch 
refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=ee3bb83 ]

KUDU-1587 part 2: reject write ops if apply queue is overloaded

This patch implements admission control for write requests in tablet
servers based on the load status of their apply queue. With this change,
the recently introduced OpApplyQueueTest.ApplyQueueBackpressure scenario
passes.

If the queue times of the tasks in the apply queue exceed the specified
threshold, the apply queue enters the overloaded state.  While the queue
is overloaded, the tablet server rejects incoming write requests with
some probability.  The longer the queue stays overloaded, the greater
the probability of rejection.  The apply queue exits the overloaded
state when queue times drop below the specified threshold.

This new behavior is not yet enabled by default, keeping the legacy
behavior of unbounded/uncontrolled queue times as is.  To enable it,
set --tablet_apply_pool_overload_threshold_ms to something greater
than 0 (e.g., 500).
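
The behavior described above can be sketched roughly as follows. This is a
minimal illustration in Python, not the code from the patch: the class name
OverloadTracker, its methods, and the linear ramp of the rejection
probability are all assumptions made for the example. Only the threshold
semantics (a value of 0 or less disables the feature, and the queue leaves
the overloaded state once queue times drop below the threshold) come from
the commit message.

```python
import random


class OverloadTracker:
    """Hypothetical tracker of an apply queue's overloaded state."""

    def __init__(self, threshold_ms: float):
        # Plays the role of --tablet_apply_pool_overload_threshold_ms.
        self.threshold_ms = threshold_ms
        # Time (in seconds) at which the queue became overloaded, if it is.
        self.overloaded_since = None

    def observe(self, queue_time_ms: float, now_s: float) -> None:
        """Record the queue time of a task dequeued at time now_s."""
        if self.threshold_ms <= 0:
            return  # a non-positive threshold keeps the legacy behavior
        if queue_time_ms > self.threshold_ms:
            if self.overloaded_since is None:
                self.overloaded_since = now_s
        else:
            self.overloaded_since = None

    def rejection_probability(self, now_s: float) -> float:
        """Grow with the time spent overloaded (ramp shape is assumed)."""
        if self.overloaded_since is None:
            return 0.0
        overloaded_ms = (now_s - self.overloaded_since) * 1000.0
        # Assumed linear ramp that reaches 1.0 after 10x the threshold.
        return min(1.0, overloaded_ms / (10.0 * self.threshold_ms))

    def should_reject(self, now_s: float, rng=random.random) -> bool:
        """Decide whether to reject one incoming write request."""
        return rng() < self.rejection_probability(now_s)
```

With a 500 ms threshold, this sketch rejects about half of the incoming
writes after 2.5 seconds of continuous overload and all of them after
5 seconds; the real ramp may differ.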

Change-Id: I6d7688d6fa832e606b8efc4549568fa52dfa1931
Reviewed-on: http://gerrit.cloudera.org:8080/16343
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong 


> Memory-based backpressure is insufficient on seek-bound workloads
> -----------------------------------------------------------------
>
> Key: KUDU-1587
> URL: https://issues.apache.org/jira/browse/KUDU-1587
> Project: Kudu
> Issue Type: Bug
> Components: tserver
> Affects Versions: 0.10.0, 1.0.0, 1.0.1, 1.1.0, 1.2.0, 1.3.0, 1.3.1, 1.4.0,
>   1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.7.1, 1.9.0, 1.10.0, 1.10.1, 1.11.0, 1.12.0,
>   1.11.1
> Reporter: Todd Lipcon
> Assignee: Alexey Serbin
> Priority: Critical
> Labels: roadmap-candidate
> Attachments: graph.png, queue-time.png
>
>
> I pushed a uniform random insert workload from a bunch of clients to the 
> point that the vast majority of bloom filters no longer fit in buffer cache, 
> and the compaction had fallen way behind. Thus, every inserted row turns into 
> 40+ seeks (due to non-compact data) and takes 400-500ms. In this kind of 
> workload, the current backpressure (based on memory usage) is insufficient to 
> prevent ridiculously long queues.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

2020-08-28 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186284#comment-17186284
 ] 

ASF subversion and git services commented on KUDU-1587:
---

Commit c6d438ab417009e8007a1de274178d0bcf0dfb63 in kudu's branch 
refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=c6d438a ]

[tserver] add test to reproduce KUDU-1587 conditions

Added a test to reproduce conditions described in KUDU-1587.
As of now, the test is disabled: it will be enabled once
KUDU-1587 is addressed.

Change-Id: I515a1b26152680ee9b9361afcf84fec39b8f962d
Reviewed-on: http://gerrit.cloudera.org:8080/16312
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong 




[jira] [Commented] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

2020-08-28 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186285#comment-17186285
 ] 

ASF subversion and git services commented on KUDU-1587:
---

Commit fc8615c37eb4e28f3cc6bea0fcd5a8732451e883 in kudu's branch 
refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=fc8615c ]

KUDU-1587 part 1: load meter for ThreadPool

This patch introduces a load meter for ThreadPool, with the aim of using
active queue management (AQM) techniques such as CoDel [1] in scenarios
where thread pool queue load metrics are applicable (e.g., KUDU-1587).

[1] https://en.wikipedia.org/wiki/CoDel
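
For readers unfamiliar with CoDel, its core idea is to look at how long
items wait in the queue rather than at the queue's length: the queue is
considered overloaded once the waiting time has stayed above a target for
at least one full interval. The Python sketch below illustrates only that
idea. The class QueueTimeMeter and its method names are hypothetical and
are not the ThreadPool load meter API added by this patch.

```python
class QueueTimeMeter:
    """Hypothetical CoDel-style detector of a persistently long queue."""

    def __init__(self, target_ms: float, interval_ms: float):
        self.target_ms = target_ms        # acceptable queue time
        self.interval_ms = interval_ms    # how long it may be exceeded
        self.above_since_ms = None        # when the target was first exceeded

    def on_dequeue(self, enqueued_at_ms: float, now_ms: float) -> bool:
        """Record one task leaving the queue; return True if overloaded."""
        queue_time_ms = now_ms - enqueued_at_ms
        if queue_time_ms < self.target_ms:
            # A single fast task shows the queue can drain: reset.
            self.above_since_ms = None
            return False
        if self.above_since_ms is None:
            self.above_since_ms = now_ms
        # Overloaded only after a full interval above the target.
        return now_ms - self.above_since_ms >= self.interval_ms
```

A short burst that briefly raises queue times therefore does not count as
overload; only a standing queue does.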

Change-Id: I640716dc32f193e68361ca623ee7b9271e661d8b
Reviewed-on: http://gerrit.cloudera.org:8080/16332
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong 




[jira] [Commented] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

2020-08-14 Thread Alexey Serbin (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178138#comment-17178138
 ] 

Alexey Serbin commented on KUDU-1587:
-

I implemented the requested functionality with the following changelists:
* a test scenario to simulate apply queue "overload":
  https://gerrit.cloudera.org/#/c/16312/
* tracking the state of the apply queue:
  https://gerrit.cloudera.org/#/c/16332/
* controlling the admission of write requests with a CoDel-like approach:
  http://gerrit.cloudera.org:8080/16343




[jira] [Commented] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

2016-08-29 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15448127#comment-15448127
 ] 

Todd Lipcon commented on KUDU-1587:
---

The issue here is that the only backpressure on new operations being submitted 
to the apply queue is based on the TransactionTracker memory limits. Each 
operation here is only about 1KB of inserted data (or less), yet it results in 
half a second of worker wall time. With the default 64MB tracker limit (per 
tablet), that's enough space for 64,000 x 500ms = 32,000 seconds' worth of 
work. Assume 10x parallelism due to many disks, but then recall that the 
limiting is per-tablet and we have 10+ active tablets, which gets us back into 
the 30,000-40,000 seconds' worth of available queueing before backpressure 
kicks in.
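
The estimate above can be reproduced with a few lines of Python. The
function name and its parameters are made up for this illustration, and
the sizes are rounded the same way as in the comment (64MB taken as
64,000 operations of 1KB each).

```python
def queued_work_seconds(limit_bytes: int, op_bytes: int, op_seconds: float,
                        tablets: int, parallelism: int) -> float:
    """Seconds of queued work admitted before the memory limit kicks in."""
    ops_per_tablet = limit_bytes // op_bytes      # ops that fit in one tracker
    seconds_per_tablet = ops_per_tablet * op_seconds
    # The limit is per tablet, while the disks work in parallel.
    return seconds_per_tablet * tablets / parallelism


# 64MB limit, ~1KB per op, 0.5 s per op, 10 tablets, 10x parallelism.
estimate = queued_work_seconds(64_000_000, 1_000, 0.5, 10, 10)
print(estimate)  # 32000.0
```

With 12 active tablets instead of 10, the same formula gives 38,400
seconds, which is within the 30,000-40,000 range quoted above.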
