[jira] [Commented] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

2016-08-29 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448127#comment-15448127
 ] 

Todd Lipcon commented on KUDU-1587:
---

The issue here is that the only backpressure on new operations submitted to the 
apply queue comes from the TransactionTracker memory limits. Each operation in 
this workload is only about 1KB of inserted data (or less), yet it costs roughly 
half a second of worker wall time. With the default 64MB tracker limit (per 
tablet), that's enough space for about 64,000 operations, i.e. 64,000 x 500ms = 
32,000 seconds worth of work. Even assuming 10x parallelism across many disks, 
the limit is per-tablet and we have 10+ active tablets, which gets us right back 
to 30,000-40,000 seconds of available queueing before backpressure kicks in.
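
To make that arithmetic concrete, here is a minimal back-of-the-envelope sketch 
(standalone C++, not Kudu code; the 64MB limit, ~1KB per op, ~500ms per op, 10 
worker threads, and 10 tablets are simply the figures quoted above):

{code}
// Back-of-the-envelope estimate of how much apply-queue work the
// memory-based limit admits before backpressure kicks in. The numbers
// mirror the ones quoted in the comment above; nothing is read from a
// real server.
#include <cstdio>

int main() {
  const double tracker_limit_bytes = 64.0 * 1024 * 1024;  // 64MB limit per tablet
  const double bytes_per_op        = 1024.0;              // ~1KB inserted per op
  const double seconds_per_op      = 0.5;                 // ~500ms of worker time per op
  const int    active_tablets      = 10;                  // 10+ active tablets on the server
  const int    apply_parallelism   = 10;                  // ~10 apply workers / disks

  const double ops_admitted_per_tablet = tracker_limit_bytes / bytes_per_op;        // ~64,000
  const double work_seconds_per_tablet = ops_admitted_per_tablet * seconds_per_op;  // ~32,000s
  const double wall_clock_backlog =
      work_seconds_per_tablet * active_tablets / apply_parallelism;                 // ~32,000s

  std::printf("ops admitted per tablet: %.0f\n", ops_admitted_per_tablet);
  std::printf("work per tablet:         %.0f s\n", work_seconds_per_tablet);
  std::printf("wall-clock backlog:      %.0f s\n", wall_clock_backlog);
  return 0;
}
{code}

The memory limit bounds how many bytes are queued, but nothing bounds how many 
seconds of worker time those bytes represent, so the backlog can grow to tens of 
thousands of seconds before any backpressure is applied.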

> Memory-based backpressure is insufficient on seek-bound workloads
> -----------------------------------------------------------------
>
> Key: KUDU-1587
> URL: https://issues.apache.org/jira/browse/KUDU-1587
> Project: Kudu
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 0.10.0
>Reporter: Todd Lipcon
>Priority: Critical
> Attachments: queue-time.png
>
>
> I pushed a uniform random insert workload from a bunch of clients to the 
> point that the vast majority of bloom filters no longer fit in buffer cache, 
> and the compaction had fallen way behind. Thus, every inserted row turns into 
> 40+ seeks (due to non-compact data) and takes 400-500ms. In this kind of 
> workload, the current backpressure (based on memory usage) is insufficient to 
> prevent ridiculously long queues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

2016-08-29 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated KUDU-1587:
--
Attachment: queue-time.png

Here's a graph of the queue time on the servers in this workload. Note that the 
y-axis is in seconds and the apply queue time reaches 40,000 seconds (about 11 
hours!).
!queue-time.png!





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

2016-08-29 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated KUDU-1587:
--
Attachment: queue-time.png




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

2016-08-29 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated KUDU-1587:
--
Attachment: (was: queue-time.png)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KUDU-1587) Memory-based backpressure is insufficient on seek-bound workloads

2016-08-29 Thread Todd Lipcon (JIRA)
Todd Lipcon created KUDU-1587:
-

 Summary: Memory-based backpressure is insufficient on seek-bound 
workloads
 Key: KUDU-1587
 URL: https://issues.apache.org/jira/browse/KUDU-1587
 Project: Kudu
  Issue Type: Bug
  Components: tserver
Affects Versions: 0.10.0
Reporter: Todd Lipcon
Priority: Critical


I pushed a uniform random insert workload from a bunch of clients to the point 
that the vast majority of bloom filters no longer fit in buffer cache, and the 
compaction had fallen way behind. Thus, every inserted row turns into 40+ seeks 
(due to non-compact data) and takes 400-500ms. In this kind of workload, the 
current backpressure (based on memory usage) is insufficient to prevent 
ridiculously long queues.
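
For a rough sense of where the 400-500ms per row comes from, a minimal sketch 
(the ~10ms per seek is an assumed cold-cache spinning-disk figure, not a number 
measured on this cluster):

{code}
// Rough per-row cost model for a seek-bound uniform-random insert: each
// insert probes the bloom filter of every overlapping rowset, and once
// those filters no longer fit in the buffer cache, each probe becomes a
// disk seek.
#include <cstdio>

int main() {
  const int    seeks_per_insert = 40;    // 40+ rowsets probed per row (non-compact data)
  const double seek_latency_ms  = 10.0;  // assumed cold-cache seek latency
  std::printf("estimated per-row latency: %.0f ms\n",
              seeks_per_insert * seek_latency_ms);  // ~400ms, in line with the 400-500ms observed
  return 0;
}
{code}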



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KUDU-1586) If a single op is larger than consensus_max_batch_size_bytes, consensus gets stuck

2016-08-29 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/KUDU-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448056#comment-15448056
 ] 

Todd Lipcon commented on KUDU-1586:
---

I diagnosed this by bumping the vlog level to 2 on this host for a second or 
two (using {{ts-cli set_flag --force v 2}}).

{code}
I0829 21:55:05.215745 13731 log_cache.cc:307] T 
7919fcd47fd34c4989ce214d05e62d41 P 38d4433bb09948e58d10a74ba5f97c8b: 
Successfully read 1 ops from disk (611738..611738)
I0829 21:55:05.215782 13731 consensus_queue.cc:382] T 
7919fcd47fd34c4989ce214d05e62d41 P 38d4433bb09948e58d10a74ba5f97c8b [LEADER]: 
Sending status only request to Peer: b18f54151bc04da59520fdb086d5b571: 
tablet_id: "7919fcd47fd34c4989ce214d05e62d41"
caller_uuid: "38d4433bb09948e58d10a74ba5f97c8b"
caller_term: 175
preceding_id {
  term: 174
  index: 611737
}
committed_index {
  term: 175
  index: 611763
}
{code}

It appears that, even though the remote peer was lagging behind, the leader was 
only sending status-only requests, probably because this single op was larger 
than the target batch size (1MB). I used {{ts-cli set_flag}} to raise 
{{consensus_max_batch_size_bytes}} to 4MB and the loop stopped on its own.
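
For illustration, a minimal sketch of the batching pitfall (not the actual Kudu 
consensus queue code; the {{Op}} struct and {{BuildBatch}} function are made up, 
and only the byte budget mirrors {{consensus_max_batch_size_bytes}}):

{code}
// Illustrative only: a read-ahead batcher that appends ops while they fit
// under a byte budget. Admitting the first op unconditionally is what keeps
// a single oversized op from starving the peer.
#include <cstdint>
#include <cstdio>
#include <vector>

struct Op { int64_t index; int64_t size_bytes; };

std::vector<Op> BuildBatch(const std::vector<Op>& pending,
                           int64_t max_batch_size_bytes) {
  std::vector<Op> batch;
  int64_t total = 0;
  for (const Op& op : pending) {
    // Without the !batch.empty() escape hatch, an op bigger than the budget
    // would never be admitted and the batch would always come back empty.
    if (total + op.size_bytes > max_batch_size_bytes && !batch.empty()) {
      break;
    }
    batch.push_back(op);
    total += op.size_bytes;
  }
  return batch;
}

int main() {
  // One pending op of ~4MB against a 1MB budget (the default flag value).
  std::vector<Op> pending = {{611738, 4 * 1024 * 1024}};
  std::vector<Op> batch = BuildBatch(pending, 1 * 1024 * 1024);
  std::printf("ops in batch: %zu\n", batch.size());  // 1 with the escape hatch, 0 without
  return 0;
}
{code}

If the size check is applied unconditionally, an op larger than the budget never 
fits into any request, the batch comes back empty, and the leader falls back to 
status-only requests in a tight loop, which matches the log above.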

> If a single op is larger than consensus_max_batch_size_bytes, consensus gets 
> stuck
> ----------------------------------------------------------------------------------
>
> Key: KUDU-1586
> URL: https://issues.apache.org/jira/browse/KUDU-1586
> Project: Kudu
>  Issue Type: Bug
>  Components: consensus
>Affects Versions: 0.10.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
>
> I noticed on a cluster test that a leader was spinning with log messages like:
> I0829 14:17:31.870786 22184 log_cache.cc:307] T 
> e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: 
> Successfully read 1 ops from disk (866604..866604)
> I0829 14:17:31.873234  6186 log_cache.cc:307] T 
> e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: 
> Successfully read 1 ops from disk (866604..866604)
> I0829 14:17:31.875713 22184 log_cache.cc:307] T 
> e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: 
> Successfully read 1 ops from disk (866604..866604)
> I0829 14:17:31.878078  6186 log_cache.cc:307] T 
> e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: 
> Successfully read 1 ops from disk (866604..866604)
> After investigation, it seems this op was larger than 1MB (the default 
> consensus batch size), which caused the tight loop with no progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KUDU-1586) If a single op is larger than consensus_max_batch_size_bytes, consensus gets stuck

2016-08-29 Thread Todd Lipcon (JIRA)
Todd Lipcon created KUDU-1586:
-

 Summary: If a single op is larger than 
consensus_max_batch_size_bytes, consensus gets stuck
 Key: KUDU-1586
 URL: https://issues.apache.org/jira/browse/KUDU-1586
 Project: Kudu
  Issue Type: Bug
  Components: consensus
Affects Versions: 0.10.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Blocker


I noticed on a cluster test that a leader was spinning with log messages like:

{code}
I0829 14:17:31.870786 22184 log_cache.cc:307] T 
e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: 
Successfully read 1 ops from disk (866604..866604)
I0829 14:17:31.873234  6186 log_cache.cc:307] T 
e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: 
Successfully read 1 ops from disk (866604..866604)
I0829 14:17:31.875713 22184 log_cache.cc:307] T 
e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: 
Successfully read 1 ops from disk (866604..866604)
I0829 14:17:31.878078  6186 log_cache.cc:307] T 
e7cacfdb22744496a6d5d66227a69823 P 5d15962d2f2445b1ba15b93ead4fb31b: 
Successfully read 1 ops from disk (866604..866604)
{code}

After investigation, it seems this op was larger than 1MB (the default consensus 
batch size), which caused the tight loop with no progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KUDU-666) Test Kudu with large cells

2016-08-29 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated KUDU-666:
-
Target Version/s: 1.0.0  (was: GA)

> Test Kudu with large cells
> --------------------------
>
> Key: KUDU-666
> URL: https://issues.apache.org/jira/browse/KUDU-666
> Project: Kudu
>  Issue Type: Task
>  Components: tablet, test
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Priority: Critical
>
> We know that cells can only get so large before we start running into limits 
> like the maximum IPC buffer size, but we don't have any real recommendations 
> on how big is too big before things start to fall apart. Similar to KUDU-665, 
> let's run some tests where we slowly scale up the cell size (perhaps in 
> conjunction with scaling up the number of columns) to arrive at some 
> recommendations here. We should also put hard limits in place to prevent 
> people from inserting data that would crash the server.
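
As a rough sketch of such a sweep using the C++ client (the master address, the 
pre-created table {{large_cell_test}} with schema {{(id INT32 PRIMARY KEY, 
payload STRING)}}, and the 1KB-64MB size steps are all assumptions made for 
illustration, not anything specified in this issue):

{code}
// Insert rows with progressively larger string cells into an existing test
// table and report where writes start to fail. Table and column names are
// hypothetical.
#include <cstdint>
#include <cstdio>
#include <string>

#include "kudu/client/client.h"

using kudu::Status;
using kudu::client::KuduClient;
using kudu::client::KuduClientBuilder;
using kudu::client::KuduInsert;
using kudu::client::KuduSession;
using kudu::client::KuduTable;
using kudu::client::sp::shared_ptr;

int main() {
  shared_ptr<KuduClient> client;
  Status s = KuduClientBuilder()
                 .add_master_server_addr("master-host:7051")  // assumed master address
                 .Build(&client);
  if (!s.ok()) { std::fprintf(stderr, "%s\n", s.ToString().c_str()); return 1; }

  shared_ptr<KuduTable> table;
  s = client->OpenTable("large_cell_test", &table);
  if (!s.ok()) { std::fprintf(stderr, "%s\n", s.ToString().c_str()); return 1; }

  shared_ptr<KuduSession> session = client->NewSession();
  s = session->SetFlushMode(KuduSession::MANUAL_FLUSH);
  if (!s.ok()) { std::fprintf(stderr, "%s\n", s.ToString().c_str()); return 1; }

  // Double the payload from 1KB up to 64MB and record where writes start failing.
  int32_t key = 0;
  for (int64_t cell_size = 1024; cell_size <= 64 * 1024 * 1024; cell_size *= 2) {
    KuduInsert* insert = table->NewInsert();
    kudu::KuduPartialRow* row = insert->mutable_row();
    if (!row->SetInt32("id", key++).ok() ||
        !row->SetStringCopy("payload", std::string(cell_size, 'x')).ok()) {
      std::fprintf(stderr, "failed to build row\n");
      return 1;
    }
    Status write = session->Apply(insert);  // the session takes ownership of the insert
    if (write.ok()) {
      write = session->Flush();
    }
    std::printf("cell size %lld bytes: %s\n", static_cast<long long>(cell_size),
                write.ok() ? "OK" : write.ToString().c_str());
  }
  return 0;
}
{code}

Scaling up the number of columns, as suggested above, would just mean a wider 
assumed schema and one more {{SetStringCopy}} call per extra column in the loop.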



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KUDU-1048) master should show versions of tservers, version summary

2016-08-29 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved KUDU-1048.
---
   Resolution: Fixed
Fix Version/s: 1.0.0

> master should show versions of tservers, version summary
> --------------------------------------------------------
>
> Key: KUDU-1048
> URL: https://issues.apache.org/jira/browse/KUDU-1048
> Project: Kudu
>  Issue Type: Improvement
>  Components: master, ops-tooling
>Affects Versions: Private Beta
>Reporter: Todd Lipcon
>Assignee: Will Berkeley
>  Labels: newbie
> Fix For: 1.0.0
>
>
> We should include the version number in the TS registration and then show it 
> in the table of tablet servers. Perhaps also show a summary of the number of 
> hosts running each version. This is useful to confirm that a cluster upgrade 
> has completed (or to see the progress of a rolling upgrade).
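
The per-version summary itself is just a small aggregation over whatever version 
string each tablet server reports at registration; a minimal sketch (placeholder 
data, not the master's actual web UI code):

{code}
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Count how many registered tablet servers report each software version,
// e.g. to watch a rolling upgrade progress. Version strings are placeholders.
int main() {
  const std::vector<std::string> reported_versions = {
      "kudu 0.10.0", "kudu 1.0.0", "kudu 1.0.0"};

  std::map<std::string, int> hosts_per_version;
  for (const std::string& v : reported_versions) {
    ++hosts_per_version[v];
  }
  for (const auto& entry : hosts_per_version) {
    std::printf("%s: %d host(s)\n", entry.first.c_str(), entry.second);
  }
  return 0;
}
{code}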



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KUDU-687) ksck should work against multi-master

2016-08-29 Thread Adar Dembo (JIRA)

 [ 
https://issues.apache.org/jira/browse/KUDU-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adar Dembo resolved KUDU-687.
-
   Resolution: Fixed
 Assignee: Adar Dembo
Fix Version/s: 1.0.0

Fixed in commits ebe4d78, 6a92db4, and b5aa4a7.

> ksck should work against multi-master
> -------------------------------------
>
> Key: KUDU-687
> URL: https://issues.apache.org/jira/browse/KUDU-687
> Project: Kudu
>  Issue Type: Bug
>  Components: ops-tooling
>Affects Versions: Public beta
>Reporter: Todd Lipcon
>Assignee: Adar Dembo
> Fix For: 1.0.0
>
>
> ksck currently creates an RPC proxy directly to the master and is configured 
> with only a single master address. This doesn't work in a multi-master cluster.
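
For illustration, a sketch of the general multi-master pattern (the 
{{MasterProxy}}, {{ConnectToMaster}}, and {{FindLeaderMaster}} names are 
hypothetical stand-ins, not the actual ksck code or the fix in the commits 
above): instead of building one proxy from a single configured address, the 
tool takes the full list of master addresses and locates the current leader.

{code}
#include <cstdio>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an RPC proxy to a single master.
struct MasterProxy {
  std::string address;
  bool connected;
  bool is_leader;
};

// Hypothetical: attempt to connect to one master address. Here we simply
// pretend the second master happens to be the current leader.
std::unique_ptr<MasterProxy> ConnectToMaster(const std::string& addr) {
  return std::unique_ptr<MasterProxy>(
      new MasterProxy{addr, true, addr == "master-2:7051"});
}

// Try every configured master and return a proxy to the leader, if any.
std::unique_ptr<MasterProxy> FindLeaderMaster(const std::vector<std::string>& addrs) {
  for (const std::string& addr : addrs) {
    std::unique_ptr<MasterProxy> proxy = ConnectToMaster(addr);
    if (proxy && proxy->connected && proxy->is_leader) {
      return proxy;
    }
  }
  return nullptr;
}

int main() {
  const std::vector<std::string> masters = {
      "master-1:7051", "master-2:7051", "master-3:7051"};
  std::unique_ptr<MasterProxy> leader = FindLeaderMaster(masters);
  std::printf("leader master: %s\n",
              leader ? leader->address.c_str() : "none found");
  return 0;
}
{code}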



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)