Re: Error message: 'Tried to update clock beyond the max. error.'

2017-11-01 Thread Todd Lipcon
Thanks Franco. I filed https://issues.apache.org/jira/browse/KUDU-2209 and put up a patch. I'm also going to work on a change to try to allow Kudu to ride over brief interruptions in ntp synchronization status. Hopefully this will help folks who have some issues with occasional ntp instability. -

Re: Error message: 'Tried to update clock beyond the max. error.'

2017-11-01 Thread Franco Venturi
From 'tablet_bootstrap.cc': 1030 14:29:37.324306 60682 tablet_bootstrap.cc:884] Check failed: _s.ok() Bad status: Invalid argument: Tried to update clock beyond the max. error. Franco - Original Message - From: "Todd Lipcon" To: user@kudu.apache.org Sent: Wednesday, November

Re: Error message: 'Tried to update clock beyond the max. error.'

2017-11-01 Thread Todd Lipcon
Actually I think I understand the root cause of this. I think at some point NTP can switch the clock from a microseconds-based mode to a nanoseconds-based mode, at which point Kudu starts interpreting the results of the ntp_gettime system call incorrectly, resulting in incorrect error estimates and
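
To make the microseconds/nanoseconds distinction concrete, here is a minimal standalone sketch (not Kudu's actual clock code) of reading the kernel NTP state with ntp_adjtime(). When the STA_NANO status bit is set, the fractional-seconds field of the result carries nanoseconds rather than microseconds, so a reader that always assumes microseconds is off by a factor of 1000; the maxerror and esterror fields stay in microseconds either way.

    // Minimal sketch, not Kudu's code: query the kernel NTP state and
    // interpret the fractional-seconds field according to STA_NANO.
    #include <sys/timex.h>
    #include <cstdio>
    #include <cstring>

    int main() {
      struct timex tx;
      std::memset(&tx, 0, sizeof(tx));   // modes == 0: read-only query
      int state = ntp_adjtime(&tx);      // returns TIME_OK, TIME_ERROR, ..., or -1
      if (state == -1) {
        std::perror("ntp_adjtime");
        return 1;
      }

      // With STA_NANO set, tv_usec actually holds nanoseconds.
      const bool nano = (tx.status & STA_NANO) != 0;
      const long frac_us = nano ? tx.time.tv_usec / 1000 : tx.time.tv_usec;

      std::printf("synchronized: %s\n", state == TIME_ERROR ? "no" : "yes");
      std::printf("time: %ld.%06ld s  maxerror: %ld us  esterror: %ld us\n",
                  static_cast<long>(tx.time.tv_sec), frac_us,
                  tx.maxerror, tx.esterror);
      return 0;
    }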

Re: Error message: 'Tried to update clock beyond the max. error.'

2017-11-01 Thread Todd Lipcon
What's the full log line where you're seeing this crash? Is it coming from tablet_bootstrap.cc, raft_consensus.cc, or elsewhere? -Todd 2017-11-01 15:45 GMT-07:00 Franco Venturi: > Our version is kudu 1.5.0-cdh5.13.0. > Franco -- Todd Lipcon Software Engineer, Cloudera

Re: Error message: 'Tried to update clock beyond the max. error.'

2017-11-01 Thread Franco Venturi
Our version is kudu 1.5.0-cdh5.13.0. Franco

Re: Low ingestion rate from Kafka

2017-11-01 Thread Todd Lipcon
On Wed, Nov 1, 2017 at 2:10 PM, Chao Sun wrote: > > Great. Keep in mind that, since you have a UUID component at the front > of your key, you are doing something like a random-write workload. So, as > your data grows, if your PK column (and its bloom filters) ends up being > larger than the avail

Re: Low ingestion rate from Kafka

2017-11-01 Thread Chao Sun
> Great. Keep in mind that, since you have a UUID component at the front of your key, you are doing something like a random-write workload. So, as your data grows, if your PK column (and its bloom filters) ends up being larger than the available RAM for caching, each write may generate a disk seek
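
For readers following along, here is a sketch of the kind of schema being discussed, built against the Kudu C++ client. The table and column names ("events", "uuid", "ts", "payload"), the bucket count, and the master address are illustrative assumptions, not the poster's actual setup. A primary key that leads with a UUID makes inserts effectively random; hash-partitioning on that column at least spreads them evenly across tablets.

    // Illustrative schema only -- names and sizes are assumptions, not the
    // poster's actual table. Compile against the Kudu C++ client library.
    #include <kudu/client/client.h>
    #include <iostream>
    #include <memory>

    using namespace kudu::client;

    int main() {
      sp::shared_ptr<KuduClient> client;
      kudu::Status s = KuduClientBuilder()
          .add_master_server_addr("kudu-master.example.com:7051")  // hypothetical master
          .Build(&client);
      if (!s.ok()) { std::cerr << s.ToString() << "\n"; return 1; }

      // Primary key (uuid, ts): the leading UUID gives a random-write pattern.
      KuduSchemaBuilder b;
      b.AddColumn("uuid")->Type(KuduColumnSchema::STRING)->NotNull();
      b.AddColumn("ts")->Type(KuduColumnSchema::UNIXTIME_MICROS)->NotNull();
      b.AddColumn("payload")->Type(KuduColumnSchema::STRING);
      b.SetPrimaryKey({"uuid", "ts"});
      KuduSchema schema;
      s = b.Build(&schema);
      if (!s.ok()) { std::cerr << s.ToString() << "\n"; return 1; }

      // Hash-partition on the UUID so the random writes are at least spread
      // evenly across tablets (and tablet servers).
      std::unique_ptr<KuduTableCreator> creator(client->NewTableCreator());
      s = creator->table_name("events")
          ->schema(&schema)
          ->add_hash_partitions({"uuid"}, 32)
          ->num_replicas(3)
          ->Create();
      if (!s.ok()) { std::cerr << s.ToString() << "\n"; return 1; }
      return 0;
    }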

Re: Kudu background tasks

2017-11-01 Thread Todd Lipcon
Hi Janne, It's not clear whether the issue was that it was taking a long time to restart (i.e., replaying WALs) or if somehow you also ended up having to re-replicate a bunch of tablets from host to host in the cluster. There were some bugs in earlier versions of Kudu (e.g., KUDU-2125, KUDU-2020) which

Re: Low ingestion rate from Kafka

2017-11-01 Thread Todd Lipcon
On Wed, Nov 1, 2017 at 1:23 PM, Chao Sun wrote: > Thanks Todd! I improved my code to use multiple Kudu clients for processing > the Kafka messages and > was able to improve the rate to 250K - 300K per sec. Pretty happy with > this now. > Great. Keep in mind that, since you have a UUID component a

Re: Low ingestion rate from Kafka

2017-11-01 Thread Chao Sun
Thanks Todd! I improved my code to use multiple Kudu clients for processing the Kafka messages and was able to improve the rate to 250K - 300K per sec. Pretty happy with this now. Will take a look at the perf tool - looks very nice. It seems it is not available on Kudu 1.3 though. Best, Chao On W
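
For context, here is a sketch of what a per-consumer write path might look like with the Kudu C++ client. The poster's code is not shown in the thread (and may well use the Java client); the "events"/"uuid"/"ts"/"payload" names carry over from the illustrative schema above. Each consumer thread owns its own session, uses background flushing, and batches rows rather than flushing per insert.

    // Per-consumer write path, sketched with the Kudu C++ client. Not the
    // poster's actual code; table and column names are assumptions.
    #include <kudu/client/client.h>
    #include <memory>
    #include <string>
    #include <utility>
    #include <vector>

    using namespace kudu::client;

    // records: (uuid, payload) pairs pulled from a Kafka consumer (not shown).
    kudu::Status WriteBatch(const sp::shared_ptr<KuduClient>& client,
                            const std::vector<std::pair<std::string, std::string>>& records,
                            int64_t event_time_us) {
      sp::shared_ptr<KuduTable> table;
      KUDU_RETURN_NOT_OK(client->OpenTable("events", &table));

      sp::shared_ptr<KuduSession> session = client->NewSession();
      KUDU_RETURN_NOT_OK(session->SetFlushMode(KuduSession::AUTO_FLUSH_BACKGROUND));
      session->SetTimeoutMillis(30000);

      for (const auto& rec : records) {
        std::unique_ptr<KuduInsert> insert(table->NewInsert());
        KuduPartialRow* row = insert->mutable_row();
        KUDU_RETURN_NOT_OK(row->SetString("uuid", rec.first));
        KUDU_RETURN_NOT_OK(row->SetUnixTimeMicros("ts", event_time_us));
        KUDU_RETURN_NOT_OK(row->SetString("payload", rec.second));
        // Apply() takes ownership and buffers the write; flushing happens in
        // the background rather than once per row.
        KUDU_RETURN_NOT_OK(session->Apply(insert.release()));
      }
      // Wait for the tail of the batch before committing Kafka offsets.
      // Real code would also drain session->GetPendingErrors() to catch
      // failures from background flushes.
      return session->Flush();
    }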

Kudu background tasks

2017-11-01 Thread Janne Keskitalo
Hi, our Kudu test environment became unresponsive yesterday for an unknown reason. It has three tablet servers and one master. It's running in AWS on quite small host machines, so maybe some node ran out of memory or something. It has happened before with this setup. Anyway, after we restarted kudu servi

Re: Low ingestion rate from Kafka

2017-11-01 Thread Todd Lipcon
On Wed, Nov 1, 2017 at 12:20 AM, Todd Lipcon wrote: > Sounds good. > > BTW, you can try a quick load test using the 'kudu perf loadgen' tool. > For example something like: > > kudu perf loadgen my-kudu-master.example.com --num-threads=8 > --num-rows-per-thread=100 --table-num-buckets=32 > > T

Re: Low ingestion rate from Kafka

2017-11-01 Thread Todd Lipcon
On Tue, Oct 31, 2017 at 11:56 PM, Chao Sun wrote: > > Sure, but increasing the number of consumers can increase the throughput > (without increasing the number of Kudu tablet servers). > > I see. Makes sense. I'll test that later. > > > Currently, if you run 'top' on the TS nodes, do you see them