Hi Kudu users,

We just started to use Kudu (1.4.0+cdh5.12.1). To establish a baseline for
evaluation we ingested 3 months' worth of data. During ingestion we saw
messages from the maintenance threads saying that a soft memory limit had
been reached. It seems the background maintenance threads stopped
performing their tasks at that point. It also seems the memory was never
recovered, even after ingestion stopped, so I guess a large backlog had
built up. I guess the root cause is that we were a bit too conservative
when giving Kudu memory. After a restart, a lot of maintenance tasks
(e.g. compaction) were started.

When we verified that all data had been inserted, we found that some data
was missing. We re-inserted the missing data, and for some chunks we were
told that all rows were already present, i.e. Impala reported something
like Modified: 0 rows, nnnnnnn errors. Running the verification again now
shows that the Kudu table is complete. So, even though no rows were
actually inserted for some chunks, a count(*) over these chunks now
returns a different value than before.

Now to my question: will data be inconsistent if we restart Kudu after
seeing soft memory limit warnings?

Is there a way to tell when it is safe to restart Kudu to avoid these
issues? Should we use any special procedure when restarting (e.g. only
restart the tablet servers, only restart one tablet server at a time or
something like that)?

The table design uses 50 tablets per day (times 90 days). It is 8 TB of
data after 3x replication over 5 tablet servers.
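
As a rough sketch of what this layout implies per tablet server, the
arithmetic can be written out as below. It assumes replicas and data are
spread evenly across the servers, which is an assumption and not something
measured; the function name layout_per_server is made up for illustration.

```python
def layout_per_server(tablets_per_day, days, replication, servers, total_tb):
    """Back-of-the-envelope numbers, assuming a perfectly even spread."""
    tablets = tablets_per_day * days      # logical tablets in the table
    replicas = tablets * replication      # tablet replicas cluster-wide
    return {
        "tablets": tablets,
        "replicas": replicas,
        "replicas_per_server": replicas / servers,
        "tb_per_server": total_tb / servers,  # total_tb is already replicated
    }


# Figures from the description above: 50 tablets/day, 90 days,
# 3x replication, 5 tablet servers, 8 TB after replication.
stats = layout_per_server(50, 90, 3, 5, 8.0)
print(stats)
```

Under that even-spread assumption this works out to 4500 tablets, 13500
replicas, 2700 replicas per tablet server and 1.6 TB per tablet server.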

Thanks,
Petter
