At work, in our production environment, we have a Kudu cluster with 55 tablet servers running Kudu version 1.7.0. The cluster has 263 tables and 17,620 tablets (multiply by 3 to account for the replicas), with about 58 TB of data. The output of the command 'kudu cluster ksck' shows that all the tables are healthy.
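For a sense of scale, here is a rough back-of-the-envelope sketch in Python of how many tablet replicas each tablet server hosts on average. It uses only the figures quoted above and assumes the replicas are spread evenly across the tablet servers, which may not hold in practice. The function and variable names are just illustrative.

```python
# Figures taken from the cluster description above.
TABLET_SERVERS = 55
TABLETS = 17_620
REPLICATION_FACTOR = 3  # the "x3" for replicas


def replicas_per_tablet_server(tablets, replication_factor, tablet_servers):
    """Average replicas per tablet server, ASSUMING an even distribution."""
    return tablets * replication_factor / tablet_servers


total_replicas = TABLETS * REPLICATION_FACTOR
avg = replicas_per_tablet_server(TABLETS, REPLICATION_FACTOR, TABLET_SERVERS)

print(f"total replicas: {total_replicas}")
print(f"average replicas per tablet server: {avg:.0f}")
```

That works out to 52,860 replicas in total, or roughly 961 per tablet server if they were evenly balanced.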
In the last few months we have been seeing a couple of concerning phenomena:

- The number of open files in the Kudu process on the tablet servers has increased to more than 150,000 (as counted using 'lsof'). We have already raised the limit on the maximum number of open files twice to avoid a crash, but we (and our vendor) are concerned that something might not be right with such a high number of open files.

- On some of the tablet servers, the disk space used by the WALs is significantly (and concerningly) higher than on most of the other tablet servers. We use a 1TB SSD drive (about 950GB usable) to store the WALs on each tablet server, and this week was the second time we saw a tablet server almost fill the whole WAL disk. We had to stop and restart the tablet server so that its tablets would be migrated to different tablet servers and we could manually clean up the WALs directory, but this is definitely not something we would like to keep doing.

We took a look inside the WAL directory on that tablet server before wiping it, and we observed that there were a few tablets whose WALs were in excess of 30GB. Another piece of information: the table that the largest of these tablets belongs to receives about 15M transactions a day, of which about 25% are new inserts and the rest are updates of existing rows.

We have opened a couple of support cases with our vendor, and they are currently reviewing the logs. We also thought it would be useful to post this to the Kudu users mailing list, in case someone has ideas about what could cause this behavior and how to address it, and to find out whether anyone else here has noticed something similar on their Kudu clusters or whether it is peculiar to our configuration and type of load.

Thanks in advance,
Franco Venturi