The error indicates that the memory consumed by outstanding transactions on this tablet is hovering around the 64MB limit. The error message is inherently racy: the values it includes aren't necessarily the values that triggered the failure, which is why the reported consumption (67075227) can appear to be just under the reported limit (67108864). So don't worry about that aspect of it.
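If it turns out you genuinely need more headroom, the 64MB ceiling is governed by a tserver gflag. A minimal sketch of raising it (flag name from memory, so double-check it against your Kudu version; the value 128 is only an illustration, not a recommendation):

    # In the tserver's flagfile (or on its command line), then restart.
    # Raises the per-tablet in-flight transaction memory limit from its
    # 64MB default; this buys headroom but doesn't fix oversized batches
    # or slow follower replication.
    --tablet_transaction_memory_limit_mb=128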
A couple of things to think about:

1. Normally the Kudu client retries a write if it receives a "service
unavailable" error. Does the Spark error indicate that retries took
place? Was there an eventual timeout?

2. It's pretty unusual to be up against this 64MB limit; it's usually a
sign that your write batches are too large, or that the tablet leader
isn't able to replicate to its followers. What does this workload look
like? What is being inserted? How large are the rows? How many rows per
batch? What are your non-default configs? Does 'kudu cluster ksck' come
back healthy? (Sketches of a batch-size workaround and the ksck
invocation follow below the quoted message.)

3. If you want to know more about the transaction state of a tserver at
any given time, visit the tserver's /transactions page in its web UI
(linked via "Dashboards"); an example request is also sketched below.

On Fri, Feb 22, 2019 at 5:34 AM Nabeelah Harris
<nabeelah.har...@impact.com> wrote:
>
> Hi there
>
> While writing to a particular partition of a Kudu table using
> KuduContext.insertRows, I receive the following error: "Service
> unavailable: Transaction failed, tablet <tabletId> transaction memory
> consumption (67075227) has exceeded its limit (67108864) or the limit
> of an ancestral tracker". How would this occur if the consumption is
> less than the limit? Writing to other partitions seems to work just
> fine.
>
> I am inclined to believe that this isn't due to an issue with the
> tablet or tablet server itself, as I get the same error when I drop
> and re-create the partition, and the partition ends up on a different
> tablet/server entirely. Is this simply because I'm trying to write too
> much data at once? What kinds of logs can I look for on the
> masters/tablet servers to indicate that this might happen? What are
> some of the configs I might be able to tweak regarding this issue?
>
> Thanks
> Nabeelah
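For completeness, here are the sketches referenced above. The master
address, tserver host, table name, and the 'spark'/'df' values are
placeholders I made up, not anything from your setup:

    // Scala / kudu-spark sketch: shrink per-task write batches by
    // repartitioning before the insert. Assumes an existing
    // SparkSession 'spark' and a DataFrame 'df' matching the table schema.
    import org.apache.kudu.spark.kudu.KuduContext

    val kuduContext = new KuduContext("kudu-master-1:7051", spark.sparkContext)
    // More, smaller partitions mean fewer rows buffered per task, so less
    // in-flight transaction memory lands on any one tablet at a time.
    kuduContext.insertRows(df.repartition(200), "impala::db.my_table")

And from a shell:

    # Cluster health check; every tablet should come back healthy.
    kudu cluster ksck kudu-master-1:7051

    # Peek at a tserver's in-flight transactions (web UI default port 8050).
    curl http://tserver-host:8050/transactions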