tolbertam commented on code in PR #3917:
URL: https://github.com/apache/cassandra/pull/3917#discussion_r2076730354
##########
src/java/org/apache/cassandra/db/SystemKeyspace.java:
##########
@@ -1863,16 +1864,37 @@ public static void resetPreparedStatements()
}
public static int loadPreparedStatements(TriFunction<MD5Digest, String,
String, Boolean> onLoaded)
+ {
+ return loadPreparedStatements(onLoaded,
QueryProcessor.PRELOAD_PREPARED_STATEMENTS_FETCH_SIZE);
+ }
+
+ public static int loadPreparedStatements(TriFunction<MD5Digest, String,
String, Boolean> onLoaded, int pageSize)
{
String query = String.format("SELECT prepared_id, logged_keyspace,
query_string FROM %s.%s", SchemaConstants.SYSTEM_KEYSPACE_NAME,
PREPARED_STATEMENTS);
- UntypedResultSet resultSet = executeOnceInternal(query);
+ UntypedResultSet resultSet = executeOnceInternalWithPaging(query,
pageSize);
int counter = 0;
+ long initialStatementsEvicted =
QueryProcessor.metrics.preparedStatementsEvicted.getCount();
for (UntypedResultSet.Row row : resultSet)
{
if
(onLoaded.accept(MD5Digest.wrap(row.getByteArray("prepared_id")),
row.getString("query_string"),
row.has("logged_keyspace") ?
row.getString("logged_keyspace") : null))
counter++;
+
+ if (counter % pageSize == 0 &&
QueryProcessor.isPreparedStatementCacheFull())
+ {
+ // In the event that we detect that the prepared statement
cache has filled up, return early to prevent
Review Comment:
> Aren't we missing a test we stop loading pstmnts once cache is full? Maybe
you could use byteman to capture the call to the log warn method. That with a
modified yaml to have a smaller pstmnts cache somehow should do the trick. I
think it's one important feature we're not testing unless I am missing sthg.
I'm glad you brought this up as it pushed me to exercise how effective this
logic could be, and apparently it was not very effective as is!
We deployed an early version of this fix in late 2024/early 2025, at which
time we didn't have any short circuiting logic, so we would just page
indefinitely (which was better than OOMing but still not great). When we
encountered this, we would simply just truncate the
`system.prepared_statements` table. So this particular piece of logic hadn't
really been put through its paces.
After writing a test (`testPreloadPreparedStatementsUntilCacheFull`) I found
that it was *very* difficult to encounter a full cache as Caffeine, while it
allows going past the maximum weight, is very effective at evicting quickly
below it. Therefore, I think we need to add some kind of negative buffer. I
found that testing against 99% of the cache size was very effective.
To avoid potentially logging that there might be a leak when we previously
just had a nearly full cache, we now allow paging one addition time
(`possibleLeakDetected`), at which point if we still haven't exhausted the
result set, we'll log the message and exit earlier.
In terms of validating where we stop once the cache is full, I found that we
could easily detect this by updating `preloadPreparedStatements` to return an
`int` and then comparing the total number of rows returned against the maximum
amount we should expect (I added some github comments to the test I added to
expand on this more). I think this is more precise than testing against the
log entry, which also does indeed get logged in the test:
> WARN [main] 2025-05-06 22:36:42,581 SystemKeyspace.java:1888 - Detected
prepared statement cache filling up during preload after preparing 492
statements. This could be an indication that prepared statements leaked prior
to CASSANDRA-19703 being fixed. Returning early to prevent indefinite startup.
Consider truncating system.prepared_statements to clear out leaked prepared
statements.
Hopefully that all sounds right!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]