terrymanu commented on issue #22290: URL: https://github.com/apache/shardingsphere/issues/22290#issuecomment-3559264002
## Root Cause - The SQL parsing cache uses the full SQL text as the key to cache parsing results. In multi-value batch INSERTs, each batch has a different number of placeholders, resulting in thousands of long SQL statements. This leads to low cache hit rates and high memory consumption. - The provided code snippet shows that the older version used an unbounded LoadingCache with softValues, lacking any eviction policy, causing dynamic SQL to accumulate continuously. This is a high-cardinality caching scenario caused by the usage pattern (not a parsing bug), where it is recommended to keep SQL templates stable and control batch sizes, rather than using variable-length VALUES. ## Problem Analysis (Master Branch Status) - On the master branch, the cache has been switched to a bounded Caffeine cache: infra/parser/src/main/java/org/apache/shardingsphere/infra/parser/sql/SQLStatementParserEngine.javacreates the cache via SQLStatementCacheBuilder. By default, DefaultSQLParserRuleConfigurationBuildersets: SQL_STATEMENT_CACHE(initialCapacity=2000, maximumSize=65535) PARSE_TREE_CACHE(128, 1024) These use LRU + soft values, are configurable, and have capacity limits. - Both Proxy and JDBC allow reducing the cache size through the global SQL_PARSERrule (via YAML or DistSQL: ALTER SQL_PARSER RULE (SQL_STATEMENT_CACHE(...))), which helps limit the retention of large SQLs. This aligns with ShardingSphere’s best practice: maintain fixed batch sizes or fixed SQL templates, and use PreparedStatement#addBatchinstead of variable-length VALUES, to avoid high-cardinality long SQLs in the parsing cache. - - This issue has effectively been mitigated on the master branch through "bounded + configurable" caching. ## Conclusion / Information Needed - Recommendation: Upgrade to a version that includes the aforementioned bounded cache (master / 5.3+), or in your current environment, reduce the capacity of sqlStatementCacheand parseTreeCacheunder SQL_PARSER, and keep batch templates and sizes consistent. - If the issue still occurs, please provide additional details: the ShardingSphere version you are using, the SQL_PARSERconfiguration in server.yaml(or via DistSQL), typical batch sizes and SQL examples, heap size, and GC logs — to confirm whether it is still caused by the old unbounded cache. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
