pan3793 commented on code in PR #7176:
URL: https://github.com/apache/kyuubi/pull/7176#discussion_r2306146265


##########
kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/jdbc/JdbcDatabaseDialect.scala:
##########
@@ -31,6 +32,10 @@ class GenericDatabaseDialect extends JdbcDatabaseDialect {
     s"LIMIT $limit OFFSET $offset"
   }
 
+  override def deleteFromLimitClause(limit: Int): String = {
+    "" // Generic dialect does not support LIMIT in DELETE statements

Review Comment:
   let's add a warning log here. it won't bother the user since generally, it 
won't fall back to `GenericDatabaseDialect`



##########
kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataManager.scala:
##########
@@ -253,6 +256,31 @@ class MetadataManager extends 
AbstractService("MetadataManager") {
       TimeUnit.MILLISECONDS)
   }
 
+  @VisibleForTesting
+  private[metadata] def cleanupMetadata(maxAge: Long, batchSize: Int, 
batchInterval: Long): Unit = {
+    var needToCleanMetadata = true
+    var needToCleanKubernetesInfo = true
+
+    while (needToCleanMetadata || needToCleanKubernetesInfo) {

Review Comment:
   let's have an enforced loop round threshold(maybe `100`, no need to be 
configurable, hardcode is fine) to guard this while loop.



##########
kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala:
##########
@@ -2077,6 +2077,24 @@ object KyuubiConf {
       .timeConf
       .createWithDefault(Duration.ofMinutes(30).toMillis)
 
+  val METADATA_CLEANER_BATCH_SIZE: ConfigEntry[Int] =
+    buildConf("kyuubi.metadata.cleaner.batch.size")
+      .serverOnly
+      .doc("The batch size for cleaning expired metadata. " +
+        "This is used to avoid holding the delete lock for a long time " +
+        "when there are too many expired metadata to be cleaned.")
+      .version("1.11.0")
+      .intConf
+      .createWithDefault(Int.MaxValue)

Review Comment:
   I expect this is a general perf optimization, should we give a more 
reasonable default value? e.g. 1000 or 3000



##########
kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala:
##########
@@ -2077,6 +2077,24 @@ object KyuubiConf {
       .timeConf
       .createWithDefault(Duration.ofMinutes(30).toMillis)
 
+  val METADATA_CLEANER_BATCH_SIZE: ConfigEntry[Int] =
+    buildConf("kyuubi.metadata.cleaner.batch.size")
+      .serverOnly
+      .doc("The batch size for cleaning expired metadata. " +
+        "This is used to avoid holding the delete lock for a long time " +
+        "when there are too many expired metadata to be cleaned.")
+      .version("1.11.0")
+      .intConf
+      .createWithDefault(Int.MaxValue)
+
+  val METADATA_CLEANER_BATCH_INTERVAL: ConfigEntry[Long] =
+    buildConf("kyuubi.metadata.cleaner.batch.interval")
+      .serverOnly
+      .internal
+      .doc("The interval to wait before next batch of cleaning expired 
metadata.")
+      .timeConf
+      .createWithDefault(Duration.ofMillis(100).toMillis)

Review Comment:
   this is a lower-priority work, let's have at least 1s interval to avoid 
impact for batch scheduling



##########
kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataManager.scala:
##########
@@ -253,6 +256,31 @@ class MetadataManager extends 
AbstractService("MetadataManager") {
       TimeUnit.MILLISECONDS)
   }
 
+  @VisibleForTesting
+  private[metadata] def cleanupMetadata(maxAge: Long, batchSize: Int, 
batchInterval: Long): Unit = {
+    var needToCleanMetadata = true
+    var needToCleanKubernetesInfo = true
+
+    while (needToCleanMetadata || needToCleanKubernetesInfo) {
+      if (needToCleanMetadata) {
+        needToCleanMetadata =
+          withMetadataRequestMetrics(_metadataStore.cleanupMetadataByAge(
+            maxAge,
+            batchSize)) >= batchSize
+      }
+      if (needToCleanKubernetesInfo) {
+        needToCleanKubernetesInfo =
+          
withMetadataRequestMetrics(_metadataStore.cleanupKubernetesEngineInfoByAge(
+            maxAge,
+            batchSize)) >= batchSize
+      }
+      if (needToCleanMetadata || needToCleanKubernetesInfo) {
+        info("Sleep for " + batchInterval + "ms before next metadata cleanup 
batch")

Review Comment:
   why not use an interpreted string?



##########
kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataStore.scala:
##########
@@ -102,8 +102,13 @@ trait MetadataStore extends Closeable {
   /**
    * Check and cleanup the terminated batches information with maxAge 
limitation.
    * @param maxAge the batch state info maximum age.
+   * @param limit the maximum number of metadata to be cleaned up.
    */
-  def cleanupMetadataByAge(maxAge: Long): Unit
+  def cleanupMetadataByAge(maxAge: Long, limit: Int): Int
+
+  def cleanupMetadataByAge(maxAge: Long): Int = {
+    cleanupMetadataByAge(maxAge, Int.MaxValue)

Review Comment:
   it worth nothing to keep this API, since the return value changed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to