[ 
https://issues.apache.org/jira/browse/GOBBLIN-1923?focusedWorklogId=883233&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-883233
 ]

ASF GitHub Bot logged work on GOBBLIN-1923:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 03/Oct/23 22:45
            Start Date: 03/Oct/23 22:45
    Worklog Time Spent: 10m 
      Work Description: phet commented on code in PR #3792:
URL: https://github.com/apache/gobblin/pull/3792#discussion_r1344827785


##########
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiter.java:
##########
@@ -204,6 +212,14 @@ public MysqlMultiActiveLeaseArbiter(Config config) throws IOException {
     }
     initializeConstantsTable();
 
+    Thread retentionThread = new Thread(new Runnable() {

Review Comment:
   Rather than a sleeping/blocking thread, how about a scheduled thread pool
executor that takes this `Runnable` and an execution interval?
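
A minimal sketch of that suggestion, using only `java.util.concurrent`. The class name `RetentionScheduler`, the method `start`, and the thread name are hypothetical and not taken from the PR; the retention task itself is passed in as a plain `Runnable`.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the PR: schedules a retention task on a
// single background thread instead of looping with Thread.sleep().
public class RetentionScheduler {

  /** Runs {@code task} immediately and then again after each {@code interval}. */
  public static ScheduledExecutorService start(Runnable task, long interval, TimeUnit unit) {
    ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(runnable -> {
      Thread thread = new Thread(runnable, "lease-arbiter-retention");
      thread.setDaemon(true); // do not keep the JVM alive just for retention
      return thread;
    });

    // An uncaught exception would suppress all later executions, so catch
    // and report it here to keep the schedule alive.
    Runnable guarded = () -> {
      try {
        task.run();
      } catch (RuntimeException e) {
        System.err.println("Retention run failed: " + e);
      }
    };

    executor.scheduleWithFixedDelay(guarded, 0, interval, unit);
    return executor;
  }
}
```

A caller would keep the returned executor and call `shutdownNow()` on it when the arbiter is closed, for example `RetentionScheduler.start(this::runRetentionOnce, 6, TimeUnit.HOURS)`, where `runRetentionOnce` is an assumed single-pass version of the retention method.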



##########
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiter.java:
##########
@@ -110,9 +111,13 @@ protected interface CheckedFunction<T, R> {
   private static final String CREATE_LEASE_ARBITER_TABLE_STATEMENT = "CREATE TABLE IF NOT EXISTS %s ("
       + "flow_group varchar(" + ServiceConfigKeys.MAX_FLOW_GROUP_LENGTH + ") NOT NULL, flow_name varchar("
       + ServiceConfigKeys.MAX_FLOW_GROUP_LENGTH + ") NOT NULL, " + " flow_action varchar(100) NOT NULL, "
-      + "event_timestamp TIMESTAMP(3) DEFAULT CURRENT_TIMESTAMP(3), "
-      + "lease_acquisition_timestamp TIMESTAMP(3) NULL DEFAULT NULL, "
+      + "event_timestamp TIMESTAMP NOT NULL, "
+      + "lease_acquisition_timestamp TIMESTAMP NULL, "

Review Comment:
   As for migrating this schema: will it require manual intervention to
either `drop` or `alter table`?



##########
gobblin-api/src/main/java/org/apache/gobblin/configuration/ConfigurationKeys.java:
##########
@@ -101,6 +101,8 @@ public class ConfigurationKeys {
   public static final String DEFAULT_MULTI_ACTIVE_SCHEDULER_CONSTANTS_DB_TABLE = "gobblin_multi_active_scheduler_constants_store";
   public static final String SCHEDULER_LEASE_DETERMINATION_STORE_DB_TABLE_KEY = MYSQL_LEASE_ARBITER_PREFIX + ".schedulerLeaseArbiter.store.db.table";
   public static final String DEFAULT_SCHEDULER_LEASE_DETERMINATION_STORE_DB_TABLE = "gobblin_scheduler_lease_determination_store";
+  public static final String SCHEDULER_LEASE_DETERMINATION_TABLE_RETENTION_PERIOD_MILLIS_KEY = MYSQL_LEASE_ARBITER_PREFIX + ".retentionPeriodMillis";
+  public static final int DEFAULT_SCHEDULER_LEASE_DETERMINATION_TABLE_RETENTION_PERIOD_MILLIS = 500000;

Review Comment:
   500 seconds? That seems way too low, at least for debugging. I'd look at
something more like 72 hours.
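
For reference, the arithmetic behind the two values can be sketched as follows. The class and constant names here are hypothetical and are not the ones in `ConfigurationKeys`; deriving the default from `TimeUnit` is only one way to make the intended duration readable.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; these names do not exist in the PR.
public class RetentionDefaults {

  // The default in the diff: 500000 ms, i.e. 500 seconds (a little over 8 minutes).
  public static final long PROPOSED_IN_DIFF_MILLIS = 500_000L;

  // The reviewer's suggestion: 72 hours, written so the unit is explicit.
  public static final long SUGGESTED_MILLIS = TimeUnit.HOURS.toMillis(72);

  public static void main(String[] args) {
    System.out.println(TimeUnit.MILLISECONDS.toSeconds(PROPOSED_IN_DIFF_MILLIS) + " seconds");
    System.out.println(SUGGESTED_MILLIS + " ms for 72 hours");
  }
}
```

72 hours is 259,200,000 ms, which still fits in an `int`, though a `long` leaves room for longer retention periods.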



##########
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiter.java:
##########
@@ -221,6 +237,31 @@ private void initializeConstantsTable() throws IOException {
     }, true);
   }
 
+  /**
+   * Periodically deletes all rows in the table with event_timestamp older than the retention period defined by config.
+   */
+  private void runRetentionOnArbitrationTable() {
+    while (true) {
+      try {
+        Thread.sleep(10000);

Review Comment:
   Tip (this applies equally if you use a scheduled thread pool executor):
express the sleep in values that map cleanly onto time units (such as 60).
Also, 10s looks WAY too frequent, given that a lease may itself last for
minutes! Maybe we run this every six hours (4x daily)?
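
To see how the run interval interacts with the retention period, here is a rough sketch using `java.time`. `RetentionWindow` and its methods are hypothetical names, and the 72-hour retention figure is an assumption for illustration. A row that has just missed one sweep survives until the next one, so its worst-case age at deletion is the retention period plus the interval.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for reasoning about retention timing; not part of the PR.
public class RetentionWindow {

  /** Rows whose event_timestamp is before this instant are eligible for deletion. */
  public static Instant cutoff(Instant now, Duration retention) {
    return now.minus(retention);
  }

  /** Worst-case age of a row when deleted: it may just miss one sweep. */
  public static Duration maxRowAge(Duration retention, Duration sweepInterval) {
    return retention.plus(sweepInterval);
  }

  public static void main(String[] args) {
    Duration retention = Duration.ofHours(72); // assumed retention period
    Duration interval = Duration.ofHours(6);   // four sweeps per day
    System.out.println("cutoff now: " + cutoff(Instant.now(), retention));
    System.out.println("max row age in hours: " + maxRowAge(retention, interval).toHours());
  }
}
```

With these assumed values, sweeping every six hours adds at most six hours to a row's lifetime, which is small next to a retention period measured in days.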





Issue Time Tracking
-------------------

    Worklog Id:     (was: 883233)
    Time Spent: 0.5h  (was: 20m)

> Add retention thread for lease arbiter table
> --------------------------------------------
>
>                 Key: GOBBLIN-1923
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1923
>             Project: Apache Gobblin
>          Issue Type: Bug
>          Components: gobblin-service
>            Reporter: Urmi Mustafi
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add retention to the lease arbiter table so it does not grow unbounded. The
> table can be as large as O(number of flows), which may become so large that
> reading from and writing to it becomes time-consuming and slows down our
> throughput of obtaining and evaluating leases for launching flows.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
