[ 
https://issues.apache.org/jira/browse/GOBBLIN-1921?focusedWorklogId=883015&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-883015
 ]

ASF GitHub Bot logged work on GOBBLIN-1921:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Oct/23 23:13
            Start Date: 02/Oct/23 23:13
    Worklog Time Spent: 10m 
      Work Description: umustafi commented on code in PR #3790:
URL: https://github.com/apache/gobblin/pull/3790#discussion_r1343247173


##########
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/api/MysqlMultiActiveLeaseArbiter.java:
##########
@@ -86,16 +88,26 @@ protected interface CheckedFunction<T, R> {
   private final int epsilon;
   private final int linger;
   private String thisTableGetInfoStatement;
+  private String thisTableGetInfoStatementForReminder;
   private String thisTableSelectAfterInsertStatement;
   private String thisTableAcquireLeaseIfMatchingAllStatement;
   private String thisTableAcquireLeaseIfFinishedStatement;
 
   // TODO: define retention on this table
+  /*
+    Notes:
+    - Set `event_timestamp` default value to turn off timestamp auto-updates 
for row modifications which alters this col
+      in an unexpected way upon completing the lease
+    - Upon reading any timestamps from MySQL we convert the timezone from 
session (default) to UTC to consistently
+      use epoch-millis in UTC locally
+    - Upon using any timestamps from local we convert the timezone from UTC to 
session
+    - We desire millisecond level precision and denote that with `(3)` for the 
TIMESTAMP types
+   */
   private static final String CREATE_LEASE_ARBITER_TABLE_STATEMENT = "CREATE 
TABLE IF NOT EXISTS %s ("
       + "flow_group varchar(" + ServiceConfigKeys.MAX_FLOW_GROUP_LENGTH + ") 
NOT NULL, flow_name varchar("
       + ServiceConfigKeys.MAX_FLOW_GROUP_LENGTH + ") NOT NULL, " + " 
flow_action varchar(100) NOT NULL, "
-      + "event_timestamp TIMESTAMP, "
-      + "lease_acquisition_timestamp TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP, 
"
+      + "event_timestamp TIMESTAMP(3) DEFAULT CURRENT_TIMESTAMP(3), "

Review Comment:
   I converted everything to BIGINT and updated the tests but reverted back 
because I removed the database laundering after realizing that not having 
database laundering made it very hard to deal with `reminderEvents`.  We end up 
having to think about every possible case of what to do with reminder events 
and `withinEpsilon` and `afterLinger` or not while db laundering instead of 
using the `triggerTimeMillis` from the hosts may result in out of order 
timestamps if we have host drift. With db laundering, we use database timestamp 
and it becomes easier to do time conversions with timestamp type. There's a lot 
of back and forth that went into this 





Issue Time Tracking
-------------------

    Worklog Id:     (was: 883015)
    Time Spent: 2h 10m  (was: 2h)

> Properly handle reminder events
> -------------------------------
>
>                 Key: GOBBLIN-1921
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1921
>             Project: Apache Gobblin
>          Issue Type: Bug
>          Components: gobblin-service
>            Reporter: Urmi Mustafi
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Reminder flow trigger events were being improperly handled and interpreted as 
> new events because they are triggered {{linger}} time after the original 
> trigger where {{epsilon < linger}} and we use {{epsilon}} to determine event 
> distinctness. With reminder events being considered distinct events, we were 
> launching excess concurrent flows that were then being cancelled. Now we 
> handle reminder events differently from normal event triggers to ensure 
> they're properly evaluated. Because of db laundering, reminder events are 
> easy to handle - if they're older than the currently worked upon event in the 
> database they can be skipped and if they're equal to the current event in the 
> database they are handled like normal. Reminder events should never be newer 
> than the current event in the lease arbiter table because db laundering 
> always results in increasing event times. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to