[ 
https://issues.apache.org/jira/browse/GOBBLIN-2188?focusedWorklogId=951135&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-951135
 ]

ASF GitHub Bot logged work on GOBBLIN-2188:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Jan/25 10:46
            Start Date: 07/Jan/25 10:46
    Worklog Time Spent: 10m 
      Work Description: phet merged PR #4091:
URL: https://github.com/apache/gobblin/pull/4091




Issue Time Tracking
-------------------

            Worklog Id:     (was: 951135)
    Remaining Estimate: 0h
            Time Spent: 10m

> Allow use of stateful writer/converter `Initializer`s with Gobblin-on-Temporal
> ------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-2188
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-2188
>             Project: Apache Gobblin
>          Issue Type: New Feature
>          Components: gobblin-api
>            Reporter: Kip Kohn
>            Assignee: Hung Tran
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Stateful writer/converter `Initializer`s, such as `JdbcWriterInitializer`, 
> work fine with Gobblin-on-MR (GoMR), but break under Gobblin-on-Temporal 
> (GoT).  Although GoMR does launch an MR application, the remainder of the 
> `MRJobLauncher` execution stays within the same process.  `Initializer`s 
> must execute at the end of Work Discovery, before `WorkUnit` processing may 
> begin, but are `.close()`d only after Job Commit completes.  Crucially, with 
> GoMR, the same `Initializer` instances remain in memory throughout.  With 
> GoT, in contrast, Work Discovery and Commit execute completely 
> independently, creating new objects/instances, perhaps even on a new 
> host/container.
>
> Problem: Some `Initializer`s, such as the `JdbcWriter`'s 
> `JdbcWriterInitializer`, are stateful.  (In its case, the state is the 
> temp/staging table's name, kept so that the table may be dropped upon 
> successful Commit.)  That state originates during Work Discovery (the 
> `GenerateWorkUnitsImpl` activity in GoT) yet must be available during 
> Commit (the `CommitActivityImpl` in GoT).
>
> Solution: Use the Memento (GoF) Pattern to enable `Initializer`s to convey 
> arbitrary state from one concrete `Initializer` instance to another of the 
> same type.  Leverage the `JobState` to tunnel mementos, since it is 
> serialized at the end of Work Discovery, to be loaded later as the Commit 
> activity begins.
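
The memento hand-off described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code merged in PR #4091: the names `MementoInitializer`, `StagingTableInitializer`, `saveMemento`, `restoreMemento`, and the `initializer.memento` key are all hypothetical, and a plain `java.util.Properties` stands in for the serialized `JobState`.

```java
import java.util.Properties;

/** Hypothetical interface: an initializer that can export and re-import its state. */
interface MementoInitializer extends AutoCloseable {
    void initialize();
    String saveMemento();                 // would be called at the end of Work Discovery
    void restoreMemento(String memento);  // would be called as Commit begins
    @Override void close();
}

/** Hypothetical stateful initializer that only remembers a staging table name. */
class StagingTableInitializer implements MementoInitializer {
    private String stagingTable;
    private boolean dropped = false;

    @Override public void initialize() {
        // A real initializer would create the table; here we only pick a name.
        stagingTable = "stage_" + Long.toHexString(System.nanoTime());
    }
    @Override public String saveMemento() { return stagingTable; }
    @Override public void restoreMemento(String memento) { stagingTable = memento; }
    @Override public void close() {
        // A real initializer would drop the staging table here.
        dropped = (stagingTable != null);
    }
    String stagingTable() { return stagingTable; }
    boolean dropped() { return dropped; }
}

class MementoSketch {
    static final String MEMENTO_KEY = "initializer.memento"; // hypothetical key

    /** Work Discovery side: initialize, then tunnel the memento through the job state. */
    static Properties discover() {
        StagingTableInitializer init = new StagingTableInitializer();
        init.initialize();
        Properties jobState = new Properties(); // stand-in for the serialized JobState
        jobState.setProperty(MEMENTO_KEY, init.saveMemento());
        return jobState;
    }

    /** Commit side: a fresh instance recovers the state before closing. */
    static StagingTableInitializer commit(Properties jobState) {
        StagingTableInitializer init = new StagingTableInitializer();
        init.restoreMemento(jobState.getProperty(MEMENTO_KEY));
        init.close();
        return init;
    }

    public static void main(String[] args) {
        Properties jobState = discover();
        StagingTableInitializer atCommit = commit(jobState);
        System.out.println("staging table at commit: " + atCommit.stagingTable());
    }
}
```

The point of the sketch is that the instance created in `commit` never ran `initialize()`, yet it can still clean up the staging table, because the name travelled through the job state rather than through memory.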



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
