[
https://issues.apache.org/jira/browse/GOBBLIN-2188?focusedWorklogId=951135&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-951135
]
ASF GitHub Bot logged work on GOBBLIN-2188:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 07/Jan/25 10:46
Start Date: 07/Jan/25 10:46
Worklog Time Spent: 10m
Work Description: phet merged PR #4091:
URL: https://github.com/apache/gobblin/pull/4091
Issue Time Tracking
-------------------
Worklog Id: (was: 951135)
Remaining Estimate: 0h
Time Spent: 10m
> Allow use of stateful writer/converter `Initializer`s with Gobblin-on-Temporal
> ------------------------------------------------------------------------------
>
> Key: GOBBLIN-2188
> URL: https://issues.apache.org/jira/browse/GOBBLIN-2188
> Project: Apache Gobblin
> Issue Type: New Feature
> Components: gobblin-api
> Reporter: Kip Kohn
> Assignee: Hung Tran
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Stateful writer/converter `Initializer`s, such as `JdbcWriterInitializer`,
> work fine with Gobblin-on-MR (GoMR), but are disrupted by Gobblin-on-Temporal
> (GoT). While GoMR does launch an MR application, the remainder of the
> `MRJobLauncher` execution stays within the same process. `Initializer`s must
> execute at the end of Work Discovery, before `WorkUnit` processing may begin,
> but are `.close()`d only after Job Commit completes. Crucially, with GoMR, the
> same `Initializer` instances remain in memory throughout. With GoT, in
> contrast, Work Discovery and Commit execute completely independently,
> creating new objects/instances, perhaps even on a new host/container.
>
> Problem: Some `Initializer`s, such as the `JdbcWriter`'s
> `JdbcWriterInitializer`, are stateful. (In its case, the state is the
> temp/staging table's name, kept so that the table may be dropped upon
> successful Commit.) This state originates during Work Discovery (the
> `GenerateWorkUnitsImpl` activity in GoT) yet must be available during Commit
> (the `CommitActivityImpl` in GoT).
>
> Solution: Use the Memento (GoF) pattern to enable `Initializer`s to convey
> arbitrary state from one concrete `Initializer` instance to another of the
> same type. Leverage the `JobState` to tunnel mementos, since it is
> serialized at the end of Work Discovery and loaded again as the Commit
> activity begins.
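
The proposed solution can be pictured with a minimal sketch of the Memento hand-off. Everything named here is a hypothetical stand-in and not Gobblin's actual API: `StagingTableInitializer`, `saveMemento`, `restoreMemento`, and `MEMENTO_KEY` are invented for illustration, and a plain `Map` stands in for the serialized `JobState`. The sketch assumes only what the description states: state is created during Work Discovery, tunneled through the job state, and restored into a fresh instance at Commit.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: every type here is a hypothetical stand-in, not a Gobblin class. */
class MementoSketch {

  /** Hypothetical key under which the memento is tunneled through the job state. */
  static final String MEMENTO_KEY = "sketch.initializer.memento";

  /** Hypothetical stand-in for a stateful Initializer. */
  static class StagingTableInitializer {
    private String stagingTable;  // state created during Work Discovery
    private String droppedTable;  // records what close() cleaned up

    /** Runs at the end of Work Discovery: creates the state. */
    void initialize(String jobId) {
      stagingTable = "stage_" + jobId;
    }

    /** Memento out: capture the state in a form the job state can carry. */
    String saveMemento() {
      return stagingTable;
    }

    /** Memento in: restore the state into a fresh instance. */
    void restoreMemento(String memento) {
      stagingTable = memento;
    }

    /** Runs after Commit: needs the staging table name in order to clean up. */
    void close() {
      droppedTable = stagingTable;
      stagingTable = null;
    }

    String droppedTable() {
      return droppedTable;
    }
  }

  /** Work Discovery side: initialize, then stash the memento in the job state. */
  static Map<String, String> workDiscovery(String jobId) {
    StagingTableInitializer init = new StagingTableInitializer();
    init.initialize(jobId);
    Map<String, String> jobState = new HashMap<>();  // stands in for the serialized JobState
    jobState.put(MEMENTO_KEY, init.saveMemento());
    return jobState;
  }

  /** Commit side: a new instance, possibly in another process, restores and closes. */
  static String commit(Map<String, String> jobState) {
    StagingTableInitializer init = new StagingTableInitializer();
    init.restoreMemento(jobState.get(MEMENTO_KEY));
    init.close();
    return init.droppedTable();
  }

  public static void main(String[] args) {
    Map<String, String> jobState = workDiscovery("job42");
    System.out.println(commit(jobState));  // prints "stage_job42"
  }
}
```

The two static methods never share an `Initializer` instance, mirroring the independence of the two GoT activities; only the map entry crosses the boundary. A real memento would likely need a serializable, type-checked representation instead of a bare string.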