[ https://issues.apache.org/jira/browse/GOBBLIN-2054?focusedWorklogId=916426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-916426 ]
ASF GitHub Bot logged work on GOBBLIN-2054:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 25/Apr/24 15:28
Start Date: 25/Apr/24 15:28
Worklog Time Spent: 10m
Work Description: Will-Lo merged PR #3934:
URL: https://github.com/apache/gobblin/pull/3934
Issue Time Tracking
-------------------
Worklog Id: (was: 916426)
Time Spent: 1h 50m (was: 1h 40m)
> `CommitActivityImpl` fails for job types (sources) other than Iceberg-Distcp
> ----------------------------------------------------------------------------
>
> Key: GOBBLIN-2054
> URL: https://issues.apache.org/jira/browse/GOBBLIN-2054
> Project: Apache Gobblin
> Issue Type: New Feature
> Components: gobblin-core
> Reporter: Kip Kohn
> Assignee: Abhishek Tiwari
> Priority: Major
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> gobblin-on-temporal execution has been failing for job types other than
> iceberg-distcp (which uses `CopySource`). In particular, commit fails with:
> {code}
> java.lang.IllegalArgumentException: Missing required property writer.output.dir
>     at com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>     at org.apache.gobblin.util.WriterUtils.getWriterOutputDir(WriterUtils.java:121)
>     at org.apache.gobblin.publisher.BaseDataPublisher.publishData(BaseDataPublisher.java:390)
>     at org.apache.gobblin.publisher.BaseDataPublisher.publishMultiTaskData(BaseDataPublisher.java:379)
>     at org.apache.gobblin.publisher.BaseDataPublisher.publishData(BaseDataPublisher.java:366)
>     at org.apache.gobblin.publisher.DataPublisher.publish(DataPublisher.java:81)
>     at org.apache.gobblin.runtime.SafeDatasetCommit.commitDataset(SafeDatasetCommit.java:260)
>     at org.apache.gobblin.runtime.SafeDatasetCommit.call(SafeDatasetCommit.java:168)
>     at org.apache.gobblin.runtime.SafeDatasetCommit.call(SafeDatasetCommit.java:64)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at org.apache.gobblin.util.executors.MDCPropagatingRunnable.run(MDCPropagatingRunnable.java:39)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
> This is odd because that same property had already been used prior to commit,
> while processing the `WorkUnit`. Moreover, logging shows it to be present
> within the `JobState`.
> Even when using a private build that hard-coded that property, this
> later error arises:
> {code}
> Caused by: java.lang.IllegalArgumentException: Can not create a Path from a null string
>     at org.apache.hadoop.fs.Path.checkPathArg(Path.java:159)
>     at org.apache.hadoop.fs.Path.<init>(Path.java:175)
>     at org.apache.hadoop.fs.Path.<init>(Path.java:110)
>     at org.apache.gobblin.runtime.FsDatasetStateStore.sanitizeDatasetStatestoreNameFromDatasetURN(FsDatasetStateStore.java:175)
>     at org.apache.gobblin.runtime.FsDatasetStateStore.persistDatasetState(FsDatasetStateStore.java:386)
>     at org.apache.gobblin.runtime.FsDatasetStateStore.persistDatasetState(FsDatasetStateStore.java:90)
>     at org.apache.gobblin.runtime.SafeDatasetCommit.persistDatasetState(SafeDatasetCommit.java:418)
>     at org.apache.gobblin.runtime.SafeDatasetCommit.call(SafeDatasetCommit.java:191)
>     ... 8 more
> {code}
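
For readers unfamiliar with the first failure, here is a minimal sketch of the kind of required-property guard that yields the "Missing required property writer.output.dir" message. It is not Gobblin's code: `RequiredPropertySketch`, `getRequired`, and the `commitState` object are hypothetical names, plain `java.util.Properties` stands in for Gobblin's state classes, and a hand-written check stands in for Guava's `Preconditions.checkArgument`. The idea that the commit step sees a different state object than the one logged is an assumption made only for illustration, not a confirmed root cause.

```java
import java.util.Properties;

public class RequiredPropertySketch {
    static final String WRITER_OUTPUT_DIR = "writer.output.dir";

    // Hypothetical stand-in for a checkArgument-style guard:
    // throws IllegalArgumentException when the key is absent.
    static String getRequired(Properties state, String key) {
        if (!state.containsKey(key)) {
            throw new IllegalArgumentException("Missing required property " + key);
        }
        return state.getProperty(key);
    }

    public static void main(String[] args) {
        // State that carries the property (analogous to what the logs show).
        Properties jobState = new Properties();
        jobState.setProperty(WRITER_OUTPUT_DIR, "/tmp/output");

        // Assumption for illustration only: the state handed to the
        // commit step does not carry the same property.
        Properties commitState = new Properties();

        System.out.println(getRequired(jobState, WRITER_OUTPUT_DIR));
        try {
            getRequired(commitState, WRITER_OUTPUT_DIR);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only that the guard inspects whichever state object it is given, so a property visible in one object's logging does not guarantee the check passes on another.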