[
https://issues.apache.org/jira/browse/GOBBLIN-1906?focusedWorklogId=880265&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-880265
]
ASF GitHub Bot logged work on GOBBLIN-1906:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 13/Sep/23 20:47
Start Date: 13/Sep/23 20:47
Worklog Time Spent: 10m
Work Description: phet commented on code in PR #3770:
URL: https://github.com/apache/gobblin/pull/3770#discussion_r1325062631
##########
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/mapreduce/MRJobLauncher.java:
##########
@@ -316,7 +316,8 @@ protected void runWorkUnits(List<WorkUnit> workUnits)
throws Exception {
prepareHadoopJob(workUnits);
if (this.shouldPersistWorkUnitsThenCancel) {
- LOG.info("Cancelling job after persisting workunits beneath: " + this.jobInputPath);
+ // NOTE: `warn` level is hack for including path among automatic troubleshooter 'issues'
+ LOG.warn("Cancelling job after persisting workunits beneath: " + this.jobInputPath);
Review Comment:
note: this trivial change is unrelated
Issue Time Tracking
-------------------
Worklog Id: (was: 880265)
Remaining Estimate: 0h
Time Spent: 10m
> protect against nulls when converting `State` to a `hadoop.conf.Configuration`
> ------------------------------------------------------------------------------
>
> Key: GOBBLIN-1906
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1906
> Project: Apache Gobblin
> Issue Type: Bug
> Components: gobblin-core
> Reporter: Kip Kohn
> Assignee: Abhishek Tiwari
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> A customer reported seeing:
> {code:java}
> Error: java.io.IOException: Task failed: java.lang.IllegalArgumentException:
> The value of property <<redacted>> must not be null
> at
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:146)
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1260)
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1241)
> at
> org.apache.gobblin.util.JobConfigurationUtils.putStateIntoConfiguration(JobConfigurationUtils.java:95)
> at org.apache.gobblin.writer.FsDataWriter.<init>(FsDataWriter.java:102)
> at org.apache.gobblin.writer.GobblinBaseOrcWriter.<init>(GobblinBaseOrcWriter.java:65)
> at
> org.apache.gobblin.writer.GobblinOrcWriter.<init>(GobblinOrcWriter.java:42)
> at <<redacted>>
> at
> org.apache.gobblin.writer.PartitionedDataWriter$4.get(PartitionedDataWriter.java:230)
> at
> org.apache.gobblin.writer.PartitionedDataWriter$4.get(PartitionedDataWriter.java:225)
> at
> org.apache.gobblin.writer.CloseOnFlushWriterWrapper.<init>(CloseOnFlushWriterWrapper.java:73)
> at
> org.apache.gobblin.writer.PartitionedDataWriter.<init>(PartitionedDataWriter.java:224)
> at org.apache.gobblin.runtime.fork.Fork.buildWriter(Fork.java:571)
> at
> org.apache.gobblin.runtime.fork.Fork.buildWriterIfNotPresent(Fork.java:579)
> at org.apache.gobblin.runtime.fork.Fork.processRecord(Fork.java:525)
> at
> org.apache.gobblin.runtime.fork.AsynchronousFork.processRecord(AsynchronousFork.java:103)
> at
> org.apache.gobblin.runtime.fork.AsynchronousFork.processRecords(AsynchronousFork.java:86)
> at org.apache.gobblin.runtime.fork.Fork.run(Fork.java:257)
> at
> org.apache.gobblin.util.executors.MDCPropagatingRunnable.run(MDCPropagatingRunnable.java:39)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
> at
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
> at
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748) (Gobblin task id <<redacted>>,
> container id attempt_1690893552521_3376012_m_000111_0)
> at
> org.apache.gobblin.runtime.GobblinMultiTaskAttempt.persistTaskStateStore(GobblinMultiTaskAttempt.java:367)
> ... {code}
> This appears to arise from concurrent modification of the `State`'s underlying
> `Properties` (i.e. between the time the `keySet()` is first read and when
> each value is accessed from the same `Properties`).
> Although the customer's impl seems to warrant synchronization, given that a
> null value is certain to be rejected by `o.a.hadoop.conf.Configuration`,
> we should defensively filter those out ahead of time.
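A minimal sketch of the defensive filtering described above, not the actual Gobblin patch. The class `NullSafeStateCopy` and the methods `strictSet` and `copyNonNull` are hypothetical names, and a plain `Map` stands in for `o.a.hadoop.conf.Configuration` so the example runs without Hadoop on the classpath. The stand-in setter only mimics the null rejection seen in the stack trace.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.Set;

public class NullSafeStateCopy {

  /** Hypothetical stand-in for Configuration.set(): rejects null values, as in the stack trace. */
  static void strictSet(Map<String, String> conf, String key, String value) {
    if (value == null) {
      throw new IllegalArgumentException("The value of property " + key + " must not be null");
    }
    conf.put(key, value);
  }

  /**
   * Copies each key of a previously taken key snapshot into the target,
   * skipping any key whose value has since become null (e.g. removed by another thread).
   */
  static Map<String, String> copyNonNull(Set<String> keySnapshot, Properties props) {
    Map<String, String> conf = new LinkedHashMap<>();
    for (String key : keySnapshot) {
      // Read the value exactly once; it may be null if the key vanished after the snapshot.
      String value = props.getProperty(key);
      if (value != null) {
        strictSet(conf, key, value);
      }
    }
    return conf;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("a", "1");
    props.setProperty("b", "2");

    Set<String> keys = props.stringPropertyNames(); // snapshot of the keys
    props.remove("b");                              // simulates a concurrent removal

    // Without the null check, strictSet would throw for "b".
    System.out.println(copyNonNull(keys, props));   // prints {a=1}
  }
}
```

Filtering only avoids the `IllegalArgumentException`; it does not make the copy consistent, so a writer that mutates the `Properties` concurrently would still need its own synchronization.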
--
This message was sent by Atlassian Jira
(v8.20.10#820010)