[
https://issues.apache.org/jira/browse/GOBBLIN-1543?focusedWorklogId=650271&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-650271
]
ASF GitHub Bot logged work on GOBBLIN-1543:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 13/Sep/21 23:39
Start Date: 13/Sep/21 23:39
Worklog Time Spent: 10m
Work Description: aplex commented on a change in pull request #3394:
URL: https://github.com/apache/gobblin/pull/3394#discussion_r707791587
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/AbstractJobLauncher.java
##########
@@ -663,7 +663,7 @@ public void apply(JobListener jobListener, JobContext jobContext)
   @VisibleForTesting
   public static long sumWorkUnitsSizes (WorkUnitStream workUnitStream) {
     Collection<WorkUnit> workUnits = JobLauncherUtils.flattenWorkUnits(workUnitStream.getMaterializedWorkUnitCollection());
-    long totalSizeInBytes = workUnits.stream().mapToLong(wu -> wu.getPropAsLong(ServiceConfigKeys.WORK_UNIT_SIZE)).sum();
+    long totalSizeInBytes = workUnits.stream().mapToLong(wu -> wu.getPropAsLong(ServiceConfigKeys.WORK_UNIT_SIZE, 1)).sum();
Review comment:
I think we should fix this division by zero where it occurs. It's still
possible for the job to copy 0 bytes, for example when it copies empty files.
In the place where we calculate progress, can we fall back to counting the
work units if the total size is zero? The variable is named total size in
bytes, so it will confuse future developers if we put a work unit count there.
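
A minimal sketch of the fallback suggested above, not the actual Gobblin
progress code. The `Unit` class, its fields, and the `progress` method are
hypothetical stand-ins for illustration; the real work unit and progress APIs
may look quite different. The idea is to keep the byte total honest (missing
sizes count as 0) and switch to a work unit count only when the total is zero.

```java
import java.util.Arrays;
import java.util.List;

public class ProgressSketch {

    /** Hypothetical stand-in for a work unit; the size may be missing (null). */
    static final class Unit {
        final Long sizeInBytes;   // null when no size was recorded
        final boolean completed;

        Unit(Long sizeInBytes, boolean completed) {
            this.sizeInBytes = sizeInBytes;
            this.completed = completed;
        }
    }

    /** Returns progress in [0, 1], by bytes when possible, else by unit count. */
    static double progress(List<Unit> units) {
        if (units.isEmpty()) {
            return 1.0; // assumption: a job with no work units is treated as done
        }
        long totalBytes = 0;
        long doneBytes = 0;
        int doneCount = 0;
        for (Unit u : units) {
            long size = (u.sizeInBytes == null) ? 0L : u.sizeInBytes;
            totalBytes += size;
            if (u.completed) {
                doneBytes += size;
                doneCount++;
            }
        }
        if (totalBytes > 0) {
            return (double) doneBytes / totalBytes;
        }
        // Total size is zero (empty files or no sizes): fall back to counting units.
        return (double) doneCount / units.size();
    }

    public static void main(String[] args) {
        List<Unit> sized = Arrays.asList(new Unit(100L, true), new Unit(300L, false));
        List<Unit> unsized = Arrays.asList(new Unit(null, true), new Unit(0L, false));
        System.out.println("by bytes: " + progress(sized));
        System.out.println("by count: " + progress(unsized));
    }
}
```

With this shape the division happens only after the divisor has been checked,
and `totalBytes` never holds anything other than a byte count.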
Issue Time Tracking
-------------------
Worklog Id: (was: 650271)
Time Spent: 1h 20m (was: 1h 10m)
> Set Default for Work Unit Size to Prevent Exception
> ---------------------------------------------------
>
> Key: GOBBLIN-1543
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1543
> Project: Apache Gobblin
> Issue Type: Improvement
> Components: gobblin-service
> Reporter: Urmi Mustafi
> Assignee: Abhishek Tiwari
> Priority: Major
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Found an error when summing work unit sizes for retention jobs whose work
> units do not have a size associated with them
>
> 02-09-2021 14:39:52 PDT gobblin-ivy_gobblin-ivy-download INFO - ERROR Failed to launch and run job job_ktwo_k2-cdp-company-features_DateTimeVerFinderBasedRetention_holdem_lasso_multi-hops-1-selfserve-timeaware-file-based-gen2-copy-azure_-809735922_1630618790955 due tonull: java.lang.NumberFormatException: null
> ...
> org.apache.gobblin.runtime.AbstractJobLauncher.lambda$sumWorkUnitsSizes$1(AbstractJobLauncher.java:666)
> 02-09-2021 14:39:52 PDT gobblin-ivy_gobblin-ivy-download INFO - at org.apache.gobblin.runtime.AbstractJobLauncher.sumWorkUnitsSizes(AbstractJobLauncher.java:666)
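
The trace above is consistent with a missing property being handed straight to
a number parser. The sketch below reproduces that failure mode with a plain
`Map`, under the assumption (not confirmed by this message) that the
one-argument lookup parses the raw string without a null check. The class,
method names, and the `work.unit.size` key are hypothetical and are not
Gobblin's API.

```java
import java.util.HashMap;
import java.util.Map;

public class SizeLookupSketch {

    /** Parses the raw value; a missing key yields null and parseLong rejects it. */
    static long getAsLong(Map<String, String> props, String key) {
        return Long.parseLong(props.get(key)); // NumberFormatException when the key is absent
    }

    /** Same lookup, but returns the caller-supplied default when the key is absent. */
    static long getAsLong(Map<String, String> props, String key, long defaultValue) {
        String raw = props.get(key);
        return (raw == null) ? defaultValue : Long.parseLong(raw);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>(); // no size recorded, as for a retention job
        System.out.println(getAsLong(props, "work.unit.size", 0L));
        try {
            getAsLong(props, "work.unit.size");
        } catch (NumberFormatException e) {
            System.out.println("missing size: " + e);
        }
    }
}
```

A default of 0 keeps the sum a true byte count, which then requires a zero
check wherever the total is used as a divisor.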