[ 
https://issues.apache.org/jira/browse/GOBBLIN-1543?focusedWorklogId=650271&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-650271
 ]

ASF GitHub Bot logged work on GOBBLIN-1543:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Sep/21 23:39
            Start Date: 13/Sep/21 23:39
    Worklog Time Spent: 10m 
      Work Description: aplex commented on a change in pull request #3394:
URL: https://github.com/apache/gobblin/pull/3394#discussion_r707791587



##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/AbstractJobLauncher.java
##########
@@ -663,7 +663,7 @@ public void apply(JobListener jobListener, JobContext jobContext)
   @VisibleForTesting
   public static long sumWorkUnitsSizes (WorkUnitStream workUnitStream) {
     Collection<WorkUnit> workUnits = JobLauncherUtils.flattenWorkUnits(workUnitStream.getMaterializedWorkUnitCollection());
-    long totalSizeInBytes = workUnits.stream().mapToLong(wu -> wu.getPropAsLong(ServiceConfigKeys.WORK_UNIT_SIZE)).sum();
+    long totalSizeInBytes = workUnits.stream().mapToLong(wu -> wu.getPropAsLong(ServiceConfigKeys.WORK_UNIT_SIZE, 1)).sum();

Review comment:
       I think we should fix this division by zero where it occurs. The job can 
still copy 0 bytes, for example when it copies empty files, so a default of 1 
here does not remove the problem.
   
   In the place where we calculate progress, can we fall back to counting the 
work units if the total size is zero? The variable is named total size in bytes, 
and it will confuse future developers if we put a work unit count there.
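
A minimal sketch of the fallback suggested above, written as a standalone 
helper. The class name, method name, and parameters are hypothetical and are 
not taken from the Gobblin code base; the actual progress calculation is not 
shown in this thread, so treat this only as an illustration of guarding the 
division at the point where it happens.

```java
/** Hypothetical helper; not part of Gobblin. */
final class ProgressSketch {
    private ProgressSketch() {}

    /**
     * Returns job completion as a fraction in [0, 1].
     * Bytes are used when the total size is positive; otherwise the
     * calculation falls back to work unit counts, so a job that copies
     * 0 bytes (for example, only empty files) never divides by zero.
     */
    public static double progressFraction(long completedBytes, long totalBytes,
                                          int completedWorkUnits, int totalWorkUnits) {
        if (totalBytes > 0) {
            return Math.min(1.0, (double) completedBytes / totalBytes);
        }
        if (totalWorkUnits > 0) {
            return Math.min(1.0, (double) completedWorkUnits / totalWorkUnits);
        }
        // No bytes and no work units: assume there is nothing left to do.
        return 1.0;
    }

    public static void main(String[] args) {
        // Size is known: progress is byte-based.
        System.out.println(progressFraction(50, 200, 1, 4));
        // Total size is zero: progress is based on work unit counts.
        System.out.println(progressFraction(0, 0, 3, 4));
    }
}
```

With this shape, `totalSizeInBytes` can keep meaning bytes, and the work unit 
count only appears in the branch that needs it.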
   






Issue Time Tracking
-------------------

    Worklog Id:     (was: 650271)
    Time Spent: 1h 20m  (was: 1h 10m)

> Set Default for Work Unit Size to Prevent Exception
> ---------------------------------------------------
>
>                 Key: GOBBLIN-1543
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1543
>             Project: Apache Gobblin
>          Issue Type: Improvement
>          Components: gobblin-service
>            Reporter: Urmi Mustafi
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Found an error when summing work unit sizes for retention jobs whose work 
> units have no size associated with them.
>  
> 02-09-2021 14:39:52 PDT gobblin-ivy_gobblin-ivy-download INFO - ERROR Failed 
> to launch and run job 
> job_ktwo_k2-cdp-company-features_DateTimeVerFinderBasedRetention_holdem_lasso_multi-hops-1-selfserve-timeaware-file-based-gen2-copy-azure_-809735922_1630618790955
>  due tonull: java.lang.NumberFormatException: null
> ...
> org.apache.gobblin.runtime.AbstractJobLauncher.lambda$sumWorkUnitsSizes$1(AbstractJobLauncher.java:666)
> 02-09-2021 14:39:52 PDT gobblin-ivy_gobblin-ivy-download INFO -       at 
> org.apache.gobblin.runtime.AbstractJobLauncher.sumWorkUnitsSizes(AbstractJobLauncher.java:666)
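
The trace above is consistent with parsing a missing property as a long. The 
sketch below reproduces that failure mode and the default-value fix using 
`java.util.Properties` as a stand-in for a work unit. The class, the method 
names, and the property key are hypothetical and chosen only for illustration; 
they are not Gobblin's `WorkUnit` API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Hypothetical stand-in for summing work unit sizes; not Gobblin code. */
final class WorkUnitSizeSketch {
    // Hypothetical key name, used only in this example.
    public static final String SIZE_KEY = "example.work.unit.size";

    private WorkUnitSizeSketch() {}

    /** Throws NumberFormatException when any unit lacks the size property. */
    public static long sumStrict(List<Properties> units) {
        return units.stream()
            .mapToLong(p -> Long.parseLong(p.getProperty(SIZE_KEY)))
            .sum();
    }

    /** Substitutes defaultSize for units that have no size property. */
    public static long sumWithDefault(List<Properties> units, long defaultSize) {
        return units.stream()
            .mapToLong(p -> Long.parseLong(p.getProperty(SIZE_KEY, Long.toString(defaultSize))))
            .sum();
    }

    public static void main(String[] args) {
        Properties sized = new Properties();
        sized.setProperty(SIZE_KEY, "100");
        Properties unsized = new Properties(); // e.g. a retention work unit with no size

        List<Properties> units = Arrays.asList(sized, unsized);

        System.out.println(sumWithDefault(units, 0L));
        try {
            sumStrict(units);
        } catch (NumberFormatException e) {
            System.out.println("strict sum failed: missing size property");
        }
    }
}
```

Which default to use is a separate design choice: a default of 0 keeps the sum 
in bytes, while a default of 1 turns the sum into a work unit count for jobs 
that carry no sizes.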



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
