[ https://issues.apache.org/jira/browse/GOBBLIN-1896 ]
Andy Jiang deleted comment on GOBBLIN-1896:
-------------------------------------
was (Author: JIRAUSER293972):
Problem:
The audit map returned from the Kafka Audit URL can be non-empty, because at
least one Kafka tier has counts, while some or all of the source tiers and some
or all of the reference tiers report a count of 0. When a source tier count of
0 is compared with a reference tier count of 0, the percentage completeness
calculation becomes 0/0, which evaluates to NaN (not a number). Because the
percentage is NaN, the completeness check for that completeness type returns
false, and the watermark does not progress for that hour.
The fix:
If the source tier reports 0 and the corresponding reference tier also reports
0, that hour can be treated as complete, because no records were expected and
none were consumed. The completeness check for that completeness type then
returns true for the hour, and the watermark can move forward.
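The behaviour described above can be illustrated with a minimal Java sketch. It
is not the Gobblin implementation: the class name, method names, and threshold
parameter are hypothetical, and it assumes the percentage is computed as the
source count divided by the reference count.

```java
// Hypothetical sketch; the names below are illustrative and not taken from Gobblin.
final class CompletenessSketch {
    private CompletenessSketch() {}

    // Unguarded check: with both counts at 0 the division is 0.0 / 0.0,
    // which is NaN, and any comparison against NaN is false.
    static boolean naiveIsComplete(long sourceCount, long referenceCount, double threshold) {
        double percent = (double) sourceCount / (double) referenceCount;
        return percent >= threshold;
    }

    // Guarded check: 0 expected and 0 consumed is treated as complete.
    static boolean isComplete(long sourceCount, long referenceCount, double threshold) {
        if (sourceCount == 0 && referenceCount == 0) {
            return true;
        }
        double percent = (double) sourceCount / (double) referenceCount;
        return percent >= threshold;
    }
}
```

With a threshold of 0.99, naiveIsComplete(0, 0, 0.99) returns false and would
hold the watermark back, while isComplete(0, 0, 0.99) returns true. All other
count combinations behave the same in both versions.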
> Classic and Total Watermark do not update if both the reference and source
> tier counts are 0
> --------------------------------------------------------------------------------------------
>
> Key: GOBBLIN-1896
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1896
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Andy Jiang
> Priority: Major
>
> Problem:
> The audit map returned from the Kafka Audit URL can be non-empty, because at
> least one Kafka tier has counts, while some or all of the source tiers and
> some or all of the reference tiers report a count of 0. When a source tier
> count of 0 is compared with a reference tier count of 0, the percentage
> completeness calculation becomes 0/0, which evaluates to NaN (not a number).
> Because the percentage is NaN, the completeness check for that completeness
> type returns false, and the watermark does not progress for that hour.
> The fix:
> If the source tier reports 0 and the corresponding reference tier also
> reports 0, that hour can be treated as complete, because no records were
> expected and none were consumed. The completeness check for that
> completeness type then returns true for the hour, and the watermark can move
> forward.