[ 
https://issues.apache.org/jira/browse/SLING-11181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

José Correia updated SLING-11181:
---------------------------------
    Description: 
h3. Context

Currently, our error metrics don't distinguish between distribution failures 
that are permanent (they fail even when retried) and failures that are 
transient (they succeed after being retried).
We want to improve the metrics so that the two scenarios can be told apart.
h3. Solution

The failure metric should carry one of two labels:
 * {{Transient failure}}
 * {{Permanent failure}}

h3. Proposed approach

We can tell the two scenarios apart using the following rule:
 * A transient failure is a package that was distributed successfully but 
needed more than one attempt: {{retries > 0}}
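
As a minimal sketch of the rule above, the check could look like this in Java. The class and method names are hypothetical and are not part of any existing Sling API. The sketch only covers the transient case, since the rule for permanent failures is not spelled out here.

```java
/**
 * Hypothetical helper illustrating the proposed rule.
 * None of these names come from Sling Content Distribution.
 */
final class FailureClassifier {

    private FailureClassifier() { }

    /**
     * A failure counts as transient when the package was eventually
     * distributed successfully but needed at least one retry.
     *
     * @param distributed whether the package was distributed successfully
     * @param retries     number of retries before the final outcome (assumed >= 0)
     */
    static boolean isTransient(boolean distributed, int retries) {
        if (retries < 0) {
            throw new IllegalArgumentException("retries must be >= 0");
        }
        return distributed && retries > 0;
    }
}
```

A package that succeeds on the first attempt ({{retries == 0}}) is not counted as a failure at all under this rule.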

 

  was:
h3. Context


Currently, our error metrics don't distinguish between distribution failures 
that are permanent and will fail even if retried, or failures that succeed 
after being retried.
We want to improve this in order to be able to differentiate both scenarios.

 
h3. Solution

 

Failure metric should be labeled by:
 * {{Transient failure}}
 * {{Permanent failure}}

 
h3. Proposed approach

 

We can distinguish both these scenarios by using the following rationale:
 * Transient failures happen whenever a package is distributed successfully but 
had more than 1 attempt at being distributed: {{retries > 0}}

 


> Emit metrics that distinguish transient and permanent distribution failures
> ---------------------------------------------------------------------------
>
>                 Key: SLING-11181
>                 URL: https://issues.apache.org/jira/browse/SLING-11181
>             Project: Sling
>          Issue Type: Improvement
>          Components: Content Distribution
>            Reporter: José Correia
>            Priority: Major
>
> h3. Context
> Currently, our error metrics don't distinguish between distribution failures 
> that are permanent (they fail even when retried) and failures that are 
> transient (they succeed after being retried).
> We want to improve the metrics so that the two scenarios can be told apart.
> h3. Solution
> The failure metric should carry one of two labels:
>  * {{Transient failure}}
>  * {{Permanent failure}}
> h3. Proposed approach
> We can tell the two scenarios apart using the following rule:
>  * A transient failure is a package that was distributed successfully 
> but needed more than one attempt: {{retries > 0}}
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
