[ 
https://issues.apache.org/jira/browse/KNOX-3058?focusedWorklogId=931031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-931031
 ]

ASF GitHub Bot logged work on KNOX-3058:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Aug/24 21:45
            Start Date: 20/Aug/24 21:45
    Worklog Time Spent: 10m 
      Work Description: pzampino opened a new pull request, #929:
URL: https://github.com/apache/knox/pull/929

   ## What changes were proposed in this pull request?
   
   Modified error handling when a topology is being redeployed, such that the 
response is not HTTP 404 'Not Found', but rather HTTP 503 'Service 
Unavailable'. The 503 response is much more likely to be retried by clients 
than is a 404 response.
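
As one illustration of the retry argument: curl's `--retry` option documents HTTP 408, 429, 500, 502, 503 and 504 as transient errors worth retrying, and 404 is not in that list. The sketch below only mirrors that classification. `is_transient_status` is a hypothetical helper name, not part of Knox or of this pull request, and other clients may apply different rules.

```shell
# Hypothetical helper (not part of Knox or this PR).
# Succeeds (exit status 0) when the given HTTP status code is one that
# curl's --retry option treats as a transient error; fails otherwise.
is_transient_status() {
  case "$1" in
    408|429|500|502|503|504) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_transient_status 503 && echo retry` prints `retry`, while the same call with 404 prints nothing.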
   
   Added a Jetty ErrorHandler that checks whether or not the topology being 
requested is in a Set marked as inactive. Topology names are added to this 
inactive set when the associated topology is deactivated, and removed from this 
set when the topology is reactivated. In the case of topology deletion, the 
topology is marked as inactive, but then removed from the inactive set because 
we know it's being deleted.
   
   ## How was this patch tested?
   
   I deployed a test topology with a demo LDAP provider and the Knox Token 
service.
   
   I then ran the following script with 
'https://localhost:8443/gateway/demo/knoxtoken/api/v2/token' and one of the 
demo LDAP username/pwd combinations, and piped the output to a file. This 
script outputs only the HTTP response status code for each invocation.
   ```
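   #!/bin/bash
   #
   # Repeatedly request ENDPOINT and print the HTTP status code of each response.
   # Usage: resp-test.sh ENDPOINT [USERNAME] [PASSWORD]
   
   ENDPOINT=$1
   echo "Endpoint: $ENDPOINT"
   
   # Avoid the names USER and PWD: the shell already uses them for the
   # login name and the current working directory.
   if [ -n "$2" ] ; then
     KNOX_USER=$2
   fi
   
   if [ -n "$3" ] ; then
     KNOX_PWD=$3
   fi
   
   # The {1..100000} range is a bash feature, hence the bash shebang above.
   for i in {1..100000}
   do
     curl -o /dev/null -s -w "%{http_code}\n" -ku "${KNOX_USER}:${KNOX_PWD}" "${ENDPOINT}"
   done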
   
   ```
   Example:
   `~/bin/resp-test.sh 
'https://localhost:8443/gateway/demo/knoxtoken/api/v2/token' sam sam-password > 
~/response-code-test.txt &
   `
   While this script was running, I "touched" the test topology to trigger 
redeployment many times over several minutes. Finally, I deleted the test 
topology.
   
   Following this, I reviewed the output to verify that there were no 404 
responses until the point at which I deleted the topology. I also verified the 
periodic 503 responses, which are expected, and the normal 200 responses in 
between.
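
A review like the one described above could be started by tallying the status codes in the captured file. This is a minimal sketch, not the procedure actually used for this patch: `tally_codes` is a hypothetical helper, and it assumes the file holds one three-digit status code per line. Any other line, such as the `Endpoint:` header the script prints first, is skipped.

```shell
# Hypothetical helper (not part of this PR).
# Prints "<status code> <count>" for each distinct three-digit code found
# in the file given as $1, sorted by status code.
tally_codes() {
  awk '/^[0-9][0-9][0-9]$/ { count[$1]++ }
       END { for (code in count) print code, count[code] }' "$1" | sort
}
```

Usage: `tally_codes ~/response-code-test.txt`. The tally gives only totals per code. Confirming that 404 responses appear only after the topology was deleted still requires looking at where they fall in the file.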
   
   




Issue Time Tracking
-------------------

            Worklog Id:     (was: 931031)
    Remaining Estimate: 0h
            Time Spent: 10m

> Avoid 404 When Topology Is Being Redeployed
> -------------------------------------------
>
>                 Key: KNOX-3058
>                 URL: https://issues.apache.org/jira/browse/KNOX-3058
>             Project: Apache Knox
>          Issue Type: Improvement
>          Components: Server
>            Reporter: Philip Zampino
>            Assignee: Philip Zampino
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> While a topology is being redeployed, if it is requested, the client receives 
> an HTTP 404 response. Most clients will not retry when receiving a 404, so 
> the interaction will fail.
> If Knox were to respond with a more retry-friendly response (e.g., HTTP 503), 
> then clients could overcome these small windows of unavailability with 
> retries.
> The difficult part may be distinguishing topology removal from topology 
> inactivity. I think a deleted topology should still result in a 404.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
