gkousiouris opened a new issue, #5359:
URL: https://github.com/apache/openwhisk/issues/5359

   <!--
   We use the issue tracker for bugs and feature requests. For general 
questions and discussion please use https://openwhisk.apache.org/slack.html or 
https://openwhisk.apache.org/community.html instead.
   
   Do NOT share passwords, credentials or other confidential information.
   
   Before creating a new issue, please check if there is one already open that
   fits the defect you are reporting.
   If you open an issue and realize later it is a duplicate of a pre-existing
   open issue, please close yours and add a comment to the other.
   
   Issues can be created for either defects or enhancement requests. If you are 
a committer than please add the labels "bug" or "feature". If you are not a 
committer please make clear in the comments which one it is, so that committers 
can add these labels later.
   
   If you are reporting a defect, please edit the issue description to include 
the
   information shown below.
   
   If you are reporting an enhancement request, please include information on 
what you are trying to achieve and why that enhancement would help you.
   
   For more information about reporting issues, see
   
https://github.com/apache/openwhisk/blob/master/CONTRIBUTING.md#raising-issues
   
   Use the commands below to provide key information from your environment:
   You do not have to include this information if this is a feature request.
   -->
   
   Hi, we have come across a reporting issue/bug(?). So in the case of 
activations of a working custom docker image, occasionally there are some 
errors (reported as developer errors) especially with the indication of failure 
during initialization (no available POST /init endpoint).  This happens mainly 
in cold starts.
   
   ## Environment details:
   
   Openwhisk deployment on top of kubernetes and Openshift, running on AWS VMs
   
   
   ## Steps to reproduce the issue:
   
   1.   Create a docker action and register it (in our case a docker action 
based on the Node-RED server)
   2.   Test it so that it works ok (with proper INIT, RUN endpoints etc.)
   3.   Launch activations with the same input, normally finalizing with 
success, however occasionally some errors (developer error type) occur, 
especially in cold starts that happen after some inactivity time
   
   
   ## Provide the expected results and outputs:
   
   The action should finalize with success, example: 
    
   ```
   {
       "namespace": "guest",
       "name": "Inference_3fff08e8-3617-4eb4-b118-4229f30514e3.json",
       "version": "0.0.6",
       "subject": "guest",
       "activationId": "be308f36967d4613b08f36967df6138b",
       "start": 1658257753690,
       "end": 1658257762621,
       "duration": 8931,
       "statusCode": 0,
       "response": {
           "status": "success",
           "statusCode": 0,
           "success": true,
           "result": {
             },
       "logs": [],
       "annotations": [
           {
               "key": "path",
               "value": 
"guest/Inference_3fff08e8-3617-4eb4-b118-4229f30514e3.json"
           },
           {
               "key": "waitTime",
               "value": 6
           },
           {
               "key": "kind",
               "value": "blackbox"
           },
           {
               "key": "timeout",
               "value": false
           },
           {
               "key": "limits",
               "value": {
                   "concurrency": 1,
                   "logs": 10,
                   "memory": 512,
                   "timeout": 300000
               }
           }
       ],
       "publish": false
   }
   
   ```
   
   
   ## Provide the actual results and outputs:
   
   The server has started ok inside the action container as seen in the logs. 
Given that it happens in cold starts, maybe there is some form of timeout that 
is violated for fetching the image? Does Openwhisk try to /init before the 
server inside the container is up? If yes then this is masked behind a 
misleading error message (developer error) or potentially there should be some 
retry from Openwhisk side?
   
   ```
   {
       "namespace": "guest",
       "name": "Inference_3fff08e8-3617-4eb4-b118-4229f30514e3.json",
       "version": "0.0.6",
       "subject": "guest",
       "activationId": "1040685ebb1e444980685ebb1e64495a",
       "start": 1658257763885,
       "end": 1658257771584,
       "duration": 7699,
       "statusCode": 0,
       "response": {
           "status": "action developer error",
           "statusCode": 0,
           "success": false,
           "result": {
               "error": "The action failed during initialization: <!DOCTYPE 
html>\n<html lang=\"en\">\n<head>\n<meta 
charset=\"utf-8\">\n<title>Error</title>\n</head>\n<body>\n<pre>Cannot POST 
/init</pre>\n</body>\n</html>\n"
           }
       },
       "logs": [
           "2022-07-19T19:09:23.899349481Z stdout: ",
           "2022-07-19T19:09:23.899349481Z stdout: > node-red-docker@2.2.2 
start /usr/src/node-red",
           "2022-07-19T19:09:23.899349481Z stdout: > node $NODE_OPTIONS 
node_modules/node-red/red.js $FLOWS \"--userDir\" \"/data\"",
           "2022-07-19T19:09:23.899349481Z stdout: ",
           "2022-07-19T19:09:26.301015945Z stdout: 19 Jul 19:09:26 - [info] ",
           "2022-07-19T19:09:26.301015945Z stdout: ",
           "2022-07-19T19:09:26.301015945Z stdout: Welcome to Node-RED",
           "2022-07-19T19:09:26.301015945Z stdout: ===================",
           "2022-07-19T19:09:26.301015945Z stdout: ",
           "2022-07-19T19:09:26.302106314Z stdout: 19 Jul 19:09:26 - [info] 
Node-RED version: v2.2.2",
           "2022-07-19T19:09:26.302956696Z stdout: 19 Jul 19:09:26 - [info] 
Node.js  version: v14.19.1",
           "2022-07-19T19:09:26.303139937Z stdout: 19 Jul 19:09:26 - [info] 
Linux 5.14.14-200.fc34.x86_64 x64 LE",
           "2022-07-19T19:09:27.305314368Z stdout: 19 Jul 19:09:27 - [info] 
Loading palette nodes",
           "2022-07-19T19:09:31.219388687Z stdout: 19 Jul 19:09:31 - [info] 
Settings file  : /data/settings.js",
           "2022-07-19T19:09:31.220267452Z stdout: 19 Jul 19:09:31 - [info] 
Context store  : 'default' [module=memory]",
           "2022-07-19T19:09:31.221039641Z stdout: 19 Jul 19:09:31 - [info] 
User directory : /data",
           "2022-07-19T19:09:31.221229505Z stdout: 19 Jul 19:09:31 - [warn] 
Projects disabled : editorTheme.projects.enabled=false",
           "2022-07-19T19:09:31.239239234Z stdout: 19 Jul 19:09:31 - [info] 
Flows file     : /data/flows.json",
           "2022-07-19T19:09:31.340091186Z stdout: 19 Jul 19:09:31 - [info] 
Server now running at http://127.0.0.1:8080/";,
           "2022-07-19T19:09:31.617204409Z stdout: 19 Jul 19:09:31 - [info] 
Starting flows",
           "2022-07-19T19:09:31.709196934Z stdout: 19 Jul 19:09:31 - [info] 
Started flows",
           "2022-07-19T19:09:31.584796934Z stderr: The action did not 
initialize or run as expected. Log data might be missing."
       ],
       "annotations": [
           {
               "key": "path",
               "value": 
"guest/Inference_3fff08e8-3617-4eb4-b118-4229f30514e3.json"
           },
           {
               "key": "waitTime",
               "value": 4199
           },
           {
               "key": "kind",
               "value": "blackbox"
           },
           {
               "key": "timeout",
               "value": false
           },
           {
               "key": "limits",
               "value": {
                   "concurrency": 1,
                   "logs": 10,
                   "memory": 512,
                   "timeout": 300000
               }
           },
           {
               "key": "initTime",
               "value": 7699
           }
       ],
       "publish": false
   }
   ```
   
   ## Additional information you deem important:
   The issue seems to happen always in the first cold start case after a long 
time of inactivity. So maybe this is triggered by longer delays needed for 
fetching more container layers from the registry. 
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@openwhisk.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to