kriko opened a new issue, #3634:
URL: https://github.com/apache/hop/issues/3634

   ### What would you like to happen?
   
   When orchestrating Apache Hop in a containerized environment (for example, 
running the Hop Docker container in Kubernetes), it is common practice to pass 
sensitive parameters (_e.g. credentials, username, password, etc._) and 
environment-specific parameters (_database URL, schema name, etc._) through 
environment variables.
   
   Best practice is to use **environment variables**, since they can be set 
separately and managed easily at the orchestration level.
   
   Currently, Apache Hop can use environment variables that are available in 
the shell, but only if each one is redefined, with its value, in the 
`HOP_RUN_PARAMETERS` environment variable. Only then can the variables 
referenced in the project configuration resolve to those values.
   
   This makes orchestration cumbersome, because all environment variables must 
be defined in `HOP_RUN_PARAMETERS` as well.
   
   One workaround that some software solutions use is to have a predefined 
prefix for environment variables: every environment variable whose name matches 
that prefix is automatically forwarded to the application. Kubernetes supports 
this, with ConfigMap and Secret sources both having the "prefix" option.
   
   For our solution we had to implement a way for the Hop container to find all 
environment variables referenced in the configuration files and then, on 
container start, run a shell script that defines `HOP_RUN_PARAMETERS` on the 
fly, using the variable names found in the configuration files and the values 
found in the shell environment.
   
   Another approach for us would have been to pass all environment variables 
defined in the shell to `HOP_RUN_PARAMETERS`, but that is not ideal, because 
Hop would then have access to other environment variables that it perhaps 
should not be able to see.
   
   **Talked about this in [Hop Mattermost chat 
channel](https://chat.project-hop.org/hop/pl/t8oxyt1cnpdg3xnrdyem1mkgbr).**
   
   **A solution or a workaround to this problem can be found here: 
[https://gist.github.com/kriko/7267b91ff18eebdcd1a456921f0f2fd9](https://gist.github.com/kriko/7267b91ff18eebdcd1a456921f0f2fd9)**
   
   ### Apache Hop run-with-parameters.sh
   The entrypoint of this image is `run-with-parameters.sh`.
   
   When launched, the following files will be processed:
   1. `$HOP_PROJECT_FOLDER/$HOP_PROJECT_CONFIG_FILE_NAME` - [by default 
/files/project-config.json](https://hop.apache.org/tech-manual/latest/docker-container.html#_environment_variables)
   2. `$HOP_ENVIRONMENT_CONFIG_FILE_NAME_PATHS` - comma separated list of 
[environment config 
files](https://hop.apache.org/tech-manual/latest/docker-container.html#_environment_variables)
   
   All environment variables (written as either `$VARIABLE` or `${VARIABLE}`) 
found in those files will be assigned to the `HOP_RUN_PARAMETERS` environment 
variable as a list of keys and values.
   
   So for example, if `/files/project-config.json` contains environment 
variables:
   - `$MY_ENVIRONMENT_VARIABLE` which corresponds to the value `MY_VALUE`
   - `${OTHER_ENV}` which corresponds to the value `OTHER VALUE`
   
   Then the environment variable `HOP_RUN_PARAMETERS` will be assigned the 
value of: `MY_ENVIRONMENT_VARIABLE="MY_VALUE",OTHER_ENV="OTHER VALUE"`.
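
The behaviour described above can be sketched in a few lines. This is a minimal illustration in Python, not the shell script from the gist: the function names `find_variable_names` and `build_run_parameters` are hypothetical, the identifier pattern is an assumption, and skipping variables that are not set in the environment is also an assumption.

```python
import re

# Matches $NAME or ${NAME}. The allowed identifier characters are an assumption.
_VAR_PATTERN = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))"
)


def find_variable_names(text):
    """Return the variable names referenced in text, in first-seen order."""
    names = []
    for braced, bare in _VAR_PATTERN.findall(text):
        name = braced or bare
        if name not in names:
            names.append(name)
    return names


def build_run_parameters(config_texts, environ):
    """Build a KEY="VALUE",KEY2="VALUE 2" string from the config file contents.

    Names without a value in environ are skipped (an assumption).
    Values are not escaped, so quotes or commas in a value would need extra care.
    """
    names = []
    for text in config_texts:
        for name in find_variable_names(text):
            if name not in names:
                names.append(name)
    return ",".join(f'{name}="{environ[name]}"' for name in names if name in environ)


if __name__ == "__main__":
    config = '{"url": "$MY_ENVIRONMENT_VARIABLE", "schema": "${OTHER_ENV}"}'
    env = {"MY_ENVIRONMENT_VARIABLE": "MY_VALUE", "OTHER_ENV": "OTHER VALUE"}
    print(build_run_parameters([config], env))
    # prints: MY_ENVIRONMENT_VARIABLE="MY_VALUE",OTHER_ENV="OTHER VALUE"
```

In a container, `environ` would be `os.environ` and `config_texts` the contents of the project and environment config files listed above.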
   
   ### A preferred approach
   A better approach would be to drop the intermediate shell script and give 
Apache Hop direct access to environment variables through a configuration 
option (perhaps an environment variable itself) that defines a prefix.
   
   For example, when:
   
   - `HOP_ENV_PREFIX` is set to `ETL_` then all environment variables beginning 
with `ETL_` are directly accessible by the Hop job.
   - `HOP_ENV_PREFIX` is set to `` (an empty string, so ENV is defined, but 
empty) then all environment variables in the shell environment are directly 
accessible by the Hop job.
   - `HOP_ENV_PREFIX` is not set at all then Hop behaves as it does today, 
with no access to variables other than those explicitly defined in 
`HOP_RUN_PARAMETERS`.
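
The three cases above could be expressed as a small selection function. This is only a sketch of the proposed semantics in Python: `HOP_ENV_PREFIX` is the option suggested in this issue and does not exist in Hop today, and the function name `visible_variables` is hypothetical.

```python
def visible_variables(environ, prefix_var="HOP_ENV_PREFIX"):
    """Return the environment variables Hop would see under the proposal.

    - prefix variable unset: nothing extra is exposed (current behaviour,
      only HOP_RUN_PARAMETERS applies)
    - prefix variable set to "": every environment variable is exposed
    - prefix variable set to e.g. "ETL_": only names starting with "ETL_"
    Variable names are kept unchanged, including the prefix.
    """
    if prefix_var not in environ:
        return {}
    prefix = environ[prefix_var]
    return {name: value for name, value in environ.items() if name.startswith(prefix)}
```

Distinguishing "unset" from "set but empty" is what lets the default stay backwards compatible while still offering an explicit opt-in to the whole environment.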
   
   
   ### Issue Priority
   
   Priority: 3
   
   ### Issue Component
   
   Component: Hop Run


