Think of the project as the folder where your pipelines, workflows, metadata, datasets and so on are stored. You can then create one or more environments for your project(s) to make it easy to run the same workflows and pipelines against various servers, folders, hostnames, ...
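A rough sketch of that idea, assuming a Hop client under a hypothetical /opt/hop and a made-up project called my-etl: the project is registered once, and two environments (dev and prod, each with its own hypothetical config file) point at it. The hop-conf.sh switches are the ones used in the script quoted further down in this thread; the sketch only prints the commands so they can be reviewed before being run for real.

```shell
#!/bin/bash
# Dry-run sketch: prints the hop-conf.sh calls instead of executing them.
# All names and paths below are hypothetical placeholders.
HOP_PATH=/opt/hop                 # assumed Hop install location
PROJECT_NAME=my-etl               # hypothetical project name
PROJECT_HOME=/srv/etl/my-etl      # hypothetical project folder

# Build the command that registers one environment for the project.
# $1 = environment name, $2 = purpose, $3 = environment config file
environment_create_cmd() {
  printf '%s\n' "sh ${HOP_PATH}/hop-conf.sh --environment-create --environment $1 --environment-project ${PROJECT_NAME} --environment-purpose=$2 --environment-config-files=$3"
}

# One project ...
echo "sh ${HOP_PATH}/hop-conf.sh --project-create --project ${PROJECT_NAME} --project-home=${PROJECT_HOME}"

# ... and several environments that all point at it.
environment_create_cmd dev  Development "${PROJECT_HOME}/env-dev.json"
environment_create_cmd prod Production  "${PROJECT_HOME}/env-prod.json"
```

Once registered, the same workflow could be started with hop-run.sh using the same --project value and a different --environment value per server.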
The definitions of projects and environments are simply stored in your hop-config.json.

HTH,
Matt

On Thu, 2 Mar 2023 at 12:21, MITTERLECHNER Gerhard <[email protected]> wrote:

> Hi!
>
> Thanks so much for your quick and elaborate answer!
>
> We will have a look at the docker variant in the future for sure.
>
> And we really DO want to use projects and environments, but using the
> “correct” and recommended, best-practice way, and for the moment without
> using containers.
>
> Perhaps our use case itself is non-standard, but in our case, we are a
> couple of developers working on a couple of ETL projects that are
> versioned in a couple of git repositories, and have to execute/test them
> on a couple of different servers with different OS versions, different DB
> versions etc.
>
> When, say, developer A checks out project B, copies/stores it onto server
> C in some folder D and wants to execute it from there with as little
> effort/overhead as possible, we search for the best practice on how to get
> the Apache Hop project to work.
>
> In the ideal case without having to execute the command line commands for
> project/environment creation and so on, but if that’s the best practice,
> we are fine with that as well.
>
> If on the other hand using some single (already pre-existing) “default”
> project and modifying its ${PROJECT_HOME} via the “project-modify” switch
> was the best way for this special kind of use case, we would simply stick
> to our current approach with temporary projects, and then, when the bug
> gets fixed, switch over.
>
> Thanks a lot in advance for any further ideas/suggestions.
>
> Best regards,
>
> Gerhard
>
> *From:* Bart Maertens <[email protected]>
> *Sent:* Wednesday, 1 March 2023 20:09
> *To:* [email protected]
> *Subject:* [EXTERN]Re: Excecution of Hop workflows on a Linux server via
> command line
>
> Hi Gerhard,
>
> Projects and environments were introduced because of the lack of
> flexibility when working with multiple projects, in multiple environments.
>
> Many people and organizations started using their own scripts to
> circumvent those limitations and juggled multiple kettle.properties
> files that were copied, parsed/populated from templates, etc.
>
> Apache Hop aims to provide a more unified and flexible way to deal with
> that.
>
> Your initial approach, modifying the project home for the default project,
> revealed a bug [1] (thanks for reporting it). Once that's fixed, that could
> be a possible approach. Maybe not the best one, but at least it should work.
>
> An alternative approach that could make your life a lot easier is to run
> your workflows and pipelines through docker [2].
>
> Running a pipeline or workflow in the short-lived container would come
> down to more or less a hybrid version of the two approaches you describe:
> you could mount _any_ folder on your server's file system as the project
> folder, and add the necessary environment files.
>
> All of the hop-conf and hop-run commands will be taken care of by the
> container.
>
> If you really, really, really don't want to use projects and environments,
> you can just remove the projects plugin from hop/plugins/misc/projects.
>
> You'll need to find another way to provide environment-specific
> configurations in that case, through variables, parameters, system
> properties ...
>
> Let us know if any of these approaches work for you or if there's anything
> else we can do to help.
>
> [1] https://github.com/apache/hop/issues/2494
>
> [2] https://hop.apache.org/tech-manual/latest/docker-container.html#_how_to_run_the_container
>
> Regards,
>
> Bart
>
> On Wed, Mar 1, 2023 at 8:57 AM MITTERLECHNER Gerhard <[email protected]> wrote:
>
> Hi!
>
> When trying to migrate our PDI/Spoon projects over to Apache Hop, the
> following issue occurred, where the documentation does not give us
> sufficient information.
>
> In PDI/Spoon we simply used bash scripts to start “kitchen.sh”, referencing
> the main kjb job of the ETL project (note, we are using GUI-less Linux
> servers).
>
> The typical workflow is:
>
> - We develop using Spoon (or Apache Hop) on our Windows developer
>   machines, and push the project to a git repository.
> - We then clone (or copy) the git repository as a project into an
>   arbitrary directory on the Linux server and start the project with
>   “kitchen.sh” from there (in fact, we use a wrapper shell script similar
>   to the one below that calls “kitchen.sh”).
>
> Trying to do the same with Hop seems to cause a few more problems:
>
> - a project is needed
> - an environment is needed for reading in the env.-config file(s)
>   containing environment variables
> - so the script that calls “hop-run.sh” (see below; it is stored as part
>   of every project with different variable values) now also uses
>   “hop-conf.sh”
>   - to (delete and) create an environment
>   - to (delete and) create a temporary project in the current file
>     location that is deleted again when the ETL is finished
> - We find this procedure clumsy, and it is probably not the recommended
>   procedure for such a use case.
>
> NOTE: We do not want to have to pre-define/pre-create all potential Hop
> projects for all potential file locations on the Linux server, as we
> have many projects and want to keep the flexibility to run them inside
> arbitrary file locations on the server.
>
> #!/bin/bash
> WORKFLOW_TO_START=test_workflow.hwf
> # strip the last 4 characters (the ".hwf" extension):
> HOP_HWF=${WORKFLOW_TO_START%????}
>
> CURRENT_FOLDER=$(pwd)
> TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
> CURRENT_LOGFILE=${HOP_HWF}_${TIMESTAMP}.log
>
> HOP_PATH=/home/pentaho/hop/apache-hop-client-2.3.0/hop
> PROJECT_PATH=$(pwd)
> PROJECT_NAME=$(basename "$(pwd)")
> ENV_NAME=test-env
> ENV_PURPOSE=Development
> ENV_CONF_FILE=env-var.json
> RUN_CONFIG=local
> DEBUG_LEVEL=BASIC
>
> # (re)create the temporary project in the current location
> sh "${HOP_PATH}/hop-conf.sh" --project-delete --project "${PROJECT_NAME}" > /dev/null 2>&1
> sh "${HOP_PATH}/hop-conf.sh" --project-create --project "${PROJECT_NAME}" \
>   --project-home="${PROJECT_PATH}"
>
> # (re)create the environment that carries the env. config file
> sh "${HOP_PATH}/hop-conf.sh" --environment-delete --environment "${ENV_NAME}" > /dev/null 2>&1
> sh "${HOP_PATH}/hop-conf.sh" --environment-create --environment "${ENV_NAME}" \
>   --environment-project "${PROJECT_NAME}" --environment-purpose="${ENV_PURPOSE}" \
>   --environment-config-files="${PROJECT_PATH}/${ENV_CONF_FILE}"
>
> # start the workflow in the background, appending all output to the log file
> sh "${HOP_PATH}/hop-run.sh" --project "${PROJECT_NAME}" --file "${WORKFLOW_TO_START}" \
>   --runconfig="${RUN_CONFIG}" --environment="${ENV_NAME}" \
>   --level="${DEBUG_LEVEL}" >> "${CURRENT_LOGFILE}" 2>&1 &
>
> # wait for the workflow to finish before removing the temporary project
> wait
> sh "${HOP_PATH}/hop-conf.sh" --project-delete --project "${PROJECT_NAME}" > /dev/null 2>&1
>
> mkdir -p "${CURRENT_FOLDER}/logs"
> mv "${CURRENT_LOGFILE}" logs
>
> We cannot imagine that this is the way we should continue. What would be
> the recommended way?
>
> Side note:
>
> Our *first approach* was to use the already existing “samples” or
> “default” project on the server, and simply redirect its ${PROJECT_HOME}
> to the file location where the ETL project we currently want to start is
> stored.
>
> So we tried to manipulate the variable using the “project-modify” switch:
>
> sh /home/pentaho/hop/apache-hop-client-2.3.0/hop/hop-conf.sh
> --project-modify --project default
> --project-home=/home/pentaho/hop/testhoprun/
>
> This indeed gives the message that the project-home variable has
> been updated successfully:
>
> Project configuration for 'default' was modified in
> /home/pentaho/hop/apache-hop-client-2.3.0/hop/config/hop-config.json
>
> BUT looking at hop-config.json we see that *nothing has changed*: the
> project-home variable is still pointing to the original location of the
> default project.
>
> (And the workflow stored in the testhoprun location cannot be executed
> successfully, of course.)
>
> This happens regardless of whether we try to modify ${PROJECT_HOME} of
> the “default”, “samples” or any other existing project.
>
> Is this a bug?
>
> (Tested on Hop 2.3).
>
> Best regards,
>
> *Gerhard Mitterlechner*
> Senior Development Engineer
> Database Management
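
The manual check described in the side note (looking at hop-config.json after running “project-modify”) could be scripted along these lines. This is only a sketch: config_mentions_path is a hypothetical helper, it does a plain text search instead of parsing the JSON, and it assumes the path is written to the file exactly as typed (no trailing slash, no escaping).

```shell
#!/bin/bash
# Config file location as reported by hop-conf.sh in the thread above.
HOP_CONFIG=/home/pentaho/hop/apache-hop-client-2.3.0/hop/config/hop-config.json

# Hypothetical helper: succeeds (exit 0) if the config file contains the
# expected path as literal text, fails (exit 1) otherwise.
# $1 = config file, $2 = expected project home
config_mentions_path() {
  local config_file=$1 expected_path=$2
  grep -qF -- "$expected_path" "$config_file"
}

# Only run the check where the config file actually exists.
if [ -f "$HOP_CONFIG" ]; then
  if config_mentions_path "$HOP_CONFIG" /home/pentaho/hop/testhoprun; then
    echo "hop-config.json references the new project home"
  else
    echo "hop-config.json does NOT reference the new project home" >&2
  fi
fi
```

Running this right after the “project-modify” call would show whether the change was really written, independent of the success message printed by hop-conf.sh.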
