Hi Fabian,

Did you get this working, and would you be willing to share the final result?
If not, I will see what I can do, and we can add it to our documentation.

Cheers,
Hans

On Thu, 11 Aug 2022 at 13:14, Matt Casters <[email protected]> wrote:

> When you run the class org.apache.hop.beam.run.MainBeam, you need to provide
> 3 arguments:
>
> 1. The filename of the pipeline to run
> 2. The filename which contains Hop metadata
> 3. The name of the pipeline run configuration to use
>
> See also for example:
> https://hop.apache.org/manual/latest/pipeline/pipeline-run-configurations/beam-flink-pipeline-engine.html#_running_with_flink_run
>
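The three arguments listed above could be passed roughly as follows. This is a minimal sketch, not a tested Dataflow setup: it assumes the fat jar can be started directly with `java -cp`, and every path as well as the run configuration name "Dataflow" is a hypothetical placeholder. The helper only prints the command so that it can be checked before running it.

```shell
# Hypothetical placeholders; none of these values come from the thread.
FAT_JAR="/path/to/fat-hop.jar"
PIPELINE_FILE="/path/to/todays-directories.hpl"
METADATA_FILE="/path/to/hop-metadata.json"
RUN_CONFIG="Dataflow"

# Build the MainBeam command line with the arguments in the order listed above:
# pipeline file, Hop metadata file, pipeline run configuration name.
# $1 = fat jar, $2 = pipeline file, $3 = metadata file, $4 = run configuration
mainbeam_command() {
  printf 'java -cp %s org.apache.hop.beam.run.MainBeam %s %s %s\n' \
    "$1" "$2" "$3" "$4"
}

# Print the command for inspection; copy and run it once the paths are right.
mainbeam_command "$FAT_JAR" "$PIPELINE_FILE" "$METADATA_FILE" "$RUN_CONFIG"
```

Note that a flex template does not obviously supply positional arguments like these, so this only illustrates what MainBeam expects, not how the template launcher would provide it.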
> Good luck,
> Matt
>
>
> On Thu, Aug 11, 2022 at 11:08 AM Fabian Peters <[email protected]> wrote:
>
>> Hello Hans,
>>
>> I went through the flex-template process yesterday but the generated
>> template does not work. The main piece that's missing for me is how to pass
>> the actual pipeline that should be run. My test boiled down to:
>>
>> gcloud dataflow flex-template build gs://foo_ag_dataflow/tmp/todays-directories.json \
>>       --image-gcr-path "europe-west1-docker.pkg.dev/dashboard-foo/dataflow/hop:latest" \
>>       --sdk-language "JAVA" \
>>       --flex-template-base-image JAVA11 \
>>       --metadata-file "/Users/fabian/Documents/src/foo/fooDataEngineering/hop/dataflow/todays-directories.json" \
>>       --jar "/Users/fabian/tmp/fat-hop.jar" \
>>       --env FLEX_TEMPLATE_JAVA_MAIN_CLASS="org.apache.hop.beam.run.MainBeam"
>>
>> gcloud dataflow flex-template run "todays-directories-`date +%Y%m%d-%H%M%S`" \
>>     --template-file-gcs-location "gs://foo_ag_dataflow/tmp/todays-directories.json" \
>>     --region "europe-west1"
>>
>> With Dockerfile:
>>
>> FROM gcr.io/dataflow-templates-base/java11-template-launcher-base
>>
>> ARG WORKDIR=/dataflow/template
>> RUN mkdir -p ${WORKDIR}
>> WORKDIR ${WORKDIR}
>>
>> ENV FLEX_TEMPLATE_JAVA_MAIN_CLASS="org.apache.hop.beam.run.MainBeam"
>> ENV FLEX_TEMPLATE_JAVA_CLASSPATH="/dataflow/template/*"
>>
>> ENTRYPOINT ["/opt/google/dataflow/java_template_launcher"]
>>
>>
>> And "todays-directories.json":
>>
>> {
>>     "defaultEnvironment": {},
>>     "image": "europe-west1-docker.pkg.dev/dashboard-foo/dataflow/hop:latest",
>>     "metadata": {
>>         "description": "Test templates creation with Apache Hop",
>>         "name": "Todays directories"
>>     },
>>     "sdkInfo": {
>>         "language": "JAVA"
>>     }
>> }
>>
>> Thanks for having a look at it!
>>
>> cheers
>>
>> Fabian
>>
>> On 10.08.2022 at 16:03, Hans Van Akelyen <[email protected]> wrote:
>>
>> Hi Fabian,
>>
>> You have indeed found something we have not yet documented, mainly
>> because we have not yet tried it out ourselves.
>> The main class that gets called when running Beam pipelines is
>> "org.apache.hop.beam.run.MainBeam".
>>
>> I was hoping the "Import as pipeline" button on a job would give you
>> everything you need to execute this but it does not.
>> I'll take a closer look over the next few days to see what is needed to use
>> this functionality; it could be that we need to export the template based on a
>> pipeline.
>>
>> Kr,
>> Hans
>>
>> On Wed, 10 Aug 2022 at 15:46, Fabian Peters <[email protected]> wrote:
>>
>>> Hi all!
>>>
>>> Thanks to Hans' work on the REST transform, I can now deploy my jobs to
>>> Dataflow.
>>>
>>> Next, I'd like to schedule a batch job
>>> <https://cloud.google.com/community/tutorials/schedule-dataflow-jobs-with-cloud-scheduler>,
>>> but for this I need to create a template
>>> <https://cloud.google.com/dataflow/docs/concepts/dataflow-templates>.
>>> I've searched the Hop documentation but haven't found anything on this. I'm
>>> guessing that flex-templates
>>> <https://cloud.google.com/dataflow/docs/guides/templates/using-flex-templates#create_a_flex_template>
>>> are the way to go, due to the fat-jar, but I'm wondering what to pass as
>>> the FLEX_TEMPLATE_JAVA_MAIN_CLASS.
>>>
>>> cheers
>>>
>>> Fabian
>>>
>>
>>
>
> --
> Neo4j Chief Solutions Architect
> *✉   *[email protected]
>
>
>
>
