Re: Go SDK - Docker image for Dataflow

2022-04-12 Thread Pawel
Hi,

Thank you, Daniel. I tried building a container and stored it in GCR. The container has a binary built from the package main where the code of my job lives, and I changed the entrypoint in the Dockerfile to a boot file. I created my boot.go so that it is almost the same as
https://github.com/apache/beam/blob/master/sdks/go/container/boot.go
with the only difference being that I override the path of the worker binary:

   const prog = "/bin/worker"

The boot binary should then start the worker, which is also added to the container.
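Roughly, my Dockerfile looks like this (the module layout, base image, and paths are simplified placeholders; the important part is that the entrypoint is the boot binary and the worker sits at the path prog points to):

   # Build both binaries from the job's module (illustrative layout).
   FROM golang:1.18 AS build
   WORKDIR /src
   COPY . .
   RUN go build -o /out/worker .      # package main with the job code
   RUN go build -o /out/boot ./boot   # my modified copy of boot.go

   # Minimal runtime image; the entrypoint runs the boot loader.
   FROM gcr.io/distroless/base
   COPY --from=build /out/worker /bin/worker
   COPY --from=build /out/boot /opt/apache/beam/boot
   ENTRYPOINT ["/opt/apache/beam/boot"]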
But I also have one more problem. Dataflow requires me to provide a template JSON that I create and a model (processing graph) file stored in GCS, and I have to point pipelineUrl in the template configuration at the location of the model. It is a bit problematic for me to generate these two files and send them to GCS manually. So far I have tried to start my job from a local terminal with the dry_run parameter to get the template and the model to export. Maybe I'm doing something wrong.
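For reference, the export step I run from the local terminal is roughly this (flag values are the same placeholders as in my original message below, plus dry_run):

   myjob \
     --runner dataflow \
     --project *** \
     --region europe-west1 \
     --temp_location gs://prog-test-bucket/tmp/ \
     --staging_location gs://prog-test-bucket/binaries/ \
     --dry_run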
I'd appreciate a better approach, because the one I use is very inconvenient and error prone (and something is still not working with this approach).
-- Paweł

On 13 April 2022 at 02:04, Daniel Oliveira danolive...@google.com wrote:

I've used a custom Docker image for the SDK/worker harness before by uploading a Docker image somewhere accessible by Dataflow (in my case I used GCR, https://cloud.google.com/container-registry), and then using that as the URL for my container. I used the environment_config flag from Go instead of worker_harness_container_image, but I believe they are functionally equivalent; you just choose one or the other. So it would look something like this:

   --worker_harness_container_image=us.gcr.io/project/filepath/beam_go_sdk:tag
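With environment_config, I believe the equivalent would be along these lines (same image URL, just a different pair of flags):

   --environment_type=DOCKER \
   --environment_config=us.gcr.io/project/filepath/beam_go_sdk:tag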
Hope that helps!

On Wed, Apr 6, 2022 at 2:01 AM Pawel pawel...@o2.pl wrote:

Hi,

I'm wondering if it is possible to create a custom Docker image with a compiled Go binary that can be deployed to Dataflow. Currently I'm starting it as below:

   myjob \
     --runner dataflow \
     --output gs://prog-test-bucket/out \
     --project *** \
     --region europe-west1 \
     --temp_location gs://prog-test-bucket/tmp/ \
     --staging_location gs://prog-test-bucket/binaries/ \
     --worker_harness_container_image=apache/beam_go_sdk:latest

But this is not a clean way of starting jobs. I'd prefer something more organized, deploying the worker as a built container image. Can I make use of https://hub.docker.com/r/apache/beam_go_sdk and build a custom image with my binary and then deploy it in GCP?

I will appreciate all suggestions/help. Thanks

-- Paweł


Go SDK - Docker image for Dataflow

2022-04-06 Thread Pawel
Hi,

I'm wondering if it is possible to create a custom Docker image with a compiled Go binary that can be deployed to Dataflow. Currently I'm starting it as below:

   myjob \
     --runner dataflow \
     --output gs://prog-test-bucket/out \
     --project *** \
     --region europe-west1 \
     --temp_location gs://prog-test-bucket/tmp/ \
     --staging_location gs://prog-test-bucket/binaries/ \
     --worker_harness_container_image=apache/beam_go_sdk:latest

But this is not a clean way of starting jobs. I'd prefer something more organized, deploying the worker as a built container image. Can I make use of https://hub.docker.com/r/apache/beam_go_sdk and build a custom image with my binary and then deploy it in GCP?
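What I have in mind is roughly this (the image name is just a placeholder):

   docker build -t us.gcr.io/my-project/my-beam-job:latest .
   docker push us.gcr.io/my-project/my-beam-job:latest

and then pointing worker_harness_container_image at that image URL.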
I will appreciate all suggestions/help. Thanks

-- Paweł