Hey, sure. It's a crap script :) Just an ordinary Dataflow script:
https://github.com/mmistroni/GCP_Experiments/tree/master/dataflow/edgar_flow
What I meant to say, for your template question, is that you should write a basic script which runs on Beam, something as simple as this:
https://github.com/mmistroni/GCP_Experiments/blob/master/dataflow/beam_test.py

You can then create a template out of it by just running this:

python -m edgar_main --runner=dataflow --project=datascience-projets --template_location=gs://mm_dataflow_bucket/templates/edgar_dataflow_template --temp_location=gs://mm_dataflow_bucket/temp --staging_location=gs://mm_dataflow_bucket/staging

That will create a template 'edgar_dataflow_template', which you can use in the GCP Dataflow console to create your job.

Hth. I'm sort of a noob to Beam, having started writing code just over a month ago. Feel free to ping me if you get stuck.

Kind regards,
Marco

On Sat, Apr 4, 2020 at 6:01 PM Xander Song <iamuuriw...@gmail.com> wrote:

> Hi Marco,
>
> Thanks for your response. Would you mind sending the edgar_main script so
> I can take a look?
>
> On Sat, Apr 4, 2020 at 2:25 AM Marco Mistroni <mmistr...@gmail.com> wrote:
>
>> Hey
>> As far as I know you can generate a dataflow template out of your beam
>> code by specifying an option on command line?
>> I am running this CMD and once template is generated I kick off a dflow
>> job via console by pointing at it
>>
>> python -m edgar_main --runner=dataflow --project=datascience-projets
>> --template_location=gs://<your bucket>
>>
>> Hth
>>
>>
>> On Sat, Apr 4, 2020, 9:52 AM Xander Song <iamuuriw...@gmail.com> wrote:
>>
>>> I am attempting to write a custom Dataflow Template using the Apache
>>> Beam Python SDK, but am finding the documentation difficult to follow. Does
>>> anyone have a minimal working example of how to write and deploy such a
>>> template?
>>>
>>> Thanks in advance.
>>>
>>
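
P.S. In case the links go stale, here is a rough sketch of the kind of basic Beam script I mean, plus a helper that spells out the flags used to stage it as a template. This is not the actual edgar_main or beam_test.py code. The names `template_args` and `run`, and the placeholders `my-project`, `my-bucket` and `my_template`, are made up for illustration, and it assumes apache-beam[gcp] is installed when you actually run the pipeline.

```python
import sys


def template_args(project, bucket, template_name):
    """Build the flags that stage a pipeline as a Dataflow template.

    The bucket layout (templates/, temp/, staging/) mirrors the command
    quoted in this thread; it is a convention, not a requirement.
    """
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--template_location=gs://{bucket}/templates/{template_name}",
        f"--temp_location=gs://{bucket}/temp",
        f"--staging_location=gs://{bucket}/staging",
    ]


def run(argv):
    # Imported lazily so the flag helper above works without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(argv)
    pipeline = beam.Pipeline(options=options)
    (
        pipeline
        | "Create" >> beam.Create(["hello", "beam"])
        | "Upper" >> beam.Map(str.upper)
        | "Print" >> beam.Map(print)
    )
    # With --template_location set, this stages the template rather than
    # starting a job. For a local run you may want to call
    # wait_until_finish() on the returned result.
    pipeline.run()


if __name__ == "__main__":
    if sys.argv[1:]:
        run(sys.argv[1:])
    else:
        # No flags given: just show an example flag set (placeholder names).
        print(" ".join(template_args("my-project", "my-bucket", "my_template")))
```

Save it as a module, then run it with python -m followed by the module name and the flags, the same way as the edgar_main command above. Running it with no flags only prints an example flag set.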