Re: Apache Beam BigQueryIO Exception

2022-02-22 Thread Rajnil Guha
Hi, Thank you so much for your response. I tried specifying a temp dataset using the temp_dataset parameter of ReadFromBigQuery and it worked. I was looking at the BigQuerySource class but could not find any such parameter for setting temp_dataset, and my job fails by throwing the same permission e
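A minimal sketch of that fix (project and dataset names are hypothetical, not from the thread): ReadFromBigQuery accepts a pre-created dataset via temp_dataset, so the export/temporary tables land in a dataset the job already has permissions on.

    import apache_beam as beam
    from apache_beam.io.gcp.internal.clients import bigquery
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions()  # runner/project/temp_location set as usual

    # Pre-created dataset that the Dataflow service account can write to.
    temp_dataset = bigquery.DatasetReference(
        projectId='my-project', datasetId='beam_temp_dataset')

    with beam.Pipeline(options=options) as p:
        rows = (
            p
            | 'ReadFromBQ' >> beam.io.ReadFromBigQuery(
                query='SELECT * FROM `my-project.my_dataset.my_table`',
                use_standard_sql=True,
                temp_dataset=temp_dataset))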

Apache Beam BigQueryIO Exception

2022-02-19 Thread Rajnil Guha
Hi Beam Users, We have a Dataflow pipeline which reads data from and writes data into BigQuery. The basic structure of the pipeline is as follows: query = <> with beam.Pipeline(options=options) as p: read_bq_records = (p | "ReadFromBQ" >> beam.io.ReadFromBigQuery(
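The truncated structure above, completed as a rough sketch (table names, schema, and the transform are placeholders, not the original code):

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions()
    query = 'SELECT id, name FROM `my-project.my_dataset.input_table`'  # placeholder

    with beam.Pipeline(options=options) as p:
        read_bq_records = (
            p
            | 'ReadFromBQ' >> beam.io.ReadFromBigQuery(
                query=query, use_standard_sql=True))
        _ = (
            read_bq_records
            | 'Transform' >> beam.Map(lambda row: row)  # ETL logic goes here
            | 'WriteToBQ' >> beam.io.WriteToBigQuery(
                'my-project:my_dataset.output_table',
                schema='id:INTEGER,name:STRING',
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))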

Re: Firing a dataflow job using REST API

2022-02-14 Thread Rajnil Guha
ook at the gcloud documentation? > > https://cloud.google.com/dataflow/docs/reference/rest > > Thanks > Deepak > > On Mon, Feb 14, 2022 at 3:53 PM Rajnil Guha > wrote: > >> Hi Beam Community, >> >> I am trying to run a Dataflow pipeline using the REST AP

Firing a dataflow job using REST API

2022-02-14 Thread Rajnil Guha
Hi Beam Community, I am trying to run a Dataflow pipeline using the REST API. Currently I see an example for doing the same for Dataflow templates[1]. My question is whether we can run a Dataflow job using the REST API without creating templates. As per the docs the projects.jobs.create method allows us to do so, but
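For reference, a rough sketch of calling projects.jobs.create through the Python API client (all values are hypothetical; the full Job body, i.e. the steps and environment, is normally generated by the SDK, which is why templates are the usual path when launching purely over REST):

    from googleapiclient.discovery import build

    project_id = 'my-project'  # hypothetical
    dataflow = build('dataflow', 'v1b3')

    job_body = {
        'name': 'job-launched-via-rest',
        'type': 'JOB_TYPE_BATCH',
        # 'steps': [...],        # pipeline graph, normally produced by the SDK
        # 'environment': {...},  # worker and staging configuration
    }

    response = dataflow.projects().jobs().create(
        projectId=project_id, body=job_body).execute()
    print(response)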

Writing Avro Files to Big Query using Python SDK and Dataflow Runner

2021-08-01 Thread Rajnil Guha
into the pipeline from there as it's much easier to maintain that way. Note:- We are using the Python SDK to write our pipelines and running on Dataflow. Thanks & Regards Rajnil Guha
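A minimal sketch of the Avro-to-BigQuery shape being discussed (bucket, table, and dispositions are illustrative assumptions, not from the thread):

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions()  # DataflowRunner, project, temp_location, etc.

    with beam.Pipeline(options=options) as p:
        _ = (
            p
            # Each Avro record is read as a Python dict keyed by field name.
            | 'ReadAvro' >> beam.io.ReadFromAvro('gs://my-bucket/path/*.avro')
            | 'WriteToBQ' >> beam.io.WriteToBigQuery(
                'my-project:my_dataset.my_table',
                # Assumes the destination table already exists with a
                # matching schema; otherwise pass schema=... instead.
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))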

Re: Checking a Pcoll is empty in Apache Beam

2021-07-21 Thread Rajnil Guha
Hi, Thanks for your suggestions, I will surely check them out. My exact use-case is to check if the Pcoll is empty, and if it is, publish a message into a Pub/Sub topic. This message will then be further used downstream by some other processes. Thanks & Regards Rajnil Guha On Wed, Jul 21,
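One way to sketch that use-case (topic name is hypothetical, and this is not necessarily what the thread settled on): reduce the PCollection to a single global count and publish only when that count is zero, creating the Pub/Sub client on the worker rather than pickling it.

    import apache_beam as beam

    class PublishIfEmpty(beam.DoFn):
        def setup(self):
            # Created per worker; client objects are not pickleable.
            from google.cloud import pubsub_v1
            self._publisher = pubsub_v1.PublisherClient()

        def process(self, count):
            if count == 0:
                self._publisher.publish(
                    'projects/my-project/topics/empty-input-alerts',
                    b'input PCollection was empty')

    # 'records' is the bounded PCollection being checked.
    _ = (records
         | 'Count' >> beam.combiners.Count.Globally()
         | 'NotifyIfEmpty' >> beam.ParDo(PublishIfEmpty()))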

Re: Checking a Pcoll is empty in Apache Beam

2021-07-21 Thread Rajnil Guha
Yes I am just thinking how to modify/rewrite this piece of code if I want to run my pipeline on Dataflow runner. Thanks & Regards Rajnil Guha On Wed, Jul 21, 2021 at 1:12 AM Robert Bradshaw wrote: > On Tue, Jul 20, 2021 at 12:33 PM Rajnil Guha > wrote: > > > > Hi, >

Re: Checking a Pcoll is empty in Apache Beam

2021-07-20 Thread Rajnil Guha
oc[0, 0] == 0): print("Empty") else: print(is_empty_beam_df) Any other way to implement similar style checks using Beam Dataframes? Thanks & Regards Rajnil Guha On Tue, Jul 20, 2021 at 3:52 AM Robert Bradshaw wrote: > On Mon, Jul 19, 2021 at 11:12 AM Reuven Lax wrote: > > > >

Re: Checking a Pcoll is empty in Apache Beam

2021-07-19 Thread Rajnil Guha
when the Pcoll is empty it does not execute the else part and instead executes the if part i.e. prints 0. Thanks & Regards Rajnil Guha On Mon, Jul 19, 2021 at 12:32 AM Reuven Lax wrote: > You could count the collection (with default value of zero). > > On Sun, Jul 18, 2021, 1

Re: Checking a Pcoll is empty in Apache Beam

2021-07-18 Thread Rajnil Guha
Hi Reuven, Yes, for now this is a bounded PCollection. Thanks & Regards Rajnil Guha On Mon, Jul 19, 2021 at 12:02 AM Reuven Lax wrote: > Is this a bounded collection? > > On Sun, Jul 18, 2021, 11:17 AM Rajnil Guha > wrote: > >> Hi Beam Users, >> >> I

Checking a Pcoll is empty in Apache Beam

2021-07-18 Thread Rajnil Guha
ay on how to check whether a Pcollection is empty or not using Python and how to take action based on the check. Is there any way to implement this using Beam? Thanks & Regards Rajnil Guha

Re: [Question] -- Getting error while writing data into Big Query from Dataflow -- "Clients have non-trivial state that is local and unpickleable.", _pickle.PicklingError: Pickling client objects is e

2021-03-30 Thread Rajnil Guha
pipeline to not set "save_main_session" (and set dependencies according to this guide if needed). Thanks, Cham On Mon, Mar 29, 2021 at 12:31 PM Rajnil Guha <rajnil94.g...@gmail.com> wrote: Hi Beam Community, I am running a Dataflow pipeline using the Python SDK. I am doing some ETL processing

[Question] -- Getting error while writing data into Big Query from Dataflow -- "Clients have non-trivial state that is local and unpickleable.", _pickle.PicklingError: Pickling client objects is expli

2021-03-29 Thread Rajnil Guha
Hi Beam Community, I am running a Dataflow pipeline using the Python SDK. I am doing some ETL processing on my data and then writing the output into Big Query. When I try to write into Big Query I get the below error in the Dataflow job. However when running this pipeline from my local on DirectRunner the
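A common pattern for avoiding this class of error (a sketch, not necessarily the fix adopted in the thread): leave save_main_session unset and construct any Google Cloud client inside the DoFn's setup() on the worker, so the client object never has to be pickled with the pipeline.

    import apache_beam as beam

    class LookupInBigQuery(beam.DoFn):
        """Hypothetical enrichment step; the point is where the client lives."""
        def setup(self):
            from google.cloud import bigquery  # imported on the worker
            self._client = bigquery.Client()

        def process(self, element):
            # Replace with the real ETL/lookup logic.
            yield element

    enriched = records | 'Enrich' >> beam.ParDo(LookupInBigQuery())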