The exceptions can come from bad data (which I am working on) or from the quota being exceeded. The problem is that if I have two collections in the pipeline and one fails on the quota, the other fails as well, even though it should have succeeded.
On Sat, Sep 16, 2017 at 10:35 PM, Eugene Kirpichov <[email protected]> wrote:
> There is no way to catch an exception inside a transform unless you
> wrote the transform yourself and have control over the code of its DoFn's.
> That's why I'm asking whether configuring bad records would be an
> acceptable workaround.
>
> On Sat, Sep 16, 2017, 11:07 AM Chaim Turkel <[email protected]> wrote:
>
>> I am using batch, since streaming cannot be done with partitions with
>> old data more than 30 days.
>> The question is how I can catch the exception in the pipeline so that
>> other collections do not fail.
>>
>> On Fri, Sep 15, 2017 at 7:37 PM, Eugene Kirpichov
>> <[email protected]> wrote:
>> > Are you using streaming inserts or the batch loads method for writing?
>> > If it's streaming inserts, BigQueryIO can already return the bad records,
>> > and I believe it won't fail the pipeline, so I'm assuming it's batch loads.
>> > For batch loads, would it be sufficient for your purposes if
>> > BigQueryIO.write() let you configure the configuration.load.maxBadRecords
>> > parameter (see https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs)?
>> >
>> > On Thu, Sep 14, 2017 at 10:29 PM Chaim Turkel <[email protected]> wrote:
>> >
>> >> I am using the sink of BigQueryIO, so the example is not the same. The
>> >> example handles bad data when reading; my problems are when writing. There
>> >> can be multiple errors when writing to BigQuery, and if it fails there
>> >> is no way to catch the error, and the whole pipeline fails.
>> >>
>> >> chaim
>> >>
>> >> On Thu, Sep 14, 2017 at 5:48 PM, Reuven Lax <[email protected]> wrote:
>> >> > What sort of error? You can always put a try/catch inside your DoFns to
>> >> > catch the majority of errors. A common pattern is to save records that
>> >> > caused exceptions out to a separate output so you can debug them. This
>> >> > blog post
>> >> > <https://cloud.google.com/blog/big-data/2016/01/handling-invalid-inputs-in-dataflow>
>> >> > explains the pattern.
>> >> >
>> >> > Reuven
>> >> >
>> >> > On Thu, Sep 14, 2017 at 1:43 AM, Chaim Turkel <[email protected]> wrote:
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> In one pipeline I have multiple PCollections. If I have an error on
>> >> >> one, then the whole pipeline is canceled. Is there a way to catch the
>> >> >> error and log it, so that all other PCollections continue?
>> >> >>
>> >> >> chaim
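
[Editor's note: below is a minimal sketch of the try/catch plus side-output ("dead letter") pattern Reuven refers to above, assuming a Beam Java pipeline that converts input strings to TableRows before handing them to BigQueryIO.write(). The class name, tag names, and toTableRow() are illustrative placeholders, not anything from the thread.]

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionTuple;
import org.apache.beam.sdk.values.TupleTag;
import org.apache.beam.sdk.values.TupleTagList;

public class DeadLetterExample {

  // Two output tags: successfully converted rows, and the raw records that threw.
  static final TupleTag<TableRow> GOOD = new TupleTag<TableRow>() {};
  static final TupleTag<String> BAD = new TupleTag<String>() {};

  static PCollectionTuple convert(PCollection<String> input) {
    return input.apply("ConvertToTableRow",
        ParDo.of(new DoFn<String, TableRow>() {
          @ProcessElement
          public void processElement(ProcessContext c) {
            try {
              // toTableRow stands in for whatever parsing/conversion can throw on bad data.
              c.output(toTableRow(c.element()));
            } catch (Exception e) {
              // Route the failing record to the side output instead of letting
              // the exception fail the whole pipeline.
              c.output(BAD, c.element());
            }
          }
        }).withOutputTags(GOOD, TupleTagList.of(BAD)));
  }

  private static TableRow toTableRow(String json) {
    // ... application-specific conversion that may throw ...
    return new TableRow();
  }
}

The caller would then write results.get(GOOD) to BigQuery and log or persist results.get(BAD) for debugging. Note this only covers exceptions thrown in your own DoFns (bad data); it does not help with failures inside the BigQuery load job itself. For streaming inserts, I believe BigQueryIO's WriteResult can also return the rows that failed to insert, as Eugene mentions above.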
