[ 
https://issues.apache.org/jira/browse/BEAM-9873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17548686#comment-17548686
 ] 

Danny McCormick commented on BEAM-9873:
---------------------------------------

This issue has been migrated to https://github.com/apache/beam/issues/20309

> Removing Invalid JSON messages from PCollection before starting BigQueryIO 
> Operations
> -------------------------------------------------------------------------------------
>
>                 Key: BEAM-9873
>                 URL: https://issues.apache.org/jira/browse/BEAM-9873
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>            Reporter: Varun
>            Priority: P3
>              Labels: Clarified, features
>
> In a typical setup of Pub/Sub and Cloud Dataflow, a Pub/Sub subscriber might 
> receive messages that are not valid JSON. The BigQuery insert operation 
> fails to process these messages, and the worker may be terminated if the 
> exception is not handled correctly.
> The likelihood of receiving invalid JSON messages is very low, and the 
> upstream component publishing messages to the topic should validate them at 
> its end. This is not always the case, however, and the application should be 
> robust enough to survive malformed messages pushed by upstream components. 
> I have created an enum that acts as a predicate in the Filter transform. This 
> is standard JSON validation logic, and I would like to add it to the 
> Java SDK (and Python) as part of the Filter transform.
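
For illustration only, the kind of predicate described above could look roughly like the following Python sketch. It is not the reporter's enum (which is not included in the issue); the name is_valid_json and the sample messages are hypothetical, and only the standard library json module is used so the snippet runs on its own.

```python
import json


def is_valid_json(message):
    """Return True if `message` (str or bytes) parses as JSON.

    Hypothetical helper for illustration; not part of the Beam SDK.
    """
    try:
        json.loads(message)
    except (ValueError, TypeError):
        # ValueError covers json.JSONDecodeError and bad byte encodings;
        # TypeError covers inputs that are not str/bytes (for example None).
        return False
    return True


# Made-up sample payloads, standing in for messages read from a subscription.
messages = ['{"id": 1}', '{"id": 2', 'not json', b'{"ok": true}']

# Keep only the payloads that parse; the malformed ones are dropped.
valid = [m for m in messages if is_valid_json(m)]
```

In the Beam Python SDK a predicate of this shape can be passed to beam.Filter ahead of the BigQuery write. Note that this check accepts any valid JSON value, including bare numbers and arrays; a stricter variant might also require the parsed value to be an object.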



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
