[ 
https://issues.apache.org/jira/browse/BEAM-11252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309329#comment-17309329
 ] 

Tudor Plugaru commented on BEAM-11252:
--------------------------------------

I've done some more testing with this issue and the issue is with this line 
[https://github.com/apache/beam/blob/553553df4b286cc6d7fd4af0e0bcf8e1cc5e2785/sdks/python/apache_beam/io/gcp/bigquery.py#L1343]

If you want to test this do the following:
 # Create a pipeline that will read events and will use fixed size windows.
 # Use a GroupByKey after windowing
 # Use WriteToBigQuery after the GBK
 # In the events that you push through the pipeline, include one that will fail 
to be inserted. What will happen is that this element will be tagged and it 
will be tried to be converted to a GlobalWindow element and it will fail with 
this error.

I have created a gist in which I replicate this behaviour of converting an 
element from a fixed size window element to a global window element. Find it 
here https://gist.github.com/PlugaruT/9af53bcb54ece94523a386147392ab4c 

> Writing to BigQuery raises a TypeError "GlobalWindow -> ._IntervalWindowBase"
> -----------------------------------------------------------------------------
>
>                 Key: BEAM-11252
>                 URL: https://issues.apache.org/jira/browse/BEAM-11252
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.24.0, 2.25.0, 2.26.0, 2.27.0, 2.28.0
>            Reporter: Tudor Plugaru
>            Priority: P2
>
> I have a pipeline that reads from a PubSub topic, does some transforms and 
> then writes to BigQuery. The exception message is this: {{Cannot convert 
> GlobalWindow to apache_beam.utils.windowed_value._IntervalWindowBase}}
> You can find the main part of the pipeline here 
> [https://gist.github.com/PlugaruT/4666406bd8792b7b196dc1519c8885a2]
> The first part is basically reading from Pub/Sub and then I apply this 
> PTranform.
> And here is the gigantic stack trace: 
> [https://gist.github.com/PlugaruT/52bf3834eec95fd5cc5779d3332e1433]
>  
> I've tried to downgrade to 2.24.0 version but this is still happening.
> Also, I can't reproduce the exception locally when I run the pipeline with 
> DirectRunner. It's happening only when running on Dataflow.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to