[ 
https://issues.apache.org/jira/browse/BEAM-8393?focusedWorklogId=327819&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327819
 ]

ASF GitHub Bot logged work on BEAM-8393:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Oct/19 14:02
            Start Date: 14/Oct/19 14:02
    Worklog Time Spent: 10m 
      Work Description: jklukas commented on issue #9784: [BEAM-8393] Fix Java 
BigQueryIO clustering support for multiple partitions
URL: https://github.com/apache/beam/pull/9784#issuecomment-541696374
 
 
   I have not added a test case here, as I feel the bug and fix are both 
simple and obvious. A test demonstrating the change could require significant 
new test code. There is an existing case in `BigQueryIOWriteTest` that checks 
that a large collection of files is correctly separated into multiple 
partitions, but it doesn't actually call BigQueryIO to load a large number of 
files.
   
   Demonstrating that this change works correctly would require actually 
running a load job into BigQuery over enough files to trigger multiple 
partitions. The effort and compute resources needed to run such a test seem 
unwarranted given the low complexity of the change.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 327819)
    Time Spent: 20m  (was: 10m)

> Java BigQueryIO clustering support breaks on multiple partitions
> ----------------------------------------------------------------
>
>                 Key: BEAM-8393
>                 URL: https://issues.apache.org/jira/browse/BEAM-8393
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.15.0, 2.16.0
>            Reporter: Jeff Klukas
>            Assignee: Jeff Klukas
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Support for writing to clustered tables in BigQuery was added in 2.15, which 
> involved adding a new optional clustering field to TableDestination. 
> Clustering works in most cases, but fails with errors about incompatible 
> partitioning specifications for any data that is handled by the 
> MultiplePartitions branch of the BigQueryIO logic.
> In that code path, we provide a modified TableDestination but neglect to 
> copy the clustering definition, so the final load job does not include any 
> clustering columns.
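
The class of bug described above can be illustrated with a minimal, self-contained sketch. This is not Beam's actual code: `Destination`, `withTableSpecBuggy`, and `withTableSpecFixed` are hypothetical names, and the sketch only assumes a destination object with an optional clustering field that must be carried over whenever a modified copy is built.

```java
import java.util.Arrays;
import java.util.List;

public class ClusteringCopySketch {

    /** Hypothetical, simplified stand-in for a table destination (not Beam's class). */
    static final class Destination {
        final String tableSpec;
        final String timePartitioning;       // null when the table is not partitioned
        final List<String> clusteringFields; // null when the table is not clustered

        Destination(String tableSpec, String timePartitioning, List<String> clusteringFields) {
            this.tableSpec = tableSpec;
            this.timePartitioning = timePartitioning;
            this.clusteringFields = clusteringFields;
        }

        @Override
        public String toString() {
            return tableSpec + " partitioning=" + timePartitioning
                + " clustering=" + clusteringFields;
        }
    }

    /** Buggy copy: keeps the partitioning but silently drops the clustering definition. */
    static Destination withTableSpecBuggy(Destination original, String newSpec) {
        return new Destination(newSpec, original.timePartitioning, null);
    }

    /** Fixed copy: carries both the partitioning and the clustering definition over. */
    static Destination withTableSpecFixed(Destination original, String newSpec) {
        return new Destination(newSpec, original.timePartitioning, original.clusteringFields);
    }

    public static void main(String[] args) {
        // Illustrative values only; the table names and field are made up.
        Destination original =
            new Destination("project:dataset.events", "DAY", Arrays.asList("country"));
        System.out.println("buggy: " + withTableSpecBuggy(original, "project:dataset.events_modified"));
        System.out.println("fixed: " + withTableSpecFixed(original, "project:dataset.events_modified"));
    }
}
```

In the buggy variant the copied destination still carries partitioning but no clustering, which matches the mismatch the report describes between the load job and the clustered target table.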



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
