[ 
https://issues.apache.org/jira/browse/BEAM-5191?focusedWorklogId=270428&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-270428
 ]

ASF GitHub Bot logged work on BEAM-5191:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Jul/19 19:03
            Start Date: 01/Jul/19 19:03
    Worklog Time Spent: 10m 
      Work Description: chamikaramj commented on pull request #8945: 
[BEAM-5191] Support for BigQuery clustering
URL: https://github.com/apache/beam/pull/8945#discussion_r299177287
 
 

 ##########
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 ##########
 @@ -1793,6 +1814,32 @@ static String getExtractDestinationUri(String extractDestinationDir) {
       return toBuilder().setJsonTimePartitioning(partitioning).build();
     }
 
+    /**
+     * Specifies the clustering fields to use when writing to a single output table. Can only be
+     * used when {@link #withTimePartitioning(TimePartitioning)} is set. If {@link
+     * #to(SerializableFunction)} or {@link #to(DynamicDestinations)} is used to write to dynamic
+     * tables, this setting is ignored; instead, see {@link #enableClustering()}.
+     */
+    public Write<T> withClustering(Clustering clustering) {
+      checkArgument(clustering != null, "clustering can not be null");
+      return toBuilder().setEnableClustering(true).setClustering(clustering).build();
+    }
+
+    /**
+     * Allows writing to clustered tables when {@link #to(SerializableFunction)} or {@link
+     * #to(DynamicDestinations)} is used. The returned {@link TableDestination} objects should
+     * specify the clustering fields per table. If writing to a single table, use {@link
+     * #withClustering(Clustering)} instead to specify the clustering fields.
+     *
+     * <p>Setting this option enables use of {@link TableDestinationCoderV3} which encodes
+     * clustering information. Pipelines using an older coder must be drained before setting this
+     * option, since {@link TableDestinationCoderV3} will not be able to read state written with a
+     * previous version.
+     */
+    public Write<T> enableClustering() {
 
 Review comment:
   Having these two methods seems to make the API pretty brittle.
   
   How about having a single method, withClustering(), that optionally takes a
Clustering object? In the dynamic destinations case, the Clustering object
could be omitted (or null), and the method would then behave like
enableClustering().
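
   To make the suggestion concrete, here is a minimal, self-contained sketch of
what a single entry point could look like. The Clustering and Write classes
below are simplified, hypothetical stand-ins written only for illustration;
they are not the real BigQuery model class or BigQueryIO.Write<T>, and the
no-argument overload is one assumed reading of "optionally takes".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * Sketch only: Clustering and Write are simplified, hypothetical stand-ins,
 * not the real BigQuery model class or BigQueryIO.Write<T>.
 */
public class SingleWithClusteringSketch {

  /** Stand-in for a clustering spec: an ordered list of column names. */
  static final class Clustering {
    private final List<String> fields;

    Clustering(String... fields) {
      this.fields = Collections.unmodifiableList(Arrays.asList(fields));
    }

    List<String> getFields() {
      return fields;
    }
  }

  /** Immutable stand-in for the write transform's configuration. */
  static final class Write {
    private final boolean clusteringEnabled;
    // null means: each dynamic destination supplies its own clustering fields.
    private final Clustering clustering;

    Write() {
      this(false, null);
    }

    private Write(boolean clusteringEnabled, Clustering clustering) {
      this.clusteringEnabled = clusteringEnabled;
      this.clustering = clustering;
    }

    /** Dynamic-destinations case: enable clustering without fixing the fields here. */
    Write withClustering() {
      return withClustering(null);
    }

    /** Single-table case when non-null; a null argument behaves like the no-arg form. */
    Write withClustering(Clustering clustering) {
      return new Write(true, clustering);
    }

    boolean isClusteringEnabled() {
      return clusteringEnabled;
    }

    Clustering getClustering() {
      return clustering;
    }
  }

  public static void main(String[] args) {
    Write single = new Write().withClustering(new Clustering("productId"));
    Write dynamic = new Write().withClustering();

    System.out.println(single.isClusteringEnabled() + " " + single.getClustering().getFields());
    System.out.println(dynamic.isClusteringEnabled() + " " + (dynamic.getClustering() == null));
  }
}
```

   In this sketch both call shapes set the same "enabled" flag, so the two
cases differ only in whether the fields are fixed on the transform or left to
each destination. Whether a nullable argument is preferable to two separately
named methods is the design question being raised, not something the sketch
settles.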
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 270428)
    Time Spent: 8h 50m  (was: 8h 40m)

> Add support for writing to BigQuery clustered tables
> ----------------------------------------------------
>
>                 Key: BEAM-5191
>                 URL: https://issues.apache.org/jira/browse/BEAM-5191
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>    Affects Versions: 2.6.0
>            Reporter: Robert Sahlin
>            Assignee: Wout Scheepers
>            Priority: Minor
>              Labels: features, newbie
>          Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Google recently added support for clustered tables in BigQuery. It would be 
> useful to set clustering columns the same way as for partitioning. It should 
> support multiple clustering fields (up to four).
> For example:
> [BigQueryIO.Write|https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.Write.html]<[T|https://beam.apache.org/documentation/sdks/javadoc/2.6.0/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.Write.html]>
>  .withClustering(new Clustering().setField("productId").setType("STRING"))



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
