Steven Jon Anderson created BEAM-2870:
-----------------------------------------

             Summary: BQ Partitioned Table Write Fails When Destination has Partition Decorator
                 Key: BEAM-2870
                 URL: https://issues.apache.org/jira/browse/BEAM-2870
             Project: Beam
          Issue Type: Bug
          Components: runner-dataflow
    Affects Versions: 2.2.0
         Environment: Dataflow Runner, Streaming, 10 x (n1-highmem-8 & 500 GB SSD)
            Reporter: Steven Jon Anderson
            Assignee: Thomas Groh
             Fix For: 2.2.0


Dataflow Job ID: 
https://console.cloud.google.com/dataflow/job/2017-09-08_23_03_14-14637186041605198816?project=firebase-lessthan3

Tagging [~reuvenlax] as I believe he built the time partitioning integration 
that was merged into master.

*Background*
Our production pipeline ingests millions of events per day and routes them 
into our clients' numerous tables. To keep costs down, all of our tables are 
partitioned. However, because creating partitioned tables isn't supported in 
2.1.0, we have to create the tables ourselves before we allow events to 
process. We've been looking forward for some time to [~reuvenlax]'s 
partitioned table write feature 
([#3663|https://github.com/apache/beam/pull/3663]) being merged into master, 
as it will let us launch our client platforms much faster. Today we got 
around to testing the 2.2.0 nightly and discovered this bug.

*Issue*
Our pipeline writes to a table with a partition decorator. Writing to an 
existing partitioned table with a decorator succeeds. Writing to a partitioned 
table that doesn't exist yet, without a decorator, also succeeds. *However, 
writing to a partitioned table that doesn't exist yet, with a decorator, 
fails*. 

*Example Implementation*
{code:java}
BigQueryIO.writeTableRows()
  .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
  .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
  .withFailedInsertRetryPolicy(InsertRetryPolicy.alwaysRetry())
  .to(new DynamicDestinations<TableRow, String>() {

    @Override
    public String getDestination(ValueInSingleWindow<TableRow> element) {
      // Table spec with a partition decorator ($YYYYMMDD).
      return "PROJECT_ID:DATASET_ID.TABLE_ID$20170902";
    }

    @Override
    public TableDestination getTable(String destination) {
      // Day-partitioned table; the description argument is left null.
      TimePartitioning dayPartition = new TimePartitioning().setType("DAY");
      return new TableDestination(destination, null, dayPartition);
    }

    @Override
    public TableSchema getSchema(String destination) {
      return TABLE_SCHEMA;
    }
  })
{code}

*Relevant Logs & Errors in StackDriver*

{code:none}
23:06:26.790 
Trying to create BigQuery table: PROJECT_ID:DATASET_ID.TABLE_ID$20170902

23:06:26.873 
Invalid table ID "TABLE_ID$20170902". Table IDs must be alphanumeric (plus 
underscores) and must be at most 1024 characters long. Also, Table decorators 
cannot be used.
{code}
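
For illustration only: the error above suggests that the decorated table ID is 
passed through unchanged to the table-creation call. A minimal sketch of 
separating the decorator from the table spec before creation follows. It 
assumes a hypothetical helper class {{PartitionDecorators}} (not part of the 
Beam API) and that the decorator is everything after the last {{$}}; it is not 
a description of how Beam handles this internally.

```java
// Hypothetical helper, not part of the Beam API: illustrates separating a
// partition decorator from a BigQuery table spec.
final class PartitionDecorators {
  private PartitionDecorators() {}

  /** Returns the table spec without a trailing "$partition" decorator, if any. */
  static String stripPartitionDecorator(String tableSpec) {
    int idx = tableSpec.lastIndexOf('$');
    return idx == -1 ? tableSpec : tableSpec.substring(0, idx);
  }

  /** Returns the partition part (e.g. "20170902"), or null when absent. */
  static String partitionOf(String tableSpec) {
    int idx = tableSpec.lastIndexOf('$');
    return idx == -1 ? null : tableSpec.substring(idx + 1);
  }
}
```

Under these assumptions, the table would be created using the stripped spec, 
while rows would still be written to the decorated spec so that they land in 
the intended partition.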



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
