Alex Van Boxel created BEAM-7999:
------------------------------------

             Summary: BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
                 Key: BEAM-7999
                 URL: https://issues.apache.org/jira/browse/BEAM-7999
             Project: Beam
          Issue Type: Task
          Components: io-java-gcp
    Affects Versions: 2.14.0, 2.15.0
            Reporter: Alex Van Boxel
            Assignee: Alex Van Boxel


Using the new readTableRowsWithSchema() to make a copy of a table (a simple 
operation) fails: the timestamp parsing assumes a Double value, but BigQuery 
outputs a string like "2019-08-16 00:12:00.123456 UTC", which isn't handled.

*Reproducible:*

With this table:
{code:sql}
INSERT `research.alex.in1` (row_id, f_int64, f_timestamp)
VALUES
    (1, 1, '2019-08-16 00:12:00 UTC'),
    (2, 2, '2019-08-16 00:12:00.123 UTC'),
    (3, 3, '2019-08-16 00:12:00.123456 UTC')
{code}
Do a copy operation:
{code:java}
pipeline
        .apply(
                BigQueryIO.readTableRowsWithSchema()
                        .from("research:alex.in1")
                //.withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ)
        )
        .apply(ParDo.of(new Inspect()))
        .apply(
                BigQueryIO.writeTableRows()
                        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
                        .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
                        .useBeamSchema()
                        .to("research:alex.out4"));
{code}
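
For illustration only, here is a minimal sketch of a parser that would accept all three timestamp strings from the table above (no fraction, milliseconds, microseconds). It is not Beam's actual code or the eventual fix. The class name BigQueryTimestampSketch and the method parse are hypothetical, and the fallback to a decimal epoch-seconds value is an assumption based on the "Double" behaviour described above.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Hypothetical helper, not part of Beam.
public class BigQueryTimestampSketch {

    // "uuuu-MM-dd HH:mm:ss", then an optional fraction of 1 to 9 digits,
    // then the literal " UTC" suffix.
    private static final DateTimeFormatter UTC_STRING =
            new DateTimeFormatterBuilder()
                    .appendPattern("uuuu-MM-dd HH:mm:ss")
                    .optionalStart()
                    .appendFraction(ChronoField.NANO_OF_SECOND, 1, 9, true)
                    .optionalEnd()
                    .appendLiteral(" UTC")
                    .toFormatter();

    static Instant parse(String value) {
        String v = value.trim();
        if (v.endsWith(" UTC")) {
            // String form, e.g. "2019-08-16 00:12:00.123456 UTC".
            return LocalDateTime.parse(v, UTC_STRING).toInstant(ZoneOffset.UTC);
        }
        // Assumed fallback: seconds since the epoch as a decimal number,
        // rounded to microsecond precision.
        long micros = new BigDecimal(v)
                .movePointRight(6)
                .setScale(0, RoundingMode.HALF_UP)
                .longValueExact();
        return Instant.ofEpochSecond(
                Math.floorDiv(micros, 1_000_000L),
                Math.floorMod(micros, 1_000_000L) * 1_000L);
    }
}
```

Trying the string form first and the numeric form second keeps the existing numeric behaviour intact while also accepting the values shown in the table.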
 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
