Alex Van Boxel created BEAM-7999:
------------------------------------

Summary: BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
Key: BEAM-7999
URL: https://issues.apache.org/jira/browse/BEAM-7999
Project: Beam
Issue Type: Task
Components: io-java-gcp
Affects Versions: 2.14.0, 2.15.0
Reporter: Alex Van Boxel
Assignee: Alex Van Boxel

Using the new readTableRowsWithSchema() to copy a table (a simple operation) fails: parsing the timestamp column doesn't work because the code assumes a Double value. BigQuery outputs a string like "2019-08-16 00:12:00.123456 UTC", which isn't handled.

*Reproducible:* with this table

{code:java}
INSERT `research.alex.in1` (row_id, f_int64, f_timestamp)
VALUES
  (1, 1, '2019-08-16 00:12:00 UTC'),
  (2, 2, '2019-08-16 00:12:00.123 UTC'),
  (3, 3, '2019-08-16 00:12:00.123456 UTC')
{code}

do a copy operation:

{code:java}
pipeline
    .apply(
        BigQueryIO.readTableRowsWithSchema()
            .from("research:alex.in1")
        //.withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ)
    )
    .apply(ParDo.of(new Inspect()))
    .apply(
        BigQueryIO.writeTableRows()
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
            .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
            .useBeamSchema()
            .to("research:alex.out4"));
{code}
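
For illustration, here is a minimal sketch of a parser that accepts the three string forms from the table above. It is not Beam's actual conversion code: the class name `BigQueryTimestampSketch` and the method `parse` are hypothetical. The numeric fallback is also an assumption. It supposes the Double form mentioned above is seconds since the epoch with a fractional part.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Hypothetical helper, not part of Beam.
final class BigQueryTimestampSketch {

  // Matches "yyyy-MM-dd HH:mm:ss", an optional fraction of 1 to 6 digits,
  // and the literal " UTC" suffix seen in the issue.
  private static final DateTimeFormatter BQ_STRING_FORMAT =
      new DateTimeFormatterBuilder()
          .appendPattern("yyyy-MM-dd HH:mm:ss")
          .optionalStart()
          .appendFraction(ChronoField.NANO_OF_SECOND, 1, 6, true)
          .optionalEnd()
          .appendLiteral(" UTC")
          .toFormatter();

  static Instant parse(String value) {
    if (value.endsWith(" UTC")) {
      // String form, e.g. "2019-08-16 00:12:00.123456 UTC".
      return LocalDateTime.parse(value, BQ_STRING_FORMAT).toInstant(ZoneOffset.UTC);
    }
    // Assumed numeric form: seconds since the epoch, possibly in
    // scientific notation. BigDecimal avoids double rounding.
    long micros = new BigDecimal(value).movePointRight(6).longValue();
    return Instant.ofEpochSecond(
        Math.floorDiv(micros, 1_000_000L),
        Math.floorMod(micros, 1_000_000L) * 1_000L);
  }

  private BigQueryTimestampSketch() {}
}
```

Calling `BigQueryTimestampSketch.parse("2019-08-16 00:12:00.123456 UTC")` should yield the same `Instant` as `Instant.parse("2019-08-16T00:12:00.123456Z")`. Checking the suffix before choosing a parser keeps the two input shapes separate, so a malformed string fails loudly instead of being misread as a number.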