Re: BigQuery TIMESTAMP and TimestampedValue()

2020-01-23 Thread Sandy Walsh
Thanks Kenneth.

Yes, fortunately it's *always* UTC, so I was able to solve it with

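import datetime

t = event['ts']
# YYYY-[M]M-[D]D[( |T)[H]H:[M]M:[S]S[.DD]][UTC]
dt = datetime.datetime.strptime(t, '%Y-%m-%d %H:%M:%S.%f %Z')
# %Z leaves dt naive; mark it as UTC so timestamp() ignores the local zone
dt = dt.replace(tzinfo=datetime.timezone.utc)
yield beam.window.TimestampedValue(event, dt.timestamp())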
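
A standalone sketch of this conversion that can be checked outside a pipeline. It attaches UTC explicitly, because strptime's %Z matches the zone name but still returns a naive datetime, and timestamp() on a naive datetime uses the machine's local zone. The helper name is hypothetical, and the fallback for values without fractional seconds is an assumption rather than something confirmed in this thread.

```python
import datetime

# Hypothetical helper: converts BigQuery's string form of a TIMESTAMP
# (e.g. '2019-12-13 09:38:19.380224 UTC') to seconds since the Unix epoch.
def bq_timestamp_to_epoch(ts):
    # Assumption: values with zero microseconds may omit the fractional
    # part, so both layouts are tried.
    for fmt in ('%Y-%m-%d %H:%M:%S.%f %Z', '%Y-%m-%d %H:%M:%S %Z'):
        try:
            naive = datetime.datetime.strptime(ts, fmt)
        except ValueError:
            continue
        # %Z matched the zone name but the result is still naive;
        # attach UTC explicitly before converting.
        return naive.replace(tzinfo=datetime.timezone.utc).timestamp()
    raise ValueError('unrecognized BigQuery timestamp: %r' % ts)
```

Inside a DoFn this could be used as `yield beam.window.TimestampedValue(event, bq_timestamp_to_epoch(event['ts']))`.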


On Wed, Jan 22, 2020 at 7:07 PM Kenneth Knowles wrote:

> Ah, that's too bad. I wonder why they chose to put " UTC" on the end
> instead of just a "Z". Other than that, the format is RFC3339 and the
> iso8601 module does have the extension to use a space instead of a T to
> separate the date and time. I tested and if you strip the " UTC" then
> parsing succeeds.
>
> Since BigQuery TIMESTAMPS do not carry time zone information, it is safe
> to ignore the time zone portion. The problem of course is if they
> change/fix this it could break your code.
>
> Kenn
>
> On Mon, Jan 20, 2020 at 2:45 PM Sandy Walsh wrote:
>
>> Newb here for what will certainly be the first of many
>> silly questions ...
>>
>> I'm working on a Dataflow pipeline using the Python SDK (local runners
>> currently).
>>
>> It's a bounded source from BigQuery. One column is a TIMESTAMP. I'm
>> trying to assign the timestamp using beam.window.TimestampedValue() but
>> the timestamp I'm getting back from BQ seems to be a string and not in
>> RFC3339 format.
>>
>> The format is '2019-12-13 09:38:19.380224 UTC' ... which I could
>> explicitly convert but I'd rather do that in the query.
>>
>> Any suggestions on how to get the timestamp back in a format I can parse
>> with iso8601.parse_date() or, ideally, just pass into TimestampedValue()
>> without having to parse a string?
>>
>> Thanks
>>
>>


BigQuery TIMESTAMP and TimestampedValue()

2020-01-20 Thread Sandy Walsh
Newb here for what will certainly be the first of many
silly questions ...

I'm working on a Dataflow pipeline using the Python SDK (local runners
currently).

It's a bounded source from BigQuery. One column is a TIMESTAMP. I'm trying
to assign the timestamp using beam.window.TimestampedValue() but the
timestamp I'm getting back from BQ seems to be a string and not in RFC3339
format.

The format is '2019-12-13 09:38:19.380224 UTC' ... which I could explicitly
convert but I'd rather do that in the query.

Any suggestions on how to get the timestamp back in a format I can parse
with iso8601.parse_date() or, ideally, just pass into TimestampedValue()
without having to parse a string?

Thanks
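
One possible way to do the conversion in the query, as asked above, is to have BigQuery return the column as an integer using its UNIX_MICROS function, so no string parsing is needed. This is only a sketch: the table name, the column alias and the helper are hypothetical, and it assumes standard SQL and that the INT64 column arrives in the pipeline as a Python int.

```python
# Hypothetical table and column names. UNIX_MICROS is BigQuery's function
# that returns a TIMESTAMP as integer microseconds since the Unix epoch.
QUERY = """
SELECT *, UNIX_MICROS(ts) AS ts_micros
FROM `my_project.my_dataset.events`
"""

# Hypothetical helper: TimestampedValue takes seconds since the epoch,
# so scale the microsecond count down.
def micros_to_seconds(micros):
    return micros / 1e6
```

Each element could then be emitted with `beam.window.TimestampedValue(event, micros_to_seconds(event['ts_micros']))`.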