It seems that Arrow’s timestamp type can either have no time zone or be UTC. I 
think that is a flawed design, because it doesn’t catch user errors.

Suppose you want to find the number of milliseconds between two timestamps. If 
the first has a time zone and the second is implicitly UTC, then you can convert 
them both to instants and subtract. But if the first has a time zone and the 
second has no time zone, you must supply a time zone for the second before you 
can subtract. So, the subtraction function will have a different signature.
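
To make the two signatures concrete, here is a rough sketch in Python rather 
than Arrow. The function names millis_between and millis_between_local are 
made up for illustration, and the fixed UTC offsets stand in for real time 
zones; none of this is Arrow's API.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def millis_between(a: datetime, b: datetime) -> int:
    """Milliseconds from b to a; both arguments must be zone-aware instants."""
    if a.tzinfo is None or b.tzinfo is None:
        raise TypeError("both timestamps must carry a time zone")
    # Aware datetimes are normalized to UTC before subtraction.
    return (a - b) // timedelta(milliseconds=1)


def millis_between_local(a: datetime, b_local: datetime, b_zone: tzinfo) -> int:
    """Same, but b_local has no zone, so the caller must supply one."""
    if b_local.tzinfo is not None:
        raise TypeError("b_local must not carry a time zone")
    return millis_between(a, b_local.replace(tzinfo=b_zone))


# Hypothetical sample values: fixed offsets used in place of named zones.
minus_7 = timezone(timedelta(hours=-7))
minus_4 = timezone(timedelta(hours=-4))

first = datetime(2021, 6, 3, 11, 7, tzinfo=minus_7)        # 18:07 UTC
second_utc = datetime(2021, 6, 3, 18, 2, tzinfo=timezone.utc)
second_local = datetime(2021, 6, 3, 14, 2)                 # no zone attached

diff_instants = millis_between(first, second_utc)
diff_with_zone = millis_between_local(first, second_local, minus_4)
```

The second function needs an extra argument that the first does not, which 
is the difference in signature I mean.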

There are many similar operations, where a time zone needs to be supplied, or 
where you cannot safely mix timestamps with different time zones.
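
For comparison, Python's standard datetime keeps the two kinds apart at run 
time: subtracting or ordering a zone-aware value against a zone-less one 
raises TypeError, while equality just returns False. A small sketch (the 
helper name mixes_without_error is made up for illustration):

```python
import operator
from datetime import datetime, timezone

aware = datetime(2021, 6, 3, 18, 7, tzinfo=timezone.utc)
naive = datetime(2021, 6, 3, 18, 2)  # no time zone attached


def mixes_without_error(op) -> bool:
    """Apply op to one aware and one naive datetime; report whether it ran."""
    try:
        op(aware, naive)
        return True
    except TypeError:
        return False


can_subtract = mixes_without_error(operator.sub)  # rejected by datetime
can_order = mixes_without_error(operator.lt)      # rejected by datetime
can_equate = mixes_without_error(operator.eq)     # allowed; compares unequal
```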

Julian


> On Jun 3, 2021, at 11:07 AM, Adam Hooper <a...@adamhooper.com> wrote:
> 
> On Thu, Jun 3, 2021 at 2:02 PM Adam Hooper <a...@adamhooper.com> wrote:
> 
>> I understand isAdjustedToUTC=true to mean "timestamp", and
>> isAdjustedToUTC=false to mean, "int64 and I hope somebody attached some
>> docs because
>> https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#local-semantics-timestamps-not-normalized-to-utc
>> lists a whole slew of potential meanings and without extra metadata I'll
>> never be able to figure out what this column means."
>> 
> 
> Correcting myself here: Parquet isAdjustedToUTC=false does have just one
> meaning. It means encoding a "(year, month, day, hour, minute, second,
> microsecond)" tuple as a single integer.
> 
> Adam
> 
> -- 
> Adam Hooper
> +1-514-882-9694
> http://adamhooper.com
