Gotcha. "Assume timezone" is like "tz_localize" in pandas.
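
To check my understanding of the difference, here is a minimal sketch using only the Python standard library (not Arrow or pandas code). It assumes "assume timezone"/tz_localize keeps the wall-clock value and reinterprets it as local time, while a cast/tz_convert keeps the instant and only changes the zone used for display. The variable names and the sample date are made up for illustration, and the fixed-offset fallback is my own assumption for machines without a tz database.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

try:
    NEW_YORK = ZoneInfo("America/New_York")
except ZoneInfoNotFoundError:
    # Assumption: no tz database installed. A fixed EST offset is only
    # correct for winter dates such as the February sample used below.
    NEW_YORK = timezone(timedelta(hours=-5), "EST")

# "Assume timezone" / tz_localize: the wall-clock reading is kept and
# interpreted as New York local time, so the underlying instant changes.
naive = datetime(2022, 2, 3, 16, 30)
localized = naive.replace(tzinfo=NEW_YORK)

# Cast / tz_convert: the instant is kept, only the zone used to read
# hour and minute changes.
utc_ts = datetime(2022, 2, 3, 16, 30, tzinfo=timezone.utc)
converted = utc_ts.astimezone(NEW_YORK)

# The filter from the original question: local time equals 11:30.
matches = converted.time() == time(11, 30)
```

With UTC input, only the second path gives New York hour and minute values for the same instant, which is why assume_timezone is the wrong tool here.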

> If you are starting from a UTC-zoned timestamp you can just cast it,
> as David suggests, to the desired timezone (this is a metadata only
> change)

I didn't see any target timezone information being passed in the "cast"
function - how would I do that?

On Thu, Feb 3, 2022 at 10:48 AM Rok Mihevc <rok.mih...@gmail.com> wrote:

> Assume_timezone will "assume timezone" of the local time you pass it
> and give you a zoned timestamp. This means your timestamps will be
> interpreted as local times and converted to UTC in the background.
> Resulting array will have timezone metadata of your assumed timezone.
>
> If you are starting from a UTC-zoned timestamp you can just cast it,
> as David suggests, to the desired timezone (this is a metadata only
> change) and proceed to extracting time components. Components will be
> extracted in the metadata timezone.
>
> On Thu, Feb 3, 2022 at 4:27 PM Li Jin <ice.xell...@gmail.com> wrote:
> >
> > David - Thanks for the pointers. I didn't know you could cast a timestamp
> > to time type to extract the hour/minute information. Nice!
> >
> > Rok - Not sure I understand what you mean...  My input is UTC but I want to
> > extract the time information local to New York Timezone (e.g. filter time
> > to 10 AM New York time). How would I do this without assume_timezone?
> >
> > On Thu, Feb 3, 2022 at 10:13 AM Rok Mihevc <rok.mih...@gmail.com> wrote:
> >
> > > Hey Li,
> > >
> > > If your input data is in UTC you don't need assume_timezone [1]. You
> > > would need it if your input was America/New_York local time and you
> > > wanted to convert to a zoned timestamp array where underlying data is
> > > in UTC and timezone is metadata only. Perhaps python tests are
> > > interesting for reference [2].
> > >
> > > Available extraction kernels are listed here: [3].
> > >
> > > Rok
> > >
> > > [1] https://arrow.apache.org/docs/python/generated/pyarrow.compute.assume_timezone.html
> > > [2] https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_compute.py#L1908-L1999
> > > [3] https://arrow.apache.org/docs/cpp/compute.html#temporal-component-extraction
> > >
> > > On Thu, Feb 3, 2022 at 3:54 PM Li Jin <ice.xell...@gmail.com> wrote:
> > > >
> > > > Hello!
> > > >
> > > > I am new to the Arrow C++ compute engine and trying to figure out this
> > > > time zone conversion and time extraction:
> > > >
> > > > t.dt.tz_convert('America/New_York').dt.time == datetime.time(11, 30, 0)
> > > >
> > > > So I started looking at:
> > > >
> > > > https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc
> > > >
> > > > and found these functions that seem relevant:
> > > > assume_timezone
> > > > hour
> > > > minute
> > > >
> > > > So my thinking is to figure out a way to build a plan that basically
> > > > does these steps:
> > > > (1) Assume timezone to New_York (input data is UTC)
> > > > (2) Extract hour value
> > > > (3) Extract minute value
> > > > (4) Filter on hour and minute value
> > > >
> > > > I wonder what is a good way to map these functions in
> > > > scalar_temporal_unary to an ExecPlan? (Looked under
> > > > https://github.com/apache/arrow/tree/master/cpp/src/arrow/compute/exec
> > > > but didn't see anything obvious)
> > > >
> > > > Thanks!
> > > > Li
> > >
>
