Gotcha. "Assume timezone" is like "tz_localize" in pandas.
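
To check that I have the analogy right, here is a minimal sketch of the two operations using only the Python standard library. It does not use pandas or Arrow: `localize` and `convert` are hypothetical helper names, and the fixed UTC-05:00 offset is an assumption standing in for America/New_York (valid for the February date used here, but not during daylight saving time).

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: a fixed UTC-05:00 offset stands in for America/New_York.
# That is correct for 2022-02-03 (EST) but would be wrong during DST.
NEW_YORK_EST = timezone(timedelta(hours=-5), "EST")


def localize(naive: datetime, tz: timezone) -> datetime:
    """Hypothetical helper, analogous to tz_localize / assume_timezone:
    interpret a naive wall-clock time as local time in tz."""
    if naive.tzinfo is not None:
        raise ValueError("expected a naive datetime")
    return naive.replace(tzinfo=tz)


def convert(aware: datetime, tz: timezone) -> datetime:
    """Hypothetical helper, analogous to tz_convert / casting to a zone:
    keep the same instant and only change the zone it is shown in."""
    if aware.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return aware.astimezone(tz)


# Input that is already in UTC: converting keeps the instant unchanged.
utc_input = datetime(2022, 2, 3, 16, 30, tzinfo=timezone.utc)
converted = convert(utc_input, NEW_YORK_EST)

# Naive input with the same wall-clock digits: localizing moves the instant.
localized = localize(datetime(2022, 2, 3, 16, 30), NEW_YORK_EST)

print(converted.time())                   # local wall-clock time in the zone
print(localized.astimezone(timezone.utc)) # the instant that localizing implies
```

If this is right, then for UTC input the conversion path is the one that matches `tz_convert` in my pandas expression, and localizing would shift the underlying instant by five hours.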

> If you are starting from a UTC-zoned timestamp you can just cast it, as
> David suggests, to the desired timezone (this is a metadata only change)

I didn't see any target timezone information being passed in the "cast"
function - how would I do that?

On Thu, Feb 3, 2022 at 10:48 AM Rok Mihevc <[email protected]> wrote:

> Assume_timezone will "assume timezone" of the local time you pass it
> and give you a zoned timestamp. This means your timestamps will be
> interpreted as local times and converted to UTC in the background.
> Resulting array will have timezone metadata of your assumed timezone.
>
> If you are starting from a UTC-zoned timestamp you can just cast it,
> as David suggests, to the desired timezone (this is a metadata only
> change) and proceed to extracting time components. Components will be
> extracted in the metadata timezone.
>
> On Thu, Feb 3, 2022 at 4:27 PM Li Jin <[email protected]> wrote:
> >
> > David - Thanks for the pointers. I didn't know you could cast a timestamp
> > to time type to extract the hour/minute information. Nice!
> >
> > Rok - Not sure I understand what you mean... My input is UTC but I want to
> > extract the time information local to New York Timezone (e.g. filter time
> > to 10 AM New York time). How would I do this without assume_timezone?
> >
> > On Thu, Feb 3, 2022 at 10:13 AM Rok Mihevc <[email protected]> wrote:
> >
> > > Hey Li,
> > >
> > > If your input data is in UTC you don't need assume_timezone [1]. You
> > > would need it if your input was America/New_York local time and you
> > > wanted to convert to a zoned timestamp array where underlying data is
> > > in UTC and timezone is metadata only. Perhaps python tests are
> > > interesting for reference [2].
> > >
> > > Available extraction kernels are listed here: [3].
> > >
> > > Rok
> > >
> > > [1] https://arrow.apache.org/docs/python/generated/pyarrow.compute.assume_timezone.html
> > > [2] https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_compute.py#L1908-L1999
> > > [3] https://arrow.apache.org/docs/cpp/compute.html#temporal-component-extraction
> > >
> > > On Thu, Feb 3, 2022 at 3:54 PM Li Jin <[email protected]> wrote:
> > > >
> > > > Hello!
> > > >
> > > > I am new to the Arrow C++ compute engine and trying to figure out this
> > > > time zone conversion and time extraction:
> > > >
> > > > t.dt.tz_convert('America/New_York').dt.time == datetime.time(11, 30, 0)
> > > >
> > > > So I started looking at:
> > > >
> > > > https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc
> > > >
> > > > and found these these functions seem relevant:
> > > > assume_timezone
> > > > hour
> > > > minute
> > > >
> > > > So my thinking is trying to figure out a way to build plan that basically
> > > > does these steps:
> > > > (1) Assume timezone to New_York (input data is UTC)
> > > > (2) Extract hour value
> > > > (3) Extract minute value
> > > > (4) Filter on hour and minute value
> > > >
> > > > I wonder what is a good way to map these functions in
> > > > scalar_temporal_unary to an ExecPlan? (Looked under
> > > > https://github.com/apache/arrow/tree/master/cpp/src/arrow/compute/exec but
> > > > didn't see anything obvious)
> > > >
> > > > Thanks!
> > > > Li
> > > >
