Nested PyArrow scalar

2022-03-30 Thread Wenlei Xie
Hi, When playing around with PyArrow scalars, I found that `pa.scalar` seems to expect its input to be a "pure Python object", e.g. it cannot be a Python list of Arrow scalars (such as `[pa.scalar(1)]`):
```
>>> import pyarrow as pa
>>> pa.__version__
'7.0.0'
>>> pa.scalar([1])
>>> pa.scalar([pa.scalar(1)])
Traceback
```

Re: Evaluate expressions on a pyarrow table in-memory

2022-03-30 Thread Weston Pace
Yes and no :) Disclaimer: this answer is a little out of my wheelhouse, as I've learned relational algebra through doing, so my formal theory may be off. Anyone is welcome to correct me. Also, this answer turned into a bit of a ramble and is a bit scattershot. You may already be very familiar

Evaluate expressions on a pyarrow table in-memory

2022-03-30 Thread Suresh V
Hi, Is there a way to evaluate mathematical expressions against the columns of a pyarrow table that is already in memory, similar to how projections work for dataset scanners? The goal is to specify a bunch of strings like sum(a * b)/sum(a), or avg(a[:10]) etc. Convert these into expressions