Can we join on a "dataset" yet using pyarrow? What I mean is: my Parquet
file is larger than memory. Can I read it using the dataset API and join it
with another dataset or an in-memory table? If so, I couldn't find it in the
documentation. Could you please show how to do that join?
On Tue, Apr 16, 2024, 9:59
Anyone?
On Sun, Aug 27, 2023 at 2:21 AM PASSWORD ADMINISTRATOR <
ultimatepwdmas...@gmail.com> wrote:
> First time using a mailing list so bear with me.
>
> I am trying to run a simple query on full NYC taxi dataset (my local copy
> on HDD), which counts number of rows per gro
First time using a mailing list, so bear with me.
I am trying to run a simple query on the full NYC taxi dataset (my local copy
on HDD) that counts the number of rows per group, i.e. group by X, then
count(*).
In R-arrow, this can be done using
nyc_taxi = arrow::open_dataset('aria_nyc/',partitioning =