Hello,
At the moment, for the upcoming Spark 2.3 release, Arrow support is
limited to PySpark's DataFrame.toPandas(), so reading Parquet data in
Spark does not go through Arrow.
-Bryan
On Wed, Aug 30, 2017 at 2:47 AM, big data wrote:
I want to use Arrow as a middle layer between Spark and Parquet data in
HDFS, but I can't find any docs on how to load Parquet data into Arrow
in memory, or on how Spark reads the Arrow data format. Can anyone provide
some examples or manuals that describe this?
Thanks.