Thanks Micah. Yes, it's for PySpark.

Looking at what's there, it's exposing the entire result of do_get() to
Spark as one partition so we'll need to change it to expose each
VectorSchemaRoot as one partition instead.

On Fri, Mar 25, 2022 at 12:31 PM Micah Kornfield <[email protected]>
wrote:

> This is for pyspark?
>
> This might be a better question on the Spark mailing list for mechanics,
> but I think the only way to do this efficiently is going though the
> DataSource/DataSourceV2 APIs in Scala/Java.  I think Ryan Murray prototyped
> something along this path a while ago [1], but I'm not sure of its current
> state.
>
> [1]
> https://github.com/rymurr/flight-spark-source/blob/master/src/main/java/org/apache/arrow/flight/spark/FlightDataReader.java
>
> On Fri, Mar 25, 2022 at 12:01 PM James Duong <[email protected]>
> wrote:
>
>> Most of the examples I've seen for loading data into Spark convert a
>> FlightStreamReader to pandas using to_pandas().
>>
>> However this seems to load the entire contents of the stream into memory.
>> Is there an easy way to load individual chunks from the FlightStream into
>> dataframes? I figure you could call get_chunk() to get a record batch then
>> convert the batch to a dataframe.
>>
>> --
>>
>> *James Duong*
>> Lead Software Developer
>> Bit Quill Technologies Inc.
>> Direct: +1.604.562.6082 | [email protected]
>> https://www.bitquilltech.com
>>
>> This email message is for the sole use of the intended recipient(s) and
>> may contain confidential and privileged information.  Any unauthorized
>> review, use, disclosure, or distribution is prohibited.  If you are not the
>> intended recipient, please contact the sender by reply email and destroy
>> all copies of the original message.  Thank you.
>>
>

-- 

*James Duong*
Lead Software Developer
Bit Quill Technologies Inc.
Direct: +1.604.562.6082 | [email protected]
https://www.bitquilltech.com

This email message is for the sole use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure, or distribution is prohibited.  If you are not the
intended recipient, please contact the sender by reply email and destroy
all copies of the original message.  Thank you.

Reply via email to