Arrow RecordBatches to Spark Dataframe

2020-06-24 Thread Tanveer Ahmad - EWI
Hi all, I have a small question and would appreciate your help. In this code

Spark dataframe creation through already distributed in-memory data sets

2020-06-16 Thread Tanveer Ahmad - EWI
Hi all, I am new to the Spark community. Please ignore this if the question doesn't make sense. My PySpark Dataframe takes only a fraction of the time (a few ms) in 'Sorting', but moving the data is much more expensive (> 14 sec). Explanation: I have a huge Arrow RecordBatches collection which is equally

Re: Arrow RecordBatches/Pandas Dataframes to (Arrow enabled) Spark Dataframe conversion in streaming fashion

2020-06-11 Thread Tanveer Ahmad - EWI
Hi Jorge, Thank you. This union function is a better alternative for my work. Regards, Tanveer Ahmad From: Jorge Machado Sent: Monday, May 25, 2020 3:56:04 PM To: Tanveer Ahmad - EWI Cc: Spark Group Subject: Re: Arrow RecordBatches/Pandas Dataframes to (Arrow

Arrow RecordBatches/Pandas Dataframes to (Arrow enabled) Spark Dataframe conversion in streaming fashion

2020-05-25 Thread Tanveer Ahmad - EWI
Hi all, I need some help with converting Arrow RecordBatches/Pandas Dataframes to (Arrow enabled) Spark Dataframes. The example here explains very well how to convert a single Pandas Dataframe to a Spark Dataframe [1]. But in my case, some external applications are generating Arrow