Re: Loading .xlsx and .xlx files using pyspark

Sean Owen Wed, 23 Feb 2022 06:56:56 -0800

This isn't pandas, it's pandas on Spark. It's distributed.
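
To make the distinction concrete, here is a minimal sketch, assuming pyspark 3.2 or later and an Excel engine such as openpyxl are installed. The names `is_excel_path` and `excel_to_spark` are hypothetical helpers for illustration, not part of PySpark; the only PySpark calls are `ps.read_excel` and `to_spark()` from Bjørn's snippet.

```python
from pathlib import Path

# Extensions named in the read_excel docstring quoted in this thread.
EXCEL_SUFFIXES = {".xls", ".xlsx"}


def is_excel_path(path):
    """Return True if the path ends in .xls or .xlsx (case-insensitive)."""
    return Path(path).suffix.lower() in EXCEL_SUFFIXES


def excel_to_spark(path):
    """Read one Excel file with pandas-on-Spark and return a Spark DataFrame.

    Hypothetical helper for illustration only.
    """
    if not is_excel_path(path):
        raise ValueError(f"not an Excel file: {path}")
    # Imported lazily so this module loads even where Spark is not installed.
    import pyspark.pandas as ps

    psdf = ps.read_excel(path)  # pyspark.pandas.DataFrame, backed by Spark
    return psdf.to_spark()      # pyspark.sql.DataFrame
```

The result is a regular Spark DataFrame, so the usual DataFrame API applies from there.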

On Wed, Feb 23, 2022 at 8:55 AM Sid <flinkbyhe...@gmail.com> wrote:


> Hi Bjørn,
>
> Thanks for your reply. This doesn't help when loading huge datasets.
> With this approach I won't be able to use Spark to load the file in a
> distributed manner.
>
> Thanks,
> Sid
>
> On Wed, Feb 23, 2022 at 7:38 PM Bjørn Jørgensen <bjornjorgen...@gmail.com>
> wrote:
>
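>> from pyspark import pandas as ps
>>
>> # In IPython, `ps.read_excel?` shows the docstring, which says:
>> # "Support both `xls` and `xlsx` file extensions from a local filesystem or
>> # URL"
>>
>> # Returns a pandas-on-Spark DataFrame, not a plain pandas one.
>> psdf = ps.read_excel("file")
>>
>> # Convert it to a regular Spark DataFrame.
>> df = psdf.to_spark()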
>>
>>
