Re: Creating InMemory relations with data in ColumnarBatches

2023-04-04 Thread Bobby Evans
This is not going to work without changes to Spark. InMemoryTableScanExec supports columnar output, but not columnar input, so you would have to write code to support that in Spark itself. The second problem is that only a handful of operators support columnar output. Really it is just

Re: Creating InMemory relations with data in ColumnarBatches

2023-03-31 Thread praveen sinha
Yes, purely for performance.

On Thu, Mar 30, 2023, 3:01 PM Mich Talebzadeh wrote:
> Is this purely for performance consideration?
>
> Mich Talebzadeh,
> Lead Solutions Architect/Engineering Lead
> Palantir Technologies Limited

Re: Creating InMemory relations with data in ColumnarBatches

2023-03-30 Thread Mich Talebzadeh
Is this purely for performance consideration?

Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
https://en.everybodywiki.com/Mich_Talebzadeh

Creating InMemory relations with data in ColumnarBatches

2023-03-30 Thread praveen sinha
Hi, I have been trying to implement an InMemoryRelation based on Spark ColumnarBatches, but so far I have not been able to store the vectorised ColumnarBatch in the relation. Is there a way to achieve this without going through an intermediary representation like Arrow, so as to enable Spark to do fast