Hey team,

I have been working on DataFrame support for a while, and I’m fairly 
confident I can open a PR in the next few days. Along the way I noticed a lot 
of scaffolding around RDDs just to use DataFrames, and several existing 
operators already wrap temporary DataFrames. I decided to refactor those too. 
The idea is that the user sets a switch to choose between RDDs and DataFrames. 

In more detail: ParquetSource now carries a preferDatasetOutput flag, exposed 
as `preferDatasetOutput(boolean)` and `isDatasetOutputPreferred()`, so that 
higher-level APIs can request Dataset-backed execution.
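
To make the flag concrete, here is a rough sketch of how I picture it. This is 
not the actual ParquetSource code: the class name ParquetSourceSketch, the 
fluent return type, the RDD default, and the describeOutputChannel() helper are 
all assumptions for illustration. Only the two method names come from the 
change described above.

```java
// Hypothetical stand-in for ParquetSource; illustrates the flag only.
final class ParquetSourceSketch {

    private final String inputUrl;

    // Assumption: RDD-backed output stays the default unless requested otherwise.
    private boolean preferDatasetOutput = false;

    ParquetSourceSketch(String inputUrl) {
        this.inputUrl = inputUrl;
    }

    // Setter named as in the proposal; returning `this` is an assumption
    // made here so that calls can be chained.
    ParquetSourceSketch preferDatasetOutput(boolean prefer) {
        this.preferDatasetOutput = prefer;
        return this;
    }

    // Getter named as in the proposal.
    boolean isDatasetOutputPreferred() {
        return preferDatasetOutput;
    }

    String getInputUrl() {
        return inputUrl;
    }

    // Hypothetical helper: shows how a higher-level API could branch on the flag.
    String describeOutputChannel() {
        return preferDatasetOutput ? "Dataset" : "RDD";
    }
}
```

A higher-level API could then call something like 
`new ParquetSourceSketch(url).preferDatasetOutput(true)` and pick the 
Dataset-backed path when `isDatasetOutputPreferred()` returns true.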

Thoughts?

—Alex 

--
Alexander Alten
CTO & co-founder
Scalytics - We Connect the World’s Data


