mattcuento commented on PR #1376: URL: https://github.com/apache/datafusion-ballista/pull/1376#issuecomment-3771106271
Hey @milenkovicm 👋 thanks for the feedback, you're probably right.

> Could you just use in memory catalog on the scheduler (initialised with custom scheduler) to keep the state.

I've updated this PR to use the in-memory catalog for a standalone example to make things easier.

> you could create DDL using logical plan and serialize it to substrat, and send it across (to create table, which will state in scheduler catalog) then run your statement

Just for the sake of learning what's going on, I do have this working for a filesystem catalog and a remote example, in case you're curious: https://github.com/apache/datafusion-ballista/compare/main...mattcuento:datafusion-ballista:remote-substrait-example?expand=1

It turns out that:

- `datafusion-substrait` produces `EmptyRelation` for DDL statements
- `datafusion-substrait` doesn't support consuming DDL rel nodes

So DDL will never make it to the scheduler with the Substrait support as is; it must be handled with an external catalog. Happy to write some docs on usage as a follow-up.

> will have a better look in the evening, it would be great if we can merge this soon and you get into more interesting issues, if you agree. what do you think ?

Separately, yes, would love to get into some more interesting issues 🙂
