Thanks for bringing this up, Josh!

I agree the current DataFusion community is very focused on building a
foundation for data-intensive systems like databases, data flow engines,
etc. It is NOT really focused on the end users of those systems.

In my mind there is not yet a community around the DataFusion Python
bindings that will drive them forward as anything other than bindings to
the underlying engine.

One of DataFusion's strengths is that it can be used to build many things
(including the things you list above).

Therefore, I think the question is "what do you want to build?" Insofar
as DataFusion (the technology and the community at large) can help, we
would love to. This could be within the DataFusion / Arrow project itself
or it could be entirely outside using the code from the project, or maybe
something in between.

Hope that helps
Andrew


On Thu, Aug 24, 2023 at 5:23 AM Josh Magarick <jmagar...@gmail.com> wrote:

> Ahoy!
> Recently, there was a request to find people to take a more active role in
> defining and building a Python interface to DataFusion here:
> https://github.com/apache/arrow-datafusion-python/issues/440
>
> In response, I've filed the following to get a sense of what's important to
> people:
> https://github.com/apache/arrow-datafusion-python/issues/462
>
> However, in addition to wanting to publicize my request more, some nagging
> questions about the broader goals of DataFusion and Arrow remain. Given
> that DF is pitched as a foundation for database systems, what are the
> aspirations for an interface in Python or other high level languages? Are
> folks imagining it will be used for building pipelines, automated analysis,
> interactive EDA? All of the above? Something else entirely?
>
> Given my background I'm inclined toward something aimed at both interactive
> and automated data analysis. It seems like a lot of the foundation is
> there, though I think doing it right requires more than just an interface
> on what exists already. There's more to discuss but hopefully this is
> enough to get started. Thanks for taking the time to read this.
>
> Regards,
>
> Josh
>
