andygrove commented on issue #30: URL: https://github.com/apache/arrow-ballista/issues/30#issuecomment-1146639503
Hi @ziedbouf. I'm glad you are finding the book helpful! Let me try and answer some of your questions. I have not used Plasma and there is no support for it in DataFusion/Ballista. I recall a discussion about it being unmaintained but I could be wrong. Ballista doesn't provide an in-memory cache like redis, although it would be possible to implement a custom datasource to connect to redis. Ballista is similar to Spark SQL and Dask-SQL in terms of architecture, so yes, could be seen as a competitor although it is not as mature yet. Ballista does not have any stream, graph, or ML capabilities yet though. As mentioned in this discussion, some contributors are planning on working on streaming. My personal view is that we need to get Ballista to the point of maturity where it can run industry-standard benchmarks at scale with performance and scalability at least as good as Spark. With that, and some better docs, the project hopefully starts to gain more traction and more contributors and that would eventually lead to people building ML features perhaps. The blaze project is really interesting because it leverages the mature Spark scheduler and uses DataFusion for execution. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
