Hi,

I gather from the replies that the plugin is not yet available in the form expected, although I am aware of the shell script.
Also, do you have any benchmark results from your tests that you could share?

Thanks,

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom

view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* The information provided is correct to the best of my knowledge, sourced from both personal expertise and other resources, but of course cannot be guaranteed. It is essential to note that, as with any advice, one verified and tested result holds more weight than a thousand expert opinions.

On Thu, 15 Feb 2024 at 01:18, Chao Sun <sunc...@apache.org> wrote:

> Hi Praveen,
>
> We will add a "Getting Started" section in the README soon, but basically
> comet-spark-shell
> <https://github.com/apache/arrow-datafusion-comet/blob/main/bin/comet-spark-shell>
> in the repo should provide a basic tool to build Comet and launch a Spark
> shell with it.
>
> Note that we haven't open sourced several features yet including shuffle
> support, which the aggregate operation depends on. Please stay tuned!
>
> Chao
>
>
> On Wed, Feb 14, 2024 at 2:44 PM praveen sinha <praveen.si...@gmail.com>
> wrote:
>
>> Hi Chao,
>>
>> Is there any example app/gist/repo which can help me use this plugin? I
>> wanted to try out some realtime aggregate performance on top of parquet and
>> spark dataframes.
>>
>> Thanks and Regards
>> Praveen
>>
>>
>> On Wed, Feb 14, 2024 at 9:20 AM Chao Sun <sunc...@apache.org> wrote:
>>
>>> > Out of interest what are the differences in the approach between this
>>> > and Gluten?
>>>
>>> Overall they are similar, although Gluten supports multiple backends
>>> including Velox and Clickhouse. One major difference is (obviously)
>>> Comet is based on DataFusion and Arrow, and written in Rust, while
>>> Gluten is mostly C++.
>>> I haven't looked very deep into Gluten yet, but there could be other
>>> differences such as how strictly the engine follows Spark's semantics,
>>> table format support (Iceberg, Delta, etc), fallback mechanism
>>> (coarse-grained fallback on stage level or more fine-grained fallback
>>> within stages), UDF support (Comet hasn't started on this yet),
>>> shuffle support, memory management, etc.
>>>
>>> Both engines are backed by very strong and vibrant open source
>>> communities (Velox, Clickhouse, Arrow & DataFusion) so it's very
>>> exciting to see how the projects will grow in future.
>>>
>>> Best,
>>> Chao
>>>
>>> On Tue, Feb 13, 2024 at 10:06 PM John Zhuge <jzh...@apache.org> wrote:
>>> >
>>> > Congratulations! Excellent work!
>>> >
>>> > On Tue, Feb 13, 2024 at 8:04 PM Yufei Gu <flyrain...@gmail.com> wrote:
>>> >>
>>> >> Absolutely thrilled to see the project going open-source! Huge
>>> >> congrats to Chao and the entire team on this milestone!
>>> >>
>>> >> Yufei
>>> >>
>>> >>
>>> >> On Tue, Feb 13, 2024 at 12:43 PM Chao Sun <sunc...@apache.org> wrote:
>>> >>>
>>> >>> Hi all,
>>> >>>
>>> >>> We are very happy to announce that Project Comet, a plugin to
>>> >>> accelerate Spark query execution via leveraging DataFusion and Arrow,
>>> >>> has now been open sourced under the Apache Arrow umbrella. Please
>>> >>> check the project repo
>>> >>> https://github.com/apache/arrow-datafusion-comet for more details if
>>> >>> you are interested. We'd love to collaborate with people from the
>>> >>> open source community who share similar goals.
>>> >>>
>>> >>> Thanks,
>>> >>> Chao
>>> >>>
>>> >>> ---------------------------------------------------------------------
>>> >>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>> >>>
>>> >
>>> >
>>> > --
>>> > John Zhuge
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org