Hi Mich,

> Also have you got some benchmark results from your tests that you can
> possibly share?

We only have some partial benchmark results internally so far. Once shuffle
and better memory management have been introduced, we plan to publish the
benchmark results (at least TPC-H) in the repo.

> Compared to standard Spark, what kind of performance gains can be
> expected with Comet?

Currently, users can benefit from Comet in a few areas:
- Parquet read: several improvements have been made to reading from S3
in particular, so users can expect better scan performance in this scenario
- Hash aggregation
- Columnar shuffle
- Decimals (Java's BigDecimal is pretty slow)

> Can one use Comet on k8s in conjunction with something like a Volcano
> addon?

I think so. Comet is mostly orthogonal to the Spark scheduler framework.

Chao

On Fri, Feb 16, 2024 at 5:39 AM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Hi Chao,
>
> As a cool feature
>
>
>    - Compared to standard Spark, what kind of performance gains can be
>    expected with Comet?
>    -  Can one use Comet on k8s in conjunction with something like a
>    Volcano addon?
>
>
> HTH
>
> Mich Talebzadeh,
> Dad | Technologist | Solutions Architect | Engineer
> London
> United Kingdom
>
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge, sourced from both personal expertise and other resources but of
> course cannot be guaranteed. It is essential to note that, as with any
> advice, one verified and tested result holds more weight than a thousand
> expert opinions.
>
>
> On Tue, 13 Feb 2024 at 20:42, Chao Sun <sunc...@apache.org> wrote:
>
>> Hi all,
>>
>> We are very happy to announce that Project Comet, a plugin to
>> accelerate Spark query execution via leveraging DataFusion and Arrow,
>> has now been open sourced under the Apache Arrow umbrella. Please
>> check the project repo
>> https://github.com/apache/arrow-datafusion-comet for more details if
>> you are interested. We'd love to collaborate with people from the open
>> source community who share similar goals.
>>
>> Thanks,
>> Chao
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>
