Hi Denis,

Great work, and great timing! I've been working on a similar integration
into Spark itself <https://github.com/apache/spark/pull/40615>. Looks like
there's a lot of overlap between the implementations, though I don't think
that precludes them both from existing. Happy to sync up sometime, and I'd
appreciate your guidance on the Spark PR if you're interested in
contributing to it.

Thanks

Ryan Berti

Senior Data Engineer  |  Ads DE

M 7023217573

5808 W Sunset Blvd  |  Los Angeles, CA 90028



On Wed, Apr 26, 2023 at 9:03 AM Denis Shuvalov <[email protected]> wrote:

> Hi there,
>
> I hope this email finds you well. I am writing to you today to introduce
> you to a new library that I developed called spark-sketches
> <https://github.com/Gelerion/spark-sketches/tree/spark-3.0> (MIT licensed).
> This library seamlessly integrates DataSketches into Spark, providing
> support for both DataFrame and SQL layers in a highly optimized way. It
> includes features such as code generation, UDT support, and more. It works
> for both Spark 2.4 and 3.x versions and uses the latest version of
> DataSketches.
>
> I was wondering if you would be interested in adding a new section to your
> system integration documentation that highlights spark-sketches as a
> recommended tool for integrating DataSketches into Spark. I believe that
> this would be beneficial to your users who are looking to perform efficient
> ETL processing for trend analysis.
>
> Please let me know if you are interested, and I would be happy to provide
> more information about the library and answer any questions you may have.
>
> Thank you for your time and consideration.
>
> Best regards, Denis
>
