Thanks a lot to everyone for vetting this proposal.

Today is the deadline for submitting proposals, and mine is ready. However,
I just learned that all proposals must have accepted mentors before
April 1 (tomorrow).
I am looking for a community member whom I can list as a mentor. If
possible, I would also need someone to register as a mentor for the ASF
organization and approve my proposal.
I have applied using email id: [email protected]
Proposal that I will be submitting:
https://docs.google.com/document/d/1ZEgzQj1cxt1fQLXh7auZE7E1xCDGkTNt7dNpA0PaG7U/edit?usp=sharing

I have already reached out to Russell Spitzer and Peter Vary directly, but
given the tight timeline I wanted to flag this to the broader community as
well.
Thanks, and apologies for the last-minute request.

On Wed, Mar 18, 2026 at 12:19 AM Varun Lakhyani <[email protected]>
wrote:

> Hey All,
>
> I previously started a discussion on making Spark readers work in parallel
> (asynchronously), which is beneficial in cases with large numbers of small
> files such as compaction, and I have worked on a POC, high-level design,
> implementation, and benchmarking for various scenarios. I presented my
> approach and benchmarking results in the Iceberg Spark sync; the recording
> may be available in the Iceberg Spark Community Sync Notes [0].
>
> I am planning to submit this work as a GSoC 2026 proposal based on this
> idea and was advised to seek formal community vetting on the dev mailing
> list.
>
> Previous DISCUSS thread:
> https://lists.apache.org/thread/b5jrlyv61lmw867kksw05sot2tro5ybn
>
> Issue:
> https://github.com/apache/iceberg/issues/15287
>
> Prototype implementation:
> https://github.com/apache/iceberg/pull/15341
>
> Design document and benchmarking details:
>
> https://docs.google.com/document/d/17vBz5t-gSDdmB0S40MYRceyvmcBSzw9Gii-FcU97Lds/edit?usp=sharing
>
> Initial benchmarking shows noticeable improvements for workloads involving
> many small files, particularly when IO latency is present (details in the
> design document).
>
> Any feedback (+1 / concerns / suggestions) would be appreciated.
> I am specifically looking for community consensus on whether this is a
> viable direction for Iceberg before formalizing the GSoC proposal. The GSoC
> 2026 proposal deadline is March 31 - early feedback would be especially
> appreciated.
>
> [0] Iceberg Spark Community Sync Notes:
> https://docs.google.com/document/d/19nno1RoPznbbxKOZZddZNHHafa7XULjbN6RPExdr2n4/edit?usp=sharing
> --
> Lakhyani Varun
> Indian Institute of Technology Roorkee
>
>
