+1

I agree with Jungtaek and Gabor on switching the default value of the
configuration, together with an entry in the migration guide.

Dongjoon

On Thu, Oct 13, 2022 at 12:46 AM Gabor Somogyi <gabor.g.somo...@gmail.com>
wrote:

> Hi Jungtaek,
>
> Good to hear that the new approach is working fine. +1 from my side.
>
> BR,
> G
>
>
> On Thu, Oct 13, 2022 at 4:12 AM Jungtaek Lim <kabhwan.opensou...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> I would like to propose flipping the default value of the Kafka offset
>> fetching config. The context is as follows:
>>
>> Before Spark 3.1, there was only one approach to fetching offsets, using
>> consumer.poll(0). This has been pointed out as a root cause of hangs, since
>> there is no timeout for the metadata fetch.
>>
>> In Spark 3.1, we addressed this by introducing a new approach to
>> fetching offsets, via SPARK-32032
>> <https://issues.apache.org/jira/browse/SPARK-32032>. Since the new
>> approach leverages AdminClient and a consumer group is no longer needed for
>> fetching offsets, the required security ACLs are loosened.
>>
>> Reference:
>> https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html#offset-fetching
>>
>> There was some concern about the behavioral change in the security model,
>> hence we couldn't make the new approach the default.
>>
>> Since then, we have observed various Kafka connector issues that came
>> from the old offset fetching (e.g. hangs, issues with rebalancing of the
>> consumer group, etc.), and we fixed many of these issues by simply flipping
>> the config.
>>
>> Based on this, I would consider the current default value "incorrect". The
>> security-related behavioral change is inevitable (users can set
>> topic-based ACL rules), but most people will benefit. IMHO this is
>> something we can deal with in a release/migration note.
>>
>> I would like to hear your thoughts on this.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>
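
For readers following the thread, "flipping the config" can be sketched as below. This is a minimal illustration, not something posted in the thread: the config key is the one listed in the "Offset fetching" section of the Kafka integration guide linked above and should be verified against your Spark version, and the helper names (offset_fetching_conf, as_spark_submit_args) are hypothetical.

```python
# Config key as listed in the "Offset fetching" section of the Kafka
# integration guide; an assumption here, verify it for your Spark version.
OFFSET_FETCHING_KEY = "spark.sql.streaming.kafka.useDeprecatedOffsetFetching"


def offset_fetching_conf(use_admin_client: bool = True) -> dict:
    """Return the Spark conf entry that selects the offset fetching approach.

    use_admin_client=True selects the AdminClient-based approach (SPARK-32032);
    False keeps the older consumer-based approach that uses consumer.poll(0).
    """
    return {OFFSET_FETCHING_KEY: "false" if use_admin_client else "true"}


def as_spark_submit_args(conf: dict) -> list:
    """Render conf entries as --conf arguments for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the arguments that opt in to the AdminClient-based approach.
    print(" ".join(as_spark_submit_args(offset_fetching_conf())))
```

The same key/value pair can also be set in spark-defaults.conf or on the session builder. Users whose ACLs are still defined per consumer group may need to set the value back to "true" until topic-based ACL rules are in place.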
