+1. I agree with Jungtaek and Gabor on flipping the default value of the config and covering the change in the migration guide.
Dongjoon

On Thu, Oct 13, 2022 at 12:46 AM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:

> Hi Jungtaek,
>
> Good to hear that the new approach is working fine. +1 from my side.
>
> BR,
> G
>
>
> On Thu, Oct 13, 2022 at 4:12 AM Jungtaek Lim <kabhwan.opensou...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> I would like to propose flipping the default value of the Kafka offset
>> fetching config. The context is as follows:
>>
>> Before Spark 3.1, there was only one approach to fetching offsets, using
>> consumer.poll(0). This has been pointed out as a root cause of hangs, since
>> there is no timeout for the metadata fetch.
>>
>> In Spark 3.1, we addressed this by introducing a new approach to
>> fetching offsets, via SPARK-32032
>> <https://issues.apache.org/jira/browse/SPARK-32032>. Since the new
>> approach leverages AdminClient and a consumer group is no longer needed for
>> fetching offsets, the required security ACLs are loosened.
>>
>> Reference:
>> https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html#offset-fetching
>>
>> There was some concern about the behavioral change in the security model,
>> hence we couldn't make the new approach the default.
>>
>> Since then, we have observed various Kafka connector related issues
>> which came from the old offset fetching (e.g. hangs, issues on rebalance of
>> the consumer group, etc.), and we fixed many of these issues by simply
>> flipping the config.
>>
>> Based on this, I would consider the current default value "incorrect". The
>> security-related behavioral change is unavoidable (users can set
>> topic-based ACL rules), but most people will benefit. IMHO this is
>> something we can deal with in the release/migration notes.
>>
>> I would like to hear your thoughts on this.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>