Re: [DISCUSS] Migration guide on upgrading Kafka to 3.1 in Spark 3.3

2022-03-18 Thread Jungtaek Lim
The thing is, it is “us” who upgrade the Kafka client and make divergence between client and broker possible in end users’ production environments. Someone can claim that end users can downgrade the kafka-client artifact when building their app so that the versions match, but we don’t test anything

Re: Apache Spark 3.3 Release

2022-03-18 Thread Dongjoon Hyun
Thank you for the summary. I believe we need a discussion to evaluate each PR's readiness. BTW, `branch-3.3` is still open for bug fixes, including minor dependency changes like the following. (Backported) [SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4 Revert

Re: [DISCUSS] Migration guide on upgrading Kafka to 3.1 in Spark 3.3

2022-03-18 Thread Sean Owen
I think we can assume that someone upgrading Kafka will be responsible for thinking through the breaking changes. We can help by listing anything we know could affect Spark-Kafka usage and calling those out in a release note, for sure. I don't think we need to get into items that would affect

Re: [DISCUSS] Migration guide on upgrading Kafka to 3.1 in Spark 3.3

2022-03-18 Thread Jungtaek Lim
As always, I hope the discussion stays focused on the topic. Let’s avoid getting side-tracked. Please consider the mail thread as the full context, and feel free to ask me if you lack the information you need to weigh in. Thanks for the voice in previous

Re: [DISCUSS] Migration guide on upgrading Kafka to 3.1 in Spark 3.3

2022-03-18 Thread Gabor Somogyi
I've just read the related PR, and it seems the situation is not as black and white as I had presumed from a purely technical point of view...

On Fri, 18 Mar 2022, 12:44 Gabor Somogyi wrote:
> Hi Jungtaek,
>
> I've taken a deeper look at the issue and here are my findings.
> As far as I'm concerned

Re: [DISCUSS] Migration guide on upgrading Kafka to 3.1 in Spark 3.3

2022-03-18 Thread Gabor Somogyi
Hi Jungtaek,

I've taken a deeper look at the issue and here are my findings. As far as I'm concerned there are basically two options, with some minor variations:
* We care
* We don't care

I'm pretty sure users are clever enough but setting the expectation that all users are tracking Kafka KIPs

Re: [DISCUSS] Migration guide on upgrading Kafka to 3.1 in Spark 3.3

2022-03-18 Thread Jungtaek Lim
CORRECTION: in option 2, we enumerate KIPs which may bring incompatibility with older brokers (not all KIPs).

On Fri, Mar 18, 2022 at 7:12 PM Jungtaek Lim wrote:
> Hi dev,
>
> I would like to initiate the discussion about how to deal with the
> migration guide on upgrading Kafka to 3.1 (from

[DISCUSS] Migration guide on upgrading Kafka to 3.1 in Spark 3.3

2022-03-18 Thread Jungtaek Lim
Hi dev,

I would like to start a discussion about how to deal with the migration guide on upgrading Kafka to 3.1 (from 2.8.1) in the upcoming Spark 3.3. We haven't paid much attention to upgrades of the Kafka dependency, since our belief about the Kafka client has been that the new Kafka client version should

Re: Apache Spark 3.3 Release

2022-03-18 Thread Maxim Gekk
Hi All,

Here is the allow list which I built based on your requests in this thread:
1. SPARK-37396: Inline type hint files for files in python/pyspark/mllib
2. SPARK-37395: Inline type hint files for files in python/pyspark/ml
3. SPARK-37093: Inline type hints python/pyspark/streaming