Could someone help pull the latest Maven download stats for beam-runners-spark and beam-runners-spark-3 for the last few Beam releases?
Thanks so much!
/ Moritz

On 01.04.22, 16:54, "Moritz Mack" <mm...@talend.com> wrote:

I just started looking into the Spark runner code a bit to hopefully help support it. Besides having to maintain (test!) twice the number of artifacts, supporting multiple major versions also has a significant negative impact on developer ergonomics / productivity (separate modules to deal with breaking changes and all the trouble that comes with that).

Thanks, Alexey, for opening the discussion. Certainly a big +1 from my side.

/ Moritz

From: Alexey Romanenko <aromanenko....@gmail.com>
Date: Thursday, 31. March 2022 at 18:51
To: dev <dev@beam.apache.org>
Subject: Re: [PROPOSAL] Stop Spark2 support in Spark Runner

> On 31 Mar 2022, at 18:02, Robert Bradshaw <rober...@google.com> wrote:
>
> Generally makes sense to me, though I'm curious what the maintenance
> burden is (high or low) in keeping it around.

Well, we need to provide two versions of the Spark runner artifacts, job servers and Docker images, and to test them separately (different Jenkins jobs). We also have two different code paths for the cases where the API is not compatible between Spark 2 and Spark 3.

> We should probably
> deprecate it for a period of time before removing support.

Agreed, and I’d even suggest asking users on user@/Twitter beforehand.

Actually, I see a problem with naming.
By default, “Spark runner” used to mean a runner that works with Spark 2 (for example, the artifacts [1][2]). When Spark 3 support was added, all its Beam artifacts and related names reflected the Spark version [3][4]. So it’s not clear how best to deal with this, especially given that new Spark versions (4, 5, etc.) will be available sooner or later. Perhaps, to avoid confusion in the future, we need to follow the same naming pattern.

--
Alexey

[1] https://search.maven.org/artifact/org.apache.beam/beam-runners-spark
[2] https://search.maven.org/artifact/org.apache.beam/beam-runners-spark-job-server
[3] https://search.maven.org/artifact/org.apache.beam/beam-runners-spark-3
[4]
https://search.maven.org/artifact/org.apache.beam/beam-runners-spark-3-job-server

> On Thu, Mar 31, 2022 at 8:52 AM Alexey Romanenko
> <aromanenko....@gmail.com> wrote:
>>
>> Hi everyone,
>>
>> Currently, the Beam Spark Runner supports two versions of Spark: 2.x and 3.x.
>>
>> Taking into account that:
>> - almost all cloud providers have already mostly moved to Spark 3.x as their main supported version;
>> - the latest Spark 2.x release (Spark 2.4.8, a maintenance release) was made almost a year ago;
>> - Spark 3 is considered the mainstream Spark version for development and bug fixing;
>> - it is better to avoid the burden of maintaining two versions (there are some incompatibilities between Spark 2 and 3);
>>
>> I’d suggest stopping Spark 2 support in the Spark Runner in one of the next Beam releases.
>>
>> What are your thoughts on this? Are there any principal objections or reasons for not doing this that I may have missed?
>>
>> --
>> Alexey

As a recipient of an email from Talend, your contact personal data will be on our systems. Please see our privacy notice. <https://www.talend.com/privacy/>