On Thu, Feb 6, 2020 at 8:13 AM Etienne Chauchot <[email protected]> wrote:
> Hi,
>
> please see my comments inline
>
> On 06/02/2020 16:24, Alexey Romanenko wrote:
>
> Please, see my comments inline.
>
> On 6 Feb 2020, at 10:50, Etienne Chauchot <[email protected]> wrote:
>
>>> 1. regarding version support: ES v2 has not been maintained by Elastic since
>>> 2018/02 so we plan to remove it from the IO. In the past we already retired
>>> versions (like spark 1.6 for instance).
>>>
>> My only concern here is that there might be users who use the existing
>> module who might not be able to easily upgrade the Beam version if we
>> remove it. But given that V2 is 5 versions behind the latest release this
>> might be OK.
>>
>
> It seems we have a consensus on this.
> I think there should be another general discussion on the long term
> support of our preferred tool IO modules.
>
> => yes, consensus, let's drop ESV2
>
> We had (and still have) a similar problem with KafkaIO to support
> different versions of Kafka, especially very old version 0.9. We raised
> this question on user@ and it appears that there are users who for some
> reason still use old Kafka versions. So, before dropping support for any
> ES version, I’d suggest asking on user@ to see if anyone will be
> affected by this.
>
> Yes we can do a survey among users but the question is, should we support
> an ES version that is no longer supported by Elastic themselves?
>

+1 for asking on the user list. I guess this is more about whether users need
this specific version that we hope to drop support for. Whether we need to
support unsupported versions is a more generic question that should probably
be addressed on the dev list (and I personally don't think we should unless
there's a large enough user base for a given version).

>>> 2. regarding the user: the aim is to unlock some new features (listed by
>>> Ludovic) and give the user more flexibility in their requests.
>>> For that, it requires using the high-level Java ES client in place of the
>>> low-level REST client (that was used because it is the only one compatible
>>> with all ES versions). We plan to replace the API (JSON document in and
>>> out) with more complete standard ES objects that contain the request logic
>>> (insert/update, doc routing etc...) and the data. There are already IOs
>>> like SpannerIO that use similar objects in the input PCollection rather
>>> than pure POJOs.
>>>
>> Won't this be a breaking change for all users? IMO using POJOs in
>> PCollections is safer since otherwise we have to worry about changes to the
>> underlying client library API. The exception would be when the underlying
>> client library offers a backwards compatibility guarantee that we can rely
>> on for the foreseeable future (for example, BQ TableRow).
>>
>
> Agreed but actually, there will be POJOs in order to abstract
> Elasticsearch's version support. The following third point explains this.
>
> => indeed it will be a breaking change, hence this email to get a
> consensus on that. Also I think our wrappers of ES request objects will
> offer the same backward compatibility as the underlying objects
>
> I just want to remind everyone that according to what we agreed some time
> ago on dev@ (at least, for IOs), all breaking user API changes have to be
> added along with deprecation of the old API, which could be removed after 3
> consecutive Beam releases. In this case, users will have time to move to the
> new API smoothly.
>
> We are mostly discussing the target architecture of the new module here but
> the process of deprecation is important to recall, I agree. When I say DTOs
> backward compatible above I mean between per-version sub-modules inside the
> new module. Anyway, sure, for some time, both modules (the old REST-based
> that supports v2-7 and the new that supports v5-7) will cohabit and the old
> one will receive the deprecation annotations.
>

+1 for supporting both versions for at least three minor versions to give
users time to migrate. Also, we should try to produce a warning for users who
use the deprecated versions.

Thanks,
Cham

> Best
>
> Etienne
