Hi,

Please see my comments inline.

On 06/02/2020 16:24, Alexey Romanenko wrote:
Please, see my comments inline.

On 6 Feb 2020, at 10:50, Etienne Chauchot <echauc...@apache.org> wrote:

        1. Regarding version support: ES v2 has not been maintained
        by Elastic since 2018/02, so we plan to remove it from the
        IO. We have already retired versions in the past (Spark
        1.6, for instance).


    My only concern here is that there might be users who use the
    existing module who might not be able to easily upgrade the Beam
    version if we remove it. But given that V2 is 5 versions behind
    the latest release this might be OK.


It seems we have a consensus on this.
I think there should be another general discussion on the long-term support of our preferred tool IO modules.

=> yes, consensus, let's drop ESV2

We had (and still have) a similar problem with KafkaIO in supporting different versions of Kafka, especially the very old version 0.9. We raised this question on user@ and it appears that there are users who, for some reason, still use old Kafka versions. So, before dropping support for any ES version, I’d suggest asking on user@ to see whether anyone would be affected by this.
Yes, we can survey users, but the question is: should we support an ES version that is no longer supported by Elastic themselves?

        2. Regarding the user: the aim is to unlock some new
        features (listed by Ludovic) and give users more
        flexibility in their requests. That requires using the
        high-level Java ES client in place of the low-level REST
        client (which was used because it is the only one
        compatible with all ES versions). We plan to replace the
        API (JSON document in and out) with more complete standard
        ES objects that contain the request logic (insert/update,
        doc routing, etc.) and the data. There are already IOs
        like SpannerIO that use similar objects in the input
        PCollection rather than pure POJOs.


    Won't this be a breaking change for all users? IMO using POJOs
    in PCollections is safer, since otherwise we have to worry about
    changes to the underlying client library API. An exception would
    be when the underlying client library offers a backwards
    compatibility guarantee that we can rely on for the
    foreseeable future (for example, BQ TableRow).


Agreed but actually, there will be POJOs in order to abstract Elasticsearch's version support. The following third point explains this.

=> Indeed, it will be a breaking change, hence this email to get a consensus on it. Also, I think our wrappers of ES request objects will offer the same backward compatibility as the underlying objects.
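
To make the idea of a wrapper more concrete, here is a rough sketch of what a version-neutral request object could look like. This is only an illustration under my own assumptions: the class name `EsWriteRequest`, its fields, and its factory methods are hypothetical and are not existing Beam or Elasticsearch classes. The point is only that the element in the PCollection carries the request logic (insert/update, routing) together with the document, without exposing any client library type.

```java
import java.io.Serializable;

// Hypothetical, version-neutral write request: NOT an existing Beam or Elasticsearch class.
final class EsWriteRequest implements Serializable {
  private static final long serialVersionUID = 1L;

  // The request logic travels with the data.
  enum Operation { INSERT, UPDATE }

  private final Operation operation;
  private final String index;
  private final String id;
  private final String routing; // null means no custom routing was requested
  private final String jsonDocument;

  private EsWriteRequest(
      Operation operation, String index, String id, String routing, String jsonDocument) {
    this.operation = operation;
    this.index = index;
    this.id = id;
    this.routing = routing;
    this.jsonDocument = jsonDocument;
  }

  static EsWriteRequest insert(String index, String id, String jsonDocument) {
    return new EsWriteRequest(Operation.INSERT, index, id, null, jsonDocument);
  }

  static EsWriteRequest update(String index, String id, String jsonDocument) {
    return new EsWriteRequest(Operation.UPDATE, index, id, null, jsonDocument);
  }

  // Returns a copy with a custom routing key; the original stays unchanged.
  EsWriteRequest withRouting(String newRouting) {
    return new EsWriteRequest(operation, index, id, newRouting, jsonDocument);
  }

  Operation getOperation() { return operation; }
  String getIndex() { return index; }
  String getId() { return id; }
  String getRouting() { return routing; }
  String getJsonDocument() { return jsonDocument; }
}
```

A per-version sub-module would then translate such an object into whatever request type its client version expects, so a change in the client API would stay inside that sub-module.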

I just want to remind everyone that, according to what we agreed some time ago on dev@ (at least for IOs), all breaking user API changes have to be added alongside a deprecation of the old API, which can then be removed after 3 consecutive Beam releases. That way, users will have time to move to the new API smoothly.

Here we are mostly discussing the target architecture of the new module, but the deprecation process is important to recall, I agree. When I say above that the DTOs are backward compatible, I mean across the per-version sub-modules inside the new module. Anyway, sure, for some time both modules (the old REST-based one that supports v2-7 and the new one that supports v5-7) will coexist, and the old one will receive the deprecation annotations.
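
As a minimal illustration of the coexistence period, assuming two hypothetical entry points (the names `LegacyRestEsWrite` and `NewClientEsWrite` are made up for this sketch and are not the actual module classes), the old one would simply carry the standard Java deprecation annotation while both remain usable:

```java
// Hypothetical stand-ins for the two modules; the names are illustrative only.

/**
 * Old REST-based entry point.
 *
 * @deprecated use {@link NewClientEsWrite}; to be removed after the agreed deprecation period.
 */
@Deprecated
final class LegacyRestEsWrite {
  static String describe() {
    return "REST-based, ES v2-7";
  }
}

/** New entry point built on the high-level client. */
final class NewClientEsWrite {
  static String describe() {
    return "high-level client, ES v5-7";
  }
}
```

Existing pipelines keep compiling against the old class and only get a deprecation warning, which gives users the agreed window to migrate.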

Best

Etienne


