[DISCUSS] support different versions of backends in an IO

Etienne Chauchot Fri, 23 Jun 2017 02:52:45 -0700

Hi guys,

I'm working on Elasticsearch 5.x support for Beam IO (it only supports Elasticsearch 2.x right now). I would like your opinion on some points related to maintenance.

In this ES case, a big part of the IO code is common between ES v2.x and ES v5.x. Still, there are some differences:


- Initialization of the unit tests (the embedded test framework changed)

- Minor differences in one message format

- New features that will allow improving the split, or that are worth leveraging (ES pipelines)

=> The question is: what do you think is the best way to architect the IO to reduce maintenance?

1. We could have an elasticsearchio-common package and two packages that are specific to each version of the backend. But I find it confusing for the users to have separate packages, and more complex for us to maintain.

2. I'm more in favor of detecting the version at IO initialization time and then, in the parts that differ, doing a simple if (version == x). But it will make the code paths more complex. Note that, for example, the es-hadoop project (ES connectors for big data engines) chose this way.
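To make option 2 concrete, here is a minimal Java sketch of what I have in mind. It assumes the IO has already obtained the backend version as a string such as "5.4.1" at initialization time. The class name, the method names and the returned labels are hypothetical and only illustrate the branching; they are not existing Beam or Elasticsearch APIs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BackendVersionSketch {

    // Hypothetical helper: extracts the major version from a string such as "5.4.1".
    static int parseMajorVersion(String versionNumber) {
        Matcher m = Pattern.compile("^(\\d+)\\.").matcher(versionNumber);
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognized version: " + versionNumber);
        }
        return Integer.parseInt(m.group(1));
    }

    // Hypothetical version-dependent step: only the parts that differ between
    // backend versions branch on the major version; the rest of the IO is shared.
    static String selectCodePath(int majorVersion) {
        if (majorVersion == 2) {
            return "v2 code path";
        } else if (majorVersion == 5) {
            return "v5 code path";
        }
        // Fail fast at initialization rather than in the middle of a pipeline run.
        throw new IllegalArgumentException(
            "Unsupported Elasticsearch major version: " + majorVersion);
    }

    public static void main(String[] args) {
        // The version string would come from the backend in a real IO;
        // it is hard-coded here for illustration.
        int major = parseMajorVersion("5.4.1");
        System.out.println(selectCodePath(major));
    }
}
```

Rejecting unknown versions up front keeps the number of code paths explicit: each supported version gets one branch, and anything else fails at initialization.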

Another thing, related to unit tests: in fact they are closer to integration tests, as they use an embedded backend server. I did it that way because I wanted to unit test things like split, which require a real instance.

=> What is the recommended way of testing on both supported versions, knowing that both the test code and the test dependencies are different?

For integration tests (they are mainly used for load testing), the test code and the test dependencies are the same between versions because there is no embedded ES. So we would only need to run them twice, against the 2 versions of the backend.



What do you think?

PS: sorry for the long email :)

Best!
Etienne
