Hi all,
The survey regarding Elasticsearch support in Beam is now closed.
Here are the results after 38 days:
Number of users per version:
ESv2: 0
ESv5: 1
ESv6: 5
ESv7: 8
So the new version of ElasticsearchIO, after the refactoring discussed
in this thread, will no longer support Elasticsearch v2.
Regards
Etienne Chauchot.
On 06/03/2020 11:26, Etienne Chauchot wrote:
Hi all,
It's been 3 weeks since I opened the survey on which ES versions users run.
The survey received very few responses: only 9 responses for now
(multiple versions possible of course). The responses are the following:
ES2: 0 clients, ES5: 1, ES6: 5, ES7: 8
The results point toward dropping ES2 support, but for now the sample
is still not very representative.
I'm cross-posting to @users to let you know that I'm closing the
survey within 1 or 2 weeks. So please respond if you're using ESIO.
Best
Etienne
On 13/02/2020 12:37, Etienne Chauchot wrote:
Hi Cham, thanks for your comments!
I just sent an email to the user ML with a survey link to count ES usage
per version:
https://lists.apache.org/thread.html/rc8185afb8af86a2a032909c13f569e18bd89e75a5839894d5b5d4082%40%3Cuser.beam.apache.org%3E
Best
Etienne
On 10/02/2020 19:46, Chamikara Jayalath wrote:
On Thu, Feb 6, 2020 at 8:13 AM Etienne Chauchot
<echauc...@apache.org> wrote:
Hi,
please see my comments inline
On 06/02/2020 16:24, Alexey Romanenko wrote:
Please, see my comments inline.
On 6 Feb 2020, at 10:50, Etienne Chauchot
<echauc...@apache.org> wrote:
1. Regarding version support: ES v2 has not been
maintained by Elastic since 2018/02, so we plan to
remove it from the IO. We have already retired
versions in the past (Spark 1.6, for instance).
My only concern here is that there might be users who use
the existing module who might not be able to easily
upgrade the Beam version if we remove it. But given that
V2 is 5 versions behind the latest release this might be OK.
It seems we have a consensus on this.
I think there should be another general discussion on the
long-term support of our preferred tool IO modules.
=> yes, consensus, let's drop ESV2
We had (and still have) a similar problem with KafkaIO in
supporting different versions of Kafka, especially the very old
version 0.9. We raised this question on user@ and it appears
that there are users who, for some reason, still use old Kafka
versions. So, before dropping support for any ES version, I’d
suggest asking on user@ to see whether anyone would be affected
by this.
Yes, we can do a survey among users, but the question is: should
we support an ES version that is no longer supported by Elastic
itself?
+1 for asking in the user list. I guess this is more about whether
users need this specific version that we hope to drop support for.
Whether we need to support unsupported versions is a more generic
question that should prob. be addressed in the dev list. (and I
personally don't think we should unless there's a large enough user
base for a given version).
2. Regarding the user: the aim is to unlock some
new features (listed by Ludovic) and give users
more flexibility over their requests. This
requires using the high-level Java ES client in place
of the low-level REST client (which was used because
it is the only one compatible with all ES
versions). We plan to replace the API (JSON
document in and out) with more complete standard ES
objects that contain the request logic
(insert/update, doc routing, etc.) and the data.
There are already IOs like SpannerIO that use
similar objects in the input PCollection rather than
pure POJOs.
Won't this be a breaking change for all users? IMO using
POJOs in PCollections is safer, since otherwise we have to
worry about changes to the underlying client library API.
The exception would be when the underlying client library
offers a backwards compatibility guarantee that we can rely
on for the foreseeable future (for example, BQ TableRow).
Agreed but actually, there will be POJOs in order to abstract
Elasticsearch's version support. The following third point
explains this.
=> Indeed it will be a breaking change, hence this email to
get a consensus on that. Also, I think our wrappers of ES
request objects will offer the same backward compatibility
as the underlying objects.
I just want to remind everyone that, according to what we agreed
some time ago on dev@ (at least for IOs), all breaking user API
changes have to be added along with a deprecation of the old API,
which can be removed after 3 consecutive Beam releases. That way,
users will have time to move to the new API smoothly.
We are mostly discussing the target architecture of the new module
here, but the deprecation process is important to recall, I
agree. When I say backward-compatible DTOs above, I mean between
per-version sub-modules inside the new module. Anyway, sure, for
some time both modules (the old REST-based one that supports v2-7
and the new one that supports v5-7) will coexist, and the old one
will receive the deprecation annotations.
+1 for supporting both versions for at least three minor versions to
give users time to migrate. Also, we should try to produce a warning
for users who use the deprecated versions.
Thanks,
Cham
Best
Etienne