Hi Jay,
I'd tend to ask the same question as Chesnay here:
In a regular pipeline such as
/org.apache.flink.streaming.connectors.cassandra.example.CassandraPojoSinkExample/
with a parallelism of 4, you will have 4 /CassandraPojoSink/ instances, each
with its own /MappingManager/, Cassandra /Cluster/ and Cassandra
/Session/. This is expected, as the sinks in production could
reside on different TaskManagers on different machines.
You can check this yourself by running /CassandraPojoSinkExample/ locally
in debug mode and counting the number of these objects in your local JVM.
Don't forget to start a local Cassandra cluster (with Docker, for example)
listening on 127.0.0.1:9042 and to create the keyspace and table described
in /CassandraPojoSinkExample/.
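To illustrate the per-subtask lifecycle described above, here is a minimal,
self-contained sketch. It does not use Flink or the Cassandra driver:
FakePojoSink and FakeMappingManager are hypothetical stand-ins, and the sketch
assumes the manager is built once when a sink instance is opened and then
reused for every record. Under that assumption, a parallelism of 4 gives 4
managers in total, no matter how many records are written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class SinkLifecycleSketch {

    // Hypothetical stand-in for the driver's MappingManager; counts constructions.
    static final class FakeMappingManager {
        static final AtomicInteger CREATED = new AtomicInteger();

        FakeMappingManager() {
            CREATED.incrementAndGet();
        }
    }

    // Hypothetical stand-in for one parallel sink subtask.
    static final class FakePojoSink {
        private FakeMappingManager mappingManager;
        private int written;

        // Assumed to run once per subtask, before any record arrives.
        void open() {
            mappingManager = new FakeMappingManager();
        }

        // Runs once per record and reuses the manager built in open().
        void invoke(String record) {
            if (mappingManager == null) {
                throw new IllegalStateException("open() was not called");
            }
            written++;
        }
    }

    // Simulates `parallelism` subtasks in one JVM and returns how many managers were built.
    static int managersCreated(int parallelism, int recordsPerSubtask) {
        FakeMappingManager.CREATED.set(0);
        List<FakePojoSink> sinks = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            FakePojoSink sink = new FakePojoSink();
            sink.open();
            sinks.add(sink);
        }
        for (FakePojoSink sink : sinks) {
            for (int r = 0; r < recordsPerSubtask; r++) {
                sink.invoke("record-" + r);
            }
        }
        return FakeMappingManager.CREATED.get();
    }

    public static void main(String[] args) {
        System.out.println(managersCreated(4, 1000)); // prints 4
    }
}
```

If the count you see in the debugger grows with the number of records rather
than with the parallelism, that would point to a real problem worth reporting.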
Could you give more details on your particular issue?
Best
Etienne
On 22/03/2022 at 12:05, Ghiya, Jay (GE Healthcare) wrote:
Absolutely. Understood! We shall do. 😊
-----Original Message-----
From: Chesnay Schepler<ches...@apache.org>
Sent: 22 March 2022 15:52
To: dev@flink.apache.org; Martijn Visser <martijnvis...@apache.org>; Marco Zühlke <mzueh...@apache.org>
Cc: R, Aromal (GE Healthcare, consultant) <aroma...@ge.com>; Nellimarla, Aswini (GE Healthcare) <aswini.nellima...@ge.com>
Subject: EXT: Re: Flink cassandra connector performance issue
I'd be quite curious to know what the analysis that the mapper isn't re-used
is based on.
A given CassandraPojoSink instance has been re-using the same mapper since it
was added in 1.1:
https://github.com/apache/flink/blob/release-1.1/flink-streaming-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/streaming/connectors/cassandra/CassandraPojoSink.java
On 22/03/2022 09:22, Martijn Visser wrote:
Hi Jay,
Thanks for reaching out! As just mentioned in the ticket, the
Cassandra connector hasn't been actively maintained in the last couple
of months/years.
It would be great if there were more contributors who could help
out with this. See also my previous request [1] on the Dev mailing
list, which also contains an overview of the current issues with the connector.
I'm also including @Marco Zühlke<mzueh...@apache.org> who previously
volunteered to help out with the connector. It would be great if you
could all help out :)
Best regards,
Martijn Visser
https://twitter.com/MartijnVisser82
[1] https://lists.apache.org/thread/1qokt5tp8dcp58dmshbwjc43ssbm1vvk
On Tue, 22 Mar 2022 at 07:33, Ghiya, Jay (GE Healthcare) <jay.gh...@ge.com> wrote:
Hi @dev@flink.apache.org,
Greetings from GEHC.
This is regarding the Flink Cassandra connector implementation, which could
be causing a performance issue that we are facing.
The summary of the error is:
"Insertions into scylla might be suffering. Expect performance
problems unless this is resolved."
After an initial analysis, we found the following: "The Flink Cassandra
connector is not keeping an instance of the mapping manager that is used to
convert a POJO to a Cassandra row. Ideally the mapping manager should
have the same lifetime as the cluster and session objects, which are also
created once when the driver is initialized."
Reference:
https://stackoverflow.com/questions/59203418/cassandra-java-driver-warning
Can we take a look at this? Also, can we help fix this? @R, Aromal
(GE Healthcare, consultant) <aroma...@ge.com> is our lead dev on this.
Here is the Jira issue for this:
https://issues.apache.org/jira/browse/FLINK-26793
-Jay