Hi Jay,

I'd tend to ask the same question as Chesnay here:

On a regular pipeline such as /org.apache.flink.streaming.connectors.cassandra.example.CassandraPojoSinkExample/ with a parallelism of 4 you will have 4 /CassandraPojoSink/ with for each, one /MappingManager/, one Cassandra /Cluster/ and one Cassandra /Session. /This is perfectly right as the sinks in production could reside on different taskManagers on different machines.

You can test it yourself by running in debug mode /CassandraPojoSinkExample /locally and count the number of these objects in your local JVM. Don't forget to start a local Cassandra cluster (with docker for example) listening on 127.0.0.1:9042 and create the keyspace and table described in /CassandraPojoSinkExample/.


Could you give more details on your particular issue ?

Best

Etienne

Le 22/03/2022 à 12:05, Ghiya, Jay (GE Healthcare) a écrit :
Absolutely . Understood! We shall do.😊

-----Original Message-----
From: Chesnay Schepler<ches...@apache.org> Sent: 22 March 2022 15:52
To:dev@flink.apache.org; Martijn Visser<martijnvis...@apache.org>; Marco 
Zühlke<mzueh...@apache.org>
Cc: R, Aromal (GE Healthcare, consultant)<aroma...@ge.com>; Nellimarla, Aswini (GE 
Healthcare)<aswini.nellima...@ge.com>
Subject: EXT: Re: Flink cassandra connector performance issue

WARNING: This email originated from outside of GE. Please validate the sender's 
email address before clicking on links or attachments as they may not be safe.

I'd be quite curious on what the analysis that the mapper isn't re-used is 
based on.
A given CassandraPojoSink instance has been re-using the same mapper since it 
was added in 1.1:
https://github.com/apache/flink/blob/release-1.1/flink-streaming-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/streaming/connectors/cassandra/CassandraPojoSink.java

On 22/03/2022 09:22, Martijn Visser wrote:
Hi Jay,

Thanks for reaching out! As just mentioned in the ticket, the
Cassandra connector hasn't been actively maintained in the last couple 
months/years.
It would be great if there would be more contributors who could help
out with this. See also my previous request [1] on the Dev mailing
list, which also contains an overview of the current issues with the connector.

I'm also including @Marco Zühlke<mzueh...@apache.org>  who previously
volunteered to help out with the connector. It would be great if you
could all help out :)

Best regards,

Martijn Visser
https://twitter.com/MartijnVisser82

[1]https://lists.apache.org/thread/1qokt5tp8dcp58dmshbwjc43ssbm1vvk


On Tue, 22 Mar 2022 at 07:33, Ghiya, Jay (GE Healthcare)
<jay.gh...@ge.com>
wrote:

Hi @dev@flink.apache.org<mailto:dev@flink.apache.org>,

Greetings from gehc.

This is regarding flink Cassandra connector implementation that could
be causing the performance issue that we are facing.

The summary of the error is

"Insertions into scylla might be suffering. Expect performance
problems unless this is resolved."

Upon doing initial analysis we figured out - "flink cassandra
connector is not keeping instance of mapping manager that is used to
convert a pojo to cassandra row. Ideally the mapping manager should
have the same life time as cluster and session objects which are also
created once when the driver is initialized"

Reference:
https://stackoverflow.com/questions/59203418/cassandra-java-driver-wa
rning

Can we take a look at this? Also can we help fix this ? @R, Aromal
(GE Healthcare, consultant)<mailto:aroma...@ge.com>  Is our lead dev on this.

Here is the jira issue on the same -
https://issues.apache.org/jira/browse/FLINK-26793

-Jay

Reply via email to