[GitHub] flink issue #1771: [FLINK-3311/FLINK-3332] Add Cassandra connector
Github user f1yegor commented on the issue: https://github.com/apache/flink/pull/1771 Could somebody point to a documentation how we could manage state (pool of connections) when Sink works inside single JVM. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #1771: [FLINK-3311/FLINK-3332] Add Cassandra connector
Github user zentol commented on the issue: https://github.com/apache/flink/pull/1771 You are correct that every CassandraSink instance opens a separate connection to the cluster. I don't see how we could avoid this without using Singletons, which should be avoided. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #1771: [FLINK-3311/FLINK-3332] Add Cassandra connector
Github user f1yegor commented on the issue: https://github.com/apache/flink/pull/1771 Looking at the code of `CassandraSinkBase` I assume that every new sink will create own session to the cluster. I'm creating arount 1000 streams and sinks for them, so cassandra cluster would be saturated with openning connections. Cassandra session could manage several connections via one session. Could you confirm that this isn't an issue with the implementation? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #1771: [FLINK-3311/FLINK-3332] Add Cassandra connector
Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/1771 The merging will probably be a bit delayed because there are some memory issues: ``` --- T E S T S --- Running org.apache.flink.streaming.connectors.cassandra.CassandraConnectorTest 06/13/2016 11:12:22 Job execution switched to status RUNNING. 06/13/2016 11:12:22 Source: Collection Source -> Sink: Unnamed(1/1) switched to SCHEDULED 06/13/2016 11:12:22 Source: Collection Source -> Sink: Unnamed(1/1) switched to DEPLOYING 06/13/2016 11:12:23 Source: Collection Source -> Sink: Unnamed(1/1) switched to RUNNING 06/13/2016 11:12:27 Source: Collection Source -> Sink: Unnamed(1/1) switched to FINISHED 06/13/2016 11:12:27 Job execution switched to status FINISHED. Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "CompactionExecutor:2" Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 150.02 sec - in org.apache.flink.streaming.connectors.cassandra.CassandraConnectorTest Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main" Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "CompactionExecutor:5" Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "CompactionExecutor:4" Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HintedHandoffManager:1" Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "CompactionExecutor:7" == Maven produced no output for 300 seconds. == == The following Java processes are running (JPS) == 2956 Launcher 98454 Jps 97148 surefirebooter7601064672285446339.jar == Printing stack trace of Java process 2956 == 2016-06-13 11:28:23 Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.76-b04 mixed mode): ``` I will take care of them! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #1771: [FLINK-3311/FLINK-3332] Add Cassandra connector
Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/1771 Thank you for testing the cassandra connector. I'll merge the pull request now. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #1771: [FLINK-3311/FLINK-3332] Add Cassandra connector
Github user theomega commented on the issue: https://github.com/apache/flink/pull/1771 I tried out this branch and it works like it should in the scenario I set up: I wrote (so only using the Sink) to a quiet complex columnfamily in a 8 node cassandra cluster. I was using a complex setup of windowed streams and all the data appeared and was perfectly readable as expected. I also had multiple sinks at the same time which also worked perfectly. I could not test the scenario @rmetzger is mentioning. Overall, I agree with @rmetzger that it should be considered to merge this to get more users to test it and report issues. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---