[jira] [Commented] (KAFKA-7749) confluent does not provide option to set consumer properties at connector level
[ https://issues.apache.org/jira/browse/KAFKA-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739888#comment-16739888 ] Paul Davidson commented on KAFKA-7749: -- Hi [~sliebau]. You can probably ignore the comment about task-level settings now. It was actually motivated by the problem described here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-411%3A+Add+option+to+make+Kafka+Connect+task+client+ID+values+unique and here: https://issues.apache.org/jira/browse/KAFKA-5061 . We seem to be close to resolving that particular issue by simply including the task ID in the default client ID. I can't think of any other specific cases where task-level settings would be particularly useful.

> confluent does not provide option to set consumer properties at connector level
> ---
>
> Key: KAFKA-7749
> URL: https://issues.apache.org/jira/browse/KAFKA-7749
> Project: Kafka
> Issue Type: Improvement
> Components: KafkaConnect
> Reporter: Manjeet Duhan
> Priority: Major
>
> We want to increase consumer.max.poll.records to improve performance, but this value can currently only be set in the worker properties, which apply to every connector in the given cluster.
>
> Operative situation: we have one project that communicates with Elasticsearch, and after multiple performance tests we settled on consumer.max.poll.records=500, which worked fine for a year. Then another project was onboarded onto the same cluster and required consumer.max.poll.records=5000 based on its own performance tests. This configuration was moved to production.
>
> Admetric then started failing: processing 5000 polled records took more than 5 minutes, so the consumer began throwing CommitFailedException, which is a vicious cycle because the same data gets processed over and over again.
>
> We could control this when starting a consumer from plain Java, but the same control was not available per consumer in the Confluent connector. I have overridden the Kafka code to accept connector-level properties that are applied to a single connector, while all other connectors keep using the default properties. These changes have been running in production for more than 5 months.
>
> Some of the properties that were useful for us:
> max.poll.records
> max.poll.interval.ms
> request.timeout.ms
> key.deserializer
> value.deserializer
> heartbeat.interval.ms
> session.timeout.ms
> auto.offset.reset
> connections.max.idle.ms
> enable.auto.commit
> auto.commit.interval.ms

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
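The CommitFailedException cycle in the issue description can be sketched numerically. The per-record indexing cost below is an assumed figure for illustration only (the issue does not state one); the 5-minute figure is the Kafka consumer default for max.poll.interval.ms:

```python
# Sketch of the failure mode: if one poll() batch takes longer than
# max.poll.interval.ms to process, the consumer is evicted from the group,
# its commit fails, and the same batch is re-consumed on the next poll.
MAX_POLL_INTERVAL_MS = 300_000  # Kafka consumer default: 5 minutes
PER_RECORD_MS = 70              # assumed per-record Elasticsearch indexing cost

def batch_time_ms(max_poll_records: int) -> int:
    """Approximate time to process one poll() batch before the next commit."""
    return max_poll_records * PER_RECORD_MS

print(batch_time_ms(500))   # 35000 ms: well inside the poll interval
print(batch_time_ms(5000))  # 350000 ms: exceeds max.poll.interval.ms, so the
                            # consumer is kicked out and the commit fails
```

Under these assumptions the 500-record setting is safe while the 5000-record setting guarantees the CommitFailedException loop, which is why a single worker-wide value cannot serve both projects.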
[jira] [Commented] (KAFKA-7749) confluent does not provide option to set consumer properties at connector level
[ https://issues.apache.org/jira/browse/KAFKA-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737038#comment-16737038 ] Sönke Liebau commented on KAFKA-7749: - Hi [~pdavidson], can you perhaps elaborate on why you'd want to set this at the task level? Shouldn't tasks just be for load balancing, and hence write to the same cluster? I don't think Connect currently has the concept of a task-level setting, so that would probably be a larger change.
[jira] [Commented] (KAFKA-7749) confluent does not provide option to set consumer properties at connector level
[ https://issues.apache.org/jira/browse/KAFKA-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16726149#comment-16726149 ] Paul Davidson commented on KAFKA-7749: -- I would also like to see this in the context of source connectors, where it would be useful for the connector to override producer properties, ideally at the task level. For example, in Mirus (https://github.com/salesforce/mirus) this would allow each source connector to be directed at a different destination cluster without setting up a separate set of workers for each destination. It would also allow the connector to tune the producer properties for each destination cluster (e.g. by tuning the linger time and batch size depending on whether the destination cluster is local or remote).
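A minimal sketch of the per-destination producer tuning suggested above. linger.ms, batch.size, and bootstrap.servers are real Kafka producer properties, but the values and the local/remote selection logic here are purely illustrative assumptions, not anything Mirus or Connect actually does today:

```python
# Sketch: pick producer overrides per destination cluster. A remote (high
# round-trip-time) destination gets a longer linger and larger batches to
# amortize the RTT; a local destination favors low latency. Values assumed.
LOCAL_OVERRIDES = {"linger.ms": "5", "batch.size": "16384"}
REMOTE_OVERRIDES = {"linger.ms": "100", "batch.size": "262144"}

def producer_config(bootstrap_servers: str, remote: bool) -> dict:
    """Base producer config plus per-destination tuning overrides."""
    config = {"bootstrap.servers": bootstrap_servers}
    config.update(REMOTE_OVERRIDES if remote else LOCAL_OVERRIDES)
    return config
```

With connector-level producer overrides, each source connector could carry a config like this instead of inheriting one worker-wide producer setup.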
[jira] [Commented] (KAFKA-7749) confluent does not provide option to set consumer properties at connector level
[ https://issues.apache.org/jira/browse/KAFKA-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724110#comment-16724110 ] Manjeet Duhan commented on KAFKA-7749: -- I was more concerned with sink connectors in our current scenario, so I did not get the chance to look into the producer configuration. My changes apply specifically to consumers, and more specifically to sink connectors. If I want to change max.poll.records for my connector, the only way today is to update consumer.max.poll.records at the worker level, which is applied to all connectors in the group and also requires a worker restart. With my changes we can pass consumer.max.poll.records along with the rest of the configuration posted at connector start: 1. It is applied to a single connector. 2. No worker restart is required. My changes accept all the consumer properties the worker accepts, prefixed with consumer.*. Yes, a user can pass any value starting with consumer.*, and it will be passed on to the consumer; the Kafka consumer will not recognize unknown settings and might fail, but that flow remains the same as it is today. The change itself is just two lines of code.
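The mechanism described above amounts to a prefix-strip-and-merge over the worker defaults. This is a sketch of the idea only, not the actual two-line patch (which is not attached to the issue):

```python
# Sketch (not the actual patch): "consumer."-prefixed keys in a connector's
# config override the worker-level consumer defaults for that connector only;
# connectors without such keys keep the worker defaults untouched.
CONSUMER_PREFIX = "consumer."

def consumer_config_for_connector(worker_defaults: dict, connector_config: dict) -> dict:
    """Overlay connector-level consumer.* overrides onto worker defaults."""
    overrides = {
        key[len(CONSUMER_PREFIX):]: value
        for key, value in connector_config.items()
        if key.startswith(CONSUMER_PREFIX)
    }
    merged = dict(worker_defaults)  # start from worker-level settings
    merged.update(overrides)        # connector-level values win
    return merged

worker_defaults = {"max.poll.records": "500", "enable.auto.commit": "false"}
connector_config = {"name": "es-sink", "consumer.max.poll.records": "5000"}
```

Here the hypothetical "es-sink" connector gets max.poll.records=5000 while every other connector keeps the worker default of 500, and no worker restart is needed because the merge happens when the connector config is (re)posted.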
[jira] [Commented] (KAFKA-7749) confluent does not provide option to set consumer properties at connector level
[ https://issues.apache.org/jira/browse/KAFKA-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724042#comment-16724042 ] Sönke Liebau commented on KAFKA-7749: - Hi [~mduhan], this is closely related to a discussion currently taking place on the dev mailing list in the [MirrorMaker 2.0|https://lists.apache.org/thread.html/12c7171d957f3ca4f809b6365e788d7fa9715f4f41c3a554c6529761@%3Cdev.kafka.apache.org%3E] thread; it might be worthwhile chiming in there as well. Quick question on what you wrote: you list a couple of configuration settings. Is your code restricted to these settings, or does it allow arbitrary settings to be passed to a consumer, with these just being the ones you found useful? Also, does the same general principle apply to the producer code?