[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...
Github user haoch commented on the issue: https://github.com/apache/incubator-eagle/pull/556 @garrettlish thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...
Github user garrettlish commented on the issue: https://github.com/apache/incubator-eagle/pull/556 Yes, cool, then we can keep the current implementation and override the batch size configuration in publish properties kafka_client_config :-)
[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...
Github user haoch commented on the issue: https://github.com/apache/incubator-eagle/pull/556 The default value is just an example :-). When the kafka producer runs in `async` mode and throughput becomes extremely large, batch size is one of the most important configurations to tune.
[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...
Github user garrettlish commented on the issue: https://github.com/apache/incubator-eagle/pull/556 Thx @haoch @RalphSu. I thought we had already set async for the kafka producer and that what we needed to change was using a callback rather than waiting on the future; that was wrong, thanks for pointing it out. I have updated the code to add kafka_client_config (a list of name/value maps) to specify kafka producer configurations. By default, I only set producer.type=async. For batch.num.messages, queue.buffering.max.ms and queue.buffering.max.messages, I think we can use the kafka producer default values. The only difference from the defaults is batch.num.messages, which is 200 if not specified. Could you please share your reason for setting it to 3000? The kafka producer properties could be defined in publish properties as follows:

    {
      "name": "***",
      "properties" : {
        "kafka_broker": "***",
        "topics": "***",
        "kafka_client_config" : [
          { "name" : "request.required.acks", "value": 1 },
          { "name" : "producer.type", "value": "async" },
          ...
        ]
      }
    }
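To make the proposal above concrete, here is a minimal sketch of how a `kafka_client_config` list of name/value maps might be folded into producer properties, with `producer.type=async` as the only built-in default. The class and method names (`KafkaClientConfigSketch`, `toProducerProperties`, `entry`) are hypothetical and are not taken from the PR; the sketch uses only `java.util` and does not create a real Kafka producer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not the PR's actual code.
class KafkaClientConfigSketch {

    /** Builds producer properties from a kafka_client_config-style list. */
    static Properties toProducerProperties(List<Map<String, Object>> clientConfig) {
        Properties props = new Properties();
        // The only default described in the comment: async unless overridden.
        props.setProperty("producer.type", "async");
        if (clientConfig == null) {
            return props;
        }
        for (Map<String, Object> item : clientConfig) {
            Object name = item.get("name");
            Object value = item.get("value");
            if (name == null || value == null) {
                continue; // skip incomplete entries instead of failing
            }
            // JSON values may be numbers (e.g. 1); store them as strings.
            props.setProperty(name.toString(), value.toString());
        }
        return props;
    }

    /** Convenience builder for one name/value entry. */
    static Map<String, Object> entry(String name, Object value) {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("name", name);
        item.put("value", value);
        return item;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> config = new ArrayList<>();
        config.add(entry("request.required.acks", 1));
        config.add(entry("batch.num.messages", 3000));
        Properties props = toProducerProperties(config);
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + "=" + props.getProperty(key));
        }
    }
}
```

Because later entries simply overwrite earlier ones, an explicit `producer.type` entry in the list would replace the default, which matches the idea of overriding batch size and similar settings per publishment.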
[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...
Github user haoch commented on the issue: https://github.com/apache/incubator-eagle/pull/556 BTW: these selected config keys are the ones we have evaluated in real cases and that impact kafka throughput the most, so you do not need to include many additional kafka configurations.

~~~
producer.type = async
batch.num.messages = 3000
queue.buffering.max.ms = 5000
queue.buffering.max.messages = 1
~~~
[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...
Github user haoch commented on the issue: https://github.com/apache/incubator-eagle/pull/556 The configuration is OK. The primary concern here is the `async` implementation: the kafka producer natively supports `async` mode, so you just need to pass the setting through instead of handling the `async` thread manually, which does not look very clean.
[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...
Github user RalphSu commented on the issue: https://github.com/apache/incubator-eagle/pull/556 The convention here: these properties will be part of the publishment, and we need to avoid "." in keys, since the mongo store doesn't support "." in keys; use "_" instead. There are many configurations for kafka; let us focus on the above properties, plus request.required.acks, in this PR.
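The key convention described above could be sketched as a small two-way mapping: store keys with "_" and translate them back to "." when building the kafka properties. This is only an illustration; the class and method names are hypothetical, and the reverse mapping assumes that none of the kafka keys involved contains an underscore of its own (true for the keys named in this thread).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the "_" instead of "." convention.
class PublishKeyCodecSketch {

    /** Kafka-style key to a store-safe key: "producer.type" becomes "producer_type". */
    static String toStoreKey(String kafkaKey) {
        return kafkaKey.replace('.', '_');
    }

    /** Store-safe key back to a kafka-style key; assumes no literal "_" in kafka keys. */
    static String toKafkaKey(String storeKey) {
        return storeKey.replace('_', '.');
    }

    /** Translates every key of a stored property map back to kafka-style keys. */
    static Map<String, String> toKafkaKeys(Map<String, String> stored) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : stored.entrySet()) {
            result.put(toKafkaKey(e.getKey()), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put(toStoreKey("producer.type"), "async");
        stored.put(toStoreKey("request.required.acks"), "1");
        System.out.println(stored);
        System.out.println(toKafkaKeys(stored));
    }
}
```

A list of name/value maps sidesteps this translation entirely, because the dotted names then appear as values rather than as document keys.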
[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...
Github user haoch commented on the issue: https://github.com/apache/incubator-eagle/pull/556 @garrettlish the implementation looks a little confusing. Please simply use the kafka configs: https://kafka.apache.org/08/documentation.html#clientconfig and see another part that uses the async kafka producer: https://github.com/apache/incubator-eagle/blob/master/eagle-core/eagle-app/eagle-app-base/src/main/java/org/apache/eagle/app/sink/KafkaStreamSink.java#L53-L57

    # kafka properties
    producer.type = async
    batch.num.messages = 3000
    queue.buffering.max.ms = 5000
    queue.buffering.max.messages = 1
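The property lines quoted above are already valid `java.util.Properties` syntax, so they can be read as-is and passed through. The following is a minimal sketch under that assumption; the class name and helpers are hypothetical, and it only parses the text rather than constructing a kafka producer.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper: parses the quoted property lines with java.util.Properties.
class ProducerPropertiesSketch {

    // The values are the example values from the comment, not recommendations.
    static final String SNIPPET = String.join("\n",
            "# kafka properties",
            "producer.type = async",
            "batch.num.messages = 3000",
            "queue.buffering.max.ms = 5000",
            "queue.buffering.max.messages = 1");

    /** Parses properties-format text; lines starting with '#' are comments. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected for an in-memory reader, but load() declares it.
            throw new IllegalStateException(e);
        }
        return props;
    }

    /** Reads a numeric setting; trims because trailing spaces are kept by load(). */
    static int intValue(Properties props, String key) {
        return Integer.parseInt(props.getProperty(key).trim());
    }

    public static void main(String[] args) {
        Properties props = parse(SNIPPET);
        System.out.println("producer.type=" + props.getProperty("producer.type"));
        System.out.println("batch.num.messages=" + intValue(props, "batch.num.messages"));
    }
}
```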