[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...

2016-10-24 Thread haoch
Github user haoch commented on the issue:

https://github.com/apache/incubator-eagle/pull/556
  
@garrettlish thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...

2016-10-24 Thread garrettlish
Github user garrettlish commented on the issue:

https://github.com/apache/incubator-eagle/pull/556
  
Yes, cool. Then we can keep the current implementation and override the 
batch size in the publish properties via kafka_client_config :-)




[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...

2016-10-24 Thread haoch
Github user haoch commented on the issue:

https://github.com/apache/incubator-eagle/pull/556
  
The default value is just an example :-).

And when using the kafka producer in `async` mode with very high throughput, 
batch size is one of the most important configurations to tune.




[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...

2016-10-24 Thread garrettlish
Github user garrettlish commented on the issue:

https://github.com/apache/incubator-eagle/pull/556
  
Thx @haoch @RalphSu. I thought we had already set async for the kafka 
producer and only needed to switch from waiting on the future to using a 
callback. That was wrong, thanks for pointing it out.

I have updated the code to add kafka_client_config (a list of name/value 
maps) to specify kafka producer configurations.

By default, I only set producer.type=async. 
For batch.num.messages, queue.buffering.max.ms and 
queue.buffering.max.messages, I think we can use the kafka producer default 
values. 
Compared with the defaults, the only value that differs is 
batch.num.messages, which is 200 if not specified. Could you please share 
your reason for setting it to 3000? 

The kafka producer properties can be defined in the publish properties as 
follows:
{
  "name": "***",
  "properties" : {
    "kafka_broker": "***",
    "topics": "***",
    "kafka_client_config" : [
      {
        "name" : "request.required.acks",
        "value": 1
      },
      {
        "name" : "producer.type",
        "value": "async"
      },
      ...
    ]
  }
}
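
For illustration only, here is a minimal sketch of how a kafka_client_config 
list like the one above could be flattened into java.util.Properties before 
it is handed to a producer. The class and method names are hypothetical and 
are not taken from the Eagle code base; the sketch assumes the JSON has 
already been parsed into a list of maps. Only the producer.type=async default 
comes from this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper for illustration; not the actual Eagle implementation.
final class KafkaClientConfigSketch {

    // Flattens a list of {"name": ..., "value": ...} entries into Properties.
    static Properties toProducerProperties(List<Map<String, Object>> clientConfig) {
        Properties props = new Properties();
        // Default mentioned in the thread: only producer.type=async is preset.
        props.setProperty("producer.type", "async");
        if (clientConfig == null) {
            return props;
        }
        for (Map<String, Object> entry : clientConfig) {
            Object name = entry.get("name");
            Object value = entry.get("value");
            if (name == null || value == null) {
                continue; // assumption: skip incomplete entries instead of failing
            }
            // Values are stringified because Properties.getProperty only
            // returns values that are stored as String.
            props.setProperty(name.toString(), String.valueOf(value));
        }
        return props;
    }

    // Small convenience for building one name/value entry in the demo below.
    static Map<String, Object> entry(String name, Object value) {
        Map<String, Object> e = new LinkedHashMap<>();
        e.put("name", name);
        e.put("value", value);
        return e;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> config = new ArrayList<>();
        config.add(entry("request.required.acks", 1));
        config.add(entry("batch.num.messages", 3000));
        Properties props = toProducerProperties(config);
        System.out.println(props);
    }
}
```

Because entries are applied after the preset default, a user-supplied 
producer.type entry would replace the async default in this sketch.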




[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...

2016-10-24 Thread haoch
Github user haoch commented on the issue:

https://github.com/apache/incubator-eagle/pull/556
  
BTW: these selected config keys are the ones we have evaluated in real cases 
and that impact kafka throughput the most; you are not required to include 
many additional kafka configurations.

~~~
producer.type = async
batch.num.messages = 3000
queue.buffering.max.ms  = 5000
queue.buffering.max.messages = 1
~~~
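
As a rough sketch (the class and method names are hypothetical, not Eagle 
code), the four values above can be treated as a baseline in 
java.util.Properties format, with individual keys such as batch.num.messages 
overridden per publisher. The values are copied verbatim from this comment 
and are examples rather than recommendations.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper for illustration; not the actual Eagle implementation.
final class TunedProducerDefaultsSketch {

    // Baseline values copied verbatim from the comment above.
    static final String TUNED = String.join("\n",
            "producer.type = async",
            "batch.num.messages = 3000",
            "queue.buffering.max.ms  = 5000",
            "queue.buffering.max.messages = 1");

    // Loads the baseline, then applies caller-supplied overrides on top.
    static Properties withOverrides(Map<String, String> overrides) {
        Properties props = new Properties();
        try {
            // Properties.load accepts "key = value" lines and trims the
            // whitespace around the separator.
            props.load(new StringReader(TUNED));
        } catch (IOException e) {
            // Reading from an in-memory string is not expected to fail.
            throw new IllegalStateException(e);
        }
        for (Map.Entry<String, String> e : overrides.entrySet()) {
            props.setProperty(e.getKey(), e.getValue());
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props =
                withOverrides(Collections.singletonMap("batch.num.messages", "200"));
        System.out.println(props.getProperty("batch.num.messages"));
    }
}
```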





[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...

2016-10-24 Thread haoch
Github user haoch commented on the issue:

https://github.com/apache/incubator-eagle/pull/556
  
The configuration is OK. The primary concern here is the `async` 
implementation: the kafka producer natively supports `async` mode, so you just 
need to pass the setting through instead of handling the `async` thread 
manually, which does not look very clean.




[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...

2016-10-23 Thread RalphSu
Github user RalphSu commented on the issue:

https://github.com/apache/incubator-eagle/pull/556
  
The convention here:

These properties will be part of the publishment, and we need to avoid "." 
in keys, since the mongo store doesn't support "." in keys; use "_" instead.

There are many kafka configurations; let us focus on the above properties 
plus request.required.acks in this PR.
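
One possible reading of this convention, sketched with hypothetical names 
(not Eagle code): keys are stored with "_" and mapped back to the dotted 
kafka names just before the producer is created. This assumes the map 
contains only kafka client keys and that no real key contains "_" itself; 
otherwise the mapping would be ambiguous.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper for illustration; not the actual Eagle implementation.
final class PublishmentKeySketch {

    // Maps stored keys such as "batch_num_messages" back to the dotted
    // kafka form "batch.num.messages".
    static Properties toKafkaProperties(Map<String, String> stored) {
        Properties props = new Properties();
        for (Map.Entry<String, String> e : stored.entrySet()) {
            props.setProperty(e.getKey().replace('_', '.'), e.getValue());
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("producer_type", "async");
        stored.put("request_required_acks", "1");
        System.out.println(toKafkaProperties(stored));
    }
}
```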






[GitHub] incubator-eagle issue #556: EAGLE-670: make kafka publisher configurable and...

2016-10-23 Thread haoch
Github user haoch commented on the issue:

https://github.com/apache/incubator-eagle/pull/556
  
@garrettlish the implementation looks a little confusing.

Please simply use the kafka configs: 
https://kafka.apache.org/08/documentation.html#clientconfig and, for the 
other part, use the async kafka producer as in: 
https://github.com/apache/incubator-eagle/blob/master/eagle-core/eagle-app/eagle-app-base/src/main/java/org/apache/eagle/app/sink/KafkaStreamSink.java#L53-L57

# kafka properties

producer.type = async
batch.num.messages = 3000
queue.buffering.max.ms  = 5000
queue.buffering.max.messages = 1

