[jira] [Commented] (KAFKA-3472) Allow MirrorMaker to copy selected partitions and choose target topic name

2016-03-28 Thread Hang Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215143#comment-15215143
 ] 

Hang Sun commented on KAFKA-3472:
-

If we still need to sample or filter after consuming the messages, that 
defeats the purpose of reducing unnecessary load on the brokers and bandwidth on 
the network infrastructure.  Since the new Kafka consumer can assign partitions 
explicitly, it would be ideal to leverage that so the sampling can be done more 
efficiently.

-thanks  
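
A minimal sketch of the partition-based sampling idea above, assuming the sample is taken by reading only a fixed fraction of a topic's partitions. `PartitionSampler` and `selectPartitions` are hypothetical names, not part of Kafka or of the pull request. The resulting ids would be wrapped in `TopicPartition` objects and handed to the new consumer's `assign` call, which is left out here so the snippet has no dependencies.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: picks which source partitions a sampling mirror should read. */
final class PartitionSampler {
    private PartitionSampler() {}

    /**
     * Returns partition ids spread evenly across the topic.
     * Assumption: partitions are numbered 0..partitionCount-1.
     * At least one partition is always selected.
     */
    static List<Integer> selectPartitions(int partitionCount, double fraction) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        int wanted = Math.max(1, (int) Math.floor(partitionCount * fraction));
        List<Integer> selected = new ArrayList<>(wanted);
        for (int i = 0; i < wanted; i++) {
            // Integer division spaces the picks evenly and never repeats an id,
            // because wanted <= partitionCount.
            selected.add((int) ((long) i * partitionCount / wanted));
        }
        return selected;
    }
}
```

For an 8-partition topic and a fraction of 0.25 this picks partitions 0 and 4, so the brokers only serve fetches for those two partitions and nothing is discarded after the fact.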

> Allow MirrorMaker to copy selected partitions and choose target topic name
> --
>
> Key: KAFKA-3472
> URL: https://issues.apache.org/jira/browse/KAFKA-3472
> Project: Kafka
> Issue Type: Improvement
> Components: tools
> Affects Versions: 0.9.0.1
> Reporter: Hang Sun
> Priority: Minor
> Labels: mirror-maker
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> It would be nice if MirrorMaker could be used to copy only a few partitions, 
> instead of all of them, to a different topic.  My use case is to sample a small 
> portion of production traffic in the pre-production environment for testing.  
> The pre-production environment is usually smaller and cannot handle the full 
> load from production.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3472) Allow MirrorMaker to copy selected partitions and choose target topic name

2016-03-28 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214859#comment-15214859
 ] 

Jiangjie Qin commented on KAFKA-3472:
-

[~granthenke] I think we can use either a consumer interceptor or a pluggable 
mirror maker message handler.

However, both of them are consume-then-filter approaches. If the mirror maker 
consumer itself is not powerful enough to consume from the source cluster, 
neither the consumer interceptor nor the message handler approach would work.

Another hacky way is to simply use the mirror maker in the production 
environment to do the sampling, e.g. let the message handler instantiate an 
internal producer and send the sampled messages using that producer. It should 
have very little impact on performance. This would work but is hacky. Also, 
if you need to change the sampling topic, you might have to stop mirror maker 
unless the message handler implementation supports some dynamic 
configuration.
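
A rough sketch of the consume-then-filter decision described above, assuming a simple "forward one message in every N" policy. `EveryNthSampler` and `shouldForward` are hypothetical names and this is not the MirrorMaker API; in practice the check would sit inside a consumer interceptor or a message handler, and that wiring is omitted so the snippet stays dependency-free.

```java
/** Hypothetical sampler: forwards the first of every n consumed messages. */
final class EveryNthSampler {
    private final int n;
    private long seen; // messages inspected so far

    EveryNthSampler(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        this.n = n;
    }

    /**
     * Call once per consumed message; true means forward it to the target topic.
     * Not thread-safe: assumes one sampler instance per consuming thread.
     */
    boolean shouldForward() {
        boolean keep = (seen % n) == 0;
        seen++;
        return keep;
    }
}
```

Every message still has to be fetched before this check runs, so the sketch reduces load on the target cluster only, which is exactly the limitation pointed out above.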



[jira] [Commented] (KAFKA-3472) Allow MirrorMaker to copy selected partitions and choose target topic name

2016-03-28 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214208#comment-15214208
 ] 

Grant Henke commented on KAFKA-3472:


I think the goal of sampling topic data could be achieved without adding another 
parameter to MirrorMaker, and without depending on having many partitions. I 
linked KAFKA-2670, which discusses adding a sampling rate to MirrorMaker. The 
discussion mentions using an interceptor to implement custom sampling. Would 
that work for your use case?


[jira] [Commented] (KAFKA-3472) Allow MirrorMaker to copy selected partitions and choose target topic name

2016-03-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213355#comment-15213355
 ] 

ASF GitHub Bot commented on KAFKA-3472:
---

GitHub user hsun-cnnxty opened a pull request:

https://github.com/apache/kafka/pull/1147

[KAFKA-3472] Allow MirrorMaker to copy selected partitions and choose 
target topic name

Please see the JIRA issue for details: 
https://issues.apache.org/jira/browse/KAFKA-3472

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsun-cnnxty/kafka k3472

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1147.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1147


commit 9ada717f805d6c4e22c3d6567bafe1c3940c7d06
Author: Hang Sun 
Date:   2016-03-27T06:01:57Z

KAFKA-3472: allow MirrorMaker to copy selected partitions and choose target 
topic



