[jira] [Commented] (KAFKA-3472) Allow MirrorMaker to copy selected partitions and choose target topic name
[ https://issues.apache.org/jira/browse/KAFKA-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215143#comment-15215143 ]

Hang Sun commented on KAFKA-3472:
---------------------------------

If we still need to do sampling/filtering after consuming the messages, it defeats the purpose of reducing unnecessary load on the brokers and bandwidth on the network infrastructure. Since the new Kafka consumer provides the ability to assign partitions explicitly, it would be ideal to leverage that so the sampling can be done more efficiently.

-thanks

> Allow MirrorMaker to copy selected partitions and choose target topic name
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-3472
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3472
>             Project: Kafka
>          Issue Type: Improvement
>          Components: tools
>    Affects Versions: 0.9.0.1
>            Reporter: Hang Sun
>            Priority: Minor
>              Labels: mirror-maker
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It would be nice if MirrorMaker could be used to copy only a few partitions
> instead of all of them to a different topic. My use case is to sample a small
> portion of production traffic in the pre-production environment for testing.
> The pre-production environment is usually smaller and cannot handle the full
> load from production.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
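
As a rough illustration of the explicit-assignment idea in the comment above: the new consumer exposes `KafkaConsumer.assign`, which takes `TopicPartition` objects instead of a topic subscription. The sketch below only models the selection step in plain Java so that it runs without the Kafka client. `SourcePartition`, `PartitionSelection`, and the comma-separated spec format are hypothetical names chosen for illustration and are not taken from the pull request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Kafka's TopicPartition, so the sketch has no client dependency.
final class SourcePartition {
    final String topic;
    final int partition;

    SourcePartition(String topic, int partition) {
        this.topic = topic;
        this.partition = partition;
    }

    @Override
    public String toString() {
        return topic + "-" + partition;
    }
}

// Hypothetical helper: turns a spec such as "0,3,7" into the partitions to mirror.
final class PartitionSelection {
    static List<SourcePartition> parse(String topic, String spec) {
        List<SourcePartition> selected = new ArrayList<>();
        for (String part : spec.split(",")) {
            String trimmed = part.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas or an empty spec
            }
            int partition = Integer.parseInt(trimmed);
            if (partition < 0) {
                throw new IllegalArgumentException("negative partition: " + partition);
            }
            selected.add(new SourcePartition(topic, partition));
        }
        return selected;
    }
}
```

With the real client, the resulting list would be converted to `TopicPartition` objects and handed to `assign`, so that only the selected partitions are fetched from the source brokers.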
[jira] [Commented] (KAFKA-3472) Allow MirrorMaker to copy selected partitions and choose target topic name
[ https://issues.apache.org/jira/browse/KAFKA-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214859#comment-15214859 ]

Jiangjie Qin commented on KAFKA-3472:
-------------------------------------

[~granthenke] I think we can use either a consumer interceptor or a plugin mirror maker message handler. However, both are consume-then-filter approaches. If the mirror maker consumer itself is not powerful enough to consume from the source cluster, neither the consumer interceptor nor the message handler approach would work.

Another hacky way is to simply run the mirror maker in the production environment to do the sampling, e.g. let the message handler instantiate an internal producer and send the sampled messages using that producer. It should have very little impact on performance. This would work but is hacky. Also, if you need to change the sampling topic, you might have to stop mirror maker unless the message handler implementation takes some dynamic configuration.
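
To make the consume-then-filter approach concrete, here is a minimal sketch of a sampling handler in plain Java. It is not MirrorMaker's actual handler interface: `SamplingHandler`, `SampledRecord`, the `keepOneIn` parameter, and the offset-modulo rule are all assumptions made for illustration. The handler returns either an empty list (record dropped) or a single record redirected to the target topic.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical output record: the sampled message, redirected to a target topic.
final class SampledRecord {
    final String topic;
    final byte[] key;
    final byte[] value;

    SampledRecord(String topic, byte[] key, byte[] value) {
        this.topic = topic;
        this.key = key;
        this.value = value;
    }
}

// Hypothetical handler: keeps one record out of every `keepOneIn`, chosen by offset.
final class SamplingHandler {
    private final String targetTopic;
    private final int keepOneIn;

    SamplingHandler(String targetTopic, int keepOneIn) {
        if (keepOneIn < 1) {
            throw new IllegalArgumentException("keepOneIn must be >= 1");
        }
        this.targetTopic = targetTopic;
        this.keepOneIn = keepOneIn;
    }

    // The record has already been fetched from the source cluster at this point,
    // so dropping it here saves producer-side work only, not broker or network load.
    List<SampledRecord> handle(long offset, byte[] key, byte[] value) {
        if (offset % keepOneIn != 0) {
            return Collections.emptyList();
        }
        return Collections.singletonList(new SampledRecord(targetTopic, key, value));
    }
}
```

Sampling by offset keeps the decision deterministic across restarts. The comment in `handle` marks the limitation raised in this thread: every record is still consumed before it is filtered.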
[jira] [Commented] (KAFKA-3472) Allow MirrorMaker to copy selected partitions and choose target topic name
[ https://issues.apache.org/jira/browse/KAFKA-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214208#comment-15214208 ]

Grant Henke commented on KAFKA-3472:
------------------------------------

I think the goal of sampling topic data could be achieved without adding another parameter to MirrorMaker, and without depending on having many partitions. I linked KAFKA-2670, which discusses adding a sampling rate to MirrorMaker. The discussion mentions using an interceptor to implement custom sampling. Would that work for your use case?
[jira] [Commented] (KAFKA-3472) Allow MirrorMaker to copy selected partitions and choose target topic name
[ https://issues.apache.org/jira/browse/KAFKA-3472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213355#comment-15213355 ]

ASF GitHub Bot commented on KAFKA-3472:
---------------------------------------

GitHub user hsun-cnnxty opened a pull request:

    https://github.com/apache/kafka/pull/1147

    [KAFKA-3472] Allow MirrorMaker to copy selected partitions and choose target topic name

    Please see the jira issue for detail: https://issues.apache.org/jira/browse/KAFKA-3472

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hsun-cnnxty/kafka k3472

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1147.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1147

commit 9ada717f805d6c4e22c3d6567bafe1c3940c7d06
Author: Hang Sun
Date:   2016-03-27T06:01:57Z

    KAFKA-3472: allow MirrorMaker to copy selected partitions and choose target topic