[jira] [Commented] (KAFKA-6020) Broker side filtering

2023-11-06 Thread Murari Goswami (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783173#comment-17783173
 ] 

Murari Goswami commented on KAFKA-6020:
---

Valid point. I vote for this feature to be picked up as many will  get benefit 
for this.

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2023-11-06 Thread Vincent Bernardi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783143#comment-17783143
 ] 

Vincent Bernardi commented on KAFKA-6020:
-

[~murari.it] note that this could be restricted to deserialization of headers 
and not of the payload, and maybe also to standard deserializers (Byte [], 
String, JSON), if it helps lighten the load on the broker. Basically _any_ 
compromise would be acceptable tu us to spare the bandwidth.

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2023-11-03 Thread Murari Goswami (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782515#comment-17782515
 ] 

Murari Goswami commented on KAFKA-6020:
---

We are also suffered with moving expansive amount of event data at consumer 
side and then throw away the majority of the records as one of the consumer is 
interested in specific data. A broker side filter will help a lot in reducing 
network load in moving these unwanted data. So if this can be a part of feature 
in kafka in next release will be of great help. 

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2023-09-29 Thread Alexander Grzesik (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770509#comment-17770509
 ] 

Alexander Grzesik commented on KAFKA-6020:
--

It would definitly help for several of our use cases if a consumer could define 
based on some header filters which messages it consumes. Currently we either 
use consumer side filtering with the increased bandwith needs and also some 
security concerns or use the streaming API to split general topics in more 
specific ones but that increases for some cases drastically the amount of 
topics.
We would be very happy if the proposed filter feature could become part of the 
Kafka core.

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2023-09-24 Thread Barak (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768362#comment-17768362
 ] 

Barak commented on KAFKA-6020:
--

Same comment as [~Enzo90910] - Broker side filtering would help a lot in 
multiple use-cases.

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2023-09-23 Thread Vincent Bernardi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768258#comment-17768258
 ] 

Vincent Bernardi commented on KAFKA-6020:
-

My company is currently wasting a non-negligible amount of BW resources with 
several consumer groups reading a whole topic only to process 1% of messages. 
Now I’m guessing resource wasting impacts the community very inequally with 
some providers actually benefiting from it, which may explain why this issue 
hasn’t gained more traction. 

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2023-09-23 Thread Fiachra Corcoran (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17768241#comment-17768241
 ] 

Fiachra Corcoran commented on KAFKA-6020:
-

Looking to revive this topic to gauge the level of interest. Is this something 
that the community would see as beneficial?
See also https://issues.apache.org/jira/browse/KAFKA-10280

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2022-08-16 Thread Vincent Bernardi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17580330#comment-17580330
 ] 

Vincent Bernardi commented on KAFKA-6020:
-

>From my PoV, it's quite clear the right way to do this would be to enable 
>limited-CPU usage filters (i.e. either exact string match, substring, or at 
>the most not-extended  regex match) on message headers. Hope this is 
>considered for a future KIP.

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2022-05-13 Thread flavio livide (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17536490#comment-17536490
 ] 

flavio livide commented on KAFKA-6020:
--

An interesting extension of this feature would be enabling ABAC. Imagine the 
broker is capable of taking a header based filter. The same type of filter 
could be specified in an ACL, enabling a client to "see" only messages with 
certain header fields. At the moment the only way to achieve this is having 
producers to split the data in different topics - potentially duplicating a 
large amount of it.  

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2021-09-23 Thread King Jin (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419571#comment-17419571
 ] 

King Jin commented on KAFKA-6020:
-

How about broker side filtering by Kafka Headers? The consumer sent filtering 
header when request more messages from broker, the broker filtering message by 
header before response message back to consumer.

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2020-12-14 Thread flavio livide (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248887#comment-17248887
 ] 

flavio livide commented on KAFKA-6020:
--

We have use cases where some consumers need the whole topic and others only a 
small subset. The set up is quite dynamic so setting up topics on purpose for 
consumers becomes quite complicated to manage. 
A very simple form of broker side filter would make a big improvement. Could be 
a key prefix, header based, or as brutal as letting a producer set a number 1 
to 1024 and use a bitmask to filter for clients. 

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2020-01-30 Thread DEAN JAIN (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026910#comment-17026910
 ] 

DEAN JAIN commented on KAFKA-6020:
--

its been almost an year and we are looking forward to this feature, any 
updates, just let us know even if there is any plan in near future for this ???

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2019-03-06 Thread Yiming Zang (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786242#comment-16786242
 ] 

Yiming Zang commented on KAFKA-6020:


Any updates for this?

We have smilier needs on our side, strongly support this idea on broker-side 
filtering. 

Our use case comes from N-DC replication. Basically imagine if you have 5 data 
centers and you need to replicate data to everywhere, typically you'll have to 
run N*(N-1) which is 20 mirror-maker jobs in order replicate messages in each 
local data center to all remote data centers. Each mirror maker will have to 
read the whole 5 copies of events, do some processing and only replicate one 
fifth of the events. This is a huge waste of network bandwidth and cpu 
resources. If we can have a way to pre filter the events on broker side, mirror 
maker doesn't need to read all 5 copies of events any more, which can be a huge 
amount of savings when we have even more data centers in the future.

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2018-08-30 Thread Geoffray Adde (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597415#comment-16597415
 ] 

Geoffray Adde commented on KAFKA-6020:
--

Hello. 
I strongly support the idea on message filtering on the broker side.


I am using kafka as a message streaming system. I am not committing anything 
and I am not even keeping track of the offsets.
On the other hands, I have to deal with a vast variety of distinct objects. 
Hundreds of thousands, up to millions.
Obviously, I cannot have that many topics, but with broker side filtering, I 
could get get some sort of sub-topics.

To me, the idea would be, on a fetch request to send a mask to test a custom 
header.
If either the custom header is not present or if it does not match the mask, 
the message is not sent as part of the reply to the fetch request.
The concept seems simple but I have no clue how much work it is to implement.

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: New Feature
>  Components: consumer
>Reporter: Pavel Micka
>Priority: Major
>  Labels: needs-kip
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6020) Broker side filtering

2017-10-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16195223#comment-16195223
 ] 

Ted Yu commented on KAFKA-6020:
---

This needs a KIP, right ?

> Broker side filtering
> -
>
> Key: KAFKA-6020
> URL: https://issues.apache.org/jira/browse/KAFKA-6020
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Pavel Micka
>
> Currently, it is not possible to filter messages on broker side. Filtering 
> messages on broker side is convenient for filter with very low selectivity 
> (one message in few thousands). In my case it means to transfer several GB of 
> data to consumer, throw it away, take one message and do it again...
> While I understand that filtering by message body is not feasible (for 
> performance reasons), I propose to filter just by message key prefix. This 
> can be achieved even without any deserialization, as the prefix to be matched 
> can be passed as an array (hence the broker would do just array prefix 
> compare).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)