Hi Divij, Mickael, 
Since Mickael's KIP-390 was accepted, I did not want to respond in that thread, 
to avoid confusing the work. 

As mentioned in the thread, KIP-390 and KIP-984 do not supersede each 
other. However, the scope of KIP-984 goes beyond the scope of KIP-390. The 
pluggable compression interface is added as a new codec. The other codecs 
already implemented are not affected by this change. I believe these two KIPs 
are not the same, but they complement each other. 

As I stated before, the motivation is to give users the ability to use 
different compressors without needing future changes in Kafka. 
Kafka currently supports zstd, snappy, gzip and lz4. However, other open-source 
compression projects like the Brotli algorithm are also gaining traction. For 
example, the HTTP servers Apache and nginx offer Brotli compression as an 
option. With a pluggable interface, any Kafka developer could integrate and 
test Brotli with Kafka simply by writing a plugin. The same motivation applies 
to any other compression algorithm, including hardware-accelerated 
compression. There are hardware companies, including Intel and AMD, that are 
working on accelerating compression.

The main change itself is an update to the message format that allows 
metadata to be passed to the broker indicating which plugin to use. This 
only happens if the user selects the pluggable codec. The metadata adds an 
additional 52 bytes to the message format. 

Broker recompression is taken care of when the producer and the broker have 
different codecs, because as far as Kafka is concerned this is just another 
codec being added. 
I have added more information to the KIP page: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-984%3A+Add+pluggable+compression+interface+to+Kafka
I am ready to open a PR if this KIP gets accepted.

Assane

-----Original Message-----
From: Diop, Assane <assane.d...@intel.com> 
Sent: Wednesday, January 31, 2024 10:24 AM
To: dev@kafka.apache.org
Subject: RE: DISCUSS KIP-984 Add pluggable compression interface to Kafka

Hi Divij,
Thank you for your response!
  
Although compression is not a new problem, it has continued to be an important 
research topic.
The integration and testing of new compression algorithms into Kafka currently 
requires significant code changes and rebuilding of the distribution package 
for Kafka. 
This KIP will allow for any compression algorithm to be seamlessly integrated 
into Kafka by writing a plugin that would bind into the wrapForInput and 
wrapForOutput methods in Kafka.
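
As a rough illustration of the idea only (this is not the interface proposed 
in the KIP), a plugin contract could look like the sketch below. The names 
CompressionPlugin, GzipPlugin and PluginSketch are hypothetical, and the 
stream-based signatures are a simplifying assumption rather than Kafka's 
actual wrapForInput / wrapForOutput signatures. The JDK gzip streams stand in 
for whatever algorithm a real plugin would provide.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical plugin contract: names and signatures are assumptions made
// for illustration, not the API defined by KIP-984 or by Kafka itself.
interface CompressionPlugin {
    String name();
    OutputStream wrapForOutput(OutputStream out) throws IOException;
    InputStream wrapForInput(InputStream in) throws IOException;
}

// Stand-in plugin backed by the JDK's gzip streams.
class GzipPlugin implements CompressionPlugin {
    public String name() { return "gzip-example"; }

    public OutputStream wrapForOutput(OutputStream out) throws IOException {
        return new GZIPOutputStream(out);
    }

    public InputStream wrapForInput(InputStream in) throws IOException {
        return new GZIPInputStream(in);
    }
}

class PluginSketch {
    // Compress a payload by writing it through the plugin's output wrapper.
    static byte[] compress(CompressionPlugin plugin, byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = plugin.wrapForOutput(sink)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Decompress a payload by reading it through the plugin's input wrapper.
    static byte[] decompress(CompressionPlugin plugin, byte[] compressed) {
        try (InputStream in = plugin.wrapForInput(new ByteArrayInputStream(compressed))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CompressionPlugin plugin = new GzipPlugin();
        byte[] raw = "hello kafka ".repeat(100).getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(plugin, raw);
        byte[] restored = decompress(plugin, packed);
        System.out.println(plugin.name() + ": " + raw.length + " -> " + packed.length
                + " bytes, round trip ok: " + Arrays.equals(raw, restored));
    }
}
```

Under these assumptions, swapping in another algorithm would mean providing a 
different CompressionPlugin implementation, with no change to the calling code.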

As you mentioned, Kafka currently supports zstd, snappy, gzip and lz4. However, 
other open-source compression projects like the Brotli algorithm are also 
gaining traction. For example, the HTTP servers Apache and nginx offer Brotli 
compression as an option. With a pluggable interface, any Kafka developer could 
integrate and test Brotli with Kafka simply by writing a plugin. The same 
motivation applies to any other compression algorithm, including 
hardware-accelerated compression. There are hardware companies, including Intel 
and AMD, that are working on accelerating compression. 

This KIP would certainly complement the current 
https://issues.apache.org/jira/browse/KAFKA-7632 by adding even more 
flexibility for the users. 
A plugin could be tailored to arbitrary datasets in response to a user's 
specific resource requirements. 
 
For reference, other open-source projects have already started on or 
implemented this type of plugin technology, such as: 
        1. Cassandra, which has implemented the same concept of a pluggable 
interface. 
        2. OpenSearch, which is also working on enabling the same type of plugin 
framework.
 
With respect to message recompression, the plugin interface would handle this 
use case on the broker side similar to the current recompression process. 
 
Assane  

-----Original Message-----
From: Divij Vaidya <divijvaidy...@gmail.com>
Sent: Friday, December 22, 2023 2:27 AM
To: dev@kafka.apache.org
Subject: Re: DISCUSS KIP-984 Add pluggable compression interface to Kafka

Thank you for writing the KIP Assane.

In general, exposing a "pluggable" interface is not a decision made lightly 
because it limits our ability to remove / change that interface in future.
Any future changes to the interface will have to remain compatible with 
existing plugins which limits the flexibility of changes we can make inside 
Kafka. Hence, we need a strong motivation for adding a pluggable interface.

1\ May I ask the motivation for this KIP? Are the current compression codecs 
(zstd, gzip, lz4, snappy) not sufficient for your use case? Would providing 
fine-grained compression options as proposed in
https://issues.apache.org/jira/browse/KAFKA-7632 and 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Support+Compression+Level
address your use case?
2\ "This option impacts the following processes" -> This should also include 
the decompression and compression that occurs during message version 
transformation, i.e. when a client sends a message in V1 and the broker expects 
V2, we convert the message and recompress it.

--
Divij Vaidya



On Mon, Dec 18, 2023 at 7:22 PM Diop, Assane <assane.d...@intel.com> wrote:

> I would like to bring some attention to this KIP. We have added an 
> interface to the compression code that allows anyone to build their own 
> compression plugin and integrate it easily back into Kafka.
>
> Assane
>
> -----Original Message-----
> From: Diop, Assane <assane.d...@intel.com>
> Sent: Monday, October 2, 2023 9:27 AM
> To: dev@kafka.apache.org
> Subject: DISCUSS KIP-984 Add pluggable compression interface to Kafka
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-984%3A+Add+pluggable+compression+interface+to+Kafka
>
