Re: [DISCUSS] KIP-390: Allow fine-grained configuration for compression (Rebooted)

2020-01-28 Thread Dongjin Lee
Hi Guozhang,

Sorry for the late reply. Let me take a detailed look at how each codec
uses its compression buffer.

About compression levels, gzip and lz4 also support this feature.
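
To make "level" concrete for gzip, here is a minimal sketch using Python's
standard zlib module as a stand-in for the DEFLATE codec behind gzip. It is
not Kafka's code; the payload and the helper name compressed_size are made
up for illustration. The level runs from 0 (store only) to 9 (slowest,
usually smallest output).

```python
import zlib


def compressed_size(data: bytes, level: int) -> int:
    """Return the size of `data` after DEFLATE compression at `level` (0-9)."""
    return len(zlib.compress(data, level))


# Repetitive sample payload, chosen only for illustration.
payload = b"kafka-record-" * 4096

for level in (1, 6, 9):
    # Prints the level and the resulting compressed size in bytes.
    print(level, compressed_size(payload, level))
```

The point of a per-codec level setting is to expose this trade-off between
CPU time and output size to the user instead of hard-coding one value.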

Thanks,
Dongjin

On Tue, Jan 21, 2020 at 4:31 AM Guozhang Wang  wrote:

> Hello Dongjin,
>
> I'm wondering whether you have looked into the buffer usage of the
> different implementations. As far as I can read from the code:
>
> 1. LZ4 uses a shared 64KB buffer for decompression, and when reading it
> uses a ByteBuffer copy from the decompression buffer.
> 2. Snappy uses a shared uncompressed buffer, but when reading it uses
> SnappyNative.arrayCopy
> via JNI, which could be slow.
> 3. GZIP uses a shared 8KB buffer (inflater), and another shared 16KB buffer
> for reading it out with System.arraycopy.
> 4. ZSTD dynamically allocates 128KB (i.e., not shared), and also uses read
> internally for skip, so skip does not benefit that much.
>
> It seems to me that the buffer is used quite differently by each type.
> Also, aside from ZSTD, are there any other types that have levels?
>
>
> Guozhang
>
-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*
*github:  github.com/dongjinleekr
linkedin: kr.linkedin.com/in/dongjinleekr
speakerdeck: speakerdeck.com/dongjin
*


Re: [DISCUSS] KIP-390: Allow fine-grained configuration for compression (Rebooted)

2020-01-20 Thread Guozhang Wang
Hello Dongjin,

I'm wondering whether you have looked into the buffer usage of the
different implementations. As far as I can read from the code:

1. LZ4 uses a shared 64KB buffer for decompression, and when reading it
uses a ByteBuffer copy from the decompression buffer.
2. Snappy uses a shared uncompressed buffer, but when reading it uses
SnappyNative.arrayCopy
via JNI, which could be slow.
3. GZIP uses a shared 8KB buffer (inflater), and another shared 16KB buffer
for reading it out with System.arraycopy.
4. ZSTD dynamically allocates 128KB (i.e., not shared), and also uses read
internally for skip, so skip does not benefit that much.

It seems to me that the buffer is used quite differently by each type.
Also, aside from ZSTD, are there any other types that have levels?
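
To illustrate the last point, here is a minimal sketch of a skip that is
implemented by reading into a shared scratch buffer and discarding the
bytes. It uses Python's gzip module purely as a stand-in; the names SCRATCH
and skip_by_reading are hypothetical and this is not the Kafka or zstd-jni
code. Every skipped byte is still decompressed, which is why such a skip
saves little work.

```python
import gzip
import io

SCRATCH = bytearray(16 * 1024)  # one shared scratch buffer, reused across calls


def skip_by_reading(stream, n: int) -> int:
    """Skip n uncompressed bytes by reading them into SCRATCH and discarding them."""
    skipped = 0
    view = memoryview(SCRATCH)
    while skipped < n:
        want = min(n - skipped, len(SCRATCH))
        got = stream.readinto(view[:want])
        if not got:  # end of stream reached before n bytes were skipped
            break
        skipped += got
    return skipped


data = bytes(range(256)) * 1024
stream = gzip.GzipFile(fileobj=io.BytesIO(gzip.compress(data)))
skip_by_reading(stream, 100_000)
# The next read continues right after the skipped region.
print(stream.read(4) == data[100_000:100_004])
```

Sharing the scratch buffer avoids a fresh allocation per call, but the
decompression cost of the skipped region is paid either way.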


Guozhang


On Mon, Jun 24, 2019 at 4:30 PM Dongjin Lee  wrote:

> Hello. Here is the new discussion thread for KIP-390: Allow fine-grained
> configuration for compression.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Allow+fine-grained+configuration+for+compression
>
> Here is some history: Initially, the draft implementation was done with the
> traditional configuration scheme (Type A). Then, Becket Qin (
> becket@gmail.com) and Mickael Maison (mickael.mai...@gmail.com)
> proposed that a map-style configuration
> like listener.security.protocol.map
> or max.connections.per.ip.overrides (Type B) would be better. From then on,
> the discussion got stuck.
>
> So last weekend, I re-implemented the feature against the latest trunk for
> all public interface alternatives (i.e., Types A and B) and updated the KIP
> document. You can find the details in this PR:
> https://github.com/apache/kafka/pull/5927
>
> Please have a look when you are free. All kinds of feedback are welcome!
>
> Regards,
> Dongjin
>
> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
> *github:  github.com/dongjinleekr
> linkedin: kr.linkedin.com/in/dongjinleekr
> speakerdeck:
> speakerdeck.com/dongjin
> *
>


-- 
-- Guozhang


[DISCUSS] KIP-390: Allow fine-grained configuration for compression (Rebooted)

2019-06-24 Thread Dongjin Lee
Hello. Here is the new discussion thread for KIP-390: Allow fine-grained
configuration for compression.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Allow+fine-grained+configuration+for+compression

Here is some history: Initially, the draft implementation was done with the
traditional configuration scheme (Type A). Then, Becket Qin (
becket@gmail.com) and Mickael Maison (mickael.mai...@gmail.com)
proposed that a map-style configuration
like listener.security.protocol.map
or max.connections.per.ip.overrides (Type B) would be better. From then on,
the discussion got stuck.
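
The difference between the two styles can be sketched roughly as follows.
This is only an illustration: the configuration keys, the level values, and
the helper parse_level_map are hypothetical and are not taken from the KIP
or the PR. Type A uses one flat key per codec, while Type B packs all codecs
into a single map-style value.

```python
def parse_level_map(value: str) -> dict:
    """Parse a map-style value such as 'gzip:9,lz4:17' into {codec: level}."""
    levels = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty entries such as a trailing comma
        codec, sep, level = entry.partition(":")
        if not sep:
            raise ValueError(f"expected 'codec:level', got {entry!r}")
        levels[codec.strip()] = int(level)
    return levels


# Type A (hypothetical keys): one flat key per codec.
type_a = {"compression.gzip.level": "9", "compression.lz4.level": "17"}
# Type B (hypothetical key): a single map-style value.
type_b = {"compression.level": "gzip:9,lz4:17"}

levels_a = {key.split(".")[1]: int(value) for key, value in type_a.items()}
levels_b = parse_level_map(type_b["compression.level"])
print(levels_a == levels_b)
```

Both forms carry the same information; the trade-off is between simple
per-key validation (Type A) and a single key that needs its own parser
(Type B).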

So last weekend, I re-implemented the feature against the latest trunk for
all public interface alternatives (i.e., Types A and B) and updated the KIP
document. You can find the details in this PR:
https://github.com/apache/kafka/pull/5927

Please have a look when you are free. All kinds of feedback are welcome!

Regards,
Dongjin

-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*
*github:  github.com/dongjinleekr
linkedin: kr.linkedin.com/in/dongjinleekr
speakerdeck: speakerdeck.com/dongjin
*