RE: CASSANDRA-19268: Improve Cassandra compression performance using hardware accelerators

2024-01-25 Thread Kokoori, Shylaja
Thank you Dinesh.

To answer your questions,

> 1. QPL Java Library[1] (JNI bindings to Intel's QPL) does not have any license
> information on the repo. This needs to be corrected. Please see the types of
> licenses we can use[2] for further information.

We will address this. Thank you for pointing it out.

> 2. Can you describe how the compressor will behave when the cluster is made up
> of heterogeneous hardware? For example, let's say we have a mix of machines
> where some support Intel's IAA and some don't?

If the hardware is not available, all supported functionality is executed by 
a software library on the CPU.

> 3. Does QPL have checksumming built in?

Yes, QPL does calculate checksums. Here is some more information:
https://intel.github.io/qpl/documentation/dev_guide_docs/c_use_cases/deflate/c_deflate_decompression.html#checksums
Will this work?
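
The fallback described in the answer to question 2 could be sketched roughly as
follows. This is only an illustration in plain Java, not the QPL or Cassandra
API: the Backend interface, the select() helper and the class name are all
hypothetical, and the software path simply uses java.util.zip.Deflater as a
stand-in for the software library.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

class FallbackCompressorSketch {
    /** Hypothetical compressor abstraction, used only in this sketch. */
    interface Backend {
        String name();
        byte[] compress(byte[] input);
    }

    /** Software path: the JDK's Deflater running on the CPU. */
    static final class SoftwareBackend implements Backend {
        public String name() { return "software"; }

        public byte[] compress(byte[] input) {
            Deflater deflater = new Deflater();
            try {
                deflater.setInput(input);
                deflater.finish();
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                while (!deflater.finished()) {
                    int n = deflater.deflate(buf);
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } finally {
                deflater.end(); // release native zlib memory
            }
        }
    }

    /**
     * Picks the hardware backend only when one was supplied and the node
     * reports that the accelerator is usable; otherwise uses software.
     */
    static Backend select(Backend hardware, Backend software, boolean hardwareAvailable) {
        return (hardwareAvailable && hardware != null) ? hardware : software;
    }

    public static void main(String[] args) {
        // No accelerator on this node, so the software backend is chosen.
        Backend chosen = select(null, new SoftwareBackend(), false);
        byte[] compressed = chosen.compress("example payload".getBytes());
        System.out.println(chosen.name() + " produced " + compressed.length + " bytes");
    }
}
```

For a heterogeneous cluster, the important property is that both paths produce
data the other path can read. The sketch does not demonstrate that; it only
shows the selection step.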

Thanks,
Shylaja



Re: CASSANDRA-19268: Improve Cassandra compression performance using hardware accelerators

2024-01-23 Thread Dinesh Joshi
Hi Shylaja,

If you'd like we can continue this on the ticket you opened. Here are my
concerns -

1. QPL Java Library[1] (JNI bindings to Intel's QPL) does not have any
license information on the repo. This needs to be corrected. Please see the
types of licenses we can use[2] for further information.

2. Can you describe how the compressor will behave when the cluster is made
up of heterogeneous hardware? For example, let's say we have a mix of
machines where some support Intel's IAA and some don't?

3. Does QPL have checksumming built in?

thanks,

Dinesh

[1] https://github.com/intel/qpl-java
[2] https://www.apache.org/legal/resolved.html#category-a


RE: CASSANDRA-19268: Improve Cassandra compression performance using hardware accelerators

2024-01-22 Thread Kokoori, Shylaja
Dinesh & Abe,
Thank you very much for your feedback.

The algorithm used by this HW compressor is compatible with Deflate, but there 
is a constraint of a 4K window size. The concern, therefore, is that existing 
data may not decompress correctly as is. That is why we chose the path of 
adding a new compressor.
Another reason is that there are some additional features available in the 
hardware which are not compatible with zlib. With this approach we could enable 
those features as well.

We are also planning to accelerate existing compressors. If that is the 
preferred approach, we will try to come up with a solution to work around the 
4K window limitation.
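
To illustrate the window-size concern, the sketch below compresses a buffer
with the JDK's java.util.zip.Deflater and reads the window size declared in the
zlib header (RFC 1950: the high nibble of the first byte is log2(window) - 8).
Streams written this way normally declare a 32K window, so they may contain
back-references further back than a 4K history can resolve. The class and
method names are made up for this example, and nothing here touches the
hardware compressor itself.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

class ZlibWindowSketch {
    /** Compresses input with the JDK Deflater in its default zlib-wrapped mode. */
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    /** Returns the window size, in bytes, declared in a zlib stream header. */
    static int declaredWindowSize(byte[] zlibStream) {
        int cmf = zlibStream[0] & 0xFF;
        int method = cmf & 0x0F; // CM field: 8 means deflate
        int cinfo = cmf >>> 4;   // CINFO field: log2(window size) - 8
        if (method != 8) {
            throw new IllegalArgumentException("not a zlib/deflate stream");
        }
        return 1 << (cinfo + 8);
    }

    public static void main(String[] args) {
        byte[] data = "hello hello hello hello".getBytes(StandardCharsets.UTF_8);
        int window = declaredWindowSize(compress(data));
        System.out.println("declared window: " + window + " bytes");
        System.out.println("fits a 4K history: " + (window <= 4096));
    }
}
```

A declared 32K window does not mean every stream actually uses distances
beyond 4K, but a decoder limited to 4K cannot rule it out for existing data.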

Thank you,
Shylaja


Re: CASSANDRA-19268: Improve Cassandra compression performance using hardware accelerators

2024-01-22 Thread Dinesh Joshi
Shylaja,

Cassandra uses ZStd, LZ4 and other compression libraries via JNI to
compress data. The Intel hardware accelerator support is integrated into
those libraries and we can benefit from it. If there are special parameters
that need to be passed in to these libraries, we can make those changes in
the database, but as such Cassandra does not directly implement the
compression algorithms itself.

Dinesh


Re: CASSANDRA-19268: Improve Cassandra compression performance using hardware accelerators

2024-01-22 Thread Abe Ratnofsky
Hardware acceleration for more things would be great, especially based on the 
success of ACCP (CASSANDRA-18624). But I think it would 
be ideal to use existing compressor names and use hardware acceleration if a 
given JAR is present on the classpath / configured, like ACCP. Then hosts with 
varying hardware acceleration support can still interoperate (stream compressed 
SSTables between each other, for example) and migrating existing systems to use 
the new compressors would be as simple as ensuring a JAR is present / 
configured, not requiring a change to table options.

See org.apache.cassandra.security.DefaultCryptoProvider for example.
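
A rough sketch of the "use it if the JAR is present" idea is below. It does not
reproduce DefaultCryptoProvider or any Cassandra code: the Compressor
interface, the load() helper and the class name passed to it are hypothetical.
It only shows the general pattern of probing the classpath for an optional
implementation and falling back to the software one when it is missing.

```java
class OptionalAcceleratorSketch {
    /** Hypothetical compressor interface, used only in this sketch. */
    interface Compressor {
        String describe();
    }

    /** Always-available implementation that runs on the CPU. */
    static final class SoftwareCompressor implements Compressor {
        public String describe() { return "software"; }
    }

    /**
     * Tries to instantiate an optional accelerated implementation by class
     * name. If the class is absent, cannot be constructed, cannot be linked,
     * or is not a Compressor, the software implementation is returned.
     */
    static Compressor load(String acceleratedClassName) {
        try {
            Class<?> cls = Class.forName(acceleratedClassName);
            return (Compressor) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException | LinkageError e) {
            return new SoftwareCompressor();
        }
    }

    public static void main(String[] args) {
        // Placeholder class name; it is not expected to exist on the classpath.
        Compressor c = load("com.example.accel.AcceleratedDeflateCompressor");
        System.out.println("using " + c.describe() + " compressor");
    }
}
```

Because the table option keeps the same compressor name either way, nodes with
and without the optional JAR would read the same configuration. The on-disk
format still has to be identical for streaming between them to work.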

> On Jan 19, 2024, at 1:51 PM, Kokoori, Shylaja  
> wrote:
> 
> Hi,
> The latest processors have integrated hardware accelerators which can speed up 
> operations like compress/decompress, crypto and analytics. Here are some 
> links to details:
> 1) https://cdrdv2.intel.com/v1/dl/getContent/721858
> 2) 
> https://www.intel.com/content/www/us/en/content-details/780887/intel-in-memory-analytics-accelerator-intel-iaa.html
>  
> We would like to add a new compressor which can accelerate 
> compress/decompress when hardware is available and which will default to 
> software otherwise.
>  
> Thanks,
> Shylaja