gongxun0928 commented on code in PR #1344:
URL: https://github.com/apache/cloudberry/pull/1344#discussion_r2318188376
##########
contrib/pax_storage/src/cpp/storage/columns/pax_compress.cc:
##########
@@ -230,9 +234,14 @@ size_t PaxLZ4Compressor::GetCompressBound(size_t src_len) {
}
size_t PaxLZ4Compressor::Compress(void *dst_buff, size_t dst_cap,
- void *src_buff, size_t src_len, int /*lvl*/) {
- return LZ4_compress_default((char *)src_buff, (char *)dst_buff, src_len,
- dst_cap);
+ void *src_buff, size_t src_len, int lvl) {
+#define LZ4_MAX_ACC 65536
+ // acceleration is oppsite meaning of compress level
+ // map [19, 0] to [0, LZ4_MAX_ACC]
+ int acceleration = 19 - lvl;
+ acceleration = (int)(acceleration * LZ4_MAX_ACC / 20.0);
Review Comment:
FYI
I have little experience with acceleration settings. After searching Google
and getting feedback from claude-opus-4.1, I've identified several issues:
1. Incorrect parameter semantics: the acceleration parameter of
LZ4_compress_fast() is not the inverse of the compression level. It's an
independent parameter that controls the speed/compression-ratio tradeoff.
2. Problematic mapping logic: the code maps level [0, 19] to acceleration
roughly [65536, 0], which is wrong on several counts (see the sketch below):
* acceleration = 1 is the default and gives the best compression ratio
* acceleration > 1 speeds up compression but reduces the compression ratio
* acceleration values around 65536 are extreme and result in almost no compression
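For reference, a conservative fix would just clamp the knob into LZ4's useful range and pass it to LZ4_compress_fast(); a minimal sketch, where the helper name CompressWithLevel and the cap of 64 are purely illustrative (neither comes from the PR nor from liblz4):
```cpp
#include <algorithm>
#include <lz4.h>

// Illustrative helper (not the PR's function): map a 0-19 level knob onto
// LZ4_compress_fast()'s acceleration parameter. acceleration = 1 is the
// default / best-ratio setting; larger values trade ratio for speed.
// The cap of 64 is an assumption, not an LZ4 constant.
static size_t CompressWithLevel(void *dst, size_t dst_cap,
                                const void *src, size_t src_len, int lvl) {
  int clamped = std::min(std::max(lvl, 0), 19);
  // level 19 -> acceleration 1 (best ratio), level 0 -> acceleration 64
  int acceleration = 1 + ((19 - clamped) * 63) / 19;
  int n = LZ4_compress_fast(static_cast<const char *>(src),
                            static_cast<char *>(dst),
                            static_cast<int>(src_len),
                            static_cast<int>(dst_cap), acceleration);
  return n > 0 ? static_cast<size_t>(n) : 0;  // 0 means compression failed
}
```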
claude-opus-4.1 suggested two solutions:
Solution 1: Hybrid Fast and High Compression Modes
```cpp
size_t PaxLZ4Compressor::Compress(void *dst_buff, size_t dst_cap,
                                  void *src_buff, size_t src_len, int lvl) {
  // lvl: [0-19]
  //   0-9:  use fast mode with different acceleration values
  //   10-19: use high compression mode with different compression levels
  if (lvl < 10) {
    // Fast mode: map lvl [0-9] to acceleration [20, 2]
    // Lower lvl = higher acceleration = faster, but lower compression ratio
    int acceleration = 20 - (lvl * 2);
    return LZ4_compress_fast((char *)src_buff, (char *)dst_buff,
                             src_len, dst_cap, acceleration);
  } else {
    // High compression mode: map lvl [10-19] to compressionLevel [1-16]
    // Higher lvl = higher compressionLevel = better compression ratio
    int compressionLevel = (lvl - 9) * 16 / 10;
    compressionLevel = std::min(compressionLevel, 16);
    return LZ4_compress_HC((char *)src_buff, (char *)dst_buff,
                           src_len, dst_cap, compressionLevel);
  }
}
```
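One note on Solution 1: LZ4_compress_HC() is declared in <lz4hc.h> rather than <lz4.h>, and in current liblz4 releases LZ4HC_CLEVEL_MAX is 12, so (as far as I can tell) the upper end of the [1-16] mapping just gets clamped by the library. There is also a large speed step at lvl 10 where the code switches from fast mode to HC mode; that may be acceptable, but it is worth benchmarking.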
Solution 2: Adaptive Strategy Based on Block Size
```cpp
size_t PaxLZ4Compressor::Compress(void *dst_buff, size_t dst_cap,
                                  void *src_buff, size_t src_len, int lvl) {
  // For columnar storage, use fast compression for small blocks,
  // and allow higher compression overhead for larger blocks
  const size_t SMALL_BLOCK = 4 * 1024;    // 4KB threshold
  const size_t MEDIUM_BLOCK = 64 * 1024;  // 64KB threshold
  int acceleration;
  if (src_len < SMALL_BLOCK) {
    // Small blocks: prioritize speed over compression ratio
    // Map lvl [0-19] to acceleration [20, 1]
    acceleration = std::max(1, 20 - lvl);
  } else if (src_len < MEDIUM_BLOCK) {
    // Medium blocks: balance between speed and compression ratio
    // Map lvl [0-19] to acceleration [10, 1] with slower progression
    acceleration = std::max(1, 10 - lvl / 2);
  } else {
    // Large blocks: can afford higher compression overhead
    if (lvl < 15) {
      // For moderate compression levels, still use fast mode
      // but with lower acceleration values for better compression
      acceleration = std::max(1, 5 - lvl / 4);
      return LZ4_compress_fast((char *)src_buff, (char *)dst_buff,
                               src_len, dst_cap, acceleration);
    } else {
      // For high compression levels (15-19), switch to HC mode
      // Map lvl [15-19] to compressionLevel [4-16]
      int compressionLevel = (lvl - 14) * 4;
      compressionLevel = std::min(compressionLevel, 16);
      return LZ4_compress_HC((char *)src_buff, (char *)dst_buff,
                             src_len, dst_cap, compressionLevel);
    }
  }
  return LZ4_compress_fast((char *)src_buff, (char *)dst_buff,
                           src_len, dst_cap, acceleration);
}
```
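Whichever mapping gets picked, it is easy to add a round-trip sanity test, since HC and fast output use the same LZ4 block format and both decode with LZ4_decompress_safe(). A rough sketch (the helper name and scaffolding are made up; only the LZ4 calls are real):
```cpp
#include <cassert>
#include <string>
#include <vector>
#include <lz4.h>

// Round-trip check: compress with the fast path (swap in the HC call to cover
// the other branch), then decode with LZ4_decompress_safe() and compare.
static void RoundTripCheck(const std::string &input, int acceleration) {
  std::vector<char> compressed(LZ4_compressBound(static_cast<int>(input.size())));
  int csize = LZ4_compress_fast(input.data(), compressed.data(),
                                static_cast<int>(input.size()),
                                static_cast<int>(compressed.size()),
                                acceleration);
  assert(csize > 0);

  std::vector<char> restored(input.size() + 1);  // +1 avoids a zero-size buffer
  int dsize = LZ4_decompress_safe(compressed.data(), restored.data(), csize,
                                  static_cast<int>(restored.size()));
  assert(dsize == static_cast<int>(input.size()));
  assert(input.compare(0, input.size(), restored.data(), dsize) == 0);
}
```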
Also for Solution 2, opus-4.1 provided a variant that replaces the runtime
calculations with a precomputed lookup table, which also reduces branching:
```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <lz4.h>

class PaxLZ4Compressor {
 private:
  static constexpr int MAX_LEVEL = 20;
  // Size thresholds for different compression strategies
  static constexpr size_t SIZE_THRESHOLDS[] = {4 * 1024, 64 * 1024, SIZE_MAX};
  // Precomputed acceleration value table [size_category][level]
  // Optimization: use a lookup table to avoid runtime calculations
  static constexpr int ACCELERATION_TABLE[3][MAX_LEVEL] = {
      // Small chunks (< 4KB) - prioritize speed
      {20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1},
      // Medium chunks (4KB - 64KB) - balance speed and ratio
      {10, 10, 9, 9, 8, 8, 7, 7, 6, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1},
      // Large chunks (> 64KB) - prioritize compression ratio
      {5, 5, 4, 4, 4, 3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1}};

 public:
  size_t Compress(void *dst_buff, size_t dst_cap,
                  void *src_buff, size_t src_len, int lvl) {
    // Summing the two comparisons picks the size category without branching:
    // 0 = small, 1 = medium, 2 = large
    int size_category = (src_len >= SIZE_THRESHOLDS[0]) +
                        (src_len >= SIZE_THRESHOLDS[1]);
    // Clamp level to the valid range and get acceleration from the
    // precomputed table
    int acceleration =
        ACCELERATION_TABLE[size_category][std::min(lvl, MAX_LEVEL - 1)];
    return LZ4_compress_fast((char *)src_buff, (char *)dst_buff,
                             src_len, dst_cap, acceleration);
  }
};

// Implementation Notes and Considerations:
// 1. This implementation is more reasonable as it provides a balanced approach
//    between compression speed and ratio based on data chunk size.
// 2. The acceleration values are constrained to a reasonable range (1-20),
//    avoiding the extreme values (0-65536) of the previous implementation.
// 3. The design follows the principle: "small chunks prioritize speed,
//    large chunks prioritize compression ratio."
// 4. The precomputed lookup table eliminates runtime calculations,
//    improving performance.
// 5. The level mapping is intuitive: level 0 = fastest, level 19 = best
//    compression.
//
// Important Considerations:
// 1. Validate the acceleration values in the table with real-world data
// 2. Consider adding more size categories for finer granularity
// 3. For known high-redundancy data types (e.g., booleans, enums),
//    consider using lower acceleration values regardless of size
// 4. For known low-redundancy data (e.g., floats, hashes),
//    consider using higher acceleration values
// 5. Add boundary checks to ensure the level parameter is within the valid range
// 6. Consider an adaptive adjustment mechanism based on historical
//    compression ratios for each data type (sketched after this block)
```
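On consideration 6, a very rough sketch of what a per-column feedback loop might look like; everything here (class name, thresholds, EMA weights) is hypothetical and not something the PR implements:
```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical per-column tuner: track a running compression ratio and raise
// the acceleration when data barely compresses (stop paying for work that
// does not help), lower it again when the data compresses well.
class AccelerationTuner {
 public:
  // Record one compression result and return the acceleration to use next time.
  int Update(size_t src_len, size_t compressed_len) {
    if (src_len == 0) return acceleration_;
    double ratio =
        static_cast<double>(compressed_len) / static_cast<double>(src_len);
    // Exponential moving average keeps the tuner cheap and smooth.
    avg_ratio_ = 0.9 * avg_ratio_ + 0.1 * ratio;
    if (avg_ratio_ > 0.95) {
      acceleration_ = std::min(acceleration_ * 2, 64);  // nearly incompressible
    } else if (avg_ratio_ < 0.7) {
      acceleration_ = std::max(acceleration_ / 2, 1);   // compresses well
    }
    return acceleration_;
  }

 private:
  double avg_ratio_ = 0.5;
  int acceleration_ = 1;
};
```
Whether the extra bookkeeping is worth it probably depends on how stable the data distribution is per column.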
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.