> On 28 Sep 2019, at 10:29, Andrey Borodin <x4...@yandex-team.ru> wrote:
> 
> I hope to benchmark decompression on Silesian corpus soon.

I've done it, and the results are mixed.
The Silesian corpus adds 12 payloads to our existing 5. These payloads have
relatively high entropy. In many cases pglz cannot compress them at all, so
decompression is a no-op and the data is stored as is.

Decompressor pglz_decompress_hacked result 48.281747
Decompressor pglz_decompress_hacked8 result 33.868779
Decompressor pglz_decompress_vanilla result 42.510165

Tested on Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz

On the Silesian corpus, pglz_decompress_hacked actually decreases performance
on high-entropy data.
Meanwhile, pglz_decompress_hacked8 is still faster than the vanilla
pglz_decompress.
In spite of these benchmarks, I think that pglz_decompress_hacked8 is the safer
option.

I've updated the test suite [0], and anyone interested can verify the benchmarks.

--
Andrey Borodin
Open source RDBMS development team leader
Yandex.Cloud

[0] https://github.com/x4m/test_pglz
