Re: Share some experiment results about Gorilla encoding algorithm

Steve Su Sun, 11 Oct 2020 08:53:32 -0700

Hi,

From my point of view, since the reimplementation of this algorithm does not 
change the structure of TsFile, there is no need to upgrade the version number 
of TsFile to 000003.


I think we can rename the old Gorilla encoding to TSEncoding.OLD_GORILLA in 
the code, provided that old TsFiles remain readable, and then reserve 
TSEncoding.GORILLA for the re-implemented version. This may minimize the 
impact on users.
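
A minimal sketch of the renaming idea, not IoTDB's actual code: EncodingSketch 
is a hypothetical stand-in for TSEncoding, and the byte codes are made-up 
values. It assumes the encoding is persisted in a file as a stable byte code 
rather than as the constant's name, so renaming the constant does not change 
what old files contain.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for TSEncoding; the constants and byte codes are illustrative only.
enum EncodingSketch {
    PLAIN((byte) 0),
    OLD_GORILLA((byte) 1),  // keeps the code that existing files already carry (assumed value)
    GORILLA((byte) 2);      // fresh code for the re-implemented algorithm (assumed value)

    private static final Map<Byte, EncodingSketch> BY_CODE = new HashMap<>();

    static {
        // Build the reverse lookup once, after all constants exist.
        for (EncodingSketch e : values()) {
            BY_CODE.put(e.code, e);
        }
    }

    private final byte code;

    EncodingSketch(byte code) {
        this.code = code;
    }

    // The byte written to a file; it must never change for an existing constant.
    byte serialize() {
        return code;
    }

    // Maps a byte read from a file back to a constant, rejecting unknown codes.
    static EncodingSketch deserialize(byte code) {
        EncodingSketch e = BY_CODE.get(code);
        if (e == null) {
            throw new IllegalArgumentException("Unknown encoding code: " + code);
        }
        return e;
    }
}
```

Under these assumptions, a file written before the change still resolves to 
the old decoder through OLD_GORILLA, while newly written files get the 
new code.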

What do you think? :)

Steve Su

------------------ Original Message ------------------
From: "dev" <saint...@gmail.com>;
Date: Sat, Oct 10, 2020 11:35
To: "dev"<dev@iotdb.apache.org>;
Subject: Re: Share some experiment results about Gorilla encoding algorithm

Hi,

Nice!

One question: if we reimplement the Gorilla algorithm, how should we handle
version compatibility?

1. Upgrade the TsFile version to 000003, or
2. Add a new encoding name for the corrected Gorilla.

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University



Steve Su <steveyuron...@qq.com> wrote on Sat, Oct 10, 2020 at 10:20:

> Hi,
>
> Recently, we realized that the Gorilla encoding algorithm used inside
> IoTDB may have some issues: it can make time series data (the value part)
> take up more space after encoding. This is not in line with expectations,
> since data usually takes up less space after Gorilla encoding.
>
> I found a very good open-source implementation of the Gorilla algorithm by
> Michael on GitHub (see https://github.com/burmanm/gorilla-tsc). I
> compared the difference in encoding / decoding time cost and compression
> rate between the version implemented by Michael and the version used
> internally by IoTDB, and found that the version used inside IoTDB does have
> a lot of room for improvement.
>
> See
> https://cwiki.apache.org/confluence/display/IOTDB/Gorilla+encoding+algorithm
> for more experiment details.
>
> I think we can refer to Michael's implementation to re-implement the
> algorithm inside IoTDB, both to reduce the encoded size (fixing potential
> errors) and to improve performance. I have created a JIRA issue (see
> https://issues.apache.org/jira/browse/IOTDB-938) for this. If possible, I
> would be happy to re-implement the algorithm.
>
> Thanks,
> Steve Su
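
The expectation in the quoted message, that values usually take less space 
after Gorilla encoding, can be illustrated with a rough bit-cost estimate. 
This is a sketch of the XOR scheme as described in the Gorilla paper, not the 
IoTDB or gorilla-tsc code; the class and method names are hypothetical, and it 
only counts bits without producing an encoded stream.

```java
// Hypothetical helper: estimates the encoded size in bits of a double series
// under the XOR value scheme from the Gorilla paper.
final class GorillaBitCost {
    static long estimateBits(double[] values) {
        if (values.length == 0) {
            return 0;
        }
        long bits = 64; // the first value is stored verbatim
        long prev = Double.doubleToRawLongBits(values[0]);
        int prevLeading = -1;  // -1 means there is no reusable bit window yet
        int prevTrailing = -1;
        for (int i = 1; i < values.length; i++) {
            long cur = Double.doubleToRawLongBits(values[i]);
            long xor = prev ^ cur;
            prev = cur;
            if (xor == 0) {
                bits += 1; // identical value: a single '0' control bit
                continue;
            }
            // The leading-zero count is stored in 5 bits, so it is capped at 31.
            int leading = Math.min(Long.numberOfLeadingZeros(xor), 31);
            int trailing = Long.numberOfTrailingZeros(xor);
            if (prevLeading >= 0 && leading >= prevLeading && trailing >= prevTrailing) {
                // Control bits '10': reuse the previous window of meaningful bits.
                bits += 2 + (64 - prevLeading - prevTrailing);
            } else {
                // Control bits '11': 5 bits leading count, 6 bits length, then the bits.
                bits += 2 + 5 + 6 + (64 - leading - trailing);
                prevLeading = leading;
                prevTrailing = trailing;
            }
        }
        return bits;
    }
}
```

For three identical values the estimate is 64 + 1 + 1 = 66 bits, against 192 
bits unencoded. A series whose consecutive values share few bits costs more 
than 64 bits per value under this scheme, so a correct implementation can 
still expand noisy data.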
