Hi,

Nice!

One question: if we reimplement the Gorilla algorithm, how should we
handle version compatibility?

1. Upgrade the TsFile version to 000003, or
2. Add a new encoding name for the corrected Gorilla implementation.
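
To make option 2 concrete, here is a minimal sketch of how a reader could pick a decoder from the encoding id stored with the data, so files written with the old Gorilla stay readable. Everything in it is hypothetical: the ids, the names LEGACY_GORILLA and GORILLA_V2, and the placeholder decoders are illustrations only, not IoTDB's actual code or values.

```python
# Hypothetical encoding ids; these are NOT IoTDB's real values.
LEGACY_GORILLA = 1  # existing id, kept so old files can still be decoded
GORILLA_V2 = 2      # new id assigned to the corrected implementation


def decode_legacy(payload: bytes) -> str:
    """Placeholder for the old decoder; only tags the payload."""
    return "legacy:" + payload.hex()


def decode_v2(payload: bytes) -> str:
    """Placeholder for the corrected decoder; only tags the payload."""
    return "v2:" + payload.hex()


# One decoder per encoding id. The file version does not change.
DECODERS = {
    LEGACY_GORILLA: decode_legacy,
    GORILLA_V2: decode_v2,
}


def decode_page(encoding_id: int, payload: bytes) -> str:
    """Dispatch on the encoding id recorded alongside the data."""
    try:
        decoder = DECODERS[encoding_id]
    except KeyError:
        raise ValueError(f"unknown encoding id {encoding_id}") from None
    return decoder(payload)
```

With option 1, the same dispatch would presumably key on the file version instead of the encoding id, and every reader would need to understand the new version number.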

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

 黄向东
清华大学 软件学院


Steve Su <steveyuron...@qq.com> wrote on Sat, Oct 10, 2020 at 10:20 PM:

> Hi,
>
> Recently, we realized that the Gorilla encoding algorithm used inside
> IoTDB may have some issues: after encoding, time series data (the value
> part) takes up more space than before. This is not in line with
> expectations, since data usually takes up less space after Gorilla
> encoding.
>
> I found a very good open source Gorilla algorithm implementation by
> Michael on GitHub (see https://github.com/burmanm/gorilla-tsc). I
> compared the difference in encoding / decoding time cost and compression
> rate between the version implemented by Michael and the version used
> internally by IoTDB, and found that the version used inside IoTDB does have
> a lot of room for improvement.
>
> See
> https://cwiki.apache.org/confluence/display/IOTDB/Gorilla+encoding+algorithm
> for more experiment details.
>
> I think we can refer to Michael's implementation to re-implement the
> algorithm inside IoTDB to improve the compression ratio (fixing potential
> errors) and improve performance. I have created a JIRA (see
> https://issues.apache.org/jira/browse/IOTDB-938) for this. If possible, I
> would be happy to re-implement the algorithm.
>
> Thanks,
> Steve Su
