Hi,

Nice!

One question: if we reimplement the Gorilla algorithm, how should we handle version compatibility?

1. Upgrade the TsFile version to 000003, or
2. Add a new encoding name for the corrected Gorilla.

Best,
-----------------------------------
Xiangdong Huang
School of Software, Tsinghua University

黄向东 清华大学 软件学院

Steve Su <steveyuron...@qq.com> wrote on Sat, Oct 10, 2020 at 10:20 PM:

> Hi,
>
> Recently, we realized that the Gorilla encoding algorithm used inside
> IoTDB may have some issues: it makes the value part of time series data
> take up more space after encoding. This is not in line with expectations.
> Usually, data takes up less space after Gorilla encoding.
>
> I found a very good open source implementation of the Gorilla algorithm
> by Michael on GitHub (see https://github.com/burmanm/gorilla-tsc). I
> compared the encoding / decoding time cost and the compression ratio of
> Michael's version against the version used internally by IoTDB, and found
> that the version used inside IoTDB does have a lot of room for
> improvement.
>
> See
> https://cwiki.apache.org/confluence/display/IOTDB/Gorilla+encoding+algorithm
> for more experiment details.
>
> I think we can refer to Michael's implementation to re-implement the
> algorithm inside IoTDB, to improve the compression ratio (fixing
> potential errors) and improve performance. I have created a JIRA issue
> (see https://issues.apache.org/jira/browse/IOTDB-938) for this. If
> possible, I would be happy to re-implement the algorithm.
>
> Thanks,
> Steve Su
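
To make option 2 above more concrete, here is a minimal sketch of how a separate encoding name could keep old files readable without bumping the TsFile version. It is written in Python only for brevity. The enum names, the numeric IDs, and the decoder functions are all hypothetical and are not taken from the IoTDB code base.

```python
from enum import Enum


class Encoding(Enum):
    # Hypothetical IDs; the codes actually stored in TsFile may differ.
    GORILLA_LEGACY = 8  # existing implementation, kept so old files stay readable
    GORILLA_FIXED = 9   # corrected implementation, used for newly written data


def decode_legacy(page: bytes) -> str:
    # Placeholder standing in for the existing decoder.
    return f"legacy:{len(page)}"


def decode_fixed(page: bytes) -> str:
    # Placeholder standing in for the re-implemented decoder.
    return f"fixed:{len(page)}"


# One decoder per encoding name; old and new pages can coexist in one file.
DECODERS = {
    Encoding.GORILLA_LEGACY: decode_legacy,
    Encoding.GORILLA_FIXED: decode_fixed,
}


def decode_page(encoding_id: int, page: bytes) -> str:
    """Select the decoder from the encoding ID stored alongside the page."""
    return DECODERS[Encoding(encoding_id)](page)
```

Under this assumption, readers dispatch on the stored encoding ID, so data written with the old encoder never needs rewriting. Option 1 would instead require readers to branch on the file version.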