Hi, I read the document (which is excellent, good work!) and it sounds very interesting, but as far as I understand the algorithm, it's lossy. Is this true? Or am I missing something?
Thanks!
Julian

On 29.09.20, 12:04, "Jialin Qiao" <qj...@mails.tsinghua.edu.cn> wrote:

Hi,

Good summary~

> Page header needs to maintain a Map<segmentStartTime, count>

It's better to keep the structure of PageHeader the same as in 0.10. You can store this information in the PageData.

> Encoder, Decoder will take both a data column and a series.

Try to provide an interface with primitive data types, so a series should be written via encodeFloatPoint(long time, float value).

Besides, there are some more details that need to be considered:

- How to store the endpoint of each segment
- In SDT with timestamp encoding, how to store the timestamps? (Maybe just using TS2_DIFF is fine)

This is not a small change to IoTDB...

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Haimei Guo" <kel...@gmail.com>
> Sent: 2020-09-29 15:19:49 (Tuesday)
> To: dev@iotdb.apache.org
> Cc:
> Subject: Re: Support SDT compression
>
> Hi,
>
> Following is a summary of SDT's encoding and decoding implementation in
> IoTDB.
>
> - SDT mainly calculates the up and down slopes of the data relative to the
>   segment starting point. If a point is within the compression deviation (CD)
>   range, the data is discarded. If it exceeds the CD, the original data is stored.
> - In IoTDB, SDT can act as a new Encoding method. It works inside each
>   Page (PageWriter and PageReader).
> - Encoding both with and without timestamps will be supported.
> - For encoding without timestamps, we will record the count of data points
>   in each segment in the page header. Page header needs to maintain a
>   Map<segmentStartTime, count>
>
> Encoder will be changed to encode(long time, long value)
> Data buffer will be stored in each Encoder
> Decoder will be changed to getTime(), getXXValue()
> Encoder, Decoder will take both a data column and a series.
>
> If you have any question or comment, you are more than welcome to reply!
>
> Thank you,
>
> Haimei
>
>
> On Mon, Sep 28, 2020 at 1:20 PM Jialin Qiao <qj...@mails.tsinghua.edu.cn>
> wrote:
>
> > Hi Haimei,
> >
> > Good work! This doc is comprehensive :)
> >
> > As for the implementation in IoTDB, here are some points:
> >
> > (1) First, SDT could act as a new Encoding method in IoTDB. It works
> > inside each Page (PageWriter and PageReader).
> > (2) The interface of Encoder could be changed to encode(long time, XX
> > value). The interface of Decoder could be changed to getTime(),
> > getXXValue(). That is, the encoder and decoder are responsible not only for
> > one data column but for a series. This involves some restructuring of the
> > Encoder and Decoder; the data buffers should be stored inside each encoder.
> > (3) For the SDT without timestamps, we need to record the count of data
> > points in each segment.
> > (4) We could offer two encodings, SDT with timestamps and SDT without
> > timestamps.
> >
> > Thanks,
> > --
> > Jialin Qiao
> > School of Software, Tsinghua University
> >
> > 乔嘉林
> > 清华大学 软件学院
> >
> > > -----Original Message-----
> > > From: "runhus...@foxmail.com" <runhus...@foxmail.com>
> > > Sent: 2020-09-25 11:56:16 (Friday)
> > > To: dev <dev@iotdb.apache.org>
> > > Cc:
> > > Subject: Re: Support SDT compression
> > >
> > > Great work!
> > >
> > > Thanks.
> > >
> > > Chao Wang
> > > BONC Ltd
> > >
> > >
> > > From: Eileen Guo
> > > Date: 2020-09-25 11:47
> > > To: dev
> > > Subject: Support SDT compression
> > > Hi all,
> > >
> > > I've completed a design draft for supporting swinging door compression.
> > >
> > > Jira: jira SDT link
> > > <https://issues.apache.org/jira/browse/IOTDB-890?filter=-4&jql=assignee%20in%20(haimeiguo)%20order%20by%20created%20DESC>
> > >
> > > design doc: SDT design doc link
> > > <https://docs.google.com/document/d/1VeTwVsm4CkQSVR65bWw9pKg6gRdiDYUz0lBBhWXHl5A/edit?usp=sharing>
> > >
> > > The doc explains the SDT algorithm, the compression and decompression process,
> > > performance tests, and the SDT + IoTDB implementation and usage.
> > >
> > > There are still some open questions about where to use this algorithm. If you
> > > have any ideas, you are welcome to comment.
> > >
> > > Thank you!
> > > Haimei Guo
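
For readers following the thread, the lossy behaviour asked about at the top can be seen in a minimal Java sketch of textbook swinging door trending (SDT). This is an illustration only, not the IoTDB implementation and not code from the design doc: the class `SdtSketch`, the methods `compress` and `valueAt`, and the sample data are all hypothetical names chosen here, and the sketch assumes strictly increasing timestamps and a positive compression deviation (CD).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of swinging door trending (SDT); not IoTDB code. */
final class SdtSketch {

    /** One stored (time, value) point. */
    static final class Point {
        final long time;
        final double value;

        Point(long time, double value) {
            this.time = time;
            this.value = value;
        }
    }

    /**
     * Keeps the first point, the last point, and the point just before each
     * moment the two "doors" (pivoting at anchor value +/- cd) swing past parallel.
     * Assumes strictly increasing timestamps.
     */
    static List<Point> compress(long[] times, double[] values, double cd) {
        List<Point> kept = new ArrayList<>();
        if (times.length == 0) {
            return kept;
        }
        int anchor = 0; // index of the current segment start
        kept.add(new Point(times[0], values[0]));
        double upSlope = Double.NEGATIVE_INFINITY; // steepest upper-door slope so far
        double lowSlope = Double.POSITIVE_INFINITY; // shallowest lower-door slope so far

        for (int i = 1; i < times.length; i++) {
            double dt = times[i] - times[anchor];
            upSlope = Math.max(upSlope, (values[i] - (values[anchor] + cd)) / dt);
            lowSlope = Math.min(lowSlope, (values[i] - (values[anchor] - cd)) / dt);
            if (upSlope > lowSlope) {
                // Doors have opened: close the segment at the previous point
                // and restart both doors from there.
                anchor = i - 1;
                kept.add(new Point(times[anchor], values[anchor]));
                dt = times[i] - times[anchor];
                upSlope = (values[i] - (values[anchor] + cd)) / dt;
                lowSlope = (values[i] - (values[anchor] - cd)) / dt;
            }
        }
        int last = times.length - 1;
        if (anchor != last) {
            kept.add(new Point(times[last], values[last])); // always keep the final point
        }
        return kept;
    }

    /** Reconstructs a value by linear interpolation between kept points. */
    static double valueAt(List<Point> kept, long time) {
        if (kept.size() == 1 && kept.get(0).time == time) {
            return kept.get(0).value;
        }
        for (int k = 0; k + 1 < kept.size(); k++) {
            Point a = kept.get(k);
            Point b = kept.get(k + 1);
            if (time >= a.time && time <= b.time) {
                double fraction = (double) (time - a.time) / (b.time - a.time);
                return a.value + fraction * (b.value - a.value);
            }
        }
        throw new IllegalArgumentException("time outside the compressed range");
    }

    public static void main(String[] args) {
        long[] times = {0, 1, 2, 3, 4, 5};
        double[] values = {0.0, 0.2, 0.1, 5.0, 5.1, 5.0};
        List<Point> kept = compress(times, values, 0.5);
        System.out.println("kept points: " + kept.size());
        System.out.println("value at t=1 after decoding: " + valueAt(kept, 1));
    }
}
```

With this sample data the points at times 1 and 4 are dropped. Reading time 1 back yields an interpolated value rather than the original 0.2, which is the sense in which the encoding is lossy: discarded points can only be approximated, not recovered exactly.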