Hi,

Yes, it's lossy. Users need to configure the error tolerance.
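
To make the lossiness concrete, here is a minimal sketch of swinging-door filtering in Java. It is not the IoTDB implementation: the class name SdtSketch, the method keptIndices and the parameter compDev (the compression deviation, i.e. the error tolerance) are all hypothetical, and it assumes strictly increasing timestamps.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of swinging-door filtering; not IoTDB code. */
final class SdtSketch {

  /**
   * Returns the indices of the points that survive compression.
   * Assumes times are strictly increasing and compDev >= 0.
   */
  static List<Integer> keptIndices(long[] times, double[] values, double compDev) {
    List<Integer> kept = new ArrayList<>();
    int n = times.length;
    if (n == 0) {
      return kept;
    }
    kept.add(0); // the first point always opens a segment
    if (n == 1) {
      return kept;
    }
    int anchor = 0; // index of the current segment's starting point
    double upperSlope = Double.NEGATIVE_INFINITY; // door hinged at value + compDev
    double lowerSlope = Double.POSITIVE_INFINITY; // door hinged at value - compDev
    for (int i = 1; i < n; i++) {
      upperSlope = Math.max(upperSlope, slope(times, values, anchor, i, +compDev));
      lowerSlope = Math.min(lowerSlope, slope(times, values, anchor, i, -compDev));
      if (upperSlope > lowerSlope) {
        // The doors have opened past parallel: keep the previous point
        // and start a new segment from it.
        anchor = i - 1;
        kept.add(anchor);
        upperSlope = slope(times, values, anchor, i, +compDev);
        lowerSlope = slope(times, values, anchor, i, -compDev);
      }
    }
    kept.add(n - 1); // the last point closes the final segment
    return kept;
  }

  /** Slope from the hinge (anchor value shifted by offset) to point i. */
  private static double slope(long[] t, double[] v, int anchor, int i, double offset) {
    return (v[i] - (v[anchor] + offset)) / (double) (t[i] - t[anchor]);
  }

  public static void main(String[] args) {
    long[] times = {0, 1, 2, 3, 4, 5};
    double[] values = {0, 0, 0, 10, 10, 10};
    System.out.println(keptIndices(times, values, 0.5)); // prints [0, 2, 3, 5]
  }
}
```

With compDev = 0.5, the series 0, 0.3, 0.1, 0.4, 0.2 sampled at t = 0..4 keeps only its first and last points. Interpolating between them gives 0.05 at t = 1 where the original value was 0.3, an error of 0.25, which stays within the configured deviation but is not recoverable.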

Thanks,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Julian Feinauer" <j.feina...@pragmaticminds.de>
> Sent: 2020-10-03 02:09:44 (Saturday)
> To: "dev@iotdb.apache.org" <dev@iotdb.apache.org>
> Cc: 
> Subject: Re: Support SDT compression
> 
> Hi,
> 
> I read the document (which is excellent, good work!) and it sounds very
> interesting, but as far as I understand the algorithm, it's lossy.
> Is this true?
> Or am I missing something?
> 
> Thanks!
> Julian
> 
> Am 29.09.20, 12:04 schrieb "Jialin Qiao" <qj...@mails.tsinghua.edu.cn>:
> 
>     Hi,
> 
>     Good summary~
> 
>     > Page header needs to maintain a Map<segmentStartTime, count>
> 
>     It's better to keep the structure of PageHeader the same as 0.10.
>     You can store this information in the PageData. 
> 
>     > Encoder, Decoder will take both a data column and a series.
> 
>     Try to provide interfaces with primitive data types. A point of a series
>     would then be written with encodeFloatPoint(long time, float value).
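
A rough sketch of what such a primitive-typed, series-level interface could look like follows. Only encodeFloatPoint(long time, float value), getTime() and the getXXValue() naming pattern come from this thread; the class names, flush() and next() are hypothetical, and the sketch performs no compression. It only shows the encoder owning its buffer and seeing whole points rather than a single column.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical names throughout; this is not the IoTDB Encoder/Decoder API. */
final class SeriesCodecSketch {

  /** Encoder that receives whole (time, value) points and keeps its own buffer. */
  static final class FloatSeriesEncoder {
    private final List<Long> times = new ArrayList<>();
    private final List<Float> values = new ArrayList<>();

    void encodeFloatPoint(long time, float value) {
      times.add(time);
      values.add(value);
    }

    /** Hands the buffered points to a decoder and empties the buffer. */
    FloatSeriesDecoder flush() {
      FloatSeriesDecoder decoder =
          new FloatSeriesDecoder(new ArrayList<>(times), new ArrayList<>(values));
      times.clear();
      values.clear();
      return decoder;
    }
  }

  /** Cursor-style decoder exposing getTime() and getFloatValue(). */
  static final class FloatSeriesDecoder {
    private final List<Long> times;
    private final List<Float> values;
    private int pos = -1;

    FloatSeriesDecoder(List<Long> times, List<Float> values) {
      this.times = times;
      this.values = values;
    }

    /** Advances to the next point; returns false when the series is exhausted. */
    boolean next() {
      if (pos + 1 >= times.size()) {
        return false;
      }
      pos++;
      return true;
    }

    long getTime() {
      return times.get(pos);
    }

    float getFloatValue() {
      return values.get(pos);
    }
  }

  public static void main(String[] args) {
    FloatSeriesEncoder encoder = new FloatSeriesEncoder();
    encoder.encodeFloatPoint(1L, 1.5f);
    encoder.encodeFloatPoint(2L, 2.5f);
    FloatSeriesDecoder decoder = encoder.flush();
    while (decoder.next()) {
      System.out.println(decoder.getTime() + " -> " + decoder.getFloatValue());
    }
  }
}
```

Keeping the signatures primitive avoids boxing on the write path; the boxed lists here are only a convenience for the sketch.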
> 
>     Besides, there are some more details to consider:
> 
>     - How to store the endpoint of each segment
>     - In the SDT with timestamp encoding, how to store the timestamps? (Maybe
>       just using TS2_DIFF is fine)
> 
>     This is not a small change to IoTDB...
> 
>     Thanks,
>     --
>     Jialin Qiao
>     School of Software, Tsinghua University
> 
>     乔嘉林
>     清华大学 软件学院
> 
>     > -----Original Message-----
>     > From: "Haimei Guo" <kel...@gmail.com>
>     > Sent: 2020-09-29 15:19:49 (Tuesday)
>     > To: dev@iotdb.apache.org
>     > Cc: 
>     > Subject: Re: Support SDT compression
>     > 
>     > Hi,
>     > 
>     > Following is a summary of SDT's encoding and decoding implementation in
>     > IoTDB.
>     > 
>     >    - SDT mainly calculates the upper and lower slopes from the segment
>     >      starting point to each incoming data point. If the point stays
>     >      within the compression deviation (CD) range, it is discarded. If it
>     >      exceeds the CD, the original data is stored.
>     >    - In IoTDB, SDT can act as a new Encoding method. It works inside each
>     >      Page (PageWriter and PageReader).
>     >    - Both variants will be supported: with and without timestamp
>     >      encoding.
>     >    - For the variant without timestamp encoding, we will record the count
>     >      of data points in each segment in the page header. The page header
>     >      needs to maintain a Map<segmentStartTime, count>.
>     > 
>     > Encoder will be changed to encode(long time, long value)
>     > Data buffer will be stored in each Encoder
>     > Decoder will be changed to getTime(), getXXValue()
>     > Encoder, Decoder will take both a data column and a series.
>     > 
>     > 
>     > If you have any questions or comments, you are more than welcome to reply!
>     > 
>     > 
>     > Thank you,
>     > 
>     > Haimei
>     > 
>     > 
>     > On Mon, Sep 28, 2020 at 1:20 PM Jialin Qiao <qj...@mails.tsinghua.edu.cn>
>     > wrote:
>     > 
>     > > Hi Haimei,
>     > >
>     > > Good work! This doc is comprehensive :)
>     > >
>     > > As for the implementation in IoTDB, here are some points:
>     > >
>     > > (1) First, SDT could act as a new Encoding method in IoTDB. It works
>     > > inside each Page (PageWriter and PageReader).
>     > > (2) The interface of Encoder could be changed to encode(long time, XX
>     > > value). The interface of Decoder could be changed to getTime(),
>     > > getXXValue(). That is, the encoder and decoder are responsible not for
>     > > a single data column but for a whole series. This involves some
>     > > restructuring of the Encoder and Decoder; the data buffers should be
>     > > stored inside each encoder.
>     > > (3) For the SDT without timestamps, we need to record the count of
>     > > points in each segment.
>     > > (4) We could offer two encodings, SDT with timestamps and SDT without
>     > > timestamps.
>     > >
>     > > Thanks,
>     > > --
>     > > Jialin Qiao
>     > > School of Software, Tsinghua University
>     > >
>     > > 乔嘉林
>     > > 清华大学 软件学院
>     > >
>     > > > -----Original Message-----
>     > > > From: "runhus...@foxmail.com" <runhus...@foxmail.com>
>     > > > Sent: 2020-09-25 11:56:16 (Friday)
>     > > > To: dev <dev@iotdb.apache.org>
>     > > > Cc:
>     > > > Subject: Re: Support SDT compression
>     > > >
>     > > > Great work!
>     > > >
>     > > >
>     > > >
>     > > > Thanks.
>     > > >
>     > > > Chao Wang
>     > > > BONC Ltd
>     > > >
>     > > >
>     > > > From: Eileen Guo
>     > > > Date: 2020-09-25 11:47
>     > > > To: dev
>     > > > Subject: Support SDT compression
>     > > > Hi all,
>     > > >
>     > > > I've completed a design draft for supporting swinging door compression.
>     > > >
>     > > > Jira: jira SDT link
>     > > > <https://issues.apache.org/jira/browse/IOTDB-890?filter=-4&jql=assignee%20in%20(haimeiguo)%20order%20by%20created%20DESC>
>     > > >
>     > > > design doc: SDT design doc link
>     > > > <https://docs.google.com/document/d/1VeTwVsm4CkQSVR65bWw9pKg6gRdiDYUz0lBBhWXHl5A/edit?usp=sharing>
>     > > >
>     > > > The doc explains the SDT algorithm, the compression and decompression
>     > > > process, performance tests, and the SDT + IoTDB implementation and usage.
>     > > >
>     > > > There are still some open questions about where to use this algorithm.
>     > > > If you have any ideas, you are welcome to comment.
>     > > >
>     > > > Thank you!
>     > > > Haimei Guo
>     > >
> 
