Hi, agreed and done. The corresponding issue is IOTDB-286 (https://issues.apache.org/jira/browse/IOTDB-286).
Julian

On 01.11.19, 08:14, "Xiangdong Huang" <saint...@gmail.com> wrote:

    Hi,

    Let's move it to JIRA as a new feature.

    > something like "NaN"

    Indeed, supporting "NaN" is important for real applications.

    Best,
    -----------------------------------
    Xiangdong Huang
    School of Software, Tsinghua University

    黄向东
    清华大学 软件学院

    Julian Feinauer <j.feina...@pragmaticminds.de> wrote on
    Thu, Oct 31, 2019 at 8:40 PM:

    > Hi,
    >
    > I agree with your interpretation. It's just another layer with a
    > different interpretation.
    > So the idea would be to provide a separate API initially to experiment
    > a bit, and probably add it to the "core" API eventually,
    > so that type resolution always checks whether the type is primitive or
    > logical.
    >
    > I mainly wanted to get your ideas and feedback on that, and to hear
    > whether you can imagine use cases for it.
    > We would need something like "NaN" quite often in our use cases, and I
    > would also like to use a "string" mapping for "ON/OFF" rather than
    > true/false, as it makes the data easier to interpret later on.
    >
    > Julian
    >
    > On 31.10.19, 05:39, "Xiangdong Huang" <saint...@gmail.com> wrote:
    >
    >     Hi,
    >
    >     > You can look at how avro handles non primitive types (they call
    >     > it LogicalTypes) here:
    >     > https://avro.apache.org/docs/1.8.1/spec.html#Logical+Types
    >
    >     Yes, I read some material about LogicalTypes. A logical type looks
    >     like a nickname for a data type, with a new interpretation. E.g., a
    >     byte array can be called a Decimal, where the interpretation
    >     depends on how the user defines the precision and scale.
    >
    >     Using this kind of implementation is also OK, I think.
    >
    >     So, would you like to provide the interface to the user in the
    >     IoTDB layer (so using SQL to operate on data), or on top of the
    >     TsFile layer (so using the TsFile API to operate on data)?
    >
    >     Best,
    >     -----------------------------------
    >     Xiangdong Huang
    >     School of Software, Tsinghua University
    >
    >     黄向东
    >     清华大学 软件学院
    >
    >     Julian Feinauer <j.feina...@pragmaticminds.de> wrote on
    >     Wed, Oct 30, 2019 at 5:59 PM:
    >
    >     > Hi,
    >     >
    >     > in fact, in the MDF spec it is mostly not for compression (that's
    >     > a nice side effect) but rather for being able to really express
    >     > the (physical) content of a signal.
    >     > So my initial idea was to implement it as an optional layer on
    >     > top of the current TsFile which does the "interpretation",
    >     > because in the TsFile it's always just a "primitive" series that
    >     > is stored.
    >     >
    >     > So the idea would be to store some metadata (like a formula,
    >     > lookup table, ...) on creation and use that on reading, but only
    >     > optionally.
    >     > You can look at how avro handles non primitive types (they call
    >     > it LogicalTypes) here:
    >     > https://avro.apache.org/docs/1.8.1/spec.html#Logical+Types
    >     > This is similar to my idea.
    >     >
    >     > Julian
    >     >
    >     > On 29.10.19, 14:40, "Xiangdong Huang" <saint...@gmail.com> wrote:
    >     >
    >     >     Hi,
    >     >
    >     >     > Then it's most efficient to store integers and a formula
    >     >     > like a * x + b with e.g. b = 3 and a = 1/100.
    >     >     > So 3V would be stored as x = 0, 3.01V -> x = 1, ... 4.2V
    >     >     > as x = 120.
    >     >     > So we only store 0 to 120 and no decimals and stuff, which
    >     >     > would be very easily compressible I think.
    >     >
    >     >     Good idea! Two thumbs up for that.
    >     >
    >     >     But for cases like the above, implementing a new encoding
    >     >     method is better than a new data type.
    >     >
    >     >     E.g., create time series root.a.b.voltage with encoding =
    >     >     linear_transformation and encoding_parameter = "describe the
    >     >     function like y=a * x + b" and datatype = INT.
    >     >
    >     >     "linear_transformation" is the new encoding method.
    >     >
    >     >     Now I get two cases from the discussion: one is Optional-like
    >     >     data, and the other is data that can be transformed.
    >     >
    >     >     So, do we want to support the two cases above, or find a more
    >     >     general data type for "rich data types" (can the MDF format
    >     >     provide some inspiration)?
    >     >
    >     >     Best,
    >     >     -----------------------------------
    >     >     Xiangdong Huang
    >     >     School of Software, Tsinghua University
    >     >
    >     >     黄向东
    >     >     清华大学 软件学院
    >     >
    >     >     Julian Feinauer <j.feina...@pragmaticminds.de> wrote on
    >     >     Tue, Oct 29, 2019 at 8:26 PM:
    >     >
    >     >     > Hi Xiangdong,
    >     >     >
    >     >     > to your second question:
    >     >     > The use case is the other way round.
    >     >     > We know that we measure e.g. a voltage between 3V and 4.2V
    >     >     > with a precision of 0.01 or something.
    >     >     > Then it's most efficient to store integers and a formula
    >     >     > like a * x + b with e.g. b = 3 and a = 1/100.
    >     >     > So 3V would be stored as x = 0, 3.01V -> x = 1, ... 4.2V
    >     >     > as x = 120.
    >     >     > So we only store 0 to 120 and no decimals and stuff, which
    >     >     > would be very easily compressible I think.
    >     >     >
    >     >     > Julian
    >     >     >
    >     >     > On 29.10.19, 07:13, "Xiangdong Huang" <saint...@gmail.com>
    >     >     > wrote:
    >     >     >
    >     >     >     Hi,
    >     >     >
    >     >     >     > In Java we could model it as a variable Optional<> x
    >     >     >     > which could be null, Optional.empty(),
    >     >     >     > Optional.of(true), Optional.of(false).
    >     >     >
    >     >     >     It makes sense. And using a new data type to achieve
    >     >     >     this in IoTDB is OK.
    >     >     >
    >     >     >     > Or scale formulas like a*x+b which allows to leverage
    >     >     >     > the precision even for “small” double values or even
    >     >     >     > integers.
    >     >     >
    >     >     >     So, are you considering a use case like this: the time
    >     >     >     series values should be [1, 1, 0, 0, 1, 1, 1, 0, 0...]
    >     >     >     but we actually get [0.99, 0.99, 0.01, 0, 1, 1, 0.999,
    >     >     >     0, 0.01] (because of the precision of the sensors)?
    >     >     >     And which values do you want to save?
    >     >     >     (1) Save them as 1 and 0. Or,
    >     >     >     (2) save them as 0.99, 0.01 as measured, but use a
    >     >     >     specific query API to return data like 1 and 0?
    >     >     >
    >     >     >     Another question of mine: is there a general data type
    >     >     >     that can support the above cases?
    >     >     >
    >     >     >     Best,
    >     >     >     -----------------------------------
    >     >     >     Xiangdong Huang
    >     >     >     School of Software, Tsinghua University
    >     >     >
    >     >     >     黄向东
    >     >     >     清华大学 软件学院
    >     >     >
    >     >     >     Julian Feinauer <j.feina...@pragmaticminds.de> wrote on
    >     >     >     Tue, Oct 29, 2019 at 3:58 AM:
    >     >     >
    >     >     >     > Hi all,
    >     >     >     >
    >     >     >     > I wanted to discuss a possible new feature, which I
    >     >     >     > will call the Rich Datatypes (RDT) API in the
    >     >     >     > following.
    >     >     >     > I worked a lot in the automotive industry, and there
    >     >     >     > is a broadly adopted open standard called ASAM MDF
    >     >     >     > (https://www.asam.net/standards/detail/mdf/).
    >     >     >     > It is a format targeted at efficient storage, but at
    >     >     >     > the same time it supports VERY complex types (which
    >     >     >     > are often used in automotive controllers).
    >     >     >     >
    >     >     >     > Take something as simple as a boolean. We could store
    >     >     >     > it as a boolean (as java bool) in 1 bit.
    >     >     >     > BUT overall we have 4 possibilities:
    >     >     >     >
    >     >     >     >   * No value is available for a timestamp (NULL /
    >     >     >     >     nothing stored)
    >     >     >     >   * We had a successful request but the controller
    >     >     >     >     does not know whether it is true or false (or had
    >     >     >     >     an internal error); this is a bit like
    >     >     >     >     Optional.isPresent() == false
    >     >     >     >   * True
    >     >     >     >   * False
    >     >     >     >
    >     >     >     > In Java we could model it as a variable Optional<> x
    >     >     >     > which could be null, Optional.empty(),
    >     >     >     > Optional.of(true), Optional.of(false).
    >     >     >     >
    >     >     >     > Other examples are discrete values like “ON”, “OFF”
    >     >     >     > (which are handled internally as “lookup tables” on
    >     >     >     > integer rows).
    >     >     >     > Or scale formulas like a*x+b, which allow leveraging
    >     >     >     > the precision even for “small” double values or even
    >     >     >     > integers.
    >     >     >     > A formula can also come with a “fallback” lookup
    >     >     >     > value like “NV”.
    >     >     >     >
    >     >     >     > I think this could be a valuable extension to IoTDB
    >     >     >     > as an additional API (not changing anything below,
    >     >     >     > just providing an API on top to do the calculation).
    >     >     >     >
    >     >     >     > What do others think?
    >     >     >     >
    >     >     >     > Julian
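
For reference, the three ideas discussed in this thread (a scale formula
a*x+b over stored integers, a lookup table with a fallback such as "NV",
and the multi-state boolean) could be sketched in Java roughly as follows.
This is only an illustration under assumptions: the names RichTypesSketch,
LinearScale, LookupTable and decodeFlag are hypothetical and are not part of
the IoTDB or TsFile API, and treating every raw code other than 0 or 1 as
"unknown" is an arbitrary choice made for the example.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch; none of these names exist in IoTDB or TsFile. */
final class RichTypesSketch {

    /** Scale formula: physical = a * raw + b, where raw is the stored integer. */
    static final class LinearScale {
        final double a;
        final double b;

        LinearScale(double a, double b) {
            this.a = a;
            this.b = b;
        }

        /** Physical value -> stored integer, rounded to the nearest step. */
        long encode(double physical) {
            return Math.round((physical - b) / a);
        }

        /** Stored integer -> physical value. */
        double decode(long raw) {
            return a * raw + b;
        }
    }

    /** Lookup table on integer values with a fallback label for unmapped codes. */
    static final class LookupTable {
        final Map<Integer, String> labels;
        final String fallback;

        LookupTable(Map<Integer, String> labels, String fallback) {
            this.labels = labels;
            this.fallback = fallback;
        }

        String decode(int raw) {
            return labels.getOrDefault(raw, fallback);
        }
    }

    /**
     * Multi-state boolean: 0 -> false, 1 -> true, any other code -> the
     * controller answered but does not know the value (Optional.empty()).
     * "No value for a timestamp" is simply the absence of a data point,
     * so it never reaches this method.
     */
    static Optional<Boolean> decodeFlag(int raw) {
        if (raw == 0) {
            return Optional.of(false);
        }
        if (raw == 1) {
            return Optional.of(true);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        LinearScale voltage = new LinearScale(0.01, 3.0); // a = 1/100, b = 3
        System.out.println(voltage.encode(3.01));
        System.out.println(voltage.decode(120));

        LookupTable state = new LookupTable(Map.of(0, "OFF", 1, "ON"), "NV");
        System.out.println(state.decode(1));
        System.out.println(state.decode(7));

        System.out.println(decodeFlag(2));
    }
}
```

With a = 1/100 and b = 3, the measured range of 3V to 4.2V maps to the raw
integers 0 to 120, so only small integers need to be stored. Whether such a
mapping lives in an API on top of the TsFile or in a new encoding method is
exactly the open question of the thread; the sketch does not take a side.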