Hi,

Agreed and done. The corresponding issue is IOTDB-286
(https://issues.apache.org/jira/browse/IOTDB-286).
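
For reference, here is a minimal sketch of the scale formula y = a * x + b
discussed further down in this thread, using the voltage example (a = 1/100,
b = 3). The class name LinearScale and its methods are hypothetical and only
illustrate the idea; this is not an existing IoTDB or TsFile API.

```java
/** Hypothetical helper (not an IoTDB/TsFile class): y = a * x + b. */
public final class LinearScale {
    private final double a; // scale factor, i.e. the resolution (e.g. 0.01)
    private final double b; // offset (e.g. 3.0)

    public LinearScale(double a, double b) {
        if (a == 0.0) {
            throw new IllegalArgumentException("a must be non-zero");
        }
        this.a = a;
        this.b = b;
    }

    /** Physical value -> stored integer, rounded to the nearest step. */
    public long encode(double physical) {
        return Math.round((physical - b) / a);
    }

    /** Stored integer -> physical value. */
    public double decode(long stored) {
        return a * stored + b;
    }

    public static void main(String[] args) {
        LinearScale volts = new LinearScale(0.01, 3.0);
        // Print the stored integers for the two ends of the 3V..4.2V range.
        System.out.println(volts.encode(3.0));
        System.out.println(volts.encode(4.2));
        // Decode a stored integer back to a (floating-point) voltage.
        System.out.println(volts.decode(120));
    }
}
```

With these parameters the range 3V to 4.2V maps to the integers 0 to 120, so
only small integers would need to be stored and the formula is applied on
reading.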

Julian

On 01.11.19, 08:14, "Xiangdong Huang" <saint...@gmail.com> wrote:

    Hi,
    
    Let's move it to JIRA as a new feature.
    
    > something like "NaN"
    Indeed, supporting "NaN" is important for real applications.
    
    Best,
    -----------------------------------
    Xiangdong Huang
    School of Software, Tsinghua University
    
     黄向东
    清华大学 软件学院
    
    
    Julian Feinauer <j.feina...@pragmaticminds.de> wrote on Thu, Oct 31, 2019 at 8:40 PM:
    
    > Hi,
    >
    > I agree with your interpretation. It's just another layer with a
    > different interpretation.
    > So the idea would be to provide a separate API initially to experiment a
    > bit, and possibly add it to the "core" API eventually,
    > so that type resolution always checks whether the type is primitive or
    > logical.
    >
    > I mainly wanted to get your ideas and feedback about that and if you could
    > imagine use cases for that.
    > We would need something like "NaN" quite often in our use cases and I
    > would also like to use a "string" mapping for "ON/OFF" rather than
    > true/false as it makes it easier to interpret the data later on.
    >
    > Julian
    >
    > On 31.10.19, 05:39, "Xiangdong Huang" <saint...@gmail.com> wrote:
    >
    >     Hi,
    >
    >     > You can look at how avro handles non primitive types (they call it
    >     > LogicalTypes) here:
    >     > https://avro.apache.org/docs/1.8.1/spec.html#Logical+Types
    >
    >     Yes, I read some materials about LogicalTypes. It looks like a
    >     nickname for a data type, with some new interpretation. E.g., a byte
    >     array data type can be called Decimal, while the interpretation
    >     relies on how the user defines the precision and scale.
    >
    >     Using this kind of implementation is also OK, I think.
    >
    >     So, would you like to provide the interface in the IoTDB layer (so
    >     users operate on data using SQL), or on top of the TsFile layer (so
    >     users operate on data using the TsFile API)?
    >
    >     Best,
    >     -----------------------------------
    >     Xiangdong Huang
    >     School of Software, Tsinghua University
    >
    >      黄向东
    >     清华大学 软件学院
    >
    >
    >     Julian Feinauer <j.feina...@pragmaticminds.de> wrote on Wed, Oct 30, 2019 at 5:59 PM:
    >
    >     > Hi,
    >     >
    >     > in fact, in the MDF spec it is mostly not for compression (that's
    >     > a nice side effect) but rather for being able to really express
    >     > the (physical) content of a signal.
    >     > So my initial idea was to implement it as an optional layer on top
    >     > of the current TsFile which does the "interpretation", because in
    >     > the TsFile it's always just a "primitive" series that is stored.
    >     >
    >     > So the idea would be to store some metadata (like a formula, lookup
    >     > table, ...) on creation and use that on reading, but only optionally.
    >     > You can look at how avro handles non primitive types (they call it
    >     > LogicalTypes) here:
    >     > https://avro.apache.org/docs/1.8.1/spec.html#Logical+Types
    >     > This is similar to my idea.
    >     >
    >     > Julian
    >     >
    >     > On 29.10.19, 14:40, "Xiangdong Huang" <saint...@gmail.com> wrote:
    >     >
    >     >     Hi,
    >     >
    >     >     > Then it's most efficient to store integers and a formula like
    >     >     > a * x + b with e.g. b = 3 and a = 1/100.
    >     >     > So 3V would be stored as x = 0, 3.01V -> x = 1, ... 4.2V as
    >     >     > x = 120.
    >     >     > So we only store 0 to 120 and no decimals and stuff, which
    >     >     > would be very easily compressible, I think.
    >     >
    >     >     Good idea! Two thumbs up for that.
    >     >
    >     >     But for cases like the above, implementing a new encoding method
    >     >     is better than a new data type.
    >     >
    >     >     E.g., create time series root.a.b.voltage with encoding =
    >     >     linear_transformation, encoding_parameter = "describe the
    >     >     function like y = a * x + b" and datatype = INT.
    >     >
    >     >     "linear_transformation" is the new encoding method.
    >     >
    >     >     Now I get two cases from the discussion: one is like Optional
    >     >     data, and the other is data that can be transformed.
    >     >     So, do we want to support the above two, or find a more general
    >     >     data type for "rich data types" (can the MDF format provide some
    >     >     inspiration)?
    >     >
    >     >     Best,
    >     >     -----------------------------------
    >     >     Xiangdong Huang
    >     >     School of Software, Tsinghua University
    >     >
    >     >      黄向东
    >     >     清华大学 软件学院
    >     >
    >     >
    >     >     Julian Feinauer <j.feina...@pragmaticminds.de> wrote on Tue, Oct 29, 2019 at 8:26 PM:
    >     >
    >     >     > Hi Xiangdong,
    >     >     >
    >     >     > to your second question:
    >     >     > The use case is the other way round.
    >     >     > We know that we measure e.g. a voltage between 3V and 4.2V
    >     >     > with a precision of 0.01 or something.
    >     >     > Then it's most efficient to store integers and a formula like
    >     >     > a * x + b with e.g. b = 3 and a = 1/100.
    >     >     > So 3V would be stored as x = 0, 3.01V -> x = 1, ... 4.2V as
    >     >     > x = 120.
    >     >     > So we only store 0 to 120 and no decimals and stuff, which
    >     >     > would be very easily compressible, I think.
    >     >     >
    >     >     > Julian
    >     >     >
    >     >     > On 29.10.19, 07:13, "Xiangdong Huang" <saint...@gmail.com> wrote:
    >     >     >
    >     >     >     Hi,
    >     >     >
    >     >     >     > In Java we could model it as a variable Optional<> x
    >     >     >     > which could be null, Optional.empty(),
    >     >     >     > Optional.of(true), Optional.of(false).
    >     >     >
    >     >     >     It makes sense. And using a new data type to achieve it
    >     >     >     in IoTDB is OK.
    >     >     >
    >     >     >     > Or scale formulas like a*x+b which allows to leverage
    >     >     >     > the precision even for “small” double values or even
    >     >     >     > integers.
    >     >     >
    >     >     >     So, are you considering a use case like: the time series
    >     >     >     values should be [1, 1, 0, 0, 1, 1, 1, 0, 0...] but
    >     >     >     actually we get [0.99, 0.99, 0.01, 0, 1, 1, 0.999, 0,
    >     >     >     0.01] (because of the precision of sensors)?
    >     >     >     And, what values do you want to save?
    >     >     >     (1) Save them as 1 and 0. Or,
    >     >     >     (2) save them as 0.99, 0.01 indeed, but use a specific
    >     >     >     query API to return data like 1 and 0?
    >     >     >
    >     >     >     My other question is: is there a general data type that
    >     >     >     can support the above cases?
    >     >     >
    >     >     >     Best,
    >     >     >     -----------------------------------
    >     >     >     Xiangdong Huang
    >     >     >     School of Software, Tsinghua University
    >     >     >
    >     >     >      黄向东
    >     >     >     清华大学 软件学院
    >     >     >
    >     >     >
    >     >     >     Julian Feinauer <j.feina...@pragmaticminds.de> wrote on Tue, Oct 29, 2019 at 3:58 AM:
    >     >     >
    >     >     >     > Hi all,
    >     >     >     >
    >     >     >     > I wanted to discuss a possible new feature I will call
    >     >     >     > Rich Datatypes (RDT) API in the following.
    >     >     >     > I worked a lot in the automotive industry and there is
    >     >     >     > a broadly adopted open standard called ASAM MDF
    >     >     >     > (https://www.asam.net/standards/detail/mdf/).
    >     >     >     > It is a format which is targeted at efficient storage
    >     >     >     > but at the same time it supports VERY complex types
    >     >     >     > (which are often used in automotive controllers).
    >     >     >     >
    >     >     >     > Take something as simple as a boolean. We could store
    >     >     >     > it as a boolean (as a Java boolean) in 1 bit.
    >     >     >     > BUT we have overall 4 possibilities:
    >     >     >     >
    >     >     >     >   *   No value is available for a timestamp (NULL /
    >     >     >     >       nothing stored)
    >     >     >     >   *   We had a successful request but the controller
    >     >     >     >       does not know whether true or false (or had an
    >     >     >     >       internal error), this is a bit like
    >     >     >     >       Optional.isPresent() == false
    >     >     >     >   *   True
    >     >     >     >   *   False
    >     >     >     > In Java we could model it as a variable Optional<> x
    >     >     >     > which could be null, Optional.empty(),
    >     >     >     > Optional.of(true), Optional.of(false).
    >     >     >     >
    >     >     >     > Other examples are discrete values like “ON”, “OFF”
    >     >     >     > (which are handled internally as “lookup tables” on
    >     >     >     > integer rows).
    >     >     >     > Or scale formulas like a*x+b, which allow leveraging
    >     >     >     > the precision even for “small” double values or even
    >     >     >     > integers.
    >     >     >     > Or a formula combined with a “fallback” lookup value
    >     >     >     > like “NV”.
    >     >     >     >
    >     >     >     > I think this could be a valuable extension to IoTDB as
    >     >     >     > an additional API (not changing anything below but just
    >     >     >     > providing an API on top to do the calculation).
    >     >     >     >
    >     >     >     > What do others think?
    >     >     >     >
    >     >     >     > Julian
    >     >     >     >
    >     >     >
    >     >     >
    >     >     >
    >     >
    >     >
    >     >
    >
    >
    >
    
