Re: Operation and robustness of iotDB

Julian Feinauer Mon, 04 Mar 2019 04:09:35 -0800

Hey Xiangdong,

thanks for the great explanation.
And in fact, I agree with you that it would be best if we start to play around 
with it and report all our findings or wishes back to this list (in fact that 
proved to be beneficial in plc4x as well).
 
You confirm my thoughts about the two "levels" of APIs (DB and file), and the 
file API is exactly what we were looking for for our use case, as we do not 
care much about data loss (when an edge device fails, it's... gone).
The crucial point for us is that no corrupt files can be generated. 
This means I'm fine when the last data submitted is lost, but I'm not fine if we 
can get into a situation where the last data file is completely lost (well, 
perhaps this could be acceptable).


@Tim: Perhaps it's best if you give Xiangdong some more information about 
our idea, and we can also point to our current code on GitHub.

Julian

On 04.03.19, 13:03, "Xiangdong Huang" <[email protected]> wrote:

    Hi,
    
    TsFile API is not deprecated. In fact, it is designed for this scenario and
    MapReduce/Spark computing.
    
    If you just use Reader and Writer API, there is something you need to know:
    
    Let's suppose your block size is x Bytes, (tsfile-format.properties:
    group_size_in_byte).
    
    1. If you write data and a shutdown occurs, then all data that has been
    flushed to disk is ok, and you can read it (the class
    org.apache.iotdb.tsfile.TsFileSequenceRead is an example, but you need to
    change it a little. I think I can write an example.)
    
    2. Actually, TsFile is able to let you continue writing data at the end
    of an incomplete file. However, we do not provide this API now...
    If needed, I can add the API.
    
    3. In this scenario, you will lose at most x bytes of data. If you cannot
    accept that, something like a WAL is needed. (It is not very complex, but
    I am not sure whether it should be an embedded function of TsFile.)
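
As a rough illustration of what "something like WAL" in point 3 could mean on
the application side, here is a minimal Java sketch of an append-only log that
is synced to disk after every record. SimpleWal, its record format (a 4-byte
length prefix followed by a UTF-8 payload) and the record contents are
assumptions made for this example. None of it is part of the TsFile or IoTDB
API, and IoTDB's own WAL may work quite differently.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical append-only log; NOT part of TsFile or IoTDB. */
final class SimpleWal implements AutoCloseable {
    private final FileChannel channel;

    SimpleWal(Path file) throws IOException {
        // APPEND makes every write land at the current end of the file.
        channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    /** Appends one length-prefixed record and forces it to disk before returning. */
    void append(String record) throws IOException {
        byte[] payload = record.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + payload.length);
        buf.putInt(payload.length);
        buf.put(payload);
        buf.flip();
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        channel.force(true); // ask the OS to flush data and metadata
    }

    /** Reads back all complete records; a torn record at the tail is ignored. */
    static List<String> replay(Path file) throws IOException {
        List<String> records = new ArrayList<>();
        ByteBuffer data = ByteBuffer.wrap(Files.readAllBytes(file));
        while (data.remaining() >= Integer.BYTES) {
            int length = data.getInt();
            if (length < 0 || length > data.remaining()) {
                break; // incomplete tail, e.g. power was cut mid-write
            }
            byte[] payload = new byte[length];
            data.get(payload);
            records.add(new String(payload, StandardCharsets.UTF_8));
        }
        return records;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

After a hard power-off, SimpleWal.replay(path) returns every record that was
written completely, and the application could write those into a fresh TsFile.
Syncing on every record is slow on SD cards, so batching several records per
force call would trade a small loss window for throughput.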
    
    Up to now, we can consider that TsFile API is suitable for your scenario
    (even though we need to add a little more API if you desire). And you can
    get the ability to compress data, and query data from the TsFile rather
    than scan the data from the head to the tail.
    
    However, TsFile has one constraint: you cannot write out-of-order data
    into a TsFile, otherwise the query API may return incomplete results.
    But I think that is ok for real applications, because I do not think that
    a device can generate out-of-order data....
    
    For example, if you write two devices' data into one TsFile, it is ok if
    you write data like:
    - d1.t1, d1.t2, d2.t1, d2.t2, d2.t3, d1.t4, d1.t5 ....
    or:
    - d1.m1.t1, d1.m1.t2, d1.m2.t1, d1.m2.t2, d2.m1.t1 ...
    
    But you cannot write data like:
    - d1.m1.t2, d1.m1.t1 ...
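
To make the ordering rule from these examples concrete, here is a small Java
sketch of a guard that an application could run before handing points to a
writer. OrderGuard and its accept method are hypothetical names for this
example and not part of the TsFile API. The sketch assumes that ordering is
tracked per device and measurement (as the d1.m1 / d1.m2 example suggests) and
that a repeated timestamp should be rejected as well, which the mail does not
state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of TsFile): remembers the last timestamp per series. */
final class OrderGuard {
    private final Map<String, Long> lastTimestamp = new HashMap<>();

    /**
     * Returns true if the point is in order for its series, i.e. its timestamp
     * is strictly greater than the last accepted timestamp of the same
     * device and measurement. Rejected points do not change the state.
     */
    boolean accept(String device, String measurement, long timestamp) {
        String series = device + "." + measurement;
        Long last = lastTimestamp.get(series);
        if (last != null && timestamp <= last) {
            return false; // out of order (or duplicate, by assumption)
        }
        lastTimestamp.put(series, timestamp);
        return true;
    }
}
```

With this guard, the sequence d1.m1.t1, d1.m1.t2, d1.m2.t1, d2.m1.t1 is
accepted in full, while d1.m1.t1 arriving after d1.m1.t2 is rejected and can
be logged or dropped by the caller.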
    
    I think this is a good chance to improve TsFile and make it more suitable
    for real applications, so please do not hesitate to tell me more about
    what you think TsFile should have.
    
    Best,
    -----------------------------------
    Xiangdong Huang
    School of Software, Tsinghua University
    
     黄向东
    清华大学 软件学院
    
    
    On Mon, Mar 4, 2019 at 7:17 PM, Julian Feinauer <[email protected]> wrote:
    
    > Hi Xiangdong,
    >
    > thanks for the info.
    > What about the case where you use the Reader / Writer API for the
    > tsfiles directly (or should this be considered "deprecated")?
    > Can these files end up in a corrupted state?
    >
    > One case where we have to deal with this is "at the edge", when we
    > have devices inside large machines.
    > Usually at the end of the shift these machines (and therefore our devices)
    > are powered off hard, so no shutdown or de-initialization is possible.
    >
    > Best
    > Julian
    >
    > On 04.03.19, 12:14, "Xiangdong Huang" <[email protected]> wrote:
    >
    >     Hi,
    >
    >     IoTDB can run either on a server operating 24/7 or on a RaspberryPi.
    >     We have tested both scenarios.
    >
    >     When you forcibly shut down an IoTDB instance (e.g., power off) and
    >     restart it, no data is lost (if you enable the WAL).
    >
    >     However, we have not yet optimized the time cost of the restart
    >     process. It is an important feature that we need to work on, because
    >     we hope IoTDB can support data management both on edge devices and in
    >     the data center.
    >
    >     Also, the default configuration is not well suited to running on an
    >     edge device (e.g., the block size is 128MB, which is too large for a
    >     RaspberryPi and will slow down the restart process because there is
    >     too much WAL data on disk).
    >
    >     Best,
    >     -----------------------------------
    >     Xiangdong Huang
    >     School of Software, Tsinghua University
    >
    >      黄向东
    >     清华大学 软件学院
    >
    >
    >     On Mon, Mar 4, 2019 at 6:53 PM, Tim Mitsch <[email protected]> wrote:
    >
    >     > Hello development-team
    >     >
    >     > First of all, thanks for developing this interesting project and
    >     > bringing it into the Apache Incubator.
    >     >
    >     > I have a question regarding the place of operation and robustness:
    >     >
    >     >   *   Is iotDB designed as an application for a server that runs
    >     >       24/7, or
    >     >   *   is it also possible to run it on a device like a RaspberryPi
    >     >       or IPC, where operation can be interrupted?
    >     >
    >     > I'm asking because I'm looking for a temporary storage solution that
    >     > is robust against spontaneous interruption, e.g. switching off the
    >     > electricity without a regular shutdown of the OS. Have you tested
    >     > something like this yet?
    >     >
    >     > Best regards
    >     > Tim
    >     >
    >     >
    >     >
    >
    >
    >
