Hi,

In my opinion, different measurements keep their own timestamps even though they are grouped into one ChunkGroup; they do not share timestamps with each other.
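To make this concrete, here is a rough sketch in Python of how I picture it. The class names mirror the TsFile terms, but the code only illustrates my reading and is not the TsFile API; the field names and sample values are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Chunk:
    """All points of ONE measurement (hypothetical model, not TsFile code)."""
    measurement: str
    # Each point carries its own timestamp: (timestamp, value).
    points: List[Tuple[int, float]] = field(default_factory=list)

    def timestamps(self) -> List[int]:
        return [t for t, _ in self.points]


@dataclass
class ChunkGroup:
    """One Chunk per measurement of a device (hypothetical model)."""
    device: str
    chunks: Dict[str, Chunk] = field(default_factory=dict)

    def write(self, measurement: str, timestamp: int, value: float) -> None:
        # Create the chunk on first use, then append the point to it.
        chunk = self.chunks.setdefault(measurement, Chunk(measurement))
        chunk.points.append((timestamp, value))


group = ChunkGroup("d1")
group.write("m1", 1, 20.5)
group.write("m1", 2, 20.7)
group.write("m2", 2, 0.3)
group.write("m2", 5, 0.4)
```

Here m1 ends up with timestamps [1, 2] and m2 with [2, 5], although both chunks sit in the same chunk group.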
What do you think of this, @xiangdong?

Thanks
XuYi

Sent from my iPhone

On 2019/03/08 1:41, Julian Feinauer <j.feina...@pragmaticminds.de> wrote:

> Hi,
>
> Yes this is what I meant.
>
> Julian
>
> Sent from my mobile phone
>
>
> -------- Original Message --------
> Subject: Re: Operation and robustness of iotDB
> From: 徐毅
> To: dev@iotdb.apache.org
> Cc:
>
> Hi,
> In the definition of ChunkGroup, what is the meaning of 'share one time signal'? Do these measurements share the same timestamps?
>
>
> Thanks
> XuYi
> On 3/8/2019 01:11, Julian Feinauer <j.feina...@pragmaticminds.de> wrote:
> Hey Xiangdong,
> hey all,
>
> I like the documentation much.
> The only thing I'm a bit unsure about is the names (as there is no clarification).
> So, before I update it with any wrong information, I would like to ensure that I have the correct understanding.
>
> I assume that most naming is similar to Parquet.
>
> Page - Contains one Measurement, smallest source of compression
> Chunk - Collection of multiple Pages, still one measurement
> ChunkGroup - Collection of chunks which share one time signal (one Chunk for each measurement)
>
> Is this correct so?
>
> Julian
>
> On 05.03.19, 12:26, "Xiangdong Huang" <saint...@gmail.com> wrote:
>
> Hi,
>
> 1. We have a document to introduce that:
> https://cwiki.apache.org/confluence/display/IOTDB/TsFile+Format
>
> 2. The new API for recovering data is almost done. I am writing the UTs now. Maybe I can submit a PR tonight (if everything is fine...)
>
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
> 黄向东
> 清华大学 软件学院
>
>
> Julian Feinauer <j.feina...@pragmaticminds.de> wrote on Tue, Mar 5, 2019 at 6:00 PM:
>
> Hi Xiangdong,
>
> that sounds excellent.
> Do you have a short overview of how the file format is designed on disk?
> I know that it's somewhat similar to Parquet but I did not find more details.
> Basically what would suffice for us would be something like skipping an invalid column group (or however you name it) and going on with the next, or so.
>
> Julian
>
> On 04.03.19, 13:21, "Xiangdong Huang" <saint...@gmail.com> wrote:
>
> Hi,
>
> If so, I think I need to add a new API to allow you to continue writing data in an existing TsFile that was not closed correctly. Then everything is fine for you :D
>
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
> 黄向东
> 清华大学 软件学院
>
>
> Julian Feinauer <j.feina...@pragmaticminds.de> wrote on Mon, Mar 4, 2019 at 8:08 PM:
>
> Hey Xiangdong,
>
> thanks for the great explanation.
> And in fact, I agree with you that it would be best if we start to play around with it and report all our findings or wishes back to this list (in fact that proved to be beneficial in plc4x as well).
>
> You confirm my thoughts about the two "levels" of APIs (DB and file), and the file API is exactly what we looked for for our use case.
> As we do not care much about data loss (when an edge device fails it's... gone).
> The crucial point for us is that no corrupt files can be generated.
> This means I'm fine when the last data submitted is lost, but I'm not fine if we can get to a situation where the last data file is completely lost (well, perhaps this could be acceptable).
>
> @tim: Perhaps it's best if you give some more information to Xiangdong about our idea, and we can also point to our current code on GitHub.
>
> Julian
>
> On 04.03.19, 13:03, "Xiangdong Huang" <saint...@gmail.com> wrote:
>
> Hi,
>
> TsFile API is not deprecated. In fact, it is designed for this scenario and MapReduce/Spark computing.
>
> If you just use Reader and Writer API, there is something you need to know:
>
> Let's suppose your block size is x Bytes (tsfile-format.properties: group_size_in_byte).
>
> 1.
>    If you write data and a shutdown occurs, then all data that is flushed on disk is ok, and you can read the data (class org.apache.iotdb.tsfile.TsFileSequenceRead is an example, but you need to change it a little. I think I can write an example.)
>
> 2. Actually, TsFile has the ability to allow you to continue writing data at the end of the incomplete file. However, we do not provide this API now... If needed, I can add the API.
>
> 3. In this scenario, you will lose at most x Bytes of data. If you do not accept that, something like a WAL is needed. (It is not very complex, but I am not sure whether it should be an embedded function of TsFile.)
>
> Up to now, we can consider that the TsFile API is suitable for your scenario (even though we need to add a little more API if you desire). And you get the ability to compress data, and to query data from the TsFile rather than scan the data from head to tail.
>
> However, TsFile has one constraint: you cannot write out-of-order data into a TsFile, otherwise the query API may return incomplete results.
> But I think it is ok for real applications, because I do not think that a device can generate out-of-order data....
>
> For example, if you write two devices' data into one TsFile, it is ok if you write data like:
> - d1.t1, d1.t2, d2.t1, d2.t2, d2.t3, d1.t4, d1.t5 ....
> or:
> - d1.m1.t1, d1.m1.t2, d1.m2.t1, d1.m2.t2, d2.m1.t1 ...
>
> But you cannot write data like:
> - d1.m1.t2, d1.m1.t1 ...
>
> I think it is a good chance to improve TsFile to make it more suitable for real applications, so please do not hesitate to tell me more about what you think TsFile should have.
>
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
> 黄向东
> 清华大学 软件学院
>
>
> Julian Feinauer <j.feina...@pragmaticminds.de> wrote on Mon, Mar 4, 2019 at 7:17 PM:
>
> Hi Xiangdong,
>
> thanks for the info.
> How is it in the case when you use the Reader / Writer API for the tsfiles directly (or should this be considered "deprecated")?
> Can these files get into a corrupted state?
>
> One situation where we have to deal with this is "at the edge", when we have devices inside large machines.
> Usually at the end of the shift these machines (and therefore our devices) are powered off hard, so no shutdown or de-initialization is possible.
>
> Best
> Julian
>
> On 04.03.19, 12:14, "Xiangdong Huang" <saint...@gmail.com> wrote:
>
> Hi,
>
> IoTDB can run either on a server operating 7*24 or on a RaspberryPi. We have tested both scenarios.
>
> When you shut down an IoTDB instance by force (e.g., power off) and restart it again, no data is lost (if you enable the WAL).
>
> However, currently we do not optimize the time cost of the restart process. It is an important feature that we need to do, because we hope IoTDB can support data management either on the edge devices or in the data center.
>
> And, the default configuration is not so suitable for running on the edge device (e.g., block size is 128MB, which is too large for a RaspberryPi, and will slow down the restart process because there is too much WAL data on disk).
>
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
>
> 黄向东
> 清华大学 软件学院
>
>
> Tim Mitsch <t.mit...@pragmaticindustries.de> wrote on Mon, Mar 4, 2019 at 6:53 PM:
>
> Hello development-team
>
> First of all thanks for developing this kind of interesting project and bringing it into apache incubator.
>
> I have a question regarding the place of operation and robustness:
>
> * Is iotDB designed as an application on a server which is running 24/7, or
> * Is it also possible to run it on a device like RaspberryPi or IPC, where operation can be interrupted.
>
> I'm asking because I'm searching for a solution for temporary storage that is robust against spontaneous interruption, e.g. switching off electricity without a regular shutdown of the OS. Have you tested something like this yet?
>
> Best regards
> Tim
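
P.S. On the write-order rule Xiangdong describes in the quoted thread: a caller could check its own write sequence before handing data to the writer. Below is a rough sketch in Python, assuming the rule applies per (device, measurement) series, which is how I read the d1.m1 / d1.m2 example. The function `first_out_of_order` and the tuple layout are made up for illustration and are not part of the TsFile API. Equal timestamps are let through, since the thread does not say how duplicates are handled.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical point layout: (device, measurement, timestamp).
Point = Tuple[str, str, int]


def first_out_of_order(points: Iterable[Point]) -> Optional[Point]:
    """Return the first point whose timestamp goes backwards within its
    (device, measurement) series, or None if the order is acceptable."""
    last_seen: Dict[Tuple[str, str], int] = {}
    for device, measurement, timestamp in points:
        key = (device, measurement)
        previous = last_seen.get(key)
        if previous is not None and timestamp < previous:
            return (device, measurement, timestamp)
        last_seen[key] = timestamp
    return None


# Mirrors "d1.m1.t1, d1.m1.t2, d1.m2.t1, d1.m2.t2, d2.m1.t1" from the thread.
accepted = [("d1", "m1", 1), ("d1", "m1", 2), ("d1", "m2", 1),
            ("d1", "m2", 2), ("d2", "m1", 1)]

# Mirrors "d1.m1.t2, d1.m1.t1" from the thread.
rejected = [("d1", "m1", 2), ("d1", "m1", 1)]
```

With these inputs, `first_out_of_order(accepted)` returns `None` and `first_out_of_order(rejected)` returns `("d1", "m1", 1)`.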