Excellent, aligned time series is indeed an important feature of TsFile. Thanks for your extraordinary contribution; I look forward to your future efforts.
Tian Jiang

---- Replied Message ----
| From | hongzhigao<[email protected]> |
| Date | 11/6/2024 19:05 |
| To | tsfile-dev<[email protected]> |
| Subject | Follow-up: Aligned Time Series Design for TsFile-CPP |

Diagram of TsFile Aligned Time Series:
https://i.postimg.cc/PJXXdB3V/Ts-File-Aligned-TS.png

Aligned time series in TsFile introduces several key features to improve the organization and retrieval of timestamp-aligned data:

1. TimeChunk in ChunkGroup: Each ChunkGroup now includes a dedicated Chunk of Timestamp (TimeChunk) that records all timestamps corresponding to that ChunkGroup. Additionally, each Page within a ValueChunk now contains a BitMap to indicate the presence or absence of data for specific timestamps. This allows for precise alignment of data points across different series.

2. Timestamp TimeseriesIndex in ChunkIndex Area: A new index for all TimeChunks, known as the TimeseriesIndex of Timestamp, has been added in the ChunkIndex section. Its TimeseriesMetaData has an empty name field (""). This index enhances direct access to timestamp-related information.

3. Timestamp SeriesIndex in SeriesIndex Area: A corresponding index for the TimeseriesIndex of Timestamp is also added in the SeriesIndex area, stored at the leftmost node of the device-sensor tree. This provides a quick path for locating time-related data.

These additions improve timestamp alignment and retrieval efficiency in TsFile, significantly reducing storage space.

------------------ Original Message ------------------
From: "hongzhigao" <[email protected]>;
Sent: Nov 6, 2024 (Wednesday), 4:43
To: "tsfile-dev"<[email protected]>;
Subject: Re: Request for Consideration as Apache TsFile Committer

Attached is an overview diagram of the TsFileV4 file format. The link is as follows.
https://i.postimg.cc/mr1hqfJQ/Tsfile-V4-overview.png

TsFile V4 introduces the TableModel, along with additional components in the index section, namely TableIndexNode, TableSchema, and PropertiesMap.
These components enhance data organization, accessibility, and file-level configuration.

TableIndexNode: This component links a TableName to the root node of devices within the index tree structure. During data reading, the system first locates the root node by TableName and then recursively traverses down to reach the LeafMeasurement. This enables efficient access to specific measurement data within complex hierarchical structures.

TableSchema: The TableSchema associates a TableName with the metadata of sensor data. It acts as a schema reference, defining data structure and sensor-related details necessary for interpreting the stored data.

PropertiesMap: The PropertiesMap is designed for setting file-level properties, such as encryption configurations, allowing for flexible adjustments to security and other properties as needed across files. For the PropertiesMap, it should be noted that the encoding and compression settings for the time column are expected to be stored here. Currently, these configurations can be modified, but TsFile itself does not include this information, which poses some risk.

These additions in TsFile v4 aim to improve data manageability and extensibility, making TsFile more robust for handling varied data structures and configurations.

I will provide additional details on some other features, such as AlignedTimeSeries, new data types like String, Blob, TimeStamp, and Date, in follow-up emails.

------------------ Original Message ------------------
From: "dev" <[email protected]>;
Sent: Oct 15, 2024 (Tuesday), 11:45
To: "dev"<[email protected]>;
Subject: Re: Request for Consideration as Apache TsFile Committer

Hi Hongzhi,

Welcome to TsFile community and thanks for your contribution! C++ TsFile will greatly expand our end-side scenarios.
Looking forward to your emails about the detailed design of C++ TsFile V4 and other features :-)

Jialin Qiao

> From: "Tian Jiang"<[email protected]>
> Date: Tue, Oct 15, 2024, 10:17
> Subject: Re: Request for Consideration as Apache TsFile Committer
> To: "[email protected]"<[email protected]>
>
> Hi Hongzhi,
>
> Thanks for your active contributions for the Apache TsFile community, which are very helpful improving this project.
>
> I personally think your work is quite enough regarding the standard of becoming a committer, and the Apache TsFile community does need some new committers. We may soon start the nomination and voting process.
>
> Looking forward to seeing your future activities in this project.
>
> Best,
> Tian Jiang
>
> ---- Replied Message ----
> | From | hongzhigao<[email protected]> |
> | Date | 10/14/2024 15:29 |
> | To | tsfile-dev<[email protected]> |
> | Subject | Request for Consideration as Apache TsFile Committer |
>
> Hi all,
>
> I would like to express my interest in becoming a committer for the Apache TsFile project. I have been actively contributing to the project, and I am eager to take on more responsibility within the community.
>
> A brief introduction: I hold a master's degree from Zhejiang University and have experience in storage system development from my time at Tencent Cloud Storage R&D Center. I hope to apply this experience to further improve and enhance TsFile.
>
> Here is a summary of my recent contributions to the C++ version of TsFile:
> Test Coverage Completion: I completed the test suite for TsFile, covering essential components such as core data structures, encoding/decoding mechanisms, and the data writing process. (feature/unittests)
> Bug Fixes and Stability Enhancements: During unit testing, I identified and resolved several code vulnerabilities, significantly improving the stability of the system.
> Consistency with Java Version: I ensured that the C++ encoding/decoding algorithms are consistent with the Java version, maintaining cross-language compatibility.
> Aligned Time Series Implementation: I also successfully implemented aligned time series functionality, a key feature for efficient data organization. (Feature/support aligned ts)
>
> Contribution Statistics:
> Pull Requests:
> [CPP]Bugfix: bitmap clear method
> Fix SimpleListNode::remove
> [CPP] fix ts2diff decoder
> fix BitPackDecoder::~BitPackDecoder()
> Fix the bitpack_codec to keep it consistent with the Java version.
> fix ZigZagCodec
> fix/plain_decoder
> [CPP] fix value.h
> feature/unittests
> [CPP] Fix syntax and logic errors in files under the 'filter' directory
> fix get_cur_timestamp
> [CPP] Support alternative gtest url
> Feature/support aligned ts
>
> Lines of Code Added: 9,185
> Lines of Code Deleted: 537
> Bugs Fixed: 10
> Tests Added: 303
>
> Looking forward, I plan to continue improving the stability of the project, implement new features such as support for additional data types and compression algorithms, and contribute towards supporting TsFile v4. I also intend to explore the IoTDB storage engine to deepen my understanding of how TsFile interacts with it.
>
> Thank you for your consideration, and I am excited to continue contributing to the success of Apache TsFile.
>
> Best regards,
> Hongzhi Gao
