[Background]
Currently, one segment metadata entry takes about 200 bytes in the tablestatus file. If a table has 1 million segments with a mean segment size of 1GB (that is, a table size of 1PB), the tablestatus file will reach about 200MB.
Any read or write operation on this tablestatus file will be costly and perform poorly. Under concurrency, reads can easily fail on a tablestatus file that is being modified, and writers can time out waiting for the lock.

[Motivation & Goal]
To let Carbon support big tables larger than 1PB, we should reduce the tablestatus size to improve read/write performance. It would also be better to separate reads and writes into different tablestatus files, so that no reader opens a tablestatus file that is being modified.

[Modification]
There are three solutions, as follows.

solution 1: compress the tablestatus file
1) use gzip to compress the tablestatus file (200MB -> 20MB)
2) keep all existing lock mechanisms
3) support backward compatibility
   Read: if the gzip magic number (0x1F8B) is present, uncompress the tablestatus file first.
   Write: compress the tablestatus file directly.

solution 2: based on solution 1, separate reading and writing into different tablestatus files
1) new tablestatus format
   {
     "statusFileName":"status-uuid1",
     "updateStatusFileName":"updatestatus-timestamp1",
     "historyStatusFileName":"status.history",
     "segmentMaxId":"1000"
   }
   Keep this file small at all times and reload it for each operation.
2) add a Metadata/tracelog folder
   It stores the files status-uuid1, updatestatus-timestamp1, and status.history.
3) use gzip to compress the status-uuid1 file
4) support backward compatibility
   Read: if the file starts with "[{", go to the old reading flow; if it starts with "{", go to the new flow.
   Write: generate a new status-uuid1 file and a new updatestatus file, and store their names in the tablestatus file.
5) clean stale files
   If stale files were created more than 1 hour ago (the query timeout), we can remove them. Loading/compaction/cleanfile operations can trigger this action.
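
The read-side compatibility checks in solutions 1 and 2 can be sketched roughly as below. This is only an illustration in Python, not Carbon code: the function names decode_status_bytes and classify_tablestatus and the "segment" field in the sample are hypothetical, and the sketch accepts any content starting with "[" as the old format (slightly looser than the "[{" check above) so that an empty segment list still parses.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (0x1F8B)


def decode_status_bytes(raw: bytes) -> str:
    """Return the JSON text, uncompressing first if the gzip magic number is present."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")


def classify_tablestatus(raw: bytes):
    """Decide which reading flow applies (hypothetical helper).

    Returns ("legacy", list_of_segments) for the old array format, or
    ("pointer", dict_of_file_names) for the new small tablestatus file.
    """
    text = decode_status_bytes(raw).lstrip()
    if text.startswith("["):
        # Old flow: the file itself holds the segment entries.
        return "legacy", json.loads(text)
    if text.startswith("{"):
        # New flow: the file only names the real status files.
        return "pointer", json.loads(text)
    raise ValueError("unrecognized tablestatus content")


if __name__ == "__main__":
    # Old-format content, gzip-compressed as in solution 1 (sample field is made up).
    old_file = gzip.compress(json.dumps([{"segment": "0"}]).encode("utf-8"))
    print(classify_tablestatus(old_file)[0])

    # New-format content from solution 2, stored uncompressed because it stays small.
    new_file = b'{"statusFileName":"status-uuid1","segmentMaxId":"1000"}'
    kind, pointer = classify_tablestatus(new_file)
    print(kind, pointer["statusFileName"])
```

Checking the magic number before looking at the first JSON character means the same reader handles uncompressed old files, compressed old files, and the new format, which is what backward compatibility needs here.
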

solution 3: based on solution 2, support a tablestatus delta
1) new tablestatus file format
   {
     "statusFileName":"status-uuid1",
     "deltaStatusFileName":"status-uuid2.delta",
     "updateStatusFileName":"updatestatus-timestamp1",
     "historyStatusFileName":"status.history",
     "segmentMaxId":"1000"
   }
2) the tablestatus delta stores the recent modifications
   Write: if the status file reaches 10MB, start writing to a delta file. If the delta file reaches 1MB, merge the delta into the status file and set deltaStatusFileName to null.
   Read: if deltaStatusFileName is not null in the new tablestatus file, read the delta status and combine the status file with it.

Please vote on these solutions.

-----
Best Regards
David Cai

--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/