[Discussion] Improve the reading/writing performance on the big tablestatus file

David CaiQiang Tue, 01 Sep 2020 19:51:58 -0700

[Background]
Currently, one segment metadata entry in the tablestatus file takes about
200 bytes. If a table has 1 million segments with a mean segment size of
1GB (that is, a 1PB table), the tablestatus file will reach about 200MB.
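
For reference, the estimate above works out as follows. This is only a back-of-the-envelope sketch; it assumes decimal units (1GB = 10^9 bytes) and the constant names are just for illustration.

```python
# Rough size estimate for the tablestatus file (decimal units assumed).
ENTRY_BYTES = 200          # approximate size of one segment metadata entry
SEGMENTS = 1_000_000       # number of segments in the table
SEGMENT_BYTES = 10**9      # mean segment size: 1GB

tablestatus_mb = ENTRY_BYTES * SEGMENTS / 10**6   # tablestatus size in MB
table_pb = SEGMENTS * SEGMENT_BYTES / 10**15      # total table size in PB

print(f"tablestatus: {tablestatus_mb:.0f} MB, table: {table_pb:.0f} PB")
```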


Any read or write operation on a tablestatus file of this size is costly
and performs poorly.

In a concurrent scenario, readers can easily fail when they read a
tablestatus file that is being modified, and writers can time out while
waiting for the lock.

[Motivation & Goal]
Carbon supports big tables larger than 1PB, so we should reduce the
tablestatus size to improve the performance of read/write operations.
It would also be better to separate reading and writing into different
tablestatus files, to avoid reading a tablestatus file that is being
modified.

[Modification]
There are three solutions, as follows.

solution 1: compress tablestatus file
  1) use gzip to compress the tablestatus file (200MB -> 20MB)
  2) keep the existing lock mechanism
  3) support backward compatibility
    Read: if the gzip magic number (0x1F8B) is present, decompress the
tablestatus file first; otherwise read it as before.
    Write: always compress the tablestatus file.
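
The read/write rule of solution 1 could look roughly like the sketch below. It is written in Python only to keep it short; it is not Carbon's actual code. The helper names (read_tablestatus, write_tablestatus) are hypothetical, and locking is left out because solution 1 keeps the existing mechanism.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def write_tablestatus(path, segments):
    """Hypothetical helper: always write the compressed form."""
    payload = json.dumps(segments).encode("utf-8")
    with open(path, "wb") as f:
        f.write(gzip.compress(payload))


def read_tablestatus(path):
    """Hypothetical helper: read a legacy plain-JSON or a gzip file."""
    with open(path, "rb") as f:
        raw = f.read()
    if raw[:2] == GZIP_MAGIC:
        # New format: decompress first.
        raw = gzip.decompress(raw)
    # Old format falls through and is parsed as plain JSON.
    return json.loads(raw.decode("utf-8"))
```

A plain JSON tablestatus starts with "[", so it can never be mistaken for a gzip stream, which is why the two-byte check is enough for backward compatibility.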

solution 2: based on solution 1, separate reading and writing into
different tablestatus files.
  1) new tablestatus format
    {
     "statusFileName":"status-uuid1",
     "updateStatusFileName":"updatestatus-timestamp1",
     "historyStatusFileName":"status.history",
     "segmentMaxId":"1000"
    }
    keep this file small at all times; reload it for each operation

  2) add Metadata/tracelog folder
   store files: status-uuid1, updatestatus-timestamp1, status.history

  3) use gzip to compress status-uuid1 file

  4) support backward compatibility
    Read: if the file starts with "[{", go to the old reading flow; if it
starts with "{", go to the new flow.
    Write: generate a new status-uuid1 file and a new updatestatus file,
and store their names in the tablestatus file
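
A minimal sketch of the read/write flow in solution 2, again in Python for brevity and not taken from Carbon. All function names are hypothetical. It leaves out gzip, the update status file and the history file, checks only for a leading "[" (which also covers an empty legacy list), and assumes a local file system where os.replace is atomic; a distributed file system may need a different rename primitive.

```python
import json
import os
import uuid


def detect_format(text):
    """Return 'legacy' for the old segment list, 'pointer' for the new file."""
    stripped = text.lstrip()
    if stripped.startswith("["):
        return "legacy"
    if stripped.startswith("{"):
        return "pointer"
    raise ValueError("unrecognized tablestatus content")


def write_new_status(metadata_dir, segments, segment_max_id):
    """Write a fresh status-<uuid> file, then repoint tablestatus at it."""
    tracelog = os.path.join(metadata_dir, "tracelog")
    os.makedirs(tracelog, exist_ok=True)
    status_name = "status-" + uuid.uuid4().hex
    with open(os.path.join(tracelog, status_name), "w", encoding="utf-8") as f:
        json.dump(segments, f)
    pointer = {"statusFileName": status_name,
               "segmentMaxId": str(segment_max_id)}
    tmp = os.path.join(metadata_dir, "tablestatus.tmp")
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(pointer, f)
    # Readers see either the old pointer or the new one, never a partial file.
    os.replace(tmp, os.path.join(metadata_dir, "tablestatus"))
    return status_name


def read_segments(metadata_dir):
    """Read segments through either the old or the new flow."""
    with open(os.path.join(metadata_dir, "tablestatus"), encoding="utf-8") as f:
        text = f.read()
    if detect_format(text) == "legacy":
        return json.loads(text)
    pointer = json.loads(text)
    path = os.path.join(metadata_dir, "tracelog", pointer["statusFileName"])
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Because a writer never modifies an existing status-uuid file, a reader that has already resolved the pointer keeps reading a complete, unchanged file.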
  
  5) clean stale files
     if a stale file was created more than 1 hour ago (the query timeout),
we can remove it. loading/compaction/cleanfile can trigger this action.
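
The cleanup rule could be sketched like this. It is only an illustration under some assumptions: clean_stale_files and live_names are hypothetical names, the file modification time stands in for the creation time, and the caller passes the set of file names still referenced by the tablestatus file.

```python
import os
import time

STALE_AFTER_SECONDS = 3600  # 1 hour, matching the query timeout


def clean_stale_files(tracelog_dir, live_names, now=None):
    """Remove unreferenced files older than the timeout; return their names."""
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(tracelog_dir)):
        path = os.path.join(tracelog_dir, name)
        # Never touch files that the tablestatus file still points to.
        if name in live_names or not os.path.isfile(path):
            continue
        if now - os.path.getmtime(path) > STALE_AFTER_SECONDS:
            os.remove(path)
            removed.append(name)
    return removed
```

Waiting for the query timeout before deleting means a query that started with an older pointer can still finish reading its status file.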

solution 3: Based on solution 2, support tablestatus delta
  1) new tablestatus file format
    {
     "statusFileName":"status-uuid1",
     "deltaStatusFileName": "status-uuid2.delta",
     "updateStatusFileName":"updatestatus-timestamp1",
     "historyStatusFileName":"status.history",
     "segmentMaxId":"1000"
    }
  2) the tablestatus delta file stores the recent modifications

    Write: once the status file reaches 10MB, start writing to a delta
file. When the delta file reaches 1MB, merge the delta into the status file
and set deltaStatusFileName to null.

    Read: if deltaStatusFileName is not null in the new tablestatus file,
read the delta status as well and combine the status file with it.
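
The delta rule of solution 3 could be sketched in memory as below. This is an assumption-heavy illustration, not a proposed implementation: segments are modeled as a dict keyed by segment id, sizes are measured on the uncompressed JSON encoding, a delta value of None stands for a null deltaStatusFileName, and write_changes / read_view are hypothetical names.

```python
import json


def encoded_size(entries):
    """Size in bytes of the JSON encoding (uncompressed, for simplicity)."""
    return len(json.dumps(entries).encode("utf-8"))


def read_view(status, delta):
    """Combine status with delta; delta entries win for the same segment id."""
    view = dict(status)
    if delta is not None:
        view.update(delta)
    return view


def write_changes(status, delta, changes,
                  status_limit=10 * 1024 * 1024, delta_limit=1024 * 1024):
    """Apply changes and return the new (status, delta) pair."""
    if delta is None and encoded_size(status) < status_limit:
        # Small status file: rewrite it directly, no delta needed.
        return read_view(status, changes), None
    new_delta = dict(delta or {})
    new_delta.update(changes)
    if encoded_size(new_delta) >= delta_limit:
        # Delta grew too large: merge it back and reset the delta.
        return read_view(status, new_delta), None
    return status, new_delta
```

With this rule most writes touch only the small delta, and the large status file is rewritten once per roughly 1MB of accumulated changes.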

Please vote on these solutions.



-----
Best Regards
David Cai
--
Sent from: 
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
