[Background]
Currently one segment metadata entry takes about 200 bytes in the
tablestatus file. If a table has 1 million segments with a mean segment size
of 1GB (a table size of 1PB), the tablestatus file will reach 200MB.
Any read or write of a file this large is costly and performs poorly.
Under concurrency, readers can easily fail when they read a tablestatus file
that is being modified, and writers can time out waiting for the lock.
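A quick check of the figures above, assuming decimal units (1GB = 10^9 bytes) and the approximate 200-byte entry size quoted; the constant names are only illustrative:

```python
ENTRY_BYTES = 200          # approximate size of one segment entry (from the text)
SEGMENTS = 1_000_000       # number of segments in the example table
SEGMENT_BYTES = 10**9      # mean segment size of 1GB, decimal units assumed

status_file_mb = ENTRY_BYTES * SEGMENTS / 10**6
table_size_pb = SEGMENTS * SEGMENT_BYTES / 10**15

print(status_file_mb)  # 200.0
print(table_size_pb)   # 1.0
```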
[Motivation & Goal]
Carbon supports big tables larger than 1PB, so we should reduce the
tablestatus size to improve the performance of read/write operations.
It would also be better to separate reading and writing into different
tablestatus files, so that a reader never reads a file that is being modified.
[Modification]
There are three candidate solutions, as follows.
solution 1: compress the tablestatus file
1) use gzip to compress the tablestatus file (200MB -> 20MB)
2) keep the existing lock mechanism unchanged
3) support backward compatibility
Read: if the gzip magic number (0x1F8B) is present, decompress the
tablestatus file first
Write: always compress the tablestatus file.
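The read/write rule of solution 1 could look roughly like the sketch below. It is only an illustration in Python, not Carbon's actual code: the function names and the `segmentId`/`status` fields are hypothetical, and the lock handling is left out.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream

def encode_status(entries):
    """Write path: serialize the segment entries and always gzip them."""
    return gzip.compress(json.dumps(entries).encode("utf-8"))

def decode_status(raw):
    """Read path: gunzip only when the gzip magic number is present."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))

# Hypothetical entry layout, for illustration only.
entries = [{"segmentId": "0", "status": "Success"}]
old_file = json.dumps(entries).encode("utf-8")   # uncompressed, old format
new_file = encode_status(entries)                # compressed, new format
print(decode_status(old_file) == decode_status(new_file))  # True
```

An uncompressed tablestatus file is JSON text and starts with "[" (0x5B), so it cannot be mistaken for a gzip stream.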
solution 2: Based on solution 1, separate reading and writing into
different tablestatus files.
1) new tablestatus format
{
"statusFileName":"status-uuid1",
"updateStatusFileName":"updatestatus-timestamp1",
"historyStatusFileName":"status.history",
"segmentMaxId":"1000"
}
keep this file small at all times and reload it for each operation
2) add a Metadata/tracelog folder
it stores the files: status-uuid1, updatestatus-timestamp1, status.history
3) use gzip to compress the status-uuid1 file
4) support backward compatibility
Read: if the file starts with "[{", use the old reading flow; if it starts
with "{", use the new flow.
Write: generate a new status-uuid1 file and updatestatus file, and store
their names in the tablestatus file
5) clean stale files
stale files that were created more than 1 hour ago (the query timeout) can
be removed. Loading, compaction, and clean files can all trigger this cleanup.
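The read, write, and cleanup steps of solution 2 can be sketched as below. This is a minimal illustration under several assumptions, not the proposed implementation: it runs on a local filesystem (where `os.replace` is atomic), omits the lock mechanism and the updatestatus/history files, uses the file modification time as a stand-in for creation time, and all function names are hypothetical.

```python
import gzip
import json
import os
import time
import uuid

STALE_AFTER_SECONDS = 60 * 60  # 1 hour, the query timeout quoted above

def read_segments(metadata_dir):
    """Return segment entries, handling both the old and the new layout."""
    with open(os.path.join(metadata_dir, "tablestatus"), "rb") as f:
        text = f.read().decode("utf-8").lstrip()
    if text.startswith("[{"):
        return json.loads(text)            # old flow: entries stored inline
    pointer = json.loads(text)             # new flow: small pointer file
    path = os.path.join(metadata_dir, "tracelog", pointer["statusFileName"])
    with open(path, "rb") as f:
        return json.loads(gzip.decompress(f.read()).decode("utf-8"))

def write_segments(metadata_dir, entries, segment_max_id):
    """Write a fresh gzip status file, then switch the pointer to it."""
    trace_dir = os.path.join(metadata_dir, "tracelog")
    os.makedirs(trace_dir, exist_ok=True)
    name = "status-" + uuid.uuid4().hex
    with open(os.path.join(trace_dir, name), "wb") as f:
        f.write(gzip.compress(json.dumps(entries).encode("utf-8")))
    pointer = {"statusFileName": name, "segmentMaxId": str(segment_max_id)}
    tmp = os.path.join(metadata_dir, "tablestatus.tmp")
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(pointer, f)
    os.replace(tmp, os.path.join(metadata_dir, "tablestatus"))
    return name

def clean_stale_files(metadata_dir, now=None):
    """Remove tracelog files the pointer no longer names and that are over an hour old."""
    now = time.time() if now is None else now
    with open(os.path.join(metadata_dir, "tablestatus"), encoding="utf-8") as f:
        pointer = json.load(f)             # assumes the new pointer format
    live = {v for k, v in pointer.items() if k.endswith("FileName")}
    trace_dir = os.path.join(metadata_dir, "tracelog")
    removed = []
    for name in sorted(os.listdir(trace_dir)):
        path = os.path.join(trace_dir, name)
        if name not in live and now - os.path.getmtime(path) > STALE_AFTER_SECONDS:
            os.remove(path)
            removed.append(name)
    return removed
```

Because a writer never touches the status file that the current pointer names, a reader that has already loaded the pointer keeps reading a complete file; the one-hour delay before cleanup is what protects such in-flight readers.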
solution 3: Based on solution 2, support tablestatus delta
1) new tablestatus file format
{
"statusFileName":"status-uuid1",
"deltaStatusFileName": "status-uuid2.delta",
"updateStatusFileName":"updatestatus-timestamp1",
"historyStatusFileName":"status.history",
"segmentMaxId":"1000"
}
2) the tablestatus delta stores the recent modifications
Write: once the status file reaches 10MB, start writing to a delta file. Once
the delta file reaches 1MB, merge the delta into the status file and set
deltaStatusFileName to null.
Read: if deltaStatusFileName is not null in the new tablestatus file,
read the delta status and combine the status file with it.
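The delta rules of solution 3 might be sketched as follows. The proposal does not say whether the thresholds are checked before or after a write, whether they apply to compressed sizes, or how entries are keyed, so this sketch assumes binary megabytes, a check before each write, and a hypothetical `segmentId` key; the function names are illustrative only.

```python
STATUS_LIMIT = 10 * 1024 * 1024  # start writing deltas once the status file reaches 10MB
DELTA_LIMIT = 1 * 1024 * 1024    # fold the delta back in once it reaches 1MB

def choose_write_action(status_bytes, delta_bytes):
    """Decide where the next modification goes, given the current file sizes."""
    if status_bytes < STATUS_LIMIT:
        return "rewrite_status"
    if delta_bytes >= DELTA_LIMIT:
        return "merge_delta"     # merge, then set deltaStatusFileName to null
    return "append_delta"

def combine(status_entries, delta_entries):
    """Read path: overlay delta entries on the status entries by segment id."""
    merged = {e["segmentId"]: e for e in status_entries}
    for e in delta_entries:
        merged[e["segmentId"]] = e   # the delta holds the more recent state
    return list(merged.values())

status = [{"segmentId": "0", "status": "Success"},
          {"segmentId": "1", "status": "Success"}]
delta = [{"segmentId": "1", "status": "Compacted"},
         {"segmentId": "2", "status": "Success"}]
print(combine(status, delta))
```

With these rules a reader loads at most one large status file plus a delta of about 1MB, and most writes touch only the small delta.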
Please vote on these solutions.
-
Best Regards
David Cai