Re: [Discussion] Improve the reading/writing performance on the big tablestatus file

2020-09-01 Thread David CaiQiang
Add solution 4: separate the status file by segment status.

*solution 4:*   Based on solution 2, support status.inprogress

  1) new tablestatus file format
{
 "statusFileName":"status-uuid1",
 "inProgressStatusFileName": "status-uuid2.inprogess",
 "updateStatusFileName":"updatestatus-timestamp1",
 "historyStatusFileName":"status.history",
 "segmentMaxId":"1000"
}

  2) the status.inprogress file stores the in-progress segment metadata

Write: at the beginning of loading/compaction, add the in-progress
segment metadata into status-uuid2.inprogress; at the end, move it to
status-uuid1.

Read: queries read status-uuid1 only; other cases read
status-uuid2.inprogress if needed.
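The write/read flow above can be sketched with in-memory dicts standing in for the two status files (the helper names and status values here are illustrative, not the actual CarbonData Java API):

```python
# status-uuid1: committed segments, the only file queries read
status = {}
# status-uuid2.inprogress: segments currently being loaded/compacted
in_progress = {}

def begin_load(segment_id, meta):
    # at the beginning of loading/compaction, record the segment
    # in the in-progress status file
    in_progress[segment_id] = dict(meta, status="INSERT_IN_PROGRESS")

def finish_load(segment_id):
    # at the end, move the entry to the main status file, so queries
    # (which read only status-uuid1) never see in-progress segments
    meta = in_progress.pop(segment_id)
    meta["status"] = "SUCCESS"
    status[segment_id] = meta
```

Because queries never open the in-progress file, a load that is still running cannot interfere with concurrent reads of the main status file.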



-
Best Regards
David Cai
--
Sent from: 
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/


Re: [Discussion] Improve the reading/writing performance on the big tablestatus file

2020-09-01 Thread Zhangshunyu
solution2,  +1



-
My English name is Sunday


[Discussion] Improve the reading/writing performance on the big tablestatus file

2020-09-01 Thread David CaiQiang
[Background]
Currently, one segment metadata entry is about 200 bytes in the
tablestatus file. If a table has 1 million segments with a mean segment
size of 1 GB (i.e., a 1 PB table), the tablestatus file will reach
200 MB.

Any reading/writing operation on this tablestatus file will be costly
and perform badly.

In a concurrent scenario, it can easily cause read failures on a
tablestatus file that is being modified, as well as write-lock wait
timeouts.

[Motivation & Goal]
Carbon supports big tables larger than 1 PB, so we should reduce the
tablestatus size to improve the performance of reading/writing
operations. It would also be better to separate reading and writing
into different tablestatus files, to avoid reading a tablestatus file
that is being modified.

[Modification]
There are three solutions, as follows.

solution 1: compress the tablestatus file
  1) use gzip to compress the tablestatus file (200 MB -> 20 MB)
  2) keep all of the previous lock mechanism
  3) support backward compatibility
Read: if the gzip magic number (0x1F8B) is present, uncompress the
tablestatus file first.
Write: compress the tablestatus file directly.
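A minimal sketch of this compatibility check, with Python standing in for the actual Java implementation (the helper names are illustrative):

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the 0x1F8B magic number mentioned above

def write_tablestatus(content: bytes) -> bytes:
    # new writes always compress the tablestatus content directly
    return gzip.compress(content)

def read_tablestatus(raw: bytes) -> bytes:
    # backward-compatible read: old uncompressed files lack the gzip
    # magic number and are returned as-is; new files are decompressed
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw
```

Since every gzip stream begins with the two magic bytes, old readers fail fast on new files while new readers handle both formats transparently.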

solution 2: Based on solution 1, separate reading and writing into
different tablestatus files.
  1) new tablestatus format
{
 "statusFileName":"status-uuid1",
 "updateStatusFileName":"updatestatus-timestamp1",
 "historyStatusFileName":"status.history",
 "segmentMaxId":"1000"
}
Keep this file small always, and reload it for each operation.

  2) add a Metadata/tracelog folder
   to store the files: status-uuid1, updatestatus-timestamp1, status.history

  3) use gzip to compress status-uuid1 file

  4) support backward compatibility
Read: if the file starts with "[{", go to the old reading flow; if it
starts with "{", go to the new flow.
Write: generate a new status-uuid1 file and updatestatus file, and
store their names in the tablestatus file.
  
  5) clean stale files
 if stale files were created more than 1 hour ago (the query
timeout), we can remove them; loading/compaction/clean-file operations
can trigger this action.
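The format detection in step 4 could look like this sketch (the field names come from the format above; the function name is hypothetical):

```python
import json

def load_tablestatus(text: str):
    # the old format is a JSON array of segment entries, so it starts
    # with "[{"; the new format is a small pointer object starting
    # with "{"
    stripped = text.lstrip()
    if stripped.startswith("[{"):
        return "old", json.loads(stripped)  # full segment list
    pointer = json.loads(stripped)          # small pointer file
    # the caller then reads the file named by pointer["statusFileName"]
    return "new", pointer
```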

solution 3: Based on solution 2, support tablestatus delta
  1) new tablestatus file format
{
 "statusFileName":"status-uuid1",
 "deltaStatusFileName": "status-uuid2.delta",
 "updateStatusFileName":"updatestatus-timestamp1",
 "historyStatusFileName":"status.history",
 "segmentMaxId":"1000"
}
  2) the tablestatus delta stores the recent modifications

Write: once the status file reaches 10 MB, start writing a delta
file; once the delta file reaches 1 MB, merge the delta into the
status file and set deltaStatusFileName to null.

Read: if deltaStatusFileName is not null in the new tablestatus
file, read the delta status and combine the status file with it.
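The read path could be sketched as a map merge keyed by segment id, where recent delta entries override the base entries; the thresholds echo the write rule above (all names are illustrative, not CarbonData code):

```python
STATUS_DELTA_THRESHOLD = 10 * 1024 * 1024  # 10 MB: start writing a delta
DELTA_MERGE_THRESHOLD = 1 * 1024 * 1024    # 1 MB: merge delta back

def combine_status(base, delta):
    # base and delta both map segmentId -> segment metadata; the
    # delta holds recent modifications, so its entries win
    merged = dict(base)
    merged.update(delta)
    return merged
```

This keeps routine writes small (append to the delta) while bounding the extra work a reader must do to at most one 1 MB delta file.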

Please vote on these solutions.



-
Best Regards
David Cai