[ 
https://issues.apache.org/jira/browse/KYLIN-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930257#comment-17930257
 ] 

Guoliang Sun commented on KYLIN-6069:
-------------------------------------

h3. Dev Design
h4. Scope

- Add `partition_range` and `job_range` to internal table metadata:  
  - `partition_range`: Records the data range that has been loaded or is 
currently being loaded.  
  - `job_range`: Represents the data range currently being loaded.  

Task Submission and Validation

- Update `partition_range` and `job_range` during task submission and perform 
validation:  
  - For full and incremental builds, only check whether the requested range 
overlaps an existing `job_range`.  
  - For refresh tasks, check for the same `job_range` overlap and also ensure 
the refresh range lies within the existing data range.  
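
The two validation rules above can be sketched as follows. This is only an illustration, not the actual Kylin implementation: the class and method names are hypothetical, ranges are assumed to be half-open `[start, end)` intervals over `long` values, and the "existing range" for refresh is assumed to be what `partition_range` records.

```java
import java.util.List;

// Hypothetical sketch; these names are illustrative and are not Kylin classes.
final class RangeValidator {

    /** Half-open range [start, end), e.g. epoch millis of a time partition (assumption). */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            if (start >= end) {
                throw new IllegalArgumentException("start must be < end");
            }
            this.start = start;
            this.end = end;
        }

        /** True if the two ranges share at least one point. */
        boolean overlaps(Range other) {
            return start < other.end && other.start < end;
        }

        /** True if the other range lies entirely inside this one. */
        boolean contains(Range other) {
            return start <= other.start && other.end <= end;
        }
    }

    /** Full and incremental builds: reject only if the request overlaps a running job. */
    static boolean canSubmitBuild(Range request, List<Range> jobRanges) {
        for (Range running : jobRanges) {
            if (running.overlaps(request)) {
                return false;
            }
        }
        return true;
    }

    /** Refresh: same overlap check, plus the request must sit inside a recorded range. */
    static boolean canSubmitRefresh(Range request, List<Range> partitionRanges,
                                    List<Range> jobRanges) {
        if (!canSubmitBuild(request, jobRanges)) {
            return false;
        }
        for (Range recorded : partitionRanges) {
            if (recorded.contains(request)) {
                return true;
            }
        }
        return false;
    }
}
```

With half-open intervals, a build for `[200, 300)` does not conflict with a running job on `[100, 200)`, which is why adjacent incremental loads would pass the check in this sketch.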

Time Partition and Non-Time Partition Columns

- For discard tasks: update `partition_range` and `job_range`, clearing the 
range that was being executed from `job_range`.  
- Refresh internal table metadata and recalculate `partition_range`.  
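
A minimal sketch of this bookkeeping, under stated assumptions: ranges are half-open `long[]{start, end}` pairs, a discarded job is identified by its exact range, and "recalculate" is taken to mean merging the loaded ranges with the ranges still in `job_range`. The design note does not specify the recalculation rule, so treat the merge step and all names here as hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual Kylin metadata code.
final class RangeBookkeeping {

    /** Returns jobRanges without the range [start, end) of the discarded job. */
    static List<long[]> clearJobRange(List<long[]> jobRanges, long start, long end) {
        List<long[]> remaining = new ArrayList<>();
        for (long[] r : jobRanges) {
            if (r[0] != start || r[1] != end) {
                remaining.add(r);
            }
        }
        return remaining;
    }

    /** Sorts ranges by start and coalesces those that overlap or touch. */
    static List<long[]> mergeRanges(List<long[]> ranges) {
        List<long[]> sorted = new ArrayList<>(ranges);
        sorted.sort((a, b) -> Long.compare(a[0], b[0]));
        List<long[]> merged = new ArrayList<>();
        for (long[] r : sorted) {
            if (!merged.isEmpty() && r[0] <= merged.get(merged.size() - 1)[1]) {
                // Extend the previous merged range (a private copy, safe to mutate).
                long[] last = merged.get(merged.size() - 1);
                last[1] = Math.max(last[1], r[1]);
            } else {
                merged.add(new long[] {r[0], r[1]});
            }
        }
        return merged;
    }

    /** Assumed rule: partition_range = loaded ranges merged with in-flight job ranges. */
    static List<long[]> recalculatePartitionRange(List<long[]> loaded,
                                                  List<long[]> jobRanges) {
        List<long[]> all = new ArrayList<>(loaded);
        all.addAll(jobRanges);
        return mergeRanges(all);
    }
}
```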

Internal Table Task Refactoring

- Split a dedicated metadata update stage out of internal table tasks, with 
retry on failure.  
- Workflow: Internal table loading --> Metadata update --> Load Gluten cache.  
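
The three-stage workflow could look roughly like the sketch below. It assumes that only the metadata update stage is retried and that a fixed attempt count is enough; the note does not state the retry policy, and `StagedLoadJob`, `Stage`, and `runWithRetry` are hypothetical names rather than Kylin's job framework.

```java
// Hypothetical sketch of the staged workflow; not Kylin's job framework.
final class StagedLoadJob {

    /** One unit of work in the job; may fail with any exception. */
    interface Stage {
        void run() throws Exception;
    }

    /** Runs a stage up to maxAttempts times and rethrows the last failure. */
    static void runWithRetry(Stage stage, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                stage.run();
                return; // stage succeeded, stop retrying
            } catch (Exception e) {
                lastFailure = e;
            }
        }
        throw lastFailure;
    }

    /** Internal table loading --> metadata update (with retry) --> load Gluten cache. */
    static void run(Stage loadTable, Stage updateMetadata, Stage loadCache,
                    int metadataAttempts) throws Exception {
        loadTable.run();
        runWithRetry(updateMetadata, metadataAttempts);
        loadCache.run();
    }
}
```

Keeping the metadata update as its own stage means a transient failure there can be retried without repeating the table load, which is presumably the motivation for the split.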

Frontend Changes

- Expose the **Refresh button** on the frontend.  
- Enhance the refresh API to support multi-partition refresh, allowing specific 
partitions to be passed as input.  
- Only allow refreshing selected partitions.

> [Internal Table - Incremental Load] Overlapping Time Ranges Prohibited for 
> Loading
> ----------------------------------------------------------------------------------
>
>                 Key: KYLIN-6069
>                 URL: https://issues.apache.org/jira/browse/KYLIN-6069
>             Project: Kylin
>          Issue Type: Improvement
>    Affects Versions: 5.0.0
>            Reporter: Guoliang Sun
>            Assignee: Guoliang Sun
>            Priority: Major
>             Fix For: 5.0.2
>
>
> Refactor the internal table build task logic:  
> - For build and refresh tasks, do not allow two tasks to build the same time 
> range simultaneously.  
> - For refresh tasks, do not allow refreshing data beyond the specified time 
> range.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
