[ https://issues.apache.org/jira/browse/KYLIN-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930257#comment-17930257 ]
Guoliang Sun commented on KYLIN-6069:
-------------------------------------
h3. Dev Design
h4. Scope
- Add `partition_range` and `job_range` to internal table metadata:
  - `partition_range`: Records the data range that has been loaded or is
    currently being loaded.
  - `job_range`: Represents the data range currently being loaded.
h4. Task Submission and Validation
- Update `partition_range` and `job_range` during task submission and perform
  validation:
  - For full and incremental builds, only check if `job_range` overlaps.
  - For refresh tasks, check if `job_range` overlaps and ensure the refresh
    range is within the existing range.
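
The submission checks above could be sketched roughly as follows. This is only an illustration, not Kylin's actual code: the class `InternalTableRanges`, the nested `Range` type, and the method names are hypothetical, ranges are assumed to be half-open `[start, end)` intervals over a numeric partition value, and "within the existing range" is simplified to "contained in a single recorded range".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of partition_range and job_range; not Kylin's real classes. */
final class InternalTableRanges {

    /** Half-open range [start, end), e.g. epoch milliseconds of a time partition (assumption). */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            if (start >= end) {
                throw new IllegalArgumentException("start must be less than end");
            }
            this.start = start;
            this.end = end;
        }

        boolean overlaps(Range other) {
            return start < other.end && other.start < end;
        }

        boolean contains(Range other) {
            return start <= other.start && other.end <= end;
        }
    }

    private final List<Range> partitionRange = new ArrayList<>(); // loaded or being loaded
    private final List<Range> jobRange = new ArrayList<>();       // being loaded right now

    /** Full or incremental build: only job_range is checked for overlap. */
    synchronized void submitBuild(Range requested) {
        rejectIfJobOverlaps(requested);
        jobRange.add(requested);
        partitionRange.add(requested);
    }

    /** Refresh: job_range must not overlap, and the range must already be recorded. */
    synchronized void submitRefresh(Range requested) {
        rejectIfJobOverlaps(requested);
        boolean covered = partitionRange.stream().anyMatch(r -> r.contains(requested));
        if (!covered) {
            throw new IllegalStateException("refresh range is outside the existing range");
        }
        jobRange.add(requested);
    }

    /** Called when a job ends: its range is no longer in flight. */
    synchronized void finish(Range done) {
        jobRange.removeIf(r -> r.start == done.start && r.end == done.end);
    }

    private void rejectIfJobOverlaps(Range requested) {
        for (Range running : jobRange) {
            if (running.overlaps(requested)) {
                throw new IllegalStateException("range overlaps a running job");
            }
        }
    }
}
```

In this sketch a rejected submission leaves both lists unchanged, so a caller can retry once the conflicting job has finished.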
h4. Time Partition and Non-Time Partition Columns
- For discard tasks: update `partition_range` and `job_range`, removing the
  data range of the discarded job from `job_range`.
- Refresh internal table metadata and recalculate `partition_range`.
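
One way to picture the discard handling is the sketch below. It rests on assumptions that the comment does not spell out: ranges are half-open numeric intervals, `loaded` stands for the ranges found in the table after its metadata has been refreshed, and `partition_range` is recalculated as the merged union of `loaded` and the jobs still in flight. The names `DiscardSketch`, `Range`, and `discard` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of discard handling; not Kylin's real classes. */
final class DiscardSketch {

    /** Half-open range [start, end) over a numeric partition value (assumption). */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Removes the discarded job's range from jobRange (mutated in place) and
     * returns a recalculated partition_range: loaded data plus remaining jobs.
     */
    static List<Range> discard(List<Range> jobRange, List<Range> loaded, Range discarded) {
        jobRange.removeIf(r -> r.start == discarded.start && r.end == discarded.end);

        List<Range> all = new ArrayList<>(loaded);
        all.addAll(jobRange);
        all.sort(Comparator.comparingLong((Range r) -> r.start));

        // Merge overlapping or adjacent ranges into a minimal sorted list.
        List<Range> merged = new ArrayList<>();
        for (Range r : all) {
            int lastIndex = merged.size() - 1;
            if (lastIndex >= 0 && merged.get(lastIndex).end >= r.start) {
                Range last = merged.remove(lastIndex);
                merged.add(new Range(last.start, Math.max(last.end, r.end)));
            } else {
                merged.add(r);
            }
        }
        return merged;
    }
}
```

Recomputing from the loaded data, rather than subtracting the discarded range, keeps `partition_range` consistent even if the discarded job had already written part of its data; whether the real implementation works this way is not stated in the comment.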
h4. Internal Table Task Refactoring
- Split the metadata update out of internal table tasks into a dedicated stage
  that supports retry on failure.
- Workflow: Internal table loading --> Metadata update --> Gluten cache loading.
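
The three-stage workflow could look roughly like the sketch below, in which only the metadata update stage is retried. The names `StagedLoadSketch`, `Stage`, `runWithRetry`, and `runJob` are hypothetical, and the retry policy (a fixed number of immediate attempts) is an assumption; the comment does not say how retries are scheduled.

```java
/** Hypothetical sketch of the staged workflow; not Kylin's real job framework. */
final class StagedLoadSketch {

    /** A stage that may fail with a checked exception. */
    interface Stage {
        void run() throws Exception;
    }

    /** Runs the stage up to maxAttempts times and rethrows the last failure. */
    static void runWithRetry(Stage stage, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                stage.run();
                return;
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    /** Internal table loading --> metadata update (retried) --> cache loading. */
    static void runJob(Stage loadTable, Stage updateMetadata, Stage loadCache, int maxAttempts)
            throws Exception {
        loadTable.run();
        runWithRetry(updateMetadata, maxAttempts);
        loadCache.run();
    }
}
```

Because the metadata update is its own stage, a transient failure there can be retried without reloading the table data, and the cache stage only runs once the metadata is consistent.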
h4. Frontend Changes
- Expose the **Refresh button** on the frontend.
- Enhance the refresh API to support multi-partition refresh, allowing specific
partitions to be passed as input.
- Only allow refreshing selected partitions.
> [Internal Table - Incremental Load] Overlapping Time Ranges Prohibited for
> Loading
> ----------------------------------------------------------------------------------
>
> Key: KYLIN-6069
> URL: https://issues.apache.org/jira/browse/KYLIN-6069
> Project: Kylin
> Issue Type: Improvement
> Affects Versions: 5.0.0
> Reporter: Guoliang Sun
> Assignee: Guoliang Sun
> Priority: Major
> Fix For: 5.0.2
>
>
> Refactor the internal table build task logic:
> - For build and refresh tasks, do not allow two tasks to build the same time
> range simultaneously.
> - For refresh tasks, do not allow refreshing data beyond the specified time
> range.