Kun Liu created KYLIN-4983:
------------------------------
Summary: The stream cube will be paused when a user appends a batch segment first
Key: KYLIN-4983
URL: https://issues.apache.org/jira/browse/KYLIN-4983
Project: Kylin
Issue Type: Bug
Components: Real-time Streaming
Reporter: Kun Liu
Assignee: Kun Liu
Env:
A stream cube (stream_cube) with lambda enabled; the segment window of the cube is 1 hour.
Before enabling stream_cube, we submit a build REST request with the range
[2025-01-01, 2025-01-02].
Then we enable stream_cube: the receiver node consumes data and creates
segments in real time, but the start time of every new realtime segment is
necessarily less than `2025-01-01`.
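For reference, a minimal sketch of the kind of build request that creates the
batch segment, assuming a Kylin server at localhost:7070 with the default
ADMIN:KYLIN credentials (host, port and credentials are assumptions; the
endpoint is Kylin's standard cube build API):
```
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class AppendBatchSegment {
    public static void main(String[] args) throws Exception {
        // [2025-01-01, 2025-01-02) as epoch milliseconds (UTC)
        long start = 1735689600000L;
        long end = 1735776000000L;

        String body = String.format(
                "{\"startTime\": %d, \"endTime\": %d, \"buildType\": \"BUILD\"}",
                start, end);
        String auth = Base64.getEncoder()
                .encodeToString("ADMIN:KYLIN".getBytes());

        // PUT /kylin/api/cubes/{cubeName}/build appends a batch segment
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:7070/kylin/api/cubes/stream_cube/build"))
                .header("Authorization", "Basic " + auth)
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```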
When the receiver node uploads the segment data to HDFS and notifies the
coordinator, the coordinator deletes the metadata of the stream segment (see
the code below), but the segment data is never deleted on the receiver node.
```
// If we have an existing historical segment, we should not let a new realtime
// segment overwrite it, it is so dangerous,
// we just delete the entry to ignore the segment which should not exist
if (segmentRange.getFirst() < minSegmentStart) {
    logger.warn(
            "The cube segment state is not correct because it belongs to historical part, cube:{} segment:{}, clear it.",
            cubeName, segmentState.getSegmentName());
    coordinator.getStreamMetadataStore().removeSegmentBuildState(cubeName,
            segmentState.getSegmentName());
    continue;
}
```
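To make the failure mode concrete: with the dates above converted to epoch
millis, the coordinator's condition holds for every realtime segment, so every
one of them has its build state removed (the 2021 timestamp below stands in for
the receiver's event time; values are from the example, not from a real run):
```
public class WhyEverySegmentIsDropped {
    public static void main(String[] args) {
        // Start of the appended batch segment [2025-01-01, 2025-01-02)
        long minSegmentStart = 1735689600000L;
        // Start of a realtime segment consumed around report time (2021-01-01 UTC)
        long realtimeSegmentStart = 1609459200000L;

        // The coordinator's check quoted above: true for every realtime segment,
        // so its metadata is removed while the segment files stay on the receiver.
        System.out.println(realtimeSegmentStart < minSegmentStart); // true
    }
}
```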
The number of immutable segments keeps growing until it reaches 100, at which
point consumption for the stream cube is paused.
There are two ways to resolve this issue:
# forbid appending segments to a stream cube with lambda via the REST API
# delete the local data and the remote HDFS data before removing the metadata
For now, I think forbidding segment appending on a lambda stream cube is the
better way; a sketch of such a guard follows.
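A minimal sketch of option 1, a guard in the REST layer that rejects the
append request up front; the Cube type and its two flags are illustrative
stand-ins, not Kylin's actual classes:
```
public class AppendGuard {
    // Stand-in for the cube metadata the REST layer can already see
    record Cube(String name, boolean realtimeStreaming, boolean lambdaEnabled) {}

    static void checkSegmentAppendAllowed(Cube cube) {
        if (cube.realtimeStreaming() && cube.lambdaEnabled()) {
            throw new IllegalArgumentException(
                    "Appending batch segments to lambda streaming cube '"
                            + cube.name() + "' is not allowed: realtime segments"
                            + " starting before the appended range would be dropped"
                            + " by the coordinator but never cleaned up on the receivers.");
        }
    }

    public static void main(String[] args) {
        checkSegmentAppendAllowed(new Cube("stream_cube", true, true)); // throws
    }
}
```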