Kun Liu created KYLIN-4983:
------------------------------

             Summary: The stream cube will be paused when a user appends a batch segment first
                 Key: KYLIN-4983
                 URL: https://issues.apache.org/jira/browse/KYLIN-4983
             Project: Kylin
          Issue Type: Bug
          Components: Real-time Streaming
            Reporter: Kun Liu
            Assignee: Kun Liu


Env:

A stream cube (stream_cube) in lambda mode; the segment window of the cube is 1 hour.

Before enabling stream_cube, we submit a build REST request for the range 
[2025-01-01, 2025-01-02].

 

Then we enable stream_cube. The receiver node consumes data and creates 
segments in real time, but the start time of the new segment must be less 
than `2025-01-01`.

 

When the receiver node uploads the segment data to HDFS and notifies the 
coordinator, the coordinator deletes the metadata of the stream segment (see 
the code below), but the segment data cannot be deleted on the receiver 
node.

 

```

// If an existing historical segment covers this range, we must not let a new
// realtime segment overwrite it, because that is dangerous;
// we just delete the entry to ignore the segment which should not exist
if (segmentRange.getFirst() < minSegmentStart) {
    logger.warn(
            "The cube segment state is not correct because it belongs to historical part, cube:{} segment:{}, clear it.",
            cubeName, segmentState.getSegmentName());
    coordinator.getStreamMetadataStore().removeSegmentBuildState(cubeName,
            segmentState.getSegmentName());
    continue;
}

```

Because the local data is never cleaned up, the number of immutable segments 
keeps growing until it reaches 100, and the stream cube is paused.

 

There are two ways to resolve this issue:

 
 # Forbid appending segments through the REST API for a stream cube in lambda mode
 # Delete the local data and the remote HDFS data before removing the metadata
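For option 2, the removal path would need to delete the segment's data on the receiver and on HDFS before dropping the metadata entry. A minimal sketch of the cleanup step, modeling both copies as local directories for illustration (in Kylin the remote copy would be removed through Hadoop's FileSystem API instead; the class and method names here are hypothetical):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Illustrative sketch only: not Kylin's actual cleanup code.
public class SegmentCleanup {

    /**
     * Recursively delete a segment's data directory if it exists. This would
     * be invoked for both the receiver-local copy and the HDFS copy before
     * removeSegmentBuildState() drops the metadata, so no orphaned segment
     * data is left behind to pile up toward the immutable-segment limit.
     */
    public static void deleteSegmentDir(Path segmentDir) throws IOException {
        if (!Files.exists(segmentDir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(segmentDir)) {
            // Reverse order deletes children before their parent directories.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```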

I think forbidding segment appends on a lambda stream cube is the better 
way.
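A minimal sketch of that check, as a validation hook the REST build endpoint could call before appending a segment (the class, fields, and method names here are illustrative assumptions, not Kylin's actual API):

```java
// Illustrative sketch only: names do not match Kylin's real classes.
public class LambdaAppendGuard {

    /** Minimal stand-in for the cube descriptor fields the check needs. */
    public static class CubeInfo {
        final boolean isRealtimeStreaming;
        final boolean isLambda;

        CubeInfo(boolean isRealtimeStreaming, boolean isLambda) {
            this.isRealtimeStreaming = isRealtimeStreaming;
            this.isLambda = isLambda;
        }
    }

    /**
     * Reject a batch "append segment" REST request when the target cube is a
     * real-time streaming cube in lambda mode, because the appended historical
     * segment would later cause the coordinator to drop the metadata of
     * realtime segments that fall before it, while the receiver keeps their
     * local data.
     */
    public static void checkAppendAllowed(CubeInfo cube) {
        if (cube.isRealtimeStreaming && cube.isLambda) {
            throw new IllegalStateException(
                    "Appending batch segments to a lambda stream cube is forbidden");
        }
    }
}
```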

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
