zhijiangW commented on a change in pull request #11507: [FLINK-16587] Add basic CheckpointBarrierHandler for unaligned checkpoint
URL: https://github.com/apache/flink/pull/11507#discussion_r403725829
##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/api/serialization/SpillingAdaptiveSpanningRecordDeserializer.java
##########
@@ -557,6 +579,28 @@ private void addNextChunkFromMemorySegment(MemorySegment segment, int offset, in
 		}
 	}
+	@Nullable
+	MemorySegment copyToTargetSegment() {
+		// for the case of only partial length, no data
+		final int position = lengthBuffer.position();
+		if (position > 0) {
+			MemorySegment segment = MemorySegmentFactory.allocateUnpooledSegment(lengthBuffer.remaining());
+			segment.put(0, lengthBuffer, lengthBuffer.remaining());
+			lengthBuffer.position(position);
+			return segment;
+		}
+
+		// for the case of full length, partial data in buffer
+		if (recordLength != -1) {
+			// In the PoC we skip the case of large record which size exceeds THRESHOLD_FOR_SPILLING,
Review comment:
I have not thought the solution through yet. In my previous PoC I also found this case difficult to handle, since it is not easy to read partial data back from spilled files ATM, so I left it as a TODO rather than spend much effort on it.
I think we might not support the large-record case in the MVP, but we should not silently swallow it. Otherwise, once this case occurs in production, the completed checkpoint cannot be restored correctly because of the missing partial records.
Maybe we can discard the unaligned checkpoint for the large-record case, or throw an exception that warns users to tune a proper checkpoint setting?
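To make the failure explicit rather than silent, a minimal sketch of the second option could look like the following (the class name, method name, and the 5 MiB threshold value here are my assumptions for illustration, not actual PR code):

```java
// Sketch only: names and the threshold value are assumptions for illustration.
final class LargeRecordCheckpointGuard {

	// assumed to mirror the deserializer's spilling threshold mentioned above
	private static final int THRESHOLD_FOR_SPILLING = 5 * 1024 * 1024;

	/**
	 * Fails fast instead of silently skipping a partially spilled record,
	 * so an unaligned checkpoint is never completed with missing data.
	 */
	static void checkRecordFitsInUnalignedCheckpoint(int recordLength) {
		if (recordLength > THRESHOLD_FOR_SPILLING) {
			// alternative: decline/abort this unaligned checkpoint instead of failing hard
			throw new UnsupportedOperationException(
				"Record of " + recordLength + " bytes exceeds the spilling threshold of "
					+ THRESHOLD_FOR_SPILLING + " bytes; unaligned checkpoints cannot snapshot "
					+ "partially spilled records yet. Consider tuning the checkpoint settings.");
		}
	}

	public static void main(String[] args) {
		checkRecordFitsInUnalignedCheckpoint(1_024); // small record: passes
		try {
			checkRecordFitsInUnalignedCheckpoint(64 * 1024 * 1024);
		} catch (UnsupportedOperationException e) {
			System.out.println("Large record rejected: " + e.getMessage());
		}
	}
}
```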