[ 
https://issues.apache.org/jira/browse/FLINK-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16820940#comment-16820940
 ] 

ryantaocer edited comment on FLINK-12070 at 4/18/19 10:19 AM:
--------------------------------------------------------------

!image-2019-04-18-17-38-24-949.png!

Thanks [~StephanEwen] for your comments. I would like to explain it using a 
figure. In the figure, a physical disk file (black line) can be mapped into a 
Java ByteBuffer object (orange line), which is in fact just a logical address 
space ranging from, say, 0 to the file_size of the disk file. However, at that 
moment the bits of the file content are not actually copied into the physical 
memory (green line) of the underlying OS. When a reader accesses the file 
content through the ByteBuffer, the OS automatically copies only the bits of 
the disk file around the access position into physical memory pages, rather 
than the whole file, because physical memory is limited and programs exhibit 
locality. As the reader traverses the ByteBuffer by invoking methods such as 
ByteBuffer.get(), the OS drops (swaps out) outdated physical pages and fills 
new pages with the corresponding bits of the disk file around the reading 
position in the range [0, file_size]. This behaves like a window of limited 
size sliding over the bits of the disk file. 

This is not a problem if only one reader accesses the file. But if more than 
one reader shares the same ByteBuffer object, as in the figure above, it 
becomes problematic, since the physical pages are controlled by the OS and 
memory is a limited resource on a multi-tenant shared server. In the figure, 
if reader #1 traverses the buffer, it may cause the OS to drop the pages of 
reader #2 (a cache miss), and vice versa. This kind of cache missing/thrashing 
can make the mmap mechanism quite inefficient.
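The sharing scenario above can be sketched with Java NIO: a single MappedByteBuffer is backed by one set of OS page-cache pages, and duplicate() gives each reader an independent position over those same physical pages. This is only a minimal illustration of the mechanism (the temp-file name and contents are made up for the sketch, not taken from Flink's code):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapSharedReaders {
    public static void main(String[] args) throws IOException {
        // Write a tiny file to map; a real spilled result partition could be
        // far larger than available physical memory, which is when the
        // page-cache contention described above actually hurts.
        Path file = Files.createTempFile("partition", ".data");
        Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));

        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // One shared mapping: a logical address range [0, file_size).
            // Pages are faulted in lazily by the OS on first access.
            MappedByteBuffer shared =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

            // duplicate() gives each reader its own position/limit, but both
            // views are backed by the SAME physical pages, so two readers at
            // distant positions contend for the OS page cache.
            ByteBuffer reader1 = shared.duplicate();
            ByteBuffer reader2 = shared.duplicate();
            reader2.position(5); // reader #2 starts in a different region

            System.out.println((char) reader1.get()); // prints '0'
            System.out.println((char) reader2.get()); // prints '5'
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Each duplicate advances independently, so the logical views are isolated; what they cannot be isolated from is the OS's eviction of the underlying physical pages.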



> Make blocking result partitions consumable multiple times
> ---------------------------------------------------------
>
>                 Key: FLINK-12070
>                 URL: https://issues.apache.org/jira/browse/FLINK-12070
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>            Reporter: Till Rohrmann
>            Assignee: BoWang
>            Priority: Major
>         Attachments: image-2019-04-18-17-38-24-949.png
>
>
> In order to avoid writing produced results multiple times for multiple 
> consumers and in order to speed up batch recoveries, we should make the 
> blocking result partitions consumable multiple times. At the moment a 
> blocking result partition will be released once the consumers have processed 
> all data. Instead the result partition should be released once the next 
> blocking result has been produced and all consumers of a blocking result 
> partition have terminated. Moreover, blocking results should not hold on slot 
> resources like network buffers or memory as it is currently the case with 
> {{SpillableSubpartitions}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
