[ 
https://issues.apache.org/jira/browse/FLINK-29402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620916#comment-17620916
 ] 

Yuan Mei commented on FLINK-29402:
----------------------------------

 # Benchmarking a pure k-v store while ruling out other factors (like the page 
cache) totally makes sense.
 # However, from Flink's perspective, it is more reasonable to treat the Flink 
engine as a whole. In that case, we most likely should, and need to, use the 
page cache. Benchmarking with the page cache aligns better with real-world use 
cases. But that's a different topic, I would say.

Since you also agree that we should not introduce a new config purely for 
benchmarking purposes, I am going to close this ticket.

> Add USE_DIRECT_READ configuration parameter for RocksDB
> -------------------------------------------------------
>
>                 Key: FLINK-29402
>                 URL: https://issues.apache.org/jira/browse/FLINK-29402
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>    Affects Versions: 1.16.0
>            Reporter: Donatien
>            Priority: Not a Priority
>              Labels: Enhancement, pull-request-available, rocksdb
>             Fix For: 1.17.0
>
>         Attachments: directIO-performance-comparison.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> RocksDB allows the use of DirectIO for read operations to bypass the Linux 
> Page Cache. To understand the impact of the Linux Page Cache on performance, 
> one can run a heavy workload on a single-tasked Task Manager with a container 
> memory limit identical to the TM process memory. Running the same workload 
> on a TM with no container memory limit results in better performance, but 
> with host memory usage exceeding the TM requirement.
> The Linux Page Cache is of course useful, but it can give misleading results 
> when benchmarking the Managed Memory used by RocksDB. DirectIO is typically 
> enabled for benchmarks on working set estimation [Zwaenepoel et 
> al.|https://arxiv.org/abs/1702.04323].
> I propose adding a configuration key that lets users enable DirectIO for 
> reads through the RocksDB API. This configuration would be disabled by 
> default.
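
The proposal quoted above can be illustrated with a minimal sketch of a boolean key that defaults to off and is forwarded to a direct-read setter. Everything here is an assumption made for illustration: the key name `state.backend.rocksdb.use-direct-reads` is hypothetical (the ticket does not name one), and `DirectReadOptions` is a stand-in interface so the example runs without the RocksDB JNI library. RocksJava's `org.rocksdb.DBOptions` exposes a `setUseDirectReads(boolean)` method that would take its place.

```java
import java.util.HashMap;
import java.util.Map;

final class DirectReadConfigSketch {

    // Hypothetical key name; the ticket does not specify one.
    static final String USE_DIRECT_READS_KEY = "state.backend.rocksdb.use-direct-reads";

    // Stand-in for the single setter used here. In RocksJava the
    // corresponding method is DBOptions#setUseDirectReads(boolean).
    interface DirectReadOptions {
        void setUseDirectReads(boolean enabled);
    }

    // Reads the flag from a plain key/value configuration.
    // A missing key means "disabled", matching the proposed default.
    static boolean useDirectReads(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(USE_DIRECT_READS_KEY, "false"));
    }

    // Forwards the configured value to the options object.
    static void apply(Map<String, String> conf, DirectReadOptions options) {
        options.setUseDirectReads(useDirectReads(conf));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        boolean[] applied = new boolean[1];
        // Records the value that would be handed to RocksDB.
        DirectReadOptions recorder = enabled -> applied[0] = enabled;

        apply(conf, recorder);
        System.out.println("default: " + applied[0]);

        conf.put(USE_DIRECT_READS_KEY, "true");
        apply(conf, recorder);
        System.out.println("enabled: " + applied[0]);
    }
}
```

Without a dedicated key, the same setter can presumably be reached by a custom `RocksDBOptionsFactory` that calls `setUseDirectReads(true)` on the `DBOptions` it receives, which keeps a benchmark-only switch out of Flink's own configuration.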



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
