[ 
https://issues.apache.org/jira/browse/FLINK-19710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Tang reassigned FLINK-19710:
--------------------------------

    Assignee: Yun Tang

> Fix performance regression to rebase FRocksDB with higher version RocksDB
> -------------------------------------------------------------------------
>
>                 Key: FLINK-19710
>                 URL: https://issues.apache.org/jira/browse/FLINK-19710
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / State Backends
>            Reporter: Yun Tang
>            Assignee: Yun Tang
>            Priority: Minor
>              Labels: auto-deprioritized-major, auto-unassigned
>             Fix For: 1.14.0
>
>
> We planned to bump the base RocksDB version from 5.17.2 to 6.11.x. However, we 
> observed a performance regression between 5.17.2 and 5.18.3 in our own 
> flink-benchmarks, and reported it to the RocksDB community in 
> [rocksdb#5774|https://github.com/facebook/rocksdb/issues/5774]. Since 
> RocksDB 5.18.3 is a bit old for the RocksDB community, and RocksDB's built-in 
> db_bench tool cannot easily reproduce this regression, we did not get any 
> effective help from the RocksDB community.
> Since the code freeze of Flink-release-1.12 is close, we had to figure it out 
> ourselves. We first tried to use RocksDB's built-in db_bench tool to binary 
> search the 160 commits between RocksDB 5.17.2 and 5.18.3. 
> However, the performance regression was not clear with that tool. After 
> switching to our own flink-benchmarks, we finally found the commit which 
> introduced the nearly 10% performance regression: [replaced __thread with 
> thread_local keyword 
> |https://github.com/facebook/rocksdb/commit/d6ec288703c8fc53b54be9e3e3f3ffd6a7487c63]
>  .
> As far as we know, the performance regression of {{thread_local}} is 
> known from [gcc-4.8 changes|https://gcc.gnu.org/gcc-4.8/changes.html#cxx] and 
> becomes more serious in [dynamic modules usage 
> |http://david-grs.github.io/tls_performance_overhead_cost_linux/] ([tls 
> benchmark|https://testbit.eu/2015/thread-local-storage-benchmark]). That 
> could explain why RocksDB's built-in db_bench tool cannot reproduce this 
> regression, as it is compiled in static mode by recommendation.
>  
> We plan to fix this in our FRocksDB branch first by reverting the related 
> changes. From my current local experiments, that revert proved to be 
> effective in avoiding the performance regression.
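
For readers unfamiliar with the two spellings discussed above, here is a minimal C++ sketch that declares one counter with the GCC/Clang `__thread` extension and one with the standard `thread_local` keyword. The names `legacy_counter`, `standard_counter`, `bump_both` and `run_on_threads` are hypothetical and are not taken from RocksDB or FRocksDB. This single-file sketch only illustrates the two declarations and their per-thread semantics; it is not expected to reproduce the regression, which per the linked references shows up with cross-translation-unit access and dynamically loaded libraries.

```cpp
#include <cstdint>
#include <thread>
#include <vector>

// GCC/Clang extension: only constant initializers and trivial types allowed,
// so the compiler never needs a lazy-initialization check on access.
static __thread std::uint64_t legacy_counter = 0;

// Standard C++11 keyword: dynamic initialization is permitted. The gcc-4.8
// release notes describe a run-time penalty for references to thread_local
// variables that are defined in a different translation unit.
static thread_local std::uint64_t standard_counter = 0;

// Increments both counters n times on the calling thread and returns
// the sum of the two per-thread values.
std::uint64_t bump_both(std::uint64_t n) {
    for (std::uint64_t i = 0; i < n; ++i) {
        ++legacy_counter;
        ++standard_counter;
    }
    return legacy_counter + standard_counter;
}

// Runs bump_both on `threads` new threads. Each thread starts with its own
// zero-initialized copies of both counters.
std::vector<std::uint64_t> run_on_threads(int threads, std::uint64_t n) {
    std::vector<std::uint64_t> results(threads, 0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&results, t, n] { results[t] = bump_both(n); });
    }
    for (auto& th : pool) {
        th.join();
    }
    return results;
}
```

Calling `run_on_threads(4, 1000)` gives every thread its own pair of counters, so no thread observes increments made by another. To compare access cost in a setting closer to the one described in the issue, the variables would presumably have to live in a shared library and be accessed from another translation unit.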



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
