[ https://issues.apache.org/jira/browse/SPARK-34309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279788#comment-17279788 ]
Erik Krogen commented on SPARK-34309:
-------------------------------------

The numbers are quite good. If I interpret them correctly, there is a 2-4x improvement from Guava to Caffeine, with larger improvements as the workload becomes more write-heavy. A slightly delayed eviction seems like a small price to pay...

[~LuciferYang] can you elaborate on where Guava Cache is used in performance-critical sections within Spark? Microbenchmarks are nice, but it would be good to understand what kind of effect this could have on overall performance.

> Use Caffeine instead of Guava Cache
> -----------------------------------
>
>                 Key: SPARK-34309
>                 URL: https://issues.apache.org/jira/browse/SPARK-34309
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, SQL
>    Affects Versions: 3.2.0
>            Reporter: Yang Jie
>            Priority: Minor
>         Attachments: image-2021-02-05-18-08-48-852.png, screenshot-1.png
>
>
> Caffeine is a high-performance, near-optimal caching library based on Java 8.
> It is used in a similar way to Guava Cache, but with better performance. The
> comparison results are on the
> [caffeine benchmarks|https://github.com/ben-manes/caffeine/wiki/Benchmarks] page.
> Caffeine is also already used in open source projects such as
> Cassandra, HBase, Neo4j, Druid, and Spring.