[ https://issues.apache.org/jira/browse/OAK-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Davide Giannella updated OAK-5469:
----------------------------------
    Fix Version/s: 1.9.0

> TarMK: scaling the content
> --------------------------
>
> Key: OAK-5469
> URL: https://issues.apache.org/jira/browse/OAK-5469
> Project: Jackrabbit Oak
> Issue Type: Epic
> Components: segment-tar
> Reporter: Michael Dürig
> Assignee: Michael Dürig
> Labels: scalability
> Fix For: 1.8.0, 1.9.0
>
> Attachments: segment-per-path.png
>
>
> Production experience has shown that big repositories are prone to thrashing:
> {quote}
> Monitoring showed a massive level of major page faults, load averages
> several times the number of cores, system cpu levels well above 50% and
> extreme levels of IO. As more IOPS was provisioned the instance consumed all
> available IOPS. The TechOps team reported many TB of read IO per hour and
> hardly any write IO.
> Investigation revealed that the repository size was just larger than the
> available RAM on the machine. The instance was running in MMAPED mode and the
> IO was due to major page faults mapping in and out pages of memory. This was
> made worse by transparent huge page settings causing huge pages to be mapped
> proactively on major page faults. Compaction reduced the repository size to
> less than RAM. The TechOps team now monitor the total tar file size and don't
> let it exceed the RAM on the machine, scheduling compactions to keep within
> limits. Since the default to TarMK was to run memory mapped rather than on
> heap, the JVM had no visibility of the mayhem being caused at OS level.
> {quote}
> This epic is all about improving the scalability of the TarMK with respect to
> the content. Below are some initial points to consider. Let's create issues
> and link them to this epic as we go.
> * What kind of internal / external monitoring do we need to understand and
> optimally predict thrashing? Can we monitor the working set (active pages)?
> The number of segments in the segment cache might be a good starting point.
> * (How) can we reproduce the thrashing (easily enough)? Can we scale it down
> (i.e. to an instance with less RAM)?
> * What is the impact of transparent huge pages (and of switching them off)?
> How much do we suffer from read amplification? What would be the impact of
> not memory mapping but instead increasing the size of the segment buffer
> accordingly? Both approaches aim at having finer grained control over the
> data actually being loaded into RAM.
> * What other OS level tweaks should / can we look at?
> * Can we reduce the working set by keeping it more compact? E.g. running
> GC/compaction, reducing read amplification (see above), improving
> de-duplication of values, storing values more efficiently (e.g. dates and
> booleans), compressing buffers (e.g. segments) on the fly.
> * How do we test with big repositories?
> * What is a big repository? (Potential target: 100 GB segment store - 500M
> nodes, TBC)
> * What to measure (indicators of size): size on disk (after compaction),
> number of JCR nodes, number of node records (reachable vs. waste)
> * How to measure?
> * {{oak-run debug}} (needs improvements for better scalability)
> * one-line tool to provide all the info?
> * How to obtain big repositories (generate or re-use existing)?
> * What to analyze / monitor / debug?
> * Possible limits: number of nodes (relative to RAM) for which thrashing
> starts to occur, max. number of direct children, max. concurrent requests
> during online garbage collection.
> * Platform monitoring:
> * basic: disk size, IO, CPU, memory
> * Assess the impact of hardware upgrades on performance. E.g. what impact
> does doubling RAM/IO/CPU have on our test scenarios?
> * in depth: page faults, writes / reads per process, working set of
> nodes, commit statistics, incoming requests vs Oak operations, other hiccups
> * Tools: [Ganglia|http://ganglia.info/],
> [jHiccup|https://github.com/giltene/jHiccup],
> [AppDynamics|https://www.appdynamics.com/]
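
As a possible starting point for the external monitoring discussed above, here is a minimal Python sketch of three checks taken from the quoted report: total tar file size versus physical RAM, the major page fault counter of a process, and the active transparent huge page mode. It is only an illustration under stated assumptions, not an Oak tool. The function names are hypothetical, the segment store path in the usage part is a placeholder, and the `/proc` and `/sys` files it reads are Linux specific.

```python
import os
from pathlib import Path


def total_tar_size(segmentstore_dir):
    """Sum the sizes (in bytes) of all *.tar files directly inside the directory."""
    return sum(p.stat().st_size for p in Path(segmentstore_dir).glob("*.tar"))


def physical_ram_bytes():
    """Physical RAM in bytes via sysconf (available on Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def exceeds_ram(segmentstore_dir, ram_bytes, headroom=1.0):
    """True if the tar files are larger than headroom * ram_bytes.

    headroom is a hypothetical safety factor, e.g. 0.8 to warn early.
    """
    return total_tar_size(segmentstore_dir) > headroom * ram_bytes


def major_page_faults(stat_line):
    """Extract majflt (field 12 of /proc/<pid>/stat) from one stat line."""
    # The command name (field 2) is parenthesised and may contain spaces,
    # so split only after the last closing parenthesis.
    fields_after_comm = stat_line.rsplit(")", 1)[1].split()
    # Index 0 is field 3 (state), so field 12 (majflt) is index 9.
    return int(fields_after_comm[9])


def thp_mode(enabled_text):
    """Return the active mode from /sys/kernel/mm/transparent_hugepage/enabled.

    The kernel marks the active mode with brackets, e.g. "always madvise [never]".
    """
    for token in enabled_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


if __name__ == "__main__":
    # Placeholder path; adjust to the actual segment store location.
    store = "/path/to/repository/segmentstore"
    if os.path.isdir(store):
        print("tar size exceeds RAM:", exceeds_ram(store, physical_ram_bytes()))
    stat_path = "/proc/self/stat"
    if os.path.exists(stat_path):
        print("major page faults:", major_page_faults(Path(stat_path).read_text()))
    thp_path = "/sys/kernel/mm/transparent_hugepage/enabled"
    if os.path.exists(thp_path):
        print("THP mode:", thp_mode(Path(thp_path).read_text()))
```

Sampling `major_page_faults` periodically for the Oak JVM's pid and looking at the difference between samples gives a rate, which is probably a more useful thrashing indicator than the absolute counter.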