Thanks Erick/Chris for the information. The page faults are occurring on each node of the cluster. These are VMs running SOLR v7.2.1 on RHEL 7. CPUx8, 64GB mem.
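To tell minor faults apart from major (disk-backed) ones on a node, a rough sketch like the following could be run while the load test is active. It assumes a Linux host whose /proc/vmstat exposes the `pgfault` (all faults) and `pgmajfault` (major faults) counters. The function names and the 5-second sampling interval are hypothetical choices for illustration, not part of the cluster setup described here.

```python
import time


def read_fault_counters(path="/proc/vmstat"):
    """Return (all_faults, major_faults) read from a vmstat-style file.

    Assumption: the file has one "name value" pair per line and includes
    the pgfault and pgmajfault counters, as on typical Linux kernels.
    """
    counters = {}
    with open(path) as fh:
        for line in fh:
            parts = line.split()
            if len(parts) == 2 and parts[0] in ("pgfault", "pgmajfault"):
                counters[parts[0]] = int(parts[1])
    return counters["pgfault"], counters["pgmajfault"]


def fault_rates(before, after, seconds):
    """Turn two counter samples into (minor_per_sec, major_per_sec)."""
    total = (after[0] - before[0]) / seconds
    major = (after[1] - before[1]) / seconds
    # pgfault counts every fault, so minor faults are the remainder.
    return total - major, major


if __name__ == "__main__":
    interval = 5.0  # seconds between samples (illustrative value)
    first = read_fault_counters()
    time.sleep(interval)
    second = read_fault_counters()
    minor, major = fault_rates(first, second, interval)
    print(f"minor faults/s: {minor:.1f}  major faults/s: {major:.1f}")
```

These counters are system-wide, so they include faults from the monitoring agent and anything else on the VM, not only the Solr JVM.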
We’re collecting GC information and using a DynaTrace agent, so I’m not sure if / how much that contributes to the overhead.
This cluster is used strictly for type-ahead/auto-complete functionality.
I’ve also just noticed that the shards are imbalanced – 2 having about 90GB and 2 having about 18GB of data.
Having just joined this team, I’m not too familiar yet with the documents or queries/updates [and maybe not relevant to the page faults].
Although, I did check the schema, and most of the fields are stored=true, docValues=true

Solr v7.2.1
OS: RHEL 7

Collection Configuration -
Shard count: 4
configName: pdv201806
replicationFactor: 2
maxShardsPerNode: 1
router: compositeId
autoAddReplicas: false

Cache configuration –
filterCache class="solr.FastLRUCache" size="20000" initialSize="5000" autowarmCount="10"
queryResultCache class="solr.LRUCache" size="5000" initialSize="1000" autowarmCount="0"
documentCache class="solr.LRUCache" size="15000" initialSize="512"
enableLazyFieldLoading=true

JVM Information/Configuration –
java.runtime.version: 1.8.0_162-b12

-XX:+CMSParallelRemarkEnabled
-XX:+CMSScavengeBeforeRemark
-XX:+ParallelRefProcEnabled
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+ScavengeBeforeFullGC
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseConcMarkSweepGC
-XX:+UseGCLogFileRotation
-XX:+UseParNewGC
-XX:-OmitStackTraceInFastThrow
-XX:CMSInitiatingOccupancyFraction=70
-XX:CMSMaxAbortablePrecleanTime=6000
-XX:ConcGCThreads=4
-XX:GCLogFileSize=20M
-XX:MaxTenuringThreshold=8
-XX:NewRatio=3
-XX:ParallelGCThreads=8
-XX:PretenureSizeThreshold=64m
-XX:SurvivorRatio=4
-XX:TargetSurvivorRatio=90
-Xms16g
-Xmx32g
-Xss256k
-verbose:gc

Jeremy Branham
jb...@allstate.com

On 1/7/19, 1:16 PM, "Christopher Schultz" <ch...@christopherschultz.net> wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Erick,

On 1/7/19 11:52, Erick Erickson wrote:
> Images do not come through, so we don't see what you're seeing.
>
> That said, I'd expect page faults to happen:
>
> 1> when indexing. Besides what you'd expect (new segments written
> to disk), there's segment merging going on in the background which
> has to read segments from disk in order to merge.
>
> 2> when querying, any fields returned as part of a doc that has
> stored=true docValues=false will require a disk access to get the
> stored data.

A page fault is not necessarily a disk access. It almost always *is*, but it's not because the application is calling fopen(). It's because the OS is performing a memory operation which often results in a dip into virtual memory.

Jeremy, are these page-faults occurring on all the machines in your cluster, or only some? What is the hardware configuration of each machine (specifically, memory)? What are your JVM settings for your Solr instances? Is anything else running on these nodes?

It would help to understand what's happening on your servers. "I'm seeing page faults" doesn't really help us help you.

Thanks,
- -chris

> On Mon, Jan 7, 2019 at 8:35 AM Branham, Jeremy (Experis)
> <jb...@allstate.com> wrote:
>>
>> Does anyone know if it is typical behavior for a SOLR cluster to
>> have lots of page faults (50-100 per second) under heavy load?
>>
>> We are performing load testing on a cluster with 8 nodes, and my
>> performance engineer has brought this information to attention.
>>
>> I don’t know enough about memory management to say it is normal
>> or not.
>>
>> The performance doesn’t appear to be suffering, but I don’t want
>> to overlook a potential hazard.
>>
>> Thanks!
>>
>> Jeremy Branham
>> jb...@allstate.com
>> Allstate Insurance Company | UCV Technology Services |
>> Information Services Group
>>
>
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAlwzpYsACgkQHPApP6U8
pFgSHxAAgaXV5wkwV7Ru2QyhnvxUnIWY4Iom0IdZYrDuZBDxmFx9wzE7P33zmR3E
nrgZCqBtAMdxRSwG9BfyKircChZBssqtQpskw6mgJyzRyGvKVJjJ68r0vEio3Kjo
HjaJczBFWvdOKm42W1Li4SeymGyYXu/jmdkWLcIbEM4BgDQLf1HhSEphDeZzP4ST
GNDBrIA6XkUJwE1r58FUuj9l0XSKUAPLOPNAx1qGiAn4fKdbysVHvLcvJvJzC0pC
1kx000r+Mqdd61EzhM20ZDIvg2F3vgFgGCUtB31hIi18bfD8whoAafL2FSMkIccD
H7X09PpUK8qPM/oQgqCKTtfmVR3M2pi3CSxLFSQ1/QucnF2wxWknOOWUH1TMU/L2
KUQHS6GwuTk+R/8PxdBRsZI8ON3MVb690ECV4QplYlkrtygXrLRg2YOgifgAXsKL
5Kg2mrpKoxfNnDWaRksy4GUDTsSxbkd1rpnHJEZ8le26HXvz9wrug/FtNPzqP8S9
dan2gkgiSqOM9GKlKkA72ROyQDhZa5YiXfGNdRrmfkiQzlDBEcGpD8pg1GwskRJl
yidTBfvRSyCHsI5NBGf65nTG+2WfUnr8wClHVK5QQGVilHBn6KzeHeDTL9ZpHvcn
GhkDMvc+9f8DR7Hr/mTiGjYIAvJZYiIJeYUoe0Bl2BHmGDv0tEk=
=OpZo
-----END PGP SIGNATURE-----