Thanks Erick/Chris for the information.
The page faults are occurring on each node of the cluster.
These are VMs running Solr v7.2.1 on RHEL 7, each with 8 CPUs and 64GB of memory.

We’re collecting GC information and using a DynaTrace agent, so I’m not sure whether, or how much, that contributes to the overhead.

This cluster is used strictly for type-ahead/auto-complete functionality. 

I’ve also just noticed that the shards are imbalanced – two have about 90GB of data and the other two about 18GB.
Having just joined this team, I’m not too familiar yet with the documents or the queries/updates [which may not be relevant to the page faults anyway].
I did check the schema, though, and most of the fields are stored=true, docValues=true.
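
For example, the common pattern looks like this (the field name and type
here are just illustrative, not an actual field from our schema):

  <field name="suggest_term" type="string" indexed="true" stored="true" docValues="true"/>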

Solr v7.2.1
OS: RHEL 7

Collection configuration –
Shard count: 4
configName: pdv201806
replicationFactor: 2
maxShardsPerNode: 1
router: compositeId
autoAddReplicas: false
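
For reference, that corresponds to roughly this Collections API CREATE call
(the host and collection name are placeholders, not our actual values):

  curl "http://<solr_host>:8983/solr/admin/collections?action=CREATE&name=<collection>&numShards=4&replicationFactor=2&maxShardsPerNode=1&collection.configName=pdv201806&router.name=compositeId&autoAddReplicas=false"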

Cache configuration –
<filterCache class="solr.FastLRUCache"
             size="20000"
             initialSize="5000"
             autowarmCount="10"/>
<queryResultCache class="solr.LRUCache"
                  size="5000"
                  initialSize="1000"
                  autowarmCount="0"/>
<documentCache class="solr.LRUCache"
               size="15000"
               initialSize="512"/>

<enableLazyFieldLoading>true</enableLazyFieldLoading>
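
In case it’s useful, the hit ratios for those caches can be pulled from the
metrics API to see how well the sizes are working, e.g. (host is a placeholder):

  curl "http://<solr_host>:8983/solr/admin/metrics?group=core&prefix=CACHE.searcher"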


JVM Information/Configuration –
java.runtime.version: 1.8.0_162-b12

-XX:+CMSParallelRemarkEnabled
-XX:+CMSScavengeBeforeRemark
-XX:+ParallelRefProcEnabled
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+ScavengeBeforeFullGC
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseConcMarkSweepGC
-XX:+UseGCLogFileRotation
-XX:+UseParNewGC
-XX:-OmitStackTraceInFastThrow
-XX:CMSInitiatingOccupancyFraction=70
-XX:CMSMaxAbortablePrecleanTime=6000
-XX:ConcGCThreads=4
-XX:GCLogFileSize=20M
-XX:MaxTenuringThreshold=8
-XX:NewRatio=3
-XX:ParallelGCThreads=8
-XX:PretenureSizeThreshold=64m
-XX:SurvivorRatio=4
-XX:TargetSurvivorRatio=90
-Xms16g
-Xmx32g
-Xss256k
-verbose:gc
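
If it helps, I believe these flags come through solr.in.sh on our installs;
a rough sketch of the relevant excerpt (memory/GC settings only, using the
standard SOLR_JAVA_MEM and GC_TUNE variables):

  SOLR_JAVA_MEM="-Xms16g -Xmx32g"
  GC_TUNE="-XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
    -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 \
    -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 \
    -XX:MaxTenuringThreshold=8 -XX:PretenureSizeThreshold=64m \
    -XX:ConcGCThreads=4 -XX:ParallelGCThreads=8 -XX:+ParallelRefProcEnabled \
    -XX:+CMSParallelRemarkEnabled -XX:+CMSScavengeBeforeRemark"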


 
Jeremy Branham
jb...@allstate.com

On 1/7/19, 1:16 PM, "Christopher Schultz" <ch...@christopherschultz.net> wrote:

    
    Erick,
    
    On 1/7/19 11:52, Erick Erickson wrote:
    > Images do not come through, so we don't see what you're seeing.
    > 
    > That said, I'd expect page faults to happen:
    > 
    > 1> when indexing. Besides what you'd expect (new segments written
    > to disk), there's segment merging going on in the background which
    > has to read segments from disk in order to merge.
    > 
    > 2> when querying, any fields returned as part of a doc that has
    > stored=true docValues=false will require a disk access to get the
    > stored data.
    
    A page fault is not necessarily a disk access. It almost always *is*,
    but it's not because the application is calling fopen(). It's because
    the OS is performing a memory operation which often results in a dip
    into virtual memory.
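    
    A quick way to see whether these are major faults (which do hit the
    disk) or only minor faults is to sample them per process and
    system-wide, for example (sysstat assumed to be installed; the Solr
    pid is a placeholder):
    
        pidstat -r -p <solr_pid> 5    # minflt/s vs. majflt/s for the Solr JVM
        sar -B 5 12                   # system-wide fault/s and majflt/s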
    
    Jeremy, are these page-faults occurring on all the machines in your
    cluster, or only some? What is the hardware configuration of each
    machine (specifically, memory)? What are your JVM settings for your
    Solr instances? Is anything else running on these nodes?
    
    It would help to understand what's happening on your servers. "I'm
    seeing page faults" doesn't really help us help you.
    
    Thanks,
    - -chris
    
    > On Mon, Jan 7, 2019 at 8:35 AM Branham, Jeremy (Experis) 
    > <jb...@allstate.com> wrote:
    >> 
    >> Does anyone know if it is typical behavior for a SOLR cluster to
    >> have lots of page faults (50-100 per second) under heavy load?
    >> 
    >> We are performing load testing on a cluster with 8 nodes, and my
    >> performance engineer has brought this information to my attention.
    >> 
    >> I don’t know enough about memory management to say it is normal
    >> or not.
    >> 
    >> The performance doesn’t appear to be suffering, but I don’t want
    >> to overlook a potential hazard.
    >> 
    >> Thanks!
    >> 
    >> Jeremy Branham
    >> 
    >> jb...@allstate.com
    >> 
    >> Allstate Insurance Company | UCV Technology Services |
    >> Information Services Group
    >> 
    >> 
    > 
    
