Hello! I guess it's pool.pages() * 3L / 4, since, counterintuitively, the default ThrottlingPolicy is not ThrottlingPolicy.DISABLED; it's CHECKPOINT_BUFFER_ONLY.
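A minimal sketch of the arithmetic this implies (plain Java, not Ignite API; the 4 GB region size is illustrative, taken from later in this thread):

    // Illustrative arithmetic only, not Ignite API. With any throttling
    // policy in effect (including the default CHECKPOINT_BUFFER_ONLY), the
    // "too many dirty pages" trigger fires at 3/4 of the region's pages.
    public class DirtyPageThreshold {
        public static void main(String[] args) {
            long pageSize = 4096;                                  // Ignite default page size
            long regionPages = 4L * 1024 * 1024 * 1024 / pageSize; // 4 GB region -> 1,048,576 pages
            long maxDirtyPages = regionPages * 3L / 4;             // 786,432 pages

            System.out.println("Checkpoint triggered at ~" + maxDirtyPages + " dirty pages");
        }
    }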
Regards,
--
Ilya Kasnacheev

Thu, 31 Dec 2020 at 04:33, Raymond Wilson <raymond_wil...@trimble.com>:

> Regarding this section of code:
>
>     maxDirtyPages = throttlingPlc != ThrottlingPolicy.DISABLED
>         ? pool.pages() * 3L / 4
>         : Math.min(pool.pages() * 2L / 3, cpPoolPages);
>
> I think the correct ratio will be 2/3 of pages, as we do not have a
> throttling policy defined. Correct?
>
> On Thu, Dec 31, 2020 at 12:49 AM Zhenya Stanilovsky <arzamas...@mail.ru>
> wrote:
>
>> The relevant code runs from here:
>>
>>     if (checkpointReadWriteLock.getReadHoldCount() > 1 ||
>>         safeToUpdatePageMemories() || checkpointer.runner() == null)
>>         break;
>>     else {
>>         CheckpointProgress pages =
>>             checkpointer.scheduleCheckpoint(0, "too many dirty pages");
>>
>> and nearby you can see this:
>>
>>     maxDirtyPages = throttlingPlc != ThrottlingPolicy.DISABLED
>>         ? pool.pages() * 3L / 4
>>         : Math.min(pool.pages() * 2L / 3, cpPoolPages);
>>
>> Thus, if 3/4 of the pages in the whole DataRegion are dirty, this
>> checkpoint will be triggered.
>>
>> In
>> https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood
>> there is a mention of a dirty pages limit as a factor that can trigger
>> checkpoints.
>>
>> I also found this issue:
>> http://apache-ignite-users.70518.x6.nabble.com/too-many-dirty-pages-td28572.html
>> where "too many dirty pages" is the reason given for initiating a
>> checkpoint.
>>
>> After reviewing our logs I found this (one example):
>>
>>     2020-12-15 19:07:00,999 [106] INF [MutableCacheComputeServer]
>>     Checkpoint started [checkpointId=e2c31b43-44df-43f1-b162-6b6cefa24e28,
>>     startPtr=FileWALPointer [idx=6339, fileOff=243287334, len=196573],
>>     checkpointBeforeLockTime=99ms, checkpointLockWait=0ms,
>>     checkpointListenersExecuteTime=16ms, checkpointLockHoldTime=32ms,
>>     walCpRecordFsyncDuration=113ms, writeCheckpointEntryDuration=27ms,
>>     splitAndSortCpPagesDuration=45ms, pages=33421,
>>     reason='too many dirty pages']
>>
>> which suggests we may have the issue where writes are frozen until the
>> checkpoint is completed.
>>
>> Looking at the AI 2.8.1 source code, the dirty page limit fraction
>> appears to be 0.1 (10%), via this entry in
>> GridCacheDatabaseSharedManager.java:
>>
>>     /**
>>      * Threshold to calculate limit for pages list on-heap caches.
>>      * <p>
>>      * Note: When a checkpoint is triggered, we need some amount of page
>>      * memory to store pages list on-heap cache. If a checkpoint is
>>      * triggered by "too many dirty pages" reason and pages list cache is
>>      * rather big, we can get {@code IgniteOutOfMemoryException}. To
>>      * prevent this, we can limit the total amount of cached page list
>>      * buckets, assuming that a checkpoint will be triggered if no more
>>      * than 3/4 of pages are marked as dirty (there will be at least 1/4
>>      * of clean pages) and each cached page list bucket can be stored to
>>      * up to 2 pages (this value is not static, but depends on
>>      * PagesCache.MAX_SIZE, so if PagesCache.MAX_SIZE >
>>      * PagesListNodeIO#getCapacity it can take more than 2 pages). Also
>>      * some amount of page memory is needed to store page list metadata.
>>      */
>>     private static final double PAGE_LIST_CACHE_LIMIT_THRESHOLD = 0.1;
>>
>> This raises two questions:
>>
>> 1. The data region where most writes are occurring has 4Gb allocated to
>> it, though it is permitted to start at a much lower level. 4Gb should be
>> 1,000,000 pages, 10% of which should be 100,000 dirty pages.
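As an aside on the two numbers in play here: the 0.1 threshold and the 3/4 fraction govern different things. A sketch of both, assuming a 4 GB region at the default 4 KB page size:

    // Two distinct limits that are easy to conflate. Constants are from the
    // quoted 2.8.1 source; the 4 GB region is illustrative.
    public class TwoLimits {
        public static void main(String[] args) {
            long totalPages = 4L * 1024 * 1024 * 1024 / 4096;   // ~1,048,576 pages

            // PAGE_LIST_CACHE_LIMIT_THRESHOLD (0.1) caps the on-heap
            // pages-list cache to guard against IgniteOutOfMemoryException;
            // per its javadoc it is not itself a dirty-page limit.
            long pageListCacheLimit = (long)(totalPages * 0.1); // 104,857 pages

            // The "too many dirty pages" checkpoint trigger is the separate
            // 3/4 fraction discussed above.
            long dirtyPageTrigger = totalPages * 3L / 4;        // 786,432 pages

            System.out.println(pageListCacheLimit + " vs " + dirtyPageTrigger);
        }
    }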
>> The 'limit holder' is calculated like this:
>>
>>     /**
>>      * @return Holder for page list cache limit for given data region.
>>      */
>>     public AtomicLong pageListCacheLimitHolder(DataRegion dataRegion) {
>>         if (dataRegion.config().isPersistenceEnabled()) {
>>             return pageListCacheLimits.computeIfAbsent(
>>                 dataRegion.config().getName(),
>>                 name -> new AtomicLong(
>>                     (long)(((PageMemoryEx)dataRegion.pageMemory()).totalPages()
>>                         * PAGE_LIST_CACHE_LIMIT_THRESHOLD)));
>>         }
>>
>>         return null;
>>     }
>>
>> ... but I am unsure whether totalPages() refers to the current size of
>> the data region or the size it is permitted to grow to. That is, could
>> the 'dirty page limit' be a sliding limit based on the growth of the
>> data region? Is it better to set the initial and maximum sizes of data
>> regions to the same number?
>>
>> 2. We have two data regions: one supporting inbound arrival of data
>> (with low numbers of writes), and one supporting storage of processed
>> results from the arriving data (with many more writes).
>>
>> The block on writes due to the number of dirty pages appears to affect
>> all data regions, not just the one which has violated the dirty page
>> limit. Is that correct? If so, is this something that can be improved?
>>
>> Thanks,
>> Raymond.
>>
>> On Wed, Dec 30, 2020 at 9:17 PM Raymond Wilson
>> <raymond_wil...@trimble.com> wrote:
>>
>> I'm working on getting automatic JVM thread stack dumps taken when we
>> detect long delays in put (PutIfAbsent) operations. Hopefully this will
>> provide more information.
>>
>> On Wed, Dec 30, 2020 at 7:48 PM Zhenya Stanilovsky <arzamas...@mail.ru>
>> wrote:
>>
>> I don't think so; checkpointing worked perfectly well before this fix.
>> We need additional info to start digging into your problem. Can you
>> share the Ignite logs somewhere?
>>
>> I noticed an entry in the Ignite 2.9.1 changelog:
>>
>> - Improved checkpoint concurrent behaviour
>>
>> I am having trouble finding the relevant Jira ticket for this in the
>> 2.9.1 Jira area at
>> https://issues.apache.org/jira/browse/IGNITE-13876?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%202.9.1%20and%20status%20%3D%20Resolved
>>
>> Perhaps this change may improve the checkpointing issue we are seeing?
>>
>> Raymond.
>>
>> On Tue, Dec 29, 2020 at 8:35 PM Raymond Wilson
>> <raymond_wil...@trimble.com> wrote:
>>
>> Hi Zhenya,
>>
>> 1. We currently use AWS EFS for primary storage, with provisioned IOPS
>> to provide sufficient IO. Our Ignite cluster currently tops out at ~10%
>> usage (with at least 5 nodes writing to it, including WAL and WAL
>> archive), so we are not saturating the EFS interface. We use the default
>> page size (experiments with larger page sizes showed instability when
>> checkpointing due to free page starvation, so we reverted to the default
>> size).
>>
>> 2. Thanks for the detail; we will look for that in thread dumps when we
>> can create them.
>>
>> 3. We are using the default CP buffer size, which is max(256Mb,
>> DataRegionSize / 4) according to the Ignite documentation, so this
>> should be more than enough checkpoint buffer space to cope with writes.
>> As additional information, the cache which is displaying very slow
>> writes is in a data region with relatively slow write traffic. There is
>> a primary (default) data region with large write traffic, and the vast
>> majority of pages written in a checkpoint will be for that default data
>> region.
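One way to take the 'sliding limit' question off the table is to pin the region sizes and the checkpoint buffer explicitly, so neither is a moving target while the region grows. A sketch using the standard configuration setters; the region name and sizes are illustrative, not recommendations:

    import org.apache.ignite.configuration.DataRegionConfiguration;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class RegionSizingSketch {
        public static void main(String[] args) {
            long fourGb = 4L * 1024 * 1024 * 1024;

            DataRegionConfiguration region = new DataRegionConfiguration()
                .setName("processed-results")             // hypothetical region name
                .setPersistenceEnabled(true)
                .setInitialSize(fourGb)                   // start at the full size...
                .setMaxSize(fourGb)                       // ...so totalPages() is stable
                .setCheckpointPageBufferSize(fourGb / 4); // explicit 1 GB CP buffer

            IgniteConfiguration cfg = new IgniteConfiguration()
                .setDataStorageConfiguration(new DataStorageConfiguration()
                    .setDefaultDataRegionConfiguration(region));
        }
    }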
>> 4. Yes, this is very surprising. Anecdotally, from our logs it appears
>> write traffic into the low-write-traffic cache is blocked during
>> checkpoints.
>>
>> Thanks,
>> Raymond.
>>
>> On Tue, Dec 29, 2020 at 7:31 PM Zhenya Stanilovsky <arzamas...@mail.ru>
>> wrote:
>>
>> 1. In addition to Ilya's reply, you can check the vendor's page for
>> additional info; everything on that page is applicable to Ignite too
>> [1]. Increasing the thread count leads to concurrent IO usage, so with
>> something like NVMe it's up to you, but in the case of SAS it would
>> possibly be better to reduce this parameter.
>>
>> 2. The log will show you something like:
>>
>>     Parking thread=%Thread name% for timeout(ms)=%time%
>>
>> and the corresponding:
>>
>>     Unparking thread=
>>
>> 3. No additional logging of CP buffer usage is provided. The CP buffer
>> needs to be more than 10% of the overall persistent DataRegions' size.
>>
>> 4. 90 seconds or longer seems like a problem in IO or system tuning;
>> it's a very bad score, I'm afraid.
>>
>> [1]
>> https://www.gridgain.com/docs/latest/perf-troubleshooting-guide/persistence-tuning
>>
>> Hi,
>>
>> We have been investigating some issues which appear to be related to
>> checkpointing. We currently use AI 2.8.1 with the C# client.
>>
>> I have been trying to gain clarity on how certain aspects of the Ignite
>> configuration relate to the checkpointing process:
>>
>> 1. Number of checkpointing threads. This defaults to 4, but I don't
>> understand how it applies to the checkpointing process. Are more threads
>> generally better (e.g. because they parallelise the disk IO across
>> threads), or do they only have a positive effect if you have many data
>> storage regions? Or something else? If this could be clarified in the
>> documentation (or a pointer to it which Google has not yet found), that
>> would be good.
>>
>> 2. Checkpoint frequency. This defaults to 180 seconds. I was thinking
>> that reducing this time would result in smaller, less disruptive
>> checkpoints. Setting it to 60 seconds seems pretty safe, but is there a
>> practical lower limit for use cases where new data is constantly being
>> added, e.g. 5 or 10 seconds?
>>
>> 3. Write exclusivity constraints during checkpointing. I understand that
>> while a checkpoint is occurring, ongoing writes are supported into the
>> caches being checkpointed, and writes to pages already in the checkpoint
>> are duplicated into the checkpoint buffer. If this buffer becomes full
>> or stressed, then Ignite will throttle, and perhaps block, writes until
>> the checkpoint is complete. If this is the case, Ignite will emit
>> logging (warning or informational?) that writes are being throttled.
>>
>> We have cases where simple puts to caches (a few requests per second)
>> are taking up to 90 seconds to execute when there is an active
>> checkpoint occurring, where the checkpoint has been triggered by the
>> checkpoint timer. When a checkpoint is not occurring, the time to do
>> this is usually in the milliseconds.
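For reference against questions 1 and 2, both knobs live on DataStorageConfiguration. A sketch with illustrative values (60 s is the figure floated above, not a recommendation):

    import org.apache.ignite.configuration.DataStorageConfiguration;

    public class CheckpointTuningSketch {
        public static void main(String[] args) {
            DataStorageConfiguration storage = new DataStorageConfiguration()
                // Checkpoint threads parallelise the page writes of a single
                // checkpoint; more can help on parallel-friendly disks (e.g.
                // NVMe), while slower media may do better with fewer.
                .setCheckpointThreads(4)           // the default
                // A shorter frequency means each checkpoint has fewer dirty
                // pages to flush, at the cost of more frequent flush cycles.
                .setCheckpointFrequency(60_000L);  // 60 s vs. the 180 s default
        }
    }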
>> The checkpoints themselves can take 90 seconds or longer, and update up
>> to 30,000-40,000 pages across a pair of data storage regions: one with
>> 4Gb of in-memory space allocated (which should be 1,000,000 pages at the
>> standard 4kb page size), and one small region with 128Mb. There is no
>> 'throttling' logging being emitted that we can tell, so the checkpoint
>> buffer (which should be 1Gb for the first data region and 256Mb for the
>> second, smaller region in this case) does not look like it can fill up
>> during the checkpoint.
>>
>> It seems like the checkpoint is affecting the put operations, but I
>> don't understand why that may be, given the documented checkpointing
>> process, and the checkpoint itself (at least via informational logging)
>> is not advertising any restrictions.
>>
>> Thanks,
>> Raymond.
>
> --
> Raymond Wilson
> Solution Architect, Civil Construction Software Systems (CCSS)
> 11 Birmingham Drive | Christchurch, New Zealand
> +64-21-2013317 Mobile
> raymond_wil...@trimble.com
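A back-of-envelope check of the figures in this thread (all inputs come from the emails above; the conclusion is a hypothesis, not a measurement):

    public class CheckpointThroughputSketch {
        public static void main(String[] args) {
            long pages = 40_000;     // upper end of pages per checkpoint, per the logs
            long pageSize = 4096;    // default page size

            double mb = pages * pageSize / (1024.0 * 1024.0); // ~156 MB per checkpoint
            double mbPerSec = mb / 90.0;                      // ~1.7 MB/s if it takes 90 s

            // ~1.7 MB/s effective throughput for ~156 MB of data suggests
            // storage latency (e.g. networked EFS round-trips), rather than
            // write volume, as the first thing to investigate.
            System.out.printf("%.0f MB per checkpoint, ~%.1f MB/s effective%n", mb, mbPerSec);
        }
    }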