On 9/11/20 2:37 PM, Florian Fainelli wrote: > > > On 9/11/2020 1:54 PM, Chris Goldsworthy wrote: >> CMA allocations will fail if 'pinned' pages are in a CMA area, since we >> cannot migrate pinned pages. The _refcount of a struct page being greater >> than _mapcount for that page can cause pinning for anonymous pages. This >> is because try_to_unmap(), which (1) is called in the CMA allocation path, >> and (2) decrements both _refcount and _mapcount for a page, will stop >> unmapping a page from VMAs once the _mapcount for a page reaches 0. This >> implies that after try_to_unmap() has finished successfully for a page >> where _recount > _mapcount, that _refcount will be greater than 0. Later >> in the CMA allocation path in migrate_page_move_mapping(), we will have one >> more reference count than intended for anonymous pages, meaning the >> allocation will fail for that page. >> >> One example of where _refcount can be greater than _mapcount for a page we >> would not expect to be pinned is inside of copy_one_pte(), which is called >> during a fork. For ptes for which pte_present(pte) == true, copy_one_pte() >> will increment the _refcount field followed by the _mapcount field of a >> page. If the process doing copy_one_pte() is context switched out after >> incrementing _refcount but before incrementing _mapcount, then the page >> will be temporarily pinned. >> >> So, inside of cma_alloc(), instead of giving up when alloc_contig_range() >> returns -EBUSY after having scanned a whole CMA-region bitmap, perform >> retries indefinitely, with sleeps, to give the system an opportunity to >> unpin any pinned pages. > > I am by no means an authoritative CMA person but this behavior does not seem > acceptable, there is no doubt the existing one is sub-optimal under specific > circumstances, but an indefinite retry, as well as a 100ms sleep appear to be > arbitrary at best. How about you introduce a parameter that allows the tuning > of the number of retries and/or delay between retries?
Also: You should send your patch to [email protected] -- that's where most memory management type patches are reviewed. You should also send your patch to someone who could actually merge it into the kernel source tree -- assuming that's what you want to happen. Try scripts/get_mainttainer.pl on your patch to see what it says. And if you are going to use a "cover letter" or "introductory email" before the actual patch, the second (patch(es)) should be sent chained to the first email. git send-email should do this for you. -- ~Randy

