On Wed, Aug 07, 2019 at 03:30:38PM +0100, Steven Price wrote:
> On 07/08/2019 15:15, Matthew Wilcox wrote:
> > On Tue, Aug 06, 2019 at 11:40:00PM -0700, Christoph Hellwig wrote:
> >> On Tue, Aug 06, 2019 at 12:09:38PM -0700, Matthew Wilcox wrote:
> >>> Has anyone looked at turning the interface inside-out?  ie something like:
> >>>
> >>>   struct mm_walk_state state = { .mm = mm, .start = start, .end = end, };
> >>>
> >>>   for_each_page_range(&state, page) {
> >>>           ... do something with page ...
> >>>   }
> >>>
> >>> with appropriate macrology along the lines of:
> >>>
> >>> #define for_each_page_range(state, page)                          \
> >>>   while ((page = page_range_walk_next(state)))
> >>>
> >>> Then you don't need to package anything up into structs that are shared
> >>> between the caller and the iterated function.
> >>
> >> I'm not an all that huge fan of super magic macro loops.  But in this
> >> case I don't see how it could even work, as we get special callbacks
> >> for huge pages and holes, and people are trying to add a few more ops
> >> as well.
> > 
> > We could have bits in the mm_walk_state which indicate what things to return
> > and what things to skip.  We could (and probably should) also use different
> > iterator names if people actually want to iterate different things.  eg
> > for_each_pte_range(&state, pte) as well as for_each_page_range().
> > 
> 
> The iterator approach could be awkward for the likes of my generic
> ptdump implementation[1]. It would require an iterator which returns all
> levels and allows skipping levels when required (to prevent KASAN
> slowing things down too much). So something like:
> 
> start_walk_range(&state);
> for_each_page_range(&state, page) {
>       switch(page->level) {
>       case PTE:
>               ...
>       case PMD:
>               if (...)
>                       skip_pmd(&state);
>               ...
>       case HOLE:
>               ....
>       ...
>       }
> }
> end_walk_range(&state);
> 
> It seems a little fragile - e.g. we wouldn't (easily) get type checking
> that you are actually treating a PTE as a pte_t. The state mutators like
> skip_pmd() also seem a bit clumsy.

Once you're on-board with using a state structure, you can use it in all
kinds of fun ways.  For example:

struct mm_walk_state {
        struct mm_struct *mm;
        unsigned long start;
        unsigned long end;
        unsigned long curr;
        p4d_t p4d;
        pud_t pud;
        pmd_t pmd;
        pte_t pte;
        enum page_entry_size size;
        int flags;
};

For this user, I'd expect something like ...

        DECLARE_MM_WALK_FLAGS(state, mm, start, end,
                                MM_WALK_HOLES | MM_WALK_ALL_SIZES);

        walk_each_pte(state) {
                switch (state->size) {
                case PE_SIZE_PTE:
                        ... 
                case PE_SIZE_PMD:
                        if (...(state->pmd))
                                continue;
                ...
                }
        }

There's no need to have start / end walk function calls.

Reply via email to