Re: Page fault in kernel code
On Thu, 11 Sep 2014, Leon Romanovsky wrote:

> > Linux kernel memory is not page-able, but memory allocated through vmalloc
> > can still cause a page fault. How do device drivers using vmalloc handle this?
> Pages allocated via vmalloc call won't generate page-faults.

Kernel faults are used on some platforms in a very limited way but not for
swapping in or "paging in from disk".

Have a look at linux/arch/x86/mm/fault.c:

	/*
	 * We fault-in kernel-space virtual memory on-demand. The
	 * 'reference' page table is init_mm.pgd.
	 *
	 * NOTE! We MUST NOT take any locks for this case. We may
	 * be in an interrupt or a critical region, and should
	 * only copy the information from the master page table,
	 * nothing more.
	 *
	 * This verifies that the fault happens in kernel space
	 * (error_code & 4) == 0, and that the fault was not a
	 * protection error (error_code & 9) == 0.
	 */
	if (unlikely(fault_in_kernel_space(address))) {
		if (!(error_code & (PF_RSVD | PF_USER | PF_PROT))) {
			if (vmalloc_fault(address) >= 0)
				return;

			if (kmemcheck_fault(regs, address, error_code))
				return;
		}

		/* Can handle a stale RO->RW TLB: */
		if (spurious_fault(error_code, address))
			return;

		/* kprobes don't want to hook the spurious faults: */
		if (kprobes_fault(regs))
			return;
		/*
		 * Don't take the mm semaphore here. If we fixup a prefetch
		 * fault we could otherwise deadlock:
		 */
		bad_area_nosemaphore(regs, error_code, address);

		return;
	}

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
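For anyone decoding the mask in the excerpt above, here is a minimal user-space sketch. It assumes the x86 page-fault error-code bit values that fault.c used at the time (PF_PROT = 1, PF_WRITE = 2, PF_USER = 4, PF_RSVD = 8, PF_INSTR = 16). The helper name may_be_vmalloc_fault is hypothetical and exists only for illustration; it is not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/* x86 page-fault error-code bits (assumed values, matching the
 * "& 4" and "& 9" masks mentioned in the quoted comment). */
enum {
	PF_PROT  = 1 << 0,	/* 0: page not present, 1: protection violation */
	PF_WRITE = 1 << 1,	/* 0: read access,      1: write access */
	PF_USER  = 1 << 2,	/* 0: kernel mode,      1: user mode */
	PF_RSVD  = 1 << 3,	/* 1: reserved bit set in a page-table entry */
	PF_INSTR = 1 << 4,	/* 1: fault was an instruction fetch */
};

/* The "(error_code & 9)" in the comment is PF_RSVD | PF_PROT. */
static_assert((PF_RSVD | PF_PROT) == 9, "mask should match the comment");

/*
 * Hypothetical helper, for illustration only: true when the error code
 * describes a kernel-mode access to a not-present page with no reserved
 * bits set, which is the only case where the quoted code tries
 * vmalloc_fault().
 */
bool may_be_vmalloc_fault(unsigned long error_code)
{
	return !(error_code & (PF_RSVD | PF_USER | PF_PROT));
}
```

With these values, a kernel-mode write to a not-present page (error code PF_WRITE) passes the check, while any fault with PF_USER or PF_PROT set does not.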
Re: Page fault in kernel code
On Thu, Sep 11, 2014 at 5:53 PM, Miles MH Chen wrote:
> Not exactly, vmalloc'ed addresses can generate page faults.
>
> vmalloc'ed page entries live in the kernel master page table, not in
> every process' page table. When a vmalloc page fault occurs, the
> kernel simply copies the page table entry from the master page table
> to the current process' page table and fixes the page fault.
>
> MH

Thanks, I was under the wrong impression that the difference between pages
allocated via kmalloc vs. vmalloc is in the PTE mapping.

--
Leon Romanovsky | Independent Linux Consultant
www.leon.nu | l...@leon.nu
Re: Page fault in kernel code
Not exactly, vmalloc'ed addresses can generate page faults.

vmalloc'ed page entries live in the kernel master page table, not in
every process' page table. When a vmalloc page fault occurs, the
kernel simply copies the page table entry from the master page table
to the current process' page table and fixes the page fault.

MH

On Thu, Sep 11, 2014 at 8:03 PM, Leon Romanovsky wrote:
> On Wed, Sep 10, 2014 at 5:52 PM, Manavendra Nath Manav wrote:
> > Linux kernel memory is not page-able, but memory allocated through vmalloc
> > can still cause a page fault. How do device drivers using vmalloc handle this?
> Pages allocated via vmalloc call won't generate page-faults.
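The copy-from-master step described in this message can be modelled in user space. What follows is a toy sketch under stated assumptions, not the kernel's vmalloc_fault(): the names toy_pgd, PGD_ENTRIES and toy_vmalloc_fault are hypothetical, a table is reduced to a small array of entries, and the value 0 stands for "not present".

```c
#include <assert.h>
#include <stddef.h>

#define PGD_ENTRIES 8	/* toy size, chosen only for the example */

/* Toy page table: entry value 0 means "not present". */
struct toy_pgd {
	unsigned long entry[PGD_ENTRIES];
};

/*
 * Toy analogue of the fix-up described above: when the current process'
 * table lacks an entry that the master table has, copy it over.
 * Returns 0 if the fault was fixed, -1 if the master table has no
 * mapping either (a genuinely bad access) or the index is out of range.
 */
int toy_vmalloc_fault(struct toy_pgd *process, const struct toy_pgd *master,
		      size_t index)
{
	assert(process != NULL && master != NULL);

	if (index >= PGD_ENTRIES)
		return -1;
	if (master->entry[index] == 0)
		return -1;

	/* Only copy from the master table, nothing more. */
	process->entry[index] = master->entry[index];
	return 0;
}
```

A caller would populate one slot of the master table, leave the process table empty, and call toy_vmalloc_fault() for that slot. The call returns 0 and the process table then holds the same entry, which mirrors how the faulting access can be retried without any disk I/O.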
Re: Page fault in kernel code
On Wed, Sep 10, 2014 at 5:52 PM, Manavendra Nath Manav wrote:
> Linux kernel memory is not page-able, but memory allocated through vmalloc
> can still cause a page fault. How do device drivers using vmalloc handle this?

Pages allocated via vmalloc call won't generate page-faults.

--
Leon Romanovsky | Independent Linux Consultant
www.leon.nu | l...@leon.nu
Re: Page fault in kernel code
On 10-Sep-2014 6:24 pm, wrote:
> On Wed, 10 Sep 2014 14:45:23 +0530, Manavendra Nath Manav said:
>
> > But if the total RAM is limited (less than 896MB LOWMEM), for example as in
> > embedded devices, how can the kernel code be kept in RAM all the time? Am I
> > correct to assume that the kernel pre-fetches all pages when entering
> > kernel mode from user mode?
>
> No, kernel code is loaded by your boot loader, and *it stays there*. Similarly,
> if you modprobe something, the kernel allocates the page, loads the code,
> and leaves it there.
>
> Particularly in embedded devices, where you know all the modules the kernel may
> need, it's common to just create a kernel with everything built in, no module
> support, and when the system boots, it loads into memory and never moves again.

Linux kernel memory is not page-able, but memory allocated through vmalloc
can still cause a page fault. How do device drivers using vmalloc handle this?
Re: Page fault in kernel code
On Wed, 10 Sep 2014 14:45:23 +0530, Manavendra Nath Manav said:

> But if the total RAM is limited (less than 896MB LOWMEM), for example as in
> embedded devices, how can the kernel code be kept in RAM all the time? Am I
> correct to assume that the kernel pre-fetches all pages when entering
> kernel mode from user mode?

No, kernel code is loaded by your boot loader, and *it stays there*. Similarly,
if you modprobe something, the kernel allocates the page, loads the code,
and leaves it there.

Particularly in embedded devices, where you know all the modules the kernel may
need, it's common to just create a kernel with everything built in, no module
support, and when the system boots, it loads into memory and never moves again.
RE: Page fault in kernel code
On 09-Sep-2014 10:25 pm, "Jeff Haran" wrote:
> > While reading the book Essential Linux Device Drivers, I came across the
> > statement "user mode code is allowed to page fault, however, whereas
> > kernel mode code isn't".
> >
> > Why is it so? Why can't kernel mode code handle the page fault and reload
> > the page from swap? Also, can a page fault occur when the kernel is
> > executing in process context and/or interrupt context?
> >
> > -- manav m-n
>
> Think about handling the case where a page fault has occurred but the code
> that handles the page fault is itself not already in RAM, which leads to
> another page fault. Gets complicated. That complexity can be avoided by
> keeping all the kernel code in RAM all the time. Same applies to the kernel
> data that is needed to handle a page fault.
>
> Jeff Haran

But if the total RAM is limited (less than 896MB LOWMEM), for example as in
embedded devices, how can the kernel code be kept in RAM all the time? Am I
correct to assume that the kernel pre-fetches all pages when entering
kernel mode from user mode?
RE: Page fault in kernel code
From: kernelnewbies-bounces+jharan=bytemobile@kernelnewbies.org [mailto:kernelnewbies-bounces+jharan=bytemobile@kernelnewbies.org] On Behalf Of Manavendra Nath Manav
Sent: Tuesday, September 09, 2014 6:24 AM
To: kernelnewbies@kernelnewbies.org; feedb...@elinuxdd.com
Subject: Page fault in kernel code

> While reading the book Essential Linux Device Drivers, I came across the
> statement "user mode code is allowed to page fault, however, whereas
> kernel mode code isn't".
>
> Why is it so? Why can't kernel mode code handle the page fault and reload
> the page from swap? Also, can a page fault occur when the kernel is
> executing in process context and/or interrupt context?
>
> -- manav m-n

Think about handling the case where a page fault has occurred but the code
that handles the page fault is itself not already in RAM, which leads to
another page fault. Gets complicated. That complexity can be avoided by
keeping all the kernel code in RAM all the time. Same applies to the kernel
data that is needed to handle a page fault.

Jeff Haran
Re: Page fault in kernel code
On Tue, 09 Sep 2014 18:53:55 +0530, Manavendra Nath Manav said:

> Why is it so? Why can't kernel mode code handle the page fault and reload
> the page from swap? Also, can a page fault occur when the kernel is
> executing in process context and/or interrupt context?

There's no inherent chiseled-in-stone rule that says "the operating system's
kernel may not page fault", and in fact many operating systems allow it.
The IBM OS/360 family, starting with VS/1 and MVS (as OS/360's MFT and MVT
variants ran on hardware that didn't do virtual memory), clear through Z/OS
40 years later, all supported having part of their kernel be pageable. I've
worked with several Unix variants that allowed parts of the kernel to be
pageable.

But that's a design decision that adds little real benefit, especially on
today's large RAM systems - even a Raspberry Pi has enough memory that you
don't really need to worry about making the kernel pageable.

Cautionary tale: I once had a UTX/32 system that had routines for recovery
from disk errors (in particular, recovering and forwarding of bad blocks to
spare blocks was done by the host, *not* the device), and supported having
about 1/3 of the kernel code be pageable (this was in 1985 or so, and a
Powernode/9080 with 16M of RAM was a *big* system, so being able to put 500K
of a 1.5M kernel out on disk was a big win for performance).

I'll let you think about what sort of afternoon I had the day that we kept
hitting an I/O error on a bad block in the swap area (which quite reasonably
paused all I/O to the failing disk until the error recovery routine ran),
while the block-forwarder module was swapped out.

(And I've had to debug similar dork-ups in VS/1, VM/SP, and MUSIC as well.
Actually... hmm, yep. I think I've seen every single OS I've worked with in
3 decades that supported paged kernel end up shooting itself in the foot
because the wrong thing was paged out at the wrong time. That stuff is
*hard* to get right...)

That sort of thing is why Linus decided Just Say No. ;)
Re: Page fault in kernel code
On Tue, Sep 09, 2014 at 06:53:55PM +0530, Manavendra Nath Manav wrote:
> While reading the book Essential Linux Device Drivers, I came across the
> statement "user mode code is allowed to page fault, however, whereas
> kernel mode code isn't".
>
> Why is it so? Why can't kernel mode code handle the page fault and reload
> the page from swap? Also, can a page fault occur when the kernel is
> executing in process context and/or interrupt context?

That is just the way the Linux kernel is designed, no page faults within it,
unlike other operating systems. In the end, it makes kernel code much
simpler.

thanks,

greg k-h
Page fault in kernel code
While reading the book Essential Linux Device Drivers, I came across the
statement "user mode code is allowed to page fault, however, whereas
kernel mode code isn't".

Why is it so? Why can't kernel mode code handle the page fault and reload
the page from swap? Also, can a page fault occur when the kernel is
executing in process context and/or interrupt context?

-- manav m-n