Joel,

To make things clear, 896 MB is not a hardware limitation. The 3GB:1GB split
can be configured during the kernel build but the split cannot be changed
dynamically.

You are correct that ZONE_* refers to a grouping of physical memory, but the
very concept of zones is logical, not physical.

Now, why does ZONE_NORMAL have only 896MB on a 32-bit system?

If you recall the concept of virtual memory, you will remember that its aim
is to give each user process the illusion that it has the theoretical
maximum memory possible on that specific architecture, which is 4GB in this
case, and that it is the only process running on the system. The kernel
internally deals with pages, swapping pages in and out to create this
illusion. The advantage is that user processes do not have to care about
how much physical memory is actually present in the system.

So, out of this 4GB, it was decided by convention that 3GB is the process's
virtual address space and 1GB is the kernel's virtual address space. The
kernel maps the 3GB of each user process's virtual address space to physical
memory using that process's page tables. The kernel itself can address only
1GB of virtual addresses. Most of this 1GB is linearly mapped onto physical
memory at a fixed offset (virtual = physical + PAGE_OFFSET). The hardware
still goes through page tables for these addresses, but the mapping is set
up once at boot and never changes, so the kernel can convert between the two
with simple arithmetic. If the kernel wants to access physical memory beyond
that range, it has to kmap the high memory (ZONE_HIGHMEM), which sets up
temporary page table entries. So, you can imagine it like this: "Whenever a
context switch occurs, the 3GB virtual address space of the previously
running process is replaced by the virtual address space of the newly
selected process, and the 1GB always remains with the kernel." Note that all
of this is virtual (that is, conceptual); it is only an illusion.
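
To make the fixed-offset idea concrete, here is a minimal user-space sketch
of it. This is only a model, not the kernel's code: the function names are
made up for illustration, and the constants assume the default 3GB:1GB
split with 896MB of directly mapped memory. (In the kernel itself the
corresponding helpers are the __pa() and __va() macros.)

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for the default 3GB:1GB split on a 32-bit system. */
#define MODEL_PAGE_OFFSET  0xC0000000u             /* start of kernel virtual space */
#define MODEL_LOWMEM_BYTES (896u * 1024u * 1024u)  /* directly mapped physical RAM */

/* Linear map: virtual = physical + PAGE_OFFSET (valid for low memory only). */
static inline uint32_t model_phys_to_virt(uint32_t phys)
{
    assert(phys < MODEL_LOWMEM_BYTES);
    return phys + MODEL_PAGE_OFFSET;
}

/* Inverse of the linear map: physical = virtual - PAGE_OFFSET. */
static inline uint32_t model_virt_to_phys(uint32_t virt)
{
    assert(virt >= MODEL_PAGE_OFFSET);
    assert(virt - MODEL_PAGE_OFFSET < MODEL_LOWMEM_BYTES);
    return virt - MODEL_PAGE_OFFSET;
}

/* Nonzero if the physical address lies above the linear map,
 * i.e. the kernel would need kmap to reach it. */
static inline int model_is_highmem(uint64_t phys)
{
    return phys >= MODEL_LOWMEM_BYTES;
}
```

In this model, physical address 0x00100000 (1MB) corresponds to kernel
virtual address 0xC0100000, and any physical address at or above 0x38000000
(896MB) has no permanent kernel mapping.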

So, out of this 1GB of kernel virtual address space, the top 128MB (the
896MB - 1024MB range) is reserved by the kernel for vmalloc, kmap, etc.
That leaves 896MB that is linearly mapped onto the first 896MB of physical
memory. Within that, 0 - 16MB is set aside for device drivers that need DMA
(ZONE_DMA), which leaves 16MB - 896MB, and this range is "called"
ZONE_NORMAL.
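
As a rough illustration of these boundaries, the sketch below classifies a
physical address into a zone and works out how much of a machine's RAM
lands in each one. The enum and function names are hypothetical, and the
16MB and 896MB limits are simply the default values discussed in this
thread, not something read from a running kernel.

```c
#include <assert.h>
#include <stdint.h>

#define MB (1024ull * 1024ull)

/* Hypothetical names; they only mirror the zones discussed in this thread. */
enum model_zone { MODEL_ZONE_DMA, MODEL_ZONE_NORMAL, MODEL_ZONE_HIGHMEM };

/* Classify a physical address using the assumed default boundaries. */
static inline enum model_zone model_zone_of(uint64_t phys)
{
    if (phys < 16 * MB)
        return MODEL_ZONE_DMA;      /* 0 - 16MB */
    if (phys < 896 * MB)
        return MODEL_ZONE_NORMAL;   /* 16MB - 896MB */
    return MODEL_ZONE_HIGHMEM;      /* everything above 896MB */
}

/* How many MB of a machine with ram_mb of RAM fall into the given zone. */
static inline uint64_t model_zone_size_mb(enum model_zone zone, uint64_t ram_mb)
{
    uint64_t start, end;

    switch (zone) {
    case MODEL_ZONE_DMA:    start = 0;  end = 16;  break;
    case MODEL_ZONE_NORMAL: start = 16; end = 896; break;
    default:
        assert(zone == MODEL_ZONE_HIGHMEM);
        start = 896;
        end = ram_mb;               /* highmem runs to the end of RAM */
        break;
    }
    if (ram_mb <= start)
        return 0;                   /* the machine has no RAM in this zone */
    return (ram_mb < end ? ram_mb : end) - start;
}
```

With 4GB of RAM this model gives 16MB of ZONE_DMA, 880MB of ZONE_NORMAL and
3200MB of ZONE_HIGHMEM; with 512MB of RAM there is no high memory at all.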

Note the emphasis on the word "called" in the previous sentence.

In summary, the kernel can directly access only 896MB of physical RAM
because it has only 1GB of virtual address space available, of which the
top 128MB (896MB - 1024MB) is used to support kmap, vmalloc, etc. The lowest
16MB of that 896MB is used for DMA by device drivers. Note that this
limitation does not come from the hardware; it comes from the way the
virtual address space is divided into user address space and kernel address
space.

For example, you can make the split 2GB:2GB instead of 3GB:1GB. The kernel
then has 2GB of virtual address space, nearly all of it directly mapped to
physical memory (the vmalloc/kmap reserve still comes out of it). You can
also make the split 1GB:3GB instead of 3GB:1GB, as already explained.
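
The arithmetic behind the different splits can be sketched as follows. This
is a simplified model with hypothetical function names. It assumes the
split falls on a 1GB boundary and that the kernel keeps the default 128MB
reserve for vmalloc, kmap and I/O mappings; a real kernel build may use a
different reserve.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_ADDRESS_SPACE_MB 4096u  /* 32-bit virtual address space */
#define MODEL_RESERVE_MB       128u   /* assumed default vmalloc/kmap reserve */

/* Start of kernel virtual space when user space gets user_gb gigabytes. */
static inline uint32_t model_page_offset(unsigned user_gb)
{
    assert(user_gb >= 1 && user_gb <= 3);
    return (uint32_t)user_gb << 30;
}

/* MB of physical memory the kernel can map directly for a given split. */
static inline uint32_t model_lowmem_mb(unsigned user_gb)
{
    uint32_t kernel_mb = MODEL_ADDRESS_SPACE_MB - (model_page_offset(user_gb) >> 20);
    return kernel_mb - MODEL_RESERVE_MB;
}
```

Under these assumptions, model_lowmem_mb(3) is 896 (the default 3GB:1GB
split), model_lowmem_mb(2) is 1920 for 2GB:2GB, and model_lowmem_mb(1) is
2944 for 1GB:3GB. The 896MB figure is just what falls out of the default
split.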

Hope this clears the confusion.

Regards,
Venkatram Tummala


On Tue, Apr 6, 2010 at 1:01 PM, Joel Fernandes <[email protected]> wrote:

> Hi Peter,
>
> On Wed, Apr 7, 2010 at 1:14 AM, H. Peter Anvin <[email protected]> wrote:
> > On 04/06/2010 12:20 PM, Frank Hu wrote:
> >>>
> >>> The ELF ABI specifies that user space has 3 GB available to it.  That
> >>> leaves 1 GB for the kernel.  The kernel, by default, uses 128 MB for I/O
> >>> mapping, vmalloc, and kmap support, which leaves 896 MB for LOWMEM.
> >>>
> >>> All of these boundaries are configurable; with PAE enabled the user
> >>> space boundary has to be on a 1 GB boundary.
> >>>
> >>
> >> the VM split is also configurable when building the kernel (for 32-bit
> >> processors).
> >
> > I did say "all these boundaries are configurable".  Rather explicitly.
> >
>
> I thought the 896 MB was a hardware limitation on 32 bit architectures
> and something that cannot be configured? Or am I missing something
> here? Also the vm-splits refer to "virtual memory" . While ZONE_* and
> the 896MB we were discussing refers to "physical memory". How then is
> discussing about vm splits pertinent here?
>
> Thanks,
> -Joel
>
> --
> To unsubscribe from this list: send an email with
> "unsubscribe kernelnewbies" to [email protected]
> Please read the FAQ at http://kernelnewbies.org/FAQ
>
>
