[tech4all] Kernel Part -3

ch chandrashekar Mon, 09 Jan 2006 20:48:13 -0800

Memory Management

Random access memory (RAM) is a very critical component in any computer system. It's the one component that always seems to be in short supply on most systems. Unfortunately, most organizations' budgets don't allow for the purchase of all the memory that their technical staff feel is necessary to support all their projects. Luckily, UNIX allows us to execute all sorts of programs without, what appears at first glance to be, enough physical memory. This comes in very handy when the system is required to support a user community that needs to execute an organization's custom and commercial software to gain access to its data.

Memory chips are high-speed electronic devices that plug directly into your computer. Main memory is also called core memory by some technicians. Ever heard of a core dump? (Writing out main memory to a storage device for post-dump analysis.) Usually it is caused by a program or system crash or failure. An important aspect of memory chips is that they can store data at specific locations called addresses. This makes it quite convenient for another hardware device called the central processing unit (CPU) to access these locations to run your programs. The kernel uses a paging and segmentation arrangement to organize process memory. This is where the memory management subsystem plays a significant role. Memory management can be defined as the efficient managing and sharing of the system's memory resources by the kernel and user processes.

Memory management follows certain rules that manage both physical and virtual memory. Since we already have an idea of what a physical memory chip or card is, we will provide a definition of virtual memory. Virtual memory is where the addressable memory locations that a process can be mapped into are independent of the physical address space of the CPU. Generally speaking, a process can exceed the physical address space/size of main memory and still load and execute.

The systems administrator should be aware that just because she has a fixed amount of physical memory, she should not expect it all to be available to execute user programs. The kernel is always resident in main memory and depending upon the kernel's configuration (tunable-like kernel tables, daemons, device drivers loaded, and so on), the amount left over can be classified as available memory. It is important for the systems administrator to know how much available memory the system has to work with when supporting his environment. Most systems display memory statistics during boot time. If your kernel is larger than it needs to be to support your environment, consider reconfiguring a smaller kernel to free up resources.

We learned before that a process has a well-defined structure and has certain specific control data structures that the kernel uses to manage the process during its system lifetime. One of the more important data structures that the kernel uses is the virtual address space (vas in HP-UX and as in SVR4. For a more detailed description of the layout of these structures, look at the vas.h or as.h header files under /usr/include on your system.).

A virtual address space exists for each process and is used by the process to keep track of process logical segments or regions that point to specific segments of the process's text (code), data, u_area, user, and kernel stacks; shared memory; shared library; and memory mapped file segments. Per-process regions protect and maintain the number of pages mapped into the segments. Each segment has a virtual address space segment as well. Multiple programs can share the process's text segment. The data segment holds the process's initialized and uninitialized (BSS) data. These areas can change size as the program executes.

The u_area and kernel stack contain information used by the kernel, and are a fixed size. The user stack is contained in the u_area; however, its size will fluctuate during its execution. Memory mapped files allow programmers to bring files into memory and work with them while in memory. Obviously, there is a limit to the size of the file you can load into memory (check your system documentation). Shared memory segments are usually set up and used by a process to share data with other processes. For example, a programmer may want to be able to pass messages to other programs by writing to a shared memory segment and having the receiving programs attach to that specific shared memory segment and read the message. Shared libraries allow programs to link to commonly used code at runtime. Shared libraries reduce the amount of memory needed by executing programs because only one copy of the code is required to be in memory. Each program will access the code at that memory location when necessary.

When a programmer writes and compiles a program, the compiler generates the object file from the source code. The linker program (ld) links the object file with the appropriate libraries and, if necessary, other object files to generate the executable program. The executable program contains virtual addresses that are converted into physical memory addresses when the program is run. This address translation must occur prior to the program being loaded into memory so that the CPU can reference the actual code.

When the program starts to run, the kernel sets up its data structures (proc, virtual address space, per-process region) and begins to execute the process in user mode. Eventually, the process will access a page that's not in main memory (for instance, the pages in its working set are not in main memory). This is called a page fault. When this occurs, the kernel puts the process to sleep, switches from user mode to kernel mode, and attempts to load the page that the process was requesting to be loaded. The kernel searches for the page by locating the per-process region where the virtual address is located. It then goes to the segments (text, data, or other) per-process region to find the actual region that contains the information necessary to read in the page.

The kernel must now find a free page in which to load the process's requested page. If there are no free pages, the kernel must either page or swap out pages to make room for the new page request. Once there is some free space, the kernel pages in a block of pages from disk. This block contains the requested page plus additional pages that may be used by the process. Finally the kernel establishes the permissions and sets the protections for the newly loaded pages. The kernel wakes the process and switches back to user mode so the process can begin executing using the requested page. Pages are not brought into memory until the process requests them for execution. This is why the system is referred to as a demand paging system.

NOTE: The verb page means to move individual blocks of memory for a process between system memory and disk swap area. The pagesize is defined in the /usr/include/limits.h header file. For a definition of paging see RAM I/O.

The memory management unit is a hardware component that handles the translation of virtual address spaces to physical memory addresses. The memory management unit also prevents a process from accessing another process's address space unless it is permitted to do so (protection fault). Memory is thus protected at the page level. The Translation Lookaside Buffer (TLB) is a hardware cache that maintains the most recently used virtual address space to physical address translations. It is controlled by the memory management unit to reduce the number of address translations that occur on the system.

Input and Output Management

The simplest definition of input/output is the control of data between hardware devices and software. A systems administrator is concerned with I/O at two separate levels. The first level is concerned with I/O between user address space and kernel address space; the second level is concerned with I/O between kernel address space and physical hardware devices. When data is written to disk, the first level of the I/O subsystem copies the data from user space to kernel space. Data is then passed from the kernel address space to the second level of the I/O subsystem. This is when the physical hardware device activates its own I/O subsystems, which determine the best location for the data on the available disks.

The OEM (Original Equipment Manufacture) UNIX configuration is satisfactory for many work environments, but does not take into consideration the network traffic or the behavior of specific applications on your system. Systems administrators find that they need to reconfigure the systems I/O to meet the expectations of the users and the demands of their applications. You should use the default configuration as a starting point and, as experience is gained with the demands on the system resources, tune the system to achieve peak I/O performance.

UNIX comes with a wide variety of tools that monitor system performance. Learning to use these tools will help you determine whether a performance problem is hardware or software related. Using these tools will help you determine whether a problem is poor user training, application tuning, system maintenance, or system configuration. sar, iostat, and monitor are some of your best basic I/O performance monitoring tools.

sar The sar command writes to standard output the contents of selected cumulative activity counters in the operating system. The following list is a breakdown of those activity counters that sar accumulates.
- File access
- Buffer usage
- system call activity
- Disk and tape input/output activity
- Free memory and swap space
- Kernel Memory Allocation (KMA)
- Interprocess communication
- Paging
- Queue Activity
- Central Processing Unit (CPU)
- Kernel tables
- Switching
- Terminal device activity
iostat Reports CPU statistics and input/output statistics for TTY devices, disks, and CD-ROMs.
monitor Like the sar command, but with a visual representation of the computer state.

RAM I/O

The memory subsystem comes into effect when the programs start requesting access to more physical RAM memory than is installed on your system. Once this point is reached, UNIX will start I/O processes called paging and swapping. This is when kernel procedures start moving pages of stored memory out to the paging or swap areas defined on your hard drives. (This procedure reflects how swap files work in Windows by Microsoft for a PC.) All UNIX systems use these procedures to free physical memory for reuse by other programs. The drawback to this is that once paging and swapping have started, system performance decreases rapidly. The system will continue using these techniques until demands for physical RAM drop to the amount that is installed on your system. There are only two physical states for memory performance on your system: Either you have enough RAM or you don't, and performance drops through the floor.

Memory performance problems are simple to diagnose; either you have enough memory or your system is thrashing. Computer systems start thrashing when more resources are dedicated to moving memory (paging and swapping) from RAM to the hard drives. Performance decreases as the CPUs and all subsystems become dedicated to trying to free physical RAM for themselves and other processes.

This summary doesn't do justice, however, to the complexity of memory management nor does it help you to deal with problems as they arise. To provide the background to understand these problems, we need to discuss virtual memory activity in more detail.

We have been discussing two memory processes: paging and swapping. These two processes help UNIX fulfill memory requirements for all processes. UNIX systems employ both paging and swapping to reduce I/O traffic and execute better control over the system's total aggregate memory. Keep in mind that paging and swapping are temporary measures; they cannot fix the underlying problem of low physical RAM memory.

Swapping moves entire idle processes to disk for reclamation of memory, and is a normal procedure for the UNIX operating system. When the idle process is called by the system again, it will copy the memory image from the disk swap area back into RAM.

On systems performing paging and swapping, swapping occurs in two separate situations. Swapping is often a part of normal housekeeping. Jobs that sleep for more that 20 seconds are considered idle and may be swapped out at any time. Swapping is also an emergency technique used to combat extreme memory shortages. Remember our definition of thrashing; this is when a system is in trouble. Some system administrators sum this up very well by calling it "desperation swapping."

Paging, on the other hand, moves individual pages (or pieces) of processes to disk and reclaims the freed memory, with most of the process remaining loaded in memory. Paging employs an algorithm to monitor usage of the pages, to leave recently accessed pages in physical memory, and to move idle pages into disk storage. This allows for optimum performance of I/O and reduces the amount of I/O traffic that swapping would normally require.

NOTE: Monitoring what the system is doing is easy with the ps command. ps is a "process status" command on all UNIX systems and typically shows many idle and swapped-out jobs. This command has a rich amount of options to show you what the computer is doing, too many to show you here.

I/O performance management, like all administrative tasks, is a continual process. Generating performance statistics on a routine basis will assist in identifying and correcting potential problems before they have an impact on your system or, worst case, your users. UNIX offers basic system usage statistics packages that will assist you in automatically collecting and examining usage statistics.

You will find the load on the system will increase rapidly as new jobs are submitted and resources are not freed quickly enough. Performance drops as the disks become I/O bound trying to satisfy paging and swapping calls. Memory overload quickly forces a system to become I/O and CPU bound. However, once you identify the problem to be memory, you will find adding RAM to be cheaper than adding another CPU to your system.

Hard Drive I/O

Some simple configuration considerations will help you obtain better I/O performance regardless of your system's usage patterns. The factors to consider are the arrangement of your disks and disk controllers and the speed of the hard drives.

The best policy is to spread the disk workload as evenly as possible across all controllers. If you have a large system with multiple I/O back planes, split your disk drives evenly among the two buses. Most disk controllers allow you to daisy chain several disk drives from the same controller channel. For the absolute best performance, spread the disk drives evenly over all controllers. This is particularly important if your system has many users who all need to make large sequential transfers.

Small Computer System Interface (SCSI) devices are those that adhere to the American National Standards Institute (ANSI) standards for connecting intelligent interface peripherals to computers. The SCSI bus is a daisy-chained arrangement originating at a SCSI adapter card that interconnects several SCSI controllers. Each adapter interfaces the device to the bus and has a different SCSI address that is set on the controller. This address determines the priority that the SCSI device is given, with the highest address having the highest priority. When you load balance a system, always place more frequently accessed data on the hard drives with the highest SCSI address. Data at the top of the channel takes less access time, and load balancing increases the availability of that data to the system.

After deciding the best placement of the controllers and hard drives on your system, you have one last item for increasing system performance. When adding new disks, remember that the seek time of the disk is the single most important indicator of its performance. Different processes will be accessing the disk at the same time as they are accessing different files and reading from different areas at one time.

The seek time of a disk is the measure of time required to move the disk drive's heads from one track to another. Seek time is affected by how far the heads have to move from one track to another. Moving the heads from track to track takes less time that shifting those same drive heads across the entire disk. You will find that seek time is actually a nonlinear measurement, taking into account that the heads have to accelerate, decelerate, and then stabilize in their new position. This is why all disks will typically specify a minimum, average, and maximum seek time. The ratio of time spent seeking between tracks to time spent transferring data is usually at least 10 to 1. The lower the aggregate seek time, the greater your performance gain or improvement.

One problem with allowing for paging and swap files to be added to the hard disks is that some system administrators try to use this feature to add more RAM to a system. It does not work that way. The most you could hope for is to temporarily avert the underlying cause, low physical memory. There is one thing that a systems administrator can do to increase performance, and that is to accurately balance the disk drives.

Don't overlook the obvious upgrade path for I/O performance, tuning. If you understand how your system is configured and how you intend to use it, you will be much less likely to buy equipment you don't need or that won't solve your problem.

Yahoo! DSL Something to write home about. Just $16.99/mo. or less

SPONSORED LINKS

`Technical support`	`Computer security`	`Computer technical support`
`Computer training`	`Free computer technical support`

YAHOO! GROUPS LINKS

Visit your group "tech4all" on the web.
To unsubscribe from this group, send an email to: [EMAIL PROTECTED]
Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service.

[tech4all] Kernel Part -3

Memory Management

Input and Output Management

RAM I/O

Hard Drive I/O

Reply via email to