Re: ADMIN: end of nl.linux.org, lists will move
Hi Rik, Thanks for all your support in having this on a new name. New mailing alias will be *kernelnewbies.**kernelnewbies.org* and mail-id will be *kernelnewb...@**kernelnewbies.org*. Is that right ? Thanks, Prabhu On Tue, Dec 14, 2010 at 11:35 PM, Rik van Riel r...@surriel.com wrote: The university IT department which has graciously hosted nl.linux.org for the last several years is about to stop existing. I will be moving many of the nl.linux.org services to my own systems and will preserve the four mailing lists that still see occasional traffic. Those mailing lists will be hosted on the kernelnewbies.org mailman instance starting this Friday. The only thing you may need to change are your mail filters. -- To unsubscribe from this list: send an email with unsubscribe kernelnewbies to ecar...@nl.linux.org Please read the FAQ at http://kernelnewbies.org/FAQ
e100 driver's data sheet
Dear All, I am trying to understand and analyse the Ethernet driver for the device *03:01.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08)*. A Google search did not turn up a data sheet for this device. Could you please share a link or a PDF if you have the datasheet for this device? Thanks, Prabhu
Re: e100 driver's data sheet
Thanks a lot. Prabhu On Fri, Dec 10, 2010 at 3:33 PM, Mulyadi Santosa mulyadi.sant...@gmail.comwrote: On Fri, Dec 10, 2010 at 16:41, Prabhu nath gprabhun...@gmail.com wrote: Dear All, I am trying to understand and analyse Ethernet driver of the device 03:01.0 Ethernet controller: Intel Corporation 82557/8/9/0/1 Ethernet Pro 100 (rev 08). You could also download Qemu and see how it emulates e1000/100. The emulation code is somewhere inside hw folder of the extracted qemu tarball. -- regards, Mulyadi Santosa Freelance Linux trainer and consultant blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com
Invoking e100_open from userspace
Dear All, I am trying to analyse the ethernet and rionet drivers. Both drivers are registered with the kernel through register_netdev(). Each driver registers its own functions, such as e100_open() and rionet_open(). How can these functions be triggered from user space? Appreciate your help. Regards, Prabhu
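One way to reach a driver's open routine from user space is to bring the interface up: the kernel calls the device's registered open function (e100_open() for an e100 interface) when the IFF_UP flag is set, which is what `ifconfig eth0 up` does. Below is a minimal sketch of that path in C. The helper name bring_iface_up() is made up for illustration, the call normally needs root privileges, and the interface name is whatever the driver registered on your system.

```c
#define _GNU_SOURCE
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: set IFF_UP on the named interface.
 * Setting IFF_UP makes the kernel invoke the driver's open routine.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
int bring_iface_up(const char *ifname)
{
    struct ifreq ifr;
    int rc = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0); /* any socket works as an ioctl handle */

    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

    /* Read the current flags, then write them back with IFF_UP added. */
    if (ioctl(fd, SIOCGIFFLAGS, &ifr) == 0) {
        ifr.ifr_flags |= IFF_UP;
        if (ioctl(fd, SIOCSIFFLAGS, &ifr) == 0)
            rc = 0;
    }

    close(fd);
    return rc;
}
```

Calling bring_iface_up("eth0") as root on a machine whose eth0 is driven by e100 should end up in e100_open(); a printk in the driver is an easy way to confirm it.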
Re: is there still an active kernel janitors project?
Hi Robert, I am also interested. Regards, Prabhu On Sun, Nov 21, 2010 at 6:36 PM, Robert P. J. Day rpj...@crashcourse.ca wrote: i can suggest some very specific cleanups people can work on if they're bored. one related to lists: list_for_each() -> list_for_each_entry() calls that is, modifying the numerous (older-style) list_for_each() calls to the more convenient list_for_each_entry() calls, something that can be done little by little, a subsystem at a time. good way to jump into kernel janitor work and get your name in the log. rday -- Robert P. J. Day Waterloo, Ontario, CANADA http://crashcourse.ca Twitter: http://twitter.com/rpjday LinkedIn: http://ca.linkedin.com/in/rpjday
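To make the suggested cleanup concrete, here is a small userspace sketch of the two iteration styles. The macros are simplified stand-ins written for this example, not the kernel's <linux/list.h>, and struct item with its sum functions is hypothetical. The point is only that list_for_each_entry() folds the list_entry() step into the loop header.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's list helpers. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)

#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

#define list_for_each_entry(pos, head, member)                        \
    for (pos = list_entry((head)->next, __typeof__(*pos), member);    \
         &pos->member != (head);                                      \
         pos = list_entry(pos->member.next, __typeof__(*pos), member))

void list_add_tail(struct list_head *new, struct list_head *head)
{
    new->prev = head->prev;
    new->next = head;
    head->prev->next = new;
    head->prev = new;
}

/* Hypothetical payload type embedding a list node. */
struct item { int value; struct list_head node; };

/* Older style: iterate over raw nodes, convert each one by hand. */
int sum_old_style(struct list_head *head)
{
    struct list_head *pos;
    int sum = 0;

    list_for_each(pos, head) {
        struct item *it = list_entry(pos, struct item, node);
        sum += it->value;
    }
    return sum;
}

/* Newer style: the cursor is already the containing structure. */
int sum_new_style(struct list_head *head)
{
    struct item *it;
    int sum = 0;

    list_for_each_entry(it, head, node)
        sum += it->value;
    return sum;
}
```

Both functions walk the same list and return the same result; the second simply needs one variable and one line less per loop.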
/dev/mem
Dear All, Can you please clarify my doubts about /dev/mem? When I open /dev/mem, is the entire physical address space associated with /dev/mem, or only the system memory? * If I can mmap the kernel memory in read-write mode, I can screw up the whole kernel. Is that right? * Suppose I map a random page frame (let's assume it is a free page) from physical address 1abde000 to 1abdf000. Will the page allocator then refrain from allocating this page to any other task or to the kernel? Thanks, Prabhu
Re: /dev/mem
Thanks a lot for the clarification. Is that not a big hole that the kernel provides ?. If I have a root access, then I can spoil the whole system. Is there any motive for the kernel to support this ? Thanks, Prabhu On Wed, Oct 27, 2010 at 10:04 AM, Dave Hylands dhyla...@gmail.com wrote: Hi Prabhu, On Tue, Oct 26, 2010 at 9:19 PM, Prabhu nath gprabhun...@gmail.com wrote: Dear All, Can you please clarify my doubt on /dev/mem When I open /dev/mem, Is that entire physical address space is associated to /dev/mem or only the system memory ? * If I can mmap the kernel memory in read write mode, I can screw up the whole kernel. Is that right ? Absolutely. * Suppose I map a random memory page frame (let's assume it is a free page) from physical address 1abde000 to 1abdf000, then will the page allocator not allocate this page to to any other task or to the kernel ? Accessing through /dev/mem has no impact on the page allocator. Accessing memory through /dev/mem is the same thing as the kernel accessing that memory from within kernel space. You can access allocated pages, unallocated pages, device registers, pretty much anything at all. Writing indiscriminantly through /dev/mem is the same thing as writing indiscriminantly from within a driver. -- Dave Hylands Shuswap, BC, Canada http://www.DaveHylands.com/
Re: /dev/mem
Please see inline. Plz correct me if I am wrong. On Wed, Oct 27, 2010 at 10:26 AM, Mulyadi Santosa mulyadi.sant...@gmail.com wrote: On Wed, Oct 27, 2010 at 11:39, Prabhu nath gprabhun...@gmail.com wrote: Thanks a lot for the clarification. Is that not a big hole that the kernel provides ?. Naively...you can say so...but AFAIK it serves for some (legacy) purpose: - Wine...or DOSEMU..or dosbox...can't recall which one that used it once... - X server/ X org...once use it too... X server/X org uses the device memory to write frame data and if it wants system memory, it can legally ask kernel thru malloc/calloc. IMHO, I feel mem could have been restricted to device addresses rather than memory address. Though at this point I do not know whether it servers any bigger purpose. Thanks, Prabhu -- regards, Mulyadi Santosa Freelance Linux trainer and consultant blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com
Re: SIGALRM handler execution time
I have set ITIMER_REAL On Sun, Oct 24, 2010 at 4:57 AM, Vimal j.vi...@gmail.com wrote: Hi Prabhu, On 21 October 2010 23:56, Prabhu nath gprabhun...@gmail.com wrote: Dear All, There is a time difference t between the time t1-at which the pending signal is set and the time t2-at which exact signal handler is executed. Question: I need to calculate the time difference between t1 and t2. Have you set a virtual time interrupt (ITIMER_VIRTUAL) or real timer interrupt (ITIMER_REAL)? There are two components here: scheduler latency and the timer interrupt latency. Both add up to the latency perceived by the userspace program. Which one do you want to measure? If you're interested in the end latency and if you've set a ITIMER_REAL timer, it's pretty easy to measure the latency upto some precision. Before setting the timer, note the 'start' real time using gettimeofday. After the timer handler (after time interval dt) fires, note the real time 'end'. (end-start) - dt should be the required latency. -- Vimal
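The measurement Vimal describes can be sketched as follows: note the start time with gettimeofday, arm a one-shot ITIMER_REAL for dt, note the end time inside the handler, and report (end - start) - dt. The function name alarm_latency_us() is invented for this example. One caveat: gettimeofday is not on POSIX's list of async-signal-safe functions, although calling it from a handler works on Linux.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t fired;
static struct timeval t_end;

static void on_alarm(int signo)
{
    (void)signo;
    gettimeofday(&t_end, NULL);   /* 'end' time, taken inside the handler */
    fired = 1;
}

/*
 * Arms a one-shot ITIMER_REAL for dt_us microseconds (dt_us > 0) and
 * returns (end - start) - dt in microseconds, or -1 on error.
 */
long alarm_latency_us(long dt_us)
{
    struct sigaction sa;
    struct itimerval itv;
    struct timeval t_start;
    sigset_t block, old, wait_mask;

    if (dt_us <= 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* Block SIGALRM so it can only be delivered while we sit in sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    sigprocmask(SIG_BLOCK, &block, &old);
    wait_mask = old;
    sigdelset(&wait_mask, SIGALRM);

    memset(&itv, 0, sizeof(itv));
    itv.it_value.tv_sec = dt_us / 1000000;
    itv.it_value.tv_usec = dt_us % 1000000;

    fired = 0;
    gettimeofday(&t_start, NULL);  /* 'start' time, just before arming */
    if (setitimer(ITIMER_REAL, &itv, NULL) != 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    while (!fired)
        sigsuspend(&wait_mask);    /* sleep until the handler has run */
    sigprocmask(SIG_SETMASK, &old, NULL);

    return (t_end.tv_sec - t_start.tv_sec) * 1000000L
         + (t_end.tv_usec - t_start.tv_usec) - dt_us;
}
```

The value returned combines timer interrupt latency and scheduler latency, as noted in the reply; separating the two would need instrumentation inside the kernel.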
Profiling the kernel and applications
Dear All, I am interested in performing the following profiling: 1. MIPS used by kernel code execution. In particular, when CS[1:0] (the lower 2 bits of the CS register) is 00, I need to calculate the MIPS consumed over a stipulated time. 2. MIPS consumed by the interrupt code, inclusive of top-half and bottom-half. 3. MIPS consumed by the applications in total and MIPS consumed by each individual application. Please guide me towards some tools for profiling the above. Thanks, Prabhu
Re: Listing hardware devices, chipset info in the PC
Exactly this and few more for all the devices. This shows for the clock. Thanks, Prabhu On Mon, Oct 18, 2010 at 10:54 AM, Mulyadi Santosa mulyadi.sant...@gmail.com wrote: On Sun, Oct 17, 2010 at 20:15, Prabhu nath gprabhun...@gmail.com wrote: Dear All, lshw is the utilitiy to get the device details in the PC. But it will not list out timers if any, or chipset details. Will /sys filesystem (interface) provide the detail hardware information ? If yes, can you please help with proper link or is there any means to get hardware information through software. You mean this? $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource hpet acpi_pm -- regards, Mulyadi Santosa Freelance Linux trainer and consultant blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com
Listing hardware devices, chipset info in the PC
Dear All, *lshw* is the utility to get the device details of the PC. But it does not list timers, if any, or chipset details. Does the /sys filesystem (interface) provide detailed hardware information? If yes, can you please point me to a proper link, or is there any other means to get hardware information through software? Regards, Prabhu
Re: where does the stack of a process start
What I have understood is, the stack segment and the heap segment in the virtual address space of an application is allocated by the kernel and the starting address of these segments vary for every execution of a program ( on the premise that the program is not changed). Unlike the program's .text and .data sections where the starting address is defined by the linker script and will be same for a program unless one changes the program contents. Even I am interested in knowing the exact reason/algorithm adopted by the kernel. Regards, Prabhu On Sun, Oct 17, 2010 at 10:46 PM, Parmenides mobile.parmeni...@gmail.comwrote: Hi, According to ULK 3rd edition, the stack of a process start from 0xc000 and grow towards lower address. It is not the case in the following session: root [ ~ ]# ulimit -s unlimited root [ ~ ]# cat /proc/self/maps 08048000-0804c000 r-xp 03:01 272003 /bin/cat 0804c000-0804d000 rw-p 3000 03:01 272003 /bin/cat 0804d000-0806e000 rw-p 0804d000 00:00 0 [heap] 4000-4001a000 r-xp 03:01 176003 /lib/ld-2.5.1.so 4001a000-4001b000 r--p 00019000 03:01 176003 /lib/ld-2.5.1.so 4001b000-4001c000 rw-p 0001a000 03:01 176003 /lib/ld-2.5.1.so 40025000-40026000 rw-p 40025000 00:00 0 40026000-4014b000 r-xp 03:01 176002 /lib/libc-2.5.1.so 4014b000-4014d000 r--p 00125000 03:01 176002 /lib/libc-2.5.1.so 4014d000-4014e000 rw-p 00127000 03:01 176002 /lib/libc-2.5.1.so 4014e000-40152000 rw-p 4014e000 00:00 0 bf9f8000-bfa0e000 rw-p bf9f8000 00:00 0 [stack] e000-f000 r-xp 00:00 0 [vdso] It is obvious that the stack of cat's process start from 0xbfa0e000. Is there any explanation about this situation? p.s. The version of the linux is 2.6.22.5. -- To unsubscribe from this list: send an email with unsubscribe kernelnewbies to ecar...@nl.linux.org Please read the FAQ at http://kernelnewbies.org/FAQ
Setting a timer for less than 1 ms
Dear All, If HZ = 1000, then the timer generates an interrupt every 1 ms. Through the *timer_list* structure, I can register a timer function to execute at a minimum resolution of 1 ms. Is there any way to register a timer function in the kernel with a resolution of 500 microseconds? In other words, is there any way to register a timer function with a resolution finer than the 1/HZ time unit? Thanks, Prabhu
Re: Regarding device cycles
Hi Dave, Please clarify my questions/understanding written inline. Thanks, -Prabhu On Wed, Oct 13, 2010 at 7:34 AM, Dave Hylands dhyla...@gmail.com wrote: Hi Sri, On Tue, Oct 12, 2010 at 6:29 PM, Sri Ram Vemulpali sri.ram.gm...@gmail.com wrote: Hi Dave, Thanks for explanation. So in your explanation you mentioned bus arbiter. So, bus does have controller, which arbitrates between various devices. But CPU is given higher priority than any other device. DMA uses bus only when CPU is not using it, in other words DMA is not given access to bus. UART's are worked through CPU. So they no need to wait for I/O operations. In other words bus is contended for among various devices. Can you please provide more distinct explanation, between UARTs and DMA contention for bus. Thanks in advance. UARTs don't access the memory bus. Your driver accesses the UART registers and puts characters in the FIFO. The UART takes the data from the FIFO and puts them on the serial lines. No memory access required (by the UART). Of course the CPU had to read the characters from memory and put them into the UART, but that was the CPU accessing the memory bus, not the UART. I feel memory bus you are referring is system bus. In that case, plz clarify the following. CPU sending data to the UART registers/FIFO should be through the system bus, right? In that case UART device should be sitting on the system bus. On the return path, if there is data coming from the serial line, it will fill or half fill the FIFO based on the configuration, then the CPU has to read data from the FIFO of UART. This data has to be sent through the system bus. If all what I said is right, then UART should be accessing system/memory bus. If you're DMA'ing from a peripheral into memory, then the DMA engine needs to access the memory bus. The CPU also accesses the memory bus whenever a cache miss occurs, or uncached accesses occur to memory. 
If the DMA and CPU both try to access the memory bus at the same time, then the bus arbiter decides who gets to go first. If you DMA from one peripheral to another peripheral, then no memory accesses are required by DMA. On the other hand, if you DMA from one memory region to another memory region, then 2 memory accesses are required by the DMA engine for each word transferred. You can think of DMA as a very specialized type of CPU that can read and write blocks of memory, and it executes at the same time as the real CPU. As a more concrete example, let's consider the TI OMAP 3530. Here's a datasheet. http://focus.ti.com/lit/ds/symlink/omap3530.pdf Take a look at the diagram on page 7 (Figure 1-1). You can see the processor on the top (labelled MPU Subsystem ARM Cortex A8) and you can also see the External and Stacked Memories on the bottom. Basically each block on the diagram that has an arrow going from the block to the bus can access the memory directly. DMA and the CPU are just a couple of the many components which can access memory. In this particular chip the UART actually sits on a sub-bus, and can't get access to main memory. Generally speaking, each block on the diagram will have a clock, and will execute in parallel with the CPU (which is just one of many blocks). -- Dave Hylands Shuswap, BC, Canada http://www.DaveHylands.com/ -- To unsubscribe from this list: send an email with unsubscribe kernelnewbies to ecar...@nl.linux.org Please read the FAQ at http://kernelnewbies.org/FAQ
Re: multiboot compliancy of vmlinuz-2.6.34
Then, what is the interface specification to which both Linux and Grub comply to. How does Grub load the linux kernel to the memory and initiate execution of the Kernel code. Thanks, Prabhu On Wed, Oct 6, 2010 at 9:08 PM, Philip Downer p...@pjd.me.uk wrote: Prabhu nath wrote: I have built a kernel version 2.6.34 and have a file *vmlinuz-2.6.34* in /boot folder. Is this file multiboot comliant ?. If yes, then as per multiboot specification, first byte should be magic number 0x1BADB002. But when I read the first byte, it is 0xC5EA. unless a patch has been applied recently (within the last year as thats when I last looked) to make it so, Linux isn't multiboot compliant. If you search the lkml you'll find a few mails about it, I just don't think anyone was interested in making it compliant as they didn't see any benefit. Philip Downer
Re: Reading TLB Entries
Yes please. Nice to have your say on this. Thanks, Prabhu On Mon, Oct 11, 2010 at 9:37 AM, Mulyadi Santosa mulyadi.sant...@gmail.comwrote: On Mon, Oct 11, 2010 at 04:33, Rik van Riel r...@surriel.com wrote: On 10/10/2010 08:33 AM, Dragos Tatulea wrote: Even if you theoretically could, the act of calling a hypothetical TLB reading function would cause mappings of other code to be evicted from the TLB :) Rik, you always surprise me with your humble but slick analysis! I back you up! :) NB: if the above Rik's statement confuse you, let us know so we could elaborate it... -- regards, Mulyadi Santosa Freelance Linux trainer and consultant blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com -- To unsubscribe from this list: send an email with unsubscribe kernelnewbies to ecar...@nl.linux.org Please read the FAQ at http://kernelnewbies.org/FAQ
multiboot compliancy of vmlinuz-2.6.34
Dear All, I have built kernel version 2.6.34 and have a file *vmlinuz-2.6.34* in the /boot folder. Is this file multiboot compliant? If yes, then as per the multiboot specification, the file should start with the magic number 0x1BADB002. But when I read the start of the file, I see 0xC5EA. Please clarify. Regards, Prabhu
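A note on what to look for in the image: the Multiboot specification does not require the magic to be the very first word. The header may sit anywhere in the first 8192 bytes, 32-bit aligned, and its magic, flags and checksum fields must sum to zero. A Linux bzImage instead follows the Linux/x86 boot protocol, which places 0xAA55 at offset 0x1FE and the signature "HdrS" at offset 0x202. The sketch below checks a buffer for both; the function names are made up for this example, and it assumes a little-endian host such as x86.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MULTIBOOT_MAGIC   0x1BADB002u
#define MULTIBOOT_SEARCH  8192u   /* header must lie within the first 8 KiB */

/* Returns the byte offset of a valid Multiboot header, or -1 if none. */
long find_multiboot_header(const unsigned char *img, size_t len)
{
    size_t limit = len < MULTIBOOT_SEARCH ? len : MULTIBOOT_SEARCH;
    size_t off;

    for (off = 0; off + 12 <= limit; off += 4) {
        uint32_t w[3];                     /* magic, flags, checksum */

        memcpy(w, img + off, sizeof(w));   /* little-endian host assumed */
        if (w[0] == MULTIBOOT_MAGIC && (uint32_t)(w[0] + w[1] + w[2]) == 0)
            return (long)off;
    }
    return -1;
}

/* Returns 1 if the buffer carries the Linux/x86 boot protocol signatures. */
int looks_like_bzimage(const unsigned char *img, size_t len)
{
    if (len < 0x206)
        return 0;
    return img[0x1FE] == 0x55 && img[0x1FF] == 0xAA &&
           memcmp(img + 0x202, "HdrS", 4) == 0;
}
```

Reading the first 8 KiB of /boot/vmlinuz-2.6.34 into a buffer and passing it to both functions would show which of the two conventions the file follows.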
Re: mmap problem
Thank you very much. I was a bit over-confident about the addressing :). Now things are working fine, but I still wonder why it didn't throw a bus error. Now I am facing a new problem: * When I map the same physical address to a kernel virtual address using ioremap, it fails. I understand that ioremap does a sanity check to verify whether the physical address belongs to an IO device or to system memory. I am working on removing this sanity check. In the meantime, if you know an easy way to get around it, please help. Thanks, Prabhu On Tue, Oct 5, 2010 at 10:33 AM, Arun KS getaru...@gmail.com wrote: Hello Prabhu, On Mon, Oct 4, 2010 at 3:55 PM, Prabhu nath gprabhun...@gmail.com wrote: Dear All, I have 512 MB of RAM on an Intel desktop machine, of which the kernel uses 256 MB for all allocations, for the kernel as well as for user programs (by passing mem=256M as a boot parameter). Hence I have 256 MB of memory which I can treat as IO memory. System memory occupies 0x00000000 - 0x20000000 (512 MB) in the physical address space. Memory addresses 0x00000000 - 0x10000000 (256 MB) are used by the kernel memory management subsystem. Memory addresses 0x10000000 - 0x20000000 (256 MB) are used as IO memory. For an experiment, I used mmap() to map a page (4K) in IO memory (page base address 0x20002000) to a user virtual address. This physical address is above your 512MB ram address. Arun I used the kernel function remap_pfn_range() in my kernel module's mmap function. This correctly maps the physical page to a user virtual address. But when I write to that address and then read it back, I get a junk value. Just to verify, when I mapped the VGA controller memory to a user virtual address, things worked fine. Can you please help me to resolve this problem. Regards, Prabhu
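Arun's observation is plain address arithmetic: 512 MB of RAM ends at physical address 0x20000000, so a page at 0x20002000 lies just past the end of memory, while 0x10002000 would fall inside the 256 MB left unused by mem=256M. A tiny sketch of that check follows; the constants mirror the layout described in the thread and the function name is hypothetical.

```c
#include <stdint.h>

#define MiB           (1024u * 1024u)
#define RAM_SIZE      (512u * MiB)   /* 0x20000000: end of installed RAM */
#define KERNEL_LIMIT  (256u * MiB)   /* 0x10000000: end of memory given to the kernel by mem=256M */

/* Returns 1 if [phys, phys + size) lies in the RAM left unused by the kernel. */
int in_reserved_ram(uint32_t phys, uint32_t size)
{
    return phys >= KERNEL_LIMIT &&
           phys < RAM_SIZE &&
           size <= RAM_SIZE - phys;
}
```

With these constants, in_reserved_ram(0x20002000, 4096) is false and in_reserved_ram(0x10002000, 4096) is true, which matches the behaviour reported after the address was corrected.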
Re: mmap problem
On inspecting the flags of the page table entry that holds the page base address (20002000), I inferred that the page was not dirty (i.e. the page was not written or updated). The flags value was 0x237 (refer to arch/x86/include/asm/pgtable_types.h in the 2.6.34 source). *#define _PAGE_BIT_PRESENT 0 /* is present */* #define _PAGE_BIT_RW 1 /* writeable */ #define _PAGE_BIT_USER 2 /* userspace addressable */ #define _PAGE_BIT_PWT 3 /* page write through */ #define _PAGE_BIT_PCD 4 /* page cache disabled */ #define _PAGE_BIT_ACCESSED 5 /* was accessed (raised by CPU) */ *#define _PAGE_BIT_DIRTY 6 /* was written to (raised by CPU) */* #define _PAGE_BIT_UNUSED1 9 /* available for programmer */ The following bit positions are set in 0x237: 0 1 2 4 5 9. When I write to the virtual address, the page should become dirty, so I expect bit position 6 (highlighted above) to be set, but it is not. *Which function is responsible for updating the flags in the page table entry?* Thanks, Prabhu On Mon, Oct 4, 2010 at 12:25 PM, Prabhu nath gprabhun...@gmail.com wrote: Dear All, I have 512 MB of RAM on an Intel desktop machine, of which the kernel uses 256 MB for all allocations, for the kernel as well as for user programs (by passing mem=256M as a boot parameter). Hence I have 256 MB of memory which I can treat as IO memory. System memory occupies 0x00000000 - 0x20000000 (512 MB) in the physical address space. Memory addresses 0x00000000 - 0x10000000 (256 MB) are used by the kernel memory management subsystem. Memory addresses 0x10000000 - 0x20000000 (256 MB) are used as IO memory. For an experiment, I used *mmap()* to map a page (4K) in IO memory (page base address 0x20002000) to a user virtual address. I used the kernel function *remap_pfn_range()* in my kernel module's mmap function. This correctly maps the physical page to a user virtual address. But when I write to that address and then read it back, I get a junk value. 
To just verify, when I mapped the VGA controller memory to user virtual address things are working fine. Can you please help me to resolve this problem. Regards, Prabhu
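Decoding a flags value by hand is error-prone, so a small helper can do it. The sketch below uses the bit positions quoted in the message from pgtable_types.h; the table and function names are made up for this example.

```c
#include <stdio.h>

/* Bit positions as quoted from arch/x86/include/asm/pgtable_types.h (2.6.34). */
static const struct { int bit; const char *name; } pte_bits[] = {
    { 0, "PRESENT"  },
    { 1, "RW"       },
    { 2, "USER"     },
    { 3, "PWT"      },
    { 4, "PCD"      },
    { 5, "ACCESSED" },
    { 6, "DIRTY"    },
    { 9, "UNUSED1"  },
};

/* Returns 1 if the given bit position is set in flags, 0 otherwise. */
int pte_bit_set(unsigned long flags, int bit)
{
    return (int)((flags >> bit) & 1UL);
}

/* Prints one line per known flag bit. */
void print_pte_flags(unsigned long flags)
{
    size_t i;

    for (i = 0; i < sizeof(pte_bits) / sizeof(pte_bits[0]); i++)
        printf("%-9s (bit %d): %d\n", pte_bits[i].name, pte_bits[i].bit,
               pte_bit_set(flags, pte_bits[i].bit));
}
```

For the value 0x237, pte_bit_set() reports PRESENT, RW, USER, PCD, ACCESSED and UNUSED1 as set, and PWT and DIRTY as clear.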
Re: Reading TLB Entries
I need to read it for intel x86. I looked in tlb.h file and couldn't find any supporting functions. On mips architecture, I found some supporting functions like tlb_probe, tlb_read... Thanks, Prabhu On Mon, Oct 4, 2010 at 4:22 PM, Rajat Jain rajatj...@juniper.net wrote: This is architecture specific question. On powerpc for example, you can read it via special purpose registers MAS0 – MAS3. -- *From:* kernelnewbies-bou...@nl.linux.org [mailto: kernelnewbies-bou...@nl.linux.org] *On Behalf Of *Prabhu nath *Sent:* Monday, October 04, 2010 4:11 PM *To:* kernelnewbies *Subject:* Reading TLB Entries Is there any means to read TLB entries ? Thanks, Prabhu
mmap problem
Dear All, I have 512 MB of RAM on an Intel desktop machine, of which the kernel uses 256 MB for all allocations, for the kernel as well as for user programs (by passing mem=256M as a boot parameter). Hence I have 256 MB of memory which I can treat as IO memory. System memory occupies 0x00000000 - 0x20000000 (512 MB) in the physical address space. Memory addresses 0x00000000 - 0x10000000 (256 MB) are used by the kernel memory management subsystem. Memory addresses 0x10000000 - 0x20000000 (256 MB) are used as IO memory. For an experiment, I used *mmap()* to map a page (4K) in IO memory (page base address 0x20002000) to a user virtual address. I used the kernel function *remap_pfn_range()* in my kernel module's mmap function. This correctly maps the physical page to a user virtual address. But when I write to that address and then read it back, I get a junk value. Just to verify, when I mapped the VGA controller memory to a user virtual address, things worked fine. Can you please help me to resolve this problem. Regards, Prabhu
Re: mmap problem
Dear All, Attaching my experiment files. 1. kmmap.c = Kernel module 2. mmap.c = application 3. Makefile 4. kscript.sh = shell script which will build the kernel module, does insmod and execute the application. I have PFN defined here to experiment with IO memory as well as VGA Controller memory. The rest of the experiments are as written in the below mail thread. Please check these files, and let me know if I have missed anything. Thanks, Prabhu On Mon, Oct 4, 2010 at 2:46 PM, Prabhu nath gprabhun...@gmail.com wrote: On inspecting the flags of the PageTable entry that holds the page base address (20002000) I inferred that the page was not dirty (i.e. the page was not written or updated). The following flags were set. 0x237 (Refer. arch/x86/include/asm/pgtable_types.h from the source code 2.6.34). *#define _PAGE_BIT_PRESENT 0 /* is present */* #define _PAGE_BIT_RW1 /* writeable */ #define _PAGE_BIT_USER 2 /* userspace addressable */ #define _PAGE_BIT_PWT 3 /* page write through */ #define _PAGE_BIT_PCD 4 /* page cache disabled */ #define _PAGE_BIT_ACCESSED 5 /* was accessed (raised by CPU) */ *#define _PAGE_BIT_DIRTY 6 /* was written to (raised by CPU) */* #define _PAGE_BIT_UNUSED1 9 /* available for programmer */ The following bit positions are set. 1 2 3 4 5 9 When I write into the virtual address, then the page should be dirty and I expect bit position 6 (blocked above) to be set, but I see that it is not set. *Which function is responsible to update the flags in the page table entry. ?* Thanks, Prabhu On Mon, Oct 4, 2010 at 12:25 PM, Prabhu nath gprabhun...@gmail.comwrote: Dear All, I have an 512 MB RAM on an Intel desktop machine, of which Kernel uses 256M for all allocation for kernel as well as for user programs. (by passing mem=256M as a boot parameter). Hence I have 256MB of memory which I can treat it as IO memory. System memory is associated from 0x - 0x2000 (512MB) in the physical address space. 
Memory addresses from 0x - 0x1000 (256 MB) are used by the Kernel - memory management. Subsystem Memory addresses from 0x1000 - 2000 (256MB) is used as IO memory For an experiment, I used *mmap()* to map a page (4K) in IO memory (page base address 0x20002000) to user virtual address I used kernel function *remap_pfn_range()* in my kernel module's mmap function. This is rightly mapping the physical page to a user virtual address. But when I write to that address and then read. I get junk value. To just verify, when I mapped the VGA controller memory to user virtual address things are working fine. Can you please help me to resolve this problem. Regards, Prabhu kmmap.c Description: Binary data kscript.sh Description: Bourne shell script Makefile Description: Binary data mmap.c Description: Binary data
Re: Reducing the physical memory for the allocator
Thanks for your kind replies. Prabhu On Fri, Oct 1, 2010 at 11:18 AM, Dave Hylands dhyla...@gmail.com wrote: Hi Prabhu, On Thu, Sep 30, 2010 at 11:29 PM, Prabhu nath gprabhun...@gmail.com wrote: Dear All, I am trying to experiment the following. * I have a 1 GB of RAM and running Linux 2.6.34 on a Intel Machine. * I want memory allocator to get only 768 MB of RAM. * The rest 256 MB of RAM, Kernel should see it has IO memory Is there any option that I can pass to the kernel so that it only takes 768 MB of RAM. I believe that you can pass mem=768M on the command line -- Dave Hylands Shuswap, BC, Canada http://www.DaveHylands.com/
Re: kmap and page
By page, I believe you are referring to struct page. kmap is a kernel function which takes a struct page pointer as its parameter and returns a kernel virtual address. On a typical x86 machine with 1 GB RAM, kmap works like this. If the struct page describes a physical page below 896 MB, kmap simply adds PAGE_OFFSET (0xC0000000) to the physical address (VA = PA + PAGE_OFFSET) and returns the result, because in a typical 3G/1G partition of the linear virtual address space, the first 896 MB of the kernel virtual address space is directly mapped to physical memory. If the struct page describes a physical page in the HIGHMEM region, i.e. above 896 MB, kmap maps this physical page into a kernel virtual address window reserved for such mappings (the persistent kmap area, which sits next to the vmalloc or non-contiguous address region) and returns that virtual address. Regards, Prabhu On Fri, Oct 1, 2010 at 11:22 AM, Sengottuvelan S sengottuvela...@gmail.com wrote: Hi I am new to this forum. I have a specific question on kmap and page. I would like to know how these two are related to each other. Thanks in advance.
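The direct-mapped case is just an addition, which the following sketch spells out for the usual 32-bit x86 3G/1G split. The constants are the conventional values for that configuration (PAGE_OFFSET of 0xC0000000, 896 MB of low memory), and the function names are invented for this example; they are not kernel APIs.

```c
#include <stdint.h>

#define PAGE_OFFSET   0xC0000000u                /* start of kernel space in a 3G/1G split */
#define LOWMEM_LIMIT  (896u * 1024u * 1024u)     /* 0x38000000: end of directly mapped memory */

/* Returns 1 if the physical address is in the directly mapped (low) region. */
int is_lowmem(uint32_t phys)
{
    return phys < LOWMEM_LIMIT;
}

/* VA = PA + PAGE_OFFSET; only meaningful when is_lowmem(phys) is true. */
uint32_t lowmem_virt(uint32_t phys)
{
    return phys + PAGE_OFFSET;
}
```

For example, physical address 0x00100000 (1 MB) appears at kernel virtual address 0xC0100000, while a page at 0x38000000 or above has no such fixed address and needs a temporary mapping.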
Reducing the physical memory for the allocator
Dear All, I am trying to experiment the following. * I have a 1 GB of RAM and running Linux 2.6.34 on a Intel Machine. * I want memory allocator to get only 768 MB of RAM. * The rest 256 MB of RAM, Kernel should see it has IO memory Is there any option that I can pass to the kernel so that it only takes 768 MB of RAM. Request you to help me in this regard. Thanks, Prabhu
Re: Regarding IDT
ULK = Understanding Linux Kernel 3rd edition - Bovet and Cesati On Mon, Sep 20, 2010 at 10:22 AM, Sri Ram Vemulpali sri.ram.gm...@gmail.com wrote: Here is the Intel system programming guide of x86. By the way what is ULK haven't come across it. Thanks, Sri. On Mon, Sep 20, 2010 at 12:07 AM, Prabhu nath gprabhun...@gmail.comwrote: There is a macro called *common_interrupt* which calls do_irq. arch/x86/kernel/entry_32.S Please refer ULK page 162 for more description. Can you please share the data sheet of x86. Regards, Prabhu On Mon, Sep 20, 2010 at 7:20 AM, Sri Ram Vemulpali sri.ram.gm...@gmail.com wrote: Thanks for the replies. I have another question. Thanks in advance for clarifications. I was wondering who calls the do_irq function. I mean when interrupts occurs processor should handle it by interrupting current task. So, when interrupts occurs in linux kernel, then, did processor jumps directly to do_irq or does it executes any other function before entering in to do_irq. If processor directly jumps to do_irq, then how does processor knows what to execute when interrupts happens. I mean, Is there anyway to set in linux kernel to tell processor to jump to this location when an interrupt or exception happens. I am not talking about interrupt subsystem, which is done in do_irq, checking what interrupt occurred by reading bus, and calling appropriate handler. Let me know whether my understanding is right. Thanks, Sri. On Sun, Sep 19, 2010 at 7:27 AM, arshad hussain arshad.su...@gmail.comwrote: On 9/19/2010 4:49 AM, Sri Ram Vemulpali wrote: Hi all, This question is regarding Interrupt descriptor table. Why is the IDTR 48-bits wide and 16Bit Limit + 32 bit Address = 48bits of IDTR. why do we need limit field in the IDTR. Because if we access beyond defined interrupt there will be general protection fault. Since we know there are 256 interrupts or exceptions possible, can't we know boundary by deriving it by length of IDT field. All interrupts are not always defined. 
There may be fewer interrupts defined depending upon the requirements. Looking up the 'limits' field is faster less error prone than find the length of the IDT, which i guess could only be done via probing for all slots with has present flag set to 0. Also, why is the IDT entry is 8 bytes long. This 8 byte data structure is explained in intel's manual. And how is the interrupt line sharing is provided. Is sharing provided at OS code level. I did not see any explanation of sharing at Intel manual (data sheet of x86 system programming guide). Any thoughts. Please clarify. Thanks. -- Regards, Sri. Thanks. -- To unsubscribe from this list: send an email with unsubscribe kernelnewbies to ecar...@nl.linux.org Please read the FAQ at http://kernelnewbies.org/FAQ -- Regards, Sri. -- Regards, Sri.
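The sizes discussed in this thread can be written down as C structures. This is a sketch of the 32-bit protected-mode layout as documented in Intel's system programming guide; the structure and function names are chosen for this example and are not the kernel's own definitions. The limit field holds the table size in bytes minus one, and the processor raises a general protection fault for any vector whose 8-byte entry would extend past it.

```c
#include <stdint.h>

/* Operand of lidt/sidt on 32-bit x86: 16-bit limit + 32-bit base = 48 bits. */
struct idtr32 {
    uint16_t limit;        /* size of the IDT in bytes, minus 1 */
    uint32_t base;         /* linear address of the first entry */
} __attribute__((packed));

/* One 8-byte interrupt or trap gate. */
struct idt_gate32 {
    uint16_t offset_low;   /* handler address, bits 0..15 */
    uint16_t selector;     /* code segment selector */
    uint8_t  reserved;     /* must be zero */
    uint8_t  type_attr;    /* present bit, DPL, gate type */
    uint16_t offset_high;  /* handler address, bits 16..31 */
} __attribute__((packed));

/* Limit value for a table holding nr_entries gates. */
uint16_t idt_limit(unsigned nr_entries)
{
    return (uint16_t)(nr_entries * sizeof(struct idt_gate32) - 1);
}

/* Reassembles the handler address the CPU jumps to for this gate. */
uint32_t gate_offset(const struct idt_gate32 *g)
{
    return (uint32_t)g->offset_low | ((uint32_t)g->offset_high << 16);
}
```

A full table of 256 gates gives a limit of 2047 (0x7FF). A smaller table simply gets a smaller limit, so the CPU can reject out-of-range vectors without scanning any entries.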
Re: offset problem
In the example, there are 3 arguments taken from the command line. They are 1. the name of the file 2. the offset within the file from where the contents have to be printed 3. the length in bytes, starting at the offset, to be printed to stdout. Let us take some input for our understanding. Let the size of the file be 6500 bytes. 1. name of the file is foo.txt 2. offset is 5000 3. length = 1000 bytes. As per the user input, mmap should now make the file contents from offset 5000 to 6000 (offset + length) available in memory and map this memory to a virtual address (symbol *addr* in the program) of the current task. Since the offset parameter of mmap must be page aligned, pa_offset will be 4096 (0x1000), and thus mmap will map the file contents from offset 4096 to 6000. The mmap parameters will be as follows, considering a page size of 4K: addr = mmap(NULL, 1000+5000-4096, PROT_READ, MAP_PRIVATE, fd, 4096). The virtual address returned, say 0x08054000, gives access to the file data starting at offset 4096, but the program has to print 1000 (length) bytes of contents from offset 5000, i.e. from the virtual address 0x08054000 + 904 (5000 - 4096), which is effectively (addr + offset - pa_offset). That is the 2nd parameter of the write system call: write(STDOUT_FILENO, 0x08054000 + 5000 - 4096, 1000). Hope this clarifies. Regards, Prabhu On Mon, Sep 20, 2010 at 2:29 PM, mohit verma mohit89m...@gmail.com wrote: hi all, i was reading the manual of mmap() and got stuck at this line: s = write(STDOUT_FILENO, addr + offset - pa_offset, length); can anyone please describe what is going on in this line?
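The arithmetic above can be sketched in C as follows. It follows the same pattern as the man page example but is written as a standalone helper; the names page_align_down() and dump_range() are made up for this illustration. The caller is assumed to have checked that offset + length does not exceed the file size.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Rounds offset down to a multiple of page_size (a power of two). */
off_t page_align_down(off_t offset, long page_size)
{
    return offset & ~((off_t)page_size - 1);
}

/*
 * Writes `length` bytes of file `path`, starting at `offset`, to out_fd.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t dump_range(const char *path, off_t offset, size_t length, int out_fd)
{
    long page_size = sysconf(_SC_PAGE_SIZE);
    off_t pa_offset = page_align_down(offset, page_size);
    size_t delta = (size_t)(offset - pa_offset);   /* 904 for offset 5000, 4K pages */
    ssize_t n;
    char *addr;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    /* Map from the page boundary, so the mapping is delta bytes longer. */
    addr = mmap(NULL, length + delta, PROT_READ, MAP_PRIVATE, fd, pa_offset);
    close(fd);                                     /* the mapping stays valid */
    if (addr == MAP_FAILED)
        return -1;

    /* Skip the delta bytes that were mapped only to satisfy alignment. */
    n = write(out_fd, addr + delta, length);
    munmap(addr, length + delta);
    return n;
}
```

With a 4K page size, dump_range("foo.txt", 5000, 1000, STDOUT_FILENO) maps 1904 bytes from file offset 4096 and writes the last 1000 of them.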
Re: offset problem
Here are my thoughts. Plz correct me if I am wrong. In the given example 5000 is the offset within the file and not the address. Given, the offset 5000, kernel anyway could allocate a physical page and place the contents from the starting of the page and map it to the virtual address. For E.g. it could allocate a physical page of base address 0x2abd1000 and copy the 1000 bytes of contents from the offset 5000 and map the physical address to some virtual address say 0x08054000. Q. How does the offset not being page aligned affect the execution ? The 2nd point wrt fragmentation is convincing. Regards, Prabhu On Mon, Sep 20, 2010 at 10:27 PM, Mulyadi Santosa mulyadi.sant...@gmail.com wrote: hi.. On Mon, Sep 20, 2010 at 19:30, mohit verma mohit89m...@gmail.com wrote: thanks guys, but the problem is : what is the need of page aligned address? as we have given starting address 5000 and the length 1000 ( in the example by prabhu) so it should start from that location. I think it is due to two reasons: 1. on some architecture, non page aligned access could mean trap/exception. Well, x86 is immune in this case...but if you wanna make your code safely portable, I think it's a must. 2. By making it page aligned, at least you reduce fragmentation. How come? Let's assume you just have 8K address space. You allocate a memory starting at 3K as large as 4K. That means you occupy 3K up to 7K address. What's left? 0-1 K and 7-8 K, right? And what if you wanna allocate another 4K? Yes, a page can fit that...but not mapping. We can't split below page granularity, so at least we need two page. one is mapped to 0-1 K, the other is for 7-8K. It's 2 page vs 3 page case. Yes, there's slab...but if we can allocate straight in page size granularity, things would be simpler, right? 
-- regards, Mulyadi Santosa Freelance Linux trainer and consultant blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com
Re: Regarding IDT
There is an assembly entry point called *common_interrupt*, in arch/x86/kernel/entry_32.S, which calls do_IRQ. Please refer to ULK page 162 for more description. Can you please share the data sheet of x86?

Regards, Prabhu

On Mon, Sep 20, 2010 at 7:20 AM, Sri Ram Vemulpali sri.ram.gm...@gmail.com wrote:

Thanks for the replies. I have another question. Thanks in advance for clarifications. I was wondering who calls the do_IRQ function. When an interrupt occurs, the processor should handle it by interrupting the current task. So, when an interrupt occurs in the Linux kernel, does the processor jump directly to do_IRQ, or does it execute some other function before entering do_IRQ? If the processor jumps directly to do_IRQ, how does it know what to execute when an interrupt happens? I mean, is there a way in the Linux kernel to tell the processor to jump to this location when an interrupt or exception happens? I am not talking about the interrupt subsystem work done in do_IRQ, i.e. checking which interrupt occurred by reading the bus and calling the appropriate handler. Let me know whether my understanding is right.

Thanks, Sri.

On Sun, Sep 19, 2010 at 7:27 AM, arshad hussain arshad.su...@gmail.com wrote:

On 9/19/2010 4:49 AM, Sri Ram Vemulpali wrote:

Hi all, This question is regarding the Interrupt Descriptor Table. Why is the IDTR 48 bits wide (16-bit limit + 32-bit address = 48 bits of IDTR)? Why do we need the limit field in the IDTR? Because if we access beyond the defined interrupts, there will be a general protection fault. Since we know there are 256 interrupts or exceptions possible, can't we know the boundary by deriving it from the length of the IDT?

All interrupts are not always defined. There may be fewer interrupts defined, depending upon the requirements. Looking up the 'limit' field is faster and less error-prone than finding the length of the IDT, which I guess could only be done by probing all slots for ones that have the present flag set to 0.

Also, why is the IDT entry 8 bytes long?
This 8-byte data structure is explained in Intel's manual.

And how is interrupt line sharing provided? Is sharing provided at the OS code level? I did not see any explanation of sharing in the Intel manual (the x86 system programming guide). Any thoughts? Please clarify. Thanks.

-- Regards, Sri.

Thanks.

-- Regards, Sri.
Re: A question
Interrupt context is a loaded term. It is used for both top-half and bottom-half processing. Linux distinguishes two parts in the top half: the interrupt handler (IH) and the interrupt service routine (ISR). The interrupt handler is standard kernel code that is executed by the processor once the interrupt is generated. In other words, this generic code is executed for all generated interrupts. In general, it saves the current processor context and paves the way for the execution of the ISR. The ISR is a routine written by a device driver programmer to handle the interrupt of a device. It is mainly responsible for checking the status register of the device to find the cause of the interrupt and acting accordingly.

Neither the IH nor the ISR has a context of its own. Interrupts generated by a hardware device are asynchronous, i.e. they can be generated at any point in time. When such an interrupt is generated, the current execution is interrupted, control is transferred to the IH, and then the ISR is executed. Thus we can infer that both the IH and the ISR execute in an anonymous context.

For example, consider three tasks T1, T2 and T3. Suppose T2 invokes a read system call to read data from secondary media (say a hard disk). The hard disk driver will instruct the device to fetch the data, put task T2 on a wait queue and invoke the scheduler. Now the scheduler picks, say, T3. While the processor is executing task T3, the hard disk controller generates an interrupt indicating that the data is ready. At this point the IH and ISR are executed. The IH and ISR of this interrupt have no relevance to task T3, yet they are executed in the context of T3. Hence we say that the IH and ISR always execute in an anonymous context.

Most bottom-half execution happens the same way, i.e. it executes in an anonymous context unless the bottom-half work is delegated to dedicated kernel threads.
In the latter case the bottom half has its own context. For more details about the top half and bottom half, refer to ULK, 3rd edition, by Bovet, Chapter 4.

Regards, Prabhu

On Wed, Sep 1, 2010 at 9:08 PM, Hiren Panchasara hiren.panchas...@gmail.com wrote:

Process context is schedulable but interrupt context is not. Why and how? Any examples I can look at? Thanks.
malloc memory region descriptors
Dear All, In the Linux kernel, for every memory allocation done by *vmalloc*, the kernel maintains a memory region descriptor (*vm_struct*) which stores information such as the linear virtual address range and the number of physical page frames allocated. These descriptors are kept in a linked list headed by the *vmlist* symbol. Can you please tell me how the kernel maintains information about memory allocations done by malloc/calloc invoked by a user application? I understand that the kernel maintains a descriptor (*vm_area_struct*) per application to hold information about the virtual address space allocated for the heap region.

Regards, Prabhu
Re: malloc memory region descriptors
On Mon, Aug 16, 2010 at 1:30 PM, Prabhu nath gprabhun...@gmail.com wrote:

Dear All, In linux kernel, for all memory allocation done by *vmalloc*, kernel maintains memory region descriptor (*vm_struct*) which stores information about the linear virtual address range, no. of physical page frames allocated... as a linked list headed by *vmlist* symbol. Can you please give me information about how does kernel maintain information about memory allocation done by malloc/calloc invoked by a user application ? I understand that kernel maintains a descriptor (*vm_area_struct*) per application to hold information about the virtual address space allocated for the heap region.

I would like to elaborate on my question. Here is a simple code sample:

    #include <stdio.h>
    #include <stdlib.h>

    int *b;

    int main()
    {
        b = malloc(4);
        printf("Address of b = 0x%08lx \n", (unsigned long)b);
        getchar();
        return 0;
    }

If I pause the program and inspect the maps file /proc/pid/maps, I see a whopping 132K of linear virtual address space allocated, and

Address of b = 0x0804a008

Here is a fragment of the maps file:

    08048000-08049000 r-xp fd:00 16845753 cSamples/cTests/a.out
    08049000-0804a000 rw-p fd:00 16845753 cSamples/cTests/a.out
    *0804a000-0806b000 rw-p 0804a000 00:00 0 [heap]*
    b7f3d000-b7f3e000 rw-p b7f3d000 00:00 0
    b7f4a000-b7f4d000 rw-p b7f4a000 00:00 0

Suppose I extend my code as follows (assume c, d, e and f are all declared globally):

    c = malloc(34);
    d = malloc(12);
    e = malloc(1024);
    free(e);
    f = malloc(1845);

Now, when free(e) is executed, the kernel should have some information about the size of the address space allocated to symbol 'e', so that it frees only that virtual address region in the heap space that was allocated to e.

Q. Where and how does the kernel maintain information about the size of the address space allocated to symbol 'e'?

Regards, Prabhu
Re: Problem with EXPORT_SYMBOL(irq_to_desc)
Thanks a lot. It was to do with CONFIG_SPARSE_IRQ; I simply overlooked it. I had exported the wrong irq_to_desc(), the one guarded by CONFIG_SPARSE_IRQ, which was not configured. It's working now.

Regards, Prabhu

On Wed, Aug 11, 2010 at 10:34 PM, Mulyadi Santosa mulyadi.sant...@gmail.com wrote:

I don't exactly know the problem... but here's something to think about...

On Wed, Aug 11, 2010 at 13:07, Prabhu nath gprabhun...@gmail.com wrote:

Hi, I am facing a peculiar behaviour on kernel version 2.6.34.1 (http://lxr.linux.no/#linux+v2.6.34.1). For my experimentation I just wanted to use the following two kernel functions in my kernel module:

1. find_task_by_vpid (http://lxr.linux.no/#linux+v2.6.34.1/kernel/pid.c#L388)
2. irq_to_desc (http://lxr.linux.no/#linux+v2.6.34.1/kernel/irq/handle.c#L138)

Since these two functions are not exported by the kernel, I exported them, built the kernel and used them in my kernel module.

Looking around handle.c, precisely at the lines surrounding the related function, I see this:

endif /* !CONFIG_SPARSE_IRQ */

uhum... seems related to sparse IRQ handling... so the basic question here: which one did you export? The one under CONFIG_SPARSE_IRQ=y, or the other one?

-- regards, Mulyadi Santosa Freelance Linux trainer and consultant blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com