When to use blk_end_request_* routines.
Hi,

I'm using the SLES11-SP1 kernel (2.6.32.12-0.7). I wrote a block driver for an in-memory disk. The driver seems to work fine, but sometimes the machine freezes when I try to unload it. The module_exit code looks like this:

    /* We take no more requests! */
    spin_lock_irqsave(&pks_disk->queue_lock, flags);
    blk_stop_queue(pks_disk->pks_disk_rqq);
    spin_unlock_irqrestore(&pks_disk->queue_lock, flags);

    /* Remove disk */
    del_gendisk(pks_disk->pks_disk_gd);

    /* Free queue */
    blk_cleanup_queue(pks_disk->pks_disk_rqq);

I searched around and found that while processing requests from the request queue you have to use one of the blk_end_request* functions (which I did). However, the result depends on which one I use:

a) blk_end_request: the module removal code freezes the machine.
b) blk_end_request_cur: the module removal code works fine.
c) blk_end_request_all: I didn't try this one because the one above worked :P

I dug around in the source and found the function blk_update_request. Its comment says "Actual device drivers should use blk_end_request instead", which is what I did, and the driver module didn't like it when I tried to remove it :(

Now I'm confused, as I have no clue when to use which function. Can anyone give some examples, please?

--P.K.S

::DISCLAIMER:: The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. E-mail transmission is not guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or may contain viruses in transmission. The e mail and its contents (with or without referred errors) shall therefore not attach any liability on the originator or HCL or its affiliates. Views or opinions, if any, presented in this email are solely those of the author and may not necessarily reflect the views or opinions of HCL or its affiliates.
Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of authorized representative of HCL is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately. Before opening any email and/or attachments, please check them for viruses and other defects. ___ Kernelnewbies mailing list Kernelnewbies@kernelnewbies.org http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
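As far as the 2.6.32 kernel-doc comments go, blk_end_request() completes only nr_bytes of a request and returns true while part of the request is still pending, blk_end_request_cur() completes just the current segment, and blk_end_request_all() completes whatever is left. The variants prefixed with __ are meant for callers that already hold the queue lock. One plausible reading of the freeze (not confirmed anywhere in this thread) is that a request was only partially completed and therefore never finished, so teardown waited on it. The following is a userland toy model of those return-value semantics only; struct toy_request and the toy_* functions are hypothetical names, not kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct request: only byte counts are modelled. */
struct toy_request {
    size_t bytes_left;  /* bytes of the request not yet completed */
    size_t cur_bytes;   /* size of the current segment */
    bool   done;        /* set once the whole request has been completed */
};

/* Models blk_end_request(rq, error, nr_bytes): true means "still pending". */
bool toy_end_request(struct toy_request *rq, size_t nr_bytes)
{
    if (nr_bytes >= rq->bytes_left) {
        rq->bytes_left = 0;
        rq->done = true;
        return false;
    }
    rq->bytes_left -= nr_bytes;
    return true;
}

/* Models blk_end_request_cur(): completes only the current segment. */
bool toy_end_request_cur(struct toy_request *rq)
{
    size_t n = rq->cur_bytes < rq->bytes_left ? rq->cur_bytes : rq->bytes_left;

    return toy_end_request(rq, n);
}

/* Models blk_end_request_all(): completes everything that is left. */
void toy_end_request_all(struct toy_request *rq)
{
    (void)toy_end_request(rq, rq->bytes_left);
}
```

In this model a driver that handles one segment at a time would loop with `while (toy_end_request_cur(&rq))`, while a driver that finishes the whole transfer in one go would call toy_end_request_all() once. Calling toy_end_request() with fewer bytes than remain, and then never coming back to the request, leaves it unfinished.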
RE: Query on skb buffer (Kumar amit mehta)
-----Original Message-----
From: kernelnewbies-boun...@kernelnewbies.org [mailto:kernelnewbies-boun...@kernelnewbies.org] On Behalf Of kernelnewbies-requ...@kernelnewbies.org
Sent: Thursday, March 07, 2013 8:52 AM
To: kernelnewbies@kernelnewbies.org
Subject: Kernelnewbies Digest, Vol 28, Issue 12

Message: 1
Date: Wed, 6 Mar 2013 10:39:13 -0800
From: Kumar amit mehta gmate.a...@gmail.com
Subject: Query on skb buffer
To: kernelnewbies@kernelnewbies.org
Message-ID: 20130306183913.ga3...@gmail.com
Content-Type: text/plain; charset=us-ascii

> My current understanding is that the skb, while being passed along the various layers of the Linux network stack, is manipulated mainly through the skb->{head|data|tail|end|len} fields. Suppose that my application (say 'ping') sends an ICMP echo request with a large packet size of 4k, i.e.
>
>     $ ping -s 4096 <dest addr>
>
> Now, if alloc_skb(4096, GFP_KERNEL) is the routine that gets called to allocate the kernel buffer, how does the kernel handle a possible failure of that allocation, and how does it manage large packet requests from the application?
> -Amit

[Pranay Kumar Srivastava] Perhaps you should have a look at linear and non-linear skb data (skb_frags, to be specific). That's how large data is handled; however, I don't think you'll be doing that with ICMP or UDP. Reading directly from skbuffs for UDP would also give you header information, whereas with TCP it doesn't. So unless there's a specific need for it, perhaps it can be done in userland, or you can use sock_sendmsg or sendfile (for zero copy).

--P.K.S

Message: 2
Date: Wed, 06 Mar 2013 14:32:27 -0500
From: valdis.kletni...@vt.edu
Subject: Re: Query on skb buffer
To: Kumar amit mehta gmate.a...@gmail.com
Cc: kernelnewbies@kernelnewbies.org
Message-ID: 9932.1362598...@turing-police.cc.vt.edu
Content-Type: text/plain; charset=us-ascii

On Wed, 06 Mar 2013 10:39:13 -0800, Kumar amit mehta said:

> Now, if alloc_skb(4096, GFP_KERNEL) is the routine that gets called to allocate the kernel buffer then, how does the kernel manages such prospective memory allocation failures and how kernel manages large packet requests from the application.

Did you actually look at the source for uses of alloc_skb() and how it handles error returns? (Hint: the kernel doesn't do the same thing at every use of alloc_skb(), because an allocation failure needs to be handled differently depending on where it happens. In some places, just bailing out and dropping the packet on the floor without notifying anybody is appropriate. In other places, we need to propagate an error condition to the caller.)

Typical pattern (from net/core/sock.c):

    /*
     * Allocate a skb from the socket's send buffer.
     */
    struct sk_buff *sock_wmalloc(struct sock *sk, unsigned long size, int force,
                                 gfp_t priority)
    {
        if (force || atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
            struct sk_buff *skb = alloc_skb(size, priority);
            if (skb) {
                skb_set_owner_w(skb, sk);
                return skb;
            }
        }
        return NULL;
    }
    EXPORT_SYMBOL(sock_wmalloc);

and then the caller does something like this (net/ipv4/ip_output.c, in function __ip_append_data()):

    } else {
        skb = NULL;
        if (atomic_read(&sk->sk_wmem_alloc) <= 2 * sk->sk_sndbuf)
            skb = sock_wmalloc(sk, alloclen + hh_len + 15, 1,
                               sk->sk_allocation);
        if (unlikely(skb == NULL))
            err = -ENOBUFS
RE: Major/minor numbers
> , read source code and/or comments and figure out what Linux does to prevent the 'thundering herd' problem (consider 100 threads all waiting on the same mutex - if you blindly wake all 100 up, you'll schedule them all, the first will find the mutex available and then re-take it, and then the next 99 will get run only to find it contended and go back to sleep. So figure out what Linux does in that case. :)

Googling around, I found the 'thundering herd' being mentioned in relation to threads waiting on sockets using the accept() sys call. Are waits on mutexes also plagued by the same issue? I guess they are, though what sys call would be used in this case?

Thanks,
-mandeep

Message: 9
Date: Tue, 5 Mar 2013 18:05:46 +0800
From: ishare june.tune@gmail.com
Subject: Re: pthread_lock
To: kernelnewbies@kernelnewbies.org
Message-ID: 20130305100546.GA2541@debian.localdomain
Content-Type: text/plain; charset=us-ascii

On Tue, Mar 05, 2013 at 01:39:54PM +0530, Mandeep Sandhu wrote:
> On Tue, Mar 5, 2013 at 11:32 AM, valdis.kletni...@vt.edu wrote:
>> On Tue, 05 Mar 2013 11:02:45 +0530, Mandeep Sandhu said:
>>> next schedule. I think the waiting threads (processes) will moved from the wait queue to the run queue from where they will be scheduled to run.
>>
>> For bonus points, read source code and/or comments and figure out what Linux does to prevent the 'thundering herd' problem (consider 100 threads all waiting on the same mutex - if you blindly wake all 100 up, you'll schedule them all, the first will find the mutex available and then re-take it, and then the next 99 will get run only to find it contended and go back to sleep. So figure out what Linux does in that case. :)
>
> Googling around, I found the 'thundering herd' being mentioned in relation to threads waiting on sockets using the accept() sys call. Are wait's on mutex's also plagued by the same issue? I guess it is, though what sys call would be used in this case?

The threads waiting on sockets will be woken up by a network event. Similarly, the waiters on a mutex can be woken up by a signal. I guess it is pthread_cond_signal.

> Thanks,
> -mandeep

Message: 10
Date: Tue, 5 Mar 2013 12:21:47 +
From: Anuz Pratap Singh Tomar chambilketha...@gmail.com
Subject: Re: Major/minor numbers
To: Shraddha Kamat sh200...@gmail.com
Cc: kernelnewbies kernelnewbies@kernelnewbies.org
Message-ID: CAJnfX5uEbxk2kPLqtS3KsvH01C+iPrj3K3qdgunq9SbrD9GBnQ@mail.gmail.com
Content-Type: text/plain; charset=iso-8859-1

On Tue, Mar 5, 2013 at 6:32 AM, Shraddha Kamat sh200...@gmail.com wrote:
> Is the maximum number of devices supported by Linux limited by the major/minor numbers? Can you please give me some pointers regarding this.

http://stackoverflow.com/questions/14833467/maximum-values-of-major-and-minor-numbers-in-linux

[Pranay Kumar Srivastava] Just another point: you should be careful when passing a device number from userland to the kernel. I've got 2.6.32 (SLES 11-SP1), and the major, minor and makedev functions provided by glibc are quite different from the macros you'll encounter in the kernel. Apparently userland still uses 16-bit numbers on my SLES11-SP1, while internally the kernel uses 32-bit device numbers (split into 12 bits for the major and 20 bits for the minor). If you are thinking of passing raw device numbers from userland to the kernel, refrain from it. Instead, pass the major and minor numbers separately and combine them inside the kernel using the MKDEV macro.

--P.K.S

> -- Shraddha

-- Thank you
Warm Regards
Anuz

End of Kernelnewbies Digest, Vol 28, Issue 9
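To make the 12/20-bit point concrete, here is a small userland sketch that mirrors the kernel's MAJOR/MINOR/MKDEV arithmetic from include/linux/kdev_t.h next to the old 16-bit (8-bit major, 8-bit minor) encoding. The k_* and old_encode names are made up for this illustration so that they do not clash with the libc macros; this is not how glibc itself encodes dev_t.

```c
#include <stdint.h>

#define K_MINORBITS 20
#define K_MINORMASK ((1U << K_MINORBITS) - 1)

/* Kernel-internal layout: 12-bit major in the high bits, 20-bit minor below. */
uint32_t k_mkdev(uint32_t major, uint32_t minor)
{
    return (major << K_MINORBITS) | minor;
}

uint32_t k_major(uint32_t dev)
{
    return dev >> K_MINORBITS;
}

uint32_t k_minor(uint32_t dev)
{
    return dev & K_MINORMASK;
}

/* Old 16-bit layout: 8-bit major, 8-bit minor. */
uint16_t old_encode(uint32_t major, uint32_t minor)
{
    return (uint16_t)((major << 8) | minor);
}
```

For major 8, minor 1 the two layouts give 0x800001 and 0x0801 respectively. Feeding the 16-bit value to k_major()/k_minor() yields major 0 and minor 2049, which is the kind of mismatch that passing major and minor separately avoids.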
RE: barrier()
)); printf(tmp_long_ptr :0x%llx\n, (long long int) tmp_long_ptr); if ((long_ptr == tmp_long_ptr) (long_ptr = 0x3000)) printf(valid 64 addr\n); }

Message: 4
Date: Mon, 25 Feb 2013 12:26:06 +0530
From: Shraddha Kamat sh200...@gmail.com
Subject: barrier()
To: kernelnewbies kernelnewbies@kernelnewbies.org
Message-ID: 1361775366.22170.27.ca...@oc5268484881.ibm.com
Content-Type: text/plain; charset=UTF-8

>     #define barrier() asm volatile("" ::: "memory")
>
> What exactly is volatile("" ::: "memory") doing here? I was referring to the GNU as (ver 2.14) manual but could not get much of a clue about this assembly construct - any pointers?

[Pranay Kumar Srivastava] This is GCC extended inline assembly rather than something the assembler itself defines, so the GNU as manual won't cover it. According to the GCC manual, the last ':' begins the clobber list, which tells GCC what the assembly may modify so that it does not rely on those registers keeping their values across the asm. I looked at this page

http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

and it says something about the special "memory" keyword. It goes something like the following:

    If your assembler instructions access memory in an unpredictable fashion, add `memory' to the list of clobbered registers. This causes GCC to not keep memory values cached in registers across the assembler instruction and not optimize stores or loads to that memory. You also should add the volatile keyword if the memory affected is not listed in the inputs or outputs of the asm, as the `memory' clobber does not count as a side-effect of the asm. If you know how large the accessed memory is, you can add it as input or output but if this is not known, you should add `memory'

There's also a nice small example on that page. I hope that helps.
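A minimal userland sketch of the same construct, assuming GCC or Clang (the asm syntax is a compiler extension). The empty assembly string emits no instructions; the "memory" clobber only stops the compiler from caching memory values in registers or moving loads and stores across that point. The function name read_twice_with_barrier is made up for this example.

```c
/* Compiler-only barrier: emits no instructions, assumes GCC/Clang. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Without the barrier the compiler is free to load *p once and reuse the
 * value; with it, the second read must be a fresh load from memory.
 */
int read_twice_with_barrier(const int *p)
{
    int first = *p;

    barrier();

    int second = *p;

    return first + second;
}
```

Note that this constrains only the compiler. It does not by itself order memory accesses as seen by other CPUs.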
Regards
P.K.S

Message: 5
Date: Mon, 25 Feb 2013 15:02:39 +0800
From: bill4carson bill4car...@gmail.com
Subject: Re: test jiffies on ARM SMP board
To: kernelnewbies@kernelnewbies.org
Message-ID: 512b0c8f.30...@gmail.com
Content-Type: text/plain; charset=UTF-8; format=flowed

On 2013-02-21 00:39, buyitian wrote:
> I am confused about my test. In one device driver, I put the code below:
>
>     printk("start to test test jiffies\n");
>     local_irq_save(flags);
>     jf1 = jiffies;   // read jiffies first time
>     // hold cpu for about 2 seconds (do some calculation)
>     jf2 = jiffies;   // read jiffies after 2 seconds
>     local_irq_restore(flags);
>     printk("jf1:%lu, jf2:%lu\n", jf1, jf2);
>
> and the output is as below:
>
>     <4>[  108.551124]start to test test jiffies
>     <4>[  110.367604]jf1:4294948151, jf2:4294948151
>
> jf1 and jf2 have the same value although they are read 2 seconds apart. I think this is because I disabled local interrupts. But the printk timestamp goes from 108.551124 to 110.367604, which is about 2 seconds. On my platform, the printk timestamp comes from the function read_sched_clock:
>
>     static u32 __read_mostly (*read_sched_clock)(void) = jiffy_sched_clock_read;
>
> and the function jiffy_sched_clock_read() reads from jiffies.
>
> It seems that jiffies is frozen while the local irq is disabled, but after local_irq_restore() jiffies not only starts to run again, it also recovers the lost 2 seconds.
>
> Is jiffies updated from another cpu while irqs are disabled on the local cpu? Is there some inter-processor interrupt between cpu1 and cpu0 after the local irq is re-enabled, so that jiffies recovers the lost 2 seconds?

    /*
     * Event handler for periodic ticks
     */
    void tick_handle_periodic(struct clock_event_device *dev)
    {
        int cpu = smp_processor_id();
        ktime_t next;

        tick_periodic(cpu);

        if (dev->mode != CLOCK_EVT_MODE_ONESHOT)
            return;
        /*
         * Setup the next period for devices, which do not have
         * periodic mode:
         */
        next = ktime_add(dev->next_event, tick_period);
        for (;;) {
            /* <--- once irq enabled, here we got -ETIME, then */
            if (!clockevents_program_event(dev, next, ktime_get()))
                return;
            /*
             * Have to be careful here. If we're in oneshot mode,
             * before we call tick_periodic() in a loop, we need
             * to be sure we're using a real hardware clocksource.
             * Otherwise we could get trapped in an infinite
             * loop, as the tick_periodic() increments jiffies,
             * when then will increment time, posibly causing
             * the loop to trigger again and again.
             */
            if (timekeeping_valid_for_hres())
                tick_periodic(cpu);   /* <--- here, we add missing jiffies */
            next
RE: Where does kernel store per task file position?
-----Original Message-----
From: Rajat Sharma [mailto:fs.ra...@gmail.com]
Sent: Wednesday, January 30, 2013 11:16 AM
To: Pranay Kumar Srivastava
Cc: kernelnewbies@kernelnewbies.org
Subject: Re: Where does kernel store per task file position?

>> I'm still not able to figure out where exactly is the position of file stored per task_struct.
>
> struct file * itself is per process (task_struct), so file->f_pos is the file position per process, if that's what you are looking for. I hope you haven't assumed that struct file itself is unique for a file, i.e. per inode? That assumption would be wrong.
>
> -Rajat

[Pranay Kumar Srivastava] That really was a stupid question; it's right there: get_empty_filp() in do_sys_open. After a fork, the inherited files share a common struct file [Correct?], but files opened after the fork in the child or parent will not share a struct file [Correct?]. So the same dentry can be pointed to by multiple struct files [Correct?], and that's why the dentry reference count is incremented during lookup [Correct?].

Thanks a lot!

> On Tue, Jan 29, 2013 at 6:38 PM, Pranay Kumar Srivastava pranay.shrivast...@hcl.com wrote:
>> Hi Everyone,
>>
>> I was trying to find out where Linux stores the per-process file position, since struct file is allocated once, when the file is first opened (get_empty_filp() via do_sys_open). I looked at copy_process -> copy_files -> dup_fd, and it seemed to allocate only the (struct file *) array in struct files_struct, but I couldn't find any field that is actually used to store the file position.
>>
>> I'm still not able to figure out where exactly the file position is stored per task_struct. Secondly, even if this were being saved, does the kernel change f_pos of struct file whenever a read/write is done? I don't think that happens [Correct?].
>>
>> Regards,
>> Pranay Kumar Srivastava
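The sharing described above can be observed from userland. The sketch below uses dup() as a stand-in for fork() inheritance, since both give a second descriptor that refers to the same open file (the kernel's struct file), whereas a separate open() of the same path gets its own struct file and therefore its own position. The function name and the temporary file path are arbitrary choices for this example.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Returns 1 if a dup()ed descriptor shares the file position while a
 * separately open()ed one does not, 0 otherwise, -1 on setup failure.
 */
int offsets_follow_struct_file(void)
{
    char path[] = "/tmp/fpos_demo_XXXXXX";
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    if (write(fd, "0123456789", 10) != 10) {
        close(fd);
        unlink(path);
        return -1;
    }

    int dup_fd = dup(fd);               /* same struct file as fd */
    int new_fd = open(path, O_RDONLY);  /* separate struct file */

    if (dup_fd < 0 || new_fd < 0) {
        if (dup_fd >= 0)
            close(dup_fd);
        if (new_fd >= 0)
            close(new_fd);
        close(fd);
        unlink(path);
        return -1;
    }

    lseek(fd, 4, SEEK_SET);             /* move the position through fd only */

    off_t dup_pos = lseek(dup_fd, 0, SEEK_CUR);  /* follows fd */
    off_t new_pos = lseek(new_fd, 0, SEEK_CUR);  /* still at the start */

    close(new_fd);
    close(dup_fd);
    close(fd);
    unlink(path);

    return dup_pos == 4 && new_pos == 0;
}
```

The position moves with the open file rather than with the descriptor number or the inode, which matches f_pos living in struct file.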
How to Faking a PCI or USB device.
Hi Everyone,

I'm learning how to write PCI and USB device drivers, but I don't have any real device to work with. Most articles I have read either use a real USB device or only describe how things work (structures and API).

Is it possible to fake such a device, one that probably does nothing but whose parameters I can modify? In short, is it possible to set up a fake configuration address space for a fake device and try to use it (a completely in-memory device)?

If it can be done, please give me some pointers.

Thanks for reading!
RE: How to Faking a PCI or USB device.
-----Original Message-----
From: Mulyadi Santosa [mailto:mulyadi.sant...@gmail.com]
Sent: Monday, November 26, 2012 4:07 PM
To: Pranay Kumar Srivastava
Cc: kernelnewbies@kernelnewbies.org
Subject: Re: How to Faking a PCI or USB device.

> Hi..
>
> On Mon, Nov 26, 2012 at 5:27 PM, Pranay Kumar Srivastava pranay.shrivast...@hcl.com wrote:
>> Is it possible to fake such a device that probably does nothing but I can say modify some parameters of the fake device? In short is it possible to devise a fake configuration address space of the fake device and try to use it (completely in memory device)?. If it can be done please give me some pointers.
>
> what kind of device?

Any kind. I was looking around for the simplest USB and PCI devices. I found an article by Greg, http://www.linuxjournal.com/article/7353, written a long time back, so maybe something like that. If I can find some cheap hardware like it, I would like to use it to decorate my desk for the new year :P. But until I find such simple hardware, what would I need to know to fake a simple device like that?

> maybe if you're lucky, QEMU can emulate that for you...e.g: network adapter...

Well, let's say it emulates a network adapter (PCI, right?). Wouldn't a default driver already exist for that? So should I just remove that driver and use my own driver instead?

> --
> regards,
> Mulyadi Santosa
> Freelance Linux trainer and consultant
> blog: the-hydra.blogspot.com
> training: mulyaditraining.blogspot.com

Thanks for the help, I really appreciate it.
Query regarding some blk_queue_XXX functions. (For kernel 2.6.32 SLES 11)
Hi All,

I'm using SLES11 (kernel 2.6.32). I recently wrote a dummy block device driver where the whole disk is made of pages in memory. I have some queries regarding the usage of a few functions; I have absolutely no idea what the first two do.

What I was trying to do is get the block layer to give my driver requests that are aligned to my device's minimum sector size. Since I wanted to play with the driver, I chose 1024 bytes instead of 512. I have also tried a 2048-byte device sector size and all seems well :D

Now the problem is that the kernel wants to give me everything in 512-byte units, but I want to receive everything properly aligned to my device's sector size. So I looked in the kernel code and thought I could use the functions below:

1. blk_limits_io_min: No clue about this one; in fact the device didn't get added when I used it.

2. blk_queue_physical_block_size: This one checks the logical block size before it sets the physical block size. But I'm not really sure about its purpose.

3. blk_queue_logical_block_size: I think this is the one responsible for my getting requests properly aligned to the device's sector size? Not really sure...

4. set_capacity: There's also a callback in block_device_operations by this name. However, this function just sets the size of part0 of the gendisk. I have not seen the callback being invoked; I added some print statements and never see them. So what's the purpose of the callback?

Lastly, there's a field gd->start in gendisk which apparently sets the start of the data sectors. But why would the block layer be bothered with that? Is it used by filesystems, since application programs can certainly write to sector 0 as well, right?

And a couple more :P

When I use fdisk on the disk, it seems to create a partition table (I think); however, when it tries to re-read the partition table I get "invalid argument" for an ioctl. So I think I'm missing an ioctl handler in my driver, correct? I have none yet.

That issue aside, if I'm able to create a partition, then my driver should be able to handle the additional disks, correct? And it would follow the same operations as when I loaded my module and called add_disk the first time, correct?

Thanks a lot for reading!

Regards,
Pranay Kr. Srivastava
pranay.shrivast...@hcl.com
Software Engineer
ERS, HCL Technologies
A-5, Sector 24, Noida 201301, U.P. (India)
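One detail that bears on the alignment question: the block layer expresses sector numbers and capacities (for example blk_rq_pos() and set_capacity()) in 512-byte units, whatever logical block size the driver announces with blk_queue_logical_block_size(). A driver with 1024- or 2048-byte device sectors therefore has to convert between the two. The helpers below are a userland sketch of that arithmetic only; their names are made up, and they assume the device sector size is a multiple of 512.

```c
#include <stdint.h>

/* The block layer counts sectors in 512-byte units. */
#define KERNEL_SECTOR_SIZE 512u

/* Convert a 512-byte kernel sector number into a device sector number. */
uint64_t kernel_to_dev_sector(uint64_t ksector, uint32_t dev_sector_size)
{
    return ksector / (dev_sector_size / KERNEL_SECTOR_SIZE);
}

/* Convert a count of device sectors into 512-byte units (as for set_capacity). */
uint64_t dev_to_kernel_sectors(uint64_t dev_sectors, uint32_t dev_sector_size)
{
    return dev_sectors * (dev_sector_size / KERNEL_SECTOR_SIZE);
}

/* Check whether a kernel sector number falls on a device sector boundary. */
int is_dev_aligned(uint64_t ksector, uint32_t dev_sector_size)
{
    return ksector % (dev_sector_size / KERNEL_SECTOR_SIZE) == 0;
}
```

With 1024-byte device sectors, kernel sector 8 maps to device sector 4, and a disk of 100 device sectors has a capacity of 200 in kernel units.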
Re: Sysfs class attribute problem
-----Original Message-----
From: kernelnewbies-boun...@kernelnewbies.org [mailto:kernelnewbies-boun...@kernelnewbies.org] On Behalf Of kernelnewbies-requ...@kernelnewbies.org
Sent: Wednesday, June 20, 2012 11:33 PM
To: kernelnewbies@kernelnewbies.org
Subject: Kernelnewbies Digest, Vol 19, Issue 39

Message: 1
Date: Wed, 20 Jun 2012 16:58:30 +
From: Jeff Haran jha...@bytemobile.com
Subject: RE: A confusion about invoking my syscall
To: ?? wangzhe5...@gmail.com
Cc: kernelnewbies kernelnewbies@kernelnewbies.org
Message-ID: 4ab110e2fa959f41ac8480e84516371105b...@hq-ex01.bytemobile.com
Content-Type: text/plain; charset=iso-2022-jp

From: ?? [mailto:wangzhe5...@gmail.com]
Sent: Monday, June 18, 2012 6:40 PM
To: kernelnewbies
Subject: A confusion about invoking my syscall

Hello everyone:

I added a simple syscall to the kernel. The function is as follows:

    asmlinkage long sys_mysyscall(long data)
    {
        printk("This is my syscall!\n");
        return data;
    }

and I tested it successfully from user space. The test program:

    #include <linux/unistd.h>
    #include <syscall.h>
    #include <sys/types.h>
    #include <stdio.h>

    int main(void)
    {
        long n = 0, m = 0, pid1, pid2;

        n = syscall(345, 190);        // #define __NR_mysyscall 345
        printf("n = %ld\n", n);
        pid1 = syscall(SYS_getpid);   // getpid
        printf("pid = %ld\n", pid1);
        pid2 = syscall(20);           // getpid
        printf("pid = %ld\n", pid2);
        return 0;
    }

and the result:

    n = 190
    pid = 4097
    pid = 4097

But if the test program is:

    #include <linux/unistd.h>
    #include <syscall.h>
    #include <sys/types.h>
    #include <stdio.h>

    int main(void)
    {
        long n = 0, m = 0, pid1, pid2;

        n = syscall(345, 190);        // #define __NR_mysyscall 345
        printf("n = %ld\n", n);
        m = syscall(SYS_mysyscall, 190);
        printf("m = %ld\n", m);
        pid1 = syscall(SYS_getpid);   // getpid
        printf("pid = %ld\n", pid1);
        pid2 = syscall(20);           // getpid
        printf("pid = %ld\n", pid2);
        return 0;
    }

then the result is:

    wanny@wanny-C-Notebook-:~/syscall/src$ gcc test1.c
    test1.c: In function 'main':
    test1.c:13:14: error: 'SYS_mysyscall' undeclared (first use in this function)
    test1.c:13:14: note: each undeclared identifier is reported only once for each function it appears in

Why can't I invoke my syscall with SYS_mysyscall? Thanks in advance!

From: Jeff Haran
Sent: 2012/6/19

Because it appears you never defined the symbol SYS_mysyscall.

From: ?? [mailto:wangzhe5...@gmail.com]
Sent: Monday, June 18, 2012 9:32 PM
To: Jeff Haran
Cc: kernelnewbies
Subject: Re: A confusion about invoking my syscall

I think so, but where should I define the symbol SYS_mysyscall? And where is the symbol SYS_getpid defined?

From: Jeff Haran
Sent: 2012/6/20

On my system it is in /usr/include/bits/syscall.h, which is being included in your program because it includes syscall.h:

    #define SYS_getpid __NR_getpid

From: ?? [mailto:wangzhe5...@gmail.com]
Sent: Wednesday, June 20, 2012 1:16 AM
To: Jeff Haran
Cc: kernelnewbies
Subject: Re: A confusion about invoking my syscall

So SYS_getpid is replaced by __NR_getpid, and __NR_getpid is defined in the kernel (arch/x86/include/asm/unistd_32.h). My syscall is also defined there, with #define SYS_mysyscall __NR_mysyscall. I don't know why it doesn't work.

From: Jeff Haran
Date: Wed, 20 Jun 2012 16:58:30 +

My sources contain no reference to SYS_mysyscall nor __NR_mysyscall, so I assume you've added them to the Linux include files that you built your module from. User space programs like your main() program above generally aren't going to include Linux source tree include files. When you include syscall.h from a user space program in a typical development environment, the compiler is by default going to look for syscall.h in /usr/include, not in the Linux source tree where presumably you've made your modifications. Of course you can always tell the compiler to look
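A userland sketch of the point being made: SYS_getpid works because the installed libc headers define it, while a syscall added to a private kernel tree has no SYS_ name in /usr/include unless the program (or an updated header) supplies one. The number 345 below is simply the value from the post and is an assumption that holds only on the poster's patched kernel; call_mysyscall is a hypothetical wrapper and should not be called on a stock kernel.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* for the syscall() declaration in unistd.h */
#endif
#include <sys/syscall.h>
#include <unistd.h>

/* Not provided by the installed headers, so define it locally. */
#ifndef SYS_mysyscall
#define __NR_mysyscall 345   /* assumption: number used on the patched kernel */
#define SYS_mysyscall  __NR_mysyscall
#endif

/* SYS_getpid comes from the libc headers, so this compiles anywhere. */
long getpid_via_syscall(void)
{
    return syscall(SYS_getpid);
}

/* Only meaningful on a kernel that actually implements the new syscall. */
long call_mysyscall(long data)
{
    return syscall(SYS_mysyscall, data);
}
```

The alternative mentioned in the reply, pointing the compiler at headers exported from the modified kernel tree, removes the need for the local #define.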
kernel_sendpage query.
Hi, I've been trying to understand kernel_sendpage but I haven't been able to figure it out completely; hopefully someone here knows better, so please help me out. I'm using kernel_sendpage for a TCP connection and it works well when only a few kernel threads are sending data with it. The page I hand over to kernel_sendpage is then reused for reading data from the socket, processing it, and resending the processed data in the same page. The data is at most 2KB and never less than 120 bytes. As I see it in the code, the page isn't copied into the skb frags array; it's just assigned, and get_page is called to increment the page reference count. (I don't free the page until the thread is stopped, and it never is unless it gets a signal.) Now I don't know whether kernel_sendpage waits for the page to be sent or not. I've tried both MSG_DONTWAIT and 0 for flags, but every now and then the following problem occurs at the client. When too many kernel threads are sending data using kernel_sendpage without the MSG_DONTWAIT flag, the call still seems to succeed. However, since I'm reusing the page, the data can get overwritten by the next sock_recvmsg, and by the time the network stack is ready to send my page the client gets garbage data. I observed the same issue with MSG_DONTWAIT set: the client sometimes gets garbage data. So my query is: when using kernel_sendpage, what do I need to do to be sure that the network stack has indeed sent my page, so that I can reuse it for sock_recvmsg? Thanks a lot for reading! Regards, Pranay Kr. Srivastava pranay.shrivast...@hcl.com Software Engineer ERS,HCL Technologies A-5, Sector 24, Noida 201301, U.P. (India)
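The symptom described above can be reproduced in a toy user-space model. This is only a sketch of the mechanism the poster describes (the page is referenced, not copied); `toy_page`, `toy_frag`, `toy_sendpage` and `toy_transmit` are invented names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

struct toy_page {
    int  refcount;
    char data[TOY_PAGE_SIZE];
};

/* A queued fragment: a pointer to the caller's page, not a copy of it. */
struct toy_frag {
    struct toy_page *page;
    size_t           len;
};

/* Model of the zero-copy send path: pin the page and queue a reference. */
struct toy_frag toy_sendpage(struct toy_page *p, size_t len)
{
    struct toy_frag f = { p, len };

    assert(len <= TOY_PAGE_SIZE);
    p->refcount++;              /* plays the role of get_page() */
    return f;
}

/* Model of the later transmit: the bytes are read only at this point. */
size_t toy_transmit(struct toy_frag *f, char *wire)
{
    memcpy(wire, f->page->data, f->len);
    f->page->refcount--;        /* plays the role of put_page() */
    return f->len;
}
```

If the caller writes into `p->data` between `toy_sendpage()` and `toy_transmit()`, the wire receives the new bytes, which is the garbage the client sees. In this model the elevated `refcount` is the only hint that the page is still queued.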
RE: Crash when sending a lot of messages through a unix socket
You mentioned your code is atomic. Are you certain that you can call sock_sendmsg from a non-process context? I believe you can't, because sock_sendmsg -> udp_sendmsg for DGRAM, and I think that assumes you are in process context. From: kernelnewbies-boun...@kernelnewbies.org [kernelnewbies-boun...@kernelnewbies.org] On Behalf Of kernelnewbies-requ...@kernelnewbies.org [kernelnewbies-requ...@kernelnewbies.org] Sent: Wednesday, May 02, 2012 9:30 PM To: kernelnewbies@kernelnewbies.org Subject: Kernelnewbies Digest, Vol 18, Issue 2 Today's Topics: 1. please report distros with CONFIG_DYNAMIC_DEBUG, using ddebug_query= boot param (Jim Cromie) 2. Re: immutable wiki? (mic...@michaelblizek.twilightparadox.com) 3. Re: immutable wiki? (Bill Traynor) 4.
Crash when sending a lot of messages through a unix socket (Panagiotis Sakkos) -- Message: 1 Date: Tue, 1 May 2012 12:36:04 -0600 From: Jim Cromie jim.cro...@gmail.com Subject: please report distros with CONFIG_DYNAMIC_DEBUG, using ddebug_query= boot param To: kernelnewbies kernelnewbies@kernelnewbies.org Message-ID: cajfubxw8ayimau7ko42vh_e-zzbnskw9hd8dx2tfcowl4ue...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 hi all, Ive been asked whether ddebug_query= boot param is used in any distros, I think the question seeks to determine a good deprecation schedule for it (its been obsoleted by dyndbg= in driver-core-next) Would you all be so kind as to check your favorite distros, and report the ones that have one or both ? Ubuntu 12.04 LTS lacks it: jimc@chumly:~/projects/lx/linux-2.6$ grep DYNAMIC_DEBUG /boot/config* /boot/config-3.0.0-17-generic:# CONFIG_DYNAMIC_DEBUG is not set /boot/config-3.2.0-24-generic:# CONFIG_DYNAMIC_DEBUG is not set Voyage linux also lacks it, (also debian based) Fedora-16 has config option: $ uname -a Linux groucho.jimc.earth 3.3.2-6.fc16.x86_64 #1 SMP Sat Apr 21 12:43:20 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux $ grep DYNAMIC_DEBUG /boot/config-3.3.* /boot/config-3.3.1-5.fc16.x86_64:CONFIG_DYNAMIC_DEBUG=y /boot/config-3.3.2-1.fc16.x86_64:CONFIG_DYNAMIC_DEBUG=y /boot/config-3.3.2-6.fc16.x86_64:CONFIG_DYNAMIC_DEBUG=y but its not used by default. 
I have modified my grub defaults: $ grep ddebug_query /etc/default/grub GRUB_CMDLINE_LINUX='quiet rhgb loglevel=8 ddebug_query=module params +p dynamic_debug.verbose=1 nouveau.dyndbg nouveau.force_post=1 it87.dyndbg=+p nouveau.perflvl_wr=' FYI, the above usage will eventually be unsupported, it can be replaced in driver-core-next by either of: dyndbg=module params +p params.dyndbg=+p params.dyndbg # defaults to +p the 2nd, 3rd forms also work for loadable modules (though params isnt one of them) thanks in advance Jim -- Message: 2 Date: Wed, 2 May 2012 12:52:38 +0200 From: mic...@michaelblizek.twilightparadox.com Subject: Re: immutable wiki? To: Bill Traynor w...@alphatroop.com Cc: kernelnewbies@kernelnewbies.org Message-ID: 20120502105237.GA2229@grml Content-Type: text/plain; charset=us-ascii Hi! On 09:29 Tue 01 May , Bill Traynor wrote: I have an Editor account for the kernelnewbies.org wiki, but all pages are currently immutable. Was the wiki made read-only at some point? You must be on http://kernelnewbies.org/EditorsGroup to make changes. This was made for spam procection. Everybody who is on the list can add you. What is your username in the wiki? -Michi -- programing a layer 3+4 network protocol for mesh networks see http://michaelblizek.twilightparadox.com -- Message: 3 Date: Wed, 2 May 2012 08:06:36 -0400 From: Bill Traynor w...@alphatroop.com Subject: Re: immutable wiki? To: mic...@michaelblizek.twilightparadox.com Cc: kernelnewbies@kernelnewbies.org Message-ID: CAGfZjq69kO91rbiG8R5jTUmRDq3ZJRjp_oTLqS8wji1M=1n...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 On Wed, May 2, 2012 at 6:52 AM, mic...@michaelblizek.twilightparadox.com wrote: Hi! On 09:29 Tue 01 May ? ? , Bill Traynor wrote: I have an Editor account for the kernelnewbies.org wiki, but all pages are currently immutable. ?Was the wiki made read-only at some point? You must be on http://kernelnewbies.org/EditorsGroup to make changes. This was made for spam
RE: identity mapped paging (Vaibhav Jain)
-Original Message- From: Vaibhav Jain [mailto:vjoss...@gmail.com] Sent: Wednesday, April 18, 2012 3:49 AM To: Pranay Kumar Srivastava Cc: kernelnewbies@kernelnewbies.org Subject: Re: identity mapped paging (Vaibhav Jain) On Tue, Apr 17, 2012 at 3:46 AM, Pranay Kumar Srivastava pranay.shrivast...@hcl.com wrote: -Original Message- From: Vaibhav Jain [mailto:vjoss...@gmail.com] Sent: Tuesday, April 17, 2012 4:07 PM To: Pranay Kumar Srivastava Cc: kernelnewbies@kernelnewbies.org Subject: Re: identity mapped paging (Vaibhav Jain) On Fri, Apr 13, 2012 at 2:15 AM, Vaibhav Jain vjoss...@gmail.com wrote: I am not clear about the use of identity mapped paging while paging is being enabled by the operating system. Also I don't understand at what point are the identity mappings no longer useful.According to this article http://geezer.osdevbrasil.net/osd/mem/index.htm#identity - The page table entries used to identity-map kernel memory can be deleted once paging and virtual addresses are enabled. Can somebody please explain? Identity mapping is when VA(Virt Address)=PA(Physical address). So basically when you set up your page tables you need to make sure they map identically. This is very easily done if you consider each 4KB block as a page beginning from location 0 upto whatever you've found to be the highest memory available either thru BIOS or GRUB. Remember that while setting up your PTEs and PDE every address is a physical one. So if you thought that your kernel would be linked initially to a higher VA since you would remap it to a lower memory physically then that would be WRONG!. Without PTEs and PDEs installed don't do that!. Why would you want it? Well for a simple reason, when your kernel starts to boot there's no translator,(No PTEs/PDEs and the Paging Enabled bit of processor is also cleared AFAIK just after the BIOS is done), yet since you've not enabled your processor for that but you'll be doing that in a moment. 
So let's say you made your kernel to be linked to higher VA like 3Gigs. Now the addresses would be generated beginning 3Gigs however you still don't have the Page tables installed since your kernel just started. So in that case the address is the physical address. And if you've not loaded your kernel beginning 3Gigs then it would definitely come crashing down. To avoid the crash in case you made your kernel to link to higher half of the memory, you can use GDT trick since segmentation is always on and you can make the overflow of the address addition to translate to a lower physical memory even if paging is not enabled yet. Thus it is possible to load the kernel at lower memory addresses while the linkage would be for higher VMA. And once your PTEs/PGD are enabled then you can use those instead of the GDT trick. Here's a link to that http://wiki.osdev.org/Higher_Half_With_GDT Thanks Vaibhav Jain Hi, Thanks for replying but I am still confused. I continued reading about this thing and what I have understood is the following : After the kernel executes the instruction to enable paging the instruction pointer will contain the address of the next instruction which will now be treated as a virtual address. So for the next instruction to be executed the page table should map this address to itself. Please correct me if I am wrong. I am confused by the point about linking the kernel to higher address. Could you please put that in a step by step manner to make it clear what happens before paging is enabled and what happens after that. Also, please explain at what point during the execution of kernel code are the identity-mapped addresses no longer useful ? Thanks Vaibhav Hi, I am somewhat understanding your point. But I have some other queries now in my mind. If the kernel is linked to 3Gigs is there a way other than the GDT trick.? Make your load address = VA when you link so you won't have to worry about doing the GDT trick. 
In fact I am wondering that if the kernel is linked to 3Gigs and Grub loads it at 1MB physical, how will even the first instruction of kernel execute ? I mean if all the address generated by kernel are above 3 Gigs and paging is not enabled how will it start running ? That's what the GDT trick is for. If you read the intel/amd processor manuals the segmentation is always on. So when the address get generated your segment's base address is still added to the generated address before it is put on wire. You can add a constant offset (in your GDT's base address part) to the generated address to get the address beginning from the load address of your kernel. I would suggest you make the higher half kernel later and try to first create some code that can fragment your available memory into pages and store
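The arithmetic behind the GDT trick can be checked in ordinary C. This is a minimal sketch: the link address 0xC0100000 (3 GiB + 1 MiB) and load address 0x00100000 (1 MiB) are assumptions matching the numbers mentioned in the thread, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_VMA 0xC0100000u  /* assumed link address: 3 GiB + 1 MiB */
#define KERNEL_LMA 0x00100000u  /* assumed load address: 1 MiB         */

/* Segment base that makes base + virtual wrap around to the load address. */
static inline uint32_t gdt_trick_base(uint32_t vma, uint32_t lma)
{
    return lma - vma;           /* unsigned arithmetic wraps modulo 2^32 */
}

/* What segmentation computes before paging is enabled (32-bit wrap). */
static inline uint32_t linear_address(uint32_t seg_base, uint32_t offset)
{
    return seg_base + offset;
}
```

With these numbers the base comes out as 0x40000000, and 0x40000000 + 0xC0100000 overflows 32 bits to 0x00100000, so code linked at 3 GiB fetches its instructions from 1 MiB.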
RE: identity mapped paging (Vaibhav Jain)
-Original Message- From: Vaibhav Jain [mailto:vjoss...@gmail.com] Sent: Tuesday, April 17, 2012 4:07 PM To: Pranay Kumar Srivastava Cc: kernelnewbies@kernelnewbies.org Subject: Re: identity mapped paging (Vaibhav Jain) On Fri, Apr 13, 2012 at 2:15 AM, Vaibhav Jain vjoss...@gmail.com wrote: I am not clear about the use of identity mapped paging while paging is being enabled by the operating system. Also I don't understand at what point are the identity mappings no longer useful.According to this article http://geezer.osdevbrasil.net/osd/mem/index.htm#identity - The page table entries used to identity-map kernel memory can be deleted once paging and virtual addresses are enabled. Can somebody please explain? Identity mapping is when VA(Virt Address)=PA(Physical address). So basically when you set up your page tables you need to make sure they map identically. This is very easily done if you consider each 4KB block as a page beginning from location 0 upto whatever you've found to be the highest memory available either thru BIOS or GRUB. Remember that while setting up your PTEs and PDE every address is a physical one. So if you thought that your kernel would be linked initially to a higher VA since you would remap it to a lower memory physically then that would be WRONG!. Without PTEs and PDEs installed don't do that!. Why would you want it? Well for a simple reason, when your kernel starts to boot there's no translator,(No PTEs/PDEs and the Paging Enabled bit of processor is also cleared AFAIK just after the BIOS is done), yet since you've not enabled your processor for that but you'll be doing that in a moment. So let's say you made your kernel to be linked to higher VA like 3Gigs. Now the addresses would be generated beginning 3Gigs however you still don't have the Page tables installed since your kernel just started. So in that case the address is the physical address. And if you've not loaded your kernel beginning 3Gigs then it would definitely come crashing down. 
To avoid the crash in case you made your kernel to link to higher half of the memory, you can use GDT trick since segmentation is always on and you can make the overflow of the address addition to translate to a lower physical memory even if paging is not enabled yet. Thus it is possible to load the kernel at lower memory addresses while the linkage would be for higher VMA. And once your PTEs/PGD are enabled then you can use those instead of the GDT trick. Here's a link to that http://wiki.osdev.org/Higher_Half_With_GDT Thanks Vaibhav Jain Hi, Thanks for replying but I am still confused. I continued reading about this thing and what I have understood is the following : After the kernel executes the instruction to enable paging the instruction pointer will contain the address of the next instruction which will now be treated as a virtual address. So for the next instruction to be executed the page table should map this address to itself. Please correct me if I am wrong. I am confused by the point about linking the kernel to higher address. Could you please put that in a step by step manner to make it clear what happens before paging is enabled and what happens after that. Also, please explain at what point during the execution of kernel code are the identity-mapped addresses no longer useful ? Thanks Vaibhav Hi, I am somewhat understanding your point. But I have some other queries now in my mind. If the kernel is linked to 3Gigs is there a way other than the GDT trick.? Make your load address = VA when you link so you won't have to worry about doing the GDT trick. In fact I am wondering that if the kernel is linked to 3Gigs and Grub loads it at 1MB physical, how will even the first instruction of kernel execute ? I mean if all the address generated by kernel are above 3 Gigs and paging is not enabled how will it start running ? That's what the GDT trick is for. If you read the intel/amd processor manuals the segmentation is always on. 
So when the address get generated your segment's base address is still added to the generated address before it is put on wire. You can add a constant offset (in your GDT's base address part) to the generated address to get the address beginning from the load address of your kernel. I would suggest you make the higher half kernel later and try to first create some code that can fragment your available memory into pages and store this information so you'll know what all pages are there. Next would be to do identity mapping, since your kernel VMA=LMA in your linker script this would be easier to do. When you get that paging enabled you can move on to higher half kernel. I would suggest you to work on page replacement algos and virtual memory management code side by side for better integration with paging in later stages. Maybe you can post your code if you are allowed
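The identity-mapping step can be sketched as plain C over an array. This assumes 32-bit x86 without PAE (1024 entries per table, 4KB pages, bit 0 = present, bit 1 = writable). `identity_map_table` and `translate` are hypothetical helper names, and `translate` models only the table lookup, not a real MMU.

```c
#include <assert.h>
#include <stdint.h>

#define X86_PAGE_SIZE 4096u
#define X86_ENTRIES   1024u   /* entries per page table (non-PAE) */
#define PTE_PRESENT   0x1u
#define PTE_WRITE     0x2u

/* Fill one page table so that page N of a 4 MiB region maps to frame N. */
static void identity_map_table(uint32_t table[X86_ENTRIES], uint32_t first_phys)
{
    assert(first_phys % (X86_PAGE_SIZE * X86_ENTRIES) == 0); /* 4 MiB aligned */
    for (uint32_t i = 0; i < X86_ENTRIES; i++)
        table[i] = (first_phys + i * X86_PAGE_SIZE) | PTE_PRESENT | PTE_WRITE;
}

/* Model of the lookup through a single table: frame from the PTE, offset kept. */
static uint32_t translate(const uint32_t table[X86_ENTRIES], uint32_t vaddr)
{
    uint32_t pte = table[(vaddr >> 12) & 0x3FFu];
    return (pte & ~0xFFFu) | (vaddr & 0xFFFu);
}
```

After `identity_map_table(table, 0)`, every address in the first 4 MiB translates to itself. That is what lets the instruction following the paging-enable instruction be fetched from the same address as before.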
RE: Query on linker scripts
-Original Message- From: Vaibhav Jain [mailto:vjoss...@gmail.com] Sent: Sunday, March 25, 2012 3:19 AM To: Pranay Kumar Srivastava Subject: Re: Query on linker scripts Hi Pranay, Thanks for replying!. I am still not clear about this as I have not reached the part of the tutorial which talks about pte and pgd. Could you please explain this point about safety of section with a simpler example? I'll take example from your script. .bss : { sbss = .; *(COMMON) *(.bss) ebss = .; } What I wanted to say was instead of taking ebss within .bss section you should take it outside that section. You might need to do ABSOLUTE since . will give you relative values but you'd want absolute values since addresses that you are interested in will begin from the entry point not from a section. So you can try something like this .bss ALIGN(4096): { sbss = .; *(COMMON) *(.bss) } ebss = ABSOLUTE(.); /*This should be a page aligned address*/ Also, from your reply I figured out that it is not compulsory to define such symbols and the names can be different than sbss and ebss. Am I right ? The names can be anything it's your choice entirely. But being descriptive helps. You should have a close look at the redhat tutorial for linker scripts instead of following someone else's linker script since you might not require all the variables chosen or you might need some additional variables due to the design you've chosen for your kernel. http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/simple-example.html Thanks Vaibhav Jain Please include kernelnewbies@kernelnewbies.org in cc when replying. You are likely to get more responses that way. 
On Sat, Mar 24, 2012 at 11:45 AM, Pranay Kumar Srivastava pranay.shrivast...@hcl.com wrote: On 03/24/2012 11:52 PM, Pranay Kumar Srivastava wrote: From: kernelnewbies- bounces+pranay.shrivastava=hcl@kernelnewbies.org [kernelnewbies- bounces+pranay.shrivastava=hcl@kernelnewbies.org] On Behalf Of kernelnewbies-requ...@kernelnewbies.org [kernelnewbies- requ...@kernelnewbies.org] Sent: Saturday, March 24, 2012 9:30 PM To: kernelnewbies@kernelnewbies.org Subject: Kernelnewbies Digest, Vol 16, Issue 29 Send Kernelnewbies mailing list submissions to kernelnewbies@kernelnewbies.org To subscribe or unsubscribe via the World Wide Web, visit http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies or, via email, send a message with subject or body 'help' to kernelnewbies-requ...@kernelnewbies.org You can reach the person managing the list at kernelnewbies-ow...@kernelnewbies.org When replying, please edit your Subject line so it is more specific than Re: Contents of Kernelnewbies digest... Today's Topics: 1. Query on linker scripts (Vaibhav Jain) 2. Re: Query on linker scripts (Carlo Caione) - - Message: 1 Date: Fri, 23 Mar 2012 21:43:40 -0700 From: Vaibhav Jainvjoss...@gmail.com Subject: Query on linker scripts To: kernelnewbies@kernelnewbies.org Message-ID: CAKuUYSw=_zzykpwetbjsgeyppsrowk+whm0o5l_pncmanvc...@mail.gmail.com Content-Type: text/plain; charset=iso-8859-1 Hi, Recently I have started reading tutorials for writing a small kernel. All such tutorials mention use of linker scripts. I have read few articles on linker scritps but I am stuck on one thing. I am unable to understand the use of defining new symbols in linker scripts. Using a linker script to arrange different sections in the object file is understandable but defining symbols which are not referenced anywhere in the script is confusing. An example is the use of symbols sbss and ebss in the bss section as show in the script below ENTRY (loader) SECTIONS { . 
= 0x0010; .text ALIGN (0x1000) : { *(.text) } .rodata ALIGN (0x1000) : { *(.rodata*) } .data ALIGN (0x1000) : { *(.data) } .bss : { sbss = .; *(COMMON) *(.bss) ebss = .; } } Please explain how defining such symbols is useful. Thanks Vaibhav Jain -- next part -- An HTML attachment was scrubbed... URL: http://lists.kernelnewbies.org/pipermail/kernelnewbies/attachments/2012 0323/6e1741da/attachment-0001.html -- Message: 2 Date: Sat, 24 Mar 2012 16:26:38 +0100 From: Carlo Caionecarlo.cai...@gmail.com Subject: Re: Query on linker scripts To: Vaibhav Jainvjoss...@gmail.com Cc: kernelnewbies@kernelnewbies.org Message-ID:4f6de7ae.9070...@gmail.com Content-Type: text/plain; charset=ISO-8859-1
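To the question of why such symbols are useful: C code can declare them `extern` and use their addresses to find where a section begins and ends. The sketch below runs in user space on Linux and therefore uses `__bss_start` and `_end`, which the default GNU ld script defines. They stand in for sbss and ebss from the script quoted above. `bss_size`, `in_bss` and `scratch` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Defined by the linker script, not by any C file. Only their addresses
 * are meaningful. In the tutorial kernel the declarations would read
 * "extern char sbss[], ebss[];" instead.
 */
extern char __bss_start[];
extern char _end[];

char scratch[4096];   /* zero-initialised global, so it lands in .bss */

/* Size of the region between the two linker-provided symbols. */
size_t bss_size(void)
{
    return (size_t)((uintptr_t)_end - (uintptr_t)__bss_start);
}

/* Does this address fall inside the region delimited by the symbols? */
int in_bss(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= (uintptr_t)__bss_start && a < (uintptr_t)_end;
}
```

A small kernel typically uses the same pair of addresses at startup to zero the region, for example `memset(sbss, 0, ebss - sbss)`. A user-space program must not do that, since the C runtime has already set the region up.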
RE: Query on linker scripts
-Original Message- From: Vaibhav Jain [mailto:vjoss...@gmail.com] Sent: Monday, March 26, 2012 4:31 PM To: Pranay Kumar Srivastava Cc: kernelnewbies@kernelnewbies.org Subject: Re: Query on linker scripts

Thanks a lot for the explanation and the link!! I have just one more question about linker scripts. I am not clear about alignment of sections. Is it necessary to align sections?

It's not mandatory, but it lets you know directly, in terms of pages, how much is in use. Otherwise, while creating PTEs for the kernel, you'll need to calculate the number of pages required from the number of bytes the kernel has consumed. With page-aligned sections it's just a simple shift operation to find the corresponding page from the page's beginning address.

Can the alignment be different from 4096? In the script that I provided the text and data sections are aligned while the bss section is not. Is there a reason for it?

The alignment in linker scripts generally corresponds to page granularity. It's not about aligning data in C, which can be done on, say, 4-byte or 8-byte boundaries. This alignment is done so that when the time comes to protect the kernel and start user space applications, you'll know which pages correspond to the kernel, can't be swapped, and must be protected. Instead of thinking about how many bytes the kernel uses, think about how many pages the kernel uses. As for .bss, it takes up no space in your executable file. It only reserves memory for zero-initialized data (small tutorial kernels often keep their boot stack there as well), and you are responsible for setting that memory up and using it, so leaving out ALIGN there does not affect how the file is loaded. The data/text sections are part of the executable, so you would want them aligned on a page boundary: when you load the kernel, those sections will then be loaded at page-aligned addresses and end at page boundaries.
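The "simple shift" mentioned above, and the extra rounding needed when sections are not page aligned, can be sketched like this. It assumes 4KB pages, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12u
#define PAGE_SIZE_4K  (1u << PAGE_SHIFT_4K)

/* Page number of a page-aligned address: a plain shift. */
static inline uint32_t page_number(uint32_t addr)
{
    return addr >> PAGE_SHIFT_4K;
}

/* Pages spanned by [start, end) when both ends are page aligned. */
static inline uint32_t pages_aligned(uint32_t start, uint32_t end)
{
    assert(start % PAGE_SIZE_4K == 0 && end % PAGE_SIZE_4K == 0 && end >= start);
    return (end - start) >> PAGE_SHIFT_4K;
}

/* Without alignment, round out to the enclosing page boundaries first. */
static inline uint32_t pages_unaligned(uint32_t start, uint32_t end)
{
    uint32_t first = start & ~(PAGE_SIZE_4K - 1);
    uint32_t last  = (end + PAGE_SIZE_4K - 1) & ~(PAGE_SIZE_4K - 1);
    return (last - first) >> PAGE_SHIFT_4K;
}
```

For a kernel occupying 0x100000 to 0x104000, `pages_aligned` gives 4 directly. An unaligned section running from 0x100800 to 0x103801 also touches 4 pages, but only the rounding version gets that right.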
Thanks Vaibhav Jain On Sun, Mar 25, 2012 at 11:41 PM, Pranay Kumar Srivastava pranay.shrivast...@hcl.com wrote: -Original Message- From: Vaibhav Jain [mailto:vjoss...@gmail.com] Sent: Sunday, March 25, 2012 3:19 AM To: Pranay Kumar Srivastava Subject: Re: Query on linker scripts Hi Pranay, Thanks for replying!. I am still not clear about this as I have not reached the part of the tutorial which talks about pte and pgd. Could you please explain this point about safety of section with a simpler example? I'll take example from your script. .bss : { sbss = .; *(COMMON) *(.bss) ebss = .; } What I wanted to say was instead of taking ebss within .bss section you should take it outside that section. You might need to do ABSOLUTE since . will give you relative values but you'd want absolute values since addresses that you are interested in will begin from the entry point not from a section. So you can try something like this .bss ALIGN(4096): { sbss = .; *(COMMON) *(.bss) } ebss = ABSOLUTE(.); /*This should be a page aligned address*/ Also, from your reply I figured out that it is not compulsory to define such symbols and the names can be different than sbss and ebss. Am I right ? The names can be anything it's your choice entirely. But being descriptive helps. You should have a close look at the redhat tutorial for linker scripts instead of following someone else's linker script since you might not require all the variables chosen or you might need some additional variables due to the design you've chosen for your kernel. http://docs.redhat.com/docs/en- US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/simple- example.html Thanks Vaibhav Jain Please include kernelnewbies@kernelnewbies.org in cc when replying. You are likely to get more responses that way. 
On Sat, Mar 24, 2012 at 11:45 AM, Pranay Kumar Srivastava pranay.shrivast...@hcl.com wrote: On 03/24/2012 11:52 PM, Pranay Kumar Srivastava wrote: From: kernelnewbies- bounces+pranay.shrivastava=hcl@kernelnewbies.org [kernelnewbies- bounces+pranay.shrivastava=hcl@kernelnewbies.org] On Behalf Of kernelnewbies-requ...@kernelnewbies.org [kernelnewbies- requ...@kernelnewbies.org] Sent: Saturday, March 24, 2012 9:30 PM To: kernelnewbies@kernelnewbies.org Subject: Kernelnewbies Digest, Vol 16, Issue 29 Send Kernelnewbies mailing list submissions to kernelnewbies@kernelnewbies.org To subscribe or unsubscribe via the World Wide Web, visit http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies or, via email, send a message with subject or body 'help' to kernelnewbies-requ...@kernelnewbies.org You can reach the person managing the list at kernelnewbies-ow...@kernelnewbies.org When replying, please edit your Subject line so it is more specific than Re: Contents of Kernelnewbies digest... Today's Topics: 1. Query on linker scripts (Vaibhav Jain
Can't get Major and Minor number for device correctly.
Hi, I have a kernel module which needs to get hold of the gendisk structure. To make it a bit interactive I have a user space program which uses the stat system call to get the respective device's dev_t number (I use st_rdev of struct stat) and passes it on to my module via an ioctl call. This part works well. When my module tries to use the get_gendisk function, it returns NULL. I printed out MAJOR(dev_t) and MINOR(dev_t) and was surprised to get the value 0 for MAJOR(dev_t). A little more digging got me to this code snippet; the file is kdev_t.h

---%
#ifdef __KERNEL__
#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)
#define MAJOR(dev) ((unsigned int) ((dev) >> MINORBITS))
#define MINOR(dev) ((unsigned int) ((dev) & MINORMASK))
#define MKDEV(ma,mi) (((ma) << MINORBITS) | (mi))
...
#else /* __KERNEL__ */
/* Some programs want their definitions of MAJOR and MINOR and MKDEV from the kernel sources. These must be the externally visible ones. */
#define MAJOR(dev) ((dev)>>8)
#define MINOR(dev) ((dev) & 0xff)
#define MKDEV(ma,mi) ((ma)<<8 | (mi))
#endif /* __KERNEL__ */
---%

Since __KERNEL__ is defined in the topmost Makefile, the first set of macro definitions is used, which is giving me the wrong result. What I really want is the #else one. Surprisingly, doing a major and minor in user space works. I traced it back to the following code snippet, in file /usr/include/sys/sysmacros.h

---%
# if defined __GNUC__ && __GNUC__ >= 2 && defined __USE_EXTERN_INLINES
__extension__ __extern_inline unsigned int
__NTH (gnu_dev_major (unsigned long long int __dev))
{
  return ((__dev >> 8) & 0xfff) | ((unsigned int) (__dev >> 32) & ~0xfff);
}
__extension__ __extern_inline unsigned int
__NTH (gnu_dev_minor (unsigned long long int __dev))
{
  return (__dev & 0xff) | ((unsigned int) (__dev >> 12) & ~0xff);
}
#endif
# define major(dev) gnu_dev_major (dev)
# define minor(dev) gnu_dev_minor (dev)
---%

So basically the user space code doing stat and then major/minor works because it's doing dev_t >> 8. My questions are: 1.
How are MAJOR and MINOR able to work within the kernel with the 32-bit definitions? dev_t >> 20 would make the major number 0, and get_gendisk should then fail; most of the devices listed, like my disks, don't report a very large dev_t when I do stat on them. 2. Should I be checking whether or not a 32-bit dev_t is in effect? But if I have to do this, what's the purpose of the MAJOR and MINOR macros? Aren't they there to do the same job? As an example, stat reported the st_rdev field as 2065, which translates to 8,17. I'm using SLES 11 SP1, kernel 2.6.32 Regards, Pranay Kr. Srivastava pranay.shrivast...@hcl.com Software Engineer ERS,HCL Technologies A-5, Sector 24, Noida 201301, U.P. (India) ::DISCLAIMER:: --- The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. It shall not attach any liability on the originator or HCL or its affiliates. Any views or opinions presented in this email are solely those of the author and may not necessarily reflect the opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of the author of this e-mail is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately. Before opening any mail and attachments please check them for viruses and defect. ---
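The two encodings can be compared side by side in user space. This is a minimal sketch: the `K_` macros are local copies of the in-kernel definitions quoted above, and `decode_user_dev` is a hypothetical helper that re-packs a user-space value. The kernel's own `new_decode_dev()` in include/linux/kdev_t.h appears to do the same conversion, so check your tree before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the in-kernel layout: 12-bit major, 20-bit minor. */
#define K_MINORBITS 20
#define K_MAJOR(dev)    ((unsigned int)((dev) >> K_MINORBITS))
#define K_MINOR(dev)    ((unsigned int)((dev) & ((1U << K_MINORBITS) - 1)))
#define K_MKDEV(ma, mi) (((uint32_t)(ma) << K_MINORBITS) | (mi))

/*
 * Re-pack the low 32 bits of a user-space st_rdev into the in-kernel
 * layout. User space keeps the major in bits 8..19 and splits the minor
 * across bits 0..7 and 20..31.
 */
static inline uint32_t decode_user_dev(uint32_t udev)
{
    unsigned int major = (udev & 0xfff00u) >> 8;
    unsigned int minor = (udev & 0xffu) | ((udev >> 12) & 0xfff00u);

    return K_MKDEV(major, minor);
}
```

For the value from the post, 2065 is 0x811. Read with the in-kernel macros it is major 0, minor 2065, which is why the lookup fails. After `decode_user_dev(2065)` the result is 0x800011, that is major 8, minor 17.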
Using Sysfs uevents.
Hi, I was playing with sysfs and I'm able to create a kset and kobjects within it. I need to know how to use the uevents of the kobjects that I create. For example, while reading the code I found that certain events such as ADD and DEL (and a couple more) are apparently fired. Currently I'm not handling these events (the ops field is NULL) and they don't bother me, so I take it they are not mandatory? If I were to actually do something with these events, what should it be? My module runs fine, and the uevents are supposed to be for userland applications (hotplug), but the point is again: how will a user space application get to know about them? Does the application need to create netlink sockets for it? If it does, then why bother with the uevents of the kobject?
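On the user space side, uevents are broadcast over a netlink socket of protocol NETLINK_KOBJECT_UEVENT, which is how udev-style daemons receive them. The sketch below shows opening such a socket and picking a KEY=value field out of a received payload. `uevent_open` and `uevent_get` are hypothetical helper names, and the payload layout (an "action@devpath" header followed by NUL-separated KEY=value strings) is the commonly documented one, which is worth verifying against your kernel version.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <unistd.h>

/* Open a netlink socket subscribed to kernel uevent broadcasts; fd or -1. */
int uevent_open(void)
{
    struct sockaddr_nl addr;
    int fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_groups = 1;            /* multicast group used by the kernel */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Find KEY in "action@devpath\0KEY=value\0KEY=value\0...".
 * buf must end with a NUL byte inside the first len bytes.
 */
const char *uevent_get(const char *buf, size_t len, const char *key)
{
    size_t klen = strlen(key);
    size_t pos = 0;

    assert(len > 0 && buf[len - 1] == '\0');
    while (pos < len) {
        const char *field = buf + pos;
        size_t flen = strnlen(field, len - pos);

        if (flen > klen && field[klen] == '=' && memcmp(field, key, klen) == 0)
            return field + klen + 1;
        pos += flen + 1;           /* skip the field and its terminator */
    }
    return NULL;
}
```

A listener would call `recv()` on the descriptor from `uevent_open()`, terminate the buffer, and then query fields such as `uevent_get(buf, n, "ACTION")`. On the kernel side the events themselves come from `kobject_uevent()`; a kset's uevent ops are optional hooks for filtering events or adding environment variables.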
Working with kernel_accept?
I was trying to create a socket within the kernel and used the kernel_* helper functions to get started. This worked fine for UDP; with TCP, however, I ran into some issues with the following steps:
1.) I was able to create a listening TCP socket using sock_create_kern. Should I be using sock_create only?
2.) I changed the sk_data_ready callback of the listening socket so that a waiting thread would be notified when a connection is ready to be accepted. When that happened the thread was woken up and then called kernel_accept.
3.) Here the issue started: kernel_accept uses sock_create_lite, and the machine just froze. After quite a lot of hours I figured out that the problem was apparently with sock_create_lite. This function was not initializing sock->sk (I printed it and found it to be NULL), which I guess caused the machine to freeze.
4.) As a workaround, I went back to sock_create_kern and called sock->ops->accept instead of kernel_accept, and it worked.
Is there any other step required in order to work with kernel_accept? I'm using SLES 11 SP1, kernel 2.6.32.12