Re: kernel build error
On Tue, Mar 19, 2013 at 09:43:11PM +0700, Mulyadi Santosa wrote:
> Hi ...
>
> On Tue, Mar 19, 2013 at 12:28 PM, Kumar amit mehta gmate.a...@gmail.com wrote:
> > grep for copy_from_user_overflow gives me this:
> >
> > amit@ubuntu:~/linux-next/linux-next$ grep -ri copy_from_user_overflow *
> > arch/s390/include/asm/uaccess.h:extern void copy_from_user_overflow(void)
> > arch/s390/include/asm/uaccess.h:copy_from_user_overflow();
> > arch/tile/include/asm/uaccess.h:extern void copy_from_user_overflow(void)
> > arch/tile/include/asm/uaccess.h:copy_from_user_overflow();
> > arch/parisc/include/asm/uaccess.h:extern void copy_from_user_overflow(void)
> > arch/parisc/include/asm/uaccess.h:copy_from_user_overflow();
> > arch/x86/include/asm/uaccess_32.h:extern void copy_from_user_overflow(void)
> > arch/x86/include/asm/uaccess_32.h: copy_from_user_overflow();
> > drivers/vfio/pci/vfio_pci_config.c: * with count of 1/2/4 and hits copy_from_user_overflow without this.
> > lib/usercopy.c:void copy_from_user_overflow(void)
>
> IMHO, I think uaccess_32.h is what you need here. I draw that conclusion after checking this line:
> http://lxr.linux.no/#linux+v3.8.3/arch/x86/include/asm/uaccess_32.h#L194
>
> I might be wrong, so feel free to test first

Actually the above header file is supposed to get included, based on the architecture only.

snip from arch/x86/include/asm/uaccess.h
#ifdef CONFIG_X86_32
# include <asm/uaccess_32.h>
#else
# include <asm/uaccess_64.h>
#endif
snip from arch/x86/include/asm/uaccess.h

snip from .config
amit@ubuntu:~/linux-next/linux-next$ grep -w CONFIG_X86_32 .config
CONFIG_X86_32=y
snip from .config

CPU arch on my machine:
amit@ubuntu:~/linux-next/linux-next$ uname -m
i686

Based on this observation, I think, I do not need to include the uaccess_32.h in any of those files.

-Amit
___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: kernel build error
On Tue, Mar 19, 2013 at 11:56:44PM -0700, Kumar amit mehta wrote:
> On Tue, Mar 19, 2013 at 09:43:11PM +0700, Mulyadi Santosa wrote:
> > IMHO, I think uaccess_32.h is what you need here.
> [...]
> CPU arch on my machine:
> amit@ubuntu:~/linux-next/linux-next$ uname -m
> i686
>
> Based on this observation, I think, I do not need to include the uaccess_32.h in any of those files.

I forgot that 'uname -m' reports the architecture the running kernel was built for, and _not_ what the CPU itself is capable of. The CPU on my machine seems to be 64-bit ('grep flags /proc/cpuinfo' shows 'lm'). So my understanding is that I have a 32-bit kernel running on a 64-bit machine.
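The distinction drawn above (architecture of the running kernel versus capability of the CPU) can be sketched in a few lines of Python. This is only an illustration for x86: the helper names `cpu_supports_long_mode` and `describe` are made up for this example, and the functions work on text passed in by the caller rather than on the poster's actual machine.

```python
# Hypothetical helpers, x86 only: compare the kernel's architecture string
# (what 'uname -m' prints) with the CPU flags from /proc/cpuinfo-style text.

def cpu_supports_long_mode(cpuinfo_text):
    """Return True if a 'flags' line lists 'lm' (long mode, i.e. 64-bit capable)."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags" and "lm" in value.split():
            return True
    return False


def describe(kernel_arch, cpuinfo_text):
    """Summarise kernel bitness versus CPU capability in one sentence."""
    kernel_is_32bit = kernel_arch in ("i386", "i486", "i586", "i686")
    if kernel_is_32bit and cpu_supports_long_mode(cpuinfo_text):
        return "32-bit kernel on a 64-bit-capable CPU"
    if kernel_is_32bit:
        return "32-bit kernel on a 32-bit CPU"
    return "kernel architecture: " + kernel_arch
```

On a live Linux system one could pass `platform.machine()` and the contents of `/proc/cpuinfo` to `describe()`.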
Wake_lock in linux kernel
Dear All:

I am now studying the power driver in the Linux source code, but I could not find any documentation about it for Linux; all the documentation I found was for Android. Can someone help me?

BTW, how does /sys/power interact with the Linux kernel?

Many thanks
Re: Design Patterns in Linux Kernel: Fancy Tricks With Linked Lists
Robert P. J. Day rpj...@crashcourse.ca wrote:
> Quoting Arlie Stephens ar...@worldash.org:
> > Interestingly, part of the debate yesterday probably resulted from one engineer having Love's 2nd edition, and me having his 3rd edition. Apparently RPDay pointed out some problems to Love which resulted in him changing his linked list discussion in his 3rd edition ;-)
>
> Been a while since I re-read my own tutorial, it might merit a bit of a rewrite. Is there anything about it that seems unclear -- I remember my own moment of epiphany, "Holy crap, what an interesting way to do it."
>
> And, yes, if you try to reconcile Love's 2nd and 3rd editions on the topic, that will not end well. :-)
>
> rday

Robert,

I read it briefly yesterday. I don't recall it having an example like:

Instantiate head pointer
Add 2 or 3 list members
Walk list and printk an object member

I find examples like that make all the difference for me.

Greg
Re: Design Patterns in Linux Kernel: Fancy Tricks With Linked Lists
On Wed, 20 Mar 2013, Greg Freemyer wrote:
> Robert P. J. Day rpj...@crashcourse.ca wrote:
> > Been a while since I re-read my own tutorial, it might merit a bit of a rewrite.
> [...]
> Robert,
>
> I read it briefly yesterday. I don't recall it having an example like:

  it being ... ?

> Instantiate head pointer
> Add 2 or 3 list members
> Walk list and printk a object member
>
> I find examples like that make all the difference for me.

  i agree completely that a simple picture of an empty list showing that it consisted of an initial struct list_head would have been amazingly useful.

  i'm in the midst of (fingers crossed) moving my entire site to a different technology, part of which should support drawing cool diagrams with a minimum of fuss. at which point most of my stuff will be totally rewritten. yes, i love diagrams.

rday

--
Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca
Twitter: http://twitter.com/rpjday
LinkedIn: http://ca.linkedin.com/in/rpjday
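The three steps asked for in this thread (create a head, add a few members, walk the list) can be sketched as a small Python model of the kernel's circular doubly linked list. This is not the kernel API: `ListHead`, `list_add_tail`, `list_empty` and `list_for_each_entry` only mimic the names in include/linux/list.h, the `owner` back-reference is a stand-in for what `container_of()` does in C, and `Item` is a made-up payload type.

```python
class ListHead:
    """Model of struct list_head: a node that initially points at itself."""

    def __init__(self, owner=None):
        self.next = self
        self.prev = self
        self.owner = owner  # stand-in for container_of(); None for the head


def list_add_tail(new, head):
    """Link 'new' in just before 'head', i.e. at the end of the list."""
    last = head.prev
    new.prev = last
    new.next = head
    last.next = new
    head.prev = new


def list_empty(head):
    """An empty list is a head whose next pointer is the head itself."""
    return head.next is head


def list_for_each_entry(head):
    """Yield the object embedding each node, stopping when we are back at head."""
    node = head.next
    while node is not head:
        yield node.owner
        node = node.next


class Item:
    """Made-up payload type that embeds a list node, as a kernel struct would."""

    def __init__(self, value):
        self.value = value
        self.link = ListHead(owner=self)


head = ListHead()                    # 1. instantiate the head
for v in (10, 20, 30):               # 2. add three members
    list_add_tail(Item(v).link, head)
values = [item.value for item in list_for_each_entry(head)]  # 3. walk the list
print(values)
```

The point the model tries to make visible is that the head is just another node, so an empty list and a populated one are handled by the same pointer updates.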
About time delay in kernel threads at LOCAL_OUT netfilter hook
Hello there,

I am mangling RTP data in kernel space. I have written a netfilter module that is responsible for encryption, padding and ptime modification. Encryption and padding work just fine.

Ptime modification involves two steps. When large packets arrive at the server (say ptime 60, 60 bytes of RTP payload with the g729 codec), they are split into packets of ptime 20. The SIP server is Asterisk. When Asterisk sends packets of ptime 20, they are merged into the desired large ptime (say 60). I checked the packets to see whether checksumming, RTP timestamping and sequencing are done correctly, and they are. The whole process without large ptime is satisfactory.

After asking on the Asterisk mailing list, I learned that I have to add some delay before sending each large packet. But I am sending packets out of the box in the LOCAL_OUT hook. I wrote a thread that queues the packets until a delay condition is satisfied and then tries to send them from that thread. What I found is that if I set the delay to zero, everything is fine: the packets are queued and sent as expected. But if I set delay > 0, the thread does not send packets.

I used dst_output() from include/net/dst.h for sending packets, but this function works only in the LOCAL_OUT hook. When I add some delay in the thread, the thread misses the hook and the packet is never sent. I tested adding the delay in the target function instead, but then the whole kernel faces the delay, so that is of no help. I have to do this inside a thread. But how do I avoid missing that hook?

Can anyone suggest a workaround?

--
Rifat Rahman
BFQ: simple elevator
I've been pondering making a very simple I/O scheduler, one step above noop: it just keeps everything in a big heap sorted by position, with a single cursor bouncing from head to tail and shaving off requests in a loop of ascending and descending sweeps.

Any gotchas I need to be aware of, or can I simply fork off of deadline and simplify it to omit batching?
Re: kernel build error
On Wed, 20 Mar 2013 00:07:57 -0700, Kumar amit mehta said:
> I forgot that 'uname -m' reports the architecture the running kernel was built for, and _not_ what the CPU itself is capable of. The CPU on my machine seems to be 64-bit ('grep flags /proc/cpuinfo' shows 'lm'). So my understanding is that I have a 32-bit kernel running on a 64-bit machine.

Or more correctly, you have a kernel actually running in 32-bit mode on a machine that is 64-bit capable.
Re: Memory allocations in linux for processes
On 3/19/13, Niroj Pokhrel nirojpokh...@gmail.com wrote:
> Hi Mulyadi. Thank you very much. But I still have a minor confusion. All I ran was this short program:
>
> #include <stdio.h>
> int main()
> {
>     while (1) { }
>     return 0;
> }

Well, before your program is loaded, one or more memory regions must certainly be allocated in RAM, right? If not, where would it be stored?

As Valdis said, the strace utility can help you understand what happens during the binary's execution. You'll see lots of mmap, I promise :)

--
regards,
Mulyadi Santosa
Freelance Linux trainer and consultant
blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com
Re: BFQ: simple elevator
On 3/20/13, Raymond Jennings shent...@gmail.com wrote:
> I've been pondering making a very simple IO scheduler one step above noop, just keeps everything in a big heap sorted by position and a single cursor bouncing from head to tail shaving off requests in a loop of ascending and descending sweeps.

Pardon me for any possible silliness, but what happens if there are incoming I/O operations at very nearby sectors (or perhaps at the same sector)? I suppose the elevator will prioritize them over the rest? (i.e. starvation will happen...)

--
regards,
Mulyadi Santosa
Freelance Linux trainer and consultant
blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com
Re: BFQ: simple elevator
On Thu, 21 Mar 2013 02:24:23 +0700, Mulyadi Santosa said:
> pardon me for any possible sillyness, but what happen if there are incoming I/O operation at very nearby sectors (or perhaps at the same sector?)? I suppose, the elevator will prioritize them first over the rest? (i.e starving will happen...)

And this, my friends, is why elevators aren't as easy to do as the average undergrad might hope - it's a lot harder to balance fairness and throughput across all the corner cases than you might think. It gets really fun when you have (for example) a 'find' command moving the heads all over the disk while another process is trying to do large amounts of streaming I/O. And then you'll get some idiot process that insists on doing the occasional fsync() or syncfs() call.

Yes, it's almost always *all* corner cases; it's very rare (unless you're an embedded system like a Tivo) that all your I/O is one flavor that is easily handled by a simple elevator.
DMA attributes with dma_sync_single()
Hi there,

I was wondering whether setting DMA attributes like DMA_ATTR_WRITE_BARRIER will have an effect if I don't dma_unmap_single_attrs() an area but merely dma_sync_single_for_cpu() it after a transfer from the device to host memory.

My problem is that I get an interrupt (MSI) indicating that an (independent) DMA transfer is done, where the interrupt flags get pushed (via a DMA write) to host memory. From time to time, the data of the other DMA transfer does not show up in host memory in time with the interrupt. I am trying to force coherency by using DMA_ATTR_WRITE_BARRIER on this status push (of the interrupt flags), to force the other writes (device to host) to complete.

Does that seem reasonable? Can anyone with insight shine some light on that? It's a PCIe device; I'm running Fedora 18 with 3.8.3-201.fc18.x86_64.

Cheers,
Moritz
Re: BFQ: simple elevator
On Wed, Mar 20, 2013 at 2:03 PM, valdis.kletni...@vt.edu wrote:
> On Thu, 21 Mar 2013 02:24:23 +0700, Mulyadi Santosa said:
> > pardon me for any possible sillyness, but what happen if there are incoming I/O operation at very nearby sectors (or perhaps at the same sector?)? I suppose, the elevator will prioritize them first over the rest? (i.e starving will happen...)

This is actually why I proposed to enforce forward progress by only looking for further requests in one direction at a time.

Suppose you have requests at sectors 1, 4, 5, and 6.

You dispatch sectors 1, 4, and 5, leaving the head parked at 5 and the direction as ascending.

But suddenly, just before you get a chance to dispatch for sector 6, sector 4 gets busy again.

I'm not proposing going back to sector 4. It's behind us and (as you indicated) we could starve sector 6 indefinitely.

So instead, because sector 4 is on the wrong side of our present head position, it is ignored and we keep marching forward, and then we hit sector 6 and dispatch it.

Once we hit sector 6 and dispatch it, we do a u-turn and start descending. That's when we pick up sector 4 again.

When we're going up, we ignore what's below us, and when we're going down we ignore what is above us. We only switch directions when there's nothing in front of us the way we were going.

In theory, given that disk capacity is itself finite, so too is the amount of time one has to wait before getting reached by the elevator.

Anyway, does this clarification answer your concerns about starvation?

> And this, my friends, is why elevators aren't as easy to do as the average undergrad might hope - it's a lot harder to balance fairness and throughput across all the corner cases than you might think. It gets really fun when you have (for example) a 'find' command moving the heads all over the disk while another process is trying to do large amounts of streaming I/O. And then you'll get some idiot process that insists on doing the occasional fsync() or syncfs() call.
>
> Yes, it's almost always *all* corner cases, it's very rare (unless you're an embedded system like a Tivo) that all your I/O is one flavor that is easily handled by a simple elevator.

In my case I'm just concerned with raw total system throughput.

I called it BFQ for a reason.
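The one-direction-at-a-time sweep described in this message can be modelled in user space with a sorted list and a cursor. The sketch below is only a toy under stated assumptions: `SweepElevator` and its methods are invented names, it is not the kernel's elevator API, it ignores seek cost entirely, and it treats a request at exactly the current head position as "not ahead" so that repeated requests for one sector cannot hold the cursor in place.

```python
import bisect


class SweepElevator:
    """Toy model: serve pending sectors in alternating up and down sweeps."""

    def __init__(self):
        self.pending = []      # sorted list of requested sector numbers
        self.head = 0          # sector of the last dispatched request
        self.ascending = True  # current sweep direction

    def add(self, sector):
        """Queue a request, keeping the pending list sorted by position."""
        bisect.insort(self.pending, sector)

    def dispatch(self):
        """Return the next sector to service, or None if nothing is queued."""
        if not self.pending:
            return None
        if self.ascending:
            # First request strictly above the head.
            i = bisect.bisect_right(self.pending, self.head)
            if i == len(self.pending):   # nothing ahead: u-turn
                self.ascending = False
                i = len(self.pending) - 1
        else:
            # Last request strictly below the head.
            i = bisect.bisect_left(self.pending, self.head) - 1
            if i < 0:                    # nothing ahead: u-turn
                self.ascending = True
                i = 0
        self.head = self.pending.pop(i)
        return self.head


# Replay the scenario from the message: 1, 4, 5, 6 queued, then sector 4
# becomes busy again just before 6 is dispatched.
elevator = SweepElevator()
for sector in (1, 4, 5, 6):
    elevator.add(sector)
order = [elevator.dispatch() for _ in range(3)]
elevator.add(4)
order.append(elevator.dispatch())
order.append(elevator.dispatch())
print(order)  # [1, 4, 5, 6, 4]
```

The late request for sector 4 is served only after the u-turn at 6, which is the forward-progress property the message argues for.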
Linux elevators (Re: BFQ: simple elevator)
The ongoing thread reminds me of a simple question I've had since I first read about Linux's multiple I/O schedulers. Why is the choice of I/O scheduler global to the whole kernel, rather than per-device or similar?

Consider a system with both traditional rotating disks and SSDs - not at all far-fetched. An appropriate I/O scheduling algorithm for rotating disks is likely to do a lot of work that's useless for SSDs. Why require that the same algorithms be used for both?

--
Arlie (Arlie Stephens ar...@worldash.org)
Re: BFQ: simple elevator
On Wed, 20 Mar 2013 14:41:31 -0700, Raymond Jennings said:
> Suppose you have requests at sectors 1, 4, 5, and 6
>
> You dispatch sectors 1, 4, and 5, leaving the head parked at 5 and the direction as ascending.
>
> But suddenly, just before you get a chance to dispatch for sector 6, sector 4 gets busy again.
>
> I'm not proposing going back to sector 4. It's behind us and (as you indicated) we could starve sector 6 indefinitely.
>
> So instead, because sector 4 is on the wrong side of our present head position, it is ignored and we keep marching forward, and then we hit sector 6 and dispatch it.
>
> Once we hit sector 6 and dispatch it, we do a u-turn and start descending. That's when we pick up sector 4 again.

The problem is that not all seeks are created equal.

Suppose the requests are at 1, 4, 5, and 199343245. If, as we're servicing 5, another request for 4 comes in, we may well be *much* better off doing a short seek to 4 and then one long seek to the boonies, rather than 2 long seeks.

My laptop has a 160G Western Digital drive in it (WD1600BJKT). The minimum track-to-track seek time is 2ms, the average time is 12ms, and the maximum is probably on the order of 36ms. So by replacing 2 max-length seeks with a track-to-track seek and 1 max-length, you can almost halve the delay waiting for seeks (38ms versus 72ms). (And even better if the target block is logically before the current one, but still on the same track, so you only take a rotational latency hit and no seek hit.)

(The maximum is not given in the spec sheets, but is almost always 3 times the average - for a discussion of the math behind that, and a lot of other issues, see: http://pages.cs.wisc.edu/~remzi/OSFEP/file-disks.pdf)

And of course, this interacts in very mysterious ways with the firmware on the drive, which can do its own re-ordering of I/O requests and/or manage the use of the disk's onboard read/write cache - this is why command queueing is useful for throughput, because if the disk has the option of re-ordering 32 requests, it can do more than if it only has 1 or 2 requests in the queue.

Of course, very deep command queues have their own issues - most notably that at some point you need to use barriers or something to ensure that the metadata writes aren't being re-ordered into a pattern that could cause corruption if the disk lost its mind before completing all the writes...

> In my case I'm just concerned with raw total system throughput.

See the above discussion.
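The 38ms versus 72ms comparison above is plain arithmetic, spelled out below. The figures are the ones quoted in the message (2ms track-to-track, 12ms average, maximum assumed to be three times the average); the variable names are made up for this sketch, and real seek times depend on the drive and its firmware.

```python
# Seek-time figures quoted in the message for the WD1600BJKT (milliseconds).
TRACK_TO_TRACK_MS = 2
AVERAGE_SEEK_MS = 12
MAX_SEEK_MS = 3 * AVERAGE_SEEK_MS  # assumed: maximum is about 3x the average

# Strict sweep: long seek out to the far request, long seek back to sector 4.
two_long_seeks = 2 * MAX_SEEK_MS

# Alternative: short hop back to sector 4 first, then one long seek out.
short_then_long = TRACK_TO_TRACK_MS + MAX_SEEK_MS

# Fraction of seek delay saved by taking the short hop first.
saving = 1 - short_then_long / two_long_seeks

print(two_long_seeks, short_then_long, round(saving, 3))  # 72 38 0.472
```

A saving of roughly 47 percent is what "almost halve" refers to.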
Re: Linux elevators (Re: BFQ: simple elevator)
On Wed, 20 Mar 2013 16:05:09 -0700, Arlie Stephens said:
> The ongoing thread reminds me of a simple question I've had since I first read about Linux's multiple I/O schedulers. Why is the choice of I/O scheduler global to the whole kernel, rather than per-device or similar?

They aren't global to the kernel. On my laptop:

# find /sys/devices/pci* -name 'scheduler' | xargs grep .
/sys/devices/pci:00/:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler:noop deadline [cfq]
/sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/scheduler:noop deadline [cfq]
# echo noop > /sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/scheduler
# find /sys/devices/pci* -name 'scheduler' | xargs grep .
/sys/devices/pci:00/:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler:noop deadline [cfq]
/sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/scheduler:[noop] deadline cfq

I just changed the scheduler for the CD-ROM.
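In the output above, each device's `scheduler` file lists the available schedulers and marks the active one with square brackets. A minimal sketch of reading that format follows; `parse_scheduler` is a made-up helper name, and the function works on a string so that it does not depend on any particular sysfs path.

```python
def parse_scheduler(line):
    """Split a sysfs scheduler line into (active, available).

    The active scheduler is the name wrapped in square brackets; 'active'
    is None if no name is bracketed.
    """
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


before = parse_scheduler("noop deadline [cfq]")
after = parse_scheduler("[noop] deadline cfq")
print(before)
print(after)
```

Fed the two `sr0` lines from the listing, it reports `cfq` as active before the `echo` and `noop` afterwards, while `sda` is unaffected because the setting is per device.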
Re: Linux elevators (Re: BFQ: simple elevator)
On Mar 20 2013, valdis.kletni...@vt.edu wrote:
> On Wed, 20 Mar 2013 16:05:09 -0700, Arlie Stephens said:
> > The ongoing thread reminds me of a simple question I've had since I first read about linux' mutiple I/O schedulers. Why is the choice of I/O scheduler global to the whole kernel, rather than per-device or similar?
>
> They aren't global to the kernel.

Thanks for the correction. It appears I got wrong (outdated?) information from some book on kernel development, or perhaps simply misunderstood what I read.

When I tried the example you gave, I saw the same thing, even on the older kernels I'm working with (2.6.32 in particular).

> On my laptop:
> [...]
> I just changed the scheduler for the CD-ROM.

--
Arlie (Arlie Stephens ar...@worldash.org)
Re: BFQ: simple elevator
On Wed, Mar 20, 2013 at 4:10 PM, valdis.kletni...@vt.edu wrote:
> The problem is that not all seeks are created equal.
>
> Consider the requests are at 1, 4, 5, and 199343245. If as we're servicing 5, another request for 4 comes in, we may well be *much* better off doing a short seek to 4 and then one long seek to the boonies, rather than 2 long seeks.
> [...]
> > In my case I'm just concerned with raw total system throughput.
>
> See the above discussion.

Hmm... Maybe a hybrid approach that allows a finite number of reverse seeks, or, as I suspect deadline does, a finite delay before abandoning the close stuff to march to the boonies.

Btw, does deadline currently do ping-pong seeking, or does it always go for the nearest sector in either direction?

At any rate, I'll probably need performance tuning to get a feel for what will work.

What I'd like to start with is correct usage of the APIs in question to actually do the processing. Is request dispatch well documented? My first hunch was to copy-paste from deadline and then tweak it.

Mostly I want to make something simple and learn how to write an I/O scheduler in the process.
Re: Linux elevators (Re: BFQ: simple elevator)
On Wed, Mar 20, 2013 at 4:45 PM, Arlie Stephens ar...@worldash.org wrote:
> On Mar 20 2013, valdis.kletni...@vt.edu wrote:
> > They aren't global to the kernel.
>
> Thanks for the correction. It appears I got wrong (outdated?) information from some book on kernel development, or perhaps simply misunderstood what I read.

Yeah, the global thing is just the system default. It's what newly created or discovered block devices will be assigned initially.

> When I tried the example you gave, I saw the same thing, even on the older kernels I'm working with (2.6.32 in particular).
> [...]
Re: wake_lock in linux kernel
Hi Ben,

Please find below a link to the kernel's power management docs:

https://www.kernel.org/doc/Documentation/power/basic-pm-debugging.txt

It explains how to interact with /sys/power/. You can use the /sys/power/pm_test file to test the power management functionality of the Linux kernel. Feel free to look at the other files in that directory as well.

Regards,
yuvaraj.A

On Wed, Mar 20, 2013 at 3:19 PM, Ben Wu cray...@yahoo.cn wrote:
> Dear All:
> I now study power driver in linux source code, but didn't find some doc about liux,all the doc was for android, can some help me?BTW,how /sys/power interactive with linux kernel??
> Many thanks