Re: kernel build error

2013-03-20 Thread Kumar amit mehta
On Tue, Mar 19, 2013 at 09:43:11PM +0700, Mulyadi Santosa wrote:
 Hi ...
 
 On Tue, Mar 19, 2013 at 12:28 PM, Kumar amit mehta gmate.a...@gmail.com 
 wrote:
  grep for copy_from_user_overflow gives me this:
 
  amit@ubuntu:~/linux-next/linux-next$ grep -ri copy_from_user_overflow *
  arch/s390/include/asm/uaccess.h:extern void copy_from_user_overflow(void)
  arch/s390/include/asm/uaccess.h:copy_from_user_overflow();
  arch/tile/include/asm/uaccess.h:extern void copy_from_user_overflow(void)
  arch/tile/include/asm/uaccess.h:copy_from_user_overflow();
  arch/parisc/include/asm/uaccess.h:extern void copy_from_user_overflow(void)
  arch/parisc/include/asm/uaccess.h:copy_from_user_overflow();
  arch/x86/include/asm/uaccess_32.h:extern void copy_from_user_overflow(void)
  arch/x86/include/asm/uaccess_32.h:  copy_from_user_overflow();
  drivers/vfio/pci/vfio_pci_config.c:  * with count of 1/2/4 and hits
  copy_from_user_overflow without this.
  lib/usercopy.c:void copy_from_user_overflow(void)
 
 
 IMHO, I think uaccess_32.h is what you need here.
 
 I draw that conclusion after checking this line:
 http://lxr.linux.no/#linux+v3.8.3/arch/x86/include/asm/uaccess_32.h#L194
 
 I might be wrong, so feel free to test first
 
Actually, the above header file is supposed to be included automatically,
based on the architecture alone.

snip from arch/x86/include/asm/uaccess.h
#ifdef CONFIG_X86_32
# include <asm/uaccess_32.h>
#else
# include <asm/uaccess_64.h>
#endif
snip from arch/x86/include/asm/uaccess.h

snip from .config
amit@ubuntu:~/linux-next/linux-next$ grep -w CONFIG_X86_32 .config 
CONFIG_X86_32=y
snip from .config

CPU arch on my machine: 
amit@ubuntu:~/linux-next/linux-next$ uname -m
i686

Based on this observation, I don't think I need to include uaccess_32.h
in any of those files.

-Amit

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: kernel build error

2013-03-20 Thread Kumar amit mehta
On Tue, Mar 19, 2013 at 11:56:44PM -0700, Kumar amit mehta wrote:
 On Tue, Mar 19, 2013 at 09:43:11PM +0700, Mulyadi Santosa wrote:
  Hi ...
  
  On Tue, Mar 19, 2013 at 12:28 PM, Kumar amit mehta gmate.a...@gmail.com 
  wrote:
   grep for copy_from_user_overflow gives me this:
  
   amit@ubuntu:~/linux-next/linux-next$ grep -ri copy_from_user_overflow *
   arch/s390/include/asm/uaccess.h:extern void copy_from_user_overflow(void)
   arch/s390/include/asm/uaccess.h:copy_from_user_overflow();
   arch/tile/include/asm/uaccess.h:extern void copy_from_user_overflow(void)
   arch/tile/include/asm/uaccess.h:copy_from_user_overflow();
   arch/parisc/include/asm/uaccess.h:extern void 
   copy_from_user_overflow(void)
   arch/parisc/include/asm/uaccess.h:
   copy_from_user_overflow();
   arch/x86/include/asm/uaccess_32.h:extern void 
   copy_from_user_overflow(void)
   arch/x86/include/asm/uaccess_32.h:  copy_from_user_overflow();
   drivers/vfio/pci/vfio_pci_config.c:  * with count of 1/2/4 and hits
   copy_from_user_overflow without this.
   lib/usercopy.c:void copy_from_user_overflow(void)
  
  
  IMHO, I think uaccess_32.h is what you need here.
  
  I draw that conclusion after checking this line:
  http://lxr.linux.no/#linux+v3.8.3/arch/x86/include/asm/uaccess_32.h#L194
  
  I might be wrong, so feel free to test first
  
 Actually the above header file is supposed to get included, based on the 
 architecture only.
 
 snip from arch/x86/include/asm/uaccess.h
 #ifdef CONFIG_X86_32
 # include <asm/uaccess_32.h>
 #else
 # include <asm/uaccess_64.h>
 #endif
 snip from arch/x86/include/asm/uaccess.h
 
 snip from .config
 amit@ubuntu:~/linux-next/linux-next$ grep -w CONFIG_X86_32 .config 
 CONFIG_X86_32=y
 snip from .config
 
 CPU arch on my machine: 
 amit@ubuntu:~/linux-next/linux-next$ uname -m
 i686
 
 Based on this observation, I think, I do not need to include the uaccess_32.h
 in any of those files. 


I forgot that 'uname -m' reports the architecture the running kernel was
built for, and _not_ the capability of the CPU. The CPU in my machine seems
to be 64-bit (grep flags /proc/cpuinfo shows 'lm'). So my understanding is
that I have a 32-bit kernel running on a 64-bit machine.



Wake_lock in linux kernel

2013-03-20 Thread Ben Wu
Dear All:
  I am studying the power driver code in the Linux source, but I could not
find any documentation for Linux itself; everything I found was for Android.
Can someone help me? BTW, how does /sys/power interact with the Linux kernel?

Many thanks


Re: Design Patterns in Linux Kernel: Fancy Tricks With Linked Lists

2013-03-20 Thread Greg Freemyer


Robert P. J. Day rpj...@crashcourse.ca wrote:

Quoting Arlie Stephens ar...@worldash.org:

 Interestingly, part of the debate yesterday probably resulted from
one
 engineer having Love's 2nd edition, and me having his 3rd
 edition. Apparently RPDay pointed out some problems to Love which
 resulted in him changing his linked list discussion in his 3rd
 edition ;-)

  Been a while since I re-read my own tutorial, it might merit a bit of
a rewrite. Is there anything about it that seems unclear -- I remember
my own moment of epiphany, Holy crap, what an interesting way to do
it.

   And, yes, if you try to reconcile Love's 2nd and 3rd editions on the
topic, that will not end well. :-)

rday


Robert,

I read it briefly yesterday.  I don't recall it having an example like:

Instantiate head pointer
Add 2 or 3 list members
Walk list and printk an object member

I find examples like that make all the difference for me.
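
The three steps listed above can be sketched in plain C. This is a
user-space imitation written for illustration, not the kernel's own code:
struct item and print_items() are made-up names, and list_head,
LIST_HEAD_INIT, container_of and list_add_tail are simplified
re-implementations of what <linux/list.h> provides. In a real module you
would include <linux/list.h> and call printk() instead of printf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head whose next and prev point back at itself. */
#define LIST_HEAD_INIT(name) { &(name), &(name) }

/* Recover the enclosing structure from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Insert new_node just before head, i.e. at the tail of the list. */
void list_add_tail(struct list_head *new_node, struct list_head *head)
{
    assert(new_node != NULL && head != NULL);
    struct list_head *last = head->prev;

    new_node->next = head;
    new_node->prev = last;
    last->next = new_node;
    head->prev = new_node;
}

/* Hypothetical payload type: the list node lives inside the object. */
struct item {
    int value;
    struct list_head link;
};

/* Walk the list, print one member of each object, return the entry count. */
int print_items(struct list_head *head)
{
    int count = 0;

    assert(head != NULL);
    for (struct list_head *pos = head->next; pos != head; pos = pos->next) {
        struct item *it = container_of(pos, struct item, link);
        printf("value = %d\n", it->value);
        count++;
    }
    return count;
}

/* Usage:
 *   struct list_head head = LIST_HEAD_INIT(head);
 *   struct item a = { .value = 1 }, b = { .value = 2 };
 *   list_add_tail(&a.link, &head);
 *   list_add_tail(&b.link, &head);
 *   print_items(&head);
 */
```

The key design point is that the head is itself a struct list_head with no
payload, so "empty" means head.next == &head.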

Greg
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.



Re: Design Patterns in Linux Kernel: Fancy Tricks With Linked Lists

2013-03-20 Thread Robert P. J. Day
On Wed, 20 Mar 2013, Greg Freemyer wrote:

 Robert P. J. Day rpj...@crashcourse.ca wrote:

 Quoting Arlie Stephens ar...@worldash.org:
 
  Interestingly, part of the debate yesterday probably resulted from
 one
  engineer having Love's 2nd edition, and me having his 3rd
  edition. Apparently RPDay pointed out some problems to Love which
  resulted in him changing his linked list discussion in his 3rd
  edition ;-)
 
   Been a while since I re-read my own tutorial, it might merit a bit of
 a rewrite. Is there anything about it that seems unclear -- I remember
 my own moment of epiphany, Holy crap, what an interesting way to do
 it.
 
And, yes, if you try to reconcile Love's 2nd and 3rd editions on the
 topic, that will not end well. :-)
 
 rday
 

 Robert,

 I read it briefly yesterday.  I don't recall it having an example like:

  it being ... ?

 Instantiate head pointer
 Add 2 or 3 list members
 Walk list and printk a object member

 I find examples like that make all the difference for me.

  i agree completely that a simple picture of an empty list showing
that it consisted of an initial struct list_head would have been
amazingly useful. i'm in the midst of (fingers crossed) moving my
entire site to a different technology, part of which should support
drawing cool diagrams with a minimum of fuss. at which point most of
my stuff will be totally rewritten. yes, i love diagrams.

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday




About time delay in kernel threads at LOCAL_OUT netfilter hook

2013-03-20 Thread Rifat Rahman
Hello there,
I am mangling RTP data in kernel space. I have written a netfilter
module that handles encryption, padding, and ptime modification.
Encryption and padding work fine. Ptime modification involves two
steps. When large packets arrive at the server (say ptime 60, 60 bytes
of RTP payload with the g729 codec), they are split into packets of
ptime 20. The SIP server is Asterisk. When Asterisk (the server) sends
packets of ptime 20, they are merged to the desired large ptime (say
60). All of this works: I inspected the packets, and checksumming, RTP
timestamping, and sequencing are all done correctly. Without large
ptime, the whole process is satisfactory.

After asking on the Asterisk mailing list, I learned that I have to
add some delay before sending each large packet. However, I am sending
the packets out of the box in the LOCAL_OUT hook. I wrote a thread
that queues the packets until a delay condition is satisfied, and then
tries to send the packets from that thread.

What I found is that if I set the delay to zero, everything is fine:
the packets are queued and sent as expected. But if I set delay > 0,
the thread does not send the packets. I used dst_output() from
include/net/dst.h to send them, but this function works only in the
LOCAL_OUT hook. When I add a delay in the thread, the thread misses
the hook and the packet is never sent. I also tried adding the delay
in the target function, but then the whole kernel is delayed, so that
is no help. I have to do this inside a thread. But how do I avoid
missing the hook? Can anyone suggest a workaround?

-- 
Rifat Rahman



BFQ: simple elevator

2013-03-20 Thread Raymond Jennings
I've been pondering making a very simple IO scheduler

one step above noop, just keeps everything in a big heap sorted by
position and a single cursor bouncing from head to tail shaving off
requests in a loop of ascending and descending sweeps.

Any gotchas I need to be aware of or can I simply fork off of deadline
and simplify it to omit batching?



Re: kernel build error

2013-03-20 Thread Valdis . Kletnieks
On Wed, 20 Mar 2013 00:07:57 -0700, Kumar amit mehta said:

 I forgot that 'uname -m' will return me the kernel version and _not_ the CPU
 architecture. The CPU on my machine seem to be 64 bit (/proc/cpuinfo|grep 
 flags
 shows 'lm'). So my understanding is that I've a 32 bit kernel running on a 64
 bit machine.

Or more correctly, you have a kernel actually running in 32-bit mode on
a machine that is 64-bit capable.
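
The distinction drawn here (what the running kernel reports versus what
the CPU can do) can be checked from user space. A minimal sketch, assuming
a Linux /proc filesystem; the function names are made up for illustration.
uname(2) fills in the machine field, which is the string 'uname -m'
prints, and the 'lm' (long mode) flag in /proc/cpuinfo marks a
64-bit-capable x86 CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Return true if `flag` appears as a whole word in a whitespace-separated
 * line such as the "flags : ..." line of /proc/cpuinfo. */
bool flags_line_has(const char *line, const char *flag)
{
    assert(line != NULL && flag != NULL);
    size_t n = strlen(flag);
    if (n == 0)
        return false;

    const char *p = line;
    while ((p = strstr(p, flag)) != NULL) {
        bool starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        bool ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return true;
        p += 1; /* e.g. skip the "lm" inside "lahf_lm" and keep looking */
    }
    return false;
}

/* 1 if some "flags" line lists "lm", 0 if none does, -1 if unreadable. */
int cpu_is_64bit_capable(void)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (f == NULL)
        return -1;

    char line[8192];
    int found = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) == 0 && flags_line_has(line, "lm")) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}

/* Copy the machine name the running kernel reports (what uname -m prints). */
int kernel_machine(char *buf, size_t len)
{
    struct utsname u;

    assert(buf != NULL && len > 0);
    if (uname(&u) != 0)
        return -1;
    snprintf(buf, len, "%s", u.machine);
    return 0;
}
```

On a setup like the one described in this thread, kernel_machine() would
be expected to yield "i686" while cpu_is_64bit_capable() returns 1.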




Re: Memory allocations in linux for processes

2013-03-20 Thread Mulyadi Santosa
On 3/19/13, Niroj Pokhrel nirojpokh...@gmail.com wrote:

 Hi Mulyadi .
 Thank you very much But I still have a minor confusion .
 All I ran was this short program

 #include <stdio.h>
 int main()
 {
 while(1)
 {
 }
 return 0;
 }

Well, before your program is loaded, one or more memory regions must
certainly be allocated in RAM, right? If not, where would it be
stored?

Like Valdis said, the strace utility can help you understand things
during the binary execution. You'll see lots of mmap, I promise :)
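
To make the point about memory regions concrete: even a program whose
main() is an empty loop has several regions mapped before main() runs. A
hedged sketch, assuming Linux's /proc/self/maps format (a start-end
address range followed by a permissions field); the function names are
illustrative only.

```c
#include <assert.h>
#include <stdio.h>

/* Parse the leading "start-end perms" fields of one maps line.
 * Returns 0 on success, -1 if the line does not match. */
int parse_maps_line(const char *line, unsigned long *start,
                    unsigned long *end, char perms[5])
{
    assert(line != NULL && start != NULL && end != NULL && perms != NULL);
    return sscanf(line, "%lx-%lx %4s", start, end, perms) == 3 ? 0 : -1;
}

/* Count the regions currently mapped into the calling process.
 * Returns -1 if /proc/self/maps cannot be opened. */
int count_own_mappings(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    char line[4096];
    char perms[5];
    unsigned long start, end;
    int count = 0;

    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_maps_line(line, &start, &end, perms) == 0)
            count++;
    }
    fclose(f);
    return count;
}
```

Calling count_own_mappings() from the empty-loop program should report
regions for the program text, the stack, and any shared libraries, which
matches the mmap calls strace shows during startup.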

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com



Re: BFQ: simple elevator

2013-03-20 Thread Mulyadi Santosa
On 3/20/13, Raymond Jennings shent...@gmail.com wrote:
 I've been pondering making a very simple IO scheduler

 one step above noop, just keeps everything in a big heap sorted by
 position and a single cursor bouncing from head to tail shaving off
 requests in a loop of ascending and descending sweeps.

Pardon me for any possible silliness, but what happens if there are
incoming I/O operations at very nearby sectors (or perhaps at the same
sector)? I suppose the elevator will prioritize them over the
rest? (i.e. starvation will happen...)

-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com



Re: BFQ: simple elevator

2013-03-20 Thread Valdis . Kletnieks
On Thu, 21 Mar 2013 02:24:23 +0700, Mulyadi Santosa said:

 pardon me for any possible sillyness, but what happen if there are
 incoming I/O operation at very nearby sectors (or perhaps at the same
 sector?)? I suppose, the elevator will prioritize them first over the
 rest? (i.e starving will happen...)

And this, my friends, is why elevators aren't as easy to do as the average
undergrad might hope - it's a lot harder to balance fairness and throughput
across all the corner cases than you might think.  It gets really fun
when you have (for example) a 'find' command moving the heads all over
the disk while another process is trying to do large amounts of streaming
I/O.  And then you'll get some idiot process that insists on doing the
occasional fsync() or syncfs() call.  Yes, it's almost always *all*
corner cases, it's very rare (unless you're an embedded system like a Tivo)
that all your I/O is one flavor that is easily handled by a simple elevator.






DMA attributes with dma_sync_single()

2013-03-20 Thread Moritz Fischer
Hi there,

I was wondering whether setting DMA attributes like
DMA_ATTR_WRITE_BARRIER has any effect if I don't
dma_unmap_single_attrs() an area but merely dma_sync_single() it after
a transfer from the device to host memory. My problem is that I get an
interrupt (MSI) indicating that an (independent) DMA transfer is done,
where the interrupt flags get pushed (via a DMA write) to host memory.
From time to time, the data of the other DMA transfer does not show up
in host memory in time with the interrupt.
I am trying to force coherency by using DMA_ATTR_WRITE_BARRIER
on this status push (of the interrupt flags), to force the other
writes (device to host) to complete.

Does that seem reasonable?

Can anyone with insight shed some light on that?

It's a PCIe device, I'm running Fedora 18 with 3.8.3-201.fc18.x86_64.

Cheers,

Moritz



Re: BFQ: simple elevator

2013-03-20 Thread Raymond Jennings
On Wed, Mar 20, 2013 at 2:03 PM,  valdis.kletni...@vt.edu wrote:
 On Thu, 21 Mar 2013 02:24:23 +0700, Mulyadi Santosa said:

 pardon me for any possible sillyness, but what happen if there are
 incoming I/O operation at very nearby sectors (or perhaps at the same
 sector?)? I suppose, the elevator will prioritize them first over the
 rest? (i.e starving will happen...)

This is actually why I proposed to enforce forward progress by only
looking for further requests in one direction at a time.

Suppose you have requests at sectors 1, 4, 5, and 6

You dispatch sectors 1, 4, and 5, leaving the head parked at 5 and the
direction as ascending.

But suddenly, just before you get a chance to dispatch for sector 6,
sector 4 gets busy again.

I'm not proposing going back to sector 4.  It's behind us and (as you
indicated) we could starve sector 6 indefinitely.

So instead, because sector 4 is on the wrong side of our present head
position, it is ignored and we keep marching forward, and then we hit
sector 6 and dispatch it.

Once we hit sector 6 and dispatch it, we do a u-turn and start
descending.  That's when we pick up sector 4 again.

When we're going up, we ignore what's below us, and when we're going
down we ignore what is above us.

We only switch directions when there's nothing in front of us the way
we were going.  In theory, given that disk capacity is itself finite,
so too is the amount of time one has to wait before getting reached by
the elevator.
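
The one-direction-at-a-time rule described above can be sketched as a
small selection function. This is an illustration under stated
assumptions, not the code of any in-kernel scheduler: the names
(elevator_next, enum direction) are made up, pending requests are kept in
a plain array scanned linearly rather than the sorted heap proposed
earlier, and a request exactly at the head position is only picked up
when the sweep turns around (the thread does not say how that case should
be handled).

```c
#include <assert.h>
#include <stddef.h>

enum direction { ASCENDING = 1, DESCENDING = -1 };

/* Index of the pending request nearest to `head` on the `dir` side,
 * or -1 if there is none. With `inclusive` set, a request exactly at
 * `head` also counts as being ahead. */
static int nearest(const long *sectors, size_t n, long head,
                   enum direction dir, int inclusive)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        long delta = (sectors[i] - head) * (long)dir; /* > 0 means ahead */
        if (delta < 0 || (delta == 0 && !inclusive))
            continue;
        if (best < 0 || delta < (sectors[best] - head) * (long)dir)
            best = (int)i;
    }
    return best;
}

/* Choose the next request to dispatch. Requests behind the head are
 * ignored until nothing is left ahead; only then is *dir reversed.
 * Returns an index into `sectors`, or -1 if nothing is pending. */
int elevator_next(const long *sectors, size_t n, long head,
                  enum direction *dir)
{
    assert(sectors != NULL || n == 0);
    assert(dir != NULL);

    int idx = nearest(sectors, n, head, *dir, 0);
    if (idx < 0 && n > 0) {
        /* Nothing in front of us: make the u-turn. */
        *dir = (*dir == ASCENDING) ? DESCENDING : ASCENDING;
        idx = nearest(sectors, n, head, *dir, 1);
    }
    return idx;
}
```

With the head at sector 5 moving up and requests pending at 4 and 6, the
function selects 6 and keeps the direction; called again with the head at
6 and only 4 pending, it reverses direction and selects 4.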

Anyway, does this clarification answer your concerns about starvation?

 And this, my friends, is why elevators aren't as easy to do as the average
 undergrad might hope - it's a lot harder to balance fairness and throughput
 across all the corner cases than you might think.  It gets really fun
 when you have (for example) a 'find' command moving the heads all over
 the disk while another process is trying to do large amounts of streaming
 I/O.  And then you'll get some idiot process that insists on doing the
 occasional fsync() or syncfs() call.  Yes, it's almost always *all*
 corner cases, it's very rare (unless you're an embedded system like a Tivo)
 that all your I/O is one flavor that is easily handled by a simple elevator.

In my case I'm just concerned with raw total system throughput.

I called it BFQ for a reason.



Linux elevators (Re: BFQ: simple elevator)

2013-03-20 Thread Arlie Stephens
The ongoing thread reminds me of a simple question I've had since I
first read about Linux's multiple I/O schedulers. Why is the choice of
I/O scheduler global to the whole kernel, rather than per-device or
similar? 

Consider a system with both traditional rotating disks and SSDs - not
at all far fetched. An appropriate I/O scheduling algorithm for
rotating disks is likely to do a lot of work that's useless for
SSDs. Why require that the same algorithms be used for both? 

--
Arlie

(Arlie Stephens ar...@worldash.org)




Re: BFQ: simple elevator

2013-03-20 Thread Valdis . Kletnieks
On Wed, 20 Mar 2013 14:41:31 -0700, Raymond Jennings said:

 Suppose you have requests at sectors 1, 4, 5, and 6

 You dispatch sectors 1, 4, and 5, leaving the head parked at 5 and the
 direction as ascending.

 But suddenly, just before you get a chance to dispatch for sector 6,
 sector 4 gets busy again.

 I'm not proposing going back to sector 4.  It's behind us and (as you
 indicated) we could starve sector 6 indefinitely.

 So instead, because sector 4 is on the wrong side of our present head
 position, it is ignored and we keep marching forward, and then we hit
 sector 6 and dispatch it.

 Once we hit sector 6 and dispatch it, we do a u-turn and start
 descending.  That's when we pick up sector 4 again.

The problem is that not all seeks are created equal.

Consider the requests are at 1, 4, 5, and 199343245.  If as we're servicing
5, another request for 4 comes in, we may well be *much* better off doing a
short seek to 4 and then one long seek to the boonies, rather than 2 long
seeks.

My laptop has a 160G Western Digital drive in it (WD1600BJKT).  The minimum
track-to-track seek time is 2ms, the average time is 12ms, and the maximum is
probably on the order of 36ms. So by replacing 2 max-length seeks with a
track-to-track seek and 1 max-length, you can almost halve the delay waiting
for seeks (38ms versus 72ms). (And even better if the target block is
logically before the current one, but still on the same track, so you only
take a rotational latency hit and no seek hit.)

The maximum is not given in the spec sheets, but is almost always 3 times the
average. For a discussion of the math behind that, and a lot of other issues,
see:

http://pages.cs.wisc.edu/~remzi/OSFEP/file-disks.pdf
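
The 38ms versus 72ms comparison above is plain addition, which a few
lines of C make explicit. This is only a worked version of the figures
quoted in this message (2ms track-to-track, roughly 36ms maximum); it
ignores rotational latency and transfer time, and the function name is
made up.

```c
#include <assert.h>

/* Seek times quoted in the message, in milliseconds. The maximum is the
 * author's estimate, not a spec-sheet value. */
enum { TRACK_TO_TRACK_MS = 2, MAX_SEEK_MS = 36 };

/* Total seek time of a plan consisting of `short_seeks` track-to-track
 * seeks and `long_seeks` full-stroke seeks. */
int plan_seek_ms(int short_seeks, int long_seeks)
{
    assert(short_seeks >= 0 && long_seeks >= 0);
    return short_seeks * TRACK_TO_TRACK_MS + long_seeks * MAX_SEEK_MS;
}
```

plan_seek_ms(1, 1) models going back to the nearby sector first and then
making one long seek; plan_seek_ms(0, 2) models the strict sweep that
travels the long distance twice.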

And of course, this interacts in very mysterious ways with the firmware
on the drive, which can do its own re-ordering of I/O requests and/or
manage the use of the disk's onboard read/write cache - this is why
command queueing is useful for throughput, because if the disk has the
option of re-ordering 32 requests, it can do more than if it only has 1 or
2 requests in the queue.  Of course, very deep command queues have their
own issues - most notably that at some point you need to use barriers or
something to ensure that the metadata writes aren't being re-ordered into
a pattern that could cause corruption if the disk lost its mind before
completing all the writes...

 In my case I'm just concerned with raw total system throughput.

See the above discussion.




Re: Linux elevators (Re: BFQ: simple elevator)

2013-03-20 Thread Valdis . Kletnieks
On Wed, 20 Mar 2013 16:05:09 -0700, Arlie Stephens said:
 The ongoing thread reminds me of a simple question I've had since I
 first read about linux' mutiple I/O schedulers. Why is the choice of
 I/O scheduler global to the whole kernel, rather than per-device or
 similar?

They aren't global to the kernel.

On my laptop:

# find /sys/devices/pci* -name 'scheduler' | xargs grep .
/sys/devices/pci:00/:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler:noop deadline [cfq]
/sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/scheduler:noop deadline [cfq]
# echo noop > /sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/schedule
# find /sys/devices/pci* -name 'scheduler' | xargs grep .
/sys/devices/pci:00/:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler:noop deadline [cfq]
/sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/scheduler:[noop] deadline cfq

I just changed the scheduler for the CD-ROM.
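
The same per-device check can be done from a program. A minimal sketch,
assuming the attribute is also reachable as
/sys/block/<dev>/queue/scheduler and that the active scheduler is the
name shown in square brackets, as in the output above; the function names
are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy the bracketed (active) name out of a line such as
 * "noop deadline [cfq]". Returns 0 on success, -1 if no bracketed
 * name is present or `out` is too small. */
int active_scheduler(const char *line, char *out, size_t outlen)
{
    assert(line != NULL && out != NULL);

    const char *lb = strchr(line, '[');
    if (lb == NULL)
        return -1;
    const char *rb = strchr(lb, ']');
    if (rb == NULL)
        return -1;

    size_t len = (size_t)(rb - lb - 1);
    if (len == 0 || len >= outlen)
        return -1;
    memcpy(out, lb + 1, len);
    out[len] = '\0';
    return 0;
}

/* Read the scheduler attribute of one block device (e.g. "sda") and
 * report its active scheduler. Returns -1 on any failure. */
int read_active_scheduler(const char *dev, char *out, size_t outlen)
{
    char path[256];
    char line[256];

    assert(dev != NULL);
    int n = snprintf(path, sizeof path, "/sys/block/%s/queue/scheduler", dev);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    return ok != NULL ? active_scheduler(line, out, outlen) : -1;
}
```

Because each device has its own attribute file, calling
read_active_scheduler() for "sda" and "sr0" can return different names,
which is the point being made here.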






Re: Linux elevators (Re: BFQ: simple elevator)

2013-03-20 Thread Arlie Stephens
On Mar 20 2013, valdis.kletni...@vt.edu wrote:
 On Wed, 20 Mar 2013 16:05:09 -0700, Arlie Stephens said:
  The ongoing thread reminds me of a simple question I've had since I
  first read about linux' mutiple I/O schedulers. Why is the choice of
  I/O scheduler global to the whole kernel, rather than per-device or
  similar?
 
 They aren't global to the kernel.

Thanks for the correction. It appears I got wrong (outdated?)
information from some book on kernel development, or perhaps simply
misunderstood what I read. 

When I tried the example you gave, I saw the same thing, even on
the older kernels I'm working with (2.6.32 in particular). 


 
 On my laptop:
 
 # find /sys/devices/pci* -name 'scheduler' | xargs grep .
 /sys/devices/pci:00/:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler:noop deadline [cfq]
 /sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/scheduler:noop deadline [cfq]
 # echo noop > /sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/schedule
 # find /sys/devices/pci* -name 'scheduler' | xargs grep .
 /sys/devices/pci:00/:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler:noop deadline [cfq]
 /sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/scheduler:[noop] deadline cfq
 
 I just changed the scheduler for the CD-ROM.

--
Arlie

(Arlie Stephens  ar...@worldash.org)



Re: BFQ: simple elevator

2013-03-20 Thread Raymond Jennings
On Wed, Mar 20, 2013 at 4:10 PM,  valdis.kletni...@vt.edu wrote:
 On Wed, 20 Mar 2013 14:41:31 -0700, Raymond Jennings said:

 Suppose you have requests at sectors 1, 4, 5, and 6

 You dispatch sectors 1, 4, and 5, leaving the head parked at 5 and the
 direction as ascending.

 But suddenly, just before you get a chance to dispatch for sector 6,
 sector 4 gets busy again.

 I'm not proposing going back to sector 4.  It's behind us and (as you
 indicated) we could starve sector 6 indefinitely.

 So instead, because sector 4 is on the wrong side of our present head
 position, it is ignored and we keep marching forward, and then we hit
 sector 6 and dispatch it.

 Once we hit sector 6 and dispatch it, we do a u-turn and start
 descending.  That's when we pick up sector 4 again.

 The problem is that not all seeks are created equal.

 Consider the requests are at 1, 4, 5, and 199343245.  If as we're servicing
 5, another request for 4 comes in, we may well be *much* better off doing a
 short seek to 4 and then one long seek to the boonies, rather than 2 long
 seeks.

 My laptop has a 160G Western Digital drive in it (WD1600BJKT).  The minimum
 track-to-track seek time is 2ms, the average time is 12ms, and the maximum is
 probably on the order of 36ms. So by replacing 2 max-length seeks with a
 track-to-track seek and 1 max-length, you can almost half the delay waiting
 for seeks (38ms versus 72ms). (And even better if the target block is 
 logically before the current one, but
 still on the same track, so you only take a rotational latency hit and no seek
 hit.

 (The maximum is not given in the spec sheets, but is almost always 3 times the
 average - for a discussion of the math behind that, and a lot of other issues,
 see:

 http://pages.cs.wisc.edu/~remzi/OSFEP/file-disks.pdf

 And of course, this interacts in very mysterious ways with the firmware
 on the drive, which can do its own re-ordering of I/O requests and/or
 manage the use of the disk's onboard read/write cache - this is why
 command queueing is useful for throughput, because if the disk has the
 option of re-ordering 32 requests, it can do more than if it only has 1 or
 2 requests in the queue.  Of course, very deep command queues have their
 own issues - most notably that at some point you need to use barriers or
 something to ensure that the metadata writes aren't being re-ordered into
 a pattern that could cause corruption if the disk lost its mind before
 completing all the writes...

 In my case I'm just concerned with raw total system throughput.

 See the above discussion.

Hmm... Maybe a hybrid approach that allows a finite number of reverse
seeks, or, as I suspect deadline does, a finite delay before abandoning
the close stuff to march to the boonies.

Btw, does deadline currently do ping-pong seeking or does it always go
for the nearest sector in either direction?

At any rate, I'll probably need performance tuning to get a feel for
what will work.

What I'd like to start with is correct usage of the APIs in question
to actually do the processing.  Is request dispatch well documented?

My first hunch was to copy-paste from deadline and then tweak it.

Mostly I want to make something simple and learn how to write an I/O
scheduler in the process.



Re: Linux elevators (Re: BFQ: simple elevator)

2013-03-20 Thread Raymond Jennings
On Wed, Mar 20, 2013 at 4:45 PM, Arlie Stephens ar...@worldash.org wrote:
 On Mar 20 2013, valdis.kletni...@vt.edu wrote:
 On Wed, 20 Mar 2013 16:05:09 -0700, Arlie Stephens said:
  The ongoing thread reminds me of a simple question I've had since I
  first read about linux' mutiple I/O schedulers. Why is the choice of
  I/O scheduler global to the whole kernel, rather than per-device or
  similar?

 They aren't global to the kernel.

 Thanks for the correction. It appears I got wrong (outdated?)
 information from some book on kernel development, or perhaps simply
 misunderstood what I read.

Yeah, the global thing is just the system default.

It's what newly created or discovered block devices will be assigned initially.

 When I tried the example you gave, I saw the same thing, even on
 the older kernels I'm working with (2.6.32 in particular).



 On my laptop:

 # find /sys/devices/pci* -name 'scheduler' | xargs grep .
 /sys/devices/pci:00/:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler:noop deadline [cfq]
 /sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/scheduler:noop deadline [cfq]
 # echo noop > /sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/schedule
 # find /sys/devices/pci* -name 'scheduler' | xargs grep .
 /sys/devices/pci:00/:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/scheduler:noop deadline [cfq]
 /sys/devices/pci:00/:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/scheduler:[noop] deadline cfq

 I just changed the scheduler for the CD-ROM.

 --
 Arlie

 (Arlie Stephens  ar...@worldash.org)



Re: wake_lock in linux kernel

2013-03-20 Thread Yuva Raj
Hi Ben,

Please find below a link to the kernel's power management docs.

https://www.kernel.org/doc/Documentation/power/basic-pm-debugging.txt

This explains how to interact with /sys/power/.

You can use the /sys/power/pm_test file to test the power
management functionality of the Linux kernel.

Feel free to refer to all the other files in that directory.

Regards,
yuvaraj.A



On Wed, Mar 20, 2013 at 3:19 PM, Ben Wu cray...@yahoo.cn wrote:

 Dear All:
   I now study power driver in linux source code, but didn't find some doc
 about liux,all the doc was for android, can some help me?BTW,how /sys/power
 interactive with linux kernel??

 Many thanks