Re: [Haifux] SSD and linux

2008-09-18 Thread gabik
Hi Doron
 
The place where the current process goes to sleep and waits until the page
is swapped in is indeed generic_make_request() (called from submit_bio()).
It calls block_wait_queue_running(q), which moves the process to a wait
queue and calls schedule() [prepare_to_wait_exclusive() and after that
io_schedule()].
Thus, this seems to be the place for a busy loop.
 
You must be careful, though, with what you change, and make sure not to break
some other code path that relies on things done in this code path.
For example, if you are not going to put this process on the wait queue, you
must be careful about what happens when the I/O operation finishes and tries
to remove this process from the wait queue and wake it up.
 
 
Gabi
 
P.S. I was referring to version 2.6.11
http://lxr.linux.no/linux+v2.6.11/drivers/block/ll_rw_blk.c#L2595
 

  _  

From: Doron Zuckerman [mailto:[EMAIL PROTECTED] 
Sent: Thursday, September 18, 2008 12:28 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]; haifux@haifux.org
Subject: Re: [Haifux] SSD and linux


Hi Gabi and Muli,
I'm sorry about the mistake; you understood me correctly.

I'm not sure it will speed up the OS; however, I'm doing academic research
on the matter as part of a project I'm taking, and I plan to check this
point.
The leading thought was that since the SSD is not a mechanical drive, pages
can be brought in faster, so there is no need to context switch, which
avoids the associated overhead.

Yes, I plan to use polling (busy-wait), and I'm looking for the part of the
kernel's page-fault handling mechanism in which the process is suspended,
in order to prevent that.

So far I have found the function __generic_make_request in the file
ll_rw_blk.c.
This function calls a function named might_sleep.
I have removed the call to this function whenever I'm in a page fault;
however, I'm not sure whether this function causes the sleep or is just used
for debugging, to check whether we entered a suspend state.

My question is whether this is the function I should change in order to get
the behavior I want, or whether the change should be made in
q->make_request_fn,
which, according to my understanding, belongs to the specific driver I'm
using.

Please help me find the specific place I'm looking for that would make the
desired change.

Thank you very much,
Doron.


On Tue, Sep 16, 2008 at 2:42 PM, gabik [EMAIL PROTECTED]
wrote:


Hello Doron
 
Why do you think it will speed up the OS?
What do you plan to do until the page is swapped in? Busy loop?
 
About your solution:
handle_mm_fault is called from within the page fault handler, do_page_fault()
(http://lxr.linux.no/linux+v2.6.26.5/+code=do_page_fault).
So what is the rationale behind calling handle_mm_fault from outside the
page fault handler?
Where would you call it from instead, and what do you plan to do while you are
in the page fault?
 
Probably what you meant, in order to avoid a context switch due to a page
fault, is to call handle_mm_fault as usual but not raise the need_resched
flag, so as not to trigger a context switch in case of a major page fault.
 
 
Gabi
 
 

  _  

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
Of Doron Zuckerman
Sent: Tuesday, September 16, 2008 12:31 PM
To: haifux@haifux.org
Cc: Ronen Gruengras
Subject: [Haifux] SSD and linux


Hi all,

I have a question regarding the Linux kernel (for those of you who are
familiar with it).

I'm looking for a way to change the Linux kernel in order to check
whether I can make it better suited to my Asus EEE-PC.
I would like to change the kernel in such a way that it will not do a context
switch every time there is a page fault,
but will instead wait for the required page to be brought in from the SSD
(Solid State Drive) and then continue as usual.
In this way, I plan to check whether I can speed up the operating
system (Ubuntu for EEE).
I thought of adding a TIF flag to the process descriptor (thread_info_32.h)
that will tell me whether I'm currently in a page fault, and
then changing fault_32.c in such a way that it will call
handle_mm_fault(mm, vma, address, write); only if there is no
page fault at the moment.
Can you suggest any other possible solution, or tell me what you think about
this one?

I would really appreciate any help with this,
Doron.





___
Haifux mailing list
Haifux@haifux.org
http://hamakor.org.il/cgi-bin/mailman/listinfo/haifux


Re: [Haifux] SSD and linux

2008-09-18 Thread Doron Zuckerman
Hi Muli,

It seems like a good idea to compare the time of a single block read against
a single context switch; we'll look into it further.

 Try to find the place where the faulting process is put to sleep
and convert that code to busy wait instead, terminating the busy-wait
when the page has been brought in.

That's exactly what we are looking for, so far with no success...
We tried following the page-fault path and got all the way to the call to
q->make_request_fn (in the __generic_make_request function in the
block/ll_rw_blk.c file). Up to that point we couldn't find anything that puts
the current process to sleep. Our guess is that it is done somewhere
inside this function.
Do you have any idea where we can find this?

Thanks,

Ronen & Doron

On Thu, Sep 18, 2008 at 4:49 PM, Muli Ben-Yehuda [EMAIL PROTECTED] wrote:

 On Thu, Sep 18, 2008 at 12:27:36PM +0300, Doron Zuckerman wrote:

  I'm not sure it will speed up the OS, however I'm doing an academic
  research on the matter as part of a project I'm taking, and I plan
  to check this point.

 I'm pretty sure it won't.



  The leading thought was that since the SSD is not a mechanical
  drive, pages can be brought faster in this way, and there is no need
  to context switch, thus, avoiding the overhead included.

 I suggest a much simpler exercise:

 (a) time how long it takes to read a block of data from the SSD
 (b) time how long a context switch takes

 See that (b) is orders of magnitude faster than (a).

  So far I found the function __generic_make_request in file
  ll_rw_blk.c.  This function calls a sub function named might_sleep.
  I have deleted the call to this function whenever I'm in a
  pagefault, however I'm not sure if this function causes the sleep,
  or is just used for debugging in order to check if we entered a
  suspend state.

 might_sleep() is a debugging aid, which is used by code that might
 sleep in order to check that it hasn't been called in a context where
 you can't sleep (non-process context such as an interrupt handler).

  My question is if this is the function I should change in order to
  accept the change I'm willing to get, or if the change should be
  made in q->make_request_fn which, according to my understanding,
  belongs to the specific driver I'm using.

 Neither. Take a look at the page fault path for a major fault. What it
 does (from 10,000 feet) is initiate reading the page from disk, and
 then go to sleep until the page is ready. Going to sleep in the page
 fault path is what causes the context switch you want to avoid. What
 you want to do instead of going to sleep is busy-wait for the
 data. Try to find the place where the faulting process is put to sleep
 and convert that code to busy wait instead, terminating the busy-wait
 when the page has been brought in.

 Cheers,
 Muli
 --
 Workshop on I/O Virtualization (WIOV '08)
 Co-located with OSDI '08, Dec 2008, San Diego, CA
 http://www.usenix.org/wiov08



Re: [Haifux] SSD and linux

2008-09-18 Thread gabik
Doron
 
You can work on 2.6.24 if you prefer. I just picked some version and checked
on it. (For some reason there is no arch/i386 in 2.6.24. Maybe it has been
renamed to x86?)
 
As for which function to change:
What you want to change is not the place where the I/O request is issued, but
the place where the process puts itself on the wait queue, removes itself
from the run queue, and calls schedule().
In 2.6.11 this is done in the function block_wait_queue_running(q).
 
I have not checked what q->make_request_fn(q, bio) does exactly, but from
your description, it issues a request to the driver. This is probably done
by simply adding a request struct, with instructions on what to do, to the
driver's request data structure (and maybe signaling the driver in some way).
After doing that, the process puts itself to sleep (via
block_wait_queue_running).
When the driver finishes handling the request (A LOT of time from now), it
raises a HW interrupt, and this interrupt wakes the waiting process and puts
it back on the run queue. Sometime later the process will be scheduled to
run, and it will continue from the point right after the call to schedule().
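
As a rough user-space illustration of the two ways of waiting discussed here
(this is only an analogy, not kernel code): a threading.Event stands in for
"the page is ready", and a threading.Timer stands in for the completion
interrupt. All function names below are made up for the sketch.

```python
import threading
import time


def wait_by_sleeping(done):
    """Block until the event is set; the CPU is free for other work meanwhile."""
    done.wait()


def wait_by_polling(done):
    """Spin until the event is set; returns how many times the flag was polled."""
    polls = 0
    while not done.is_set():
        polls += 1
    return polls


def simulate_io(latency_s, wait):
    """Start a fake I/O that completes after latency_s, then wait for it.

    Returns (elapsed seconds, whatever the wait function returned).
    """
    done = threading.Event()
    # The timer plays the role of the interrupt that signals completion.
    timer = threading.Timer(latency_s, done.set)
    start = time.perf_counter()
    timer.start()
    result = wait(done)
    elapsed = time.perf_counter() - start
    timer.join()
    return elapsed, result
```

Both variants return after roughly the same latency; the difference is that
the polling variant burns CPU for the whole wait, which is the cost being
debated in this thread.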
 
Gabi
 
P.S. I think it would be wise to first do what Muli suggested:
compare the times.
 

  _  

From: Doron Zuckerman [mailto:[EMAIL PROTECTED] 
Sent: Thursday, September 18, 2008 6:29 PM
To: gabik; haifux@haifux.org
Cc: Ronen Gruengras
Subject: Re: [Haifux] SSD and linux


Hi Gabi,

First of all, thanks for your help.

We are currently using kernel 2.6.24, and couldn't find any call to the
function block_wait_queue_running(q) there. It seems to handle things a
bit differently.

Moreover, I looked at the code of kernel 2.6.11, and from what I can
understand, the block_wait_queue_running(q) function
only waits for the I/O queue to be ready, and not for the I/O
request (reading the page from the disk) to complete (it is called before the
I/O request is made).

Correct me if I'm wrong, but isn't the call to
q->make_request_fn(q, bio) what makes the actual request to the device, and
therefore the place that is responsible for waiting for the result of that
request?

P.S.
I don't mind switching to kernel 2.6.11 (or any other for that matter) as
long as I can make the changes I need.

Thanks,
Ronen & Doron



Re: [Haifux] SSD and linux

2008-09-18 Thread gabik
q->make_request_fn seems to point to the function __make_request()
[blk_init_queue_node() initializes the pointer to this function].
__make_request() calls get_request_wait(), which in turn calls
prepare_to_wait_exclusive() and later io_schedule().
 
So the logic is exactly like in 2.6.11, but a bit more complex.
 
Gabi
 
 



Re: [Haifux] SSD and linux

2008-09-18 Thread Muli Ben-Yehuda
On Thu, Sep 18, 2008 at 06:56:09PM +0300, gabik wrote:
 Doron
  
 You can work on 2.6.24 if you prefer. I just picked some version and checked
 on it. (for some reason there is no arch/i386 in 2.6.24. Maybe they have
 renamed it into x86?)

Yes. arch/x86 is now for both 32 and 64 bits.

Cheers,
Muli


Re: [Haifux] SSD and linux

2008-09-18 Thread Muli Ben-Yehuda
On Thu, Sep 18, 2008 at 06:49:58PM +0300, Doron Zuckerman wrote:

 Do you have any idea where we can find this?

I haven't looked at those bits recently, but it sounds like Gabi is
pointing you to the right path.

In any case, to be honest, I think what you propose doesn't make
sense, even as research. Look at it this way. When does busy waiting
make sense? When the overhead of sleeping is not offset by the useful
work that gets done while you sleep (or when you can't sleep).

So let's say that the overhead of a context switch is T_c. Switching
to some other task and back will cost 2*T_c. Assuming that any work
that the task you switch to does is useful, busy waiting makes sense
only if you can resume executing the faulting task within 2*T_c
time. So, unless you can read the frame from the SSD within 2*T_c time
(which I highly doubt...) busy waiting does not make sense.
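
As a small worked version of the 2*T_c argument above: the function names are
invented for this sketch, and the numbers in the usage note are placeholders,
not measurements of any real device.

```python
def busy_wait_pays_off(io_latency_s, context_switch_s):
    """True if the I/O completes within the cost of switching away and back (2*T_c)."""
    return io_latency_s < 2 * context_switch_s


def cpu_time_lost_by_spinning(io_latency_s, context_switch_s):
    """CPU time another task could have used if the faulting task had slept instead."""
    return max(0.0, io_latency_s - 2 * context_switch_s)
```

For example, with an assumed T_c of 5 microseconds and an assumed read latency
of 200 microseconds, busy_wait_pays_off(200e-6, 5e-6) is False, and spinning
gives up about 190 microseconds of CPU time per fault.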

Another point to consider is that if you are running on a UP machine
and your kernel isn't preemptible, and the work to submit the I/O to
disk happens in some other context than the one you run in, if you
busy wait the I/O may never get submitted, and you'll busy wait
forever!
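
The hang described above can be modeled with a toy single-CPU, non-preemptive
scheduler. This is only a simulation under simplifying assumptions: tasks are
Python generators, a task keeps the CPU until it yields SLEEP, and all names
are hypothetical.

```python
from collections import deque

SLEEP, SPIN = "sleep", "spin"


def faulting_task(state, busy_wait):
    """Wait for the page: either spin on the CPU or sleep (give the CPU away)."""
    while not state["page_ready"]:
        yield SPIN if busy_wait else SLEEP


def io_submitter(state):
    """Stands in for the other context that submits and completes the I/O."""
    yield SLEEP  # pretend to do some setup work first
    state["page_ready"] = True


def run_on_one_cpu(tasks, max_steps=1000):
    """Return True if every task finishes within max_steps scheduler steps."""
    ready = deque(tasks)
    for _ in range(max_steps):
        if not ready:
            return True
        try:
            action = next(ready[0])
        except StopIteration:
            ready.popleft()  # task finished
            continue
        if action == SLEEP:
            ready.rotate(-1)  # hand the CPU to the next task
        # On SPIN the same task keeps the CPU; nothing can preempt it.
    return not ready
```

With busy_wait=False both tasks finish; with busy_wait=True the faulting task
never gives up the CPU, the submitter never runs, and the step limit is hit.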

Cheers,
Muli


Re: [Haifux] SSD and linux

2008-09-17 Thread Muli Ben-Yehuda
On Tue, Sep 16, 2008 at 12:31:09PM +0300, Doron Zuckerman wrote:
 Hi all,
 
 I have a question regarding the linux kernel (for those of you who are
 familiar with it).
 
 I'm looking for a way to add a change to the linux kernel in order to check
 if I can make it more compatible with my Asus EEE-PC.
 I would like to change the kernel in such way that it will not do a context
 switch every time there is a page fault
 and will wait for the required page to be brought from the SSD (Solid State
 Drive), then continue as usual.

We context switch because the task (thread) cannot continue working
until the page is paged in from the disk. If we don't context switch,
and the thread cannot continue running until the page fault is
resolved, what will the OS do in the meantime?

Note that even though the EEE has an SSD drive, it's still several
orders of magnitude slower than the time the context switch takes.
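
One rough way to put numbers on this comparison from user space is sketched
below. It assumes a Unix-like system (os.fork), and the function names are made
up for the example. Note two caveats: the read is likely to be served from the
page cache unless caches are dropped or direct I/O is used, so it tends to
underestimate the device latency; and the ping-pong figure includes pipe
system-call overhead, so it is only an upper bound on a context switch.

```python
import os
import time


def time_block_read(path, block_size=4096, rounds=200):
    """Average seconds per block_size read from the start of the file at path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(rounds):
            os.lseek(fd, 0, os.SEEK_SET)
            os.read(fd, block_size)
        return (time.perf_counter() - start) / rounds
    finally:
        os.close(fd)


def time_context_switch(rounds=1000):
    """Rough seconds per hand-off: two processes ping-pong one byte over pipes."""
    p2c_r, p2c_w = os.pipe()
    c2p_r, c2p_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: echo every byte back to the parent, then exit immediately.
        try:
            for _ in range(rounds):
                os.read(p2c_r, 1)
                os.write(c2p_w, b"x")
        finally:
            os._exit(0)
    start = time.perf_counter()
    for _ in range(rounds):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    for fd in (p2c_r, p2c_w, c2p_r, c2p_w):
        os.close(fd)
    # Each round trip involves two hand-offs (parent to child and back).
    return elapsed / (2 * rounds)
```

Calling time_block_read() on a file stored on the SSD and comparing the result
with time_context_switch() gives a first impression of how far apart the two
costs are on a given machine.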

Cheers,
Muli


Re: [Haifux] SSD and linux

2008-09-16 Thread gabik
Hello Doron
 
Why do you think it will speed up the OS?
What do you plan to do until the page is swapped in? Busy loop?
 
About your solution:
handle_mm_fault is called from within the page fault handler, do_page_fault()
(http://lxr.linux.no/linux+v2.6.26.5/+code=do_page_fault).
So what is the rationale behind calling handle_mm_fault from outside the
page fault handler?
Where would you call it from instead, and what do you plan to do while you are
in the page fault?
 
Probably what you meant, in order to avoid a context switch due to a page
fault, is to call handle_mm_fault as usual but not raise the need_resched
flag, so as not to trigger a context switch in case of a major page fault.
 
 
Gabi
 
 

  _  

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
Of Doron Zuckerman
Sent: Tuesday, September 16, 2008 12:31 PM
To: haifux@haifux.org
Cc: Ronen Gruengras
Subject: [Haifux] SSD and linux


Hi all,

I have a question regarding the Linux kernel (for those of you who are
familiar with it).

I'm looking for a way to change the Linux kernel in order to check
whether I can make it better suited to my Asus EEE-PC.
I would like to change the kernel in such a way that it will not do a context
switch every time there is a page fault,
but will instead wait for the required page to be brought in from the SSD
(Solid State Drive) and then continue as usual.
In this way, I plan to check whether I can speed up the operating
system (Ubuntu for EEE).
I thought of adding a TIF flag to the process descriptor (thread_info_32.h)
that will tell me whether I'm currently in a page fault, and
then changing fault_32.c in such a way that it will call
handle_mm_fault(mm, vma, address, write); only if there is no
page fault at the moment.
Can you suggest any other possible solution, or tell me what you think about
this one?

I would really appreciate any help with this,
Doron.


