Re: Disk schedulers

2008-02-20 Thread Lukas Hejtmanek
On Sat, Feb 16, 2008 at 05:20:49PM +, Pavel Machek wrote:
> Is cat /dev/zero > file enough to reproduce this?

yes.


> ext3 filesystem?

yes.
 
> Will cat /etc/passwd work while machine is unresponsive?

yes.

while find does not work:
time find /
/
/etc
/etc/manpath.config
/etc/update-manager
/etc/update-manager/release-upgrades
/etc/gshadow-
/etc/inputrc
/etc/openalrc
/etc/bonobo-activation
/etc/bonobo-activation/bonobo-activation-config.xml
/etc/gnome-vfs-2.0
/etc/gnome-vfs-2.0/modules
/etc/gnome-vfs-2.0/modules/obex-module.conf
/etc/gnome-vfs-2.0/modules/extra-modules.conf
/etc/gnome-vfs-2.0/modules/theme-method.conf
/etc/gnome-vfs-2.0/modules/font-method.conf
/etc/gnome-vfs-2.0/modules/default-modules.conf
^C

real    0m7.982s
user    0m0.003s
sys     0m0.000s


i.e., it took 8 seconds to list just 17 directory entries.

It looks like I have this problem:
http://www.linuxinsight.com/first_benchmarks_of_the_ext4_file_system.html#comment-619
(the last comment with title: Sustained writes 2 or more times the amount of
memfree)
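For reference, a minimal sketch of the reproduction described above (a streaming write competing with a small read); the file sizes and paths here are only illustrative, and on the affected machine one would wrap the read in `time` to measure its latency:

```shell
# Hedged reproduction sketch: start a background streaming write, then
# issue a small read while the write is in flight, as in the thread.
TMPFILE=$(mktemp)                                           # big write target
dd if=/dev/zero of="$TMPFILE" bs=1M count=16 2>/dev/null &  # streaming writer
WRITER=$!
cat /etc/passwd > /dev/null                                 # small competing read
wait "$WRITER"
rm -f "$TMPFILE"
```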

-- 
Lukáš Hejtmánek
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Disk schedulers

2008-02-20 Thread Zdenek Kabelac
2008/2/15, Zan Lynx <[EMAIL PROTECTED]>:
>
> On Fri, 2008-02-15 at 15:57 +0100, Prakash Punnoor wrote:
> > On the day of Friday 15 February 2008 Jan Engelhardt hast written:
> > > On Feb 14 2008 17:21, Lukas Hejtmanek wrote:
> > > >Hello,
> > > >
> > > >whom should I blame about disk schedulers?
> > >
> > > Also consider
> > > - DMA (e.g. only UDMA2 selected)
> > > - aging disk
> >
> > Nope, I also reported this problem _years_ ago, but till now much hasn't
> > changed. Large writes lead to read starvation.
>
> Yes, I see this often myself.  It's like the disk IO queue (I set mine
> to 1024) fills up, and pdflush and friends can stuff write requests into
> it much more quickly than any other programs can provide read requests.
>
> CFQ and ionice work very well up until iostat shows average IO queuing
> above 1024 (where I set the queue number).

I should probably summarize my experience here as well:

I'm using QEMU - inside it I'm testing a kernel module that does a lot
of disk copy operations; its virtual disk is 8GB.  When my test starts,
my system begins to feel unresponsive a couple of times per minute for
nearly 10 minutes - especially if I use a chat tool like Pidgin, I'm
often left for 5 seconds without any visible refresh on screen
(redraws, typed keys, ...).  Firefox shows similar symptoms...

Obviously Pidgin bears some responsibility of its own here - strace
shows it tries to open and read files that it has already read a
zillion times before :) - but that's another story.

But I've tried many things - I've started qemu with ionice -c0, used
ionice -c2 for pidgin, tried different IO schedulers, niced qemu, and
changed swappiness to different values according to the various tips &
tricks I could find around the web - and I still cannot get a properly
running system with my qemu test case, because the system feels
unresponsive in any application that needs to touch my drive.

Does anyone have any ideas what I should try/test/check?

BTW, one interesting thing I've noticed is a very high kernel IPI count:
i.e.
  77,0% (3794,9)   : Rescheduling interrupts

Sometimes this number approaches the 1 barrier.

My machine is a 2.2GHz C2D T61 with 2GB RAM - the CPU is 50% idle while
the machine freezes - and yes, I can move the mouse all the time ;) and
no, I'm not out of RAM.

Zdenek


Re: Disk schedulers

2008-02-20 Thread Pavel Machek
Hi!

> whom should I blame about disk schedulers?
> 
> I have the following setup:
> 1Gb network
> 2GB RAM
> disk write speed about 20MB/s
> 
> If I'm scping file (about 500MB) from the network (which is faster than the
> local disk), any process is totally unable to read anything from the local 
> disk
> till the scp finishes. It is not caused by low free memory, while scping
> I have 500MB of free memory (not cached or buffered).
> 
> I tried cfq and anticipatory scheduler, none is different.

Is cat /dev/zero > file enough to reproduce this?

ext3 filesystem?

Will cat /etc/passwd work while machine is unresponsive?
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Disk schedulers

2008-02-17 Thread Linda Walsh

Lukas Hejtmanek wrote:

whom should I blame about disk schedulers?

I have the following setup:
1Gb network
2GB RAM
disk write speed about 20MB/s

If I'm scping file (about 500MB) from the network (which is faster than the
local disk), any process is totally unable to read anything from the local disk
till the scp finishes. It is not caused by low free memory, while scping
I have 500MB of free memory (not cached or buffered).

I tried cfq and anticipatory scheduler, none is different.

  

   You didn't say anything about the number of processors or their
speed, nor about your hard disk's raw IO ability.  You also didn't
mention the kernel version, or whether you were using the new per-UID
group CPU scheduler in 2.6.24 (which likes to default to 'on' - not a
great choice for single-user, desktop-type machines, if I understand
its grouping policy).


Are you sure neither end of the copy is CPU-bound on ssh/scp
encrypt/decrypt calculations?  It might not just be an inability to
read from disk, but low CPU availability.  Scp can be a lot more CPU
intensive than you would expect...  Just something to consider...

Linda



Re: Disk schedulers

2008-02-16 Thread Lukas Hejtmanek
On Fri, Feb 15, 2008 at 05:24:52PM +, Paulo Marques wrote:
> If you want to take advantage of all that memory to buffer disk writes,  
> so that the reads can proceed better, you might want to tweak your  
> /proc/sys/vm/dirty_ratio and /proc/sys/vm/dirty_background_ratio to more
> appropriate values. (maybe also dirty_writeback_centisecs and  
> dirty_expire_centisecs)

I don't want my whole memory eaten by a single file that is not going to
be read again - caching it is pretty useless.  Instead, I would like scp
to be slowed down so that other processes can also access the disk.  Why
is this possible with the kernel process scheduler but not with the IO
scheduler?

-- 
Lukáš Hejtmánek


Re: Disk schedulers

2008-02-16 Thread Lukas Hejtmanek
On Fri, Feb 15, 2008 at 10:11:26AM -0700, Zan Lynx wrote:
> Yes, I see this often myself.  It's like the disk IO queue (I set mine
> to 1024) fills up, and pdflush and friends can stuff write requests into
> it much more quickly than any other programs can provide read requests.
> 
> CFQ and ionice work very well up until iostat shows average IO queuing
> above 1024 (where I set the queue number).

I thought that CFQ maintained an IO queue per process and picked
requests in round-robin fashion from the non-empty queues.  Am I wrong?
And if I am, isn't that the desired behavior for a desktop?
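For anyone wanting to check which scheduler is actually in play while testing this, a small sketch that walks sysfs (the bracketed name in the output is the active scheduler; device names vary per system, and `sda` in the comment is only an example):

```shell
# Sketch: list the active IO scheduler for every block device.
count=0
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    printf '%s: %s\n' "$f" "$(cat "$f")"
    count=$((count + 1))
done
echo "inspected $count scheduler file(s)"
# Switching a device (sda is only an example) to deadline, as root:
#   echo deadline > /sys/block/sda/queue/scheduler
```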

-- 
Lukáš Hejtmánek


Re: Disk schedulers

2008-02-15 Thread FD Cami
On Fri, 15 Feb 2008 10:11:26 -0700
Zan Lynx <[EMAIL PROTECTED]> wrote:

> 
> On Fri, 2008-02-15 at 15:57 +0100, Prakash Punnoor wrote:
> > On the day of Friday 15 February 2008 Jan Engelhardt hast written:
> > > On Feb 14 2008 17:21, Lukas Hejtmanek wrote:
> > > >Hello,
> > > >
> > > >whom should I blame about disk schedulers?
> > >
> > > Also consider
> > > - DMA (e.g. only UDMA2 selected)
> > > - aging disk
> > 
> > Nope, I also reported this problem _years_ ago, but till now much hasn't 
> > changed. Large writes lead to read starvation.
> 
> Yes, I see this often myself.  It's like the disk IO queue (I set mine
> to 1024) fills up, and pdflush and friends can stuff write requests into
> it much more quickly than any other programs can provide read requests.
> 
> CFQ and ionice work very well up until iostat shows average IO queuing
> above 1024 (where I set the queue number).

I can confirm that as well.

This is easily reproducible with, for example, dd if=/dev/zero
of=somefile bs=2048.  After a short while, trying to read the disks
takes an awfully long time, even if the dd process is ionice'd.

What is worse, other drives attached to the same controller become
unresponsive as well.
I use a Dell PERC 5/i (megaraid_sas) with:
* 2 SAS 15000 RPM drives, RAID1 => sda
* 4 SAS 15000 RPM drives, RAID5 => sdb
* 2 SATA 7200 RPM drives, RAID1 => sdc
Using dd or mkfs on sdb or sdc makes sda unresponsive as well.
Is this expected?

Cheers

Francois


Re: Disk schedulers

2008-02-15 Thread Roger Heflin

Lukas Hejtmanek wrote:

On Fri, Feb 15, 2008 at 03:42:58PM +0100, Jan Engelhardt wrote:

Also consider
- DMA (e.g. only UDMA2 selected)
- aging disk


it's not the case.

hdparm reports udma5 is used, if it is reliable with libata.

The disk is 3 months old, kernel does not report any errors. And it has never 
been different.




A new current IDE/SATA disk should do around 60MB/s; check the min/max
bps rate listed on the disk manufacturer's site, divide by 8, and take
maybe 80% of that.

Also, you may consider using the -l option on the scp command to limit
its bandwidth usage.
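A sketch of that suggestion (scp's -l takes a limit in Kbit/s; the host and paths below are hypothetical examples, not from the original report):

```shell
# Cap scp at roughly 10 MB/s (80000 Kbit/s), leaving disk bandwidth
# for competing readers.  Host and paths are illustrative only.
scp -l 80000 user@remotehost:/path/to/bigfile /local/dir/
```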

This behavior has been around for at least 8 years (since 2.2): high
levels of writes significantly starve out reads, mainly because
thousands of writes can be queued up ahead of a read.  When that read
finishes, the next read has another few thousand writes to get in line
behind and wait for, and this continues until the writes stop.

Roger


Re: Disk schedulers

2008-02-15 Thread Paulo Marques

Lukas Hejtmanek wrote:

[...]
If I'm scping file (about 500MB) from the network (which is faster than the
local disk), any process is totally unable to read anything from the local disk
till the scp finishes. It is not caused by low free memory, while scping
I have 500MB of free memory (not cached or buffered).


If you want to take advantage of all that memory to buffer disk writes, 
so that the reads can proceed better, you might want to tweak your 
/proc/sys/vm/dirty_ratio and /proc/sys/vm/dirty_background_ratio to more 
appropriate values. (maybe also dirty_writeback_centisecs and 
dirty_expire_centisecs)


You can read all about those tunables in Documentation/filesystems/proc.txt.
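A sketch of the kind of tuning suggested above; the specific values here are illustrative guesses on my part, not recommendations from this mail (run as root):

```shell
# Hedged example values -- tune to taste after reading proc.txt.
echo 10 > /proc/sys/vm/dirty_ratio                  # throttle writers earlier
echo 5  > /proc/sys/vm/dirty_background_ratio       # start background writeback sooner
echo 500  > /proc/sys/vm/dirty_writeback_centisecs  # wake pdflush every 5s
echo 1000 > /proc/sys/vm/dirty_expire_centisecs     # consider dirty data old after 10s
```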

Just my 2 cents,

--
Paulo Marques - www.grupopie.com

"Very funny Scotty. Now beam up my clothes."


Re: Disk schedulers

2008-02-15 Thread Zan Lynx

On Fri, 2008-02-15 at 15:57 +0100, Prakash Punnoor wrote:
> On the day of Friday 15 February 2008 Jan Engelhardt hast written:
> > On Feb 14 2008 17:21, Lukas Hejtmanek wrote:
> > >Hello,
> > >
> > >whom should I blame about disk schedulers?
> >
> > Also consider
> > - DMA (e.g. only UDMA2 selected)
> > - aging disk
> 
> Nope, I also reported this problem _years_ ago, but till now much hasn't 
> changed. Large writes lead to read starvation.

Yes, I see this often myself.  It's like the disk IO queue (I set mine
to 1024) fills up, and pdflush and friends can stuff write requests into
it much more quickly than any other programs can provide read requests.

CFQ and ionice work very well up until iostat shows average IO queuing
above 1024 (where I set the queue number).
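The queue depth mentioned above can be inspected and set via sysfs; a sketch, where `sda` is an assumption about the device name (run the echo as root):

```shell
# View and raise the block-layer request queue depth for one device.
cat /sys/block/sda/queue/nr_requests           # current depth, often 128
echo 1024 > /sys/block/sda/queue/nr_requests   # match the 1024 in this mail
```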
-- 
Zan Lynx <[EMAIL PROTECTED]>




Re: Disk schedulers

2008-02-15 Thread Jeffrey E. Hundstad

Lukas Hejtmanek,

I have to say that I've heard this subject before; the summary answer
seems to be that the kernel cannot guess the wishes of the user 100%
of the time.  If you have a low-priority IO task, use ionice(1) to set
the priority of that task so it doesn't nuke your high-priority tasks.
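A sketch of that ionice usage (the dd command and PID below are hypothetical examples; the idle class requires CFQ):

```shell
# Run a bulk writer in the idle IO class so it only gets disk time
# when nothing else wants it.
ionice -c3 dd if=/dev/zero of=/tmp/bigfile bs=1M count=512

# Or demote an already-running process to best-effort, lowest priority
# (the PID is an example):
ionice -c2 -n7 -p 1234
```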


I have no personal stake in this answer, but I can report that for my
high-IO tasks it works like a charm.


--
Jeffrey Hundstad

Lukas Hejtmanek wrote:

On Fri, Feb 15, 2008 at 03:42:58PM +0100, Jan Engelhardt wrote:
  

Also consider
- DMA (e.g. only UDMA2 selected)
- aging disk



it's not the case.

hdparm reports udma5 is used, if it is reliable with libata.

The disk is 3 months old, kernel does not report any errors. And it has never
been different.

--
Lukáš Hejtmánek



Re: Disk schedulers

2008-02-15 Thread Lukas Hejtmanek
On Fri, Feb 15, 2008 at 03:42:58PM +0100, Jan Engelhardt wrote:
> Also consider
> - DMA (e.g. only UDMA2 selected)
> - aging disk

it's not the case.

hdparm reports udma5 is used, if it is reliable with libata.

The disk is 3 months old, kernel does not report any errors. And it has never 
been different.

-- 
Lukáš Hejtmánek


Re: Disk schedulers

2008-02-15 Thread Prakash Punnoor
On the day of Friday 15 February 2008 Jan Engelhardt hast written:
> On Feb 14 2008 17:21, Lukas Hejtmanek wrote:
> >Hello,
> >
> >whom should I blame about disk schedulers?
>
> Also consider
> - DMA (e.g. only UDMA2 selected)
> - aging disk

Nope, I also reported this problem _years_ ago, but till now much hasn't 
changed. Large writes lead to read starvation.

-- 
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V




Re: Disk schedulers

2008-02-15 Thread Jan Engelhardt

On Feb 14 2008 17:21, Lukas Hejtmanek wrote:
>Hello,
>
>whom should I blame about disk schedulers?

Also consider
- DMA (e.g. only UDMA2 selected)
- aging disk


Re: Disk schedulers

2008-02-15 Thread Lukas Hejtmanek
On Fri, Feb 15, 2008 at 09:02:31AM +0900, Tejun Heo wrote:
> > till the scp finishes. It is not caused by low free memory, while scping
> > I have 500MB of free memory (not cached or buffered).
> > 
> > I tried cfq and anticipatory scheduler, none is different.
> > 
> 
> Does deadline help?

Well, deadline is a little bit better.  My read test is opening a
maildir with 2 mails in mutt; mutt shows its progress while opening.
With the cfq or anticipatory scheduler, the progress stays at 0/2
until the scp finishes.  With deadline, the progress had reached 150/2
by the time the scp finished.  So I would say it is better, but I
doubt it is OK.

-- 
Lukáš Hejtmánek


Re: Disk schedulers

2008-02-14 Thread Tejun Heo
Lukas Hejtmanek wrote:
> Hello,
> 
> whom should I blame about disk schedulers?
> 
> I have the following setup:
> 1Gb network
> 2GB RAM
> disk write speed about 20MB/s
> 
> If I'm scping file (about 500MB) from the network (which is faster than the
> local disk), any process is totally unable to read anything from the local 
> disk
> till the scp finishes. It is not caused by low free memory, while scping
> I have 500MB of free memory (not cached or buffered).
> 
> I tried cfq and anticipatory scheduler, none is different.
> 

Does deadline help?

-- 
tejun