Re: Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-11 Thread Tobias Ringstrom

[regarding the buffer cache hash size and bad performance on machines
with little memory...  (<32MB)]

On Tue, 9 Jan 2001, Anton Blanchard wrote:
> > Where is the size defined, and is it easy to modify?
>
> Look in fs/buffer.c:buffer_init()

I experimented a bit, and increasing the buffer cache hash size to the 2.2
levels helped a lot, especially with 16 MB of memory.  The difference is
huge: 64 kB in 2.2 vs. 1 kB in 2.4 on a 32 MB machine.
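
To see why such a small table hurts, here is a back-of-the-envelope
sketch (not kernel code; it assumes 32-bit bucket pointers, 1 kB
buffers, and the table sizes measured above):

#include <stdio.h>

int main(void)
{
	long ram = 32L << 20;		/* the 32 MB machine above */
	long buffers = ram / 1024;	/* worst case: all RAM in 1 kB buffers */
	long table_24 = 1L << 10;	/* 1 kB hash table (2.4, as measured) */
	long table_22 = 64L << 10;	/* 64 kB hash table (2.2, as measured) */

	/* Expected chain length = buffers / buckets, with 4-byte
	 * bucket pointers. */
	printf("2.4: %ld buckets, avg chain %ld\n",
	       table_24 / 4, buffers / (table_24 / 4));
	printf("2.2: %ld buckets, avg chain %ld\n",
	       table_22 / 4, buffers / (table_22 / 4));
	return 0;
}

That is 128 buffers per chain with the 2.4 sizing versus 2 with the
2.2 sizing, so every buffer lookup degenerates into a long list walk.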

> I haven't done any testing on slow hardware, and the high-end stuff is
> definitely performing better in 2.4, but I agree we shouldn't forget
> about the slower machines.

Being able to tune the machine for both high- and low-end systems is
necessary, and if Linux can tune itself, that is of course best.

> Narrowing down where the problem is would help.  My guess is that it is a
> TCP problem; can you check whether it performs worse in your case (e.g.
> FTP something against both 2.2 and 2.4)?

Nope, TCP performance seems more or less unchanged.  I will keep
investigating, and get back when I have more info.

/Tobias





Re: Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-09 Thread Roger Larsson

On Tuesday 09 January 2001 12:08, Anton Blanchard wrote:
> > Where is the size defined, and is it easy to modify?
>
> Look in fs/buffer.c:buffer_init()
>
> > I noticed that /proc/sys/vm/freepages is not writable any more.  Is there
> > any reason for this?
>
> I am not sure why.
>

It can probably be made writable, within limits (imposed by the zones...)

But the interesting part is that 2.4 tries to estimate how much memory it
will need shortly (inactive_target) and tries to keep that amount inactive
and clean (inactive_clean); clean inactive memory can be freed and reused
very quickly.

cat /proc/meminfo
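
For the curious, a trivial reader for just the relevant fields (a
sketch; the exact names, e.g. Inact_clean and Inact_target, vary
between 2.4 versions, so it matches on the "Inact" prefix instead of
hardcoding them):

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f) {
		perror("/proc/meminfo");
		return 1;
	}
	/* Print only the free and inactive page accounting. */
	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, "Inact", 5) || !strncmp(line, "MemFree", 7))
			fputs(line, stdout);
	fclose(f);
	return 0;
}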

My feeling is that, for now, keeping it untunable will help us find
fixable cases...

/RogerL

--
Home page:
  http://www.norran.net/nra02596/



Re: Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-09 Thread Anton Blanchard

> Where is the size defined, and is it easy to modify?

Look in fs/buffer.c:buffer_init()

> I noticed that /proc/sys/vm/freepages is not writable any more.  Is there
> any reason for this?

I am not sure why.

> Hmm...  I'm still using samba 2.0.7.  I'll try 2.2 to see if it
> helps.  What are tdb spinlocks?

Samba 2.2 uses tdb, an SMP-safe, gdbm-like database.  By default it
uses byte-range fcntl locks to provide locking, but it has the option of
using spinlocks (./configure --with-spinlocks).  I doubt it would make
a difference on your setup.
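
(For reference, a minimal sketch of what the default byte-range fcntl
scheme looks like; the file name and offset are made up, but the idea
is that each hash chain in the database can be locked independently by
locking one byte at its offset:)

#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* type is F_RDLCK, F_WRLCK or F_UNLCK; F_SETLKW blocks until granted. */
static int lock_byte(int fd, off_t off, int type)
{
	struct flock fl;

	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = off;
	fl.l_len = 1;			/* lock a single byte */
	return fcntl(fd, F_SETLKW, &fl);
}

int main(void)
{
	int fd = open("example.tdb", O_RDWR | O_CREAT, 0644);

	if (fd < 0 || lock_byte(fd, 42, F_WRLCK) < 0) {
		perror("lock");
		return 1;
	}
	/* ... touch the record guarded by byte 42 ... */
	lock_byte(fd, 42, F_UNLCK);
	close(fd);
	return 0;
}

The spinlock option trades the two fcntl() system calls per operation
for busy-waiting in user space, which is why it mainly matters under
heavy SMP contention.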

> Have you actually compared the same setup with 2.2 and 2.4 kernels and a
> single client transferring a large file, preferably from a slow server
> with little memory?  Most samba servers that people benchmark are fast
> computers with lots of memory.  So far, every major kernel upgrade has
> given me a performance boost, even for slow computers, and I would hate to
> see that trend break for 2.4...

I haven't done any testing on slow hardware, and the high-end stuff is
definitely performing better in 2.4, but I agree we shouldn't forget
about the slower machines.

Narrowing down where the problem is would help.  My guess is that it is a
TCP problem; can you check whether it performs worse in your case (e.g.
FTP something against both 2.2 and 2.4)?
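
(Something like the sketch below takes both Samba and the disk out of
the picture; the address and port are placeholders, and the receiving
end can be as simple as "nc -l -p 5001 > /dev/null" on the server:)

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	static char buf[65536];		/* 64 kB of zeroes per write */
	long total = 64L << 20;		/* stream 64 MB in total */
	long sent = 0;
	ssize_t n;
	double secs;
	struct sockaddr_in sa;
	struct timeval t0, t1;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_port = htons(5001);			/* placeholder port */
	sa.sin_addr.s_addr = inet_addr("192.168.1.1");	/* placeholder host */

	if (fd < 0 || connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		perror("connect");
		return 1;
	}
	gettimeofday(&t0, NULL);
	while (sent < total) {
		n = write(fd, buf, sizeof(buf));
		if (n < 0) {
			perror("write");
			return 1;
		}
		sent += n;
	}
	gettimeofday(&t1, NULL);
	secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
	printf("%.1f MB/s\n", sent / secs / 1048576.0);
	return 0;
}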

Anton



Re: Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-04 Thread Tobias Ringstrom

On Fri, 5 Jan 2001, Anton Blanchard wrote:
> 
> > 1) Why do the hdparm numbers go down for 2.4 (only) when 32 MB is used?
> >    I fail to see how that matters, especially for the '-T' test.
> 
> When I did some tests long ago, hdparm was hitting the buffer cache hash
> table pretty hard in 2.4 compared to 2.2 because it is now smaller. However,
> as davem pointed out, most workloads don't behave like that, so resizing
> the hash table just for this is a waste.

Where is the size defined, and is it easy to modify?

> Since the hash size is based on the amount of RAM, it may end up being big
> enough on the 128 MB machine.

Maybe.  I have been experimenting some more, and I see that the less
memory I have, the more CPU kswapd takes (more than 10% in some cases)
when I am doing a continuous read from a block device.
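
Roughly what I mean by a continuous read (the device name is an
example; watch kswapd in top while it runs):

#include <sys/types.h>
#include <sys/time.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	static char buf[131072];
	long long bytes = 0;
	ssize_t n;
	double secs;
	struct timeval t0, now;
	int fd = open("/dev/hda", O_RDONLY);	/* placeholder device */

	if (fd < 0) {
		perror("open");
		return 1;
	}
	gettimeofday(&t0, NULL);
	/* Read the device start to end in 128 kB chunks, reporting the
	 * running rate every 64 MB. */
	while ((n = read(fd, buf, sizeof(buf))) > 0) {
		bytes += n;
		if (bytes % (64LL << 20) == 0) {
			gettimeofday(&now, NULL);
			secs = (now.tv_sec - t0.tv_sec) +
			       (now.tv_usec - t0.tv_usec) / 1e6;
			printf("%.1f MB/s\n", bytes / secs / 1048576.0);
		}
	}
	close(fd);
	return 0;
}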

I noticed that /proc/sys/vm/freepages is not writable any more.  Is there
any reason for this?

> > The reason for doing the benchmarks in the first place is that my 32MB P90
> > at home really does perform noticeably worse with samba using 2.4 kernels
> > than using 2.2 kernels, and that bugs me.  I have no hard numbers for that
> > machine (yet).  If they show anything extra, I will post them here.
> 
> What exactly are you seeing?

I first noticed the slowdown because the load meter LEDs on my ethernet
hub did not go as high with 2.4 as they did with 2.2.  A simple test,
transferring a large file using smbclient, did indeed show a decrease in
performance, both for a localhost and a remote file transfer.  This in
spite of the TCP transfer rate being (much) higher in 2.4 than in 2.2.

> > Btw, has anyone else noticed samba slowdowns when going from 2.2 to 2.4?
> 
> I am seeing good results with 2.4 + samba 2.2 using tdb spinlocks.

Hmm...  I'm still using samba 2.0.7.  I'll try samba 2.2 to see if it
helps.  What are tdb spinlocks?

Have you actually compared the same setup with 2.2 and 2.4 kernels and a
single client transferring a large file, preferably from a slow server
with little memory?  Most samba servers that people benchmark are fast
computers with lots of memory.  So far, every major kernel upgrade has
given me a performance boost, even for slow computers, and I would hate to
see that trend break for 2.4...

/Tobias




Re: Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-04 Thread Anton Blanchard


> 1) Why do the hdparm numbers go down for 2.4 (only) when 32 MB is used?
>    I fail to see how that matters, especially for the '-T' test.

When I did some tests long ago, hdparm was hitting the buffer cache hash
table pretty hard in 2.4 compared to 2.2 because it is now smaller. However,
as davem pointed out, most workloads don't behave like that, so resizing
the hash table just for this is a waste.

Since the hash size is based on the amount of RAM, it may end up being big
enough on the 128 MB machine.

> 3) The 2.2 kernels outperform the 2.4 kernels with few clients (see
>    especially the "dbench 1" numbers for the PII-128M).  Oops!

dbench was sleeping for up to 1 second between starting the clock and
starting the benchmark. This made small runs (especially dbench 1) somewhat
variable. Grab the latest version from pserver.samba.org (if you haven't
already).
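
(Not dbench's actual code, but the pattern was essentially this: the
start timestamp was taken before a variable startup delay, so short
runs absorbed up to a second of dead time.  The fix is to settle first
and only then start the clock:)

#include <sys/time.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static double seconds(void)
{
	struct timeval tv;

	gettimeofday(&tv, NULL);
	return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(void)
{
	double t0;
	long work_ms = 500;		/* stand-in for the benchmark itself */

	/* Buggy ordering: clock first, then up to 1s of startup slack. */
	t0 = seconds();
	usleep(rand() % 1000000);	/* variable startup delay */
	usleep(work_ms * 1000);		/* the actual "benchmark" */
	printf("buggy: %.2fs for %.2fs of work\n",
	       seconds() - t0, work_ms / 1000.0);

	/* Fixed ordering: let everything settle, then start the clock. */
	usleep(rand() % 1000000);
	t0 = seconds();
	usleep(work_ms * 1000);
	printf("fixed: %.2fs\n", seconds() - t0);
	return 0;
}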

> The reason for doing the benchmarks in the first place is that my 32MB P90
> at home really does perform noticeably worse with samba using 2.4 kernels
> than using 2.2 kernels, and that bugs me.  I have no hard numbers for that
> machine (yet).  If they will show anything extra, I will post them here.  

What exactly are you seeing?

> Btw, has anyone else noticed samba slowdowns when going from 2.2 to 2.4?

I am seeing good results with 2.4 + samba 2.2 using tdb spinlocks.

Anton



Re: Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-03 Thread Tobias Ringstrom

On Wed, 3 Jan 2001, Daniel Phillips wrote:

> Tobias Ringstrom wrote:
> > 3) The 2.2 kernels outperform the 2.4 kernels with few clients (see
> >    especially the "dbench 1" numbers for the PII-128M).  Oops!
> 
> I noticed that too.  Furthermore, I noticed that the results of the more
> heavily loaded tests on the whole 2.4.0 series tend to be highly
> variable (usually worse) if you started by moving the whole disk through
> cache, e.g., fsck on a damaged filesystem.

Yes, they do seem to vary a lot.

> It would be great if you could track the ongoing progress - you could go
> so far as to automatically download the latest patch and rerun the
> tests.  (We have a script like that here to keep our lxr/cvs tree
> current.)  And yes, it gets more important to consider some of the other
> usage patterns so we don't end up with self-fulfilling prophecies.

I was thinking about an automatic test/build/modify-lilo/reboot cycle
for a while, but I don't think it's worth it.  Benchmarking is hard, and
making it automatic is probably even harder, not to mention trying to
interpret the numbers...  Probably "Samba feels slower" works quite well.
:-)

But then it is not even clear to me what the VM people are trying to
optimize for.  Probably a system that "feels good", which, per my own
remark above, may actually be a good criterion, although a bit imprecise.
Oh, well...

> For benchmarking it would be really nice to have a way of emptying
> cache, beyond just syncing.  I took a look at that last week and
> unfortunately it's not trivial.  The things that have to be touched are
> optimized for the steady-state running case and tend to take their
> marching orders from global variables and embedded heuristics that you
> don't want to mess with.  Maybe I'm just looking at this problem the
> wrong way because the shortest piece of code I can imagine for doing
> this would be 100-200 lines long and would replicate a lot of the
> functionality of page_launder and flush_dirty_pages; in other words, it
> would be a pain to maintain.

How about allocating lots of memory and locking it in core?  I have not
looked at the source, but it seems (from strace) that hdparm uses shm to
do just that.  I'll dig into the hdparm code and try to make a program
that empties the cache.
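
Roughly what I have in mind (a sketch; the amount is a knob you would
set near the machine's RAM size, and mlock() needs root):

#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MEG 24L			/* e.g. 24 MB on a 32 MB machine */

int main(void)
{
	size_t len = MEG << 20;
	char *p = malloc(len);

	if (!p) {
		perror("malloc");
		return 1;
	}
	/* Touch every page so the memory is actually allocated, forcing
	 * cached file pages and buffers out. */
	memset(p, 1, len);
	if (mlock(p, len) < 0)
		perror("mlock (non-fatal, continuing)");
	munlock(p, len);
	free(p);
	return 0;
}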

/Tobias




Re: Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-03 Thread Daniel Phillips

Tobias Ringstrom wrote:
> 3) The 2.2 kernels outperform the 2.4 kernels with few clients (see
>    especially the "dbench 1" numbers for the PII-128M).  Oops!

I noticed that too.  Furthermore, I noticed that the results of the more
heavily loaded tests on the whole 2.4.0 series tend to be highly
variable (usually worse) if you started by moving the whole disk through
cache, e.g., fsck on a damaged filesystem.

> The reason for doing the benchmarks in the first place is that my 32MB P90
> at home really does perform noticeably worse with samba using 2.4 kernels
> than using 2.2 kernels, and that bugs me.  I have no hard numbers for that
> machine (yet).  If they show anything extra, I will post them here.
> Btw, has anyone else noticed samba slowdowns when going from 2.2 to 2.4?

Again, yes, I saw that.

> Wow!  You made it all the way down here.  Congratulations!  :-)

Heh.  Yes, then I read it all again backwards.  I'll respectfully bow
out of the benchmarking business now.  :-)

It would be great if you could track the ongoing progress - you could go
so far as to automatically download the latest patch and rerun the
tests.  (We have a script like that here to keep our lxr/cvs tree
current.)  And yes, it gets more important to consider some of the other
usage patterns so we don't end up with self-fulfilling prophecies.

For benchmarking it would be really nice to have a way of emptying
cache, beyond just syncing.  I took a look at that last week and
unfortunately it's not trivial.  The things that have to be touched are
optimized for the steady-state running case and tend to take their
marching orders from global variables and embedded heuristics that you
don't want to mess with.  Maybe I'm just looking at this problem the
wrong way because the shortest piece of code I can imagine for doing
this would be 100-200 lines long and would replicate a lot of the
functionality of page_launder and flush_dirty_pages; in other words, it
would be a pain to maintain.

--
Daniel