2.4.x latency

2001-02-05 Thread Daniel Walton


I'm seeing some odd behavior under the 2.4.0 and 2.4.1 kernels on two of 
my servers: periods of high latency, sometimes long and sometimes short.  
As a test I set up three ping processes on one of the servers, all pinging 
the same destination on the LAN at the same time.  Below is a sample of the 
ping output.  The strange thing is that while all three ping processes went 
through the latency cycle, they each did it at different times.  That tells 
me this isn't a network response issue, or else all three ping processes 
would show the latency at the same time.

64 bytes from (216.185.106.18): icmp_seq=55 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=56 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=57 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=58 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=59 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=60 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=61 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=62 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=63 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=64 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=65 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=66 ttl=255 time=4121.7 ms
64 bytes from (216.185.106.18): icmp_seq=67 ttl=255 time=3259.0 ms
64 bytes from (216.185.106.18): icmp_seq=68 ttl=255 time=2384.6 ms
64 bytes from (216.185.106.18): icmp_seq=69 ttl=255 time=1511.2 ms
64 bytes from (216.185.106.18): icmp_seq=70 ttl=255 time=666.1 ms
64 bytes from (216.185.106.18): icmp_seq=71 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=72 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=73 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=74 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=75 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=76 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=77 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=78 ttl=255 time=0.1 ms
64 bytes from (216.185.106.18): icmp_seq=79 ttl=255 time=0.1 ms
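
Roughly, the test above can be reproduced with something like this (a 
minimal sketch; the destination IP and log file names are only examples):

for i in 1 2 3; do
    # three concurrent ping processes against the same LAN host,
    # each logging its per-packet round-trip times to its own file
    ping 216.185.106.18 > ping-$i.log 2>&1 &
done
# let them run through a latency cycle, then stop them and compare the logs
# kill %1 %2 %3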


The hardware in question is two Athlon servers with the VIA KT133 chipset, 
512 MB RAM, and IDE drives.  One server uses the tulip network driver for a 
Netgear FA-310; the other uses the NatSemi DP83810 network driver for an 
FA-312, and both exhibit the same problem.  I've had 3Com 3c900-series 
cards in the machines as well and the problem still persisted.

One other interesting little fact is that if I ping the problem machines 
from a good machine I always get 0.1 ms response times, even while the 
pings from the problem machines are showing latency.

I hope this is enough information for someone to work with.  I'm at a loss 
for what the problem is and unfortunately I'm no kernel hacker.  I 
appreciate any help you guys can offer.

Thank you,
Daniel Walton





Re: 2.4.0 Networking oddity

2001-01-28 Thread Daniel Walton


The server in question is running the tulip driver.  dmesg reports:

Linux Tulip driver version 0.9.13 (January 2, 2001)

I have seen this same behavior on a couple of my servers running 3Com 
3c905C adapters as well.

The last time I was experiencing it I rebooted the system, and that didn't 
solve the problem; when it came back up it was still lagging.  This leads 
me to believe that it is caused by some sort of network condition, but what 
kind, I don't know.

If anyone has ideas, I'd be more than happy to run tests or provide more info.

-Dan



At 10:14 PM 1/28/2001 -0500, you wrote:
>In mailing-lists.linux-kernel, you wrote:
>
> >I am running a web server under the new 2.4.0 kernel and am experiencing
> >some intermittent odd behavior from the kernel.  The machine will sometimes
> >go through cycles where network response becomes slow even though top
> >reports over 60% idle CPU time.   When this is happening ping goes from
> >reasonable response times to response times of several seconds in cycles of
> >about 15 to 20 seconds.
>
>FWIW, I have seen behaviour like this under kernel 2.2.x and 2.4.x,
>for me taking the interface down and then bringing it back up usually
>makes the problem stop, at least for the moment.
>
>I have always assumed that it is caused by a bug in the Ethernet card
>driver, as the first time I noticed this behaviour, I was using the
>Realtek 8139 driver about two years ago, it was really not good
>hardware and the driver was pretty new.  Anyway, it would do this, so
>I contacted Donald Becker about it, he pointed me to a newer version
>of the driver that did it _much_ less often.
>
>Cheers,
>Wayne
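
(A minimal sketch of the down/up workaround Wayne describes; the interface 
name eth0 is only an example:)

# taking the interface down and back up forces the driver to shut down and
# reinitialize the card; a one-shot workaround, not a fix
ifconfig eth0 down
ifconfig eth0 up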




2.4.0 Networking oddity

2001-01-28 Thread Daniel Walton


I am running a web server under the new 2.4.0 kernel and am experiencing 
some intermittent odd behavior from the kernel.  The machine will sometimes 
go through cycles where network response becomes slow even though top 
reports over 60% idle CPU time.  When this is happening, ping goes from 
reasonable response times to times of several seconds, in cycles of about 
15 to 20 seconds.

As a test I pinged another machine on the same network segment and saw the 
same behavior described above.  On the other hand, when I pinged from the 
other machine on the LAN to the problem machine, the ping times were a 
consistent 0.1 ms.  This tells me two things: one, that the network switch 
is not causing the problem, and two, that the problem is very likely 
somewhere in the handoff of packets between kernel-land and user-land on 
the problem server.

Here are the ping results from the problem server to another machine on 
the same segment:

77 packets transmitted, 77 packets received, 0% packet loss
round-trip min/avg/max = 0.2/4368.1/15126.6 ms


Here are the ping results from the other machine to the problem server 
taken at exactly the same time:

116 packets transmitted, 115 packets received, 0% packet loss
round-trip min/avg/max = 0.1/0.1/0.3 ms


A little information about what I'm running: the server pushes about 
700 Kbps of continuous network output from nearly a thousand concurrent 
connections.  The web server is a single process that uses the select/poll 
method of multiplexing.  The machine is a 1 GHz Athlon with 512 MB of RAM, 
running Red Hat 6.2.

I have the following tweaks set up in my rc.local file:

echo "7168 32767 65535" > /proc/sys/net/ipv4/tcp_mem
echo 32768 > /proc/sys/net/ipv4/tcp_max_orphans
echo 4096 > /proc/sys/net/ipv4/tcp_max_syn_backlog
echo 1 > /proc/sys/net/ipv4/tcp_syncookies
echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
echo 4 > /proc/sys/net/ipv4/tcp_syn_retries
echo 7 > /proc/sys/net/ipv4/tcp_retries2
echo 300 > /proc/sys/net/ipv4/tcp_keepalive_time
echo 30 > /proc/sys/net/ipv4/tcp_keepalive_intvl
echo 16384 > /proc/sys/fs/file-max
echo 16384 > /proc/sys/kernel/rtsig-max
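
A quick way to confirm the values actually took effect after boot (a 
minimal sketch, using the same paths as above):

for f in tcp_mem tcp_max_orphans tcp_max_syn_backlog tcp_syncookies \
         tcp_fin_timeout tcp_syn_retries tcp_retries2 \
         tcp_keepalive_time tcp_keepalive_intvl; do
    echo -n "$f = "; cat /proc/sys/net/ipv4/$f   # print each tunable's current value
done
echo -n "file-max = ";  cat /proc/sys/fs/file-max
echo -n "rtsig-max = "; cat /proc/sys/kernel/rtsig-max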


Am I simply missing something in my tweaks or is this a bug?  I would be 
happy to supply more information if it would help anyone in the know on a 
problem like this.

I appreciate any light anyone can shed on this subject.  I've been trying 
to find the source of this problem for some time now.

Daniel Walton






Re: Out of socket memory? (2.4.0-test11)

2000-12-06 Thread Daniel Walton



I'm not quite clear how the settings under /proc/sys/vm/* would affect the 
problem.  I neglected to mention in my previous post that all web content 
is served directly from the memory of the web server (no file accesses).  
The only file accesses that happen come from a MySQL server, which gets 
queried about once a second.

Here's the output of /proc/meminfo.  I'm not sure how helpful it is.  I was 
kinda hoping for something that would allow me to see how much memory had 
been allocated for sockets and what the max was.

[root@s4 /proc]# cat meminfo
        total:     used:     free:  shared: buffers:   cached:
Mem:  261742592 122847232 138895360       0  1757184  88633344
Swap: 271392768         0 271392768
MemTotal:       255608 kB
MemFree:        135640 kB
MemShared:           0 kB
Buffers:          1716 kB
Cached:          86556 kB
Active:          15684 kB
Inact_dirty:     72588 kB
Inact_clean:         0 kB
Inact_target:       68 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       255608 kB
LowFree:        135640 kB
SwapTotal:      265032 kB
SwapFree:       265032 kB
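
(For the socket side specifically, /proc/net/sockstat may be closer to what 
is wanted here: on 2.4 its TCP line includes a "mem" field, which is the 
TCP buffer memory currently allocated, counted in pages, i.e. the number 
the tcp_mem thresholds are compared against.  A minimal sketch:)

cat /proc/net/sockstat                     # one-shot snapshot of socket counters
while true; do                             # or poll it while the server is under load
    date; cat /proc/net/sockstat; sleep 5
done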


-Dan




At 12:30 AM 12/7/2000 -0500, you wrote:

>backlog queue?  tuning /proc/sys/vm/*?
>
> > problem?  Is there any way I can get runtime information from the 
> kernel on
> > things like amount of socket memory used and amount available?  Am I using
>
>/proc/meminfo?




Out of socket memory? (2.4.0-test11)

2000-12-06 Thread Daniel Walton


Hello,

I've been having a problem with a high-volume Linux web server.  This 
particular web server used to be a FreeBSD machine, and I've been trying to 
make the switch for some time now.  I've been trying the 2.4 development 
kernels as they come out and tweaking the /proc filesystem variables, but 
so far nothing has fixed the problem.  The problem is that I get "Out of 
socket memory" errors and the networking locks up.  Sometimes the server 
will go for weeks without running into the problem and other times it lasts 
30 minutes.  The hardware in question is a 1 GHz Athlon system with 256 MB 
of RAM and an IDE hard disk.  I've tried every 2.4 test kernel to date.  
The web server is a specialized one handling about 10 million hits a day.  
Of the 256 MB of RAM the web server uses 40 MB, and there are no other 
significant memory-consuming processes on the system.  Currently I am using 
the following /proc modifications in the rc.local file.

echo "7168 11776 16384" > /proc/sys/net/ipv4/tcp_mem
echo 32768 > /proc/sys/net/ipv4/tcp_max_orphans
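
(For scale, and assuming the usual 4 KB x86 page size: the tcp_mem values 
are counted in pages, not bytes, so the settings above work out to roughly 
the following, with the "high" value being the ceiling the "Out of socket 
memory" check is tied to.  A quick sanity check:)

#   low       7168 pages * 4 KB = 28 MB
#   pressure 11776 pages * 4 KB = 46 MB
#   high     16384 pages * 4 KB = 64 MB
echo $(( 16384 * 4 / 1024 ))   # prints 64 (MB), the upper tcp_mem limit above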

What am I doing wrong?  Is this a kernel problem or a configuration 
problem?  Is there any way I can get runtime information from the kernel on 
things like amount of socket memory used and amount available?  Am I using 
the right variables to increase available socket memory and just not giving 
it enough yet?

I appreciate any help provided.

Thank you,
Daniel Walton




