Re: [ANNOUNCE] DSFS Network Forensic File System for Linux Patches

2005-09-01 Thread Lincoln Dale

jmerkey wrote:

> It might be helpful for someone to look at these sections of code I had
> to patch in 2.6.9.  I discovered a case where the kernel scheduler will
> pass NULL for the array argument when I started hitting the extreme
> upper range of > 200 MB/s combined disk and LAN throughput.  This was
> running with a preemptible kernel and hyperthreading enabled.


Jeff,

you are running a tainted kernel since you're loading proprietary modules.
you'd better go back to your vendor for support.
haha.


cheers,

lincoln.




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: sched_yield() makes OpenLDAP slow

2005-08-23 Thread Lincoln Dale

Denis Vlasenko wrote:

>> This is what I would expect if run on an otherwise idle machine.
>> sched_yield just puts you at the back of the line for runnable
>> processes, it doesn't magically cause you to go to sleep somehow.
>
> When a kernel build is occurring??? Plus `top` itself... It damn well
> should sleep while giving up the CPU. If it doesn't, it's broken.

unless you have all of the kernel source in the buffer cache, a
concurrent kernel build will spend a fair bit of time in io_wait state.
as such it's perfectly plausible that sched_yield() keeps popping the
caller straight back to the top of the 'runnable' processes ...



cheers,

lincoln.


Re: fastboot, diskstat

2005-07-22 Thread Lincoln Dale

Avi Kivity wrote:

> parallelized initscripts will probably defeat this, though.
>
> put all run-once-but-never-run-again scripts into initrd / initramfs?
> <evil grin>

boot into a suspend-to-disk image?

i still see the real solution, at least for "desktop" machines, as
minimizing the sheer amount of stuff loaded in the rc scripts.
at least for my use-every-day laptop (IBM T42), i've literally halved
the startup time by being savvy about which services are started and in
many cases not starting things until a few minutes after i've logged in.

for example, making use of NetworkManager sorts out a lot of the delay
associated with dhcp and roaming WiFi connections - so there is no
start-on-boot network kruft.
likewise, on a desktop it's completely academic whether sendmail starts
at T+0 seconds or T+2 minutes.

same for sshd/cups/httpd/ntpd et al.

of what does run, you CAN run it in parallel & hopefully get some sense
out of the elevator being intelligent.



cheers,

lincoln.


Re: Low file-system performance for 2.6.11 compared to 2.4.26

2005-03-31 Thread Lincoln Dale
At 02:34 AM 1/04/2005, linux-os wrote:
> For those interested, some file-system tests and test tools
> are attached.
in high-performance-I/O-testing i perform regularly, i notice no slowdown 
in 2.6 compared to 2.4.

looking at your test-tools, i would hardly call your workload anywhere near 
'realistic' in terms of its I/O patterns.

a few suggestions / constructive comments:
 (1) 100 processes each performing i/o in the pattern of "write 8MB,
     fsync(), wait, read 8MB, wait, delete" probably isn't realistic
 (2) you don't mention whether you're performing testing on ext2 or ext3
 (3) you also don't mention what i/o scheduler is being used
 (4) your benchmark doesn't measure 'fairness' between processes
 (5) your benchmark sleeps for a random amount of time at various parts

it is well known that in 2.4 kernels, processes can 'hog' the i/o channel -
which may result in higher overall throughput but to the detriment of being
'fair' to the rest of the system.  you should address point (4) above.

can you modify your program to present the time-taken-per-process?
if i were a betting man, i'd say that 2.6 will be a lot more 'fair'
compared to 2.4.

default settings for 2.6 likely also means that there is a lot less data 
outstanding in the buffer-cache.
2.6's fsync() behavior is also quite different to that of 2.4.
also note that if you're using a journalled filesystem, fsync() likely does 
different things ...

you don't seed rand(), so the numbers out of rand() aren't actually
random - every run gets the same sequence.
it probably doesn't matter so much since we're only talking microseconds
here (up to 0.511 msec) - but given 2.4 kernels will have HZ of 100 and 2.6
will have HZ of 1000, you're clearly going to get a different end result -
perhaps with 2.6 resulting in a busy-wait from usleep().

cheers,
lincoln.


Re: qla2xxx fail over support

2005-03-15 Thread Lincoln Dale
At 07:43 AM 16/03/2005, comsatcat wrote:
> Unfortunately all the beta drivers seem to have issues working with
> mcdata switches.  I've tried about 10 different versions available from
> qlogic's ftp and all of them give trace messages and "scheduling while
> atomic" messages when detecting luns that are going through the mcdata
> switch.  any suggestions would be appreciated (along with whom to
> contact at qlogic regarding beta driver development).
use a Cisco MDS FC switch and all your problems will go away. :-)
just kidding ... the errors you're seeing will likely happen regardless of 
what brand FC switch you have .. LUN Discovery and/or FC NS queries are 
likely the same regardless of FC switch.

what you're seeing is essentially a bug in the qlogic driver - and likely 
why it was listed as being "beta".

if you're after multipathing support, rather than doing it in the FC
driver, may i suggest that you instead look at Christophe Varoqui's
excellent multipath-tools (see
http://christophe.varoqui.free.fr/wiki/wakka.php?wiki=Home), which i have
used successfully here across a range of midrange & enterprise storage
arrays.

cheers,
lincoln.


Re: Drive performance bottleneck

2005-02-04 Thread Lincoln Dale
At 08:32 PM 4/02/2005, Andrew Morton wrote:
> Something funny is happening here - it looks like there's plenty of CPU
> capacity left over.
> [..]
> Could you monitor the CPU load during the various tests?  If the `dd'
> workload isn't pegging the CPU then it could be that there's something
> wrong with the I/O submission patterns.
as an educated guess, i'd say that the workload is running out of memory
bandwidth ..

let's say the RAM is single-channel DDR400.  peak bandwidth = 3.2GB/s
(400 x 10^6 x 64 bits / 8).  it's fair to say that peak bandwidth is a
pretty rare thing to achieve with SDRAM given real-world access patterns --
let's take a conservative "it'll be 50% efficient" -- so DDR400 realistic
peak = 1.6GB/s.

as far as memory accesses go, a standard user-space read() from disk
results in 4 passes over memory (1. DMA from HBA to RAM, 2. read in
copy_to_user(), 3. write in copy_to_user(), 4. userspace accessing that
data).
1.6GB/s / 4 = 400MB/s -- or roughly what Ian was seeing.

sg_dd uses a window into kernel DMA memory.  as such, two of the four
memory accesses are cut out (1. DMA from HBA to RAM, 2. userspace
accessing data).
1.6GB/s / 2 = 800MB/s -- or roughly what Ian was seeing with sg_dd.

DIRECT_IO should achieve similar numbers to sg_dd, but perhaps not quite as 
efficient.

cheers,
lincoln.


Re: ALSA HELP: Crackling and popping noises with via82xx

2005-02-01 Thread Lincoln Dale
At 01:34 PM 2/02/2005, Timothy Miller wrote:
> I've mentioned this problem before.  It seemed to go away around the
> 2.6.8 timeframe, but when I started using 2.6.9, it came back.  I'm
> using 2.6.10, and it's still happening.
almost identical system here: i'm using an ASUS A7V600 motherboard, but
otherwise have the identical chipset and graphics card
(although the ASUS board has a rev 60 version of the audio controller).

no problems with audio crackling at all, using 2.6.10 and 2.6.1-rc2-mm2 
with audio compiled into the kernel (not using modules for OSS/ALSA).

perhaps the interrupt is shared with some other device?
perhaps your speakers are dying?
this is my mythtv box so i'd certainly notice if the audio was bung.
[EMAIL PROTECTED] root]# uname -a
Linux spam 2.6.10ltd1 #1 Sun Jan 30 21:06:01 EST 2005 i686 athlon i386 
GNU/Linux

[EMAIL PROTECTED] root]# lspci
00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host 
Bridge (rev 80)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI Bridge
00:0a.0 Multimedia video controller: Brooktree Corporation Bt878 Video 
Capture (rev 11)
00:0a.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture 
(rev 11)
00:0e.0 Multimedia video controller: Conexant Winfast TV2000 XP (rev 05)
00:0e.2 Multimedia controller: Conexant: Unknown device 8802 (rev 05)
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID 
Controller (rev 80)
00:0f.1 IDE interface: VIA Technologies, Inc. 
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:10.0 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 
Controller (rev 81)
00:10.1 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 
Controller (rev 81)
00:10.2 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 
Controller (rev 81)
00:10.3 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 
Controller (rev 81)
00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [K8T800 South]
00:11.5 Multimedia audio controller: VIA Technologies, Inc. 
VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
00:13.0 Multimedia video controller: Brooktree Corporation Bt848 Video 
Capture (rev 12)
01:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 
SE] (rev 01)
01:00.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200 SE] 
(Secondary) (rev 01)

[EMAIL PROTECTED] root]# cat /proc/interrupts
            CPU0
  0:  160440190    IO-APIC-edge   timer
  1:       6157    IO-APIC-edge   i8042
  7:     118047    IO-APIC-edge   parport0
  9:          0   IO-APIC-level   acpi
 12:     165567    IO-APIC-edge   i8042
 14:     403308    IO-APIC-edge   ide0
 15:    1685009    IO-APIC-edge   ide1
 16:   59442009   IO-APIC-level   bttv0, bt878
 17:          0   IO-APIC-level   cx88[0], cx88[0]
 18:          3   IO-APIC-level   bttv1
 21:         37   IO-APIC-level   ehci_hcd, uhci_hcd, uhci_hcd, uhci_hcd, uhci_hcd
 22:      48672   IO-APIC-level   VIA8237
 23:     139365   IO-APIC-level   eth0

cheers,
lincoln.


Re: X15 alpha release: as fast as TUX but in user space

2001-05-02 Thread Lincoln Dale

Hi,

At 10:50 AM 2/05/2001 +0200, Ingo Molnar wrote:
>i think Zach's phhttpd is an important milestone as well, it's the first
>userspace webserver that shows how to use event-based, sigio-based async
>networking IO and sendfile() under Linux. (I believe it had some
>performance problems related to sigio queue overflow, these issues might
>be solved in the latest kernels.) The zerocopy enhancements should help
>phhttpd as well.

my experience with sigio-based event-handlers is that the net-gain of 
event-driven i/o is mitigated by the fact that SIGIO is based on signals.

the problems with signals for this purpose are:
  (a) you go thru a synchronization point in the kernel.  signals are
      protected by a spinlock.  it doesn't scale with SMP.
  (b) SI_PAD_SIZE

explicitly, (b) means that you have an awful lot of memory accesses going
on for every signal.
my experience with the overhead is that it mitigates the advantages once
you become bottlenecked on memory-bus accesses.


cheers,

lincoln.



Re: Incoming TCP TOS: A simple question, I would have thought...

2001-03-06 Thread Lincoln Dale

getsockopt(fd, SOL_IP, IP_TOS, ..


cheers,

lincoln.

At 03:00 PM 7/03/2001 +1100, David Luyer wrote:

>I've scrolled through various code in net/ipv4, and I can't see how to query
>the TOS of an incoming TCP stream (or at the least, the TOS of the SYN which
>initiated the connection).
>
>Someone has sent in a feature request for squid which would require this,
>presumably so they can set the TOS in their routers and have the squid caches
>honour the TOS to select performance (via delay pools, multiple parents,
>different outgoing IP or similar).  However I can't see how to get the TOS for
>a TCP socket out of the kernel short of having an open raw socket watching for
>SYNs and looking at the TOS on them.
>
>Any pointers?
>
>David.
>--
>David LuyerPhone:   +61 3 9674 7525
>Engineering Projects Manager   P A C I F I C   Fax: +61 3 9699 8693
>Pacific Internet (Australia)  I N T E R N E T  Mobile:  +61 4  2983
>http://www.pacific.net.au/ NASDAQ:  PCNTF
>
>




Re: What is 2.4 Linux networking performance like compared to BSD?

2001-03-01 Thread Lincoln Dale

At 07:03 PM 1/03/2001 +0300, Hans Reiser wrote:
>>> They know that iMimic's polymix performance on Linux 2.2.* is half
>>> what it is on BSD.  Has the Linux 2.4 networking code caught up to
>>> BSD?
>>>
>>> Can I tell them not to worry about the Linux networking code
>>> strangling their webcache product's performance, or not?

Hans, if iMimic's polygraph performance is "half" on linux versus that of
freebsd, then it is a sign that iMimic has some awful code and/or is doing
something wrong on linux versus freebsd.

>The problem is that I really need BSD vs. Linux experiences, not Linux
>2.4 vs. 2.2 experiences, because the webcache industry tends to strongly
>disparage Linux networking code, so "much better" isn't necessarily good
>enough.

please stop generalizing.  there is at least one vendor in the webcache 
industry that is more than happy with the linux networking code.


cheers,

lincoln.



Re: ECN fixes for Cisco gear

2001-01-28 Thread Lincoln Dale

Hi,

At 02:33 PM 28/01/2001 -0700, Dax Kelson wrote:
>Here is the fix for PIX:
>
>(see
>http://www.cisco.com/cgi-bin/Support/Bugtool/onebug.pl?bugid=CSCds23698)
> Bug ID: CSCds23698
> Headline: PIX sends RSET in response to tcp connections with ECN
>  bits set
> Product: PIX
> Component: fw
> Severity: 2 Status: R [Resolved]
> Version Found: 5.1(1)
> Fixed-in Version: 5.1(2.206) 5.1(2.207)  5.2(1.200)

fixes have been incorporated for a number of different release trains for 
the pix.

Fixed-In Version now covers releases:
 5.1(2.206), 5.1(2.207), 5.2(1.200), 6.0(0.100), 5.2(3.210)


cheers,

lincoln.
NB. it has been posted that Raptor firewalls will also apparently fail to
allow connections with ECN bits set.



Re: hotmail not dealing with ECN

2001-01-25 Thread Lincoln Dale

Hi,

At 01:06 AM 25/01/2001 -0800, David S. Miller wrote:
>Juri Haberland writes:
>  > Forget it. I mailed them and this is the answer:
>  >
>  > "As ECN is not a widely used internet standard, and as Cisco does not
>  > have a stable OS for their routers that accepts ECN, anyone attempting
>  > to access our site through a gateway or from a computer that uses ECN
>  > will be unable to do so."
>
>The interesting bit is the "Cisco does not have a stable OS..." part.

Cisco _routers_ don't care whether packets have ECN set or not.

>I've been told repeatedly by the Cisco folks that a stable supported
>patch is available from them for their firewall products which were
>rejecting ECN packets.

nothing has changed since before --
both the cisco PIX and cisco LocalDirector didn't used to function 
correctly with ECN bits set.

both were fixed less than a week after a bug was opened and both have 
updates available for download ...
that was many many months ago ..

i wonder if some folk are being too quick to point the finger at just one 
vendor.  did some versions of solaris have problems with ECN too?

>I'd really like Cisco to reaffirm this and furthermore, and
>furthermore get in contact with and correct the hotmail folks
>if necessary.

if Juri can forward me (privately) the details of the hotmail person that 
said the above, i'd be happy to ensure that it is resolved ..

>I have in fact noticed that some sites that did have the problem have
>installed the fix and are now accessible with ECN enabled.

good to hear.


cheers,

lincoln.
NB. some cisco routers may start adding the ability to set ECN to indicate 
congestion too  ...




Re: [Fwd: [Fwd: Is sendfile all that sexy? (fwd)]]

2001-01-20 Thread Lincoln Dale

hi,

At 04:56 PM 20/01/2001 +0200, Kai Henningsen wrote:
[EMAIL PROTECTED] (dean gaudet) wrote on 18.01.01 in <[EMAIL PROTECTED]>:
> i'm pretty sure the actual use of pipelining is pretty disappointing.
> the work i did in apache preceded the widespread use of HTTP/1.1 and we

What widespread use of HTTP/1.1?

this is probably digressing significantly from linux-kernel related
issues, but i would say that HTTP/1.1 usage is more widespread than you
probably think.

from the statistics of a beta site running a commercial transparent
caching software:
cache# show statistics http requests

Statistics - Requests
                                   Total       %
                               ---------   -----
...
HTTP 0.9 Requests:                 41907     0.0
HTTP 1.0 Requests:              37563201    24.1
HTTP 1.1 Requests:             118282092    75.9
HTTP Unknown Requests:                 1     0.0
...


cheers,

lincoln.


Re: path MTU bug still there?

2000-12-31 Thread Lincoln Dale

Hi,

At 05:28 PM 31/12/2000 +0200, Jussi Hamalainen wrote:
>On Sun, 31 Dec 2000, Mikael Abrahamsson wrote:
>
> > When the linux box does TCP to the outside it'll use the MTU of
> > the tunnel (default route is the tunnel) and thus works perfectly
> > (since TCP MSS will be set low enough to fit into the tunnel).
>
>In my case I can't access a problematic host even from the router
>box.
...
>17:19:46.126297 xxx.xxx.xxx.xxx.1029 > 206.96.221.6.80: S 
>2549095564:2549095564(0) win 32120 <mss 1460,sackOK,timestamp 
>649398[|tcp]> (DF)
...

i know that you've said previously that you've increased your MTU beyond 
1500, but can you validate that it is actually working?
ie. ping something on the other side of the GRE tunnel using a ping with 
total packet sizes equal to 1500?

alternatively, ensure that your application is capable of enforcing a MSS 
<1460 if this is shown to not be the case ..

http://www.cisco.com/warp/public/105/56.html contains some good information 
on some of the potential pitfalls of using tunnels.



cheers,

lincoln.




Re: Linux's implementation of poll() not scalable?

2000-10-24 Thread Lincoln Dale

At 10:39 PM 23/10/2000 -0700, Linus Torvalds wrote:
>First, let's see what is so nice about "select()" and "poll()". They do
>have one _huge_ advantage, which is why you want to fall back on poll()
>once the RT signal interface stops working. What is that?

RT methods are bad if they consume too many resources.  SIGIO is a good 
example of this - the current overhead of passing events to user-space 
incurs both a spinlock and a memory copy of 512 bytes for each 
event.  while it removes the requirement to "walk lists", the signal 
semantics in the kernel and the overhead of memory copies to userspace 
negate its performance a fair bit.

that isn't to say that all "event-driven" methods are bad.  in the past 
year, i've done many experiments at making SIGIO more efficient.

some of these experiments include --
  [1] 'aggregate' events.  that is, if you've registered a POLL_IN, no need
  to register another POLL_IN
  this was marginally successful, but ultimately still didn't scale.

  [2] create a new interface for event delivery.

for [2], i settled on a 16-byte structure sufficient to pass all of the 
relevant information:
 typedef struct zerocopy_buf {
         int                  fd;
         short int            cmd;
 #define ZEROCOPY_VALID_BUFFER   0xe1e2
         short int            valid_buffer;
         void                 *buf;      /* skbuff */
 #ifdef __KERNEL__
         volatile
 #endif
         struct zerocopy_buf  *next;
 } zerocopy_buf_t;

so, we get down to 16 bytes per-event.  these are allocated

coupled with this was an interface whereby user-space could view 
kernel-space (via read-only mmap).
in my case, this allowed for user-space to be able to read the above chain 
of zerocopy_buf events with no kernel-to-user memory copies.

an ioctl on a character driver could ask the kernel to give it the head of 
the chain of the current zerocopy_buf structure.  a similar ioctl() call 
allows it to pass a chain of instructions to the kernel (adding/removing 
events from notification) and other housekeeping.

since user-space had read-only visibility into kernel memory address-space, 
one could then pick up skbuff's in userspace without the overhead of copies.

... and so-on.

the above is a bit of a simplification of what goes on.  using flip-buffers 
of queues, one can use this in multiple processes and be SMP-safe without 
the requirements for spinlocks or semaphores in the "fast path".  solving 
the "walk the list of fd's" and "incur the overhead of memory copies" tied 
in with network hardware capable of handling scatter/gather DMA and IP and 
TCP checksum calculations, i've more than doubled the performance of an 
existing application which depended on poll()-type behaviour.

while i agree that it isn't necessarily a 'generic' interface, and won't 
necessarily appeal to everyone as the cure-all, the techniques used have 
removed two significant bottlenecks to high-network-performance i/o on 
tens-of-thousands of TCP sockets for an application we've been working on.



cheers,

lincoln.




RE: [TRACED] Re: "Tux" is the wrong logo for Linux

2000-10-19 Thread Lincoln Dale

At 02:09 PM 19/10/2000 -0400, Mark Haney wrote:
> > Feel free to send complaints to [EMAIL PROTECTED] and get his account
> > yanked for abuse of mailing lists.
>
>http://www.ilan.net/contact.htm for a nice list of addresses to send
>complaints to.

the original email came from 216.27.3.45.
a quick grep thru past archives of this mailing list is a relatively 
trivial way to find out who it is . . .

the person has been foolish enough to post to this list from that 
ip-address before.
i won't post details of that, but it is a relatively trivial exercise to go 
thru . . .

perhaps we can all get on with real work now . . . :-)

cheers,

lincoln.




Re: ECN & cisco firewall

2000-09-10 Thread Lincoln Dale

Dave, et. al.,

At 05:56 08/09/00, David S. Miller wrote:
..
>in the Cisco PIX case does the firewall send a reset
..

a bug ticket has been opened for the cisco pix firewall and [lack-of] TCP 
ECN inter operability.
the developers know about the issue, and i'm sure that a fix will be 
forthcoming in a future interim release.


back to linux kernel issues,
cheers,

lincoln.


--
   Lincoln Dale                          Content Services Business Unit
   [EMAIL PROTECTED]                     cisco Systems, Inc.
   +1 (408) 525-1274                     bldg G, 170 West Tasman
   +61 (3) 9659-4294                     San Jose CA 95134




Re: zero-copy TCP

2000-09-03 Thread Lincoln Dale

At 22:53 03/09/00, Linus Torvalds wrote:
> >Ugh.  User space DMA gets complicated quickly.  The performance question
> >is, perhaps, can you do this without a TLB flush (but with locking the
> >struct page of course).  Note that it doesn't matter if another thread,
> >and this includes truncate/write in another thread, clobbers the page
> >data.  That's just the normal effect of two concurrent writers to the
> >same memory.
...
>People who claim "zero-copy" is a great thing often ignore the costs of
>_not_ copying altogether.

many people (myself included) have been experimenting with zerocopy 
infrastructures.
in my case, i've been working on it as time permits for quite a few months 
now, and am about on my fourth rewrite.

i've found exactly what you state about the bad things that occur when you 
associate zerocopy infrastructure with user-space code.  some of the MM 
tricks required for handling individual pages effectively kill any 
performance gain.

however, approaching it from the other angle of "buffers pinned in kernel 
memory" can give you a huge win.
for the application which prompted me to begin looking at this problem, 
where packets typically go network -> RAM -> network, providing a zerocopy 
infrastructure for (a) viewing incoming packet streams pinned in kernel 
memory from user-space [a sort-of SIGIO with pointers to the buffers], and 
(b) hooks for user-space directing the kernel to do things with these 
buffers [eg. "queue buffer A for output on fd Y"] has provided an immediate 
60% performance gain.

performance was previously pinned on front-side-bus (or memory) bandwidth.

the interfaces are a bit hacky, and the way one has to queue packets for 
tcp-write is awful right now, but i hope these can be cleaned up over time.

network cards which offload the IP & TCP checksum calculations aren't even 
required; provided the incoming checksum is preserved, the original TCP 
pseudo-header can be "reversed out" without having to re-read the entire 
packet payload again.


cheers,

lincoln.

--
   Lincoln Dale                          Content Services Business Unit
   [EMAIL PROTECTED]                     cisco Systems, Inc.
   +1 (408) 525-1274                     bldg G, 170 West Tasman
   +61 (3) 9659-4294                     San Jose CA 95134




Re: thread rant

2000-09-01 Thread Lincoln Dale

At 22:48 01/09/00, Michael Bacarella wrote:

>Q: Why do we need threads?
>A: Because on some operating systems, task switches are expensive.
>
>Q: So, threads are a hack to get around operating systems that suck?
>A: Basically.

urgh, i think you've missed the point.

while threads /may/ be abused by many applications where fork() or a 
decent event-queue/state-machine design would probably produce much better 
performance (i believe some of the java libraries are a perfect example of 
this), there are _MANY_ applications where threads work, work well, and are 
superior to the alternatives available (fork with shm segments).

one such example is computationally-expensive work where some degree 
of working-set is required to be shared between processes.  one could use 
fork(), run multiple instances, have them register to a shm segment and 
then implement some form of IPC between them.
alternatively, you could create 'n' work threads where "n == NR_CPUs" with 
the working-set automatically available to all worker threads.  for 
whatever synchronization is required, you don't have to write your own IPC 
mechanism - mutexes come standard with things like pthreads.  (of course, 
there is no excuse for bad programming or bad algorithms; mutex use should 
be kept to a minimum).  perhaps you then need this application to do 
bulk disk-i/o?  one-thread-per-disk-spindle works nicely in this scenario too.

threads are useful and powerful.  perhaps the real problem with threads are 
that it is too easy to write bad code using them.
caveat emptor.


cheers,

lincoln.


--
   Lincoln Dale                          Content Services Business Unit
   [EMAIL PROTECTED]                     cisco Systems, Inc.
   +1 (408) 525-1274                     bldg G, 170 West Tasman
   +61 (3) 9659-4294                     San Jose CA 95134
