Re: XL driver checksum producing corrupted but checksum-correct packets

2004-01-24 Thread M. Warner Losh
In message: <[EMAIL PROTECTED]>
Robert Watson <[EMAIL PROTECTED]> writes:
: My understanding is that NDIS drivers rely on the HAL provided by NT to
: perform hardware access, so you can generate I/O traces with relative
: ease.

NDIS drivers CAN rely on the HAL, but not all of them necessarily do
so...  Not sure about the 3Com driver, but Bill Paul was talking about
this on IRC a few days ago...

Warner
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


RFC 1982 - serial arithmetic

2004-01-24 Thread Aniruddha Bohra
Hello,

I was wondering if there is a standard header file that implements
RFC 1982?  It basically defines serial numbers and the arithmetic
operations on them.

Addition is defined as:

s' = (s + n) modulo (2 ^ SERIAL_BITS)

For comparison, i1 is the arithmetic integer whose value is the same
as s1, and i2 is the one whose value is the same as s2.

s1 is said to be less than s2 if, and only if, s1 is not equal to s2,
and

(i1 < i2 and i2 - i1 < 2^(SERIAL_BITS - 1)) or
(i1 > i2 and i1 - i2 > 2^(SERIAL_BITS - 1))

s1 is said to be greater than s2 if, and only if, s1 is not equal to
s2, and

(i1 < i2 and i2 - i1 > 2^(SERIAL_BITS - 1)) or
(i1 > i2 and i1 - i2 < 2^(SERIAL_BITS - 1))

I am currently using these as my local definitions but would like
to use a standard header if possible.

Thanks

Aniruddha
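
The comparison rules quoted in this message collapse to one unsigned
subtraction when SERIAL_BITS matches a native integer width. What follows
is a minimal sketch for the 32-bit case, not a standard header: the names
serial_t, serial_add, serial_lt and serial_gt are hypothetical, and
SERIAL_BITS is assumed to be 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; this sketch fixes SERIAL_BITS at 32. */
#define SERIAL_BITS 32
#define SERIAL_HALF (UINT32_C(1) << (SERIAL_BITS - 1))

typedef uint32_t serial_t;

/* s' = (s + n) modulo 2^SERIAL_BITS. Unsigned arithmetic wraps, so no
 * explicit modulo is needed. RFC 1982 only defines the result for
 * 0 <= n <= 2^(SERIAL_BITS - 1) - 1; the caller must respect that. */
static inline serial_t serial_add(serial_t s, uint32_t n)
{
    return (serial_t)(s + n);
}

/* s1 < s2 when the forward distance from s1 to s2, taken modulo 2^32,
 * is nonzero and smaller than 2^(SERIAL_BITS - 1). This covers both
 * branches of the definition (i1 < i2 and i1 > i2) in one test. */
static inline bool serial_lt(serial_t s1, serial_t s2)
{
    uint32_t distance = (uint32_t)(s2 - s1);
    return distance != 0 && distance < SERIAL_HALF;
}

/* s1 > s2 is the same test with the operands swapped. */
static inline bool serial_gt(serial_t s1, serial_t s2)
{
    return serial_lt(s2, s1);
}
```

When two values are exactly 2^(SERIAL_BITS - 1) apart, both serial_lt and
serial_gt return false, which matches the definitions above: neither
condition holds for that distance.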





Re: XL driver checksum producing corrupted but checksum-correct packets

2004-01-24 Thread Robert Watson

On Sat, 24 Jan 2004, Matthew Dillon wrote:

> Well, I tried to tcpdump a session.  I managed to hit the error three
> times but in all three cases the tcpdump on the server dropped the
> particular packet I was looking for.  I'm only able to get a 70%
> retention rate in the tcpdump output on the server... it's just trying
> to record too much for the machine to handle at the rate the NFS requests
> are coming in.

To pick up the corrupted packet on the machine where the corruption is
occurring, you might want to try hooking up the UDP checksum drop case to
BPF_MTAP() for a special BPF device or rule, or have it spit them into a
raw socket (probably easier).

The problem is that context switching is what does BPF in, so if you can get
another machine onto the segment without it being excessively switched
(perhaps on a monitor port), using a third machine to grab the on-the-wire
packets might work best.  That way you can compare pre-corruption and
post-corruption.

> I'm going to give up trying to characterize the corruption for now.
> It could very well be the PCI latency timer as previously discussed
> but I can't test that right now.

If it is the problem, it may be easier to do this and see if it works than
to track down the packet :-).

good luck...

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Senior Research Scientist, McAfee Research



Re: XL driver checksum producing corrupted but checksum-correct packets

2004-01-24 Thread Matthew Dillon
Well, I tried to tcpdump a session.  I managed to hit the error three
times but in all three cases the tcpdump on the server dropped the
particular packet I was looking for.  I'm only able to get a 70%
retention rate in the tcpdump output on the server... it's just trying
to record too much for the machine to handle at the rate the NFS requests
are coming in.

I'm going to give up trying to characterize the corruption for now.
It could very well be the PCI latency timer as previously discussed
but I can't test that right now.

-Matt


Re: XL driver checksum producing corrupted but checksum-correct packets

2004-01-24 Thread Robert Watson

On Sat, 24 Jan 2004, Luigi Rizzo wrote:

> On Sat, Jan 24, 2004 at 02:12:12PM -0500, Robert Watson wrote:
> ...
> > > but going this way you have no idea on what the driver does, including
> > > enabling hw checksums. This looks like a useless test at least for the
> > > purpose of finding out what is going wrong
> > 
> > Actually, I'm more curious about whether it's a known errata/misbehavior
> > for the card that 3Com's drivers work around, or not.  The problem could
> > well be completely unrelated to hardware checksumming per se -- the
> > corruption might well be taking place as the buffer is moved from the
> > card's buffer to the operating system managed buffer.  If the NDIS driver
> > doesn't illustrate the same problem, it tells us that by frobbing
> > appropriately, this problem can be worked around.  It also tells us that
> > by looking a bit harder at what the driver is doing (i.e., how it frobs
> > the hardware), we can learn something about the appropriate workaround. 
> 
> yes, but how would you know that, short of reverse engineering the
> driver, or tracing I/O accesses to the hardware ?  It really looks like
> an overkill effort... I'd rather just try to debug the issue working on
> an open source driver, or dump the hardware altogether and replace it
> with something known to work... 

My understanding is that NDIS drivers rely on the HAL provided by NT to
perform hardware access, so you can generate I/O traces with relative
ease.  Decoding and following the HAL traces during card setup is probably
relatively straightforward, since presumably most of the I/O transactions
will match the documented services of the card.  It might be useful to add
some KTR support to Bill's NDIS pieces for this very purpose, if there's
interest.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Senior Research Scientist, McAfee Research




Re: XL driver checksum producing corrupted but checksum-correct packets

2004-01-24 Thread Luigi Rizzo
On Sat, Jan 24, 2004 at 02:12:12PM -0500, Robert Watson wrote:
...
> > but going this way you have no idea on what the driver does, including
> > enabling hw checksums. This looks like a useless test at least for the
> > purpose of finding out what is going wrong
> 
> Actually, I'm more curious about whether it's a known errata/misbehavior
> for the card that 3Com's drivers work around, or not.  The problem could
> well be completely unrelated to hardware checksumming per se -- the
> corruption might well be taking place as the buffer is moved from the
> card's buffer to the operating system managed buffer.  If the NDIS driver
> doesn't illustrate the same problem, it tells us that by frobbing
> appropriately, this problem can be worked around.  It also tells us that
> by looking a bit harder at what the driver is doing (i.e., how it frobs
> the hardware), we can learn something about the appropriate workaround. 

yes, but how would you know that, short of reverse engineering
the driver, or tracing I/O accesses to the hardware ?
It really looks like an overkill effort... I'd rather just
try to debug the issue working on an open source driver, or
dump the hardware altogether and replace it with something
known to work...

cheers
luigi


Re: symlink: /home -> /usr/home vs. /home -> usr/home in default installation

2004-01-24 Thread William M. Grim
Dmitry Morozovsky wrote:

> On Fri, 23 Jan 2004, Andriy Tkachuk wrote:
>
> AT> The idea is this: if you mount your / to another
> AT> place (for example /mnt on another computer), your
> AT> /mnt/home will point to the correct place (/mnt/usr/home)
> AT> instead of /usr/home.
> AT>
> AT> What do you, folks, think about this?
>
> FWIW, I'm making virtually every symlink relative instead of absolute for just
> this reason. (To be exact, more similar to Solaris' approach, so
> /sys -> ./usr/src/sys and /home -> ./usr/home)
 

This is the second time I've posted about this, but I'm beginning to 
think this is a very good idea.  I really don't foresee any problems 
with it.  Since you guys already have some of this work complete, 
perhaps you could submit a PR for it?

--
William Michael Grim
Student, Southern Illinois University at Edwardsville
Unix Network Administrator, SIUE, Computer Science dept.
Phone: (217) 341-6552
Email: [EMAIL PROTECTED]


Re: strange "less" behaviour on big files

2004-01-24 Thread Dag-Erling Smørgrav
Kai Mosebach <[EMAIL PROTECTED]> writes:
> i lessed a 1.0 gig file filled with zeros and less seems to slurp in
> as much as it can get at one time!

AFAIK, it tries to read one entire line; since your file doesn't
contain any line feed characters, it ends up reading the entire file.

DES
-- 
Dag-Erling Smørgrav - [EMAIL PROTECTED]
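
The effect described in this reply can be reproduced with any reader that
buffers one whole line at a time. The sketch below uses POSIX getline() as
a stand-in; it is not taken from less, and the helper names write_zeros and
first_line_length are hypothetical. On a file of zero bytes with no line
feed, the first "line" is the entire file, so the buffer grows to the file
size.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Write `count` zero bytes, with no newline anywhere, to `path`.
 * Returns 0 on success, -1 on failure. */
static int write_zeros(const char *path, size_t count)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    for (size_t i = 0; i < count; i++) {
        if (fputc(0, fp) == EOF) {
            fclose(fp);
            return -1;
        }
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Read the first line of `path` the way a line-at-a-time reader would.
 * Returns the number of bytes getline() had to buffer, or -1 on error. */
static ssize_t first_line_length(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t capacity = 0;
    ssize_t length = getline(&line, &capacity, fp);

    free(line);
    fclose(fp);
    return length;
}
```

Calling write_zeros() with one megabyte and then first_line_length() on the
same path returns the full megabyte; scaled up to a 1 GB file on a 512 MB
machine, an allocation failure of the kind seen in the backtrace would not
be surprising.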


Re: XL driver checksum producing corrupted but checksum-correct packets

2004-01-24 Thread Robert Watson

On Sat, 24 Jan 2004, Luigi Rizzo wrote:

> On Sat, Jan 24, 2004 at 01:38:37PM -0500, Robert Watson wrote:
> ...
> > (2) Try the NDIS driver with the NDIS-u-lator on FreeBSD 5.x and see if
> > that also has the problem.
> 
> but going this way you have no idea on what the driver does, including
> enabling hw checksums. This looks like a useless test at least for the
> purpose of finding out what is going wrong

Actually, I'm more curious about whether it's a known errata/misbehavior
for the card that 3Com's drivers work around, or not.  The problem could
well be completely unrelated to hardware checksumming per se -- the
corruption might well be taking place as the buffer is moved from the
card's buffer to the operating system managed buffer.  If the NDIS driver
doesn't illustrate the same problem, it tells us that by frobbing
appropriately, this problem can be worked around.  It also tells us that
by looking a bit harder at what the driver is doing (i.e., how it frobs
the hardware), we can learn something about the appropriate workaround. 
If it's a delay/timing issue, it's less likely we can learn something, but
if the NDIS driver is simply disabling hardware checksumming for specific
chipsets, that's something we should be able to figure out.  On the other
hand, if the NDIS driver shows the exact same problem, this might not be
an issue known to the vendor.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Senior Research Scientist, McAfee Research
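
For reference when reasoning about where corruption could slip through: UDP
uses the 16-bit ones' complement Internet checksum described in RFC 1071.
The sketch below is a plain software version; inet_checksum is a
hypothetical helper name, not code from the xl driver or the FreeBSD stack.
Recomputing the sum in software over the buffer the host actually received
is one way to tell whether data was damaged after the card had already
accepted the packet.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones' complement Internet checksum (RFC 1071) over `len` bytes,
 * treating the data as big-endian 16-bit words. The 32-bit accumulator
 * is adequate for packet-sized buffers; very large inputs would need
 * intermediate folding. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    /* Add up whole 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }

    /* An odd trailing byte is padded with a zero low byte. */
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A single flipped bit changes the result, so a packet that still verifies in
software but carries wrong data must have been altered before the checksum
was computed, or altered in a way the sum cannot see.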




Re: XL driver checksum producing corrupted but checksum-correct packets

2004-01-24 Thread Luigi Rizzo
On Sat, Jan 24, 2004 at 01:38:37PM -0500, Robert Watson wrote:
...
> (2) Try the NDIS driver with the NDIS-u-lator on FreeBSD 5.x and see if
> that also has the problem.

but going this way you have no idea on what the driver does,
including enabling hw checksums. This looks like a
useless test at least for the purpose of finding out
what is going wrong

cheers
luigi


strange "less" behaviour on big files

2004-01-24 Thread Kai Mosebach
Hi all,

I stumbled over a problem with less when "lessing" big files.
I lessed a 1.0 gig file filled with zeros and less seems to slurp in as much
as it can get at one time! This happens only on 5.2-RELEASE; a 4.9-RELEASE
does fine.

my top says (on a 512 mb machine)

20041 root  76    0   382M   193M pfault   0:03  7.03%  7.03% less

Furthermore, I think the state "pfault" in top is pretty strange, and after a
minute or so the process dumps core.

backtrace is not very useful though ...

#0  0x2811dd4f in kill () from /lib/libc.so.5
(gdb) bt
#0  0x2811dd4f in kill () from /lib/libc.so.5
#1  0x281127f8 in raise () from /lib/libc.so.5
#2  0x2818af02 in abort () from /lib/libc.so.5
#3  0x2818967e in tcflow () from /lib/libc.so.5
#4  0x28189f1b in tcflow () from /lib/libc.so.5
#5  0x2818a2d6 in malloc () from /lib/libc.so.5
#6  0x28186ae1 in calloc () from /lib/libc.so.5
#7  0x0805255d in clear ()
#8  0x080535a4 in clear ()
#9  0x08053ab6 in clear ()
#10 0x08053dfb in clear ()
#11 0x08051366 in clear ()
#12 0x080523bf in clear ()
#13 0x0804dbc0 in clear ()
#14 0x0804dc27 in clear ()
#15 0x0804dfcf in clear ()
#16 0x080496b6 in free ()
#17 0x08049242 in free ()


Any ideas ?

Regards Kai

-- 
SapDB for FreeBSD --- see http://www.komadev.de/sapdb


Re: XL driver checksum producing corrupted but checksum-correct packets

2004-01-24 Thread Robert Watson

On Sat, 24 Jan 2004, Max Laier wrote:

> On Saturday 24 January 2004 17:06, Robert Watson wrote:
> > On Fri, 23 Jan 2004, Matthew Dillon wrote:
> > > I tracked down an occasional buildworld failure on DragonFly to
> > > my XL driver, which is synchronized to 4.x's XL driver.
> 
> FYI: This was reproduced on OpenBSD as well (w/ ftp and scp): 
> http://marc.theaimsgroup.com/?l=openbsd-tech&m=107494884327698&w=2

Two thoughts on other things to try, with that in mind:

(1) Linux on the same hardware, see if whatever set of XL workarounds they
have addresses this specific problem.

(2) Try the NDIS driver with the NDIS-u-lator on FreeBSD 5.x and see if
that also has the problem.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Senior Research Scientist, McAfee Research




Re: XL driver checksum producing corrupted but checksum-correct packets

2004-01-24 Thread Max Laier
On Saturday 24 January 2004 17:06, Robert Watson wrote:
> On Fri, 23 Jan 2004, Matthew Dillon wrote:
> > I tracked down an occasional buildworld failure on DragonFly to
> > my XL driver, which is synchronized to 4.x's XL driver.

FYI: This was reproduced on OpenBSD as well (w/ ftp and scp):
http://marc.theaimsgroup.com/?l=openbsd-tech&m=107494884327698&w=2

-- 
Best regards,   | [EMAIL PROTECTED]
Max Laier   | ICQ #67774661
http://pf4freebsd.love2party.net/   | [EMAIL PROTECTED]



Re: XL driver checksum producing corrupted but checksum-correct packets

2004-01-24 Thread Robert Watson

On Fri, 23 Jan 2004, Matthew Dillon wrote:

> I tracked down an occasional buildworld failure on DragonFly to my
> XL driver, which is synchronized to 4.x's XL driver.

It would be very helpful if you could do the following:

(1) See if you can reproduce this using something other than NFS --
perhaps netperf using UDP_STREAM or the like, between that machine and
another machine.  This would give us a more reproducible workload
than "builds", and hopefully one that is less sensitive to things like
context switching, etc.

(2) See if you can reproduce this with a stock 4.9-RELEASE kernel (or
4-STABLE).  While the drivers are similar between 4.x and DFBSD, there
are actually quite a few structural changes in the DFBSD version.
Maybe it would make sense to try backing out the local DFBSD changes
to the base FreeBSD version, even if not trying a completely FreeBSD
system, to see if they are the cause.  It's difficult to diff the two
because of reorganization and style changes.

> [EMAIL PROTECTED]:6:0:   class=0x02 card=0x764610b7 chip=0x764610b7 rev=0x30 hdr=0x00

Does this card have a product name, or is it one of those chips embedded
in a motherboard without a separate name?

I took a look through the xl cards/chips on my various machines, and was
unable to find anything that had remotely the same card or chip ID.  I did
some high-volume packet flows between them with hardware checksumming
disabled and didn't see any corrupted UDP packets, but the workloads I'm
using sound pretty different.  Knowing it could be reproduced using a more
simple workload (and the specifics) would be good.

FYI, I checked the Linux driver for these cards, and didn't see mention of
any quirks for the particular chips/card you're using.  The only thing of
note in the Linux driver was the following:

    /* Check the PCI latency value.  On the 3c590 series the latency timer
       must be set to the maximum value to avoid data corruption that occurs
       when the timer expires during a transfer.  This bug exists the Vortex
       chip only. */
    if (pdev) {
        u8 pci_latency;
        u8 new_latency = (drv_flags & IS_VORTEX) ? 248 : 32;

        pci_read_config_byte(pdev, PCI_LATENCY_TIMER, &pci_latency);
        if (pci_latency < new_latency) {
            printk(KERN_INFO "%s: Overriding PCI latency"
                   " timer (CFLT) setting of %d, new value is %d.\n",
                   dev->name, pci_latency, new_latency);
            pci_write_config_byte(pdev, PCI_LATENCY_TIMER, new_latency);
        }
    }

The rate at which you have failures sounds like it could be a similar
issue, however -- an occasional collision between a timer and DMA.  NFS is
often a mix of small RPCs handling lookups and attributes, and larger RPCs
carrying data.  Using netperf or a related tool might help you identify if
one of those is more likely to cause the failure. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Senior Research Scientist, McAfee Research



Re: symlink: /home -> /usr/home vs. /home -> usr/home in default installation

2004-01-24 Thread Dmitry Morozovsky
On Fri, 23 Jan 2004, Andriy Tkachuk wrote:

AT> The idea is this: if you mount your / to another
AT> place (for example /mnt on another computer), your
AT> /mnt/home will point to the correct place (/mnt/usr/home)
AT> instead of /usr/home.
AT>
AT> What do you, folks, think about this?

FWIW, I'm making virtually every symlink relative instead of absolute for just
this reason. (To be exact, more similar to Solaris' approach, so
/sys -> ./usr/src/sys and /home -> ./usr/home)

Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]

*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- [EMAIL PROTECTED] ***
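
The difference between the two link styles can be checked without a second
machine. In the sketch below, renaming a directory tree stands in for
mounting / under /mnt; that substitution is an assumption made for
illustration, and the helper names make_tree and home_points_inside are
hypothetical. A relative target such as usr/home keeps resolving inside the
moved tree, while an absolute target keeps pointing at the old location.

```c
#define _XOPEN_SOURCE 700
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create <root>/usr/home and a symlink <root>/home whose target is the
 * string `target` (relative or absolute). Returns 0 on success. */
static int make_tree(const char *root, const char *target)
{
    char path[PATH_MAX];

    if (mkdir(root, 0755) != 0)
        return -1;
    snprintf(path, sizeof path, "%s/usr", root);
    if (mkdir(path, 0755) != 0)
        return -1;
    snprintf(path, sizeof path, "%s/usr/home", root);
    if (mkdir(path, 0755) != 0)
        return -1;
    snprintf(path, sizeof path, "%s/home", root);
    return symlink(target, path);
}

/* Return 1 if <root>/home resolves to <root>/usr/home, 0 otherwise
 * (including when the link dangles). */
static int home_points_inside(const char *root)
{
    char link_path[PATH_MAX], wanted_path[PATH_MAX];
    char resolved[PATH_MAX], expected[PATH_MAX];

    snprintf(link_path, sizeof link_path, "%s/home", root);
    snprintf(wanted_path, sizeof wanted_path, "%s/usr/home", root);
    if (realpath(link_path, resolved) == NULL)
        return 0;
    if (realpath(wanted_path, expected) == NULL)
        return 0;
    return strcmp(resolved, expected) == 0;
}
```

Building a tree with make_tree(root, "usr/home"), renaming root, and then
calling home_points_inside() on the new name still returns 1; doing the
same with an absolute target returns 0 after the rename.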



Re: FreeBSD 5.2-RELEASE buildworld failure.

2004-01-24 Thread Ruslan Ermilov
On Fri, Jan 23, 2004 at 06:13:24PM -0800, erek wrote:
> I cvsuped today using tag RELENG_5_2 (i'm already using 5.2-RELEASE),
[...]
> During the buildworld I get this VERY odd error:
[...]
> /usr/src/gnu/usr.bin/cc/cc_tools/freebsd-native.h:62:25: attempt to use poisoned "malloc"
[...]
> mkdep: compile failed
> *** Error code 1
[...]
> any suggestions?
> 


Go to /usr/obj/usr/src/i386/usr/src/gnu/usr.bin/cc/cc1plus, and compare
parse.c and parse+%DIKED.c there.  They should differ: "xmalloc" vs
"malloc", "xrealloc" vs "realloc".  If they are identical, chances are
your /usr/bin/sed is broken, and you should read this entry from
src/UPDATING:
: 20030613: [retrospective]
: There was a small window in which sed(1) was broken.  If you
: happen to have sed(1) installed during that window, which is
: evidenced by an inability to build world with the failure
: given below, you need to manually build and install sed(1)
: (and only sed(1)) before doing anything else. This is a one-
: time snafu. Typical failure mode:
: 
: In file included from /usr/src/contrib/binutils/bfd/targets.c:1092:
: targmatch.h:7:1: null character(s) ignored
: targmatch.h:12:1: null character(s) ignored
: targmatch.h:16:1: null character(s) ignored
: :
: 
: The window of "sed(1)-uction" is from Wed Jun 4 15:31:55 2003 UTC
: to Thu Jun 5 12:10:19 2003 UTC (from rev 1.30 to rev 1.31 of
: usr.bin/sed/process.c).

To see if you're affected, run this:

ident /usr/bin/sed

And see which process.c revision your sed(1) has.  If it's 1.30,
you're affected.




Cheers,
-- 
Ruslan Ermilov
FreeBSD committer
[EMAIL PROTECTED]




Re: read-only compressed fs (call for testers) [UPDATE]

2004-01-24 Thread Vincent Jardin
Does it support XIP (eXecute In Place) too?

Regards,
  Vincent

On Saturday 24 January 2004 03:15, Dario Freni wrote:
> > Thank you by the FreeSBIE team.
>
>  You can try a FreeSBIE iso with the geom_ugz patch (compressed fs) at:
>
> http://www.willystudios.com/freesbie/FreeSBIE-cloop-test.iso.bz2
>
>  I don't have exact numbers to quantify the performance gain, but it's
> really _very_ fast compared to the "normal" version. As the filename says,
> this is a test version, so any feedback is appreciated.
>
> Bye,
> Dario