Re: ndisgen generated module load causes page fault, missing functions

2006-07-21 Thread Doug Barton
Ganbold wrote:
> Hi,
> 
> I have FreeBSD-6.1-STABLE dell D620 laptop with Dell Wireless 1490
> 802.11a/g Dual-band Mini Card (which seems like bcm4310).

In my experience, you should try various versions of the Windows driver. I
have a 1400, and the very latest version of the driver does not work with
NDIS, but the version previous to that does.

hth,

Doug

-- 

This .signature sanitized for your protection

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: NVIDIA 6600GT Freeze

2006-07-21 Thread Vinny Abello

At 08:24 AM 7/21/2006, Nealie wrote:

> On Thu, 2006-07-20 at 13:51 +0200, Nealie wrote:
> > I have a problem with my system freezing when using an NVIDIA video card
> > using the nvidia-driver port. All seems to work fine for a while but
> > then the system freezes and won't even reply to a ping. This can happen
> > regardless of whether I use openGL or not.
> >
> > Everything works fine using the "nv" driver, so it doesn't seem to be a
> > hardware problem.
> >
> > My setup is as follows:
> >
> > uname: FreeBSD server.home 6.1-STABLE FreeBSD 6.1-STABLE #0: Wed Jul 19
> > 11:19:16 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SERVER
> > i386
> >
> > AGP NVIDIA 6600GT installed on an MSI K8T Neo-F V2.09 motherboard (VIA
> > K8T800 Pro chipset) with an AMD Athlon 64 3500+ CPU.
> >
> > The NVIDIA driver is installed as per the instructions, with agp and dri
> > removed from the kernel in order to use the NVIDIA agp interface, even
> > though the sysctl settings suggest otherwise.
> >
> > If anyone has any ideas about this problem I'd be very grateful.
>
> Just a quick reply to myself: The problem seems to be that the IRQs
> of the motherboard's onboard network interface and the AGP card are the
> same. This works for a while but then something goes horribly wrong and
> all comes to a halt. Why the IRQ is shared I have no idea as there are
> nine free IRQs.


On most machines I have seen, IRQs are shared between certain
"slots". You can change the IRQ that is being used but not the
devices sharing it. I believe this is inherent to the current PC
architecture and motherboard design. Given that the NIC is integrated
and you have so many other IRQs free, I'm not sure why they chose
that route for your board. Perhaps NICs can generally share with
video cards without problems. In general (not guaranteed), devices
*should* be able to share IRQs if the drivers are written properly
and if the hardware isn't designed horribly. This is just a
generalization from my own experience. I in no way write drivers for
hardware for any operating system. YMMV. :)



Vinny Abello
Network Engineer
Server Management
[EMAIL PROTECTED]
(973)300-9211 x 125
(973)940-6125 (Direct)
PGP Key Fingerprint: 3BC5 9A48 FC78 03D3 82E0  E935 5325 FBCB 0100 977A

Tellurian Networks - The Ultimate Internet Connection
http://www.tellurian.com (888)TELLURIAN

"Courage is resistance to fear, mastery of fear - not absence of 
fear" -- Mark Twain




linux-firefox does not run on FreeBSD 6.1

2006-07-21 Thread UBM

Hiho! :-)

When trying to start linux-firefox on:

FreeBSD greatsheep 6.1-RC FreeBSD 6.1-RC #5: Tue May  2 20:33:32 CEST
2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SUBMARINE_SMP  i386

it complains about:

/usr/X11R6/lib/linux-firefox/firefox-bin: error while loading shared
libraries: /usr/lib/libm.so.6: ELF file OS ABI invalid


I've installed the newest linux_base-fc-4_6 from ports and I also have
the newest linux-firefox.

Question now is if there is anything I can do about this? :-)

Thanks in advance.

Bye
Marc
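
A hedged aside on the loader error quoted above: the message suggests that the Linux runtime linker opened a library whose ELF header is not branded for Linux. The Python sketch below reads the EI_OSABI byte (offset 7 of the ELF identification block) so the brand of a file can be inspected. The function names are illustrative; the values 0, 3 and 9 are the standard ELFOSABI_NONE, ELFOSABI_LINUX and ELFOSABI_FREEBSD constants.

```python
ELF_MAGIC = b"\x7fELF"
EI_OSABI = 7  # index of the OS/ABI byte within e_ident

# Only the three values relevant here; other values are reported as unknown.
OSABI_NAMES = {0: "SYSV/none", 3: "Linux", 9: "FreeBSD"}


def elf_osabi(header: bytes) -> str:
    """Return a readable OS/ABI name for the first 16 bytes of an ELF file."""
    if len(header) < 16 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    value = header[EI_OSABI]
    return OSABI_NAMES.get(value, "unknown (%d)" % value)


def elf_osabi_of_file(path: str) -> str:
    """Convenience wrapper: read e_ident from a file on disk."""
    with open(path, "rb") as f:
        return elf_osabi(f.read(16))
```

Calling `elf_osabi_of_file` on /usr/lib/libm.so.6 and on the copy of libm.so.6 under the Linux compatibility tree would show whether the native FreeBSD library is being picked up. That this is the cause here is a guess; the post does not confirm it.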





Re: Frozen Processes

2006-07-21 Thread Robert Watson

On Thu, 20 Jul 2006, Holtor wrote:

> Since upgrading some of our 5.4 servers to the latest 6.1-STABLE we've had
> some processes stuck in the *inp state as listed in 'top'. Those processes
> can't be killed and any resources they use up in terms of bound IP addresses
> or ports can't be freed. Does anyone know what this *inp state means or how
> to fix this problem?


Processes in state '*inp' are waiting for an inpcb lock, suggesting a deadlock 
or lock leak.  Can you compile your kernel with invariants, witness, ddb, etc, 
and do a bit of kernel debugging?  You can find basic instructions in the 
handbook; what I'm particularly interested in is the output of "alltrace", 
"show alllocks", "show allpcpu".


Robert N M Watson
Computer Laboratory
University of Cambridge


Re: Kernel panic with PF

2006-07-21 Thread Martin Beran
Hello,

I am the author of the proxy code you have been discussing here. I have done
some investigation today and I will try to summarize the results.

> Thank you. No, I am not using it and I am quite sure the proxies aren't
> doing it behind my back either. In fact there isn't a single entry in

I can confirm that the proxies do not use any user/group rules.

> the rules tables - there are only rdr rules generated on the fly by the
> proxies.

Depending on configuration, the proxies can install rdr and/or map rules. The
rdr rules are used for "incoming transparency", i.e., to catch clients'
connections going through the firewall, but with the target server's
destination address, not the firewall's. The map rules are used for "outgoing
transparency", i.e., for changing the source address of connections to
servers (for example, the original client source IP can be used instead of the
firewall's IP).

> Which proxies are you using?  The "pool_ticket: 1429 != 1430" messages you 
> quote below indicate a synchronization problem within the app talking to pf 
> via ioctl's.  Tickets are used to ensure atomic commits for operations that 

This should be only a transient error. The proxies detect this situation and
retry the failed operation. The idea behind this behavior was that the ruleset
manipulation is done by a small number of ioctls called quickly one after
another by a single function, hence the collisions should not occur too often.
But maybe I can add exclusive locking of the PF device, which should remove
all collisions among several proxy processes trying to change PF rules at the
same time.
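
The ticket-and-retry behaviour described above can be illustrated with a toy model. The sketch below is not PF's ioctl interface: `ToyPool`, `begin_addrs`, `add_addr`, `commit` and `update_with_retry` are hypothetical names standing in for a single shared pool ticket that changes whenever any process begins a new address list, so that an operation carrying a stale ticket is rejected and has to be restarted.

```python
class StaleTicket(Exception):
    """Raised when an operation presents a ticket that is no longer current."""


class ToyPool:
    """Toy stand-in for a kernel-side address pool guarded by one ticket."""

    def __init__(self):
        self.ticket = 0
        self.pending = []
        self.committed = []

    def begin_addrs(self):
        # Starting a new transaction discards pending addresses and hands
        # out a fresh ticket, which invalidates every older ticket.
        self.ticket += 1
        self.pending = []
        return self.ticket

    def _check(self, ticket):
        if ticket != self.ticket:
            raise StaleTicket("pool_ticket: %d != %d" % (ticket, self.ticket))

    def add_addr(self, ticket, addr):
        self._check(ticket)
        self.pending.append(addr)

    def commit(self, ticket):
        self._check(ticket)
        self.committed = list(self.pending)


def update_with_retry(pool, addrs, interfere=None, max_tries=3):
    """Run begin/add/commit, restarting from the top on a stale ticket."""
    for attempt in range(1, max_tries + 1):
        ticket = pool.begin_addrs()
        if interfere is not None and attempt == 1:
            interfere(pool)  # simulate another process racing us once
        try:
            for addr in addrs:
                pool.add_addr(ticket, addr)
            pool.commit(ticket)
            return attempt
        except StaleTicket:
            continue
    raise RuntimeError("gave up after %d tries" % max_tries)
```

For example, `update_with_retry(ToyPool(), ["10.0.0.1"], interfere=lambda p: p.begin_addrs())` fails on the first attempt and succeeds on the second. In this model a competing `begin_addrs` also empties the pending list, which is why holding an exclusive lock for the whole sequence would remove the collisions altogether.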

> If the proxy is using DIOCCHANGERULE (it must be the proxy, pfctl isn't
> using it at all), AND is trying to add/update a rule that requires at
> least one replacement address but contains an empty list, then this
> would cause the panic seen when that rule later matches a packet.

I think this is not the case. The proxy uses either DIOCXBEGIN + DIOCBEGINADDRS
+ DIOCADDADDR + DIOCADDRULE + DIOCXCOMMIT or
DIOCCHANGERULE(PF_CHANGE_GET_TICKET) + DIOCBEGINADDRS + DIOCADDADDR
+ DIOCCHANGERULE(PF_CHANGE_ADD_TAIL). The first method is used in the first
call to create the ruleset. In the subsequent call, the second method is used
to modify the ruleset. But the list is never empty. If it was, the panics
would occur always, which is not happening - there are other installations
(but probably not 64bit SMP) working well.
 
I can imagine the list becoming empty only if some other process deletes it
by DIOCBEGINADDRS during pfioctl processing, after the
"pcr->pool_ticket != ticket_pabuf" check. But this should be guarded by
PF_LOCK.

Of course, I could make some mistake in the calling sequence of PF ioctl.
I wrote this piece of code by trial and error, using pfctl as a source of
ideas, because I have not found a detailed manual for the PF API.

> Michal, can you please confirm that the patch above fixes the panic?
> The proxy will still misbehave and cause the log messages (one more
> EINVAL in this case ;), but the kernel shouldn't crash anymore.

Yes, the patch should fix the panics, but it does not solve the problem.

> This functionality of the software (using PF with anchors) is quite new

It is not so new, it is now about 9 months in production use.

> Anchors were introduced for this purpose, i.e. splitting the ruleset
> into separate pieces, over each of which a single process can have
> authority, so different processes don't stomp on each other's toes with
> ruleset modifications.

In fact, the possibility to split the ruleset into anchors owned by individual
processes was one of our major reasons to move from IPF to PF.

> Ask them if they really need to still use DIOCCHANGERULE, as the idea
> with anchors is generally to only operate within one anchor, and usually
> flush or replace the (smaller) ruleset within.

DIOCCHANGERULE is useful for us, because each proxy process can have several
redirections or mappings and it creates/deletes them incrementally, as it
opens/closes individual network connections. It seems to me unnecessary to
always replace the whole ruleset.

> Each anchor has its own ticket, so if you're seeing ticket mismatches,
> that means there are concurrent operations on the same anchor, even.

But the conflicts are on the pool_ticket which is, as I understand it, only
one for all operations.

> They (the Kernun authors) run multiple processes for each proxy.
> Originally they used a slightly modified Apache core for their proxies I
> believe. Thus there are probably more processes using the same anchor.

No, there are not. The anchors are even named by the owner process ID.

> I don't really understand what they do inside - I would think that when
> there are no traffic blocking rules, there's no point in doing anything
> with PF except initial setting of the rdr rule to the proxy.

As I have mentioned above, there are dynamically created rules for outgoing
transparent connections (source-address i

Re: "swiN: clock sio" process taking 75% CPU

2006-07-21 Thread Gareth McCaughan
I wrote:

> About 6 minutes after booting (on two occasions; I don't
> guarantee that this doesn't vary), a process that appears
> in the output of "ps" as "[swi4: clock sio]" begins to
> use about 3/4 of the machine's CPU. I think it does so
> more or less instantaneously. It continues to do so
> indefinitely, so far as I can tell.

So, here's the answer. Whether it's the same thing that's
afflicted the other people who've reported similar problems,
I don't know. (Thanks to John Baldwin on -hackers for
pointing me in a useful direction.)

Executive summary: If you see symptoms like the one above,
are you running a syscons screen saver? (To check: run
"kldstat | grep _saver".) If so, turn it off and the problem
may go away.

1. The machine in question runs largely unattended.

2. I'd enabled the syscons screen saver and chosen one
   of the ones that puts the screen into a graphics mode.
   ("warp", as it happens; "fire" behaves similarly;
   the character-mode ones don't; I haven't looked at
   all of them.)

3. The screen saver kicks in 5 minutes after it gets
   turned on in /etc/rc.d/syscons, provided nothing's
   happening on the console. Which it isn't: see #1.

4. Now, how do those graphics-mode screen savers work?
   They write to the video card's frame buffer directly,
   but there's only a 64k block of RAM they can do this
   through. So, to cope with larger screens, there's a
   bank switching facility accessed by a BIOS call.

5. This BIOS call, on my machine, takes about 0.1ms; you
   need to do two of them for a bank switch, so the time
   actually taken is about 0.2ms.

6. The screen savers are written in a less than optimal way,
   and do that bank switching thing many times. For instance,
   the "fire" screen saver does it at least once for every
   screen line. Even when the entire screen actually fits
   into a single bank so that no switching at all should be
   needed.

7. So the screensaver eats up something on the order of half
   my CPU time; the exact figure depends on which screensaver
   and on more exact timings than I've given above, which is
   how it ends up actually being 75% for the "warp" screensaver.

8. The screensaver gets run in callouts from a kernel
   interrupt thread that happens to have a silly name
   like "swi4: clock sio".
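
The arithmetic in steps 5 to 7 can be sketched as follows. Only the figure of about 0.1 ms per BIOS call and the two calls per bank switch come from the message; the number of bank switches per frame and the frame rate are made-up inputs, chosen to show how a figure near one half can arise.

```python
BIOS_CALL_SECONDS = 0.0001   # about 0.1 ms per BIOS call (from the message)
CALLS_PER_BANK_SWITCH = 2    # two calls per bank switch (from the message)


def cpu_fraction(switches_per_frame, frames_per_second):
    """Fraction of one CPU spent inside bank-switch BIOS calls."""
    per_switch = BIOS_CALL_SECONDS * CALLS_PER_BANK_SWITCH
    return switches_per_frame * frames_per_second * per_switch


# Hypothetical workload: one switch per line of a 200-line graphics mode,
# redrawn about 12 times per second.
fraction = cpu_fraction(switches_per_frame=200, frames_per_second=12)
print("%.0f%% of CPU" % (fraction * 100))
```

With these assumed inputs the result lands just under one half. A saver that switches banks more often per frame, or redraws faster, would push the figure towards the 75% reported for "warp".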

This is eminently fixable, in several different ways. I've
offered to prepare a patch, or perhaps someone else will
do so, so there's a reasonable prospect of later versions
of FreeBSD not having this problem. For the time being,
there's a simple workaround for anyone facing the same
problem I did: *turn off the screensaver*, or replace it
with one that doesn't use a graphics mode.

For clarity: this is a problem with (some) FreeBSD syscons
screen savers, the ones you might enable in /etc/rc.conf;
not with the ones like xscreensaver that you might run in
user mode under X.

-- 
g



gmirror problems

2006-07-21 Thread Anton Nikiforov

Dear All,
I have the following gmirror configuration:
server1
ggated
ggatec create -u 10 -o rw 192.168.100.110 /dev/da1s1d
gmirror label -v -b split -s 2048 geom1 /dev/da1s1d /dev/ggate10d
ggatec create -u 11 -o rw 192.168.100.110 /dev/da1s1e
gmirror label -v -b split -s 2048 geom2 /dev/da1s1e /dev/ggate11
ggatec create -u 12 -o rw 192.168.100.110 /dev/da1s1f
gmirror label -v -b split -s 2048 geom3 /dev/da1s1f /dev/ggate12
ggatec create -u 13 -o rw 192.168.100.110 /dev/da1s1g
gmirror label -v -b split -s 2048 geom4 /dev/da1s1g /dev/ggate13
ggatec create -u 14 -o rw 192.168.100.110 /dev/da1s1h
gmirror label -v -b split -s 2048 geom5 /dev/da1s1h /dev/ggate14

server 2
ggated

Everything is fine when I start from scratch (currently by running all
commands and daemons manually), until I try to mount a gmirrored device from
server2. This is what I do:

1. Unmount all file systems.
2. Stop all devices (all of them, although I need only one in order to
   start it on the other host).
3. Stop the daemons.
4. Start the daemon on server1.
5. Try to create the ggatec device on server2 with the same command, but
   with the IP of server1. This gives the error:

   ggatec: ggatec: ioctl(/dev/ggctl): Invalid argument.
   ggatec: Exiting.
6. Looking at the device list in gmirror status, all devices on server2 are
   in the DEGRADED state (there were no devices in the list while
   everything was up).
7. Looking at the device list in gmirror status, all devices are in the
   COMPLETE state, and the next time gmirror hangs forever.


Could you please help me or direct me to the right manual?
I have found a lot of sources on how to set this up (and I'm done with that).
But what should I do after a failure? How do I mount the disk on another node
or start it after a failure?

And one more question: is there any way to get gmirror to re-mirror the devices
before the carp interfaces come up? I want the data mirrored before moving
services from the backup firewall to the main one (in case the main one has
failed).

And one more thing. After some manipulations with the gmirror devices, the
server crashed while booting the kernel. At the moment it initializes the
GEOM_MIRROR device, the kernel panics.
When I remove the disk that contained the gmirror devices, the server boots
normally. But inserting that disk again and running camcontrol rescan all
brings back the panic, so I cannot use this disk anymore (I know that I
can rewrite its last sector on a machine without GEOM compiled into the kernel).

--
Best regards,
Anton Nikiforov





filesystem full error with inumber

2006-07-21 Thread Feargal Reilly

The following error is being logged in /var/log/messages on
FreeBSD 5.4:

Jul 21 09:58:44 arwen kernel: pid 615 (postgres), uid 1001
inumber 6166128 on /data0: filesystem full

However, this does not appear to be a case of being out of disk
space, or running out of inodes:

ttyp2$ df -hi
Filesystem      Size  Used  Avail  Capacity  iused    ifree    %iused  Mounted on
/dev/amrd0s1f   54G   44G   5.4G   89%       4104458  3257972  56%     /data0

Nor does it appear to be a file limit:

ttyp2$ sysctl kern.maxfiles kern.openfiles
kern.maxfiles: 2
kern.openfiles: 3582

These readings were not taken at exactly the same time as the
error occurred, but close to it.

Here's the head of dumpfs:

magic   19540119 (UFS2) time    Fri Jul 21 09:38:40 2006
superblock location     65536   id      [ 42446884 99703062 ]
ncg     693     size    29360128        blocks  28434238
bsize   8192    shift   13      mask    0xe000
fsize   2048    shift   11      mask    0xf800
frag    4       shift   2       fsbtodb 2
minfree 8%      optim   time    symlinklen 120
maxbsize 8192   maxbpg  1024    maxcontig 16    contigsumsize 16
nbfree  563891  ndir    495168  nifree  3245588 nffree  19898
bpg     10597   fpg     42388   ipg     10624
nindir  1024    inopb   32      maxfilesize     8804691443711
sbsize  2048    cgsize  8192    csaddr  1372    cssize  12288
sblkno  36      cblkno  40      iblkno  44      dblkno  1372
cgrotor 322     fmod    0       ronly   0       clean   0
avgfpdir 64     avgfilesize 16384
flags   soft-updates
fsmnt   /data0
volname         swuid   0

Now the server's main function in life is running postgres.
I first noticed this error during a maintenance run
which sequentially dumps and vacuums each individual database.
There are currently 117 databases, most of which are no more than
20M in size, but there are a few outliers, the largest of which
is 792M in size. The bulk of this is stored in a single 500+M
file, so I can't see this consuming all my inodes, even if
soft-updates weren't cleaning up, but perhaps I'm wrong. It has
since been happening outside of those runs as well.

I have searched through various forums and list archives, and
while I have found a few references to this error, I have not
been able to find a cause and subsequent solution posted.

Looking through the source, the error is being logged by
ffs_fserr in sys/ufs/ffs/ffs_alloc.c. It is called either
by ffs_alloc or by ffs_realloccg after either of the following
conditions:

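ffs_alloc {
    ...
retry:
    /* a full block is wanted but no whole block is free anywhere */
    if (size == fs->fs_bsize && fs->fs_cstotal.cs_nbfree == 0)
        goto nospace;
    /* the request would dip into the minfree reserve */
    if (freespace(fs, fs->fs_minfree) - numfrags(fs, size) < 0)
        goto nospace;
    ...
nospace:
    /* ask soft updates to release pending blocks once, then retry */
    if (fs->fs_pendingblocks > 0 && reclaimed == 0) {
        reclaimed = 1;
        softdep_request_cleanup(fs, ITOV(ip));
        goto retry;
    }
    ffs_fserr(fs, ip->i_number, "filesystem full");
}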

My uninformed and uneducated reading of this is that it does not
think there are enough blocks free, yet that does not tally with
what df is telling me.
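
As a rough cross-check, the second condition in the excerpt can be evaluated by hand from the dumpfs figures. The Python sketch below assumes that freespace() is the count of free fragments minus the minfree percentage of the data area, and that the dumpfs fields "nbfree", "nffree" and "blocks" correspond to cs_nbfree, cs_nffree and the data-area size in fragments. Both mappings are assumptions taken from a reading of the UFS headers, not something the post states, and the function names are illustrative.

```python
# Values copied from the dumpfs output in the message.
NBFREE = 563891      # free full blocks
NFFREE = 19898       # free fragments outside full blocks
FRAG = 4             # fragments per block (bsize 8192 / fsize 2048)
DSIZE = 28434238     # "blocks" field, assumed to be the data area in fragments
MINFREE_PERCENT = 8  # minfree
FSIZE = 2048         # fragment size in bytes


def freespace_frags(nbfree, nffree, frag, dsize, minfree_percent):
    """Free fragments left after setting aside the minfree reserve."""
    free = nbfree * frag + nffree
    reserve = dsize * minfree_percent // 100
    return free - reserve


def would_hit_nospace(request_bytes):
    """Second test in the excerpt: freespace - numfrags(size) < 0."""
    needed = -(-request_bytes // FSIZE)  # round up to whole fragments
    left = freespace_frags(NBFREE, NFFREE, FRAG, DSIZE, MINFREE_PERCENT)
    return left - needed < 0


print(freespace_frags(NBFREE, NFFREE, FRAG, DSIZE, MINFREE_PERCENT))
```

Under these assumptions the dumpfs snapshot leaves only 723 fragments (roughly 1.4 MB) above the reserve, so the file system would have been sitting almost exactly at its minfree limit when that superblock was written. The dumpfs, df and error timestamps all differ, so this is a hint about where to look rather than a diagnosis.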

Looking again at dumpfs, it appears to say that this is formatted
with a block size of 8K, and a fragment size of 2K, but
tuning(7) says:

 FreeBSD performs best when using 8K or 16K file system
block sizes.  The default file system block size is 16K, which
provides best performance for most applications, with the
exception of those that perform random access on large files
(such as database server software).  Such applications tend to
perform better with a smaller block size, although modern disk
characteristics are such that the performance gain from using a
smaller block size may not be worth consideration.  Using a
block size larger than 16K can cause fragmentation of the buffer
cache and lead to lower performance.

 The defaults may be unsuitable for a file system that
requires a very large number of i-nodes or is intended to hold a
large number of very small files.  Such a file system should be
created with an 8K or 4K block size.  This also requires you to
specify a smaller fragment size.  We recommend always using a
fragment size that is 1/8 the block size (less testing has been
done on other fragment size factors).

Reading this makes me think that when this server was installed,
the block size was dropped from the 16K default to 8K for
performance reasons, but the fragment size was not modified
accordingly.

Would this be the root of my problem? If so, is my only option
to back everything up and newfs the disk, or is there something
else I can do that will minimise my downtime?

Any help and advice would be greatly appreciated.

-Feargal.

-- 
Feargal Reilly, Chief Techie, FBI.
PGP Key: 0x105D7168 (expires: 2006-11-30)
Web: http://www.fbi.ie/ | Tel: +353.14988588 | Fax: +353.14988489
Communications House, 11 Sallymount Avenue, Ranelagh, Dublin 6.


-- 
Feargal Reilly.
PGP Key: 0x847DE4C8 (expires: 2006-11-30)
Web: http://www.helgrim.com/ | ICQ: 109837


Re: NVIDIA 6600GT Freeze

2006-07-21 Thread Nealie
On Thu, 2006-07-20 at 13:51 +0200, Nealie wrote:
> I have a problem with my system freezing when using an NVIDIA video card
> using the nvidia-driver port. All seems to work fine for a while but
> then the system freezes and won't even reply to a ping. This can happen
> regardless of whether I use openGL or not.
> 
> Everything works fine using the "nv" driver, so it doesn't seem to be a
> hardware problem.
> 
> My setup is as follows:
> 
> uname: FreeBSD server.home 6.1-STABLE FreeBSD 6.1-STABLE #0: Wed Jul 19
> 11:19:16 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SERVER
> i386
> 
> AGP NVIDIA 6600GT installed on an MSI K8T Neo-F V2.09 motherboard (VIA
> K8T800 Pro chipset) with an AMD Athlon 64 3500+ CPU.
> 
> The NVIDIA driver is installed as per the instructions, with agp and dri
> removed from the kernel in order to use the NVIDIA agp interface, even
> though the sysctl settings suggest otherwise.
> 
> If anyone has any ideas about this problem I'd be very grateful.

Just a quick reply to myself: The problem seems to be that the IRQs
of the motherboard's onboard network interface and the AGP card are the
same. This works for a while but then something goes horribly wrong and
all comes to a halt. Why the IRQ is shared I have no idea as there are
nine free IRQs.

Unfortunately there is no way to change either of the IRQs in the BIOS,
so I've had to resort to replacing the onboard gigabit network
interface with an add-on 100Mb interface. All seems to be working
properly now with the NVIDIA driver.



Re: ath driver transmits frames only after a low watermark is filled

2006-07-21 Thread JoaoBR
On Thursday 20 July 2006 12:42, Sam Leffler wrote:
> The original posting didn't provide any basic info so there's little
> anyone can provide except wild guesses.  There are debugging mechanisms
> for tracing what's going on at the net80211 layer and in the driver that
> have been referenced countless times in this forum.


It would be useful if you could give clear instructions on exactly what you
need.

Even though the debugging tools have been mentioned here from time to time,
there are no instructions on how to use them. They are hard to find, and once
found there seems to be no README and no help at all.

So, since we are forced to guess how to use them, we are probably forced to
guess at the results too ...

I am also sure that the current ath problems have nothing to do with
power-saving features, because they also appear when these features are turned
off on both the AP and all the stations.

The last modification regarding the mcast issue definitely made the driver
better and more stable, but it still stops transmitting from time to time
without any useful message, and apparently this is related to some kind of
broadcast traffic. It is not a question of quantity: for example, I can run
dhclient again and again on one machine on the same network and I get an ath
down and up event in messages, which then sometimes causes tx to stop until I
manually run ifconfig up. Since I am on my own here, I simply block any kind
of broadcast to the AP and the card stays up.

But if you could tell me exactly what you need in order to find out what is
going on, it would be easy for me to help, because I have lots of these setups
running.

-- 

João









Re: Kernel panic with PF

2006-07-21 Thread Michal Mertl
Daniel Hartmeier wrote:
> On Fri, Jul 21, 2006 at 10:57:28AM +0200, Michal Mertl wrote:
> 
> > The proxy in fact runs in parallel (according to "pfctl -s info" it did
> > about 50 inserts and removal in the state table per second - some 10Mbit
> > of traffic, probably mostly HTTP) and it is quite possible that your
> > explanation is correct. I will forward your suspicion to the vendor.
> > This functionality of the software (using PF with anchors) is quite new
> > - they used different mechanisms in previous versions so it may well
> > have some bugs.
> 
> Anchors were introduced for this purpose, i.e. splitting the ruleset
> into separate pieces, over each of which a single process can have
> authority, so different processes don't stomp on each other's toes with
> ruleset modifications.

They (the Kernun authors) run multiple processes for each proxy.
Originally they used a slightly modified Apache core for their proxies I
believe. Thus there are probably more processes using the same anchor.

I don't really understand what they do inside - I would think that when
there are no traffic blocking rules, there's no point in doing anything
with PF except initial setting of the rdr rule to the proxy.

> Ask them if they really need to still use DIOCCHANGERULE, as the idea
> with anchors is generally to only operate within one anchor, and usually
> flush or replace the (smaller) ruleset within.
> 
> Each anchor has its own ticket, so if you're seeing ticket mismatches,
> that means there are concurrent operations on the same anchor, even.

I see. It would be better if they were part of this communication
because I don't know the internals (although I have the source code). I
have problems reaching them at the moment though.


> Daniel
> 



Re: Kernel panic with PF

2006-07-21 Thread Daniel Hartmeier
On Fri, Jul 21, 2006 at 10:57:28AM +0200, Michal Mertl wrote:

> The proxy in fact runs in parallel (according to "pfctl -s info" it did
> about 50 inserts and removal in the state table per second - some 10Mbit
> of traffic, probably mostly HTTP) and it is quite possible that your
> explanation is correct. I will forward your suspicion to the vendor.
> This functionality of the software (using PF with anchors) is quite new
> - they used different mechanisms in previous versions so it may well
> have some bugs.

Anchors were introduced for this purpose, i.e. splitting the ruleset
into separate pieces, over each of which a single process can have
authority, so different processes don't stomp on each other's toes with
ruleset modifications.

Ask them if they really need to still use DIOCCHANGERULE, as the idea
with anchors is generally to only operate within one anchor, and usually
flush or replace the (smaller) ruleset within.

Each anchor has its own ticket, so if you're seeing ticket mismatches,
that means there are concurrent operations on the same anchor, even.
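
As a rough illustration of the ticket idea, here is a small user-space
model (not pf's actual code; the struct and every function name below are
made up for the example). Each anchor carries its own ticket, a "begin"
hands out a fresh one, and later steps are refused if another writer has
started a transaction on the same anchor in the meantime.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one anchor's ruleset; not the kernel structure. */
struct anchor_model {
    unsigned int ticket;   /* bumped by every "begin" on this anchor */
    int          pending;  /* rules staged in the inactive ruleset */
    int          active;   /* rules in the committed ruleset */
};

/* Start a transaction: older tickets become stale, staging is emptied. */
static unsigned int
model_begin(struct anchor_model *a)
{
    assert(a != NULL);
    a->pending = 0;
    return (++a->ticket);
}

/* Stage one rule; fails if someone else called model_begin() meanwhile. */
static int
model_add_rule(struct anchor_model *a, unsigned int ticket)
{
    if (ticket != a->ticket)
        return (-1);       /* the "ticket mismatch" case */
    a->pending++;
    return (0);
}

/* Commit the staged rules, again only for the current ticket holder. */
static int
model_commit(struct anchor_model *a, unsigned int ticket)
{
    if (ticket != a->ticket)
        return (-1);
    a->active = a->pending;
    a->pending = 0;
    return (0);
}
```

In this model, two writers that each call model_begin() on the same
anchor_model invalidate one another, while writers on different
anchor_model instances never interfere, which is the reason for giving
each process its own anchor.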

Daniel


Re: Kernel panic with PF

2006-07-21 Thread Michal Mertl
Daniel Hartmeier wrote:
> On Fri, Jul 21, 2006 at 02:05:45AM +0200, Max Laier wrote:
> 
> > Which proxies are you using?  The "pool_ticket: 1429 != 1430" messages you 
> > quote below indicate a synchronization problem within the app talking to pf 
> > via ioctl's.  Tickets are used to ensure atomic commits for operations that 
> > require more than one ioctl.  If your proxy app runs in parallel it might 
> > screw up the internal state and thus leave it undefined afterwards.  I give 
> > you that this shouldn't cause a kernel problem, but if we could fix the app 
> > we can probably find the right sanity check more easily.
> 
> This looks like a bug in pf_ioctl.c pfioctl() DIOCCHANGERULE
> 
>  if (((((newrule->action == PF_NAT) ||
>      (newrule->action == PF_RDR) ||
>      (newrule->action == PF_BINAT) ||
>      (newrule->rt > PF_FASTROUTE)) &&
> -    !pcr->anchor[0])) &&
> +    !newrule->anchor)) &&
>      (TAILQ_FIRST(&newrule->rpool.list) == NULL))
>          error = EINVAL;
> 
> i.e. the pool must not be empty for routing and translation rules,
> except for translation rules that are actually anchor _calls_.
> 
> The confusion is between translation rules within anchors
> (pcr->anchor[0] != '\0') and calls to anchors' translation rules
> (rule->anchor != NULL).
> 
> If the proxy is using DIOCCHANGERULE (it must be the proxy, pfctl isn't
> using it at all), AND is trying to add/update a rule that requires at
> least one replacement address but contains an empty list, then this
> would cause the panic seen when that rule later matches a packet.
> 
> This needs fixing in OpenBSD as well.
> 
> Michal, can you please confirm that the patch above fixes the panic?
> The proxy will still misbehave and cause the log messages (one more
> EINVAL in this case ;), but the kernel shouldn't crash anymore.
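
To make the quoted check easier to follow, here is a stand-alone sketch
of the patched condition. It is only a model: the enum, the struct and
its fields are hypothetical stand-ins for the pf structures named in the
diff, and the function returns 1 where the kernel would set EINVAL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rule actions named in the patch. */
enum model_action { M_PASS, M_NAT, M_RDR, M_BINAT };

struct model_rule {
    enum model_action action;
    int routes;          /* nonzero stands in for rt > PF_FASTROUTE */
    int is_anchor_call;  /* stands in for newrule->anchor != NULL */
    int pool_entries;    /* stands in for entries in rpool.list */
};

/*
 * Returns 1 if the rule must be rejected: translation and routing rules
 * need at least one pool address, unless the rule is an anchor call.
 */
static int
model_rule_invalid(const struct model_rule *r)
{
    int needs_pool;

    assert(r != NULL);
    needs_pool = (r->action == M_NAT || r->action == M_RDR ||
        r->action == M_BINAT || r->routes) && !r->is_anchor_call;
    return (needs_pool && r->pool_entries == 0);
}
```

Under this model an rdr rule with an empty pool is rejected even when it
lives inside an anchor, while an rdr-anchor call with no pool is still
accepted.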

I am afraid I can't test it at the moment. I am going to get one of the
machines to my lab and will experiment with it there. I am afraid I will
have problems generating enough traffic for the problem to appear but I
will try.

> Thanks for the excellent bug report!

Thank you. I don't think it was that good, as I now see that you had to
guess that anchors are used.

The rules look like this (except for the rules shown by 'pfctl -s nat',
they are generated by the proxies when they start):

fw1# pfctl -s rule
fw1# pfctl -s nat
nat-anchor "/kernun/*" all
rdr-anchor "/kernun/*" all
fw1# pfctl -s Anchors -v
  kernun
  kernun/4026
  kernun/4039
  kernun/4088
  kernun/4112
  kernun/4134
  kernun/4164
  kernun/4197
  kernun/4257
  kernun/4296
  kernun/4338
  kernun/4383
  kernun/4431
  kernun/4482
  kernun/4590
  kernun/4649
fw1# pfctl -a kernun/4039 -s nat
rdr on em0 inet proto tcp from any to any port = http label "HTTP" -> 127.0.0.1

When the system was under load I saw ~5000 states in 'pfctl -s state'.

Thank you again. I will let you know when I get a chance to test your
patch and/or find out anything new.

Michal




Re: Kernel panic with PF

2006-07-21 Thread Michal Mertl
Max Laier wrote on Fri, 21 Jul 2006 at 02:05 +0200:
> [CC'ing -pf]
> 
> On Thursday 20 July 2006 17:53, Michal Mertl wrote:
> > Hello,
> >
> > I am deploying a firewall based on FreeBSD application proxies
> > (www.kernun.com, but not much English there) and am having frequent
> > panics of RELENG_6_1 under load. The server has IP forwarding disabled.
> >
> > I've got two machines in a carp cluster and the transparent proxies use
> > PF to get the data.
> 
> Which proxies are you using?  The "pool_ticket: 1429 != 1430" messages you 
> quote below indicate a synchronization problem within the app talking to pf 
> via ioctl's.  Tickets are used to ensure atomic commits for operations that 
> require more than one ioctl.  If your proxy app runs in parallel it might 
> screw up the internal state and thus leave it undefined afterwards.  I give 
> you that this shouldn't cause a kernel problem, but if we could fix the app 
> we can probably find the right sanity check more easily.

The proxy in fact runs in parallel (according to "pfctl -s info" it did
about 50 inserts and removals in the state table per second - some 10Mbit
of traffic, probably mostly HTTP) and it is quite possible that your
explanation is correct. I will forward your suspicion to the vendor.
This functionality of the software (using PF with anchors) is quite new
- they used different mechanisms in previous versions so it may well
have some bugs.

Thanks

Michal
