Re: ndisgen generated module load causes page fault, missing functions
Ganbold wrote:
> Hi,
>
> I have FreeBSD-6.1-STABLE dell D620 laptop with Dell Wireless 1490
> 802.11a/g Dual-band Mini Card (which seems like bcm4310).

In my experience, you should try various versions of the Windows driver. I have a 1400, and the very latest version of the driver does not work with NDIS, but the version previous to that does.

hth,

Doug

--
This .signature sanitized for your protection
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: NVIDIA 6600GT Freeze
At 08:24 AM 7/21/2006, Nealie wrote:
> On Thu, 2006-07-20 at 13:51 +0200, Nealie wrote:
> > I have a problem with my system freezing when using an NVIDIA video card
> > using the nvidia-driver port. All seems to work fine for a while but
> > then the system freezes and won't even reply to a ping. This can happen
> > regardless of whether I use openGL or not.
> >
> > Everything works fine using the "nv" driver, so it doesn't seem to be a
> > hardware problem.
> >
> > My setup is as follows:
> >
> > uname: FreeBSD server.home 6.1-STABLE FreeBSD 6.1-STABLE #0: Wed Jul 19
> > 11:19:16 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SERVER
> > i386
> >
> > AGP NVIDIA 6600GT installed on an MSI K8T Neo-F V2.09 motherboard (VIA
> > K8T800 Pro chipset) with an AMD Athlon 64 3500+ CPU.
> >
> > The NVIDIA driver is installed as per the instructions, with agp and dri
> > removed from the kernel in order to use the NVIDIA agp interface, even
> > though the sysctl settings suggest otherwise.
> >
> > If anyone has any ideas about this problem I'd be very grateful.
>
> Just a quick reply to myself: The problem seems to be that the IRQs of the
> motherboard's on-board network interface and the AGP card are the same.
> This works for a while but then something goes horribly wrong and all
> comes to a halt. Why the IRQ is shared I have no idea as there are nine
> free IRQs.

On most machines I have seen, IRQs are shared between certain "slots". You can change the IRQ that is being used but not the devices sharing it. I believe this is inherent to the current PC architecture and motherboard design. Given that the NIC is integrated and you have so many other IRQs free, I'm not sure why they chose that route for your board. Perhaps NICs can generally share with video cards without problems.

In general (not guaranteed), devices *should* be able to share IRQs if the drivers are written properly and if the hardware isn't designed horribly. This is just a generalization of my own experiences. I in no way write drivers for hardware for any operating system. YMMV. :)

Vinny Abello
Network Engineer
Server Management
[EMAIL PROTECTED]
(973)300-9211 x 125
(973)940-6125 (Direct)
PGP Key Fingerprint: 3BC5 9A48 FC78 03D3 82E0 E935 5325 FBCB 0100 977A

Tellurian Networks - The Ultimate Internet Connection
http://www.tellurian.com (888)TELLURIAN

"Courage is resistance to fear, mastery of fear - not absence of fear" -- Mark Twain
linux-firefox does not run on FreeBSD 6.1
Hiho! :-)

When trying to start linux-firefox on:

FreeBSD greatsheep 6.1-RC FreeBSD 6.1-RC #5: Tue May 2 20:33:32 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SUBMARINE_SMP i386

it complains about:

/usr/X11R6/lib/linux-firefox/firefox-bin: error while loading shared libraries: /usr/lib/libm.so.6: ELF file OS ABI invalid

I've installed the newest linux_base-fc-4_6 from ports and I also have the newest linux-firefox.

Is there anything I can do about this? :-)

Thanks in advance.

Bye
Marc
Re: Frozen Processes
On Thu, 20 Jul 2006, Holtor wrote:

> Since upgrading some of our 5.4 servers to the latest 6.1-STABLE we've had
> some processes stuck in the *inp state as listed in 'top'. Those processes
> can't be killed and any resources they use up in terms of bound IP
> addresses or ports can't be freed.
>
> Does anyone know what this *inp state means or how to fix this problem?

Processes in state '*inp' are waiting for an inpcb lock, suggesting a deadlock or lock leak. Can you compile your kernel with invariants, witness, ddb, etc, and do a bit of kernel debugging? You can find basic instructions in the handbook; what I'm particularly interested in is the output of "alltrace", "show alllocks", "show allpcpu".

Robert N M Watson
Computer Laboratory
University of Cambridge
Re: Kernel panic with PF
Hello,

I am the author of the proxy code you have been discussing here. I have done some investigation today and I will try to summarize the results.

> Thank you. No, I am not using it and I am quite sure the proxies aren't
> doing it behind my back either. In fact there isn't a single entry in

I can confirm that the proxies do not use any user/group rules.

> the rules tables - there are only rdr rules generated on the fly by the
> proxies.

Depending on configuration, the proxies can install rdr and/or map rules. The rdr rules are used for "incoming transparency", i.e., to catch clients' connections going through the firewall, but with the target server's destination address, not the firewall's. The map rules are used for "outgoing transparency", i.e., for changing the source address of connections to servers (for example, the original client source IP can be used instead of the firewall's IP).

> Which proxies are you using? The "pool_ticket: 1429 != 1430" messages you
> quote below indicate a synchronization problem within the app talking to pf
> via ioctl's. Tickets are used to ensure atomic commits for operations that

This should be only a transient error. The proxies detect this situation and retry the failed operation. The idea behind this behavior was that the ruleset manipulation is done by a small number of ioctls called quickly one after another by a single function, hence collisions should not occur too often. But maybe I can add exclusive locking of the PF device, which should remove all collisions among several proxy processes trying to change PF rules at the same time.

> If the proxy is using DIOCCHANGERULE (it must be the proxy, pfctl isn't
> using it at all), AND is trying to add/update a rule that requires at
> least one replacement address but contains an empty list, then this
> would cause the panic seen when that rule later matches a packet.

I think this is not the case. The proxy uses either

  DIOCXBEGIN + DIOCBEGINADDRS + DIOCADDADDR + DIOCADDRULE + DIOCXCOMMIT

or

  DIOCCHANGERULE(PF_CHANGE_GET_TICKET) + DIOCBEGINADDRS + DIOCADDADDR + DIOCCHANGERULE(PF_CHANGE_ADD_TAIL).

The first method is used in the first call to create the ruleset. In subsequent calls, the second method is used to modify the ruleset. But the list is never empty. If it were, the panics would always occur, which is not happening - there are other installations (but probably not 64bit SMP) working well.

I can imagine the list becoming empty only if some other process deletes it by DIOCBEGINADDRS during pfioctl processing, after the "pcr->pool_ticket != ticket_pabuf" check. But this should be guarded by PF_LOCK.

Of course, I could have made some mistake in the calling sequence of the PF ioctls. I wrote this piece of code by trial and error, using pfctl as a source of ideas, because I have not found a detailed manual for the PF API.

> Michal, can you please confirm that the patch above fixes the panic?
> The proxy will still misbehave and cause the log messages (one more
> EINVAL in this case ;), but the kernel shouldn't crash anymore.

Yes, the patch should fix the panics, but it does not solve the problem.

> This functionality of the software (using PF with anchors) is quite new

It is not so new; it has now been in production use for about 9 months.

> Anchors were introduced for this purpose, i.e. splitting the ruleset
> into separate pieces, over each of which a single process can have
> authority, so different processes don't stomp on each other's toes with
> ruleset modifications.

In fact, the possibility to split the ruleset into anchors owned by individual processes was one of our major reasons to move from IPF to PF.

> Ask them if they really need to still use DIOCCHANGERULE, as the idea
> with anchors is generally to only operate within one anchor, and usually
> flush or replace the (smaller) ruleset within.

DIOCCHANGERULE is useful for us, because each proxy process can have several redirections or mappings and it creates/deletes them incrementally, as it opens/closes individual network connections. It seems unnecessary to me to always replace the whole ruleset.

> Each anchor has its own ticket, so if you're seeing ticket mismatches,
> that means there are concurrent operations on the same anchor, even.

But the conflicts are on the pool_ticket which is, as I understand it, only one for all operations.

> They (the Kernun authors) run multiple processes for each proxy.
> Originally they used slightly modified Apache core for their proxies I
> believe. Thus there are probably more processes using the same anchor.

No, there are not. The anchors are even named by the owner process ID.

> I don't really understand what they do inside - I would think that when
> there are no traffic blocking rules, there's no point in doing anything
> with PF except initial setting of the rdr rule to the proxy.

As I have mentioned above, there are dynamically created rules for outgoing transparent connections (source-address i
Re: "swiN: clock sio" process taking 75% CPU
I wrote:

> About 6 minutes after booting (on two occasions; I don't
> guarantee that this doesn't vary), a process that appears
> in the output of "ps" as "[swi4: clock sio]" begins to
> use about 3/4 of the machine's CPU. I think it does so
> more or less instantaneously. It continues to do so
> indefinitely, so far as I can tell.

So, here's the answer. Whether it's the same thing that's afflicted the other people who've reported similar problems, I don't know. (Thanks to John Baldwin on -hackers for pointing me in a useful direction.)

Executive summary: If you see symptoms like the one above, are you running a syscons screen saver? (To check: run "kldstat | grep _saver".) If so, turn it off and the problem may go away.

1. The machine in question runs largely unattended.

2. I'd enabled the syscons screen saver and chosen one of the ones that puts the screen into a graphics mode. ("warp", as it happens; "fire" behaves similarly; the character-mode ones don't; I haven't looked at all of them.)

3. The screen saver kicks in 5 minutes after it gets turned on in /etc/rc.d/syscons, provided nothing's happening on the console. Which it isn't: see #1.

4. Now, how do those graphics-mode screen savers work? They write to the video card's frame buffer directly, but there's only a 64k block of RAM they can do this through. So, to cope with larger screens, there's a bank switching facility accessed by a BIOS call.

5. This BIOS call, on my machine, takes about 0.1ms; you need to do two of them for a bank switch, so the time actually taken is about 0.2ms.

6. The screen savers are written in a less than optimal way, and do that bank switching thing many times. For instance, the "fire" screen saver does it at least once for every screen line. Even when the entire screen actually fits into a single bank so that no switching at all should be needed.

7. So the screensaver eats up something on the order of half my CPU time; the exact figure depends on which screensaver and on more exact timings than I've given above, which is how it ends up actually being 75% for the "warp" screensaver.

8. The screensaver gets run in callouts from a kernel interrupt thread that happens to have a silly name like "swi4: clock sio".

This is eminently fixable, in several different ways. I've offered to prepare a patch, or perhaps someone else will do so, so there's a reasonable prospect of later versions of FreeBSD not having this problem. For the time being, there's a simple workaround for anyone facing the same problem I did: *turn off the screensaver*, or replace it with one that doesn't use a graphics mode.

For clarity: this is a problem with (some) FreeBSD syscons screen savers, the ones you might enable in /etc/rc.conf; not with the ones like xscreensaver that you might run in user mode under X.

--
g
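The arithmetic in points 5 to 7 can be sketched as a back-of-the-envelope model. Only the 0.1 ms BIOS call cost and the two calls per bank switch come from the message; the number of switches per frame and the redraw rate used in the example are hypothetical values picked for illustration, not measurements of "fire" or "warp".

```python
# Rough model of CPU time spent on bank switching by a graphics-mode
# syscons screen saver. BIOS_CALL_SECONDS and CALLS_PER_SWITCH are taken
# from the message; every other input is a hypothetical illustration.

BIOS_CALL_SECONDS = 0.1e-3   # about 0.1 ms per BIOS call (from the message)
CALLS_PER_SWITCH = 2         # two BIOS calls per bank switch (from the message)


def cpu_fraction(switches_per_frame: int, frames_per_second: float) -> float:
    """Fraction of one CPU spent purely on bank switching."""
    seconds_per_frame = switches_per_frame * CALLS_PER_SWITCH * BIOS_CALL_SECONDS
    return seconds_per_frame * frames_per_second


if __name__ == "__main__":
    # Hypothetical: one switch per line of a 200-line mode, 10 redraws a second.
    share = cpu_fraction(switches_per_frame=200, frames_per_second=10)
    print(f"bank switching alone: {share:.0%} of one CPU")
```

Under those assumed inputs the switching overhead alone is already in the "order of half the CPU" range described above, which is why removing redundant bank switches (or avoiding graphics-mode savers) makes such a difference.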
gmirror problems
Dear All,

I have the following gmirror configuration:

server1:
ggated
ggatec create -u 10 -o rw 192.168.100.110 /dev/da1s1d
gmirror label -v -b split -s 2048 geom1 /dev/da1s1d /dev/ggate10d
ggatec create -u 11 -o rw 192.168.100.110 /dev/da1s1e
gmirror label -v -b split -s 2048 geom2 /dev/da1s1e /dev/ggate11
ggatec create -u 12 -o rw 192.168.100.110 /dev/da1s1f
gmirror label -v -b split -s 2048 geom3 /dev/da1s1f /dev/ggate12
ggatec create -u 13 -o rw 192.168.100.110 /dev/da1s1g
gmirror label -v -b split -s 2048 geom4 /dev/da1s1g /dev/ggate13
ggatec create -u 14 -o rw 192.168.100.110 /dev/da1s1h
gmirror label -v -b split -s 2048 geom5 /dev/da1s1h /dev/ggate14

server2:
ggated

When I start from scratch (currently by manually running all commands and daemons), everything is fine until I try to mount a gmirrored device from server2. What I do is listed below:

1. Unmount all file systems.
2. Stop all devices (all of them, although I need only one in order to start it on another host).
3. Stop the daemons.
4. Start the daemon on server1.
5. Try to create the ggatec device on server2 with the same command, but with the IP of server1. This gives the error:
   ggatec: ggatec: ioctl(/dev/ggctl): Invalid argument.
   ggatec: Exiting.
6. Looking at the gmirror status device list, I have all devices in the DEGRADED state on server2 (there were no devices in the list while everything was up).
7. Looking at the gmirror status device list, I have all devices in the COMPLETE state, and the next time gmirror hangs forever.

Could you please help me or direct me to the right manual? I have found a lot of sources on how to set this up (and I'm done with that). But what should I do after a failure? How do I mount the disk on another node or start it after a failure?

One more question: is there any way to get gmirror to resynchronize the mirrored devices before the carp interfaces come up? I want to get the data mirrored before moving services from the backup firewall to the main one (in case the main one has failed).

And one more thing. After some manipulations with gmirror devices, my server crashed while booting the kernel. At the moment it initializes the GEOM_MIRROR device, the kernel panics. When I removed the disk that contained the gmirror devices, the server booted normally. But inserting that disk again and running camcontrol rescan all brings back the panic... so I cannot use this disk anymore (I know that I can overwrite its last sector on a machine without GEOM compiled into the kernel).

--
Best regards,
Anton Nikiforov
filesystem full error with inumber
The following error is being logged in /var/log/messages on FreeBSD 5.4:

Jul 21 09:58:44 arwen kernel: pid 615 (postgres), uid 1001 inumber 6166128 on /data0: filesystem full

However, this does not appear to be a case of being out of disk space, or running out of inodes:

ttyp2$ df -hi
Filesystem      Size  Used  Avail  Capacity    iused    ifree  %iused  Mounted on
/dev/amrd0s1f    54G   44G   5.4G       89%  4104458  3257972     56%  /data0

Nor does it appear to be a file limit:

ttyp2$ sysctl kern.maxfiles kern.openfiles
kern.maxfiles: 2
kern.openfiles: 3582

These readings were not taken at exactly the same time as the error occurred, but close to it.

Here's the head of dumpfs:

magic 19540119 (UFS2) time Fri Jul 21 09:38:40 2006
superblock location 65536 id [ 42446884 99703062 ]
ncg 693 size 29360128 blocks 28434238
bsize 8192 shift 13 mask 0xe000
fsize 2048 shift 11 mask 0xf800
frag 4 shift 2 fsbtodb 2
minfree 8% optim time symlinklen 120
maxbsize 8192 maxbpg 1024 maxcontig 16 contigsumsize 16
nbfree 563891 ndir 495168 nifree 3245588 nffree 19898
bpg 10597 fpg 42388 ipg 10624
nindir 1024 inopb 32 maxfilesize 8804691443711
sbsize 2048 cgsize 8192 csaddr 1372 cssize 12288
sblkno 36 cblkno 40 iblkno 44 dblkno 1372
cgrotor 322 fmod 0 ronly 0 clean 0
avgfpdir 64 avgfilesize 16384
flags soft-updates
fsmnt /data0
volname swuid 0

Now the server's main function in life is running postgres. I first noticed this error during a maintenance run which sequentially dumps and vacuums each individual database. There are currently 117 databases, most of which are no more than 20M in size, but there are a few outliers, the largest of which is 792M in size. The bulk of this is stored in a single 500+M file, so I can't see this consuming all my inodes, even if soft-updates weren't cleaning up, though perhaps I'm wrong. It has since been happening outside of those runs as well.

I have searched through various forums and list archives, and while I have found a few references to this error, I have not been able to find a cause and subsequent solution posted.

Looking through the source, the error is being logged by ffs_fserr in sys/ufs/ffs/ffs_alloc.c. It is being called either by ffs_alloc or by ffs_realloccg after either of the following conditions:

ffs_alloc
{
    ...
retry:
    if (size == fs->fs_bsize && fs->fs_cstotal.cs_nbfree == 0)
        goto nospace;
    if (... freespace(fs, fs->fs_minfree) - numfrags(fs, size) < 0)
        goto nospace;
    ...
nospace:
    if (fs->fs_pendingblocks > 0 && reclaimed == 0) {
        reclaimed = 1;
        softdep_request_cleanup(fs, ITOV(ip));
        goto retry;
    }
    ffs_fserr(fs, ip->i_number, "filesystem full");
}

My uninformed and uneducated reading of this is that it does not think there are enough blocks free, yet that does not tally with what df is telling me.

Looking again at dumpfs, it appears to say that this is formatted with a block size of 8K, and a fragment size of 2K, but tuning(7) says:

    FreeBSD performs best when using 8K or 16K file system block sizes.
    The default file system block size is 16K, which provides best
    performance for most applications, with the exception of those that
    perform random access on large files (such as database server
    software). Such applications tend to perform better with a smaller
    block size, although modern disk characteristics are such that the
    performance gain from using a smaller block size may not be worth
    consideration. Using a block size larger than 16K can cause
    fragmentation of the buffer cache and lead to lower performance.

    The defaults may be unsuitable for a file system that requires a very
    large number of i-nodes or is intended to hold a large number of very
    small files. Such a file system should be created with an 8K or 4K
    block size. This also requires you to specify a smaller fragment
    size. We recommend always using a fragment size that is 1/8 the block
    size (less testing has been done on other fragment size factors).

Reading this makes me think that when this server was installed, the block size was dropped from the 16K default to 8K for performance reasons, but the fragment size was not modified accordingly.

Would this be the root of my problem? If so, is my only option to back everything up and newfs the disk, or is there something else I can do that will minimise my downtime?

Any help and advice would be greatly appreciated.

-Feargal.

--
Feargal Reilly, Chief Techie, FBI.
PGP Key: 0x105D7168 (expires: 2006-11-30)
Web: http://www.fbi.ie/ | Tel: +353.14988588 | Fax: +353.14988489
Communications House, 11 Sallymount Avenue, Ranelagh, Dublin 6.

--
Feargal Reilly.
PGP Key: 0x847DE4C8 (expires: 2006-11-30)
Web: http://www.helgrim.com/ | ICQ: 109837
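The reserve check quoted from ffs_alloc can be replayed with the dumpfs numbers from the message. This is a sketch under assumptions: the formula follows what I understand the freespace() macro in sys/ufs/ffs/fs.h to compute (free blocks converted to fragments, plus free fragments, minus the minfree percentage of the data size), and it assumes the "blocks" figure in dumpfs is the data size in fragments. The function name is made up for this example.

```python
# Replays the minfree reserve check with values from the dumpfs output.
# Assumption: freespace() = nbfree * frag + nffree - dsize * minfree / 100,
# using integer arithmetic as the kernel would.

def freespace_frags(nbfree: int, nffree: int, frag: int,
                    dsize: int, minfree_pct: int) -> int:
    """Free fragments remaining once the minfree reserve is set aside."""
    free_frags = nbfree * frag + nffree        # whole free blocks plus loose fragments
    reserved = dsize * minfree_pct // 100      # reserve, in fragments
    return free_frags - reserved


if __name__ == "__main__":
    # nbfree, nffree, frag, blocks and minfree as printed by dumpfs above.
    left = freespace_frags(nbfree=563891, nffree=19898, frag=4,
                           dsize=28434238, minfree_pct=8)
    fsize = 2048                               # fragment size in bytes (fsize)
    print(left, "fragments above the reserve, about", left * fsize // 1024, "KiB")
```

If those assumptions hold, the superblock snapshot taken at 09:38 leaves only a few hundred fragments above the 8% reserve. As far as I know the reserve applies only to unprivileged callers, so this would be consistent with postgres (uid 1001) intermittently seeing "filesystem full" while df, sampled at a different moment, still reports free space. It does not by itself confirm the cause.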
Re: NVIDIA 6600GT Freeze
On Thu, 2006-07-20 at 13:51 +0200, Nealie wrote:
> I have a problem with my system freezing when using an NVIDIA video card
> using the nvidia-driver port. All seems to work fine for a while but
> then the system freezes and won't even reply to a ping. This can happen
> regardless of whether I use openGL or not.
>
> Everything works fine using the "nv" driver, so it doesn't seem to be a
> hardware problem.
>
> My setup is as follows:
>
> uname: FreeBSD server.home 6.1-STABLE FreeBSD 6.1-STABLE #0: Wed Jul 19
> 11:19:16 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SERVER
> i386
>
> AGP NVIDIA 6600GT installed on an MSI K8T Neo-F V2.09 motherboard (VIA
> K8T800 Pro chipset) with an AMD Athlon 64 3500+ CPU.
>
> The NVIDIA driver is installed as per the instructions, with agp and dri
> removed from the kernel in order to use the NVIDIA agp interface, even
> though the sysctl settings suggest otherwise.
>
> If anyone has any ideas about this problem I'd be very grateful.

Just a quick reply to myself:

The problem seems to be that the IRQs of the motherboard's on-board network interface and the AGP card are the same. This works for a while but then something goes horribly wrong and all comes to a halt. Why the IRQ is shared I have no idea, as there are nine free IRQs.

Unfortunately there is no way to change either of the IRQs in the BIOS, so I've had to resort to replacing the on-board gigabit network interface with an add-on 100Mb interface. All seems to be working properly now with the NVIDIA driver.
Re: ath driver transmits frames only after a low watermark is filled
On Thursday 20 July 2006 12:42, Sam Leffler wrote:
> The original posting didn't provide any basic info so there's little
> anyone can provide except wild guesses. There are debugging mechanisms
> for tracing what's going on at the net80211 layer and in the driver that
> have been referenced countless times in this forum.

I think it would be useful if you could provide clear instructions on what exactly you need. Even if it has been mentioned once in a while that there are tools for debugging, there are no instructions on how to use them; they are already hard to find, and once found there seems to be no readme and no help at all. So since we are forced to guess how to use them, we are probably forced to guess at the results too ...

I am also sure that the current ath problems have nothing to do with power-saving features, because they also appear when these features are turned off on both the AP and all the stations.

The last modification regarding the mcast issue definitely made the driver better and more stable, but it still stops transmitting from time to time without any useful message, and apparently this is related to some kind of broadcast traffic. It is not a question of quantity: for example, I can run dhclient again and again on one machine on the same network and I get an ath down and up event in messages, which then sometimes causes tx to stop until I manually do ifconfig up.

So since I am on my own here, I simply block any kind of broadcast to the AP and the card holds up. But if you could tell me exactly what you need in order to find out what it is, it would be easy for me to help, because I have lots of these setups running.

--
João
Re: Kernel panic with PF
Daniel Hartmeier wrote:
> On Fri, Jul 21, 2006 at 10:57:28AM +0200, Michal Mertl wrote:
>
> > The proxy in fact runs in parallel (according to "pfctl -s info" it did
> > about 50 inserts and removal in the state table per second - some 10Mbit
> > of traffic, probably mostly HTTP) and it is quite possible that your
> > explanation is correct. I will forward your suspicion to the vendor.
> > This functionality of the software (using PF with anchors) is quite new
> > - they used different mechanisms in previous versions so it may well
> > have some bugs.
>
> Anchors were introduced for this purpose, i.e. splitting the ruleset
> into separate pieces, over each of which a single process can have
> authority, so different processes don't stomp on each other's toes with
> ruleset modifications.

They (the Kernun authors) run multiple processes for each proxy. Originally they used a slightly modified Apache core for their proxies, I believe. Thus there are probably more processes using the same anchor. I don't really understand what they do inside - I would think that when there are no traffic blocking rules, there's no point in doing anything with PF except the initial setting of the rdr rule to the proxy.

> Ask them if they really need to still use DIOCCHANGERULE, as the idea
> with anchors is generally to only operate within one anchor, and usually
> flush or replace the (smaller) ruleset within.
>
> Each anchor has its own ticket, so if you're seeing ticket mismatches,
> that means there are concurrent operations on the same anchor, even.

I see. It would be better if they were part of this communication because I don't know the internals (although I have the source code). I have problems reaching them at the moment though.

> Daniel
Re: Kernel panic with PF
On Fri, Jul 21, 2006 at 10:57:28AM +0200, Michal Mertl wrote:

> The proxy in fact runs in parallel (according to "pfctl -s info" it did
> about 50 inserts and removal in the state table per second - some 10Mbit
> of traffic, probably mostly HTTP) and it is quite possible that your
> explanation is correct. I will forward your suspicion to the vendor.
> This functionality of the software (using PF with anchors) is quite new
> - they used different mechanisms in previous versions so it may well
> have some bugs.

Anchors were introduced for this purpose, i.e. splitting the ruleset into separate pieces, over each of which a single process can have authority, so different processes don't stomp on each other's toes with ruleset modifications.

Ask them if they really need to still use DIOCCHANGERULE, as the idea with anchors is generally to only operate within one anchor, and usually flush or replace the (smaller) ruleset within.

Each anchor has its own ticket, so if you're seeing ticket mismatches, that means there are concurrent operations on the same anchor, even.

Daniel
Re: Kernel panic with PF
Daniel Hartmeier wrote: > On Fri, Jul 21, 2006 at 02:05:45AM +0200, Max Laier wrote: > > > Which proxies are you using? The "pool_ticket: 1429 != 1430" messages you > > quote below indicate a synchronization problem within the app talking to pf > > via ioctl's. Tickets are used to ensure atomic commits for operations that > > require more than one ioctl. If your proxy app runs in parallel it might > > screw up the internal state and thus leave it undefined afterwards. I give > > you that this shouldn't cause a kernel problem, but if we could fix the app > > we can probably find the right sanity check more easily. > > This looks like a bug in pf_ioctl.c pfioctl() DIOCCHANGERULE > > if (newrule->action == PF_NAT) || > (newrule->action == PF_RDR) || > (newrule->action == PF_BINAT) || > (newrule->rt > PF_FASTROUTE)) && > - !pcr->anchor[0])) && > + !newrule->anchor)) && > (TAILQ_FIRST(&newrule->rpool.list) == NULL)) > error = EINVAL; > > i.e. the pool must not be empty for routing and translation rules, > except for translation rules that are actually anchor _calls_. > > The confusion is between translation rules within anchors > (pcr->anchor[0] != '\0') and calls to anchors' translation rules > (rule->anchor != NULL). > > If the proxy is using DIOCCHANGERULE (it must be the proxy, pfctl isn't > using it at all), AND is trying to add/update a rule that requires at > least one replacement address but contains an empty list, then this > would cause the panic seen when that rule later matches a packet. > > This needs fixing in OpenBSD as well. > > Michal, can you please confirm that the patch above fixes the panic? > The proxy will still misbehave and cause the log messages (one more > EINVAL in this case ;), but the kernel shouldn't crash anymore. I am afraid I can't test it at the moment. I am going to get one of the machines to my lab and will experiment with it there. I am afraid I will have problems generating enough traffic for the problem to appear but I will try. 
> Thanks for the excellent bug report! Thank you. I don't think it was that good, as I now see that you had to guess that anchors are used. The rules look like this (except for the rules seen by 'pfctl -s nat', they are generated by the proxies when they start): fw1#pfctl -s rule fw1#pfctl -s nat nat-anchor "/kernun/*" all rdr-anchor "/kernun/*" all fw1#pfctl -s Anchors -v kernun kernun/4026 kernun/4039 kernun/4088 kernun/4112 kernun/4134 kernun/4164 kernun/4197 kernun/4257 kernun/4296 kernun/4338 kernun/4383 kernun/4431 kernun/4482 kernun/4590 kernun/4649 fw1# pfctl -a kernun/4039 -s nat rdr on em0 inet proto tcp from any to any port = http label "HTTP" -> 127.0.0.1 When the system was under load I saw ~5000 states in 'pfctl -s state'. Thank you again. I will let you know when I get a chance to test your patch and/or find out anything new. Michal
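For readers following the patch discussion, the intended sanity check can be modelled outside the kernel. The sketch below is only an illustration under stated assumptions: `toy_rule`, `toy_check_rule` and the `TOY_*` constants are hypothetical stand-ins invented here, not pf's real structures or values. It encodes the rule as Daniel describes it: a translation or routing rule with an empty address pool is rejected unless the rule is an anchor call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for pf's action and route types (illustrative values). */
enum toy_action { TOY_PASS, TOY_NAT, TOY_BINAT, TOY_RDR };
enum toy_route  { TOY_NOROUTE, TOY_FASTROUTE, TOY_ROUTETO };

struct toy_rule {
    enum toy_action action;
    enum toy_route  rt;
    const void     *anchor;    /* non-NULL: the rule is a call into an anchor */
    size_t          pool_len;  /* number of replacement addresses in the pool */
};

/* Return EINVAL for a rule that needs a replacement address but has none. */
int toy_check_rule(const struct toy_rule *r)
{
    int needs_pool = r->action == TOY_NAT || r->action == TOY_RDR ||
                     r->action == TOY_BINAT || r->rt > TOY_FASTROUTE;

    if (needs_pool && r->anchor == NULL && r->pool_len == 0)
        return EINVAL;
    return 0;
}

/* Small self-check showing the three cases discussed in the thread. */
void toy_check_demo(void)
{
    static const int dummy_anchor = 0;
    struct toy_rule empty_rdr   = { TOY_RDR, TOY_NOROUTE, NULL, 0 };
    struct toy_rule anchor_call = { TOY_RDR, TOY_NOROUTE, &dummy_anchor, 0 };
    struct toy_rule normal_rdr  = { TOY_RDR, TOY_NOROUTE, NULL, 1 };

    assert(toy_check_rule(&empty_rdr) == EINVAL);  /* rejected up front */
    assert(toy_check_rule(&anchor_call) == 0);     /* anchor calls carry no pool */
    assert(toy_check_rule(&normal_rdr) == 0);      /* pool has an address */
}
```

The point of rejecting the rule at insertion time is that the empty pool is otherwise only discovered later, when the rule matches a packet.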
Re: Kernel panic with PF
Max Laier wrote on Fri, 21. 07. 2006 at 02:05 +0200: > [CC'ing -pf] > > On Thursday 20 July 2006 17:53, Michal Mertl wrote: > > Hello, > > > > I am deploying FreeBSD based application proxies' based firewall > > (www.kernun.com, but not much English there) and am having frequent > > panics of RELENG_6_1 under load. The server has IP forwarding disabled. > > > > I've got two machines in a carp cluster and the transparent proxies use > > PF to get the data. > > Which proxies are you using? The "pool_ticket: 1429 != 1430" messages you > quote below indicate a synchronization problem within the app talking to pf > via ioctl's. Tickets are used to ensure atomic commits for operations that > require more than one ioctl. If your proxy app runs in parallel it might > screw up the internal state and thus leave it undefined afterwards. I give > you that this shouldn't cause a kernel problem, but if we could fix the app > we can probably find the right sanity check more easily. The proxy does in fact run in parallel (according to "pfctl -s info" it did about 50 inserts and removals in the state table per second - some 10Mbit of traffic, probably mostly HTTP) and it is quite possible that your explanation is correct. I will forward your suspicion to the vendor. This functionality of the software (using PF with anchors) is quite new - they used different mechanisms in previous versions, so it may well have some bugs. Thanks Michal
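To make the ticket idea from the quoted explanation concrete, here is a minimal sketch of how two uncoordinated clients can invalidate each other's multi-step operation. It is a simplified model under stated assumptions, not pf's actual code: `toy_pool`, `toy_begin` and `toy_add` are hypothetical names, and returning EBUSY on a stale ticket is this sketch's own choice.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical ticket-guarded staging buffer; not pf's real data structure. */
struct toy_pool {
    unsigned int ticket;  /* ticket of the transaction currently in progress */
    unsigned int naddrs;  /* entries staged under that ticket */
};

/* Start a new transaction: drop staged entries and hand out a fresh ticket. */
unsigned int toy_begin(struct toy_pool *p)
{
    p->naddrs = 0;
    return ++p->ticket;
}

/* Stage one entry; refuse if the caller's ticket has been superseded. */
int toy_add(struct toy_pool *p, unsigned int ticket)
{
    if (ticket != p->ticket)
        return EBUSY;
    p->naddrs++;
    return 0;
}

/* Two clients interleave: B's begin invalidates A's ticket. */
void toy_ticket_demo(void)
{
    struct toy_pool pool = { 0, 0 };
    unsigned int a = toy_begin(&pool);   /* client A starts */
    unsigned int b = toy_begin(&pool);   /* client B starts before A finishes */

    assert(toy_add(&pool, a) == EBUSY);  /* A's ticket is stale, like "1429 != 1430" */
    assert(toy_add(&pool, b) == 0);      /* B holds the current ticket */
    assert(pool.naddrs == 1);
}
```

In this model a client that sees the stale-ticket error has to restart its whole sequence from `toy_begin`; a client that ignores the error and carries on would go on to commit with an empty staging buffer.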