Re: [RFC] cfg80211 and nl80211
Umm, looks like I skipped this paragraph in my earlier reply to you. Sorry
about that.

> I'd also argue that one specific BSSID is part of an initial
> configuration. We should support that in config command. It's an
> implicit SET_FIXED_BSSID, yes. But one of the major points of
> nl80211/cfg80211 was that you could bundle up a set of configuration
> settings into a single atomic "packet", which you couldn't do with WE.
>
> So if a specific BSSID isn't sent in the initial config command, when do
> you set a specific BSSID? Before? After? The behavior starts getting
> complicated, and we're back to a situation where every driver implements
> the semantics in a slightly different manner.

Ah, good point. But then, why would you want to set a specific (initial)
BSSID at all? Either you set userspace roaming (which you'd do before
setting the SSID), and then the kernel can't do anything without you setting
a BSSID, or you don't set userspace roaming, and then all the kernel needs is
the SSID.

I'm thinking you probably want something like 'list of BSSIDs to use for
userspace roaming' and possibly a blacklist too, although I'm inclined to
let userspace manage the blacklist by way of having a whitelist *only* and
having userspace simply add everything to the whitelist that it discovers
through scanning and isn't on the blacklist...

Hence, would you be satisfied with a BSSID-whitelist for kernel-controlled
roaming (userspace roaming doesn't need the kernel to know about the
whitelist)? Heck, you could even use a single-element whitelist for when you
want to force the kernel to associate to that AP... Maybe we should thus
drop the userspace roaming support? I think it's a simpler API though...
Then again, why do we need a BSSID-whitelist? Just have userspace control
roaming then...

Also, the use case you want could probably be achieved by turning on
userspace roaming, setting the BSSID for it, configuring the SSID and then
turning off userspace roaming again.

Or let me put it another way: I'm not sure what the use case actually is :) johannes - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
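A minimal sketch of the whitelist check discussed in this message, assuming a fixed-size table. The type `struct bssid_whitelist`, the function `bssid_allowed()` and the "empty list means no restriction" rule are hypothetical and not taken from any posted cfg80211/nl80211 code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6
#define WL_MAX   8   /* arbitrary capacity for this sketch */

/* Hypothetical whitelist handed to the kernel for kernel-controlled roaming. */
struct bssid_whitelist {
    size_t  count;
    uint8_t bssid[WL_MAX][ETH_ALEN];
};

/*
 * Return true if the kernel may associate with this BSSID.
 * Assumption: an empty whitelist means "no restriction".
 */
bool bssid_allowed(const struct bssid_whitelist *wl, const uint8_t *bssid)
{
    if (wl->count == 0)
        return true;
    for (size_t i = 0; i < wl->count; i++)
        if (memcmp(wl->bssid[i], bssid, ETH_ALEN) == 0)
            return true;
    return false;
}
```

With a single-element whitelist, only that one AP passes the check, which is the "force the kernel to associate to that AP" case mentioned above.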
Re: [RFC] cfg80211 and nl80211
On Wed, 2006-10-04 at 13:57 -0400, Dan Williams wrote:

> Are we talking about config changes when some other process pushes a new
> config to the card, or when something happens over the air, like new
> association or deauth?

Well, both actually :) Yeah, we should have different groups for that.

> Is it a problem to actually push the _entire_ scan list out to clients
> over netlink? The scan list could be quite large, maybe even a few
> kilobytes when stuff like Information Elements, ratesets, etc is
> available. I've seen 35-item scan lists that are already around 1.5K.

35-item list at 1.5K, heh, the allocated skbs are always at least 4k so we
can just push that out in full I guess. Though scan lists with nl80211 will
be slightly larger (some genetlink overhead) than the bit-packed WE stuff.

> Ideally, we could push the whole scan list to clients, and then we avoid
> the race between getting the scan result notification and hitting the
> card. That said, as long as the driver does proper locking, the race
> condition shouldn't matter at all.

Heh yeah.

> I'd vote for pushing results along with the notification, because one of
> the most annoying things in the past was the inconsistency between how
> drivers reported results and what BSSID attributes they sent. If we can
> _standardize_ the result list and its construction inside
> cfg80211/nl80211 that would be a great benefit.

Well, I'm thinking that drivers provide an iterator that provides a scan
result structure for each result and nl80211 iterates over that building up
the netlink message. That way, they just have to fill in such a structure
for each iteration, which should ease things.

> If we can't push results with the notification, at least provide some
> functions to build up the GET_SCAN reply message, which you'll likely
> have to do anyway once you implement GET_SCAN. We _really_ need drivers
> to be consistent here.

Same thing, get_scan() gives an iterator function that the driver calls for
each result it has with the same scan result struct.

> > > * crypto and auth support
>
> I've done a lot of thinking about crypto/auth this morning while beating
> the hell out of the libertas 8388 driver to clean up the ENCODE support.
>
> There are several issues here. They can be roughly split by encryption
> algorithm. But the big question:
>
> Is there a case for _multiple_ encryption algorithms enabled
> on a single "virtual" interface at one time?
>
> I don't think there is, and I think that just complicates things in
> d80211 anyway. If we agree that you can only set one of [none, WEP,
> WPA] on a virtual interface at any given time, it makes the crypto
> interface for nl80211 a lot easier.

I can't see a case for that. Although maybe for AP interfaces? But does it
make sense to have some stations with say TKIP and others with AES, and is
that even negotiable? In any case, I think a STA inside an AP should be
treated mostly like a single "STA virtual interface" which surely doesn't
need multiple algorithms.

> Part of the problem of WE right now is that there's no clear API
> separation between the different options. You can pass some WEP options
> through when you really want to do WPA (like key indexes). That makes
> the driver handling code for ENCODE and ENCODEEXT too complex.
>
> Taking one-at-a-time as a given, and the pseudo-structure
>
> struct cmd_crypto {
>     enum crypto_alg alg;
>     union data {
>         none_data;
>         wep_data;
>         wpa_data;
>         ...
>     };
> };
>
> Set alg == , set the options, and the driver will _enable_
> that crypto mode with the given options. It makes no sense at all to,
> say, set the WEP transmit key index or WEP key when the card is in WPA
> mode or no-crypto mode.

That makes sense; in netlink it'd be represented by a message containing an
algorithm attribute and then attributes for all the other things, and not
those attributes for say WPA if you use WEP.

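The one-algorithm-at-a-time structure quoted above can be sketched as a tagged union. This is only an illustration under assumptions: the member layouts of `wep_data` and `wpa_data`, the enum values and `cmd_crypto_valid()` are hypothetical, since the thread does not define them.

```c
#include <stdbool.h>
#include <stdint.h>

enum crypto_alg { CRYPTO_NONE, CRYPTO_WEP, CRYPTO_WPA };

/* Hypothetical per-algorithm option blocks. */
struct wep_data { uint8_t tx_key_idx; };   /* 0..3, i.e. WEP key 1..4 */
struct wpa_data { bool wpa2; };

struct cmd_crypto {
    enum crypto_alg alg;       /* selects which union member is meaningful */
    union {
        struct wep_data wep;
        struct wpa_data wpa;
    } data;
};

/*
 * Check only the options that belong to the selected algorithm.
 * Because the options share a union, a request cannot carry WEP
 * settings and WPA settings at the same time.
 */
bool cmd_crypto_valid(const struct cmd_crypto *cmd)
{
    switch (cmd->alg) {
    case CRYPTO_NONE:
        return true;
    case CRYPTO_WEP:
        return cmd->data.wep.tx_key_idx < 4;
    case CRYPTO_WPA:
        return true;
    }
    return false;              /* unknown algorithm value */
}
```

In a netlink message the same rule would presumably become "one algorithm attribute plus only the attributes valid for it", as the reply above describes.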
> It's important to note that some options are independent of the initial
> operation that enabled the crypto, and need to be set later without
> triggering deauth and such. Setting non-TX-index WEP key is one such
> operation. I should be able to set WEP keys at indexes other than the
> transmit key index without affecting operation of the card (unless some
> hardware/firmware issue prevents this).
>
> - No crypto
> - WEP encryption (following ops are independent of each other):
>     - Set TX key index
>     - Set privacy invoked

what is that?

>     - Set exclude unencrypted packets
>     - Set authentication mode (open, shared-key, or both)
>     - Set (or clear) WEP key 1, 2, 3, or 4
> - WPA/WPA2/IEEE8021X
>     - Jouni/others would know better and my brain is fried right now
>
> All the WEP options should be independent attributes in nl80211. You
> could even have a generic WEPKey attribute that is defined like so:
>
> ATTR_WEP_KEY {
>     enum
Re: [PATCH] wext
On Wed, 2006-10-04 at 14:45 -0700, Jouni Malinen wrote:

> SIOCSIWMLME was designed to allow additional MLME commands to be added.
> IMHO, a potential replacement in the future should not prevent us from
> extending WEXT at this point and stop all changes in something that is
> currently available.

Fine with me, it's really just a matter of switch()ing on the sub-command in
the cfg80211-we compat code. My answer was more political in nature I guess
-- even I need sleep, you know. :P

johannes
Re: [take19 1/4] kevent: Core files.
On Wed, Oct 04, 2006 at 10:57:32AM -0700, Ulrich Drepper ([EMAIL PROTECTED]) wrote: > On 10/3/06, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote: > >http://tservice.net.ru/~s0mbre/archive/kevent/evserver_kevent.c > >http://tservice.net.ru/~s0mbre/archive/kevent/evtest.c > > These are simple programs which by themselves have problems. For > instance, I consider a very bad idea to hardcode the size of the ring > buffer. Specifying macros in the header file counts as hardcoding. > Systems grow over time and so will the demand of connections. I have > no problem with the kernel hardcoding the value internally (or having > a /proc entry to select it) but programs should be able to dynamically > learn about the value so they don't have to be recompiled. Well, it is possible to create /sys/proc entry for that, and even now userspace can grow mapping ring until it is forbiden by kernel, which means limit is reached. Actually the whole idea with global limit of kevents does not sound very good to me, but it is required to remove overflow in mapped buffer. > But more problematic is that I don't see how the interfaces can be > efficiently used in multi-threaded (or multi-process) programs. How > would multiple threads using the same kevent queue and running in the > same kevent_get_events() loop work out? How do they guarantee that > each request is only handled once? kqueue_dequeue_ready() is atomic and this function removes kevent from ready queue so other thread can not get it. > From what I see now this means a second data structure is needed to > keep track of the state of each entry. But even then, how do we even > recognized used ring buffer entries? > > For instance, assume two threads. Both call get_events, one event is > reported, both threads are woken up (which is another thing to > consider, more later). One thread uses ring buffer entry, the other > goes back to sleep in get_events. 
Now, how does the kernel know when > the other thread is done working on the ring buffer entry? There > might be lots of entries coming in overflowing the entire buffer. > Heck, you don't even need two threads for this scenario. Are you talking about mapped buffer or syscall interface? The former has special syscall kevent_wait(), which reports number of 'processed' events and first processed number, so kernel can remove all appropriate events. The latter is described above - kqueue_dequeue_ready() is atomic, so that event will be removed from the ready queue and optionally from the whole kevent tree. It is possible to work with both interfaces at the same time, since mapped buffer contains a copy of the event, which is potentially freed and processed by other thread. Actually I do not like idea of mapped ring anyway, since if application uses a lot of events, it will batch them into big chunks, so syscall overhead is negligible, if application uses small number of events, syscalls will be rare and will not hurt performance. > When I was thinking about this (and discussing it in Ottawa) I was > always assuming that we have a status field in the ring buffer entry > which lets the userlevel code indicate whether the entry is free again > or not. This requires a writable mapping, yes, and potentially causes > cache line ping-pong. I think Zach mentioned he has some ideas about > this. As far as I can see, there are no other ideas on how to implement ring buffer, so I did it like I wanted. It has some limitation indeed, but since I do not see any other code, how can I say what is better or worse? > As for the multiple thread wakeup, I mentioned this before. We have > to avoid the trampling herd problem. We cannot wakeup all waiters. > But we also cannot assume that, without protocols, waking up just one > for each available entry is sufficient. So the first question is: > what is the current policy? 
It is a good practice to _not_ share the same queue between a lot of threads. Currently all waiters are awakened. > >AIO was removed from patchset by request of Cristoph. > >Timers, network AIO, fs AIO, socket nortifications and poll/select > >events work well with existing structures. > > Well, excuse me if I don't take your word for it. I agree, the AIO > code should not be submitted along with this. The same for any other > code using the event handling. But we need to check whether the > interface is generic enough to accomodate them in a way which actually > makes sense. Again, think highly threaded processes or multiple > processes sharing the same event queue. You missed the point. I implemented _all_ above and it does work. Although it was removed from submission patchset. You can find all patches on kevent homepage, they were posted to lkml@ and netdev@ too many times to miss them. > >It is even possible to create variable sized kevents - each kevent > >contain pointer to user's data, which can be considered as pointer to > >additional area (it's size kernel implementation for given k
Re: [take19 0/4] kevent: Generic event handling mechanism.
On Wed, Oct 04, 2006 at 10:20:44AM -0700, Ulrich Drepper ([EMAIL PROTECTED]) wrote:
> Evgeniy Polyakov wrote:
> > It is completely possible to do what you describe without special
> > syscall parameters.
>
> First of all, I don't see how this is efficiently possible. The mask
> might change from call to call.

And you can add/remove signal events using the existing kevent API between
calls.

> Second, hasn't it sunk in that inventing new ways to pass parameters is
> bad? Programmers don't want to learn new ways for every new interface.
> Reuse is good!

And creating special cases for usual events is bad. There is a unified way
to deal with events in kevent - add/remove/modify/wait on them; signals are
just usual events.

> This applies to the signal mask here.
>
> But there is another parameter falling into that category and I meant to
> mention it before: the timeout value. All other calls except poll and
> especially all modern interfaces use a timespec pointer. This is the
> way times are kept in userland code. Don't try to force people to do
> something else.
>
> Using a timespec also has the advantage that we can add an absolute
> timeout value mode (optional) instead of the relative timeout value.
>
> In this context, we should/must be able to specify which clock the
> timeout is for (not as part of the wait call, but another control
> operation perhaps). It's important to distinguish between
> CLOCK_REALTIME and CLOCK_MONOTONIC. Both have their use.

I think you wanted to say that 'all event mechanisms except the most
commonly used poll/select/epoll use timespec'. I designed it to be similar
to poll(), which is a really good interface. The nature of waiting is to
wait for some time, so I put that 'some time' there.

> --
> ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
>

--
	Evgeniy Polyakov
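The two timeout styles argued over in this message can be related with a small userspace sketch: a poll()-style millisecond count converted to a relative timespec, and a relative timespec turned into an absolute deadline on a chosen clock. The helper names `ms_to_timespec()` and `deadline_from_now()` are hypothetical; only `clock_gettime()` and the clock IDs are standard POSIX. None of this is kevent API.

```c
#define _POSIX_C_SOURCE 200112L
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Convert a non-negative poll()-style timeout in milliseconds into a
 * relative timespec. A negative "wait forever" value must be handled
 * by the caller before calling this.
 */
struct timespec ms_to_timespec(long ms)
{
    struct timespec ts;
    ts.tv_sec  = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    return ts;
}

/*
 * Turn a relative timeout into an absolute deadline on the given clock
 * (for example CLOCK_REALTIME or CLOCK_MONOTONIC).
 * Returns 0 on success, -1 if the clock cannot be read.
 */
int deadline_from_now(clockid_t clock, const struct timespec *rel,
                      struct timespec *out)
{
    struct timespec now;

    if (clock_gettime(clock, &now) != 0)
        return -1;
    out->tv_sec  = now.tv_sec + rel->tv_sec;
    out->tv_nsec = now.tv_nsec + rel->tv_nsec;
    if (out->tv_nsec >= NSEC_PER_SEC) {   /* carry into seconds */
        out->tv_sec++;
        out->tv_nsec -= NSEC_PER_SEC;
    }
    return 0;
}
```

A deadline computed against CLOCK_MONOTONIC is unaffected by wall-clock adjustments, which is one reason the choice of clock matters for an absolute timeout mode.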
Re: [RFC] Disable addrconf on ~multicast interfaces?
This patch will break multicast forwarding using the pim6 daemon. This
daemon creates an interface called pim6reg which is not MULTICAST enabled
but needs to be configured by addrconf to get ff00::/8 and fe80::/64 routes.
This is required since the route lookup process has been enforced to
strictly match input interfaces for linklocal and multicast packets.

Regards,
JP

On Thu, Oct 05, 2006 at 04:35:49PM +1000, Herbert Xu wrote:
> Pekka Savola <[EMAIL PROTECTED]> wrote:
> > On Thu, 5 Oct 2006, Herbert Xu wrote:
> >> Are there any non-multicast interfaces that require addrconf?
> >> In other words, what does the following patch break :)
> >
> > Point-to-point (or NOARP) interfaces such as tunnels. I'm not sure
> > what are the right flags to check..
>
> Tunnels shouldn't even get into that function so they aren't
> affected. Are there any Ethernet-like interfaces which do not
> set IFF_MULTICAST yet still require addrconf?
>
> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu <[EMAIL PROTECTED]>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] Network Events Connector
On Thu, Oct 05, 2006 at 03:10:02AM +0200, Samir Bellabes ([EMAIL PROTECTED]) wrote:
> > You can also extend your module to be more generic and send all (or just
> > requested in config) state changes for all protocols (or those checked
> > in config).
>
> Ok, so the next step now is to target all state changes for all
> protocols, *but* send only the states asked dynamically from the
> userspace, using the userspace-to-kernel's way of the netlink.
> What do you think about that ?

That sounds good, but as David mentioned, if there are other good
possibilities to do so, there is no need to invent a new one (although
sometimes it is much better to reinvent the wheel, if the existing one is
square).

> >> > Btw, you could also create netlink/connector based firewall rules
> >> > update, I think people with hundreds of rules in one table will bless
> >> > you after that.
> >>
> >> This is the real goal, using ipset - http://ipset.netfilter.org/
> >> With this we can easily create a unique rule for iptables, and then
> >> add/remove port from the 'set' involved.
> >
> > It is not the same as create and update existing rules.
> > I think hipac project uses feature of fast rules update.
> > It is quite major break for existing iptables, but it should be
> > eventually done...
>
> Ok now i understand clearly your point.
> But we are a bit far from the initial idea, even if it could be really
> good to do that. First, let's code the initial idea.

Agree.

--
	Evgeniy Polyakov
Re: [PATCH] sky2: incorrect length on receive packets
applied to #upstream-fixes
Re: [PATCH 1/5] ibmveth: Harden driver initilisation
applied 1-5 to #upstream-fixes
Re: [PATCH] mv643xx_eth: Fix ethtool stats
applied to #upstream-fixes
Re: [patch 1/6] 2.6.18: sb1250-mac: Broadcom PHY support
Maciej W. Rozycki wrote:
> Hello,
> [...]
> Please consider.
>
>   Maciej

Please don't include this in the patch description. It must be hand-edited
out, before applying with git-applymbox. All comments should be placed AFTER
the "---" separator, which terminates the patch description.

Applied patches 1-3, patch #4 failed due to drivers/net/Kconfig breakage
Re:
Jay Vosburgh wrote:
> From: Karsten Keil <[EMAIL PROTECTED]>
>
> In bond_alb_monitor the bond->curr_slave_lock write lock is taken and
> then dev_set_promiscuity may be called, which can take some time,
> depending on the network HW. If a network IRQ for this card comes in,
> the softirq handler may try to deliver more packets, which ends up in a
> request for the read lock of bond->curr_slave_lock -> deadlock.
> This issue was found by a test lab during network stress tests. This
> patch disables the softirq handler for this case and solves the issue.
>
> Signed-off-by: Karsten Keil <[EMAIL PROTECTED]>
> Acked-by: Jay Vosburgh <[EMAIL PROTECTED]>

applied, though note that your email was slightly corrupted. It included
_two_ Subject headers, making the email non-compliant with RFC822.

	Jeff
Re: [patch 1/6] 2.6.18: sb1250-mac: Broadcom PHY support
Jeff Garzik wrote:
> Maciej W. Rozycki wrote:
>> Hello,
>> [...]
>> Please consider.
>>
>>   Maciej
>
> Please don't include this in the patch description. It must be
> hand-edited out, before applying with git-applymbox. All comments should
> be placed AFTER the "---" separator, which terminates the patch
> description.
>
> Applied patches 1-3, patch #4 failed due to drivers/net/Kconfig breakage

Also, in your email subject line, the kernel version should be included in
the [PATCH...] brackets.

Please see http://linux.yyz.us/patch-format.html and
Documentation/SubmittingPatches for more info.

	Jeff
Re: [PATCH] b44: fix multicast with >32 groups
Bill Helfinstine wrote:
> The b44 driver has a bug where if there are more than
> B44_MCAST_TABLE_SIZE groups in the dev->mc_list, it will only listen to
> the first B44_MCAST_TABLE_SIZE that it sees.
>
> This patch makes the driver go into RXCONFIG_ALLMULTI mode if there are
> more than B44_MCAST_TABLE_SIZE groups being subscribed to.
>
> This patch is against 2.6.18, b44.c version 1.01.
>
> Signed-off-by: Bill Helfinstine <[EMAIL PROTECTED]>

applied manually, due to whitespace damage
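The fallback described in the patch summary can be illustrated with a small userspace sketch. It is not the b44 driver code: `MCAST_TABLE_SIZE` stands in for B44_MCAST_TABLE_SIZE with the value 32 taken from the subject line, and `struct rx_filter` and `choose_rx_filter()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for B44_MCAST_TABLE_SIZE; 32 is taken from the subject line. */
#define MCAST_TABLE_SIZE 32

/* Hypothetical summary of how the receive filter ends up programmed. */
struct rx_filter {
    bool   allmulti;        /* accept every multicast frame */
    size_t table_entries;   /* exact-match entries actually programmed */
};

/*
 * If the subscribed groups fit in the hardware table, program them
 * exactly. Otherwise fall back to all-multicast mode instead of
 * silently ignoring the groups that do not fit.
 */
struct rx_filter choose_rx_filter(size_t mc_count)
{
    struct rx_filter f = { false, 0 };

    if (mc_count > MCAST_TABLE_SIZE)
        f.allmulti = true;
    else
        f.table_entries = mc_count;
    return f;
}
```

The trade-off is the usual one: all-multicast mode passes unwanted groups up to the stack for software filtering, but no subscribed group is lost.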
Re: [PATCH 2.6.18] AT91RM9200 Ethernet update
ACK, but patch doesn't apply to 2.6.18
Re: d80211: ieee80211_hw handlers in atomic context
On Wed, 04 Oct 2006 18:51:40 +0200, Jan Kiszka wrote:
> Ok, I'm not promising success and I'm going to duck immediately if
> someone else feels like working on it, but I could try to patch in this
> direction.

Your patches are welcome!

> Now there just remains my precautious question if there are other
> services in the ieee_80211_hw interface that may conflict with sleeping
> USB drivers. What about specifying the possible contexts in
> include/net/d80211.h?

Yes, that makes sense. Feel free to send a patch :-)

Thanks,

 Jiri

--
Jiri Benc
SUSE Labs
Re: d80211: ieee80211_hw handlers in atomic context
On Wed, 4 Oct 2006 19:22:38 +0200, Ivo van Doorn wrote:
> Well another point of concern for me is the TSF handling, those handlers
> are called from interrupt context as well, and also deliver problems for
> the USB drivers in case of adhoc mode.

Where is the problem with the tsf handlers? get_tsf is not called at all
(unless CONFIG_D80211_IBSS_DEBUG is set; well, that raises the question of
why the function exists in the first place), and reset_tsf returns void.

 Jiri

--
Jiri Benc
SUSE Labs
Re: [take19 1/4] kevent: Core files.
On Thursday 05 October 2006 10:57, Evgeniy Polyakov wrote:

> Well, it is possible to create /sys/proc entry for that, and even now
> userspace can grow mapping ring until it is forbidden by kernel, which
> means limit is reached.

No need for yet another /sys/proc entry.

Right now, I (for example) may have a use for Generic event handling, but for
a program that needs XXX.XXX handles, and about XX.XXX events per second.

Right now, this program uses epoll, and reaches no limit at all, once you
pass the "ulimit -n", and other kernel wide tunes of course, not related to
epoll.

With your current kevent, I cannot switch to it, because of hardcoded limits.

I may be wrong, but what is currently missing for me is :

- No hardcoded limit on the max number of events. (A process that can open
  XXX.XXX files should be allowed to open a kevent queue with at least
  XXX.XXX events). Right now it's not clear what happens IF the current
  limit is reached.

- In order to avoid touching the whole ring buffer, it might be good to be
  able to reset the indexes to the beginning when ring buffer is empty. (So
  if the user land is responsive enough to consume events, only first pages
  of the mapping would be used : that saves L1/L2 cpu caches)

A plus would be

- A working/usable mmap ring buffer implementation, but I think it's not
  mandatory. System calls are not that expensive, especially if you can
  batch XX events per syscall (like epoll). Nice thing with a ring buffer is
  that we touch fewer cache lines than say epoll, which has a lot of linked
  structures.

About mmap, I think you might want a hybrid thing :

One writable page where userland can write its index, (and hold one or more
futex shared by kernel) (with appropriate thread locking in case multiple
threads want to dequeue events). In fast path, no syscalls are needed to
maintain this user index.

XXX readonly pages (for user, but r/w for kernel), where kernel writes its
own index, and events of course.

Using separate cache lines avoids false sharing : kernel can update its own
index and events without having to pay the price of cache line ping pongs.
It could use futex infrastructure to wake up one thread 'only' instead of
all threads waiting for an event.

Eric
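A rough userspace sketch of the hybrid layout proposed in this message, under several assumptions: a single consumer, no memory barriers, no futex, and ordinary structs standing in for the mapped pages. `struct user_page`, `struct kernel_pages`, `struct ring_event` and `ring_dequeue()` are hypothetical names, and the 64-byte cache line is an assumed value.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed cache line size */
#define RING_SLOTS 8    /* power of two, so free-running indexes wrap cleanly */

struct ring_event { uint32_t id; uint32_t type; };

/* Stands in for the page that only userland writes. */
struct user_page {
    uint32_t user_idx;                      /* next slot userland will read */
    char     pad[CACHE_LINE - sizeof(uint32_t)];
};

/* Stands in for the pages that only the kernel writes (read-only to the user). */
struct kernel_pages {
    uint32_t kernel_idx;                    /* next slot the kernel will fill */
    char     pad[CACHE_LINE - sizeof(uint32_t)];
    struct ring_event ev[RING_SLOTS];       /* starts on its own cache line */
};

/*
 * Consume one event without a syscall.
 * Returns 1 and fills *out if an event was available, 0 if the ring is empty.
 * Only user_idx is written here, so the kernel-owned cache lines stay clean.
 */
int ring_dequeue(struct user_page *u, const struct kernel_pages *k,
                 struct ring_event *out)
{
    if (u->user_idx == k->kernel_idx)
        return 0;
    *out = k->ev[u->user_idx % RING_SLOTS];
    u->user_idx++;
    return 1;
}
```

Each side writes only its own index, which is the point of the proposal: neither writer invalidates the cache line the other one is updating.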
Re: [take19 1/4] kevent: Core files.
On Thu, Oct 05, 2006 at 11:56:24AM +0200, Eric Dumazet ([EMAIL PROTECTED]) wrote:
> On Thursday 05 October 2006 10:57, Evgeniy Polyakov wrote:
>
> > Well, it is possible to create /sys/proc entry for that, and even now
> > userspace can grow mapping ring until it is forbidden by kernel, which
> > means limit is reached.
>
> No need for yet another /sys/proc entry.
>
> Right now, I (for example) may have a use for Generic event handling, but
> for a program that needs XXX.XXX handles, and about XX.XXX events per
> second.
>
> Right now, this program uses epoll, and reaches no limit at all, once you
> pass the "ulimit -n", and other kernel wide tunes of course, not related
> to epoll.
>
> With your current kevent, I cannot switch to it, because of hardcoded
> limits.
>
> I may be wrong, but what is currently missing for me is :
>
> - No hardcoded limit on the max number of events. (A process that can open
> XXX.XXX files should be allowed to open a kevent queue with at least
> XXX.XXX events). Right now it's not clear what happens IF the current
> limit is reached.

This forces overflows in the fixed-size memory-mapped buffer.
If we remove the memory-mapped buffer or allow overflows (and thus skipped
entries), kevent can easily scale to those limits (tested with xx.xxx events
though).

> - In order to avoid touching the whole ring buffer, it might be good to be
> able to reset the indexes to the beginning when ring buffer is empty. (So
> if the user land is responsive enough to consume events, only first pages
> of the mapping would be used : that saves L1/L2 cpu caches)

And what happens when there are 3 empty slots at the beginning and we need
to put 4 ready events there?

> A plus would be
>
> - A working/usable mmap ring buffer implementation, but I think it's not
> mandatory. System calls are not that expensive, especially if you can
> batch XX events per syscall (like epoll). Nice thing with a ring buffer is
> that we touch fewer cache lines than say epoll, which has a lot of linked
> structures.
>
> About mmap, I think you might want a hybrid thing :
>
> One writable page where userland can write its index, (and hold one or
> more futex shared by kernel) (with appropriate thread locking in case
> multiple threads want to dequeue events). In fast path, no syscalls are
> needed to maintain this user index.
>
> XXX readonly pages (for user, but r/w for kernel), where kernel writes its
> own index, and events of course.

The problem is in that xxx pages - how many can we eat per kevent
descriptor? It is pinned memory and thus it is possible to have a DoS.
If xxx above is not enough to store all events, we will have
yet-another-broken behaviour like rt-signal queue overflow.

> Using separate cache lines avoids false sharing : kernel can update its
> own index and events without having to pay the price of cache line ping
> pongs. It could use futex infrastructure to wake up one thread 'only'
> instead of all threads waiting for an event.
>
>
> Eric

--
	Evgeniy Polyakov
Re: [take19 1/4] kevent: Core files.
On Thursday 05 October 2006 12:21, Evgeniy Polyakov wrote:
> On Thu, Oct 05, 2006 at 11:56:24AM +0200, Eric Dumazet ([EMAIL PROTECTED])

> > I may be wrong, but what is currently missing for me is :
> >
> > - No hardcoded limit on the max number of events. (A process that can
> > open XXX.XXX files should be allowed to open a kevent queue with at least
> > XXX.XXX events). Right now it's not clear what happens IF the current
> > limit is reached.
>
> This forces overflows in the fixed-size memory-mapped buffer.
> If we remove the memory-mapped buffer or allow overflows (and thus skipped
> entries), kevent can easily scale to those limits (tested with xx.xxx
> events though).

What is missing or not obvious is : If events are skipped because of
overflows, what happens ? Connections stuck forever ? Hope that everything
will restore itself ? Is the kernel able to SIGNAL this problem to user
land ?

> > - In order to avoid touching the whole ring buffer, it might be good to
> > be able to reset the indexes to the beginning when ring buffer is empty.
> > (So if the user land is responsive enough to consume events, only first
> > pages of the mapping would be used : that saves L1/L2 cpu caches)
>
> And what happens when there are 3 empty slots at the beginning and we need
> to put 4 ready events there?

Re-read what I said : when ring buffer is empty.

When the ring buffer is empty, the kernel can reset the index right before
adding XX new events. You read "3 events consumed"; I said : when the whole
ring buffer is empty, because all previous events were consumed by user
land, then we can reset indexes to 0.

> > A plus would be
> >
> > - A working/usable mmap ring buffer implementation, but I think it's not
> > mandatory. System calls are not that expensive, especially if you can
> > batch XX events per syscall (like epoll). Nice thing with a ring buffer
> > is that we touch fewer cache lines than say epoll, which has a lot of
> > linked structures.
> >
> > About mmap, I think you might want a hybrid thing :
> >
> > One writable page where userland can write its index, (and hold one or
> > more futex shared by kernel) (with appropriate thread locking in case
> > multiple threads want to dequeue events). In fast path, no syscalls are
> > needed to maintain this user index.
> >
> > XXX readonly pages (for user, but r/w for kernel), where kernel write its
> > own index, and events of course.
>
> The problem is in that xxx pages - how many can we eat per kevent
> descriptor? It is pinned memory and thus it is possible to have a DoS.
> If xxx above is not enough to store all events, we will have
> yet-another-broken behaviour like rt-signal queue overflow.
>

Re-read : I have a process that has the right to open XXX.XXX handles,
allocating XXX.XXX tcp sockets, dentries, files structures, inodes, epoll
events; it's obviously already a DOS risk, but controlled by 'ulimit -n'

Allocating XXX.XXX * (32 or 64) bytes is a win if I can zap epoll structures
(currently more than 256 bytes per event)

epoll structures are pinned too... what's wrong with that ?

# egrep "filp|poll|TCP|dentries|sock_inode" /proc/slabinfo |cut -c1-50
tw_sock_TCP         1302   2200    192   20    1 :
request_sock_TCP    2046   4260    128   30    1 :
TCP               151509 196910   1472    5    2 :
eventpoll_pwq     146718 199439     72   53    1 :
eventpoll_epi     146718 199360    192   20    1 :
sock_inode_cache  149182 197940    640    6    1 :
filp              149537 202515    256   15    1 :

If you want to protect from DOS, just use ulimit -n 100

Eric
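The "reset the indexes when the ring buffer is empty" idea from this exchange can be shown with a single-threaded sketch. It is not kevent code: `struct ring`, `ring_push()` and `ring_pop()` are hypothetical, and here the producer rewrites both indexes, which a real kernel/userland split would have to coordinate differently.

```c
#include <stddef.h>

#define RING_SLOTS 1024

struct ring {
    size_t head;               /* next slot the consumer reads */
    size_t tail;               /* next slot the producer writes */
    int    slot[RING_SLOTS];
};

/*
 * Producer side. When the ring is empty, restart from slot 0 so that a
 * responsive consumer keeps reusing the first slots (and the first
 * pages) of the mapping. Returns 0 on success, -1 if the ring is full.
 */
int ring_push(struct ring *r, int ev)
{
    size_t next;

    if (r->head == r->tail) {  /* empty: reset both indexes */
        r->head = 0;
        r->tail = 0;
    }
    next = (r->tail + 1) % RING_SLOTS;
    if (next == r->head)       /* full (one slot is kept free) */
        return -1;
    r->slot[r->tail] = ev;
    r->tail = next;
    return 0;
}

/* Consumer side. Returns 0 and stores the event, or -1 if empty. */
int ring_pop(struct ring *r, int *ev)
{
    if (r->head == r->tail)
        return -1;
    *ev = r->slot[r->head];
    r->head = (r->head + 1) % RING_SLOTS;
    return 0;
}
```

The reset only triggers when every earlier event has been consumed, so it never has to fit new events into a partially used prefix; when the ring is not empty, the indexes simply keep advancing and wrap as usual.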
Re: [take19 1/4] kevent: Core files.
On Thu, Oct 05, 2006 at 12:45:03PM +0200, Eric Dumazet ([EMAIL PROTECTED]) wrote:
> On Thursday 05 October 2006 12:21, Evgeniy Polyakov wrote:
> > On Thu, Oct 05, 2006 at 11:56:24AM +0200, Eric Dumazet ([EMAIL PROTECTED]) wrote:
> > > I may be wrong, but what is currently missing for me is:
> > >
> > > - No hardcoded limit on the max number of events. (A process that can
> > > open XXX.XXX files should be allowed to open a kevent queue with at least
> > > XXX.XXX events.) Right now it's not clear what happens IF the current
> > > limit is reached.
> >
> > This leads to overflows in the fixed-size memory-mapped buffer.
> > If we remove the memory-mapped buffer or allow overflows (and thus
> > skipped entries), kevent can easily scale to those limits (tested with
> > xx.xxx events though).
>
> What is missing or not obvious is: if events are skipped because of
> overflows, what happens? Are connections stuck forever? Do we hope that
> everything will restore itself? Is the kernel able to SIGNAL this problem
> to userland?

Existing code does not overflow by design, but can consume a lot of
memory. I was talking about the case when there is some limit on the
number of entries put into the mapped buffer.

> > > - In order to avoid touching the whole ring buffer, it might be good to
> > > be able to reset the indexes to the beginning when the ring buffer is empty.
> > > (So if userland is responsive enough to consume events, only the first
> > > pages of the mapping would be used: that saves L1/L2 CPU caches.)
> >
> > And what happens when there are 3 empty entries at the beginning and we
> > need to put 4 ready events there?
>
> Re-read what I said: when the ring buffer is empty.
>
> When the ring buffer is empty, the kernel can reset the index right before
> adding XX new events. You read "3 events consumed"; I said: when the whole
> ring buffer is empty, because all previous events were consumed by userland,
> then we can reset the indexes to 0.

It is the same.
What if the ring buffer has grown up to 3 entries and is now empty, and we
need to put 4 entries there? Grow it again?
It can be done, easily, but it looks like a workaround, not a solution.
And it is highly unlikely that the ring can be empty in a situation when
there are a lot of events.

> > > A plus would be
> > >
> > > - A working/usable mmap ring buffer implementation, but I think it's not
> > > mandatory. System calls are not that expensive, especially if you can
> > > batch XX events per syscall (like epoll). The nice thing with a ring buffer
> > > is that we touch fewer cache lines than, say, epoll, which has a lot of
> > > linked structures.
> > >
> > > About mmap, I think you might want a hybrid thing:
> > >
> > > One writable page where userland can write its index (and hold one or
> > > more futexes shared with the kernel), with appropriate thread locking in
> > > case multiple threads want to dequeue events. In the fast path, no
> > > syscalls are needed to maintain this user index.
> > >
> > > XXX read-only pages (for the user, but r/w for the kernel), where the
> > > kernel writes its own index, and events of course.
> >
> > The problem is in those xxx pages - how many can we eat per kevent
> > descriptor? It is pinned memory and thus it is possible to have a DoS.
> > If xxx above is not enough to store all events, we will have
> > yet-another-broken behaviour like rt-signal queue overflow.
>
> Re-read: I have a process that has the right to open XXX.XXX handles,
> allocating XXX.XXX TCP sockets, dentries, file structures, inodes and epoll
> events. It's obviously already a DoS risk, but one controlled by 'ulimit -n'.
>
> Allocating XXX.XXX * (32 or 64) bytes is a win if I can zap the epoll
> structures (currently more than 256 bytes per event).
>
> epoll structures are pinned too... what's wrong with that?
>
> # egrep "filp|poll|TCP|dentries|sock_inode" /proc/slabinfo |cut -c1-50
> tw_sock_TCP         1302   2200    192   20    1 :
> request_sock_TCP    2046   4260    128   30    1 :
> TCP               151509 196910   1472    5    2 :
> eventpoll_pwq     146718 199439     72   53    1 :
> eventpoll_epi     146718 199360    192   20    1 :
> sock_inode_cache  149182 197940    640    6    1 :
> filp              149537 202515    256   15    1 :
>
> If you want to protect against DoS, just use ulimit -n 100

epoll() does not have mmap.
The problem is not how many events can be put into the kernel, but how
many of them can be put into the mapped buffer.
There is no problem if mmap is turned off.

> Eric

--
Evgeniy Polyakov
Re: [PATCH 2.6.19-rc1] ehea bug fix (port state notification, default queue sizes)
Which patch is to be applied first? You failed to include an order, as
described by http://linux.yyz.us/patch-format.html and
Documentation/SubmittingPatches.

Also, stuff like "hi Jeff" and "Thanks, Jan-Bernd" must be hand-edited out
before patch application. All comments not intended to be DIRECTLY copied
into the kernel changeset description should follow the "---" separator.

Jeff
Re: [take19 1/4] kevent: Core files.
On Thursday 05 October 2006 12:55, Evgeniy Polyakov wrote:
> On Thu, Oct 05, 2006 at 12:45:03PM +0200, Eric Dumazet ([EMAIL PROTECTED]) wrote:
> >
> > What is missing or not obvious is: if events are skipped because of
> > overflows, what happens? Are connections stuck forever? Do we hope that
> > everything will restore itself? Is the kernel able to SIGNAL this problem
> > to userland?
>
> Existing code does not overflow by design, but can consume a lot of
> memory. I was talking about the case when there is some limit on the
> number of entries put into the mapped buffer.

You still don't answer my question. Please answer the question.
Recap: you have a maximum number of events queued. A network message comes
and the kernel wants to add another event. It cannot, because the limit is
reached. How does the user program know that this problem was hit?

> It is the same.
> What if the ring buffer has grown up to 3 entries and is now empty, and we
> need to put 4 entries there? Grow it again?
> It can be done, easily, but it looks like a workaround, not a solution.
> And it is highly unlikely that the ring can be empty in a situation when
> there are a lot of events.

I am not talking about re-allocating the ring buffer. I don't mind
allocating a big enough buffer at startup.

Say you have allocated a ring buffer of 1024*1024 entries.
Then you queue 100 events per second, and dequeue them immediately.
There is no need to blindly use all 1024*1024 slots in the ring buffer by
doing index = (index+1)%(1024*1024)

> epoll() does not have mmap.
> The problem is not how many events can be put into the kernel, but how
> many of them can be put into the mapped buffer.
> There is no problem if mmap is turned off.

So zap mmap() support completely, since it is not usable at all. We won't
discuss it.
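The index-reset idea in the message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kevent code: the names `ev_ring`, `ring_push`, `ring_pop` and `highest_slot_touched` are hypothetical, the ring is shrunk to 1024 slots, and the locking that a buffer shared between kernel and userland would need is ignored entirely.

```c
#include <assert.h>

#define RING_SLOTS 1024u   /* hypothetical size; the mail talks about 1024*1024 */

struct ev_ring {
    unsigned int head;     /* next slot the producer writes */
    unsigned int tail;     /* next slot the consumer reads */
    unsigned int count;    /* events queued but not yet consumed */
    int slots[RING_SLOTS];
};

/* Queue one event. Returns the slot index used, or -1 if the ring is full. */
int ring_push(struct ev_ring *r, int ev)
{
    unsigned int slot;

    if (r->count == RING_SLOTS)
        return -1;
    if (r->count == 0)             /* everything was consumed: rewind to slot 0 */
        r->head = r->tail = 0;
    slot = r->head;
    r->slots[slot] = ev;
    r->head = (r->head + 1) % RING_SLOTS;
    r->count++;
    return (int)slot;
}

/* Dequeue the oldest event into *ev. Returns 0, or -1 if the ring is empty. */
int ring_pop(struct ev_ring *r, int *ev)
{
    if (r->count == 0)
        return -1;
    *ev = r->slots[r->tail];
    r->tail = (r->tail + 1) % RING_SLOTS;
    r->count--;
    return 0;
}

/*
 * Queue `rounds` bursts of `burst` events (burst <= RING_SLOTS), draining the
 * ring after each burst, and report the highest slot index ever written.
 */
int highest_slot_touched(int rounds, int burst)
{
    struct ev_ring r = {0};
    int highest = -1, ev;

    for (int i = 0; i < rounds; i++) {
        for (int j = 0; j < burst; j++) {
            int slot = ring_push(&r, i * burst + j);
            assert(slot >= 0);
            if (slot > highest)
                highest = slot;
        }
        while (ring_pop(&r, &ev) == 0)
            ;                      /* a responsive consumer drains everything */
    }
    return highest;
}
```

With a consumer that keeps up, only the first `burst` slots are ever written, which is the cache-footprint argument made above. When the consumer lags and the ring never becomes empty, the model degrades to an ordinary wrapping ring.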
Re: [take19 1/4] kevent: Core files.
On Thu, Oct 05, 2006 at 02:09:31PM +0200, Eric Dumazet ([EMAIL PROTECTED]) wrote:
> On Thursday 05 October 2006 12:55, Evgeniy Polyakov wrote:
> > On Thu, Oct 05, 2006 at 12:45:03PM +0200, Eric Dumazet ([EMAIL PROTECTED]) wrote:
> > >
> > > What is missing or not obvious is: if events are skipped because of
> > > overflows, what happens? Are connections stuck forever? Do we hope that
> > > everything will restore itself? Is the kernel able to SIGNAL this problem
> > > to userland?
> >
> > Existing code does not overflow by design, but can consume a lot of
> > memory. I was talking about the case when there is some limit on the
> > number of entries put into the mapped buffer.
>
> You still don't answer my question. Please answer the question.
> Recap: you have a maximum number of events queued. A network message comes
> and the kernel wants to add another event. It cannot, because the limit is
> reached. How does the user program know that this problem was hit?

The existing design does not allow overflow.
If an event was added into the queue (for example, the user requested
notification when new data has arrived), it is guaranteed that there will
be a place to put that event into the mapped buffer when it is ready.
If the user wants to add another event (for example, after accept() the
user wants to add another socket with a request for notification about
data arriving on that socket), it can fail though.
This limit is introduced only because of the mmap buffer.

> > It is the same.
> > What if the ring buffer has grown up to 3 entries and is now empty, and we
> > need to put 4 entries there? Grow it again?
> > It can be done, easily, but it looks like a workaround, not a solution.
> > And it is highly unlikely that the ring can be empty in a situation when
> > there are a lot of events.
>
> I am not talking about re-allocating the ring buffer. I don't mind
> allocating a big enough buffer at startup.
>
> Say you have allocated a ring buffer of 1024*1024 entries.
> Then you queue 100 events per second, and dequeue them immediately.
> There is no need to blindly use all 1024*1024 slots in the ring buffer by
> doing index = (index+1)%(1024*1024)

But what if they are not dequeued immediately?
What if the rate is high and, while one tries to dequeue, the system adds
more events?

> > epoll() does not have mmap.
> > The problem is not how many events can be put into the kernel, but how
> > many of them can be put into the mapped buffer.
> > There is no problem if mmap is turned off.
>
> So zap mmap() support completely, since it is not usable at all. We won't
> discuss it.

The initial implementation did not have it.
But I was requested to do it, and it is ready now.
No one likes it, but no one provides an alternative implementation.
We are stuck.

--
Evgeniy Polyakov
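The "no overflow by design" guarantee described in the reply above amounts to reserving a buffer slot when an event is registered rather than when it fires. The following user-space model is a sketch of that accounting only; `kq_model`, `kq_add` and `kq_mark_ready` are hypothetical names, and it assumes each registered event can occupy at most one slot at a time, which the thread does not spell out.

```c
#include <assert.h>

struct kq_model {
    unsigned int capacity;    /* slots available in the mapped buffer */
    unsigned int registered;  /* events the user asked to be notified about */
    unsigned int ready;       /* registered events currently in the buffer */
};

/*
 * Register interest in a new event. This is the only place that can fail:
 * once every slot is reserved, further registrations are refused.
 */
int kq_add(struct kq_model *q)
{
    if (q->registered == q->capacity)
        return -1;
    q->registered++;
    return 0;
}

/*
 * An already registered event becomes ready. Because its slot was reserved
 * by kq_add(), there is always room for it in the buffer.
 */
int kq_mark_ready(struct kq_model *q)
{
    if (q->ready == q->registered)
        return -1;            /* nothing registered is still pending */
    q->ready++;
    assert(q->ready <= q->capacity);
    return 0;
}
```

In this model the failure Eric asks about surfaces synchronously, as an error from the registration call, instead of as a silently dropped notification later on.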
Re: [RFC] cfg80211 and nl80211
On Wed, Oct 04, 2006 at 01:57:38PM -0400, Dan Williams wrote:
> * None
>   - Crypto: None
>   - 802.11 Auth: Open System
>
> * Static WEP
>   - Keys: up to 4 group keys
>   - Crypto: WEP-40, WEP-104, WEP-152, WEP-256
>   - 802.11 Auth: Open System or Shared Key
>   - Key Mgmt/Auth: none
>
> * Dynamic WEP (LEAP?)
>   - Keys: up to 4 group keys
>   - Crypto: WEP-40, WEP-104, WEP-152, WEP-256
>   - 802.11 Auth: Open System (only?)
>   - Key Mgmt/Auth: IEEE 802.1x with LEAP or EAP
>
> * WPA PSK
>   - Keys: pairwise & group
>   - Use WPA IEs
>   - 802.11 Auth: Open System
>   - Crypto: TKIP or CCMP
>   - Key Mgmt/Auth: WPA-PSK (elided 802.1x)
>
> * WPA Enterprise
>   - Keys: pairwise & group
>   - Use WPA IEs
>   - 802.11 Auth: Open System
>   - Crypto: TKIP or CCMP
>   - Key Mgmt/Auth: WPA-EAP (full 802.1x)
>
> * WPA2 PSK
>   - Keys: pairwise & group
>   - Use RSN IEs
>   - 802.11 Auth: Open System
>   - Crypto: TKIP or CCMP
>   - Key Mgmt/Auth: WPA-PSK (elided 802.1x)
>
> * WPA2 Enterprise
>   - Keys: pairwise & group
>   - Use RSN IEs
>   - 802.11 Auth: Open System
>   - Crypto: TKIP or CCMP
>   - Key Mgmt/Auth: WPA-EAP (full 802.1x)

This strikes me as overly complicated; to figure out what's necessary you
shouldn't be looking at the WEXT API -- the 802.11 standards are all you
need, and they lay things out fairly clearly, complete with rx/tx path
flowcharts. :)

Essentially, you have two crypto paradigms, pre-802.11i and post-802.11i.
(WPA uses the latter, and LEAP/CCX v1 is mostly the former; newer ones use
the latter.)

(Leave out the RSNIE, AuthType and KeyMgmt stuff; while they're used in the
actual key negotiation/derivation, they're separate problems and have no
bearing on the crypto layer. From the driver's perspective the RSNIE is
just an opaque blob to be appended to beacons, probe responses and
[re]assoc frames, KeyMgmt is purely a matter for the
authenticator/supplicant, and AuthType is just a toggle that happens to be
off for post-802.11i, although LEAP v1 adds some complications there..)

The old way:

 * Four "default" keys. (used globally)
 * PrivacyInvoked
 * SetDefaultKeyIndex

The new way:

 * PrivacyInvoked
 * SetProtection (tx&|rx -- essentially "require crypto for a given macaddr")
 * SetKeyMapping (one key per macaddr)

Each key has:

 * Key type (WEP/TKIP/AES-CCMP/NONE)
 * Key length (implied, but WEP can have varying key lengths)
 * Key index (only '0' is generally used for unicast frames, but 802.11i
   requires use of simultaneous broadcast keys)
 * Macaddr (ucast addr or broadcast aka pairwise vs group)
 * RxSequence (mainly for bcast aka group keys)

It's fairly easy to implement the old stuff in terms of the new stuff, if
you assume that "if I don't have a per-sta key, just use the global/bcast
key". The 802.11i rx/tx frame path flow handles the old crypto style just
fine.

...Meanwhile.

It's foolish to ignore the 802.11 MLME. It lists out pretty much everything
that's necessary to get a working connection, and looking at its evolution
(and changes in the pipeline) shows that it's impossible to do it all
(right) the first time, and that changes, not just additions, will be
necessary.

(Did I mention that I really like how the ALSA people manage this? The
userspace-kernelspace API is effectively private; apps write to the libs,
which do the hard work of maintaining backwards compatibility as the
internals change and get new features, but now I'm really just armchair
quarterbacking, so I'll shut up now.)

> Wheee! So you basically have a bunch of buckets and you just pull shit
> out of them at random, stick it all together, and you've got a wireless
> connection :) Thank you, Cisco. Thank you, Wi-Fi Alliance.

You forgot the part about sacrificing rubber chickens with pulleys in the
middle. While hopping on one foot. Under a new moon.

Bah, it's too early in the morning to be thinking about this stuff.

 - Solomon
--
Solomon Peachy                   pizza at shaftnet dot org
Melbourne, FL                    ^^ (mail/jabber/gtalk) ^^
Quidquid latine dictum sit, altum viditur.          ICQ: 1318344
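The per-key attributes listed in the message above, and the rule "if I don't have a per-sta key, just use the global/bcast key", can be sketched as a plain C structure plus a lookup. This is an illustration only: `wlan_key`, `key_type` and `find_key` are hypothetical names, not part of cfg80211, nl80211 or any driver, and the field widths are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum key_type { KEY_NONE, KEY_WEP, KEY_TKIP, KEY_AES_CCMP };

struct wlan_key {
    enum key_type type;
    uint8_t len;          /* key length in bytes; varies for WEP */
    uint8_t index;        /* key index; 0 for most unicast use */
    uint8_t macaddr[6];   /* station address (pairwise) or broadcast (group) */
    uint64_t rx_seq;      /* receive sequence counter, mainly for group keys */
};

static const uint8_t BCAST_ADDR[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

/*
 * Pick the key to use for `peer`: a key mapped to that exact address wins,
 * otherwise fall back to the first group (broadcast) key in the table.
 * Returns NULL when neither exists.
 */
const struct wlan_key *find_key(const struct wlan_key *keys, size_t nkeys,
                                const uint8_t peer[6])
{
    const struct wlan_key *group = NULL;

    for (size_t i = 0; i < nkeys; i++) {
        if (memcmp(keys[i].macaddr, peer, 6) == 0)
            return &keys[i];
        if (group == NULL && memcmp(keys[i].macaddr, BCAST_ADDR, 6) == 0)
            group = &keys[i];
    }
    return group;
}
```

Under this model the old four-default-key scheme is just a table holding only broadcast-address entries, so every lookup takes the fallback path.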
Re: [RFC] [PATCH 3/3] enable IP multicast when bonding IPoIB devices
Jay Vosburgh wrote: Or Gerlitz <[EMAIL PROTECTED]> wrote: My understanding is that changing ifenslave and the bonding kernel code to allow for enslaving while master is not up is enough, so actually no change is needed to the sysconfig tools, correct? Incorrect. The /sbin/ifup included with sysconfig (I'm looking at version 0.31-0-15.51) has logic to set the bonding master device up prior to adding any slaves. E.g., # get up the bonding device before enslaving # if ! is_iface_up $INTERFACE; then ip link set $INTERFACE up 2>&1 # fi # enslave available slave devices; if there is none -> hard break and log MESSAGE=`/sbin/ifenslave $BONDING_OPTIONS $INTERFACE $BSINTERFACES 2>&1` For your purposes, this would cause it to register as an ethernet hardware type, not an IB type. The /sbin/ifup included with initscripts operates a little differently, but also sets the bonding master up prior to adding any slaves. OK, you are correct, i agree that the /sbin/ifup would attempt to first bring up the bonding device so it breaks my assumptions... Yes. Part of the difficulty is that the changes to the initscripts and sysconfig packages won't be compatible with versions of bonding prior to the bonding kernel changes (because older versions of bonding will refuse to add slaves if the master is down). It might require adding another API version to bonding, and modifying ifenslave to work both ways (i.e., with the current "enslave with master up" API, as well as the new "enslave with master down" API). Gee, sounds bad An alternate approach would be to undertake the more substantial task of converting the initscripts and sysconfig code to use sysfs to configure bonding. This would permit changing the logic (to add slaves while the bonding master is down, then set it up), as well as remove the current hacks (present only in sysconfig) to load the bonding module once per configured bonding interface. 
The initscripts currently don't do this (as far as I know), so it's generally only possible to have one bonding interface under initscripts control. This sounds like a good idea to get out of all these troubles... So the direction to have sysconfig and initscripts tools configure bonding by sysfs and not by the enslave program is something you were considering regardless of the needs imposed by bonding support for non ARPHRD_ETHER netdevices? and you think the distro packages owners would like this? I will look into the current methods used by sysconfig to configure bonding and see if i can come up with sketch of how to do it with sysfs. Basically, i use now my own script working with sysfs in my IPoIB bonding testing where i have followed the directions in the bonding kernel doc. Thanks again for all the coaching... Or. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: d80211: ieee80211_hw handlers in atomic context
On Thursday 05 October 2006 13:29, Jiri Benc wrote:
> On Wed, 04 Oct 2006 18:51:40 +0200, Jan Kiszka wrote:
> > Ok, I'm not promising success and I'm going to duck immediately if
> > someone else feels like working on it, but I could try to patch in this
> > direction.
>
> Your patches are welcome!
>
> > Now there just remains my precautionary question if there are other
> > services in the ieee_80211_hw interface that may conflict with sleeping
> > USB drivers. What about specifying the possible contexts in
> > include/net/d80211.h?
>
> Yes, that makes sense. Feel free to send a patch :-)

The patch is currently in testing in the rt2x00 tree, so it will be sent
to the netdev list shortly. :)

Ivo
Re: d80211: ieee80211_hw handlers in atomic context
On Thursday 05 October 2006 13:37, Jiri Benc wrote: > On Wed, 4 Oct 2006 19:22:38 +0200, Ivo van Doorn wrote: > > Well another point of concern for me is the TSF handling, those handlers > > are called > > from interrupt context as well, and also deliver problems for the USB > > drivers in case > > of adhoc mode. > > Where is a problem with tsf handlers? get_tsf is not called at all > (unless CONFIG_D80211_IBSS_DEBUG is set; well, that raises a question > why the function exists in the first place), reset_tsf returns void. Basically it comes down to this: Sep 13 12:27:34 wz4a kernel: wlan0: Creating new IBSS network, BSSID 7a:b9:60:8a:84:39 Sep 13 12:27:34 wz4a kernel: BUG: scheduling while atomic: swapper/0x0100/0 Sep 13 12:27:34 wz4a kernel: schedule+0x43/0xa84 extract_buf+0x97/0xc8 Sep 13 12:27:34 wz4a kernel: wait_for_completion+0x6a/0x9f default_wake_function+0x0/0xc Sep 13 12:27:34 wz4a kernel: usb_start_wait_urb+0x98/0xdc [usbcore] timeout_kill+0x0/0x5 [usbcore] Sep 13 12:27:34 wz4a kernel: usb_control_msg+0xc3/0xde [usbcore] rt2x00_vendor_request+0x7c/0xa6 [rt73usb] Sep 13 12:27:34 wz4a kernel: rt73usb_reset_tsf+0x30/0x59 [rt73usb] ieee80211_sta_join_ibss+0x3a/0x572 [80211] Sep 13 12:27:34 wz4a kernel: printk+0x14/0x18 ieee80211_rx_bss_add+0x88/0x90 [80211] Sep 13 12:27:34 wz4a kernel: ieee80211_sta_find_ibss+0x30e/0x366 [80211] ieee80211_sta_timer+0x0/0x18f [80211] Sep 13 12:27:34 wz4a kernel: ieee80211_sta_timer+0x7a/0x18f [80211] ieee80211_sta_timer+0x0/0x18f [80211] Sep 13 12:27:34 wz4a kernel: run_timer_softirq+0x10b/0x153 __do_softirq+0x58/0xc2 Sep 13 12:27:34 wz4a kernel: do_softirq+0x2e/0x32 do_IRQ+0x1e/0x24 Sep 13 12:27:34 wz4a kernel: common_interrupt+0x1a/0x20 acpi_processor_idle+0x18a/0x39e [processor] Sep 13 12:27:34 wz4a kernel: cpu_idle+0x8f/0xa8 start_kernel+0x355/0x35c With the compilation of d80211 the CONFIG_D80211_DEBUG is set by default, so no CONFIG_D80211_IBSS_DEBUG. 
This does not happen in rt2500usb driver, since no TSF handling is possible due to a lack of TSF registers in the device. Ivo - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: d80211: ieee80211_hw handlers in atomic context
On Thu, 5 Oct 2006 17:00:31 +0200, Ivo van Doorn wrote: > Basically it comes down to this: > > Sep 13 12:27:34 wz4a kernel: wlan0: Creating new IBSS network, BSSID > 7a:b9:60:8a:84:39 > Sep 13 12:27:34 wz4a kernel: BUG: scheduling while atomic: > swapper/0x0100/0 > Sep 13 12:27:34 wz4a kernel: schedule+0x43/0xa84 > extract_buf+0x97/0xc8 > Sep 13 12:27:34 wz4a kernel: wait_for_completion+0x6a/0x9f > default_wake_function+0x0/0xc > Sep 13 12:27:34 wz4a kernel: usb_start_wait_urb+0x98/0xdc > [usbcore] timeout_kill+0x0/0x5 [usbcore] > Sep 13 12:27:34 wz4a kernel: usb_control_msg+0xc3/0xde [usbcore] > rt2x00_vendor_request+0x7c/0xa6 [rt73usb] > Sep 13 12:27:34 wz4a kernel: rt73usb_reset_tsf+0x30/0x59 > [rt73usb] ieee80211_sta_join_ibss+0x3a/0x572 [80211] > Sep 13 12:27:34 wz4a kernel: printk+0x14/0x18 > ieee80211_rx_bss_add+0x88/0x90 [80211] > Sep 13 12:27:34 wz4a kernel: ieee80211_sta_find_ibss+0x30e/0x366 > [80211] ieee80211_sta_timer+0x0/0x18f [80211] > Sep 13 12:27:34 wz4a kernel: ieee80211_sta_timer+0x7a/0x18f > [80211] ieee80211_sta_timer+0x0/0x18f [80211] > Sep 13 12:27:34 wz4a kernel: run_timer_softirq+0x10b/0x153 > __do_softirq+0x58/0xc2 > Sep 13 12:27:34 wz4a kernel: do_softirq+0x2e/0x32 > do_IRQ+0x1e/0x24 > Sep 13 12:27:34 wz4a kernel: common_interrupt+0x1a/0x20 > acpi_processor_idle+0x18a/0x39e [processor] > Sep 13 12:27:34 wz4a kernel: cpu_idle+0x8f/0xa8 > start_kernel+0x355/0x35c So this will be solved for free when sta_timer is converted to a workqueue. Jiri -- Jiri Benc SUSE Labs - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: d80211: ieee80211_hw handlers in atomic context
Ivo van Doorn wrote: > On Thursday 05 October 2006 13:37, Jiri Benc wrote: >> On Wed, 4 Oct 2006 19:22:38 +0200, Ivo van Doorn wrote: >>> Well another point of concern for me is the TSF handling, those handlers >>> are called >>> from interrupt context as well, and also deliver problems for the USB >>> drivers in case >>> of adhoc mode. >> Where is a problem with tsf handlers? get_tsf is not called at all >> (unless CONFIG_D80211_IBSS_DEBUG is set; well, that raises a question >> why the function exists in the first place), reset_tsf returns void. > > Basically it comes down to this: > > Sep 13 12:27:34 wz4a kernel: wlan0: Creating new IBSS network, BSSID > 7a:b9:60:8a:84:39 > Sep 13 12:27:34 wz4a kernel: BUG: scheduling while atomic: > swapper/0x0100/0 > Sep 13 12:27:34 wz4a kernel: schedule+0x43/0xa84 > extract_buf+0x97/0xc8 > Sep 13 12:27:34 wz4a kernel: wait_for_completion+0x6a/0x9f > default_wake_function+0x0/0xc > Sep 13 12:27:34 wz4a kernel: usb_start_wait_urb+0x98/0xdc > [usbcore] timeout_kill+0x0/0x5 [usbcore] > Sep 13 12:27:34 wz4a kernel: usb_control_msg+0xc3/0xde [usbcore] > rt2x00_vendor_request+0x7c/0xa6 [rt73usb] > Sep 13 12:27:34 wz4a kernel: rt73usb_reset_tsf+0x30/0x59 > [rt73usb] ieee80211_sta_join_ibss+0x3a/0x572 [80211] > Sep 13 12:27:34 wz4a kernel: printk+0x14/0x18 > ieee80211_rx_bss_add+0x88/0x90 [80211] > Sep 13 12:27:34 wz4a kernel: ieee80211_sta_find_ibss+0x30e/0x366 > [80211] ieee80211_sta_timer+0x0/0x18f [80211] > Sep 13 12:27:34 wz4a kernel: ieee80211_sta_timer+0x7a/0x18f > [80211] ieee80211_sta_timer+0x0/0x18f [80211] > Sep 13 12:27:34 wz4a kernel: run_timer_softirq+0x10b/0x153 > __do_softirq+0x58/0xc2 > Sep 13 12:27:34 wz4a kernel: do_softirq+0x2e/0x32 > do_IRQ+0x1e/0x24 > Sep 13 12:27:34 wz4a kernel: common_interrupt+0x1a/0x20 > acpi_processor_idle+0x18a/0x39e [processor] > Sep 13 12:27:34 wz4a kernel: cpu_idle+0x8f/0xa8 > start_kernel+0x355/0x35c > > With the compilation of d80211 the CONFIG_D80211_DEBUG is set by default, > so no 
CONFIG_D80211_IBSS_DEBUG. > > This does not happen in rt2500usb driver, since no TSF handling is possible > due to a lack of TSF registers in the device. This path would be fixed by my conversion patch of sta.timer into sta.work that I sent you yesterday privately. Unfortunately, I don't have a copy at hand ATM. What about the other timers? Can they trigger any sleeping service of rt2x00 drivers? Ok, waiting for a BUG is always possible... ;) Jan signature.asc Description: OpenPGP digital signature
Re: d80211: ieee80211_hw handlers in atomic context
Hi,

> > This does not happen in rt2500usb driver, since no TSF handling is possible
> > due to a lack of TSF registers in the device.
>
> This path would be fixed by my conversion patch of sta.timer into
> sta.work that I sent you yesterday privately. Unfortunately, I don't
> have a copy at hand ATM.

That is what I expect as well; I'll ask for confirmation from the person
who submitted this bug.

> What about the other timers? Can they trigger any sleeping service of
> rt2x00 drivers? Ok, waiting for a BUG is always possible... ;)

Well, I currently have no time to check it, but can the config_interface
handler still be called from interrupt context, or has this also been
fixed?

Ivo
Re: d80211: ieee80211_hw handlers in atomic context
On Thursday 05 October 2006 17:13, Jiri Benc wrote: > On Thu, 5 Oct 2006 17:00:31 +0200, Ivo van Doorn wrote: > > Basically it comes down to this: > > > > Sep 13 12:27:34 wz4a kernel: wlan0: Creating new IBSS network, BSSID > > 7a:b9:60:8a:84:39 > > Sep 13 12:27:34 wz4a kernel: BUG: scheduling while atomic: > > swapper/0x0100/0 > > Sep 13 12:27:34 wz4a kernel: schedule+0x43/0xa84 > > extract_buf+0x97/0xc8 > > Sep 13 12:27:34 wz4a kernel: wait_for_completion+0x6a/0x9f > > default_wake_function+0x0/0xc > > Sep 13 12:27:34 wz4a kernel: usb_start_wait_urb+0x98/0xdc > > [usbcore] timeout_kill+0x0/0x5 [usbcore] > > Sep 13 12:27:34 wz4a kernel: usb_control_msg+0xc3/0xde > > [usbcore] rt2x00_vendor_request+0x7c/0xa6 [rt73usb] > > Sep 13 12:27:34 wz4a kernel: rt73usb_reset_tsf+0x30/0x59 > > [rt73usb] ieee80211_sta_join_ibss+0x3a/0x572 [80211] > > Sep 13 12:27:34 wz4a kernel: printk+0x14/0x18 > > ieee80211_rx_bss_add+0x88/0x90 [80211] > > Sep 13 12:27:34 wz4a kernel: > > ieee80211_sta_find_ibss+0x30e/0x366 [80211] > > ieee80211_sta_timer+0x0/0x18f [80211] > > Sep 13 12:27:34 wz4a kernel: ieee80211_sta_timer+0x7a/0x18f > > [80211] ieee80211_sta_timer+0x0/0x18f [80211] > > Sep 13 12:27:34 wz4a kernel: run_timer_softirq+0x10b/0x153 > > __do_softirq+0x58/0xc2 > > Sep 13 12:27:34 wz4a kernel: do_softirq+0x2e/0x32 > > do_IRQ+0x1e/0x24 > > Sep 13 12:27:34 wz4a kernel: common_interrupt+0x1a/0x20 > > acpi_processor_idle+0x18a/0x39e [processor] > > Sep 13 12:27:34 wz4a kernel: cpu_idle+0x8f/0xa8 > > start_kernel+0x355/0x35c > > So this will be solved for free when sta_timer is converted to a > workqueue. Hi, True, this is what I realized later as well. :) I have asked for confirmation by the bug submitter. Ivo - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: d80211: ieee80211_hw handlers in atomic context
On Thursday 05 October 2006 17:39, Jiri Benc wrote:
> On Thu, 5 Oct 2006 17:32:39 +0200, Ivo van Doorn wrote:
> > Well I currently have no time to check it, but can
> > config_interface handler still be called from interrupt context or has this
> > also been fixed?
>
> Will be fixed by the sta_timer conversion as well.

Excellent news. :D
Thanks.

Ivo
Re: d80211: ieee80211_hw handlers in atomic context
On Thu, 5 Oct 2006 17:32:39 +0200, Ivo van Doorn wrote:
> Well I currently have no time to check it, but can
> config_interface handler still be called from interrupt context or has this
> also been fixed?

Will be fixed by the sta_timer conversion as well.

 Jiri

--
Jiri Benc
SUSE Labs
Re: [take19 1/4] kevent: Core files.
On Thursday 05 October 2006 12:21, Evgeniy Polyakov wrote:
> On Thu, Oct 05, 2006 at 11:56:24AM +0200, Eric Dumazet ([EMAIL PROTECTED]) wrote:
> > On Thursday 05 October 2006 10:57, Evgeniy Polyakov wrote:
> >
> > > Well, it is possible to create /sys/proc entry for that, and even now
> > > userspace can grow mapping ring until it is forbidden by kernel, which
> > > means limit is reached.
> >
> > No need for yet another /sys/proc entry.
> >
> > Right now, I (for example) may have a use for Generic event handling, but for
> > a program that needs XXX.XXX handles, and about XX.XXX events per second.
> >
> > Right now, this program uses epoll, and reaches no limit at all, once you pass
> > the "ulimit -n", and other kernel wide tunes of course, not related to epoll.
> >
> > With your current kevent, I cannot switch to it, because of hardcoded limits.
> >
> > I may be wrong, but what is currently missing for me is:
> >
> > - No hardcoded limit on the max number of events. (A process that can
> > open XXX.XXX files should be allowed to open a kevent queue with at least
> > XXX.XXX events.) Right now it's not clear what happens IF the current
> > limit is reached.
>
> This leads to overflows in the fixed-size memory-mapped buffer.
> If we remove the memory-mapped buffer or allow overflows (and thus
> skipped entries), kevent can easily scale to those limits (tested with
> xx.xxx events though).
>
> > - In order to avoid touching the whole ring buffer, it might be good to
> > be able to reset the indexes to the beginning when the ring buffer is empty.
> > (So if userland is responsive enough to consume events, only the first
> > pages of the mapping would be used: that saves L1/L2 CPU caches.)
>
> And what happens when there are 3 empty entries at the beginning and we
> need to put 4 ready events there?

Couldn't there be 3 areas in the mmap buffer:

- Unused: entries that the kernel can alloc from.
- Alloced: entries alloced by the kernel but not yet used by the user. The
  kernel can update these if new events require that.
- Consumed: entries that the user is processing.

The user takes a set of alloced entries and makes them consumed. Then it
processes the events, after which it makes them unused.

If there are no unused entries and the kernel needs some, it has to wait
for free entries. The user has to notify the kernel when unused entries
become available. It could set a flag in the mmap'ed area to avoid
unnecessary wakeups.

There are some details with indexing and wakeup notification that I have
left out, but I hope my idea is clear. I could give a more detailed
description if requested. Also, I'm a user-level programmer so I might not
get the whole picture.

Hans Henrik Happe
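One way to picture the three areas proposed above is a single ring carved up by three monotonically increasing positions. The sketch below is a single-threaded user-space model under several assumptions that the message does not state: entries are released in the order they were taken, the slot count is a power of two, and the names `three_area_ring`, `kernel_alloc`, `user_take` and `user_release` are invented for illustration. The shared-memory locking and wakeup flag are left out.

```c
#include <assert.h>

#define AREA_SLOTS 4u   /* hypothetical size; must be a power of two */

/*
 * Positions only ever grow; a slot index is position % AREA_SLOTS.
 *   consumed: [release_pos, take_pos)   user is processing these
 *   alloced:  [take_pos, alloc_pos)     filled by kernel, not yet taken
 *   unused:   everything else           kernel may alloc from here
 */
struct three_area_ring {
    unsigned int alloc_pos;
    unsigned int take_pos;
    unsigned int release_pos;
    int slots[AREA_SLOTS];
};

unsigned int unused_count(const struct three_area_ring *r)
{
    return AREA_SLOTS - (r->alloc_pos - r->release_pos);
}

/* Kernel side: store one event. Returns -1 where the kernel would have to wait. */
int kernel_alloc(struct three_area_ring *r, int ev)
{
    if (unused_count(r) == 0)
        return -1;
    r->slots[r->alloc_pos % AREA_SLOTS] = ev;
    r->alloc_pos++;
    return 0;
}

/* User side: move up to `max` alloced entries to consumed; returns how many. */
unsigned int user_take(struct three_area_ring *r, unsigned int max)
{
    unsigned int n = r->alloc_pos - r->take_pos;

    if (n > max)
        n = max;
    r->take_pos += n;
    return n;
}

/* User side: hand up to `max` consumed entries back as unused; returns how many. */
unsigned int user_release(struct three_area_ring *r, unsigned int max)
{
    unsigned int n = r->take_pos - r->release_pos;

    if (n > max)
        n = max;
    r->release_pos += n;
    return n;
}
```

The model makes the open question in the thread concrete: `take_pos` and `release_pos` are written by the user while `alloc_pos` is written by the kernel, so a real implementation would still have to decide how those words are shared and ordered.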
Re: [take19 1/4] kevent: Core files.
On Thu, Oct 05, 2006 at 04:01:19PM +0200, Hans Henrik Happe ([EMAIL PROTECTED]) wrote:
> > And what happens when there are 3 empty entries at the beginning and we
> > need to put 4 ready events there?
>
> Couldn't there be 3 areas in the mmap buffer:
>
> - Unused: entries that the kernel can alloc from.
> - Alloced: entries alloced by the kernel but not yet used by the user. The
>   kernel can update these if new events require that.
> - Consumed: entries that the user is processing.
>
> The user takes a set of alloced entries and makes them consumed. Then it
> processes the events, after which it makes them unused.
>
> If there are no unused entries and the kernel needs some, it has to wait
> for free entries. The user has to notify the kernel when unused entries
> become available. It could set a flag in the mmap'ed area to avoid
> unnecessary wakeups.
>
> There are some details with indexing and wakeup notification that I have
> left out, but I hope my idea is clear. I could give a more detailed
> description if requested. Also, I'm a user-level programmer so I might not
> get the whole picture.

This looks good in a picture, but how can you put it into page-based
storage without major and complex shared structures, which should be
properly locked between kernelspace and userspace?

> Hans Henrik Happe

--
Evgeniy Polyakov
Re: [take19 0/4] kevent: Generic event handling mechanism.
Evgeniy Polyakov wrote:
> And you can add/remove signal events using existing kevent api between
> calls.

That's far more expensive than using a mask under control of the program.

> And creating special cases for usual events is bad.
> There is unified way to deal with events in kevent -
> add/remove/modify/wait on them, signals are just usual events.

How can this be unified? The installation of the temporary signal mask is unlike the handling of signals for the purpose of reporting them through the signal queue. It's equally completely new functionality. Don't kid yourself in thinking that because this is signal stuff, too, you're "unifying" something.

The way this signal mask is used has nothing whatsoever to do with delivering signals via the event queue. For the latter the signals always must be blocked (similar to sigwait's requirement).

As a result it means you want to introduce a new mechanism for the event queue instead of using the well known and often used method of optionally passing a signal mask to the syscall. That's just insane.

> I think you wanted to say, that 'all event mechanism except the most
> commonly used poll/select/epoll use timespec'.

Get your facts straight. select uses timeval, which is just the predecessor of timespec. And epoll is just (badly) designed after poll. Fact is therefore that poll plus its spawn is the only interface using such a timeout method.

> I designed it to be similar to poll(), it is really good interface.

Not many people agree. All the interfaces designed (not derived) in the last years take a timespec parameter. Plus, you chose to ignore all the nice things using a timespec allows, like absolute timeout modes etc. See the clock_nanosleep() interface for a way this can be useful.

--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
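To make the "absolute timeout" point concrete, here is a minimal userspace sketch built on clock_nanosleep() with TIMER_ABSTIME. Only clock_gettime() and clock_nanosleep() are real POSIX interfaces; the helper names deadline_after_ms() and sleep_until() are invented for this illustration and are not part of kevent or any proposal in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical helper: build an absolute CLOCK_MONOTONIC deadline that
 * lies `ms` milliseconds from now. */
static struct timespec deadline_after_ms(long ms)
{
	struct timespec ts;

	assert(ms >= 0);
	clock_gettime(CLOCK_MONOTONIC, &ts);
	ts.tv_sec += ms / 1000;
	ts.tv_nsec += (ms % 1000) * 1000000L;
	if (ts.tv_nsec >= 1000000000L) {	/* normalize the nanoseconds */
		ts.tv_sec++;
		ts.tv_nsec -= 1000000000L;
	}
	return ts;
}

/* Hypothetical helper: sleep until the absolute deadline. Because the
 * deadline is absolute, a restart after EINTR does not have to recompute
 * the remaining time. clock_nanosleep() returns the error number
 * directly, so this returns 0 on success or an errno value. */
static int sleep_until(const struct timespec *deadline)
{
	int err;

	do {
		err = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
				      deadline, NULL);
	} while (err == EINTR);
	return err;
}
```

With a relative millisecond timeout, as poll() takes, a caller interrupted by a signal has to work out how much time is left before retrying; with an absolute timespec it simply passes the same deadline again.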
Re: 2.6.18-mm2 boot failure on x86-64
On Wed, 2006-10-04 at 18:08 -0700, Martin Bligh wrote:
> Andi Kleen wrote:
> >>I think most likely it would crash on 2.6.18. Keith Mannthey had reported
> >>a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
> >>time. Following is the link to the thread.
> >
> >
> > Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4?
>
> I think it's fixed already in -git22, or at least it is for the IBM box
> reporting to test.kernel.org. You might want to try that one ...

-git22 also panics for me.

--
Steve Fox
IBM Linux Technology Center
Re: 2.6.18-mm3 oops in xfrm_register_mode
On Wed, 2006-10-04 at 16:02 -0500, Steve Fox wrote:
> On Wed, 2006-10-04 at 09:57 -0700, Andrew Morton wrote:
>
> > You might well find this bisection lands you on origin.patch. ie: a
> > mainline bug. I note that David merged a few more xfrm fixes this morning.
> >
> > So to confirm that, first test just origin.patch and if that fails, test
> > git-of-the-moment. If that doesn't fail, they fixed it.
>
> origin.patch from -mm3 failed. Unfortunately so did a fresh clone of
> Linus's git tree.
>

I am not an expert in that area, but your stack trace made me curious.

Looking at the disassembly, the line of code in question is:

	if (likely(modemap[mode->encap] == NULL)) {

Register contents indicate that it is called as

	xfrm_register_mode(&xfrm4_tunnel_mode, AF_INET);
or
	xfrm_register_mode(&xfrm4_transport_mode, AF_INET);

(family is AF_INET).

The invalid dereference is due to modemap = 0x7ff (RAX: 07ff).

Since it's so easy to reproduce, can you add a printk before this check to dump mode->encap, modemap, afinfo, family etc.?

Just curious ..

Thanks,
Badari
Re: [take19 1/4] kevent: Core files.
On Thursday 05 October 2006 16:15, Evgeniy Polyakov wrote:
> On Thu, Oct 05, 2006 at 04:01:19PM +0200, Hans Henrik Happe ([EMAIL PROTECTED]) wrote:
> > > And what happens when there are 3 empty at the beginning and we need to
> > > put there 4 ready events?
> >
> > Couldn't there be 3 areas in the mmap buffer:
> >
> > - Unused: entries that the kernel can alloc from.
> > - Alloced: entries alloced by kernel but not yet used by user. Kernel can
> >   update these if new events require that.
> > - Consumed: entries that the user is processing.
> >
> > The user takes a set of alloced entries and makes them consumed. Then it
> > processes the events, after which it makes them unused.
> >
> > If there are no unused entries and the kernel needs some, it has to wait
> > for free entries. The user has to notify when unused entries become
> > available. It could set a flag in the mmap'ed area to avoid unnecessary
> > wakeups.
> >
> > There are some details with indexing and wakeup notification that I have
> > left out, but I hope my idea is clear. I could give a more detailed
> > description if requested. Also, I'm a user-level programmer so I might
> > not get the whole picture.
>
> This looks good on a picture, but how can you put it into page-based
> storage without major and complex shared structures, which should be
> properly locked between kernelspace and userspace?

I wasn't clear about the structure. I meant a ring buffer with 3 areas. So it's basically the same model as Eric Dumazet described, only with 3 indexes: 2 in the user-writeable page and 1 in the kernel.

When the kernel has alloced an entry, it should store it in a way that makes it invalid after user consumption, which is simply an increment of an index. Sliding-window-like schemes should solve this.

Hans Henrik Happe
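The three-index ring described above can be modelled in a few lines of userspace C. This is only a single-threaded sketch of the bookkeeping: the struct layout, the names (ring_kernel_put() and friends) and the power-of-two RING_SIZE are assumptions made for illustration. It leaves out everything the thread is actually arguing about: memory ordering between kernel and user, wakeup notification, and the kernel updating already alloced entries.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u	/* assumed power of two, so wrapping is a mask */

/* All three indexes only ever increase; a slot is index & (RING_SIZE - 1).
 *   Consumed area: [release, consume)
 *   Alloced area:  [consume, alloc)
 *   Unused area:   everything else */
struct ring {
	uint32_t alloc;		/* kernel-owned: one past the last entry filled */
	uint32_t consume;	/* user-written: one past the last entry claimed */
	uint32_t release;	/* user-written: one past the last entry returned */
	int events[RING_SIZE];
};

/* Number of entries the "kernel" may still allocate from. */
static uint32_t ring_unused(const struct ring *r)
{
	return RING_SIZE - (r->alloc - r->release);
}

/* Kernel side: store one ready event. Returns 0 on success, or -1 when no
 * unused entry is left and the kernel would have to wait. */
static int ring_kernel_put(struct ring *r, int event)
{
	if (ring_unused(r) == 0)
		return -1;
	r->events[r->alloc & (RING_SIZE - 1)] = event;
	r->alloc++;
	return 0;
}

/* User side: claim every alloced entry, moving it to the consumed area.
 * Returns the number of entries claimed. */
static uint32_t ring_user_claim(struct ring *r)
{
	uint32_t n = r->alloc - r->consume;

	r->consume = r->alloc;
	return n;
}

/* User side: hand n processed entries back to the unused area. */
static void ring_user_release(struct ring *r, uint32_t n)
{
	assert(n <= r->consume - r->release);
	r->release += n;
}
```

Moving entries between areas is just an index increment, which is the property Hans is after; whether that can be made safe against a concurrently running kernel without shared locking is exactly Evgeniy's open question.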
Re: 2.6.18-mm2 boot failure on x86-64
On Thu, 2006-10-05 at 09:53 -0500, Steve Fox wrote:
> On Wed, 2006-10-04 at 18:08 -0700, Martin Bligh wrote:
> > Andi Kleen wrote:
> > >>I think most likely it would crash on 2.6.18. Keith Mannthey had reported
> > >>a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
> > >>time. Following is the link to the thread.
> > >
> > >
> > > Then maybe trying 2.6.17 + the patch and then bisect between that and
> > > -rc4?
> >
> > I think it's fixed already in -git22, or at least it is for the IBM box
> > reporting to test.kernel.org. You might want to try that one ...
>
> -git22 also panics for me.
>

Steve,

Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL)? Last time I couldn't match your instruction dump to any code segment in the routine. Also, can you post your .config file? I have an amd64 and an em64t machine and both work fine...

Thanks,
Badari
Re: 2.6.18-mm2 boot failure on x86-64
On Thu, 2006-10-05 at 08:12 -0700, Badari Pulavarty wrote:

> Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL) ?

CONFIG_DEBUG_KERNEL should be on

> Last time I couldn't match your instruction dump to any code segment
> in the routine. And also, can you post your .config file. I have
> an amd64 and em64t machine and both work fine...

Unable to handle kernel NULL pointer dereference at 0827 RIP:
 [] xfrm_register_mode+0x36/0x60
PGD 0
Oops: [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-git22 #1
RIP: 0010:[] [] xfrm_register_mode+0x36/0x60
RSP: :810bffcbded0 EFLAGS: 00010286
RAX: 081f RBX: 805588a0 RCX:
RDX: RSI: 0002 RDI: 80559550
RBP: ffef R08: 3f924371 R09:
R10: 810bffcbdcb0 R11: 0154 R12:
R13: 810bffcbdef0 R14: R15:
FS: () GS:805d2000() knlGS:
CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 0827 CR3: 00201000 CR4: 06e0
Process swapper (pid: 1, threadinfo 810bffcbc000, task 810bffcbb4e0)
Stack: 8061fb48 80207182
 0009

The base config file I'm using is at
http://flooterbu.net/kernel/elm3b239-2.6.17.config

--
Steve Fox
IBM Linux Technology Center
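One thing that can be read off the oops above, taking the register values exactly as printed: the faulting address in CR2 (0827) is 8 bytes past RAX (081f). That is what one would expect if RAX holds the bogus modemap pointer and the code then loads modemap[1] from an array of 8-byte pointers. The following sketch only spells out that arithmetic; the function names are made up, and that the index was 1 is an inference from the numbers, not something the trace states.

```c
#include <assert.h>
#include <stdint.h>

#define PTR_SIZE 8u	/* pointer size on x86-64 */

/* Address touched when reading element `index` of a pointer array that
 * starts at `base`. */
static uint64_t slot_address(uint64_t base, uint64_t index)
{
	return base + index * PTR_SIZE;
}

/* Array index implied by a fault address, or -1 if the fault does not
 * fall on a slot boundary relative to `base`. */
static int64_t implied_index(uint64_t base, uint64_t fault)
{
	assert(fault >= base);
	if ((fault - base) % PTR_SIZE != 0)
		return -1;
	return (int64_t)((fault - base) / PTR_SIZE);
}
```

Here slot_address(0x81f, 1) gives 0x827, matching CR2, so the trace is consistent with a garbage table pointer rather than a wild index.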
[PATCH 2.6.19-rc1 2/2] ehea: fix port state notification, default queue sizes
This patch includes a bug fix for the port state notification and fixes the default queue sizes.

Signed-off-by: Jan-Bernd Themann <[EMAIL PROTECTED]>
---

 drivers/net/ehea/ehea.h      |   13 +++--
 drivers/net/ehea/ehea_main.c |    6 +++---
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ehea/ehea.h b/drivers/net/ehea/ehea.h
index 23b451a..b40724f 100644
--- a/drivers/net/ehea/ehea.h
+++ b/drivers/net/ehea/ehea.h
@@ -39,7 +39,7 @@
 #include
 #include
 #define DRV_NAME	"ehea"
-#define DRV_VERSION	"EHEA_0028"
+#define DRV_VERSION	"EHEA_0034"
 #define EHEA_MSG_DEFAULT (NETIF_MSG_LINK | NETIF_MSG_TIMER \
 	| NETIF_MSG_RX_ERR | NETIF_MSG_TX_ERR)
@@ -50,6 +50,7 @@
 #define EHEA_MAX_ENTRIES_RQ3 16383
 #define EHEA_MAX_ENTRIES_SQ  32767
 #define EHEA_MIN_ENTRIES_QP  127
+#define EHEA_SMALL_QUEUES
 #define EHEA_NUM_TX_QP 1
 #ifdef EHEA_SMALL_QUEUES
@@ -59,11 +60,11 @@
 #define EHEA_DEF_ENTRIES_RQ1    4095
 #define EHEA_DEF_ENTRIES_RQ2    1023
 #define EHEA_DEF_ENTRIES_RQ3    1023
 #else
-#define EHEA_MAX_CQE_COUNT     32000
-#define EHEA_DEF_ENTRIES_SQ    16000
-#define EHEA_DEF_ENTRIES_RQ1   32080
-#define EHEA_DEF_ENTRIES_RQ2    4020
-#define EHEA_DEF_ENTRIES_RQ3    4020
+#define EHEA_MAX_CQE_COUNT      4080
+#define EHEA_DEF_ENTRIES_SQ     4080
+#define EHEA_DEF_ENTRIES_RQ1    8160
+#define EHEA_DEF_ENTRIES_RQ2    2040
+#define EHEA_DEF_ENTRIES_RQ3    2040
 #endif
 #define EHEA_MAX_ENTRIES_EQ 20
diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index 263d1c5..0edb2f8 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -769,7 +769,7 @@ static void ehea_parse_eqe(struct ehea_a
 		if (EHEA_BMASK_GET(NEQE_PORT_UP, eqe)) {
 			if (!netif_carrier_ok(port->netdev)) {
 				ret = ehea_sense_port_attr(
-					adapter->port[portnum]);
+					port);
 				if (ret) {
 					ehea_error("failed resensing port "
 						   "attributes");
@@ -821,7 +821,7 @@ static void ehea_parse_eqe(struct ehea_a
 		netif_stop_queue(port->netdev);
 		break;
 	default:
-		ehea_error("unknown event code %x", ec);
+		ehea_error("unknown event code %x, eqe=0x%lX", ec, eqe);
 		break;
 	}
 }
@@ -1845,7 +1845,7 @@ static int ehea_start_xmit(struct sk_buf
 	if (netif_msg_tx_queued(port)) {
 		ehea_info("post swqe on QP %d", pr->qp->init_attr.qp_nr);
-		ehea_dump(swqe, sizeof(*swqe), "swqe");
+		ehea_dump(swqe, 512, "swqe");
 	}
 	ehea_post_swqe(pr->qp, swqe);
[PATCH][RFC] net/ipv6: seperate sit driver to extra module
Is there a reason why the tunnel driver for IPv6-in-IPv4 is currently compiled into the ipv6 module? This driver is only needed in gateways between different IPv6 networks. On all other hosts with ipv6 enabled it is not required. Having this driver in a separate module will save memory on those machines.

I appended a small and trivial patch to 2.6.18 which does exactly this.

Joerg

diff -upr -X linux-2.6.18/Documentation/dontdiff linux-2.6.18-vanilla/net/ipv6/af_inet6.c linux-2.6.18/net/ipv6/af_inet6.c
--- linux-2.6.18-vanilla/net/ipv6/af_inet6.c	2006-09-20 05:42:06.0 +0200
+++ linux-2.6.18/net/ipv6/af_inet6.c	2006-10-05 16:55:02.0 +0200
@@ -849,7 +849,6 @@ static int __init inet6_init(void)
 	err = addrconf_init();
 	if (err)
 		goto addrconf_fail;
-	sit_init();
 	/* Init v6 extension headers. */
 	ipv6_rthdr_init();
@@ -920,7 +919,6 @@ static void __exit inet6_exit(void)
 	raw6_proc_exit();
 #endif
 	/* Cleanup code parts. */
-	sit_cleanup();
 	ip6_flowlabel_cleanup();
 	addrconf_cleanup();
 	ip6_route_cleanup();
diff -upr -X linux-2.6.18/Documentation/dontdiff linux-2.6.18-vanilla/net/ipv6/Kconfig linux-2.6.18/net/ipv6/Kconfig
--- linux-2.6.18-vanilla/net/ipv6/Kconfig	2006-09-20 05:42:06.0 +0200
+++ linux-2.6.18/net/ipv6/Kconfig	2006-10-05 17:07:11.0 +0200
@@ -126,6 +126,19 @@ config INET6_XFRM_MODE_TUNNEL
 	  If unsure, say Y.
+config IPV6_SIT
+	tristate "IPv6: IPv6-in-IPv4 tunnel (SIT driver)"
+	depends on IPV6
+	default n
+	---help---
+	  Tunneling means encapsulating data of one protocol type within
+	  another protocol and sending it over a channel that understands the
+	  encapsulating protocol. This driver implements encapsulation of IPv6
+	  into IPv4 packets. This is usefull if you want to connect two IPv6
+	  networks over an IPv4-only path.
+
+	  Saying M here will produce a module called sit.ko. If unsure, say N.
+
 config IPV6_TUNNEL
 	tristate "IPv6: IPv6-in-IPv6 tunnel"
 	select INET6_TUNNEL
diff -upr -X linux-2.6.18/Documentation/dontdiff linux-2.6.18-vanilla/net/ipv6/Makefile linux-2.6.18/net/ipv6/Makefile
--- linux-2.6.18-vanilla/net/ipv6/Makefile	2006-09-20 05:42:06.0 +0200
+++ linux-2.6.18/net/ipv6/Makefile	2006-10-05 17:10:42.0 +0200
@@ -4,7 +4,7 @@
 obj-$(CONFIG_IPV6) += ipv6.o
-ipv6-objs := af_inet6.o anycast.o ip6_output.o ip6_input.o addrconf.o sit.o \
+ipv6-objs := af_inet6.o anycast.o ip6_output.o ip6_input.o addrconf.o \
 	route.o ip6_fib.o ipv6_sockglue.o ndisc.o udp.o raw.o \
 	protocol.o icmp.o mcast.o reassembly.o tcp_ipv6.o \
 	exthdrs.o sysctl_net_ipv6.o datagram.o proc.o \
@@ -24,6 +24,7 @@ obj-$(CONFIG_INET6_XFRM_MODE_TRANSPORT)
 obj-$(CONFIG_INET6_XFRM_MODE_TUNNEL) += xfrm6_mode_tunnel.o
 obj-$(CONFIG_NETFILTER)	+= netfilter/
+obj-$(CONFIG_IPV6_SIT) += sit.o
 obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
 obj-y += exthdrs_core.o
diff -upr -X linux-2.6.18/Documentation/dontdiff linux-2.6.18-vanilla/net/ipv6/sit.c linux-2.6.18/net/ipv6/sit.c
--- linux-2.6.18-vanilla/net/ipv6/sit.c	2006-09-20 05:42:06.0 +0200
+++ linux-2.6.18/net/ipv6/sit.c	2006-10-05 16:55:02.0 +0200
@@ -850,3 +850,6 @@ int __init sit_init(void)
 	inet_del_protocol(&sit_protocol, IPPROTO_IPV6);
 	goto out;
 }
+
+module_init(sit_init);
+module_exit(sit_cleanup);
Re: 2.6.18-mm2 boot failure on x86-64
On Thursday 05 October 2006 17:32, Steve Fox wrote:
> On Thu, 2006-10-05 at 08:12 -0700, Badari Pulavarty wrote:
>
> > Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL) ?
>
> CONFIG_DEBUG_KERNEL should be on
>
> > Last time I couldn't match your instruction dump to any code segment
> > in the routine. And also, can you post your .config file. I have
> > an amd64 and em64t machine and both work fine...
>
> Unable to handle kernel NULL pointer dereference at 0827 RIP:
> [] xfrm_register_mode+0x36/0x60
> PGD 0
> Oops: [1] SMP
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.18-git22 #1
> RIP: 0010:[] []
> xfrm_register_mode+0x36/0x60
> RSP: :810bffcbded0 EFLAGS: 00010286
> RAX: 081f RBX: 805588a0 RCX:
> RDX: RSI: 0002 RDI: 80559550
> RBP: ffef R08: 3f924371 R09:
> R10: 810bffcbdcb0 R11: 0154 R12:
> R13: 810bffcbdef0 R14: R15:
> FS: () GS:805d2000() knlGS:
> CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
> CR2: 0827 CR3: 00201000 CR4: 06e0
> Process swapper (pid: 1, threadinfo 810bffcbc000, task 810bffcbb4e0)
> Stack: 8061fb48 80207182
>
> 0009

Please don't snip the Code: line. It is fairly important.

>
> The base config file I'm using is at
> http://flooterbu.net/kernel/elm3b239-2.6.17.config

My guess is that something is wrong with the global variable it is accessing.

Can you post the output of grep -5 xfrm_policy_afinfo ? I wonder if that variable overlaps something else.

And please add a

	printk("global %p\n", xfrm_policy_afinfo[family]);

at the beginning of net/xfrm/xfrm_policy.c:xfrm_policy_lock_afinfo and post the output.

If it doesn't overlap, then it's possible that some nearby variable is overflowing or similar. Adding some padding around xfrm_policy_afinfo would show that.

Another way, if that global is proven to be corrupted, would be to add checks all over the boot process to track down where it gets corrupted.

-Andi
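The padding idea above can be illustrated in userspace: surround the suspect table with guard words filled with a known pattern, then check the pattern at various points. Everything in this sketch (struct guarded_table, the pattern, the sizes) is a hypothetical stand-in; in the kernel the layout of neighbouring globals is up to the linker, so this shows the principle rather than a patch for xfrm_policy_afinfo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GUARD_WORDS	4
#define GUARD_PATTERN	0xdeadbeefUL
#define NSLOTS		8	/* stand-in size, not taken from the kernel */

/* Hypothetical stand-in for a global pointer table, wrapped in guard
 * padding on both sides. */
struct guarded_table {
	unsigned long pre[GUARD_WORDS];
	void *slot[NSLOTS];
	unsigned long post[GUARD_WORDS];
};

/* Clear the table and fill both guards with the pattern. */
static void guarded_init(struct guarded_table *t)
{
	size_t i;

	memset(t, 0, sizeof(*t));
	for (i = 0; i < GUARD_WORDS; i++) {
		t->pre[i] = GUARD_PATTERN;
		t->post[i] = GUARD_PATTERN;
	}
}

/* Returns 0 if both guards are intact, -1 if the padding in front of the
 * table was overwritten, 1 if the padding behind it was. */
static int guarded_check(const struct guarded_table *t)
{
	size_t i;

	assert(t != NULL);
	for (i = 0; i < GUARD_WORDS; i++) {
		if (t->pre[i] != GUARD_PATTERN)
			return -1;
		if (t->post[i] != GUARD_PATTERN)
			return 1;
	}
	return 0;
}
```

If a neighbour is overflowing, the padding absorbs the stray write (so the crash may disappear) and the damaged guard tells you from which side it came; calling the check at several points during boot narrows down when it happens.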
[PATCH 2.6.19-rc1 1/2] ehea: firmware (hvcall) interface changes
This eHEA patch covers required changes related to Anton Blanchard's new hvcall interface. Signed-off-by: Jan-Bernd Themann <[EMAIL PROTECTED]> --- diff --git a/drivers/net/ehea/ehea_phyp.c b/drivers/net/ehea/ehea_phyp.c index 4a85aca..0b51a8c 100644 --- a/drivers/net/ehea/ehea_phyp.c +++ b/drivers/net/ehea/ehea_phyp.c @@ -44,71 +44,99 @@ #define H_ALL_RES_TYPE_EQ3 #define H_ALL_RES_TYPE_MR5 #define H_ALL_RES_TYPE_MW6 -static long ehea_hcall_9arg_9ret(unsigned long opcode, -unsigned long arg1, unsigned long arg2, -unsigned long arg3, unsigned long arg4, -unsigned long arg5, unsigned long arg6, -unsigned long arg7, unsigned long arg8, -unsigned long arg9, unsigned long *out1, -unsigned long *out2,unsigned long *out3, -unsigned long *out4,unsigned long *out5, -unsigned long *out6,unsigned long *out7, -unsigned long *out8,unsigned long *out9) +static long ehea_plpar_hcall_norets(unsigned long opcode, + unsigned long arg1, + unsigned long arg2, + unsigned long arg3, + unsigned long arg4, + unsigned long arg5, + unsigned long arg6, + unsigned long arg7) { - long hret; + long ret; int i, sleep_msecs; for (i = 0; i < 5; i++) { - hret = plpar_hcall_9arg_9ret(opcode,arg1, arg2, arg3, arg4, -arg5, arg6, arg7, arg8, arg9, out1, -out2, out3, out4, out5, out6, out7, -out8, out9); - if (H_IS_LONG_BUSY(hret)) { - sleep_msecs = get_longbusy_msecs(hret); + ret = plpar_hcall_norets(opcode, arg1, arg2, arg3, arg4, +arg5, arg6, arg7); + + if (H_IS_LONG_BUSY(ret)) { + sleep_msecs = get_longbusy_msecs(ret); msleep_interruptible(sleep_msecs); continue; } - if (hret < H_SUCCESS) - ehea_error("op=%lx hret=%lx " - "i1=%lx i2=%lx i3=%lx i4=%lx i5=%lx i6=%lx " - "i7=%lx i8=%lx i9=%lx " - "o1=%lx o2=%lx o3=%lx o4=%lx o5=%lx o6=%lx " - "o7=%lx o8=%lx o9=%lx", - opcode, hret, arg1, arg2, arg3, arg4, arg5, - arg6, arg7, arg8, arg9, *out1, *out2, *out3, - *out4, *out5, *out6, *out7, *out8, *out9); - return hret; + if (ret < H_SUCCESS) + ehea_error("opcode=%lx ret=%lx" + " arg1=%lx arg2=%lx 
arg3=%lx arg4=%lx" + " arg5=%lx arg6=%lx arg7=%lx ", + opcode, ret, + arg1, arg2, arg3, arg4, arg5, + arg6, arg7); + + return ret; } + return H_BUSY; } -u64 ehea_h_query_ehea_qp(const u64 adapter_handle, const u8 qp_category, -const u64 qp_handle, const u64 sel_mask, void *cb_addr) +static long ehea_plpar_hcall9(unsigned long opcode, + unsigned long *outs, /* array of 9 outputs */ + unsigned long arg1, + unsigned long arg2, + unsigned long arg3, + unsigned long arg4, + unsigned long arg5, + unsigned long arg6, + unsigned long arg7, + unsigned long arg8, + unsigned long arg9) { - u64 dummy; + long ret; + int i, sleep_msecs; - if u64)cb_addr) & (PAGE_SIZE - 1)) != 0) { - ehea_error("not on pageboundary"); - return H_PARAMETER; + for (i = 0; i < 5; i++) { + ret = plpar_hcall9(opcode, outs, + arg1, arg2, arg3, arg4, arg5, + arg6, arg7, arg8, arg9); + + if (H_IS_LONG_BUSY(ret)) { + sleep_msecs = get_longbusy_msecs(ret); + msleep_interruptible(sleep_msecs); + continue; + } + + if (ret < H_SUCCESS) +
Re: [PATCH][RFC] net/ipv6: seperate sit driver to extra module
On Thu, Oct 05, 2006 at 11:49:38AM -0400, James Morris wrote:
> On Thu, 5 Oct 2006, Joerg Roedel wrote:
>
> > Is there a reason why the tunnel driver for IPv6-in-IPv4 is currently
> > compiled into the ipv6 module? This driver is only needed in gateways
> > between different IPv6 networks. On all other hosts with ipv6 enabled it
> > is not required. To have this driver in a separate module will save
> > memory on those machines.
> > I appended a small and trivial patch to 2.6.18 which does exactly this.
>
> Looks ok to me, although given that users used to get this by default when
> selecting IPv6, perhaps the default in Kconfig should be y.

Ok, good point, y makes sense there. I will change this. I wrote n there because of the "If unsure, say N" sentence in the description.

Joerg
[PATCH 2.6.18 4/6]: sb1250-mac: Driver model & phylib support
This is an update including the following changes: 1. Some help text for Kconfig. 2. Removal of unused module options. 3. Phylib support and the resulting removal of generic bits for handling the PHY. 4. Proper reserving of device resources and using ioremap()ped handles to access MAC registers rather than platform-specific macros. 5. Handling of the device using the driver model. Signed-off-by: Maciej W. Rozycki <[EMAIL PROTECTED]> --- This revision fixes the problem with drivers/net/Kconfig. Please consider. Maciej patch-2.6.18-sb1250-mac-16 diff -up --recursive --new-file linux-2.6.18.macro/drivers/net/Kconfig linux-2.6.18/drivers/net/Kconfig --- linux-2.6.18.macro/drivers/net/Kconfig 2006-09-20 03:42:06.0 + +++ linux-2.6.18/drivers/net/Kconfig2006-10-05 15:50:20.0 + @@ -456,6 +456,15 @@ config MIPS_AU1X00_ENET config NET_SB1250_MAC tristate "SB1250 Ethernet support" depends on NET_ETHERNET && SIBYTE_SB1xxx_SOC + select PHYLIB + ---help--- + This driver supports gigabit Ethernet interfaces based on the + Broadcom SiByte family of System-On-a-Chip parts. They include + the BCM1120, BCM1125, BCM1125H, BCM1250, BCM1255, BCM1280, BCM1455 + and BCM1480 chips. + + To compile this driver as a module, choose M here: the module + will be called sb1250-mac. config SGI_IOC3_ETH bool "SGI IOC3 Ethernet" diff -up --recursive --new-file linux-2.6.18.macro/drivers/net/sb1250-mac.c linux-2.6.18/drivers/net/sb1250-mac.c --- linux-2.6.18.macro/drivers/net/sb1250-mac.c 2006-09-20 03:42:06.0 + +++ linux-2.6.18/drivers/net/sb1250-mac.c 2006-10-05 15:48:50.0 + @@ -1,5 +1,6 @@ /* * Copyright (C) 2001,2002,2003,2004 Broadcom Corporation + * Copyright (c) 2006 Maciej W. Rozycki * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License @@ -18,7 +19,11 @@ * * This driver is designed for the Broadcom SiByte SOC built-in * Ethernet controllers. Written by Mitch Lichtenberg at Broadcom Corp. 
+ * + * Updated to the driver model and the PHY abstraction layer + * by Maciej W. Rozycki. */ + #include #include #include @@ -32,9 +37,18 @@ #include #include #include -#include /* Processor type for cache alignment. */ -#include +#include +#include +#include +#include +#include +#include +#include + +#include #include +#include +#include /* Processor type for cache alignment. */ /* This is only here until the firmware is ready. In that case, the firmware leaves the ethernet address in the register for us. */ @@ -48,7 +62,7 @@ /* These identify the driver base version and may not be removed. */ #if 0 -static char version1[] __devinitdata = +static char version1[] __initdata = "sb1250-mac.c:1.00 1/11/2001 Written by Mitch Lichtenberg\n"; #endif @@ -57,8 +71,6 @@ static char version1[] __devinitdata = #define CONFIG_SBMAC_COALESCE -#define MAX_UNITS 4/* More are supported, limit only on options */ - /* Time in jiffies before concluding the transmitter is hung. */ #define TX_TIMEOUT (2*HZ) @@ -74,26 +86,6 @@ static int debug = 1; module_param(debug, int, S_IRUGO); MODULE_PARM_DESC(debug, "Debug messages"); -/* mii status msgs */ -static int noisy_mii = 1; -module_param(noisy_mii, int, S_IRUGO); -MODULE_PARM_DESC(noisy_mii, "MII status messages"); - -/* Used to pass the media type, etc. - Both 'options[]' and 'full_duplex[]' should exist for driver - interoperability. - The media type is usually passed in 'options[]'. 
-*/ -#ifdef MODULE -static int options[MAX_UNITS] = {-1, -1, -1, -1}; -module_param_array(options, int, NULL, S_IRUGO); -MODULE_PARM_DESC(options, "1-" __MODULE_STRING(MAX_UNITS)); - -static int full_duplex[MAX_UNITS] = {-1, -1, -1, -1}; -module_param_array(full_duplex, int, NULL, S_IRUGO); -MODULE_PARM_DESC(full_duplex, "1-" __MODULE_STRING(MAX_UNITS)); -#endif - #ifdef CONFIG_SBMAC_COALESCE static int int_pktcnt = 0; module_param(int_pktcnt, int, S_IRUGO); @@ -104,6 +96,7 @@ module_param(int_timeout, int, S_IRUGO); MODULE_PARM_DESC(int_timeout, "Timeout value"); #endif +#include #include #if defined(CONFIG_SIBYTE_BCM1x55) || defined(CONFIG_SIBYTE_BCM1x80) #include @@ -126,22 +119,43 @@ MODULE_PARM_DESC(int_timeout, "Timeout v #error invalid SiByte MAC configuation #endif +#ifdef K_INT_PHY +#define SBMAC_PHY_INT K_INT_PHY +#else +#define SBMAC_PHY_INT PHY_POLL +#endif + /** * Simple types * */ - -typedef enum { sbmac_speed_auto, sbmac_speed_10, - sbmac_speed_100, sbmac_speed_1000 } sbmac_speed_t; - -typedef enum { sbmac_duplex_auto, sbmac_duplex_half, - sbmac_duple
Re: [RFC] cfg80211 and nl80211
On Wed, Oct 04, 2006 at 01:57:38PM -0400, Dan Williams wrote:
> On Wed, 2006-10-04 at 16:19 +0200, Johannes Berg wrote:
> > On Wed, 2006-10-04 at 09:41 +0200, Johannes Berg wrote:

> > Should cfg80211 do the chore of keeping track of the whole scan results?
> > On the other hand, that doesn't seem to be doable with legacy hardware
> > that does all the scanning. So probably one call for
> >     cfg80211_notify_scan()
> > that takes a new scan result structure (taking a single BSSID etc.) and
> > notifies all listeners.
> > The same structure is used for get_scan() from the wiphy ops in an
> > iterator interface like some other calls.
>
> Is it a problem to actually push the _entire_ scan list out to clients
> over netlink? The scan list could be quite large, maybe even a few
> kilobytes when stuff like Information Elements, ratesets, etc is
> available. I've seen 35-item scan lists that are already around 1.5K.

1.5 KB sounds like a small scan result set to me.. I'm hitting 100+ BSSes at work (well, not really your normal environment ;-), and 50 at home.. These go way beyond 1.5 KB; closer to 32 KB at times, I'd guess.

> There are several issues here. They can be roughly split by encryption
> algorithm. But the big question:
>
> Is there a case for _multiple_ encryption algorithms enabled
> on a single "virtual" interface at one time?

What exactly do you mean by this? WPA allows different STAs associated with an AP to use different unicast encryption algorithms. This means that a client may need to use CCMP with key index 0 for unicast and TKIP with key index 1 for multicast.

> Taking one-at-a-time as a given, and the pseudo-structure
>
> struct cmd_crypto {
>     enum crypto_alg alg;
>     union data {
>         none_data;
>         wep_data;
>         wpa_data;

wep vs. wpa in crypto configuration does not make sense to me. WPA uses multiple ciphers; even WEP is allowed for group keys..

> Set alg == , set the options, and the driver will _enable_
> that crypto mode with the given options. It makes no sense at all to,
> say, set the WEP transmit key index or WEP key when the card is in WPA
> mode or no-crypto mode.

What is this "WPA" mode? Please note that IEEE 802.11i allows WEP to be used for group (multicast/broadcast) keys.. WPA should not be mixed in here with encryption key configuration. There are different encryption algorithms, like WEP, TKIP, CCMP, and they need to have keys and other parameters like key index and seq# configured. This is regardless of whether WPA is used or not.

> It's important to note that some options are independent of the initial
> operation that enabled the crypto, and need to be set later without
> triggering deauth and such. Setting non-TX-index WEP key is one such
> operation. I should be able to set WEP keys at indexes other than the
> transmit key index without affecting operation of the card (unless some
> hardware/firmware issue prevents this).

And same for TKIP and CCMP.

> - WEP encryption (following ops are independent of each other):
>     - Set TX key index
>     - Set privacy invoked

These two are not WEP specific in any way.

>     - Set exclude unencrypted packets

I would consider this more as a global variable for the BSS to match with the dot11ExcludeUnencrypted variable defined in IEEE 802.11. In theory, this is not specific to WEP, but in practice, only WEP is sometimes used in a mode which allows both encrypted and unencrypted frames.

>     - Set authentication mode (open, shared-key, or both)

Not really WEP specific. Shared Key authentication can only be used with static WEP keys, but still, this configuration is not really part of crypto configuration. In addition, Cisco uses a proprietary "Network EAP" authentication algorithm and IEEE 802.11r is adding a new authentication algorithm, so there are more options to this configuration variable.

>     - Set (or clear) WEP key 1, 2, 3, or 4

Not specific to WEP.

> - WPA/WPA2/IEEE8021X
>     - Jouni/others would know better and my brain is fried right now

This item should not be WPA/WPA2/IEEE8021X, but TKIP/CCMP, i.e., ciphers like WEP.. Just like with WEP, there would need to be a key index parameter. Default TX key could be set with a separate operation (it is valid to switch between two keys without changing either one). TKIP and CCMP will also need options for setting and getting the sequence number for replay protection (TSC/PN).

In other words, WEP, TKIP, CCMP (and likely all future ciphers added to 802.11) would be using the same configuration interface with the same set of parameters. Some of these parameters are just ignored for some of the ciphers (e.g., WEP does not really need seq# get/set).

> All the WEP options should be independent attributes in nl80211. You
> could even have a generic WEPKey attribute that is defined like so:
>
> ATTR_WEP_KEY {
>     enum type (one of DISABLE, TYPE_40, TYPE_104, TYPE_152)

I would rather use key length than come
Request to postpone WE-21
Hi John,

Based on the feedback, I formally request you to back out all of WE-21 from 2.6.19.

Rationale: it's probably too early. You can keep it for a later date if you wish.

Regards,

Jean
[PATCH] kernel-doc fix for sock.h
From: Randy Dunlap <[EMAIL PROTECTED]>

Fix kernel-doc warning in include/net/sock.h:

Warning(/var/linsrc/linux-2619-rc1-pv//include/net/sock.h:894): No description found for parameter 'rcu'

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 include/net/sock.h |    3 +--
 1 files changed, 1 insertion(+), 2 deletions(-)

--- linux-2619-rc1-pv.orig/include/net/sock.h
+++ linux-2619-rc1-pv/include/net/sock.h
@@ -884,8 +884,7 @@ static inline int sk_filter(struct sock
 /**
  *	sk_filter_release: Release a socket filter
- *	@sk: socket
- *	@fp: filter to remove
+ *	@rcu: rcu_head that contains the sk_filter info to remove
  *
  *	Remove a filter from a socket and release its resources.
  */
---
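For readers unfamiliar with why the documented parameter is @rcu rather than the filter itself: a release callback of this kind typically receives only a pointer to a small head structure embedded in the object and recovers the enclosing object from it. The sketch below shows that pattern, with a matching kernel-doc style comment, in plain userspace C. All names ending in _demo are invented stand-ins and nothing here is the actual sock.h code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins, assumed for illustration only. */
struct rcu_head_demo {
	void *next;
};

struct filter_demo {
	int len;
	struct rcu_head_demo rcu;	/* embedded head handed to the callback */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of_demo(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int released_len;	/* records what the callback saw, for checking */

/**
 * filter_release_demo - release a filter given its embedded head
 * @rcu: head embedded in the filter that is to be released
 *
 * The function takes a single parameter, so the comment documents exactly
 * one @name entry; kernel-doc warns about a parameter without one.
 */
static void filter_release_demo(struct rcu_head_demo *rcu)
{
	struct filter_demo *fp;

	assert(rcu != NULL);
	fp = container_of_demo(rcu, struct filter_demo, rcu);
	released_len = fp->len;
	free(fp);
}
```

Stale @sk and @fp entries describe a signature the function no longer has, which is the situation the patch above cleans up.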
Re: Offloading features in VLAN interfaces
Olivier Crameri wrote:
> Same thing but with the patch this time.

Since the VLAN device's features may also change in the handler, shouldn't we check and generate a feature-change event for the VLAN device(s) as well?

Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com
[PATCH 2.6.18 7/6]: sb1250-mac: Remove "typedef" obfuscation
This is a set of changes to remove unneeded type definitions that only make code less obvious. It applies to all "enum" and "struct" types as well as to potentially unsafe use of them within sizeof(). Signed-off-by: Maciej W. Rozycki <[EMAIL PROTECTED]> --- This applies on top of 4/6. Please consider. Maciej patch-mips-2.6.18-20060920-sb1250-mac-typedef-3 diff -up --recursive --new-file linux-mips-2.6.18-20060920.macro/drivers/net/sb1250-mac.c linux-mips-2.6.18-20060920/drivers/net/sb1250-mac.c --- linux-mips-2.6.18-20060920.macro/drivers/net/sb1250-mac.c 2006-09-28 02:51:29.0 + +++ linux-mips-2.6.18-20060920/drivers/net/sb1250-mac.c 2006-10-05 16:18:41.0 + @@ -129,33 +129,33 @@ MODULE_PARM_DESC(int_timeout, "Timeout v * Simple types * */ -typedef enum { +enum sbmac_speed { sbmac_speed_none = 0, sbmac_speed_10 = SPEED_10, sbmac_speed_100 = SPEED_100, sbmac_speed_1000 = SPEED_1000, -} sbmac_speed_t; +}; -typedef enum { +enum sbmac_duplex { sbmac_duplex_none = -1, sbmac_duplex_half = DUPLEX_HALF, sbmac_duplex_full = DUPLEX_FULL, -} sbmac_duplex_t; +}; -typedef enum { +enum sbmac_fc { sbmac_fc_none, sbmac_fc_disabled, sbmac_fc_frame, sbmac_fc_collision, sbmac_fc_carrier, -} sbmac_fc_t; +}; -typedef enum { +enum sbmac_state { sbmac_state_uninit, sbmac_state_off, sbmac_state_on, sbmac_state_broken, -} sbmac_state_t; +}; /** @@ -181,52 +181,58 @@ typedef enum { * DMA Descriptor structure * */ -typedef struct sbdmadscr_s { +struct sbdmadscr { uint64_t dscr_a; uint64_t dscr_b; -} sbdmadscr_t; - -typedef unsigned long paddr_t; +}; /** * DMA Controller structure * */ -typedef struct sbmacdma_s { +struct sbmacdma { /* * This stuff is used to identify the channel and the registers * associated with it. 
*/ - - struct sbmac_softc *sbdma_eth; /* back pointer to associated MAC */ - int sbdma_channel; /* channel number */ - int sbdma_txdir; /* direction (1=transmit) */ - int sbdma_maxdescr;/* total # of descriptors in ring */ + struct sbmac_softc *sbdma_eth; /* back pointer to associated + MAC */ + int sbdma_channel; /* channel number */ + int sbdma_txdir;/* direction (1=transmit) */ + int sbdma_maxdescr; /* total # of descriptors + in ring */ #ifdef CONFIG_SBMAC_COALESCE - int sbdma_int_pktcnt; /* # descriptors rx/tx before interrupt*/ - int sbdma_int_timeout; /* # usec rx/tx interrupt */ + int sbdma_int_pktcnt; + /* # descriptors rx/tx + before interrupt */ + int sbdma_int_timeout; + /* # usec rx/tx interrupt */ #endif - - volatile void __iomem *sbdma_config0; /* DMA config register 0 */ - volatile void __iomem *sbdma_config1; /* DMA config register 1 */ - volatile void __iomem *sbdma_dscrbase; /* Descriptor base address */ - volatile void __iomem *sbdma_dscrcnt; /* Descriptor count register */ - volatile void __iomem *sbdma_curdscr; /* current descriptor address */ + volatile void __iomem *sbdma_config0; /* DMA config register 0 */ + volatile void __iomem *sbdma_config1; /* DMA config register 1 */ + volatile void __iomem *sbdma_dscrbase; + /* descriptor base address */ + volatile void __iomem *sbdma_dscrcnt; /* descriptor count register */ + volatile void __iomem *sbdma_curdscr; /* current descriptor + address */ /* * This stuff is for maintenance of the ring */ - - sbdmadscr_t *sbdma_dscrtable; /* base of descriptor table */ - sbdmadscr_t *sbdma_dscrtable_end; /* end of descriptor table */ - - struct sk_buff **sbdma_ctxtable;/* context table, one per descr */ - - paddr_t sbdma_dscrtable_phys; /* and also the phys addr */ - sbdmadscr_t *sbdma_addptr; /* next dscr for sw to add */ - sbdmadscr_t *sbdma_remptr; /* next dscr for sw to remove */ -} sbmac
[PATCH 2.6.18 8/6]: sb1250-mac: Fix an incorrect use of kfree()
The pointer obtained by kmalloc() is treated with ALIGN() before passing it to kfree(). This may or may not cause problems depending on the minimum alignment enforced by kmalloc() and is ugly anyway. This change records the original pointer returned by kmalloc() so that kfree() may safely use it. Signed-off-by: Maciej W. Rozycki <[EMAIL PROTECTED]> --- This applies on top of the "typedef" change (7/6). Please consider. Maciej patch-mips-2.6.18-20060920-sb1250-mac-kfree-0 diff -up --recursive --new-file linux-mips-2.6.18-20060920.macro/drivers/net/sb1250-mac.c linux-mips-2.6.18-20060920/drivers/net/sb1250-mac.c --- linux-mips-2.6.18-20060920.macro/drivers/net/sb1250-mac.c 2006-10-05 16:18:41.0 + +++ linux-mips-2.6.18-20060920/drivers/net/sb1250-mac.c 2006-10-04 23:07:27.0 + @@ -220,6 +220,7 @@ struct sbmacdma { /* * This stuff is for maintenance of the ring */ + void*sbdma_dscrtable_un; struct sbdmadscr*sbdma_dscrtable; /* base of descriptor table */ struct sbdmadscr*sbdma_dscrtable_end; @@ -640,15 +641,16 @@ static void sbdma_initctx(struct sbmacdm d->sbdma_maxdescr = maxdescr; - d->sbdma_dscrtable = kmalloc((d->sbdma_maxdescr + 1) * -sizeof(*d->sbdma_dscrtable), GFP_KERNEL); + d->sbdma_dscrtable_un = kmalloc((d->sbdma_maxdescr + 1) * + sizeof(*d->sbdma_dscrtable), + GFP_KERNEL); /* * The descriptor table must be aligned to at least 16 bytes or the * MAC will corrupt it. 
*/ d->sbdma_dscrtable = (struct sbdmadscr *) -ALIGN((unsigned long)d->sbdma_dscrtable, +ALIGN((unsigned long)d->sbdma_dscrtable_un, sizeof(*d->sbdma_dscrtable)); memset(d->sbdma_dscrtable, 0, @@ -1309,9 +1311,9 @@ static int sbmac_initctx(struct sbmac_so static void sbdma_uninitctx(struct sbmacdma *d) { - if (d->sbdma_dscrtable) { - kfree(d->sbdma_dscrtable); - d->sbdma_dscrtable = NULL; + if (d->sbdma_dscrtable_un) { + kfree(d->sbdma_dscrtable_un); + d->sbdma_dscrtable = d->sbdma_dscrtable_un = NULL; } if (d->sbdma_ctxtable) {
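The rule behind this fix is that the pointer handed to kfree() has to be the one kmalloc() returned, not the aligned copy derived from it. A minimal userspace sketch of the same pattern follows, with malloc()/free() standing in for kmalloc()/kfree(). The names desc_table, align_up, table_init and table_free are made up for illustration and do not come from the driver.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct desc { uint64_t a, b; };		/* 16 bytes, like struct sbdmadscr */

struct desc_table {
	void *raw;			/* exactly what malloc() returned */
	struct desc *table;		/* aligned view into raw */
	size_t count;
};

/* Round p up to the next multiple of align (align must be a power of two). */
static void *align_up(void *p, size_t align)
{
	uintptr_t v = (uintptr_t)p;

	return (void *)((v + align - 1) & ~(uintptr_t)(align - 1));
}

static int table_init(struct desc_table *t, size_t count)
{
	/* One spare element leaves room for the alignment adjustment. */
	t->raw = malloc((count + 1) * sizeof(*t->table));
	if (!t->raw)
		return -1;
	t->table = align_up(t->raw, sizeof(*t->table));
	t->count = count;
	memset(t->table, 0, count * sizeof(*t->table));
	return 0;
}

static void table_free(struct desc_table *t)
{
	free(t->raw);			/* never free(t->table) */
	t->raw = NULL;
	t->table = NULL;
}
```

Keeping both pointers costs one word per ring and removes any dependence on the allocator's minimum alignment.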
Re: [RFC] cfg80211 and nl80211
On Thu, Oct 05, 2006 at 09:13:53AM -0400, Stuffed Crust wrote: > (Leave out the RSNIE, AuthType and KeyMgmt stuff; while they're > used in the actual key negotiation/derivation, they're separate > problems and have no bearing on the crypto layer. From the driver's > perspective the RSNIE is just an opaque blob to be appended to > beacons,presps and [re]assoc frames, KeyMgmt is purely a matter for > the authenticator/supplicant, and AuthType is just a toggle that > happens to be off for post-802.11i, although LEAP v1 adds some > complications there..) They are separate problems, but they do need to be taken into account in 802.11 interface to user space. Some drivers generate WPA/RSN IE internally and they need to be told about the allowed protocol version, authenticated key management suite, and pairwise/group cipher suites. In other words, key management is not purely for authenticator/supplicant. > Each key has: > > * Key type (WEP/TKIP/AES-CCMP/NONE) > * Key length (implied, but WEP can have varying key lengths) > * Key index (only '0' is generally used for unicast frames, but 802.11i > requires use of simultaneous broadcast keys) Pre-802.11i supported key mapping and multiple default keys.. To make things complex, many Cisco APs are configured to use non-zero key indexes with dynamic WEP keys.. > ...Meanwhile. It's foolish to ignore the 802.11 MLME. It lists out > pretty much everything that's necessary to get a working connection, and > looking at its evolution (and changes in the pipeline) shows that it's > impossible to do it all (right) the first time, and that changes, not > just additions, will be necessary. There are non-standard WLAN security protocols (look at Cisco) and one needs to keep in mind that just looking at 802.11 MLME may not cover all cases that, in practice, have to be supported.. 
Anyway, I agree that MLME primitives do change and there will be new commands needed to cover needs of future amendments to 802.11 (see, e.g., 802.11r and 802.11w drafts). -- Jouni Malinen PGP id EFC895FA
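To make the per-key fields quoted above (type, length, index) concrete, here is a rough sketch of a key descriptor with a sanity check. The struct layout and the names wlan_key and wlan_key_valid are hypothetical and not part of any cfg80211 proposal. The lengths are the usual ones (5 or 13 bytes for WEP, 32 for TKIP including the MIC keys, 16 for CCMP), and indexes 0-3 are all accepted because, as noted above, non-zero indexes do show up in practice.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum wlan_key_type { KEY_NONE, KEY_WEP, KEY_TKIP, KEY_CCMP };

/* Hypothetical key descriptor; not an existing kernel structure. */
struct wlan_key {
	enum wlan_key_type type;
	uint8_t index;		/* 0..3 */
	size_t len;		/* key material length in bytes */
	uint8_t data[32];
};

/* Check that index and length are plausible for the given key type. */
static bool wlan_key_valid(const struct wlan_key *k)
{
	if (k->index > 3)
		return false;

	switch (k->type) {
	case KEY_NONE:
		return k->len == 0;
	case KEY_WEP:
		return k->len == 5 || k->len == 13;	/* WEP-40 / WEP-104 */
	case KEY_TKIP:
		return k->len == 32;			/* TK plus two MIC keys */
	case KEY_CCMP:
		return k->len == 16;
	}
	return false;
}
```

Anything beyond this (allowed key management suites, RSN IE handling) would have to be carried separately, which is the point made in the reply above.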
Re: [PATCH] prism54: wpa support for fullmac cards
On Wed, Oct 04, 2006 23:43 you wrote: > On Wed, Oct 04, 2006 at 04:12:26PM +0200, [EMAIL PROTECTED] wrote: > > the AP code never worked. And the hostapd-ioctl interface was designed > > for prism2/2.5/3 cards, but not for "fullmac" prism54. > > What do you mean by never working? I have seen fullmac Prism54 > completing WPA authentication with hostapd.. This was using the > driver_prism54.c in hostapd, not the Host AP driver interface. > > (BTW, hostapd's backend for prism54 uses a "proprietary" interface - > > PIMFOR -, which never made it into the kernel.) > > But it worked in the external driver. So yes, saying that the version in > kernel tree never worked in AP mode would probably be valid. > OK, sorry, my fault. I should have put it this way: it never worked for ME, linmax, roland warsow, ... and I tried a lot of things (patches, how-tos, asking the maintainer, etc.). But I only saw "Oops" or "mgt: queue full" ... The PIMFOR interface is a direct "tunnel" to the hardware. And guess what? It's very "crashy" .. (e.g. "set/get the generic elements" does a very good job. ;) ) > And as far as the WEXT interface in hostapd is concerned, no, there is > no such thing yet. That's correct. WEXT is not going anywhere anymore, but maybe cfg80211? Chr
[PATCH 1/2] sk98lin: ethtool register dump
Add support for dumping the registers in the deprecated sk98lin driver. This allows for easier comparison with settings in the new skge driver. Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]> --- linux-2.6.orig/drivers/net/sk98lin/skethtool.c +++ linux-2.6/drivers/net/sk98lin/skethtool.c @@ -581,6 +581,30 @@ static int setRxCsum(struct net_device * return 0; } +static int getRegsLen(struct net_device *dev) +{ + return 0x4000; +} + +/* + * Returns copy of whole control register region + * Note: skip RAM address register because accessing it will + * cause bus hangs! + */ +static void getRegs(struct net_device *dev, struct ethtool_regs *regs, + void *p) +{ + DEV_NET *pNet = netdev_priv(dev); + const void __iomem *io = pNet->pAC->IoBase; + + regs->version = 1; + memset(p, 0, regs->len); + memcpy_fromio(p, io, B3_RAM_ADDR); + + memcpy_fromio(p + B3_RI_WTO_R1, io + B3_RI_WTO_R1, + regs->len - B3_RI_WTO_R1); +} + const struct ethtool_ops SkGeEthtoolOps = { .get_settings = getSettings, .set_settings = setSettings, @@ -599,4 +623,6 @@ const struct ethtool_ops SkGeEthtoolOps .set_tx_csum= setTxCsum, .get_rx_csum= getRxCsum, .set_rx_csum= setRxCsum, + .get_regs = getRegs, + .get_regs_len = getRegsLen, };
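getRegs() above copies the register window in two pieces so that the RAM address register is never read. A userspace sketch of the same skip-a-hole copy is below, with plain memcpy() standing in for memcpy_fromio(). The function name copy_regs_skipping and its hole_start/hole_end parameters are invented for the example; they play the role of B3_RAM_ADDR and B3_RI_WTO_R1 but carry no real register offsets.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes from src to dst, leaving [hole_start, hole_end) untouched
 * in the source and zero-filled in the destination.
 * Assumes hole_start <= hole_end <= len.
 */
static void copy_regs_skipping(void *dst, const void *src, size_t len,
			       size_t hole_start, size_t hole_end)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	memset(d, 0, len);			/* the hole reads back as zero */
	memcpy(d, s, hole_start);		/* everything below the hole */
	memcpy(d + hole_end, s + hole_end,	/* everything above it */
	       len - hole_end);
}
```

Using unsigned char pointers for the arithmetic avoids relying on the GNU extension that permits arithmetic on void pointers, which the kernel code is free to use.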
[PATCH 2/2] sk98lin: MII ioctl support
Add MII ioctl support to the deprecated sk98lin driver. This allows comparison with skge driver's PHY settings. Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]> --- linux-2.6.orig/drivers/net/sk98lin/skge.c +++ linux-2.6/drivers/net/sk98lin/skge.c @@ -113,6 +113,7 @@ #include #include #include +#include #include "h/skdrv1st.h" #include "h/skdrv2nd.h" @@ -2843,6 +2844,56 @@ unsigned longFlags; /* for spin lock return(&pAC->stats); } /* SkGeStats */ +/* + * Basic MII register access + */ +static int SkGeMiiIoctl(struct net_device *dev, + struct mii_ioctl_data *data, int cmd) +{ + DEV_NET *pNet = netdev_priv(dev); + SK_AC *pAC = pNet->pAC; + SK_IOC IoC = pAC->IoBase; + int Port = pNet->PortNr; + SK_GEPORT *pPrt = &pAC->GIni.GP[Port]; + unsigned long Flags; + int err = 0; + int reg = data->reg_num & 0x1f; + SK_U16 val = data->val_in; + + if (!netif_running(dev)) + return -ENODEV; /* Phy still in reset */ + + spin_lock_irqsave(&pAC->SlowPathLock, Flags); + switch(cmd) { + case SIOCGMIIPHY: + data->phy_id = pPrt->PhyAddr; + + /* fallthru */ + case SIOCGMIIREG: + if (pAC->GIni.GIGenesis) + SkXmPhyRead(pAC, IoC, Port, reg, &val); + else + SkGmPhyRead(pAC, IoC, Port, reg, &val); + + data->val_out = val; + break; + + case SIOCSMIIREG: + if (!capable(CAP_NET_ADMIN)) + err = -EPERM; + + else if (pAC->GIni.GIGenesis) + SkXmPhyWrite(pAC, IoC, Port, reg, val); + else + SkGmPhyWrite(pAC, IoC, Port, reg, val); + break; + default: + err = -EOPNOTSUPP; + } +spin_unlock_irqrestore(&pAC->SlowPathLock, Flags); + return err; +} + /* * @@ -2876,6 +2927,9 @@ int HeaderLength = sizeof(SK_U32) + siz pNet = netdev_priv(dev); pAC = pNet->pAC; + if (cmd == SIOCGMIIPHY || cmd == SIOCSMIIREG || cmd == SIOCGMIIREG) + return SkGeMiiIoctl(dev, if_mii(rq), cmd); + if(copy_from_user(&Ioctl, rq->ifr_data, sizeof(SK_GE_IOCTL))) { return -EFAULT; }
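The control flow of SkGeMiiIoctl() is compact enough to model outside the kernel. The sketch below mimics it against a fake 32-register PHY: reads are open to anyone, writes need a privilege flag (standing in for capable(CAP_NET_ADMIN)), and the "get PHY" command falls through to the register read. All names here (fake_phy, mii_req, mii_cmd, mii_dispatch) are invented for the example, and locking and the Genesis/Yukon split are left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for SIOCGMIIPHY, SIOCGMIIREG and SIOCSMIIREG. */
enum mii_cmd { MII_GET_PHY, MII_GET_REG, MII_SET_REG };

/* Same four fields as struct mii_ioctl_data. */
struct mii_req {
	uint16_t phy_id;
	uint16_t reg_num;
	uint16_t val_in;
	uint16_t val_out;
};

/* Hypothetical PHY model: an address plus 32 16-bit registers. */
struct fake_phy {
	uint16_t addr;
	uint16_t regs[32];
};

static int mii_dispatch(struct fake_phy *phy, struct mii_req *req,
			int cmd, bool admin)
{
	unsigned int reg = req->reg_num & 0x1f;	/* only 32 registers exist */

	switch (cmd) {
	case MII_GET_PHY:
		req->phy_id = phy->addr;
		/* fall through */
	case MII_GET_REG:
		req->val_out = phy->regs[reg];
		return 0;
	case MII_SET_REG:
		if (!admin)
			return -EPERM;
		phy->regs[reg] = req->val_in;
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

Masking reg_num with 0x1f, as the patch does, means an out-of-range register number from userspace wraps around instead of being rejected.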
Re: [IPROUTE2][PATCH] Add missing macros which was removed from kernel header. (Re: [GIT PATCH] NET: Fixes for net-2.6.19)
I applied a combined patch to fix all the headers to iproute2 (for the future 2.6.19 based release). -- Stephen Hemminger <[EMAIL PROTECTED]>
Re: [RFC] [PATCH 3/3] enable IP multicast when bonding IPoIB devices
Or Gerlitz <[EMAIL PROTECTED]> wrote: >Jay Vosburgh wrote: [...] >> Yes. Part of the difficulty is that the changes to the >> initscripts and sysconfig packages won't be compatible with versions of >> bonding prior to the bonding kernel changes (because older versions of >> bonding will refuse to add slaves if the master is down). It might >> require adding another API version to bonding, and modifying ifenslave >> to work both ways (i.e., with the current "enslave with master up" API, >> as well as the new "enslave with master down" API). > >Gee, sounds bad After some reflection, I suspect it wouldn't be all that awful. The main concern is going to be whether or not the existing ifenslave binaries supplied with distros will run with the new version of bonding. Since the new version of bonding that you're proposing is really just relaxing the rules (rather than imposing a different, incompatible set of rules), that's probably not a really big deal. I don't think it would require a revision change to the bonding ifenslave API. [...] >So the direction to have sysconfig and initscripts tools configure bonding >by sysfs and not by the enslave program is something you were considering >regardless of the needs imposed by bonding support for non ARPHRD_ETHER >netdevices? and you think the distro packages owners would like this? Yes, the long term direction is to have the initscripts configure bonding via sysfs, either directly or via the step of converting ifenslave to a script that uses sysfs. I personally find ifenslave to be more convenient to use than repeated "echo whatever > /sys/this/that/the/other", but there's no reason that ifenslave couldn't do the various echo things itself under the covers. One drawback to sysfs is that there's no real-time error reporting; you have to look at dmesg to see if your request succeeded or not. 
I'm not sure offhand if, e.g., adding a sysfs file to bonding for "last-request-status" is a kosher sysfs thing to do; if it is, then an ifenslave script could check such a thing to figure out error returns. It seems more logical to me to embed all of the bonding sysfs magic stuff into a separate script, but the maintainers of initscripts or sysconfig may see things differently. The main advantage to either of these (initscripts/sysconfig and/or ifenslave converted to sysfs) is that it eliminates the need to load the bonding driver module multiple times to have more than one bonding device with differing module parameters (because the sysfs interface can create any number of bonding interfaces with arbitrary settings). >I will look into the current methods used by sysconfig to configure >bonding and see if i can come up with sketch of how to do it with sysfs. It's probably easier to first convert ifenslave to a sysfs-using script that the existing initscripts can use. This allows the changes to be published in stages, rather than requiring a single flag day changeover. The first stage changes the bonding driver itself to permit enslavement with the master down (ensuring that existing ifenslave binaries supplied with reasonably current distros continue to function). Next, ifenslave is changed to use sysfs (simultaneously removing the adjustment of the master or slave's up/down state during enslavement). The next stage either changes the initscripts/sysconfig to use sysfs directly or changes their use of ifenslave to not do multiple loads of the bonding driver. -J --- -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]
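A sysfs-based ifenslave would ultimately be a series of writes to attribute files. A minimal sketch of such a write helper in C follows; it passes back whatever errno the open or write produces, and whether that is detailed enough to replace a look at dmesg is exactly the open question above. The name sysfs_write is made up, and the path in the usage note assumes the /sys/class/net/<bond>/bonding/slaves attribute.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write value to a sysfs-style attribute file; return 0 or -errno. */
static int sysfs_write(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t n;
	int fd, err = 0;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, value, len);
	if (n < 0)
		err = -errno;		/* the attribute rejected the request */
	else if ((size_t)n != len)
		err = -EIO;		/* short write; treat as failure */

	close(fd);
	return err;
}
```

Enslaving eth0 to bond0 would then be sysfs_write("/sys/class/net/bond0/bonding/slaves", "+eth0"), with "-eth0" to release it again.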
[PATCH 1/10] VIOC: New Network Device Driver
Adding VIOC device driver, support files: Documenation, makefiles etc. Signed-off-by: Misha Tomushev <[EMAIL PROTECTED]> diff -uprN linux-2.6.17/Documentation/networking/vioc.txt linux-2.6.17.vioc/Documentation/networking/vioc.txt --- linux-2.6.17/Documentation/networking/vioc.txt 1969-12-31 16:00:00.0 -0800 +++ linux-2.6.17.vioc/Documentation/networking/vioc.txt 2006-09-01 10:09:49.0 -0700 @@ -0,0 +1,98 @@ +VIOC Driver Release Notes (07/12/06) + + [EMAIL PROTECTED] + + +Overview + + +A Virtual Input-Output Controller (VIOC) is a PCI device that provides +10Gbps of I/O bandwidth that can be shared by up to 16 virtual network +interfaces (VNICs). VIOC hardware supports several features such as +large frames, checksum offload, gathered send, MSI/MSI-X, bandwidth +control, interrupt mitigation, etc. + +VNICs are provisioned to a host partition via an out-of-band interface +from the System Controller -- typically before the partition boots, +although they can be dynamically added or removed from a running +partition as well. + +Each provisioned VNIC appears as an Ethernet netdevice to the host OS, +and maintains its own transmit ring in DMA memory. VNICs are +configured to share up to 4 of total 16 receive rings and 1 of total +16 receive-completion rings in DMA memory. VIOC hardware classifies +packets into receive rings based on size, allowing more efficient use +of DMA buffer memory. The default, and recommended, configuration +uses groups of 'receive sets' (rxsets), each with 3 receive rings, a +receive completion ring, and a VIOC Rx interrupt. The driver gives +each rxset a NAPI poll handler associated with a phantom (invisible) +netdevice, for concurrency. VNICs are assigned to rxsets using a +simple modulus. + +VIOC provides 4 interrupts in INTx mode: 2 for Rx, 1 for Tx, and 1 for +out-of-band messages from the System Controller and errors. 
VIOC also +provides 19 MSI-X interrupts: 16 for Rx, 1 for Tx, 1 for out-of-band +messages from the System Controller, and 1 for error signalling from +the hardware. The VIOC driver makes a determination whether MSI-X +functionality is supported and initializes interrupts accordingly. +[Note: The Linux kernel disables MSI-X for VIOCs on modules with AMD +8131, even if the device is on the HT link.] + + +Module loadable parameters +== + +- poll_weight (default 8) - the number of received packets will be + processed during one call into the NAPI poll handler. + +- rx_intr_timeout (default 1) - hardware rx interrupt mitigation + timer, in units of 5us. + +- rx_intr_pkt_cnt (default 64) - hardware rx interrupt mitigation + counter, in units of packets. + +- tx_pkts_per_irq (default 64) - hardware tx interrupt mitigation + counter, in units of packets. + +- tx_pkts_per_bell (default 1) - the number of packets to enqueue on a + transmit ring before issuing a doorbell to hardware. + +Performance Tuning +== + +You may want to use the following sysctl settings to improve +performance. [NOTE: To be re-checked] + +# set in /etc/sysctl.conf + +net.ipv4.tcp_timestamps = 0 +net.ipv4.tcp_sack = 0 +net.ipv4.tcp_rmem = 1000 1000 1000 +net.ipv4.tcp_wmem = 1000 1000 1000 +net.ipv4.tcp_mem = 1000 1000 1000 + +net.core.rmem_max = 5242879 +net.core.wmem_max = 5242879 +net.core.rmem_default = 5242879 +net.core.wmem_default = 5242879 +net.core.optmem_max = 5242879 +net.core.netdev_max_backlog = 10 + +Out-of-band Communications with System Controller += + +System operators can use the out-of-band facility to allow for remote +shutdown or reboot of the host partition. Upon receiving such a +command, the VIOC driver executes "/sbin/reboot" or "/sbin/shutdown" +via the usermodehelper() call. + +This same communications facility is used for dynamic VNIC +provisioning (plug in and out). + +The VIOC driver also registers a callback with +register_reboot_notifier(). 
When the callback is executed, the driver +records the shutdown event and reason in a VIOC register to notify the +System Controller. + + + diff -uprN linux-2.6.17/MAINTAINERS linux-2.6.17.vioc/MAINTAINERS --- linux-2.6.17/MAINTAINERS2006-06-17 18:49:35.0 -0700 +++ linux-2.6.17.vioc/MAINTAINERS 2006-09-01 10:09:49.0 -0700 @@ -3106,6 +3106,11 @@ L: [EMAIL PROTECTED] W: http://rio500.sourceforge.net S: Maintained +VIOC NETWORK DRIVER +P: [EMAIL PROTECTED] +L: netdev@vger.kernel.org +S: Maintained + VIDEO FOR LINUX P: Mauro Carvalho Chehab M: [EMAIL PROTECTED] diff -uprN linux-2.6.17/drivers/net/Kconfig linux-2.6.17.vioc/drivers/net/Kconfig --- linux-2.6.17/drivers/net/Kconfig2006-06-17 18:49:35.0 -0700 +++ linux-2.6.17.vioc/drivers/net/Kconfig 2006-09-01 10:19:35.0 -0700 @@ -1818,9 +1818,
[PATCH 6/10] VIOC: New Network Device Driver
Adding VIOC device driver. Ethtool interface. Signed-off-by: Misha Tomushev <[EMAIL PROTECTED]> diff -uprN linux-2.6.17/drivers/net/vioc/vioc_ethtool.c linux-2.6.17.vioc/drivers/net/vioc/vioc_ethtool.c --- linux-2.6.17/drivers/net/vioc/vioc_ethtool.c1969-12-31 16:00:00.0 -0800 +++ linux-2.6.17.vioc/drivers/net/vioc/vioc_ethtool.c 2006-10-04 10:36:10.0 -0700 @@ -0,0 +1,309 @@ +/* + * Fabric7 Systems Virtual IO Controller Driver + * Copyright (C) 2003-2005 Fabric7 Systems. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 + * USA + * + * http://www.fabric7.com/ + * + * Maintainers: + *[EMAIL PROTECTED] + * + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "f7/vnic_hw_registers.h" +#include "f7/vnic_defs.h" + +#include +#include "vioc_vnic.h" +#include "vioc_api.h" +#include "driver_version.h" + +/* ethtool support for vnic */ + +#ifdef SIOCETHTOOL +#include + +#ifndef ETH_GSTRING_LEN +#define ETH_GSTRING_LEN 32 +#endif + +#ifdef ETHTOOL_OPS_COMPAT +#include "kcompat_ethtool.c" +#endif + +#define VIOC_READ_REG(R, M, V, viocdev) (\ + readl((viocdev->ba.virt + GETRELADDR(M, V, R + +#define VIOC_WRITE_REG(R, M, V, viocdev, value) (\ + (writel(value, viocdev->ba.virt + GETRELADDR(M, V, R + +#ifdef ETHTOOL_GSTATS +struct vnic_stats { + char stat_string[ETH_GSTRING_LEN]; + int sizeof_stat; + int stat_offset; +}; + +#define VNIC_STAT(m) sizeof(((struct vnic_device *)0)->m), \ + offsetof(struct vnic_device, m) + +static const struct vnic_stats vnic_gstrings_stats[] = { + {"rx_packets", VNIC_STAT(net_stats.rx_packets)}, + {"tx_packets", VNIC_STAT(net_stats.tx_packets)}, + {"rx_bytes", VNIC_STAT(net_stats.rx_bytes)}, + {"tx_bytes", VNIC_STAT(net_stats.tx_bytes)}, + {"rx_errors", VNIC_STAT(net_stats.rx_errors)}, + {"tx_errors", VNIC_STAT(net_stats.tx_errors)}, + {"rx_dropped", VNIC_STAT(net_stats.rx_dropped)}, + {"tx_dropped", VNIC_STAT(net_stats.tx_dropped)}, + {"multicast", VNIC_STAT(net_stats.multicast)}, + {"collisions", VNIC_STAT(net_stats.collisions)}, + {"rx_length_errors", VNIC_STAT(net_stats.rx_length_errors)}, + {"rx_over_errors", VNIC_STAT(net_stats.rx_over_errors)}, + 
{"rx_crc_errors", VNIC_STAT(net_stats.rx_crc_errors)}, + {"rx_frame_errors", VNIC_STAT(net_stats.rx_frame_errors)}, + {"rx_fifo_errors", VNIC_STAT(net_stats.rx_fifo_errors)}, + {"rx_missed_errors", VNIC_STAT(net_stats.rx_missed_errors)}, + {"tx_aborted_errors", VNIC_STAT(net_stats.tx_aborted_errors)}, + {"tx_carrier_errors", VNIC_STAT(net_stats.tx_carrier_errors)}, + {"tx_fifo_errors", VNIC_STAT(net_stats.tx_fifo_errors)}, + {"tx_heartbeat_errors", VNIC_STAT(net_stats.tx_heartbeat_errors)}, + {"tx_window_errors", VNIC_STAT(net_stats.tx_window_errors)}, + {"rx_fragment_errors", VNIC_STAT(vnic_stats.rx_fragment_errors)}, + {"rx_dropped", VNIC_STAT(vnic_stats.rx_dropped)}, + {"tx_skb_equeued", VNIC_STAT(vnic_stats.skb_enqueued)}, + {"tx_skb_freed", VNIC_STAT(vnic_stats.skb_freed)}, + {"netif_stops", VNIC_STAT(vnic_stats.netif_stops)}, + {"tx_on_empty_intr", VNIC_STAT(vnic_stats.tx_on_empty_interrupts)}, + {"tx_headroom_misses", VNIC_STAT(vnic_stats.headroom_misses)}, + {"tx_headroom_miss_drops", VNIC_STAT(vnic_stats.headroom_miss_drops)}, + {"tx_ring_size", VNIC_STAT(txq.count)}, + {"tx_ring_capacity", VNIC_STAT(txq.empty)}, + {"pkts_till_intr", VNIC_STAT(txq.tx_pkts_til_irq)}, + {"pkts_till_bell", VNIC_STAT(txq.tx_pkts_til_bell)}, + {"bells", VNIC_STAT(txq.bells)}, + {"next_to_use", VNIC_STAT(txq.next_to_use)}, + {"next_to_clean", VNIC_STAT(txq.next_to_clean)}, + {"tx_frags", VNIC_STAT(txq.frags)}, + {"tx_ring_wraps", VNIC_STAT(txq.wraps)}, + {"tx_ring_fulls", VNIC_STAT(txq.full)} +}; + +#define VNIC_STATS_LEN \ + sizeof(vnic_gstrings_stats) / sizeof(struct vnic_stats) +#endif /* ETHTOOL_GS
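The vnic_gstrings_stats table pairs each name with a sizeof/offsetof tuple so that a single loop can export every counter. The consumer of the table is cut off above, so the following is only a guess at the usual pattern, written against a toy struct rather than struct vnic_device. Every name in it (toy_dev, stat_desc, TOY_STAT, ARRAY_LEN, toy_get_stats) is invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_dev {
	uint64_t rx_packets;
	uint32_t tx_ring_fulls;
};

struct stat_desc {
	const char *name;
	size_t size;		/* sizeof the counter field */
	size_t offset;		/* offsetof the counter field */
};

#define TOY_STAT(m) \
	sizeof(((struct toy_dev *)0)->m), offsetof(struct toy_dev, m)
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

static const struct stat_desc toy_stats[] = {
	{ "rx_packets", TOY_STAT(rx_packets) },
	{ "tx_ring_fulls", TOY_STAT(tx_ring_fulls) },
};

/* Widen every counter to 64 bits, whatever its width in the struct. */
static void toy_get_stats(const struct toy_dev *dev, uint64_t *out)
{
	size_t i;

	for (i = 0; i < ARRAY_LEN(toy_stats); i++) {
		const char *p = (const char *)dev + toy_stats[i].offset;

		if (toy_stats[i].size == sizeof(uint64_t)) {
			uint64_t v;

			memcpy(&v, p, sizeof(v));
			out[i] = v;
		} else {
			uint32_t v;

			memcpy(&v, p, sizeof(v));
			out[i] = v;
		}
	}
}
```

Note that ARRAY_LEN is fully parenthesized here. VNIC_STATS_LEN as posted is not, so an expression such as x / VNIC_STATS_LEN would not evaluate as intended.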
[PATCH 7/10] VIOC: New Network Device Driver
Adding VIOC device driver. Interrupt handler. Signed-off-by: Misha Tomushev <[EMAIL PROTECTED]> diff -uprN linux-2.6.17/drivers/net/vioc/vioc_irq.c linux-2.6.17.vioc/drivers/net/vioc/vioc_irq.c --- linux-2.6.17/drivers/net/vioc/vioc_irq.c1969-12-31 16:00:00.0 -0800 +++ linux-2.6.17.vioc/drivers/net/vioc/vioc_irq.c 2006-10-04 10:37:56.0 -0700 @@ -0,0 +1,538 @@ +/* + * Fabric7 Systems Virtual IO Controller Driver + * Copyright (C) 2003-2005 Fabric7 Systems. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 + * USA + * + * http://www.fabric7.com/ + * + * Maintainers: + *[EMAIL PROTECTED] + * + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "f7/vnic_hw_registers.h" +#include "f7/vnic_defs.h" +#include "vioc_vnic.h" + +#define VIOC_INTERRUPTS_CNT19 /* 16 Rx + 1 Tx + 1 BMC + 1 Error */ +#define VIOC_INTERRUPTS_CNT_PIN_IRQ4 /* 2 Rx + 1 Tx + 1 BMC */ + +#define VIOC_SLVAR(x) x spinlock_t vioc_driver_lock = SPIN_LOCK_UNLOCKED +#define VIOC_CLI spin_lock_irq(&vioc_driver_lock) +#define VIOC_STI spin_unlock_irq(&vioc_driver_lock) +#define IRQRETURN return IRQ_HANDLED +#define TX_IRQ_IDX 16 
+#define BMC_IRQ_IDX17 +#define ERR_IRQ_IDX18 +#define HANDLER_TASKLET1 +#define HANDLER_DIRECT 2 +#define HANDLER_TASKQ 3 +#define VIOC_RX0_PCI_FUNC 0 +#define VIOC_TX_PCI_FUNC1 +#define VIOC_BMC_PCI_FUNC 2 +#define VIOC_RX1_PCI_FUNC 3 +#define VIOC_IRQ_NONE (u16) -1 +#define VIOC_ID_NONE-1 +#define VIOC_IVEC_NONE -1 +#define VIOC_INTR_NONE -1 + + +struct vioc_msix_entry { + u16 vector; + u16 entry; +}; + +struct vioc_intreq { + char name[VIOC_NAME_LEN]; + void (*intrFuncp) (void *); + void *intrFuncparm; +irqreturn_t(*hthandler) (int, void *, struct pt_regs *); + unsigned int irq; + unsigned int vec; + unsigned int intr_base; + unsigned int intr_offset; + unsigned int timeout_value; + unsigned int pkt_counter; + unsigned int rxc_mask; + struct work_struct taskq; + struct tasklet_struct tasklet; +}; + +struct viocdev_intreq { + int vioc_id; + struct pci_dev *pci_dev; + void *vioc_virt; + unsigned long long vioc_phy; + void *ioapic_virt; + unsigned long long ioapic_phy; + struct vioc_intreq intreq[VIOC_INTERRUPTS_CNT]; + struct vioc_msix_entry irqs[VIOC_INTERRUPTS_CNT]; +}; + +/* GLOBAL VIOC Interrupt table/structure */ +struct viocdev_intreq vioc_interrupts[VIOC_MAX_VIOCS]; + +VIOC_SLVAR(); + +static irqreturn_t taskq_handler(int i, void *p, struct pt_regs *r) +{ + int intr_id = VIOC_IRQ_PARAM_INTR_ID(p); + int vioc_id = VIOC_IRQ_PARAM_VIOC_ID(p); + + schedule_work(&vioc_interrupts[vioc_id].intreq[intr_id].taskq); + IRQRETURN; +} + +static irqreturn_t tasklet_handler(int i, void *p, struct pt_regs *r) +{ + int intr_id = VIOC_IRQ_PARAM_INTR_ID(p); + int vioc_id = VIOC_IRQ_PARAM_VIOC_ID(p); + + tasklet_schedule(&vioc_interrupts[vioc_id].intreq[intr_id].tasklet); + IRQRETURN; +} + +static irqreturn_t direct_handler(int i, void *p, struct pt_regs *r) +{ + int intr_id = VIOC_IRQ_PARAM_INTR_ID(p); + int vioc_id = VIOC_IRQ_PARAM_VIOC_ID(p); + + vioc_interrupts[vioc_id].intreq[intr_id]. 
+ intrFuncp(vioc_interrupts[vioc_id].intreq[intr_id].intrFuncparm); + IRQRETURN; +} + +static int vioc_enable_msix(u32 viocdev_idx) +{ + struct vioc_device *viocdev = vioc_viocdev(viocdev_idx); + int ret; + +#if defined(CONFIG_MSIX_MOD) + ret = pci_enable_msix(viocdev->pdev, + (struct msix_entry *) + &vioc_interrupts[viocdev_idx].irqs, + VIOC_INTERRUPTS_CNT); + if (ret == 0) { + dev_err(&viocdev->pdev->dev, "MSI-X OK\n"); + return VIOC_INTERRUPTS_CNT; + } else { + dev_err(&viocdev->pdev->dev, +
[PATCH 4/10] VIOC: New Network Device Driver
Adding VIOC device driver. VIOC hardware APIs. Signed-off-by: Misha Tomushev <[EMAIL PROTECTED] diff -uprN linux-2.6.17/drivers/net/vioc/vioc_api.c linux-2.6.17.vioc/drivers/net/vioc/vioc_api.c --- linux-2.6.17/drivers/net/vioc/vioc_api.c1969-12-31 16:00:00.0 -0800 +++ linux-2.6.17.vioc/drivers/net/vioc/vioc_api.c 2006-10-04 10:21:45.0 -0700 @@ -0,0 +1,384 @@ +/* + * Fabric7 Systems Virtual IO Controller Driver + * Copyright (C) 2003-2005 Fabric7 Systems. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 + * USA + * + * http://www.fabric7.com/ + * + * Maintainers: + *[EMAIL PROTECTED] + * + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "f7/vnic_hw_registers.h" +#include "f7/vnic_defs.h" + +#include "vioc_vnic.h" +#include "vioc_api.h" + +int vioc_set_rx_intr_param(int viocdev_idx, int rx_intr_id, u32 timeout, u32 cntout) +{ + int ret = 0; + struct vioc_device *viocdev; + u64 regaddr; + + viocdev = vioc_viocdev(viocdev_idx); + + regaddr = GETRELADDR(VIOC_IHCU, 0, (VREG_IHCU_RXCINTTIMER + + (rx_intr_id << 2))); + vioc_reg_wr(timeout, viocdev->ba.virt, regaddr); + + regaddr = GETRELADDR(VIOC_IHCU, 0, (VREG_IHCU_RXCINTPKTCNT + + (rx_intr_id << 2))); + vioc_reg_wr(cntout, 
viocdev->ba.virt, regaddr); + + return ret; +} + + +int vioc_get_vnic_mac(int viocdev_idx, u32 vnic_id, u8 * p) +{ + struct vioc_device *viocdev = vioc_viocdev(viocdev_idx); + u64 regaddr; + u32 value; + + regaddr = GETRELADDR(VIOC_VENG, vnic_id, VREG_VENG_MACADDRLO); + vioc_reg_rd(viocdev->ba.virt, regaddr, &value); + *((u32 *) & p[2]) = htonl(value); + + regaddr = GETRELADDR(VIOC_VENG, vnic_id, VREG_VENG_MACADDRHI); + vioc_reg_rd(viocdev->ba.virt, regaddr, &value); + *((u16 *) & p[0]) = htons(value); + + return 0; +} + +int vioc_set_vnic_mac(int viocdev_idx, u32 vnic_id, u8 * p) +{ + struct vioc_device *viocdev = vioc_viocdev(viocdev_idx); + u64 regaddr; + u32 value; + + regaddr = GETRELADDR(VIOC_VENG, vnic_id, VREG_VENG_MACADDRLO); + value = ntohl(*((u32 *) & p[2])); + + vioc_reg_wr(value, viocdev->ba.virt, regaddr); + + regaddr = GETRELADDR(VIOC_VENG, vnic_id, VREG_VENG_MACADDRHI); + value = (ntohl(*((u32 *) & p[0])) >> 16) & 0x; + + vioc_reg_wr(value, viocdev->ba.virt, regaddr); + + return 0; +} + +int vioc_set_txq(int viocdev_idx, u32 vnic_id, u32 txq_id, dma_addr_t base, +u32 num_elements) +{ + int ret = 0; + u32 value; + struct vioc_device *viocdev; + u64 regaddr; + + viocdev = vioc_viocdev(viocdev_idx); + if (vnic_id >= VIOC_MAX_VNICS) + goto parm_err_ret; + + if (txq_id >= VIOC_MAX_TXQ) + goto parm_err_ret; + + regaddr = GETRELADDR(VIOC_VENG, vnic_id, (VREG_VENG_TXD_W0 + (txq_id << 5))); + + value = base; + vioc_reg_wr(value, viocdev->ba.virt, regaddr); + + regaddr = GETRELADDR(VIOC_VENG, vnic_id, (VREG_VENG_TXD_W1 + (txq_id << 5))); + value = (((base >> 16) >> 16) & 0x00ff) | + ((num_elements << 8) & 0x0000); + vioc_reg_wr(value, viocdev->ba.virt, regaddr); + + /* +* Enable Interrupt-on-Empty +*/ + regaddr = GETRELADDR(VIOC_VENG, vnic_id, VREG_VENG_TXINTCTL); + vioc_reg_wr(VREG_VENG_TXINTCTL_INTONEMPTY_MASK, viocdev->ba.virt, + regaddr); + + return ret; + + parm_err_ret: + return -EINVAL; +} + +int vioc_set_rxc(int viocdev_idx, struct rxc *rxc) +{ + u32 
value; + struct vioc_device *viocdev; + u64 regaddr; + int ret = 0; + + viocdev = vioc_viocdev(viocdev_idx); + + regaddr = GETRELADDR(VIOC_IHCU, 0, (VREG_IHCU_RXC_LO + (rxc->rxc_id << 4))); +value = rxc->dma; + vioc_reg_wr(value, viocdev->ba.virt, regaddr); + + regaddr = GETRELADDR(VIOC_IHCU, 0, (VREG_IHCU_RXC_HI + (rxc->rxc_id << 4))); +value = (((rxc->dma >> 16) >> 16) & 0x00ff) | +
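The vioc_get_vnic_mac()/vioc_set_vnic_mac() pair in this patch splits a 6-byte MAC address across a high register (bytes 0-1) and a low register (bytes 2-5). Below is a minimal sketch of the same byte layout, written byte by byte so that it depends neither on host endianness nor on unaligned u16/u32 pointer casts. The helper names are hypothetical, and the 16-bit width of the high word is an assumption, since the mask constant is truncated in the posted patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bytes 0-1 of the MAC form the high register value,
 * bytes 2-5 form the low register value, most significant byte first. */
void mac_to_regs(const uint8_t mac[6], uint32_t *hi, uint32_t *lo)
{
	*hi = ((uint32_t)mac[0] << 8) | (uint32_t)mac[1];
	*lo = ((uint32_t)mac[2] << 24) | ((uint32_t)mac[3] << 16) |
	      ((uint32_t)mac[4] << 8) | (uint32_t)mac[5];
}

/* Hypothetical helper: inverse of mac_to_regs(). Only the low 16 bits of
 * 'hi' are used (assumed register width). */
void regs_to_mac(uint32_t hi, uint32_t lo, uint8_t mac[6])
{
	mac[0] = (uint8_t)(hi >> 8);
	mac[1] = (uint8_t)hi;
	mac[2] = (uint8_t)(lo >> 24);
	mac[3] = (uint8_t)(lo >> 16);
	mac[4] = (uint8_t)(lo >> 8);
	mac[5] = (uint8_t)lo;
}

/* Returns 1 if a MAC survives the split/merge round trip unchanged. */
int mac_roundtrip_ok(const uint8_t mac[6])
{
	uint8_t out[6];
	uint32_t hi, lo;

	mac_to_regs(mac, &hi, &lo);
	assert(hi <= 0xffffu);
	regs_to_mac(hi, lo, out);
	return memcmp(mac, out, 6) == 0;
}
```

For the address 00:11:22:33:44:55 this yields a high value of 0x0011 and a low value of 0x22334455, which matches what the htons()/htonl() casts in the patch produce.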
[PATCH 5/10] VIOC: New Network Device Driver
Adding VIOC device driver. Device driver initialization/termination. Signed-off-by: Misha Tomushev <[EMAIL PROTECTED]> diff -uprN linux-2.6.17/drivers/net/vioc/vioc_vnic.h linux-2.6.17.vioc/drivers/net/vioc/vioc_vnic.h --- linux-2.6.17/drivers/net/vioc/vioc_vnic.h 1969-12-31 16:00:00.0 -0800 +++ linux-2.6.17.vioc/drivers/net/vioc/vioc_vnic.h 2006-10-04 10:10:04.0 -0700 @@ -0,0 +1,498 @@ +/* + * Fabric7 Systems Virtual IO Controller Driver + * Copyright (C) 2003-2005 Fabric7 Systems. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 + * USA + * + * http://www.fabric7.com/ + * + * Maintainers: + *[EMAIL PROTECTED] + * + * + */ +#ifndef _VIOC_VNIC_H +#define _VIOC_VNIC_H + +#include +#include +#include + +#include "f7/vnic_defs.h" +#include "f7/vnic_hw_registers.h" +#include "f7/vioc_pkts_defs.h" + +/* + * VIOC PCI constants + */ +#define PCI_VENDOR_ID_FABRIC7 0xfab7 +#define PCI_DEVICE_ID_VIOC_1 0x0001 +#define PCI_DEVICE_ID_VIOC_8 0x0008 +#define PCI_DEVICE_ID_IOAPIC 0x7459 + +#define VIOC_DRV_MODULE_NAME "vioc" + +#define F7PF_HLEN_MIN 8 /* Minimal (kl=0) header */ +#define F7PF_HLEN_STD 10 /* Standard (kl=1) header */ + +#define VNIC_MAX_MTU 9180 +#define VNIC_STD_MTU 1500 + +/* VIOC device constants */ +#define VIOC_MAX_RXDQ 16 +#define VIOC_MAX_RXCQ 16 +#define VIOC_MAX_RXQ 4 +#define VIOC_MAX_TXQ 4 +#define VIOC_NAME_LEN 16 + +/* + * VIOC device state + */ + +#define VIOC_STATE_INIT0 +#define VIOC_STATE_UP (VIOC_STATE_INIT + 1) + +#define RX_DESC_SIZE sizeof (struct rx_pktBufDesc_Phys_w) +#define RX_DESC_QUANT (4096/RX_DESC_SIZE) + +#define RXC_DESC_SIZE sizeof (struct rxc_pktDesc_Phys_w) +#define RXC_DESC_QUANT (4096/RXC_DESC_SIZE) + +#define TX_DESC_SIZE sizeof (struct tx_pktBufDesc_Phys_w) +#define TX_DESC_QUANT (4096/TX_DESC_SIZE) + +#define RXS_DESC_SIZE sizeof (struct rxc_pktStatusBlock_w) + +#define VIOC_COPYOUT_THRESHOLD 128 +#define VIOC_RXD_BATCH_BITS32 +#define ALL_BATCH_SW_OWNED 0 +#define ALL_BATCH_HW_OWNED 0x + +#define VIOC_ANY_VNIC 0 +#define VIOC_NONE_TO_HW(u32) -1 + +/* + * Status of the Rx operation as reflected in Rx Completion Descriptor + */ +#define GET_VNIC_RXC_STATUS(rxcd) (\ + GET_VNIC_RXC_BADCRC(rxcd) |\ + GET_VNIC_RXC_BADLENGTH(rxcd) |\ + GET_VNIC_RXC_BADSMPARITY(rxcd) |\ + GET_VNIC_RXC_PKTABORT(rxcd)\ + ) +#define VNIC_RXC_STATUS_OK_W 0 + +#define 
VNIC_RXC_STATUS_MASK (\ + VNIC_RXC_ISBADLENGTH_W | \ + VNIC_RXC_ISBADCRC_W | \ + VNIC_RXC_ISBADSMPARITY_W | \ + VNIC_RXC_ISPKTABORT_W \ + ) + +#define VIOC_IRQ_PARAM_VIOC_ID(param) \ + (int) (((u64) param >> 28) & 0xf) +#define VIOC_IRQ_PARAM_INTR_ID(param) \ + (int) ((u64) param & 0x) +#define VIOC_IRQ_PARAM_PARAM_ID(param) \ + (int) (((u64) param >> 16) & 0xff) + +#define VIOC_IRQ_PARAM_SET(vioc, intr, param) \ + u64) vioc & 0xf) << 28) | \ + (((u64) param & 0xff) << 16) | \ + ((u64) intr & 0x)) +/* + * Return status codes + */ +#define E_VIOCOK 0 +#define E_VIOCMAX 1 +#define E_VIOCINTERNAL 2 +#define E_VIOCNETREGERR 3 +#define E_VIOCPARMERR 4 +#define E_VIOCNOOP 5 +#define E_VIOCTXFULL 6 +#define E_VIOCIFNOTFOUND 7 +#define E_VIOCMALLOCERR 8 +#define E_VIOCORDERR 9 +#define E_VIOCHWACCESS 10 +#define E_VIOCHWNOTREADY 11 +#define E_ALLOCERR 12 +#define E_VIOCRXHW 13 +#define E_VIOCRXCEMPTY 14 + +/* + * From the HW statnd point, every VNIC has 4 RxQ - receive queues. + * Every RxQ is mapped to RxDQ (a ring with buffers for Rx Packets) + * and RxC queue (a ring with descriptors that reflect the status of the receive. + * I.e. when VIOC receives the packet on any of the 4 RxQ, it would use the mapping to determine where + * to get buffer for the packet (RxDQ) and where to post the result of the operation (RxC). + */ + +struct rxd_q_prov { + u32 buf_size; + u32 entries; + u8 id; + u8 state; +}; + +struct vnic_prov_def { + struct rxd_q_prov rxd_ring[4]; + u32 tx_entries; + u32 rxc_en
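The VIOC_IRQ_PARAM_* macros in this header pack a VIOC id, a parameter, and an interrupt id into a single value. Here is a rough sketch of that packing as inline-style functions, which sidestep the unparenthesized macro arguments. The function names are hypothetical, and the 16-bit interrupt-id mask (0xffff) is an assumption: the constant is truncated in the posted header, and 16 bits is simply what fits below the parameter field at bit 16.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: [31:28] VIOC id, [23:16] parameter, [15:0] interrupt id. */
uint64_t irq_param_set(unsigned vioc, unsigned intr, unsigned param)
{
	return (((uint64_t)vioc & 0xf) << 28) |
	       (((uint64_t)param & 0xff) << 16) |
	       ((uint64_t)intr & 0xffff);	/* 0xffff is an assumption */
}

int irq_param_vioc_id(uint64_t p)
{
	return (int)((p >> 28) & 0xf);
}

int irq_param_param_id(uint64_t p)
{
	return (int)((p >> 16) & 0xff);
}

int irq_param_intr_id(uint64_t p)
{
	return (int)(p & 0xffff);	/* 0xffff is an assumption */
}
```

Under these assumptions, irq_param_set(3, 0x1234, 0xab) gives 0x30ab1234, and the three accessors return 3, 0xab and 0x1234 respectively.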
[PATCH 3/10] VIOC: New Network Device Driver
Adding VIOC device driver. Out-of-band provisioning protocol support code. Signed-off-by: Misha Tomushev <[EMAIL PROTECTED]> diff -uprN linux-2.6.17/drivers/net/vioc/f7/spp.h linux-2.6.17.vioc/drivers/net/vioc/f7/spp.h --- linux-2.6.17/drivers/net/vioc/f7/spp.h 1969-12-31 16:00:00.0 -0800 +++ linux-2.6.17.vioc/drivers/net/vioc/f7/spp.h 2006-09-06 16:22:59.0 -0700 @@ -0,0 +1,68 @@ +/* + * Fabric7 Systems Virtual IO Controller Driver + * Copyright (C) 2003-2005 Fabric7 Systems. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 + * USA + * + * http://www.fabric7.com/ + * + * Maintainers: + *[EMAIL PROTECTED] + * + * + */ +#ifndef _SPP_H_ +#define _SPP_H_ + +#include "vnic_hw_registers.h" + +#define SPP_MODULE VIOC_BMC + +#define SPP_CMD_REG_BANK 15 +#define SPP_SIM_PMM_BANK 14 +#defineSPP_PMM_BMC_BANK13 + +/* communications COMMAND REGISTERS */ +#define SPP_SIM_PMM_CMDREG GETRELADDR(SPP_MODULE, SPP_CMD_REG_BANK, VREG_BMC_REG_R1) +#define VIOCCP_SPP_SIM_PMM_CMDREG \ + VIOCCP_GETRELADDR(SPP_MODULE, SPP_CMD_REG_BANK, VREG_BMC_REG_R1) +#define SPP_PMM_SIM_CMDREG GETRELADDR(SPP_MODULE, SPP_CMD_REG_BANK, VREG_BMC_REG_R2) +#define VIOCCP_SPP_PMM_SIM_CMDREG \ + VIOCCP_GETRELADDR(SPP_MODULE, SPP_CMD_REG_BANK, VREG_BMC_REG_R2) +#define SPP_PMM_BMC_HB_CMDREG GETRELADDR(SPP_MODULE, SPP_CMD_REG_BANK, VREG_BMC_REG_R3) +#define SPP_PMM_BMC_SIG_CMDREG GETRELADDR(SPP_MODULE, SPP_CMD_REG_BANK, VREG_BMC_REG_R4) +#define SPP_PMM_BMC_CMDREG GETRELADDR(SPP_MODULE, SPP_CMD_REG_BANK, VREG_BMC_REG_R5) + +#define SPP_BANK_ADDR(bank) GETRELADDR(SPP_MODULE, bank, VREG_BMC_REG_R0) + +#define SPP_SIM_PMM_DATA GETRELADDR(SPP_MODULE, SPP_SIM_PMM_BANK, VREG_BMC_REG_R0) +#define VIOCCP_SPP_SIM_PMM_DATA\ + VIOCCP_GETRELADDR(SPP_MODULE, SPP_SIM_PMM_BANK, VREG_BMC_REG_R0) + +/* PMM-BMC Sensor register bits */ +#define SPP_PMM_BMC_HB_SENREG GETRELADDR(SPP_MODULE, 0, VREG_BMC_SENSOR0) +#define SPP_PMM_BMC_CTL_SENREG GETRELADDR(SPP_MODULE, 0, VREG_BMC_SENSOR1) +#define SPP_PMM_BMC_SENREG GETRELADDR(SPP_MODULE, 0, VREG_BMC_SENSOR2) + +/* BMC Interrupt number used to alert PMM that message has been sent */ +#define SPP_SIM_PMM_INTR 1 +#define SPP_BANK_REGS 32 + + +#define SPP_OK 0 +#define SPP_CHKSUM_ERR 1 +#endif /* _SPP_H_ */ + diff -uprN linux-2.6.17/drivers/net/vioc/f7/spp_msgdata.h 
linux-2.6.17.vioc/drivers/net/vioc/f7/spp_msgdata.h --- linux-2.6.17/drivers/net/vioc/f7/spp_msgdata.h 1969-12-31 16:00:00.0 -0800 +++ linux-2.6.17.vioc/drivers/net/vioc/f7/spp_msgdata.h 2006-09-06 16:22:59.0 -0700 @@ -0,0 +1,54 @@ +/* + * Fabric7 Systems Virtual IO Controller Driver + * Copyright (C) 2003-2005 Fabric7 Systems. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 + * USA + * + * http://www.fabric7.com/ + * + * Maintainers: + *[EMAIL PROTECTED] + * + * + */ +#ifndef _SPPMSGDATA_H_ +#define _SPPMSGDATA_H_ + +#include "spp.h" + +/* KEYs For SPP_FACILITY_VNIC */ +#define SPP_KEY_VNIC_CTL 1 +#define SPP_KEY_SET_PROV 2 + +/* Data Register Offset for VIOC ID parameter */ +#define SPP_VIOC_ID_IDX0 +#define SPP_VIOC_ID_OFFSET GETRELADDR(SPP_MODULE, SPP_SIM_PMM_BANK, (VREG_BMC_REG_R0 + (SPP_VIOC_ID_IDX << 2))) +#define VIOCCP_SPP_VIOC_ID_OFFSET VIOCCP_GETRELADDR(SPP_MODULE, SPP_SIM_PMM_BANK, (VREG_BMC_REG_R0 + (SPP_VIOC_ID_IDX << 2))) + +/* KEYs for SPP_FACILITY_SYS */ +#define SPP_KEY_REQUEST_SIGNAL 1 + +/* Data Register Offse
[PATCH 0/10] VIOC: New Network Device Driver
The following patch series introduces the VIOC Device Driver, which provides a network device interface to the internal fabric interconnected network used on servers designed and built by Fabric 7 Systems. -- Misha Tomushev [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 9/10] VIOC: New Network Device Driver
Adding VIOC device driver. Packet receive code. Signed-off-by: Misha Tomushev <[EMAIL PROTECTED]> diff -uprN linux-2.6.17/drivers/net/vioc/vioc_receive.c linux-2.6.17.vioc/drivers/net/vioc/vioc_receive.c --- linux-2.6.17/drivers/net/vioc/vioc_receive.c1969-12-31 16:00:00.0 -0800 +++ linux-2.6.17.vioc/drivers/net/vioc/vioc_receive.c 2006-10-04 10:39:10.0 -0700 @@ -0,0 +1,365 @@ +/* + * Fabric7 Systems Virtual IO Controller Driver + * Copyright (C) 2003-2005 Fabric7 Systems. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 + * USA + * + * http://www.fabric7.com/ + * + * Maintainers: + *[EMAIL PROTECTED] + * + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "f7/vnic_hw_registers.h" +#include "f7/vnic_defs.h" + +#include "vioc_vnic.h" +#include "vioc_api.h" + +/* + * Receive one packet. The VIOC is read-locked. Since RxDQs are + * partitioned into independent RxSets and VNICs assigned to exactly + * one RxSet, no locking is needed on RxDQs or RxCQs. + * Return true if we got a packet, false if the queue is empty. 
+ */ +int vioc_rx_pkt(struct vioc_device *viocdev, struct rxc *rxc, u32 sw_idx) +{ + u32 rx_status; + u32 vnic_id; + u32 rxdq_id; + u32 rxd_id; + u32 pkt_len; + u32 dmap_idx; + struct sk_buff *in_skb, *out_skb; + struct vnic_device *vnicdev; + struct rxdq *rxdq; + struct rxc_pktDesc_Phys_w *rxcd; + struct rx_pktBufDesc_Phys_w *rxd; + + rxcd = &rxc->desc[sw_idx]; + if (GET_VNIC_RXC_FLAGGED(rxcd) != VNIC_RXC_FLAGGED_HW_W) + return 0; /* ring empty */ + + vnic_id = GET_VNIC_RXC_VNIC_ID_SHIFTED(rxcd); + rxdq_id = GET_VNIC_RXC_RXQ_ID_SHIFTED(rxcd); + rxd_id = GET_VNIC_RXC_IDX_SHIFTED(rxcd); + rxdq = viocdev->rxd_p[rxdq_id]; + rxd = &rxdq->desc[rxd_id]; + + in_skb = (struct sk_buff *)rxdq->vbuf[rxd_id].skb; + BUG_ON(in_skb == NULL); + out_skb = in_skb; /* default it here */ + + /* +* Reset HW FLAG in this RxC Descriptor, marking it as "SW +* acknowledged HW completion". +*/ + CLR_VNIC_RXC_FLAGGED(rxcd); + + if (!(viocdev->vnics_map & (1 << vnic_id))) + /* VNIC is not enabled - silently drop packet */ + goto out; + + in_skb->dev = viocdev->vnic_netdev[vnic_id]; + vnicdev = in_skb->dev->priv; + BUG_ON(vnicdev == NULL); + + rx_status = GET_VNIC_RXC_STATUS(rxcd); + + if (likely(rx_status == VNIC_RXC_STATUS_OK_W)) { + + pkt_len = GET_NTOH_VIOC_F7PF_PKTLEN_SHIFTED(in_skb->data); + + /* Copy out mice packets in ALL rings, even small */ + if (pkt_len <= VIOC_COPYOUT_THRESHOLD) { + out_skb = dev_alloc_skb(pkt_len); + if (!out_skb) + goto drop; + out_skb->dev = in_skb->dev; + memcpy(out_skb->data, in_skb->data, pkt_len); + } + + vnicdev->net_stats.rx_bytes += pkt_len; + vnicdev->net_stats.rx_packets++; + /* Set ->len and ->tail to reflect packet size */ + skb_put(out_skb, pkt_len); + + skb_pull(out_skb, F7PF_HLEN_STD); + out_skb->protocol = eth_type_trans(out_skb, out_skb->dev); + + /* Checksum offload */ + if (GET_VNIC_RXC_ENTAG_SHIFTED(rxcd) == + VIOC_F7PF_ET_ETH_IPV4_CKS) + out_skb->ip_summed = CHECKSUM_UNNECESSARY; + else { + out_skb->ip_summed = CHECKSUM_HW; + out_skb->csum = 
+ ntohs(~GET_VNIC_RXC_CKSUM_SHIFTED(rxcd) & 0x); + } + + netif_receive_skb(out_skb); + } else { + vnicdev->net_stats.rx_errors++; + if (rx_status & VNIC_RXC_ISBADLENGTH_W) + vnicdev->net_stats.rx_length_errors++; + if (rx_status & VNIC_RXC_ISBADCRC_W) + vnicdev
[PATCH 10/10] VIOC: New Network Device Driver
Adding VIOC device driver. Packet transmit code. Signed-off-by: Misha Tomushev <[EMAIL PROTECTED]> diff -uprN linux-2.6.17/drivers/net/vioc/vioc_transmit.c linux-2.6.17.vioc/drivers/net/vioc/vioc_transmit.c --- linux-2.6.17/drivers/net/vioc/vioc_transmit.c 1969-12-31 16:00:00.0 -0800 +++ linux-2.6.17.vioc/drivers/net/vioc/vioc_transmit.c 2006-10-04 10:51:49.0 -0700 @@ -0,0 +1,1032 @@ +/* + * Fabric7 Systems Virtual IO Controller Driver + * Copyright (C) 2003-2005 Fabric7 Systems. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 + * USA + * + * http://www.fabric7.com/ + * + * Maintainers: + *[EMAIL PROTECTED] + * + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "f7/vnic_defs.h" +#include "f7/vioc_pkts_defs.h" + +#include "vioc_vnic.h" +#include "vioc_api.h" + +#define VNIC_MIN_MTU 64 +#define TXQ00 +#define NOT_SET-1 + +static inline u32 vnic_rd_txd_ctl(struct txq *txq) +{ + return readl(txq->va_of_vreg_veng_txd_ctl); +} + +static inline void vnic_ring_tx_bell(struct txq *txq) +{ + writel(txq->shadow_VREG_VENG_TXD_CTL | VREG_VENG_TXD_CTL_QRING_MASK, + txq->va_of_vreg_veng_txd_ctl); + txq->bells++; +} + +static inline void vnic_reset_tx_ring_err(struct txq *txq) +{ + writel(txq->shadow_VREG_VENG_TXD_CTL | + (VREG_VENG_TXD_CTL_QENABLE_MASK | VREG_VENG_TXD_CTL_CLEARMASK), + txq->va_of_vreg_veng_txd_ctl); +} + +static inline void vnic_enable_tx_ring(struct txq *txq) +{ + txq->shadow_VREG_VENG_TXD_CTL = VREG_VENG_TXD_CTL_QENABLE_MASK; + writel(txq->shadow_VREG_VENG_TXD_CTL, txq->va_of_vreg_veng_txd_ctl); +} + +static inline void vnic_disable_tx_ring(struct txq *txq) +{ + txq->shadow_VREG_VENG_TXD_CTL = 0; + writel(0, txq->va_of_vreg_veng_txd_ctl); +} + +static inline void vnic_pause_tx_ring(struct txq *txq) +{ + txq->shadow_VREG_VENG_TXD_CTL |= VREG_VENG_TXD_CTL_QPAUSE_MASK; + writel(txq->shadow_VREG_VENG_TXD_CTL, txq->va_of_vreg_veng_txd_ctl); +} + +static inline void vnic_resume_tx_ring(struct txq *txq) +{ + txq->shadow_VREG_VENG_TXD_CTL &= 
~VREG_VENG_TXD_CTL_QPAUSE_MASK; + writel(txq->shadow_VREG_VENG_TXD_CTL, txq->va_of_vreg_veng_txd_ctl); +} + + +/* TxQ must be locked */ +static void vnic_reset_txq(struct vnic_device *vnicdev, struct txq *txq) +{ + + struct tx_pktBufDesc_Phys_w *txd; + int i; + + vnic_reset_tx_ring_err(txq); + + /* The reset of the code is not executing +* because so far we can't reset individual VNICs. +* Need to (SW) Reset the whole VIOC. +*/ + + vnic_disable_tx_ring(txq); + wmb(); + /* +* Clean-up all Tx Descriptors, take ownership of all +* descriptors +*/ + for (i = 0; i < txq->count; i++) { + if (txq->desc) { + txd = TXD_PTR(txq, i); + txd->word_1 = 0; + txd->word_0 = 0; + } + if (txq->vbuf) { + if (txq->vbuf[i].dma) { + pci_unmap_page(vnicdev->viocdev->pdev, + txq->vbuf[i].dma, + txq->vbuf[i].length, + PCI_DMA_TODEVICE); + txq->vbuf[i].dma = 0; + } + + /* Free skb , should be for SOP (in case of frags) only */ + if (txq->vbuf[i].skb) { + dev_kfree_skb_any((struct sk_buff *)txq-> + vbuf[i].skb); + txq->vbuf[i].skb = NULL; + } + } + } + txq->next_to_clean = 0; + txq->next_to_use = 0; + txq->empty = txq->count; + wmb()
[PATCH 8/10] VIOC: New Network Device Driver
Adding VIOC device driver.
Device driver provisioning settings.

Signed-off-by: Misha Tomushev <[EMAIL PROTECTED]>

diff -uprN linux-2.6.17/drivers/net/vioc/vioc_provision.c linux-2.6.17.vioc/drivers/net/vioc/vioc_provision.c
--- linux-2.6.17/drivers/net/vioc/vioc_provision.c 1969-12-31 16:00:00.0 -0800
+++ linux-2.6.17.vioc/drivers/net/vioc/vioc_provision.c 2006-10-03 12:17:03.0 -0700
@@ -0,0 +1,226 @@
+/*
+ * Fabric7 Systems Virtual IO Controller Driver
+ * Copyright (C) 2003-2005 Fabric7 Systems. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
+ * USA
+ *
+ * http://www.fabric7.com/
+ *
+ * Maintainers:
+ *[EMAIL PROTECTED]
+ *
+ *
+ */
+#include "f7/vnic_hw_registers.h"
+#include "vioc_vnic.h"
+
+/*
+ * Standard parameters for ring provisioning. Single TxQ per VNIC.
+ * Two RX sets per VIOC, with 3 RxDs, 1 RxC, 1 Rx interrupt per set.
+ */
+
+#define TXQ_ENTRIES 1024
+#define TX_INTR_ON_EMPTY false
+
+/* RXDQ sizes (entry counts) must be multiples of this */
+#define RXDQ_ALIGN VIOC_RXD_BATCH_BITS
+#define RXDQ_ENTRIES 1024
+
+#define RXDQ_JUMBO_ENTRIES ALIGN(RXDQ_ENTRIES, RXDQ_ALIGN)
+#define RXDQ_STD_ENTRIES ALIGN(RXDQ_ENTRIES, RXDQ_ALIGN)
+#define RXDQ_SMALL_ENTRIES ALIGN(RXDQ_ENTRIES, RXDQ_ALIGN)
+#define RXDQ_EXTRA_ENTRIES ALIGN(RXDQ_ENTRIES, RXDQ_ALIGN)
+
+#define RXC_ENTRIES (RXDQ_JUMBO_ENTRIES+RXDQ_STD_ENTRIES+RXDQ_SMALL_ENTRIES+RXDQ_EXTRA_ENTRIES)
+
+#define RXDQ_JUMBO_BUFSIZE (VNIC_MAX_MTU+ETH_HLEN+F7PF_HLEN_STD)
+#define RXDQ_STD_BUFSIZE (VNIC_STD_MTU+ETH_HLEN+F7PF_HLEN_STD)
+#define RXDQ_SMALL_BUFSIZE (256+ETH_HLEN+F7PF_HLEN_STD)
+
+#define RXDQ_JUMBO_ALLOC_BUFSIZE ALIGN(RXDQ_JUMBO_BUFSIZE,64)
+#define RXDQ_STD_ALLOC_BUFSIZE ALIGN(RXDQ_STD_BUFSIZE,64)
+#define RXDQ_SMALL_ALLOC_BUFSIZE ALIGN(RXDQ_SMALL_BUFSIZE,64)
+
+/*
+ Every entry in this structure is defined as follows:
+
+struct vnic_prov_def {
+	struct rxd_q_prov rxd_ring[4];
+	u32 tx_entries;		Size of Tx Ring
+	u32 rxc_entries;	Size of Rx Completion Ring
+	u8 rxc_id;		Rx Completion queue ID
+	u8 rxc_intr_id;		INTR servicing the above Rx Completion queue
+};
+
+The 4 rxd_q_prov structures of rxd_ring[] array define Rx queues per VNIC.
+struct rxd_q_prov {
+	u32 buf_size;	Buffer size
+	u32 entries;	Size of the queue
+	u8 id;		Queue id/
+	u8 state;	Provisioning state 1-ena, 0-dis
+};
+
+*/
+
+struct vnic_prov_def vnic_set_0 = {
+	.rxd_ring[0].buf_size = RXDQ_SMALL_ALLOC_BUFSIZE,
+	.rxd_ring[0].entries = RXDQ_SMALL_ENTRIES,
+	.rxd_ring[0].id = 0,
+	.rxd_ring[0].state = 1,
+	.rxd_ring[1].buf_size = RXDQ_STD_ALLOC_BUFSIZE,
+	.rxd_ring[1].entries = RXDQ_STD_ENTRIES,
+	.rxd_ring[1].id = 1,
+	.rxd_ring[1].state = 1,
+	.rxd_ring[2].buf_size = RXDQ_JUMBO_ALLOC_BUFSIZE,
+	.rxd_ring[2].entries = RXDQ_JUMBO_ENTRIES,
+	.rxd_ring[2].id = 2,
+	.rxd_ring[2].state = 1,
+	.tx_entries = TXQ_ENTRIES,.rxc_entries = RXC_ENTRIES,.rxc_id =
+	    0,.rxc_intr_id = 0
+};
+
+struct vnic_prov_def vnic_set_1 = {
+	.rxd_ring[0].buf_size = RXDQ_SMALL_ALLOC_BUFSIZE,
+	.rxd_ring[0].entries = RXDQ_SMALL_ENTRIES,
+	.rxd_ring[0].id = 4,
+	.rxd_ring[0].state = 1,
+	.rxd_ring[1].buf_size = RXDQ_STD_ALLOC_BUFSIZE,
+	.rxd_ring[1].entries = RXDQ_STD_ENTRIES,
+	.rxd_ring[1].id = 5,
+	.rxd_ring[1].state = 1,
+	.rxd_ring[2].buf_size = RXDQ_JUMBO_ALLOC_BUFSIZE,
+	.rxd_ring[2].entries = RXDQ_JUMBO_ENTRIES,
+	.rxd_ring[2].id = 6,
+	.rxd_ring[2].state = 1,
+	.tx_entries = TXQ_ENTRIES,.rxc_entries = RXC_ENTRIES,.rxc_id =
+	    1,.rxc_intr_id = 1
+};
+
+struct vnic_prov_def vnic_set_2 = {
+	.rxd_ring[0].buf_size = RXDQ_SMALL_ALLOC_BUFSIZE,
+	.rxd_ring[0].entries = RXDQ_SMALL_ENTRIES,
+	.rxd_ring[0].id = 8,
+	.rxd_ring[0].state = 1,
+	.rxd_ring[1].buf_size = RXDQ_STD_ALLOC_BUFSIZE,
+	.rxd_ring[1].e
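The provisioning macros in this patch size each Rx buffer as MTU plus Ethernet header plus F7PF header, rounded up to a 64-byte boundary. As a worked example of that arithmetic, here is a small userspace sketch. ALIGN_UP and rx_alloc_bufsize are hypothetical stand-ins for the kernel's ALIGN() and the RXDQ_*_ALLOC_BUFSIZE macros; the MTU and F7PF header constants are taken from the posted vioc_vnic.h, and ETH_HLEN is the standard 14-byte Ethernet header length.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to a multiple of a; like the kernel's ALIGN(), this only
 * works when a is a power of two. */
#define ALIGN_UP(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

enum {
	ETH_HLEN      = 14,	/* standard Ethernet header length */
	F7PF_HLEN_STD = 10,	/* from vioc_vnic.h in this series */
	VNIC_MAX_MTU  = 9180,	/* from vioc_vnic.h in this series */
	VNIC_STD_MTU  = 1500,	/* from vioc_vnic.h in this series */
	RX_BUF_ALIGN  = 64	/* alignment used by the *_ALLOC_BUFSIZE macros */
};

/* Hypothetical helper: allocation size for an Rx buffer holding 'payload'
 * bytes plus both headers, rounded up to RX_BUF_ALIGN. */
size_t rx_alloc_bufsize(size_t payload)
{
	return ALIGN_UP(payload + ETH_HLEN + F7PF_HLEN_STD, RX_BUF_ALIGN);
}
```

With these constants the jumbo ring buffers come out at 9216 bytes (9204 rounded up), the standard ring at 1536 bytes (1524 rounded up), and the small ring at 320 bytes (280 rounded up). RXDQ_ENTRIES (1024) is already a multiple of the batch size of 32, so the entry-count ALIGN() calls leave it unchanged.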
Re: [PATCH][RFC] net/ipv6: seperate sit driver to extra module
On Thu, 5 Oct 2006, Joerg Roedel wrote:

> Is there a reason why the tunnel driver for IPv6-in-IPv4 is currently
> compiled into the ipv6 module? This driver is only needed in gateways
> between different IPv6 networks. On all other hosts with ipv6 enabled it
> is not required. To have this driver in a separate module will save
> memory on those machines.
> I appended a small and trivial patch to 2.6.18 which does exactly this.

Looks ok to me, although given that users used to get this by default when
selecting IPv6, perhaps the default in Kconfig should be y.

- James
--
James Morris <[EMAIL PROTECTED]>
Re: [PATCH][RFC] net/ipv6: seperate sit driver to extra module
In article <[EMAIL PROTECTED]> (at Thu, 5 Oct 2006 11:49:38 -0400 (EDT)), James Morris <[EMAIL PROTECTED]> says:

> On Thu, 5 Oct 2006, Joerg Roedel wrote:
>
> > Is there a reason why the tunnel driver for IPv6-in-IPv4 is currently
> > compiled into the ipv6 module? This driver is only needed in gateways
> > between different IPv6 networks. On all other hosts with ipv6 enabled it
> > is not required. To have this driver in a separate module will save
> > memory on those machines.
> > I appended a small and trivial patch to 2.6.18 which does exactly this.
>
> Looks ok to me, although given that users used to get this by default when
> selecting IPv6, perhaps the default in Kconfig should be y.

Agreed. And, we could add #ifdef in addrconf.c.

--yoshfuji
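To illustrate the "#ifdef in addrconf.c" idea, here is a minimal userspace sketch of how code that depends on an optional, possibly modular driver is usually guarded. The symbol names CONFIG_IPV6_SIT and CONFIG_IPV6_SIT_MODULE and the function addrconf_sit_enabled() are assumptions for illustration only; the patch under discussion is not quoted in this message, so its actual Kconfig symbol is not confirmed here.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for Kconfig-generated symbols. In a kernel build a
 * tristate option FOO defines CONFIG_FOO when built in and CONFIG_FOO_MODULE
 * when built as a module; nothing is defined when it is disabled.
 */
/* #define CONFIG_IPV6_SIT 1 */

#if defined(CONFIG_IPV6_SIT) || defined(CONFIG_IPV6_SIT_MODULE)
/* SIT support configured: the SIT-specific address setup would live here. */
int addrconf_sit_enabled(void)
{
	return 1;
}
#else
/* SIT support not configured: the SIT-specific code compiles away. */
int addrconf_sit_enabled(void)
{
	return 0;
}
#endif
```

As written, with the define commented out, addrconf_sit_enabled() returns 0; uncommenting it switches to the SIT-enabled branch. Checking both the built-in and the _MODULE symbol matters because the driver would no longer be guaranteed to be part of the ipv6 module.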
Re: 2.6.18-mm2 boot failure on x86-64
On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:

> Please don't snip the Code: line. It is fairly important.

Sorry about that. The remote console I was using appears to overwrite
some text after I force the reboot. Here's a clean one.

global
Unable to handle kernel NULL pointer dereference at 0827 RIP:
 [] xfrm_register_mode+0x36/0x60
PGD 0
Oops: [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-git22 #3
RIP: 0010:[] [] xfrm_register_mode+0x36/0x60
RSP: :810bffcbded0 EFLAGS: 00010286
RAX: 081f RBX: 805588a0 RCX:
RDX: RSI: 0046 RDI: 80559550
RBP: ffef R08: 7a02 R09: 000e
R10: 0006 R11: 80334660 R12:
R13: 810bffcbdef0 R14: R15:
FS: () GS:805d2000() knlGS:
CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 0827 CR3: 00201000 CR4: 06e0
Process swapper (pid: 1, threadinfo 810bffcbc000, task 810bffcbb4e0)
Stack: 8061fb48 80207182 0009
Call Trace:
 [] init+0x162/0x330
 [] child_rip+0xa/0x12
 [] acpi_ds_init_one_object+0x0/0x82
 [] init+0x0/0x330
 [] child_rip+0x0/0x12

Code: 48 83 78 08 00 75 06 48 89 58 08 31 ed 48 89 d7 e8 65 fd ff
RIP [] xfrm_register_mode+0x36/0x60 RSP
CR2: 0827
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!

> My guess is that something is wrong with the global variable it is accessing.
> Can you post the output of grep -5 xfrm_policy_afinfo ?

elm3b239:/boot # grep -5 xfrm_policy_afinfo System.map-2.6.18-git22
805594c0 d xfrm4_state_afinfo
80559500 D xfrm_cfg_mutex
80559530 d xfrm_dev_notifier
80559548 d xfrm_policy_lock
8055954c d xfrm_policy_gc_lock
80559550 d xfrm_policy_afinfo_lock
80559560 d xfrm_hash_work
805595c0 d hash_resize_mutex
80559600 D sysctl_xfrm_aevent_etime
80559604 D sysctl_xfrm_aevent_rseqth
80559610 D km_waitq
--
8075bfd8 b idiagnl
8075bfe0 B xfrm_policy_count
8075bff8 b xfrm_policy_gc_list
8075c000 b dummy.28400
8075c038 b idx_generator.27450
8075c040 b xfrm_policy_afinfo
8075c140 b xfrm_policy_gc_work
8075c1a0 b xfrm_policy_inexact
8075c1e0 B xfrm_nl
8075c1e8 b xfrm_state_gc_list
8075c1f0 b acqseq.27386

> And please add a
> printk("global %p\n", xfrm_policy_afinfo[family]);
> at the beginning of net/xfrm/xfrm_policy.c:xfrm_policy_lock_afinfo
> and post the output.

Included above.

--
Steve Fox
IBM Linux Technology Center
Re: 2.6.18-mm2 boot failure on x86-64
On Thursday 05 October 2006 19:57, Steve Fox wrote: > On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote: > > > Please don't snip the Code: line. It is fairly important. > > Sorry about that. The remote console I was using appears to overwrite > some text after I force the reboot. Here's a clean one. > > global Ok that definitely shouldn't be in there. I guess we need to track when it gets corrupted. Can you send the full boot log with this patch applied? -Andi Index: linux-2.6.19-rc1-hack/init/main.c === --- linux-2.6.19-rc1-hack.orig/init/main.c +++ linux-2.6.19-rc1-hack/init/main.c @@ -75,6 +75,9 @@ static int init(void *); +extern void bugcheck(char *, int); +#define CHECK bugcheck(__FILE__, __LINE__) + extern void init_IRQ(void); extern void fork_init(unsigned long); extern void mca_init(void); @@ -480,6 +483,8 @@ asmlinkage void __init start_kernel(void char * command_line; extern struct kernel_param __start___param[], __stop___param[]; + CHECK; + smp_setup_processor_id(); /* @@ -502,7 +507,9 @@ asmlinkage void __init start_kernel(void page_address_init(); printk(KERN_NOTICE); printk(linux_banner); + CHECK; setup_arch(&command_line); + CHECK; setup_per_cpu_areas(); smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */ @@ -517,6 +524,7 @@ asmlinkage void __init start_kernel(void * fragile until we cpu_idle() for the first time. 
*/ preempt_disable(); + CHECK; build_all_zonelists(); page_alloc_init(); printk(KERN_NOTICE "Kernel command line: %s\n", saved_command_line); @@ -525,6 +533,7 @@ asmlinkage void __init start_kernel(void __stop___param - __start___param, &unknown_bootoption); sort_main_extable(); + CHECK; trap_init(); rcu_init(); init_IRQ(); @@ -533,8 +542,10 @@ asmlinkage void __init start_kernel(void hrtimers_init(); softirq_init(); timekeeping_init(); + CHECK; time_init(); profile_init(); + CHECK; if (!irqs_disabled()) printk("start_kernel(): bug: interrupts were enabled early\n"); early_boot_irqs_on(); @@ -568,7 +579,9 @@ asmlinkage void __init start_kernel(void #endif vfs_caches_init_early(); cpuset_init_early(); + CHECK; mem_init(); + CHECK; kmem_cache_init(); setup_per_cpu_pageset(); numa_policy_init(); @@ -577,6 +590,7 @@ asmlinkage void __init start_kernel(void calibrate_delay(); pidmap_init(); pgtable_cache_init(); + CHECK; prio_tree_init(); anon_vma_init(); #ifdef CONFIG_X86 @@ -586,12 +600,14 @@ asmlinkage void __init start_kernel(void fork_init(num_physpages); proc_caches_init(); buffer_init(); + CHECK; unnamed_dev_init(); key_init(); security_init(); vfs_caches_init(num_physpages); radix_tree_init(); signals_init(); + CHECK; /* rootfs populating might need page-writeback */ page_writeback_init(); #ifdef CONFIG_PROC_FS @@ -599,6 +615,7 @@ asmlinkage void __init start_kernel(void #endif cpuset_init(); taskstats_init_early(); + CHECK; delayacct_init(); check_bugs(); @@ -609,7 +626,7 @@ asmlinkage void __init start_kernel(void rest_init(); } -static int __initdata initcall_debug; +static int __initdata initcall_debug = 1; static int __init initcall_debug_setup(char *str) { @@ -639,7 +656,11 @@ static void __init do_initcalls(void) printk("\n"); } + CHECK; + result = (*call)(); + + CHECK; if (result && result != -ENODEV && initcall_debug) { sprintf(msgbuf, "error code %d", result); @@ -725,21 +746,32 @@ static int init(void * unused) smp_prepare_cpus(max_cpus); + CHECK; + 
do_pre_smp_initcalls(); smp_init(); + + CHECK; + sched_init_smp(); cpuset_init_smp(); + CHECK; + /* * Do this before initcalls, because some drivers want to access * firmware files. */ populate_rootfs(); + CHECK; + do_basic_setup(); + CHECK; + /* * check if there is an early userspace init. If yes, let it do all * the work Index: linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c === --- linux-2.6.19-rc1-hack.orig/net/xfrm/xfrm_policy.c +++ linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c @@ -39,6 +39,16 @@ EXPORT_SYMBOL(xfrm_policy_count); static DEFINE_RWLOCK(xfrm_policy_afinfo_lock); sta
Re: 2.6.18-mm2 boot failure on x86-64
On Thu, 2006-10-05 at 20:27 +0200, Andi Kleen wrote: > I guess we need to track when it gets corrupted. Can you send the full > boot log with this patch applied? Here she blows! root (hd0,0) Filesystem type is reiserfs, partition type 0x83 kernel /boot/vmlinuz-autobench root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.5 0:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts console=tty0 console=ttyS0, 57600 autobench_args: root=/dev/sda1 ABAT:1160073474 [Linux-bzImage, setup=0x1400, size=0x1dd755] initrd /boot/initrd-autobench.img [Linux-initrd @ 0x37ceb000, 0x304c57 bytes] Linux version 2.6.18-git22 ([EMAIL PROTECTED]) (gcc version 4.1.0 (SUSE Linux)) #4 SMP Thu Oct 5 11:36:21 PDT 2006 Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160073474 BIOS-provided physical RAM map: BIOS-e820: - 0009ac00 (usable) BIOS-e820: 0009ac00 - 000a (reserved) BIOS-e820: 000e - 0010 (reserved) BIOS-e820: 0010 - bff764c0 (usable) BIOS-e820: bff764c0 - bff98880 (ACPI data) BIOS-e820: bff98880 - c000 (reserved) BIOS-e820: fec0 - 0001 (reserved) BIOS-e820: 0001 - 000c (usable) end_pfn_map = 12582912 DMI 2.3 present. 
Zone PFN ranges: DMA 0 -> 4096 DMA324096 -> 1048576 Normal1048576 -> 12582912 early_node_map[3] active PFN ranges 0:0 -> 154 0: 256 -> 786294 0: 1048576 -> 12582912 ACPI: PM-Timer IO Port: 0x9c ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) Processor #0 (Bootup-CPU) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled) Processor #1 ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled) Processor #6 ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled) Processor #7 ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled) Processor #16 ACPI: LAPIC (acpi_id[0x05] lapic_id[0x11] enabled) Processor #17 ACPI: LAPIC (acpi_id[0x06] lapic_id[0x16] enabled) Processor #22 ACPI: LAPIC (acpi_id[0x07] lapic_id[0x17] enabled) Processor #23 ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled) Processor #32 ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled) Processor #33 ACPI: LAPIC (acpi_id[0x12] lapic_id[0x26] enabled) Processor #38 ACPI: LAPIC (acpi_id[0x13] lapic_id[0x27] enabled) Processor #39 ACPI: LAPIC (acpi_id[0x14] lapic_id[0x30] enabled) Processor #48 ACPI: LAPIC (acpi_id[0x15] lapic_id[0x31] enabled) Processor #49 ACPI: LAPIC (acpi_id[0x16] lapic_id[0x36] enabled) Processor #54 ACPI: LAPIC (acpi_id[0x17] lapic_id[0x37] enabled) Processor #55 ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled) Processor #64 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] enabled) Processor #65 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x22] lapic_id[0x46] enabled) Processor #70 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x23] lapic_id[0x47] enabled) Processor #71 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x24] lapic_id[0x50] enabled) Processor #80 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x25] lapic_id[0x51] enabled) Processor #81 WARNING: NR_CPUS limit of 16 reached. Processor ignored. 
ACPI: LAPIC (acpi_id[0x26] lapic_id[0x56] enabled) Processor #86 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x27] lapic_id[0x57] enabled) Processor #87 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled) Processor #96 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] enabled) Processor #97 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x32] lapic_id[0x66] enabled) Processor #102 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x33] lapic_id[0x67] enabled) Processor #103 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x34] lapic_id[0x70] enabled) Processor #112 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x35] lapic_id[0x71] enabled) Processor #113 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x36] lapic_id[0x76] enabled) Processor #118 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC (acpi_id[0x37] lapic_id[0x77] enabled) Processor #119 WARNING: NR_CPUS limit of 16 reached. Processor ignored. ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x04] dfl dfl lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x05] dfl dfl lint[0x1
Re: 2.6.18-mm2 boot failure on x86-64
On Thu, Oct 05, 2006 at 08:27:02PM +0200, Andi Kleen wrote: > On Thursday 05 October 2006 19:57, Steve Fox wrote: > > On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote: > > > > > Please don't snip the Code: line. It is fairly important. > > > > Sorry about that. The remote console I was using appears to overwrite > > some text after I force the reboot. Here's a clean one. > > > > global > > Ok that definitely shouldn't be in there. > > I guess we need to track when it gets corrupted. Can you send the full > boot log with this patch applied? > Just recalled one more observation about the problem when keith had reported it last. If I just move .bss before .data_nosave instead of it being at the end, keith's problem had disappeared. Thanks Vivek - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: 2.6.18-mm2 boot failure on x86-64
On Thursday 05 October 2006 20:51, Steve Fox wrote: > On Thu, 2006-10-05 at 20:27 +0200, Andi Kleen wrote: > > > I guess we need to track when it gets corrupted. Can you send the full > > boot log with this patch applied? > > Here she blows! Can you please try it again with this patch to narrow it down further? -Andi Index: linux-2.6.19-rc1-hack/init/main.c === --- linux-2.6.19-rc1-hack.orig/init/main.c +++ linux-2.6.19-rc1-hack/init/main.c @@ -75,6 +75,9 @@ static int init(void *); +extern void bugcheck(char *, int); +#define CHECK bugcheck(__FILE__, __LINE__) + extern void init_IRQ(void); extern void fork_init(unsigned long); extern void mca_init(void); @@ -480,6 +483,8 @@ asmlinkage void __init start_kernel(void char * command_line; extern struct kernel_param __start___param[], __stop___param[]; + CHECK; + smp_setup_processor_id(); /* @@ -502,7 +507,9 @@ asmlinkage void __init start_kernel(void page_address_init(); printk(KERN_NOTICE); printk(linux_banner); + CHECK; setup_arch(&command_line); + CHECK; setup_per_cpu_areas(); smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */ @@ -517,6 +524,7 @@ asmlinkage void __init start_kernel(void * fragile until we cpu_idle() for the first time. 
*/ preempt_disable(); + CHECK; build_all_zonelists(); page_alloc_init(); printk(KERN_NOTICE "Kernel command line: %s\n", saved_command_line); @@ -525,6 +533,7 @@ asmlinkage void __init start_kernel(void __stop___param - __start___param, &unknown_bootoption); sort_main_extable(); + CHECK; trap_init(); rcu_init(); init_IRQ(); @@ -533,8 +542,10 @@ asmlinkage void __init start_kernel(void hrtimers_init(); softirq_init(); timekeeping_init(); + CHECK; time_init(); profile_init(); + CHECK; if (!irqs_disabled()) printk("start_kernel(): bug: interrupts were enabled early\n"); early_boot_irqs_on(); @@ -568,7 +579,9 @@ asmlinkage void __init start_kernel(void #endif vfs_caches_init_early(); cpuset_init_early(); + CHECK; mem_init(); + CHECK; kmem_cache_init(); setup_per_cpu_pageset(); numa_policy_init(); @@ -577,6 +590,7 @@ asmlinkage void __init start_kernel(void calibrate_delay(); pidmap_init(); pgtable_cache_init(); + CHECK; prio_tree_init(); anon_vma_init(); #ifdef CONFIG_X86 @@ -586,12 +600,14 @@ asmlinkage void __init start_kernel(void fork_init(num_physpages); proc_caches_init(); buffer_init(); + CHECK; unnamed_dev_init(); key_init(); security_init(); vfs_caches_init(num_physpages); radix_tree_init(); signals_init(); + CHECK; /* rootfs populating might need page-writeback */ page_writeback_init(); #ifdef CONFIG_PROC_FS @@ -599,6 +615,7 @@ asmlinkage void __init start_kernel(void #endif cpuset_init(); taskstats_init_early(); + CHECK; delayacct_init(); check_bugs(); @@ -609,7 +626,7 @@ asmlinkage void __init start_kernel(void rest_init(); } -static int __initdata initcall_debug; +static int __initdata initcall_debug = 1; static int __init initcall_debug_setup(char *str) { @@ -639,7 +656,11 @@ static void __init do_initcalls(void) printk("\n"); } + CHECK; + result = (*call)(); + + CHECK; if (result && result != -ENODEV && initcall_debug) { sprintf(msgbuf, "error code %d", result); @@ -725,21 +746,32 @@ static int init(void * unused) smp_prepare_cpus(max_cpus); + CHECK; + 
do_pre_smp_initcalls(); smp_init(); + + CHECK; + sched_init_smp(); cpuset_init_smp(); + CHECK; + /* * Do this before initcalls, because some drivers want to access * firmware files. */ populate_rootfs(); + CHECK; + do_basic_setup(); + CHECK; + /* * check if there is an early userspace init. If yes, let it do all * the work Index: linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c === --- linux-2.6.19-rc1-hack.orig/net/xfrm/xfrm_policy.c +++ linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c @@ -39,6 +39,16 @@ EXPORT_SYMBOL(xfrm_policy_count); static DEFINE_RWLOCK(xfrm_policy_afinfo_lock); static struct xfrm_policy_afinfo *xfrm_policy_afinfo[NPROTO]; +void bugcheck(char *where, int line) +{ + int i; + for (i = 0; i < NPROTO; i++) +
Re: 2.6.18-mm2 boot failure on x86-64
On Thursday 05 October 2006 20:52, Vivek Goyal wrote: > On Thu, Oct 05, 2006 at 08:27:02PM +0200, Andi Kleen wrote: > > On Thursday 05 October 2006 19:57, Steve Fox wrote: > > > On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote: > > > > > > > Please don't snip the Code: line. It is fairly important. > > > > > > Sorry about that. The remote console I was using appears to overwrite > > > some text after I force the reboot. Here's a clean one. > > > > > > global > > > > Ok that definitely shouldn't be in there. > > > > I guess we need to track when it gets corrupted. Can you send the full > > boot log with this patch applied? > > > > Just recalled one more observation about the problem when keith had > reported it last. If I just move .bss before .data_nosave instead > of it being at the end, keith's problem had disappeared. Yes, that could well be that it's something in the new bootmap management. Steve's box failed at Using ACPI (MADT) for SMP configuration information Nosave address range: 0009a000 - 0009b000 Nosave address range: 0009b000 - 000a Nosave address range: 000a - 000e Nosave address range: 000e - 0010 Nosave address range: bff76000 - bff77000 Nosave address range: bff77000 - bff98000 Nosave address range: bff98000 - bff99000 Nosave address range: bff99000 - c000 Nosave address range: c000 - fec0 Nosave address range: fec0 - 0001 Allocating PCI resources starting at c400 (gap: c000:3ec0) afinfo corrupted at init/main.c:512 which is directly after that code does lots of stuff. Mel might want to take a look (and perhaps also cut down a little on the ugly printks ...) BTW I found one of my test systems too now which does a lot of: I'm about to leave for vacation so i won't have time to track it down any time soon. But here is it for reference. 
-Andi Please enable the IOMMU option in the BIOS setup This costs you 64 MB of RAM Mapping aperture over 65536 KB of RAM @ 800 Bad page state in process 'swapper' page:810003ee5480 flags:0x mapping: mapcount:1 count:0 Trying to fix it up, but a reboot is needed Backtrace: Call Trace: [] show_trace+0x34/0x47 [] dump_stack+0x12/0x17 [] bad_page+0x57/0x81 [] __free_pages_ok+0x64/0x247 [] free_all_bootmem_core+0xcc/0x1a9 [] numa_free_all_bootmem+0x3b/0x77 [] mem_init+0x44/0x186 [] start_kernel+0x17b/0x207 [] _sinittext+0x168/0x16c Bad page state in process 'swapper' page:810003ee54b8 flags:0x mapping: mapcount:1 count:0 Trying to fix it up, but a reboot is needed Backtrace: Call Trace: [] show_trace+0x34/0x47 [] dump_stack+0x12/0x17 [] bad_page+0x57/0x81 [] __free_pages_ok+0x64/0x247 [] free_all_bootmem_core+0xcc/0x1a9 [] numa_free_all_bootmem+0x3b/0x77 [] mem_init+0x44/0x186 [] start_kernel+0x17b/0x207 [] _sinittext+0x168/0x16c ... lots more of those ... - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
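Editorial sketch: the checkpoint technique used by the debugging patch in this thread (a CHECK macro that calls bugcheck(__FILE__, __LINE__) between init steps) can be illustrated in plain userspace C. The body of bugcheck() is cut off in the patch as quoted, so the check below is an assumption based on the "afinfo corrupted at init/main.c:512" message; the table name, its size, and buggy_step() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NSLOTS 4                 /* hypothetical table size */

/* Table that is expected to stay all-NULL during early init. */
static void *watched[NSLOTS];

/* Report the call site and return 1 if any slot is non-NULL, else 0. */
static int bugcheck(const char *where, int line)
{
    for (int i = 0; i < NSLOTS; i++) {
        if (watched[i] != NULL) {
            printf("table corrupted at %s:%d (slot %d)\n", where, line, i);
            return 1;
        }
    }
    return 0;
}

/* Sprinkle CHECK between init steps; the first hit brackets the culprit. */
#define CHECK bugcheck(__FILE__, __LINE__)

/* Stand-in for an init step that scribbles over slot 0 of the table. */
static void buggy_step(void)
{
    memset(watched, 0xff, sizeof(watched[0]));
}
```

The culprit is then whatever ran between the last clean CHECK and the first one that reports corruption, which is how the failure was narrowed to the code just before init/main.c:512.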
Please pull bcm43xx-d80211 fixes
Hi John,

I updated the "for-linville" branch of my tree with one important bugfix and a small cleanup. In case you didn't pull, yet, this will pull all changes of my previous pull request plus the following two.

bcm43xx-d80211: Wait for the firmware to respond, before we read revision codes.
http://bu3sch.de/gitweb?p=wireless-dev.git;a=commitdiff;h=faac518bf4a2d2846a7153b0b4f8b99ff8db4166

bcm43xx-d80211: Remove unused "err" variables.
http://bu3sch.de/gitweb?p=wireless-dev.git;a=commitdiff;h=455ae5bb4ee0b18ed06ffee0d89b92a8fca3f217

The old branch (as I submitted it in my previous pull request) is still available for reference as "old-for-linville". But I don't think you'll need it. Simply pull from "for-linville" now, please.

--
Greetings Michael.
Re: 2.6.18-mm2 boot failure on x86-64
On Thu, 2006-10-05 at 21:08 +0200, Andi Kleen wrote:
> Mel might want to take a look (and perhaps
> also cut down a little on the ugly printks ...)

I tested a patch from Mel which backs out the arch independent zone sizing and got the same results (to my inexperienced eye). I've sent him the boot log to verify they really are the same as without this back-out.

--
Steve Fox
IBM Linux Technology Center
[PATCH 0/3] Fix for IPsec leakage with SELinux enabled - V.03
This version takes into account David Miller's comments regarding treatment of security layer errors in the case of socket policies. Specifically, these errors will be treated like how these kind of errors are treated for the main/sub policies, which is to return a full lookup failure.

 include/linux/security.h        |   24 ++-
 include/net/flow.h              |    2
 include/net/xfrm.h              |    3
 net/core/flow.c                 |   42
 net/ipv4/xfrm4_policy.c         |    2
 net/ipv6/xfrm6_policy.c         |    2
 net/key/af_key.c                |    5 -
 net/xfrm/xfrm_policy.c          |  101 ++
 net/xfrm/xfrm_user.c            |    9 --
 security/dummy.c                |    3
 security/selinux/include/xfrm.h |    3
 security/selinux/xfrm.c         |   53 ---
 12 files changed, 162 insertions(+), 87 deletions(-)
[PATCH 3/3] Fix for IPsec leakage with SELinux enabled - V.03: Process security errors for socket policies also
This treats the security errors encountered in the case of socket policy matching, the same as how these are treated in the case of main/sub policies, which is to return a full lookup failure.

Signed-off-by: Venkat Yekkirala <[EMAIL PROTECTED]>
---
 net/xfrm/xfrm_policy.c |   26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

--- net-2.6.sid5/net/xfrm/xfrm_policy.c 2006-10-05 14:36:07.0 -0500
+++ net-2.6/net/xfrm/xfrm_policy.c 2006-10-05 14:38:32.0 -0500
@@ -1013,12 +1013,16 @@ static struct xfrm_policy *xfrm_sk_polic
 				sk->sk_family);
 		int err = 0;
 
-		if (match)
-			err = security_xfrm_policy_lookup(pol, fl->secid, policy_to_flow_dir(dir));
-
-		if (match && !err)
-			xfrm_pol_hold(pol);
-		else
+		if (match) {
+			err = security_xfrm_policy_lookup(pol, fl->secid,
+					policy_to_flow_dir(dir));
+			if (!err)
+				xfrm_pol_hold(pol);
+			else if (err == -ESRCH)
+				pol = NULL;
+			else
+				pol = ERR_PTR(err);
+		} else
 			pol = NULL;
 	}
 	read_unlock_bh(&xfrm_policy_lock);
@@ -1310,8 +1314,11 @@ restart:
 	pol_dead = 0;
 	xfrm_nr = 0;
 
-	if (sk && sk->sk_policy[1])
+	if (sk && sk->sk_policy[1]) {
 		policy = xfrm_sk_policy_lookup(sk, XFRM_POLICY_OUT, fl);
+		if (IS_ERR(policy))
+			return PTR_ERR(policy);
+	}
 
 	if (!policy) {
 		/* To accelerate a bit... */
@@ -1604,8 +1611,11 @@ int __xfrm_policy_check(struct sock *sk,
 	}
 
 	pol = NULL;
-	if (sk && sk->sk_policy[dir])
+	if (sk && sk->sk_policy[dir]) {
 		pol = xfrm_sk_policy_lookup(sk, dir, &fl);
+		if (IS_ERR(pol))
+			return 0;
+	}
 
 	if (!pol)
 		pol = flow_cache_lookup(&fl, family, fl_dir,
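Editorial sketch: the socket-policy change in this patch relies on a three-way result from the lookup: a valid pointer (policy found), NULL (no policy applies) and an encoded error pointer (hard failure). The userspace C below imitates the kernel's ERR_PTR/IS_ERR/PTR_ERR helpers to show that distinction. The struct, the function names sk_policy_lookup() and lookup_caller(), and their parameters are hypothetical stand-ins, not the kernel's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static int IS_ERR(const void *ptr)
{
    /* The top MAX_ERRNO addresses are reserved for encoded errnos. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct policy { int id; };

/*
 * Hypothetical lookup: sec_err stands for the security hook's verdict
 * (0 = allowed, -ESRCH = treat as no match, anything else = hard error).
 */
static struct policy *sk_policy_lookup(struct policy *pol, int match,
                                       int sec_err)
{
    if (!match)
        return NULL;            /* selector does not match */
    if (sec_err == 0)
        return pol;             /* usable policy */
    if (sec_err == -ESRCH)
        return NULL;            /* not permitted: behave as "no policy" */
    return ERR_PTR(sec_err);    /* propagate the real error */
}

/* Caller: fail the whole lookup on an error pointer, else hand back *out. */
static int lookup_caller(struct policy *pol, int match, int sec_err,
                         struct policy **out)
{
    struct policy *p = sk_policy_lookup(pol, match, sec_err);

    if (IS_ERR(p))
        return (int)PTR_ERR(p);
    *out = p;
    return 0;
}
```

The point of the encoding is that NULL keeps its old meaning ("fall through to the main lookup") while errors such as -EINVAL can no longer be mistaken for it.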
[PATCH 1/3] Fix for IPsec leakage with SELinux enabled - V.03: Fix xfrm code
Currently when an IPSec policy rule doesn't specify a security context, it is assumed to be "unlabeled" by SELinux, and so the IPSec policy rule fails to match to a flow that it would otherwise match to, unless one has explicitly added an SELinux policy rule allowing the flow to "polmatch" to the "unlabeled" IPSec policy rules. In the absence of such an explicitly added SELinux policy rule, the IPSec policy rule fails to match and so the packet(s) flow in clear text without the otherwise applicable xfrm(s) applied. The above SELinux behavior violates the SELinux security notion of "deny by default" which should actually translate to "encrypt by default" in the above case. This was first reported by Evgeniy Polyakov and the way James Morris was seeing the problem was when connecting via IPsec to a confined service on an SELinux box (vsftpd), which did not have the appropriate SELinux policy permissions to send packets via IPsec. With this patch applied, SELinux "polmatching" of flows Vs. IPSec policy rules will only come into play when there's a explicit context specified for the IPSec policy rule (which also means there's corresponding SELinux policy allowing appropriate domains/flows to polmatch to this context). Secondly, when a security module is loaded (in this case, SELinux), the security_xfrm_policy_lookup() hook can return errors other than access denied, such as -EINVAL. We were not handling that correctly, and in fact inverting the return logic and propagating a false "ok" back up to xfrm_lookup(), which then allowed packets to pass as if they were not associated with an xfrm policy. The solution for this is to first ensure that errno values are correctly propagated all the way back up through the various call chains from security_xfrm_policy_lookup(), and handled correctly. 
Then, flow_cache_lookup() is modified, so that if the policy resolver fails (typically a permission denied via the security module), the flow cache entry is killed rather than having a null policy assigned (which indicates that the packet can pass freely). This also forces any future lookups for the same flow to consult the security module (e.g. SELinux) for current security policy (rather than, say, caching the error on the flow cache entry). This patch: Fix the selinux side of things. This makes sure SELinux polmatching of flow contexts to IPSec policy rules comes into play only when an explicit context is associated with the IPSec policy rule. Also, this no longer defaults the context of a socket policy to the context of the socket since the "no explicit context" case is now handled properly. Signed-off-by: Venkat Yekkirala <[EMAIL PROTECTED]> --- include/linux/security.h| 24 + include/net/xfrm.h |3 + net/ipv4/xfrm4_policy.c |2 - net/ipv6/xfrm6_policy.c |2 - net/key/af_key.c|5 -- net/xfrm/xfrm_policy.c |7 ++- net/xfrm/xfrm_user.c|9 - security/dummy.c|3 + security/selinux/include/xfrm.h |3 + security/selinux/xfrm.c | 53 +++--- 10 files changed, 62 insertions(+), 49 deletions(-) --- net-2.6.sid3/include/linux/security.h 2006-10-01 15:18:23.0 -0500 +++ net-2.6.sid4/include/linux/security.h 2006-10-05 12:03:39.0 -0500 @@ -893,7 +893,8 @@ struct request_sock; * Check permission when a flow selects a xfrm_policy for processing * XFRMs on a packet. The hook is called when selecting either a * per-socket policy or a generic xfrm policy. - * Return 0 if permission is granted. + * Return 0 if permission is granted, -ESRCH otherwise, or -errno + * on other errors. * @xfrm_state_pol_flow_match: * @x contains the state to match. * @xp contains the policy to check for a match. @@ -902,6 +903,7 @@ struct request_sock; * @xfrm_flow_state_match: * @fl contains the flow key to match. * @xfrm points to the xfrm_state to match. + * @xp points to the xfrm_policy to match. 
* Return 1 if there is a match. * @xfrm_decode_session: * @skb points to skb to decode. @@ -1402,7 +1404,8 @@ struct security_operations { int (*xfrm_policy_lookup)(struct xfrm_policy *xp, u32 fl_secid, u8 dir); int (*xfrm_state_pol_flow_match)(struct xfrm_state *x, struct xfrm_policy *xp, struct flowi *fl); - int (*xfrm_flow_state_match)(struct flowi *fl, struct xfrm_state *xfrm); + int (*xfrm_flow_state_match)(struct flowi *fl, struct xfrm_state *xfrm, + struct xfrm_policy *xp); int (*xfrm_decode_session)(struct sk_buff *skb, u32 *secid, int ckall); #endif /* CONFIG_SECURITY_NETWORK_XFRM */ @@ -3168,11 +3171,6 @@ static inline int security_xfrm_policy_a return security_ops->xfrm_policy_alloc_security(xp, sec_ctx, NULL); } -static inline int security_xfrm_sock_policy_a
[PATCH 1/3] Fix for IPsec leakage with SELinux enabled - V.03
From: James Morris <[EMAIL PROTECTED]> When a security module is loaded (in this case, SELinux), the security_xfrm_policy_lookup() hook can return an access denied permission (or other error). We were not handling that correctly, and in fact inverting the return logic and propagating a false "ok" back up to xfrm_lookup(), which then allowed packets to pass as if they were not associated with an xfrm policy. The way I was seeing the problem was when connecting via IPsec to a confined service on an SELinux box (vsftpd), which did not have the appropriate SELinux policy permissions to send packets via IPsec. The first SYNACK would be blocked, because of an uncached lookup via flow_cache_lookup(), which would fail to resolve an xfrm policy because the SELinux policy is checked at that point via the resolver. However, retransmitted SYNACKs would then find a cached flow entry when calling into flow_cache_lookup() with a null xfrm policy, which is interpreted by xfrm_lookup() as the packet not having any associated policy and similarly to the first case, allowing it to pass without transformation. The solution presented here is to first ensure that errno values are correctly propagated all the way back up through the various call chains from security_xfrm_policy_lookup(), and handled correctly. Then, flow_cache_lookup() is modified, so that if the policy resolver fails (typically a permission denied via the security module), the flow cache entry is killed rather than having a null policy assigned (which indicates that the packet can pass freely). This also forces any future lookups for the same flow to consult the security module (e.g. SELinux) for current security policy (rather than, say, caching the error on the flow cache entry). 
Signed-off-by: James Morris <[EMAIL PROTECTED]> --- include/net/flow.h |2 - net/core/flow.c| 42 net/xfrm/xfrm_policy.c | 68 ++- 3 files changed, 82 insertions(+), 30 deletions(-) diff -purN -X dontdiff net-2.6.o/include/net/flow.h net-2.6.w/include/net/flow.h --- net-2.6.o/include/net/flow.h2006-09-29 11:33:58.0 -0400 +++ net-2.6.w/include/net/flow.h2006-09-30 23:50:32.0 -0400 @@ -97,7 +97,7 @@ struct flowi { #define FLOW_DIR_FWD 2 struct sock; -typedef void (*flow_resolve_t)(struct flowi *key, u16 family, u8 dir, +typedef int (*flow_resolve_t)(struct flowi *key, u16 family, u8 dir, void **objp, atomic_t **obj_refp); extern void *flow_cache_lookup(struct flowi *key, u16 family, u8 dir, diff -purN -X dontdiff net-2.6.o/net/core/flow.c net-2.6.w/net/core/flow.c --- net-2.6.o/net/core/flow.c 2006-09-29 11:33:59.0 -0400 +++ net-2.6.w/net/core/flow.c 2006-10-01 01:07:24.0 -0400 @@ -85,6 +85,14 @@ static void flow_cache_new_hashrnd(unsig add_timer(&flow_hash_rnd_timer); } +static void flow_entry_kill(int cpu, struct flow_cache_entry *fle) +{ + if (fle->object) + atomic_dec(fle->object_ref); + kmem_cache_free(flow_cachep, fle); + flow_count(cpu)--; +} + static void __flow_cache_shrink(int cpu, int shrink_to) { struct flow_cache_entry *fle, **flp; @@ -100,10 +108,7 @@ static void __flow_cache_shrink(int cpu, } while ((fle = *flp) != NULL) { *flp = fle->next; - if (fle->object) - atomic_dec(fle->object_ref); - kmem_cache_free(flow_cachep, fle); - flow_count(cpu)--; + flow_entry_kill(cpu, fle); } } } @@ -220,24 +225,33 @@ void *flow_cache_lookup(struct flowi *ke nocache: { + int err; void *obj; atomic_t *obj_ref; - resolver(key, family, dir, &obj, &obj_ref); + err = resolver(key, family, dir, &obj, &obj_ref); if (fle) { - fle->genid = atomic_read(&flow_cache_genid); - - if (fle->object) - atomic_dec(fle->object_ref); - - fle->object = obj; - fle->object_ref = obj_ref; - if (obj) - atomic_inc(fle->object_ref); + if (err) { + /* Force security policy check on next lookup */ + 
*head = fle->next; + flow_entry_kill(cpu, fle); + } else { + fle->genid = atomic_read(&flow_cache_genid); + + if (fle->object) + atomic_dec(fle->object_ref); + + fle->object =
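Editorial sketch: the flow-cache change described in this message (drop the cache entry when the resolver fails, so that a denial is never remembered as "no policy") can be illustrated with a very small userspace cache. This is a simplified model under stated assumptions: a single chain instead of per-CPU hash tables, no generation counter, no reference counting, and all names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct entry {
    struct entry *next;
    int key;
    void *object;               /* NULL means "no policy applies" */
};

static struct entry *head;      /* one chain stands in for the hash table */
static int entry_count;

typedef int (*resolve_t)(int key, void **objp);

/* Unlink and free the entry that *link points at. */
static void entry_kill(struct entry **link)
{
    struct entry *e = *link;

    *link = e->next;
    free(e);
    entry_count--;
}

/* Look up key; on a miss call the resolver and cache only successes. */
static void *cache_lookup(int key, resolve_t resolver, int *errp)
{
    struct entry *e;
    void *obj = NULL;

    *errp = 0;
    for (e = head; e != NULL; e = e->next)
        if (e->key == key)
            return e->object;   /* cached result, resolver not consulted */

    e = calloc(1, sizeof(*e));
    if (e) {
        e->key = key;
        e->next = head;
        head = e;
        entry_count++;
    }

    *errp = resolver(key, &obj);
    if (*errp) {
        /* Do not cache a NULL object: force a fresh check next time. */
        if (e)
            entry_kill(&head);
        return NULL;
    }
    if (e)
        e->object = obj;
    return obj;
}

/* Hypothetical resolver that always denies and counts its invocations. */
static int resolver_calls;
static int deny_resolver(int key, void **objp)
{
    (void)key;
    resolver_calls++;
    *objp = NULL;
    return -EACCES;
}
```

With this shape, a retransmitted packet for the same flow reaches the resolver again instead of hitting a cached entry whose NULL object would read as "pass untransformed".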
Re: [PATCH] fix for system lockups in 2.6.18-rcX caused by bcm43xx
On Thu, 14 Sep 2006 10:29:30 +0200 Jarek Poplawski wrote:

> On Thu, Sep 14, 2006 at 10:25:32AM +0200, Jarek Poplawski wrote:
> ...
> > "Attachments are discouraged, but some corporate mail systems
> > provide no other way to send patches."
> >
> > I thought they didn't read this but now I understand for whom
> > Mozilla Firefox is breaking all those lines with no mercy!
>
> Mozilla Thunderbird. Sorry.

see http://mbligh.org/linuxdocs/Email/Clients/Thunderbird

---
~Randy
Re: [PATCH 1/3] Fix for IPsec leakage with SELinux enabled - V.03
On Thu, 5 Oct 2006, Venkat Yekkirala wrote:

> -		if (xfrm_policy_match(pol, fl, type, family, dir)) {
> +		err = xfrm_policy_match(pol, fl, type, family, dir);
> +		if (err) {
> +			if (err == -ESRCH)
> +				continue;
> +			else {
> +				ret = ERR_PTR(err);
> +				goto fail;
> +			}
> +		} else {

Semantics issue: if the exact policy match fails with -EACCES, should we then try an inexact match before failing?

> #ifdef CONFIG_XFRM_SUB_POLICY
> 	pol = xfrm_policy_lookup_bytype(XFRM_POLICY_TYPE_SUB, fl, family, dir);
> -	if (pol)
> +	if (IS_ERR(pol)) {
> +		err = PTR_ERR(pol);
> +		pol = NULL;
> +	}
> +	if (pol || err)
> 		goto end;

Similarly, if the sub-policy lookup returns -EACCES, should we then try a main policy lookup before failing?

I would think yes to both.

Opinions?

- James
--
James Morris <[EMAIL PROTECTED]>
Re: [PATCH] Fix for IPsec leakage with SELinux enabled - V.02
From: James Morris <[EMAIL PROTECTED]>
Date: Thu, 5 Oct 2006 16:58:31 -0400 (EDT)

> On Tue, 3 Oct 2006, David Miller wrote:
>
> > The socket policy behavior deserves some scrutiny. I say this because
> > if a matching socket policy is avoided due to security layer error,
> > this could potentially make key manager problems very hard to
> > diagnose.
>
> In this case, AVC denial messages would be logged to the audit log, so
> there'd be an indication of what's going wrong.

Ok.
Re: [PATCH 0/3] Fix for IPsec leakage with SELinux enabled - V.03
From: Venkat Yekkirala <[EMAIL PROTECTED]>
Date: Thu, 05 Oct 2006 15:42:13 -0500

> This version takes into account David Miller's comments
> regarding treatment of security layer errors in the case
> of socket policies. Specifically, these errors will be
> treated like how these kind of errors are treated for
> the main/sub policies, which is to return a full lookup
> failure.

I only have patches "1" and "3" in my inbox, did you forget to send the second one out or are they simply misnumbered?
Re: [PATCH] Fix for IPsec leakage with SELinux enabled - V.02
On Tue, 3 Oct 2006, David Miller wrote:

> The socket policy behavior deserves some scrutiny. I say this because
> if a matching socket policy is avoided due to security layer error,
> this could potentially make key manager problems very hard to
> diagnose.

In this case, AVC denial messages would be logged to the audit log, so there'd be an indication of what's going wrong.

- James
--
James Morris <[EMAIL PROTECTED]>
Re: Request to postpone WE-21
On Thu, Oct 05, 2006 at 09:31:13AM -0700, Jean Tourrilhes wrote:

> Based on the feedback, I formally request you to back out all
> of WE-21 from 2.6.19. Rationale : it's probably too early. You can
> keep it for a later date if you wish.

Jean,

What about a patch like the one below? It tries to detect WE-20 ESSID/NICKN accesses and adjust them to WE-21 style. What am I missing?

I haven't had a chance to test it yet -- just hacked it up...YMMV... :-)

John

diff --git a/net/core/wireless.c b/net/core/wireless.c
index 0da..ba5fe77 100644
--- a/net/core/wireless.c
+++ b/net/core/wireless.c
@@ -748,11 +748,39 @@ #endif/* WE_SET_EVENT */
 		int	extra_size;
 		int	user_length = 0;
 		int	err;
+		int	essid_compat = 0;
 
 		/* Calculate space needed by arguments. Always allocate
 		 * for max space. Easier, and won't last long... */
 		extra_size = descr->max_tokens * descr->token_size;
 
+		/* Check need for ESSID compatibility for WE < 21 */
+		switch (cmd) {
+		case SIOCSIWESSID:
+		case SIOCGIWESSID:
+		case SIOCSIWNICKN:
+		case SIOCGIWNICKN:
+			if (iwr->u.data.length == descr->max_tokens + 1)
+				essid_compat = 1;
+			else if (IW_IS_SET(cmd)) {
+				char essid[IW_ESSID_MAX_SIZE + 1];
+
+				err = copy_from_user(essid, iwr->u.data.pointer,
+						     iwr->u.data.length *
+						     descr->token_size);
+				if (err)
+					return -EFAULT;
+
+				if (essid[iwr->u.data.length] == '\0')
+					essid_compat = 1;
+			}
+			break;
+		default:
+			break;
+		}
+
+		iwr->u.data.length -= essid_compat;
+
 		/* Check what user space is giving us */
 		if(IW_IS_SET(cmd)) {
 			/* Check NULL pointer */
@@ -795,7 +823,8 @@ #ifdef WE_IOCTL_DEBUG
 #endif	/* WE_IOCTL_DEBUG */
 
 		/* Create the kernel buffer */
-		extra = kmalloc(extra_size, GFP_KERNEL);
+		/*kzalloc ensures NULL-termination for essid_compat */
+		extra = kzalloc(extra_size, GFP_KERNEL);
 		if (extra == NULL) {
 			return -ENOMEM;
 		}
@@ -819,6 +848,8 @@ #endif	/* WE_IOCTL_DEBUG */
 		/* Call the handler */
 		ret = handler(dev, &info, &(iwr->u), extra);
 
+		iwr->u.data.length += essid_compat;
+
 		/* If we have something to return to the user */
 		if (!ret && IW_IS_GET(cmd)) {
 			/* Check if there is enough buffer up there */
--
John W. Linville
[EMAIL PROTECTED]
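Editorial sketch: the compatibility idea in this patch is to recognise an old-style (WE-20) ESSID request, whose length counts a trailing NUL, and to shave that byte off before calling the handler. The userspace C below shows one way such a detector could look. It is not the patch's logic verbatim: as quoted, the patch tests essid[length], which is one byte past the data it copied, so this sketch assumes the intent was to test the last byte the caller counted. ESSID_MAX and essid_compat() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define ESSID_MAX 32    /* assumed to mirror IW_ESSID_MAX_SIZE */

/*
 * Return 1 if the caller's length appears to include a trailing NUL
 * (old convention) and should be reduced by one, otherwise 0.
 */
static int essid_compat(const char *buf, size_t len)
{
    /* One byte over the maximum can only be "max-size ESSID plus NUL". */
    if (len == ESSID_MAX + 1)
        return 1;

    /* Otherwise look at the last byte the caller claims to have sent. */
    if (len > 0 && len <= ESSID_MAX && buf[len - 1] == '\0')
        return 1;

    return 0;
}
```

A caller would subtract the returned value from the length before dispatching and add it back afterwards, mirroring the "-= essid_compat" and "+= essid_compat" lines in the patch. Note that a heuristic of this kind cannot tell an old-style request from a new-style ESSID that legitimately ends in a NUL byte.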
Re: [PATCH 1/3] Fix for IPsec leakage with SELinux enabled - V.03
From: James Morris <[EMAIL PROTECTED]>
Date: Thu, 5 Oct 2006 16:54:38 -0400 (EDT)

> > #ifdef CONFIG_XFRM_SUB_POLICY
> > 	pol = xfrm_policy_lookup_bytype(XFRM_POLICY_TYPE_SUB, fl, family, dir);
> > -	if (pol)
> > +	if (IS_ERR(pol)) {
> > +		err = PTR_ERR(pol);
> > +		pol = NULL;
> > +	}
> > +	if (pol || err)
> > 		goto end;
>
> Similarly, if the sub-policy lookup returns -EACCES, should we then try a
> main policy lookup before failing?

We're trying to fill the flow cache here.

In the case where we'd have a match in both the sub-policy and main table, I think the sub-policy is supposed to take precedence, and if you fail to get this sub-policy you should fail the entire lookup.

The way the sub-policied entries work is that you find the sub-policy as the primary object in the flow cache, and once you notice you have a sub-policy you do an explicit lookup in the main table to put the whole thing together.
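Editorial sketch: the lookup order argued for in this reply (scan the sub-policy table first, skip entries that report -ESRCH, and fail the entire lookup on any other error instead of falling back to the main table) can be written down as a small userspace model. The struct layout, the match_result field and both function names are hypothetical; the real code works on hashed policy lists and error pointers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct policy {
    int id;
    int match_result;   /* 0 = matches, -ESRCH = no match, else hard error */
};

/* Scan one table: first match wins, -ESRCH skips, other errors abort. */
static int lookup_bytype(struct policy *tbl, size_t n, struct policy **out)
{
    *out = NULL;
    for (size_t i = 0; i < n; i++) {
        int err = tbl[i].match_result;

        if (err == -ESRCH)
            continue;
        if (err)
            return err;
        *out = &tbl[i];
        return 0;
    }
    return 0;           /* nothing matched, which is not an error */
}

/* Sub-policies take precedence; an error there fails the whole lookup. */
static int policy_lookup(struct policy *sub, size_t nsub,
                         struct policy *main_tbl, size_t nmain,
                         struct policy **out)
{
    int err = lookup_bytype(sub, nsub, out);

    if (err || *out)
        return err;
    return lookup_bytype(main_tbl, nmain, out);
}
```

The main table is consulted only when the sub-policy scan finishes cleanly without a match, so an error on a sub-policy can never be masked by a main-table hit.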
RE: [PATCH 0/3] Fix for IPsec leakage with SELinux enabled - V.03
> > This version takes into account David Miller's comments
> > regarding treatment of security layer errors in the case
> > of socket policies. Specifically, these errors will be
> > treated like how these kind of errors are treated for
> > the main/sub policies, which is to return a full lookup
> > failure.
>
> I only have patches "1" and "3" in my inbox, did you forget
> to send the second one out or are they simply misnumbered?
>

My apologies. The second one is also numbered 1, but has the following distinct subject line:

[PATCH 1/3] Fix for IPsec leakage with SELinux enabled - V.03: Fix xfrm code
RE: [PATCH 0/3] Fix for IPsec leakage with SELinux enabled - V.03
> > > This version takes into account David Miller's comments
> > > regarding treatment of security layer errors in the case
> > > of socket policies. Specifically, these errors will be
> > > treated like how these kind of errors are treated for
> > > the main/sub policies, which is to return a full lookup
> > > failure.
> >
> > I only have patches "1" and "3" in my inbox, did you forget
> > to send the second one out or are they simply misnumbered?
> >
>
> My apologies. The second one is also numbered 1, but has the
> following distinct subject line:
> [PATCH 1/3] Fix for IPsec leakage with SELinux enabled - V.03: Fix xfrm code

In actuality, patch 2 in the series has the following subject line:
[PATCH 1/3] Fix for IPsec leakage with SELinux enabled - V.03
socket/IP on Linux
Hello, Linux Kernel:

For a mobile project I will be working on, I am looking into the IP
stacks on Linux. I have a few questions:

1. Is "socket.c" the file handling the socket interface?

2. Which function opens a socket? It looks like "sock_map_fd()" is the
one for opening/creating a socket. Is that correct? The "Linux IP Stacks
Commentary" book suggested the function is "int socket()", which I
didn't find in "socket.c" though.

3. Do you have documentation discussing the implemented socket
interfaces in detail?

Thanks a lot in advance for your help,
Jingping
RE: [PATCH 1/3] Fix for IPsec leakage with SELinux enabled - V.03
> > -        if (xfrm_policy_match(pol, fl, type, family, dir)) {
> > +        err = xfrm_policy_match(pol, fl, type, family, dir);
> > +        if (err) {
> > +            if (err == -ESRCH)
> > +                continue;
> > +            else {
> > +                ret = ERR_PTR(err);
> > +                goto fail;
> > +            }
> > +        } else {
>
> Semantics issue: if the exact policy match fails with -EACCESS, should we
> then try an inexact match before failing?

I wonder what you mean by an inexact match here.

>
> >  #ifdef CONFIG_XFRM_SUB_POLICY
> >      pol = xfrm_policy_lookup_bytype(XFRM_POLICY_TYPE_SUB, fl, family, dir);
> > -    if (pol)
> > +    if (IS_ERR(pol)) {
> > +        err = PTR_ERR(pol);
> > +        pol = NULL;
> > +    }
> > +    if (pol || err)
> >          goto end;
>
> Similarly, if the sub-policy lookup returns -EACCESS, should we then try a
> main policy lookup before failing?

I would think not, since we are already handling the more usual "failure"
of EACCES properly, and any other error would usually have to be a
near-fatal error concerning the whole LSM policy, or temporary memory
pressure, for example. Usually the latter is handled automatically when
the callers reattempt the lookup.

While it is theoretically possible that the LSM might generate an error
for the sub but not for the main, we would have to first redefine the LSM
hook to communicate this differentiation. And at least in the case of the
current user of LSM (SELinux) I don't currently see the need for this
differentiation.

>
> I would think yes to both.
>
> Opinions?
>
>
> - James
> --
> James Morris
> <[EMAIL PROTECTED]>
>
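The error handling in the first quoted hunk can be sketched as a standalone loop. This is a minimal illustration, not the xfrm implementation: `lookup_bytype()` and the `match_result` field are hypothetical, with `match_result` standing in for whatever `xfrm_policy_match()` would return for a given entry, and the error-pointer helpers are userspace stand-ins for the kernel macros. It shows `-ESRCH` being treated as "keep looking" while any other error aborts the walk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical policy entry. match_result models the per-entry check:
 * 0 = match, -ESRCH = selector did not match, anything else = hard
 * error (for example -EACCES from a security hook).
 */
struct policy {
	int id;
	int match_result;
};

/* Return the first matching entry, NULL if none, or ERR_PTR(err). */
static struct policy *lookup_bytype(struct policy *list, size_t n)
{
	assert(list != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int err = list[i].match_result;

		if (err == -ESRCH)
			continue;            /* no match: try the next entry */
		if (err)
			return ERR_PTR(err); /* hard error: fail the lookup */
		return &list[i];         /* first match wins */
	}
	return NULL;
}
```

With this shape, a caller can distinguish "no policy" (NULL) from "lookup failed" (`IS_ERR()`), which is the distinction the thread is arguing about.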
Re: Request to postpone WE-21
On Thu, Oct 05, 2006 at 04:49:54PM -0400, John W. Linville wrote:
> On Thu, Oct 05, 2006 at 09:31:13AM -0700, Jean Tourrilhes wrote:
>
> > Based on the feedback, I formally request you to back out all
> > of WE-21 from 2.6.19. Rationale : it's probably too early. You can
> > keep it for a later date if you wish.
>
> Jean,

Let me say I truly appreciate your effort to bring progress to the
discussion.

> What about a patch like the one below? It tries to detect WE-20
> ESSID/NICKN accesses and adjust them to WE-21 style. What am
> I missing?

The idea is clever.

The GET is no longer an issue. We've had half the drivers doing the GET
"new style" since January, so in a sense the API change has already
happened, and I've already dealt with the bug reports. So, I think we
could drop the "GET" part.

As you may have noticed, detecting the API for the GET is easy. On the
other hand, detecting it for the SET is not clear cut. As Jouni was
pointing out, '\0' is a valid ESSID character, and in the long term we
want to allow it, even if it's in the last position.

I'm also wondering whether this additional complexity could bring
additional trouble, but I'm not currently clear on that. I usually
prefer things to be a bit more explicit.

> I haven't had a chance to test it yet -- just hacked it
> up...YMMV... :-)

And I think there are a couple of ways we could refine the
implementation, if ever we decide to go that way. For example, the
correction could happen after the real copy_from_user(), as the
uncorrected iwr->u.data.length is always the number of chars to pass
between kernel and userspace. I think this would drastically simplify
the code. I'll try to check that.

> John

Thanks again...

Jean
RE: [PATCH 1/3] Fix for IPsec leakage with SELinux enabled - V.03
> From: James Morris <[EMAIL PROTECTED]>
> Date: Thu, 5 Oct 2006 16:54:38 -0400 (EDT)
>
> > >  #ifdef CONFIG_XFRM_SUB_POLICY
> > >      pol = xfrm_policy_lookup_bytype(XFRM_POLICY_TYPE_SUB, fl, family, dir);
> > > -    if (pol)
> > > +    if (IS_ERR(pol)) {
> > > +        err = PTR_ERR(pol);
> > > +        pol = NULL;
> > > +    }
> > > +    if (pol || err)
> > >          goto end;
> >
> > Similarly, if the sub-policy lookup returns -EACCESS, should we then try a
> > main policy lookup before failing?
>
> We're trying to fill the flow cache here. In the case where we'd
> have a match in both the sub-policy and main table, I think the
> sub-policy is supposed to take precedence, and if you fail to get
> this sub-policy you should fail the entire lookup.

Which is what's happening here, correct?

>
> The way the sub-policied entries work is that you find the sub-policy
> as the primary object in the flow cache, and once you notice you have
> a sub-policy you do an explicit lookup in the main table to put the
> whole thing together.

Maybe James can help me understand this; when exactly would a sub-policy
be "notice"d here? What does "put the whole thing together" mean?
Re: [PATCH 0/3] Fix for IPsec leakage with SELinux enabled - V.03
From: Venkat Yekkirala <[EMAIL PROTECTED]>
Date: Thu, 5 Oct 2006 17:07:59 -0400

> My apologies. The second one is also numbered 1, but has the
> following distinct subject line:
> [PATCH 1/3] Fix for IPsec leakage with SELinux enabled - V.03: Fix xfrm code

I definitely deleted one of them, since I usually get N copies of
every single patch posting and two of them looked identical :)
Re: [PATCH 1/3] Fix for IPsec leakage with SELinux enabled - V.03
From: Venkat Yekkirala <[EMAIL PROTECTED]>
Date: Thu, 5 Oct 2006 17:27:03 -0400

> Maybe James can help me understand this; when exactly would a sub-policy
> be "notice"d here? What does "put the whole thing together" mean?

The code in xfrm_lookup() does a flow cache lookup, and if it finds
that it has obtained a sub-policy, it then does an explicit main table
policy lookup.

The sub-policy and the main table policy thus found are "put together"
to form the full route.
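As a rough illustration of the flow just described, the sketch below models "noticing" a sub-policy and then combining it with a main-table policy. Everything here is hypothetical: `resolve_policies()`, `struct policy_set`, and the way the main-table result is passed in as arguments are invented for the example and are not taken from xfrm_lookup(). Failing the whole lookup when the explicit main-table lookup errors out is also an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum policy_type { POLICY_TYPE_MAIN, POLICY_TYPE_SUB };

/* Hypothetical policy record, for illustration only. */
struct policy {
	enum policy_type type;
	int id;
};

/* The policies that end up applying to one flow (at most two here). */
struct policy_set {
	const struct policy *pols[2];
	int npols;
};

/*
 * cached:   what the flow cache lookup returned (NULL if nothing).
 * main_pol: what an explicit main-table lookup would return.
 * main_err: error from that main-table lookup, 0 on success.
 */
static int resolve_policies(const struct policy *cached,
			    const struct policy *main_pol, int main_err,
			    struct policy_set *out)
{
	assert(out != NULL);
	out->npols = 0;
	out->pols[0] = out->pols[1] = NULL;

	if (!cached)
		return 0;
	out->pols[out->npols++] = cached;

	/* A main policy from the cache needs no second lookup. */
	if (cached->type != POLICY_TYPE_SUB)
		return 0;

	/* Sub-policy noticed: the main table is consulted explicitly. */
	if (main_err) {
		out->npols = 0;
		out->pols[0] = NULL;
		return main_err;
	}
	if (main_pol)
		out->pols[out->npols++] = main_pol;
	return 0;
}
```

The sub-policy stays first in the set, matching the description of it as the primary object found in the flow cache.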
Re: Request to postpone WE-21
On Thu, Oct 05, 2006 at 04:49:54PM -0400, John W. Linville wrote:

> What about a patch like the one below? It tries to detect WE-20
> ESSID/NICKN accesses and adjust them to WE-21 style. What am
> I missing?

> diff --git a/net/core/wireless.c b/net/core/wireless.c

> +    else if (IW_IS_SET(cmd)) {
> +        char essid[IW_ESSID_MAX_SIZE + 1];
> +
> +        err = copy_from_user(essid, iwr->u.data.pointer,
> +                             iwr->u.data.length *
> +                             descr->token_size);
> +        if (essid[iwr->u.data.length] == '\0')
> +            essid_compat = 1;

This looks somewhat confusing.. WE-20 (and older) included '\0' in both
the data value and length (well, at least in most drivers and user space
tools, if I remember correctly), i.e., essid[iwr->u.data.length] would
be pointing one byte after the '\0' termination..

And since '\0' is valid character in SSID (it is just an arbitrary array
of octets) it can also be the last octet of the SSID and WE-21 style
case could have essid[iwr->u.data.length - 1] == '\0'..

--
Jouni Malinen                                            PGP id EFC895FA
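The ambiguity raised above can be made concrete with a small userspace sketch. The helper names (`we20_encode()`, `we21_encode()`, `last_octet_is_nul()`) are hypothetical and only model the two length conventions as described in this thread: WE-20 style counts a trailing '\0' in the length, WE-21 style passes the raw octets. `ESSID_MAX` is assumed to match `IW_ESSID_MAX_SIZE` (32).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ESSID_MAX 32 /* assumed equal to IW_ESSID_MAX_SIZE */

/* WE-20 style: the '\0' terminator is part of the data and the length. */
static size_t we20_encode(const unsigned char *ssid, size_t ssid_len,
			  unsigned char *buf)
{
	assert(ssid_len <= ESSID_MAX);
	memcpy(buf, ssid, ssid_len);
	buf[ssid_len] = '\0';
	return ssid_len + 1;
}

/* WE-21 style: only the SSID octets are passed, no terminator. */
static size_t we21_encode(const unsigned char *ssid, size_t ssid_len,
			  unsigned char *buf)
{
	assert(ssid_len <= ESSID_MAX);
	memcpy(buf, ssid, ssid_len);
	return ssid_len;
}

/*
 * Candidate heuristic: treat a request as WE-20 style when the last
 * octet covered by the length is '\0'. It inspects buf[len - 1], not
 * buf[len], since buf[len] lies outside the copied data.
 */
static int last_octet_is_nul(const unsigned char *buf, size_t len)
{
	return len > 0 && buf[len - 1] == '\0';
}
```

A WE-20 request for the four octets "home" carries length 5 and trips the heuristic, as intended. A WE-21 request for the three octets 'a', 'b', '\0' carries length 3 and trips it too, so the kernel cannot tell the two callers apart from the buffer alone.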