Re: 2.6.22: ERROR: __ucmpdi2 [drivers/net/s2io.ko] undefined!
On Thu, 21 Jun 2007 05:55:13 -0400 Sivakumar Subramani [EMAIL PROTECTED] wrote:

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Olaf Hering
Sent: Wednesday, June 20, 2007 2:11 AM
To: Stephen Hemminger
Cc: [EMAIL PROTECTED]; netdev@vger.kernel.org
Subject: Re: 2.6.22: ERROR: __ucmpdi2 [drivers/net/s2io.ko] undefined!

On Tue, Jun 19, Stephen Hemminger wrote:

On Tue, 19 Jun 2007 21:02:53 +0200 Olaf Hering [EMAIL PROTECTED] wrote:

What happened to __ucmpdi2 from David Woodhouse? google has a few hits about stuff like this on 32bit powerpc with gcc 4.1.2:

ERROR: __ucmpdi2 [drivers/net/s2io.ko] undefined!

using the drivers/net/s2io* files from 2.6.21 with 2.6.22-rc5 fixes the compile.

25805dcf9d83098cf5492117ad2669cd14cc9b24 adds two u64 >>= 48 followed by a switch statement (line 2889 and 6816). Probably the switch(err) { needs a cast to a smaller type (like u8).

This change removes the calls to __ucmpdi2.

(fixes quoting, fixes top-posting. Please don't top-post).

Hi,

We will include this fix in next set of patch submission. Thanks for the fix.

---
 drivers/net/s2io.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

--- a/drivers/net/s2io.c
+++ b/drivers/net/s2io.c
@@ -2868,6 +2868,7 @@ static void tx_intr_handler(struct fifo_
 	struct tx_curr_get_info get_info, put_info;
 	struct sk_buff *skb;
 	struct TxD *txdlp;
+	u8 err_mask;
 
 	get_info = fifo_data->tx_curr_get_info;
 	memcpy(&put_info, &fifo_data->tx_curr_put_info, sizeof(put_info));
@@ -2886,8 +2887,8 @@ static void tx_intr_handler(struct fifo_
 		}
 
 		/* update t_code statistics */
-		err >>= 48;
-		switch(err) {
+		err_mask = err >> 48;
+		switch(err_mask) {
 			case 2:
 				nic->mac_control.stats_info->sw_stat.
 						tx_buf_abort_cnt++;
@@ -6805,6 +6806,7 @@ static int rx_osm_handler(struct ring_in
 	u16 l3_csum, l4_csum;
 	unsigned long long err = rxdp->Control_1 & RXD_T_CODE;
 	struct lro *lro;
+	u8 err_mask;
 
 	skb->dev = dev;
 
@@ -6813,8 +6815,8 @@ static int rx_osm_handler(struct ring_in
 		if (err & 0x1) {
 			sp->mac_control.stats_info->sw_stat.parity_err_cnt++;
 		}
-		err >>= 48;
-		switch(err) {
+		err_mask = err >> 48;
+		switch(err_mask) {
 			case 1:
 				sp->mac_control.stats_info->sw_stat.
 				rx_parity_err_cnt++;
@@ -6867,9 +6869,9 @@ static int rx_osm_handler(struct ring_in
 		* Note that in this case, since checksum will be incorrect,
 		* stack will validate the same.
 		*/
-		if (err != 0x5) {
-			DBG_PRINT(ERR_DBG, "%s: Rx error Value: 0x%llx\n",
-				dev->name, err);
+		if (err_mask != 0x5) {
+			DBG_PRINT(ERR_DBG, "%s: Rx error Value: 0x%x\n",
+				dev->name, err_mask);
 			sp->stats.rx_crc_errors++;
 			sp->mac_control.stats_info->sw_stat.mem_freed
 				+= skb->truesize;

This fix is still not present in anyone's tree and is required for 2.6.22. Where are we up to with it?

-
To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
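For readers following the fix, here is a minimal userspace sketch of the idea behind the patch above: shift the 64-bit value down first, store the result in an 8-bit variable, and switch on that. With a narrow switch operand the compiler has no reason to emit a 64-bit compare helper such as libgcc's __ucmpdi2. The mask value, the helper names and the returned strings are assumptions made for this illustration and are not taken from the s2io driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: the transfer code sits in bits 48..51. */
#define T_CODE_MASK 0x000F000000000000ULL

/* Narrow the masked 64-bit control word to the small code value. */
static uint8_t t_code_of(uint64_t control)
{
	uint64_t err = control & T_CODE_MASK;
	uint8_t err_mask = (uint8_t)(err >> 48);	/* only 4 bits can survive */

	assert(err_mask <= 0xF);
	return err_mask;
}

/* Switch on the u8, so every case label is a plain 32-bit compare. */
static const char *t_code_name(uint64_t control)
{
	switch (t_code_of(control)) {
	case 0:
		return "none";
	case 1:
		return "parity error";		/* label is illustrative only */
	default:
		return "other error";
	}
}
```

The design point is the same as in the patch: the u64 is kept for the mask test, and only the switch operand is narrowed.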
Re: 2.6.20-2.6.21 - networking dies after random time
On Tue, Jun 26, 2007 at 08:10:17AM +0200, Marcin Ślusarz wrote:
...
> I reproduced it on minimal config:
...

Hm... This method is usable if you can find such a minimal config with which the bug cannot be reproduced. Then you can add more until the bug is back. Of course, this takes time...

We know your hardware should be OK, since it was fine with 2.6.20. We don't know how much your configs (kernel & apps) have changed. Sometimes a change of kernel needs some apps to be recompiled too. That's why it could be useful to try 2.6.21 from a live distro to find out if it's really the kernel's fault.

And, alas, this log doesn't seem to tell anything new...

Regards,
Jarek P.
Re: Fwd: [PATCH] [-mm] ACPI: export ACPI events via netlink
On Mon, 2007-06-25 at 13:08 -0400, jamal wrote: Why do you think that would be hard? It'd basically just mean replacing the netlink_capable(sock, NL_NONROOT_RECV) calls with a call that actually tests depending on the group(s) it wants. I think it could be done. You will need to have root maybe initially set such permissions etc - but it may be overkill. I think we pretty much know in the kernel whether we want to require CAP_NET_ADMIN or not, let's punt the rest to userspace. Yeah, sounds reasonable, you could ask the controller for which groups are attached to a family and then get the IDs for those groups by name. Yes, we would need a newer api to do it right. But it could be done if you register for multi groups. I've just replied somewhere else in this thread with a patch, I haven't actually tested that patch yet though. Once the generic netlink multicast is figured out we can start attacking the permissions issue. johannes
Re: Fwd: [PATCH] [-mm] ACPI: export ACPI events via netlink
On Tue, 2007-06-19 at 11:32 +0800, Zhang Rui wrote: Ok, by inspection (sorry, still dont have much time) - your kernel code is sending to group 1; i.e genlmsg_multicast(skb, 0, 1, GFP_ATOMIC); you need to change that to send to your assigned id, i.e: genlmsg_multicast(skb, 0, acpi_event_genl_family.id, GFP_ATOMIC); Oh, that's the problem. Great, now it works happily. :). Jamal, thanks for your help! I wonder if we should hold off on this API until we've worked out the multicast issue. Right now we have (mostly by convention afaict) in generic netlink that everybody has the same group ID as the family ID but that breaks down as soon as somebody needs more groups than that, which nl80211 will most likely need. Hence, the proposal Jamal had was to have a dynamic multicast number allocator and (if I understood correctly) look up multicast numbers by family ID/name. This is fairly extensive API/ABI change, but luckily there are no generic netlink multicast users yet except for the controller which luckily has the fixed ID 1. Therefore, if we hold off on this patch until we've written the code for dynamic multicast groups, we can hardcode the group for controller and have all others dynamically assigned; if we merge the ACPI events now we'll have to hardcode the ACPI family ID (and thus multicast group) to a small number to avoid problems with dynamic multicast groups where the numbers will be != family ID. My proposition for the actual dynamic registration interface would be to add a .groups array to pointers to struct genl_family with that just being struct genl_multicast_group { char *name; u32 id; } (as usual, NULL signifies array termination) and the controller is responsible for assigning the ID and exporting it to userspace. 
name is a per-family field, something like this patch:

---
 include/linux/genetlink.h |    3 +
 include/net/genetlink.h   |   15 ++
 net/netlink/genetlink.c   |  111 ++
 3 files changed, 129 insertions(+)

--- wireless-dev.orig/include/net/genetlink.h	2007-06-25 23:56:59.085732308 +0200
+++ wireless-dev/include/net/genetlink.h	2007-06-26 00:01:43.935732308 +0200
@@ -5,12 +5,26 @@
 #include <net/netlink.h>
 
 /**
+ * struct genl_multicast_group - generic netlink multicast group
+ * @name: name of the multicast group, names are per-family
+ * @id: multicast group ID, assigned by the core, to use with
+ *	genlmsg_multicast().
+ */
+struct genl_multicast_group
+{
+	char	name[GENL_NAMSIZ];
+	u32	id;
+};
+
+/**
  * struct genl_family - generic netlink family
  * @id: protocol family idenfitier
  * @hdrsize: length of user specific header in bytes
  * @name: name of family
  * @version: protocol version
  * @maxattr: maximum number of attributes supported
+ * @multicast_groups: multicast groups to be registered
+ *	for this family (%NULL-terminated array)
  * @attrbuf: buffer to store parsed attributes
  * @ops_list: list of all assigned operations
  * @family_list: family list
@@ -22,6 +36,7 @@ struct genl_family
 	char			name[GENL_NAMSIZ];
 	unsigned int		version;
 	unsigned int		maxattr;
+	struct genl_multicast_group **multicast_groups;
 	struct nlattr **	attrbuf;	/* private */
 	struct list_head	ops_list;	/* private */
 	struct list_head	family_list;	/* private */
--- wireless-dev.orig/net/netlink/genetlink.c	2007-06-25 23:56:02.805732308 +0200
+++ wireless-dev/net/netlink/genetlink.c	2007-06-26 00:39:26.985732308 +0200
@@ -3,6 +3,7 @@
  *
  * Authors:	Jamal Hadi Salim
  * 		Thomas Graf [EMAIL PROTECTED]
+ *		Johannes Berg [EMAIL PROTECTED]
  */
 
 #include <linux/module.h>
@@ -13,6 +14,7 @@
 #include <linux/string.h>
 #include <linux/skbuff.h>
 #include <linux/mutex.h>
+#include <linux/bitmap.h>
 #include <net/sock.h>
 #include <net/genetlink.h>
 
@@ -42,6 +44,15 @@ static void genl_unlock(void)
 #define GENL_FAM_TAB_MASK	(GENL_FAM_TAB_SIZE - 1)
 
 static struct list_head family_ht[GENL_FAM_TAB_SIZE];
+/*
+ * To avoid an allocation at boot of just one unsigned long,
+ * declare it global instead.
+ * Bit 0 (special?) and bit 1 are marked as already used
+ * since group 1 is the controller group.
+ */
+static unsigned long mcast_group_start = 0x3;
+static unsigned long *multicast_groups = &mcast_group_start;
+static unsigned long multicast_group_bits = BITS_PER_LONG;
 
 static int genl_ctrl_event(int event, void *data);
 
@@ -116,6 +127,76 @@ static inline u16 genl_generate_id(void)
 	return id_gen_idx;
 }
 
+static int genl_register_mcast_group(struct genl_multicast_group *grp)
+{
+	int id = find_first_zero_bit(multicast_groups, multicast_group_bits);
+
+	if (id >= multicast_group_bits) {
+		if (multicast_groups == &mcast_group_start) {
+
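The patch above is cut off in this archive, so as a rough illustration of the allocation scheme it describes, here is a small userspace sketch: a single unsigned long serves as the bitmap, bits 0 and 1 start out set, and a new group gets the lowest clear bit. The function names are hypothetical, and the sketch simply reports failure when the word is full. What the real patch does in that case is not visible here.

```c
#include <assert.h>
#include <limits.h>

#define GROUP_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Bit 0 is treated as reserved, bit 1 is the controller's fixed group. */
static unsigned long group_bitmap = 0x3;

/* Hand out the lowest free group ID, or -1 if the single word is full. */
static int alloc_group_id(void)
{
	unsigned int id;

	for (id = 0; id < GROUP_BITS; id++) {
		if (!(group_bitmap & (1UL << id))) {
			group_bitmap |= 1UL << id;
			return (int)id;
		}
	}
	return -1;	/* growing the bitmap is left out of this sketch */
}

/* Release a dynamically assigned ID; the two fixed bits stay set. */
static void free_group_id(int id)
{
	assert(id > 1 && (unsigned int)id < GROUP_BITS);
	group_bitmap &= ~(1UL << id);
}
```

Under this scheme the first dynamically registered group gets ID 2, independent of the family ID, which is the decoupling the mail argues for.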
Re: [RTNETLINK]: Add nested compat attribute
Waskiewicz Jr, Peter P wrote:
> It looks like the one Patrick resent was the older version that requires a typecast. This is the function prototype currently in the kernel:

Oops, sorry, I messed that up. Will fix immediately.
Re: [RTNETLINK]: Add nested compat attribute
David Miller wrote:
> Meanwhile, Patrick please clear up the situation :-)

Attached is both an incremental patch and a complete replacement, please take whichever you like better :)

[RTNETLINK]: Add nested compat attribute

Add a nested compat attribute type that can be used to convert attributes that contain a structure to nested attributes in a backwards compatible way.

The attribute looks like this:

  struct {
          [ compat contents ]
          struct rtattr {
                  .rta_len        = total size,
                  .rta_type       = type,
          } rta;
          struct old_structure struct;

          [ nested top-level attribute ]
          struct rtattr {
                  .rta_len        = nest size,
                  .rta_type       = type,
          } nest_attr;

                  [ optional 0 .. n nested attributes ]
                  struct rtattr {
                          .rta_len        = private attribute len,
                          .rta_type       = private attribute typ,
                  } nested_attr;
                  struct nested_data data;
  };

Since both userspace and kernel deal correctly with attributes that are larger than expected old versions will just parse the compat part and ignore the rest.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

---
commit 1e16b5e521172515c53ce96a0bc3cf0e2d77c001
tree c9880c58391e2df77ecab3d9b6a6849947714eb3
parent c4edf5d552b1450d903a7e7e2d846f2169087e10
author Patrick McHardy [EMAIL PROTECTED] Fri, 22 Jun 2007 19:06:54 +0200
committer Patrick McHardy [EMAIL PROTECTED] Fri, 22 Jun 2007 19:06:54 +0200

 include/linux/rtnetlink.h |   18 ++++++++++++++++++
 net/core/rtnetlink.c      |   14 ++++++++++++++
 2 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 6127858..d40b0c9 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -570,10 +570,16 @@ static __inline__ int rtattr_strcmp(cons
 }
 
 extern int rtattr_parse(struct rtattr *tb[], int maxattr, struct rtattr *rta, int len);
+extern int __rtattr_parse_nested_compat(struct rtattr *tb[], int maxattr,
+					 struct rtattr *rta, int len);
 
 #define rtattr_parse_nested(tb, max, rta) \
 	rtattr_parse((tb), (max), RTA_DATA((rta)), RTA_PAYLOAD((rta)))
 
+#define rtattr_parse_nested_compat(tb, max, rta, data, len) \
+({	data = RTA_PAYLOAD(rta) >= len ? RTA_DATA(rta) : NULL; \
+	__rtattr_parse_nested_compat(tb, max, rta, len); })
+
 extern int rtnetlink_send(struct sk_buff *skb, u32 pid, u32 group, int echo);
 extern int rtnl_unicast(struct sk_buff *skb, u32 pid);
 extern int rtnl_notify(struct sk_buff *skb, u32 pid, u32 group,
@@ -638,6 +644,18 @@ #define RTA_NEST_END(skb, start) \
 ({	(start)->rta_len = skb_tail_pointer(skb) - (unsigned char *)(start); \
 	(skb)->len; })
 
+#define RTA_NEST_COMPAT(skb, type, attrlen, data) \
+({	struct rtattr *__start = (struct rtattr *)skb_tail_pointer(skb); \
+	RTA_PUT(skb, type, attrlen, data); \
+	RTA_NEST(skb, type); \
+	__start; })
+
+#define RTA_NEST_COMPAT_END(skb, start) \
+({	struct rtattr *__nest = (void *)(start) + NLMSG_ALIGN((start)->rta_len); \
+	(start)->rta_len = skb_tail_pointer(skb) - (unsigned char *)(start); \
+	RTA_NEST_END(skb, __nest); \
+	(skb)->len; })
+
 #define RTA_NEST_CANCEL(skb, start) \
 ({	if (start) \
 		skb_trim(skb, (unsigned char *) (start) - (skb)->data); \
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 06c0c5a..54c17e4 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -97,6 +97,19 @@ int rtattr_parse(struct rtattr *tb[], in
 	return 0;
 }
 
+int __rtattr_parse_nested_compat(struct rtattr *tb[], int maxattr,
+				 struct rtattr *rta, int len)
+{
+	if (RTA_PAYLOAD(rta) < len)
+		return -1;
+	if (RTA_PAYLOAD(rta) >= RTA_ALIGN(len) + sizeof(struct rtattr)) {
+		rta = RTA_DATA(rta) + RTA_ALIGN(len);
+		return rtattr_parse_nested(tb, maxattr, rta);
+	}
+	memset(tb, 0, sizeof(struct rtattr *) * maxattr);
+	return 0;
+}
+
 static struct rtnl_link *rtnl_msg_handlers[NPROTO];
 
 static inline int rtm_msgindex(int msgtype)
@@ -1297,6 +1310,7 @@ void __init rtnetlink_init(void)
 EXPORT_SYMBOL(__rta_fill);
 EXPORT_SYMBOL(rtattr_strlcpy);
 EXPORT_SYMBOL(rtattr_parse);
+EXPORT_SYMBOL(__rtattr_parse_nested_compat);
 EXPORT_SYMBOL(rtnetlink_put_metrics);
 EXPORT_SYMBOL(rtnl_lock);
 EXPORT_SYMBOL(rtnl_trylock);

[RTNETLINK]: Fix rtnetlink compat attribute patch

Sent the wrong patch previously.

Signed-off-by: Patrick McHardy [EMAIL PROTECTED]

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 5852921..a37be6a 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -570,12 +570,16 @@ static __inline__ int rtattr_strcmp(const struct rtattr *rta, const char *str)
 }
 
 extern int rtattr_parse(struct rtattr *tb[], int maxattr, struct rtattr *rta, int len);
-extern int rtattr_parse_nested_compat(struct rtattr *tb[], int maxattr,
-				       struct rtattr *rta, void **data, int len);
+extern int __rtattr_parse_nested_compat(struct rtattr *tb[], int
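To make the layout described in the changelog above easier to follow, here is a self-contained userspace sketch of the same length checks. It uses a simplified stand-in header (struct attr and the ATTR_* macros are hypothetical, modelled loosely on struct rtattr and the RTA_* macros) and only locates the two parts: the old fixed-size structure, and the nested attribute behind it if the payload is long enough to hold one.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct rtattr: 4-byte header, 4-byte alignment. */
struct attr {
	unsigned short len;	/* header plus payload, in bytes */
	unsigned short type;
};

#define ATTR_ALIGN(n)	(((n) + 3U) & ~3U)
#define ATTR_HDRLEN	ATTR_ALIGN(sizeof(struct attr))
#define ATTR_DATA(a)	((void *)((char *)(a) + ATTR_HDRLEN))
#define ATTR_PAYLOAD(a)	((size_t)(a)->len - ATTR_HDRLEN)

/*
 * Split a compat attribute into the old structure and the optional
 * nested attribute that follows it. Returns -1 if the payload is too
 * short to contain the old structure at all.
 */
static int parse_compat(struct attr *a, size_t old_len,
			void **old_data, struct attr **nest)
{
	assert(a->len >= ATTR_HDRLEN);
	*old_data = NULL;
	*nest = NULL;

	if (ATTR_PAYLOAD(a) < old_len)
		return -1;
	*old_data = ATTR_DATA(a);

	/* Is there room for another header after the aligned old struct? */
	if (ATTR_PAYLOAD(a) >= ATTR_ALIGN(old_len) + sizeof(struct attr))
		*nest = (struct attr *)((char *)ATTR_DATA(a) + ATTR_ALIGN(old_len));
	return 0;
}
```

An old parser corresponds to the first check only: it reads old_len bytes and ignores whatever follows, which is the backwards compatibility the changelog relies on.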
Re: 2.6.22: ERROR: __ucmpdi2 [drivers/net/s2io.ko] undefined!
Andrew Morton wrote:
> This fix is still not present in anyone's tree and is required for 2.6.22. Where are we up to with it?

It's in my mbox queue for 2.6.22 (hopefully today).

Jeff
Re: [RTNETLINK]: Add nested compat attribute
From: Patrick McHardy [EMAIL PROTECTED]
Date: Tue, 26 Jun 2007 12:04:21 +0200

> David Miller wrote:
> > Meanwhile, Patrick please clear up the situation :-)
>
> Attached is both an incremental patch and a complete replacement, please take whichever you like better :)

I applied the incremental, thanks. I'll combine them next time I rebase.
Re: Fwd: [PATCH] [-mm] ACPI: export ACPI events via netlink
On Tue, 2007-26-06 at 00:40 +0200, Johannes Berg wrote:
> I wonder if we should hold off on this API until we've worked out the multicast issue.

I think we can fix all the code in one shot later. I just glanced at your patch but i have to run out, i will stare at it later - seems to be in the right direction.

cheers,
jamal
Re: Fwd: [PATCH] [-mm] ACPI: export ACPI events via netlink
On Tue, 2007-06-26 at 09:33 -0400, jamal wrote:
> On Tue, 2007-26-06 at 00:40 +0200, Johannes Berg wrote:
> > I wonder if we should hold off on this API until we've worked out the multicast issue.
>
> I think we can fix all the code in one shot later.

Yes, we could fix the code in the kernel, but since the family ID is dynamically assigned and I'm trying to decouple the multicast group ID from the family ID that would break userspace relying on family==multicast group unless we somehow reserved the family ID number ACPI got to make sure that ACPI gets the same multicast group ID. Combined with the fact that ACPI might be modular and get into generic netlink late in the game this seems non-trivial; also I think it's not necessary since holding off on this ACPI genetlink multicast user (which is the first besides the controller!) until we've worked out the patch shouldn't hurt much.

> I just glanced at your patch but i have to run out, i will stare at it later - seems to be in the right direction.

Thanks.

johannes
forcedeth lockup in 2.6.21.5 and 2.6.22-rc6
With Centos-5.0/Redhat 5.0 on multiple motherboards (Tyan 2995/ SuperMicro H8DCE) The forcedeth network driver locks up under heavy NFS traffic (32KB frames) such as linking shared libraries. It either gives a register dump on a lock up in the transmit side, or loops complaining about eth0: too many iterations (6) in nv_nic_irq The system is hung. I cannot get to /var/log/messages, and only dmesg gives information. service network restart or even reloading the forcedeth.ko module does not restore the link. The only fix is reboot -f. These machines are 4GB or 8GB opteron workstations. There is no problem when using 2.6.20. The error counts reported by ifconfig usually show framing errors, sometimes 10 digits worth concurrent with this issue (after being up 5 minutes). I can reproduce this 100% of the time with 2.6.21.5 and 2.6.22-rc6. Any clues on what to try next? berkley -- // E. F. Berkley Shands, MSc// ** Exegy Inc.** 349 Marshall Road, Suite 100 St. Louis , MO 63119 Direct: (314) 218-3600 X450 Cell: (314) 303-2546 Office: (314) 218-3600 Fax: (314) 218-3601
Re: [PATCH] fix race in AF_UNIX
Miklos Szeredi [EMAIL PROTECTED] writes:

And I think incremental GC algorithms are much too complex for this task.

What I've realized, is that in fact we don't require a generic garbage collection algorithm, just a much more specialized cycle collection algorithm, since refcounting in struct file takes care of the rest. This would help with localizing the problem to the problematic sockets (which have an in-flight unix socket), instead of having to blindly traverse _all_ unix sockets in the system. I'll look at reimplementing the GC with such an algorithm.

Ok. If you can do it more simply have at it. There are incremental garbage collectors that are essentially just the current algorithm with fine-grained locking. So we don't have to live in a spin-lock the whole time. If your approach fails we can look at something more fine-grained.

It appears clear that since we can't stop the world and garbage collect we need an incremental collector.

Constraining ourselves to stopping unix sockets from going in flight or coming out of flight during garbage collection should be OK I think. There's still a possibility of a DoS there, but it would only be able to affect _very_ few applications.

Yes.

Eric
Re: [RFD] L2 Network namespace infrastructure
David Miller [EMAIL PROTECTED] writes:

> From: [EMAIL PROTECTED] (Eric W. Biederman)
> Date: Sun, 24 Jun 2007 06:58:54 -0600
>
> > I am convinced I can keep network namespaces something that is so trivial and obvious to get right you won't have to pay attention to them.
>
> Ok then, I'll hold you to this when you post the rest of your implementation :-)

Sounds fair to me. I definitely don't want a network stack that is noticeably harder to maintain.

Eric
Re: [PATCH UPDATE] Ethernet driver for EISA only SNI RM200/RM400 machines
Thomas Bogendoerfer [EMAIL PROTECTED] :
[...]
> diff --git a/drivers/net/sni_82596.c b/drivers/net/sni_82596.c
> new file mode 100644
> index 000..80e32ad
> --- /dev/null
> +++ b/drivers/net/sni_82596.c
[...]
> +static int __devinit sni_82596_probe(struct platform_device *dev)
> +{
[...]
> +	if (retval) {
> +		free_netdev(netdevice);
> +probe_failed:
> +		if (mpu_addr)
> +			iounmap(mpu_addr);
> +		if (ca_addr)
> +			iounmap(ca_addr);
> +		if (eth_addr)
                    ^^^
> +			iounmap(ca_addr);
                                ^^ - typo

Please use plain goto with proper labels and remove the tests.

-- 
Ueimor
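As an illustration of the review comment above ("plain goto with proper labels"), here is a small userspace sketch of the usual unwind pattern: each acquisition gets its own label, and a failure jumps to the label that releases exactly what has been acquired so far, so no NULL tests are needed on the error path. The helpers map_region() and unmap_region() and the fail_at knob are hypothetical stand-ins for ioremap()/iounmap(), not code from the sni_82596 driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the three mapped regions in the probe routine. */
struct regions {
	void *mpu;
	void *ca;
	void *eth;
};

static int live_mappings;	/* outstanding mappings, tracked for the demo */

static void *map_region(int fail)
{
	void *p = fail ? NULL : malloc(16);

	if (p)
		live_mappings++;
	return p;
}

static void unmap_region(void *p)
{
	assert(p != NULL);	/* callers only unmap what they mapped */
	free(p);
	live_mappings--;
}

/* fail_at: 0 = succeed, 1..3 = make the n-th mapping fail. */
static int probe(struct regions *r, int fail_at)
{
	r->mpu = map_region(fail_at == 1);
	if (!r->mpu)
		goto err_out;
	r->ca = map_region(fail_at == 2);
	if (!r->ca)
		goto err_unmap_mpu;
	r->eth = map_region(fail_at == 3);
	if (!r->eth)
		goto err_unmap_ca;
	return 0;

err_unmap_ca:
	unmap_region(r->ca);
err_unmap_mpu:
	unmap_region(r->mpu);
err_out:
	return -1;
}
```

Because each label releases one specific resource, a copy-and-paste slip like unmapping ca_addr twice becomes much easier to spot.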
Re: [PATCH UPDATE] Extract chip specific code out of lasi_82596.c
On Tue, Jun 26, 2007 at 11:47:55PM +0200, Francois Romieu wrote:
> Thomas Bogendoerfer [EMAIL PROTECTED] :
> [...]
> > +static inline void init_rx_bufs(struct net_device *dev)
> > +{
> > +	struct i596_private *lp = netdev_priv(dev);
> > +	struct i596_dma *dma = lp->dma;
> > +	int i;
> > +	struct i596_rfd *rfd;
> > +	struct i596_rbd *rbd;
> > +
> > +	/* First build the Receive Buffer Descriptor List */
> > +
> > +	for (i = 0, rbd = dma->rbds; i < rx_ring_size; i++, rbd++) {
> > +		dma_addr_t dma_addr;
> > +		struct sk_buff *skb = dev_alloc_skb(PKT_BUF_SZ + 4);
> > +
> > +		if (skb == NULL)
> > +			panic(KERN_ERR "%s: alloc_skb() failed", __FILE__);
>
> The driver could use netdev_alloc_skb.

what's the advantage ?

> init_rx_bufs() should handle failure more gracefully and return a proper status code.

of course.

> [...]
> > +static int init_i596_mem(struct net_device *dev)
> > +{
> [...]
> > +	if (request_irq(dev->irq, i596_interrupt, 0, "i82596", dev)) {
> > +		printk(KERN_ERR "%s: IRQ %d not free\n", dev->name, dev->irq);
> > +		goto failed_free_irq;
> > +	}
> [...]
> > +failed_free_irq:
> > +	free_irq(dev->irq, dev);
>
> Oops.

thanks, will fix.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessary a good idea. [ RFC1925, 2.3 ]
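A minimal userspace sketch of what "handle failure more gracefully and return a proper status code" could look like for a ring fill loop: on the first failed allocation, the buffers obtained so far are released and -ENOMEM is returned instead of panicking. RING_SIZE, alloc_buf() and the alloc_budget knob are assumptions made for this illustration; the driver itself would allocate sk_buffs and set up DMA mappings.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define RING_SIZE 4

/* Demo knob: allocations left before alloc_buf() fails, -1 = unlimited. */
static int alloc_budget = -1;

static void *alloc_buf(void)
{
	if (alloc_budget == 0)
		return NULL;
	if (alloc_budget > 0)
		alloc_budget--;
	return malloc(64);
}

/* Fill every slot; on failure release what was allocated and report it. */
static int init_ring(void *ring[RING_SIZE])
{
	int i;

	assert(ring != NULL);
	for (i = 0; i < RING_SIZE; i++) {
		ring[i] = alloc_buf();
		if (!ring[i])
			goto err_free;
	}
	return 0;

err_free:
	while (--i >= 0) {
		free(ring[i]);
		ring[i] = NULL;
	}
	return -ENOMEM;
}
```

The same ordering rule applies to the IRQ hunk quoted above: when request_irq() fails there is no IRQ to release, so the error path must not call free_irq().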
Re: [linux-pm] Re: [Bugme-new] [Bug 8678] New: Kernel OOPSes when suspend/resume
Hi!

PREEMPT
Modules linked in: michael_mic arc4 ecb blkcipher ieee80211_crypt_tkip xt_TCPMSS xt_tcpmss xt_tcpudp iptable_mangle ip_tables x_tables ppp_deflate zlib_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc mt2060 dvb_usb_dib0700 dib7000m dib7000p dvb_usb dvb_core dib3000mc dibx000_common bnep rfcomm hidp hid l2cap capability commoncap eeprom sr_mod sbp2 scsi_mod 8250_pci 8250 serial_core eth1394 hci_usb bluetooth snd_intel8x0m snd_intel8x0 snd_ac97_codec snd_seq_oss ac97_bus snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ipw2200 8139too ieee80211 ieee80211_crypt snd_pcm snd_timer iTCO_wdt ehci_hcd mii ohci1394 ieee1394 rtc uhci_hcd snd snd_page_alloc ide_cd i2c_i801 pcspkr cdrom

Good heavens. Does it oops every time? And does the oops trace always look like this?

Hi Andrew, let's guess why I marked it Critical ;)

Can you please retest without DRM?

Greetings,
Rafael

Ehm What is DRM???

Digital Right Management (Direct Rendering M... part of 3d acceleration)

Just try to unload as many modules as possible, perhaps we'll find the one causing it.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Hi Oleg,

Thanks for your comments, I'm still not convinced, however.

On 6/26/07, Oleg Nesterov [EMAIL PROTECTED] wrote:
> On 06/26, Satyam Sharma wrote:
> >
> > Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() in kthread_stop() itself?
>
> Personally, I don't think we should do this.
>
> kthread_stop() doesn't always mean kill this thread asap. Suppose that CPU_DOWN does kthread_stop(workqueue->thread) but doesn't flush the queue before that (we did so before 2.6.22 and perhaps we will do again). Now work_struct->func() doing tcp_recvmsg() or wait_event_interruptible() fails, but this is probably not that we want.

[ Well, first of all, anybody who sends a possibly-blocking-forever function like tcp_recvmsg() to a *workqueue* needs to get his head checked. ]

Anyway, I think _all_ usages of kthread_stop() in the kernel *do* want the thread to stop *right then*. After all, kthread_stop() doesn't even return (gets blocked on wait_for_completion()) till it knows the target kthread *has* exited completely.

And if a workqueue is blocked on tcp_recvmsg() or skb_recv_datagram() or some such, I don't see how that flush_workqueue (if that is what you meant) would succeed anyway (unless you do send the signal too), and we'll actually end up having a nice little situation on our hands if we make the mistake of calling flush_workqueue on such a wq.

Note that the exact scenario you're talking about wouldn't mean the kthread getting killed before it's supposed to be stopped anyway. force_sig is not a synchronous wakeup, and also note that tcp_recvmsg() or skb_recv_datagram() etc will exit (and are supposed to exit) cleanly on seeing a signal.

> > So could we have signals in _addition_ to kthread_stop_info and change kthread_should_stop() to check for both:
> >
> > kthread_stop_info.k == current && signal_pending(current)
>
> No, this can't work in general. Some kthreads do flush_signals/dequeue_signal, so TIF_SIGPENDING can be lost anyway.

Yup, I had thought of precisely this issue yesterday as well. The mental note I made to myself was that the force_sig(SIGKILL) and wake_up_process() in kthread_stop() must be atomic so that the following race is not possible:

Say:
#1 - thread that invokes kthread_stop()
#2 - kthread to be stopped, (may be) currently in wait_event_interruptible(), such that there is a bigger loop over the wait_event_interruptible() itself, which puts task back to sleep if this was a spurious wake up (if _not_ due to a signal).

Thread #1                    Thread #2
=========                    =========

                             skb_recv_datagram() ->
                             wait_for_packet()
                             sleeping
...
force_sig(SIGKILL)
scheduled out
                             wakes up, sees the pending signal,
                             breaks out of wait_for_packet()
                             and skb_recv_datagram() back out
                             to our kthread code itself, but
                             there we see that
                             kthread_should_stop() is NOT yet
                             true, we also see this spurious
                             signal, flush it, and call
                             skb_recv_datagram() all over again
                             ...
                             skb_recv_datagram() ->
                             wait_for_packet()
                             sleeping
scheduled in
kthread_stop() ->
wake_up_process()
                             this time we don't even break out
                             of the skb_recv_datagram() either,
                             as no signals are pending any more

i.e. thread #2 still does not exit cleanly. The root of the problem is that functions such as skb_recv_datagram() -> wait_for_packet() handle spurious wakeups *internally* by themselves, so our kthread does not get a chance to check for kthread_should_stop().

Of course, above race is true only for kthreads that do flush signals on seeing spurious ones periodically. If it did not, then skb_recv_datagram() called second time above would again have broken out because of signal_pending() and we wouldn't have gone back to sleep. But we have to be on safer side and avoid races *irrespective* of what the kthread might or might not do, so let's _not_ depend on _assumed kthread behaviour_.

I suspect the above race can be avoided by making force_sig() and wake_up_process() atomic in kthread_stop() itself, please correct me if I'm horribly wrong.

> I personally think Jeff's idea to use force_sig() is right. kthread_create() doesn't use CLONE_SIGHAND, so it is safe to change ->sighand->actionp[].
>
> (offtopic) cifs_mount: send_sig(SIGKILL,srvTcp->tsk,1); tsk =
Re: [PATCH] Re: [2.6.21.1] soft lockup when removing netconsole module
On Wed, 13 Jun 2007 11:25:37 +0200 Jarek Poplawski [EMAIL PROTECTED] wrote:

On Tue, Jun 12, 2007 at 01:02:33PM +0200, Jarek Poplawski wrote:
...

Of course such a problem should preferably be fixed by somebody who knows the code (alas I don't know netconsole), to be sure all needed cancels are still done after this change. I hope Jason's patch is right but I'm a little surprised I can't see netdev in cc (I'll try to fix this).

So, I've had a look into netpoll and, unfortunately, I don't think this patch is right...

From: Jason Wessel [EMAIL PROTECTED]

Do not call cancel_rearming_delayed_work() if there is no pending work.

Signed-off-by: Jason Wessel [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 net/core/netpoll.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff -puN net/core/netpoll.c~a net/core/netpoll.c
--- a/net/core/netpoll.c~a
+++ a/net/core/netpoll.c
@@ -784,8 +784,10 @@ void netpoll_cleanup(struct netpoll *np)
 			if (atomic_dec_and_test(&npinfo->refcnt)) {
 				skb_queue_purge(&npinfo->arp_tx);
 				skb_queue_purge(&npinfo->txq);
-				cancel_rearming_delayed_work(&npinfo->tx_work);
-				flush_scheduled_work();
+				if (delayed_work_pending(&npinfo->tx_work)) {
+					cancel_rearming_delayed_work(&npinfo->tx_work);
+					flush_scheduled_work();
+				}
 
 				kfree(npinfo);
 			}
_

There are such possibilities:

1. After positive delayed_work_pending(&npinfo->tx_work) test some work is queued, but there is no guarantee that when running it'll rearm again, so cancel_rearming_delayed_work can loop again;

2. After negative delayed_work_pending(&npinfo->tx_work) test a work is just running, eg. waiting on netif_tx_lock, while kfree(npinfo) is done here (oops?!).

I've found an additional problem here with or without this patch: after deleting a timer in cancel_rearming_delayed_work() there could stay a last skb queued in npinfo->txq, and after kfree(npinfo) we have small memory leak. If I'm right here similar fix is needed in the current netpoll code: additional npinfo->txq purging only or maybe the whole cancel_rearming_ changed like this.

I've tried to eliminate these problems in attached below patch proposal. I'm not sure it's all right: as I've written earlier I don't know netconsole enough, but it's probably a little better than above solution.

I've some doubts yet (I didn't have time to check this all):

1. I hope this other schedule_delayed_work() from netpoll_send_skb() is not possible when netpoll_cleanup() runs - if I'm wrong additional check of npinfo->refcnt should be done there;

2. I also hope npinfo->refcnt before scheduling should be enough here - if not - another possibility is adding some locking eg.: netif_tx_lock before cancel for synchronization.

Of course it would be very nice if somebody could test or verify this patch more.

Regards,
Jarek P.

Signed-off-by: Jarek Poplawski [EMAIL PROTECTED]

---

diff -Nurp 2.6.21-/net/core/netpoll.c 2.6.21/net/core/netpoll.c
--- 2.6.21-/net/core/netpoll.c	2007-04-26 15:08:32.0 +0200
+++ 2.6.21/net/core/netpoll.c	2007-06-12 21:05:23.0 +0200
@@ -73,7 +73,8 @@ static void queue_process(struct work_st
 			netif_tx_unlock(dev);
 			local_irq_restore(flags);
 
-			schedule_delayed_work(&npinfo->tx_work, HZ/10);
+			if (atomic_read(&npinfo->refcnt))
+				schedule_delayed_work(&npinfo->tx_work, HZ/10);
 			return;
 		}
 		netif_tx_unlock(dev);
@@ -780,9 +781,15 @@ void netpoll_cleanup(struct netpoll *np)
 			if (atomic_dec_and_test(&npinfo->refcnt)) {
 				skb_queue_purge(&npinfo->arp_tx);
 				skb_queue_purge(&npinfo->txq);
-				cancel_rearming_delayed_work(&npinfo->tx_work);
+				cancel_delayed_work(&npinfo->tx_work);
 				flush_scheduled_work();
 
+				/* clean after last, unfinished work */
+				if (!skb_queue_empty(&npinfo->txq)) {
+					struct sk_buff *skb;
+					skb = __skb_dequeue(&npinfo->txq);
+					kfree_skb(skb);
+				}
 				kfree(npinfo);
 			}
 		}

Everything went quiet?

If this patch has been tested and fixes the bug, can you please send a version which is ready for merging? (ie: add a suitable
Re: [PATCH] Re: [2.6.21.1] soft lockup when removing netconsole module
On Tue, 26 Jun 2007 17:46:13 -0700 Wessel, Jason [EMAIL PROTECTED] wrote:

> > } }
> >
> > Everything went quiet?
> >
> > If this patch has been tested and fixes the bug, can you please send a
> > version which is ready for merging? (ie: add a suitable description of
> > what it does).
>
> I mailed Jarek separately. I had tested the patch with netconsole and kgdb
> and it does in fact fix the problem that was reported.

OK, thanks. Please don't mail people separately!

I queued this up with a null changelog for now.
[patch 6/7] CAN: Add maintainer entries - update for offset in current net-2.6.23.git
This patch adds entries in the CREDITS and MAINTAINERS files for CAN.

Signed-Off-By: Oliver Hartkopp [EMAIL PROTECTED]
Signed-Off-By: Urs Thuermann [EMAIL PROTECTED]

---
 CREDITS     |   16 ++++++++++++++++
 MAINTAINERS |    9 +++++++++
 2 files changed, 25 insertions(+)

Index: linux-2.6.22-rc5/CREDITS
===================================================================
--- linux-2.6.22-rc5.orig/CREDITS	2007-06-20 14:10:41.0 +0200
+++ linux-2.6.22-rc5/CREDITS	2007-06-20 14:11:27.0 +0200
@@ -1330,6 +1330,14 @@
 S: 5623 HZ Eindhoven
 S: The Netherlands
 
+N: Oliver Hartkopp
+E: [EMAIL PROTECTED]
+W: http://www.volkswagen.de
+D: Controller Area Network (network layer core)
+S: Brieffach 1776
+S: 38436 Wolfsburg
+S: Germany
+
 N: Andrew Haylett
 E: [EMAIL PROTECTED]
 D: Selection mechanism
@@ -3283,6 +3291,14 @@
 S: F-35042 Rennes Cedex
 S: France
 
+N: Urs Thuermann
+E: [EMAIL PROTECTED]
+W: http://www.volkswagen.de
+D: Controller Area Network (network layer core)
+S: Brieffach 1776
+S: 38436 Wolfsburg
+S: Germany
+
 N: Jon Tombs
 E: [EMAIL PROTECTED]
 W: http://www.esi.us.es/~jon
Index: linux-2.6.22-rc5/MAINTAINERS
===================================================================
--- linux-2.6.22-rc5.orig/MAINTAINERS	2007-06-20 14:10:41.0 +0200
+++ linux-2.6.22-rc5/MAINTAINERS	2007-06-20 14:11:27.0 +0200
@@ -941,6 +941,15 @@
 L:	[EMAIL PROTECTED]
 S:	Maintained
 
+CAN NETWORK LAYER
+P:	Urs Thuermann
+M:	[EMAIL PROTECTED]
+P:	Oliver Hartkopp
+M:	[EMAIL PROTECTED]
+L:	[EMAIL PROTECTED]
+W:	http://developer.berlios.de/projects/socketcan/
+S:	Maintained
+
 CALGARY x86-64 IOMMU
 P:	Muli Ben-Yehuda
 M:	[EMAIL PROTECTED]
Re: Linksys Gigabit USB2.0 adapter (asix) regression
On Mon, 2007-06-25 at 19:05 +0200, Erik Slagter wrote:

> drivers/net/usb/asix.c: PHYID=0x01410cc2

Ok, it is using a Marvell PHY so that part should be fine. You mentioned
that it looks like the packets are being transmitted, but are garbled in
some way. The device does prepend a 'header' to ethernet packets as they
are transmitted down the USB pipe. The device strips this off and puts the
packets on the wire. This could be where the issue lies. Are you on x86 by
chance or something else?

-- 
David Hollis [EMAIL PROTECTED]