Re: ZONE_NORMAL memory exhausted by 4000 TCP sockets
Zhao Xiaoming wrote:
> The latest update: It seems that the Linux kernel memory management mechanisms, including the buddy and slab algorithms, are not very efficient under my test conditions, where the TCP stack requires a lot of packet buffers (hundreds of MB) and releases them very frequently.
> Here is the proof. After changing my kernel configuration to support a 2/2 VM split, LOMEM consumption dropped to 270M bytes, compared with 640M bytes on the 1/3 kernel. All test conditions are the same, and the memory pages allocated by the TCP stack are also the same, 34K ~ 38K pages. In other words, 'lost' memory changed from ~500M to ~130M. Thus, I can do nothing but guess that having many more free pages makes the slab/buddy algorithms more efficient and wastes less memory.

I kind of agree, and always compile for a 2G/2G VM split, as this also seems to affect certain OOM conditions positively. What isn't quite clear, though: why is the 2G/2G VM split not the default?

Thanks!
--
Al
-
To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] [XFRM/IPV6] fix next header offset in decode_session
The way to get the next protocol number of an IPv6 tunnel changes after introducing IP6CB, but I think we should go back to the previous version here. In our case I think there was a confusion between the pointer on the first byte of the next header and the value of the next header field.

Signed-off-by: Andriot Jean-Philippe [EMAIL PROTECTED]

--- xfrm6_policy.c.org	2006-11-07 09:45:47.0 +0100
+++ xfrm6_policy.c	2006-11-07 09:46:19.0 +0100
@@ -255,7 +255,7 @@ _decode_session6(struct sk_buff *skb, st
 	u16 offset = skb->h.raw - skb->nh.raw;
 	struct ipv6hdr *hdr = skb->nh.ipv6h;
 	struct ipv6_opt_hdr *exthdr;
-	u8 nexthdr = skb->nh.raw[IP6CB(skb)->nhoff];
+	u8 nexthdr = skb->nh.ipv6h->nexthdr;
 
 	memset(fl, 0, sizeof(struct flowi));
 	ipv6_addr_copy(&fl->fl6_dst, &hdr->daddr);
Re: [PATCH/RFC] add netpoll support for gianfar
Vitaly Wool wrote:
> The patch inlined below adds NET_POLL_CONTROLLER support for gianfar network driver.

As noted, this patch is out of date. 2.6.19-rc kernels removed the pt_regs argument from all irq handlers.

Jeff
Re: [PATCH 1/3] ethtool: marvell register dump
Stephen Hemminger wrote:
> This is a consolidation of earlier marvell register decode patches to ethtool.
>
> Signed-off-by: Stephen Hemminger [EMAIL PROTECTED]

Applied patch 1 of 3. Patches 2 and 3 are still in the queue, under consideration.
Re: [PATCH] Fixed a number of bugs in the PHY Layer
Andy Fleming wrote:
> * genphy_update_link is now exported
> * Added a fix from [EMAIL PROTECTED] which changes forcing so it only updates the link. Otherwise, it never tries the lower values, since it is always overwriting the speed/duplex values with the current ones, rather than the intended ones.
> * Fixed a bug where bringing up a PHY with no link caused it to timeout, and enter forcing mode. Once in forcing mode, plugging in the link didn't autonegotiate. Now the AN state detects the lack of link, and enters the NO_LINK state. AN only times out if the link is up and AN fails
> * Cleaned up the PHY_AN case, reducing one level of indentation for the timeout code.

applied

Please include a Signed-off-by line in future patches!

Jeff
Re: [patch 4/8] sundance: fix TX Pause bug (reset_tx, intr_handler)
[EMAIL PROTECTED] wrote:
> From: Jesse Huang [EMAIL PROTECTED]
>
> Fix TX Pause bug (reset_tx, intr_handler). When MaxCollisions occurs, Tx needs to be re-enabled. But just after re-enabling, MaxCollisions may occur again, together with TxStatusOverflow. Because of the TxStatusOverflow, the driver cannot then see the new MaxCollisions and re-enable Tx again. For this reason, after re-enabling Tx, we need to make sure Tx was actually enabled.
>
> Signed-off-by: Jesse Huang [EMAIL PROTECTED]
> Signed-off-by: Andrew Morton [EMAIL PROTECTED]

applied
Re: [patch 3/8] sundance: remove TxStartThresh and RxEarlyThresh
[EMAIL PROTECTED] wrote:
> From: Jesse Huang [EMAIL PROTECTED]
>
> Because of a patent issue, TxStartThresh and RxEarlyThresh need to be removed. The patent in question is a cut-through patent. With this function in use, Tx starts to transmit after only a little data has been moved into the Tx FIFO. We are not allowed to use those functions in DFE530/DFE550/DFE580/DL10050/IP100/IP100A. Removing them decreases performance a little.
>
> Signed-off-by: Jesse Huang [EMAIL PROTECTED]
> Signed-off-by: Andrew Morton [EMAIL PROTECTED]

applied
Re: [patch 6/8] sundance: correct initial and close hardware step.
[EMAIL PROTECTED] wrote:
> From: Jesse Huang [EMAIL PROTECTED]
>
> Correct initial and close hardware step. In some embedded systems, bringing the IP100A down and up causes a DMA crash. We add some code to make bringing the IP100A down and up safe.
>
> Signed-off-by: Jesse Huang [EMAIL PROTECTED]
> Signed-off-by: Andrew Morton [EMAIL PROTECTED]

applied
Re: [PATCH] [XFRM/IPV6] fix next header offset in decode_session
In article [EMAIL PROTECTED] (at Tue, 7 Nov 2006 10:30:02 +0100), Jean-Philippe Andriot [EMAIL PROTECTED] says:

> The way to get the next protocol number of an IPv6 tunnel changes after introducing IP6CB, but I think we should go back to the previous version here.
:
>  	struct ipv6_opt_hdr *exthdr;
> -	u8 nexthdr = skb->nh.raw[IP6CB(skb)->nhoff];
> +	u8 nexthdr = skb->nh.ipv6h->nexthdr;
>
>  	memset(fl, 0, sizeof(struct flowi));

I disagree. If you do this, you refer to the first extension header only. We need to skip preceding extension headers using IP6CB(skb)->nhoff, which holds the offset to the current nexthdr.

--yoshfuji
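The objection can be illustrated with a small user-space sketch. It is not the kernel code and does not use IP6CB; the helper name `find_upper_proto` and the raw-buffer layout are assumptions made for illustration, and only hop-by-hop, routing and destination-options headers are walked. The point it shows is that the Next Header byte in the fixed IPv6 header names only the first extension header, so the upper-layer protocol is found only after skipping the chain.

```c
#include <stddef.h>
#include <stdint.h>

/* Next Header values from the IANA protocol-number registry. */
enum { NH_HOPOPTS = 0, NH_TCP = 6, NH_UDP = 17, NH_ROUTING = 43, NH_DSTOPTS = 60 };

#define IPV6_FIXED_HDR_LEN 40

/*
 * Hypothetical helper (not a kernel function): return the protocol number of
 * the first header that is not a hop-by-hop, routing or destination-options
 * header, or -1 if the packet is too short. `pkt` points at the fixed header.
 */
static int find_upper_proto(const uint8_t *pkt, size_t len)
{
	size_t off = IPV6_FIXED_HDR_LEN;
	uint8_t nexthdr;

	if (len < IPV6_FIXED_HDR_LEN)
		return -1;
	nexthdr = pkt[6];	/* Next Header field of the fixed header */

	while (nexthdr == NH_HOPOPTS || nexthdr == NH_ROUTING ||
	       nexthdr == NH_DSTOPTS) {
		size_t ext_len;

		if (len - off < 8)
			return -1;
		/* Byte 1 is the length in 8-octet units, not counting the first 8. */
		ext_len = ((size_t)pkt[off + 1] + 1) * 8;
		if (ext_len > len - off)
			return -1;
		nexthdr = pkt[off];	/* byte 0 names the header that follows */
		off += ext_len;
	}
	return nexthdr;
}
```

For a packet carrying hop-by-hop options, then destination options, then UDP, the fixed header's field reads 0 (hop-by-hop) while the walk ends at 17 (UDP).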
Re: Zero checksum in netconsole/netdump packets
Quoting Chris Lalancette:
| Hello,
| I realized that all of the packets that go from the crashing machine to the netdump server have a zero checksum.
<snip>
| Assuming that this is just an oversight, attached is a simple patch to compute the UDP checksum in netpoll_send_udp.
|
| Signed-off-by: Chris Lalancette [EMAIL PROTECTED]
|
RFC 768 allows to not compute the checksum by leaving uh->check at 0 - hence it is not illegal.

But without David's suggestion the code is not valid, since otherwise there is no way of distinguishing a computed `0' from an ignored `0' field:

	if (udph->check == 0)
		udph->check = -1;
Re: [PATCH/RFC] add netpoll support for gianfar
On Mon, 6 Nov 2006 15:26:33 -0600 Andy Fleming [EMAIL PROTECTED] wrote:

> You are passing extra arguments, here

Oh yes, thanks. I was out of sync here.

> 1) Do we need the disable/enable irq stuff? It seems like we should be able to either just *mask* the interrupts at the controller, or rely on the locks to disable the interrupts.

I don't see how masking the ints at the controller differs much from disable_irq. Locking all the interrupts is definitely worse than disabling selected ones. Also, introducing locks here means that we'll need to handle that specifically for -rt kernels.

> 2) If we are calling gfar_transmit and gfar_receive, shouldn't we call gfar_error?
>
> 3) I think it should be possible to just call gfar_interrupt() in every situation, but I'm not very familiar with net poll's requirements (You can add that into your evaluation of #1, too).

Oh yes, that's a nice idea, thanks.

Vitaly
[PATCH/RFC] add netpoll support for gianfar: respin
The patch inlined below adds NET_POLL_CONTROLLER support for gianfar network driver, slightly modified wrt the comments from Andy Fleming.

 drivers/net/gianfar.c |   33 +
 1 file changed, 33 insertions(+)

Signed-off-by: Vitaly Wool [EMAIL PROTECTED]

Index: powerpc/drivers/net/gianfar.c
===
--- powerpc.orig/drivers/net/gianfar.c
+++ powerpc/drivers/net/gianfar.c
@@ -133,6 +133,9 @@ static void gfar_set_hash_for_addr(struc
 #ifdef CONFIG_GFAR_NAPI
 static int gfar_poll(struct net_device *dev, int *budget);
 #endif
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void gfar_netpoll(struct net_device *dev);
+#endif
 int gfar_clean_rx_ring(struct net_device *dev, int rx_work_limit);
 static int gfar_process_frame(struct net_device *dev, struct sk_buff *skb, int length);
 static void gfar_vlan_rx_register(struct net_device *netdev,
@@ -260,6 +263,9 @@ static int gfar_probe(struct platform_de
 	dev->poll = gfar_poll;
 	dev->weight = GFAR_DEV_WEIGHT;
 #endif
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	dev->poll_controller = gfar_netpoll;
+#endif
 	dev->stop = gfar_close;
 	dev->get_stats = gfar_get_stats;
 	dev->change_mtu = gfar_change_mtu;
@@ -1536,6 +1542,33 @@ static int gfar_poll(struct net_device *
 }
 #endif
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+/*
+ * Polling 'interrupt' - used by things like netconsole to send skbs
+ * without having to re-enable interrupts. It's not called while
+ * the interrupt routine is executing.
+ */
+static void gfar_netpoll(struct net_device *dev)
+{
+	struct gfar_private *priv = netdev_priv(dev);
+
+	/* If the device has multiple interrupts, run tx/rx */
+	if (priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_MULTI_INTR) {
+		disable_irq(priv->interruptTransmit);
+		disable_irq(priv->interruptReceive);
+		disable_irq(priv->interruptError);
+		gfar_interrupt(priv->interruptTransmit, dev);
+		enable_irq(priv->interruptError);
+		enable_irq(priv->interruptReceive);
+		enable_irq(priv->interruptTransmit);
+	} else {
+		disable_irq(priv->interruptTransmit);
+		gfar_interrupt(priv->interruptTransmit, dev);
+		enable_irq(priv->interruptTransmit);
+	}
+}
+#endif
+
 /* The interrupt handler for devices with one interrupt */
 static irqreturn_t gfar_interrupt(int irq, void *dev_id)
 {
Re: [PATCH 2/3] add dev_to_node()
On Mon, Nov 06, 2006 at 10:25:36PM -0800, Ravikiran G Thirumalai wrote:
> On Sun, Nov 05, 2006 at 12:53:23AM +0100, Christoph Hellwig wrote:
> > On Sat, Nov 04, 2006 at 06:06:48PM -0500, Dave Jones wrote:
> > > On Sat, Nov 04, 2006 at 11:56:29PM +0100, Christoph Hellwig wrote:
> > > This will break the compile for !NUMA if someone ends up doing a bisect and lands here as a bisect point. You introduce this nice wrapper..
> >
> > The dev_to_node wrapper is not enough as we can't assign to (-1) for the non-NUMA case. So I added a second macro, set_dev_node for that. The patch below compiles and works on numa and non-NUMA platforms.
>
> Hi Christoph,
> dev_to_node does not work as expected on x86_64 (and i386). This is because the node value returned by pcibus_to_node is initialized after a struct device is created with current x86_64 code. We need the node value initialized before the call to pci_scan_bus_parented, as the generic devices are allocated and initialized off pci_scan_child_bus, which gets called from pci_scan_bus_parented. The following patch does that using pci_sysdata introduced by the PCI domain patches in -mm.

Ah nice, that some non-cell folks actually care for this patch. As far as my x86_64 pci code knowledge is concerned, that patch looks fine to me.
[git patches] net driver fixes
Please pull from 'upstream-linus' branch of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git upstream-linus to receive the following updates:

 drivers/net/b44.c              |    5 +++--
 drivers/net/e1000/e1000_main.c |    7 +++
 2 files changed, 10 insertions(+), 2 deletions(-)

Auke Kok:
      e1000: Fix regression: garbled stats and irq allocation during swsusp

Johannes Berg:
      b44: change comment about irq mask register

diff --git a/drivers/net/b44.c b/drivers/net/b44.c
index 1ec2174..474a4e3 100644
--- a/drivers/net/b44.c
+++ b/drivers/net/b44.c
@@ -908,8 +908,9 @@ static irqreturn_t b44_interrupt(int irq
 	istat = br32(bp, B44_ISTAT);
 	imask = br32(bp, B44_IMASK);
 
-	/* ??? What the fuck is the purpose of the interrupt mask
-	 * ??? register if we have to mask it out by hand anyways?
+	/* The interrupt mask register controls which interrupt bits
+	 * will actually raise an interrupt to the CPU when set by hw/firmware,
+	 * but doesn't mask off the bits.
 	 */
 	istat &= imask;
 	if (istat) {
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 8d04752..726ec5e 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -4800,6 +4800,9 @@ #endif
 	if (adapter->hw.phy_type == e1000_phy_igp_3)
 		e1000_phy_powerdown_workaround(&adapter->hw);
 
+	if (netif_running(netdev))
+		e1000_free_irq(adapter);
+
 	/* Release control of h/w to f/w. If f/w is AMT enabled, this
 	 * would have already happened in close and is redundant. */
 	e1000_release_hw_control(adapter);
@@ -4830,6 +4833,10 @@ e1000_resume(struct pci_dev *pdev)
 	pci_enable_wake(pdev, PCI_D3hot, 0);
 	pci_enable_wake(pdev, PCI_D3cold, 0);
 
+	if (netif_running(netdev) && (err = e1000_request_irq(adapter)))
+		return err;
+
+	e1000_power_up_phy(adapter);
 	e1000_reset(adapter);
 	E1000_WRITE_REG(&adapter->hw, WUS, ~0);
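The new b44 comment describes a common hardware pattern: the mask register decides which status bits raise an interrupt, but the status register still reports the masked-off bits, so the handler ANDs the two by hand (`istat &= imask` in b44.c). A minimal sketch of that step follows; the bit names and the helper `pending_enabled` are hypothetical and do not reflect the real B44 register layout.

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define IRQ_RX  (1u << 0)
#define IRQ_TX  (1u << 1)
#define IRQ_ERR (1u << 2)

/*
 * Return only those status bits the driver has enabled in the mask.
 * Bits set in `istat` but clear in `imask` are still reported by the
 * hardware; they just did not cause the interrupt and are dropped here.
 */
static uint32_t pending_enabled(uint32_t istat, uint32_t imask)
{
	return istat & imask;
}
```

With RX and ERR set in the status and only RX and TX enabled in the mask, the handler would act on RX alone.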
Re: [take21 0/4] kevent: Generic event handling mechanism.
Evgeniy Polyakov wrote:
> Generic event handling mechanism.
>
> Consider for inclusion.
>
> Changes from 'take20' patchset:
>  * new ring buffer implementation
>  * removed artificial limit on possible number of kevents
>
> With this release and fixed userspace web server it was possible to achieve 3960+ req/s with client connection rate of 4000 con/s over 100 Mbit lan, data IO over network was about 10582.7 KB/s, which is too close to wire speed if we take into account headers and the like.

OK, now that ring buffer is here, I definitely like the direction this code is taking. I just committed the patches to a local repo for a good in-depth review.

Could you write up a simple text file, documenting (a) your proposed syscalls and (b) your ring buffer design?

Overall I have a Linux design wish, that I hope kevent can fulfill:

To develop completely async applications (generally network servers, in Linux-land) and increase the chance of zero-copy I/O, network and file I/O submission and completion should be as async as possible.

As such, syscalls themselves have become a serializing bottleneck that isn't strictly necessary. A fully-async application should be able to submit file read, file write, and network write requests asynchronously... in batches. Network reads, and file I/O completions should be received asynchronously, potentially in batches.

Even with epoll and AIO syscalls, Linux isn't quite up to the task.

So to me, the design of the userspace interface that solves this problem is a fundamental issue. My best guess at a solution would be two classes of mmap'd ring buffers, request and response. Let the app allocate one or more. Then have two hooks, (a) kick the kernel to read the request ring, and (b) kick the app when one or more events have arrived on a ring.

But that's just thinking out loud. I welcome any solution that gives userspace a fully-async submission/completion interface for both network and file I/O.

Setting the standard for a good interface here means Linux will kick ass for decades more to come ;-) This is IMO a Big Deal(tm).

Jeff
Re: Zero checksum in netconsole/netdump packets
On Tue, 7 Nov 2006, Gerrit Renker wrote:

> Quoting Chris Lalancette:
> | Hello,
> | I realized that all of the packets that go from the crashing machine to the netdump server have a zero checksum.
> <snip>
> | Assuming that this is just an oversight, attached is a simple patch to compute the UDP checksum in netpoll_send_udp.
> |
> | Signed-off-by: Chris Lalancette [EMAIL PROTECTED]
> |
> RFC 768 allows to not compute the checksum by leaving uh->check at 0 - hence it is not illegal.

BTW: leaving UDP checksum at 0 is only valid for IPv4, with IPv6 we _have to_ compute a checksum.

Best regards,

Krzysztof Olędzki
Re: [PATCH 4/4] skge: version 1.9
The skge 1.9 patch is looking good on older syskonnect fiber cards. Stability issues seem to be taken care of and performance is good. There are some strange interactions with bonding, however. If I try to put both interfaces of an sk-9844 into a bonded interface, I only see traffic from one of them. If I try to config the bonded interface down, the system hangs. If I tcpdump either of the individual interfaces (before bonding them) I see all the expected traffic.

Mike Stone
Re: Zero checksum in netconsole/netdump packets
David Miller wrote:
> From: Chris Lalancette [EMAIL PROTECTED]
> Date: Mon, 06 Nov 2006 18:40:59 -0500
>
> > Assuming that this is just an oversight, attached is a simple patch to compute the UDP checksum in netpoll_send_udp.
>
> If the resulting checksum is zero, you should set it to all 1's, like the real UDP code does.

David,
Ah, thanks. Forgot about that. I re-spun the patch with the change (attached). I also moved the UDP checksum calculation up to where the rest of the UDP header setup is, to make it more consistent. Thanks again for the comments!

Signed-off-by: Chris Lalancette [EMAIL PROTECTED]

--- linux-2.6/net/core/netpoll.c.orig	2006-11-06 18:16:58.0 -0500
+++ linux-2.6/net/core/netpoll.c	2006-11-07 08:16:29.0 -0500
@@ -340,6 +340,12 @@ void netpoll_send_udp(struct netpoll *np
 	udph->dest = htons(np->remote_port);
 	udph->len = htons(udp_len);
 	udph->check = 0;
+	udph->check = csum_tcpudp_magic(htonl(np->local_ip),
+					htonl(np->remote_ip),
+					udp_len, IPPROTO_UDP,
+					csum_partial((unsigned char *)udph, udp_len, 0));
+	if (udph->check == 0)
+		udph->check = -1;
 
 	skb->nh.iph = iph = (struct iphdr *)skb_push(skb, sizeof(*iph));
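For readers without the kernel helpers at hand, the rule discussed in this thread can be sketched in plain user-space C. This is not `csum_tcpudp_magic()` or `csum_partial()`; the names `sum16` and `udp4_checksum` are made up for illustration, and addresses are assumed to be passed in host byte order. It sums the IPv4 pseudo-header and the datagram in one's-complement arithmetic as RFC 768 describes, then replaces a computed zero with all ones so that zero keeps its meaning of "no checksum".

```c
#include <stddef.h>
#include <stdint.h>

/* Add `len` bytes, read as big-endian 16-bit words, to a running sum. */
static uint32_t sum16(uint32_t acc, const uint8_t *p, size_t len)
{
	while (len > 1) {
		acc += (uint32_t)p[0] << 8 | p[1];
		p += 2;
		len -= 2;
	}
	if (len)	/* an odd trailing byte is padded with a zero byte */
		acc += (uint32_t)p[0] << 8;
	return acc;
}

/*
 * Hypothetical helper: checksum of a UDP datagram over IPv4.
 * `udp` covers the UDP header and payload, with the checksum field zeroed.
 * `src` and `dst` are IPv4 addresses in host byte order (an assumption of
 * this sketch). The result is in host byte order.
 */
static uint16_t udp4_checksum(uint32_t src, uint32_t dst,
			      const uint8_t *udp, size_t udp_len)
{
	uint32_t acc = 0;
	uint16_t csum;

	/* Pseudo-header: source, destination, protocol 17, UDP length. */
	acc += (src >> 16) + (src & 0xffff);
	acc += (dst >> 16) + (dst & 0xffff);
	acc += 17;
	acc += (uint32_t)udp_len;

	acc = sum16(acc, udp, udp_len);
	while (acc >> 16)	/* fold the carries back in */
		acc = (acc & 0xffff) + (acc >> 16);

	csum = (uint16_t)~acc;
	return csum ? csum : 0xffff;	/* a computed 0 is sent as all ones */
}
```

The final line is the user-space counterpart of the `if (udph->check == 0) udph->check = -1;` step in the patch above.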
Re: [take21 0/4] kevent: Generic event handling mechanism.
As an aside... This may be useful. Or not.

Al Viro had an interesting idea about kernel-userspace data passing interfaces. He had suggested creating a task-specific filesystem derived from ramfs. Through the normal VFS/VM codepaths, the user can easily create [subject to resource/priv checks] a buffer that is locked into the pagecache. Using mmap, read, write, whatever they prefer. Derive from tmpfs, and the buffers are swappable.

Then it would be a simple matter to associate a file stored in keventfs with a ring buffer guaranteed to be pagecache-friendly.

Heck, that might make zero-copy easier in some cases, too. And using a filesystem would mean that you could do all this without adding syscalls, by using special (poll-able!) files in the filesystem for control and notification purposes.

Jeff
Re: [take21 0/4] kevent: Generic event handling mechanism.
On Tue, Nov 07, 2006 at 06:46:58AM -0500, Jeff Garzik ([EMAIL PROTECTED]) wrote:
> At an aside... This may be useful. Or not.
>
> Al Viro had an interesting idea about kernel-userspace data passing interfaces. He had suggested creating a task-specific filesystem derived from ramfs. Through the normal VFS/VM codepaths, the user can easily create [subject to resource/priv checks] a buffer that is locked into the pagecache. Using mmap, read, write, whatever they prefer. Derive from tmpfs, and the buffers are swappable.

It looks like Al likes filesystems more than any other part of kernel tree...
Existing ring buffer is created in process' memory, so it is swappable too (which is probably the most significant part of this ring buffer version), but in theory kevent file descriptor can be obtained not from the char device, but from special filesystem (well, it was done in that way in first releases but then I was asked to remove such functionality).

> Then it would be a simple matter to associate a file stored in keventfs with a ring buffer guaranteed to be pagecache-friendly.
>
> Heck, that might make zero-copy easier in some cases, too. And using a filesystem would mean that you could do all this without adding syscalls, by using special (poll-able!) files in the filesystem for control and notification purposes.

There are too many ideas about networking zero-copy both sending and receiving, and some of them are even implemented on different layers (starting from special allocator down to splice() with additional single allocation/copy).

> Jeff

--
	Evgeniy Polyakov
Re: [take21 0/4] kevent: Generic event handling mechanism.
On Tue, Nov 07, 2006 at 06:26:09AM -0500, Jeff Garzik ([EMAIL PROTECTED]) wrote:
> Evgeniy Polyakov wrote:
> > Generic event handling mechanism.
> >
> > Consider for inclusion.
> >
> > Changes from 'take20' patchset:
> >  * new ring buffer implementation
> >  * removed artificial limit on possible number of kevents
> >
> > With this release and fixed userspace web server it was possible to achieve 3960+ req/s with client connection rate of 4000 con/s over 100 Mbit lan, data IO over network was about 10582.7 KB/s, which is too close to wire speed if we take into account headers and the like.
>
> OK, now that ring buffer is here, I definitely like the direction this code is taking. I just committed the patches to a local repo for a good in-depth review.

It is the third ring buffer; the fourth one will be in the next release, which should satisfy everyone.

> Could you write up a simple text file, documenting (a) your proposed syscalls and (b) your ring buffer design?

Initial draft about supported syscalls can be found at the documentation page at http://linux-net.osdl.org/index.php/Kevent

Ring buffer background bits pasted below (quotations from blog, do not pay too much attention if sometimes something is not in sync).

New ring buffer is implemented fully in userspace in process' memory, which means that there is no memory pinned, its size can have almost any length, and several threads and processes can access it simultaneously.
There is new system call

	int kevent_ring_init(int ctl_fd, struct ring_buffer *ring, unsigned int num);

which initializes kevent's ring buffer (int ctl_fd is a kevent file descriptor, struct ring_buffer *ring is a userspace allocated ring buffer, and unsigned int num is maximum number of events (struct ukevent) which can be placed into that buffer).
Ring buffer is described with following structure:

	struct kevent_ring
	{
		unsigned int	ring_kidx, ring_uidx;
		struct ukevent	event[0];
	};

where unsigned int ring_kidx, ring_uidx are the last kernel's position (i.e. position which points to the first place after the last kevent put by kernel into the ring buffer) and the last userspace commit position (i.e. position where first unread kevent lives), respectively.

I will release appropriate userspace test application when tests are completed.

When kevent is removed (not dequeued when it is ready, but just removed), even if it was ready, it is not copied into ring buffer, since if it is removed, no one cares about it (otherwise user would wait until it becomes ready and get it the usual way using kevent_get_events() or kevent_wait()) and thus there is no need to copy it to the ring buffer.

Dequeueing of the kevent (calling kevent_get_events()) means that user has processed previously dequeued kevent and is ready to process a new one, which means that the position in the ring buffer previously occupied by that event can be reused by the currently dequeued event. In the world where only one type of syscalls to get events is used (either the usual way and kevent_get_events() or ring buffer and kevent_wait()) it should not be a problem, since kevent_wait() only allows to mark number of events as processed by userspace starting from the beginning (i.e. from the last processed event), but if several threads use different models, that can raise some questions. For example, one thread can start to read events from the ring buffer, and at that time another thread calls kevent_get_events(), which can rewrite those events. Actually the other thread can call kevent_wait() to commit those events (i.e. mark them as processed by userspace so kernel could free them or requeue), so appropriate locking is required in userspace in any case.

So I want to repeat that it is possible, with the userspace ring buffer, that events in the ring buffer are replaced without the knowledge of the thread currently reading them (when another thread calls kevent_get_events() or kevent_wait()), so appropriate locking between threads or processes which can simultaneously access the same ring buffer is required.

Having userspace ring buffer allows glibc to make all kevent syscalls so-called 'cancellation points', i.e. when a thread has been cancelled in a kevent syscall, the thread can be safely removed and no events will be lost, since each syscall will copy the event into a special ring buffer, accessible from other threads or even processes (if shared memory is used).

> Overall I have a Linux design wish, that I hope kevent can fulfill:
>
> To develop completely async applications (generally network servers, in Linux-land) and increase the chance of zero-copy I/O, network and file I/O submission and completion should be as async as possible.
>
> As such, syscalls themselves have come a serializing bottleneck that isn't strictly necessary. A fully-async application should be able to submit file read, file write, and network write requests asynchronously... in batches. Network reads, and file I/O completions should be received
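The two indices described in the message above (ring_kidx as the first free slot after the last event written, ring_uidx as the first event not yet committed) can be modelled in user space. The sketch below is only a model under stated assumptions, not the kevent implementation: the fixed capacity, the modulo wrap-around, the rule of keeping one slot free to tell full from empty, and the names `ring_put`, `ring_pending` and `ring_commit` are all made up for illustration. It has no locking, which the message above says concurrent readers would need.

```c
#include <stdint.h>

#define RING_SLOTS 4u	/* assumed capacity, for illustration only */

struct event { uint32_t id; };

/* Simplified stand-in for struct kevent_ring; not the kernel layout. */
struct ring {
	unsigned int kidx;	/* first free slot after the last event written */
	unsigned int uidx;	/* first event not yet committed by the reader */
	struct event slot[RING_SLOTS];
};

/* Number of events written but not yet committed. */
static unsigned int ring_pending(const struct ring *r)
{
	return (r->kidx + RING_SLOTS - r->uidx) % RING_SLOTS;
}

/* Producer side (the kernel's role): 0 on success, -1 if the ring is full. */
static int ring_put(struct ring *r, struct event ev)
{
	if (ring_pending(r) == RING_SLOTS - 1)	/* one slot is kept free */
		return -1;
	r->slot[r->kidx] = ev;
	r->kidx = (r->kidx + 1) % RING_SLOTS;
	return 0;
}

/* Consumer side: mark up to `n` events as processed; returns how many. */
static unsigned int ring_commit(struct ring *r, unsigned int n)
{
	unsigned int avail = ring_pending(r);

	if (n > avail)
		n = avail;
	r->uidx = (r->uidx + n) % RING_SLOTS;
	return n;
}
```

Committing only ever advances from the oldest unread event, which mirrors the statement that kevent_wait() marks events as processed "starting from the beginning".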
Re: [take22 0/4] kevent: Generic event handling mechanism.
Nate Diller wrote:
> Indecisiveness has certainly been an issue here, but I remember akpm and Ulrich both giving concrete suggestions. I was particularly interested in Andrew's request to explain and justify the differences between kevent and BSD's kqueue interface. Was there a discussion that I missed? I am very interested to see your work on this mechanism merged, because you've clearly emphasized performance and shown impressive results. But it seems like we lose out on a lot by throwing out all the applications that already use kqueue.

kqueue looks pretty nice, the filter/note models in particular. I don't see anything about ring buffers though.

I also wonder about the asynchronous event side (send), not just the event reception side.

Jeff
Re: [take22 0/4] kevent: Generic event handling mechanism.
David Miller wrote:
> From: Pavel Machek [EMAIL PROTECTED]
> Date: Fri, 3 Nov 2006 09:57:12 +0100
>
> > Not sure what you are smoking, but there's unsigned long in *bsd version, lets rewrite it from scratch sounds like very bad idea. What about fixing that one bit you don't like?
>
> I disagree, it's more like since we have to be structure incompatible anyways, let's design something superior if we can.

Definitely agreed.

Jeff
Re: [take21 0/4] kevent: Generic event handling mechanism.
Evgeniy Polyakov wrote:
> Well, kevent network and FS AIO are suspended for now (although first

Why?

IMO, getting async event submission right is important. It should be designed in parallel with async event reception.

Jeff
Re: [take21 0/4] kevent: Generic event handling mechanism.
Evgeniy Polyakov wrote:
> Mmap ring buffer implementation was stopped by Andrew Morton and Ulrich Drepper, process' memory is used instead. copy_to_user() is slower (and sometimes noticeably), but there are major advantages of such approach.

h. I say there are advantages to both.

Perhaps create a kevent_direct_limit resource limit for each thread. By default, each thread could mmap $n pinned pagecache pages. Sysadmin can tune certain app resource limits to permit more.

I would think that retaining the option to avoid copy_to_user() -somehow- in -some- cases would be wise.

Jeff
Re: [take21 0/4] kevent: Generic event handling mechanism.
On Tue, Nov 07, 2006 at 07:17:03AM -0500, Jeff Garzik ([EMAIL PROTECTED]) wrote:
> Evgeniy Polyakov wrote:
> > Well, kevent network and FS AIO are suspended for now (although first
>
> Why?
>
> IMO, getting async event submission right is important. It should be designed in parallel with async event reception.

It was not only designed but also implemented, but... FS AIO was confirmed to have correct design, but there were minor (from my point of view) layering design problems (it was almost suggested that I give myself a lobotomy after I put a get_block() callback into address_space_operations, and there was also some duplication of mpage_readpages() code, done in an async way, in kevent/kevent_aio.c - I did it to keep kevent as separate as possible; both changes can live in fs/ with an appropriate callback export).
Network AIO I postponed for a while, since looking at how hard core changes are processed, it looks like a better decision...
Using Ulrich's DMA allocation API (if it existed not only as a proposal) it would be possible to speed up NAIO a bit more, too.

Kevent based FS AIO patch can be found for example here (it contains full kevent subsystem with network aio and fs aio):
http://tservice.net.ru/~s0mbre/archive/kevent/kevent_full.diff.3

Network aio homepage:
http://tservice.net.ru/~s0mbre/old/?section=projects&item=naio

> Jeff

--
	Evgeniy Polyakov
[PATCH 0/3] NetXen: 1G/10G Ethernet Driver updates
Hi All,

I will be sending NetXen 1G/10G ethernet driver updates in subsequent emails. Kindly review it and feel free to send feedback.

Thanks,
--Amit
Re: [PATCH 1/3] NetXen: Fixed /sys mapping between device and driver
Hi Amit,

one minor nitpick:

You wrote:

diff --git a/drivers/net/netxen/netxen_nic_main.c b/drivers/net/netxen/netxen_nic_main.c
index b54ea16..4effb87 100644
--- a/drivers/net/netxen/netxen_nic_main.c
+++ b/drivers/net/netxen/netxen_nic_main.c
[...]
@@ -1040,7 +1041,7 @@ static int netxen_nic_poll(struct net_de
 		netxen_nic_enable_int(adapter);
 	}

-	return (done ? 0 : 1);
+	return (!done);

	return !done;

Please lose the parentheses here (CodingStyle). Just respin or send this change along with later patchsets.

Regards

Ingo Oeser
[PATCH 3/3] NetXen: 1G/10G Ethernet Driver updates
NetXen: 1G/10G Ethernet Driver updates - Driver cleanup - These fixes take care of driver on machines with 4G memory Signed-off-by: Amit S. Kale [EMAIL PROTECTED] netxen_nic.h | 41 ++ netxen_nic_ethtool.c | 19 ++-- netxen_nic_hdr.h |0 netxen_nic_hw.c | 10 +- netxen_nic_hw.h |4 netxen_nic_init.c | 51 +++- netxen_nic_ioctl.h|0 netxen_nic_isr.c |3 netxen_nic_main.c | 204 +++--- netxen_nic_niu.c |0 netxen_nic_phan_reg.h | 10 +- 11 files changed, 293 insertions(+), 49 deletions(-) diff --git a/drivers/net/netxen/netxen_nic.h b/drivers/net/netxen/netxen_nic.h index d0d9a29..104f60d 100644 --- a/drivers/net/netxen/netxen_nic.h +++ b/drivers/net/netxen/netxen_nic.h @@ -6,12 +6,12 @@ * modify it under the terms of the GNU General Public License * as published by the Free Software Foundation; either version 2 * of the License, or (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, but * WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place - Suite 330, Boston, @@ -90,8 +90,8 @@ #define ADDR_IN_WINDOW1(off) \ * normalize a 64MB crb address to 32MB PCI window * To use NETXEN_CRB_NORMALIZE, window _must_ be set to 1 */ -#define NETXEN_CRB_NORMAL(reg)\ - (reg) - NETXEN_CRB_PCIX_HOST2 + NETXEN_CRB_PCIX_HOST +#define NETXEN_CRB_NORMAL(reg) \ + ((reg) - NETXEN_CRB_PCIX_HOST2 + NETXEN_CRB_PCIX_HOST) #define NETXEN_CRB_NORMALIZE(adapter, reg) \ pci_base_offset(adapter, NETXEN_CRB_NORMAL(reg)) @@ -165,7 +165,7 @@ #define RCV_DESC_TYPE(ID) \ #define MAX_CMD_DESCRIPTORS1024 #define MAX_RCV_DESCRIPTORS32768 -#define MAX_JUMBO_RCV_DESCRIPTORS 1024 +#define MAX_JUMBO_RCV_DESCRIPTORS 4096 #define MAX_RCVSTATUS_DESCRIPTORS MAX_RCV_DESCRIPTORS #define MAX_JUMBO_RCV_DESC MAX_JUMBO_RCV_DESCRIPTORS #define MAX_RCV_DESC MAX_RCV_DESCRIPTORS @@ -593,6 +593,16 @@ struct netxen_skb_frag { u32 length; }; +/* Bounce buffer index */ +struct bounce_index { + /* Index of a buffer */ + unsigned buffer_index; + /* Offset inside the buffer */ + unsigned buffer_offset; +}; + +#define IS_BOUNCE 0xcafebb + /*Following defines are for the state of the buffers*/ #defineNETXEN_BUFFER_FREE 0 #defineNETXEN_BUFFER_BUSY 1 @@ -612,6 +622,8 @@ struct netxen_cmd_buffer { unsigned long time_stamp; u32 state; u32 no_of_descriptors; + u32 tx_bounce_buff; + struct bounce_index bnext; }; /* In rx_buffer, we do not need multiple fragments as is a single buffer */ @@ -620,6 +632,9 @@ struct netxen_rx_buffer { u64 dma; u16 ref_handle; u16 state; + u32 rx_bounce_buff; + struct bounce_index bnext; + char *bounce_ptr; }; /* Board types */ @@ -704,6 +719,7 @@ struct netxen_recv_context { }; #define NETXEN_NIC_MSI_ENABLED 0x02 +#define NETXEN_DMA_MASK0xfffe struct netxen_drvops; @@ -938,9 +954,7 @@ static inline void netxen_nic_disable_in /* * ISR_INT_MASK: Can be read from window 0 or 1. 
*/ - writel(0x7ff, - (void __iomem - *)(PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK))); + writel(0x7ff, PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK)); } @@ -960,14 +974,12 @@ static inline void netxen_nic_enable_int break; } - writel(mask, - (void __iomem - *)(PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK))); + writel(mask, PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK)); if (!(adapter-flags NETXEN_NIC_MSI_ENABLED)) { mask = 0xbff; - writel(mask, (void __iomem *) - (PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_TARGET_MASK))); + writel(mask, PCI_OFFSET_SECOND_RANGE(adapter, +ISR_INT_TARGET_MASK)); } } @@ -1041,6 +1053,9 @@ static inline void get_brd_name_by_type( int netxen_is_flash_supported(struct netxen_adapter *adapter); int netxen_get_flash_mac_addr(struct netxen_adapter *adapter, u64 mac[]); +int netxen_get_next_bounce_buffer(struct bounce_index *head, + struct bounce_index *tail, + struct bounce_index *biret, unsigned len); extern void netxen_change_ringparam(struct netxen_adapter *adapter); extern int netxen_rom_fast_read(struct netxen_adapter
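One change in the patch above wraps the whole body of NETXEN_CRB_NORMAL() in parentheses. A minimal sketch of why that matters follows. The DEMO_* names and the base values are made up for illustration and are not the driver's real constants; the driver's only visible use passes the macro as a function argument, where both forms happen to behave the same.

```c
#include <stdint.h>

/* Hypothetical base addresses, chosen only to make the arithmetic visible. */
#define DEMO_HOST2 0x100u
#define DEMO_HOST  0x040u

/* Old style: the expansion is not wrapped, so neighbouring operators bind into it. */
#define DEMO_NORMAL_UNSAFE(reg) (reg) - DEMO_HOST2 + DEMO_HOST
/* New style, as in the patch: the whole expansion is one parenthesized expression. */
#define DEMO_NORMAL_SAFE(reg)   ((reg) - DEMO_HOST2 + DEMO_HOST)

/* Expands to 2u * (reg) - DEMO_HOST2 + DEMO_HOST: only reg is doubled. */
static uint32_t demo_scaled_unsafe(uint32_t reg)
{
	return 2u * DEMO_NORMAL_UNSAFE(reg);
}

/* Expands to 2u * ((reg) - DEMO_HOST2 + DEMO_HOST): the normalized value is doubled. */
static uint32_t demo_scaled_safe(uint32_t reg)
{
	return 2u * DEMO_NORMAL_SAFE(reg);
}
```

With reg = 0x180 the safe form gives 2 * 0xC0 = 0x180, while the unsafe form gives 0x300 - 0x100 + 0x40 = 0x240.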
[PATCH 3/3] mlsxfrm: Various fixes
Fix the selection of an SA for an outgoing packet to be at the same context as the originating socket/flow. This eliminates the SELinux policy's ability to use/sendto SAs with contexts other than the socket's. With this patch applied, the SELinux policy will require one or more of the following for a socket to be able to communicate with/without SAs: 1. To enable a socket to communicate without using labeled-IPSec SAs: allow socket_t unlabeled_t:association { sendto recvfrom } 2. To enable a socket to communicate with labeled-IPSec SAs: allow socket_t self:association { sendto }; allow socket_t peer_sa_t:association { recvfrom }; Signed-off-by: Venkat Yekkirala [EMAIL PROTECTED] --- include/linux/security.h| 19 - net/xfrm/xfrm_policy.c |3 security/dummy.c|7 - security/selinux/hooks.c| 26 -- security/selinux/include/security.h |2 security/selinux/include/xfrm.h |7 - security/selinux/ss/services.c | 44 +++ security/selinux/xfrm.c | 97 -- 8 files changed, 112 insertions(+), 93 deletions(-) --- net-2.6.xfrm2/include/linux/security.h 2006-10-25 12:26:20.0 -0500 +++ net-2.6/include/linux/security.h2006-11-01 11:22:17.0 -0600 @@ -886,11 +886,6 @@ struct request_sock; * @xp contains the policy to check for a match. * @fl contains the flow to check for a match. * Return 1 if there is a match. - * @xfrm_flow_state_match: - * @fl contains the flow key to match. - * @xfrm points to the xfrm_state to match. - * @xp points to the xfrm_policy to match. - * Return 1 if there is a match. * @xfrm_decode_session: * @skb points to skb to decode. * @secid points to the flow key secid to set. 
@@ -1388,8 +1383,6 @@ struct security_operations { int (*xfrm_policy_lookup)(struct xfrm_policy *xp, u32 fl_secid, u8 dir); int (*xfrm_state_pol_flow_match)(struct xfrm_state *x, struct xfrm_policy *xp, struct flowi *fl); - int (*xfrm_flow_state_match)(struct flowi *fl, struct xfrm_state *xfrm, - struct xfrm_policy *xp); int (*xfrm_decode_session)(struct sk_buff *skb, u32 *secid, int ckall); #endif /* CONFIG_SECURITY_NETWORK_XFRM */ @@ -3186,12 +3179,6 @@ static inline int security_xfrm_state_po return security_ops-xfrm_state_pol_flow_match(x, xp, fl); } -static inline int security_xfrm_flow_state_match(struct flowi *fl, - struct xfrm_state *xfrm, struct xfrm_policy *xp) -{ - return security_ops-xfrm_flow_state_match(fl, xfrm, xp); -} - static inline int security_xfrm_decode_session(struct sk_buff *skb, u32 *secid) { return security_ops-xfrm_decode_session(skb, secid, 1); @@ -3255,12 +3242,6 @@ static inline int security_xfrm_state_po return 1; } -static inline int security_xfrm_flow_state_match(struct flowi *fl, - struct xfrm_state *xfrm, struct xfrm_policy *xp) -{ - return 1; -} - static inline int security_xfrm_decode_session(struct sk_buff *skb, u32 *secid) { return 0; --- net-2.6.xfrm2/net/xfrm/xfrm_policy.c2006-11-01 11:25:39.0 -0600 +++ net-2.6/net/xfrm/xfrm_policy.c 2006-11-01 12:10:23.0 -0600 @@ -1894,7 +1894,8 @@ int xfrm_bundle_ok(struct xfrm_policy *p if (fl !xfrm_selector_match(dst-xfrm-sel, fl, family)) return 0; - if (fl !security_xfrm_flow_state_match(fl, dst-xfrm, pol)) + if (fl pol + !security_xfrm_state_pol_flow_match(dst-xfrm, pol, fl)) return 0; if (dst-xfrm-km.state != XFRM_STATE_VALID) return 0; --- net-2.6.xfrm2/security/dummy.c 2006-10-25 12:23:47.0 -0500 +++ net-2.6/security/dummy.c2006-11-01 11:22:34.0 -0600 @@ -886,12 +886,6 @@ static int dummy_xfrm_state_pol_flow_mat return 1; } -static int dummy_xfrm_flow_state_match(struct flowi *fl, struct xfrm_state *xfrm, - struct xfrm_policy *xp) -{ - return 1; -} - static int 
dummy_xfrm_decode_session(struct sk_buff *skb, u32 *fl, int ckall) { return 0; @@ -1126,7 +1120,6 @@ void security_fixup_ops (struct security set_to_dummy_if_null(ops, xfrm_state_delete_security); set_to_dummy_if_null(ops, xfrm_policy_lookup); set_to_dummy_if_null(ops, xfrm_state_pol_flow_match); - set_to_dummy_if_null(ops, xfrm_flow_state_match); set_to_dummy_if_null(ops, xfrm_decode_session); #endif /* CONFIG_SECURITY_NETWORK_XFRM */ #ifdef CONFIG_KEYS --- net-2.6.xfrm2/security/selinux/include/xfrm.h 2006-11-07 09:49:24.0 -0600 +++ net-2.6/security/selinux/include/xfrm.h 2006-11-07 10:03:20.0 -0600 @@ -19,9 +19,6 @@ int selinux_xfrm_state_delete(struct xfr int
Re: [PATCH 4/4] skge: version 1.9
On Tue, 07 Nov 2006 08:25:07 -0500 Michael Stone [EMAIL PROTECTED] wrote:

The skge 1.9 patch is looking good on older syskonnect fiber cards. Stability issues seem to be taken care of and performance is good. There are some strange interactions with bonding, however. If I try to put both interfaces of an sk-9844 into a bonded interface, I only see traffic from one of them. If I try to config the bonded interface down, the system hangs. If I tcpdump either of the individual interfaces (before bonding them) I see all the expected traffic.

Mike Stone

Which form of bonding link checking are you using? It could be that bonding MII checking is confused.

--
Stephen Hemminger [EMAIL PROTECTED]
Re: [PATCH 2.6.19-rc4-git10][PKT_SCHED] sch_htb: INIT_HLIST_NODE after hlist_del()
On Tue, 7 Nov 2006 07:49:43 +0100 Jarek Poplawski [EMAIL PROTECTED] wrote:

On Mon, Nov 06, 2006 at 09:44:49AM -0800, Stephen Hemminger wrote:

On Mon, 6 Nov 2006 12:33:53 +0100 Jarek Poplawski [EMAIL PROTECTED] wrote:

After hlist_del() next and pprev pointers are not NULL so hlist_unhashed() doesn't work properly.

Signed-off-by: Jarek Poplawski [EMAIL PROTECTED]
---
diff -Nurp linux-2.6.19-rc4-git10-/net/sched/sch_htb.c linux-2.6.19-rc4-git10/net/sched/sch_htb.c
--- linux-2.6.19-rc4-git10-/net/sched/sch_htb.c 2006-11-06 11:42:41.0 +0100
+++ linux-2.6.19-rc4-git10/net/sched/sch_htb.c 2006-11-06 11:53:15.0 +0100
@@ -1284,8 +1284,10 @@ static void htb_destroy_class(struct Qdi
 					struct htb_class, sibling));

 	/* note: this delete may happen twice (see htb_delete) */
-	if (!hlist_unhashed(&cl->hlist))
+	if (!hlist_unhashed(&cl->hlist)) {
 		hlist_del(&cl->hlist);
+		INIT_HLIST_NODE(&cl->hlist);
+	}

why not use hlist_del_init?

Your patch duplicated the code in hlist_del_init(). Why not do:

--- a/net/sched/sch_htb.c 2006-11-07 09:48:22.0 -0800
+++ b/net/sched/sch_htb.c 2006-11-07 09:49:01.0 -0800
@@ -1284,8 +1284,7 @@
 					struct htb_class, sibling));

 	/* note: this delete may happen twice (see htb_delete) */
-	if (!hlist_unhashed(&cl->hlist))
-		hlist_del(&cl->hlist);
+	hlist_del_init(&cl->hlist);
 	list_del(&cl->sibling);

 	if (cl->prio_activity)
@@ -1333,8 +1332,7 @@
 	sch_tree_lock(sch);

 	/* delete from hash and active; remainder in destroy_class */
-	if (!hlist_unhashed(&cl->hlist))
-		hlist_del(&cl->hlist);
+	hlist_del_init(&cl->hlist);

 	if (cl->prio_activity)
 		htb_deactivate(q, cl);
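The point of the exchange above can be reproduced outside the kernel. What follows is a minimal userspace sketch, not the kernel's list.h. The demo_* names are hypothetical, and the poison values merely stand in for the non-NULL markers that a plain delete leaves in the node.

```c
#include <stddef.h>

/* Userspace re-creation of the hlist node layout, for illustration only. */
struct demo_hnode { struct demo_hnode *next, **pprev; };
struct demo_hhead { struct demo_hnode *first; };

/* Arbitrary non-NULL markers standing in for the kernel's list poison values. */
#define DEMO_POISON1 ((struct demo_hnode *)0x100)
#define DEMO_POISON2 ((struct demo_hnode **)0x200)

static void demo_init_node(struct demo_hnode *n) { n->next = NULL; n->pprev = NULL; }

/* A node counts as unhashed only when pprev is NULL. */
static int demo_unhashed(const struct demo_hnode *n) { return !n->pprev; }

static void demo_add_head(struct demo_hnode *n, struct demo_hhead *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

static void demo_unlink(struct demo_hnode *n)
{
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
}

/* Plain delete: unlinks, then poisons, so pprev stays non-NULL. */
static void demo_del(struct demo_hnode *n)
{
	demo_unlink(n);
	n->next = DEMO_POISON1;
	n->pprev = DEMO_POISON2;
}

/* Delete-and-reinit: safe to call twice, leaves the node unhashed. */
static void demo_del_init(struct demo_hnode *n)
{
	if (!demo_unhashed(n)) {
		demo_unlink(n);
		demo_init_node(n);
	}
}

/* Returns what demo_unhashed() reports after a plain delete. */
static int demo_unhashed_after_del(void)
{
	struct demo_hhead h = { NULL };
	struct demo_hnode n;

	demo_init_node(&n);
	demo_add_head(&n, &h);
	demo_del(&n);
	return demo_unhashed(&n);
}

/* Returns what demo_unhashed() reports after two delete-and-reinit calls. */
static int demo_unhashed_after_del_init(void)
{
	struct demo_hhead h = { NULL };
	struct demo_hnode n;

	demo_init_node(&n);
	demo_add_head(&n, &h);
	demo_del_init(&n);
	demo_del_init(&n);	/* second call is a no-op */
	return demo_unhashed(&n);
}
```

After the plain delete the node still looks hashed, so a guard of the form "if not unhashed, delete" would run the unlink a second time on poisoned pointers. The delete-and-reinit variant carries that guard inside itself.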
Re: [PATCH] Rewrite e100_phys_id
Matthew Wilcox wrote: The motivator for this was to fix the sparse warning: drivers/net/e100.c:2418:48: warning: cast truncates bits from constant value (83126e978d4fdf becomes 978d4fdf) drivers/net/e100.c:2419:37: warning: cast truncates bits from constant value (83126e978d4fdf becomes 978d4fdf) Initially, I tried a quick fix, but when it ran into difficulties, I looked at tg3.c to see how it does it. I liked their way better, so I rewrote e100.c to be similar. It shaves ~700 bytes off the size of the driver, and a few bytes off the size of struct nic, so I think it's a win all round. Tested on the internal interface of an HP Integrity rx2600. bad news, it's completely hosed. The adapter does some indistinguishable blinking for a second, then stops blinking alltogether. I might revert the code to the old situation. I guess I should have tested it initially right away. I'm not even going to touch the e1000 patch for now ;) Auke Signed-off-by: Matthew Wilcox [EMAIL PROTECTED] diff --git a/drivers/net/e100.c b/drivers/net/e100.c index a3a08a5..aade1e9 100644 --- a/drivers/net/e100.c +++ b/drivers/net/e100.c @@ -556,7 +556,6 @@ struct nic { struct params params; struct net_device_stats net_stats; struct timer_list watchdog; - struct timer_list blink_timer; struct mii_if_info mii; struct work_struct tx_timeout_task; enum loopback loopback; @@ -581,7 +580,6 @@ struct nic { u32 rx_over_length_errors; u8 rev_id; - u16 leds; u16 eeprom_wc; u16 eeprom[256]; spinlock_t mdio_lock; @@ -2168,23 +2166,6 @@ err_clean_rx: return err; } -#define MII_LED_CONTROL 0x1B -static void e100_blink_led(unsigned long data) -{ - struct nic *nic = (struct nic *)data; - enum led_state { - led_on = 0x01, - led_off= 0x04, - led_on_559 = 0x05, - led_on_557 = 0x07, - }; - - nic-leds = (nic-leds led_on) ? led_off : - (nic-mac mac_82559_D101M) ? 
led_on_557 : led_on_559; - mdio_write(nic-netdev, nic-mii.phy_id, MII_LED_CONTROL, nic-leds); - mod_timer(nic-blink_timer, jiffies + HZ / 4); -} - static int e100_get_settings(struct net_device *netdev, struct ethtool_cmd *cmd) { struct nic *nic = netdev_priv(netdev); @@ -2411,16 +2392,32 @@ static void e100_diag_test(struct net_de msleep_interruptible(4 * 1000); } +#define MII_LED_CONTROL 0x1B static int e100_phys_id(struct net_device *netdev, u32 data) { struct nic *nic = netdev_priv(netdev); + int i; + + enum led_state { + led_off= 0x04, + led_on_559 = 0x05, + led_on_557 = 0x07, + }; + u16 leds = led_off; + + if (data == 0) + data = 2; + + for (i = 0; i (data * 2); i++) { + leds = (leds == led_off) ? + (nic-mac mac_82559_D101M) ? led_on_557 : led_on_559 : + led_off; + mdio_write(nic-netdev, nic-mii.phy_id, MII_LED_CONTROL, leds); + if (msleep_interruptible(500)) + break; + } - if(!data || data (u32)(MAX_SCHEDULE_TIMEOUT / HZ)) - data = (u32)(MAX_SCHEDULE_TIMEOUT / HZ); - mod_timer(nic-blink_timer, jiffies); - msleep_interruptible(data * 1000); - del_timer_sync(nic-blink_timer); - mdio_write(netdev, nic-mii.phy_id, MII_LED_CONTROL, 0); + mdio_write(netdev, nic-mii.phy_id, MII_LED_CONTROL, led_off); return 0; } @@ -2633,9 +2630,6 @@ #endif init_timer(nic-watchdog); nic-watchdog.function = e100_watchdog; nic-watchdog.data = (unsigned long)nic; - init_timer(nic-blink_timer); - nic-blink_timer.function = e100_blink_led; - nic-blink_timer.data = (unsigned long)nic; INIT_WORK(nic-tx_timeout_task, (void (*)(void *))e100_tx_timeout_task, netdev); - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/11] convert d80211 to a proper protocol
I've put a new patchset up at
http://johannes.sipsolutions.net/files/d80211-cleanup/

It now contains:

001-cfg80211-fix-Makefile.patch
002-cfg80211-wext-compat.patch

as before.

003-d80211-reduce-mdev-1.patch
004-d80211-reduce-mdev-2.patch
005-d80211-cleanup-rxmgmt.patch
006-d80211-scan-sanity.patch

similar to before, but modified to apply without the previous cookie patch. I decided to drop the cookie patch because it unnecessarily breaks drivers. We obviously haven't figured out what we want, so let's just go for the lowest common denominator.

I think these are mostly cleanups and it all compiles fine after each one. No API changes. Haven't gotten around to testing it yet.

johannes
Re: [PATCH 0/11] convert d80211 to a proper protocol
Hi,

http://johannes.sipsolutions.net/files/d80211-cleanup/

You might want to fix the rights to the folder again ;)

Ivo
Re: [PATCH] Rewrite e100_phys_id
On Tue, Nov 07, 2006 at 10:33:14AM -0800, Auke Kok wrote: Matthew Wilcox wrote: Tested on the internal interface of an HP Integrity rx2600. bad news, it's completely hosed. The adapter does some indistinguishable blinking for a second, then stops blinking alltogether. Weird. I tested it on the only e100 I have access to, and it worked. I've just reviewed the patch you quoted below, and I don't see what the problem is. I wonder if this is wrong: -mdio_write(netdev, nic-mii.phy_id, MII_LED_CONTROL, 0); +mdio_write(netdev, nic-mii.phy_id, MII_LED_CONTROL, led_off); but everything else seems pretty straight-forward. I might revert the code to the old situation. I guess I should have tested it initially right away. I'm not even going to touch the e1000 patch for now ;) Auke Signed-off-by: Matthew Wilcox [EMAIL PROTECTED] diff --git a/drivers/net/e100.c b/drivers/net/e100.c index a3a08a5..aade1e9 100644 --- a/drivers/net/e100.c +++ b/drivers/net/e100.c @@ -556,7 +556,6 @@ struct nic { struct params params; struct net_device_stats net_stats; struct timer_list watchdog; -struct timer_list blink_timer; struct mii_if_info mii; struct work_struct tx_timeout_task; enum loopback loopback; @@ -581,7 +580,6 @@ struct nic { u32 rx_over_length_errors; u8 rev_id; -u16 leds; u16 eeprom_wc; u16 eeprom[256]; spinlock_t mdio_lock; @@ -2168,23 +2166,6 @@ err_clean_rx: return err; } -#define MII_LED_CONTROL 0x1B -static void e100_blink_led(unsigned long data) -{ -struct nic *nic = (struct nic *)data; -enum led_state { -led_on = 0x01, -led_off= 0x04, -led_on_559 = 0x05, -led_on_557 = 0x07, -}; - -nic-leds = (nic-leds led_on) ? led_off : -(nic-mac mac_82559_D101M) ? 
led_on_557 : led_on_559; -mdio_write(nic-netdev, nic-mii.phy_id, MII_LED_CONTROL, nic-leds); -mod_timer(nic-blink_timer, jiffies + HZ / 4); -} - static int e100_get_settings(struct net_device *netdev, struct ethtool_cmd *cmd) { struct nic *nic = netdev_priv(netdev); @@ -2411,16 +2392,32 @@ static void e100_diag_test(struct net_de msleep_interruptible(4 * 1000); } +#define MII_LED_CONTROL 0x1B static int e100_phys_id(struct net_device *netdev, u32 data) { struct nic *nic = netdev_priv(netdev); +int i; + +enum led_state { +led_off= 0x04, +led_on_559 = 0x05, +led_on_557 = 0x07, +}; +u16 leds = led_off; + +if (data == 0) +data = 2; + +for (i = 0; i (data * 2); i++) { +leds = (leds == led_off) ? +(nic-mac mac_82559_D101M) ? led_on_557 : led_on_559 : +led_off; +mdio_write(nic-netdev, nic-mii.phy_id, MII_LED_CONTROL, leds); +if (msleep_interruptible(500)) +break; +} -if(!data || data (u32)(MAX_SCHEDULE_TIMEOUT / HZ)) -data = (u32)(MAX_SCHEDULE_TIMEOUT / HZ); -mod_timer(nic-blink_timer, jiffies); -msleep_interruptible(data * 1000); -del_timer_sync(nic-blink_timer); -mdio_write(netdev, nic-mii.phy_id, MII_LED_CONTROL, 0); +mdio_write(netdev, nic-mii.phy_id, MII_LED_CONTROL, led_off); return 0; } @@ -2633,9 +2630,6 @@ #endif init_timer(nic-watchdog); nic-watchdog.function = e100_watchdog; nic-watchdog.data = (unsigned long)nic; -init_timer(nic-blink_timer); -nic-blink_timer.function = e100_blink_led; -nic-blink_timer.data = (unsigned long)nic; INIT_WORK(nic-tx_timeout_task, (void (*)(void *))e100_tx_timeout_task, netdev); - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
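The blink loop in the quoted patch can be replayed in userspace to see which values it would write to the LED control register. This is only a sketch under assumptions: the MAC comparison in the archived diff has lost its operator, so the older_than_82559 flag below is a stand-in for it, and demo_led_written_at() is a hypothetical helper, not driver code.

```c
enum led_state {
	led_off    = 0x04,
	led_on_559 = 0x05,
	led_on_557 = 0x07,
};

/*
 * Replays the loop from the rewritten e100_phys_id() and returns the value
 * written at step `want` (0-based), or 0 if the loop ends before that step.
 * Each step corresponds to one register write followed by a 500 ms sleep.
 */
static unsigned short demo_led_written_at(unsigned int data,
					  int older_than_82559,
					  unsigned int want)
{
	unsigned short leds = led_off;
	unsigned int i;

	if (data == 0)
		data = 2;	/* same default as the patch */

	for (i = 0; i < data * 2; i++) {
		/* Toggle: off -> chip-specific "on" value, anything else -> off. */
		leds = (leds == led_off) ?
			(older_than_82559 ? led_on_557 : led_on_559) : led_off;
		if (i == want)
			return leds;
	}
	return 0;
}
```

In this model every even step writes an "on" value and every odd step writes led_off, so data seconds yield data on/off pairs and the final write inside the loop is already led_off.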
Re: [PATCH 4/4] skge: version 1.9
On Tue, Nov 07, 2006 at 09:51:04AM -0800, Stephen Hemminger wrote:

Which form of bonding link checking are you using? It could be that bonding MII checking is confused.

I'm not specifying anything, just

ifenslave bond0 eth2 eth3

Mike Stone
Re: [PATCH 1/3] mlsxfrm: Various fixes
On Tue, 2006-11-07 at 11:17 -0600, Venkat Yekkirala wrote:

 int selinux_xfrm_policy_alloc(struct xfrm_policy *xp,
-		struct xfrm_user_sec_ctx *uctx, struct sock *sk)
+		struct xfrm_user_sec_ctx *uctx)
 {
 	int err;
-	u32 sid;

-	BUG_ON(!xp);
-	BUG_ON(uctx && sk);
-
-	if (sk) {
-		struct sk_security_struct *ssec = sk->sk_security;
-		sid = ssec->sid;
-	}
-	else
-		sid = SECSID_NULL;
+	BUG_ON(!xp || !uctx);

-	err = selinux_xfrm_sec_ctx_alloc(&xp->security, uctx, NULL, sid);
+	err = selinux_xfrm_sec_ctx_alloc(&xp->security, uctx, 0);
 	return err;
 }

BUG_ON() with an || makes this a slight bit trickier to debug if something goes wrong. I'd have to dig around a little in the assembly and look at the registers in the back trace to know which of the 2 was the problem. I personally would rather have a separate

	BUG_ON(!xp);
	BUG_ON(!uctx);

probably not worth resubmitting, but if you have to make another set of these

-Eric
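Eric's debugging argument can be illustrated with a small userspace sketch. DEMO_BUG_ON() below is a hypothetical stand-in that records the source line of the first failing check instead of crashing; it is not the kernel macro, and all names are invented for the example.

```c
#include <stddef.h>

/* Line number of the first check that fired; 0 means no check fired. */
static int demo_failed_line;

/* Hypothetical stand-in for BUG_ON(): remember where the first failure was. */
#define DEMO_BUG_ON(cond) \
	do { \
		if ((cond) && !demo_failed_line) \
			demo_failed_line = __LINE__; \
	} while (0)

/* A dummy object so callers have a valid non-NULL pointer to pass. */
static int demo_obj;

/* One combined check: the report is the same whichever pointer was NULL. */
static int demo_combined(const void *xp, const void *uctx)
{
	demo_failed_line = 0;
	DEMO_BUG_ON(!xp || !uctx);
	return demo_failed_line;
}

/* Two separate checks: the reported line identifies the offending pointer. */
static int demo_separate(const void *xp, const void *uctx)
{
	demo_failed_line = 0;
	DEMO_BUG_ON(!xp);
	DEMO_BUG_ON(!uctx);
	return demo_failed_line;
}
```

With the combined form, a NULL xp and a NULL uctx produce the same line number. With the split form the two cases report different lines, which is the information a back trace would otherwise have to be mined for.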
Re: [PATCH 4/4] skge: version 1.9
On Tue, 07 Nov 2006 13:58:31 -0500 Michael Stone [EMAIL PROTECTED] wrote:

On Tue, Nov 07, 2006 at 09:51:04AM -0800, Stephen Hemminger wrote:

Which form of bonding link checking are you using? It could be that bonding MII checking is confused.

I'm not specifying anything, just

ifenslave bond0 eth2 eth3

Mike Stone

Do both ports report carrier present?

ethtool eth2
ethtool eth3

--
Stephen Hemminger [EMAIL PROTECTED]
Fw: 2.6.19-rc1: Volanomark slowdown
Begin forwarded message:

Date: Tue, 07 Nov 2006 10:32:34 -0800
From: Tim Chen [EMAIL PROTECTED]
Newsgroups: linux.dev.kernel
Subject: 2.6.19-rc1: Volanomark slowdown

The patch [TCP]: Send ACKs each 2nd received segment
commit: 1ef9696c909060ccdae3ade245ca88692b49285b
http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1ef9696c909060ccdae3ade245ca88692b49285b

reduced Volanomark benchmark throughput by 10%. This is because Volanomark sends short messages (100 bytes) on its TCP connections. This patch increases the amount of ACK traffic by 3.5 times.

By adopting this patch, we assume that with small segments, having a short delay is important enough that we are willing to reduce bandwidth with more ACKs. Is there any real application out there for which this new behavior could be a concern?

Tim

- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

--
Stephen Hemminger [EMAIL PROTECTED]
Re: [PATCH 4/4] skge: version 1.9
On Tue, Nov 07, 2006 at 11:18:07AM -0800, Stephen Hemminger wrote:

Do both ports report carrier present?

ethtool eth2
ethtool eth3

Link detected? yes

Mike Stone
[take23 2/5] kevent: Core files.
Core files. This patch includes core kevent files: * userspace controlling * kernelspace interfaces * initialization * notification state machines Some bits of documentation can be found on project's homepage (and links from there): http://tservice.net.ru/~s0mbre/old/?section=projectsitem=kevent Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED] diff --git a/arch/i386/kernel/syscall_table.S b/arch/i386/kernel/syscall_table.S index 7e639f7..fa8075b 100644 --- a/arch/i386/kernel/syscall_table.S +++ b/arch/i386/kernel/syscall_table.S @@ -318,3 +318,7 @@ ENTRY(sys_call_table) .long sys_vmsplice .long sys_move_pages .long sys_getcpu + .long sys_kevent_get_events + .long sys_kevent_ctl/* 320 */ + .long sys_kevent_wait + .long sys_kevent_ring_init diff --git a/arch/x86_64/ia32/ia32entry.S b/arch/x86_64/ia32/ia32entry.S index b4aa875..95fb252 100644 --- a/arch/x86_64/ia32/ia32entry.S +++ b/arch/x86_64/ia32/ia32entry.S @@ -714,8 +714,12 @@ #endif .quad compat_sys_get_robust_list .quad sys_splice .quad sys_sync_file_range - .quad sys_tee + .quad sys_tee /* 315 */ .quad compat_sys_vmsplice .quad compat_sys_move_pages .quad sys_getcpu + .quad sys_kevent_get_events + .quad sys_kevent_ctl/* 320 */ + .quad sys_kevent_wait + .quad sys_kevent_ring_init ia32_syscall_end: diff --git a/include/asm-i386/unistd.h b/include/asm-i386/unistd.h index bd99870..2161ef2 100644 --- a/include/asm-i386/unistd.h +++ b/include/asm-i386/unistd.h @@ -324,10 +324,14 @@ #define __NR_tee 315 #define __NR_vmsplice 316 #define __NR_move_pages317 #define __NR_getcpu318 +#define __NR_kevent_get_events 319 +#define __NR_kevent_ctl320 +#define __NR_kevent_wait 321 +#define __NR_kevent_ring_init 322 #ifdef __KERNEL__ -#define NR_syscalls 319 +#define NR_syscalls 323 #include linux/err.h /* diff --git a/include/asm-x86_64/unistd.h b/include/asm-x86_64/unistd.h index 6137146..3669c0f 100644 --- a/include/asm-x86_64/unistd.h +++ b/include/asm-x86_64/unistd.h @@ -619,10 +619,18 @@ #define __NR_vmsplice 278 
__SYSCALL(__NR_vmsplice, sys_vmsplice) #define __NR_move_pages279 __SYSCALL(__NR_move_pages, sys_move_pages) +#define __NR_kevent_get_events 280 +__SYSCALL(__NR_kevent_get_events, sys_kevent_get_events) +#define __NR_kevent_ctl281 +__SYSCALL(__NR_kevent_ctl, sys_kevent_ctl) +#define __NR_kevent_wait 282 +__SYSCALL(__NR_kevent_wait, sys_kevent_wait) +#define __NR_kevent_ring_init 283 +__SYSCALL(__NR_kevent_ring_init, sys_kevent_ring_init) #ifdef __KERNEL__ -#define __NR_syscall_max __NR_move_pages +#define __NR_syscall_max __NR_kevent_ring_init #include linux/err.h #ifndef __NO_STUBS diff --git a/include/linux/kevent.h b/include/linux/kevent.h new file mode 100644 index 000..781ffa8 --- /dev/null +++ b/include/linux/kevent.h @@ -0,0 +1,201 @@ +/* + * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED] + * All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#ifndef __KEVENT_H +#define __KEVENT_H +#include linux/types.h +#include linux/list.h +#include linux/rbtree.h +#include linux/spinlock.h +#include linux/mutex.h +#include linux/wait.h +#include linux/net.h +#include linux/rcupdate.h +#include linux/kevent_storage.h +#include linux/ukevent.h + +#define KEVENT_MIN_BUFFS_ALLOC 3 + +struct kevent; +struct kevent_storage; +typedef int (* kevent_callback_t)(struct kevent *); + +/* @callback is called each time new event has been caught. */ +/* @enqueue is called each time new event is queued. */ +/* @dequeue is called each time event is dequeued. */ + +struct kevent_callbacks { + kevent_callback_t callback, enqueue, dequeue; +}; + +#define KEVENT_READY 0x1 +#define KEVENT_STORAGE 0x2 +#define KEVENT_USER0x4 + +struct kevent +{ + /* Used for kevent freeing.*/ + struct rcu_head rcu_head; + struct ukevent event; + /* This lock protects ukevent manipulations, e.g. ret_flags changes. */ + spinlock_t ulock; + + /* Entry
[take23 1/5] kevent: Description.
Description.

int kevent_ctl(int fd, unsigned int cmd, unsigned int num, struct ukevent *arg);

fd - is the file descriptor referring to the kevent queue to manipulate.
It is created by opening the /dev/kevent char device, which is created with a dynamic minor number and the major number assigned for misc devices.

cmd - is the requested operation. It can be one of the following:
    KEVENT_CTL_ADD - add event notification
    KEVENT_CTL_REMOVE - remove event notification
    KEVENT_CTL_MODIFY - modify existing notification

num - number of struct ukevent in the array pointed to by arg
arg - array of struct ukevent

When called, kevent_ctl will carry out the operation specified in the cmd parameter.
-------------------------------------------------------------------------------

int kevent_get_events(int ctl_fd, unsigned int min_nr, unsigned int max_nr,
		__u64 timeout, struct ukevent *buf, unsigned flags)

ctl_fd - file descriptor referring to the kevent queue
min_nr - minimum number of completed events that kevent_get_events will block waiting for
max_nr - number of struct ukevent in buf
timeout - number of nanoseconds to wait before returning less than min_nr events. If this is -1, then wait forever.
buf - pointer to an array of struct ukevent.
flags - unused

kevent_get_events will wait timeout nanoseconds for at least min_nr completed events, copying completed struct ukevents to buf and deleting any KEVENT_REQ_ONESHOT event requests. In nonblocking mode it returns as many events as possible, but not more than max_nr. In blocking mode it waits until the timeout expires or at least min_nr events are ready.
-------------------------------------------------------------------------------

int kevent_wait(int ctl_fd, unsigned int num, __u64 timeout)

ctl_fd - file descriptor referring to the kevent queue
num - number of processed kevents
timeout - this timeout specifies the number of nanoseconds to wait until there is free space in the kevent queue

This syscall waits until either the timeout expires or at least one event becomes ready. It also copies those num events into a special ring buffer and requeues them (or removes them, depending on flags).
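The return rule that the description gives for kevent_get_events() (min_nr, max_nr, timeout, blocking versus nonblocking) can be written down as a toy model. This is a sketch of the documented behaviour only, not the kernel implementation; the function names and the nonblocking and timed_out flags are hypothetical.

```c
/*
 * Toy model of when kevent_get_events() would stop waiting.
 * ready:       completed events currently queued
 * min_nr:      minimum number the caller is willing to block for
 * nonblocking: hypothetical flag, non-zero if the queue is in nonblocking mode
 * timed_out:   hypothetical flag, non-zero once the timeout has expired
 */
static int demo_should_return(unsigned int ready, unsigned int min_nr,
			      int nonblocking, int timed_out)
{
	return nonblocking || timed_out || ready >= min_nr;
}

/* Number of events copied to buf: everything that is ready, capped by max_nr. */
static unsigned int demo_nr_copied(unsigned int ready, unsigned int max_nr)
{
	return ready < max_nr ? ready : max_nr;
}
```

In this model a blocking caller with min_nr = 4 keeps waiting while only one event is ready, and a caller whose buffer holds 8 entries receives at most 8 events even when 10 are ready.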
- int kevent_ring_init(int ctl_fd, struct kevent_ring *ring, unsigned int num) ctl_fd - file descriptor referring to the kevent queue num - size of the ring buffer in events struct kevent_ring { unsigned int ring_kidx; struct ukevent event[0]; } ring_kidx - is an index in the ring buffer where kernel will put new events when kevent_wait() or kevent_get_events() is called Example userspace code (ring_buffer.c) can be found on project's homepage. Each kevent syscall can be so called cancellation point in glibc, i.e. when thread has been cancelled in kevent syscall, thread can be safely removed and no events will be lost, since each syscall (kevent_wait() or kevent_get_events()) will copy event into special ring buffer, accessible from other threads or even processes (if shared memory is used). When kevent is removed (not dequeued when it is ready, but just removed), even if it was ready, it is not copied into ring buffer, since if it is removed, no one cares about it (otherwise user would wait until it becomes ready and got it through usual way using kevent_get_events() or kevent_wait()) and thus no need to copy it to the ring buffer. It is possible with userspace ring buffer, that events in the ring buffer can be replaced without knowledge for the thread currently reading them (when other thread calls kevent_get_events() or kevent_wait()), so appropriate locking between threads or processes, which can simultaneously access the same ring buffer, is required. - The bulk of the interface is entirely done through the ukevent struct. It is used to add event requests, modify existing event requests, specify which event requests to remove, and return completed events. struct ukevent contains the following members: struct kevent_id id Id of this request, e.g. socket number, file descriptor and so on __u32 type Event type, e.g. KEVENT_SOCK, KEVENT_INODE, KEVENT_TIMER and so on __u32 event Event itself, e.g. 
SOCK_ACCEPT, INODE_CREATED, TIMER_FIRED
__u32 req_flags
    Per-event request flags,
    KEVENT_REQ_ONESHOT
        event will be removed when it is ready
    KEVENT_REQ_WAKEUP_ONE
        When several threads wait on the same kevent queue and have requested the same event, for example 'wake me up when new client has connected, so I could call accept()', then all threads will be awakened when a new client has connected, but only one of them can process the data. This problem is known as the thundering herd problem. Events which have this flag set will not be marked as ready (and appropriate
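The ring buffer described above can be pictured with a small userspace sketch. It is only an illustration under stated assumptions: the struct names, the fixed RING_SLOTS size, the reader-side index uidx and the helper ring_drain() are all hypothetical and are not part of the kevent patchset, and the sketch assumes that ring_kidx wraps around at the ring size. The caller is expected to hold whatever lock the application uses to protect the ring, as the description requires.

```c
#include <assert.h>

#define RING_SLOTS 8   /* hypothetical ring size, i.e. the "num" given at init */

/* Cut-down stand-in for struct ukevent; only what the sketch needs. */
struct ukevent_sketch {
	unsigned int type;
	unsigned int event;
};

/* Same shape as the documented struct kevent_ring, with a fixed array
 * instead of event[0] so the sketch builds without kernel headers. */
struct kevent_ring_sketch {
	unsigned int ring_kidx;                    /* next slot the kernel fills */
	struct ukevent_sketch event[RING_SLOTS];
};

/*
 * Copy events from *uidx up to (not including) ring_kidx into out,
 * wrapping at RING_SLOTS. *uidx is a reader-side index kept by the
 * application; it is an assumption of this sketch, not part of the API.
 * Returns the number of events copied, at most max.
 */
static unsigned int ring_drain(const struct kevent_ring_sketch *ring,
			       unsigned int *uidx,
			       struct ukevent_sketch *out, unsigned int max)
{
	unsigned int n = 0;

	assert(*uidx < RING_SLOTS);
	while (*uidx != ring->ring_kidx && n < max) {
		out[n++] = ring->event[*uidx];
		*uidx = (*uidx + 1) % RING_SLOTS;
	}
	return n;
}
```

With the reader at slot 6 and ring_kidx at 2, the helper copies slots 6, 7, 0 and 1 in that order and leaves the reader index at 2.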
[take23 3/5] kevent: poll/select() notifications.
poll/select() notifications.

This patch includes generic poll/select notifications. kevent_poll works similarly to epoll and has the same issues (the callback is invoked not from the internal state machine of the caller but through process wakeup, there are a lot of allocations, and so on).

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/include/linux/fs.h b/include/linux/fs.h index 5baf3a1..f81299f 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -276,6 +276,7 @@ #include linux/prio_tree.h #include linux/init.h #include linux/sched.h #include linux/mutex.h +#include linux/kevent.h #include asm/atomic.h #include asm/semaphore.h @@ -586,6 +587,10 @@ #ifdef CONFIG_INOTIFY struct mutexinotify_mutex; /* protects the watches list */ #endif +#ifdef CONFIG_KEVENT_SOCKET + struct kevent_storage st; +#endif + unsigned long i_state; unsigned long dirtied_when; /* jiffies of first dirtying */ @@ -739,6 +744,9 @@ #ifdef CONFIG_EPOLL struct list_headf_ep_links; spinlock_t f_ep_lock; #endif /* #ifdef CONFIG_EPOLL */ +#ifdef CONFIG_KEVENT_POLL + struct kevent_storage st; +#endif struct address_space*f_mapping; }; extern spinlock_t files_lock; diff --git a/kernel/kevent/kevent_poll.c b/kernel/kevent/kevent_poll.c new file mode 100644 index 000..94facbb --- /dev/null +++ b/kernel/kevent/kevent_poll.c @@ -0,0 +1,222 @@ +/* + * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED] + * All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ */ + +#include linux/kernel.h +#include linux/types.h +#include linux/list.h +#include linux/slab.h +#include linux/spinlock.h +#include linux/timer.h +#include linux/file.h +#include linux/kevent.h +#include linux/poll.h +#include linux/fs.h + +static kmem_cache_t *kevent_poll_container_cache; +static kmem_cache_t *kevent_poll_priv_cache; + +struct kevent_poll_ctl +{ + struct poll_table_structpt; + struct kevent *k; +}; + +struct kevent_poll_wait_container +{ + struct list_headcontainer_entry; + wait_queue_head_t *whead; + wait_queue_twait; + struct kevent *k; +}; + +struct kevent_poll_private +{ + struct list_headcontainer_list; + spinlock_t container_lock; +}; + +static int kevent_poll_enqueue(struct kevent *k); +static int kevent_poll_dequeue(struct kevent *k); +static int kevent_poll_callback(struct kevent *k); + +static int kevent_poll_wait_callback(wait_queue_t *wait, + unsigned mode, int sync, void *key) +{ + struct kevent_poll_wait_container *cont = + container_of(wait, struct kevent_poll_wait_container, wait); + struct kevent *k = cont-k; + struct file *file = k-st-origin; + u32 revents; + + revents = file-f_op-poll(file, NULL); + + kevent_storage_ready(k-st, NULL, revents); + + return 0; +} + +static void kevent_poll_qproc(struct file *file, wait_queue_head_t *whead, + struct poll_table_struct *poll_table) +{ + struct kevent *k = + container_of(poll_table, struct kevent_poll_ctl, pt)-k; + struct kevent_poll_private *priv = k-priv; + struct kevent_poll_wait_container *cont; + unsigned long flags; + + cont = kmem_cache_alloc(kevent_poll_container_cache, SLAB_KERNEL); + if (!cont) { + kevent_break(k); + return; + } + + cont-k = k; + init_waitqueue_func_entry(cont-wait, kevent_poll_wait_callback); + cont-whead = whead; + + spin_lock_irqsave(priv-container_lock, flags); + list_add_tail(cont-container_entry, priv-container_list); + spin_unlock_irqrestore(priv-container_lock, flags); + + add_wait_queue(whead, cont-wait); +} + +static int 
kevent_poll_enqueue(struct kevent *k) +{ + struct file *file; + int err, ready = 0; + unsigned int revents; + struct kevent_poll_ctl ctl; + struct kevent_poll_private *priv; + + file = fget(k-event.id.raw[0]); + if (!file) + return -EBADF; + + err = -EINVAL; + if (!file-f_op || !file-f_op-poll) + goto err_out_fput; + + err = -ENOMEM; + priv = kmem_cache_alloc(kevent_poll_priv_cache, SLAB_KERNEL); + if (!priv) + goto err_out_fput; + +
[take23 4/5] kevent: Socket notifications.
Socket notifications.

This patch includes socket send/recv/accept notifications. With a trivial web server based on kevent and this feature instead of epoll, its performance increased more than noticeably. More details about various benchmarks and the server itself (evserver_kevent.c) can be found on the project's homepage.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]

diff --git a/fs/inode.c b/fs/inode.c index ada7643..ff1b129 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -21,6 +21,7 @@ #include linux/pagemap.h #include linux/cdev.h #include linux/bootmem.h #include linux/inotify.h +#include linux/kevent.h #include linux/mount.h /* @@ -164,12 +165,18 @@ #endif } inode-i_private = 0; inode-i_mapping = mapping; +#if defined CONFIG_KEVENT_SOCKET + kevent_storage_init(inode, inode-st); +#endif } return inode; } void destroy_inode(struct inode *inode) { +#if defined CONFIG_KEVENT_SOCKET + kevent_storage_fini(inode-st); +#endif BUG_ON(inode_has_buffers(inode)); security_inode_free(inode); if (inode-i_sb-s_op-destroy_inode) diff --git a/include/net/sock.h b/include/net/sock.h index edd4d73..d48ded8 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -48,6 +48,7 @@ #include linux/lockdep.h #include linux/netdevice.h #include linux/skbuff.h /* struct sk_buff */ #include linux/security.h +#include linux/kevent.h #include linux/filter.h @@ -450,6 +451,21 @@ static inline int sk_stream_memory_free( extern void sk_stream_rfree(struct sk_buff *skb); +struct socket_alloc { + struct socket socket; + struct inode vfs_inode; +}; + +static inline struct socket *SOCKET_I(struct inode *inode) +{ + return container_of(inode, struct socket_alloc, vfs_inode)-socket; +} + +static inline struct inode *SOCK_INODE(struct socket *socket) +{ + return container_of(socket, struct socket_alloc, socket)-vfs_inode; +} + static inline void sk_stream_set_owner_r(struct sk_buff *skb, struct sock *sk) { skb-sk = sk; @@ -477,6 +493,7 @@ static inline void sk_add_backlog(struct sk-sk_backlog.tail = skb; 
} skb-next = NULL; + kevent_socket_notify(sk, KEVENT_SOCKET_RECV); } #define sk_wait_event(__sk, __timeo, __condition) \ @@ -679,21 +696,6 @@ static inline struct kiocb *siocb_to_kio return si-kiocb; } -struct socket_alloc { - struct socket socket; - struct inode vfs_inode; -}; - -static inline struct socket *SOCKET_I(struct inode *inode) -{ - return container_of(inode, struct socket_alloc, vfs_inode)-socket; -} - -static inline struct inode *SOCK_INODE(struct socket *socket) -{ - return container_of(socket, struct socket_alloc, socket)-vfs_inode; -} - extern void __sk_stream_mem_reclaim(struct sock *sk); extern int sk_stream_mem_schedule(struct sock *sk, int size, int kind); diff --git a/include/net/tcp.h b/include/net/tcp.h index 7a093d0..69f4ad2 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -857,6 +857,7 @@ static inline int tcp_prequeue(struct so tp-ucopy.memory = 0; } else if (skb_queue_len(tp-ucopy.prequeue) == 1) { wake_up_interruptible(sk-sk_sleep); + kevent_socket_notify(sk, KEVENT_SOCKET_RECV|KEVENT_SOCKET_SEND); if (!inet_csk_ack_scheduled(sk)) inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK, (3 * TCP_RTO_MIN) / 4, diff --git a/kernel/kevent/kevent_socket.c b/kernel/kevent/kevent_socket.c new file mode 100644 index 000..7f74110 --- /dev/null +++ b/kernel/kevent/kevent_socket.c @@ -0,0 +1,135 @@ +/* + * kevent_socket.c + * + * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED] + * All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include linux/kernel.h +#include linux/types.h +#include linux/list.h +#include linux/slab.h +#include linux/spinlock.h +#include linux/timer.h +#include linux/file.h +#include linux/tcp.h +#include linux/kevent.h + +#include net/sock.h +#include net/request_sock.h +#include net/inet_connection_sock.h + +static int
[take23 5/5] kevent: Timer notifications.
Timer notifications. Timer notifications can be used for fine grained per-process time management, since interval timers are very inconvenient to use, and they are limited. This subsystem uses high-resolution timers. id.raw[0] is used as number of seconds id.raw[1] is used as number of nanoseconds Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED] diff --git a/kernel/kevent/kevent_timer.c b/kernel/kevent/kevent_timer.c new file mode 100644 index 000..df93049 --- /dev/null +++ b/kernel/kevent/kevent_timer.c @@ -0,0 +1,112 @@ +/* + * 2006 Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED] + * All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ + +#include linux/kernel.h +#include linux/types.h +#include linux/list.h +#include linux/slab.h +#include linux/spinlock.h +#include linux/hrtimer.h +#include linux/jiffies.h +#include linux/kevent.h + +struct kevent_timer +{ + struct hrtimer ktimer; + struct kevent_storage ktimer_storage; + struct kevent *ktimer_event; +}; + +static int kevent_timer_func(struct hrtimer *timer) +{ + struct kevent_timer *t = container_of(timer, struct kevent_timer, ktimer); + struct kevent *k = t-ktimer_event; + + kevent_storage_ready(t-ktimer_storage, NULL, KEVENT_MASK_ALL); + hrtimer_forward(timer, timer-base-softirq_time, + ktime_set(k-event.id.raw[0], k-event.id.raw[1])); + return HRTIMER_RESTART; +} + +static struct lock_class_key kevent_timer_key; + +static int kevent_timer_enqueue(struct kevent *k) +{ + int err; + struct kevent_timer *t; + + t = kmalloc(sizeof(struct kevent_timer), GFP_KERNEL); + if (!t) + return -ENOMEM; + + hrtimer_init(t-ktimer, CLOCK_MONOTONIC, HRTIMER_REL); + t-ktimer.expires = ktime_set(k-event.id.raw[0], k-event.id.raw[1]); + t-ktimer.function = kevent_timer_func; + t-ktimer_event = k; + + err = kevent_storage_init(t-ktimer, t-ktimer_storage); + if (err) + goto err_out_free; + lockdep_set_class(t-ktimer_storage.lock, kevent_timer_key); + + err = kevent_storage_enqueue(t-ktimer_storage, k); + if (err) + goto err_out_st_fini; + + hrtimer_start(t-ktimer, t-ktimer.expires, HRTIMER_REL); + + return 0; + +err_out_st_fini: + kevent_storage_fini(t-ktimer_storage); +err_out_free: + kfree(t); + + return err; +} + +static int kevent_timer_dequeue(struct kevent *k) +{ + struct kevent_storage *st = k-st; + struct kevent_timer *t = container_of(st, struct kevent_timer, ktimer_storage); + + hrtimer_cancel(t-ktimer); + kevent_storage_dequeue(st, k); + 
kfree(t);
+
+	return 0;
+}
+
+static int kevent_timer_callback(struct kevent *k)
+{
+	k->event.ret_data[0] = jiffies_to_msecs(jiffies);
+	return 1;
+}
+
+static int __init kevent_init_timer(void)
+{
+	struct kevent_callbacks tc = {
+		.callback = kevent_timer_callback,
+		.enqueue = kevent_timer_enqueue,
+		.dequeue = kevent_timer_dequeue};
+
+	return kevent_add_callbacks(&tc, KEVENT_TIMER);
+}
+module_init(kevent_init_timer);
+
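The patch description says that id.raw[0] carries the number of seconds and id.raw[1] the number of nanoseconds. A minimal sketch of how a caller might split a millisecond interval into that pair is shown here. The struct timer_id_sketch type and the timer_id_from_ms() helper are hypothetical stand-ins for illustration; the real struct kevent_id comes from the kevent headers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the two 32-bit words of the kevent id (an assumption of
 * this sketch; the real definition lives in the kevent headers). */
struct timer_id_sketch {
	uint32_t raw[2];
};

/* Split an interval given in milliseconds into seconds and nanoseconds,
 * following the raw[0] = seconds, raw[1] = nanoseconds convention. */
static struct timer_id_sketch timer_id_from_ms(uint64_t ms)
{
	struct timer_id_sketch id;

	assert(ms / 1000 <= UINT32_MAX);                   /* seconds must fit */
	id.raw[0] = (uint32_t)(ms / 1000);                 /* whole seconds */
	id.raw[1] = (uint32_t)((ms % 1000) * 1000000u);    /* remainder in ns */
	return id;
}
```

For example, an interval of 1500 ms becomes 1 second plus 500000000 nanoseconds.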
[take23 0/5] kevent: Generic event handling mechanism.
Generic event handling mechanism.

Kevent is a generic subsystem for handling event notifications. It supports both level and edge triggered events. It is similar to poll/epoll in some cases, but it is more scalable, it is faster, and it works with essentially any kind of event.

Events are provided into the kernel through a control syscall and can be read back through the mmaped ring or a syscall. Kevent update (i.e. readiness switching) happens directly from the internals of the appropriate state machine of the underlying subsystem (like network, filesystem, timer or any other).

Homepage: http://tservice.net.ru/~s0mbre/old/?section=projects&item=kevent
Documentation page: http://linux-net.osdl.org/index.php/Kevent

Consider for inclusion.

Changes from 'take22' patchset:
 * new ring buffer implementation in process' memory
 * wakeup-one-thread flag
 * edge-triggered behaviour

With this release an additional independent benchmark shows kevent speed compared to epoll: Eric Dumazet created a special benchmark which creates a set of AF_INET sockets, and two threads start to simultaneously read and write data from/into them. Here are the results:
 epoll (no EPOLLET): 57428 events/sec
 kevent (no ET): 59794 events/sec
 epoll (with EPOLLET): 71000 events/sec
 kevent (with ET): 78265 events/sec
 Maximum (busy loop reading events): 88482 events/sec

Changes from 'take21' patchset:
 * minor cleanups (different return values, removed unneeded variables, whitespaces and so on)
 * fixed bug in kevent removal in case when kevent being removed is the same as overflow_kevent (spotted by Eric Dumazet)

Changes from 'take20' patchset:
 * new ring buffer implementation
 * removed artificial limit on possible number of kevents

With this release and a fixed userspace web server it was possible to achieve 3960+ req/s with a client connection rate of 4000 con/s over 100 Mbit lan; data IO over the network was about 10582.7 KB/s, which is very close to wire speed if we take into account headers and the like.

Changes from 'take19' patchset:
 * use __init instead of __devinit
 * removed 'default N' from config for user statistic
 * removed kevent_user_fini() since kevent can not be unloaded
 * use KERN_INFO for statistic output

Changes from 'take18' patchset:
 * use __init instead of __devinit
 * removed 'default N' from config for user statistic
 * removed kevent_user_fini() since kevent can not be unloaded
 * use KERN_INFO for statistic output

Changes from 'take17' patchset:
 * Use RB tree instead of hash table. At least for a web server, the frequency of addition/deletion of new kevents is comparable with the number of search accesses, i.e. most of the time events are added, accessed only a couple of times and then removed, so it justifies RB tree usage over AVL tree, since the latter does have much slower deletion time (max O(log(N)) compared to 3 ops), although faster search time (1.44*O(log(N)) vs. 2*O(log(N))). So for kevents I use RB tree for now and later, when my AVL tree implementation is ready, it will be possible to compare them.
 * Changed readiness check for socket notifications.

With both above changes it is possible to achieve more than 3380 req/second compared to 2200, sometimes 2500 req/second for epoll() for a trivial web server and httperf client on the same hardware. It is possible that the above kevent limit is due to the maximum allowed kevents in a time limit, which is 4096 events.

Changes from 'take16' patchset:
 * misc cleanups (__read_mostly, const ...)
 * created special macro which is used for mmap size (number of pages) calculation
 * export kevent_socket_notify(), since it is used in network protocols which can be built as modules (IPv6 for example)

Changes from 'take15' patchset:
 * converted kevent_timer to high-resolution timers, this forces timer API update at http://linux-net.osdl.org/index.php/Kevent
 * use struct ukevent* instead of void * in syscalls (documentation has been updated)
 * added warning in kevent_add_ukevent() if ring has broken index (for testing)

Changes from 'take14' patchset:
 * added kevent_wait()
   This syscall waits until either timeout expires or at least one event becomes ready. It also commits that @num events from @start are processed by userspace and thus can be removed or rearmed (depending on its flags). It can be used to commit events read by userspace through the mmap interface. Example userspace code (evtest.c) can be found on the project's homepage.
 * added socket notifications (send/recv/accept)

Changes from 'take13' patchset:
 * do not get lock around user data check in __kevent_search()
 * fail early if there were no registered callbacks for given type of kevent
 * trailing whitespace cleanup

Changes from 'take12' patchset:
 * remove non-chardev interface for initialization
 * use pointer to kevent_mring instead of unsigned longs
 * use aligned 64bit type in raw user data (can be
Re: [take21 0/4] kevent: Generic event handling mechanism.
On Tue, 07 Nov 2006 07:32:20 -0500 Jeff Garzik [EMAIL PROTECTED] wrote: Evgeniy Polyakov wrote: Mmap ring buffer implementation was stopped by Andrew Morton and Ulrich Drepper, process' memory is used instead. copy_to_user() is slower (and sometimes noticeably), but there are major advantages of such approach. h. I say there are advantages to both. My problem with the old mmapped ringbuffer was that it permitted each user to pin (typically) 48MB of unswappable memory. Plus this pinned-memory problem would put upper bounds on the ring size. Perhaps create a kevent_direct_limit resource limit for each thread. By default, each thread could mmap $n pinned pagecache pages. Sysadmin can tune certain app resource limits to permit more. I would think that retaining the option to avoid copy_to_user() -somehow- in -some- cases would be wise. What Evgeniy means here is that copy_to_user() is slower than memcpy() (on his machine, with his kernel config, at least). Which is kinda weird and unexpected and is something which we should investigate independently from this project. (Rather than simply going and bypassing it!)
Re: [PATCH 4/4] skge: version 1.9
Michael Stone [EMAIL PROTECTED] wrote: The skge 1.9 patch is looking good on older syskonnect fiber cards. Stability issues seem to be taken care of and performance is good. There are some strange interactions with bonding, however. If I try to put both interfaces of an sk-9844 into a bonded interface, I only see traffic from one of them. If I try to config the bonded interface down, the system hangs. If I tcpdump either of the individual interfaces (before bonding them) I see all the expected traffic. Can you provide some bonding configuration details? Which mode, options, etc, as well as the relevant bits from dmesg (you can send it to me privately if it's huge)? I don't have any skge hardware, so I'm not able to test this locally. -J --- -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]
Re: 2.6.19-rc1: Volanomark slowdown
From: Tim Chen [EMAIL PROTECTED] Date: Tue, 07 Nov 2006 10:32:34 -0800 [ Please bring up networking questions on netdev@vger.kernel.org as that is the place where networking developers read bug reports and questions, they by-and-large do not read linux-kernel at all. ] [TCP]: Send ACKs each 2nd received segment commit: 1ef9696c909060ccdae3ade245ca88692b49285b http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=1ef9696c909060ccdae3ade245ca88692b49285b reduced Volanomark benchmark throughput by 10%. This is because Volanomark sends short messages (100 bytes) on its TCP connections. This patch increases ACK traffic by 3.5 times. By adopting this patch, we assume that with small segments, having short delay is important enough that we are willing to reduce bandwidth with more ACKs. Is there any real application out there for which this new behavior could be a concern? That's unfortunate, because without that patch connections can hang which is more important to fix than your performance test. :-) If we don't ACK every two segments, stacks which grow the congestion window based upon packet counting will not grow the congestion window properly when they are sending smaller than MSS sized segments. This topic has been discussed quite a bit, you may want to do some searching in the netdev archives to read some of that.
Re: [PATCH 3/3] mlsxfrm: Various fixes
On Tue, 2006-11-07 at 15:29 -0500, Paul Moore wrote:

Venkat Yekkirala wrote:

+/*
+ * security_sid_compare() - compares two given sid contexts.
+ * Returns 1 if they are equal, 0 otherwise.
+ */
+int security_sid_compare(u32 sid1, u32 sid2)
+{
+	struct context *context1;
+	struct context *context2;
+	int rc;
+
+	if (!ss_initialized)
+		return 1;
+
+	if (sid1 == sid2)
+		return 1;
+	else if (sid1 > SECINITSID_NUM && sid2 > SECINITSID_NUM)
+		return 0;
+
+	/* explicit comparison in order */
+
+	POLICY_RDLOCK;
+	context1 = sidtab_search(&sidtab, sid1);
+	if (!context1) {
+		printk(KERN_ERR "security_sid_compare: unrecognized SID"
+		       " %u\n", sid1);
+		rc = 0;
+		goto out_unlock;
+	}
+
+	context2 = sidtab_search(&sidtab, sid2);
+	if (!context2) {
+		printk(KERN_ERR "security_sid_compare: unrecognized SID"
+		       " %u\n", sid2);
+		rc = 0;
+		goto out_unlock;
+	}
+
+	rc = context_cmp(context1, context2);
+
+out_unlock:
+	POLICY_RDUNLOCK;
+	return rc;
+}

I understand wanting a generic LSM interface to do secid token comparisons, but in the SELinux implementation of this function I think we can get away with only a simple sid1 == sid2 since the security server shouldn't be creating duplicate SID/secid values for identical contexts, I think. Did you run into something in testing that would indicate otherwise?

Such duplication can occur among the initial SIDs. Not sure though when that would apply here, and it would only apply if both SIDs were initial SIDs.

-- Stephen Smalley National Security Agency
Re: 2.6.19-rc1: Volanomark slowdown
David Miller wrote: If we don't ACK every two segments, stacks which grow the congestion window based upon packet counting will not grow the congestion window properly when they are sending smaller than MSS sized segments. The only stack I know of that does this currently is linux, and in doing so does not conform to the spec. ;) Sending to a BSD receiver will result in the same behavior, so the right place to fix this is on the sending side. (I know the issue of packet vs. byte counting has come up many times over the last 10 years or so, and many arguments have been made on either side... I don't mean this to be flame bait but it's clear what will happen in this scenario.) One way of viewing the current situation is that linux's packet counting plus ABC is more conservative than byte counting -- sometimes much more so. Packet counting without ABC may be more or less conservative than byte counting, depending on segment sizes and receiver ACK strategy. Without ABC, linux is vulnerable to aggressive ACKing to inflate the cwnd. This is a kind of ugly state of affairs. Unfortunately I see no clear way to reconcile these issues short of switching to byte counting. Obviously this would be a big change as packet counting is deeply ingrained in not only the congestion control but also the recovery code. -John
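The difference between the two accounting schemes can be shown with a deliberately simplified model. This is a sketch under loud assumptions, not how any real stack computes its window: cwnd is counted in segments, every ACK is assumed to arrive during window growth, and the per-ACK limits of Appropriate Byte Counting, slow-start thresholds and loss recovery are all ignored. The function names are made up for the illustration.

```c
#include <assert.h>

/* Packet counting without ABC: the window grows by one segment per ACK,
 * no matter how many bytes that ACK covers. */
static unsigned int cwnd_packet_counting(unsigned int cwnd, unsigned int acks)
{
	return cwnd + acks;
}

/* Byte counting: the window grows by one segment for every full MSS worth
 * of acknowledged data, no matter how many ACKs carried it. */
static unsigned int cwnd_byte_counting(unsigned int cwnd, unsigned int acks,
				       unsigned int bytes_per_ack,
				       unsigned int mss)
{
	unsigned long long acked = (unsigned long long)acks * bytes_per_ack;

	assert(mss > 0);
	return cwnd + (unsigned int)(acked / mss);
}
```

Take 10000 bytes of acknowledged data, an MSS of 1460 and a starting window of 10 segments. If the receiver sends 100 ACKs of 100 bytes each, packet counting ends at 110 segments; if it sends 4 ACKs of 2500 bytes each, packet counting ends at 14. Byte counting ends at 16 in both cases, which is why its growth does not depend on the receiver's ACK strategy while packet counting does.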
Re: 2.6.19-rc1: Volanomark slowdown
From: John Heffner [EMAIL PROTECTED] Date: Tue, 07 Nov 2006 16:50:33 -0500 The only stack I know of that does this currently is linux, and in doing so does not conform to the spec. ;) Sending to a BSD receiver will result in the same behavior, so the right place to fix this is on the sending side. (I know the issue of packet vs. byte counting has come up many times over the last 10 years or so, and many arguments have been made on either side... I don't mean this to be flame bait but it's clear what will happen in this scenario.) John, you cannot change the N-million existing Linux systems out there doing congestion control via byte counting. You cannot do this no matter how much you wish it so :-)
Re: 2.6.19-rc1: Volanomark slowdown
David Miller wrote: From: John Heffner [EMAIL PROTECTED] Date: Tue, 07 Nov 2006 16:50:33 -0500 The only stack I know of that does this currently is linux, and in doing so does not conform to the spec. ;) Sending to a BSD receiver will result in the same behavior, so the right place to fix this is on the sending side. (I know the issue of packet vs. byte counting has come up many times over the last 10 years or so, and many arguments have been made on either side... I don't mean this to be flame bait but it's clear what will happen in this scenario.) John, you cannot change the N-million existing Linux systems out there doing congestion control via byte counting. You cannot do this no matter how much you wish it so :-) That would make our lives easier, wouldn't it? ;) Clearly there are some combinations of TCP stacks out there that won't interoperate well under certain workloads. Making new versions of the stack work well is the best we can hope for... Fixing the sending side does not mean we have to back out the work-around on the receiving side. -John
Re: [PATCH] Rewrite e100_phys_id
Matthew Wilcox wrote: On Tue, Nov 07, 2006 at 10:33:14AM -0800, Auke Kok wrote: Matthew Wilcox wrote: Tested on the internal interface of an HP Integrity rx2600. Bad news, it's completely hosed. The adapter does some indistinguishable blinking for a second, then stops blinking altogether. Weird. I tested it on the only e100 I have access to, and it worked. I've just reviewed the patch you quoted below, and I don't see what the problem is. I don't understand it either, and will dig into this after I get more coffee. The point is that `ethtool -p` now exits immediately after 500ms; it should loop until ^C is pressed. Somehow msleep_interruptible is always returning 0 on my platform? Very strange. Auke
Re: [PATCH 2.6.19-rc4-git10][PKT_SCHED] sch_htb: INIT_HLIST_NODE after hlist_del()
From: Stephen Hemminger [EMAIL PROTECTED] Date: Tue, 7 Nov 2006 09:50:07 -0800 Your patch duplicated the code in hlist_del_init(). Why not do: Indeed, this is the patch I will apply. Thanks Stephen.
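For readers without the kernel tree at hand, the following is a userspace re-creation of the helper being discussed. It is a sketch modeled on the hlist helpers in include/linux/list.h and is not the patch from this thread; the point it illustrates is that hlist_del_init() already unlinks a node and re-initializes it, so an extra INIT_HLIST_NODE() after a plain delete duplicates it.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace copies of the kernel's hlist types, for illustration only. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

static void INIT_HLIST_NODE(struct hlist_node *n)
{
	n->next = NULL;
	n->pprev = NULL;
}

/* A node with a NULL pprev is not on any list. */
static int hlist_unhashed(const struct hlist_node *n)
{
	return !n->pprev;
}

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	assert(hlist_unhashed(n));   /* sketch-only check: node must be off-list */
	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink the node and leave it in the initialized (unhashed) state, so
 * that a second call on the same node is a harmless no-op. */
static void hlist_del_init(struct hlist_node *n)
{
	if (!hlist_unhashed(n)) {
		struct hlist_node *next = n->next;
		struct hlist_node **pprev = n->pprev;

		*pprev = next;
		if (next)
			next->pprev = pprev;
		INIT_HLIST_NODE(n);
	}
}
```

Because the node is left unhashed, later code can test hlist_unhashed() on it or delete it again safely.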
Re: [take23 3/5] kevent: poll/select() notifications.
On Tue, 7 Nov 2006, Evgeniy Polyakov wrote:

+static int kevent_poll_wait_callback(wait_queue_t *wait,
+		unsigned mode, int sync, void *key)
+{
+	struct kevent_poll_wait_container *cont =
+		container_of(wait, struct kevent_poll_wait_container, wait);
+	struct kevent *k = cont->k;
+	struct file *file = k->st->origin;
+	u32 revents;
+
+	revents = file->f_op->poll(file, NULL);
+
+	kevent_storage_ready(k->st, NULL, revents);
+
+	return 0;
+}

Are you sure you can safely call file->f_op->poll() from inside a callback based wakeup? The low level driver may be calling the wakeup with one of its locks held, and file->f_op->poll() may then try to acquire the same lock. I remember there was a discussion about this, and assuming that the above is not safe made the epoll code more complex (and slower, since an extra O(R) loop was needed to fetch events).

- Davide
Re: tg3_read_partno(): possible array overrun
From: Michael Chan [EMAIL PROTECTED] Date: Mon, 06 Nov 2006 12:07:31 -0800 On Mon, 2006-11-06 at 10:45 +0100, Adrian Bunk wrote: The Coverity checker noted the following in drivers/net/tg3.c: -- snip -- The problem is that vpd_data[i + 2] could be vpd_data[255 + 2]. Thanks. This should fix it: [TG3]: Fix array overrun in tg3_read_partno(). Use proper upper limits for the loops and check for all error conditions. The problem was noticed by Adrian Bunk. Signed-off-by: Michael Chan [EMAIL PROTECTED] Applied, thanks Michael.
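The class of bug reported here, reading vpd_data[i + 2] when i is already at the last valid index, can be illustrated with a generic bounds-checked walk over a tagged buffer. This is not the tg3 fix and does not reproduce tg3_read_partno(); it is a sketch that assumes a simplified layout of one tag byte followed by a two-byte little-endian length and then the payload, and vpd_find_block() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the offset of the first block whose tag byte matches, or -1.
 * Each block is assumed to be: tag (1 byte), length (2 bytes, little
 * endian), payload (length bytes). Every index is checked against len
 * before it is used.
 */
static int vpd_find_block(const unsigned char *vpd, size_t len,
			  unsigned char tag)
{
	size_t i = 0;

	assert(vpd != NULL);
	while (i + 3 <= len) {          /* tag and both length bytes must fit */
		size_t block_len = (size_t)vpd[i + 1] |
				   ((size_t)vpd[i + 2] << 8);

		if (block_len > len - (i + 3))
			return -1;      /* payload would run past the buffer */
		if (vpd[i] == tag)
			return (int)i;
		i += 3 + block_len;
	}
	return -1;
}
```

The loop condition is what rules out the out-of-bounds read: with a 256-byte buffer the header bytes at i + 1 and i + 2 are only touched while i is at most 253.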
Re: [PATCH wireless-2.6-git] prism54: WPA/RSN support for fullmac cards
On Fri, Nov 03, 2006 at 01:41:46PM -0500, Luis R. Rodriguez wrote: On 11/3/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: yes, especially mgt_commit_list caused a lot of headaches, until I removed DOT11_OID_PSM from the cache list. Now, I can hammer it with ping -f for hours. nice, perhaps that's been the culprit all along... going to dig to see if I find a fullmac prism card. Would like to get this merged in. Any resolution on this? -- John W. Linville [EMAIL PROTECTED]
Please pull 'upstream-fixes' branch of wireless-2.6
The following changes since commit edd106fc8ac1826dbe231b70ce0762db24133e5c:
  Auke Kok:
        e1000: Fix regression: garbled stats and irq allocation during swsusp

are found in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git upstream-fixes

Adrian Bunk:
      bcm43xx: Add error checking in bcm43xx_sprom_write()

Michael Buesch:
      bcm43xx: Drain TX status before starting IRQs

 drivers/net/wireless/bcm43xx/bcm43xx_main.c |   22 ++++++++++++++++++++--
 1 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/bcm43xx/bcm43xx_main.c b/drivers/net/wireless/bcm43xx/bcm43xx_main.c
index 65edb56..a1b7838 100644
--- a/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ b/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -746,7 +746,7 @@ int bcm43xx_sprom_write(struct bcm43xx_p
 	if (err)
 		goto err_ctlreg;
 	spromctl |= 0x10; /* SPROM WRITE enable. */
-	bcm43xx_pci_write_config32(bcm, BCM43xx_PCICFG_SPROMCTL, spromctl);
+	err = bcm43xx_pci_write_config32(bcm, BCM43xx_PCICFG_SPROMCTL, spromctl);
 	if (err)
 		goto err_ctlreg;
 	/* We must burn lots of CPU cycles here, but that does not
@@ -768,7 +768,7 @@ int bcm43xx_sprom_write(struct bcm43xx_p
 		mdelay(20);
 	}
 	spromctl &= ~0x10; /* SPROM WRITE enable. */
-	bcm43xx_pci_write_config32(bcm, BCM43xx_PCICFG_SPROMCTL, spromctl);
+	err = bcm43xx_pci_write_config32(bcm, BCM43xx_PCICFG_SPROMCTL, spromctl);
 	if (err)
 		goto err_ctlreg;
 	mdelay(500);
@@ -1463,6 +1463,23 @@ static void handle_irq_transmit_status(s
 	}
 }
 
+static void drain_txstatus_queue(struct bcm43xx_private *bcm)
+{
+	u32 dummy;
+
+	if (bcm->current_core->rev < 5)
+		return;
+	/* Read all entries from the microcode TXstatus FIFO
+	 * and throw them away.
+	 */
+	while (1) {
+		dummy = bcm43xx_read32(bcm, BCM43xx_MMIO_XMITSTAT_0);
+		if (!dummy)
+			break;
+		dummy = bcm43xx_read32(bcm, BCM43xx_MMIO_XMITSTAT_1);
+	}
+}
+
 static void bcm43xx_generate_noise_sample(struct bcm43xx_private *bcm)
 {
 	bcm43xx_shm_write16(bcm, BCM43xx_SHM_SHARED, 0x408, 0x7F7F);
@@ -3532,6 +3549,7 @@ int bcm43xx_select_wireless_core(struct
 	bcm43xx_macfilter_clear(bcm, BCM43xx_MACFILTER_ASSOC);
 	bcm43xx_macfilter_set(bcm, BCM43xx_MACFILTER_SELF, (u8 *)(bcm->net_dev->dev_addr));
 	bcm43xx_security_init(bcm);
+	drain_txstatus_queue(bcm);
 	ieee80211softmac_start(bcm->net_dev);
 
 	/* Let's go! Be careful after enabling the IRQs.
-- 
John W. Linville
[EMAIL PROTECTED]
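The loop in drain_txstatus_queue() can be tried out in user space with a toy model. This is only a sketch: the struct toy_fifo type and the toy_* helpers are hypothetical stand-ins, and the assumption that a zero read from the first status register means "FIFO empty" is taken from the loop in the patch, not from hardware documentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model of a status FIFO. NOT the real bcm43xx register behaviour. */
struct toy_fifo {
	const uint32_t *words;	/* pending status words, all non-zero */
	size_t len;		/* number of pending words */
	size_t head;		/* index of the next word to hand out */
};

/* Stand-in for reading XMITSTAT_0: next pending word, or 0 when empty. */
static uint32_t toy_read_stat0(struct toy_fifo *f)
{
	return f->head < f->len ? f->words[f->head++] : 0;
}

/* Stand-in for reading XMITSTAT_1: the value is simply discarded. */
static uint32_t toy_read_stat1(struct toy_fifo *f)
{
	(void)f;
	return 0;
}

/* Same loop shape as in the patch; returns how many entries were dropped. */
static size_t toy_drain(struct toy_fifo *f)
{
	size_t drained = 0;

	for (;;) {
		if (!toy_read_stat0(f))
			break;
		(void)toy_read_stat1(f);
		drained++;
	}
	return drained;
}
```

Calling toy_drain() a second time on the same FIFO returns 0, which is the property the patch relies on: once drained, no stale status entries remain when the IRQs are enabled.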
Please pull 'upstream' branch of wireless-2.6
The following changes since commit d4f748365129ccfc9dadf6fb14331e45e33cc4ed: John W. Linville: Merge branch 'upstream-fixes' into upstream are found in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git upstream John W. Linville: wireless: clean-up some check return code warnings Larry Finger: bcm43xx: remove badness variable and related routine bcm43xx: Remove useless core enable/disable messages ieee80211softmac: fix verbosity when debug disabled drivers/net/wireless/bcm43xx/bcm43xx_main.c | 56 + drivers/net/wireless/hostap/hostap_pci.c |8 +++- drivers/net/wireless/ipw2100.c|8 +++- drivers/net/wireless/ipw2200.c|8 +++- drivers/net/wireless/orinoco_pci.h|7 +++ drivers/net/wireless/prism54/islpci_hotplug.c | 20 +++-- net/ieee80211/softmac/ieee80211softmac_auth.c | 10 ++-- 7 files changed, 60 insertions(+), 57 deletions(-) diff --git a/drivers/net/wireless/bcm43xx/bcm43xx_main.c b/drivers/net/wireless/bcm43xx/bcm43xx_main.c index c6bd868..60a9745 100644 --- a/drivers/net/wireless/bcm43xx/bcm43xx_main.c +++ b/drivers/net/wireless/bcm43xx/bcm43xx_main.c @@ -2684,14 +2684,10 @@ #endif bcm-chip_id, bcm-chip_rev); dprintk(KERN_INFO PFX Number of cores: %d\n, core_count); if (bcm-core_chipcommon.available) { - dprintk(KERN_INFO PFX Core 0: ID 0x%x, rev 0x%x, vendor 0x%x, %s\n, - core_id, core_rev, core_vendor, - bcm43xx_core_enabled(bcm) ? enabled : disabled); - } - - if (bcm-core_chipcommon.available) + dprintk(KERN_INFO PFX Core 0: ID 0x%x, rev 0x%x, vendor 0x%x\n, + core_id, core_rev, core_vendor); current_core = 1; - else + } else current_core = 0; for ( ; current_core core_count; current_core++) { struct bcm43xx_coreinfo *core; @@ -2709,9 +2705,8 @@ #endif core_rev = (sb_id_hi 0xF); core_vendor = (sb_id_hi 0x) 16; - dprintk(KERN_INFO PFX Core %d: ID 0x%x, rev 0x%x, vendor 0x%x, %s\n, - current_core, core_id, core_rev, core_vendor, - bcm43xx_core_enabled(bcm) ? 
enabled : disabled ); + dprintk(KERN_INFO PFX Core %d: ID 0x%x, rev 0x%x, vendor 0x%x\n, + current_core, core_id, core_rev, core_vendor); core = NULL; switch (core_id) { @@ -3209,55 +3204,27 @@ static void bcm43xx_periodic_every15sec( static void do_periodic_work(struct bcm43xx_private *bcm) { - unsigned int state; - - state = bcm-periodic_state; - if (state % 8 == 0) + if (bcm-periodic_state % 8 == 0) bcm43xx_periodic_every120sec(bcm); - if (state % 4 == 0) + if (bcm-periodic_state % 4 == 0) bcm43xx_periodic_every60sec(bcm); - if (state % 2 == 0) + if (bcm-periodic_state % 2 == 0) bcm43xx_periodic_every30sec(bcm); - if (state % 1 == 0) - bcm43xx_periodic_every15sec(bcm); - bcm-periodic_state = state + 1; + bcm43xx_periodic_every15sec(bcm); schedule_delayed_work(bcm-periodic_work, HZ * 15); } -/* Estimate a Badness value based on the periodic work - * state-machine state. Badness is worse (bigger), if the - * periodic work will take longer. - */ -static int estimate_periodic_work_badness(unsigned int state) -{ - int badness = 0; - - if (state % 8 == 0) /* every 120 sec */ - badness += 10; - if (state % 4 == 0) /* every 60 sec */ - badness += 5; - if (state % 2 == 0) /* every 30 sec */ - badness += 1; - if (state % 1 == 0) /* every 15 sec */ - badness += 1; - -#define BADNESS_LIMIT 4 - return badness; -} - static void bcm43xx_periodic_work_handler(void *d) { struct bcm43xx_private *bcm = d; struct net_device *net_dev = bcm-net_dev; unsigned long flags; u32 savedirqs = 0; - int badness; unsigned long orig_trans_start = 0; mutex_lock(bcm-mutex); - badness = estimate_periodic_work_badness(bcm-periodic_state); - if (badness BADNESS_LIMIT) { + if (unlikely(bcm-periodic_state % 4 == 0)) { /* Periodic work will take a long time, so we want it to * be preemtible. 
*/ @@ -3289,7 +3256,7 @@ static void bcm43xx_periodic_work_handle do_periodic_work(bcm); - if (badness BADNESS_LIMIT) { + if (unlikely(bcm-periodic_state % 4 == 0)) { spin_lock_irqsave(bcm-irq_lock, flags); tasklet_enable(bcm-isr_tasklet); bcm43xx_interrupt_enable(bcm, savedirqs); @@ -3300,6 +3267,7 @@
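The patch above replaces the badness estimate with a plain `periodic_state % 4 == 0` test. A small user-space check suggests the two conditions select the same states, assuming the comparison stripped by the archive was `badness > BADNESS_LIMIT`. The helper badness_check_matches_mod4() is hypothetical and exists only for this sketch; estimate_periodic_work_badness() is copied from the removed code.

```c
#define BADNESS_LIMIT 4

/* Copy of the helper removed by the patch. */
static int estimate_periodic_work_badness(unsigned int state)
{
	int badness = 0;

	if (state % 8 == 0) /* every 120 sec */
		badness += 10;
	if (state % 4 == 0) /* every 60 sec */
		badness += 5;
	if (state % 2 == 0) /* every 30 sec */
		badness += 1;
	if (state % 1 == 0) /* every 15 sec */
		badness += 1;
	return badness;
}

/* Returns 1 if the old and new checks agree for every state below limit. */
static int badness_check_matches_mod4(unsigned int limit)
{
	unsigned int state;

	for (state = 0; state < limit; state++) {
		int old_check = estimate_periodic_work_badness(state) > BADNESS_LIMIT;
		int new_check = (state % 4 == 0);

		if (old_check != new_check)
			return 0;
	}
	return 1;
}
```

The arithmetic behind it: a state divisible by 4 scores at least 5 + 1 + 1 = 7, while any other state scores at most 1 + 1 = 2, so the limit of 4 separates exactly those two groups.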
Re: [PATCH 1/3] NetXen: Fixed /sys mapping between device and driver
Hi Ingo,

Will do. Thanks for reviewing it.

-Amit

On Tuesday 07 November 2006 22:19, Ingo Oeser wrote:
> Hi Amit,
>
> one minor nitpick:
>
> You wrote:
> > diff --git a/drivers/net/netxen/netxen_nic_main.c b/drivers/net/netxen/netxen_nic_main.c
> > index b54ea16..4effb87 100644
> > --- a/drivers/net/netxen/netxen_nic_main.c
> > +++ b/drivers/net/netxen/netxen_nic_main.c
> [...]
> > @@ -1040,7 +1041,7 @@ static int netxen_nic_poll(struct net_de
> >  		netxen_nic_enable_int(adapter);
> >  	}
> >
> > -	return (done ? 0 : 1);
> > +	return (!done);
>
> 	return !done;
>
> Please lose the parentheses here (CodingStyle).
> Just respin or send this change along with later patchsets.
>
> Regards
>
> Ingo Oeser
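For reference, the three spellings discussed in the review return the same value for every input; only the style differs. A minimal sketch, with hypothetical function names that are not part of the NetXen driver:

```c
/* Original form: explicit ternary. */
static int poll_result_ternary(int done)
{
	return (done ? 0 : 1);
}

/* Form from the patch: logical negation, still parenthesised. */
static int poll_result_parens(int done)
{
	return (!done);
}

/* Form requested by the reviewer: negation without parentheses. */
static int poll_result_plain(int done)
{
	return !done;
}
```

In C, `!x` evaluates to 1 when x is zero and to 0 otherwise, so any non-zero `done` (not just 1) maps to 0 in all three variants.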
[PATCH] Add support for configuring the PHY connection interface
Most PHYs connect to an ethernet controller over a GMII or MII interface. However, a growing number are connected over different interfaces, such as RGMII or SGMII. The ethernet driver will tell the PHY what type of connection it is by setting it manually, or passing it in through phy_connect (or phy_attach). Changes include: * Updates to documentation * Updates to other PHY Lib consumers * Changes to PHY Lib to add interface support * Some minor changes to whitespace in phy.h * interface values now passed to gianfar Signed-off-by: Andrew Fleming [EMAIL PROTECTED] --- Documentation/networking/phy.txt | 11 --- arch/powerpc/sysdev/fsl_soc.c | 36 drivers/net/au1000_eth.c |3 ++- drivers/net/fs_enet/fs_enet-main.c |3 ++- drivers/net/gianfar.c |5 +++-- drivers/net/phy/phy_device.c | 29 - include/linux/phy.h| 32 ++-- 7 files changed, 97 insertions(+), 22 deletions(-) diff --git a/Documentation/networking/phy.txt b/Documentation/networking/phy.txt index 29ccae4..1c9873d 100644 --- a/Documentation/networking/phy.txt +++ b/Documentation/networking/phy.txt @@ -97,11 +97,12 @@ Letting the PHY Abstraction Layer do Eve Next, you need to know the device name of the PHY connected to this device. The name will look something like, phy0:0, where the first number is the - bus id, and the second is the PHY's address on that bus. + bus id, and the second is the PHY's address on that bus. Typically, + the bus is responsible for making its ID unique. Now, to connect, just call this function: - phydev = phy_connect(dev, phy_name, adjust_link, flags); + phydev = phy_connect(dev, phy_name, adjust_link, flags, interface); phydev is a pointer to the phy_device structure which represents the PHY. If phy_connect is successful, it will return the pointer. dev, here, is the @@ -115,6 +116,10 @@ Letting the PHY Abstraction Layer do Eve This is useful if the system has put hardware restrictions on the PHY/controller, of which the PHY needs to be aware. 
+ interface is a u32 which specifies the connection type used + between the controller and the PHY. Examples are GMII, MII, + RGMII, and SGMII. For a full list, see include/linux/phy.h + Now just make sure that phydev-supported and phydev-advertising have any values pruned from them which don't make sense for your controller (a 10/100 controller may be connected to a gigabit capable PHY, so you would need to @@ -191,7 +196,7 @@ Doing it all yourself start, or disables then frees them for stop. struct phy_device * phy_attach(struct net_device *dev, const char *phy_id, -u32 flags); +u32 flags, u32 interface); Attaches a network device to a particular PHY, binding the PHY to a generic driver if none was found during bus initialization. Passes in diff --git a/arch/powerpc/sysdev/fsl_soc.c b/arch/powerpc/sysdev/fsl_soc.c index b4b5b4a..b053370 100644 --- a/arch/powerpc/sysdev/fsl_soc.c +++ b/arch/powerpc/sysdev/fsl_soc.c @@ -211,6 +211,36 @@ static int __init gfar_set_flags(struct return device_flags; } +/* Return the Linux interface mode type based on the + * specification in the device-tree */ +static int __init gfar_get_interface(struct device_node *np) +{ + const char *istr; + int interface = 0; + + istr = get_property(np, interface, NULL); + + if (istr == NULL) + istr = GMII; + + if (!strcasecmp(istr, GMII)) + interface = PHY_INTERFACE_MODE_GMII; + else if (!strcasecmp(istr, MII)) + interface = PHY_INTERFACE_MODE_MII; + else if (!strcasecmp(istr, RGMII)) + interface = PHY_INTERFACE_MODE_RGMII; + else if (!strcasecmp(istr, SGMII)) + interface = PHY_INTERFACE_MODE_SGMII; + else if (!strcasecmp(istr, TBI)) + interface = PHY_INTERFACE_MODE_TBI; + else if (!strcasecmp(istr, RMII)) + interface = PHY_INTERFACE_MODE_RMII; + else if (!strcasecmp(istr, RTBI)) + interface = PHY_INTERFACE_MODE_RTBI; + + return interface; +} + static struct device_node * __init gfar_get_phy_node(struct device_node *np) { const phandle *ph; @@ -342,6 +372,12 @@ static int __init 
gfar_of_init(void) if (mac_addr) memcpy(gfar_data.mac_addr, mac_addr, 6); + gfar_data.interface = gfar_get_interface(np); + if (gfar_data.interface == 0) { + printk(gfar %d failed to set interface\n, num); + continue; + } + ret = gfar_set_phy_info(np, gfar_data.phy_id, gfar_data.bus_id, gfar_data.phy_flags); if (ret) { diff --git a/drivers/net/au1000_eth.c
[PATCH] Add support for Marvell 88e1111S and 88e1145
This patch requires the new support for configurable PHY interfaces. Changes include: * New support for 88e1145 * New support for 88e111s * Fixing 88e1101 driver to not match non-88e1101 PHYs * Increases in feature support across Marvell PHY product line * Fixes a bunch of whitespace issues found by Lindent Signed-off-by: Andrew Fleming [EMAIL PROTECTED] --- drivers/net/phy/marvell.c | 156 ++--- 1 files changed, 144 insertions(+), 12 deletions(-) diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c index 0ad2532..5320ab9 100644 --- a/drivers/net/phy/marvell.c +++ b/drivers/net/phy/marvell.c @@ -43,6 +43,19 @@ #define MII_M1011_IMASK 0x12 #define MII_M1011_IMASK_INIT 0x6400 #define MII_M1011_IMASK_CLEAR 0x +#define MII_M1011_PHY_SCR 0x10 +#define MII_M1011_PHY_SCR_AUTO_CROSS 0x0060 + +#define MII_M1145_PHY_EXT_CR 0x14 +#define MII_M1145_RGMII_RX_DELAY 0x0080 +#define MII_M1145_RGMII_TX_DELAY 0x0002 + +#define M1145_DEV_FLAGS_RESISTANCE 0x0001 + +#define MII_M_PHY_LED_CONTROL 0x18 +#define MII_M_PHY_LED_DIRECT 0x4100 +#define MII_M_PHY_LED_COMBINE 0x411c + MODULE_DESCRIPTION(Marvell PHY driver); MODULE_AUTHOR(Andy Fleming); MODULE_LICENSE(GPL); @@ -64,7 +77,7 @@ static int marvell_config_intr(struct ph { int err; - if(phydev-interrupts == PHY_INTERRUPT_ENABLED) + if (phydev-interrupts == PHY_INTERRUPT_ENABLED) err = phy_write(phydev, MII_M1011_IMASK, MII_M1011_IMASK_INIT); else err = phy_write(phydev, MII_M1011_IMASK, MII_M1011_IMASK_CLEAR); @@ -104,34 +117,153 @@ static int marvell_config_aneg(struct ph if (err 0) return err; + err = phy_write(phydev, MII_M1011_PHY_SCR, + MII_M1011_PHY_SCR_AUTO_CROSS); + if (err 0) + return err; + + err = phy_write(phydev, MII_M_PHY_LED_CONTROL, + MII_M_PHY_LED_DIRECT); + if (err 0) + return err; err = genphy_config_aneg(phydev); return err; } +static int m88e1145_config_init(struct phy_device *phydev) +{ + int err; + + /* Take care of errata E0 E1 */ + err = phy_write(phydev, 0x1d, 0x001b); + if (err 0) + return err; 
+ + err = phy_write(phydev, 0x1e, 0x418f); + if (err 0) + return err; + + err = phy_write(phydev, 0x1d, 0x0016); + if (err 0) + return err; + + err = phy_write(phydev, 0x1e, 0xa2da); + if (err 0) + return err; + + if (phydev-interface == PHY_INTERFACE_MODE_RGMII) { + int temp = phy_read(phydev, MII_M1145_PHY_EXT_CR); + if (temp 0) + return temp; + + temp |= (MII_M1145_RGMII_RX_DELAY | MII_M1145_RGMII_TX_DELAY); + + err = phy_write(phydev, MII_M1145_PHY_EXT_CR, temp); + if (err 0) + return err; + + if (phydev-dev_flags M1145_DEV_FLAGS_RESISTANCE) { + err = phy_write(phydev, 0x1d, 0x0012); + if (err 0) + return err; + + temp = phy_read(phydev, 0x1e); + if (temp 0) + return temp; + + temp = 0xf03f; + temp |= 2 9; /* 36 ohm */ + temp |= 2 6; /* 39 ohm */ + + err = phy_write(phydev, 0x1e, temp); + if (err 0) + return err; + + err = phy_write(phydev, 0x1d, 0x3); + if (err 0) + return err; + + err = phy_write(phydev, 0x1e, 0x8000); + if (err 0) + return err; + } + } + + return 0; +} static struct phy_driver m88e1101_driver = { - .phy_id = 0x01410c00, - .phy_id_mask= 0xff00, - .name = Marvell 88E1101, - .features = PHY_GBIT_FEATURES, - .flags = PHY_HAS_INTERRUPT, - .config_aneg= marvell_config_aneg, - .read_status= genphy_read_status, - .ack_interrupt = marvell_ack_interrupt, - .config_intr= marvell_config_intr, - .driver = { .owner = THIS_MODULE,}, + .phy_id = 0x01410c60, + .phy_id_mask = 0xfff0, + .name = Marvell 88E1101, + .features = PHY_GBIT_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .config_aneg = marvell_config_aneg, + .read_status = genphy_read_status, + .ack_interrupt = marvell_ack_interrupt, + .config_intr = marvell_config_intr, + .driver = {.owner = THIS_MODULE,},
Re: [PATCH] Add support for configuring the PHY connection interface
On Nov 8, 2006, at 12:10 AM, Andy Fleming wrote:

> Most PHYs connect to an ethernet controller over a GMII or MII
> interface. However, a growing number are connected over different
> interfaces, such as RGMII or SGMII.
>
> The ethernet driver will tell the PHY what type of connection it is
> by setting it manually, or passing it in through phy_connect (or
> phy_attach).
>
> Changes include:
> * Updates to documentation
> * Updates to other PHY Lib consumers
> * Changes to PHY Lib to add interface support
> * Some minor changes to whitespace in phy.h
> * interface values now passed to gianfar
>
> Signed-off-by: Andrew Fleming [EMAIL PROTECTED]

Any reason to not make interface an enum?

- k
Re: [PATCH] Add support for configuring the PHY connection interface
On Nov 8, 2006, at 00:16, Kumar Gala wrote:

> On Nov 8, 2006, at 12:10 AM, Andy Fleming wrote:
>
> > Most PHYs connect to an ethernet controller over a GMII or MII
> > interface. However, a growing number are connected over different
> > interfaces, such as RGMII or SGMII.
> >
> > The ethernet driver will tell the PHY what type of connection it is
> > by setting it manually, or passing it in through phy_connect (or
> > phy_attach).
> >
> > Changes include:
> > * Updates to documentation
> > * Updates to other PHY Lib consumers
> > * Changes to PHY Lib to add interface support
> > * Some minor changes to whitespace in phy.h
> > * interface values now passed to gianfar
> >
> > Signed-off-by: Andrew Fleming [EMAIL PROTECTED]
>
> Any reason to not make interface an enum?

I became mildly attached to the notion of having a reduced bit. I'd
be open to changing it.
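To make the enum question concrete, here is a rough user-space sketch of what an enum-typed interface mode with a table-driven string lookup might look like. Everything in it is an assumption for illustration: the enum phy_iface_mode type, the IFACE_MODE_* names and parse_iface_mode() are hypothetical, and the posted patch itself uses u32 values named PHY_INTERFACE_MODE_* with an if/else chain.

```c
#include <stddef.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

/* Hypothetical enum; 0 is reserved for "not recognised", as in the patch. */
enum phy_iface_mode {
	IFACE_MODE_UNKNOWN = 0,
	IFACE_MODE_MII,
	IFACE_MODE_GMII,
	IFACE_MODE_SGMII,
	IFACE_MODE_TBI,
	IFACE_MODE_RMII,
	IFACE_MODE_RGMII,
	IFACE_MODE_RTBI
};

/* Lookup table replacing the chain of strcasecmp() calls. */
static const struct {
	const char *name;
	enum phy_iface_mode mode;
} iface_names[] = {
	{ "MII",   IFACE_MODE_MII },
	{ "GMII",  IFACE_MODE_GMII },
	{ "SGMII", IFACE_MODE_SGMII },
	{ "TBI",   IFACE_MODE_TBI },
	{ "RMII",  IFACE_MODE_RMII },
	{ "RGMII", IFACE_MODE_RGMII },
	{ "RTBI",  IFACE_MODE_RTBI }
};

/* Map a device-tree style string to a mode; NULL defaults to GMII. */
static enum phy_iface_mode parse_iface_mode(const char *istr)
{
	size_t i;

	if (istr == NULL)
		istr = "GMII";	/* same default as gfar_get_interface() */

	for (i = 0; i < sizeof(iface_names) / sizeof(iface_names[0]); i++)
		if (!strcasecmp(istr, iface_names[i].name))
			return iface_names[i].mode;

	return IFACE_MODE_UNKNOWN;
}
```

With an enum the compiler can warn about unhandled cases in a switch, which is one possible argument for the change; whether that outweighs the flexibility of a u32 is the open question in this thread.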
Re: [PATCH 2.6.19-rc4-git10][PKT_SCHED] sch_htb: INIT_HLIST_NODE after hlist_del()
On Tue, Nov 07, 2006 at 09:50:07AM -0800, Stephen Hemminger wrote:
> On Tue, 7 Nov 2006 07:49:43 +0100
> Jarek Poplawski [EMAIL PROTECTED] wrote:
...
> Your patch duplicated the code in hlist_del_init(). Why not do:
>
> --- a/net/sched/sch_htb.c	2006-11-07 09:48:22.0 -0800
> +++ b/net/sched/sch_htb.c	2006-11-07 09:49:01.0 -0800
> @@ -1284,8 +1284,7 @@
>  						  struct htb_class, sibling));
>  
>  	/* note: this delete may happen twice (see htb_delete) */
> -	if (!hlist_unhashed(&cl->hlist))
> -		hlist_del(&cl->hlist);
> +	hlist_del_init(&cl->hlist);
>  	list_del(&cl->sibling);
>  
>  	if (cl->prio_activity)
> @@ -1333,8 +1332,7 @@
>  	sch_tree_lock(sch);
>  
>  	/* delete from hash and active; remainder in destroy_class */
> -	if (!hlist_unhashed(&cl->hlist))
> -		hlist_del(&cl->hlist);
> +	hlist_del_init(&cl->hlist);
>  
>  	if (cl->prio_activity)
>  		htb_deactivate(q, cl);

I've understood your first suggestion. But after sending
my patch I found it is also hiding a real problem
of excessive deletion in one and possibly more places.
So probably this should be done the right way, with
this hlist_unhashed testing left in BUG_ON only...

Cheers,
Jarek P.
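The difference between the two deletion styles is easier to see in a user-space miniature of the hlist primitives. The sketch below is modelled on the kernel's hlist helpers but is not the kernel code itself: the function names are lower-cased or renamed (hlist_unlink() and init_hlist_node() are names chosen for this example), and list poisoning is left out.

```c
#include <stddef.h>

struct hlist_node {
	struct hlist_node *next;
	struct hlist_node **pprev;	/* address of the pointer that points at us */
};

struct hlist_head {
	struct hlist_node *first;
};

/* Mark a node as "not on any list". */
static void init_hlist_node(struct hlist_node *n)
{
	n->next = NULL;
	n->pprev = NULL;
}

/* A node is unhashed when nothing points at it. */
static int hlist_unhashed(const struct hlist_node *n)
{
	return !n->pprev;
}

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink without touching n's own pointers (must be on a list). */
static void hlist_unlink(struct hlist_node *n)
{
	struct hlist_node *next = n->next;
	struct hlist_node **pprev = n->pprev;

	*pprev = next;
	if (next)
		next->pprev = pprev;
}

/* Safe to call repeatedly: a second call sees an unhashed node and returns. */
static void hlist_del_init(struct hlist_node *n)
{
	if (!hlist_unhashed(n)) {
		hlist_unlink(n);
		init_hlist_node(n);
	}
}
```

Because hlist_del_init() re-initialises the node, deleting twice is harmless, which is why it can replace the open-coded hlist_unhashed() test. It also means a genuine double delete goes unnoticed, which is the concern raised in the reply above.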
Re: [BUG] [2.6.19-rc4-mm2] can't compile drivers/acpi/processor_idle.c
On Wed, 8 Nov 2006 15:01:41 +0900
KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote:

> While compiling 2.6.19-rc4-mm2 on ia64, I met this compile error.
> ==
>   CC [M]  drivers/acpi/processor_idle.o
> drivers/acpi/processor_idle.c:43:22: asm/apic.h: No such file or directory
> drivers/acpi/processor_idle.c: In function `acpi_processor_power_seq_show':
> drivers/acpi/processor_idle.c:1202: warning: long long unsigned int format, u64 arg (arg 5)
> ==
> This is because of acpi-include-apic-h.patch, maybe. ia64 doesn't have asm/apic.h

That got fixed (by ugly means).

> my .config is attached.

But rc5-mm1 remains broken with that .config:

arch/ia64/pci/pci.c: In function `pci_acpi_scan_root':
arch/ia64/pci/pci.c:354: warning: implicit declaration of function `pxm_to_node'
...
arch/ia64/pci/built-in.o(.text+0xe92): In function `pci_acpi_scan_root':
: undefined reference to `pxm_to_node'

This bug exists in mainline.

Also,

drivers/built-in.o(.text+0xd9a72): In function `e1000_xmit_frame':
: undefined reference to `csum_ipv6_magic'

I don't know how this got broken. ia64 seems to be the only architecture
which doesn't have an implementation of csum_ipv6_magic(). This bug
appears to be introduced by git-netdev-all.patch.
Re: [BUG] [2.6.19-rc4-mm2] can't compile drivers/acpi/processor_idle.c
From: Andrew Morton [EMAIL PROTECTED]
Date: Tue, 7 Nov 2006 22:52:59 -0800

> Also,
>
> drivers/built-in.o(.text+0xd9a72): In function `e1000_xmit_frame':
> : undefined reference to `csum_ipv6_magic'
>
> I don't know how this got broken. ia64 seems to be the only architecture
> which doesn't have an implementation of csum_ipv6_magic(). This bug
> appears to be introduced by git-netdev-all.patch.

There is a generic version, which e1000 would get if it included
the net/ip6_checksum.h header file.
Re: [BUG] [2.6.19-rc4-mm2] can't compile drivers/acpi/processor_idle.c
On Tue, 7 Nov 2006 22:52:59 -0800
Andrew Morton [EMAIL PROTECTED] wrote:

> On Wed, 8 Nov 2006 15:01:41 +0900
> KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote:
>
> > While compiling 2.6.19-rc4-mm2 on ia64, I met this compile error.
> > ==
> >   CC [M]  drivers/acpi/processor_idle.o
> > drivers/acpi/processor_idle.c:43:22: asm/apic.h: No such file or directory
> > drivers/acpi/processor_idle.c: In function `acpi_processor_power_seq_show':
> > drivers/acpi/processor_idle.c:1202: warning: long long unsigned int format, u64 arg (arg 5)
> > ==
> > This is because of acpi-include-apic-h.patch, maybe. ia64 doesn't have asm/apic.h
>
> That got fixed (by ugly means).

Ah, okay. I'll move to rc5-mm1. Thank you.

> > my .config is attached.
>
> But rc5-mm1 remains broken with that .config:
>
> arch/ia64/pci/pci.c: In function `pci_acpi_scan_root':
> arch/ia64/pci/pci.c:354: warning: implicit declaration of function `pxm_to_node'
> ...
> arch/ia64/pci/built-in.o(.text+0xe92): In function `pci_acpi_scan_root':
> : undefined reference to `pxm_to_node'
>
> This bug exists in mainline.

How about this? Maybe ia64 people's review is necessary.

-Kame
==
When NUMA is enabled on an ACPI system, pxm_to_node() is used, and it
is defined in drivers/acpi/numa.c.

Signed-Off-By: KAMEZAWA Hiroyuki [EMAIL PROTECTED]

Index: linux-2.6.19-rc4-mm2/arch/ia64/Kconfig
===
--- linux-2.6.19-rc4-mm2.orig/arch/ia64/Kconfig	2006-11-08 14:15:21.0 +0900
+++ linux-2.6.19-rc4-mm2/arch/ia64/Kconfig	2006-11-08 16:16:40.0 +0900
@@ -353,6 +353,7 @@
 	bool "NUMA support"
 	depends on !IA64_HP_SIM && !FLATMEM
 	default y if IA64_SGI_SN2
+	select ACPI_NUMA if ACPI
 	help
 	  Say Y to compile the kernel to support NUMA (Non-Uniform Memory
 	  Access). This option is for configuring high-end multiprocessor