[PATCH 0/1] IPN: Inter Process Networking
Inter Process Networking (PATCH):

This patch adds a new address family for inter process communication.
AF_IPN: inter process networking, i.e. multipoint, multicast/broadcast communication among processes (and networks).

Contents of this document:
1. What is IPN?
2. Why IPN?
2.1 Why IPN instead of IP Multicast?
2.2 Why IPN instead of AF_NETLINK?
3. How?

We've read all the comments in the previous thread about IPN and we've tried to answer them.

1. WHAT IS IPN?
---

IPN is a new address family designed for one-to-many, many-to-many and peer-to-peer communication among processes. Berkeley sockets have been designed for client-server or point-to-point communication; AF_UNIX does not support multicast/broadcast. AF_IPN does, in a simple, efficient but extensible way.

IPN is an Inter Process Communication paradigm where all the processes appear as if they were connected by a networking bus. On IPN, processes can interoperate using real networking protocols (e.g. ethernet) but also using application defined protocols (maybe just sending ascii strings, video or audio frames, etc). IPN provides networking (in the broadest definition you can imagine) to the processes. Processes can be ethernet nodes, run their own TCP-IP stacks if they like (e.g. virtual machines), mount ATA-over-Ethernet disks, etc.

IPN networks can be interconnected with real networks, and IPN networks running on different computers can interoperate (they can be connected by virtual cables).

IPN is part of the Virtual Square Project (vde, lwipv6, view-os, umview/kmview, see wiki.virtualsquare.org).

2. WHY IPN?
---

Many applications can benefit from IPN. First of all VDE (Virtual Distributed Ethernet): one service of IPN is a kernel implementation of VDE.

IPN can be useful for applications where one or some processes feed their data (*any kind* of data, not only networking-related messages) to several consuming processes (maybe joining the stream at run time).

IPN sockets can also be connected to tap (tuntap) like interfaces or to real interfaces (like brctl addif). There are specific ioctls to define a tap interface or grab an existing one.

Several existing services could be implemented (and often could have extended features) on top of IPN:
- kernel Ethernet bridging
- TUN/TAP
- MACVLAN

IPN could be used (IMHO) to provide multicast services to processes. Audio frames or video frames could be multiplexed such that multiple applications can use them. I think that something like Jack can be implemented on top of IPN. Something like a VideoJack can provide video frames to several applications: e.g. the same image from a camera can be viewed by xawtv, recorded and sent to a streaming service.

IPN sockets can be used wherever there is the idea of a broadcasting channel, i.e. where processes can join (and leave) the information flow at runtime. IPN can be seen as publish and subscribe.

Different delivery policies can be defined as IPN protocols (loaded as submodules of ipn.ko). For instance, an ethernet switch is a policy (kvde_switch.ko: packets are unicast delivered if the MAC address is already in the switching hash table). We are designing an extended switch, full of interesting features like our userland vde_switch (with vlan/fst/management etc.), and a layer3 switch, but other policies can be defined to implement the specific requirements of other services. I feel that there are no limits to creativity about multicast services for processes.

Userspace services (like vde) do exist, but IPN provides faster and unified support.

2.1 Why IPN instead of IP Multicast?

- IPN seems to be faster than IP Multicast (see my message to LKML of Dec 06).
- IPN provides file system permissions to access the communication medium, and it uses the file system for naming.
- IPN does not need any tunneling or packet encapsulation, it works as a layer 1 virtual network.
- IPN protocols (implemented by kernel submodules) provide forwarding policies: the set of recipients for each message is computed from the contents of the message itself. Ethernet virtual switches or other routing rules for any kind of data can be implemented as IPN protocols.

2.2 Why IPN instead of AF_NETLINK?
--

- Netlink has been designed for user to kernel communication.
- Netlink is missing many features needed to provide services similar to IPN.
- Currently multicast seems to be allowed for root only. Access control would have to be added from scratch.
- The Netlink interface for user processes is not very straightforward (libnl has been developed as a higher level solution to that).
- Netlink already seems to suffer from overpopulation: NETLINK_GENERIC has been added for simplified netlink usage but it adds yet another header and rules to be followed.
- Netlink is quite rigid as for message delivery guarantees: unicast implies lossless
Re: init_timer_deferrable conversion
On Sun, 16 Dec 2007 22:00:23 -0500 (EST) Parag Warudkar [EMAIL PROTECTED] wrote:

In my quest to get the wake-ups from idle per second down to bare minimum, I noticed 3 places in the kernel that could benefit from using init_timer_deferrable() instead of init_timer() -

a) drivers/net/sky2.c - watchdog_timer. This was showing up high on Powertop's list of things that cause routine wakeups from idle. After converting to init_timer_deferrable() the wakeups went down and this one no longer shows up in powertop's list. 25% reduction.
b) kernel/time/clocksource.c - watchdog_timer - same story as sky2.c
c) net/core/neighbour.c - gc_timer - Most benefit from deferrable timer.

neigh_periodic_timer() is actually doing almost nothing per round, since it looks only one slot of hash table. We could probably convert it to a workqueue and scan whole table at once.

I am running a kernel with above changes and haven't noticed any immediate problems and the wakeups-from-idle have gone down from 5-7 to mere 1-2 per second.

Is there any reason not to make the above timers deferrable - corner cases, other side effects? If not then I will submit a patch.

Thanks
Parag

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
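The effect described in this message can be illustrated with a toy userspace model. This is only a sketch, not kernel code: `struct sim_timer` and `count_idle_wakeups()` are hypothetical names, time is counted in abstract ticks, and the model assumes the CPU is otherwise idle, so a deferrable timer never causes a wakeup of its own and is simply serviced whenever some non-deferrable timer wakes the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a periodic timer; not a kernel structure. */
struct sim_timer {
	unsigned period;	/* ticks between expiries, must be > 0 */
	int deferrable;		/* non-zero: never wakes an idle CPU by itself */
};

/*
 * Count the ticks in 1..horizon at which an otherwise idle CPU has to
 * wake up. Only non-deferrable timers force a wakeup; several timers
 * expiring on the same tick share a single wakeup.
 */
unsigned count_idle_wakeups(const struct sim_timer *timers, size_t n,
			    unsigned horizon)
{
	unsigned wakeups = 0;

	for (unsigned t = 1; t <= horizon; t++) {
		for (size_t i = 0; i < n; i++) {
			assert(timers[i].period > 0);
			if (!timers[i].deferrable && t % timers[i].period == 0) {
				wakeups++;
				break;	/* one wakeup per tick is enough */
			}
		}
	}
	return wakeups;
}
```

In this model, marking the two faster of three timers as deferrable leaves only the wakeups of the remaining timer, which is the kind of reduction reported above; the real gain depends on what else wakes the CPU.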
Re: net-2.6.25 rebased...
On Sun, 16 Dec 2007, David Miller wrote:

I needed to rebase for two reasons:

1) Ilpo asked me to revert a lot of TCP stuff and the easiest way to do that was during a rebase.
2) Patrick McHardy needs some of the pending net-2.6 bug fixes in there in order to send me patches for some netfilter compat stuff.

It's all there in the usual spot: kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.25.git

I'm still doing build tests with various configurations on my Niagara-2 box. Let me know if I screwed up anything, thanks!

Thanks, at least the end result for TCP was exactly what I expected to find there...

--
i.
Re: [PATCH 0/1] IPN: Inter Process Networking
On Mon, 17 Dec 2007, Renzo Davoli wrote:

Inter Process Networking (PATCH):

1. WHAT IS IPN?
---

IPN is a new address family designed for one-to-many, many-to-many and peer-to-peer communication among processes. Berkeley sockets have been designed for client-server or point-to-point communication; AF_UNIX does not support multicast/broadcast. AF_IPN does, in a simple, efficient but extensible way.

IPN is an Inter Process Communication paradigm where all the processes appear as if they were connected by a networking bus. On IPN, processes can interoperate using real networking protocols (e.g. ethernet) but also using application defined protocols (maybe just sending ascii strings, video or audio frames, etc). IPN provides networking (in the broadest definition you can imagine) to the processes. Processes can be ethernet nodes, run their own TCP-IP stacks if they like (e.g. virtual machines), mount ATA-over-Ethernet disks, etc.

IPN networks can be interconnected with real networks, and IPN networks running on different computers can interoperate (they can be connected by virtual cables).

IPN is part of the Virtual Square Project (vde, lwipv6, view-os, umview/kmview, see wiki.virtualsquare.org).

other than the fact that this is bi-directional, how is this better than using pipes and splice?

wouldn't it be better to just add the ability for multiple writers to send to the same pipe, and then have all of them splice into the output of that pipe? this would give the same data-agnostic communication that you are looking for, and with the minor detail that software would have to filter out messages that they send, would appear to meet all the goals you are looking at, using existing kernel features that are designed to be very high performance.

David Lang
[ROSE] [AX25] possible circular locking
Hi,

When I killall kissattach I can see the following message. This happens on kernel 2.6.24-rc5 already patched with the 6 patches I sent recently.

===
[ INFO: possible circular locking dependency detected ]
2.6.23.9 #1
---
kissattach/2906 is trying to acquire lock:
 (linkfail_lock){-+..}, at: [d8bd4603] ax25_link_failed+0x11/0x39 [ax25]

but task is already holding lock:
 (ax25_list_lock){-+..}, at: [d8bd7c7c] ax25_device_event+0x38/0x84 [ax25]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (ax25_list_lock){-+..}:
       [c0130897] __lock_acquire+0x9e9/0xbe6
       [d8bd845c] ax25_find_cb+0x18/0xc6 [ax25]
       [c0130b02] lock_acquire+0x6e/0x87
       [d8bd845c] ax25_find_cb+0x18/0xc6 [ax25]
       [c02a399b] _spin_lock_bh+0x2e/0x39
       [d8bd845c] ax25_find_cb+0x18/0xc6 [ax25]
       [d8bd845c] ax25_find_cb+0x18/0xc6 [ax25]
       [d8bd5d57] ax25_send_frame+0x40/0x131 [ax25]
       [d8bed51a] rose_send_frame+0x4a/0x5b [rose]
       [d8bed946] rose_link_rx_restart+0x135/0x157 [rose]
       [c02a399b] _spin_lock_bh+0x2e/0x39
       [d8bee56a] rose_route_frame+0xad/0x4f3 [rose]
       [c0105215] dump_trace+0x81/0x8b
       [c012dea3] save_trace+0x37/0x8c
       [c012f73c] mark_lock+0x337/0x44b
       [c0130a4c] __lock_acquire+0xb9e/0xbe6
       [d8bd471e] ax25_protocol_function+0x30/0x34 [ax25]
       [d8bd46fb] ax25_protocol_function+0xd/0x34 [ax25]
       [d8bd5271] ax25_rx_iframe+0x2e3/0x332 [ax25]
       [c011f839] __mod_timer+0x89/0x93
       [d8bd6b95] ax25_std_frame_in+0x5b1/0x638 [ax25]
       [d8bd4c49] ax25_kiss_rcv+0x3cd/0x712 [ax25]
       [c012f889] mark_held_locks+0x39/0x53
       [c02a3d2a] _spin_unlock_irqrestore+0x34/0x39
       [c024a79b] sock_queue_rcv_skb+0xd6/0xf3
       [c02a3879] _read_unlock+0x14/0x1c
       [c024a79b] sock_queue_rcv_skb+0xd6/0xf3
       [c025033c] netif_receive_skb+0x22d/0x289
       [c012fa60] trace_hardirqs_on+0x109/0x148
       [c02521ff] process_backlog+0x7b/0xeb
       [c02522c6] net_rx_action+0x57/0xfd
       [c011c52d] __do_softirq+0x40/0x90
       [c011c5a4] do_softirq+0x27/0x3d
       [c0106768] do_IRQ+0x58/0x6c
       [c0104cee] common_interrupt+0x2e/0x40
       [] 0x

-> #2 (rose_route_list_lock){-+..}:
       [c0130897] __lock_acquire+0x9e9/0xbe6
       [d8bee50a] rose_route_frame+0x4d/0x4f3 [rose]
       [c0130b02] lock_acquire+0x6e/0x87
       [d8bee50a] rose_route_frame+0x4d/0x4f3 [rose]
       [c02a399b] _spin_lock_bh+0x2e/0x39
       [d8bee50a] rose_route_frame+0x4d/0x4f3 [rose]
       [d8bee50a] rose_route_frame+0x4d/0x4f3 [rose]
       [c0105215] dump_trace+0x81/0x8b
       [c012dea3] save_trace+0x37/0x8c
       [c012f73c] mark_lock+0x337/0x44b
       [c0130a4c] __lock_acquire+0xb9e/0xbe6
       [d8bd471e] ax25_protocol_function+0x30/0x34 [ax25]
       [d8bd46fb] ax25_protocol_function+0xd/0x34 [ax25]
       [d8bd5271] ax25_rx_iframe+0x2e3/0x332 [ax25]
       [c011f839] __mod_timer+0x89/0x93
       [d8bd6b95] ax25_std_frame_in+0x5b1/0x638 [ax25]
       [d8bd4c49] ax25_kiss_rcv+0x3cd/0x712 [ax25]
       [c012f889] mark_held_locks+0x39/0x53
       [c02a3d2a] _spin_unlock_irqrestore+0x34/0x39
       [c024a79b] sock_queue_rcv_skb+0xd6/0xf3
       [c02a3879] _read_unlock+0x14/0x1c
       [c024a79b] sock_queue_rcv_skb+0xd6/0xf3
       [c025033c] netif_receive_skb+0x22d/0x289
       [c012fa60] trace_hardirqs_on+0x109/0x148
       [c02521ff] process_backlog+0x7b/0xeb
       [c02522c6] net_rx_action+0x57/0xfd
       [c011c52d] __do_softirq+0x40/0x90
       [c011c5a4] do_softirq+0x27/0x3d
       [c0106768] do_IRQ+0x58/0x6c
       [c0104cee] common_interrupt+0x2e/0x40
       [] 0x

-> #1 (rose_neigh_list_lock){-+..}:
       [c0130897] __lock_acquire+0x9e9/0xbe6
       [d8bee31e] rose_link_failed+0xe/0x44 [rose]
       [c0130b02] lock_acquire+0x6e/0x87
       [d8bee31e] rose_link_failed+0xe/0x44 [rose]
       [d8bd7783] ax25_t1timer_expiry+0x0/0x20 [ax25]
       [c02a399b] _spin_lock_bh+0x2e/0x39
       [d8bee31e] rose_link_failed+0xe/0x44 [rose]
       [d8bee31e] rose_link_failed+0xe/0x44 [rose]
       [d8bd461a] ax25_link_failed+0x28/0x39 [ax25]
       [d8bd7300] ax25_disconnect+0x34/0xbe [ax25]
       [c011f4f3] run_timer_softirq+0xee/0x14a
       [c011c51e] __do_softirq+0x31/0x90
       [c012fa60] trace_hardirqs_on+0x109/0x148
       [c011c52d] __do_softirq+0x40/0x90
       [c011c5a4] do_softirq+0x27/0x3d
       [c0106768] do_IRQ+0x58/0x6c
       [c0104cee] common_interrupt+0x2e/0x40
       [d8a9163f] acpi_processor_idle+0x262/0x3cf [processor]
       [c0102342] cpu_idle+0x3c/0x51
       [c0382a0c] start_kernel+0x272/0x277
       [c0382323] unknown_bootoption+0x0/0x195
       [] 0x

-> #0 (linkfail_lock){-+..}:
       [c0130780] __lock_acquire+0x8d2/0xbe6
       [c0130b02] lock_acquire+0x6e/0x87
       [d8bd4603]
[BUG] [ROSE] [AX25] system impossible to reboot with linux-2.6.24-rc5
Hi,

I am running a packet switch application using AX25, mkiss and ROSE modules (FPAC). It runs for days without problems when the patches I recently sent to this list are applied.

However, if I try to shut down linux-2.6.24-rc5, it goes into an infinite loop displaying the following message again and again, and I can neither stop it nor complete the reboot without a hard reset.

I took a picture of the screen at this time showing:

===
__delay+0x6/0x7
_raw_spin_lock+0x7a/0xd2
ax25_find_cb+0x18/0xc6 [ax25]
ax25_send_frame+0x40/0x31 [ax25]
rose_send_frame+0x47/0x58 [rose]
rose_transmit_clear_request+0xae/0xc8 [rose]
rose_del_route_by_neigh+0x7e/0xbb [rose]
rose_link_failed+0x31/0x44 [rose]
ax25_link_failed+0x28/0x39 [ax25]
ax25_disconnect+0x34/0xbe [ax25]
ax25_device_event+0x5f/0x90 [ax25]
notifier_call_chain+0x2a/0x47
raw_notifier_call_chain+0x17/0x1a
dev_close+0x54/0x58
rollback_registered+0x9b/0x11d
unregister_netdevice+0x8/0x3d
unregister_netdev+0xf/0x15
mkiss_close+0x63/0x7c [mkiss]
release_dev+0x4ed/0x5a2
__lock_acquire+0xb9e/0xbe6
__lock_acquire+0xb9e/0xbe6
_atomic_dec_and_lock+0x22/0x2c
tty_release+0x7/0xa
__fput+0xbc/0x16b
filp_close+0x51/0x58
put_files_struct+0x5f/0xa7
do_exit+0x205/0x674
error_code+0x6a/0x70
sys_exit_group+0x0/0xd
sysenter_past_esp+0x5f/0x9a
==

Bernard Pidoux
Re: [PATCH 0/1] IPN: Inter Process Networking
On Mon, Dec 17, 2007 at 03:31:48AM -0800, [EMAIL PROTECTED] wrote:

wouldn't it be better to just add the ability for multiple writers to send to the same pipe, and then have all of them splice into the output of that pipe? this would give the same data-agnostic communication that you are looking for, and with the minor detail that software would have to filter out messages that they send, would appear to meet all the goals you are looking at, using existing kernel features that are designed to be very high performance.

Being able to define both filtering policies and delivery guarantees (lossless vs best-effort delivery) is not a minor detail in our opinion. Think of a virtual ethernet layer 2 switch, for instance. We have situations where dozens or hundreds of virtual cables are connected to the same switch; it would be much, much slower if you had to wake up all the user processes for each single non-broadcast ethernet frame and send them useless data.

We might have added a level2 virtual ethernet switch at kernel level, but it seemed too specific. With a minor effort we have split the dumb bus (IPN) from the ability to process specific structured data with specific policies (sub-modules such as kvde_switch).

We surely may adapt existing features (AF_UNIX, or pipes), but they offer a quite established interface and semantics and we think it would be better to add a new family. This would avoid breaking what already exists and leave more freedom in defining the new family according to needs.

As for ptrace vs utrace: ptrace has been designed for debugging; trying to bend it to fit virtualization is likely to end up in an intricate interface and implementation. utrace has been designed in a much more general way. You can implement ptrace over utrace, but you can also use utrace for virtualization in a cleaner, simpler and more efficient way. Why not?

Ludovico
--
[EMAIL PROTECTED]#acheronte (irc.freenode.net) ICQ: 64483080
GPG ID: 07F89BB8 Jabber: [EMAIL PROTECTED] Yahoo: gardenghelle
-- This is signature nr. 3556
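The filtering policy argued for in this thread (deliver a unicast frame only to the port where its destination was last seen, and to everyone otherwise) can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the kvde_switch code: `struct l2_switch`, `switch_init()`, `switch_forward()` and `PORT_FLOOD` are hypothetical names, the table is a small fixed-size hash with linear probing, and entries never age out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN		6
#define TABLE_SIZE	64
#define PORT_FLOOD	(-1)	/* deliver to every port except the sender */

struct mac_entry {
	uint8_t mac[MAC_LEN];
	int port;
	int used;
};

struct l2_switch {
	struct mac_entry table[TABLE_SIZE];
};

static unsigned mac_hash(const uint8_t *mac)
{
	unsigned h = 0;

	for (int i = 0; i < MAC_LEN; i++)
		h = h * 31 + mac[i];
	return h % TABLE_SIZE;
}

/* Return the entry holding mac, or the free slot where it would go;
 * NULL only if the table is full and mac is not in it. */
static struct mac_entry *find_slot(struct l2_switch *sw, const uint8_t *mac)
{
	unsigned start = mac_hash(mac);

	for (unsigned i = 0; i < TABLE_SIZE; i++) {
		struct mac_entry *e = &sw->table[(start + i) % TABLE_SIZE];

		if (!e->used || memcmp(e->mac, mac, MAC_LEN) == 0)
			return e;
	}
	return NULL;
}

void switch_init(struct l2_switch *sw)
{
	memset(sw, 0, sizeof(*sw));
}

/*
 * frame points at an Ethernet header: destination MAC, then source MAC.
 * Learn the sender's port, then return the single port to deliver to,
 * or PORT_FLOOD for broadcast, multicast and unknown destinations.
 */
int switch_forward(struct l2_switch *sw, int in_port, const uint8_t *frame)
{
	const uint8_t *dst = frame;
	const uint8_t *src = frame + MAC_LEN;
	struct mac_entry *e;

	assert(in_port >= 0);
	e = find_slot(sw, src);
	if (e) {
		memcpy(e->mac, src, MAC_LEN);
		e->port = in_port;
		e->used = 1;
	}

	if (dst[0] & 1)		/* group bit set: broadcast or multicast */
		return PORT_FLOOD;

	e = find_slot(sw, dst);
	return (e && e->used) ? e->port : PORT_FLOOD;
}
```

Once both endpoints of a conversation have been learned, only one receiver has to be woken per frame, which is the saving claimed above over a scheme where every reader sees every frame and filters for itself.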
Re: [PATCH 0/1] IPN: Inter Process Networking
On Mon, 17 Dec 2007, Ludovico Gardenghi wrote:

On Mon, Dec 17, 2007 at 03:31:48AM -0800, [EMAIL PROTECTED] wrote:

wouldn't it be better to just add the ability for multiple writers to send to the same pipe, and then have all of them splice into the output of that pipe? this would give the same data-agnostic communication that you are looking for, and with the minor detail that software would have to filter out messages that they send, would appear to meet all the goals you are looking at, using existing kernel features that are designed to be very high performance.

Being able to define both filtering policies (think of a virtual ethernet layer 2 switch, for instance. We have situations where dozens or hundreds of virtual cables are connected to the same switch, it would be much, much slower if you had to awake all the user processes for each single non-broadcast ethernet frame, and send them useless data) and delivery guarantees (lossless vs best-effort delivery) are not minor details in our opinion.

We might have added a level2 virtual ethernet switch at kernel level, but it seemed too specific. With a minor effort we have split the dumb bus (IPN) and the ability to process specific structured data with specific policies (sub-modules as kvde_switch).

it seems like you are mixing your use cases and arguing reasons for one when answering questions about another.

if you are talking network connections between virtual systems, then the existing tap interfaces would seem to do everything you are looking for. you can add them to bridges, route between them, filter traffic between them (at whatever layer you want with netfilter), use multicast, etc as you would any real interface.

if, however, you are talking about non-network communications (your example of sending raw video frames across the interface), and want multiple processes to receive them, this sounds like exactly the thing that splice was designed to do, distribute data to multiple recipients simultaneously and efficiently.

I think you need to separate out these two use cases (and any others you are advocating this for) and argue each one on its own.

We surely may adapt existing features (AF_UNIX, or pipes) but they offer a quite established interface and semantics and we think it should be better to add a new family. This would prevent from breaking what already exists and leaving more freedom in defining the new family according to needs.

for a new family to be valuable, you need to show what it does that isn't available in existing families.

As for ptrace vs utrace: ptrace has been designed for debugging; trying to bend it to be fit for virtualization is likely to end up in an intricate interface and implementation. utrace has been designed in a much more general way. You can implement ptrace over utrace, but you can use utrace also for virtualization in a cleaner, simpler and more efficient way. Why not?

I'm not familiar enough with ptrace vs utrace to know this argument. but I haven't heard any of the virtualization people complaining about the existing interfaces. They seem to have been happily using them for a number of years.

David Lang
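For reference, the pipe-based fan-out suggested in this message might look roughly like the sketch below. It assumes Linux with glibc, where tee(2) duplicates data between two pipes without consuming it and splice(2) moves it; `pipe_fanout()` is a hypothetical helper, not an existing API, and it ignores short transfers and full destination pipes, which a real distributor would have to handle.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Deliver up to len bytes waiting in pipe src to each pipe in
 * dsts[0..n-1]. Every destination but the last gets a tee(2) copy,
 * which leaves the data in src; the final splice(2) consumes it.
 * Returns the number of bytes delivered, or -1 on error.
 * Sketch only: assumes every destination can take the whole chunk.
 */
ssize_t pipe_fanout(int src, const int *dsts, size_t n, size_t len)
{
	assert(n > 0);

	for (size_t i = 0; i + 1 < n; i++) {
		ssize_t copied = tee(src, dsts[i], len, 0);

		if (copied < 0)
			return -1;
		if (copied == 0)
			return 0;	/* writer closed, nothing to send */
		len = (size_t)copied;	/* keep all receivers in step */
	}
	return splice(src, NULL, dsts[n - 1], NULL, len, 0);
}
```

Note that this costs one system call per destination and delivers every chunk to every reader; any per-receiver filtering has to happen in user space, either in the distributor or in each reader.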
Re: Packet per Second
In article [EMAIL PROTECTED] Glen Turner [EMAIL PROTECTED] wrote:

On Fri, 2007-12-14 at 15:34 +, Flávio Pires wrote:

Well, I work at an ISP and we have a linux box acting as a bridge+firewall. With this bridge+firewall we control the packet rate per second from each client and from our repeaters. But I can't measure the packet rate per IP. Is there any tool for this?

The usual approach is to generate NetFlow records -- there are a number of Linux tools for this. Collect them with a collector (flow-tools being a common choice). Then have a Perl script which reads the flow records, processes them whichever way you desire, and drops the result into a rrdtool file (there are modules for both reading the flow-tools data and outputting in the rrdtool format).

The rrdtool utilities have a limited range of graphs, but there is a huge selection of graphing packages from other authors for rrdtool-stored data (Drraw, etc). Flow-tools also has some third-party analysis tools, some of those have good top talker statistics.

This is a lot of work, since you are really putting a complete measurement infrastructure in place to get the one statistic you desire. But I'd encourage you to do that, since knowing one statistic usually leads to further questions of the data.

Thx for the answer Glen, I already thought about something like this. But isn't NetFlow just for Cisco IOS?
Re: [PATCH 2/2] [TCP]: Include __tcp_reset_fack_counts to non-__ version
On Tue, 11 Dec 2007, Christoph Hellwig wrote:

On Tue, Dec 11, 2007 at 01:50:39PM +0200, Ilpo Järvinen wrote:

+	BUG_ON((prev != NULL) && !tcp_skb_adjacent(sk, prev, skb[queue]));
+
+	tcp_for_write_queue_from(skb[queue], sk, queue) {
+		if ((prev != NULL) && !tcp_skb_adjacent(sk, prev, skb[queue]))
+			break;
+
+		if (!before(TCP_SKB_CB(skb[queue])->seq, tcp_sk(sk)->snd_nxt) ||
+		    TCP_SKB_CB(skb[queue])->fack_count == fc)
+			return;

There's quite a few overflows of the normal 80 char limit here. Because your current style is a little on the verbose side that's trivially fixable, though:

This part got removed when part of the TCP code got removed during the net-2.6.25 rebase...

Thanks anyway for the reminder. I'll try to be more careful during code moves in future, but I'll probably continue to allow exceptions in cases where the offenders only consist of a closing parenthesis, block opening brace and terminating semicolon (as was the case in one of these lines as well).

--
i.
Re: [PATCH 0/1] IPN: Inter Process Networking
On Mon, Dec 17, 2007 at 04:10:19AM -0800, [EMAIL PROTECTED] wrote:

if you are talking network connections between virtual systems, then the existing tap interfaces would seem to do everything you are looking for. you can add them to bridges, route between them, filter traffic between them (at whatever layer you want with netfilter), use multicast, etc as you would any real interface.

if, however, you are talking about non-network communications (your example of sending raw video frames across the interface), and want multiple processes to receive them, this sounds like exactly the thing that splice was designed to do, distribute data to multiple recipients simultaneously and efficiently.

I'll try to explain. Our first interest was to be able to interconnect virtual, real, and partial virtual machines. We developed VDE for this, it's a user-level L2 switch. Specific as it may be, it's quite popular as a simple but flexible tool. It can interconnect UML, Qemu, UMView, slirp, everything that can be connected to a tap interface, etc.

So, you say, it's a networking issue and we could live with tun/tap. There's a major point here: at present, dealing with tun/tap, bridges and routing is quite difficult if you are a *regular* user with *no* capabilities at all. You have tun/tap persistency and association to a specific user (or group, recently), at most. That's good - we don't want regular users to mess with global networking rules and settings.

Think of a bunch of heterogeneous virtual machines and partial virtual machines (i.e. VMs where only a subset of system calls may be virtualized or not depending on the parameters - that's the case of View-OS) that must be interconnected and that may or may not have a connection to a real network interface (maybe via a tunnel towards a different machine). There's no need for administrator intervention here. Why should a user have to ask root to create lots of tap interfaces for him, bind them in a bridge and set up filtering/routing rules? What would the list of interfaces become when different users asked for the same thing at the same time?

You could define a specific interconnecting bus, but we already have it: ethernet. VDE helps here, as it allows regular users to build distributed ethernet networks.

VDE works fine, but at present it often results in a bottleneck because of the high number of user processes involved and the user-kernel-user switches needed in order to transfer a single ethernet frame. Moving the core inside the kernel would limit this problem and result in faster communication with still no need for root intervention or global namespace messing.

(we're thinking about whether something can be done working with containers or similar structures, both for networking and partial virtualization, but that's another topic).

So we started thinking about how to use existing kernel structures, and we concluded that:
- no existing kernel structures appeared to be optimal for this work;
- if we had to design a new structure, it would be more useful if we tried to be as general as we could.

At present we're still focused on networking and other applications are just examples, but we thought that adding a general extensible multipoint IPC family is considerably better than adding the most specific solution to our current problem.

Maybe people with experience in other fields may tell us if there are other problems that can be resolved, or optimized, or simply made simpler, with IPN. Maybe our proposal is not the best as for interface and semantics. But we feel that it may fill an empty space in the available IPC mechanisms with a quite simple but powerful approach.

for a new family to be valuable, you need to show what it does that isn't available in existing families.

Is it more acceptable to add a new address family or to add features to existing ones? (my question is purely informative, I don't want to sound sarcastic or whatever)

For instance, someone proposed "let's just add access control to the netlink family". It seems tough work.

You proposed splice, others have proposed multicast or netlink. If I have understood correctly, splice helps in copying data to different destinations in a very fast way. But it needs a userspace program that receives data, iterates on fds and splices the data out, calling a syscall for each destination. Syscall calling may have become very fast, but we still notice slowdowns due to the reasons I've explained before.

---

(the following is not related to IPN but i wanted to answer this too)

I'm not familiar enough with ptrace vs utrace to know this argument. but I haven't heard any of the virtualization people complaining about the existing interfaces. They seem to have been happily using them for a number of years.

ptrace has a number of drawbacks that have been partially addressed by adding flags and parameters for cheating and obtaining better performance. It's *slow* especially if you want to
[PATCH] [NET][POWERPC] ucc_geth: really fix section mismatch
Commit ed7e63a51d46e835422d89c687b8a3e419a4212a has tried to fix section mismatch:

WARNING: vmlinux.o(.init.text+0x17278): Section mismatch: reference to .exit.text:uec_mdio_exit (between 'ucc_geth_init' and 'uec_mdio_init')

But that mismatch still happens. This patch actually fixes the section mismatch by removing __exit from the header file.

Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
---
 drivers/net/ucc_geth_mii.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ucc_geth_mii.h b/drivers/net/ucc_geth_mii.h
index d834370..1e45b20 100644
--- a/drivers/net/ucc_geth_mii.h
+++ b/drivers/net/ucc_geth_mii.h
@@ -96,5 +96,5 @@ enum enet_tbi_mii_reg {
 int uec_mdio_read(struct mii_bus *bus, int mii_id, int regnum);
 int uec_mdio_write(struct mii_bus *bus, int mii_id, int regnum, u16 value);
 int __init uec_mdio_init(void);
-void __exit uec_mdio_exit(void);
+void uec_mdio_exit(void);
 #endif /* __UEC_MII_H */
--
1.5.2.2
Re: Packet per Second
On Mon, Dec 17, 2007 at 11:18:57AM +, Flávio Pires wrote:

In article [EMAIL PROTECTED] Glen Turner [EMAIL PROTECTED] wrote:

On Fri, 2007-12-14 at 15:34 +, Flávio Pires wrote:

Well, I work at an ISP and we have a linux box acting as a bridge+firewall. With this bridge+firewall we control the packet rate per second from each client and from our repeaters. But I can't measure the packet rate per IP. Is there any tool for this?

Thx for the answer Glen, I already thought about something like this. But isn't NetFlow just for Cisco IOS?

You probably want this, or at least its sources:

http://packages.debian.org/stable/net/fprobe-ng

The non-ng version is probably called: fprobe-ulog

Then you also want a NetFlow collector/accounter.

/Matti Aarnio
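Once flow records have been collected, the per-IP figure asked for in this thread is a simple aggregation. The sketch below is a hedged illustration only: `struct flow_rec` is a hypothetical, heavily simplified record (real NetFlow records carry many more fields), `packets_per_second()` is a made-up name, and the records are assumed to cover exactly one collection interval.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified flow record: one source address and the
 * number of packets seen for that flow during the interval. */
struct flow_rec {
	uint32_t src_ip;	/* IPv4 source address, host byte order */
	uint64_t packets;
};

/* Average packet rate for one source address over a collection
 * interval of interval_s seconds. */
double packets_per_second(const struct flow_rec *recs, size_t n,
			  uint32_t ip, double interval_s)
{
	uint64_t total = 0;

	assert(interval_s > 0.0);
	for (size_t i = 0; i < n; i++)
		if (recs[i].src_ip == ip)
			total += recs[i].packets;
	return (double)total / interval_s;
}
```

With a five-minute collection interval, for example, two flows of 600 and 900 packets from the same address average out to 5 packets per second.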
Re: [NETFILTER] xt_hashlimit : speedups hash_dst()
David Miller wrote:

From: Eric Dumazet [EMAIL PROTECTED]
Date: Sat, 15 Dec 2007 12:04:47 +0100

I prefer to let admins choose their size, since it makes an attacker's life more difficult :) For example, I can tell you I have a server where the size is between 2.000.000 and 3.500.000, and I don't want to be forced to use 2097152.

A multiply is cheap, at least on current hardware.

I agree, and I see nothing wrong with Eric's patch and it should be merged ASAP.

I have it queued for 2.6.25 - would you like me to send it for 2.6.24?
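The patch itself is not quoted in this message, so the following is only a sketch of one common way to pay "a multiply" and keep an arbitrary table size: scale a 32-bit hash into [0, size) with a 64-bit multiply and a shift, instead of masking (which needs a power-of-two size) or taking a modulo. Whether this matches Eric's change exactly is an assumption, and `hash_to_bucket()` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a 32-bit hash value onto a bucket index in [0, size).
 * hash / 2^32 lies in [0, 1), so multiplying by size and dropping the
 * low 32 bits gives an index below size for any non-zero size.
 */
uint32_t hash_to_bucket(uint32_t hash, uint32_t size)
{
	assert(size > 0);
	return (uint32_t)(((uint64_t)hash * size) >> 32);
}
```

With this mapping an administrator can pick any table size, such as the 3.000.000 range mentioned above, without rounding to 2097152 or another power of two.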
[RFC]: Break up a patch in two (rfc3448bis changes to feedback reception)
Hi Gerrit,

Please take a look at the two attached patches. They were made from your patch "[CCID3]: Implement rfc3448bis changes to feedback reception", which has this changeset comment:

--
[CCID 3]: Implement rfc3448bis changes to feedback reception

This implements the algorithm to update the allowed sending rate X upon receiving feedback packets, as described in draft rfc3448bis, 4.2/4.3.

The patch further removes two irrelevant states in TX feedback handling:
* the NO_SENT state is only triggered in bidirectional mode, costing unnecessary processing.
* the TERM (terminating) state is irrelevant.
--

The second part, "The patch further removes ...", was made into a separate patch that I applied before the rfc3448bis changes to feedback reception. Doing it this way makes the change easier to understand, as we don't mix in indentation changes caused by removing the switch statement.

The end result should be equivalent, but please take a look and let me know if the algorithm was changed in any way.

Thanks a lot,

- Arnaldo

From 9165240abd3d4e8280647b9372dc3f223a802347 Mon Sep 17 00:00:00 2001
From: Gerrit Renker [EMAIL PROTECTED]
Date: Mon, 17 Dec 2007 10:25:06 -0200
Subject: [PATCH 2/3] [CCID3]: Remove two irrelevant states in TX feedback handling

* the NO_SENT state is only triggered in bidirectional mode, costing unnecessary processing.
* the TERM (terminating) state is irrelevant.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Ian McDonald [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/ccid3.c |  173 +++-
 1 files changed, 84 insertions(+), 89 deletions(-)

diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c
index 90e0454..1b1cb74 100644
--- a/net/dccp/ccids/ccid3.c
+++ b/net/dccp/ccids/ccid3.c
@@ -402,112 +402,107 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
 	if (!(DCCP_SKB_CB(skb)->dccpd_type == DCCP_PKT_ACK ||
 	      DCCP_SKB_CB(skb)->dccpd_type == DCCP_PKT_DATAACK))
 		return;
+	/* ... and only in the established state */
+	if (hctx->ccid3hctx_state != TFRC_SSTATE_FBACK &&
+	    hctx->ccid3hctx_state != TFRC_SSTATE_NO_FBACK)
+		return;
 
 	opt_recv = &hctx->ccid3hctx_options_received;
+	now = ktime_get_real();
 
-	switch (hctx->ccid3hctx_state) {
-	case TFRC_SSTATE_NO_FBACK:
-	case TFRC_SSTATE_FBACK:
-		now = ktime_get_real();
-
-		/* estimate RTT from history if ACK number is valid */
-		r_sample = tfrc_tx_hist_rtt(hctx->ccid3hctx_hist,
-					    DCCP_SKB_CB(skb)->dccpd_ack_seq, now);
-		if (r_sample == 0) {
-			DCCP_WARN("%s(%p): %s with bogus ACK-%llu\n", dccp_role(sk), sk,
-				  dccp_packet_name(DCCP_SKB_CB(skb)->dccpd_type),
-				  (unsigned long long)DCCP_SKB_CB(skb)->dccpd_ack_seq);
-			return;
-		}
+	/* Estimate RTT from history if ACK number is valid */
+	r_sample = tfrc_tx_hist_rtt(hctx->ccid3hctx_hist,
+				    DCCP_SKB_CB(skb)->dccpd_ack_seq, now);
+	if (r_sample == 0) {
+		DCCP_WARN("%s(%p): %s with bogus ACK-%llu\n", dccp_role(sk), sk,
+			  dccp_packet_name(DCCP_SKB_CB(skb)->dccpd_type),
+			  (unsigned long long)DCCP_SKB_CB(skb)->dccpd_ack_seq);
+		return;
+	}
+
+	/* Update receive rate in units of 64 * bytes/second */
+	hctx->ccid3hctx_x_recv = opt_recv->ccid3or_receive_rate;
+	hctx->ccid3hctx_x_recv <<= 6;
 
-		/* Update receive rate in units of 64 * bytes/second */
-		hctx->ccid3hctx_x_recv = opt_recv->ccid3or_receive_rate;
-		hctx->ccid3hctx_x_recv <<= 6;
+	/* Update loss event rate (which is scaled by 1e6) */
+	pinv = opt_recv->ccid3or_loss_event_rate;
+	if (pinv == ~0U || pinv == 0)	/* see RFC 4342, 8.5 */
+		hctx->ccid3hctx_p = 0;
+	else				/* can not exceed 100% */
+		hctx->ccid3hctx_p = 1000000 / pinv;
+	/*
+	 * Validate new RTT sample and update moving average
+	 */
+	r_sample = dccp_sample_rtt(sk, r_sample);
+	hctx->ccid3hctx_rtt = tfrc_ewma(hctx->ccid3hctx_rtt, r_sample, 9);
 
-		/* Update loss event rate */
-		pinv = opt_recv->ccid3or_loss_event_rate;
-		if (pinv == ~0U || pinv == 0)	/* see RFC 4342, 8.5 */
-			hctx->ccid3hctx_p = 0;
-		else				/* can not exceed 100% */
-			hctx->ccid3hctx_p = 1000000 / pinv;
+	if (hctx->ccid3hctx_state == TFRC_SSTATE_NO_FBACK) {
 		/*
-
[PATCH] Fix lost export-dynamic
get_link_kind() fails for statically linked modules (vlan, veth, etc.) if ip was linked without export-dynamic.

Signed-off-by: Vitaliy Gusev [EMAIL PROTECTED]

--
Thanks,
Vitaliy Gusev

diff --git a/ip/Makefile b/ip/Makefile
index 448efb9..b427d58 100644
--- a/ip/Makefile
+++ b/ip/Makefile
@@ -24,3 +24,5 @@ clean:
 	rm -f $(ALLOBJ) $(TARGETS)
 
 LDLIBS += -ldl
+
+LDFLAGS += -Wl,-export-dynamic
[PATCH/RFC] TCP: use non-delayed ACK for congestion control RTT
When a delayed ACK representing two packets arrives, there are two RTT samples available, one for each packet. The first (in order of seq number) will be artificially long due to the delay waiting for the second packet, the second will trigger the ACK and so will not itself be delayed.

According to rfc1323, the SRTT used for RTO calculation should use the first rtt, so receivers echo the timestamp from the first packet in the delayed ack. For congestion control however, it seems measuring delayed ack delay is not desirable as it varies independently of congestion.

The patch below causes seq_rtt to be updated with any available later packet rtts which should have less (and hopefully zero) delack delay. The lower seq_rtt then gets passed to ca_ops->pkts_acked().

For non-delay based congestion control (cubic, h-tcp), rtt is sometimes used for rtt-scaling. In shortening the RTT, this may make them a little less aggressive. Delay-based schemes (eg vegas, illinois) should get a considerably cleaner, more accurate congestion signal, particularly for small cwnds. The congestion control module can potentially also filter out bad RTTs due to the delayed ack alarm by looking at the associated cnt which (where delayed acking is in use) should probably be 1 if the alarm went off or greater if the ACK was triggered by a packet.

I seem to be undoing a design decision here so perhaps there is some reason this should not be done? Comments/explanations appreciated...
Signed-off-by: Gavin McCullagh [EMAIL PROTECTED]

--- a/net/ipv4/tcp_input.c 2007-12-15 00:22:23.0 +
+++ b/net/ipv4/tcp_input.c 2007-12-17 13:35:16.0 +
@@ -2691,11 +2691,9 @@ static int tcp_clean_rtx_queue(struct so
 				    (packets_acked > 1))
 					flag |= FLAG_NONHEAD_RETRANS_ACKED;
 			} else {
-				if (seq_rtt < 0) {
-					seq_rtt = now - scb->when;
-					if (fully_acked)
-						last_ackt = skb->tstamp;
-				}
+				seq_rtt = now - scb->when;
+				if (fully_acked)
+					last_ackt = skb->tstamp;
 				if (!(sacked & TCPCB_SACKED_ACKED))
 					reord = min(cnt, reord);
 			}
@@ -2709,11 +2707,9 @@ static int tcp_clean_rtx_queue(struct so
 			    !before(end_seq, tp->snd_up))
 				tp->urg_mode = 0;
 		} else {
-			if (seq_rtt < 0) {
-				seq_rtt = now - scb->when;
-				if (fully_acked)
-					last_ackt = skb->tstamp;
-			}
+			seq_rtt = now - scb->when;
+			if (fully_acked)
+				last_ackt = skb->tstamp;
 			reord = min(cnt, reord);
 		}
 		tp->packets_out -= packets_acked;
--
To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
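The effect of the change above can be illustrated outside the kernel. The following is only a sketch under stated assumptions: rtt_first_acked() and rtt_last_acked() are hypothetical helpers (they do not exist in tcp_input.c), times are plain millisecond counters, and the array holds the send times of the segments covered by one cumulative ACK, in sequence order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: RTT taken from the first segment covered by the ACK.
 * With delayed ACKs this sample includes the time the receiver waited for
 * the second segment.
 */
static int32_t rtt_first_acked(const uint32_t *sent, size_t n, uint32_t now)
{
	assert(n > 0);
	return (int32_t)(now - sent[0]);
}

/*
 * Hypothetical helper: RTT overwritten for every acked segment, so the
 * value that survives belongs to the last (most recently sent) segment.
 * Returns -1 when no segment was acked.
 */
static int32_t rtt_last_acked(const uint32_t *sent, size_t n, uint32_t now)
{
	int32_t seq_rtt = -1;
	size_t i;

	for (i = 0; i < n; i++)
		seq_rtt = (int32_t)(now - sent[i]);
	return seq_rtt;
}
```

With two segments sent at 1000 ms and 1040 ms and an ACK arriving at 1100 ms, the first helper reports 100 ms while the second reports 60 ms; the 40 ms difference is the delayed-ACK wait that the patch tries to keep out of the congestion control signal.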
Re: [RFC]: Break up a patch in two (rfc3448bis changes to feedback reception)
| The end result should be equivalent, but please take a look and

That is a good catch - this patch was a pain to keep updated exactly due to the many indentation levels. I had a quick look, the patch looks ok.

Just a small suggestion - since the RTT lookup code in tx_packet_recv() is new, would it make sense to group it with the RTT validation, as e.g.
Re: [NETFILTER] xt_hashlimit : speedups hash_dst()
On Sat, Dec 15, 2007 at 21:42:19 -0800, David Miller wrote:

From: Eric Dumazet [EMAIL PROTECTED]
Date: Sat, 15 Dec 2007 12:04:47 +0100

I prefer to let admins choose their size, since it makes an attacker's life more difficult :)

For example, I can tell you I have a server where the size is between 2.000.000 and 3.500.000; I don't want to be forced to use 2097152.

A multiply is cheap, at least on current hardware.

I agree, and I see nothing wrong with Eric's patch and it should be merged ASAP.

You could do the same optimization for net/netfilter/nf_conntrack_core.c:__hash_conntrack(), too.

--
Do what you love because life is too short for anything else.
Re: [NETFILTER] xt_hashlimit : speedups hash_dst()
Sami Farin wrote:

On Sat, Dec 15, 2007 at 21:42:19 -0800, David Miller wrote:

From: Eric Dumazet [EMAIL PROTECTED]
Date: Sat, 15 Dec 2007 12:04:47 +0100

I prefer to let admins choose their size, since it makes an attacker's life more difficult :)

For example, I can tell you I have a server where the size is between 2.000.000 and 3.500.000; I don't want to be forced to use 2097152.

A multiply is cheap, at least on current hardware.

I agree, and I see nothing wrong with Eric's patch and it should be merged ASAP.

You could do the same optimization for net/netfilter/nf_conntrack_core.c:__hash_conntrack(), too.

Yes, I already took care of that for conntrack and other netfilter non-power-of-two hashes.
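The patch under discussion is not quoted in this thread, so the following is only a sketch of the general technique the remark about a cheap multiply seems to refer to: mapping a 32-bit hash onto a table whose size is not a power of two with one multiply and one shift instead of a modulo. The function name hash_to_bucket() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a 32-bit hash value onto [0, size) without a division.
 * hash / 2^32 lies in [0, 1); multiplying by size and dropping the
 * fractional part yields a bucket index for any table size, not just
 * powers of two. Hypothetical helper, not taken from the patch.
 */
static uint32_t hash_to_bucket(uint32_t hash, uint32_t size)
{
	assert(size > 0);
	return (uint32_t)(((uint64_t)hash * size) >> 32);
}
```

For a table of 3.000.000 buckets, a hash of 0 lands in bucket 0 and a hash of 0xFFFFFFFF lands in bucket 2999999, so the administrator's chosen size is used in full.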
Re: init_timer_deferrable conversion
On Mon, 17 Dec 2007 09:55:04 +0100 Eric Dumazet [EMAIL PROTECTED] wrote:

On Sun, 16 Dec 2007 22:00:23 -0500 (EST) Parag Warudkar [EMAIL PROTECTED] wrote:

In my quest to get the wake-ups from idle per second down to bare minimum, I noticed 3 places in the kernel that could benefit from using init_timer_deferrable() instead of init_timer() -

a) drivers/net/sky2.c - watchdog_timer. This was showing up high on Powertop's list of things that cause routine wakeups from idle. After converting to init_timer_deferrable() the wakeups went down and this one no longer shows up in powertop's list. 25% reduction.
b) kernel/time/clocksource.c - watchdog_timer - same story as sky2.c
c) net/core/neighbour.c - gc_timer - Most benefit from deferrable timer.

neigh_periodic_timer() is actually doing almost nothing per round, since it looks at only one slot of the hash table. We could probably convert it to a workqueue and scan the whole table at once.

Parag, could you please try this patch ?

[NET] ARP : Convert neigh garbage collection from softirq to workqueue

The current neigh_periodic_timer() function is fired by timer IRQ, and scans one hash bucket out of a potentially big number. As we are supposed to scan the whole hash table in 15 seconds, this means neigh_periodic_timer() can be fired very often (depending on the number of concurrent hash entries we stored in this table).

Converting this to a workqueue permits scanning the whole table, minimizing icache pollution, and firing this work every 15 seconds, independently of hash table size.
Signed-off-by: Eric Dumazet [EMAIL PROTECTED] include/net/neighbour.h |4 - net/core/neighbour.c| 89 ++ 2 files changed, 45 insertions(+), 48 deletions(-) diff --git a/include/net/neighbour.h b/include/net/neighbour.h index a4f2618..fdb9251 100644 --- a/include/net/neighbour.h +++ b/include/net/neighbour.h @@ -24,6 +24,7 @@ #include linux/err.h #include linux/sysctl.h +#include linux/workqueue.h #include net/rtnetlink.h #define NUD_IN_TIMER (NUD_INCOMPLETE|NUD_REACHABLE|NUD_DELAY|NUD_PROBE) @@ -155,7 +156,7 @@ struct neigh_table int gc_thresh2; int gc_thresh3; unsigned long last_flush; - struct timer_list gc_timer; + struct delayed_work gc_work; struct timer_list proxy_timer; struct sk_buff_head proxy_queue; atomic_tentries; @@ -166,7 +167,6 @@ struct neigh_table struct neighbour**hash_buckets; unsigned inthash_mask; __u32 hash_rnd; - unsigned inthash_chain_gc; struct pneigh_entry **phash_buckets; #ifdef CONFIG_PROC_FS struct proc_dir_entry *pde; diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 4b6dd1e..495ab19 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -637,75 +637,74 @@ static void neigh_connect(struct neighbour *neigh) hh-hh_output = neigh-ops-hh_output; } -static void neigh_periodic_timer(unsigned long arg) +static void neigh_periodic_work(struct work_struct *work) { - struct neigh_table *tbl = (struct neigh_table *)arg; + struct neigh_table *tbl = container_of(work, struct neigh_table, gc_work.work); struct neighbour *n, **np; - unsigned long expire, now = jiffies; + unsigned int i; NEIGH_CACHE_STAT_INC(tbl, periodic_gc_runs); - write_lock(tbl-lock); + write_lock_bh(tbl-lock); /* * periodically recompute ReachableTime from random function */ - if (time_after(now, tbl-last_rand + 300 * HZ)) { + if (time_after(jiffies, tbl-last_rand + 300 * HZ)) { struct neigh_parms *p; - tbl-last_rand = now; + tbl-last_rand = jiffies; for (p = tbl-parms; p; p = p-next) p-reachable_time = neigh_rand_reach_time(p-base_reachable_time); } - np = 
tbl-hash_buckets[tbl-hash_chain_gc]; - tbl-hash_chain_gc = ((tbl-hash_chain_gc + 1) tbl-hash_mask); + for (i = 0 ; i = tbl-hash_mask; i++) { + np = tbl-hash_buckets[i]; - while ((n = *np) != NULL) { - unsigned int state; + while ((n = *np) != NULL) { + unsigned int state; - write_lock(n-lock); + write_lock(n-lock); - state = n-nud_state; - if (state (NUD_PERMANENT | NUD_IN_TIMER)) { - write_unlock(n-lock); - goto next_elt; - } + state = n-nud_state; + if (state (NUD_PERMANENT | NUD_IN_TIMER)) { + write_unlock(n-lock); + goto next_elt; +
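To put a number on "fired very often", here is a small back-of-the-envelope sketch. It assumes only what the changelog states (the whole table must be covered in 15 seconds, one bucket per timer firing before the patch, the whole table per work item after it); the function names are hypothetical and the arithmetic ignores the details of how the kernel actually derives the timer interval.

```c
#include <assert.h>
#include <stdint.h>

#define GC_PERIOD_MS 15000u	/* the 15 seconds quoted in the changelog */

/*
 * Timer-based scheme: one bucket per firing, so the interval between
 * firings shrinks as the hash table grows.
 */
static uint32_t per_bucket_interval_ms(uint32_t nbuckets)
{
	assert(nbuckets > 0);
	return GC_PERIOD_MS / nbuckets;
}

/* Firings needed to cover the table once with the per-bucket timer. */
static uint32_t firings_per_period_timer(uint32_t nbuckets)
{
	return nbuckets;
}

/* Workqueue scheme: the whole table is walked in a single work item. */
static uint32_t firings_per_period_work(uint32_t nbuckets)
{
	(void)nbuckets;
	return 1;
}
```

With 1024 buckets the per-bucket timer would have to fire roughly every 14 ms, i.e. 1024 wakeups per 15 second period, against a single wakeup for the workqueue version.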
Re: [PATCH] net/ipv4/netfilter/ip_tables.c: remove some inlines
Please CC netfilter-devel on netfilter patches.

Denys Vlasenko wrote:

Hi Patrick, Harald,

I was working on an unrelated problem and noticed that ip_tables.c seems to abuse inline. I prepared a patch which removes inlines except those which are used by packet matching code (and thus are really performance-critical). I added comments explaining that the remaining inlines are performance critical.

Result as reported by size:

   text    data     bss     dec     hex filename
-  6451     380      88    6919    1b07 ip_tables.o
+  6339     348      72    6759    1a67 ip_tables.o

Please take this patch into netfilter queue.

This clashes with my pending patches, which I'll push upstream today. I also spent some time resyncing ip_tables and ip6_tables so a diff of both (with some sed'ing) shows only the actual differences, so please update ip6_tables as well.
[PATCHES 0/5]: DCCP patches for 2.6.25
Hi David,

Please consider pulling from:

master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6.25

Best Regards,

- Arnaldo

 net/dccp/ccids/ccid3.c | 252 +-
 net/dccp/ccids/ccid3.h |   8 -
 net/dccp/dccp.h        |   6 -
 3 files changed, 130 insertions(+), 136 deletions(-)
[PATCH 3/5] [CCID3]: Implement rfc3448bis changes to feedback reception
From: Gerrit Renker [EMAIL PROTECTED] This implements the algorithm to update the allowed sending rate X upon receiving feedback packets, as described in draft rfc3448bis, 4.2/4.3. Signed-off-by: Gerrit Renker [EMAIL PROTECTED] Signed-off-by: Ian McDonald [EMAIL PROTECTED] Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- net/dccp/ccids/ccid3.c | 47 ++- 1 files changed, 26 insertions(+), 21 deletions(-) diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c index 1b1cb74..7c8e9ad 100644 --- a/net/dccp/ccids/ccid3.c +++ b/net/dccp/ccids/ccid3.c @@ -429,40 +429,46 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) if (pinv == ~0U || pinv == 0) /* see RFC 4342, 8.5 */ hctx-ccid3hctx_p = 0; else /* can not exceed 100% */ - hctx-ccid3hctx_p = 100 / pinv; + hctx-ccid3hctx_p = scaled_div(1, pinv); /* * Validate new RTT sample and update moving average */ r_sample = dccp_sample_rtt(sk, r_sample); hctx-ccid3hctx_rtt = tfrc_ewma(hctx-ccid3hctx_rtt, r_sample, 9); - + /* +* Update allowed sending rate X as per draft rfc3448bis-00, 4.2/3 +*/ if (hctx-ccid3hctx_state == TFRC_SSTATE_NO_FBACK) { - /* -* Larger Initial Windows [RFC 4342, sec. 
5] -*/ - hctx-ccid3hctx_x= rfc3390_initial_rate(sk); - hctx-ccid3hctx_t_ld = now; + ccid3_hc_tx_set_state(sk, TFRC_SSTATE_FBACK); - ccid3_update_send_interval(hctx); + if (hctx-ccid3hctx_t_rto == 0) { + /* +* Initial feedback packet: Larger Initial Windows (4.2) +*/ + hctx-ccid3hctx_x= rfc3390_initial_rate(sk); + hctx-ccid3hctx_t_ld = now; - ccid3_pr_debug(%s(%p), s=%u, MSS=%u, - R_sample=%uus, X=%u\n, dccp_role(sk), - sk, hctx-ccid3hctx_s, - dccp_sk(sk)-dccps_mss_cache, r_sample, - (unsigned)(hctx-ccid3hctx_x 6)); + ccid3_update_send_interval(hctx); - ccid3_hc_tx_set_state(sk, TFRC_SSTATE_FBACK); - } else { + goto done_computing_x; + } else if (hctx-ccid3hctx_p == 0) { + /* +* First feedback after nofeedback timer expiry (4.3) +*/ + goto done_computing_x; + } + } - /* Update sending rate (step 4 of [RFC 3448, 4.3]) */ - if (hctx-ccid3hctx_p 0) - hctx-ccid3hctx_x_calc = + /* Update sending rate (step 4 of [RFC 3448, 4.3]) */ + if (hctx-ccid3hctx_p 0) + hctx-ccid3hctx_x_calc = tfrc_calc_x(hctx-ccid3hctx_s, hctx-ccid3hctx_rtt, hctx-ccid3hctx_p); - ccid3_hc_tx_update_x(sk, now); + ccid3_hc_tx_update_x(sk, now); - ccid3_pr_debug(%s(%p), RTT=%uus (sample=%uus), s=%u, +done_computing_x: + ccid3_pr_debug(%s(%p), RTT=%uus (sample=%uus), s=%u, p=%u, X_calc=%u, X_recv=%u, X=%u\n, dccp_role(sk), sk, hctx-ccid3hctx_rtt, r_sample, @@ -470,7 +476,6 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) hctx-ccid3hctx_x_calc, (unsigned)(hctx-ccid3hctx_x_recv 6), (unsigned)(hctx-ccid3hctx_x 6)); - } /* unschedule no feedback timer */ sk_stop_timer(sk, hctx-ccid3hctx_no_feedback_timer); -- 1.5.3.6 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
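The switch to scaled_div(1, pinv) in this patch can be illustrated in userspace. This is a minimal sketch, assuming that scaled_div(a, b) computes a * 1e6 / b, which is consistent with the comment in the related patch saying the loss event rate is scaled by 1e6; the kernel helper itself is not shown in this thread, and loss_rate_from_pinv() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed behaviour of the kernel helper: a / b, scaled by 1e6 so that
 * fractions survive integer arithmetic.
 */
static uint64_t scaled_div(uint64_t a, uint64_t b)
{
	assert(b != 0);
	return (a * 1000000u) / b;
}

/*
 * Hypothetical wrapper mirroring the logic in the hunk: the feedback
 * option carries the inverse loss event rate; ~0U and 0 both mean that
 * no loss has been reported (see RFC 4342, 8.5).
 */
static uint32_t loss_rate_from_pinv(uint32_t pinv)
{
	if (pinv == ~0u || pinv == 0)
		return 0;
	return (uint32_t)scaled_div(1, pinv);	/* can not exceed 100% */
}
```

An inverse loss rate of 100 (one loss event per 100 packets) maps to 10000, i.e. 1% in units of 1e-6, and an inverse of 1 maps to 1000000, i.e. 100%.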
[PATCH 1/5] [CCID3]: Use a function to update p_inv, and p is never used
From: Gerrit Renker [EMAIL PROTECTED] This patch 1) concentrates previously scattered computation of p_inv into one function; 2) removes the `p' element of the CCID3 RX sock (it is redundant); 3) makes the tfrc_rx_info structure standalone, only used on demand. Signed-off-by: Gerrit Renker [EMAIL PROTECTED] Signed-off-by: Ian McDonald [EMAIL PROTECTED] Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- net/dccp/ccids/ccid3.c | 11 --- net/dccp/ccids/ccid3.h |8 2 files changed, 12 insertions(+), 7 deletions(-) diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c index 876c747..90e0454 100644 --- a/net/dccp/ccids/ccid3.c +++ b/net/dccp/ccids/ccid3.c @@ -917,6 +917,7 @@ static int ccid3_hc_rx_getsockopt(struct sock *sk, const int optname, int len, u32 __user *optval, int __user *optlen) { const struct ccid3_hc_rx_sock *hcrx; + struct tfrc_rx_info rx_info; const void *val; /* Listen socks doesn't have a private CCID block */ @@ -926,10 +927,14 @@ static int ccid3_hc_rx_getsockopt(struct sock *sk, const int optname, int len, hcrx = ccid3_hc_rx_sk(sk); switch (optname) { case DCCP_SOCKOPT_CCID_RX_INFO: - if (len sizeof(hcrx-ccid3hcrx_tfrc)) + if (len sizeof(rx_info)) return -EINVAL; - len = sizeof(hcrx-ccid3hcrx_tfrc); - val = hcrx-ccid3hcrx_tfrc; + rx_info.tfrcrx_x_recv = hcrx-ccid3hcrx_x_recv; + rx_info.tfrcrx_rtt= hcrx-ccid3hcrx_rtt; + rx_info.tfrcrx_p = hcrx-ccid3hcrx_pinv == 0 ? ~0U : + scaled_div(1, hcrx-ccid3hcrx_pinv); + len = sizeof(rx_info); + val = rx_info; break; default: return -ENOPROTOOPT; diff --git a/net/dccp/ccids/ccid3.h b/net/dccp/ccids/ccid3.h index e9f6ff4..49ca32b 100644 --- a/net/dccp/ccids/ccid3.h +++ b/net/dccp/ccids/ccid3.h @@ -139,6 +139,8 @@ enum ccid3_hc_rx_states { * @ccid3hcrx_last_counter - Tracks window counter (RFC 4342, 8.1) * @ccid3hcrx_state - Receiver state, one of %ccid3_hc_rx_states * @ccid3hcrx_bytes_recv - Total sum of DCCP payload bytes + * @ccid3hcrx_x_recv - Receiver estimate of send rate (RFC 3448, sec. 
4.3) + * @ccid3hcrx_rtt - Receiver estimate of RTT * @ccid3hcrx_tstamp_last_feedback - Time at which last feedback was sent * @ccid3hcrx_tstamp_last_ack - Time at which last feedback was sent * @ccid3hcrx_hist - Packet history (loss detection + RTT sampling) @@ -147,13 +149,11 @@ enum ccid3_hc_rx_states { * @ccid3hcrx_pinv - Inverse of Loss Event Rate (RFC 4342, sec. 8.5) */ struct ccid3_hc_rx_sock { - struct tfrc_rx_info ccid3hcrx_tfrc; -#define ccid3hcrx_x_recv ccid3hcrx_tfrc.tfrcrx_x_recv -#define ccid3hcrx_rtt ccid3hcrx_tfrc.tfrcrx_rtt -#define ccid3hcrx_pccid3hcrx_tfrc.tfrcrx_p u8 ccid3hcrx_last_counter:4; enum ccid3_hc_rx_states ccid3hcrx_state:8; u32 ccid3hcrx_bytes_recv; + u32 ccid3hcrx_x_recv; + u32 ccid3hcrx_rtt; ktime_t ccid3hcrx_tstamp_last_feedback; struct tfrc_rx_hist ccid3hcrx_hist; struct tfrc_loss_hist ccid3hcrx_li_hist; -- 1.5.3.6 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/5] [CCID3]: Remove two irrelevant states in TX feedback handling
From: Gerrit Renker [EMAIL PROTECTED] * the NO_SENT state is only triggered in bidirectional mode, costing unnecessary processing. * the TERM (terminating) state is irrelevant. Signed-off-by: Gerrit Renker [EMAIL PROTECTED] Signed-off-by: Ian McDonald [EMAIL PROTECTED] Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED] --- net/dccp/ccids/ccid3.c | 173 +++- 1 files changed, 84 insertions(+), 89 deletions(-) diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c index 90e0454..1b1cb74 100644 --- a/net/dccp/ccids/ccid3.c +++ b/net/dccp/ccids/ccid3.c @@ -402,112 +402,107 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) if (!(DCCP_SKB_CB(skb)-dccpd_type == DCCP_PKT_ACK || DCCP_SKB_CB(skb)-dccpd_type == DCCP_PKT_DATAACK)) return; + /* ... and only in the established state */ + if (hctx-ccid3hctx_state != TFRC_SSTATE_FBACK + hctx-ccid3hctx_state != TFRC_SSTATE_NO_FBACK) + return; opt_recv = hctx-ccid3hctx_options_received; + now = ktime_get_real(); - switch (hctx-ccid3hctx_state) { - case TFRC_SSTATE_NO_FBACK: - case TFRC_SSTATE_FBACK: - now = ktime_get_real(); - - /* estimate RTT from history if ACK number is valid */ - r_sample = tfrc_tx_hist_rtt(hctx-ccid3hctx_hist, - DCCP_SKB_CB(skb)-dccpd_ack_seq, now); - if (r_sample == 0) { - DCCP_WARN(%s(%p): %s with bogus ACK-%llu\n, dccp_role(sk), sk, - dccp_packet_name(DCCP_SKB_CB(skb)-dccpd_type), - (unsigned long long)DCCP_SKB_CB(skb)-dccpd_ack_seq); - return; - } + /* Estimate RTT from history if ACK number is valid */ + r_sample = tfrc_tx_hist_rtt(hctx-ccid3hctx_hist, + DCCP_SKB_CB(skb)-dccpd_ack_seq, now); + if (r_sample == 0) { + DCCP_WARN(%s(%p): %s with bogus ACK-%llu\n, dccp_role(sk), sk, + dccp_packet_name(DCCP_SKB_CB(skb)-dccpd_type), + (unsigned long long)DCCP_SKB_CB(skb)-dccpd_ack_seq); + return; + } + + /* Update receive rate in units of 64 * bytes/second */ + hctx-ccid3hctx_x_recv = opt_recv-ccid3or_receive_rate; + hctx-ccid3hctx_x_recv = 6; - /* Update receive rate in 
units of 64 * bytes/second */ - hctx-ccid3hctx_x_recv = opt_recv-ccid3or_receive_rate; - hctx-ccid3hctx_x_recv = 6; + /* Update loss event rate (which is scaled by 1e6) */ + pinv = opt_recv-ccid3or_loss_event_rate; + if (pinv == ~0U || pinv == 0) /* see RFC 4342, 8.5 */ + hctx-ccid3hctx_p = 0; + else /* can not exceed 100% */ + hctx-ccid3hctx_p = 100 / pinv; + /* +* Validate new RTT sample and update moving average +*/ + r_sample = dccp_sample_rtt(sk, r_sample); + hctx-ccid3hctx_rtt = tfrc_ewma(hctx-ccid3hctx_rtt, r_sample, 9); - /* Update loss event rate */ - pinv = opt_recv-ccid3or_loss_event_rate; - if (pinv == ~0U || pinv == 0) /* see RFC 4342, 8.5 */ - hctx-ccid3hctx_p = 0; - else /* can not exceed 100% */ - hctx-ccid3hctx_p = 100 / pinv; + if (hctx-ccid3hctx_state == TFRC_SSTATE_NO_FBACK) { /* -* Validate new RTT sample and update moving average +* Larger Initial Windows [RFC 4342, sec. 5] */ - r_sample = dccp_sample_rtt(sk, r_sample); - hctx-ccid3hctx_rtt = tfrc_ewma(hctx-ccid3hctx_rtt, r_sample, 9); + hctx-ccid3hctx_x= rfc3390_initial_rate(sk); + hctx-ccid3hctx_t_ld = now; - if (hctx-ccid3hctx_state == TFRC_SSTATE_NO_FBACK) { - /* -* Larger Initial Windows [RFC 4342, sec. 5] -*/ - hctx-ccid3hctx_x= rfc3390_initial_rate(sk); - hctx-ccid3hctx_t_ld = now; + ccid3_update_send_interval(hctx); - ccid3_update_send_interval(hctx); + ccid3_pr_debug(%s(%p), s=%u, MSS=%u, + R_sample=%uus, X=%u\n, dccp_role(sk), + sk, hctx-ccid3hctx_s, + dccp_sk(sk)-dccps_mss_cache, r_sample, + (unsigned)(hctx-ccid3hctx_x 6)); - ccid3_pr_debug(%s(%p), s=%u, MSS=%u, - R_sample=%uus, X=%u\n, dccp_role(sk), -
[PATCH 5/5] [DCCP]: Remove unused inline function
From: Gerrit Renker [EMAIL PROTECTED]

The function follows48(), which is a special-case of dccp_delta_seqno(), is nowhere used in the DCCP code, thus removed by this patch.

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Ian McDonald [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/dccp.h |    6 --
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h
index b138e20..ebe59d9 100644
--- a/net/dccp/dccp.h
+++ b/net/dccp/dccp.h
@@ -153,12 +153,6 @@ static inline u64 max48(const u64 seq1, const u64 seq2)
 	return after48(seq1, seq2) ? seq1 : seq2;
 }
 
-/* is seq1 next seqno after seq2 */
-static inline int follows48(const u64 seq1, const u64 seq2)
-{
-	return dccp_delta_seqno(seq2, seq1) == 1;
-}
-
 enum {
 	DCCP_MIB_NUM = 0,
 	DCCP_MIB_ACTIVEOPENS,			/* ActiveOpens */
--
1.5.3.6
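For readers unfamiliar with 48-bit sequence arithmetic, the relationship between the removed helper and dccp_delta_seqno() can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel code: delta_seqno48() is a hypothetical stand-in for dccp_delta_seqno(), assumed to return the signed distance from its first argument to its second modulo 2^48.

```c
#include <assert.h>
#include <stdint.h>

#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

/*
 * Hypothetical stand-in for dccp_delta_seqno(): signed distance from
 * seqno1 to seqno2 in 48-bit circular sequence space.
 */
static int64_t delta_seqno48(uint64_t seqno1, uint64_t seqno2)
{
	uint64_t delta = (seqno2 - seqno1) & SEQ48_MASK;

	/* sign-extend bit 47 so that "behind" shows up as negative */
	if (delta & (UINT64_C(1) << 47))
		delta |= ~SEQ48_MASK;
	return (int64_t)delta;
}

/* is seq1 the next seqno after seq2 (same test as the removed helper) */
static int follows48(uint64_t seq1, uint64_t seq2)
{
	return delta_seqno48(seq2, seq1) == 1;
}
```

The wraparound case is the interesting one: sequence number 0 follows 0xFFFFFFFFFFFF, which a plain seq1 == seq2 + 1 comparison on 64-bit values would miss.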
[PATCH 4/5] [CCID3]: Nofeedback timer according to rfc3448bis
From: Gerrit Renker [EMAIL PROTECTED]

This implements the changes to the nofeedback timer handling suggested in draft rfc3448bis00, section 4.4. In particular, these changes mean:

 * better handling of the lossless case (p == 0)
 * the timestamp for computing t_ld becomes obsolete
 * much more recent document (RFC 3448 is almost 5 years old)
 * concepts in rfc3448bis arose from a real, working implementation (cf. sec. 12)

Signed-off-by: Gerrit Renker [EMAIL PROTECTED]
Signed-off-by: Ian McDonald [EMAIL PROTECTED]
Signed-off-by: Arnaldo Carvalho de Melo [EMAIL PROTECTED]
---
 net/dccp/ccids/ccid3.c | 63 ++--
 1 files changed, 29 insertions(+), 34 deletions(-)

diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c
index 7c8e9ad..00b5f11 100644
--- a/net/dccp/ccids/ccid3.c
+++ b/net/dccp/ccids/ccid3.c
@@ -131,12 +131,11 @@ static u32 ccid3_hc_tx_idle_rtt(struct ccid3_hc_tx_sock *hctx, ktime_t now)
  *
  */
 static void ccid3_hc_tx_update_x(struct sock *sk, ktime_t *stamp)
-
 {
 	struct ccid3_hc_tx_sock *hctx = ccid3_hc_tx_sk(sk);
 	__u64 min_rate = 2 * hctx->ccid3hctx_x_recv;
 	const __u64 old_x = hctx->ccid3hctx_x;
-	ktime_t now = stamp? *stamp : ktime_get_real();
+	ktime_t now = stamp ? 
*stamp : ktime_get_real(); /* * Handle IDLE periods: do not reduce below RFC3390 initial sending rate @@ -230,27 +229,27 @@ static void ccid3_hc_tx_no_feedback_timer(unsigned long data) ccid3_pr_debug(%s(%p, state=%s) - entry \n, dccp_role(sk), sk, ccid3_tx_state_name(hctx-ccid3hctx_state)); - switch (hctx-ccid3hctx_state) { - case TFRC_SSTATE_NO_FBACK: - /* RFC 3448, 4.4: Halve send rate directly */ + if (hctx-ccid3hctx_state == TFRC_SSTATE_FBACK) + ccid3_hc_tx_set_state(sk, TFRC_SSTATE_NO_FBACK); + else if (hctx-ccid3hctx_state != TFRC_SSTATE_NO_FBACK) + goto out; + + /* +* Determine new allowed sending rate X as per draft rfc3448bis-00, 4.4 +*/ + if (hctx-ccid3hctx_t_rto == 0 || /* no feedback received yet */ + hctx-ccid3hctx_p == 0) { + + /* halve send rate directly */ hctx-ccid3hctx_x = max(hctx-ccid3hctx_x / 2, (((__u64)hctx-ccid3hctx_s) 6) / TFRC_T_MBI); - - ccid3_pr_debug(%s(%p, state=%s), updated tx rate to %u - bytes/s\n, dccp_role(sk), sk, - ccid3_tx_state_name(hctx-ccid3hctx_state), - (unsigned)(hctx-ccid3hctx_x 6)); - /* The value of R is still undefined and so we can not recompute -* the timout value. Keep initial value as per [RFC 4342, 5]. 
*/ - t_nfb = TFRC_INITIAL_TIMEOUT; ccid3_update_send_interval(hctx); - break; - case TFRC_SSTATE_FBACK: + } else { /* -* Modify the cached value of X_recv [RFC 3448, 4.4] +* Modify the cached value of X_recv * -* If (p == 0 || X_calc 2 * X_recv) +* If (X_calc 2 * X_recv) *X_recv = max(X_recv / 2, s / (2 * t_mbi)); * Else *X_recv = X_calc / 4; @@ -259,32 +258,28 @@ static void ccid3_hc_tx_no_feedback_timer(unsigned long data) */ BUG_ON(hctx-ccid3hctx_p !hctx-ccid3hctx_x_calc); - if (hctx-ccid3hctx_p == 0 || - (hctx-ccid3hctx_x_calc (hctx-ccid3hctx_x_recv 5))) { - + if (hctx-ccid3hctx_x_calc (hctx-ccid3hctx_x_recv 5)) hctx-ccid3hctx_x_recv = max(hctx-ccid3hctx_x_recv / 2, (((__u64)hctx-ccid3hctx_s) 6) / (2 * TFRC_T_MBI)); - } else { + else { hctx-ccid3hctx_x_recv = hctx-ccid3hctx_x_calc; hctx-ccid3hctx_x_recv = 4; } - /* Now recalculate X [RFC 3448, 4.3, step (4)] */ ccid3_hc_tx_update_x(sk, NULL); - /* -* Schedule no feedback timer to expire in -* max(t_RTO, 2 * s/X) = max(t_RTO, 2 * t_ipi) -* See comments in packet_recv() regarding the value of t_RTO. -*/ - t_nfb = max(hctx-ccid3hctx_t_rto, 2 * hctx-ccid3hctx_t_ipi); - break; - case TFRC_SSTATE_NO_SENT: - DCCP_BUG(%s(%p) - Illegal state NO_SENT, dccp_role(sk), sk); - /* fall through */ - case TFRC_SSTATE_TERM: - goto out; } + ccid3_pr_debug(Reduced X to %llu/64
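The rule this patch implements for the nofeedback timer can be summarised in a small userspace sketch. Several things here are assumptions rather than kernel code: the struct and function names are hypothetical, rates are plain bytes per second (the kernel keeps them scaled by 64), T_MBI is taken to be 64 seconds, and the comparison operator that the archive dropped from the quoted comment is assumed to be ">" as in RFC 3448, 4.4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define T_MBI 64u	/* assumed maximum back-off interval, in seconds */

/* Hypothetical, unscaled view of the sender state used below. */
struct tx_state {
	uint64_t x;		/* allowed sending rate, bytes/s */
	uint64_t x_recv;	/* cached receive rate, bytes/s */
	uint64_t x_calc;	/* rate from the throughput equation */
	uint32_t s;		/* packet size, bytes */
	uint32_t p;		/* loss event rate, 0 means lossless */
	int have_feedback;	/* 0 until the first feedback packet */
};

static uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/* Sketch of the decision taken when the nofeedback timer expires. */
static void nofeedback_expired(struct tx_state *tx)
{
	assert(tx != NULL);

	if (!tx->have_feedback || tx->p == 0) {
		/* no feedback yet, or lossless: halve the send rate directly */
		tx->x = max_u64(tx->x / 2, tx->s / T_MBI);
		return;
	}

	/* otherwise modify the cached X_recv */
	if (tx->x_calc > 2 * tx->x_recv)
		tx->x_recv = max_u64(tx->x_recv / 2, tx->s / (2 * T_MBI));
	else
		tx->x_recv = tx->x_calc / 4;
	/* the patch then recomputes X from X_recv and X_calc; omitted here */
}
```

The visible difference from the older code is the p == 0 case: a lossless connection now has its sending rate halved directly instead of going through the X_recv adjustment.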
[0/4] DST: Distributed storage.
Distributed storage.

I'm pleased to announce the 12th release of the distributed storage subsystem (DST). DST allows forming a storage on top of local and remote nodes and combining them into a linear or mirroring setup, which in turn can be exported to remote nodes.

Short changelog:
 * new improved mirroring algorithm. This algorithm uses a sliding window approach for full resync and a write log for partial resync.
 * fixed a number of typos and debug cleanups
 * update inode size when the linear algorithm changes the size of the storage at run time
 * extended the number of sysfs files and the documentation for them
 * fixed a leak in local export node setup
 * name is 'Dancing with the smoked neutrino' now

The overall list of DST features can be found on the project's homepage:
http://tservice.net.ru/~s0mbre/old/?section=projectsitem=dst

DST is also exported as a git tree available for clone and pull from
http://tservice.net.ru/~s0mbre/archive/dst/dst.git

Interested readers can test DST with the 2.6.23 tree too (it should compile fine, but was not tested).

DST passed all FS tests in LTP with XFS (modulo the MAX_LOCK_DEPTH too low bug:

[ 8398.605691] BUG: MAX_LOCK_DEPTH too low!
[ 8398.609641] turning off the locking correctness validator.

this is not a DST problem though), but the testing was not performed with offline/online nodes.

Thank you.

Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED]
[2/4] DST: Core distributed storage files.
Core distributed storage files. Include userspace interfaces, initialization, block layer bindings and other core functionality. Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED] diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig index b4c8319..ca6592d 100644 --- a/drivers/block/Kconfig +++ b/drivers/block/Kconfig @@ -451,6 +451,8 @@ config ATA_OVER_ETH This driver provides Support for ATA over Ethernet block devices like the Coraid EtherDrive (R) Storage Blade. +source drivers/block/dst/Kconfig + source drivers/s390/block/Kconfig endmenu diff --git a/drivers/block/Makefile b/drivers/block/Makefile index dd88e33..fcf042d 100644 --- a/drivers/block/Makefile +++ b/drivers/block/Makefile @@ -29,3 +29,4 @@ obj-$(CONFIG_VIODASD) += viodasd.o obj-$(CONFIG_BLK_DEV_SX8) += sx8.o obj-$(CONFIG_BLK_DEV_UB) += ub.o +obj-$(CONFIG_DST) += dst/ diff --git a/drivers/block/dst/Kconfig b/drivers/block/dst/Kconfig new file mode 100644 index 000..67a7dad --- /dev/null +++ b/drivers/block/dst/Kconfig @@ -0,0 +1,28 @@ +config DST + tristate Distributed storage + depends on NET + select CONNECTOR + select LIBCRC32C + ---help--- + This driver allows to create a distributed storage. + +config DST_DEBUG + bool DST debug + depends on DST + ---help--- + This option will turn HEAVY debugging of the DST. + Turn it on ONLY if you have to debug some really obscure problem. + +config DST_ALG_LINEAR + tristate Linear distribution algorithm + depends on DST + ---help--- + This module allows to create linear mapping of the nodes + in the distributed storage. + +config DST_ALG_MIRROR + tristate Mirror distribution algorithm + depends on DST + ---help--- + This module allows to create a mirror of the nodes in the + distributed storage. 
diff --git a/drivers/block/dst/Makefile b/drivers/block/dst/Makefile new file mode 100644 index 000..1400e94 --- /dev/null +++ b/drivers/block/dst/Makefile @@ -0,0 +1,6 @@ +obj-$(CONFIG_DST) += dst.o + +dst-y := dcore.o kst.o + +obj-$(CONFIG_DST_ALG_LINEAR) += alg_linear.o +obj-$(CONFIG_DST_ALG_MIRROR) += alg_mirror.o diff --git a/drivers/block/dst/dcore.c b/drivers/block/dst/dcore.c new file mode 100644 index 000..423e7b2 --- /dev/null +++ b/drivers/block/dst/dcore.c @@ -0,0 +1,1622 @@ +/* + * 2007+ Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED] + * All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include linux/module.h +#include linux/kernel.h +#include linux/init.h +#include linux/blkdev.h +#include linux/bio.h +#include linux/slab.h +#include linux/connector.h +#include linux/socket.h +#include linux/dst.h +#include linux/device.h +#include linux/in.h +#include linux/in6.h +#include linux/buffer_head.h + +#include net/sock.h + +static LIST_HEAD(dst_storage_list); +static LIST_HEAD(dst_alg_list); +static DEFINE_MUTEX(dst_storage_lock); +static DEFINE_MUTEX(dst_alg_lock); +static int dst_major; +static struct kst_worker *kst_main_worker; +static struct cb_id cn_dst_id = { CN_DST_IDX, CN_DST_VAL }; + +struct kmem_cache *dst_request_cache; + +static char dst_name[] = Dancing with the smoked neutrino; + +/* + * DST sysfs tree. 
For device called 'storage' which is formed + * on top of two nodes this looks like this: + * + * /sys/devices/storage/ + * /sys/devices/storage/alg : alg_linear + * /sys/devices/storage/n-800/type : R: 192.168.4.80:1025 + * /sys/devices/storage/n-800/size : 800 + * /sys/devices/storage/n-800/start : 800 + * /sys/devices/storage/n-800/clean + * /sys/devices/storage/n-800/dirty + * /sys/devices/storage/n-0/type : R: 192.168.4.81:1025 + * /sys/devices/storage/n-0/size : 800 + * /sys/devices/storage/n-0/start : 0 + * /sys/devices/storage/n-0/clean + * /sys/devices/storage/n-0/dirty + * /sys/devices/storage/remove_all_nodes + * /sys/devices/storage/nodes : sectors (start [size]): 0 [800] | 800 [800] + * /sys/devices/storage/name : storage + */ + +static int dst_dev_match(struct device *dev, struct device_driver *drv) +{ + return 1; +} + +static void dst_dev_release(struct device *dev) +{ +} + +static struct bus_type dst_dev_bus_type = { + .name = dst, + .match = dst_dev_match, +}; + +static struct device dst_dev = { + .bus= dst_dev_bus_type, + .release= dst_dev_release +}; + +static void dst_node_release(struct device *dev) +{ +} +
[3/4] DST: Network state machine.
Network state machine. Includes network async processing state machine and related tasks. Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED] diff --git a/drivers/block/dst/kst.c b/drivers/block/dst/kst.c new file mode 100644 index 000..6d92014 --- /dev/null +++ b/drivers/block/dst/kst.c @@ -0,0 +1,1515 @@ +/* + * 2007+ Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED] + * All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include linux/kernel.h +#include linux/module.h +#include linux/list.h +#include linux/slab.h +#include linux/socket.h +#include linux/kthread.h +#include linux/net.h +#include linux/in.h +#include linux/poll.h +#include linux/bio.h +#include linux/dst.h + +#include net/sock.h + +struct kst_poll_helper +{ + poll_table pt; + struct kst_state*st; +}; + +static LIST_HEAD(kst_worker_list); +static DEFINE_MUTEX(kst_worker_mutex); + +/* + * This function creates bound socket for local export node. 
+ */ +static int kst_sock_create(struct kst_state *st, struct saddr *addr, + int type, int proto, int backlog) +{ + int err; + + err = sock_create(addr-sa_family, type, proto, st-socket); + if (err) + goto err_out_exit; + + err = st-socket-ops-bind(st-socket, (struct sockaddr *)addr, + addr-sa_data_len); + + err = st-socket-ops-listen(st-socket, backlog); + if (err) + goto err_out_release; + + st-socket-sk-sk_allocation = GFP_NOIO; + + return 0; + +err_out_release: + sock_release(st-socket); +err_out_exit: + return err; +} + +static void kst_sock_release(struct kst_state *st) +{ + if (st-socket) { + sock_release(st-socket); + st-socket = NULL; + } +} + +void kst_wake(struct kst_state *st) +{ + if (st) { + struct kst_worker *w = st-node-w; + unsigned long flags; + + spin_lock_irqsave(w-ready_lock, flags); + if (list_empty(st-ready_entry)) + list_add_tail(st-ready_entry, w-ready_list); + spin_unlock_irqrestore(w-ready_lock, flags); + + wake_up(w-wait); + } +} +EXPORT_SYMBOL_GPL(kst_wake); + +/* + * Polling machinery. + */ +static int kst_state_wake_callback(wait_queue_t *wait, unsigned mode, + int sync, void *key) +{ + struct kst_state *st = container_of(wait, struct kst_state, wait); + kst_wake(st); + return 1; +} + +static void kst_queue_func(struct file *file, wait_queue_head_t *whead, +poll_table *pt) +{ + struct kst_state *st = container_of(pt, struct kst_poll_helper, pt)-st; + + st-whead = whead; + init_waitqueue_func_entry(st-wait, kst_state_wake_callback); + add_wait_queue(whead, st-wait); +} + +static void kst_poll_exit(struct kst_state *st) +{ + if (st-whead) { + remove_wait_queue(st-whead, st-wait); + st-whead = NULL; + } +} + +/* + * This function removes request from state tree and ordering list. 
+ */ +void kst_del_req(struct dst_request *req) +{ + list_del_init(req-request_list_entry); +} +EXPORT_SYMBOL_GPL(kst_del_req); + +static struct dst_request *kst_req_first(struct kst_state *st) +{ + struct dst_request *req = NULL; + + if (!list_empty(st-request_list)) + req = list_entry(st-request_list.next, struct dst_request, + request_list_entry); + return req; +} + +/* + * This function dequeues first request from the queue and tree. + */ +static struct dst_request *kst_dequeue_req(struct kst_state *st) +{ + struct dst_request *req; + + mutex_lock(st-request_lock); + req = kst_req_first(st); + if (req) + kst_del_req(req); + mutex_unlock(st-request_lock); + return req; +} + +/* + * This function enqueues request into tree, indexed by start of the request, + * and also puts request into ordered queue. + */ +int kst_enqueue_req(struct kst_state *st, struct dst_request *req) +{ + if (unlikely(req-flags DST_REQ_CHECK_QUEUE)) { + struct dst_request *r; + + list_for_each_entry(r, st-request_list, request_list_entry) { + if (bio_rw(r-bio) != bio_rw(req-bio)) + continue; + + if (r-start = req-start + req-size) + continue; + +
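A side note for readers of the truncated hunk above: the DST_REQ_CHECK_QUEUE path appears to scan queued requests of the same direction for one whose range intersects the new request. Below is a minimal userspace sketch of that interval test, assuming half-open ranges [start, start + size) expressed in one common unit. range_overlaps() is a hypothetical name, and the second comparison is an assumption, because the patch text is cut off right after the first one.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the DST patch.
 * Returns true when [a_start, a_start + a_size) and
 * [b_start, b_start + b_size) share at least one unit.
 */
static bool range_overlaps(uint64_t a_start, uint64_t a_size,
			   uint64_t b_start, uint64_t b_size)
{
	/* a begins at or after the end of b: no overlap. */
	if (a_start >= b_start + b_size)
		return false;

	/* b begins at or after the end of a: no overlap (assumed check). */
	if (b_start >= a_start + a_size)
		return false;

	return true;
}
```

Two ranges that merely touch, such as [0, 8) and [8, 16), are treated as disjoint here.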
[1/4] DST: Distributed storage documentation.
Distributed storage documentation. Algorithms used in the system, userspace interfaces (sysfs dirs and files), design and implementation details are described here. Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED] diff --git a/Documentation/dst/algorithms.txt b/Documentation/dst/algorithms.txt new file mode 100644 index 000..1437a6a --- /dev/null +++ b/Documentation/dst/algorithms.txt @@ -0,0 +1,115 @@ +Each storage by itself is just a set of contiguous logical blocks, with +allowed number of operations. Nodes, each of which has own start and size, +are placed into storage by appropriate algorithm, which remaps +logical sector number into real node's sector. One can create +own algorithms, since DST has pluggable interface for that. +Currently mirrored and linear algorithms are supported. + +Let's briefly describe how they work. + +Linear algorithm. +Simple approach of concatenating storages into single device with +increased size is used in this algorithm. Essentially new device +has size equal to sum of sizes of underlying nodes and nodes are +placed one after another. + + /- Node 1 ---\ /-- Node 3 \ +start end start end + |==||==| + |start end | + | \--- Node 2 -/ | + | | +start end + \-- DST storage --/ + + /\ + || + || + + IO operations + + Figure 1. + 3 nodes combined into single storage using linear algorithm. + +Mirror algorithm. +In this algorithms nodes are placed under each other, so when +operation comes to the first one, it can be mirrored to all +underlying nodes. In case of reading, actual data is obtained from +the nearest node - algoritm keeps track of previous operation +and knows where it was stopped, so that subsequent seek to the +start of the new request will take the shortest time. +Writing is always mirrored to all underlying nodes. + + IO operations + || + || + \/ + +| DST storage ---| +| prev position | +|---| Node 1 | +| prev pos | +| Node 2 -|--| +|prev pos| +|---| Node 3 | + + Figure 2. 
+ 3 nodes combined into single storage using mirror algorithm. + +Each algorithm must implement number of callbacks, +which must be registered during initialization time. + +struct dst_alg_ops +{ + int (*add_node)(struct dst_node *n); + void(*del_node)(struct dst_node *n); + int (*remap)(struct dst_request *req); + int (*error)(struct kst_state *state, int err); + struct module *owner; +}; + [EMAIL PROTECTED] +This callback is invoked when new node is being added into the storage, +but before node is actually added into the storage, so that it could +be accessed from it. When it is called, all appropriate initialization +of the underlying device is already completed (system has been connected +to remote node or got a reference to the local block device). At this +stage algorithm can add node into private map. +It must return zero on success or negative value otherwise. + [EMAIL PROTECTED] +This callback is invoked when node is being deleted from the storage, +i.e. when its reference counter hits zero. It is called before +any cleaning is performed. +It must return zero on success or negative value otherwise. + [EMAIL PROTECTED] +This callback is invoked each time new bio hits the storage. +Request structure contains BIO itself, pointer to the node, which originally +stores the whole region under given IO request, and various parameters +used by storage core to process this block request. +It must return zero on success or negative value otherwise. It is upto +this method to call all cleaning if remapping failed, for example it must +call kst_bio_endio() for given callback in case of error, which in turn +will call bio_endio(). Note, that dst_request structure provided in this +callback is allocated on stack, so if there is a need to use it outside +of the given function, it must be cloned (it will happen automatically +in state's push callback, but that copy will not be shared by any other +user). 
+ [EMAIL PROTECTED] +This callback is invoked for each error, which happend when processed
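To make the linear algorithm described in algorithms.txt concrete, here is a small userspace model, offered as a sketch rather than as the DST implementation. It assumes nodes are appended one after another (each new node starts where the storage currently ends) and that remapping is a plain range lookup. The names lin_node, lin_storage, lin_add_node, lin_remap and LIN_MAX_NODES are hypothetical and chosen only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define LIN_MAX_NODES 8	/* arbitrary limit for the sketch */

struct lin_node {
	uint64_t start;	/* first logical sector served by this node */
	uint64_t size;	/* number of sectors on this node */
};

struct lin_storage {
	struct lin_node nodes[LIN_MAX_NODES];
	size_t count;
	uint64_t disk_size;	/* sum of all node sizes, in sectors */
};

/* Append a node at the current end of the storage. */
static int lin_add_node(struct lin_storage *st, uint64_t size)
{
	if (st->count == LIN_MAX_NODES)
		return -1;

	st->nodes[st->count].start = st->disk_size;
	st->nodes[st->count].size = size;
	st->count++;
	st->disk_size += size;
	return 0;
}

/*
 * Map a logical sector to a node index and a node-local sector.
 * Returns the node index, or -1 if the sector is past the end.
 */
static int lin_remap(const struct lin_storage *st, uint64_t sector,
		     uint64_t *local)
{
	size_t i;

	for (i = 0; i < st->count; i++) {
		const struct lin_node *n = &st->nodes[i];

		if (sector >= n->start && sector - n->start < n->size) {
			*local = sector - n->start;
			return (int)i;
		}
	}
	return -1;
}
```

With two nodes of 800 sectors each, logical sector 900 lands on the second node at local sector 100, which matches the "0 [800] | 800 [800]" layout shown in the sysfs example elsewhere in this series.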
[4/4] DST: Algorithms used in distributed storage.
Algorithms used in distributed storage. Mirror and linear mapping code. Signed-off-by: Evgeniy Polyakov [EMAIL PROTECTED] diff --git a/drivers/block/dst/alg_linear.c b/drivers/block/dst/alg_linear.c new file mode 100644 index 000..836764d --- /dev/null +++ b/drivers/block/dst/alg_linear.c @@ -0,0 +1,114 @@ +/* + * 2007+ Copyright (c) Evgeniy Polyakov [EMAIL PROTECTED] + * All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include linux/module.h +#include linux/kernel.h +#include linux/init.h +#include linux/dst.h + +static struct dst_alg *alg_linear; + +/* + * This callback is invoked when node is removed from storage. + */ +static void dst_linear_del_node(struct dst_node *n) +{ +} + +/* + * This callback is invoked when node is added to storage. 
+ */ +static int dst_linear_add_node(struct dst_node *n) +{ + struct dst_storage *st = n-st; + struct block_device *bdev; + + dprintk(%s: disk_size: %llu, node_size: %llu.\n, + __func__, st-disk_size, n-size); + + mutex_lock(st-tree_lock); + n-start = st-disk_size; + st-disk_size += n-size; + set_capacity(st-disk, st-disk_size); + + bdev = bdget_disk(st-disk, 0); + if (bdev) { + mutex_lock(bdev-bd_inode-i_mutex); + i_size_write(bdev-bd_inode, to_bytes(st-disk_size)); + mutex_unlock(bdev-bd_inode-i_mutex); + bdput(bdev); + } + mutex_unlock(st-tree_lock); + + return 0; +} + +static int dst_linear_remap(struct dst_request *req) +{ + int err; + + if (req-node-bdev) { + generic_make_request(req-bio); + return 0; + } + + err = kst_check_permissions(req-state, req-bio); + if (err) + return err; + + return req-state-ops-push(req); +} + +/* + * Failover callback - it is invoked each time error happens during + * request processing. + */ +static int dst_linear_error(struct kst_state *st, int err) +{ + if (err) + set_bit(DST_NODE_FROZEN, st-node-flags); + else + clear_bit(DST_NODE_FROZEN, st-node-flags); + return 0; +} + +static struct dst_alg_ops alg_linear_ops = { + .remap = dst_linear_remap, + .add_node = dst_linear_add_node, + .del_node = dst_linear_del_node, + .error = dst_linear_error, + .owner = THIS_MODULE, +}; + +static int __devinit alg_linear_init(void) +{ + alg_linear = dst_alloc_alg(alg_linear, alg_linear_ops); + if (!alg_linear) + return -ENOMEM; + + return 0; +} + +static void __devexit alg_linear_exit(void) +{ + dst_remove_alg(alg_linear); +} + +module_init(alg_linear_init); +module_exit(alg_linear_exit); + +MODULE_LICENSE(GPL); +MODULE_AUTHOR(Evgeniy Polyakov [EMAIL PROTECTED]); +MODULE_DESCRIPTION(Linear distributed algorithm.); diff --git a/drivers/block/dst/alg_mirror.c b/drivers/block/dst/alg_mirror.c new file mode 100644 index 000..c10d582 --- /dev/null +++ b/drivers/block/dst/alg_mirror.c @@ -0,0 +1,1536 @@ +/* + * 2007+ Copyright (c) Evgeniy Polyakov 
[EMAIL PROTECTED] + * All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ + +#include linux/module.h +#include linux/kernel.h +#include linux/init.h +#include linux/poll.h +#include linux/dst.h +#include linux/vmstat.h + +struct dst_write_entry +{ + int error; + u32 size; + u64 start; +}; +#define DST_LOG_ENTRIES_PER_PAGE (PAGE_SIZE/sizeof(struct dst_write_entry)) + +struct dst_mirror_node_data +{ + u64 age; + u64 num, write_idx, resync_idx; +}; + +struct dst_mirror_log +{ + unsigned intnr_pages; + struct dst_write_entry **entries; +}; + +struct dst_mirror_priv +{ + u64 resync_start, resync_size; + atomic_tresync_num; + struct completion resync_complete; +
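The DST_LOG_ENTRIES_PER_PAGE macro together with the entries array of page pointers suggests a two-level write log: a flat index selects a page and then a slot inside that page. The userspace sketch below shows one plausible way to compute that position. It is an assumption, not taken from the patch (the text is truncated before any indexing code appears), and the wrap-around behaviour, the 4096-byte page size and the names prefixed with sketch_ or SKETCH_ are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Userspace copy of the entry layout shown in the patch. */
struct sketch_write_entry {
	int error;
	uint32_t size;
	uint64_t start;
};

#define SKETCH_ENTRIES_PER_PAGE \
	(SKETCH_PAGE_SIZE / sizeof(struct sketch_write_entry))

struct sketch_log_pos {
	size_t page;	/* which page of the log */
	size_t slot;	/* which entry inside that page */
};

/*
 * Translate a flat entry index into (page, slot), wrapping around
 * once the log is full. Returns 0 on success, -1 for an empty log.
 */
static int sketch_log_locate(uint64_t idx, size_t nr_pages,
			     struct sketch_log_pos *pos)
{
	uint64_t total, wrapped;

	if (nr_pages == 0)
		return -1;

	total = (uint64_t)nr_pages * SKETCH_ENTRIES_PER_PAGE;
	wrapped = idx % total;

	pos->page = (size_t)(wrapped / SKETCH_ENTRIES_PER_PAGE);
	pos->slot = (size_t)(wrapped % SKETCH_ENTRIES_PER_PAGE);
	return 0;
}
```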
Re: ip neigh show not showing arp cache entries?
YOSHIFUJI Hideaki / 吉藤英明 wrote: In article [EMAIL PROTECTED] (at Wed, 12 Dec 2007 15:57:08 -0600), Chris Friesen [EMAIL PROTECTED] says: You may try other versions of this command http://devresources.linux-foundation.org/dev/iproute2/download/ They appear to be numbered by kernel version, and the above version is the most recent one for 2.6.14. Will more recent ones (for newer kernels) work with my kernel? It should work; if it doesn't, please make a report. Thanks. I downloaded iproute2-2.6.23 and built it for my kernel. I'm compiling for a different kernel than is actually running on the build system, so I had to add a line defining KERNEL_INCLUDE to the Makefile, and I had to add -I${KERNEL_INCLUDE} to the CFLAGS definition. Someone might want to do something about that... Anyways, the arp entry issue is still there. The arp command gives a bunch of entries: [EMAIL PROTECTED]:/root arp -n Address HWtype HWaddress Flags MaskIface 192.168.24.81ether 00:01:AF:14:E9:8A C bond2 172.24.132.2 (incomplete) bond0 172.24.136.0 ether 00:C0:8B:07:B3:7E C bond0 172.24.137.0 (incomplete) bond0 172.24.0.9 ether 00:07:E9:41:4B:B4 C bond0 10.41.18.101 ether 00:0E:0C:5E:95:BD C eth6 172.24.0.11 ether 00:03:CC:51:06:5E C bond0 172.24.132.1 ether 00:01:AF:14:E9:88 C bond0 172.24.0.15 ether 00:0E:0C:85:FD:D2 C bond0 172.24.0.3 ether 00:01:AF:14:C8:CC C bond0 172.24.0.5 ether 00:01:AF:15:E0:6A C bond0 The original ip command and the new one (/tmp/ip) both give the same results--some of the entries are missing. 
[EMAIL PROTECTED]:/root ip neigh show all 172.24.137.0 dev bond0 FAILED 172.24.0.9 dev bond0 lladdr 00:07:e9:41:4b:b4 REACHABLE 10.41.18.101 dev eth6 lladdr 00:0e:0c:5e:95:bd REACHABLE 172.24.0.11 dev bond0 lladdr 00:03:cc:51:06:5e STALE 172.24.132.1 dev bond0 lladdr 00:01:af:14:e9:88 REACHABLE 172.24.0.15 dev bond0 lladdr 00:0e:0c:85:fd:d2 STALE 172.24.0.3 dev bond0 lladdr 00:01:af:14:c8:cc REACHABLE 172.24.0.5 dev bond0 lladdr 00:01:af:15:e0:6a STALE [EMAIL PROTECTED]:/root /tmp/ip neigh show all 172.24.137.0 dev bond0 FAILED 172.24.0.9 dev bond0 lladdr 00:07:e9:41:4b:b4 REACHABLE 10.41.18.101 dev eth6 lladdr 00:0e:0c:5e:95:bd REACHABLE 172.24.0.11 dev bond0 lladdr 00:03:cc:51:06:5e STALE 172.24.132.1 dev bond0 lladdr 00:01:af:14:e9:88 REACHABLE 172.24.0.15 dev bond0 lladdr 00:0e:0c:85:fd:d2 STALE 172.24.0.3 dev bond0 lladdr 00:01:af:14:c8:cc REACHABLE 172.24.0.5 dev bond0 lladdr 00:01:af:15:e0:6a STALE However, if I specifically try to print out one of the missing entries, it shows up: [EMAIL PROTECTED]:/root /tmp/ip neigh show 192.168.24.81 192.168.24.81 dev bond2 lladdr 00:01:af:14:e9:8a REACHABLE Chris -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC] ehea: kdump support - rework
This patch adds kdump support using the new PPC crash shutdown hook to the ehea driver. The reworked implementation follows the feedback I got. The crash handler now just iterates over two simple arrays instead of handling linked lists. Further feedback will be appreciated. ehea kdump support RFC #1: http://lkml.org/lkml/2007/12/12/241 Signed-off-by: Thomas Klein [EMAIL PROTECTED] --- diff -Nurp -X dontdiff linux-2.6.24-rc5/drivers/net/ehea/ehea.h patched_kernel/drivers/net/ehea/ehea.h --- linux-2.6.24-rc5/drivers/net/ehea/ehea.h2007-12-11 04:48:43.0 +0100 +++ patched_kernel/drivers/net/ehea/ehea.h 2007-12-17 16:18:49.0 +0100 @@ -40,7 +40,7 @@ #include asm/io.h #define DRV_NAME ehea -#define DRV_VERSIONEHEA_0083 +#define DRV_VERSIONEHEA_0085 /* eHEA capability flags */ #define DLPAR_PORT_ADD_REM 1 @@ -386,6 +386,13 @@ struct ehea_port_res { #define EHEA_MAX_PORTS 16 + +#define EHEA_NUM_PORTRES_FW_HANDLES6 /* QP handle, SendCQ handle, +RecvCQ handle, EQ handle, +SendMR handle, RecvMR handle */ +#define EHEA_NUM_PORT_FW_HANDLES 1 /* EQ handle */ +#define EHEA_NUM_ADAPTER_FW_HANDLES2 /* MR handle, NEQ handle */ + struct ehea_adapter { u64 handle; struct of_device *ofdev; @@ -405,6 +412,31 @@ struct ehea_mc_list { u64 macaddr; }; +/* kdump support */ +struct ehea_fw_handle_entry { + u64 adh; /* Adapter Handle */ + u64 fwh; /* Firmware Handle */ +}; + +struct ehea_fw_handle_array { + struct ehea_fw_handle_entry *arr; + int num_entries; + struct semaphore lock; +}; + +struct ehea_bcmc_reg_entry { + u64 adh; /* Adapter Handle */ + u32 port_id; /* Logical Port Id */ + u8 reg_type; /* Registration Type */ + u64 macaddr; +}; + +struct ehea_bcmc_reg_array { + struct ehea_bcmc_reg_entry *arr; + int num_entries; + struct semaphore lock; +}; + #define EHEA_PORT_UP 1 #define EHEA_PORT_DOWN 0 #define EHEA_PHY_LINK_UP 1 diff -Nurp -X dontdiff linux-2.6.24-rc5/drivers/net/ehea/ehea_main.c patched_kernel/drivers/net/ehea/ehea_main.c --- linux-2.6.24-rc5/drivers/net/ehea/ehea_main.c 
2007-12-11 04:48:43.0 +0100 +++ patched_kernel/drivers/net/ehea/ehea_main.c 2007-12-17 16:18:49.0 +0100 @@ -35,6 +35,7 @@ #include linux/if_ether.h #include linux/notifier.h #include linux/reboot.h +#include asm/kexec.h #include net/ip.h @@ -98,8 +99,10 @@ static int port_name_cnt = 0; static LIST_HEAD(adapter_list); u64 ehea_driver_flags = 0; struct work_struct ehea_rereg_mr_task; - struct semaphore dlpar_mem_lock; +struct ehea_fw_handle_array ehea_fw_handles; +struct ehea_bcmc_reg_array ehea_bcmc_regs; + static int __devinit ehea_probe_adapter(struct of_device *dev, const struct of_device_id *id); @@ -131,6 +134,160 @@ void ehea_dump(void *adr, int len, char } } +static void ehea_update_firmware_handles(void) +{ + struct ehea_fw_handle_entry *arr = NULL; + struct ehea_adapter *adapter; + int num_adapters = 0; + int num_ports = 0; + int num_portres = 0; + int i = 0; + int num_fw_handles, k, l; + + /* Determine number of handles */ + list_for_each_entry(adapter, adapter_list, list) { + num_adapters++; + + for (k = 0; k EHEA_MAX_PORTS; k++) { + struct ehea_port *port = adapter-port[k]; + + if (!port || (port-state != EHEA_PORT_UP)) + continue; + + num_ports++; + num_portres += port-num_def_qps + port-num_add_tx_qps; + } + } + + num_fw_handles = num_adapters * EHEA_NUM_ADAPTER_FW_HANDLES + +num_ports * EHEA_NUM_PORT_FW_HANDLES + +num_portres * EHEA_NUM_PORTRES_FW_HANDLES; + + if (num_fw_handles) { + arr = kzalloc(num_fw_handles * sizeof(*arr), GFP_KERNEL); + if (!arr) + return; /* Keep the existing array */ + } else + goto out_update; + + list_for_each_entry(adapter, adapter_list, list) { + for (k = 0; k EHEA_MAX_PORTS; k++) { + struct ehea_port *port = adapter-port[k]; + + if (!port || (port-state != EHEA_PORT_UP)) + continue; + + for (l = 0; +l port-num_def_qps + port-num_add_tx_qps; +l++) { + struct ehea_port_res *pr = port-port_res[l]; + + arr[i].adh = adapter-handle; + arr[i++].fwh = pr-qp-fw_handle; +
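As a worked example of the sizing step in ehea_update_firmware_handles(), the userspace sketch below repeats the arithmetic with the three per-object constants from the patch (2 handles per adapter, 1 per active port, 6 per port resource). The function name count_fw_handles() and the SKETCH_ prefix are hypothetical; only the multipliers come from the patch.

```c
/* Multipliers as defined in the patch's ehea.h hunk. */
#define SKETCH_NUM_PORTRES_FW_HANDLES	6	/* QP, SendCQ, RecvCQ, EQ, SendMR, RecvMR */
#define SKETCH_NUM_PORT_FW_HANDLES	1	/* EQ */
#define SKETCH_NUM_ADAPTER_FW_HANDLES	2	/* MR, NEQ */

/*
 * Hypothetical helper: number of array slots needed for the given
 * counts of adapters, ports that are up, and port resources
 * (default QPs plus additional TX QPs, summed over those ports).
 */
static int count_fw_handles(int num_adapters, int num_ports, int num_portres)
{
	return num_adapters * SKETCH_NUM_ADAPTER_FW_HANDLES
	     + num_ports * SKETCH_NUM_PORT_FW_HANDLES
	     + num_portres * SKETCH_NUM_PORTRES_FW_HANDLES;
}
```

For one adapter with two ports up and eight port resources in total, this gives 1 * 2 + 2 * 1 + 8 * 6 = 52 entries.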
Re: [1/4] DST: Distributed storage documentation.
On Dec 17, 2007 4:03 PM, Evgeniy Polyakov [EMAIL PROTECTED] wrote: +++ b/Documentation/dst/sysfs.txt @@ -0,0 +1,33 @@ +This file describes sysfs files created for each storage. + +1. Per-storage files. +Each storage has its own dir /sysfs/devices/$storage_name, +2. Per-node files. +Node's files are located in /sysfs/devices/$storage_name/n-$start-$cookie As already pointed out last time, you can't reference /sys/devices/ directly, please use the path from the bus/class directory which points there. Thanks, Kay
Re: After many hours all outbound connections get stuck in SYN_SENT
Here is some additional information about this problem as requested. I ran ss -m, but no data was returned, what options should I use with ss to gather relevant information? The output of netstat -s: Ip: 1346453452 total packets received 0 forwarded 0 incoming packets discarded 1345744076 incoming packets delivered 1338284375 requests sent out 50 reassemblies required 15 packets reassembled ok 15 fragments received ok 50 fragments created Icmp: 431 ICMP messages received 0 input ICMP message failed. ICMP input histogram: destination unreachable: 42 echo requests: 6 echo replies: 377 timestamp request: 2 address mask request: 2 747 ICMP messages sent 0 ICMP messages failed ICMP output histogram: destination unreachable: 739 echo replies: 6 timestamp replies: 2 Tcp: 13115640 active connections openings 1291131 passive connection openings 381803 failed connection attempts 6445 connection resets received 148 connections established 1339571927 segments received 1330375560 segments send out 2443951 segments retransmited 345 bad segments received. 61292 resets sent Udp: 5608790 packets received 725 packets to unknown port received. 0 packet receive errors 5609766 packets sent TcpExt: 1916 resets received for embryonic SYN_RECV sockets 1290 packets pruned from receive queue because of socket buffer overrun 1250631 TCP sockets finished time wait in fast timer 43568 time wait sockets recycled by time stamp 16323 active connections rejected because of time stamp 262 packets rejects in established connections because of timestamp 18505058 delayed acks sent 3931 delayed acks further delayed because of locked socket Quick ack mode was activated 434830 times 1902 times the listen queue of a socket overflowed 1902 SYNs to LISTEN sockets ignored 1068352581 packets directly queued to recvmsg prequeue. 
92424765 packets directly received from backlog 800659035 packets directly received from prequeue 1158417138 packets header predicted 2223869 packets header predicted and directly queued to user 22256941 acknowledgments not containing data received 1109445014 predicted acknowledgments 96 times recovered from packet loss due to fast retransmit 325 times recovered from packet loss due to SACK data 1 bad SACKs received Detected reordering 8 times using FACK Detected reordering 7 times using time stamp 21 congestion windows fully recovered 29 congestion windows partially recovered using Hoe heuristic 452978 congestion windows recovered after partial ack 97 TCP data loss events 2269 timeouts after reno fast retransmit 144 timeouts after SACK recovery 12690 timeouts in loss state 731 fast retransmits 70 forward retransmits 38188 retransmits in slow start 959183 other TCP timeouts TCPRenoRecoveryFail: 67 38 sack retransmits failed 42 times receiver scheduled too late for direct processing 75627 packets collapsed in receive queue due to low socket buffer 6003 DSACKs sent for old packets 13 DSACKs sent for out of order packets 136 DSACKs received 4038 connections reset due to unexpected data 557 connections reset due to early user close 319219 connections aborted due to timeout On 12/16/07, James Nichols [EMAIL PROTECTED] wrote: Hello, I have a Java application that makes a large number of outbound webservice calls over HTTP/TCP. The hosts contacted are a fixed set of about 2000 hosts and a web service call is made to each of them approximately every 5 mintues by a pool of 200 Java threads. Over time, on average a percentage of these hosts are unreachable for one reason or another, usually because they are on wireless cell phone NICs, so there is a persistent count of sockets in the SYN_SENT state in the range of about 60-80. This is fine, as these failed connection attempts eventually time out. 
However, after approximately 38 hours of operation, all outbound connection attempts get stuck in the SYN_SENT state. It happens instantaneously, where I go from the baseline of about 60-80 sockets in SYN_SENT to a count of 200 (corresponding to the # of Java threads that make these calls). When I stop and start the Java application, all the new outbound connections still get stuck in SYN_SENT state. During this time, I am still able to SSH to the box and run wget to Google, cnn, etc, so the problem appears to be specific to the hosts that I'm accessing via the webservices. For a long time, the only thing that would resolve this was rebooting the entire machine. Once I did this, the outbound connections could be made successfully. However, very recently when I had one of these incidents I disabled tcp_sack via: echo 0 > /proc/sys/net/ipv4/tcp_sack
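The report tracks the number of sockets sitting in SYN_SENT. One way to obtain that count without netstat is to read /proc/net/tcp, where the fourth column holds the socket state in hex and 02 denotes SYN_SENT. The sketch below assumes the usual IPv4 line layout of that file. The helper names is_syn_sent_line() and count_syn_sent() are hypothetical.

```c
#include <stdio.h>

#define SKETCH_TCP_SYN_SENT 0x02	/* state code for SYN_SENT */

/*
 * Return 1 if a /proc/net/tcp line describes a socket in SYN_SENT,
 * 0 otherwise (including the header line and malformed lines).
 * Expected layout: "  sl: LOCAL:PORT REMOTE:PORT ST ..."
 */
static int is_syn_sent_line(const char *line)
{
	unsigned int state;

	if (sscanf(line, " %*d: %*[^:]:%*x %*[^:]:%*x %x", &state) != 1)
		return 0;

	return state == SKETCH_TCP_SYN_SENT;
}

/*
 * Count SYN_SENT sockets listed in the given file
 * (normally "/proc/net/tcp"). Returns -1 if it cannot be opened.
 */
static long count_syn_sent(const char *path)
{
	char line[512];
	long count = 0;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;

	while (fgets(line, sizeof(line), fp))
		count += is_syn_sent_line(line);

	fclose(fp);
	return count;
}
```

Calling count_syn_sent("/proc/net/tcp") periodically and logging the result would show whether the jump from the 60-80 baseline to 200 is really instantaneous.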
Re: [PATCH] bridge: assign random address
On Sun, 16 Dec 2007 14:29:15 -0800 Andrew Morton [EMAIL PROTECTED] wrote: On Sun, 16 Dec 2007 13:37:17 -0800 (PST) David Miller [EMAIL PROTECTED] wrote: From: Stephen Hemminger [EMAIL PROTECTED] Date: Tue, 11 Dec 2007 15:48:35 -0800 Subject: Re: [PATCH] bridge: assign random address bridge should all-caps and in brackets, No, bridge should not be in []. Lots of people's patch-receiving scripts assume that any text in [] is to be removed as the patch is committed. It contains text which is only relevant to the particular email which carried the patch. Stuff like patch and 4/5 and linux-2.6.23, etc. assign random address should be capitalized like a proper english sentence with a period at the end. Actually I usually remove the caps and the waste-of-space period, but that's much less important than the brackets abuse. The bracket convention is quite useful and I've often wondered why I need to edit the patch title when I merge up patches from net developers ;) I try to follow the title convention that Jeff was promoting. It works well because he is dealing with many different drivers. -- Stephen Hemminger [EMAIL PROTECTED]
Re: init_timer_deferrable conversion
On Mon, 17 Dec 2007 15:29:43 +0100 Eric Dumazet [EMAIL PROTECTED] wrote: On Mon, 17 Dec 2007 09:55:04 +0100 Eric Dumazet [EMAIL PROTECTED] wrote: On Sun, 16 Dec 2007 22:00:23 -0500 (EST) Parag Warudkar [EMAIL PROTECTED] wrote: In my quest to get the wake-ups from idle per second down to bare minimum, I noticed 3 places in the kernel that could benefit from using init_timer_deferrable() instead of init_timer() - a) drivers/net/sky2.c - watchdog_timer. This was showing up high on Powertop's list of things that cause routine wakeups from idle. After converting to init_timer_deferrable() the wakeups went down and this one no longer shows up in powertop's list. 25% reduction. This surprises me because it is a 1 hz timer and uses round_jiffies() in the current kernel. -- Stephen Hemminger [EMAIL PROTECTED]
Re: init_timer_deferrable conversion
On Dec 17, 2007 12:00 PM, Stephen Hemminger [EMAIL PROTECTED] wrote: a) drivers/net/sky2.c - watchdog_timer. This was showing up high on Powertop's list of things that cause routine wakeups from idle. After converting to init_timer_deferrable() the wakeups went down and this one no longer shows up in powertop's list. 25% reduction. This surprises me because it is a 1 hz timer and uses round_jiffies() in the current kernel. I am using the current git and I already have low wakeups per second to begin with - 5-7 and out of that 25% are attributed to sky2. Not sure if that matches up with the 1 hz + round_jiffies() logic. But is it conceptually ok to make this deferrable? I suppose yes as it's just a watchdog that checks if the link is up and delaying that would not make a difference? Thanks Parag
Re: ip neigh show not showing arp cache entries?
Chris Friesen wrote:

The original ip command and the new one (/tmp/ip) both give the same results--some of the entries are missing.

[EMAIL PROTECTED]:/root ip neigh show all
172.24.137.0 dev bond0 FAILED
172.24.0.9 dev bond0 lladdr 00:07:e9:41:4b:b4 REACHABLE
10.41.18.101 dev eth6 lladdr 00:0e:0c:5e:95:bd REACHABLE
172.24.0.11 dev bond0 lladdr 00:03:cc:51:06:5e STALE
172.24.132.1 dev bond0 lladdr 00:01:af:14:e9:88 REACHABLE
172.24.0.15 dev bond0 lladdr 00:0e:0c:85:fd:d2 STALE
172.24.0.3 dev bond0 lladdr 00:01:af:14:c8:cc REACHABLE
172.24.0.5 dev bond0 lladdr 00:01:af:15:e0:6a STALE

[EMAIL PROTECTED]:/root /tmp/ip neigh show all
172.24.137.0 dev bond0 FAILED
172.24.0.9 dev bond0 lladdr 00:07:e9:41:4b:b4 REACHABLE
10.41.18.101 dev eth6 lladdr 00:0e:0c:5e:95:bd REACHABLE
172.24.0.11 dev bond0 lladdr 00:03:cc:51:06:5e STALE
172.24.132.1 dev bond0 lladdr 00:01:af:14:e9:88 REACHABLE
172.24.0.15 dev bond0 lladdr 00:0e:0c:85:fd:d2 STALE
172.24.0.3 dev bond0 lladdr 00:01:af:14:c8:cc REACHABLE
172.24.0.5 dev bond0 lladdr 00:01:af:15:e0:6a STALE

However, if I specifically try to print out one of the missing entries, it shows up:

[EMAIL PROTECTED]:/root /tmp/ip neigh show 192.168.24.81
192.168.24.81 dev bond2 lladdr 00:01:af:14:e9:8a REACHABLE

From a kernel perspective there are only complete dumps; the filtering is done by iproute. So the fact that it shows them when querying specifically implies there is a bug in the iproute neighbour filter. Does it work if you omit "all" from the ip neigh show command?
[PATCH 2.6.25 1/2]S2io: Fixes to enable multiple transmit fifo support
Fixes to enable multiple transmit fifos (upto a maximum of eight). - Moved single tx_lock from struct s2io_nic to struct fifo_info. - Moved single ufo_in_band_v structure from struct s2io_nic to struct fifo_info. - Assign the respective interrupt number for the transmitting fifo in the transmit descriptor (TXD). - Added boundary checks for number of FIFOs enabled and FIFO length. Signed-off-by: Surjit Reang [EMAIL PROTECTED] Signed-off-by: Sreenivasa Honnur [EMAIL PROTECTED] Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED] --- diff -Nurp 2-0-26-10/drivers/net/s2io.c 2-0-26-15-1/drivers/net/s2io.c --- 2-0-26-10/drivers/net/s2io.c2007-12-17 22:09:00.0 +0530 +++ 2-0-26-15-1/drivers/net/s2io.c 2007-12-17 22:10:50.0 +0530 @@ -84,7 +84,7 @@ #include s2io.h #include s2io-regs.h -#define DRV_VERSION 2.0.26.10 +#define DRV_VERSION 2.0.26.15-1 /* S2io Driver name version. */ static char s2io_driver_name[] = Neterion; @@ -368,12 +368,19 @@ static void do_s2io_copy_mac_addr(struct static void s2io_vlan_rx_register(struct net_device *dev, struct vlan_group *grp) { + int i; struct s2io_nic *nic = dev-priv; - unsigned long flags; + unsigned long flags[MAX_TX_FIFOS]; + struct mac_info *mac_control = nic-mac_control; + struct config_param *config = nic-config; + + for (i = 0; i config-tx_fifo_num; i++) + spin_lock_irqsave(mac_control-fifos[i].tx_lock, flags[i]); - spin_lock_irqsave(nic-tx_lock, flags); nic-vlgrp = grp; - spin_unlock_irqrestore(nic-tx_lock, flags); + for (i = config-tx_fifo_num - 1; i = 0; i--) + spin_unlock_irqrestore(mac_control-fifos[i].tx_lock, + flags[i]); } /* A flag indicating whether 'RX_PA_CFG_STRIP_VLAN_TAG' bit is set or not */ @@ -565,6 +572,21 @@ static int init_shared_mem(struct s2io_n return -EINVAL; } + size = 0; + for (i = 0; i config-tx_fifo_num; i++) { + size = config-tx_cfg[i].fifo_len; + /* +* Legal values are from 2 to 8192 +*/ + if (size 2) { + DBG_PRINT(ERR_DBG, s2io: Invalid fifo len (%d), size); + DBG_PRINT(ERR_DBG, for fifo %d\n, i); + 
DBG_PRINT(ERR_DBG, s2io: Legal values for fifo len + are 2 to 8192\n); + return -EINVAL; + } + } + lst_size = (sizeof(struct TxD) * config-max_txds); lst_per_page = PAGE_SIZE / lst_size; @@ -639,10 +661,14 @@ static int init_shared_mem(struct s2io_n } } - nic-ufo_in_band_v = kcalloc(size, sizeof(u64), GFP_KERNEL); - if (!nic-ufo_in_band_v) - return -ENOMEM; -mem_allocated += (size * sizeof(u64)); + for (i = 0; i config-tx_fifo_num; i++) { + size = config-tx_cfg[i].fifo_len; + mac_control-fifos[i].ufo_in_band_v + = kcalloc(size, sizeof(u64), GFP_KERNEL); + if (!mac_control-fifos[i].ufo_in_band_v) + return -ENOMEM; + mem_allocated += (size * sizeof(u64)); + } /* Allocation and initialization of RXDs in Rings */ size = 0; @@ -829,7 +855,6 @@ static int init_shared_mem(struct s2io_n static void free_shared_mem(struct s2io_nic *nic) { int i, j, blk_cnt, size; - u32 ufo_size = 0; void *tmp_v_addr; dma_addr_t tmp_p_addr; struct mac_info *mac_control; @@ -850,7 +875,6 @@ static void free_shared_mem(struct s2io_ lst_per_page = PAGE_SIZE / lst_size; for (i = 0; i config-tx_fifo_num; i++) { - ufo_size += config-tx_cfg[i].fifo_len; page_num = TXD_MEM_PAGE_CNT(config-tx_cfg[i].fifo_len, lst_per_page); for (j = 0; j page_num; j++) { @@ -940,18 +964,21 @@ static void free_shared_mem(struct s2io_ } } + for (i = 0; i nic-config.tx_fifo_num; i++) { + if (mac_control-fifos[i].ufo_in_band_v) { + nic-mac_control.stats_info-sw_stat.mem_freed + += (config-tx_cfg[i].fifo_len * sizeof(u64)); + kfree(mac_control-fifos[i].ufo_in_band_v); + } + } + if (mac_control-stats_mem) { + nic-mac_control.stats_info-sw_stat.mem_freed += + mac_control-stats_mem_sz; pci_free_consistent(nic-pdev, mac_control-stats_mem_sz, mac_control-stats_mem, mac_control-stats_mem_phy); - nic-mac_control.stats_info-sw_stat.mem_freed += -
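The locking change to s2io_vlan_rx_register() in this patch can be sketched in user space roughly as follows. This is only an illustration: POSIX mutexes stand in for the kernel's spin_lock_irqsave(), and struct demo_nic, struct demo_fifo and the demo_* helpers are hypothetical names, not driver code. It takes every per-FIFO lock in ascending index order, updates the shared field, and releases the locks in descending order.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_TX_FIFOS 8	/* the patch description allows up to eight */

/* Hypothetical stand-ins for the driver's fifo_info and s2io_nic. */
struct demo_fifo {
	pthread_mutex_t tx_lock;	/* one lock per transmit FIFO */
};

struct demo_nic {
	struct demo_fifo fifos[MAX_TX_FIFOS];
	int tx_fifo_num;		/* number of FIFOs in use */
	void *vlgrp;			/* state shared by all FIFOs */
};

/* Returns 0 on success, -1 if n is outside 1..MAX_TX_FIFOS. */
int demo_nic_init(struct demo_nic *nic, int n)
{
	int i;

	if (n < 1 || n > MAX_TX_FIFOS)
		return -1;
	nic->tx_fifo_num = n;
	nic->vlgrp = NULL;
	for (i = 0; i < n; i++)
		pthread_mutex_init(&nic->fifos[i].tx_lock, NULL);
	return 0;
}

void demo_set_vlgrp(struct demo_nic *nic, void *grp)
{
	int i;

	/*
	 * Lock in ascending order every time, so two callers can never
	 * hold the same pair of locks in opposite orders.
	 */
	for (i = 0; i < nic->tx_fifo_num; i++)
		pthread_mutex_lock(&nic->fifos[i].tx_lock);

	nic->vlgrp = grp;

	/* Release in the reverse order of acquisition. */
	for (i = nic->tx_fifo_num - 1; i >= 0; i--)
		pthread_mutex_unlock(&nic->fifos[i].tx_lock);
}
```

In the kernel version each lock also saves its own interrupt state, which is why the patch turns the single flags variable into flags[MAX_TX_FIFOS].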
[PATCH 2.6.25 2/2]S2io: Fixes to enable multiple transmit fifos
Multiple transmit fifo initialization - - Assigned equal scheduling priority for all configured FIFO's. - Modularized transmit traffic interrupt initialization since it is executed in s2io_card_up and s2io_link. Enable continuous tx interrupt when link is UP and vice verse. - Enable transmit interrupts for all configured transmit fifos. - Fixed typo errors. Signed-off-by: Surjit Reang [EMAIL PROTECTED] Signed-off-by: Sreenivasa Honnur [EMAIL PROTECTED] Signed-off-by: Ramkrishna Vepa [EMAIL PROTECTED] --- diff -Nurp 2-0-26-15-1/drivers/net/s2io.c 2-0-26-15-2/drivers/net/s2io.c --- 2-0-26-15-1/drivers/net/s2io.c 2007-12-17 22:10:50.0 +0530 +++ 2-0-26-15-2/drivers/net/s2io.c 2007-12-17 22:52:13.0 +0530 @@ -84,7 +84,7 @@ #include s2io.h #include s2io-regs.h -#define DRV_VERSION 2.0.26.15-1 +#define DRV_VERSION 2.0.26.15-2 /* S2io Driver name version. */ static char s2io_driver_name[] = Neterion; @@ -1079,8 +1079,67 @@ static int s2io_print_pci_mode(struct s2 } /** + * init_tti - Initialization transmit traffic interrupt scheme + * @nic: device private variable + * @link: link status (UP/DOWN) used to enable/disable continuous + * transmit interrupts + * Description: The function configures transmit traffic interrupts + * Return Value: SUCCESS on success and + * '-1' on failure + */ + +int init_tti(struct s2io_nic *nic, int link) +{ + struct XENA_dev_config __iomem *bar0 = nic-bar0; + register u64 val64 = 0; + int i; + struct config_param *config; + + config = nic-config; + + for (i = 0; i config-tx_fifo_num; i++) { + /* +* TTI Initialization. Default Tx timer gets us about +* 250 interrupts per sec. Continuous interrupts are enabled +* by default. 
+*/ + if (nic-device_type == XFRAME_II_DEVICE) { + int count = (nic-config.bus_speed * 125)/2; + val64 = TTI_DATA1_MEM_TX_TIMER_VAL(count); + } else + val64 = TTI_DATA1_MEM_TX_TIMER_VAL(0x2078); + + val64 |= TTI_DATA1_MEM_TX_URNG_A(0xA) | + TTI_DATA1_MEM_TX_URNG_B(0x10) | + TTI_DATA1_MEM_TX_URNG_C(0x30) | + TTI_DATA1_MEM_TX_TIMER_AC_EN; + + if (use_continuous_tx_intrs (link == LINK_UP)) + val64 |= TTI_DATA1_MEM_TX_TIMER_CI_EN; + writeq(val64, bar0-tti_data1_mem); + + val64 = TTI_DATA2_MEM_TX_UFC_A(0x10) | + TTI_DATA2_MEM_TX_UFC_B(0x20) | + TTI_DATA2_MEM_TX_UFC_C(0x40) | + TTI_DATA2_MEM_TX_UFC_D(0x80); + + writeq(val64, bar0-tti_data2_mem); + + val64 = TTI_CMD_MEM_WE | TTI_CMD_MEM_STROBE_NEW_CMD | + TTI_CMD_MEM_OFFSET(i); + writeq(val64, bar0-tti_command_mem); + + if (wait_for_cmd_complete(bar0-tti_command_mem, + TTI_CMD_MEM_STROBE_NEW_CMD, S2IO_BIT_RESET) != SUCCESS) + return FAILURE; + } + + return SUCCESS; +} + +/** * init_nic - Initialization of hardware - * @nic: device peivate variable + * @nic: device private variable * Description: The function sequentially configures every block * of the H/W from their reset values. * Return Value: SUCCESS on success and @@ -1185,9 +1244,9 @@ static int init_nic(struct s2io_nic *nic for (i = 0, j = 0; i config-tx_fifo_num; i++) { val64 |= - vBIT(config-tx_cfg[i].fifo_len - 1, ((i * 32) + 19), + vBIT(config-tx_cfg[i].fifo_len - 1, ((j * 32) + 19), 13) | vBIT(config-tx_cfg[i].fifo_priority, - ((i * 32) + 5), 3); + ((j * 32) + 5), 3); if (i == (config-tx_fifo_num - 1)) { if (i % 2 == 0) @@ -1198,17 +1257,25 @@ static int init_nic(struct s2io_nic *nic case 1: writeq(val64, bar0-tx_fifo_partition_0); val64 = 0; + j = 0; break; case 3: writeq(val64, bar0-tx_fifo_partition_1); val64 = 0; + j = 0; break; case 5: writeq(val64, bar0-tx_fifo_partition_2); val64 = 0; + j = 0; break; case 7: writeq(val64, bar0-tx_fifo_partition_3); + val64 = 0; + j = 0; + break; + default: + j++;
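The switch from i to j in the init_nic() hunk is easier to see in isolation. The sketch below is a simplified user-space restatement, not the driver code: the vBIT() definition is an assumption about how the driver's macro works (fields counted from the most significant bit), and pack_tx_fifo_partitions() is a hypothetical helper. Each 64-bit partition register holds two FIFOs, so the bit offset has to be computed from the slot inside the current register (j, which is 0 or 1) and not from the global FIFO index (i). With i, the third FIFO would ask for offset 2 * 32 + 19 = 83, which is past the end of the register.

```c
#include <stdint.h>

/*
 * Assumed to mirror the driver's vBIT() helper: place a field of width
 * sz at bit offset loc, counting from the most significant bit.
 */
#define vBIT(val, loc, sz) (((uint64_t)(val)) << (64 - (loc) - (sz)))

#define TX_FIFO_PARTITIONS 4	/* tx_fifo_partition_0 .. _3 in the patch */

/* Hypothetical helper: fill regs[] from per-FIFO length and priority. */
void pack_tx_fifo_partitions(const unsigned int *fifo_len,
			     const unsigned int *fifo_priority,
			     int tx_fifo_num,
			     uint64_t regs[TX_FIFO_PARTITIONS])
{
	uint64_t val64 = 0;
	int i, j = 0;	/* j: slot (0 or 1) inside the current register */

	for (i = 0; i < TX_FIFO_PARTITIONS; i++)
		regs[i] = 0;

	for (i = 0; i < tx_fifo_num && i < 2 * TX_FIFO_PARTITIONS; i++) {
		/* Offsets depend on the slot j, never on the FIFO index i. */
		val64 |= vBIT(fifo_len[i] - 1, (j * 32) + 19, 13) |
			 vBIT(fifo_priority[i], (j * 32) + 5, 3);

		if (j == 1 || i == tx_fifo_num - 1) {
			/* Register full, or last FIFO: write it out. */
			regs[i / 2] = val64;
			val64 = 0;
			j = 0;
		} else {
			j++;
		}
	}
}
```

With three FIFOs, the first two land in regs[0] and the third occupies slot 0 of regs[1]; the remaining registers stay zero.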
Re: init_timer_deferrable conversion
On Mon, 17 Dec 2007 12:47:59 -0500 Parag Warudkar [EMAIL PROTECTED] wrote:

> On Dec 17, 2007 12:00 PM, Stephen Hemminger [EMAIL PROTECTED] wrote:
> > > a) drivers/net/sky2.c - watchdog_timer. This was showing up high on
> > > Powertop's list of things that cause routine wakeups from idle. After
> > > converting to init_timer_deferrable() the wakeups went down and this
> > > one no longer shows up in powertop's list. 25% reduction.
> >
> > This surprises me because it is a 1 Hz timer and uses round_jiffies()
> > in the current kernel.
>
> I am using the current git and I already have low wakeups per second to
> begin with - 5-7 and out of that 25% are attributed to sky2. Not sure if
> that matches up with the 1 Hz + round_jiffies() logic. But is it
> conceptually ok to make this deferrable? I suppose yes as it's just a
> watchdog that checks if the link is up and delaying that would not make
> a difference?

I think you are going to wake up once a second anyway, so all it ends up changing is the accounting. Please check with the powertop developers.

I'm fine with changing sky2, but it would be good if you could go through all the network drivers and fix them as well.

-- 
Stephen Hemminger [EMAIL PROTECTED]
-- 
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: init_timer_deferrable conversion
On Dec 17, 2007 1:13 PM, Stephen Hemminger [EMAIL PROTECTED] wrote:
> On Mon, 17 Dec 2007 12:47:59 -0500 Parag Warudkar [EMAIL PROTECTED] wrote:
> > On Dec 17, 2007 12:00 PM, Stephen Hemminger [EMAIL PROTECTED] wrote:
> > > > a) drivers/net/sky2.c - watchdog_timer. This was showing up high
> > > > on Powertop's list of things that cause routine wakeups from idle.
> > > > After converting to init_timer_deferrable() the wakeups went down
> > > > and this one no longer shows up in powertop's list. 25% reduction.
> > >
> > > This surprises me because it is a 1 Hz timer and uses
> > > round_jiffies() in the current kernel.
> >
> > I am using the current git and I already have low wakeups per second
> > to begin with - 5-7 and out of that 25% are attributed to sky2. Not
> > sure if that matches up with the 1 Hz + round_jiffies() logic. But is
> > it conceptually ok to make this deferrable? I suppose yes as it's just
> > a watchdog that checks if the link is up and delaying that would not
> > make a difference?
>
> I think you are going to wake up once a second anyway, so all it ends up
> changing is the accounting. Please check with the powertop developers.

As I understand it the advantage of deferrable is that sky2 won't have to wake up the CPU just for itself. It can be coupled with other things that need to wake up the CPU. So hopefully this isn't just a powertop accounting fixup :)

> I'm fine with changing sky2, but it would be good if you could go
> through all the network drivers and fix them as well.

Arjan - if there is value in converting netdev watchdogs to deferrable from a PM perspective I will fix up other drivers as well.

Thanks
Parag
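The two positions in this thread can be compared with a toy model. The sketch below is not kernel code and does not use the kernel timer API; simulate_watchdog() and struct sim_result are hypothetical, and round_jiffies() alignment is not modelled. It assumes an otherwise idle CPU that is woken at a given, strictly increasing list of times for unrelated reasons, plus a periodic watchdog. A normal timer wakes the CPU whenever it expires; a deferrable one only runs when the CPU is already awake for something else.

```c
#include <stdbool.h>
#include <stddef.h>

struct sim_result {
	unsigned int wakeups;		/* times the idle CPU was woken */
	unsigned int watchdog_runs;	/* times the watchdog handler ran */
};

/*
 * other[] holds n strictly increasing wakeup times in ms caused by
 * unrelated work. Only events before horizon_ms are counted.
 */
struct sim_result simulate_watchdog(const unsigned int *other, size_t n,
				    unsigned int period_ms,
				    unsigned int horizon_ms, bool deferrable)
{
	struct sim_result r = { 0, 0 };
	unsigned int due = period_ms;	/* next watchdog expiry */
	size_t i = 0;

	if (period_ms == 0)
		return r;

	for (;;) {
		bool other_ok = i < n && other[i] < horizon_ms;
		/* A deferrable timer never wakes the CPU by itself. */
		bool wd_can_wake = !deferrable && due < horizon_ms;
		unsigned int now;

		if (!other_ok && !wd_can_wake)
			break;

		/* Advance to the earliest event that wakes the CPU. */
		if (other_ok && (!wd_can_wake || other[i] <= due))
			now = other[i];
		else
			now = due;

		r.wakeups++;
		if (other_ok && other[i] == now)
			i++;

		/* Once awake, run the watchdog if it is overdue and rearm. */
		if (due <= now) {
			r.watchdog_runs++;
			due = now + period_ms;
		}
	}
	return r;
}
```

With unrelated wakeups at 250, 1000, 1900 and 3100 ms, a 1000 ms watchdog and a 4000 ms horizon, the normal timer adds two wakeups of its own (at 2000 and 3000 ms), while the deferrable one adds none and simply runs late. How much of that saving survives on a real system with rounded timers is the open question raised in the thread.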
Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
The cpu alloc patches also fix this issue one way (disabling preempt) or the other (atomic instruction that does not need disabling of preemption).
[PATCH] drivers/ssb/: Spelling fixes
Signed-off-by: Joe Perches [EMAIL PROTECTED]
---
 drivers/ssb/b43_pci_bridge.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/ssb/b43_pci_bridge.c b/drivers/ssb/b43_pci_bridge.c
index f145d8a..310b84f 100644
--- a/drivers/ssb/b43_pci_bridge.c
+++ b/drivers/ssb/b43_pci_bridge.c
@@ -1,7 +1,7 @@
 /*
  * Broadcom 43xx PCI-SSB bridge module
  *
- * This technically is a seperate PCI driver module, but
+ * This technically is a separate PCI driver module, but
  * because of its small size we include it in the SSB core
  * instead of creating a standalone module.
  *
-- 
1.5.3.7.949.g2221a6
[PATCH] include/net/: Spelling fixes
Signed-off-by: Joe Perches [EMAIL PROTECTED] --- include/net/ax25.h |2 +- include/net/ip6_tunnel.h |2 +- include/net/irda/discovery.h |2 +- include/net/sctp/structs.h |6 +++--- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/include/net/ax25.h b/include/net/ax25.h index 4e3cd93..32a57e1 100644 --- a/include/net/ax25.h +++ b/include/net/ax25.h @@ -35,7 +35,7 @@ #define AX25_P_ATALK 0xca/* Appletalk */ #define AX25_P_ATALK_ARP 0xcb/* Appletalk ARP */ #define AX25_P_IP 0xcc/* ARPA Internet Protocol */ -#define AX25_P_ARP 0xcd/* ARPA Adress Resolution */ +#define AX25_P_ARP 0xcd/* ARPA Address Resolution*/ #define AX25_P_FLEXNET 0xce/* FlexNet*/ #define AX25_P_NETROM 0xcf/* NET/ROM*/ #define AX25_P_TEXT0xF0/* No layer 3 protocol impl. */ diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h index 29c9da7..c17fa1f 100644 --- a/include/net/ip6_tunnel.h +++ b/include/net/ip6_tunnel.h @@ -23,7 +23,7 @@ struct ip6_tnl { struct net_device *dev; /* virtual device associated with tunnel */ struct net_device_stats stat; /* statistics for tunnel device */ int recursion; /* depth of hard_start_xmit recursion */ - struct ip6_tnl_parm parms; /* tunnel configuration paramters */ + struct ip6_tnl_parm parms; /* tunnel configuration parameters */ struct flowi fl;/* flowi template for xmit */ struct dst_entry *dst_cache;/* cached dst */ u32 dst_cookie; diff --git a/include/net/irda/discovery.h b/include/net/irda/discovery.h index eb0f9de..e4efad1 100644 --- a/include/net/irda/discovery.h +++ b/include/net/irda/discovery.h @@ -80,7 +80,7 @@ typedef struct discovery_t { irda_queue_tq; /* Must be first! 
*/ discinfo_t data; /* Basic discovery information */ - int name_len; /* Lenght of nickname */ + int name_len; /* Length of nickname */ LAP_REASON condition; /* More info about the discovery */ int gen_addr_bit; /* Need to generate a new device diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h index 002a00a..bb96574 100644 --- a/include/net/sctp/structs.h +++ b/include/net/sctp/structs.h @@ -301,7 +301,7 @@ struct sctp_sock { /* The default SACK delay timeout for new associations. */ __u32 sackdelay; - /* Flags controling Heartbeat, SACK delay, and Path MTU Discovery. */ + /* Flags controlling Heartbeat, SACK delay, and Path MTU Discovery. */ __u32 param_flags; struct sctp_initmsg initmsg; @@ -955,7 +955,7 @@ struct sctp_transport { /* PMTU : The current known path MTU. */ __u32 pathmtu; - /* Flags controling Heartbeat, SACK delay, and Path MTU Discovery. */ + /* Flags controlling Heartbeat, SACK delay, and Path MTU Discovery. */ __u32 param_flags; /* The number of times INIT has been sent on this transport. */ @@ -1638,7 +1638,7 @@ struct sctp_association { */ __u32 pathmtu; - /* Flags controling Heartbeat, SACK delay, and Path MTU Discovery. */ + /* Flags controlling Heartbeat, SACK delay, and Path MTU Discovery. */ __u32 param_flags; /* SACK delay timeout */ -- 1.5.3.7.949.g2221a6 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] net/dccp/: Spelling fixes
Signed-off-by: Joe Perches [EMAIL PROTECTED] --- net/dccp/ackvec.h |2 +- net/dccp/ccids/ccid3.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/net/dccp/ackvec.h b/net/dccp/ackvec.h index 9ef0737..9671ecd 100644 --- a/net/dccp/ackvec.h +++ b/net/dccp/ackvec.h @@ -71,7 +71,7 @@ struct dccp_ackvec { * @dccpavr_ack_ackno - sequence number being acknowledged * @dccpavr_ack_ptr - pointer into dccpav_buf where this record starts * @dccpavr_ack_nonce - dccpav_ack_nonce at the time this record was sent - * @dccpavr_sent_len - lenght of the record in dccpav_buf + * @dccpavr_sent_len - length of the record in dccpav_buf */ struct dccp_ackvec_record { struct list_head dccpavr_node; diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c index 19b3358..d133416 100644 --- a/net/dccp/ccids/ccid3.c +++ b/net/dccp/ccids/ccid3.c @@ -239,7 +239,7 @@ static void ccid3_hc_tx_no_feedback_timer(unsigned long data) ccid3_tx_state_name(hctx-ccid3hctx_state), (unsigned)(hctx-ccid3hctx_x 6)); /* The value of R is still undefined and so we can not recompute -* the timout value. Keep initial value as per [RFC 4342, 5]. */ +* the timeout value. Keep initial value as per [RFC 4342, 5]. */ t_nfb = TFRC_INITIAL_TIMEOUT; ccid3_update_send_interval(hctx); break; -- 1.5.3.7.949.g2221a6 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] net/ipv6/: Spelling fixes
Signed-off-by: Joe Perches [EMAIL PROTECTED]
---
 net/ipv6/ndisc.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 67997a7..777ed73 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -612,7 +612,7 @@ void ndisc_send_rs(struct net_device *dev, struct in6_addr *saddr,
 	 * optimistic addresses, but we may send the solicitation
 	 * if we don't include the sllao. So here we check
 	 * if our address is optimistic, and if so, we
-	 * supress the inclusion of the sllao.
+	 * suppress the inclusion of the sllao.
 	 */
 	if (send_sllao) {
 		struct inet6_ifaddr *ifp = ipv6_get_ifaddr(saddr, dev, 1);
-- 
1.5.3.7.949.g2221a6
[PATCH] net/irda/: Spelling fixes
Signed-off-by: Joe Perches [EMAIL PROTECTED] --- net/irda/ircomm/ircomm_param.c |2 +- net/irda/irlan/irlan_eth.c |2 +- net/irda/irlap_frame.c |2 +- net/irda/parameters.c | 12 ++-- net/irda/wrapper.c |2 +- 5 files changed, 10 insertions(+), 10 deletions(-) diff --git a/net/irda/ircomm/ircomm_param.c b/net/irda/ircomm/ircomm_param.c index e5e4792..598dcbe 100644 --- a/net/irda/ircomm/ircomm_param.c +++ b/net/irda/ircomm/ircomm_param.c @@ -496,7 +496,7 @@ static int ircomm_param_poll(void *instance, irda_param_t *param, int get) IRDA_ASSERT(self != NULL, return -1;); IRDA_ASSERT(self-magic == IRCOMM_TTY_MAGIC, return -1;); - /* Poll parameters are always of lenght 0 (just a signal) */ + /* Poll parameters are always of length 0 (just a signal) */ if (!get) { /* Respond with DTE line settings */ ircomm_param_request(self, IRCOMM_DTE, TRUE); diff --git a/net/irda/irlan/irlan_eth.c b/net/irda/irlan/irlan_eth.c index c682207..1ab91f7 100644 --- a/net/irda/irlan/irlan_eth.c +++ b/net/irda/irlan/irlan_eth.c @@ -342,7 +342,7 @@ static void irlan_eth_set_multicast_list(struct net_device *dev) if (dev-flags IFF_PROMISC) { /* Enable promiscuous mode */ - IRDA_WARNING(Promiscous mode not implemented by IrLAN!\n); + IRDA_WARNING(Promiscuous mode not implemented by IrLAN!\n); } else if ((dev-flags IFF_ALLMULTI) || dev-mc_count HW_MAX_ADDRS) { /* Disable promiscuous mode, use normal mode. 
*/ diff --git a/net/irda/irlap_frame.c b/net/irda/irlap_frame.c index 4f37645..7c132d6 100644 --- a/net/irda/irlap_frame.c +++ b/net/irda/irlap_frame.c @@ -144,7 +144,7 @@ void irlap_send_snrm_frame(struct irlap_cb *self, struct qos_info *qos) frame-control = SNRM_CMD | PF_BIT; /* -* If we are establishing a connection then insert QoS paramerters +* If we are establishing a connection then insert QoS parameters */ if (qos) { skb_put(tx_skb, 9); /* 25 left */ diff --git a/net/irda/parameters.c b/net/irda/parameters.c index 2627dad..b23a3c7 100644 --- a/net/irda/parameters.c +++ b/net/irda/parameters.c @@ -133,7 +133,7 @@ static int irda_insert_integer(void *self, __u8 *buf, int len, __u8 pi, int err; p.pi = pi; /* In case handler needs to know */ - p.pl = type PV_MASK; /* The integer type codes the lenght as well */ + p.pl = type PV_MASK; /* The integer type codes the length as well */ p.pv.i = 0;/* Clear value */ /* Call handler for this parameter */ @@ -142,7 +142,7 @@ static int irda_insert_integer(void *self, __u8 *buf, int len, __u8 pi, return err; /* -* If parameter lenght is still 0, then (1) this is an any length +* If parameter length is still 0, then (1) this is an any length * integer, and (2) the handler function does not care which length * we choose to use, so we pick the one the gives the fewest bytes. 
*/ @@ -206,11 +206,11 @@ static int irda_extract_integer(void *self, __u8 *buf, int len, __u8 pi, { irda_param_t p; int n = 0; - int extract_len;/* Real lenght we extract */ + int extract_len;/* Real length we extract */ int err; p.pi = pi; /* In case handler needs to know */ - p.pl = buf[1]; /* Extract lenght of value */ + p.pl = buf[1]; /* Extract length of value */ p.pv.i = 0;/* Clear value */ extract_len = p.pl; /* Default : extract all */ @@ -297,7 +297,7 @@ static int irda_extract_string(void *self, __u8 *buf, int len, __u8 pi, IRDA_DEBUG(2, %s()\n, __FUNCTION__); p.pi = pi; /* In case handler needs to know */ - p.pl = buf[1]; /* Extract lenght of value */ + p.pl = buf[1]; /* Extract length of value */ IRDA_DEBUG(2, %s(), pi=%#x, pl=%d\n, __FUNCTION__, p.pi, p.pl); @@ -339,7 +339,7 @@ static int irda_extract_octseq(void *self, __u8 *buf, int len, __u8 pi, irda_param_t p; p.pi = pi; /* In case handler needs to know */ - p.pl = buf[1]; /* Extract lenght of value */ + p.pl = buf[1]; /* Extract length of value */ /* Check if buffer is long enough for parsing */ if (len (2+p.pl)) { diff --git a/net/irda/wrapper.c b/net/irda/wrapper.c index e712867..c246983 100644 --- a/net/irda/wrapper.c +++ b/net/irda/wrapper.c @@ -238,7 +238,7 @@ async_bump(struct net_device *dev, skb_reserve(newskb, 1); if(docopy) { - /* Copy data without CRC (lenght already checked) */ + /* Copy data without CRC (length already checked) */ skb_copy_to_linear_data(newskb, rx_buff-data, rx_buff-len -
[PATCH] net/core/: Spelling fixes
Signed-off-by: Joe Perches [EMAIL PROTECTED]
---
 net/core/dev.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 26a3a3a..be9d301 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2819,7 +2819,7 @@ void dev_set_allmulti(struct net_device *dev, int inc)
 /*
  *	Upload unicast and multicast address lists to device and
  *	configure RX filtering. When the device doesn't support unicast
- *	filtering it is put in promiscous mode while unicast addresses
+ *	filtering it is put in promiscuous mode while unicast addresses
  *	are present.
  */
 void __dev_set_rx_mode(struct net_device *dev)
-- 
1.5.3.7.949.g2221a6
[PATCH] net/sched/: Spelling fixes
Signed-off-by: Joe Perches [EMAIL PROTECTED]
---
 net/sched/sch_hfsc.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 55e7e45..a6ad491 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -160,7 +160,7 @@ struct hfsc_class
 	u64	cl_vtoff;		/* inter-period cumulative vt offset */
 	u64	cl_cvtmax;		/* max child's vt in the last period */
 	u64	cl_cvtoff;		/* cumulative cvtmax of all periods */
-	u64	cl_pcvtoff;		/* parent's cvtoff at initalization
+	u64	cl_pcvtoff;		/* parent's cvtoff at initialization
 					   time */

 	struct internal_sc cl_rsc;	/* internal real-time service curve */
-- 
1.5.3.7.949.g2221a6
[PATCH] net/sctp/: Spelling fixes
Signed-off-by: Joe Perches [EMAIL PROTECTED] --- net/sctp/sm_make_chunk.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c index f487629..ed7c9e3 100644 --- a/net/sctp/sm_make_chunk.c +++ b/net/sctp/sm_make_chunk.c @@ -286,7 +286,7 @@ struct sctp_chunk *sctp_make_init(const struct sctp_association *asoc, sctp_addto_chunk(retval, sizeof(ecap_param), ecap_param); - /* Add the supported extensions paramter. Be nice and add this + /* Add the supported extensions parameter. Be nice and add this * fist before addiding the parameters for the extensions themselves */ if (num_ext) { @@ -2859,7 +2859,7 @@ struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc, chunk_len -= length; /* Skip the address parameter and store a pointer to the first -* asconf paramter. +* asconf parameter. */ length = ntohs(addr_param-v4.param_hdr.length); asconf_param = (sctp_addip_param_t *)((void *)addr_param + length); @@ -2868,7 +2868,7 @@ struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc, /* create an ASCONF_ACK chunk. * Based on the definitions of parameters, we know that the size of * ASCONF_ACK parameters are less than or equal to the twice of ASCONF -* paramters. +* parameters. */ asconf_ack = sctp_make_asconf_ack(asoc, serial, chunk_len * 2); if (!asconf_ack) @@ -3062,7 +3062,7 @@ int sctp_process_asconf_ack(struct sctp_association *asoc, asconf_len -= length; /* Skip the address parameter in the last asconf sent and store a -* pointer to the first asconf paramter. +* pointer to the first asconf parameter. */ length = ntohs(addr_param-v4.param_hdr.length); asconf_param = (sctp_addip_param_t *)((void *)addr_param + length); -- 1.5.3.7.949.g2221a6 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] net/netlabel/: Spelling fixes
Signed-off-by: Joe Perches [EMAIL PROTECTED]
---
 net/netlabel/netlabel_mgmt.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/netlabel/netlabel_mgmt.c b/net/netlabel/netlabel_mgmt.c
index 5648337..9c41464 100644
--- a/net/netlabel/netlabel_mgmt.c
+++ b/net/netlabel/netlabel_mgmt.c
@@ -71,7 +71,7 @@ static const struct nla_policy netlbl_mgmt_genl_policy[NLBL_MGMT_A_MAX + 1] = {
 };

 /*
- * NetLabel Misc Managment Functions
+ * NetLabel Misc Management Functions
  */

 /**
-- 
1.5.3.7.949.g2221a6
[PATCH] net/netfilter/: Spelling fixes
Signed-off-by: Joe Perches [EMAIL PROTECTED]
---
 net/netfilter/nf_conntrack_sip.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index 8f8b5a4..515abff 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -187,7 +187,7 @@ static const struct sip_header_nfo ct_sip_hdrs[] = {
 	}
 };

-/* get line lenght until first CR or LF seen. */
+/* get line length until first CR or LF seen. */
 int ct_sip_lnlen(const char *line, const char *limit)
 {
 	const char *k = line;
@@ -236,7 +236,7 @@ static int digits_len(struct nf_conn *ct, const char *dptr,
 	return len;
 }

-/* get digits lenght, skiping blank spaces. */
+/* get digits length, skipping blank spaces. */
 static int skp_digits_len(struct nf_conn *ct, const char *dptr,
 			  const char *limit, int *shift)
 {
-- 
1.5.3.7.949.g2221a6
Re: [PATCH 3/3] [UDP6]: Counter increment on BH mode
On Sun, 16 Dec 2007, Herbert Xu wrote:

> If we can get the address of the per-cpu counter against some sort of a
> per-cpu base pointer, e.g., %gs on x86, then we can do
>
>	incq	%gs:(%rax)
>
> where %rax would be the offset with %gs as the base. This would obviate
> the need for the CPU ID and therefore avoid disabling preemption.
>
> Hmm, wasn't Christoph working on something like that?

Yes that is what the cpu alloc patchset implements.
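For readers unfamiliar with the problem being discussed, here is a small user-space sketch of a per-CPU counter in its two-step form. All names (struct demo_percpu_counter, demo_counter_inc(), demo_counter_sum(), DEMO_NR_CPUS) are hypothetical, and the sketch deliberately does not attempt the %gs-relative instruction itself. It only shows why the CPU ID matters: the increment needs to know which slot to touch, and in a kernel the task must not migrate to another CPU between looking up that ID and performing the increment. A single instruction that addresses the slot relative to a per-CPU base register folds both steps into one, which is the point made above.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 4	/* hypothetical fixed CPU count for the sketch */

struct demo_percpu_counter {
	unsigned long slot[DEMO_NR_CPUS];	/* one slot per CPU */
};

/*
 * Two-step form: the caller has already looked up which CPU it runs on
 * and passes it in. In a kernel, preemption would have to stay disabled
 * from that lookup until this increment has finished, otherwise the
 * task could be moved and update another CPU's slot concurrently.
 */
void demo_counter_inc(struct demo_percpu_counter *c, int cpu)
{
	c->slot[cpu]++;
}

/* Readers fold all slots together; the result is a snapshot. */
unsigned long demo_counter_sum(const struct demo_percpu_counter *c)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		sum += c->slot[cpu];
	return sum;
}
```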
[PATCH] net/ipv4/: Spelling fixes
Signed-off-by: Joe Perches [EMAIL PROTECTED]
---
 net/ipv4/netfilter/nf_nat_sip.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/netfilter/nf_nat_sip.c b/net/ipv4/netfilter/nf_nat_sip.c
index 3ca9897..8996ccb 100644
--- a/net/ipv4/netfilter/nf_nat_sip.c
+++ b/net/ipv4/netfilter/nf_nat_sip.c
@@ -165,7 +165,7 @@ static int mangle_content_len(struct sk_buff *skb,

 	dataoff = ip_hdrlen(skb) + sizeof(struct udphdr);

-	/* Get actual SDP lenght */
+	/* Get actual SDP length */
 	if (ct_sip_get_info(ct, dptr, skb->len - dataoff, &matchoff,
 			    &matchlen, POS_SDP_HEADER) > 0) {

-- 
1.5.3.7.949.g2221a6
Re: ip neigh show not showing arp cache entries?
Patrick McHardy wrote:
> From a kernel perspective there are only complete dumps, the filtering
> is done by iproute. So the fact that it shows them when querying
> specifically implies there is a bug in the iproute neighbour filter.
> Does it work if you omit "all" from the "ip neigh show" command?

Omitting "all" gives identical results. It is still missing entries when compared with the output of "arp".

Chris
Re: [PATCH] net/netlabel/: Spelling fixes
On Monday 17 December 2007 2:40:35 pm Joe Perches wrote:
> Signed-off-by: Joe Perches [EMAIL PROTECTED]

Thanks Joe.

Acked-by: Paul Moore [EMAIL PROTECTED]

> ---
>  net/netlabel/netlabel_mgmt.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/netlabel/netlabel_mgmt.c b/net/netlabel/netlabel_mgmt.c
> index 5648337..9c41464 100644
> --- a/net/netlabel/netlabel_mgmt.c
> +++ b/net/netlabel/netlabel_mgmt.c
> @@ -71,7 +71,7 @@ static const struct nla_policy netlbl_mgmt_genl_policy[NLBL_MGMT_A_MAX + 1] = {
>  };
>
>  /*
> - * NetLabel Misc Managment Functions
> + * NetLabel Misc Management Functions
>   */
>
>  /**

-- 
paul moore
linux security @ hp
Re: Please pull 'fixes-jgarzik' branch of wireless-2.6
On Sat, Dec 15, 2007 at 11:31:48PM -0500, John W. Linville wrote:

> Cyrill Gorcunov (2):
>       ieee80211_rate: missed unlock

>  net/mac80211/ieee80211_rate.c |    1 +

> diff --git a/net/mac80211/ieee80211_rate.c b/net/mac80211/ieee80211_rate.c
> index 7254bd6..3260a4a 100644
> --- a/net/mac80211/ieee80211_rate.c
> +++ b/net/mac80211/ieee80211_rate.c
> @@ -33,6 +33,7 @@ int ieee80211_rate_control_register(struct rate_control_ops *ops)
>  		if (!strcmp(alg->ops->name, ops->name)) {
>  			/* don't register an algorithm twice */
>  			WARN_ON(1);
> +			mutex_unlock(&rate_ctrl_mutex);
>  			return -EALREADY;
>  		}
>  	}

Crud...there is a one-line fix in here that should have gone to Dave.

Jeff, (assuming Dave ACKs it) would you mind just taking it your way along with the other posted patches? Since this is intended for 2.6.24, there should be no great maintenance hardship if it goes to Linus through your tree instead of Dave's.

Thanks,

John
-- 
John W. Linville [EMAIL PROTECTED]
Re: Please pull 'fixes-jgarzik' branch of wireless-2.6
From: John W. Linville [EMAIL PROTECTED]
Date: Mon, 17 Dec 2007 14:34:02 -0500

> On Sat, Dec 15, 2007 at 11:31:48PM -0500, John W. Linville wrote:
>
> > Cyrill Gorcunov (2):
> >       ieee80211_rate: missed unlock
>
> >  net/mac80211/ieee80211_rate.c |    1 +
>
> > diff --git a/net/mac80211/ieee80211_rate.c b/net/mac80211/ieee80211_rate.c
> > index 7254bd6..3260a4a 100644
> > --- a/net/mac80211/ieee80211_rate.c
> > +++ b/net/mac80211/ieee80211_rate.c
> > @@ -33,6 +33,7 @@ int ieee80211_rate_control_register(struct rate_control_ops *ops)
> >  		if (!strcmp(alg->ops->name, ops->name)) {
> >  			/* don't register an algorithm twice */
> >  			WARN_ON(1);
> > +			mutex_unlock(&rate_ctrl_mutex);
> >  			return -EALREADY;
> >  		}
> >  	}
>
> Crud...there is a one-line fix in here that should have gone to Dave.
>
> Jeff, (assuming Dave ACKs it) would you mind just taking it your way
> along with the other posted patches? Since this is intended for 2.6.24,
> there should be no great maintenance hardship if it goes to Linus
> through your tree instead of Dave's.

It's totally fine if Jeff takes this, don't worry so much about it :-)

And yes the change looks fine to me too :)
Re: Please pull 'fixes-jgarzik' branch of wireless-2.6
John W. Linville wrote:
> On Sat, Dec 15, 2007 at 11:31:48PM -0500, John W. Linville wrote:
>
> > Cyrill Gorcunov (2):
> >       ieee80211_rate: missed unlock
>
> >  net/mac80211/ieee80211_rate.c |    1 +
>
> > diff --git a/net/mac80211/ieee80211_rate.c b/net/mac80211/ieee80211_rate.c
> > index 7254bd6..3260a4a 100644
> > --- a/net/mac80211/ieee80211_rate.c
> > +++ b/net/mac80211/ieee80211_rate.c
> > @@ -33,6 +33,7 @@ int ieee80211_rate_control_register(struct rate_control_ops *ops)
> >  		if (!strcmp(alg->ops->name, ops->name)) {
> >  			/* don't register an algorithm twice */
> >  			WARN_ON(1);
> > +			mutex_unlock(&rate_ctrl_mutex);
> >  			return -EALREADY;
> >  		}
> >  	}
>
> Crud...there is a one-line fix in here that should have gone to Dave.
>
> Jeff, (assuming Dave ACKs it) would you mind just taking it your way
> along with the other posted patches? Since this is intended for 2.6.24,
> there should be no great maintenance hardship if it goes to Linus
> through your tree instead of Dave's.

Will do...

	Jeff
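The one-line fix discussed in this thread addresses a common bug class: an early return that leaves a lock held. A minimal user-space sketch of the same shape follows, assuming a POSIX mutex in place of the kernel mutex; demo_rate_control_register(), the fixed-size table and DEMO_MAX_ALGS are hypothetical and only loosely modelled on the quoted hunk. Without the unlock on the duplicate path, the first rejected registration would leave the mutex locked and every later call would block forever.

```c
#include <errno.h>
#include <pthread.h>
#include <string.h>

#define DEMO_MAX_ALGS 8	/* hypothetical table size */

static pthread_mutex_t demo_rate_ctrl_mutex = PTHREAD_MUTEX_INITIALIZER;
static const char *demo_algs[DEMO_MAX_ALGS];
static int demo_nr_algs;

/* name must stay valid for the lifetime of the table (e.g. a literal). */
int demo_rate_control_register(const char *name)
{
	int i;

	pthread_mutex_lock(&demo_rate_ctrl_mutex);

	for (i = 0; i < demo_nr_algs; i++) {
		if (!strcmp(demo_algs[i], name)) {
			/* don't register an algorithm twice */
			pthread_mutex_unlock(&demo_rate_ctrl_mutex);
			return -EALREADY;
		}
	}

	if (demo_nr_algs == DEMO_MAX_ALGS) {
		pthread_mutex_unlock(&demo_rate_ctrl_mutex);
		return -ENOMEM;
	}

	demo_algs[demo_nr_algs++] = name;
	pthread_mutex_unlock(&demo_rate_ctrl_mutex);
	return 0;
}
```

An alternative that makes this kind of slip harder is a single exit label that always unlocks, with every error path jumping to it; the sketch keeps the per-path unlock because that is the form the patch uses.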
Re: [RFT] tehuti: napi fix
On Sun, 16 Dec 2007 13:38:33 -0800 (PST) David Miller [EMAIL PROTECTED] wrote:

> From: Stephen Hemminger [EMAIL PROTECTED]
> Date: Wed, 12 Dec 2007 13:58:52 -0800
>
> > This should fix the tehuti napi fence post problems by getting rid of
> > priv->napi_stop, and setting weight to 32 (like other 10G). Also, used
> > the weird entry/exit macros like rest of driver.
>
> It fixes the fence-post problem, but like the comments you removed
> explain:
>
> > -	/* from time to time we exit to let NAPI layer release
> > -	 * device lock and allow waiting tasks (eg rmmod) to advance) */
> > -	priv->napi_stop = 0;
> > -
>
> We now hang on rmmod during constant packet load.
>
> This change just trades one bug for another, we have to get the device
> close issue sorted out before we can go around removing these things.

Well the napi_stop had the same effect as having a smaller weight value, so my patch just shrunk the weight. That causes the device to exit NAPI (and should solve the rmmod problem).

-- 
Stephen Hemminger [EMAIL PROTECTED]
Re: [PATCH] [NET][POWERPC] ucc_geth: really fix section mismatch
Anton Vorontsov wrote:

Commit ed7e63a51d46e835422d89c687b8a3e419a4212a tried to fix a section mismatch:

WARNING: vmlinux.o(.init.text+0x17278): Section mismatch: reference to .exit.text:uec_mdio_exit (between 'ucc_geth_init' and 'uec_mdio_init')

But that mismatch still happens. This patch actually fixes the section mismatch by removing __exit from the header file.

Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
---
 drivers/net/ucc_geth_mii.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ucc_geth_mii.h b/drivers/net/ucc_geth_mii.h
index d834370..1e45b20 100644
--- a/drivers/net/ucc_geth_mii.h
+++ b/drivers/net/ucc_geth_mii.h
@@ -96,5 +96,5 @@ enum enet_tbi_mii_reg {
 int uec_mdio_read(struct mii_bus *bus, int mii_id, int regnum);
 int uec_mdio_write(struct mii_bus *bus, int mii_id, int regnum, u16 value);
 int __init uec_mdio_init(void);
-void __exit uec_mdio_exit(void);

applied #upstream-fixes
Re: Please pull 'fixes-jgarzik' branch of wireless-2.6
John W. Linville wrote:

Jeff,

A few more fixes for 2.6.24...let me know if there are any problems!

Thanks,

John

P.S. The zd1211rw patch is already in netdev-2.6#upstream, but it belongs in 2.6.24 as well.

---

Individual patches available here:

	http://www.kernel.org/pub/linux/kernel/people/linville/wireless-2.6/fixes-jgarzik

---

The following changes since commit 82d29bf6dc7317aeb0a3a13c2348ca8591965875:
  Linus Torvalds (1):
        Linux 2.6.24-rc5

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git fixes-jgarzik

Adrian Bunk (1):
      wireless/ipw2200.c: add __dev{init,exit} annotations

Andrew Morton (1):
      bcm43xx_debugfs sscanf fix

Cyrill Gorcunov (2):
      ieee80211_rate: missed unlock
      iwlwifi3945/4965: fix rate control algo reference leak

Dan Williams (1):
      libertas: select WIRELESS_EXT

Larry Finger (1):
      b43: Fix rfkill radio LED

Stefano Brivio (1):
      libertas: add Dan Williams as maintainer

Ulrich Kunitz (1):
      zd1211rw: Fix alignment problems

Zhu Yi (1):
      iwlwifi: fix rf_kill state inconsistent during suspend and resume

 MAINTAINERS                                    |    6
 drivers/net/wireless/Kconfig                   |    1 +
 drivers/net/wireless/b43/leds.c                |    4 ++
 drivers/net/wireless/b43/main.c                |   22 +++---
 drivers/net/wireless/b43/rfkill.c              |   37 ---
 drivers/net/wireless/bcm43xx/bcm43xx_debugfs.c |    2 +-
 drivers/net/wireless/ipw2200.c                 |    7 ++--
 drivers/net/wireless/iwlwifi/iwl3945-base.c    |    5 ++-
 drivers/net/wireless/iwlwifi/iwl4965-base.c    |    5 ++-
 drivers/net/wireless/zd1211rw/zd_mac.c         |   10 +-
 net/mac80211/ieee80211_rate.c                  |    1 +
 11 files changed, 76 insertions(+), 24 deletions(-)

pulled
Re: [PATCH] net/sctp/: Spelling fixes
Joe Perches wrote:

Signed-off-by: Joe Perches [EMAIL PROTECTED]

Thanks... I am surprised this is all you found :)

ACK.

-vlad

---
 net/sctp/sm_make_chunk.c |    8
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index f487629..ed7c9e3 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -286,7 +286,7 @@ struct sctp_chunk *sctp_make_init(const struct sctp_association *asoc,
 	sctp_addto_chunk(retval, sizeof(ecap_param), &ecap_param);
 
-	/* Add the supported extensions paramter. Be nice and add this
+	/* Add the supported extensions parameter. Be nice and add this
 	 * fist before addiding the parameters for the extensions themselves
 	 */
 	if (num_ext) {
@@ -2859,7 +2859,7 @@ struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc,
 	chunk_len -= length;
 
 	/* Skip the address parameter and store a pointer to the first
-	 * asconf paramter.
+	 * asconf parameter.
 	 */
 	length = ntohs(addr_param->v4.param_hdr.length);
 	asconf_param = (sctp_addip_param_t *)((void *)addr_param + length);
@@ -2868,7 +2868,7 @@ struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc,
 	/* create an ASCONF_ACK chunk.
 	 * Based on the definitions of parameters, we know that the size of
 	 * ASCONF_ACK parameters are less than or equal to the twice of ASCONF
-	 * paramters.
+	 * parameters.
 	 */
 	asconf_ack = sctp_make_asconf_ack(asoc, serial, chunk_len * 2);
 	if (!asconf_ack)
@@ -3062,7 +3062,7 @@ int sctp_process_asconf_ack(struct sctp_association *asoc,
 	asconf_len -= length;
 
 	/* Skip the address parameter in the last asconf sent and store a
-	 * pointer to the first asconf paramter.
+	 * pointer to the first asconf parameter.
 	 */
 	length = ntohs(addr_param->v4.param_hdr.length);
 	asconf_param = (sctp_addip_param_t *)((void *)addr_param + length);
Re: [PATCH] drivers/net/: Spelling fixes
On Mon, 2007-12-17 at 21:56 +0100, Stefano Brivio wrote:

On Mon, 17 Dec 2007 11:40:08 -0800 Joe Perches [EMAIL PROTECTED] wrote:

diff --git a/drivers/net/ucc_geth_ethtool.c b/drivers/net/ucc_geth_ethtool.c
index 9a9622c..f8d319b 100644
--- a/drivers/net/ucc_geth_ethtool.c
+++ b/drivers/net/ucc_geth_ethtool.c
@@ -7,7 +7,7 @@
  *
  * Limitation:
  * Can only get/set setttings of the first queue.
                        ^^^

Good eyes... Unrelated to what I changed too.

cheers, Joe

Signed-off-by: Joe Perches [EMAIL PROTECTED]
---
diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
index 121cb10..cdfb2b0 100644
--- a/drivers/net/s2io.c
+++ b/drivers/net/s2io.c
@@ -6823,8 +6823,8 @@ static void do_s2io_card_down(struct s2io_nic * sp, int do_io)
 	while(do_io) {
 		/* As per the HW requirement we need to replenish the
 		 * receive buffer to avoid the ring bump. Since there is
-		 * no intention of processing the Rx frame at this pointwe are
-		 * just settting the ownership bit of rxd in Each Rx
+		 * no intention of processing the Rx frame at this point we are
+		 * just setting the ownership bit of rxd in each Rx
 		 * ring to HW and set the appropriate buffer size
 		 * based on the ring mode
 		 */
diff --git a/drivers/net/ucc_geth_ethtool.c b/drivers/net/ucc_geth_ethtool.c
index f8d319b..3e50df8 100644
--- a/drivers/net/ucc_geth_ethtool.c
+++ b/drivers/net/ucc_geth_ethtool.c
@@ -6,7 +6,7 @@
  * Author: Li Yang [EMAIL PROTECTED]
  *
  * Limitation:
- * Can only get/set setttings of the first queue.
+ * Can only get/set settings of the first queue.
  * Need to re-open the interface manually after changing some parameters.
  *
  * This program is free software; you can redistribute it and/or modify it
Please pull 'fixes-davem' branch of wireless-2.6
Dave,

A few more small fixes for 2.6.24. Let me know if there are any problems!

Thanks,

John

---

Individual patches are available here:

	http://www.kernel.org/pub/linux/kernel/people/linville/wireless-2.6/fixes-davem

---

The following changes since commit 82d29bf6dc7317aeb0a3a13c2348ca8591965875:
  Linus Torvalds (1):
        Linux 2.6.24-rc5

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git fixes-davem

Cyrill Gorcunov (1):
      NET: mac80211: fix inappropriate memory freeing

Johannes Berg (1):
      mac80211: fix header ops

Michael Wu (1):
      mac80211: Drop out of associated state if link is lost

 net/mac80211/ieee80211.c      |    1 -
 net/mac80211/ieee80211_rate.c |    2 +-
 net/mac80211/ieee80211_sta.c  |    8 ++--
 3 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/net/mac80211/ieee80211.c b/net/mac80211/ieee80211.c
index 505af1f..6378850 100644
--- a/net/mac80211/ieee80211.c
+++ b/net/mac80211/ieee80211.c
@@ -427,7 +427,6 @@ static const struct header_ops ieee80211_header_ops = {
 void ieee80211_if_setup(struct net_device *dev)
 {
 	ether_setup(dev);
-	dev->header_ops = &ieee80211_header_ops;
 	dev->hard_start_xmit = ieee80211_subif_start_xmit;
 	dev->wireless_handlers = &ieee80211_iw_handler_def;
 	dev->set_multicast_list = ieee80211_set_multicast_list;
diff --git a/net/mac80211/ieee80211_rate.c b/net/mac80211/ieee80211_rate.c
index 7254bd6..9f26a10 100644
--- a/net/mac80211/ieee80211_rate.c
+++ b/net/mac80211/ieee80211_rate.c
@@ -59,11 +59,11 @@ void ieee80211_rate_control_unregister(struct rate_control_ops *ops)
 	list_for_each_entry(alg, &rate_ctrl_algs, list) {
 		if (alg->ops == ops) {
 			list_del(&alg->list);
+			kfree(alg);
 			break;
 		}
 	}
 	mutex_unlock(&rate_ctrl_mutex);
-	kfree(alg);
 }
 EXPORT_SYMBOL(ieee80211_rate_control_unregister);
diff --git a/net/mac80211/ieee80211_sta.c b/net/mac80211/ieee80211_sta.c
index 16afd24..bee8080 100644
--- a/net/mac80211/ieee80211_sta.c
+++ b/net/mac80211/ieee80211_sta.c
@@ -808,12 +808,8 @@ static void ieee80211_associated(struct net_device *dev,
 		sta_info_put(sta);
 	}
 	if (disassoc) {
-		union iwreq_data wrqu;
-		memset(wrqu.ap_addr.sa_data, 0, ETH_ALEN);
-		wrqu.ap_addr.sa_family = ARPHRD_ETHER;
-		wireless_send_event(dev, SIOCGIWAP, &wrqu, NULL);
-		mod_timer(&ifsta->timer, jiffies +
-				      IEEE80211_MONITORING_INTERVAL + 30 * HZ);
+		ifsta->state = IEEE80211_DISABLED;
+		ieee80211_set_associated(dev, ifsta, 0);
 	} else {
 		mod_timer(&ifsta->timer, jiffies +
 				      IEEE80211_MONITORING_INTERVAL);
--
John W. Linville [EMAIL PROTECTED]
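A note on the "inappropriate memory freeing" hunk in this pull request: when a `list_for_each_entry` loop finishes without a match, the cursor no longer points at a real entry, so freeing it after the loop is only safe if a match was found. The sketch below shows the corrected shape (free inside the matching branch) on a plain singly linked list in userspace. It is an illustration under that reading of the patch, not the mac80211 code; `toy_alg`, `toy_register`, and `toy_unregister` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_alg {
	const void *ops;          /* identity of the registered algorithm */
	struct toy_alg *next;
};

static struct toy_alg *toy_algs;  /* head of the registration list */

/* Push a new entry on the list; returns 0 on success, -1 on allocation failure. */
int toy_register(const void *ops)
{
	struct toy_alg *alg = malloc(sizeof(*alg));

	if (!alg)
		return -1;
	alg->ops = ops;
	alg->next = toy_algs;
	toy_algs = alg;
	return 0;
}

/* Returns 1 if an entry was removed and freed, 0 if ops was not found. */
int toy_unregister(const void *ops)
{
	struct toy_alg **link;

	for (link = &toy_algs; *link; link = &(*link)->next) {
		struct toy_alg *alg = *link;

		if (alg->ops == ops) {
			*link = alg->next;   /* unlink the match */
			free(alg);           /* free only the matched entry */
			return 1;
		}
	}
	return 0;                    /* nothing matched, nothing freed */
}
```

Unregistering something that was never registered is then a harmless no-op instead of a free of an unrelated address.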
Please pull 'upstream-davem' branch of wireless-2.6
Dave, A few more patches for 2.6.25... Note that there are a few one-line patches to some drivers to support a new flag used for timestamps in radiotap headers for mac80211, and a couple others related to the new scan capabilities stuff added to WEXT in order to better support hidden SSIDs for wpa_supplicant/NetworkManager. I'll CC Jeff as well... Let me know if there are any problems! Thanks, John --- Individual patches are available here: http://www.kernel.org/pub/linux/kernel/people/linville/wireless-2.6/upstream-davem --- The following changes since commit e75bf3477c0d63cdd1f49f91a90816e4360ffc23: Joe Perches (1): [PARISC]: Fix build after ipv4_is_*() changes. are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git upstream-davem Dan Williams (1): introduce WEXT scan capabilities Johannes Berg (2): mac80211: conditionally include timestamp in radiotap information wireless: make drivers include the TSF RX flag where appropriate drivers/net/wireless/b43/xmit.c|1 + drivers/net/wireless/b43legacy/xmit.c |1 + drivers/net/wireless/hostap/hostap_ioctl.c |3 ++ drivers/net/wireless/ipw2200.c |2 + drivers/net/wireless/p54common.c |1 + drivers/net/wireless/rtl8187_dev.c |1 + include/linux/wireless.h | 13 +++ include/net/mac80211.h |3 ++ net/mac80211/ieee80211_ioctl.c |2 + net/mac80211/rx.c | 48 ++- 10 files changed, 59 insertions(+), 16 deletions(-) diff --git a/drivers/net/wireless/b43/xmit.c b/drivers/net/wireless/b43/xmit.c index 0bd6f8a..77b3690 100644 --- a/drivers/net/wireless/b43/xmit.c +++ b/drivers/net/wireless/b43/xmit.c @@ -526,6 +526,7 @@ void b43_rx(struct b43_wldev *dev, struct sk_buff *skb, const void *_rxhdr) status.rate = b43_plcp_get_bitrate_cck(plcp); status.antenna = !!(phystat0 B43_RX_PHYST0_ANT); status.mactime = mactime; + status.flag |= RX_FLAG_TSFT; chanid = (chanstat B43_RX_CHAN_ID) B43_RX_CHAN_ID_SHIFT; switch (chanstat B43_RX_CHAN_PHYTYPE) { diff --git 
a/drivers/net/wireless/b43legacy/xmit.c b/drivers/net/wireless/b43legacy/xmit.c index fa1e656..b71cc94 100644 --- a/drivers/net/wireless/b43legacy/xmit.c +++ b/drivers/net/wireless/b43legacy/xmit.c @@ -532,6 +532,7 @@ void b43legacy_rx(struct b43legacy_wldev *dev, status.rate = b43legacy_plcp_get_bitrate_cck(plcp); status.antenna = !!(phystat0 B43legacy_RX_PHYST0_ANT); status.mactime = mactime; + status.flag |= RX_FLAG_TSFT; chanid = (chanstat B43legacy_RX_CHAN_ID) B43legacy_RX_CHAN_ID_SHIFT; diff --git a/drivers/net/wireless/hostap/hostap_ioctl.c b/drivers/net/wireless/hostap/hostap_ioctl.c index d8f5efc..3a57d48 100644 --- a/drivers/net/wireless/hostap/hostap_ioctl.c +++ b/drivers/net/wireless/hostap/hostap_ioctl.c @@ -1089,6 +1089,9 @@ static int prism2_ioctl_giwrange(struct net_device *dev, range-enc_capa = IW_ENC_CAPA_WPA | IW_ENC_CAPA_WPA2 | IW_ENC_CAPA_CIPHER_TKIP | IW_ENC_CAPA_CIPHER_CCMP; + if (local-sta_fw_ver = PRISM2_FW_VER(1,3,1)) + range-scan_capa = IW_SCAN_CAPA_ESSID; + return 0; } diff --git a/drivers/net/wireless/ipw2200.c b/drivers/net/wireless/ipw2200.c index 54f44e5..e30ad24 100644 --- a/drivers/net/wireless/ipw2200.c +++ b/drivers/net/wireless/ipw2200.c @@ -8901,6 +8901,8 @@ static int ipw_wx_get_range(struct net_device *dev, range-enc_capa = IW_ENC_CAPA_WPA | IW_ENC_CAPA_WPA2 | IW_ENC_CAPA_CIPHER_TKIP | IW_ENC_CAPA_CIPHER_CCMP; + range-scan_capa = IW_SCAN_CAPA_ESSID | IW_SCAN_CAPA_TYPE; + IPW_DEBUG_WX(GET Range\n); return 0; } diff --git a/drivers/net/wireless/p54common.c b/drivers/net/wireless/p54common.c index 1437db0..5f8d898 100644 --- a/drivers/net/wireless/p54common.c +++ b/drivers/net/wireless/p54common.c @@ -314,6 +314,7 @@ static void p54_rx_data(struct ieee80211_hw *dev, struct sk_buff *skb) rx_status.phymode = MODE_IEEE80211G; rx_status.antenna = hdr-antenna; rx_status.mactime = le64_to_cpu(hdr-timestamp); + rx_status.flag |= RX_FLAG_TSFT; skb_pull(skb, sizeof(*hdr)); skb_trim(skb, le16_to_cpu(hdr-len)); diff --git 
a/drivers/net/wireless/rtl8187_dev.c b/drivers/net/wireless/rtl8187_dev.c index e454ae8..b23191f 100644 --- a/drivers/net/wireless/rtl8187_dev.c +++ b/drivers/net/wireless/rtl8187_dev.c @@ -225,6 +225,7 @@ static void rtl8187_rx_cb(struct urb *urb) rx_status.channel = dev-conf.channel; rx_status.phymode = dev-conf.phymode; rx_status.mactime = le64_to_cpu(hdr-mac_time); + rx_status.flag |= RX_FLAG_TSFT; if (flags (1 13)) rx_status.flag |=
[PATCH 2.6.25 0/9]: SCTP: Update ADD-IP implementation to conform to spec
The following is a set of patches that updates the SCTP ADD-IP implementation to conform to the recently published RFC. ADD-IP is an SCTP Dynamic Address Configuration extension, whereby the two end systems can dynamically modify the address lists for a given connection. One of the applications of this is mobility. The systems exchange Address Configuration (ASCONF) and Address Configuration Acknowledgement (ASCONF-ACK) messages which contain the info. If you want more info on the operation, read RFC 5061.

The implementation in lksctp was a few years old and implemented draft-05 of the specification. So this is long overdue.

-vlad
[PATCH 2.6.25 1/9] SCTP: Discard unauthenticated ASCONF and ASCONF ACK chunks
Now that we support AUTH, discard unauthenticated ASCONF and ASCONF ACK chunks as mandated in the ADD-IP spec.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 net/sctp/sm_statefuns.c |   18 ++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 5fb8477..859be75 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -3399,6 +3399,15 @@ sctp_disposition_t sctp_sf_do_asconf(const struct sctp_endpoint *ep,
 		return sctp_sf_pdiscard(ep, asoc, type, arg, commands);
 	}
 
+	/* ADD-IP: Section 4.1.1
+	 * This chunk MUST be sent in an authenticated way by using
+	 * the mechanism defined in [I-D.ietf-tsvwg-sctp-auth]. If this chunk
+	 * is received unauthenticated it MUST be silently discarded as
+	 * described in [I-D.ietf-tsvwg-sctp-auth].
+	 */
+	if (!sctp_addip_noauth && !chunk->auth)
+		return sctp_sf_discard_chunk(ep, asoc, type, arg, commands);
+
 	/* Make sure that the ASCONF ADDIP chunk has a valid length. */
 	if (!sctp_chunk_length_valid(chunk, sizeof(sctp_addip_chunk_t)))
 		return sctp_sf_violation_chunklen(ep, asoc, type, arg,
@@ -3485,6 +3494,15 @@ sctp_disposition_t sctp_sf_do_asconf_ack(const struct sctp_endpoint *ep,
 		return sctp_sf_pdiscard(ep, asoc, type, arg, commands);
 	}
 
+	/* ADD-IP, Section 4.1.2:
+	 * This chunk MUST be sent in an authenticated way by using
+	 * the mechanism defined in [I-D.ietf-tsvwg-sctp-auth]. If this chunk
+	 * is received unauthenticated it MUST be silently discarded as
+	 * described in [I-D.ietf-tsvwg-sctp-auth].
+	 */
+	if (!sctp_addip_noauth && !asconf_ack->auth)
+		return sctp_sf_discard_chunk(ep, asoc, type, arg, commands);
+
 	/* Make sure that the ADDIP chunk has a valid length. */
 	if (!sctp_chunk_length_valid(asconf_ack, sizeof(sctp_addip_chunk_t)))
 		return sctp_sf_violation_chunklen(ep, asoc, type, arg,
--
1.5.3.5
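The check added by this patch boils down to a two-input decision, which can be written out as a small standalone function. The sketch below is illustrative only: `toy_asconf_gate` and the `toy_disposition` enum are hypothetical, and `addip_noauth` is assumed to mirror the backward-compatibility switch the patch calls `sctp_addip_noauth`.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_disposition { TOY_PROCESS, TOY_DISCARD };

/*
 * addip_noauth: hypothetical flag mirroring the backward-compatibility
 * switch named sctp_addip_noauth in the patch.
 * chunk_authenticated: whether the chunk arrived covered by AUTH.
 */
enum toy_disposition toy_asconf_gate(bool addip_noauth, bool chunk_authenticated)
{
	/* Strict mode and no authentication: silently discard. */
	if (!addip_noauth && !chunk_authenticated)
		return TOY_DISCARD;
	return TOY_PROCESS;
}
```

Only one of the four input combinations leads to a discard; with the compatibility switch on, unauthenticated chunks are processed as before.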
[PATCH 2.6.25 2/9] SCTP: Handle the wildcard ADD-IP Address parameter
The Address Parameter in the parameter list of the ASCONF chunk may be a wildcard address. In this case special processing is required. For the 'add' case, the source IP of the packet is added. In the 'del' case, all addresses except the source IP of packet are removed. In the mark primary case, the source address is marked as primary. Signed-off-by: Vlad Yasevich [EMAIL PROTECTED] --- include/net/sctp/structs.h |2 ++ net/sctp/associola.c | 17 + net/sctp/sm_make_chunk.c | 40 3 files changed, 55 insertions(+), 4 deletions(-) diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h index 002a00a..55acadc 100644 --- a/include/net/sctp/structs.h +++ b/include/net/sctp/structs.h @@ -1938,6 +1938,8 @@ void sctp_assoc_rwnd_increase(struct sctp_association *, unsigned); void sctp_assoc_rwnd_decrease(struct sctp_association *, unsigned); void sctp_assoc_set_primary(struct sctp_association *, struct sctp_transport *); +void sctp_assoc_del_nonprimary_peers(struct sctp_association *, + struct sctp_transport *); int sctp_assoc_set_bind_addr_from_ep(struct sctp_association *, gfp_t); int sctp_assoc_set_bind_addr_from_cookie(struct sctp_association *, diff --git a/net/sctp/associola.c b/net/sctp/associola.c index 33ae9b0..61bebb9 100644 --- a/net/sctp/associola.c +++ b/net/sctp/associola.c @@ -730,6 +730,23 @@ struct sctp_transport *sctp_assoc_lookup_paddr( return NULL; } +/* Remove all transports except a give one */ +void sctp_assoc_del_nonprimary_peers(struct sctp_association *asoc, +struct sctp_transport *primary) +{ + struct sctp_transport *temp; + struct sctp_transport *t; + + list_for_each_entry_safe(t, temp, asoc-peer.transport_addr_list, +transports) { + /* if the current transport is not the primary one, delete it */ + if (t != primary) + sctp_assoc_rm_peer(asoc, t); + } + + return; +} + /* Engage in transport control operations. * Mark the transport up or down and send a notification to the user. * Select and update the new active and retran paths. 
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c index f487629..00598ee 100644 --- a/net/sctp/sm_make_chunk.c +++ b/net/sctp/sm_make_chunk.c @@ -2721,7 +2721,6 @@ static __be16 sctp_process_asconf_param(struct sctp_association *asoc, struct sctp_transport *peer; struct sctp_af *af; union sctp_addr addr; - struct list_head *pos; union sctp_addr_param *addr_param; addr_param = (union sctp_addr_param *) @@ -2732,8 +2731,24 @@ static __be16 sctp_process_asconf_param(struct sctp_association *asoc, return SCTP_ERROR_INV_PARAM; af-from_addr_param(addr, addr_param, htons(asoc-peer.port), 0); + + /* ADDIP 4.2.1 This parameter MUST NOT contain a broadcast +* or multicast address. +* (note: wildcard is permitted and requires special handling so +* make sure we check for that) +*/ + if (!af-is_any(addr) !af-addr_valid(addr, NULL, asconf-skb)) + return SCTP_ERROR_INV_PARAM; + switch (asconf_param-param_hdr.type) { case SCTP_PARAM_ADD_IP: + /* Section 4.2.1: +* If the address 0.0.0.0 or ::0 is provided, the source +* address of the packet MUST be added. +*/ + if (af-is_any(addr)) + memcpy(addr, asconf-source, sizeof(addr)); + /* ADDIP 4.3 D9) If an endpoint receives an ADD IP address * request and does not have the local resources to add this * new address to the association, it MUST return an Error @@ -2755,8 +2770,7 @@ static __be16 sctp_process_asconf_param(struct sctp_association *asoc, * MUST send an Error Cause TLV with the error cause set to the * new error code 'Request to Delete Last Remaining IP Address'. 
*/ - pos = asoc-peer.transport_addr_list.next; - if (pos-next == asoc-peer.transport_addr_list) + if (asoc-peer.transport_count == 1) return SCTP_ERROR_DEL_LAST_IP; /* ADDIP 4.3 D8) If a request is received to delete an IP @@ -2769,9 +2783,27 @@ static __be16 sctp_process_asconf_param(struct sctp_association *asoc, if (sctp_cmp_addr_exact(sctp_source(asconf), addr)) return SCTP_ERROR_DEL_SRC_IP; - sctp_assoc_del_peer(asoc, addr); + /* Section 4.2.2 +* If the address 0.0.0.0 or ::0 is provided, all +* addresses of the peer except the source address of the +
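The wildcard rules described in this patch can be sketched in userspace as follows: on 'add', a wildcard address is replaced by the packet's source address; on 'del', a wildcard removes every peer address except the source. This is a simplified model, not the lksctp code: addresses are plain 32-bit IPv4 values, `TOY_ANY` stands in for 0.0.0.0, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ANY 0u   /* stands in for the wildcard address 0.0.0.0 */

/* ADD: a wildcard means "add the packet's source address". */
uint32_t toy_resolve_add(uint32_t param_addr, uint32_t src)
{
	return param_addr == TOY_ANY ? src : param_addr;
}

/*
 * DELETE with a wildcard: keep only the source address.
 * Compacts peers[] in place and returns the new count.
 */
size_t toy_delete_wildcard(uint32_t *peers, size_t n, uint32_t src)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++)
		if (peers[i] == src)
			peers[kept++] = peers[i];
	return kept;
}
```

The 'mark primary' case follows the same pattern as 'add': a wildcard is resolved to the source address before the transport is looked up.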
[PATCH 2.6.25 3/9] SCTP: Add the handling of Set Primary IP Address parameter to INIT
The ADD-IP Set Primary IP Address parameter is allowed in the INIT/INIT-ACK exchange. Allow processing of this parameter during the INIT/INIT-ACK.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 include/net/sctp/structs.h |    1 +
 net/sctp/sm_make_chunk.c   |   27 +++
 2 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 55acadc..fb9b7e7 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -451,6 +451,7 @@ union sctp_params {
 	struct sctp_random_param *random;
 	struct sctp_chunks_param *chunks;
 	struct sctp_hmac_algo_param *hmac_algo;
+	struct sctp_addip_param *addip;
 };
 
 /* RFC 2960. Section 3.3.5 Heartbeat.
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 00598ee..62af33d 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1963,6 +1963,11 @@ static sctp_ierror_t sctp_verify_param(const struct sctp_association *asoc,
 	case SCTP_PARAM_SUPPORTED_EXT:
 		break;
 
+	case SCTP_PARAM_SET_PRIMARY:
+		if (sctp_addip_enable)
+			break;
+		goto fallthrough;
+
 	case SCTP_PARAM_HOST_NAME_ADDRESS:
 		/* Tell the peer, we won't support this param. */
 		sctp_process_hn_param(asoc, param, chunk, err_chunk);
@@ -2280,6 +2285,8 @@ static int sctp_process_param(struct sctp_association *asoc,
 	sctp_scope_t scope;
 	time_t stale;
 	struct sctp_af *af;
+	union sctp_addr_param *addr_param;
+	struct sctp_transport *t;
 
 	/* We maintain all INIT parameters in network byte order all the
 	 * time. This allows us to not worry about whether the parameters
@@ -2370,6 +2377,26 @@ static int sctp_process_param(struct sctp_association *asoc,
 		asoc->peer.adaptation_ind = param.aind->adaptation_ind;
 		break;
 
+	case SCTP_PARAM_SET_PRIMARY:
+		addr_param = param.v + sizeof(sctp_addip_param_t);
+
+		af = sctp_get_af_specific(param_type2af(param.p->type));
+		af->from_addr_param(&addr, addr_param,
+				    htons(asoc->peer.port), 0);
+
+		/* if the address is invalid, we can't process it.
+		 * XXX: see spec for what to do.
+		 */
+		if (!af->addr_valid(&addr, NULL, NULL))
+			break;
+
+		t = sctp_assoc_lookup_paddr(asoc, &addr);
+		if (!t)
+			break;
+
+		sctp_assoc_set_primary(asoc, t);
+		break;
+
 	case SCTP_PARAM_SUPPORTED_EXT:
 		sctp_process_ext_param(asoc, param);
 		break;
--
1.5.3.5
[PATCH 2.6.25 4/9] SCTP: Update association lookup to look at ASCONF chunks as well
ADD-IP draft section 5.2 specifies that if an association can not be found using the source and destination of the IP packet, then, if the packet contains ASCONF chunks, the Address Parameter TLV should be used to lookup an association. Signed-off-by: Vlad Yasevich [EMAIL PROTECTED] --- net/sctp/input.c | 124 - 1 files changed, 103 insertions(+), 21 deletions(-) diff --git a/net/sctp/input.c b/net/sctp/input.c index b08c7cb..d695f71 100644 --- a/net/sctp/input.c +++ b/net/sctp/input.c @@ -891,14 +891,6 @@ static struct sctp_association *__sctp_rcv_init_lookup(struct sk_buff *skb, ch = (sctp_chunkhdr_t *) skb-data; - /* The code below will attempt to walk the chunk and extract -* parameter information. Before we do that, we need to verify -* that the chunk length doesn't cause overflow. Otherwise, we'll -* walk off the end. -*/ - if (WORD_ROUND(ntohs(ch-length)) skb-len) - return NULL; - /* * This code will NOT touch anything inside the chunk--it is * strictly READ-ONLY. @@ -935,6 +927,44 @@ static struct sctp_association *__sctp_rcv_init_lookup(struct sk_buff *skb, return NULL; } +/* ADD-IP, Section 5.2 + * When an endpoint receives an ASCONF Chunk from the remote peer + * special procedures may be needed to identify the association the + * ASCONF Chunk is associated with. To properly find the association + * the following procedures SHOULD be followed: + * + * D2) If the association is not found, use the address found in the + * Address Parameter TLV combined with the port number found in the + * SCTP common header. If found proceed to rule D4. + * + * D2-ext) If more than one ASCONF Chunks are packed together, use the + * address found in the ASCONF Address Parameter TLV of each of the + * subsequent ASCONF Chunks. If found, proceed to rule D4. 
+ */ +static struct sctp_association *__sctp_rcv_asconf_lookup( + sctp_chunkhdr_t *ch, + const union sctp_addr *laddr, + __be32 peer_port, + struct sctp_transport **transportp) +{ + sctp_addip_chunk_t *asconf = (struct sctp_addip_chunk *)ch; + struct sctp_af *af; + union sctp_addr_param *param; + union sctp_addr paddr; + + /* Skip over the ADDIP header and find the Address parameter */ + param = (union sctp_addr_param *)(asconf + 1); + + af = sctp_get_af_specific(param_type2af(param-v4.param_hdr.type)); + if (unlikely(!af)) + return NULL; + + af-from_addr_param(paddr, param, peer_port, 0); + + return __sctp_lookup_association(laddr, paddr, transportp); +} + + /* SCTP-AUTH, Section 6.3: *If the receiver does not find a STCB for a packet containing an AUTH *chunk as the first chunk and not a COOKIE-ECHO chunk as the second @@ -943,20 +973,64 @@ static struct sctp_association *__sctp_rcv_init_lookup(struct sk_buff *skb, * * This means that any chunks that can help us identify the association need * to be looked at to find this assocation. -* -* TODO: The only chunk currently defined that can do that is ASCONF, but we -* don't support that functionality yet. */ -static struct sctp_association *__sctp_rcv_auth_lookup(struct sk_buff *skb, - const union sctp_addr *paddr, +static struct sctp_association *__sctp_rcv_walk_lookup(struct sk_buff *skb, const union sctp_addr *laddr, struct sctp_transport **transportp) { - /* XXX - walk through the chunks looking for something that can -* help us find the association. INIT, and INIT-ACK are not permitted. -* That leaves ASCONF, but we don't support that yet. + struct sctp_association *asoc = NULL; + sctp_chunkhdr_t *ch; + int have_auth = 0; + unsigned int chunk_num = 1; + __u8 *ch_end; + + /* Walk through the chunks looking for AUTH or ASCONF chunks +* to help us find the association. */ - return NULL; + ch = (sctp_chunkhdr_t *) skb-data; + do { + /* Break out if chunk length is less then minimal. 
*/ + if (ntohs(ch-length) sizeof(sctp_chunkhdr_t)) + break; + + ch_end = ((__u8 *)ch) + WORD_ROUND(ntohs(ch-length)); + if (ch_end skb_tail_pointer(skb)) + break; + + switch(ch-type) { + case SCTP_CID_AUTH: + have_auth = chunk_num; + break; + + case SCTP_CID_COOKIE_ECHO: + /* If a packet arrives containing an AUTH chunk as +* a first chunk, a COOKIE-ECHO chunk as the second +
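The chunk walk in this patch depends on two bounds checks before each step: the chunk must be at least as long as its own header, and its padded length must not run past the end of the packet. Here is a self-contained userspace sketch of that walk over a raw byte buffer. It is not the kernel function; `toy_find_chunk` and the `TOY_` macros are hypothetical, and the layout assumed is the standard SCTP chunk header (1-byte type, 1-byte flags, 2-byte big-endian length that includes the header).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CHUNK_HDR_LEN 4u                            /* type, flags, 16-bit length */
#define TOY_WORD_ROUND(n) (((n) + 3) & ~(size_t)3)      /* pad to a 4-byte boundary */

/*
 * Return the offset of the first chunk of the given type, or -1 if it is
 * not present or a malformed length is hit first.
 */
long toy_find_chunk(const uint8_t *buf, size_t len, uint8_t type)
{
	size_t off = 0;

	while (len - off >= TOY_CHUNK_HDR_LEN) {
		/* Length field is big-endian and includes the header. */
		size_t chunk_len = ((size_t)buf[off + 2] << 8) | buf[off + 3];
		size_t padded = TOY_WORD_ROUND(chunk_len);

		if (chunk_len < TOY_CHUNK_HDR_LEN)
			break;              /* shorter than its own header */
		if (padded > len - off)
			break;              /* would run past the buffer */
		if (buf[off] == type)
			return (long)off;
		off += padded;
	}
	return -1;
}
```

Both checks run before the type comparison, so a chunk with a bogus length is never reported as a match, and the zero-length case cannot cause an endless loop.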
[PATCH 2.6.25 5/9] SCTP: ADD-IP updates the states where ASCONFs can be sent
C4) Both ASCONF and ASCONF-ACK Chunks MUST NOT be sent in any SCTP state except ESTABLISHED, SHUTDOWN-PENDING, SHUTDOWN-RECEIVED, and SHUTDOWN-SENT.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 net/sctp/sm_statetable.c |   18 +-
 1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/net/sctp/sm_statetable.c b/net/sctp/sm_statetable.c
index a93a4bc..e6016e4 100644
--- a/net/sctp/sm_statetable.c
+++ b/net/sctp/sm_statetable.c
@@ -457,11 +457,11 @@ static const sctp_sm_table_entry_t chunk_event_table[SCTP_NUM_BASE_CHUNK_TYPES][
 	/* SCTP_STATE_ESTABLISHED */ \
 	TYPE_SCTP_FUNC(sctp_sf_do_asconf), \
 	/* SCTP_STATE_SHUTDOWN_PENDING */ \
-	TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \
+	TYPE_SCTP_FUNC(sctp_sf_do_asconf), \
 	/* SCTP_STATE_SHUTDOWN_SENT */ \
-	TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \
+	TYPE_SCTP_FUNC(sctp_sf_do_asconf), \
 	/* SCTP_STATE_SHUTDOWN_RECEIVED */ \
-	TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \
+	TYPE_SCTP_FUNC(sctp_sf_do_asconf), \
 	/* SCTP_STATE_SHUTDOWN_ACK_SENT */ \
 	TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \
 } /* TYPE_SCTP_ASCONF */
@@ -478,11 +478,11 @@ static const sctp_sm_table_entry_t chunk_event_table[SCTP_NUM_BASE_CHUNK_TYPES][
 	/* SCTP_STATE_ESTABLISHED */ \
 	TYPE_SCTP_FUNC(sctp_sf_do_asconf_ack), \
 	/* SCTP_STATE_SHUTDOWN_PENDING */ \
-	TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \
+	TYPE_SCTP_FUNC(sctp_sf_do_asconf_ack), \
 	/* SCTP_STATE_SHUTDOWN_SENT */ \
-	TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \
+	TYPE_SCTP_FUNC(sctp_sf_do_asconf_ack), \
 	/* SCTP_STATE_SHUTDOWN_RECEIVED */ \
-	TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \
+	TYPE_SCTP_FUNC(sctp_sf_do_asconf_ack), \
 	/* SCTP_STATE_SHUTDOWN_ACK_SENT */ \
 	TYPE_SCTP_FUNC(sctp_sf_discard_chunk), \
 } /* TYPE_SCTP_ASCONF_ACK */
@@ -691,11 +691,11 @@ chunk_event_table_unknown[SCTP_STATE_NUM_STATES] = {
 	/* SCTP_STATE_ESTABLISHED */ \
 	TYPE_SCTP_FUNC(sctp_sf_do_prm_asconf), \
 	/* SCTP_STATE_SHUTDOWN_PENDING */ \
-	TYPE_SCTP_FUNC(sctp_sf_error_shutdown), \
+	TYPE_SCTP_FUNC(sctp_sf_do_prm_asconf), \
 	/* SCTP_STATE_SHUTDOWN_SENT */ \
-	TYPE_SCTP_FUNC(sctp_sf_error_shutdown), \
+	TYPE_SCTP_FUNC(sctp_sf_do_prm_asconf), \
 	/* SCTP_STATE_SHUTDOWN_RECEIVED */ \
-	TYPE_SCTP_FUNC(sctp_sf_error_shutdown), \
+	TYPE_SCTP_FUNC(sctp_sf_do_prm_asconf), \
 	/* SCTP_STATE_SHUTDOWN_ACK_SENT */ \
 	TYPE_SCTP_FUNC(sctp_sf_error_shutdown), \
 } /* TYPE_SCTP_PRIMITIVE_REQUESTHEARTBEAT */
--
1.5.3.5
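Rule C4 and the table change above amount to a per-state lookup: ASCONF handling is enabled in four states and refused in all others. A compact userspace sketch of the same idea, using a boolean table indexed by state, is shown below. The enum and function names are hypothetical and the state list is simplified; the real table dispatches to handler functions rather than returning a flag.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_state {
	TOY_CLOSED,
	TOY_COOKIE_WAIT,
	TOY_COOKIE_ECHOED,
	TOY_ESTABLISHED,
	TOY_SHUTDOWN_PENDING,
	TOY_SHUTDOWN_SENT,
	TOY_SHUTDOWN_RECEIVED,
	TOY_SHUTDOWN_ACK_SENT,
	TOY_NUM_STATES
};

/* States in which rule C4 permits ASCONF and ASCONF-ACK; all others are false. */
static const bool toy_asconf_ok[TOY_NUM_STATES] = {
	[TOY_ESTABLISHED]       = true,
	[TOY_SHUTDOWN_PENDING]  = true,
	[TOY_SHUTDOWN_SENT]     = true,
	[TOY_SHUTDOWN_RECEIVED] = true,
};

/* Out-of-range states are treated as "not allowed". */
bool toy_asconf_allowed(enum toy_state s)
{
	return (unsigned)s < TOY_NUM_STATES && toy_asconf_ok[s];
}
```

Keeping the policy in a table, as the kernel does, means a spec change like this one is a matter of editing entries rather than control flow.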
[PATCH 2.6.25 9/9] SCTP: Follow Add-IP security considerations wrt INIT/INIT-ACK
The Security Considerations section of RFC 5061 has the following text:

   If an SCTP endpoint that supports this extension receives an INIT
   that indicates that the peer supports the ASCONF extension but does
   NOT support the [RFC4895] extension, the receiver of such an INIT
   MUST send an ABORT in response.  Note that an implementation is
   allowed to silently discard such an INIT as an option as well, but
   under NO circumstance is an implementation allowed to proceed with
   the association setup by sending an INIT-ACK in response.

   An implementation that receives an INIT-ACK that indicates that the
   peer does not support the [RFC4895] extension MUST NOT send the
   COOKIE-ECHO to establish the association.  Instead, the
   implementation MUST discard the INIT-ACK and report to the upper-
   layer user that an association cannot be established destroying the
   Transmission Control Block (TCB).

Follow the recommendations.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 net/sctp/sm_make_chunk.c | 47 ++---
 net/sctp/sm_statefuns.c |7 ++---
 2 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 68a994c..ae9fc9e 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1830,6 +1830,39 @@ static int sctp_process_hn_param(const struct sctp_association *asoc,
 	return 0;
 }
 
+static int sctp_verify_ext_param(union sctp_params param)
+{
+	__u16 num_ext = ntohs(param.p->length) - sizeof(sctp_paramhdr_t);
+	int have_auth = 0;
+	int have_asconf = 0;
+	int i;
+
+	for (i = 0; i < num_ext; i++) {
+		switch (param.ext->chunks[i]) {
+		case SCTP_CID_AUTH:
+			have_auth = 1;
+			break;
+		case SCTP_CID_ASCONF:
+		case SCTP_CID_ASCONF_ACK:
+			have_asconf = 1;
+			break;
+		}
+	}
+
+	/* ADD-IP Security: The draft requires us to ABORT or ignore the
+	 * INIT/INIT-ACK if ADD-IP is listed, but AUTH is not.  Do this
+	 * only if ADD-IP is turned on and we are not backward-compatible
+	 * mode.
+	 */
+	if (sctp_addip_noauth)
+		return 1;
+
+	if (sctp_addip_enable && !have_auth && have_asconf)
+		return 0;
+
+	return 1;
+}
+
 static void sctp_process_ext_param(struct sctp_association *asoc,
 				    union sctp_params param)
 {
@@ -1960,7 +1993,11 @@ static sctp_ierror_t sctp_verify_param(const struct sctp_association *asoc,
 	case SCTP_PARAM_UNRECOGNIZED_PARAMETERS:
 	case SCTP_PARAM_ECN_CAPABLE:
 	case SCTP_PARAM_ADAPTATION_LAYER_IND:
+		break;
+
 	case SCTP_PARAM_SUPPORTED_EXT:
+		if (!sctp_verify_ext_param(param))
+			return SCTP_IERROR_ABORT;
 		break;
 
 	case SCTP_PARAM_SET_PRIMARY:
@@ -2133,10 +2170,11 @@ int sctp_process_init(struct sctp_association *asoc, sctp_cid_t cid,
 					!asoc->peer.peer_hmacs))
 		asoc->peer.auth_capable = 0;
 
-
-	/* If the peer claims support for ADD-IP without support
-	 * for AUTH, disable support for ADD-IP.
-	 * Do this only if backward compatible mode is turned off.
+	/* In a non-backward compatible mode, if the peer claims
+	 * support for ADD-IP but not AUTH, the ADD-IP spec states
+	 * that we MUST ABORT the association. Section 6. The section
+	 * also give us an option to silently ignore the packet, which
+	 * is what we'll do here.
 	 */
 	if (!sctp_addip_noauth &&
 	     (asoc->peer.asconf_capable && !asoc->peer.auth_capable)) {
@@ -2144,6 +2182,7 @@ int sctp_process_init(struct sctp_association *asoc, sctp_cid_t cid,
 						  SCTP_PARAM_DEL_IP |
 						  SCTP_PARAM_SET_PRIMARY);
 		asoc->peer.asconf_capable = 0;
+		goto clean_up;
 	}
 
 	/* Walk list of transports, removing transports in the UNKNOWN state. */
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index eed47c6..aadbed1 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -507,7 +507,9 @@ sctp_disposition_t sctp_sf_do_5_1C_ack(const struct sctp_endpoint *ep,
 				     err_chunk)) {
 		/* This chunk contains fatal error. It is to be discarded.
-		 * Send an ABORT, with causes if there is any.
+		 * Send an ABORT, with causes.  If there are no causes,
+		 * then there wasn't enough memory.  Just terminate
+		 * the association.
 		 */
 		if (err_chunk) {
 			packet = sctp_abort_pkt_new(ep, asoc,
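The decision made by sctp_verify_ext_param() above can be tried outside the kernel. What follows is only a minimal userspace sketch, not the kernel code: the function name ext_param_ok() and its two flag arguments are hypothetical stand-ins for the sysctl variables used in the patch, and the chunk type values are assumed from RFC 4895 (AUTH) and RFC 5061 (ASCONF, ASCONF-ACK).

```c
#include <stddef.h>
#include <stdint.h>

/* Chunk type values, assumed from RFC 4895 and RFC 5061. */
enum {
	CID_AUTH       = 0x0F,
	CID_ASCONF_ACK = 0x80,
	CID_ASCONF     = 0xC1,
};

/*
 * Hypothetical stand-in for sctp_verify_ext_param(): returns 1 if the
 * INIT/INIT-ACK may be processed, 0 if it has to be aborted or dropped.
 * addip_enable and addip_noauth mirror the sysctl flags in the patch.
 */
static int ext_param_ok(const uint8_t *chunks, size_t num_ext,
			int addip_enable, int addip_noauth)
{
	int have_auth = 0;
	int have_asconf = 0;
	size_t i;

	/* Scan the Supported Extensions list for AUTH and ASCONF support. */
	for (i = 0; i < num_ext; i++) {
		switch (chunks[i]) {
		case CID_AUTH:
			have_auth = 1;
			break;
		case CID_ASCONF:
		case CID_ASCONF_ACK:
			have_asconf = 1;
			break;
		}
	}

	/* Backward-compatible mode: ADD-IP without AUTH is tolerated. */
	if (addip_noauth)
		return 1;

	/* ADD-IP advertised without AUTH while ADD-IP is enabled: reject. */
	if (addip_enable && !have_auth && have_asconf)
		return 0;

	return 1;
}
```

A peer listing only ASCONF and ASCONF-ACK is rejected when ADD-IP is enabled and the backward-compatible flag is off; the same list is accepted once AUTH is listed too, or when the backward-compatible flag is set.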
[PATCH 2.6.25 8/9] SCTP: Implement ADD-IP special case processing for ABORT chunk
ADD-IP spec has a special case for processing ABORTs:

   F4) ... One special consideration is that ABORT Chunks arriving
       destined to the IP address being deleted MUST be ignored
       (see Section 5.3.1 for further details).

Check if the address we received on is in the DEL state, and if
so, ignore the ABORT.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 include/net/sctp/structs.h |2 +
 net/sctp/bind_addr.c | 26 ++
 net/sctp/sm_statefuns.c| 52 ---
 3 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 32e6591..27e9cf5 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -1200,6 +1200,8 @@ int sctp_add_bind_addr(struct sctp_bind_addr *, union sctp_addr *,
 int sctp_del_bind_addr(struct sctp_bind_addr *, union sctp_addr *);
 int sctp_bind_addr_match(struct sctp_bind_addr *, const union sctp_addr *,
 			 struct sctp_sock *);
+int sctp_bind_addr_state(const struct sctp_bind_addr *bp,
+			 const union sctp_addr *addr);
 union sctp_addr *sctp_find_unmatch_addr(struct sctp_bind_addr *bp,
 					const union sctp_addr *addrs,
 					int addrcnt,
diff --git a/net/sctp/bind_addr.c b/net/sctp/bind_addr.c
index 4326611..13fbfb4 100644
--- a/net/sctp/bind_addr.c
+++ b/net/sctp/bind_addr.c
@@ -353,6 +353,32 @@ int sctp_bind_addr_match(struct sctp_bind_addr *bp,
 	return match;
 }
 
+/* Get the state of the entry in the bind_addr_list */
+int sctp_bind_addr_state(const struct sctp_bind_addr *bp,
+			 const union sctp_addr *addr)
+{
+	struct sctp_sockaddr_entry *laddr;
+	struct sctp_af *af;
+	int state = -1;
+
+	af = sctp_get_af_specific(addr->sa.sa_family);
+	if (unlikely(!af))
+		return state;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(laddr, &bp->address_list, list) {
+		if (!laddr->valid)
+			continue;
+		if (af->cmp_addr(&laddr->a, addr)) {
+			state = laddr->state;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return state;
+}
+
 /* Find the first address in the bind address list that is not present in
  * the addrs packed array.
  */
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 8fe2e61..eed47c6 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -143,6 +143,12 @@ static sctp_ierror_t sctp_sf_authenticate(const struct sctp_endpoint *ep,
 				    const sctp_subtype_t type,
 				    struct sctp_chunk *chunk);
 
+static sctp_disposition_t __sctp_sf_do_9_1_abort(const struct sctp_endpoint *ep,
+					const struct sctp_association *asoc,
+					const sctp_subtype_t type,
+					void *arg,
+					sctp_cmd_seq_t *commands);
+
 /* Small helper function that checks if the chunk length
  * is of the appropriate length.  The 'required_length' argument
  * is set to be the size of a specific chunk we are testing.
@@ -2095,11 +2101,20 @@ sctp_disposition_t sctp_sf_shutdown_pending_abort(
 	if (!sctp_chunk_length_valid(chunk, sizeof(sctp_abort_chunk_t)))
 		return sctp_sf_pdiscard(ep, asoc, type, arg, commands);
 
+	/* ADD-IP: Special case for ABORT chunks
+	 * F4)  One special consideration is that ABORT Chunks arriving
+	 * destined to the IP address being deleted MUST be
+	 * ignored (see Section 5.3.1 for further details).
+	 */
+	if (SCTP_ADDR_DEL ==
+		    sctp_bind_addr_state(&asoc->base.bind_addr, &chunk->dest))
+		return sctp_sf_discard_chunk(ep, asoc, type, arg, commands);
+
 	/* Stop the T5-shutdown guard timer.  */
 	sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_STOP,
 			SCTP_TO(SCTP_EVENT_TIMEOUT_T5_SHUTDOWN_GUARD));
 
-	return sctp_sf_do_9_1_abort(ep, asoc, type, arg, commands);
+	return __sctp_sf_do_9_1_abort(ep, asoc, type, arg, commands);
 }
 
 /*
@@ -2131,6 +2146,15 @@ sctp_disposition_t sctp_sf_shutdown_sent_abort(const struct sctp_endpoint *ep,
 	if (!sctp_chunk_length_valid(chunk, sizeof(sctp_abort_chunk_t)))
 		return sctp_sf_pdiscard(ep, asoc, type, arg, commands);
 
+	/* ADD-IP: Special case for ABORT chunks
+	 * F4)  One special consideration is that ABORT Chunks arriving
+	 * destined to the IP address being deleted MUST be
+	 * ignored (see Section 5.3.1 for further details).
+	 */
+	if (SCTP_ADDR_DEL ==
+		    sctp_bind_addr_state(&asoc->base.bind_addr, &chunk->dest))
+		return
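The lookup-then-ignore pattern of this patch can be shown in a few lines of plain C. This is a simplified sketch under stated assumptions, not the kernel implementation: the types, the two-state enum and the example table are hypothetical, addresses are reduced to 32-bit IPv4 values, and the RCU locking of the real code is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address states, loosely modelled on SCTP_ADDR_* in the patch. */
enum addr_state { ADDR_SRC, ADDR_DEL };

struct bind_entry {
	uint32_t addr;		/* IPv4 address in host order, for illustration */
	int valid;
	enum addr_state state;
};

/* Example bind list: 10.0.0.1 is in use, 10.0.0.2 is being deleted. */
static const struct bind_entry example_bind_list[] = {
	{ 0x0a000001, 1, ADDR_SRC },
	{ 0x0a000002, 1, ADDR_DEL },
};

/* Return the state of addr, or -1 if no valid entry matches. */
static int bind_addr_state(const struct bind_entry *list, size_t n,
			   uint32_t addr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!list[i].valid)
			continue;
		if (list[i].addr == addr)
			return list[i].state;
	}
	return -1;
}

/* An ABORT is ignored only if its destination address is being deleted. */
static int abort_is_ignored(const struct bind_entry *list, size_t n,
			    uint32_t dest)
{
	return bind_addr_state(list, n, dest) == ADDR_DEL;
}
```

Returning -1 for an unknown address keeps "not bound" distinct from every real state, so an ABORT sent to an address that is not in the list is still processed normally.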
[PATCH 2.6.25 6/9] SCTP: Update ASCONF processing to conform to spec.
The processing of the ASCONF chunks has changed a lot in the
spec. New items are:
   1. A list of ASCONF-ACK chunks is now cached
   2. The source of the packet is used in response.
   3. New handling for unexpected ASCONF chunks.

Signed-off-by: Vlad Yasevich [EMAIL PROTECTED]
---
 include/net/sctp/structs.h | 24 +---
 net/sctp/associola.c | 58 ++-
 net/sctp/outqueue.c| 29 ++-
 net/sctp/sm_make_chunk.c | 12 +++-
 net/sctp/sm_statefuns.c| 64 ---
 5 files changed, 143 insertions(+), 44 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index fb9b7e7..39e74d7 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -744,6 +744,7 @@ struct sctp_chunk {
 	__u8 tsn_missing_report; /* Data chunk missing counter. */
 	__u8 data_accepted; 	/* At least 1 chunk in this packet accepted */
 	__u8 auth;		/* IN: was auth'ed | OUT: needs auth */
+	__u8 has_asconf;	/* IN: have seen an asconf before */
 };
 
 void sctp_chunk_hold(struct sctp_chunk *);
@@ -1785,20 +1786,16 @@ struct sctp_association {
 	 */
 	struct sctp_chunk *addip_last_asconf;
 
-	/* ADDIP Section 4.2 Upon reception of an ASCONF Chunk.
+	/* ADDIP Section 5.2 Upon reception of an ASCONF Chunk.
 	 *
-	 * IMPLEMENTATION NOTE: As an optimization a receiver may wish
-	 * to save the last ASCONF-ACK for some predetermined period
-	 * of time and instead of re-processing the ASCONF (with the
-	 * same serial number) it may just re-transmit the
-	 * ASCONF-ACK. It may wish to use the arrival of a new serial
-	 * number to discard the previously saved ASCONF-ACK or any
-	 * other means it may choose to expire the saved ASCONF-ACK.
+	 * This is needed to implement itmes E1 - E4 of the updated
+	 * spec.  Here is the justification:
 	 *
-	 * [This is our saved ASCONF-ACK.  We invalidate it when a new
-	 * ASCONF serial number arrives.]
+	 * Since the peer may bundle multiple ASCONF chunks toward us,
+	 * we now need the ability to cache multiple ACKs.  The section
+	 * describes in detail how they are cached and cleaned up.
 	 */
-	struct sctp_chunk *addip_last_asconf_ack;
+	struct list_head asconf_ack_list;
 
 	/* These ASCONF chunks are waiting to be sent.
 	 *
@@ -1947,6 +1944,11 @@ int sctp_assoc_set_bind_addr_from_cookie(struct sctp_association *,
 					 struct sctp_cookie*, gfp_t gfp);
 int sctp_assoc_set_id(struct sctp_association *, gfp_t);
+void sctp_assoc_clean_asconf_ack_cache(const struct sctp_association *asoc);
+struct sctp_chunk *sctp_assoc_lookup_asconf_ack(
+					const struct sctp_association *asoc,
+					__be32 serial);
+
 
 int sctp_cmp_addr_exact(const union sctp_addr *ss1,
 			const union sctp_addr *ss2);
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
index 61bebb9..a016e78 100644
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -61,6 +61,7 @@
 
 /* Forward declarations for internal functions. */
 static void sctp_assoc_bh_rcv(struct work_struct *work);
+static void sctp_assoc_free_asconf_acks(struct sctp_association *asoc);
 
 
 /* 1st Level Abstractions. */
@@ -242,6 +243,7 @@ static struct sctp_association *sctp_association_init(struct sctp_association *a
 	asoc->addip_serial = asoc->c.initial_tsn;
 
 	INIT_LIST_HEAD(&asoc->addip_chunk_list);
+	INIT_LIST_HEAD(&asoc->asconf_ack_list);
 
 	/* Make an empty list of remote transport addresses.  */
 	INIT_LIST_HEAD(&asoc->peer.transport_addr_list);
@@ -431,8 +433,7 @@ void sctp_association_free(struct sctp_association *asoc)
 	asoc->peer.transport_count = 0;
 
 	/* Free any cached ASCONF_ACK chunk. */
-	if (asoc->addip_last_asconf_ack)
-		sctp_chunk_free(asoc->addip_last_asconf_ack);
+	sctp_assoc_free_asconf_acks(asoc);
 
 	/* Free any cached ASCONF chunk. */
 	if (asoc->addip_last_asconf)
@@ -1485,3 +1486,56 @@ retry:
 	asoc->assoc_id = (sctp_assoc_t) assoc_id;
 	return error;
 }
+
+/* Free asconf_ack cache */
+static void sctp_assoc_free_asconf_acks(struct sctp_association *asoc)
+{
+	struct sctp_chunk *ack;
+	struct sctp_chunk *tmp;
+
+	list_for_each_entry_safe(ack, tmp, &asoc->asconf_ack_list,
+				transmitted_list) {
+		list_del_init(&ack->transmitted_list);
+		sctp_chunk_free(ack);
+	}
+}
+
+/* Clean up the ASCONF_ACK queue */
+void sctp_assoc_clean_asconf_ack_cache(const struct sctp_association *asoc)
+{
+	struct sctp_chunk *ack;
+	struct sctp_chunk *tmp;
+
+	/*
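The patch replaces a single cached ASCONF-ACK with a list that is searched by serial number. The idea behind a lookup such as sctp_assoc_lookup_asconf_ack() can be sketched as below. This is an assumption-laden illustration: the struct, the function name ack_lookup() and the example entries are hypothetical, a plain singly linked list stands in for the kernel's list_head, and the cleanup policy (cut off in the message above) is not modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry: one saved ACK per ASCONF serial number. */
struct ack_entry {
	uint32_t serial;		/* serial of the ASCONF this ACK answers */
	struct ack_entry *next;
};

/* Example cache holding ACKs for serial numbers 10 and 11. */
static struct ack_entry example_ack_b = { 11, NULL };
static struct ack_entry example_ack_a = { 10, &example_ack_b };

/*
 * Return the cached ACK for the given serial, or NULL if there is none.
 * A retransmitted ASCONF can then be answered from the cache instead of
 * being processed a second time.
 */
static const struct ack_entry *ack_lookup(const struct ack_entry *head,
					  uint32_t serial)
{
	for (; head; head = head->next) {
		if (head->serial == serial)
			return head;
	}
	return NULL;
}
```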
Re: [BUG] lack of /proc/net/ax25 with 2.6.24-rc5
On Sunday, 16 of December 2007, Bernard Pidoux wrote: With 2.6.24-rc5 there is no /proc/net/ax25 FYI, I've created a Bugzilla entry for this issue at: http://bugzilla.kernel.org/show_bug.cgi?id=9589 Please add your address to the CC list in there. Thanks, Rafael Here is an extract from dmesg after boot : === sysctl table check failed: /net/ax25/ax0/ax25_default_mode .3.9.1.2 Unknown sysctl binary path Pid: 2936, comm: kissattach Not tainted 2.6.24-rc5 #1 [c012ca6a] set_fail+0x3b/0x43 [c012ce7a] sysctl_check_table+0x408/0x456 [c012ce8e] sysctl_check_table+0x41c/0x456 [c012ce8e] sysctl_check_table+0x41c/0x456 [c02ac64a] _spin_unlock+0x14/0x1c [c012ce8e] sysctl_check_table+0x41c/0x456 [c011e681] sysctl_set_parent+0x19/0x2a [c011f55c] register_sysctl_table+0x45/0x85 [d8be9d26] ax25_register_sysctl+0x112/0x11c [ax25] [d8be6c76] ax25_device_event+0x2e/0x90 [ax25] [c012c560] notifier_call_chain+0x2a/0x47 [c012c59f] raw_notifier_call_chain+0x17/0x1a [c0259290] dev_open+0x6f/0x75 [c0257ee7] dev_change_flags+0x9c/0x148 [c0256ab3] __dev_get_by_name+0x68/0x73 [c0292307] devinet_ioctl+0x22e/0x53b [c0259074] dev_ioctl+0x472/0x5ba [c024d4ba] sock_ioctl+0x1aa/0x1cf [c024d310] sock_ioctl+0x0/0x1cf [c016bc19] do_ioctl+0x19/0x4c [c016be40] vfs_ioctl+0x1f4/0x20b [c0103d01] sysenter_past_esp+0x9a/0xa9 [c016be9c] sys_ioctl+0x45/0x5d [c0103cc6] sysenter_past_esp+0x5f/0xa9 === sysctl table check failed: /net/ax25/ax0/backoff_type .3.9.1.3 Unknown sysctl binary path (...) truncated === sysctl table check failed: /net/ax25/ax0/connect_mode .3.9.1.4 Unknown sysctl binary path (...) === sysctl table check failed: /net/ax25/ax0/standard_window_size .3.9.1.5 Unknown sysctl binary path === (...) and so on ... 
Bernard Pidoux [EMAIL PROTECTED] -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- Premature optimization is the root of all evil. - Donald Knuth -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH drivers/net/skfp/h/fplustm.h] parentheses around RX_FIFO_SPACE definition
drivers/net/skfp/h/fplustm.h:129:
#define RX_FIFO_SPACE 0x4000 - RX_FIFO_OFF

drivers/net/skfp/fplustm.c:1404:
smc->hw.fp.fifo.rx1_fifo_size = RX_FIFO_SPACE * SMT_R1_RXD_COUNT/(SMT_R1_RXD_COUNT+SMT_R2_RXD_COUNT) ;
smc->hw.fp.fifo.rx2_fifo_size = RX_FIFO_SPACE * SMT_R2_RXD_COUNT/(SMT_R1_RXD_COUNT+SMT_R2_RXD_COUNT) ;

Add parentheses to definition to prevent operator precedence errors

Signed-off-by: Roel Kluin [EMAIL PROTECTED]
---
diff --git a/drivers/net/skfp/h/fplustm.h b/drivers/net/skfp/h/fplustm.h
index 98bbf65..588579c 100644
--- a/drivers/net/skfp/h/fplustm.h
+++ b/drivers/net/skfp/h/fplustm.h
@@ -126,7 +126,7 @@ struct s_smt_rx_queue {
 #define	SYNC_TRAFFIC_ON		0x2
 
 /* big FIFO memory */
-#define	RX_FIFO_SPACE		0x4000 - RX_FIFO_OFF
+#define	RX_FIFO_SPACE		(0x4000 - RX_FIFO_OFF)
 #define	TX_FIFO_SPACE		0x4000
 
 #define	TX_SMALL_FIFO		0x0900
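The precedence problem is easy to reproduce in isolation. In the sketch below the macro names, the offset value 0x0400 and the descriptor counts are made up for illustration (the real RX_FIFO_OFF and SMT_Rx_RXD_COUNT values are not quoted in this message); only the shape of the expression follows fplustm.c.

```c
/* Hypothetical offset; the driver's real RX_FIFO_OFF may differ. */
#define RX_FIFO_OFF_EXAMPLE	0x0400

/* Definition without parentheses, as before the patch. */
#define FIFO_SPACE_UNSAFE	0x4000 - RX_FIFO_OFF_EXAMPLE
/* Definition with parentheses, as after the patch. */
#define FIFO_SPACE_SAFE		(0x4000 - RX_FIFO_OFF_EXAMPLE)

/*
 * Expands to 0x4000 - 0x0400 * r1 / (r1 + r2): only the offset is
 * scaled, because '*' and '/' bind tighter than '-'.
 */
static unsigned int fifo_share_unsafe(unsigned int r1, unsigned int r2)
{
	return FIFO_SPACE_UNSAFE * r1 / (r1 + r2);
}

/* Expands to (0x4000 - 0x0400) * r1 / (r1 + r2), the intended share. */
static unsigned int fifo_share_safe(unsigned int r1, unsigned int r2)
{
	return FIFO_SPACE_SAFE * r1 / (r1 + r2);
}
```

With equal counts for both queues, the parenthesized macro gives each queue half of the available space (0x1E00 of 0x3C00), while the unparenthesized one yields 0x3E00 for each, so the two queues together would claim more than the whole FIFO.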
[PATCH] e1000: Dump the eeprom when a user encounters a bad checksum
To help support users with a bad eeprom checksum, dump the
eeprom info when such a situation is encountered by a user.

Signed-off-by: Auke Kok [EMAIL PROTECTED]
---
 drivers/net/e1000/e1000_main.c | 85 +++-
 1 files changed, 74 insertions(+), 11 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index efd8c2d..aac55be 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -817,6 +817,64 @@ e1000_reset(struct e1000_adapter *adapter)
 }
 
 /**
+ * Dump the eeprom for users having checksum issues
+ **/
+void e1000_dump_eeprom(struct e1000_adapter *adapter)
+{
+	struct net_device *netdev = adapter->netdev;
+	struct ethtool_eeprom eeprom;
+	const struct ethtool_ops *ops = netdev->ethtool_ops;
+	u8 *data;
+	int i;
+	u16 csum_old, csum_new = 0;
+
+	eeprom.len = ops->get_eeprom_len(netdev);
+	eeprom.offset = 0;
+
+	data = kmalloc(eeprom.len, GFP_KERNEL);
+	if (!data) {
+		printk(KERN_ERR "Unable to allocate memory to dump EEPROM"
+		       " data\n");
+		return;
+	}
+
+	ops->get_eeprom(netdev, &eeprom, data);
+
+	csum_old = (data[EEPROM_CHECKSUM_REG * 2]) +
+		   (data[EEPROM_CHECKSUM_REG * 2 + 1] << 8);
+	for (i = 0; i < EEPROM_CHECKSUM_REG * 2; i += 2)
+		csum_new += data[i] + (data[i + 1] << 8);
+	csum_new = EEPROM_SUM - csum_new;
+
+	printk(KERN_ERR "/*/\n");
+	printk(KERN_ERR "Current EEPROM Checksum : 0x%04x\n", csum_old);
+	printk(KERN_ERR "Calculated : 0x%04x\n", csum_new);
+
+	printk(KERN_ERR "OffsetValues\n");
+	printk(KERN_ERR "==\n");
+	print_hex_dump(KERN_ERR, "", DUMP_PREFIX_OFFSET, 16, 1, data, 128, 0);
+
+	printk(KERN_ERR "Include this output when contacting your support"
+	       " provider.\n");
+	printk(KERN_ERR "This is not a software error! Something bad"
+	       " happened to your hardware or\n");
+	printk(KERN_ERR "EEPROM image. Ignoring this"
+	       " problem could result in further problems,\n");
+	printk(KERN_ERR "possibly loss of data, corruption or system hangs!\n");
+	printk(KERN_ERR "The MAC Address will be reset to 00:00:00:00:00:00,"
+	       " which is invalid\n");
+	printk(KERN_ERR "and requires you to set the proper MAC"
+	       " address manually before continuing\n");
+	printk(KERN_ERR "to enable this network device.\n");
+	printk(KERN_ERR "Please inspect the EEPROM dump and report the issue"
+	       " to your hardware vendor\n");
+	printk(KERN_ERR "or Intel Customer Support: [EMAIL PROTECTED]");
+	printk(KERN_ERR "/*/\n");
+
+	kfree(data);
+}
+
+/**
  * e1000_probe - Device Initialization Routine
  * @pdev: PCI device information struct
  * @ent: entry in e1000_pci_tbl
@@ -967,7 +1025,6 @@ e1000_probe(struct pci_dev *pdev,
 	adapter->en_mng_pt = e1000_enable_mng_pass_thru(&adapter->hw);
 
 	/* initialize eeprom parameters */
-
 	if (e1000_init_eeprom_params(&adapter->hw)) {
 		E1000_ERR("EEPROM initialization failed\n");
 		goto err_eeprom;
@@ -979,23 +1036,29 @@ e1000_probe(struct pci_dev *pdev,
 	e1000_reset_hw(&adapter->hw);
 
 	/* make sure the EEPROM is good */
-
 	if (e1000_validate_eeprom_checksum(&adapter->hw) < 0) {
 		DPRINTK(PROBE, ERR, "The EEPROM Checksum Is Not Valid\n");
-		goto err_eeprom;
+		e1000_dump_eeprom(adapter);
+		/*
+		 * set MAC address to all zeroes to invalidate and temporary
+		 * disable this device for the user. This blocks regular
+		 * traffic while still permitting ethtool ioctls from reaching
+		 * the hardware as well as allowing the user to run the
+		 * interface after manually setting a hw addr using
+		 * `ip set address`
+		 */
+		memset(adapter->hw.mac_addr, 0, netdev->addr_len);
+	} else {
+		/* copy the MAC address out of the EEPROM */
+		if (e1000_read_mac_addr(&adapter->hw))
+			DPRINTK(PROBE, ERR, "EEPROM Read Error\n");
 	}
-
-	/* copy the MAC address out of the EEPROM */
-
-	if (e1000_read_mac_addr(&adapter->hw))
-		DPRINTK(PROBE, ERR, "EEPROM Read Error\n");
+	/* don't block initalization here due to bad MAC address */
 	memcpy(netdev->dev_addr, adapter->hw.mac_addr, netdev->addr_len);
 	memcpy(netdev->perm_addr, adapter->hw.mac_addr, netdev->addr_len);
 
-	if (!is_valid_ether_addr(netdev->perm_addr)) {
+	if (!is_valid_ether_addr(netdev->perm_addr))
 		DPRINTK(PROBE, ERR, "Invalid MAC Address\n");
-		goto err_eeprom;
-	}
 
 	e1000_get_bus_info(&adapter->hw);
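The checksum arithmetic in e1000_dump_eeprom() can be followed with a small standalone sketch. The constants below are assumptions made for illustration (a target sum of 0xBABA and the checksum stored in 16-bit word 0x3F, which is what the e1000 driver is believed to use), and the function names and the 128-byte example image are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values; see the lead-in. Not taken from this message. */
#define EEPROM_SUM_EXAMPLE	0xBABA
#define CHECKSUM_WORD_EXAMPLE	0x3F

/*
 * Example image: word 0 is 0x1234 (stored little-endian), all other
 * data words are zero, and the checksum word holds 0xBABA - 0x1234.
 */
static const uint8_t example_image[128] = {
	[0] = 0x34, [1] = 0x12,
	[126] = 0x86, [127] = 0xA8,
};

/* Read the checksum word stored in the image (little-endian). */
static uint16_t eeprom_stored_checksum(const uint8_t *data)
{
	return (uint16_t)(data[CHECKSUM_WORD_EXAMPLE * 2] |
			  (data[CHECKSUM_WORD_EXAMPLE * 2 + 1] << 8));
}

/*
 * Sum all 16-bit words below the checksum word (modulo 2^16) and return
 * the value the checksum word must hold so that the total is the target.
 */
static uint16_t eeprom_expected_checksum(const uint8_t *data)
{
	uint16_t sum = 0;
	size_t i;

	for (i = 0; i < CHECKSUM_WORD_EXAMPLE * 2; i += 2)
		sum += (uint16_t)(data[i] | (data[i + 1] << 8));

	return (uint16_t)(EEPROM_SUM_EXAMPLE - sum);
}
```

A mismatch between the two return values corresponds to the "Current EEPROM Checksum" and "Calculated" lines that the patch prints.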
Re: [PATCH] e1000: Dump the eeprom when a user encounters a bad checksum
On Mon, 2007-12-17 at 13:50 -0800, Auke Kok wrote:
> diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
> index efd8c2d..aac55be 100644
> --- a/drivers/net/e1000/e1000_main.c
> +++ b/drivers/net/e1000/e1000_main.c
> @@ -979,23 +1036,29 @@ e1000_probe(struct pci_dev *pdev,
>  	e1000_reset_hw(&adapter->hw);
>
>  	/* make sure the EEPROM is good */
> -
>  	if (e1000_validate_eeprom_checksum(&adapter->hw) < 0) {
>  		DPRINTK(PROBE, ERR, "The EEPROM Checksum Is Not Valid\n");
> -		goto err_eeprom;
> +		e1000_dump_eeprom(adapter);
> +		/*
> +		 * set MAC address to all zeroes to invalidate and temporary
> +		 * disable this device for the user. This blocks regular
> +		 * traffic while still permitting ethtool ioctls from reaching
> +		 * the hardware as well as allowing the user to run the
> +		 * interface after manually setting a hw addr using
> +		 * `ip set address`
> +		 */
> +		memset(adapter->hw.mac_addr, 0, netdev->addr_len);

Do you need to set netdev->dev_addr too?

> +	} else {
> +		/* copy the MAC address out of the EEPROM */
> +		if (e1000_read_mac_addr(&adapter->hw))
> +			DPRINTK(PROBE, ERR, "EEPROM Read Error\n");
>  	}
> -
> -	/* copy the MAC address out of the EEPROM */
> -
> -	if (e1000_read_mac_addr(&adapter->hw))
> -		DPRINTK(PROBE, ERR, "EEPROM Read Error\n");
> +	/* don't block initalization here due to bad MAC address */

I just sent a patch to fix these typos and another pops up...

initialization

cheers, Joe
Re: [PATCH 2.6.25 0/9]: SCTP: Update ADD-IP implementation to conform to spec
From: Vlad Yasevich [EMAIL PROTECTED]
Date: Mon, 17 Dec 2007 16:32:40 -0500

> The following is a set of patches that updates the SCTP ADD-IP
> implementation to conform to the recently published RFC.

Patch 7 didn't seem to make it.

If you CC: on submissions like this, in the worst case at least I'll
get a copy even if the mailing list blocks it for whatever reason
(size, SPAM filter, etc.)
Re: [PATCH net-2.6.25 1/8] Create ipv4_is_type(__be32 addr) functions
On Dec 13 2007 15:38, Joe Perches wrote:
>Change IPV4 specific macros LOOPBACK MULTICAST LOCAL_MCAST BADCLASS and ZERONET
>macros to inline functions ipv4_is_type(__be32 addr)
>
>Adds type safety and arguably some readability.
>
>Changes since last submission:
>
>Removed ipv4_addr_octets function
>Used hex constants
>Converted recently added rfc3330 macros
>
>Signed-off-by: Joe Perches [EMAIL PROTECTED]
>---
>+static inline bool ipv4_is_loopback(__be32 addr)
>+{
>+	return (addr & htonl(0xff000000)) == htonl(0x7f000000);
>+}
>+

Can we use __constant_htonl()?

>+static inline bool ipv4_is_private_10(__be32 addr)
>+{
>+	return (addr & htonl(0xff000000)) == htonl(0x0a000000);
>+}

What are these functions needed for, even? There does not seem to be
any code (at least in davem's net-2.6.25:net/ipv4/, where I dared to grep)
that uses them.
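The helpers under discussion can be exercised in userspace. This is a sketch only: it assumes the full 32-bit masks 0xff000000, 0x7f000000 and 0x0a000000, uses uint32_t in place of the kernel's __be32, and relies on htonl() from <arpa/inet.h>; the function names are shortened and are not the kernel's.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* True for 127.0.0.0/8; addr_be is an address in network byte order. */
static inline bool is_loopback(uint32_t addr_be)
{
	return (addr_be & htonl(0xff000000)) == htonl(0x7f000000);
}

/* True for 10.0.0.0/8; addr_be is an address in network byte order. */
static inline bool is_private_10(uint32_t addr_be)
{
	return (addr_be & htonl(0xff000000)) == htonl(0x0a000000);
}
```

Because mask and comparison value are both converted with htonl(), the address itself never needs to be byte-swapped, and the test gives the same answer on little-endian and big-endian hosts.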
Re: [PATCH 2.6.25 0/9]: SCTP: Update ADD-IP implementation to conform to spec
David Miller wrote:
> From: Vlad Yasevich [EMAIL PROTECTED]
> Date: Mon, 17 Dec 2007 16:32:40 -0500
>
>> The following is a set of patches that updates the SCTP ADD-IP
>> implementation to conform to the recently published RFC.
>
> Patch 7 didn't seem to make it.
>
> If you CC: on submissions like this, in the worst case at least I'll
> get a copy even if the mailing list blocks it for whatever reason
> (size, SPAM filter, etc.)

Hm... only missing from netdev.. :)

Do you want me to send just patch 7, or resend the whole series?

-vlad
Re: [PATCH net-2.6.25 1/8] Create ipv4_is_type(__be32 addr) functions
From: Jan Engelhardt [EMAIL PROTECTED]
Date: Mon, 17 Dec 2007 23:37:24 +0100 (CET)

> Can we use __constant_htonl()?

That should only be used in initializers.
Re: [PATCH net-2.6.25 1/8] Create ipv4_is_type(__be32 addr) functions
On Mon, 2007-12-17 at 23:37 +0100, Jan Engelhardt wrote:
> >+static inline bool ipv4_is_loopback(__be32 addr)
> >+{
> >+	return (addr & htonl(0xff000000)) == htonl(0x7f000000);
> >+}
> >+
> Can we use __constant_htonl()?

I believe the generated code is the same.

> >+static inline bool ipv4_is_private_10(__be32 addr)
> >+{
> >+	return (addr & htonl(0xff000000)) == htonl(0x0a000000);
> >+}
> What are these functions needed for, even? There does not seem to be
> any code (at least in davem's net-2.6.25:net/ipv4/, where I dared to grep)
> that uses them.

include/net/addrconf.h
net/sctp/protocol.c

cheers, Joe
Re: [PATCH net-2.6.25 1/8] Create ipv4_is_type(__be32 addr) functions
From: Jan Engelhardt [EMAIL PROTECTED]
Date: Mon, 17 Dec 2007 23:37:24 +0100 (CET)

> On Dec 13 2007 15:38, Joe Perches wrote:
> >+static inline bool ipv4_is_private_10(__be32 addr)
> >+{
> >+	return (addr & htonl(0xff000000)) == htonl(0x0a000000);
> >+}
>
> What are these functions needed for, even? There does not seem to be
> any code (at least in davem's net-2.6.25:net/ipv4/, where I dared to grep)
> that uses them.

You really need to grep the whole tree, never ever decrease the
scope of directories to search if you want to see if an interface
is used.

It's used by some ipv6 address translation code as well as some
bits in SCTP:

include/net/addrconf.h:	eui[0] = (ipv4_is_zeronet(addr) || ipv4_is_private_10(addr) ||
net/sctp/protocol.c:	} else if (ipv4_is_private_10(addr->v4.sin_addr.s_addr) ||

So lazy...
Re: [PATCH 2.6.25 0/9]: SCTP: Update ADD-IP implementation to conform to spec
From: Vlad Yasevich [EMAIL PROTECTED]
Date: Mon, 17 Dec 2007 17:40:25 -0500

> David Miller wrote:
> > From: Vlad Yasevich [EMAIL PROTECTED]
> > Date: Mon, 17 Dec 2007 16:32:40 -0500
> >
> >> The following is a set of patches that updates the SCTP ADD-IP
> >> implementation to conform to the recently published RFC.
> >
> > Patch 7 didn't seem to make it.
> >
> > If you CC: on submissions like this, in the worst case at least I'll
> > get a copy even if the mailing list blocks it for whatever reason
> > (size, SPAM filter, etc.)
>
> Hm... only missing from netdev.. :)
>
> Do you want me to send just patch 7, or resend the whole series?

I'd like to see patch 7 so please send it.

Is it particularly big?
[PATCH] sysctl: Fix ax25 checks
Bernard Pidoux [EMAIL PROTECTED] writes:

> With 2.6.24-rc5 there is no /proc/net/ax25

/proc/sys/net/ax25?

> Here is an extract from dmesg after boot :

Groan. I thought I had found the last of the bugs with my sysctl
sanity checks. I guess you actually have to use ax25 for this bug
to show up. Thank you for catching this.

> ===
> sysctl table check failed: /net/ax25/ax0/ax25_default_mode .3.9.1.2 Unknown sysctl binary path
> Pid: 2936, comm: kissattach Not tainted 2.6.24-rc5 #1
>  [c012ca6a] set_fail+0x3b/0x43
>  [c012ce7a] sysctl_check_table+0x408/0x456
>  [c012ce8e] sysctl_check_table+0x41c/0x456
>  [c012ce8e] sysctl_check_table+0x41c/0x456
>  [c02ac64a] _spin_unlock+0x14/0x1c
>  [c012ce8e] sysctl_check_table+0x41c/0x456
>  [c011e681] sysctl_set_parent+0x19/0x2a
>  [c011f55c] register_sysctl_table+0x45/0x85
>  [d8be9d26] ax25_register_sysctl+0x112/0x11c [ax25]
>  [d8be6c76] ax25_device_event+0x2e/0x90 [ax25]
>  [c012c560] notifier_call_chain+0x2a/0x47
>  [c012c59f] raw_notifier_call_chain+0x17/0x1a
>  [c0259290] dev_open+0x6f/0x75
>  [c0257ee7] dev_change_flags+0x9c/0x148
>  [c0256ab3] __dev_get_by_name+0x68/0x73
>  [c0292307] devinet_ioctl+0x22e/0x53b
>  [c0259074] dev_ioctl+0x472/0x5ba
>  [c024d4ba] sock_ioctl+0x1aa/0x1cf
>  [c024d310] sock_ioctl+0x0/0x1cf
>  [c016bc19] do_ioctl+0x19/0x4c
>  [c016be40] vfs_ioctl+0x1f4/0x20b
>  [c0103d01] sysenter_past_esp+0x9a/0xa9
>  [c016be9c] sys_ioctl+0x45/0x5d
>  [c0103cc6] sysenter_past_esp+0x5f/0xa9
> ===
> sysctl table check failed: /net/ax25/ax0/backoff_type .3.9.1.3 Unknown sysctl binary path
> (...) truncated
> ===
> sysctl table check failed: /net/ax25/ax0/connect_mode .3.9.1.4 Unknown sysctl binary path
> (...)
> ===
> sysctl table check failed: /net/ax25/ax0/standard_window_size .3.9.1.5 Unknown sysctl binary path
> ===
> (...)
> and so on ...

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 kernel/sysctl_check.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index bed939f..a68425a 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -428,7 +428,7 @@ static struct trans_ctl_table trans_net_netrom_table[] = {
 	{}
 };
 
-static struct trans_ctl_table trans_net_ax25_table[] = {
+static struct trans_ctl_table trans_net_ax25_param_table[] = {
 	{ NET_AX25_IP_DEFAULT_MODE,	"ip_default_mode" },
 	{ NET_AX25_DEFAULT_MODE,	"ax25_default_mode" },
 	{ NET_AX25_BACKOFF_TYPE,	"backoff_type" },
@@ -446,6 +446,11 @@ static struct trans_ctl_table trans_net_ax25_table[] = {
 	{}
 };
 
+static struct trans_ctl_table trans_net_ax25_table[] = {
+	{ 0, NULL, trans_net_ax25_param_table },
+	{}
+};
+
 static struct trans_ctl_table trans_net_bridge_table[] = {
 	{ NET_BRIDGE_NF_CALL_ARPTABLES,	"bridge-nf-call-arptables" },
 	{ NET_BRIDGE_NF_CALL_IPTABLES,	"bridge-nf-call-iptables" },
-- 
1.5.3.rc6.17.g1911
Re: [PATCH net-2.6.25 1/8] Create ipv4_is_type(__be32 addr) functions
On Dec 17 2007 14:43, David Miller wrote:
>> On Dec 13 2007 15:38, Joe Perches wrote:
>> >+static inline bool ipv4_is_private_10(__be32 addr)
>> >+{
>> >+	return (addr & htonl(0xff000000)) == htonl(0x0a000000);
>> >+}
>>
>> What are these functions needed for, even? There does not seem to be
>> any code (at least in davem's net-2.6.25:net/ipv4/, where I dared to grep)
>> that uses them.
>
>You really need to grep the whole tree, never ever decrease the
>scope of directories to search if you want to see if an interface
>is used.

Hah you got me there :)

>It's used by some ipv6 address translation code as well as some
>bits in SCTP:
>
>include/net/addrconf.h:	eui[0] = (ipv4_is_zeronet(addr) || ipv4_is_private_10(addr) ||
>net/sctp/protocol.c:	} else if (ipv4_is_private_10(addr->v4.sin_addr.s_addr) ||
>
>So lazy...

I just discovered git-grep...
Re: [PATCH] endianness annotations and fixes for olympic
Al Viro wrote:
> * missing braces in !readl(...) ...
> * trivial endianness annotations
> * in olympic_arb_cmd() the loop collecting fragments of packet
>   is b0rken on big-endian - we have
>   (next_ptr && (buf_ptr=olympic_priv->olympic_lap + ntohs(next_ptr)))
>   as condition and it should have swab16(), not ntohs() - it's host-endian
>   byteswapped, not big-endian. So if we get more than one fragment on
>   big-endian host, we get screwed. This ntohs() got missed back when
>   the rest of those had been switched to swab16() in 2.4.0-test2-pre1 -
>   at a guess, nobody had hit fragmented packets during the testing of
>   PPC fixes.
>
> PS: Ken Aaker cc'd on assumption that he is the same guy who'd done the
> original set of PPC fixes in olympic
>
> Signed-off-by: Al Viro [EMAIL PROTECTED]

applied #upstream
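The difference between the two conversions can be made concrete in userspace. This is an illustrative sketch, not driver code: swab16_example() and host_is_big_endian() are hypothetical helpers written for this example, while ntohs() is the standard function from <arpa/inet.h>.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Unconditional byte swap, the behaviour expected of swab16(). */
static uint16_t swab16_example(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Detect the byte order of the host at run time. */
static int host_is_big_endian(void)
{
	const uint16_t probe = 0x0102;

	return *(const uint8_t *)&probe == 0x01;
}
```

On a little-endian host ntohs() swaps the two bytes, so it happens to agree with the unconditional swap; on a big-endian host ntohs() returns its argument unchanged, which is why a field that is always byteswapped relative to the host needs swab16() there.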
Re: [PATCH 0/4] Pull request for 'sis190' branch
Francois Romieu wrote:
> Please pull from branch 'sis190' in repository
>
> git://git.kernel.org/pub/scm/linux/kernel/git/romieu/netdev-2.6.git sis190
>
> to get the changes below.
>
> Distance from 'upstream-linus' (7962024e9d16e9349d76b553326f3fa7be64305e)
> -
>
> c27e14e508664471b8e44ef1f81ec080213ea314
> 348de67fe200e25d8cb80cff35642192436cfeda
> 004a22d03d62cd08e5287273a5143447db009cd0
> 14deb44ffe7220be2de697d616f28cce17e72297
>
> Diffstat
>
>  drivers/net/sis190.c | 21 ++---
>  1 files changed, 10 insertions(+), 11 deletions(-)
>
> Shortlog
>
> Francois Romieu (4):
>       sis190: add cmos ram access code for the SiS19x/968 chipset pair
>       sis190: remove duplicate INIT_WORK
>       sis190: mdio operation failure is not correctly detected
>       sis190: scheduling while atomic error

pulled into #upstream
Re: Pull request for 'ipg' branch
Francois Romieu wrote:
> Please pull from branch 'ipg' in repository
>
> git://git.kernel.org/pub/scm/linux/kernel/git/romieu/netdev-2.6.git ipg
>
> to get the changes below.
>
> Distance from 'upstream' (558f08ed31c6909d3c9ae5d6dbf81220ede4b54a)
> ---
>
> 4918e9ebf74735bb8e664f97dc1dcc1e3d6abf9e
> e7b6ced0731fc6ad1a15a015de7e8c2e15da95f8
> 46bc63253ca320770b855249ef8cb894940a3437
> 992a029067f6e5c70021a0df65b2469a06362832
> adf129afde7ea8444fb6441da27e2d3385dbc297
> 25ba65ab7ff51de58ea0c70d23b3257f658f180d
> 2af61e99e3d1c959840ea007ff56b15db794fb99
> 9312ed326b6d944ec01636db69cba84de46c0c69
>
> Diffstat
>
>  drivers/net/ipg.c | 284 -
>  drivers/net/ipg.h | 99 +++
>  2 files changed, 168 insertions(+), 215 deletions(-)

pulled
Re: [PATCH][NETDEV]: remove netif_running() check from myri10ge_poll()
Andrew Gallatin wrote:

Remove the bogus netif_running() check from myri10ge_poll(). This eliminates any chance that myri10ge_poll() can trigger an oops by calling netif_rx_complete() and returning with work_done == budget.

Signed-off-by: Andrew Gallatin [EMAIL PROTECTED]

holding onto this patch but not applying, because NAPI discussions appear to be continuing...
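The hazard named in the myri10ge patch description (calling netif_rx_complete() and still returning work_done == budget) can be modelled with a toy poll loop. This is a user-space sketch, not driver code: struct napi_sketch, rx_complete_sketch() and poll_sketch() are hypothetical names, and the sketch assumes that returning the full budget tells the core the poller is still scheduled, so completing and returning the full budget must never happen together.

```c
#include <assert.h>

/* Hypothetical stand-in for a NAPI context; not the kernel's struct. */
struct napi_sketch {
    int scheduled;  /* 1 while the poller is on the core's poll list */
    int pending;    /* packets waiting in the imaginary RX ring */
};

/* Stand-in for netif_rx_complete(): take the poller off the list. */
static void rx_complete_sketch(struct napi_sketch *n)
{
    n->scheduled = 0;
}

/* Process up to budget packets and report how many were handled. */
static int poll_sketch(struct napi_sketch *n, int budget)
{
    int work_done = 0;

    while (work_done < budget && n->pending > 0) {
        n->pending--;
        work_done++;
    }

    /* Complete only when the budget was not exhausted, so that
     * "completed" and "work_done == budget" never occur together. */
    if (work_done < budget)
        rx_complete_sketch(n);

    /* Invariant the patch is trying to guarantee. */
    assert(n->scheduled || work_done < budget);
    return work_done;
}
```

Tying the completion to work_done < budget alone, rather than to an additional state check such as netif_running(), is what keeps the invariant in the final assert() true on every path.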
Re: [PATCH/RFC] [POWERPC] Add fixed-phy support for fs_enet
Jochen Friedrich wrote:

This patch adds support to use the fixed-link property of an ethernet node to fs_enet for the CONFIG_PPC_CPM_NEW_BINDING case.

Signed-off-by: Jochen Friedrich [EMAIL PROTECTED]
---
 drivers/net/fs_enet/fs_enet-main.c | 9 -
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/drivers/net/fs_enet/fs_enet-main.c b/drivers/net/fs_enet/fs_enet-main.c
index f2a4d39..8220c70 100644
--- a/drivers/net/fs_enet/fs_enet-main.c
+++ b/drivers/net/fs_enet/fs_enet-main.c
@@ -1174,8 +1174,15 @@ static int __devinit find_phy(struct device_node *np,
 	struct device_node *phynode, *mdionode;
 	struct resource res;
 	int ret = 0, len;
+	const u32 *data;
+
+	data = of_get_property(np, "fixed-link", NULL);
+	if (data) {
+		snprintf(fpi->bus_id, 16, PHY_ID_FMT, 0, *data);
+		return 0;
+	}
 
-	const u32 *data = of_get_property(np, "phy-handle", &len);
+	data = of_get_property(np, "phy-handle", &len);
 	if (!data || len != 4)

ACK, pass this through paulus?
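The lookup order introduced by the fs_enet patch (try fixed-link first, fall back to phy-handle) can be sketched in user space as follows. Everything here is illustrative: struct prop_sketch, struct node_sketch and get_property_sketch() are hypothetical stand-ins for the device-tree API, and the "%x:%02x" format string is an assumption standing in for PHY_ID_FMT.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for a device-tree node and its properties. */
struct prop_sketch { const char *name; const void *value; int len; };
struct node_sketch { const struct prop_sketch *props; int nprops; };

/* Look a property up by name; optionally report its length in bytes. */
static const void *get_property_sketch(const struct node_sketch *np,
                                       const char *name, int *lenp)
{
    for (int i = 0; i < np->nprops; i++) {
        if (strcmp(np->props[i].name, name) == 0) {
            if (lenp)
                *lenp = np->props[i].len;
            return np->props[i].value;
        }
    }
    return NULL;
}

/* Returns 0 if a fixed link was found and bus_id was filled in,
 * 1 if the caller should go on to resolve phy-handle,
 * -1 if neither usable property is present. */
static int find_phy_sketch(const struct node_sketch *np,
                           char *bus_id, size_t size)
{
    int len = 0;
    const uint32_t *data;

    assert(np && bus_id && size > 0);

    data = get_property_sketch(np, "fixed-link", NULL);
    if (data) {
        /* Assumed format; the patch uses PHY_ID_FMT here. */
        snprintf(bus_id, size, "%x:%02x", 0u, (unsigned)*data);
        return 0;
    }

    data = get_property_sketch(np, "phy-handle", &len);
    if (!data || len != 4)
        return -1;
    return 1;
}
```

Checking fixed-link before phy-handle means a node that carries both is treated as a fixed link, which mirrors the early return in the patch.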
Re: Please pull 'upstream-jgarzik' branch of wireless-2.6
Hum, this required merging also, and broke the build too :/

drivers/net/wireless/iwlwifi/iwl3945-base.c: In function ‘iwl3945_alive_start’:
drivers/net/wireless/iwlwifi/iwl3945-base.c:6285: error: implicit declaration of function ‘iwl_rate_control_unregister’
make[4]: *** [drivers/net/wireless/iwlwifi/iwl3945-base.o] Error 1
make[3]: *** [drivers/net/wireless/iwlwifi] Error 2
make[2]: *** [drivers/net/wireless] Error 2
make[1]: *** [drivers/net] Error 2
make: *** [drivers] Error 2

I'll leave it there and assume that you will send a fix --on top of-- netdev#upstream ...
Re: [PATCH 1/2] add driver for enc28j60 ethernet chip
Claudio Lanconelli wrote:

These patches add support for Microchip enc28j60 ethernet chip controlled via SPI. I tested it on my custom board (S162) with ARM9 s3c2442 SoC. Any comments are welcome.

Signed-off-by: Claudio Lanconelli [EMAIL PROTECTED]

comments:

* overall: a clean driver that looks mostly acceptable, good work
* use stats in net_device rather than defining your own
* use ethtool rather than 'full_duplex' variable to select duplex
* [suggestion but not requirement] kernel prefers u8 and u32 types to the C99 types uint8_t or uint32_t
* remove the 'inline' markers from functions, and let the compiler make the decision
* udelay() in enc28j60_phy_write() -- and any similar code pattern -- may not actually delay for the specified amount of time, when you consider that writes may be posted. normally a read will flush a write.
* Why do interrupt work in a kernel thread? Your comment says you cannot, but does not explain.
* should use NAPI
* should be able to program multicast list while everything is active
Re: [PATCH 1/2] add driver for enc28j60 ethernet chip
oh yeah: make sure your Kconfig/Makefile stuff is in the _same_ patch as your driver.