Re: help on tg3 polling extension
From: "Qinghua(Kevin) Ye" <[EMAIL PROTECTED]>
Date: Wed, 6 Jul 2005 13:15:40 -0600

> Yes, It wastes CPU cycles if there is other process running. However, as it
> being a dedicated router, it should not be a problem. The process of packets
> is the only task it is supposed to do.

Linux is a general purpose operating system.

Even as a dedicated router, a router daemon still has to execute in
userspace to do BGP etc. signaling with routing peers. The
administrator also might want to run diagnostic tools to monitor the
network.

You cannot spin polling on the device, it's simply unacceptable to
starve out userspace and the rest of the system like that.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: help on tg3 polling extension
From: "Qinghua(Kevin) Ye" <[EMAIL PROTECTED]>
Date: Wed, 6 Jul 2005 14:12:29 -0600

> Yes, you are right. Click actually will release the CPU to OS at interval.
> Other processes will be responded at this interval.

It is not Click's right to make this kind of decision, that is what
we have the process scheduler for.

> The goal of polling extension is to reduce the interrupt overhead and
> improve the throughput, especially the small packets. NAPI does solve this
> problem to some extent.

And the extent to which NAPI does not solve this problem is???
Please propose something that solves this problem better and still
respects the other processes and resources in the system.

> If not use polling, how can I make use of all the CPUs to process packets?
> Can I make all of the CPUs run SOFTIRQ and IRQ code simultaneously? It seems
> there is only one ksoftirqd process busy dealing with process, while the
> other ksoftirqd is idle in my system.

There is one ksoftirqd for each cpu in the system. All the network
card interrupts are arriving at that one cpu on your machine, so the
other ksoftirqd doesn't have any work to do.

If ksoftirqd is running very often, this means that network
processing is consuming an enormous amount of your cpu. So the
softirq work gets handed to a process (ksoftirqd), and thus the
packet processing is properly shared with other processes on the
system and nobody is starved out.
Re: help on tg3 polling extension
From: "Qinghua(Kevin) Ye" <[EMAIL PROTECTED]>
Date: Wed, 6 Jul 2005 15:57:00 -0600

> In my SMP platform, there is no other processes running. The usage of CPUs
> are 100% and 0%. How could I make Nic interrupts not arrive at only one CPU,
> or balance the interrupt between two CPUs?

This doesn't work. If you try to split up the work for one network
card amongst multiple cpus, you'll get SMP cache line movements for
shared data between the processors and performance will go down.
Re: [PATCH 0/2] Add Networking menu and clean up net/Kconfig
From: Sam Ravnborg <[EMAIL PROTECTED]>
Date: Wed, 6 Jul 2005 23:06:53 +0200

> When (if) accepted I expect someone (Dave?) from netdev to push this
> onwards.

I can't apply these patches because they will break several platforms
that don't use drivers/Kconfig, my workstation (sparc64) would be one
of those platforms :)
Re: tg3 polling extension
From: "Qinghua(Kevin) Ye" <[EMAIL PROTECTED]>
Date: Thu, 7 Jul 2005 16:04:40 -0600

> I did some small test showing that polling can improve the packet processing
> throughput a bit. I still need to do more tests. Could anyone give me some
> information about the lock scheme of RX and TX procedure? I would be very
> appreciate. Thanks.

It depends, the locking changed significantly in the current
2.6.13-rcX version of the driver. But before that:

1) ->hard_start_xmit() needs to hold the tx_lock with hard IRQs
   disabled, as does tg3_tx(). It uses NETIF_F_LLTX locking, thus
   the callers do not grab netdev->xmit_lock and thus do not
   guarantee atomic invocation of the driver's ->hard_start_xmit
   method.

2) Interrupt processing needs to hold ->lock with hard IRQs disabled.
   As does any code which wants to reprogram the hardware.

3) tg3_rx() runs without locks held because ->poll() calls are
   guaranteed to be atomic.

4) Any piece of code which wants to significantly reprogram the
   tg3 chip must:

   a) shut down ->poll() processing by doing tg3_netif_stop()
   b) grab ->lock with HW irqs disabled
   c) grab ->tx_lock

   The unlocking afterwards must be done in the precise reverse
   order.

You could have figured this out by simply reading the driver and
looking at how the locks are used. I merely translated the code into
English, and also there is a big fat comment at the top of
"struct tg3" describing how the locks are used. I did not put that
comment there for my health. :-)

You also can turn on spinlock debugging to try and figure out any SMP
hang problems you might be seeing as well.
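The ordering in rule 4 above can be sketched as a small single-threaded model. This is only an illustration under stated assumptions: `struct toy_dev` and every `toy_*` function are hypothetical names, plain flags stand in for the driver's spinlocks and for tg3_netif_stop(), and the assertions merely check that the steps happen in the order a), b), c) and are undone in reverse. It is not the tg3 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the pre-2.6.13 "reprogram the chip" sequence.
 * Flags replace the real locks; nothing here is a tg3 function. */
struct toy_dev {
	bool poll_stopped;   /* models ->poll() having been shut down */
	bool lock_held;      /* models ->lock (hard IRQs off in the driver) */
	bool tx_lock_held;   /* models ->tx_lock */
	int chip_reg;        /* pretend hardware register */
};

/* a) stop ->poll() first, before any lock is taken */
static void toy_stop_poll(struct toy_dev *d)
{
	assert(!d->poll_stopped && !d->lock_held && !d->tx_lock_held);
	d->poll_stopped = true;
}

/* b) take ->lock only once polling is stopped */
static void toy_lock(struct toy_dev *d)
{
	assert(d->poll_stopped && !d->lock_held);
	d->lock_held = true;
}

/* c) take ->tx_lock last */
static void toy_tx_lock(struct toy_dev *d)
{
	assert(d->lock_held && !d->tx_lock_held);
	d->tx_lock_held = true;
}

/* Unlocking mirrors the sequence above in reverse order. */
static void toy_tx_unlock(struct toy_dev *d)
{
	assert(d->tx_lock_held);
	d->tx_lock_held = false;
}

static void toy_unlock(struct toy_dev *d)
{
	assert(d->lock_held && !d->tx_lock_held);
	d->lock_held = false;
}

static void toy_start_poll(struct toy_dev *d)
{
	assert(d->poll_stopped && !d->lock_held);
	d->poll_stopped = false;
}

/* Any "significant reprogramming" goes through the full sequence. */
static void toy_reprogram(struct toy_dev *d, int value)
{
	toy_stop_poll(d);
	toy_lock(d);
	toy_tx_lock(d);

	d->chip_reg = value;

	toy_tx_unlock(d);
	toy_unlock(d);
	toy_start_poll(d);
}
```

Calling `toy_reprogram()` on a zero-initialized `struct toy_dev` leaves all three flags clear again; calling the steps out of order trips an assertion, which is the point of the sketch.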
Re: [patch 1/1] net: fix sparse warnings
From: [EMAIL PROTECTED]
Date: Thu, 07 Jul 2005 23:30:26 +0200

> From: Victor Fusco <[EMAIL PROTECTED]>
>
> Fix the sparse warning "implicit cast to nocast type"
>
> Signed-off-by: Victor Fusco <[EMAIL PROTECTED]>
> Signed-off-by: Domen Puncer <[EMAIL PROTECTED]>

Applied, thanks a lot.
Re: tg3 polling extension
From: "Qinghua(Kevin) Ye" <[EMAIL PROTECTED]>
Date: Thu, 7 Jul 2005 17:43:06 -0600

> So the tg3_tx() and tg3_start_xmit do not include any code of reprogramming
> the hardware?

Right, they just process the TX ring.

> What kinds of code can be classified to reprogramming the hardware? Should
> the tw32_t/rx_mbox and tw32_mailbox operation be classified into this
> catalog?

Anything other than normal packet processing. The MBOX writes used
to process the packets in the RX ring would not be considered
reprogramming of the chip.

> Another problem is about the Flushing the Status block to host memory. In
> your original code, this is done by
> tr32(MAILBOX_INTERRUPT_0+TG3_64BIT_REG_LOW).

This readback is necessary to flush out any posted PCI writes to chip
registers in most circumstances. The one exception is
tg3_restart_ints() and normal interrupt handling.

The current 2.6.13-rcX tg3.c driver is totally revamped wrt. interrupt
processing, PIO flushes, and locking in general. You may wish to
check it out.
Re: [PATCH 2.6.12.1 5/12] S2io: Performance improvements
From: "Raghavendra Koushik" <[EMAIL PROTECTED]>
Date: Thu, 7 Jul 2005 18:06:19 -0700

> wmb() is to ensure ordered PIO writes.

wmb() does no such thing. It only has influence on load and store
instructions done by the local processor, it has no effect on what the
PCI bus may do with PIO writes (ie. post them).

If you need a PIO to complete in a specific order, you have to read it
back. If you need PIO operations to occur in a specific order
wrt. cpu memory operations, mmiowb() is what you need to use.
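The point about posted writes can be illustrated with a toy model. It is a sketch under loud assumptions: `struct toy_bus`, `toy_write()` and `toy_read()` are invented for illustration and model a bus that queues writes until a read forces them out. Real PCI posting is done by bridge hardware, not by code like this.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NREGS 4   /* number of pretend device registers */
#define TOY_QLEN  8   /* capacity of the pretend posting buffer */

/* Hypothetical model: writes are "posted" into a queue and only reach
 * the device when a read from the device flushes them. */
struct toy_bus {
	unsigned int chip[TOY_NREGS];   /* what the device has actually seen */
	struct {
		unsigned int reg;
		unsigned int val;
	} posted[TOY_QLEN];             /* writes still sitting on the bus */
	size_t nposted;
};

/* A register write returns immediately; the device has not seen it yet. */
static void toy_write(struct toy_bus *b, unsigned int reg, unsigned int val)
{
	assert(reg < TOY_NREGS && b->nposted < TOY_QLEN);
	b->posted[b->nposted].reg = reg;
	b->posted[b->nposted].val = val;
	b->nposted++;
}

/* A read cannot complete until every earlier posted write has been
 * delivered, so reading back is what forces completion. */
static unsigned int toy_read(struct toy_bus *b, unsigned int reg)
{
	size_t i;

	assert(reg < TOY_NREGS);
	for (i = 0; i < b->nposted; i++)
		b->chip[b->posted[i].reg] = b->posted[i].val;
	b->nposted = 0;
	return b->chip[reg];
}
```

In this model nothing the CPU does to its own loads and stores changes `chip[]`; only `toy_read()` does, which mirrors the "you have to read it back" advice above.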
[INCOMPLETE PATCH]: killing skb->list
I got inspired earlier today, and found that it seemed it might be
easy to kill off the 'list' member of struct sk_buff without changing
sk_buff_head at all.

I got very far. Nearly every single piece of code was easy to change
to pass in an explicit SKB list instead of skb->list to the SKB queue
management functions.

The big exception was SCTP. I can't believe after being in the
kernel for several years it has all of this complicated list handling,
SKB structure overlaying, and casting all over the place. It was a
big downer after a very positive day of coding.

First, it casts "struct sctp_chunk *" pointers to "struct sk_buff *"
so that it can "borrow" the SKB list handling functions. I just
copied over the skb_*() routines it used in this way to be
sctp_chunk_*(), and used them throughout and eliminated the ugly
casts. This can be simplified a lot further, since it really doesn't
care about 'qlen'. In fact, what it wants is just the most basic list
handling, ala linux/list.h. So just sticking a list_head into
sctp_chunk and replacing sctp_chunk_list with a list_head as well
should do the trick.

Some of the rest of the SCTP stuff was transformable with not too much
effort. But then I really got stymied by the reassembly and partial
queue handling. These SCTP ulp event things make a layer of
abstraction to the skb_unlink() point such that you can't know what
list the SKB is on. One way to deal with this is to store the list
pointer in the event struct, and that's likely what will happen at
first. This isn't trivial because you have to make sure the
assignment is done at all of the receive packet list insertion points,
some places even use sk_buff_head lists on the local stack making this
chore even more "exciting" :(

But I didn't try to do that properly for now, and SCTP needs to be
disabled in the config to play with this patch below.

Another case that needs some careful study and review by others is the
usbnet.c driver.
Man, that is another piece of networking code that could use some
serious cleanups. I bet it would be a simpler driver if it did things
NAPI style too. But of particular concern is all of the SKB data area
mangling it does to workaround restrictions in various USB net device
implementations.

This patch goes on top of the skb_queue_empty() diff I sent earlier
today.

I know I may have missed some skb_unlink() et al. fixups for things
that I didn't enable in my config, so patches to cure that would be
appreciated. It's usually a very simplistic transformation. Even
TCP, my biggest fear, only needed some minor modifications to
tcp_collapse() and the rest was straightforward.

Frankly, other than the SCTP parts, this is not very invasive at all.
But it needs a lot of testing and review before I'd feel comfortable
sending it along.

diff --git a/drivers/bluetooth/bfusb.c b/drivers/bluetooth/bfusb.c
--- a/drivers/bluetooth/bfusb.c
+++ b/drivers/bluetooth/bfusb.c
@@ -158,7 +158,7 @@ static int bfusb_send_bulk(struct bfusb
 	if (err) {
 		BT_ERR("%s bulk tx submit failed urb %p err %d",
 					bfusb->hdev->name, urb, err);
-		skb_unlink(skb);
+		skb_unlink(skb, &bfusb->pending_q);
 		usb_free_urb(urb);
 	} else
 		atomic_inc(&bfusb->pending_tx);
@@ -212,7 +212,7 @@ static void bfusb_tx_complete(struct urb
 	read_lock(&bfusb->lock);
-	skb_unlink(skb);
+	skb_unlink(skb, &bfusb->pending_q);
 	skb_queue_tail(&bfusb->completed_q, skb);
 	bfusb_tx_wakeup(bfusb);
@@ -253,7 +253,7 @@ static int bfusb_rx_submit(struct bfusb
 	if (err) {
 		BT_ERR("%s bulk rx submit failed urb %p err %d",
 					bfusb->hdev->name, urb, err);
-		skb_unlink(skb);
+		skb_unlink(skb, &bfusb->pending_q);
 		kfree_skb(skb);
 		usb_free_urb(urb);
 	}
@@ -398,7 +398,7 @@ static void bfusb_rx_complete(struct urb
 		buf += len;
 	}
-	skb_unlink(skb);
+	skb_unlink(skb, &bfusb->pending_q);
 	kfree_skb(skb);
 	bfusb_rx_submit(bfusb, urb);
diff --git a/drivers/ieee1394/ieee1394_core.c b/drivers/ieee1394/ieee1394_core.c
--- a/drivers/ieee1394/ieee1394_core.c
+++ b/drivers/ieee1394/ieee1394_core.c
@@ -678,7 +678,7 @@ static void handle_packet_response(struc
 		return;
 	}
-	__skb_unlink(skb, skb->list);
+	__skb_unlink(skb, &host->pending_packet_queue);
 	if (packet->state == hpsb_queued) {
 		packet->sendtime = jiffies;
@@ -986,7 +986,7 @@ void abort_timedouts(unsigned long __opa
 		packet = (struct hpsb_packet *)skb->data;
 		if (time_before(packet->sendtime + expire, jiffies)) {
-			__skb_unlink(skb, skb->list);
+			__skb_unlink(skb, &host->pending_packet_queue);
 			pa
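The transformation in the patch above, where the caller names the queue instead of the buffer remembering it, can be sketched in plain C. This is a minimal model, assuming made-up types `struct toy_buf` and `struct toy_queue` and no locking; it is not the sk_buff code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer: it carries only next/prev links and, unlike the
 * old sk_buff, no back-pointer to the queue it sits on. */
struct toy_buf {
	struct toy_buf *next, *prev;
};

/* Hypothetical queue head: a sentinel node plus a length counter. */
struct toy_queue {
	struct toy_buf head;
	unsigned int qlen;
};

static void toy_queue_init(struct toy_queue *q)
{
	q->head.next = q->head.prev = &q->head;
	q->qlen = 0;
}

/* Append a buffer at the tail of the queue. */
static void toy_queue_tail(struct toy_queue *q, struct toy_buf *b)
{
	b->prev = q->head.prev;
	b->next = &q->head;
	q->head.prev->next = b;
	q->head.prev = b;
	q->qlen++;
}

/* Unlink with an explicit queue argument: the caller must know which
 * queue the buffer is on, because the buffer no longer records it and
 * the queue length has to be adjusted somewhere. */
static void toy_unlink(struct toy_buf *b, struct toy_queue *q)
{
	b->prev->next = b->next;
	b->next->prev = b->prev;
	b->next = b->prev = NULL;
	q->qlen--;
}
```

Moving a buffer from a pending queue to a completed queue then reads `toy_unlink(&buf, &pending); toy_queue_tail(&completed, &buf);`, which is the same shape as the bfusb hunks above.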
Re: Seekable Sockets
From: Chase Douglas <[EMAIL PROTECTED]>
Date: Fri, 08 Jul 2005 16:12:12 -0500

> This can be useful for programs such as mpi. In mpi, a server receives
> results of computations from clients. However, the server cannot control who
> sends data when. If the server needs data from client A to know how to
> process the data from client B, then the server will want data from client A
> first. Currently, if data from Client B comes first, then the mpi library
> will copy the data into the library in userspace, then copy the data from
> client A into the server program, and then copy the data from client B from
> its own library buffer into the server program. If the socket is seekable,
> then if data from client B comes first, we can seek past it and grab the
> data from client A and copy it directly to the server program, then copy the
> data from client B directly into the server program, saving a copy from
> userspace to userspace (and possibly an allocation in userspace in the mpi
> library). Other uses can also be found.

It seems more logical to use a different socket for each client to
solve this problem. You're trying to multiplex a single TCP
connection, and that's what multiple TCP connections are for.

This whole seekable socket idea seems quite foolhardy, and seems to
serve only to help misdesigned userspace applications.
Re: [INCOMPLETE PATCH]: killing skb->list
From: Sridhar Samudrala <[EMAIL PROTECTED]>
Date: Fri, 08 Jul 2005 15:47:56 -0700

> I guess we could use the generic lists rather than skb list. But
> your sctp_chunk_list looks fine for now except for a minor
> bug in __sctp_chunk_dequeue(). You missed resetting result->list
> to NULL.

Thanks for the patch.

Today I tried to actually move over to generic lists for the chunk
stuff. I got really close (see patch below), but the last snag I hit
was the backlog processing. The SCTP stack sneakily passes sctp_chunk
pointers into sk_add_backlog(). I would never have spotted this
without explicitly looking for all the ugly "struct sctp_chunk *" and
"struct sk_buff *" casts.

I'll see if I can figure out a way to deal with this cleanly.

Oh yeah, also, I think the list_empty() if statement in
sctp_chunk_free() can be turned entirely into a BUG_ON(). Every caller
should be unlinking the chunk from whatever list it is on beforehand.

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -592,13 +592,8 @@ int sctp_chunk_abandoned(struct sctp_chu
  * each chunk as well as a few other header pointers...
  */
 struct sctp_chunk {
-	/* These first three elements MUST PRECISELY match the first
-	 * three elements of struct sk_buff. This allows us to reuse
-	 * all the skb_* queue management functions.
-	 */
-	struct sctp_chunk *next;
-	struct sctp_chunk *prev;
-	struct sk_buff_head *list;
+	struct list_head list;
+
 	atomic_t refcnt;
 	/* This is our link to the per-transport transmitted list. */
@@ -717,7 +712,7 @@ struct sctp_packet {
 	__u32 vtag;
 	/* This contains the payload chunks. */
-	struct sk_buff_head chunks;
+	struct list_head chunk_list;
 	/* This is the overhead of the sctp and ip headers. */
 	size_t overhead;
@@ -974,7 +969,7 @@ struct sctp_inq {
 	/* This is actually a queue of sctp_chunk each
 	 * containing a partially decoded packet.
 	 */
-	struct sk_buff_head in;
+	struct list_head in_chunk_list;
 	/* This is the packet which is currently off the in queue and is
 	 * being worked on through the inbound chunk processing.
 	 */
@@ -1017,7 +1012,7 @@ struct sctp_outq {
 	struct sctp_association *asoc;
 	/* Data pending that has never been transmitted. */
-	struct sk_buff_head out;
+	struct list_head out_chunk_list;
 	unsigned out_qlen;	/* Total length of queued data chunks. */
@@ -1025,7 +1020,7 @@ struct sctp_outq {
 	unsigned error;
 	/* These are control chunks we want to send. */
-	struct sk_buff_head control;
+	struct list_head control_chunk_list;
 	/* These are chunks that have been sacked but are above the
 	 * CTSN, or cumulative tsn ack point.
@@ -1672,7 +1667,7 @@ struct sctp_association {
 	 * which already resides in sctp_outq. Please move this
 	 * queue and its supporting logic down there. --piggy]
 	 */
-	struct sk_buff_head addip_chunks;
+	struct list_head addip_chunk_list;
 	/* ADDIP Section 4.1 ASCONF Chunk Procedures
 	 *
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -203,7 +203,7 @@ static struct sctp_association *sctp_ass
 	 */
 	asoc->addip_serial = asoc->c.initial_tsn;
-	skb_queue_head_init(&asoc->addip_chunks);
+	INIT_LIST_HEAD(&asoc->addip_chunk_list);
 	/* Make an empty list of remote transport addresses. */
 	INIT_LIST_HEAD(&asoc->peer.transport_addr_list);
diff --git a/net/sctp/input.c b/net/sctp/input.c
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -308,6 +308,7 @@ int sctp_backlog_rcv(struct sock *sk, st
 	/* One day chunk will live inside the skb, but for
 	 * now this works.
 	 */
+#error this does not work -DaveM
 	chunk = (struct sctp_chunk *) skb;
 	inqueue = &chunk->rcvr->inqueue;
diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c
--- a/net/sctp/inqueue.c
+++ b/net/sctp/inqueue.c
@@ -50,7 +50,7 @@
 /* Initialize an SCTP inqueue. */
 void sctp_inq_init(struct sctp_inq *queue)
 {
-	skb_queue_head_init(&queue->in);
+	INIT_LIST_HEAD(&queue->in_chunk_list);
 	queue->in_progress = NULL;
 	/* Create a task for delivering data. */
@@ -62,11 +62,13 @@ void sctp_inq_init(struct sctp_inq *queu
 /* Release the memory associated with an SCTP inqueue. */
 void sctp_inq_free(struct sctp_inq *queue)
 {
-	struct sctp_chunk *chunk;
+	struct sctp_chunk *chunk, *tmp;
 	/* Empty the queue. */
-	while ((chunk = (struct sctp_chunk *) skb_dequeue(&queue->in)) != NULL)
+	list_for_each_entry_safe(chunk, tmp, &queue->in_chunk_list, list) {
+		list_del_init(&chunk->list);
 		sctp_chunk_free(chunk);
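The embedded-list pattern the patch switches to can be sketched in userspace C. This is a minimal imitation under stated assumptions: `struct toy_list`, `toy_entry()` and the other `toy_*` names are stand-ins written for this example, loosely modelled on list_head, container_of and list_del_init; the `assert()` in `toy_chunk_free()` plays the role of the BUG_ON() proposed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct list_head. */
struct toy_list {
	struct toy_list *next, *prev;
};

/* Recover the containing object from a pointer to its embedded link. */
#define toy_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void toy_list_init(struct toy_list *h)
{
	h->next = h->prev = h;
}

static int toy_list_empty(const struct toy_list *h)
{
	return h->next == h;
}

static void toy_list_add_tail(struct toy_list *n, struct toy_list *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Unlink and re-initialize, so a later emptiness check on the node
 * itself is meaningful. */
static void toy_list_del_init(struct toy_list *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	toy_list_init(n);
}

/* Hypothetical chunk: the link is an ordinary member, so no layout
 * tricks or casts to another struct type are needed. */
struct toy_chunk {
	struct toy_list list;
	int freed;
};

static void toy_chunk_free(struct toy_chunk *c)
{
	/* Callers must have unlinked the chunk already. */
	assert(toy_list_empty(&c->list));
	c->freed = 1;
}

/* Drain a queue safely: remember the next node before unlinking the
 * current one. Returns the number of chunks released. */
static int toy_inq_free(struct toy_list *queue)
{
	struct toy_list *pos, *tmp;
	int n = 0;

	for (pos = queue->next, tmp = pos->next; pos != queue;
	     pos = tmp, tmp = pos->next) {
		struct toy_chunk *c = toy_entry(pos, struct toy_chunk, list);

		toy_list_del_init(&c->list);
		toy_chunk_free(c);
		n++;
	}
	return n;
}
```

The loop in `toy_inq_free()` has the same shape as the list_for_each_entry_safe() hunk in the patch: the next pointer is captured before the current entry is removed.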
[PATCH]: Make SCTP use list_head for all chunk lists
From: "David S. Miller" <[EMAIL PROTECTED]>
Date: Fri, 08 Jul 2005 16:27:56 -0700 (PDT)

> I'll see if I can figure out a way to deal with this cleanly.

I figured out a way. Sridhar can you give this patch below
a test?

I use the control block to store the chunk pointer, then pass
skb's around. Its use is confined only to sctp_rcv() and
sctp_backlog_rcv() so I didn't put it into any of the SCTP
header files, it can stay local to sctp/input.c

BTW, the rest of the SCTP input path should be audited to make sure
any other use of the SKB control block on input does not spam the
ipv4/ipv6 parameter area (struct inet_skb_parm and struct
inet6_skb_parm). That must be preserved on input (unless you
unshare the SKB of course). That's why TCP's skb control block
(in net/tcp.h) uses this header as well.

Also, if you can get this patch working, can you check to see
if it works to change sctp_chunk_free() to go:

	BUG_ON(!list_empty(&chunk->list));

Thanks a lot.

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -582,7 +582,6 @@ void sctp_datamsg_track(struct sctp_chun
 void sctp_chunk_fail(struct sctp_chunk *, int error);
 int sctp_chunk_abandoned(struct sctp_chunk *);
-
 /* RFC2960 1.4 Key Terms
  *
  * o Chunk: A unit of information within an SCTP packet, consisting of
@@ -592,13 +591,8 @@ int sctp_chunk_abandoned(struct sctp_chu
  * each chunk as well as a few other header pointers...
  */
 struct sctp_chunk {
-	/* These first three elements MUST PRECISELY match the first
-	 * three elements of struct sk_buff. This allows us to reuse
-	 * all the skb_* queue management functions.
-	 */
-	struct sctp_chunk *next;
-	struct sctp_chunk *prev;
-	struct sk_buff_head *list;
+	struct list_head list;
+
 	atomic_t refcnt;
 	/* This is our link to the per-transport transmitted list. */
@@ -717,7 +711,7 @@ struct sctp_packet {
 	__u32 vtag;
 	/* This contains the payload chunks. */
-	struct sk_buff_head chunks;
+	struct list_head chunk_list;
 	/* This is the overhead of the sctp and ip headers. */
 	size_t overhead;
@@ -974,7 +968,7 @@ struct sctp_inq {
 	/* This is actually a queue of sctp_chunk each
 	 * containing a partially decoded packet.
 	 */
-	struct sk_buff_head in;
+	struct list_head in_chunk_list;
 	/* This is the packet which is currently off the in queue and is
 	 * being worked on through the inbound chunk processing.
 	 */
@@ -1017,7 +1011,7 @@ struct sctp_outq {
 	struct sctp_association *asoc;
 	/* Data pending that has never been transmitted. */
-	struct sk_buff_head out;
+	struct list_head out_chunk_list;
 	unsigned out_qlen;	/* Total length of queued data chunks. */
@@ -1025,7 +1019,7 @@ struct sctp_outq {
 	unsigned error;
 	/* These are control chunks we want to send. */
-	struct sk_buff_head control;
+	struct list_head control_chunk_list;
 	/* These are chunks that have been sacked but are above the
 	 * CTSN, or cumulative tsn ack point.
@@ -1672,7 +1666,7 @@ struct sctp_association {
 	 * which already resides in sctp_outq. Please move this
 	 * queue and its supporting logic down there. --piggy]
 	 */
-	struct sk_buff_head addip_chunks;
+	struct list_head addip_chunk_list;
 	/* ADDIP Section 4.1 ASCONF Chunk Procedures
 	 *
diff --git a/net/sctp/associola.c b/net/sctp/associola.c
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -203,7 +203,7 @@ static struct sctp_association *sctp_ass
 	 */
 	asoc->addip_serial = asoc->c.initial_tsn;
-	skb_queue_head_init(&asoc->addip_chunks);
+	INIT_LIST_HEAD(&asoc->addip_chunk_list);
 	/* Make an empty list of remote transport addresses. */
 	INIT_LIST_HEAD(&asoc->peer.transport_addr_list);
diff --git a/net/sctp/input.c b/net/sctp/input.c
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -115,6 +115,17 @@ static void sctp_rcv_set_owner_r(struct
 	atomic_add(sizeof(struct sctp_chunk),&sk->sk_rmem_alloc);
 }
+struct sctp_input_cb {
+	union {
+		struct inet_skb_parm	h4;
+#if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE)
+		struct inet6_skb_parm	h6;
+#endif
+	} header;
+	struct sctp_chunk *chunk;
+};
+#define SCTP_INPUT_CB(__skb)	((struct sctp_input_cb *)&((__skb)->cb[0]))
+
 /*
  * This is the routine which IP calls when receiving an SCTP packet.
  */
@@ -243,6 +254,7 @@ int sctp_rcv(struct sk_buff *skb)
 		ret = -ENOMEM;
 		goto discard_release;
 	}
+	SCTP_INPUT_CB(skb)->chunk = chunk;
 	sctp_rcv_set_owner_r(skb,sk);
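The control-block trick in the input.c hunk can be sketched outside the kernel as follows. Everything here is an assumption made for illustration: `struct toy_skb`, `struct toy_ip_parm`, their sizes, and the `TOY_CB_SIZE` value are invented, and the cast through a char array relies on the kind of aliasing leniency the kernel gets from building with -fno-strict-aliasing. The idea shown is only that the protocol-private struct starts with the IP layer's area, so storing the chunk pointer does not overwrite it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CB_SIZE 40   /* assumed scratch-area size, for illustration only */

/* Hypothetical stand-in for the IP layer's per-packet parameters;
 * the 16-byte size is made up. */
struct toy_ip_parm {
	unsigned char opaque[16];
};

struct toy_chunk {
	int id;
};

/* Hypothetical packet buffer with a per-layer scratch area. The pointer
 * member in front keeps cb[] suitably aligned for the cast below. */
struct toy_skb {
	struct toy_skb *next;
	char cb[TOY_CB_SIZE];
};

/* Protocol-private view of the scratch area. The IP parameters come
 * first so they are preserved; the chunk pointer lives after them. */
struct toy_input_cb {
	union {
		struct toy_ip_parm h4;
	} header;
	struct toy_chunk *chunk;
};

#define TOY_INPUT_CB(skb) ((struct toy_input_cb *)&((skb)->cb[0]))

/* Compile-time check that the private view fits in the scratch area. */
typedef char toy_cb_fits[(sizeof(struct toy_input_cb) <= TOY_CB_SIZE) ? 1 : -1];

/* Receive path: remember which chunk belongs to this packet. */
static void toy_rcv(struct toy_skb *skb, struct toy_chunk *chunk)
{
	TOY_INPUT_CB(skb)->chunk = chunk;
}

/* Backlog path: the packet is passed around, the chunk is looked up. */
static struct toy_chunk *toy_backlog_rcv(struct toy_skb *skb)
{
	return TOY_INPUT_CB(skb)->chunk;
}
```

With this layout the code hands around the packet itself and recovers the chunk from it, instead of casting a chunk pointer to a packet pointer.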
Re: [PATCH 1/2] (INCLUDE,empty)/leave-group equivalence for full-state MSF APIs & errno fix
From: David Stevens <[EMAIL PROTECTED]>
Date: Fri, 8 Jul 2005 14:56:34 -0600

> This patch:
> 1) Adds (INCLUDE, empty)/leave-group equivalence to the full-state
> multicast
> source filter APIs (IPv4 and IPv6)
> 2) Fixes an incorrect errno in the IPv6 leave-group (ENOENT should be
> EADDRNOTAVAIL)

Applied, thanks David.
Re: [PATCH 2/2] fix IPv4 leave-group group matching
From: David Stevens <[EMAIL PROTECTED]>
Date: Fri, 8 Jul 2005 14:59:30 -0600

> This patch fixes the multicast group matching for
> IP_DROP_MEMBERSHIP,
> similar to the IP_ADD_MEMBERSHIP fix in a prior patch. Groups are
> identified
> by and including the interface address in the
> match
> will fail if a leave-group is done by address when the join was done by
> index,
> or if different addresses on the same interface are used in the join and
> leave.

Also applied, thanks a lot.
Re: [PATCH 1/3] multicast API "join" issues
Patch applied, thanks David.
Re: [PATCH 2/3] multicast API "join" issues
Patch applied, thanks.
Re: [PATCH 3/3] multicast API "join" issues
Also applied, thanks a lot.
Re: [PATCH]: Make SCTP use list_head for all chunk lists
From: Sridhar Samudrala <[EMAIL PROTECTED]>
Date: Fri, 08 Jul 2005 18:40:18 -0700

> On Fri, 2005-07-08 at 17:22 -0700, David S. Miller wrote:
> > From: "David S. Miller" <[EMAIL PROTECTED]>
> > Date: Fri, 08 Jul 2005 16:27:56 -0700 (PDT)
> >
> > > I'll see if I can figure out a way to deal with this cleanly.
> >
> > I figured out a way. Sridhar can you give this patch below
> > a test?
>
> I did a quick run of the regression tests with the patch and i
> didn't see any problems.

Thank you very much Sridhar.

> > BTW, the rest of the SCTP input path should be audited to make sure
> > any other use of the SKB control block on input does not spam the
> > ipv4/ipv6 parameter area (struct inet_skb_parm and struct
> > inet6_skb_parm). That must be preserved on input (unless you
> > unshare the SKB of course). That's why TCP's skb control block
> > (in net/tcp.h) uses this header as well.
>
> We do a skb_clone() before using the SKB control block to store
> the ulpevent structure, so i guess it should be OK.

Aha! Now if you add the proper header to the front of the
ulpevent, you will not need to clone SKBs at all.

> > Also, if you can get this patch working, can you check to see
> > if it works to change sctp_chunk_free() to go:
> >
> > 	BUG_ON(!list_empty(&chunk->list));
>
> Even this works fine, so we can replace the list_empty() check
> with BUG_ON.

Thanks a lot for checking that out for me.
SKB tutorial updated
I've updated the SKB tutorial a little bit today, in particular
I added coverage of non-linear data areas to the SKB data handling
tutorial page at:

	http://vger.kernel.org/~davem/skb_data.html

I'll probably start working on the TCP packet output engine
next.
TCP output engine tutorial
Ok, I went a little bananas with the diagrams, but here goes nothing
:-)

	http://vger.kernel.org/~davem/tcp_output.html

it's linked from my top-level page as well.

Enjoy.
Re: hard_header_len
From: Feyd <[EMAIL PROTECTED]>
Date: Mon, 11 Jul 2005 07:43:10 +0200

> can I assume that hard_start_xmit will always get skbs with hard_header_len
> reserved? I need two more bytes at the start of the packet and I'm getting
> spurious panics in skb_push.

Typically, no. ->hard_start_xmit() has a fully built packet, hardware
headers and all. Protocols push the hardware header and copy it into
the packet long before you get called.

For example, look at net/ipv4/ip_output.c:ip_finish_output2(), it
takes the cached ARP response hardware header and copies it into the
packet like so:

	read_lock_bh(&hh->hh_lock);
	hh_alen = HH_DATA_ALIGN(hh->hh_len);
	memcpy(skb->data - hh_alen, hh->hh_data, hh_alen);
	read_unlock_bh(&hh->hh_lock);
	skb_push(skb, hh->hh_len);
	return hh->hh_output(skb);

Your driver's ->hard_start_xmit() gets the packet after this has
occurred.
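Why a further push can fail at that point is easier to see with a toy model of headroom. This is a sketch under assumptions: `struct toy_skb` and the `toy_*` helpers are invented, the buffer is a fixed 64-byte array, and the push returns NULL where the real skb_push() would panic on underflow.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BUF_SIZE 64   /* arbitrary buffer size for the model */

/* Hypothetical packet buffer: "head" is the start of the allocation,
 * "data" is where the packet currently begins. */
struct toy_skb {
	unsigned char buf[TOY_BUF_SIZE];
	unsigned char *head;
	unsigned char *data;
	size_t len;
};

/* Start with an empty packet and "reserve" bytes of headroom. */
static void toy_init(struct toy_skb *s, size_t reserve)
{
	assert(reserve <= TOY_BUF_SIZE);
	s->head = s->buf;
	s->data = s->buf + reserve;
	s->len = 0;
}

/* Bytes still free in front of the packet data. */
static size_t toy_headroom(const struct toy_skb *s)
{
	return (size_t)(s->data - s->head);
}

/* Prepend n bytes. Returns NULL when there is not enough headroom;
 * this model reports the error instead of panicking. */
static unsigned char *toy_push(struct toy_skb *s, size_t n)
{
	if (n > toy_headroom(s))
		return NULL;
	s->data -= n;
	s->len += n;
	return s->data;
}
```

If the protocol layer reserved exactly 14 bytes and already pushed a 14-byte hardware header, `toy_headroom()` is 0 by the time the driver sees the packet, and pushing two more bytes fails. A driver with such a need would have to check the headroom first rather than assume it is there.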
Re: SKB tutorial updated
From: "Catalin(ux aka Dino) BOIE" <[EMAIL PROTECTED]>
Date: Mon, 11 Jul 2005 13:47:20 +0300 (EEST)

> Any chance to include all this documentation in
> Documentation/networking/skb/?

No, no intention of doing that. For the same reason we don't put all
of lartc.org into the kernel tree :)
Re: Seekable Sockets
From: Chase Douglas <[EMAIL PROTECTED]>
Date: Mon, 11 Jul 2005 12:33:46 -0500

> I'm sorry, I made a careless mistake in choice of context. What this would
> be useful for is applications where we want to seek ahead in one stream from
> one connection. This is not meant for seeking somehow between multiple
> connections, but for one single connection between only two computers.

What do you do if the data you want is beyond the size of the socket's
receive buffer? You can't seek past that without allowing the socket
to go over its receive buffer limits. And if you limit the seek to
the receive buffer limit, that's a real grotty limitation and it's not
real seek() support on the stream.

Really, this idea has more holes than swiss cheese.
Re: Seekable Sockets
From: Chase Douglas <[EMAIL PROTECTED]>
Date: Mon, 11 Jul 2005 15:14:44 -0500

> This may not mean much for normal use, but there are academic instances,
> especially in cluster computing, where saving many extra user-user
> copies can really add up.

So, basically, the useful situation is that the sender sends data the
receiver doesn't want. Which still sounds like it's the applications
that need to be fixed.

If the receiver does want all the data, it simply needs to provision
enough buffer area so that it can receive at least enough data so that
it can seek first to the data it's interested in (inside of its
application buffer) and then go back. It's still one copy in this
case.

I still see no real use for this feature. Either the data is stored
inside of kernel buffers, or user application buffers. And if the
data sent is not useful to the receiver, fix the sender to not send
the unwanted data.
Re: [PATCH] loop unrolling in net/sched/sch_generic.c
From: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>
Date: Fri, 8 Jul 2005 08:03:27 -0300

> Of course only the skbs created after the skb_alloc_extension() call would
> be valid for the subsystem
> that alloced the extension, would this be a problem?

It might be. It is entirely possible, for example, for an old skb to
pop up and appear in netfilter. I have no idea how we'd take care of
that kind of issue.

Perhaps we could explore some mechanism by which to indicate an
extension was present when an SKB was allocated. If we ask for a
pointer to an extension which was not there at SKB allocation time, we
do a data area realloc with the new space size, and copy the old stuff
over.

This means that extensions have to be ordered in the extension area
precisely as they were allocated. It also means you really cannot
un-allocate.
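The mechanism floated above is only an idea in this thread, so the following is a purely hypothetical sketch of it. All names (`toy_alloc_extension()`, `struct toy_skb`, `toy_skb_ext()`) are invented, the extension area is modelled as a separate heap block rather than part of the packet data area, and nothing is ever un-allocated, matching the constraint stated in the message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Total bytes of extension space handed out so far. It only grows,
 * so an extension's offset never changes once it is assigned. */
static size_t toy_ext_total;

/* Register a new extension and return its fixed offset. Offsets are
 * assigned strictly in allocation order. */
static size_t toy_alloc_extension(size_t size)
{
	size_t off = toy_ext_total;

	toy_ext_total += size;
	return off;
}

/* Hypothetical buffer that remembers how large the extension area was
 * when the buffer itself was created. */
struct toy_skb {
	unsigned char *ext;
	size_t ext_size;
};

static int toy_skb_init(struct toy_skb *s)
{
	s->ext_size = toy_ext_total;
	/* Allocate at least one byte so the pointer is usable. */
	s->ext = calloc(1, s->ext_size ? s->ext_size : 1);
	return s->ext ? 0 : -1;
}

/* Look up an extension. If this buffer predates the extension, grow
 * its area to the current total; existing contents keep their offsets
 * and the new space starts out zeroed. */
static void *toy_skb_ext(struct toy_skb *s, size_t off, size_t size)
{
	if (off + size > s->ext_size) {
		unsigned char *grown;

		if (off + size > toy_ext_total)
			return NULL;   /* extension was never registered */
		grown = realloc(s->ext, toy_ext_total);
		if (!grown)
			return NULL;
		memset(grown + s->ext_size, 0, toy_ext_total - s->ext_size);
		s->ext = grown;
		s->ext_size = toy_ext_total;
	}
	return s->ext + off;
}
```

Because offsets are handed out append-only, an "old" buffer can be grown lazily on first access, which is why the ordering has to match allocation order and why freeing an extension would break the scheme.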
Re: [PATCH]: Make SCTP use list_head for all chunk lists
From: Sridhar Samudrala <[EMAIL PROTECTED]>
Date: Mon, 11 Jul 2005 09:56:10 -0700

> An incoming skb can contain more than 1 user message(chunk) and
> we do a clone for each message and store the per-message information
> in the ulpevent structure.
> Moreover, the ulpevent structure is already 34 bytes which makes it
> impossible to share the 40-byte control block with ip specific info.

If the sctp_chunk structure is per-user-message, and so is the ulpevent object, it makes no sense to store the ulpevent information separately from sctp_chunk.

Look at how all of the ulpevent members tend to be initialized:

    event->stream = ntohs(chunk->subh.data_hdr->stream);
    event->ssn = ntohs(chunk->subh.data_hdr->ssn);
    event->ppid = chunk->subh.data_hdr->ppid;
    if (chunk->chunk_hdr->flags & SCTP_DATA_UNORDERED) {
        event->flags |= MSG_UNORDERED;
        event->cumtsn = sctp_tsnmap_get_ctsn(&asoc->peer.tsn_map);
    }
    event->tsn = ntohl(chunk->subh.data_hdr->tsn);
    event->msg_flags |= chunk->chunk_hdr->flags;
    event->iif = sctp_chunk_iif(chunk);

It's just transferring chunk information over to the ulpevent structure with only minor modifications such as endianness swapping of packet header fields.

So I think we can store all of this stuff in the sctp_chunk and then just make sure the chunk is available. In fact, we can replace all the event->{stream,ssn,ppid,cumtsn,tsn,iif} members with just a backpointer to the sctp_chunk.

This also means you won't need to clone so much anymore either. You'll only need to clone at the chunking level.

I'll see if I can get a spare moment to try and implement this.
Re: [PATCH]: Make SCTP use list_head for all chunk lists
From: Sridhar Samudrala <[EMAIL PROTECTED]>
Date: Mon, 11 Jul 2005 17:00:56 -0700

> For incoming packets, the same sctp_chunk structure is used for all
> chunks in the packet whereas ulpevent is per-chunk. An sctp_chunk is
> allocated for a packet when we do a sctp_chunkify() in sctp_rcv(). We
> walk through the chunks in a packet and reuse the same chunk structure
> as we move to the next chunk(sctp_inq_pop).
> So we cannot keep per-message ulpevent info in sctp_chunk.

I see, let me think about this some more.
Re: TCP output engine tutorial
From: Lennert Buytenhek <[EMAIL PROTECTED]>
Date: Mon, 11 Jul 2005 11:31:45 +0200

> Interesting material. What happens if there is an skb with paged
> data (say, a page cache page) on the sk_write_queue, and it ends up
> having to be retransmitted -- is it possible that the retransmit is
> sent with different data if the page is modified in the meanwhile?

Yep. We grab references to the pages, but we do not lock the _contents_. This was a deliberate design decision.

This is why we only support scatter-gather (and thus paged SKB transmission by a device) when checksum offloading is being done.

This is why, for example, SAMBA will only use sendfile() when the SMB client has an oplock held on the file (and thus the file contents are guaranteed to not change). It turns out that Microsoft's Windows SMB client grabs the oplocks by default when you open a file, so this is rarely a performance problem.
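The link between unlocked page contents and checksum offload can be shown with a small user-space sketch. It assumes the concern is a checksum computed in software when the data is queued: if the page changes before a retransmit, that stored checksum no longer matches the bytes on the wire, whereas hardware that checksums at transmit time always sees the current contents. The helper name `toy_inet_csum` is made up for this illustration and is not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toy ones-complement checksum over big-endian 16-bit words, in the
 * style of the Internet checksum. Hypothetical helper for illustration,
 * not the kernel's csum code.
 */
static uint16_t toy_inet_csum(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    /* Add up all complete 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* A trailing odd byte is padded with zero on the right. */
    if (i < len)
        sum += (uint32_t)data[i] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

If a checksum is taken over a buffer, one byte of the buffer is then modified, and the checksum is recomputed, the two values will generally differ, which is the mismatch a stale software checksum would put on the wire.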
Re: [PATCH] loop unrolling in net/sched/sch_generic.c
From: Andi Kleen <[EMAIL PROTECTED]>
Date: 12 Jul 2005 04:25:49 +0200

> What other plans do have? I think a lot of stuff could be moved
> into ->cb, in particular tc_* and the HIPPI field.

See:

http://vger.kernel.org/~davem/net_todo.html

there is an entry entitled "SKBs are too large", it lists our exact plans.
[PATCH _ALMOST_]: Kill skb->list
Ok, this almost fully kills off skb->list. The thing that is missing is that a couple of ugly ATM drivers need to have their skb_unlink() calls fixed to pass in the list head pointer as the second arg.

I gave it a quick shot, but I was unsuccessful. I can't even compile one of them on my workstation (nicstar.c) because it casts pointers to "u32" and stuff like that :-/

Sridhar, I resolved the remaining SCTP issues by taking advantage of the fact that when we are collecting onto a list, the SKB of the "event" we use is the first member of that on-stack temp list we are using. Thus, "(struct sk_buff_head *) sctp_event2skb(event)->prev" gets us the list head pointer we need, and if the 'prev' pointer is NULL then it wasn't on a list.

It is not the prettiest solution in the world, but for now it works. We should really do serious cleanups in this area in the future to flesh all of this out more nicely. But anyways, let me know if SCTP still passes your tests with this change installed. Thanks.

Please, if anyone has the stomach to try and fix the ATM drivers (I think the two that need fixing are nicstar.c and zatm.c, grep for skb_unlink() calls using only one argument), I would _seriously_ appreciate it, thanks.
diff --git a/drivers/bluetooth/bfusb.c b/drivers/bluetooth/bfusb.c --- a/drivers/bluetooth/bfusb.c +++ b/drivers/bluetooth/bfusb.c @@ -158,7 +158,7 @@ static int bfusb_send_bulk(struct bfusb if (err) { BT_ERR("%s bulk tx submit failed urb %p err %d", bfusb->hdev->name, urb, err); - skb_unlink(skb); + skb_unlink(skb, &bfusb->pending_q); usb_free_urb(urb); } else atomic_inc(&bfusb->pending_tx); @@ -212,7 +212,7 @@ static void bfusb_tx_complete(struct urb read_lock(&bfusb->lock); - skb_unlink(skb); + skb_unlink(skb, &bfusb->pending_q); skb_queue_tail(&bfusb->completed_q, skb); bfusb_tx_wakeup(bfusb); @@ -253,7 +253,7 @@ static int bfusb_rx_submit(struct bfusb if (err) { BT_ERR("%s bulk rx submit failed urb %p err %d", bfusb->hdev->name, urb, err); - skb_unlink(skb); + skb_unlink(skb, &bfusb->pending_q); kfree_skb(skb); usb_free_urb(urb); } @@ -398,7 +398,7 @@ static void bfusb_rx_complete(struct urb buf += len; } - skb_unlink(skb); + skb_unlink(skb, &bfusb->pending_q); kfree_skb(skb); bfusb_rx_submit(bfusb, urb); diff --git a/drivers/ieee1394/ieee1394_core.c b/drivers/ieee1394/ieee1394_core.c --- a/drivers/ieee1394/ieee1394_core.c +++ b/drivers/ieee1394/ieee1394_core.c @@ -681,7 +681,7 @@ static void handle_packet_response(struc return; } - __skb_unlink(skb, skb->list); + __skb_unlink(skb, &host->pending_packet_queue); if (packet->state == hpsb_queued) { packet->sendtime = jiffies; @@ -989,7 +989,7 @@ void abort_timedouts(unsigned long __opa packet = (struct hpsb_packet *)skb->data; if (time_before(packet->sendtime + expire, jiffies)) { - __skb_unlink(skb, skb->list); + __skb_unlink(skb, &host->pending_packet_queue); packet->state = hpsb_complete; packet->ack_code = ACKX_TIMEOUT; queue_packet_complete(packet); diff --git a/drivers/isdn/act2000/capi.c b/drivers/isdn/act2000/capi.c --- a/drivers/isdn/act2000/capi.c +++ b/drivers/isdn/act2000/capi.c @@ -606,7 +606,7 @@ handle_ack(act2000_card *card, act2000_c if m->msg.data_b3_req.fakencci >> 8) & 0xff) == chan->ncci) && 
(m->msg.data_b3_req.blocknr == blocknr)) { /* found corresponding DATA_B3_REQ */ -skb_unlink(tmp); +skb_unlink(tmp, &card->ackq); chan->queued -= m->msg.data_b3_req.datalen; if (m->msg.data_b3_req.flags) ret = m->msg.data_b3_req.datalen; diff --git a/drivers/net/shaper.c b/drivers/net/shaper.c --- a/drivers/net/shaper.c +++ b/drivers/net/shaper.c @@ -156,52 +156,6 @@ static int shaper_start_xmit(struct sk_b SHAPERCB(skb)->shapelen= shaper_clocks(shaper,skb); -#ifdef SHAPER_COMPLEX /* and broken.. */ - - while(ptr && ptr!=(struct sk_buff *)&shaper->sendq) - { - if(ptr->pripri - && jiffies - SHAPERCB(ptr)->shapeclock < SHAPER_MAXSLIP) - { - struct sk_buff *tmp=ptr->prev; - - /* -* It goes before us therefore we slip the length -* of the new frame. -*/ - -
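The interface change in this patch (every unlink call names the queue it removes from, so the buffer no longer needs its own list pointer) can be sketched with a toy ring list. All names here (`toy_node`, `toy_queue`, `toy_unlink`, `toy_head_from_first`) are hypothetical stand-ins, not the kernel's sk_buff types, and locking is left out entirely. The last helper mirrors the SCTP trick from the message under the same assumption stated there: it is only meaningful for the first element of a list.

```c
#include <stddef.h>

/* Toy stand-in for a buffer's list linkage; note there is no "list" member. */
struct toy_node {
    struct toy_node *next;
    struct toy_node *prev;
};

/* Toy queue head: the ring sentinel comes first, followed by the length. */
struct toy_queue {
    struct toy_node ring;
    unsigned int qlen;
};

static void toy_queue_init(struct toy_queue *q)
{
    q->ring.next = &q->ring;
    q->ring.prev = &q->ring;
    q->qlen = 0;
}

static void toy_queue_tail(struct toy_queue *q, struct toy_node *n)
{
    n->prev = q->ring.prev;
    n->next = &q->ring;
    q->ring.prev->next = n;
    q->ring.prev = n;
    q->qlen++;
}

/* The caller says which queue the node is on; the node does not record it. */
static void toy_unlink(struct toy_node *n, struct toy_queue *q)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = NULL;
    n->prev = NULL;
    q->qlen--;
}

/*
 * Only valid when n is the FIRST element of its list: its prev pointer
 * is then the sentinel, which is the first member of the queue head.
 * Returns NULL when the node is not on any list.
 */
static struct toy_queue *toy_head_from_first(struct toy_node *n)
{
    return n->prev ? (struct toy_queue *)n->prev : NULL;
}
```

The cost of dropping the per-buffer list pointer is visible in `toy_unlink()`: the queue length lives in the head, so the caller has to supply it.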
Re: [PATCH] SCTP: __nocast annotations
From: Alexey Dobriyan <[EMAIL PROTECTED]>
Date: Tue, 12 Jul 2005 04:39:25 +0400

> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>

Applied, thanks Alexey.
Re: [PATCH] Prevent oops when printing martian source
From: Olaf Kirch <[EMAIL PROTECTED]>
Date: Mon, 11 Jul 2005 12:58:59 +0200

> In some cases, we may be generating packets with a source address that
> qualifies as martian. This can happen when we're in the middle of setting
> up or tearing down the network, and netfilter decides to reject a packet
> with an RST. The routing code would detect the martian, and try to
> print a warning. This would oops, because locally generated packets do
> not have a valid skb->mac.raw pointer at this point.
>
> I didn't actually investigate why netfilter was generating an RST with
> invalid IP source, but from what it looked like, the system was in the
> middle of some interface setup/teardown.

It is better to be safe than sorry; I've applied your patch, thanks Olaf.

Please provide a proper "Signed-off-by: " line for yourself in future patches, thanks a lot.
Re: [PATCH] loop unrolling in net/sched/sch_generic.c
From: Andi Kleen <[EMAIL PROTECTED]>
Date: Tue, 12 Jul 2005 06:32:33 +0200

> On Mon, Jul 11, 2005 at 07:44:47PM -0700, David S. Miller wrote:
> > From: Andi Kleen <[EMAIL PROTECTED]>
> > Date: 12 Jul 2005 04:25:49 +0200
> >
> > > What other plans do have? I think a lot of stuff could be moved
> > > into ->cb, in particular tc_* and the HIPPI field.
> >
> > See:
> >
> > http://vger.kernel.org/~davem/net_todo.html
> >
> > there is an entry entitled "SKBs are too large", it lists
> > our exact plans.
>
> You could add my proposal if you agree.

It could only be legal to move things into ->cb[] if they are only referenced before the protocols get their hands on it. It certainly looks to be the case for tc_*, although HIPPI I am less sure of.

I also have the idea to move ->real_dev into ->cb[] so that the bonding driver, the only user of skb->real_dev, can maintain the pointer there.

So we'd need to be careful that there is agreement on how the pre-protocol ->cb[] area is laid out to avoid conflict.
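One way to picture the "agreement on layout" is a single shared structure that every pre-protocol user of the scratch area goes through, with a compile-time check that it fits. The sketch below is only an illustration: the type and field names are hypothetical, the 40-byte size is taken from the SCTP thread, and a union is used instead of the kernel's usual cast idiom to keep the example strictly portable C.

```c
#include <stddef.h>

#define TOY_CB_SIZE 40  /* assumed size of the per-buffer scratch area */

/*
 * Hypothetical single layout shared by everything that touches the
 * scratch area before the protocols take ownership of it.
 */
struct toy_preproto_cb {
    void *real_dev;         /* would replace a dedicated real_dev member */
    unsigned int tc_index;  /* example traffic-control state */
    unsigned int tc_verd;
};

/* Toy buffer: the scratch area is viewable as raw bytes or as the layout. */
struct toy_skb {
    union {
        char raw[TOY_CB_SIZE];
        struct toy_preproto_cb preproto;
    } cb;
};

/* Refuse to compile if the agreed layout outgrows the scratch area. */
_Static_assert(sizeof(struct toy_preproto_cb) <= TOY_CB_SIZE,
               "pre-protocol layout does not fit in cb");
```

With one structure as the only way in, two subsystems cannot silently claim the same bytes; adding a field means editing the shared definition, where the size check runs.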
Re: more comments on skb shrinking
From: Andi Kleen <[EMAIL PROTECTED]>
Date: Tue, 12 Jul 2005 06:48:02 +0200

> >>
> # skb->real_dev, it is used in one place, in bonding device. Suggestion
> is to remove it and replace with an ugly per-cpu static variable (but
> still less ugly than keeping a useless pointer in struct sk_buff) We
> can add per-cpu variable real_dev, save real_dev there and bonding can
> fetch it from there. The trick relies on the fact that real_dev can be
> forgotten after we leave softirq handler.
> <<
>
> You can just put it into ->cb while bond is active.

Yes, but we must have a layout that is consistent with the other things potentially stuffed in there in the pre-protocol input path. See my other email.

> >>
> skb->h is really useless and can be eliminated immediately. The only place
> where it is really used is checksumming offload on output. skb->h is used
> there to mark the beginning of area to checksum, the idea was to support
> offload for protocols other than TCP and UDP. Given that this generality is
> not used, it can be replaced with direct parsing of IP header.
> <<
>
> I would rather add an u16 header_offset field instead of adding
> header parsing code in all drivers. With some other fields
> being u16 there should be enough padding for that.

The idea is, rather, that skb->data is not pushed by the drivers; it is left at the MAC header when the SKB is given to netif_receive_skb(). I think this is a much cleaner thing than what happens now. Then we just pass skb->data to the ptype handlers.

So it's not really "all drivers", it's things like eth_header_type() and friends that get changed.
Re: Seekable Sockets
From: Harald Welte <[EMAIL PROTECTED]>
Date: Tue, 12 Jul 2005 13:31:23 +0200

> On Mon, Jul 11, 2005 at 01:21:03PM -0700, David S. Miller wrote:
> > I still see no real use for this feature. Either the data is
> > stored inside of kernel buffers, or user application buffers.
> > And if the data sent is not useful to the receiver, fix the
> > sender to not send the unwanted data.
>
> Well, this assumes that you have full control over both ends. In
> reality you are often confronted with misdesigned "standard" protocols
> (I'm not even talking about proprietary senders/servers, or boxes
> outside of your control).
>
> So I have to disagree in that I think the feature is useful. Whether or
> not it is feasible without complicating the codebase unnecessarily, I
> don't know.

But even if you can't control the sender, the seeking buys you no memory savings at all. You don't consume less memory, because even the data you're not interested in sits in kernel buffers while you wait for the stuff you're interested in to arrive.

I do see how you can avoid some copies, but that seems easier to implement with a "MSG_NOCOPY" or similar recvmsg() flag rather than seeking. I.e., to skip the crap you're not interested in you go:

    int len = recv(sock_fd, NULL, ignore_len, MSG_NOCOPY);

Then you do normal reads into a buffer for the parts you really do want.

That is infinitely cleaner than this seek() idea, really. Sockets were not meant to be seek()'d upon, so don't go there.
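MSG_NOCOPY is only a proposal in this thread, so it cannot be called from real code. For comparison, here is a sketch of what an application can do with the existing API: skip unwanted bytes by reading them into a scratch buffer and throwing them away. This is exactly the kernel-to-user copy the proposed flag would avoid. The helper name `skip_bytes` is made up for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Discard up to `count` bytes from a stream socket using plain recv().
 * Returns the number of bytes actually skipped (less than `count` if
 * the peer closed the connection early), or -1 on error.
 */
static ssize_t skip_bytes(int fd, size_t count)
{
    char scratch[4096];
    size_t skipped = 0;

    while (skipped < count) {
        size_t want = count - skipped;
        ssize_t n;

        if (want > sizeof(scratch))
            want = sizeof(scratch);

        n = recv(fd, scratch, want, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, retry */
            return -1;
        }
        if (n == 0)
            break;          /* peer closed the connection */

        skipped += (size_t)n;
    }
    return (ssize_t)skipped;
}
```

After `skip_bytes(sock_fd, ignore_len)` returns, ordinary reads into the application buffer pick up at the data the receiver actually wants.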
Re: [PATCH] __be'ify *_type_trans()
From: Alexey Dobriyan <[EMAIL PROTECTED]>
Date: Tue, 12 Jul 2005 23:13:44 +0400

> tr_type_trans(), hippi_type_trans() left as-is.
>
> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>

Applied, thanks Alexey.
Re: [Bugme-new] [Bug 4810] New: Early vlan adding leads to not functional device
I've applied Tommy's patch to fix this bug for now.

Tommy, please provide a proper "Signed-off-by:" for patches you post in the future, thanks a lot.
Re: SKB tutorial, Blog, and NET TODO
From: Zhu Yi <[EMAIL PROTECTED]>
Date: Thu, 30 Jun 2005 10:07:03 +0800

> Agreed. The ipw2200 card provides 4 hardware queues for QoS. But current
> network stack only supports one Tx queue.

This is actually difficult to implement support for. Well, not difficult, but rather I mean that it is costly.

We would need to change the 'qdisc' member of struct net_device into an array of qdiscs. How to do that efficiently, so that other devices do not eat the space cost associated with this, is unclear.
Re: SKB tutorial, Blog, and NET TODO
From: "Leonid Grossman" <[EMAIL PROTECTED]>
Date: Wed, 29 Jun 2005 14:11:13 -0700

> - TSO support for IPv6
> - USO (UDP TSO) support
> - support for multiple hardware queues/channels and TCP traffic steering;
> there are number of benefits in the ability to associate TCP flows with a
> particular hw queue/cpu/MSI (MSI-X), one of them is improving receive
> bottleneck for high-speed networks at 1500mtu
> - support for Large Receive Offload, mainly to the same end of reducing cpu
> utilization and solving 1500 mtu receive bottleneck

I've added entries for this stuff, thanks for the suggestions.

I've labelled the TCP flow association and LRO stuff as "Investigate .." because it is still unclear how exactly we should proceed here.
Re: [PATCH 1/2] net: add a top-level Networking menu to *config
From: Sam Ravnborg <[EMAIL PROTECTED]>
Date: Fri, 8 Jul 2005 00:38:32 +

> Create a new top-level menu named "Networking" thus moving
> net related options and protocol selection away from the drivers
> menu and up on the top-level where they belong.
>
> To implement this all architectures has to source "net/Kconfig" before
> drivers/*/Kconfig in their Kconfig file. This change has been
> implemented for all architectures.
>
> Device drivers for ordinary NIC's are still to be found
> in the Device Drivers section, but Bluetooth, IrDA and ax25
> are located with their corresponding menu entries under the new
> networking menu item.
>
> Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>

Patch applied, thanks.
Re: [PATCH 1/2] net: add a top-level Networking menu to *config
From: randy_dunlap <[EMAIL PROTECTED]>
Date: Fri, 8 Jul 2005 10:02:11 -0700

> Can the NETPOLL options that are under
> Networking support + Networking options
> (that depend on NETCONSOLE) and the NETCONSOLE option that is under
> Device Drivers + Network device support
> be moved to the same area? I don't really care which area,
> but it's a hassle to have to move between them to enable/disable
> them.
>
> I also think that there is some room for more consistency in
> the presentation of similar Network device categories, but
> maybe that (other) menuconfig patch will address this after
> your patch is merged, since that isn't what this patch was
> addressing.

Please provide a follow-on patch to make these changes, as I put Sam's patches into my tree last night.

Thanks a lot.
Re: [PATCH 2/2] net: move config options out to individual protocols
From: Sam Ravnborg <[EMAIL PROTECTED]>
Date: Fri, 8 Jul 2005 00:43:18 +

> Move the protocol specific config options out to the specific protocols.
> With this change net/Kconfig now starts to become readable and serve as a
> good basis for further re-structuring.
>
> The menu structure is left almost intact, except that indention is
> fixed in most cases. Most visible are the INET changes where several
> "depends on INET" are replaced with a single ifdef INET / endif pair.
>
> Several new files were created to accomplish this change - they are
> small but serve the purpose that config options are now distributed
> out where they belongs.
>
> Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>

Patch applied, thanks.
Re: Seekable Sockets
From: David Stevens <[EMAIL PROTECTED]>
Date: Tue, 12 Jul 2005 14:16:30 -0700

> In its RFC incantation, it allows for out-of-order delivery of an
> arbitrary (but limited) amount of data. The BSD implementation
> made it largely unusable by widely distributing something that
> didn't compute the offset correctly and only supported 1 byte of
> urgent data, but its original form seems pretty close to what you
> want, without the receiver having to know where the special
> data is in advance. And the BSD-compatible form can be used
> in a similar way, with the app doing the buffering instead of the
> kernel.
> Or am I missing something??

URG is generally unusable.

First, its disuse results in nearly all TCP stacks going to the slow path when it is used. TSO (on output) and the TCP input fast path (on input) are both not used when URG is active.

Secondly, if you know where the data is, my MSG_NOCOPY idea takes care of things quite nicely.

URG also has the nasty side effect of using signals.
Re: SKB tutorial, Blog, and NET TODO
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Wed, 13 Jul 2005 01:38:37 +0200

> If its about outgoing traffic, shouldn't a prio-qdisc as root qdisc do
> just fine? skb->priority can be used to select a queue. Incoming traffic
> with pre-classification by the NIC would require multiple input queues
> though ..

I forgot what the real problem was, sorry.

Yes, the issue is on outgoing traffic, and it has to do with netif_stop_queue(). We need one piece of queue plugging state for every queue the hardware supports. So if queue 0 fills up, packets can still be queued for queue 1, 2, ...

This can't be cleanly done with a single binary queue-stopped state like we have now.
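The difference between one binary stopped flag and per-queue plugging state can be sketched as a bitmask with one bit per hardware queue. Everything here (`toy_dev`, `toy_stop_subqueue`, and the 32-queue limit) is hypothetical and only illustrates the state that would be needed; it is not an existing kernel interface.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_TXQ 32u   /* assumed upper bound on hardware TX queues */

/* Toy device: bit i of stopped_mask set means hardware queue i is full. */
struct toy_dev {
    uint32_t stopped_mask;
};

/* Plug one hardware queue; the others keep accepting packets. */
static void toy_stop_subqueue(struct toy_dev *dev, unsigned int q)
{
    if (q < TOY_MAX_TXQ)
        dev->stopped_mask |= UINT32_C(1) << q;
}

/* Unplug one hardware queue once the hardware has drained it. */
static void toy_wake_subqueue(struct toy_dev *dev, unsigned int q)
{
    if (q < TOY_MAX_TXQ)
        dev->stopped_mask &= ~(UINT32_C(1) << q);
}

/* Out-of-range queues are reported as stopped so nothing is sent to them. */
static bool toy_subqueue_stopped(const struct toy_dev *dev, unsigned int q)
{
    if (q >= TOY_MAX_TXQ)
        return true;
    return (dev->stopped_mask >> q) & 1u;
}
```

With this state, a full queue 0 no longer blocks transmission on queues 1, 2, and so on, which is the behaviour the single flag cannot express.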
Re: ALIGN at crypt/cipher.c
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Fri, 15 Jul 2005 12:27:56 +1000

> On Thu, Jul 14, 2005 at 02:36:16PM +, Ken-ichirou MATSUZAWA wrote:
> >
> > No, I think I can understand. align should be unsigned long too.
> > After changing align to unsigned long from int, it works fine.
>
> Thanks for pin-pointing the problem Matsuzawa-san. The following
> patch implements your suggestion to fix the bug where the alignment
> mask is incorrectly zero-extended on 64-bit architectures.
>
> Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>

Applied, thanks Herbert.
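The zero-extension problem can be reproduced in user space. The sketch assumes the trigger is a 32-bit unsigned alignment value combined with a 64-bit address: the inverted mask is computed in 32 bits and then zero-extended, which clears the upper half of the address. The macro has the same shape as the usual round-up idiom, but `TOY_ALIGN` and both helper functions are made-up names for this example.

```c
#include <stdint.h>

/* Round x up to a multiple of a, where a is a power of two. */
#define TOY_ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/*
 * Buggy variant: the mask ~(align - 1) is evaluated as a 32-bit
 * unsigned value and zero-extended before the AND, so bits 32..63
 * of the address are lost.
 */
static uint64_t align_with_narrow_mask(uint64_t addr, uint32_t align)
{
    return TOY_ALIGN(addr, align);
}

/* Fixed variant: the mask is as wide as the address it is applied to. */
static uint64_t align_with_wide_mask(uint64_t addr, uint64_t align)
{
    return TOY_ALIGN(addr, align);
}
```

For an address above 4 GiB such as 0x100000001 and an alignment of 16, the narrow version returns 0x10 while the wide version returns 0x100000010. Below 4 GiB the two agree, which is why this kind of bug only shows up on 64-bit machines with high addresses.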
Re: [patch 1/1] drivers/net/pcmcia/smc91c92_cs.c : Use of time_after macro
From: [EMAIL PROTECTED]
Date: Thu, 14 Jul 2005 23:41:51 +0200

> Use of the time_after() macro, defined at linux/jiffies.h, which deal
> with wrapping correctly and are nicer to read.
>
> Signed-off-by: Marcelo Feitoza Parisi <[EMAIL PROTECTED]>
> Signed-off-by: Domen Puncer <[EMAIL PROTECTED]>

Patch applied, thanks.
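Why a wrap-safe comparison matters for a free-running tick counter can be shown in a few lines. This is a user-space sketch of the idea behind time_after(), not the kernel macro itself (which also type-checks its arguments); the names `toy_time_after` and `naive_after` are invented here, and the sketch relies on the usual two's-complement conversion from unsigned long to long.

```c
#include <limits.h>

/*
 * Wrap-safe "a is after b": take the unsigned difference and look at
 * its sign. Correct as long as the two timestamps are less than half
 * the counter range apart.
 */
static int toy_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* The direct comparison that the patch replaces; it breaks at wraparound. */
static int naive_after(unsigned long a, unsigned long b)
{
    return a > b;
}
```

Take `b = ULONG_MAX - 5` and `a = b + 10`, so that `a` has wrapped around to 4. `toy_time_after(a, b)` still reports that `a` is later, while `naive_after(a, b)` reports the opposite.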
Re: [patch 1/1] drivers/net/wan/: use of time_after macro
From: [EMAIL PROTECTED]
Date: Thu, 14 Jul 2005 23:41:43 +0200

> Use of the time_after() macro, defined at linux/jiffies.h, which deal
> with wrapping correctly and are nicer to read.
>
> Signed-off-by: Marcelo Feitoza Parisi <[EMAIL PROTECTED]>
> Signed-off-by: Domen Puncer <[EMAIL PROTECTED]>

Patch applied, thanks.
Re: [PATCH 2.6.13-rc3] tg3: Move tg3 firmware into separate file
From: "Nathanael Nerode" <[EMAIL PROTECTED]>
Date: Sun, 17 Jul 2005 07:55:45 -0400

> This is partly for the purpose of doing firmware loading in the future,
> but it's also a matter of tidiness.

So make the change when we do the loading like that in the future.

The fact that you are forcing the issue right now makes me suspicious of your real reason for desiring this change.
Re: [PATCH] [PKT_SCHED]: Reduce branch mispredictions in pfifo_fast_dequeue
From: Thomas Graf <[EMAIL PROTECTED]>
Date: Mon, 18 Jul 2005 15:36:36 +0200

> The current call to __qdisc_dequeue_head leads to a branch
> misprediction for every loop iteration, the fact that the
> most common priority is 2 makes this even worse. This issue
> has been brought up by Eric Dumazet <[EMAIL PROTECTED]>
> but unlike his solution which was to manually unroll the loop,
> this approach preserves the possibility to increase the number
> of bands at compile time.
>
> Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Also applied, thanks Thomas.
Re: [PATCH] [PKT_SCHED]: Remove debugging leftover from textsearch ematch
From: Thomas Graf <[EMAIL PROTECTED]>
Date: Mon, 18 Jul 2005 15:35:02 +0200

> Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Applied, thanks Thomas.
Re: [patch 3/3] net/sctp/objcnt: Audit return code of create_proc_*
From: [EMAIL PROTECTED]
Date: Thu, 14 Jul 2005 23:42:00 +0200

> From: Christophe Lucas <[EMAIL PROTECTED]>
>
> Audit return of create_proc_* functions.
>
> Signed-off-by: Christophe Lucas <[EMAIL PROTECTED]>
> Signed-off-by: Domen Puncer <[EMAIL PROTECTED]>

Applied, thanks.
Re: [patch 1/3] netlink: Fix "nocast type" warnings
From: [EMAIL PROTECTED]
Date: Thu, 14 Jul 2005 23:41:58 +0200

> From: Victor Fusco <[EMAIL PROTECTED]>
>
> Fix the sparse warning "implicit cast to nocast type"
>
> File/Subsystem:net/netlink/af_netlink.c
>
> Signed-off-by: Victor Fusco <[EMAIL PROTECTED]>
> Signed-off-by: Domen Puncer <[EMAIL PROTECTED]>

Applied, thanks.
Re: [Patch 2.6 3/3]ioctl: Add support for getting a permanent hardware address
From: Jon Wetzel <[EMAIL PROTECTED]>
Subject: [Patch 2.6 3/3]ioctl: Add support for getting a permanent hardware address
Date: Thu, 14 Jul 2005 16:43:50 -0500

> This patch is the third of three, designed to allow access to the
> permanent hardware address of a network device. This patch adds a new
> ioctl to get the field, "perm_addr," which was added to net_device by
> the first patch.
>
> Signed-off-by: Jon Wetzel <[EMAIL PROTECTED]>

No new BSD style device ioctls, _please_!

If we want to create new facilities, they should be done via netlink and possibly the ethtool interfaces.
Re: [PATCH] net configs: NETCONSOLE and NETPOLL together
From: randy_dunlap <[EMAIL PROTECTED]>
Date: Tue, 12 Jul 2005 21:27:28 -0700

> Put NETCONSOLE and NETPOLL options together since they are related.
> This cuts down on the hassle of flipping back and forth between
> the Networking menu and the Network drivers menu to change their
> config settings.
>
> Tested with menuconfig, gconfig, and xconfig.
> gconfig has a small problem with this. I think that it's
> a bug in gconfig and I will take it up with Romain Lievin.
>
> Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>

Applied, thanks Randy.
Re: [patch 2/3] skbuff.h: Fix "nocast type" warnings
From: [EMAIL PROTECTED]
Date: Thu, 14 Jul 2005 23:41:59 +0200

> From: Victor Fusco <[EMAIL PROTECTED]>
>
> Fix the sparse warning "implicit cast to nocast type"
>
> File/Subsystem:include/linux/skbuff.h
>
> Signed-off-by: Victor Fusco <[EMAIL PROTECTED]>
> Signed-off-by: Domen Puncer <[EMAIL PROTECTED]>

Applied, thanks.
Re: [NET]: Kill skb->tc_classid
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 18 Jul 2005 06:39:11 +0200

> OK, here's the patch to remove it. Dave, please apply together with the
> previous patch.

Patch applied, thanks.
Re: [PATCH] convert nfmark and conntrack mark to 32bit
From: Harald Welte <[EMAIL PROTECTED]>
Date: Sun, 17 Jul 2005 23:42:23 +0200

> As discussed at netconf'05, we convert nfmark and conntrack-mark to be
> 32bits even on 64bit architectures.
>
> Signed-off-by: Harald Welte <[EMAIL PROTECTED]>

Applied, thanks Harald.
Re: [PATCH] reduce netfilter sk_buff enlargement
From: Harald Welte <[EMAIL PROTECTED]>
Date: Mon, 18 Jul 2005 00:04:51 +0200

> The only real in-tree user of nfcache was IPVS, who only needs a single
> bit. Unfortunately I couldn't find some other free bit in sk_buff to
> stuff that bit into, so I introduced a separate field for them. Maybe
> the IPVS guys can resolve that to further save space.

I think we must resolve this one before 2.6.14 goes out, which gives us a lot of time, but for now I'll eat that one-bit member.

> Initially I wanted to shrink pkt_type to three bits (PACKET_HOST and
> alike are only 6 values defined), but unfortunately the bluetooth code
> overloads pkt_type :(

This also must be cured somehow; that really isn't a clean or nice usage of this field.

> - remove all never-implemented 'nfcache' code
> - don't have ipvs code abuse 'nfcache' field. currently gets their own
> compile-conditional skb->ipvs_property field. IPVS maintainers can
> decide to move this bit elsewhere, but nfcache needs to die.
> - remove skb->nfcache field to save 4 bytes
> - move skb->nfctinfo into three unused bits to save further 4 bytes
>
> Signed-off-by: Harald Welte <[EMAIL PROTECTED]>

Applied, thanks Harald.
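The space saving discussed here comes from packing small state values into spare bits instead of giving each one a full word. A minimal sketch, with hypothetical struct and field names that are not the real sk_buff members:

```c
/* Each small value gets its own full-width member. */
struct toy_wide {
    unsigned int ct_state;   /* only a handful of values are ever used */
    unsigned int is_ipvs;    /* really just a yes/no flag */
};

/* The same information packed into bitfields sharing one word. */
struct toy_packed {
    unsigned int ct_state : 3;   /* three bits cover values 0..7 */
    unsigned int is_ipvs  : 1;   /* one bit for the flag */
};
```

On common ABIs `sizeof(struct toy_wide)` is two words and `sizeof(struct toy_packed)` is one, at the price of the compiler emitting shift-and-mask code for each access.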
Re: [Patch 2.6 3/3]ioctl: Add support for getting a permanent hardware address
From: Matt Domsch <[EMAIL PROTECTED]>
Date: Mon, 18 Jul 2005 22:30:11 -0500

> Do you want a patch for netlink too then, given the ethtool kernel work is
> already done?

I think what we're going to end up doing is have a netlink interface for the ethtool stuff, so if you add some ethtool bits they will show up automatically in the netlink thing we come up with.
net-2.6.14 tree made
I just put up the first batch of changes due for the 2.6.14 networking at:

rsync://rsync.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.14.git

Anything that isn't a bug fix will end up there, and once 2.6.13 goes out the door I'll push the stuff in that tree to Linus.

Thanks.
Re: [PATCH] reduce netfilte sk_buff enlargement
From: Jan Engelhardt <[EMAIL PROTECTED]>
Date: Tue, 19 Jul 2005 09:18:38 +0200 (MEST)

> >but for now I'll eat that one-bit member.
>
> What is more important? Being as small as possible using bitfields, or being
> as fast as possible? (Usage of bitfields is some CPU overhead for their
> extraction)

I'm conjecturing that we can store the state elsewhere, for example in the SKB ->cb[] control block. But that requires some verification.

Memory access overhead dwarfs whatever the cpu has to do to extract the bits.
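To make the size argument concrete, here is a minimal user-space sketch. The two structs are hypothetical stand-ins that borrow field names from this thread; they are not the real struct sk_buff layout, and the exact sizes depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the real struct sk_buff layout. */
struct flags_unpacked {
	unsigned int pkt_type;
	unsigned int nfctinfo;
	unsigned int ipvs_property;
};

/* Same three values packed into bitfields sharing one word. */
struct flags_packed {
	unsigned int pkt_type:3;
	unsigned int nfctinfo:3;
	unsigned int ipvs_property:1;
};

/* Reading a bitfield costs a shift and a mask once the word is loaded. */
static unsigned int read_ipvs_bit(const struct flags_packed *f)
{
	assert(f != NULL);
	return f->ipvs_property;
}
```

The packed form trades a couple of ALU operations per access for a smaller structure, which is the trade the reply argues is worth making when memory access dominates.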
Re: netdev TODO list
From: Ben Greear <[EMAIL PROTECTED]>
Date: Tue, 19 Jul 2005 10:58:33 -0700

> That way, any out-of-tree code that uses skb->stamp will no longer
> compile (it is much better to fail at compile time than run time).

Sure.
Re: [NET]: Kill skb->tc_classid
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Tue, 19 Jul 2005 20:29:30 +0200

> Did you also get the patch to kill skb->tc_classid? I can only see
> the patch to remove the define in your 2.6.14 tree.

I just put it into the tree right now, it should show up on kernel.org in about a half hour. Thanks.
Re: [PATCH 8/8][ATM]: [speedtch] cure atm_printk() macro gcc-2.95 compile error
All 8 patches applied, thanks Chas. Chas, please update your address book, [EMAIL PROTECTED] is no longer in service. The current address for the list is netdev@vger.kernel.org Thanks.
Re: [NET]: Only build flow.o if CONFIG_XFRM=y
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Tue, 19 Jul 2005 20:44:02 +0200

> [NET]: Only build flow.o if CONFIG_XFRM=y
>
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Applied to net-2.6, thanks Patrick.
Re: [IPV4]: Don't select XFRM for ip_gre
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Tue, 19 Jul 2005 20:47:18 +0200

> [IPV4]: Don't select XFRM for ip_gre
>
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Applied to net-2.6, thanks Patrick.
Re: [PATCH] fib_trie whitespace fixes
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Tue, 19 Jul 2005 08:49:17 -0400

> Fix up lots of little whitespace indentation stuff in fib_trie.
>
> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

Applied to net-2.6, thanks Stephen.
Re: [NET]: Make ipip/ip6_tunnel independant of XFRM
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Tue, 19 Jul 2005 21:23:55 +0200

> [NET]: Make ipip/ip6_tunnel independant of XFRM
>
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Looks good, applied to net-2.6, thanks Patrick.
Re: [2.6 patch] BRIDGE_EBT_ARPREPLY must depend on INET
From: Adrian Bunk <[EMAIL PROTECTED]>
Date: Tue, 19 Jul 2005 15:55:29 +0200

> BRIDGE_EBT_ARPREPLY=y and INET=n results in the following compile error:

Applied, thanks Adrian.
Re: [2.6 patch] NETCONSOLE must depend on INET
From: Adrian Bunk <[EMAIL PROTECTED]>
Date: Tue, 19 Jul 2005 20:29:19 +0200

> NETCONSOLE=y and INET=n results in the following compile error:

Also applied, thanks Adrian.
Re: [PATCH] reduce netfilte sk_buff enlargement
From: Harald Welte <[EMAIL PROTECTED]>
Date: Wed, 20 Jul 2005 09:23:05 -0400

> On Mon, Jul 18, 2005 at 08:31:45PM -0700, David S. Miller wrote:
> > From: Harald Welte <[EMAIL PROTECTED]>
> > Date: Mon, 18 Jul 2005 00:04:51 +0200
> >
> > > The only real in-tree user of nfcache was IPVS, who only needs a single
> > > bit. Unfortunately I couldn't find some other free bit in sk_buff to
> > > stuff that bit into, so I introduced a separate field for them. Maybe
> > > the IPVS guys can resolve that to further save space.
> >
> > I think we must resolve this one before 2.6.14 goes out, which
> > gives us a lot of time, but for now I'll eat that one-bit member.
>
> Well, I hope IPVS people will take care of this. I don't really know
> that code too well...

Ok, I might take a look at this myself.

> > > Initially I wanted to shrink pkt_type to three bits (PACKET_HOST and
> > > alike are only 6 values defined), but unfortunately the bluetooth code
> > > overloads pkt_type :(
> >
> > This also must be cured somehow, that really isn't a clean nor nice
> > usage of this field.
>
> I just ran into Marcel Holtmann earlier today. He thinks moving that
> data into the cb is fine, though he has to double-check that.
>
> He also said that he really only needs 5 bits, so even if the current
> pkt_type overloading would persist, we could probably shrink it to make
> space for the IPVS bit.

Ok, sounds great.
Re: [PATCH 3/*] re-add NFC_ defines
All 3 patches applied, thanks Harald.
Re: [PATCH] reduce netfilte sk_buff enlargement
From: Marcel Holtmann <[EMAIL PROTECTED]>
Date: Thu, 21 Jul 2005 20:20:35 +0200

> However after a look trough the Bluetooth core it should be quite
> easy too move the pkt_type into the control buffer. We already use
> it for a direction bit. The nasty thing is that I have to modify all
> the drivers. So when you finally decided to shrink the pkt_type, I
> think that I can come up with a patch for it quiet quickly.

We are trimming SKB madly right now, so if you could work on the bluetooth patch so we can trim the pkt_type size ASAP that would be much appreciated.

You can send diffs against my net-2.6.14 tree at:

rsync://rsync.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.14.git

Thanks.
Re: [PATCH 1/2][REQSK] Move the syn_table destruction from tcp_listen_stop to reqsk_queue_destroy
From: [EMAIL PROTECTED] (Arnaldo Carvalho de Melo)
Date: Wed, 20 Jul 2005 20:22:28 -0300

> + if (lopt->qlen != 0) {
> + struct request_sock *req;
> + int i;
> +
> + for (i = 0; i < lopt->nr_table_entries; i++)
> + while ((req = lopt->syn_table[i]) != NULL) {
> + lopt->syn_table[i] = req->dl_next;
> + lopt->qlen--;
> + reqsk_free(req);
> + }
> + }

Please fix the tabbing of the closing braces. In fact, put an opening brace after the for() statement, then add the necessary closing brace at the proper tabbing level to close the top-level if() basic block.

I'll hold on both patches until you fix this up, thanks.
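For illustration, the requested brace layout might look like the sketch below. The struct definitions and reqsk_free() are simplified user-space stand-ins (assumptions made so the fragment compiles on its own), and syn_table_destroy() is a hypothetical wrapper name; only the loop body comes from the quoted hunk.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal user-space stand-ins for the kernel types (assumptions). */
struct request_sock {
	struct request_sock *dl_next;
};

struct listen_sock {
	int qlen;
	int nr_table_entries;
	struct request_sock *syn_table[4];
};

static void reqsk_free(struct request_sock *req)
{
	free(req);
}

/* Same loop as the quoted hunk, with an explicit brace on the for() body
 * so that every closing brace sits at the level of the block it closes. */
static void syn_table_destroy(struct listen_sock *lopt)
{
	assert(lopt != NULL);

	if (lopt->qlen != 0) {
		struct request_sock *req;
		int i;

		for (i = 0; i < lopt->nr_table_entries; i++) {
			while ((req = lopt->syn_table[i]) != NULL) {
				lopt->syn_table[i] = req->dl_next;
				lopt->qlen--;
				reqsk_free(req);
			}
		}
	}
}
```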
Re: [PATCH] reduce netfilte sk_buff enlargement
From: Marcel Holtmann <[EMAIL PROTECTED]>
Date: Thu, 21 Jul 2005 23:42:11 +0200

> unfortunatly it is not that straight forward as I thought. The attached
> patch which modifies the Bluetooth core and the hci_usb driver is not
> working on my machine.

Hmmm... I'll see if I can spot anything obvious in the patch.
Re: [PATCH] reduce netfilte sk_buff enlargement
From: Marcel Holtmann <[EMAIL PROTECTED]>
Date: Thu, 21 Jul 2005 23:42:11 +0200

> unfortunatly it is not that straight forward as I thought. The attached
> patch which modifies the Bluetooth core and the hci_usb driver is not
> working on my machine.

This probably has nothing to do with why the patch doesn't work for you, but the transformation of "incoming" to a "u8" from an "int" is not fully correct, because hci_sock.c does this:

put_cmsg(msg, SOL_HCI, HCI_CMSG_DIR, sizeof(int), &bt_cb(skb)->incoming);

I haven't found any other problems though...

Maybe the bluetooth code was somehow depending upon the initial value of skb->pkt_type or something like that?
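The size mismatch can be reproduced in user space. In the sketch below, struct bt_cb_sketch and both helper functions are hypothetical (the real bt_cb() layout is not shown in this thread); the point is only that copying sizeof(int) bytes starting at a one-byte field also picks up whatever follows it, while widening into a local int first does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical control-block layout, for illustration only. */
struct bt_cb_sketch {
	uint8_t pkt_type;
	uint8_t incoming;	/* was an int before the conversion */
	uint8_t other[6];	/* unrelated neighbouring bytes */
};

/* One possible fix: widen the u8 into a real int, then copy that int. */
static int dir_as_int(const struct bt_cb_sketch *cb)
{
	int dir = cb->incoming;
	return dir;
}

/* What a sizeof(int) copy starting at the u8 field actually reads. */
static int dir_overread(const struct bt_cb_sketch *cb)
{
	size_t off = offsetof(struct bt_cb_sketch, incoming);
	int raw;

	/* Keep the demonstration inside the struct's own storage. */
	assert(off + sizeof(raw) <= sizeof(*cb));
	memcpy(&raw, (const unsigned char *)cb + off, sizeof(raw));
	return raw;
}
```

With the neighbouring bytes non-zero, dir_overread() no longer returns the direction flag, which is the kind of breakage the sizeof(int) in the put_cmsg() call would cause.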
Re: [NET,RFC]: Introduce SO_{SND,RCV}BUFFORCE socket options
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Thu, 21 Jul 2005 02:36:13 +0200

> ctnetlink needs large socket buffer sizes. To avoid increasing
> the system wide limit we would like to have something that allows
> CAP_NET_ADMIN to override these limits. The first idea was to
> change the SO_{SND,RCV}BUF behaviour, but since a valid way of
> getting the largest possible size is to use ~0 this would possibly
> break existing applications. So this patch introduces two new
> socket options, SO_SNDBUFFORCE and SO_RCVBUFFORCE, that allow to
> set it to any value.

I couldn't come up with a better way to do this, so the patch is applied to my net-2.6.14 tree.

I thought perhaps we could special case "~0", but actually there are many programs which use an algorithm like "double socket buffer size until reading it back does not show an increase".
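The "double until reading it back does not show an increase" pattern could be sketched roughly as follows. grow_rcvbuf() is a hypothetical helper name, and the 1 << 29 cap is an arbitrary guard against integer overflow chosen for this sketch; only setsockopt()/getsockopt() with SO_RCVBUF are real interfaces.

```c
#include <assert.h>
#include <sys/socket.h>

/* Keep doubling SO_RCVBUF until the value read back stops growing.
 * Returns the largest size observed, or -1 on error. */
static int grow_rcvbuf(int fd)
{
	int best = 0;
	int want;
	socklen_t len = sizeof(best);

	assert(fd >= 0);

	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &best, &len) < 0)
		return -1;
	if (best > (1 << 29))
		return best;	/* already huge; avoid overflow below */

	for (want = best * 2; want > 0 && want <= (1 << 29); want *= 2) {
		int got = 0;

		len = sizeof(got);
		if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, sizeof(want)) < 0)
			break;
		if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len) < 0)
			break;
		if (got <= best)
			break;	/* the kernel clamped the request */
		best = got;
	}
	return best;
}
```

A program written this way never passes ~0, so special-casing that value would not help it, which is why separate privileged options are the less intrusive choice.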
Re: [PATCH] reduce netfilte sk_buff enlargement
From: Marcel Holtmann <[EMAIL PROTECTED]>
Date: Fri, 22 Jul 2005 01:49:51 +0200

> The pkt_type zero is not a valid one. We only use 1-4 and 0xff. So this
> can't be the problem. I assume that the cb is not copied from the driver
> into the core at some point.

All clones and copies of SKBs copy the ->cb[] for you. So perhaps something is spamming the ->cb[] between these two places.
[PATCH RFC]: Killing skb->real_dev
I studied this and it's merely a matter of parameter passing. Specifically, at ptype->func() time, it is plainly the skb->dev before skb_bond() is applied. So I added a "real_dev" arg to ptype->func() and converted the tree over to that. Thomas, this kills the TCF_META_ID_REALDEV stuff, so we should kill it in 2.6.13-rcX too so that nobody starts using it in userspace ok? I'm trying to figure out if it matters for multiple levels of decapsulation. As far as I can tell for the bond_3ad() case, it doesn't, and that's the only user of this thing. if_vlan.h was setting ->real_dev but that looked totally wrong and had no usage, so I simply deleted that. Comments? diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c --- a/drivers/block/aoe/aoenet.c +++ b/drivers/block/aoe/aoenet.c @@ -120,7 +120,7 @@ aoenet_xmit(struct sk_buff *sl) * (1) len doesn't include the header by default. I want this. */ static int -aoenet_rcv(struct sk_buff *skb, struct net_device *ifp, struct packet_type *pt) +aoenet_rcv(struct sk_buff *skb, struct net_device *ifp, struct packet_type *pt, struct net_device *real_dev) { struct aoe_hdr *h; u32 n; diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c --- a/drivers/net/bonding/bond_3ad.c +++ b/drivers/net/bonding/bond_3ad.c @@ -2419,22 +2419,19 @@ out: return 0; } -int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct net_device *dev, struct packet_type* ptype) +int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct net_device *dev, struct packet_type* ptype, struct net_device *real_dev) { struct bonding *bond = dev->priv; struct slave *slave = NULL; int ret = NET_RX_DROP; - if (!(dev->flags & IFF_MASTER)) { + if (!(dev->flags & IFF_MASTER)) goto out; - } read_lock(&bond->lock); - slave = bond_get_slave_by_dev((struct bonding *)dev->priv, - skb->real_dev); - if (slave == NULL) { + slave = bond_get_slave_by_dev(bond, real_dev); + if (!slave) goto out_unlock; - } bond_3ad_rx_indication((struct lacpdu *) skb->data, 
slave, skb->len); diff --git a/drivers/net/bonding/bond_3ad.h b/drivers/net/bonding/bond_3ad.h --- a/drivers/net/bonding/bond_3ad.h +++ b/drivers/net/bonding/bond_3ad.h @@ -295,6 +295,6 @@ void bond_3ad_adapter_duplex_changed(str void bond_3ad_handle_link_change(struct slave *slave, char link); int bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info); int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev); -int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct net_device *dev, struct packet_type* ptype); +int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct net_device *dev, struct packet_type* ptype, struct net_device *real_dev); #endif //__BOND_3AD_H__ diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c --- a/drivers/net/bonding/bond_alb.c +++ b/drivers/net/bonding/bond_alb.c @@ -354,15 +354,14 @@ static void rlb_update_entry_from_arp(st _unlock_rx_hashtbl(bond); } -static int rlb_arp_recv(struct sk_buff *skb, struct net_device *bond_dev, struct packet_type *ptype) +static int rlb_arp_recv(struct sk_buff *skb, struct net_device *bond_dev, struct packet_type *ptype, struct net_device *real_dev) { struct bonding *bond = bond_dev->priv; struct arp_pkt *arp = (struct arp_pkt *)skb->data; int res = NET_RX_DROP; - if (!(bond_dev->flags & IFF_MASTER)) { + if (!(bond_dev->flags & IFF_MASTER)) goto out; - } if (!arp) { dprintk("Packet has no ARP data\n"); diff --git a/drivers/net/hamradio/bpqether.c b/drivers/net/hamradio/bpqether.c --- a/drivers/net/hamradio/bpqether.c +++ b/drivers/net/hamradio/bpqether.c @@ -98,7 +98,7 @@ static char bcast_addr[6]={0xFF,0xFF,0xF static char bpq_eth_addr[6]; -static int bpq_rcv(struct sk_buff *, struct net_device *, struct packet_type *); +static int bpq_rcv(struct sk_buff *, struct net_device *, struct packet_type *, struct net_device *); static int bpq_device_event(struct notifier_block *, unsigned long, void *); static const char *bpq_print_ethaddr(const unsigned char *); @@ 
-165,7 +165,7 @@ static inline int dev_is_ethdev(struct n /* * Receive an AX.25 frame via an ethernet interface. */ -static int bpq_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *ptype) +static int bpq_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *ptype, struct net_device *real_dev) { int len; char * ptr; diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c --- a/drivers/net/pppoe.c +++ b/drivers/net/pppoe.c @@ -377,7 +377,8 @@ abort_kfree: ***/ static int pppoe_rcv(struct sk_buff *skb, struct net_
Re: [PATCH RFC]: Killing skb->real_dev
From: Ben Greear <[EMAIL PROTECTED]>
Date: Thu, 21 Jul 2005 17:41:55 -0700

> Er, now I feel like an idiot. I am using the real_dev that
> is saved in the vlan device logic, not the thing in the
> skb.

Don't feel bad, that usage confused me as well while working on this patch :-)
Re: [PATCH RFC]: Killing skb->real_dev
From: Jay Vosburgh <[EMAIL PROTECTED]>
Date: Thu, 21 Jul 2005 17:35:24 -0700

> FWIW, there have been a couple of proposals floating around
> bonding-devel for a while from people looking to get the skb->real_dev
> in user space (for network manager applications and user-level link
> state monitor type things). There was a patch posted to bonding-devel a
> couple of months ago proposing a sockopt to pass the real_dev up to user
> space. I'm not sure where things stand with them now.

I don't think we really want that. People could ask for that for any similar relationship of encapsulation, and real_dev only works for one level so doesn't cover all cases anyways.

Going the other way is simpler of course, because of dev->master.
Re: [PATCH] [IPV4] fib_trie cleanups
A lot of the spacing and tabbing has been cleaned up by Stephen Hemminger, so you might want to patch against the copy in my 2.6.14 networking git tree at: rsync://rsync.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.14.git Thanks.
Re: [PATCH RFC]: Killing skb->real_dev
From: Thomas Graf <[EMAIL PROTECTED]>
Date: Fri, 22 Jul 2005 17:04:00 +0200

> Sure. I was just thinking that maybe we should delay
> the iproute2 release with the ematch bits until we
> finished to shrink the skb. Stephen?

Hopefully we can weed out the unusable ematch bits before 2.6.13 is released. Therefore, once 2.6.13 goes out the iproute2 update should be OK.

I'm hoping that since we're doing the SKB shrinking in parallel in the net-2.6.14 tree with the ongoing 2.6.13 bug fixing, we should be able to catch all such cases.
Re: [PATCH] inlining failing in ip_conntrack
From: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>
Date: Fri, 22 Jul 2005 03:00:13 -0300

> CC [M] net/ipv4/netfilter/ip_conntrack_core.o
> net/ipv4/netfilter/ip_conntrack_core.c: In function
> `ip_conntrack_event_cache_init':
> include/linux/netfilter_ipv4/ip_conntrack.h:296: sorry, unimplemented:
> inlining failed in call to 'ip_conntrack_put': function body not
> available
> net/ipv4/netfilter/ip_conntrack_core.c:139: sorry, unimplemented:
> called from here
> make[3]: ** [net/ipv4/netfilter/ip_conntrack_core.o] Erro 1
> make[2]: ** [net/ipv4/netfilter] Erro 2
> make[1]: ** [net/ipv4] Erro 2
>
> This is on the net-2.6.14 tree, using gcc 3.4.3

It's marked inline in the header file yet not in the implementation. I think we should work out that discrepancy :-)

Since it might conflict, I'm going to apply Harald's ctnetlink stuff, and then remove the inline tag from ip_conntrack.h's extern declaration of ip_conntrack_put(). If we really want it to be inline, a follow-on patch can do that.
Re: [PATCH 2/2][NET] cleanup INET_REFCNT_DEBUG code
From: Arnaldo Carvalho de Melo <[EMAIL PROTECTED]>
Date: Thu, 21 Jul 2005 23:02:03 -0300

> The second one again, also at:
>
> rsync://rsync.kernel.org/pub/scm/linux/kernel/git/acme/net-2.6.14.git

How does this properly handle the case where sk_prot changes? Do you remember we had that problem with socket SLAB caches, because of how IPV6 and IPV4 sockets can change into the other type? That's why we store the socket SLAB cache in there, as well as the sk_prot.

Also, it would be nice to have some "do { } while (0)" for the NOP version of the debug macros just in case :-)

The first patch doing the reqsk stuff looks fine, so I'll apply that and push it into the net-2.6.14 tree.
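A short sketch of why the NOP variant benefits from "do { } while (0)". The macro and function names here are hypothetical, not the ones in the INET_REFCNT_DEBUG patch. An empty expansion leaves a bare ";" as the if body, which compilers may warn about, and a plain "{ }" expansion followed by ";" breaks an if/else. The do/while form behaves like one ordinary statement in both configurations.

```c
#include <assert.h>

#ifdef REFCNT_DEBUG_SKETCH
static int sketch_debug_count;
/* Debug build: the macro does real work. */
#define sketch_refcnt_dbg_inc()	do { sketch_debug_count++; } while (0)
#else
/* Non-debug build: still a complete statement, but it does nothing. */
#define sketch_refcnt_dbg_inc()	do { } while (0)
#endif

/* The macro sits safely in an unbraced if/else in either configuration. */
static int take_ref(int want_ref)
{
	int status = 0;

	if (want_ref)
		sketch_refcnt_dbg_inc();
	else
		status = -1;

	return status;
}
```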
Re: [PATCH] ctnetlink
From: Harald Welte <[EMAIL PROTECTED]>
Date: Fri, 22 Jul 2005 09:11:30 -0400

> This is a patch for your net-2.6.14 tree (incremental to the
> expect-double-free fix). It adds the ctnetlink code, and all the
> required core conntrack/nat changes that it needs.

Applied and pushed to net-2.6.14.
Re: slow tcp acks on loopback device
From: Steve French <[EMAIL PROTECTED]>
Date: 22 Jul 2005 14:56:59 -0500

> Noticing that the loopback device (at least on RHEL4) has an unfortunate
> mtu size 16384 (which is about 50 bytes too small for SMB read
> responses), I did try increasing the MTU slightly. Changing that to
> 18000 did avoid the fragmentation and the 40ms delay - but what puzzled
> me was why setting TCP_NODELAY after the socket was created did not
> eliminate the delay on the ack and if there is a way to avoid the huge
> tcp ack delay by either doing something else to force client acking
> immediately or to do something on the client side of the stack to get
> the server to send the whole 16K+ frame - it looks like the tcp windows
> is 32K if the value in the tcp acks in the network trace is to be
> trusted.

TCP_NODELAY does not control ACK generation; instead it modifies the Nagle algorithm behavior when sending data packets.

Please take networking discussions to netdev@vger.kernel.org which is where the networking developers are.
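For reference, this is roughly how TCP_NODELAY is set and read back from user space. The helper names are made up for this sketch. As the reply says, the option only changes how the sender coalesces small writes (Nagle); it does nothing to the peer's delayed-ACK timer.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle on a TCP socket; returns 0 on success, -1 on error.
 * This affects only the sending side of this socket. */
static int disable_nagle(int fd)
{
	int one = 1;

	assert(fd >= 0);
	return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}

/* Returns 1 if TCP_NODELAY is set, 0 if not, -1 on error. */
static int nagle_disabled(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
		return -1;
	return val != 0;
}
```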
Re: [PATCH RFC]: Killing skb->real_dev
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Fri, 22 Jul 2005 13:48:42 -0700

> I don't see how the ematch iproute2 stuff depends on SKB shrinking.
> The CVS repository has the latest ematch stuff, just testing and
> checking before the next drop.

We're killing SKB members that the ematch stuff supports keying on. Thus we're deleting the enumeration constants that supported that stuff, which changes the value of the rest of the enumeration constants.
Re: [PATCH RFC]: Killing skb->real_dev
From: Thomas Graf <[EMAIL PROTECTED]>
Date: Fri, 22 Jul 2005 22:51:05 +0200

> Yes, currently we have TCF_META_ID_SECURITY still in there
> with a "/* obsolete */" comment so we can remove that
> immediately. Other candidates for removal are indev, realdev,
> and tcverdict so it's not a big problem, we can just remove
> all of them before the release and in the unlikely case that
> we continue to use one of them, readd it. Reasonable?

Sounds fine. It might have been less painful if these constants were really constant defines and not an enumeration where removing one value changes all the others after it.
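The renumbering effect is easy to see in a small sketch. The identifiers below are shortened, hypothetical versions of the TCF_META_ID_* idea, not the real header contents.

```c
#include <assert.h>

/* Implicitly numbered enum, before removing an entry. */
enum meta_id_before {
	BEFORE_DEV,		/* 0 */
	BEFORE_INDEV,		/* 1 */
	BEFORE_PRIORITY,	/* 2 */
	BEFORE_PROTOCOL		/* 3 */
};

/* Same enum with INDEV deleted: everything after it shifts down by one,
 * so binaries built against the old header now pass the wrong IDs. */
enum meta_id_after {
	AFTER_DEV,		/* 0 */
	AFTER_PRIORITY,		/* 1, was 2 */
	AFTER_PROTOCOL		/* 2, was 3 */
};

/* With explicit values, retiring an ID just leaves a hole. */
#define SKETCH_ID_DEV		0
/* 1 was INDEV, retired */
#define SKETCH_ID_PRIORITY	2
#define SKETCH_ID_PROTOCOL	3
```

Explicit initializers inside an enum (for example `X = 2`) would give the same stability as the defines; the problem described above comes only from implicit numbering.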
Re: [PATCH RFC]: Killing skb->real_dev
From: YOSHIFUJI Hideaki <[EMAIL PROTECTED]>
Date: Fri, 22 Jul 2005 17:06:38 -0400 (EDT)

> No, please, please do not break binaries, whenever it is possible.
> It is definitely much better to have many deaf entries in enums.

That is why we are trying to kill the constants before 2.6.13 gets released. These new interfaces do not exist in 2.6.12, and 2.6.13 is not released yet.
Re: [PATCH RFC]: Killing skb->real_dev
From: Thomas Graf <[EMAIL PROTECTED]> Date: Fri, 22 Jul 2005 22:51:05 +0200 > Yes, currently we have TCF_META_ID_SECURITY still in there > with a "/* obsolete */" comment so we can remove that > immediately. Other candidates for removal are indev, realdev, > and tcverdict so it's not a big problem, we can just remove > all of them before the release and in the unlikely case that > we continue to use one of them, readd it. Reasonable? Done and pushed to net-2.6 like so. I think due to the mounting number of net-2.6 --> net-2.6.14 tree conflicts, I'll start to work on rebuilding the net-2.6.14 GIT tree using net-2.6 as a base. diff-tree 261688d01ec07d3a265b8ace6ec68310fbd96a96 (from d3984a6b6abac6203868f0e9095c0ed9e33ece03) Author: David S. Miller <[EMAIL PROTECTED]> Date: Fri Jul 22 14:43:52 2005 -0700 [PKT_SCHED]: em_meta: Kill TCF_META_ID_{INDEV,SECURITY,TCVERDICT} More unusable TCF_META_* match types that need to get eliminated before 2.6.13 goes out the door. Signed-off-by: David S. Miller <[EMAIL PROTECTED]> Acked-by: Thomas Graf <[EMAIL PROTECTED]> diff --git a/include/linux/tc_ematch/tc_em_meta.h b/include/linux/tc_ematch/tc_em_meta.h --- a/include/linux/tc_ematch/tc_em_meta.h +++ b/include/linux/tc_ematch/tc_em_meta.h @@ -41,17 +41,14 @@ enum TCF_META_ID_LOADAVG_1, TCF_META_ID_LOADAVG_2, TCF_META_ID_DEV, - TCF_META_ID_INDEV, TCF_META_ID_PRIORITY, TCF_META_ID_PROTOCOL, - TCF_META_ID_SECURITY, /* obsolete */ TCF_META_ID_PKTTYPE, TCF_META_ID_PKTLEN, TCF_META_ID_DATALEN, TCF_META_ID_MACLEN, TCF_META_ID_NFMARK, TCF_META_ID_TCINDEX, - TCF_META_ID_TCVERDICT, TCF_META_ID_RTCLASSID, TCF_META_ID_RTIIF, TCF_META_ID_SK_FAMILY, diff --git a/net/sched/em_meta.c b/net/sched/em_meta.c --- a/net/sched/em_meta.c +++ b/net/sched/em_meta.c @@ -27,17 +27,17 @@ * lvalue rvalue * +---+ +---+ * | type: INT | | type: INT | - * def | id: INDEV | | id: VALUE | + * def | id: DEV | | id: VALUE | * | data: | | data: 3 | * +---+ +---+ * | | - * ---> meta_ops[INT][INDEV](...) 
| + * ---> meta_ops[INT][DEV](...)| * | | * --- | * V V * +---+ +---+ * | type: INT | | type: INT | - * obj | id: INDEV | | id: VALUE | + * obj | id: DEV | | id: VALUE | * | data: 2 |<--data got filled out | data: 3 | * +---+ +---+ * | | @@ -170,16 +170,6 @@ META_COLLECTOR(var_dev) *err = var_dev(skb->dev, dst); } -META_COLLECTOR(int_indev) -{ - *err = int_dev(skb->input_dev, dst); -} - -META_COLLECTOR(var_indev) -{ - *err = var_dev(skb->input_dev, dst); -} - /** * skb attributes **/ @@ -235,13 +225,6 @@ META_COLLECTOR(int_tcindex) dst->value = skb->tc_index; } -#ifdef CONFIG_NET_CLS_ACT -META_COLLECTOR(int_tcverd) -{ - dst->value = skb->tc_verd; -} -#endif - /** * Routing **/ @@ -490,7 +473,6 @@ struct meta_ops static struct meta_ops __meta_ops[TCF_META_TYPE_MAX+1][TCF_META_ID_MAX+1] = { [TCF_META_TYPE_VAR] = { [META_ID(DEV)] = META_FUNC(var_dev), - [META_ID(INDEV)]= META_FUNC(var_indev), [META_ID(SK_BOUND_IF)] = META_FUNC(var_sk_bound_if), }, [TCF_META_TYPE_INT] = { @@ -499,7 +481,6 @@ static struct meta_ops __meta_ops[TCF_ME [META_ID(LOADAVG_1)]= META_FUNC(int_loadavg_1), [META_ID(LOADAVG_2)]= META_FUNC(int_loadavg_2), [META_ID(DEV)] = META_FUNC(int_dev), - [META_ID(INDEV)]= META_FUNC(int_indev), [META_ID(PRIORITY)] = META_FUNC(int_priority),
Re: SKB tutorial, Blog, and NET TODO
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Fri, 22 Jul 2005 04:59:30 +0200

> We have multiple queue states, one for each hardware TX queue.
> But instead of multiple qdiscs per device we add a "prio"-argument
> to the dequeue-function. The top-level qdisc is dequeued with the
> highest active priority and hands out a packet of this priority, or,
> if it doesn't support priorities, any packet. The priority of the
> dequeued packet is either passed as argument to hard_start_xmit or
> stored in skb->priority. This approach has a great advantage over
> multiple top-level qdiscs, we can use all the existing classification
> stuff, including SO_PRIORITY etc., and non-work-conserving qdiscs
> like HTB, HFSC, ... can still be used to enforce bandwidth limits.
> It should be possible to implement it in a way that causes only minimal
> overhead for devices not supporting multiple TX queues.
>
> If everyone can agree to this approach I'll hack something up.

Sounds OK.

What happens if the top-level queue pulls out a packet with a certain priority, and that priority's queue in the device is stopped? Will it look for lower-priority packets and try to send those? All of this kind of logic could result in some ugly loops :)
Re: [PATCH] reduce netfilte sk_buff enlargement
From: Marcel Holtmann <[EMAIL PROTECTED]>
Date: Fri, 22 Jul 2005 02:26:34 +0200

> I found the problem. The hci_usb is using the cb[] by itself and so
> overwriting the pkt_type value. The attached patch works for me with the
> hci_usb driver. However I haven't converted all other drivers and
> checked them. This won't happen until I am back home, because I don't
> have any of these devices with me around. However it looks like this
> seems to work without any problems.

Great. I'll wait until you get back and code up an updated patch that takes care of all of the drivers.

Thanks.
net-2.6.14 GIT rebased
Ok, I rebased the net-2.6.14 GIT tree based upon the current net-2.6 tree. It may take a few hours for this rebasing to hit the kernel.org mirror system, so please be patient :) You'll have to be careful when resyncing to this thing since all the changesets were redone. I would recommend pulling out local changes into patches, rsync'ing the new net-2.6.14 tree, running git-prune-script, then readding your local changes. Thanks.
Re: net-2.6.14 GIT rebased
From: Harald Welte <[EMAIL PROTECTED]>
Date: Sat, 23 Jul 2005 01:11:08 -0400

> so there now is
> rsync://rsync.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.14.git/
> and
> rsync://rsync.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.14.git/new-net-2.6.14.git
>
> I don't think this is this intentional? Which one to use ?

I'm quite the bozo, and it figures I'd do something like that right before going away for a day and a half, sorry.

The latter is the correct one, which I've moved over to the correct net-2.6.14.git location.

Sorry again.
Re: Patch: reduce skb input dev on 64 bit machines
From: Jamal Hadi Salim <[EMAIL PROTECTED]>
Date: Sat, 23 Jul 2005 09:32:07 -0400

> This is part of mission skb diet.
> Against git/davem/2.6.14 that was on vger 30 minutes back: It changes
> input_dev to be an ifindex so we dont bother holding devices.
> Would only cut a 4 byte fat on 64 bit machines.
>
> Signed-off-by: Jamal Hadi Salim <[EMAIL PROTECTED]>

I have a better change in the wings that eliminates real_dev _AND_ input_dev completely, and passes them as parameters into pt->func() and ->enqueue() as it should have been from the beginning.

input_dev is "skb->dev at the time netif_receive_skb() was called", and this is pretty much what the "real_dev" code wants too: it wants the device before skb_bond() was invoked.

So both cases want the same exact device pointer, and we can pass them around as parameters instead of all of the current bogus stuff.

And the mere act of passing this "orig_dev" in as a parameter makes it easier to verify that references to it will not escape from the softirq input packet processing context.
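The idea of handing the pre-bonding device to the protocol handler as an argument, rather than storing it in the skb, might be sketched in user space like this. Every type and function below is a simplified, hypothetical stand-in; the real kernel signatures are only hinted at in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumptions). */
struct net_device {
	const char *name;
	struct net_device *master;	/* bonding master, if enslaved */
};

struct sk_buff {
	struct net_device *dev;
};

struct packet_type;
typedef int (*pt_func_t)(struct sk_buff *skb, struct net_device *dev,
			 struct packet_type *pt, struct net_device *orig_dev);

struct packet_type {
	pt_func_t func;
};

/* Stand-in for skb_bond(): retarget the skb at the bonding master. */
static void skb_bond_sketch(struct sk_buff *skb)
{
	if (skb->dev->master)
		skb->dev = skb->dev->master;
}

/* Capture the device before bonding and pass it down as a parameter;
 * nothing extra has to live in (or be reference-counted from) the skb. */
static int receive_sketch(struct sk_buff *skb, struct packet_type *pt)
{
	struct net_device *orig_dev = skb->dev;

	assert(pt != NULL && pt->func != NULL);
	skb_bond_sketch(skb);
	return pt->func(skb, skb->dev, pt, orig_dev);
}

/* Example handler that just records which device it was handed. */
static struct net_device *sketch_seen_orig;

static int record_handler(struct sk_buff *skb, struct net_device *dev,
			  struct packet_type *pt, struct net_device *orig_dev)
{
	(void)skb; (void)dev; (void)pt;
	sketch_seen_orig = orig_dev;
	return 0;
}
```

Because orig_dev exists only as a stack argument for the duration of the call, it is easy to audit that no handler keeps the pointer after the receive path returns.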
Re: SKB tutorial, Blog, and NET TODO
From: Jamal Hadi Salim <[EMAIL PROTECTED]> Date: Sat, 23 Jul 2005 10:14:52 -0400 > Setting the skb->prio to be used by the driver sounds reasonable. > Another alternative would be what was already mentioned to change the > call to hardware_start_transmit() to take a prio option. > The driver should take care of what that all means given that we have > views that differ depending on the h/ware. But this simply doesn't work by itself, that's why we need the per-queue "stopped" states. We need something that properly synchronizes the queue "full" state transitions, so that the queue does not deadlock and when one priority queue fills up, we do the right thing. All of the packet scheduler is keyed off of being able to atomically "send the queue X while not stopped", and that transition from stopped to not-stopped is interlocked properly with the asynchronous sending path. Alexey explained this to both you and me about 3 years ago. At the time we were talking about the prioritized queues provided by a few gigabit NICs.
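The interlock described above can be modelled in userspace with C11 atomics. This is a minimal sketch under stated assumptions, not kernel code: struct tx_ring, ring_init(), ring_xmit() and ring_complete() are hypothetical names, RING_SIZE is arbitrary, and ring_xmit() is assumed to be called by one sender at a time (as if under a transmit lock) while ring_complete() may run asynchronously. The point is the recheck after setting "stopped": without it, a completion that lands between the fill check and the stop would leave the ring stopped forever.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 4   /* arbitrary size for the sketch */

struct tx_ring {
    atomic_int  in_flight;  /* descriptors currently owned by the hardware */
    atomic_bool stopped;    /* per-ring "stopped" state */
};

static void ring_init(struct tx_ring *r)
{
    atomic_init(&r->in_flight, 0);
    atomic_init(&r->stopped, false);
}

/* Sender side. Returns false if the ring is stopped and the packet
 * must stay queued in software. */
static bool ring_xmit(struct tx_ring *r)
{
    if (atomic_load(&r->stopped))
        return false;

    int used = atomic_fetch_add(&r->in_flight, 1) + 1;
    if (used >= RING_SIZE) {
        atomic_store(&r->stopped, true);
        /* Recheck: a completion may have freed a slot after we counted
         * but before we set "stopped". If so, undo the stop ourselves,
         * because that completion saw stopped == false and did not wake. */
        if (atomic_load(&r->in_flight) < RING_SIZE)
            atomic_store(&r->stopped, false);
    }
    return true;
}

/* Completion side: free one slot, then wake the ring if it was stopped. */
static void ring_complete(struct tx_ring *r)
{
    atomic_fetch_sub(&r->in_flight, 1);
    if (atomic_load(&r->stopped))
        atomic_store(&r->stopped, false);
}
```

With one such structure per priority ring, a full ring stops only itself, and the stopped to not-stopped transition is decided by whichever side observes the other's update last.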
Re: SKB tutorial, Blog, and NET TODO
From: Thomas Graf <[EMAIL PROTECTED]> Date: Sat, 23 Jul 2005 18:14:58 +0200 > The simplest case is if the hardware does strict prio and does > the queueing itself based on skb->priority or similar. We don't > need to change anything in this case except for adding the > interface to transfer the classification result to the driver. The key question is what should happen when the ring for prio X fills up. netif_stop_queue() in its current form is the wrong thing to do, because it prevents lower priority packets from being queued, and queueing them is exactly what we want to do if those rings have space. The higher-prio packets will still go out first, of course, but queueing to lower prio rings should still be possible. So we need some kind of netif_stop_queue_prio(dev, prio_nr) or similar. The next issue is how to demultiplex from the number of prios we want, to what the hardware actually supports if the latter is smaller.
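Both points above, a per-priority stop and the demultiplexing onto fewer hardware rings, can be illustrated with a small userspace model. Everything here is an assumption made for the sketch: struct prio_dev, SW_PRIOS, prio_to_ring() and the *_queue_prio() helpers are hypothetical names, priority 0 is taken to be the highest, and the proportional mapping is just one possible policy. Real code would also need the stop/wake transitions to be atomic; plain bit operations are used to keep the example short.

```c
#include <stdbool.h>
#include <stdint.h>

#define SW_PRIOS 8   /* number of priorities the stack would like (assumed) */

struct prio_dev {
    unsigned int hw_rings;      /* rings the hardware has, 1..SW_PRIOS */
    uint32_t     stopped_mask;  /* bit n set => hardware ring n is stopped */
};

/* Fold SW_PRIOS software priorities onto hw_rings hardware rings while
 * preserving order: lower-numbered priorities map to lower-numbered rings. */
static unsigned int prio_to_ring(const struct prio_dev *dev, unsigned int prio)
{
    if (prio >= SW_PRIOS)
        prio = SW_PRIOS - 1;
    return prio * dev->hw_rings / SW_PRIOS;
}

/* Stop only the ring that serves this priority. */
static void stop_queue_prio(struct prio_dev *dev, unsigned int prio)
{
    dev->stopped_mask |= UINT32_C(1) << prio_to_ring(dev, prio);
}

/* Wake only the ring that serves this priority. */
static void wake_queue_prio(struct prio_dev *dev, unsigned int prio)
{
    dev->stopped_mask &= ~(UINT32_C(1) << prio_to_ring(dev, prio));
}

/* True if packets of this priority cannot currently be queued. */
static bool queue_prio_stopped(const struct prio_dev *dev, unsigned int prio)
{
    return (dev->stopped_mask >> prio_to_ring(dev, prio)) & 1u;
}
```

With two hardware rings, priorities 0 to 3 share ring 0 and priorities 4 to 7 share ring 1, so stopping priority 0 also stops priority 3 but leaves priorities 4 to 7 able to queue.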
Re: Patch: reduce skb input dev on 64 bit machines
From: Patrick McHardy <[EMAIL PROTECTED]> Date: Sat, 23 Jul 2005 18:54:13 +0200 > Let me propose again to just set it here for all cases. So far there > hasn't been a single exception where indev is not the input device > as seen by netif_rx(), and I don't expect any to come up. In any case > you should guard this printk by net_ratelimit() to avoid spamming > people's logs. I totally agree, that madness has existed for far too long. See my other email, and hold on for a bit so I can try and finish up my real_dev/input_dev killing patch.
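The quoted advice about guarding a printk with net_ratelimit() amounts to asking a rate limiter for permission before each log message. The following userspace sketch shows the general shape of such a guard as a token bucket; it is not the kernel's implementation, and struct ratelimit, ratelimit_ok() and the interval and burst values are hypothetical. The current time is passed in as a tick count so the behaviour is easy to follow.

```c
#include <stdbool.h>

struct ratelimit {
    long interval;  /* ticks needed to earn one message */
    long burst;     /* most messages that can be saved up */
    long tokens;    /* messages allowed right now */
    long last;      /* tick at which tokens were last credited */
};

/* Returns true if the caller may emit one log message at tick "now". */
static bool ratelimit_ok(struct ratelimit *rl, long now)
{
    long earned = (now - rl->last) / rl->interval;

    if (earned > 0) {
        rl->tokens += earned;
        if (rl->tokens > rl->burst)
            rl->tokens = rl->burst;
        rl->last += earned * rl->interval;
    }
    if (rl->tokens <= 0)
        return false;   /* over budget: drop this message */
    rl->tokens--;
    return true;
}
```

A caller would wrap the message in the check, in the same way the quoted text suggests wrapping the printk: emit the warning only when the limiter returns true, so a flood of bad packets cannot flood the log.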
Re: [PATCH] net/ipv6/ip6_tunnel.c: implicit declaration of function `xfrm6_tunnel_unregister'
From: Cal Peake <[EMAIL PROTECTED]> Date: Sat, 23 Jul 2005 20:50:48 -0400 (EDT) > This patch seems correct: > > Signed-off-by: Cal Peake <[EMAIL PROTECTED]> Applied, thanks Cal.