Re: [PATCH v3 2/2][BNX2]: Add iSCSI support to BNX2 devices.

2007-09-08 Thread FUJITA Tomonori
On Fri, 07 Sep 2007 17:36:47 -0500
Mike Christie [EMAIL PROTECTED] wrote:

  +/*
  + * map SG list
  + */
  +static int bnx2i_map_sg(struct bnx2i_hba *hba, struct bnx2i_cmd *cmd)
  +{
  +   struct scsi_cmnd *sc = cmd->scsi_cmd;
  +   struct iscsi_bd *bd = cmd->bd_tbl->bd_tbl;
  +   struct scatterlist *sg;
  +   int byte_count = 0;
  +   int sg_frags;
  +   int bd_count = 0;
  +   int sg_count;
  +   int sg_len;
  +   u64 addr;
  +   int i;
  +
  +   sg = sc->request_buffer;
  +   sg_count = pci_map_sg(hba->pci_dev, sg, sc->use_sg,
  + sc->sc_data_direction);

Can you use scsi_dma_map() here?


  +   for (i = 0; i < sg_count; i++) {
  +   sg_len = sg_dma_len(sg);
  +   addr = sg_dma_address(sg);
  +   if (sg_len > MAX_BD_LENGTH)
  +   sg_frags = bnx2i_split_bd(cmd, addr, sg_len,
  + bd_count);

Please use scsi_for_each_sg().

You can't directly access use_sg, request_buffer, request_bufflen,
and resid in the scsi_cmnd structure. Please use the scsi data accessors
in scsi_cmnd.h: scsi_sg_count, scsi_sglist, scsi_bufflen,
scsi_set_resid, and scsi_get_resid.


 If you call blk_queue_max_segment_size() in the slave_configure callout 
 you can limit the size of the segments that the block layer builds so 
 they are smaller than MAX_BD_LENGTH. However, I am not sure how useful 
 that is. I think DMA-API.txt states that the mapping code is allowed to 
 merge multiple sglist entries into one, so I think that means that we 
 can still end up with an entry that is larger than MAX_BD_LENGTH. Not 
 sure if there is a way to tell the pci/dma map_sg code to limit this too.

Yeah, the iommu code ignores the lld limitations (the problem is that the
lld limitations are in request_queue and the iommu code can't access
request_queue). There is no way to tell the iommu code about the lld
limitations.
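
Since the mapping code may hand back entries larger than MAX_BD_LENGTH, the driver has to split them itself. The following is only a minimal userspace sketch of that splitting step, not the bnx2i code: example_bd, example_split_segment and the 0xffff limit are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit; the real MAX_BD_LENGTH comes from the bnx2i headers. */
#define EXAMPLE_MAX_BD_LENGTH 0xffffu

/* Simplified stand-in for a hardware buffer descriptor. */
struct example_bd {
	uint64_t addr;
	uint32_t len;
};

/*
 * Split one DMA-mapped segment (addr, len) into descriptors that are no
 * longer than EXAMPLE_MAX_BD_LENGTH. Returns the number of descriptors
 * written, or -1 if the table has too few free slots.
 */
static int example_split_segment(uint64_t addr, uint32_t len,
				 struct example_bd *tbl, size_t slots)
{
	size_t n = 0;

	assert(tbl != NULL);

	while (len > 0) {
		uint32_t chunk = len > EXAMPLE_MAX_BD_LENGTH ?
				 EXAMPLE_MAX_BD_LENGTH : len;

		if (n == slots)
			return -1;
		tbl[n].addr = addr;
		tbl[n].len = chunk;
		addr += chunk;
		len -= chunk;
		n++;
	}
	return (int)n;
}
```

A caller would run this once per mapped entry and add the return value to its running descriptor count, treating -1 as a reason to fail the command.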
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [3/4] 2.6.23-rc5: known regressions v2

2007-09-08 Thread Michal Piotrowski
Hi all,

Here is a list of some known regressions in 2.6.23-rc5.

Feel free to add new regressions/remove fixed etc.
http://kernelnewbies.org/known_regressions

List of Aces

Name                   Regressions fixed since 21-Jun-2007
Adrian Bunk            10
Linus Torvalds         6
Andi Kleen             5
Hugh Dickins           5
Trond Myklebust        5
Andrew Morton          4
Al Viro                3
Alan Stern             3
Alexey Starikovskiy    3
Cornelia Huck          3
David S. Miller        3
Jens Axboe             3
Stephen Hemminger      3
Tejun Heo              3



CPUFREQ

Subject : ide problems: 2.6.22-git17 working, 2.6.23-rc1* is not
References  : http://lkml.org/lkml/2007/7/27/298
  http://lkml.org/lkml/2007/7/29/371
Last known good : ?
Submitter   : dth [EMAIL PROTECTED]
Caused-By   : Len Brown [EMAIL PROTECTED]
  commit f79e3185dd0f8650022518d7624c876d8929061b
Handled-By  : Len Brown [EMAIL PROTECTED]
Status  : problem is being debugged



Networking

Subject : 2.6.23-rc5: possible irq lock inversion dependency detected
References  : http://lkml.org/lkml/2007/9/2/97
Last known good : ?
Submitter   : Christian Kujau [EMAIL PROTECTED]
Caused-By   : ?
Handled-By  : ?
Status  : unknown

Subject : zd1211rw regression, device does not enumerate
References  : http://marc.info/?l=linux-usb-devel&m=118854967709322&w=2
  http://bugzilla.kernel.org/show_bug.cgi?id=8972
Last known good : ?
Submitter   : Oliver Neukum [EMAIL PROTECTED]
Caused-By   : Daniel Drake [EMAIL PROTECTED]
  commit 74553aedd46b3a2cae986f909cf2a3f99369decc
Handled-By  : ?
Status  : unknown

Subject : NETDEV WATCHDOG: eth0: transmit timed out
References  : http://lkml.org/lkml/2007/8/13/737
Last known good : ?
Submitter   : Karl Meyer [EMAIL PROTECTED]
Caused-By   : ?
Handled-By  : Francois Romieu [EMAIL PROTECTED]
Status  : problem is being debugged

Subject : Weird network problems with 2.6.23-rc2
References  : http://lkml.org/lkml/2007/8/11/40
Last known good : ?
Submitter   : Shish [EMAIL PROTECTED]
Caused-By   : ?
Handled-By  : ?
Status  : unknown



Regards,
Michal

--
LOG
http://www.stardust.webpages.pl/log/


Re: [2/2] 2.6.23-rc5: known regressions with patches v2

2007-09-08 Thread Michal Piotrowski
Hi all,

Here is a list of some known regressions in 2.6.23-rc5
with patches available.

Feel free to add new regressions/remove fixed etc.
http://kernelnewbies.org/known_regressions

List of Aces

Name                   Regressions fixed since 21-Jun-2007
Adrian Bunk            10
Linus Torvalds         6
Andi Kleen             5
Hugh Dickins           5
Trond Myklebust        5
Andrew Morton          4
Al Viro                3
Alan Stern             3
Alexey Starikovskiy    3
Cornelia Huck          3
David S. Miller        3
Jens Axboe             3
Stephen Hemminger      3
Tejun Heo              3



Some of these patches are available in -krf (known regressions fixes) tree
http://www.stardust.webpages.pl/files/patches/krf/2.6.23-rc5-git1/linux-2.6.23-rc5-git1-krf1.patch.bz2
http://www.stardust.webpages.pl/files/patches/krf/2.6.23-rc5-git1/linux-2.6.23-rc5-git1-krf1.tar.bz2



MMC

Subject : Unable to access memory card reader anymore
References  : http://bugzilla.kernel.org/show_bug.cgi?id=8885
Last known good : ?
Submitter   : Christian Casteyde [EMAIL PROTECTED]
Caused-By   : ?
Handled-By  : Alan Stern [EMAIL PROTECTED]
Patch   : http://bugzilla.kernel.org/attachment.cgi?id=12438
Status  : patch available



Networking

Subject : ifconfig eth1 - scheduling while atomic: ifconfig/0x0002/4170
References  : http://lkml.org/lkml/2007/9/2/165
Last known good : ?
Submitter   : Florian Lohoff [EMAIL PROTECTED]
Caused-By   : ?
Handled-By  : Johannes Berg [EMAIL PROTECTED]
Patch   : http://lkml.org/lkml/2007/9/7/75
Status  : patch available

Subject : System freeze when restarting network connection with Broadcom driver
References  : http://bugzilla.kernel.org/show_bug.cgi?id=8934
Last known good : ?
Submitter   : Christian Casteyde [EMAIL PROTECTED]
Caused-By   : ?
Handled-By  : ?
Status  : patch has been submitted to John Linville



Regards,
Michal

--
LOG
http://www.stardust.webpages.pl/log/


Re: [PATCH v3 2/2][BNX2]: Add iSCSI support to BNX2 devices.

2007-09-08 Thread Jeff Garzik

FUJITA Tomonori wrote:

Yeah, the iommu code ignores the lld limitations (the problem is that the
lld limitations are in request_queue and the iommu code can't access
request_queue). There is no way to tell the iommu code about the lld
limitations.



This fact very much wants fixing.

Jeff




Re: [PATCH v3 2/2][BNX2]: Add iSCSI support to BNX2 devices.

2007-09-08 Thread Christoph Hellwig
On Wed, Sep 05, 2007 at 02:27:02PM -0700, Anil Veerabhadrappa wrote:
 This is a very tricky proposal, as this header file is automatically
 generated by a well-defined process and is shared between the various
 drivers supporting multiple platforms/OSes and the firmware. If it is
 not a big issue I would like to keep it the way it is.

Most of it should just go away, and the other bits shouldn't change over
the lifetime of the driver except for additions.  So there really isn't
any point in auto-generating it.



Re: [PATCH v3 2/2][BNX2]: Add iSCSI support to BNX2 devices.

2007-09-08 Thread Michael Chan
Christoph Hellwig wrote:

 Most of it should just go away, and the other bits shouldn't change over
 the lifetime of the driver except for additions.  So there really isn't
 any point in auto-generating it.
 

Yes, I agree with Mike Christie on this.  The values in
question are defined in the iSCSI RFC and therefore should be defined
in a common file.



[RFC 0/3] rfkill

2007-09-08 Thread Ivo van Doorn
Hi Dmitry,

I have a few rfkill related patches that I would like you to take a look at
before I send them for inclusion.

Thanks. :)

Ivo


[RFC 1/3] rfkill: Remove IRDA

2007-09-08 Thread Ivo van Doorn
As Dmitry pointed out earlier, rfkill-input.c
doesn't support irda because there are no users
and we shouldn't add unrequired KEY_ defines.

However, RFKILL_TYPE_IRDA was defined in the
rfkill.h header file and would confuse people
about whether it is implemented or not.

This patch removes IRDA support completely,
so it can be added whenever a driver wants the
feature.

Signed-off-by: Ivo van Doorn [EMAIL PROTECTED]
---
 include/linux/rfkill.h |8 +++-
 net/rfkill/Kconfig |2 +-
 net/rfkill/rfkill.c|5 +
 3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/include/linux/rfkill.h b/include/linux/rfkill.h
index a8a6ea8..c4546e1 100644
--- a/include/linux/rfkill.h
+++ b/include/linux/rfkill.h
@@ -31,13 +31,11 @@
  * enum rfkill_type - type of rfkill switch.
  * RFKILL_TYPE_WLAN: switch is no a Wireless network devices.
  * RFKILL_TYPE_BlUETOOTH: switch is on a bluetooth device.
- * RFKILL_TYPE_IRDA: switch is on an infrared devices.
  */
 enum rfkill_type {
-   RFKILL_TYPE_WLAN = 0,
-   RFKILL_TYPE_BLUETOOTH = 1,
-   RFKILL_TYPE_IRDA = 2,
-   RFKILL_TYPE_MAX = 3,
+   RFKILL_TYPE_WLAN ,
+   RFKILL_TYPE_BLUETOOTH,
+   RFKILL_TYPE_MAX,
 };
 
 enum rfkill_state {
diff --git a/net/rfkill/Kconfig b/net/rfkill/Kconfig
index 8b31759..d28a6d9 100644
--- a/net/rfkill/Kconfig
+++ b/net/rfkill/Kconfig
@@ -5,7 +5,7 @@ menuconfig RFKILL
tristate "RF switch subsystem support"
help
  Say Y here if you want to have control over RF switches
- found on many WiFi, Bluetooth and IRDA cards.
+ found on many WiFi and Bluetooth cards.
 
  To compile this driver as a module, choose M here: the
  module will be called rfkill.
diff --git a/net/rfkill/rfkill.c b/net/rfkill/rfkill.c
index db3395b..50e0102 100644
--- a/net/rfkill/rfkill.c
+++ b/net/rfkill/rfkill.c
@@ -106,9 +106,6 @@ static ssize_t rfkill_type_show(struct device *dev,
case RFKILL_TYPE_BLUETOOTH:
type = "bluetooth";
break;
-   case RFKILL_TYPE_IRDA:
-   type = "irda";
-   break;
default:
BUG();
}
@@ -281,7 +278,7 @@ static void rfkill_remove_switch(struct rfkill *rfkill)
 /**
  * rfkill_allocate - allocate memory for rfkill structure.
  * @parent: device that has rf switch on it
- * @type: type of the switch (wlan, bluetooth, irda)
+ * @type: type of the switch (RFKILL_TYPE_*)
  *
  * This function should be called by the network driver when it needs
  * rfkill structure. Once the structure is allocated the driver shoud
-- 
1.5.3
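
One side effect of dropping the explicit values in the enum is that RFKILL_TYPE_MAX always ends up one past the last real type, so tables sized by it stay in step when a type is added or removed. Below is a minimal userspace sketch of that idea. The enum mirrors the patched header; example_rfkill_type_name is a hypothetical helper, and the "wlan" string is an assumption (only "bluetooth" and "irda" appear in the hunk above).

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the enum after the patch: values are assigned implicitly. */
enum rfkill_type {
	RFKILL_TYPE_WLAN,
	RFKILL_TYPE_BLUETOOTH,
	RFKILL_TYPE_MAX,	/* always one past the last real type */
};

/* Hypothetical helper, loosely modelled on rfkill_type_show(). */
static const char *example_rfkill_type_name(enum rfkill_type type)
{
	/* The table is sized by RFKILL_TYPE_MAX, so it tracks the enum. */
	static const char *const names[RFKILL_TYPE_MAX] = {
		[RFKILL_TYPE_WLAN]      = "wlan",	/* assumed string */
		[RFKILL_TYPE_BLUETOOTH] = "bluetooth",
	};

	assert((int)type >= 0);
	return type < RFKILL_TYPE_MAX ? names[type] : NULL;
}
```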



Re: RFC: possible NAPI improvements to reduce interrupt rates for low traffic rates

2007-09-08 Thread Mandeep Singh Baines
James Chapman ([EMAIL PROTECTED]) wrote:
 Hi Mandeep,

 Mandeep Singh Baines wrote:
 Hi James,
 I like the idea of staying in poll longer.
 My comments are similar to what Jamal and Stephen have already 
 said.
 A tunable (via sysfs) would be nice.
 A timer might be preferred to jiffy polling. Jiffy polling will not 
 increase latency the way a timer would. However, jiffy polling will 
 consume a lot more CPU than a timer would, and hence more power. For
 jiffy polling, you could have thousands of calls to poll for a single
 packet received, while in a timer approach the number of polls per
 packet is bounded above by 2.

 Why would using a timer to hold off the napi_complete() rather than 
 jiffy count limit the polls per packet to 2?


I was thinking a timer could be used in the way suggested in Jamal's
paper. The driver would do nothing (park) until the timer expires. So
there would be no calls to poll for the duration of the timer. Hence,
this approach would add extra latency not present in a jiffy polling
approach.

 I think it may be difficult to make poll efficient for the no-packet 
 case because, at a minimum, you have to poll the device state via the
 has_work method.

 Why wouldn't it be efficient? It would usually be done by reading an 
 interrupt pending register.


Reading the interrupt pending register would require an MMIO read.
MMIO reads are very expensive. In some systems the latency of an MMIO
read can be 1000x that of an L1 cache access.

You can use mmio_test to measure MMIO read latency on your system:

http://svn.gnumonks.org/trunk/mmio_test/

However, work_done() doesn't have to be inefficient. For newer
devices you can implement work_done() without an MMIO read by polling
the next ring entry status in memory or some other mechanism. Since
PCI is coherent, accesses to this memory location could be cached
after the first miss. For architectures where PCI is not coherent you'd 
have to go to memory for every poll. So for these architectures has_work()
will be moderately expensive (memory access) even when has_work() does
not require an MMIO read. This might affect home routers: not sure if MIPS or
ARM have coherent PCI.
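
To make the "poll the next ring entry status in memory" idea concrete, here is a minimal userspace sketch. It is not taken from any driver: the descriptor layout, the EXAMPLE_DESC_DONE bit and the function names are all hypothetical, and a real driver would also need the appropriate read barriers before touching the rest of the descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_RING_SIZE 8u
#define EXAMPLE_DESC_DONE 0x1u	/* hypothetical "device wrote this entry" bit */

/* Simplified RX descriptor that the device would update by DMA. */
struct example_rx_desc {
	volatile uint32_t status;
};

struct example_rx_ring {
	struct example_rx_desc desc[EXAMPLE_RING_SIZE];
	unsigned int next;	/* index of the next entry the driver will process */
};

/* Check for pending work by reading host memory only, with no MMIO access. */
static bool example_has_work(const struct example_rx_ring *ring)
{
	assert(ring->next < EXAMPLE_RING_SIZE);
	return (ring->desc[ring->next].status & EXAMPLE_DESC_DONE) != 0;
}

/* Consume one completed entry and advance to the following slot. */
static void example_consume(struct example_rx_ring *ring)
{
	ring->desc[ring->next].status &= ~EXAMPLE_DESC_DONE;
	ring->next = (ring->next + 1) % EXAMPLE_RING_SIZE;
}
```

On a coherent bus the repeated status read in example_has_work() can be served from cache until the device writes the entry; on a non-coherent one each call would go out to memory, which is the cost described above.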

 If you go to a timer implementation then having a tunable will be 
 important. Different applications will have different requirements on
 delay and jitter. Some applications may want to trade delay/jitter for
 less CPU/power consumption and some may not.

 I agree. I'm leaning towards a new ethtool parameter to control this 
 to be consistent with other per-device tunables.

 imho, the work should definitely be pursued further:)

 Thanks Mandeep. I'll try. :)

 -- 
 James Chapman
 Katalix Systems Ltd
 http://www.katalix.com
 Catalysts for your Embedded Linux software development



What's in netdev-2.6.git?

2007-09-08 Thread Jeff Garzik

The following is the current contents of
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
(recently rebased)

The 'upstream' branch is what I will push upstream for 2.6.24, once
the merge window opens.  I also have a pile in my inbox that built up
while I was away at a conference and that I still need to go over.

List of branches

  ALL   == contents of branches: upstream, stats, ethtool
  ethtool
* master
  stats
  upstream

branch upstream
---
Andrew Morton (1):
  libertas: printk warning fixes

Auke Kok (13):
  e1000e: New pci-express e1000 driver (currently for ICH9 devices only)
  e1000e: Remove unused or empty labels
  e1000e: Make a few functions static
  e1000e: remove duplicate shadowing reference to adapter->hw
  e1000e: Fix header includes [v2]
  e1000e: remove namespace collisions with e1000
  e1000e: Use dma_alloc_coherent where possible
  e1000e: Use time_after to account for jiffies wrapping
  e1000e: error handling for pci_map_single calls.
  e1000e: Remove two compile warnings
  e1000e: retire last_tx_tso workaround
  e1000e: Add read code and printout of PBA number (board identifier)
  e1000e: Remove conditional packet split disable flag

Bill Nottingham (1):
  remove gratuitous space in airo module description

Brajesh Dave (1):
  libertas: advertise 11g ad-hoc rates

Brian King (6):
  ibmveth: Enable TCP checksum offload
  ibmveth: Implement ethtool hooks to enable/disable checksum offload
  ibmveth: Add ethtool TSO handlers
  ibmveth: Add ethtool driver stats hooks
  ibmveth: Remove dead frag processing code
  ibmveth: Remove use of bitfields

Dan Williams (31):
  libertas: kill ieeetypes_capinfo bitfield, use ieee80211.h types
  libertas: rename WLAN_802_11_KEY to enc_key and clean up usage
  libertas: clean up indentation in libertas_association_worker
  libertas: clean up 802.11 IE post-scan handling
  libertas: remove if_bootcmd.c
  libertas: fix mixed-case abuse in cmd_ds_802_11_scan
  libertas: fix mixed-case abuse in cmd_ds_802_11_ad_hoc_result
  libertas: fix mixed-case abuse in cmd_ds_802_11_ad_hoc_start
  libertas: re-uppercase command defines and other constants
  libertas: fix debug build breakage due to field rename
  libertas: remove thread.h and make kthread usage clearer
  libertas: new mesh control knobs
  libertas: bump version to 322.p1
  libertas: fix more mixed-case abuse
  libertas: move generic firmware reset command to common code
  libertas: wlan_ -> libertas_ function prefix renames for main.c
  libertas: simplify and clean up data rate handling
  libertas: fix WEXT quality reporting
  libertas: send association events on adhoc reassociation
  libertas: push mesh beacon bit to userspace in scan results
  libertas: fix assignment of WEP key type
  libertas: push WEXT scan requests to a work queue
  libertas: fix misspelling in debug message
  libertas: ignore spurious mesh autostart events
  libertas: better descriptions for association errors
  libertas: fix sparse-reported problems
  libertas: bump driver version
  libertas: fix inadvertant removal of bits from commit 831441862956fffa17b9801db37e6ea1650b0f69
  libertas: reorganize and simplify init sequence
  libertas: don't stomp on interface-specific private data
  libertas: send reset command directly instead of calling libertas_reset_device

Daniel Drake (2):
  zd1211rw: Add ID for Sitecom WL-162
  zd1211rw: Add ID for ZyXEL M-202 XtremeMIMO

Denis Cheng (1):
  drivers/net/cxgb3: removed several unneeded zero initilization

Divy Le Ray (9):
  cxgb3 - MAC workaround update
  cxgb3 - Update rx coalescing length
  cxgb3 - SGE doorbell overflow warning
  cxgb3 - use immediate data for offload Tx
  cxgb3 - Expose HW memory page info
  cxgb3 - tighten checks on TID values
  cxgb3 - Fatal error update
  cxgb3 - log adapter serial number
  cxgb3 - Update internal memory management

Don Fry (1):
  pcnet32: add suspend and resume capability

Eugene Teo (1):
  drivers/net/wireless/libertas/cmd.c: fix adapter->driver_lock dereference

Faidon Liambotis (2):
  Kconfig: order options
  Kconfig: remove references of pcmcia-cs

Holger Schurig (33):
  libertas: remove fw.c
  libertas: fix one more sparse warning
  libertas: make more functions static & remove unused functions
  libertas: uppercase some #defines
  libertas: access mesh_dev more carefully
  libertas: tune hardware info output
  libertas: remove debugmode
  libertas: make the hex dumper nicer
  libertas: remove a hundred CMD_RET_xxx definitions
  libertas: use LBS_DEB_HOST for host-to-card communications
  libertas: use LBS_DEB_HOST for host-to-card communications
  add support for Marvell 8385 CF cards
  

Re: [PATCH v3 2/2][BNX2]: Add iSCSI support to BNX2 devices.

2007-09-08 Thread Anil Veerabhadrappa
On Sat, 2007-09-08 at 07:49 -0700, Michael Chan wrote:
 Christoph Hellwig wrote:
 
  Most of it should just go away, and the other bits shouldn't change over
  the lifetime of the driver except for additions.  So there really isn't
  any point in auto-generating it.
  
 
 Yes, I agree with Mike Christie on this.  These values in
 question are defined in iSCSI RFC and therefore should be defined
 in a common file. 

Sure, we will remove these duplicate defines from the bnx2i header file and
work off the iscsi_proto.h macro definitions.



Re: [NFS] problems with lockd in 2.6.22.6

2007-09-08 Thread Wolfgang Walter
On Friday 07 September 2007, J. Bruce Fields wrote:
 On Fri, Sep 07, 2007 at 05:49:55PM +0200, Wolfgang Walter wrote:
  Hello,
  

  3) For an unknown reason these sockets then remain open. In the morning
  when people start their workstations again we therefore not only get a
  lot of these messages again, but often the nfs-server does not work
  properly any more. Restarting the nfs-daemon is a workaround.
 

I wonder why these sockets remain open, by the way, even if they aren't used
for days. Such a socket only gets deleted when the 81st socket must be opened.

If I do not misunderstand the idea, temporary sockets should be destroyed
after some time without activity by svc_age_temp_sockets.

Now I wonder how svc_age_temp_sockets works. Does it ever close and delete a
temporary socket at all?


static void
svc_age_temp_sockets(unsigned long closure)
{
	struct svc_serv *serv = (struct svc_serv *)closure;
	struct svc_sock *svsk;
	struct list_head *le, *next;
	LIST_HEAD(to_be_aged);

	dprintk("svc_age_temp_sockets\n");

	if (!spin_trylock_bh(&serv->sv_lock)) {
		/* busy, try again 1 sec later */
		dprintk("svc_age_temp_sockets: busy\n");
		mod_timer(&serv->sv_temptimer, jiffies + HZ);
		return;
	}

	list_for_each_safe(le, next, &serv->sv_tempsocks) {
		svsk = list_entry(le, struct svc_sock, sk_list);

		if (!test_and_set_bit(SK_OLD, &svsk->sk_flags))
			continue;
		if (atomic_read(&svsk->sk_inuse) ||
		    test_bit(SK_BUSY, &svsk->sk_flags))
			continue;

Doesn't this mean that svsk->sk_inuse must be zero, which means that SK_DEAD
is set? And wouldn't that mean that svc_delete_socket has already been called
for that socket (and that it is probably already closed)? And wouldn't that
mean that svc_sock_enqueue, which is called later, does not make any sense
(it checks for SK_DEAD)?


		atomic_inc(&svsk->sk_inuse);
		list_move(le, &to_be_aged);
		set_bit(SK_CLOSE, &svsk->sk_flags);
		set_bit(SK_DETACHED, &svsk->sk_flags);
	}
	spin_unlock_bh(&serv->sv_lock);

	while (!list_empty(&to_be_aged)) {
		le = to_be_aged.next;
		/* fiddling the sk_list node is safe 'cos we're SK_DETACHED */
		list_del_init(le);
		svsk = list_entry(le, struct svc_sock, sk_list);

		dprintk("queuing svsk %p for closing, %lu seconds old\n",
			svsk, get_seconds() - svsk->sk_lastrecv);

		/* a thread will dequeue and close it soon */
		svc_sock_enqueue(svsk);
		svc_sock_put(svsk);
	}

	mod_timer(&serv->sv_temptimer, jiffies + svc_conn_age_period * HZ);
}
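
The loop above is a two-pass aging scheme: the first timer run only marks a socket as old, and a later run queues it for closing if the mark is still set and the other checks pass. The userspace sketch below models just that control flow. It is a simplification under stated assumptions, not the sunrpc code: example_sock and its fields are hypothetical stand-ins, and it assumes that activity on a socket clears the "old" mark somewhere else. It does not settle the sk_inuse question raised above; it only mirrors the checks as they appear in the listing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the fields of struct svc_sock used in the listing. */
struct example_sock {
	bool old;	/* plays the role of SK_OLD */
	bool busy;	/* plays the role of SK_BUSY */
	int inuse;	/* plays the role of sk_inuse */
	bool closing;	/* plays the role of SK_CLOSE */
};

/* Returns the previous value of *flag and sets it, like test_and_set_bit(). */
static bool example_test_and_set(bool *flag)
{
	bool prev = *flag;

	*flag = true;
	return prev;
}

/* One run of the aging timer; returns how many sockets were queued for close. */
static size_t example_age_pass(struct example_sock *socks, size_t n)
{
	size_t queued = 0;

	assert(socks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct example_sock *s = &socks[i];

		if (s->closing)
			continue;	/* stand-in for having left the list */
		if (!example_test_and_set(&s->old))
			continue;	/* first sighting: only mark it old */
		if (s->inuse || s->busy)
			continue;	/* mirrors the check in the listing */
		s->closing = true;
		queued++;
	}
	return queued;
}
```

In this model a socket whose inuse count never drops to zero is marked old on every pass but never queued, which is the behaviour the questions above are probing.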

Regards,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts


[PATCH, 2nd try] remove setup of platform device from jazzsonic.c

2007-09-08 Thread Thomas Bogendoerfer

Remove the platform device setup from jazzsonic; it is now done in arch code.

Signed-off-by: Thomas Bogendoerfer [EMAIL PROTECTED]
---

diff --git a/drivers/net/jazzsonic.c b/drivers/net/jazzsonic.c
index 75f6f44..435060a 100644
--- a/drivers/net/jazzsonic.c
+++ b/drivers/net/jazzsonic.c
@@ -45,7 +45,6 @@
 #include <asm/jazzdma.h>
 
 static char jazz_sonic_string[] = "jazzsonic";
-static struct platform_device *jazz_sonic_device;
 
 #define SONIC_MEM_SIZE 0x100
 
@@ -70,14 +69,6 @@ static unsigned int sonic_debug = 1;
 #endif
 
 /*
- * Base address and interrupt of the SONIC controller on JAZZ boards
- */
-static struct {
-   unsigned int port;
-   unsigned int irq;
-} sonic_portlist[] = { {JAZZ_ETHERNET_BASE, JAZZ_ETHERNET_IRQ}, {0, 0}};
-
-/*
  * We cannot use station (ethernet) address prefixes to detect the
  * sonic controller since these are board manufacturer depended.
  * So we check for known Silicon Revision IDs instead.
@@ -215,13 +206,12 @@ static int __init jazz_sonic_probe(struct platform_device *pdev)
 {
struct net_device *dev;
struct sonic_local *lp;
+   struct resource *res;
int err = 0;
int i;
 
-   /*
-* Don't probe if we're not running on a Jazz board.
-*/
-   if (mips_machgroup != MACH_GROUP_JAZZ)
+   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+   if (!res)
return -ENODEV;
 
dev = alloc_etherdev(sizeof(struct sonic_local));
@@ -235,20 +225,9 @@ static int __init jazz_sonic_probe(struct platform_device *pdev)
 
netdev_boot_setup_check(dev);
 
-   if (dev->base_addr >= KSEG0) { /* Check a single specified location. */
-   err = sonic_probe1(dev);
-   } else if (dev->base_addr != 0) { /* Don't probe at all. */
-   err = -ENXIO;
-   } else {
-   for (i = 0; sonic_portlist[i].port; i++) {
-   dev->base_addr = sonic_portlist[i].port;
-   dev->irq = sonic_portlist[i].irq;
-   if (sonic_probe1(dev) == 0)
-   break;
-   }
-   if (!sonic_portlist[i].port)
-   err = -ENODEV;
-   }
+   dev->base_addr = res->start;
+   dev->irq = platform_get_irq(pdev, 0);
+   err = sonic_probe1(dev);
if (err)
goto out;
err = register_netdev(dev);
@@ -303,38 +282,12 @@ static struct platform_driver jazz_sonic_driver = {
 
 static int __init jazz_sonic_init_module(void)
 {
-   int err;
-
-   if ((err = platform_driver_register(&jazz_sonic_driver))) {
-   printk(KERN_ERR "Driver registration failed\n");
-   return err;
-   }
-
-   jazz_sonic_device = platform_device_alloc(jazz_sonic_string, 0);
-   if (!jazz_sonic_device)
-   goto out_unregister;
-
-   if (platform_device_add(jazz_sonic_device)) {
-   platform_device_put(jazz_sonic_device);
-   jazz_sonic_device = NULL;
-   }
-
-   return 0;
-
-out_unregister:
-   platform_driver_unregister(&jazz_sonic_driver);
-
-   return -ENOMEM;
+   return platform_driver_register(&jazz_sonic_driver);
 }
 
 static void __exit jazz_sonic_cleanup_module(void)
 {
platform_driver_unregister(&jazz_sonic_driver);
-
-   if (jazz_sonic_device) {
-   platform_device_unregister(jazz_sonic_device);
-   jazz_sonic_device = NULL;
-   }
 }
 
 module_init(jazz_sonic_init_module);
-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessary a
good idea.[ RFC1925, 2.3 ]


[PATCH 1/2] remove asm/bitops.h includes

2007-09-08 Thread Jiri Slaby
remove asm/bitops.h includes

Including <asm/bitops.h> directly may cause compile errors. Don't include
it; include <linux/bitops.h> instead. The next patch will forbid including
the asm header directly.

Cc: Adrian Bunk [EMAIL PROTECTED]
Signed-off-by: Jiri Slaby [EMAIL PROTECTED]

---
commit 3c05eef3d0a98065323d7d6d9a78e0985eba4b10
tree cb9691832992f570b0363dd568f6fa3d2c81e3f5
parent 132bb039c741d00f066e7501e3613d2d20bf0595
author Jiri Slaby [EMAIL PROTECTED] Tue, 04 Sep 2007 21:01:35 +0200
committer Jiri Slaby [EMAIL PROTECTED] Tue, 04 Sep 2007 21:01:35 +0200

 arch/alpha/lib/fls.c|2 +-
 arch/frv/kernel/irq-mb93091.c   |2 +-
 arch/frv/kernel/irq-mb93093.c   |2 +-
 arch/frv/kernel/irq-mb93493.c   |2 +-
 arch/frv/kernel/irq.c   |2 +-
 arch/mips/au1000/pb1200/irqmap.c|2 +-
 arch/mips/basler/excite/excite_irq.c|2 +-
 arch/mips/gt64120/wrppmc/irq.c  |1 -
 arch/mips/tx4938/common/setup.c |2 +-
 arch/powerpc/platforms/maple/setup.c|2 +-
 drivers/char/esp.c  |2 +-
 drivers/char/mxser.c|2 +-
 drivers/char/mxser_new.c|2 +-
 drivers/ide/ide-io.c|2 +-
 drivers/media/dvb/ttpci/av7110_ir.c |2 +-
 drivers/net/bnx2.c  |2 +-
 drivers/net/cris/eth_v10.c  |2 +-
 drivers/net/cxgb3/adapter.h |2 +-
 drivers/net/hamradio/dmascc.c   |2 +-
 drivers/net/mac89x0.c   |2 +-
 drivers/net/spider_net.c|2 +-
 drivers/net/tulip/uli526x.c |2 +-
 drivers/net/wireless/bcm43xx/bcm43xx_leds.c |2 +-
 drivers/pcmcia/m32r_pcc.c   |2 +-
 drivers/pcmcia/m8xx_pcmcia.c|2 +-
 drivers/ps3/vuart.c |2 +-
 drivers/rtc/rtc-pl031.c |2 +-
 drivers/rtc/rtc-sa1100.c|2 +-
 drivers/s390/cio/idset.c|2 +-
 drivers/s390/net/claw.c |2 +-
 drivers/scsi/ide-scsi.c |2 +-
 drivers/serial/crisv10.c|2 +-
 drivers/watchdog/at91rm9200_wdt.c   |2 +-
 drivers/watchdog/ks8695_wdt.c   |2 +-
 drivers/watchdog/omap_wdt.c |2 +-
 drivers/watchdog/sa1100_wdt.c   |2 +-
 fs/reiser4/jnode.h  |2 +-
 fs/reiser4/plugin/space/bitmap.c|2 +-
 include/asm-cris/posix_types.h  |2 +-
 include/asm-i386/pgtable.h  |5 +
 include/asm-i386/smp.h  |2 +-
 include/asm-ia64/cacheflush.h   |2 +-
 include/asm-ia64/pgtable.h  |2 +-
 include/asm-ia64/smp.h  |2 +-
 include/asm-ia64/spinlock.h |2 +-
 include/asm-m32r/pgtable.h  |2 +-
 include/asm-mips/fpu.h  |2 +-
 include/asm-parisc/pgtable.h|2 +-
 include/asm-powerpc/iommu.h |2 +-
 include/asm-powerpc/mmu_context.h   |2 +-
 include/asm-ppc/mmu_context.h   |3 ++-
 include/asm-sparc64/smp.h   |2 +-
 include/asm-x86_64/pgtable.h|2 +-
 include/asm-x86_64/topology.h   |2 +-
 include/linux/of.h  |2 +-
 lib/hweight.c   |2 +-
 net/core/gen_estimator.c|2 +-
 net/core/pktgen.c   |2 +-
 net/ipv4/fib_trie.c |2 +-
 net/netfilter/xt_connbytes.c|2 +-
 60 files changed, 60 insertions(+), 63 deletions(-)

diff --git a/arch/alpha/lib/fls.c b/arch/alpha/lib/fls.c
index 7ad84ea..32afaa3 100644
--- a/arch/alpha/lib/fls.c
+++ b/arch/alpha/lib/fls.c
@@ -3,7 +3,7 @@
  */
 
 #include <linux/module.h>
-#include <asm/bitops.h>
+#include <linux/bitops.h>
 
 /* This is fls(x)-1, except zero is held to zero.  This allows most
efficent input into extbl, plus it allows easy handling of fls(0)=0.  */
diff --git a/arch/frv/kernel/irq-mb93091.c b/arch/frv/kernel/irq-mb93091.c
index ad753c1..9e38f99 100644
--- a/arch/frv/kernel/irq-mb93091.c
+++ b/arch/frv/kernel/irq-mb93091.c
@@ -17,10 +17,10 @@
 #include <linux/interrupt.h>
 #include <linux/init.h>
 #include <linux/irq.h>
+#include <linux/bitops.h>
 
 #include <asm/io.h>
 #include <asm/system.h>
-#include <asm/bitops.h>
 #include <asm/delay.h>
 #include <asm/irq.h>
 #include <asm/irc-regs.h>
diff --git a/arch/frv/kernel/irq-mb93093.c b/arch/frv/kernel/irq-mb93093.c
index e0983f6..3c2752c 100644
--- a/arch/frv/kernel/irq-mb93093.c
+++ b/arch/frv/kernel/irq-mb93093.c
@@ -17,10 +17,10 @@
 #include <linux/interrupt.h>
 #include <linux/init.h>
 #include <linux/irq.h>

[PATCH 00/16] core network namespace support

2007-09-08 Thread Eric W. Biederman

The following patchset was built against the latest net-2.6.24
tree, and should be safe to apply assuming no issues are found during
the review.  In the interest of keeping the patchset to a reviewable
size, just the core of the network stack has been covered.


The 10,000 foot overview.  We want to make it look to user space 
like the kernel implements multiple network stacks.

To implement this some of the currently global variables in the
network stack need to have one instance per network namespace,
or the global data structure needs to have a network namespace
field.

Currently control enters the network stack in one of 4 major ways.
Through operations on a socket, through a packet coming in
from a network device, through miscellaneous syscalls from
a process, and through operations on a virtual filesystem.
So the current design calls for placing a pointer to 
struct net (the network namespace structure) on network
devices, sockets, processes, and on filesystems so we
have a clear understanding of which network namespace
operations should be done in the context of.

Packets do not contain a pointer to a network namespace structure.
Instead their network namespace is derived from which network
device or which socket they are passing through.

On the input path we only need to look at the network namespace
to determine which routing tables to use, and which sockets the
packet can be destined for.

Similarly on the output path we only need to consult the network
namespace for the output routing tables which point to which
network devices we can use.

So while there are accesses to the network namespace as
we process each packet they are in well contained spots that occur
rarely.

Where the network namespace appears most is on the control,
setup, and clean up code paths in the network stack, which we
change rarely.  There we currently don't have anything except
a global context so modifications are necessary, but since
the network namespace parameter is explicit rather than implicit
it should not require much thought to use.

The implementation strategy follows the classic global
lock reduction pattern.  First all of the interfaces
at a given level in the network stack are made to filter
out traffic from anything except the initial network namespace,
and then those interfaces are allowed to see packets from
any network namespace.  Then some subset of those interfaces
are taught to handle packets from all namespaces, after the
more specific protocol layers below them have been made to
filter those packets.

What this means is that we start out with large intrusive
stupid patches and end up with small patches that enable
small bits of functionality in the secondary network
namespaces.

Eric
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/16] appletalk: In notifier handlers convert the void pointer to a netdevice

2007-09-08 Thread Eric W. Biederman

This slightly improves code safety and clarity.

Later network namespace patches touch this code so this is a
preliminary cleanup.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 net/appletalk/aarp.c |7 ---
 net/appletalk/ddp.c  |4 +++-
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/appletalk/aarp.c b/net/appletalk/aarp.c
index 3d1655f..80b5414 100644
--- a/net/appletalk/aarp.c
+++ b/net/appletalk/aarp.c
@@ -330,15 +330,16 @@ static void aarp_expire_timeout(unsigned long unused)
 static int aarp_device_event(struct notifier_block *this, unsigned long event,
 			     void *ptr)
 {
+	struct net_device *dev = ptr;
 	int ct;
 
 	if (event == NETDEV_DOWN) {
 		write_lock_bh(&aarp_lock);
 
 		for (ct = 0; ct < AARP_HASH_SIZE; ct++) {
-			__aarp_expire_device(&resolved[ct], ptr);
-			__aarp_expire_device(&unresolved[ct], ptr);
-			__aarp_expire_device(&proxies[ct], ptr);
+			__aarp_expire_device(&resolved[ct], dev);
+			__aarp_expire_device(&unresolved[ct], dev);
+			__aarp_expire_device(&proxies[ct], dev);
 		}
 
 		write_unlock_bh(&aarp_lock);
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index fbdfb12..594b597 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -647,9 +647,11 @@ static inline void atalk_dev_down(struct net_device *dev)
 static int ddp_device_event(struct notifier_block *this, unsigned long event,
void *ptr)
 {
+   struct net_device *dev = ptr;
+
if (event == NETDEV_DOWN)
/* Discard any use of this */
-   atalk_dev_down(ptr);
+   atalk_dev_down(dev);
 
return NOTIFY_DONE;
 }
-- 
1.5.3.rc6.17.g1911



[PATCH 02/16] net: Don't implement dev_ifname32 inline

2007-09-08 Thread Eric W. Biederman

The current implementation of dev_ifname makes maintenance difficult
because updates to the implementation of the ioctl have to be made in two
places.  So this patch updates dev_ifname32 to do a classic 32/64
structure conversion and call sys_ioctl like the rest of the
compat calls do.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 fs/compat_ioctl.c |   21 ++---
 1 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index a6c9078..361b994 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -324,22 +324,21 @@ struct ifconf32 {
 
 static int dev_ifname32(unsigned int fd, unsigned int cmd, unsigned long arg)
 {
-	struct net_device *dev;
-	struct ifreq32 ifr32;
+	struct ifreq __user *uifr;
 	int err;
 
-	if (copy_from_user(&ifr32, compat_ptr(arg), sizeof(ifr32)))
+	uifr = compat_alloc_user_space(sizeof(struct ifreq));
+	if (copy_in_user(uifr, compat_ptr(arg), sizeof(struct ifreq32)))
 		return -EFAULT;
 
-	dev = dev_get_by_index(ifr32.ifr_ifindex);
-	if (!dev)
-		return -ENODEV;
+	err = sys_ioctl(fd, SIOCGIFNAME, (unsigned long)uifr);
+	if (err)
+		return err;
 
-	strlcpy(ifr32.ifr_name, dev->name, sizeof(ifr32.ifr_name));
-	dev_put(dev);
-	
-	err = copy_to_user(compat_ptr(arg), &ifr32, sizeof(ifr32));
-	return (err ? -EFAULT : 0);
+	if (copy_in_user(compat_ptr(arg), uifr, sizeof(struct ifreq32)))
+		return -EFAULT;
+
+	return 0;
 }
 
 static int dev_ifconf(unsigned int fd, unsigned int cmd, unsigned long arg)
-- 
1.5.3.rc6.17.g1911



[PATCH 03/16] net: Basic network namespace infrastructure.

2007-09-08 Thread Eric W. Biederman

This is the basic infrastructure needed to support network
namespaces.  This infrastructure is:
- Registration functions to support initializing per network
  namespace data when a network namespace is created or destroyed.

- struct net.  The network namespace data structure.
  This structure will grow as variables are made per network
  namespace but this is the minimal starting point.

- Functions to grab a reference to the network namespace.
  I provide both get/put functions that keep a network namespace
  from being freed.  And hold/release functions serve as weak references
  and will warn if their count is not zero when the data structure
  is freed.  Useful for dealing with more complicated data structures
  like the ipv4 route cache.

- A list of all of the network namespaces so we can iterate over them.

- A slab for the network namespace data structure allowing leaks
  to be spotted.
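
To make the difference between the two kinds of reference concrete,
here is a small userspace model using C11 atomics.  It is only a sketch
and not the kernel implementation: the initial count of 1, the `freed`
flag (standing in for the deferred free) and the `leak_warned` flag
(standing in for the printk warning) are assumptions made for the
example, as is the `net_init()` helper.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model of the two counters; not the kernel implementation. */
struct net {
	atomic_int count;      /* strong references: get_net()/put_net() */
	atomic_int use_count;  /* weak references: hold_net()/release_net() */
	bool freed;            /* stands in for the deferred free */
	bool leak_warned;      /* stands in for the printk() warning */
};

/* Hypothetical helper: the creator starts with one strong reference. */
static void net_init(struct net *net)
{
	atomic_init(&net->count, 1);
	atomic_init(&net->use_count, 0);
	net->freed = false;
	net->leak_warned = false;
}

static struct net *get_net(struct net *net)
{
	atomic_fetch_add(&net->count, 1);
	return net;
}

static void put_net(struct net *net)
{
	assert(!net->freed);

	/* atomic_fetch_sub() returns the old value: 1 means last reference. */
	if (atomic_fetch_sub(&net->count, 1) == 1) {
		/* Weak references do not delay the free, they only warn. */
		if (atomic_load(&net->use_count) != 0)
			net->leak_warned = true;
		net->freed = true;
	}
}

static struct net *hold_net(struct net *net)
{
	atomic_fetch_add(&net->use_count, 1);
	return net;
}

static void release_net(struct net *net)
{
	atomic_fetch_sub(&net->use_count, 1);
}
```

In this model only get/put decide when the namespace goes away; an
outstanding hold at that point is reported as a bug rather than
honoured.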

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 include/net/net_namespace.h |   68 ++
 net/core/Makefile   |2 +-
 net/core/net_namespace.c|  292 +++
 3 files changed, 361 insertions(+), 1 deletions(-)
 create mode 100644 include/net/net_namespace.h
 create mode 100644 net/core/net_namespace.c

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
new file mode 100644
index 000..6344b77
--- /dev/null
+++ b/include/net/net_namespace.h
@@ -0,0 +1,68 @@
+/*
+ * Operations on the network namespace
+ */
+#ifndef __NET_NET_NAMESPACE_H
+#define __NET_NET_NAMESPACE_H
+
+#include <asm/atomic.h>
+#include <linux/workqueue.h>
+#include <linux/list.h>
+
+struct net {
+	atomic_t		count;		/* To decide when the network
+						 *  namespace should be freed.
+						 */
+	atomic_t		use_count;	/* To track references we
+						 * destroy on demand
+						 */
+	struct list_head	list;		/* list of network namespaces */
+	struct work_struct	work;		/* work struct for freeing */
+};
+
+extern struct net init_net;
+extern struct list_head net_namespace_list;
+
+extern void __put_net(struct net *net);
+
+static inline struct net *get_net(struct net *net)
+{
+	atomic_inc(&net->count);
+	return net;
+}
+
+static inline void put_net(struct net *net)
+{
+	if (atomic_dec_and_test(&net->count))
+		__put_net(net);
+}
+
+static inline struct net *hold_net(struct net *net)
+{
+	atomic_inc(&net->use_count);
+	return net;
+}
+
+static inline void release_net(struct net *net)
+{
+	atomic_dec(&net->use_count);
+}
+
+extern void net_lock(void);
+extern void net_unlock(void);
+
+#define for_each_net(VAR)				\
+	list_for_each_entry(VAR, &net_namespace_list, list)
+
+
+struct pernet_operations {
+   struct list_head list;
+   int (*init)(struct net *net);
+   void (*exit)(struct net *net);
+};
+
+extern int register_pernet_subsys(struct pernet_operations *);
+extern void unregister_pernet_subsys(struct pernet_operations *);
+extern int register_pernet_device(struct pernet_operations *);
+extern void unregister_pernet_device(struct pernet_operations *);
+
+#endif /* __NET_NET_NAMESPACE_H */
diff --git a/net/core/Makefile b/net/core/Makefile
index 4751613..ea9b3f3 100644
--- a/net/core/Makefile
+++ b/net/core/Makefile
@@ -3,7 +3,7 @@
 #
 
 obj-y := sock.o request_sock.o skbuff.o iovec.o datagram.o stream.o scm.o \
-gen_stats.o gen_estimator.o
+gen_stats.o gen_estimator.o net_namespace.o
 
 obj-$(CONFIG_SYSCTL) += sysctl_net_core.o
 
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
new file mode 100644
index 000..f259a9b
--- /dev/null
+++ b/net/core/net_namespace.c
@@ -0,0 +1,292 @@
+#include <linux/workqueue.h>
+#include <linux/rtnetlink.h>
+#include <linux/cache.h>
+#include <linux/slab.h>
+#include <linux/list.h>
+#include <linux/delay.h>
+#include <net/net_namespace.h>
+
+/*
+ * Our network namespace constructor/destructor lists
+ */
+
+static LIST_HEAD(pernet_list);
+static struct list_head *first_device = &pernet_list;
+static DEFINE_MUTEX(net_mutex);
+
+static DEFINE_MUTEX(net_list_mutex);
+LIST_HEAD(net_namespace_list);
+
+static struct kmem_cache *net_cachep;
+
+struct net init_net;
+EXPORT_SYMBOL_GPL(init_net);
+
+void net_lock(void)
+{
+	mutex_lock(&net_list_mutex);
+}
+
+void net_unlock(void)
+{
+	mutex_unlock(&net_list_mutex);
+}
+
+static struct net *net_alloc(void)
+{
+	return kmem_cache_alloc(net_cachep, GFP_KERNEL);
+}
+
+static void net_free(struct net *net)
+{
+	if (!net)
+		return;
+
+	if (unlikely(atomic_read(&net->use_count) != 0)) {
+		printk(KERN_EMERG "network namespace not free! Usage: %d\n",
+   

[PATCH 04/16] net: Add a network namespace parameter to tasks

2007-09-08 Thread Eric W. Biederman

This is the network namespace from which all sockets
and anything else under user control ultimately get their network
namespace parameters.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 include/linux/init_task.h |2 ++
 include/linux/nsproxy.h   |1 +
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index cab741c..685d631 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -9,6 +9,7 @@
 #include <linux/ipc.h>
 #include <linux/pid_namespace.h>
 #include <linux/user_namespace.h>
+#include <net/net_namespace.h>
 
 #define INIT_FDTABLE \
 {  \
@@ -78,6 +79,7 @@ extern struct nsproxy init_nsproxy;
 	.nslock		= __SPIN_LOCK_UNLOCKED(nsproxy.nslock),		\
 	.uts_ns		= &init_uts_ns,					\
 	.mnt_ns		= NULL,						\
+	.net_ns		= &init_net,					\
 	INIT_IPC_NS(ipc_ns)						\
 	.user_ns	= &init_user_ns,				\
 }
diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index ce06188..bec4485 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -29,6 +29,7 @@ struct nsproxy {
struct mnt_namespace *mnt_ns;
struct pid_namespace *pid_ns;
struct user_namespace *user_ns;
+   struct net   *net_ns;
 };
 extern struct nsproxy init_nsproxy;
 
-- 
1.5.3.rc6.17.g1911



[PATCH 06/16] net: Add a network namespace parameter to struct sock

2007-09-08 Thread Eric W. Biederman

Sockets need to get a reference to their network namespace,
or possibly a simple hold if someone registers on the network
namespace notifier and will free the sockets when the namespace
is going to be destroyed.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 include/net/inet_timewait_sock.h |1 +
 include/net/sock.h   |3 +++
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index 47d52b2..abaff05 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -115,6 +115,7 @@ struct inet_timewait_sock {
 #define tw_refcnt		__tw_common.skc_refcnt
 #define tw_hash			__tw_common.skc_hash
 #define tw_prot			__tw_common.skc_prot
+#define tw_net			__tw_common.skc_net
 	volatile unsigned char	tw_substate;
 	/* 3 bits hole, try to pack */
 	unsigned char		tw_rcv_wscale;
diff --git a/include/net/sock.h b/include/net/sock.h
index 802c670..253df3f 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -106,6 +106,7 @@ struct proto;
  * @skc_refcnt: reference count
  * @skc_hash: hash value used with various protocol lookup tables
  * @skc_prot: protocol handlers inside a network family
+ * @skc_net: reference to the network namespace of this socket
  *
  * This is the minimal network layer representation of sockets, the header
  * for struct sock and struct inet_timewait_sock.
@@ -120,6 +121,7 @@ struct sock_common {
 	atomic_t		skc_refcnt;
 	unsigned int		skc_hash;
 	struct proto		*skc_prot;
+	struct net		*skc_net;
 };
 
 /**
@@ -196,6 +198,7 @@ struct sock {
 #define sk_refcnt		__sk_common.skc_refcnt
 #define sk_hash			__sk_common.skc_hash
 #define sk_prot			__sk_common.skc_prot
+#define sk_net			__sk_common.skc_net
 	unsigned char		sk_shutdown : 2,
 				sk_no_check : 2,
 				sk_userlocks : 4;
-- 
1.5.3.rc6.17.g1911



[PATCH 08/16] net: Make socket creation namespace safe.

2007-09-08 Thread Eric W. Biederman

This patch passes in the namespace a new socket should be created in
and has the socket code do the appropriate reference counting.  By
virtue of this all socket create methods are touched.  In addition
the socket create methods are modified so that they will fail if
you attempt to create a socket in a non-default network namespace.

Failing if we attempt to create a socket outside of the default
network namespace ensures that as we incrementally make the network stack
network namespace aware we will not export functionality that someone
has not audited and made certain is network namespace safe.
This allows us to partially enable network namespaces before all of the
exotic protocols are supported.

Any protocol layers I have missed will fail to compile because I now
pass an extra parameter into the socket creation code.
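
The shape of the change to each create method can be sketched in
userspace as below.  This is an illustration under assumptions, not
code from the patch: `example_create()` and the one-field `struct
socket` are hypothetical, and only the extra `struct net *` parameter,
the comparison against the initial namespace and the -EAFNOSUPPORT
return value are taken from the hunks that follow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct net { int id; };
static struct net init_net;  /* the initial (default) namespace */

/* Hypothetical, minimal socket: it only remembers its namespace. */
struct socket { struct net *net; };

/* Hypothetical create method of a protocol that is not yet namespace aware. */
static int example_create(struct net *net, struct socket *sock)
{
	assert(net != NULL && sock != NULL);

	if (net != &init_net)
		return -EAFNOSUPPORT;  /* refuse sockets in unaudited namespaces */

	sock->net = net;
	return 0;
}
```

Once a protocol has been audited, dropping the early return is all it
takes to enable it in the other namespaces.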

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 drivers/net/pppoe.c  |4 ++--
 drivers/net/pppol2tp.c   |4 ++--
 drivers/net/pppox.c  |7 +--
 include/linux/if_pppox.h |2 +-
 include/linux/net.h  |3 ++-
 include/net/llc_conn.h   |2 +-
 include/net/sock.h   |4 +++-
 net/appletalk/ddp.c  |7 +--
 net/atm/common.c |4 ++--
 net/atm/common.h |2 +-
 net/atm/pvc.c|7 +--
 net/atm/svc.c|   11 +++
 net/ax25/af_ax25.c   |9 ++---
 net/bluetooth/af_bluetooth.c |7 +--
 net/bluetooth/bnep/sock.c|4 ++--
 net/bluetooth/cmtp/sock.c|4 ++--
 net/bluetooth/hci_sock.c |4 ++--
 net/bluetooth/hidp/sock.c|4 ++--
 net/bluetooth/l2cap.c|   10 +-
 net/bluetooth/rfcomm/sock.c  |   10 +-
 net/bluetooth/sco.c  |   10 +-
 net/core/sock.c  |6 --
 net/decnet/af_decnet.c   |   13 -
 net/econet/af_econet.c   |7 +--
 net/ipv4/af_inet.c   |7 +--
 net/ipv6/af_inet6.c  |7 +--
 net/ipx/af_ipx.c |7 +--
 net/irda/af_irda.c   |   11 +++
 net/key/af_key.c |7 +--
 net/llc/af_llc.c |7 +--
 net/llc/llc_conn.c   |6 +++---
 net/netlink/af_netlink.c |   15 +--
 net/netrom/af_netrom.c   |9 ++---
 net/packet/af_packet.c   |7 +--
 net/rose/af_rose.c   |9 ++---
 net/rxrpc/af_rxrpc.c |7 +--
 net/sctp/ipv6.c  |2 +-
 net/sctp/protocol.c  |2 +-
 net/socket.c |9 +
 net/tipc/socket.c|9 ++---
 net/unix/af_unix.c   |   13 -
 net/x25/af_x25.c |   13 -
 42 files changed, 182 insertions(+), 110 deletions(-)

diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index a9b6971..f8bf5fc 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -477,12 +477,12 @@ static struct proto pppoe_sk_proto = {
  * Initialize a new struct sock.
  *
  **/
-static int pppoe_create(struct socket *sock)
+static int pppoe_create(struct net *net, struct socket *sock)
 {
int error = -ENOMEM;
struct sock *sk;
 
-	sk = sk_alloc(PF_PPPOX, GFP_KERNEL, &pppoe_sk_proto, 1);
+	sk = sk_alloc(net, PF_PPPOX, GFP_KERNEL, &pppoe_sk_proto, 1);
if (!sk)
goto out;
 
diff --git a/drivers/net/pppol2tp.c b/drivers/net/pppol2tp.c
index c12e0a8..07d7f5b 100644
--- a/drivers/net/pppol2tp.c
+++ b/drivers/net/pppol2tp.c
@@ -1423,12 +1423,12 @@ static struct proto pppol2tp_sk_proto = {
 
 /* socket() handler. Initialize a new struct sock.
  */
-static int pppol2tp_create(struct socket *sock)
+static int pppol2tp_create(struct net *net, struct socket *sock)
 {
int error = -ENOMEM;
struct sock *sk;
 
-	sk = sk_alloc(PF_PPPOX, GFP_KERNEL, &pppol2tp_sk_proto, 1);
+	sk = sk_alloc(net, PF_PPPOX, GFP_KERNEL, &pppol2tp_sk_proto, 1);
if (!sk)
goto out;
 
diff --git a/drivers/net/pppox.c b/drivers/net/pppox.c
index 25c52b5..c6898c1 100644
--- a/drivers/net/pppox.c
+++ b/drivers/net/pppox.c
@@ -104,10 +104,13 @@ int pppox_ioctl(struct socket *sock, unsigned int cmd, 
unsigned long arg)
 
 EXPORT_SYMBOL(pppox_ioctl);
 
-static int pppox_create(struct socket *sock, int protocol)
+static int pppox_create(struct net *net, struct socket *sock, int protocol)
 {
int rc = -EPROTOTYPE;
 
+	if (net != &init_net)
+		return -EAFNOSUPPORT;
+
 	if (protocol < 0 || protocol > PX_MAX_PROTO)
goto out;
 
@@ -123,7 +126,7 @@ static int pppox_create(struct socket *sock, int protocol)
 	    !try_module_get(pppox_protos[protocol]->owner))
 		goto out;
 
-	rc = pppox_protos[protocol]->create(sock);
+	rc = pppox_protos[protocol]->create(net, sock);
 

[PATCH 09/16] net: Initialize the network namespace of network devices.

2007-09-08 Thread Eric W. Biederman

Except for carefully selected pseudo devices all network
interfaces should start out in the initial network namespace.
Ultimately it will be register_netdev that examines what
dev->nd_net is set to and places a device in a network namespace.

This patch modifies alloc_netdev to initialize the network
namespace a device is in with the initial network namespace.
This gets it right for the vast majority of devices so their
drivers need not be modified and for those few pseudo devices
that need something different they can change this parameter
before calling register_netdevice.

The network namespace parameter on a network device is not
reference counted as the devices are inside of a network namespace
and cannot remain in that namespace past the lifetime of the
network namespace.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 net/core/dev.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 2ade518..316043b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3707,6 +3707,7 @@ struct net_device *alloc_netdev_mq(int sizeof_priv, const 
char *name,
 	dev = (struct net_device *)
 		(((long)p + NETDEV_ALIGN_CONST) & ~NETDEV_ALIGN_CONST);
 	dev->padded = (char *)dev - (char *)p;
+	dev->nd_net = &init_net;
 
 	if (sizeof_priv) {
 		dev->priv = ((char *)dev +
-- 
1.5.3.rc6.17.g1911



Re: [TG3]: Workaround MSI bug on 5714/5780.

2007-09-08 Thread David Miller
From: Michael Chan [EMAIL PROTECTED]
Date: Fri, 07 Sep 2007 19:26:21 -0700

> David, I see that you have already done the revert in your 2.6.23 tree.

I intend to revert that and only push the tg3 fix into
2.6.23

Earlier I had the idea to undo the quirk too, but that's
something I decided was not a good idea.


[PATCH 10/16] net: Make packet reception network namespace safe

2007-09-08 Thread Eric W. Biederman

This patch modifies every packet receive function
registered with dev_add_pack() to drop packets if they
are not from the initial network namespace.

This should ensure that the various network stacks do
not receive packets in anything but the initial network
namespace until the code has been converted and is ready
for them.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 drivers/block/aoe/aoenet.c  |4 
 drivers/net/bonding/bond_3ad.c  |4 
 drivers/net/bonding/bond_alb.c  |3 +++
 drivers/net/bonding/bond_main.c |3 +++
 drivers/net/hamradio/bpqether.c |3 +++
 drivers/net/pppoe.c |6 ++
 drivers/net/wan/hdlc.c  |7 +++
 drivers/net/wan/lapbether.c |3 +++
 drivers/net/wan/syncppp.c   |6 ++
 net/8021q/vlan_dev.c|5 +
 net/appletalk/aarp.c|3 +++
 net/appletalk/ddp.c |6 ++
 net/ax25/ax25_in.c  |5 +
 net/bridge/br_stp_bpdu.c|4 
 net/decnet/dn_route.c   |3 +++
 net/econet/af_econet.c  |3 +++
 net/ipv4/arp.c  |3 +++
 net/ipv4/ip_input.c |3 +++
 net/ipv4/ipconfig.c |6 ++
 net/ipv6/ip6_input.c|5 +
 net/ipx/af_ipx.c|3 +++
 net/irda/irlap_frame.c  |3 +++
 net/llc/llc_input.c |4 
 net/packet/af_packet.c  |9 +
 net/tipc/eth_media.c|6 ++
 net/x25/x25_dev.c   |3 +++
 26 files changed, 113 insertions(+), 0 deletions(-)

diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c
index f9ddfda..4dc0fb7 100644
--- a/drivers/block/aoe/aoenet.c
+++ b/drivers/block/aoe/aoenet.c
@@ -8,6 +8,7 @@
 #include <linux/blkdev.h>
 #include <linux/netdevice.h>
 #include <linux/moduleparam.h>
+#include <net/net_namespace.h>
 #include <asm/unaligned.h>
 #include "aoe.h"
 
@@ -114,6 +115,9 @@ aoenet_rcv(struct sk_buff *skb, struct net_device *ifp, 
struct packet_type *pt,
struct aoe_hdr *h;
u32 n;
 
+	if (ifp->nd_net != &init_net)
+		goto exit;
+
skb = skb_share_check(skb, GFP_ATOMIC);
if (skb == NULL)
return 0;
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index f829e4a..94bd739 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -29,6 +29,7 @@
 #include <linux/ethtool.h>
 #include <linux/if_bonding.h>
 #include <linux/pkt_sched.h>
+#include <net/net_namespace.h>
 #include "bonding.h"
 #include "bond_3ad.h"
 
@@ -2448,6 +2449,9 @@ int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct 
net_device *dev, struct pac
struct slave *slave = NULL;
int ret = NET_RX_DROP;
 
+	if (dev->nd_net != &init_net)
+		goto out;
+
 	if (!(dev->flags & IFF_MASTER))
goto out;
 
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 92c3b6f..419a9f8 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -345,6 +345,9 @@ static int rlb_arp_recv(struct sk_buff *skb, struct 
net_device *bond_dev, struct
 	struct arp_pkt *arp = (struct arp_pkt *)skb->data;
 	int res = NET_RX_DROP;
 
+	if (bond_dev->nd_net != &init_net)
+		goto out;
+
 	if (!(bond_dev->flags & IFF_MASTER))
goto out;
 
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 5de648f..e4e5fdc 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2458,6 +2458,9 @@ static int bond_arp_rcv(struct sk_buff *skb, struct 
net_device *dev, struct pack
unsigned char *arp_ptr;
u32 sip, tip;
 
+	if (dev->nd_net != &init_net)
+		goto out;
+
 	if (!(dev->priv_flags & IFF_BONDING) || !(dev->flags & IFF_MASTER))
goto out;
 
diff --git a/drivers/net/hamradio/bpqether.c b/drivers/net/hamradio/bpqether.c
index 1699d42..85fb8e7 100644
--- a/drivers/net/hamradio/bpqether.c
+++ b/drivers/net/hamradio/bpqether.c
@@ -173,6 +173,9 @@ static int bpq_rcv(struct sk_buff *skb, struct net_device 
*dev, struct packet_ty
struct ethhdr *eth;
struct bpqdev *bpq;
 
+	if (dev->nd_net != &init_net)
+		goto drop;
+
if ((skb = skb_share_check(skb, GFP_ATOMIC)) == NULL)
return NET_RX_DROP;
 
diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index f8bf5fc..a29ea22 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -386,6 +386,9 @@ static int pppoe_rcv(struct sk_buff *skb,
struct pppoe_hdr *ph;
struct pppox_sock *po;
 
+	if (dev->nd_net != &init_net)
+		goto drop;
+
if (!pskb_may_pull(skb, sizeof(struct pppoe_hdr)))
goto drop;
 
@@ -418,6 +421,9 @@ static int pppoe_disc_rcv(struct sk_buff *skb,
struct pppoe_hdr *ph;
struct pppox_sock 

[PATCH 11/16] net: Make device event notification network namespace safe

2007-09-08 Thread Eric W. Biederman

Every user of the network device notifiers is either a protocol
stack or a pseudo device.  If a protocol stack that does not have
support for multiple network namespaces receives an event for a
device that is not in the initial network namespace it quite possibly
can get confused and do the wrong thing.

To avoid problems until all of the protocol stacks are converted
this patch modifies all netdev event handlers to ignore events on
devices that are not in the initial network namespace.

As the rest of the code is made network namespace aware these
checks can be removed.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 arch/ia64/hp/sim/simeth.c   |3 +++
 drivers/net/bonding/bond_main.c |3 +++
 drivers/net/hamradio/bpqether.c |3 +++
 drivers/net/pppoe.c |3 +++
 drivers/net/wan/dlci.c  |3 +++
 drivers/net/wan/hdlc.c  |3 +++
 drivers/net/wan/lapbether.c |3 +++
 net/8021q/vlan.c|4 
 net/appletalk/aarp.c|3 +++
 net/appletalk/ddp.c |3 +++
 net/atm/clip.c  |3 +++
 net/atm/mpc.c   |4 
 net/ax25/af_ax25.c  |3 +++
 net/bridge/br_notify.c  |4 
 net/core/dst.c  |4 
 net/core/fib_rules.c|4 
 net/core/pktgen.c   |3 +++
 net/core/rtnetlink.c|4 
 net/decnet/af_decnet.c  |3 +++
 net/econet/af_econet.c  |3 +++
 net/ipv4/arp.c  |3 +++
 net/ipv4/devinet.c  |3 +++
 net/ipv4/fib_frontend.c |3 +++
 net/ipv4/ipmr.c |7 ++-
 net/ipv4/netfilter/ip_queue.c   |3 +++
 net/ipv4/netfilter/ipt_MASQUERADE.c |3 +++
 net/ipv6/addrconf.c |3 +++
 net/ipv6/ndisc.c|3 +++
 net/ipv6/netfilter/ip6_queue.c  |3 +++
 net/ipx/af_ipx.c|3 +++
 net/netfilter/nfnetlink_queue.c |3 +++
 net/netrom/af_netrom.c  |3 +++
 net/packet/af_packet.c  |3 +++
 net/rose/af_rose.c  |3 +++
 net/tipc/eth_media.c|3 +++
 net/x25/af_x25.c|3 +++
 net/xfrm/xfrm_policy.c  |5 +
 security/selinux/netif.c|4 
 38 files changed, 126 insertions(+), 1 deletions(-)

diff --git a/arch/ia64/hp/sim/simeth.c b/arch/ia64/hp/sim/simeth.c
index f26077a..93d6004 100644
--- a/arch/ia64/hp/sim/simeth.c
+++ b/arch/ia64/hp/sim/simeth.c
@@ -300,6 +300,9 @@ simeth_device_event(struct notifier_block *this,unsigned 
long event, void *ptr)
return NOTIFY_DONE;
}
 
+	if (dev->nd_net != &init_net)
+		return NOTIFY_DONE;
+
 	if ( event != NETDEV_UP && event != NETDEV_DOWN ) return NOTIFY_DONE;
 
/*
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e4e5fdc..cf97d8a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3299,6 +3299,9 @@ static int bond_netdev_event(struct notifier_block *this, 
unsigned long event, v
 {
struct net_device *event_dev = (struct net_device *)ptr;
 
+	if (event_dev->nd_net != &init_net)
+		return NOTIFY_DONE;
+
 	dprintk("event_dev: %s, event: %lx\n",
 		(event_dev ? event_dev->name : "None"),
event);
diff --git a/drivers/net/hamradio/bpqether.c b/drivers/net/hamradio/bpqether.c
index 85fb8e7..df09210 100644
--- a/drivers/net/hamradio/bpqether.c
+++ b/drivers/net/hamradio/bpqether.c
@@ -563,6 +563,9 @@ static int bpq_device_event(struct notifier_block 
*this,unsigned long event, voi
 {
struct net_device *dev = (struct net_device *)ptr;
 
+	if (dev->nd_net != &init_net)
+		return NOTIFY_DONE;
+
if (!dev_is_ethdev(dev))
return NOTIFY_DONE;
 
diff --git a/drivers/net/pppoe.c b/drivers/net/pppoe.c
index a29ea22..b4ba584 100644
--- a/drivers/net/pppoe.c
+++ b/drivers/net/pppoe.c
@@ -301,6 +301,9 @@ static int pppoe_device_event(struct notifier_block *this,
 {
struct net_device *dev = (struct net_device *) ptr;
 
+	if (dev->nd_net != &init_net)
+		return NOTIFY_DONE;
+
/* Only look at sockets that are using this specific device. */
switch (event) {
case NETDEV_CHANGEMTU:
diff --git a/drivers/net/wan/dlci.c b/drivers/net/wan/dlci.c
index 66be20c..61041d5 100644
--- a/drivers/net/wan/dlci.c
+++ b/drivers/net/wan/dlci.c
@@ -513,6 +513,9 @@ static int dlci_dev_event(struct notifier_block *unused,
 {
struct net_device *dev = (struct net_device *) ptr;
 
+	if (dev->nd_net != &init_net)
+		return NOTIFY_DONE;
+
if (event == NETDEV_UNREGISTER) {
struct dlci_local *dlp;
 
diff --git 

[PATCH 12/16] net: Support multiple network namespaces with netlink

2007-09-08 Thread Eric W. Biederman

Each netlink socket will live in exactly one network namespace,
this includes the controlling kernel sockets.

This patch updates all of the existing netlink protocols
to only support the initial network namespace.  Requests
by clients in other namespaces will get -ECONNREFUSED,
as they would if the kernel did not have the support for
that netlink protocol compiled in.

As each netlink protocol is updated to be multiple network
namespace safe it can register multiple kernel sockets
to acquire a presence in the rest of the network namespaces.

The implementation in af_netlink is a simple filter implementation
at hash table insertion and hash table look up time.
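
A simplified userspace sketch of such a filter is shown below.  It is
an illustration only and does not reproduce af_netlink: the `nl_sock`
structure, the fixed-size `nl_table` and the `nl_lookup()` and
`nl_insert()` helpers are hypothetical names.  The idea it shows is that
the namespace becomes part of the lookup key, so the same port id can
be bound independently in two namespaces.

```c
#include <assert.h>
#include <stddef.h>

struct net { int id; };

/* Hypothetical, heavily simplified netlink socket: one hash chain entry. */
struct nl_sock {
	struct net *net;       /* namespace the socket lives in */
	unsigned int pid;      /* netlink port id */
	struct nl_sock *next;  /* next entry in the same bucket */
};

#define NL_BUCKETS 8
static struct nl_sock *nl_table[NL_BUCKETS];

/* Look up by (namespace, pid): entries from other namespaces are skipped. */
static struct nl_sock *nl_lookup(struct net *net, unsigned int pid)
{
	struct nl_sock *sk;

	for (sk = nl_table[pid % NL_BUCKETS]; sk; sk = sk->next)
		if (sk->net == net && sk->pid == pid)
			return sk;
	return NULL;
}

/* Insert fails only if the pid is already bound in the same namespace. */
static int nl_insert(struct nl_sock *sk)
{
	assert(sk != NULL);

	if (nl_lookup(sk->net, sk->pid))
		return -1;
	sk->next = nl_table[sk->pid % NL_BUCKETS];
	nl_table[sk->pid % NL_BUCKETS] = sk;
	return 0;
}
```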

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
 drivers/connector/connector.c   |2 +-
 drivers/scsi/scsi_netlink.c |2 +-
 drivers/scsi/scsi_transport_iscsi.c |2 +-
 fs/ecryptfs/netlink.c   |2 +-
 include/linux/netlink.h |6 ++-
 kernel/audit.c  |4 +-
 lib/kobject_uevent.c|5 +-
 net/bridge/netfilter/ebt_ulog.c |5 +-
 net/core/rtnetlink.c|4 +-
 net/decnet/netfilter/dn_rtmsg.c |3 +-
 net/ipv4/fib_frontend.c |4 +-
 net/ipv4/inet_diag.c|4 +-
 net/ipv4/netfilter/ip_queue.c   |6 +-
 net/ipv4/netfilter/ipt_ULOG.c   |3 +-
 net/ipv6/netfilter/ip6_queue.c  |6 +-
 net/netfilter/nfnetlink.c   |2 +-
 net/netfilter/nfnetlink_log.c   |3 +-
 net/netfilter/nfnetlink_queue.c |3 +-
 net/netlink/af_netlink.c|  106 ++-
 net/netlink/genetlink.c |4 +-
 net/xfrm/xfrm_user.c|2 +-
 security/selinux/netlink.c  |5 +-
 22 files changed, 122 insertions(+), 61 deletions(-)

diff --git a/drivers/connector/connector.c b/drivers/connector/connector.c
index a7b9e9b..5690709 100644
--- a/drivers/connector/connector.c
+++ b/drivers/connector/connector.c
@@ -446,7 +446,7 @@ static int __devinit cn_init(void)
 	dev->id.idx = cn_idx;
 	dev->id.val = cn_val;
 
-	dev->nls = netlink_kernel_create(NETLINK_CONNECTOR,
+	dev->nls = netlink_kernel_create(&init_net, NETLINK_CONNECTOR,
 					 CN_NETLINK_USERS + 0xf,
 					 dev->input, NULL, THIS_MODULE);
 	if (!dev->nls)
diff --git a/drivers/scsi/scsi_netlink.c b/drivers/scsi/scsi_netlink.c
index 4bf9aa5..163acf6 100644
--- a/drivers/scsi/scsi_netlink.c
+++ b/drivers/scsi/scsi_netlink.c
@@ -167,7 +167,7 @@ scsi_netlink_init(void)
return;
}
 
-   scsi_nl_sock = netlink_kernel_create(NETLINK_SCSITRANSPORT,
+	scsi_nl_sock = netlink_kernel_create(&init_net, NETLINK_SCSITRANSPORT,
SCSI_NL_GRP_CNT, scsi_nl_rcv, NULL,
THIS_MODULE);
if (!scsi_nl_sock) {
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 34c1860..4916f01 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -1523,7 +1523,7 @@ static __init int iscsi_transport_init(void)
if (err)
goto unregister_conn_class;
 
-   nls = netlink_kernel_create(NETLINK_ISCSI, 1, iscsi_if_rx, NULL,
+	nls = netlink_kernel_create(&init_net, NETLINK_ISCSI, 1, iscsi_if_rx, NULL,
THIS_MODULE);
if (!nls) {
err = -ENOBUFS;
diff --git a/fs/ecryptfs/netlink.c b/fs/ecryptfs/netlink.c
index fe91863..056519c 100644
--- a/fs/ecryptfs/netlink.c
+++ b/fs/ecryptfs/netlink.c
@@ -227,7 +227,7 @@ int ecryptfs_init_netlink(void)
 {
int rc;
 
-   ecryptfs_nl_sock = netlink_kernel_create(NETLINK_ECRYPTFS, 0,
+	ecryptfs_nl_sock = netlink_kernel_create(&init_net, NETLINK_ECRYPTFS, 0,
 ecryptfs_receive_nl_message,
 NULL, THIS_MODULE);
if (!ecryptfs_nl_sock) {
diff --git a/include/linux/netlink.h b/include/linux/netlink.h
index 83d8239..d2843ae 100644
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -27,6 +27,8 @@
 
 #define MAX_LINKS 32   
 
+struct net;
+
 struct sockaddr_nl
 {
sa_family_t nl_family;  /* AF_NETLINK   */
@@ -157,7 +159,8 @@ struct netlink_skb_parms
 #define NETLINK_CREDS(skb) (NETLINK_CB((skb)).creds)
 
 
-extern struct sock *netlink_kernel_create(int unit, unsigned int groups,
+extern struct sock *netlink_kernel_create(struct net *net,
+ int unit,unsigned int groups,
 					  void (*input)(struct sock *sk, int len),
  struct mutex *cb_mutex,
  struct module *module);
@@ -206,6 +209,7 @@ struct netlink_callback
 
 struct netlink_notify
 {
+ 

[PATCH 14/16] net: Factor out __dev_alloc_name from dev_alloc_name

2007-09-08 Thread Eric W. Biederman

When forcibly changing the network namespace of a device
I need something that can generate a name for the device
in the new namespace without overwriting the old name.

__dev_alloc_name provides me that functionality.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
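
A user-space sketch of the allocation scheme described above, for
illustration only.  It scans the names already in use, marks the unit
numbers they occupy, and formats the lowest free one into a buffer
supplied by the caller, which is what lets a caller obtain a name
without overwriting the device's current one.  alloc_name_sketch,
MAX_UNITS and the plain string array are hypothetical stand-ins; the
kernel code walks the namespace's device list and uses a page-sized
bitmap.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define IFNAMSIZ  16
#define MAX_UNITS 64    /* hypothetical small limit; the kernel uses 8*PAGE_SIZE */

/*
 * Find the lowest unit number whose formatted name is not in `existing`.
 * The result is written to buf (IFNAMSIZ bytes); the unit number is
 * returned, or -1 if fmt lacks a single "%d" or no unit is free.
 */
static int alloc_name_sketch(const char *const *existing, size_t n,
                             const char *fmt, char *buf)
{
    unsigned char inuse[MAX_UNITS] = {0};
    const char *p;
    size_t k;
    int i;

    assert(fmt != NULL && buf != NULL);

    p = strchr(fmt, '%');
    if (!p || p[1] != 'd' || strchr(p + 1, '%'))
        return -1;

    for (k = 0; k < n; k++) {
        if (sscanf(existing[k], fmt, &i) != 1)
            continue;
        if (i < 0 || i >= MAX_UNITS)
            continue;
        /* avoid cases where sscanf is not the exact inverse of printf */
        snprintf(buf, IFNAMSIZ, fmt, i);
        if (strncmp(buf, existing[k], IFNAMSIZ) == 0)
            inuse[i] = 1;
    }

    for (i = 0; i < MAX_UNITS; i++) {
        if (!inuse[i]) {
            snprintf(buf, IFNAMSIZ, fmt, i);
            return i;
        }
    }
    return -1;
}
```

With the names eth0, eth1 and lo in use and the format eth%d, the
sketch returns 2 and leaves eth2 in the buffer; copying that into the
device is left to the caller, as dev_alloc_name does in the patch.
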
 net/core/dev.c |   48 +++-
 1 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index c51cf40..53cdb64 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -739,9 +739,10 @@ int dev_valid_name(const char *name)
 }
 
 /**
- * dev_alloc_name - allocate a name for a device
- * @dev: device
+ * __dev_alloc_name - allocate a name for a device
+ * @net: network namespace to allocate the device name in
  * @name: name format string
+ * @buf:  scratch buffer and result name string
  *
  * Passed a format string - eg lt%d it will try and find a suitable
  * id. It scans list of devices to build up a free map, then chooses
@@ -752,18 +753,13 @@ int dev_valid_name(const char *name)
  * Returns the number of the unit assigned or a negative errno code.
  */
 
-int dev_alloc_name(struct net_device *dev, const char *name)
+static int __dev_alloc_name(struct net *net, const char *name, char *buf)
 {
int i = 0;
-   char buf[IFNAMSIZ];
const char *p;
const int max_netdevices = 8*PAGE_SIZE;
long *inuse;
struct net_device *d;
-   struct net *net;
-
-   BUG_ON(!dev->nd_net);
-   net = dev->nd_net;
 
p = strnchr(name, IFNAMSIZ-1, '%');
if (p) {
@@ -787,7 +783,7 @@ int dev_alloc_name(struct net_device *dev, const char *name)
continue;
 
/*  avoid cases where sscanf is not exact inverse of printf */
-   snprintf(buf, sizeof(buf), name, i);
+   snprintf(buf, IFNAMSIZ, name, i);
if (!strncmp(buf, d->name, IFNAMSIZ))
set_bit(i, inuse);
}
@@ -796,11 +792,9 @@ int dev_alloc_name(struct net_device *dev, const char *name)
free_page((unsigned long) inuse);
}
 
-   snprintf(buf, sizeof(buf), name, i);
-   if (!__dev_get_by_name(net, buf)) {
-   strlcpy(dev->name, buf, IFNAMSIZ);
+   snprintf(buf, IFNAMSIZ, name, i);
+   if (!__dev_get_by_name(net, buf))
return i;
-   }
 
/* It is possible to run out of possible slots
 * when the name is long and there isn't enough space left
@@ -809,6 +803,34 @@ int dev_alloc_name(struct net_device *dev, const char *name)
return -ENFILE;
 }
 
+/**
+ * dev_alloc_name - allocate a name for a device
+ * @dev: device
+ * @name: name format string
+ *
+ * Passed a format string - eg lt%d it will try and find a suitable
+ * id. It scans list of devices to build up a free map, then chooses
+ * the first empty slot. The caller must hold the dev_base or rtnl lock
+ * while allocating the name and adding the device in order to avoid
+ * duplicates.
+ * Limited to bits_per_byte * page size devices (ie 32K on most platforms).
+ * Returns the number of the unit assigned or a negative errno code.
+ */
+
+int dev_alloc_name(struct net_device *dev, const char *name)
+{
+   char buf[IFNAMSIZ];
+   struct net *net;
+   int ret;
+
+   BUG_ON(!dev->nd_net);
+   net = dev->nd_net;
+   ret = __dev_alloc_name(net, name, buf);
+   if (ret >= 0)
+   strlcpy(dev->name, buf, IFNAMSIZ);
+   return ret;
+}
+
 
 /**
  * dev_change_name - change name of a device
-- 
1.5.3.rc6.17.g1911



[PATCH 15/16] net: Implement network device movement between namespaces

2007-09-08 Thread Eric W. Biederman

This patch introduces NETIF_F_NETNS_LOCAL, a flag to indicate that
a network device is local to a single network namespace and
should never be moved.  This is useful for pseudo devices that
need an instance in each network namespace (like the loopback
device) and for any device we find that cannot handle multiple
network namespaces, so that we can trap it in the initial network
namespace.

This patch introduces dev_change_net_namespace, a function
used to move a network device from one network namespace to
another.  To the network device nothing special appears to
happen; to the components of the network stack it appears as
if the network device was unregistered in the network
namespace it is in, and a new device was registered in the
network namespace the device was moved to.

This patch sets up a namespace device destructor that,
upon the exit of a network namespace, moves all of the
movable network devices to the initial network namespace
so they are not lost.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
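
A user-space model of the behaviour described above, as a sketch only.
The body of dev_change_net_namespace is not visible in this part of the
thread, so everything here except the NETIF_F_NETNS_LOCAL value is a
hypothetical stand-in: the sketch_* types, the event counters, and the
choice of -EINVAL for a device that may not move are all assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NETIF_F_NETNS_LOCAL 8192    /* value taken from the patch */

/* Hypothetical namespace: counts devices and the events it has seen. */
struct sketch_net {
    int ndevs;
    int unregister_events;
    int register_events;
};

/* Hypothetical device: feature flags plus the namespace it lives in. */
struct sketch_dev {
    unsigned long features;
    struct sketch_net *net;
};

/*
 * Move dev to namespace `to`.  The old namespace sees an unregister
 * event, the new one sees a register event, and the device itself is
 * otherwise untouched.  Devices flagged NETIF_F_NETNS_LOCAL never move.
 */
static int change_net_namespace_sketch(struct sketch_dev *dev,
                                       struct sketch_net *to)
{
    assert(dev != NULL && dev->net != NULL && to != NULL);

    if (dev->features & NETIF_F_NETNS_LOCAL)
        return -EINVAL;         /* assumed error code */
    if (dev->net == to)
        return 0;               /* nothing to do */

    dev->net->unregister_events++;
    dev->net->ndevs--;

    dev->net = to;

    to->ndevs++;
    to->register_events++;
    return 0;
}
```

In this model a loopback-like device carrying the flag stays where it
is, while an ordinary device leaves one unregister event behind in the
old namespace and one register event in the new one.
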
 drivers/net/loopback.c|3 +-
 include/linux/netdevice.h |3 +
 net/core/dev.c|  189 ++---
 3 files changed, 184 insertions(+), 11 deletions(-)

diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 5106c23..e399f7b 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -222,7 +222,8 @@ struct net_device loopback_dev = {
  | NETIF_F_TSO
 #endif
  | NETIF_F_NO_CSUM | NETIF_F_HIGHDMA
- | NETIF_F_LLTX,
+ | NETIF_F_LLTX
+ | NETIF_F_NETNS_LOCAL,
.ethtool_ops = &loopback_ethtool_ops,
 };
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ec90d1a..d33d897 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -435,6 +435,7 @@ struct net_device
 #define NETIF_F_VLAN_CHALLENGED 1024    /* Device cannot handle VLAN packets */
 #define NETIF_F_GSO             2048    /* Enable software GSO. */
 #define NETIF_F_LLTX            4096    /* LockLess TX */
+#define NETIF_F_NETNS_LOCAL     8192    /* Does not change network namespaces */
 #define NETIF_F_MULTI_QUEUE     16384   /* Has multiple TX/RX queues */
 #define NETIF_F_LRO             32768   /* large receive offload */
 
@@ -1002,6 +1003,8 @@ extern int dev_ethtool(struct net *net, struct ifreq *);
 extern unsigned dev_get_flags(const struct net_device *);
 extern int dev_change_flags(struct net_device *, unsigned);
 extern int dev_change_name(struct net_device *, char *);
+extern int dev_change_net_namespace(struct net_device *,
+struct net *, const char *);
 extern int dev_set_mtu(struct net_device *, int);
 extern int dev_set_mac_address(struct net_device *,
struct sockaddr *);
diff --git a/net/core/dev.c b/net/core/dev.c
index 53cdb64..d82ec5a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -208,6 +208,34 @@ static inline struct hlist_head *dev_index_hash(struct net *net, int ifindex)
return &net->dev_index_head[ifindex & ((1 << NETDEV_HASHBITS) - 1)];
 }
 
+/* Device list insertion */
+static int list_netdevice(struct net_device *dev)
+{
+   struct net *net = dev->nd_net;
+
+   ASSERT_RTNL();
+
+   write_lock_bh(&dev_base_lock);
+   list_add_tail(&dev->dev_list, &net->dev_base_head);
+   hlist_add_head(&dev->name_hlist, dev_name_hash(net, dev->name));
+   hlist_add_head(&dev->index_hlist, dev_index_hash(net, dev->ifindex));
+   write_unlock_bh(&dev_base_lock);
+   return 0;
+}
+
+/* Device list removal */
+static void unlist_netdevice(struct net_device *dev)
+{
+   ASSERT_RTNL();
+
+   /* Unlink dev from the device chain */
+   write_lock_bh(&dev_base_lock);
+   list_del(&dev->dev_list);
+   hlist_del(&dev->name_hlist);
+   hlist_del(&dev->index_hlist);
+   write_unlock_bh(&dev_base_lock);
+}
+
 /*
  * Our notifier list
  */
@@ -3553,12 +3581,8 @@ int register_netdevice(struct net_device *dev)
set_bit(__LINK_STATE_PRESENT, &dev->state);
 
dev_init_scheduler(dev);
-   write_lock_bh(&dev_base_lock);
-   list_add_tail(&dev->dev_list, &net->dev_base_head);
-   hlist_add_head(&dev->name_hlist, head);
-   hlist_add_head(&dev->index_hlist, dev_index_hash(net, dev->ifindex));
dev_hold(dev);
-   write_unlock_bh(&dev_base_lock);
+   list_netdevice(dev);
 
/* Notify protocols, that a new device appeared. */
ret = raw_notifier_call_chain(&netdev_chain, NETDEV_REGISTER, dev);
@@ -3865,11 +3889,7 @@ void unregister_netdevice(struct net_device *dev)
dev_close(dev);
 
/* And unlink it from device chain. */
-   write_lock_bh(&dev_base_lock);
-   

[PATCH 16/16] net: netlink support for moving devices between network namespaces.

2007-09-08 Thread Eric W. Biederman

The simplest thing to implement is moving network devices between
namespaces.  With the same attribute, IFLA_NET_NS_PID, we can also
easily implement creating devices in the destination network
namespace.  That is a little bit trickier, though, so this patch
sticks to what is simple and easy.

A pid is used to identify a process that happens to be a member
of the network namespace we want to move the network device to.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---
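
A sketch of how user space might build such a request, assuming Linux
UAPI headers that define IFLA_NET_NS_PID.  struct setns_req and
build_move_request are hypothetical names, RTM_SETLINK is used on the
assumption that the request reaches do_setlink() through that message
type, and the message is only constructed here, not sent.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: header, link message, room for attributes. */
struct setns_req {
    struct nlmsghdr  nh;
    struct ifinfomsg ifi;
    char             attrs[64];
};

/*
 * Fill req with a link-change message for interface `ifindex` carrying
 * one IFLA_NET_NS_PID attribute.  `pid` names any process that is a
 * member of the destination network namespace.  Returns the total
 * message length in bytes.
 */
static size_t build_move_request(struct setns_req *req, int ifindex,
                                 uint32_t pid)
{
    struct rtattr *rta;

    assert(req != NULL);
    memset(req, 0, sizeof(*req));

    req->nh.nlmsg_len   = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req->nh.nlmsg_type  = RTM_SETLINK;
    req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
    req->ifi.ifi_family = AF_UNSPEC;
    req->ifi.ifi_index  = ifindex;

    /* The attribute starts at the aligned end of the fixed part. */
    rta = (struct rtattr *)((char *)req + NLMSG_ALIGN(req->nh.nlmsg_len));
    rta->rta_type = IFLA_NET_NS_PID;
    rta->rta_len  = RTA_LENGTH(sizeof(pid));
    memcpy(RTA_DATA(rta), &pid, sizeof(pid));

    req->nh.nlmsg_len = NLMSG_ALIGN(req->nh.nlmsg_len) + rta->rta_len;
    return req->nh.nlmsg_len;
}
```

The attribute payload is a 32-bit value, matching the NLA_U32 policy
entry added by the patch.  Sending the buffer would additionally need a
NETLINK_ROUTE socket and sufficient privileges.
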
 include/linux/if_link.h |1 +
 net/core/rtnetlink.c|   35 +++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 422084d..84c3492 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -78,6 +78,7 @@ enum
IFLA_LINKMODE,
IFLA_LINKINFO,
 #define IFLA_LINKINFO IFLA_LINKINFO
+   IFLA_NET_NS_PID,
__IFLA_MAX
 };
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 44f91bb..1b9c32d 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -35,6 +35,7 @@
 #include <linux/security.h>
 #include <linux/mutex.h>
 #include <linux/if_addr.h>
+#include <linux/nsproxy.h>
 
 #include <asm/uaccess.h>
 #include <asm/system.h>
@@ -727,6 +728,7 @@ const struct nla_policy ifla_policy[IFLA_MAX+1] = {
[IFLA_WEIGHT]   = { .type = NLA_U32 },
[IFLA_OPERSTATE]= { .type = NLA_U8 },
[IFLA_LINKMODE] = { .type = NLA_U8 },
+   [IFLA_NET_NS_PID]   = { .type = NLA_U32 },
 };
 
 static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
@@ -734,12 +736,45 @@ static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
[IFLA_INFO_DATA]= { .type = NLA_NESTED },
 };
 
+static struct net *get_net_ns_by_pid(pid_t pid)
+{
+   struct task_struct *tsk;
+   struct net *net;
+
+   /* Lookup the network namespace */
+   net = ERR_PTR(-ESRCH);
+   rcu_read_lock();
+   tsk = find_task_by_pid(pid);
+   if (tsk) {
+   task_lock(tsk);
+   if (tsk->nsproxy)
+   net = get_net(tsk->nsproxy->net_ns);
+   task_unlock(tsk);
+   }
+   rcu_read_unlock();
+   return net;
+}
+
 static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
  struct nlattr **tb, char *ifname, int modified)
 {
int send_addr_notify = 0;
int err;
 
+   if (tb[IFLA_NET_NS_PID]) {
+   struct net *net;
+   net = get_net_ns_by_pid(nla_get_u32(tb[IFLA_NET_NS_PID]));
+   if (IS_ERR(net)) {
+   err = PTR_ERR(net);
+   goto errout;
+   }
+   err = dev_change_net_namespace(dev, net, ifname);
+   put_net(net);
+   if (err)
+   goto errout;
+   modified = 1;
+   }
+
if (tb[IFLA_MAP]) {
struct rtnl_link_ifmap *u_map;
struct ifmap k_map;
-- 
1.5.3.rc6.17.g1911



[PATCH 17/16] net: Disable netfilter sockopts when not in the initial network namespace

2007-09-08 Thread Eric W. Biederman

Until we support multiple network namespaces with netfilter only allow
netfilter configuration in the initial network namespace.

Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
---

Oops, I overlooked this one on my first pass through when gathering up
this patchset.

 net/netfilter/nf_sockopt.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/net/netfilter/nf_sockopt.c b/net/netfilter/nf_sockopt.c
index 8b8ece7..c12ea9b 100644
--- a/net/netfilter/nf_sockopt.c
+++ b/net/netfilter/nf_sockopt.c
@@ -80,6 +80,9 @@ static int nf_sockopt(struct sock *sk, int pf, int val,
struct nf_sockopt_ops *ops;
int ret;
 
+   if (sk->sk_net != &init_net)
+   return -ENOPROTOOPT;
+
if (mutex_lock_interruptible(&nf_sockopt_mutex) != 0)
return -EINTR;
 
@@ -138,6 +141,10 @@ static int compat_nf_sockopt(struct sock *sk, int pf, int val,
struct nf_sockopt_ops *ops;
int ret;
 
+   if (sk->sk_net != &init_net)
+   return -ENOPROTOOPT;
+
+
if (mutex_lock_interruptible(&nf_sockopt_mutex) != 0)
return -EINTR;
 
-- 
1.5.3.rc6.17.g1911



Re: [PATCH 03/16] net: Basic network namespace infrastructure.

2007-09-08 Thread Paul E. McKenney
On Sat, Sep 08, 2007 at 03:15:34PM -0600, Eric W. Biederman wrote:
 
 This is the basic infrastructure needed to support network
 namespaces.  This infrastructure is:
 - Registration functions to support initializing per network
   namespace data when a network namespace is created or destroyed.
 
 - struct net.  The network namespace data structure.
   This structure will grow as variables are made per network
   namespace but this is the minimal starting point.
 
 - Functions to grab a reference to the network namespace.
   I provide both get/put functions that keep a network namespace
   from being freed.  And hold/release functions serve as weak references
   and will warn if their count is not zero when the data structure
   is freed.  Useful for dealing with more complicated data structures
   like the ipv4 route cache.
 
 - A list of all of the network namespaces so we can iterate over them.
 
 - A slab for the network namespace data structure allowing leaks
   to be spotted.

If I understand this correctly, the only way to get to a namespace is
via get_net_ns_by_pid(), which contains the rcu_read_lock() that matches
the rcu_barrier() below.

So, is the get_net() in sock_copy() in this patch adding a reference to
an element that is guaranteed to already have at least one reference?
If not, how are we preventing sock_copy() from running concurrently with
cleanup_net()?  Ah, I see -- in sock_copy() we are getting a reference
to the new struct sock that no one else can get a reference to, so OK.
Ditto for the get_net() in sk_alloc().

But I still don't understand what is protecting the get_net() in
dev_seq_open().  Is there an existing reference?  If so, how do we know
that it won't be removed just as we are trying to add our reference
(while at the same time cleanup_net() is running)?  Ditto for the other
_open() operations in the same patch.  And for netlink_seq_open().

Enlightenment?

Thanx, Paul
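
The window being asked about is an unconditional get_net() racing with
the final put_net().  One common way to close that kind of window,
shown here as a user-space sketch with C11 atomics and not as what this
patch does, is a conditional get that refuses to take a reference once
the count has reached zero.  All names ending in _sketch are
hypothetical, and the sketch assumes something else (RCU, for instance)
keeps the structure's memory valid while the attempt is made.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net with only its reference count. */
struct net_sketch {
    atomic_int count;
};

/* Unconditional get: only safe if the caller already holds a reference. */
static struct net_sketch *get_net_sketch(struct net_sketch *net)
{
    assert(net != NULL);
    atomic_fetch_add(&net->count, 1);
    return net;
}

/* Drop a reference; returns true if this was the last one. */
static bool put_net_sketch(struct net_sketch *net)
{
    assert(net != NULL);
    return atomic_fetch_sub(&net->count, 1) == 1;
}

/*
 * Conditional get: take a reference only while the count is non-zero.
 * Returns NULL when the namespace is already on its way to cleanup.
 */
static struct net_sketch *maybe_get_net_sketch(struct net_sketch *net)
{
    int old;

    assert(net != NULL);
    old = atomic_load(&net->count);
    while (old != 0) {
        /* On failure `old` is reloaded with the current count. */
        if (atomic_compare_exchange_weak(&net->count, &old, old + 1))
            return net;
    }
    return NULL;
}
```

A caller that finds the namespace through a list or a task pointer
would use the conditional form and treat NULL as "namespace is going
away"; a caller that already owns a reference can keep using the
unconditional one.
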

 Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
 ---
  include/net/net_namespace.h |   68 ++
  net/core/Makefile   |2 +-
  net/core/net_namespace.c|  292 +++
  3 files changed, 361 insertions(+), 1 deletions(-)
  create mode 100644 include/net/net_namespace.h
  create mode 100644 net/core/net_namespace.c
 
 diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
 new file mode 100644
 index 000..6344b77
 --- /dev/null
 +++ b/include/net/net_namespace.h
 @@ -0,0 +1,68 @@
 +/*
 + * Operations on the network namespace
 + */
 +#ifndef __NET_NET_NAMESPACE_H
 +#define __NET_NET_NAMESPACE_H
 +
 +#include <asm/atomic.h>
 +#include <linux/workqueue.h>
 +#include <linux/list.h>
 +
 +struct net {
 + atomic_t count;  /* To decide when the network
 +  *  namespace should be freed.
 +  */
 + atomic_t use_count;  /* To track references we
 +  * destroy on demand
 +  */
 + struct list_head list;   /* list of network namespaces */
 + struct work_struct  work;   /* work struct for freeing */
 +};
 +
 +extern struct net init_net;
 +extern struct list_head net_namespace_list;
 +
 +extern void __put_net(struct net *net);
 +
 +static inline struct net *get_net(struct net *net)
 +{
 + atomic_inc(&net->count);
 + return net;
 +}
 +
 +static inline void put_net(struct net *net)
 +{
 + if (atomic_dec_and_test(&net->count))
 + __put_net(net);
 +}
 +
 +static inline struct net *hold_net(struct net *net)
 +{
 + atomic_inc(&net->use_count);
 + return net;
 +}
 +
 +static inline void release_net(struct net *net)
 +{
 + atomic_dec(&net->use_count);
 +}
 +
 +extern void net_lock(void);
 +extern void net_unlock(void);
 +
 +#define for_each_net(VAR)\
 + list_for_each_entry(VAR, &net_namespace_list, list)
 +
 +
 +struct pernet_operations {
 + struct list_head list;
 + int (*init)(struct net *net);
 + void (*exit)(struct net *net);
 +};
 +
 +extern int register_pernet_subsys(struct pernet_operations *);
 +extern void unregister_pernet_subsys(struct pernet_operations *);
 +extern int register_pernet_device(struct pernet_operations *);
 +extern void unregister_pernet_device(struct pernet_operations *);
 +
 +#endif /* __NET_NET_NAMESPACE_H */
 diff --git a/net/core/Makefile b/net/core/Makefile
 index 4751613..ea9b3f3 100644
 --- a/net/core/Makefile
 +++ b/net/core/Makefile
 @@ -3,7 +3,7 @@
  #
 
  obj-y := sock.o request_sock.o skbuff.o iovec.o datagram.o stream.o scm.o \
 -  gen_stats.o gen_estimator.o
 +  gen_stats.o gen_estimator.o net_namespace.o
 
  obj-$(CONFIG_SYSCTL) += sysctl_net_core.o
 
 diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
 new file mode 

Re: [Bugme-new] [Bug 8996] New: atl1 driver cause kernel oops IF ram 4Gyte and a lot of network load

2007-09-08 Thread Andrew Morton
On Sat,  8 Sep 2007 13:08:37 -0700 (PDT) [EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=8996
 
Summary: atl1 driver cause kernel oops IF ram > 4Gyte and a lot
 of network load
Product: Networking
Version: 2.5
  KernelVersion: 2.6.22
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: normal
   Priority: P1
  Component: Other
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Most recent kernel where this bug did not occur:
 Distribution:
 Hardware Environment:
 Software Environment:
 Problem Description:
 
 Steps to reproduce:
 
 

(switching to email.  Please respond via emailed reply-to-all, not
via the bugzilla UI, thanks).

We'd really need to see the kernel log output from that oops, please.


Re: [Bugme-new] [Bug 8996] New: atl1 driver cause kernel oops IF ram 4Gyte and a lot of network load

2007-09-08 Thread Jay Cliburn
On Sat, 8 Sep 2007 17:36:21 -0700
Andrew Morton [EMAIL PROTECTED] wrote:

 On Sat,  8 Sep 2007 13:08:37 -0700 (PDT)
 [EMAIL PROTECTED] wrote:
 
  http://bugzilla.kernel.org/show_bug.cgi?id=8996
  
 Summary: atl1 driver cause kernel oops IF ram > 4Gyte
  and a lot of network load

The submitter emailed me privately today asking about the atl1 DMA bug.
The instant bugzilla is the apparent result of our exchange.  

I replied to the private inquiry thusly:

[quote]
A workaround is in place in the -mm kernel tree.  

http://marc.info/?l=linux-mm-commits&m=118764750020317&w=2

I expect it will be pushed into the 2.6.24 kernel series.

Fedora intends to pick up the patch for F7 in the near future.

https://bugzilla.redhat.com/show_bug.cgi?id=249511

Even without the patch, you can boot with mem=3900M on the kernel
command line and you won't hit the problem.
[/quote]

In related news, Jeff appears poised to push the patch (with
loathing) into netdev.