date:20060602

Re: link-local address via ifconfig

2006-06-02 Thread Herbert Xu

Anand Kumria <[EMAIL PROTECTED]> wrote:
> 
> There are plenty of people who still use ifconfig to list the addresses
> assigned to their network interfaces (I know, ifconfig is broken) and
> who then parse the output.

If people insist on using hammers on screws, the answer is not to improve
the hammer.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: RFC3927 ARP patch status?

2006-06-02 Thread Herbert Xu

On Sat, Jun 03, 2006 at 12:50:02PM +1000, Anand Kumria wrote:
> 
> I guess it would be something set during RTM_NEWADDR (and returned by
> RTM_GETADDR?). How does IFA_DIRECTEDARP sound? With a value type of int;
> defaulting to 1.  When set to 0, generate a broadcast ARP for the
> address.

Address flags start with IFA_F and are in ifa_flags.
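
As a rough illustration of that convention, the sketch below tests a per-address flag bit in an 8-bit ifa_flags value. The EX_ prefixed names are local to this example. EX_IFA_F_BCAST_ARP is purely hypothetical: no such flag exists in the kernel headers, and its bit value is arbitrary. The two other values mirror IFA_F_SECONDARY and IFA_F_PERMANENT from <linux/if_addr.h>.

```c
#include <stdint.h>

/* Existing address flags; values mirror <linux/if_addr.h>. */
#define EX_IFA_F_SECONDARY  0x01
#define EX_IFA_F_PERMANENT  0x80

/* HYPOTHETICAL flag, for illustration only. It is not a real kernel
 * flag and the bit value chosen here is arbitrary. */
#define EX_IFA_F_BCAST_ARP  0x02

/* Return 1 if ARP for an address carrying these flags should be
 * broadcast, 0 if it may be sent as a directed (unicast) ARP. */
static int arp_should_broadcast(uint8_t ifa_flags)
{
    return (ifa_flags & EX_IFA_F_BCAST_ARP) != 0;
}
```

With a flag like this, user space would opt in per address, and addresses without the bit would keep today's behaviour.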

Cheers,

link-local address via ifconfig

2006-06-02 Thread Anand Kumria

Hi,

There are plenty of people who still use ifconfig to list the addresses
assigned to their network interfaces (I know, ifconfig is broken) and
who then parse the output.

However, the kernel puts link-local scoped addresses first in the address
list of an interface, so an interface like:

eve:[~]% ip addr show wlan0
3: wlan0:  mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:12:f0:03:d9:e7 brd ff:ff:ff:ff:ff:ff
inet 169.254.182.108/16 brd 169.254.255.255 scope link wlan0
inet 192.168.2.2/24 brd 192.168.2.255 scope global wlan0
inet6 fe80::212:f0ff:fe03:d9e7/64 scope link
   valid_lft forever preferred_lft forever

appears as:
eve:[~]% ifconfig wlan0
wlan0 Link encap:Ethernet  HWaddr 00:12:F0:03:D9:E7
  inet addr:169.254.182.108  Bcast:169.254.255.255 Mask:255.255.0.0
  inet6 addr: fe80::212:f0ff:fe03:d9e7/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  [... elided ...]

Is there any reason to put the link-local address first in the list?

I've had a number of bugreports (or outright panic attacks) where the 
problem turned out to be that ifconfig was reporting the link-local 
address first, rather than the global/site one.
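
One possible workaround on the consumer side, sketched here under the assumption that the caller already has the interface's IPv4 addresses as host-byte-order integers, is to skip 169.254.0.0/16 entries when choosing which address to show. The function names are made up for this example and are not part of ifconfig or the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* 169.254.0.0/16 is the IPv4 link-local range (RFC 3927).
 * addr is in host byte order. */
static int ipv4_is_link_local(uint32_t addr)
{
    return (addr & 0xFFFF0000u) == 0xA9FE0000u;
}

/* Return the index of the first address that is not link-local.
 * If every address is link-local, fall back to index 0.
 * Return -1 for an empty list. */
static int pick_display_addr(const uint32_t *addrs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!ipv4_is_link_local(addrs[i]))
            return (int)i;
    return n ? 0 : -1;
}
```

For the wlan0 example above, this would pick 192.168.2.2 even though 169.254.182.108 comes first in the list.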

Thanks,
Anand

-- 
 `When any government, or any church for that matter, undertakes to say to
  its subjects, "This you may not read, this you must not see, this you are
  forbidden to know," the end result is tyranny and oppression no matter how
  holy the motives' -- Robert A Heinlein, "If this goes on --"

Re: RFC3927 ARP patch status?

2006-06-02 Thread Anand Kumria

On Fri, Jun 02, 2006 at 05:58:27PM -0700, David Daney wrote:
> Anand Kumria wrote:
> >Herbert,
> >
> >On Sat, Jun 03, 2006 at 09:12:06AM +1000, Herbert Xu wrote:
> >
> >>David Daney <[EMAIL PROTECTED]> wrote:
> >>
> >>>There were some discussions about whether it made sense for the kernel 
> >>>to support the behavior required by the RFC.  Other comments debated the 
> >>>wisdom of using a tightly targeted patch specific to the RFC, or whether 
> >>>a more general but intrusive solution would be better.
> >>
> >>I think we've made it quite clear what needs to be done for it to be
> >>accepted.  All that remains is for someone to implement it.  If anyone
> >>really cares about this, then please write the code instead of talking
> >>about it.
> >
> >
> >Okay, to confirm: you want a patch which looks at the scope value and if 
> >the scope is link-local then we broadcast rather than do a directed ARP?
> >
> 
> I don't think that was the plan.  In an earlier e-mail Herbert Xu said 
> (and I concur):
> 
> --
> I like the idea of allowing user-space to control what addresses cause
> broadcasts.  However, I'm uncomfortable with overloading existing flags
> even though they might appear to fit the bill on the face of it.
> 

[...]

Sorry, I can't find any email with any of those words in it by Herbert.
Could you tell me the message-id, so I can read some of the surrounding
context?

> 
> The idea was to add a new flag, *not* reuse the scope value.
> 

I guess it would be something set during RTM_NEWADDR (and returned by
RTM_GETADDR?). How does IFA_DIRECTEDARP sound? With a value type of int;
defaulting to 1.  When set to 0, generate a broadcast ARP for the
address.

Thanks,
Anand


Re: [PATCH 2.6.16.18] MSI: Proposed fix for MSI/MSI-X load failure

2006-06-02 Thread Paul Mackerras

Rajesh Shah writes:

> The current MSI code actually does this deliberately, not by
> accident. It's got a lot of complex code to track devices and
> vectors and make sure an enable_msi -> disable -> enable sequence
> gives a driver the same vector. It also has policies about
> reserving vectors based on potential hotplug activity etc.
> Frankly, I've never understood the need for such policies, and
> am in the process of removing all of them.

Good.  We will not be able to support a policy of giving the driver
the same vector across an enable_msi/disable/enable sequence on IBM
System p machines (64-bit PowerPC), because the firmware controls the
MSI allocation, and it doesn't give us the necessary guarantees.

Paul.

[RFC] TCP limited slow start

2006-06-02 Thread Stephen Hemminger

Rolled my sleeves up and gave this a try...

This is an implementation of Sally Floyd's Limited Slow Start
for Large Congestion Windows.

Summary from RFC:
   Limited Slow-Start introduces a parameter, "max_ssthresh", and
   modifies the slow-start mechanism for values of the congestion window
   where "cwnd" is greater than "max_ssthresh".  That is, during Slow-
   Start, when

  cwnd <= max_ssthresh,

   cwnd is increased by one MSS (MAXIMUM SEGMENT SIZE) for every
   arriving ACK (acknowledgement) during slow-start, as is always the
   case.  During Limited Slow-Start, when

  max_ssthresh < cwnd <= ssthresh,

   the invariant is maintained so that the congestion window is
   increased during slow-start by at most max_ssthresh/2 MSS per round-
   trip time.  This is done as follows:

      For each arriving ACK in slow-start:
        If (cwnd <= max_ssthresh)
           cwnd += MSS;
        else
           K = int(cwnd/(0.5 max_ssthresh));
           cwnd += int(MSS/K);

   Thus, during Limited Slow-Start the window is increased by 1/K MSS
   for each arriving ACK, for K = int(cwnd/(0.5 max_ssthresh)), instead
   of by 1 MSS as in standard slow-start [RFC2581].
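
The per-ACK rule quoted above can be sketched in C with the window counted in segments rather than bytes. This is not the code from the patch. The struct, the function name, and the choice to treat max_ssthresh == 0 as "feature disabled" are assumptions made for this example. Instead of adding a fraction of an MSS per ACK, the sketch counts ACKs and grows the window by one segment every K ACKs, which approximates the same growth rate. The caller is assumed to invoke it only while cwnd <= ssthresh.

```c
#include <stdint.h>

/* Illustrative state; the names are not taken from the kernel. */
struct lss_state {
    uint32_t cwnd;      /* congestion window, in segments */
    uint32_t cwnd_cnt;  /* ACKs counted toward the next increment */
};

/* Apply one arriving ACK during slow start (RFC 3742 rule). */
static void lss_on_ack(struct lss_state *s, uint32_t max_ssthresh)
{
    /* Assumption: max_ssthresh == 0 means "limited slow start off". */
    if (max_ssthresh == 0 || s->cwnd <= max_ssthresh) {
        s->cwnd++;          /* standard slow start: +1 segment per ACK */
        s->cwnd_cnt = 0;
        return;
    }

    /* K = int(cwnd / (0.5 * max_ssthresh)) in integer arithmetic.
     * Since cwnd > max_ssthresh here, K is always at least 2. */
    uint32_t k = (2 * s->cwnd) / max_ssthresh;

    /* Grow by one segment every K ACKs, i.e. about 1/K per ACK. */
    if (++s->cwnd_cnt >= k) {
        s->cwnd_cnt = 0;
        s->cwnd++;
    }
}
```

Counting ACKs avoids fractional arithmetic, at the cost of growth arriving in whole-segment steps.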

---

 Documentation/networking/ip-sysctl.txt |8 +-
 include/linux/sysctl.h |1 +
 include/net/tcp.h  |1 +
 net/ipv4/sysctl_net_ipv4.c |8 ++
 net/ipv4/tcp_cong.c|   46 
 net/ipv4/tcp_input.c   |1 +
 6 files changed, 47 insertions(+), 18 deletions(-)

0884f45c9f21c50dd9117b2fc02bf5436be3c3bf
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index f12007b..9869298 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -103,9 +103,15 @@ TCP variables: 
 
 tcp_abc - INTEGER
Controls Appropriate Byte Count defined in RFC3465. If set to
-   0 then does congestion avoid once per ack. 1 is conservative
+   0 then does congestion avoid once per ack. 1 (default) is conservative
value, and 2 is more agressive.
 
+tcp_limited_ssthresh - INTEGER
+   Controls the increase of the congestion window during slow start as
+   defined in RFC3742. The purpose is to slow the growth of the congestion
+   window on high delay networks where agressive growth can cause losses
+   of 1000's of packets. Default is 100 packets.
+
 tcp_syn_retries - INTEGER
Number of times initial SYNs for an active TCP connection attempt
will be retransmitted. Should not be higher than 255. Default value
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 76eaeff..a455165 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -403,6 +403,7 @@ enum
NET_TCP_MTU_PROBING=113,
NET_TCP_BASE_MSS=114,
NET_IPV4_TCP_WORKAROUND_SIGNED_WINDOWS=115,
+   NET_TCP_LIMITED_SSTHRESH=116,
 };
 
 enum {
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 575636f..3a14861 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -225,6 +225,7 @@ extern int sysctl_tcp_abc;
 extern int sysctl_tcp_mtu_probing;
 extern int sysctl_tcp_base_mss;
 extern int sysctl_tcp_workaround_signed_windows;
+extern int sysctl_tcp_limited_ssthresh;
 
 extern atomic_t tcp_memory_allocated;
 extern atomic_t tcp_sockets_allocated;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 6b6c3ad..d1358d3 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -688,6 +688,14 @@ #endif
.mode   = 0644,
.proc_handler   = &proc_dointvec
},
+   {
+   .ctl_name   = NET_TCP_LIMITED_SSTHRESH,
+   .procname   = "tcp_max_ssthresh",
+   .data   = &sysctl_tcp_limited_ssthresh,
+   .maxlen = sizeof(int),
+   .mode   = 0644,
+   .proc_handler   = &proc_dointvec,
+   },
{ .ctl_name = 0 }
 };
 
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index 857eefc..a27c792 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -180,25 +180,37 @@ int tcp_set_congestion_control(struct so
  */
 void tcp_slow_start(struct tcp_sock *tp)
 {
-   if (sysctl_tcp_abc) {
-   /* RFC3465: Slow Start
-* TCP sender SHOULD increase cwnd by the number of
-* previously unacknowledged bytes ACKed by each incoming
-* acknowledgment, provided the increase is not more than L
-*/
-   if (tp->bytes_acked < tp->mss_cache)
-   return;
-
-   /* We MAY increase by 2 if discovered delayed ack */
-   if (sysctl_tcp_abc > 1 && tp->bytes_acked > 2*tp->mss_cache) {
-   if (tp->snd_cwnd < tp->snd_cwnd_clamp)
-   tp->snd_cwnd++;
-   }
+

Re: RFC3927 ARP patch status?

2006-06-02 Thread David Miller

From: David Daney <[EMAIL PROTECTED]>
Date: Fri, 02 Jun 2006 17:55:17 -0700

> RFC3927 may be a mine field, but the only thing that has to be changed 
> in the kernel to support it is to somehow configure the arp driver to 
> broadcast unconditionally on certain interfaces.

Ok, I'd have to see the final patch after Herbert's suggestions
are taken into account.

Re: RFC3927 ARP patch status?

2006-06-02 Thread David Daney


Anand Kumria wrote:
> Herbert,
>
> On Sat, Jun 03, 2006 at 09:12:06AM +1000, Herbert Xu wrote:
>
>> David Daney <[EMAIL PROTECTED]> wrote:
>>
>>> There were some discussions about whether it made sense for the kernel
>>> to support the behavior required by the RFC.  Other comments debated the
>>> wisdom of using a tightly targeted patch specific to the RFC, or whether
>>> a more general but intrusive solution would be better.
>>
>> I think we've made it quite clear what needs to be done for it to be
>> accepted.  All that remains is for someone to implement it.  If anyone
>> really cares about this, then please write the code instead of talking
>> about it.
>
> Okay, to confirm: you want a patch which looks at the scope value and if
> the scope is link-local then we broadcast rather than do a directed ARP?

I don't think that was the plan.  In an earlier e-mail Herbert Xu said 
(and I concur):


--
I like the idea of allowing user-space to control what addresses cause
broadcasts.  However, I'm uncomfortable with overloading existing flags
even though they might appear to fit the bill on the face of it.

People may be using this for completely different reasons (address
selection) and it's not polite to suddenly turn all their ARPs into
broadcasts.

So how about a new address flag? We still have some vacancies there.
--

The idea was to add a new flag, *not* reuse the scope value.


David Daney

Re: RFC3927 ARP patch status?

2006-06-02 Thread David Daney


David Miller wrote:
> RFC3927 seems to be an intellectual property mine field, I really don't
> see how we can include this in the Linux kernel.
>
> Go to "http://www.ietf.org/ipr", click on "Search the IPR
> disclosures", then enter "3927" in the "Enter RFC number" field and
> click SEARCH.


RFC3927 may be a mine field, but the only thing that has to be changed 
in the kernel to support it is to somehow configure the arp driver to 
broadcast unconditionally on certain interfaces.  The majority of the 
rfc3927 protocol is done by userspace applications, so should *not* 
really affect the kernel.


David Daney.

Re: The AI parameter of tcp_highspeed.c (in 2.6.18)

2006-06-02 Thread David Miller

From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Fri, 2 Jun 2006 12:05:07 -0700

> Went back and looked at the RFC. The problem was just a simple
> translation of table to C array (0 based). Added this to the TCP
> testing repository.

Patch applied, thanks a lot.

Re: RFC3927 ARP patch status?

2006-06-02 Thread David Miller


RFC3927 seems to be an intellectual property mine field, I really don't
see how we can include this in the Linux kernel.

Go to "http://www.ietf.org/ipr", click on "Search the IPR
disclosures", then enter "3927" in the "Enter RFC number" field and
click SEARCH.

Re: RFC3927 ARP patch status?

2006-06-02 Thread Anand Kumria

Herbert,

On Sat, Jun 03, 2006 at 09:12:06AM +1000, Herbert Xu wrote:
> David Daney <[EMAIL PROTECTED]> wrote:
> > 
> > There were some discussions about whether it made sense for the kernel 
> > to support the behavior required by the RFC.  Other comments debated the 
> > wisdom of using a tightly targeted patch specific to the RFC, or whether 
> > a more general but intrusive solution would be better.
> 
> I think we've made it quite clear what needs to be done for it to be
> accepted.  All that remains is for someone to implement it.  If anyone
> really cares about this, then please write the code instead of talking
> about it.

Okay, to confirm: you want a patch which looks at the scope value and if 
the scope is link-local then we broadcast rather than do a directed ARP?

I have one that is modelled on how IPv6 encodes which address mask is
link-local, and I'd like to follow the same style. Should the kernel also
set the scope on IPv4 addresses like it does for IPv6 ones?

Thanks,
Anand


TCP Limited slow start

2006-06-02 Thread Stephen Hemminger

Has anyone done an implementation of RFC3742 for Linux? It looks interesting,
but would need some integration with current ABC code.

There was some evidence of a version in old Web100 code, but it's gone now. Was
it deemed a mistake?

Re: RFC3927 ARP patch status?

2006-06-02 Thread Herbert Xu

David Daney <[EMAIL PROTECTED]> wrote:
> 
> There were some discussions about whether it made sense for the kernel 
> to support the behavior required by the RFC.  Other comments debated the 
> wisdom of using a tightly targeted patch specific to the RFC, or whether 
> a more general but intrusive solution would be better.

I think we've made it quite clear what needs to be done for it to be
accepted.  All that remains is for someone to implement it.  If anyone
really cares about this, then please write the code instead of talking
about it.

Thanks,

Re: [Bugme-new] [Bug 6638] New: tg3 output freezes on compaq nc6000

2006-06-02 Thread Michael Chan

On Fri, 2006-06-02 at 11:22 -0700, Andrew Morton wrote:
> On Fri, 2 Jun 2006 05:40:51 -0700
> [EMAIL PROTECTED] wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=6638
> > 
> >Summary: tg3 output freezes on compaq nc6000
> > Kernel Version: 2.6.16.19
> > Status: NEW
> >   Severity: normal
> >  Owner: [EMAIL PROTECTED]
> >  Submitter: [EMAIL PROTECTED]
> > 
> > 
> > Most recent kernel where this bug did not occur: none
> > Distribution: Debian Sarge with latest kernel
> > Hardware Environment: compaq nc6000
> > Software Environment: 
> > Problem Description: 
> > The output engine of the tg3 driver freezes when generating high load.
> > 
> > `ifconfig' shows incoming packets, however, outgoing counter is not
> > incremented any more.
> > 
Please provide:

1. tg3 probing output during ifconfig up.
2. /proc/interrupts output to see if interrupt counter is increasing
after failure.
3. "ethtool -d eth0 > dump" after the failure.


Re: [FIX] e1000: fix irq sharing when running ethtool test

2006-06-02 Thread Auke Kok


Jeff Garzik wrote:
> On Fri, Jun 02, 2006 at 03:19:47PM -0700, Auke Kok wrote:
>> Because upstream and upstream-fixes have a whitespace conflict in them,
>> I've prepared two separate git branches to pull from so that a subsequent
>> pull or merge from upstream-fixes into upstream doesn't resolve into a
>> conflict:
>
> That won't work, because it creates duplicate changesets in the history.
>
> I'll pull the upstream-fixes version, and then merge into #upstream.


thanks, I wish I had thought of that first!

Auke

Re: [FIX] e1000: fix irq sharing when running ethtool test

2006-06-02 Thread Jeff Garzik

On Fri, Jun 02, 2006 at 03:19:47PM -0700, Auke Kok wrote:
> Because upstream and upstream-fixes have a whitespace conflict in them, 
> I've prepared two separate git branches to pull from so that a subsequent 
> pull or merge from upstream-fixes into upstream doesn't resolve into a 
> conflict:

That won't work, because it creates duplicate changesets in the history.

I'll pull the upstream-fixes version, and then merge into #upstream.

Jeff




[FIX] e1000: fix irq sharing when running ethtool test

2006-06-02 Thread Auke Kok




New code added in 2.6.17 caused setup_irq to print a warning when
running ethtool -t eth0 offline.

This test marks the request_irq call made by this test as a "probe" to
see if the interrupt is shared or not.

Signed-off-by: Jesse Brandeburg <[EMAIL PROTECTED]>
Signed-off-by: Auke Kok <[EMAIL PROTECTED]>


---

Jeff,

Because upstream and upstream-fixes have a whitespace conflict in them, I've 
prepared two separate git branches to pull from so that a subsequent pull or 
merge from upstream-fixes into upstream doesn't resolve into a conflict:


please pull from our git-server:

into upstream:
git-pull git://lost.foo-projects.org/~ahkok/git/netdev-2.6 upstream

into upstream-fixes:
git-pull git://lost.foo-projects.org/~ahkok/git/netdev-2.6 upstream-fixes


Cheers,

Auke


---
 e1000_ethtool.c |9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

e1000: fix ethtool test irq alloc as "probe"

New code added in 2.6.17 caused setup_irq to print a warning when
running ethtool -t eth0 offline.

This test marks the request_irq call made by this test as a "probe"
to see if the interrupt is shared or not.

Signed-off-by: Jesse Brandeburg <[EMAIL PROTECTED]>
Signed-off-by: Auke Kok <[EMAIL PROTECTED]>

diff --git a/drivers/net/e1000/e1000_ethtool.c b/drivers/net/e1000/e1000_ethtool.c
index ea3..d1c705b 100644
--- a/drivers/net/e1000/e1000_ethtool.c
+++ b/drivers/net/e1000/e1000_ethtool.c
@@ -870,13 +870,16 @@ e1000_intr_test(struct e1000_adapter *ad
 	*data = 0;
 
 	/* Hook up test interrupt handler just for this test */
-	if (!request_irq(irq, &e1000_test_intr, 0, netdev->name, netdev)) {
+	if (!request_irq(irq, &e1000_test_intr, SA_PROBEIRQ, netdev->name,
+	 netdev)) {
 		shared_int = FALSE;
 	} else if (request_irq(irq, &e1000_test_intr, SA_SHIRQ,
 			  netdev->name, netdev)){
 		*data = 1;
 		return -1;
 	}
+	DPRINTK(PROBE,INFO, "testing %s interrupt\n",
+	(shared_int ? "shared" : "unshared"));
 
 	/* Disable all the interrupts */
 	E1000_WRITE_REG(&adapter->hw, IMC, 0x);

RE: [PATCH 2.6.16.18] MSI: Proposed fix for MSI/MSI-X load failure

2006-06-02 Thread Ravinandan Arakali

Rajesh,
It's possible that the current behavior is by design but once the driver is 
loaded with MSI, you need a reboot to be able to load MSI-X. And vice versa. I 
found this rather restrictive.

I did test the fix multiple times, for example multiple load/unload iterations
of MSI followed by multiple load/unload of MSI-X followed by load/unload MSI.
That way both transitions (MSI-to-MSI-X and vice versa) are tested.

Thanks,
Ravi

-Original Message-
From: Rajesh Shah [mailto:[EMAIL PROTECTED]
Sent: Friday, June 02, 2006 2:55 PM
To: Ravinandan Arakali
Cc: linux-kernel@vger.kernel.org; netdev@vger.kernel.org; Leonid
Grossman; Ananda Raju; Sriram Rapuru
Subject: Re: [PATCH 2.6.16.18] MSI: Proposed fix for MSI/MSI-X load
failure


On Fri, Jun 02, 2006 at 03:21:37PM -0400, Ravinandan Arakali wrote:
> 
> Symptoms:
> When a driver is loaded with MSI followed by MSI-X, the load fails indicating 
> that the MSI vector is still active. And vice versa.
> 
> Suspected root cause:
> This happens in spite of the driver calling free_irq() followed by 
> pci_disable_msi/pci_disable_msix. This appears to be a kernel bug 
> wherein the pci_disable_msi and pci_disable_msix calls do not 
> clear/unpopulate the msi_desc data structure that was populated 
> by pci_enable_msi/pci_enable_msix.
> 
The current MSI code actually does this deliberately, not by
accident. It's got a lot of complex code to track devices and
vectors and make sure an enable_msi -> disable -> enable sequence
gives a driver the same vector. It also has policies about
reserving vectors based on potential hotplug activity etc.
Frankly, I've never understood the need for such policies, and
am in the process of removing all of them.

> Proposed fix:
> Free the MSI vector in pci_disable_msi and all allocated MSI-X vectors 
> in pci_disable_msix.
> 
This will break the existing MSI policies. Once you take that away,
a whole lot of additional code and complexity can be removed too.
That's what I'm working on right now, but such a change is likely
too big for -stable.

So, I'm ok with this patch if it actually doesn't break MSI/MSI-X.
Did you try to repeatedly load/unload an MSI capable driver with
this patch? Did you repeatedly try to ifdown/ifup an Ethernet
driver that uses MSI? I'm not in a position to test this today, but
will try it out next week.

thanks,
Rajesh


Re: [PATCH 2.6.16.18] MSI: Proposed fix for MSI/MSI-X load failure

2006-06-02 Thread Rajesh Shah

On Fri, Jun 02, 2006 at 03:21:37PM -0400, Ravinandan Arakali wrote:
> 
> Symptoms:
> When a driver is loaded with MSI followed by MSI-X, the load fails indicating 
> that the MSI vector is still active. And vice versa.
> 
> Suspected root cause:
> This happens in spite of the driver calling free_irq() followed by 
> pci_disable_msi/pci_disable_msix. This appears to be a kernel bug 
> wherein the pci_disable_msi and pci_disable_msix calls do not 
> clear/unpopulate the msi_desc data structure that was populated 
> by pci_enable_msi/pci_enable_msix.
> 
The current MSI code actually does this deliberately, not by
accident. It's got a lot of complex code to track devices and
vectors and make sure an enable_msi -> disable -> enable sequence
gives a driver the same vector. It also has policies about
reserving vectors based on potential hotplug activity etc.
Frankly, I've never understood the need for such policies, and
am in the process of removing all of them.

> Proposed fix:
> Free the MSI vector in pci_disable_msi and all allocated MSI-X vectors 
> in pci_disable_msix.
> 
This will break the existing MSI policies. Once you take that away,
a whole lot of additional code and complexity can be removed too.
That's what I'm working on right now, but such a change is likely
too big for -stable.

So, I'm ok with this patch if it actually doesn't break MSI/MSI-X.
Did you try to repeatedly load/unload an MSI capable driver with
this patch? Did you repeatedly try to ifdown/ifup an Ethernet
driver that uses MSI? I'm not in a position to test this today, but
will try it out next week.

thanks,
Rajesh

Re: Driver for Rsltek 8139D / Silan SC92031

2006-06-02 Thread Stephen Hemminger

On Fri, 2 Jun 2006 13:28:46 -0700
Stephen Hemminger <[EMAIL PROTECTED]> wrote:

> On Fri, 02 Jun 2006 21:16:53 +0100
> Daniel Drake <[EMAIL PROTECTED]> wrote:
> 
> > Here's a strange one. Cantao (on CC) bought what he thought was a cheap 
> > realtek PCI NIC, it actually turns out it is a Rsltek (yes, Rsltek) 
> > 8139D card.
> > 
> > It includes an old (2.4/2.5) driver which claims to be for Silan SC92031 
> > (attached).
> > 
> > The driver has some very obvious similarities with 8139too, however the 
> > register layout and usage is quite different.
> > 
> > Has anyone got any idea whats going on here? It seems like something 
> > based on a realtek chip, but not...
> > 
> > Daniel
> > 
> 
> It certainly is a good driver as is...
NOT

Re: Driver for Rsltek 8139D / Silan SC92031

2006-06-02 Thread Stephen Hemminger

On Fri, 02 Jun 2006 21:16:53 +0100
Daniel Drake <[EMAIL PROTECTED]> wrote:

> Here's a strange one. Cantao (on CC) bought what he thought was a cheap 
> realtek PCI NIC, it actually turns out it is a Rsltek (yes, Rsltek) 
> 8139D card.
> 
> It includes an old (2.4/2.5) driver which claims to be for Silan SC92031 
> (attached).
> 
> The driver has some very obvious similarities with 8139too, however the 
> register layout and usage is quite different.
> 
> Has anyone got any idea whats going on here? It seems like something 
> based on a realtek chip, but not...
> 
> Daniel
> 

It certainly is a good driver as is...

[PATCH 2.6.16.18] MSI: Proposed fix for MSI/MSI-X load failure

2006-06-02 Thread Ravinandan Arakali

Hi,
This patch suggests a fix for the MSI/MSI-X load failure.

Please review the patch.

Symptoms:
When a driver is loaded with MSI followed by MSI-X, the load fails indicating 
that the MSI vector is still active. And vice versa.

Suspected root cause:
This happens in spite of the driver calling free_irq() followed by 
pci_disable_msi/pci_disable_msix. This appears to be a kernel bug 
wherein the pci_disable_msi and pci_disable_msix calls do not 
clear/unpopulate the msi_desc data structure that was populated 
by pci_enable_msi/pci_enable_msix.

Proposed fix:
Free the MSI vector in pci_disable_msi and all allocated MSI-X vectors 
in pci_disable_msix.

Testing:
The fix has been tested on IA64 platforms with Neterion's Xframe driver.

Signed-off-by: Ravinandan Arakali <[EMAIL PROTECTED]>
---

diff -urpN old/drivers/pci/msi.c new/drivers/pci/msi.c
--- old/drivers/pci/msi.c   2006-05-31 19:02:19.0 -0700
+++ new/drivers/pci/msi.c   2006-05-31 19:02:39.0 -0700
@@ -779,6 +779,7 @@ void pci_disable_msi(struct pci_dev* dev
nr_released_vectors++;
default_vector = entry->msi_attrib.default_vector;
spin_unlock_irqrestore(&msi_lock, flags);
+   msi_free_vector(dev, dev->irq, 1);
/* Restore dev->irq to its default pin-assertion vector */
dev->irq = default_vector;
disable_msi_mode(dev, pci_find_capability(dev, PCI_CAP_ID_MSI),
@@ -1046,6 +1047,7 @@ void pci_disable_msix(struct pci_dev* de
 
}
}
+   msi_remove_pci_irq_vectors(dev);
 }
 
 /**


Re: The AI parameter of tcp_highspeed.c (in 2.6.18)

2006-06-02 Thread Stephen Hemminger

Went back and looked at the RFC. The problem was just a simple
translation of table to C array (0 based). Added this to the TCP testing
repository.

Subject: [PATCH] Problem observed by Xiaoliang (David) Wei:

  When snd_cwnd is smaller than 38 and the connection is in
  congestion avoidance phase (snd_cwnd > snd_ssthresh), the snd_cwnd
  seems to stop growing.

The additive increase was confused because C arrays are 0-based.
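
To make the off-by-one concrete, here is a small user-space sketch of the additive-increase step. It assumes, following the description above, that ai is a 0-based index into the RFC 3649 table, so the increase a(w) equals ai + 1. The struct and function names are invented for the example, and the fixed argument only exists to compare the behaviour before and after the change.

```c
#include <stdint.h>

/* Illustrative stand-ins for snd_cwnd and snd_cwnd_cnt. */
struct hs_state {
    uint32_t cwnd;      /* congestion window, in segments */
    uint32_t cwnd_cnt;  /* accumulated increase credit */
};

/* One ACK in congestion avoidance.
 * ai is assumed to be a 0-based table index, so a(w) == ai + 1.
 * fixed == 0 reproduces the old accounting, fixed != 0 the new one. */
static void hs_additive_increase(struct hs_state *s, uint32_t ai, int fixed)
{
    s->cwnd_cnt += fixed ? ai + 1 : ai;
    if (s->cwnd_cnt >= s->cwnd) {
        s->cwnd_cnt -= s->cwnd;
        s->cwnd++;
    }
}
```

With ai == 0, which is the first table row and covers small windows, the old accounting adds nothing per ACK, so the window never grows. The fixed version grows by one segment per window's worth of ACKs.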

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

---

 net/ipv4/tcp_highspeed.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

121685b7b61c8faeb87e6c6f0c346b0fe1c46fd2
diff --git a/net/ipv4/tcp_highspeed.c b/net/ipv4/tcp_highspeed.c
index b72fa55..ba7c63c 100644
--- a/net/ipv4/tcp_highspeed.c
+++ b/net/ipv4/tcp_highspeed.c
@@ -135,7 +135,8 @@ static void hstcp_cong_avoid(struct sock
 
/* Do additive increase */
if (tp->snd_cwnd < tp->snd_cwnd_clamp) {
-   tp->snd_cwnd_cnt += ca->ai;
+   /* cwnd = cwnd + a(w) / cwnd */
+   tp->snd_cwnd_cnt += ca->ai + 1;
if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
tp->snd_cwnd_cnt -= tp->snd_cwnd;
tp->snd_cwnd++;
-- 
1.3.3
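
A small userspace model of the counter logic in the hunk above may make the off-by-one easier to see. It is only a sketch: it assumes, as the commit message suggests, that ca->ai is a 0-based row index so that the RFC's a(w) equals ai + 1. The function grow_cwnd() and its bias parameter are invented for illustration and are not kernel code.

```c
#include <assert.h>

/*
 * Userspace model of the additive-increase step (illustrative only).
 * 'ai' is the 0-based table index; 'bias' is 0 for the old behaviour
 * and 1 for the patched behaviour (ai + 1).
 * Returns the congestion window after 'acks' ACKs have been processed.
 */
static unsigned int grow_cwnd(unsigned int cwnd, unsigned int ai,
                              unsigned int bias, unsigned int acks)
{
    unsigned int cnt = 0;   /* plays the role of snd_cwnd_cnt */

    assert(cwnd > 0);
    while (acks--) {
        cnt += ai + bias;
        if (cnt >= cwnd) {  /* one full window of credit collected */
            cnt -= cwnd;
            cwnd++;
        }
    }
    return cwnd;
}
```

With ai = 0 (the first table row) and bias = 0 the counter never moves, which matches the reported stall for small windows; with bias = 1 the window grows by one segment per window of ACKs.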


Re: [Bugme-new] [Bug 6638] New: tg3 output freezes on compaq nc6000

2006-06-02 Thread Andrew Morton

On Fri, 2 Jun 2006 05:40:51 -0700
[EMAIL PROTECTED] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=6638
> 
>Summary: tg3 output freezes on compaq nc6000
> Kernel Version: 2.6.16.19
> Status: NEW
>   Severity: normal
>  Owner: [EMAIL PROTECTED]
>  Submitter: [EMAIL PROTECTED]
> 
> 
> Most recent kernel where this bug did not occur: none
> Distribution: Debian Sarge with latest kernel
> Hardware Environment: compaq nc6000
> Software Environment: 
> Problem Description: 
> The output engine of the tg3 driver freezes when generating high load.
> 
> `ifconfig' shows incoming packets; however, the outgoing counter is no longer
> incremented.
> 
> Resetting the device (ifdown eth0, ifup eth0) heals the problem.
> 
> Steps to reproduce:
> Heavily copy files to NFS disk.
> 
> --- You are receiving this mail because: ---
> You are on the CC list for the bug, or are watching someone who is.

Re: Question about tcp hash function tcp_hashfn()

2006-06-02 Thread Brian F. G. Bidulock

Florian,

On Fri, 02 Jun 2006, Florian Weimer wrote:
> 
> I see them now.  Hmm.  Is there a theoretical explanation for them?

Jenkins is an ad hoc function that is far from ideal.  As you know,
the ideal hash changes 1/2 the bits in the output value for each one
bit change in the input value(s).  Jenkins changes as few as 1/3 and
performs less than ideally over even a small sample of the input data
set (Jenkins said he checked several billion of the trillions of
changes).

It should not be surprising that a general-purpose ad hoc function
(Jenkins) performs worse than a specific-purpose ad hoc function
(XOR) on the very specific input data sets that the latter was chosen
to cover.

Theoretically, XOR can be improved upon, but Jenkins doesn't do it.
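
The "half the output bits per input bit" criterion above can be measured directly. The following is a minimal sketch, not taken from the kernel: xor_fold() is an assumed XOR-and-shift mixer chosen only because its behaviour is easy to verify by hand, and avalanche_bits() is a hypothetical helper that counts how many output bits flip when one input bit is flipped.

```c
#include <assert.h>
#include <stdint.h>

/* XOR-fold mixer; an assumed shape for illustration, not kernel code. */
static uint32_t xor_fold(uint32_t x)
{
    x ^= x >> 16;
    x ^= x >> 8;
    return x;
}

/* Portable population count. */
static unsigned int popcount32(uint32_t v)
{
    unsigned int n = 0;

    while (v) {
        v &= v - 1;     /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Number of output bits that change when input bit 'bit' is flipped. */
static unsigned int avalanche_bits(uint32_t (*hash)(uint32_t),
                                   uint32_t x, unsigned int bit)
{
    assert(bit < 32);
    return popcount32(hash(x) ^ hash(x ^ (UINT32_C(1) << bit)));
}
```

An ideal 32-bit hash would average 16 changed bits. This particular fold changes between 1 and 4, so a low avalanche score alone does not say how a function distributes one specific input set; that has to be tested on the data itself, which is the point being argued in this thread.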

Re: RFC3927 ARP patch status?

2006-06-02 Thread Anand Kumria

On Fri, Jun 02, 2006 at 09:36:54AM -0700, David Daney wrote:
> Anand Kumria wrote:
> >Hi David,
> >
> >Do you know the status of your RFC3927 ARP patch? Is it likely to make
> >it into a mainline kernel?
> >
> 
> That would be up to the kernel network maintainers.
> 
> There were some discussions about whether it made sense for the kernel 
> to support the behavior required by the RFC.  Other comments debated the 

Hmm, well the behaviour at the moment is certainly suboptimal. Any
compliant RFC3927 implementation has to generate an additional broadcast
ARP -- the kernel will send a directed response, which isn't enough.
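
For reference, RFC 3927 section 2.5 requires ARP packets whose sender IP address is link-local (169.254.0.0/16) to go out as link-layer broadcasts, replies included. A rough sketch of that decision follows; the helper names are made up for illustration and this is not the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* True if a host-byte-order IPv4 address lies in 169.254.0.0/16. */
static int ipv4_is_link_local(uint32_t addr)
{
    return (addr & 0xffff0000u) == 0xa9fe0000u;
}

/*
 * Pick the Ethernet destination for an ARP reply.
 * 'requester' is the MAC address that sent the request.
 */
static void arp_reply_dest(uint32_t sender_ip, const uint8_t requester[6],
                           uint8_t dest[6])
{
    assert(requester != NULL && dest != NULL);
    if (ipv4_is_link_local(sender_ip))
        memset(dest, 0xff, 6);          /* link-layer broadcast */
    else
        memcpy(dest, requester, 6);     /* ordinary directed reply */
}
```

The only input to the choice is the sender address of the reply, which is why a per-address flag is one of the approaches discussed in this thread.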

> The patch is there.  I signed-off-by on it.
> 
> If you need RFC3927 compliance, you are free to apply the patch.  If the 
>  network maintainers are so inclined, they can do the necessary things 
> to get it into the mainline.

Okay, thanks.

Anand

-- 
 `When any government, or any church for that matter, undertakes to say to
  its subjects, "This you may not read, this you must not see, this you are
  forbidden to know," the end result is tyranny and oppression no matter how
  holy the motives' -- Robert A Heinlein, "If this goes on --"

Re: Question about tcp hash function tcp_hashfn()

2006-06-02 Thread Florian Weimer

* Evgeniy Polyakov:

> :) that's true, but to be 100% honest I used different code to test for
> hash artifacts...

Ah, okay.

> But it still does not fix artifacts with for example const IP and random
> ports or const IP and linear port selection.

I see them now.  Hmm.  Is there a theoretical explanation for them?

Re: RFC3927 ARP patch status?

2006-06-02 Thread David Daney


Anand Kumria wrote:
> Hi David,
>
> Do you know the status of your RFC3927 ARP patch? Is it likely to make
> it into a mainline kernel?

That would be up to the kernel network maintainers.

There were some discussions about whether it made sense for the kernel
to support the behavior required by the RFC.  Other comments debated the
wisdom of using a tightly targeted patch specific to the RFC, or whether
a more general but intrusive solution would be better.

The patch is there.  I signed-off-by on it.

If you need RFC3927 compliance, you are free to apply the patch.  If the
network maintainers are so inclined, they can do the necessary things
to get it into the mainline.


David Daney

Re: new driver for IBM ethernet chip

2006-06-02 Thread Randy.Dunlap

On Fri, 2 Jun 2006 13:02:37 +0200 Christoph Raisch wrote:

> We're currently developing a new Ethernet device driver for a 10G IBM chip
> for System p. (ppc64)
> 
> A later version of the driver should end up in mainline kernel.
> How should we proceed to get first comments by the community?
> Either post this code as a patch to netdev or
yes

> put a full tarball on for example sourceforge?
nope.

Please read and observe:  Documentation/SubmittingPatches
and Section 3 of it, References, for other sources of
expectations/requirements.

The -mm tree also contains Documentation/SubmitChecklist
that you may find useful.

---
~Randy

Re: Question about tcp hash function tcp_hashfn()

2006-06-02 Thread Brian F. G. Bidulock

Evgeniy,

I agree, even with constant source IP, the hash still should have
performed better (but didn't).  Constant source IP and varying
port is a realistic data set for a port proxy.

Re: [PATCH] softmac: Fix handling of authentication failure

2006-06-02 Thread Larry Finger


John W. Linville wrote:
> On Fri, Jun 02, 2006 at 11:09:56AM +0100, Daniel Drake wrote:
>> Larry Finger wrote:
>>> This statement fails to compile on my system using Linus' tree, because
>>> ieee80211softmac_disassoc needs a second argument - the reason. It seems
>>> as if WLAN_REASON_PREV_AUTH_NOT_VALID would be appropriate.
>>
>> Thanks for pointing that out. This patch depends on a patch titled
>> "softmac: deauthentication implies deassociation" which is apparently
>> not present in Linus' tree.
>
> That patch is in the upstream branch (also available in the master
> branch) of wireless-2.6, which is probably what anyone doing wireless
> patches should(*) be using.
>
> The approved release process only allows for non-bugfix patches
> in the first two weeks after Linus blesses a new kernel release.
> After that only bugfixes are allowed, with non-bugfix patches getting
> queued for the next merge window.  The upstream branch represents
> that queue of patches.
>
> Hth!
>
> John
>
> (*) The exception being those working on Devicescape-related patches,
> who should be working off the master branch of wireless-dev.

Normally I use the master branch of wireless-2.6; however, that code has
something in it that kills interrupts from my bcm4306 card, and I
haven't had time to chase down that problem. A second difficulty is that
my Linksys WRT54G V1 died and the only replacement I could find on short
notice was a V5 model, which is truly an abomination using VxWorks, not
Linux. Since the AP changed, my previously working system of bcm43xx-softmac
with WPA authentication refuses to authenticate, and I have to use a
wired connection. Accordingly, I try every softmac patch that might
solve my problem - thus I applied Daniel's patch to Linus's tree.


Larry



Re: 2.6.17-rc4: netfilter LOG messages truncated via NETCONSOLE (2)

2006-06-02 Thread Frank van Maarseveen

On Fri, Jun 02, 2006 at 04:16:08PM +0200, Patrick McHardy wrote:
> Which network driver are you using?

I've seen it with two completely different NICs at the sender side:
:02:08.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 05)
:02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751 Gigabit Ethernet PCI Express (rev 01)

Snippet from bootlog:
Jun  2 16:24:05 posio kernel: e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
Jun  2 16:24:05 posio kernel: e100: Copyright(c) 1999-2005 Intel Corporation
Jun  2 16:24:05 posio kernel: ACPI: PCI Interrupt :02:08.0[A] -> GSI 16 (level, low) -> IRQ 16
Jun  2 16:24:05 posio kernel: e100: eth0: e100_probe: addr 0x4040, irq 16, MAC addr 00:08:C7:69:29:AE
Jun  2 16:24:05 posio kernel: netconsole: device eth0 not up yet, forcing it
Jun  2 16:24:05 posio kernel: e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex
Jun  2 16:24:05 posio kernel: netconsole: carrier detect appears untrustworthy, waiting 4 seconds
Jun  2 16:24:05 posio kernel: netconsole: network logging started

> Does this patch show anything in
> the ringbuffer?

no.

> --- a/net/core/netpoll.c
> +++ b/net/core/netpoll.c
> @@ -302,6 +302,9 @@ static void netpoll_send_skb(struct netp
>   netpoll_poll(np);
>   udelay(50);
>   } while (npinfo->tries > 0);
> +
> + printk("failed to transmit\n");
> + kfree_skb(skb);
>  }
>  
>  void netpoll_send_udp(struct netpoll *np, const char *msg, int len)


-- 
Frank

Re: [RFC PATCH 1/2] Hardware button support for Wireless cards: radiobtn

2006-06-02 Thread Ivo van Doorn

> > > The first parameter of strcat() must be big enough to contain the whole
> > > string.
> > 
> > Will replace it with
> > sprintf(wrqu->name, "radiobtn/%s", radiobtn->dev_name);
> 
> Or rather, I don't think the radiobtn/ prefix will actually be needed.
> The name passed to the radiobtn driver by the driver should be sufficient.
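
The buffer-size concern raised about strcat() applies equally to an unbounded sprintf(). A bounded variant could look like the sketch below; build_name() and its return convention are hypothetical, chosen only to show snprintf() with an explicit size and a truncation check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "radiobtn/<dev_name>" into dst without exceeding dst_len bytes.
 * Returns 0 on success, -1 if the result did not fit and was truncated.
 */
static int build_name(char *dst, size_t dst_len, const char *dev_name)
{
    int n;

    assert(dst != NULL && dev_name != NULL);
    n = snprintf(dst, dst_len, "radiobtn/%s", dev_name);

    /* snprintf returns the length it wanted to write, excluding the NUL. */
    return (n < 0 || (size_t)n >= dst_len) ? -1 : 0;
}
```

Dropping the prefix, as suggested above, sidesteps the formatting entirely, but the destination size still has to be respected when the name is copied.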

Updated version,

Signed-off-by Ivo van Doorn <[EMAIL PROTECTED]>

diff --git a/drivers/input/misc/Kconfig b/drivers/input/misc/Kconfig
index 4bad588..212caad 100644
--- a/drivers/input/misc/Kconfig
+++ b/drivers/input/misc/Kconfig
@@ -79,4 +79,14 @@ config HP_SDC_RTC
  Say Y here if you want to support the built-in real time clock
  of the HP SDC controller.
 
+config RADIOBTN
+   tristate "Hardware radio button support"
+   help
+ Say Y here if you have an integrated WiFi or Bluetooth device
+ which contains a hardware button for enabling or disabling the radio.
+ When this driver is used, this driver will make sure the radio will
+ be correctly enabled and disabled when needed. It will then also
+ use the created input device to signal user space of this event
+ which allows userspace to take additional actions.
+
 endif
diff --git a/drivers/input/misc/Makefile b/drivers/input/misc/Makefile
index 415c491..9af3d98 100644
--- a/drivers/input/misc/Makefile
+++ b/drivers/input/misc/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_INPUT_UINPUT)+= uinput.o
 obj-$(CONFIG_INPUT_WISTRON_BTNS)   += wistron_btns.o
 obj-$(CONFIG_HP_SDC_RTC)   += hp_sdc_rtc.o
 obj-$(CONFIG_INPUT_IXP4XX_BEEPER)  += ixp4xx-beeper.o
+obj-$(CONFIG_RADIOBTN) += radiobtn.o
\ No newline at end of file
diff --git a/drivers/input/misc/radiobtn.c b/drivers/input/misc/radiobtn.c
new file mode 100644
index 000..4379abe
--- /dev/null
+++ b/drivers/input/misc/radiobtn.c
@@ -0,0 +1,163 @@
+/*
+   Copyright (C) 2006 Ivo van Doorn
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the
+   Free Software Foundation, Inc.,
+   59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ */
+
+/*
+   Radio hardware button support
+   Frequently poll all registered hardware for the hardware button status;
+   if it has changed, enable or disable the radio of that hardware device.
+   Send a signal to the input device to inform userspace about the new status.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+MODULE_AUTHOR("Ivo van Doorn <[EMAIL PROTECTED]>");
+MODULE_VERSION("1.0");
+MODULE_DESCRIPTION("Radio hardware button support");
+MODULE_LICENSE("GPL");
+
+static void radiobtn_poll(unsigned long data)
+{
+   struct radio_button *radiobtn = (struct radio_button*)data;
+   int state;
+
+   /*
+* Poll for the new state.
+* Check if the state has changed.
+*/
+   state = !!radiobtn->button_poll(radiobtn->data);
+   if (state != radiobtn->current_state) {
+   radiobtn->current_state = state;
+
+   /*
+* Enable or disable the radio when this
+* should be done in software.
+*/
+   if (state && radiobtn->enable_radio)
+   radiobtn->enable_radio(radiobtn->data);
+   else if (!state && radiobtn->disable_radio)
+   radiobtn->disable_radio(radiobtn->data);
+
+   /*
+* Report key event.
+*/
+   input_report_key(radiobtn->input_dev, KEY_RADIO, 1);
+   input_sync(radiobtn->input_dev);
+   input_report_key(radiobtn->input_dev, KEY_RADIO, 0);
+   input_sync(radiobtn->input_dev);
+   }
+
+   /*
+* Check if polling has been disabled.
+*/
+   if (radiobtn->poll_delay != 0) {
+   radiobtn->poll_timer.expires =
+   jiffies + msecs_to_jiffies(radiobtn->poll_delay);
+   add_timer(&radiobtn->poll_timer);
+   }
+}
+
+int radiobtn_register_device(struct radio_button *radiobtn)
+{
+   int status;
+
+   /*
+* Check if all mandatory fields have been set.
+*/
+   if (radiobtn->poll_delay == 0 || radiobtn->button_poll == NULL)
+   return -EINVAL;
+
+   /*
+* Allocate, initialize and register input device.
+

Re: 2.6.17-rc4: netfilter LOG messages truncated via NETCONSOLE (2)

2006-06-02 Thread Patrick McHardy

Frank van Maarseveen wrote:
> The 2.6.13.2 data is inconsistent. The bug appears to be present there as
> well after closer examination. So there must be another factor involved
> because I have at least one case logged where 2.6.13.2 did work (the
> "sirkka" log in my previous mail). Applying your patch on 2.6.13.2
> again removes the "protocol is buggy" messages (when doing a tcpdump)
> but the problem of the 10 missing packets persists.

Which network driver are you using? Does this patch show anything in
the ringbuffer?

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index e8e05ce..2b12280 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -302,6 +302,9 @@ static void netpoll_send_skb(struct netp
netpoll_poll(np);
udelay(50);
} while (npinfo->tries > 0);
+
+   printk("failed to transmit\n");
+   kfree_skb(skb);
 }
 
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
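
The debugging hunk above reports and frees a packet once the retry loop gives up. The same pattern in a self-contained userspace form might look like the following; struct fake_nic, fake_xmit() and send_with_retries() are invented stand-ins, not netpoll interfaces.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for a device: counts how many polls it stays busy for. */
struct fake_nic {
    int busy_polls;
    int sent;
    int dropped;
};

/* Pretend transmit: fails while the device still reports busy. */
static int fake_xmit(struct fake_nic *nic)
{
    if (nic->busy_polls > 0) {
        nic->busy_polls--;
        return -1;
    }
    nic->sent++;
    return 0;
}

/* Try up to max_tries times, then log the failure and drop the packet. */
static int send_with_retries(struct fake_nic *nic, int max_tries)
{
    int tries;

    assert(max_tries > 0);
    for (tries = max_tries; tries > 0; tries--)
        if (fake_xmit(nic) == 0)
            return 0;

    fprintf(stderr, "failed to transmit\n");
    nic->dropped++;     /* stands in for freeing the buffer */
    return -1;
}
```

If packets were being lost on this path, the message would show up in the log; a silent log points to the loss happening elsewhere.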

Re: 2.6.17-rc4: netfilter LOG messages truncated via NETCONSOLE (2)

2006-06-02 Thread Frank van Maarseveen

On Fri, Jun 02, 2006 at 02:35:59PM +0200, me wrote:

[...]

> This is a tcpdump done after rebooting "posio"
> to 2.6.13.2 showing how it should have looked:

[snip]

The 2.6.13.2 data is inconsistent. The bug appears to be present there as
well after closer examination. So there must be another factor involved
because I have at least one case logged where 2.6.13.2 did work (the
"sirkka" log in my previous mail). Applying your patch on 2.6.13.2
again removes the "protocol is buggy" messages (when doing a tcpdump)
but the problem of the 10 missing packets persists.

-- 
Frank

RE: [openib-general] Re: [PATCH 1/2] iWARP Connection Manager.

2006-06-02 Thread Steve Wise

> > 
> > The problem is that we can't synchronously cancel an
> > outstanding connect request. Once we've asked the adapter to
> > connect, we can't tell him to stop, we have to wait for it to
> > fail. During the time period between when we ask to connect
> > and the adapter says yeah-or-nay, the user hits ctrl-C. This
> > is the case where disconnect and/or destroy gets called and
> > we have to block it waiting for the outstanding connect
> > request to complete.
> > 
> > One alternative to this approach is to do the kfree of the
> > cm_id in the deref logic. This was the original design and
> > leaves the object around to handle the completion of the
> > connect and still allows the app to clean up and go away
> > without all this waitin' around. When the adapter finally
> > finishes and releases its reference, the object is kfree'd.
> > 
> > Hope this helps.
> > 
> Why couldn't you synchronously put the cm_id in a state of
> "pending delete" and do the actual delete when the RNIC
> provides a response to the request? 

This is Tom's "alternative" mentioned above.  The provider already keeps
an explicit reference on the cm_id while it might possibly deliver an
event on that cm_id.  So if you change deref to kfree the cm_id on its
last deref (when the refcnt reaches 0), then you can avoid blocking
during destroy...  
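
The free-on-last-deref idea can be sketched in a few lines. This is a single-threaded userspace illustration with made-up names (struct conn_id, conn_id_get(), conn_id_put()); a real implementation would need an atomic counter and is not shown here.

```c
#include <assert.h>
#include <stdlib.h>

struct conn_id {
    int refcnt;         /* plain int: single-threaded sketch only */
    int *freed_flag;    /* set to 1 when the object is released */
};

/* Allocate an object holding one reference for its creator. */
static struct conn_id *conn_id_create(int *freed_flag)
{
    struct conn_id *id = malloc(sizeof(*id));

    if (!id)
        return NULL;
    id->refcnt = 1;
    id->freed_flag = freed_flag;
    return id;
}

/* Take an additional reference. */
static void conn_id_get(struct conn_id *id)
{
    id->refcnt++;
}

/* Drop one reference; whoever drops the last one frees the object. */
static void conn_id_put(struct conn_id *id)
{
    assert(id->refcnt > 0);
    if (--id->refcnt == 0) {
        *id->freed_flag = 1;
        free(id);
    }
}
```

In this model the application's destroy is just a put that returns at once; the object survives until the provider drops the reference it took for the pending connect.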

> There could even be
> an optional method to see if the device is capable of
> cancelling the request. I know it can't yank a SYN back
> from the wire, but it could refrain from retransmitting.

I would suggest we don't add this optional method until we see an RNIC
that supports canceling a connect request or accept synchronously...

Steve.


Re: [Bugme-new] [Bug 6613] New: iptables broken on 32-bit PReP (ARCH=ppc)

2006-06-02 Thread Patrick McHardy

Meelis Roos wrote:
>> Very strange, this means that the initial table data must somehow
>> be wrong, but for some reason it still seems to get past the
>> size and offset checks for the filter table. I can't see how
>> loading the filter table could fail after the "Finished chain .."
>> messages without another message. Which kernel version did you
>> perform these tests on?
> 
> 
> Yesterday's 2.6.17-rc5+git.

Please enable DEBUG_IP_FIREWALL_USER in net/netfilter/x_tables.c as well
and retry. Results of the raw or mangle table would also be interesting
because they contain a different number of built-in chains.

Re: [Bugme-new] [Bug 6613] New: iptables broken on 32-bit PReP (ARCH=ppc)

2006-06-02 Thread Meelis Roos


> Very strange, this means that the initial table data must somehow
> be wrong, but for some reason it still seems to get past the
> size and offset checks for the filter table. I can't see how
> loading the filter table could fail after the "Finished chain .."
> messages without another message. Which kernel version did you
> perform these tests on?

Yesterday's 2.6.17-rc5+git.

--
Meelis Roos ([EMAIL PROTECTED])

Re: [Bugme-new] [Bug 6613] New: iptables broken on 32-bit PReP (ARCH=ppc)

2006-06-02 Thread Patrick McHardy

Meelis Roos wrote:
>> Then let's try something different. Please enable the
>> DEBUG_IP_FIREWALL_USER define in net/ipv4/netfilter/ip_tables.c and
>> post the results, if any.
> 
> 
> On bootup I get this in dmesg (one Bad offset has been added):
> 
> ip_tables: (C) 2000-2006 Netfilter Core Team
> Netfilter messages via NETLINK v0.30.
> ip_conntrack version 2.4 (1536 buckets, 12288 max) - 224 bytes per
> conntrack
> translate_table: size 632
> Bad offset cb437924
> ip_nat_init: can't setup rules.
> 
> And on iptables -t nat -L
> 
> translate_table: size 632
> Bad offset cb4368f4
> ip_nat_init: can't setup rules.
> translate_table: size 632
> Bad offset cb4368f4
> ip_nat_init: can't setup rules.
> 
> Seems iptable_nat does not load at all this time.
> 
> Modprobe iptable_filter still fails, dmesg contains
> translate_table: size 632
> Finished chain 1
> Finished chain 2
> Finished chain 3
> 
> Next modprobe iptable_nat gives
> 
> translate_table: size 632
> Bad offset c8e01944
> ip_nat_init: can't setup rules.


Very strange, this means that the initial table data must somehow
be wrong, but for some reason it still seems to get past the
size and offset checks for the filter table. I can't see how
loading the filter table could fail after the "Finished chain .."
messages without another message. Which kernel version did you
perform these tests on?


Re: 2.6.17-rc4: netfilter LOG messages truncated via NETCONSOLE

2006-06-02 Thread Frank van Maarseveen

On Thu, Jun 01, 2006 at 07:34:47PM +0200, Patrick McHardy wrote:
> Frank van Maarseveen wrote:
> > ok, now "tc -s -d qdisc show" says (after noticing missing netconsole
> > packets):
> > 
> > qdisc pfifo_fast 0: dev eth0 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
> >  Sent 155031 bytes 2067 pkt (dropped 0, overlimits 0 requeues 0) 
> >  backlog 0b 0p requeues 0 
> 
> 
> Mhh no dropped packets. I tried to reproduce the problem by changing
> netconsole to always use the dev_queue_xmit path, but works flawlessly
> for me. Please try to find out if the packets are lost before or after
> the qdisc by looking at the packet counter.

qdisc pfifo_fast 0: dev eth0 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 155031 bytes 2067 pkt (dropped 0, overlimits 0 requeues 0) 
   

This packet counter increases by 17: the TCP RST plus 16 netconsole
packets which are received. But it should have been 27 (TCP RST plus 26):
10 are missing (I thought 9 but it's 10).

> 
> BTW: You still haven't sent me the packet dump (from the originating
> machine).

tcpdump:

10:50:22.811044 00:12:3f:85:17:52 > 00:08:c7:69:29:ae, ethertype IPv4 (0x0800), length 74: IP espoo.38629 > posio.21212: S 2489079094:2489079094(0) win 5840
10:50:22.811679 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 54: IP posio.21212 > espoo.38629: R 0:0(0) ack 2489079095 win 0
10:50:22.811731 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 55: IP posio.6665 > espoo.syslog: UDP, length: 13
10:50:22.811738 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 46: IP posio.6665 > espoo.syslog: UDP, length: 4
10:50:22.811745 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811752 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811760 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811766 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811773 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811780 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811787 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811795 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811801 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811809 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811816 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811823 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811830 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
10:50:22.811839 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3

On "espoo" I do a "netcat posio 21212" to trigger the netfilter rule on
posio (there's only 1 rule):

Chain INPUT (policy ACCEPT 2876 packets, 743K bytes)
 pkts bytes target     prot opt in     out     source               destination
    1    60 LOG        tcp  --  *      *       172.17.1.64          0.0.0.0/0           tcp dpt:21212 LOG flags 0 level 4

The netfilter message is sent back via netconsole from "posio" to "espoo"
except for 10 packets. This is a tcpdump done after rebooting "posio"
to 2.6.13.2 showing how it should have looked:

12:28:29.900384 00:12:3f:85:17:52 > 00:08:c7:69:29:ae, ethertype IPv4 (0x0800), length 74: IP espoo.45517 > posio.21212: S 122190451:122190451(0) win 5840
12:28:29.900939 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 54: IP posio.21212 > espoo.45517: R 0:0(0) ack 122190452 win 0
12:28:29.900995 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 55: IP posio.6665 > espoo.syslog: UDP, length: 13
12:28:29.901026 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 46: IP posio.6665 > espoo.syslog: UDP, length: 4
12:28:29.901055 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP, length: 3
12:28:29.901082 00:08:c7:69:29:ae > 00:12:3f:85:17:52, ethertype IPv4 (0x0800), length 45: IP posio.6665 > espoo.syslog: UDP

Re: [PATCH] softmac: Fix handling of authentication failure

2006-06-02 Thread John W. Linville

On Fri, Jun 02, 2006 at 11:09:56AM +0100, Daniel Drake wrote:
> Larry Finger wrote:
> >This statement fails to compile on my system using Linus' tree, because 
> >ieee80211softmac_disassoc needs a second argument - the reason. It seems 
> >as if WLAN_REASON_PREV_AUTH_NOT_VALID would be appropriate.
> 
> Thanks for pointing that out. This patch depends on a patch titled 
> "softmac: deauthentication implies deassociation" which is apparently 
> not present in Linus' tree.

That patch is in the upstream branch (also available in the master
branch) of wireless-2.6, which is probably what anyone doing wireless
patches should(*) be using.

The approved release process only allows for non-bugfix patches
in the first two weeks after Linus blesses a new kernel release.
After that only bugfixes are allowed, with non-bugfix patches getting
queued for the next merge window.  The upstream branch represents
that queue of patches.

Hth!

John

(*) The exception being those working on Devicescape-related patches,
who should be working off the master branch of wireless-dev.
-- 
John W. Linville
[EMAIL PROTECTED]

new driver for IBM ethernet chip

2006-06-02 Thread Christoph Raisch

We're currently developing a new Ethernet device driver for a 10G IBM chip
for System p. (ppc64)

A later version of the driver should end up in mainline kernel.
How should we proceed to get first comments by the community?
Either post this code as a patch to netdev or
put a full tarball on for example sourceforge?


Gruss / Regards . . . Christoph Raisch

christoph raisch, HCAD teamlead, IODF2 (d/3627), ibm boeblingen lab,
phone: (+49/0)7031-16 4584,  fax: -16 2042, loc: 71032-05-003, internet:
[EMAIL PROTECTED]


Re: netif_tx_disable and lockless TX

2006-06-02 Thread Robert Olsson


Stephen Hemminger writes:

 > I also noticed that you really don't save much by doing TX cleaning at 
 > hardirq, because in hardirq you need to do dev_kfree_skb_irq and that causes 
 > a softirq (for the routing case where users=1). So when routing it 
 > doesn't make much difference, both methods cause the softirq delayed 
 > processing to be invoked. For locally generated packets which are 
 > cloned, the hardirq will drop the ref count, and that is faster than 
 > doing the whole softirq round trip.

 Right. Also the other way around, repeated ->poll can avoid TX hardirq's.

 Cheers.
--ro

Netchannel: TCP memcpy() to mapped support. Patch.

2006-06-02 Thread Evgeniy Polyakov

Hello, developers.

Attached is a netchannel subsystem patch which implements TCP memcpy()
reading (into a preallocated area which could be mapped) and
UDP copy_to_user()/memcpy() reading.

The implementation is still fairly ugly.

Netchannels currently use two queue dereferencings to work with socket's
queue processing: 
- from netchannel's queue which is filled in interrupt
- from socket's queue which is filled in process context

Patch, userspace and implementation details can be found 
on netchannel homepage:

http://tservice.net.ru/~s0mbre/old/?section=projects&item=netchannel

Signed-off-by: Evgeniy Polyakov <[EMAIL PROTECTED]>

diff --git a/arch/i386/kernel/syscall_table.S b/arch/i386/kernel/syscall_table.S
index f48bef1..7a4a758 100644
--- a/arch/i386/kernel/syscall_table.S
+++ b/arch/i386/kernel/syscall_table.S
@@ -315,3 +315,5 @@ ENTRY(sys_call_table)
.long sys_splice
.long sys_sync_file_range
.long sys_tee   /* 315 */
+   .long sys_vmsplice
+   .long sys_netchannel_control
diff --git a/arch/x86_64/ia32/ia32entry.S b/arch/x86_64/ia32/ia32entry.S
index 5a92fed..fdfb997 100644
--- a/arch/x86_64/ia32/ia32entry.S
+++ b/arch/x86_64/ia32/ia32entry.S
@@ -696,4 +696,5 @@ ia32_sys_call_table:
.quad sys_sync_file_range
.quad sys_tee
.quad compat_sys_vmsplice
+   .quad sys_netchannel_control
 ia32_syscall_end:  
diff --git a/include/asm-i386/unistd.h b/include/asm-i386/unistd.h
index eb4b152..777cd85 100644
--- a/include/asm-i386/unistd.h
+++ b/include/asm-i386/unistd.h
@@ -322,8 +322,9 @@
 #define __NR_sync_file_range   314
 #define __NR_tee   315
 #define __NR_vmsplice  316
+#define __NR_netchannel_control 317
 
-#define NR_syscalls 317
+#define NR_syscalls 318
 
 /*
  * user-visible error numbers are in the range -1 - -128: see
diff --git a/include/asm-x86_64/unistd.h b/include/asm-x86_64/unistd.h
index feb77cb..08c230e 100644
--- a/include/asm-x86_64/unistd.h
+++ b/include/asm-x86_64/unistd.h
@@ -617,8 +617,10 @@ __SYSCALL(__NR_tee, sys_tee)
 __SYSCALL(__NR_sync_file_range, sys_sync_file_range)
 #define __NR_vmsplice  278
 __SYSCALL(__NR_vmsplice, sys_vmsplice)
+#define __NR_netchannel_control 279
+__SYSCALL(__NR_netchannel_control, sys_netchannel_control)
 
-#define __NR_syscall_max __NR_vmsplice
+#define __NR_syscall_max __NR_netchannel_control
 
 #ifndef __NO_STUBS
 
diff --git a/include/linux/netchannel.h b/include/linux/netchannel.h
new file mode 100644
index 000..abb0b8d
--- /dev/null
+++ b/include/linux/netchannel.h
@@ -0,0 +1,102 @@
+/*
+ * netchannel.h
+ * 
+ * 2006 Copyright (c) Evgeniy Polyakov <[EMAIL PROTECTED]>
+ * All rights reserved.
+ * 
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef __NETCHANNEL_H
+#define __NETCHANNEL_H
+
+#include 
+
+enum netchannel_commands {
+   NETCHANNEL_CREATE = 0,
+   NETCHANNEL_REMOVE,
+   NETCHANNEL_BIND,
+   NETCHANNEL_READ,
+   NETCHANNEL_DUMP,
+};
+
+enum netchannel_type {
+   NETCHANNEL_COPY_USER = 0,
+   NETCHANNEL_MMAP,
+   NETCHANEL_VM_HACK,
+};
+
+struct unetchannel
+{
+   __u32   src, dst;   /* source/destination hashes */
+   __u16   sport, dport;   /* source/destination ports */
+   __u8   proto;  /* IP protocol number */
+   __u8   type;   /* Netchannel type */
+   __u8   memory_limit_order; /* Memory limit order */
+   __u8   reserved;
+};
+
+struct unetchannel_control
+{
+   struct unetchannel  unc;
+   __u32   cmd;
+   __u32   len;
+   __u32   flags;
+   __u32   timeout;
+   unsigned int   fd;
+};
+
+#ifdef __KERNEL__
+
+struct netchannel
+{
+   struct hlist_node   node;
+   atomic_t   refcnt;
+   struct rcu_head rcu_head;
+   struct unetchannel  unc;
+   unsigned long   hit;
+
+   struct page *   (*nc_alloc_page)(unsigned int size);
+   void   (*nc_free_page)(struct page *page);
+   int (*nc_read_data)(struct netchannel *, unsi

Re: [PATCH] softmac: Fix handling of authentication failure

2006-06-02 Thread Daniel Drake


Larry Finger wrote:
> This statement fails to compile on my system using Linus' tree, because 
> ieee80211softmac_disassoc needs a second argument - the reason. It seems 
> as if WLAN_REASON_PREV_AUTH_NOT_VALID would be appropriate.

Thanks for pointing that out. This patch depends on a patch titled 
"softmac: deauthentication implies deassociation" which is apparently 
not present in Linus' tree.


Daniel

Netchannel: TCP memcpy() to mapped area support. Benchmarks.

2006-06-02 Thread Evgeniy Polyakov

Hello, developers.

Attached is an initial benchmark of netchannels with memcpy() into a
kernelspace area, which could be mapped from userspace, versus the socket
code, over a 1 Gbit link.

As you can see from the graph, netchannels outperform sockets in speed, 
but their CPU usage is higher too.
It is possible that this is simply the price, i.e. the socket code would
increase its CPU usage too if it could increase its processing speed.

The implementation is still fairly ugly. It requires some changes in the 
generic TCP state machine processing logic to clean things up, 
so that it would operate not on top of sockets, 
but on skbs from queues, with the appropriate parameters (timeout,
flags) provided either as a new structure (which would be embedded into 
struct sock and netchannel) or as function parameters.

Netchannels currently dereference two queues to work with the socket's
queue processing: 
- the netchannel's queue, which is filled in interrupt context
- the socket's queue, which is filled in process context

which is a source of some speed problems too.

It still requires some thinking...

-- 
Evgeniy Polyakov


netchannel_speed.png
Description: PNG image

Re: Question about tcp hash function tcp_hashfn()

2006-06-02 Thread Evgeniy Polyakov

On Fri, Jun 02, 2006 at 07:40:38AM +0200, Florian Weimer ([EMAIL PROTECTED]) wrote:
> * Evgeniy Polyakov:
> 
> > That is wrong. And I have a code and picture to show that, 
> > and you dont - prove me wrong :)
> 
> Here we go:
> 
> static inline num2ip(__u8 a1, __u8 a2, __u8 a3, __u8 a4)
> {
>   __u32 a = 0;
> 
>   a |= a1;
>   a << 8;
>   a |= a2;
>   a << 8;
>   a |= a3;
>   a << 8;
>   a |= a4;
> 
>   return a;
> }
> 
> "gcc -Wall" was pretty illuminating. 8-P After fixing this and
> switching to a better PRNG, I get something which looks pretty normal.

:) That's true, but to be 100% honest, I used different code to test for
hash artifacts...
That code was created to show that it is possible to _have_ artifacts,
not specifically to _find_ them.

But it still does not fix the artifacts with, for example, a constant IP
and random ports, or a constant IP and linear port selection.

Values must be specially tuned to be used with the Jenkins hash; for example,
a linear port with a constant IP produces the following hash buckets:
100 24397
200 12112
300 3952
400 975
500 178
600 40
700 3
800 1

i.e. one 800-entry bucket (!), while the xor one always has only 100 entries
per bucket (for 100*hash_size iterations).

So your proof is not valid :)
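
The xor half of that claim is easy to check with a few lines of C. The sketch
below is not the test code mentioned above: the xor-fold function is a
simplified stand-in written for this illustration, the table size of 256 and
the addresses are assumptions, and no Jenkins hash is included. It hashes
100*hash_size flows with constant addresses and a linearly increasing source
port and counts the entries per bucket; the same loop can be pointed at any
other hash function for comparison.

```c
#include <stdint.h>
#include <string.h>

#define HASH_SIZE 256u	/* assumed table size; must be a power of two */

/* Simplified xor-fold hash; a stand-in, not copied from the kernel. */
static uint32_t xor_hash(uint32_t saddr, uint16_t sport,
			 uint32_t daddr, uint16_t dport)
{
	uint32_t h = (saddr ^ sport) ^ (daddr ^ dport);

	h ^= h >> 16;
	h ^= h >> 8;
	return h & (HASH_SIZE - 1);
}

/*
 * Hash 100 * HASH_SIZE flows that differ only in a linearly increasing
 * source port, and count how many land in each bucket.
 */
void fill_buckets(unsigned int buckets[HASH_SIZE])
{
	const uint32_t saddr = 0xC0A80001u;	/* hypothetical 192.168.0.1 */
	const uint32_t daddr = 0xC0A80002u;	/* hypothetical 192.168.0.2 */
	const uint16_t dport = 80;
	unsigned int i;

	memset(buckets, 0, HASH_SIZE * sizeof(buckets[0]));
	for (i = 0; i < 100 * HASH_SIZE; i++)
		buckets[xor_hash(saddr, (uint16_t)i, daddr, dport)]++;
}

/* Length of the longest chain in the table. */
unsigned int max_chain(const unsigned int buckets[HASH_SIZE])
{
	unsigned int i, max = 0;

	for (i = 0; i < HASH_SIZE; i++)
		if (buckets[i] > max)
			max = buckets[i];
	return max;
}
```

With these inputs the low byte of the folded value is a constant xored with
both bytes of the port, so every block of 256 consecutive ports touches each
bucket exactly once and every chain ends up 100 entries long.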

-- 
Evgeniy Polyakov

Re: Question about tcp hash function tcp_hashfn()

2006-06-02 Thread Florian Weimer

* Evgeniy Polyakov:

> That is wrong. And I have a code and picture to show that, 
> and you dont - prove me wrong :)

Here we go:

static inline num2ip(__u8 a1, __u8 a2, __u8 a3, __u8 a4)
{
	__u32 a = 0;

	a |= a1;
	a << 8;
	a |= a2;
	a << 8;
	a |= a3;
	a << 8;
	a |= a4;

	return a;
}

"gcc -Wall" was pretty illuminating. 8-P After fixing this and
switching to a better PRNG, I get something which looks pretty normal.
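
For reference, a sketch of what the repaired helper could look like. The
fixed version was not posted, so this is an assumption about the fix: each
"a << 8;" computes a value and discards it, so the function as quoted returns
a1 | a2 | a3 | a4 confined to the low 8 bits, and it also lacks a return
type. The <stdint.h> types stand in for the kernel's __u8/__u32 so the
snippet builds in userspace.

```c
#include <stdint.h>

/* Pack four octets into one 32-bit value, most significant octet first. */
static inline uint32_t num2ip(uint8_t a1, uint8_t a2, uint8_t a3, uint8_t a4)
{
	uint32_t a = 0;

	a |= a1;
	a <<= 8;	/* shift and store; plain "a << 8;" has no effect */
	a |= a2;
	a <<= 8;
	a |= a3;
	a <<= 8;
	a |= a4;

	return a;
}
```

With the shifts stored back, num2ip(192, 168, 0, 1) yields 0xC0A80001, so
distinct addresses no longer collapse into the same 8-bit value before they
reach the hash.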

Re: Question about tcp hash function tcp_hashfn()

2006-06-02 Thread Evgeniy Polyakov

On Thu, Jun 01, 2006 at 12:40:10PM -0600, Brian F. G. Bidulock ([EMAIL PROTECTED]) wrote:
> So what are your thoughts about my sequence number approach (for
> connected sockets)?

That depends on how you are going to use it.
Generic socket code does not have TCP sequence numbers, since it must
work with all supported protocols.
Netchannels also do not know about the internals of the packet by design,
since all protocol processing is performed at the end peer.

A sequence number can wrap in minutes on current networks, and even
faster tomorrow; that is why PAWS was created.
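
As a back-of-the-envelope check, the 32-bit sequence space covers 2^32 bytes,
so the wrap time is 2^32 * 8 / link rate. The helper below is only an
illustration (the function name is made up, and it assumes a single
connection saturating the link with payload, ignoring header overhead):

```c
/* Seconds needed to send 2^32 bytes at the given line rate in bits/s. */
double seq_wrap_seconds(double bits_per_second)
{
	const double seq_space_bytes = 4294967296.0;	/* 2^32 */

	return seq_space_bytes * 8.0 / bits_per_second;
}
```

That gives roughly 344 seconds (under six minutes) at 100 Mbit/s, about 34
seconds at 1 Gbit/s and about 3.4 seconds at 10 Gbit/s.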

Your idea about reinserting the socket does not scale in a 1Gbit
environment, and definitely will not at 10Gbit.

It is probably possible to create a second hash table for TCP sockets only
and use that table first in the protocol handler, but it requires some
research to prove the idea.

-- 
Evgeniy Polyakov
