Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread Rafael J. Wysocki
Hi David,

On Fri, Aug 7, 2015 at 8:14 PM, David Daney  wrote:
> On 08/07/2015 07:54 AM, Graeme Gregory wrote:
>>
>> On Thu, Aug 06, 2015 at 05:33:10PM -0700, David Daney wrote:
>>>
>>> From: David Daney 
>>>
>>> Find out which PHYs belong to which BGX instance in the ACPI way.
>>>
>>> Set the MAC address of the device as provided by ACPI tables. This is
>>> similar to the implementation for devicetree in
>>> of_get_mac_address(). The table is searched for the device property
>>> entries "mac-address", "local-mac-address" and "address" in that
>>> order. The address is provided in a u64 variable and must contain a
>>> valid 6 bytes-len mac addr.
>>>
>>> Based on code from: Narinder Dhillon 
>>>  Tomasz Nowicki 
>>>  Robert Richter 
>>>
>>> Signed-off-by: Tomasz Nowicki 
>>> Signed-off-by: Robert Richter 
>>> Signed-off-by: David Daney 
>>> ---
>>>   drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 137 +-
>>>   1 file changed, 135 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
>>> index 615b2af..2056583 100644
>>> --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
>>> +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
>
> [...]
>>>
>>> +
>>> +static int acpi_get_mac_address(struct acpi_device *adev, u8 *dst)
>>> +{
>>> +   const union acpi_object *prop;
>>> +   u64 mac_val;
>>> +   u8 mac[ETH_ALEN];
>>> +   int i, j;
>>> +   int ret;
>>> +
>>> +   for (i = 0; i < ARRAY_SIZE(addr_propnames); i++) {
>>> +   ret = acpi_dev_get_property(adev, addr_propnames[i],
>>> +   ACPI_TYPE_INTEGER, &prop);
>>
>>
>> Shouldn't this be trying to use device_property_read_* API and making
>> the DT/ACPI path the same where possible?
>>
>
> Ideally, something like you suggest would be possible.  However, there are a
> couple of problems trying to do it in the kernel as it exists today:
>
> 1) There is no 'struct device *' here, so device_property_read_* is not
> applicable.
>
> 2) There is no standard ACPI binding for MAC addresses, so it is impossible
> to create a hypothetical fw_get_mac_address(), which would be analogous to
> of_get_mac_address().
>
> Other e-mail threads have suggested that the path to an elegant solution is
> to inter-mix a bunch of calls to acpi_dev_get_property*() and
> fwnode_property_read*() so as to use these more generic fwnode_property_read*()
> functions wherever possible.  I rejected this approach as it seems cleaner
> to me to consistently use a single set of APIs.

Actually, that wasn't my intention.

I wanted to say that once you'd got an ACPI device pointer (struct
acpi_device), you could easily convert it to a struct fwnode_handle
pointer and operate on that going forward when accessing properties.
That at least would help with the properties that do not differ
between DT and ACPI.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread Rafael J. Wysocki
Hi David,

On Sat, Aug 8, 2015 at 2:11 AM, David Daney  wrote:
> On 08/07/2015 05:05 PM, Rafael J. Wysocki wrote:

[cut]

>>
>> It is actually useful to people as far as I can say.
>>
>> Also, if somebody is going to use properties with ACPI, why would
>> they use a different set of properties with DT?
>>
>> Wouldn't it be more reasonable to use the same set in both cases?
>
>
> Yes, but there is still quite a bit of leeway to screw things up.

That I have to agree with, unfortunately.

On the other hand, this is a fairly new concept and we need to gain
some experience with it to be able to come up with best practices and
so on.  Cases like yours are really helpful here.

> FWIW:  http://www.uefi.org/sites/default/files/resources/nic-request-v2.pdf
>
> This actually seems to have been adopted by the UEFI people as a
> "Standard", I am not sure where a record of this is kept though.

Work on this is in progress, but far from completion.  Essentially,
what's needed is more pressure from vendors who want to use properties
in their firmware.

> So, we are changing our firmware to use this standard (which is quite
> similar to the DT with respect to MAC addresses).

Cool. :-)

Thanks,
Rafael


Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread David Daney

On 08/07/2015 05:05 PM, Rafael J. Wysocki wrote:

> Hi Mark,
>
> On Fri, Aug 7, 2015 at 7:51 PM, Mark Rutland  wrote:
>> [Correcting the devicetree list address, which I typo'd in my original
>> reply]
>>
>>>>> +static const char * const addr_propnames[] = {
>>>>> +  "mac-address",
>>>>> +  "local-mac-address",
>>>>> +  "address",
>>>>> +};
>>>>
>>>> If these are going to be generally necessary, then we should get them
>>>> adopted as standardised _DSD properties (ideally just one of them).
>>>
>>> As far as I can tell, and please correct me if I am wrong, ACPI-6.0
>>> doesn't contemplate MAC addresses.
>>>
>>> Today we are using "mac-address", which is an Integer containing the MAC
>>> address in its lowest order 48 bits in Little-Endian byte order.
>>>
>>> The hardware and ACPI tables are here today, and we would like to
>>> support it.  If some future ACPI specification specifies a standard way
>>> to do this, we will probably adapt the code to do this in a standard manner.
>>>
>>>>
>>>> [...]
>>>>
>>>>> +static acpi_status bgx_acpi_register_phy(acpi_handle handle,
>>>>> +   u32 lvl, void *context, void **rv)
>>>>> +{
>>>>> +  struct acpi_reference_args args;
>>>>> +  const union acpi_object *prop;
>>>>> +  struct bgx *bgx = context;
>>>>> +  struct acpi_device *adev;
>>>>> +  struct device *phy_dev;
>>>>> +  u32 phy_id;
>>>>> +
>>>>> +  if (acpi_bus_get_device(handle, &adev))
>>>>> +  goto out;
>>>>> +
>>>>> +  SET_NETDEV_DEV(&bgx->lmac[bgx->lmac_count].netdev, &bgx->pdev->dev);
>>>>> +
>>>>> +  acpi_get_mac_address(adev, bgx->lmac[bgx->lmac_count].mac);
>>>>> +
>>>>> +  bgx->lmac[bgx->lmac_count].lmacid = bgx->lmac_count;
>>>>> +
>>>>> +  if (acpi_dev_get_property_reference(adev, "phy-handle", 0, &args))
>>>>> +  goto out;
>>>>> +
>>>>> +  if (acpi_dev_get_property(args.adev, "phy-channel", ACPI_TYPE_INTEGER, &prop))
>>>>> +  goto out;
>>>>
>>>> Likewise for any inter-device properties, so that we can actually handle
>>>> them in a generic fashion, and avoid / learn from the mistakes we've
>>>> already handled with DT.
>>>
>>> This is the fallacy of the ACPI is superior to DT argument.  The
>>> specification of PHY topology and MAC addresses is well standardized in
>>> DT, there is no question about what the proper way to specify it is.
>>> Under ACPI, it is the Wild West, there is no specification, so each
>>> system design is forced to invent something, and everybody comes up with
>>> an incompatible implementation.
>>
>> Indeed.
>>
>> If ACPI is going to handle it, it should handle it properly. I really
>> don't see the point in bodging properties together in a less standard
>> manner than DT, especially for inter-device relationships.
>>
>> Doing so is painful for _everyone_, and it's extremely unlikely that
>> other ACPI-aware OSs will actually support these custom descriptions,
>> making this Linux-specific, and breaking the rationale for using ACPI in
>> the first place -- a standard that says "just do non-standard stuff" is
>> not a usable standard.
>>
>> For intra-device properties, we should standardise what we can, but
>> vendor-specific stuff is ok -- this can be self-contained within a
>> driver.
>>
>> For inter-device relationships ACPI _must_ gain a better model of
>> componentised devices. It's simply unworkable otherwise, and as you
>> point out it's fallacious to say that because ACPI is being used that
>> something is magically industry standard, portable, etc.
>>
>> This is not your problem in particular; the entire handling of _DSD so
>> far is a joke IMO.
>
> It is actually useful to people as far as I can say.
>
> Also, if somebody is going to use properties with ACPI, why would
> they use a different set of properties with DT?
>
> Wouldn't it be more reasonable to use the same set in both cases?

Yes, but there is still quite a bit of leeway to screw things up.

FWIW:  http://www.uefi.org/sites/default/files/resources/nic-request-v2.pdf

This actually seems to have been adopted by the UEFI people as a
"Standard", I am not sure where a record of this is kept though.

So, we are changing our firmware to use this standard (which is quite
similar to the DT with respect to MAC addresses).

Thanks,
David Daney
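
To illustrate the encoding discussed in this thread (a 48-bit MAC address carried in the low-order bits of an ACPI Integer), here is a minimal user-space sketch in C. It assumes that "Little-Endian byte order" means the first octet of the address sits in the least significant byte; the thread does not spell this out. The helper names mac_is_valid() and mac_from_u64() are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical helper: reject all-zero and multicast/broadcast addresses. */
static bool mac_is_valid(const uint8_t mac[ETH_ALEN])
{
	static const uint8_t zero[ETH_ALEN];

	if (memcmp(mac, zero, ETH_ALEN) == 0)
		return false;
	return (mac[0] & 0x01) == 0;	/* bit 0 of the first octet is the group bit */
}

/*
 * Hypothetical helper: unpack a MAC address stored in the low 48 bits of
 * a 64-bit integer. Assumption: octet 0 is the least significant byte.
 * Returns 0 on success, -1 if the upper 16 bits are set or the result is
 * not a valid unicast address.
 */
static int mac_from_u64(uint64_t val, uint8_t dst[ETH_ALEN])
{
	uint8_t mac[ETH_ALEN];
	int i;

	assert(dst != NULL);

	if (val >> 48)
		return -1;	/* more than 6 bytes of data */

	for (i = 0; i < ETH_ALEN; i++)
		mac[i] = (uint8_t)(val >> (8 * i));

	if (!mac_is_valid(mac))
		return -1;

	memcpy(dst, mac, ETH_ALEN);
	return 0;
}
```

If firmware instead stored the first octet in the most significant of the six bytes, the loop index would simply run the other way; that choice is exactly the kind of detail a standard binding would need to pin down.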


Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread Rafael J. Wysocki
Hi Mark,

On Fri, Aug 7, 2015 at 7:51 PM, Mark Rutland  wrote:
> [Correcting the devicetree list address, which I typo'd in my original
> reply]
>
>> >> +static const char * const addr_propnames[] = {
>> >> +  "mac-address",
>> >> +  "local-mac-address",
>> >> +  "address",
>> >> +};
>> >
>> > If these are going to be generally necessary, then we should get them
>> > adopted as standardised _DSD properties (ideally just one of them).
>>
>> As far as I can tell, and please correct me if I am wrong, ACPI-6.0
>> doesn't contemplate MAC addresses.
>>
>> Today we are using "mac-address", which is an Integer containing the MAC
>> address in its lowest order 48 bits in Little-Endian byte order.
>>
>> The hardware and ACPI tables are here today, and we would like to
>> support it.  If some future ACPI specification specifies a standard way
>> to do this, we will probably adapt the code to do this in a standard manner.
>>
>>
>> >
>> > [...]
>> >
>> >> +static acpi_status bgx_acpi_register_phy(acpi_handle handle,
>> >> +   u32 lvl, void *context, void **rv)
>> >> +{
>> >> +  struct acpi_reference_args args;
>> >> +  const union acpi_object *prop;
>> >> +  struct bgx *bgx = context;
>> >> +  struct acpi_device *adev;
>> >> +  struct device *phy_dev;
>> >> +  u32 phy_id;
>> >> +
>> >> +  if (acpi_bus_get_device(handle, &adev))
>> >> +  goto out;
>> >> +
>> >> +  SET_NETDEV_DEV(&bgx->lmac[bgx->lmac_count].netdev, &bgx->pdev->dev);
>> >> +
>> >> +  acpi_get_mac_address(adev, bgx->lmac[bgx->lmac_count].mac);
>> >> +
>> >> +  bgx->lmac[bgx->lmac_count].lmacid = bgx->lmac_count;
>> >> +
>> >> +  if (acpi_dev_get_property_reference(adev, "phy-handle", 0, &args))
>> >> +  goto out;
>> >> +
>> >> +  if (acpi_dev_get_property(args.adev, "phy-channel", ACPI_TYPE_INTEGER, &prop))
>> >> +  goto out;
>> >
>> > Likewise for any inter-device properties, so that we can actually handle
>> > them in a generic fashion, and avoid / learn from the mistakes we've
>> > already handled with DT.
>>
>> This is the fallacy of the ACPI is superior to DT argument.  The
>> specification of PHY topology and MAC addresses is well standardized in
>> DT, there is no question about what the proper way to specify it is.
>> Under ACPI, it is the Wild West, there is no specification, so each
>> system design is forced to invent something, and everybody comes up with
>> an incompatible implementation.
>
> Indeed.
>
> If ACPI is going to handle it, it should handle it properly. I really
> don't see the point in bodging properties together in a less standard
> manner than DT, especially for inter-device relationships.
>
> Doing so is painful for _everyone_, and it's extremely unlikely that
> other ACPI-aware OSs will actually support these custom descriptions,
> making this Linux-specific, and breaking the rationale for using ACPI in
> the first place -- a standard that says "just do non-standard stuff" is
> not a usable standard.
>
> For intra-device properties, we should standardise what we can, but
> vendor-specific stuff is ok -- this can be self-contained within a
> driver.
>
> For inter-device relationships ACPI _must_ gain a better model of
> componentised devices. It's simply unworkable otherwise, and as you
> point out it's fallacious to say that because ACPI is being used that
> something is magically industry standard, portable, etc.
>
> This is not your problem in particular; the entire handling of _DSD so
> far is a joke IMO.

It is actually useful to people as far as I can say.

Also, if somebody is going to use properties with ACPI, why would
they use a different set of properties with DT?

Wouldn't it be more reasonable to use the same set in both cases?

Thanks,
Rafael


Re: [PATCH v2 1/4] Add generic correlated clocksource code and ART to TSC conversion code

2015-08-07 Thread Andy Lutomirski

On 08/07/2015 04:01 PM, Christopher Hall wrote:

> Original patch description:
>
> Subject: ptp: Get sync timestamps
> From: Thomas Gleixner
> Date: Wed, 29 Jul 2015 10:52:06 +0200
>
> The ART stuff wants to be splitted out.
>
>  Changes ===
>
> Add struct correlated_cs (clocksource) with pointer to original clocksource
> and function pointer to convert correlated clocksource to the original
>
> Add struct correlated_ts (timestamp) with function pointer to read correlated
> clocksource, device and system (in terms of correlated clocksource)
> counter values (input) with resulting converted real and monotonic raw
> system times (output)
>
> Add get_correlated_timestamp() function which given specific correlated_cs
> and correlated_ts convert correlated counter value to system time
>
> Add art_to_tsc conversion function translated Always Running Timer (ART) to
> TSC value
> ---
>  arch/x86/kernel/tsc.c   | 31 ++
>  include/linux/clocksource.h | 30 +
>  include/linux/timekeeping.h |  4 +++
>  kernel/time/timekeeping.c   | 63 +
>  4 files changed, 128 insertions(+)
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 7437b41..a90aa6a 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -1059,6 +1059,27 @@ int unsynchronized_tsc(void)
> return 0;
>  }
>
> +static u32 tsc_numerator;
> +static u32 tsc_denominator;
> +/*
> + * CHECKME: Do we need the adjust value? It should be 0, but if we run
> + * in a VM this might be a different story.
> + */
> +static u64 tsc_adjust;
> +
> +static u64 art_to_tsc(u64 cycles)
> +{
> +   u64 tmp, res = tsc_adjust;
> +
> +   res += (cycles / tsc_denominator) * tsc_numerator;
> +   tmp = (cycles % tsc_denominator) * tsc_numerator;
> +   res += tmp / tsc_denominator;
> +   return res;

Nice trick!

> diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
> index 278dd27..2ed3d0c 100644
> --- a/include/linux/clocksource.h
> +++ b/include/linux/clocksource.h
> @@ -258,4 +258,34 @@ void acpi_generic_timer_init(void);
>  static inline void acpi_generic_timer_init(void) { }
>  #endif
>
> +/**
> + * struct correlated_cs - Descriptor for a clocksource correlated to another clocksource
> + * @related_cs: Pointer to the related timekeeping clocksource
> + * @convert:    Conversion function to convert a timestamp from
> + *              the correlated clocksource to cycles of the related
> + *              timekeeping clocksource
> + */
> +struct correlated_cs {
> +   struct clocksource  *related_cs;
> +   u64 (*convert)(u64 cycles);

Should the name make it clearer which way it converts?  For example,
convert_to_related?  We might also want convert_from_related.


--Andy
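
The conversion quoted in this message scales an ART count by numerator/denominator without ever forming the full product cycles * numerator, which could overflow 64 bits. Below is a small user-space sketch of the same arithmetic. The function names scale_split() and scale_wide() are made up for illustration, and scale_wide() relies on the GCC/Clang unsigned __int128 extension purely as a cross-check.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute floor(cycles * num / den) using only 64-bit arithmetic.
 * With cycles = q * den + r (0 <= r < den):
 *   cycles * num / den = q * num + (r * num) / den
 * r and num are both below 2^32, so r * num cannot overflow 64 bits.
 */
static uint64_t scale_split(uint64_t cycles, uint32_t num, uint32_t den)
{
	uint64_t res, tmp;

	assert(den != 0);
	res = (cycles / den) * num;
	tmp = (cycles % den) * num;
	res += tmp / den;
	return res;
}

/* Reference result using a 128-bit intermediate (compiler extension). */
static uint64_t scale_wide(uint64_t cycles, uint32_t num, uint32_t den)
{
	assert(den != 0);
	return (uint64_t)(((unsigned __int128)cycles * num) / den);
}
```

The split form is exact because q * num is already an integer, so the only rounding happens in the remainder term. It still overflows if the final result itself does not fit in 64 bits, which no 64-bit return value can avoid.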


Re: [PATCH net-next v4 3/4] openvswitch: Use regular GRE net_device instead of vport

2015-08-07 Thread Jesse Gross
On Wed, Aug 5, 2015 at 8:12 PM, Pravin B Shelar  wrote:
> diff --git a/net/openvswitch/Kconfig b/net/openvswitch/Kconfig
> index 1584040..c56f4d4 100644
> --- a/net/openvswitch/Kconfig
> +++ b/net/openvswitch/Kconfig
> @@ -34,7 +34,6 @@ config OPENVSWITCH
>  config OPENVSWITCH_GRE
> tristate "Open vSwitch GRE tunneling support"
> depends on OPENVSWITCH
> -   depends on NET_IPGRE_DEMUX
> default OPENVSWITCH
> ---help---
>   If you say Y here, then the Open vSwitch will be able create GRE

Doesn't this need to depend on NET_IPGRE now instead?


[PATCH v2 4/4] Added getsynctime64() callback

2015-08-07 Thread Christopher Hall
Reads ART (TSC correlated clocksource), converts to realtime clock, and
reports cross timestamp to PTP driver
---
 drivers/net/ethernet/intel/e1000e/defines.h |  7 +++
 drivers/net/ethernet/intel/e1000e/ptp.c     | 88 +
 drivers/net/ethernet/intel/e1000e/regs.h    |  4 ++
 3 files changed, 99 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/defines.h b/drivers/net/ethernet/intel/e1000e/defines.h
index 133d407..9f16269 100644
--- a/drivers/net/ethernet/intel/e1000e/defines.h
+++ b/drivers/net/ethernet/intel/e1000e/defines.h
@@ -527,6 +527,13 @@
 #define E1000_RXCW_C  0x2000 /* Receive config */
 #define E1000_RXCW_SYNCH  0x4000 /* Receive config synch */
 
+/* HH Time Sync */
+#define E1000_TSYNCTXCTL_MAX_ALLOWED_DLY_MASK   0xF000  /* max delay */
+#define E1000_TSYNCTXCTL_SYNC_COMP  0x4000  /* sync complete
+ */
+#define E1000_TSYNCTXCTL_START_SYNC 0x8000  /* initiate sync
+ */
+
 #define E1000_TSYNCTXCTL_VALID 0x0001 /* Tx timestamp valid */
 #define E1000_TSYNCTXCTL_ENABLED   0x0010 /* enable Tx timestamping */
 
diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c b/drivers/net/ethernet/intel/e1000e/ptp.c
index 25a0ad5..c3d80c4 100644
--- a/drivers/net/ethernet/intel/e1000e/ptp.c
+++ b/drivers/net/ethernet/intel/e1000e/ptp.c
@@ -25,6 +25,8 @@
  */
 
 #include "e1000.h"
+#include 
+#include 
 
 /**
  * e1000e_phc_adjfreq - adjust the frequency of the hardware clock
@@ -98,6 +100,91 @@ static int e1000e_phc_adjtime(struct ptp_clock_info *ptp, s64 delta)
return 0;
 }
 
+#define HW_WAIT_COUNT (2)
+#define HW_RETRY_COUNT (2)
+
+static int e1000e_phc_get_ts(struct correlated_ts *cts)
+{
+   struct e1000_adapter *adapter = (struct e1000_adapter *) cts->private;
+   struct e1000_hw *hw = &adapter->hw;
+   int i, j;
+   u32 tsync_ctrl;
+   int ret;
+
+   if (hw->mac.type < e1000_pch_spt)
+   return -EOPNOTSUPP;
+
+   for (j = 0; j < HW_RETRY_COUNT; ++j) {
+   tsync_ctrl = er32(TSYNCTXCTL);
+   tsync_ctrl |= E1000_TSYNCTXCTL_START_SYNC |
+   E1000_TSYNCTXCTL_MAX_ALLOWED_DLY_MASK;
+   ew32(TSYNCTXCTL, tsync_ctrl);
+   ret = 0;
+   for (i = 0; i < HW_WAIT_COUNT; ++i) {
+   udelay(2);
+   tsync_ctrl = er32(TSYNCTXCTL);
+   if (tsync_ctrl & E1000_TSYNCTXCTL_SYNC_COMP)
+   break;
+   }
+
+   if (i == HW_WAIT_COUNT) {
+   ret = -ETIMEDOUT;
+   } else if (ret == 0) {
+   cts->system_ts = er32(PLTSTMPH);
+   cts->system_ts <<= 32;
+   cts->system_ts |= er32(PLTSTMPL);
+   cts->device_ts = er32(SYSSTMPH);
+   cts->device_ts <<= 32;
+   cts->device_ts |= er32(SYSSTMPL);
+   break;
+   }
+   }
+
+   return ret;
+}
+
+#define SYNCTIME_RETRY_COUNT (2)
+
+static int e1000e_phc_getsynctime(struct ptp_clock_info *ptp,
+ struct timespec64 *devts,
+ struct timespec64 *systs)
+{
+   struct e1000_adapter *adapter = container_of(ptp, struct e1000_adapter,
+ptp_clock_info);
+   unsigned long flags;
+   u32 remainder;
+   struct correlated_ts art_correlated_ts;
+   u64 device_time;
+   int i, ret;
+
+   if (!cpu_has_art)
+   return -EOPNOTSUPP;
+
+   for (i = 0; i < SYNCTIME_RETRY_COUNT; ++i) {
+   art_correlated_ts.get_ts = e1000e_phc_get_ts;
+   art_correlated_ts.private = adapter;
+   ret = get_correlated_timestamp(&art_correlated_ts,
+   &art_timestamper);
+   if (ret != 0)
+   continue;
+
+   systs->tv_sec =
+   div_u64_rem(art_correlated_ts.system_real.tv64,
+   NSEC_PER_SEC, &remainder);
+   systs->tv_nsec = remainder;
+   spin_lock_irqsave(&adapter->systim_lock, flags);
+   device_time = timecounter_cyc2time(&adapter->tc,
+   art_correlated_ts.device_ts);
+   spin_unlock_irqrestore(&adapter->systim_lock, flags);
+   devts->tv_sec =
+   div_u64_rem(device_time, NSEC_PER_SEC, &remainder);
+   devts->tv_nsec = remainder;
+   break;
+   }
+
+   return ret;
+}
+
 /**
  * e1000e_phc_gettime - Reads the current time from the hardware clock
  * @ptp: ptp clock structure
@@ -190,6 +277,7 @@ static const struct ptp_clock_info e1000e_ptp_clock_info = {
.adjfreq= e1000e_phc_adjfreq,
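
The getsynctime callback in this patch splits a 64-bit nanosecond count into seconds and nanoseconds with div_u64_rem(). Here is a user-space sketch of the same split, using plain 64-bit division in place of the kernel helper. The names struct ts64 and ns_to_ts() are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the kernel's struct timespec64. */
struct ts64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Split an unsigned nanosecond count into whole seconds and a remainder. */
static struct ts64 ns_to_ts(uint64_t ns)
{
	struct ts64 ts;

	ts.tv_sec  = (int64_t)(ns / NSEC_PER_SEC);
	ts.tv_nsec = (long)(ns % NSEC_PER_SEC);	/* always below 10^9 */
	return ts;
}
```

In the kernel, div_u64_rem() exists mainly so that 32-bit architectures avoid an open-coded 64-bit division; in user space the compiler handles that, so ordinary operators are enough for this sketch.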
 

[PATCH v2 1/4] Add generic correlated clocksource code and ART to TSC conversion code

2015-08-07 Thread Christopher Hall
Original patch description:

Subject: ptp: Get sync timestamps
From: Thomas Gleixner 
Date: Wed, 29 Jul 2015 10:52:06 +0200

The ART stuff wants to be splitted out.

 Changes ===

Add struct correlated_cs (clocksource) with pointer to original clocksource
and function pointer to convert correlated clocksource to the original

Add struct correlated_ts (timestamp) with function pointer to read correlated
clocksource, device and system (in terms of correlated clocksource)
counter values (input) with resulting converted real and monotonic raw
system times (output)

Add get_correlated_timestamp() function which, given a specific correlated_cs
and correlated_ts, converts a correlated counter value to system time

Add art_to_tsc conversion function translating Always Running Timer (ART)
values to TSC values
---
 arch/x86/kernel/tsc.c   | 31 ++
 include/linux/clocksource.h | 30 +
 include/linux/timekeeping.h |  4 +++
 kernel/time/timekeeping.c   | 63 +
 4 files changed, 128 insertions(+)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 7437b41..a90aa6a 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1059,6 +1059,27 @@ int unsynchronized_tsc(void)
return 0;
 }
 
+static u32 tsc_numerator;
+static u32 tsc_denominator;
+/*
+ * CHECKME: Do we need the adjust value? It should be 0, but if we run
+ * in a VM this might be a different story.
+ */
+static u64 tsc_adjust;
+
+static u64 art_to_tsc(u64 cycles)
+{
+   u64 tmp, res = tsc_adjust;
+
+   res += (cycles / tsc_denominator) * tsc_numerator;
+   tmp = (cycles % tsc_denominator) * tsc_numerator;
+   res += tmp / tsc_denominator;
+   return res;
+}
+
+struct correlated_cs art_timestamper = {
+   .convert= art_to_tsc,
+};
 
 static void tsc_refine_calibration_work(struct work_struct *work);
 static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
@@ -1129,6 +1150,16 @@ static void tsc_refine_calibration_work(struct work_struct *work)
(unsigned long)tsc_khz / 1000,
(unsigned long)tsc_khz % 1000);
 
+   /*
+* TODO:
+*
+* If the system has ART, initialize the art_to_tsc conversion
+* and set: art_timestamp.related_cs = &tsc_clocksource.
+*
+* Before that point a call to get_correlated_timestamp will
+* fail the clocksource match check.
+*/
+
 out:
clocksource_register_khz(&clocksource_tsc, tsc_khz);
 }
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 278dd27..2ed3d0c 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -258,4 +258,34 @@ void acpi_generic_timer_init(void);
 static inline void acpi_generic_timer_init(void) { }
 #endif
 
+/**
+ * struct correlated_cs - Descriptor for a clocksource correlated to another clocksource
+ * @related_cs: Pointer to the related timekeeping clocksource
+ * @convert:    Conversion function to convert a timestamp from
+ *              the correlated clocksource to cycles of the related
+ *              timekeeping clocksource
+ */
+struct correlated_cs {
+   struct clocksource  *related_cs;
+   u64 (*convert)(u64 cycles);
+};
+
+struct correlated_ts;
+
+/**
+ * struct correlated_ts - Descriptor for taking a correlated time stamp
+ * @get_ts:      Function to read out a synced system and device
+ *               timestamp
+ * @system_ts:   The raw system clock timestamp
+ * @device_ts:   The raw device timestamp
+ * @system_real: @system_ts converted to CLOCK_REALTIME
+ * @system_raw:  @system_ts converted to CLOCK_MONOTONIC_RAW
+ */
+struct correlated_ts {
+   int (*get_ts)(struct correlated_ts *ts);
+   u64 system_ts;
+   u64 device_ts;
+   u64 system_real;
+   u64 system_raw;
+};
 #endif /* _LINUX_CLOCKSOURCE_H */
diff --git a/include/linux/timekeeping.h b/include/linux/timekeeping.h
index 6e191e4..a9e1a2d 100644
--- a/include/linux/timekeeping.h
+++ b/include/linux/timekeeping.h
@@ -258,6 +258,10 @@ extern void timekeeping_inject_sleeptime64(struct timespec64 *delta);
  */
 extern void getnstime_raw_and_real(struct timespec *ts_raw,
   struct timespec *ts_real);
+struct correlated_ts;
+struct correlated_cs;
+extern int get_correlated_timestamp(struct correlated_ts *crt,
+   struct correlated_cs *crs);
 
 /*
  * Persistent clock related interfaces
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index bca3667..769a04b 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -312,6 +312,19 @@ static inline s64 timekeeping_get_ns(st

[PATCH v2 2/4] Add ART initialization code

2015-08-07 Thread Christopher Hall
Add private struct correlated_ts member used by get_ts() code

Add EXPORTs making get_correlated_timestamp() function and
art_timestamper accessible

Add special case for denominator of 2 (art_to_tsc())
---
 arch/x86/include/asm/cpufeature.h |  3 ++-
 arch/x86/include/asm/tsc.h        |  1 +
 arch/x86/kernel/tsc.c             | 42 ++-
 include/linux/clocksource.h       |  8 +---
 kernel/time/timekeeping.c         |  1 +
 5 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 3d6606f..a9322e5 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -85,7 +85,7 @@
 #define X86_FEATURE_P4 ( 3*32+ 7) /* "" P4 */
 #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
 #define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */
-/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * "" FXSAVE leaks FOP/FIP/FOP */
+#define X86_FEATURE_ART (3*32+10) /* Platform has always running timer (ART) */
 #define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
 #define X86_FEATURE_PEBS   ( 3*32+12) /* Precise-Event Based Sampling */
 #define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */
@@ -352,6 +352,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 #define cpu_has_de   boot_cpu_has(X86_FEATURE_DE)
 #define cpu_has_pse  boot_cpu_has(X86_FEATURE_PSE)
 #define cpu_has_tsc  boot_cpu_has(X86_FEATURE_TSC)
+#define cpu_has_art  boot_cpu_has(X86_FEATURE_ART)
 #define cpu_has_pge  boot_cpu_has(X86_FEATURE_PGE)
 #define cpu_has_apic boot_cpu_has(X86_FEATURE_APIC)
 #define cpu_has_sep  boot_cpu_has(X86_FEATURE_SEP)
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 94605c0..0089991 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -53,6 +53,7 @@ extern int check_tsc_disabled(void);
 extern unsigned long native_calibrate_tsc(void);
 
 extern int tsc_clocksource_reliable;
+extern struct correlated_cs art_timestamper;
 
 /*
  * Boot-time check whether the TSCs are synchronized across
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a90aa6a..0a2f336 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1059,6 +1059,9 @@ int unsynchronized_tsc(void)
return 0;
 }
 
+#define ART_CPUID_LEAF (0x15)
+#define ART_MIN_DENOMINATOR (2)
+
 static u32 tsc_numerator;
 static u32 tsc_denominator;
 /*
@@ -1067,19 +1070,29 @@ static u32 tsc_denominator;
  */
 static u64 tsc_adjust;
 
-static u64 art_to_tsc(u64 cycles)
+static u64 art_to_tsc(struct correlated_cs *cs, u64 cycles)
 {
u64 tmp, res = tsc_adjust;
 
-   res += (cycles / tsc_denominator) * tsc_numerator;
-   tmp = (cycles % tsc_denominator) * tsc_numerator;
-   res += tmp / tsc_denominator;
+   switch (tsc_denominator) {
+   default:
+   res += (cycles / tsc_denominator) * tsc_numerator;
+   tmp = (cycles % tsc_denominator) * tsc_numerator;
+   res += tmp / tsc_denominator;
+   break;
+   case 2:
+   res += (cycles >> 1) * tsc_numerator;
+   tmp = (cycles & 0x1) * tsc_numerator;
+   res += tmp >> 1;
+   break;
+   }
return res;
 }
 
 struct correlated_cs art_timestamper = {
.convert= art_to_tsc,
 };
+EXPORT_SYMBOL(art_timestamper);
 
 static void tsc_refine_calibration_work(struct work_struct *work);
 static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
@@ -1103,6 +1116,7 @@ static void tsc_refine_calibration_work(struct work_struct *work)
static int hpet;
u64 tsc_stop, ref_stop, delta;
unsigned long freq;
+   unsigned int unused[2];
 
/* Don't bother refining TSC on unstable systems */
if (check_tsc_unstable())
@@ -1150,17 +1164,17 @@ static void tsc_refine_calibration_work(struct work_struct *work)
(unsigned long)tsc_khz / 1000,
(unsigned long)tsc_khz % 1000);
 
-   /*
-* TODO:
-*
-* If the system has ART, initialize the art_to_tsc conversion
-* and set: art_timestamp.related_cs = &tsc_clocksource.
-*
-* Before that point a call to get_correlated_timestamp will
-* fail the clocksource match check.
-*/
-
 out:
+   if (boot_cpu_data.cpuid_level >= ART_CPUID_LEAF) {
+   cpuid(ART_CPUID_LEAF, &tsc_denominator, &tsc_numerator, unused,
+ unused+1);
+
+   if (tsc_denominator >= ART_MIN_DENOMINATOR) {
+   set_cpu_cap(&boot_cpu_data, X86_FEATURE_ART);
+   art_timestamper.related_cs = &clocksource_tsc;
+   }
+   }
+
clocksource_register
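
For a denominator of 2, the division and modulo in art_to_tsc() reduce to a shift and a mask, which is what the special case in this patch relies on. A minimal user-space check of that equivalence follows; scale_generic() and scale_den2() are hypothetical names chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Generic path: floor(cycles * num / den) without a 128-bit product. */
static uint64_t scale_generic(uint64_t cycles, uint32_t num, uint32_t den)
{
	assert(den != 0);
	return (cycles / den) * num + ((cycles % den) * num) / den;
}

/* Denominator-of-2 path: the same result using a shift and a mask. */
static uint64_t scale_den2(uint64_t cycles, uint32_t num)
{
	return (cycles >> 1) * num + (((cycles & 0x1) * num) >> 1);
}
```

Whether the special case is worth carrying depends on how the compiler and CPU handle a 64-bit division by a value only known at run time; the sketch only shows that the two paths agree.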

[PATCH v2 0/4] Patchset enabling hardware based cross-timestamps for next gen Intel platforms

2015-08-07 Thread Christopher Hall
6th generation Intel platforms will have an Always Running
Timer (ART) that always runs when the system is powered and
is available to both the CPU and various on-board devices.
Initially, those devices include audio and network.  The
ART will give these devices the capability of precisely
cross timestamping their local device clock with the system
clock.

A system clock value like TSC or ART is not useful
unless translated to system time. The first *two* patches
enable this by changing the timekeeping code to return a
system time given a system clock value and translating ART
to TSC.

The last two patches modify the PTP driver to call a
cross timestamp function in the driver when available and
perform the cross timestamp in the e1000e driver.

Knowing the precise relationship between the network device
clock and system time enables better synchronization of
events on multiple network connected devices.

Changelog:

* The PTP portion of the patch set was posted 7/8/2015 (v3)
  and rejected because there wasn't a driver that
  implemented the new API.  Now, the driver patch is added
  and the PTP patch operation is modified to revert to
  previous behavior when cross timestamp can't be
  completed.  This is indicated by the driver returning a
  non-zero value.

* v2 re-submit based on tglx provided correlated clocksource patch.  This has
  been included verbatim as the first patch in the series.  Additions and
  modifications are in the second patch.  The PTP driver patch is unchanged
  and the e1000e driver patch uses the *new* correlated clocksource interface
  but is otherwise (in terms of hardware and PTP driver) unchanged.

* ART is removed as a compile option

* ART is added as an X86_FEATURE

Christopher Hall (4):
  Add generic correlated clocksource code and ART to TSC conversion code
  Add ART initialization code
  Add support for driver cross-timestamp to PTP_SYS_OFFSET ioctl
  Added getsynctime64() callback

 Documentation/ptp/testptp.c                 |  6 +-
 arch/x86/include/asm/cpufeature.h           |  3 +-
 arch/x86/include/asm/tsc.h                  |  1 +
 arch/x86/kernel/tsc.c                       | 45 +++
 drivers/net/ethernet/intel/e1000e/defines.h |  7 +++
 drivers/net/ethernet/intel/e1000e/ptp.c     | 88 +
 drivers/net/ethernet/intel/e1000e/regs.h    |  4 ++
 drivers/ptp/ptp_chardev.c                   | 29 +++---
 include/linux/clocksource.h                 | 32 +++
 include/linux/ptp_clock_kernel.h            |  7 +++
 include/linux/timekeeping.h                 |  4 ++
 include/uapi/linux/ptp_clock.h              |  4 +-
 kernel/time/timekeeping.c                   | 64 +
 13 files changed, 282 insertions(+), 12 deletions(-)

-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/4] Add support for driver cross-timestamp to PTP_SYS_OFFSET ioctl

2015-08-07 Thread Christopher Hall
This patch allows the capture of system and device time ("cross-timestamp")
to be performed by the driver. Currently, the cross-timestamping is performed
in the PTP_SYS_OFFSET ioctl.  The PTP clock driver reads the system time with
getnstimeofday64() and the device time with the gettime64() callback provided
by the driver. The cross-timestamp is best effort: the latency between the
capture of system time (getnstimeofday64()) and the device time (driver
callback) may be significant.
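
To make the "best effort" limitation concrete, here is a minimal userspace
sketch of how an application might turn one system/device/system triple into
an offset estimate. The struct and function names are illustrative only, and
the midpoint assumption (the device clock was read halfway between the two
system clock reads) is exactly the guess that a precise hardware
cross-timestamp would make unnecessary.

```c
#include <assert.h>
#include <stdint.h>

/*
 * One best-effort sample: system time read before and after the device
 * time, all in nanoseconds. Field names are illustrative.
 */
struct xts_sample {
    int64_t sys_before_ns;
    int64_t dev_ns;
    int64_t sys_after_ns;
};

/* Width of the window in which the device clock read actually happened. */
static int64_t capture_window_ns(const struct xts_sample *s)
{
    return s->sys_after_ns - s->sys_before_ns;
}

/*
 * Estimate (device - system), assuming the device clock was read halfway
 * between the two system clock reads. The estimate can be wrong by up to
 * half of the capture window.
 */
static int64_t estimate_offset_ns(const struct xts_sample *s)
{
    int64_t window = capture_window_ns(s);

    assert(window >= 0);
    return s->dev_ns - (s->sys_before_ns + window / 2);
}
```

With a 200ns window the estimate carries up to 100ns of uncertainty, however
stable the clocks themselves are.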

This patch adds an additional callback, getsynctime64(), which will be
called when the driver is able to perform a more accurate,
implementation-specific cross-timestamping.  For example, future network devices that
implement PCIE PTM will be able to precisely correlate the device clock
with the system clock with virtually zero latency between captures.
This added callback can be used by the driver to expose this functionality.

The callback, getsynctime64(), will only be called when defined and
n_samples == 1, because the driver returns only one cross-timestamp and
multiple samples cannot be chained together.

This patch also adds to the capabilities ioctl (PTP_CLOCK_GETCAPS),
allowing applications to query whether or not drivers implement the
getsynctime callback, providing more precise cross timestamping.

Commit Details:

Added additional callback to ptp_clock_info:

* getsynctime64()

This takes 2 arguments referring to system and device time

With this callback drivers may provide both system time and device time
to ensure precise correlation

Modified PTP_SYS_OFFSET ioctl in PTP clock driver to use the above
callback if it's available

Added capability (PTP_CLOCK_GETCAPS) for checking whether driver supports
cross timestamping

Added check for cross timestamping flag to testptp.c
---
 Documentation/ptp/testptp.c  |  6 --
 drivers/ptp/ptp_chardev.c| 29 +
 include/linux/ptp_clock_kernel.h |  7 +++
 include/uapi/linux/ptp_clock.h   |  4 +++-
 4 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/Documentation/ptp/testptp.c b/Documentation/ptp/testptp.c
index 2bc8abc..8004efd 100644
--- a/Documentation/ptp/testptp.c
+++ b/Documentation/ptp/testptp.c
@@ -276,13 +276,15 @@ int main(int argc, char *argv[])
   "  %d external time stamp channels\n"
   "  %d programmable periodic signals\n"
   "  %d pulse per second\n"
-  "  %d programmable pins\n",
+  "  %d programmable pins\n"
+  "  %d cross timestamping\n",
   caps.max_adj,
   caps.n_alarm,
   caps.n_ext_ts,
   caps.n_per_out,
   caps.pps,
-  caps.n_pins);
+  caps.n_pins,
+  caps.cross_timestamping);
}
}
 
diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index da7bae9..392ccfa 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -124,7 +124,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
struct ptp_clock *ptp = container_of(pc, struct ptp_clock, clock);
struct ptp_clock_info *ops = ptp->info;
struct ptp_clock_time *pct;
-   struct timespec64 ts;
+   struct timespec64 ts, systs;
int enable, err = 0;
unsigned int i, pin_index;
 
@@ -138,6 +138,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
caps.n_per_out = ptp->info->n_per_out;
caps.pps = ptp->info->pps;
caps.n_pins = ptp->info->n_pins;
+   caps.cross_timestamping = ptp->info->getsynctime64 != NULL;
if (copy_to_user((void __user *)arg, &caps, sizeof(caps)))
err = -EFAULT;
break;
@@ -196,19 +197,31 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
break;
}
pct = &sysoff->ts[0];
-   for (i = 0; i < sysoff->n_samples; i++) {
-   getnstimeofday64(&ts);
+   if (ptp->info->getsynctime64 && sysoff->n_samples == 1 &&
+   ptp->info->getsynctime64(ptp->info, &ts, &systs) == 0) {
+   pct->sec = systs.tv_sec;
+   pct->nsec = systs.tv_nsec;
+   pct++;
pct->sec = ts.tv_sec;
pct->nsec = ts.tv_nsec;
pct++;
-   ptp->info->gettime64(ptp->info, &ts);
+   pct->sec = systs.tv_sec;
+   pct->nsec = systs.tv_nsec;
+   } else {
+   for (i = 0; i < sysoff->n_samples; i++) {
+   getnstimeofda

Re: [PATCH net-next] net: Fix race condition in store_rps_map

2015-08-07 Thread David Miller
From: Tom Herbert 
Date: Wed, 5 Aug 2015 09:39:27 -0700

> There is a race condition in store_rps_map that allows jump label
> count in rps_needed to go below zero. This can happen when
> concurrently attempting to set and clear a map.
> 
> Scenario:
> 
> 1. rps_needed count is zero
> 2. New map is assigned by setting thread, but rps_needed count _not_ yet
>    incremented (rps_needed count still zero)
> 3. Map is cleared by second thread, old_map set to that just assigned
> 4. Second thread performs static_key_slow_dec, rps_needed count now goes
>    negative
> 
> Fix is to increment or decrement rps_needed under the spinlock.
> 
> Signed-off-by: Tom Herbert 

Applied, thanks Tom.
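
For readers following the thread, the fix described above can be sketched in
userspace as follows. This is only an illustration of the idea, not the
kernel code: "needed" stands in for the rps_needed count, "map" for the RPS
map pointer, and a C11 atomic_flag spin loop stands in for the spinlock. All
names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-ins for the spinlock, the map pointer and the count. */
static atomic_flag map_lock = ATOMIC_FLAG_INIT;
static const void *map;
static int needed;

static void lock_map(void)
{
    while (atomic_flag_test_and_set_explicit(&map_lock,
                                             memory_order_acquire))
        ;   /* spin until the flag is released */
}

static void unlock_map(void)
{
    atomic_flag_clear_explicit(&map_lock, memory_order_release);
}

/*
 * Install new_map (NULL clears the map). The count is adjusted inside the
 * same critical section as the pointer swap, so no other thread can see
 * the new map before the matching increment has happened.
 */
static void store_map(const void *new_map)
{
    const void *old_map;

    lock_map();
    old_map = map;
    map = new_map;
    if (new_map)
        needed++;
    if (old_map)
        needed--;
    assert(needed >= 0);   /* the invariant the race used to break */
    unlock_map();
}
```

Because the increment and decrement happen while the lock is held, a clearing
thread can only decrement for a map whose increment is already visible.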


Re: [PATCH net-next] net/mlx5_core: Set log_uar_page_sz for non 4K page size architecture

2015-08-07 Thread David Miller
From: Amir Vadai 
Date: Thu, 6 Aug 2015 14:01:11 +0300

> On 8/5/2015 7:05 PM, cls...@linux.vnet.ibm.com wrote:
>> From: Carol L Soto 
>> 
>> failed to configure the page size for architectures with page size
>> different than 4K.
>> 
>> Signed-off-by: Carol L Soto 
>> ---
> 
> Please pull this patch into kernel 4.2
> 
> Fixes: 938fe83 ("net/mlx5_core: New device capabilities handling")
> Acked-by: Amir Vadai 

Applied to 'net', thanks.


Re: pull request [net]: batman-adv fixes 20150805

2015-08-07 Thread David Miller
From: Antonio Quartulli 
Date: Wed,  5 Aug 2015 14:51:43 +0200

>   git://git.open-mesh.org/linux-merge.git tags/batman-adv-fix-for-davem

Pulled and queued up for -stable, thanks.


Re: [v2 8/9] dpaa_eth: add debugfs entries

2015-08-07 Thread David Miller
From: Madalin Bucur 
Date: Wed, 5 Aug 2015 18:41:28 +0300

> Export per CPU counters through debugfs.
> 
> Signed-off-by: Madalin Bucur 

This is absolutely inappropriate.

You can export these just fine via ethtool statistics.  There is zero reason
to add ugly debugfs crap for something like this.


Re: [PATCH net-next] r8169:Issues on alloc memory

2015-08-07 Thread David Miller
From: Corcodel Marian 
Date: Wed,  5 Aug 2015 18:41:17 +0300

> +#define R8169_TX_RING_BYTES  (NUM_ARRAYS_MAX * sizeof(struct TxDesc)) /* 
> here sizeof not reporting correct */
> +#define R8169_RX_RING_BYTES  (NUM_ARRAYS_MAX * sizeof(struct RxDesc)) /* 
> here sizeof not reporting correct */

This comment makes the code more confusing rather than easier to
understand.

In fact I fail to see the reason for this patch, as a whole, at all.

I hate to do this, but I am explicitly letting you know that I am not going
to invest any more time reviewing any submissions you make.

They are all poorly formed, lack proper complete explanations, are
buggy, or add no value at all to the driver.

And the situation is not improving.  This has been going on for several
weeks now, and I have to draw the line somewhere.


Re: [v4, 0/9] Freescale DPAA FMan

2015-08-07 Thread David Miller
From: 
Date: Wed, 5 Aug 2015 12:25:16 +0300

> The Freescale Data Path Acceleration Architecture (DPAA)
> is a set of hardware components on specific QorIQ multicore processors.
> This architecture provides the infrastructure to support simplified
> sharing of networking interfaces and accelerators by multiple CPU cores
> and the accelerators.

I think the directory and code structure of this new driver is
quite excessive.

Because you've split things up _so_ much, you have to have
all of these directories, and even worse and much more important
to me you have to export so many functions from one source file
to another.

I think this is way too much.

For example, in one file you have a bunch of initialization routines.
init_a(), init_b(), init_c(), and you export them all.  Then they
are always called in sequence:

init_a();
init_b();
init_c();

This is completely pointless.  You just needed to export one
function which calls all three functions.
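
A minimal sketch of that suggestion, with placeholder stage bodies: the three
stages stay static (private to the source file) and only one entry point is
visible to other files. The names init_all() and steps_done are made up for
the example.

```c
#include <assert.h>

static int steps_done;   /* illustrative state touched by each stage */

/* The individual stages stay private to this source file. */
static int init_a(void)
{
    steps_done |= 1;
    return 0;
}

static int init_b(void)
{
    assert(steps_done & 1);   /* init_b relies on init_a having run */
    steps_done |= 2;
    return 0;
}

static int init_c(void)
{
    assert(steps_done & 2);   /* init_c relies on init_b having run */
    steps_done |= 4;
    return 0;
}

/* The only function other files need; stops at the first failing stage. */
int init_all(void)
{
    int err;

    err = init_a();
    if (err)
        return err;
    err = init_b();
    if (err)
        return err;
    return init_c();
}
```

Callers can no longer get the ordering wrong, and only one symbol lands in
the shared namespace.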

The namespace pollution of this driver is out of control.

You really need to completely rework the architecture and
layout of this driver before I will even begin to review it
again.

And the lack of review interest by other developers should be an
indication to you how undesirable this code submission is to read.

Thanks.


Re: [PATCH net-next 1/9] openvswitch: Scrub packet in ovs_vport_receive()

2015-08-07 Thread Jesse Gross
On Tue, Aug 4, 2015 at 9:40 PM, Joe Stringer  wrote:
> On 1 August 2015 at 12:17, Thomas Graf  wrote:
>> On 07/31/15 at 10:51am, Joe Stringer wrote:
>>> On 31 July 2015 at 07:34, Hannes Frederic Sowa  wrote:
>>> > In general, this shouldn't be necessary as the packet should already be
>>> > scrubbed before they arrive here.
>>> >
>>> > Could you maybe add a WARN_ON and check how those skbs with conntrack
>>> > data traverse the stack? I also didn't understand why make it dependent
>>> > on the socket.
>>>
>>> OK, sure. One case I could think of is with an OVS internal port in
>>> another namespace, directly attached to the bridge. I'll have a play
>>> around with WARN_ON and see if I can come up with something more
>>> trimmed down.
>>
>> The OVS internal port will definitely pass through an unscrubbed
>> packet across namespaces. I think the proper thing to do would be
>> to scrub but conditionally keep metadata.
>
> It's only "unscrubbed" when receiving from local stack at the moment.
> Some pieces are cleared when handing towards the local stack, and
> there's no configuration for that behaviour. Presumably internal port
> transmit and receive should mirror each other?
>
> I don't have a specific use case either way. The remaining code for
> this series handles this case correctly, it's just a matter of what
> behaviour we're looking for. We could implement the flag as you say, I
> presume that userspace would need to specify this during vport
> creation and the default should work similar to the existing behaviour
> (ie, keep metadata). One thing that's not entirely clear to me is
> exactly which metadata should be represented by this flag and whether
> the single flag is expressive enough.

I would prefer not to have a flag as it seems unnecessarily
complicated (doubly so if we try to have multiple flags to express
different combinations). The use case for moving internal ports to
different namespaces is pretty narrow, so it seems like we can just
pick a set of metadata to keep and go with that. Mark seems the
primary one to me.

I also think that it would be better to use skb->dev to determine the
original namespace rather than the socket.


Re: [PATCH 1/5] device property: helper macros for property entry creation

2015-08-07 Thread Rafael J. Wysocki
On Thursday, August 06, 2015 10:48:48 AM Heikki Krogerus wrote:
> On Wed, Aug 05, 2015 at 05:02:18PM +0300, Andy Shevchenko wrote:
> > On Wed, 2015-08-05 at 16:39 +0300, Heikki Krogerus wrote:
> > > Macros for easier creation of built-in property entries.
> > > 
> > > Signed-off-by: Heikki Krogerus 
> > > ---
> > >  include/linux/property.h | 35 +++
> > >  1 file changed, 35 insertions(+)
> > > 
> > > diff --git a/include/linux/property.h b/include/linux/property.h
> > > index 76ebde9..204d899 100644
> > > --- a/include/linux/property.h
> > > +++ b/include/linux/property.h
> > > @@ -152,6 +152,41 @@ struct property_entry {
> > >   } value;
> > >  };
> > >  
> > > +#define PROP_ENTRY_U8(_name_, _val_) { \
> > 
> > PROP_ prefix is too generic.
> > Maybe DEVPROP_ ? At least for the latter no records in the current
> > sources.
> 
> I disagree with that. IMO this kind of macros should ideally resemble
> the structure name they are used to fill (struct property_entry in
> this case). And there are already definitions for DEV_PROP_* to
> describe the types, so using something like DEVPROP_* here is just
> confusing.
> 
> If PROP_ENTRY_* is really not good enough, we can change them to
> PROPERTY_ENTRY_*. But is PROP_ENTRY_* really so bad?
> 
> Rafael, what is your opinion?

I would prefer PROPERTY_ENTRY_ to be honest.  It's not like we need to save
characters here.

Thanks,
Rafael



tcp_update_metrics() fail fast before declaring variables

2015-08-07 Thread Donatas Abraitis
Hi folks,

one short question regarding the net.ipv4.tcp_no_metrics_save sysctl
parameter. The code snippet is the following:

void tcp_update_metrics(struct sock *sk)
{
  struct tcp_sock *tp = tcp_sk(sk);
  struct dst_entry *dst = __sk_dst_get(sk);

  if (sysctl_tcp_nometrics_save)
return;

Why can't this 'if' statement be moved before the variable declarations?
I think failing fast could save some instructions here. Any thoughts?
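
To make the proposal concrete, here is a userspace mock of the reordered
function. The types and helpers (mock_sock, mock_dst_get(), dst_lookups) are
stand-ins invented for the example, not kernel APIs. Whether the real
function gains anything is an open question: tcp_sk() is only a pointer cast,
and an optimizing compiler may already defer initializers that are not needed
on the early-return path.

```c
#include <assert.h>
#include <stddef.h>

/* Mock types standing in for struct sock and struct dst_entry. */
struct mock_dst { int metrics; };
struct mock_sock { struct mock_dst *dst; };

static int nometrics_save;   /* stands in for sysctl_tcp_nometrics_save */
static int dst_lookups;      /* counts how often the route lookup ran */

static struct mock_dst *mock_dst_get(struct mock_sock *sk)
{
    dst_lookups++;
    return sk->dst;
}

/* Fail-fast variant: test the sysctl before looking anything up. */
static void update_metrics(struct mock_sock *sk)
{
    struct mock_dst *dst;

    assert(sk != NULL);
    if (nometrics_save)
        return;

    dst = mock_dst_get(sk);
    if (dst)
        dst->metrics++;
}
```

With the sysctl set, the lookup counter stays at zero, which is the saving
being asked about.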

Here are the performance benchmarking results:

Test #1 (sysctl -w net.ipv4.tcp_no_metrics_save=0)
sum: 79726, avg: 1295ns
min: 453ns, max: 16487ns

Test #2 (sysctl -w net.ipv4.tcp_no_metrics_save=1)
sum: 77515, avg: 1181ns
min: 431ns, max: 15989ns

Benchmarked with Systemtap:

global c;
probe kernel.function("tcp_update_metrics").return
{
c <<< (gettimeofday_ns() - @entry(gettimeofday_ns()));
}
probe timer.s(60)
{
printf("sum: %d, avg: %dns\n", @count(c), @avg(c));
printf("min: %dns, max: %dns\n", @min(c), @max(c));
delete c;
exit();
}


Re: [PATCH net-next v3] openvswitch: Make 100 percents packets sampled when sampling rate is 1.

2015-08-07 Thread David Miller
From: Wenyu Zhang 
Date: Wed, 5 Aug 2015 00:30:47 -0700

> When sampling rate is 1, the sampling probability is UINT32_MAX. The packet
> should be sampled even if prandom32() generates UINT32_MAX,
> and no packet needs to be sampled when the probability is 0.
> 
> Signed-off-by: Wenyu Zhang 

Applied.
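
The boundary behaviour described in the commit message can be sketched like
this. It is an illustration under stated assumptions, not the code from the
patch: should_sample() is a hypothetical helper, and the random number is
passed in by the caller so the boundary cases are easy to check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether to sample a packet. 'probability' is scaled so that
 * UINT32_MAX means "always" and 0 means "never"; 'rnd' is a uniformly
 * distributed 32-bit random number supplied by the caller.
 */
static bool should_sample(uint32_t probability, uint32_t rnd)
{
    if (probability == 0)
        return false;

    /*
     * "<=" rather than "<": with "<", rnd == UINT32_MAX would be
     * rejected even when probability == UINT32_MAX.
     */
    return rnd <= probability;
}

/* Quick self-check of the two boundary cases from the commit message. */
static void check_boundaries(void)
{
    assert(should_sample(UINT32_MAX, UINT32_MAX));
    assert(!should_sample(0, 0));
}
```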


[RFC PATCH net-next] tcp: reduce cpu usage under tcp memory pressure when SO_SNDBUF is set

2015-08-07 Thread Jason Baron
From: Jason Baron 

When SO_SNDBUF is set and we are under tcp memory pressure, the effective write
buffer space can be much lower than what was set using SO_SNDBUF. For example,
we may have set the buffer to 100kb, but we may only be able to write 10kb. In
this scenario poll()/select()/epoll() are going to continuously return POLLOUT,
followed by -EAGAIN from write() in a very tight loop.

Introduce sk->sk_effective_sndbuf, such that we can track the 'effective' size
of the sndbuf, when we have a short write due to memory pressure. By using the
sk->sk_effective_sndbuf instead of the sk->sk_sndbuf when we are under memory
pressure, we can delay the POLLOUT until 1/3 of the buffer clears as we normally
do. There is no issue here when SO_SNDBUF is not set, since the tcp layer will
auto tune the sk->sk_sndbuf.

In my testing, this brought a single thread's cpu usage down from 100% to 1%
while maintaining the same level of throughput when under memory pressure.
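
The effect can be seen with the numbers from the example above (100kb
requested, 10kb actually queued). The sketch below is a userspace mock of the
logic, not the kernel code: struct mock_sock and the helper names are
invented, and the writeable test assumes the usual stream rule that free
space must be at least half of what is queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for the struct sock fields the patch touches. */
struct mock_sock {
    int sndbuf;            /* value requested via SO_SNDBUF */
    int effective_sndbuf;  /* 0 means "not under pressure, use sndbuf" */
    int wmem_queued;       /* bytes currently queued for sending */
};

/* On a short write under memory pressure, clamp to what is queued. */
static void set_effective_sndbuf(struct mock_sock *sk)
{
    if (sk->wmem_queued > sk->sndbuf)
        sk->effective_sndbuf = sk->sndbuf;
    else
        sk->effective_sndbuf = sk->wmem_queued;
}

static int stream_wspace(const struct mock_sock *sk)
{
    int limit = sk->effective_sndbuf ? sk->effective_sndbuf : sk->sndbuf;

    return limit - sk->wmem_queued;
}

/*
 * Writeable once free space is at least half of what is queued, i.e.
 * once about a third of the (effective) buffer has drained.
 */
static bool stream_is_writeable(const struct mock_sock *sk)
{
    assert(sk->wmem_queued >= 0);
    return stream_wspace(sk) >= (sk->wmem_queued >> 1);
}
```

Without the clamp, a socket with 10000 of 100000 bytes queued always looks
writeable, which produces the POLLOUT/-EAGAIN loop. With the clamp, POLLOUT
is held back until the queue drains to roughly two thirds of the effective
size.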

Signed-off-by: Jason Baron 
---
 include/net/sock.h | 12 
 net/core/sock.c|  1 +
 net/core/stream.c  |  1 +
 net/ipv4/tcp.c | 10 +++---
 4 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 43c6abc..ca49415 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -380,6 +380,7 @@ struct sock {
atomic_t sk_wmem_alloc;
atomic_t sk_omem_alloc;
int sk_sndbuf;
+   int sk_effective_sndbuf;
struct sk_buff_head sk_write_queue;
kmemcheck_bitfield_begin(flags);
unsigned int sk_shutdown  : 2,
@@ -779,6 +780,14 @@ static inline bool sk_acceptq_is_full(const struct sock *sk)
return sk->sk_ack_backlog > sk->sk_max_ack_backlog;
 }
 
+static inline void sk_set_effective_sndbuf(struct sock *sk)
+{
+   if (sk->sk_wmem_queued > sk->sk_sndbuf)
+   sk->sk_effective_sndbuf = sk->sk_sndbuf;
+   else
+   sk->sk_effective_sndbuf = sk->sk_wmem_queued;
+}
+
 /*
  * Compute minimal free write space needed to queue new packets.
  */
@@ -789,6 +798,9 @@ static inline int sk_stream_min_wspace(const struct sock *sk)
 
 static inline int sk_stream_wspace(const struct sock *sk)
 {
+   if (sk->sk_effective_sndbuf)
+   return sk->sk_effective_sndbuf - sk->sk_wmem_queued;
+
return sk->sk_sndbuf - sk->sk_wmem_queued;
 }
 
diff --git a/net/core/sock.c b/net/core/sock.c
index 193901d..4fce879 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2309,6 +2309,7 @@ void sock_init_data(struct socket *sock, struct sock *sk)
sk->sk_allocation   =   GFP_KERNEL;
sk->sk_rcvbuf   =   sysctl_rmem_default;
sk->sk_sndbuf   =   sysctl_wmem_default;
+   sk->sk_effective_sndbuf =   0;
sk->sk_state=   TCP_CLOSE;
sk_set_socket(sk, sock);
 
diff --git a/net/core/stream.c b/net/core/stream.c
index d70f77a..7c175e7 100644
--- a/net/core/stream.c
+++ b/net/core/stream.c
@@ -32,6 +32,7 @@ void sk_stream_write_space(struct sock *sk)
 
if (sk_stream_is_writeable(sk) && sock) {
clear_bit(SOCK_NOSPACE, &sock->flags);
+   sk->sk_effective_sndbuf = 0;
 
rcu_read_lock();
wq = rcu_dereference(sk->sk_wq);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 45534a5..9e7f0a5 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -845,6 +845,7 @@ struct sk_buff *sk_stream_alloc_skb(struct sock *sk, int size, gfp_t gfp,
sk->sk_prot->enter_memory_pressure(sk);
sk_stream_moderate_sndbuf(sk);
}
+   sk_set_effective_sndbuf(sk);
return NULL;
 }
 
@@ -939,9 +940,10 @@ new_segment:
tcp_mark_push(tp, skb);
goto new_segment;
}
-   if (!sk_wmem_schedule(sk, copy))
+   if (!sk_wmem_schedule(sk, copy)) {
+   sk_set_effective_sndbuf(sk);
goto wait_for_memory;
-
+   }
if (can_coalesce) {
skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
} else {
@@ -1214,8 +1216,10 @@ new_segment:
 
copy = min_t(int, copy, pfrag->size - pfrag->offset);
 
-   if (!sk_wmem_schedule(sk, copy))
+   if (!sk_wmem_schedule(sk, copy)) {
+   sk_set_effective_sndbuf(sk);
goto wait_for_memory;
+   }
 
err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb,
   pfrag->page,
-- 
1.8.2.rc2


Re: [PATCH 0/3] be2net: patch set

2015-08-07 Thread David Miller
From: Sathya Perla 
Date: Wed,  5 Aug 2015 03:27:47 -0400

> This patch set contains 2 driver fixes to a Lancer HW issue and a fix
> to a double free bug. Pls apply to the "net" tree. Thanks!
> 
> Patch 1 now enables filters only after creating RXQs. This is done as 
> HW issues were observed on Lancer adapters if filters
> (flags, mac addrs etc) are enabled *before* creating RXQs. This patch
> changes the driver design by enabling filters in be_open() --
> instead of be_setup() -- after RXQs are created and buffers posted.
> 
> Patch 2 fixes an RX stall issue that was seen on Lancer adapters when
> RXQs are destroyed while they are in an "out of buffer" state.
> This patch fixes this issue by posting 64 buffers to each RXQ before
> destroying them in the close path. This is done after ensuring that no
> more new packets are selected for transfer to the RXQs by disabling
> interface filters.
> 
> Patch 3 protects eqo->affinity_mask variable from being freed twice and
> resulting in a crash.  It's now freed only when EQs haven't yet been
> destroyed.

Series applied, thanks.


Re: [PATCH net-next] vxlan: combine VXLAN_FLOWBASED into VXLAN_COLLECT_METADATA

2015-08-07 Thread David Miller
From: Alexei Starovoitov 
Date: Tue,  4 Aug 2015 22:51:07 -0700

> IFLA_VXLAN_FLOWBASED is useless without IFLA_VXLAN_COLLECT_METADATA,
> so combine them into single IFLA_VXLAN_COLLECT_METADATA flag.
> 'flowbased' doesn't convey real meaning of the vxlan tunnel mode.
> This mode can be used by routing, tc+bpf and ovs.
> Only ovs is strictly flow based, so 'collect metadata' is a better
> name for this tunnel mode.
> 
> Signed-off-by: Alexei Starovoitov 
> ---
> it's not too late to kill uapi IFLA_VXLAN_FLOWBASED, since it
> was introduced two weeks ago.
> Alternatively we could rename both into vxlan_accept_metadata
> or something else, but imo collect_metadata is ok.

Applied, thanks.


RE: [PATCH net] bna: fix interrupts storm caused by erroneous packets

2015-08-07 Thread Rasesh Mody
> From: Ivan Vecera [mailto:ivec...@redhat.com]
> Sent: Thursday, August 06, 2015 1:48 PM
> 
> The commit "e29aa33 bna: Enable Multi Buffer RX" moved the packet counter
> increment from the beginning of the NAPI processing loop to after the check
> for erroneous packets, so they are never accounted. This counter is used to
> inform firmware about number of processed completions (packets).
> As these packets are never acked the firmware fires IRQs for them again and
> again.
> 
> Fixes: e29aa33 bna: Enable Multi Buffer RX
> Signed-off-by: Ivan Vecera 

Acked-by: Rasesh Mody 

Thanks!
Rasesh


Re: [PATCH v3 net-next 0/2] RDS-TCP: Network namespace support

2015-08-07 Thread David Miller
From: Sowmini Varadhan 
Date: Wed,  5 Aug 2015 01:43:24 -0400

> This patch series contains the set of changes to correctly set up 
> the infra for PF_RDS sockets that use TCP as the transport in multiple
> network namespaces.
> 
> Patch 1 in the series is the minimal set of changes to allow
> a single instance of RDS-TCP to run in any (i.e init_net or other) net
> namespace.  The changes in this patch set ensure that the execution of 
> 'modprobe [-r] rds_tcp' sets up the kernel TCP sockets 
> relative to the current netns, so that RDS applications can send/recv
> packets from that netns, and the netns can later be deleted cleanly.
> 
> Patch 2 of the series further allows multiple RDS-TCP instances,
> one per network namespace. The changes in this patch allow dynamic
> creation/tear-down of RDS-TCP client and server sockets  across all
> current and future namespaces. 
> 
> v2 changes from RFC sent out earlier:
> David Ahern comments in patch 1, net_device notifier in patch 2, 
> patch 3 broken off and submitted separately.
> v3: Cong Wang review comments.

Series applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread David Daney

On 08/07/2015 07:54 AM, Graeme Gregory wrote:

On Thu, Aug 06, 2015 at 05:33:10PM -0700, David Daney wrote:

From: David Daney 

Find out which PHYs belong to which BGX instance in the ACPI way.

Set the MAC address of the device as provided by ACPI tables. This is
similar to the implementation for devicetree in
of_get_mac_address(). The table is searched for the device property
entries "mac-address", "local-mac-address" and "address" in that
order. The address is provided in a u64 variable and must contain a
valid 6 bytes-len mac addr.
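
As a standalone illustration of the u64 encoding described above, the sketch
below unpacks the low 48 bits least significant byte first, as the loop in
the patch does, and rejects values that do not fit in 6 bytes or that are not
a usable unicast address. The helper name mac_from_u64() is hypothetical, and
the validity test is a simplified stand-in for is_valid_ether_addr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/*
 * Unpack a MAC address stored in the low 48 bits of a 64-bit value,
 * least significant byte first. Returns false (and leaves dst untouched)
 * if the value needs more than 6 bytes, is all zeros, or has the
 * multicast bit set.
 */
static bool mac_from_u64(uint64_t val, uint8_t dst[MAC_LEN])
{
    uint8_t mac[MAC_LEN];
    bool all_zero = true;
    int i;

    assert(dst != NULL);

    if (val >> 48)
        return false;              /* more than 6 bytes */

    for (i = 0; i < MAC_LEN; i++) {
        mac[i] = (uint8_t)(val >> (8 * i));
        if (mac[i])
            all_zero = false;
    }

    if (all_zero || (mac[0] & 1))
        return false;              /* zero or multicast address */

    memcpy(dst, mac, MAC_LEN);
    return true;
}
```

Note that with this byte order the first octet of the address comes from the
lowest byte of the integer.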

Based on code from: Narinder Dhillon 
 Tomasz Nowicki 
 Robert Richter 

Signed-off-by: Tomasz Nowicki 
Signed-off-by: Robert Richter 
Signed-off-by: David Daney 
---
  drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 137 +-
  1 file changed, 135 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c 
b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 615b2af..2056583 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c

[...]

+
+static int acpi_get_mac_address(struct acpi_device *adev, u8 *dst)
+{
+   const union acpi_object *prop;
+   u64 mac_val;
+   u8 mac[ETH_ALEN];
+   int i, j;
+   int ret;
+
+   for (i = 0; i < ARRAY_SIZE(addr_propnames); i++) {
+   ret = acpi_dev_get_property(adev, addr_propnames[i],
+   ACPI_TYPE_INTEGER, &prop);


Shouldn't this be trying to use device_property_read_* API and making
the DT/ACPI path the same where possible?



Ideally, something like you suggest would be possible.  However, there 
are a couple of problems trying to do it in the kernel as it exists today:


1) There is no 'struct device *' here, so device_property_read_* is not 
applicable.


2) There is no standard ACPI binding for MAC addresses, so it is 
impossible to create a hypothetical fw_get_mac_address(), which would be 
analogous to of_get_mac_address().


Other e-mail threads have suggested that the path to an elegant solution 
is to inter-mix a bunch of calls to acpi_dev_get_property*() and 
fwnode_property_read*() so as to use these more generic 
fwnode_property_read*() functions wherever possible.  I rejected this 
approach as it seems cleaner to me to consistently use a single set of APIs.


Thanks,
David Daney




Graeme


+   if (ret)
+   continue;
+
+   mac_val = prop->integer.value;
+
+   if (mac_val & (~0ULL << 48))
+   continue;   /* more than 6 bytes */
+
+   for (j = 0; j < ARRAY_SIZE(mac); j++)
+   mac[j] = (u8)(mac_val >> (8 * j));
+   if (!is_valid_ether_addr(mac))
+   continue;
+
+   memcpy(dst, mac, ETH_ALEN);
+
+   return 0;
+   }
+
+   return ret ? ret : -EINVAL;
+}
+
+static acpi_status bgx_acpi_register_phy(acpi_handle handle,
+u32 lvl, void *context, void **rv)
+{
+   struct acpi_reference_args args;
+   const union acpi_object *prop;
+   struct bgx *bgx = context;
+   struct acpi_device *adev;
+   struct device *phy_dev;
+   u32 phy_id;
+
+   if (acpi_bus_get_device(handle, &adev))
+   goto out;
+
+   SET_NETDEV_DEV(&bgx->lmac[bgx->lmac_count].netdev, &bgx->pdev->dev);
+
+   acpi_get_mac_address(adev, bgx->lmac[bgx->lmac_count].mac);
+
+   bgx->lmac[bgx->lmac_count].lmacid = bgx->lmac_count;
+
+   if (acpi_dev_get_property_reference(adev, "phy-handle", 0, &args))
+   goto out;
+
+   if (acpi_dev_get_property(args.adev, "phy-channel", ACPI_TYPE_INTEGER, &prop))
+   goto out;
+
+   phy_id = prop->integer.value;
+
+   phy_dev = bus_find_device(&mdio_bus_type, NULL, (void *)&phy_id,
+ bgx_match_phy_id);
+   if (!phy_dev)
+   goto out;
+
+   bgx->lmac[bgx->lmac_count].phydev = to_phy_device(phy_dev);
+out:
+   bgx->lmac_count++;
+   return AE_OK;
+}
+
+static acpi_status bgx_acpi_match_id(acpi_handle handle, u32 lvl,
+void *context, void **ret_val)
+{
+   struct acpi_buffer string = { ACPI_ALLOCATE_BUFFER, NULL };
+   struct bgx *bgx = context;
+   char bgx_sel[5];
+
+   snprintf(bgx_sel, 5, "BGX%d", bgx->bgx_id);
+   if (ACPI_FAILURE(acpi_get_name(handle, ACPI_SINGLE_NAME, &string))) {
+   pr_warn("Invalid link device\n");
+   return AE_OK;
+   }
+
+   if (strncmp(string.pointer, bgx_sel, 4))
+   return AE_OK;
+
+   acpi_walk_namespace(ACPI_TYPE_DEVICE, handle, 1,
+   bgx_acpi_register_phy, NULL, bgx, NULL);
+
+   kfree(string.pointer);
+   return AE_CTRL_TERMINATE;
+}
+
+static int bgx_init_acpi_phy(struct b

[PATCH net-next] net: add explicit logging and stat for neighbour table overflow

2015-08-07 Thread Rick Jones
From: Rick Jones 

Add an explicit neighbour table overflow message (ratelimited) and
statistic to make diagnosing neighbour table overflows tractable in
the wild.

Diagnosing a neighbour table overflow can be quite difficult in the wild
because there is no explicit dmesg logged.  Callers to neighbour code
seem to use net_dbg_ratelimited when the neighbour call fails, which means
the "base message" is not emitted and the "callbacks suppressed" messages
from the ratelimiting can end up juxtaposed with unrelated messages.
Further, a forced garbage collection will increment a stat on each call
whether it was successful in freeing-up a table entry or not, so that
statistic is only a hint.  So, add a net_info_ratelimited message and
explicit statistic to the neighbour code.

Signed-off-by: Rick Jones 

---

Tested by cutting back the gc_threshN sysctls and attempting to ping a
number of target IPs greater than the maximum size of the ARP table.

diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index bd33e66..8b68384 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -125,6 +125,7 @@ struct neigh_statistics {
unsigned long forced_gc_runs;   /* number of forced GC runs */
 
unsigned long unres_discards;   /* number of unresolved drops */
+   unsigned long table_fulls;  /* times even gc couldn't help */
 };
 
 #define NEIGH_CACHE_STAT_INC(tbl, field) this_cpu_inc((tbl)->stats->field)
diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index 2e35c61..788655b 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -106,6 +106,7 @@ struct ndt_stats {
__u64   ndts_rcv_probes_ucast;
__u64   ndts_periodic_gc_runs;
__u64   ndts_forced_gc_runs;
+   __u64   ndts_table_fulls;
 };
 
 enum {
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 84195da..2b515ba 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -274,8 +274,12 @@ static struct neighbour *neigh_alloc(struct neigh_table 
*tbl, struct net_device
(entries >= tbl->gc_thresh2 &&
 time_after(now, tbl->last_flush + 5 * HZ))) {
if (!neigh_forced_gc(tbl) &&
-   entries >= tbl->gc_thresh3)
+   entries >= tbl->gc_thresh3) {
+   net_info_ratelimited("%s: neighbor table overflow!\n",
+tbl->id);
+   NEIGH_CACHE_STAT_INC(tbl, table_fulls);
goto out_entries;
+   }
}
 
n = kzalloc(tbl->entry_size + dev->neigh_priv_len, GFP_ATOMIC);
@@ -1849,6 +1853,7 @@ static int neightbl_fill_info(struct sk_buff *skb, struct 
neigh_table *tbl,
ndst.ndts_rcv_probes_ucast  += st->rcv_probes_ucast;
ndst.ndts_periodic_gc_runs  += st->periodic_gc_runs;
ndst.ndts_forced_gc_runs+= st->forced_gc_runs;
+   ndst.ndts_table_fulls   += st->table_fulls;
}
 
if (nla_put(skb, NDTA_STATS, sizeof(ndst), &ndst))
@@ -2717,12 +2722,12 @@ static int neigh_stat_seq_show(struct seq_file *seq, 
void *v)
struct neigh_statistics *st = v;
 
if (v == SEQ_START_TOKEN) {
-   seq_printf(seq, "entries  allocs destroys hash_grows  lookups 
hits  res_failed  rcv_probes_mcast rcv_probes_ucast  periodic_gc_runs 
forced_gc_runs unresolved_discards\n");
+   seq_printf(seq, "entries  allocs destroys hash_grows  lookups 
hits  res_failed  rcv_probes_mcast rcv_probes_ucast  periodic_gc_runs 
forced_gc_runs unresolved_discards table_fulls\n");
return 0;
}
 
seq_printf(seq, "%08x  %08lx %08lx %08lx  %08lx %08lx  %08lx  "
-   "%08lx %08lx  %08lx %08lx %08lx\n",
+   "%08lx %08lx  %08lx %08lx %08lx %08lx\n",
   atomic_read(&tbl->entries),
 
   st->allocs,
@@ -2739,7 +2744,8 @@ static int neigh_stat_seq_show(struct seq_file *seq, void 
*v)
 
   st->periodic_gc_runs,
   st->forced_gc_runs,
-  st->unres_discards
+  st->unres_discards,
+  st->table_fulls
   );
 
return 0;


Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread Mark Rutland
[Correcting the devicetree list address, which I typo'd in my original
reply]

[resending to _really_ correct the address, apologies for the spam]

> >> +static const char * const addr_propnames[] = {
> >> +  "mac-address",
> >> +  "local-mac-address",
> >> +  "address",
> >> +};
> >
> > If these are going to be generally necessary, then we should get them
> > adopted as standardised _DSD properties (ideally just one of them).
> 
> As far as I can tell, and please correct me if I am wrong, ACPI-6.0 
> doesn't contemplate MAC addresses.
> 
> Today we are using "mac-address", which is an Integer containing the MAC 
> address in its lowest order 48 bits in Little-Endian byte order.
> 
> The hardware and ACPI tables are here today, and we would like to 
> support it.  If some future ACPI specification specifies a standard way 
> to do this, we will probably adapt the code to do this in a standard manner.
> 
> 
> >
> > [...]
> >
> >> +static acpi_status bgx_acpi_register_phy(acpi_handle handle,
> >> +   u32 lvl, void *context, void **rv)
> >> +{
> >> +  struct acpi_reference_args args;
> >> +  const union acpi_object *prop;
> >> +  struct bgx *bgx = context;
> >> +  struct acpi_device *adev;
> >> +  struct device *phy_dev;
> >> +  u32 phy_id;
> >> +
> >> +  if (acpi_bus_get_device(handle, &adev))
> >> +  goto out;
> >> +
> >> +  SET_NETDEV_DEV(&bgx->lmac[bgx->lmac_count].netdev, &bgx->pdev->dev);
> >> +
> >> +  acpi_get_mac_address(adev, bgx->lmac[bgx->lmac_count].mac);
> >> +
> >> +  bgx->lmac[bgx->lmac_count].lmacid = bgx->lmac_count;
> >> +
> >> +  if (acpi_dev_get_property_reference(adev, "phy-handle", 0, &args))
> >> +  goto out;
> >> +
> >> +  if (acpi_dev_get_property(args.adev, "phy-channel", ACPI_TYPE_INTEGER, 
> >> &prop))
> >> +  goto out;
> >
> > Likewise for any inter-device properties, so that we can actually handle
> > them in a generic fashion, and avoid / learn from the mistakes we've
> > already handled with DT.
> 
> This is the fallacy of the ACPI is superior to DT argument.  The 
> specification of PHY topology and MAC addresses is well standardized in 
> DT, there is no question about what the proper way to specify it is. 
> Under ACPI, it is the Wild West, there is no specification, so each 
> system design is forced to invent something, and everybody comes up with 
> an incompatible implementation.

Indeed.

If ACPI is going to handle it, it should handle it properly. I really
don't see the point in bodging properties together in a less standard
manner than DT, especially for inter-device relationships.

Doing so is painful for _everyone_, and it's extremely unlikely that
other ACPI-aware OSs will actually support these custom descriptions,
making this Linux-specific, and breaking the rationale for using ACPI in
the first place -- a standard that says "just do non-standard stuff" is
not a usable standard.

For intra-device properties, we should standardise what we can, but
vendor-specific stuff is ok -- this can be self-contained within a
driver.

For inter-device relationships ACPI _must_ gain a better model of
componentised devices. It's simply unworkable otherwise, and as you
point out it's fallacious to say that because ACPI is being used that
something is magically industry standard, portable, etc.

This is not your problem in particular; the entire handling of _DSD so
far is a joke IMO.

Thanks,
Mark.


Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread David Daney

On 08/07/2015 07:01 AM, Mark Rutland wrote:

On Fri, Aug 07, 2015 at 01:33:10AM +0100, David Daney wrote:

From: David Daney 

Find out which PHYs belong to which BGX instance in the ACPI way.

Set the MAC address of the device as provided by ACPI tables. This is
similar to the implementation for devicetree in
of_get_mac_address(). The table is searched for the device property
entries "mac-address", "local-mac-address" and "address" in that
order. The address is provided in a u64 variable and must contain a
valid 6 bytes-len mac addr.

Based on code from: Narinder Dhillon 
 Tomasz Nowicki 
 Robert Richter 

Signed-off-by: Tomasz Nowicki 
Signed-off-by: Robert Richter 
Signed-off-by: David Daney 
---
  drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 137 +-
  1 file changed, 135 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c 
b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 615b2af..2056583 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -6,6 +6,7 @@
   * as published by the Free Software Foundation.
   */

+#include 
  #include 
  #include 
  #include 
@@ -26,7 +27,7 @@
  struct lmac {
struct bgx  *bgx;
int dmac;
-   unsigned char   mac[ETH_ALEN];
+   u8  mac[ETH_ALEN];
boollink_up;
int lmacid; /* ID within BGX */
int lmacid_bd; /* ID on board */
@@ -835,6 +836,133 @@ static void bgx_get_qlm_mode(struct bgx *bgx)
}
  }

+#ifdef CONFIG_ACPI
+
+static int bgx_match_phy_id(struct device *dev, void *data)
+{
+   struct phy_device *phydev = to_phy_device(dev);
+   u32 *phy_id = data;
+
+   if (phydev->addr == *phy_id)
+   return 1;
+
+   return 0;
+}
+
+static const char * const addr_propnames[] = {
+   "mac-address",
+   "local-mac-address",
+   "address",
+};


If these are going to be generally necessary, then we should get them
adopted as standardised _DSD properties (ideally just one of them).


As far as I can tell, and please correct me if I am wrong, ACPI-6.0 
doesn't contemplate MAC addresses.


Today we are using "mac-address", which is an Integer containing the MAC 
address in its lowest order 48 bits in Little-Endian byte order.


The hardware and ACPI tables are here today, and we would like to 
support it.  If some future ACPI specification specifies a standard way 
to do this, we will probably adapt the code to do this in a standard manner.
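
To make the "Integer containing the MAC address in its lowest order 48 bits" description concrete, here is a small userspace sketch. It is an illustration only: the toy_* names are invented, and the byte order (first octet in the least significant byte) is an assumption based on a literal reading of "Little-Endian"; the opposite order is equally plausible and the driver code should be treated as authoritative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ALEN 6

/* Same rule as the kernel's is_valid_ether_addr(): not multicast, not zero. */
bool toy_valid_ether_addr(const uint8_t *addr)
{
	bool all_zero = true;
	size_t i;

	for (i = 0; i < TOY_ETH_ALEN; i++)
		if (addr[i])
			all_zero = false;

	return !all_zero && !(addr[0] & 0x01);
}

/* Returns 0 and fills dst on success, -1 if the value is not a usable MAC. */
int toy_mac_from_u64(uint64_t val, uint8_t dst[TOY_ETH_ALEN])
{
	uint8_t mac[TOY_ETH_ALEN];
	size_t i;

	assert(dst != NULL);

	if (val >> 48)
		return -1; /* does not fit in 6 bytes */

	/* ASSUMPTION: octet i of the address lives in bits 8*i .. 8*i+7. */
	for (i = 0; i < TOY_ETH_ALEN; i++)
		mac[i] = (uint8_t)(val >> (8 * i));

	if (!toy_valid_ether_addr(mac))
		return -1;

	memcpy(dst, mac, TOY_ETH_ALEN);
	return 0;
}
```

Whichever order is chosen, the two checks (value fits in 48 bits, result is a valid unicast address) are what keep a malformed table entry from being used as a MAC.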





[...]


+static acpi_status bgx_acpi_register_phy(acpi_handle handle,
+u32 lvl, void *context, void **rv)
+{
+   struct acpi_reference_args args;
+   const union acpi_object *prop;
+   struct bgx *bgx = context;
+   struct acpi_device *adev;
+   struct device *phy_dev;
+   u32 phy_id;
+
+   if (acpi_bus_get_device(handle, &adev))
+   goto out;
+
+   SET_NETDEV_DEV(&bgx->lmac[bgx->lmac_count].netdev, &bgx->pdev->dev);
+
+   acpi_get_mac_address(adev, bgx->lmac[bgx->lmac_count].mac);
+
+   bgx->lmac[bgx->lmac_count].lmacid = bgx->lmac_count;
+
+   if (acpi_dev_get_property_reference(adev, "phy-handle", 0, &args))
+   goto out;
+
+   if (acpi_dev_get_property(args.adev, "phy-channel", ACPI_TYPE_INTEGER, 
&prop))
+   goto out;


Likewise for any inter-device properties, so that we can actually handle
them in a generic fashion, and avoid / learn from the mistakes we've
already handled with DT.


This is the fallacy of the ACPI is superior to DT argument.  The 
specification of PHY topology and MAC addresses is well standardized in 
DT, there is no question about what the proper way to specify it is. 
Under ACPI, it is the Wild West, there is no specification, so each 
system design is forced to invent something, and everybody comes up with 
an incompatible implementation.


That said, this is all specific to our BGX device, so anything we do 
doesn't break other devices.


David Daney




[PATCH v3 17/20] net/xen-netfront: Make it running on 64KB page granularity

2015-08-07 Thread Julien Grall
The PV network protocol uses 4KB page granularity. The goal of this
patch is to allow Linux using 64KB page granularity to use a network
device on a non-modified Xen.

It's only necessary to adapt the ring size and break skb data into small
chunks of 4KB. The rest of the code relies on the grant table code.

Note that we allocate a Linux page for each rx skb but only the first
4KB is used. We may improve the memory usage by extending the size of
the rx skb.
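
The "break skb data into 4KB chunks" step can be pictured with the userspace sketch below. It is not the grant table API: toy_foreach_grant_in_range, its callback type and the 64KB TOY_PAGE_SIZE are assumptions made up for the example, and the grant index is just the chunk number inside the page rather than a real gfn.

```c
#include <assert.h>

#define TOY_XEN_PAGE_SIZE 4096u
#define TOY_PAGE_SIZE     65536u /* assumed Linux page size */

typedef void (*toy_grant_fn)(unsigned int grant_idx, unsigned int offset,
			     unsigned int len, void *data);

/*
 * Walk [offset, offset + len) inside one Linux page and call fn once per
 * 4KB grant touched. Returns the number of chunks produced.
 */
unsigned int toy_foreach_grant_in_range(unsigned int offset, unsigned int len,
					toy_grant_fn fn, void *data)
{
	unsigned int chunks = 0;

	assert(offset + len <= TOY_PAGE_SIZE);

	while (len) {
		unsigned int goffset = offset % TOY_XEN_PAGE_SIZE;
		unsigned int glen = TOY_XEN_PAGE_SIZE - goffset;

		if (glen > len)
			glen = len;

		fn(offset / TOY_XEN_PAGE_SIZE, goffset, glen, data);

		offset += glen;
		len -= glen;
		chunks++;
	}

	return chunks;
}

/* Example callback: accumulates the number of bytes it was handed. */
void toy_count_bytes(unsigned int grant_idx, unsigned int offset,
		     unsigned int len, void *data)
{
	(void)grant_idx;
	(void)offset;
	*(unsigned int *)data += len;
}
```

With this shape, a 5000-byte fragment starting at offset 4000 yields three requests (96, 4096 and 808 bytes), one per 4KB grant it touches.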

Signed-off-by: Julien Grall 

---
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: netdev@vger.kernel.org

Improvements such as support for 64KB grants are not taken into
consideration in this patch because we have the requirement to run a Linux
using 64KB pages on a non-modified Xen.

Tested with workloads such as ping, ssh, wget, git... I would be happy if
someone could give details on how to test all the paths.

Changes in v3:
- Fix errors reported by checkpatch.pl
- s/mfn/gfn/ based on the new naming
- xennet_tx_setup_grant was calling itself, resulting in a
guest stall when using iperf.
- The grant callback no longer allows changing the len
(wasn't used here)
- gnttab_foreach_grant has been renamed to gnttab_foreach_grant_in_range
- gnttab_page_grant_foreign_ref has been renamed to
gnttab_foreach_grant_foreign_ref_one

Changes in v2:
- Use gnttab_foreach_grant to split a Linux page in grant
- Fix count slots
---
 drivers/net/xen-netfront.c | 122 -
 1 file changed, 86 insertions(+), 36 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 47f791e..204ffb8 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -74,8 +74,8 @@ struct netfront_cb {
 
 #define GRANT_INVALID_REF  0
 
-#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
+#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
 
 /* Minimum number of Rx slots (includes slot for GSO metadata). */
 #define NET_RX_SLOTS_MIN (XEN_NETIF_NR_SLOTS_MIN + 1)
@@ -291,7 +291,7 @@ static void xennet_alloc_rx_buffers(struct netfront_queue 
*queue)
struct sk_buff *skb;
unsigned short id;
grant_ref_t ref;
-   unsigned long gfn;
+   struct page *page;
struct xen_netif_rx_request *req;
 
skb = xennet_alloc_one_rx_buffer(queue);
@@ -307,14 +307,13 @@ static void xennet_alloc_rx_buffers(struct netfront_queue 
*queue)
BUG_ON((signed short)ref < 0);
queue->grant_rx_ref[id] = ref;
 
-   gfn = 
xen_page_to_gfn(skb_frag_page(&skb_shinfo(skb)->frags[0]));
+   page = skb_frag_page(&skb_shinfo(skb)->frags[0]);
 
req = RING_GET_REQUEST(&queue->rx, req_prod);
-   gnttab_grant_foreign_access_ref(ref,
-   queue->info->xbdev->otherend_id,
-   gfn,
-   0);
-
+   gnttab_page_grant_foreign_access_ref_one(ref,
+
queue->info->xbdev->otherend_id,
+page,
+0);
req->id = id;
req->gref = ref;
}
@@ -415,25 +414,33 @@ static void xennet_tx_buf_gc(struct netfront_queue *queue)
xennet_maybe_wake_tx(queue);
 }
 
-static struct xen_netif_tx_request *xennet_make_one_txreq(
-   struct netfront_queue *queue, struct sk_buff *skb,
-   struct page *page, unsigned int offset, unsigned int len)
+struct xennet_gnttab_make_txreq {
+   struct netfront_queue *queue;
+   struct sk_buff *skb;
+   struct page *page;
+   struct xen_netif_tx_request *tx; /* Last request */
+   unsigned int size;
+};
+
+static void xennet_tx_setup_grant(unsigned long gfn, unsigned int offset,
+ unsigned int len, void *data)
 {
+   struct xennet_gnttab_make_txreq *info = data;
unsigned int id;
struct xen_netif_tx_request *tx;
grant_ref_t ref;
-
-   len = min_t(unsigned int, PAGE_SIZE - offset, len);
+   /* convenient aliases */
+   struct page *page = info->page;
+   struct netfront_queue *queue = info->queue;
+   struct sk_buff *skb = info->skb;
 
id = get_id_from_freelist(&queue->tx_skb_freelist, queue->tx_skbs);
tx = RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++);
ref = gnttab_claim_grant_reference(&queue->gref_tx_head);
BUG_ON((signed short)ref < 0);
 
-   gnttab_grant_foreign_access_ref(ref,
-

[PATCH v3 18/20] net/xen-netback: Make it running on 64KB page granularity

2015-08-07 Thread Julien Grall
The PV network protocol uses 4KB page granularity. The goal of this
patch is to allow Linux using 64KB page granularity to work as a
network backend on a non-modified Xen.

It's only necessary to adapt the ring size and break skb data into small
chunks of 4KB. The rest of the code relies on the grant table code.

Signed-off-by: Julien Grall 

---
Cc: Ian Campbell 
Cc: Wei Liu 
Cc: netdev@vger.kernel.org

Improvement such as support of 64KB grant is not taken into
consideration in this patch because we have the requirement to run a
Linux using 64KB pages on a non-modified Xen.
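
As a worked example of the sizing in this patch: MAX_XEN_SKB_FRAGS is defined as (65536 / XEN_PAGE_SIZE + 1), which with 4KB Xen pages comes to 17, and XEN_NETBK_RX_SLOTS_MAX adds one more slot for GSO metadata. The sketch below repeats that arithmetic and the length clamp applied per copy operation. The TOY_* names are illustrative, and resetting the offset to 0 when a buffer is full is an assumption about what moving to the next rx buffer does.

```c
#include <assert.h>

#define TOY_XEN_PAGE_SIZE     4096u
#define TOY_MAX_XEN_SKB_FRAGS (65536u / TOY_XEN_PAGE_SIZE + 1) /* 17 */
#define TOY_RX_SLOTS_MAX      (TOY_MAX_XEN_SKB_FRAGS + 1)      /* 18 */
#define TOY_MAX_BUFFER_OFFSET TOY_XEN_PAGE_SIZE

/*
 * Clamp a copy so that it never runs past the end of the current 4KB
 * buffer. copy_off is how far the current buffer is already filled.
 */
unsigned int toy_clamp_copy_len(unsigned int copy_off, unsigned int len)
{
	assert(copy_off <= TOY_MAX_BUFFER_OFFSET);

	/* ASSUMPTION: a full buffer means we start a fresh one at offset 0. */
	if (copy_off == TOY_MAX_BUFFER_OFFSET)
		copy_off = 0;

	if (copy_off + len > TOY_MAX_BUFFER_OFFSET)
		len = TOY_MAX_BUFFER_OFFSET - copy_off;

	return len;
}
```

The caller is expected to loop, advancing by the clamped length each time, until the whole fragment has been described.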

Changes in v3:
- Fix errors reported by checkpatch.pl
- s/mfn/gfn/ based on the new naming
- gnttab_foreach_grant has been renamed to gnttab_foreach_grant_in_range
- The grant callback no longer allows using less data. A
helper has been added in netback to handle this.

Changes in v2:
- Correctly set MAX_GRANT_COPY_OPS and XEN_NETBK_RX_SLOTS_MAX
- Don't use XEN_PAGE_SIZE in handle_frag_list as we coalesce
fragment into a new skb
- Use gnttab_foreach_grant to split a Linux page into grants
---
 drivers/net/xen-netback/common.h  |  15 ++--
 drivers/net/xen-netback/netback.c | 153 --
 2 files changed, 107 insertions(+), 61 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 8a495b3..bb68211 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -44,6 +44,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 typedef unsigned int pending_ring_idx_t;
@@ -64,8 +65,8 @@ struct pending_tx_info {
struct ubuf_info callback_struct;
 };
 
-#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
+#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
 
 struct xenvif_rx_meta {
int id;
@@ -80,16 +81,18 @@ struct xenvif_rx_meta {
 /* Discriminate from any valid pending_idx value. */
 #define INVALID_PENDING_IDX 0x
 
-#define MAX_BUFFER_OFFSET PAGE_SIZE
+#define MAX_BUFFER_OFFSET XEN_PAGE_SIZE
 
 #define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE
 
+#define MAX_XEN_SKB_FRAGS (65536 / XEN_PAGE_SIZE + 1)
+
 /* It's possible for an skb to have a maximal number of frags
  * but still be less than MAX_BUFFER_OFFSET in size. Thus the
- * worst-case number of copy operations is MAX_SKB_FRAGS per
+ * worst-case number of copy operations is MAX_XEN_SKB_FRAGS per
  * ring slot.
  */
-#define MAX_GRANT_COPY_OPS (MAX_SKB_FRAGS * XEN_NETIF_RX_RING_SIZE)
+#define MAX_GRANT_COPY_OPS (MAX_XEN_SKB_FRAGS * XEN_NETIF_RX_RING_SIZE)
 
 #define NETBACK_INVALID_HANDLE -1
 
@@ -203,7 +206,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
 /* Maximum number of Rx slots a to-guest packet may use, including the
  * slot needed for GSO meta-data.
  */
-#define XEN_NETBK_RX_SLOTS_MAX (MAX_SKB_FRAGS + 1)
+#define XEN_NETBK_RX_SLOTS_MAX ((MAX_XEN_SKB_FRAGS + 1))
 
 enum state_bit_shift {
/* This bit marks that the vif is connected */
diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index 66f1780..c32a9f2 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -263,6 +263,80 @@ static struct xenvif_rx_meta *get_next_rx_buffer(struct 
xenvif_queue *queue,
return meta;
 }
 
+struct gop_frag_copy {
+   struct xenvif_queue *queue;
+   struct netrx_pending_operations *npo;
+   struct xenvif_rx_meta *meta;
+   int head;
+   int gso_type;
+
+   struct page *page;
+};
+
+static void xenvif_setup_copy_gop(unsigned long gfn,
+ unsigned int offset,
+ unsigned int *len,
+ struct gop_frag_copy *info)
+{
+   struct gnttab_copy *copy_gop;
+   struct xen_page_foreign *foreign;
+   /* Convenient aliases */
+   struct xenvif_queue *queue = info->queue;
+   struct netrx_pending_operations *npo = info->npo;
+   struct page *page = info->page;
+
+   BUG_ON(npo->copy_off > MAX_BUFFER_OFFSET);
+
+   if (npo->copy_off == MAX_BUFFER_OFFSET)
+   info->meta = get_next_rx_buffer(queue, npo);
+
+   if (npo->copy_off + *len > MAX_BUFFER_OFFSET)
+   *len = MAX_BUFFER_OFFSET - npo->copy_off;
+
+   copy_gop = npo->copy + npo->copy_prod++;
+   copy_gop->flags = GNTCOPY_dest_gref;
+   copy_gop->len = *len;
+
+   foreign = xen_page_foreign(page);
+   if (foreign) {
+   copy_gop->source.domid = foreign->domid;
+   copy_gop->source.u.ref = foreign->gref;
+   copy_gop->flags |= GNTCOPY_source_gref;
+   } else {
+   copy_gop->source.domid = DOMID_SELF;
+

Re: [BUG] net/ipv4: inconsistent routing table

2015-08-07 Thread Hannes Frederic Sowa
Hello,

Alexander Duyck  writes:
> On 08/07/2015 01:23 AM, Zang MingJie wrote:
>> IMO, the routing decision is determined, given a specific routing
>> table and local network the result MUST be determined, independence of
>> how/what order the routing entry is added.
>>
>> Now there are two ways to configure the system resulting EXACTLY the
>> same routing table and local addresses, but the routing decision is
>> totally different.
>>
>> SAME routing table, DIFFERENT routing decision, there MUST be bugs in kernel
>
> I wasn't arguing that the behavior is undesirable, but the likelihood of 
> having a default route assigned to a local address should be pretty 
> low.  If the system is the default route of others then it should have a 
> different default gateway than itself.  For example an office router 
> would end up pointing to the ISP as the gateway, and the ISP would 
> either point to some other provider or run a BGP configuration.  So in 
> the case of the default route transitioning to us we should end up 
> having to delete and update the default route anyway.  This is likely 
> one of the reasons why there hasn't been any issues reported with this 
> behavior until now.
>
> I'm just wondering if the work involved to fix it is going to be worth 
> it.  We have to keep in mind that this will result in a change of 
> behavior for existing users and we don't know if anyone might be 
> expecting this type of behavior.
>
> We basically are looking at one of three options.  The first one is to 
> just delete the route if you add the gateway as a local address or 
> remove it.  That would be consistent with what you might see if the 
> address was the sole address on an interface of its own.  The second 
> option is to update the nh_scope which I believe should be transitioned 
> between RT_SCOPE_HOST to RT_SCOPE_LINK if I am understanding things 
> correctly.  The third option is we don't change the behavior and just 
> document it.  This would then require manually deleting and restoring 
> any routes that use a recently modified address as their gateway.
>
> Based on your feedback I'm assuming you would probably prefer the second 
> option.  I'm just waiting to see if there are any other opinions on the 
> matter before I act.

The semantics behind this are not easy and the result might well break
other people's systems. I would leave the current resolution logic as-is
and merely change the way iproute presents this information.

Currently we resolve the nexthop during route setup time and install the
resulting information into the FIB. This is very common on other OSes, too.
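
A toy model may help show why the order of operations matters when the nexthop is resolved once, at setup time. Everything below is hypothetical (the toy_* names, a single default route per host, scope reduced to two values) and only mirrors the behaviour described in this thread, not the actual FIB code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_scope { TOY_SCOPE_LINK, TOY_SCOPE_HOST };

#define TOY_MAX_LOCAL 4

struct toy_host {
	uint32_t local[TOY_MAX_LOCAL]; /* locally configured addresses */
	size_t nr_local;
	uint32_t gw;                   /* gateway of the default route */
	enum toy_scope nh_scope;       /* resolved once, when route is added */
};

bool toy_is_local(const struct toy_host *h, uint32_t addr)
{
	size_t i;

	for (i = 0; i < h->nr_local; i++)
		if (h->local[i] == addr)
			return true;
	return false;
}

void toy_add_local(struct toy_host *h, uint32_t addr)
{
	assert(h->nr_local < TOY_MAX_LOCAL);
	h->local[h->nr_local++] = addr; /* note: nh_scope is NOT revisited */
}

void toy_add_default_route(struct toy_host *h, uint32_t gw)
{
	h->gw = gw;
	/* Nexthop is classified against the addresses known right now. */
	h->nh_scope = toy_is_local(h, gw) ? TOY_SCOPE_HOST : TOY_SCOPE_LINK;
}
```

Two hosts that end up with the same address list and the same gateway can therefore hold different nh_scope values, depending only on whether the address or the route was added first.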

If we were to reevaluate the nexthop part of a route when local
addresses change on one of the interfaces, we could easily get the
system into a situation where it would have to remove its default route
because the network would no longer be reachable via IP subnetting,
even though neighbouring information would still keep the machine
connected. And this could happen with setups where someone did not
configure their routes to their own addresses, which are much more
widespread.

The change wouldn't be in contradiction with weak end system behavior,
but I very much don't want to make other people's machines unreachable
because of such a change.

If we could rewind time, we could make local nexthops -EINVAL.

Bye,
Hannes


Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread David Daney

On 08/07/2015 05:42 AM, Tomasz Nowicki wrote:

On 07.08.2015 13:56, Robert Richter wrote:

On 07.08.15 12:52:41, Tomasz Nowicki wrote:

[...]



I would not pollute bgx_probe() with acpi and dt specifics, and instead
keep bgx_init_phy(). The typical design pattern for this is:

static int bgx_init_phy(struct bgx *bgx)
{
#ifdef CONFIG_ACPI
 if (!acpi_disabled)
 return bgx_init_acpi_phy(bgx);
#endif
 return bgx_init_of_phy(bgx);
}

This adds acpi runtime detection (acpi=no), does not call dt code in
case of acpi, and saves the #else for bgx_init_acpi_phy().



I am fine with keeping it in bgx_init_phy(), however we can drop the
#ifdefs there since both bgx_init_{acpi,of}_phy calls have empty stubs
for the !ACPI and !OF cases. Like this:

static int bgx_init_phy(struct bgx *bgx)
{
        if (!acpi_disabled)
                return bgx_init_acpi_phy(bgx);
        else
                return bgx_init_of_phy(bgx);
}


As said, keeping it in #ifdefs makes the empty stub function for !acpi
obsolete, which makes the code smaller and more readable. This style
is common practice in the kernel. Apart from that, the 'else' should
be dropped as it is useless.



I wouldn't say the code is more readable (bgx_init_of_phy still has two
stubs) but this is just cosmetic; my point was to use run-time detection
via acpi_disabled.



Thanks Tomasz and Robert for the input.  Because of this, it seems that 
another version of the patch will be necessary.  In the interests of 
code clarity and aesthetics, we will go with the code without the 
#ifdefs, and rely on the compiler to optimize away any dead code.
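
The #ifdef-free variant relies on the flag becoming a compile-time constant when ACPI support is not built in, so the compiler can discard the ACPI branch. A self-contained sketch of that pattern follows; TOY_CONFIG_ACPI, the toy_* functions and the phy_source field are invented for illustration, and the stub's -1 stands in for whatever error code a real stub would return.

```c
#include <assert.h>
#include <stddef.h>

enum toy_phy_source { TOY_PHY_NONE, TOY_PHY_ACPI, TOY_PHY_OF };

struct toy_bgx {
	enum toy_phy_source phy_source; /* which path configured the PHYs */
};

#ifdef TOY_CONFIG_ACPI
int toy_acpi_disabled; /* runtime flag, e.g. set from the command line */

int toy_init_acpi_phy(struct toy_bgx *bgx)
{
	bgx->phy_source = TOY_PHY_ACPI;
	return 0;
}
#else
/* Constant when ACPI is compiled out, so the branch below is dead code. */
#define toy_acpi_disabled 1

static inline int toy_init_acpi_phy(struct toy_bgx *bgx)
{
	(void)bgx;
	return -1; /* empty stub */
}
#endif

int toy_init_of_phy(struct toy_bgx *bgx)
{
	bgx->phy_source = TOY_PHY_OF;
	return 0;
}

/* The caller needs no #ifdef: the choice is made by the flag alone. */
int toy_init_phy(struct toy_bgx *bgx)
{
	assert(bgx != NULL);

	if (!toy_acpi_disabled)
		return toy_init_acpi_phy(bgx);

	return toy_init_of_phy(bgx);
}
```

Built without TOY_CONFIG_ACPI, toy_init_phy() always takes the devicetree path and the stub call can be optimized away entirely.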


David Daney


Tomasz




[PATCH v3 01/20] net/xen-netback: xenvif_gop_frag_copy: move GSO check out of the loop

2015-08-07 Thread Julien Grall
The skb doesn't change within the function. Therefore it's only
necessary to check if we need GSO once at the beginning.
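
The change amounts to hoisting a loop-invariant computation. A minimal userspace sketch of the idea is below; the struct, the flag values and the toy_* names are hypothetical stand-ins, not the kernel's skb or the Xen netif definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag and type values, for illustration only. */
enum { TOY_GSO_TCPV4 = 1 << 0, TOY_GSO_TCPV6 = 1 << 1 };
enum { TOY_XEN_GSO_NONE, TOY_XEN_GSO_TCPV4, TOY_XEN_GSO_TCPV6 };

struct toy_skb {
	unsigned int gso_size; /* non-zero means the packet is GSO */
	unsigned int gso_type; /* TOY_GSO_* flags */
};

int toy_xen_gso_type(const struct toy_skb *skb)
{
	assert(skb != NULL);

	if (skb->gso_size == 0)
		return TOY_XEN_GSO_NONE;
	if (skb->gso_type & TOY_GSO_TCPV4)
		return TOY_XEN_GSO_TCPV4;
	if (skb->gso_type & TOY_GSO_TCPV6)
		return TOY_XEN_GSO_TCPV6;
	return TOY_XEN_GSO_NONE;
}

/*
 * Copy 'size' bytes in pieces of at most 'chunk' bytes. The GSO type
 * depends only on the skb, so it is computed once, before the loop.
 * Returns the number of pieces.
 */
unsigned int toy_frag_copy(const struct toy_skb *skb, unsigned int size,
			   unsigned int chunk, int *gso_type_out)
{
	int gso_type = toy_xen_gso_type(skb); /* hoisted out of the loop */
	unsigned int pieces = 0;

	assert(chunk > 0);

	while (size > 0) {
		unsigned int bytes = size < chunk ? size : chunk;

		size -= bytes;
		pieces++;
	}

	*gso_type_out = gso_type;
	return pieces;
}
```

Since nothing inside the loop modifies the skb, the result is the same as recomputing the type on every iteration, just cheaper and easier to read.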

Signed-off-by: Julien Grall 

---
Cc: Ian Campbell 
Cc: Wei Liu 
Cc: netdev@vger.kernel.org

Changes in v2:
- Patch added
---
 drivers/net/xen-netback/netback.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index 8293374..66f1780 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -277,6 +277,13 @@ static void xenvif_gop_frag_copy(struct xenvif_queue 
*queue, struct sk_buff *skb
unsigned long bytes;
int gso_type = XEN_NETIF_GSO_TYPE_NONE;
 
+   if (skb_is_gso(skb)) {
+   if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
+   gso_type = XEN_NETIF_GSO_TYPE_TCPV4;
+   else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
+   gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
+   }
+
/* Data must not cross a page boundary. */
BUG_ON(size + offset > PAGE_SIZEgso_type & SKB_GSO_TCPV6)
-   gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
-   }
-
if (*head && ((1 << gso_type) & queue->vif->gso_mask))
queue->rx.req_cons++;
 
-- 
2.1.4



[PATCH net-next v2] bridge: netlink: add support for vlan_filtering attribute

2015-08-07 Thread Nikolay Aleksandrov
From: Nikolay Aleksandrov 

This patch adds the ability to toggle the vlan filtering support via
netlink. Since we're already running with rtnl in .changelink() we don't
need to take any additional locks.

Signed-off-by: Nikolay Aleksandrov 
---
v2: return EOPNOTSUPP when vlan filtering isn't configured and can't be
toggled

I'll post the iproute2 patch if this one gets accepted.
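
The locking split used here (a double-underscore helper that expects the lock to be held, plus a wrapper that takes it) can be sketched in plain C as follows. The boolean "lock", the toy_* names and the -1 return are simplifications for illustration; the real code uses rtnl and restarts the syscall when the trylock fails.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_bridge {
	bool vlan_enabled;
	unsigned int recalcs; /* how often dependent state was recomputed */
};

bool toy_rtnl_held; /* stand-in for the rtnl mutex */

bool toy_rtnl_trylock(void)
{
	if (toy_rtnl_held)
		return false;
	toy_rtnl_held = true;
	return true;
}

void toy_rtnl_unlock(void)
{
	assert(toy_rtnl_held);
	toy_rtnl_held = false;
}

/* Core toggle: the caller must already hold the lock (netlink path). */
int toy_vlan_filter_toggle_locked(struct toy_bridge *br, bool val)
{
	assert(toy_rtnl_held);

	if (br->vlan_enabled == val)
		return 0;

	br->vlan_enabled = val;
	br->recalcs++; /* promisc, group address, forward mask, ... */
	return 0;
}

/* Wrapper for callers that do not hold the lock yet (sysfs-style path). */
int toy_vlan_filter_toggle(struct toy_bridge *br, bool val)
{
	int err;

	if (!toy_rtnl_trylock())
		return -1; /* simplified: the kernel restarts the syscall */

	err = toy_vlan_filter_toggle_locked(br, val);
	toy_rtnl_unlock();
	return err;
}
```

A changelink-style caller that already runs under the lock calls the _locked variant directly, which is why no additional locking is needed there.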

 include/uapi/linux/if_link.h |  1 +
 net/bridge/br_netlink.c  | 14 +-
 net/bridge/br_private.h  |  7 +++
 net/bridge/br_vlan.c | 18 --
 4 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index ea047480a1f0..7531815bf88a 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -230,6 +230,7 @@ enum {
IFLA_BR_AGEING_TIME,
IFLA_BR_STP_STATE,
IFLA_BR_PRIORITY,
+   IFLA_BR_VLAN_FILTERING,
__IFLA_BR_MAX,
 };
 
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 91a2e08c2bb8..6eb683d8e0c5 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -724,6 +724,7 @@ static const struct nla_policy br_policy[IFLA_BR_MAX + 1] = 
{
[IFLA_BR_AGEING_TIME] = { .type = NLA_U32 },
[IFLA_BR_STP_STATE] = { .type = NLA_U32 },
[IFLA_BR_PRIORITY] = { .type = NLA_U16 },
+   [IFLA_BR_VLAN_FILTERING] = { .type = NLA_U8 },
 };
 
 static int br_changelink(struct net_device *brdev, struct nlattr *tb[],
@@ -771,6 +772,14 @@ static int br_changelink(struct net_device *brdev, struct 
nlattr *tb[],
br_stp_set_bridge_priority(br, priority);
}
 
+   if (data[IFLA_BR_VLAN_FILTERING]) {
+   u8 vlan_filter = nla_get_u8(data[IFLA_BR_VLAN_FILTERING]);
+
+   err = __br_vlan_filter_toggle(br, vlan_filter);
+   if (err)
+   return err;
+   }
+
return 0;
 }
 
@@ -782,6 +791,7 @@ static size_t br_get_size(const struct net_device *brdev)
   nla_total_size(sizeof(u32)) +/* IFLA_BR_AGEING_TIME */
   nla_total_size(sizeof(u32)) +/* IFLA_BR_STP_STATE */
   nla_total_size(sizeof(u16)) +/* IFLA_BR_PRIORITY */
+  nla_total_size(sizeof(u8)) + /* IFLA_BR_VLAN_FILTERING */
   0;
 }
 
@@ -794,13 +804,15 @@ static int br_fill_info(struct sk_buff *skb, const struct 
net_device *brdev)
u32 ageing_time = jiffies_to_clock_t(br->ageing_time);
u32 stp_enabled = br->stp_enabled;
u16 priority = (br->bridge_id.prio[0] << 8) | br->bridge_id.prio[1];
+   u8 vlan_enabled = br_vlan_enabled(br);
 
if (nla_put_u32(skb, IFLA_BR_FORWARD_DELAY, forward_delay) ||
nla_put_u32(skb, IFLA_BR_HELLO_TIME, hello_time) ||
nla_put_u32(skb, IFLA_BR_MAX_AGE, age_time) ||
nla_put_u32(skb, IFLA_BR_AGEING_TIME, ageing_time) ||
nla_put_u32(skb, IFLA_BR_STP_STATE, stp_enabled) ||
-   nla_put_u16(skb, IFLA_BR_PRIORITY, priority))
+   nla_put_u16(skb, IFLA_BR_PRIORITY, priority) ||
+   nla_put_u8(skb, IFLA_BR_VLAN_FILTERING, vlan_enabled))
return -EMSGSIZE;
 
return 0;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index e2cb359f9dd3..3d95647039d0 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -614,6 +614,7 @@ int br_vlan_delete(struct net_bridge *br, u16 vid);
 void br_vlan_flush(struct net_bridge *br);
 bool br_vlan_find(struct net_bridge *br, u16 vid);
 void br_recalculate_fwd_mask(struct net_bridge *br);
+int __br_vlan_filter_toggle(struct net_bridge *br, unsigned long val);
 int br_vlan_filter_toggle(struct net_bridge *br, unsigned long val);
 int br_vlan_set_proto(struct net_bridge *br, unsigned long val);
 int br_vlan_init(struct net_bridge *br);
@@ -771,6 +772,12 @@ static inline int br_vlan_enabled(struct net_bridge *br)
 {
return 0;
 }
+
+static inline int __br_vlan_filter_toggle(struct net_bridge *br,
+ unsigned long val)
+{
+   return -EOPNOTSUPP;
+}
 #endif
 
 struct nf_br_ops {
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 0d41f81838ff..3cef6892c0bb 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -468,21 +468,27 @@ void br_recalculate_fwd_mask(struct net_bridge *br)
  ~(1u << br->group_addr[5]);
 }
 
-int br_vlan_filter_toggle(struct net_bridge *br, unsigned long val)
+int __br_vlan_filter_toggle(struct net_bridge *br, unsigned long val)
 {
-   if (!rtnl_trylock())
-   return restart_syscall();
-
if (br->vlan_enabled == val)
-   goto unlock;
+   return 0;
 
br->vlan_enabled = val;
br_manage_promisc(br);
recalculate_group_addr(br);
br_recalculate_fwd_mask(br);
 
-unlock:
+   return 0;
+}
+
+int br_vlan_filter_toggle(struct net_br

[PATCH v3 4/9] xen: Use correctly the Xen memory terminologies

2015-08-07 Thread Julien Grall
Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
is meant. I suspect this is because the first support for Xen was for
PV. This resulted in some misimplementation of helpers on ARM and
confused developers about the expected behavior.

For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
However, if we look at the implementation on x86, it's returning a GFN.

For clarity and to avoid new confusion, replace any reference to mfn with
gfn in any helpers used by PV drivers. The x86 code will still keep some
references to pfn_to_mfn, but exclusively for PV (a BUG_ON has been added
to ensure this). No changes have been made to the hypercall fields, even
though they may be invalid, in order to keep them the same as the definitions
in the xen repo.

Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a
name too close to the KVM function gfn_to_page.

Also take the opportunity to simplify simple constructions such
as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex cleanups
will come in follow-up patches.

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb

Signed-off-by: Julien Grall 
Reviewed-by: Stefano Stabellini 
Acked-by: Dmitry Torokhov 
Acked-by: Wei Liu 

---
Cc: Russell King 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: "Roger Pau Monné" 
Cc: Ian Campbell 
Cc: Juergen Gross 
Cc: "James E.J. Bottomley" 
Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Tomi Valkeinen 
Cc: linux-in...@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-fb...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org

Note that I've re-introduced in v2 mfn_to_pfn & co only for x86 PV code.
The helpers contain a BUG_ON to ensure that they are never called for
auto-translated guests. I did my best to determine whether
mfn or gfn helpers should be used. However, I haven't tried to boot
it.

It may be possible to do further cleanup in mmu.c, where I found
some checks for auto-translated guests. I'm not sure why, given that the pvmmu
callbacks are only used for non-auto-translated guests.

Changes in v3:
- Add Stefano's reviewed-by (except for the x86 bits)
- Add Wei (netback) and Dmitry's (input) acked-by
- Keep the VIRT <-> MACHINE macro in the same order as before
in arch/x86/include/asm/xen/page.h
- Rename page_to_gfn to xen_page_to_gfn to avoid confusion with
the KVM function gfn_to_page.
- Typos in the commit title

Changes in v2:
- Give directly the URL to the commit rather than the commit ID
- xenstored_local_init: keep the cast to void *
- Typos
- Keep pfn_to_mfn for x86 and PV-only. The *mfn* helpers are
used in arch/x86/xen for enlighten.c, mmu.c, p2m.c, setup.c,
smp.c and mm.c
---
 arch/arm/include/asm/xen/page.h | 13 +++--
 arch/x86/include/asm/xen/page.h | 31 +--
 arch/x86/xen/smp.c  |  2 +-
 drivers/block/xen-blkfront.c|  6 +++---
 drivers/input/misc/xen-kbdfront.c   |  4 ++--
 drivers/net/xen-netback/netback.c   |  4 ++--
 drivers/net/xen-netfront.c  | 12 +++-
 drivers/scsi/xen-scsifront.c| 10 +-
 drivers/tty/hvc/hvc_xen.c   |  5 +++--
 drivers/video/fbdev/xen-fbfront.c   |  4 ++--
 drivers/xen/balloon.c   |  2 +-
 drivers/xen/events/events_base.c|  2 +-
 drivers/xen/events/events_fifo.c|  4 ++--
 drivers/xen/gntalloc.c  |  3 ++-
 drivers/xen/manage.c|  2 +-
 drivers/xen/tmem.c  |  4 ++--
 drivers/xen/xenbus/xenbus_client.c  |  2 +-
 drivers/xen/xenbus/xenbus_dev_backend.c |  2 +-
 drivers/xen/xenbus/xenbus_probe.c   |  8 +++-
 include/xen/page.h  |  4 ++--
 20 files changed, 73 insertions(+), 51 deletions(-)

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 911d62b..1279563 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -34,14 +34,15 @@ typedef struct xpaddr {
 unsigned long __pfn_to_mfn(unsigned long pfn);
 extern struct rb_root phys_to_mach;
 
-static inline unsigned long pfn_to_mfn(unsigned long pfn)
+/* Pseudo-physical <-> Guest conversion */
+static inline unsigned long pfn_to_gfn(unsigned long pfn)
 {
return pfn;
 }
 
-static inline unsigned long mfn_to_pfn(unsigned long mfn)
+static inline unsigned long gfn_to_pfn(unsigned long gfn)
 {
-   return mfn;
+   return gfn;
 }
 
 /* Pseudo-physical <-> BUS conversion */
@@ -65,9 +66,9 @@ static inline unsigned long bfn_to_pfn(unsigned long bfn)
 
 #define bfn_to_local_pfn

[PATCH v3 0/9] Use correctly the Xen memory terminologies

2015-08-07 Thread Julien Grall
Hi all,

This patch series aims to use the memory terminologies described in
include/xen/mm.h [1] for Linux xen code.

The differences from v2 are minor, but I am resending it because my 64K series
depends on this series.

Linux is mistakenly using MFN when GFN is meant. I suspect this is because the
first support for Xen was for PV. This has led to some misimplementation
of memory helpers on ARM and has confused developers about the expected
behavior.

For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
However, if we look at the implementation on x86, it's returning a GFN.
Most of the callers are also using it this way.

The first 2 patches of this series are ARM related: they remove
PV-specific helpers which should not be used and fix the implementation of
pfn_to_mfn.

The rest of the series renames most of the usage of MFN in the common code
to GFN. I also took the opportunity to replace most of the calls to
pfn_to_gfn in the common code with xen_page_to_gfn, to avoid constructions such
as pfn_to_gfn(page_to_pfn(...)).
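
As a rough illustration of that last simplification, here is a userspace toy
model (not kernel code). The struct page stand-in, toy_mem_map and
TOY_NR_PAGES are assumptions made up for this sketch; pfn_to_gfn is shown as
an identity mapping, which is what the ARM helper does. The combined helper
uses the v3 name, xen_page_to_gfn:

```c
#include <stddef.h>

/* Toy stand-in for the kernel type; NOT the real definition. */
struct page { int unused; };

#define TOY_NR_PAGES 16
static struct page toy_mem_map[TOY_NR_PAGES];

/* Pseudo-physical frame number of a page: its index in the toy mem_map. */
static unsigned long page_to_pfn(const struct page *page)
{
	return (unsigned long)(page - toy_mem_map);
}

/* Identity mapping, as for an auto-translated guest. */
static unsigned long pfn_to_gfn(unsigned long pfn)
{
	return pfn;
}

/* One helper replaces the nested pfn_to_gfn(page_to_pfn(page)) call. */
static unsigned long xen_page_to_gfn(const struct page *page)
{
	return pfn_to_gfn(page_to_pfn(page));
}
```

Callers then pass a struct page directly and never see the intermediate PFN.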

Note that the one in xen-blkfront will be dropped by the 64K series [2]; I can
include it here if necessary.

Major changes in v3:
- More typos
- Rename page_to_gfn to xen_page_to_gfn to avoid confusion with the
KVM function gfn_to_page

Major changes in v2:
- Use bfn rather than dfn for the DMA address
- Re-introduced pfn_to_mfn for PV guests only
- Typos

For the full list of changes, see each patch.

This series is based on xentip for-linus-4.3 branch. A branch with all the
patches can be found here:
git://xenbits.xen.org/people/julieng/linux-arm.git branch page-renaming-v3

It has been boot tested on ARM64 and ARM32 and only built for x86.
I would be happy if someone could give it a try on x86, as I don't have an x86
box set up with Xen.

Sincerely yours,

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb
[2] https://lkml.org/lkml/2015/7/9/628

Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: Dmitry Torokhov 
Cc: Greg Kroah-Hartman 
Cc: "H. Peter Anvin" 
Cc: Ian Campbell 
Cc: Ingo Molnar 
Cc: "James E.J. Bottomley" 
Cc: Jean-Christophe Plagniol-Villard 
Cc: Jiri Slaby 
Cc: Juergen Gross 
Cc: Konrad Rzeszutek Wilk 
Cc: linux-...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-fb...@vger.kernel.org
Cc: linux-in...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: "Roger Pau Monné" 
Cc: Russell King 
Cc: Stefano Stabellini 
Cc: Thomas Gleixner 
Cc: Tomi Valkeinen 
Cc: Wei Liu 
Cc: x...@kernel.org



Julien Grall (9):
  arm/xen: Remove helpers which are PV specific
  xen: Make clear that swiotlb and biomerge are dealing with DMA address
  arm/xen: implement correctly pfn_to_mfn
  xen: Use correctly the Xen memory terminologies
  xen/tmem: Use xen_page_to_gfn rather than pfn_to_gfn
  video/xen-fbfront: Further s/MFN/GFN clean-up
  hvc/xen: Further s/MFN/GFN clean-up
  xen/privcmd: Further s/MFN/GFN/ clean-up
  xen/xenbus: Rename the variable xen_store_mfn to xen_store_gfn

 arch/arm/include/asm/xen/page.h | 44 -
 arch/arm/xen/enlighten.c| 18 +++---
 arch/arm/xen/mm.c   |  4 +--
 arch/x86/include/asm/xen/page.h | 35 +-
 arch/x86/xen/mmu.c  | 32 
 arch/x86/xen/smp.c  |  2 +-
 drivers/block/xen-blkfront.c|  6 ++---
 drivers/input/misc/xen-kbdfront.c   |  4 +--
 drivers/net/xen-netback/netback.c   |  4 +--
 drivers/net/xen-netfront.c  | 12 +
 drivers/scsi/xen-scsifront.c| 10 
 drivers/tty/hvc/hvc_xen.c   | 18 ++
 drivers/video/fbdev/xen-fbfront.c   | 20 +++
 drivers/xen/balloon.c   |  2 +-
 drivers/xen/biomerge.c  |  6 ++---
 drivers/xen/events/events_base.c|  2 +-
 drivers/xen/events/events_fifo.c|  4 +--
 drivers/xen/gntalloc.c  |  3 ++-
 drivers/xen/manage.c|  2 +-
 drivers/xen/privcmd.c   | 44 -
 drivers/xen/swiotlb-xen.c   | 16 ++--
 drivers/xen/tmem.c  | 21 ++--
 drivers/xen/xenbus/xenbus_client.c  |  2 +-
 drivers/xen/xenbus/xenbus_dev_backend.c |  2 +-
 drivers/xen/xenbus/xenbus_probe.c   | 16 ++--
 drivers/xen/xlate_mmu.c | 18 +++---
 include/uapi/xen/privcmd.h  |  4 +++
 include/xen/page.h  |  4 +--
 include/xen/xen-ops.h   | 10 
 29 files changed, 191 insertions(+), 174 deletions(-)

-- 
2.1.4



[PATCH] net-timestamp: Update skb_complete_tx_timestamp comment

2015-08-07 Thread Benjamin Poirier
After "62bccb8 net-timestamp: Make the clone operation stand-alone from phy
timestamping" the hwtstamps parameter of skb_complete_tx_timestamp() may no
longer be NULL.

Signed-off-by: Benjamin Poirier 
Cc: Alexander Duyck 
---
 include/linux/skbuff.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index d6cdd6e..22b6d9c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2884,11 +2884,11 @@ static inline bool skb_defer_rx_timestamp(struct 
sk_buff *skb)
  *
  * PHY drivers may accept clones of transmitted packets for
  * timestamping via their phy_driver.txtstamp method. These drivers
- * must call this function to return the skb back to the stack, with
- * or without a timestamp.
+ * must call this function to return the skb back to the stack with a
+ * timestamp.
  *
  * @skb: clone of the the original outgoing packet
- * @hwtstamps: hardware time stamps, may be NULL if not available
+ * @hwtstamps: hardware time stamps
  *
  */
 void skb_complete_tx_timestamp(struct sk_buff *skb,
-- 
2.4.3



Re: [PATCH net-next] bridge: netlink: add support for vlan_filtering attribute

2015-08-07 Thread Nikolay Aleksandrov

> On Aug 7, 2015, at 7:24 PM, Nikolay Aleksandrov  wrote:
> 
> From: Nikolay Aleksandrov 
> 
> This patch adds the ability to toggle the vlan filtering support via
> netlink. Since we're already running with rtnl in .changelink() we don't
> need to take any additional locks.
> 
> Signed-off-by: Nikolay Aleksandrov 
> ---
> I'll post the iproute2 patch if this one gets accepted.
> 

Uh, I wanted to post a version that returns an error when vlan filtering isn’t
supported instead of always succeeding. So please drop this patch; I’ll post a
v2 with that change so the user will know if vlan filtering was actually
enabled.
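
The shape of that change, an unlocked helper that reports an error plus a
locked wrapper around it, can be sketched in userspace roughly as below. This
is only a model under stated assumptions: struct toy_bridge, the
vlan_supported flag (standing in for the compile-time vlan filtering option)
and the atomic_flag used in place of rtnl are all made up for illustration.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in; not the kernel's struct net_bridge. */
struct toy_bridge {
	int vlan_supported;		/* models vlan filtering being built in */
	unsigned long vlan_enabled;
};

/* Stand-in for rtnl: set means "held". */
static atomic_flag toy_rtnl = ATOMIC_FLAG_INIT;

/* Caller must already hold toy_rtnl, like the netlink .changelink() path. */
static int toy_vlan_filter_toggle_locked(struct toy_bridge *br,
					 unsigned long val)
{
	if (!br->vlan_supported)
		return -EOPNOTSUPP;	/* tell the user nothing was enabled */

	if (br->vlan_enabled == val)
		return 0;

	br->vlan_enabled = val;
	return 0;
}

/* Entry point for callers that do not hold the lock, like the sysfs path. */
static int toy_vlan_filter_toggle(struct toy_bridge *br, unsigned long val)
{
	int err;

	if (atomic_flag_test_and_set(&toy_rtnl))
		return -EAGAIN;	/* the kernel restarts the syscall instead */

	err = toy_vlan_filter_toggle_locked(br, val);
	atomic_flag_clear(&toy_rtnl);

	return err;
}
```

The netlink path would call the _locked variant directly and propagate its
return value, which is how the user finds out that filtering is unsupported.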





[PATCH net-next] bridge: netlink: add support for vlan_filtering attribute

2015-08-07 Thread Nikolay Aleksandrov
From: Nikolay Aleksandrov 

This patch adds the ability to toggle the vlan filtering support via
netlink. Since we're already running with rtnl in .changelink() we don't
need to take any additional locks.

Signed-off-by: Nikolay Aleksandrov 
---
I'll post the iproute2 patch if this one gets accepted.

 include/uapi/linux/if_link.h |  1 +
 net/bridge/br_netlink.c  | 12 +++-
 net/bridge/br_private.h  |  6 ++
 net/bridge/br_vlan.c | 16 ++--
 4 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index ea047480a1f0..7531815bf88a 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -230,6 +230,7 @@ enum {
IFLA_BR_AGEING_TIME,
IFLA_BR_STP_STATE,
IFLA_BR_PRIORITY,
+   IFLA_BR_VLAN_FILTERING,
__IFLA_BR_MAX,
 };
 
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 91a2e08c2bb8..2f5ab3def714 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -724,6 +724,7 @@ static const struct nla_policy br_policy[IFLA_BR_MAX + 1] = 
{
[IFLA_BR_AGEING_TIME] = { .type = NLA_U32 },
[IFLA_BR_STP_STATE] = { .type = NLA_U32 },
[IFLA_BR_PRIORITY] = { .type = NLA_U16 },
+   [IFLA_BR_VLAN_FILTERING] = { .type = NLA_U8 },
 };
 
 static int br_changelink(struct net_device *brdev, struct nlattr *tb[],
@@ -771,6 +772,12 @@ static int br_changelink(struct net_device *brdev, struct 
nlattr *tb[],
br_stp_set_bridge_priority(br, priority);
}
 
+   if (data[IFLA_BR_VLAN_FILTERING]) {
+   u8 vlan_filter = nla_get_u8(data[IFLA_BR_VLAN_FILTERING]);
+
+   __br_vlan_filter_toggle(br, vlan_filter);
+   }
+
return 0;
 }
 
@@ -782,6 +789,7 @@ static size_t br_get_size(const struct net_device *brdev)
   nla_total_size(sizeof(u32)) +/* IFLA_BR_AGEING_TIME */
   nla_total_size(sizeof(u32)) +/* IFLA_BR_STP_STATE */
   nla_total_size(sizeof(u16)) +/* IFLA_BR_PRIORITY */
+  nla_total_size(sizeof(u8)) + /* IFLA_BR_VLAN_FILTERING */
   0;
 }
 
@@ -794,13 +802,15 @@ static int br_fill_info(struct sk_buff *skb, const struct 
net_device *brdev)
u32 ageing_time = jiffies_to_clock_t(br->ageing_time);
u32 stp_enabled = br->stp_enabled;
u16 priority = (br->bridge_id.prio[0] << 8) | br->bridge_id.prio[1];
+   u8 vlan_enabled = br_vlan_enabled(br);
 
if (nla_put_u32(skb, IFLA_BR_FORWARD_DELAY, forward_delay) ||
nla_put_u32(skb, IFLA_BR_HELLO_TIME, hello_time) ||
nla_put_u32(skb, IFLA_BR_MAX_AGE, age_time) ||
nla_put_u32(skb, IFLA_BR_AGEING_TIME, ageing_time) ||
nla_put_u32(skb, IFLA_BR_STP_STATE, stp_enabled) ||
-   nla_put_u16(skb, IFLA_BR_PRIORITY, priority))
+   nla_put_u16(skb, IFLA_BR_PRIORITY, priority) ||
+   nla_put_u8(skb, IFLA_BR_VLAN_FILTERING, vlan_enabled))
return -EMSGSIZE;
 
return 0;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index e2cb359f9dd3..f8b613195a07 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -614,6 +614,7 @@ int br_vlan_delete(struct net_bridge *br, u16 vid);
 void br_vlan_flush(struct net_bridge *br);
 bool br_vlan_find(struct net_bridge *br, u16 vid);
 void br_recalculate_fwd_mask(struct net_bridge *br);
+void __br_vlan_filter_toggle(struct net_bridge *br, unsigned long val);
 int br_vlan_filter_toggle(struct net_bridge *br, unsigned long val);
 int br_vlan_set_proto(struct net_bridge *br, unsigned long val);
 int br_vlan_init(struct net_bridge *br);
@@ -771,6 +772,11 @@ static inline int br_vlan_enabled(struct net_bridge *br)
 {
return 0;
 }
+
+static inline void __br_vlan_filter_toggle(struct net_bridge *br,
+  unsigned long val)
+{
+}
 #endif
 
 struct nf_br_ops {
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 0d41f81838ff..ea07b9eae0f6 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -468,21 +468,25 @@ void br_recalculate_fwd_mask(struct net_bridge *br)
  ~(1u << br->group_addr[5]);
 }
 
-int br_vlan_filter_toggle(struct net_bridge *br, unsigned long val)
+void __br_vlan_filter_toggle(struct net_bridge *br, unsigned long val)
 {
-   if (!rtnl_trylock())
-   return restart_syscall();
-
if (br->vlan_enabled == val)
-   goto unlock;
+   return;
 
br->vlan_enabled = val;
br_manage_promisc(br);
recalculate_group_addr(br);
br_recalculate_fwd_mask(br);
+}
 
-unlock:
+int br_vlan_filter_toggle(struct net_bridge *br, unsigned long val)
+{
+   if (!rtnl_trylock())
+   return restart_syscall();
+
+   __br_vlan_filter_toggle(br, val);
rtnl_unlock();
+
return 0;
 }
 
-- 
2.4.

Re: [BUG] net/ipv4: inconsistent routing table

2015-08-07 Thread Alexander Duyck

On 08/07/2015 01:23 AM, Zang MingJie wrote:

> IMO, the routing decision is determined, given a specific routing
> table and local network the result MUST be determined, independence of
> how/what order the routing entry is added.
>
> Now there are two ways to configure the system resulting EXACTLY the
> same routing table and local addresses, but the routing decision is
> totally different.
>
> SAME routing table, DIFFERENT routing decision, there MUST be bugs in kernel


I wasn't arguing that the behavior is undesirable, but the likelihood of 
having a default route assigned to a local address should be pretty 
low.  If the system is the default route of others then it should have a 
different default gateway than itself.  For example an office router 
would end up pointing to the ISP as the gateway, and the ISP would 
either point to some other provider or run a BGP configuration.  So in 
the case of the default route transitioning to us we should end up 
having to delete and update the default route anyway.  This is likely 
one of the reasons why there haven't been any issues reported with this 
behavior until now.


I'm just wondering if the work involved to fix it is going to be worth 
it.  We have to keep in mind that this will result in a change of 
behavior for existing users and we don't know if anyone might be 
expecting this type of behavior.


We basically are looking at one of three options.  The first one is to 
just delete the route if you add the gateway as a local address or 
remove it.  That would be consistent with what you might see if the 
address was the sole address on an interface of its own.  The second 
option is to update the nh_scope, which I believe should be transitioned 
between RT_SCOPE_HOST and RT_SCOPE_LINK if I am understanding things 
correctly.  The third option is we don't change the behavior and just 
document it.  This would then require manually deleting and restoring 
any routes that use a recently modified address as their gateway.


Based on your feedback I'm assuming you would probably prefer the second 
option.  I'm just waiting to see if there are any other opinions on the 
matter before I act.


Thanks.

- Alex


[PATCH] can: flexcan: demote register output to debug level

2015-08-07 Thread Lucas Stach
This message isn't really helpful for the general reader of the kernel
logs, so it should not be printed at info level. All other register
programming outputs in the flexcan driver already use the debug level.

Signed-off-by: Lucas Stach 
---
 drivers/net/can/flexcan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index ad0a7e8c2c2b..95abd395cb0d 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -797,7 +797,7 @@ static void flexcan_set_bittiming(struct net_device *dev)
if (priv->can.ctrlmode & CAN_CTRLMODE_3_SAMPLES)
reg |= FLEXCAN_CTRL_SMP;
 
-   netdev_info(dev, "writing ctrl=0x%08x\n", reg);
+   netdev_dbg(dev, "writing ctrl=0x%08x\n", reg);
flexcan_write(reg, &regs->ctrl);
 
/* print chip status */
-- 
2.4.6



Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread Graeme Gregory
On Thu, Aug 06, 2015 at 05:33:10PM -0700, David Daney wrote:
> From: David Daney 
> 
> Find out which PHYs belong to which BGX instance in the ACPI way.
> 
> Set the MAC address of the device as provided by ACPI tables. This is
> similar to the implementation for devicetree in
> of_get_mac_address(). The table is searched for the device property
> entries "mac-address", "local-mac-address" and "address" in that
> order. The address is provided in a u64 variable and must contain a
> valid 6 bytes-len mac addr.
> 
> Based on code from: Narinder Dhillon 
> Tomasz Nowicki 
> Robert Richter 
> 
> Signed-off-by: Tomasz Nowicki 
> Signed-off-by: Robert Richter 
> Signed-off-by: David Daney 
> ---
>  drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 137 
> +-
>  1 file changed, 135 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c 
> b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
> index 615b2af..2056583 100644
> --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
> +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
> @@ -6,6 +6,7 @@
>   * as published by the Free Software Foundation.
>   */
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -26,7 +27,7 @@
>  struct lmac {
>   struct bgx  *bgx;
>   int dmac;
> - unsigned char   mac[ETH_ALEN];
> + u8  mac[ETH_ALEN];
>   boollink_up;
>   int lmacid; /* ID within BGX */
>   int lmacid_bd; /* ID on board */
> @@ -835,6 +836,133 @@ static void bgx_get_qlm_mode(struct bgx *bgx)
>   }
>  }
>  
> +#ifdef CONFIG_ACPI
> +
> +static int bgx_match_phy_id(struct device *dev, void *data)
> +{
> + struct phy_device *phydev = to_phy_device(dev);
> + u32 *phy_id = data;
> +
> + if (phydev->addr == *phy_id)
> + return 1;
> +
> + return 0;
> +}
> +
> +static const char * const addr_propnames[] = {
> + "mac-address",
> + "local-mac-address",
> + "address",
> +};
> +
> +static int acpi_get_mac_address(struct acpi_device *adev, u8 *dst)
> +{
> + const union acpi_object *prop;
> + u64 mac_val;
> + u8 mac[ETH_ALEN];
> + int i, j;
> + int ret;
> +
> + for (i = 0; i < ARRAY_SIZE(addr_propnames); i++) {
> + ret = acpi_dev_get_property(adev, addr_propnames[i],
> + ACPI_TYPE_INTEGER, &prop);

Shouldn't this be trying to use the device_property_read_* API and making
the DT/ACPI path the same where possible?

Graeme

> + if (ret)
> + continue;
> +
> + mac_val = prop->integer.value;
> +
> + if (mac_val & (~0ULL << 48))
> + continue;   /* more than 6 bytes */
> +
> + for (j = 0; j < ARRAY_SIZE(mac); j++)
> + mac[j] = (u8)(mac_val >> (8 * j));
> + if (!is_valid_ether_addr(mac))
> + continue;
> +
> + memcpy(dst, mac, ETH_ALEN);
> +
> + return 0;
> + }
> +
> + return ret ? ret : -EINVAL;
> +}
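
For reference, the u64-to-MAC conversion in the quoted function can be
exercised outside the kernel with a sketch like the one below. The toy_ names
are made up for illustration, and toy_is_valid_ether_addr is a userspace
approximation of is_valid_ether_addr() (reject multicast and all-zero
addresses), not the kernel helper itself.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Approximates is_valid_ether_addr(): not multicast, not all zeroes. */
static int toy_is_valid_ether_addr(const uint8_t *addr)
{
	static const uint8_t zero[ETH_ALEN];

	return !(addr[0] & 1) && memcmp(addr, zero, ETH_ALEN) != 0;
}

/* Returns 0 and fills dst on success, -1 if mac_val is not a usable MAC. */
static int toy_mac_from_u64(uint64_t mac_val, uint8_t *dst)
{
	uint8_t mac[ETH_ALEN];
	int j;

	if (mac_val & (~0ULL << 48))
		return -1;	/* more than 6 bytes */

	/* Lowest byte of the integer becomes the first octet. */
	for (j = 0; j < ETH_ALEN; j++)
		mac[j] = (uint8_t)(mac_val >> (8 * j));

	if (!toy_is_valid_ether_addr(mac))
		return -1;

	memcpy(dst, mac, ETH_ALEN);
	return 0;
}
```

Note the byte order this implies: the least significant byte of the integer
ends up as the first octet of the address, so firmware has to encode the
property accordingly.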
> +
> +static acpi_status bgx_acpi_register_phy(acpi_handle handle,
> +  u32 lvl, void *context, void **rv)
> +{
> + struct acpi_reference_args args;
> + const union acpi_object *prop;
> + struct bgx *bgx = context;
> + struct acpi_device *adev;
> + struct device *phy_dev;
> + u32 phy_id;
> +
> + if (acpi_bus_get_device(handle, &adev))
> + goto out;
> +
> + SET_NETDEV_DEV(&bgx->lmac[bgx->lmac_count].netdev, &bgx->pdev->dev);
> +
> + acpi_get_mac_address(adev, bgx->lmac[bgx->lmac_count].mac);
> +
> + bgx->lmac[bgx->lmac_count].lmacid = bgx->lmac_count;
> +
> + if (acpi_dev_get_property_reference(adev, "phy-handle", 0, &args))
> + goto out;
> +
> + if (acpi_dev_get_property(args.adev, "phy-channel", ACPI_TYPE_INTEGER, 
> &prop))
> + goto out;
> +
> + phy_id = prop->integer.value;
> +
> + phy_dev = bus_find_device(&mdio_bus_type, NULL, (void *)&phy_id,
> +   bgx_match_phy_id);
> + if (!phy_dev)
> + goto out;
> +
> + bgx->lmac[bgx->lmac_count].phydev = to_phy_device(phy_dev);
> +out:
> + bgx->lmac_count++;
> + return AE_OK;
> +}
> +
> +static acpi_status bgx_acpi_match_id(acpi_handle handle, u32 lvl,
> +  void *context, void **ret_val)
> +{
> + struct acpi_buffer string = { ACPI_ALLOCATE_BUFFER, NULL };
> + struct bgx *bgx = context;
> + char bgx_sel[5];
> +
> + snprintf(bgx_sel, 5, "BGX%d", bgx->bgx_id);
> + if (ACPI_FAILURE(acpi_get_name(handle, ACPI_SINGLE_NAME, &string))) {
> + pr_warn("Invalid link device\n");
> + return AE_OK;
> + }
> +
> + if (strncmp(string.pointer, bgx_sel, 4))
> 

Re: [Xen-devel] [PATCH v2] xen-apic: Enable on domU as well

2015-08-07 Thread Jason A. Donenfeld
On Fri, Aug 7, 2015 at 4:23 PM, Konrad Rzeszutek Wilk
 wrote:
> Anyhow, your patch seems to fix a regression my patch
> feb44f1f7a4ac299d1ab1c3606860e70b9b89d69
> "x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs"
> introduced.

Ahhh, good, okay. That explains why I didn't encounter this with older
kernels. The whole picture makes sense now. Thanks for reviewing this.

David - mergeable?


Re: [Xen-devel] [PATCH v2] xen-apic: Enable on domU as well

2015-08-07 Thread Konrad Rzeszutek Wilk
On Thu, Aug 06, 2015 at 06:37:05PM +0200, Jason A. Donenfeld wrote:
> It turns out that domU also requires the Xen APIC driver. Otherwise we
> get stuck in busy loops that never exit, such as in this stack trace:
> 
> (gdb) target remote localhost:
> Remote debugging using localhost:
> __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
> 56  while (native_apic_mem_read(APIC_ICR) & APIC_ICR_BUSY)
> (gdb) bt
>  #0  __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56
>  #1  __default_send_IPI_shortcut (shortcut=,
> dest=, vector=) at
> ./arch/x86/include/asm/ipi.h:75
>  #2  apic_send_IPI_self (vector=246) at arch/x86/kernel/apic/probe_64.c:54
>  #3  0x81011336 in arch_irq_work_raise () at
> arch/x86/kernel/irq_work.c:47
>  #4  0x8114990c in irq_work_queue (work=0x88000fc0e400) at
> kernel/irq_work.c:100
>  #5  0x8110c29d in wake_up_klogd () at kernel/printk/printk.c:2633
>  #6  0x8110ca60 in vprintk_emit (facility=0, level=<optimized out>, dict=0x0 , dictlen=,
> fmt=, args=)
> at kernel/printk/printk.c:1778
>  #7  0x816010c8 in printk (fmt=) at
> kernel/printk/printk.c:1868
>  #8  0xc00013ea in ?? ()
>  #9  0x in ?? ()
> 
> Mailing-list-thread: https://lkml.org/lkml/2015/8/4/755
> Signed-off-by: Jason A. Donenfeld 
> Cc: David Vrabel 
> Cc: Ian Campbell 
> Cc: 

While this patch is OK for the trees that implement the PV APIC
driver it won't apply to older ones (and it does not need to).

In the older ones this was working with f447d56d36af18c5104ff29dcb1327c0c0ac3634
"xen: implement apic ipi interface", which should have worked for
your case.

That would have made arch_irq_work_raise and the like use the
Xen code paths:

#ifdef CONFIG_SMP
	apic->send_IPI_allbutself = xen_send_IPI_allbutself;
	apic->send_IPI_mask_allbutself = xen_send_IPI_mask_allbutself;
	apic->send_IPI_mask = xen_send_IPI_mask;
	apic->send_IPI_all = xen_send_IPI_all;
	apic->send_IPI_self = xen_send_IPI_self;
#endif

Anyhow, your patch seems to fix a regression my patch 
feb44f1f7a4ac299d1ab1c3606860e70b9b89d69
"x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs"
introduced.

I would add this note for stable.vger.kernel.org:
# apply only to v4.1

since the earlier ones will work fine.

Thank you!

Oh, and Reviewed-by: Konrad Rzeszutek Wilk 


> ---
>  arch/x86/xen/Makefile  |  4 ++--
>  arch/x86/xen/xen-ops.h | 11 ---
>  2 files changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
> index 7322755..4b6e29a 100644
> --- a/arch/x86/xen/Makefile
> +++ b/arch/x86/xen/Makefile
> @@ -13,13 +13,13 @@ CFLAGS_mmu.o  := $(nostackp)
>  obj-y:= enlighten.o setup.o multicalls.o mmu.o irq.o \
>   time.o xen-asm.o xen-asm_$(BITS).o \
>   grant-table.o suspend.o platform-pci-unplug.o \
> - p2m.o
> + p2m.o apic.o
>  
>  obj-$(CONFIG_EVENT_TRACING) += trace.o
>  
>  obj-$(CONFIG_SMP)+= smp.o
>  obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= spinlock.o
>  obj-$(CONFIG_XEN_DEBUG_FS)   += debugfs.o
> -obj-$(CONFIG_XEN_DOM0)   += apic.o vga.o
> +obj-$(CONFIG_XEN_DOM0)   += vga.o
>  obj-$(CONFIG_SWIOTLB_XEN)+= pci-swiotlb-xen.o
>  obj-$(CONFIG_XEN_EFI)+= efi.o
> diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
> index c20fe29..cd248ff 100644
> --- a/arch/x86/xen/xen-ops.h
> +++ b/arch/x86/xen/xen-ops.h
> @@ -98,20 +98,17 @@ static inline void xen_uninit_lock_cpu(int cpu)
>  #endif
>  
>  struct dom0_vga_console_info;
> -
>  #ifdef CONFIG_XEN_DOM0
>  void __init xen_init_vga(const struct dom0_vga_console_info *, size_t size);
> -void __init xen_init_apic(void);
>  #else
> -static inline void __init xen_init_vga(const struct dom0_vga_console_info 
> *info,
> -size_t size)
> -{
> -}
> -static inline void __init xen_init_apic(void)
> +void __init xen_init_vga(const struct dom0_vga_console_info *info,
> + size_t size);
>  {
>  }
>  #endif
>  
> +void __init xen_init_apic(void);
> +
>  #ifdef CONFIG_XEN_EFI
>  extern void xen_efi_init(void);
>  #else
> -- 
> 2.5.0
> 
> 


Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread Mark Rutland
On Fri, Aug 07, 2015 at 01:33:10AM +0100, David Daney wrote:
> From: David Daney 
> 
> Find out which PHYs belong to which BGX instance in the ACPI way.
> 
> Set the MAC address of the device as provided by ACPI tables. This is
> similar to the implementation for devicetree in
> of_get_mac_address(). The table is searched for the device property
> entries "mac-address", "local-mac-address" and "address" in that
> order. The address is provided in a u64 variable and must contain a
> valid 6 bytes-len mac addr.
> 
> Based on code from: Narinder Dhillon 
> Tomasz Nowicki 
> Robert Richter 
> 
> Signed-off-by: Tomasz Nowicki 
> Signed-off-by: Robert Richter 
> Signed-off-by: David Daney 
> ---
>  drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 137 
> +-
>  1 file changed, 135 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c 
> b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
> index 615b2af..2056583 100644
> --- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
> +++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
> @@ -6,6 +6,7 @@
>   * as published by the Free Software Foundation.
>   */
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -26,7 +27,7 @@
>  struct lmac {
>   struct bgx  *bgx;
>   int dmac;
> - unsigned char   mac[ETH_ALEN];
> + u8  mac[ETH_ALEN];
>   boollink_up;
>   int lmacid; /* ID within BGX */
>   int lmacid_bd; /* ID on board */
> @@ -835,6 +836,133 @@ static void bgx_get_qlm_mode(struct bgx *bgx)
>   }
>  }
>  
> +#ifdef CONFIG_ACPI
> +
> +static int bgx_match_phy_id(struct device *dev, void *data)
> +{
> + struct phy_device *phydev = to_phy_device(dev);
> + u32 *phy_id = data;
> +
> + if (phydev->addr == *phy_id)
> + return 1;
> +
> + return 0;
> +}
> +
> +static const char * const addr_propnames[] = {
> + "mac-address",
> + "local-mac-address",
> + "address",
> +};

If these are going to be generally necessary, then we should get them
adopted as standardised _DSD properties (ideally just one of them).
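For illustration only, the lookup order and the 6-byte validity check described in the commit message could be modelled in plain userspace C roughly as follows. This is a sketch, not the driver's code: `struct prop`, `mac_from_u64()` and `get_mac_address()` are hypothetical names, and the most-significant-byte-first unpacking is an assumption, since the quoted hunk does not show how the u64 is laid out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for a firmware property: a name and a u64 value. */
struct prop {
	const char *name;
	uint64_t value;
};

/* Property names in the priority order given in the commit message. */
static const char *const addr_propnames[] = {
	"mac-address",
	"local-mac-address",
	"address",
};

/* Same rule as the kernel's is_valid_ether_addr(): not all zeros and
 * not multicast (low bit of the first octet clear). */
bool mac_is_valid(const uint8_t mac[ETH_ALEN])
{
	uint8_t any = 0;

	for (int i = 0; i < ETH_ALEN; i++)
		any |= mac[i];
	return any != 0 && (mac[0] & 0x01) == 0;
}

/* Unpack the low 48 bits of val, most significant byte first (assumed
 * byte order). Fails if val carries more than 6 bytes or if the result
 * is not a valid unicast address; mac[] is then not meaningful. */
bool mac_from_u64(uint64_t val, uint8_t mac[ETH_ALEN])
{
	if (val >> 48)
		return false;
	for (int i = 0; i < ETH_ALEN; i++)
		mac[i] = (uint8_t)(val >> (8 * (ETH_ALEN - 1 - i)));
	return mac_is_valid(mac);
}

/* Try each property name in priority order; the first valid address wins. */
bool get_mac_address(const struct prop *props, size_t nprops,
		     uint8_t mac[ETH_ALEN])
{
	size_t nnames = sizeof(addr_propnames) / sizeof(addr_propnames[0]);

	for (size_t n = 0; n < nnames; n++)
		for (size_t i = 0; i < nprops; i++)
			if (strcmp(props[i].name, addr_propnames[n]) == 0 &&
			    mac_from_u64(props[i].value, mac))
				return true;
	return false;
}
```

With a single standardised property name, the outer loop over `addr_propnames` would disappear.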

[...]

> +static acpi_status bgx_acpi_register_phy(acpi_handle handle,
> +  u32 lvl, void *context, void **rv)
> +{
> + struct acpi_reference_args args;
> + const union acpi_object *prop;
> + struct bgx *bgx = context;
> + struct acpi_device *adev;
> + struct device *phy_dev;
> + u32 phy_id;
> +
> + if (acpi_bus_get_device(handle, &adev))
> + goto out;
> +
> + SET_NETDEV_DEV(&bgx->lmac[bgx->lmac_count].netdev, &bgx->pdev->dev);
> +
> + acpi_get_mac_address(adev, bgx->lmac[bgx->lmac_count].mac);
> +
> + bgx->lmac[bgx->lmac_count].lmacid = bgx->lmac_count;
> +
> + if (acpi_dev_get_property_reference(adev, "phy-handle", 0, &args))
> + goto out;
> +
> + if (acpi_dev_get_property(args.adev, "phy-channel", ACPI_TYPE_INTEGER, 
> &prop))
> + goto out;

Likewise for any inter-device properties, so that we can actually handle
them in a generic fashion, and avoid / learn from the mistakes we've
already handled with DT.

Mark.


[iproute PATCH] misc/ss: don't imply -a when -A was specified

2015-08-07 Thread Phil Sutter
Signed-off-by: Phil Sutter 
---
 misc/ss.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/misc/ss.c b/misc/ss.c
index bba7009..2f34962 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -3669,6 +3669,8 @@ int main(int argc, char *argv[])
char *p, *p1;
if (!saw_query) {
current_filter.dbs = 0;
+   state_filter = state_filter ?
+  state_filter : SS_CONN;
saw_query = 1;
do_default = 0;
}
-- 
2.1.2



Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread Robert Richter
On 07.08.15 10:09:04, Tomasz Nowicki wrote:
> On 07.08.2015 02:33, David Daney wrote:

...

> >+#else
> >+
> >+static int bgx_init_acpi_phy(struct bgx *bgx)
> >+{
> >+return -ENODEV;
> >+}
> >+
> >+#endif /* CONFIG_ACPI */
> >+
> >  #if IS_ENABLED(CONFIG_OF_MDIO)
> >
> >  static int bgx_init_of_phy(struct bgx *bgx)
> >@@ -882,7 +1010,12 @@ static int bgx_init_of_phy(struct bgx *bgx)
> >
> >  static int bgx_init_phy(struct bgx *bgx)
> >  {
> >-return bgx_init_of_phy(bgx);
> >+int err = bgx_init_of_phy(bgx);
> >+
> >+if (err != -ENODEV)
> >+return err;
> >+
> >+return bgx_init_acpi_phy(bgx);
> >  }
> >
> 
> If kernel can work with DT and ACPI (both compiled in), it should take only
> one path instead of probing DT and ACPI sequentially. How about:
> 
> @@ -902,10 +925,9 @@ static int bgx_probe(struct pci_dev *pdev, const struct
> pci_device_id *ent)
>   bgx_vnic[bgx->bgx_id] = bgx;
>   bgx_get_qlm_mode(bgx);
> 
> - snprintf(bgx_sel, 5, "bgx%d", bgx->bgx_id);
> - np = of_find_node_by_name(NULL, bgx_sel);
> - if (np)
> - bgx_init_of(bgx, np);
> + err = acpi_disabled ? bgx_init_of_phy(bgx) : bgx_init_acpi_phy(bgx);
> + if (err)
> + goto err_enable;
> 
>   bgx_init_hw(bgx);

I would not pollute bgx_probe() with acpi and dt specifics, and instead
keep bgx_init_phy(). The typical design pattern for this is:

static int bgx_init_phy(struct bgx *bgx)
{
#ifdef CONFIG_ACPI
if (!acpi_disabled)
return bgx_init_acpi_phy(bgx);
#endif
return bgx_init_of_phy(bgx);
}

This adds acpi runtime detection (acpi=no), does not call dt code in
case of acpi, and saves the #else for bgx_init_acpi_phy().
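As a rough userspace model of that pattern (not driver code): `acpi_disabled` is a plain global here, the `struct bgx` counters and the trivial init bodies are placeholders added for illustration, and `CONFIG_ACPI` is defined by hand to stand for an ACPI-enabled build.

```c
#include <stdbool.h>

/* Stand-in for a kernel configured with ACPI support. */
#define CONFIG_ACPI 1

/* Stand-in for the kernel's global flag (set by booting with acpi=no). */
bool acpi_disabled;

/* Placeholder device state; the counters only record which path ran. */
struct bgx {
	int acpi_calls;
	int of_calls;
};

#ifdef CONFIG_ACPI
/* Only built when ACPI is configured, so no !ACPI stub is needed. */
int bgx_init_acpi_phy(struct bgx *bgx)
{
	bgx->acpi_calls++;
	return 0;
}
#endif

int bgx_init_of_phy(struct bgx *bgx)
{
	bgx->of_calls++;
	return 0;
}

/* Exactly one firmware path is taken per call: ACPI when it is compiled
 * in and enabled at run time, devicetree otherwise. */
int bgx_init_phy(struct bgx *bgx)
{
#ifdef CONFIG_ACPI
	if (!acpi_disabled)
		return bgx_init_acpi_phy(bgx);
#endif
	return bgx_init_of_phy(bgx);
}
```

Removing the `#define CONFIG_ACPI` line still compiles, because the only reference to `bgx_init_acpi_phy()` sits inside the same `#ifdef`.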

-Robert


Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread Tomasz Nowicki

On 07.08.2015 13:56, Robert Richter wrote:
> On 07.08.15 12:52:41, Tomasz Nowicki wrote:
>> On 07.08.2015 12:43, Robert Richter wrote:
>>> On 07.08.15 10:09:04, Tomasz Nowicki wrote:
>>>> On 07.08.2015 02:33, David Daney wrote:
>>>
>>> ...
>>>
>>>>> +#else
>>>>> +
>>>>> +static int bgx_init_acpi_phy(struct bgx *bgx)
>>>>> +{
>>>>> +    return -ENODEV;
>>>>> +}
>>>>> +
>>>>> +#endif /* CONFIG_ACPI */
>>>>> +
>>>>>  #if IS_ENABLED(CONFIG_OF_MDIO)
>>>>>
>>>>>  static int bgx_init_of_phy(struct bgx *bgx)
>>>>> @@ -882,7 +1010,12 @@ static int bgx_init_of_phy(struct bgx *bgx)
>>>>>
>>>>>  static int bgx_init_phy(struct bgx *bgx)
>>>>>  {
>>>>> -    return bgx_init_of_phy(bgx);
>>>>> +    int err = bgx_init_of_phy(bgx);
>>>>> +
>>>>> +    if (err != -ENODEV)
>>>>> +        return err;
>>>>> +
>>>>> +    return bgx_init_acpi_phy(bgx);
>>>>>  }
>>>>>
>>>>
>>>> If kernel can work with DT and ACPI (both compiled in), it should take only
>>>> one path instead of probing DT and ACPI sequentially. How about:
>>>>
>>>> @@ -902,10 +925,9 @@ static int bgx_probe(struct pci_dev *pdev, const struct
>>>> pci_device_id *ent)
>>>>      bgx_vnic[bgx->bgx_id] = bgx;
>>>>      bgx_get_qlm_mode(bgx);
>>>>
>>>> -    snprintf(bgx_sel, 5, "bgx%d", bgx->bgx_id);
>>>> -    np = of_find_node_by_name(NULL, bgx_sel);
>>>> -    if (np)
>>>> -        bgx_init_of(bgx, np);
>>>> +    err = acpi_disabled ? bgx_init_of_phy(bgx) : bgx_init_acpi_phy(bgx);
>>>> +    if (err)
>>>> +        goto err_enable;
>>>>
>>>>      bgx_init_hw(bgx);
>>>
>>> I would not pollute bgx_probe() with acpi and dt specifics, and instead
>>> keep bgx_init_phy(). The typical design pattern for this is:
>>>
>>> static int bgx_init_phy(struct bgx *bgx)
>>> {
>>> #ifdef CONFIG_ACPI
>>>     if (!acpi_disabled)
>>>         return bgx_init_acpi_phy(bgx);
>>> #endif
>>>     return bgx_init_of_phy(bgx);
>>> }
>>>
>>> This adds acpi runtime detection (acpi=no), does not call dt code in
>>> case of acpi, and saves the #else for bgx_init_acpi_phy().
>>>
>>
>> I am fine with keeping it in bgx_init_phy(), however we can drop there
>> #ifdefs since both of bgx_init_{acpi,of}_phy calls have empty stub for !ACPI
>> and !OF case. Like that:
>>
>> static int bgx_init_phy(struct bgx *bgx)
>> {
>>
>>     if (!acpi_disabled)
>>         return bgx_init_acpi_phy(bgx);
>>     else
>>         return bgx_init_of_phy(bgx);
>> }
>
> As said, keeping it in #ifdefs makes the empty stub function for !acpi
> obsolete, which makes the code smaller and better readable. This style
> is common practice in the kernel. Apart from that, the 'else' should
> be dropped as it is useless.



I wouldn't say the code is more readable (bgx_init_of_phy still has two
stubs), but this is just cosmetic; my point was to use run-time detection
using acpi_disabled.


Tomasz


[PATCH net-next 4/6] qlcnic: Add new VF device ID 0x8C30

2015-08-07 Thread Shahed Shaikh
From: Shahed Shaikh 

This is an 83xx-series-based VF device.
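The effect of the patch on the two VF checks can be summarised with a small standalone sketch. The device ID values are taken from the diff that follows. The function names and the switch-based form are illustrative assumptions, not the driver's inline helpers, which take a `struct qlcnic_adapter *` rather than a raw device ID.

```c
#include <stdbool.h>

#define PCI_DEVICE_ID_QLOGIC_VF_QLE834X	0x8430
#define PCI_DEVICE_ID_QLOGIC_VF_QLE8C30	0x8C30
#define PCI_DEVICE_ID_QLOGIC_VF_QLE844X	0x8440

/* Any SR-IOV virtual function known to the driver. */
bool sriov_vf_check(unsigned short device)
{
	switch (device) {
	case PCI_DEVICE_ID_QLOGIC_VF_QLE834X:
	case PCI_DEVICE_ID_QLOGIC_VF_QLE844X:
	case PCI_DEVICE_ID_QLOGIC_VF_QLE8C30:
		return true;
	default:
		return false;
	}
}

/* Virtual functions of the 83xx family only; 0x8C30 is the new member. */
bool vf_83xx_check(unsigned short device)
{
	switch (device) {
	case PCI_DEVICE_ID_QLOGIC_VF_QLE834X:
	case PCI_DEVICE_ID_QLOGIC_VF_QLE8C30:
		return true;
	default:
		return false;
	}
}
```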

Signed-off-by: Shahed Shaikh 
---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic.h  |   12 
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c |5 -
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
index a861592..17f37b7 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
@@ -2290,8 +2290,9 @@ extern const struct ethtool_ops qlcnic_ethtool_failed_ops;
 
 #define PCI_DEVICE_ID_QLOGIC_QLE824X   0x8020
 #define PCI_DEVICE_ID_QLOGIC_QLE834X   0x8030
-#define PCI_DEVICE_ID_QLOGIC_QLE8830   0x8830
 #define PCI_DEVICE_ID_QLOGIC_VF_QLE834X0x8430
+#define PCI_DEVICE_ID_QLOGIC_QLE8830   0x8830
+#define PCI_DEVICE_ID_QLOGIC_VF_QLE8C300x8C30
 #define PCI_DEVICE_ID_QLOGIC_QLE844X   0x8040
 #define PCI_DEVICE_ID_QLOGIC_VF_QLE844X0x8440
 
@@ -2318,7 +2319,8 @@ static inline bool qlcnic_83xx_check(struct 
qlcnic_adapter *adapter)
  (device == PCI_DEVICE_ID_QLOGIC_QLE8830) ||
  (device == PCI_DEVICE_ID_QLOGIC_QLE844X) ||
  (device == PCI_DEVICE_ID_QLOGIC_VF_QLE844X) ||
- (device == PCI_DEVICE_ID_QLOGIC_VF_QLE834X)) ? true : false;
+ (device == PCI_DEVICE_ID_QLOGIC_VF_QLE834X) ||
+ (device == PCI_DEVICE_ID_QLOGIC_VF_QLE8C30)) ? true : false;
 
return status;
 }
@@ -2334,7 +2336,8 @@ static inline bool qlcnic_sriov_vf_check(struct 
qlcnic_adapter *adapter)
bool status;
 
status = ((device == PCI_DEVICE_ID_QLOGIC_VF_QLE834X) ||
- (device == PCI_DEVICE_ID_QLOGIC_VF_QLE844X)) ? true : false;
+ (device == PCI_DEVICE_ID_QLOGIC_VF_QLE844X) ||
+ (device == PCI_DEVICE_ID_QLOGIC_VF_QLE8C30)) ? true : false;
 
return status;
 }
@@ -2350,7 +2353,8 @@ static inline bool qlcnic_83xx_vf_check(struct 
qlcnic_adapter *adapter)
 {
unsigned short device = adapter->pdev->device;
 
-   return (device == PCI_DEVICE_ID_QLOGIC_VF_QLE834X) ? true : false;
+   return ((device == PCI_DEVICE_ID_QLOGIC_VF_QLE834X) ||
+   (device == PCI_DEVICE_ID_QLOGIC_VF_QLE8C30)) ? true : false;
 }
 
 static inline bool qlcnic_sriov_check(struct qlcnic_adapter *adapter)
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index b714cee..8b08b20 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -110,8 +110,9 @@ static u32 qlcnic_vlan_tx_check(struct qlcnic_adapter 
*adapter)
 static const struct pci_device_id qlcnic_pci_tbl[] = {
ENTRY(PCI_DEVICE_ID_QLOGIC_QLE824X),
ENTRY(PCI_DEVICE_ID_QLOGIC_QLE834X),
-   ENTRY(PCI_DEVICE_ID_QLOGIC_QLE8830),
ENTRY(PCI_DEVICE_ID_QLOGIC_VF_QLE834X),
+   ENTRY(PCI_DEVICE_ID_QLOGIC_QLE8830),
+   ENTRY(PCI_DEVICE_ID_QLOGIC_VF_QLE8C30),
ENTRY(PCI_DEVICE_ID_QLOGIC_QLE844X),
ENTRY(PCI_DEVICE_ID_QLOGIC_VF_QLE844X),
{0,}
@@ -1148,6 +1149,7 @@ static void qlcnic_get_bar_length(u32 dev_id, ulong *bar)
case PCI_DEVICE_ID_QLOGIC_QLE844X:
case PCI_DEVICE_ID_QLOGIC_VF_QLE834X:
case PCI_DEVICE_ID_QLOGIC_VF_QLE844X:
+   case PCI_DEVICE_ID_QLOGIC_VF_QLE8C30:
*bar = QLCNIC_83XX_BAR0_LENGTH;
break;
default:
@@ -2490,6 +2492,7 @@ qlcnic_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
qlcnic_83xx_register_map(ahw);
break;
case PCI_DEVICE_ID_QLOGIC_VF_QLE834X:
+   case PCI_DEVICE_ID_QLOGIC_VF_QLE8C30:
case PCI_DEVICE_ID_QLOGIC_VF_QLE844X:
qlcnic_sriov_vf_register_map(ahw);
break;
-- 
1.5.6



[PATCH net-next 3/6] qlcnic: Print firmware minidump buffer and template header addresses

2015-08-07 Thread Shahed Shaikh
From: Shahed Shaikh 

Signed-off-by: Shahed Shaikh 
---
 .../net/ethernet/qlogic/qlcnic/qlcnic_minidump.c   |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
index aca47fd..cda9e60 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
@@ -1388,8 +1388,9 @@ int qlcnic_dump_fw(struct qlcnic_adapter *adapter)
fw_dump->clr = 1;
snprintf(mesg, sizeof(mesg), "FW_DUMP=%s", adapter->netdev->name);
netdev_info(adapter->netdev,
-   "Dump data %d bytes captured, template header size %d 
bytes\n",
-   fw_dump->size, fw_dump->tmpl_hdr_size);
+   "Dump data %d bytes captured, dump data address = %p, 
template header size %d bytes, template address = %p\n",
+   fw_dump->size, fw_dump->data, fw_dump->tmpl_hdr_size,
+   fw_dump->tmpl_hdr);
/* Send a udev event to notify availability of FW dump */
kobject_uevent_env(&dev->kobj, KOBJ_CHANGE, msg);
 
-- 
1.5.6



[PATCH net-next 2/6] qlcnic: Add support to enable capability to extend minidump for iSCSI

2015-08-07 Thread Shahed Shaikh
From: Shahed Shaikh 

In some cases the minidump for iSCSI functions must be captured as part
of the default minidump collection process. To enable this, the firmware
exports a capability bit, and the driver needs to enable that capability
by issuing a mailbox command.

With this feature, the firmware can provide the additional iSCSI function
minidump with a smaller minidump capture mask (0x1f).
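The handshake described above can be sketched in standalone C. The constants are copied from the diff below. `struct mbx_req`, `fw_has_ext_iscsi_dump()` and `build_extend_md_req()` are hypothetical simplifications; in particular, `arg[0]` holds only the opcode here, whereas the driver builds its real header in `qlcnic_alloc_mbx_args()`. The `BIT_n` values are assumed to be `1 << n`. Any additional condition the driver applies (the truncated comment in the diff mentions the 0x8830 device ID) is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT_0	(1u << 0)
#define BIT_1	(1u << 1)
#define BIT_13	(1u << 13)

#define QLCNIC_FW_CAPABILITY_2_EXT_ISCSI_DUMP	BIT_13
#define QLCNIC_CMD_83XX_EXTEND_ISCSI_DUMP_CAP	0x37
#define QLCNIC_83XX_ADD_PORT0			BIT_0
#define QLCNIC_83XX_ADD_PORT1			BIT_1
#define QLCNIC_83XX_EXTENDED_MEM_SIZE		13 /* In MB */

/* Hypothetical flattened mailbox request with four argument words. */
struct mbx_req {
	uint32_t arg[4];
};

/* Step 1: the firmware advertises the feature through a capability bit. */
bool fw_has_ext_iscsi_dump(uint32_t fw_caps2)
{
	return (fw_caps2 & QLCNIC_FW_CAPABILITY_2_EXT_ISCSI_DUMP) != 0;
}

/* Step 2: if advertised, fill in the request that enables it for both
 * ports with the extended memory size. Returns false, leaving req
 * untouched, when the firmware lacks the capability. */
bool build_extend_md_req(uint32_t fw_caps2, struct mbx_req *req)
{
	if (!fw_has_ext_iscsi_dump(fw_caps2))
		return false;

	req->arg[0] = QLCNIC_CMD_83XX_EXTEND_ISCSI_DUMP_CAP; /* simplified */
	req->arg[1] = QLCNIC_83XX_ADD_PORT0 | QLCNIC_83XX_ADD_PORT1;
	req->arg[2] = QLCNIC_83XX_EXTENDED_MEM_SIZE;
	req->arg[3] = QLCNIC_83XX_EXTENDED_MEM_SIZE;
	return true;
}
```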

Signed-off-by: Shahed Shaikh 
---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic.h|1 +
 .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c|   26 
 .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h|1 +
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h |1 +
 .../net/ethernet/qlogic/qlcnic/qlcnic_minidump.c   |   32 
 5 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
index c6dca5d..a861592 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
@@ -924,6 +924,7 @@ struct qlcnic_mac_vlan_list {
 #define QLCNIC_FW_CAPABILITY_SET_DRV_VER   BIT_5
 #define QLCNIC_FW_CAPABILITY_2_BEACON  BIT_7
 #define QLCNIC_FW_CAPABILITY_2_PER_PORT_ESWITCH_CFGBIT_9
+#define QLCNIC_FW_CAPABILITY_2_EXT_ISCSI_DUMP  BIT_13
 
 #define QLCNIC_83XX_FW_CAPAB_ENCAP_RX_OFFLOAD  BIT_0
 #define QLCNIC_83XX_FW_CAPAB_ENCAP_TX_OFFLOAD  BIT_1
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
index 172192d..5ab3adf 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
@@ -119,6 +119,7 @@ static const struct qlcnic_mailbox_metadata 
qlcnic_83xx_mbx_tbl[] = {
{QLCNIC_CMD_DCB_QUERY_CAP, 1, 2},
{QLCNIC_CMD_DCB_QUERY_PARAM, 1, 50},
{QLCNIC_CMD_SET_INGRESS_ENCAP, 2, 1},
+   {QLCNIC_CMD_83XX_EXTEND_ISCSI_DUMP_CAP, 4, 1},
 };
 
 const u32 qlcnic_83xx_ext_reg_tbl[] = {
@@ -3514,6 +3515,31 @@ out:
qlcnic_free_mbx_args(&cmd);
 }
 
+#define QLCNIC_83XX_ADD_PORT0  BIT_0
+#define QLCNIC_83XX_ADD_PORT1  BIT_1
+#define QLCNIC_83XX_EXTENDED_MEM_SIZE  13 /* In MB */
+int qlcnic_83xx_extend_md_capab(struct qlcnic_adapter *adapter)
+{
+   struct qlcnic_cmd_args cmd;
+   int err;
+
+   err = qlcnic_alloc_mbx_args(&cmd, adapter,
+   QLCNIC_CMD_83XX_EXTEND_ISCSI_DUMP_CAP);
+   if (err)
+   return err;
+
+   cmd.req.arg[1] = (QLCNIC_83XX_ADD_PORT0 | QLCNIC_83XX_ADD_PORT1);
+   cmd.req.arg[2] = QLCNIC_83XX_EXTENDED_MEM_SIZE;
+   cmd.req.arg[3] = QLCNIC_83XX_EXTENDED_MEM_SIZE;
+
+   err = qlcnic_issue_cmd(adapter, &cmd);
+   if (err)
+   dev_err(&adapter->pdev->dev,
+   "failed to issue extend iSCSI minidump capability\n");
+
+   return err;
+}
+
 int qlcnic_83xx_reg_test(struct qlcnic_adapter *adapter)
 {
u32 major, minor, sub;
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h
index 999a90e..331ae2c 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h
@@ -627,6 +627,7 @@ int qlcnic_83xx_set_port_eswitch_status(struct 
qlcnic_adapter *, int, int *);
 
 void qlcnic_83xx_get_minidump_template(struct qlcnic_adapter *);
 void qlcnic_83xx_get_stats(struct qlcnic_adapter *adapter, u64 *data);
+int qlcnic_83xx_extend_md_capab(struct qlcnic_adapter *);
 int qlcnic_83xx_get_settings(struct qlcnic_adapter *, struct ethtool_cmd *);
 int qlcnic_83xx_set_settings(struct qlcnic_adapter *, struct ethtool_cmd *);
 void qlcnic_83xx_get_pauseparam(struct qlcnic_adapter *,
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h
index cbe2399..4bb33af 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h
@@ -109,6 +109,7 @@ enum qlcnic_regs {
 #define QLCNIC_CMD_GET_LED_CONFIG  0x6A
 #define QLCNIC_CMD_83XX_SET_DRV_VER0x6F
 #define QLCNIC_CMD_ADD_RCV_RINGS   0x0B
+#define QLCNIC_CMD_83XX_EXTEND_ISCSI_DUMP_CAP  0x37
 
 #define QLCNIC_INTRPT_INTX 1
 #define QLCNIC_INTRPT_MSIX 3
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
index 6f33e2d..aca47fd 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
@@ -1396,19 +1396,51 @@ int qlcnic_dump_fw(struct qlcnic_adapter *adapter)
return 0;
 }
 
+static inline bool
+qlcnic_83xx_md_check_extended_dump_capability(struct qlcnic_adapter *adapter)
+{
+   /* For special adapters (with 0x8830 device ID), where iSCSI firmware
+* dump needs to be captured as part of 

[PATCH net-next 6/6] qlcnic: Update version to 5.3.63

2015-08-07 Thread Shahed Shaikh
From: Shahed Shaikh 

Signed-off-by: Shahed Shaikh 
---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
index 17f37b7..06bcc73 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
@@ -37,8 +37,8 @@
 
 #define _QLCNIC_LINUX_MAJOR 5
 #define _QLCNIC_LINUX_MINOR 3
-#define _QLCNIC_LINUX_SUBVERSION 62
-#define QLCNIC_LINUX_VERSIONID  "5.3.62"
+#define _QLCNIC_LINUX_SUBVERSION 63
+#define QLCNIC_LINUX_VERSIONID  "5.3.63"
 #define QLCNIC_DRV_IDC_VER  0x01
 #define QLCNIC_DRIVER_VERSION  ((_QLCNIC_LINUX_MAJOR << 16) |\
 (_QLCNIC_LINUX_MINOR << 8) | (_QLCNIC_LINUX_SUBVERSION))
-- 
1.5.6



[PATCH net-next 0/6] qlcnic: enhancements

2015-08-07 Thread Shahed Shaikh
From: Shahed Shaikh 

Hi Dave,

This series adds a few enhancements.

  o A patch from Harish reorders the header file inclusions,
    keeping the kernel's header files on top.

  o Firmware introduced a new feature which allows the driver to increase
    the size of the firmware dump of the iSCSI function, which is
    collected by the NIC driver.

  o Print the address of the buffer which holds a firmware dump.

  o Use vzalloc() instead of kzalloc() for allocating a large chunk of
    memory, which avoids a potential memory allocation failure.

  o Add new device ID 0x8C30, which is an 83xx-series-based VF function.

Please apply this series to net-next.

Thanks,
Shahed

Harish Patil (1):
  qlcnic: Rearrange ordering of header files inclusion

Shahed Shaikh (5):
  qlcnic: Add support to enable capability to extend minidump for iSCSI
  qlcnic: Print firmware minidump buffer and template header addresses
  qlcnic: Add new VF device ID 0x8C30
  qlcnic: Don't use kzalloc unncecessarily for allocating large chunk
of memory
  qlcnic: Update version to 5.3.63

 drivers/net/ethernet/qlogic/qlcnic/qlcnic.h|   19 +
 .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c|   31 ++-
 .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h|2 +
 .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c  |4 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c |6 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.h |1 +
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c   |   14 ---
 .../net/ethernet/qlogic/qlcnic/qlcnic_minidump.c   |   41 ++--
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov.h  |3 +-
 .../ethernet/qlogic/qlcnic/qlcnic_sriov_common.c   |3 +-
 .../net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c   |3 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c  |7 +--
 12 files changed, 102 insertions(+), 32 deletions(-)



[PATCH net-next 5/6] qlcnic: Don't use kzalloc unncecessarily for allocating large chunk of memory

2015-08-07 Thread Shahed Shaikh
From: Shahed Shaikh 

The driver allocates a large temporary buffer using kzalloc to copy
the FW image. As this memory does not need to be physically
contiguous, use vzalloc instead.

Signed-off-by: Shahed Shaikh 
---
 .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c  |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c
index 753ea8b..bf89216 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_init.c
@@ -1384,7 +1384,7 @@ static int qlcnic_83xx_copy_fw_file(struct qlcnic_adapter 
*adapter)
size_t size;
u64 addr;
 
-   temp = kzalloc(fw->size, GFP_KERNEL);
+   temp = vzalloc(fw->size);
if (!temp) {
release_firmware(fw);
fw_info->fw = NULL;
@@ -1430,7 +1430,7 @@ static int qlcnic_83xx_copy_fw_file(struct qlcnic_adapter 
*adapter)
 exit:
release_firmware(fw);
fw_info->fw = NULL;
-   kfree(temp);
+   vfree(temp);
 
return ret;
 }
-- 
1.5.6



[PATCH net-next 1/6] qlcnic: Rearrange ordering of header files inclusion

2015-08-07 Thread Shahed Shaikh
From: Harish Patil 

Include local header files after the kernel's header files.

Signed-off-by: Harish Patil 
Signed-off-by: Shahed Shaikh 
---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic.h|2 --
 .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c|5 +++--
 .../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h|1 +
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c |6 +++---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c   |9 -
 .../net/ethernet/qlogic/qlcnic/qlcnic_minidump.c   |4 ++--
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov.h  |3 ++-
 .../ethernet/qlogic/qlcnic/qlcnic_sriov_common.c   |3 ++-
 .../net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c   |3 ++-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c  |7 +++
 10 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
index 055f376..c6dca5d 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
@@ -24,9 +24,7 @@
 #include 
 #include 
 #include 
-
 #include 
-
 #include 
 #include 
 #include 
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
index 840bf36..172192d 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
@@ -5,14 +5,15 @@
  * See LICENSE.qlcnic for copyright and licensing details.
  */
 
-#include "qlcnic.h"
-#include "qlcnic_sriov.h"
 #include 
 #include 
 #include 
 #include 
 #include 
 
+#include "qlcnic.h"
+#include "qlcnic_sriov.h"
+
 static void __qlcnic_83xx_process_aen(struct qlcnic_adapter *);
 static int qlcnic_83xx_clear_lb_mode(struct qlcnic_adapter *, u8);
 static void qlcnic_83xx_configure_mac(struct qlcnic_adapter *, u8 *, u8,
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h
index 69f828e..999a90e 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.h
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+
 #include "qlcnic_hw.h"
 
 #define QLCNIC_83XX_BAR0_LENGTH 0x4000
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c
index 75ee9e4..509b596 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c
@@ -5,13 +5,13 @@
  * See LICENSE.qlcnic for copyright and licensing details.
  */
 
-#include "qlcnic.h"
-#include "qlcnic_hdr.h"
-
 #include 
 #include 
 #include 
 
+#include "qlcnic.h"
+#include "qlcnic_hdr.h"
+
 #define MASK(n) ((1ULL<<(n))-1)
 #define OCM_WIN_P3P(addr) (addr & 0xffc)
 
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index 7dbab3c..b714cee 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -7,11 +7,6 @@
 
 #include 
 #include 
-
-#include "qlcnic.h"
-#include "qlcnic_sriov.h"
-#include "qlcnic_hw.h"
-
 #include 
 #include 
 #include 
@@ -25,6 +20,10 @@
 #include 
 #endif
 
+#include "qlcnic.h"
+#include "qlcnic_sriov.h"
+#include "qlcnic_hw.h"
+
 MODULE_DESCRIPTION("QLogic 1/10 GbE Converged/Intelligent Ethernet Driver");
 MODULE_LICENSE("GPL");
 MODULE_VERSION(QLCNIC_LINUX_VERSIONID);
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
index 332bb8a..6f33e2d 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
@@ -5,13 +5,13 @@
  * See LICENSE.qlcnic for copyright and licensing details.
  */
 
+#include 
+
 #include "qlcnic.h"
 #include "qlcnic_hdr.h"
 #include "qlcnic_83xx_hw.h"
 #include "qlcnic_hw.h"
 
-#include 
-
 #define QLC_83XX_MINIDUMP_FLASH0x52
 #define QLC_83XX_OCM_INDEX 3
 #define QLC_83XX_PCI_INDEX 0
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov.h 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov.h
index 4677b2e..017d8c2 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov.h
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov.h
@@ -8,10 +8,11 @@
 #ifndef _QLCNIC_83XX_SRIOV_H_
 #define _QLCNIC_83XX_SRIOV_H_
 
-#include "qlcnic.h"
 #include 
 #include 
 
+#include "qlcnic.h"
+
 extern const u32 qlcnic_83xx_reg_tbl[];
 extern const u32 qlcnic_83xx_ext_reg_tbl[];
 
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c
index e631246..546cd5f 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c
@@ -5,10 +5,11 @@
  * See LICENSE.qlcnic for copyright and licensing details.
  */
 
+#include 
+
 #include "qlcn

Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread Robert Richter
On 07.08.15 12:52:41, Tomasz Nowicki wrote:
> On 07.08.2015 12:43, Robert Richter wrote:
> >On 07.08.15 10:09:04, Tomasz Nowicki wrote:
> >>On 07.08.2015 02:33, David Daney wrote:
> >
> >...
> >
> >>>+#else
> >>>+
> >>>+static int bgx_init_acpi_phy(struct bgx *bgx)
> >>>+{
> >>>+  return -ENODEV;
> >>>+}
> >>>+
> >>>+#endif /* CONFIG_ACPI */
> >>>+
> >>>  #if IS_ENABLED(CONFIG_OF_MDIO)
> >>>
> >>>  static int bgx_init_of_phy(struct bgx *bgx)
> >>>@@ -882,7 +1010,12 @@ static int bgx_init_of_phy(struct bgx *bgx)
> >>>
> >>>  static int bgx_init_phy(struct bgx *bgx)
> >>>  {
> >>>-  return bgx_init_of_phy(bgx);
> >>>+  int err = bgx_init_of_phy(bgx);
> >>>+
> >>>+  if (err != -ENODEV)
> >>>+  return err;
> >>>+
> >>>+  return bgx_init_acpi_phy(bgx);
> >>>  }
> >>>
> >>
> >>If kernel can work with DT and ACPI (both compiled in), it should take only
> >>one path instead of probing DT and ACPI sequentially. How about:
> >>
> >>@@ -902,10 +925,9 @@ static int bgx_probe(struct pci_dev *pdev, const struct
> >>pci_device_id *ent)
> >>bgx_vnic[bgx->bgx_id] = bgx;
> >>bgx_get_qlm_mode(bgx);
> >>
> >>-   snprintf(bgx_sel, 5, "bgx%d", bgx->bgx_id);
> >>-   np = of_find_node_by_name(NULL, bgx_sel);
> >>-   if (np)
> >>-   bgx_init_of(bgx, np);
> >>+   err = acpi_disabled ? bgx_init_of_phy(bgx) : bgx_init_acpi_phy(bgx);
> >>+   if (err)
> >>+   goto err_enable;
> >>
> >>bgx_init_hw(bgx);
> >
> >I would not pollute bgx_probe() with acpi and dt specifics, and instead
> >keep bgx_init_phy(). The typical design pattern for this is:
> >
> >static int bgx_init_phy(struct bgx *bgx)
> >{
> >#ifdef CONFIG_ACPI
> > if (!acpi_disabled)
> > return bgx_init_acpi_phy(bgx);
> >#endif
> > return bgx_init_of_phy(bgx);
> >}
> >
> >This adds acpi runtime detection (acpi=no), does not call dt code in
> >case of acpi, and saves the #else for bgx_init_acpi_phy().
> >
> 
> I am fine with keeping it in bgx_init_phy(), however we can drop there
> #ifdefs since both of bgx_init_{acpi,of}_phy calls have empty stub for !ACPI
> and !OF case. Like that:
> 
> static int bgx_init_phy(struct bgx *bgx)
> {
> 
> if (!acpi_disabled)
> return bgx_init_acpi_phy(bgx);
>   else
>   return bgx_init_of_phy(bgx);
> }

As said, keeping it in #ifdefs makes the empty stub function for !acpi
obsolete, which makes the code smaller and better readable. This style
is common practice in the kernel. Apart from that, the 'else' should
be dropped as it is useless.

-Robert


Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread Tomasz Nowicki

On 07.08.2015 12:43, Robert Richter wrote:
> On 07.08.15 10:09:04, Tomasz Nowicki wrote:
>> On 07.08.2015 02:33, David Daney wrote:
>
> ...
>
>>> +#else
>>> +
>>> +static int bgx_init_acpi_phy(struct bgx *bgx)
>>> +{
>>> +    return -ENODEV;
>>> +}
>>> +
>>> +#endif /* CONFIG_ACPI */
>>> +
>>>  #if IS_ENABLED(CONFIG_OF_MDIO)
>>>
>>>  static int bgx_init_of_phy(struct bgx *bgx)
>>> @@ -882,7 +1010,12 @@ static int bgx_init_of_phy(struct bgx *bgx)
>>>
>>>  static int bgx_init_phy(struct bgx *bgx)
>>>  {
>>> -    return bgx_init_of_phy(bgx);
>>> +    int err = bgx_init_of_phy(bgx);
>>> +
>>> +    if (err != -ENODEV)
>>> +        return err;
>>> +
>>> +    return bgx_init_acpi_phy(bgx);
>>>  }
>>>
>>
>> If kernel can work with DT and ACPI (both compiled in), it should take only
>> one path instead of probing DT and ACPI sequentially. How about:
>>
>> @@ -902,10 +925,9 @@ static int bgx_probe(struct pci_dev *pdev, const struct
>> pci_device_id *ent)
>>      bgx_vnic[bgx->bgx_id] = bgx;
>>      bgx_get_qlm_mode(bgx);
>>
>> -    snprintf(bgx_sel, 5, "bgx%d", bgx->bgx_id);
>> -    np = of_find_node_by_name(NULL, bgx_sel);
>> -    if (np)
>> -        bgx_init_of(bgx, np);
>> +    err = acpi_disabled ? bgx_init_of_phy(bgx) : bgx_init_acpi_phy(bgx);
>> +    if (err)
>> +        goto err_enable;
>>
>>      bgx_init_hw(bgx);
>
> I would not pollute bgx_probe() with acpi and dt specifics, and instead
> keep bgx_init_phy(). The typical design pattern for this is:
>
> static int bgx_init_phy(struct bgx *bgx)
> {
> #ifdef CONFIG_ACPI
>     if (!acpi_disabled)
>         return bgx_init_acpi_phy(bgx);
> #endif
>     return bgx_init_of_phy(bgx);
> }
>
> This adds acpi runtime detection (acpi=no), does not call dt code in
> case of acpi, and saves the #else for bgx_init_acpi_phy().



I am fine with keeping it in bgx_init_phy(); however, we can drop the
#ifdefs there, since both bgx_init_{acpi,of}_phy calls have an empty stub
for the !ACPI and !OF cases. Like this:


static int bgx_init_phy(struct bgx *bgx)
{

    if (!acpi_disabled)
        return bgx_init_acpi_phy(bgx);
    else
        return bgx_init_of_phy(bgx);
}

Tomasz


Re: [PATCH 26/31] net/sched: use kmemdup rather than duplicating its implementation

2015-08-07 Thread Daniel Borkmann

On 08/07/2015 09:59 AM, Andrzej Hajda wrote:
> The patch was generated using fixed coccinelle semantic patch
> scripts/coccinelle/api/memdup.cocci [1].
>
> [1]: http://permalink.gmane.org/gmane.linux.kernel/2014320
>
> Signed-off-by: Andrzej Hajda
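For readers unfamiliar with the pattern being replaced, here is a userspace analogue. `mem_dup()` is a hypothetical helper built on `malloc()`; the kernel's kmemdup() additionally takes a gfp flags argument, which has no counterpart here.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate len bytes and copy src into them in one step. Returns NULL
 * on allocation failure; the caller releases the copy with free(). */
void *mem_dup(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```

The open-coded form that such a semantic patch looks for is the allocate, check, copy sequence that this helper folds into a single call.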


Acked-by: Daniel Borkmann 

Not sure where the rest of this series went, but if you want this patch
to be routed via net-next tree (which I recommend, to avoid cross tree
conflicts), then you would need to send these patches separately, rebased
to that tree, and also mention [PATCH net-next XX/YY] in the subject.
  
Thanks,
Daniel


RE: [PATCH V4 4/7] Drivers: hv: vmbus: add APIs to register callbacks to process hvsock connection

2015-08-07 Thread Dexuan Cui
> -Original Message-
> From: KY Srinivasan
> Sent: Friday, August 7, 2015 2:28
> To: Dexuan Cui ; David Miller 
> Cc: o...@aepfle.de; gre...@linuxfoundation.org; jasow...@redhat.com;
> driverdev-de...@linuxdriverproject.org; linux-ker...@vger.kernel.org;
> step...@networkplumber.org; stefa...@redhat.com; netdev@vger.kernel.org;
> a...@canonical.com; pebo...@tiscali.nl; dan.carpen...@oracle.com
> Subject: RE: [PATCH V4 4/7] Drivers: hv: vmbus: add APIs to register 
> callbacks to
> process hvsock connection
> 
> > -Original Message-
> > From: Dexuan Cui
> > Sent: Wednesday, August 5, 2015 9:54 PM
> > To: David Miller ; KY Srinivasan
> > 
> > Cc: o...@aepfle.de; gre...@linuxfoundation.org; jasow...@redhat.com;
> > driverdev-de...@linuxdriverproject.org; linux-ker...@vger.kernel.org;
> > step...@networkplumber.org; stefa...@redhat.com;
> > netdev@vger.kernel.org; a...@canonical.com; pebo...@tiscali.nl;
> > dan.carpen...@oracle.com
> > Subject: RE: [PATCH V4 4/7] Drivers: hv: vmbus: add APIs to register 
> > callbacks
> > to process hvsock connection
> >
> > > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On
> > Behalf
> > > Of Dexuan Cui
> > > Sent: Thursday, July 30, 2015 18:20
> > > To: David Miller ; KY Srinivasan
> > 
> > > Cc: o...@aepfle.de; gre...@linuxfoundation.org; jasow...@redhat.com;
> > > driverdev-de...@linuxdriverproject.org; linux-ker...@vger.kernel.org;
> > > step...@networkplumber.org; stefa...@redhat.com;
> > netdev@vger.kernel.org;
> > > a...@canonical.com; pebo...@tiscali.nl; dan.carpen...@oracle.com
> > > Subject: RE: [PATCH V4 4/7] Drivers: hv: vmbus: add APIs to register
> > callbacks to
> > > process hvsock connection
> > >
> > > > From: David Miller
> > > > Sent: Thursday, July 30, 2015 6:27
> > > >
> > > > From: Dexuan Cui
> > > > Date: Tue, 28 Jul 2015 05:35:11 -0700
> > > >
> > > > > > With the 2 APIs supplied by the VMBus driver, the coming net/hvsock
> > > > > > driver can register 2 callbacks and can know when a new hvsock
> > > > > > connection is offered by the host, and when a hvsock connection is
> > > > > > being closed by the host.
> > > > >
> > > > This is an extremely terrible interface.
> > > >
> > > > > It's an opaque hook that allows only one registration, and its sole
> > > > > purpose is to allow a backdoor call into a foreign driver in another
> > > > > module.
> > > >
> > > > These are exactly the things we try to avoid.
> > >
> > > Hi David,
> > > Thanks a lot for your reviewing and the suggestion!
> > >
> > > > Why not create a real abstraction where clients register an object,
> > > > that can be contained as a sub-member inside of their own driver
> > > > private, that provides the callback registry mechanism.
> >
> > Hi David,
> > Can you please have a look at my below questions?
> >
> > I like your idea of a real abstraction. Your answer would definitely
> > help me to implement that correctly.
> >
> > > Please pardon me for my inexperience.
> > > Can you please be a bit more specific?
> > > I guess maybe you're referencing a common design pattern in the driver
> > > code, so an example in some existing driver would be the best. :-)
> > >
> > > "clients register an object " --
> > > does the "clients" mean the hvsock driver?
> > > and the "object" means the 2 callbacks?
> > >
> > > IMHO, here the vmbus driver has to synchronously pass the 2 events
> > > to the hvsock driver, so a "backdoor call into the hvsock driver" is
> > > inevitable anyway?
> > >
> > > e.g., in the path vmbus_process_offer() -> hvsock_process_offer(), the
> > > return value of the latter is important to the former, because on error
> > > the former needs to clean up some internal states of the vmbus driver
> > > (that is, the "goto err_deq_chan").
> > >
> > >
> > > > That way you can register multiple clients, do things like allow
> > > > AF_PACKET capturing of vmbus traffic, etc.
> > >
> > > I thought AF_PACKET can only capture IP packets or Ethernet frames.
> > > Can it be used to capture AF_UNIX packets?
> > > If yes, I suppose we can consider making it work for AF_HYPERV too,
> > > if people ask for that.
> > >
> 
> Dexuan,
> 
> The notion of a channel on Hyper-V has been mapped to a device on Linux, and
> the mechanism we have had for notifying the driver of the creation of the
> channel was registering this device with the kernel (vmbus_device_create).
> The first exception to this was when we introduced multi-channel support,
> which broke the assumption of a one-to-one mapping between the channel and
> the Linux device. In the case of the sub-channels, we handled the driver
> notification issue via the sub-channel callback that the driver registers at
> the point of opening the channel. Perhaps we could make the sub-channel
> handling mechanism more generic to handle the case of VMSOCK as well?
> 
> K. Y

Good suggestion!
Let me think this over and make a new patch.
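
For reference, one possible reading of the "clients register an object"
suggestion quoted above is sketched below as plain C. Every name in it
(chan_client, chan_client_register(), chan_notify_offer(), sock_drv) is
hypothetical and not taken from the vmbus code; unregistration and locking
are left out.

```c
#include <stddef.h>

/* Hypothetical client object; a driver embeds it in its private state. */
struct chan_client {
	int  (*on_offer)(struct chan_client *c, int chan_id);
	void (*on_close)(struct chan_client *c, int chan_id);
	struct chan_client *next;	/* registry linkage */
};

/* Registry of all clients, so more than one can be registered. */
static struct chan_client *clients;

static void chan_client_register(struct chan_client *c)
{
	c->next = clients;
	clients = c;
}

/* Offer the channel to each client; stop at the first error so the
 * caller can clean up, as the vmbus_process_offer() path needs. */
static int chan_notify_offer(int chan_id)
{
	struct chan_client *c;
	int err;

	for (c = clients; c; c = c->next) {
		err = c->on_offer(c, chan_id);
		if (err)
			return err;
	}
	return 0;
}

/* Example client: the object is a sub-member of the driver private. */
struct sock_drv {
	int offers_seen;
	struct chan_client client;
};

#define to_sock_drv(p) \
	((struct sock_drv *)((char *)(p) - offsetof(struct sock_drv, client)))

/* The callback recovers the driver private from the embedded member. */
static int sock_on_offer(struct chan_client *c, int chan_id)
{
	struct sock_drv *drv = to_sock_drv(c);

	(void)chan_id;
	drv->offers_seen++;
	return 0;
}
```

The return value still travels synchronously back to the caller, so the
error cleanup discussed above remains possible.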

Thanks,
-- Dexuan


RE: [PATCH V4 7/7] Drivers: hv: vmbus: disable local interrupt when hvsock's callback is running

2015-08-07 Thread Dexuan Cui
> From: KY Srinivasan
> Sent: Friday, August 7, 2015 1:50
> To: Dexuan Cui ; David Miller 
> Cc: o...@aepfle.de; gre...@linuxfoundation.org; jasow...@redhat.com;
> driverdev-de...@linuxdriverproject.org; linux-ker...@vger.kernel.org;
> step...@networkplumber.org; stefa...@redhat.com; netdev@vger.kernel.org;
> a...@canonical.com; pebo...@tiscali.nl; dan.carpen...@oracle.com
> Subject: RE: [PATCH V4 7/7] Drivers: hv: vmbus: disable local interrupt when
> hvsock's callback is running
> > From: Dexuan Cui
> > Sent: Wednesday, August 5, 2015 9:44 PM
> > To: David Miller ; KY Srinivasan
> > 
> > Cc: o...@aepfle.de; gre...@linuxfoundation.org; jasow...@redhat.com;
> > driverdev-de...@linuxdriverproject.org; linux-ker...@vger.kernel.org;
> > step...@networkplumber.org; stefa...@redhat.com;
> > netdev@vger.kernel.org; a...@canonical.com; pebo...@tiscali.nl;
> > dan.carpen...@oracle.com
> > Subject: RE: [PATCH V4 7/7] Drivers: hv: vmbus: disable local interrupt when
> > hvsock's callback is running
> >
> > > From: devel [mailto:driverdev-devel-boun...@linuxdriverproject.org] On
> > Behalf
> > > Of Dexuan Cui
> > > Sent: Thursday, July 30, 2015 18:18
> > > To: David Miller ; KY Srinivasan
> > 
> > > Cc: o...@aepfle.de; gre...@linuxfoundation.org; jasow...@redhat.com;
> > > driverdev-de...@linuxdriverproject.org; linux-ker...@vger.kernel.org;
> > > step...@networkplumber.org; stefa...@redhat.com;
> > netdev@vger.kernel.org;
> > > a...@canonical.com; pebo...@tiscali.nl; dan.carpen...@oracle.com
> > > Subject: RE: [PATCH V4 7/7] Drivers: hv: vmbus: disable local interrupt
> > when
> > > hvsock's callback is running
> > >
> > > > From: David Miller
> > > > Sent: Thursday, July 30, 2015 6:28
> > > > > From: Dexuan Cui 
> > > > > Date: Tue, 28 Jul 2015 05:35:30 -0700
> > > > >
> > > > > > In the SMP guest case, when the per-channel callback hvsock_events()
> > > > > > is running on virtual CPU A, and the guest tries to close the
> > > > > > connection on virtual CPU B, we invoke vmbus_close() ->
> > > > > > vmbus_close_internal() and can have trouble: on B,
> > > > > > vmbus_close_internal() will send the IPI reset_channel_cb() to A,
> > > > > > trying to set channel->onchannel_callback to NULL; on A, if the IPI
> > > > > > handler runs between "if (channel->onchannel_callback != NULL)" and
> > > > > > invoking channel->onchannel_callback, we'll invoke a NULL function
> > > > > > pointer.
> > > > >
> > > > > This is why the patch is necessary.
> > > > >
> > > > Sorry, I do not accept that you must use conditional locking and/or
> > > > IRQ disabling.
> > > >
> > > > Boil it down to what is necessary for the least common denominator,
> > > > and use that unconditionally.
> > >
> > > Hi David,
> > > Thanks for the comment!
> > >
> > > I agree with you it's not clean to use conditional IRQ disabling.
> > >
> > > Here I didn't use unconditional IRQ disabling because the Hyper-V netvsc
> > > and storvsc drivers' vmbus event callbacks (i.e. netvsc_channel_cb() and
> > > storvsc_on_channel_callback()) may take a relatively long time (e.g.,
> > > netvsc can operate at a speed of 10Gb), and I think it's bad to disable
> > > IRQs for a long time when the callbacks are running in a tasklet context;
> > > e.g., the Hyper-V timer can be affected: see vmbus_isr() ->
> > > hv_process_timer_expiration().
> > >
> > > To resolve the race condition between vmbus_close_internal() and
> > > process_chn_event() in SMP case, now I propose a new method:
> > >
> > > we can serialize the 2 paths by adding
> > > tasklet_disable(hv_context.event_dpc[channel->target_cpu]) and
> > > tasklet_enable(...) in vmbus_close_internal().
> > >
> > > In this way, we need the least change and we can drop this patch.
> > >
> > > Please let me know your opinion.
> > >
> > > -- Dexuan
> >
> > Hi David, KY and all,
> >
> > May I know your opinion about my idea of adding tasklet_disable/enable()
> > in vmbus_close_internal() and dropping this patch?
> 
> Sorry for the delayed response; I think this is a reasonable solution. Send
> me the patch.
> 
> Regards,
> 
> K. Y

OK. Will do.
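
As a side note, the check-then-call hazard described in the quoted patch
text can be shown in a few lines of user-space C. This is not the
tasklet_disable()/tasklet_enable() serialization agreed on above; it only
illustrates why reading the callback pointer twice is unsafe, by reading it
once into a local. It does not stop a callback from running after close.
struct chan, chan_dispatch(), chan_close() and count_event() are
hypothetical names.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef void (*chan_cb)(void *ctx);

struct chan {
	_Atomic(chan_cb) cb;	/* may be reset to NULL by the close path */
	void *ctx;
};

/*
 * Racy shape: cb is read twice and can become NULL between the reads:
 *
 *	if (ch->cb != NULL)
 *		ch->cb(ch->ctx);
 */

/* Read the pointer once and call the snapshot.
 * Returns 1 if a callback ran, 0 if none was installed. */
static int chan_dispatch(struct chan *ch)
{
	chan_cb cb = atomic_load(&ch->cb);

	if (!cb)
		return 0;
	cb(ch->ctx);
	return 1;
}

/* Close path: clear the callback pointer. */
static void chan_close(struct chan *ch)
{
	chan_cb none = NULL;

	atomic_store(&ch->cb, none);
}

/* Example callback: count how many events were delivered. */
static void count_event(void *ctx)
{
	++*(int *)ctx;
}
```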

 Thanks,
 -- Dexuan
 


[PATCH iproute2 -next] m_bpf: add frontend support for late binding

2015-08-07 Thread Daniel Borkmann
Frontend support for kernel commit a5c90b29e5cc ("act_bpf: properly
support late binding of bpf action to a classifier").

Signed-off-by: Daniel Borkmann 
---
 tc/m_bpf.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/tc/m_bpf.c b/tc/m_bpf.c
index c51f44f..e1bb6a4 100644
--- a/tc/m_bpf.c
+++ b/tc/m_bpf.c
@@ -28,7 +28,7 @@ static const enum bpf_prog_type bpf_type = BPF_PROG_TYPE_SCHED_ACT;
 
 static void explain(void)
 {
-	fprintf(stderr, "Usage: ... bpf ...\n");
+	fprintf(stderr, "Usage: ... bpf ... [ index INDEX ]\n");
 	fprintf(stderr, "\n");
 	fprintf(stderr, "BPF use case:\n");
 	fprintf(stderr, " bytecode BPF_BYTECODE\n");
@@ -49,6 +49,9 @@ static void explain(void)
 	fprintf(stderr, "\n");
 	fprintf(stderr, "Where UDS_FILE points to a unix domain socket file in order\n");
 	fprintf(stderr, "to hand off control of all created eBPF maps to an agent.\n");
+	fprintf(stderr, "\n");
+	fprintf(stderr, "Where optionally INDEX points to an existing action, or\n");
+	fprintf(stderr, "explicitly specifies an action index upon creation.\n");
 }
 
 static void usage(void)
@@ -64,6 +67,7 @@ static int parse_bpf(struct action_util *a, int *argc_p, char 
***argv_p,
struct rtattr *tail;
struct tc_act_bpf parm = { 0 };
struct sock_filter bpf_ops[BPF_MAXINSNS];
+   bool ebpf_fill = false, bpf_fill = false;
bool ebpf = false, seen_run = false;
const char *bpf_uds_name = NULL;
const char *bpf_sec_name = NULL;
@@ -148,11 +152,15 @@ opt_bpf:
 bpf_obj, bpf_sec_name);
 
bpf_fd = ret;
+   ebpf_fill = true;
} else {
bpf_len = ret;
+   bpf_fill = true;
}
} else if (matches(*argv, "help") == 0) {
usage();
+   } else if (matches(*argv, "index") == 0) {
+   break;
} else {
if (!seen_run)
goto opt_bpf;
@@ -200,21 +208,15 @@ opt_bpf:
}
}
 
-   if ((!bpf_len && !ebpf) || (!bpf_fd && ebpf)) {
-   fprintf(stderr, "bpf: Bytecode needs to be passed\n");
-   explain();
-   return -1;
-   }
-
tail = NLMSG_TAIL(n);
 
addattr_l(n, MAX_MSG, tca_id, NULL, 0);
addattr_l(n, MAX_MSG, TCA_ACT_BPF_PARMS, &parm, sizeof(parm));
 
-   if (ebpf) {
+   if (ebpf_fill) {
addattr32(n, MAX_MSG, TCA_ACT_BPF_FD, bpf_fd);
addattrstrz(n, MAX_MSG, TCA_ACT_BPF_NAME, bpf_name);
-   } else {
+   } else if (bpf_fill) {
addattr16(n, MAX_MSG, TCA_ACT_BPF_OPS_LEN, bpf_len);
addattr_l(n, MAX_MSG, TCA_ACT_BPF_OPS, &bpf_ops,
  bpf_len * sizeof(struct sock_filter));
-- 
1.9.3



[PATCH] net: phy: select copper mode when Marvell 88E1111 in SGMII

2015-08-07 Thread shh.xie
From: Madalin Bucur 

For the Marvell 88E1111 PHY only two SGMII modes are available, both
allowing only SGMII to copper mode (with or without clock). SGMII
to fiber mode is not supported. Make sure the fiber/copper register
selector bits are cleared to select copper mode.

Signed-off-by: Madalin Bucur 
Signed-off-by: Shaohui Xie 
---
 drivers/net/phy/marvell.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 3320a17..e6897b6 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -52,6 +52,7 @@
 #define MII_M1011_PHY_SCR_MDI_X		0x0020
 #define MII_M1011_PHY_SCR_AUTO_CROSS	0x0060
 
+#define MII_M1145_PHY_EXT_ADDR_PAGE	0x16
 #define MII_M1145_PHY_EXT_SR		0x1b
 #define MII_M1145_PHY_EXT_CR		0x14
 #define MII_M1145_RGMII_RX_DELAY	0x0080
@@ -552,6 +553,16 @@ static int m88e1111_config_init(struct phy_device *phydev)
 		err = phy_write(phydev, MII_M1111_PHY_EXT_SR, temp);
 		if (err < 0)
 			return err;
+
+		/* make sure copper is selected */
+		err = phy_read(phydev, MII_M1145_PHY_EXT_ADDR_PAGE);
+		if (err < 0)
+			return err;
+
+		err = phy_write(phydev, MII_M1145_PHY_EXT_ADDR_PAGE,
+				err & (~0xff));
+		if (err < 0)
+			return err;
 	}
 
 	if (phydev->interface == PHY_INTERFACE_MODE_RTBI) {
-- 
2.1.0.27.g96db324
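
The read-modify-write in the hunk above can be tried out in user space with
a mock register file. This is only a sketch: struct mock_phy, mock_read(),
mock_write() and clear_low_byte() are hypothetical and merely mimic the
return conventions of phy_read() (value or negative error) and phy_write()
(0 or negative error).

```c
#include <stdint.h>

#define NUM_REGS 32

/* Hypothetical mock of a PHY register file; not a kernel structure. */
struct mock_phy {
	uint16_t regs[NUM_REGS];
};

/* Like phy_read(): the 16-bit value on success, negative on error. */
static int mock_read(const struct mock_phy *phy, unsigned int reg)
{
	if (reg >= NUM_REGS)
		return -1;
	return phy->regs[reg];
}

/* Like phy_write(): 0 on success, negative on error. */
static int mock_write(struct mock_phy *phy, unsigned int reg, uint16_t val)
{
	if (reg >= NUM_REGS)
		return -1;
	phy->regs[reg] = val;
	return 0;
}

/* Read-modify-write that clears the low byte of register reg,
 * matching the "err & (~0xff)" write-back in the patch. */
static int clear_low_byte(struct mock_phy *phy, unsigned int reg)
{
	int val = mock_read(phy, reg);

	if (val < 0)
		return val;	/* propagate the read error unchanged */
	return mock_write(phy, reg, (uint16_t)(val & ~0xff));
}
```

Because register values are 16 bits wide, a negative int can never be valid
data, which is what makes the "if (err < 0)" check after the read safe.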



[PATCH] mac80211_hwsim: unregister genetlink family properly

2015-08-07 Thread Su Kang Yin
In hwsim_init_netlink(), we should call genl_unregister_family()
if netlink_register_notifier() fails, since the genetlink family is
already registered at that point.

Signed-off-by: Su Kang Yin 
---
 drivers/net/wireless/mac80211_hwsim.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
index 99e873d..16d953e 100644
--- a/drivers/net/wireless/mac80211_hwsim.c
+++ b/drivers/net/wireless/mac80211_hwsim.c
@@ -3120,8 +3120,10 @@ static int hwsim_init_netlink(void)
 		goto failure;
 
 	rc = netlink_register_notifier(&hwsim_netlink_notifier);
-	if (rc)
+	if (rc) {
+		genl_unregister_family(&hwsim_genl_family);
 		goto failure;
+	}
 
 	return 0;
 
-- 
1.9.1
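
The unwinding rule the patch applies (undo the first registration when the
second one fails) can be sketched in user space as follows. The functions
below are hypothetical mocks, not the genetlink or netlink notifier API;
fail_notifier is a knob added purely for the illustration.

```c
#include <stdbool.h>

static bool family_registered;
static bool fail_notifier;	/* illustration knob: force step 2 to fail */

static int register_family(void)
{
	family_registered = true;
	return 0;
}

static void unregister_family(void)
{
	family_registered = false;
}

static int register_notifier(void)
{
	return fail_notifier ? -1 : 0;
}

/* Two-step init: a failure in step 2 must undo step 1. */
static int init_netlink(void)
{
	int rc;

	rc = register_family();
	if (rc)
		return rc;

	rc = register_notifier();
	if (rc) {
		unregister_family();
		return rc;
	}

	return 0;
}
```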



[PATCH net] ipv6: don't reject link-local nexthop on other interface

2015-08-07 Thread Florian Westphal
48ed7b26faa7 ("ipv6: reject locally assigned nexthop addresses") is too
strict; it rejects the following corner case:

ip -6 route add default via fe80::1:2:3 dev eth1

[ where fe80::1:2:3 is assigned to a local interface, but not eth1 ]

Fix this by restricting the search to the given device if the nexthop is
link-local.

Joint work with Hannes Frederic Sowa.

Fixes: 48ed7b26faa7 ("ipv6: reject locally assigned nexthop addresses")
Signed-off-by: Hannes Frederic Sowa 
Signed-off-by: Florian Westphal 
---
 net/ipv6/route.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 6090969..9de4d2b 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1831,6 +1831,7 @@ int ip6_route_add(struct fib6_config *cfg)
 		int gwa_type;
 
 		gw_addr = &cfg->fc_gateway;
+		gwa_type = ipv6_addr_type(gw_addr);
 
 		/* if gw_addr is local we will fail to detect this in case
 		 * address is still TENTATIVE (DAD in progress). rt6_lookup()
@@ -1838,11 +1839,12 @@ int ip6_route_add(struct fib6_config *cfg)
 		 * prefix route was assigned to, which might be non-loopback.
 		 */
 		err = -EINVAL;
-		if (ipv6_chk_addr_and_flags(net, gw_addr, NULL, 0, 0))
+		if (ipv6_chk_addr_and_flags(net, gw_addr,
+					    gwa_type & IPV6_ADDR_LINKLOCAL ?
+					    dev : NULL, 0, 0))
 			goto out;
 
 		rt->rt6i_gateway = *gw_addr;
-		gwa_type = ipv6_addr_type(gw_addr);
 
 		if (gwa_type != (IPV6_ADDR_LINKLOCAL|IPV6_ADDR_UNICAST)) {
 			struct rt6_info *grt;
-- 
2.0.5
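
The device-selection rule in the hunk above can be illustrated in user
space. This is a sketch, not kernel code: gw_is_linklocal() and lookup_dev()
are hypothetical helpers, and the classification uses the standard
IN6_IS_ADDR_LINKLOCAL() macro rather than ipv6_addr_type().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>

/* Returns 1 if gw is a link-local IPv6 address, 0 if it is some other
 * valid IPv6 address, and -1 if it does not parse. */
static int gw_is_linklocal(const char *gw)
{
	struct in6_addr addr;

	if (inet_pton(AF_INET6, gw, &addr) != 1)
		return -1;
	return IN6_IS_ADDR_LINKLOCAL(&addr) ? 1 : 0;
}

/* Device to restrict the "is this address local?" search to:
 * dev for a link-local gateway, NULL (any device) otherwise. */
static const char *lookup_dev(const char *gw, const char *dev)
{
	return gw_is_linklocal(gw) == 1 ? dev : NULL;
}
```

For the route from the commit message, the search is limited to eth1, so
fe80::1:2:3 being assigned to some other interface no longer causes a
rejection.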



Re: [BUG] net/ipv4: inconsistent routing table

2015-08-07 Thread Zang MingJie
IMO, the routing decision should be deterministic: given a specific routing
table and local network, the result MUST be the same, independent of
how and in what order the routing entries were added.

Now there are two ways to configure the system that result in EXACTLY the
same routing table and local addresses, but the routing decision is
totally different.

SAME routing table, DIFFERENT routing decision: there MUST be a bug in the kernel.

On Thu, Aug 6, 2015 at 3:43 PM, Alexander Duyck
 wrote:
> On 08/06/2015 03:13 AM, Zang MingJie wrote:
>>
>> On Thu, Aug 6, 2015 at 1:45 AM, Alexander Duyck
>>  wrote:
>>>
>>> On 08/05/2015 02:06 AM, Daniel Borkmann wrote:


 [ please cc netdev ]

 On 08/05/2015 10:56 AM, Zang MingJie wrote:
>
>
> Hi:
>
> I found a bug when removing an IP address which is referenced by a
> routing entry.
>
> Steps to reproduce:
>
> ip li add type dummy
> ip li set dummy0 up
> ip ad add 10.0.0.1/24 dev dummy0
> ip ad add 10.0.0.2/24 dev dummy0
>>>
>>>
>>>
>>> Okay, so up to this point you have 2 addresses on the same subnet that
>>> are now on dummy0.
>>>
> ip ro add default via 10.0.0.2/24
>>>
>>>
>>>
>>> This makes the default route go through 10.0.0.2.
>>>
> ip ad del 10.0.0.2/24 dev dummy0
>>>
>>>
>>>
>>> Then you remove 10.0.0.2 from the local system; however, since 10.0.0.1
>>> is on the same subnet, dummy0 would still be the correct interface to
>>> access 10.0.0.2. It is just no longer local to the system.
>>>
> after deleting the secondary ip address, the routing entry still
> pointing to 10.0.0.2
>>>
>>>
>>>
>>> You didn't delete the default routing entry so why would you expect it to
>>> change?  All you did is remove 10.0.0.2 from the local system.  I believe
>>> the assumption is that 10.0.0.2 is still out there somewhere, it just
>>> isn't on the local system anymore.
>>
>>
>> Yes, 10.0.0.2 is migrated to somewhere else
>
>
> The address might have migrated, but the interface is still up and 10.0.0.1
> is still present on the same subnet.  Because you made a local address the
> default gateway the assumption is any routes not specifically called out on
> other interfaces are directly accessible to this interface.
>
> The bug indicates that the kernel is doing something to make the table
> inconsistent, but a default route that is a local interface address does
> essentially the same thing.
>
>>>
> # ip ro
> default via 10.0.0.2 dev dummy0
> 10.0.0.0/24 dev dummy0  proto kernel  scope link  src 10.0.0.1
>>>
>>>
>>>
>>> This matches up with what I would expect.  10.0.0.2 is the default
>>> gateway and it is accessible from dummy0 since 10.0.0.0/24 is accessible
>>> from dummy0.
>>
>>
>> This means 0.0.0.0/0 is accessible via 10.0.0.2 on the network of dummy0
>
>
> Yes, but at the time you specified it 10.0.0.2 was a local address which
> belonged to dummy0.  This means that dummy0 can access anything not
> specified elsewhere via pretty much any address it wants.  So it is
> perfectly valid if it wants to use a source address of 10.0.0.1 to send
> packets to 1.1.1.1 over dummy0.
>
>>>
> but actually, kernel considers the default route is directly connected.
>
> # ip ro get 1.1.1.1
> 1.1.1.1 dev dummy0  src 10.0.0.1
>   cache
>>>
>>>
>>>
>>> I'm not sure how you came to the "directly connected" conclusion. It is
>>> still routing things out through 10.0.0.2 from 10.0.0.1.
>>>
>>> Maybe your example would work better if you used 10.0.0.1 and 10.0.1.1
>>> instead.  Then I think you might be able to better see that when you
>>> delete the second address the route would be broken.
>>
>>
>> No, it isn't. When pinging 1.1.1.1, the kernel directly broadcasts an ARP
>> request for 1.1.1.1, which is not what I expect. It should send the ARP
>> request for 10.0.0.2. The following should be the correct routing entry:
>>
>> # ip ro get 1.1.1.1
>> 1.1.1.1 via 10.0.0.2 dev dummy0  src 10.0.0.1
>>  cache
>
>
> I see what you are trying to say, but the example provided is a bit lacking.
> Assuming you could ping 1.1.1.1 via dummy0 before with 10.0.0.2 as your
> default gateway, that shouldn't change if 10.0.0.2 is migrated to another
> address.  That is, unless there is an issue on the system 10.0.0.2 was
> migrated to.
>
> Now if I move away from the dummy interface and instead use a real
> network interface, things can get a bit more interesting.  So if we follow
> your example and use 2 different subnets on the two systems, then pings
> continue to work after we remove the addresses.  However, if we flip things a
> bit and add the default route first, and then the local address for the
> gateway, they don't.  So something like below:
> ip li set eth0 up
> ip ad add 10.0.0.1/24 dev eth0
> ip ro add default via 10.0.0.2
> ip ad add 10.0.0.2/24 dev eth0
>
> What you end up with is eth0 sending arp requests looking for 10.0.0.2 even
> though it is a local address

[PATCH 16/31] net/cavium/liquidio: use kmemdup rather than duplicating its implementation

2015-08-07 Thread Andrzej Hajda
The patch was generated using fixed coccinelle semantic patch
scripts/coccinelle/api/memdup.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2014320

Signed-off-by: Andrzej Hajda 
---
 drivers/net/ethernet/cavium/liquidio/octeon_device.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_device.c b/drivers/net/ethernet/cavium/liquidio/octeon_device.c
index f67641a..8e23e3f 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_device.c
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_device.c
@@ -602,12 +602,10 @@ int octeon_download_firmware(struct octeon_device *oct, const u8 *data,
 	snprintf(oct->fw_info.liquidio_firmware_version, 32, "LIQUIDIO: %s",
 		 h->version);
 
-	buffer = kmalloc(size, GFP_KERNEL);
+	buffer = kmemdup(data, size, GFP_KERNEL);
 	if (!buffer)
 		return -ENOMEM;
 
-	memcpy(buffer, data, size);
-
 	p = buffer + sizeof(struct octeon_firmware_file_header);
 
 	/* load all images */
-- 
1.9.1
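
For readers unfamiliar with the helper, the allocate-then-copy idiom that
these patches fold into kmemdup() looks like this in user space. mem_dup()
is a hypothetical analogue built on malloc(), shown only to make the
transformation concrete; it has no gfp flags argument.

```c
#include <stdlib.h>
#include <string.h>

/* User-space analogue of kmemdup(): allocate len bytes and copy src
 * into them. Returns NULL if the allocation fails. */
static void *mem_dup(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```

The caller's error check stays the same as before; only the separate
memcpy() call disappears.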



[PATCH iproute2] tipc: fix bearer get/set help synopsis

2015-08-07 Thread richard.alpe
From: Richard Alpe 

One option is required for bearer set and bearer get.
---
 tipc/bearer.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tipc/bearer.c b/tipc/bearer.c
index 33295f9..30b54d9 100644
--- a/tipc/bearer.c
+++ b/tipc/bearer.c
@@ -412,7 +412,7 @@ static int cmd_bearer_disable(struct nlmsghdr *nlh, const struct cmd *cmd,
 
 static void cmd_bearer_set_help(struct cmdl *cmdl)
 {
-	fprintf(stderr, "Usage: %s bearer set [OPTIONS] media MEDIA ARGS...\n",
+	fprintf(stderr, "Usage: %s bearer set OPTION media MEDIA ARGS...\n",
 		cmdl->argv[0]);
 	_print_bearer_opts();
 	_print_bearer_media();
@@ -420,7 +420,7 @@ static void cmd_bearer_set_help(struct cmdl *cmdl)
 
 static void cmd_bearer_set_udp_help(struct cmdl *cmdl)
 {
-	fprintf(stderr, "Usage: %s bearer set [OPTIONS] media udp name NAME\n\n",
+	fprintf(stderr, "Usage: %s bearer set OPTION media udp name NAME\n\n",
 		cmdl->argv[0]);
 	_print_bearer_opts();
 }
@@ -528,7 +528,7 @@ static int cmd_bearer_set(struct nlmsghdr *nlh, const struct cmd *cmd,
 
 static void cmd_bearer_get_help(struct cmdl *cmdl)
 {
-	fprintf(stderr, "Usage: %s bearer get [OPTIONS] media MEDIA ARGS...\n",
+	fprintf(stderr, "Usage: %s bearer get OPTION media MEDIA ARGS...\n",
 		cmdl->argv[0]);
 	_print_bearer_opts();
 	_print_bearer_media();
@@ -536,7 +536,7 @@ static void cmd_bearer_get_help(struct cmdl *cmdl)
 
 static void cmd_bearer_get_udp_help(struct cmdl *cmdl)
 {
-	fprintf(stderr, "Usage: %s bearer get [OPTIONS] media udp name NAME\n\n",
+	fprintf(stderr, "Usage: %s bearer get OPTION media udp name NAME\n\n",
 		cmdl->argv[0]);
 	_print_bearer_opts();
 }
-- 
2.1.4



Re: [PATCH 2/2] net, thunder, bgx: Add support for ACPI binding.

2015-08-07 Thread Tomasz Nowicki

On 07.08.2015 02:33, David Daney wrote:

From: David Daney 

Find out which PHYs belong to which BGX instance in the ACPI way.

Set the MAC address of the device as provided by ACPI tables. This is
similar to the implementation for devicetree in
of_get_mac_address(). The table is searched for the device property
entries "mac-address", "local-mac-address" and "address" in that
order. The address is provided in a u64 variable and must contain a
valid 6 bytes-len mac addr.
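
The u64 encoding described above can be sketched in user space as follows.
mac_from_u64() is a hypothetical helper that mirrors the unpacking loop in
the patch (least significant byte first); it leaves out the additional
is_valid_ether_addr() check that the patch performs.

```c
#include <stdint.h>

#define MAC_LEN 6

/* Unpack a MAC address stored in the low 48 bits of val, least
 * significant byte first. Returns -1 if any of the upper 16 bits
 * are set, i.e. the value is wider than 6 bytes. */
static int mac_from_u64(uint64_t val, uint8_t mac[MAC_LEN])
{
	int j;

	if (val & (~0ULL << 48))
		return -1;

	for (j = 0; j < MAC_LEN; j++)
		mac[j] = (uint8_t)(val >> (8 * j));

	return 0;
}
```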

Based on code from: Narinder Dhillon 
 Tomasz Nowicki 
 Robert Richter 

Signed-off-by: Tomasz Nowicki 
Signed-off-by: Robert Richter 
Signed-off-by: David Daney 
---
  drivers/net/ethernet/cavium/thunder/thunder_bgx.c | 137 +-
  1 file changed, 135 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c 
b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 615b2af..2056583 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -6,6 +6,7 @@
   * as published by the Free Software Foundation.
   */

+#include 
  #include 
  #include 
  #include 
@@ -26,7 +27,7 @@
  struct lmac {
struct bgx  *bgx;
int dmac;
-   unsigned char   mac[ETH_ALEN];
+   u8  mac[ETH_ALEN];
boollink_up;
int lmacid; /* ID within BGX */
int lmacid_bd; /* ID on board */
@@ -835,6 +836,133 @@ static void bgx_get_qlm_mode(struct bgx *bgx)
}
  }

+#ifdef CONFIG_ACPI
+
+static int bgx_match_phy_id(struct device *dev, void *data)
+{
+   struct phy_device *phydev = to_phy_device(dev);
+   u32 *phy_id = data;
+
+   if (phydev->addr == *phy_id)
+   return 1;
+
+   return 0;
+}
+
+static const char * const addr_propnames[] = {
+   "mac-address",
+   "local-mac-address",
+   "address",
+};
+
+static int acpi_get_mac_address(struct acpi_device *adev, u8 *dst)
+{
+   const union acpi_object *prop;
+   u64 mac_val;
+   u8 mac[ETH_ALEN];
+   int i, j;
+   int ret;
+
+   for (i = 0; i < ARRAY_SIZE(addr_propnames); i++) {
+   ret = acpi_dev_get_property(adev, addr_propnames[i],
+   ACPI_TYPE_INTEGER, &prop);
+   if (ret)
+   continue;
+
+   mac_val = prop->integer.value;
+
+   if (mac_val & (~0ULL << 48))
+   continue;   /* more than 6 bytes */
+
+   for (j = 0; j < ARRAY_SIZE(mac); j++)
+   mac[j] = (u8)(mac_val >> (8 * j));
+   if (!is_valid_ether_addr(mac))
+   continue;
+
+   memcpy(dst, mac, ETH_ALEN);
+
+   return 0;
+   }
+
+   return ret ? ret : -EINVAL;
+}
+
+static acpi_status bgx_acpi_register_phy(acpi_handle handle,
+u32 lvl, void *context, void **rv)
+{
+   struct acpi_reference_args args;
+   const union acpi_object *prop;
+   struct bgx *bgx = context;
+   struct acpi_device *adev;
+   struct device *phy_dev;
+   u32 phy_id;
+
+   if (acpi_bus_get_device(handle, &adev))
+   goto out;
+
+   SET_NETDEV_DEV(&bgx->lmac[bgx->lmac_count].netdev, &bgx->pdev->dev);
+
+   acpi_get_mac_address(adev, bgx->lmac[bgx->lmac_count].mac);
+
+   bgx->lmac[bgx->lmac_count].lmacid = bgx->lmac_count;
+
+   if (acpi_dev_get_property_reference(adev, "phy-handle", 0, &args))
+   goto out;
+
+   if (acpi_dev_get_property(args.adev, "phy-channel", ACPI_TYPE_INTEGER, 
&prop))
+   goto out;
+
+   phy_id = prop->integer.value;
+
+   phy_dev = bus_find_device(&mdio_bus_type, NULL, (void *)&phy_id,
+ bgx_match_phy_id);
+   if (!phy_dev)
+   goto out;
+
+   bgx->lmac[bgx->lmac_count].phydev = to_phy_device(phy_dev);
+out:
+   bgx->lmac_count++;
+   return AE_OK;
+}
+
+static acpi_status bgx_acpi_match_id(acpi_handle handle, u32 lvl,
+void *context, void **ret_val)
+{
+   struct acpi_buffer string = { ACPI_ALLOCATE_BUFFER, NULL };
+   struct bgx *bgx = context;
+   char bgx_sel[5];
+
+   snprintf(bgx_sel, 5, "BGX%d", bgx->bgx_id);
+   if (ACPI_FAILURE(acpi_get_name(handle, ACPI_SINGLE_NAME, &string))) {
+   pr_warn("Invalid link device\n");
+   return AE_OK;
+   }
+
+   if (strncmp(string.pointer, bgx_sel, 4))
+   return AE_OK;
+
+   acpi_walk_namespace(ACPI_TYPE_DEVICE, handle, 1,
+   bgx_acpi_register_phy, NULL, bgx, NULL);
+
+   kfree(string.pointer);
+   return AE_CTRL_TERMINATE;
+}
+
+static int bgx_init_acpi_phy(struct bgx *bgx)
+{
+   acpi_get_de

[PATCH 26/31] net/sched: use kmemdup rather than duplicating its implementation

2015-08-07 Thread Andrzej Hajda
The patch was generated using fixed coccinelle semantic patch
scripts/coccinelle/api/memdup.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2014320

Signed-off-by: Andrzej Hajda 
---
 net/sched/act_bpf.c | 4 +---
 net/sched/cls_bpf.c | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index 1b97dab..5c0fa03 100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -190,12 +190,10 @@ static int tcf_bpf_init_from_ops(struct nlattr **tb, struct tcf_bpf_cfg *cfg)
 	if (bpf_size != nla_len(tb[TCA_ACT_BPF_OPS]))
 		return -EINVAL;
 
-	bpf_ops = kzalloc(bpf_size, GFP_KERNEL);
+	bpf_ops = kmemdup(nla_data(tb[TCA_ACT_BPF_OPS]), bpf_size, GFP_KERNEL);
 	if (bpf_ops == NULL)
 		return -ENOMEM;
 
-	memcpy(bpf_ops, nla_data(tb[TCA_ACT_BPF_OPS]), bpf_size);
-
 	fprog_tmp.len = bpf_num_ops;
 	fprog_tmp.filter = bpf_ops;
 
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index e5168f8..423f774 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -212,12 +212,10 @@ static int cls_bpf_prog_from_ops(struct nlattr **tb,
 	if (bpf_size != nla_len(tb[TCA_BPF_OPS]))
 		return -EINVAL;
 
-	bpf_ops = kzalloc(bpf_size, GFP_KERNEL);
+	bpf_ops = kmemdup(nla_data(tb[TCA_BPF_OPS]), bpf_size, GFP_KERNEL);
 	if (bpf_ops == NULL)
 		return -ENOMEM;
 
-	memcpy(bpf_ops, nla_data(tb[TCA_BPF_OPS]), bpf_size);
-
 	fprog_tmp.len = bpf_num_ops;
 	fprog_tmp.filter = bpf_ops;
 
-- 
1.9.1



[PATCH 27/31] net/tipc: use kmemdup rather than duplicating its implementation

2015-08-07 Thread Andrzej Hajda
The patch was generated using fixed coccinelle semantic patch
scripts/coccinelle/api/memdup.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2014320

Signed-off-by: Andrzej Hajda 
---
 net/tipc/server.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/tipc/server.c b/net/tipc/server.c
index 922e04a..c187cad 100644
--- a/net/tipc/server.c
+++ b/net/tipc/server.c
@@ -411,13 +411,12 @@ static struct outqueue_entry *tipc_alloc_entry(void *data, int len)
 	if (!entry)
 		return NULL;
 
-	buf = kmalloc(len, GFP_ATOMIC);
+	buf = kmemdup(data, len, GFP_ATOMIC);
 	if (!buf) {
 		kfree(entry);
 		return NULL;
 	}
 
-	memcpy(buf, data, len);
 	entry->iov.iov_base = buf;
 	entry->iov.iov_len = len;
 
-- 
1.9.1
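
For readers unfamiliar with the helper, the transformation in these kmemdup patches can be sketched in userspace C. This is only an illustration under assumptions: memdup_sketch and alloc_then_copy are hypothetical names, malloc() stands in for kmalloc(), and there is no gfp flags argument. It shows the open-coded allocate-then-copy pattern next to a single helper that does both steps.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace stand-in for the kernel's kmemdup(). The name is hypothetical
 * and there is no gfp argument because malloc() takes none.
 */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p;

	assert(src != NULL);

	p = malloc(len);
	if (p)
		memcpy(p, src, len);
	return p;	/* NULL on allocation failure */
}

/* The open-coded pattern the patches remove: allocate, check, then copy. */
static void *alloc_then_copy(const void *src, size_t len)
{
	void *p = malloc(len);

	if (!p)
		return NULL;
	memcpy(p, src, len);
	return p;
}
```

Both functions return an independent copy of the buffer or NULL, so a caller's existing NULL check keeps working after the conversion; only the separate memcpy() line goes away.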



[PATCH 28/31] net/xfrm: use kmemdup rather than duplicating its implementation

2015-08-07 Thread Andrzej Hajda
The patch was generated using fixed coccinelle semantic patch
scripts/coccinelle/api/memdup.cocci [1].

[1]: http://permalink.gmane.org/gmane.linux.kernel/2014320

Signed-off-by: Andrzej Hajda 
---
 net/xfrm/xfrm_user.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 0cebf1f..a8de9e3 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -925,12 +925,10 @@ static int xfrm_dump_sa(struct sk_buff *skb, struct netlink_callback *cb)
 			return err;
 
 		if (attrs[XFRMA_ADDRESS_FILTER]) {
-			filter = kmalloc(sizeof(*filter), GFP_KERNEL);
+			filter = kmemdup(nla_data(attrs[XFRMA_ADDRESS_FILTER]),
+					 sizeof(*filter), GFP_KERNEL);
 			if (filter == NULL)
 				return -ENOMEM;
-
-			memcpy(filter, nla_data(attrs[XFRMA_ADDRESS_FILTER]),
-			       sizeof(*filter));
 		}
 
 		if (attrs[XFRMA_PROTO])
-- 
1.9.1



Re: [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2015-08-05

2015-08-07 Thread David Miller
From: Jeff Kirsher 
Date: Wed,  5 Aug 2015 16:52:00 -0700

> This series contains updates to i40e, i40evf and e1000e.

Pulled, thanks Jeff.


Re: [RFC PATCH] net: ipv4: increase dhcp inter device timeout

2015-08-07 Thread YOSHIFUJI Hideaki
Hi,

Mugunthan V N wrote:
> When a system has multiple Ethernet devices and sends a DHCP
> request (for using NFS), it waits only HZ/2, which is 500 ms,
> before switching to another interface for DHCP.
> 
> Some routers (e.g. Trendnet routers) respond to a DHCP request
> after about 560 ms. When the system has only one Ethernet
> interface there is no issue, as the timeout is 2 s, the dev xid
> doesn't change, and the request is only retried.
> 
> But when the system has multiple Ethernet interfaces, like DRA74x
> with CPSW in dual EMAC mode, the DHCP response is dropped because
> the dev xid changes while shifting to the next device. So change
> the inter-device timeout to HZ (which is 1 s).
> 
> Signed-off-by: Mugunthan V N 
> ---
>  net/ipv4/ipconfig.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
> index 8e7328c..bdb8cb5 100644
> --- a/net/ipv4/ipconfig.c
> +++ b/net/ipv4/ipconfig.c
> @@ -94,7 +94,7 @@
>  /* Define the timeout for waiting for a DHCP/BOOTP/RARP reply */
>  #define CONF_OPEN_RETRIES	2	/* (Re)open devices twice */
>  #define CONF_SEND_RETRIES	6	/* Send six requests per open */
> -#define CONF_INTER_TIMEOUT	(HZ/2)	/* Inter-device timeout: 1/2 second */
> +#define CONF_INTER_TIMEOUT	(HZ)	/* Inter-device timeout: 1/2 second */

You should at least update the comment as well.

--yoshfuji

>  #define CONF_BASE_TIMEOUT	(HZ*2)	/* Initial timeout: 2 seconds */
>  #define CONF_TIMEOUT_RANDOM	(HZ)	/* Maximum amount of randomization */
>  #define CONF_TIMEOUT_MULT	*7/4	/* Rate of timeout growth */
> 

-- 
Hideaki Yoshifuji 
Technical Division, MIRACLE LINUX CORPORATION
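
The timing argument in the quoted commit message can be checked with a small userspace sketch. Everything here is an assumption for illustration: SKETCH_HZ is an arbitrary tick rate (the real HZ is a kernel configuration value), and ticks_to_ms and reply_in_window are hypothetical helpers, not functions from net/ipv4/ipconfig.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed tick rate for the example; the real HZ is set by kernel config. */
#define SKETCH_HZ 250

/* Convert a timeout expressed in ticks to milliseconds. */
static unsigned int ticks_to_ms(unsigned int ticks)
{
	return ticks * 1000u / SKETCH_HZ;
}

/*
 * True if a reply arriving reply_ms after the request still lands before
 * the inter-device timeout expires and the next device is tried.
 */
static bool reply_in_window(unsigned int reply_ms, unsigned int timeout_ticks)
{
	assert(timeout_ticks > 0);
	return reply_ms < ticks_to_ms(timeout_ticks);
}
```

With these definitions, reply_in_window(560, SKETCH_HZ / 2) is false because the window is 500 ms, while reply_in_window(560, SKETCH_HZ) is true because the window is 1000 ms. That matches the 560 ms router reply described in the commit message.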


Re: [PATCH net V2] virtio-net: drop NETIF_F_FRAGLIST

2015-08-07 Thread David Miller
From: Jason Wang 
Date: Wed,  5 Aug 2015 10:34:04 +0800

> virtio declares support for NETIF_F_FRAGLIST, but assumes
> that there are at most MAX_SKB_FRAGS + 2 fragments which isn't
> always true with a fraglist.
> 
> A longer fraglist in the skb will make the call to skb_to_sgvec overflow
> the sg array, leading to memory corruption.
> 
> Drop NETIF_F_FRAGLIST so we only get what we can handle.
> 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Jason Wang 

Applied, thanks Jason.
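
The overflow described in the commit message comes down to a counting argument, sketched below in userspace C. The sketch rests on stated assumptions: SKETCH_MAX_SKB_FRAGS is an example value (the real MAX_SKB_FRAGS depends on kernel configuration), the two extra entries are assumed to hold a header and the linear data, and segments_needed and fits_sg_array are hypothetical helpers rather than virtio-net code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed example value; the real MAX_SKB_FRAGS is configuration dependent. */
#define SKETCH_MAX_SKB_FRAGS 17
/* Fixed-size scatterlist as described in the commit message. */
#define SKETCH_SG_ENTRIES (SKETCH_MAX_SKB_FRAGS + 2)

/*
 * Entries needed for one packet: one for the header, one for the linear
 * data (both assumptions), one per page fragment, plus one for every
 * segment contributed by skbs chained on the frag list.
 */
static size_t segments_needed(size_t page_frags, size_t fraglist_segments)
{
	return 1 + 1 + page_frags + fraglist_segments;
}

/* True if the packet can be mapped without writing past the sg array. */
static bool fits_sg_array(size_t page_frags, size_t fraglist_segments)
{
	assert(page_frags <= SKETCH_MAX_SKB_FRAGS);
	return segments_needed(page_frags, fraglist_segments) <= SKETCH_SG_ENTRIES;
}
```

A packet with the maximum number of page fragments and no frag list exactly fills the array; any frag-list segment on top of that no longer fits, which is why the device stops advertising the feature instead of growing the array.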


Re: [PATCH] stmmac: dwmac-ipq806x: fix static checker warning

2015-08-07 Thread David Miller
From: Mathieu Olivari 
Date: Tue,  4 Aug 2015 17:25:02 -0700

> The patch b1c17215d718: "stmmac: add ipq806x glue layer", leads to the
> following static checker warning:
> 
> .../stmmac/dwmac-ipq806x.c:314 ipq806x_gmac_probe()
> warn: double left shift '1 << (1 << gmac->id)'
> 
> The NSS_COMMON_CLK_SRC_CTRL_OFFSET macro is used once as an offset, and
> once as a mask, which is a bug indeed. We'll fix it by defining the
> offset as the real offset value and computing the mask from it when
> required.
> 
> Tested on IPQ806x ref designs AP148 & DB149.
> 
> Reported-by: Dan Carpenter 
> Signed-off-by: Mathieu Olivari 

Applied.
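
The double-shift warning quoted above can be reproduced in a few lines of userspace C. The macro names below are hypothetical stand-ins, and their shapes are an assumption about how one macro ended up serving as both an offset and a mask; they are not copied from dwmac-ipq806x.c.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy shape (assumed): the "offset" macro already yields a bit mask. */
#define SKETCH_BAD_OFFSET(id)	(1u << (id))

/* Fixed shape: the offset is the plain bit position, the mask is derived. */
#define SKETCH_OFFSET(id)	(id)
#define SKETCH_MASK(id)		(1u << SKETCH_OFFSET(id))

/* Shifting by the buggy macro gives 1 << (1 << id), the flagged pattern. */
static uint32_t bad_mask(unsigned int id)
{
	assert(id < 5);	/* keeps the outer shift below 32 bits */
	return 1u << SKETCH_BAD_OFFSET(id);
}

/* Mask computed from the real offset value. */
static uint32_t good_mask(unsigned int id)
{
	assert(id < 32);
	return SKETCH_MASK(id);
}
```

For id 3 the buggy form selects bit 8 (value 256) while the fixed form selects bit 3 (value 8), so the two only diverge quietly; a static checker is the most likely place for this to be noticed.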


Re: [PATCH] net: netcp: fix unused interface rx buffer size configuration

2015-08-07 Thread David Miller
From: WingMan Kwok 
Date: Tue, 4 Aug 2015 16:56:53 -0400

> Prior to this patch, rx buffer size for each rx queue
> of an interface is configurable through dts bindings.
> But for an interface, the first rx queue's rx buffer
> size is always the usual MTU size (plus usual overhead)
> and page size for the remaining rx queues (if they are
> enabled by specifying a non-zero rx queue depth dts
> binding of the corresponding interface).  This patch
> removes the rx buffer size configuration capability.
> 
> Signed-off-by: WingMan Kwok 

Applied.


Re: [PATCH net] r8169: enforce RX_MULTI_EN on rtl8168ep/8111ep chips

2015-08-07 Thread David Miller
From: Ivan Vecera 
Date: Tue,  4 Aug 2015 22:11:43 +0200

> Enforcing this flag in RxConfig for the mentioned chips fixes netdev
> watchdog issues preceded by AMD IOMMU message(s) like:
> AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x001d address=0x3000 flags=0x0050]
> 
> Note that this flag is also set in Realtek's own driver for these chips.
> 
> Signed-off-by: Ivan Vecera 
> Tested-by: Alexander Lindqvist 

Applied.