Re: [PATCH] bnx2x: drop packets where gso_size is too big for hardware

2017-09-17 Thread Daniel Axtens
Hi Eric,

>> +if (unlikely(skb_shinfo(skb)->gso_size + hlen > 
>> MAX_PACKET_SIZE)) {
>> +BNX2X_ERR("reported gso segment size plus headers "
>> +  "(%d + %d) > MAX_PACKET_SIZE; dropping pkt!",
>> +  skb_shinfo(skb)->gso_size, hlen);
>> +
>> +goto free_and_drop;
>> +}
>> +
>
>
> If you had this test in bnx2x_features_check(), the packet could be
> segmented by the core networking stack before reaching bnx2x_start_xmit(),
> by clearing NETIF_F_GSO_MASK
>
> -> No drop would be involved.
>
> check i40evf_features_check() for similar logic.

So I've been experimenting with this and reading through the core
networking code. If my understanding is correct, disabling GSO will
cause the packet to be segmented, but it will be segmented into
gso_size + header length packets. So in this case (~10kB gso_size) the
resultant packets will still be too big - although at least they don't
cause a crash in that case.

We could continue with this anyway as it at least prevents the crash -
but, and I haven't been able to find a nice definitive answer to this -
are implementations of ndo_start_xmit permitted to assume that the
skb passed in will fit within the MTU? I notice that most callers will
attempt to ensure this - for example ip_output.c, ip6_output.c and
ip_forward.c all contain calls to skb_gso_validate_mtu(). If
implementations are permitted to assume this, perhaps a fix to
openvswitch would be more appropriate?
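
For reference, a rough sketch of the ndo_features_check approach Eric
suggests, modeled on i40evf_features_check(). MAX_PACKET_SIZE comes from
the patch above; using skb_headlen() as the header length is an
illustrative assumption, not the actual bnx2x code:

static netdev_features_t bnx2x_features_check(struct sk_buff *skb,
					      struct net_device *dev,
					      netdev_features_t features)
{
	/* Clearing the GSO bits makes the core stack segment the skb
	 * before it reaches ndo_start_xmit(), so nothing is dropped --
	 * but, per the above, each resulting frame is still roughly
	 * gso_size plus headers long.
	 */
	if (skb_is_gso(skb) &&
	    skb_shinfo(skb)->gso_size + skb_headlen(skb) > MAX_PACKET_SIZE)
		features &= ~NETIF_F_GSO_MASK;

	return features;
}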

Regards,
Daniel


Re: [PATCH] vhost_net: conditionally enable tx polling

2017-09-17 Thread kbuild test robot
Hi Jason,

[auto build test WARNING on vhost/linux-next]
[also build test WARNING on v4.14-rc1 next-20170915]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Jason-Wang/vhost_net-conditionally-enable-tx-polling/20170918-112041
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next
config: x86_64-randconfig-x009-201738 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   drivers//vhost/net.c: In function 'handle_tx':
>> drivers//vhost/net.c:565:4: warning: suggest parentheses around assignment 
>> used as truth value [-Wparentheses]
   if (err = -EAGAIN)
   ^~

vim +565 drivers//vhost/net.c

   442  
   443  /* Expects to be always run from workqueue - which acts as
   444   * read-size critical section for our kind of RCU. */
   445  static void handle_tx(struct vhost_net *net)
   446  {
   447  struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
   448  struct vhost_virtqueue *vq = &nvq->vq;
   449  unsigned out, in;
   450  int head;
   451  struct msghdr msg = {
   452  .msg_name = NULL,
   453  .msg_namelen = 0,
   454  .msg_control = NULL,
   455  .msg_controllen = 0,
   456  .msg_flags = MSG_DONTWAIT,
   457  };
   458  size_t len, total_len = 0;
   459  int err;
   460  size_t hdr_size;
   461  struct socket *sock;
   462  struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
   463  bool zcopy, zcopy_used;
   464  
   465  mutex_lock(&vq->mutex);
   466  sock = vq->private_data;
   467  if (!sock)
   468  goto out;
   469  
   470  if (!vq_iotlb_prefetch(vq))
   471  goto out;
   472  
   473  vhost_disable_notify(&net->dev, vq);
   474  vhost_net_disable_vq(net, vq);
   475  
   476  hdr_size = nvq->vhost_hlen;
   477  zcopy = nvq->ubufs;
   478  
   479  for (;;) {
   480  /* Release DMAs done buffers first */
   481  if (zcopy)
   482  vhost_zerocopy_signal_used(net, vq);
   483  
   484  /* If more outstanding DMAs, queue the work.
   485   * Handle upend_idx wrap around
   486   */
   487  if (unlikely(vhost_exceeds_maxpend(net)))
   488  break;
   489  
   490  head = vhost_net_tx_get_vq_desc(net, vq, vq->iov,
   491  ARRAY_SIZE(vq->iov),
   492  &out, &in);
   493  /* On error, stop handling until the next kick. */
   494  if (unlikely(head < 0))
   495  break;
   496  /* Nothing new?  Wait for eventfd to tell us they refilled. */
   497  if (head == vq->num) {
   498  if (unlikely(vhost_enable_notify(&net->dev, vq))) {
   499  vhost_disable_notify(&net->dev, vq);
   500  continue;
   501  }
   502  break;
   503  }
   504  if (in) {
   505  vq_err(vq, "Unexpected descriptor format for TX: "
   506 "out %d, int %d\n", out, in);
   507  break;
   508  }
   509  /* Skip header. TODO: support TSO. */
   510  len = iov_length(vq->iov, out);
   511  iov_iter_init(&msg.msg_iter, WRITE, vq->iov, out, len);
   512  iov_iter_advance(&msg.msg_iter, hdr_size);
   513  /* Sanity check */
   514  if (!msg_data_left(&msg)) {
   515  vq_err(vq, "Unexpected header len for TX: "
   516 "%zd expected %zd\n",
   517 len, hdr_size);
   518  break;
   519  }
   520  len = msg_data_left(&msg);
   521  
   522  zcopy_used = zcopy && len >= VHOST_GOODCOPY_LEN
   523 && (nvq->upend_idx + 1) % UIO_MAXIOV !=
   524nvq->done_idx
   525 && vhost_net_tx_select_zcopy(net);
   526  
   527  /* use msg_control to pass vhost zerocopy ubuf info to skb */
   528  if (zcopy_used) {
   529  struct ubuf_info *ubuf;
   530  ubuf = nvq->ubuf_info + nvq->upend_idx;
   531  
   532 

Re: [RFC PATCH] can: m_can: Support higher speed CAN-FD bitrates

2017-09-17 Thread Yang, Wenyou



On 2017/9/14 13:06, Sekhar Nori wrote:

On Thursday 14 September 2017 03:28 AM, Franklin S Cooper Jr wrote:


On 08/18/2017 02:39 PM, Franklin S Cooper Jr wrote:

During testing, transmitting using CAN-FD at high bitrates (4 Mbps) only
resulted in errors. Scoping the signals I noticed that only a single bit
was being transmitted and, with a bit more investigation, realized the actual
MCAN IP would go back to initialization mode automatically.

It appears this issue is due to the MCAN needing to use the Transmitter
Delay Compensation Mode as defined in the MCAN User's Guide. When this
mode is used the User's Guide indicates that the Transmitter Delay
Compensation Offset register should be set. The document mentions that this
register should be set to (1/dbitrate)/2*(Func Clk Freq).

Additionally, CAN-CiA's "Bit Time Requirements for CAN FD" document indicates
that this TDC mode is only needed for data bit rates above 2.5 Mbps.
Therefore, only enable this mode and only set TDCO when the data bit rate
is above 2.5 Mbps.
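
As a worked example (assuming a 40 MHz function clock, a value chosen
purely for illustration): at a 4 Mbps data bitrate the formula gives
TDCO = 40,000,000 / (2 * 4,000,000) = 5, in units of the function clock
period.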

Signed-off-by: Franklin S Cooper Jr 
---
I'm pretty surprised that this hasn't been implemented already since
the primary purpose of CAN-FD is to go beyond 1 Mbps and the MCAN IP
supports up to 10 Mbps.

So it would be nice to get comments from users of this driver to understand
if they have been able to use CAN-FD beyond 2.5 Mbps without this patch.
If they haven't, what did they do to get around it if they needed higher
speeds?

Meanwhile I plan on testing this using a more "realistic" CAN bus to ensure
everything still works at 5 Mbps, which is the max speed of my CAN
transceiver.

Ping. Does anyone have any thoughts on this?

I added Dong, who authored the m_can driver, and Wenyou, who added the only
in-kernel user of the driver, for any help.
I tested it on the SAMA5D2 Xplained board both with and without this patch;
both work with the 4 Mbps data bit rate.




Thanks,
Sekhar


  drivers/net/can/m_can/m_can.c | 24 +++-
  1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index f4947a7..720e073 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -126,6 +126,12 @@ enum m_can_mram_cfg {
  #define DBTP_DSJW_SHIFT   0
  #define DBTP_DSJW_MASK(0xf << DBTP_DSJW_SHIFT)
  
+/* Transmitter Delay Compensation Register (TDCR) */

+#define TDCR_TDCO_SHIFT8
+#define TDCR_TDCO_MASK (0x7F << TDCR_TDCO_SHIFT)
+#define TDCR_TDCF_SHIFT0
+#define TDCR_TDCF_MASK (0x7F << TDCR_TDCF_SHIFT)
+
  /* Test Register (TEST) */
  #define TEST_LBCK BIT(4)
  
@@ -977,6 +983,8 @@ static int m_can_set_bittiming(struct net_device *dev)

const struct can_bittiming *dbt = &priv->can.data_bittiming;
u16 brp, sjw, tseg1, tseg2;
u32 reg_btp;
+   u32 enable_tdc = 0;
+   u32 tdco;
  
  	brp = bt->brp - 1;

sjw = bt->sjw - 1;
@@ -991,9 +999,23 @@ static int m_can_set_bittiming(struct net_device *dev)
sjw = dbt->sjw - 1;
tseg1 = dbt->prop_seg + dbt->phase_seg1 - 1;
tseg2 = dbt->phase_seg2 - 1;
+
+   /* TDC is only needed for bitrates beyond 2.5 MBit/s
+* Specified in the "Bit Time Requirements for CAN FD" document
+*/
+   if (dbt->bitrate > 2500000) {
+   enable_tdc = DBTP_TDC;
+   /* Equation based on Bosch's M_CAN User Manual's
+* Transmitter Delay Compensation Section
+*/
+   tdco = priv->can.clock.freq / (dbt->bitrate * 2);
+   m_can_write(priv, M_CAN_TDCR, tdco << TDCR_TDCO_SHIFT);
+   }
+
reg_btp = (brp << DBTP_DBRP_SHIFT) | (sjw << DBTP_DSJW_SHIFT) |
(tseg1 << DBTP_DTSEG1_SHIFT) |
-   (tseg2 << DBTP_DTSEG2_SHIFT);
+   (tseg2 << DBTP_DTSEG2_SHIFT) | enable_tdc;
+
m_can_write(priv, M_CAN_DBTP, reg_btp);
}
  



Regards,
Wenyou Yang



Re: Regression in throughput between kvm guests over virtual bridge

2017-09-17 Thread Jason Wang



On 2017-09-16 03:19, Matthew Rosato wrote:

It looks like vhost is slowed down for some reason which leads to more
idle time on 4.13+VHOST_RX_BATCH=1. It would be appreciated if you can
collect the perf.diff on the host, one for rx and one for tx.


perf data below for the associated vhost threads, baseline=4.12,
delta1=4.13, delta2=4.13+VHOST_RX_BATCH=1

Client vhost:

60.12%  -11.11%  -12.34%  [kernel.vmlinux]   [k] raw_copy_from_user
13.76%   -1.28%   -0.74%  [kernel.vmlinux]   [k] get_page_from_freelist
  2.00%   +3.69%   +3.54%  [kernel.vmlinux]   [k] __wake_up_sync_key
  1.19%   +0.60%   +0.66%  [kernel.vmlinux]   [k] __alloc_pages_nodemask
  1.12%   +0.76%   +0.86%  [kernel.vmlinux]   [k] copy_page_from_iter
  1.09%   +0.28%   +0.35%  [vhost][k] vhost_get_vq_desc
  1.07%   +0.31%   +0.26%  [kernel.vmlinux]   [k] alloc_skb_with_frags
  0.94%   +0.42%   +0.65%  [kernel.vmlinux]   [k] alloc_pages_current
  0.91%   -0.19%   -0.18%  [kernel.vmlinux]   [k] memcpy
  0.88%   +0.26%   +0.30%  [kernel.vmlinux]   [k] __next_zones_zonelist
  0.85%   +0.05%   +0.12%  [kernel.vmlinux]   [k] iov_iter_advance
  0.79%   +0.09%   +0.19%  [vhost][k] __vhost_add_used_n
  0.74%[kernel.vmlinux]   [k] get_task_policy.part.7
  0.74%   -0.01%   -0.05%  [kernel.vmlinux]   [k] tun_net_xmit
  0.60%   +0.17%   +0.33%  [kernel.vmlinux]   [k] policy_nodemask
  0.58%   -0.15%   -0.12%  [ebtables] [k] ebt_do_table
  0.52%   -0.25%   -0.22%  [kernel.vmlinux]   [k] __alloc_skb
...
  0.42%   +0.58%   +0.59%  [kernel.vmlinux]   [k] eventfd_signal
...
  0.32%   +0.96%   +0.93%  [kernel.vmlinux]   [k] finish_task_switch
...
  +1.50%   +1.16%  [kernel.vmlinux]   [k] get_task_policy.part.9
  +0.40%   +0.42%  [kernel.vmlinux]   [k] __skb_get_hash_symmetr
  +0.39%   +0.40%  [kernel.vmlinux]   [k] _copy_from_iter_full
  +0.24%   +0.23%  [vhost_net][k] vhost_net_buf_peek

Server vhost:

61.93%  -10.72%  -10.91%  [kernel.vmlinux]   [k] raw_copy_to_user
  9.25%   +0.47%   +0.86%  [kernel.vmlinux]   [k] free_hot_cold_page
  5.16%   +1.41%   +1.57%  [vhost][k] vhost_get_vq_desc
  5.12%   -3.81%   -3.78%  [kernel.vmlinux]   [k] skb_release_data
  3.30%   +0.42%   +0.55%  [kernel.vmlinux]   [k] raw_copy_from_user
  1.29%   +2.20%   +2.28%  [kernel.vmlinux]   [k] copy_page_to_iter
  1.24%   +1.65%   +0.45%  [vhost_net][k] handle_rx
  1.08%   +3.03%   +2.85%  [kernel.vmlinux]   [k] __wake_up_sync_key
  0.96%   +0.70%   +1.10%  [vhost][k] translate_desc
  0.69%   -0.20%   -0.22%  [kernel.vmlinux]   [k] tun_do_read.part.10
  0.69%[kernel.vmlinux]   [k] tun_peek_len
  0.67%   +0.75%   +0.78%  [kernel.vmlinux]   [k] eventfd_signal
  0.52%   +0.96%   +0.98%  [kernel.vmlinux]   [k] finish_task_switch
  0.50%   +0.05%   +0.09%  [vhost][k] vhost_add_used_n
...
  +0.63%   +0.58%  [vhost_net][k] vhost_net_buf_peek
  +0.32%   +0.32%  [kernel.vmlinux]   [k] _copy_to_iter
  +0.19%   +0.19%  [kernel.vmlinux]   [k] __skb_get_hash_symmetr
  +0.11%   +0.21%  [vhost][k] vhost_umem_interval_tr



Looks like something, for some unknown reason, is leading to more wakeups.

Could you please try the attached patch to see if it solves or mitigates
the issue?


Thanks
From 63b276ed881c1e2a89b7ea35b6f328f70ddd6185 Mon Sep 17 00:00:00 2001
From: Jason Wang 
Date: Mon, 18 Sep 2017 10:56:30 +0800
Subject: [PATCH] vhost_net: conditionally enable tx polling

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 58585ec..397d86a 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -471,6 +471,7 @@ static void handle_tx(struct vhost_net *net)
 		goto out;
 
 	vhost_disable_notify(&net->dev, vq);
+	vhost_net_disable_vq(net, vq);
 
 	hdr_size = nvq->vhost_hlen;
 	zcopy = nvq->ubufs;
@@ -562,6 +563,8 @@ static void handle_tx(struct vhost_net *net)
 	% UIO_MAXIOV;
 			}
 			vhost_discard_vq_desc(vq, 1);
+			if (err = -EAGAIN)
+vhost_net_enable_vq(net, vq);
 			break;
 		}
 		if (err != len)
-- 
1.8.3.1



Re: [PATCH V2] tipc: Use bsearch library function

2017-09-17 Thread Joe Perches
On Sun, 2017-09-17 at 16:27 +, Jon Maloy wrote:
> > -Original Message-
> > From: Thomas Meyer [mailto:tho...@m3y3r.de]
[]
> > What about the other binary search implementation in the same file? Should
> > I try to convert it or will it get NAKed for performance reasons too?
> 
> The searches for inserting and removing publications are less time critical,
> so that would be ok with me.
> If you have any more general interest in improving the code in this file
> (which is needed) it would also be appreciated.

Perhaps using an rbtree would be an improvement.
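
For reference, a minimal sketch of what the bsearch() conversion under
discussion looks like, using the kernel's lib/bsearch.c helper; the
struct and field names here are placeholders, not the real tipc ones:

#include <linux/bsearch.h>

struct sub_seq {		/* placeholder, not the real tipc struct */
	u32 lower;
	u32 upper;
};

/* compare a looked-up instance against one entry's [lower, upper] range */
static int sseq_cmp(const void *key, const void *elt)
{
	u32 instance = *(const u32 *)key;
	const struct sub_seq *sseq = elt;

	if (instance < sseq->lower)
		return -1;
	if (instance > sseq->upper)
		return 1;
	return 0;
}

static struct sub_seq *sseq_find(struct sub_seq *sseqs, size_t count,
				 u32 instance)
{
	/* sseqs must be sorted by ->lower for bsearch() to work */
	return bsearch(&instance, sseqs, count, sizeof(*sseqs), sseq_cmp);
}

The performance concern raised above is the indirect sseq_cmp() call per
probe, which an open-coded loop lets the compiler inline.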


Re: [PATCH net-next v2 0/7] korina: performance fixes and cleanup

2017-09-17 Thread Florian Fainelli


On 09/17/2017 10:23 AM, Roman Yeryomin wrote:
> Changes from v1:
> - use GRO instead of increasing ring size
> - use NAPI_POLL_WEIGHT instead of defining own NAPI_WEIGHT
> - optimize rx descriptor flags processing

net-next is closed at the moment, but these look like reasonable
changes. I would just replace patch 7 with a patch that entirely drops
the driver-specific version since that does not serve any purpose in the
context of an in-kernel driver.

Some nice clean-ups that you should also consider for future changes:

- reduce the duplication of tests/conditions in korina_send_packet(); a
lot of them test for the same things and set the same
descriptor bits

- move korina_tx() to a NAPI context instead of working from hard
interrupt context

- get rid of the MIPS dma_cache_* calls and instead properly use the
DMA-API to allocate descriptors and invalidate/write-back skb->data (see
the sketch below)
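
A minimal sketch of that last point, with assumed variable and field
names (not the actual korina code): map the receive buffer via the
DMA-API when posting it to the ring, and unmap it before handing the skb
up, instead of calling dma_cache_inv() on skb->data directly.

	dma_addr_t mapping;

	/* when posting the buffer to the RX ring; the DMA-API does the
	 * right cache maintenance for the platform */
	mapping = dma_map_single(&pdev->dev, skb->data, KORINA_RBSIZE,
				 DMA_FROM_DEVICE);
	if (dma_mapping_error(&pdev->dev, mapping))
		goto drop;
	rd->ca = mapping;	/* descriptor's buffer address */

	/* on receive completion, before passing the skb up */
	dma_unmap_single(&pdev->dev, mapping, KORINA_RBSIZE,
			 DMA_FROM_DEVICE);
	netif_receive_skb(skb);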

> 
> Roman Yeryomin (7):
>   net: korina: don't use overflow and underflow interrupts
>   net: korina: optimize rx descriptor flags processing
>   net: korina: use NAPI_POLL_WEIGHT
>   net: korina: use GRO
>   net: korina: whitespace cleanup
>   net: korina: update authors
>   net: korina: bump version
> 
>  drivers/net/ethernet/korina.c | 230 
> ++
>  1 file changed, 78 insertions(+), 152 deletions(-)
> 

-- 
Florian


Re: [PATCH net-next v2 7/7] net: korina: bump version

2017-09-17 Thread Florian Fainelli


On 09/17/2017 10:25 AM, Roman Yeryomin wrote:
> Signed-off-by: Roman Yeryomin 

You can probably drop the version because it does not really make much
sense for an in-kernel driver anyway.
-- 
Florian


Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c

2017-09-17 Thread Oleksandr Natalenko
Hi.

Just to note that it looks like disabling RACK and re-enabling FACK prevents
the warning from happening:

net.ipv4.tcp_fack = 1
net.ipv4.tcp_recovery = 0

Hope I get the semantics of these tunables right.

On Friday, 15 September 2017 21:04:36 CEST Oleksandr Natalenko wrote:
> Hello.
> 
> With net.ipv4.tcp_fack set to 0 the warning still appears:
> 
> ===
> » sysctl net.ipv4.tcp_fack
> net.ipv4.tcp_fack = 0
> 
> » LC_TIME=C dmesg -T | grep WARNING
> [Fri Sep 15 20:40:30 2017] WARNING: CPU: 1 PID: 711 at net/ipv4/tcp_input.c:
> 2826 tcp_fastretrans_alert+0x7c8/0x990
> [Fri Sep 15 20:40:30 2017] WARNING: CPU: 0 PID: 711 at net/ipv4/tcp_input.c:
> 2826 tcp_fastretrans_alert+0x7c8/0x990
> [Fri Sep 15 20:48:37 2017] WARNING: CPU: 1 PID: 711 at net/ipv4/tcp_input.c:
> 2826 tcp_fastretrans_alert+0x7c8/0x990
> [Fri Sep 15 20:48:55 2017] WARNING: CPU: 0 PID: 711 at net/ipv4/tcp_input.c:
> 2826 tcp_fastretrans_alert+0x7c8/0x990
> 
> » ps -up 711
> USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
> root   711  4.3  0.0  0 0 ?S18:12   7:23 [irq/123-enp3s0]
> ===
> 
> Any suggestions?
> 
> On Friday, 15 September 2017 16:03:00 CEST Neal Cardwell wrote:
> > Thanks for testing that. That is a very useful data point.
> > 
> > I was able to cook up a packetdrill test that could put the connection
> > in CA_Disorder with retransmitted packets out, but not in CA_Open. So
> > we do not yet have a test case to reproduce this.
> > 
> > We do not see this warning on our fleet at Google. One significant
> > difference I see between our environment and yours is that it seems
> > 
> > you run with FACK enabled:
> >   net.ipv4.tcp_fack = 1
> > 
> > Note that FACK was disabled by default (since it was replaced by RACK)
> > between kernel v4.10 and v4.11. And this is exactly the time when this
> > bug started manifesting itself for you and some others, but not our
> > fleet. So my new working hypothesis would be that this warning is due
> > to a behavior that only shows up in kernels >=4.11 when FACK is
> > enabled.
> > 
> > Would you be able to disable FACK ("sysctl net.ipv4.tcp_fack=0" at
> > boot, or net.ipv4.tcp_fack=0 in /etc/sysctl.conf, or equivalent),
> > reboot, and test the kernel for a few days to see if the warning still
> > pops up?
> > 
> > thanks,
> > neal
> > 
> > [ps: apologies for the previous, mis-formatted post...]




Re: [PATCH] hamradio: baycom: use new parport device model

2017-09-17 Thread Thomas Sailer

Acked-By: Thomas Sailer 

Am 17.09.2017 um 13:46 schrieb Sudip Mukherjee:

Modify baycom driver to use the new parallel port device model.

Signed-off-by: Sudip Mukherjee 
---

Not tested on real hardware, only tested on qemu and verified that the
device is binding to the driver properly in epp_open but then unbinding
as the device was not found.

  drivers/net/hamradio/baycom_epp.c | 50 +++
  1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/drivers/net/hamradio/baycom_epp.c 
b/drivers/net/hamradio/baycom_epp.c
index 1503f10..1e62d00 100644
--- a/drivers/net/hamradio/baycom_epp.c
+++ b/drivers/net/hamradio/baycom_epp.c
@@ -840,6 +840,7 @@ static int epp_open(struct net_device *dev)
unsigned char tmp[128];
unsigned char stat;
unsigned long tstart;
+   struct pardev_cb par_cb;

  if (!pp) {
  printk(KERN_ERR "%s: parport at 0x%lx unknown\n", bc_drvname, 
dev->base_addr);
@@ -859,8 +860,21 @@ static int epp_open(struct net_device *dev)
  return -EIO;
}
memset(&bc->modem, 0, sizeof(bc->modem));
-bc->pdev = parport_register_device(pp, dev->name, NULL, epp_wakeup,
-  NULL, PARPORT_DEV_EXCL, dev);
+   memset(&par_cb, 0, sizeof(par_cb));
+   par_cb.wakeup = epp_wakeup;
+   par_cb.private = (void *)dev;
+   par_cb.flags = PARPORT_DEV_EXCL;
+   for (i = 0; i < NR_PORTS; i++)
+   if (baycom_device[i] == dev)
+   break;
+
+   if (i == NR_PORTS) {
+   pr_err("%s: no device found\n", bc_drvname);
+   parport_put_port(pp);
+   return -ENODEV;
+   }
+
+   bc->pdev = parport_register_dev_model(pp, dev->name, &par_cb, i);
parport_put_port(pp);
  if (!bc->pdev) {
  printk(KERN_ERR "%s: cannot register parport at 0x%lx\n", 
bc_drvname, pp->base);
@@ -1185,6 +1199,23 @@ MODULE_LICENSE("GPL");
  
  /* - */
  
+static int baycom_epp_par_probe(struct pardevice *par_dev)

+{
+   struct device_driver *drv = par_dev->dev.driver;
+   int len = strlen(drv->name);
+
+   if (strncmp(par_dev->name, drv->name, len))
+   return -ENODEV;
+
+   return 0;
+}
+
+static struct parport_driver baycom_epp_par_driver = {
+   .name = "bce",
+   .probe = baycom_epp_par_probe,
+   .devmodel = true,
+};
+
  static void __init baycom_epp_dev_setup(struct net_device *dev)
  {
struct baycom_state *bc = netdev_priv(dev);
@@ -1204,10 +1235,15 @@ static void __init baycom_epp_dev_setup(struct 
net_device *dev)
  
  static int __init init_baycomepp(void)

  {
-   int i, found = 0;
+   int i, found = 0, ret;
char set_hw = 1;
  
  	printk(bc_drvinfo);

+
+   ret = parport_register_driver(&baycom_epp_par_driver);
+   if (ret)
+   return ret;
+
/*
 * register net devices
 */
@@ -1241,7 +1277,12 @@ static int __init init_baycomepp(void)
found++;
}
  
-	return found ? 0 : -ENXIO;

+   if (found == 0) {
+   parport_unregister_driver(&baycom_epp_par_driver);
+   return -ENXIO;
+   }
+
+   return 0;
  }
  
  static void __exit cleanup_baycomepp(void)

@@ -1260,6 +1301,7 @@ static void __exit cleanup_baycomepp(void)
printk(paranoia_str, "cleanup_module");
}
}
+   parport_unregister_driver(&baycom_epp_par_driver);
  }
  
  module_init(init_baycomepp);






[PATCH net-next v2 7/7] net: korina: bump version

2017-09-17 Thread Roman Yeryomin
Signed-off-by: Roman Yeryomin 
---
 drivers/net/ethernet/korina.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
index d58aa4bfcb58..7cecd9dbc111 100644
--- a/drivers/net/ethernet/korina.c
+++ b/drivers/net/ethernet/korina.c
@@ -66,8 +66,8 @@
 #include 
 
 #define DRV_NAME   "korina"
-#define DRV_VERSION"0.10"
-#define DRV_RELDATE"04Mar2008"
+#define DRV_VERSION"0.20"
+#define DRV_RELDATE"15Sep2017"
 
 #define STATION_ADDRESS_HIGH(dev) (((dev)->dev_addr[0] << 8) | \
   ((dev)->dev_addr[1]))
-- 
2.11.0



[PATCH net-next v2 4/7] net: korina: use GRO

2017-09-17 Thread Roman Yeryomin
Performance gain when receiving locally is 55->95Mbps and 50->65Mbps for NAT.

Signed-off-by: Roman Yeryomin 
---
 drivers/net/ethernet/korina.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
index c210add9b654..5f36e1703378 100644
--- a/drivers/net/ethernet/korina.c
+++ b/drivers/net/ethernet/korina.c
@@ -406,7 +406,7 @@ static int korina_rx(struct net_device *dev, int limit)
skb->protocol = eth_type_trans(skb, dev);
 
/* Pass the packet to upper layers */
-   netif_receive_skb(skb);
+   napi_gro_receive(>napi, skb);
dev->stats.rx_packets++;
dev->stats.rx_bytes += pkt_len;
 
-- 
2.11.0



[PATCH net-next v2 5/7] net: korina: whitespace cleanup

2017-09-17 Thread Roman Yeryomin
Signed-off-by: Roman Yeryomin 
---
 drivers/net/ethernet/korina.c | 58 +++
 1 file changed, 31 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
index 5f36e1703378..c26f0d84ba6b 100644
--- a/drivers/net/ethernet/korina.c
+++ b/drivers/net/ethernet/korina.c
@@ -64,9 +64,9 @@
 #include 
 #include 
 
-#define DRV_NAME"korina"
-#define DRV_VERSION "0.10"
-#define DRV_RELDATE "04Mar2008"
+#define DRV_NAME   "korina"
+#define DRV_VERSION"0.10"
+#define DRV_RELDATE"04Mar2008"
 
 #define STATION_ADDRESS_HIGH(dev) (((dev)->dev_addr[0] << 8) | \
   ((dev)->dev_addr[1]))
@@ -75,7 +75,7 @@
   ((dev)->dev_addr[4] << 8)  | \
   ((dev)->dev_addr[5]))
 
-#define MII_CLOCK 125  /* no more than 2.5MHz */
+#define MII_CLOCK  125 /* no more than 2.5MHz */
 
 /* the following must be powers of two */
 #define KORINA_NUM_RDS 64  /* number of receive descriptors */
@@ -87,15 +87,19 @@
 #define KORINA_RBSIZE  1536 /* size of one resource buffer = Ether MTU */
 #define KORINA_RDS_MASK(KORINA_NUM_RDS - 1)
 #define KORINA_TDS_MASK(KORINA_NUM_TDS - 1)
-#define RD_RING_SIZE   (KORINA_NUM_RDS * sizeof(struct dma_desc))
+#define RD_RING_SIZE   (KORINA_NUM_RDS * sizeof(struct dma_desc))
 #define TD_RING_SIZE   (KORINA_NUM_TDS * sizeof(struct dma_desc))
 
-#define TX_TIMEOUT (6000 * HZ / 1000)
+#define TX_TIMEOUT (6000 * HZ / 1000)
 
-enum chain_status { desc_filled, desc_empty };
-#define IS_DMA_FINISHED(X)   (((X) & (DMA_DESC_FINI)) != 0)
-#define IS_DMA_DONE(X)   (((X) & (DMA_DESC_DONE)) != 0)
-#define RCVPKT_LENGTH(X) (((X) & ETH_RX_LEN) >> ETH_RX_LEN_BIT)
+enum chain_status {
+   desc_filled,
+   desc_empty
+};
+
+#define IS_DMA_FINISHED(X) (((X) & (DMA_DESC_FINI)) != 0)
+#define IS_DMA_DONE(X) (((X) & (DMA_DESC_DONE)) != 0)
+#define RCVPKT_LENGTH(X)   (((X) & ETH_RX_LEN) >> ETH_RX_LEN_BIT)
 
 /* Information that need to be kept for each board. */
 struct korina_private {
@@ -123,7 +127,7 @@ struct korina_private {
int rx_irq;
int tx_irq;
 
-   spinlock_t lock;/* NIC xmit lock */
+   spinlock_t lock;/* NIC xmit lock */
 
int dma_halt_cnt;
int dma_run_cnt;
@@ -146,17 +150,17 @@ static inline void korina_start_dma(struct dma_reg *ch, 
u32 dma_addr)
 static inline void korina_abort_dma(struct net_device *dev,
struct dma_reg *ch)
 {
-   if (readl(&ch->dmac) & DMA_CHAN_RUN_BIT) {
-  writel(0x10, &ch->dmac);
+   if (readl(&ch->dmac) & DMA_CHAN_RUN_BIT) {
+   writel(0x10, &ch->dmac);
 
-  while (!(readl(&ch->dmas) & DMA_STAT_HALT))
-  netif_trans_update(dev);
+   while (!(readl(&ch->dmas) & DMA_STAT_HALT))
+   netif_trans_update(dev);
 
-  writel(0, &ch->dmas);
-   }
+   writel(0, &ch->dmas);
+   }
 
-   writel(0, &ch->dmadptr);
-   writel(0, &ch->dmandptr);
+   writel(0, &ch->dmadptr);
+   writel(0, &ch->dmandptr);
 }
 
 static inline void korina_chain_dma(struct dma_reg *ch, u32 dma_addr)
@@ -685,7 +689,7 @@ static int korina_ioctl(struct net_device *dev, struct 
ifreq *rq, int cmd)
 
 /* ethtool helpers */
 static void netdev_get_drvinfo(struct net_device *dev,
-   struct ethtool_drvinfo *info)
+   struct ethtool_drvinfo *info)
 {
struct korina_private *lp = netdev_priv(dev);
 
@@ -728,10 +732,10 @@ static u32 netdev_get_link(struct net_device *dev)
 }
 
 static const struct ethtool_ops netdev_ethtool_ops = {
-   .get_drvinfo= netdev_get_drvinfo,
-   .get_link   = netdev_get_link,
-   .get_link_ksettings = netdev_get_link_ksettings,
-   .set_link_ksettings = netdev_set_link_ksettings,
+   .get_drvinfo= netdev_get_drvinfo,
+   .get_link   = netdev_get_link,
+   .get_link_ksettings = netdev_get_link_ksettings,
+   .set_link_ksettings = netdev_set_link_ksettings,
 };
 
 static int korina_alloc_ring(struct net_device *dev)
@@ -863,7 +867,7 @@ static int korina_init(struct net_device *dev)
/* Management Clock Prescaler Divisor
 * Clock independent setting */
writel(((idt_cpu_freq) / MII_CLOCK + 1) & ~1,
-  &lp->eth_regs->ethmcp);
+   &lp->eth_regs->ethmcp);
 
/* don't transmit until fifo contains 48b */
writel(48, &lp->eth_regs->ethfifott);
@@ -946,14 +950,14 @@ static int korina_open(struct net_device *dev)
0, "Korina ethernet Rx", dev);
if (ret < 0) {
printk(KERN_ERR "%s: unable to get Rx DMA IRQ %d\n",
-   dev->name, lp->rx_irq);
+   dev->name, 

[PATCH net-next v2 6/7] net: korina: update authors

2017-09-17 Thread Roman Yeryomin
Signed-off-by: Roman Yeryomin 
---
 drivers/net/ethernet/korina.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
index c26f0d84ba6b..d58aa4bfcb58 100644
--- a/drivers/net/ethernet/korina.c
+++ b/drivers/net/ethernet/korina.c
@@ -4,6 +4,7 @@
  *  Copyright 2004 IDT Inc. (risch...@idt.com)
  *  Copyright 2006 Felix Fietkau 
  *  Copyright 2008 Florian Fainelli 
+ *  Copyright 2017 Roman Yeryomin 
  *
  *  This program is free software; you can redistribute  it and/or modify it
  *  under  the terms of  the GNU General  Public License as published by the
@@ -1150,5 +1151,6 @@ module_platform_driver(korina_driver);
 MODULE_AUTHOR("Philip Rischel ");
 MODULE_AUTHOR("Felix Fietkau ");
 MODULE_AUTHOR("Florian Fainelli ");
+MODULE_AUTHOR("Roman Yeryomin ");
 MODULE_DESCRIPTION("IDT RC32434 (Korina) Ethernet driver");
 MODULE_LICENSE("GPL");
-- 
2.11.0



[PATCH net-next v2 3/7] net: korina: use NAPI_POLL_WEIGHT

2017-09-17 Thread Roman Yeryomin
Signed-off-by: Roman Yeryomin 
---
 drivers/net/ethernet/korina.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
index e5466e19994a..c210add9b654 100644
--- a/drivers/net/ethernet/korina.c
+++ b/drivers/net/ethernet/korina.c
@@ -1082,7 +1082,7 @@ static int korina_probe(struct platform_device *pdev)
dev->netdev_ops = &korina_netdev_ops;
dev->ethtool_ops = &netdev_ethtool_ops;
dev->watchdog_timeo = TX_TIMEOUT;
-   netif_napi_add(dev, &lp->napi, korina_poll, 64);
+   netif_napi_add(dev, &lp->napi, korina_poll, NAPI_POLL_WEIGHT);
 
lp->phy_addr = (((lp->rx_irq == 0x2c? 1:0) << 8) | 0x05);
lp->mii_if.dev = dev;
-- 
2.11.0



[PATCH net-next v2 2/7] net: korina: optimize rx descriptor flags processing

2017-09-17 Thread Roman Yeryomin
Signed-off-by: Roman Yeryomin 
---
 drivers/net/ethernet/korina.c | 87 ++-
 1 file changed, 44 insertions(+), 43 deletions(-)

diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
index 98d686ed69a9..e5466e19994a 100644
--- a/drivers/net/ethernet/korina.c
+++ b/drivers/net/ethernet/korina.c
@@ -363,59 +363,60 @@ static int korina_rx(struct net_device *dev, int limit)
if ((KORINA_RBSIZE - (u32)DMA_COUNT(rd->control)) == 0)
break;
 
-   /* Update statistics counters */
-   if (devcs & ETH_RX_CRC)
-   dev->stats.rx_crc_errors++;
-   if (devcs & ETH_RX_LOR)
-   dev->stats.rx_length_errors++;
-   if (devcs & ETH_RX_LE)
-   dev->stats.rx_length_errors++;
-   if (devcs & ETH_RX_OVR)
-   dev->stats.rx_fifo_errors++;
-   if (devcs & ETH_RX_CV)
-   dev->stats.rx_frame_errors++;
-   if (devcs & ETH_RX_CES)
-   dev->stats.rx_length_errors++;
-   if (devcs & ETH_RX_MP)
-   dev->stats.multicast++;
-
-   if ((devcs & ETH_RX_LD) != ETH_RX_LD) {
-   /* check that this is a whole packet
-* WARNING: DMA_FD bit incorrectly set
-* in Rc32434 (errata ref #077) */
+   /* check that this is a whole packet
+* WARNING: DMA_FD bit incorrectly set
+* in Rc32434 (errata ref #077) */
+   if (!(devcs & ETH_RX_LD))
+   goto next;
+
+   if (!(devcs & ETH_RX_ROK)) {
+   /* Update statistics counters */
dev->stats.rx_errors++;
dev->stats.rx_dropped++;
-   } else if ((devcs & ETH_RX_ROK)) {
-   pkt_len = RCVPKT_LENGTH(devcs);
+   if (devcs & ETH_RX_CRC)
+   dev->stats.rx_crc_errors++;
+   if (devcs & ETH_RX_LE)
+   dev->stats.rx_length_errors++;
+   if (devcs & ETH_RX_OVR)
+   dev->stats.rx_fifo_errors++;
+   if (devcs & ETH_RX_CV)
+   dev->stats.rx_frame_errors++;
+   if (devcs & ETH_RX_CES)
+   dev->stats.rx_frame_errors++;
+
+   goto next;
+   }
 
-   /* must be the (first and) last
-* descriptor then */
-   pkt_buf = (u8 *)lp->rx_skb[lp->rx_next_done]->data;
+   pkt_len = RCVPKT_LENGTH(devcs);
 
-   /* invalidate the cache */
-   dma_cache_inv((unsigned long)pkt_buf, pkt_len - 4);
+   /* must be the (first and) last
+* descriptor then */
+   pkt_buf = (u8 *)lp->rx_skb[lp->rx_next_done]->data;
 
-   /* Malloc up new buffer. */
-   skb_new = netdev_alloc_skb_ip_align(dev, KORINA_RBSIZE);
+   /* invalidate the cache */
+   dma_cache_inv((unsigned long)pkt_buf, pkt_len - 4);
 
-   if (!skb_new)
-   break;
-   /* Do not count the CRC */
-   skb_put(skb, pkt_len - 4);
-   skb->protocol = eth_type_trans(skb, dev);
+   /* Malloc up new buffer. */
+   skb_new = netdev_alloc_skb_ip_align(dev, KORINA_RBSIZE);
 
-   /* Pass the packet to upper layers */
-   netif_receive_skb(skb);
-   dev->stats.rx_packets++;
-   dev->stats.rx_bytes += pkt_len;
+   if (!skb_new)
+   break;
+   /* Do not count the CRC */
+   skb_put(skb, pkt_len - 4);
+   skb->protocol = eth_type_trans(skb, dev);
 
-   /* Update the mcast stats */
-   if (devcs & ETH_RX_MP)
-   dev->stats.multicast++;
+   /* Pass the packet to upper layers */
+   netif_receive_skb(skb);
+   dev->stats.rx_packets++;
+   dev->stats.rx_bytes += pkt_len;
 
-   lp->rx_skb[lp->rx_next_done] = skb_new;
-   }
+   /* Update the mcast stats */
+   if (devcs & ETH_RX_MP)
+   dev->stats.multicast++;
+
+   lp->rx_skb[lp->rx_next_done] = skb_new;
 
+next:
rd->devcs = 0;
 
/* Restore descriptor's curr_addr */
-- 
2.11.0



[PATCH net-next v2 1/7] net: korina: don't use overflow and underflow interrupts

2017-09-17 Thread Roman Yeryomin
When such interrupts occur there is not much we can do.
Dropping the whole ring doesn't help and only produces high packet loss.
If we just ignore the interrupt the MAC will drop one or a few packets instead of
the whole ring.
Also this will lower the irq handling load and increase performance.

Signed-off-by: Roman Yeryomin 
---
 drivers/net/ethernet/korina.c | 83 +--
 1 file changed, 1 insertion(+), 82 deletions(-)

diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
index 3c0a6451273d..98d686ed69a9 100644
--- a/drivers/net/ethernet/korina.c
+++ b/drivers/net/ethernet/korina.c
@@ -122,8 +122,6 @@ struct korina_private {
 
int rx_irq;
int tx_irq;
-   int ovr_irq;
-   int und_irq;
 
spinlock_t lock;/* NIC xmit lock */
 
@@ -891,8 +889,6 @@ static void korina_restart_task(struct work_struct *work)
 */
disable_irq(lp->rx_irq);
disable_irq(lp->tx_irq);
-   disable_irq(lp->ovr_irq);
-   disable_irq(lp->und_irq);
 
writel(readl(&lp->tx_dma_regs->dmasm) |
DMA_STAT_FINI | DMA_STAT_ERR,
@@ -911,40 +907,10 @@ static void korina_restart_task(struct work_struct *work)
}
korina_multicast_list(dev);
 
-   enable_irq(lp->und_irq);
-   enable_irq(lp->ovr_irq);
enable_irq(lp->tx_irq);
enable_irq(lp->rx_irq);
 }
 
-static void korina_clear_and_restart(struct net_device *dev, u32 value)
-{
-   struct korina_private *lp = netdev_priv(dev);
-
-   netif_stop_queue(dev);
-   writel(value, &lp->eth_regs->ethintfc);
-   schedule_work(&lp->restart_task);
-}
-
-/* Ethernet Tx Underflow interrupt */
-static irqreturn_t korina_und_interrupt(int irq, void *dev_id)
-{
-   struct net_device *dev = dev_id;
-   struct korina_private *lp = netdev_priv(dev);
-   unsigned int und;
-
-   spin_lock(&lp->lock);
-
-   und = readl(&lp->eth_regs->ethintfc);
-
-   if (und & ETH_INT_FC_UND)
-   korina_clear_and_restart(dev, und & ~ETH_INT_FC_UND);
-
-   spin_unlock(&lp->lock);
-
-   return IRQ_HANDLED;
-}
-
 static void korina_tx_timeout(struct net_device *dev)
 {
struct korina_private *lp = netdev_priv(dev);
@@ -952,25 +918,6 @@ static void korina_tx_timeout(struct net_device *dev)
schedule_work(&lp->restart_task);
 }
 
-/* Ethernet Rx Overflow interrupt */
-static irqreturn_t
-korina_ovr_interrupt(int irq, void *dev_id)
-{
-   struct net_device *dev = dev_id;
-   struct korina_private *lp = netdev_priv(dev);
-   unsigned int ovr;
-
-   spin_lock(&lp->lock);
-   ovr = readl(&lp->eth_regs->ethintfc);
-
-   if (ovr & ETH_INT_FC_OVR)
-   korina_clear_and_restart(dev, ovr & ~ETH_INT_FC_OVR);
-
-   spin_unlock(&lp->lock);
-
-   return IRQ_HANDLED;
-}
-
 #ifdef CONFIG_NET_POLL_CONTROLLER
 static void korina_poll_controller(struct net_device *dev)
 {
@@ -993,8 +940,7 @@ static int korina_open(struct net_device *dev)
}
 
/* Install the interrupt handler
-* that handles the Done Finished
-* Ovr and Und Events */
+* that handles the Done Finished */
ret = request_irq(lp->rx_irq, korina_rx_dma_interrupt,
0, "Korina ethernet Rx", dev);
if (ret < 0) {
@@ -1010,31 +956,10 @@ static int korina_open(struct net_device *dev)
goto err_free_rx_irq;
}
 
-   /* Install handler for overrun error. */
-   ret = request_irq(lp->ovr_irq, korina_ovr_interrupt,
-   0, "Ethernet Overflow", dev);
-   if (ret < 0) {
-   printk(KERN_ERR "%s: unable to get OVR IRQ %d\n",
-   dev->name, lp->ovr_irq);
-   goto err_free_tx_irq;
-   }
-
-   /* Install handler for underflow error. */
-   ret = request_irq(lp->und_irq, korina_und_interrupt,
-   0, "Ethernet Underflow", dev);
-   if (ret < 0) {
-   printk(KERN_ERR "%s: unable to get UND IRQ %d\n",
-   dev->name, lp->und_irq);
-   goto err_free_ovr_irq;
-   }
mod_timer(&lp->media_check_timer, jiffies + 1);
 out:
return ret;
 
-err_free_ovr_irq:
-   free_irq(lp->ovr_irq, dev);
-err_free_tx_irq:
-   free_irq(lp->tx_irq, dev);
 err_free_rx_irq:
free_irq(lp->rx_irq, dev);
 err_release:
@@ -1052,8 +977,6 @@ static int korina_close(struct net_device *dev)
/* Disable interrupts */
disable_irq(lp->rx_irq);
disable_irq(lp->tx_irq);
-   disable_irq(lp->ovr_irq);
-   disable_irq(lp->und_irq);
 
korina_abort_tx(dev);
tmp = readl(&lp->tx_dma_regs->dmasm);
@@ -1073,8 +996,6 @@ static int korina_close(struct net_device *dev)
 
free_irq(lp->rx_irq, dev);
free_irq(lp->tx_irq, dev);
-   free_irq(lp->ovr_irq, dev);
-   free_irq(lp->und_irq, dev);
 
return 0;
 }
@@ -1113,8 +1034,6 @@ static int 

[PATCH net-next v2 0/7] korina: performance fixes and cleanup

2017-09-17 Thread Roman Yeryomin
Changes from v1:
- use GRO instead of increasing ring size
- use NAPI_POLL_WEIGHT instead of defining own NAPI_WEIGHT
- optimize rx descriptor flags processing

Roman Yeryomin (7):
  net: korina: don't use overflow and underflow interrupts
  net: korina: optimize rx descriptor flags processing
  net: korina: use NAPI_POLL_WEIGHT
  net: korina: use GRO
  net: korina: whitespace cleanup
  net: korina: update authors
  net: korina: bump version

 drivers/net/ethernet/korina.c | 230 ++
 1 file changed, 78 insertions(+), 152 deletions(-)

-- 
2.11.0



Dear Talented

2017-09-17 Thread Kim Sharma
Dear Talented,

I am Talent Scout For BLUE SKY FILM STUDIO, Present Blue sky Studio a
Film Corporation Located in the United State, is Soliciting for the
Right to use Your Photo/Face and Personality as One of the Semi -Major
Role/ Character in our Upcoming ANIMATED Stereoscope 3D Movie-The Story
of Anubis (Anubis 2018) The Movie is Currently Filming (In
Production) Please Note That There Will Be No Auditions, Traveling or
Any Special / Professional Acting Skills, Since the Production of This
Movie Will Be Done with our State of Art Computer -Generating Imagery
Equipment. We Are Prepared to Pay the Total Sum of $620,000.00 USD. For
More Information/Understanding, Please Write us on the E-Mail Below.
CONTACT EMAIL: blueskyanimatedstu...@usa.com
All Reply to: blueskyanimatedstu...@usa.com
Note: Only the Response send to this mail will be Given a Prior
Consideration.


Talent Scout
Kim Sharma


RE: [PATCH V2] tipc: Use bsearch library function

2017-09-17 Thread Jon Maloy


> -Original Message-
> From: Thomas Meyer [mailto:tho...@m3y3r.de]
> Sent: Sunday, September 17, 2017 11:00
> To: Jon Maloy 
> Cc: Joe Perches ; Ying Xue ;
> netdev@vger.kernel.org; tipc-discuss...@lists.sourceforge.net; linux-
> ker...@vger.kernel.org; da...@davemloft.net
> Subject: Re: [PATCH V2] tipc: Use bsearch library function
> 
> 
> > On 16.09.2017 at 15:20, Jon Maloy wrote:
> >>
> >> What part of "very time critical" have you verified and benchmarked as
> >> inconsequential?
> >>
> >> Please post your results.
> >
> > I agree with Joe here. This change does not simplify anything, it does
> > not reduce the amount of code, plus it introduces an unnecessary
> > outline call in a place where we have every reason to let the compiler
> > do its optimization job properly.
> 
> Hi,
> 
> Okay, should I prepare some performance numbers or do we NAK this
> change?
> What about the other binary search implementation in the same file? Should
> I try to convert it or will it get NAKed for performance reasons too?

The searches for inserting and removing publications are less time critical,
so that would be ok with me.
If you have any more general interest in improving the code in this file
(which is needed) it would also be appreciated.

BR
///jon


> 
> With kind regards
> Thomas


smime.p7s
Description: S/MIME cryptographic signature


Re: Page allocator bottleneck

2017-09-17 Thread Tariq Toukan



On 15/09/2017 10:28 AM, Jesper Dangaard Brouer wrote:

On Thu, 14 Sep 2017 19:49:31 +0300
Tariq Toukan  wrote:


Hi all,

As part of the efforts to support increasing next-generation NIC speeds,
I am investigating SW bottlenecks in network stack receive flow.

Here I share some numbers I got for a simple experiment, in which I
simulate the page allocation rate needed in 200Gbps NICs.


Thanks for bringing this up again.


Sure. We need to keep up with the increasing NIC speeds.




I ran the test below over 3 different (modified) mlx5 driver versions,
loaded on server side (RX):
1) RX page cache disabled, 2 packets per page.


2 packets per page basically cuts the overhead you see from the page
allocator in half.


2) RX page cache disabled, one packet per page.


This should stress the page allocator.


3) Huge RX page cache, one packet per page.


A driver level page-cache will look nice, as long as it "works".


I verified that it worked in the experiment.



Drivers usually have no other option than to base their recycle facility
on the page-refcnt (as there is no destructor callback).
Which implies packets/pages need to be returned quickly enough for it
to work.


Yes, that's how our current default (small) RX page-cache is 
implemented. Unfortunately, the timing and terms for a fair reuse rate 
are not always satisfied.
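
A condensed sketch of such a page-refcnt recycle check (structure and
field names invented for illustration): the driver may only reuse a page
for RX once it holds the last reference, i.e. the stack has released all
of its own.

struct rx_page_cache {		/* invented for illustration */
	struct page **pages;
	unsigned int head;
	unsigned int mask;	/* size - 1, size is a power of two */
};

/* try to pop a reusable page from the driver's RX cache */
static bool rx_cache_get(struct rx_page_cache *cache, struct page **pagep)
{
	struct page *page = cache->pages[cache->head];

	/* page_ref_count() > 1 means the stack still holds references */
	if (!page || page_ref_count(page) != 1)
		return false;

	cache->pages[cache->head] = NULL;
	cache->head = (cache->head + 1) & cache->mask;
	*pagep = page;
	return true;
}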





All page allocations are of order 0.

NIC: Connectx-5 100 Gbps.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

Test:
128 TCP streams (using super_netperf).
Changing num of RX queues.
HW LRO OFF, GRO ON, MTU 1500.


With TCP streams and GRO, this is actually a good stress test for the page
allocator (or the driver's page-recycle cache). Eric Dumazet has made
some nice optimizations that (in most situations) cause us to quickly
free/recycle the SKB (coming from the driver) and store the pages in 1 SKB.
This causes us to hit the SLUB fastpath for the SKBs, but once the pages
need to be freed this stresses the page allocator more.


Yep, bulking would help here, as you mention below.



Also be aware that with TCP flows, the packets are likely delivered
into a socket that is consumed on another CPU.  Thus, the pages are
allocated on one CPU and freed on another. AFAIK this stresses the
order-0 cache PCP (Per-Cpu-Pages).


Good point.
Do you know of any tool/kernel counters that help observe and quantify 
this behavior?
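
(One way to observe it: record the kmem:mm_page_alloc and
kmem:mm_page_free tracepoints system-wide with perf; both report the
page/pfn, so matching them up shows pages allocated on one CPU being
freed on another.)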





Observe: BW as a function of num RX queues.

Results:

Driver #1:
#rings  BW (Mbps)
1   23,813
2   44,086
3   62,128
4   78,058
6   94,210 (linerate)
8   94,205 (linerate)
12  94,202 (linerate)
16  94,191 (linerate)

Driver #2:
#rings  BW (Mbps)
1   18,835
2   36,716
3   50,521
4   61,746
6   63,637
8   60,299
12  51,048
16  43,337

Driver #3:
#rings  BW (Mbps)
1   19,316
2   44,850
3   69,549
4   87,434
6   94,342 (linerate)
8   94,350 (linerate)
12  94,327 (linerate)
16  94,327 (linerate)


Insights:
Major degradation between #1 and #2, not getting anywhere close to linerate!
Degradation is fixed between #2 and #3.
This is because the page allocator cannot sustain the higher allocation rate.
In #2, we also see that the addition of rings (cores) reduces BW (!!),
as a result of increasing congestion over shared resources.

Congestion in this case is very clear.
When monitored in perf top:
85.58% [kernel] [k] queued_spin_lock_slowpath


Well, we obviously need to know the caller of the spin_lock.  In this
case it is likely the page allocator lock.  It could also be the TCP
socket locks, but given GRO is enabled, they should be hit much less.



It is the page allocator lock.
I verified this based on Andi's suggestion, see other mail.

It's nice to have the option to dynamically play with the parameter.
But maybe we should also think of changing the default fraction 
guaranteed to the PCP, so that unaware admins of networking servers 
would also benefit.





I think that page allocator issues should be discussed separately:
1) Rate: Increase the allocation rate on a single core.
2) Scalability: Reduce congestion and sync overhead between cores.


Yes, but this is no small task.  It is on my TODO-list (emacs org-mode),
but I have other tasks that have higher priority atm.  I'll be working
on XDP_REDIRECT for the next many months.  Currently trying to convince
people that we do an explicit packet-page return/free callback (which
would avoid many of these issues).



This is clearly the current bottleneck in the network stack receive
flow.

I know about some efforts that were made in the past two years.
For example the ones from Jesper et al.:

- Page-pool (not accepted AFAIK).


The page-pool has many purposes.
  1. generic page-cache for drivers,
  2. keep pages DMA-mapped
  3. facilitate drivers to change RX-ring memory model

 From an MM point of view, the page pool is just a destructor callback
that can "steal" the page.

If I can convince XDP_REDIRECT 

Re: Page allocator bottleneck

2017-09-17 Thread Tariq Toukan



On 14/09/2017 11:19 PM, Andi Kleen wrote:

Tariq Toukan  writes:


Congestion in this case is very clear.
When monitored in perf top:
85.58% [kernel] [k] queued_spin_lock_slowpath


Please look at the callers. Spinlock profiles without callers
are usually useless because it's just blaming the messenger.

Most likely the PCP lists are too small for your extreme allocation
rate, so it goes back too often to the shared pool.

You can play with the vm.percpu_pagelist_fraction setting.


Thanks Andi.
That was my initial guess, but I wasn't familiar enough with these VM 
tunables to verify that.
Indeed, the bottleneck is relieved when increasing the PCP size, and BW 
becomes significantly better.
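
(For reference: per Documentation/sysctl/vm.txt, percpu_pagelist_fraction
is the maximum fraction of each zone's pages that may sit in each per-cpu
page list; its documented minimum, 8, allows the largest lists, up to
1/8th of the zone per list.)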




-Andi



Re: [PATCH V2] tipc: Use bsearch library function

2017-09-17 Thread Thomas Meyer

> On 16.09.2017 at 15:20, Jon Maloy wrote:
>> 
>> What part of "very time critical" have you verified and benchmarked as
>> inconsequential?
>> 
>> Please post your results.
> 
> I agree with Joe here. This change does not simplify anything, it does not 
> reduce the amount of code, plus it introduces an unnecessary outline call 
> in a place where we have every reason to let the compiler do its optimization 
> job properly.

Hi,

Okay, should I prepare some performance numbers or do we NAK this change?
What about the other binary search implementation in the same file? Should I 
try to convert it or will it get NAKed for performance reasons too?

With kind regards
Thomas

smime.p7s
Description: S/MIME cryptographic signature


Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

2017-09-17 Thread Cole Robinson
On 09/15/2017 01:51 PM, Josef Bacik wrote:
> Finally got access to a box to run this down myself.  This patch on top of 
> the other patches fixes the problem for me, could you verify it works for 
> you?  Thanks,
> 

Yup I can confirm that patch fixes things when applied on top of the
previous 3 patches. Thanks! Please tag those patches for stable releases
if appropriate; this is affecting a decent number of libvirt users.

Thanks,
Cole



[pktgen script v2 1/2] Add some helper functions

2017-09-17 Thread Robert Hoo
From: Robert Hoo 

1. given a device, get the NUMA node it belongs to.
2. given a device, get its queues' irq numbers.
3. given a NUMA node, get its cpu id list.

Signed-off-by: Robert Hoo 
---
 pktgen/functions.sh | 44 
 1 file changed, 44 insertions(+)

diff --git a/pktgen/functions.sh b/pktgen/functions.sh
index 205e4cd..09dfe7a 100644
--- a/pktgen/functions.sh
+++ b/pktgen/functions.sh
@@ -119,3 +119,47 @@ function root_check_run_with_sudo() {
err 4 "cannot perform sudo run of $0"
 fi
 }
+
+# Extract the input device's NUMA node info
+function get_iface_node()
+{
+local node=$(

[pktgen script v2 2/2] Add pktgen script: pktgen_sample06_numa_awared_queue_irq_affinity.sh

2017-09-17 Thread Robert Hoo
From: Robert Hoo 

This script simply does:
Detect which NUMA node $DEV belongs to.
Bind each thread (a processor local to that NUMA node) to each $DEV
queue's irq affinity, 1:1 mapping.
The number of threads given with '-t' determines how many queues will be
utilized.
If '-f' designates the first cpu id, offset into the NUMA node's cpu
list.

Signed-off-by: Robert Hoo 
---
 ...tgen_sample06_numa_awared_queue_irq_affinity.sh | 97 ++
 1 file changed, 97 insertions(+)
 create mode 100755 pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh

diff --git a/pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh 
b/pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh
new file mode 100755
index 000..52da0f4
--- /dev/null
+++ b/pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh
@@ -0,0 +1,97 @@
+#!/bin/bash
+#
+# Multiqueue: Using pktgen threads for sending on multiple CPUs
+#  * adding devices to kernel threads which are in the same NUMA node
+#  * bound devices queue's irq affinity to the threads, 1:1 mapping
+#  * notice the naming scheme for keeping device names unique
+#  * naming scheme: dev@thread_number
+#  * flow variation via random UDP source port
+#
+basedir=`dirname $0`
+source ${basedir}/functions.sh
+root_check_run_with_sudo "$@"
+#
+# Required param: -i dev in $DEV
+source ${basedir}/parameters.sh
+
+# Base Config
+DELAY="0"# Zero means max speed
+COUNT="2000"   # Zero means indefinitely
+[ -z "$CLONE_SKB" ] && CLONE_SKB="0"
+
+# Flow variation random source port between min and max
+UDP_MIN=9
+UDP_MAX=109
+
+node=`get_iface_node $DEV`
+irq_array=(`get_iface_irqs $DEV`)
+cpu_array=(`get_node_cpus $node`)
+
+[ $THREADS -gt ${#irq_array[*]} -o $THREADS -gt ${#cpu_array[*]}  ] && \
+   err 1 "Thread number $THREADS exceeds: min 
(${#irq_array[*]},${#cpu_array[*]})"
+
+# (example of setting default params in your script)
+if [ -z "$DEST_IP" ]; then
+[ -z "$IP6" ] && DEST_IP="198.18.0.42" || DEST_IP="FD00::1"
+fi
+[ -z "$DST_MAC" ] && DST_MAC="90:e2:ba:ff:ff:ff"
+
+# General cleanup everything since last run
+pg_ctrl "reset"
+
+# Threads are specified with parameter -t value in $THREADS
+for ((i = 0; i < $THREADS; i++)); do
+# The device name is extended with @name, using thread number to
+# make then unique, but any name will do.
+# Set the queue's irq affinity to this $thread (processor)
+# if '-f' is designated, offset cpu id
+thread=${cpu_array[$((i+F_THREAD))]}
+dev=${DEV}@${thread}
+echo $thread > /proc/irq/${irq_array[$i]}/smp_affinity_list
+info "irq ${irq_array[$i]} is set affinity to `cat 
/proc/irq/${irq_array[$i]}/smp_affinity_list`"
+
+# Add remove all other devices and add_device $dev to thread
+pg_thread $thread "rem_device_all"
+pg_thread $thread "add_device" $dev
+
+# select queue and bind the queue and $dev in 1:1 relationship
+queue_num=$i
+info "queue number is $queue_num"
+pg_set $dev "queue_map_min $queue_num"
+pg_set $dev "queue_map_max $queue_num"
+
+# Notice config queue to map to cpu (mirrors smp_processor_id())
+# It is beneficial to map IRQ /proc/irq/*/smp_affinity 1:1 to CPU number
+pg_set $dev "flag QUEUE_MAP_CPU"
+
+# Base config of dev
+pg_set $dev "count $COUNT"
+pg_set $dev "clone_skb $CLONE_SKB"
+pg_set $dev "pkt_size $PKT_SIZE"
+pg_set $dev "delay $DELAY"
+
+# Flag example disabling timestamping
+pg_set $dev "flag NO_TIMESTAMP"
+
+# Destination
+pg_set $dev "dst_mac $DST_MAC"
+pg_set $dev "dst$IP6 $DEST_IP"
+
+# Setup random UDP port src range
+pg_set $dev "flag UDPSRC_RND"
+pg_set $dev "udp_src_min $UDP_MIN"
+pg_set $dev "udp_src_max $UDP_MAX"
+done
+
+# start_run
+echo "Running... ctrl^C to stop" >&2
+pg_ctrl "start"
+echo "Done" >&2
+
+# Print results
+for ((i = 0; i < $THREADS; i++)); do
+thread=${cpu_array[$((i+F_THREAD))]}
+dev=${DEV}@${thread}
+echo "Device: $dev"
+cat /proc/net/pktgen/$dev | grep -A2 "Result:"
+done
-- 
1.8.3.1



[pktgen script v2 0/2] Add a pktgen sample script of NUMA awareness

2017-09-17 Thread Robert Hoo
From: Robert Hoo 

It's hard to benchmark 40G+ network bandwidth using ordinary
tools like iperf or netperf (see reference 1).
Pktgen, the packet generator in kernel space, is a candidate.
I derived this NUMA-aware irq-affinity sample script from the
multi-queue sample02 and successfully benchmarked a 40G link. I think it can
also serve as a 100G reference, though I haven't got a device to test yet.

This script simply does:
Detect which NUMA node $DEV belongs to.
Bind each thread (a processor local to that NUMA node) to each $DEV
queue's irq affinity, 1:1 mapping.
The number of threads given with '-t' determines how many queues will be
utilized.
If '-f' designates the first cpu id, offset into the NUMA node's cpu
list.

Tested with Intel XL710 NIC with Cisco 3172 switch.

Referrences:
https://people.netfilter.org/hawk/presentations/LCA2015/net_stack_challenges_100G_LCA2015.pdf
http://www.intel.cn/content/dam/www/public/us/en/documents/reference-guides/xl710-x710-performance-tuning-linux-guide.pdf

Change log
v2:
Rebased to 
https://github.com/netoptimizer/network-testing/tree/master/pktgen
Move helper functions to functions.sh
More concise shell grammar usage
Take the '-f' parameter into consideration. If the first CPU is designated,
offset into the NUMA-aware CPU list.
Use err(), info() helper functions for such outputs.

Robert Hoo (2):
  Add some helper functions
  Add pktgen script: pktgen_sample06_numa_awared_queue_irq_affinity.sh

 pktgen/functions.sh| 44 ++
 ...tgen_sample06_numa_awared_queue_irq_affinity.sh | 97 ++
 2 files changed, 141 insertions(+)
 create mode 100755 pktgen/pktgen_sample06_numa_awared_queue_irq_affinity.sh

-- 
1.8.3.1



[PATCH] hamradio: baycom: use new parport device model

2017-09-17 Thread Sudip Mukherjee
Modify baycom driver to use the new parallel port device model.

Signed-off-by: Sudip Mukherjee 
---

Not tested on real hardware, only tested on qemu and verified that the
device is binding to the driver properly in epp_open but then unbinding
as the device was not found.

 drivers/net/hamradio/baycom_epp.c | 50 +++
 1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/drivers/net/hamradio/baycom_epp.c 
b/drivers/net/hamradio/baycom_epp.c
index 1503f10..1e62d00 100644
--- a/drivers/net/hamradio/baycom_epp.c
+++ b/drivers/net/hamradio/baycom_epp.c
@@ -840,6 +840,7 @@ static int epp_open(struct net_device *dev)
unsigned char tmp[128];
unsigned char stat;
unsigned long tstart;
+   struct pardev_cb par_cb;

 if (!pp) {
 printk(KERN_ERR "%s: parport at 0x%lx unknown\n", bc_drvname, 
dev->base_addr);
@@ -859,8 +860,21 @@ static int epp_open(struct net_device *dev)
 return -EIO;
}
memset(&bc->modem, 0, sizeof(bc->modem));
-bc->pdev = parport_register_device(pp, dev->name, NULL, epp_wakeup, 
-  NULL, PARPORT_DEV_EXCL, dev);
+   memset(&par_cb, 0, sizeof(par_cb));
+   par_cb.wakeup = epp_wakeup;
+   par_cb.private = (void *)dev;
+   par_cb.flags = PARPORT_DEV_EXCL;
+   for (i = 0; i < NR_PORTS; i++)
+   if (baycom_device[i] == dev)
+   break;
+
+   if (i == NR_PORTS) {
+   pr_err("%s: no device found\n", bc_drvname);
+   parport_put_port(pp);
+   return -ENODEV;
+   }
+
+   bc->pdev = parport_register_dev_model(pp, dev->name, &par_cb, i);
parport_put_port(pp);
 if (!bc->pdev) {
 printk(KERN_ERR "%s: cannot register parport at 0x%lx\n", 
bc_drvname, pp->base);
@@ -1185,6 +1199,23 @@ MODULE_LICENSE("GPL");
 
 /* - */
 
+static int baycom_epp_par_probe(struct pardevice *par_dev)
+{
+   struct device_driver *drv = par_dev->dev.driver;
+   int len = strlen(drv->name);
+
+   if (strncmp(par_dev->name, drv->name, len))
+   return -ENODEV;
+
+   return 0;
+}
+
+static struct parport_driver baycom_epp_par_driver = {
+   .name = "bce",
+   .probe = baycom_epp_par_probe,
+   .devmodel = true,
+};
+
 static void __init baycom_epp_dev_setup(struct net_device *dev)
 {
struct baycom_state *bc = netdev_priv(dev);
@@ -1204,10 +1235,15 @@ static void __init baycom_epp_dev_setup(struct 
net_device *dev)
 
 static int __init init_baycomepp(void)
 {
-   int i, found = 0;
+   int i, found = 0, ret;
char set_hw = 1;
 
printk(bc_drvinfo);
+
+   ret = parport_register_driver(&baycom_epp_par_driver);
+   if (ret)
+   return ret;
+
/*
 * register net devices
 */
@@ -1241,7 +1277,12 @@ static int __init init_baycomepp(void)
found++;
}
 
-   return found ? 0 : -ENXIO;
+   if (found == 0) {
+   parport_unregister_driver(&baycom_epp_par_driver);
+   return -ENXIO;
+   }
+
+   return 0;
 }
 
 static void __exit cleanup_baycomepp(void)
@@ -1260,6 +1301,7 @@ static void __exit cleanup_baycomepp(void)
printk(paranoia_str, "cleanup_module");
}
}
+   parport_unregister_driver(&baycom_epp_par_driver);
 }
 
 module_init(init_baycomepp);
-- 
2.7.4



Re: [PATCH net] net/sched: cls_matchall: fix crash when used with classful qdisc

2017-09-17 Thread Yotam Gigi
On 09/16/2017 03:02 PM, Davide Caratti wrote:
> this script, edited from the Linux Advanced Routing and Traffic Control guide
>
> tc q a dev en0 root handle 1: htb default a
> tc c a dev en0 parent 1:  classid 1:1 htb rate 6mbit burst 15k
> tc c a dev en0 parent 1:1 classid 1:a htb rate 5mbit ceil 6mbit burst 15k
> tc c a dev en0 parent 1:1 classid 1:b htb rate 1mbit ceil 6mbit burst 15k
> tc f a dev en0 parent 1:0 prio 1 $clsname $clsargs classid 1:b
> ping $address -c1
> tc -s c s dev en0
>
> classifies traffic to 1:b or 1:a, depending on whether or not the packet
> matches the pattern $clsargs of filter $clsname. However, when $clsname is
> 'matchall', a systematic crash can be observed in htb_classify(). HTB and
> classful qdiscs don't assign an initial value to struct tcf_result, but then
> they expect it to contain valid values after filters have been run. Thus,
> current 'matchall' ignores the TCA_MATCHALL_CLASSID attribute, configured
> by the user, and makes HTB (and classful qdiscs) dereference random pointers.
>
> By assigning head->res to *res in mall_classify(), before the actions are
> invoked, we fix this crash and enable TCA_MATCHALL_CLASSID functionality,
> that had no effect on 'matchall' classifier since its first introduction.
>
> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1460213
> Reported-by: Jiri Benc 
> Fixes: b87f7936a932 ("net/sched: introduce Match-all classifier")
> Signed-off-by: Davide Caratti 
> ---
>  net/sched/cls_matchall.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
> index 21cc45caf842..eeac606c95ab 100644
> --- a/net/sched/cls_matchall.c
> +++ b/net/sched/cls_matchall.c
> @@ -32,6 +32,7 @@ static int mall_classify(struct sk_buff *skb, const struct 
> tcf_proto *tp,
>   if (tc_skip_sw(head->flags))
>   return -1;
>  
> + *res = head->res;
>   return tcf_exts_exec(skb, &head->exts, res);
>  }
>  

Acked-by: Yotam Gigi