Re: [PATCH v2 14/28] thunderbolt: Extend tunnel creation to more than 2 adjacent switches

2019-02-12 Thread Lukas Wunner
On Mon, Feb 11, 2019 at 10:45:58AM +0200, Mika Westerberg wrote:
> On Sun, Feb 10, 2019 at 04:33:28PM +0100, Lukas Wunner wrote:
> > at the bottom of this page there's
> > a figure showing a PCI tunnel between non-adjacent switches (blue line):
> > 
> > https://developer.apple.com/library/archive/documentation/HardwareDrivers/Conceptual/ThunderboltDevGuide/Basics/Basics.html
> > 
> Are you sure Apple actually uses a setup like that? I don't think I have
> ever seen such a configuration on any of the devices I have.

Sorry, I don't know if they actually use that.


> I can update the changelog to mention that if you think it is useful.
> Something like below maybe?
> 
>  PCIe actually does not need this as it is typically a daisy chain
>  between two adjacent switches but this way we do not need to hard-code
>  creation of the tunnel.

LGTM, thanks.


> > > + i = 0;
> > > + tb_for_each_port(in_port, src, dst)
> > > + i++;
> > 
> > This looks more complicated than necessary.  Isn't the path length
> > always the length of the route string from in_port switch to out_port
> > switch, plus 2 for the adapter on each end?  Or do paths without
> > adapters exist?
> 
> Yes, I think you are right.

Simply subtracting the depths of the start and end port's switch also yields
the path length.  Of course this assumes that tunnels aren't established
between non-adjacent switches, but your algorithm doesn't do that.

Thanks,

Lukas


Re: [PATCH v2 14/28] thunderbolt: Extend tunnel creation to more than 2 adjacent switches

2019-02-11 Thread Mika Westerberg
On Sun, Feb 10, 2019 at 04:33:28PM +0100, Lukas Wunner wrote:
> On Wed, Feb 06, 2019 at 04:17:24PM +0300, Mika Westerberg wrote:
> > Now that we can allocate hop IDs per port on a path, we can take
> > advantage of this and create tunnels covering longer paths than just
> > between two adjacent switches. PCIe actually does not need this as it is
> > always a daisy chain between two adjacent switches but this way we do
> > not need to hard-code creation of the tunnel.
> 
> That doesn't seem to be correct; at the bottom of this page there's
> a figure showing a PCI tunnel between non-adjacent switches (blue line):
> 
> https://developer.apple.com/library/archive/documentation/HardwareDrivers/Conceptual/ThunderboltDevGuide/Basics/Basics.html
> 
> I'm not sure if there are advantages to such tunnels:  Reduced latency
> perhaps because packets need not pass through PCIe adapters on the
> in-between device?  Or maybe this allows for more fine-grained traffic
> prioritization?

Interesting.

Are you sure Apple actually uses a setup like that? I don't think I have
ever seen such a configuration on any of the devices I have.

I can update the changelog to mention that if you think it is useful.
Something like below maybe?

 PCIe actually does not need this as it is typically a daisy chain
 between two adjacent switches but this way we do not need to hard-code
 creation of the tunnel.

> > +   i = 0;
> > +   tb_for_each_port(in_port, src, dst)
> > +   i++;
> 
> This looks more complicated than necessary.  Isn't the path length
> always the length of the route string from in_port switch to out_port
> switch, plus 2 for the adapter on each end?  Or do paths without
> adapters exist?

Yes, I think you are right.

> > +   for (i = 0; i < num_hops; i++) {
> > +   in_port = tb_port_get_next(src, dst, out_port);
> > +
> > +   if (in_port->dual_link_port && in_port->link_nr != link_nr)
> > +   in_port = in_port->dual_link_port;
> > +
> > +   ret = tb_port_alloc_in_hopid(in_port, in_hopid, -1);
> > +   if (ret < 0)
> > +   goto err;
> > +   in_hopid = ret;
> > +
> > +   out_port = tb_port_get_next(src, dst, in_port);
> > +   if (!out_port)
> > +   goto err;
> 
> There's a NULL pointer check here, but the invocation of tb_port_get_next()
> further up to assign in_port lacks such a check.  Is it guaranteed to never
> be NULL?

No, I'll add a NULL check there.


Re: [PATCH v2 14/28] thunderbolt: Extend tunnel creation to more than 2 adjacent switches

2019-02-10 Thread Lukas Wunner
On Wed, Feb 06, 2019 at 04:17:24PM +0300, Mika Westerberg wrote:
> Now that we can allocate hop IDs per port on a path, we can take
> advantage of this and create tunnels covering longer paths than just
> between two adjacent switches. PCIe actually does not need this as it is
> always a daisy chain between two adjacent switches but this way we do
> not need to hard-code creation of the tunnel.

That doesn't seem to be correct; at the bottom of this page there's
a figure showing a PCI tunnel between non-adjacent switches (blue line):

https://developer.apple.com/library/archive/documentation/HardwareDrivers/Conceptual/ThunderboltDevGuide/Basics/Basics.html

I'm not sure if there are advantages to such tunnels:  Reduced latency
perhaps because packets need not pass through PCIe adapters on the
in-between device?  Or maybe this allows for more fine-grained traffic
prioritization?


> + i = 0;
> + tb_for_each_port(in_port, src, dst)
> + i++;

This looks more complicated than necessary.  Isn't the path length
always the length of the route string from in_port switch to out_port
switch, plus 2 for the adapter on each end?  Or do paths without
adapters exist?


> + for (i = 0; i < num_hops; i++) {
> + in_port = tb_port_get_next(src, dst, out_port);
> +
> + if (in_port->dual_link_port && in_port->link_nr != link_nr)
> + in_port = in_port->dual_link_port;
> +
> + ret = tb_port_alloc_in_hopid(in_port, in_hopid, -1);
> + if (ret < 0)
> + goto err;
> + in_hopid = ret;
> +
> + out_port = tb_port_get_next(src, dst, in_port);
> + if (!out_port)
> + goto err;

There's a NULL pointer check here, but the invocation of tb_port_get_next()
further up to assign in_port lacks such a check.  Is it guaranteed to never
be NULL?

Thanks,

Lukas


[PATCH v2 14/28] thunderbolt: Extend tunnel creation to more than 2 adjacent switches

2019-02-06 Thread Mika Westerberg
Now that we can allocate hop IDs per port on a path, we can take
advantage of this and create tunnels covering longer paths than just
between two adjacent switches. PCIe actually does not need this as it is
always a daisy chain between two adjacent switches but this way we do
not need to hard-code creation of the tunnel.

Signed-off-by: Mika Westerberg 
---
 drivers/thunderbolt/path.c   | 94 ++--
 drivers/thunderbolt/tb.h |  4 +-
 drivers/thunderbolt/tunnel.c | 54 +
 3 files changed, 106 insertions(+), 46 deletions(-)

diff --git a/drivers/thunderbolt/path.c b/drivers/thunderbolt/path.c
index 48cb15ff4446..122e6a1daf34 100644
--- a/drivers/thunderbolt/path.c
+++ b/drivers/thunderbolt/path.c
@@ -31,23 +31,97 @@ static void tb_dump_hop(struct tb_port *port, struct tb_regs_hop *hop)
 }
 
 /**
- * tb_path_alloc() - allocate a thunderbolt path
+ * tb_path_alloc() - allocate a thunderbolt path between two ports
+ * @tb: Domain pointer
+ * @src: Source port of the path
+ * @dst: Destination port of the path
+ * @start_hopid: Hop ID used for the first ingress port in the path
+ * @end_hopid: Hop ID used for the last egress port in the path (%-1 for
+ *automatic allocation)
+ * @link_nr: Preferred link if there are dual links on the path
+ *
+ * Creates a path between two ports starting with the given @start_hopid.
+ * Reserves hop IDs for each port (they can differ from @start_hopid
+ * depending on how many hop IDs each port has already reserved). If there
+ * are dual links on the path, prioritizes using @link_nr.
  *
  * Return: Returns a tb_path on success or NULL on failure.
  */
-struct tb_path *tb_path_alloc(struct tb *tb, int num_hops)
+struct tb_path *tb_path_alloc(struct tb *tb, struct tb_port *src,
+ struct tb_port *dst, int start_hopid,
+ int end_hopid, int link_nr)
 {
-   struct tb_path *path = kzalloc(sizeof(*path), GFP_KERNEL);
+   struct tb_port *in_port, *out_port;
+   int in_hopid, out_hopid;
+   struct tb_path *path;
+   size_t num_hops;
+   int i, ret;
+
+   path = kzalloc(sizeof(*path), GFP_KERNEL);
if (!path)
return NULL;
+
+   i = 0;
+   tb_for_each_port(in_port, src, dst)
+   i++;
+
+   /* Each hop takes two ports */
+   num_hops = i / 2;
+
path->hops = kcalloc(num_hops, sizeof(*path->hops), GFP_KERNEL);
if (!path->hops) {
kfree(path);
return NULL;
}
+
+   in_hopid = start_hopid;
+   out_port = NULL;
+   out_hopid = -1;
+
+   for (i = 0; i < num_hops; i++) {
+   in_port = tb_port_get_next(src, dst, out_port);
+
+   if (in_port->dual_link_port && in_port->link_nr != link_nr)
+   in_port = in_port->dual_link_port;
+
+   ret = tb_port_alloc_in_hopid(in_port, in_hopid, -1);
+   if (ret < 0)
+   goto err;
+   in_hopid = ret;
+
+   out_port = tb_port_get_next(src, dst, in_port);
+   if (!out_port)
+   goto err;
+
+   if (out_port->dual_link_port && out_port->link_nr != link_nr)
+   out_port = out_port->dual_link_port;
+
+   if (end_hopid && i == num_hops - 1)
+   ret = tb_port_alloc_out_hopid(out_port, end_hopid,
+ end_hopid);
+   else
+   ret = tb_port_alloc_out_hopid(out_port, -1, -1);
+
+   if (ret < 0)
+   goto err;
+   out_hopid = ret;
+
+   path->hops[i].in_hop_index = in_hopid;
+   path->hops[i].in_port = in_port;
+   path->hops[i].in_counter_index = -1;
+   path->hops[i].out_port = out_port;
+   path->hops[i].next_hop_index = out_hopid;
+
+   in_hopid = out_hopid;
+   }
+
path->tb = tb;
path->path_length = num_hops;
return path;
+
+err:
+   tb_path_free(path);
+   return NULL;
 }
 
 /**
@@ -55,10 +129,24 @@ struct tb_path *tb_path_alloc(struct tb *tb, int num_hops)
  */
 void tb_path_free(struct tb_path *path)
 {
+   int i;
+
if (path->activated) {
tb_WARN(path->tb, "trying to free an activated path\n")
return;
}
+
+   for (i = 0; i < path->path_length; i++) {
+   const struct tb_path_hop *hop = &path->hops[i];
+
+   if (hop->in_port)
+   tb_port_release_in_hopid(hop->in_port,
+hop->in_hop_index);
+   if (hop->out_port)
+   tb_port_release_out_hopid(hop->out_port,
+ hop->next_hop_index);
+   }
+
kfree(path->hops);
kfree(path);
 }
diff --git