Re: [PATCH V4] leds: trigger: Introduce an USB port trigger

2016-08-31 Thread Rafał Miłecki
On 31 August 2016 at 21:00, Rafał Miłecki  wrote:
> On 31 August 2016 at 20:23, Alan Stern  wrote:
>> On Tue, 30 Aug 2016, Rafał Miłecki wrote:
>>> Not really as it won't cover some pretty common use cases. Many home
>>> routers have few USB ports (2-5) and only 1 USB LED. It has to be
>>> possible to assign few USB ports to a single LED (trigger). That way
>>> LED should be turned on (and kept on) if there is at least 1 USB
>>> device connected. You obviously can't do:
>>> echo "usb1-1 usb1-2 usb2-1" > /sys/class/leds/foo/trigger
>>>
>>> This was already brought up by Rob (who mentioned CPU trigger) and I
>>> replied him pretty much the same way in:
>>> https://lkml.org/lkml/2016/7/29/38
>>> (reply starts with "Anyway, the serious limitation I see").
>>
>> The code for a bunch of triggers must already be written.  What would
>> the user do if he wanted to flash a single LED in response to both
>> CPU activity and MTD activity?  If not
>>
>> echo "cpu mtd" >/sys/class/leds/foo/trigger
>>
>> then what?
>
> Well, it sounds like a new feature then. Shall we add an extra API
> with a request function for turning LED on? It could internally count
> how many requests were raised and keep LED on as long as there is at
> least 1 left. I guess we should implement it in trigger "subsystem"
> (if I can call it so). Does it sound like a good plan?

I'm pretty sure no one ever planned to have more than 1 trigger
assigned to a single LED. I just realized there will be a problem with
the proposed solution: a sysfs file conflict.

Consider 2 existing triggers for a moment:
1) oneshot: it creates the following sysfs files:
/sys/class/leds/foo/delay_on
/sys/class/leds/foo/delay_off
/sys/class/leds/foo/invert
/sys/class/leds/foo/shot
2) timer: it creates the following sysfs files:
/sys/class/leds/foo/delay_on
/sys/class/leds/foo/delay_off

Activating both of them will probably cause a WARNING in sysfs. They
can't coexist :(

We should probably have per-trigger subdirs, e.g.:
/sys/class/leds/foo/trigger-oneshot/delay_on
/sys/class/leds/foo/trigger-oneshot/delay_off
/sys/class/leds/foo/trigger-oneshot/invert
/sys/class/leds/foo/trigger-oneshot/shot
/sys/class/leds/foo/trigger-timer/delay_on
/sys/class/leds/foo/trigger-timer/delay_off
but implementing it now would break the ABI.
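
The name clash can be illustrated with a small userspace model. This is
only a sketch in Python, not kernel code: the file lists are the ones
given above, while the helper names flat_conflicts() and namespaced()
are made up for the illustration.

```python
# Attribute files each trigger creates, as listed above.
ONESHOT = ["delay_on", "delay_off", "invert", "shot"]
TIMER = ["delay_on", "delay_off"]


def flat_conflicts(*triggers):
    """Return names that more than one trigger would create in the same dir."""
    seen, dup = set(), set()
    for files in triggers:
        for name in files:
            if name in seen:
                dup.add(name)
            else:
                seen.add(name)
    return sorted(dup)


def namespaced(trigger, files):
    """Hypothetical per-trigger layout: trigger-<name>/<file>."""
    return ["trigger-%s/%s" % (trigger, f) for f in files]


# Flat layout: both triggers want delay_on and delay_off.
print(flat_conflicts(ONESHOT, TIMER))

# Per-trigger subdirs: no shared paths remain.
print(set(namespaced("oneshot", ONESHOT)) & set(namespaced("timer", TIMER)))
```

With the flat layout the two shared names collide; with a per-trigger
prefix the intersection is empty.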

One workaround I can see is introducing triggers V2, which would:
1) Put their sysfs files in /sys/class/leds/foo/trigger-bar/
2) Use a new API for *requesting* the LED to be on/off
3) Keep a counter of requests in the V2 API
4) Be allowed to be used (assigned) several at a time on a single LED
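
To make the request counting concrete, here is a minimal userspace
model in Python. It is a sketch of the idea only, not a proposed kernel
API; the class and method names (LedRequests, request_on, release) are
invented for the example.

```python
class LedRequests:
    """Toy model: the LED stays on while at least one request is pending."""

    def __init__(self):
        self.count = 0  # number of outstanding "on" requests

    def request_on(self):
        # A trigger (e.g. one USB port) asks for the LED to be on.
        self.count += 1

    def release(self):
        # A trigger withdraws its earlier request.
        if self.count == 0:
            raise RuntimeError("release without a matching request")
        self.count -= 1

    @property
    def lit(self):
        return self.count > 0


led = LedRequests()
led.request_on()   # device plugged into the first port
led.request_on()   # device plugged into the second port
led.release()      # first device removed, LED must stay on
print(led.lit)
led.release()      # second device removed, LED goes off
print(led.lit)
```

A plain counter is the simplest form; tracking requests per trigger
instead would make unbalanced releases easier to detect.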

-- 
Rafał
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller

2016-08-31 Thread Matan Barak

On 31/08/2016 11:37, Parav Pandit wrote:

Added rdma cgroup controller that does accounting, limit enforcement
on rdma/IB verbs and hw resources.

Added rdma cgroup header file which defines its APIs to perform
charging/uncharging functionality. It also defines APIs for the RDMA/IB
stack for device registration. Devices which are registered will
participate in controller functions of accounting and limit
enforcement. It defines the rdmacg_device structure to bind the IB stack
and the RDMA cgroup controller.

RDMA resources are tracked using resource pool. Resource pool is per
device, per cgroup entity which allows setting up accounting limits
on per device basis.

Currently resources are defined by the RDMA cgroup.

A resource pool is created/destroyed dynamically whenever
charging/uncharging occurs, respectively, and whenever user
configuration is done. It's a tradeoff of memory vs. a little more code
space: the resource pool object is created whenever necessary, instead of
during cgroup creation and device registration.

Signed-off-by: Parav Pandit 
---
 include/linux/cgroup_rdma.h   |  66 +
 include/linux/cgroup_subsys.h |   4 +
 init/Kconfig  |  10 +
 kernel/Makefile   |   1 +
 kernel/cgroup_rdma.c  | 664 ++
 5 files changed, 745 insertions(+)
 create mode 100644 include/linux/cgroup_rdma.h
 create mode 100644 kernel/cgroup_rdma.c

diff --git a/include/linux/cgroup_rdma.h b/include/linux/cgroup_rdma.h
new file mode 100644
index 000..6710e28
--- /dev/null
+++ b/include/linux/cgroup_rdma.h
@@ -0,0 +1,66 @@
+/*
+ * Copyright (C) 2016 Parav Pandit 
+ *
+ * This file is subject to the terms and conditions of version 2 of the GNU
+ * General Public License. See the file COPYING in the main directory of the
+ * Linux distribution for more details.
+ */
+
+#ifndef _CGROUP_RDMA_H
+#define _CGROUP_RDMA_H
+
+#include 
+
+enum rdmacg_resource_type {
+   RDMACG_VERB_RESOURCE_UCTX,
+   RDMACG_VERB_RESOURCE_AH,
+   RDMACG_VERB_RESOURCE_PD,
+   RDMACG_VERB_RESOURCE_CQ,
+   RDMACG_VERB_RESOURCE_MR,
+   RDMACG_VERB_RESOURCE_MW,
+   RDMACG_VERB_RESOURCE_SRQ,
+   RDMACG_VERB_RESOURCE_QP,
+   RDMACG_VERB_RESOURCE_FLOW,
+   /*
+* add any hw specific resource here as RDMA_HW_RESOURCE_NAME
+*/
+   RDMACG_RESOURCE_MAX,
+};
+
+#ifdef CONFIG_CGROUP_RDMA
+


Currently, there are some discussions regarding the RDMA ABI. The
currently proposed approach (after a lot of discussions in the OFVWG) is
to have driver-dependent object types rather than the fixed set of IB
object types we have today.
AFAIK, some vendors might want to use the RDMA subsystem for different
fabrics, which have a different set of objects.
You can see RFCs for such concepts from both Mellanox and Intel on the
linux-rdma mailing list.


That said, maybe we need to make the resource types a bit more
flexible and dynamic.


Regards,
Matan


Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller

2016-08-31 Thread Tejun Heo
Hello,

On Wed, Aug 31, 2016 at 06:07:30PM +0300, Matan Barak wrote:
> Currently, there are some discussions regarding the RDMA ABI. The current
> proposed approach (after a lot of discussions in the OFVWG) is to have
> driver dependent object types rather than the fixed set of IB object types
> we have today.
> AFAIK, some vendors might want to use the RDMA subsystem for a different
> fabrics which has a different set of objects.
> You could see RFCs for such concepts both from Mellanox and Intel on the
> linux-rdma mailing list.
> 
> Saying that, maybe we need to make the resource types a bit more flexible
> and dynamic.

That'd be back to square one and Christoph was dead against it too,
so...

Thanks.

-- 
tejun


Re: [PATCH V4] leds: trigger: Introduce an USB port trigger

2016-08-31 Thread Rafał Miłecki
On 31 August 2016 at 20:23, Alan Stern  wrote:
> On Tue, 30 Aug 2016, Rafał Miłecki wrote:
>
>> >> As you quite often need more complex LED management, there are
>> >> triggers that were introduced in 2006 by c3bc9956ec52f ("[PATCH] LED:
>> >> add LED trigger tupport"). Some triggers are trivial and could be
>> >> implemented in userspace as well (e.g. "timer"). Some had to be
>> >> implemented in kernelspace (CPU activity, MTD activity, etc.). Having
>> >> few triggers compiled, you can assign them to LEDs at it pleases you.
>> >> Your hardware may have generic LED (not labeled) and you can
>> >> dynamically assign various triggers to it, depending e.g. on user
>> >> actions. E.g. if user (using GUI or whatever) wants to see flash
>> >> activity, your userspace script should do:
>> >> echo mtd > /sys/class/leds/foo/trigger
>> >
>> > So for example, you might want to do:
>> >
>> > echo usb1-4 >/sys/class/leds/foo/trigger
>> >
>> > and then have the "foo" LED toggle whenever an URB was submitted or
>> > completed for a device attached to the 1-4 port.  Right?
>>
>> Not really as it won't cover some pretty common use cases. Many home
>> routers have few USB ports (2-5) and only 1 USB LED. It has to be
>> possible to assign few USB ports to a single LED (trigger). That way
>> LED should be turned on (and kept on) if there is at least 1 USB
>> device connected. You obviously can't do:
>> echo "usb1-1 usb1-2 usb2-1" > /sys/class/leds/foo/trigger
>>
>> This was already brought up by Rob (who mentioned CPU trigger) and I
>> replied him pretty much the same way in:
>> https://lkml.org/lkml/2016/7/29/38
>> (reply starts with "Anyway, the serious limitation I see").
>
> The code for a bunch of triggers must already be written.  What would
> the user do if he wanted to flash a single LED in response to both
> CPU activity and MTD activity?  If not
>
> echo "cpu mtd" >/sys/class/leds/foo/trigger
>
> then what?

Well, it sounds like a new feature then. Shall we add an extra API
with a request function for turning LED on? It could internally count
how many requests were raised and keep LED on as long as there is at
least 1 left. I guess we should implement it in trigger "subsystem"
(if I can call it so). Does it sound like a good plan?

-- 
Rafał


Re: [PATCH V4] leds: trigger: Introduce an USB port trigger

2016-08-31 Thread Alan Stern
On Tue, 30 Aug 2016, Rafał Miłecki wrote:

> >> As you quite often need more complex LED management, there are
> >> triggers that were introduced in 2006 by c3bc9956ec52f ("[PATCH] LED:
> >> add LED trigger tupport"). Some triggers are trivial and could be
> >> implemented in userspace as well (e.g. "timer"). Some had to be
> >> implemented in kernelspace (CPU activity, MTD activity, etc.). Having
> >> few triggers compiled, you can assign them to LEDs at it pleases you.
> >> Your hardware may have generic LED (not labeled) and you can
> >> dynamically assign various triggers to it, depending e.g. on user
> >> actions. E.g. if user (using GUI or whatever) wants to see flash
> >> activity, your userspace script should do:
> >> echo mtd > /sys/class/leds/foo/trigger
> >
> > So for example, you might want to do:
> >
> > echo usb1-4 >/sys/class/leds/foo/trigger
> >
> > and then have the "foo" LED toggle whenever an URB was submitted or
> > completed for a device attached to the 1-4 port.  Right?
> 
> Not really as it won't cover some pretty common use cases. Many home
> routers have few USB ports (2-5) and only 1 USB LED. It has to be
> possible to assign few USB ports to a single LED (trigger). That way
> LED should be turned on (and kept on) if there is at least 1 USB
> device connected. You obviously can't do:
> echo "usb1-1 usb1-2 usb2-1" > /sys/class/leds/foo/trigger
> 
> This was already brought up by Rob (who mentioned CPU trigger) and I
> replied him pretty much the same way in:
> https://lkml.org/lkml/2016/7/29/38
> (reply starts with "Anyway, the serious limitation I see").

The code for a bunch of triggers must already be written.  What would 
the user do if he wanted to flash a single LED in response to both
CPU activity and MTD activity?  If not

echo "cpu mtd" >/sys/class/leds/foo/trigger

then what?

Alan Stern



Re: [PACTH v4 1/3] mm, proc: Implement /proc//totmaps

2016-08-31 Thread Mateusz Guzik
On Wed, Aug 31, 2016 at 12:36:26PM -0400, Robert Foss wrote:
> On 2016-08-31 05:45 AM, Jacek Anaszewski wrote:
> > > +static void *m_totmaps_start(struct seq_file *p, loff_t *pos)
> > > +{
> > > +return NULL + (*pos == 0);
> > > +}
> > > +
> > > +static void *m_totmaps_next(struct seq_file *p, void *v, loff_t *pos)
> > > +{
> > > +++*pos;
> > > +return NULL;
> > > +}
> > > +
> > 
> > When reading totmaps of kernel processes the following NULL pointer
> > dereference occurs:
> > 
> > Unable to handle kernel NULL pointer dereference at virtual address
> > 0044
> > [] (down_read) from [] (totmaps_proc_show+0x2c/0x1e8)
> > [] (totmaps_proc_show) from [] (seq_read+0x1c8/0x4b8)
> > [] (seq_read) from [] (__vfs_read+0x2c/0x110)
> > [] (__vfs_read) from [] (vfs_read+0x8c/0x110)
> > [] (vfs_read) from [] (SyS_read+0x40/0x8c)
> > [] (SyS_read) from [] (ret_fast_syscall+0x0/0x3c)
> > 
> > It seems that some protection is needed for such processes, so that
> > totmaps would return empty string then, like in case of smaps.
> > 
> 
> Thanks for the testing Jacek!
> 
> I had a look around the corresponding smaps code, but I'm not seeing any
> checks, do you know where that check actually is made?
> 

See m_start in fs/proc/task_mmu.c. It not only checks for a non-null mm,
but also tries to bump ->mm_users and only then proceeds to walk the mm.
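
As a rough userspace model of the guard described above (Python, not
kernel code): the dictionary keys "users" and "rss_kb" and the output
format are hypothetical stand-ins for ->mm_users and for whatever
totmaps would print, and read_totmaps_model() is a made-up name.

```python
def read_totmaps_model(mm):
    """Model of a read that refuses to walk a missing or dying mm."""
    # Kernel threads have no mm; a task that is exiting may have an mm
    # whose user count already dropped to zero. Both yield empty output.
    if mm is None or mm["users"] == 0:
        return ""
    mm["users"] += 1          # pin the mm for the duration of the walk
    try:
        return "rss_kb %d\n" % mm["rss_kb"]
    finally:
        mm["users"] -= 1      # drop the reference taken above


print(repr(read_totmaps_model(None)))
print(repr(read_totmaps_model({"users": 1, "rss_kb": 42})))
```

The point of the sketch is the ordering: check, pin, walk, unpin, with
an empty result instead of a dereference when there is nothing to walk.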

-- 
Mateusz Guzik


[PATCH 2/2] treewide: Remove references to the now unnecessary DEFINE_PCI_DEVICE_TABLE

2016-08-31 Thread Joe Perches
It's been eliminated from the sources, remove it from everywhere else.

Signed-off-by: Joe Perches 
---
 Documentation/PCI/pci.txt | 1 -
 include/linux/pci.h   | 9 -
 scripts/checkpatch.pl | 9 -
 scripts/tags.sh   | 1 -
 4 files changed, 20 deletions(-)

diff --git a/Documentation/PCI/pci.txt b/Documentation/PCI/pci.txt
index 123881f..77f49dc 100644
--- a/Documentation/PCI/pci.txt
+++ b/Documentation/PCI/pci.txt
@@ -124,7 +124,6 @@ initialization with a pointer to a structure describing the driver
 
 The ID table is an array of struct pci_device_id entries ending with an
 all-zero entry.  Definitions with static const are generally preferred.
-Use of the deprecated macro DEFINE_PCI_DEVICE_TABLE should be avoided.
 
 Each entry consists of:
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index fbc1fa6..0ab8359 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -683,15 +683,6 @@ struct pci_driver {
 #define to_pci_driver(drv) container_of(drv, struct pci_driver, driver)
 
 /**
- * DEFINE_PCI_DEVICE_TABLE - macro used to describe a pci device table
- * @_table: device table name
- *
- * This macro is deprecated and should not be used in new code.
- */
-#define DEFINE_PCI_DEVICE_TABLE(_table) \
-   const struct pci_device_id _table[]
-
-/**
  * PCI_DEVICE - macro used to describe a specific pci device
  * @vend: the 16 bit PCI Vendor ID
  * @dev: the 16 bit PCI Device ID
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 8946904..1c82b01 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3610,15 +3610,6 @@ sub process {
}
}
 
-# check for uses of DEFINE_PCI_DEVICE_TABLE
-   if ($line =~ /\bDEFINE_PCI_DEVICE_TABLE\s*\(\s*(\w+)\s*\)\s*=/) {
-       if (WARN("DEFINE_PCI_DEVICE_TABLE",
-                "Prefer struct pci_device_id over deprecated DEFINE_PCI_DEVICE_TABLE\n" . $herecurr) &&
-           $fix) {
-           $fixed[$fixlinenr] =~ s/\b(?:static\s+|)DEFINE_PCI_DEVICE_TABLE\s*\(\s*(\w+)\s*\)\s*=\s*/static const struct pci_device_id $1\[\] = /;
-       }
-   }
-
 # check for new typedefs, only function parameters and sparse annotations
 # make sense.
if ($line =~ /\btypedef\s/ &&
diff --git a/scripts/tags.sh b/scripts/tags.sh
index ed7eef2..b3775a9 100755
--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -206,7 +206,6 @@ regex_c=(
'/\
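
For out-of-tree code that still uses the macro, the substitution that
the removed checkpatch rule performed can be approximated outside
checkpatch. The sketch below is a Python translation of the Perl regex
quoted in the checkpatch.pl hunk above; the function name
convert_line() is an assumption made for the illustration.

```python
import re

# Same pattern as the removed checkpatch rule, written for Python's re.
_PCI_TABLE_RE = re.compile(
    r"\b(?:static\s+|)DEFINE_PCI_DEVICE_TABLE\s*\(\s*(\w+)\s*\)\s*=\s*"
)


def convert_line(line):
    """Rewrite one DEFINE_PCI_DEVICE_TABLE use to the plain declaration."""
    return _PCI_TABLE_RE.sub(r"static const struct pci_device_id \1[] = ", line)


print(convert_line("static DEFINE_PCI_DEVICE_TABLE(foo_ids) = {"))
print(convert_line("DEFINE_PCI_DEVICE_TABLE(bar_ids) = {"))
```

Lines that do not use the macro pass through unchanged.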

[PATCH 3/3] doc-rst:c-domain: function-like macros index entry

2016-08-31 Thread Markus Heiser
From: Markus Heiser 

For function-like macros, sphinx creates 'FOO (C function)' index entries.
With this patch, 'FOO (C macro)' entries are created for function-like
macros, the same as for object-like macros.

Signed-off-by: Markus Heiser 
---
 Documentation/sphinx/cdomain.py | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/sphinx/cdomain.py b/Documentation/sphinx/cdomain.py
index 0816090..2a1bd09 100644
--- a/Documentation/sphinx/cdomain.py
+++ b/Documentation/sphinx/cdomain.py
@@ -37,6 +37,7 @@ from docutils.parsers.rst import directives
 
 import sphinx
 from sphinx import addnodes
+from sphinx.locale import _
 from sphinx.domains.c import c_funcptr_sig_re, c_sig_re
 from sphinx.domains.c import CObject as Base_CObject
 from sphinx.domains.c import CDomain as Base_CDomain
@@ -66,6 +67,8 @@ class CObject(Base_CObject):
 "name" : directives.unchanged
 }
 
+is_function_like_macro = False
+
 def handle_func_like_macro(self, sig, signode):
 u"""Handles signatures of function-like macros.
 
@@ -104,6 +107,7 @@ class CObject(Base_CObject):
 param += nodes.emphasis(argname, argname)
 paramlist += param
 
+self.is_function_like_macro = True
 return fullname
 
 def handle_signature(self, sig, signode):
@@ -151,6 +155,12 @@ class CObject(Base_CObject):
 self.indexnode['entries'].append(
 ('single', indextext, targetname, '', None))
 
+def get_index_text(self, name):
+if self.is_function_like_macro:
+return _('%s (C macro)') % name
+else:
+return super(CObject, self).get_index_text(name)
+
 class CDomain(Base_CDomain):
 
 """C language domain."""
-- 
2.7.4



[PATCH 1/3] doc-rst:c-domain: fix sphinx version incompatibility

2016-08-31 Thread Markus Heiser
From: Markus Heiser 

The self.indexnode's tuple has changed in sphinx version 1.4, from a
former 4 element tuple to a 5 element tuple.

https://github.com/sphinx-doc/sphinx/commit/e6a5a3a92e938fcd75866b4227db9e0524d58f7c

Signed-off-by: Markus Heiser 
---
 Documentation/sphinx/cdomain.py | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/Documentation/sphinx/cdomain.py b/Documentation/sphinx/cdomain.py
index 9eb714a..66816ae 100644
--- a/Documentation/sphinx/cdomain.py
+++ b/Documentation/sphinx/cdomain.py
@@ -29,11 +29,15 @@ u"""
 
 from docutils.parsers.rst import directives
 
+import sphinx
 from sphinx.domains.c import CObject as Base_CObject
 from sphinx.domains.c import CDomain as Base_CDomain
 
 __version__  = '1.0'
 
+# Get Sphinx version
+major, minor, patch = map(int, sphinx.__version__.split("."))
+
 def setup(app):
 
 app.override_domain(CDomain)
@@ -85,8 +89,14 @@ class CObject(Base_CObject):
 
 indextext = self.get_index_text(name)
 if indextext:
-self.indexnode['entries'].append(('single', indextext,
-  targetname, '', None))
+if major >= 1 and minor < 4:
+# indexnode's tuple changed in 1.4
+# https://github.com/sphinx-doc/sphinx/commit/e6a5a3a92e938fcd75866b4227db9e0524d58f7c
+self.indexnode['entries'].append(
+('single', indextext, targetname, ''))
+else:
+self.indexnode['entries'].append(
+('single', indextext, targetname, '', None))
 
 class CDomain(Base_CDomain):
 
-- 
2.7.4
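
A note on the version test, with a hedged sketch. Parsing with
map(int, version.split(".")) raises ValueError on strings such as
"1.4b1" and fails to unpack two-part strings such as "2.0", and the
condition "major >= 1 and minor < 4" would also be true for a
hypothetical 2.1. One possible alternative is to compare (major, minor)
tuples. The helper names version_tuple() and index_entry() below are
invented for the example and are not part of Sphinx or of the patch.

```python
import re


def version_tuple(version):
    """Return (major, minor) from strings like '1.4.6', '1.4b1' or '2.0'."""
    m = re.match(r"(\d+)\.(\d+)", version)
    if m is None:
        raise ValueError("unrecognised version: %r" % version)
    return int(m.group(1)), int(m.group(2))


def index_entry(version, indextext, targetname):
    """Build the index entry in the shape the given Sphinx version expects."""
    if version_tuple(version) < (1, 4):
        # Before 1.4: 4 element tuple.
        return ('single', indextext, targetname, '')
    # 1.4 and later: 5 element tuple.
    return ('single', indextext, targetname, '', None)


print(len(index_entry("1.3.6", "FOO (C macro)", "c.FOO")))
print(len(index_entry("1.4b1", "FOO (C macro)", "c.FOO")))
```

Tuple comparison orders versions lexicographically, so the check keeps
working when the major number changes.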



Re: [RFC PATCH v2 04/20] x86: Secure Memory Encryption (SME) support

2016-08-31 Thread Tom Lendacky
On 08/30/2016 09:57 AM, Andy Lutomirski wrote:
> On Aug 30, 2016 6:34 AM, "Tom Lendacky"  wrote:
>>
>> On 08/25/2016 08:04 AM, Thomas Gleixner wrote:
>>> On Mon, 22 Aug 2016, Tom Lendacky wrote:
>>>
>>>> Provide support for Secure Memory Encryption (SME). This initial support
>>>> defines the memory encryption mask as a variable for quick access and an
>>>> accessor for retrieving the number of physical addressing bits lost if
>>>> SME is enabled.
>>>
>>> What is the reason that this needs to live in assembly code?
>>
>> In later patches this code is expanded and deals with a lot of page
>> table manipulation, cpuid/rdmsr instructions, etc. and so I thought it
>> was best to do it this way.
> 
> None of that sounds like it needs to be in asm, though.
> 
> I, at least, have a strong preference for minimizing the amount of asm
> in the low-level arch code.

I can take a look at converting it over to C code.

Thanks,
Tom

> 
> --Andy
> 


Re: [PATCH v6 7/8] thunderbolt: Networking doc

2016-08-31 Thread Greg KH
On Mon, Aug 01, 2016 at 03:23:52PM +0300, Amir Levy wrote:
> Adding Thunderbolt(TM) networking documentation.
> 
> Signed-off-by: Amir Levy 
> ---
>  Documentation/00-INDEX   |   2 +
>  Documentation/thunderbolt-networking.txt | 135 +++

Documentation/thunderbolt/networking.txt?

>  2 files changed, 137 insertions(+)
>  create mode 100644 Documentation/thunderbolt-networking.txt
> 
> diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> index cb9a6c6..80a6706 100644
> --- a/Documentation/00-INDEX
> +++ b/Documentation/00-INDEX
> @@ -439,6 +439,8 @@ this_cpu_ops.txt
>   - List rationale behind and the way to use this_cpu operations.
>  thermal/
>   - directory with information on managing thermal issues (CPU/temp)
> +thunderbolt-networking.txt
> + - Thunderbolt(TM) Networking driver description.
>  trace/
>   - directory with info on tracing technologies within linux
>  unaligned-memory-access.txt
> diff --git a/Documentation/thunderbolt-networking.txt b/Documentation/thunderbolt-networking.txt
> new file mode 100644
> index 000..e112313
> --- /dev/null
> +++ b/Documentation/thunderbolt-networking.txt
> @@ -0,0 +1,135 @@
> +Intel Thunderbolt(TM) Linux driver
> +==
> +
> +Copyright(c) 2013 - 2016 Intel Corporation.
> +
> +Contact Information:
> +Intel Thunderbolt mailing list 
> +Edited by Michael Jamet 
> +
> +Overview
> +
> +
> +Thunderbolt(TM) Networking mode is introduced with this driver.

What is "this driver"?

> +This kernel code creates an ethernet device utilized in computer to computer
> +communication over a Thunderbolt cable.

What kernel code?

> +This driver has been added on the top of the existing thunderbolt driver
> +for systems with firwmare (FW) based Thunderbolt controllers supporting
> +Thunderbolt Networking.

How do I know if my hardware supports this or not?

> +
> +Files
> +=
> +
> +- icm_nhi.c/h:   These files allow communication with the FW (a.k.a ICM) based controller.
> + In addition, they create an interface for netlink communication with
> + a user space daemon.
> +
> +- net.c/net.h:   These files implement the 'eth' interface for the Thunderbolt(TM)
> + networking.

Where are these files?  Not in this documentation directory :(

> +
> +Interface to user space
> +===
> +
> +The interface to the user space module is implemented through a Generic Netlink.
> +In order to be accessed by the user space module, both kernel and user space
> +modules have to register with the same GENL_NAME. In our case, this is
> +simply "thunderbolt".
> +The registration is done at driver initialization time for all instances of
> +the Thunderbolt controllers.
> +The communication is then carried through pre-defined Thunderbolt messages.
> +Each specific message has a callback function that is called when
> +the related message is received.
> +
> +The messages are defined as follows:
> +* NHI_CMD_UNSPEC: Not used.
> +* NHI_CMD_SUBSCRIBE: Subscription request from daemon to driver to open the
> +  communication channel.
> +* NHI_CMD_UNSUBSCRIBE: Request from daemon to driver to unsubscribe
> +  to close communication channel.
> +* NHI_CMD_QUERY_INFORMATION: Request information from the driver such as
> +  driver version, FW version offset, number of ports in the controller
> +  and DMA port.
> +* NHI_CMD_MSG_TO_ICM: Message from user space module to FW.
> +* NHI_CMD_MSG_FROM_ICM: Response from FW to user space module.
> +* NHI_CMD_MAILBOX: Message that uses mailbox mechanism such as FW policy
> +  changes or disconnect path.
> +* NHI_CMD_APPROVE_TBT_NETWORKING: Request from user space
> +  module to FW to establish path.
> +* NHI_CMD_ICM_IN_SAFE_MODE: Indication that the FW has entered safe mode.

I want an ack from the network maintainers that this is an acceptable
way to configure a network device, and that you aren't just
reimplementing something that can already be done with existing tools.

Creating new apis seems really strange to me, you will have to really
convince me why you are "special" and need one.


> +
> +Communication with ICM (Firmware)
> +=
> +
> +The communication with ICM is principally achieved through
> +a DMA mechanism on Ring 0.

Where is this ring 0?

Why does a user/developer care about it?

> +The driver allocates a shared memory that is physically mapped onto
> +the DMA physical space at Ring 0.

Again ring 0?

> +
> +Interrupts
> +==
> +
> +Thunderbolt relies on MSI-X interrupts.
> +The MSI-X vector is allocated as follows:
> +ICM
> + - Tx: MSI-X vector index 0
> + - Rx: MSI-X vector index 1
> +
> +Port 0
> + - Tx: MSI-X vector index 2
> + - Rx: MSI-X vector index 3
> +
> +Port 1
> + - Tx: MSI-X vector index 4
> + - Rx: MSI-X 

Re: [PATCH v6 4/8] thunderbolt: Communication with the ICM (firmware)

2016-08-31 Thread Greg KH
On Mon, Aug 01, 2016 at 03:23:49PM +0300, Amir Levy wrote:
> Firmware-based (a.k.a ICM - Intel Connection Manager) controller is
> used for establishing and maintaining the Thunderbolt Networking
> connection. We need to be able to communicate with it.
> 
> Signed-off-by: Amir Levy 
> ---
>  drivers/thunderbolt/Makefile  |1 +
>  drivers/thunderbolt/icm/Makefile  |   28 +
>  drivers/thunderbolt/icm/icm_nhi.c | 1324 +
>  drivers/thunderbolt/icm/icm_nhi.h |   93 +++
>  drivers/thunderbolt/icm/net.h |  227 +++
>  5 files changed, 1673 insertions(+)
>  create mode 100644 drivers/thunderbolt/icm/Makefile
>  create mode 100644 drivers/thunderbolt/icm/icm_nhi.c
>  create mode 100644 drivers/thunderbolt/icm/icm_nhi.h
>  create mode 100644 drivers/thunderbolt/icm/net.h
> 
> diff --git a/drivers/thunderbolt/Makefile b/drivers/thunderbolt/Makefile
> index 7a85bd1..b6aa6a3 100644
> --- a/drivers/thunderbolt/Makefile
> +++ b/drivers/thunderbolt/Makefile
> @@ -1,3 +1,4 @@
>  obj-${CONFIG_THUNDERBOLT_APPLE} := thunderbolt.o
>  thunderbolt-objs := nhi.o ctl.o tb.o switch.o cap.o path.o tunnel_pci.o eeprom.o
>  
> +obj-${CONFIG_THUNDERBOLT_ICM} += icm/
> diff --git a/drivers/thunderbolt/icm/Makefile b/drivers/thunderbolt/icm/Makefile
> new file mode 100644
> index 000..3adfc35
> --- /dev/null
> +++ b/drivers/thunderbolt/icm/Makefile
> @@ -0,0 +1,28 @@
> +
> +#
> +# Intel Thunderbolt(TM) driver
> +# Copyright(c) 2014 - 2016 Intel Corporation.
> +#
> +# This program is free software; you can redistribute it and/or modify it
> +# under the terms and conditions of the GNU General Public License,
> +# version 2, as published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope it will be useful, but WITHOUT
> +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +# more details.
> +#
> +# You should have received a copy of the GNU General Public License along
> +# with this program.  If not, see .
> +#
> +# The full GNU General Public License is included in this distribution in
> +# the file called "COPYING".

Why are these two paragraphs needed?

> +#
> +# Contact Information:
> +# Intel Thunderbolt Mailing List 
> +# Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497

We try to stay away from addresses in kernel files as no one wants to
maintain the corporate movements of companies for years to come.  Are
you willing to do this for the next 40 years?

> +#
> +
> +
> +obj-${CONFIG_THUNDERBOLT_ICM} += thunderbolt-icm.o
> +thunderbolt-icm-objs := icm_nhi.o

All of that for a simple 2 line Makefile?  Please no, just have a 2 line
Makefile...

And you don't have a Kconfig file that adds this option, right?

What exactly does this driver do?  Why would I want it enabled?  I'm
lost...

> diff --git a/drivers/thunderbolt/icm/icm_nhi.c b/drivers/thunderbolt/icm/icm_nhi.c
> new file mode 100644
> index 000..bcb5c1b
> --- /dev/null
> +++ b/drivers/thunderbolt/icm/icm_nhi.c
> @@ -0,0 +1,1324 @@
> +/***
> + *
> + * Intel Thunderbolt(TM) driver
> + * Copyright(c) 2014 - 2016 Intel Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program.  If not, see .
> + *
> + * The full GNU General Public License is included in this distribution in
> + * the file called "COPYING".
> + *
> + * Contact Information:
> + * Intel Thunderbolt Mailing List 
> + * Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
> + *
> + 
> **/
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include "icm_nhi.h"
> +#include "net.h"
> +
> +#define NHI_GENL_VERSION 1
> +#define NHI_GENL_NAME DRV_NAME

Why not just use DRV_NAME?

Why have this at all?  Why even have DRV_NAME?  Seems useless as you are
not going to change it.

thanks,

greg k-h

Re: [PATCH v6 3/8] thunderbolt: Kconfig for Thunderbolt(TM) networking

2016-08-31 Thread Greg KH
On Mon, Aug 01, 2016 at 03:23:48PM +0300, Amir Levy wrote:
> Updating the Kconfig Thunderbolt(TM) description.

Why are you inserting a (TM) in here?  Do you see that in any other
kernel Kconfig file?  Please don't start adding it here, it's not
needed from what I can tell.

thanks,

greg k-h


Re: [PATCH v6 0/8] thunderbolt: Introducing Thunderbolt(TM) networking

2016-08-31 Thread Greg KH
On Mon, Aug 01, 2016 at 03:23:45PM +0300, Amir Levy wrote:
> This is version 6 of Thunderbolt(TM) driver for non-Apple hardware.
> 
> Changes since v5:
>  - Removed the padding of short packets in receive
>  - Replaced RW semaphore with mutex
>  - Cleanup
> 
> These patches were pushed to GitHub where they can be reviewed more
> comfortably with green/red highlighting:
>   https://github.com/01org/thunderbolt-software-kernel-tree
> 
> Daemon code:
>   https://github.com/01org/thunderbolt-software-daemon
> 
> For reference, here's a link to version 5:
> [v5]: https://lkml.org/lkml/2016/7/28/85

Without acks from the thunderbolt maintainer, or any network driver
developers, I'm not going to take this series.  Please work to get that
review.

thanks,

greg k-h


Re: [PATCH v3] docs-rst: ignore arguments on macro definitions

2016-08-31 Thread Markus Heiser

Am 31.08.2016 um 12:26 schrieb Mauro Carvalho Chehab :

> Em Wed, 31 Aug 2016 12:09:39 +0200
> Markus Heiser  escreveu:
> 
>> Am 31.08.2016 um 11:02 schrieb Jani Nikula :
>> 
>>> On Wed, 31 Aug 2016, Markus Heiser  wrote:  
>>>> I haven't tested your suggestion, but since *void* is in the list
>>>> of stop-words:
>>>>
>>>>   # These C types aren't described anywhere, so don't try to create
>>>>   # a cross-reference to them
>>>>   stopwords = set((
>>>>   'const', 'void', 'char', 'wchar_t', 'int', 'short',
>>>>   'long', 'float', 'double', 'unsigned', 'signed', 'FILE',
>>>>   'clock_t', 'time_t', 'ptrdiff_t', 'size_t', 'ssize_t',
>>>>   'struct', '_Bool',
>>>>   ))
>>>>
>>>> I think it will work in the matter you think.
>>>>
>>>> However I like to prefer to fix it in the C-domain, using
>>>> Mauro's suggestion on argument parsing. IMHO it is not
>>>> the best solution to add a void type to the reST signature
>>>> of a macro. This will result in a unusual output and does
>>>> not fix what is wrong in Sphinx's c-domain (there is also
>>>> a drawback in the index, where a function-type macro is
>>>> referred as function, not as macro).
>>> 
>>> From an API user's perspective, functions and function-like macros
>>> should work interchangeably.  
>> 
>> Ah, OK.
>> 
>>> Personally, I don't think there needs to be
>>> a difference in the index. This seems to be the approach taken in
>>> Sphinx, but it just doesn't work well for automatic documentation
>>> generation because we can't deduce the parameter types from the macro
>>> definition.  
>> 
>> In the index, sphinx refers only object-like macros with an entry 
>> "FOO (C macro))". Function-like macros are referred as "BAR (C function)".
>> 
>> I thought it is more straight forward to refer all macros with a 
>> "BAR (C macro)" entry in the index. I will split this change in
>> a separate patch, so we can decide if we like to patch the index
>> that way.
>> 
>> But now, as we discuss this, I have another doubt to fix the index.
>> It might be confusing when writing references to those macros.
>> 
>> Since function-like macros internally are functions in the c-domain, 
>> they are referred with ":c:func:`BAR`". On the other side, object-like
>> macros are referred by role ":c:macro:`FOO`".
>> 
>> Taking this into account, it might be one reason more to follow
>> your conclusion that functions and function-like macros are 
>> interchangeable from the user's perspective.
> 
> It is not uncommon to "promote" some such macros to inline
> functions, in order to have a stronger type check, or to do the
> reverse, when we need a more generic declaration that would work
> for multiple types.
> 
> So, keeping both macro function-like functions and functions using
> the :c:function: seems to be the best, IMHO. It also makes life
> easier for kernel-doc script.


Maybe I was unclear. I don't want to change the behavior of """keeping both
macro function-like functions and functions using the :c:function:""".

The only thing I considered changing is how the index entry looks.
At first I thought it would be more straightforward to list function-like
macros as "BAR (C macro)". But after Jani's conclusion, I doubt that this
is really a better index entry than what Sphinx already produces,
"BAR (C function)".

Sorry for the confusion.

-- Markus --





Re: [PATCH v3] docs-rst: ignore arguments on macro definitions

2016-08-31 Thread Markus Heiser

On 31.08.2016 at 11:02, Jani Nikula wrote:

> On Wed, 31 Aug 2016, Markus Heiser  wrote:
>> I haven't tested your suggestion, but since *void* is in the list
>> of stop-words:
>> 
>>   # These C types aren't described anywhere, so don't try to create
>>   # a cross-reference to them
>>   stopwords = set((
>>       'const', 'void', 'char', 'wchar_t', 'int', 'short',
>>       'long', 'float', 'double', 'unsigned', 'signed', 'FILE',
>>       'clock_t', 'time_t', 'ptrdiff_t', 'size_t', 'ssize_t',
>>       'struct', '_Bool',
>>   ))
>> 
>> I think it will work in the matter you think. 
>> 
>> However I like to prefer to fix it in the C-domain, using
>> Mauro's suggestion on argument parsing. IMHO it is not
>> the best solution to add a void type to the reST signature
>> of a macro. This will result in a unusual output and does
>> not fix what is wrong in Sphinx's c-domain (there is also
>> a drawback in the index, where a function-type macro is
>> referred as function, not as macro).
> 
> From an API user's perspective, functions and function-like macros
> should work interchangeably.

Ah, OK.

> Personally, I don't think there needs to be
> a difference in the index. This seems to be the approach taken in
> Sphinx, but it just doesn't work well for automatic documentation
> generation because we can't deduce the parameter types from the macro
> definition.

In the index, Sphinx lists only object-like macros with an entry
"FOO (C macro)". Function-like macros are listed as "BAR (C function)".

I thought it would be more straightforward to list all macros with a
"BAR (C macro)" entry in the index. I will split this change into
a separate patch, so we can decide whether we want to patch the index
that way.

But now, as we discuss this, I have another doubt about changing the index.
It might be confusing when writing references to those macros.

Since function-like macros are internally functions in the C domain,
they are referenced with ":c:func:`BAR`". Object-like macros, on the
other hand, are referenced with the role ":c:macro:`FOO`".

Taking this into account, it might be one more reason to follow
your conclusion that functions and function-like macros are
interchangeable from the user's perspective.

-- Markus --

> 
> BR,
> Jani.
> 
> 
> -- 
> Jani Nikula, Intel Open Source Technology Center



Re: [PACTH v4 1/3] mm, proc: Implement /proc/<pid>/totmaps

2016-08-31 Thread Jacek Anaszewski

Hi Robert,

On 08/17/2016 12:33 AM, robert.f...@collabora.com wrote:

From: Robert Foss 

This is based on earlier work by Thiago Goncales. It implements a new
per process proc file which summarizes the contents of the smaps file
but doesn't display any addresses.  It gives more detailed information
than statm like the PSS (proportional set size).  It differs from the
original implementation in that it doesn't use the full blown set of
seq operations, uses a different termination condition, and doesn't
display "Locked" as that was broken in the original implementation.

This new proc file provides information faster than parsing the potentially
huge smaps file.
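
For comparison, here is a minimal userspace sketch of the work that totmaps is meant to save: summing one field over every mapping in an smaps-style stream. The helper name sum_smaps_field() is made up for this illustration and is not part of the patch; the only assumption about the input is the "Key:   <value> kB" line layout that smaps uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Sum every "<key>: <value> kB" line in an smaps-formatted stream.
 * Hypothetical helper for illustration only; not part of the patch.
 * Returns the total in kB.
 */
unsigned long sum_smaps_field(FILE *f, const char *key)
{
	char line[256];
	size_t klen;
	unsigned long total = 0;

	assert(f != NULL && key != NULL);
	klen = strlen(key);

	while (fgets(line, sizeof(line), f)) {
		unsigned long val;

		/* Require "key:" exactly, so "Swap" does not match "SwapPss:". */
		if (strncmp(line, key, klen) != 0 || line[klen] != ':')
			continue;
		if (sscanf(line + klen + 1, "%lu", &val) == 1)
			total += val;
	}
	return total;
}
```

Run against /proc/<pid>/smaps, this has to read and scan every line of every mapping just to produce one number, which is the cost the commit message refers to.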

Tested-by: Robert Foss 
Signed-off-by: Robert Foss 

Signed-off-by: Sonny Rao 
---
 fs/proc/base.c |   1 +
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 141 +
 3 files changed, 144 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index a11eb71..de3acdf 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2855,6 +2855,7 @@ static const struct pid_entry tgid_base_stuff[] = {
REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
REG("smaps",  S_IRUGO, proc_pid_smaps_operations),
REG("pagemap",S_IRUSR, proc_pagemap_operations),
+   REG("totmaps",S_IRUGO, proc_totmaps_operations),
 #endif
 #ifdef CONFIG_SECURITY
DIR("attr",   S_IRUGO|S_IXUGO, proc_attr_dir_inode_operations, 
proc_attr_dir_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index aa27810..99f97d7 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -297,6 +297,8 @@ extern const struct file_operations proc_pid_smaps_operations;
 extern const struct file_operations proc_tid_smaps_operations;
 extern const struct file_operations proc_clear_refs_operations;
 extern const struct file_operations proc_pagemap_operations;
+extern const struct file_operations proc_totmaps_operations;
+

 extern unsigned long task_vsize(struct mm_struct *);
 extern unsigned long task_statm(struct mm_struct *,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 4648c7f..fd8fd7f 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -802,6 +802,75 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
return 0;
 }

+static void add_smaps_sum(struct mem_size_stats *mss,
+   struct mem_size_stats *mss_sum)
+{
+   mss_sum->resident += mss->resident;
+   mss_sum->pss += mss->pss;
+   mss_sum->shared_clean += mss->shared_clean;
+   mss_sum->shared_dirty += mss->shared_dirty;
+   mss_sum->private_clean += mss->private_clean;
+   mss_sum->private_dirty += mss->private_dirty;
+   mss_sum->referenced += mss->referenced;
+   mss_sum->anonymous += mss->anonymous;
+   mss_sum->anonymous_thp += mss->anonymous_thp;
+   mss_sum->swap += mss->swap;
+}
+
+static int totmaps_proc_show(struct seq_file *m, void *data)
+{
+   struct proc_maps_private *priv = m->private;
+   struct mm_struct *mm = priv->mm;
+   struct vm_area_struct *vma;
+   struct mem_size_stats mss_sum;
+
+   memset(&mss_sum, 0, sizeof(mss_sum));
+   down_read(&mm->mmap_sem);
+   hold_task_mempolicy(priv);
+
+   for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
+   struct mem_size_stats mss;
+   struct mm_walk smaps_walk = {
+   .pmd_entry = smaps_pte_range,
+   .mm = vma->vm_mm,
+   .private = &mss,
+   };
+
+   if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
+   memset(&mss, 0, sizeof(mss));
+   walk_page_vma(vma, &smaps_walk);
+   add_smaps_sum(&mss, &mss_sum);
+   }
+   }
+
+   release_task_mempolicy(priv);
+   up_read(&mm->mmap_sem);
+
+   seq_printf(m,
+  "Rss:%8lu kB\n"
+  "Pss:%8lu kB\n"
+  "Shared_Clean:   %8lu kB\n"
+  "Shared_Dirty:   %8lu kB\n"
+  "Private_Clean:  %8lu kB\n"
+  "Private_Dirty:  %8lu kB\n"
+  "Referenced: %8lu kB\n"
+  "Anonymous:  %8lu kB\n"
+  "AnonHugePages:  %8lu kB\n"
+  "Swap:   %8lu kB\n",
+  mss_sum.resident >> 10,
+  (unsigned long)(mss_sum.pss >> (10 + PSS_SHIFT)),
+  mss_sum.shared_clean  >> 10,
+  mss_sum.shared_dirty  >> 10,
+  mss_sum.private_clean >> 10,
+  mss_sum.private_dirty >> 10,
+  mss_sum.referenced >> 10,
+  mss_sum.anonymous >> 10,
+  mss_sum.anonymous_thp >> 10,
+  mss_sum.swap >> 10);
+
+   return 0;
+}
+
 static int 

Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller

2016-08-31 Thread Leon Romanovsky
On Wed, Aug 31, 2016 at 02:07:25PM +0530, Parav Pandit wrote:
> Added rdma cgroup controller that does accounting, limit enforcement
> on rdma/IB verbs and hw resources.
>
> Added rdma cgroup header file which defines its APIs to perform
> charing/uncharing functionality. It also defined APIs for RDMA/IB
> stack for device registration. Devices which are registered will
> participate in controller functions of accounting and limit
> enforcements. It define rdmacg_device structure to bind IB stack
> and RDMA cgroup controller.
>
> RDMA resources are tracked using resource pool. Resource pool is per
> device, per cgroup entity which allows setting up accounting limits
> on per device basis.
>
> Currently resources are defined by the RDMA cgroup.
>
> Resource pool is created/destroyed dynamically whenever
> charging/uncharging occurs respectively and whenever user
> configuration is done. Its a tradeoff of memory vs little more code
> space that creates resource pool object whenever necessary, instead of
> creating them during cgroup creation and device registration time.
>
> Signed-off-by: Parav Pandit 
> ---

<...>

> +
> +static struct rdmacg_resource_pool *
> +get_cg_rpool_locked(struct rdma_cgroup *cg, struct rdmacg_device *device)
> +{
> + struct rdmacg_resource_pool *rpool;
> +
> + rpool = find_cg_rpool_locked(cg, device);
> + if (rpool)
> + return rpool;
> +
> + rpool = kzalloc(sizeof(*rpool), GFP_KERNEL);
> + if (!rpool)
> + return ERR_PTR(-ENOMEM);
> +
> + rpool->device = device;
> + set_all_resource_max_limit(rpool);
> +
> + INIT_LIST_HEAD(&rpool->cg_node);
> + INIT_LIST_HEAD(&rpool->dev_node);
> + list_add_tail(&rpool->cg_node, &cg->rpools);
> + list_add_tail(&rpool->dev_node, &device->rpools);
> + return rpool;
> +}

<...>

> + for (p = cg; p; p = parent_rdmacg(p)) {
> + rpool = get_cg_rpool_locked(p, device);
> + if (IS_ERR_OR_NULL(rpool)) {

get_cg_rpool_locked() never returns NULL (it returns either an error
pointer or a valid pointer), so IS_ERR() is enough here.

> + ret = PTR_ERR(rpool);
> + goto err;
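
To illustrate the point, here is a userspace sketch with simplified stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers. The real ones live in include/linux/err.h; the reduced definitions below, as well as struct pool, get_pool() and use_pool(), are written for this example only. A related reason to avoid IS_ERR_OR_NULL() in this pattern is that PTR_ERR(NULL) is 0, so a NULL return would be reported as success on the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the helpers in include/linux/err.h. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error codes occupy the last MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct pool {
	int limit;
};

/* Like get_cg_rpool_locked(): a valid pointer or ERR_PTR(), never NULL. */
struct pool *get_pool(int fail)
{
	struct pool *p;

	if (fail)
		return ERR_PTR(-ENOMEM);
	p = calloc(1, sizeof(*p));
	if (!p)
		return ERR_PTR(-ENOMEM);
	return p;
}

/* The caller only needs IS_ERR(); a separate NULL check would be dead code. */
int use_pool(int fail)
{
	struct pool *p = get_pool(fail);

	if (IS_ERR(p))
		return (int)PTR_ERR(p);
	assert(p != NULL);	/* documents the never-NULL contract */
	free(p);
	return 0;
}
```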

I didn't review the whole series yet.




Re: [PATCH v3] docs-rst: ignore arguments on macro definitions

2016-08-31 Thread Jani Nikula
On Wed, 31 Aug 2016, Markus Heiser  wrote:
> I haven't tested your suggestion, but since *void* is in the list
> of stop-words:
>
> # These C types aren't described anywhere, so don't try to create
> # a cross-reference to them
> stopwords = set((
> 'const', 'void', 'char', 'wchar_t', 'int', 'short',
> 'long', 'float', 'double', 'unsigned', 'signed', 'FILE',
> 'clock_t', 'time_t', 'ptrdiff_t', 'size_t', 'ssize_t',
> 'struct', '_Bool',
> ))
>
> I think it will work in the matter you think. 
>
> However I like to prefer to fix it in the C-domain, using
> Mauro's suggestion on argument parsing. IMHO it is not
> the best solution to add a void type to the reST signature
> of a macro. This will result in a unusual output and does
> not fix what is wrong in Sphinx's c-domain (there is also
> a drawback in the index, where a function-type macro is
> referred as function, not as macro).

From an API user's perspective, functions and function-like macros
should work interchangeably. Personally, I don't think there needs to be
a difference in the index. This seems to be the approach taken in
Sphinx, but it just doesn't work well for automatic documentation
generation because we can't deduce the parameter types from the macro
definition.
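
A small C sketch of the problem, with names invented for this example: the macro's parameter list carries no types, so a documentation generator has nothing to put into a typed signature, while the equivalent inline function spells the types out.

```c
#include <assert.h>

/* Function-like macro: 'a' and 'b' are named, but no types appear anywhere. */
#define MAX_OF(a, b) ((a) > (b) ? (a) : (b))

/* Inline function: the parameter and return types are part of the signature. */
static inline int max_int(int a, int b)
{
	return a > b ? a : b;
}

/* From the caller's side the two are used in exactly the same way. */
void check_interchangeable(void)
{
	assert(MAX_OF(3, 7) == max_int(3, 7));
	/* The macro also accepts other types; the function is int-only. */
	assert(MAX_OF(2.5, 1.0) == 2.5);
}
```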

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center


[PATCHv12 3/3] rdmacg: Added documentation for rdmacg

2016-08-31 Thread Parav Pandit
Added documentation for v1 and v2 version describing high
level design and usage examples on using rdma controller.

Signed-off-by: Parav Pandit 
---
 Documentation/cgroup-v1/rdma.txt | 117 +++
 Documentation/cgroup-v2.txt  |  45 +++
 2 files changed, 162 insertions(+)
 create mode 100644 Documentation/cgroup-v1/rdma.txt

diff --git a/Documentation/cgroup-v1/rdma.txt b/Documentation/cgroup-v1/rdma.txt
new file mode 100644
index 000..28cb59e
--- /dev/null
+++ b/Documentation/cgroup-v1/rdma.txt
@@ -0,0 +1,117 @@
+   RDMA Controller
+   
+
+Contents
+
+
+1. Overview
+  1-1. What is RDMA controller?
+  1-2. Why RDMA controller needed?
+  1-3. How is RDMA controller implemented?
+2. Usage Examples
+
+1. Overview
+
+1-1. What is RDMA controller?
+-
+
+RDMA controller allows user to limit RDMA/IB specific resources that a given
+set of processes can use. These processes are grouped using RDMA controller.
+
+RDMA controller defines well defined verb resources which can be limited for
+processes of a cgroup.
+
+1-2. Why RDMA controller needed?
+
+
+Currently user space applications can easily take away all the rdma device
+specific resources such as AH, CQ, QP, MR etc. Due to which other applications
+in other cgroup or kernel space ULPs may not even get chance to allocate any
+rdma resources. This can lead to service unavailability.
+
+Therefore RDMA controller is needed through which resource consumption
+of processes can be limited. Through this controller various different rdma
+resources can be accounted.
+
+1-3. How is RDMA controller implemented?
+
+
+RDMA cgroup allows limit configuration of resources. Rdma cgroup maintains
+resource accounting per cgroup, per device using resource pool structure.
+Each such resource pool is limited up to 64 resources in given resource pool
+by rdma cgroup, which can be extended later if required.
+
+This resource pool object is linked to the cgroup css. Typically there
+are 0 to 4 resource pool instances per cgroup, per device in most use cases.
+But nothing limits to have it more. At present hundreds of RDMA devices per
+single cgroup may not be handled optimally, however there is no
+known use case or requirement for such configuration either.
+
+Since RDMA resources can be allocated from any process and can be freed by any
+of the child processes which shares the address space, rdma resources are
+always owned by the creator cgroup css. This allows process migration from one
+to other cgroup without major complexity of transferring resource ownership;
+because such ownership is not really present due to shared nature of
+rdma resources. Linking resources around css also ensures that cgroups can be
+deleted after processes migrated. This allows process migration as well with
+active resources, even though that is not a primary use case.
+
+Whenever RDMA resource charging occurs, owner rdma cgroup is returned to
+the caller. Same rdma cgroup should be passed while uncharging the resource.
+This also allows process migrated with active RDMA resource to charge
+to new owner cgroup for new resource. It also allows to uncharge resource of
+a process from previously charged cgroup which is migrated to new cgroup,
+even though that is not a primary use case.
+
+Resource pool object is created in following situations.
+(a) User sets the limit and no previous resource pool exist for the device
+of interest for the cgroup.
+(b) No resource limits were configured, but IB/RDMA stack tries to
+charge the resource. So that it correctly uncharge them when applications are
+running without limits and later on when limits are enforced during uncharging,
+otherwise usage count will drop to negative.
+
+Resource pool is destroyed if all the resource limits are set to max and
+it is the last resource getting deallocated.
+
+User should set all the limits to max value if it intends to remove/unconfigure
+the resource pool for a particular device.
+
+IB stack honors limits enforced by the rdma controller. When application
+query about maximum resource limits of IB device, it returns minimum of
+what is configured by user for a given cgroup and what is supported by
+IB device.
+
+Following resources can be accounted by rdma controller.
+  uctx Maximum number of User Contexts
+  pd   Maximum number of Protection domains
+  ah   Maximum number of Address handles
+  mr   Maximum number of Memory Regions
+  mw   Maximum number of Memory Windows
+  cq   Maximum number of Completion Queues
+  srq  Maximum number of Shared Receive Queues
+  qp   Maximum number of Queue Pairs
+  flow Maximum number of Flows
+
+
+2. Usage Examples
+-
+
+(a) Configure resource 

[PATCHv12 1/3] rdmacg: Added rdma cgroup controller

2016-08-31 Thread Parav Pandit
Added rdma cgroup controller that does accounting, limit enforcement
on rdma/IB verbs and hw resources.

Added rdma cgroup header file which defines its APIs to perform
charging/uncharging functionality. It also defines APIs for RDMA/IB
stack for device registration. Devices which are registered will
participate in controller functions of accounting and limit
enforcements. It defines rdmacg_device structure to bind IB stack
and RDMA cgroup controller.

RDMA resources are tracked using resource pool. Resource pool is per
device, per cgroup entity which allows setting up accounting limits
on per device basis.

Currently resources are defined by the RDMA cgroup.

Resource pool is created/destroyed dynamically whenever
charging/uncharging occurs respectively and whenever user
configuration is done. It's a tradeoff of memory vs a little more code
space that creates resource pool object whenever necessary, instead of
creating them during cgroup creation and device registration time.

Signed-off-by: Parav Pandit 
---
 include/linux/cgroup_rdma.h   |  66 +
 include/linux/cgroup_subsys.h |   4 +
 init/Kconfig  |  10 +
 kernel/Makefile   |   1 +
 kernel/cgroup_rdma.c  | 664 ++
 5 files changed, 745 insertions(+)
 create mode 100644 include/linux/cgroup_rdma.h
 create mode 100644 kernel/cgroup_rdma.c

diff --git a/include/linux/cgroup_rdma.h b/include/linux/cgroup_rdma.h
new file mode 100644
index 000..6710e28
--- /dev/null
+++ b/include/linux/cgroup_rdma.h
@@ -0,0 +1,66 @@
+/*
+ * Copyright (C) 2016 Parav Pandit 
+ *
+ * This file is subject to the terms and conditions of version 2 of the GNU
+ * General Public License. See the file COPYING in the main directory of the
+ * Linux distribution for more details.
+ */
+
+#ifndef _CGROUP_RDMA_H
+#define _CGROUP_RDMA_H
+
+#include 
+
+enum rdmacg_resource_type {
+   RDMACG_VERB_RESOURCE_UCTX,
+   RDMACG_VERB_RESOURCE_AH,
+   RDMACG_VERB_RESOURCE_PD,
+   RDMACG_VERB_RESOURCE_CQ,
+   RDMACG_VERB_RESOURCE_MR,
+   RDMACG_VERB_RESOURCE_MW,
+   RDMACG_VERB_RESOURCE_SRQ,
+   RDMACG_VERB_RESOURCE_QP,
+   RDMACG_VERB_RESOURCE_FLOW,
+   /*
+* add any hw specific resource here as RDMA_HW_RESOURCE_NAME
+*/
+   RDMACG_RESOURCE_MAX,
+};
+
+#ifdef CONFIG_CGROUP_RDMA
+
+struct rdma_cgroup {
+   struct cgroup_subsys_state  css;
+
+   /*
+* head to keep track of all resource pools
+* that belongs to this cgroup.
+*/
+   struct list_headrpools;
+};
+
+struct rdmacg_device {
+   struct list_headdev_node;
+   struct list_headrpools;
+   char*name;
+};
+
+/*
+ * APIs for RDMA/IB stack to publish when a device wants to
+ * participate in resource accounting
+ */
+int rdmacg_register_device(struct rdmacg_device *device);
+void rdmacg_unregister_device(struct rdmacg_device *device);
+
+/* APIs for RDMA/IB stack to charge/uncharge pool specific resources */
+int rdmacg_try_charge(struct rdma_cgroup **rdmacg,
+ struct rdmacg_device *device,
+ enum rdmacg_resource_type index);
+void rdmacg_uncharge(struct rdma_cgroup *cg,
+struct rdmacg_device *device,
+enum rdmacg_resource_type index);
+void rdmacg_query_limit(struct rdmacg_device *device,
+   int *limits);
+
+#endif /* CONFIG_CGROUP_RDMA */
+#endif /* _CGROUP_RDMA_H */
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index 0df0336a..d0e597c 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -56,6 +56,10 @@ SUBSYS(hugetlb)
 SUBSYS(pids)
 #endif
 
+#if IS_ENABLED(CONFIG_CGROUP_RDMA)
+SUBSYS(rdma)
+#endif
+
 /*
  * The following subsystems are not supported on the default hierarchy.
  */
diff --git a/init/Kconfig b/init/Kconfig
index cac3f09..c7dc64b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1080,6 +1080,16 @@ config CGROUP_PIDS
  since the PIDs limit only affects a process's ability to fork, not to
  attach to a cgroup.
 
+config CGROUP_RDMA
+   bool "RDMA controller"
+   help
+ Provides enforcement of RDMA resources defined by IB stack.
+ It is fairly easy for consumers to exhaust RDMA resources, which
+ can result in resource unavailability to other consumers.
+ RDMA controller is designed to stop this from happening.
+ Attaching processes with active RDMA resources to the cgroup
+ hierarchy is allowed even if that can cross the hierarchy's limit.
+
 config CGROUP_FREEZER
bool "Freezer controller"
help
diff --git a/kernel/Makefile b/kernel/Makefile
index e2ec54e..d2b76d0 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -67,6 +67,7 @@ obj-$(CONFIG_COMPAT) += compat.o
 obj-$(CONFIG_CGROUPS) += cgroup.o
 

[PATCHv12 0/3] rdmacg: IB/core: rdma controller support

2016-08-31 Thread Parav Pandit
rdmacg: IB/core: rdma controller support

Patch is generated and tested against below Doug's linux-rdma
git tree.

URL: git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git
Branch: master

Patchset is also compiled and tested against below Tejun's cgroup tree
using cgroup v2 mode.
URL: git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git
Branch: master

Overview:
Currently user space applications can easily take away all the rdma
device specific resources such as AH, CQ, QP, MR etc. As a result, other
applications in other cgroups or kernel space ULPs may not even get a chance
to allocate any rdma resources. This results in service unavailability.

RDMA cgroup addresses this issue by allowing resource accounting,
limit enforcement on per cgroup, per rdma device basis.

RDMA uverbs layer will enforce limits on well defined RDMA verb
resources without any HCA vendor device driver involvement.

RDMA uverbs layer will not do limit enforcement of HCA hw vendor
specific resources. Instead rdma cgroup provides set of APIs
through which vendor specific drivers can do resource accounting
by making use of rdma cgroup.

Resource limit enforcement is hierarchical.

When a process is migrated with active RDMA resources, rdma cgroup
continues to uncharge the original cgroup for already allocated resources.
New resources are charged to the process's current cgroup. In other words,
after migration new resources are charged to the new cgroup, and old
resources are correctly uncharged from the old cgroup.
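
As a rough illustration of what "hierarchical" enforcement means here, the following userspace sketch charges one resource against a group and all of its ancestors, and undoes the partial charge if any level is already at its limit. The struct group type, the try_charge()/uncharge() names and the -EAGAIN return value are chosen for this example; the actual controller operates on struct rdma_cgroup and per-device resource pools under a mutex.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup with a single counted resource. */
struct group {
	struct group *parent;	/* NULL for the root */
	int usage;
	int max;
};

/* Drop one charge from 'g' upwards, stopping before 'stop' (NULL = whole chain). */
void uncharge(struct group *g, struct group *stop)
{
	for (; g != stop; g = g->parent) {
		assert(g->usage > 0);
		g->usage--;
	}
}

/* Charge 'g' and every ancestor, or nothing at all. */
int try_charge(struct group *g)
{
	struct group *p;

	for (p = g; p; p = p->parent) {
		if (p->usage >= p->max) {
			/* Roll back the levels that were already charged. */
			uncharge(g, p);
			return -EAGAIN;
		}
		p->usage++;
	}
	return 0;
}
```

With a root limited to 2 and a child limited to 5, the third try_charge() on the child fails at the root and leaves both usage counts at 2.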

Changes from v11:
  * (To address comments from Tejun)
   1. Added information in Documentation about nested-keyed file
  * (To address comments from Rami Rosen)
   1. Corrected typo errors in Documentation
  * (To address comments from Leon Romanovsky)
   1. Changed cgroup.c copyright to match with other files of the IB stack
  which is dual license GPLv2 + BSD

Changes from v10:
  * (To address comments from Tejun, Christoph)
   1. Removed unused rpool_list_lock from rdma_cgroup structure.
   2. Moved rdma resource definition to rdma cgroup instead of IB stack
   3. Added prefix rdmacg to static instances
   4. Simplified locking with single mutex for all operations
   5. Following approach of atomically allocating object and
  charging resource in hierarchy
   6. Code simplification due to single lock
   7. Using for_each_set_bit API for bit operation
   8. Renamed list heads as Objects instead of _head
   9. Renamed list entries as _node instead of _list.
  10. Made usage_num to 64 bit to avoid overflow and to avoid 
  additional code to track non zero number of usage counts.
  * (To address comments from Doug)
   1. Added copyright and GPLv2 license

Changes from v9:
  * (To address comments from Tejun)
   1. Included clear documentation of resources.
   2. Fixed issue of race condition of process migration during
  charging stage.
   3. Fixed comments and code to adhere to CodingStyle.
   4. Simplified and removed support to charge/uncharge multiple
  resource.
   5. Fixed replaced refcnt with usage_num that tracks how many
  resources are unused to trigger freeing the object.
   6. Simplified locking scheme to use single spin lock for whole
  subsystem.

Changes from v8:
 * Fixed compilation error.
 * Fixed warning reported by checkpatch script.

Changes from v7:
 * (To address comments from Haggai)
   1. Removed max_limit from query_limit function as it is
  unnecessary.
   2. Kept existing printk as it is to instead of replacing all
  with pr_warn except newly added printk.

Changes from v6:
 * (To address comments from Haggai)
   1. Made functions as void wherever necessary.
   2. Code cleanup related to correting few spelling mistakes
  in comments, correcting comments to reflect the code.
   3. Removed max_count parameter from query_limit as its not
  necessary.
   4. Fixed printk to pr_warn.
   5. Removed dependency on pd, instead relying on ib_dev.
   6. Added more documentation to reflect that IB stack honors
  configured limit during query_device operation.
   7. Added pr_warn and avoided system crash in case of
  IB stack or rdma cgroup bug.
 * (To address comments from Leon)
   1. Removed #ifdef CONFIG_CGROUP_RDMA from .c files and added
  necessary dummy functions in header file.
   2. Removed unwanted forward declaration.
 * Fixed uncharging to rdma controller after resource is released
   from verb layer, instead of uncharging first. This ensures that
   uncharging doesn't complete while resource is still allocated.
 
Changes from v5:
 * (To address comments from Tejun)
   1. Removed two type of resource pool, made is single type (as Tejun
  described in past comment)
   2. Removed match tokens and have array definition like "qp", "mr",
  "cq" etc.
   3. Wrote small parser and avoided match_token API as that won't work
  due to different array definitions
   4. Removed one-off remove API to unconfigure 

[PATCHv12 2/3] IB/core: added support to use rdma cgroup controller

2016-08-31 Thread Parav Pandit
Added support APIs for IB core to register/unregister every IB/RDMA
device with rdma cgroup for tracking verbs and hw resources.
IB core registers with rdma cgroup controller.
Added support APIs for uverbs layer to make use of rdma controller.
Added uverbs layer to perform resource charge/uncharge functionality.
Added support during query_device uverb operation to ensure it
returns resource limits by honoring rdma cgroup configured limits.

Signed-off-by: Parav Pandit 
---
 drivers/infiniband/core/Makefile|  1 +
 drivers/infiniband/core/cgroup.c| 93 +
 drivers/infiniband/core/core_priv.h | 41 
 drivers/infiniband/core/device.c| 10 
 4 files changed, 145 insertions(+)
 create mode 100644 drivers/infiniband/core/cgroup.c

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index edaae9f..e426ac8 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -13,6 +13,7 @@ ib_core-y :=  packer.o ud_header.o verbs.o cq.o rw.o sysfs.o \
multicast.o mad.o smi.o agent.o mad_rmpp.o
 ib_core-$(CONFIG_INFINIBAND_USER_MEM) += umem.o
 ib_core-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o umem_rbtree.o
+ib_core-$(CONFIG_CGROUP_RDMA) += cgroup.o
 
 ib_cm-y := cm.o
 
diff --git a/drivers/infiniband/core/cgroup.c b/drivers/infiniband/core/cgroup.c
new file mode 100644
index 000..ffe7234
--- /dev/null
+++ b/drivers/infiniband/core/cgroup.c
@@ -0,0 +1,93 @@
+/*
+ * Copyright (C) 2016 Parav Pandit 
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "core_priv.h"
+
+/*
+ * resource table definition as to be seen by the user.
+ * Need to add entries to it when more resources are
+ * added/defined at IB verb/core layer.
+ */
+
+/**
+ * ib_device_register_rdmacg - register with rdma cgroup.
+ * @device: device to register to participate in resource
+ *  accounting by rdma cgroup.
+ *
+ * Register with the rdma cgroup. Should be called before
+ * exposing rdma device to user space applications to avoid
+ * resource accounting leak.
+ * Returns 0 on success or otherwise failure code.
+ */
+int ib_device_register_rdmacg(struct ib_device *device)
+{
+   device->cg_device.name = device->name;
+   return rdmacg_register_device(&device->cg_device);
+}
+
+/**
+ * ib_device_unregister_rdmacg - unregister with rdma cgroup.
+ * @device: device to unregister.
+ *
+ * Unregister with the rdma cgroup. Should be called after
+ * all the resources are deallocated, and after a stage when any
+ * other resource allocation by user application cannot be done
+ * for this device to avoid any leak in accounting.
+ */
+void ib_device_unregister_rdmacg(struct ib_device *device)
+{
+   rdmacg_unregister_device(&device->cg_device);
+}
+
+int ib_rdmacg_try_charge(struct ib_rdmacg_object *cg_obj,
+                         struct ib_device *device,
+                         enum rdmacg_resource_type resource_index)
+{
+   return rdmacg_try_charge(&cg_obj->cg, &device->cg_device,
+                            resource_index);
+}
+EXPORT_SYMBOL(ib_rdmacg_try_charge);
+
+void ib_rdmacg_uncharge(struct ib_rdmacg_object *cg_obj,
+   struct ib_device *device,
+   enum rdmacg_resource_type resource_index)
+{
+   rdmacg_uncharge(cg_obj->cg, &device->cg_device,
+                   resource_index);
+}
+EXPORT_SYMBOL(ib_rdmacg_uncharge);
+
+void ib_rdmacg_query_limit(struct ib_device *device, int *limits)
+{
+   rdmacg_query_limit(&device->cg_device, limits);
+}

Re: [PATCH v3] docs-rst: ignore arguments on macro definitions

2016-08-31 Thread Markus Heiser

On 29.08.2016 at 17:36, Jani Nikula wrote:

> On Mon, 29 Aug 2016, Mauro Carvalho Chehab  wrote:
>> On Mon, 29 Aug 2016 16:12:39 +0200,
>> Markus Heiser wrote:
>> 
>>> On 29.08.2016 at 15:13, Mauro Carvalho Chehab wrote:
>>> 
>>>> A macro definition is mapped via .. c:function:: at the
>>>> ReST markup when using the following kernel-doc tag:
>>>>
>>>>    /**
>>>>     * DMX_FE_ENTRY - Casts elements in the list of registered
>>>>     *                front-ends from the generic type struct list_head
>>>>     *                to the type * struct dmx_frontend
>>>>     *
>>>>     * @list: list of struct dmx_frontend
>>>>     */
>>>>    #define DMX_FE_ENTRY(list) \
>>>>            list_entry(list, struct dmx_frontend, connectivity_list)
>>>>
>>>> However, unlike a function description, the arguments of a macro
>>>> don't contain the data type.
>>>>
>>>> This causes warnings when enabling Sphinx in nitpick mode,
>>>> like this one:
>>>>    ./drivers/media/dvb-core/demux.h:358: WARNING: c:type reference target not found: list
>>> 
>>> I think this is a drawback of Sphinx's C-domain, which uses the
>>> function directive for macros as well. From the function documentation:
>>> 
>>> """This is also used to describe function-like preprocessor
>>>macros. The names of the arguments should be given so
>>>they may be used in the description."""
>>> 
>>> I am thinking about fixing the nitpick message for macros (aka the
>>> function directive) in the C-domain extension we already have.
>> 
>> Yeah, that could produce a better output, if it is doable.
>> 
>>> 
>>> But for this, I need a rule to distinguish between macros
>>> and functions ... is an uppercase macro name a good
>>> rule for suppressing the nitpick message?
>> 
>> No. There are lots of macros in lowercase. I never did any stats on
>> that, but I guess that we actually have way more such macros in
>> lowercase.
>> 
>>> Any other suggestions?
>> 
>> I guess the best thing is to check if the type is empty, just like
>> in this patch. Macros are always:
>>  foo(arg1, arg2, arg3, ...)

Yes, it is so clear, ... I'm a gawk ;-)

>> while functions always have some type (which could be as complex as
>> a function pointer). So, if all arguments match this regex:
>>  \s*\S+\s*
>> then it is a macro. Otherwise, it is a function.
>> 
>> There's no way for the C domain to distinguish between a macro and
>> a function when the number of arguments is zero, but in that case
>> it doesn't really matter.
> 
> What does Sphinx say if you add "void" as the type? Or a fake
> "macroparam" type?

Hi Jani, sorry for my late reply,

I haven't tested your suggestion, but since *void* is in the list
of stop-words:

    # These C types aren't described anywhere, so don't try to create
    # a cross-reference to them
    stopwords = set((
        'const', 'void', 'char', 'wchar_t', 'int', 'short',
        'long', 'float', 'double', 'unsigned', 'signed', 'FILE',
        'clock_t', 'time_t', 'ptrdiff_t', 'size_t', 'ssize_t',
        'struct', '_Bool',
    ))

I think it will work the way you expect.
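
To make the stop-word behaviour concrete, here is a minimal sketch. It is
not the actual Sphinx code: the helper name xref_targets and the identifier
regex are made up for illustration, and the real C-domain does more work
when parsing a type. It only shows why a "void" type would stay silent
while a bare macro argument such as "list" ends up as a c:type reference
target:

```python
import re

# Same stop-word list as quoted above.
STOPWORDS = {
    'const', 'void', 'char', 'wchar_t', 'int', 'short',
    'long', 'float', 'double', 'unsigned', 'signed', 'FILE',
    'clock_t', 'time_t', 'ptrdiff_t', 'size_t', 'ssize_t',
    'struct', '_Bool',
}

# Hypothetical tokenizer: pick out C identifiers from a type string.
_IDENT_RE = re.compile(r'[A-Za-z_]\w*')


def xref_targets(ctype):
    """Return the identifiers in ctype that would get a cross-reference.

    Identifiers found in STOPWORDS are skipped, so no reference (and no
    nitpick warning) is produced for them.
    """
    return [tok for tok in _IDENT_RE.findall(ctype) if tok not in STOPWORDS]
```

With this sketch, xref_targets('void') yields no targets, while
xref_targets('list') yields 'list', which matches the warning quoted
earlier in this thread.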

However, I would prefer to fix it in the C-domain, using
Mauro's suggestion on argument parsing. IMHO adding a void
type to the reST signature of a macro is not the best
solution. It will result in unusual output and does
not fix what is wrong in Sphinx's C-domain (there is also
a drawback in the index, where a function-like macro is
listed as a function, not as a macro).

I will try to eliminate these drawbacks in the C-domain and
send a patch series that we can discuss further.
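
As a rough illustration of Mauro's argument check (only a sketch under
assumptions: the function name looks_like_macro is hypothetical, the naive
split on commas is a simplification, and treating "(void)" like an empty
list is my own choice for the zero-argument case he mentions):

```python
import re

# Mauro's regex: an argument that is a single token, i.e. has no type.
_UNTYPED_ARG_RE = re.compile(r'\s*\S+\s*')


def looks_like_macro(arglist):
    """Guess whether a signature's argument list belongs to a macro.

    arglist is the text between the parentheses. Every argument of a
    macro is a bare name, while a function argument carries a type and
    therefore contains inner whitespace.
    """
    stripped = arglist.strip()
    # With zero arguments macro and function cannot be told apart;
    # this sketch simply reports "function" for '' and 'void'.
    if stripped in ('', 'void'):
        return False
    # Naive split: a function-pointer argument is cut into pieces, but
    # its first piece still contains whitespace, so the result holds.
    return all(_UNTYPED_ARG_RE.fullmatch(arg) for arg in arglist.split(','))
```

For example, looks_like_macro('list') is true for DMX_FE_ENTRY(list), and
looks_like_macro('struct ib_device *device, int *limits') is false. A typed
argument written without any whitespace (such as 'int*p') would be
misclassified by this simple check.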

-- Markus --


> 
> If those hacks don't help, Mauro's suggestion seems sane.
> 
> BR,
> Jani.
> 
> 
> 
>> 
>> Thanks,
>> Mauro
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-doc" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> -- 
> Jani Nikula, Intel Open Source Technology Center
