Re: [RFC6 PATCH v6 00/21] ILP32 for ARM64

2016-04-08 Thread Arnd Bergmann
On Friday 08 April 2016, Andrew Pinski wrote:
> On Thu, Apr 7, 2016 at 5:18 AM, Adam Borowski  wrote:
> > On Wed, 6 Apr 2016, Geert Uytterhoeven wrote:
> >> On Wed, Apr 6, 2016 at 12:08 AM, Yury Norov wrote:
> >>>  v6:
> >>>  - time_t, __kernel_off_t and other types turned to be 32-bit
> >>>    for compatibility reasons (after v5 discussion);
> >
> > Introducing a new arch today with y2038 problems is not a good idea.
> > Linus said so with appropriately pointy words in 2011.

This was before we made the decision to fix the y2038 problem for all
architectures.

> This is the third time we had this discussion on time_t for ILP32.  I
> originally had it as 32bit, then Catalin suggested I change it to
> 64bit and then Arnd (with his work for 2038 issue on 32bit arch) said
> ILP32 should match all other 32bit targets and the other 64bit time_t
> be fixed by the current work he was working on.  Now you are
> suggesting we change it again.
> Arnd can you please comment more on why we want 32bit time_t instead
> of the 64bit one?  I know there was some POSIX (or was it C90)
> violation but I suspect there is an easy way to workaround this inside
> the kernel but the discussion to move over to 32bit time_t was already
> made by the time I started to look into that.

x32 still runs into new problems today, and will continue to have problems
with newly added drivers that pass time_t (or other __kernel_long_t) arguments
through ioctl.

To avoid having to audit every new driver for interfaces that behave
differently based on the __kernel_long_t definition, arm64 is not following
the same route as x86 here and instead uses the normal 32-bit ABI like
any other architecture. This means we use 32-bit time_t, aio_context_t,
size_t and clock_t and share the system call implementation with the
compat handling for arm (aarch32) mode.
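
A minimal sketch of why an ioctl argument that embeds time_t is ABI-sensitive.
The structure names and the 'code' member are hypothetical, made up for
illustration; the two structs only mimic how one driver structure would be
laid out with a 32-bit versus a 64-bit time_t/__kernel_long_t:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver event as laid out with 32-bit time_t and long. */
struct drv_event_time32 {
	int32_t  tv_sec;
	int32_t  tv_nsec;
	uint32_t code;
};

/* The same hypothetical event with 64-bit time_t/__kernel_long_t (x32 model). */
struct drv_event_time64 {
	int64_t  tv_sec;
	int64_t  tv_nsec;
	uint32_t code;
};

/* Returns nonzero only if one ioctl handler could serve both layouts as-is. */
static int layouts_match(void)
{
	return sizeof(struct drv_event_time32) == sizeof(struct drv_event_time64) &&
	       offsetof(struct drv_event_time32, code) ==
	       offsetof(struct drv_event_time64, code);
}
```

Because the two layouts differ in both size and member offsets, a driver that
takes such a structure needs a separate conversion path for each layout, which
is the per-driver audit described above.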

Once we have the interfaces for 64-bit time_t in place in the kernel,
we will be able to rebuild glibc on all 32-bit architectures including
arm and arm64/ilp32 that way.

The POSIX and C99 incompatibility you mention is about struct timespec,
which uses 'long' as the type for the tv_nsec member. This is vaguely
related to the issue of 64-bit time_t, but is not the reason for
starting out with 32-bit time_t for the new ABI here.

[side note:
How to precisely handle tv_nsec on 32-bit architectures is still an open
issue that will have to be solved when we nail down the new system call
interfaces:
The issue specifically is what happens when the upper half of the
second 64-bit word in struct timespec argument passed into a system
call is nonzero: the normal 64-bit syscalls must return an error,
while the 32-bit user space expects the kernel to ignore the upper bits.
This means something between the application and the native system call
has to clear the bits, and this can either be done by copying the
data inside of glibc (as done on x32) or by adding an extra system
call entry point in the kernel.]
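
The side note can be sketched roughly as follows. This is only an
illustration of the "clear the bits" step, not a statement about where it will
end up living; the helper names are hypothetical, and the second 64-bit word
of the timespec is modelled as a plain int64_t:

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Check a native 64-bit syscall would apply to the full 64-bit word. */
static int nsec_is_valid(int64_t tv_nsec)
{
	return tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC;
}

/* Keep only the low 32 bits, which is all 32-bit user space filled in. */
static int64_t nsec_clear_upper(int64_t tv_nsec)
{
	return (int64_t)((uint64_t)tv_nsec & 0xffffffffULL);
}
```

With stale data in the upper half, nsec_is_valid() rejects the raw word but
accepts the result of nsec_clear_upper(), which is the difference between the
two behaviours described above.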

> >> We're already closer to the (future) y2038 than to the (past) introduction
> >> of LP64...
> >>
> >> These unfixable legacy applications have been spreading through x32 to
> >> the shiny new arm64 server architecture (does ppc64el also have an ILP32
> >> mode, or is it planned)? Lots of resources are spent on maintaining the
> >> status quo, instead of on fixing the real problems.
> >
> > As an x32 (userland) porter, I can tell you that time_t!=long _did_ cause
> > non-trivial amounts of work.  But that work is already done (at least in
> > Debian), so you might as well benefit from it.
> 
> There is actually private code out there which uses timespec and
> timeval to pass time over the wire; yes I know bad coding style and
> all but they did it that way.  This is code which was working for x86
> and we are porting it to ARM64; a data center code by the way; not
> some networking code even.  This means they have not ported the code
> to fully 64bit yet and they might never.

This code will run into the same problem on arm64/ilp32 when built against
a future libc implementation that defines time_t as 64-bit, but at least
the glibc maintainers so far plan to leave this as a per-application
option for the foreseeable future: even on a system that uses 64-bit time_t
in user space and kernel by default, you should be able to build an
application using a 32-bit time_t.
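
For code that sends timestamps over the wire, one way to become independent
of the time_t width is to serialize with fixed-width fields. The following is
a sketch only; the 12-byte big-endian layout and the function names are
assumptions chosen for illustration, not an existing protocol:

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define WIRE_TS_LEN 12	/* 8 bytes of seconds + 4 bytes of nanoseconds */

/* Encode as big-endian 64-bit seconds followed by 32-bit nanoseconds. */
static void wire_put_timespec(uint8_t out[WIRE_TS_LEN],
			      const struct timespec *ts)
{
	uint64_t sec = (uint64_t)(int64_t)ts->tv_sec;
	uint32_t nsec = (uint32_t)ts->tv_nsec;
	int i;

	for (i = 0; i < 8; i++)
		out[i] = (uint8_t)(sec >> (56 - 8 * i));
	for (i = 0; i < 4; i++)
		out[8 + i] = (uint8_t)(nsec >> (24 - 8 * i));
}

/* Decode the same layout back into whatever struct timespec the ABI uses. */
static void wire_get_timespec(struct timespec *ts,
			      const uint8_t in[WIRE_TS_LEN])
{
	uint64_t sec = 0;
	uint32_t nsec = 0;
	int i;

	for (i = 0; i < 8; i++)
		sec = (sec << 8) | in[i];
	for (i = 0; i < 4; i++)
		nsec = (nsec << 8) | in[8 + i];
	ts->tv_sec = (time_t)(int64_t)sec;
	ts->tv_nsec = (long)nsec;
}
```

The encoded bytes are the same whether the sender was built with a 32-bit or
a 64-bit time_t, so only the in-memory struct differs between ABIs.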

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 04/13] task_isolation: add initial support

2016-04-08 Thread Chris Metcalf

On 4/8/2016 9:56 AM, Frederic Weisbecker wrote:

> On Wed, Mar 09, 2016 at 02:39:28PM -0500, Chris Metcalf wrote:
>>   TL;DR: Let's make an explicit decision about whether task isolation
>>   should be "persistent" or "one-shot".  Both have some advantages.
>>   =
>>
>> An important high-level issue is how "sticky" task isolation mode is.
>> We need to choose one of these two options:
>>
>> "Persistent mode": A task switches state to "task isolation" mode
>> (kind of a level-triggered analogy) and stays there indefinitely.  It
>> can make a syscall, take a page fault, etc., if it wants to, but the
>> kernel protects it from incurring any further asynchronous interrupts.
>> This is the model I've been advocating for.
>
> But then in this mode, what happens when an interrupt triggers?


So here I'm taking "interrupt" to mean an external, asynchronous
interrupt, from another core or device, or asynchronously triggered
on the local core, like a timer interrupt.  By contrast I use "exception"
or "fault" to refer to synchronous, locally-triggered interruptions.

So for interrupts, the short answer is, it's a bug! :-)

An interrupt could be a kernel bug, in which case we consider it a
"true" bug.  This could be a timer interrupt occurring even after the
task isolation code thought there were none pending, or a hardware
device that incorrectly distributes interrupts to a task-isolation
cpu, or a global IPI that should be sent to fewer cores, or a kernel
TLB flush that could be deferred until the task-isolation task
re-enters the kernel later, etc.  Regardless, I'd consider it a kernel
bug.  I'm sure there are more such bugs that we can continue to fix
going forward; it depends on how arbitrary you want to allow code
running on other cores to be.  For example, can another core unload a
kernel module without interrupting a task-isolation task?  Not right now.

Or, it could be an application bug: the standard example is if you
have an application with task-isolated cores that also does occasional
unmaps on another thread in the same process, on another core.  This
causes TLB flush interrupts under application control.  The
application shouldn't do this, and we tell our customers not to build
their applications this way.  The typical way we encourage our
customers to arrange this kind of "multi-threading" is by having a
pure memory API between the task isolation threads and what are
typically "control" threads running on non-task-isolated cores.  The
two types of threads just both mmap some common, shared memory but run
as different processes.

So what happens if an interrupt does occur?

In the "base" task isolation mode, you just take the interrupt, then
wait to quiesce any further kernel timer ticks, etc., and return to
the process.  This at least limits the damage to being a single
interruption rather than potentially additional ones, if the interrupt
also caused timers to get queued, etc.

If you enable "strict" mode, we disable task isolation mode for that
core and deliver a signal to it.  This lets the application know that
an interrupt occurred, and it can take whatever kind of logging or
debugging action it wants to, re-enable task isolation if it wants to
and continue, or just exit or abort, etc.

If you don't enable "strict" mode, but you do have
task_isolation_debug enabled as a boot flag, you will at least get a
console dump with a backtrace and whatever other data we have.
(Sometimes the debug info actually includes a backtrace of the
interrupting core, if it's an IPI or TLB flush from another core,
which can be pretty useful.)


>> "One-shot mode": A task requests isolation via prctl(), the kernel
>> ensures it is isolated on return from the prctl(), but then as soon as
>> it enters the kernel again, task isolation is switched off until
>> another prctl is issued.  This is what you recommended in your last
>> email.
>
> No I think we can issue syscalls for example. But asynchronous
> interruptions such as exceptions (actually somewhat synchronous but can
> be unexpected) and interrupts are what we want to avoid.


Hmm, so I think I'm not really understanding what you are suggesting.

We're certainly in agreement that avoiding interrupts and exceptions
is important.  I'm arguing that the way to deal with them is to
generate appropriate signals/printks, etc.  I'm not actually sure what
you're recommending we do to avoid exceptions.  Since they're
synchronous and deterministic, we can't really avoid them if the
program wants to issue them.  For example, mmap() some anonymous
memory and then start running, and you'll take exceptions each time
you touch a page in that mapped region.  I'd argue it's an application
bug; one should enable "strict" mode to catch and deal with such bugs.
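
As a sketch of how an application might avoid those first-touch exceptions,
assuming it knows its working set in advance: map the memory and write to
every page before entering the isolated section. The helper name is
hypothetical, and this only covers first-touch faults on the mapping itself:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Map anonymous memory and write to every page up front, so the page
 * faults are taken here rather than later in the isolated section.
 * Returns NULL if the mapping could not be created.
 */
static void *map_prefaulted(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return NULL;
	memset(p, 0, len);	/* touches each page, forcing the faults now */
	return p;
}
```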

(Typically the recommendation is to do an mlockall() before starting
task isolation mode, to handle the case of page faults.  But you can
do that and still be screwed by another thread in your process doing a
fork() and then your pages end up 

Re: [PATCH] [linux-next] Doc: networking: Fix typo in dsa

2016-04-08 Thread Andrew Lunn
On Sat, Apr 09, 2016 at 12:00:25AM +0900, Masanari Iida wrote:
> This patch fixes typos in Documentation/networking/dsa.
> 
> Signed-off-by: Masanari Iida 

Reviewed-by: Andrew Lunn 

Thanks
Andrew

> ---
>  Documentation/networking/dsa/bcm_sf2.txt | 2 +-
>  Documentation/networking/dsa/dsa.txt | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/networking/dsa/bcm_sf2.txt b/Documentation/networking/dsa/bcm_sf2.txt
> index d999d0c1c5b8..eba3a2431e91 100644
> --- a/Documentation/networking/dsa/bcm_sf2.txt
> +++ b/Documentation/networking/dsa/bcm_sf2.txt
> @@ -38,7 +38,7 @@ Implementation details
>  ==
>  
>  The driver is located in drivers/net/dsa/bcm_sf2.c and is implemented as a DSA
> -driver; see Documentation/networking/dsa/dsa.txt for details on the subsytem
> +driver; see Documentation/networking/dsa/dsa.txt for details on the subsystem
>  and what it provides.
>  
>  The SF2 switch is configured to enable a Broadcom specific 4-bytes switch tag
> diff --git a/Documentation/networking/dsa/dsa.txt b/Documentation/networking/dsa/dsa.txt
> index 3b196c304b73..36f905d9c77c 100644
> --- a/Documentation/networking/dsa/dsa.txt
> +++ b/Documentation/networking/dsa/dsa.txt
> @@ -334,7 +334,7 @@ more specifically with its VLAN filtering portion when configuring VLANs on top
>  of per-port slave network devices. Since DSA primarily deals with
>  MDIO-connected switches, although not exclusively, SWITCHDEV's
>  prepare/abort/commit phases are often simplified into a prepare phase which
> -checks whether the operation is supporte by the DSA switch driver, and a commit
> +checks whether the operation is supported by the DSA switch driver, and a commit
>  phase which applies the changes.
>  
>  As of today, the only SWITCHDEV objects supported by DSA are the FDB and VLAN
> -- 
> 2.8.0
> 


Re: Kernel docs: muddying the waters a bit

2016-04-08 Thread Markus Heiser
Hi kernel-doc authors,

motivated by this MT, I implemented a toolchain to migrate the kernel’s 
DocBook XML documentation to reST markup. 

It converts 99% of the docs well. To get an impression of how the
kernel docs could benefit, visit my sphkerneldoc project page
on GitHub:

  http://return42.github.io/sphkerneldoc/

The sources are available at:

  https://github.com/return42/sphkerneldoc

The work is underway, suggestions are welcome!

.. have a nice weekend ..

--M--


On 13.03.2016 at 16:33, Markus Heiser wrote:

> 
> On 10.03.2016 at 16:21, Mauro Carvalho Chehab wrote:
> 
>> On Thu, 10 Mar 2016 12:25:58 +0200
>> Jani Nikula wrote:
>> 
>>> TL;DR? Skip to the last paragraph.
>>> 
>>> On Wed, 09 Mar 2016, Mauro Carvalho Chehab  wrote:
>>>> I guess the conversion to asciidoc format is now in good shape,
>>>> at least to demonstrate that it is possible to use this format for the
>>>> media docbook. Still, there are lots of broken references.
>>> 
>>> Getting references right with asciidoc is a big problem in the
>>> kernel-doc side. As I wrote before, the proofs of concept only worked
>>> because everything was processed as one big file (via includes). The
>>> Asciidoctor inter-document references won't help, because we won't know
>>> the target document name while processing kernel-doc.
>> 
>> I was able to produce chunked htmls here with:
>> 
>>  asciidoctor -b docbook45 media_api.adoc
>>  xmlto -o html-dir html media_api.xml
>> 
>> The results are at:
>>  
>> https://mchehab.fedorapeople.org/media-kabi-docs-test/asciidoc_tests/chunked/
>> 
>> But yeah, all references seem to be broken there. It could be due to some
>> conversion issue (I didn't actually try to check what's wrong there),
>> but I think that there's something not ok with docbook45
>> output for multi-part documents (on both AsciiDoc and Asciidoctor).
>> 
>>> Sphinx is massively better at handling cross references for
>>> kernel-doc. We can use domains (C language) and roles (e.g. functions,
>>> types, etc.) for the references, which provide kind of
>>> namespaces. Sphinx warns for referencing non-existing targets, but
>>> doesn't generate broken links in the result like Asciidoctor does.
>>> 
>>> For example, in the documentation for a function that has struct foo as
>>> parameter or return type, a cross reference to struct foo is added
>>> automagically, but only if documentation for struct foo actually
>>> exists. In Asciidoctor, we would have to blindly generate the references
>>> ourselves, and try to resolve broken links ourselves by somehow
>>> post-processing the result.
>>> 
>>>> Yet, from my side, if we're willing to get rid of DocBook, then
>>>> Asciidoctor seems to be the *only* alternative so far to parse the
>>>> complex media documents.
>>> 
>>> I think you mean, "get rid of DocBook as source format", not altogether?
>>> I'm yet to be convinced we could rely on Asciidoctor's native formats.
>> 
>> What I mean is that, right now, I see only two alternatives for the
>> media uAPI documentation:
>>  1) keep using DocBook;
>>  2) AsciiDoc/Asciidoctor.
>> 
>> Sphinx doesn't have what's needed to support the complexity of the
>> media books, especially since cell span seems to be possible only
>> by using asciiArt formats. Writing a big table using asciiArt is
>> something that is a *real pain*. Also, as tested, if the table is
>> too big, it fails to parse such asciiArt tables. So, while Sphinx
>> doesn't have a decent way to describe tables, we can't use it.
> 
> 
> Huge tables and cell-spans are the *real pain* ;-) ... with sphinx-doc,
> (mostly) you have more than one choice .. e.g. import csv tables .. 
> but this should be discussed by example ...
> 
> 
>> If it starts implementing it, then we can check if the other
>> features used by the media documentation are also supported.
>> Probably, multi-part books would be another pain with Sphinx.
>> We have actually 4 books inside a common body. A few chapters
>> (like book licensing, bibliography, error codes) are shared
>> by all 4 documents.
>> 
>> But, so far, I can't see any way to port media books without
>> lots and lots of work to develop new features in the Sphinx code.
> 
> 
> maybe I can help you ...
> 
> 
>>> The toolchain gets faster, easier to debug and simplified a lot with
>>> DocBook out of the equation completely. Sphinx itself is stable, widely
>>> available, and well documented. IMO there's sufficient native output
>>> format support. There are plenty of really nice extensions
>>> available. There's a possibility of doing kernel-doc as an extension in
>>> the future (either by calling current kernel-doc from the extension or
>>> by rewriting it).
>> 
>> Well, if we go to Sphinx for kernel-doc, that means that we'll need
>> 2 different tools for the documentation:
>>  - Sphinx for kernel-doc
>>  - either DocBook or 

[PATCH] [linux-next] Doc: networking: Fix typo in dsa

2016-04-08 Thread Masanari Iida
This patch fixes typos in Documentation/networking/dsa.

Signed-off-by: Masanari Iida 
---
 Documentation/networking/dsa/bcm_sf2.txt | 2 +-
 Documentation/networking/dsa/dsa.txt | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/dsa/bcm_sf2.txt b/Documentation/networking/dsa/bcm_sf2.txt
index d999d0c1c5b8..eba3a2431e91 100644
--- a/Documentation/networking/dsa/bcm_sf2.txt
+++ b/Documentation/networking/dsa/bcm_sf2.txt
@@ -38,7 +38,7 @@ Implementation details
 ==
 
 The driver is located in drivers/net/dsa/bcm_sf2.c and is implemented as a DSA
-driver; see Documentation/networking/dsa/dsa.txt for details on the subsytem
+driver; see Documentation/networking/dsa/dsa.txt for details on the subsystem
 and what it provides.
 
 The SF2 switch is configured to enable a Broadcom specific 4-bytes switch tag
diff --git a/Documentation/networking/dsa/dsa.txt b/Documentation/networking/dsa/dsa.txt
index 3b196c304b73..36f905d9c77c 100644
--- a/Documentation/networking/dsa/dsa.txt
+++ b/Documentation/networking/dsa/dsa.txt
@@ -334,7 +334,7 @@ more specifically with its VLAN filtering portion when configuring VLANs on top
 of per-port slave network devices. Since DSA primarily deals with
 MDIO-connected switches, although not exclusively, SWITCHDEV's
 prepare/abort/commit phases are often simplified into a prepare phase which
-checks whether the operation is supporte by the DSA switch driver, and a commit
+checks whether the operation is supported by the DSA switch driver, and a commit
 phase which applies the changes.
 
 As of today, the only SWITCHDEV objects supported by DSA are the FDB and VLAN
-- 
2.8.0



[PATCH] [linux-next] clk:st: Fix typo in st,clkgen.txt

2016-04-08 Thread Masanari Iida
This patch fixes typos in st,clkgen.txt

Signed-off-by: Masanari Iida 
---
 Documentation/devicetree/bindings/clock/st/st,clkgen.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/clock/st/st,clkgen.txt b/Documentation/devicetree/bindings/clock/st/st,clkgen.txt
index 78978f1f5158..bde199bda22a 100644
--- a/Documentation/devicetree/bindings/clock/st/st,clkgen.txt
+++ b/Documentation/devicetree/bindings/clock/st/st,clkgen.txt
@@ -1,7 +1,7 @@
 Binding for a Clockgen hardware block found on
 certain STMicroelectronics consumer electronics SoC devices.
 
-A Clockgen node can contain pll, diviser or multiplexer nodes.
+A Clockgen node can contain pll, divider or multiplexer nodes.
 
 We will find only the base address of the Clockgen, this base
 address is common of all subnode.
@@ -40,7 +40,7 @@ address is common of all subnode.
};
 
 This binding uses the common clock binding[1].
-Each subnode should use the binding discribe in [2]..[7]
+Each subnode should use the binding describe in [2]..[7]
 
 [1] Documentation/devicetree/bindings/clock/clock-bindings.txt
 [2] Documentation/devicetree/bindings/clock/st,clkgen-divmux.txt
-- 
2.8.0



Re: [PATCH v3] ARM64: ACPI: Update documentation for latest specification version

2016-04-08 Thread Al Stone
On 04/08/2016 07:12 AM, Will Deacon wrote:
> On Thu, Apr 07, 2016 at 03:50:55PM -0600, Al Stone wrote:
>> On 03/28/2016 06:06 PM, Al Stone wrote:
>>> The ACPI 6.1 specification was recently released at the end of January
>>> 2016, but the arm64 kernel documentation for the use of ACPI was written
>>> for the 5.1 version of the spec.  There were significant additions to the
>>> spec that had not yet been mentioned -- for example, the 6.0 mechanisms
>>> added to make it easier to define processors and low power idle states,
>>> as well as the 6.1 addition allowing regular interrupts (not just from
>>> GPIO) be used to signal ACPI general purpose events.
>>>
>>> This patch reflects going back through and examining the specs in detail
>>> and updating content appropriately.  Whilst there, a few odds and ends of
>>> typos were caught as well.  This brings the documentation up to date with
>>> ACPI 6.1 for arm64.
>>>
>>> Changes for v3:
>>>    -- Clarify use of _LPI/_RDI (Vikas Sajjan)
>>>    -- Whitespace cleanup as pointed out by checkpatch
>>>
>>> Changes for v2:
>>>    -- Clean up white space (Harb Abdulhamid)
>>>    -- Clarification on _CCA usage (Harb Abdulhamid)
>>>    -- IORT moved to required from recommended (Hanjun Guo)
>>>    -- Clarify IORT description (Hanjun Guo)
>>>
>>> Signed-off-by: Al Stone 
>>> Cc: Catalin Marinas 
>>> Cc: Will Deacon 
>>> Cc: Jonathan Corbet 
>>> ---
>>>  Documentation/arm64/acpi_object_usage.txt | 446 ++
>>>  Documentation/arm64/arm-acpi.txt  |  28 +-
>>>  2 files changed, 357 insertions(+), 117 deletions(-)
>>> [snip...]
>>
>> Ping?  Any further comments or is this good to go?
> 
> It would be nice to see an ack from some other ACPI people, if that's
> possible. Which tree were you planning to merge this through?
> 
> Will
> 

Agreed.  I was hoping the ping would elicit some of that.

I assumed this would go through the arm64 tree since it's pretty specific
to the architecture.

-- 
ciao,
al
---
Al Stone
Software Engineer
Linaro Enterprise Group
al.st...@linaro.org
---


Re: [PATCH v9 04/13] task_isolation: add initial support

2016-04-08 Thread Frederic Weisbecker
On Wed, Mar 09, 2016 at 02:39:28PM -0500, Chris Metcalf wrote:
> Frederic,
> 
> Thanks for the detailed feedback on the task isolation stuff.
> 
> This reply kind of turned into an essay, so I've added a little "TL;DR"
> sentence before each section.

I think I'm going to cut my reply into several threads, because really
I can't get myself to make a giant reply in one go :-)

> 
> 
>   TL;DR: Let's make an explicit decision about whether task isolation
>   should be "persistent" or "one-shot".  Both have some advantages.
>   =
> 
> An important high-level issue is how "sticky" task isolation mode is.
> We need to choose one of these two options:
> 
> "Persistent mode": A task switches state to "task isolation" mode
> (kind of a level-triggered analogy) and stays there indefinitely.  It
> can make a syscall, take a page fault, etc., if it wants to, but the
> kernel protects it from incurring any further asynchronous interrupts.
> This is the model I've been advocating for.

But then in this mode, what happens when an interrupt triggers?

> 
> "One-shot mode": A task requests isolation via prctl(), the kernel
> ensures it is isolated on return from the prctl(), but then as soon as
> it enters the kernel again, task isolation is switched off until
> another prctl is issued.  This is what you recommended in your last
> email.

No I think we can issue syscalls for example. But asynchronous interruptions
such as exceptions (actually somewhat synchronous but can be unexpected) and
interrupts are what we want to avoid.

> 
> There are a number of pros and cons to the two models.  I think on
> balance I still like the "persistent mode" approach, but here's all
> the pros/cons I can think of:
> 
> PRO for persistent mode: A somewhat easier programming model.  Users
> can just imagine "task isolation" as a way for them to still be able
> to use the kernel exactly as they always have; it's just slower to get
> back out of the kernel so you use it judiciously. For example, a
> process is free to call write() on a socket to perform a diagnostic,
> but when returning from the write() syscall, the kernel will hold the
> task in kernel mode until any timer ticks (perhaps from networking
> stuff) are complete, and then let it return to userspace to continue
> in task isolation mode.

So this is not hard isolation anymore. This is rather soft isolation with
best efforts to avoid disturbance.

Surely we can have different levels of isolation.

I'm still wondering what to do if the task migrates to another CPU. In fact,
perhaps what you're trying to do is rather a CPU property than a process
property?

> This is convenient to the user since they
> don't have to fret about re-enabling task isolation after that
> syscall, page fault, or whatever; they can just continue running.
> With your suggestion, the user pretty much has to leave STRICT mode
> enabled so he gets notified of any unexpected return to kernel space
> (in fact we might make it required so you always get a signal when
> leaving task isolation unless it's via a prctl or exit syscall).

Right. Although we can allow all syscalls in this mode actually.

> 
> PRO for one-shot mode: A somewhat crisper interaction with
> sched_setaffinity() etc.  With a persistent mode approach, a task can
> start up task isolation, then later another task can be placed on its
> cpu and break it (it won't return to userspace until killed or the new
> process affinitizes itself away or stops running).  By contrast, in
> one-shot mode, any return to kernel space turns off task isolation
> anyway, so it's very clear what the interaction looks like.  I suspect
> this is more a theoretical advantage to one-shot mode than a practical
> one, though.

I think I heard about workloads that need such strict hard isolation.
Workloads that really can not afford any disturbance. They even
use userspace network stack. Maybe HFT?

> CON for one-shot mode: It's actually hard to catch every kernel entry
> so we can turn the task-isolation flag off again - and we really do
> need to have a flag, just so that we can suitably debug any bad
> actions that bring us into the kernel when we're not expecting it.
> Right now there are things that bring us into the kernel that we don't
> bother annotating for task isolation STRICT mode, just because they're
> visible to the user anyway: e.g., a bus fault or segmentation
> violation.
> 
> I think we can actually make both modes available to users with just
> another flag bit, so maybe we can look at what that looks like in v11:
> adding a PR_TASK_ISOLATION_ONESHOT flag would turn off task
> isolation at the next syscall entry, page fault, etc.  Then we can
> think more specifically about whether we want to remove the flag or
> not, and if we remove it, whether we want to make the code that was
> controlled by it unconditionally true or unconditionally false
> (i.e. remove it again).

I think we shouldn't bother with strict hard isolation if we don't need
it 

Re: [PATCH v3] ARM64: ACPI: Update documentation for latest specification version

2016-04-08 Thread Will Deacon
On Thu, Apr 07, 2016 at 03:50:55PM -0600, Al Stone wrote:
> On 03/28/2016 06:06 PM, Al Stone wrote:
> > The ACPI 6.1 specification was recently released at the end of January
> > 2016, but the arm64 kernel documentation for the use of ACPI was written
> > for the 5.1 version of the spec.  There were significant additions to the
> > spec that had not yet been mentioned -- for example, the 6.0 mechanisms
> > added to make it easier to define processors and low power idle states,
> > as well as the 6.1 addition allowing regular interrupts (not just from
> > GPIO) be used to signal ACPI general purpose events.
> > 
> > This patch reflects going back through and examining the specs in detail
> > and updating content appropriately.  Whilst there, a few odds and ends of
> > typos were caught as well.  This brings the documentation up to date with
> > ACPI 6.1 for arm64.
> > 
> > Changes for v3:
> >    -- Clarify use of _LPI/_RDI (Vikas Sajjan)
> >    -- Whitespace cleanup as pointed out by checkpatch
> > 
> > Changes for v2:
> >    -- Clean up white space (Harb Abdulhamid)
> >    -- Clarification on _CCA usage (Harb Abdulhamid)
> >    -- IORT moved to required from recommended (Hanjun Guo)
> >    -- Clarify IORT description (Hanjun Guo)
> > 
> > Signed-off-by: Al Stone 
> > Cc: Catalin Marinas 
> > Cc: Will Deacon 
> > Cc: Jonathan Corbet 
> > ---
> >  Documentation/arm64/acpi_object_usage.txt | 446 ++
> >  Documentation/arm64/arm-acpi.txt  |  28 +-
> >  2 files changed, 357 insertions(+), 117 deletions(-)
> > [snip...]
> 
> Ping?  Any further comments or is this good to go?

It would be nice to see an ack from some other ACPI people, if that's
possible. Which tree were you planning to merge this through?

Will


Re: [PATCH] arm64: erratum: Workaround for Kryo reserved system register read

2016-04-08 Thread Marc Zyngier
On 08/04/16 11:31, Suzuki K Poulose wrote:
> On 08/04/16 11:24, Marc Zyngier wrote:
>> On 08/04/16 10:58, Suzuki K Poulose wrote:
>>> On 07/04/16 18:31, Marc Zyngier wrote:
>>>
> + All system register encodings above use the form
> +
> + Op0, Op1, CRn, CRm, Op2.
> +
> + Note that some of the encodings listed above include
> + the system register space reserved for the following
> + identification registers which may appear in future revisions
> + of the ARM architecture beyond ARMv8.0.
> + This space includes:
> + ID_AA64PFR[2-7]_EL1
> + ID_AA64DFR[2-3]_EL1
> + ID_AA64AFR[2-3]_EL1
> + ID_AA64ISAR[2-7]_EL1
> + ID_AA64MMFR[2-7]_EL1
>>>
>>>
>>> AFAIK, the id space is unassigned. So the naming above could cause confusion
>>> if the register is named something else.
>>
>> It is reserved *at the moment*, but already has a defined behaviour. My
> 
> Absolutely, they do need to be RAZ.  My point was assigning names to the
> reserved space where the names are unassigned.

Sorry - I misread your statement. It makes a lot more sense now that the
coffee has trickled in.

We're in violent agreement! ;-)

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [RFC v5 7/7] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-04-08 Thread Yongji Xie

Hi Eric,
On 2016/4/8 17:10, Eric Auger wrote:

> Hi Yongji,
> On 04/08/2016 10:14 AM, Yongji Xie wrote:
>> Hi Eric,
>> On 2016/4/7 22:23, Eric Auger wrote:
>>> Hi Yongji,
>>> On 04/07/2016 01:38 PM, Yongji Xie wrote:
>>>> On 2016/4/6 22:45, Alex Williamson wrote:
>>>>> On Tue,  5 Apr 2016 21:46:44 +0800
>>>>> Yongji Xie  wrote:
>>>>>
>>>>>> This patch enables mmapping MSI-X tables if
>>>>>> hardware supports interrupt remapping which
>>>>>> can ensure that a given pci device can only
>>>>>> shoot the MSIs assigned for it.
>>>>>>
>>>>>> Signed-off-by: Yongji Xie
>>>>>> ---
>>>>>> drivers/vfio/pci/vfio_pci.c         | 9 +++--
>>>>>> drivers/vfio/pci/vfio_pci_private.h | 1 +
>>>>>> drivers/vfio/pci/vfio_pci_rdwr.c    | 2 +-
>>>>>> 3 files changed, 9 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
>>>>>> index c60d790..ef02896 100644
>>>>>> --- a/drivers/vfio/pci/vfio_pci.c
>>>>>> +++ b/drivers/vfio/pci/vfio_pci.c
>>>>>> @@ -201,6 +201,10 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev)
>>>>>> } else
>>>>>> vdev->msix_bar = 0xFF;
>>>>>> +if (iommu_capable(pdev->dev.bus, IOMMU_CAP_INTR_REMAP) ||
>>>>> This doesn't address the issue I raised earlier where ARM SMMU sets
>>>>> this capability, but doesn't really provide per vector isolation.  ARM
>>>>> either needs to be fixed or we need to consider the whole capability
>>>>> tainted for this application and standardize around the bus flags.
>>>>> It's not very desirable to have two different ways to test this anyway.
>>>> I saw Eric posted a patchset [1] which introduces a flag
>>>> MSI_FLAG_IRQ_REMAPPING to indicate the capability
>>>> for ARM SMMU. With this patchset applied, it would
>>>> be workable to use bus_flags to test the capability
>>>> of ARM SMMU:
>>> My purpose was to remove the advertising of IOMMU_CAP_INTR_REMAP from
>>> arm-smmu.c, "fix" mentioned by Alex (by the way I also need to do the
>>> same in v3 code) and to advertise the functionality on MSI controller
>>> instead (since the IRQ REMAPPING functionality is abstracted in GICv3
>>> ITS MSI controller)
>>
>> Thank you for your explanation.  Now we have three
>> flags to test this capability with your and my patches
>> applied.  We need to test something like
>> IOMMU_CAP_INTR_REMAP || MSI_FLAG_IRQ_REMAPPING ||
>> PCI_BUS_FLAGS_MSI_REMAP if we want to mmap
>> MSI-X table. It's not very desirable if I understood
>> Alex correctly. So I'm thinking whether we can make
>> bus_flags compatible with other two flags and only
>> test bus_flags here.
>>
>>> On top of that, on ARM we have platform (non PCI) MSI controllers so my
>>> understanding is the capability advertising should be possible beyond
>>> the PCI bus?
>>
>> Actually, we just need one flag which can standardize
>> the capability on PCI side. With this flag set, we can
>> easily know hardware supports the capability of
>> interrupt remapping and it's safe to mmap MSI-X
>> tables of PCI BARs in any userspace driver.
> I agree with you on the fact storing the info at a single place looks
> better. However my question was: if my understanding is correct, you
> plan to store the info in pci_bus flags. What about platform_bus? Don't
> we need to advertise the IRQ remapping capability also with a platform
> bus topology? We can have platform devices writing to a platform MSI
> controller that supports irq remapping. Assignment of such devices is
> not considered yet though and maybe not feasible. I don't know if the
> capability is used in other use cases.


My purpose is to make bus_flags compatible with other
two flags so that we can only test bus_flags when mmapping
MSI-X table. We would not remove the flag
MSI_FLAG_IRQ_REMAPPING and IOMMU_CAP_INTR_REMAP.
So we still can test these two flags if we have platform
devices writing to a platform MSI controller.

Of course, it would be better to have a flag which can
advertise the IRQ remapping capability for both PCI
bus and platform bus.  But now I don't find a proper
way to achieve that...

Regards,
Yongji
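
To make the "only test bus_flags" idea above concrete, here is a minimal sketch, not code from any of the patches in this thread. It assumes the capability sources are folded into a single bus flag once at probe time, so the mmap path consults one bit. The struct and the three constants are hypothetical stand-ins for IOMMU_CAP_INTR_REMAP, MSI_FLAG_IRQ_REMAPPING and PCI_BUS_FLAGS_MSI_REMAP; none of the real kernel types are used.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the three capability sources named in the thread. */
#define MOCK_IOMMU_INTR_REMAP   (1u << 0)  /* models IOMMU_CAP_INTR_REMAP */
#define MOCK_MSI_IRQ_REMAPPING  (1u << 1)  /* models MSI_FLAG_IRQ_REMAPPING */
#define MOCK_BUS_MSI_REMAP      (1u << 0)  /* models PCI_BUS_FLAGS_MSI_REMAP */

/* Hypothetical, simplified view of a bus; not struct pci_bus. */
struct mock_bus {
	unsigned int iommu_caps;       /* what the IOMMU driver advertises */
	unsigned int msi_domain_flags; /* what the MSI controller advertises */
	unsigned int bus_flags;        /* the single place the mmap path looks */
};

/* Probe-time step: fold whichever source reports remapping into bus_flags. */
static void mock_check_msi_remapping(struct mock_bus *bus)
{
	if ((bus->iommu_caps & MOCK_IOMMU_INTR_REMAP) ||
	    (bus->msi_domain_flags & MOCK_MSI_IRQ_REMAPPING))
		bus->bus_flags |= MOCK_BUS_MSI_REMAP;
}

/* mmap-time check: only the bus flag is consulted. */
static bool mock_msix_mmap_allowed(const struct mock_bus *bus)
{
	return (bus->bus_flags & MOCK_BUS_MSI_REMAP) != 0;
}
```

Whether the IOMMU capability may be folded in at all depends on the concern Alex raised: an IOMMU driver that advertises the capability without per-vector isolation would make the combined flag unsafe. The sketch also says nothing about platform (non-PCI) buses, which is the open question in this exchange.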


> Best Regards
>
> Eric
>>
>> Of course, we can also achieve that by testing all the
>> three flags. But I'm not sure whether it is good enough.
>>
>> Regards,
>> Yongji
>>
>>> Best Regards
>>>
>>> Eric
>>>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
>>>> index a080f44..b2d1756 100644
>>>> --- a/drivers/pci/msi.c
>>>> +++ b/drivers/pci/msi.c
>>>> @@ -1134,6 +1134,21 @@ void *msi_desc_to_pci_sysdata(struct msi_desc *desc)
>>>>   }
>>>>   EXPORT_SYMBOL_GPL(msi_desc_to_pci_sysdata);
>>>>
>>>> +void pci_check_msi_remapping(struct pci_bus *bus)
>>>> +{
>>>> +#ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
>>>> +struct irq_domain *domain;
>>>> +struct msi_domain_info *info;
>>>> +
>>>> +domain = dev_get_msi_domain(>dev);
>>>> +if (domain) {
>>>> +info = msi_get_domain_info(domain);
>>>> +if (info->flags & MSI_FLAG_IRQ_REMAPPING)
>>>> +pdev->bus->bus_flags |= PCI_BUS_FLAGS_MSI_REMAP;
>>>> +}
>>>> +#endif
>>>> +}
>>>> +
>>>>   #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
>>>>   /**
>>>>    * pci_msi_domain_write_msg - Helper to write MSI message to PCI config space
>>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>>>> index 6d7ab9b..24e9606 100644
>>>> --- a/drivers/pci/probe.c
>>>> +++

Re: [PATCH] arm64: erratum: Workaround for Kryo reserved system register read

2016-04-08 Thread Suzuki K Poulose

On 08/04/16 11:24, Marc Zyngier wrote:

> On 08/04/16 10:58, Suzuki K Poulose wrote:
>> On 07/04/16 18:31, Marc Zyngier wrote:
>>
>>>> +   All system register encodings above use the form
>>>> +
>>>> +   Op0, Op1, CRn, CRm, Op2.
>>>> +
>>>> +   Note that some of the encodings listed above include
>>>> +   the system register space reserved for the following
>>>> +   identification registers which may appear in future revisions
>>>> +   of the ARM architecture beyond ARMv8.0.
>>>> +   This space includes:
>>>> +   ID_AA64PFR[2-7]_EL1
>>>> +   ID_AA64DFR[2-3]_EL1
>>>> +   ID_AA64AFR[2-3]_EL1
>>>> +   ID_AA64ISAR[2-7]_EL1
>>>> +   ID_AA64MMFR[2-7]_EL1
>>
>>
>> AFAIK, the id space is unassigned. So the naming above could cause confusion
>> if the register is named something else.
>
> It is reserved *at the moment*, but already has a defined behaviour. My

Absolutely, they do need to be RAZ.  My point was assigning names to the
reserved space where the names are unassigned.

> worry is that when some new architecture revision comes around, we start
> using these registers without thinking much about it (because we should
> be able to). At this point, your SoC will catch fire and nobody will
> have a clue about the problem because it is not apparent in the code.
>
> I'd really like to see something a bit more forward looking that covers
> that space for good.


I agree, the patch definitely needs to take care of handling the entire space.

Cheers
Suzuki
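
As an illustration of "handling the entire space" by encoding rather than by register name, here is a minimal sketch. The packing mirrors the (Op0, Op1, CRn, CRm, Op2) form quoted above, with the field positions used by the arm64 kernel's sys_reg() macro. The helper in_aa64_id_space() is hypothetical, and the range it tests (Op0=3, Op1=0, CRn=0, CRm=4..7) is an assumption about which block the patch text means; it is not taken from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field positions follow the layout of the arm64 kernel's sys_reg() macro. */
#define OP0_SHIFT 19
#define OP1_SHIFT 16
#define CRN_SHIFT 12
#define CRM_SHIFT 8
#define OP2_SHIFT 5

/* Pack (Op0, Op1, CRn, CRm, Op2) into a single encoding value. */
static inline uint32_t sys_reg(uint32_t op0, uint32_t op1, uint32_t crn,
			       uint32_t crm, uint32_t op2)
{
	return (op0 << OP0_SHIFT) | (op1 << OP1_SHIFT) | (crn << CRN_SHIFT) |
	       (crm << CRM_SHIFT) | (op2 << OP2_SHIFT);
}

/*
 * Hypothetical helper: true if the encoding lies in the block
 * Op0=3, Op1=0, CRn=0, CRm=4..7, whatever name (if any) the
 * register at that encoding currently has.
 */
static inline bool in_aa64_id_space(uint32_t enc)
{
	uint32_t op0 = (enc >> OP0_SHIFT) & 0x3;
	uint32_t op1 = (enc >> OP1_SHIFT) & 0x7;
	uint32_t crn = (enc >> CRN_SHIFT) & 0xf;
	uint32_t crm = (enc >> CRM_SHIFT) & 0xf;

	return op0 == 3 && op1 == 0 && crn == 0 && crm >= 4 && crm <= 7;
}
```

A range test of this kind sidesteps the naming concern raised above: it covers encodings such as (3, 0, 0, 4, 2) without asserting what a future architecture revision will call them.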


Re: [PATCH] arm64: erratum: Workaround for Kryo reserved system register read

2016-04-08 Thread Suzuki K Poulose

On 07/04/16 18:31, Marc Zyngier wrote:


>> +   All system register encodings above use the form
>> +
>> +   Op0, Op1, CRn, CRm, Op2.
>> +
>> +   Note that some of the encodings listed above include
>> +   the system register space reserved for the following
>> +   identification registers which may appear in future revisions
>> +   of the ARM architecture beyond ARMv8.0.
>> +   This space includes:
>> +   ID_AA64PFR[2-7]_EL1
>> +   ID_AA64DFR[2-3]_EL1
>> +   ID_AA64AFR[2-3]_EL1
>> +   ID_AA64ISAR[2-7]_EL1
>> +   ID_AA64MMFR[2-7]_EL1

AFAIK, the id space is unassigned. So the naming above could cause confusion
if the register is named something else.

>> +
>> +   check_local_cpu_errata();
>> +
>
> What is the impact of moving this around? Suzuki, was there any
> particular reason why this check was done later rather than earlier?


All the existing errata look for MIDR to match, which is read separately
using read_cpuid_id(). The moment we need to do something w.r.t an ID register,
this will break. So at the moment moving this doesn't have much of an impact.

Thanks
Suzuki
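
For readers unfamiliar with MIDR-based matching, the following is a minimal sketch of the idea, not the kernel's implementation. The bit positions are the architectural MIDR_EL1 fields; struct erratum_match and midr_matches() are hypothetical names, and the MIDR value would in practice come from something like read_cpuid_id() rather than a constant.

```c
#include <stdbool.h>
#include <stdint.h>

/* MIDR_EL1 field layout. */
#define MIDR_REVISION_MASK     0x0000000fu  /* bits [3:0]   */
#define MIDR_PARTNUM_MASK      0x0000fff0u  /* bits [15:4]  */
#define MIDR_ARCHITECTURE_MASK 0x000f0000u  /* bits [19:16] */
#define MIDR_VARIANT_MASK      0x00f00000u  /* bits [23:20] */
#define MIDR_IMPLEMENTOR_MASK  0xff000000u  /* bits [31:24] */

/* A CPU model is identified by implementor, architecture and part number. */
#define MIDR_MODEL_MASK \
	(MIDR_IMPLEMENTOR_MASK | MIDR_ARCHITECTURE_MASK | MIDR_PARTNUM_MASK)

/* Hypothetical erratum descriptor: one model, one variant/revision range. */
struct erratum_match {
	uint32_t model;   /* MIDR with variant and revision cleared */
	uint32_t rv_min;  /* lowest affected variant/revision, inclusive */
	uint32_t rv_max;  /* highest affected variant/revision, inclusive */
};

/* True if the given MIDR value falls inside the erratum's range. */
static inline bool midr_matches(uint32_t midr, const struct erratum_match *e)
{
	uint32_t rv = midr & (MIDR_VARIANT_MASK | MIDR_REVISION_MASK);

	return (midr & MIDR_MODEL_MASK) == e->model &&
	       rv >= e->rv_min && rv <= e->rv_max;
}
```

Because such a check needs only the MIDR, it does not depend on any ID register having been read beforehand, which is the reason given above for the ordering having little impact at the moment.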


Re: [RFC v5 7/7] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-04-08 Thread Eric Auger
Hi Yongji,
On 04/08/2016 10:14 AM, Yongji Xie wrote:
> Hi Eric,
> On 2016/4/7 22:23, Eric Auger wrote:
>> Hi Yongji,
>> On 04/07/2016 01:38 PM, Yongji Xie wrote:
>>> On 2016/4/6 22:45, Alex Williamson wrote:
>>>> On Tue,  5 Apr 2016 21:46:44 +0800
>>>> Yongji Xie  wrote:
>>>>
>>>>> This patch enables mmapping MSI-X tables if
>>>>> hardware supports interrupt remapping which
>>>>> can ensure that a given pci device can only
>>>>> shoot the MSIs assigned for it.
>>>>>
>>>>> Signed-off-by: Yongji Xie
>>>>> ---
>>>>> drivers/vfio/pci/vfio_pci.c         | 9 +++--
>>>>> drivers/vfio/pci/vfio_pci_private.h | 1 +
>>>>> drivers/vfio/pci/vfio_pci_rdwr.c    | 2 +-
>>>>> 3 files changed, 9 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
>>>>> index c60d790..ef02896 100644
>>>>> --- a/drivers/vfio/pci/vfio_pci.c
>>>>> +++ b/drivers/vfio/pci/vfio_pci.c
>>>>> @@ -201,6 +201,10 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev)
>>>>> } else
>>>>> vdev->msix_bar = 0xFF;
>>>>> +if (iommu_capable(pdev->dev.bus, IOMMU_CAP_INTR_REMAP) ||
>>>> This doesn't address the issue I raised earlier where ARM SMMU sets
>>>> this capability, but doesn't really provide per vector isolation.  ARM
>>>> either needs to be fixed or we need to consider the whole capability
>>>> tainted for this application and standardize around the bus flags.
>>>> It's not very desirable to have two different ways to test this anyway.
>>> I saw Eric posted a patchset [1] which introduces a flag
>>> MSI_FLAG_IRQ_REMAPPING to indicate the capability
>>> for ARM SMMU. With this patchset applied, it would
>>> be workable to use bus_flags to test the capability
>>> of ARM SMMU:
>> My purpose was to remove the advertising of IOMMU_CAP_INTR_REMAP from
>> arm-smmu.c, "fix" mentioned by Alex (by the way I also need to do the
>> same in v3 code) and to advertise the functionality on MSI controller
>> instead (since the IRQ REMAPPING functionality is abstracted in GICv3
>> ITS MSI controller)
> 
> Thank you for your explanation.  Now we have three
> flags to test this capability with your and my patches
> applied.  We need to test something like
> IOMMU_CAP_INTR_REMAP || MSI_FLAG_IRQ_REMAPPING ||
> PCI_BUS_FLAGS_MSI_REMAP if we want to mmap
> MSI-X table. It's not very desirable if I understood
> Alex correctly. So I'm thinking whether we can make
> bus_flags compatible with other two flags and only
> test bus_flags here.
> 
>> On top of that, on ARM we have platform (non PCI) MSI controllers so my
>> understanding is the capability advertising should be possible beyond
>> the PCI bus?
> 
> Actually, we just need one flag which can standardize
> the capability on PCI side. With this flag set, we can
> easily know hardware supports the capability of
> interrupt remapping and it's safe to mmap MSI-X
> tables of PCI BARs in any userspace driver.
I agree with you on the fact storing the info at a single place looks
better. However my question was: if my understanding is correct, you
plan to store the info in pci_bus flags. What about platform_bus? Don't
we need to advertise the IRQ remapping capability also with a platform
bus topology? We can have platform devices writing to a platform MSI
controller that supports irq remapping. Assignment of such devices is
not considered yet though and maybe not feasible. I don't know if the
capability is used in other use cases.

Best Regards

Eric
> 
> Of course, we can also achieve that by testing all the
> three flags. But I'm not sure whether it is good enough.
> 
> Regards,
> Yongji
> 
>> Best Regards
>>
>> Eric
>>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
>>> index a080f44..b2d1756 100644
>>> --- a/drivers/pci/msi.c
>>> +++ b/drivers/pci/msi.c
>>> @@ -1134,6 +1134,21 @@ void *msi_desc_to_pci_sysdata(struct msi_desc *desc)
>>>   }
>>>   EXPORT_SYMBOL_GPL(msi_desc_to_pci_sysdata);
>>>
>>> +void pci_check_msi_remapping(struct pci_bus *bus)
>>> +{
>>> +#ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
>>> +struct irq_domain *domain;
>>> +struct msi_domain_info *info;
>>> +
>>> +domain = dev_get_msi_domain(>dev);
>>> +if (domain) {
>>> +info = msi_get_domain_info(domain);
>>> +if (info->flags & MSI_FLAG_IRQ_REMAPPING)
>>> +pdev->bus->bus_flags |= PCI_BUS_FLAGS_MSI_REMAP;
>>> +}
>>> +#endif
>>> +}
>>> +
>>>   #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
>>>   /**
>>>    * pci_msi_domain_write_msg - Helper to write MSI message to PCI config space
>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>>> index 6d7ab9b..24e9606 100644
>>> --- a/drivers/pci/probe.c
>>> +++ b/drivers/pci/probe.c
>>> @@ -2115,6 +2115,7 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
>>>  device_enable_async_suspend(b->bridge);
>>>